Abstract: In compositional data, an observation is a vector with non-negative components that sum to a constant, typically 1. Data of this type arise in many areas, such as geology, archaeology, biology, economics and political science. The goal of this paper is to extend the taxicab metric and a newly suggested metric for compositional data by employing a power transformation. Both metrics can be used in the k-nearest neighbours algorithm whether or not zeros are present. Examples with real data are presented.
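To make the idea concrete, the following is a minimal sketch in Python (not the paper's own code) of a power-transformed taxicab distance used inside a k-nearest neighbours classifier. It assumes the power transformation raises each component to a power alpha and rescales the result to sum to 1. Any further scaling the paper may apply is not reproduced. The function names, the toy data and the default values of k and alpha are hypothetical. The sketch requires alpha > 0 so that zero components remain well defined.

```python
from collections import Counter


def power_transform(x, alpha):
    """Raise each component to the power alpha, then rescale to sum to 1."""
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    return [p / total for p in powered]


def taxicab_alpha(x, y, alpha):
    """Taxicab (L1) distance between the power-transformed compositions."""
    u = power_transform(x, alpha)
    v = power_transform(y, alpha)
    return sum(abs(ui - vi) for ui, vi in zip(u, v))


def knn_predict(train_x, train_labels, new_x, k=3, alpha=0.5):
    """Majority vote among the k training compositions closest to new_x."""
    ranked = sorted(
        zip(train_x, train_labels),
        key=lambda pair: taxicab_alpha(pair[0], new_x, alpha),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy compositions, made up for illustration; note the zero components.
train_x = [
    [0.7, 0.3, 0.0],
    [0.6, 0.3, 0.1],
    [0.8, 0.2, 0.0],
    [0.1, 0.2, 0.7],
    [0.0, 0.3, 0.7],
    [0.2, 0.2, 0.6],
]
train_labels = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(train_x, train_labels, [0.65, 0.35, 0.0], k=3, alpha=0.5)
```

With alpha = 1 the transformation leaves a composition unchanged, so the distance reduces to the ordinary taxicab metric.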
We demonstrate how to test for conditional independence of two categorical variables using Poisson log-linear models. The conditioning set can range from empty (simple independence) to many variables. We also provide an R function for performing the test. Instead of constructing all possible tables with a for loop, we perform the test using log-linear models, thus speeding up the process. Simulation studies comparing computation times are presented.
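As a rough illustration of the test statistic involved, the sketch below computes the deviance (G²) of the log-linear model of conditional independence of X and Y given Z. It is written in Python rather than R and does not reproduce the paper's function. Instead of fitting a Poisson model numerically, it relies on the standard closed-form fitted values of this particular model, n(x,z) n(y,z) / n(z). The function name and the input layout (a list of (x, y, z) tuples, with z a tuple of conditioning levels) are assumptions made for the example. The returned degrees of freedom are the nominal value and may need adjusting for sparse tables.

```python
import math
from collections import Counter


def g2_conditional_independence(rows):
    """rows: iterable of (x, y, z) tuples; z is any hashable value,
    e.g. a tuple of conditioning-variable levels, or () for no conditioning.
    Returns (G2, nominal degrees of freedom)."""
    rows = list(rows)
    n_xyz = Counter(rows)
    n_xz = Counter((x, z) for x, _, z in rows)
    n_yz = Counter((y, z) for _, y, z in rows)
    n_z = Counter(z for _, _, z in rows)

    g2 = 0.0
    # Empty cells are absent from the Counter and contribute nothing.
    for (x, y, z), observed in n_xyz.items():
        expected = n_xz[(x, z)] * n_yz[(y, z)] / n_z[z]
        g2 += 2.0 * observed * math.log(observed / expected)

    x_levels = len({x for x, _, _ in rows})
    y_levels = len({y for _, y, _ in rows})
    df = (x_levels - 1) * (y_levels - 1) * len(n_z)
    return g2, df


# Toy data, made up for illustration: X and Y agree perfectly within one stratum.
toy_rows = [(0, 0, ("s",))] * 10 + [(1, 1, ("s",))] * 10
toy_g2, toy_df = g2_conditional_independence(toy_rows)
```

The statistic would then be compared with a chi-squared distribution on the returned degrees of freedom. That step is omitted here because the Python standard library has no chi-squared distribution function.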
Forward regression has been heavily criticised, among other reasons for its speed and its stopping criteria. The main focus of this paper is on demonstrating how to make it efficient, using R. Our method works for continuous predictor variables only, as it relies chiefly on the partial correlation.
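One way such a procedure might look is sketched below in Python (the paper itself works in R, and its code is not reproduced here). At each step the candidate with the largest absolute partial correlation with the response, given the variables already selected, is considered for entry. The partial correlations are obtained as correlations of residuals, updated by removing the newly selected variable from the response and from every remaining candidate. The stopping rule, a two-sided Fisher z test at level alpha, is an assumption of this sketch, as are all function names. No guard against collinear predictors is included.

```python
import math
from statistics import NormalDist


def _centre(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def _corr(a, b):
    """Correlation of two centred vectors (0.0 if either has zero norm)."""
    denom = math.sqrt(_dot(a, a) * _dot(b, b))
    return _dot(a, b) / denom if denom > 0 else 0.0


def _residualise(v, on):
    """Remove from v its projection on the centred vector `on`."""
    beta = _dot(v, on) / _dot(on, on)
    return [a - beta * b for a, b in zip(v, on)]


def forward_select(columns, y, alpha=0.05):
    """columns: dict mapping name -> list of values; y: list of responses.
    Returns the selected names in order of entry."""
    n = len(y)
    resid_y = _centre(y)
    resid_x = {name: _centre(col) for name, col in columns.items()}
    selected = []
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    while resid_x and n - len(selected) - 3 > 0:
        # Correlation of residuals equals the partial correlation given `selected`.
        pcors = {name: _corr(x, resid_y) for name, x in resid_x.items()}
        best = max(pcors, key=lambda name: abs(pcors[name]))
        r = max(-0.999999, min(0.999999, pcors[best]))  # keep atanh finite
        z = abs(math.atanh(r)) * math.sqrt(n - len(selected) - 3)
        if z < z_crit:
            break  # assumed stopping rule: best candidate is not significant
        pivot = resid_x.pop(best)
        selected.append(best)
        resid_y = _residualise(resid_y, pivot)
        resid_x = {k: _residualise(v, pivot) for k, v in resid_x.items()}
    return selected


# Toy data, made up for illustration: y depends strongly on x1 only.
x1 = list(range(20))
x2 = [(3 * i) % 7 for i in range(20)]
y = [3 * a + ((2 * i) % 5) - 2 for i, a in enumerate(x1)]
chosen = forward_select({"x1": x1, "x2": x2}, y)
```

Updating residuals in place means each step costs one pass over the remaining candidates rather than a full refit of the model, which is the kind of saving a partial correlation approach can offer.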