TY - JOUR

T1 - Identification of the unchanging reference component of compositional data from the properties of the coefficient of variation

AU - Ohta, Tohru

AU - Arai, Hiroyoshi

AU - Noda, Atsushi

PY - 2011/5/1

Y1 - 2011/5/1

N2 - In analyses of compositional data, it is important to select a suitable unchanging component as a reference to detect the behavior of a single variable in isolation. This paper introduces two tests for detecting the unchanging component, based on a new approach that utilizes the coefficient of variation of component ratios. That is, the coefficient of variation of a compositional ratio is subject to change when the unchanging component is switched between the denominator and numerator, and the coefficient of variation tends to be small when the unchanging component occurs as the denominator against any arbitrary components (Test 1). In addition, the ratio of the component pair that gives the lowest coefficient of variation is most likely to represent the two unchanging components (Test 2). However, Tests 1 and 2 are not necessary and sufficient conditions for uniquely finding the unchanging component. To verify the effectiveness of the tests, 500 artificial datasets were analyzed and the results suggest that the tests are able to identify the unchanging component, although Test 1 underperforms when the dataset includes a component with skewness greater than 0.5, and Test 2 fails when the dataset includes components with a correlation coefficient greater than 0.75. These defects can be overcome by interpreting the two test results in a complementary manner. The proposed tests provide powerful yet simple criteria for identifying the unchanging component in compositional data; however, the reliability of this approach needs to be assessed in further studies.

KW - Closed data

KW - Coefficient of variation

KW - Compositional data

KW - Reference frame

KW - Unchanging component

UR - http://www.scopus.com/inward/record.url?scp=79955885540&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79955885540&partnerID=8YFLogxK

U2 - 10.1007/s11004-011-9332-y

DO - 10.1007/s11004-011-9332-y

M3 - Article

AN - SCOPUS:79955885540

VL - 43

SP - 421

EP - 434

JO - Mathematical Geosciences

JF - Mathematical Geosciences

SN - 1874-8961

IS - 4

ER -