Three approaches to supervised learning for compositional data with pairwise logratios

Coenders, Germà; Greenacre, Michael J.

2023-11-09

Logratios between pairs of compositional parts (pairwise logratios) are the easiest to interpret in compositional data analysis, and include the well-known additive logratios as particular cases. When the number of parts is large (sometimes even larger than the number of cases), some form of logratio selection is needed. In this article, we present three alternative stepwise supervised learning methods to select the pairwise logratios that best explain a dependent variable in a generalized linear model, each geared to a specific problem. The first method features unrestricted search, where any pairwise logratio can be selected. This method has a complex interpretation if some pairs of parts in the logratios overlap, but it leads to the most accurate predictions. The second method restricts parts to occur only once, which makes the corresponding logratios intuitively interpretable. The third method uses additive logratios, so that K−1 selected logratios involve a K-part subcomposition. Our approach allows logratios or non-compositional covariates to be forced into the models based on theoretical knowledge, and various stopping criteria are available based on information measures or statistical significance with the Bonferroni correction. We present an application on a dataset from a study predicting Crohn's disease.
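
The three search strategies in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian linear model fitted by ordinary least squares (one special case of a generalized linear model) and uses BIC as the stopping criterion. Forced-in covariates and the Bonferroni-based stopping rule are left out. The names `stepwise_logratios`, `allowed`, `ols_rss`, `bic` and the mode labels are hypothetical, chosen only for this illustration.

```python
import math
from itertools import combinations


def ols_rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus `columns`."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        if abs(A[piv][k]) < 1e-10:
            return None  # (near-)collinear candidate: signal "skip it"
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))


def bic(rss, n, n_params):
    """Gaussian BIC up to an additive constant (assumed stopping criterion)."""
    return n * math.log(max(rss, 1e-12) / n) + n_params * math.log(n)


def allowed(pair, chosen, mode):
    """Can `pair` be added to `chosen` under the given search strategy?"""
    if pair in chosen:
        return False
    if mode == "unrestricted":      # any pairwise logratio may enter
        return True
    if mode == "non_overlapping":   # each part occurs at most once
        used = {part for p in chosen for part in p}
        return pair[0] not in used and pair[1] not in used
    if mode == "alr":               # all logratios share one reference part
        if not chosen:
            return True
        common = set(chosen[0])
        for p in chosen[1:]:
            common &= set(p)
        return bool(common & set(pair))
    raise ValueError(f"unknown mode: {mode}")


def stepwise_logratios(parts, y, mode="unrestricted"):
    """Forward selection of pairwise logratios; returns the chosen index pairs.

    `parts` is a list of rows of strictly positive parts, `y` the response.
    """
    n, d = len(y), len(parts[0])
    logratio = {(i, j): [math.log(row[i] / row[j]) for row in parts]
                for i, j in combinations(range(d), 2)}
    chosen = []
    best = bic(ols_rss([], y), n, 1)  # intercept-only model
    while True:
        trials = []
        for pair in logratio:
            if allowed(pair, chosen, mode):
                rss = ols_rss([logratio[p] for p in chosen + [pair]], y)
                if rss is not None:
                    trials.append((bic(rss, n, len(chosen) + 2), pair))
        if not trials or min(trials)[0] >= best:
            return chosen  # no admissible logratio improves BIC
        best, pair = min(trials)
        chosen.append(pair)
```

Calling `stepwise_logratios(parts, y, mode="non_overlapping")` returns a list of index pairs in which no part appears twice, while `mode="alr"` returns pairs that all contain one common part, so K−1 selected pairs span K parts. The orientation of each pair (which part is the numerator) only flips the sign of its coefficient and does not affect the selection.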

All rights reserved

Identifiers

http://hdl.handle.net/10256/24073

issn: 0266-4763

doi: 10.1080/02664763.2022.2108007

eissn: 1360-0532
