Invited lectures'05
http://hdl.handle.net/10256/631

Compositional Data in Biomedical Research
http://hdl.handle.net/10256/655
Billheimer, Dean
Mateu i Figueras, Glòria; Barceló i Vidal, Carles
Modern methods of compositional data analysis are not well known in biomedical research.
Moreover, there appear to be few mathematical and statistical researchers
working on compositional biomedical problems. Like the earth and environmental sciences,
biomedicine has many problems in which the relevant scientific information is
encoded in the relative abundance of key species or categories. I introduce three problems
in cancer research in which analysis of compositions plays an important role. The
problems involve 1) the classification of serum proteomic profiles for early detection of
lung cancer, 2) inference of the relative amounts of different tissue types in a diagnostic
tumor biopsy, and 3) the subcellular localization of the BRCA1 protein and its
role in breast cancer patient prognosis. For each of these problems I outline a partial
solution. However, none of these problems is "solved". I attempt to identify areas in
which additional statistical development is needed with the hope of encouraging more
compositional data analysts to become involved in biomedical research.
2005-10-01T00:00:00Z

Some last thoughts on compositional data analysis
http://hdl.handle.net/10256/654
Aitchison, John
Mateu i Figueras, Glòria; Barceló i Vidal, Carles
One of the disadvantages of old age is that there is more past than future: this,
however, may be turned into an advantage if the wealth of experience and, hopefully,
wisdom gained in the past can be reflected upon and throw some light on possible
future trends. To an extent, then, this talk is necessarily personal, certainly nostalgic,
but also self-critical and inquisitive about our understanding of the discipline of
statistics. A number of almost philosophical themes will run through the talk: search
for appropriate modelling in relation to the real problem envisaged, emphasis on
sensible balances between simplicity and complexity, the relative roles of theory and
practice, the nature of communication of inferential ideas to the statistical layman, the
inter-related roles of teaching, consultation and research. A list of keywords might be:
identification of sample space and its mathematical structure, choices between
transform and stay, the role of parametric modelling, the role of a sample space
metric, the underused hypothesis lattice, the nature of compositional change,
particularly in relation to the modelling of processes. While the main theme will be
relevance to compositional data analysis we shall point to substantial implications for
general multivariate analysis arising from experience of the development of
compositional data analysis…
2005-10-01T00:00:00Z

Aitchison Geometry for Probability and Likelihood as a new approach to mathematical statistics
http://hdl.handle.net/10256/653
Boogaart, K. Gerald van den
Mateu i Figueras, Glòria; Barceló i Vidal, Carles
The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central
notions of statistics, such as Information or Likelihood, can be identified in the algebraic structure of A2(P) and related to their corresponding notions in compositional data analysis, such as Aitchison distance or centered log ratio transform.
In this way highly elaborate aspects of mathematical statistics can be understood
easily in the light of a simple vector space structure and of compositional data analysis. For example, combinations of statistical information, such as Bayesian updating or the
combination of likelihood and robust M-estimation functions, are simple additions/
perturbations in A2(Pprior). Weighting observations corresponds to a weighted
addition of the corresponding evidence.
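
In the finite case the correspondence between Bayesian updating and perturbation can be checked directly. The following is a minimal sketch, not taken from the lecture: it assumes a three-part simplex, and the helper names (`closure`, `perturb`, `clr`) and the numbers are hypothetical, chosen only for illustration. It shows that perturbing a prior by a likelihood vector gives the Bayes posterior, and that the centered log ratio transform turns this perturbation into ordinary vector addition.

```python
import math

def closure(x):
    """Rescale positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def perturb(x, y):
    """Aitchison perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])

def clr(x):
    """Centered log ratio transform: logs minus their arithmetic mean."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]

# Hypothetical prior over three hypotheses and likelihood of one observation.
prior = [0.5, 0.3, 0.2]
likelihood = [0.1, 0.4, 0.7]

# Bayesian updating expressed as a perturbation of the prior.
posterior = perturb(prior, likelihood)

# The same posterior from the textbook formula prior * likelihood / evidence.
evidence = sum(p * l for p, l in zip(prior, likelihood))
bayes = [p * l / evidence for p, l in zip(prior, likelihood)]

# In clr coordinates the update is a plain vector addition.
clr_sum = [a + b for a, b in zip(clr(prior), clr(likelihood))]
clr_post = clr(posterior)
```

Under this reading, a weighted observation would correspond to multiplying its clr vector by the weight before adding it, which is the powering operation on the simplex.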
Likelihood-based statistics for general exponential families turn out to have a
particularly easy interpretation in terms of A2(P). Regular exponential families form
finite dimensional linear subspaces of A2(P) and they correspond to finite dimensional
subspaces formed by their posterior in the dual information space A2(Pprior).
The Aitchison norm can be identified with mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be the Kullback-Leibler directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback-Leibler information, and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P.
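
One way to read the statement about the closing constant in a finite setting is sketched below. This is an illustration under stated assumptions, not the construction from the lecture: P is taken to be a discrete reference distribution, an element of the space is represented by a function f with zero mean under P, and the distribution Q it encodes has weights p_i exp(f_i - kappa). The names `p`, `g`, `f`, `kappa` and `q` are hypothetical. With these conventions the closing constant kappa = log E_P[exp(f)] has the form of a cumulant generating function and coincides with the Kullback-Leibler divergence of Q from P, taken in the direction KL(P || Q).

```python
import math

# Hypothetical discrete reference distribution P.
p = [0.2, 0.5, 0.3]

# Arbitrary direction, then centered so that its mean under P is zero.
g = [1.0, -0.5, 0.25]
mean_g = sum(pi * gi for pi, gi in zip(p, g))
f = [gi - mean_g for gi in g]

# Closing constant: the log of the normalizer that makes Q sum to 1.
kappa = math.log(sum(pi * math.exp(fi) for pi, fi in zip(p, f)))

# Distribution Q encoded by f relative to P.
q = [pi * math.exp(fi - kappa) for pi, fi in zip(p, f)]

# Kullback-Leibler divergence KL(P || Q) = sum_i p_i * log(p_i / q_i).
kl_p_q = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Because f has zero mean under P, each term log(p_i / q_i) equals kappa - f_i, so the P-weighted average is exactly kappa.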
The discussion of A2(P)-valued random variables, such as estimation functions
or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale-free understanding of unbiased reasoning.
2005-10-01T00:00:00Z