Session 2: From counts
http://hdl.handle.net/10256/640
Thu, 18 Apr 2024 14:23:28 GMT
Intrinsic test of independence in contingency tables
http://hdl.handle.net/10256/721
Casella, George; Moreno, Elías
Daunis-i-Estadella, Pepus; Martín Fernández, Josep Antoni
A condition needed for testing nested hypotheses from a Bayesian
viewpoint is that the prior for the alternative model concentrates
mass around the small, or null, model. For testing independence
in contingency tables, the intrinsic priors satisfy this requirement.
Further, the degree of concentration of the priors is controlled by
a discrete parameter m, the training sample size, which plays an
important role in the resulting answer regardless of the sample
size.
In this paper we study the robustness of tests of independence
in contingency tables with respect to intrinsic priors with
different degrees of concentration around the null, and compare
them with other “robust” results by Good and Crook. Consistency of
the intrinsic Bayesian tests is established.
We also discuss conditioning issues and sampling schemes,
and argue that conditioning should be on either one margin or
the table total, but not on both margins.
Examples using real and simulated data are given.
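As a point of reference only, the sketch below computes a Bayes factor for independence under uniform Dirichlet priors, conditioning on the table total. This is not the intrinsic-prior test of the abstract: the uniform priors, the function names and the example tables are assumptions made for illustration, and the training sample size m does not appear.

```python
from math import lgamma


def log_multivariate_beta(alphas):
    """Log of the multivariate beta function B(alpha_1, ..., alpha_k)."""
    return sum(lgamma(a) for a in alphas) - lgamma(sum(alphas))


def log_marginal(counts):
    """Log marginal likelihood of multinomial counts under a uniform
    Dirichlet(1, ..., 1) prior, omitting the multinomial coefficient."""
    k = len(counts)
    return (log_multivariate_beta([c + 1 for c in counts])
            - log_multivariate_beta([1] * k))


def log_bf_independence(table):
    """Log Bayes factor of independence against the saturated model.

    Assumption: uniform priors on the row margin, the column margin and
    the full cell probabilities. The multinomial coefficient is common
    to both models and cancels. Positive values favour independence.
    """
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    cells = [x for r in table for x in r]
    return log_marginal(rows) + log_marginal(cols) - log_marginal(cells)


# Hypothetical tables: one balanced, one perfectly associated.
balanced = [[20, 20], [20, 20]]
associated = [[40, 0], [0, 40]]
print(log_bf_independence(balanced), log_bf_independence(associated))
```

A prior that concentrates more mass around the null model would change these values, which is the sensitivity the abstract studies.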
Wed, 28 May 2008 00:00:00 GMT
A unified approach for representing rows and columns in contingency tables
http://hdl.handle.net/10256/720
Cuadras, C.M.; Cuadras i Pallejà, Daniel
Daunis-i-Estadella, Pepus; Martín Fernández, Josep Antoni
By using suitable parameters, we present a unified approach for describing four methods
for representing categorical data in a contingency table. These methods are
correspondence analysis (CA), the alternative approach using Hellinger distance (HD),
the log-ratio (LR) alternative, which is appropriate for compositional data, and the
so-called non-symmetrical correspondence analysis (NSCA). We then compare these four
methods and give some illustrative examples.
Some approaches based on cumulative frequencies are also linked and studied using
matrices.
Key words: Correspondence analysis, Hellinger distance, Non-symmetrical correspondence analysis, log-ratio analysis, Taguchi inertia
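To make two of the four methods concrete, the sketch below computes the squared chi-square distance (the row metric underlying CA) and the squared Hellinger distance (HD) between two row profiles of a table. It uses the standard textbook definitions of both distances rather than the parameterisation of the paper, and the table and function names are hypothetical.

```python
from math import sqrt


def row_profiles(table):
    """Each row divided by its own total (conditional row distributions)."""
    return [[x / sum(row) for x in row] for row in table]


def column_masses(table):
    """Column totals divided by the grand total."""
    n = sum(sum(row) for row in table)
    return [sum(col) / n for col in zip(*table)]


def chi_square_distance2(table, i, k):
    """Squared chi-square distance between row profiles i and k."""
    prof, c = row_profiles(table), column_masses(table)
    return sum((a - b) ** 2 / cj for a, b, cj in zip(prof[i], prof[k], c))


def hellinger_distance2(table, i, k):
    """Squared Hellinger distance between row profiles i and k."""
    prof = row_profiles(table)
    return sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(prof[i], prof[k]))


# Hypothetical counts: rows 0 and 1 are proportional, row 2 differs.
table = [[10, 20, 30], [20, 40, 60], [30, 10, 5]]
print(chi_square_distance2(table, 0, 2), hellinger_distance2(table, 0, 2))
```

Both distances vanish for proportional rows. They differ in how columns are weighted: the chi-square distance depends on the column masses, while the Hellinger distance does not.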
Wed, 28 May 2008 00:00:00 GMT
Quantifying rock fabrics: a test of autocorrelation of the spatial distribution of crystals
http://hdl.handle.net/10256/719
Egozcue, Juan José; Mackenzie, J.R.; Heilbronner, Renée; Hielscher, Ralf; Müller, A.; Schaeben, Helmut
Daunis-i-Estadella, Pepus; Martín Fernández, Josep Antoni
A novel test of spatial independence of the distribution of crystals or phases in rocks
based on compositional statistics is introduced. It improves and generalizes the common
joins-count statistics known from map analysis in geographic information systems.
Assigning phases independently to objects in R^D is modelled by a single-trial multinomial
random function Z(x), where the probabilities of phases add to one and are
explicitly modelled as compositions in the K-part simplex S^K. Thus, apparent inconsistencies
of the tests based on the conventional joins-count statistics and their possibly
contradictory interpretations are avoided. In practical applications we assume that the
probabilities of phases do not depend on the location but are identical everywhere in
the domain of definition. Thus, the model involves the sum of r independent, identically
distributed 1-trial multinomial random variables, which is an r-trial multinomial
random variable. The probabilities of the distribution of the r counts can
be considered as a composition in the Q-part simplex S^Q. They span the so-called
Hardy-Weinberg manifold H, which is proved to be a (K-1)-dimensional affine subspace of S^Q. This is
a generalisation of the well-known Hardy-Weinberg law of genetics. If the assignment
of phases accounts for some kind of spatial dependence, then the r-trial probabilities
do not remain on H. This suggests using the Aitchison distance from the observed
probabilities to H to test dependence. Moreover, when there is a spatial fluctuation of
the multinomial probabilities, the observed r-trial probabilities move on H. This shift
can be used to check for these fluctuations. A practical procedure and an algorithm
to perform the test have been developed. Some cases applied to simulated and real
data are presented.
Key words: Spatial distribution of crystals in rocks, spatial distribution of phases,
joins-count statistics, multinomial distribution, Hardy-Weinberg law, Hardy-Weinberg
manifold, Aitchison geometry
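The smallest case, assumed here purely for illustration, has K = 2 phases and r = 2 trials, so Q = 3 and the r-trial probabilities are the classical Hardy-Weinberg proportions (p^2, 2pq, q^2). On H the log-ratio ln(x2^2 / (x1 x3)) equals ln 4, and the unit clr direction orthogonal to H is (-1, 2, -1)/sqrt(6), so the Aitchison distance from a 3-part composition to H has a closed form. The sketch below works through that special case and is not the general algorithm of the paper; the function names are hypothetical.

```python
from math import log, sqrt


def two_trial_probs(p):
    """Probabilities of the outcomes (2, 0), (1, 1), (0, 2) for the sum of
    two independent 1-trial draws with phase probabilities (p, 1 - p)."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def aitchison_distance_to_hw(x):
    """Aitchison distance from a strictly positive 3-part composition x
    to the Hardy-Weinberg manifold, for K = 2 and r = 2 only.

    The log-ratio is scale invariant, so x may hold raw proportions
    or counts.
    """
    x1, x2, x3 = x
    return abs(log(x2 * x2 / (4.0 * x1 * x3))) / sqrt(6.0)


# A composition on H, and one with too few mixed pairs (assumed values).
print(aitchison_distance_to_hw(two_trial_probs(0.3)))
print(aitchison_distance_to_hw((0.25, 0.25, 0.5)))
```

A distance near zero is compatible with independent assignment of phases. Turning the observed distance into a test still requires a reference distribution, which this sketch does not provide.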
Wed, 28 May 2008 00:00:00 GMT
Compositional data and Simpson's paradox
http://hdl.handle.net/10256/718
Egozcue, Juan José; Pawlowsky-Glahn, Vera
Daunis-i-Estadella, Pepus; Martín Fernández, Josep Antoni
Simpson's paradox, also known as amalgamation or aggregation paradox, appears when
dealing with proportions. Proportions are by construction parts of a whole, which can
be interpreted as compositions assuming they only carry relative information. The
Aitchison inner product space structure of the simplex, the sample space of compositions, explains the appearance of the paradox, given that amalgamation is a nonlinear
operation within that structure. Here we propose to use balances, which are specific
elements of this structure, to analyse situations where the paradox might appear. With
the proposed approach we obtain that the centre of the tables analysed is a natural
way to compare them, which avoids by construction the possibility of a paradox.
Key words: Aitchison geometry, geometric mean, orthogonal projection
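A small numerical sketch of the contrast between amalgamation and the centre follows. It rests on one reading of the abstract: each stratum is treated as a 2-part composition (successes, failures), and the centre is its closed geometric mean, which for two parts is equivalent to averaging the log-odds. The counts, names and this reading are assumptions made for illustration and are not taken from the paper.

```python
from math import log

# Illustrative (successes, failures) counts for two treatments in two strata.
treatment_a = [(81, 6), (192, 71)]
treatment_b = [(234, 36), (55, 25)]


def pooled_success_rate(strata):
    """Success rate after amalgamating all strata into a single table."""
    successes = sum(s for s, _ in strata)
    total = sum(s + f for s, f in strata)
    return successes / total


def centre_log_odds(strata):
    """Log-odds of the centre (closed geometric mean) of the strata,
    which equals the arithmetic mean of the stratum log-odds."""
    return sum(log(s / f) for s, f in strata) / len(strata)


# A has the higher success rate in every stratum...
assert all(sa / (sa + fa) > sb / (sb + fb)
           for (sa, fa), (sb, fb) in zip(treatment_a, treatment_b))

# ...yet amalgamation reverses the order, while the centre preserves it.
print(pooled_success_rate(treatment_a) < pooled_success_rate(treatment_b))
print(centre_log_odds(treatment_a) > centre_log_odds(treatment_b))
```

Because the centre averages log-odds, a treatment that is better in every stratum is necessarily better at the centre, so the reversal cannot occur under this comparison.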
Wed, 28 May 2008 00:00:00 GMT