CODAWORK’03
http://hdl.handle.net/10256/616
2017-12-18T04:59:06Z

Validation of order rank scales based on compositional data analysis: a proposal
http://hdl.handle.net/10256/696
2012-06-28T12:30:36Z
Malpica Lander, Claudia; Galindo Villardón, Purificación
Thió i Fernández de Henestrosa, Santiago; Martín Fernández, Josep Antoni
Psychometricians usually apply classical factorial analysis to evaluate the construct validity of order rank
scales. Nevertheless, these scales have particular characteristics that must be taken into account: total
scores and rank are highly relevant.
2003-10-17T00:00:00Z

Refinement criteria for global illumination using convex functions
http://hdl.handle.net/10256/695
2012-06-28T12:30:36Z
Rigau Vilalta, Jaume; Feixas Feixas, Miquel; Sbert, Mateu
Thió i Fernández de Henestrosa, Santiago; Martín Fernández, Josep Antoni
In several computer graphics areas, a refinement criterion is often needed to decide whether to go
on or to stop sampling a signal. When the sampled values are homogeneous enough, we assume that
they represent the signal fairly well and we do not need further refinement; otherwise, more samples are
required, possibly with adaptive subdivision of the domain. For this purpose, a criterion which is very
sensitive to variability is necessary. In this paper, we present a family of discrimination measures, the
f-divergences, meeting this requirement. These convex functions have been well studied and successfully
applied to image processing and several areas of engineering. Two applications to global illumination
are shown: oracles for hierarchical radiosity and criteria for adaptive refinement in ray-tracing. We
obtain significantly better results than with classic criteria, showing that f-divergences are worth further
investigation in computer graphics. A discrimination measure based on the entropy of the samples is also
introduced for refinement in ray-tracing. The recursive decomposition of entropy provides us with a natural
method to deal with the adaptive subdivision of the sampling region.
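As a rough illustration of how an f-divergence can serve as a refinement criterion, the sketch below measures how far a set of sampled values is from a perfectly homogeneous (uniform) distribution. The abstract does not give the exact criteria, so the comparison against the uniform distribution, the threshold, and all function names here are assumptions made for illustration only.

```python
import math


def normalize(values):
    """Turn non-negative sample values into a discrete probability distribution."""
    total = sum(values)
    if total <= 0:
        raise ValueError("samples must have a positive sum")
    return [v / total for v in values]


def f_divergence(p, q, f):
    """Csiszar f-divergence D_f(p || q) = sum_i q_i * f(p_i / q_i), assuming every q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def kl_generator(u):
    """f(u) = u log u yields the Kullback-Leibler divergence (with 0 log 0 taken as 0)."""
    return u * math.log(u) if u > 0 else 0.0


def chi2_generator(u):
    """f(u) = (u - 1)^2 yields the Pearson chi-square divergence."""
    return (u - 1.0) ** 2


def hellinger_generator(u):
    """f(u) = (1 - sqrt(u))^2 / 2 yields the squared Hellinger distance."""
    return 0.5 * (1.0 - math.sqrt(u)) ** 2


def needs_refinement(samples, f, threshold):
    """Hypothetical criterion: refine when the samples are far from homogeneous.

    The threshold is an illustrative parameter, not a value from the paper.
    """
    p = normalize(samples)
    q = [1.0 / len(p)] * len(p)  # reference: perfectly homogeneous samples
    return f_divergence(p, q, f) > threshold
```

For example, `needs_refinement([1, 1, 1, 1], kl_generator, 0.01)` reports that no refinement is needed, because identical samples have zero divergence from the uniform reference, whereas a set with one dominant sample exceeds the threshold.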
2003-10-17T00:00:00Z

Alternative ways to estimate change points in multinomial sequences. An application to an authorship attribution problem
http://hdl.handle.net/10256/694
2012-06-28T12:30:36Z
Riba, Alex; Ginebra, Josep
Thió i Fernández de Henestrosa, Santiago; Martín Fernández, Josep Antoni
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics
of a text that are rarely controlled by the author with those of other texts. When the
goal is to settle authorship questions, these characteristics should relate to the author’s style and
not to the genre, epoch or editor, and they should be such that their variation between authors is
larger than the variation within comparable texts from the same author.
For an overview of the literature on stylometry and some of the techniques involved, see for example
Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or
Lebart, Salem and Berry (1998).
Tirant lo Blanc, a book of chivalry, is the main work in Catalan literature and was hailed as
“the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers
such as Vargas Llosa and Dámaso Alonso to be the first modern novel in Europe, it has been translated
several times into Spanish, Italian and French, with modern English translations by Rosenthal
(1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465,
but it was not printed until 1490.
There is an intense and long-lasting debate around its authorship, stemming from its first edition,
whose introduction states that the whole book is the work of Martorell (1413?-1468), while the
end states that the last quarter of the book is by Galba (?-1490), after the death of
Martorell. Some of the authors who support the theory of single authorship are Riquer (1990),
Chiner (1993) and Badia (1993), while some of those supporting double authorship are Riquer
(1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).
Neither of the two candidate authors left any text comparable to the one under study, and therefore
discriminant analysis cannot be used to help classify chapters by author. By using sample texts
encompassing about ten percent of the book, and looking at word length and at the use of 44
conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might
indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and
Ginebra (2000) estimate the stylistic boundary to be near chapter 383.
Following the lead of the extensive literature, this paper looks into word length, the use of the most
frequent words and the use of vowels in each chapter of the book. Because the selected features
are categorical, this leads to three contingency tables with ordered rows and therefore to
three sequences of multinomial observations.
Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3
describes the problem of estimating a sudden change-point in those sequences. The following
sections propose various ways to estimate change-points in multinomial sequences: the method
in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models
onto the sequence of chi-square distances between each row profile and the average profile; and the
one in Section 6 fits models onto the sequence of values taken by the first component of the
correspondence analysis, as well as onto sequences of other summary measures such as the average
word length. In Section 7 we fit models onto the marginal binomial sequences to identify the
features that distinguish the chapters before and after that boundary. Most methods rely heavily
on the use of generalized linear models.
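To make the idea of a sudden change-point in a multinomial sequence concrete, the sketch below scans every possible boundary and keeps the one that maximises the likelihood of a two-regime model with constant category probabilities on each side. This is a deliberately simple stand-in, not the generalized linear models or gamma models used in the paper; the function names and the toy data are assumptions. A helper for the chi-square distance between each row profile and the average profile is included because the abstract mentions that summary.

```python
import math


def pooled_loglik(rows):
    """Maximised multinomial log-likelihood (up to a constant) when all rows share one profile."""
    k = len(rows[0])
    totals = [sum(row[j] for row in rows) for j in range(k)]
    n = sum(totals)
    return sum(c * math.log(c / n) for c in totals if c > 0)


def estimate_change_point(rows):
    """Return r such that rows[:r] and rows[r:] are best described by two different profiles."""
    best_r, best_ll = None, -math.inf
    for r in range(1, len(rows)):
        ll = pooled_loglik(rows[:r]) + pooled_loglik(rows[r:])
        if ll > best_ll:
            best_r, best_ll = r, ll
    return best_r


def chi_square_distances(rows):
    """Chi-square distance between each row profile and the average profile."""
    k = len(rows[0])
    col_totals = [sum(row[j] for row in rows) for j in range(k)]
    grand_total = sum(col_totals)
    average = [c / grand_total for c in col_totals]
    distances = []
    for row in rows:
        n = sum(row)
        distances.append(
            sum((row[j] / n - average[j]) ** 2 / average[j]
                for j in range(k) if average[j] > 0)
        )
    return distances


# Toy data (not from the book): five "chapters" with one profile, then four with another.
chapters = [[30, 10, 10]] * 5 + [[10, 10, 30]] * 4
boundary = estimate_change_point(chapters)
```

Here each row plays the role of one chapter and each column one category (for instance, a word-length class), so `boundary` is the number of chapters assigned to the first regime.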
2003-10-17T00:00:00Z

A compositional statistical analysis of capital stock
http://hdl.handle.net/10256/693
2012-06-28T12:30:36Z
Larrosa, Juan M.
Thió i Fernández de Henestrosa, Santiago; Martín Fernández, Josep Antoni
Most of the economic literature has presented its analysis under the assumption of a homogeneous capital stock.
However, capital composition differs across countries. What has been the pattern of capital composition
associated with world economies? We carry out an exploratory statistical analysis based on compositional data
transformed by Aitchison logratio transformations, and we use tools for visualizing and measuring statistical
estimators of association among the components. The goal is to detect distinctive patterns in the composition.
Initial findings include the following:
1. Sectoral components behaved in a correlated way, with building industries on one side and, less
clearly, equipment industries on the other.
2. Full sample estimation shows a negative correlation between the durable goods component and the
other buildings component, and between the transportation and building industries components.
3. Countries with zeros in some components are mainly low income countries at the bottom of the
income category, and they behaved in an extreme way, distorting the main results observed in the full
sample.
4. After removing these extreme cases, conclusions do not seem very sensitive to the presence of
other isolated cases.
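The abstract does not say which Aitchison logratio transformation was applied, so the sketch below assumes the centred logratio (clr) and pairs it with the variance of a pairwise logratio as one possible measure of association between components. It also shows why compositions containing zeros, as in finding 3, need separate handling: the logarithm is undefined there. All names are illustrative.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [v / total for v in parts]


def clr(parts):
    """Centred logratio: log of each part relative to the geometric mean of all parts."""
    if any(v <= 0 for v in parts):
        # Zeros cannot be log-transformed; they must be treated before this step.
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(v) for v in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def logratio_variance(compositions, i, j):
    """Sample variance of log(x_i / x_j) across compositions.

    Values near zero mean that parts i and j stay roughly proportional.
    """
    ratios = [math.log(row[i] / row[j]) for row in compositions]
    mean = sum(ratios) / len(ratios)
    return sum((r - mean) ** 2 for r in ratios) / (len(ratios) - 1)
```

Because clr depends only on ratios between parts, `clr([1, 2, 4])` and `clr(closure([1, 2, 4]))` give the same coordinates up to rounding, and the coordinates always sum to zero.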
2003-10-17T00:00:00Z