Techniques and tools. Corpus methods and statistics for semantics
by Dylan Glynn
An overview of the corpus methods and statistical techniques in Cognitive Semantics
Morphometrical Variations of Malaysian Hipposideros Species. 2012
Read Vijaya et al (2012)
A study of morphometric variation among four Malaysian Hipposideros species was conducted using voucher specimens deposited in the Universiti Malaysia Sarawak (UNIMAS) Zoological Museum and the Department of Wildlife and National Parks (DWNP), Kuala Lumpur. Twenty-two individuals from four species (Hipposideros ater, H. bicolor, H. cineraceus and H. dyacorum) were measured, and a total of 27 linear parameters of the body, skull and dentition were recorded for each. The measurements were then subjected to discriminant function analysis (DFA) and canonical variate analysis (CVA) using SPSS version 15.0, and to unweighted pair-group method with arithmetic mean (UPGMA) cluster analysis using Minitab version 14.4. The highest character loadings observed in Function 1, Function 2 and Function 3 were the forearm length (FA), the third digit second phalanx length (D3P2L) and the palatal length (PL), with standardised canonical discriminant function coefficients of 21.910, 5.770 and 5.095, respectively. These three characters were identified as the best diagnostic features for discriminating these closely related species of Hipposideros. Hence, this morphometric approach could be a promising alternative to molecular DNA analysis for identifying Chiroptera species.
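To make the UPGMA step concrete, here is a minimal Python sketch of the clustering rule itself. It is not the Minitab procedure used in the study: the labels `sp1` to `sp4` and all distances are made-up toy values, and the function name `upgma` is an assumption introduced for illustration.

```python
from itertools import combinations

def upgma(labels, dist):
    """Agglomerative UPGMA clustering.

    labels: list of item names.
    dist:   dict mapping frozenset({a, b}) -> distance between items a and b.
    Returns the merges in order as (cluster_a, cluster_b, merge_distance).
    """
    clusters = {name: 1 for name in labels}  # cluster id -> number of members
    d = dict(dist)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        new = (a, b)
        # Distance to the merged cluster is the size-weighted average,
        # which equals the mean of all pairwise item distances.
        for c in clusters:
            d[frozenset((new, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[new] = size_a + size_b
        merges.append((a, b, height))
    return merges

# Hypothetical toy distances between four specimens (not data from the study).
toy = {
    frozenset(("sp1", "sp2")): 2.0,
    frozenset(("sp1", "sp3")): 6.0,
    frozenset(("sp2", "sp3")): 6.0,
    frozenset(("sp1", "sp4")): 10.0,
    frozenset(("sp2", "sp4")): 10.0,
    frozenset(("sp3", "sp4")): 10.0,
}
merges = upgma(["sp1", "sp2", "sp3", "sp4"], toy)
for a, b, h in merges:
    print(a, "+", b, "at distance", h)
```

The sketch reports the distance at which each merge happens. Dendrogram software often plots half that value as the branch height.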
A Predictive Model to Differentiate the Fruit Bats Cynopterus brachyotis and C. cf. brachyotis Forest (Chiroptera: Pteropodidae) from Malaysia Using Multivariate Analysis. 2012
Read also Abdullah (2003)
Field discrimination of Cynopterus brachyotis and C. cf. brachyotis Forest (as designated by Francis 2008) in southern Thailand, Peninsular Malaysia, and Borneo is problematic. These 2 forms are sympatric in this region but are confined to different habitat types: C. brachyotis inhabits open habitats, orchards, and agricultural areas, while C. cf. brachyotis Forest is confined to primary and old secondary forests. In this study, we attempted to develop prediction models to identify both C. brachyotis and C. cf. brachyotis Forest in this region based on multivariate statistics. Two predictive models were generated using a canonical discriminant function, and it was found that 5 characters can be used to accurately identify museum vouchers of C. brachyotis and C. cf. brachyotis Forest. Four characters are needed for field identification of these 2 forms of Cynopterus in southern Thailand, Peninsular Malaysia, and Borneo. A review of the current taxonomy and classification indicated that there is a need to describe the 6 existing forms of the C. brachyotis complex in the Indo-Malayan region. This will aid conservationists, field ecologists, and taxonomists in taxonomic- and conservation-related decisions about this species complex.
Local Asymptotic Minimax Theory for Block-Decreasing Densities
Journal of Statistical Planning and Inference, Volume 142, Issue 8, August 2012, Pages 2322–2329
In this paper, we study Lebesgue densities on $\mathbb{R}_+^d$ that are non-increasing in each coordinate, while keeping all other coordinates fixed, from the perspective of local asymptotic minimax lower bound theory. In particular, we establish a local optimal rate of convergence of the order $n^{-1/(d+2)}$.
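To see how the quoted rate $n^{-1/(d+2)}$ degrades with dimension, the short Python sketch below evaluates its order for a few values of $d$. Constants are ignored, and the function name `local_minimax_rate` is an assumption introduced for illustration.

```python
def local_minimax_rate(n: int, d: int) -> float:
    """Order of the local rate n^(-1/(d+2)) quoted in the abstract (constants ignored)."""
    return n ** (-1.0 / (d + 2))

# For a fixed sample size, the rate slows down as the dimension d grows.
for d in (1, 2, 5):
    print(f"d={d}: n=10000 gives order {local_minimax_rate(10_000, d):.4f}")
```

For $d = 1$ the order reduces to $n^{-1/3}$, the familiar cube-root rate for estimating a monotone density at a point.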
Nonparametric estimation of multivariate scale mixtures of uniform densities
Journal of Multivariate Analysis, Volume 107, May 2012, Pages 71–89
Suppose that $\boldsymbol{U} = (U_1, \ldots , U_d)$ has a Uniform$([0,1]^d)$ distribution, that $\boldsymbol{Y} = (Y_1 , \ldots , Y_d)$ has the distribution $G$ on $\mathbb{R}_+^d$, and let $\boldsymbol{X} = (X_1 , \ldots , X_d) = (U_1 Y_1 , \ldots , U_d Y_d )$. The resulting class of distributions of $\boldsymbol{X}$ (as $G$ varies over all distributions on $\mathbb{R}_+^d$) is called the Scale Mixture of Uniforms class of distributions, and the corresponding class of densities on $\mathbb{R}_+^d$ is denoted by $\mathcal{F}_{\text{SMU}}(d)$. We study maximum likelihood estimation in the family $\mathcal{F}_{\text{SMU}}(d)$. We prove existence of the maximum likelihood estimator (MLE), establish Fenchel characterizations, and prove strong consistency of the almost surely unique MLE in $\mathcal{F}_{\text{SMU}}(d)$. We also provide an asymptotic minimax lower bound for estimating the functional $f \mapsto f(\boldsymbol{x})$ under reasonable differentiability assumptions on $f\in\mathcal{F}_{\text{SMU}}(d)$ in a neighborhood of $\boldsymbol{x}$. We conclude the paper with discussion, conjectures and open problems pertaining to global and local rates of convergence of the MLE.
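The construction $\boldsymbol{X} = (U_1 Y_1, \ldots, U_d Y_d)$ can be simulated directly. The Python sketch below rests on two assumptions that are not in the abstract: $\boldsymbol{U}$ is drawn independently of $\boldsymbol{Y}$, and the mixing distribution $G$ is taken, purely for illustration, to have independent Exponential(1) coordinates. The function names are hypothetical.

```python
import random

def sample_smu(sample_y, rng):
    """Draw one X = (U_1 Y_1, ..., U_d Y_d), with U uniform on [0,1]^d drawn independently of Y ~ G."""
    y = sample_y(rng)
    return tuple(rng.random() * y_i for y_i in y)

def exp_mixing(rng, d=2):
    # Hypothetical choice of G: independent Exponential(1) coordinates.
    return tuple(rng.expovariate(1.0) for _ in range(d))

rng = random.Random(0)
draws = [sample_smu(exp_mixing, rng) for _ in range(20000)]

# Empirical check on the first coordinate: more mass near zero than further out,
# as expected for a density that is non-increasing in each coordinate.
first = [x[0] for x in draws]
near = sum(1 for v in first if v < 0.5)
far = sum(1 for v in first if 0.5 <= v < 1.0)
print("mean of X_1:", sum(first) / len(first))
print("count in [0, 0.5):", near, " count in [0.5, 1):", far)
```

Under these assumptions $E[X_1] = E[U_1]\,E[Y_1] = 0.5$, so the printed mean should sit close to that value.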
Relationships between nutrient enrichment and the phytoplankton community at an Andean oligotrophic lake: a multivariate assessment
Co-authored with Gabriel J. Castaño-Villa
Ecología Aplicada 10(2): 75-81
Phytoplankton is one of the groups most sensitive to eutrophic conditions, and its disturbance has a potential negative bottom-up effect on lentic ecosystems. In this research we used a multivariate statistics approach (Canonical Correspondence Analysis with a Monte Carlo permutation test) to assess the relationships between nutrient enrichment and the diversity of the phytoplankton community. Four locations with different levels of eutrophication were sampled in the Bolivian sector of Lake Titikaka. Phytoplankton richness ranged from 8 to 16 genera, varying significantly among sites, and its diversity was significantly correlated with nitrogen, phosphorus, and pH (Monte Carlo test, p < 0.05). Phosphorus was determined to be the limiting nutrient in the ecosystem. Community structure assessment showed a non-aggregated distribution of genera among study sites, with few abundant genera, and a BDG analysis pointed to a log-series distribution, suggesting a niche partition that is not fully random. The methodological approach used here allowed a rapid assessment of the nutrient enrichment effect considering phytoplankton and nutrients as a whole, which is a more powerful approach than studying single-nutrient or single-group relationships with univariate procedures.
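The logic of a Monte Carlo permutation test can be sketched in a few lines of Python. This is a simplified stand-in, not the Canonical Correspondence Analysis used in the study: it tests a single Pearson correlation, the `nutrient` and `diversity` values are made-up toy numbers, and the function names are assumptions.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(x, y, n_perm=9999, seed=0):
    """Two-sided Monte Carlo permutation p-value for the correlation of x and y."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any association between x and y
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data (not measurements from the study).
nutrient = [1, 2, 3, 4, 5, 6, 7, 8]
diversity = [2.1, 2.0, 1.8, 1.7, 1.5, 1.2, 1.1, 0.9]
p_value = permutation_p_value(nutrient, diversity)
print("permutation p-value:", p_value)
```

Shuffling one variable simulates the null hypothesis of no association. The p-value is the share of shuffles that produce a correlation at least as extreme as the observed one.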
I massi incisi calcolitici della Valcamonica e della Valtellina: appunti per un nuovo percorso di ricerca [The Chalcolithic engraved boulders of Valcamonica and Valtellina: notes toward a new line of research]
(Italian). Co-authored with Alfredo Barbieri. Published in E. Anati (Ed.), Valcamonica Symposium 2004: nuove scoperte, nuove interpretazioni, nuovi metodi di ricerca. Darfo Boario Terme 8-14 settembre 2004 (preatti) (pp. 314-331), 2004. Capo di Ponte: Edizioni del Centro.
As progress is made in retrieving Valcamonica and Valtellina menhirs, the association rules and patterns to be found among recurrent figures grow ever more articulate. This essay looks for such recurrent patterns by running Principal Component Analysis, a robust multivariate technique known to work adequately on a high number of variables. The same procedure can be used to reveal new research paths or to support existing theory with evidence to be checked against raw data. This research approach is new to rock art studies, and the introduction of an innovative working methodology is among the goals of this paper. The database to be processed includes all the tracings of the engraved menhirs from Valcamonica and Valtellina published as of 2004, for a total of 43 for the former and 18 for the latter. Our results provide suggestions and information on various topics: the separation of menhirs into three major groups, generically called groups A, B, and C; a collection of traits for each group (the “sun” for group A, the “U” motif for group B, the human figures for group C); and the detection of the distinctive motifs of Valcamonica menhirs and of Valtellina menhirs. Our multivariate analysis has on the one hand confirmed some known issues (such as the presence of “male” and “female” menhirs, our A and B groups); on the other hand, it provides some insight into lesser-known issues, such as the existence of a group C and the geographical characteristics of menhirs. Moreover, the results of further research on human figure sequences suggest a new framework for interpreting the superimposition phenomenon that is not necessarily based on a strictly chronological theory. Finally, the vast amount of unpublished material represents a one-of-a-kind opportunity for model evaluation and possible extensions.
EstCRM: An R package for Samejima's Continuous IRT Model.
http://apm.sagepub.com/content/current
EstCRM: An R Package for Samejima’s Continuous IRT Model, Applied Psychological Measurement, March 2012, 36, 149-150
The Continuous Response Model (CRM) is an IRT model developed for continuous outcomes. CRM is not commonly used in practice, although it is as old as the well-known and popular binary and polytomous IRT models. This may be due to the lack of accessible software for estimating the model parameters. The R package EstCRM was developed to estimate the model parameters for the CRM.
Grave Typology and Chronology of a Lengyel Culture Settlement: Formalized Methods in Archaeological Data Processing.
by Peter Demján
2012, In J. Kolář, F. Trampota (Eds), Theoretical and Methodological Considerations in Central European Neolithic. Proceedings of the ‘Theory and Method in Archaeology of the Neolithic (7th - 3rd millennium BC)’ conference held in Mikulov, Czech Republic, 26th – 28th of October 2010. BAR International Series 2325.
The application of typological methods is a well-known means of finding regularities within an archaeological assemblage. A formalized approach is needed when dealing with larger sets of data to eliminate the bias that arises from defining types a priori and searching for structures which we already expect to exist. By creating a quantitative descriptive system and using multivariate statistical methods we are able to identify similarities and dissimilarities between single artefacts as well as between individual find complexes. These similarities (or factors) represent chronological and cultural relationships as well as post-deposition transformations. We can further sort find complexes using seriation (e.g. correspondence analysis) to elaborate a chronological sequence. This paper presents an application of a formalized typological and chronological method to graves and their inventories from the Lengyel Culture settlement in Svodín (Southwest Slovakia) using multivariate statistical analysis.
Neither dashboard nor ‘mashup’ indices: an empirical wealth approach as a pathway to a comprehensive measure of development
Universitat Autònoma de Barcelona. Departament d’Economia i d’Història Econòmica. UHE Working Paper 2012_01, 2012
The article is composed of two sections. The first is a critical review of the three main alternative indices to GDP proposed in recent decades: the Human Development Index (HDI), the Genuine Progress Indicator (GPI), and the Happy Planet Index (HPI). The review is conducted on the basis of conceptual foundations rather than issues of statistical consistency or mathematical refinement, which are the focus of most of the literature. The second, the pars construens, proposes an alternative measure, the composite wealth index, consistent with an approach to development based on the notion of composite wealth, which is in turn derived from an empirical common-sense criterion. Arguably, this approach can be conveyed in an easily understandable and coherent indicator, and is thus appropriate for tracking development in its various dimensions: simple in its formulation, the wealth approach can incorporate social and ecological goals without significant alterations to its conceptual foundations, while reducing arbitrary weighting to a minimum.
Geração de Diversidade na Otimização Dinâmica Multiobjetivo Evolucionária por Paisagens de Não-Dominância [Diversity Generation in Evolutionary Dynamic Multiobjective Optimization via Non-Dominance Landscapes]
MSc thesis, 2011 (in Portuguese).
The generation and maintenance of distinct solutions in Multiobjective Evolutionary Algorithms (MOEAs) is an open problem, especially in dynamic environments in which the criteria for evaluating solutions may vary over time, and there are few studies on how the different ways of generating diversity influence the quality of the optimal solution set. Including diversity generators in MOEAs can increase the cost of the evolutionary process and impair performance. Hence the need to mitigate the negative impact that rising dispersion within the population of candidate solutions has on its progress toward the surface where the optimal points lie, known as the Pareto Front (PF). In biological systems, immigration schemes increase the possible combinations of genetic exchanges, promoting diversity of evolutionary paths. Inspired by natural models of immigration, this research investigates the inclusion of atypical solutions (immigrants) in populations of candidate solutions as a way to generate diversity in MOEAs applied to dynamic multiobjective optimization. This dissertation also proposes and formalizes Non-Dominance Landscapes (NDLs) to guide the insertion of the generated immigrants into the population. The NDLs provide MOEAs with the probabilities of the immigrants being non-dominated in the population, obtained by estimating probability density functions and multivariate order statistics in the objective space. After characterizing the influence of diversity on the dynamics of approximating the PF in MOEAs, the NDLs were incorporated into the immigrant generators. The experimental validation of the NDL-based Diversity Generator (NDL-DG) demonstrates the potential of the proposed approach to increase the average quality of the evolved sets of non-dominated solutions.
Analysis of the results of incorporating the NDL-DG into the NSGA2 algorithm shows that solutions of higher average quality are obtained, with statistical significance, in 79% of the studied dynamic optimization scenarios, in terms of the offline Hypervolume indicator, when compared with populations evolved without the use of NDLs. We then identified the optimization scenarios in which the NDL-DG appears most promising. Finally, we indicated research directions for extending the range of application of NDLs to other open problems in evolutionary multiobjective optimization.
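Two notions used above, non-dominance and the Hypervolume indicator, can be illustrated with a minimal Python sketch. It assumes two objectives that are both minimised (the abstract does not state the direction), uses made-up points and a made-up reference point, and computes a plain static hypervolume rather than the offline, time-averaged indicator of the thesis. It does not implement NDLs or NSGA2.

```python
def dominates(a, b):
    """True if a dominates b when every objective is minimised."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

def hypervolume_2d(points, ref):
    """Area dominated by a 2-objective minimisation front, bounded by the reference point."""
    front = sorted(p for p in non_dominated(points) if p[0] < ref[0] and p[1] < ref[1])
    area, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:
        # Along a front sorted by f1, f2 decreases; add the new horizontal slab.
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area

# Hypothetical objective vectors and reference point.
population = [(1, 4), (2, 2), (4, 1), (3, 3)]
front = non_dominated(population)
hv = hypervolume_2d(population, ref=(5, 5))
print("non-dominated:", sorted(front))
print("hypervolume:", hv)
```

A larger hypervolume means the front covers more of the objective space below the reference point, which is why the indicator is used to compare the quality of solution sets.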
Sensitivity assessment of freshwater macroinvertebrates to pesticides using biological traits
A. Ippolito, R. Todeschini, M. Vighi
Ecotoxicology
Volume 21, Number 2, March 2012, Pages 336–352
Assessing the sensitivity of different species to chemicals is one of the key points in predicting the effects of toxic compounds in the environment. Trait-based predictive methods have proved to be extremely efficient for assessing the sensitivity of macroinvertebrates toward compounds with non-specific toxicity (narcotics). Nevertheless, predicting the sensitivity of organisms toward compounds with specific toxicity is much more complex, since it depends on the mode of action of the chemical. The aim of this work was to predict the sensitivity of several freshwater macroinvertebrates toward three classes of plant protection products: organophosphates, carbamates and pyrethroids. Two databases were built: one with sensitivity data (retrieved, evaluated and selected from the U.S. Environmental Protection Agency ECOTOX database) and the other with biological traits. Aside from the "traditional" traits usually considered in ecological analysis (e.g. body size, respiration technique, feeding habits), multivariate analysis was used to relate the sensitivity of organisms to some other characteristics which may be involved in the process of intoxication. Results confirmed that, besides traditional biological traits related to uptake capability (e.g. body size and body shape), some traits more closely related to particular metabolic characteristics or patterns have good predictive capacity for sensitivity to these kinds of toxic substances. For example, behavioral complexity, taken as an indicator of nervous system complexity, proved to be an important predictor of sensitivity towards these compounds. These results confirm the need for more complex traits to predict the effects of highly specific substances.
One key point for achieving a complete mechanistic understanding of the process is the choice of traits, whose role in the discrimination of sensitivity should be clearly interpretable, and not only statistically significant.
Age, cumulative (dis)advantage, and subjective well-being in employed and unemployed Germans: A moderated mediation model
With Rainer K. Silbereisen.
Published in Journal of Occupational Health Psychology (2012), 17, 93-104.
Available on request.
The negative impact of unemployment on subjective well-being (SWB) is well known, but the role of age in this relationship remains unclear. We suggest that cumulative advantage (or disadvantage) associated with the duration of current employment status may produce an age-related divergence in SWB between employed and unemployed individuals. We used cross-sectional data on employed (n = 1382) and unemployed (n = 254) Germans (age 18–42) surveyed in 2005. We found that, among currently employed individuals, relatively older age predicted longer employment duration (tenure), which was related to higher SWB via higher income and higher perceived occupational security. Among currently unemployed individuals, age predicted longer unemployment duration, which was associated with lower SWB via lower perceived social support. Thus, age was indirectly related to higher SWB in employed individuals and to lower SWB in unemployed individuals. In this way, cumulative advantage of long-term employment and cumulative disadvantage of long-term unemployment contributed to the age-related divergence in SWB between employed and unemployed Germans already in the first half of working life.
An R package to compute commonality coefficients in the multiple regression case: An introduction to the package and a practical example
by Mitzi Lewis
Nimon, K., Lewis, M., Kane, R., & Haynes, R. (2008). An R package to compute commonality coefficients in the multiple regression case: An introduction to the package and a practical example. Behavior Research Methods, 40(2), 457 – 466.
Multiple regression is a widely used technique for data analysis in social and behavioral research. The complexity of interpreting such results increases when correlated predictor variables are involved. Commonality analysis provides a method of determining the variance accounted for by respective predictor variables and is especially useful in the presence of correlated predictors. However, computing commonality coefficients is laborious. To make commonality analysis accessible to more researchers, a program was developed to automate the calculation of unique and common elements in commonality analysis, using the statistical package R. The program is described, and a heuristic example using data from the Holzinger and Swineford (1939) study, readily available in the MBESS R package, is presented.
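For the simplest case of two predictors, the unique and common elements follow directly from three $R^2$ values. The Python sketch below illustrates that arithmetic only. It is not the R program described above and does not reproduce its interface; the function name and the $R^2$ inputs are made-up assumptions.

```python
def commonality_two_predictors(r2_x1, r2_x2, r2_both):
    """Partition R^2 of a two-predictor regression into unique and common parts.

    r2_x1:   R^2 of the model with predictor x1 only.
    r2_x2:   R^2 of the model with predictor x2 only.
    r2_both: R^2 of the model with both predictors.
    """
    unique_x1 = r2_both - r2_x2          # variance explained only by x1
    unique_x2 = r2_both - r2_x1          # variance explained only by x2
    common = r2_x1 + r2_x2 - r2_both     # variance shared by x1 and x2
    return {"unique_x1": unique_x1, "unique_x2": unique_x2, "common": common}

# Hypothetical R^2 values for illustration.
parts = commonality_two_predictors(r2_x1=0.30, r2_x2=0.20, r2_both=0.40)
print(parts)
print("sum of parts:", sum(parts.values()))
```

The three parts always sum to the full-model $R^2$. With more predictors the number of components grows as $2^k - 1$, which is why automating the calculation is useful.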
Las cadenas operativas líticas de la mina de sílex de Casa Montero (Madrid) [The lithic operational sequences of the Casa Montero flint mine (Madrid)]
CASTAÑEDA, N., CAPOTE, M., CRIADO, C., CONSUEGRA, S., DÍAZ-DEL-RÍO, P., TERRADAS, X., OROZCO, T. 2008: Actas del IV Congreso Neolítico Peninsular (Alicante, 2006), tomo II: 231-234.

