Understanding the relationship among launch variables in the golf drive using neural network visualisations
by Peter Lamb
Lamb, P. F.
Sports Biomechanics. Volume iFirst article, Pages 1-13, 2012.
The aim of this study was to identify and characterise individual differences in launch conditions measured from the same hole during four rounds of a professional golf tournament. Launch data from the 18th tee at the 2009 Dubai World Championship were used for the analysis. Self-organising maps (SOMs) were chosen to visualise the potentially non-linear relationship among the launch variables. Several distinctly different types of drives were identified on the Output Map. Drives which carried the furthest were not necessarily associated with the highest rates of ball speed. As indicated by carry distance, the longest drives had backspin rates of roughly 2,700 rpm, a launch angle of 11°, a straight or slightly left-to-right curving ball flight (for right-handers), and reached an apex of about 36 m. These values are specific to the 18th hole at the Dubai World Championship and differ from the general launch recommendations found in the literature.
High-resolution imaging of the fusiform face area (FFA) using multivariate non-linear classifiers shows diagnosticity for non-face categories
Co-authored with Stephen J. Hanson and published in Neuroimage (2011), 54(2), 1715-34.
Does the "fusiform face area" (FFA) code only for faces? This question continues to elude the neuroimaging field due to at least two kinds of problems: first, the relatively low spatial resolution of fMRI in which the FFA was defined and second, the potential bias inherent in prevailing statistical methods for analyzing the actual diagnosticity of cortical tissue. Using high-resolution (1mmx1mmx1mm) imaging data of the fusiform face area (FFA) from 4 subjects who had categorized images as 'animal', 'car', 'face', or 'sculpture', we used multivariate linear and non-linear classifiers to decode the resultant voxel patterns. Prior to identifying the appropriate classifier we performed exploratory analysis to determine the nature of the distributions over classes and the voxel intensity pattern structure between classes. The FFA was visualized using non-metric multidimensional scaling revealing "string-like" sequences of voxels, which appeared in small non-contiguous clusters of categories, intertwined with other categories. Since this analysis suggested that feature space was highly non-linear, we trained various statistical classifiers on the class-conditional distributions (labelled) and separated the four categories with 100% reliability (over replications) and generalized to out-of-sample cases with high significance (up to 50%; p<.000001, chance=25%). The increased noise inherent in high-resolution neuroimaging data relative to standard resolution resisted any further gains in category performance above ~60% (with FACE category often having the highest bias per category) even coupled with various feature extraction/selection methods.
A sensitivity/diagnosticity analysis for each classifier per voxel showed (1) reliable sensitivity (S.E. < 3%) throughout the FFA for all 4 categories, and (2) multi-selectivity; that is, many voxels were selective for more than one category, with some showing high diagnosticity but at submaximal intensity. This work is clearly consistent with the characterization of the FFA as a distributed, object-heterogeneous similarity structure and bolsters the view that the FFA response to "FACE" stimuli in standard resolution may be primarily due to a linear bias, which has resulted from an averaging artefact.
Supraglacial lake assessment in the Sagarmatha region in the Nepal Himalaya
In Geospatial Techniques: Managing World Resources (Eds.) Thakur, J.K.; Singh, S.K.; Ramanathan, A.; Prasad, M.B.K.; Gossel, W. Springer, October 29, 2011
The global retreat of mountain glaciers since the mid-19th century may have severe ecological and economic impacts affecting human lives and infrastructure. There is a general pattern of glacial retreat in the Nepal Himalayas leading to increased production of glacial melt water and development of supraglacial lakes (formed on the surface of glaciers) that may pose hazards such as glacial lake outburst floods to downstream populated areas. Glacier classification is an important initial step in any glacier-related assessment; however, limitations of multivariate classification algorithms and spectral similarity of supraglacial features have posed significant challenges. Existing methods of classification such as thresholding and manual digitization may underestimate or overestimate cover classes. This paper demonstrates the utility of a hybrid approach that integrates classification tree analysis (CTA) and shape complexity analysis to better differentiate supraglacial cover types and delineate supraglacial lakes in the Everest region. We used 2004 Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) visible near-infrared (VNIR) and short-wave infrared (SWIR) imagery along with snow and vegetation indices, image band ratios and elevation derivatives to characterize and assess supraglacial conditions on the glaciers in the Everest region. The initial segmentation by CTA, a non-parametric, recursive partitioning method, was followed by a shape complexity analysis using the Square Pixel Metric (SqP) to further differentiate supraglacial features, mainly lakes. The results indicate that CTA is able to utilize ancillary variables to classify supraglacial cover types and to delineate supraglacial lakes. A shape complexity metric such as SqP was also effective in reducing misclassification of glacial lakes.
The results of this semi-automated, hybrid classification approach for supraglacial cover types have been promising in the Everest region, with very high overall accuracy.
Place disambiguation with co-occurrence models
CLEF Working Notes 2006, Alicante
In this paper we describe the geographic information retrieval system developed by the Multimedia & Information Systems team for GeoCLEF 2006 and the results achieved. We detail our methods for generating and applying co-occurrence models for the purpose of place name disambiguation, our use of named entity recognition tools and text indexing applications. The presented system is split into two stages: a batch text & geographic indexer and a real time query engine. The query engine takes manually crafted queries where the text component is separated from the geographic component. Two monolingual runs were submitted for the GeoCLEF evaluation, the first constructed from the title and description, the second included the narrative also. We explain in detail our use of co-occurrence models for place name disambiguation using a model generated from Wikipedia. The paper concludes with a full description of future work and ways in which the system could be optimised.
Classifying Tags using Open Content Resources
Published at WSDM 2009
Tagging has emerged as a popular means to annotate on-line objects such as bookmarks, photos and videos. Tags vary in semantic meaning and can describe different aspects of a media object. Tags describe the content of the media as well as locations, dates, people and other associated meta-data. Being able to automatically classify tags into semantic categories allows us to understand better the way users annotate media objects and to build tools for viewing and browsing the media objects. In this paper we present a generic method for classifying tags using third party open content resources, such as Wikipedia and the Open Directory. Our method uses structural patterns that can be extracted from resource meta-data. We describe the implementation of our method on Wikipedia using WordNet categories as our classification schema and ground truth. Two structural patterns found in Wikipedia are used for training and classification: categories and templates. We apply our system to classifying Flickr tags. Compared to a WordNet baseline, our method increases the coverage of the Flickr vocabulary by 115%. We can classify many important entities that are not covered by WordNet, such as London Eye, Big Island, Ronaldinho, geocaching and wii.
Geographic Information Retrieval: Classification, Disambiguation and Modelling
PhD Thesis
My thesis aims to augment the Geographic Information Retrieval process with information extracted from world knowledge. This aim is approached from three directions: classifying world knowledge, disambiguating placenames and modelling users. Geographic information is becoming ubiquitous across the Internet, with a significant proportion of web documents and web searches containing geographic entities, and the proliferation of Internet enabled mobile devices. Traditional information retrieval treats these geographic entities in the same way as any other textual data. In this thesis I augment the retrieval process with geographic information, and show how methods built upon world knowledge outperform methods based on heuristic rules. The source of world knowledge used is Wikipedia. Wikipedia has become a phenomenon of the Internet age and needs little introduction. As a linked corpus of semi-structured data, it is unsurpassed. Two approaches to mining information from Wikipedia are rigorously explored: initially I classify Wikipedia articles into broad categories; this is followed by much finer classification where Wikipedia articles are disambiguated as specific locations. The thesis concludes with the proposal of the Steinberg hypothesis: by analysing a range of Wikipedias in different languages I demonstrate that a localised view of the world is ubiquitous and inherently part of human nature. All people perceive closer places as larger and more important than distant ones. The core contributions of my thesis are in the areas of extracting information from Wikipedia, supervised placename disambiguation, and providing a quantitative model for how people view the world.
The findings clearly have a direct impact for applications such as geographically aware search engines, but in a broader context documents can be automatically annotated with machine readable meta-data and dialogue enhanced with a model of how people view the world. This will reduce ambiguity and confusion in dialogue between people or computers.
The Dynamic Stage Bayesian Network: identifying and modelling key stages in a temporal process.
Published in "Advances in Intelligent Data Analysis X" - Lecture Notes in Computer Science, Vol. 7014/2011, pp. 101-112. Springer, Berlin. Presented at Intelligent Data Analysis 2011, Porto, Portugal.
Data modeling using Bayesian Networks (BNs) has been investigated in depth for many years. More recently, Dynamic Bayesian Networks (DBNs) have been developed to deal with longitudinal datasets and exploit time dependent relationships in data. Our approach makes a further step in this context, by integrating into the BN framework a dynamic on-line data-selection process. The aims are to efficiently remove noisy data points in order to identify and model the key stages in a temporal process and to obtain better performance in classification. We tested our approach, called Dynamic Stage Bayesian Networks (DSBN), in the complex context of glaucoma functional tests, in which the available data is typically noisy and irregularly spaced. We compared the performances of DSBN with a static BN and a standard DBN. We also explored the potential of the technique by testing on another dataset from the Transport of London database. The results are promising and the potential of the technique is considerable.
Oblique Decision Trees Using Embedded Support Vector Machines in Classifier Ensembles
Classifier ensembles have emerged in recent years
as a promising research area for boosting pattern recognition
systems’ performance. We present a new base classifier that
utilizes Oblique Decision Tree technology based on Support
Vector Machines for the construction of oblique (non-axis
parallel) tests on the nodes of the induced decision tree. We
describe a number of heuristic techniques for enhancing the tree
construction process by better estimation of the gain obtained by
an oblique split at any tree node. We then show how embedding
the new classifier in an ensemble of classifiers using the classical
Hedge(β) algorithm boosts performance of the system. Testing
10-fold cross validation on UCI machine learning repository data
sets shows that the new hybrid classifiers outperform on average
by more than 2.1% both the WEKA implementation of C4.5
(J48) and the SMO implementation of SVM in WEKA. The
application of the particular ensemble algorithm is an excellent
fit for online-learning applications where one seeks to improve
performance of self-healing dependable computing systems based
on reconfiguration by gradually and adaptively learning what
constitutes good system configurations.
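The weighting step of Hedge(β) mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard multiplicative update, in which an expert that incurs loss l in a round has its weight scaled by β raised to the power l. The function name `hedge`, the toy loss values and the choice β = 0.5 are illustrative assumptions, not details taken from the paper, and the base classifiers (the SVM-based oblique trees) are not modelled here.

```python
def hedge(loss_rounds, beta=0.5):
    """Run a Hedge(beta)-style update over per-expert loss vectors in [0, 1].

    Returns the final normalised weight of each expert.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    n_experts = len(loss_rounds[0])
    weights = [1.0] * n_experts  # start from uniform weights
    for losses in loss_rounds:
        # Multiplicative update: an expert with loss l is scaled by beta**l.
        weights = [w * beta ** l for w, l in zip(weights, losses)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical losses for three base classifiers over four rounds.
rounds = [
    [0.0, 1.0, 0.5],
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
]
final = hedge(rounds, beta=0.5)
```

In this toy run the first classifier accumulates the least loss and so ends with the largest share of the weight, which is the behaviour an ensemble relies on when it gradually favours its better members.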
Instance-Based Classifiers to Discover the Gradient of Typicality in Data
Gagliardi, F. (2011) “Instance-Based Classifiers to Discover the Gradient of Typicality in Data”. In: Pirrone, R., Sorbello, F. (eds.) “AI*IA 2011: Artificial Intelligence Around Man and Beyond. XIIth International Conference of the Italian Association for Artificial Intelligence, Palermo, Italy, September 15-17, 2011. Proceedings”. LNCS vol. 6934, Springer Berlin, Heidelberg. Pp. 457-462. (ISBN: 978-3-642-23953-3) (DOI: 10.1007/978-3-642-23954-0_47) (Link: http://dx.doi.org/10.1007/978-3-642-23954-0_47 http://www.springer.com/978-3-642-23953-3)
One of the aims of machine learning and data mining is to discover useful and interesting knowledge from data. Usually instance-based (IB) classifiers are considered unsuitable for knowledge extraction tasks.
Conversely, in this paper we consider the families of IB classifiers based on prototype methods and on nearest-neighbours, and we show that some hybrid IB classifiers can infer a mixture of representative instances, varying from abstracted prototypes to previously observed atypical exemplars, which can be used to discover the “typicality structure” of learnt categories.
Experimental results show that one of the proposed hybrid classifiers (the Prototype exemplar learning classifier) detects a concise and meaningful set of representative instances varying from prototypical ones to atypical ones, which form a gradient of typicality.
This kind of class representation coheres with theories developed in cognitive science about how the human mind classifies.
Tuning support vector machines for minimax and Neyman-Pearson classification
Co-authored with R.G. Baraniuk and C.D. Scott. (IEEE Trans. on Pattern Analysis and Machine Intelligence, 32(10) pp. 1888-1898, October 2010.)
This paper studies the training of support vector machine (SVM) classifiers with respect to the minimax and Neyman-Pearson criteria. In principle, these criteria can be optimized in a straightforward way using a cost-sensitive SVM. In practice, however, because these criteria require especially accurate error estimation, standard techniques for tuning SVM parameters, such as cross-validation, can lead to poor classifier performance. To address this issue, we first prove that the usual cost-sensitive SVM, here called the 2C-SVM, is equivalent to another formulation called the 2ν-SVM. We then exploit a characterization of the 2ν-SVM parameter space to develop a simple yet powerful approach to error estimation based on smoothing. In an extensive experimental study, we demonstrate that smoothing significantly improves the accuracy of cross-validation error estimates, leading to dramatic performance gains. Furthermore, we propose coordinate descent strategies that offer significant gains in computational efficiency, with little to no loss in performance.
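To make the Neyman-Pearson criterion concrete, the sketch below picks a decision threshold on classifier scores that minimises the empirical miss rate while keeping the empirical false-alarm rate at or below a target α. It is only an illustration of the criterion itself: it does not implement the 2C-SVM, the 2ν-SVM or the smoothing-based error estimation described in the abstract, and the function name `np_threshold` and the score values are hypothetical.

```python
def np_threshold(neg_scores, pos_scores, alpha):
    """Choose a threshold t (predict positive when score > t).

    Among thresholds whose empirical false-alarm rate is at most alpha,
    return (t, miss_rate, false_alarm_rate) with the lowest miss rate.
    """
    candidates = sorted(set(neg_scores) | set(pos_scores))
    best = None
    for t in candidates:
        false_alarm = sum(s > t for s in neg_scores) / len(neg_scores)
        miss = sum(s <= t for s in pos_scores) / len(pos_scores)
        if false_alarm <= alpha and (best is None or miss < best[1]):
            best = (t, miss, false_alarm)
    return best


# Hypothetical decision scores for the negative and positive classes.
neg = [0.1, 0.2, 0.3, 0.4, 0.6]
pos = [0.35, 0.5, 0.7, 0.8, 0.9]
threshold, miss_rate, fa_rate = np_threshold(neg, pos, alpha=0.2)
```

Because the constraint and the objective are both estimated from finite samples, small errors in those estimates can shift the chosen threshold noticeably, which is the sensitivity to error estimation that the abstract highlights.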
Signal processing with compressive measurements
Co-authored with P.T. Boufounos, M.B. Wakin, and R.G. Baraniuk. (IEEE J. of Selected Topics in Signal Processing, 4(2) pp. 445-460, April 2010.)
The recently introduced theory of compressive sensing enables the recovery of sparse or compressible signals from a small set of nonadaptive, linear measurements. If properly chosen, the number of measurements can be much smaller than the number of Nyquist-rate samples. Interestingly, it has been shown that random projections are a near-optimal measurement scheme. This has inspired the design of hardware systems that directly implement random measurement protocols. However, despite the intense focus of the community on signal recovery, many (if not most) signal processing problems do not require full signal recovery. In this paper, we take some first steps in the direction of solving inference problems (such as detection, classification, or estimation) and filtering problems using only compressive measurements and without ever reconstructing the signals involved. We provide theoretical bounds along with experimental results.
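The idea of classifying directly from compressive measurements, without reconstructing the signal, can be sketched as follows. The example assumes a Gaussian random measurement matrix and a simple nearest-template rule applied in the measurement domain; the template signals, the dimensions (256 samples reduced to 32 measurements), the noise level and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def random_matrix(m, n, rng):
    """m x n Gaussian matrix, scaled so squared norms are preserved in expectation."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Compressive measurements y = phi @ x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def classify(y, phi, templates):
    """Return the label whose projected template lies closest to y.

    The original signal is never reconstructed.
    """
    best_label, best_dist = None, float("inf")
    for name, template in templates.items():
        dist = math.dist(y, measure(phi, template))
        if dist < best_dist:
            best_label, best_dist = name, dist
    return best_label


rng = random.Random(0)
n, m = 256, 32  # signal length and number of measurements (illustrative)
templates = {
    "low": [math.sin(2 * math.pi * 3 * i / n) for i in range(n)],
    "high": [math.sin(2 * math.pi * 20 * i / n) for i in range(n)],
}
phi = random_matrix(m, n, rng)
noisy = [v + rng.gauss(0.0, 0.1) for v in templates["high"]]
label = classify(measure(phi, noisy), phi, templates)
```

Here the decision uses 32 numbers rather than 256 samples, and the two templates remain well separated after projection because the random matrix approximately preserves distances between them.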
Manifold-based approaches for improved classification
Co-authored with C. Hegde, M.B. Wakin, and R.G. Baraniuk. (NIPS Workshop on Topology Learning, Whistler, Canada, December 2007.)
While manifold structure is often exploited for dimensionality reduction or feature extraction, this structure is rarely used by classification algorithms. We present a class of algorithms that utilize the low-dimensional manifold nature of signal ensembles and result in improved classification performance. The algorithms are built within theoretical frameworks that take into consideration prior knowledge of geometric structure in both labeled and unlabeled data points. Additionally, these frameworks can exploit recent results on random projections of smooth manifolds to ensure computational feasibility on extremely high-dimensional problems.
Efficient machine learning using random projections
Co-authored with C. Hegde, M.B. Wakin, and R.G. Baraniuk. (NIPS Workshop on Efficient Machine Learning, Whistler, Canada, December 2007.)
As an alternative to cumbersome nonlinear schemes for dimensionality reduction, the technique of random linear projection has recently emerged as a viable option for storage and rudimentary processing of high-dimensional data. We invoke new theory to motivate the following claim: the random projection method may be used in conjunction with standard algorithms for a multitude of machine learning tasks, with virtually no degradation in performance. Thus, random projections can be shown to result in both significant computational savings and provably good performance.
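A rough sketch of why random linear projection can stand in for the original data: pairwise distances are approximately preserved after projection, so distance-based learning algorithms see nearly the same geometry. The example assumes a Gaussian projection matrix scaled by 1/sqrt(k); the dimensions (1000 reduced to 200), the random test points and the function name `project` are illustrative assumptions rather than details from the paper.

```python
import math
import random


def project(points, k, rng):
    """Map each point to k dimensions with one shared Gaussian random matrix."""
    d = len(points[0])
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(r * v for r, v in zip(row, p)) for row in matrix] for p in points]


rng = random.Random(42)
d, k = 1000, 200  # original and reduced dimension (illustrative)
points = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(5)]
low = project(points, k, rng)

# Ratio of projected to original distance for every pair of points.
ratios = [
    math.dist(low[i], low[j]) / math.dist(points[i], points[j])
    for i in range(len(points))
    for j in range(i + 1, len(points))
]
```

The ratios cluster around 1, and the spread shrinks as k grows, which is the trade-off between computational savings and distortion that the abstract alludes to.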
Multiscale random projections for compressive classification
Co-authored with M.F. Duarte, M.B. Wakin, J.N. Laska, D. Takhar, K.F. Kelly, and R.G. Baraniuk. (Proc. IEEE International Conference on Image Processing (ICIP), San Antonio, Texas, September 2007.)
We propose a framework for exploiting dimension-reducing random projections in detection and classification problems. Our approach is based on the generalized likelihood ratio test; in the case of image classification, it exploits the fact that a set of images of a fixed scene under varying articulation parameters forms a low-dimensional, nonlinear manifold. Exploiting recent results showing that random projections stably embed a smooth manifold in a lower-dimensional space, we develop the multiscale smashed filter as a compressive analog of the familiar matched filter classifier. In a practical target classification problem using a single-pixel camera that directly acquires compressive image projections, we achieve high classification rates using many fewer measurements than the dimensionality of the images.
