Generative Oscillation - A Cognitive Model for the Emergence of Language
Research Material for a discontinued PhD
DRAFT COPY ONLY
NOT READY FOR PRINT PUBLICATION
The GO model proposes a co-generative view of the emergence of language. Most conventional linguistic models conceive of language as a representational system of symbols which refer to events, either mental or external to the organism. This representational function is said to motivate the linguistic system and (depending upon the linguistic model) largely control its form. The GO (Generative Oscillation) model proposed here recognizes the representational role of language. However, it notes that as the mental linguistic system itself becomes efficiently organized, it creates an internal logic and drive of its own. To some extent this internally motivated linguistic system is conceived as overriding the external motivation to represent another reality. Since the internal linguistic system is dynamic and generative, it may give rise to linguistic output which seems strange in an inter-human communicative context (or even within the reflective mind of the creator). Thus while the external communicative context can become a constraint on unmotivated non-representational "internal language", it might not eliminate it. The Generative Oscillation model proposes that actual language production is an oscillating compromise between the representational function of language and the mental "language bot" itself (i.e. an internal self-organizing system), which is generating language strings just because that is what language bots do. As far as I know, the Generative Oscillation Model, or anything like it, had not been suggested before in linguistics at the time of writing. Some conventional linguists may find it a bit "off the wall".
Os dicionários onomasiológicos e as ontologias computorizadas (Onomasiological Dictionaries and Computerized Ontologies)
2009 "Os dicionários onomasiológicos e as ontologias computorizadas". In Simões, A.; Almeida, J. J.; Guinovart, X. (2009). Linguamática, Nº 2, December 2009. Pp. 93-105. ISSN: 1647-0818. URL: http://linguamatica.com/index.php/linguamatica/article/view/34/46.
This article aims to build a bridge between onomasiological dictionaries and recent computerized, or formal, ontologies.
The concepts of onomasiology and of the onomasiological dictionary are presented here so that they can serve as auxiliary tools in the work that has been carried out on ontologies. Some of the practical and theoretical criticisms that these dictionaries received at the time of their publication are also set out, so that they may be useful in the construction of modern ontologies.
I will also briefly note what is being done today, in the practice of building computerized ontologies, to overcome some of the limitations attributed to onomasiological lexicographic products.
Segmentation Similarity and Agreement
Proceedings of Human Language Technologies: The 2012 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2012), Montreal, Canada, June 2012
We propose a new segmentation evaluation metric, called segmentation similarity (S), that quantifies the similarity between two segmentations as the proportion of boundaries that are not transformed when comparing them using edit distance, essentially using edit distance as a penalty function and scaling penalties by segmentation size. We propose several adapted inter-annotator agreement coefficients that use S and are suitable for segmentation. We show that S is configurable enough to suit a wide variety of segmentation evaluations, and is an improvement upon the state of the art. We also propose using inter-annotator agreement coefficients to evaluate automatic segmenters in terms of human performance.
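As a rough illustration of the idea in this abstract, the sketch below computes a simplified similarity score for two segmentations of the same text. It is not the paper's definition: it assumes a single boundary type and counts only added or removed boundaries as edits, whereas the abstract does not spell out which edit operations are used. The function names and the segment-length input format are hypothetical choices made for this example.

```python
def boundary_positions(masses):
    """Turn segment lengths, e.g. [2, 3, 1], into the set of internal boundary indices."""
    positions = set()
    total = 0
    for mass in masses[:-1]:  # the end of the last segment is not an internal boundary
        total += mass
        positions.add(total)
    return positions


def segmentation_similarity(masses_a, masses_b):
    """Simplified S: 1 - (boundary edits / potential boundaries).

    Assumption: an edit is a boundary present in one segmentation but not
    the other; no other edit operations are modelled here.
    """
    if sum(masses_a) != sum(masses_b):
        raise ValueError("both segmentations must cover the same number of units")
    potential = sum(masses_a) - 1  # places where a boundary could be put
    if potential <= 0:
        return 1.0  # a single unit can only be segmented one way
    edits = len(boundary_positions(masses_a) ^ boundary_positions(masses_b))
    return 1.0 - edits / potential


if __name__ == "__main__":
    # Two annotators split the same six sentences differently.
    print(segmentation_similarity([2, 4], [3, 3]))
```

Scaling by the number of potential boundaries keeps the score between 0 and 1 regardless of document length, which is the "scaling penalties by segmentation size" step the abstract mentions.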
Purposive Constructions in English
The detailed analysis of Purposive Constructions in this long paper will help researchers to clarify these phenomena in English, even though the linguistic model employed, Chomsky's Government and Binding, has (in my view) been superseded.
Abstract: This thesis* explores some of the syntactic & semantic properties of Purposive Constructions in English. The term "purposive" is recognized as a semantic concept which finds regular expression in a small range of syntactic configurations. Purpose Clauses (PCs) and Rationale Clauses (Rat.Cs) are examined in some detail. Briefer reference is made to several other configurations, notably Because Clauses, So-That Clauses and Infinitival Relatives. In general Purposive Constructions comprise rather fuzzy semantic categories. Nevertheless, the main syntactic features are fairly clear. Interpretation of the constructions requires a systematic account of the control of empty slots (ellipted NPs) by thematic elements in the matrix clause. General conditions of Government and Binding appear adequate to predict the distribution of gaps in most Purposive Clauses. However, the relationship between propositions predicated of a common argument in these constructions is found to sometimes require matching conditions too subtle for syntax alone to predict. A concept of Thematic Coextensiveness is introduced to account for such matching.
Quicksum
Natural Language Processing
Quick Summary is an innovative implementation of an automatic document summarizer that takes a document in English as input and evaluates each sentence. The scanner, or evaluator, determines criteria based on each sentence's grammatical structure and place in the paragraph. The program then asks the user to specify the number of sentences to highlight. For example, if the user asks for the three most important sentences, it highlights the first and most important sentence in green. Commonly this is the sentence containing the conclusion. Then Quick Summary finds the second most important sentence, usually called a satellite, and highlights it in yellow. This is usually the topic sentence. Then the program finds the third most important sentence and highlights it in red. Applications of this technology are useful in a society of information overload, in which a person typically receives 42 emails a day (Microsoft). Another application is producing meeting summaries from video for peace officer records and sports broadcasters. The paper also takes a candid look at the difficulty that machine learning has with literal textual translation. However, it discusses how to overcome the obstacles that have historically prevented progress. This research differs from other research attempts because it takes into account heuristics in the paragraph and treats the document not as a list of disjointed sentences but as sentences that each contribute meaning to the others until the document reaches a pivotal point. Prior methods of document summary generation included removing redundant words or junction words until a nuclear core remains. This paper proposes mathematical metadata criteria that justify the importance ranking of a sentence.
Like tools for the study of relational symmetry in bioinformatics, this tool seeks to classify words with greater clarity.
"Survey Finds Workers Average Only Three Productive Days per Week." Microsoft News Center. Microsoft. Web. 31 Mar. 2012.
Applications Of Natural Language Processing In Biodiversity Science
by Anne Thessen
in press for Advances in Bioinformatics
Sentiment Analysis amidst Ambiguities in YouTube Comments on Yoruba Language (Nollywood) Movies
In Proceedings of the 21st international World Wide Web Conference (WWW2012), April 16 - April 20, 2012, Lyon, France
Nollywood is the second largest movie industry in the world in terms of annual movie production. A dominant number of the movies are in the Yoruba language, spoken by over 20 million people across the globe. The number of Yoruba language movies uploaded to YouTube, and of their corresponding comments, is growing exponentially. However, YouTube comments made by native speakers on Yoruba movies combine English, Yoruba, and other commonly used “pidgin” Yoruba words. Since Yoruba is still a resource-constrained language, existing sentiment or subjectivity analysis algorithms perform poorly on YouTube comments made on Yoruba language movies. This is because of ambiguities in the constrained language. In this work, we present an automatic sentiment analysis algorithm for YouTube comments on Yoruba language movies. The algorithm uses the SentiWordNet thesaurus and a lexicon of commonly used Yoruba language sentiment words and phrases. In terms of precision-recall, the algorithm outperforms a state-of-the-art sentiment analysis technique by up to 20%.
Performance and Trends in Recent Opinion Retrieval Techniques
Sylvester Olubolu Orimaye, Saadat M. Alhashmi and Siew Eu-Gene
Faculty of Information Technology, Monash University
email: {sylvester.orimaye, alhashmi, siew.eu-gene}@monash.edu
The Knowledge Engineering Review (in press), Cambridge University Press.
This paper presents trends and performance of opinion retrieval techniques proposed within the last eight years. We identify major techniques in opinion retrieval and group them into four popular categories. We describe the state-of-the-art techniques for each category and emphasize their performance and limitations. We then summarize with a performance comparison table for the techniques on different datasets. Finally, we highlight possible future research directions that can help solve existing challenges in opinion retrieval.
Extracting semantic relations through the analysis of terms correlation in documents
by Ivan Ricarte
Co-authored with Sergio W Botero. Published in Brazilian Symposium in Information and Human Language Technology, 2009. In Portuguese.
Ontologies are important to organize and describe information, but are hard to create and maintain, which motivates the development of tools to help in this task. This article presents a strategy to extract, from a corpus of documents in a given domain, semantic elements expressing proximity relations between terms and concepts to help the construction of domain ontologies. The technique presented here, ACT, is based on linguistic processing, machine learning, and biclustering. Results show that concepts obtained by ACT are at least as good as those from similar techniques, such as LSI and NMF. Compared with those techniques, it has the additional advantage of allowing supervision by a domain expert.
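突破信息超载 展现时代效率—全自然语言处理系统 InQuizit (Breaking Through Information Overload, Showing the Efficiency of the Era: The Full Natural Language Processing System InQuizit)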
by Yi Shen
Journal of Information, 2000(6)
A Descriptive Analysis of Natural Language Understanding System

