ERSE: an expert retrieval system for electronic databases
Authors: P. Shoval, B. Arazi, E. Gudes, D. Ephraim
Journal: Expert Systems for Information Management (1990), Vol. 3 (2), 85-114.
Evaluation of Traceability Recovery in Context: A Taxonomy for Information Retrieval Tools
by Markus Borg
Draft version of paper presented at the 16th International Conference on Evaluation & Assessment in Software Engineering, Ciudad Real, Spain, May 2012.
Background: Development of complex, software-intensive systems generates large amounts of information. Several researchers have developed tools implementing information retrieval (IR) approaches to suggest traceability links among artifacts. Aim: We explore the consequences of the fact that a majority of the evaluations of such tools have been focused on benchmarking of mere tool output. Method: To illustrate this issue, we have adapted a framework of general IR evaluations to a context taxonomy specifically for IR-based traceability recovery. Furthermore, we evaluate a previously proposed experimental framework by conducting a study using two publicly available tools on two datasets originating from development of embedded software systems. Results: Our study shows that even though both datasets contain software artifacts from embedded development, the characteristics of the two datasets differ considerably, and consequently so do the traceability outcomes. Conclusions: To enable replications and secondary studies, we suggest that datasets should be thoroughly characterized in future studies on traceability recovery, especially when they cannot be disclosed. Also, while we conclude that the experimental framework provides useful support, we argue that our proposed context taxonomy is a useful complement. Finally, we discuss how empirical evidence of the feasibility of IR-based traceability recovery can be strengthened in future research.
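As a rough illustration of what "IR approaches to suggest traceability links" can mean in practice, the sketch below ranks candidate links between two artifact sets by TF-IDF cosine similarity. It is a generic vector space model, not the tools evaluated in the paper; the function names, the whitespace tokenizer, and the smoothed IDF weighting are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector (term -> weight dict) per tokenized document."""
    n = len(token_lists)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # The "+ 1.0" keeps terms that occur in every document from vanishing.
        vectors.append({t: tf[t] * (math.log(n / df[t]) + 1.0) for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is empty."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def candidate_links(sources, targets):
    """Rank all (source, target) pairs by similarity, best first.

    `sources` and `targets` map artifact IDs to text; IDs are assumed to be
    distinct across the two dictionaries.
    """
    ids = list(sources) + list(targets)
    texts = list(sources.values()) + list(targets.values())
    vectors = dict(zip(ids, tfidf_vectors([tokenize(t) for t in texts])))
    links = [(s, t, cosine(vectors[s], vectors[t])) for s in sources for t in targets]
    return sorted(links, key=lambda link: link[2], reverse=True)
```

Calling `candidate_links` with a dictionary of requirements and a dictionary of test cases yields a ranked list of `(source_id, target_id, score)` triples. Inspecting only that list is the "benchmarking of mere tool output" the abstract cautions against, as opposed to evaluating the tool in its context of use.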
Crossover phenomenon in the performance of an Internet search engine
Co-authored with Lucas Lacasa and Andrew Berdahl
In this work we explore the ability of the Google search engine to find results for random N-letter strings. These random strings, dense over the set of possible N-letter words, address the existence of typos, acronyms, and other words without semantic meaning. Interestingly, we find that the probability of finding such strings sharply drops from one to zero at Nc = 6. The behavior of this order parameter suggests the presence of a transition-like phenomenon in the geometry of the search space. Furthermore, we define a susceptibility-like parameter which reaches a maximum in the neighborhood of Nc, suggesting the presence of criticality. We finally speculate on the possible connections to Ramsey theory.
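The measurement described above can be sketched as a simple sampling loop: draw random N-letter strings and record the fraction for which a search returns at least one result. The sketch below is an assumption-laden illustration, not the authors' procedure. In particular, `has_results` is a hypothetical callable standing in for a real search query, and `toy_oracle` is an artificial stub that merely mimics a sharp drop at N = 6.

```python
import random
import string


def random_string(n, rng):
    """Draw an n-letter string uniformly from the lowercase ASCII alphabet."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(n))


def hit_probability(n, has_results, samples=200, seed=0):
    """Estimate the fraction of random n-letter strings for which has_results is true."""
    rng = random.Random(seed)  # seeded so that repeated runs draw the same strings
    hits = sum(1 for _ in range(samples) if has_results(random_string(n, rng)))
    return hits / samples


def toy_oracle(query):
    """Hypothetical stand-in for a search engine: pretends strings under 6 letters are always found."""
    return len(query) < 6


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, hit_probability(n, toy_oracle))
```

Replacing `toy_oracle` with a function that issues real queries would give an empirical estimate of the order parameter as a function of N. How such queries are issued and counted is outside the scope of this sketch.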
Dynamical Information Retrieval Modelling: A Portfolio-Armed Bandit Machine Approach
by Marc Sloan
Co-authored with Jun Wang, poster paper published at WWW2012
The dynamic nature of document relevance is largely ignored by traditional Information Retrieval (IR) models, which assume that scores (relevance) for documents given an information need are static. In this paper, we formulate a general Dynamical Information Retrieval problem, where we consider retrieval as a stochastic, controllable process. The ranking action continuously controls the retrieval system’s dynamics and an optimal ranking policy is found that maximises the overall users’ satisfaction during each period. Through deriving the posterior probability of the documents’ evolving relevancy from user clicks, we can provide a plugin framework for incorporating a number of click models, which can be combined with Multi-Armed Bandit theory and Portfolio Theory of IR to create a dynamic ranking rule that takes rank bias and click dependency into account. We verify the versatility of our algorithms in a number of experiments and demonstrate significant performance gains over strong baselines.
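To make the idea of updating relevance from clicks and ranking with a bandit rule more concrete, here is a minimal sketch. It is not the paper's model: it assumes a plain Beta-Bernoulli posterior per document and a UCB-style exploration bonus, and it ignores rank bias, click dependency, and the portfolio component. All names (`DocStats`, `ucb_score`, `rank`, `update`) are invented for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class DocStats:
    """Click feedback observed so far for one document."""
    clicks: int = 0
    impressions: int = 0

    def posterior_mean(self, alpha=1.0, beta=1.0):
        """Mean of the Beta posterior over click probability, given a Beta(alpha, beta) prior."""
        return (self.clicks + alpha) / (self.impressions + alpha + beta)


def ucb_score(stats, total_impressions, c=1.0):
    """Posterior mean plus an exploration bonus that shrinks as impressions grow."""
    bonus = c * math.sqrt(math.log(total_impressions + 1) / (stats.impressions + 1))
    return stats.posterior_mean() + bonus


def rank(docs, c=1.0):
    """Order document IDs by score, best first; c=0 gives a purely greedy ranking."""
    total = sum(s.impressions for s in docs.values())
    return sorted(docs, key=lambda d: ucb_score(docs[d], total, c), reverse=True)


def update(docs, doc_id, clicked):
    """Record one impression of doc_id and whether it was clicked."""
    docs[doc_id].impressions += 1
    docs[doc_id].clicks += int(clicked)
```

With `c=0` the ranking exploits what has been observed so far, while a positive `c` promotes rarely shown documents so that their relevance estimates can be refined. That trade-off is the reason a bandit formulation suits relevance that changes over time.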
Information Retrieval on Mind Maps – What could it be good for?
by Joeran Beel
Joeran Beel, Bela Gipp, and Jan Olaf Stiller. Information Retrieval on Mind Maps – What could it be good for? In Proceedings of the 5th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom’09), pages 1–4, Washington (USA), November 2009. IEEE. ISBN 978-963-9799-76-9. Downloaded from www.docear.org
Mind maps are used by millions of people. In this paper we present how information retrieval on mind maps could be used to enhance expert search, document summarization, keyword-based search engines, document recommender systems and determining word relatedness. For instance, words in a mind map could be used for creating a skill profile of the mind map’s author and hence enhance expert search. This paper is a research-in-progress paper, which means that no research results are presented, only ideas.
An Exploratory Analysis of Mind Maps
by Joeran Beel
Joeran Beel and Stefan Langer. An Exploratory Analysis of Mind Maps. In Proceedings of the 11th ACM Symposium on Document Engineering (DocEng’11), Mountain View, California, USA, pages 81-84 2011. ACM. Downloaded from www.docear.org
The results presented in this paper come from an exploratory study of 19,379 mind maps created by 11,179 users from the mind mapping applications ‘Docear’ and ‘MindMeister’. The objective was to find out how mind maps are structured and which information they contain. Results include: A typical mind map is rather small, with 31 nodes on average (median), whereas each node usually contains between one and three words. In 66.12% of cases there are few notes, if any, and the number of hyperlinks tends to be rather low, too, but depends upon the mind mapping application. Most mind maps are edited only on one (60.76%) or two days (18.41%). A typical user creates around 2.7 mind maps (mean) a year. However, there are exceptions which create a long tail. One user created 243 mind maps, the largest mind map contained 52,182 nodes, one node contained 7,497 words and one mind map was edited on 142 days.
Tools for Reordering: Commonplacing and the Space of Words in Linnaeus's Philosophia Botanica, Intellectual History Review, 20 (2010), 227-252
Author: Matthew Daniel Eddy
Recent studies on commonplacing have shown that it flourished as an important information management tool and, in some cases, it functioned as a method (methodus) that facilitated the ordering of natural history systems. In what follows in this essay, I wish to extend this point by examining the role played by heads in the work of Carolus Linnaeus (Carl von Linné). I address two core questions. First, what were the economies of attention that guided his commonplacing techniques? Second, what type of impact did his note-taking skills have upon the way that he spatially arranged information in texts? Whereas intellectual historians sometimes tend to focus on the role that he played as the unique originator of modern botanical and zoological classification systems, I approach his work merely as one example in a long tradition of commonplacing and graphic design that originated in the Renaissance, but which had become an indispensable organisational tool used to create knowledge systems in the leading research centres of Enlightenment Europe.
Industrial comparability of student artifacts in traceability recovery research - An exploratory survey
by Markus Borg
Draft version of paper presented at the 16th European Conference on Software Maintenance and Reengineering, Szeged, Hungary, 2012.
About a hundred studies on traceability recovery have been published in software engineering fora. In roughly half of them, software artifacts developed by students have been used as input. To what extent student artifacts differ from industrial counterparts has not been fully explored in the literature. We conducted a survey among authors of studies on traceability recovery, including both academics and practitioners, to explore their perspectives on the matter. Our results indicate that a majority of authors consider software artifacts originating from student projects to be only partly representative of industrial artifacts. Moreover, only a few respondents validated student artifacts for industrial representativeness. Furthermore, our respondents made suggestions for improving the description of artifact sets used in studies by adding contextual, domain-specific and artifact-centric information. Example suggestions include adding descriptions of processes used for artifact development, the meaning of traceability links, and the structure of artifacts. Our findings call for further research on characterization and validation of software artifacts to support aggregation of results from empirical studies.
A reference model for intelligent information search
by Ivan Ricarte
Co-authored with Fernando Gomide. FLINT 2001
The paper provides a tutorial review of the current state of the art in the area of Web search, addressing information retrieval models and a reference model for intelligent information search. We first review current Web information search models and methods, followed by the contributions of machine learning and of artificial and computational intelligence. As a result, a reference model is sketched. Its purpose is to summarize the main relationships between computational intelligence and information search systems as a means to promote the development of innovative, intelligent information search systems.
A Reference Software Model for Intelligent Information Search
by Ivan Ricarte
Co-authored with Fernando Gomide. Book chapter: Enhancing the power of Internet, 2004
This chapter provides a tutorial review of the current state of the art in the area of Web search and addresses information retrieval models that induce a reference software model for intelligent search systems. For these purposes, we review current Web information search models and methods from the point of view of information retrieval systems. Next, we present a reference software model which abstracts the search and retrieval process. This abstraction is important for identifying the points of adaptation at which soft computing techniques can be integrated into information search and retrieval. We discuss the contributions that machine learning and artificial and computational intelligence have made to improve information retrieval models, to enhance information search effectiveness, and to develop intelligent information search. The purpose of the model is to capture the relationships between computational intelligence and information search systems as a means to promote development and implementation of innovative, intelligent information search systems.
Ontologia relacional fuzzy em sistemas de recuperação de informação
by Ivan Ricarte
Co-authored with Rachel Pereira and Fernando Gomide. Published in Encontro Nacional de Inteligência Artificial, 2005. In Portuguese.
Currently, document search in information retrieval systems is a common task performed daily. The considerable growth of documents in databases increases the need for better information retrieval models and algorithms using new techniques of artificial intelligence. This paper presents an information retrieval model based on ontologies encoded by fuzzy relations. The model uses the principles of fuzzy set theory and approximate reasoning for knowledge representation and information search. Two query algorithms are suggested. Experimental results show that the fuzzy relational ontological model achieves better performance when compared with two alternative approaches based on thesauri and fuzzy conceptual networks.
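One way to picture retrieval over an ontology encoded by fuzzy relations is max-min composition, a standard operation in fuzzy set theory. The sketch below is an illustration under stated assumptions and does not reproduce the paper's two query algorithms. It assumes a fuzzy term-to-term relation (the "ontology") and a fuzzy term-to-document relation (the "index"), both stored as nested dictionaries with memberships in [0, 1]; every name in it is hypothetical.

```python
def max_min_compose(fuzzy_set, relation):
    """Compose a fuzzy set with a fuzzy relation using max-min composition.

    For each y: membership(y) = max over x of min(fuzzy_set[x], relation[x][y]).
    """
    result = {}
    for x, mu_x in fuzzy_set.items():
        for y, mu_xy in relation.get(x, {}).items():
            result[y] = max(result.get(y, 0.0), min(mu_x, mu_xy))
    return result


def expand_query(query, ontology):
    """Fuzzy union (element-wise max) of the query and its image under the ontology."""
    expanded = dict(query)
    for term, mu in max_min_compose(query, ontology).items():
        expanded[term] = max(expanded.get(term, 0.0), mu)
    return expanded


def retrieve(query, ontology, index):
    """Score documents by composing the expanded query with the term-document relation."""
    return max_min_compose(expand_query(query, ontology), index)
```

For example, with an ontology relating "car" to "vehicle" at 0.9, a query for "car" also reaches documents indexed only under "vehicle", with a membership capped by the strength of that relation.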

