Exploring Manuscripts: Sharing Ancient Wisdoms across the Semantic Web
Co-authored with Mark Hedges, K. Faith Lawrence and Charlotte Tupman. For the ACM International Conference on Web Intelligence, Mining and Semantics (WIMS) June 13-15, 2012 Craiova, Romania
Recent work in digital humanities has seen researchers increasingly producing online editions of texts and manuscripts,
particularly in adoption of the TEI XML format for online
publishing. The benefits of semantic web techniques are underexplored in such research, however, with a lack of sharing
and communication of research information. The Sharing
Ancient Wisdoms (SAWS) project applies linked data practices to enhance and expand on what is possible with these
digital text editions. Focussing on Greek and Arabic collections of ancient wise sayings, which are often related to
each other, we use RDF to annotate and extract semantic information from the TEI documents as RDF triples.
This allows researchers to explore the conceptual networks
that arise from these interconnected sayings. The SAWS
project advocates a semantic-web-based methodology, enhancing rather than replacing current workflow processes,
for digital humanities researchers to share their findings and
collectively benefit from each other's work.
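As a rough illustration of the idea of extracting RDF triples from annotated TEI, the sketch below reads relationship annotations from a small TEI fragment and emits them as subject-predicate-object triples. It is not the SAWS project's implementation: the sample fragment, the `ex:isTranslationOf` relationship name, the `http://example.org/` base URI, and the helper names `extract_triples` and `to_ntriples` are all assumptions made for this example. It assumes relationships are recorded with TEI `<relation>` elements carrying `active`, `ref` and `passive` attributes.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical TEI fragment: two sayings and one relationship between them.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <seg xml:id="greek_1">(a Greek saying)</seg>
    <seg xml:id="arabic_7">(an Arabic saying)</seg>
    <listRelation>
      <relation active="#arabic_7" ref="ex:isTranslationOf" passive="#greek_1"/>
    </listRelation>
  </body></text>
</TEI>"""


def extract_triples(tei_xml):
    """Return (subject, predicate, object) tuples, one per <relation> element."""
    root = ET.fromstring(tei_xml)
    triples = []
    for rel in root.iter("{%s}relation" % TEI_NS):
        triples.append((rel.get("active"), rel.get("ref"), rel.get("passive")))
    return triples


def to_ntriples(triples, base="http://example.org/"):
    """Render triples as N-Triples-style lines under an assumed base URI."""
    lines = []
    for subj, pred, obj in triples:
        s = base + subj.lstrip("#")          # "#arabic_7" becomes base + "arabic_7"
        p = base + pred.split(":", 1)[-1]    # drop the illustrative "ex:" prefix
        o = base + obj.lstrip("#")
        lines.append(f"<{s}> <{p}> <{o}> .")
    return lines


if __name__ == "__main__":
    for line in to_ntriples(extract_triples(SAMPLE_TEI)):
        print(line)
```

Keeping the annotations inside the TEI file and deriving the triples from it matches the abstract's emphasis on enhancing, rather than replacing, an existing editing workflow.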
Making Sense of the News in a Hybrid Regime: How Young Russians Decode State TV and an Oppositional Blog
DRAFT VERSION. This is a revised version of a manuscript that is currently in the second round of peer-review at the "Journal of Communication". The research was supported by a research grant of the Thyssen Foundation, Germany, awarded to Florian Toepfl.
The past two decades have seen an increasingly intense debate on how the rise of internet-mediated communication has impacted politics in (semi-)authoritarian regimes. Previous works have adopted a wide array of approaches. Yet, to date no major study has investigated how citizens in these regimes are making sense of political messages they encounter online in the new, more fragmented media environment of the internet age. In an initial attempt to fill this gap, this explorative study juxtaposes how a specific group of Russians make sense of a liberal-democratic blog entry and, by contrast, a news broadcast from state-controlled TV. Based on findings from 20 in-depth interviews, the article discusses promising avenues for future audience research in hybrid regimes.
Building a Volunteer Community: Results and Findings from Transcribe Bentham
by Tim Causer
Forthcoming 2012. Co-authored with Valerie Wallace
This paper contributes to the literature examining the burgeoning field of academic crowdsourcing, by analysing the results of the crowdsourced manuscript transcription project, Transcribe Bentham. First, it describes how the project team sought to recruit volunteer transcribers to take part, and discusses which strategies were successes (and which were not). We then examine Transcribe Bentham’s results during its six-month testing period (8 September 2010 to 8 March 2011), which include a detailed quantitative and qualitative analysis of website statistics, work completed by the amateur transcribers, as well as the demographics of the volunteer base and their motivations for taking part. The paper concludes by discussing the success of our community building with reference to this analysis. We find that Transcribe Bentham’s volunteer transcribers have produced a remarkable amount of work – and continue to do so, carrying out the equivalent labour of a full-time transcriber – despite the nature and complexity of the task at hand.
Transcription Maximized; Expense Minimized? Crowdsourcing and Editing 'The Collected Works of Jeremy Bentham'
by Tim Causer
Co-authored with Justin Tonra and Valerie Wallace.
Published in Literary and Linguistic Computing, Vol. 27, Issue 2, 2012, pp. 119-137.
This paper discusses the award-winning crowdsourced manuscript transcription project, Transcribe Bentham, and how it will impact upon long-established editorial practices at the Bentham Project, University College London, which is producing the new and authoritative edition of The Collected Works of Jeremy Bentham. We site Transcribe Bentham in the burgeoning field of scholarly crowdsourcing projects, and attempt to assess the potential benefits of engaging the public in a seemingly complex task in order to further humanities research by detailing our experiences of running and administering the project.
The paper examines the conceptualisation and development of Transcribe Bentham, and how editorial practices at the Bentham Project may change as a result. We account for the design of the bespoke transcription tool which is at the project’s heart, and allows volunteers to transcribe the material and encode it in TEI-compliant XML. We attempt to answer five key questions: is crowdsourcing the transcription of complex manuscripts cost-effective? Is crowdsourcing exploitative? Would the volunteer-produced transcripts be of sufficient quality for editorial use and uploading to a digital repository, and what quality controls are required? Would crowdsourcing ensure sustainability and widen access to this priceless material? And finally, should the success of a project like Transcribe Bentham be measured solely according to cost-effectiveness or the volume of work produced, or do considerations of public engagement and access outweigh such concerns?
La representación digital de la génesis del texto. Un caso de estudio
D. Fiormonte – V. Martiradonna, “La representación digital de la génesis del texto. Un caso de estudio”, in Aurélie Arcocha-Scarcia, Javier Lluch-Prats y Mari Jose Olaziregui (eds.), En el taller del escritor: génesis textual y edición de textos, Bilbao, Servicio Editorial del País Vasco, pp. 147-176.
The Text As a Product and As a Process. History, Genesis, Experiments
Co-authored with Cinzia Pusceddu. Originally published in E. Vanhoutte – M. de Smedt (eds.), Manuscript, Variant, Genese – Genesis, Gent, Koninklijke Academie voor Nederlandse Taal- en Letterkunde, pp. 109-128.
In the last ten years there has been a growing consensus among textual scholars that one of the main purposes of a digital edition should be the presentation of a ‘mobile text’, in which each manuscript or witness – ancient or modern – can be read, compared and manipulated by various computing tools. However, is the concept of textual movement strictly related to the use of the computer? This paper argues that, not only is the idea of mobile text much older than the computer, but current digital tools and languages often seem inadequate to the task of representing the richness and complexity of such literary artefacts. The first part of this paper tries to reconstruct and discuss the birth of a new textual paradigm from its origins in the intellectual milieu of the 1930s, up to research into the writing process carried out in the 1980s. The paper then focuses on the theory of text as a ‘system’, as outlined by the Italian editor and philologist Gianfranco Contini. There follows a concise overview of the theoretical issues raised by genetic criticism in France, showing its connections with experiments carried out into the psychology of composition. The last part of the paper presents an ongoing experiment of genetic philology developed by the Digital Variants (www.selc.ed.ac.uk/italian/digitalvariants) team: the Magrelli Genetic Machine. Variant texts and brouillons d’écriture generously made available by the Italian author Valerio Magrelli are displayed through several interactive tools realised in Flash, and their original textual mouvance is reproduced by means of a dynamic display.
This experiment involved a number of practical steps, but also raised some theoretical questions: how to convert a printed text and a manuscript into digital format; how to create a dynamic text which appears as an image, simultaneously comparing different choices made by the author, etc.; and, finally, how to show the writing process. It is undeniable that writing with a computer implies a loss of knowledge about the genesis of a text, as writers do not usually save earlier versions. Digital Variants, by asking writers to save their writing-stages, can allow users to examine a variant text from the author’s as well as the reader’s perspective, and thus stimulate reflection on how the digital dimension can blur the boundaries between ‘authorial’ and ‘editorial’ practices.
The Beauty (and Darkness- No Need for Bias Here) of Language
This thought paper walks through some positive and negative aspects of language (verbal, written and symbolic), depending on how they are employed and interpreted. It also provides advice on how one can become a more effective practitioner of language.
Mind the gap. A preliminary evaluation of issues in combining text and music encoding
Co-authored with Joachim Veit, published in Die Tonkunst Nr.3 - 5, 2011
Textual representations and knowledge support-systems in research intensive networks.
2011: Vines, R., Hall, W.P., McCarthy, G. Textual representations and knowledge support-systems in research intensive networks. (in) Cope, B., Kalantzis, M., Magee, L. (eds). Towards a Semantic Web: Connecting Knowledge in Academic Research. Oxford: Chandos Press, pp. 145-195.
To support the increased efficacy and efficiency of research intensive networks and their impact in the world, we claim there is a need to expand the context of knowledge systems associated with research intensive networks. This idea for us involves the development of a public knowledge imperative. We suggest that textual representations expressed as knowledge claims can no longer be hidden away from the eyes of public scrutiny when there are important matters of public interest either implicitly or explicitly at stake. The recent catastrophe in the Gulf of Mexico provides an example of how particular types of knowledge, for example, procedures associated with offshore oil rigs, can rise up to become of the highest public priority almost overnight. To neglect the potency of such knowledge through a lack of public scrutiny can have devastating consequences, as the whole world has found out.
In this chapter we set out to provide a rationale as to why we think a public knowledge imperative is so important. To give expression to this imperative, we think there is a need for a new type of institutional and regulatory framework to protect and enhance the role of public knowledge. We call this framework a public knowledge space. It is public by virtue of the fact that it relies on semantic technologies and web publishing principles. But more importantly, in order to understand the multiple functions of a public knowledge space, we suggest it is first necessary to develop a detailed ontology of knowledge itself. Our ontology outlined in this chapter is broadly based because we emphasise the value of experience and lifeworlds as much as we do the importance of rigorous critiquing and transparent review. By extension, our views are slightly orthogonal to prevailing perspectives of the semantic web.
Encoding the Language of Landscape: XML and Databases at the Service of Anglo-Saxon Lexicography
by Peter Stokes
With E. Pierazzo. Perspectives on Lexicography in Italy and Europe, ed. by S. Bruti, R. Cella, and M. Foschi Albert (Newcastle, 2009), 203–38
Putting the Text back into Context: A Codicological Approach to Manuscript Transcription
by Peter Stokes
Second author; first author E. Pierazzo. Codicology and Palaeography in the Digital Age II, ed. Fischer et al. (Norderstedt, 2011), 397–430. http://kups.ub.uni-koeln.de/4360/
Textual scholars have tended to produce editions which present the text without its manuscript context. Even though digital editions now often present single-witness editions with facsimiles of the manuscripts, nevertheless the text itself is still transcribed and represented as a linguistic object rather than a physical one. Indeed, this is explicitly stated as the theoretical basis for the de facto standard of markup for digital texts: the Guidelines of the Text Encoding Initiative (TEI). These explicitly treat texts as semantic units such as paragraphs, sentences, verses and so on, rather than physical elements such as pages, openings, or surfaces, and some scholars have argued that this is the only viable model for representing texts. In contrast, this chapter presents arguments for considering the document as a physical object in the markup of texts. The theoretical arguments of what constitutes a text are first reviewed, with emphasis on those used by the TEI and other theoreticians of digital markup. A series of cases is then given in which a document-centric approach may be desirable, with both modern and medieval examples. Finally a step forward in this direction is raised, namely the results of the Genetic Edition Working Group in the Manuscript Special Interest Group of the TEI: this includes a proposed standard for documentary markup, whereby aspects of codicology and mise en page can be included in digital editions, putting the text back into its manuscript context.
A rationale for the TEI recommendations for feature-structure markup
Co-authored with Gary Simons, published in Computers and the Humanities 29, pp. 191-205, 1995
In this paper, we concentrate on justifying the decisions we made in developing the Text Encoding Initiative (TEI) recommendations for feature structure markup. The first four sections of this paper present the justification for the recommended treatment of feature structures, features and their values, and of combinations of features or values and of alternations and negations of features and their values. Section 5 departs briefly from the linguistic focus to argue that the markup scheme developed for feature structures is in fact a general-purpose mechanism that can be used for a wide range of applications. Section 6 describes an auxiliary document called a "feature system declaration" (FSD) that is used to document and validate a system of feature-structure markup. The seventh and final section illustrates the use of the recommended markup scheme with two examples, lexical tagging and interlinear text analysis.
Il testo digitale: traduzione, codifica, modelli culturali
Published in P. R. Piras, A. Alessandro, D. Fiormonte (a cura di), Italianisti in Spagna, ispanisti in Italia: la traduzione. Atti del Convegno Internazionale (Roma, 30 - 31 ottobre 2007), Roma, Edizioni Q, pp. 271-284
This paper is an attempt to apply cultural semiotics to the realm of digital textuality. The first part defines what digital humanities are, and the interconnections between the translation activity and the work of the digital text encoder. The second part provides examples of how markup languages, i.e. XML-TEI, by adding semiotic layers to a document, can both represent and manipulate the original source. In conclusion, digital encoding can be seen as a complex semiotic act that can have profound effects on the identity of our cultural artifacts.
Multi-Version Documents: a Digitisation Solution for Textual Cultural Heritage Artefacts
Desmond Schmidt and Domenico Fiormonte, "Multi-Version Documents: a Digitisation Solution for Textual Cultural Heritage Artefacts", Intelligenza Artificiale, IV, 1, pp. 56-61.
Textual cultural heritage artefacts present two serious
problems for the encoder: how to record different or revised
versions of the same work, and how to encode conflicting
perspectives of the text using markup. Both are
forms of textual variation, and can be accurately recorded
using a multi-version document, based on a minimally redundant
directed graph that cleanly separates variation
from content.
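To make the idea of a directed graph that separates variation from content more concrete, here is a minimal sketch. It is only an illustration, not the multi-version document format described in the paper: the arc layout, the sample sentence, the version labels "A" and "B", and the function `read_version` are assumptions made for this example. Each arc carries a text fragment and the set of versions it belongs to, so shared text is stored once and a version is read by following only the arcs labelled with it.

```python
# Each arc: (from_node, to_node, text_fragment, versions_that_contain_it).
# Text shared by several versions is stored once, on a single arc.
ARCS = [
    (0, 1, "The quick ", {"A", "B"}),
    (1, 2, "brown ", {"A"}),   # reading found only in version A
    (1, 2, "red ", {"B"}),     # reading found only in version B
    (2, 3, "fox", {"A", "B"}),
]


def read_version(arcs, version, start=0, end=3):
    """Reassemble one version by walking the arcs labelled with it."""
    fragments = []
    node = start
    while node != end:
        candidates = [a for a in arcs if a[0] == node and version in a[3]]
        if len(candidates) != 1:
            raise ValueError(f"version {version!r} has no unique arc at node {node}")
        _, node, fragment, _ = candidates[0]
        fragments.append(fragment)
    return "".join(fragments)


if __name__ == "__main__":
    print(read_version(ARCS, "A"))  # The quick brown fox
    print(read_version(ARCS, "B"))  # The quick red fox
```

In this toy graph the variation lives entirely in the version labels on the arcs, while the fragments themselves hold the content.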
