
When data become news. A content analysis of data journalism pieces.




Conference Paper
The Future of Journalism: Risks, Threats and Opportunities
September 10-11, 2015, Cardiff University, UK

When Data Become News
A content analysis of data journalism pieces

Wiebke Loosenᵃ, Julius Reimerᵃ & Fenja Schmidtᵇ
ᵃ Hans-Bredow-Institute for Media Research, Hamburg, Germany
ᵇ Institute of Journalism and Communication Studies, University of Hamburg, Germany
Corresponding author: Wiebke Loosen, w.loosen@hans-bredow-institut.de

Abstract

For journalism the phenomena of ‘big data’ and an increasingly data-driven society are doubly relevant: First, they are a topic worth covering so that the related developments and their consequences are made understandable and debatable for the public. Second, the ‘computational turn’ has already begun to affect practices of news production and is giving rise to novel ways to identify and tell stories. Thus, what we observe is the emergence of a new journalistic sub-field mostly described as ‘computational/data journalism’. This study focuses on the output of data journalism – with the aim of contributing to a better understanding of its reporting styles. The method used is a classical ‘handmade’ standardised content analysis. The sample consists of all the pieces that were nominated for the Data Journalism Award (DJA) – an award issued annually by the Global Editors Network – in 2013 and 2014 (n = 120). Categories of analysis look at, amongst other aspects, data sources and types, visualisation strategies, interactive features, topics, and types of nominated media outlets. Results show that over 40 percent of the data-driven pieces were published on the websites of (daily or weekly) newspapers; just over 20 percent came mainly from non-profit organisations for investigative journalism like ProPublica. Almost half of the cases cover a political topic, and social and scientific issues appear frequently too.
Financial data and geodata are the types of data used most often and most of the data relates to a national context. More than two-thirds of the projects use data from official sources like Eurostat. Further analyses regard differences between 2013 and 2014 and look deeper into visualisation strategies and interactive features.

Introduction and Literature Review

The emergence of data journalism can be understood as journalism’s response to the datafication of society and the “data deluge” (Lewis, 2015: 322) generated by it. Indeed, the field of journalism is grappling with the big data phenomenon, creating novel ways to identify and tell stories (Anderson, 2013; Coddington, 2015; Lewis & Usher, 2014). The intense discussions of these developments within journalism itself (e.g., Fink & Anderson, 2015; Weinacht & Spiller, 2014) indicate their practical relevance (as well as their ability to stimulate journalistic self-reflection). Meanwhile the trend is already transforming journalism education, e.g. with data journalism courses being offered in journalism programs at several universities (Anderson, 2013; Weinacht & Spiller, 2014).

The extensive attention paid to data journalism by practitioners has also fuelled a “rapidly growing body” (Lewis, 2015: 322) of scientific studies on the topic – “an explosion in data journalism-oriented scholarship” (Fink & Anderson, 2015: 476).
These mainly concentrate on two aspects: Firstly, scholars have tried to clarify what data journalism is and how it is similar to and different from investigative journalism, computer-assisted reporting, computational journalism, etc. These definitional approaches (e.g., Anderson, 2013; Appelgren & Nygren, 2014; Coddington, 2015; Fink & Anderson, 2015; Gray et al., 2012) highlight the following presumed characteristics of data journalism:

● It builds on (usually large) sets of quantitative (digital) data as ‘raw material’ which is subjected to some form of (statistical) analysis in order to find stories in it or tell stories with it;
● the results “often need visualization” (Gray et al., 2012: n.p.), i.e. they are presented in the form of maps, bar charts and other graphics;
● it is “characterised by its participatory openness” (Coddington, 2015: 337) and “so-called crowdsourcing” (Appelgren & Nygren, 2014: 394) in that users help with collecting, analysing or interpreting the data – going along with journalists relinquishing their “interpretive authority” (Weinacht & Spiller, 2014: 414; own translation);
● and it follows an open data and open source approach by publishing the raw data a story is built on.

However, scholars’ definitional approaches are often contradictory: While Anderson (2013: 1005), for instance, places data(-driven) journalism within the broader field of “computational journalism”, Coddington (2015) explicitly distinguishes between these two concepts (as well as computer-assisted reporting). Although he (2015: 333) acknowledges that computational and data journalism “are not mutually exclusive” and that they “inevitably overlap”, he identifies “significant differences between these forms of practice” and makes the point that “not all data journalism is computational” (Coddington 2015: 336).
In addition, computational journalism, for Coddington (2015: 337–341), is characterised to a much lesser extent by professional expertise as well as transparency. Consequently, Fink & Anderson (2015: 478) lament the “lack of a shared definition of data journalism”, which (presumably not only) Coddington (2015: 332) considers “fundamental” for building “a coherent body of scholarship”.

Secondly, it is the actors involved in the production of data journalism that have been the foci of many of these studies: Data journalists in Sweden (Appelgren & Nygren, 2014), Belgium (De Maeyer et al., 2015), the United States (Fink & Anderson, 2015; Parasie, 2014; Parasie & Dagiral, 2013), Norway (Karlsen & Stavelin, 2014) and Germany (Weinacht & Spiller, 2014) were interviewed and observed with regard to their (journalistic) role conceptions and self-understanding as well as their organisational implementation into newsrooms.

However, there are hitherto no systematically gathered insights regarding data journalism as “an emerging form of storytelling” (Appelgren & Nygren, 2014: 394), i.e. the structural elements of data-journalistic pieces which could be constitutive of a whole new “pattern of reporting” (Schmidt & Weischenberg, 1994; own translation) that data journalism represents. The only exception to this we came across is the study by Parasie & Dagiral (2013) who also analysed data-journalistic pieces. Their data set, however, is very limited in spatio-temporal terms since it comprises only pieces from one outlet, which were published before March 2011.

Analysing data-journalistic content on a broader scale not only closes this gap and adds substantially to the body of research on this new phenomenon; it is also an important step towards a more consensual definition, since it appears as though the focus on the actors in the field alone is not sufficient to clarify the “diffuse term” (Fink & Anderson, 2015: 470) that data(-driven) journalism continues to embody.
Research Objectives

Against the background of the research gaps identified above, this study focuses on the output of data journalism. The attempts to define and systematise data journalism mentioned above suggest that a content analysis of data-driven pieces must, at the very least, look at what are often considered its core characteristics: a) its foundation on ‘data’ as the ‘raw material’ to tell stories on a certain topic; b) its particular strategies to process and visualise this data; and c) the use of interactive features that allow users to explore stories and the underlying data sets according to their own interests.

Based on these characteristics, the content analysis (e.g., Krippendorff, 2013) principally needs to collect data on which topics are covered by data journalism pieces, by what means data-driven stories are told and presented, and on what kind of data (sources) they rely.

Here the aim of this study is twofold: First, we seek to advance current research by contributing to a better understanding of data journalism as a distinctive reporting style. Second, we also understand this study as a methodological attempt to develop variables in order to make data journalism’s products analysable.

Therefore, we pursue the following research objectives:

● to identify the forms of presentation and the structural elements in data-driven pieces which are regularly stated in different definitions of data journalism with respect to
  o types of data and data processing,
  o visualisation elements and
  o interactive features;
● to describe the topics that are covered in the data-driven projects;
● to assess what kind of media organisations are particularly active in the field of data journalism.
Methodology

Selection of Material for Analysis

The literature review has shown that data journalism is still a “diffuse term” (Fink & Anderson, 2015: 470), making it difficult, or rather preconditional, to identify respective pieces for a content analysis of data journalism products. Under these circumstances we have chosen a pragmatic as well as an inductive approach for this first systematic content analysis of data journalism products. This will help us avoid the possibility of starting with either too narrow or too broad a definition of what counts as data journalism. That is why our material for analysis relies on a definition from within data journalism itself: Our database consists of pieces that were nominated for the Data Journalism Award (DJA) – an award issued annually by the Global Editors Network¹ – in 2013 and 2014. This guarantees that we will analyse projects that count as data journalism within the field itself and have been chosen to represent significant or innovative forms of data journalism.

Similar approaches to sampling have already proven useful for analysing particular reporting styles and aspects of storytelling: For instance, Wahl-Jorgensen (2013a, 2013b) as well as Lanosga (2014) turned to nominees and winners of the Pulitzer Prize for researching emotionality in the news and investigative reporting.

Table 1 gives an overview of the sample; if a nomination referred to a media outlet as a whole and not to a specific project, the case was excluded from the analysis as our unit of analysis is a single data-driven piece.

¹ Cf. http://www.globaleditorsnetwork.org/about-us/ (accessed March 13, 2015).

Tab.
1: Dataset overview

| | Submissions | Nominated projects | Projects suited for the analysis | Award-winning projects (% of analysed projects) |
|---|---|---|---|---|
| 2013 | >300* | 72 | 56 | 6 (10.7) |
| 2014 | 520 | 75 | 64 | 9 (14.1) |
| Total | >820 | 147 | 120 | 15 (12.5) |

* The GEN does not specify the number of submissions for 2013, but only states that “more than 300 entries” had been submitted (http://www.globaleditorsnetwork.org/programmes/dja; accessed February 17, 2014).

With respect to the research objectives formulated above, this sample also allows us to identify differences between a) the years 2013 and 2014 as well as b) those data journalism pieces that were only nominated and those actually awarded.

Codebook

Most variables in the codebook were developed inductively and based on an explorative analysis of a subsample from 2013. Some categories were inspired by Parasie & Dagiral’s (2013) study and others were suggested by fellow researcher Julian Ausserhofer and data journalist Lorenz Matzat. A pretest was conducted with two coders and a subsample of 10 percent of cases. All variables reached an intercoder reliability coefficient (Holsti and Krippendorff’s Alpha) equivalent to or higher than 0.7, which is generally considered sufficient for exploratory research (Lombard et al., 2002).

The final codebook contains 29 categories in four dimensions (see table 2). With the presentation of the results we will provide deeper insights on a selection of the analysed categories as it goes beyond the scope of this paper to present all our findings here.²

Tab.
2: Dimensions and variables of the codebook³

Dimension: formal characteristics
- medium
- type of medium
- winner of DJA
- topic
- reference to a specific event*
- headline
- question(s) posed to data
- referring article(s)**
- length of article
- language
- number of people involved mentioned by name
- external partners

Dimension: data set
- additional information on data*
- data source(s)
- type(s) of data source(s)
- access to data
- kind of data
- geographical reference
- changeability of dataset***
- time period covered
- unit of analysis

Dimension: analysis and journalistic editing of content
- personalised case example****
- call for public intervention or criticism**
- purpose of data analysis*****
- visualisation

Dimension: context of use
- interactive functions
- online access to the database**
- opportunities of communication

* Suggested by data journalist Lorenz Matzat
** Adopted from Parasie & Dagiral (2013: 5–14)
*** Suggested by (data) journalism researcher Julian Ausserhofer
**** Inspired by Holtermann (2011)
***** Inspired by Gray et al. (2012: n.p.)

² For more detailed results as well as information on the study in general see Loosen et al. (forthcoming).
³ The authors will provide the complete codebook on request.

Results

To ensure a straightforward entry into the data, we approach the presentation of the results by organising the research questions in reverse order. We start with the identified media organisations among the nominees and the staff involved and give insights into the topics covered by the data-driven pieces in our sample as well as into several of their formal elements. Against this background we will then present the results gathered with respect to our ‘key variables’, those dealing with types of data and data processing, visualisation elements and interactive features.
Where we found differences of statistical or substantial significance between particular groups of stories (between those from 2013 and those from 2014, between DJA-awarded projects and those only nominated, between pieces on specific topics, etc.), this is indicated in the text.

Types of organisations among the nominees and number of identifiable actors involved

Data journalism is the domain of newspapers – at least according to our particular sample. Over the two years they represent by far the biggest group among the nominees (see table 3).

Table 3: Type of medium

| | 2013 (n = 56) Freq | 2013 % | 2014 (n = 64) Freq | 2014 % | Awarded (2013 + 2014, n = 15) Freq | Awarded % | Total (n = 120) Freq | Total % |
|---|---|---|---|---|---|---|---|---|
| Website of print newspaper | 23 | 41.1 | 28 | 43.8 | 8 | 53.3 | 51 | 42.5 |
| Website of investigative journalistic organisation | 8 | 14.3 | 16 | 25.0 | 4 | 26.7 | 24 | 20.0 |
| Website of print magazine | 4 | 7.1 | 11 | 17.2 | - | - | 15 | 12.5 |
| Genuine online medium | 5 | 8.9 | 2 | 3.1 | 1 | 6.7 | 7 | 5.8 |
| Website of public broadcasting company | 5 | 8.9 | 2 | 3.1 | 2 | 13.3 | 7 | 5.8 |
| Website of university medium | 3 | 5.4 | - | - | - | - | 5 | 4.2 |
| Website of non-journalistic organisation | 2 | 3.6 | 2 | 3.1 | - | - | 4 | 3.3 |
| Website of private broadcasting company | 2 | 3.6 | 1 | 1.6 | - | - | 3 | 2.5 |
| Website of news agency | 3 | 5.4 | - | - | - | - | 3 | 2.5 |
| Other* | 1 | 1.8 | - | - | - | - | 1 | 0.8 |

* In this case: private website of freelance journalist Gregor Aisch

The print organisations nominated the most include The New York Times, the US magazine Mother Jones, the Argentinian newspaper La Nación and The Guardian. Another important group are investigative journalistic organisations such as ProPublica – contributing the most projects in total (12 cases) – and The Center for Public Integrity. ProPublica and The Guardian are the only organisations which are represented with more than one project in both years.

Our sample includes projects from twenty different countries; half of them, however, are represented with a single project. The top three countries are the United States (47.5%), Great Britain (12.5%) and Germany (8.3%).
It is not surprising, then, that more than two thirds of the nominated pieces are in English (67.5%). This might be partly explained by the fact that data journalism has a longer history in English-speaking countries. The next most frequently occurring projects are bi- or multilingual (14.2%). In most of these cases, the project exists in two versions: in English and in the medium’s native language (most common: Spanish and German with 4 cases each). The predominance of the English language proved constant over the two years.

Our results also illustrate that data journalism, more often than not, is a collaborative effort. In cases where the project contained credits (n = 100), almost five individuals are named as authors or contributors on average. More than a third of all projects have been realised in association with external partners either contributing to the analysis or designing the visualisations of the data being reported on. The average number of people involved in the production of data-driven projects increased from about four in 2013 to nearly six people in 2014 (M = 4.13, SD = 3.84 vs. M = 5.55, SD = 3.97). This difference is only significant at the 10% level (t = 1.812, df = 98, p < .10) but could be interpreted as an indication that the production of data journalism is increasingly personnel-intensive – at least as far as our sample is concerned. The difference between projects only nominated (M = 4.66, SD = 3.92, n = 86) and those awarded (M = 6.21, SD = 4.04, n = 14) is even larger but is statistically insignificant (however, this could be due to the different sample sizes).

Topics covered in data journalism pieces and formal elements

The data journalism in our sample is dominated by politics. Almost half of the pieces analysed (48.3%) cover a political topic or combine political aspects with other topics.
Typical subjects are election trends or results: “Exit Polls 2012: How the Vote has Shifted”⁴, “Bundestagswahl 2013 in Berlin – Alle Stimmen der 1709 Wahllokale (The 2013 General Election in Berlin – Every Vote from 1,709 Polling Stations)”, and “Municipales 2014 (Local Elections 2014)” are prime examples of this subgroup. Political issues are often combined with financial ones, as seen in: “Ethics Explorer – A Guide to the Financial Interests of Elected Officials”, “Gastos en el Senado 2004-2013 (Senate Expenses 2004-2013)” and “Il prezzo della politica italiana: 5 miliardi di euro in 20 anni (The Price of Italy's Politics: 5bn Euros in 20 Years)”.

Societal issues such as census results and crime reports are also a preferred topic for data journalism, accounting for one third of cases (33.3%). So are health and science (21.7%; e.g. “Hooked – Canada’s Pill Problem”, “Life on the Line: 911 Breakdowns at LAFD”, “Innovative Energy Projects in Developing Countries”) as well as business and economy (20.0%). This illustrates that data journalism is mainly concerned with those domains where data are (becoming) routinely available. The subjects with the least coverage in our sample were education (7.5%), sports (2.5%; actually a topic that is predestined for data journalism due to its traditional ‘data-centricity’) and culture (1.7%; examples for this category are: “Le marché de l'art pour les nuls” (The Art Market for Dummies) and “Front Row to Fashion Week”).

⁴ A list of (and links to) all projects nominated for a DJA in 2013 and 2014 is available on: http://community.globaleditorsnetwork.org/projects_by_global_event/744 (accessed August 31, 2015).

More than half of data-driven stories are topic-oriented (53.3%), i.e. they are not driven by a particular recent event.
Political pieces are more likely to refer to a specific event (58.6%) like an election, while those dealing with society (37.5%), economy (33.3%) or health and science (26.9%) are much less likely to do so.

Data journalism does not only rely on data as its genuine ‘fuel’, but often also provides additional context and interpretation: Almost half of the analysed pieces in our sample contain one accompanying text contribution (48.3%); more than a fifth (22.5%) even come with a whole dossier of multiple articles. However, we also found that 17.5 percent of pieces had no additional articles.

One way to counter the abstractness of quantitative data is to complement it with a personalised case example – also a regular technique for non-data-driven journalism: For instance, a health-related article will start with the story of one patient. This storytelling technique could be found in 40.8 percent of the pieces analysed, while the rates were considerably lower for economic and education topics (20.8% and 22.2%).

Types of data and data processing

By far most pieces in our sample are based on data from official institutions like Eurostat, Land Statistical Offices, and ministries for education or defense (see table 4). The second largest group is data journalism that uses data from ‘other, non-commercial organisations’, which include universities, research institutes and NGOs. Roughly 20 percent rely on their own data sources, which means that the respective media organisation collected the data itself, e.g. through a survey or by searching its own archives.
Table 4: Type of data source (multiple coding possible)

| | 2013 (n = 56) Freq | 2013 % | 2014 (n = 64) Freq | 2014 % | Awarded (2013 + 2014, n = 15) Freq | Awarded % | Total (n = 120) Freq | Total % |
|---|---|---|---|---|---|---|---|---|
| Official institution | 37 | 66.1 | 44 | 68.8 | 10 | 66.7 | 81 | 67.5 |
| Other, non-commercial organisation | 19 | 33.9 | 34 | 53.1 | 6 | 40.0 | 53 | 44.2 |
| Own source | 13 | 23.2 | 9 | 14.1 | 3 | 20.0 | 22 | 18.3 |
| Private company | 8 | 14.3 | 12 | 18.8 | 3 | 20.0 | 20 | 16.7 |
| Source not indicated | 3 | 5.4 | 5 | 7.8 | - | - | 8 | 6.7 |

It is considered a quality criterion in data journalism that data sources are indicated; yet, approximately seven percent of the cases did not indicate where they got the data from. However, this is not the case for any of the awarded pieces. More than half of the stories in which a source is indicated (n = 112) analyse data from only one kind of source (53.6%). Those pieces building on two (39.3%) or more (7.2%) different kinds of sources most frequently combined official data with data from either non-commercial organisations (34 cases), their own organisation (10 cases) or private companies (9 cases).

Political topics are more likely to be covered with the help of data from an official institution (79.3%). Stories from 2014 are built on data from other, non-commercial organisations significantly more often (53.1%) than those from 2013 (33.9%; χ² = 4.463, df = 1, p < .05). Early data journalism focused on data from official institutions and other well-known sources and only now is it beginning to discover data sources beyond these.

Sixty percent of the projects used data collected on a national level. However, data journalism is also adaptable to ‘smaller’ scales: Almost a quarter of cases are based on data from a regional context (24.2%) and 18.3 percent analyse information gathered on a local level (multiple coding was possible). Stories awarded with a DJA are even more likely to refer to data on a national level (80.0%) than those that were only nominated (57.1%). However, this difference is only significant at the 10% level (χ² = 2.857, df = 1, p < .10).
Stories from 2014 were significantly less likely to draw on regional data (9.4%) than those from 2013 (41.1%; χ² = 16.373, df = 1, p < .01). One explanation for this pattern could be the fact that the large newspapers particularly active in the field of data journalism increasingly try to reach spatially extended audiences for which data on a national level are assumed to have a higher news value than those on a regional scale.

Many projects (60.0%) use data referring to a simple unit of analysis (e.g., a person, a flight, a vote); complex units like a nation or a company are covered in almost half of all cases (46.7%). Only 10.8 percent of cases deal with an aggregated unit of analysis like a household or a class of schoolchildren.

Most of the analysed pieces (also) rely on data that is publicly available. This is due to the fact that most data originates from official institutions. However, in two-fifths of the pieces, journalists did not indicate how they accessed the data (see table 5). In 18.3 percent of cases, the data had to be requested from the source since it was not publicly available beforehand. Freedom of Information requests also belong in this category and were sometimes explicitly mentioned in the additional information about the data. Only very few cases are based on data scraped or collected otherwise by the journalists themselves (e.g., “Gastos en el Senado 2004-2013 (Senate Expenses 2004-2013)”). Moreover, despite the public attention devoted to it, work using leaked data is rare.
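The χ² values reported for the comparisons of data sources and geographical reference can be re-derived from the published frequencies. The following is a minimal sketch, not the authors' procedure: it assumes a Pearson chi-square test on a 2×2 table without continuity correction, and the function name `chi_square_2x2` is hypothetical. The cell counts for the non-commercial sources come from table 4; those for national and regional data are back-calculated from the reported percentages (80.0% of 15, 57.1% of 105, 41.1% of 56, 9.4% of 64), which is also an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are the two groups being compared; columns are
    "category coded" vs. "category not coded".
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# (label, group 1 coded, group 1 not coded, group 2 coded, group 2 not coded)
comparisons = [
    ("non-commercial sources, 2013 vs. 2014", 19, 56 - 19, 34, 64 - 34),
    ("national data, awarded vs. only nominated", 12, 15 - 12, 60, 105 - 60),
    ("regional data, 2013 vs. 2014", 23, 56 - 23, 6, 64 - 6),
]

for label, a, b, c, d in comparisons:
    chi2, p = chi_square_2x2(a, b, c, d)
    print(f"{label}: chi2 = {chi2:.3f}, df = 1, p = {p:.4f}")
```

Under these assumptions the three statistics come out at 4.463, 2.857 and 16.373, matching the values in the text; the second one has a p-value between .05 and .10, which is why it counts as significant only at the 10% level.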
Table 5: Access to data (multiple coding possible)

| | 2013 (n = 56) Freq | 2013 % | 2014 (n = 64) Freq | 2014 % | Awarded (2013 + 2014, n = 15) Freq | Awarded % | Total (n = 120) Freq | Total % |
|---|---|---|---|---|---|---|---|---|
| Publicly available data | 22 | 39.3 | 28 | 43.8 | 7 | 46.7 | 50 | 41.7 |
| Access to data not indicated | 20 | 35.7 | 28 | 43.8 | 3 | 20.0 | 48 | 40.0 |
| Requested data | 12 | 21.4 | 10 | 15.6 | 4 | 26.7 | 22 | 18.3 |
| Scraped data | 3 | 5.4 | 5 | 7.8 | 1 | 6.7 | 8 | 6.7 |
| Own data collection | 5 | 8.9 | 1 | 1.6 | 1 | 6.7 | 6 | 5.0 |
| Leaked data | 1 | 1.8 | 3 | 4.7 | - | - | 4 | 3.3 |

The data journalism we analysed relied to a large extent on financial data (45.4%), geodata (42.9%) and measured values which are compiled by sensors or measuring tools (39.5%; e.g., aircraft noise, weather data, train speeds) (see table 6). While this last category gained prominence over the years, award-winning projects rely on this kind of data to a below-average extent. Other types of data used more frequently in 2014 are: personal data – i.e. information which can be attributed to individual persons – and metadata – i.e. ‘data about data’, for instance about individual instances of application use and data content. In contrast, the use of sociodemographic data – i.e. data that is only available in the form of mean values for larger groups – has decreased. However, none of these differences is statistically significant. The kind of data used least frequently originates from polls or surveys.

Table 6: Kind of data (multiple coding possible)

| | 2013 (n = 55) Freq | 2013 % | 2014 (n = 64) Freq | 2014 % | Awarded (2013 + 2014, n = 15) Freq | Awarded % | Total (n = 119) Freq | Total % |
|---|---|---|---|---|---|---|---|---|
| Financial data | 25 | 45.5 | 29 | 45.3 | 8 | 53.3 | 54 | 45.4 |
| Geodata | 26 | 47.3 | 25 | 39.1 | 6 | 40.0 | 51 | 42.9 |
| Measured values | 19 | 34.5 | 28 | 43.8 | 4 | 26.7 | 47 | 39.5 |
| Sociodemographic data | 21 | 38.2 | 16 | 25.0 | 4 | 26.7 | 37 | 31.1 |
| Personal data | 12 | 21.8 | 21 | 32.8 | 5 | 33.3 | 33 | 27.7 |
| Metadata | 7 | 12.7 | 13 | 20.3 | 1* | 6.7 | 20 | 16.8 |
| Poll ratings / survey data | 8 | 14.5 | 7 | 10.9 | 1 | 6.7 | 15 | 12.6 |
| Other data | - | - | - | - | 1 | 6.7 | 2** | 1.7 |

* “Homes for the Taking”
** Legislative texts in “Gay Rights by State”, experts’ reports in “Who's Pulling the Strings of D.C.
Puppet Corporations?”

Only about a quarter of the pieces nominated for a DJA were based on a single type of data (26.6%); most stories referred to two (40.0%) or three (25.0%) different kinds. Most frequently, geodata was combined with either measured values (25 cases; e.g. radiation levels in becquerel or noise exposure in decibel) or with sociodemographic or financial information (20 cases each); furthermore, sociodemographic statistics appeared together with either financial data (17 cases) or measured values (15 cases).

Almost all of our cases use ‘static’ data that does not change (93.3%); only seven pieces are built on data that are updated regularly (e.g., “Bloomberg Billionaires: Today’s Ranking of the World’s Richest People”), and only one project worked with real-time data (“Tweetómetro” that used data directly from the Twitter API).

Table 7 gives an overview of what is actually shown with the data:

Table 7: Purpose of data analysis (multiple coding possible)

| | 2013 (n = 56) Freq | 2013 % | 2014 (n = 64) Freq | 2014 % | Awarded (2013 + 2014, n = 15) Freq | Awarded % | Total (n = 120) Freq | Total % |
|---|---|---|---|---|---|---|---|---|
| Compare values | 46 | 82.1 | 56 | 87.5 | 15 | 100.0 | 102 | 85.0 |
| Show connections and flows | 18 | 32.1 | 23 | 35.9 | 4 | 26.7 | 41 | 34.2 |
| Show changes over time | 26 | 46.4 | 30 | 46.9 | 8 | 53.3 | 56 | 46.7 |
| Show hierarchy | 8 | 14.3 | 6 | 9.4 | 1 | 6.7 | 14 | 11.7 |

In the vast majority of cases, the data is analysed with a focus on comparing values (e.g., to show differences between men and women or neighbourhoods) and almost half of the pieces show changes over time (e.g. “Climate Change: How Hot Will It Get in My Lifetime?”). Connections and flows are illustrated in more than a third of all projects. Less frequent are pieces that use data to show hierarchies – as in “Women as Academic Authors” which ranks the most important female scientists. In stories that deal with, among others, a political topic, data is used significantly more often to show connections and processes than in the average data-driven project (46.6% vs.
34.2%): For instance, the project “Rede de Escândalos (Network of Scandals)” shows connections between Brazilian politicians and their involvement in different political scandals, and “Consider the Source” follows cash flows from corporations to non-profit organisations.

Visualisation and other structural elements

If we think of data journalism as a distinct style of reporting it is crucial to learn about the particular ways it tells stories. The following results refer to these aspects and deal with the kinds of visualisations and interactive features that are applied to data-driven pieces.

Table 8 shows that there seems to be a more or less stable set of visualisation elements which mainly includes pictures (60.0%), simple static charts (54.2%), and maps (49.2%); over a quarter of the projects (also) work with tables (26.7%). Animated visualisations are rarer (15.8%). This partly echoes the statements of the data journalists interviewed by Appelgren & Nygren (2014: 403) who “described maps as the standard visualizing method”. The share of animations as well as that of pictures is relatively high among the award-winning projects as well as among pieces from 2014. However, both differences are statistically significant only for pictures (χ² = 2.857, df = 1, p < .10 and χ² = 8.058, df = 1, p < .01).

Moreover, comparing pieces (also) covering societal topics with those (also) focusing on politics, we find that the former contain simple, static graphics more often (65.0% vs. 54.2%) as well as maps (65.0% vs. 49.2%) while offering a search function less frequently (15.0% vs. 26.7%). Moreover, political stories are even less likely to contain an animation (6.9%) than the average piece.
Table 8: Visualisation (multiple coding possible)

| | 2013 (n = 56) Freq | 2013 % | 2014 (n = 64) Freq | 2014 % | Awarded (2013 + 2014, n = 15) Freq | Awarded % | Total (n = 120) Freq | Total % |
|---|---|---|---|---|---|---|---|---|
| Pictures | 26 | 46.4 | 46 | 71.9 | 12 | 80.0 | 72 | 60.0 |
| Simple static chart(s) | 31 | 55.4 | 34 | 53.1 | 7 | 46.7 | 65 | 54.2 |
| Map | 29 | 51.8 | 30 | 46.9 | 7 | 46.7 | 59 | 49.2 |
| Table | 14 | 25.0 | 18 | 28.1 | 4 | 26.7 | 32 | 26.7 |
| Combined static diagram(s) | 11 | 19.6 | 11 | 17.2 | 3 | 20.0 | 22 | 18.3 |
| Animated visualisation | 6 | 10.7 | 13 | 20.3 | 4 | 26.7 | 19 | 15.8 |
| No visualisation | - | - | - | - | - | - | - | - |

On average, the pieces contained more than two different⁵ visualisations (M = 2.24, SD = 1.05). The numbers are only slightly (and not statistically significantly) higher for awarded projects as compared to those only nominated as well as for stories from 2014 as compared to pieces from 2013. Typical combinations of visualising elements include simple static charts with pictures (38 cases) or with a map (33 cases).

Interactive Features

Interactive elements are often discussed as a ‘key characteristic’ of data journalism (e.g., Coddington, 2015; Gray et al., 2012; Weinacht & Spiller, 2014). However, we found that 18.3 percent of cases have no interactive functions at all; political stories are even more likely to come without them (24.1%). Yet, this is true for only one of the award-winning projects (“Reshaping New York”) (see table 9), leading us to speculate that their use is considered a quality criterion for the DJA.
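Because several categories can be coded for one piece, the percentages in table 8 add up to more than 100, and the mean number of different visualisations equals the sum of the category frequencies divided by the number of pieces. The sketch below illustrates this counting logic; it is an illustration under stated assumptions, not the authors' coding procedure. The four coded pieces and the function name `summarise` are hypothetical, while `table8_totals` uses the total frequencies from table 8.

```python
from collections import Counter

# Hypothetical codings: every visualisation found in a piece, repeats included.
pieces = [
    ["picture", "picture", "map"],
    ["simple static chart", "table"],
    ["map", "simple static chart", "picture", "picture"],
    ["picture"],
]


def summarise(coded_pieces):
    """Return per-category shares (in percent) and the mean number of distinct kinds."""
    # Each kind counts once per piece, however often it occurs in that piece.
    distinct = [set(kinds) for kinds in coded_pieces]
    counts = Counter(kind for kinds in distinct for kind in kinds)
    shares = {kind: 100 * freq / len(coded_pieces) for kind, freq in counts.items()}
    mean_distinct = sum(len(kinds) for kinds in distinct) / len(coded_pieces)
    return shares, mean_distinct


shares, mean_distinct = summarise(pieces)
print(shares, mean_distinct)

# The same relation applied to the total frequencies published in table 8 (n = 120).
table8_totals = {
    "Pictures": 72,
    "Simple static chart(s)": 65,
    "Map": 59,
    "Table": 32,
    "Combined static diagram(s)": 22,
    "Animated visualisation": 19,
}
print(round(sum(table8_totals.values()) / 120, 2))
```

Applied to the table 8 totals, the calculation gives 2.24, which agrees with the reported mean of different visualisations per piece.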
Table 9: Interactive functions (multiple coding possible)

                            2013          2014          Awarded       Total
                            (n = 56)      (n = 64)      (2013 + 2014) (n = 120)
                                                        (n = 15)
                            Freq      %   Freq      %   Freq      %   Freq      %
Zoom / details on demand      32   57.1     35   54.7     10   66.7     67   55.8
Filtering                     30   53.6     32   50.0      7   46.7     62   51.7
Search                        17   30.4     15   23.4      1    6.7     32   26.7
No interactive functions       7   12.5     15   23.4      1    6.7     22   18.3
Personalisation               13   23.2      9   14.1      4   26.7     22   18.3
Playful interaction            2    3.6      1    1.6      -      -      3    2.5

The interactive features most often integrated are zoom functions for maps, details on demand (e.g., the number of victims for each case of a reported school shooting), and filtering functions which allow the user to filter the provided data with respect to different variables (e.g., to only select voting results from one state or one year). Personalisation tools, where the user has to enter personal data like their ZIP code or age to tailor the piece with customised data, are less common (18.3% of cases). Only three projects include an opportunity for playful interaction (e.g., “Heart Saver”, a game in which the user has to send ambulances as fast as possible to fictional characters having a heart attack). Despite the large number of projects which offer no interactive option at all, the average piece contains 1.55 different⁶ features (SD = 1.10), with seven stories offering the maximum number of four interactive elements.

⁵ We did not take into account whether elements of the same kind were included more than once: several pictures, for instance, were counted as one visualisation.

Conclusion/Discussion

What does the reporting style of data journalism look like?
According to our study the ‘typical’ data-driven piece

• is published by a newspaper,
• covers a political topic,
• relies on public data from official sources,
• builds its story on financial and/or geodata, preferably collected on a national scale,
• is based on a simple unit of analysis such as single persons,
• compares values in order to show differences and similarities between different objects of study (e.g., people of different gender, neighbourhoods),
• combines two types of visualisations, preferably pictures with maps or simple charts,
• allows the user to zoom into a map, request details and/or filter data.

⁶ We did not take into account whether features of the same kind (e.g., zoom-in function) were offered more than once (e.g., in more than one map included in the story).

Overall, this shows that data journalism as a reporting style is firmly characterised by those elements that cursory observations, literature reviews and actor studies have already hinted at. However, these characteristics do not apply to all data-journalistic projects and we found a variety of (combinations of) other story elements used less frequently but still often enough to be significant. This also means that our study could not conclusively clarify what the reporting style of data journalism is in terms of a universal definition. By the same token, we confirmed the diversity of forms, topics and combinations of story elements that other researchers’ partly contradictory definitions already implied. Obviously, data journalism as an emerging reporting style is both still evolving and flexible in that different types of data, analyses and visualisation strategies can be combined, or omitted, when it suits the topic and story.

However, our sample has particular limitations as it is based on pieces that have a double bias: First, as nominees for a data journalism award they represent a special group.
Second, these pieces are based on self-selection as any data journalist is able to submit his/her data-driven pieces to be considered for nomination by the organising committee.

Despite these limitations the sample also has two particular advantages: First, we can assume that the analysed cases, as nominees for a DJA, fulfil a certain quality standard and that the awarded pieces in particular are seen by experts in the field as a ‘gold standard’ and as such could influence the future development of the field. Second, the comparison between two successive years allows us, to a certain degree, to trace the field’s development. However, we did not find any significant differences (at the 5% level) either between 2013 and 2014 or between nominees and awarded pieces with regard to the sheer (average) number of certain story elements: visualisations, topics touched, sources and types of data used as well as interactive functions. Consequently, it is not likely that data-driven pieces are awarded if they follow the principle ‘the more, the better’, nor do we observe a trend in that direction if we compare 2013 and 2014. We can assume, therefore, that data-driven means are not considered award-worthy or applied by journalists as an end in themselves but that they clearly have to support the story being told. This echoes Coddington’s (2015: 339) observation that data journalists subordinate the use of data “to the professional journalistic value of narrative and the ‘story.’ [...] [D]ata journalism discourse foregrounds telling the story over using data”.

References

Anderson, Chris W. (2013). Towards a sociology of computational and algorithmic journalism. New Media & Society, 15(7), pp. 1005–1021.
Appelgren, Ester; Nygren, Gunnar (2014). Data journalism in Sweden. Introducing new methods and genres of journalism into “old” organizations. Digital Journalism, 2(3), pp. 394–405.
Coddington, Mark (2015). Clarifying journalism’s quantitative turn. A typology for evaluating data journalism, computational journalism, and computer-assisted reporting. Digital Journalism, 3(3), pp. 331–348.
De Maeyer, Juliette; Libert, Manon; Domingo, David; Heinderyckx, François; Le Cam, Florence (2015). Waiting for data journalism. A qualitative assessment of the anecdotal take-up of data journalism in French-speaking Belgium. Digital Journalism, 3(3), pp. 432–446.
Fink, Katherine; Anderson, Christopher W. (2015). Data journalism in the United States. Beyond the “usual suspects”. Journalism Studies, 16(4), pp. 467–481.
Gray, Jonathan; Bounegru, Liliana; Chambers, Lucy (eds.) (2012). The data journalism handbook. How journalists can use data to improve the news. (Early release). Sebastopol: O’Reilly.
Holtermann, Hannes (2011). Datenjournalismus: eine neue Form der journalistischen Wertschöpfung aus Daten [Data journalism: a new form of journalistically creating value from data]. Unpublished master’s thesis. Hamburg.
Karlsen, Joakim; Stavelin, Eirik (2014). Computational journalism in Norwegian newsrooms. Journalism Practice, 8(1), pp. 34–48.
Krippendorff, Klaus (2013). Content analysis: an introduction to its methodology. Los Angeles: SAGE.
Lanosga, Gerry (2014). New views of investigative reporting in the twentieth century. American Journalism, 31(4), pp. 490–506.
Lewis, Seth C. (2015). Journalism in an era of big data. Digital Journalism, 3(3), pp. 321–330.
Lewis, Seth C.; Usher, Nikki (2014). Code, collaboration, and the future of journalism. A case study of the Hacks/Hackers global network. Digital Journalism, 2(3), pp. 383–393.
Lombard, Matthew; Snyder-Duch, Jennifer; Bracken, Cheryl Campanella (2002). Content analysis in mass communication. Assessment and reporting of intercoder reliability. Human Communication Research, 28(4), pp. 587–604.
Loosen, Wiebke; Schmidt, Fenja; Reimer, Julius (2015, forthcoming). “When data become news”: eine explorative Inhaltsanalyse daten-journalistischer Projekte [“When Data Become News”: an exploratory content analysis of data-journalistic projects]. Hamburg: Hans-Bredow-Institut.
Parasie, Sylvain (2014). Data-driven revelation? Epistemological tensions in investigative journalism in the age of “big data”. Digital Journalism, DOI: 10.1080/21670811.2014.976408.
Parasie, Sylvain; Dagiral, Eric (2013). Data-driven journalism and the public good. “Computer-assisted-reporters” and “programmer-journalists” in Chicago. New Media & Society, 15(6), pp. 853–871.
Schmidt, Siegfried J.; Weischenberg, Siegfried (1994). Mediengattungen, Berichterstattungsmuster, Darstellungsformen [Media genres, patterns of reporting, presentation forms]. In: Merten, Klaus; Schmidt, Siegfried J.; Weischenberg, Siegfried (eds.): Die Wirklichkeit der Medien [The reality of the media]. Opladen: Westdeutscher Verlag, pp. 212–236.
Wahl-Jorgensen, Karin (2013a). Subjectivity and story-telling in journalism. Examining expressions of affect, judgement and appreciation in Pulitzer Prize-winning stories. Journalism Studies, 14(3), pp. 305–320.
Wahl-Jorgensen, Karin (2013b). The strategic ritual of emotionality: a case study of Pulitzer Prize-winning articles. Journalism, 14(1), pp. 129–145.
Weinacht, Stefan; Spiller, Ralf (2014). Datenjournalismus in Deutschland. Eine explorative Untersuchung zu Rollenbildern von Datenjournalisten [Data-journalism in Germany. An exploratory study on the role conceptions of data-journalists]. Publizistik, 59(4), pp. 411–433.