The Science Fiction Library of the Future and Why We Don't Have It Yet
By Sean Johnson Andrews
Presented at PCA/ACA SW conference, Feb. 2012
I would like to start this journey with a peculiar example of the library of the future. P.D. James begins her novel The Children of Men with an explanation of humankind’s tragic condition in the late twentieth century: “For all our knowledge, our intelligence, our power, we can no longer do what the animals do without thought.” Namely, they are unable to reproduce. The story opens in the year 2021.
The future, in this case, is one in which the human species no longer exists. This presents a vexing question that we often pose in relation to individuals, but rarely in relation to entire cultures, societies, or even species: what would you do if you knew you were going to die? For those involved in the library sciences, it isn't surprising that one of the inclinations James highlights is something found in several Early Modern sonnets: the attempt to guarantee immortality by archiving one's thoughts (or the youthful beauty of one's lover) in writings, and one's writings in a folio. When Shakespeare records the living memory of his beloved for "the eyes of all posterity," or Spenser uses verse to eternize the virtues of his, each does so with the reasonable belief that, even if no one found the writing or was inspired by it, someone COULD find it. Someone alive, later.
For those living in the parallel universe of James' book, there could be no such promise. Recognizing that this might make the situation too desperate, James presents, if ever so briefly, a source of hope on this front. In the late 1990s, shortly after the last natural-born human took his first breath, scientists discovered alien life on a far-off planet, thousands of light years away. There is no contact with this life form, no communication with its beings, no guarantee they had or ever would have the capability (or desire) to visit the literally posthuman earth. Yet this glimmer of hope gives those inclined to speak to the ages the motivation to carry on. As James puts it:
All over the world nation states are preparing to store their testimony for the posterity which we can still occasionally convince ourselves may follow us, those creatures from another planet who may land on this green wilderness and ask what kind of sentient life once inhabited it. We are storing our books and manuscripts, the great paintings, the musical scores and instruments, the artefacts. The world's greatest libraries will in forty years' time at most be darkened and sealed. The buildings, those that are still standing, will speak for themselves.
It would likely take generations (assuming aliens live about as long as humans) of dedicated alien scholars poring through these archaic codices before they could finally claim to understand us. But to the hypothetical crafter of (hopefully) universal metadata standards, this possibility was enough.
This is a grim foundation on which to build the library of the future: a mausoleum for the interstellar archeologists who might one day be interested in our culture. But look on the bright side: when those aliens discover this vast trove of information, they will be able to do whatever they want with it. While this could mean they use most of the storage media for firewood or scrap metal, in the best case, they would be even more advanced than us, technologically, socially, and culturally. They could create infinite copies to distribute throughout their society, drawing upon the networked knowledge of everyone to help interpret and understand the lost civilization they had happened upon in their travels.
They would be free to reinterpret works using their own cultural framework, making sense of them despite (or even because of) the distance in time and space. And this reinterpretation could take any form they liked, be redistributed and reinterpreted until the ideas and history we had bequeathed to this galactic posterity merged with the great mind of the universe and we achieved an immortality beyond Shakespeare's dreams. All of this would be possible whether the works were from Gutenberg's time, published in the nineteenth century, recorded in the twentieth, or released as an ebook in 2025. All of this would be possible not just because diligent archivists kept track of it, or brilliant coders fabricated a Rosetta Stone legible to the stars, or curious extraterrestrials plugged away at understanding it: no, all of this would be possible because, for the incredibly fortunate future invaders of our planet, all the information in the history of the planet earth would then be in the public domain.
In Arthur C. Clarke's 2001: A Space Odyssey the tables are turned: it is the earthlings who are the interplanetary archeologists. They discover a mysterious monolith on the moon, find it to be three million years old, and send a search party out to discover whether this civilization still exists. The library plays a very small role in this story, but one that is essential: the single human onboard the ship, after he has lost contact with earth and destroyed the misguided computer that tried to kill him, looks to an infinite library of all human knowledge for both research and entertainment, relying on archives of symphony music to keep him from going insane. Clarke doesn't stoop to explain the licensing situation that allowed for this. Maybe NASA just decided it was worth it to buy everything for their crew: no number of arias is too many for these guys.
I know many people at this conference will be interested to hear more Sci-Fi library examples – and I’m interested to hear those that come to mind for you – but I’m going to cut some of that short to get to the meat of the argument here:
First, why science fiction? Second, why do we want these libraries? Third, what are their attributes? And finally, why don't we have them? I'll give you a preview of this concluding point by stating an answer to my overarching question: we will not have the digital library of the future until we tame rightsholders.
In between, I’ll talk about a few other Science Fiction libraries.
But why? Why science fiction? I'm cheating on these points, in part because this is a popular culture association conference, so I feel less of a need to make the case for looking at, well, popular culture. Science fiction, at its best, is a rich source of utopian and dystopian fantasies, where writers work out alternative futures so that we might more deeply contemplate the present. In some cases, these are warnings: Philip K. Dick's Do Androids Dream of Electric Sheep? asks, among other things, what the world will be like if we continue down the path of mutually assured nuclear destruction. The resulting extinction of most animals and the invention of androids to assist in the necessary colonization efforts create an odd set of preoccupations with human emotion, as well as a range of new inventions and professions such as bounty hunters and mechanical animal technicians.
In other cases, they are visions of what could be, if we work hard enough. Science Fiction author Neal Stephenson recently discussed this in relation to the way the genre aids in inspiring innovation, particularly among people directly engaged in science and engineering. Calling it “The Hieroglyph Theory,” Stephenson claims, “Good SF supplies a plausible, fully thought-out picture of an alternate reality in which some sort of compelling innovation has taken place. A good SF universe has a coherence and internal logic that makes sense to scientists and engineers.” This vision then leads them to invent things – much as Stephenson’s own vision of the Metaverse in his book Snow Crash provided a direct inspiration to the inventor of Second Life.
So, in general, science fiction often serves to outline what Steven Johnson calls "the adjacent possible." In his most recent book, Where Good Ideas Come From, Johnson compares the adjacent possible to the "primordial soup" of molecules and chemical elements, all of which form the basic building blocks for all subsequent life.
"The atomic elements that make up a sunflower are the very same ones available on earth before the emergence of life, but you can't spontaneously create a sunflower in that environment, because it relies on a whole series of subsequent innovations that wouldn't evolve on earth for billions of years." Every new combination of these elements provided a new building block that an evolving life form could use. And each new evolution created a further possibility for evolution. It is this that Johnson terms "the adjacent possible."
The adjacent possible is a kind of shadow future, hovering on the edges of the present state of things, a map of all the ways in which the present can reinvent itself. [. . . ] What the adjacent possible tells us is that at any moment the world is capable of extraordinary change, but only certain changes can happen. [. . . .] The history of life and human culture, then, can be told as the story of a gradual but relentless probing of the adjacent possible, each new innovation opening up new paths to explore.
So Dick asks what the middle-class suburban future will look like after a nuclear war, and Stephenson asks what will be possible with virtual reality. While all fiction – and even the basic outline of the sitcom – could be summed up as a version of this, the future-oriented focus of most science fiction makes it a good space for mining insight into the adjacent possible with regard, in this case, to libraries.
It is, after all, the adjacent possible of the digital or universal library that makes them interesting. This is, to move to the second point, why I think we do – or maybe SHOULD – want them. The idea of what Brewster Kahle of the Internet Archive calls Universal Access to All Human Knowledge is, in and of itself, a lofty and laudable goal. But it is what that platform would allow us to do – the adjacent possible of the digital library – that really excites me. Kahle himself, in a recent interview with Internet pioneer Stewart Brand, revealed that he was initially motivated to create the Internet Archive when working at the MIT Artificial Intelligence Lab: he said, "We were going to build a thinking machine, but it didn't know anything. So I said, okay, if we're going to bring up our new overlords, let's at least have them read good books." The Internet Archive, therefore, is an almost twenty-year detour he's taken to create a digital resource that will make the adjacent possible – artificial intelligence – possible.
From my own perspective as a cultural studies scholar, it is this that drives me to want a resource of this magnitude – particularly with the myriad emergent projects in libraries and archives that use social media tools to tag and curate resources that used to be accessible to very few people. This is an ongoing project: as open data militant Carl Malamud said in his speech at the Digital Public Library Plenary, "Some of our most important pools of knowledge are not available at all, or available only to those with golden credit cards or positions of privilege in our elite institutions. Knowledge in our world belongs to the 1 percent." In a common iteration of the science fiction vision, he says, "If we can put a man on the moon, why can't we launch the Library of Congress into cyberspace?"
This leads me to the question of what the attributes of the science fiction library are and, again, to look at why we don't have them. The most obvious attribute of these libraries seems to be that they contain virtually everything. Digital humanities scholar Amanda French recently reported on South Korea's National Digital Library – or DiBrary – and the plans to merge it with similar national experiments in China and Japan, creating a massive Asian library modeled after the European library project, Europeana. She invoked one of the most well-known, though probably less obvious, examples of a science fiction library, saying, "To me it sounds like the second step toward the single digital library filed contentedly away in the humming systems of the starship Enterprise, waiting to be addressed with a question: 'Computer . . .'"
So we can point to at least two attributes of this library of the future: one, it contains everything and, two, there is some way to query that vast amount of data. What this fantasy implies is that we need not only a vast infrastructure, but a librarian – or at least the careful standards of order and thought that their profession teaches.
Annual journal output alone is staggering: in 2006, several researchers set out to discover what percentage of that year's journal output was open access. They found almost 19% of the articles published that year were freely available. While I'd like it to be 100% – this being the portal to that adjacent possible – the 100% amounted to 1,350,000 articles published in 23,750 journals. I'm sure that output has only grown in the last five years: but what am I supposed to do with almost one and a half million articles every year? Clone myself. As one commentator puts it:
The difficulty seems to be, not so much that we publish unduly in view of the extent and variety of present-day interests, but rather that publication has been extended far beyond our present ability to make real use of the record. The summation of human experience is being expanded at a prodigious rate, and the means we use for threading through the consequent maze to the momentarily important item is the same as was used in the days of square-rigged ships.
Right before this, the author of this quote, a scientist named Vannevar Bush, laments the difficulty scientists have in keeping up with the literature in their field. But he’s not lamenting the “growing mountain of research” that came out in 2006: he’s writing about the same problem as it existed in 1945.
To deal with this problem, Bush proposed devising an associative engine based on a deep local cache of all documents – on microfilm, of course – mechanized so that it could be consulted with "exceeding speed and flexibility." As library science scholar Michael Buckland points out, "Bush had a low opinion of indexes and classification schemes," which led him to recommend that, instead of having an index or classification system for this data, it would instead be designed "as we may think" – or as he claimed the human brain thinks. It would keep track of the associative trails the user developed based on the perceived relevance of items, much in the same way that the brain creates associations in memory. He called this device the memex, or memory extension device.
Buckland points out that, as innovative as it sounds, the organization this associative engine provides is quite arbitrary: it is an overly personalized, “superficial and inherently self-defeating design.” He goes on…
The trails, being based on one individual's personal knowledge and perceptions of relevance, would be highly obsolescent. As a user's knowledge increased, perceptions of relevance would change, and the trails would need to be remade. Any given pattern of trails would remain appropriate only so long as the user did not learn anything from the use of the Memex--or in any other way.
While Buckland is skeptical of the design – he thinks traditional subject and metadata indexing superior in the long term – he credits Bush's idea as a sort of scientist's science fiction which, "through its skillful writing and the social prestige of the author, has had an immediate and lasting effect in stimulating others. [. . . .] Bush's paper was timely and 'opened people's eyes and purses.'"
Opening people's eyes and purses is an important effect of inspirational ideas like Bush's. With government funding, Bush and his team of scientists went a long way toward building a prototype of the rapid microfilm selector he had imagined, and his piece, "As We May Think," is often credited with inspiring the creation of the word processor, the mouse, and the hyperlink. One could even argue that his idea of the associative engine is central to another government-funded project – the NSF-funded Stanford Integrated Digital Library Project, which we now know as Google.
The algorithmic connections Google makes between hyperlinks and our general activity on the web are quite similar to what Bush imagined his memex might perform: indeed, one of the commonly stated effects of the search engine has become what Siva Vaidhyanathan refers to as the Googlization of memory, which he briefly describes by saying, "I don't have to remember very much" (175). This was exactly the goal Bush had for his memex, which he saw as freeing up scientists' brains for higher thought functions – this was the adjacent possible for him.
On the other hand, as Geoffrey Nunberg points out, this algorithmic associative engine still suffers from the problems Buckland finds in Bush's model, particularly when it comes to books. Nunberg describes Google's metadata for the books project as "a train wreck: a mish-mash wrapped in a muddle wrapped in a mess." This isn't the only thing to be said about Google, but it serves to illustrate that it missed a key component of almost all science fiction libraries. Not only must the library have everything in it, but to make sense of that everything, we need order. This increasingly requires that the library and information sciences work as one, with librarians helping to design these future tools.
Thus one answer to the question of what librarianship looks like and where it is going is that coding may be the new cataloging. For instance, unlike Google's library, the projects of the DPLA Beta Sprint are clearly, deeply influenced by librarians and archivists. They include innovative user interfaces; both of the projects associated with Harvard's metaLAB allow for something approaching the serendipity of the stacks, drawing upon the systemic resources of local and national libraries and allowing for user-curated, visually rich environments. Yet these projects, and all of the others, concentrate part, if not most, of their energy on creating metadata interoperability that will allow for a national platform able to access the already vast pools of resources in public archives and private institutions like JSTOR and the HathiTrust.
In short, with librarians on the case, it is clear that the infrastructure for something like the library of the future is on the near horizon. The problem, as you all well know, is the vagaries of copyright. Listening to the DPLA plenary is like watching NASA scientists build an amazing spaceship for intergalactic travel, then hearing that they will be unable to use it because forty of its manufacturers are involved in an intractable patent dispute. The technology is there, but until we are allowed to populate these libraries with the basic resources people want to access, it will remain tethered to the launch pad.
The problem in this regard is not so much that we have to pay current authors to digitize and distribute their work: the continuing serials crisis notwithstanding, the real problem of the digital library has to do with the issues Google almost overcame in its audacious attempt to digitize all books. It was, is, and will be prohibitively expensive to track down the rightsholder for every in-copyright, out-of-print book contained in physical form in libraries around the country. Jonathan Band, who represents and prepares briefings for various library associations, estimates the liability for violating the copyright of the 24 million books in this category would be close to $3.6 trillion. Meanwhile, just the process of clearing the rights – not the cost of the rights themselves – would be close to $24 billion.
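To give a sense of scale, here is a back-of-envelope sketch of those figures (my own arithmetic, not Band's; it simply assumes his totals are spread evenly across the 24 million titles). The per-title liability works out to $150,000 – which happens to be the maximum statutory damage award per work for willful infringement under US copyright law – and clearing rights alone works out to roughly $1,000 per title.

```python
# Rough per-title arithmetic on Jonathan Band's estimates, as cited above.
# Assumed inputs: 24 million in-copyright, out-of-print books,
# ~$3.6 trillion total liability, ~$24 billion total clearance cost.
ORPHAN_BOOKS = 24_000_000
TOTAL_LIABILITY = 3.6e12   # ~$3.6 trillion potential infringement liability
CLEARANCE_COST = 24e9      # ~$24 billion just to clear the rights

per_book_liability = TOTAL_LIABILITY / ORPHAN_BOOKS   # -> 150000.0 dollars
per_book_clearance = CLEARANCE_COST / ORPHAN_BOOKS    # -> 1000.0 dollars
```

In other words, even before paying a single rightsholder anything, the mere act of finding and asking them costs about a thousand dollars per book – which is why no institution has volunteered to do it.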
By relying on a fair use defense early on, Google threaded a very tiny needle that it later used to sew up a tenuous deal with publishers, allowing it to create a national registry, make arrangements with authors who didn't want to be included, and create revenue splits between itself and publishers. It was ultimately left to one man, the judge in the case, to decide whether to accept or reject the terms of the settlement. Last March, he rejected it. As Joe Mullin wrote for paidContent.org, "The proposed settlement was without precedent in its scope. The settlement had the potential to change the way we all interact with books—to actually change human culture. A class-action settlement just wasn't the right tool for that serious work."
The mess of its metadata aside, the agreement with publishers was the moon shot that might have opened the adjacent possible of the digital library. I agree with Vaidhyanathan that it represents a great public failure – a failure of our public institutions – that it was left to Google to accomplish this. And its monopoly in out-of-print, in-copyright works would have been formidable. Yet there seems to be no institution willing or able to take the enormous risk Google did in challenging the absurdity of the system. As Mullin observed, "Copyright is proving to be an obstacle to creating new digital businesses even when all major stakeholders have reached agreement."
In both this case and the more recent orphan works suit brought by the same Authors Guild against the HathiTrust, it seems that the courts will not settle this question: instead, both the judge in the Google Books settlement and the petition against Hathi suggest that the only way forward is a legislative solution. In other words, creating a legal platform for a digital library of the future is left up to the congressional leaders who only recently relented on SOPA and are now pushing the troubling Research Works Act. That will be an uphill climb, but outside of outright civil disobedience, it is a trek we may soon have to make. Unless you are all prepared to become guerilla librarians like Aaron Swartz, it may be time to bone up on our policy approaches – or support the ARL and ACRL in their work on this front.
An interesting feature of the more recent science fiction libraries I’ve encountered is that authors have started to account for this absurd obstacle to the perfect library of the future.
Vernor Vinge's Rainbows End, written after the Google Books project began, has scenes of library carnage porn beyond compare: the central project is a digitization effort in which the books (all of them in the UCSD library) are shredded via wood chipper and the refuse is blown through an enormous tube equipped with cameras to photograph the scraps. Using a complex software program, the images are reassembled into something approximating the originals, but in digital form: gaps are rectified by shredding the contents of other libraries, in the hope that missed fragments of works or pages will get caught on the second, third, or fourth time around.
While obviously cavalier about the importance of physical libraries, the technology seems messy but plausible. What remains unconvincing is that the character behind this charnel-house digitization project claims he "has lawyers and software that will allow him to render microroyalty payments across all old copyright regimes – without any new permissions." Now that's science fiction!
But perhaps the most instructive vision of both library and librarian can be found in Stephenson's Snow Crash. The library in this case is wholly contained on a data card given to the protagonist, Hiro, by a colleague within the virtual reality space of the Metaverse. The interface of the library is a librarian. It appears as an avatar or daemon, but is actually an artificial intelligence program written to organize the data within its collection. Before too many hackles are raised at the suggestion that robots can replace librarians, know that this particular library AND its librarian daemon were designed by an actual librarian working high in the administration of the Library of Congress. Hiro enters the Metaverse and queries the librarian daemon; it can give Hiro a batch of data – including video, audio, book sections, and news clippings – can provide narrative summaries of data within a collection, and is even able to give a sense of what the original librarian thought was going on in the data – in other words, the associative paths he was tracing through the information. Based on this, the software librarian can make tentative guesses as to the arguments the original librarian was considering, based on the patterns of the data collection itself, his queries of it, and his paths through the data. In this way, the software librarian acts much like Vannevar Bush's memex (1946) – both collecting all the data in one place and mapping pathways through that data. This helps Hiro pick up where his predecessor left off, take slightly different pathways, and figure out what he missed.
Ironically, considering our present predicament, within the context of the story, all knowledge and intelligence gathering is privatized. With the breakdown of the federal government, the Library of Congress and the CIA are merged and privatized (into the Central Intelligence Corporation). All knowledge is therefore available only for a fee. Hiro works part time as a stringer for the CIC, uploading bits of intelligence, for which he is paid by people who want access to it. Corporations, governments, and private citizens alike can all pay for access to this information, but it must be bought.
Therefore what is most significant about the utopian library Hiro possesses is that it is not subject to any of these infrastructure controls. The creator of the library worked at the Library of Congress before it was privatized and pulled an Aaron Swartz before taking off. The data would normally have been prohibitively expensive to access in this completely open way, and it defied all the privatized protections to which it would normally have been subject. Hiro is given the data set and the librarian as a gift, suggesting that one of the best possible routes to facilitating the interface and user practice of the library of the future is to steal all the data and give it away for free.
In Fahrenheit 451, lovers of books, knowledge, and free-thinking democracy have to revert to an oral tradition in order to keep alive the posterity of mankind. As in The Children of Men, in this case, the complete obliteration of the bulk of humanity is a good thing. The redeeming fire destroys the thick, decrepit, rotting overgrowth of the dominant civilization. The new revolution in knowledge – in this case, people returning to tell their stories orally and turn them back into books – is only made possible by a violent nuclear holocaust which destroys the power structure that disavows the importance of the free exchange of ideas. Bradbury leaves little doubt that the near complete destruction of mankind is a hopeful ending when it means mankind won't be there to mess it up for everyone. In relation to copyright and libraries, this is the message of both The Children of Men and Fahrenheit 451. I for one would much rather be alive to see the amazing possibility of the library of the future, but I know that, if that is to happen, we have to continue fighting now.