As Libraries and Archives Digitize, Implications for Maintaining Individual Privacy

    by Ellen LeClere
    May 24, 2016
    Photo by Vijetha Vijayan on Flickr and used here with Creative Commons license.

    This piece is part of a special series on Libraries + Media.

    We live in an era in which we expect information to be provided to us at the click of a button. Paul Otlet, a Belgian librarian and father of the Universal Decimal Classification, was motivated by the notion that the world’s information materials could one day be non-rivalrous through technological innovation. Otlet’s vision of technology generating access is being realized today, as many libraries and archives are in the process of digitizing their collections on massive scales.

    Base image via Shutterstock; photo illustration by Kerry Conboy. Click the image for the full series.


    Digital collections may grant broader and non-rivalrous access, but the process has many consequences that should be considered before creating the digital Library of Alexandria. One of them is the difficulty of maintaining individual privacy.

    What happens when the analog becomes digital? Archives collect primary source materials of enduring value to society, like personal papers and organizational records. These records become accessible to anybody who visits the archives and requests access to the collections. Before the internet, accessing these collections required a researcher to travel to the archives, request materials from the collection, examine the papers within the physical confines of the archive, make copies or transcribe information from the documents, and then synthesize the findings at home.

    Digitizing analog collections disrupts this flow of information. It allows for broad and non-rivalrous access, and it gives users the capacity to store, copy, transmit, publish, and publicize potentially private information gleaned from the archives.


    Where the Right to Privacy Comes From

    The right to privacy is compelling and finds support in constitutional guarantees. The First Amendment’s guarantee of freedom of speech allows people to be secure in their personal thoughts and beliefs. The Fourth Amendment protects the right of individuals “to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures.” It is an ethical expression of the physical properties of privacy. But the Constitution’s language leaves room for legal and theoretical interpretations, which has led to many different articulations of privacy as it applies to the individual.

    Perhaps the most famous description of privacy comes from Samuel Warren and Louis Brandeis in their influential article “The Right to Privacy” written in 1890. They conclude “that the protection afforded to thoughts, sentiments, and emotions, expressed through the medium of writing or of the arts, so far as it consists in preventing publication, is merely an instance of the enforcement of the more general right of the individual to be let alone,” thus extending intellectual property law to respect the right of an individual to control information from being published. While it may not be articulated in the law as such, it is articulated in our own assumptions of personal welfare, justice, and above all self-determination.

    The New Ethics of the Online Researcher

    The examples used by Warren and Brandeis illustrate some of the issues of privacy negotiated by contemporary archives. “Suppose a letter has been addressed to [an individual] without his solicitation. He opens it, and reads. Surely, he has not made any contract; he has not accepted any trust. He cannot, by opening and reading the letter, have come under any obligation save what the law declares; and, however expressed, that obligation is simply to observe the legal right of the sender, whatever it may be, and whether it be called his right or property in the contents of the letter, or his right to privacy.”

    The individual in this scenario might very well be a researcher accessing a collection of correspondence in the archives. Upon entering the archives, most researchers present a form of identification and follow rules pursuant to a registration form, creating contractual obligations (or at the very least, social expectations) between the archives, the researcher, and the individuals represented in the correspondence. Though these third parties likely had no idea that they would have their letters read by anyone other than their intended recipient, the rules of the archives have created a system of protections that puts liability and pressure on the researcher to act in accordance with legal rules and social norms, respectively.

    Warren and Brandeis could not have anticipated how technology would change how information flows. Today, the researcher doesn’t have to enter the archive at all. Nor does he have to present identification or sign a registration form.

    CC0 Public Domain photo.

    Privacy in the Archives

    In terms of assigning value to personal privacy, humans can be overwhelmingly myopic. We want (and have a right to access) information concerning others, but are hesitant to allow information about ourselves to be viewed by the general public. We become upset when our most intimate secrets are shared – but why?

    The very nature of generating meaningful relationships and building our own identities requires a degree of intimacy, and most social individuals have at some point shared secrets with others. If privacy is a right to an appropriate flow of personal information, as Helen Nissenbaum, Professor and Director of the Information Law Institute at New York University, suggests, does the digitization of personal letters constitute a privacy violation?

    Different actors have different privacy concerns within the context of the archives. Ideally, the donor of a collection enjoys full intellectual control over it, having legally created or owned all of the collection’s contents. In this scenario, the donor can easily transfer all rights to the repository through an agreement. More often than not, however, donors are not the sole creator or owner of the materials they donate. Many collections are composed wholly of letters sent to the donor by family members or friends. When this happens, archivists try to leverage the donor’s knowledge of the collection’s contents to identify areas of risk. Of course, this approach only works if the donor does have deep knowledge of the materials, which they often do not, given the breadth of many collections.

    The other actors with legitimate privacy concerns are the third parties depicted knowingly (or unknowingly) in collections. “[T]he privacy of so-called third parties who may be represented in a collection can be the most worrisome and difficult to address,” writes Sara Hodson, curator of literary manuscripts at the Huntington Library. “[They] had no voice in deciding the fate of the papers, and are unlikely to have been consulted about any potential sensitivity in the collection.”

    Consulting all of the third parties in a collection would be an onerous process for archivists, however. It would require careful scrutiny of the collection’s contents, finding up-to-date contact information for all third parties, and finally being able to actually contact them – a difficult and labor-intensive task if the collection is large. Yet most personal collections do have some degree of third-party representation.

    CC0 Public Domain photo.

    If any actor can be blamed for disrupting the appropriate flow of information in this process, it is the researcher who disseminates personal information that causes harm to the donor or third parties. This problem is exacerbated when you consider the limited liability and anonymity remote researchers enjoy when they access digital collections.

    Access to Everything, Access to Nothing

    Archives have long provided physical structure to historical collections, and archivists have long provided order to the collections inside. Though they do operate under the same guiding principles as their physical counterparts, digital archives and digital collections are not (and cannot be) the definitive solution for access – though they are motivated by the notion of democratic access to information and creating sites of all human knowledge akin to the Library of Alexandria.

    These projects not only ignore the persistent problem of privacy for the individuals represented in the collections, they also add to the existing problem of the information glut. While we may have access to more than ever before, we often feel as if we have access to nothing. Somewhere among all the digitized materials is the information that we need, the information that answers our questions. Digital collections reproduce curated selections from already accessed collections, rather than bringing less visible collections into view. In this way, digital collections represent human expression that has already been “discovered” and is consequently already accessible, as Lauren Gottlieb-Miller, Assistant Librarian at the Menil Collection, and I suggest in a forthcoming paper.

    We must reconsider the concept of “access” as “visibility” and ask ourselves whether digital collections should be viewed as yet another valuable keeper of our cultural property among a bouquet of many, with the same omissions and failures as their physical counterparts, or whether making all knowledge “visible” is a worthy goal.

    Update: This post has been updated to add credit to the author and collaborator Lauren Gottlieb-Miller for their forthcoming paper.

    Ellen LeClere is a Ph.D. student at the University of Wisconsin-Madison’s School of Library and Information Studies. Her research interests include digital collections and barriers to accessing and using archives collections, such as copyright and imposed restrictions on private or sensitive materials. You can reach her at [email protected].
