How Do We Liberate the U.S. National Archives?

    by Matt Stempeck
    February 25, 2013

    The following is an MIT Center for Civic Media lunch live-blogged by the Center’s Nathan Matias and Rahul Bhargava.

    Today, we’re hearing from the National Archives and Records Administration about the archives they maintain, how they’re making those archives available online at Archives.gov, and approaches to sharing the archives with broader audiences.


    Pamela Wright is the chief innovation officer at the National Archives and Records Administration. Bill Mayer is the executive for Research Services at NARA. Michael Moore is the access coordinator for Research Services East (right here in Waltham, Mass).


    About the National Archives

    The U.S. National Archives keeps more than 4.5 million cubic feet of traditional records: paper records, audio, video, maps, and over 500 terabytes of electronic data. Their records span U.S. federal agencies, courts, records from Congress, and 13 presidential administrations. Only about 5% of the federal government records are preserved in the National Archives; the sheer volume requires them to be selective. There’s a lag of at least 15 years before the records are sent over to the Archives from the agencies that created them.

    The archives cover any issue the federal agency touches, from the environment to housing to health. There are studies, reports, case files, hearings files, aerial photos, and more.

    Environmental scientists from Maine have studied fishing records to learn about changes in the region’s stock over the years. The archives keep project files for the U.S. space program, traffic data from America’s cities, and much more. Now that the U.S. space shuttle program has been shut down, records are already being “palletized” for archival.

    How Can You Find Records from NARA?

    In addition to checking the Archive website, researchers can talk to archive curators directly. To help, NARA also offers residential research fellowships for anyone who wants to explore the archives more closely.

    The Archive would like to expand access to their records, but archives are often held hostage by their format, their location, and the challenge of indexing, says Mayer. There are 15 locations nationwide, but ideally, researchers would have the same experience accessing records no matter where in the U.S. they are located.

    Setting these records free is at the heart of the Archive’s mission. It is an awesome responsibility that supports the democratic process in this country by allowing citizens to hold the government accountable for its decisions. Archives can be personal. Mayer tells us of the time he met a Vietnam vet who spent 43 years gathering the emotional courage to visit the Archives and look up his unit commander. It took the curator 30 seconds to find the commander’s name. These records change people’s lives.

    How do we set the records free?

    Records on paper fill miles of storage space. The Archives’ current footprint consists of miles and miles of shelving, “from the limestone caves of Kansas to compact shelves in downtown D.C.”

    The records come in from agencies with varying degrees of metadata. Mayer gives us “a snapshot of the pain” of the archives, showing us the process of obtaining, processing, and sharing data with the public. For example, the Archive also deals with Freedom of Information Act requests. They have sealed records that are subject to FOIA requests, and if the request is approved, they must find the record, redact it, and share it.

    Wright tells us that the archives just set up their Office of Innovation in October. It covers social media, the web, the online public catalog, the standards program, and presidential libraries. They’re also responsible for coordinating NARA’s Open Government program and Digital Government strategy. Wright comes to the Archives from a research career focusing on water issues for a Native American tribe in Montana.

    Wright tells us about recent culture changes in the archives. They started to dabble with social media for the first time in 2009. At the time, the organization faced fear coming from a desire to stay in control. People in the organization weren’t interested because they thought that social media wouldn’t serve the mission. NARA tried short, three-month pilot projects, which convinced people.

    A new Archivist, David Ferriero, came on in 2009 with a new energy and created a culture where employees could assume the answer to new ideas was “yes until no,” drawing inspiration from Joshua Greenberg’s work at the New York Public Library. The social media team adopted this motto.

    The Obama administration initiated an Open Government Directive, in response to which federal agencies were required to develop plans. President Obama established the expectation that the government doesn’t have all the answers and needs to be more open. Some staff were thrilled by this shift, while others were upset.

    The Archive launched a document-of-the-day series on Tumblr, with wild success. But the relative scale of the following there wasn’t enough for the National Archives. Recognizing the role of Wikipedia in Americans’ research habits, they hired a Wikipedian in residence. In the month of April 2012 alone, National Archive articles received hundreds of millions of views. The Archive has also uploaded public domain images and started a Wikipedia document transcript project. Wright credits the strategy of “skating to the puck” online with helping them reach many more Americans.

    After these successful experiments, the team developed a coherent social media strategy to support the Open Government Initiative. They shifted away from a broadcast model.

    In 2012, more than 135 external projects were published on 13 platforms, generating tens of millions of views. Wright’s big, hairy, audacious goal is to get all of NARA’s records online. They have 30,000 linear feet of records a year.

    Government organizations often pressure their employees to speak with only one voice. There’s a fear of staff or the public saying something wrong and hurting the brand. Wright says that single voices strangle and paralyze institutions, preventing them from having an authentic conversation with the public.

    Wright tells us about the Citizen Archivist Dashboard, a platform that enables users to tag, transcribe, and edit online records, adding their own uploads and sharing them with their friends. They’ve treated these as pilots, to see what gets traction with the public (and to fly under management’s radar). They list some of these projects on Challenge.gov, a government public engagement platform. One real-life outreach effort was the “History Happens Here” initiative, which challenges people to find a catalog picture and take a current photo at the same location. It was inspired by the Museum of London’s augmented reality historical photo app. The top 20 photos were chosen for a postcard book. The transcription pilot was a success for them, with the 1,000 trial documents getting transcribed in just two weeks.


    Internally, NARA has been using the Jive platform to encourage more horizontal, social information sharing among people working at different archives across the country.

    They want to engage software developers, in addition to the historians, archivists, librarians, etc., by making more datasets available.

    The next challenge

    Electronic records are the next great challenge for NARA. Agencies are under a directive from Obama that by 2019 they will manage all of their electronic records electronically. That may seem rational. But agencies have legacy, rigid, monolithic technologies stuck in contracts. Once the data comes in, NARA needs to find ways to keep it safe and provide access. Things have changed since 2007, when George W. Bush tried to claim that presidential email didn’t count as records. Now that attitudes have changed, NARA faces a huge challenge to process that data and make it available.

    What kind of data does NARA currently have?

    Mayer’s work includes classified, private data. Wright focuses on much more open information. Structured data is often tough to process, and NARA will often take in flat files that can be redacted before sharing. Fifty years from now, how will it be possible to share data? NARA also needs to navigate its relationship with other agencies that are opening up their data more directly and engaging the public with hackathons.

    Wright tells us that constant access and interest is one of the best ways to ensure that something is preserved. That’s why NARA is trying to find interfaces to share that data with the public.

    An audience member asks about the challenge of digitization. Wright explains that NARA has an internal set of labs that work on this, and they have external partners (like Ancestry.com). How does the transcription pilot fit into this? Wright explains they have millions of documents in their catalog digitized. Some clearly important ones are done manually; others are just digitized en masse.

    An audience member asks about controversial material (like Wikileaks). It’s not easy, Mayer says. The mission is access, but there are laws that govern how an agency releases that material. They work with the general counsel and the National Declassification Center.

    Following 9/11, the federal government became concerned about “records of concern”, or potentially damaging information that had been made public. The result was that a number of records groups were shut down, including Vietnam War records that were previously available. Mayer expressed frustration with a recent Fresh Air piece that wasn’t able to go into details about why that happened and how they might use an FOIA request to access those records now.

    An audience member asks which open datasets have been flagged as “interesting.” Wright says the open ones have been added to the catalog (for download). They have 85% of all the records they hold described in the catalog. The electronic records operate on more of a pipeline: NARA takes them in as they are being created, but the agency still owns the access (NARA can’t legally provide access yet). This is one of the more complicated issues they face. You can create a FOIA request asking what other FOIA requests are being made, so that can be a source for what might be interesting.

    George W. Bush emails will be available for FOIA in Jan 2014 — NARA needs to be ready for this.

    Another audience member mentions some examples with Flickr, and asks where NARA draws inspiration from. Wright mentions some neat NASA + Angry Birds and Smithsonian 3D printing examples.

    Sites and data sources managed by the National Archives:

    Matt is a Research Assistant at the Center for Civic Media at the MIT Media Lab. He has spent his career at the intersection of technology and social change, mostly in Washington, D.C. He has advised numerous non-profits, startups, and socially responsible businesses on online strategy. Matt’s interested in location, games, online tools, and other fun things. He’s on Twitter @mstem.

    This post originally appeared on the MIT Center for Civic Media blog.
