The fate of history in the balance: The Seattle Federal Records Center still under threat

On February 16, John C. Coughenour, a Reagan appointee and Senior Judge of the U.S. District Court for the Western District of Washington, issued a preliminary injunction blocking the sale of the National Archives facility at Seattle, one of the Federal Records Centers (FRC) in the U.S. This halted the movement of records, many of them “un-digitized records,” from the facility to FRCs in Missouri and California. He called the situation a “public relations disaster” of the Public Buildings Reform Board (PBRB), the entity which proposed the sale, and said that the PBRB had “a stunning lack of appreciation of the issues” of indigenous people. While the attorneys general of Washington State and Oregon applauded the decision, as did indigenous people, genealogists, U.S. Senator Maria Cantwell, and others, the fight is not over. The Stranger said that history “requires defending in the present,” The Cut argued that the fate of the Seattle FRC “remains undecided,” and MyNorthwest noted there is “more potential trouble” in the future if nothing about the facility changes going forward. On February 18, local Seattle leaders and the governor of Oregon both wrote President Biden, calling on him to stop the sale of the facility. The injunction is only temporary, meaning that the facility remains under threat. As such, it is important to explain once again, as I did in February and November of last year, the negative impact the closure of this facility will have on those in the Pacific Northwest and in the U.S. as a whole.

Over the past year, there have been legal efforts to delay the closure. Kim Wyman, the Secretary of State of Washington State, began meeting with the National Archives and Records Administration (NARA) and other stakeholders, in hopes of brokering a solution to keep the archival materials, which document “history across the Pacific Northwest,” in the state of Washington. At the same time, Washington Attorney General Bob Ferguson made filings in federal court, including the recent lawsuit, which included almost 600 pages from indigenous peoples, individuals, and interested groups attesting to the value of the Seattle facility and the materials held there. If the “nearly million” boxes of archival materials from the facility were moved to Missouri and California as planned, access to records about Asian American history would be made more difficult, as would access to records that relate to the “cultural preservation, history and treaty rights” of various indigenous nations in the Pacific Northwest. Moving the records to facilities in those states would make them less publicly accessible, destroying one of the “wellsprings” from which the “collective memory” of the region and nation is formed, as the Korematsu Center argued in its amicus brief in the case. A recent successful lawsuit filed by Ferguson in early January, joined by 29 indigenous groups, and historic community and preservation groups, to stop the relocation and sale of the Seattle FRC, explains the problem succinctly:

“This action shows a callous disregard for the people who have the greatest interest in being able to access these profoundly important records…The facility contains the DNA of our region. It provides public access to permanent records created by Federal agencies and courts in Alaska, Idaho, Oregon, and Washington…the National Archives at Seattle is the only property among those the PBRB recommended for sale that has profound importance to the region in which it is situated and is regularly used by members of the public…These irreplaceable archives are primarily un-digitized and do not exist elsewhere.”

The closure of the facility would violate NARA’s own principles to preserve and provide access to U.S. records and document U.S. history, especially those documents essential to U.S. government actions, rights of U.S. citizens, and any other records which “provide information of value to citizens.” It also runs afoul of NARA’s commitment to drive “openness, cultivate public participation” and strengthen U.S. democracy through “public access to high-value government records.” That same commitment states that NARA will lead the “archival and information professions to ensure archives thrive in a digital world.” That seems unlikely since only about 1% of NARA’s record holdings are digitized and less than 1% of presidential library records have been put online.

Furthermore, moving the records from Seattle to the FRCs in California, whether in Riverside or in San Francisco, and St. Louis, Missouri, would disregard the core values of archivists outlined by the Society of American Archivists. These core values state that archivists have a duty to foster greater access to and use of records, maintain records which allow “contemporary and future entities” to seek accountability, serve as responsible stewards for primary sources, and root their work in an “ethics of care that prioritizes sustainable practices and policies” when it comes to archival duties. The “boxes of information” within the Seattle FRC, highlighted by one local Seattle reporter, Matthew Smith, would be made less accessible if the records were moved elsewhere in the country. If the Seattle FRC is closed, it will be a sad day for archives, records, and preservation of U.S., indigenous, and community history.

Although the closure of the Seattle FRC has been halted by Judge Coughenour, this is only a temporary measure. In the short-term, you could contact the management team of NARA, especially chief archivist David Ferriero (david.ferriero@nara.gov), deputy chief archivist Debra Steidel Wall (debra.wall@nara.gov), and Chief Operating Officer William J. Bosanko (william.bosanko@nara.gov), and the PBRB at fastainfo@pbrb.gov, to express your opposition to the closure, while calling on President Biden to follow the judge’s decision and keep the facility open. In the long term, NARA needs increased funding and you can use the information put together by the Archival Researchers Association to contact your members of Congress to push for legislation which would increase the agency’s budget.

Reprinted from Issues & Advocacy. This was written before the sale of the facility was halted by the Biden Administration. After learning this, I said on Twitter, “that doesn’t mean it should be sold. The decision to sell tthee [sic] facility was rotten and it’s good it was stopped,” called for a bigger budget to NARA, and noted “it was good timing to write another article about this back in March. I personally wasn’t sure whether the sale would be cancelled [sic], but I am glad it was.”

REPOST–“Far-reaching impacts”: Why the closure of NARA’s Seattle facility still matters

Archivists on the Issues is a forum for archivists to discuss the issues we are facing today. The following is from Burkely Hermann, recent graduate of the University of Maryland – College Park’s graduate program in Library and Information Science, with a concentration in Archives and Digital Curation.

Back on February 18, I wrote about the closure of the National Archives and Records Administration (NARA)’s Seattle facility, NAS for short. Recently this issue came to the fore with the publication of an article by Megan E. Llewellyn and Sarah A. Buchanan titled “Will the Last Archivist in Seattle Please Turn Out the Lights: Value and the National Archives” in the Journal of Western Archives.

The NAS facility is key to many different communities. The official page for the facility specifically highlights information they hold about Chinese immigrants and indigenous affairs, along with land records, court records, and genealogical resources. This includes tribal and treaty records of indigenous people living in the Pacific Northwest, and original case files for Chinese immigrants in the 19th century. Volunteers have been trying to index the Chinese immigrant files and create an “extensive database of family history.” This will be interrupted if the files are moved, making the database incomplete.

The NAS facility itself has regional significance. The property the facility sits on was once the location of a prospering farm owned by the Japanese immigrant Uyeji family from 1910 to 1942. [1] These immigrants were evicted from their land during World War II and put into concentration camps, like more than 120,000 other Japanese Americans. The Uyeji family never returned to their home, and the land was seized by the U.S. Navy in 1945, after it had been condemned in earlier years, in order to build a warehouse. [2] The warehouse was later converted into a records facility and began to be occupied by the National Archives after 1963. This transfer of ownership intersected with the history of Seattle’s development, which benefited White people above those of other races from 1923 onward.

There is more to be considered. As Llewellyn and Buchanan argue in the Journal of Western Archives, the closure of NAS is harmful, a failure at “multiple levels of government,” and was made without considering how much marginalized communities in the area value the records held at the facility. [3] Some 58,000 cubic feet of the holdings are permanent records of federal agencies in the Pacific Northwest, while 6,600 cubic feet are occupied by records from the Bureau of Indian Affairs alone. [4] Neither should be destroyed per NARA guidance. The 58,000 cubic feet of permanent records are equivalent to about 1,871 side-by-side refrigerators or about 1,234 top-mount refrigerators. [5] No matter how the size is measured, the NAS facility is well-used, as are its digital resources, by Asian-Americans, indigenous people, and various researchers. [6] Some indigenous people even called the closure and movement of records to other locations a “paper genocide.” As Bob Ferguson, the Washington State Attorney General, stated in February, moving the records from the NAS facility to states such as California and Missouri contradicts the purpose of the archives and impedes efforts by local families to research their ancestors.

There are other problems with the closure. Llewellyn and Buchanan pointed out, for one, the errors in the Public Buildings Reform Board (PBRB)’s assessment to close the facility, noting the significant level of foot traffic, the lack of public hearings on the closure, and NARA management agreeing with the decision to close. [7] There is also concern that not all the records held at the NAS facility could be digitized. Some news outlets, like MyNorthwest, have rightly pointed out that large items like bound books and maps might not be “properly scanned” or digitized at all. Llewellyn and Buchanan further note the involved process of digitization, and extra costs researchers will have to pay if the records from the NAS facility are moved. [8]

Readers may be asking what can be done about the closure. Now is not the time to sit back and let the Washington State government or the Seattle media do the heavy lifting. In the latter case, the Seattle Times opined against the decision to close the NAS facility. In the case of Washington State, Ferguson, mentioned earlier, proposed a compromise to keep the regional facility of NARA in Washington State, worrying, like others, about the prospect of losing access to “over a century of history.” But his noble efforts have been for naught. The closure is on track, with NARA justifying it based on experience with the COVID-19 pandemic, saying the agency will be “less location dependent” in the future, with users accessing resources remotely rather than in-person. On the legal front, in August, Ferguson filed federal Freedom of Information Act lawsuits for public records against NARA, the Office of Management & Budget (OMB), and the General Services Administration (GSA). He also requested documents from the PBRB the same month. He stated that NARA and OMB failed to respond to requests he made in early February, while the GSA has not sent records it promised in the summer of this year. The PBRB, on the other hand, wanted taxpayers to pay about $65,000 to redact information from documents even though no sensitive information is present, as stated in various articles in the Seattle Times, HeraldNet, and Seattle Weekly. These efforts will likely go forward as Ferguson won the race to be the Attorney General of Washington State against Republican challenger Matt Larkin.

In the short-term, readers should email the OMB Director Russell Vought at Russell.t.vought@omb.eop.gov, the GSA Administrator Emily Murphy at emily.murphy@gsa.gov, Archivist David Ferriero at David.Ferriero@nara.gov, and the PBRB at fastainfo@pbrb.gov, opposing the closure of the NAS facility. Currently, the NAS facility has not been listed by the GSA for sale, whether on its database of real property or its database displaying federal properties being auctioned off. While COVID-19 makes the push for more remote learning attractive, it is still possible and vital to open in-person facilities, in line with existing rules and regulations to ensure the safety of the staff and patrons at specific facilities. In the long-term, if the NAS facility is closed, it could put other NARA facilities in jeopardy, as Llewellyn and Buchanan point out. [9] At the same time, archivists should advocate for a “massive investment in time, money, and planning” to digitize more of NARA’s holdings, as the aforementioned scholars argue for, [10] with not even 1% digitized at the present! Whether the facility is closed or not, there are dark times ahead for NARA, as less government spending may be on the horizon, unless the proposed budget for NARA is approved by the House of Representatives and Senate.


Notes

[1] Llewellyn, Megan E., and Sarah A. Buchanan, “Will the Last Archivist in Seattle Please Turn Out the Lights: Value and the National Archives,” Journal of Western Archives 11, no. 1 (October 12, 2020): 7, https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=1125&context=westernarchives.

[2] Llewellyn and Buchanan, 7-9.

[3] Ibid, 3-4.

[4] Ibid, 4-5.

[5] Karie Lapham Fay, “Dimensions of a Standard Size Refrigerator,” SFGate, December 17, 2018, https://homeguides.sfgate.com/dimensions-standard-size-refrigerator-82262.html. I used the highest numbers in Fay’s article: 31 cubic feet for the largest side-by-side refrigerator and 47 cubic feet for the largest top-mount refrigerator.

[6] Llewellyn and Buchanan, 5-6.

[7] Ibid, 11-17.

[8] Ibid, 17-19.

[9] Ibid, 24-25.

[10] Ibid, 21.

REPOST — More than a warehouse: why the closure of Seattle’s National Archives facility matters

Archivists on the Issues is a forum for archivists to discuss the issues we are facing today. The following is from Burkely Hermann, recent graduate of the University of Maryland – College Park’s graduate program in Library and Information Science, with a concentration in Archives and Digital Curation.

On January 26, the Office of Management and Budget (OMB) approved the sale of the 157,000 square foot National Archives and Records Administration (NARA) Seattle facility, which holds permanent federal records for Alaska, Idaho, Oregon, and Washington. This decision raises the question: which is more important, access to historic records or selling a public facility in a high-value real estate market? There has been fierce opposition from historical societies in Alaska and Seattle, historical researchers, genealogical groups, indigenous leaders, university professors, archivists, and historians. They were joined by a bipartisan group of eight Alaskan state legislators and 16 Congress members. The latter, comprising Washingtonian, Alaskan, Idahoan, and Montanan politicians, was also bipartisan. Washington Governor Jay Inslee also opposed the decision, as did Washington’s Secretary of State Kim Wyman. Washington Attorney General Bob Ferguson is considering suing the federal government over the closure. He reportedly submitted a Freedom of Information Act request to the five-person Public Buildings Reform Board (PBRB), OMB, NARA, and the General Services Administration (GSA) regarding the closure. The Washington State Archives even created a page about the topic.

History Associates Incorporated, which cautioned their clients to plan ahead for the facility’s closure, noted the process would take 18 months. They also included the estimate from Susan Karren, NARA’s Seattle director, that only “.001% of the facility’s 56,000 cubic feet of records are digitized and available online,” and stated that permanent records may be inaccessible when transferred between facilities. According to NARA, no actions are being taken imminently which affect users of the facility, and NARA has requested to stay in the facility for three years following the sale. With such hullabaloo on this topic, one question is relevant: why does this closure matter to us, as fellow archivists?

NARA’s Seattle facility in Sand Point is more than a “giant U.S. government warehouse” or “excess property” as described in bureaucratic language. This facility holds records on indigenous people in Alaska, Washington, Oregon, and Idaho. It also holds: Chinese Exclusion Act case files which have been diligently indexed by local volunteers for the past 28 years; Forest Service teletypes about the Mount St. Helens explosion in 1980; federal case records from the early 1900s; and other important local documents. Such records make the NARA facility part of the “historical ecosystem” in the Northwestern United States, providing the public “direct access to government documents, from genealogical records to court files.” These aspects make the facility a “high value” federal property (or “asset”) which has a “deferred maintenance backlog of $2.5 million.” Additionally, no public PBRB meeting transcripts showed discussion of the closure. In one meeting, “warehouse[s]” used by NARA for “long-term storage” were touched on, and at another there was a passing mention of Seattle.

Some may point to existing digitization efforts. Sure, some of Alaska’s records have been digitized, but record series are often digitized by FamilySearch and the project is only five years old. For instance, some records relating to Alaska, like crew lists, immigrant lists, draft cards, and naturalization records, have been digitized, as is the case with Washington and Idaho. But these are primarily 20th century records, with very few 19th century records. The letter from congress members criticizing the decision also called this out, stating that “NARA’s partnership with FamilySearch to digitize records has…not resulted in actual access to records that have been prioritized by stakeholders,” a unique and rare criticism of the NARA-FamilySearch partnership. The limitations of existing digitization undermine NARA’s reasoning that, because some of their “popular records” are already digitized or available online, public access to their archival records will stay in place.

Access to “archived knowledge” is vital and inherent to archival ethics. Moving records away from those who can use them, dividing them between two existing facilities in Riverside, California, and Kansas City, Missouri, is an act of cruel inaccessibility. Furthermore, splitting the records between two locations, regardless of the reason, leads to a strain on those facilities, which need additional storage space. NARA itself admits that the closure will negatively affect those who use the facility. They pledge to engage with researchers in a “smooth” transition when the facility is shuttered, even though this change will undoubtedly disadvantage various stakeholders, whether state archivists, government employees, scientists, students, or others. In a recent invitation-only meeting, they showed their commitment to the closure of the facility, pledging to work with indigenous groups.

The PBRB’s executive director Adam Bodner claimed that the closure of the facility was a decision by NARA staff. If true, this would put them at odds with users and stakeholders who want the facility to remain open. On pages A-68 to A-71 of their report, the PBRB concluded that NARA wanted to move to a more modern facility and that the 10 acres the facility sat on would be great for residential housing, apparently worth tens of millions of dollars as one article claimed. The PBRB also stated that NARA could only fulfill its storage needs at another facility because the current facility does not meet NARA’s “long-term storage needs.” In the process, some records will be moved to a temporary facility. Reportedly, NARA justified the closure by the fact that the facility is the third-least visited NARA site in the country and has “high operating costs.” Such arguments don’t consider the fact that the 73-year-old building could be retrofitted for the agency’s needs or records could be moved closer rather than split between two locations. This closure also stands against NARA’s stated goal that public access is part of its core mission and violates the Society of American Archivists’ Code of Ethics, stating that archivists “promote and provide the widest accessibility of materials.”

In coming days, NARA will be submitting a Report of Excess to the GSA, headed by Administrator Emily Murphy, which will collaborate with the PBRB and OMB to help “offload” properties like this facility. As such, to speak out against the closure, you could email Emily Murphy at emily.murphy@gsa.gov, email the GSA’s Deputy Administrator Allison Brigati at allison.brigati@gsa.gov, call 1-844-GSA-4111, or contact the GSA’s Office of Real Property Utilization and Disposal at 202-501-0084 and at realestate.buildingdisposal@gsa.gov. Alternatively, you could contact the OMB’s Russell Vought at Russell.t.vought@omb.eop.gov or Archivist David Ferriero at David.Ferriero@nara.gov.


Note: This post is reprinted from Issues & Advocacy, as part of their “Archivists on the Issues” series. I wrote this article back on February 18 and am glad I did so. The situation has not changed as a result of COVID-19. Two articles in the Seattle Times and one in MyNorthwest show that the closure seems to still be on the agenda, although discussions between NARA and the Congressional delegations and others continue in hopes of reaching an agreement.

The erasure of records, digitization, and 1990s Hollywood films

Gif of one of the scenes from Hackers (1995)

In the past week, I’ve watched a number of 1990s Hollywood films, such as Sneakers (1992), Hackers (1995), The Net (1995), and My Fellow Americans (1996), where the “everything’s on the computer” state of records, as stated in passing in The Andromeda Strain (1971), has been reached. All of these films share a similar theme: the erasure and change of records (mostly digital), which has an increased relevance as archival institutions continue to digitize more and more of their records, although not everything, as I noted in my post about challenges of archival digitization in late April.

Looking at the 1990s films

Let’s start with The Net, since it was the first of these films that I watched, complete with bulky hand-held phones and dial-up computers. In this film, Sandra Bullock plays Angela Bennett, an isolated middle-aged White woman who is a “program systems analyst from Los Angeles” and lives most of her life online, talking in chat rooms and ordering pizza. That all changes when she takes a trip to Cozumel, Mexico (before which there is a computer malfunction which screws with flights), and meets a man who basically seduces her in order to get control of a virus which is on a floppy disk, of all things. This plan fails, however, as she realizes, after literally sleeping with him for some reason, that he wants to kill her, so she gets away in a dinghy that crashes on rocks, knocking her unconscious. She wakes up three days later in a hospital and the disk has been destroyed. As she is about to go back into the country, after a record was changed that checked her out of the hotel, she is told to sign a temporary visa document which states that her name is Ruth Marx.

As the movie goes on from here, she realizes that her identity has been stolen by an imposter, with the records changed by villains who want to make profits off their security technology and gain access to every system possible. With this, the movie is a bit of a warning that it is very easy for someone to be digitally erased with so much of our lives online, a point that can be made without even spoiling the ending. Clearly there are inept secondary characters (police officers, nurses, and jailers), many of whom, like sole archivist Madame Nu in Attack of the Clones, think that records are inviolable and cannot be changed. The partially inept villains are even able to kill a few people, like a friend of Bullock’s character and the Undersecretary of Defense, the latter by falsifying a report saying he has AIDS. At one point, she says that “our whole lives are on the computer, and they knew that I could be vanished. They knew that nobody would care and it wouldn’t matter.” Later she says more to the inept court-appointed lawyer assigned to defend her from false charges, who believes that the records are inviolable and cannot be tampered with:

Just think about it. Our whole world is sitting there on a computer. It’s in the computer. Everything. Your DMV records, your Social Security… your credit cards, your medical history. It’s all right there. Everyone is stored. And there’s this little electronic shadow on each of us… just begging for somebody to screw with. They’ve done it to me, and they’re gonna do it to you…I’m not Ruth Marx. They invented her. They put her on your computer with my thumbprint.

There were some similar themes in the 1992 film, Sneakers, which starred Robert Redford. The film centers on attempts to create a black box which would crack American codes, allowing access to any American security system. In the process, a team tries to steal the box back and one of the characters purchases blueprints from the county recorder’s office for $50.00, leading the movie to be cited as an example of “the use and portrayal of records in film.” [1] With the information from the county recorder’s office, and their own observations, they are able to break in to the villain’s company and get the box, but before it is handed to the NSA one of the characters removes the main processing chip.

There is more than that. Redford’s character is basically a hacker, as was his friend Cosmo (the film’s villain), who was arrested and thrown into prison for computer crimes. The black box has a similar power to the malicious code in The Net. Again, the focus is that records can easily be changed, or in the case of this movie, mimicked, to certain ends. Like the previously mentioned film, the cast is mostly White, but a bit more diverse in that there is a Black former CIA agent on the team of the “heroes.”

There’s one other film which has similar themes: Hackers, which features Angelina Jolie in a starring role. It focuses on a group of teen hackers who work to take down a villain who wants to sink a few oil tankers while getting wealthy in the process. In this “cult classic” film, as some places call it, there are computers running on dial-up (like in The Net), huge portable phones, people in some of the nerdiest clothes ever, and moving of information around on…floppy disks! In fact, the virus itself is on a floppy disk.

The altering of records is a key part of this film as well, as the villain alters the criminal records of the male protagonist and his mother to list them as criminals, blackmailing him to give up the floppy disk. In the end, this group of hackers, all men except Angelina Jolie’s character, and all White except one kid with dreadlocks, sets out to take down the servers of the villain’s mega-corporation, succeeding thanks to help from two Japanese hackers and their subsequent “electronic army” of hackers. Somehow they basically escape a prison sentence thanks to a television broadcast from one of the hackers, which seems strange as he could be utterly lying. As with most movies of this nature, the plot doesn’t always completely add up.

Finally, there is a bit of an outlier: the 1996 film, My Fellow Americans. This is perhaps the most hokey film of all, although archives are a main part of this film. Ex-Presidents, played by James Garner and Jack Lemmon, discover a scandal in the current administration. Lemmon discovers that conspirators have altered his official records, at his presidential library archival vault, in order to “erase traces of a meeting.” At another time, Mark Lowethal’s character goes to the National Archives, finding that the presidential appointment log does not show this meeting. [2] It turns out the culprit behind these changes is the current sitting president, the former vice-president, with his chief of staff being the one who “doctored the Archives log and the log in Kramer’s library.”

In this case, the film does not involve the changing of a digital record but only the changing of a paper record. Still, this has a similar theme to the other three movies in that records can be doctored, manipulated, and changed to the benefit of certain individuals, although this can be, at times, easier to do with digital records than with paper records. I would also say the theme that records can be changed, erased, or rewritten runs through the Halt and Catch Fire series, along with shows like Mr. Robot, going into its last season this coming fall.

Why do these films matter?

“If I could take all the things that I am, all the feelings I have, all the things that I want, and somehow get them on a computer card, you would be the answer. I don’t know why or how you’ve come along at this particular point in my life. See, that’s the magic part. I’m not gonna let you go.” - Dr. Sidney Schaefer talks to his girlfriend (who ends up being one of the people who is spying on him) in The President’s Analyst, a 1967 film

They matter because more and more of the records held by archival institutions are digital, specifically “born-digital” (like tweets, Facebook posts). Of course, they are a bit dated, as they came out between 1992 and 1996. However, the point that records can be changed and manipulated should be considered. There should be measures in place to make sure that the records, especially digital records, are not tampered with. Perhaps this would require fixity checks, but also could necessitate rules on the usage of records themselves.
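To make the idea of a fixity check concrete, here is a minimal sketch in Python, not a description of how any particular archive or NARA does it. It assumes the digital records are ordinary files in a folder, and the function names (`sha256_of`, `build_manifest`, `check_fixity`) are hypothetical names chosen for illustration. A checksum is recorded for every file when the records are taken in, and later recomputed and compared, so that a changed, missing, or newly added file stands out.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Record a checksum for every file under folder (hypothetical layout)."""
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def check_fixity(folder: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files currently in folder against a stored manifest."""
    current = build_manifest(folder)
    return {
        # Same name, different checksum: the content was altered.
        "changed": sorted(
            n for n in manifest if n in current and current[n] != manifest[n]
        ),
        # Listed in the manifest but no longer present.
        "missing": sorted(n for n in manifest if n not in current),
        # Present now but never recorded in the manifest.
        "unexpected": sorted(n for n in current if n not in manifest),
    }
```

A check like this only detects tampering; it does not prevent it, and it is only as trustworthy as the stored manifest, which would need to be kept somewhere the records' custodians (or a villain's chief of staff) cannot quietly rewrite it.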

At the same time, the archives themselves should not be like the dark and haunting Thatcher Memorial Library in Citizen Kane, which has what some have described as one of the world’s meanest archivists, played by Georgia Backus, with hair up in a bun “and an intimidating stare on her face, a real dragon lady at the gates of knowledge.” This is not the type of archives you want to go to! This is not the image which should be projected. [3]

What I have said so far is only scratching the surface. These 1990s movies have lasting importance because born-digital files are entering archives across the world, like some in New Zealand, and include “photos, radio broadcasts and documents,” requiring appropriate workflows. Margot Note, a prolific writer in this field, described how, as a former lone arranger who directed all archival management at an organization, she launched a project to digitize a set of records, creating digital surrogates of 2,000 of the collection’s best images. She added that such surrogates are superior to past formats like microfilm since they can be delivered through networks “offering enhanced access to simultaneous users around the world.” In the same article she advocated the importance of digital collections, saying they grant “valuable remote access to the information contained within the original records” if they are created within the appropriate archival infrastructure, with metadata, search functionality, and indexing. She adds that digital collections of archival records can not only provide “multiple points of access and enhanced image details” but can also allow for more in-depth study than analog originals, increase interest in items which have often been ignored, and act as “an advocacy tool for an archives.” She also argues that different types of digital surrogates of records can be created, whether for web display, storage, or print reproduction. She ends by saying that while “electronic copies suffer no degradation through the duplication process,” a copy of a digital photograph is “indistinguishable from its source,” meaning that the “original” loses its meaning, and that with digitized images, “researchers risk losing information that enables them to understand how the image was accessed and how its physicality changed over time.” As such, there should be efforts to limit or eliminate such a loss.

But there is another aspect to archival records. Librarian Carrie Wade argued back in December 2018 that information is political, with information loss affected by federal funding decisions about research repositories, ruining the work of professionals. Similarly, archivists should not be completely neutral, not only because who “we elect impacts our ability to do our jobs well and the access that people have to information,” as she argues, but because they literally cannot be neutral, as they are human beings with viewpoints, emotions, and thoughts of their own. Building upon this, there are clear archival silences or “gaps in the archival record,” with these silences “created and enforced within archives” as a result of practices that are “central to the work of archivists.” Digital records, whether born-digital, like social media posts, or digitized paper records, can help bridge this gap. After all, paper or analog records can be digitized in ways that allow access to them through online channels while originals are restricted.

All of this is relevant to the 1990s films I referenced in the first half of this post, as it requires having effective records management programs. The policies regarding records in Hackers and The Net, or even My Fellow Americans and Attack of the Clones, were clearly outdated, and should be taken as a warning to have correct policies. This also requires taking into account challenges with capturing resources that are born-digital and making them available, effectively curating this information for the user. Furthermore, this is important as a major trend in libraries is the collection of data to prove their value, which can be useful but has its downsides, especially when it comes to ethical concerns with data mining and big data. At the same time, how material is defined for easy access is a challenge “to every content owner,” as is choosing the right metadata, with “important detail work” in this process. The same is the case for finding more “accessible ways for people to find and scan content” and ways to share these “images with your target audience.” [4]

Concluding words

All of this ties back, of course, to the classic animated sitcom, Futurama, with its mentions of “technical support,” CDs, CD players/CD racks, and floppy disks (some of which are 15-inch). In fact, in one episode, “How Hermes Requisitioned His Groove Back” (season 2, episode 15), the last half of the episode is about going into the central bureaucracy to get back a disk with Bender’s brain on it, which is a floppy disk. Others mention existing government records, databases, a record vault (safe box) and an arrest record. In one episode Fry even declares to Bender that “I’m not a robot like you! I don’t like having disks crammed into me” while in another he downloads “a celebrity from the Internet” from a parody of Napster, which is kidnapping celebrities and illegally copying them, with the “backup disk” being a floppy disk. Others focus on big data and concentration of information, digital cameras and operating systems.

I mention all of this because it shows the relevance of record erasure, digital archives, digitization, and the changing digital environment. This requires of course that you don’t have “unauthorized data access” like Fry accessing the computer connected to the brain spawn. In the end, while these 1990s Hollywood movies are dated in various ways and problematic in others, they still have relevance connected to present developments of archival institutions in response to new technologies and making records more accessible to online users.


Notes

[1] Kyle Neill, Senior Archivist of the Peel Art Gallery Museum & Archives, also argues that there are archival themes in The Dark Knight (2008), The Avengers (1998), Chinatown (1974), and Tinker Tailor Soldier Spy (2011).

[2] This reminds me of a major plot point in Thrill Seekers, a 1999 TV movie, where the protagonist finds out that there are people who travel in time (from the future) to disasters and serve as tourists, disgustingly watching people die. In the process, the researcher on staff at a local newspaper, a bit like a records clerk, has databases of newspapers on her computer, which she lets him search for information even though she just met him (not good records management). Ultimately she says that she will go to the National Archives to find the original images, proving that he was not lying about the time travelers. Later, the protagonist goes back in time and saves her. But I thought I’d just mention this, as the fact that she is a bit of a records clerk brings her in line with the records clerks in Erin Brockovich (2000) and Chinatown (1974). The former has a clerk who flirts with a law firm filing clerk (Erin Brockovich) who uncovers wrongdoings of a water utility company on her three visits to the records office of the Regional Water Board, letting her into “a records storage area, piled high with files, papers and binders, where she proceeds to copy water records,” allowing her to complete her work. The latter has a sullen young man who does not like his job, grudgingly providing assistance, with Jack Nicholson’s character “tearing out part of a page from a record book by covering the noise with a cough” after he is told he cannot check out the volume. This clerk, as one reviewer puts it, has “a well crafted scene presenting a stereotypical records keeper” with the clerk/archivist as “an impatient, unhelpful civil servant guarding over his records domain who treats the public as trespassers” while the “records are in long aisles in bound volumes.” Some have compared Erin Brockovich to another film with records as central, specifically A Civil Action (1998).

[3] The same goes for Hollywood images of old archivists like in Vampires (1998) where the church archivist is introduced, a “slight, bearded man with glasses” who is sent along on a quest, in They Might Be Giants (1971) where a wealthy lawyer, who thinks he is Sherlock Holmes, teams up with a psychiatrist “to try to rid the world of evil” and in the process, one person plays an aged archivist who, despite his problems, “does come across as the sanest person in the movie and he finds clues to track down Moriarity,” or in Amityville II: The Possession (1982) when a father uses a local archives to find out about a house causing trouble for his family, and in the process he is helped by an elderly archivist, a person who says “I’ve worked here for 25 years.” There are other mentions of archives, but without archivists, in Arlington Road (1998), Batman Begins (2005), Beverly Hills Ninja (1997), Broken Lullaby (1994), GoldenEye (1995), Journey to the Far Side of the Sun (aka Doppelganger) (1969), L.A. Confidential (1997), Message in a Bottle (1999), Ninth Gate (1999), Rogue One: A Star Wars Story (2017), Secret Nation (1991) [Canadian film], Shooting the Past (1999), Smilla’s Sense of Snow (1997), The Dark Knight (2008), The Name of the Rose (1986), The Phantom (1996), and The Shadow (1994). Also, there are said to be flirtatious archivists in Carolina Skeletons (1991) and Just Cause (1995), along with helpful ones (either initially or ultimately) in Cloud Atlas (2012), Deceived (1991), Quatermass and the Pit (1967), The Fugitive (1993), and The Mask of Dimitrios (1944). There are also a number of films which have archivists in the background: Charlton-Brown of the F.O. (1959), Macaroni (1986), Red (2010), Ridicule (1996), Rollerball (1975), and The Age of Stupid (2009), and those that are said to have nasty or mean archivists: Blade (1998), In the Name of the Father (1993), Scream 3 (2000), The Nasty Girl [Das Schreckliche Madchen] (1990), and The Watermelon Woman (1996). Please, do not construe this as an endorsement of any of these films, as likely they are mostly terrible.

[4] Also see articles about how libraries lead with digital skills and a cryptic finding aid.

Interpreting history: thoughts on History Day

This post originally had thoughts on my presentation at the iSchool symposium, which has been incorporated into an upcoming e-book.

I’d like to talk about some thoughts on Maryland History Day, for which I judged this past weekend, including as a chief judge in the morning for senior individual websites. They included topics ranging from, as I noted on Twitter, the Apollo Missions to the Atomic Bomb. I also did runoffs for documentaries, with topics including “Cocoanut Grove, Stonewall Riot, Thalidomide tragedy, ACT-UP, the Osage indigenous people (and oil), and the Triangle Shirtwaist Fire,” some of which I had not heard of before. As I awaited the winners, I already knew that the group documentaries I had reviewed had won, documentaries like “Last Dance at the Cocoanut Grove” (by Aidan Goldenberg-Hart, Daniel Greigg, Eli Protas, Joey Huang, and Charles Shi) which got first place, and “From Inefficient to Inspiring: How the Stonewall Riots Changed LGBT Activism” (by Pallavi Battina and Amulya Puttaraju) which got second place. However, when it came to individual websites, one of the ones I reviewed got first place! It was titled “Julius Rosenwald and Booker T. Washington: How Their Investment in People Led from Tragedy to Triumph” and it was by Matthew Palatnik. None of the websites my group had nominated for special prizes won. So that was positive.

History Day made it clear to me that even the topics often written about can be talked about in a new way, with a new interpretation, with these students entering the process of historical research, so I wish them the best in the days going forward. In June, I will serve as a judge on the national level of History Day at College Park, which should be fun!

In closing, there is a strain that connects the visualizations I made this semester and Maryland History Day: the importance of history and interpretations of what happened, allowing for new insights and thoughts, enriching how our collective past is understood.

Challenges of archival digitization, Robert Caro, and digital archives

Recently, when going through LinkedIn, I came upon a post by Margot Note, who wears many hats simultaneously as a records manager, archivist, author, and consultant, about the shifting concepts of preservation in the digital world, which had been written last fall. She argues that information professionals, like archivists, have questioned existing assumptions about preservation, with the creation of new principles for born-digital materials (like tweets, Instagram and Facebook posts) and those materials which are digitized. This change is happening while physical records deemed to have “enduring value” are still acquired, stored, and made accessible. She goes on to state that the ever-changing digital landscape has added complexities to archival practice, altering existing procedures, especially in the realm of preservation, since those methods used to preserve physical paper materials no longer translate to digital resources, requiring new methods. For example, she notes that, unlike with paper records, you can’t reverse preservation treatments for digital records, such as migrating digital files to new formats when old ones are not usable anymore. These are transformations that, hopefully, do not constrain the original functionality of records.

She also adds that for digital materials, the content is what is important, not the carrier for such content, and that unlike physical paper materials, which may not deteriorate rapidly if they are ignored, digital files are stored on media that “deteriorates, and rely on hardware and software that may no longer be available” which means that neglect is not an option. This means that despite differences in preserving digital and paper materials (often called “analog” or “legacy” materials), some practices can apply to both, like appraisal and addressing information as a collection rather than on an individual level, while recognizing that all materials have “the tendency to decay.” She ends by saying that digital and paper preservation considers needs of patrons, with action needed, ultimately, to preserve materials in the immediate future, “ensure the survival of research materials for our users,” and ultimately sustain “cultural heritage for the next generation.”

While this is a good start, there is a lot more to talk about. I could bring in some of her other publications, like a book on family archives [1], but I’d like to broaden the scope. This article will talk about the challenge of digitization in archives (with connection to Robert Caro’s recent comments) and challenges of digital archives. There will also be a connection to sister institutions of archives, libraries, which are distinct in and of themselves [2], as I have noted on this blog in the past, even as you get an MLIS/MLS (Master of Library and Information Science or the rapidly dwindling Master of Library Science) to study…archives. As the SAA notes on their “So You Want to Be an Archivist” page, the “number and content of archival education offerings, especially multi-course programs, has continued to expand in recent years, and a few institutions now offer master’s degrees in archival studies.” I’ve recently wondered why degrees like archival science (or perhaps archival studies) are not more widely offered, but perhaps that is a discussion which can branch out from this post.

Robert Caro’s faulty argument and archival digitization

From the NARA Strategic Plan (2014-2018).

In order to begin this discussion, I am reminded of some dialogue in the 1971 science fiction movie, The Andromeda Strain. One character, Mr. Mark Hall (played by James Olson) asks “where is the library?” to which his colleague, Dr. Charles Dutton (played by David Wayne) responds: “No need for books. Everything’s in the computer.” And the movie goes on, as there is no more discussion. Later on, the computer does have an error and overload when too much information is inputted by the scientists, the “heroes” of this film in this top-secret facility in the Nevada desert called “Wildfire.” The fact that everything is stored on the computer is not mentioned in any reviews of the movie I have found, and as such, perhaps people should revisit this movie for just this reason, as it is still relatively enjoyable. We have gotten to the point that everything is “in the computer” like in this film, not only with libraries and other public institutions, but more and more with archival institutions in recent days.

That brings us to the recent debate over what Robert Caro, a presidential scholar of the Johnson Administration, said about the digitization of archival records, for which he was criticized by fellow archivists on the Twittersphere (and likely elsewhere). He tried to describe how people are interacting with records differently now than they had in the past, in the “pre-internet” days, those before the internet was publicly available, the days in which it was available only to universities and the government, which Joe MacMillan tried to exploit in a few episodes (starting with the Yerba Buena episode) of the third season of the short-lived series, Halt and Catch Fire, without much success, as the show is all about failure.

Caro’s words come from a recent interview by Popular Mechanics, prompted by the publication of his new book, Working, about his research process, apparently a #1 best-seller on Amazon. He told the interviewer that he still does much of his writing on a typewriter although he has a laptop on his desk (apparently a Lenovo ThinkPad). This is because he was told by those at the Johnson Presidential Library that his “typewriter was so noisy, it was disturbing the other researchers” which is telling. He also tells the interviewer that he took notes on his computer but still uses his typewriter and writes in longhand (who does that anymore?). While some would argue that this is fine, what he stated next is what was criticized by archivists on Twitter:

It [writing on a typewriter] makes me think more. Today everybody believes fast is good. Sometimes slow is good. Almost two years ago, Ina [Caro’s wife] and I went down [to the archives], and I’m sitting there, in the reading room, writing my notes. Everybody else is standing there taking photographs of their documents. They do it with cell phones now. If you saw me there, you’d see one person who’s not in the modern age.

Now, while each researcher can choose their own way to use documents, it seems like he is looking down on those who use their phones, or other electronic devices, to take pictures of documents. How can you even argue that those individuals are not taking their own notes or that they cannot think the same amount when using digital devices? As Jan Murphy, a family historian who is a big fan of encouraging people to take notes, added on Twitter, it wouldn’t be right to “insist on all handwritten notes all the time,” the latter of which is “just nuts.” Adding to this is the fact that digital photos can be transcribed at home, even comparing information from different archives. Additionally, sometimes people like Caro, who could be considered to be part of the traditionalist/silent generation since he was born in 1935, may not even be able to read their own handwriting! This is the case with other people, especially those who have dysgraphia, with the extent this learning disability affects the general population not currently known. With this, we should also consider that not everyone has the leisure/ability to transcribe material needed from an archive in longhand. Some, as Murphy noted in another tweet, would rather “spend the time in the archive, having taken my photo, making notes about the record’s condition & taking notes for my source citation etc.” The question is simple, as Murphy, who sometimes wishes she had a small manual typewriter when electricity is off, asks, posing a question which Caro never really answers: “But what’s wrong with taking digital photos of records in archives?” I concur with that. I don’t see anything wrong with it. In fact, the Maryland State Archives is one example of an institution which allows researchers to use electronic devices such as phones to take photos of documents.

After this, he goes into the use of paper records:

I feel there’s something very important, to be able to turn the pages yourself. I don’t want anything standing in between me and the paper. People compliment me on finding out how [Johnson] rose to power so fast in Congress by using money. That happened down there, and it was a vague, amorphous thing. I was sitting there with all these boxes, taking all these notes. And you saw letters, his very subservient letters—“Can I have five minutes of your time?”—and then you see the same letters coming back to him. And I said, Something happened here. What’s the explanation? Why is a committee chairman writing to Lyndon Johnson, asking for a few minutes of his time? So I sat there and put my notes into chronological order. And then it became absolutely clear. Would the same thing have happened if I’d stood there taking photographs and went back? Possibly. But I don’t believe it. To me, being in the papers is really important.

While I understand what he is saying here, more and more records are online than ever before, meaning that the records of the Obama Administration and future presidencies will undoubtedly be different from those of the Johnson Administration. Caro is almost stuck back in time, part of the old guard of presidential scholars who inhabited presidential libraries (which can more accurately be called presidential archives). I won’t touch on the plans for the Obama Library only because I have written on that topic for one of my classes at UMD and it may be published in an academic journal in the future (fingers crossed), so I don’t want to tread on the same topics in this post. I would add that using paper records is not the only way to interact with records, as users can easily interact with them online using new and exciting methods.

From here, Caro becomes a bit ridiculous:

Well, there’s no reason why that [a deep dive through thousands of digital pages of emails] has to be a different kind of research. Someone else could come along who was nuts like me and say, I’m going to look at every email. What’s more worrisome to me is that, when you talk about digitization, somebody has to decide what’s digitized. I don’t want anyone deciding what I can see. It’s very hard to destroy a complete paper trail of something. Lyndon Johnson was very secretive, and he wanted a lot of stuff destroyed. But the fact is, they were cross-referencing these pages into ten or twenty or thirty different files. There’s always something. But the whole idea of emails—I don’t use emails, I may be wrong—I’m not sure there’s a trail like that. It’s too easy to delete.

While he makes a good point that there can be the same kind of research, that doesn’t mean he is right overall. It is laughable for him to claim that “when you talk about digitization, somebody has to decide what’s digitized” and to then declare “I don’t want anyone deciding what I can see.” Clearly, he does not understand the fundamental archival principle of appraisal, along with selection and description within archives, which has been debated from the time of those like British archivist Hilary Jenkinson in 1922 and U.S. archivist T.R. Schellenberg in 1956. The records he is looking at, while researching at the Johnson Library, are chosen by professional archivists, specifically those from NARA, so people are deciding what he can see. As such, deciding what records are digitized is also a responsibility of archivists, which will be explained later in this post.

He further claims that it is “very hard to destroy a complete paper trail of something.” I’m not actually completely sure about that. Taking from NARA’s official history of presidential libraries, they write that before these libraries came about, with impetus from FDR in 1939 when he donated his personal papers to the federal government, presidential papers were often dispersed by former presidents and their heirs after their time in office. They further note that while many collections of records exist of presidents before Hoover at the Library of Congress, others are divided between historical societies, libraries, and private collectors. Even worse, as they acknowledge, “many materials have been lost or deliberately destroyed.” So, a “complete paper trail,” as he described it, CAN be destroyed.

Considering that “Lyndon Johnson was very secretive, and he wanted a lot of stuff destroyed” as he notes, this contradicts his point that it is “very hard to destroy a complete paper trail of something.” I mention this because it would mean that if Johnson wanted, he could have worked to destroy a complete paper trail, especially since it was only after Watergate that presidential records were considered property of the federal government rather than “private property” of the former Presidents, a view also widely held in the archival profession at the time. Furthermore, when he talks about cross-referencing of the pages, he seems to not understand how emails work. This is no surprise from someone who doesn’t “use emails,” as he admits! He claims that he is not “sure there’s a trail like that” and that “it’s too easy to delete” emails. While it is true that it is easy to “delete” them, think about “deleted” files on a computer. They are not erased right away; rather, the directory entry pointing to them is removed, and the data itself lingers until something else overwrites it. The same is often true of any file, whether a PDF, a photograph, or something else you upload online: deleting it typically removes only the pointer to it, and copies may persist elsewhere. Just like when you throw something away in a garbage can, it is not simply eliminated, but it is sent somewhere else, like a horrid waste-to-energy plant or an overflowing landfill. There was actually a whole Futurama episode about an overly wasteful society back in May 1999, titled “A Big Piece of Garbage.”
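The point about “deleting” a file can be sketched in a few lines of Python. This is only a loose illustration, under the assumption of a filesystem that supports hard links (most Linux and macOS systems do); the file names are made up for the example. It gives one file two names, removes one of them, and shows that the contents are still readable through the other, since removing a name removes a directory entry rather than wiping the data. When the last name is removed, the space is merely marked as free, which is why recovery is sometimes possible until it is overwritten.

```python
import os
import tempfile


def remove_one_name_keep_data(text: str) -> str:
    """Write text to a file, add a second name for it, delete the first name,
    and return what can still be read through the second name."""
    with tempfile.TemporaryDirectory() as folder:
        first_name = os.path.join(folder, "memo.txt")          # hypothetical name
        second_name = os.path.join(folder, "memo-linked.txt")  # hypothetical name

        with open(first_name, "w", encoding="utf-8") as handle:
            handle.write(text)

        os.link(first_name, second_name)  # a second directory entry, same data
        os.remove(first_name)             # "delete": only this entry goes away

        with open(second_name, encoding="utf-8") as handle:
            return handle.read()


print(remove_one_name_keep_data("Can I have five minutes of your time?"))
```

Email systems are more complicated than this, with server copies and backups, but the underlying idea is similar: removing the visible pointer is not the same as destroying the record.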

As Curt Hopkins wrote in The Daily Dot six years ago, when a user “deletes” an email normally it becomes “invisible to that user and is immediately a candidate to be overwritten” but until then it exists and it may even “persist longer on company servers.” He further notes that even if an email is “taken off your computer, it may still be available on the host’s server,” adding that you must “presume that any email you compose” will remain accessible forever, although secure email services are available. There may still be “elements that indicate the prior presence of the email” and logins that are often retained, to say the least. Even one article recommending how to delete emails forever warns that “some online email services maintain an offline backup of email accounts,” adding that “your permanently deleted email may still reside in these inaccessible backups…There is no way to force immediate deletion of emails in these backups.” Also, there are specific data retention rules on the federal level and likely within various organizations, which require retention of such emails. I am also reminded here of “Testimony” (S4, ep9) of Veep. I mention this because, at one point during the episode, Mike McLintock (played by Matt Walsh), the incompetent press secretary, is brought before a congressional committee. He thinks he deleted the voice memos of then-president, Selina Meyer (played by Julia Louis-Dreyfus). In fact, as the committee reminds him, these memos exist in the cloud and they plan to listen to them for any further evidence in their investigation! [3]

With that, it leads to the next part of this post, which goes to a question that the public, taken in by stereotypes about archivists, often asks of archivists and archival institutions.

Why can’t everything be digitized?

In May 2017, Samantha Thompson, an archivist at the Peel Art Gallery Museum and Archives, wrote a post which aimed to answer the question of why archivists don’t digitize everything, since it is a common question. As such, it is clearly important to remind people why not everything is digitized and that, in fact, “only a tiny fraction of the world’s primary resources are available digitally,” coupled with the fact that archivists and librarians themselves are “behind the abundance of primary sources already available on the internet” while organizations like the Internet Archive or Ancestry.com have raised “public expectations about access to historical resources.” [4] She goes on to argue that digitization, the “production of an electronic image of these record,” saves information from a paper record, but it does not produce “a clone of the record” but rather results in an “approximation…of a dimension of the record,” often called a surrogate. She further notes that while archivists commonly digitize records in order to increase access (which some cataloguers do as well), they also argue (rightly) that mass digitization is costly in time and money, which sometimes people are skeptical of, not realizing that “large-scale digitization in an institutional setting is not your average home scanning operation.” There are a few reasons for this, including archives holding vast amounts of material, with digitizing even small archival collections being a big time commitment since many groups of archival records are not easy to scan quickly.

For instance, while you could use an automatic feeder to quickly scan a stack of pages, the benefits of such speed must be “weighed against the risk of a one-of-a-kind document being mangled by a paper jam” which is always a concern! This means you must engage in manually scanning which includes tasks such as removing staples (and paper clips), positioning the item, processing the images, and entering the appropriate metadata, all of which is a lot of work. As such, “scanning a single archival box of records can take days” as she puts it. This is even more the case if records within the file are various shapes and sizes, or if they are large enough that they must be scanned in sections and “digitally stitched together.” While sometimes taking a photograph is the best option, you need a “high-quality photographic set-up including lighting, document holders, and a camera with an appropriate lens” which obviously is expensive enough that not all institutions can afford such a set-up. This means that scanning produces not an exact copy of the record “but only an impression of certain aspects of it” and it may be hard to convey annotations (like sticky notes) on the paper record itself in a digital form, or physical characteristics of the paper records. This brings us to one of the most important parts: linking the digitized record to crucial information, which is often called metadata, some of which is technical and other parts that describe the record itself. The latter is information like a date or time the record was created. But some elements are more complex like determining the “story of the person or organization that created it.” As she puts it rightly, an individual record “within an archival collection does not tell us its whole story.” This means that without vital descriptive work of paper records in the first place, those electronic records which are produced through digitization would be an unusable and undifferentiated mass.
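To make the distinction between technical and descriptive metadata a little more concrete, here is a minimal sketch in Python. Every field name is an assumption chosen for illustration, not a schema used by Thompson or by any particular archives. It shows that a scan can be technically complete and still be part of an “undifferentiated mass” if nobody has described it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TechnicalMetadata:
    """How the scan was made (hypothetical fields)."""
    file_format: str
    resolution_dpi: int


@dataclass
class DescriptiveMetadata:
    """What the record is and where it comes from (hypothetical fields)."""
    title: Optional[str] = None
    date_created: Optional[str] = None
    creator: Optional[str] = None
    collection: Optional[str] = None


@dataclass
class DigitizedRecord:
    file_name: str
    technical: TechnicalMetadata
    descriptive: DescriptiveMetadata = field(default_factory=DescriptiveMetadata)

    def is_described(self) -> bool:
        """True only when every descriptive field has been filled in."""
        d = self.descriptive
        return all([d.title, d.date_created, d.creator, d.collection])


# A bare scan: it has technical metadata but no description at all.
bare_scan = DigitizedRecord("scan_0001.tif", TechnicalMetadata("TIFF", 600))

# The same kind of scan after an archivist has described it (made-up values).
described_scan = DigitizedRecord(
    "scan_0002.tif",
    TechnicalMetadata("TIFF", 600),
    DescriptiveMetadata(
        title="Letter requesting a meeting",
        date_created="1941-03-02",
        creator="Example Correspondent",
        collection="Example Family Papers",
    ),
)

print(bare_scan.is_described())       # False
print(described_scan.is_described())  # True
```

In practice, the descriptive side is the expensive part, as it depends on the arrangement and description work done on the paper records in the first place.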

She goes on to note that since digitization involves investment of resources and time, archivists need to be clear that the electronic files produced adequately represent the originals, meaning there need to be quality control checks in place. This involves factors such as scanning resolutions, typing accuracy and photographic skill, since archivists are responsible for ensuring that “people are getting a reliable and authentic view of records.” There is another conundrum with digitization itself: archivists are required to not only retain the paper originals but the digital files as well. These are files that are subject to disorder and decay just like paper records, with a tiny shift causing a set of errors, with even unused data subject to random degradation and loss, often called “bit rot.” Coupled with this is the question of future readability of the data, since digitization of files is not worthwhile if no one can open the files as software and the accompanying “hardware inevitably becomes obsolete.” Luckily for all of us, especially those in the archival field, archivists are at the forefront of pushing boundaries of digital longevity as technologies and file format standards are improving. However, as she notes, the “average lifespan of a hard or flash drive is still a fraction of that of a piece of paper stored in optimal conditions” with digital data needing to be stored in specific temperature conditions as well. All of this means that when anything is digitized, archivists commit to maintaining the digital file and the original on which that file is based.
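One common way to catch the kind of silent damage described above is a fixity check: record a checksum when a file is created, then recompute it later and compare. The following is a minimal sketch using Python’s standard hashlib module; the function names and the stand-in bytes are assumptions for illustration, not a description of any specific institution’s workflow.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_checksum: str) -> bool:
    """Compare the data's current checksum with the one recorded earlier."""
    return fingerprint(data) == recorded_checksum


# Stand-in for the bytes of a scanned page.
original = b"scanned page bytes (stand-in)"
recorded = fingerprint(original)  # stored alongside the file at scan time

# Simulate bit rot by flipping a single bit in the first byte.
damaged = bytearray(original)
damaged[0] ^= 0b00000001

print(is_unchanged(original, recorded))        # True
print(is_unchanged(bytes(damaged), recorded))  # False
```

A check like this only detects damage. Repairing it still depends on having another good copy, which is part of why maintaining digital files is an ongoing commitment.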

This connects to the resources required for digitization and post-digitization duties. For one, high-resolution cameras and scanners which can accurately capture the data are relatively expensive, with the same being the case for software to process images and for secure digital storage. In order for digitization to “make a dent” in an average archival collection, a scanner, or several scanners, need to be constantly working, with some large archives maintaining specific digitization units while smaller institutions fit it in when and where they can among their other duties. As a result, digitization of specific records is often part of projects which are funded by partnerships or grants, as she notes. In terms of the post-digitization duties, archivists need to make sure that the records are responsibly shared on the web, after checking with donor(s) to make sure the records can be freely shared in the first place, with some not wanting this to happen for various reasons or due to copyright restrictions. Such sharing is important as it allows archivists to make the full meaning of records available to those accessing them online.

As such, digitization itself, as she argues, is a process that is approached by archivists methodically. This requires, of course, assessing archival collections beforehand in order to determine whether the records are worth being shared and digitized. Such a process takes time, even if an “inexpensive pool” of labor can be mobilized, along with a big investment of resources and time. As a result, as she puts it, we may never, in fact, have everything digitized, with trials and triumphs of digitization being a “constantly unfolding process” while new models are coming about. With that, access is still important, as is digitization, with archivists continuing to “grapple with this immensely powerful way to broadcast the knowledge we steward.” Her article ends by stating that everyone can help support digitization through sharing information that goes with a photograph from an institutional collection, and to, most important of all: “be curious about what archivists, information professionals, and cultural workers do.” The latter requires, of course, asking questions and spreading answers, since the more people who understand the value of archivists, the more support they will get, and the more support archivists can provide to the public at-large.

It is worth recalling here a paper I wrote last semester (which will likely never be published anywhere academically) where I asked different archival institutions about their approach to digitization, using different forms of interaction, like Twitter, email, web-form submissions, and web-chat (AskUsNow!), the latter of which is relatively horrible/annoying from my experience, although others may have had different experiences. [5] One of the best responses I got was from Corey Lewis of the Maryland State Archives (MSA), who told me that I could personally contact him if I was interested in their digitization efforts. It was a response of high quality I wouldn’t have gotten if I had just looked on their website. To this day, they still don’t have their digitization strategy on their website from what I can tell (perhaps it’s hidden somewhere). I also got responses back from the Council of State Archives (CoSA) on digitization and even from the Oregon State Archives, the latter of which I hadn’t even tweeted to, which was impressive. In a similar manner to the person from the MSA, I got a message from Joanne Archer, the head of Access and Outreach Services at Special Collections and University Archives at the University of Maryland Libraries, which said I could send her any further questions. Interestingly, when it comes to digitization they do not “directly solicit campus input.”

With that, we can move into the final part of this post which focuses on challenges of digital archives and the digital world.

Challenges of digital archives and the current digital landscape

In the “Mars University” episode of Futurama, which first aired on October 3rd, 1999, the Planet Express crew go to Mars, which has, in the universe of this wondrous animated sitcom, been terraformed and has a typical college campus called Mars University. Before the episode becomes an homage/parody to Animal House, there is a scene where Professor Farnsworth tells Leela, Fry, and Bender about the Wong Library, adding that it has “the largest collection of literature in the Western universe.” After that, Fry looks in and sees these two disks:

That’s obviously the joke, and is more than a “bookish moment.” It’s basically saying that all the knowledge can be stored on two disks. It’s still kinda funny, although the joke is dated, as these are supposed to be something like CDs (which first came about in 1982). In a future post I’ll definitely bring in the Futurama episode (“Lethal Inspection”) that fellow archivist Samantha Cross of POP Archives reviewed, when I get to that season, as I’m currently only on Season 2 of the show as I plan to re-watch all the show’s episodes, over time.

This brings us to digital archives, specifically, which goes beyond the digitization of paper files. This applies to files which are born-digital. It requires, of course, a digital preservation policy as Margot Note, who was cited at the beginning of this article, writes about, which would need to be integrated into the program of an archives itself. It would also necessitate collaboration with other institutions and individuals in preserving digital records, and making sure that digital preservation is specifically tailored to your institution. Beyond this, there are two elements that apply to digital archives: choosing what will be preserved and file formats that are sustainable.

For the first element, I turn to an article, again, by Margot Note. She writes that selection and appraisal of digital records is similar to that of physical records, but that long-term preservation of digital records relies on “understanding of how file formats work.” It also requires, as she notes, access to the appropriate hardware and software, along with the appropriate skills, with the unavailability of these factors in an archival institution meaning that preservation of the digital files will not be successful. As such, technical appraisal of the digital files themselves considers whether they can be read, then subsequently documented, processed and finally preserved. Choosing what digital archives preserve depends on whether the content itself is relevant to the mission of the archival institution and on the historical value of the records, specifically if they have enduring value or are significant socially or culturally. For the digital records themselves, archivists also need to consider the integrity of the files, and whether they are usable and reliable. This means answering whether the materials themselves are in “preservation-friendly file formats” and if there are limits on the records, in terms of privacy or intellectual property, which make them “inaccessible for research.” Another important factor, as she describes, is funding, since the preservation and management of such digital records is by no means cheap. Finally, she notes that one must consider whether the digital records are unique and whether they are fully documented.
She adds that keeping everything, when it comes to digital files, is not wise, since resources are limited and mechanisms to search (and access) large-scale collections are often not adequate, and that selection curates collections which will ultimately have “high research value.” She ends with her point that no matter how complicated the systems for managing digital records become, people need to be involved in choosing what is preserved as digital archival records. Even with the possible automation of some decisions in days to come, archivists would need to balance the benefits of saving certain digital records over other digital records, at a time when archivists continue to rise to the challenge of selecting and maintaining “digital artifacts in a changing technological landscape” as she puts it.

In a related article, she writes about archivists choosing the right and sustainable file formats. This relates to digital archives because the sustainability of digital records in and of themselves depends on file formats that will last for a long time, with the Library of Congress putting in place “some criteria for predicting sustainable file formats in digital archives” as she puts it. It further requires considering whether a format is widely used, the files can be identified, specifications of file formats are publicly available and documented, the files can function on a variety of services (be interoperable), and they have an open format, since issues with licensing, patents, digital rights, and property rights complicate preservation efforts. She points to efforts by the Digital Preservation Coalition to analyze file formats which are commonly used. She also writes that over time some file formats have become preferred over others, like TIFF files used as master images for preservation during digitization and PDF/A as a standard file format. Even so, some standards for file formats are still in flux, with no consensus among archivists, as she puts it, as to what “file format or codecs should be used for preservation purposes for digital video”! At the close of her article, she argues that regardless of the preservation actions you take, having file formats that are sustainable is crucial, since having file formats which are lasting influences the “feasibility of protecting content” in the face of a changing environment in the technological world where repositories and users co-exist at present.
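The point that files need to be identifiable can be illustrated with a short sketch. Many formats begin with a fixed byte signature (a “magic number”), so a file’s opening bytes often reveal what it is, regardless of its extension. The Python below is my own hedged example, not something from her article: the SIGNATURES table covers only four formats and the function name identify_format is hypothetical. Note, too, that a signature check like this can only say a file is a PDF; deciding whether it is actually PDF/A means inspecting its metadata, which this sketch does not attempt.

```python
# Well-known leading byte signatures for a handful of formats.
# This short table is an illustrative assumption, not a complete registry.
SIGNATURES = [
    (b"II*\x00", "TIFF"),           # TIFF, little-endian byte order
    (b"MM\x00*", "TIFF"),           # TIFF, big-endian byte order
    (b"%PDF-", "PDF"),              # PDF (PDF/A cannot be told apart here)
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
]


def identify_format(header: bytes) -> str:
    """Guess a format name from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"
```

In practice, one would read the first dozen or so bytes of each file and pass them to identify_format. Anything that comes back as “unknown” would be a candidate for closer technical appraisal before it is accepted for preservation.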

Speaking of all of this, I am reminded of an ongoing study by S.C. Healy, a PhD candidate in digital humanities at a university based in Ireland (Maynooth University), trying to find how “wider research and cultural heritage communities’ can progress from creating web archives to establishing paradigms to use web archives for study and research.” I plan to sign up for this study as I’ve talked about web archiving in several classes. This is relevant since, as Genealogy Jude, as she calls herself on Twitter, noted, “the Internet…has shifted the demographic profile of genealogists.” This matters to archives and archivists because many of those genealogists are some of the most common users of libraries. [6] In fact, in one of the articles I found during my research for my paper on the Obama Library, a scholar in the 1990s (I don’t remember the exact date) pointed to National History Day, where I am being a judge again this year on the state and national levels, and to connecting with genealogists as ways to bring in more users to archival institutions.

Perhaps we can even bring in one of the SAA words of the week, specifically level of description. Simply it is defined as the “level of arrangement of the unit being described” and the “completeness or exhaustiveness of the description.” It connects to recent discussions like one at Hornbake Library recently which focuses on impact of digital repositories, which is in the same realm as digital archives. Perhaps discussions like this will make it easier to define what archivists do and what archives are, as some have tried to do through teaching.

I also think, apart from the creation of some digital archives portals, of what Lilly Carrel, archivist at the Menil Archives in Houston, who was recently interviewed by Vince Lee of the SAA’s Committee on Public Awareness, also known as COPA, said about digital preservation: “I think digital preservation offers creative ways to enhance the post-custodial approach and ensure important records are preserved.” That is even more the case when there are digital archives, whether completely digital or part of traditional archival institutions like those at universities or serving specific states. There is also a job at the Library of Congress about web archiving, with applications that close on May 1.

With all of this, there is, not surprisingly, a debate among scholars, especially in the field of archives and libraries, over a possible difference between a digital library and a digital archives. Some within the field say there is a difference, while others dismiss that, arguing that there is not. Currently, I don’t want to go down that road, or talk about some continuing tension between historians and archivists, despite past efforts by the SAA to make connections with the AHA, the American Historical Association. I also could talk more about the challenges when it comes to archiving born-digital material, but perhaps I will revisit that in a future post on here.

I’ll end with how one archivist, blogging on the New Archivist WordPress over five years ago, put it: “please keep up the discussions, and contribute in ways that you think have value,” adding that the “seeming lack of support in public” doesn’t mean that archivists are not doing anything. [7] That is what I am trying to do with this post and this blog as a whole, changing from a focus on historical explorations about the Maryland Extra Regiment, the Maryland Loyalist Regiment, reprinting past posts and biographies I wrote when I worked at the MSA on the First Maryland Regiment, which is often called the Maryland 400, and other topics, as readers of this blog from the beginning will know. This all connects to my newfangled newsletter on SubStack, which I recommend readers of this blog subscribe to, and which I hope expands in the days to come.

Until next time! I look forward to all of your comments.


Notes

[1] She has written so much that I recommended that she could even write a few e-books. She has actually written a number of books already, like Creating Family Archives: How to Preserve Your Papers and Photographs, a paperback book, and two other books more specifically for information professionals: Project Management for Information Professionals (seems like a textbook, although she calls it a “handbook”) and Managing Image Collections: A Practical Guide (Chandos Information Professional Series) (a guide for those at institutional archives, perhaps?).

[2] If you want to know more about the distinction between the two, there is a new book published by the SAA (Society of American Archivists), titled Archives in Libraries: What Librarians and Archivists Need to Know to Work Together, which seems to make these distinctions and could be a good read. I can’t give a firmer assessment as I have not read the book.

[3] Interestingly, in the review of this episode by Kate Kulzick of A.V. Club, this part of the episode is not mentioned. In fact, Mike’s role in the episode is not mentioned at all!

[4] If you are interested, I’d also recommend reading “How do archivists organize collections?”, “How Do Archivists Describe Collections? (or, How to Read a Finding Aid)”, and most importantly “What do archivists do all day?”, two of which are also by Samantha Thompson.

[5] Perhaps at a later time I’ll bring in my other papers I have currently uploaded to academia.edu like “The concept of a Baltimorean Homeless Library (BHL),” “Uggles and the University of Illinois: a very furry situation indeed!,” and “Strategic Plan Analysis–Maryland State Library Resource Center (SLRC),” the latter of which is relatively technical. All of these are mainly in the realm of libraries rather than archives, however.

[6] She also stated, in a following tweet, that it is good that genealogy has found new people with “energy and new ideas, otherwise it would be a dying hobby” which I will agree with, as a millennial genealogist myself, beyond what someone like fellow genealogist Amy Johnson Crow will describe. Some who responded to her said that it’s a time-consuming hobby, while others said that retired people still have some advantages over young people. Responding to a concern that the internet has isolated people (not an invalid concern), she said that “the Internet has enabled people to contact relatives and share research much more easily than before” which also is a valid point! This also includes, as Carolynn, another genealogist, argued: “challenging racist, misogynistic and xenophobic genealogists” even if that can be hard. At the same time, I see those who, in the wake of the racist ancestry.com ad (for Ancestry Canada), grumble about how much they “hate” them, for justified reasons, although I don’t necessarily feel the same as a person who runs two genealogy blogs and is a family historian for both my mom and dad’s side of the family. I seem to sympathize more with those who say that there are reasons “why you can’t rely on search engines like @Ancestry” with misspellings and mistaken listings.

[7] They also said that the lack of supportive views on Twitter or lists “does not mean that the vast majority of people are not appalled by the few rude ones” but rather that the latter are shown indifference by the many.

I’m part of a wonderful research cohort this semester

Just thought I’d share the most recent news, that I’m working with the DCIC (Digital Curation and Innovation Center) at UMD to “conduct research using computational tools and archival data to illuminate the history of enslaved people in Maryland,” with two other MLIS students (Chrissy Perry and Ben Shaw), and one sophomore in the iSchool (Ali Bhatti). I’ll also be working, with these wonderful people, Ryan Cox of the Maryland State Archives, faculty sponsor Katrina Fenlon, project manager Noah Dibert, and programmer Greg Jansen “to tell the stories of people represented in the data using mapping and digital storytelling tools; to identify connections between the data and related projects on the history of enslavement; and to develop and explore visualizations to support discovery, use, and interpretation of the Archives.” Read more about it on the DCIC’s website. Also see photographs of me, and other fellow students, at the student showcase last semester in that horrid ugly sweater, lol, with most of the charts on the poster made by yours truly:

The Library of Congress, its digital strategy, and crowdsourcing

Screenshot of the homepage of the Library of Congress’s Crowd program

In late October, I asked the Preservation Directorate of the Library of Congress (LOC) about what they decide to digitize and if they have a process similar to NARA (National Archives and Records Administration, called National Archives in the rest of this article), with their own digitization priorities, including working with external partners. After thanking me for my interest in the LOC’s preservation work, Jon Sweitzer-Lamme of the Preservation Directorate responded by saying:

The Library’s digital strategy is available here: https://www.loc.gov/digital-strategy. Our prioritization is driven by demand, such as demand for our presidential papers collections like the newly released Theodore Roosevelt Papers (https://www.loc.gov/item/prn-18-132/), and preservation needs, especially if an item can’t be served to researchers anymore due to its condition. We have excellent in-house digitization capabilities and also utilize external contractors and partners to digitize our content.

Generally, that does answer my question, but unfortunately the answer from LOC did not come soon enough for a class assignment I had where I asked reference questions in the same vein to different institutions (AskUsNow!, Maryland State Archives, and UMD Archives). I’ll likely post that on Academia.edu later this month.

This also shows the site is made possible with a partnership via Amazon’s SES [Simple Email Service], a worrying infiltration of public institutions with those from the corporate world. Even so, the Crowd program runs on open source software, so that is a positive.
Most exciting of all is not the digital strategy, but LOC’s new “crowd” program, which is a bit like the citizen archivist initiative of the National Archives, which I have participated in a bit in the past. There are only five campaigns to transcribe, review, or tag information currently, but the program is only in its beta stage, so this will likely be expanded in the future. This could become something of linked open data at its finest, not only connecting people with content, but bringing them further into the process to make the usage of records more collaborative for all, going beyond past efforts. In the coming days, I will test out the site and let the rest of you know on this blog what it is like. They even tied in the anniversary of the Gettysburg Address to this program.

With that, this new program fulfills the digital strategy of LOC (without a doubt different than the one in 2000), which states that their mission is to “engage, inspire, and inform the Congress and the American people with a universal and enduring source of knowledge and creativity,” with initiatives such as this one trying to ensure that “all Americans are connected to the Library of Congress.” This is also connected to their strategic plan which has four major goals: expanding access, enhancing services, optimizing resources, and measuring results. As for the digital strategy it also notes the role of digital technology in fulfilling the mission of this institution, while also “throwing open the treasure chest, connecting, and investing in our future.” This strategy is also forward-thinking, stating that:

The Library’s content, programs, and expertise are national treasures…We will make that content available and accessible to more people, work carefully to respect the expectations of the Congress and the rights of creators, and support the use of our content in software-enabled research, art, exploration, and learning…The Library will continue to build a universal and enduring source of knowledge and creativity…We will expedite the availability of newly acquired or created content to the web and on-site access systems…We will explore creative solutions to reduce the barriers to material while respecting the rights of creators, the desires of our donors, and our other legal and ethical responsibilities…We will continue to enable computational use of our content and metadata…The Library offers an incredible wealth of content, programs, and services to Congress and the American people. We strive to connect with more users by making those services and content accessible for all…Many of the Library’s digital users come directly to our websites to discover content. To expose even more people to the Library’s content and services, we will bring digital content to users by making more of our material available in other websites and apps that they are already using…We will continue to participate in professional organizations and cooperatives that expand our perspectives and enable us to share our experiences.
Additionally, developing partners in industry can allow us to connect the Library with new areas of expertise and resources…We will cultivate an innovation culture by empowering our staff, who have expertise in a wide range of subject areas, including the work of Congress, United States copyright law, American and foreign law, and our collections…Our plans for the future must entail preserving and protecting our collections and content…While we plan for our future, we are also paying close attention to innovations and trends that will present future challenges and opportunities. Newer tools, such as augmented and virtual reality, computer vision, natural language processing, and machine learning, are already transforming how we live and work.

Screenshot of the opening section of LOC’s digital strategy

There aren’t many other articles on this subject [1], from a quick online search, but all of the ones I found are relatively positive, although some are more critical than others. Roll Call, in their article on the subject, described how the digital strategy is “digital forward,” advocated strongly by Librarian of Congress Carla Hayden (who heads LOC, and formerly the Pratt Library in Baltimore), and Kate Zwaard, the Director of Digital Strategy. Most interesting in this article was not that Accenture, a huge contractor, won a contract “to build the long-planned new data center” for LOC, or that the plan includes “employing user-centered design to invite digital and physical visitors to explore more offerings” but that the organization has been stuck in the past, trying to shed this past, because it has “a computing system built in the 1970s to static processes for staff.” Having a 21st century computing system is important for LOC, which holds over 167 million items in its collections which sit on “approximately 838 miles of bookshelves,” making it the “largest library in the world.”

FedScoop also wrote about the digital strategy, noting that “The Library of Congress…is interested in exploring what artificial intelligence and similar technologies can do for its mission,” saying this focus on digital aspects is not “out of the blue” as LOC launched labs.loc.gov, “a home for digital experiments…last year…[and] it…recently began experimenting with geographic information systems mapping as a way to explore collections online.” Both are positive aspects, to say the least.

Finally, there is Cory Doctorow of Boing Boing, which often has short articles with little content other than the document(s) they are quoting from. Regardless, Doctorow describes how the digital strategy supports “data-driven research with giant bulk-downloadable corpuses of materials and metadata…crowdsourc[ing] the acquisition of new materials…[and] preserv[ing] digital assets with the same assiduousness that the Library has shown with its physical collection for centuries,” among other aspects. He interestingly notes how the LOC has an “outsized role” in the current digital era because it contains the Copyright Office, which is “patient zero in the epidemic of terrible internet law that reaches into every corner of our lives.” This clashes with the fact that Carla Hayden, the Librarian of Congress, “is the most freedom-friendly, internet-friendly, access-friendly leader in the Library’s history, replacing unfit leaders who were brought down in grotesque corruption scandals” even though her leadership has fallen short, in Doctorow’s view, because “the Copyright Office is still a creature of Big Content, and it has direct oversight over your ability to modify, repair, sell, and use all of your digital property.” Still, he argues that

…this digital strategy is a very bright light, but it shines in a dark and menacing cave. I love the Library — I love its work, its collections, its diligent and thoughtful staff, its magnificent building. But for all that, the Library has become a locus of terrible policy that runs directly counter to its mission. The contradiction between the Library’s mission and its real role in policy has never been more clear than it is in this wonderful document. [2]

That brings me to the end of this article. What are your thoughts on this new digital strategy of LOC and its new Crowd program?


Notes

[1] Through a further search I found a snippet from the report on infodocket, dh+lib blog of the ALA, and the Digital Journal.

[2] James Tanner of Genealogy’s Star makes a similar point, but says that LOC is “certainly not the leader in the number and value of their online offerings” since “the recent history of the Library of Congress is far from promising” with the closure of the Local History and Genealogy Reading Room in 2013, and the “inherent contradiction in the current efforts of the Library of Congress due to the fact that they are also the agency responsible for the controversial access policies inherent in the United States Copyright Law because the Copyright Office is an integral part of the Library.” This means, as Tanner argues, that due to “Congressional action, use and access to many valuable research materials have been overwhelmingly restricted” while adding that “policies and budgetary constraints at both the Library of Congress and the National Archives have severely limited the number and availability of digitized records from both institutions. It would be a huge change if this present plan includes real changes in the number and availability to access items in both institutions collections.” Still, he is optimistic, saying that “it will be interesting to see what will happen, although I do not expect any significant changes during what is left of my lifetime,” although he says that the Internet Archive “may become the largest library in the world considering its growth during the past few months and years assuming they catch up with the National Library of Australia.”