
frazzled

(18,402 posts)
4. Or maybe it's about archival longevity and safety
Thu Mar 12, 2015, 06:26 AM

If you think a digital record is timeless, something that people will be able to read 200 years from now, you're probably wrong. Moving-image preservationists are grappling with this problem: if you think film was an unstable medium, video and digital may prove even more so. They haven't figured it out yet. Neither have the digital archivists who must store written records.

We know a piece of paper, if stored and maintained properly, lasts hundreds of years or more. You know from experience that something you wrote on some funky word-processing program in 1992 you can't even open on your computer now. Without constant migration to new platforms as old ones become obsolete (which happens frequently), many things become unreadable and thus lost. (My husband needed to do research from a CD-ROM compendium published in the early 2000s, but he could no longer access it. We finally had to find an old laptop from the era to be able to view it. Think of the many formats and systems we've run through in just the past 20 years, then think about 200 years from now.)

The government now generates hundreds of millions of emails, tweets, spreadsheets, etc. every year. Maintaining these records for the duration of history is no easy task. Maybe State felt that until such preservation issues are figured out, paper copies are the surest way to keep the historical record intact. I don't know. But this is no small issue.

GIVEN THE CONVENIENCE and potential cost saving of digital delivery for both libraries and users, combined with the power digitization offers to search within texts, why not embrace the digital future now? The issue of preservation is one of the main obstacles.

Speaking from her experience as head of collection care for the British Library, Helen Shenton explains that “the greatest risks to printed material are the environment, wear and tear, security, and custodial neglect.” Facilities such as the Harvard Depository address most of those concerns, although wear and tear is an unavoidable consequence of use. On the digital side, on the other hand, use of the data is one of the best ways of preserving it, because “‘bit rot’ is one of the biggest risks.” A book left on the shelf for a hundred years might be fine, Shenton says, but digital data must be read and checked constantly to ensure their integrity.

For digital preservationists, a prime concern is that data might be kept perfectly secure and complete, but still be unreadable by machines and programs in the future. A New Yorker cover depicting an alien, come to post-apocalyptic Earth, sitting amid the detritus of modern civilization—discarded CDs, tapes, and computers—illustrates the point: the alien is reading a book, the only thing that still “works.” “You have to think about moving the content along as technology changes,” explains Andrea Goethals, digital-preservation and repository-services manager. In order to make this feasible, librarians try to limit the number of file formats they make use of, and store detailed technical metadata with every object so that in 10 years or 100 “it can be rendered again in a usable way.” Every few years, as the programs that created a text file or a PDF become obsolete, librarians must ensure that the contents of those files remain readable by the current generation of computers and software. But opening each file manually in order to save it in a current format is not feasible when there are millions of them. “Because of the enormous amount of digital material we hold, migrating content is done at scale, not one file at a time,” Goethals explains. “We have to be able to do it on a whole class of objects”—all Microsoft Word files, for example. This content management strategy should work, “but because digital preservation is a young science, we don’t have a lot of experience with it yet.”

Objects that begin in an analog format—a book, a recording of a poetry reading, or a piece of music—are easiest to preserve digitally because librarians can choose the optimal file format for long-term access. Material that is born digital—e-mail, for example, which comes in many different, often proprietary, formats—is not so easy to preserve.

http://harvardmagazine.com/2010/05/digital-preservation-an-unsolved-problem
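
To make the "read and checked constantly" point concrete, here is a rough sketch of what that kind of integrity check can look like. This is my own illustration, not anything described in the Harvard article or used by the National Archives: record a SHA-256 checksum for every file once, then recompute the checksums later and flag anything that is missing or no longer matches. The function names and the manifest layout are made up for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum.

    The manifest layout is a hypothetical one chosen for this sketch.
    """
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose contents no longer match."""
    damaged = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged
```

You would run `build_manifest` when the records are first stored, keep the result somewhere safe, and run `find_damaged` on a schedule. Note that this only catches damaged bits. It does nothing about the other problem raised above, a perfectly intact file that no current program can open.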


Maybe, just maybe, State determined that turning over paper copies of variously formatted materials will allow the technicians at the National Archives to put them into a single, optimal format for future digital preservation.