I’ve lost track of how many “OH CRAP!” moments I have experienced in my short career. As the Digital Collections Curator at Northern Illinois University, I have seen some pretty gnarly (digital) things. Things that have really hammered home for me the importance of the research work we are doing on the Digital POWRR project.
Many institutions have acquired digital content without much thought given to future use and access. We take it all in, the good shepherds that we are. We build systems and websites that can do nifty things. But we frequently overlook the fact that digital materials require active and ongoing preservation. This is not a one-time thing! Traditional printed materials readily show us their boo-boos upon physical inspection….torn pages, broken bindings, insect damage, mold, graffiti, etc. Digital stuff, on the other hand, can’t be eyeballed for a quick health check. You need to monitor it mindfully and constantly. Like a puppy near a bunch of electrical cables. 0.o
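One concrete way to give digital files that “quick health check” is a periodic fixity audit: record a checksum for every file at ingest, then recompute and compare on a schedule. Here is a minimal sketch in Python (the function names and the manifest format are illustrative, not from any particular preservation tool):

```python
import hashlib
from pathlib import Path

def checksum(path, algorithm="sha256", chunk_size=65536):
    """Compute a checksum for one file, reading in chunks to handle large files."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def audit(manifest, root):
    """Compare files under `root` against a {relative_path: checksum} manifest.
    Returns the paths whose current checksum no longer matches (possible bit rot)."""
    changed = []
    for rel_path, recorded in manifest.items():
        if checksum(Path(root) / rel_path) != recorded:
            changed.append(rel_path)
    return changed
```

Run the audit on a regular cycle and any file that flips from “matches” to “doesn’t match” gets flagged long before anyone tries (and fails) to open it.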
However, sometimes just monitoring or checking files does not ensure that they will be adequately preserved for true long-term use. You will inevitably run into challenges opening, rendering, or transforming digital objects that depend on proprietary software. I’m sure most everyone knows the pains of shuttling even something as simple as word processing files between different software versions and hardware platforms. In the mid-1990s, I remember being horrified that Microsoft Word wouldn’t open my ClarisWorks files. How many of us actually take the time to forward-migrate word processing documents proactively? It always seems to come up just when you need a crucial piece of information. At least there are digital forensics tools and converters to help you out of that jam. But what if you opted to save materials in a compressed, lossy filetype rather than retaining a full-quality master file? Getting those bytes back is simply not possible.
This very issue is currently a problem here in the Digital Initiatives unit as we move old digital objects into a Fedora Commons-based repository framework. We have a collection of audio files that were recorded in the late 1990s for which we have no master or preservation format. There are no analog master tapes, no recording software session files, no digital .wavs, and certainly no broadcast .wavs. The majority of the recordings only seem to exist in RealMedia format. Even though I plan to follow the Association for Recorded Sound Collections’ best practices regarding audio materials, it remains to be seen whether we will be able to migrate RealMedia files to .wav. We have also worked with a variety of video files for our Southeast Asia Digital Library project, and have run into various baffling issues with filetypes, codecs, and software platforms. Standards in digital video are in such flux that one can feel utterly paralyzed when it comes to making preservation decisions. The LOC digital preservation blog, The Signal, even refers to digital video standards as presently being “the wild, wild west.” I wonder if I should start wearing a holster and riding a pony to work….
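If the RealMedia migration does prove possible, the likeliest route is scripting a converter such as ffmpeg, whose builds generally include RealAudio decoders. A hedged sketch of what batch migration to uncompressed PCM .wav might look like (this assumes ffmpeg is installed and can decode the specific codec in our files, which is exactly the open question; the function names are mine, not from any tool):

```python
import subprocess
from pathlib import Path

def build_migration_command(source, dest_dir):
    """Build an ffmpeg command that decodes one RealMedia file to .wav.
    -acodec pcm_s16le requests 16-bit linear PCM, a common preservation target."""
    source = Path(source)
    target = Path(dest_dir) / (source.stem + ".wav")
    return ["ffmpeg", "-i", str(source), "-acodec", "pcm_s16le", str(target)]

def migrate(source, dest_dir):
    """Run the conversion; raises CalledProcessError if ffmpeg cannot decode it."""
    subprocess.run(build_migration_command(source, dest_dir), check=True)
```

Because `check=True` raises on failure, a batch run over the whole collection would at least produce a clean list of which recordings migrated and which are truly stranded.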
And what if you can’t even find your darn files to begin with? Are they stored on a network drive, server, or scattered across external hard drives, flash drives, or even CD/DVD-ROMs? Consolidating all of our materials onto a single NAS storage unit has been a huge step forward for us; however, I occasionally still cannot find files when I need them. Sometimes master files have different filenames than their derivatives. This would not be a problem if it were a file here or there, but hundreds or thousands is another matter. Bulk file renaming programs are very powerful tools that can be a blessing to people working in a digital production lab. However, whoever is doing image production and processing needs to keep an eye on their part of the preservation puzzle. Remember to sync your naming across all file versions, and document your work for your co-workers and successors. Better yet, create and stick to consistent file naming protocols for all of your digital collections! This is one of the best preservation strategies there is. Also, consider sticking to hierarchical folder structures that are easily intuited by your colleagues, and document your methods of organization as much as you can.
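Checks like “do my masters and derivatives share synced names?” can themselves be scripted instead of eyeballed. Below is a small sketch that assumes a hypothetical `<collection>_<number>_<role>.<ext>` naming convention (e.g. `sea_0042_master.tif` paired with `sea_0042_access.jpg`); the parsing would need to be adapted to whatever protocol your own lab documents:

```python
from pathlib import Path

def expected_derivative_name(master, suffix, extension):
    """Derive the name a derivative *should* have from its master file,
    assuming a hypothetical '<collection>_<number>_<role>.<ext>' convention:
    'sea_0042_master.tif' -> 'sea_0042_access.jpg'."""
    collection, number, _role = master.stem.split("_", 2)
    return f"{collection}_{number}_{suffix}{extension}"

def find_mismatches(masters_dir, derivatives_dir, suffix="access", extension=".jpg"):
    """List master files whose expected derivative is missing or misnamed."""
    derivatives = {p.name for p in Path(derivatives_dir).glob("*" + extension)}
    missing = []
    for master in sorted(Path(masters_dir).glob("*_master.*")):
        if expected_derivative_name(master, suffix, extension) not in derivatives:
            missing.append(master.name)
    return missing
```

Running something like this after each bulk-rename pass catches the “master and derivative drifted apart” problem while it is still a handful of files rather than thousands.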
There are other preservation nightmares that we have experienced in the lab. Have you ever been stranded? (“Stranded on your own….stranded far from home..” as a favorite old song of mine goes.) Someone forgot to pick you up….your car broke down….your flight was cancelled….doesn’t it feel awful? Have you ever been unable to rescue and revive stranded digital objects? The poor things. They went on a three-hour tour, and now they’re stuck on Gilligan’s Island. (And yes, I know they’re just 1’s and 0’s…I tend to anthropomorphize things to keep myself entertained!)
Our lab’s own Gilligan’s Island has recently become a fully-fledged lost continent of Atlantis, sadly. I am speaking of a stand-alone server that we used to power our interactive map resources, which ran ESRI’s ArcGIS Server software. The server was quite old – it ran Windows Server 2000. Yes – *wince*. We had known it was terrifically out of date for some time, but we were unable to migrate to a new server because our version of ArcGIS required this particular server environment. Upgrading ArcGIS (which the library had initially received for free) was a costly proposition. The files associated with the software are proprietary and cannot simply be exported and loaded into an open source alternative. After years of hand-wringing, the server finally failed. Migrating the data and software to a virtual machine also failed. We are going to have to reconstruct Atlantis in an entirely new framework, rather than building on what was there previously.
Although this loss will allow us to reconceptualize and refresh our interactive resources, it’s a shame to lose a digital resource that you or your colleagues have put a lot of hard work into. But this experience has proved to be a genuine “life lesson” that can inform and improve practice for ourselves and others as well. Our Digital Initiatives unit, like countless others, sprang up due to the availability of grant money for large-scale digitization initiatives. There was an initial push to make materials available, with less thought given to their long-term preservation. The Digital POWRR project is allowing us to go through the digital preservation planning process, so that we can fully take inventory of our digital materials, assess (and adjust) our past and future practices, and write comprehensive policies so that we never have to deal with the “oh crap” types of moments that I have described in this post. Well, maybe not never….but hopefully way less! Additionally, policies and procedures need to be holistic, addressing both the digital objects and their underlying environment (hardware and software). Websites and other interactive types of resources should be part of any ongoing digital preservation plan – they need to be consistently evaluated and refreshed as well.
Digital POWRR has become kind of like the brown paper bag I hyperventilate into when I’m having an “oh crap” kind of day. It’s soothing to know that I’m not alone, and that I have colleagues grappling with the same issues out there in the digital ether. And it’s downright exciting to finally be able to install and test digital curation tools and systems that will help get us thinking in more of a “lifecycle” mindset, rather than a linear one. I’m definitely feeling more confident in our ability to straighten out our past and lay the groundwork for the future.