Where is the practical digital preservation research? Some here.

Today the NDIIPP blog asked, “Where is applied digital preservation research?”

*stands up*

We’re right here.

The Digital POWRR project is focused on pragmatic, practical, sustainable digital preservation for small and medium-sized libraries and cultural heritage organizations. We know we need to do this, but the resources, and sometimes the knowledge, are tough to come by.

Which is where our project comes in. Have you looked at our Tool Grid? We’ve compared and contrasted over 50 different tools for digital preservation. We’re spending this summer doing in-depth end-user testing of several open-source comprehensive DP solutions (Archivematica, Curator’s Workbench, and Hoppla) to see how they interact with tools for long-term storage like MetaArchive, DuraCloud, and Archive-It.

Want to keep up on our shenanigans? Check out our project wiki. We’re trying to figure out how to do DP at our own institutions, publicly, without a net.

Thanks to the IMLS for funding this highwire act.

A little soul-baring

In our regular conference call this month I was venting, yet again, about all the trouble I have with people not being critical about 1) what digital content they create, and 2) how we can possibly save it all. These obstacles seem insurmountable. My colleagues challenged me to write about this frustration, so here goes. I’ll try not to let this become a complete rant 🙂

All of us on the POWRR team are doing a lot of reading–articles, books, software “about” pages, technical reports–and there seems to be no controversy about how digital preservation actually happens: you make copies, put them in different places and do bit-level checks from here to eternity. Or at least for as long as it takes for someone to come up with a better way. That process is what happens once an object is captured/ingested into a DP system. But there’s the rub! You’ve got to get your virtual hands on the things in order to ingest them!

Once upon a time I lived in a part of the country that was known for its cockroach population. I was in an apartment that I thoroughly cleaned before moving in and that the landlord regularly had bug toxins professionally applied to. And yet, I could still count on seeing fleeting glimpses of scurrying bodies if I turned the kitchen light on at night. That’s the image that comes to mind when I think of how “official” communication happens at my institution. My only consolation comes from regular attendance at professional archivists’ conferences that help me realize I am not alone.

We live in a fragmented, scurrying world of e-record creation. It’s a place where Content Management Systems are viewed as a good thing so that every employee can create a Web page, upload, edit or change a document in their department 24/7. Things get “routed” by blog posts, Websites and email distribution lists. Tweets, Facebook, Vimeo, YouTube and ISSUU channels are all the rage. Question: if there’s a conflict between the way a document shows up on one Cloud compared to the other, who will sort it out?

And if by chance you’ve gotten the ear of the right person in the right office and can snag a copy of things as they’re distributed, you’ll also need to become a file-name decoder. Who knows how long it has been since anyone has gone through professional training to become a secretary? Instead of a reliable core of institutionally-knowledgeable people who learned how to file and name things, every person today is their own publisher and distributor. And every person who comes after them will have their own way of doing things and then it’s back to square one: introduce yourself, state what kind of record you’d like a copy of and why, let alone how.

We know that people have always come up with creative ways to document their own lives, but now those practices have spread to professional realms of our institutional history. This feels like a game-changer. Others will have different points-of-view, but I have no interest in telling people about what they should do with their personal files — for research or pleasure. People in the library and archives professions are often asked about how individuals can save personal objects after the fact, like when a photo album gets wet or when the folded over letters/certificates of an ancestor resurface from some forgotten storage place. My anticipated response to the question “What should I do?” after the fact in the digital age seems pretty brief at this point: Find a digital archaeologist!

The future of my profession lies in becoming that archaeologist. I may already be the dinosaur 😉 but my sincere hope is that my institution’s digital bones will last long enough to be excavated by my future self. More will be needed, though, to make sense out of the Digital Deluge. I’d like to propose a new area of study for the archivist/archaeologist: social psychology. There’s going to be a real need for professionally prepared people who can understand why/how people create things as well as how to help their objects resurface for future use.

What Are Your Digital Preservation Roadblocks?

shared under a Creative Commons license DeweyC21; original source: http://www.artsjournal.com/dewey21c/road_block.jpg

Working on large initiatives like digital preservation often feels like being in a car on a bumpy road under construction. There are lots of stops and starts. It’s “hurry up and wait.”

One of the things we’ve come across here on the Digital POWRR project that we weren’t expecting is a set of technical roadblocks. Specifically, that moment when you’re figuring out what the next step is to take, and you look at the instructions that some generous soul has written up for installing that piece of software, or configuring it, and you see words like this: “Sourceforge,” “virtual machine,” and “command line.” I’ve linked to the Wikipedia explanations of each, which is one of the ways I’ve come to understand each of these terms.

That’s the first roadblock. The instructions assume that you, too, are a developer or tester, and know what these things are, how to access them, and how to install and run them on your computer. If you’ve never used a command line interface within a virtual machine setup to install software from Sourceforge, you may find some of this intimidating.

If you’re in an open computing environment (i.e. you can download anything you want to your machine without anyone’s permission), you may be able to poke at some of this stuff and muddle through it, especially with the help of user communities. That said, I would love to see some of the instruction writeups for software get “translated” from developer-to-developer to developer-to-possible enduser.

Let’s put it this way. I did the tiniest amount of command line work back in the late 1990s in library school. I learned HTML. I took one scripting class, and we worked in Perl. I am not a programmer or a network administrator, and the chances of me becoming one in the next year are slim to none. I can follow basic directions to complete a task, and am unlikely to blow up my computer by hitting a wrong key. I can tell the difference between a good source of code and a bad one, and I’m not the kind of user who clicks random links and spreads malware or viruses. I’m *reasonably* savvy as a user. (Real world example: I could probably root my Nook to turn it into an android tablet with decent directions, but it would take me 3 times as long as someone who does this sort of thing for a living or a hobby).

This is partially because of the second roadblock. I work in a closed computing environment, and have for most of my professional career. Like many libraries, we have dedicated professional staff who maintain our desktop machines, and monitor all the different software packages that we use. They are great, but any project that involves new software in a computing environment like ours means that they, too, are involved in the project. They have to be, to maintain the integrity of our networked environment.

Even if I wanted to, I’m not generally allowed to download and install software on my desktop just to take it for a spin. I’ve never installed open source software, and I wouldn’t know how to set up a virtual machine if my life depended upon it.

I understand the need for a closed computing environment when you’re trying to deal with a really complex system. No one wants to deal with people installing random software willy-nilly that could crash everyone else’s computers. Heck, no one wants to be the end-user who DID that.

So how do we drive around the roadblocks? Lots of meetings. Good, clear communication. More meetings. Understanding that the systems folks in our closed environment are very busy, and making sure that our requests for software are clear, concise, and lay out the risks, if any, as best we can determine them.

This is one of the roadblocks we’re dealing with as we get ready to test software: the combination of not being terribly familiar with the downloading/configuration side, and working within the schedules of our fantastic IT folk to make it all work.

What are your roadblocks to digital preservation?

Preservation Exhibit at NIU

With Preservation Month just around the corner we have put together a Preservation Exhibit here at NIU that covers both physical and digital preservation. We would like you to see a little preview of what has come together, and in the future we hope to also adapt the exhibit into an online version. We’ll keep you updated. ; )

This is the exhibit introduction poster…

Our Gift to the Future Exhibit Introduction Poster

 

The first case focuses on physical preservation. It discusses and shows some of the tools and strategies that archivists can use to help preserve paper (and other physical) materials.

Preservation Exhibit Case1 NIU

The second case includes two “tales.” One is a Tale of 2 Artifacts, discussing the Dead Sea Scrolls, which have been preserved as much as possible, and the ’96 Election Website for Clinton, which has not been preserved; most of what remains is an image of the homepage. The second tale is a Tale of 2 Generations, 1913 and 2013. The individuals from 1913 have journals and photos that have been preserved, but do the individuals of today know how to save their Facebook profiles and the like so that future generations can learn about them?

Preservation Exhibit Case 2 NIU

The third case explains what can happen to digital objects and files over time, and thus why they need to be preserved with care.   Specifically it explains hardware obsolescence, software obsolescence, and bit-level deterioration.

Preservation Exhibit Case 3 NIU

The last case includes personal preservation tips (top) and explains what is going on at NIU concerning digital preservation (bottom).

Preservation Exhibit Case 4 NIU

The last piece of our exhibit is a poster that gives a little bit more information about our Digital POWRR research project. Here is a digital version…

PowrrPoster1

Are you doing something to share or celebrate Preservation Month?  Tell us about it in the comments!

Do it anyway.

 

I’ve been thinking about this song lately. Not just because I’m a fan of the Muppets and enjoy Ben Folds (although both of these things are absolutely true).

But because keeping up with digital preservation, all of the tech, all of the news, all of the projects is hard.

I just want to acknowledge that. This is hard.

One of the tasks I needed to complete this month was getting information about several projects to add to our Tool Grid. This will be available once it’s complete.

Talk about a very quick lesson in how much there is to know, and how little of it I can keep in my head.

The way our tool grid works is that we divvied up the names of tools, and each of us basically tries to figure out what we can from looking at the website. This should be familiar to anyone working currently in libraries and archives as a practice.

Just as an example, let me talk about a tool called BWF Metaedit, which was on my list.

So far as I can tell, BWF Metaedit is an open source tool that “permits embedding, editing, and exporting of metadata in Broadcast WAVE Format (BWF) files.” It was created by “Federal Agencies Digitization Guidelines Initiative (FADGI) supported by AudioVisual Preservation Solutions.”

So, free to use, and it does a specific task. Great. I’m approaching this as someone who doesn’t work with AV very much; my metadata experience is limited to social media tagging and Archon these days, although I used to be a cataloger way back when. I tried looking at the documentation, and I must say:

This documentation? Not designed for anyone other than a current developer or someone who does coding routinely. Here’s how you set preferences for the software. Here’s a suggested workflow. I think I understood *some* of that one.

Now, my MLS is about *cough* 15 years old. I took a single scripting class there (Perl, if you’re curious), specifically to give me some chance to be able to parse this kind of information when working with specialists.

I’m feeling very much out of my depth here.

That, I expect, is not uncommon. Nor is the feeling that there just isn’t time and energy and administrative space to learn all of this stuff, even if I wanted to. Which I’m not sure I do, right this second, as this particular tool is less likely to be “my problem” in current library workflows here. It’s not one of the tools we’re looking at closely as one of our more comprehensive solutions, but it’s good to know it exists.

But for the materials that *are* “my problem”? I need to learn “all this stuff” *gestures at the wide world of tech*.

…or maybe I don’t. Maybe I need to learn just what I need to learn. Maybe I need to acknowledge that there will be gaps in my knowledge, and that’s OK. I can develop expertise over time, in the things that are relevant to me, to my collections, to my organization. Just as I did for the rest of my library work. For teaching and bibliographic instruction. For collection development. For exhibits. I’m basically self-taught in many of these areas, going from things I experienced as a student assistant, graduate assistant, library assistant, and patron. Or I just did what I thought was best for the collection based on the information I had, and didn’t exceed my budget.

I did it anyway. Without ALL the information. Just enough.

Digital preservation is no different.

Yes. I live in the fear that Digital POWRR will somehow miss one of the bajillion projects that are out there. It’s really, really hard to get one’s arms around this whole digital preservation problem. I fear that we will not choose wisely for our digital preservation setup, even though we are doing everything in our power to do so. I fear that budget restrictions will force us to select an option that is not the best for us, but what we can afford.

It’s absolutely a risk. One that every organization must take.

We must do it anyway.

Paralysis, or hoping the problem will go away, is not an option. These materials will only grow in number and size, and change. It is better to do something than nothing. Something gets us more options 3-5 years down the road.

Doing nothing ensures that our current crop of cultural heritage professionals will be remembered as the generation that really, really borked the digital revolution and killed our historic legacy just as surely as those people that burned letters and papers in the 19th century.

Which is definitely NOT what I was hoping for.

 

A Cry From the Trenches: Fighting the Ongoing War with Digital Obsolescence One Day at a Time

I’ve lost track of how many “OH CRAP!” moments I have experienced in my short career. As the Digital Collections Curator at Northern Illinois University, I have seen some pretty gnarly (digital) things. Things that have really hammered home for me the importance of the research work we are doing on the Digital POWRR project.

Many institutions have acquired digital content without much thought given to future use and access. We take it all in, the good shepherds that we are. We build systems and websites that can do nifty things. We frequently forget that digital materials require active and ongoing preservation. This is not a one-time thing! Traditional printed materials readily show us their boo-boos upon physical inspection….torn pages, broken bindings, insect damage, mold, graffiti, etc. Digital stuff, on the other hand, can’t be eyeballed for a quick health check. You need to be mindfully monitoring it constantly. Like a puppy near a bunch of electrical cables. 0.o

However, sometimes just monitoring or checking files does not ensure that they will be adequately preserved for true long-term use. You are inevitably going to run into various challenges opening, rendering, or transforming digital objects that are dependent on proprietary software. I’m sure most everyone knows the pains of shuttling things even as simple as word processing files between different software versions and hardware platforms. In the mid 1990s, I remember being horrified that Microsoft Word wouldn’t open my ClarisWorks files. How many of us actually take the time to forward migrate word processing documents proactively? It inevitably always seems to happen when you need a crucial piece of information. At least there are digital forensics tools and converters to help you out of this jam. What if you opted to save materials in a compressed/lossy filetype, rather than retaining a full quality master file? Getting those bytes back is just simply not possible.

This very issue is currently a problem here in the Digital Initiatives unit as we move old digital objects into a Fedora Commons-based repository framework. We have a collection of audio files that were recorded in the late 1990s for which we have no master or preservation format. There are no analog master tapes, no recording software session files, no digital .wavs, and certainly no broadcast .wavs. The majority of the recordings only seem to exist in RealMedia format. Even though I plan to follow the Association for Recorded Sound Collections’ best practices regarding audio materials, it remains to be seen if we will be able to migrate RealMedia files to .wav. We have also worked with a variety of video files for our Southeast Asia Digital Library project, and have run into various baffling issues with filetypes, codecs, and software platforms. Standards in digital video are in such flux, one can feel utterly paralyzed when it comes to making preservation decisions. The LOC digital preservation blog, The Signal, even refers to digital video standards as presently being “the wild, wild west.” I wonder if I should start wearing a holster and riding a pony to work….

And what if you can’t even find your darn files to begin with? Are they stored on a network drive, server, or scattered across external hard drives, flash drives, or even CD/DVD-ROMs? Consolidating all of our materials into a single storage NAS unit has been a huge step forward for us; however, occasionally I cannot find files when I need them. Sometimes master files have different filenames than their derivatives. This would not be a problem if it were a file here or there, but hundreds or thousands is another matter. Bulk file naming programs are very powerful tools that can be a blessing to people working in a digital production lab. However, whoever is doing image production and processing needs to keep an eye on their part of the preservation puzzle. Remember to sync your naming across all file versions, and document your work, preferably, for your co-workers and successors. Better yet, create and stick to consistent file naming protocols for all of your digital collections! This is the best preservation strategy there is. Also, consider sticking to hierarchical folder structures that are easily intuited by your colleagues, and document your methods of organization as much as you can.
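For anyone who wants to check whether masters and derivatives have drifted apart in their naming, here is a minimal Python sketch. It assumes a convention we made up for illustration: masters and derivatives live in two separate folders and share a base filename, differing only by extension (say, letter_001.tif and letter_001.jpg). The function name and folder layout are hypothetical, not something from our lab.

```python
from pathlib import Path


def unmatched_derivatives(master_dir, derivative_dir):
    """Return derivative filenames whose base name has no matching master.

    Assumes masters and derivatives are meant to share a base name and
    differ only by extension, e.g. letter_001.tif and letter_001.jpg.
    """
    # Collect the base names (without extension) of every master file.
    master_stems = {
        p.stem.lower() for p in Path(master_dir).iterdir() if p.is_file()
    }
    # Report every derivative whose base name is not among the masters.
    return sorted(
        p.name
        for p in Path(derivative_dir).iterdir()
        if p.is_file() and p.stem.lower() not in master_stems
    )
```

Running something like this over a collection before a migration gives you a list of orphans to chase down while the person who named them is still around to ask.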

There are other preservation nightmares that we have experienced in the lab. Have you ever been stranded? (“Stranded on your own….stranded far from home..” as a favorite old song of mine goes.) Someone forgot to pick you up….your car broke down….flight was cancelled….doesn’t it feel awful? Have you ever been unable to rescue and revive stranded digital objects? The poor things. They went on a three-hour tour, and now they’re stuck on Gilligan’s Island. (And yes, I know they’re just 1’s and 0’s…I just tend to anthropomorphize things just to keep myself entertained!).

Our lab’s own Gilligan’s Island has recently become a fully-fledged lost continent of Atlantis, sadly. I am speaking of a stand-alone server that we used to power our interactive/map resources, that used ESRI’s ArcGIS server software. The server was quite old – it ran Windows Server 2000. Yes – *wince*. We have known that it was terrifically out of date for some time now, but we were unable to migrate to a new server because our version of ArcGIS required this particular server environment. Upgrading ArcGIS (which the library had received for free, initially), was a costly proposition. The files associated with the software are proprietary and cannot simply be exported and loaded into an open source alternative. After years of hand-wringing, the server finally failed. Migrating the data and software to a virtual machine also failed. We are going to have to try to reconstruct Atlantis in an entirely new framework, rather than building on what was there previously.

Although this loss will allow us to reconceptualize and refresh our interactive resources, it’s a shame to lose a digital resource that you or your colleagues have put a lot of hard work into. But this experience has proved to be a genuine “life lesson” that can inform and improve practice for ourselves and others as well. Our Digital Initiatives unit, like countless others, sprang up due to the availability of grant money for large-scale digitization initiatives. There was an initial push to make materials available, and not as much thought given towards their long-term preservation. The Digital POWRR project is allowing us to go through the digital preservation planning process, so that we can fully take inventory of our digital materials, assess (and adjust) our past and future practices, and write comprehensive policies so that we never have to deal with the “oh crap” types of moments that I have described in this post. Well, maybe not never….but hopefully way less! Additionally, policies and procedures need to be holistic, addressing both the digital objects and their underlying environment (hardware and software). Websites and other interactive types of resources should be part of any ongoing digital preservation plan – they need to be consistently evaluated and refreshed as well.

Digital POWRR has become kind of like the brown paper bag I hyperventilate into when I’m having an “oh crap” kind of day. It’s soothing to know that I’m not alone, and that I have colleagues grappling with the same issues out there in the digital ether. And it’s downright exciting to finally be able to install and test digital curation tools and systems that will help get us thinking in more of a “lifecycle” mindset, rather than a linear one. I’m definitely feeling more confident in our ability to straighten out our past and lay the groundwork for the future.

2012 Digital POWRR Annual Project Report

We have posted the Digital POWRR report for the first year to our wiki. If you’re interested in reading our Annual Report narrative from 2012, you can do so HERE! It discusses what progress the Digital POWRR project has made and some of the challenges that we have come upon. Boy have we got a lot done, so be sure to check it out! Enjoy : )

An Unsung Hero of Grants Administration: The Business Manager

Everyone’s organization is different, but we all have someone who fits the description:

The Library Business Manager, otherwise known as She (or He) Who Makes Things Happen.

The Business Manager is the person who does lots of cat-herding, who makes sure that all of the forms that you didn’t know existed get filled out, and that the processes at any given institution are followed, so that all of the money from the grant can be used. They deal with red tape so that you can focus on the actual grant work. They are like administrative ninjas.

They know whom to call to get key questions answered, paperwork moved along, and checks cut.

We *love* our business manager. She makes things happen. (Thanks, S!)

If you’re new to grants administration, here’s a word to the wise: get to know (and appreciate) your business manager (or the person who fulfills that role in your organization). They make life so much better for everyone.

Managing Born Digital Content For Dummies

**Disclaimer- No we don’t actually think we (or you) are “dummies” for not knowing this information, even if that is how we may feel. It’s just a catchy title. ; )**

We have all been there at some point in our lives.  We set out to get something done and have no idea how to do it.  So the logical next step is to find some sort of instructions, yeah? *insert head nods*

We (many of us on the Digital POWRR team) had such a question about what to do first when we have a physical piece of media in our hands. Lo and behold, we found this wonderful article that we believed would solve all our problems!

Erway, Ricky. 2012. You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media. Dublin, Ohio: OCLC Research. http://www.oclc.org/content/dam/research/publications/library/2012/2012-06.pdf

(Seriously, it’s great! Give it a read!)

After briefly reading over it many of us accepted that the problem was solved and moved on with our lives temporarily. Until recently, when one of us actually tried to go through it step by step. We realized our knowledge may not be as deep as we originally thought. Therefore we have taken this wonderful article, which gives a step-by-step process for hitting the ground running with physical media, and added some additional information for those of us who may not have all the training to get over the hurdles we weren’t aware were going to be in our way. It can be found in our Digital Preservation 101 section of this website, or directly linked right…HERE.

The process for discovering these resources was simply a matter of researching the things that tripped us up. Like how do we “Write Protect” things, what is a disk image, and how do we create a checksum? The additional resources are just links to places that will help to answer those questions, so instead of you having to take the time to look for them, we did!
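On the checksum question in particular, it can help to see how little code is involved. What follows is a minimal Python sketch, not the method from the OCLC report: it assumes SHA-256 as the checksum algorithm, and the manifest filename and layout (one "checksum, two spaces, relative path" line per file) are choices we made up for illustration. It uses only Python's standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large disk images do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(folder, manifest_name="manifest-sha256.txt"):
    """Record one checksum per file in `folder` (hypothetical manifest layout)."""
    folder = Path(folder)
    lines = []
    for path in sorted(folder.rglob("*")):
        if path.is_file() and path.name != manifest_name:
            relative = path.relative_to(folder).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}")
    (folder / manifest_name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def changed_files(folder, manifest_name="manifest-sha256.txt"):
    """Return relative paths that are missing or no longer match the manifest."""
    folder = Path(folder)
    changed = []
    manifest_text = (folder / manifest_name).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. in an empty manifest
        recorded, relative = line.split("  ", 1)
        target = folder / relative
        if not target.is_file() or sha256_of(target) != recorded:
            changed.append(relative)
    return changed
```

The idea is to run `write_manifest` once, right after copying files off the physical media, and then run `changed_files` whenever you want reassurance that nothing has silently changed. An empty list is the answer you are hoping for.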

Digital Preservation Tool Help? Yes, Please!

The Digital POWRR project team is excited to be working on something we think will be helpful to those of you who are trying to sort out what on EARTH all of the (alleged) digital preservation tools and technologies actually do! It started with a brainstorming session where the team captured every tool/technology/service that we have come across in our digital preservation explorations. We came up with almost 90….ACK! Seriously, who has the TIME or the INCLINATION to sift through all of these tools to figure out what they do, how much they cost, etc.?

Team Picture

WE DO!

Taking a divide-and-conquer approach (along with a whittled down list of 60 tools), the Digital POWRR team is tackling this so you don’t have to! As a part of our investigation into how institutions with fewer resources (read: money and/or people) can engage successfully in digital preservation, we have created a grid that will map out each tool against a list of functions a digital preservation system should provide. We have based our list of functions on the OAIS reference model and thrown in a few of our own, like:

 

  • Is it open source?
  • What are the basic system requirements?
  • How much does it cost (or is it FREE!!!)?
  • Does it offer a geographically dispersed data storage model?

We feel the pain of our colleagues who are trying to figure out this “digital preservation thing” while still managing all of their normal responsibilities. (ya know, like explaining to the very nice donor why your institution can’t take Uncle Bert’s collection of romance paperbacks off of their hands…)

What the Digital POWRR team is hoping to accomplish with this particular exercise is this: Professionals who are overwhelmed by the concept of digital preservation and the number of technologies that purport to fulfill some digital preservation requirement will be able to use the grid we have created to understand what a digital preservation system should do, which tools provide which functions, and a snapshot of each tool’s costs/system requirements/etc. We also recognize that some institutions with fewer resources (our project’s target audience!) need to piece together a digital preservation system with various open source/freely available tools….we are hoping the grid will help them in that process.
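If it helps to picture the grid, here is a rough Python sketch of the idea: each tool is a row, and each row lists the functions it provides plus a couple of our extra columns. Everything in it is a placeholder for illustration. The tool names, function names, and values are made up, and this is not how the actual grid is stored.

```python
# Every tool name, function name, and value below is a made-up placeholder.
GRID = {
    "Tool A": {
        "open_source": True,
        "cost": "free",
        "functions": {"ingest", "fixity check"},
    },
    "Tool B": {
        "open_source": False,
        "cost": "subscription",
        "functions": {"storage", "geographic replication"},
    },
    "Tool C": {
        "open_source": True,
        "cost": "free",
        "functions": {"metadata extraction"},
    },
}


def tools_providing(grid, function):
    """List the tools that claim to provide a given function."""
    return sorted(name for name, row in grid.items() if function in row["functions"])


def missing_functions(grid, chosen_tools, required):
    """Return the required functions that none of the chosen tools cover."""
    covered = set()
    for name in chosen_tools:
        covered |= grid[name]["functions"]
    return sorted(set(required) - covered)
```

For an institution piecing a system together from free tools, the second function is the interesting one: pick the tools you can afford, list what a preservation system should do, and see what is still uncovered.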

We will be spending the next few months working on this, so look for our results by late spring 2013. We will also be including the grid in the final report of the larger investigation we are conducting. That report will be coming out through the IMLS in 2014. Which should be in just enough time for about 60 new digital preservation tools to be introduced on the market……ARGH!!!!