Untying the Digital Preservation Knot

Meg pointed out the whys and wherefores of digital preservation in her last post.

I’m supposed to talk about how.

Of course, since the point of the IMLS grant is to help figure out the “how” for libraries with fewer resources, I’m not going to have all of the answers just yet. We have some ideas about the how, but until we run through all of the scenarios with people who know more than we do, those ideas won’t be very useful.

I have a metaphor I’d like to play with a little bit, if you’ll bear with me. I’m a novice knitter, and so thinking about process in this way is helpful to me. Dealing with digital preservation feels a lot like untying a very messy knot in someone else’s knitting.

It’s as though we’re being handed a portion of a partially knitted, really quite complicated lace scarf, a very messy ex-ball of yarn that a kitten has gotten at, some knitting needles which may or may not be the right size, no pattern, no instructions, no stitch markers, and being told to figure it out and go on from here. (This is the data and digital objects sitting on servers across my campus.) Some of it has been worked, some of it has been relatively ignored, and on occasion, some damage has happened through benign neglect (the kitten is bit rot).

Now, we have some literature that can serve as a guidebook when we get to the part where we can learn new skills and keep knitting the scarf. But there’s still a lot of groundwork to be done before we get to that stage. We need to figure out what we have: what kind of yarn? What size needles? What kind of stitches make up the pattern? Can we find a similar pattern elsewhere, or do we need to reverse-engineer it from what we see? Do we have the right tools for this pattern?

Thinking about digital preservation feels a lot like that.

When I think about long-term access to digital objects, I think about it in terms of pulling out yarn knots. More fundamentally, before we can keep knitting, we need to untie all the knots made by that kitten. There may be enough yarn there to move forward, but it’s unworkable until we can pull at the individual threads, untie knots, separate parts from one another, and smooth things out to make a workable ball of yarn.

If I pull on the server space knot, do the personnel who can help come with it, or is that a different portion of the thread? Can I pull out particular objects (knots) without making other knots (different file types) more difficult to deal with? Where can I find a knitting instructor (expert) or a book (documentation) or a friend who can explain a new technique (experienced end-user)? How will I know when I’ve mastered a particular stitch (technique or product) enough to incorporate it into my pattern without errors? Even if I learn that stitch, does it complement the scarf pattern (data) I already have?

Would it be more effective to rip back the other person’s work, and begin again with a simpler pattern? Would doing so give the same effect, or does it fundamentally change the scarf too much? As I pull one knot, the yarn often tightens in other areas. Will I be able to unravel as I need to? Do I need to change my approach?

Eventually, as a knitter, to learn a new skill, you just have to sit down and do it. You will probably mess it up quite a few times. But, with some help, and good humor, you will come out the other side with a scarf you can live with. Even if it doesn’t look exactly like the pattern on the page.

Digital preservation is kind of like that, too. What matters is the result (long-term access to the data we create). The process of getting there may not be pretty, and it may not look as nice as the very experienced knitter’s work next to yours, but it still keeps your neck warm.

So here’s to the first attempts to unravel the knot.

Why you should care about digital preservation (DP)

Many people assume that digital files are superior to analog formats. Who doesn’t like to search for what they need online and have access to it instantly? Surely the market will keep up with our demand for access! But who is providing that content and how much will you have to pay each time you use it? As curators of the evidence of our cultural heritage, we owe it to our institutions, our patrons AND our future selves to think through DP issues as soon as a digital object is created.

On the surface, the superiority of digital content seems obvious. We often hear that digital files take up less space, cost less to create and use fewer natural resources to store more data. Besides, everyone and everything is going digital… how could it be a problem for our future? I’d argue that the perceived strengths of digital storage are also its weaknesses:
1) Less space: the danger with this idea is that it leads us to think we can keep everything. Also, because we need to purchase so much storage space, we may be lulled into thinking it’s okay to purchase cheap storage, and cheap storage makes system failure more likely. Relying on “free” or cheap cloud space leaves our collections at the mercy of commercial hosts, who may mine our content for data or pull the rug out by closing the site altogether.
2) Costs less: the costs are hidden. The number one resource digital things need is the one most people don’t usually think about: people. People power is needed to create digital files that are usable and identifiable now and 50 years from now; in addition, someone has to set up, and possibly initiate, frequent migration of file formats. Good DP systems also have subscription or purchase costs.
3) Uses fewer resources: saving paper or not printing photographs does spare the environment the chemical processes needed to create the material, but if something goes wrong and a file becomes corrupt (called “bit rot”), some knowledgeable person must intervene and find or create a replacement.

The last point, about file replacement, currently has a technological solution: multiple copies of the same file can be kept in several locations in a DP system and checked for bit rot. When one copy becomes corrupt, the system can be designed to automatically call up one of the other file locations and create a new copy from it. But all of this depends on the ability of people to create the object in a migrate-able format to begin with, and to include enough information in the item that people far into the future will be able to find it and understand it.
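For the curious, here is a rough idea of what “examining copies for bit rot” can look like in practice. This is only a minimal sketch in Python, not how any particular DP system does it: the function names (`sha256_of`, `audit_and_repair`) are made up for illustration, and it assumes that either a checksum was recorded when the file was created or that most of the copies are still intact, so the most common checksum can be trusted.

```python
import hashlib
import shutil
from collections import Counter
from pathlib import Path
from typing import List, Optional


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_and_repair(copies: List[Path], expected: Optional[str] = None) -> List[Path]:
    """Compare the copies of one file and overwrite any damaged ones.

    If `expected` (a checksum recorded at creation time) is given, it
    decides which copies are good. Otherwise the most common checksum
    wins, which is an assumption that only holds while most copies are
    intact. Returns the copies that had to be replaced.
    """
    # Checksum every copy that is still present on disk.
    digests = {path: sha256_of(path) for path in copies if path.exists()}
    if not digests:
        raise RuntimeError("no copies found at all")

    good_digest = expected or Counter(digests.values()).most_common(1)[0][0]
    good_copies = [path for path, d in digests.items() if d == good_digest]
    if not good_copies:
        raise RuntimeError("no intact copy left to repair from")

    repaired = []
    for path in copies:
        # A missing copy has no checksum, so it is replaced as well.
        if digests.get(path) != good_digest:
            shutil.copyfile(good_copies[0], path)
            repaired.append(path)
    return repaired
```

Run on a schedule against, say, three copies of the same scan, this would quietly replace whichever copy no longer matches the others. It does nothing about the human side of the problem: someone still has to choose the format, record the checksum, and describe the object well enough to find it later.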

Finally, think about how fast digital file formats change. As cultural heritage curators, we do not have the luxury of stuffing digital media into an archivally sound box and walking away.

If we ignore it, it will go away. Source: http://geekandpoke.typepad.com