Tuesday, June 30, 2009

Digital Preservation -- Observations from PASIG

The SUN Preservation & Archiving Special Interest Group (PASIG) meeting, held last week in Malta, was once again a meeting of thought leaders in the technical aspects of digital preservation. Many common trends emerged as I listened to the excellent presentations.

Among the first trends I observed was this: according to Steve Knight of the New Zealand National Library, digital preservation is still perceived as "too hard and (as a result, most feel they) can't do it". He also noted the continuing "need for a sustainable digital preservation solution." Steve's observations closely match my own experiences and concerns. Too many librarians and archivists are quite simply ignoring the issue and/or using the economic crisis to avoid dealing with the challenges of digital preservation. Yet we face staggering growth in digital data. That growth underscores Steve's observation that building long-term, economically sustainable solutions is a critically important issue.

The issues of context and technology crossed paths in conversations with, and presentations by, David Rosenthal of Stanford University. He pointed out the difficulties we face in preserving the context of objects, as well as the information objects themselves, when so many objects are dynamically assembled, as mashups are. He posed these questions: How do we capture all that information, and do so in a way that allows it to be re-used? And how do we build solutions that will scale to meet these needs? These points again raised the question of what we choose to preserve, because capturing any one information object in context can exponentially expand the total quantity of data to be captured. In addition, David raised the point that we don't yet have the needed "copyright framework for research data". Resolving that particular legal issue means working around people with vested interests and working directly with research funding authorities on a national level. Good points and questions, each and every one of them.

The need for more best practices, for creating and/or updating standards, and for certification of Trusted Digital Repositories (TDRs) also emerged as a common theme. For instance, while OAIS is out for review and has been for quite some time, the status of that update work is very unclear. Yet in their presentation, the FamilySearch attendees showed how they've extended the standard to address scalability issues that they've encountered and that were not addressed in the original design. To me, these are exactly the kinds of things that need to be fed into the OAIS revision process. Chris Rusbridge of the University of Edinburgh gave an interesting presentation about the need for repository certification and TDRs in order to provide users with the "trust" required to place their digital objects in preservation repositories. This issue has not yet received enough attention, and Chris is absolutely right that it is very important.

During a panel session, I raised the point that while we were seeing a lot of thought leadership in the PASIG presentations, what was needed was not just thought leadership but "active and coordinated" thought leadership -- which it did not feel like we were achieving. Specifically, I suggested that, given the sheer size of the constantly growing corpus of digital data to be captured, not to mention the system scalability issues incurred by that growth and the business models needed to make it all affordable, we might seriously question whether the conference represented the best use of the valuable resources it entailed.

Too often, throughout the conference, it felt like many people were busy trying to solve the same problems, often arriving at similar yet different solutions that in the end catered to very specific and unique organizational needs. One wonders if we wouldn't be better off focusing on a larger, more generalized vision. By designing an overall framework in which developments made by the many conference participants could be plugged together for a more comprehensive solution, far greater progress might be made. Certainly best practices such as policy templates, lists of applicable and needed standards, audit practices, and standard backup processes/procedures would all be good starting points. For instance, we could take the list of identified areas needing standards development and put forth light-weight, quick-to-the-field draft standards that could evolve to become full, accredited standards after we had actual experience in using them.

One approach that should definitely be used more is national plans, and possibly even national funding, to drive digital preservation work. I was particularly impressed by the work being done in Slovakia in this area. They have come up with a "National Information Infrastructure" plan that covers digital preservation, and they've created an "Integrated Conservation Centre" to help coordinate libraries' digitization initiatives. In a more specific example, Rob Sharpe of Tessella called for the creation of more national registries. All are examples of approaches that should be replicated on a wide scale.

It was also very interesting to note the number of presentations that involved video and/or audio-visual objects, where work is proceeding in parallel but separate fields. One remark by Richard Wright of BBC Future Media and Technology was particularly telling: he pointed out that they are also trying to solve many of the same issues and are coming up with similar ideas. I've never heard a clearer call for librarians and archivists to reach across traditional boundary lines and work arm-in-arm with others to solve some large-scale problems.

At the end of the conference, while the content was excellent, I was left with the concern that although we continue to face huge challenges in digital preservation, we're trying to solve those challenges individually rather than collaboratively. That is an approach we can't afford financially or strategically. While we grapple with the challenges, the black hole that is permanently sucking away digital content that should be preserved is growing. We'll never be able to recover that content. The only hope we have of bringing needs into line with capabilities is to envision a large-scale plan, seriously evaluate what can be done, how, and by whom, and then start parceling out assignments so the collective results can be brought forward to the profession.

Monday, June 22, 2009

As the supply of information grows, so too does the need for new skills in librarianship.

I’m always reading. This is probably because my upbringing included weekly visits to the library and because I am now a librarian. Like many people, I find the most rewarding part of reading is setting the item down, thinking about how interesting the content was, and then being able to extrapolate how it applies to my life. Such has been the case for me recently with two items recommended by friends. The first is an article that appeared on the Educause website called “The Tower, the Cloud and Posterity” by Richard Katz and Paul Gandel, and the second is a book called “True Enough; Learning to Live in a Post-Fact Society” by Farhad Manjoo (Wiley and Sons, 2008). Both works cause you to stop and think about the effect that the abundance of information and technology now available has on society and human behavior. The article goes on to raise the question of the role of the librarian in this changing environment.

What I found so fascinating was that, like many, I’ve been so engrossed in making sure we capture, store, make discoverable and preserve access to information that I hadn’t really stepped back to think about what the result of that might be. When mixed with the massive trend toward collaboration and social networking, it turns out that the result might not be entirely positive. I found this paragraph in the article by Katz/Gandel particularly thought-provoking:
“will we leave a human record possessed of “too much
scrambled, meaningless trivia of information where discerning
anything of value or having context-rich value statements at all
becomes impossible?”…. “It is possible that as information
becomes so voluminous, the standards of selection become so
pluralistic, and the content of information becomes so nuanced,
feeling will replace analysis as the social barometer of truth?(1)”
It turns out they’re not alone in that thought. In the book “True Enough”, Farhad Manjoo also leads us through extensive examples of how information is now manipulated, spun, massaged, and sponsored. This is frequently a result of collaborative efforts such as are typical of Web 2.0 initiatives and access to the vast supply of information that is now available. By the end of the book, any worthwhile librarian is deeply disturbed and wondering how we will know that the information we’re selecting, storing and representing as accurate will really be so.

“The implications of having more than a billion people with persistent connections to the Internet and exabytes of information freely and openly available cannot be overstated.(2)” It raises the specter that librarianship will need a whole new set of skills in the future. For librarians, it almost certainly means that the context of any information stored must also be captured and stored with the information. Possibly, we’ll need to develop and use, via those same Web 2.0 collaborative initiatives and/or networks, people who can tell us if something has been manipulated. For instance, has a picture been extensively modified by a Photoshop(TM) expert? Given the vast supplies of information that will exist, all of these authors suggest that any point of view can and will be justified, in depth and in great detail. If such is the case, how do we capture all of that information so we can assure people that we can provide the equally complete context in which any theory or hypothesis was developed? Think about how we would do that when it comes to medical information about the authors. How many Lincoln scholars would love to have detailed information about Lincoln and the probability he had Graves' disease? But if Lincoln lived today, given the issues of information privacy, even if we held that information, would we be able to allow its use?

People frequently ask each other for information about topics in their lives. As a librarian, I’ve always encouraged people not to just ask their friends, but to go to the library and get the facts. Now we must question the very information that we archive in the library for them to check. It is also becoming apparent that, as librarians, we will need to be well trained in the laws pertaining to the use of information. We must develop the new skills with which to do this and, as noted by Katz and Gandel, “the librarians and archivist must not simply be part of this new cloud of digital information artifacts. They must take a leadership role in guiding its policies and practices.”

For librarians, this raises the specter of extensive new training courses in librarianship, new policies and guidelines to be developed, and new things to teach and convey to our users, along with new tools to be developed. The exabytes of information are growing. We had best get busy ensuring the same is happening with our librarianship skills and training.

(1) “The Tower, the Cloud and Posterity” Richard N. Katz and Paul B. Gandel. Pg 186.
(2) Ibid.

Sunday, June 14, 2009

Going, going, gone??

It’s one of those days where I find myself on a morning flight between the offices of Ex Libris in Chicago and Boston, and I’m scanning today’s newspapers. I’m reading them on an Amazon Kindle, which is appropriate because this morning’s news stories have much to say about the accelerating move of books and information from analog to digital.

The “Financial Times” (June 10, 2009) carries the article “School textbooks near digital doomsday”, which details how California’s governor, Arnold Schwarzenegger, is promising to replace costly “outdated” textbooks with digital learning devices. He goes on to call textbooks “antiquated, heavy and expensive” and states that he no “longer sees the need for traditional hard-bound books when information is so readily available in electronic form”.

The next article I read is in Wired (June 2009) magazine, entitled “The Future of Reading” where Clive Thompson, in the subtitle states “To save books, publishers must go digital—and let audiences unlock the potential of the written word.” Thompson goes on to say that “Books are the last bastion of the old business model—the only major medium that still hasn’t embraced the digital age.” He then nudges us to “stop thinking about the future of publishing and instead think about the future of reading.”

All of which causes me to once again pause and ponder the future of libraries and librarianship. As information continues to move to the digital medium, I wonder why we expect students, whose textbooks and other sources of information are readily available electronically wherever they are, to come to the library to use our resources, be they digital or analog. Will the library as “a place” or librarianship as a service have sufficient added value for end-users to justify its continued existence? Or will Deans and Provosts begin to eliminate librarian positions and/or library facilities on their campuses because they buy into the “it’s all available digitally” belief? And since they currently have to deal with an economic crisis, does that give them the perfect excuse to reduce or eliminate if they think this way? (In fact, later in the day, I talk to a consortium director who tells me the elimination of librarian positions is exactly what has happened at two of the colleges in his consortium.) At the same time, we’re seeing the Pennsylvania state library have nearly 50 of its 57 staff positions eliminated. Left with such a skeleton staff, one has to wonder whether they’ll be able to do much more than keep the doors open, and even then at very limited hours, with very limited services. This is not exactly the future of librarianship we all had in mind, I’m sure. This leads me to the belief that we’ve arrived at a very important time for libraries and librarianship. It’s time to redefine them and then rapidly move towards that redefinition before it’s too late.

An interesting, yet obviously preliminary and partial, part of that redefinition is described later in the same June issue of Wired, in an article by Steven Levy entitled “The Answer Engine”, which describes Stephen Wolfram’s new Wolfram Alpha service. Applying a computational engine to the vast amount of digital information already available, Wolfram Alpha attempts to answer the questions posed to it. If you haven’t yet taken the time to experiment with this product, I would certainly encourage you to do so. Most librarians will likely find an encounter with Wolfram’s tool frustrating at the moment, but the potential it shows is fascinating.

What you’ll find clearly missing in this service is what many of us librarians learned in the course called “Reference Services”: how to interview the user, before starting the search, to find out exactly what would meet the user’s information needs. There are many ways to do this in today’s digital environment. The point, though, is that this is a place where the skills of librarianship are clearly needed and could play a very important role. Engaging in the development of these types of enhanced services is where I believe librarianship should be focused and headed, today and tomorrow. Of course, in the short term we need to show more immediate results. This can be done with activities such as those we describe in our Initiatives blog and as I’ve recently described in a post in the Federated Search Blog.

What seems obvious to me after reading all of these articles is that if we don’t start filling gaps like these with library services and librarianship skills, others will. If we want librarianship, and the values it represents, to survive intact, we must more rapidly adapt to this environment just as information is doing in moving from analog to digital. Otherwise, librarianship will be gone.

Tuesday, June 2, 2009

How does it know?!?

We all know that part of life is death, but it never lessens the pain or sorrow when you get the news that someone who substantially helped shape your career has left this earthly coil. Such was the case for me this past weekend, when I learned that Jim Michael, a long-time colleague from an earlier part of my career, has departed.

Jim was a remarkable man, with a huge appetite for funny stories, libraries, life, family and food -- all of which he enjoyed with relish. I’ll always remember how he would demo the software, show some wonderfully clever feature, and then turn to the audience of librarians and, with a huge grin, ask, "How does it know?"

Those of us who work in the field of library automation and were recruited away from libraries into the business side of librarianship by Jim owe him a lot. Jim was very close to the same age as my father. Like my dad did, and still does, Jim guided me with gentle patience as he shared his incredible knowledge and expertise on a wide range of subjects. As you would expect, conversations with Jim focused on libraries, building software products for libraries, library standards, and understanding librarians and their needs, as well as all those others who work in this industry, ranging from the press to lawyers, consultants and other vendors. When you had reached your saturation point on the subject of libraries, he could just as easily change gears to discuss Biblical studies, fine wines (and God bless him, the best port I’ve ever smelled and tasted), cigars, coffees, food and any other subject you could wish to discuss, at any level of detail you wanted. When you were finally tired of learning for the day, he’d tell you a funny story or joke, put a laugh in your belly, a smile on your face, a good cup of coffee in your hand and then send you back to your office to start applying that new bundle of expertise he'd just handed you.

However, the most important thing Jim taught me was that as you rose in the organization, you had an obligation to bring along the next generation of leadership. Through Jim’s understanding and guidance, he did that for me. Sometimes by counseling me when needed, sometimes by introducing me to those I needed to meet or explaining that which I did not yet understand. He always set the best example possible for me to follow. He taught me to lead when needed and follow when appropriate. He did it all in a way that showed tremendous respect for the people around him.

I’ve tried over the years to faithfully apply those lessons and to do the same with those who work with me. Sometimes I succeed, sometimes I don’t, but I’ve always tried to remember the examples and the lessons Jim imparted. It’s an important part of leading an organization and one easily forgotten in the rush to get things done. But do it we must, for it is part of the job of leading and part of the obligation we hold to those, like Jim Michael, who taught us.

How does it know? Because Jim, like all the rest of us, you took the time to teach it. God bless you on your journey.