Monday, September 20, 2010

The cooperative we need: Open & Collaborative Library Content


Today our technology tool sets include Web services, cloud computing, SaaS, grid computing, mobile devices, etc., all of which have made possible a whole new way of thinking about library systems/services. As a result there are several efforts underway to build the next generation of library automation software. These include the open source initiative OLE, Ex Libris's URM and OCLC’s Web Scale Management Services. Each of these efforts, in outlining plans for a next generation of systems/services, utilizes at least some portion of these technologies.

All of these next generation systems would benefit immensely from access to a massive store of expanded, networked, linked and shared library data. While OCLC has a starting point in place, its ability to serve this expanded role across the profession and multiple products has been overshadowed by a number of issues, including a very questionable record usage policy (earlier withdrawn, revised, resubmitted and now approved), moves regarding regional affiliates and now a lawsuit announced by SkyRiver and Innovative that further raises questions, concerns, distrust, and anger across the market in many directions. Why are we facing this situation?

It appears to me that the interests of the OCLC we know today are not in total alignment with the needs and interests of its overall actual membership. Perhaps they are in alignment with the interests of the Board, Council, and other governing and administrative arms, but the feeling I get in talks with librarians is that they are not in alignment with what librarians want. As I talk to librarians across the country today, I hear that what they want is an organization, a cooperative, that is focused on developing and providing open and collaborative library content and services that are widely accessible by all, so that they (the librarians) can focus on re-establishing and/or maintaining the value of libraries in our society.

The current OCLC originally started out on this path by building a shared bibliographic cataloging utility (i.e., the creation and sharing of bibliographic records), a resource that has long been at the core of many automated library services. OCLC did this exceedingly well and in a timely manner, as there was massive interest/demand for this type of service and OCLC could provide it while offering a cost advantage to help libraries further stretch their dollars. It was a win-win situation for nearly everyone involved. This was in part because the OCLC service filled a critical need for many libraries, was not at the time in direct competition with other major for-profit businesses, was offered at an affordable cost, and brought the power of collaboration to bear in addressing that need.

Today, OCLC has continued the shared bibliographic utility but has, in my opinion, lost its direction. OCLC has bought numerous for-profit businesses and has continued to operate them as for-profit organizations that pay taxes. In trying to use these assets to grow, OCLC is trying to leverage the assets of the non-profit cooperative to achieve the commercial goals of the for-profit businesses it owns. It makes for a conflict-ridden mission statement and a critically important player in the marketplace that is trusted by too few, including its members/customers and competitors/partners.

No for-profit vendor, whether they admit it publicly or not (although clearly, SkyRiver/III has gone the public route), likes it when a competitor appears on the market and has the benefit of tax-free status. In fact, most businesses will ask: Why should our tax dollars be used to help create a competitor for our company? Especially one that will not pay taxes on the business they take away from us? In the end, all of these business initiatives, and now the resulting lawsuit, strongly work against OCLC being able to do what it does best: building collaboration, content, and related services as a non-profit entity to serve the larger profession.

We all need that cooperative. It should also build a national information processing structure and amalgamate all library-related data that supports all types of library services as delivered through libraries and which can be enhanced by the value librarianship brings to the total offering.

How It Might Look

The necessary questions to ask are: What would this organization look like? How would it operate? How does an existing cooperative move in that direction?

Let’s start with some baseline assumptions. First, I think that we can all agree there is no shortage of information today. What there is a shortage of are ways to deal with that information, to determine what is the best, the most authoritative, authenticated and appropriate information and to place that information into a meaningful context to answer an information seeker’s needs. If we can agree that this constitutes a substantial part of what we as librarians do, then I have some suggestions to make.

I’ve just read an interesting book that I found extremely applicable in thinking about how this might be done and what it might look like. The book “The Power of Pull”, by John Hagel III, John Seely Brown, and Lang Davison, examines how communities of users, thinkers, and doers are reshaping the way major progress is made as a result of small moves made by many participants working together loosely, but with a common and sharply focused goal. There are lessons here to be applied to the world of libraries and particularly cooperatives.

These are lessons that we’ve seen used in places like Wikipedia and even open source software initiatives. I frequently lament that librarians miss one of the really valuable points of open source software, because I see them applying the concepts only at a micro level (actually producing open source code, which is admittedly important) rather than stepping back to look at what is happening at a macro level. Applied at a macro level, what happens with the production of open source software that is relevant to this discussion can be boiled down to this:
  1. A community of people loosely bands together, contributing their time and expertise, in order to help create a product (in this case, software or in the case of Wikipedia, an encyclopedia). In many cases, it is a very large community (an important point, as scalability matters for large scale projects such as the ones we in libraries face in processing information).
  2. In so doing they agree to be governed and rewarded by shared guidelines and incentives (a large piece of which is ‘community good’).
  3. They also agree (for the most part) to have their work widely reviewed, modified, improved and/or accepted or rejected for inclusion in the final product.
  4. They share the resulting product openly and freely for the benefit of all.
Now, let’s take these basic principles, back up and apply them to the information landscape at large.

The volume of information today (which, as we all know, continues to grow at exponential rates) creates a problem: finding the answers users need in all of this information, and doing it in a scalable way. That is why librarianship will continue to be important well into the future. So, if we take the concepts above and apply them here, I see the following possible solution.
  • We need to look at what Wikipedia has done in employing a large community of users to create, filter, and refine a massive database of content. We can argue all day long about whether the precision of their content is as good as what we get in other forms, but the reality is the basic concept works and the resulting resource is massively utilized. So, let’s apply those principles to having libraries/librarians employ their users, and others, to create, filter and contribute to the information banks that we call libraries.
  • Let’s create tools, like browser plug-ins, that allow information seekers/users to instantly rate information sources as they use them, recording both the domain of knowledge the source applies to and the score the reader gives it. Use of such plug-ins would require personal registration with the library collaborative that runs this, so that users themselves can build personal authority ratings and collect rewards associated with contributing (which might just be personal satisfaction, recognition or status).
  • Rankings, once entered, would be automatically processed and compiled as to subject domain, source, content ranking and rater ID (which can be linked back to their ranking). These rankings would then be moved into a database for use by others, via software, that would provide results on any such item retrieved from the web.
  • Such rankings would be reviewed by domain experts whose certification as experts derives either from their sustained rankings within the collaborative or from academic or other credentials that establish their expertise. Once the rankings are reviewed, authority for moving the item to a certified rating would be applied by librarians. Eventually, some of this authority would be delegated down through contributor trees to help make the system more scalable.
  • Over time, a massive new information resource would result and at the top of the pile of ranking/reviewing/organizing and providing discovery and delivery would sit libraries and librarianship.
  • We could then begin to tailor our discovery-to-delivery tools (like Primo) to utilize these certifications as part of the relevancy rankings applied to information as well as offer a whole host of other related new and useful services.
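
To make the flow in the bullets above a bit more concrete, here is a minimal sketch in Python of how ratings might be compiled, certified and then fed into relevancy ranking. Everything in it is an assumption made for illustration: the names (Rating, compile_rankings, certify, blended_relevance), the 1-to-5 score scale, the authority-weighted average and the 30% blending weight are hypothetical and do not describe any existing OCLC, Primo or vendor interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One rating submitted by a registered user (hypothetical record shape)."""
    rater_id: str
    source: str   # identifier of the rated information source, e.g. a URL
    domain: str   # subject domain the rating applies to
    score: float  # assumed scale: 1 (poor) to 5 (excellent)


def compile_rankings(ratings, authority):
    """Compile ratings into one authority-weighted mean per (source, domain).

    `authority` maps rater_id to a personal authority weight; raters
    without an entry get an assumed default weight of 1.0.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # [weighted score sum, weight sum]
    for r in ratings:
        weight = authority.get(r.rater_id, 1.0)
        entry = totals[(r.source, r.domain)]
        entry[0] += weight * r.score
        entry[1] += weight
    return {key: s / w for key, (s, w) in totals.items() if w > 0}


def certify(compiled, approved):
    """Keep only the rankings that reviewers have approved for certification."""
    return {key: score for key, score in compiled.items() if key in approved}


def blended_relevance(text_score, certified_score, max_score=5.0, weight=0.3):
    """Blend a discovery tool's own relevance score (assumed 0..1) with a
    certified community score. Items with no certified score are unchanged."""
    if certified_score is None:
        return text_score
    return (1 - weight) * text_score + weight * (certified_score / max_score)


ratings = [
    Rating("alice", "example.org/a", "chemistry", 5.0),
    Rating("bob", "example.org/a", "chemistry", 1.0),
]
compiled = compile_rankings(ratings, authority={"alice": 3.0})
certified = certify(compiled, approved={("example.org/a", "chemistry")})
score = blended_relevance(0.5, certified.get(("example.org/a", "chemistry")))
```

In this sketch a rater with an established authority rating simply counts for more in the average, and only certified rankings ever reach the discovery tool. A real system would need much more, including safeguards against gaming the ratings.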
Could such a cooperative draw a community of users large enough to do what libraries need done, i.e. to process all the information we’re seeing made available on the Web? Would this work to process, sort, rank and float the very best of that information to the top for inclusion in library discovery/delivery tools? It’s certainly a fair and challenging question, but processing the store of human knowledge and contributing to its long term sustainability would certainly appeal to many people, provided the right recognition was associated with their participation and contribution of time and labor. However, it is equally clear that this is only one component of the total solution needed by librarianship and end-users. Algorithmic computation, data mining and statistical analysis tools must accompany the final solution. These are things that I expect the vendor community will supply.

The Benefits

The benefits of this new form of library collaborative would be substantial for the profession and human knowledge. The role of libraries and librarianship would be strengthened. If OCLC were to move in this direction, it would be returning to its non-profit, collaborative roots and as a result the antagonism with libraries and the for-profit business sector would be lessened. If the resulting amalgamated data were to be provided under truly open APIs and other interfaces, libraries would see their collective content truly leveraged and utilized. They would be able to get more functionality from their software vendors, as those vendors would be able to focus all their resources on end-user needs rather than building (and frequently duplicating) shared data systems and other infrastructure components.

The Business Model

It is clear that over the years OCLC has struggled with finding a new business model that will sustain the organization over the long run. The reality appears to be that the majority of current OCLC income still comes from bibliographic based services, which should be an indicator that the market best supports OCLC when it stays within its non-profit, collaborative, shared content/services model. Furthermore, this is a model that works in conjunction with, and not against, the business community.

If OCLC were to focus on developing the collaborative and shared library content, the most likely and sustainable business model for all concerned would be a subscription-based annual fee that provides access to content/services and/or the APIs that serve that data. Libraries would pay a lower fee because they’re also non-profit organizations, but they would continue to pay OCLC the revenues needed to run this massive collaborative. Vendors, who are for-profit, would pay a higher fee, but would be able to freely and openly subscribe to an extensive à la carte menu of content and services on which to build their products, and to do so without worry that for-profit commercial interests of the collaborative would interfere with the necessary trusted relationship down the road.


I believe that librarianship today truly needs and wants a collaborative effort that would produce these kinds of data/services. OCLC could do this by returning to its roots of being a true non-profit collaborative building shared infrastructure, content, and services for libraries. It should be very open, providing open interfaces that support both open source and proprietary extensions so that the totality of solutions and services available to the profession would deliver substantial added value to information. This strategy would benefit libraries, the businesses that work with and serve them, and ultimately the profession of librarianship.

Don’t get me wrong. Like most librarians I speak with these days, I’ve always thought we needed an OCLC, but I too think we need one that stands for Open & Collaborative Library Content.