Thursday, August 4, 2011

The Library Linked Data Model – from a librarian/vendor point of view

The discussions about the Library Linked Data Model indicate that many people feel it is an important topic for librarianship. The desire to make hundreds of silos of data more accessible, usable and maintainable is shared by the community, and it is equally of interest to the many organizations that provide products and services to that community. Ex Libris, as one of those organizations, frequently gets asked: How do we feel about this topic, and where do we see it fitting into our plans? As you might suspect, the answers, while seemingly simple, are actually far more complex.

I was at the ELAG 2011 conference in Prague in the Czech Republic recently, sitting on a panel, when an attendee asked the vendor organizations on the panel (a representative of OCLC and me) what our plans were concerning Library Linked Data. The audience, as indicated by the Tweets that followed, was concerned when they learned that neither organization had detailed plans to share. Clearly, both organizations are giving thought to the technology and making some movement on it, but uncertainty exists. For organizations like Ex Libris that enjoy a reputation for being forward thinking, one might wonder what the reasons for that are.

The reasons include the lack of a clear understanding of exactly which problems this technology solves for the profession that can only be solved with the Library Linked Data model. Are these problems shared across the profession, across institutions? Is it agreed that the Library Linked Data model is the solution? If so, how many institutions, or even personal services, are in production status using this model to solve those problems?

Please don’t misunderstand or think that we read the benefit statements and fail to understand them. That is simply not the case. We totally get it. We see the potential of unleashed innovation, the embodiment of the concepts of the Semantic Web. We understand the stated benefits and possibilities. We too are excited by what this technology could bring to end-users. The drive for innovation could certainly help transform librarianship and enable it to become more dynamic in meeting the needs of end-users. We agree upon all points.

However, as stated in the title of this post, take a moment and slide around the table and sit in our chair. From here, what you’ll see is a situation best described by the bell curve from E.M. Rogers’ “Diffusion of Innovations”, which Geoffrey Moore later extended when he introduced the concept of “Crossing the Chasm” into the model. It’s a fascinating description of how technology goes from being an idea to a product and on towards end-of-life. Rogers does this by dividing the technology market for any product into five segments. Moore introduced the chasm into the model: the gap between the first two segments and those that follow, into which many technologies fall and fail if they can’t successfully clear it. For the purpose of this discussion, let’s focus on those first two segments, as they are particularly relevant in the discussion of Library Linked Data at this point in time:
  • Innovators, or Technology Enthusiasts. This is the “bleeding edge”. Typically about 2.5% of most markets. Organizations here comprise the initial leading edge of the curve. They represent those who like to be the first because they believe it will improve life. Organizations in this group rarely have much money.
  • Early Adopters or Visionaries. About 13.5% of organizations. These are the revolutionaries, those who will actually break with the past and embrace a new future. They also like to be known as visionaries, so they’re very good about talking about what they’re doing. Better yet, in most markets, these organizations have money to implement their vision. However, these organizations also want products customized to meet their needs, sometimes asking for things few other organizations will want.
Those two groups together constitute what Rogers and Moore call the “early market”, i.e., the leading thinkers. Together they constitute about 16% of any technology market segment. The other 84% are called Early Majority, Late Majority and Laggards. Each bears its own behavior patterns and descriptions (and if you want to read about them, Moore’s book is excellent). The point is this: that 84% forms the majority of any market for a product/service. For a vendor, taking the technology across the chasm between the leading edge and the market majority is the key to being successful, and clearly it is no easy task. It’s a combination of timing, development and management, and certainly even an element of luck.
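For readers who want to check the arithmetic behind those segment sizes, here is a minimal sketch. The first two shares are the ones quoted above; the shares for the last three segments are the standard figures from Rogers’ model and are an assumption added here, since this post does not state them. The names `SEGMENTS`, `early_market_share` and `mainstream_share` are purely illustrative.

```python
# Segment shares of a technology market, in percent.
# Innovators and Early Adopters are quoted in the post; the other three
# values are the standard Rogers figures (an assumption, not from the post).
SEGMENTS = {
    "Innovators": 2.5,
    "Early Adopters": 13.5,
    "Early Majority": 34.0,
    "Late Majority": 34.0,
    "Laggards": 16.0,
}

# The two segments that sit before the chasm.
EARLY_MARKET = ("Innovators", "Early Adopters")


def early_market_share(segments=SEGMENTS):
    """Combined share of the segments before the chasm."""
    return sum(segments[name] for name in EARLY_MARKET)


def mainstream_share(segments=SEGMENTS):
    """Combined share of every segment after the chasm."""
    return sum(share for name, share in segments.items()
               if name not in EARLY_MARKET)


if __name__ == "__main__":
    print(f"Early market: {early_market_share()}%")
    print(f"Mainstream market: {mainstream_share()}%")
```

Under these assumed shares, the early market comes to 16% and everything on the far side of the chasm to 84%.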

What most developers/providers of products analyzing the potential of Library Linked Data would see is that, at this stage, this technology is still very much in the research stage. There are a lot of ideas being discussed, a lot of possibilities described and a lot of unanswered questions being asked. In terms of pushing forward with implementation, the vendor sees that the first 16% of the market includes the 2.5% that typically (but not always) has no money to spend, and consequently their participation will be solely through in-kind donations of brainpower and people time. This is not to be minimized. It is a very important contribution to the development of new technology and helps to bring ideas to life. However, at this stage, for a vendor, it doesn’t help to pay our staff or bills. When we do the business case study, we need to wait until at least the 13.5% of a market that more likely has money engages. However, even then, we have to factor in that this market is already facing extremely challenging financial times and has little time for fully exploiting its existing technology, much less exploring the possibilities of new technology. It’s just a reality of the times. At the end of that business case analysis we come up with some very, very small numbers over which to spread the costs, and the return required, of implementing this major overhaul in core data structures and the related software that runs on top of those structures. That makes the product unaffordable at this point in time.

Someone will ask: Isn’t this why you do R&D? Isn’t this how you develop a new market? If you want to be leading edge, shouldn’t you be engaged? Again, fair questions but let’s continue our examination from the vendor view in order to answer those questions. Specifically, we see the following issues needing more narrative, more agreement and resolution:
  1. Let’s start by going back to the critical question: which problems does this technology solve for the profession that can only be solved by using this model? To answer that, I’ll repeat what customers tell us all the time when we bring them new products, services and ideas: “Show me”. Yes, it can be frustrating to face that demand. However, it is the nature of this marketplace, and not without good cause. We understand that the majority of this market is buying products/services with money that is entrusted to them to be spent very wisely. As a result, the profession of librarianship is very careful. Librarians want to see what they’re buying before they buy it. The challenge becomes to develop some working demonstrations of Library Linked Data that can be widely shared and widely used, and that clearly and easily demonstrate the remarkable benefits. If one of the main benefits is “unleashed innovation”, how do you show that? Not easy, but we do need at least a few really good examples. This will help to fuel the interest in moving this technology forward. One possible answer? For the innovators, technology enthusiasts, early adopters and visionaries to band together and develop some working examples of the innovative possibilities. Use a limited set of data, but develop some demonstrations and, at the same time, try to answer some of the points below through those demonstrations.
  2. How this technology will get implemented also needs more clarity. Do we see it as technology that will be implemented only with newly created data? Can we, as a profession, afford to wait the amount of time that would take? That doesn’t seem likely. So, if not, how are we going to convert data from the existing data structures and silos into this format? Who is going to do that, when and how? Will it be something expected of the vendors? Certainly, to get the data to work with our end products, we know we will have to write some amount of conversion software. To do this, any organization will need a lot of details in order to spec out the amount of time and effort, and therefore cost, it will take to answer this need.
  3. How do we see this data being maintained? We all know, and numerous posts have pointed out, that the data is dynamic. It’s constantly being corrected, updated, revised and enhanced. Maybe not in huge quantities compared to the total body of data, but still it must be accommodated. Certainly the possibility for linked data to reduce the number of times the data will need to be replicated streamlines this need. However, the need still exists. So we need to understand how exactly this will get done, by whom and how frequently. Our customers will not want to capture the benefits of linked data and then see them slowly erode as the data becomes increasingly outdated over time. The answers to these questions are an essential component.
  4. If the answers to some of the points above are to come from open communities, be it open-source or others, we also need to factor in the maturity and sustainability of the tools that are put forth. In some instances, we’ve had experiences where we moved to adopt OSS tools only to find the development/maintenance resources behind those tools vaporize and the tools languish. Yes, of course we realize that we could pick up that task ourselves, but like most organizations, we have our development resources tightly scheduled far in advance, and therefore this is not always an immediate option for us. So we’ve learned to adopt such tools after they’ve demonstrated a level of independent sustainability. Ex Libris, with thousands of customers, has to ensure that anything we incorporate or rely upon is stable and sustainable. It’s important for all our customers, but especially important for the very many large, enterprise-level organizations that use our products.
All of this information will also be factored into the cost of moving to this technology and converting our thousands of customers, and therefore into pricing the final products and testing that pricing with customers. This is needed to help us assess the viability of developing products using this technology. In the end, we’ll perform a business case analysis to determine if the return on the investment will meet or exceed that of other ideas and technology that we’re considering implementing. This can be challenging, because assigning value to things like “unleashed innovation” only seems easy intuitively. When you’re trying to show that value to sharp-penciled funding authorities, they like to see numbers and they like accountability around those numbers. Like all organizations, we have limited resources, and we want to make wise choices that provide that accountability and ultimately serve both our customers’ best interests and our own. This is not easy to do without solid answers and informed, well-grounded projections.

Conceptually, we’re on board with the ideas behind the Library Linked Data model and, in fact, we’re designing our new system, Alma, with the necessary capabilities at the core to support the Library Linked Data model. We’re actively developing it. However, from a business perspective, the technology and the ideas that will result from the model seem too nascent for us to be able to provide the answers and projections needed in order to bet major development resources. We believe that will change. It’s going to take more time. Until then, we plan on putting the foundation in place, participating in the discussions, contributing ideas and information where possible, and planning for the day when we’ll have those answers in hand and be able to offer a firm development schedule for the delivery of the Library Linked Data model in our products and services.