Thursday, November 9, 2017

"Living under the API." Some things librarians need to consider.

At OU Libraries, we've been investigating interesting new tech products and, when requested, we provide input into the development of some of these products.  There truly are some new tech products in the pipes that are going to offer libraries a real chance to provide new value-add capabilities to our communities of users. But. These new products are also raising some real concerns. Those concerns will require us to think very carefully as we develop, implement and use them, and not just us at OU Libraries, but all librarians. Let's look at just a couple of those:

  1. First, as is happening all around us, we’re increasingly seeing data and algorithms interwoven into academic scholarship and librarianship.  Clearly, machine processing is going to provide us with a whole new dimension of analysis and research results, but as with any new technology it is also providing a new set of challenges.  Two recent moments underscored this point for me.  The first was when we had a guest speaker, Dave King of Exaptive, give a talk at our recent Research Bazaar.  During that talk, Dave stated: “In the past, programmers wrote code to implement management decisions.  Now programmers are writing code that makes decisions and managers are trusting it because they don’t understand what is happening in that code.  This is dangerous.”  Dave's company writes such code, so it's really important to listen to what he is saying about how it gets developed.  The second moment, which underscored what Dave pointed out, was when I was reading a new book titled "Radical Technologies". It coined a phrase that describes a lot of what is happening with technology: people are increasingly "living under an API". APIs are Application Programming Interfaces that can embody algorithms that are controlling our lives.  It's happening in social media, search engines, even in your local Target and Walmart stores.  Yes, it can produce great value.  It can also be, as Dave said, very dangerous and increase problematic societal issues that we're already wrestling with full-time.  Another recent book, "Machine Platform Crowd", stated something equally important for us to remember in this discussion: “Technology is a tool. That is true whether it’s a hammer or a deep neural network. Tools don’t decide what happens to people. We decide.” So, when working with these new technologies, for instance, we at OU Libraries spend a great deal of time thinking about issues surrounding the support/development of critical thinking skills and reproducibility of research results.  We're concerned about the need for:
    • An understanding, by all parties involved, that software used in analyzing and processing data must be open source or extremely well documented, such that it can be clearly understood what is happening within the processing sequences of that code.  Now, clearly proprietary vendors are not going to make their entire products open source, in order to address this issue, that's well understood.  They need to recover costs and make a profit to sustain and grow their companies.  But the code bits that do actual analysis/processing do need to be either open source or openly and accurately documented.  In part, peer review of code and code logic would be one way to allow us to ensure the integrity of the work as well as ensuring that biases are not written into the code.  This is really important because if we don't watch for this, biases will be propagated over generations of research as results from one research project are used as the basis for subsequent research projects. As part of peer reviewing of code as a scholarly product, it would also need to be: i) Documented, ii) Reusable by any scholar trying to replicate results, iii) Citable, and, finally, iv) Versioned (to ensure accurate reproducibility, including workflows).  
    • The same set of issues occurs with the data the code operates upon, be it numerical data, visualizations, citations or the full-text of publications.  When using data, we need to ensure its quality and openness so that others can verify and reproduce research findings.  
    • These items are really not optional. They can't be if we are to ensure the integrity of the research done at our universities. 
  2. A major secondary issue we're seeing is that many of these new tools are closely coupled to the content, which is also supplied by the vendor.  Now, as many of you know, I've been on the vendor side of the discussion table in my previous lives and I fully understand their desire is to provide value-add as a way to increase the sale of their content.  Understood.  But it's a sales model from the past, not the future.  Analysis tools need to be decoupled from the content.  Our thinking, as librarians, is focused on the value-add of the tools on top of content, and, thus the need to use that tool across all content, not just that in one suite of offerings.  For us, not doing this poses real challenges in training community members and in the overall ease-of-use of library resources.  (We really don’t want to be in a position of saying: “Yes, we know you can do that with these databases, but you can NOT do it with those databases…”)  We believe what is required is a shift in the understanding on the part of content providers that content is increasingly becoming a commodity, and the value (and thus future sales) is coming in the differentiation of the tools provided on top of content.  The fact that vendors are producing these types of tools already shows an understanding of this, but the fact that they continue to couple the tools with only their content shows a bifurcation of thinking that we fear is not healthy for all concerned.
In order for these new technologies to be successful for both the organizations producing them and the profession of librarianship, these are issues that really need to be addressed head-on, all around the table, whether the tool is open source or proprietary.  As Librarians, we need to do our homework and our due diligence to ensure we understand, in detail, the topics involved.  We also need to insist that the technology allow us to continue to support our core values as well as those of the communities we support.


Tuesday, August 15, 2017

Déjà vu re library data and ownership rights

Recent events in the librarianship profession have brought me out of the slumber I've been in with this blog, although the actual causes for that slumber are separate and work related. 

I've found myself over the past two weeks marveling at the reaction of the many librarians contacting me about the recent purchase of BePress by Elsevier and that reaction was compounded by a former vendor of our library sending us a legal letter that raised my ire considerably. Both reactions are related to the issue of the library's data when stored in proprietary systems, and particularly, the library's rights to get that data back out of the system when the library is ready to move to a different system.

At that point, it's really too late to do what I'm about to say, but please, when selecting the new replacement system, make sure you do the following if you're buying a proprietary system.  Here is a list of things that need to be specified about the data in the RFI or RFP as well as in the actual contract between the vendor and your organization.  Those are:
  1. The Library owns their data.  No conditions apply or are allowed.  The Library owns ALL their data and all the rights associated with the use of that data. 
  2. You have the right to request that data be extracted in a specified, industry standard format (and that format should be stated in both the procurement and legal documents).
  3. The cost for extracting that data should be identified in the contract.  That number should be reasonable for your institution and certainly should not exceed the cost you paid them to load it (but be fair and remember that inflation can apply over time).
  4. The time period for the delivery of that data should also be specified.  A reasonable time period is 30 days from the date of request.  Any more than that and you're going to be creating a backlog of updates to apply to it once it is loaded in your new system.
Now, here is the part that has surprised me since joining the academic world over four years ago, which is how the legal department of the academic institution is really somewhat disconnected from the librarians who are selecting systems.  For instance, most librarians write up their RFP and send it to purchasing, which then conducts the procurement and basically acts as a wall between the vendor and the librarians, filtering things through in a way that ensures a fair procurement.  Once the librarians select the system of choice, procurement works with the legal department to get the contract put together and in most cases, the librarians sit on the sideline during this process, until the document is signed. So, they're not involved in the negotiations and are totally reliant on their legal department to ensure crucial points are covered.  

I can tell you from experience that you can NOT always count on this for coverage of crucial points, because your legal people will not have the same understanding of your needs, nor the depth-of-understanding of the topics that you do.  Their expertise is legal and negotiation, but if a vendor flatly tells them "we don't do that", legal might give it away in the negotiation and you might lose out.  So, make sure you see the contract language BEFORE it is signed. If you don't understand what is being said, ask your legal people to explain it to you.  When I ask librarians if they've read their vendor contract, either before or after the signing, I'm usually stunned at how many will tell me "no".  Mistake, mistake, mistake! You know when you pay for that mistake in this area?  When you go to leave and find out your data is locked in a proprietary format, will be delivered in that format and, because it is proprietary, you can't give it to your new vendor without violating intellectual property laws.  Or that you're only entitled to a subset of what you think is your total data set.  Or that the cost of having your data extracted approaches the cost of a new replacement system.  Or that sure, you can have your data, but it'll take six months to get it.  Or... you get the picture.  

We've lived through these issues before with integrated library systems and now apparently many have forgotten those lessons of history.  As Yogi Berra once said: "It looks like Déjà vu all over again."

Monday, January 9, 2017

Do you see what I’m saying? Why Libraries should be embracing virtual reality.

Galileo's World VR Station
Here at the University of Oklahoma Libraries, we introduced Virtual Reality (VR) to our community with the Galileo’s World Exhibition. I fully expected VR would be of interest to many of our younger exhibition attendees. However, I was certain we were on to something much bigger, when in giving tours of the exhibition, I found myself regularly helping elderly people into the station where we were doing the VR demos, in order that they too could see the Universe as Galileo thought it existed, and then letting them compare that to a VR demo we'd created of the Universe as we know it exists today, using images we obtained from NASA. Watching people light up as a result of that virtual reality experience, no matter their age, really made an impression on me.  

Innovation @ the Edge VR stations
We then decided to grow our virtual reality activities by taking the technology under wing at our Innovation @ the Edge facility. Next we started the Oklahoma Virtual Academic Laboratory (OVAL) which included the installation of workstations designed specifically to support VR using Oculus Rift technology and to show our community of users how we could support their use of this technology within the pedagogy and research at our University.  Since then we’ve grown the program to embrace HTC Vive units and high-quality Google Cardboard headsets. We also assembled a team of Emerging Technologists who have a burning passion to introduce this (and other leading-edge) technologies to our community. 

Innovation Hub OVAL
The combination of these things has resulted in us seeing instructors from over 20 courses, and from a variety of colleges across the campus, that have built exercises into their courses that require the use of the virtual reality units in the library labs. In addition we’ve seen numerous other community explorations of the technology, some of which will leave you in total awe of what is possible and being done (some of these have literally brought me to tears they are so moving, but let's keep that as a subject for a separate, future post).  All of this has been tremendously rewarding because it not only builds traffic in the libraries, it positions the library as the point to engage with, learn about and experience leading-edge technologies.  At the same time, it also allows us to connect the users with the many other resources and users of the library that will help facilitate their thinking and adoption of these types of technologies.  (Note:  All of this underscores our positioning as the “intellectual crossroads of the University”). Plus, it’s clearly going to lead to higher-level learning and research outcomes as we study the use of the technology and learn from those studies. 

As wonderful as all of that is, those are not the only reasons I see for Librarians to embrace virtual reality. Recently I was sitting in the airport, waiting for a delayed airplane (an unfortunately frequent occurrence in my life) and heard a nearby grandmother trying to explain something to her grandchild. As she was doing so she was frequently interjecting the question: “Do you see what I’m saying?” It occurred to me, that question was a rather perfect encapsulation of what I wanted to describe in this post.  Let me explain.

The difference between information and knowledge

This is all related to what I’ve previously written about in that Librarianship is about knowledge creation. David Lankes is probably one of the most recognized speakers on this topic and has written many wonderful columns on his blog, some excellent books, and given many, many speeches that touch on this topic. I consider his work among the best available today on this topic.  

Of course we, as librarians, are not only about knowledge creation, we’re also about how to transmit the information upon which that knowledge is based, from generation-to-generation and how we then render it in ways that the next person to encounter that information can turn it into similar knowledge or improved knowledge in their minds. Now, admittedly information comes in many containers today, from the web, to audio, to moving images to books. The point is that information comes not only as symbols on a page, but also includes sound, tactile, and certainly additional forms of visual transmission beyond that of just symbols on a page. This is where virtual reality comes into play.

For the sake of clarification, let us take a short aside. The difference between “information” and "knowledge" is an important one and yet I find the terms are often used interchangeably and, I think, quite mistakenly.  I believe information is conveyed understanding and serves as input to thinking organisms where it can result in the creation of the same, new and/or different knowledge, but hopefully at least the same knowledge. In other words, to my way of thought, knowledge only exists in thought, be that human or machine. Knowledge committed to transmittable forms, whatever they might be, is, to my way of thinking, information that results from the knowledge those thinkers have created and/or possess.    

Knowledge is defined by Merriam-Webster as: “information, understanding, or skill that you get from experience or education.”  Working with that definition, the part I want to focus on at the moment is that word: “experience.” It will help illuminate why I’m so certain that virtual reality is technology we need to embrace.

Virtual Reality (VR) as a means of conveying information upon which knowledge is based

History tells us that early humans conveyed information experientially, i.e. verbally or visually and in early times, via drawings/paintings. The means of conveyance was later expanded to include the written word. Over time, recorded sounds, video and other formats were added.

Reading is of course, for most librarians, the most endearing and longest running form of conveying information. We deeply appreciate the value that reading provides in our lives, both from the ability it gives us to experience new knowledge, be better informed and educated and thus to improve our lives, but also because it can entertain us. 

Now step back and think about reading for a moment. What makes a book memorable when you read it?  In doing so, you’ve consumed probably hundreds of pages of print and in some way, when you were in that act, you might have felt moved by that text. Maybe it was an emotional experience or maybe it was one of enlightenment or understanding. I often say there are “Aha!” moments, those moments when a switch is flipped in your brain and you find that you’ve identified with the writer, seen the world through their eyes and experienced their understanding. You then possess new knowledge as a result. 

Kevin Kelly’s recent book “The Inevitable” (highly recommended reading) says this: “Some scholars of literature claim that a book is really that virtual place your mind goes to when you are reading…. When you read you are transported, focused, immersed.” 

I totally agree with that and yet not all people are text-based consumers of information. A September 1, 2016 Pew Report tells us: “65% of Americans have read a print book in the last year” and that “(28%) have read an e-book”. While those are not horrible numbers, they do indicate there are still large segments of the American population that are not reading, not experiencing the information needed to create new knowledge and thus likely have become somewhat stagnant in their learning. Or they become easily subject to the “post-factual” society that we all now find ourselves wrestling with (as well as some of the horrendous outcomes that have resulted). 

So, is there something we can do to move the needle back in the "reader" direction?  Well, let’s go back to that grandmother’s comment: “Do you see what I’m saying?” Virtual reality can be another way to help people experience, i.e. “see” that type of virtual place that many of us experience as a result of reading.  VR is engaging in part, because it’s so very life-like. It’s enlightening because we can use it to convey information and provide enhanced learning just as we’re doing here at OU’s Libraries. But people who haven’t learned to read in their youth miss out on a great deal. Basic reading skills need to be developed by age 8-9.  Trying to create a reader later than that is extremely difficult. Yet, many students arrive on the doorsteps of a typical University without that skillset in place. Many adults live their lives without this ability.

VR gives us the possibility of filling that gap. For instance, what if we took publicly available, out-of-copyright works that are available as full-text and downloaded some of those, and then what if we could write code to match up terms in a text with VR models that would illustrate what those terms mean?  Doing so in such a way that when a reader is working their way through a text (in a VR headset), they could pause, point to that link and have the VR model play.  For instance, the word "Paris" might appear in a text.  What if we could then allow the reader, who’d likely never been there, to take a short 2-3 minute video tour of Paris, i.e. one where they could choose walkways to follow, and explore at will? Imagine if they could do that with a large number of the phrases/words in a body of text. Would that result in their beginning to understand the kind of visualization that reading creates for others who learned those skills at an early age?  I’m not certain and there is much to be done to test this idea, but I think it’s worth pursuing. Think about how this could transform community members’ lives.  Think about how you, as a librarian, would feel if you helped someone who didn’t think they wanted to, or couldn't, read learn to experience what we as librarians have experienced all of our lives as a result of reading?  
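
To make the term-matching idea a little more concrete, here is a minimal sketch in Python. It scans a passage for terms that have an associated VR model and records where each link would sit in the text. The catalogue entries, the model identifiers and the function name are all hypothetical placeholders invented for illustration; no such catalogue exists yet, and a real system would have to deal with ambiguous terms, inflected forms and much more.

```python
import re

# Hypothetical catalogue mapping lower-cased terms to VR model identifiers.
# These entries are illustrative assumptions, not a real collection.
VR_MODELS = {
    "paris": "vr-tour/paris-walk",
    "eiffel tower": "vr-model/eiffel-tower",
}


def find_vr_links(text, models=VR_MODELS):
    """Return (start, end, matched_text, model_id) for each catalogued term in text."""
    if not models:
        return []
    # List longer terms first so a multi-word phrase wins over a shorter
    # term that starts at the same position.
    terms = sorted(models, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), m.group(0), models[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sample = "She arrived in Paris at dawn and walked toward the Eiffel Tower."
    for start, end, matched, model_id in find_vr_links(sample):
        print(f"{matched!r} at {start}-{end} -> {model_id}")
```

The word-boundary match keeps "Parisian" from triggering the "Paris" tour. Whether a simple lookup like this is good enough for real texts is exactly the kind of thing that would need testing.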

I think it’s one more reason why librarians need to be embracing virtual reality.  As that grandmother said: “Do you see what I’m saying?”  

OU College of Law - OVAL (VR) Stations

Monday, November 28, 2016

FOLIO, acronym for “Future Of Libraries Is Open”? I’d suggest: “Fantasy Of Librarians Inflamed by Organizations”


There is a feature of librarian groupthink that is both a great asset and a tremendous liability to the profession. The asset is that Librarians have big and often inspiring hopes/dreams. The liability is that they don’t have the resources to achieve all of them. Nor do they have a good mechanism for synchronizing the two so that their hopes/dreams are in line with their resources. This can be inflamed by a willful refusal to examine the historical record, to extract lessons learned and apply them to the future. Worse yet, it can be further inflamed by organizations with vested interests.  

Such is the case with the OLE, now recast/renamed the FOLIO project. Now, let me say, as I have before, that I’m a huge supporter of open source software.  As the Chief Technology Officer of an R1 research university, I understand the advantages and disadvantages of it, and have worked with it in libraries and companies. In my library, we run a lot of open source software, including DSpace, Fedora Commons, Drupal, Omeka and many, many more core infrastructure packages.  It’s a wonderful platform and can offer outstanding value.  

So, it’s with dismay that I watch what is happening with FOLIO. Because we’re simply betting a great deal on a bad bet, as it is presently framed.  We need to inject some reality into what is being said and done. We need to separate fact from fiction and the associated marketing hype. I hope this post will help move the discussions in that direction.

I recently keynoted the 2016 University of Houston Discovery Camp and had the opportunity to hear a presentation on FOLIO. I also attended the Charleston Conference and heard a presentation there. One of the talks was very well crafted, but, like many marketing talks, glossed over the realities in order to paint the picture the speaker wanted to paint. Some of that talk aimed to address issues I raised in my keynote talk from earlier that day, including:

  • Understanding that the OLE code base is essentially dead and that at least the Library Service Platform (LSP) portion is being restarted from scratch.
  • Questions about the suitability of micro-services as the core architecture.
  • Questions about the governance structure.
  • What is the true target market?
  • Is the core architecture being used really multi-tenant?  What are the implications if it isn’t?
  • Is it realistic to imply the product delivery date will be 2018? 
These are all topics covered in my previous blog post on OLE. However, as a result of the presentations I heard at the Discovery Camp and the Charleston Conference, some new questions and points need to be raised and added to the list. These include: 

1.  Can the library community support the vision this project entails?  

At the Discovery Camp, there were over 130 academic librarians attending. I did a quick poll asking how many of those librarians worked in libraries which had programmers on staff. Fewer than 12 people (less than 10%) raised their hands (and note that some of those might have been from the same institution, but for the sake of discussion, let’s say they were not). I then asked how many of them had programmers waiting for projects to perform. ZERO hands were left in the air. I know I’m in that situation at my library. Even if I wanted to, it would be 1 - 2 years before I could assign anyone to this project for any significant amount of his or her time, and I have a moderately sized programming team for a library.   

When the FOLIO rep took the stage, the response to this issue was: “250 libraries indicated interest in providing development assistance, and there were a total of 750 attendees at 4 recent seminars.”  OK, let's take this apart and inject some facts and reality into that response.   

> First, librarians are usually spending frugally and they feel an obligation to investigate options so they can make informed choices. Open source software is often believed to be a lower cost option (although it’s certainly not always true,) so a library wants to evaluate the option. Their attending a session is clearly not the same as saying they’re definitely going to download the product and put it into production! It simply means they’re interested. 

> Second, 250 libraries saying they want to provide development assistance can mean very little. While it seems wildly out of fashion these days, let’s bring some facts into the discussion and look at the size of the development teams of companies/organizations that have built a Library Service Platform from the ground up. Let’s do this by using Marshall Breeding’s Library Technology website data on staffing levels for the only two organizations that have successfully developed, from the ground-up, Library Service Platform (not ILS!) products. Those would be OCLC and Ex Libris (ProQuest), so let’s focus on those. Next, using numbers from that chart and the number of products they’re supporting (from the same chart), one can do a rough calculation on how many hours were likely devoted to the Library Service Platform software development.  Then using the year those organizations have publicly reported they started their work on the Library Service Platforms, calculations would show that in total person years of development of these products, OCLC = 542, Ex Libris = 447.  

If there are 250 libraries saying they could provide development assistance (which again, doesn’t necessarily mean a programmer), and if 15% of those could provide a .25 FTE programming position on a continuing basis (I’d really like to know who those are as I can’t find any of them in my informal surveys) to help develop the product, and even if the commercial organizations involved devote another 3 FTE just to this development effort -- even if all that were true -- it would take decades of effort to match either Ex Libris’s effort or OCLC’s effort to date.  Now, it is true that libraries could proxy, by financial contributions, development resources that might mitigate this timeline.  However, given the state of higher education funding in the United States today, unless the private institutions take on this load fully (also unlikely) and even looking into the future, this still seems a very long shot.
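
The rough arithmetic behind "decades" can be checked with a short sketch. It uses only the figures quoted in this post (250 libraries, 15%, 0.25 FTE each, 3 vendor FTE, and the 542 and 447 person-year estimates). Treating one FTE as one person-year of development per calendar year, and ignoring the fact that the commercial products keep moving, are simplifying assumptions made here.

```python
# Rough check of the staffing arithmetic, using only figures quoted in this post.
# Assumption: 1 FTE contributes 1 person-year of development per calendar year.

LIBRARIES_INTERESTED = 250   # libraries indicating interest in helping
SHARE_CONTRIBUTING = 0.15    # share assumed to actually supply a programmer
FTE_PER_LIBRARY = 0.25       # programming effort per contributing library
VENDOR_FTE = 3.0             # additional effort from commercial organizations

# Estimated person-years of development to date, from the calculation above.
PERSON_YEARS_TO_DATE = {"OCLC": 542, "Ex Libris": 447}


def annual_capacity_fte():
    """Combined library and vendor development effort per year, in FTE."""
    library_fte = LIBRARIES_INTERESTED * SHARE_CONTRIBUTING * FTE_PER_LIBRARY
    return library_fte + VENDOR_FTE


def years_to_match(person_years):
    """Calendar years needed to accumulate the given person-years of effort."""
    return person_years / annual_capacity_fte()


if __name__ == "__main__":
    print(f"Combined capacity: {annual_capacity_fte():.3f} FTE per year")
    for name, effort in PERSON_YEARS_TO_DATE.items():
        print(f"{name}: about {years_to_match(effort):.0f} years to match {effort} person-years")
```

Under these assumptions the combined capacity is about 12.4 FTE per year, which puts matching Ex Libris's effort at roughly 36 years and OCLC's at roughly 44 years.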

Which is a nice segue to pointing out that, during the time FOLIO is being developed to bring it to the functional level of the current Library Service Platform products, those products won’t be sitting still. They’ll continue to move forward; and, therefore, FOLIO will always, always be behind the competitive offerings unless a LOT more resources are devoted to it!   

Let’s remember, OLE went on for eight years (starting in 2008) using this same proposed resource model and never did produce a complete production ready system that could be installed by all types of academic libraries.  As I pointed out in my last post, the OLE software only went into semi-production status in three research libraries because it was still missing a lot of functionality.  

Enterprise software requires enterprise level development teams, and we simply do not have that available to us in the form of library part-time open source software developers.   Enterprise development teams need programmers who can focus full time on the extremely complicated nature of the work they're doing in our libraries.  They also have the advantage of working towards a common, defined product definition (although this is a point of contention for many librarians, as we'll discuss below).   

If you like challenges, just try to get 10 agreed upon functional definitions for any one major library function, out of a group of 250 libraries.  And if you think that's good, then be sure to think about why you're frustrated with your current ILS, which ties directly back to what it takes to maintain all those versions, to coordinate software development between them, to run them on different hardware platforms, to synchronize testing, documentation, releases and installations. In other words, yes, think back to the days of locally installed ILSs and back up the vision for your library by 25+ years. Tell your software developers to emulate that in the open source code they write/contribute because you enjoyed it so much. Oh yes, and be sure to call that new open source software you develop FOLIO.  Because that's what you will be creating.

It is, quite simply, insanity that for as long as libraries have been using automated systems, we can't find common vision or "shared-best-practices" for our workflows.  You can find it at the Dean/Director level, but it rapidly starts disintegrating as you get to the lower levels of the Library. It is truly a failure of the profession and its leadership.

Does anyone besides me see the vicious cycle here?  Quite simply, the answer to the question I posed at the start of this section is: “NO, no, we can’t support this vision.”  

2. Listen to what you’re REALLY being told by the “organizations” involved!  

I found it very interesting to note that in the Charleston Conference presentation, done by an EBSCO rep, the following was said: "The code would be released in 2018." But they also said this: “The list of things that can be developed are” (you caught that, right? “can”!):

  • E-Resource management
  • Acquisitions
  • Cataloging
  • Circulation
  • Data Conversion
  • Knowledge Base
  • Resource Sharing 
And "that you will be able to expand" (again, note the wording here: "will be able to"!) the Library Service Platform to include:
  • Discovery
  • Open URL Linking
  • Holdings Management
  • ERP integration
  • Room booking
  • Analytics and student outcomes
  • Linked Open Data
  • Data mining
  • Research data management
  • Institutional Repositories
  • And ….more!….
Once again, analyze what they’re saying. They’re telling us that the release date of 2018 is actually just going to be the kernel of the system, not the complete system. The good news?  That might even make that date doable. But it also confirms the calculation above, that if you want to have a complete system, you need to add on the years of development required to put all these other modules in place, which will mean a date more like what I’ve shown above, decades from now.

Now ask yourself a really hard question: Do you really believe that this list describes the functional system you’ll want, or even need, decades from now??  I doubt it. I know it doesn’t for me, not even if the numbers were far less, like a single decade from now.  

At the Charleston Conference, I heard James Neal, incoming President of ALA, say in his opening talk:  “I propose that by 2026, there will be no information and services industry targeting products to the library marketplace. Content and applications will be directed to the consumer, the end user." While I might argue his date, I will not argue the direction of his statement. It’s not that far in the future.  Think about that and the implications it has for FOLIO (and really, a lot of library software). 

3. “Follow the money” 

This is a saying that is frequently used to remind people that, in analyzing a situation, they should look at the financial benefits (and to whom they flow) in order to understand why something is happening. Let me suggest we do that here.  EBSCO is providing most of the financial backing for this project, which on the surface seems quite laudable. They’re a good company with a long history in libraries and are privately owned by a family with a long history of philanthropic activities. BUT. They’re also a major company selling a discovery product without an ILS or Library Service Platform that they own (unlike both OCLC and Ex Libris), thus they can’t provide the tight integration that those solutions provide. That puts them at a severe disadvantage in the marketplace.  So they have a vested interest in diverting you from buying one of those solutions, because they don’t want to lose discovery system sales or content sales. Even if they wanted to buy another LSP vendor/product to offer, they’re too late. Any LSP products that were for sale at all have already been bought. So, EBSCO is in a squeeze. Again, this is a big disadvantage for them and is why they're driving the FOLIO product towards the Library Service Platform when, in fact, it should be moving to address market segments where we really do need new tools, i.e., research data management systems, knowledge creation platforms, open access/OA publishing systems, GIS systems, etc. But that’s not where EBSCO wants to go, and since they’re providing the funding, the Library Service Platform is the focus.

We also had an EBSCO VP standing on stage in Charleston, telling us: “The vendor marketplace is consolidating therefore we must develop alternatives!”  Really? It seems to me we DID develop alternatives (look at this chart Marshall Breeding compiles on this topic), and over a long period of time, the market has acted to consolidate and focus on the solutions that best meet its needs. In so doing it has narrowed the field and selected two superior products to fill the Library Service Platform niche, i.e., WorldShare from OCLC and Alma from Ex Libris.  We don’t need more alternatives. The market has already spoken and, for the foreseeable future, these are the successful and widely adopted products.   Alternatives at this point are only going to further fracture a profession that needs to be spending its efforts on new areas that allow librarians to add new value.  Simply put, at this point in time, that is not in reinventing the ILS as an open source software solution!

4. What is the governing structure going to be for this effort?  

Have you noticed how little is being said about this topic? Ask and they’ll point you to this webpage. Except there are really only two pages here, neither of which addresses the specifics of how this project will be governed. Things like:

  • Who sets code directions?  
  • Who decides and commits what gets into the code base? 
  • How are conflicts resolved? 
  • How is Open Library Foundation funded? 
  • By whom and in what amounts? 
  • Who are the board members? 
  • How do they get on the board? 
  • Who names them (EBSCO)?? 
  • How will they make strategic decisions for the organization? 
Without this information, there is cause for concern. Lest we forget, OLE went through multiple governance structures, all of which, after Duke University Library’s initial work on the project, ultimately failed.  A large part of this was documented well in this post. I’m truly unclear why we want to relive this, but apparently some of us do. Again, it seems to me we should learn from history here and, as a result, this whole governance issue should be cause for deep, deep concern.


I’ll say what I’ve said before. There is a real place for a FOLIO concept (i.e. mass community collaboration around open source software) in the library profession. The concept needs far more work in defining governance, sustainability and the guidelines for mass collaboration BEFORE it takes on the effort of developing any products.  What it will not be successful in doing at this point in time is producing a Library Service Platform. (And please remember, the concerns above are my latest ones, in addition to all those expressed in my last post and only briefly summarized above.)

At the Charleston Conference, one of the librarian presenters, speaking about FOLIO, proudly said: “If FOLIO is a Unicorn, then I believe in Unicorns.”  I listened intently and thought: “You’d better, because that’s exactly the concept you're buying into here…”

There are a lot of good people involved in this effort, and for the sake of all concerned, we need them to do something that will be wildly and broadly successful for libraries. That something is not a Library Service Platform. We’ve already put somewhere between $5-10M into the failed venture of trying to build OLE for this purpose.  Now we’re queuing up, like tourists at a DisneyWorld ride, to do it all over again. Let’s stop fawning over FOLIO, as was done recently in Information Today, apply some of the vast store of facts we have at our disposal in our libraries, and realize we simply CAN’T afford to waste this money or time on this particular idea.

Let’s stop this fantasy, pause, redirect and get pointed in a direction that offers real value-add and solves some of the many real problems that we really need solved.

Wednesday, April 27, 2016

The OLE Merry-Go-Round spins on…

The news about the OLE (Open Library Environment) project has resulted in two reactions from me.  First, disappointment that my long-standing concerns about this project have proven correct, and second, disappointment that the profession of librarianship has seemingly forgotten what we know and teach our communities about the skills of accessing and using existing knowledge to perform critical analysis in support of creating new knowledge. Such is apparently the case with the announcement that EBSCO will support a new open source project to build a Library Service Platform (LSP).

Marshall Breeding has done an analysis of this news on the American Libraries Magazine website.  If you haven’t read it, I would suggest you do so as it will give you a solid foundation for the rest of this post. Return here when done.

Now let me say right up front, there are some very encouraging facets to this announcement.  These include:
  1. The involvement of some organizations with considerable business skills and savvy.   Both EBSCO and Index Data have been in business a long time and bring some much needed business analysis skills to the table.  This is good for the OLE project because it’s been sorely missing for a very long time.
  2. The fact that EBSCO is apparently pivoting in a substantial way to support open source software for the community.  I’m cautiously optimistic about this move.
  3. There are few people in the Open Source Software (OSS) business I respect more than Sebastian Hammer and Lynn Bailey.  Sebastian in particular was doing open source software before most librarians even knew what the term meant.  I’ve partnered with him in past business projects and know his expertise to be amongst the best in the field.  Lynn brings business skills to the equation and together they form an excellent duo.  They have numerous OSS success stories to point towards.  This is good for the OLE project. 
  4. At the recent CNI Membership Meeting in San Antonio, in a session he led, Nassib Nassar, a Senior Software Engineer at Index Data, discussed their plan to use microservices as the foundation for this new project.  Microservices (an evolved iteration of SOA) focus on small, loosely coupled software services.  A very good explanation of microservices can be found here.  This is a promising new architecture that has been evolving over the past several years and certainly might have applicability for future library software projects (see below for more on this point).
Now for a little history on the OLE project: Back in August 2008, (note, that was nearly EIGHT years ago) according to the press release, the Mellon Foundation provided an initial grant of $475,000 to support OLE.  The announcement said: “The goal of the Open Library Environment (OLE) Project is to develop a design document for library automation technology that fits modern library workflows, is built on Service Oriented Architecture, and offers an alternative to commercial Integrated Library System products.”

You’d be forgiven if you think that announcement sounds amazingly close to the most recent announcement, which says: “It carries forward much of the vision… for comprehensive resource management and streamlined workflows.” You'd also be forgiven for thinking that after eight years, we might expect something more?

But for now, let’s work our way through this announcement, using what we do know about the history of library automation systems, in order to pose some questions I really think need to be asked:
  1. Is the existing OLE code base dead?  Marshall might have been a little too politic in his article, but I’ll say the obvious:  After eight years (2008-2016) of development and grant awards totaling (according to press releases) $5,652,000 (yes, read that carefully, five million, six hundred and fifty two thousand dollars) by the Mellon Foundation and who knows how much in-kind investment by affiliated libraries (through the costs of their staff participation in the OLE project), it has all resulted in what Marshall points out in his article: “EBSCO opted not to adopt its (Kuali OLE) codebase as its starting point.” And the “Kuali OLE software will not be completed as planned, but will be developed incrementally to support the three (emphasis my own) libraries that currently use it in production, but it will not be built out as a comprehensive resource management system.”  For those of you not experienced in software development, that phrasing is code for: “it’s dead.”  They’re going to start over from scratch. Sure, they’ll reuse the use cases, but for well over $5.5M, we should have expected, indeed demanded, a lot more. Let’s also remember that this means a number of very large libraries, all over the country, delayed their move to newer technology while waiting for OLE.  They stayed on the older, inferior ILS systems and they and their users suffered as a result.  How do we factor that cost in?   Now, sure, we’ll call this new project OLE to paper over this outcome, but folks please, let’s be honest with ourselves here: OLE has failed and it has carried a huge cost.
  2. Do we really need microservices?  Yes, it’s the latest, greatest technology.  But do we need it to do what we need to do?  And do we fully understand all the impacts of that decision? What value does it bring us that we don’t have with existing technology?  Is it proven using open source software in our size market?  (Yes, Amazon uses it.  But Amazon is a huge organization with huge staff resources to devote to this.  Libraries can’t make either claim.) We must answer the question: What is the true value of building a library system based on this? What will libraries be able to do that they can't do with current LSP technology?  Why should we take this risk?  Do we really understand the costs of developing and maintaining software using this technology? Do we really want to experiment with this in our small and budget-tight community?
  3. Governance – Haven’t we been here before? What’s different?  A new Open Library Foundation is being envisioned to govern OLE.  But hasn’t this been tried?  I thought the reason the Kuali association was put into place was because the financial need and the overhead of running a non-profit organization were too taxing on the participant organizations?  So, the Kuali association made a lot of sense from that viewpoint.  But now the libraries are going to return to a separate foundation?  Why is it going to work this time when it didn’t previously?  Because we have vendors at the table?  Because we think we’ll enroll more participating organizations?  (See later points on this subject).  Because we found out that charging libraries to be a full participant in an open source software project didn’t fly with the crowd?  Given that library budgets and staffing are stretched to the limit, what is the logic that suddenly says we’re going to now have the capacity to take this new organizational overhead on?  I admit I’m totally mystified by this one. This choice seems to have an incredibly low probability of success.  The merry-go-round continues…
  4. So, OLE will again be solely aimed at academic libraries?  This new project is, once again, focused on academic libraries. This is good.  And it’s bad.  It’s good, because as I’ve argued countless times in this blog, success in a software project is dependent upon building a good solution that addresses a market need so thoroughly and successfully that it finds widespread adoption as early as possible within that segment.  Then, and only then, should a project branch out to address related segments.  To do so too early can result in lower adoption rates (see OCLC’s WorldShare, a product trying to address too many markets concurrently, and their resulting low adoption rate in academic markets.  Compare this to Ex Libris’ Alma, a product focused on academics and experiencing significant success as a result).  The reason this focus is bad is for the reasons I pointed out, back in 2009, in this blog post.  Back in 2009 they also focused on academic markets, but I questioned how they would add additional market segments, what the competitive positioning and market share would leave for OLE, and whether it would be enough to sustain the product and/or its development.   Again, in 2012, I did an analysis of OLE and I also questioned the chosen architecture, saying: “OLE is going to miss out on the associated benefits of true multi-tenant architecture.”  Well, here we are anywhere from four to seven years later and it appears those concerns were entirely correct, i.e. the choices made were wrong.  It gives me little satisfaction to say this, but I think people ignored the obvious.  Given this most recent announcement, I’m concerned that, once again, the merry-go-round is going to continue.
  5. Multi-Tenant – redefined? The choice of microservices as a new architecture is definitely interesting.  But it has some implications I don’t think many fully understand.  This new version of OLE, based on microservices will, quoting Marshall’s article: “provide core services, exposing API’s to support functional modules that can be developed by any organization.”  Let me share my interpretation of that statement: What will be delivered on first release is probably a very basic set of services, and what exactly that will include needs to be very openly and transparently communicated to the profession ASAP.  Without that, there is no way to understand whether it means basic description processes and fulfillment (circulation), or whether it is just a communication layer on top of databases, for which users will then have to write additional microservices to provide each of the following: selection (acquisitions and demand driven acquisitions (DDA)), print management (circulation, reserves, ILL, etc.), electronic management (licensing, tracking, etc.), digital asset management (IR functions), metadata management (cataloging, collaborative metadata management) and link resolution (OpenURL).  As I’m sure you realize, that’s a lot of additional microservice code that someone is going to need to write to make a fully functioning system.  Plus, I’ll just say, as someone who has been involved in software development for nearly three decades, I find it hard to believe that you can write all these additional related microservices and not need to change the underlying core infrastructure microservice.  Or at least, at the start of designing that core infrastructure, you would need to know in some detail how those other pieces of code are going to work so you can provide the supporting and truly necessary infrastructure calls/responses back to the related microservices.
If that doesn’t happen, then when a major new microservice is developed, that core will have to be modified and updated. So, why am I saying multi-tenant appears to need redefinition in this model?  Multi-tenant means there is one version of that core code, perhaps running in multiple geographic locations for failover reasons, but the same exact code running everywhere.  This brought us the capability to try and move forward in some big ways on establishing best practices, being able to compile transactional analytics that would allow global comparisons of libraries’ effectiveness and, as mentioned above, real failover capabilities, which given global weather conditions are becoming more and more important.  But now, with the microservices version of multi-tenant LSP’s, we’re back to everyone customizing their implementation, and only that common shared core code becomes truly multi-tenant.  Everybody else is doing something different.  That is great for allowing customization to unique institutional needs, but it sacrifices many of the benefits of true multi-tenant software design.  Plus I have a very hard time, given the competitive nature of vendors in our marketplace, believing for a second that one will agree to serve as a failover location for another vendor running the service.  Maybe, but I’m definitely not holding my breath.
  6. 2018?  Who are we trying to kid? Marshall’s article contains another key phrase:  “Kuali OLE, despite its eight-year planning development process, has experienced slow growth (emphasis my own) beyond its initial development partners, and it has not yet completed electronic resource management functionality.”  Indeed, that would be true.  At the time of this writing, there are three (yes you can count them on one hand and have fingers left over) sites in “production” mode, which apparently means production minus the capability to handle electronic resources (a fairly major operation in academic libraries, wouldn’t you agree?).  So, I will admit I nearly fell out of my chair when I heard it said at CNI (later confirmed in Marshall’s article) that they expect to have an initial version of the software ready for release in early 2018.  My goodness.  Please pass me some of whatever you’re drinking, because it sure must be a good energy drink, or more probably a hallucinogen!  Some points to consider here:
    • OLE was worked on from 2008-2016 and is still missing functionality.  It was, as mentioned above, put into production by three libraries.  However, there were, according to the website, 15 partners, although two of those were EBSCO and HTC Global, vendors with an interest in the code.  I believe that’s out of 3,000+ academic libraries in North America?  Slow growth indeed.
    • HTC Global was hired as a contract programming firm to expedite the development of the code, precisely because the number of programmers needed to do the project in a timely manner was clearly NOT available from the library community at large.  Do these people really, REALLY think that, because they’ve now broadened the scope, libraries are now going to assign their limited (and frequently non-existent) programming resources to this project?  I probably have a one-year backlog for my programming staff before I could even think of assigning resources to this project -- in a research library.  As I keep pointing out to my colleagues when discussing open source projects, we have to remember many academic libraries have NO, zip, zero, zilch programmers on staff.  Where oh where do they think they’re going to find the needed programmers to enlist to get this massive project done? I’ll say the obvious:  It won’t happen.
    • As noted above, what will likely be coming out in the 2018 version of OLE is just the core code.  So, add a lot of time for those additional and oh-so-necessary other microservices modules needed to make this a complete project.  Index Data really needs to be transparent (by posting on their website) about exactly what the actual deliverable of v1 will be.  Libraries need to know what they will need to build on top of it as additional microservices (think of microservices as functional modules). This will clearly mean extra cost and probably extra time to get to “complete” (maybe it could be done in parallel with careful planning).
    • Let’s also remember the definition of a “complete” product is ever evolving.  Even if they could get something out in two years, WorldShare and Alma are not going to sit still. They’ll be 24 releases further down the road.  So “complete” is a moving target.
    • Let’s also take a moment to study some historical data here.  The Library Journal Annual Automation Reviews have sometimes provided some staffing analysis (the last staffing report was in 2013) for the firms involved.  If we look at the major players that have tried to develop a “true” Library Service Platform (Ex Libris, OCLC and Serials Solutions), we see reported staffing numbers of between 130 and 190 people (granted, they were working on more than just the LSP within their organizations, but you can bet the majority were working there).  One of those projects (Intota by Serials Solutions) never made it to the street.  Two (WorldShare by OCLC and Alma by Ex Libris) did.
  7. Do we have options here?  Of course, there are always options.  There are at least two that come to mind, and I’ve certainly advocated some of this before:
    • Librarians have already worked together in a collaborative to create a Library Service Platform.  It’s called WorldShare and it has been developed by OCLC.  Librarians need to collectively call upon the collaborative they theoretically own, and help govern, and say: “We want to make WorldShare open source.”  It certainly has issues, but it’s a far more realistic vision than the one being described for the next generation of OLE.  Then the microservices could be extended out from a solid, true multi-tenant platform with real API’s.  For that matter, if those microservices worked with WorldShare, then there should be no reason, provided similar Alma API’s were supported by Ex Libris (or could be added), for those very same microservices not to also work with Alma.  This would then broaden the adoption base for the microservices and thus the support for them.
    • Again, let’s take a moment to examine history and see if there is something we can learn from it to apply in today’s situation.  For instance, look at the history of some of the early vendors in the library automation space:  NOTIS started out with a pre-NOTIS real-time circulation module. Data Research started out in libraries with a Newspaper Indexing module, which eventually gave way to ATLAS (A Total Library Automation System). And Innovative Interfaces, per their website: “Innovative provided an interface that allowed libraries to download OCLC bibliographic records into a CLSI circulation system without re-keying.”  None of these systems started out as a comprehensive, do-it-all solution.  They started out with niche products and responded to market opportunities until they ended up shaping the products we have today. My point is this: If OLE wants to build a comprehensive library service platform solution, they clearly can’t do it in the time frame needed.  So, instead, they need to start out with a niche product that addresses a key market need (perhaps managing research data? providing a citation tool for research data? a library linked data solution that integrates with existing search engines?), and then drive that product to a leading position in the market.  Only THEN should it start moving sideways to encompass other functionality such as is found in an LSP.
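To make the concern in point 5 above concrete, here is a minimal sketch of a shared core that exposes a fixed set of capabilities to separately written functional modules. Everything in it is hypothetical: the `Core` class, the capability names, and the module names are invented for illustration and are not taken from OLE, Index Data, or any announced design. It only shows the general shape of the problem: a module that needs something the core does not yet offer cannot run until the core itself is changed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Core:
    """Hypothetical shared core: the only code identical for every site."""
    version: str
    capabilities: Set[str]
    modules: Dict[str, Callable[[dict], dict]] = field(default_factory=dict)

    def register(self, name: str, needs: Set[str],
                 handler: Callable[[dict], dict]) -> None:
        # A module can only be plugged in if the core already provides
        # every infrastructure capability the module depends on.
        missing = set(needs) - self.capabilities
        if missing:
            raise RuntimeError(
                f"core {self.version} lacks {sorted(missing)}; "
                f"the core must change before '{name}' can run"
            )
        self.modules[name] = handler

    def call(self, name: str, request: dict) -> dict:
        # Route a request to a registered functional module.
        return self.modules[name](request)


# Assumed example: a v1 core that ships with storage and authentication only.
core = Core(version="1.0", capabilities={"storage", "auth"})

# A circulation module that needs only storage registers without trouble.
core.register(
    "circulation",
    {"storage"},
    lambda req: {"status": "checked_out", "item": req["item"]},
)

# An electronic resource module that also needs licensing support does not.
try:
    core.register("erm", {"storage", "licensing"}, lambda req: req)
except RuntimeError as err:
    print(err)
```

In this toy version the mismatch is caught at registration time. In a real project it would surface as a core redesign, which is exactly why the contents of the first release matter so much.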
Some remaining questions

Of course, this announcement is being positioned as a big step forward and a positive development.  But it seems to me that in addition to the questions posed above, there are some additional tough questions to be asked before the profession blindly plunges ahead here:
  1. Why did OLE fail?  There was a LOT of time and money spent to produce essentially “use-cases”. Do we really understand what went wrong and what needs to be done differently?
  2. Why did the foundation model/associations fail?  What will be different this time?
  3. Are we entrusting this new version of OLE to the same administrative people who did the previous version? Why?  Don’t we owe it to ourselves to think carefully about the leadership of the project? Is the addition of Index Data and EBSCO enough? We need to think carefully about both governance and administration.  What will everyone do differently to ensure the project’s success this time?
  4. What are the lessons to be learned about open source development for an enterprise module?  Is the library community truly large enough and well resourced enough to support the development of an enterprise, foundational, module for libraries such as the Library Service Platform?  (It would appear not, but I’m willing to be convinced with appropriate data, but I warn you, that’s going to be a tough sell!)
Librarians are some of the most wonderful, positive people in the world.  But here is a time where the rose colored glasses need to come off and we need to ask these serious questions, get some thoughtful answers and do some serious analysis.  We should use our existing knowledge base in order to determine the best path forward.  Otherwise this crazy merry-go-round called OLE is just going to keep spinning in a circle with no real forward progress.  We can’t afford for this to happen again.

Thursday, October 8, 2015

Another perspective on ProQuest buying the Ex Libris Group.

The dust has settled a bit and I’ve had the opportunity to talk to senior executives at both ProQuest and Ex Libris Group about the recent announcement that Ex Libris has been acquired by ProQuest.  Now it’s time for us to sit back and start analyzing what has just happened to a couple of the major suppliers of library automation, and by any measure, this was a BIG event.  

I wrote a series of posts about Library Service Platforms several years back (2012). They apparently met a real need in the profession, as those posts have been viewed over 40,000 times as of today. The first post in that series is still very valid, but much of what I’ve said about the companies in subsequent posts has since changed. Of the companies I wrote about then, VTLS was sold to Innovative; Kuali/OLE has gone through massive changes in structure and backing (it’s open source, but not totally, at least not by the classical definition); WorldShare by OCLC has matured a great deal, but the organization behind it is still convulsing with changes under the new OCLC leadership; and finally, there is Sierra by Innovative, which now seems to be in a very questionable spot.

In fact, when it comes to Innovative I’m predicting that we’ll see ownership changes of that company as soon as they can be arranged.  You simply don’t force out the CEO on a day’s notice and install a new CEO from the equity owner company with any plan other than finding out how fast you can sell the company.  The problem for Innovative (and I told this to the previous CEO shortly after he arrived at the company) is that they’ve stayed with the old architecture way too long.  Now whoever buys the company is going to be facing the massive task of totally rewriting and/or developing a new platform that is a true multi-tenant, cloud based architecture, i.e. a truly competitive Library Service Platform (see this post for a definition).  That's a sizeable task that is slow, costly and has a target market of shrinking size.  My guess is the previous CEO was probably pushing to make that investment and when the equity owners looked at what that was going to cost and the return-on-investment, they decided to pursue another path with their money.  Parts of that sound familiar?  Yes, that would serve as an excellent segue back to the ProQuest / Ex Libris announcement.

Now there have already been a couple of excellent posts published that analyze this acquisition announcement in some detail, and they do so quite well and quite fairly.  If you haven’t read the post by Marshall Breeding and the post by Roger Schonfeld, I’d certainly recommend you do so.

Trying not to repeat what Marshall and Roger have said, here’s where I see important differences from what they’ve said in their posts:

  1. ILS’s vs LSP’s.  Integrated Library Systems (ILS’s), even when hosted in the cloud, and Library Service Platforms (LSP’s) are radically different architectures with huge implications for the future of library technology and thus libraries.  I detailed all this in a post I’ve already mentioned a couple of times, but it’s worth saying again: multi-tenant software is the future.  Simply hosting multiple virtual instances of an ILS is not an LSP and will not get you where you need to go in another 3-10 years.  It simply won’t. If you go down that path you’re going to eventually get left behind -- way behind.  If you choose that path, understand it’s only good for the short term. (See my post on the “coming divide” for a full explanation).  I would also take serious exception to Schonfeld’s belief that libraries may not need this kind of technology in the future because their resources are becoming increasingly digital.  While the latter is true, it doesn’t make the former true.  Most libraries still have massive print collections and, as a recent article in the NY Times described, we’re seeing publishers printing more books each year as the e-book business has seemingly hit a plateau, at least for now.  Library management systems will be around for a long time to come.
  2. Content-neutrality.  Let’s not lose sight of the fact that we’ve lost another “content-neutral” discovery vendor as a result of this acquisition.  That’s not a good thing for libraries, although most librarians ignore this reality.  In the end, I believe they’ll regret doing so. We’ve had yet another check-and-balance removed from our supply chain. This post explains why content neutrality is so important and why that loss carries a potentially high price for libraries.  So, in this regard, this is not good news.  OCLC with their WorldCat offering remains our only content-neutral discovery solution at this point outside of open source solutions (which don’t have an aggregated metadata database like Primo Central, which provides important functionality for libraries).
  3. Equity Ownership.  Ex Libris is no longer held by equity investors. It’s no secret that I’m not a fan of equity ownership of major suppliers to libraries. I understand how equity ownership works and I’ve detailed my related concerns previously in this post. Yes, Ex Libris did well under equity ownership for the very reasons I outlined in my post.  But the fact remains, they could have done even better and invested even more in their products and services had they not been sending so much of their profit to the equity owners.  I’m hoping with that aspect of the ownership now removed from the equation, we’ll see some accelerated product development in some much needed areas, like the discovery system, course management system integration tools, and some other needed product areas.
  4. Intota’s Future.  Despite what company executives will tell you, Intota has been languishing and a full product has never been released into the marketplace.  That reality has come at a steep price for ProQuest, as other companies now own large portions of the targeted high-end LSP market.  Of course one of those products was Alma by Ex Libris, now part of the ProQuest holdings.  So there is plenty of speculation that Alma will become the premier offering and Intota will eventually fade away entirely or the functionality that exists will be merged with Alma.  Certainly that’s possible, although company executives deny that and insist the choices will remain.  However, I think there might well be another outcome.   Alma has long been aimed primarily at the academic, corporate and national library markets.  That leaves public libraries and smaller academics thirsty for some competition in LSP offerings tailored more to their specific needs.  They really only have OCLC’s WorldShare at this point and I can easily see ProQuest re-aiming Intota towards those markets.  However, if I were betting, over the long-term, I'd go with Intota slowly merging with Alma and there being only one platform left, although possibly with two names to accent the different markets being served.
  5. Primo vs. Summon Discovery Systems.  As Marshall pointed out in his post, these products both have large and very devoted installed bases.  Neither product will disappear anytime soon, although pure business logic will dictate that over time, they will slowly meld together from the core outward until they are one.  But this will take many, many years, and I’d agree with Roger Schonfeld that the future of discovery systems in general is more questionable than the future of these two product offerings in particular.
  6. Will Ex Libris remain a separate company?  Yes, for now, I think that’s a safe bet.  But it’s important to look at ProQuest’s acquisition history here and to note that over time, other companies that have been purchased have been slowly absorbed (remember Serials Solutions?), with only the product names remaining as vestiges of those firms.  But for now, yes, it makes total sense for these organizations to largely remain separate, at least until company cultures are merged, operations are merged and everything is stabilized.
  7. What’s EBSCO’s next move?  Good question.  Clearly both EBSCO and ProQuest are trying to assemble end-to-end technology solutions for libraries.  EBSCO needs an LSP in their offerings.  They might be working on one behind the scenes.  Many people are speculating that buying Innovative or Sirsi/Dynix could be a step in that direction.  It could be, but as I outlined above, it’s a very problematic one because neither firm’s products have the multi-tenant architecture needed for a real Library Service Platform.  So, a total rewrite would be required to turn either offering into the needed solution.  EBSCO has a real challenge in front of them.

What’s the bottom line here? I personally have a lot of respect for both of these companies and their teams. From a business point of view, the acquisition is a very good move.  Library automation is a tough and challenging field.  These companies have very smart people at the helm. Right now, they have all the right people saying all the right things.  But that’s normal at this stage of an acquisition.  What will matter is what actually happens in the weeks and months ahead, because walking the talk is much harder to do.  So stay tuned.