Thursday, November 9, 2017

"Living under the API." Some things librarians need to consider.

At OU Libraries, we've been investigating interesting new tech products and, when requested, we provide input into the development of some of these products.  There truly are some new tech products in the pipes that are going to offer libraries a real chance to provide new value-add capabilities to our communities of users. But. These new products are also raising some real concerns. Those concerns will require us to think very carefully as we develop, implement and use them, and not just us at OU Libraries, but all librarians. Let's look at just a couple of those:

  1. First, as is happening all around us, we’re increasingly seeing data and algorithms interwoven into academic scholarship and librarianship.  Clearly, machine processing is going to provide us with a whole new dimension of analysis and research results, but as with any new technology it is also providing a new set of challenges.  Two recent moments underscored this point for me.  The first was when we had a guest speaker, Dave King of Exaptive, give a talk at our recent Research Bazaar.  During that talk, Dave stated: “In the past, programmers wrote code to implement management decisions.  Now programmers are writing code that makes decisions and managers are trusting it because they don’t understand what is happening in that code.  This is dangerous.”  Dave's company writes such code, so it's really important to listen to what he is saying about how it gets developed.  The second moment, which underscored what Dave pointed out, was when I was reading a new book titled "Radical Technologies", which coined a phrase that describes a lot of what is happening with technology: people are increasingly "living under an API". APIs are Application Programming Interfaces that can embody algorithms that are controlling our lives.  It's happening in social media, search engines, even in your local Target and Walmart stores.  Yes, it can produce great value.  It can also be, as Dave said, very dangerous and increase problematic societal issues that we're already wrestling with full-time.  Another recent book, "Machine Platform Crowd", stated something equally important for us to remember in this discussion: “Technology is a tool. That is true whether it’s a hammer or a deep neural network. Tools don’t decide what happens to people. We decide.” So, when working with these new technologies, for instance, we at OU Libraries spend a great deal of time thinking about issues surrounding the support/development of critical thinking skills and reproducibility of research results.  We're concerned about the need for:
    • An understanding, by all parties involved, that software used in analyzing and processing data must be open source or extremely well documented, such that it can be clearly understood what is happening within the processing sequences of that code.  Now, clearly, proprietary vendors are not going to make their entire products open source in order to address this issue; that's well understood.  They need to recover costs and make a profit to sustain and grow their companies.  But the code bits that do actual analysis/processing do need to be either open source or openly and accurately documented.  In part, peer review of code and code logic would be one way to allow us to ensure the integrity of the work as well as to ensure that biases are not written into the code.  This is really important because if we don't watch for this, biases will be propagated over generations of research as results from one research project are used as the basis for subsequent research projects. For code to be peer reviewed as a scholarly product, it would also need to be: i) Documented, ii) Reusable by any scholar trying to replicate results, iii) Citable, and, finally, iv) Versioned (to ensure accurate reproducibility, including workflows).
    • The same set of issues occurs with the data the code operates upon, be it numerical data, visualizations, citations or the full text of publications.  When using data, we need to ensure the quality and the openness of the data so that others can verify and reproduce research findings.
    • These items are really not optional; they can't be if we are to ensure the integrity of the research done at our universities.
  2. A major secondary issue we're seeing is that many of these new tools are closely coupled to the content, which is also supplied by the vendor.  Now, as many of you know, I've been on the vendor side of the discussion table in my previous lives and I fully understand their desire to provide value-add as a way to increase the sale of their content.  Understood.  But it's a sales model from the past, not the future.  Analysis tools need to be decoupled from the content.  Our thinking, as librarians, is focused on the value-add of the tools on top of content, and thus on the need to use a given tool across all content, not just the content in one suite of offerings.  For us, not doing this poses real challenges in training community members and in the overall ease-of-use of library resources.  (We really don’t want to be in a position of saying: “Yes, we know you can do that with these databases, but you can NOT do it with those databases…”)  We believe what is required is a shift in the understanding on the part of content providers that content is increasingly becoming a commodity, and the value (and thus future sales) is coming in the differentiation of the tools provided on top of content.  The fact that vendors are producing these types of tools already shows an understanding of this, but the fact that they continue to couple the tools with only their content shows a bifurcation of thinking that we fear is not healthy for all concerned.
In order for these new technologies to be successful for both the organizations producing them and the profession of librarianship, these are issues that really need to be addressed head-on, all around the table, whether the tool is open source or proprietary.  As Librarians, we need to do our homework and our due diligence to ensure we understand, in detail, the topics involved.  We also need to insist that the technology allow us to continue to support our core values as well as those of the communities we support.


Tuesday, August 15, 2017

Déjà vu re library data and ownership rights

Recent events in the librarianship profession have brought me out of the slumber I've been in with this blog, although the actual causes for that slumber are separate and work related. 

I've found myself over the past two weeks marveling at the reaction of the many librarians contacting me about the recent purchase of BePress by Elsevier. That reaction was compounded by a former vendor of our library sending us a legal letter that raised my ire considerably. Both reactions are related to the issue of the library's data when stored in proprietary systems, and particularly, the library's rights to get that data back out of the system when the library is ready to move to a different system.

At that point, it's really too late to do what I'm about to say, but please, when selecting the new replacement system, make sure you do the following if you're buying a proprietary system.  Here is a list of things that need to be specified about the data in both the RFI or RFP and the actual contract between the vendor and your organization.  Those are:
  1. The Library owns its data.  No conditions apply or are allowed.  The Library owns ALL its data and all the rights associated with the use of that data.
  2. You have the right to request that data be extracted in a specified industry-standard format (and that format should be stated in both the procurement and legal documents).
  3. The cost for extracting that data should be identified in the contract.  That number should be reasonable for your institution and certainly should not exceed the cost you paid them to load it (but be fair and remember that inflation can apply over time).
  4. The time period for the delivery of that data should also be specified.  A reasonable time period is 30 days from the date of request.  Any more than that and you're going to be creating a backlog of updates to apply to it once it is loaded in your new system.
Now, here is the part that has surprised me since joining the academic world over four years ago, which is how the legal department of the academic institution is really somewhat disconnected from the librarians who are selecting systems.  For instance, most librarians write up their RFP and send it to purchasing, which then conducts the procurement and basically acts as a wall between the vendor and the librarians, filtering things through in a way that ensures a fair procurement.  Once the librarians select the system of choice, procurement works with the legal department to get the contract put together and in most cases, the librarians sit on the sideline during this process, until the document is signed. So, they're not involved in the negotiations and are totally reliant on their legal department to ensure crucial points are covered.  

I can tell you from experience that you can NOT always count on this for coverage of crucial points, because your legal people will not have the same understanding of your needs, nor the depth of understanding of the topics, that you do.  Their expertise is legal and negotiation, but if a vendor flat-out tells them "we don't do that", legal might give it away in the negotiation and you might lose out.  So, make sure you see the contract language BEFORE it is signed. If you don't understand what is being said, ask your legal people to explain it to you.  I'm usually stunned, when I ask librarians if they've read their vendor contract, either before or after the signing, at how many will tell me "no".  Mistake, mistake, mistake! You know when you pay for that mistake in this area?  When you go to leave and find out your data is locked in a proprietary format, will be delivered in that format and, because it is proprietary, can't be given to your new vendor without violating intellectual property laws.  Or that you're only entitled to a subset of what you think is your total data set.  Or that the cost of having your data extracted approaches the cost of a new replacement system.  Or that sure, you can have your data, but it'll take six months to get it.  Or... you get the picture.

We've lived through these issues before with integrated library systems and now apparently many have forgotten those lessons of history.  As Yogi Berra once said: "It looks like Déjà vu all over again."

Monday, January 9, 2017

Do you see what I’m saying? Why Libraries should be embracing virtual reality.

Galileo's World VR Station
Here at the University of Oklahoma Libraries, we introduced Virtual Reality (VR) to our community with the Galileo’s World Exhibition. I fully expected VR would be of interest to many of our younger exhibition attendees. However, I was certain we were on to something much bigger, when in giving tours of the exhibition, I found myself regularly helping elderly people into the station where we were doing the VR demos, in order that they too could see the Universe as Galileo thought it existed, and then letting them compare that to a VR demo we'd created of the Universe as we know it exists today, using images we obtained from NASA. Watching people light up as a result of that virtual reality experience, no matter their age, really made an impression on me.  

Innovation @ the Edge VR stations
We then decided to grow our virtual reality activities by taking the technology under our wing at our Innovation @ the Edge facility. Next we started the Oklahoma Virtual Academic Laboratory (OVAL), which included the installation of workstations designed specifically to support VR using Oculus Rift technology and to show our community of users how we could support their use of this technology within the pedagogy and research at our University.  Since then we’ve grown the program to embrace HTC Vive units and high-quality Google Cardboard headsets. We also assembled a team of Emerging Technologists who have a burning passion to introduce this (and other leading-edge technologies) to our community.

Innovation Hub OVAL
The combination of these things has resulted in us seeing instructors from over 20 courses, and from a variety of colleges across the campus, who have built exercises into their courses that require the use of the virtual reality units in the library labs. In addition, we’ve seen numerous other community explorations of the technology, some of which will leave you in total awe of what is possible and being done (some of these have literally brought me to tears, they are so moving, but let's keep that as a subject for a separate, future post).  All of this has been tremendously rewarding because it not only builds traffic in the libraries, it positions the library as the point to engage with, learn about and experience leading-edge technologies.  At the same time, it also allows us to connect the users with the many other resources and users of the library that will help facilitate their thinking and adoption of these types of technologies.  (Note:  All of this underscores our positioning as the “intellectual crossroads of the University”). Plus, it’s clearly going to lead to higher-level learning and research outcomes as we study the use of the technology and learn from those studies.

As wonderful as all of that is, those are not the only reasons I see for Librarians to embrace virtual reality. Recently I was sitting in the airport, waiting for a delayed airplane (an unfortunately frequent occurrence in my life) and heard a nearby grandmother trying to explain something to her grandchild. As she was doing so she was frequently interjecting the question: “Do you see what I’m saying?” It occurred to me, that question was a rather perfect encapsulation of what I wanted to describe in this post.  Let me explain.

The difference between information and knowledge

This is all related to what I’ve previously written about, in that Librarianship is about knowledge creation. David Lankes is probably one of the most recognized speakers on this topic and has written many wonderful columns on his blog, written some excellent books and given many, many speeches that touch on this topic. I consider his work among the best available today on this topic.

Of course we, as librarians, are not only about knowledge creation, we’re also about how to transmit the information upon which that knowledge is based, from generation-to-generation and how we then render it in ways that the next person to encounter that information can turn it into similar knowledge or improved knowledge in their minds. Now, admittedly information comes in many containers today, from the web, to audio, to moving images to books. The point is that information comes not only as symbols on a page, but also includes sound, tactile, and certainly additional forms of visual transmission beyond that of just symbols on a page. This is where virtual reality comes into play.

For the sake of clarification, let us take a short aside. The difference between “information” and "knowledge" is an important one and yet I find the terms are often used interchangeably and, I think, quite mistakenly.  As a point of clarification, I believe information is conveyed understanding and serves as input to thinking organisms, where it can result in the creation of the same, new and/or different knowledge, but hopefully at least the same knowledge. In other words, to my way of thought, knowledge only exists in thought, be that human or machine. Knowledge committed to transmittable forms, whatever they might be, is, to my way of thinking, information: it is a result of the knowledge that thinkers have created and/or possess.

Knowledge is defined by Merriam-Webster as: “information, understanding, or skill that you get from experience or education.”  Working with that definition, the part I want to focus on at the moment is that word: “experience.” It will help illuminate why I’m so certain that virtual reality is technology we need to embrace.

Virtual Reality (VR) as a means of conveying information upon which knowledge is based

History tells us that early humans conveyed information experientially, i.e. verbally or visually and in early times, via drawings/paintings. The means of conveyance was later expanded to include the written word. Over time, recorded sounds, video and other formats were added.

Reading is of course, for most librarians, the most endearing and longest running form of conveying information. We deeply appreciate the value that reading provides in our lives, both from the ability it gives us to experience new knowledge, be better informed and educated and thus to improve our lives, but also because it can entertain us. 

Now step back and think about reading for a moment. What makes a book memorable when you read it?  In doing so, you’ve consumed probably hundreds of pages of print and in some way, when you were in that act, you might have felt moved by that text. Maybe it was an emotional experience or maybe it was one of enlightenment or understanding. I often say there are “Aha!” moments, those moments when a switch is flipped in your brain and you find that you’ve identified with the writer, seen the world through their eyes and experienced their understanding. You then possess new knowledge as a result. 

Kevin Kelly’s recent book “The Inevitable” (highly recommended reading) says this: “Some scholars of literature claim that a book is really that virtual place your mind goes to when you are reading…. When you read you are transported, focused, immersed.” 

I totally agree with that and yet not all people are text-based consumers of information. A September 1, 2016 Pew Report tells us: “65% of Americans have read a print book in the last year” and that “(28%) have read an e-book”. While those are not horrible numbers, they do indicate there are still large segments of the American population that are not reading, not experiencing the information needed to create new knowledge and thus likely have become somewhat stagnant in their learning. Or they become easily subject to the “post-factual” society that we all now find ourselves wrestling with (as well as some of the horrendous outcomes that have resulted).

So, is there something we can do to move the needle back in the "reader" direction?  Well, let’s go back to that grandmother’s comment: “Do you see what I’m saying?” Virtual reality can be another way to help people experience, i.e. “see” that type of virtual place that many of us experience as a result of reading.  VR is engaging in part because it’s so very life-like. It’s enlightening because we can use it to convey information and provide enhanced learning just as we’re doing here at OU’s Libraries. But people who haven’t learned to read in their youth miss out on a great deal. Basic reading skills need to be developed by age 8-9.  Trying to create a reader later than that is extremely difficult. Yet, many students arrive on the doorsteps of a typical University without that skillset in place. Many adults live their lives without this ability.

VR gives us the possibility of filling that gap. For instance, what if we took publicly available, out-of-copyright works that exist as full text and downloaded some of those, and then what if we could write code to match up terms in a text with VR models that would illustrate what those terms mean?  We would do so in such a way that when a reader is working their way through a text (in a VR headset), they could pause, point to that link and have the VR model play.  For instance, the word "Paris" might appear in a text.  What if we could then allow the reader, who’d likely never been there, to take a short 2-3 minute video tour of Paris, i.e. one where they could choose walkways to follow, and explore at will? Imagine if they could do that with a large number of the phrases/words in a body of text. Would that result in their beginning to understand the kind of visualization that reading creates for others who learned those skills at an early age?  I’m not certain and there is much to be done to test this idea, but I think it’s worth pursuing. Think about how this could transform community members’ lives.  Think about how you, as a librarian, would feel if you helped someone who didn’t think they wanted to, or couldn't, read learn to experience what we as librarians have experienced all of our lives as a result of reading?
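
To make the term-matching idea a little more concrete, here is a minimal sketch in Python of what that first step might look like. It is only an illustration under stated assumptions: the catalog of terms, the model identifiers and the function name `find_vr_links` are all hypothetical, and a real system would need a far richer catalog and a VR viewer to actually play the models.

```python
import re

# Hypothetical catalog: lowercase term -> identifier of a VR model.
# These identifiers are made up for illustration only.
VR_MODELS = {
    "paris": "models/paris_walking_tour",
    "eiffel tower": "models/eiffel_tower",
}


def find_vr_links(text, catalog):
    """Return a list of (start, end, matched_text, model_id) tuples,
    one for every catalog term that appears in the text."""
    if not catalog:
        return []
    # Try longer terms first so multi-word phrases win over shorter ones.
    terms = sorted(catalog, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    links = []
    for match in pattern.finditer(text):
        # Catalog keys are lowercase, so normalize the match before lookup.
        model_id = catalog[match.group(0).lower()]
        links.append((match.start(), match.end(), match.group(0), model_id))
    return links


sample = "She arrived in Paris and saw the Eiffel Tower."
for start, end, term, model_id in find_vr_links(sample, VR_MODELS):
    print(f"{term!r} at characters {start}-{end} -> {model_id}")
```

The character positions are what a reading application would need in order to turn each matched word into something the reader can point at; everything after that (loading and playing the model) is outside the scope of this sketch.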

I think it’s one more reason why librarians need to be embracing virtual reality.  As that grandmother said: “Do you see what I’m saying?”  

OU College of Law - OVAL (VR) Stations