Monday, February 6, 2012

A key librarianship differentiator: Finding information without bias

Yesterday’s New York Times contained another chilling article, entitled “Facebook is using you,” that tells us how we, as individuals, are having our lives shaped by massive data aggregations that describe us and over which we have little, if any, control. The article gives examples of how profiles derived from these aggregations are shaping hiring decisions, credit limits and, of particular interest to librarians, the information we’re seeing when we search online.

As librarians, we’re all painfully aware that the volume of information we need to process, sort and select from in order to provide meaningful search results to our users is rapidly exceeding our ability to do so. At the last Charleston Conference, Cliff Lynch, Executive Director of CNI, pointed out that “a scientific paper is published every 1 or 2 minutes.” Eric Schmidt is on record as saying: “If you recorded all human knowledge from the dawn of time to 2003, it’d take up about 5 billion gigabytes of storage space. Now we’re creating that much data every two days.” Any way you look at it, as librarians grappling with all that information, we need to find new ways, using new technology, to sort through all of this in order to best serve the needs of our library membership.

One piece of new technology that is increasingly being used is personalized search. Google first introduced the phrase “personalized search” in 2004. Then, in 2009, Google announced the availability of personalized search. Google’s approach was to use dozens upon dozens of “signals” to personalize search results for individuals. It sounded like a promising idea. However, as I always encourage librarians to remember, the primary need for Google is to sell ads, and that is VERY different from that of librarians, whose primary need is to help their membership build knowledge.

If you haven’t yet had the chance to read the book “The Filter Bubble: What the Internet Is Hiding from You” by Eli Pariser, I strongly encourage you to do so. It is a truly eye-opening account of the dangers of personalized searching when it is used without knowledge and forethought on the part of both the information supplier and the information consumer. (If you’re an audio/visual person, there is a 10-minute talk on the same topic by the author over on the TED site.)

Pariser’s book lays out the problem within the first few pages: Personalized search, as used by Google, means the same search, by two different individuals, could display radically different results, shaped by the criteria that Google (not the individual) has determined make up its description of you. The book goes on to explore the sources of some of those signals and how they are used in shaping that “description” of you. Then it examines the consequences of this type of searching on learning, politics, individuals, etc.

I’ll admit, some of those sources of Google’s signaling information were completely new to me and I found it deeply disturbing not only to see the extent of the information they’re using, but the ways they use it to shape the search results presented to me.

According to the book, one source of these “signals” that Google uses is a company called Acxiom.  Pariser tells us:
“Here’s what Acxiom knows about 96 percent of American households and half a billion people worldwide: the names of their family members, their current and past addresses, how often they pay their credit card bills, whether they own a dog or cat (and what breed it is), whether they are right-handed or left-handed, what kinds of medication they use (based on pharmacy records)… the list of data points is about 1,500 items long.”
Out of all this information that Acxiom has, Google apparently distills it down to some 56 items they’ll use to define you. (You might also want to read this book to find out how other companies, beyond Google, are using this information as well. That’s also a major eye-opener, but it’s tangential to this post.)

At a pure marketing level, we’re led to believe this kind of searching has the potential to be good. The goal, according to the marketing, is to quickly move to the front of the search results those items that appear most likely to be of interest or importance to you as an individual. While that makes sense, we need to remember that it’s based on a description of you (made up of signals) that you do NOT control and apparently can only modify over long periods of time by behaving differently as you use the Web.

So what if you are someone whose only educational certificate is a high school diploma? Do you think someone of that intellectual caliber should only see materials calibrated to that level of achievement? Or what if your income level is lower than that of the super-wealthy (an increasing concern for many in today’s society)? Should you, or should you not, see the same results shown to those individuals? Now you might say: Look, they’ve got at least 1,500 data elements on you (in the case of Google, narrowed down to 56) and you don’t know what they are, nor do you know the algorithms by which they are correlated, compiled and/or utilized, so it’s virtually impossible to say how these are controlling what you’ll see. You’d be absolutely right. It also means that when you do a search on the Web, you don’t know what’s defining you. Nor do you know how it is being used to decide what will be included and, more specifically, what will be excluded when results are presented for your search. The book does an excellent job of talking about all the ways this can be very problematic for people and the societies in which they live. It’s a very compelling read. But let’s narrow this conversation to libraries, librarians and our member communities, and to what the “Filter Bubble” means for librarianship.

Librarians are, as we noted above, dealing with the same information explosion. They have the task of distilling, out of all the information available, that which is the most authoritative, appropriate, authenticated and placed in context to meet library members’ needs. So librarians and their suppliers are also examining how to use information to personalize search results.

However, herein lies the difference, because this is a place where librarians can differentiate themselves and demonstrate their value. We now need to amend the sentence above to say that our job is to: Present, without bias, that information which is the most authoritative, appropriate, authenticated and placed in context to meet library members’ needs.

This is a real opportunity for librarianship. We have the chance to clearly differentiate the service we provide from the services of Google, Facebook and many others. It has always been, and will continue to be, at the core of our mission to provide free and unfettered access to all information deemed worthy of being in a library’s domain, without restriction based upon economics, race, class or gender.

So combining that mission with this new technology means using a different approach. Pariser, in his TED video, says that those serving as gatekeepers to information must code into their algorithms not only that which is “relevant, but also that which is important, uncomfortable, challenging and represents other points of view.” That’s not a bad list.

As librarians, when we start using personalized searching, we need to ensure that we provide easy-to-use switches that allow the user to turn off specific parameters. Perhaps we should be looking at slider controls that allow the user to adjust the parameters to represent different ranges of personalized searching: for instance, freshman to doctoral, liberal to conservative, rich to poor, “A” student to “F” student and so on. Certainly the more challenging function would be a slider bar that says: this is Viewpoint A, now show me the 180-degree opposing viewpoint.
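To make the switch-and-slider idea a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a description of how any real discovery product works: the names `Result`, `Slider`, `score`, `rank` and `opposing` are all hypothetical, and it assumes each item in the collection has already been given a position between 0.0 and 1.0 on each scale (for example, 0.0 for freshman-level material and 1.0 for doctoral-level material).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Result:
    title: str
    relevance: float  # base score with no personalization applied
    # Hypothetical position of this item on each scale, 0.0 to 1.0
    # (for "level": 0.0 = freshman, 1.0 = doctoral).
    signals: Dict[str, float] = field(default_factory=dict)


@dataclass
class Slider:
    position: float = 0.5  # where the user has set the slider, 0.0 to 1.0
    weight: float = 1.0    # how strongly this parameter affects ranking
    enabled: bool = True   # the on/off switch, controlled by the user


def score(result: Result, sliders: Dict[str, Slider]) -> float:
    """Base relevance minus a penalty for distance from each enabled slider."""
    total = result.relevance
    for name, slider in sliders.items():
        # A switched-off parameter, or one the item has no value for,
        # has no effect on the score.
        if not slider.enabled or name not in result.signals:
            continue
        total -= slider.weight * abs(result.signals[name] - slider.position)
    return total


def rank(results: List[Result], sliders: Dict[str, Slider]) -> List[Result]:
    """Return results ordered from highest to lowest score."""
    return sorted(results, key=lambda r: score(r, sliders), reverse=True)


def opposing(slider: Slider) -> Slider:
    """Flip a slider to the opposite end of its scale ("show me the other side")."""
    return Slider(position=1.0 - slider.position,
                  weight=slider.weight,
                  enabled=slider.enabled)


# Made-up sample data for illustration only.
shelf = [
    Result("Introductory survey", relevance=0.8, signals={"level": 0.1}),
    Result("Doctoral monograph", relevance=0.7, signals={"level": 0.9}),
]

doctoral_view = rank(shelf, {"level": Slider(position=0.9)})
unfiltered_view = rank(shelf, {"level": Slider(position=0.9, enabled=False)})
opposite_view = rank(shelf, {"level": opposing(Slider(position=0.9))})
```

The point of the sketch is the design choice rather than the arithmetic: every parameter that shapes the ranking is visible to the user, can be switched off so that only the unpersonalized relevance score remains, and can be flipped to see what the other end of the scale looks like.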

If we do those things, we'll be providing the community, students, faculty and staff with that which we’ve largely provided them for years: An unbiased supply of information that will challenge them to grow, learn, apply and create new knowledge. Technology is a wonderful tool, and in the face of our many challenges, we need to use it. However, we also need to stay focused on our mission in order to ensure our continuing value.

Personalized searching without control by the end-user is dangerous. Personalized searching with control by the end-user is powerful.  Make sure you understand the difference and make sure the technology you put in place gives you the capability to continue to support the core mission of librarianship.