The Washington Post published an interesting article on Sunday about Google's Book Search function.  With the University of California recently adding its massive library to the book-scanning process, the idea of creating a comprehensive digitized library of all the world's books seems like a reachable goal.  This move, which opened the University of California's network of 100 libraries and some 34 million books to Google's scan-and-search framework, is a landmark event in Google's quest to digitize every book ever written.

The Issue

Google Book Search, originally dubbed Google Print, was the basic idea on which Google's creators built their franchise in 1996.  After getting distracted by indexing the entire World Wide Web and earning massive amounts of money, the Googlers returned to their original goal of digitizing books in 2004.

The search function was originally developed as a marketing tool for publishers and authors.  Under this structure, the Google team gained permission from publishers to scan books into a Google database.  Users can now conduct a search online, and their results consist of various books that include the search term within the text of the book.  Citations are provided, along with snippets of text with the search terms highlighted.  Links are provided to publishing companies and bookstores so that users can buy the books online.  In many cases where permission was obtained from publishers, books are shown in their entirety on the Web.  Publishers can track their stats on Google to see how many people are reading their books online, are given the opportunity to "opt out" of the service at any point, and can even make some extra money from Google ads.

This portion of the Google Book Search tool has done wonders for small publishing companies and authors of obscure books who have no other way of marketing their materials.  One success story is Arcadia Publishing, one of the first participants in the Google Book Search program, which publishes history books about cities and towns across the United States.  According to Heather Deacon, Arcadia's national account manager, "Before Google, people would have to go out with the purpose of buying a book.  Now, people who don't even intend to buy a book come across us."  Before Google Book Search, Arcadia and similar companies published just 100 books per year.  Now, as an active participant in Google Book Search, Arcadia has experienced double-digit publishing growth every year.  This aspect of Google Book Search, which works with publishers to stay within copyright law, has been very well received and is popular within the community of publishers and authors.

Controversy has developed, however, around the second part of Google's Book Search, the Library Project.  Under this project, Google is cooperating with six major libraries: Harvard University, University of Michigan, University of California, New York Public Library, Stanford University, and Oxford University.  These institutions have given Google access to their libraries, where Google is taking the books and scanning them into its digital library.  Sounds just like the Publisher program, right?  Wrong.  There is one major difference between the two programs.  In the library program, Google is scanning all books, without publisher or author permission, regardless of their copyright status.  This differs vastly from the publishing program, where Google obtains permission from publishers to reproduce material on the Web.

For old books with expired copyrights, this is no problem.  These books (about 15% of all books) are part of the public domain and can be reproduced by anyone at any time.  Another 10% are still in print with valid copyrights and cannot be reproduced without author or publisher permission.  The remaining 75% of books lie in a kind of "copyright limbo": these are the books whose copyright information has been lost through the years or has changed hands several times.  According to Kevin Kelly of the New York Times, author of a piece called "Scan this Book!", it is nearly impossible to track down the copyright information for the "limbo books", and copyright searches sometimes take years to complete.  Banking on the estimated 10-1 chance that these books have expired copyrights and thus belong to the public domain, Google is essentially ignoring possible copyright restrictions and putting them up for grabs on the Web, offering publishers and authors the chance to "opt out" if they discover their book on the Web and want it removed.  Google defends this practice on the grounds that it plans to make only snippets of these books available on the Web, thus following "fair use" policies.

According to Kelly, there are two points of the library program that anger authors and publishers:  "the virtual copy of the book that sits on Google's indexing server and Google's assumption that it could scan first and ask questions later."

Nick Taylor, president of the Authors Guild (one of many organizations suing Google for its scan-n-search tool), expresses the concerns of authors and publishers all over the country in a letter available on the Guild's website:

"Google says it has given authors and publishers a chance to opt out of the program. That's not how it works. It is Google's obligation to seek licenses to property it intends to copy for commercial gain. Google says its display of the copied books will be covered by "fair use," the provision under copyright that allows limited use of protected works without seeking permission. That's not the way it works either. Fair use applies to the way the end user-the researcher or writer-uses the material she finds, not to the copier who makes it available, again for its own commercial gain."

Taylor continues,

"Our primary concern is with Google Library, a subset of Google Print. And even there, the Authors Guild does not object to what Google is attempting to do in scanning and making searchable the contents of important library collections. That is a valuable tool made possible by digitization and the Internet.

But neither Google nor anyone else has the right to do that without the permission of authors whose copyrights remain in force. The company is, in effect, stealing people's property and providing others with access to it for its own gain."

In the same letter, Taylor also applauded a few of Google Book Search's library partners, Oxford's Bodleian Library and the New York Public Library, for "doing it right", meaning that these libraries "are limiting the books scanned to those that are out of copyright and in the public domain".

Thoughts

I find this issue very interesting and believe that Google Book Search, if completed fully, has the potential to change the way people read, research, and write.  Here are some of my thoughts on the issue:

  • What does this mean for academia?  As a graduate student, I reap the benefits of paying hefty tuition dollars by getting to use the university library's system of databases to search for academic books and journal articles.  Right now, universities only allow enrolled students and faculty to access these systems, which rely on journal and database subscriptions that cost thousands of dollars per year.  In this way, the world of academia is restricted to those associated with universities, or to those with an extra 20 grand lying around to pay for these subscriptions themselves.  What would it mean for this community if anyone could access academic literature on Google?  Would that break down the restrictions caused by the need to have library access?  And would this increased level of access cheapen academic writing or make it better by opening up this world to new people?  Something to think about.
  • What does this mean for the future of books?  Are we going to be forced to read books in electronic format?  Will printed material die out?  In his New York Times article, Kevin Kelly warns of a world of "liquid books", where the art of reading a paperback on a beach or flipping through picture books with your kids is endangered.  This sounds a little extreme to me, but by allowing the content of books to appear on the Web, it does seem like the value of a hardcover could be undermined by the ease of reading the "liquid version".
     
  • Is Google trying to take over the world?  I think so.  As it spreads its reach into different kinds of Web tools (maps, email, blogs, and books, among others), Google dominates the Web experiences of many average Internet users.  Can Google be trusted?  I hope so.  At least the company's motto is encouraging: "Don't be evil".

Overall, I think the Google Book Search program has the potential to do wonders for researchers, authors, publishers, and librarians, but only if it is used correctly.  Books should not be made fully available on the Web unless Google has legal permission to do so.  This is where the "opt-in" versus "opt-out" debate becomes dicey.  It makes much more sense for Google to pursue the "opt-out" strategy, as this allows them to publish more books and saves them the trouble of researching copyright information.  But should authors be made responsible for actively seeking out their material just to protect it?  I don't think so.  Kevin Kelly agrees:

"Search technology is becoming a commodity, and if it turns out there is any money in it, it is not impossible to imagine a hundred mavericks scanning out-of-print books.  Should you as a creator be obliged to find and notify each and every geek who scanned your work, if for some reason you did not want it indexed?  What if you miss one?"