Google’s Moon Shot: The Quest for the Universal Library

A recent article in The New Yorker by Jeffrey Toobin chronicles Google’s ambitious library-scanning endeavor.

Excerpt:
Every weekday, a truck pulls up to the Cecil H. Green Library, on the campus of Stanford University, and collects at least a thousand books, which are taken to an undisclosed location and scanned, page by page, into an enormous database being created by Google. The company is also retrieving books from libraries at several other leading universities, including Harvard and Oxford, as well as the New York Public Library. At the University of Michigan, Google’s original partner in Google Book Search, tens of thousands of books are processed each week on the company’s custom-made scanning equipment.

Google intends to scan every book ever published, and to make the full texts searchable, in the same way that Web sites can be searched on the company’s engine at google.com. At the books site, which is up and running in a beta (or testing) version, at books.google.com, you can enter a word or phrase—say, Ahab and whale—and the search returns a list of works in which the terms appear, in this case nearly eight hundred titles, including numerous editions of Herman Melville’s novel. Clicking on “Moby-Dick, or The Whale” calls up Chapter 28, in which Ahab is introduced. You can scroll through the chapter, search for other terms that appear in the book, and compare it with other editions. Google won’t say how many books are in its database, but the site’s value as a research tool is apparent; on it you can find a history of Urdu newspapers, an 1892 edition of Jane Austen’s letters, several guides to writing haiku, and a Harvard alumni directory from 1919.

No one really knows how many books there are. The most volumes listed in any catalogue is thirty-two million, the number in WorldCat, a database of titles from more than twenty-five thousand libraries around the world. Google aims to scan at least that many. “We think that we can do it all inside of ten years,” Marissa Mayer, a vice-president at Google who is in charge of the books project, said recently, at the company’s headquarters, in Mountain View, California. “It’s mind-boggling to me, how close it is. I think of Google Books as our moon shot.”

To read the article in its entirety: http://www.newyorker.com/fact/content/articles/070205fa_fact_toobin

Related news item:
Princeton U. Library Latest to Join Google Scan Plan

Google has signed another university library to participate in its library scan plan, this week announcing an agreement with the Princeton University Libraries to digitize roughly one million books. Under the agreement, Princeton will initially supply only public domain books over the “next six years,” which it says will be indexed and searchable on the web and freely available for download. (Private universities, because they have fewer protections, have been more wary than public ones in testing the limits of copyright laws.) Princeton University Librarian Karin Trainer said joining Google Book Search will “make it easier for Princeton students and faculty to do research” and would also allow Princeton to “share our collection with researchers worldwide.”

Trainer said Princeton librarians will work with Google over the next several months to choose the subject areas to be digitized and the timetable for the scanning. Library staff, faculty, and students also will be able to suggest titles for inclusion. Princeton is the twelfth institution to join the Google Books Library Project, joining Harvard, Oxford, Stanford, the University of California, the University of Michigan, the University of Texas-Austin, the University of Virginia, the University of Wisconsin-Madison, the New York Public Library, the University Complutense of Madrid, and the National Library of Catalonia.

Library Journal Academic Newswire, Feb. 8, 2007