Google Category


Bringing history online, one newspaper at a time

From the Official Google Blog (9/8/08):


For more than 200 years, matters of local and national significance have been conveyed in newsprint — from revolutions and politics to fashion to local weather or high school football scores. Around the globe, we estimate that there are billions of news pages containing every story ever written. And it’s our goal to help readers find all of them, from the smallest local weekly paper up to the largest national daily.

The problem is that most of these newspapers are not available online. We want to change that.

Today, we’re launching an initiative to make more old newspapers accessible and searchable online by partnering with newspaper publishers to digitize millions of pages of news archives. Let’s say you want to learn more about the landing on the Moon. Try a search for [Americans walk on moon] on Google News Archive Search, and you’ll be able to find and read an original article from a 1969 edition of the Pittsburgh Post-Gazette.

Not only will you be able to search these newspapers, you’ll also be able to browse through them exactly as they were printed — photographs, headlines, articles, advertisements and all.

This effort expands on the contributions of others who’ve already begun digitizing historical newspapers. In 2006, we started working with publications like the New York Times and the Washington Post to index existing digital archives and make them searchable via the Google News Archive. Now, this effort will enable us to help you find an even greater range of material from newspapers large and small, in conjunction with partners such as ProQuest and Heritage, who’ve joined in this initiative. One of our partners, the Quebec Chronicle-Telegraph, is actually the oldest newspaper in North America—history buffs, take note: it has been publishing continuously for more than 244 years.


Google signs a deal to e-publish out-of-print books

From The New York Times, Nov. 9, 2008


Late last month, American authors and publishers reached an agreement with Google to settle lawsuits over Google’s Book Search program, which scans millions of books and makes their contents available on the Internet. The deal lets Google sell electronic versions of copyrighted works that have gone out of print.

“Almost overnight, not only has the largest publishing deal been struck, but the largest bookshop in the world has been built, even if it is not quite open for business yet,” wrote Neill Denny, editor of The Bookseller, a trade publication based in London, on his blog.

The settlement remains subject to court approval, and the bookshop would operate only in the United States for now. But the agreement is only one of many initiatives under which books are making what may be the biggest technological leap since Gutenberg invented moveable type.

From LJ Academic Newswire, One for All? As Google Deal Is Evaluated, Critics Question Single Library Terminal, Nov. 11, 2008:


While the Google Book Search settlement has prompted much debate, commentators have only begun to question how it might affect library service. One of the big questions: the deal’s allowance for free access at a designated terminal within public libraries. On one hand, getting every library a free access terminal for patrons to use the full Google Book Search database is a win for libraries—certainly neither Google nor publishers were obligated to consider libraries’ needs in their deal. On the other hand, critics note, mandating a “single terminal” is a counterintuitive restriction in the digital age, and unfairly lumps all libraries, large and small, well-funded or not, into a single, geographic point of access.

“I strongly object to at least one aspect of the proposed Google Book Search settlement, which lets libraries offer just one terminal per library building for access to various books,” blogged Teleread’s David Rothman. “How backwards—not just the one terminal limit, but also the whole notion of linking access to your presence inside a library!”

Digital Library Federation president Peter Brantley called the restriction “irksome” and hinted that a one-size-fits-all provision was inadequate in the face of a lingering digital divide. “I do not know where program management at Google wakes up every morning; I do not know what pretty suburbs publishing executives wake up in every morning,” Brantley blogged. “But in Richmond, CA, [Brantley’s home city] and in many cities around the country, it is heinous to suppose that one public terminal given free reign to the corpus of the world’s literature is an adequate set aside against the promise of the opportunity that Google, publishers, and authors have made possible.”

From The Chronicle of Higher Education, Harvard Says No Thanks to Google Deal for Scanning In-Copyright Works, Oct. 30, 2008


Harvard University has examined Google’s recent legal settlement with publishers and authors, and found it wanting. The Harvard Crimson reported today that the university would not allow its in-copyright holdings to be scanned by Google Book Search because of concerns over the terms of the $125-million settlement, which was announced on Tuesday.

The deal, touted as a watershed moment in mass access to books, promises to make millions more titles available online. Scanned books could be previewed at designated public-library terminals or at institutions that bought a subscription from Google. Users would have to pay to purchase copies of the books, with Google, the publishers, and the authors sharing the proceeds. The settlement awaits final approval by a judge.

Harvard’s concerns center on access to the scanned texts — how widely available access would be and how much it might cost. “As we understand it, the settlement contains too many potential limitations on access to and use of the books by members of the higher-education community and by patrons of public libraries,” Harvard’s university-library director, Robert C. Darnton, wrote in a letter to the library staff.

More Commentary on the Google settlement:

Kevin Smith’s Looking for the devil in the details

Karen Coyle’s “pinball” comments here.

Open Content Alliance’s objections here.

This Washington Post article on Google’s New Monopoly (requires free membership).

PC World’s article on how business considerations have trumped ideals in this negotiation.


U. of Michigan Places 1 Millionth Scanned Book Online

The University of Michigan has reached the 1 million book milestone in its digitization program. That figure represents around 13% of the 7.5 million books in the library’s collections. The books are available via the library’s catalog or via Google Book Search, as part of the Michigan Digitization Project.

Most of the scanning has been done as part of the library’s controversial deal with Google. The search giant is working with dozens of major libraries around the world to scan the full text of books to add to its index. But Michigan is one of the only institutions to agree to scan every one of its holdings — even those that are still covered by copyright. Some publishers have sued Google for copyright infringement over the scanning effort, though officials from Google say their effort is legal because they are not making the full text of copyrighted books available to the public.

The Wired Campus News Blog, Feb. 4, 2008
and Open Access News, Feb. 4, 2008


Committee on Institutional Cooperation (CIC) Joins Google’s Library Project

The number of libraries participating in the Google Book Search Library Project just got a whole lot bigger with today’s addition of the Committee on Institutional Cooperation (CIC). The CIC is a national consortium of 12 research universities, including University of Chicago, University of Illinois, Indiana University, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Pennsylvania State University, Purdue University and the University of Wisconsin-Madison. Google will work with the CIC to digitize select collections across all its libraries, up to 10 million volumes.

Readers will have access to many distinctive and unique collections held by the consortium. Users will be able to explore collections that are global in scope, like Northwestern’s Africana collection or dive deep into the universities’ unique Midwest heritage, including the University of Minnesota’s Scandinavian and forestry collections, Michigan State’s extensive holding in agriculture, Indiana University’s folklore collection, and the history and culture of Chicago collection at the University of Illinois-Chicago.

Google will provide the CIC with a digital copy of the public domain materials digitized for this project. With these files, the consortium will create a first-of-its-kind shared digital repository of these works held across the CIC libraries. Both readers and libraries will benefit from this group effort:

* The shared repository of public domain books will give faculty and students convenient access to a large and diverse online library of works previously housed in separate locations.
* This new collaboration will enable librarians to collectively archive materials over time, and allow researchers to access a vast array of material with searches customized for scholarly activity.

For books in the public domain, readers will be able to view, browse, and read the full texts online. For books protected by copyright, users will get basic background (such as the book’s title and the author’s name), at most a few lines of text related to their search, and information about where they can buy or borrow a book.

“This library digitization agreement is one of the largest cooperative actions of its kind in higher education,” said CIC chairman Lawrence Dumas, provost of Northwestern University. “We have a collective ambition to share resources and work together to preserve the world’s printed treasures.”

Two CIC member universities are already working with Google Book Search, the University of Michigan and the University of Wisconsin-Madison, and this new agreement will complement the digitization work already taking place.

The CIC becomes the latest partner in the Google Books Library Project, which, in addition to the University of Michigan and University of Wisconsin-Madison, also includes Harvard University, Stanford University, Oxford University, the New York Public Library, University of California, University of Texas at Austin, University of Virginia, Princeton Library, the Complutense University of Madrid, the Bavarian State Library, the Library of Catalonia, the University Library of Lausanne and Ghent University Library. Google is also conducting a pilot project with the Library of Congress.

The Google Books Library Project digitizes books from major libraries around the world and makes their collections searchable on Google Book Search. More information can be found at:

Google Press Center, June 6, 2007


Google’s Moon Shot: The Quest for the Universal Library

A recent article in the New Yorker magazine by Jeffrey Toobin chronicles Google’s ambitious library-scanning endeavor.

Every weekday, a truck pulls up to the Cecil H. Green Library, on the campus of Stanford University, and collects at least a thousand books, which are taken to an undisclosed location and scanned, page by page, into an enormous database being created by Google. The company is also retrieving books from libraries at several other leading universities, including Harvard and Oxford, as well as the New York Public Library. At the University of Michigan, Google’s original partner in Google Book Search, tens of thousands of books are processed each week on the company’s custom-made scanning equipment.

Google intends to scan every book ever published, and to make the full texts searchable, in the same way that Web sites can be searched on the company’s engine. At the books site, which is up and running in a beta (or testing) version, you can enter a word or phrase—say, Ahab and whale—and the search returns a list of works in which the terms appear, in this case nearly eight hundred titles, including numerous editions of Herman Melville’s novel. Clicking on “Moby-Dick, or The Whale” calls up Chapter 28, in which Ahab is introduced. You can scroll through the chapter, search for other terms that appear in the book, and compare it with other editions. Google won’t say how many books are in its database, but the site’s value as a research tool is apparent; on it you can find a history of Urdu newspapers, an 1892 edition of Jane Austen’s letters, several guides to writing haiku, and a Harvard alumni directory from 1919.

No one really knows how many books there are. The most volumes listed in any catalogue is thirty-two million, the number in WorldCat, a database of titles from more than twenty-five thousand libraries around the world. Google aims to scan at least that many. “We think that we can do it all inside of ten years,” Marissa Mayer, a vice-president at Google who is in charge of the books project, said recently, at the company’s headquarters, in Mountain View, California. “It’s mind-boggling to me, how close it is. I think of Google Books as our moon shot.”

To read the article in its entirety:

Related news item:
Princeton U. Library Latest to Join Google Scan Plan

Google has signed another university library to participate in its library scan plan, this week announcing an agreement with the Princeton University Libraries to digitize roughly one million books. Under the agreement, Princeton will initially supply only public domain books over the “next six years,” which it says will be indexed and searchable on the web and freely available for download. (Private universities, because they have fewer protections, have been more wary than public ones in testing the limits of copyright laws.) Princeton University Librarian Karin Trainer said joining Google Book Search will “make it easier for Princeton students and faculty to do research” and would also allow Princeton to “share our collection with researchers worldwide.”

Trainer said Princeton librarians will work with Google over the next several months to choose the subject areas to be digitized and the timetable for the scanning. Library staff, faculty, and students also will be able to suggest titles for inclusion. Princeton is the twelfth institution to join the Google Books Library Project, joining Harvard, Oxford, Stanford, the University of California, the University of Michigan, the University of Texas-Austin, the University of Virginia, the University of Wisconsin-Madison, the New York Public Library, the University Complutense of Madrid, and the National Library of Catalonia.

Library Journal Academic Newswire, Feb. 8, 2007


Google’s Big Book Scanning Project: Read up!

Search Me?
Google Wants to Digitize Every Book. Publishers Say Read the Fine Print First.

By Bob Thompson
Washington Post Staff Writer
Sunday, August 13, 2006; Page D01

STANFORD, Calif. If it is really true that Google is going to digitize the roughly 9 million books in the libraries of Stanford University, then you can be sure that the folks who brought you the world’s most ambitious search engine will come, in due time, for call number E169 D3.

Google workers will pull Lillian Dean’s 1950 travelogue “This Is Our Land” — the story of one family’s “pleasant and soul-satisfying auto journey across our continent” — from a shelf in the second-floor stacks of the Cecil H. Green Library. They will place the slim blue volume on a book cart, wheel it into a Google truck backed up to the library’s loading dock and whisk it a few miles southeast to the Googleplex, the $100 billion-plus company’s sprawling, campuslike headquarters in Mountain View. There, at an undisclosed location, it will be scanned and added to the ever-expanding universe of digitally searchable knowledge.
Why undisclosed?

Because for one thing, in their race to assemble the greatest digital library the world has ever seen, Google’s engineers have developed sophisticated technology they’d prefer their competitors not see.

And for another, perhaps — though Google executives don’t say so directly — the library scanning program already has generated a little too much heat.

To read the full article:

U. of California System’s 100 Libraries Join Google’s Controversial Book-Scanning Project

The University of California system has joined Google’s controversial book-digitization project, and the partnership is expected to convert millions of books from the system’s 100 libraries — even volumes that are protected by copyright — into fully searchable electronic texts. Google officials say they plan to add even more academic libraries to the program in the near future.

The university system is the seventh major participant to join Google’s ambitious effort to add digital versions of books to its popular online search engine, and the first full partner to join since two groups of publishers sued to stop the company from scanning any books still covered by copyright.

“We’re comfortable that the activity is fully respectful of copyright law,” said Daniel Greenstein, executive director of the California Digital Library, a division of the university system.

Other participants in the Google project, which began in December 2004, include Harvard University, the New York Public Library, Stanford University, the University of Michigan, and the University of Oxford, in England. The Library of Congress is taking part in a pilot stage of the project as well.

Some publishers and authors are challenging the project because the company plans to scan not only books in the public domain but also those on which copyright has not run out. Google has defended the legality of the project, stressing that its search results will offer only short excerpts from copyrighted books unless longer excerpts are authorized by a book’s publisher. Publishers and authors argue that Google must obtain permission before scanning a copyrighted work.

To read the full article, go to: (accessible only to UI affiliates, or subscribers of the Chronicle of Higher Education)
Chronicle of Higher Education, Aug. 9, 2006

In Defense of Google’s Book-Scanning Project

“The nation’s colleges and universities should support Google’s controversial project to digitize great libraries and offer books online,” writes Richard Ekman, president of the Council of Independent Colleges, in an editorial for The Washington Post. “It has the potential to do a lot of good for higher education in this country.” Google’s endeavor has drawn criticism from publishers, who have argued that the book-scanning project amounts to a violation of copyright law. But Mr. Ekman, who serves on the advisory boards of two university presses, argues that “those of us who are researchers and readers of books and articles ought to be disturbed by the loss of trust among publishers and libraries, which a decade ago embraced technological innovation and collaboration.” Mr. Ekman takes an optimistic view of the digitization project’s impact on scholarship:

Read the Washington Post article:
The Books Google Could Open

The Wired Campus, a service of the Chronicle of Higher Education, Aug. 22, 2006

Scholarship and Academic Libraries (and their kin) in the World of Google

The prospect of ubiquitous digitization will not change the fundamental relationships among scholarship, academic libraries, and publication. Collaboration across time and space, which is a principal mechanism of scholarship, ought to be enhanced. Reforms in copyright law will be required if the promise of digitization is to be realized; absent such reform, there is a serious risk that much academically valuable material will become invisible and unused. Ubiquitous digitization will change radically the economics that have supported university-based collections of published material. Scholars and scholarly institutions (including libraries and university presses) must assert vigorously claims of fair use and openness.

To read the full article, go to:

by Paul N. Courant, First Monday, Volume 11, Number 8 — 7 August 2006