Excerpts from Peter Brantley’s eloquent words on the Google Book settlement, in A fire on the plain (bold added).

With recent back and forth over the proposed Google Book Search settlement (e.g., Robert Darnton’s essay in The New York Review of Books; Tim O’Reilly’s response; and James Grimmelmann’s litany of proposed corrections predating both at The Laboratorium), I’ve been cast again into thinking about aspects of the agreement.

It is difficult to credit that frustrating access is ever able to delay or stem fundamental social trends – for example, the increasing importance of visual and interactive media. … Or the possibility that searching and reading networked books for anyone under the age of 40 might be an inherently social activity that generally increases enthusiasm for all forms of reading.

Let us consider a far more basic, more fundamental concern: the proposed Google Book Search settlement is embedded in a set of conceptions about books, reading, and information access which is as profoundly obsolescent as the printed Encyclopedia.

This is a world where young children carry around in the palm of their hands gaming consoles that have more networked computing capacity than a moderately powerful Sun workstation of five years back. Where increasingly I think about printed books with as much fondness as large cinder blocks, …  And yet authors and publishers worry that a fair level of access to digitized books … might reduce their profits. Truly, this should not be their worry. Their eyes remain cast on a horizon which has fallen from the earth, while a new sun is rising.

The settlement describes a world of time past, not a world of possibilities. Can we not imagine a redrafting of the settlement’s terms with libraries? … let us envision an alternative world where children routinely carry Alexandria in their hands. Where they experience works of literature as games, pushing at the borders of their knowledge and experience by engaging the library with others as a festschrift.

The people served by our libraries – let them show us how to re-make literature in a world where it fits in the circle of many hands, caressed by fingers, shared between minds. Libraries are laboratories for the future of reading, and with this, we have the key to it. … We stride into a world where books are narratives in long winding rivers; drops of thought misting from the sundering thrust of great waterfalls; and seas from which all rivers and rain coalesce, and which carry our sails to continents not yet imagined.

[concluding paragraph] Digital books are sparkles of magic untapped. The settlement proposes a bold path from darkness. But it is a trail that circles back to an old forest, abandoned. Our people have left, ventured onto a flat savannah, strewn with rocks, thorny shrubs, windblown trees, beasts. We can see it all now. And we are starting fires, with wood from fallen trees. Burning down the forest.

Eric Rumsey is at @ericrumsey

In a brief response letter, author and publisher Marc Aronson writes about the copyright status of pictures that are in publisher partner books in Google Books. Aronson suggests that the rights for pictures are separate from the rights for text. I’ve corresponded with Aronson to expand on this idea, and he says that in his experience as an author and editor, he has been told that he needs to obtain rights to pictures and text separately. I’ve searched for other commentary on this issue, and have found very little. It’s a subject that needs exploration. Anyone have ideas?

All books in the publisher partner program, of course, are under copyright, and are available only in Limited Preview, with the publisher giving Google the rights to display a specific number of pages. In some cases of books containing pictures, however, the pages are available, but without the pictures. Is this because the publisher has gotten the rights for limited preview of the text, but not the pictures, as Aronson suggests? The three examples below show a variety of Limited Preview options. The first two are especially pertinent, because they are for books from the same publisher (Macmillan), in the same series, that have a different picture preview status, possibly indicating that the illustrator has given permission to display pictures in the first case, but not in the second.

In this example, the first 39 pages* are available for preview, with all pictures displaying. There are about 30 thumbnail images for pages with pictures on the About this Book page.
Birds of North America (Golden Field Guides)
By Chandler S. Robbins et al, Illustrated by Arthur Singer, Published by Macmillan, 2001

In this book, from the same publisher, the first 37 pages* are available for preview, but almost none of the pictures display; they are replaced with the message “Copyrighted image.” There are no thumbnail images on the About page.
Wildflowers of North America (Golden Field Guides)
By Frank D. Venning, Illustrated by Manabu C. Saito, Published by Macmillan, 2001

This book follows the most common, fairly liberal, pattern of publishers in Limited preview books, with the first 50 pages* available including all pictures. A full complement of 30 thumbnails is on the About page.
Central Rocky Mountain Wildflowers
By H. Wayne Phillips, Illustrated, Published by Globe Pequot, 1999

* The number of pages available for preview varies from session to session — The number given here is the maximum I experienced.

A month ago, Google announced that it has begun putting magazines in Google Books. In one way, this is a new direction for Google. But looked at broadly, it’s really not so new: Google has been putting old journals in Google Books for a long time. The basic difference between the newly announced “Google magazines” and Google’s “old journals,” of course, is the date of publication. The titles that are being treated as “magazines” were generally published in the last 50 years or so. But some of these also include much older issues, in some cases, such as Popular Science, going back to the 1800s. A bit of digging (searching for words in an article) finds a nice case of a title that’s in Google Books both ways, as a magazine and as an old journal. Snippets from the “About this book” and “About this magazine” pages below show the differences.

Old journals – The journal / book format

Old journals are given the same treatment as books, with each volume of the journal being considered a book. The record here is for volume 26 of Popular Science Monthly (the old name of Popular Science).

Old journals are scanned into Google Books by libraries, in the case shown here, Harvard University. As with other books scanned by libraries, the About page has a selection of thumbnail images, giving an idea of what sort of graphics are in the book. Also note the button to Download the entire volume in PDF format.

The Magazine format

In contrast to journal/book format, in which the volume (made up of several issues) is treated as the basic record unit, in magazines, the basic record unit is the issue. This record is for the Feb 1885 issue of Popular Science.

Comparing this with the journal/book format, this lacks thumbnail preview images and it also does not support downloading a PDF of the issue. It does, however, have the great advantage over the journal/book format, that all issues are connected in the Browse all issues menu.

Google Books is full of surprises! In surveying medical journals in Google Books, I discovered that volumes of British Medical Journal circa 1880 scanned at Harvard have extensive sections devoted to advertisements. Most libraries, when they bind issues of journals and magazines into bound volumes, very reasonably remove pages that have only advertisements, to save space on the shelf. So it’s good to have a Harvard, that can afford to save the rare gems of 19th century ads, so that they can be put online for the world to enjoy!

As fanciful as the ad shown here is (“Ask for Cadbury’s Pure Cocoa, makers to the Queen”), there is a wealth of more prosaic ads in the same volume, awaiting future medical historians, on subjects such as malted infant food, lactopeptine for indigestion, bronchitis & croup kettles, and state-of-the-art wheelchairs.

I found several other journals in Google Books from the same late-19th-century era, that also have extensive ads. But British Medical Journal is the only one I found that has entire, separate volumes of advertising. Apparently there must have been separate supplements that were only ads (this was in the dawn of the age of mass advertising, and people, even including physicians, were actually GLAD to read ads!)

So, how searchable are the ads in Google Books? I tried a few examples and had mixed results — Searching for this phrase that’s in the Cadbury’s ad — “why does my doctor recommend Cadbury’s Cocoa” — was successful. But searching for a phrase in the ad that follows the Cadbury’s ad, for Anodyne Amyl Colloid — “in cases of neuralgia, sciatica, lumbago” — found the phrase in other ads for the same product, but not the one occurring in this instance.

Here are volumes of British Medical Journal that I found that are exclusively advertising (All of these were scanned at Harvard):

The list presented here has FULL VIEW (public domain, pre-1923) journals in Google Books. This is certainly NOT intended to be a complete list! There’s no easy way that I have found to limit a search in Google Books to journals, so I have found these titles by searching for appropriate words such as medical, dermatology, journal, archive, transactions. I have not included titles that have fewer than 5 volumes in Google Books. Unfortunately, there’s no way that I have found to sort the title searches chronologically, so to find a particular volume, it’s necessary to go through the results list. Each entry in the list below has links to the first and last volumes that I have found for each title; these dates are not necessarily inclusive. For “contributing libraries,” examples are given if there is more than one contributing library.
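For anyone who wants to repeat this kind of keyword survey programmatically, here is a minimal sketch in Python. It is not how the list above was compiled (that was done by hand in the Google Books web interface). It assumes the public Google Books API "volumes" endpoint and its `filter=full` parameter, and the function names are made up for illustration. It only builds the search URLs, one per keyword, and does not fetch anything.

```python
from urllib.parse import urlencode

# Assumption: the public Google Books API "volumes" endpoint. The post itself
# describes searching through the Google Books web interface, not an API.
BASE_URL = "https://www.googleapis.com/books/v1/volumes"

# Search words taken from the post.
KEYWORDS = ["medical", "dermatology", "journal", "archive", "transactions"]


def full_view_query(keyword, max_results=40):
    """Build a search URL restricted to full-view volumes for one keyword."""
    params = {
        "q": keyword,
        "filter": "full",            # only volumes whose full text is viewable
        "maxResults": max_results,   # the API allows at most 40 per request
    }
    return BASE_URL + "?" + urlencode(params)


def build_queries(keywords=KEYWORDS):
    """Return one search URL per keyword, in the order given."""
    return [full_view_query(word) for word in keywords]


if __name__ == "__main__":
    for url in build_queries():
        print(url)
```

The results would still need to be gone through by hand to pick out journals and to find the first and last volumes of each title, since the keyword search does not distinguish journals from other books.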

This list grew a lot longer than I thought it would — I was surprised to find so many journals in Google Books! It was a tedious job compiling this, and I probably won’t try to keep it current, with new volumes being added all the time. If I get feedback :-) I’m more likely to put in more work on it, so please add a comment, or mail me at: eric[hyphen]rumsey AT uiowa[dot]edu

Until now, books with pictures, especially color pictures, have been a relatively small part of Google Books. But the addition of highly visual, popular magazines changes this — The titles added so far are filled with pictures!

On one level, more pictures in Google Books is gratifying (a theme of this blog!). But the navigation/search capabilities for finding these pictures are limited. The best way seems to be to use Advanced Search and limit the search to Magazines. But the results listing for this is text-only. It would be much easier to search for pictures with the sort of thumbnail search results interface that’s used in Google Image Search.

In light of the launching of picture-laden magazines as part of Google Books, it’s interesting to note that only last month, Google launched Life magazine pictures, as part of Google Image Search. Google is facing the same choice that librarians have been considering for the last while — Should books (or magazines) that have many pictures be considered mainly as books that happen to have pictures, or as pictures that happen to be in books?

The pictures & links below are from magazines that are in Google Books. I’ve chosen them because I know from work on Hardin MD that they are on highly-searched subjects, which would likely appear in Google Image Search if they were crawlable.

Google Books - Magazines

When I started this list in Dec, 2008, Google did not provide a list of their own — Thankfully, they provided one in Nov, 2009 (their announcement is Here, their list is Here). Assuming they keep up their list, I will probably not add to the list provided here. Comparing their list with mine now (11/12/09), they have everything on my list except one title (Log home living). Good start, Google, Hope you keep it up :-)

Please note: the dates given for titles are not necessarily inclusive! Some are quite spotty.

Eric Rumsey is at: eric-rumsey AttSign uiowa dott edu and on Twitter @ericrumsey

Peter Suber, at Open Access News, has a good article on Google’s recent announcement that they are now OCR’ing scanned PDF documents so that they become searchable text documents in Google Web Search.

Scroll down especially to Suber’s comments, in which he describes the background to this Google advance, which is already in Google Book Search — As he says, it’s had an OCR’d text layer version of full-view books from the start, which is how they can be searched. (Google Catalogs also has a searchable text layer).

For more on searchable and non-searchable text see: Identifying Google scanned PDF’s

Kalev Leetaru (Univ Illinois) recently published a lengthy and interesting article comparing Google Books and the Open Content Alliance. It’s especially interesting because it brings together a good description of many nitty-gritty details of Google Books that are not easy to track down. I’m excerpting a few passages on the use of color and PDF format in Google Books.

Color in Google Books – I have the impression, as Leetaru says, that when Google first started scanning books they didn’t scan in color — They do now though, at least in some cases.

[I've added the bold-face in quotes below. The order of quotes is not necessarily the same as in Leetaru's article.]

Since the majority of out–of–copyright books do not have color photographs or other substantial color information, Google decided early on that it would be acceptable to trade color information for spatial resolution.

Google’s use of bitonal imagery and its interactive online viewing client significantly decrease the computing resources required to view its material. … Google Book’s bitonal page images, on the other hand, render nearly instantly, permitting realtime interactive exploration of works.

Use of PDF in Google Books – It’s interesting that Leetaru says the Google Books view “mimics the PDF Acrobat viewer.” Until recently, I avoided using the “Download PDF” button link in Google Books, thinking that it was mainly for downloading to print, and that the PDF view would take a long time to load. But I’m finding that it loads quickly, and provides a fairly usable interface that is in fact reminiscent of the Google Books view, as Leetaru suggests.

Google realized it was necessary to use different compression algorithms for text and image regions and package them in some sort of container file format that would allow them to be combined and layered appropriately. It quickly settled on the PDF format for its flexibility, near ubiquitous support, and its adherence to accepted compression standards (JBIG2, JPEG2000).

While many digital library systems either do not permit online viewing of digitized works, or force the user to view the book a single page at a time (called flipbook viewing), Google has developed an innovative online viewing application. Designed to work entirely within the Web browser, the Google viewing interface mimics the experience of viewing an Adobe Acrobat PDF file.

While most services take advantage of the linearized PDF format, Google made a conscious decision to avoid it. Linearized PDFs use a special data layout to allow the first page of the file to be loaded immediately for viewing … Google found several shortcomings with this format [noting that] the majority of PDF downloads are from users wanting to view the entire work offline or print it [and that] for these users, linearized PDFs provide no benefit.

See Leetaru’s extensively-referenced article for many other useful details.

Color pictures are not common in full-view books in Google Books. This is not surprising, since color pictures were uncommon in books published before the copyright cutoff date (1923). Searches in Google Books for likely subjects (museum, sculpture, french painting, history) do find many books with pictures, but they are almost all black and white.

An exception to the general lack of colored illustrations in older books is in the areas of botany and dermatology, two subjects in which I have a particular interest. In these subjects there were many books published in the 19th century, especially in Europe, with excellent color illustrations. A few examples from Google Books About This Book: Selected Pages are shown here.

For more see Color Pictures in Google Books: More examples