I searched Google Web Search in the last week for a small sample of ten pre-1923, public-domain books, looking for full-text versions, with the results below. These are all non-fiction titles, chosen more or less randomly, in subject fields of interest to me: medicine, botany, and history.


I did the searches in Google Web Search as detailed below. For each title I looked at the first ten results and recorded every freely available full-view version, with its rank number. I’ve identified the Google Book Search (GBS) records by the library that scanned the book. For Internet Archive (IA), I’ve identified records by sponsor/contributor, and also noted whether the link goes to the book home page or the DjVu-formatted version of the book.

Both Google Books & Internet Archive records found:

• 1. American medical botany, Cummings and Hilliard, 1817
Google Web Search: American medical botany cummings
. . # 3 GBS-Library: Oxford Univ
. . # 7 GBS-Library: Harvard
. . # 9 IA:  Book Home Page – Sponsor & Contrib: NCSU

• 2. Portfolio of dermochromes, Jerome Kingsbury, 1913 (3 volumes)
Google Web Search: portfolio dermochromes kingsbury
. . # 1 GBS-Library: Harvard – Volume 1
. . # 5 IA:  Book Home Page – Volume 1 – Sponsor: IA; Contrib: U California
. . # 6 IA:  DjVu format – Volume 2 – Sponsor: IA; Contrib: U California

• 3. The Complete herbalist, or, The people their own physicians, Oliver Phelps Brown, 1870
Google Web Search: Complete herbalist, or, The people their own physicians
. . # 1 IA:  Book Home Page – Sponsor: Lyrasis, Sloan Fndtn; Contrib: Rutgers
. . # 2 IA:  Book Home Page – Sponsor: MSN; Contrib: U California
. . # 8 GBS-Library: Harvard

• 4. English and American tool builders, Joseph W. Roe, 1916
Google Web Search: english and american tool builders roe
. . # 1 GBS-Library: Harvard
. . # 4 IA:  Book Home Page – Sponsor: Boston Lib Consortium; Contrib: Northeastern U
. . # 5 IA:  DjVu format – Full Text of #4

• 5. Health service in industry, Irving Clark, 1922
Google Web Search: health service in industry clark
. . # 1 GBS-Library: California
. . # 2 IA:  Book Home Page – Sponsor: MSN; Contrib: U Toronto
. . # 3 IA:  Book Home Page – Sponsor: Google; Contrib: ?

• 6. History of medicine in its salient features, Walter Libby, 1922
Google Web Search: history of medicine in its salient features libby
. . # 1 GBS-Library: Harvard
. . # 4 IA:  DjVu format – Sponsor: MSN; Contrib: U California

Only Google Books records found, none from Internet Archive:

• 7. The Theory and practice of veterinary medicine, Austin H. Baker, Alexander Eger, 1911
Google Web Search: theory and practice of veterinary medicine baker
. . # 1 GBS-Library: Wisconsin

• 8. Atlas of diseases of the skin, Franz Mraček, ed. by Henry W. Stelwagon, 1899
Google Web Search: atlas diseases of the skin stelwagon
. . # 1 GBS-Library: Harvard – umQPAAAAYAAJ

• 9. How are you feeling now, Edwin Sabin, 1917
Google Web Search: how are you feeling now sabin
. . # 1 GBS-Library: California

Only in Google Books – Google Web Search finds a publisher preview only; Google Book Search has a full-view version:

• 10. Beyond the Mississippi : from the great river to the great ocean, Albert Richardson, 1867
Google Web Search: beyond the mississippi richardson
. . # 3 GBS-Publisher: Preview of 2007 reprint, no full-view available. The title IS available when searched directly in Google Book Search ->>
>> Google Book Search search, limit to Full view: beyond the mississippi richardson
. . # 1 GBS-Library: Virginia


This is certainly not a large enough sample to draw many conclusions, but I think it does show a few things:

  • There’s a lot of overlap between what’s in the two sources – The first 6 of the 10 books searched are in both Google Books (GBS) and Internet Archive (IA).
  • Not surprisingly, when there are titles in both sources, Google usually ranks GBS higher than IA (one exception: #3).
  • Libraries represented in GBS – Harvard predominates, with 6 of the 10 records — This fits my general Googling experience. Univ California is second with 2 records — This is a higher proportion than I’ve experienced.
  • IA sources – 3 of the 6 titles have a record with MSN as sponsor; of these, 2 are contributed by Univ California.
  • Links to Internet Archive are haphazard – In most cases there’s a link to the Book Home Page, as there should be, since it has a list of different formats available. In some cases, there’s also a link to the DjVu format, and in one case (#6), that’s the only link. Why does Google link to this format instead of others? Maybe it’s because DjVu is good for displaying pages with pictures. But the version of the DjVu format that Google links to is not the best one, as I’ve discussed previously.
  • In one case (#10), Google Web Search didn’t find any full-view versions, and Google Book Search did find one.
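
For anyone who wants to check the counts above, here is a minimal sketch that tallies them from the ranked results listed earlier. The data is transcribed by hand from that list; the names `RESULTS` and `best_rank` are just illustrative, and I'm assuming the DjVu link at rank 5 for title #4 has the same sponsor as rank 4, since it is listed as the full text of that record. Title #10 is left out because Google Web Search returned no full-view version for it.

```python
from collections import Counter

# (title number, rank in Google Web Search, source, GBS library or IA sponsor)
# Transcribed from the results list above. Title #10 is omitted: Web Search
# showed only a publisher preview, with no full-view version.
RESULTS = [
    (1, 3, "GBS", "Oxford"), (1, 7, "GBS", "Harvard"), (1, 9, "IA", "NCSU"),
    (2, 1, "GBS", "Harvard"), (2, 5, "IA", "IA"), (2, 6, "IA", "IA"),
    (3, 1, "IA", "Lyrasis/Sloan"), (3, 2, "IA", "MSN"), (3, 8, "GBS", "Harvard"),
    (4, 1, "GBS", "Harvard"), (4, 4, "IA", "Boston Lib Consortium"),
    (4, 5, "IA", "Boston Lib Consortium"),  # assumption: same sponsor as rank 4
    (5, 1, "GBS", "California"), (5, 2, "IA", "MSN"), (5, 3, "IA", "Google"),
    (6, 1, "GBS", "Harvard"), (6, 4, "IA", "MSN"),
    (7, 1, "GBS", "Wisconsin"),
    (8, 1, "GBS", "Harvard"),
    (9, 1, "GBS", "California"),
]


def best_rank(title, source):
    """Return the highest (lowest-numbered) rank for a title in one source, or None."""
    ranks = [rank for t, rank, s, _ in RESULTS if t == title and s == source]
    return min(ranks) if ranks else None


titles = sorted({t for t, _, _, _ in RESULTS})

# Titles with at least one full-view record in both GBS and IA.
in_both = [t for t in titles
           if best_rank(t, "GBS") is not None and best_rank(t, "IA") is not None]

# Among those, which source has the top-ranked record?
gbs_first = [t for t in in_both if best_rank(t, "GBS") < best_rank(t, "IA")]
ia_first = [t for t in in_both if best_rank(t, "IA") < best_rank(t, "GBS")]

# Scanning libraries behind the GBS records, and titles with an MSN-sponsored IA record.
gbs_libraries = Counter(label for _, _, s, label in RESULTS if s == "GBS")
msn_titles = sorted({t for t, _, s, label in RESULTS if s == "IA" and label == "MSN"})

print("In both sources:", in_both)
print("GBS ranked first:", gbs_first, "| IA ranked first:", ia_first)
print("GBS libraries:", dict(gbs_libraries))
print("Titles with an MSN-sponsored IA record:", msn_titles)
```

Keeping the results as plain tuples makes it easy to rerun the same searches later and compare, since the rankings will certainly drift.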

My purpose here was not to look at the proportion of all books that are in GBS or IA — That would take a larger sample, and more systematic randomizing. But I can report that I did find most of the titles I searched, which surprised me.

As I report in a separate article, it’s likely that there are GBS or IA versions of other editions of many of these books that could be found by searching directly in these sources.

None of the Google Web Searches I did turned up full-text versions from any source other than GBS or IA. I was surprised at this, especially that Gutenberg.org did not appear in any of the search results.

Caveat: The results for the specific searches in Google Web Search will certainly change over time, so the study should be thought of as capturing a moment in time, not results set in stone!


Eric Rumsey is at: eric-rumseytemp AttSign uiowa dott edu and on Twitter @ericrumseytemp

Good writing relevant to a broad topic is often buried in a discussion of a more narrow topic — By its title, Zoe Rose’s recent article is about the Monocle eReader, which works in the Safari browser. But in fact much of the article is a good narrative about why networked eBooks are better than downloaded, device-tethered eBooks.

Rose’s article caught my attention for several reasons — It resonates with Hugh McGuire’s recent article about the future of connected books on the Internet and my commentary suggesting that Google Books gives a hint of this. As with McGuire, Rose’s writing on Internet-connected eBooks suggests what I’ve written before about Google Books as the new eBooks. Also, recent discussions of the Safari browser on the iPad outshining Apps for reading are much in tune with Rose’s emphasis on the appeal of browser-based eBooks. Here are her words:

Monocle is a new development in eBookery. It could be revolutionary, for one reason: it works in browsers. Which is to say, you access your eBook content through the Internet.

Fundamentally, there are two ways to access content using machines:
1. Content lives on the user’s own device. This is a download-based model. Example: iTunes.
2. Content lives on external servers which are accessed by the user’s device. This is a web-based model. Example: Spotify.

There’s a user-facing difference between the two, and I think the no-download model will eventually have the upper hand. For content users, the download model is the more annoying option, because it’s tethered to a device. To use my husband as an example: At work and at home, he uses different machines. They’re powerful desktops, so physically lugging them about is not a good option. Because of this, a device-tethered eBook is no use to him. But he always has an Internet connection. He can always log on to a website.

There’s a developing trend of people using multiple machines to access their content, instead of the trusty old family PC. My household isn’t unusual – it has two people, a desktop, two laptops, an ipad, and an iphone. Matching that trend on the other side, we can also see Internet access growing exponentially – it’s quite likely (if not inevitable) that ubiquity is just around the corner.

Which is more likely? A future where people use the same device all the time, or a future where people have Internet access all the time? In choosing between browser-based and download-based content models, these trends point to an access-it-through-the-Internet model as being where the smart money is.

Dan Cohen gave an excellent talk on a panel (Is Google Good for History) at the recent annual meeting of the American Historical Association. It drew much attention in the media and on Twitter from people following the conference’s hashtag, #AHA2010.

The main focus of Dan’s talk was Google Book Search — He gave a nicely balanced view, noting that it’s been a valuable new source for historians, but also discussing problems with it, especially what he sees as a lack of openness on the part of Google.

The discussion around Dan’s paper brought in many voices and opinions on Google Books — It was especially encouraging to see positive opinions on the usefulness of GBS by historians (about which I’ve blogged) — An opinion that gets little attention and respect on Twitter 😉

So, because I found this discussion so valuable, and because tweets stay in Twitter Search for only 10 days, I’m taking the unusual step of recording all of the tweets retrieved with this search: #aha2010 google. They follow below (the search was done on Jan 14):

I’ve blogged before about the almost universally negative opinion of Google Books in the blogosphere/twittersphere, and I found another notable example last week. On Dec 14, the NY Times ran a 12-paragraph article, France to Digitize Its Own Literary Works, about President Sarkozy pledging $1.1 billion for book scanning. The article mentions that there has been animosity on the part of Sarkozy toward Google, but doesn’t dwell on this.

In paragraphs 5 and 6, Bruno Racine, president of the National Library, who was interviewed by phone for the article, says the project might be done in partnership with Google. I’ve followed the GBS-France story enough in the last several months that this struck me as surprising. I came across the story on Dec 17, 3 days after it was published, by which time there were many links to it on Twitter. Searching in Twitter by its title I found 103 tweets. But surprisingly, only 8 of these mentioned Google, and most of those 8 used terms like “slap at Google … in response to Google uproar … in competition with Google.” One tweeter, @platformlead, to her credit, did say “… with Google?”

What’s going on, people? Surely more than one of the 103 people who tweeted this article must have seen the mention of a possible partnership with Google. Makes me think people see what they’re looking for. There’s been so much talk about French negativity toward GBS that maybe people just DON’T NOTICE something that shows another side.

Library Journal, to their credit, in this Dec 15 article reporting on the NYT article, did make prominent note in its subtitle: “Google might still be a partner, says head of national library” …


Surely someone will notice this in Twitter, I think. But, no, the same blindness continues. Searching in Twitter for the title of this article, I find that only seven people tweeted it. And only one, @bcbiupr, mentioned the possible partnership with Google.

My son, Brian Rumsey, studies History in Mississippi. This is an interest of mine also, so I read along with him on some of his books. We’ve been having a running discussion on the revival of narrative history writing, as discussed by Lawrence Stone, who relates it to the idea of “thick description.” More or less unconsciously, I think, as this idea has percolated in my mind, it has become “thick history” instead of “thick description.”

These ideas were bouncing around in my mind when I visited Brian recently at Mississippi State, where he studies, and especially when we got a kind invitation from a fellow grad student to share Thanksgiving dinner with his extended family. The gracious Southern hospitality we enjoyed there was the highlight of our trip, in many ways – Story-telling, food, and much more. Of course, I couldn’t resist making a connection — This is history at its thickest! The rich dimensions of a Southern family! This made me realize that my conception of “thick history” is much like family history — It’s history that takes in all “members of the family” — all dimensions, all of the context of the story. History that values the STORY, and follows it wherever it goes, without trying to fit it into an ideological framework.

As I continued to cogitate on the idea of “thick history,” of course, I turned to Google — Searching in Google Web and Google Books, I find that I’m certainly not the first one to coin the term — It’s been used especially in discussions of Keynesian economics, but also in religion and sociology.


So I poke around more, and explore the idea that “thick history” resonates with “family.” I don’t find much in Google Web search, but then I turn to GBS and, bingo! Searching GBS for “thick history” family, I find just what I’ve been imagining. Number two is The Genetic Strand, with the passage “DNA measures thick history …” This book, by Edward Ball, is “the story of a writer’s investigation, using DNA science, into the tale of his family’s origins,” with his Southern family being centered in Charleston, South Carolina.

No earth-shattering find, admittedly, but a neat little trick nonetheless — Using GBS to make a surprising connection between history and biology that would have been impossible without it. The sort of connection that I’m sure makes GBS invaluable for real historians, enabling them to see history in completely new ways.

As a dilettantish historian, I find Google Books invaluable, especially for 19th century sources. With the chorus of negativity surrounding the GBS Settlement debate, it’s been hard to find anyone saying what seems obvious to me – Whether you like it or not, it’s apparent from the comments below that GBS is revolutionizing historical research. So it was good to come across a recent discussion to this effect among historians on a rather obscure academic listserv (SHARP-L: The Society for the History of Authorship, Reading & Publishing). This is especially interesting because it centers around commentary by Geoff Nunberg in August on GBS metadata as a “train wreck,” which I discussed in articles here. The postings below on the GBS thread are in chronological order. I’ve included excerpts from most of the posts, including representations of all points of view. Most of the postings are in the November archive (thread title: do you use Google Books?). Thanks to @cpwillett for bringing this to my attention.

Beth Luey, Arizona State University

[An earlier] post called to mind a number of recent attacks on Google. I have become addicted to Google Books, which has not only given me access to books that would be hard to find … but has allowed me to find people and passages in books where I would not have known to look. … I’d be interested to know whether other SHARPists find Google Books useful.

Mark Samuels Lasner, University of Delaware

I find Google Books useful for finding truly obscure references but have learned not to trust either the bibliographical information given in the listings or the integrity of the scanned books themselves.

Patrick Leary, Northwestern University

(boldface added here and below) Beth calls attention to a phenomenon that I’ve noticed lately, too: articles sneering at Google Book Search, despite the fact that every serious researcher I know, including myself, now uses it routinely to accomplish certain tasks that would otherwise be very difficult or even impossible to do.

The article by Geoffrey Nunberg in the August 31 Chronicle of Higher Education is typical. That article, which has been widely cited, is entitled, “Google’s Book Search: A Disaster for Scholars.” That characterization is flatly ridiculous, and utterly irresponsible. …. The plain fact is that Google Book Search is *not* by any measure a “disaster for scholars”; to the contrary, it is one of the most useful tools that scholars (and other researchers, of all kinds) around the world have ever had available to them, and unlike the many subscription full-text databases, it is available for free to anyone who can muster an Internet connection. … What we absolutely do not need are any more sneering dismissals of the entire enterprise.

Elizabeth Horan, Arizona State University

Many books printed in Spanish, esp. Latin America, even important books, were printed without indexes in order to save on production costs. Google Books isn’t fail-safe but it’s better than sitting and skimming huge swathes of text, especially for finding name references.

Zachary Lesser, University of Pennsylvania. These comments suggest what I’ve discussed in previous articles.

The Nunberg article might as well have been called, “Libraries: A Disaster for Scholars.” After all, I’m sure we could all relate anecdotes about how a book was mis-shelved, lost in the stacks for years, catalogued under an inappropriate subject header, etc. For that matter, one might write an amusing article entitled, “Printed Books: A Disaster for Scholars,” with funny examples of typos.

Richard Fine, Virginia Commonwealth University

I agree with Patrick and others.  Google Books is a useful tool and promises to be even more useful in the future.  I think it is especially so for my colleagues working in 19th century (and earlier) materials, and those in the public domain.  That said, I’ve been fishing around for texts from the 1940s and have found several of relevance to a current project through Google Books that I could not locate elsewhere.  Like any tool, it is imperfect and can’t do everything, … Nunberg was way off base, as Zachary Lesser indicated.

Eleanor Shevlin, West Chester (Pennsylvania) University

I just want to quickly second Patrick’s remarks–especially about the value of GBS (despite its flaws, errors, etc.–one needs to be an aware user).  I find GBS indispensable as a finding aid for a host of purposes.

Paul Duguid, University of California, Berkeley

I would indeed almost go as far as to say that to criticise Nunberg through the subtitle without addressing his interests and issues directly comes close to being “flatly ridiculous and utterly irresponsible”.  (Full disclosure here, I am a friend and co-teacher with Nunberg, while Patrick and I have crossed swords before about Google and its critics and he regards me, as he seems to Nunberg, as suffering from “scholarly fastidiousness” for finding fault with Google.) …. Looking beyond the headline, note that Nunberg is well aware of what Google is good at …. So in no way is his piece a “sneering dismissal of the entire enterprise”.

Lisa Berglund, Buffalo State College, SUNY

I use Google Books a LOT, especially for my courses. I can assign chapters and sections of books … Google Books frequently helps me answer quickly and easily questions that would  have been difficult to research with only Buffalo State’s limited college library … Yes, it has its limitations and yes, it can be frustrating but it has made my job so much easier, and enlivened my research so dramatically, that my occasional whining is lost in an almost daily rush of relief and gratification.

Daniel Allington, The Open University

Paul Duguid writes that “we need GBS to take metadata far more seriously so that the collection can be examined as an unrivalled and reliable corpus and not simply a bunch of scanned books”. This is, I think, the heart of the issue. As a “bunch of scanned books”, Google Books is very useful indeed. But it could be so much more than that, and it’s the problems that would have been easiest to avoid that are in many ways the most frustrating.

The recent controversy about the Google Book Search Settlement seems to have taken up people’s Google-watching attention so much that advances in the way GBS actually works have been getting overlooked. Several notable improvements were made during the summer, for example, that got very little recognition. Another change that seems to have gotten little recognition is that Google web searches have begun to include links to books in GBS in the last 1-2 years (as in the example at left). Particularly in searching for historical topics, I’ve been seeing searches recently in which the majority of the first 10 hits are from GBS, a great advance, I think, for historical research. Up to now, my experience has been that history has been a fairly weak subject on the Web: locked away in books, not on Web pages.

I had occasion to take advantage of the newly accessible books from GBS recently, when I was least expecting it, while having a discussion with my son David, who’s a long-distance runner, about track runners of the past at the University of Iowa. I remembered that one particular runner on the team, Ted Wheeler, ran on the US Olympic team in the 1950’s, and that he later went on to become the coach for the UI track team (I especially knew about him because while he was the coach he married Sheila Creth, the University Librarian at the University of Iowa Libraries, where I work). David knew that Wheeler had been in the Olympics, and thought that he had been an assistant coach at Iowa, rather than the head coach. So … of course I turned to Google to settle the “discussion.” It turned out to be a surprisingly difficult search. I assumed that it would be fairly easy to find records of recent track coaches at a large, Big Ten program like Iowa. But it wasn’t — I tried several search terms without success before — Bingo! — I finally hit upon the combination that turned up the page shown here, establishing that Wheeler was, indeed, the UI track coach from 1978 to 1996 — with the added benefit of a great picture!

The point of this little story: I think integrating GBS links into Google web search is a great advance, and deserves more attention. As I said above, there’s been so much negative press for Google in recent discussions of the Settlement that everything they do is interpreted negatively. I saw a link in the last couple of weeks, which I unfortunately didn’t keep track of, decrying Google’s putting GBS links in Web search results because someone thought Google was trying to unfairly boost their own content. Really?? I think there’s such a treasure in old books that the world will benefit from Google’s making them more accessible. There are questions, certainly, about the algorithm used by Google to determine which books are included in Web search results, and I hope Google will say more about that. But it’s not only Google that’s saying little on the subject. I haven’t seen much discussion at all by anybody on the integration of GBS books in Google web search results. If anyone can find such discussion, please add a comment or contact me by Twitter or Email.

Searching for talk on Google Books and the Settlement since Judge Denny Chin delayed the decision on October 7, I’ve been finding very little — What had been a stream of chatter in Twitter searches has turned into a trickle. I found a little example reflecting this today that I think is worth recording — The first seven hits in a Twitter search for #GBS, going back a day, are in German. … You can pretty much tell when NO ONE in the US is talking about a subject when you search in Twitter and find that the last day’s tweets are NOT IN ENGLISH! … I’d predict that in a couple of weeks there WILL be a bit of discussion in English!
