In an earlier article, I reported on searching Google Web Search for ten full-text books, which were found in Google Books and Internet Archive. Most of the titles in that study had only one edition, which makes them relatively easy to search for. In this article I report on a more complex search: one book title with multiple editions, searched both in Google Web Search and directly in Google Book Search. I also report results for searching a specific phrase from the book in Web Search and Book Search. In a separate article, I examine the results for the same title in Internet Archive.
Searching for the book
The book I used as a case-study example is Diagnostic and therapeutic technic, by Albert S. Morrow, which has four editions from 1911 to 1921.
Google Web Search: “Diagnostic and therapeutic technic” morrow – In the first 10 records retrieved, the only full-text books are two GBS records, from Harvard 1911 (#1) and Harvard 1917 (#4). [The 1911 edition was scanned twice by Harvard, as shown in the detailed GBS results below, where the two scans are distinguished by ID number.] There are no records from Internet Archive.
Google Book Search (searching the same phrase, limiting to Full-view): “Diagnostic and therapeutic technic” morrow – The only edition in the top 10 items is the same Harvard 1911 edition, ranked at the top of the list. It seems odd that Book Search finds only one edition when Web Search found two, but there is a trick in Book Search: clicking “More editions” brings up seven different records for four editions, from Harvard and Stanford. They are listed below in chronological publication order, with search retrieval rank in parentheses (as mentioned above, the 1911 edition was entered twice by Harvard; the two records are distinguished by ID number):
Harvard 1911/uL8R (#4)
Harvard 1911/PqsR (#5)
Stanford 1921 (#1) [Results list incorrect, says 1911]
Harvard 1915 (#3)
Stanford 1911 (#2) [Results list incorrect, says 1915]
Harvard 1917 (#7)
Harvard 1921 (#6)
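For readers who want to look at this list as data, here is a minimal Python sketch that transcribes the seven records above, re-sorts them by retrieval rank, and flags the records whose date in the results list differs from the date in their label. The field names (label_year, listed_year, scan_id) are my own hypothetical names for illustration, not anything Google exposes, and the values are copied by hand from the list above.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    library: str
    label_year: int                    # year used in the record's label in the list above
    rank: int                          # retrieval rank in the "More editions" list
    listed_year: Optional[int] = None  # year shown in the results list, when noted as incorrect
    scan_id: Optional[str] = None      # ID suffix used to tell the two Harvard 1911 scans apart


# Hand-transcribed from the list above (hypothetical structure, real values).
RECORDS = [
    Record("Harvard", 1911, 4, scan_id="uL8R"),
    Record("Harvard", 1911, 5, scan_id="PqsR"),
    Record("Stanford", 1921, 1, listed_year=1911),
    Record("Harvard", 1915, 3),
    Record("Stanford", 1911, 2, listed_year=1915),
    Record("Harvard", 1917, 7),
    Record("Harvard", 1921, 6),
]

# Order the records the way "More editions" ranked them.
by_rank = sorted(RECORDS, key=lambda r: r.rank)

# Records whose results-list date disagrees with the label date.
mismatched = [
    r for r in RECORDS
    if r.listed_year is not None and r.listed_year != r.label_year
]

for r in by_rank:
    note = f" (results list says {r.listed_year})" if r in mismatched else ""
    print(f"#{r.rank}: {r.library} {r.label_year}{note}")
```

Sorting by rank makes the pattern easy to see: the two Stanford records come first, and they are also the only two with a date mismatch.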
Taken together, these results for Web Search and Book Search raise questions:
With GBS records for several editions, is there any predictable reason why particular records appear in each of the searches? Why are only two records retrieved in Web Search when seven are found in Book Search?
The favored record seems to be the Harvard 1911/uL8R one, which appears first in both Web Search and Book Search. Is this chosen because it’s the first edition? Why not choose the latest edition, from 1921? It’s also notable that this seemingly favored record links to a rather confusing scan of the front cover of the book instead of linking to the clearly displayed title page or table of contents, as most of the other records do.
“More editions” search – How is the order of these determined? Is it significant that the two top-ranked records on the list above are the only ones from Stanford? Another little glitch: each of the entries linked from the initial “More editions” list has its own “More editions” link, which goes to the same list as the initial one. This looks like an oversight.
Both of the Stanford editions in the list have incorrect dates given in the search list. This fits my experience: I’ve found that Stanford GBS records are often quirky, especially compared to Harvard.
Searching for a *phrase* in the book
Google Web Search – The ability to search in Google Web Search for a phrase that occurs in a GBS book is invaluable. I searched Google Web for this phrase that I had seen in the Morrow book: “The tube devised by Crile is of German silver.” As expected, this retrieves only Morrow, and the result links to page 117, with the phrase nicely highlighted, in the same Harvard 1911 edition that’s featured in the searches above for the whole book. Clicking on “Repeat the search with the omitted results included” retrieves six of the seven editions listed above, all with the phrase highlighted. Interestingly, it also retrieves, ranked #2, the Internet Archive DjVu-formatted record for the book, which links to the top of the book record instead of to the highlighted occurrence on a page. Finding the Internet Archive record in a phrase search is surprising, since no Internet Archive records were retrieved in searching for the book title above.
Google Book Search (searching the same phrase, limiting to Full-view): “The tube devised by Crile is of German silver” – Searching for the same phrase in Google Books goes to the highlighted phrase in the Stanford 1921 edition. This is the only record retrieved (interesting that this edition was ranked #1 in the GBS search for the book title). Clicking on “More editions” retrieves the same seven records listed above, which do not link to the highlighted phrase, but only to the beginning of the book.
Thoughts on Google Books and Metadata
There’s been much talk about metadata in Google Book Search, with strong complaints about its poor quality by Geoff Nunberg and others, especially as it relates to different editions, and the examples presented in this article fit into that category. I guess I take a less negative view of this than Nunberg. The upside that I see is that it’s better to have many editions, even with perplexing search patterns, than to have a more limited number of editions. The lesson of this little case study, I think, is to be aware of the quirks, and to be persistent. In particular, if you come across an interesting reference to a book in Google Books in a Google Web Search, be sure to “repeat the search with omitted results” and to do the search in Google Book Search as well.
Keeping multiple editions straight takes the large-scale organizational skills of a librarian. It is something that librarians have more skill at than Google, and I suspect it’s going to take a while for Google to figure out how to do it well (hopefully with the help of librarians). On the other hand, where Google excels, as shown in this little study, is in hitting the small target. I apply this especially to Google’s ability to find specific phrases, and to highlight the phrase searched for, which I think will be very helpful for scholars.
Eric Rumsey is at: eric-rumseytemp AttSign uiowa dott edu and on Twitter @ericrumseytemp