With the lack of progress in resolving the Settlement, Google Book Search has been out of the news recently. So I was glad to see that Hugh McGuire, in the second half of his recent article about the disappearing line between the book and the Internet, credits GBS with helping to lay the foundation for this shift (see comments below). In the first part of the article, McGuire gives a lucid description of the future world of books on the Internet that’s being foreshadowed by GBS:
A book properly hooked into the Internet is a far more valuable collection of information than a book not properly hooked into the Internet. … The false battles between ebooks and print books continue to ignore the real, though as-yet unknown, value that comes with books being truly digital; not the phony, unconnected digital of our current understanding of “ebooks.” … We still consider that books live outside of the Internet. There is massive and untapped (and unknown) value to be discovered once books are connected.
McGuire says our idea of what an eBook should be still carries a lot of baggage from our idea of what a book is: we still think of eBooks as being separate from the Web. In particular, eBooks, like print books, are not searchable or linkable, and their text can’t be copy-pasted. But full-view Google Books are searchable, linkable and deep-linkable, and their text can be copy-pasted.
Being able to find books in a Web search is key to removing the barrier between books and websites, as McGuire says, and Google has been making that capability a reality as it steadily works with libraries to scan their books. Searching for books within Book Search has been part of GBS since the beginning. In the last three years or so, books have also been increasingly integrated into Web Search results. This is a big step, I think, in connecting books to the Web. Searching Google is how we find out about things these days, so being able to retrieve books and web pages in the same search is extremely valuable. It’s a great advance that’s gotten little attention.
Not surprisingly, the people who have especially learned to value bringing books onto the Web are historical researchers. With pre-1923 books being out of copyright and freely available in GBS full-view searches, historians have quickly taken advantage. In a recent forum of historians on GBS, it’s described with words like indispensable … enlivening.
While historians have voiced how much they value the connected books in Google Books, there are certainly many non-academics with little voice who are also learning to use it. Google engineer Dan Clancy says that half of the out-of-copyright books in GBS have at least 10 pages viewed per month, and a large portion of this use must surely come from the general non-academic public. One commenter on McGuire’s article (who apparently doesn’t know about GBS) expresses the voice of this little-heard group: “If books/entire texts were *searchable* on the internet, people searching the web could more easily find exact phrases/specific information in books. To me, this is HUGE!” Indeed, and what a fine, pithy statement of McGuire’s thesis!
To conclude, a plug for metadata as the key to connecting books to the Web, and for why it’s people like the geeks at Google who are doing it before the traditional publishing industry. In the words of Mike Cane:
All of this hidden information — exploded out, made explicit — turns an ebook from a dumb object into a smart object. … Google is staffed by geeks who juggle information with an expertise that print publishers lack. … Google makes information do things. … Print publishing freezes information into a static object — An object that stands alone, disconnected, unable to do anything. … The information geeks are new publishers for a new age … Metadata has value. As new connections are formed, and new data is added, its value increases exponentially.
Bringing the world’s books to digital life on the Web, whether it’s done by Google or someone else, is a story that’s just beginning. For now though, Google Book Search gives us a start in exploring the new world of the book and the Internet.
Eric Rumsey is at: eric-rumsey AttSign uiowa dott edu and on Twitter @ericrumsey
I do mention it, see:
“There is little talk of this anywhere in the publishing industry that I know of, but the foundation is there for the move — as it should be. And if you are looking at publishing with any kind of long-term business horizon, this is where you should be looking. (Just ask Google, a company that has been laying the groundwork for this shift with Google Books).”
My mistake. I see that you did have GBS in mind. So sorry; I’ve edited the first paragraph. OK? (Redfaced Eric)
ha! thanks….
one curiosity of the e-book scene is that
it seems to have no collective memory…
the notion that books should be mounted
on the internet is _not_ “mcguire’s thesis”,
as you put it here, eric. not by a long shot.
we’ve been researching this matter for years.
the ideas even predate the google initiative…
the o.c.a. has mounted books ever since it
started scanning them, way back (in 2003?).
the issue, as you see brewster stated it here,
> http://www.futureofthebook.org/blog/archives/2006/12/microsoft_launches_live_search.html
is “some experimentation on how book material
should appear online”. as you will also see there,
i pointed to a number of books that i’d mounted.
the website they were on has gone down, but
i moved them to another website that i control:
> http://z-m-l.com/go/tolbk/tolbkp023.html
> http://z-m-l.com/go/mabie/mabiep123.html
> http://z-m-l.com/go/myant/myantp123.html
> http://z-m-l.com/go/sgfhb/sgfhbp123.html
> http://z-m-l.com/go/ahmmw/ahmmwp023.html
so if you want to move discussion past the “gee whiz”
stage, i have plenty of thoughts i can share with you…
or, if you prefer to wait until 2012, when this topic is
brought up once again, as if it were _brand_new_, then
we can wait and do it then…
-bowerbird
bowerbird, your markup pages look interesting. Any chance ZML will ever be widely implemented?
The real gee-whiz part of this thing is searchability. If you can find a place to start putting ZML pages where they’re searchable by Google-Bing-Yahoo … Hmmm…
Thanks for the comment.
eric-
> Any chance ZML will ever be widely implemented?
it’s how everything will be done, in 20 years or so.
how it comes about is not of much concern to me…
i’ll soon present proof-of-concept with working code
across the entire workflow… after that, it’s an i.q. test
on if/when the world is smart enough to implement it.
(when i say “it”, i mean lightweight markup. the most
prevalent form at the present is “markdown”, which has
achieved fairly good penetration. markdown isn’t really
“light” enough for my taste, but it gets the idea across,
the idea being that machines should apply the markup,
not people; we just have to leave enough clues for them.)
> If you can find a place to start putting ZML pages
> where they’re searchable by Google-Bing-Yahoo
all books need to be put onto a single cyberlibrary site.
tens of millions of books, on this one site, for everyone.
so search engines can/should/must catalog them there.
but i disagree that search is the reason for the cyberlibrary.
search is simple, there’s nothing to it, it’s not difficult to do.
what _is_ hard is to get the books to “talk to each other”,
to use the phrase coined by kevin kelly. once we develop
a technology that finds the “connections” between books
– between disparate pieces of knowledge in general –
our wisdom will take a quantum leapfrog into the future.
_this_ is the reason to put all our books on the open web.
of course, there’ll be great value in the cyberlibrary per se
– every book at the immediate disposal of every person,
what a boon for worldwide education that will prove to be! –
but it is the “hive” level which will provide the huge payoff.
-bowerbird
bowerbird, good words. OK if I tweet a link to them and/or excerpt them in another article?
godspeed, good buddy…
-bowerbird