Seeing the Picture

Thoughts on digitization & libraries while working on Hardin MD


Category Archives: Metadata

Metadata About Metadata: Library Catalog Fail

Posted on September 3, 2009 by Eric Rumsey

David Weinberger’s book Everything is Miscellaneous: The Power of the New Digital Disorder is fascinating, and I’m especially enjoying his many original comments on metadata. So, trying out Weinberger’s ideas, I searched local library catalogs for david weinberger metadata and got: NO ENTRIES FOUND … Hmmm … How does Google Book Search compare? I did the same search in GBS, and bingo: there it is, at the top of the list …

Google, of course, puts the book at the top of the list because its deep metadata indicates that metadata is an important topic, and PageRank likely indicates that other people also value Weinberger’s discussion of the topic.

So, why don’t library catalogs find the book? The problem is the subject headings assigned by the Library of Congress and used in almost all library catalogs:

Knowledge management.
Information technology — Management.
Information technology — Social aspects.
Personal information management.
Information resources management.
Order.

Even though the book discusses metadata at length and on many pages, the topic is not deemed important enough to be a heading. The problem is that the traditional catalog is what Weinberger calls a “second order” resource, limited to the small number of subject headings that will fit on a card in the (bygone) catalog. Given the resources to assign more subject headings, catalogers would no doubt include metadata.

So … Librarians can’t afford to be smug about metadata — Google has problems (as discussed in Geoff Nunberg articles linked below). But libraries have their own problems. In many ways the traditional library catalog lacks metadata features that have become common in Google, Amazon, and other sites.

Hope for Libraries — WorldCat does find the book with the david weinberger metadata search (#2 in results), because it has additional tags listed in its “Abstract” (scroll down) which include metadata — Sooner or later, maybe libraries will add the WorldCat Abstract to their catalogs to “enrich their metadata.”
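
The difference between the two searches can be sketched in a few lines of Python. This is only an illustration, not how any real catalog or WorldCat actually works: the CatalogRecord class, the matches function, and the rule that every query term must appear are assumptions made for the example, and the only abstract tag shown is “metadata,” the one mentioned above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CatalogRecord:
    """Hypothetical, simplified catalog record (not a real catalog schema)."""
    title: str
    author: str
    subject_headings: list[str]
    abstract_tags: list[str] = field(default_factory=list)


def indexed_words(record: CatalogRecord, include_abstract: bool) -> set[str]:
    """Collect the lowercase words a keyword search is allowed to see."""
    parts = [record.title, record.author, *record.subject_headings]
    if include_abstract:
        parts.extend(record.abstract_tags)
    return set(re.findall(r"[a-z0-9]+", " ".join(parts).lower()))


def matches(record: CatalogRecord, query: str, include_abstract: bool = False) -> bool:
    """Assumed rule: a record is found only if every query term is indexed."""
    words = indexed_words(record, include_abstract)
    return all(term in words for term in query.lower().split())


book = CatalogRecord(
    title="Everything is Miscellaneous: The Power of the New Digital Disorder",
    author="David Weinberger",
    subject_headings=[
        "Knowledge management",
        "Information technology -- Management",
        "Information technology -- Social aspects",
        "Personal information management",
        "Information resources management",
        "Order",
    ],
    # Only the tag named in the post; other abstract tags are not reproduced.
    abstract_tags=["metadata"],
)

query = "david weinberger metadata"
print(matches(book, query))                         # subject headings only
print(matches(book, query, include_abstract=True))  # headings plus abstract tags
```

With only the title, author, and six subject headings indexed, the word “metadata” never appears, so the first search fails; adding the abstract tags is enough for the second to succeed.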

Context: Recent articles by Geoff Nunberg:
Google Books: A Metadata Train Wreck, Language Log blog
Google’s Book Search: A Disaster for Scholars, Chronicle of Higher Education

Eric Rumsey is on Twitter @ericrumsey

Posted in Google Book Search, Library Catalog, Metadata, PicsYes, Train Wreck, Uncategorized.

“Metadata Train Wreck”: Librarians Should Tread Lightly

Posted on September 3, 2009 by Eric Rumsey

There’s been much buzz among librarians and others about recent articles by Geoff Nunberg (UC Berkeley School of Information) on the “Train Wreck” state of metadata in Google Book Search (see article references below). Nunberg certainly makes some good points. But we librarians are far from perfect in the metadata realm. Look at the good ol’ Card Catalog: problems abound, as described in an amusing YouTube piece by librarian Brian Mathews (@brianmathews). He uses Georgia Tech as an example, but the same sorts of problems exist in many catalogs. It’s especially appropriate, in light of Nunberg’s emphasis on date problems in GBS, that one of the examples highlighted by Mathews (below) is author dates … Born by the metadata, die by the metadata …

Context: Recent articles by Geoff Nunberg:
Google Books: A Metadata Train Wreck, Language Log blog
Google’s Book Search: A Disaster for Scholars, Chronicle of Higher Education

Posted in Google Book Search, Library Catalog, Metadata, PicsYes, Train Wreck, Uncategorized.

Metaphorical Marginalia as Metadata

Posted on August 18, 2009 by Eric Rumsey

Marginalia, the writing of notes in the margins of books, is not an exact fit for digital books. But the concept has been bandied about in a metaphorical sense to denote any kind of user annotation in digital texts. My June article on Cathy Marshall’s studies of user annotations includes this quote from Mitch Ratcliffe: “Creating marginalia is an art made for the era of ‘crowdsourcing’.” Ratcliffe’s article is a long technical commentary that sounds very “metadata-ish,” although he doesn’t actually use the meta-word. So it wasn’t surprising to find that he has made the connection between user annotation/marginalia and metadata, in a July article [boldface in all quotes added]:

Readers will be the creators of the most important metadata describing books. Period. There is no second-guessing that conclusion, which has been proved again and again in every hypertext environment in human history. Defining the problem of book metadata without treating the reader as the fulcrum of the process is missing the point.

Poking around a bit more to find connections between “metadata” and the “metaphorical marginalia” of user annotations, I found interesting commentary in 2005 around David Weinberger’s article Crunching the metadata (Excerpt from Weinberger, boldface added):

We’re going to need massive collections of metadata about each book. Some of this metadata will come from the publishers. But much of it will come from users who write reviews, add comments and annotations to the digital text, and draw connections.

As the digital revolution continues, and as we generate more and more ways of organizing and linking books–integrating information from publishers, libraries and, most radically, other readers–all this metadata will not only let us find books, it will provide the context within which we read them. … The real challenge to traditional publishing today comes not from the digitizing of books, then, but from the very nature of the Web itself. Using metadata to assemble ideas and content from multiple sources, online readers become not passive recipients of bound ideas but active librarians, reviewers, anthologists, editors, commentators, even (re)publishers.

Weinberger doesn’t use the word “marginalia.” But the word IS used in Ben Vershbow’s articles commenting on Weinberger. In Vershbow’s first article (tagged with “marginalia”) he says:

The book in the network is a barnacled spirit, carrying with it the sum of its various accretions. Each book is also its own library by virtue not only of what it links to itself, but of what its readers are linking to, of what its readers are reading.

And in another article by Vershbow that continues on the same theme, Daniel Anderson, in a comment, is reminded of Billy Collins’ poem, “Marginalia,” which, he says, “points to the conversations that take place as readers jot their reactions in the margins of books.” Excerpt from the poem, as quoted by Anderson:

Sometimes the notes are ferocious,
skirmishes against the author
raging along the borders of every page
in tiny black script.
If I could just get my hands on you,
Kierkegaard, or Conor Cruise O’Brien,
they seem to say,
I would bolt the door and beat some logic into your head. …

Yet the one I think of most often,
the one that dangles from me like a locket,
was written in the copy of Catcher in the Rye
I borrowed from the local library
one slow, hot summer . . .

A few greasy looking smears
and next to them, written in soft pencil-
by a beautiful girl, I could tell,
whom I would never meet-
“Pardon the egg salad stains, but I’m in love.”

****************

The Talmud page with marginalia at left is from Vershbow’s article Web marginalia.

Posted in Digitization, eBooks, Marginalia, Metadata, PicsYes, Uncategorized.

“Books: The Liquid Version” — Kevin Kelly

Posted on August 5, 2009 by Eric Rumsey

What Happens When Books Connect?

Kevin Kelly’s long NY Times article in 2006 (Scan this Book!) on Google Book Search has some elegant words on the transformative effect of digital books in general, beyond GBS. His words are echoed in several recent commentaries that I’ve written about — I’ll precede excerpts from Kelly with connecting ideas from these recent articles:

Kelly’s comments parallel the contrast between the static print library of Borges and the flowing, connected library of Rushdie:

(Kelly) The common vision of the library’s future (even the e-book future) assumes that books will remain isolated items, independent from one another, just as they are on shelves in your public library. But this vision misses the chief revolution birthed by scanning books: in the universal library, no book will be an island.

Mike Cane sees metadata as the real gold of digital books (cross-linked … extracted … indexed … analyzed …) — Corresponding with Kelly’s real magic:

(Kelly) Turning inked letters into electronic dots that can be read on a screen is simply the first essential step in creating this new library. The real magic will come in the second act, as each word in each book is cross-linked, clustered, cited, extracted, indexed, analyzed, annotated, remixed, reassembled and woven deeper into the culture than ever before. In the new world of books, every bit informs another; every page reads all the other pages.

Clive Thompson says the community of digital book readers will transform books by building a web of linked commentaries. Kelly says much the same:

Once a book has been integrated into the new expanded library by means of this linking, its text will no longer be separate from the text in other books. … Books, including fiction, will become a web of names and a community of ideas.

Peter Brantley imagines a world in which digital books become connected as long winding rivers that flow together. Here’s Kelly sounding similar: …

Search engines are transforming our culture because they harness the power of relationships, which is all links really are. … This tangle of relationships is precisely what gives the Web its immense force. The static world of book knowledge is about to be transformed by the same elevation of relationships, as each page in a book discovers other pages and other books. Once text is digital, books seep out of their bindings and weave themselves together.

Rushdie describes the countless currents in the Stream of Stories “weaving in and out of one another like a liquid tapestry” — Likewise Kelly:

When books are digitized, reading becomes a community activity. … In a curious way, the universal library becomes one very, very, very large single text: the world’s only book. … So what happens when all the books in the world become a single liquid fabric of interconnected words and ideas? …

What Happens When Books Connect? This is the title of the second section of Kelly’s article, from which most of the quotes above are taken, and it is really an overriding theme for all of them: the digitized books of the future will talk easily to each other, which will transform books in the same way the Web has already transformed other aspects of culture.

The title of this article — Books: The Liquid Version — is the title of the third section of Kelly’s article.

Eric Rumsey is at: eric-rumsey AttSign uiowa dott edu and on Twitter @ericrumsey

Posted in Digitization, eBooks, Google Book Search, Metadata, PicsNo, Rushdie, The Stream, Uncategorized.

The Future, it’s in the Metadata

Posted on July 31, 2009 by Eric Rumsey

Spurred on by positive reaction to my recent article on metadata, I did more digging in Twitter, and came across this interesting tweet from Christian Science Monitor librarian Leigh Montgomery (@CSMLibrary):

#Journalism future? ‘It’s in the data.’ #Metadata, that is – makes the #news last, rather than a perishable commodity http://tr.im/lmetadata
9:52 PM Jul 22nd

Montgomery brings an interesting perspective, with feet in the world of librarianship, where metadata has been a focus for a long time, and in journalism, which has only more recently begun to awaken to the value of metadata. Montgomery’s tweet links to a blog article (The Future, it’s in the Data) by journalist Carrie Brown-Smith (@Brizzyc), who interviewed Montgomery. Here’s an excerpt …

(quoting Montgomery) Librarians are precisely who have been leading in adding value and context to information … In all the ink and pixels spilled over the future of journalism I have not heard one mention of this … Information is valuable, and it needs structure, keywording, and taxonomy added so it can be accessed, and repurposed. All this is then repackaged and sold …

Brown-Smith also reports on a recent provocative article by journalist Dan Conover (@xarker) about the importance of adding data to news stories which could provide “a rich trove that could be mined to discover new connections and relationships.” (quote from Brown-Smith)

Conover’s article is a long and chatty discussion of metadata in journalism, and why news reporters resist adding it. He tells an interesting story of reporting on a house fire with and without metadata, and how coding can increase the future value of the work of reporting. He says that “the structure of [metadata] information is [now] the news organization’s primary product.” Unfortunately, though, he says, journalists hate the idea of adding this structured metadata — Why? …

Metadata coding is viewed as a library (or, in newsroom jargon, “morgue”) function … Journalism is a profession for storytellers, and our newsroom culture celebrates romantic myths that are generally hostile to structure. So I understand my curmudgeonly colleagues when they scoff behind my back at the word “metadata.”

I suspect that journalism is not the only profession that “celebrates romantic myths that are hostile to structure” ;-) … In journalism, as in publishing and libraries, discussed in my previous article, we’ve come to the interesting point where it’s the computer-library-coding geeks who will be, in Mike Cane’s words, “the new publishers for a new age” … the ones who “make information do things.”

Posted in Libraries, Metadata, Newspapers, PicsNo, Publishing, Uncategorized.

Metadata will Rule the World

Posted on July 29, 2009 by Eric Rumsey

As so often happens, there are gems far down in Mike Cane’s blog article (Dumb eBooks Must Die, Smart eBooks Must Live) that deserve more prominence. Cane says the real potential of eBooks will only be realized when the “hidden” metadata content is brought out (boldface added):

All of this hidden information — exploded out, made explicit — turns an ebook from a dumb object into a smart object. … With such exploded data, an eBook becomes a ticket for admission to a vast collection of databased information.

An eBook becomes a local terminal connected to a growing and living cloud of associated information, with meanings and implications no publisher or writer can currently imagine. It lets the reader make those connections. It’s an eBook that can do something. … And this is precisely why Google wants the Book Search settlement to go through: it sees that as the future. Google is staffed by geeks who juggle information with an expertise that print publishers lack. … Google makes information do things.

Print publishing freezes information into a static object — An object that stands alone, disconnected, unable to do anything. … There needs to be another layer slathered over [the Publisher]. The information geeks. The ones who will take the static objects, extract the hidden information, and database it. … They are new publishers for a new age.

This metadata has value. And that value will increase as it ages. As new connections are formed, and new data is added, its value increases exponentially. The metadata value of a publisher could equal, if not surpass, that of the works on which it’s based.

Metadata will become a multi-billion dollar business. … The entire global economy is built on metadata. And it’s accessing that metadata that would justify more than a five-dollar price for an eBook. Consumers would see [that] an investment has been made to turn a text data dump into something active and intelligent. … no longer a flat, linear collection of words. Dimensions have been added that breathe and grow. The eBook price becomes a ticket. People are … buying into an ongoing experience.

Metadata Librarians will Rule the World …

Metadata, of course, is a concept that’s near-and-dear to the hearts of librarians ;-) … Which led to a bit of serendipity in thinking about the title of this article. I found in Google that no one else has used the phrase “Metadata will Rule the World.” But in playing around with various combinations I did discover the phrase “Librarians will one day rule the world” — in a 2004 blog post by Robert Wolfe (@metametadata), who works in Metadata services at the MIT library.

Of course, I couldn’t resist the opportunity to use the picture here of Wolfe’s Librarian Trading Card — “That’s right, I’ve got special metadata related powers.” (How do you like that, Mike Cane!)

The last posting on the Metametametadata blog is from August 2006. Too bad: Robert Wolfe has interesting ideas.

Cane: Metadata turns an eBook into an active, growing, living cloud …

Cane’s contrasting of living eBooks with print publishing’s static books is reminiscent of the language used in the articles I wrote in May on the Stream as the new metaphor of the Web, particularly the article on Salman Rushdie’s Stream library & JL Borges’ Print library — Rushdie’s vision of books twisting and stretching and weaving in and out of each other sounds much like Cane’s vision in the quote above.

Related articles:

  • Books: The Liquid Version – What Happens When Books Connect?

Posted in eBooks, Google Book Search, Libraries, Metadata, PicsYes, The Stream, Uncategorized.

Eric Rumsey is a librarian and web developer at the Hardin Library for the Health Sciences, University of Iowa. He is the founder and manager of the Hardin MD site.

© 2013 Seeing the Picture, all rights reserved.