Marybeth Peters, head of the US Copyright Office (part of the Library of Congress), said this in her testimony before Congress yesterday:

The Copyright Office has been following the Google Library Project since 2003 with great interest. We first learned about it when Google approached the Library of Congress, seeking to scan all of the Library’s books. At that time, we advised the Library on the copyright issues relevant to mass scanning, and the Library offered Google the more limited ability to scan books that are in the public domain. An agreement did not come to fruition because Google could not accept the terms.

As discussed in my article in June, it seems surprising that the Library of Congress has not taken a more active role in the mass-scanning project that Google is carrying out. Peters’ words explain why: the copyright mess! If copyright gets fixed, LC might be doing the project instead of Google.

It’s encouraging that Peters has finally been given a platform to talk about the mess. She did talk about it at a Columbia University meeting in March, although her talk was not widely reported and was apparently recorded only on a video that was never transcribed (see my transcription of a key passage here). At that conference, she’s reported to have said that Congress had shown no interest in hearing her testimony. Hopefully they’re ready to listen now.

Peters stresses, both in her testimony yesterday and in her talk at Columbia, that Congress needs to be the one to fix copyright law. Letting the judicial branch speak through the Settlement, she says, is making an “end run around the legislative process” (her words in yesterday’s testimony). Brewster Kahle used the same words in April.

With the GBS settlement discussion heating up, it’s becoming increasingly clear to me that the root of the problem is US copyright law. As Peters suggests, until copyright is fixed, mass-scanning of books is going to be problematic.

Eric Rumsey is on Twitter @ericrumsey

In a recent posting at O’Reilly Radar, Linda Stone discusses recent comments by Brewster Kahle and Robert Darnton on the Google Book Search Settlement. The posting is especially valuable for its discussion of the orphan books problem, which Kahle raises, as Stone reports, and which Thomas Lord and Tim O’Reilly take up in their comments. I’m excerpting this interchange here. About Kahle’s posting, Stone says that he “focused on the plight of ‘orphan works’ – that vast number of books that are still under copyright but whose authors can no longer be found.”

Thomas Lord’s first comment, in which he says he has thought much about the settlement:

My conclusion [around the time of the settlement] was that the big libraries, like Harvard, had made a bad deal — they didn’t understand the tech well enough and Google basically not only steamrollered them but implicated them in the potentially massive infringement case.

Basically, Google should have, indeed, paid for scanning and building the databases – but the ownership of those databases should have remained entirely with the libraries … The Writer’s Guild caved pretty easy and pretty early but legal pressure can still be brought to bear on Google. They can give up their private databases back to the libraries that properly should own them in the first place.

Tim O’Reilly’s comment on the article, and especially on Lord’s comment:

I agree with Tom’s analysis. (See my old post: Book search should work like web search [2006]). And I do agree with Brewster’s concern that this settlement will derail the kind of reform that would have solved this problem far more effectively. That’s still my preferred solution.

That being said, the tone of both Brewster’s comments and Darnton’s, implies that Google was up to some kind of skulduggery here. That’s unfair. Should they have stood up on principle to the Author’s Guild and the AAP? Absolutely, yes. But it’s the AG and the AAP who should be singled out for censure. … From conversations with people at Google, I believe that they do in fact continue to believe in real solutions to the orphaned works problem, and that demonizing them doesn’t do any of us any good.

The fact is, that Google made a massive investment to digitize these books in the first place. No one else was making the effort … In short, we’re comparing a flawed real world outcome with an “if wishes were horses” outcome that wasn’t in the cards. … Barring change to copyright law (and yes, we need that), Google has at least created digital copies of millions of books that were not otherwise available at all. Make those useful enough and valuable enough, and I guarantee there will be pressure to change the law so that others can profit too. …

Google Book Search was an important step forward in building an ebook ecosystem. I wish this settlement hadn’t happened, and that Google had held out for the win on the idea that search is fair use. And I wish that Google had taken the road that Tom outlined. … But they put hundreds of millions of dollars into a project that no one else wanted to touch. And frankly, I think we’re better off, even with this flawed settlement, than if Google had never done this in the first place.

Finally, I’ll point out that there is more competition in ebooks today than at any time in the past. Any claim that we’re on the verge of a huge Google monopoly, such as Darnton claims, is so far from the truth as to be laughable. Google is one of many contenders in an exploding marketplace.

Thomas Lord’s reply to O’Reilly:

… In the spirit of understanding things: you praise Google, I don’t. We’re better off those books having been scanned (I strongly agree) – I don’t like the way they bull-in-china-shop worked this. I think there’s a deep and lasting threat here that they need to fix if they want to “not be evil.”

In a brief response letter, author and publisher Marc Aronson writes about the copyright status of pictures that are in publisher partner books in Google Books. Aronson suggests that the rights for pictures are separate from the rights for text. I’ve corresponded with Aronson to expand on this idea, and he says that in his experience as an author and editor, he has been told that he needs to obtain rights to pictures and text separately. I’ve searched for other commentary on this issue, and have found very little. It’s a subject that needs exploration. Anyone have ideas?

All books in the publisher partner program, of course, are under copyright, and are available only in Limited Preview, with the publisher giving Google the rights to display a specific number of pages. For some books containing pictures, however, the pages are available but the pictures are not. Is this because the publisher has gotten the rights for limited preview of the text, but not the pictures, as Aronson suggests? The three examples below show a variety of Limited Preview options. The first two are especially pertinent, because they are books from the same publisher (Macmillan), in the same series, that differ in their picture preview status, possibly indicating that the illustrator has given permission to display pictures in the first case, but not in the second.

In this example, the first 39 pages* are available for preview, with all pictures displaying. There are about 30 thumbnail images for pages with pictures on the About this Book page.
Birds of North America (Golden Field Guides)
By Chandler S. Robbins et al., Illustrated by Arthur Singer, Published by Macmillan, 2001

In this book, from the same publisher, the first 37 pages* are available for preview, but almost none of the pictures display; they are replaced with the message “Copyrighted image.” There are no thumbnail images on the About page.
Wildflowers of North America (Golden Field Guides)
By Frank D. Venning, Illustrated by Manabu C. Saito, Published by Macmillan, 2001

This book follows the most common, fairly liberal, pattern that publishers use for Limited Preview books, with the first 50 pages* available, including all pictures. A full complement of 30 thumbnails is on the About page.
Central Rocky Mountain Wildflowers
By H. Wayne Phillips, Illustrated, Published by Globe Pequot, 1999

* The number of pages available for preview varies from session to session; the number given here is the maximum I experienced.