<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Seeing the picture &#187; Long Tail</title>
	<atom:link href="http://blog.lib.uiowa.edu/hardinmd/category/long-tail/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.lib.uiowa.edu/hardinmd</link>
	<description>Thoughts while working on Hardin MD on digitization &#38; libraries</description>
	<lastBuildDate>Wed, 18 Nov 2009 22:16:00 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>What&#8217;s in Wikipedia? A Very Long Tail</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2009/08/28/whats-in-wikipedia-a-very-long-tail/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2009/08/28/whats-in-wikipedia-a-very-long-tail/#comments</comments>
		<pubDate>Fri, 28 Aug 2009 19:45:48 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Long Tail]]></category>
		<category><![CDATA[PicsNo]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Wikipedia]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=3695</guid>
		<description><![CDATA[The list below is 50 consecutive random links to Wikipedia articles using the Random Article link that&#8217;s in all articles. As suggested in a recent study by Kittur, Chi &#38; Suh (discussed below) I&#8217;ve divided these random articles into the top level Wikipedia categories. More interesting than these categories are other broad subjects (as picked [...]]]></description>
			<content:encoded><![CDATA[<p>The list below is 50 consecutive random links to Wikipedia articles, generated with the Random Article link that&#8217;s in all articles. As suggested in a recent study by Kittur, Chi &amp; Suh (discussed below), I&#8217;ve divided these random articles into the top-level <a href="http://en.wikipedia.org/wiki/Portal:Contents/Categorical_index">Wikipedia categories</a>. More interesting than these categories are other broad subjects (as picked out by me) in the articles below: Sports (7 articles), Pop Music (6), Europe (5), Politics (4), India (3). These subjects, I think, give a good flavor of the sorts of articles in Wikipedia.</p>
<p>Beyond the categories and subcategories, though, the most striking thing about this random sample of Wikipedia articles is how narrow and limited the articles are: almost all of them are about things that <strong>No One Has Heard Of</strong>! It&#8217;s a great example of the <a href="http://blog.lib.uiowa.edu/hardinmd/category/long-tail/">Long Tail effect</a>. Only in this case, it seems to be almost <strong>all Tail</strong>, and <strong>very little Head</strong>. Obviously, there are thousands of Wikipedia articles on well-known subjects, which we read every day. But in terms of numbers, the articles on minor, unheard-of subjects vastly outnumber the popular ones.</p>
<p>[There's <a href="#anchor1">more commentary</a> below following the list]</p>
<h3>Culture</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Oscuros_Rinocerontes_Enjaulados">Oscuros Rinocerontes Enjaulados</a><br />
A 1990 short Cuban film, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Zoo_in_Budapest">Zoo in Budapest</a><br />
Film (1933) directed by Rowland V. Lee, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Yoga_for_You">Yoga for You</a><br />
Indian television series, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/The_Dingees">The Dingees</a><br />
Band formed in Orange County, California in 1996, 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/RTZ">RTZ</a><br />
American rock band in the late 1980s, 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/I_Ribelli">I Ribelli</a><br />
Italian rock group formed in 1959, 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/The_Family_Values_2001_Tour">The Family Values 2001 Tour</a><br />
Music Album, 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Citizen_Fish">Citizen Fish</a><br />
A ska punk band that have been together since 1990, 3 screens</li>
<li><a href="http://en.wikipedia.org/wiki/List_of_Boston_Celtics_head_coaches">List of Boston Celtics head coaches</a><br />
3 screens</li>
<li><a href="http://en.wikipedia.org/wiki/2009_AEGON_International_-_Women%27s_Doubles">2009 AEGON International &#8211; Women&#8217;s Doubles (Tennis)</a><br />
2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/American_Sportsman%27s_Library">American Sportsman&#8217;s Library</a><br />
Series of 16 volumes, from an American perspective, published 1902-1905, 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Close_Combat:_First_to_Fight">Close Combat: First to Fight</a><br />
Squad-based military first-person shooter video game, 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Tightening_key">Tightening Key (Painting)</a><br />
Small wedge inserted into the corners of a canvas stretcher frame, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Switch_(BDSM)">Switch (BDSM)</a><br />
BDSM is a compound acronym from the terms bondage, discipline, sadism, masochism, 2 screens</li>
</ul>
<h3>People</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Anthony_Stack">Brigadier General Anthony Stack</a><br />
Currently a Brigadier General in service of the Canadian Forces, 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Roy_O._Woodruff">Roy Orchard Woodruff</a><br />
Politician, soldier, printer and dentist from Michigan (1876 &#8211; 1953), 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Ed_Bryant">Ed Bryant</a><br />
Former Republican member of the US House of Representatives from Tennessee, 1948- , 3 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Missy_Higgins">Missy Higgins</a><br />
Australian singer-songwriter, 1983 &#8211; , 7 screens</li>
<li><a href="http://en.wikipedia.org/wiki/R%C4%83zvan_Sab%C4%83u">Răzvan Sabău</a><br />
Romanian tennis player, 1977- , 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Mirza_Rizvanovi%C4%87">Mirza Rizvanović</a><br />
Bosnian football defender, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Steve_Byrne">Steve Byrne</a><br />
American stand-up comedian, 1974- , 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Joan_Hambidge">Joan Hambidge</a><br />
Afrikaans poet, literary theorist and academic, 1956- , 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Agim_Kaba">Agim Kaba</a><br />
American-Albanian actor, writer, director, sound editor, dancer, and film producer, 1980- , 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Verda_Welcome">Verda Welcome</a><br />
African-American teacher, civil rights leader, and Maryland state senator, 1907 &#8211; 1990, 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Harolyn_Blackwell">Harolyn Blackwell</a><br />
African-American lyric coloratura soprano, 1955- , 7 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Ron_Sobieszczyk">Ron Sobieszczyk</a><br />
Retired American professional basketball player, 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/John_Cumberland">John Cumberland</a><br />
Former Major League Baseball player and coach, 1947- , 1 screen, Stub</li>
</ul>
<h3>Geography</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Cwmcarn_Forest_Drive">Cwmcarn Forest Drive</a><br />
Tourist attraction and scenic route in Cwmcarn, Crosskeys, Wales, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Vila_Ch%C3%A3_(Vila_do_Conde)">Vila Chã</a><br />
Portuguese parish with 2,957 inhabitants and a total area of 5.49 km², 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Chojnowo,_Lubusz_Voivodeship">Chojnowo</a><br />
Village in Krosno Odrzańskie County, Lubusz Voivodeship, western Poland, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Interstate_17">Interstate 17</a><br />
5 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Withee_(town),_Wisconsin">Withee (Town) Wisconsin</a><br />
Town in Clark County in the US state of Wisconsin, with population of 885 at the 2000 census, 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Edson,_Wisconsin">Edson, Wisconsin</a><br />
Town in Chippewa County in the US state of Wisconsin, with population of 966 at the 2000 census, 1 screen</li>
</ul>
<h3>Society</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Supercarrier">Supercarrier</a><br />
Unofficial descriptive term for the largest type of aircraft carrier, 3 screens</li>
<li><a href="http://en.wikipedia.org/wiki/RAAF_Curtin">RAAF Base Curtin</a><br />
Royal Australian Air Force base located on the north coast of Western Australia, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Massachusetts_House_of_Representatives">Massachusetts House of Representatives</a><br />
16 screens</li>
<li><a href="http://en.wikipedia.org/wiki/List_of_United_States_Supreme_Court_cases,_volume_211">List of all the United States Supreme Court cases from volume 211 of the United States Reports</a><br />
2 screens</li>
</ul>
<h3>History</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Alexander_Stewart,_Duke_of_Rothesay">Alexander Stewart, Duke of Rothesay</a> (ca 1430)<br />
1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Redwood_Castle">Redwood Castle</a><br />
Norman castle in Lorrha, County Tipperary, Ireland, 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Friedrich_Wilhelm_Niepelt">Friedrich Wilhelm Niepelt</a><br />
German entomologist, 1862 &#8211; 1936, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Nattal_Sahu">Nattal Sahu</a><br />
Earliest known Agrawal merchant-prince, who lived ca 1189, 1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Jayaprakash_Narayan">Jayaprakash Narayan</a><br />
Indian freedom fighter and political leader, 1902 &#8211; 1979, 4 screens</li>
</ul>
<h3>Science</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Swannia">Swannia</a><br />
Genus of moth in the family Geometridae, 1 screen, Stub</li>
<li><a href="http://en.wikipedia.org/wiki/Proflazepam">Proflazepam</a><br />
Drug which is a benzodiazepine derivative, 1 screen, Stub</li>
</ul>
<h3>Technology</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Collaborative_Application_Markup_Language">Collaborative Application Markup Language</a><br />
XML based markup language, 2 screens</li>
<li><a href="http://en.wikipedia.org/wiki/Plant_reliability_analyst">Plant Reliability Analyst</a> (Profession?)<br />
1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Scissors-glasses">Scissors-glasses</a><br />
Eyeglasses mounted on scissoring stems rather than on temple stems, 1 screen</li>
</ul>
<h3>Math</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Adjunction_space">Adjunction space</a><br />
A common construction in topology, 1 screen</li>
</ul>
<h3>Disambiguation</h3>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Blue_collar_(disambiguation)">Blue collar (disambiguation)</a><br />
1 screen</li>
<li><a href="http://en.wikipedia.org/wiki/Tiga">Tiga (Disambiguation)</a><br />
1 screen</li>
</ul>
<p><a name="anchor1"></a><br />
I&#8217;m assuming that the Random Article link used to derive these links is truly random, that it does give a good sample of all Wikipedia articles. Surprisingly, I have not been able to find a Wikipedia article on &#8220;Wikipedia Random Article,&#8221; or any other commentary on it that might give an idea about this. I also have found no indication that anyone else has attempted to make a list of random Wikipedia articles, as presented here. Please let me know in a Comment if I&#8217;m missing something!</p>
<p>The purpose of the <a href="http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf">Kittur, Chi &amp; Suh paper (PDF)</a> mentioned above was to map all Wikipedia articles to one of the top-level Wikipedia categories. The articles in the list above fit their results fairly well for most of the categories. Here are their results and mine (in parentheses):</p>
<p style="padding-left: 180px;">Culture: 30% (28%)<br />
People: 15% (26%)<br />
Geography: 14% (12%)<br />
Society: 12% (8%)<br />
History: 11% (10%)<br />
Science: 9% (4%)<br />
Technology: 4% (6%)<br />
Religion: 2% (0%)<br />
Health: 2% (0%)<br />
Math: 1% (2%)<br />
Philosophy: 1% (0%)</p>
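<p>As a rough check on the sample percentages in parentheses above, here is a minimal Python sketch. The counts are tallied by hand from the list above. The <code>sample_counts</code> name, and the choice to divide by all 50 links (including the 2 disambiguation pages, which have no KCS category), are assumptions made for illustration and are not taken from the KCS paper.</p>

```python
# Hand-tallied counts per heading in the list above (hypothetical variable name).
sample_counts = {
    "Culture": 14,
    "People": 13,
    "Geography": 6,
    "Society": 4,
    "History": 5,
    "Science": 2,
    "Technology": 3,
    "Math": 1,
    "Disambiguation": 2,
}

total = sum(sample_counts.values())  # all 50 random links

# Share of the whole sample for each heading, rounded to a whole percent.
sample_percent = {
    name: round(100 * count / total) for name, count in sample_counts.items()
}

for name, pct in sample_percent.items():
    print(f"{name}: {pct}%")
```

<p>With these counts, Culture comes out at 28% and People at 26%, matching the figures in parentheses. The three categories with no articles in the sample (Religion, Health, Philosophy) are simply absent from the tally.</p>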
<p>My assignment of categories is imprecise at best, and a sample of 50 articles is small, so it&#8217;s not surprising that there&#8217;s not complete agreement between my findings and those of KCS. It&#8217;s also possible that the real division of categories has changed since KCS collected data for their study, in January 2008. Finally, one more bit on KCS: their paper has the same base title as this article (What&#8217;s in Wikipedia?). I actually thought of this title before I found their paper. In fact, I found their paper because I searched for the title after I thought of it for this article! So I don&#8217;t feel like I&#8217;m stealing their title <img src='http://blog.lib.uiowa.edu/hardinmd/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>Acknowledgement to my son David: The writing of this article is a long tale in itself! It arose from a (rather harebrained, I now see) question I pondered: whether there&#8217;s a way to generate a &#8220;random Web page&#8221; from anywhere (The answer, I think, is No, but that&#8217;s a separate discussion). As I discussed this idea with David, he mentioned the Random Article link on Wikipedia articles. I had actually never noticed this before, and found it quite interesting, which led to this article. Also, David confirmed from his younger perspective that the links in the sample above are indeed obscure!</p>
<p>Note: These random links were generated over three separate days in the last week.</p>
<p>Eric Rumsey is on Twitter <a href="http://twitter.com/ericrumsey">@ericrumsey</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2009/08/28/whats-in-wikipedia-a-very-long-tail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Jon Orwant on Google Book Search at TOC &#8211; Slides with data</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2009/02/23/jon-orwant-on-google-book-search-at-toc-slides-with-data/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2009/02/23/jon-orwant-on-google-book-search-at-toc-slides-with-data/#comments</comments>
		<pubDate>Mon, 23 Feb 2009 23:10:15 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Google Book Search]]></category>
		<category><![CDATA[Long Tail]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[TOC]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=2130</guid>
		<description><![CDATA[The slides and data from Jon Orwant&#8217;s presentation on Google Book Search at TOC, that were not available when I wrote previously, have now been put up on the O&#8217;Reilly site. [these have been removed, see comment below] This is made up of 59 PDF slides, covering a range of recent developments with Google Books, [...]]]></description>
			<content:encoded><![CDATA[<p>The slides and data from Jon Orwant&#8217;s presentation on Google Book Search at TOC, which were not available when I <a href="http://blog.lib.uiowa.edu/hardinmd/2009/02/20/jon-orwant-on-google-book-search-at-toc/">wrote previously</a>, have now been put up <a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf">on the O&#8217;Reilly site</a>. [<strong>these have been removed, see <a href="http://blog.lib.uiowa.edu/hardinmd/2009/02/23/jon-orwant-on-google-book-search-at-toc-slides-with-data/#comment-579">comment</a></strong> below] The presentation is made up of 59 PDF slides, covering a range of recent developments with Google Books, including the recent release of GBS mobile, and a discussion of the October 2008 publisher settlement. The part I&#8217;m most interested in is the data on GBS usage that had been mentioned by Orwant in various venues before, but with few details. The details in the TOC presentation are mostly in three &#8220;Case studies&#8221; of publishers that participate in the GBS Partner Plan: McGraw-Hill, Oxford University Press, and Springer. I&#8217;ve chosen one slide for each of these publishers that shows various <a href="http://blog.lib.uiowa.edu/hardinmd/2009/02/06/google-books-and-the-long-tail/">long-tail effects</a> for usage of their books that are in GBS, and one slide that has data for a more extensive grouping from GBS.</p>
<p>The McGraw-Hill case study is presented in slides 21-23. Below is <a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=24">slide 24</a>. Note that this is a small sample of only the top 30 titles.</p>
<p><a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=24"><img class="alignnone size-full wp-image-2134" style="padding-bottom: 15px" src="http://blog.lib.uiowa.edu/hardinmd/files/2009/02/slide24_2.jpg" border="0" alt="" width="475" height="378" /></a></p>
<p>Oxford University Press &#8211; Slides 26-31. Below is <a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=27">slide 27</a>. Note the long tail of visits for pre-1990 books.</p>
<p><a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=27"><img class="alignnone size-full wp-image-2145" style="padding-bottom: 20px" src="http://blog.lib.uiowa.edu/hardinmd/files/2009/02/slide27_2.jpg" border="0" alt="" width="596" height="416" /></a></p>
<p>Springer &#8211; Slides 32-36. Below is <a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=35">slide 35</a>, showing clicks for <strong>Buy this Book</strong>. Note again the very long tail of clicks for pre-1995 books.</p>
<p><a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=35"><img class="alignnone size-full wp-image-2139" style="padding-bottom: 15px" src="http://blog.lib.uiowa.edu/hardinmd/files/2009/02/slide35_2_86.jpg" border="0" alt="" width="590" height="390" /></a></p>
<p><a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=37">Slide 37</a> below shows &#8220;Share of books with more than 10 pages viewed&#8221;, apparently for all books in GBS. The coloring of the data lines looks ambiguous to me. The lowest line is undoubtedly for Snippet View books. It looks like the top line is for Limited Preview books, which presumably rank higher than Full View books (apparently the middle line) because Limited Preview books are more current.</p>
<p><a href="http://assets.en.oreilly.com/1/event/19/Google%20Book%20Search_%20Past,%20Present,%20and%20Future%20Paper.pdf#page=37"><img class="alignnone size-full wp-image-2142" style="padding-bottom: 25px" src="http://blog.lib.uiowa.edu/hardinmd/files/2009/02/slide37_2_78.jpg" border="0" alt="" width="594" height="377" /></a><br />
Please comment here or on Twitter @<a href="http://twitter.com/ericrumsey">ericrumsey</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2009/02/23/jon-orwant-on-google-book-search-at-toc-slides-with-data/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Google Books and the Long Tail</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2009/02/06/google-books-and-the-long-tail/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2009/02/06/google-books-and-the-long-tail/#comments</comments>
		<pubDate>Fri, 06 Feb 2009 20:15:04 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Google Book Search]]></category>
		<category><![CDATA[Libraries]]></category>
		<category><![CDATA[Long Tail]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=1947</guid>
		<description><![CDATA[In a recent NY Times article that I blogged on, Dan Clancy, the engineering director for Google book search, is cited as saying &#8220;every month users view at least 10 pages of more than half of the one million out-of-copyright books that Google has scanned into its servers.&#8221; Remarkably, this classic long tail description of [...]]]></description>
			<content:encoded><![CDATA[<p>In a recent NY Times article that I <a href="http://blog.lib.uiowa.edu/hardinmd/2009/02/04/google-books-treasure-trove-ny-times/">blogged on</a>, Dan Clancy, the engineering director for Google book search, is cited as saying &#8220;every month users view at least 10 pages of more than half of the one million out-of-copyright books that Google has scanned into its servers.&#8221; Remarkably, this classic <strong>long tail description</strong> of Google Books seems not to have been noticed by anyone &#8212; I&#8217;ve searched in Google (web and blogs) for various word combinations in the quote combined with &#8220;Dan Clancy,&#8221; and have found nothing at all except the original NYT article.</p>
<p>The <a href="http://en.wikipedia.org/wiki/The_Long_Tail">long tail</a> idea, which was first described by Chris Anderson in 2004, is that when a very large number of users are given a very large number of items to choose from, especially in an online environment with virtually unlimited &#8220;shelf space&#8221; and easy access, a very wide variety of items will be chosen. Anderson proposed the idea especially to describe commercial sites such as Amazon and Netflix, but it has also been seen as a good fit for libraries, and especially online library/book sources, such as Google Books.</p>
<p>So &#8212; Yes &#8212; There has been <a href="http://www.google.com/search?hl=en&amp;q=%22google+book%22+long+tail">discussion</a> of Google Books and the long tail. For the most part, though, this has been on a conceptual, non-numeric level. The statement by Clancy is valuable because it&#8217;s the first time there have been actual numbers provided by Google sources to back up the conceptual ideas. And, indeed, striking numbers they are &#8212; every month, <strong>half</strong> of the out-of-copyright books &#8212; i.e. <strong>old books</strong> &#8212; in Google Books <strong>are getting significant use</strong>. The long tail will certainly be even longer when newer books are made available after the October 2008 settlement goes into effect.</p>
<p><a href="http://radar.oreilly.com/GBSvsBookscan.jpg"><img class="alignnone size-full wp-image-1948" style="padding-right: 15px;padding-top: 12px;padding-bottom: 1px" src="http://blog.lib.uiowa.edu/hardinmd/files/2009/02/gbsvsbookscan_40.jpg" border="0" alt="" width="466" height="273" align="left" /></a></p>
<p>The best numeric data that I&#8217;ve found on Google Books and the long tail is given in an <a href="http://radar.oreilly.com/archives/2006/05/long-tail-evidence-from-safari.html">article</a> by Tim O&#8217;Reilly in 2006, which compares sales of O&#8217;Reilly Media book titles, as reported by Nielsen Bookscan, with page views from Google Books. As the graph (at left) from that article shows, the Google Books page views (in red) have a very long, almost flat, tail, in contrast with the relatively short tail for actual sales of book titles (in blue). Incidentally, the graph shown here has a bad link in the O&#8217;Reilly article, so all that displays is the file name; I did some digging on the O&#8217;Reilly site to find it <a href="http://radar.oreilly.com/GBSvsBookscan.jpg">here</a>. (Feb 11: The bad links for this image and others in the O&#8217;Reilly article have been fixed, after I noted them in a <a href="http://radar.oreilly.com/archives/2006/05/long-tail-evidence-from-safari.html#comment-2052890">comment</a>.)</p>
<p>The closest thing I have found to other long tail numeric data relating to online books is reported in a 2006 <a href="http://www.nybooks.com/articles/19436">article</a> by Jason Epstein:</p>
<blockquote><p>According to Mark Sandler of the University of Michigan Library, in an essay in <a href="http://books.google.com/books?id=D8JxIAAACAAJ">Libraries and Google</a>, an experiment by the library involving the digitization of 10,000 &#8220;low use&#8221; monographs offered on the Web produced &#8220;between 500,000 and one million hits per month.&#8221;</p></blockquote>
<p>Spread over 10,000 monographs, that works out to an average of 50 to 100 hits per title per month. I suspect the realization of the &#8220;power of the long tail&#8221; shown in this experiment contributed to the University of Michigan opting to be one of the original library partners in the Google Books project.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2009/02/06/google-books-and-the-long-tail/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
