<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Seeing the picture &#187; Hardin MD</title>
	<atom:link href="http://blog.lib.uiowa.edu/hardinmd/category/hardinmd/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.lib.uiowa.edu/hardinmd</link>
	<description>Thoughts while working on Hardin MD on digitization &#38; libraries</description>
	<lastBuildDate>Wed, 18 Nov 2009 22:16:00 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Tagging in Hardin MD &#8212; History</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2009/10/15/tagging-in-hardin-md-history/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2009/10/15/tagging-in-hardin-md-history/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 18:09:28 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[PicsNo]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=4129</guid>
		<description><![CDATA[[This article accompanies previous article: Tagging in Hardin MD]
Soon after the launching of Hardin MD, in 1996, we began adding keywords in the hidden META keyword field (The first pages for HMD in Internet Archive [Dec, 1998] show them on all pages checked.) We began checking to see if HMD pages were appearing in search [...]]]></description>
			<content:encoded><![CDATA[<p>[This article accompanies previous article: <a href="http://blog.lib.uiowa.edu/hardinmd/2009/09/25/tagging-in-hardin-md/">Tagging in Hardin MD</a>]</p>
<p>Soon after the launching of Hardin MD, in 1996, we began adding keywords in the hidden META keyword field (The first pages for HMD in Internet Archive [<a href="http://web.archive.org/web/19981206205818/http://www.lib.uiowa.edu/hardin/md/index.html">Dec, 1998</a>] show them on all pages checked.) We began checking to see if HMD pages were appearing in search engine results in about 2000, and found that meta keywords didn’t seem to have much effect.</p>
<p>So, in late 2000, we began experimenting with putting keywords (aka tags*) at the bottom of the page, where most users wouldn’t notice them. At first we didn’t see much effect in search engine results, when using the tags mostly for variant spellings or terminology (e.g. on the Hematology page: blood diseases, haematology).</p>
<p>In 2001, as Google rose to prominence and search improved, we began using tools that showed the popularity of specific words (HitBox, <a href="http://extremetracking.com/">ExtremeTracking</a>, <a href="http://www.wordtracker.com/">WordTracker</a>). We learned that using misspelled word variants as tags worked very well in drawing search engine (SE) traffic. It was also during this time that links to pictures were being added to HMD, and we discovered the power of the word &#8220;pictures&#8221; in drawing SE traffic.</p>
<h3 style="text-align: center;"><strong>Time-line of tagging in Hardin MD</strong></h3>
<p>Based on invaluable help from Internet Archive &#8212; Starting from here: <a href="http://web.archive.org/web/*/http://www.lib.uiowa.edu/hardin/md">Internet Archive for Hardin MD, 1999+</a></p>
<p>The first HMD pages in Internet Archive in <a href="http://web.archive.org/web/19981206205818/http://www.lib.uiowa.edu/hardin/md/index.html">Dec, 1998</a> have meta keywords, but not tags on the page. Example of meta keywords (<a href="http://web.archive.org/web/19981202180400/www.lib.uiowa.edu/hardin/md/cardio.html">Hardin MD: Cardiology</a>): health, medicine, medical, nursing, nurses, nurse, disease, diseases, best, list, lists, consumer, cardiology, cardiac, heart, stroke, cardiovascular, cardiothoracic, pacemaker, defibrillator, attack, arrest</p>
<p><strong>Tagging for misspellings</strong> &#8211; Ophthalmology, I&#8217;m sure, would have been one of the first pages on which misspellings would have been used. Internet Archive pages show clearly that the first implementation was in early November, 2000. &#8230;</p>
<blockquote><p><a href="http://web.archive.org/web/20001109201800/http://www.lib.uiowa.edu/hardin/md/ophth.html">Ophthalmology, Nov 7, 2000</a> &#8211; No misspellings in meta keywords. There are no tags on page.<br />
<a href="http://web.archive.org/web/20001119193000/http://www.lib.uiowa.edu/hardin/md/ophth.html">Ophthalmology, Nov 15, 2000</a> &#8211; Has misspellings in meta keywords and on page: [ophthamology]</p></blockquote>
<p>This fits my memory of events &#8212; I was especially motivated to look for ways to draw Web traffic, because Google was just <a href="http://en.wikipedia.org/wiki/Web_search_engine#History">becoming prominent</a>, rationalizing the search process, and making it easier to predict the effects of changes on page traffic.</p>
<p>Other examples of pages with tags on the page, with variant spellings, from about the same time: Orthopedics <a href="http://web.archive.org/web/20001119200500/http://www.lib.uiowa.edu/hardin/md/ortho.html">Nov 16, 2000</a> [orthopaedics] and Hematology <a href="http://web.archive.org/web/20001206091200/www.lib.uiowa.edu/hardin/md/hem.html">Nov 29, 2000</a> [blood diseases, haematology]</p>
<p><strong>Use of the word &#8220;pictures,&#8221; in tagging and in page titles</strong></p>
<p>First use: <a href="http://web.archive.org/web/20020611191352/www.lib.uiowa.edu/hardin/md/genitalwartpictures.html">Genital Warts Jun 10, 2002</a></p>
<p>First widespread use &#8211; Several pages linked on <a href="http://web.archive.org/web/20021004190534/www.lib.uiowa.edu/hardin/md/">Hardin MD Index page Sept 30, 2002 </a></p>
<p>Eric Rumsey is at: eric-rumsey AttSign uiowa dott edu and on Twitter <a href="http://twitter.com/ericrumsey">@ericrumsey</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2009/10/15/tagging-in-hardin-md-history/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tagging in Hardin MD</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2009/09/25/tagging-in-hardin-md/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2009/09/25/tagging-in-hardin-md/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 14:05:11 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[Library Catalog]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=4032</guid>
		<description><![CDATA[
All Hardin MD (HMD) pages  have tags at the bottom, to make them more visible for search engines i.e. Google. We have been doing tagging in HMD since 2000, and it works very well. As shown in the example to the left, the tags are for variant spellings (measels), variant terms (rubeola), and words [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.lib.uiowa.edu/hardin/md/measlespictures.html"><img class="alignnone size-full wp-image-4037" style="padding-right: 16px;padding-top: 3px;padding-bottom: 3px" title="measles1_35_4" src="http://blog.lib.uiowa.edu/hardinmd/files/2009/09/measles1_35_4.jpg" border="0" alt="" width="291" height="426" align="left" /></a></p>
<p>All <a href="http://www.lib.uiowa.edu/hardin/md/index.html">Hardin MD</a> (HMD) pages have tags at the bottom, to make them more visible to search engines, i.e., Google. We have been tagging HMD pages since 2000, and it works very well. As shown in the example to the left, the tags are for variant spellings (measels), variant terms (rubeola), and words and word combinations relating to pictures (we have found that the word &#8220;pictures&#8221; is especially favored by Google).</p>
<p>One of the things that has made HMD fun has been applying longstanding practices of librarianship to a web-based system. Having been a cataloger for a brief time early in my library career, I found it natural to put tags at the bottom of the web page, just as subject headings are at the bottom of cards in the card catalog. Including misspellings in the tags to help users find the page seemed natural, too. As a cataloger, I had been taught to put x-ref cards in the catalog for variant ways that patrons might look for a book, and following the same principle on web pages made it possible to apply it on a much larger scale.</p>
<p>It continues to surprise me that this simple idea, putting tags on web pages, has not been more widely applied. I have seen very few cases of it at other sites. I suspect part of the reason for this is that people have tended to think the hidden meta keyword field was the place to put tags, rather than &#8220;cluttering up&#8221; their pages by putting them on the page. Google&#8217;s announcement a few days ago that they ignore meta keywords finally puts an end to that idea. But many SEO people have long thought meta keywords were ineffective, and that was certainly our experience. Around the time we began putting tags on pages in 2000, we compared meta field tagging and on-page tagging, and found that meta field tagging seemed to be ignored by Google.</p>
<p>Another factor that may have discouraged people from putting tags inconspicuously at the bottom of the page is that SEO people generally say that words need to be in a prominent place on the page, preferably near the top, to be found by Google. That&#8217;s no doubt true for common words that have a lot of competition, but for relatively uncommon words, like variant spellings of disease names, placement at the bottom of the page works well. (One proviso: Our pages in HMD are relatively small, usually no more than two screens. Putting tags at the bottom of larger pages may not work as well.)</p>
<p>I suspect a reason that people don&#8217;t think more of experimenting with tagging and Google visibility is that it is a lengthy process. Google&#8217;s not going to see new words on your page right away. It may take several weeks or even months. So it requires careful record-keeping, to note when words are added, and a regular schedule of checking Google to see whether your pages are starting to appear in search results.</p>
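<p>The record-keeping described above could be as simple as the following Python sketch. It is only an illustration, not the log we actually keep: the class name, its fields, and the dates are all invented for the example, while the page and the tag are the measles example mentioned earlier.</p>

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class TagRecord:
    """One tag added to one page, plus the dates it was checked in search results."""
    page: str
    tag: str
    added: date
    # Each check is a pair: (date of the check, whether the page appeared in results).
    checks: list = field(default_factory=list)

    def record_check(self, when: date, found: bool) -> None:
        self.checks.append((when, found))

    def days_until_found(self) -> Optional[int]:
        """Days from adding the tag to the first successful check, or None if never found."""
        hits = [when for when, found in self.checks if found]
        if not hits:
            return None
        return (min(hits) - self.added).days


# Hypothetical log entry; the dates are invented for illustration.
record = TagRecord(page="measlespictures.html", tag="measels", added=date(2009, 1, 5))
record.record_check(date(2009, 1, 19), False)  # two weeks later: not yet in results
record.record_check(date(2009, 2, 16), True)   # six weeks later: page appears
print(record.days_until_found())
```

<p>Keeping the date a tag was added next to the dates of each check makes it easy to see, weeks later, how long a given word took to show up.</p>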
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2009/09/25/tagging-in-hardin-md/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Google Flu Trends : Kudos &amp; Complications</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2008/11/25/google-flu-trends-kudos-complications/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2008/11/25/google-flu-trends-kudos-complications/#comments</comments>
		<pubDate>Tue, 25 Nov 2008 20:12:42 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Google Flu Trends]]></category>
		<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=1030</guid>
		<description><![CDATA[Google Flu Trends is an elegant application of search data to medicine. Working on Hardin MD, I’ve long noticed seasonal variations in certain diseases: colds, flu, &#38; respiratory illnesses peak in winter, and insect bites &#38; sun exposure conditions peak in summer. I pay a lot of attention to the search terms that people [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://googleblog.blogspot.com/2008/11/tracking-flu-trends.html">Google Flu Trends</a> is an elegant application of search data to medicine. Working on Hardin MD, I’ve long noticed seasonal variations in certain diseases: <a href="http://www.lib.uiowa.edu/hardin/md/commoncold.html">colds</a>, <a href="http://www.lib.uiowa.edu/hardin/md/influenza.html">flu</a>, &amp; <a href="http://www.lib.uiowa.edu/hardin/md/resp.html">respiratory illnesses</a> peak in winter, and <a href="http://www.lib.uiowa.edu/hardin/md/insectbites.html">insect bites</a> &amp; <a href="http://www.lib.uiowa.edu/hardin/md/sunpoisoning.html">sun exposure</a> conditions peak in summer. I pay a lot of attention to the search terms that people use to get to Hardin MD pages, so Google’s mining of this data to serve the health of the community is especially interesting.</p>
<p><a href="http://www.google.org/about/flutrends/how.html"><img class="alignnone size-full wp-image-1114" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/11/how_07_08_animation1_60.jpg" alt="" width="569" height="218" /></a></p>
<p>The idea of Google Flu Trends is shown nicely in the snapshot above from the <a href="http://www.google.org/about/flutrends/how.html">animation at the Google Flu Trends site</a> &#8212; Google finds that there is an excellent correlation between flu-related terms that people search and the occurrence of flu, as measured by CDC data. And, as shown in the animation, the Google search data in near real-time precedes CDC data, which takes 1-2 weeks to be reported and compiled.</p>
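<p>To make the idea of search data "preceding" reported data concrete, here is a small Python sketch of a lagged correlation. It is not Google's method, whose details are in their paper; the function names are my own, and the weekly numbers are invented so that the "reported" series is simply the "search" series delayed by one week.</p>

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(search, reported, lag):
    """Correlate search[t] with reported[t + lag], where lag is a number of weeks (0 or more)."""
    if lag == 0:
        return pearson(search, reported)
    return pearson(search[:-lag], reported[lag:])


# Invented weekly counts, for illustration only.
search = [10, 12, 20, 35, 50, 40, 25, 15]
reported = [9, 10, 12, 20, 35, 50, 40, 25]

# Try lags of 0, 1 and 2 weeks and keep the one with the highest correlation.
best_lag = max(range(0, 3), key=lambda lag: lagged_correlation(search, reported, lag))
print(best_lag)
```

<p>With these made-up numbers the best match is at a lag of one week, which is the sense in which search activity can run ahead of the compiled reports.</p>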
<h3>Complications</h3>
<p>The idea of using search data to track the progression of disease outbreaks certainly is elegant, and Google deserves congratulations for it. In choosing flu as the first example, however, Google has chosen a disease with complicating factors.</p>
<p><a href="http://www.google.org/about/flutrends/how.html"><img class="alignnone size-full wp-image-1123" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/11/how_04_08_activity1_62.jpg" alt="" width="591" height="187" /></a></p>
<p>The nature of these potentially complicating factors is suggested in the graphic above from the <a href="http://www.google.org/about/flutrends/how.html">Google Flu Trends site</a>. A big question here is: what caused the spike in flu occurrence and flu search activity in Dec 2003 &#8211; Jan 2004?</p>
<p>Because Google has chosen not to reveal the exact search terms that they are using to measure the volume of flu-related searching (see <a href="http://www.nature.com/nature/journal/vaop/ncurrent/extref/nature07634-s1.pdf">supplementary material</a> accompanying Google&#8217;s <a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature07634.html">paper in <em>Nature</em></a>), it&#8217;s difficult to know the cause of the 03-04 spike with certainty. But looking back at the chronology of that time period sheds light: there was a major shortage of the flu vaccine in late 2003, which is certainly related to the spike shown in the graphic. The CDC spike (yellow) shows that many people had flu, presumably because they were unable to get the vaccine. The Google spike (blue) is even higher, which may indicate that a significant number of people were searching for flu information not because they were infected, but because they were looking for information on how to get the vaccine. The accompanying article (<a href="http://blog.lib.uiowa.edu/hardinmd/2008/11/25/google-flu-trends-flu-symptoms-vs-flu-shot/">Flu Symptoms vs Flu Shot</a>) shows that there is in fact a clear indication of heightened search activity for flu vaccine-related terms during the autumn pre-flu season.</p>
<p>The other complicating factor in looking at flu-related search activity is bird flu, and this seems to have been addressed well by Google &#8212; The large bird flu outbreak in Asia, and corresponding bird flu scare throughout the world, occurred in late 2004 and early 2005. Since there is no major spike shown in the graphs for this time, Google apparently has excluded bird flu/avian flu search terms from the aggregate group of terms it&#8217;s using.</p>
<p>** This is one of <strong>a group of three articles on Google Flu Trends</strong>:</p>
<ul>
<li>Google Flu Trends: Kudos &amp; Complications (this article)</li>
<li><a href="http://blog.lib.uiowa.edu/hardinmd/2008/11/25/google-flu-trends-flu-symptoms-vs-flu-shot/">Google Flu Trends: Flu Symptoms vs Flu Shot</a></li>
<li><a href="http://blog.lib.uiowa.edu/hardinmd/2008/11/25/google-flu-trends-the-iowa-connection/">Google Flu Trends: The Iowa Connection</a></li>
</ul>
<p>Together, these articles suggest that, although it&#8217;s difficult to know with assurance because Google has not revealed the search terms that they use for GFT, it seems likely that they&#8217;ve done a good job in working around the complications of flu-related search patterns.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2008/11/25/google-flu-trends-kudos-complications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Public domain pictures in Hardin MD</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2008/10/21/public-domain-pictures-in-hardin-md/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2008/10/21/public-domain-pictures-in-hardin-md/#comments</comments>
		<pubDate>Tue, 21 Oct 2008 21:44:10 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[Libraries]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=687</guid>
		<description><![CDATA[In the last year, we&#8217;ve begun to include the copyright status of pictures on Hardin MD pages. We have especially done this to show which pictures are not under copyright and are therefore free to copy.
Recently Peter Brantley suggested that libraries should make it easy for users to find public domain content on their sites. [...]]]></description>
			<content:encoded><![CDATA[<p>In the last year, we&#8217;ve begun to include the copyright status of pictures on Hardin MD pages. We have especially done this to show which pictures are not under copyright and are therefore free to copy.</p>
<p>Recently <a href="http://blogs.lib.berkeley.edu/shimenawa.php/2008/10/10/cd-public">Peter Brantley</a> suggested that libraries should make it easy for users to find public domain content on their sites. So, with thanks to Brantley for this idea, we&#8217;ve made a page that shows <a href="http://www.lib.uiowa.edu/hardin/md/gallery2public.html">public domain galleries</a> for specific diseases (see snip below).</p>
<p><a href="http://www.lib.uiowa.edu/hardin/md/gallery2public.html"><img class="alignnone size-full wp-image-688" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/10/gallery2public_4.jpg" alt="" width="562" height="195" /></a></p>
<p>Brantley also suggests that libraries with public domain content make the content available from a specific directory called /public. This seems like a good idea, and we will be considering it.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2008/10/21/public-domain-pictures-in-hardin-md/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Books vs DjVu in Internet Archive</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2008/09/05/google-books-vs-djvu-in-internet-archive/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2008/09/05/google-books-vs-djvu-in-internet-archive/#comments</comments>
		<pubDate>Fri, 05 Sep 2008 18:41:58 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[DjVu]]></category>
		<category><![CDATA[Google Book Search]]></category>
		<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[Internet Archive]]></category>
		<category><![CDATA[Navigation]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Thumbnails]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[eBooks]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=283</guid>
		<description><![CDATA[Finding a heavily illustrated book that&#8217;s in both Google Books (GBS) and Internet Archive (IA) gives a good comparison of the strengths and weaknesses in the way illustrated books are presented in these systems.
Shown below are the &#8220;intro&#8221; pages for the book in the 2 systems. The clear advantage of the GBS intro page is [...]]]></description>
			<content:encoded><![CDATA[<p>Finding a heavily illustrated book that&#8217;s in both Google Books (GBS) and Internet Archive (IA) gives a good comparison of the strengths and weaknesses in the way illustrated books are presented in these systems.</p>
<p>Shown below are the &#8220;intro&#8221; pages for the book in the 2 systems. The clear advantage of the <a href="http://books.google.com/books?id=umQPAAAAYAAJ">GBS intro page</a> is that the sample thumbnails in the lower right make it immediately obvious that the book has COLOR pictures of good quality.</p>
<p><img class="alignnone size-full wp-image-287" style="padding-bottom: 3px" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/09/about31_60.jpg" alt="" width="616" height="413" /></p>
<p>In Internet Archive the main job of the <a href="http://www.archive.org/details/atlasofdiseaseso00mraciala">intro screen</a> (below) is to direct the user to options to view the book, in the box in the upper left, and there&#8217;s no indication that the book contains pictures.</p>
<p><img class="alignnone size-full wp-image-294" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/09/details22_51.jpg" alt="" width="581" height="402" /></p>
<p>Even after pulling up the DjVu option to view the book &#8212; which is a tricky matter, see <a href="http://blog.lib.uiowa.edu/hardinmd/2008/09/10/djvu-again/">how to do it here</a> &#8212; there&#8217;s no intro screen at all in DjVu, just an imposing blank page waiting for the user to change display options or begin paging through the book sequentially.</p>
<p><img style="padding-right: 15px;padding-top: 2px;padding-bottom: 10px" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/09/plate15_pg253_2_40.jpg" alt="DjVu" align="left" />It&#8217;s when the user chooses display options and begins viewing the book that the advantages of DjVu become evident. The most important option, especially if pictures are an important part of the book, as they are in the <em>Mracek Atlas</em> book shown here, is to turn on the thumbnail display bar (at left) by clicking the icon in the lower right corner of the DjVu display window. It then becomes easy to scroll through the thumbnails and get a good view of the nature of the pictures in the book, and how they relate to the text. In the <em>Mracek Atlas</em>, it happens that the first third of the book is all text, and the last two-thirds is mostly pictures, so the user can scroll to the pictures easily.</p>
<p>Use of thumbnails is a good way to provide access to pictures in a book. But as simple and obvious as it is, thumbnail access is lacking in most e-book systems, so both GBS and DjVu are to be applauded for providing it, in their different ways. Here&#8217;s a comparison of the two systems &#8230;</p>
<p>In GBS, the About this book page gives immediate thumbnail access to a maximum of 30 pictures. Additional pictures have no thumbnail access, and can only be found by scrolling through pages or text searching.</p>
<p>DjVu has the disadvantage of having no Intro page that gives an overview of pictures in the book. But when the user knows how to set the display options, it provides good thumbnail access to an unlimited number of pictures. In a book like the <em>Mracek Atlas</em>, with over 100 pictures, this is a definite advantage.</p>
<p>Postscript: It wasn&#8217;t easy to find a book that&#8217;s in both GBS and IA, so I was especially pleased to find the <em>Mracek Atlas</em> discussed here that has <a href="http://www.lib.uiowa.edu/hardin/md/google/psoriasis.html">pictures in Hardin MD</a>! The full citation for the book is: <em>Atlas of diseases of the skin</em>, by Franz Mracek, 1899 [<a href="http://books.google.com/books?id=umQPAAAAYAAJ">GBS</a> | <a href="http://www.archive.org/details/atlasofdiseaseso00mraciala">IA</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2008/09/05/google-books-vs-djvu-in-internet-archive/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Intelligent Thumbnails</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2008/08/19/intelligent-thumbnails/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2008/08/19/intelligent-thumbnails/#comments</comments>
		<pubDate>Tue, 19 Aug 2008 18:05:58 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[ContentDM]]></category>
		<category><![CDATA[Flickr]]></category>
		<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[Human input]]></category>
		<category><![CDATA[Pattern recognition]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Thumbnails]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/?p=186</guid>
		<description><![CDATA[&#8220;Flickr takes the sun out of the sunset&#8221; &#8212;  The picture to the left  from Flickr shows the full picture and its square thumbnail, in the inset. Thumbnails like these are generated automatically by Flickr and other photo management systems. They work by taking a portion from the center to make the thumbnail. [...]]]></description>
			<content:encoded><![CDATA[<p><img style="padding-right: 15px;padding-top: 2px" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/08/flickr26_85.jpg" alt="Flickr takes the sun out of the sunset" align="left" />&#8220;Flickr takes the sun out of the sunset&#8221; &#8212;  The picture to the left  from Flickr shows the full picture and its square thumbnail, in the inset. Thumbnails like these are generated automatically by Flickr and other photo management systems. They work by taking a portion from the center to make the thumbnail. This works well if the center has the most important subject in the picture. But if the picture is relatively wide or tall, and its main subject is not in the center, as in the example at left, with the sun being to one side, the thumbnail misses it. Looking at this example (<a href="http://www.flickr.com/photos/davidwhlloyd/sets/72157601516657085/">Long Beach Sunset</a>) in Flickr, note that the first thumbnail on the Flickr page (top left) is the one for the larger picture (that&#8217;s shown on our page with the thumbnail in yellow-outlined inset).</p>
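<p>The center-crop behavior described above can be sketched in a few lines of Python. This is not Flickr's actual code, which I have not seen; the function names and the pixel dimensions are made up for illustration. The sketch only works out which part of the picture ends up in a square thumbnail, and whether a subject at a given position falls inside it.</p>

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest square centered in the picture."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def square_box_around(width, height, x, y):
    """Largest square placed as close to centered on (x, y) as the picture edges allow."""
    side = min(width, height)
    left = min(max(x - side // 2, 0), width - side)
    top = min(max(y - side // 2, 0), height - side)
    return (left, top, left + side, top + side)


def box_contains(box, x, y):
    """True if the point (x, y) lies inside the crop box."""
    left, top, right, bottom = box
    return left <= x < right and top <= y < bottom


# Hypothetical wide sunset picture, 1000 x 400 pixels, with the sun far to the left.
width, height = 1000, 400
sun_x, sun_y = 150, 200

automatic = center_square_box(width, height)               # (300, 0, 700, 400)
chosen = square_box_around(width, height, sun_x, sun_y)    # (0, 0, 400, 400)

print(box_contains(automatic, sun_x, sun_y))  # False: the automatic crop misses the sun
print(box_contains(chosen, sun_x, sun_y))     # True: a crop placed by a person keeps it
```

<p>The second function stands in for the human step: someone has to say where the subject is before the crop can be placed around it.</p>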
<p>In large mass-production systems like Flickr, automatic thumbnails are unavoidable, and my point is not that they should never be used. Instead, my point is that, on many levels, <a href="http://blog.lib.uiowa.edu/hardinmd/2008/07/11/think-different-pictures/">pictures require more human input</a> than text to make them optimally usable. Pattern recognition &#8212; the simple observation that the thumbnail of a picture of a sunset SHOULD CONTAIN THE SUN &#8212; is something that the human brain does easily, but this does not come naturally for a computer.</p>
<p><img style="padding-right: 15px;padding-top: 12px" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/08/fullsize32_70.jpg" alt="" align="left" /><br />
Another sort of problem in automatic production of thumbnails is making a thumbnail by simply reducing the size of the large picture. If the main subject of the picture is relatively small, it is not visible in a small thumbnail.</p>
<p>The picture to the left is from the Hardin Library ContentDM collection. The inset in the upper right shows the <a href="http://digital.lib.uiowa.edu/cdm4/results.php?CISOOP1=all&amp;CISOBOX1=holbein+plate+viii+12a&amp;CISOFIELD1=CISOSEARCHALL&amp;CISOROOT=%2Fjmrbr">thumbnail</a> that&#8217;s generated automatically by the system, which does a poor job of showing details of the picture. The lower inset shows a <strong><span style="color: #cc0000;">thumbnail</span></strong> made manually, which gives a much clearer view of the central image in the picture.</p>
<p>Cropping of a picture to produce a thumbnail, as done here, takes more subtle human judgement than in the case of the Flickr picture in the first example, where the weakness of automatic production is obvious. With cropping, there&#8217;s inevitably a trade-off between showing the whole picture in the thumbnail or showing the most important subject of the picture. In cases such as this one from ContentDM, where almost all of the detail in the picture will be lost in a small thumbnail, it seems better to focus on a central image that will show up in the thumbnail.</p>
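<p>A little arithmetic in Python shows why cropping first helps. The function name and all of the pixel sizes below are invented for illustration; they are not measurements of the ContentDM picture.</p>

```python
def subject_width_in_thumbnail(source_width, subject_width, thumb_width):
    """Width in pixels that the subject occupies once source_width is scaled down to thumb_width."""
    return subject_width * thumb_width / source_width


full_width = 2000     # hypothetical width of the full scan
crop_width = 500      # hypothetical width of a crop around the central image
subject_width = 400   # hypothetical width of the central image itself
thumb_width = 100     # width of the thumbnail

# Shrinking the whole scan leaves the subject only 20 pixels wide.
whole = subject_width_in_thumbnail(full_width, subject_width, thumb_width)

# Cropping to 500 pixels first leaves it 80 pixels wide in the same size thumbnail.
cropped = subject_width_in_thumbnail(crop_width, subject_width, thumb_width)

print(whole, cropped)
```

<p>The price of the larger subject is everything outside the crop, which is the trade-off described above.</p>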
<p>Finally, a few examples from <a href="http://blog.lib.uiowa.edu/hardinmd/2008/07/22/hardin-md-gallery-collections/">Hardin MD</a>, below, show how we have done cropping to improve the detail in our thumbnails. The thumbnails on the left in each of the three pairs are made by simply reducing the size of the full picture. On the right in each pair are the thumbnails we use, that we have made by cropping the full picture before making the thumbnail.</p>
<p><img class="alignnone size-full wp-image-198" src="http://blog.lib.uiowa.edu/hardinmd/files/2008/08/composite4.jpg" alt="" width="500" height="100" /></p>
<p>The biomedical, scientific pictures that we work with in Hardin MD are fairly easy to make thumbnails for, because they generally have a well-defined focus that&#8217;s usually captured well by automatically generated thumbnails. More artistic, humanities-oriented pictures, such as the ones discussed here from Flickr and ContentDM, however, often have more subtle subjects that benefit from an intelligent human touch in the production of thumbnails.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2008/08/19/intelligent-thumbnails/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Hardin MD Gallery Collections</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2008/07/22/hardin-md-gallery-collections/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2008/07/22/hardin-md-gallery-collections/#comments</comments>
		<pubDate>Tue, 22 Jul 2008 14:52:01 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Thumbnails]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/2008/07/22/hardin-md-gallery-collections/</guid>
		<description><![CDATA[Over the last three years, we have added close to 800 pictures on about 100 diseases/conditions to Hardin MD.
As the volume of pictures has grown, providing access to them becomes more difficult. For some time, we have grouped pictures on specific disease conditions into small galleries, each with about 3-12 pictures (ant bites, athletes foot, [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last three years, we have added close to 800 pictures on about 100 diseases/conditions to Hardin MD.</p>
<p>As the volume of pictures has grown, providing access to them has become more difficult. For some time, we have grouped pictures on specific disease conditions into small galleries, each with about 3-12 pictures (ant bites, athlete&#8217;s foot, atopic dermatitis below). Recently, however, we have expanded the gallery format, broadening it into larger gallery collections, which have links to the smaller galleries.</p>
<p><img src="http://blog.lib.uiowa.edu/hardinmd/files/2008/07/gallery5.JPG" alt="gallery5.JPG" /></p>
<p>Use of the gallery format has been very effective in increasing access to our pictures. We are finding that users are much more likely to click disease links deeper on the page when the links are shown as thumbnails than when they are provided as a list of text links.</p>
<p>In addition to the gallery collection pages for AIDS, cancer, and child diseases, which are shown on the <a href="http://www.lib.uiowa.edu/hardin/md/gallery.html">gallery gateway</a> page above, there are also gallery collections for foot problems, herpes, insect bites, mouth sores, nail problems, oral diseases, skin rashes, STDs, and tropical diseases, all of which are linked on the <a href="http://www.lib.uiowa.edu/hardin/md/gallery2.html">inclusive gallery</a> page.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2008/07/22/hardin-md-gallery-collections/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pictures not images</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2008/07/16/pictures-not-images/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2008/07/16/pictures-not-images/#comments</comments>
		<pubDate>Wed, 16 Jul 2008 20:16:51 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[PicsYes]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/2008/07/16/pictures-not-images/</guid>
		<description><![CDATA[Users of Hardin MD will notice that the word &#8220;pictures&#8221; is used frequently on our pages and the word &#8220;images&#8221; is rarely used. Why is this? Basically, the answer is simple &#8212; We use &#8220;pictures&#8221; because that&#8217;s the word people use in searching.
The screen-shots below, for the Hardin MD : Impetigo Pictures page, show this [...]]]></description>
			<content:encoded><![CDATA[<p>Users of Hardin MD will notice that the word &#8220;pictures&#8221; is used frequently on our pages and the word &#8220;images&#8221; is rarely used. Why is this? Basically, the answer is simple &#8212; We use &#8220;pictures&#8221; because that&#8217;s the word people use in searching.</p>
<p>The screen-shots below, for the <strong>Hardin MD : Impetigo Pictures</strong> page, show this clearly. The Extreme Tracker shot for this page shows the large proportion of search engine traffic from the word &#8220;pictures&#8221; (36%) compared to the small amount of traffic from the word &#8220;images&#8221; (0.6%).</p>
<table border="0" cellspacing="0" cellpadding="5">
<tbody>
<tr>
<td width="100">
<div><img src="http://blog.lib.uiowa.edu/hardinmd/files/2008/07/hmd_impetigopics.JPG" alt="hmd_impetigopics.JPG" /></div>
</td>
<td width="100">
<div><img src="http://blog.lib.uiowa.edu/hardinmd/files/2008/07/extremeimpetigo.jpg" alt="extremeimpetigo.jpg" /></div>
</td>
</tr>
<tr>
<td>
<div><strong>Hardin MD : Impetigo Pictures</strong> page</div>
</td>
<td>
<div>Keywords (Extreme Tracker)</div>
</td>
</tr>
</tbody>
</table>
<p>The Google screen-shots show that the Impetigo Pictures page gets an equally high ranking for the two words. Since the ranking is the same, the difference in traffic must come from &#8220;pictures&#8221; being searched much more frequently than &#8220;images.&#8221;</p>
<table border="0" cellspacing="0" cellpadding="5">
<tbody>
<tr>
<td width="100">
<div><img src="http://blog.lib.uiowa.edu/hardinmd/files/2008/07/g_pictures.jpg" alt="g_pictures.jpg" /></div>
</td>
<td width="100">
<div><img src="http://blog.lib.uiowa.edu/hardinmd/files/2008/07/g_images.jpg" alt="g_images.jpg" /></div>
</td>
</tr>
<tr>
<td>
<div>Google search: impetigo <strong>pictures</strong></div>
</td>
<td>
<div>Google search: impetigo <strong>images</strong></div>
</td>
</tr>
</tbody>
</table>
<p>(Note that these screen-shots have been photo-edited to fit the space &#8212; Ads and other text not relevant to the article have been removed. All screen-shots captured in July 2008.)</p>
<p>Here&#8217;s the background &#8230;</p>
<p>In about 2001, we started noticing how people were finding Hardin MD pages in search engines, and designing our pages to make them more likely to be found. An important part of this was using words that people were more likely to search (e.g. &#8220;heart diseases&#8221; instead of &#8220;cardiology&#8221;). Tools such as WordTracker that show how many people are searching for particular words are especially useful for this.</p>
<p>About this same time, we were starting to make links to other sites that have pictures on medical/disease subjects. Looking at WordTracker and Extreme Tracker (which shows the words people were searching to find our pages), we were struck by how effective the word &#8220;pictures&#8221; was. Until then, we had assumed that the appropriate word to use was &#8220;images,&#8221; since that is the word used on most medical/disease pages at other sites. We could see clearly, however, that using the word &#8220;pictures&#8221; on our pages brought much more traffic than the word &#8220;images.&#8221; So we&#8217;ve gone on from there, and now have high rankings in Google for many medical/disease subjects combined with &#8220;pictures,&#8221; as with impetigo.</p>
<p><a href="http://www.lib.uiowa.edu/hardin/md/impetigopictures.html">Hardin MD: Impetigo Pictures</a> | <a href="http://extremetracking.com/">Extreme Tracker</a> | <a href="http://www.wordtracker.com/">WordTracker</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2008/07/16/pictures-not-images/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Think Different : Pictures</title>
		<link>http://blog.lib.uiowa.edu/hardinmd/2008/07/11/think-different-pictures/</link>
		<comments>http://blog.lib.uiowa.edu/hardinmd/2008/07/11/think-different-pictures/#comments</comments>
		<pubDate>Fri, 11 Jul 2008 19:29:40 +0000</pubDate>
		<dc:creator>Eric Rumsey</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Hardin MD]]></category>
		<category><![CDATA[Human input]]></category>
		<category><![CDATA[Image Search]]></category>
		<category><![CDATA[Pattern recognition]]></category>
		<category><![CDATA[PicsNo]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lib.uiowa.edu/hardinmd/2008/07/11/think-different-pictures/</guid>
		<description><![CDATA[As computers have become more powerful, many of the aspects of handling text that were formerly done by humans have been taken over by computers. Pictures, however, are much more difficult to automate &#8212; Recognizing patterns remains a task that humans do much better than computers. A human infant can easily tell the difference between [...]]]></description>
			<content:encoded><![CDATA[<p>As computers have become more powerful, many of the aspects of handling text that were formerly done by humans have been taken over by computers. Pictures, however, are much more difficult to automate &#8212; Recognizing patterns remains a task that humans do much better than computers. A human infant can easily tell the difference between a cat and a dog, but it&#8217;s difficult to train a computer to do this.</p>
<p>In pre-Google days, the task of finding good lists of web links needed the input of smart humans (and Hardin MD was on the cutting edge in doing this). Now, though, Google Web Search gives us all the lists we need.</p>
<p>Pictures are another story &#8212; on many levels, pictures require much more human input than text.</p>
<p>The basic, intractable problem with finding pictures is that they have no innate &#8220;handle&#8221; allowing them to be found. Text serves as its own handle, so it&#8217;s easy for Google Web Search to find it. But Google Image Search has a much more difficult task. It still has to rely on some sort of text handle that&#8217;s associated with a picture to find it, and is at a loss to find pictures not associated with text.</p>
<p>The explosive growth of Hardin MD since 2001 (page views in 2008 are over 50 times higher) has been strongly correlated with the addition of pictures. The same period has also seen the growing presence of Google, with its PageRank technology, which has made old-style list-keeping, as had been featured in Hardin MD, less important.</p>
<p>Though Google has accomplished much in the retrieval of text-based pages, it&#8217;s made little progress in making pictures more accessible. Google Image Search is the second most-used Google service, but its basic approach has changed little over the years.</p>
<p>The basic problem for image search is that pictures don&#8217;t have a natural handle to search for. Because of this, it takes much more computer power for the Google spider to find new pictures, and consequently it takes much longer for them to be spidered than text pages (months instead of days).</p>
<p>Beyond the problem of identifying pictures there are other difficult-to-automate problems for image search:<br />
• How to display search results most efficiently to help the user find what they want: do you rank results according to picture size, number of related pictures at a site, or some other, more subjective measure of quality?<br />
• What&#8217;s the best way to display thumbnail images in search results?<br />
• How much weight should be given to pictures that have associated text that helps interpret the picture?</p>
<p>So &#8212; <strong>Good news for picture people!</strong> &#8212; I would suggest that pictures are a growth sector of the information industry, and a human-intensive one. I would predict that text-based librarians will continue to be replaced, as computers become more prominent. But there will continue to be a need for human intelligence working in all areas relating to pictures, from indexing/tagging to designing systems to make them more accessible.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lib.uiowa.edu/hardinmd/2008/07/11/think-different-pictures/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
