As I’ve discussed previously, much of the strangeness of the PubMed Health-NLM-Google affair arises because NLM doesn’t seem to have an appreciation of search engine optimization (SEO), and the value of being ranked in Google.

Another aspect of this misunderstanding, which came out during the NLM presentation at the recent MLA annual meeting, is that NLM is frustrated (!) that PubMed Health pages are getting a top ranking in Google, because they consider the new resource to be in an uncompleted, “pre-alpha” state. Bafflingly, they apparently didn’t anticipate that Google would find the PMH pages until they were “ready.”

To anyone in the dotcom world, getting a high ranking in Google is invaluable — The Ultimate Goal, The End of the Rainbow. To them, the idea that NLM would not be celebrating (!) a high Google ranking would be hard to fathom.

I realize that government websites like NLM don’t have the nimbleness of dotcom sites, so they have trouble adapting to unexpected happenings. But, still, it’s interesting, I think, to ask how a dotcom site would handle the current PubMed-Google situation …

What would a dotcom do if they had a new site that was under development, not ready to be used by the public, and it suddenly and unexpectedly started getting high rankings in Google? I think if this happened a dotcom would drop everything else and get the site in a finished state as quickly as possible. And while they were working on this, they would inform users about their progress in getting it finished.

NLM’s response to getting a top Google ranking has been very different. From all appearances, they have done nothing different at all because of the high ranking. They are working at a slow pace to implement the new site, on some pages, but they have done nothing to inform users about their progress or about when implementation will be completed.

Beyond NLM – Building Library Discoverability with Google & SEO

My point here is not to be hypercritical of NLM. It’s rather to use NLM as an example of a more general problem in libraries. As I’ve discussed before, I think libraries should be more aware of the effect of SEO and Google on how our users find our sites.

What’s unfortunate about NLM’s reaction to Google’s ranking of PMH is that it almost appears as if they really don’t care whether users find their PMH pages or not — The pages are certainly going to be found and used more if they get a high ranking in Google, so NLM should rejoice, instead of grousing about Google finding them.

So I say the same thing to libraries in general that I say to NLM — We have good stuff! Let’s help our users find it! Taking advantage of Google and the principles of SEO to help us do this doesn’t mean we’re “in it for the money” — It just means we want to make our resources more discoverable for our users!


Eric Rumsey is at: eric-rumsey AttSign uiowa dott edu and on Twitter @ericrumsey

Being an Iowa baseball boy at heart, I naturally thought about Field of Dreams when I read the words below by SEO guru Bill Hartzer in The Status Of Search Engine Optimization: April 2011 — It’s still the same as it’s always been, he says — Build a good website and it’ll get found. The idea is certainly not new, but Hartzer states it nicely:

So, what should you focus on right now, today, in April 2011? What has changed? Really, nothing has changed in a major way. It’s still business as usual. Build a quality web site, with lots of good informational content about your subject, publicize the content (properly) on other web sites, get links from other web sites to ALL of your content, and you will be just fine. Create a site that is good for your users and something that they like, and the search engines will reward you for it.

In other words, quit chasing the Google algorithm and worrying about all of the “minor” SEO tweaks that you could be doing and worry more about the fact that you’re not creating great content on your web site. That said, there are “best practices” that you still need to adhere to:

- Search engine friendly web design
- Unique content
- Make sure your on-page factors are in check (i.e., proper title tags, meta tags, heading tags, alt tags, etc.)
- Add good content to your site on a regular basis
- Do proper publicity for your content (use social sites, link building, and press releases when appropriate).
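
As a rough illustration of the on-page factors in the list above, here is a minimal Python sketch that checks a page for a title, a meta description, an h1 heading, and image alt text. The names OnPageAudit and audit, and the particular checks chosen, are assumptions made for illustration; they are not from Hartzer's article, and a real audit would look at much more.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collects a few on-page elements: title, meta description, headings, alt text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.headings = []            # heading tags seen, e.g. ["h1", "h2"]
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content")
        elif tag in ("h1", "h2", "h3"):
            self.headings.append(tag)
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a list of on-page problems found in the HTML string."""
    parser = OnPageAudit()
    parser.feed(html)
    parser.close()
    problems = []
    if not parser.title.strip():
        problems.append("missing title")
    if not parser.meta_description:
        problems.append("missing meta description")
    if "h1" not in parser.headings:
        problems.append("missing h1 heading")
    if parser.images_missing_alt:
        problems.append(f"{parser.images_missing_alt} image(s) without alt text")
    return problems
```

Calling audit() on a page's HTML returns an empty list when all four checks pass. None of this replaces the larger point of the quote: the content itself matters most.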


SEO (Search Engine Optimization) has been in bad repute recently, with Google’s SEO spamming problems in the news. Actually SEO has never been given much respect in the library world, and this is unfortunate because on a basic level SEO is closely related to the library-centric concept of discoverability — Making it easy for users to find good things on your website.

I’ve been thinking for some time that librarians’ apparent lack of interest in SEO was surprising. But recently I’ve been realizing that my perceptions are colored by my experience in crafting Hardin MD pages to be found by Google, beginning about 2001, before most anyone had heard of “SEO.”

I can understand why SEO has bad connotations for library people who know it only as a bag of tricks used by the dotcom-Adwords world to trick Google into giving a high ranking to their clients’ pages. But I hope the examples of my pre-SEO-Adwords experience that I’ll present here will show why I think optimizing pages so they can be found in Google is very much in the library tradition of bringing together the users and the pages.

Even in pre-Google days, standard wisdom about getting pages found by search engines emphasized the importance of a strong page title that gives a concise description of the page’s contents (advice that still holds true). Much of my early work on Hardin MD that I now think of as using SEO techniques centered on this importance of the title. I was an early booster of Google, so I noticed soon after it was launched, in 2000, that many of the pages in Hardin MD were getting high rankings in searches for title words of its pages. I also noticed that most of the pages that were highly ranked got more traffic. But not all of them. Why was this, I wondered? Finally, with the help of WordTracker (this was long before Google Analytics), I figured out that a high Google ranking goes only halfway: the other half of the high-traffic equation is people searching the term that gets the ranking. Getting a high ranking for a term that no one is searching is useless, like providing a supply of something for which there’s no demand! This simple, basic supply and demand principle is still at the heart of SEO.

The case that opened my eyes about the supply and demand principle was a Hardin MD page with the title “Respiration Medicine” — It got high rankings in Google searches but very little traffic. With WordTracker, I saw the reason why — Hardly anyone was searching for “respiration medicine” — So I used WordTracker to determine the equivalent terms that people WERE searching for, and when I put those words in the title (which is now Respiratory System & Lung Diseases), the traffic increased.

Having discovered the value of using title words that people were searching for, I adjusted Hardin MD pages accordingly. This often meant changing from medical specialty terms to terms that are more easily-understood and widely-used by the public — Ophthalmology was changed to Eye Diseases, Cardiology became Heart Disease … Pediatrics >> Childrens Diseases, Otolaryngology >> Ear, Nose, Throat.
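
The supply-and-demand idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the search counts are made up (they are not real WordTracker data), and click_share is simply an assumed fraction of searchers who click a result.

```python
def expected_visits(monthly_searches, click_share):
    """Traffic is demand (searches) times the share of searchers who click our result."""
    return monthly_searches * click_share


def best_title_term(search_counts):
    """Return the candidate term that the most people are searching for."""
    return max(search_counts, key=search_counts.get)


# Made-up counts for illustration only; not real WordTracker data.
counts = {
    "respiration medicine": 10,
    "respiratory system": 2500,
    "lung diseases": 4000,
}

# A top ranking (high click share) for a term nobody searches brings little traffic,
# while a lower click share on a popular term can still bring far more.
low_demand = expected_visits(counts["respiration medicine"], click_share=0.5)
high_demand = expected_visits(counts["lung diseases"], click_share=0.1)
```

With these invented numbers, best_title_term(counts) picks "lung diseases", and high_demand is far larger than low_demand even though its assumed click share is smaller.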

After learning the value of choosing the best words to draw traffic, I applied this optimization lesson to creating the tags that are used at the bottom of Hardin MD pages. The same technique also showed that “pictures” should be used instead of “images” for Hardin MD pages relating to pictures.

I find basic SEO principles especially interesting from a library point-of-view because they have similarities with some of the long-standing principles of librarianship. I’ve written about tagging in Hardin MD that hearkens back to the subject headings used on library catalog cards. And, having had a bit of experience as a library cataloger, I see a similar parallel between the web page title that I’ve discussed in this article and the title-page of a book, which was established as the basis for cataloging books several hundred years ago. Principles of information management endure!


I tweeted this funny a couple of days ago, and got several retweets …

ericrumsey: :-) RT @wabbitoid “How many SEO specialists are needed to change a lightbulb?” bulb bulbs light cheap affordable

It was only after a day or two that I realized that it was a great chance to laugh at myself! …

As the lightbulb joke pokes fun at SEO specialists who are obsessed with thinking of every possible word that people might search for, I remembered the Hardin MD Chicken Pox / Chickenpox page that I made several years ago. It is the only page in Hardin MD for which I used a double-element title, because WordTracker indicated that people search for chickenpox both as two words and as one word. Traffic data has shown, in fact, that the page does indeed get significant traffic for both terms.

I’ve been blogging on the SEO theme recently, and I’m realizing that my interest in it comes a lot from my experience with Hardin MD, much of it long before I heard the term “SEO.” As the little lightbulb example here shows, though, I guess I have a foot in that world.


In February, Google began giving prominent placement to articles in NLM’s PubMed Health. As I discussed in a previous article, NLM and Google have been strangely silent about announcing this new feature, with no discussion of it anywhere that I can find.

It’s especially surprising that Google hasn’t said anything about this because, coincidentally with the NLM boost, Google’s ranking system has been under attack recently, with charges that dotcom sites (most notably JC Penney) have used Search Engine Optimization (SEO) tricks that cause Google to give high rankings to their product pages.

As I’ve discussed before, although using SEO techniques to get high rankings in Google is widely discussed in the dotcom world, it’s an almost unknown subject to most librarians. This is unfortunate, because without a background understanding of SEO, the next step in the “NLM-CIA conspiracy” story seems completely bizarre …

As I discussed in my previous article, soon after Google began giving prominent ranking to NLM, Jeff Hamilton, who blogs about ADHD, raised questions in his short article PubMed Health Who?:

Where the heck did these guys come from? Try and Google “ADHD” and these guys are the #1 search result!! PubMed Health is a new online resource under development at [NLM-NCBI] … Hmmmm, CIA? Secret Government agency? How does an organization go from not being on the radar to the #1 search engine result for ADHD overnight? … You’ve heard stories about how much control and power the Government has over the Internet…….I wonder if this secret SEO organization would be interested in doing some site optimization for me …

As a member of the medical library community, it seems laughable that anyone would suggest underhanded dealings between NLM and Google, and in my previous article I described Hamilton’s idea as “hare-brained.” But thinking it over I realize that to a non-librarian who’s been reading about the recent Google-SEO controversy, Hamilton’s speculations seem more reasonable. As he says, it does indeed seem surprising that PubMed Health pages suddenly began appearing at the top of Google’s rankings, with no explanation from Google or NLM about why this is happening.

The story gets more meta-interesting because Google’s ranking of Hamilton’s SEO story itself becomes part of the story — If the article had stayed on his blog, it probably wouldn’t have gotten much attention from Google and hence the eyeballs of the world. But instead it got copied on the Psychology Today blog, and that brought it a high ranking (generally between #1 and #6 in the last two weeks) in a Google search for PubMed Health — So, let’s say you’re a health-information-seeking consumer who comes across a PubMed Health page in a Google search — You like the page, so you do some googling to find out more about PubMed Health — And what do you find? Hamilton’s NLM-CIA conspiracy article.

So what’s wrong here? Why is a standard resource from a large government site like NLM not able to outrank a blogger’s speculations about its validity in a Google search? Normally, Google does a good job finding “the real thing,” the site itself. The problem, I think, is that there has been nothing for Google to link to for “PubMed Health.” It didn’t even have a home page until last week, when it was announced by NLM/NCBI in Twitter. And there still hasn’t been a press release or longer announcement by NLM or Google. If these sources existed, they and medical library bloggers’ discussions of them would soon dominate Google’s top ten, and leave wild NLM-CIA conspiracy speculations in the dust. I’d guess that sooner or later, NLM and/or Google will make some sort of announcement. But I’d predict that the longer they wait, the harder it will be to displace Hamilton’s article from its high ranking. In my experience, Google has a persistent memory, and it often holds on to links after they have been obsolesced by events.

I think this is an excellent example of why librarians should learn more about SEO — If people at NLM and in the wider medical library community were paying more attention to SEO, it would have been clear that the sudden appearance of a new resource from NLM at the top of Google searches needs to be explained.

Learning more about SEO — If you google for SEO be ready for a fire-hose of sites offering to help you get a Google ranking. You might want to start out with Wikipedia’s lengthy SEO article or a book on SEO in the Dummies guide series.


Google has been under attack recently, because its search results often seem to be overwhelmed by spam-generated links. On the other hand, Wikipedia has gotten many laudatory commentaries on the occasion of its tenth anniversary.

The timing here is interesting. Google, which is driven by computer-generated algorithms, is being “outsmarted” by human SEO engineers who have figured out how to “game” the system to get their sites a high ranking in searches. And Wikipedia, powered by smart human curators, has risen to become “a necessary layer in the Internet knowledge system.”

I’ve looked at several of the tenth-anniversary commentaries discussing the uniqueness of Wikipedia, and it’s surprising that I haven’t seen any that note the significance of its being a human-generated tool. The Atlantic had a good round-up of commentaries by 13 “All-Star Thinkers.” Some of them do talk about the importance of collaboration in the working of Wikipedia, but none of them make the more basic and, to me, even more acute observation that, in this age of the computer, it’s done by human beings!

In the Wikipedia article on Wikipedia, in the section The Nature of Wikipedia is this interesting quote from Goethe:

Here, as in other human endeavors, it is evident that the active attention of many, when concentrated on one point, produces excellence.

Indeed. As my library school teacher used to say, “if there were enough smart humans we wouldn’t need to rely on computers.”

So — Librarians Take Note! Have you ever considered becoming a Wikipedia editor? — On the occasion of the tenth anniversary, Wikipedia founder Jimmy Wales is making an effort to foster more diversity in curation — He especially mentions reaching out to Libraries for help.

Finally, on a related thread — Another notable aspect of Wikipedia that hasn’t been mentioned in anniversary articles — Not only is it done by humans, but it’s done by humans on a volunteer basis — As I discussed in an earlier article, Daniel Pink uses this as a classic example of “intrinsic motivation” >> Wikipedia vs Encarta: The Ali-Frazier of Motivation.



In an article on the most popular online stories of 2010 in the Lawrence (Kansas) Journal-World, Whitney Mathews discusses writing headlines with “‘Google juice’” to attract traffic — In other words, using the principles of Search Engine Optimization (SEO) — Mathews talks especially about a syndicated AP story in April for which they made the headline “iPad vs. Kindle” — With this short, pithy headline, the article has consistently been in the top ten hits in Google searches ever since (which I confirmed several times in the last few days), and of course has brought a lot of traffic.

I’ve been aware of the importance of choosing language carefully to bring search engine traffic since the early days of Hardin MD, before SEO became big business, and I’ve been surprised that libraries have been so slow to put it to use. Recently I’ve been paying attention to publishing and journalism because I see that people in those fields are thinking about many of the same digital-future questions as librarians. So I was glad to find, in the Lawrence story, that journalists ARE thinking about crafting their stories to be found by Google. A bit of googling (searching for SEO newspapers site:edu) showed that Lawrence is not alone — There’s a lot on SEO and newspapers.

How about libraries? …

Comparing journalism to librarianship, searching for SEO libraries site:edu finds very little — Actually, I’ve been doing this search periodically for several months, and have never found anything in the top ten, until today I did find one piece, a Word document from Binghamton University Libraries (YAY!) on using SEO for their web pages.

In the dotcom part of the online world, SEO is a given. Why have libraries not used it more? I’ll be writing more about this in the next few weeks, so keep watching.

What’s Chocolate got to do with the story? …

It just happened that this week, as I was reading about SEO in Lawrence, I was also following a NY Times story with the catchy title Giving Alzheimer’s Patients Their Way, Even Chocolate. This was a lengthy article about an innovative Phoenix nursing home, with only incidental mention of chocolate, but the smart headline writer with some SEO-savvy used the word to get attention — The story was in the top ten most emailed NYT stories all week, and I suspect the chocolate hook had a lot to do with that.

And finally, with my mind on chocolate and libraries, I found this cute little article that was my most popular tweet of the week, no doubt showing the (SEO) power of chocolate! …

ericrumsey: How about a Library with Chocolate instead of Books? NY Educ Dept says NO! http://nyti.ms/fJoGlI

Thanks to my son Brian Rumsey, who lives in Lawrence, and brought my attention to the Journal-World story.


[This article accompanies previous article: Tagging in Hardin MD]

Soon after the launching of Hardin MD, in 1996, we began adding keywords in the hidden META keyword field. (The first pages for HMD in Internet Archive [Dec, 1998] show them on all pages checked.) We began checking to see if HMD pages were appearing in search engine results in about 2000, and found that meta keywords didn’t seem to have much effect.

So, in late 2000, we began experimenting with putting keywords (aka tags*) at the bottom of the page, where most users wouldn’t notice them. At first we didn’t see much effect in search engine results, when using the tags mostly for variant spellings or terminology (e.g. on the Hematology page: blood diseases, haematology).

In 2001, as Google rose to prominence, and Search improved, we began using tools that gave the ability to see the popularity of specific words (HitBox, ExtremeTracking, WordTracker). We learned that using mis-spelled word variants as tags worked very well in drawing SE traffic. It was also during this time that links to pictures were being added to HMD, and we discovered the power of the word “pictures” in drawing SE traffic.

Time-line of tagging in Hardin MD

Based on invaluable help from Internet Archive — Starting from here: Internet Archive for Hardin MD, 1999+

The first HMD pages in Internet Archive in Dec, 1998 have meta keywords, but not tags on the page. Example of meta keywords (Hardin MD: Cardiology): health, medicine, medical, nursing, nurses, nurse, disease, diseases, best, list, lists, consumer, cardiology, cardiac, heart, stroke, cardiovascular, cardiothoracic, pacemaker, defibrillator, attack, arrest

Tagging for misspellings – Ophthalmology, I’m sure, would have been one of the first pages on which misspellings would have been used. Internet Archive pages show clearly that the first implementation was in early November, 2000. …

Ophthalmology, Nov 7, 2000 – No misspellings in meta keywords. There are no tags on page.
Ophthalmology, Nov 15, 2000 – Has misspellings in meta keywords and on page: [ophthamology]

This fits my memory of events — I was especially motivated to look for ways to draw Web traffic, because Google was just becoming prominent, rationalizing the search process, and making it easier to predict the effects of changes on page traffic.

Other examples of pages with tags on the page, with variant spellings, from about the same time: Orthopedics Nov 16, 2000 [orthopaedics] and Hematology Nov 29, 2000 [blood diseases, haematology]
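
The before-and-after comparison of archived pages described above could be automated along these lines. This is a minimal sketch, assuming you already have the HTML of two snapshots as strings; the names TagFinder and where_is_variant are hypothetical, and the two sample pages in the usage note are invented stand-ins, not the actual archived Hardin MD pages.

```python
from html.parser import HTMLParser


class TagFinder(HTMLParser):
    """Collects the meta keywords and all text nodes of a page."""

    def __init__(self):
        super().__init__()
        self.meta_keywords = []
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.meta_keywords = [
                k.strip().lower() for k in content.split(",") if k.strip()
            ]

    def handle_data(self, data):
        # Note: this collects every text node, including the page title.
        self.text_chunks.append(data.lower())


def where_is_variant(html, variant):
    """Report whether a spelling variant appears in meta keywords and in page text."""
    finder = TagFinder()
    finder.feed(html)
    finder.close()
    variant = variant.lower()
    return {
        "in_meta_keywords": variant in finder.meta_keywords,
        "on_page": any(variant in chunk for chunk in finder.text_chunks),
    }
```

For example, with an invented "before" page whose keywords are only "ophthalmology, eye" and an invented "after" page that adds "ophthamology" to both the meta keywords and the page text, where_is_variant(page, "ophthamology") reports False for both fields on the first and True for both on the second.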

Use of the word “pictures,” in tagging and in page titles

First use: Genital Warts Jun 10, 2002

First widespread use – Several pages linked on Hardin MD Index page Sept 30, 2002



[When Hardin MD was launched in 1996, its main purpose was to provide links to health science resources on the Web. In recent years, the emphasis has been on providing access to medical pictures.]

We first started tracking how well Google was finding Hardin MD pages in about 2001, when search engine optimization was in its infancy, and most people, like us, had not heard the term “SEO.” But in today’s lingo, that’s pretty much what we were doing — Learning to use language that would help people searching in Google to find our pages — So here’s a little example of using search engine optimization techniques before they became famous as SEO. …

Users of Hardin MD will notice that the word “pictures” is used frequently on our pages and the word “images” is rarely used. Why is this? Basically, the answer is simple — We use “pictures” because that’s the word people use in searching.

The screen-shots below, for the Hardin MD : Impetigo Pictures page, show this clearly. The Extreme Tracker shot for this page shows the large proportion of search engine traffic from the word “pictures” (36%) compared to the small amount of traffic from the word “images” (0.6%).

[Screen-shots: Hardin MD : Impetigo Pictures page (hmd_impetigopics.JPG); Keywords, Extreme Tracker (extremeimpetigo.jpg)]

The Google screen-shots show that the Impetigo Pictures page gets an equally high ranking for the two words, so it’s apparent that “pictures” is being searched much more frequently.

[Screen-shots: Google search: impetigo pictures (g_pictures.jpg); Google search: impetigo images (g_images.jpg)]

(Note that these screen-shots have been photo-edited to fit the space — Ads and other text not relevant to the article have been removed. All screen-shots captured in July 2008.)
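
The kind of keyword percentages that Extreme Tracker reports could be approximated from a raw list of search queries, as in the sketch below. It is only an illustration under stated assumptions: the function name keyword_share is hypothetical, the sample queries are invented, and this is not a description of how Extreme Tracker itself calculates its figures.

```python
from collections import Counter


def keyword_share(queries, keywords):
    """Return, for each keyword, the fraction of queries that contain it as a word."""
    total = len(queries)
    hits = Counter()
    for query in queries:
        words = set(query.lower().split())
        for keyword in keywords:
            if keyword in words:
                hits[keyword] += 1
    return {kw: (hits[kw] / total if total else 0.0) for kw in keywords}


# Invented queries for illustration only; not real traffic data.
sample_queries = [
    "impetigo pictures",
    "pictures of impetigo",
    "impetigo images",
    "impetigo",
]
shares = keyword_share(sample_queries, ["pictures", "images"])
```

If a page ranks equally well for two words, as in the Google screen-shots, a large gap between their shares points to a difference in how often people search each word.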

Here’s the background …

In about 2001, we started noticing how people were finding Hardin MD pages in search engines, and designing our pages to make them more likely to be found. An important part of this was using words that people were more likely to search (e.g. “heart diseases” instead of “cardiology”). Tools such as WordTracker that show how many people are searching for particular words are especially useful for this.

About this same time, we were starting to make links to other sites that have pictures on medical/disease subjects. Using WordTracker, and ExtremeTracker (to see words people were searching to find our pages) it was striking that the word “pictures” was very effective. At the time, we assumed that the appropriate word to use was “images,” since that word is what’s used on most medical/disease pages at other sites. We could see clearly, however, that using the word “pictures” on our pages brought much more traffic than the word “images.” So we’ve gone on from there, and now have high rankings in Google for many medical/disease subjects combined with “pictures,” as with Impetigo.

Extreme Tracker | WordTracker