Google Ngrams is a fascinating visualization tool for studying word frequency over time in the 15 million books that are part of the Google Books project. The research that led to the creation of Ngrams was a cooperative effort between Google and Harvard University.

The screenshot below shows Ngrams in action, making it easy to see at a glance how cancer came to predominate over infectious diseases in the 20th century. Other examples show similar trends in related diseases, medical specialty fields, and the practice of healthcare. Ngram Viewer IS case sensitive, and results vary quite a bit depending on capitalization, so play around with it.
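To see why capitalization matters, here is a minimal Python sketch of case-sensitive word counting. It only illustrates the general idea and is not how Ngram Viewer is implemented; the sample sentence and the function name `count_words` are made up for the example.

```python
import re
from collections import Counter

def count_words(text, case_sensitive=True):
    """Count words in text; optionally fold everything to lower case first."""
    words = re.findall(r"[A-Za-z]+", text)
    if not case_sensitive:
        words = [w.lower() for w in words]
    return Counter(words)

# Made-up sample text, for illustration only.
sample = "Cancer wards treated cancer; Influenza and influenza were both feared."

exact = count_words(sample)                          # "Cancer" and "cancer" counted separately
folded = count_words(sample, case_sensitive=False)   # both forms merged into "cancer"
```

With case-sensitive counting, a word that mostly appears at the start of sentences or in titles gets a different count than its lower-case form, which is why trying both capitalizations in a search is worthwhile.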

Especially of historic interest:

Eric Rumsey is at: eric-rumseytemp AttSign uiowa dott edu and on Twitter @ericrumseytemp

Just as Google Wave was announced yesterday, I was thinking of writing about the usefulness of the pictures that accompany results in Twitter Search, which give a good immediate overview of search results. I find this especially valuable when searching for Twitter users, to see how connected they are: it’s easy to see at a glance whether most of the tweets listed are by the person being searched. Now Google Wave takes the idea a step further, with pictures of the people in an email thread. Below left: Twitter Search. Below right: Google Wave (from yesterday’s Google demo).

Facebook, of course, has similar pictures in its status updates. It’s interesting to follow how the use of pictures has progressed: in Facebook the status update pictures are relatively small; in Twitter they are larger; and now, in Google Wave, there are multiple pictures. This increasing reliance on pictures is smart. With the brain’s highly developed facial recognition skills, we’re able to take in a large amount of information very quickly.


Venn diagrams have long been used in teaching online searching, to help users visualize how Boolean searching works. A new application of Venn diagrams, Twitter Venn, gives on-the-fly Venn diagrams of Twitter postings. Because Twitter does such a good job of taking the pulse of the Web, Twitter Venn is an excellent way to visualize connections among breaking news topics. The first Twitter Venn example below shows that there are 5047 postings per day with the word heart, 1314 postings with the word risk, and 24 postings that contain both words, represented by the gray “intersection” in the middle containing two purple dots. The fun part is at the lower left, where other words that occur in the postings on both heart and risk are listed. The top word, in large print, is decaf, indicating that there’s current buzz relating decaf to heart disease risk. Sure enough, a Google search for heart decaf shows that there have indeed been recent reports that decaf coffee may increase the risk of heart disease, at least slightly.
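The counting behind a diagram like this can be sketched in a few lines of Python. This is only an illustration of the Boolean AND idea, not Twitter Venn’s actual code: the function name `venn_counts`, the sample postings, and the choice to include the overlap in each circle’s count are all assumptions made for the example.

```python
import re
from collections import Counter

def venn_counts(posts, term_a, term_b):
    """Count posts containing term_a, term_b, and both, plus co-occurring words."""
    # Each post becomes a set of lower-case words (hypothetical tokenizer).
    word_sets = [set(re.findall(r"[a-z']+", p.lower())) for p in posts]
    with_a = {i for i, words in enumerate(word_sets) if term_a in words}
    with_b = {i for i, words in enumerate(word_sets) if term_b in words}
    both = with_a & with_b  # the Boolean AND: the Venn intersection

    # Tally the other words that appear in the intersection posts.
    common = Counter()
    for i in both:
        common.update(word_sets[i] - {term_a, term_b})

    return {"a": len(with_a), "b": len(with_b), "both": len(both), "common": common}

# Made-up postings, for illustration only.
posts = [
    "decaf coffee may raise heart risk",
    "my heart is full today",
    "risk of rain this weekend",
    "heart disease risk linked to decaf",
    "follow your heart",
]

result = venn_counts(posts, "heart", "risk")
top_word = result["common"].most_common(1)[0][0]
```

In this toy data set the intersection holds two postings, and the word they most often share is decaf, which mirrors how the word list at the lower left surfaces the topic linking the two search terms.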

[Click images below for live results in Twitter Venn. The numbers will vary slightly, since they’re generated live. On the Twitter Venn results, click the middle intersection area to show common terms in lower left.]

The second example, below, shows clearly that the main source of alarm about salmonella poisoning is peanut butter, since this is the predominant word that occurs with the search words, as shown in the listing at the lower left.

I’ve thought for a long time that Venn diagrams are a good way to visualize the relationships among subjects in online searching. But I suspect the sort of on-the-fly, real-time generation of Venn diagrams done by Twitter Venn would be too slow for databases with more text per record. So it falls to Twitter, with its tiny 140-character records, to show how useful Venn diagrams can be for visualization.