A New York Times article on Google Flu Trends reports that Google’s methodology “has been validated by an unrelated study” based on Yahoo! search data. The lead author of that study is Philip Polgreen, an infectious disease doctor at the University of Iowa.
I was glad to learn about the Polgreen study, first, of course, because Polgreen and colleagues are right here at the University of Iowa! Beyond that, it was good to find that the full article by the Polgreen team gives more detail about the flu-related search terms they used than the Google Flu Trends team does, which makes it easier to break down the complicating factors in flu searching. Specifically, they report that they excluded the following terms:
bird, avian, pandemic
vaccine, vaccination, shot
As discussed in the accompanying articles (see below), flu is a particularly complicated disease for correlating disease occurrence with web search behavior, both because of the existence of bird flu and because there is a vaccine for flu. These are exactly the factors that Polgreen et al. excluded. It seems likely that the Google Flu Trends team is using a similar method.
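The kind of exclusion described above can be sketched in a few lines of Python. This is only an illustration, not the Polgreen team's actual method: the excluded terms come from their report as quoted here, but the inclusion terms ("flu", "influenza"), the whole-word matching rule, and the function names are assumptions made for the example.

```python
import re

# Exclusion terms as reported by Polgreen et al. (listed above).
EXCLUDED_TERMS = {"bird", "avian", "pandemic", "vaccine", "vaccination", "shot"}

# Assumption: illustrative inclusion terms, not taken from the study.
INCLUDED_TERMS = {"flu", "influenza"}


def is_flu_query(query: str) -> bool:
    """Return True if the query mentions flu and none of the excluded terms."""
    # Split the lowercased query into alphabetic words (whole-word matching).
    words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(words & INCLUDED_TERMS) and not (words & EXCLUDED_TERMS)


def count_flu_queries(queries) -> int:
    """Count the queries that pass the flu filter."""
    return sum(1 for q in queries if is_flu_query(q))
```

With this filter, a query such as "flu symptoms" is counted, while "bird flu" and "flu shot near me" are not. Because matching is on whole words, a variant such as "shots" would slip through; a real analysis would probably need stemming or a longer term list.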
Incidentally, there is more to the Iowa connection: Philip Polgreen has been involved for several years with the Iowa Health Prediction Market, a spin-off of the Iowa Electronic Markets, a real-money prediction market (or futures market) that’s used to make predictions in political elections.
This is one of a group of three articles on Google Flu Trends:
- Google Flu Trends: Kudos & Complications
- Google Flu Trends: Flu Symptoms vs Flu Shot
- Google Flu Trends: The Iowa Connection (this article)
Together, these articles suggest that Google has probably done a good job of working around the complications of flu-related search patterns, although it’s difficult to know with assurance because Google has not revealed the search terms it uses for GFT.