By Eric Rumsey and Janna Lawrence
In our earlier article on searching for plant-based foods (PBF) in PubMed, we suggested that a quick way to search the subject is to combine the MeSH Plants explosion AND our Food-Diet-Nutrition (FDN) hedge. This works quite well, especially for citations after 2002. In that year the Plants explosion was greatly expanded by the addition of several hundred new MeSH plant names. Before that, articles on specific plants were indexed inconsistently. Sometimes they were put under the plant family name, in which case they were included in the Plants explosion, and in other cases they were indexed under other terms like Vegetables, Fruit, or Plants, Edible, that are not in the Plants explosion.
To make the search as inclusive as possible, including citations before 2002, we have created two hedges that can be used for all the years in PubMed. These hedges include other MeSH terms and text-words, to supplement the plant-name search strategy that works well after 2002. We have tested the two hedges fairly thoroughly, and we recommend the first hedge for most searches. It uses MeSH terms, and it emphasizes “precision,” which means that it retrieves somewhat fewer citations, but those citations are more likely to be on target. We have combined each hedge in an OR search with the simple “Plants AND FDN” hedge search mentioned above.
Here’s the first hedge. The newly added terms are the MeSH terms in the first parenthesized group:
(Vegetables [mesh] OR Fruit [mesh] OR Cereals [mesh] OR Plants, Edible [mesh] OR Soybeans [mesh] OR Dietary Fiber [mesh] OR Flour [mesh] OR Bread [mesh] OR Diet, Vegetarian [mesh] OR Nuts [mesh] OR Condiments [mesh] OR Vegetable Proteins [mesh]) OR (Plants [mesh] AND (food OR foods OR beverages OR diet OR dietary OR vitamin OR vitamins OR nutrition OR nutritional OR nutrition disorders OR food industry OR nutritional physiological phenomena OR dietary fats OR dietary proteins OR feeding behavior))
[Number of citations, 8.19.14 - 265,126]
To use this search, copy the text above and paste it into the PubMed search box. If you have a personal “My NCBI” account in PubMed, the hedge search can be saved for later use, or it can be made into a search filter.
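For readers who would rather run the hedge from a script than paste it into the search box, the following is a minimal sketch, not part of our original workflow. It rebuilds the first hedge from its two term lists and forms a request URL for NCBI's E-utilities ESearch service. The function names `build_first_hedge` and `count_url` are hypothetical, and the request parameters are our assumption of a reasonable setup; the sketch only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# MeSH headings from the first hedge, as listed in the article.
MESH_TERMS = [
    "Vegetables", "Fruit", "Cereals", "Plants, Edible", "Soybeans",
    "Dietary Fiber", "Flour", "Bread", "Diet, Vegetarian", "Nuts",
    "Condiments", "Vegetable Proteins",
]

# Food-Diet-Nutrition (FDN) words that are ANDed with the Plants explosion.
FDN_TERMS = [
    "food", "foods", "beverages", "diet", "dietary", "vitamin", "vitamins",
    "nutrition", "nutritional", "nutrition disorders", "food industry",
    "nutritional physiological phenomena", "dietary fats",
    "dietary proteins", "feeding behavior",
]

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_first_hedge():
    """Assemble the first hedge: (MeSH terms) OR (Plants AND FDN words)."""
    mesh_part = " OR ".join(f"{term} [mesh]" for term in MESH_TERMS)
    fdn_part = " OR ".join(FDN_TERMS)
    return f"({mesh_part}) OR (Plants [mesh] AND ({fdn_part}))"


def count_url(query):
    """Build an ESearch URL that asks only for the citation count (assumed setup)."""
    params = {"db": "pubmed", "term": query, "rettype": "count", "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Print the URL; opening it in a browser or HTTP client would run the search.
print(count_url(build_first_hedge()))
```

Keeping the terms in lists makes it easier to add or drop a term and regenerate the search string without hunting for mismatched parentheses.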
Commentary on terms in this hedge (If the “Year introduced” is not given, the term has been in MeSH since its launch in 1966):
- Vegetables [mesh]
An explosion that includes about 25 specific vegetables, including Onions, Soybeans, Daucus carota, and Solanum tuberosum. This is a relatively small proportion of all vegetables; most others are indexed with their species or family name, in the Plants explosion.
- Fruit [mesh]
Notably, this is NOT an explosion. All particular fruit types are indexed with their species or family name, in the Plants explosion.
- Cereals [mesh]
An explosion that includes eight cereals, including Avena sativa, Triticum, and Zea mays. This is an important group, since it includes the world’s staple foods: wheat, rice, and corn.
- Plants, Edible [mesh]
An explosion that includes several terms elsewhere in this hedge that get more citations when they’re searched separately. The term Plants, Edible by itself gets 5402 citations.
- Soybeans [mesh]
Year introduced: 1986
An explosion that includes Soy Foods, Soy Milk, and Soybean Proteins.
- Dietary Fiber [mesh]
Year introduced: 1982(1977)
- Flour [mesh]
- Bread [mesh]
- Diet, Vegetarian [mesh]
Year introduced: 2003(1963)
- Nuts [mesh]
- Condiments [mesh]
An explosion that includes Spices.
- Vegetable Proteins [mesh]
Year introduced: 1975
Here’s the second hedge. The newly added terms are the text-words in the first parenthesized group:
(fruit OR fruits OR vegetable OR vegetables OR cereal OR cereals OR spices OR condiments OR flour OR nut OR nuts OR vegetarian OR soy OR soybean OR soybeans OR bread) OR (Plants [mesh] AND (food OR foods OR beverages OR diet OR dietary OR vitamin OR vitamins OR nutrition OR nutritional OR nutrition disorders OR food industry OR nutritional physiological phenomena OR dietary fats OR dietary proteins OR feeding behavior))
[Number of citations, 8.19.14 - 332,351]
To use this search, follow the instructions given above for the first hedge.
Most of the words used in this hedge are text-word versions of the MeSH terms used in the first hedge. Because this hedge emphasizes “recall” instead of “precision,” it gets more citations than the first hedge, but the citations are less likely to be relevant. We looked closely at citations retrieved by the two hedges, and it was easy to see that those from the second hedge were less relevant. Most of the additional citations, of course, are retrieved because they contain words in the abstract (e.g. fruit, vegetables) that were not assigned as MeSH terms.
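One way to look at the citations that only the text-word hedge retrieves is to subtract one search from the other with a Boolean NOT. The sketch below is an illustration under that assumption, not the exact procedure we followed; the helper name `only_in_second` is hypothetical, and the two short queries stand in for the full hedges.

```python
def only_in_second(first_query, second_query):
    """Return a query for citations found by second_query but not first_query."""
    return f"({second_query}) NOT ({first_query})"


# Shortened stand-ins for the two hedges, for illustration only.
mesh_query = "Vegetables [mesh] OR Fruit [mesh]"
text_query = "fruit OR fruits OR vegetable OR vegetables"

print(only_in_second(mesh_query, text_query))
```

Pasting the resulting string into the PubMed search box should list citations that mention the words but were not indexed with the corresponding MeSH terms, which is where the less relevant results tend to appear.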
A word about searching for older citations
When we first realized that most of the plant name MeSH terms were only introduced in 2002, it seemed like a serious problem. However, as we’ve looked back, we’ve come to see that there really wasn’t much research attention given to the subject in the earlier days of MEDLINE, especially before about 1990.
We’ve done detailed work to study this, but in this article we’ll just give a couple of anecdotal examples of what we’ve found. We looked at the number of citations per year that contain the word “fruit” since 1968, and found that this stayed flat, at about 400 citations per year, until about 1990. It has grown quickly since then, and in 2013 the word appeared in about 8,000 citations. In another example, we found that there are 70 articles in all of PubMed that have “sweet potato” in the title and are on human subjects. All but three of these are after 1992; no citations from 1980-1992 contain the words in the title. So, if it seems like the hedges in this article aren’t finding many citations before 1990, it’s probably because there just aren’t many to be found.
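Year-by-year counts like these can be gathered by restricting a search to one publication year at a time. The sketch below generates one query per year using PubMed's `[dp]` (date of publication) tag. The function name `yearly_queries` is hypothetical, and the sweet potato query is an assumed reconstruction, since the exact search strings behind the figures above are not given here.

```python
def yearly_queries(term, first_year, last_year):
    """Return {year: query}, each restricting `term` to one publication year."""
    return {
        year: f"{term} AND {year}[dp]"
        for year in range(first_year, last_year + 1)
    }


# Assumed reconstruction of a title search limited to human subjects.
SWEET_POTATO_QUERY = '"sweet potato"[ti] AND Humans [mesh]'

# Show the queries for the first few years of the "fruit" example.
for year, query in yearly_queries("fruit", 1968, 1970).items():
    print(year, query)
```

Each generated query can be pasted into PubMed, and the result count noted for that year.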
Things improved in the 1990s. It appears, from our retrospective examination of citations on FDN, that as the volume of research on the subject increased, NLM gradually improved the quality of MeSH indexing to accommodate it. The coverage of more prominent plant families improved, and the application of existing FDN MeSH terms became more consistent. So in the 1990s, even before the mass introduction of new MeSH plant terms in 2002, FDN indexing and retrieval were improving.