PubMed Food Problem: Cruciferous Vegetables


To do a PubMed search for cruciferous vegetables that includes such species as Radish and Arugula, each species must be searched separately.

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

To search successfully for cruciferous vegetables in PubMed, it helps to know exactly what “cruciferous” means. That makes it easier to understand which vegetables are considered “cruciferous” and how they are related botanically. We have discussed these topics in a companion article.

In general, cruciferous vegetables are considered to be any edible plants in the family Brassicaceae. Most of these, especially the more popular ones (cabbage, broccoli, cauliflower, Brussels sprouts), are in the genus Brassica. A few are in other genera in the family, the most notable being Radish (Raphanus), Daikon (Raphanus), Arugula (Eruca), Horseradish (Armoracia), White mustard (Sinapis), Garden cress (Lepidium) and Wasabi (Wasabia). Because most edible members of the family Brassicaceae are in the genus Brassica, searching for that genus works well for most cruciferous vegetables. But without including the MeSH terms for all of the other edible genera in the family, there is no easy way to do a comprehensive search for them as a group.
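As a rough illustration of what "including the MeSH terms for all of the other edible genera" involves, here is a minimal Python sketch that joins the genus names mentioned above into one OR query. The function name and the list are hypothetical, the genus names are simply those listed in this article, and we assume each one is available as a MeSH heading; check the MeSH Browser before relying on any of them.

```python
# Genus names as listed in the article. ASSUMPTION: each is a MeSH
# heading; verify in the MeSH Browser before using the query.
CRUCIFEROUS_GENERA = [
    "Brassica", "Raphanus", "Eruca", "Armoracia",
    "Sinapis", "Lepidium", "Wasabia",
]


def build_genus_query(genera):
    """Join genus names into one OR query, each tagged as a MeSH term."""
    tagged = [f'"{name}"[MeSH Terms]' for name in genera]
    return "(" + " OR ".join(tagged) + ")"


# The resulting string can be pasted into the PubMed search box.
print(build_genus_query(CRUCIFEROUS_GENERA))
```

The list has to be maintained by hand, which is exactly the inconvenience described above: nothing in MeSH groups these genera as foods.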

With edible species in several genera in the Brassicaceae family, it might seem that one way to include all of them would be to search for the family name, since it’s an explosion that contains all of the genera in the family. We have seen this done by MeSH indexers in some cases, but it has problems. For one thing, the family is very large, containing 372 genera, so searching for the family name can retrieve many inappropriate citations. This is especially a problem because one of the genera in the family is Arabidopsis, a very commonly used research subject in plant genetics that has nothing to do with nutrition. Arabidopsis is something like the Drosophila of the plant world. So searching for the exploded MeSH term Brassicaceae retrieves a flood of articles on Arabidopsis; approximately 80% of all articles retrieved by this search are indexed to the narrower term Arabidopsis.
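To see how much of the family-level retrieval is Arabidopsis, one could compare the exploded search with and without that genus. The sketch below only builds the two query strings; the function name is hypothetical, and this is an illustration rather than a search strategy we recommend, since NOT also drops any food-related article that happens to be indexed under both terms.

```python
def family_queries(family, noisy_genus):
    """Return (broad, filtered) PubMed query strings.

    broad:    the exploded family-level MeSH term
    filtered: the same term with one noisy genus excluded via NOT
    """
    broad = f'"{family}"[MeSH Terms]'
    filtered = f'{broad} NOT "{noisy_genus}"[MeSH Terms]'
    return broad, filtered


broad, filtered = family_queries("Brassicaceae", "Arabidopsis")
print(broad)
print(filtered)
```

Running both strings in PubMed and comparing the result counts gives a rough sense of the proportion of Arabidopsis articles at the time of the search.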

We found another problem in how cruciferous vegetables are treated in PubMed indexing when we looked at a sample of 30 articles with “cruciferous” in the title. Twenty-eight of the 30 articles actually had the phrase “cruciferous vegetables” in the title, but in about one-third of the 30 articles, there was no indexing term at all correlated with the word “cruciferous,” and the indexing term used was just “vegetables,” ignoring the word “cruciferous.” Another problem we found in this sample is that, of the articles that had an indexing term correlated with “cruciferous,” the term usually used was the family name, Brassicaceae, which retrieves many non-food-related citations, as discussed above.

Suggestions for improving indexing of cruciferous foods

Because there is currently no way to search for cruciferous foods as a group, we suggest that NLM add a new MeSH term <cruciferous foods>. This would not only put all of these foods under one term, it would also provide a term to use for articles that use the term itself in the title or abstract.

Image at top of article is from Wikipedia.

Plant-Based Foods – A Tricky PubMed Search – Revised 2016


By Eric Rumsey, Janna Lawrence and Xiaomei Gu

As we discussed in an article earlier this year, searching for nutrition in PubMed has improved greatly since NLM brought the subject together in one explosion (Diet, Food, and Nutrition). This ability to search the field of nutrition easily has helped in searching for plant-based foods [PBFs] in some ways. But in other ways, it’s still as difficult as it was when we wrote our 2013 article on the same topic.

The basic problem in searching for PBFs, just as it was before the addition of the new explosion, is that a large proportion of PBFs are not in the Food explosion at all, but only in Plants. So the fact that Food is part of the new inclusive explosion doesn’t make it any easier to search for PBFs.

In addition to the fact that most fruits and vegetables are treated as plants instead of foods, another problem in searching for them is that almost all of them are put under their botanical, Latin names, which are not recognizable to most people. Here are some examples, all of which are in the plant-taxonomic branch of the MeSH tree:

  • Kale is Brassica
  • Sweet potato is Ipomoea batatas
  • Plum is Prunus domestica
  • Almond is Prunus dulcis
  • Apple is Malus
  • Cranberry is Vaccinium macrocarpon
  • Strawberry is Fragaria
  • Kidney bean is Phaseolus
  • Chocolate is Cacao
  • Turmeric is Curcuma

If you’re searching for specific food plants, the Latin botanical MeSH terms are usually not a problem, because when you search for a common name, it’s mapped to the botanical MeSH term (e.g. if you search for Grapes, it maps to Vitis). The problem comes if you want to browse the Plants explosion to pick out the edible plants from the many plants that are not edible, because only the botanical names are listed. The Rose family (Rosaceae) of plants, for example, has several edible species within it. There are 19 genera listed in MeSH in the family, and 6 of them have edible species. But to find them, you have to be able to pick out the genera with edible species (e.g. Malus, Prunus) from the others (e.g. Agrimonia, Alchemilla).
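The common-name-to-botanical-name relationship can be pictured as a simple lookup table. The sketch below uses only the pairs given in this article (the list above plus the Grapes example); the dictionary and function names are hypothetical, and the table is an illustration, not a copy of how PubMed maps terms internally.

```python
# Pairs taken from the examples in this article; not a complete list.
COMMON_TO_MESH = {
    "kale": "Brassica",
    "sweet potato": "Ipomoea batatas",
    "plum": "Prunus domestica",
    "almond": "Prunus dulcis",
    "apple": "Malus",
    "cranberry": "Vaccinium macrocarpon",
    "strawberry": "Fragaria",
    "kidney bean": "Phaseolus",
    "chocolate": "Cacao",
    "turmeric": "Curcuma",
    "grapes": "Vitis",
}


def mesh_query_for(common_name):
    """Return a MeSH-tagged query for a common food name, or None."""
    heading = COMMON_TO_MESH.get(common_name.strip().lower())
    if heading is None:
        return None  # name is not among the article's examples
    return f'"{heading}"[MeSH Terms]'


print(mesh_query_for("Grapes"))
```

Going from common name to botanical name is the easy direction. Browsing goes the other way, from a list of Latin names back to foods, and nothing in the MeSH tree marks which genera are edible.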

If you’re interested in learning about how to search for PBFs in PubMed, see our companion article, which includes an updated “search recipe,” or hedge.

[Image above is public domain, from Wikimedia]