PubMed Food Problem: Cranberry & Cranberries | Update

cranberry700x350

Because most plant-based foods are under “plants” instead of “food” in PubMed, articles on cranberries may not be retrieved in a search for Food.

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

Part of the problem in searching for food in PubMed is that there’s often a fuzzy border between food and medicine. A food that is enjoyed for its taste and general nutritional benefits may have properties that make it therapeutic for specific health conditions. A good example of this is cranberries and cranberry juice, which may have benefits for prevention of urinary tract infections.

As with most plant-based foods, in MeSH indexing, cranberry is in the Plants explosion, and it’s not in the Diet, Food, and Nutrition explosion. Fortunately, most articles on cranberries and cranberry juice are assigned some Diet, Food, and Nutrition indexing terms so that they are retrieved in searches for the explosion. For instance, articles on cranberry juice are often under Beverages, and some articles on cranberries are under Fruit or Dietary supplements. However, there is still a significant number of relevant articles on the subject that are missed.

To show examples of cranberry-related articles that are not retrieved by the Diet, Food, and Nutrition explosion, we searched for cranberry or cranberries in the article title, limited to humans, and retrieved 391 articles. We then combined this with the Diet, Food, and Nutrition explosion. This retrieved 269 articles, or 69% of the cranberry/cranberries articles, which is fairly good retrieval. Still, it’s notable that 122 articles are not retrieved because they don’t contain any Diet, Food, and Nutrition MeSH terms, and many of them appear to be very much on target. Here are some examples:

As we mentioned above, plant-based foods are tricky to search in PubMed because the name of the food plant is usually only in Plants, and not in the Diet, Food, and Nutrition explosion. The six articles above are all indexed under Vaccinium macrocarpon, the taxonomic name of cranberry, which is in the Plants explosion. So if you were searching for articles on urinary tract infections and plant-based foods, a strategy that would retrieve these articles would be to combine Urinary Tract Infections AND Plants.
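The counts above can be checked with a little arithmetic, and the searches can be written out as PubMed query strings. The sketch below is only an illustration: the field tags ([ti] for title, [mh] for MeSH) and the Humans limit are assumptions about how such a search might be typed, not the exact strategy behind the figures above.

```python
# Hypothetical query strings reconstructed from the prose above.
# The field tags and the Humans limit are assumptions, not the exact searches.
DFN = '"Diet, Food, and Nutrition"[mh]'
TITLE_SEARCH = "(cranberry[ti] OR cranberries[ti]) AND Humans[mh]"


def retrieved_by_dfn(base: str) -> str:
    """Articles from `base` that carry at least one Diet, Food, and Nutrition term."""
    return f"({base}) AND {DFN}"


def missed_by_dfn(base: str) -> str:
    """Articles from `base` that the Diet, Food, and Nutrition explosion misses."""
    return f"({base}) NOT {DFN}"


def percent(part: int, whole: int) -> int:
    """Share of `whole` represented by `part`, rounded to a whole percent."""
    return round(100 * part / whole)


# Counts reported in the text above.
total, retrieved = 391, 269
missed = total - retrieved

# The fallback strategy suggested above: combine the condition with Plants.
UTI_AND_PLANTS = '"Urinary Tract Infections"[mh] AND Plants[mh]'

if __name__ == "__main__":
    print(retrieved_by_dfn(TITLE_SEARCH))
    print(missed_by_dfn(TITLE_SEARCH))
    print(f"{percent(retrieved, total)}% retrieved, {missed} missed")
```

The NOT form is handy for spot-checking which on-target articles lack any Diet, Food, and Nutrition indexing.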

The image at the top of the article is original.

PubMed Food Problem: Cruciferous Vegetables

cruciferousvegetables_690x345

To do a PubMed search for cruciferous vegetables that includes such species as Radish and Arugula, each species must be done separately.

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

In order to do successful searches for cruciferous vegetables in PubMed, it helps to know exactly what “cruciferous” means, which makes it easier to understand what vegetables are considered “cruciferous” and the botanical relationships among them. We have discussed these topics in a companion article.

In general, cruciferous vegetables are considered to be any plants in the family Brassicaceae that are edible. Most of these, especially the more popular ones (cabbage, broccoli, cauliflower, Brussels sprouts), are in the genus Brassica. A few others are in other genera in the family, the most notable being Radish (Raphanus), Daikon (Raphanus), Arugula (Eruca), Horseradish (Armoracia), White mustard (Sinapis), Garden cress (Lepidium) and Wasabi (Wasabia). Because most edible members of the family Brassicaceae are in the genus Brassica, searching for that genus works well for most cruciferous vegetables. But without including the MeSH terms for all of the other edible genera in the family, there is no easy way to do a comprehensive search for them as a group.
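One way to picture the genus-by-genus search is as a single OR string built from the genera named above. This is a minimal sketch, not a tested hedge: it assumes that each genus name is available as its own MeSH heading, which should be checked in the MeSH browser before relying on it.

```python
# Genus names come from the paragraph above. Whether every one of them exists
# as its own MeSH heading is an assumption; verify each in the MeSH browser.
CRUCIFEROUS_GENERA = [
    "Brassica",
    "Raphanus",
    "Eruca",
    "Armoracia",
    "Sinapis",
    "Lepidium",
    "Wasabia",
]


def genus_search(genera):
    """OR together one MeSH-tagged term per genus."""
    return " OR ".join(f"{genus}[mh]" for genus in genera)


def combine_with_subject(subject, genera=CRUCIFEROUS_GENERA):
    """AND a subject search with the OR-ed list of genera."""
    return f"({subject}) AND ({genus_search(genera)})"


if __name__ == "__main__":
    print(genus_search(CRUCIFEROUS_GENERA))
    print(combine_with_subject("Neoplasms[mh]"))
```

Keeping the genera in a list makes it easy to drop one that turns out not to be a MeSH heading, or to swap it for a title/abstract search.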

With edible species in several genera in the Brassicaceae family, it might seem like a way to include all of them would be to search for the family name, since it’s an explosion that contains all of the genera in the family. We have seen this done by MeSH indexers in some cases, but it has problems. For one thing, the family is very large, containing 372 genera, so searching for the family name can retrieve many inappropriate citations. This is especially a problem because one of the genera in the family is Arabidopsis, a very commonly used research subject in plant genetics, having nothing at all to do with nutrition. Arabidopsis is something like the Drosophila of the plant world. So of course searching for the exploded MeSH term Brassicaceae gets a flood of articles on Arabidopsis; approximately 80% of all articles retrieved from this search are indexed to the narrower term Arabidopsis.
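To see how much of the family search is Arabidopsis, one could subtract that genus from the exploded family term. The sketch below only builds the query string; it is an illustration rather than a recommended strategy, because NOT also drops any article indexed to both Arabidopsis and a food plant.

```python
def family_without(family, excluded):
    """Exploded family term minus the listed narrower MeSH terms."""
    exclusions = " OR ".join(f"{term}[mh]" for term in excluded)
    return f"{family}[mh] NOT ({exclusions})"


if __name__ == "__main__":
    # If roughly 80% of the family search is Arabidopsis, as noted above,
    # about one article in five would remain after the subtraction.
    print(family_without("Brassicaceae", ["Arabidopsis"]))
```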

We found another problem in how cruciferous vegetables are treated in PubMed indexing when we looked at a sample of 30 articles with “cruciferous” in the title. Twenty-eight of the 30 articles actually had the phrase “cruciferous vegetables” in the title, but in about one-third of the 30 articles, there was no indexing term at all correlated with the word “cruciferous”; the only indexing term used was “vegetables.” Another problem we found in this sample is that, of the articles that had an indexing term correlated with “cruciferous,” the term that was usually used was the family name, Brassicaceae, which retrieves many non-food-related citations, as discussed above.

Suggestions for improving indexing of cruciferous foods

Because there is currently no way to search for cruciferous foods as a group, we would suggest that NLM should add a new MeSH term <cruciferous foods>. This would not only put all of these foods under one term, it would also provide a term to use for articles that use the term itself in the title or abstract.

Image at top of article is from Wikipedia.

The meaning of cruciferous

cruciferousflowers4

To modern ears, “cruciferous” is all about vegetables. But the word’s rich history shows that it was formerly used in a much broader sense.

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

In a Google search for the word “cruciferous,” 9 out of the top 10 retrievals contain the phrase “cruciferous vegetables.” This certainly does fit the predominant modern usage of the word. As a game show host might say, “what do you think of when I say ‘cruciferous’?” Well, of course, “vegetables”! But it hasn’t always been this way. As the Google Ngram chart below shows, the phrase “cruciferous vegetables” only came into prominent use about 1980. Before that, the word “cruciferous” was widely used in other contexts.

ngramcruciferousveg1110

To understand the real meaning of the word, it’s important to understand what these other contexts are. This is important for more than simply historical reasons: knowing what the word means clarifies its connections to nutrition, and it helps in searching for the subject in databases such as PubMed (as we’ve discussed in a companion article). Additionally, it’s a surprisingly interesting story.

The key to understanding “cruciferous” is a knowledge of its rich botanical history. The chart below gives a hint of this. The chart is a comparison of the use of the words Cruciferae and Brassicaceae. These are names that have been assigned to the plant family that contains “cruciferous vegetables” and many other plants as well. As the chart indicates, Cruciferae was the name of the family until the early 20th century, when it was officially changed to Brassicaceae. Since then, botanists have gradually been switching the word they use, with continuing widespread use of the older name Cruciferae.

ngramcruciferaebrassicaceae1111

This seemingly arcane naming distinction is important because when the family name was Cruciferae, the word cruciferous was used to include all plants in the family, not just the edible species that we call “cruciferous vegetables,” which helps explain the common use of the word in the chart above. (Examples from Google Books of the broad botanical use of the word “cruciferous” in the 19th century are here, here, and here.) The significance of this is magnified by the fact that the family is very large, containing 372 genera and 4060 species, making it one of the largest flowering plant families. This and other details of the family are well-covered by a Wikipedia article on it. Another detail that gives an idea of the size and variety of the family and helps explain the widespread use of “cruciferous” is that, in addition to cruciferous vegetables, it also includes decorative flowers, weeds, and Arabidopsis thaliana (“a very important model organism in the study of the flowering plants”).

Relating to the image of three flowers at the top of the article, another help in understanding “cruciferous” is the etymology of the word itself. The word comes from the word “cross,” because the 4-petaled flowers have the appearance of a cross. The flowers in the image above are (from left) Raphanus sativus (Wild radish weed), Brassica oleracea (Broccoli) and Arabidopsis thaliana. (Images are from Wikipedia).

In conclusion, the word “cruciferous” is confusing because it has its origins in a time when the large family Brassicaceae was called Cruciferae, which meant that all of the plants in the family (most of which are not edible) were referred to as “cruciferous.” In the last generation, as botanists have switched to calling the family Brassicaceae instead of Cruciferae, and as people have become more aware of nutrition, the word “cruciferous” has gradually come to be used most commonly in the context of “cruciferous vegetables.” As we’ll discuss in a companion article, it’s important to know about this history and the taxonomic relationships of cruciferous vegetables in order to do successful searches in research databases like PubMed.

Is Chocolate A Food? A Problem In PubMed – 2016 Update

chocolatebgfffcrop3

Because most plant-based foods are under “plants” instead of “food” in PubMed, articles on chocolate may not be retrieved in a search for Food.

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

As we’ve written, searching for plant-based foods (PBFs) in PubMed is difficult because of the way the MeSH system is organized. This is especially so because, in most cases, PBFs are treated as plants rather than food.

One result of treating plant-based foods as plants is that the MeSH term is usually the botanical plant name; in the case of chocolate it’s Cacao. This is usually not a serious problem when searching for specific substances because the common food name maps to the botanical MeSH term.

A more serious consequence of treating plant-based foods as plants instead of foods is that they are usually not in any food-related explosion, but only in the Plants explosion. So the only occurrence of chocolate (Cacao) is here:

Plants
   Angiosperms
      Sterculiaceae
         Cacao

This is a problem because articles on chocolate/Cacao will not be retrieved in a search for Food. So, for example, if you do a general search for food-related causes of migraine, you will not retrieve this article:

Chocolate is a migraine-provoking agent
Journal: Cephalalgia
http://www.ncbi.nlm.nih.gov/pubmed/1860135

This is not retrieved by searching for food because Cacao is not in the Food explosion. More broadly, however, it’s also not retrieved by the Diet, Food, and Nutrition explosion.

Here are several other articles on health and medical aspects of chocolate that are not retrieved by the broad Diet, Food, and Nutrition explosion:

If chocolate were the only case of a plant-based food that is not retrieved in a broad PubMed search for food-related topics, it would be a trivial matter. But that’s far from being the case. There are many plant-based foods that have the same problem in PubMed. We have written about a few of these.

[Image above is licensed under Creative Commons, from WikiMedia]

Plant-Based Foods – A Tricky PubMed Search – Revised 2016

HardinLib0902

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

As we discussed in an article earlier this year, searching for nutrition in PubMed has improved greatly since NLM brought the subject together in one explosion (Diet, Food, and Nutrition). This ability to search the field of nutrition easily has helped in searching for plant-based foods [PBFs] in some ways. But in other ways, it’s still as difficult as it was when we wrote our 2013 article on the same topic.

The basic problem in searching for PBFs, just as it was before the addition of the new explosion, is that a large proportion of PBFs are not in the Food explosion, but only in Plants. So the fact that Food is part of the new inclusive explosion doesn’t make it any easier to search for PBFs.

In addition to the fact that most fruits and vegetables are treated as plants instead of foods, another problem in searching for them is that almost all of them are put under their botanical, Latin names, which are not recognizable to most people. Here are some examples, all of which are in the plant-taxonomic branch of the MeSH tree:

  • Kale is Brassica
  • Sweet potato is Ipomoea batatas
  • Plum is Prunus domestica
  • Almond is Prunus dulcis
  • Apple is Malus
  • Cranberry is Vaccinium macrocarpon
  • Strawberry is Fragaria
  • Kidney Beans is Phaseolus
  • Chocolate is Cacao
  • Turmeric is Curcuma

If you’re searching for specific food plants, the Latin botanical MeSH terms are usually not a problem, because when you search for a common name, it’s mapped to the botanical MeSH term (e.g. if you search for Grapes, it maps to Vitis). The problem comes if you want to browse the Plants explosion to pick out the edible plants from the many plants that are not edible, because only the botanical names are listed. The Rose family (Rosaceae) of plants, for example, has several edible species within it. There are 19 genera listed in MeSH in the family, and 6 of them have edible species. But to find them, you have to be able to pick out the genera with edible species (e.g. Malus, Prunus) from the others (e.g. Agrimonia, Alchemilla).
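The browsing problem can be eased by keeping a small lookup table of your own. The sketch below uses only the common-name and MeSH-term pairs listed above (plus Grapes and Vitis) and groups the foods by genus, so that edible genera such as Prunus or Malus are easy to spot in the Plants tree. The table is an illustration, not a complete list.

```python
# Common food names and their botanical MeSH terms, as listed above.
COMMON_TO_MESH = {
    "kale": "Brassica",
    "sweet potato": "Ipomoea batatas",
    "plum": "Prunus domestica",
    "almond": "Prunus dulcis",
    "apple": "Malus",
    "cranberry": "Vaccinium macrocarpon",
    "strawberry": "Fragaria",
    "kidney beans": "Phaseolus",
    "chocolate": "Cacao",
    "turmeric": "Curcuma",
    "grapes": "Vitis",
}


def foods_by_genus(mapping):
    """Group common food names under the first word (genus) of their MeSH term."""
    grouped = {}
    for food, mesh_term in mapping.items():
        genus = mesh_term.split()[0]
        grouped.setdefault(genus, []).append(food)
    return grouped


if __name__ == "__main__":
    for genus, foods in sorted(foods_by_genus(COMMON_TO_MESH).items()):
        print(f"{genus}: {', '.join(foods)}")
```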

If you’re interested in learning about how to search for PBFs in PubMed, see our companion article, which includes an updated “search recipe,” or hedge.

[Image above is public domain, from WikiMedia]

US Lags Behind The World In Plant-Based Food Research

hardinlibposterpbfcountrieslead700x386

Many other countries spend a much larger proportion of their research time and resources on plant-based foods than the United States does.

By Xiaomei Gu, Eric Rumsey and Janna Lawrence

In our explorations of plant-based foods (PBFs) in PubMed, it’s often striking that there are many excellent articles from non-US countries. So we did a survey in PubMed to measure different countries’ authorship of articles on PBFs, and we found that, indeed, several countries have a much higher proportion of their total articles on PBFs than the US.

The chart above shows our data for all PBFs and the charts below show four specific foods or food groups. The charts are based on the percentage of articles from each country, not the total number of articles. So even though the total number of articles on PBFs by US authors may be higher than other countries, the proportion of articles on PBFs is substantially lower. [The charts are from a poster presented at MLA in 2016. For more details on our survey methods, see the poster. The PubMed search strategy used to find plant-based foods in the chart above is described here.]


hardinlibPosterPBFcountriesCabbage

hardinlibPosterPBFcountriesNuts

hardinlibPosterPBFcountriesFruit

hardinlibPosterPBFcountriesSpices

Nuts as a Healthy Food: How to Search in PubMed

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

This article is based on a poster presented at the Medical Library Association annual meeting, Toronto, May 2016.

Introduction

Searching for nuts as food is difficult. As with most plant-based foods, MeSH terms for specific types of nuts are in the Plants explosion instead of in the Food explosion. Nuts are especially tricky because the MeSH term Nuts is not an explosion, and most articles on specific types of nuts are not indexed to the term Nuts. So it’s necessary to search for specific nuts to retrieve articles on them.

A caveat: As with nutrition topics in general, and plant-based foods in particular, searching in PubMed is complicated, largely because many plant-based substances are used as foods and also as medicines or experimental organisms. A list of articles on specific nut types is likely to contain some articles that are not food-related.

Searching for Nut Types

The general idea of searching for specific types of nuts is simple: Do an OR search that includes the common name and botanical name. In most cases, articles on a specific nut type will be indexed under the botanical name, but using the common name is always a good idea. See the example below for searching walnuts.

walnut [tiab] OR walnuts [tiab] OR juglans [MeSH]

It is necessary to restrict the search for the common names to the title and abstract [tiab] fields because there are many streets in the Address field that are named after nuts (e.g., 975 Walnut St.).
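The same pattern can be applied to any nut type once you know its botanical MeSH term. Here is a small sketch that builds the search string in the format shown above; the function name is hypothetical, and only the walnut and peanut pairs from this article are filled in.

```python
# Common name (singular) -> (plural, botanical MeSH term).
# Only pairs that appear in this article are included here.
NUTS = {
    "walnut": ("walnuts", "juglans"),
    "peanut": ("peanuts", "arachis"),
}


def nut_query(singular, plural, mesh_term):
    """Common names limited to title/abstract, OR-ed with the botanical MeSH term."""
    return f"{singular} [tiab] OR {plural} [tiab] OR {mesh_term} [MeSH]"


if __name__ == "__main__":
    for singular, (plural, mesh_term) in NUTS.items():
        print(nut_query(singular, plural, mesh_term))
```

Adding another nut is a matter of looking up its botanical MeSH term and adding one line to the table.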

hardinlibPosterNutsAlmonds
Image 1. Almonds

 

hardinlibPosterNutsWalnuts
Image 2. Walnuts

 

hardinlibPosterNutsHazelnuts
Image 3. Hazelnuts

 

hardinlibPosterNutsCashews
Image 4. Cashews

 

hardinlibPosterNutsPecans
Image 5. Pecans

 

hardinlibPosterNutsBrazils
Image 6. Brazil Nuts

 

Peanuts Are Different!

hardinlibPosterNutsPeanuts
Image 7. Peanuts

Peanuts are a special case. Unlike the other nuts here, they grow on herbaceous plants instead of on trees and, as members of the bean family, they are nutritionally more closely related to beans than to other nuts. There is also a separate MeSH term, Peanut Hypersensitivity, dealing with peanut allergies.

Because peanuts are commonly used as experimental plants, many of the articles about them are not related to nutrition. To focus on nutritional aspects, we suggest incorporating the Diet, Food, and Nutrition explosion into the search:

(peanut [tiab] OR peanuts [tiab] OR arachis [mh]) AND Diet, Food, and Nutrition [mh]

Citations: 2932
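A citation count like the one above can also be requested from NCBI’s E-utilities rather than read off the results page. The sketch below only builds the search string and the ESearch URL (using the documented db, term and rettype=count parameters); it makes no network call, and a count fetched today will differ from the 2016 figure. The helper names are hypothetical.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
DFN = "Diet, Food, and Nutrition [mh]"
PEANUTS = "peanut [tiab] OR peanuts [tiab] OR arachis [mh]"


def restrict_to_nutrition(query):
    """AND a search with the Diet, Food, and Nutrition explosion."""
    return f"({query}) AND {DFN}"


def count_url(query):
    """ESearch URL that asks PubMed only for the number of matching citations."""
    params = {"db": "pubmed", "term": query, "rettype": "count"}
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    search = restrict_to_nutrition(PEANUTS)
    print(search)
    print(count_url(search))
```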

Great but Hidden: PubMed’s New Diet, Food, and Nutrition Explosion

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

As we have written, NLM made a great improvement in introducing the new explosion for Diet, Food, and Nutrition. Before that was introduced this year, each of the three elements in the explosion had to be searched separately because they were not in the same explosion. So we strongly endorse the new explosion!

Although it is less difficult to search for nutrition articles now, there is still a problem. The term mapping feature of PubMed does not work for the word “nutrition,” which is certainly a common way to search for the broad subject covered by the new explosion.

Hardin0511b2-300x247
Image 1. Search Details for Heart Attack.

In most cases in PubMed, if the user searches a word or term that’s synonymous with the appropriate MeSH term, the system will automatically include the MeSH term in the search. An example of this is searching for the term heart attack (without quotes).

As shown in Image 1, PubMed automatically maps the word to the appropriate MeSH term – “myocardial infarction” [MeSH Terms]. (To see Search details, scroll down to the “Search details” box in the right sidebar of the PubMed search results screen.) Another example of PubMed’s automatic mapping is cancer, which maps to neoplasms [MeSH Terms].

Hardin0511c2-300x279
Image 2. Search Details for Nutrition.

A search for the word nutrition retrieves several phrases and MeSH terms, seen in the Search details in Image 2. But this does not include the new explosion Diet, Food, and Nutrition, which would retrieve many more relevant articles than these terms do.

There are two conceivable ways that NLM could address this problem. The first is to change the name of the explosion to Nutrition, so that mapping the word would not be necessary. The second solution is to make the word nutrition map to the new explosion. We encourage NLM to consider these options, so that the full power of the new explosion can be released!

Plant-Based Foods – An Inclusive PubMed Search – Revised 2016

Hardin0318c

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

Searching for nutrition topics in PubMed is tricky. It’s especially difficult to search for plant-based foods (PBFs). In 2014, we published an article that addressed this problem and included a hedge for searching for PBFs. A few months ago, the National Library of Medicine introduced a new explosion that makes it a lot easier to search for nutrition topics in PubMed, which we discussed here.

In this article we are revising our 2014 PBF hedge to incorporate the new PubMed explosion. While there may still be a few occasions when our previous hedge for PBFs would be appropriate, in almost all cases we do not recommend using the old hedge. Instead, we recommend using the hedge below:

((Plants[mesh] OR Plant Preparations[mesh]) AND Diet, Food, and Nutrition [mesh]) OR (Vegetables[mesh] OR Fruit[mesh] OR Plants, Edible [mesh] OR Dietary Fiber[mesh] OR Flour[mesh] OR Bread[mesh] OR Diet, Vegetarian[mesh] OR Nuts[mesh] OR Condiments[mesh] OR Vegetable Proteins[mesh] OR Tea[mesh] OR Coffee[mesh] OR Wine[mesh] OR Vegetable Products[mesh])

The big change in this PBF hedge from the previous PBF hedge is, of course, replacing the food-diet-nutrition hedge that was part of the previous PBF hedge with the new Diet, Food, and Nutrition explosion. Otherwise, the main change is the addition of the new MeSH term Vegetable Products.
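Because the hedge is long, it can help to keep its two parts as lists and assemble the string when needed. The sketch below rebuilds the hedge above from its terms (spacing before the field tag is normalized) and combines it with a subject; the function names are hypothetical.

```python
DFN = "Diet, Food, and Nutrition[mesh]"

# Part 1: plant terms, which count only when a nutrition term is also present.
PLANT_TERMS = ["Plants", "Plant Preparations"]

# Part 2: food terms that are plant-based in their own right.
FOOD_TERMS = [
    "Vegetables", "Fruit", "Plants, Edible", "Dietary Fiber", "Flour",
    "Bread", "Diet, Vegetarian", "Nuts", "Condiments", "Vegetable Proteins",
    "Tea", "Coffee", "Wine", "Vegetable Products",
]


def any_of(terms):
    """OR together MeSH-tagged terms."""
    return " OR ".join(f"{term}[mesh]" for term in terms)


def pbf_hedge():
    """The plant-based foods hedge, assembled from its two parts."""
    return f"(({any_of(PLANT_TERMS)}) AND {DFN}) OR ({any_of(FOOD_TERMS)})"


def with_subject(subject):
    """AND a subject search with the whole hedge."""
    return f"({subject}) AND ({pbf_hedge()})"


if __name__ == "__main__":
    print(with_subject("Migraine Disorders[mesh]"))
```

Wrapping the hedge in its own parentheses matters when combining it with a subject, since the hedge mixes AND and OR.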

Advanced techniques for plant-based food searching

In most cases, you’ll probably want to start with the hedge above. But if that doesn’t find what you want, try the ideas below.

If you combine your subject (a particular disease for example) with the hedge above and it misses articles that you think exist, consider broadening the search by combining your subject with Plants. The Plants explosion in MeSH is very large, containing hundreds of plant species. It’s organized by taxonomic relationships, which makes it hard for a non-botanist to browse. But it’s useful to combine with other subjects in searching, because it’s so comprehensive. The main drawback in searching it is that in addition to plant-based foods, it also has many plant-based drugs, which you’ll have to sift out from the food articles.

If you combine your subject with the hedge above and it misses articles on particular plant-based foods, search specifically for those. If you do a search for food and migraine, for example, and your search doesn’t pick up specific foods that you know have been associated with migraine (e.g. chocolate), combine those foods specifically with migraine.
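The two fallbacks just described can be sketched as small helpers: one that combines a subject with the whole Plants explosion, and one that combines it with specific foods you already know about. The function names are hypothetical, and the chocolate example simply pairs the common name with the MeSH term Cacao.

```python
def with_plants(subject):
    """Broadest fallback: subject AND the entire Plants explosion."""
    return f"({subject}) AND Plants[mesh]"


def with_specific_foods(subject, foods):
    """Targeted fallback: subject AND any of the named foods."""
    food_clause = " OR ".join(foods)
    return f"({subject}) AND ({food_clause})"


if __name__ == "__main__":
    print(with_plants("migraine"))
    print(with_specific_foods("migraine", ["chocolate", "Cacao[mesh]"]))
```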

A caveat: There is an exploded MeSH term Plants, Edible, which might seem to be a good place to search for plant-based foods. Unfortunately, however, it’s of limited usefulness. The explosion contains only grain plants and Vegetables, which is mainly soy-based foods. The term Plants, Edible itself is mostly used to index articles that are on the general concept rather than articles on specific types of edible plants.

If you’re interested in reading some background explaining why it’s so difficult to search for PBFs in PubMed, see our companion article.

Diet, Food, and Nutrition – How To Search in PubMed

hardin0318a

By Eric Rumsey, Janna Lawrence and Xiaomei Gu

Food, diet and nutrition has long been one of the most difficult subjects to search in PubMed. We were happy to report earlier this year that the National Library of Medicine has gone a long way toward fixing this problem, with a new explosion for Diet, Food, and Nutrition.

Before the new explosion came out, searching for the subject was very tricky because diet, food and nutrition were all in different places in the MeSH tree structure, and so they had to be searched separately. To help with this problem, we created a detailed search strategy, or hedge, that would bring together all of the components in one search. We no longer recommend using this hedge. We have examined the new explosion, and find that it covers the field very adequately.

We strongly recommend the new explosion for most nutrition searches. But there are some aspects of the field that are not covered in the new explosion that were part of our hedge, in particular obesity and vitamins. Both of these terms are closely connected to the subject of food, diet & nutrition. But we understand why NLM has not included them in the new explosion, since they will not always be wanted. Both of these subjects are somewhat complicated to search in PubMed. In both cases, however, a simple one-word text word search will retrieve almost all of the relevant citations. So in cases when you want to include these subjects in your nutrition searching, you can do these searches:

Diet, Food, and Nutrition [mh] OR vitamins
Diet, Food, and Nutrition [mh] OR obesity
Diet, Food, and Nutrition [mh] OR vitamins OR obesity
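If you build searches in a script, the three variants above reduce to one helper that appends whichever extra text words you want. This is a minimal sketch with hypothetical function names; note the added parentheses when the result is combined with another subject.

```python
DFN = "Diet, Food, and Nutrition [mh]"


def nutrition_query(*extras):
    """The explosion, optionally OR-ed with extra text words such as vitamins."""
    return " OR ".join([DFN, *extras])


def with_subject(subject, *extras):
    """AND a subject with the nutrition search, parenthesized so the ORs stay grouped."""
    return f"({subject}) AND ({nutrition_query(*extras)})"


if __name__ == "__main__":
    print(nutrition_query("vitamins"))
    print(nutrition_query("obesity"))
    print(nutrition_query("vitamins", "obesity"))
    print(with_subject("asthma", "vitamins"))
```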

“Good job, NLM!”

In conclusion, good words for the National Library of Medicine – Thank you for fixing the long-standing problem in searching for nutrition! With the surging interest in the subject, you’ve made things a lot easier for the many people searching for it.