Most species of bacteria remain unstudied in scientific research
A biomedical engineer at the University of Michigan has found that just a fraction of all known bacteria species has ever been the main focus of a scientific research effort and subsequent paper. In his research posted on the bioRxiv preprint server, Paul Jensen describes how he searched for information on bacteria species in the PubMed database and found most bacterial research explores only a few species.
Over the past several decades, Escherichia coli has not only been well studied, but has been used as a tool to study other biological characteristics because it is now so well understood. The same cannot be said for other bacteria, Jensen notes, or even other microorganisms in general.
In the PubMed database, he found that of 43,409 known bacterial species, just 10 accounted for approximately half of all the research papers listed on the site. He also found that approximately 75% of the species he searched for came up empty: no papers focused on them have been published.
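To put those proportions in concrete terms, the short Python sketch below works through the arithmetic. The species total and the 75% figure come from the article; the function names and the toy paper counts are hypothetical and purely illustrative. They are not Jensen's data or his method.

```python
from typing import Mapping

TOTAL_SPECIES = 43_409      # known bacterial species, as reported in the article
UNSTUDIED_FRACTION = 0.75   # "approximately 75%", as reported in the article


def estimated_unstudied(total: int, fraction: float) -> int:
    """Rough count of species with no papers, given the reported fraction."""
    return round(total * fraction)


def top_k_share(paper_counts: Mapping[str, int], k: int) -> float:
    """Share of all papers that belongs to the k most-studied species."""
    total = sum(paper_counts.values())
    if total == 0:
        return 0.0
    top = sorted(paper_counts.values(), reverse=True)[:k]
    return sum(top) / total


# Hypothetical toy counts, NOT real PubMed numbers.
toy_counts = {
    "species_a": 60,
    "species_b": 25,
    "species_c": 10,
    "species_d": 5,
    "species_e": 0,
}

if __name__ == "__main__":
    print(estimated_unstudied(TOTAL_SPECIES, UNSTUDIED_FRACTION))
    print(top_k_share(toy_counts, 2))
```

Because the article gives the 75% figure only approximately, the resulting count of roughly 32,500 unstudied species is an estimate rather than an exact tally.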
Jensen came upon his findings while looking into the possibility of using an LLM to synthesize research on certain microorganisms. More specifically, he was hoping to learn more about Streptococcus sobrinus, a bacterium associated with tooth decay. To his surprise, he found just a few dozen papers, not enough to train an LLM. He also noted that he had already read all of them.
Jensen found that E. coli papers made up 21% of the total number of bacteria-focused papers in the database. PubMed, he notes, provides a good way to analyze research efforts regarding microorganisms because the species under study are listed in abstracts and/or titles. He also suggests that the percentages found on PubMed likely correspond to those that would be found in other repositories, and that microbiologists need to broaden their research focus in the coming years because there is still much to learn about other microorganisms.
More information: Paul A. Jensen, Ten species comprise half of the bacteriology literature, leaving most species unstudied, bioRxiv (2025). DOI: 10.1101/2025.01.04.631297
Journal information: bioRxiv