Cold Spring Harbor, NY -- A rose by any other name would smell as sweet, but it might confound scientists interested in understanding the chemical components of its fragrance or discovering where its ancestors grew in the wild.
That's because in biology, an organism's scientific (taxonomic) name is the key to finding information about it. This data -- on the genetic, ecological, and agricultural particulars of every known plant -- is held in repositories scattered all over the globe, at places as diverse as university labs, museums, and private-sector corporations. Some of the information is hidden within spreadsheets stored on the computers of individual plant scientists.
There is, in other words, lots of room for confusion resulting from multiple listings (under different names) of the same species.
Enter a web-based resource: the Taxonomic Name Resolution Service, or TNRS (tnrs.iplantcollaborative.org). Today, the third and most complete version of TNRS to date went live on the Web, the work of computer scientists, botanists, and biologists participating in a National Science Foundation (NSF)-funded project called the iPlant Collaborative, in conjunction with the Missouri Botanical Garden and the Botanical Information and Ecology Network (BIEN).
It turns out that up to 30% of the names in major biological databases are incorrect in some way, according to TNRS scientists. Error rates that high greatly reduce confidence in the accuracy of science and limit the ability of the public and business to discover and utilize information about plants.
iPlant -- a virtual collaborative co-led by Doreen Ware, Ph.D., a scientist with the U.S. Department of Agriculture's Agricultural Research Service and an Adjunct Associate Professor at Cold Spring Harbor Laboratory (CSHL) -- has made great progress in solving the problem. The latest version of TNRS resolves plant taxonomic names, often submitted as lists containing thousands of names, by passing them through three steps: exact matching, parsing to break names into their component parts, and "fuzzy matching" to search for near matches.
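The three steps described above (exact matching, parsing, and fuzzy matching) can be illustrated with a minimal Python sketch. This is not the TNRS implementation: the reference list, the function names `parse_name` and `resolve`, and the similarity cutoff are all assumptions made for illustration, and the fuzzy step uses Python's standard `difflib` module rather than whatever algorithm TNRS itself employs.

```python
import difflib

# Hypothetical reference list; a real service would consult taxonomic databases.
REFERENCE_NAMES = [
    "Solanum lycopersicum",
    "Rosa canina",
    "Zea mays",
]


def parse_name(raw):
    """Split a raw name into genus, specific epithet, and trailing text (e.g. an author)."""
    parts = raw.strip().split()
    genus = parts[0].capitalize() if parts else ""
    epithet = parts[1].lower() if len(parts) > 1 else ""
    rest = " ".join(parts[2:])
    return genus, epithet, rest


def resolve(raw, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return (matched_name, method), trying exact, then parsed, then fuzzy matching."""
    # Step 1: exact match after collapsing extra whitespace.
    cleaned = " ".join(raw.split())
    if cleaned in reference:
        return cleaned, "exact"

    # Step 2: parse into components and retry with just genus + epithet.
    genus, epithet, _ = parse_name(cleaned)
    candidate = f"{genus} {epithet}".strip()
    if candidate in reference:
        return candidate, "parsed"

    # Step 3: fuzzy match to catch misspellings (cutoff is an assumed value).
    close = difflib.get_close_matches(candidate, reference, n=1, cutoff=cutoff)
    if close:
        return close[0], "fuzzy"

    return None, "unmatched"


if __name__ == "__main__":
    for name in ["Solanum lycopersicum", "solanum lycopersicum L.",
                 "Solanum lycopersicon", "Quercus alba"]:
        print(name, "->", resolve(name))
```

Running the cheap exact check first and reserving fuzzy matching for the names that remain is one plausible way to keep lists of thousands of names manageable.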
Key work on the TNRS was performed by Zhenyuan Lu and Sheldon McKay of CSHL, as well as by Brian Enquist and Brad Boyle from BIEN and the University of Arizona and Bill Piel from Yale's Peabody Museum. "TNRS is a critical tool to help plant scientists integrate data from diverse sources in virtually every field of plant research," says Lu.
Beginning with Linnaeus
In 1753, Carl Linnaeus, a Swedish botanist and zoologist, published Species Plantarum, which introduced a Latin-based naming system to the world and laid foundations for how subsequent scientists made sense of the immense diversity of life on earth. The Linnaean system is still in operation today, the basis for communication among ecologists studying tropical diversity, crop scientists searching for means of optimizing yields, and so-called systematists who strive to chart the Tree of Life.
Yet for many types of research the first crucial step is to resolve any differences among the taxonomic names of the plants being studied. Consider the tomato, a plant with a troubled taxonomic past. Originally named Solanum lycopersicum by Linnaeus, the tomato was soon transferred to the genus Lycopersicon, and long referred to as both Lycopersicon lycopersicum and Lycopersicon esculentum. Recent DNA research has shown that the tomato indeed belongs in Solanum, meaning that Linnaeus' original name must be restored. Anyone conducting research on the tomato must search for all three names to access the complete data and previous research associated with this economically important species.
The most important feature of TNRS version 3.0 is the ability to hierarchically resolve names against multiple taxonomic sources. Four taxonomic sources are now available: Tropicos, the National Center for Biotechnology Information's (NCBI) Taxonomy Database, the United States Department of Agriculture's (USDA) Plants Database, and the Global Compositae Checklist.
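One way to picture hierarchical resolution is as a ranked lookup: sources are consulted in order of preference, and the first one that recognizes a name supplies the accepted name. The Python sketch below assumes that interpretation. The function `resolve_hierarchically` is hypothetical, and the contents assigned to each source are invented for illustration (only the tomato synonyms themselves come from the text above); they do not reflect what the real databases contain.

```python
# Invented excerpts for illustration only: which source lists which synonym
# is an assumption, not a statement about the real databases.
SOURCES = {
    "Tropicos": {
        "Solanum lycopersicum": "Solanum lycopersicum",
        "Lycopersicon esculentum": "Solanum lycopersicum",
    },
    "USDA": {
        "Lycopersicon lycopersicum": "Solanum lycopersicum",
    },
}


def resolve_hierarchically(name, ranked_sources, sources=SOURCES):
    """Return (accepted_name, source) from the highest-ranked source that knows the name."""
    for source in ranked_sources:
        accepted = sources.get(source, {}).get(name)
        if accepted is not None:
            return accepted, source
    # No source in the ranking recognizes the name.
    return None, None


if __name__ == "__main__":
    ranking = ["Tropicos", "USDA"]
    for name in ["Lycopersicon esculentum", "Lycopersicon lycopersicum", "Rosa unknownensis"]:
        print(name, "->", resolve_hierarchically(name, ranking))
```

Under this reading, reordering `ranking` changes which source takes precedence when more than one recognizes the same name.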
With the addition of the new taxonomic name sources, the TNRS has expanded the geographic range of plant species names it can resolve far beyond the Americas. The plant species available for comparison will continue to grow as the botany community contributes additional sources of names.
Members of the botany community are invited to contact iPlant about contributing their taxonomic sources to the TNRS. The TNRS source code has been released with an open source license and developers are encouraged to expand it to resolve taxonomic names of other groups of organisms.
Contact: Peter Tarr
Cold Spring Harbor Laboratory