The study appears this week in the online edition of the Proceedings of the National Academy of Sciences.
Their work relies on established techniques of phylogenetic analysis, developed in the past decade to plot the evolution of genes and organisms, that have never before been used to work out the evolutionary history of protein architecture across biological networks.
"We are interested in how structure evolves, not how organisms evolve," said professor of crop sciences Gustavo Caetano-Anollés, principal researcher on the study, which was co-written by graduate student Hee Shin Kim and emeritus professor of cell and developmental biology Jay E. Mittenthal. "We are using the techniques of phylogenetic analysis that systematicists used to build the tree of life, and we are applying it to a biochemical problem, a systems biology problem."
To get at the roots of protein evolution, the researchers examined metabolic proteins at the level of their component structures: easily recognizable folds in the proteins that have known enzymatic activities. These protein domains catalyze a range of functions, breaking down or combining metabolites, small molecules that include the building blocks of all life.
Their findings relied on a fundamental assumption: that the protein folds most widely used across the more than 200 species they examined were also the most ancient.
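As a rough illustration of that assumption, and not the authors' phylogenetic method, the short Python sketch below ranks folds by the share of genomes in which they appear and flags those present in every genome. The fold and genome names are made-up placeholders, and the helper names (`distribution_index`, `rank_folds_by_distribution`, `omnipresent_folds`) are hypothetical, introduced only for this example.

```python
# Toy occurrence table: fold name -> set of genomes in which it appears.
# All names are hypothetical placeholders, not data from the study.
fold_occurrence = {
    "fold_A": {"genome_1", "genome_2", "genome_3", "genome_4"},
    "fold_B": {"genome_1", "genome_2", "genome_3"},
    "fold_C": {"genome_2"},
}

# The full set of genomes surveyed in this toy example.
all_genomes = set().union(*fold_occurrence.values())


def distribution_index(genomes_with_fold, genomes):
    """Fraction of surveyed genomes that contain the fold (0.0 to 1.0)."""
    return len(genomes_with_fold) / len(genomes)


def rank_folds_by_distribution(occurrence, genomes):
    """Order folds from most to least widely distributed.

    Under the assumption in the text, folds nearer the top of this
    list would be treated as candidates for the oldest architectures.
    Ties are broken alphabetically by fold name.
    """
    scores = {
        fold: distribution_index(found_in, genomes)
        for fold, found_in in occurrence.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def omnipresent_folds(occurrence, genomes):
    """Folds found in every surveyed genome."""
    return sorted(
        fold for fold, found_in in occurrence.items() if found_in >= genomes
    )


if __name__ == "__main__":
    for fold, score in rank_folds_by_distribution(fold_occurrence, all_genomes):
        print(f"{fold}: present in {score:.0%} of genomes")
    print("Omnipresent:", omnipresent_folds(fold_occurrence, all_genomes))
```

A simple ranking like this captures only the intuition that wide distribution suggests old age; the study itself built family trees of fold architectures rather than sorting by a single score.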
"Protein architecture has preserved ancient structural designs as fossils of ancient biochemistries," the authors wrote.
The team used data from two international compilations of genetic and proteomic information: the metabolic pathways database of the Kyoto Encyclopedia of Genes and Genomes, and the Structural Classification of Proteins database. They combined these two data sets with phylogenetic reconstructions, or family trees, of protein fold architectures in metabolism. They created a new database, called the Molecular Ancestry Network (MANET: http://www.manet.uiuc.edu/index.php), which links these data sources into a new global network diagram of metabolic pathways.
The researchers added color, representing evolutionary age, to their diagrams of metabolic networks (for an example, see the purine metabolism network in MANET). The result is a multicolored mosaic of protein fold evolution. The mosaic shows that modern metabolic networks, and even individual enzymes, are composed of both very ancient and much more recent protein architectures.
"This mosaic is telling you that the new enzymes and old enzymes are together performing side by side," Caetano-Anollés said. "In some cases in the same protein you have old domains and new domains working together."
This finding supports the hypothesis that protein architectures that perform one function are often recruited to perform new tasks.
The new, global family tree of protein architecture also revealed that many metabolic protein folds are quite ancient: These architectures were found to be common across all the species of bacteria, animals, plants, fungi, protists and archaea the researchers analyzed.
Of 776 metabolic protein folds surveyed, 16 were found to be omnipresent, and nine of those occurred in the earliest branches of the newly constructed tree.
"These nine ancient folds represent architectures of fundamental importance undisputedly encoded in a genetic core that can be traced back to the universal ancestor of the three superkingdoms of life," the authors wrote.
The analysis also found that the most ancient metabolic protein folds are important to RNA metabolism, specifically the interconversion of the purine and pyrimidine nucleotides that compose the core of the RNA molecule.
This discovery supports the hypothesis of an RNA world in which RNA molecules were among the earliest catalysts of life. This idea is based in part on the observation that RNA still retains many of its catalytic capabilities, including the ability to make proteins. Gradually, according to this theory, proteins began taking over some of the original functions of RNA.
"The most ancient (protein) molecules were involved in the interconversion of nucleotides. But they were not synthesizing them," Caetano-Anollés said. "We see that all the enzymes that were involved in purine synthesis, for example, were very recent. Since these first proteins benefited the formation of building blocks for the primitive RNA world, it makes a lot of sense that we've found this origin encased in nucleotide metabolism."