Earlier reports from the Davuluri laboratory showed that nearly 40 percent of genes use alternative promoters to create protein isoforms. According to Davuluri, integrating this information with data from other studies would surely uncover significantly more of these alternative promoters.
"Much of the genetic variations occur outside protein coding regions, such as gene regulatory regions," Davuluri said. "MPromDb provides context for data in the form of gene promoter annotations that can tell you where and when our bodies make a particular protein variant."
MPromDb mines its information from huge databases maintained by national and international consortiums of researchers, such as the Gene Expression Omnibus (GEO), maintained by the National Center for Biotechnology Information, and the ENCyclopedia of DNA Elements (ENCODE), run by the National Human Genome Research Institute. Essentially, MPromDb looks for key DNA sequences that could be potential binding sites for RNA polymerase II, the enzyme that creates the RNA transcript that the cell later translates into protein. The current database contains information on more than 42,000 human promoters found in six different cell types and more than 48,000 mouse gene promoters found in 10 different cell types.
"In fact, scientists are so good at generating this sort of information using next generation sequencing methods, that they collect information far in excess of what they might need for a given experiment or project," Davuluri said. "This information all ends up in places like GEO, waiting to be discovered by groups like ours."
According to Davuluri, the Wistar Center for Systems and Computational Biology plans to expand MPromDb
Contact: Greg Lester
The Wistar Institute