WALNUT CREEK, Calif. More than a thousand microbial genomes have been sequenced at various sequencing centers in the past 15 years to better understand their roles in tasks ranging from bioenergy to health to environmental cleanup. Conservative estimates suggest roughly 10,000 microbial genomes will be publicly available within the next two years, but genomic standards have not caught up with the technological advances that have made the sequencing process faster and cheaper. As a result, the torrent of DNA sequences being released has varying levels of quality, which impacts researchers' ability to use this information.
To help check the quality of microbial genomic DNA sequences before they are submitted to the federally funded public archive GenBank, the U.S. Department of Energy (DOE) Joint Genome Institute (JGI) has introduced a quality control tool known as the Gene Prediction IMprovement Pipeline, or GenePRIMP. GenePRIMP is described in a paper published online May 2 in Nature Methods and has the potential to become a standard in prokaryotic gene calling, a technique by which the start and end of potential gene coding sequences are identified.
First author Amrita Pati, a software developer in the DOE JGI's Genome Biology Program, noted that GenePRIMP double-checks the gene boundaries, gene annotations and unannotated intergenic regions in genome sequences after the finishing process. She credited colleague Natalia Ivanova with establishing the biological basis of the software tool and helping to refine GenePRIMP. The program, said Pati, identifies gene-calling errors such as potentially incorrect gene start and end positions, large overlaps between genes, fragmented genes and missed genes.
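To make two of these error classes concrete, the following Python sketch flags large overlaps between neighboring gene calls and unusually long unannotated stretches between them, where a missed gene might sit. It is an illustration only and not GenePRIMP's actual method: the `Gene` record, the `flag_anomalies` function and both thresholds (`max_overlap`, `max_gap`) are hypothetical names and values chosen for this example, and strand is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def flag_anomalies(genes, max_overlap=30, max_gap=300):
    """Flag neighboring gene calls that overlap heavily or leave a long gap.

    The thresholds are illustrative assumptions, not values from GenePRIMP.
    Returns a list of (kind, left_name, right_name, size_in_bases) tuples.
    """
    ordered = sorted(genes, key=lambda g: (g.start, g.end))
    flags = []
    for left, right in zip(ordered, ordered[1:]):
        # Bases shared by the two calls (zero or negative if they do not overlap).
        overlap = min(left.end, right.end) - right.start + 1
        # Unannotated bases between the two calls (negative if they overlap).
        gap = right.start - left.end - 1
        if overlap > max_overlap:
            flags.append(("large_overlap", left.name, right.name, overlap))
        elif gap > max_gap:
            flags.append(("long_intergenic", left.name, right.name, gap))
    return flags


example_genes = [
    Gene("A", 1, 900),
    Gene("B", 850, 1500),
    Gene("C", 2200, 3000),
]

for flag in flag_anomalies(example_genes):
    print(flag)
```

With these made-up coordinates, genes A and B share 51 bases and are reported as a large overlap, while the 699 unannotated bases between B and C are reported as a long intergenic region worth a second look.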
Gene-calling errors, noted Pati, can range from two percent to as much as 30 percent of the original genes identified in the genome and are dependent on many factors, such as horizontal gene transfer between species
Contact: David Gilbert
DOE/Joint Genome Institute