Nikos Kyrpides, DOE JGI Genome Biology Program Head, said that, to evaluate the magnitude of this problem and identify the inherent pitfalls, "we have constructed three simulated metagenomic datasets of varying complexity by mixing pieces of over one hundred already sequenced isolate organisms. This approach allowed us to quantify the fidelity of several data processing methods, since we could identify the correct answer by comparing the synthetic datasets to the corresponding isolate genomes."
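The idea of a simulated dataset with a known correct answer can be illustrated with a minimal sketch. It is not the authors' pipeline: the genome strings, the names `simulate_metagenome`, `fidelity`, and `by_containment`, and the read length are all hypothetical stand-ins chosen for illustration. The sketch samples labeled fragments from a few toy "isolate" sequences, then scores a method by how often it recovers each fragment's true source.

```python
import random

def simulate_metagenome(genomes, n_reads, read_len, seed=0):
    """Sample fixed-length fragments from isolate genomes, keeping each source label."""
    rng = random.Random(seed)          # seeded so the toy dataset is reproducible
    names = sorted(genomes)
    reads = []
    for _ in range(n_reads):
        source = rng.choice(names)     # pick an isolate uniformly (an assumption)
        seq = genomes[source]
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append((seq[start:start + read_len], source))
    return reads

def fidelity(reads, predict):
    """Fraction of fragments that `predict` assigns to their true source genome."""
    hits = sum(1 for fragment, source in reads if predict(fragment) == source)
    return hits / len(reads)

# Hypothetical toy isolates; real isolate genomes are millions of bases long.
toy_genomes = {"isolate_A": "ACGT" * 50, "isolate_B": "GGCC" * 50}
reads = simulate_metagenome(toy_genomes, n_reads=200, read_len=20)

def by_containment(fragment):
    """Toy method under test: assign a fragment to the first genome containing it."""
    return next(name for name in sorted(toy_genomes) if fragment in toy_genomes[name])

print(fidelity(reads, by_containment))
```

Because every fragment carries its source label, any processing method can be scored the same way by passing a different `predict` function to `fidelity`.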
"This paper provides an extremely useful survey of tools and existing approaches for metagenome analysis and points out their weaknesses," said Natalia Maltsev, of the Bioinformatics Group, Mathematics and Computer Science Division at Argonne National Laboratory. "The simulated datasets constructed by the authors provide a much-needed test bed for evaluation and comparisons of these tools. Their findings will no doubt have a very significant impact on the field of metagenomics in general. It will help groups like mine to choose efficient strategies for the development of automated methods for high-throughput metagenome analysis. And last, but not least, it will stimulate the development of new computational tools and approaches for studies of microbial communities."
In shotgun sequencing, the DNA from the microbial genomes is first sheared into millions of small fragments to enable amplification, labeling, and ultimately sequencing. Genome assembly is the process of putting the sequenced fragments back in order (in effect, putting Humpty Dumpty back together again) to recreate the identity of an organism from the scattered puzzle pieces of DNA.
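The shear-then-reassemble idea can be sketched on a toy sequence. This is only an illustration under stated assumptions: the 20-base `toy_genome` is invented, fragments are cut at regular offsets rather than randomly, and the textbook greedy overlap merge shown here stands in for, and is much simpler than, the assemblers evaluated in the study. The function names `shear`, `overlap`, and `greedy_assemble` are hypothetical.

```python
def shear(genome, frag_len, step):
    """Cut overlapping fragments of length frag_len, one every `step` bases."""
    return [genome[i:i + frag_len]
            for i in range(0, len(genome) - frag_len + 1, step)]

def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=4):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))   # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break                            # nothing left to join; contigs remain
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags

# Hypothetical toy sequence in which no 4-base word repeats.
toy_genome = "ATGCGTACCTGAAGCTTGCA"
fragments = shear(toy_genome, frag_len=8, step=4)
contigs = greedy_assemble(fragments)
print(contigs)
```

The toy sequence was chosen so that no 4-base word repeats, which is why every overlap is unambiguous. In a mixed community, sequence shared between organisms can be joined wrongly by a merge like this one, producing contigs that combine DNA from different genomes.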
"One of the problems in assembling metagenomes," said Mavrommatis, "is that you end up with large fragments of unknown accuracy and a
Source: DOE/Joint Genome Institute