The solution proposed by Mavrommatis and his colleagues was to evaluate and compare the existing methods to see which perform best for the particular environmental samples being analyzed. "What we did was to take known sample genomes, shuffle them, create simulated metagenomes, and use those tools on them, and then we went back and compared the results to the isolate genomes. Essentially, we applied the gold standard, 'the truth,' and found there were tools that shouldn't have been used because their predictive accuracy was very low. But it also validated some of our assumptions."
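The evaluation idea in that quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the JGI pipeline: the two "genomes" are random strings with different GC content, and `gc_tool` is a deliberately naive, hypothetical classifier standing in for a real analysis tool. The point is the workflow: build reads from genomes whose origin is known, shuffle them into a simulated metagenome, run a tool, and score it against the truth.

```python
import random

def simulate_metagenome(genomes, read_len=50, step=25, seed=0):
    """Cut each known genome into overlapping reads, remember which organism
    each read came from, and shuffle the reads together."""
    reads = []
    for organism, seq in genomes.items():
        for start in range(0, len(seq) - read_len + 1, step):
            reads.append((seq[start:start + read_len], organism))
    random.Random(seed).shuffle(reads)
    return reads

def accuracy(tool, reads):
    """Fraction of reads that the tool assigns to the correct organism."""
    if not reads:
        return 0.0
    correct = sum(1 for seq, truth in reads if tool(seq) == truth)
    return correct / len(reads)

def random_genome(length, gc, rng):
    """Hypothetical stand-in for an isolate genome with a given GC fraction."""
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # A, C, G, T
    return "".join(rng.choices("ACGT", weights=weights, k=length))

def gc_tool(seq):
    """Naive example 'tool': guess the organism from the read's GC content."""
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return "org_high_gc" if gc > 0.5 else "org_low_gc"

rng = random.Random(1)
genomes = {
    "org_low_gc": random_genome(2000, 0.3, rng),
    "org_high_gc": random_genome(2000, 0.7, rng),
}
reads = simulate_metagenome(genomes)
print(f"{len(reads)} simulated reads, tool accuracy = {accuracy(gc_tool, reads):.2f}")
```

Because the true origin of every read is stored alongside it, any candidate tool can be swapped in for `gc_tool` and scored the same way.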
Mavrommatis said that, for example, with the widely used sequence assembly tool Phrap, they saw artifacts that the program created by mixing sequences that should not have been mixed.
"It's like when you're in the market for a digital camera, you can go to web sites like CNET to see the reviews, make the comparisons, and get some guidance for choosing the right product for your particular needs."
Another major problem with metagenomes is binning. Binning is the process of identifying which organism a particular sequence originated from. Several methods are used to bin sequences. BLAST (Basic Local Alignment Search Tool) is a method for rapidly searching existing public databases for similar sequences.
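To make the idea of binning concrete, here is a minimal Python sketch that assigns a read to whichever reference sequence shares the most short subsequences (k-mers) with it. This is a toy stand-in for a real similarity search, not BLAST itself; the reference names, the sequences, and the choice of k = 8 are all invented for illustration.

```python
def kmers(seq, k=8):
    """Set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def bin_read(read, references, k=8):
    """Return the name of the reference sharing the most k-mers with the read,
    or None if the read shares no k-mer with any reference."""
    read_kmers = kmers(read, k)
    best_name, best_score = None, 0
    for name, ref in references.items():
        score = len(read_kmers & kmers(ref, k))
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical reference "database" of two made-up organisms.
references = {
    "organism_A": "ATGGCGTACGTTAGCCTAGGATCCGATTACGCTAGCTAGGCTA",
    "organism_B": "TTGACCGGTTAACCGGTTTAAACCCGGGTTTAAACCGGTTAAC",
}

read = references["organism_A"][5:30]  # a fragment whose origin we know
print(bin_read(read, references))
```

A read that matches nothing in the reference set comes back as `None`, which mirrors a practical limit of similarity-based binning: sequences from organisms with no close relative in the database cannot be placed this way.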
Mavrommatis said that a popular approach is to take the sequence, BLAST it against the database, find the best hit, and assume that the queried sequence belongs to the same group of organisms as that hit. Other methods use intrinsic features of the sequence, such as oligonucleotide frequencies. Patterns of these features help to discriminate between the possible groups of organisms that contrib
Source: DOE/Joint Genome Institute