Search engines like BLAST evaluate the similarity between a pair of sequences by aligning them one beneath the other in such a way that similar amino acids come to lie in the same columns. The sequence similarity is then calculated by adding up the similarity scores of all aligned pairs of amino acids. Here, the similarity between two amino acids is measured by how often they mutate into each other without adverse effects, a measure that largely coincides with how similar their sizes and other biophysical properties are.
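The column-by-column scoring described above can be sketched in a few lines of Python. This is only an illustration of the principle, not BLAST's actual implementation: the score table, the gap penalty, and the function names are hypothetical, and the numbers are made up for the example rather than taken from a real substitution matrix.

```python
# Toy similarity scores for three amino acids (V = valine, I = isoleucine,
# D = aspartate). Illustrative values only, NOT a real substitution matrix.
TOY_SCORES = {
    ("V", "V"): 4, ("I", "I"): 4, ("D", "D"): 6,
    ("V", "I"): 3,    # similar size and chemistry: positive score
    ("V", "D"): -3,   # dissimilar: negative score
    ("I", "D"): -3,
}
GAP_PENALTY = -2  # hypothetical flat cost for a column that contains a gap


def pair_score(a, b):
    """Score one alignment column; the table is symmetric in (a, b)."""
    if a == "-" or b == "-":
        return GAP_PENALTY
    if (a, b) in TOY_SCORES:
        return TOY_SCORES[(a, b)]
    return TOY_SCORES[(b, a)]


def alignment_score(row1, row2):
    """Add up the scores of all columns of two already aligned sequences."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    return sum(pair_score(a, b) for a, b in zip(row1, row2))


if __name__ == "__main__":
    print(alignment_score("VID", "IVD"))
    print(alignment_score("V-D", "VID"))
```

Note that every column is scored from the two aligned amino acids alone; nothing in `pair_score` looks at the neighboring positions.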
BLAST has been the most important method for sequence searching since its development in 1990. It is run around 500,000 times a day by users all over the world. Yet this tried-and-true program is far from perfect: when evaluating the similarity of two amino acids, it ignores their neighboring amino acids, that is, their sequence context. Johannes Soeding and Andreas Biegert of the Gene Center Munich and the cluster of excellence "Center for Integrated Protein Science Munich (CIPSM)" of LMU Munich have now developed a method that significantly improves similarity searches: their "context-specific" BLAST, or CS-BLAST, can sniff out twice as many distant "relatives" of proteins as BLAST.
When scoring the similarity of an amino acid to the reference sequence, CS-BLAST also takes into account that amino acid's sequence context, namely its six neighbors to the left and its six neighbors to the right. "The idea is that the context says much more about how likely two amino acids are to mutate into each other," explains Soeding, who heads the group for "Protein Bioinformatics and Computational Biology" at the Gene Center Munich. "Take as an example folded and unfolded regions in proteins. In an unfolded region, the amino acid valine, for example, can usually mutate into an
Contact: Luise Dirscherl, dirscherl@lmu.de, 49-892-180-2706, Ludwig-Maximilians-Universität München
Source: Eurekalert