The scientists distilled all of this information in a series of steps: discarding parts of the proteins likely to dock via large surfaces, and zooming in on small regions of the remaining molecules that might hold motifs. Then it was up to the computer to scan the sequences for small patterns. The attempt was successful: in the fly data, for example, 26 sets of proteins seemed to be interacting through linear motifs.
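To give a rough feel for what "scanning the sequences for small patterns" can mean, here is a minimal sketch in Python. It is not the researchers' method: the function name `shared_kmers`, its parameters, and the sequence fragments are all invented for illustration. It assumes the filtering steps have already been done, and simply reports short stretches of residues that turn up in several proteins sharing an interaction partner.

```python
from collections import Counter


def shared_kmers(sequences, k=4, min_support=3):
    """Return the k-residue patterns found in at least min_support sequences.

    Hypothetical helper for illustration only; a real motif search would
    also need a statistical test to weed out patterns that recur by chance.
    """
    support = Counter()
    for seq in sequences:
        # A set ensures each sequence counts at most once per pattern.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(kmers)
    return {kmer: n for kmer, n in support.items() if n >= min_support}


# Made-up fragments standing in for proteins that bind the same partner
# (assumed data, not taken from the fly experiments).
partners = [
    "MKTAPPLPGSW",
    "GGPPLPRRA",
    "QWEPPLPTTY",
    "AAAAGGGSS",
]

hits = shared_kmers(partners, k=4, min_support=3)
print(hits)  # {'PPLP': 3}
```

In this toy case only one four-residue pattern is shared by three of the four fragments. With real data, patterns this short recur by chance all the time, which is why eliminating false leads is the hard part.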
"One challenge was to eliminate red herrings, which crop up everywhere when you look for very small patterns," Russell says. "The fact that nine of these motifs were already known was a sign we were on the right trail; we then did follow-up experiments in collaboration with Luis Serrano's group at EMBL to test some of the others."
One prediction, for example, suggested that a linear motif would bind to the fly protein translin. The scientists verified that this binding occurred, then made subtle changes to the sequence. When these changes stopped the molecules from binding, they knew they had found a new linear motif.
Now the lab will expand the method; Russell predicts that hundreds of linear motifs remain to be discovered. This has important implications for the study of genetic diseases. "A lot of work has gone into discovering mutations that affect protein binding," he says. "Because linear motifs are so small, every bit of information is crucial, and any change is likely to be disruptive. But so far, because of their size, these motifs have been below the radar of most methods to tie protein structures to disease."