To visualize this computational challenge, think of a grid that is 100 residues tall, for the first protein, and 100 residues wide, for the second protein. The resulting 10,000 boxes in this grid represent all of the potential residue interactions, and the overall analysis forms a cube 2,500 blocks deep because there is a similar grid for each of the 2,500 protein pairings. Covariance can rank each of these 25 million blocks, but the top of that ranking contains both the target residues that interact directly and numerous indirect pairings that need to be winnowed away.
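The scale and the ranking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name the covariance measure the researchers used, so mutual information stands in for it here, and the function names and toy sequences are hypothetical.

```python
import math
import random
from collections import Counter

# Sizes quoted in the text: a 100 x 100 grid for each of 2,500 protein pairings.
GRID_BOXES = 100 * 100
TOTAL_BLOCKS = GRID_BOXES * 2500


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two aligned columns of residues.

    Assumption: mutual information is used as a stand-in covariance score.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p_ab * log(p_ab / (p_a * p_b)), expressed with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def rank_residue_pairs(seqs_a, seqs_b):
    """Score every (position in A, position in B) pair, highest score first."""
    len_a, len_b = len(seqs_a[0]), len(seqs_b[0])
    scores = []
    for i in range(len_a):
        col_a = [s[i] for s in seqs_a]
        for j in range(len_b):
            col_b = [s[j] for s in seqs_b]
            scores.append((mutual_information(col_a, col_b), i, j))
    scores.sort(reverse=True)
    return scores


# Toy data (hypothetical): 200 pairings of 6-residue proteins in which
# position 2 of protein A always determines position 4 of protein B.
rng = random.Random(0)
alphabet = "ACDE"
partner = {"A": "D", "C": "E", "D": "A", "E": "C"}
seqs_a, seqs_b = [], []
for _ in range(200):
    a = [rng.choice(alphabet) for _ in range(6)]
    b = [rng.choice(alphabet) for _ in range(6)]
    b[4] = partner[a[2]]
    seqs_a.append("".join(a))
    seqs_b.append("".join(b))

ranking = rank_residue_pairs(seqs_a, seqs_b)
top_score, top_i, top_j = ranking[0]
```

In this toy case the coupled pair of positions rises to the top of the ranking. In real data, pairs that covary only through intermediaries also score highly, which is the problem the next step addresses.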
The innovative next step was for the UCSD group to feed this covariance data into a message-passing program. Over the course of about a week of computing, the program analyzed this seemingly unfathomable mass of information and in time identified patterns in the highest-ranking blocks. Continued analysis ultimately yielded predictions about which pairings were in fact direct interactions.
Because the two-component signaling system has been the focus of intense research efforts at Scripps Research and elsewhere, including extensive X-ray crystallography, many of the direct residue interactions had already been identified. That meant all-or-nothing results for the very first message-passing experiment: either the technique would accurately identify the direct pairings or it would not.
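A check of this kind, comparing ranked predictions against contacts already known from structural work, can be sketched as follows. The residue pairs and the helper name are hypothetical and serve only to show one simple way to score such a test: counting how many top-ranked predictions are correct before the first wrong one.

```python
def hits_before_first_miss(ranked_pairs, known_contacts):
    """Count correct predictions at the top of the ranking before the first false positive."""
    hits = 0
    for pair in ranked_pairs:
        if pair not in known_contacts:
            break
        hits += 1
    return hits


# Hypothetical residue-index pairs, best prediction first.
predicted = [(12, 45), (15, 48), (19, 52), (30, 7), (22, 55)]
# Hypothetical contacts taken as already established, e.g. by crystallography.
known = {(12, 45), (15, 48), (19, 52), (22, 55)}

correct_run = hits_before_first_miss(predicted, known)
```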
The results came back overwhelmingly positive, the culmination of a very long quest for Hoch. "It felt absolutely great," he says. "I thought, 'We finally got it! We got it and it works!'"
For a given protein binding site, the message passing accurately identified ten direct interactions, on average, before giving a single false positive. Given that researchers can identify the active binding site for proteins by knowing as few as three directly interacting residues, this success rate is more than enough, for instance, to identify a new drug target. In the case of proteins that in
Contact: Keith McKeown
Scripps Research Institute