The innovative method combines a new experimental procedure and a new algorithm to identify gene activation captured by DNA microarray analysis with greater sensitivity and specificity. The work holds great promise for vastly improving research on health and disease, according to Ziv Bar-Joseph, assistant professor of computer science and biological sciences at Carnegie Mellon.
"We are very excited about introducing this versatile, powerful method to the research community because it can be used to study a wide range of complex, dynamic systems more comprehensively," said Bar-Joseph, who also is a member of the Center for Automated Learning and Discovery at the School of Computer Science. "Such systems under study include stress and drug response, cancer and embryo development."
DNA microarray analysis -- a multimillion-dollar-a-year industry -- identifies gene activation in living, complex biological systems. DNA microarrays monitor the behavior of thousands of genes over time by detecting changes in the expression of as many as 30,000 different genes on one small chip. The technique has been used to study some of the most important biological systems, including how cells normally divide (the cell cycle) and immune responses to disease and infection.
"Ultimately, we think that the addition of this method to standard DNA microarray analysis will make it more accurate and cost-effective," Bar-Joseph added.
"While DNA microarrays are very powerf ul, they present a sampling problem," Bar-Joseph said. "DNA microarrays only take static snapshots of gene activity over time. In between these snapshots, genes could be activated and we just don't see them turning on. Our protocol will offer greater overall sensitivity in detecting the expression of any gene, even if a gene turns on when no microarray sampling takes place."
Bar-Joseph's procedure is based on a "check-sum" protocol initially developed to ensure that email messages sent via the Internet don't become garbled in transmission. In the standard Internet check-sum protocol, bits of information that begin as one value (0 or 1) may inadvertently flip to the opposite value as they move from one computer to the next in the form of an email. This data loss, ascribed to noise in the communication channel, is checked by counting the number of 1's in the message. If this number is odd, then the last bit is set to 1; otherwise it is set to 0. By comparing the number of 1's on the sending end with the value of the last bit on the receiving end, the recipient's computer can determine whether the message was accurately received. If not, the recipient's computer asks the sender's computer to forward the message again.
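The parity check described above can be sketched in a few lines of Python. This is a minimal illustration of a single parity bit, not the code of any particular email or Internet protocol; the function names `parity_bit`, `encode`, and `is_intact` are hypothetical and chosen only for this example.

```python
def parity_bit(bits):
    """Return 1 if the number of 1s in `bits` is odd, otherwise 0."""
    return sum(bits) % 2


def encode(bits):
    """Sender side: append the parity bit as the last bit of the message."""
    return list(bits) + [parity_bit(bits)]


def is_intact(received):
    """Receiver side: recompute the parity of the payload and compare it
    with the last bit that arrived."""
    *payload, check = received
    return parity_bit(payload) == check


message = [1, 0, 1, 1]          # three 1s, so the parity bit is 1
sent = encode(message)

garbled = list(sent)
garbled[0] ^= 1                 # noise flips one bit in transit

print(is_intact(sent))          # the untouched message passes the check
print(is_intact(garbled))       # the single flipped bit is detected
```

A single parity bit catches any odd number of flipped bits. Two flips cancel each other out and go unnoticed, which is why the check signals a problem without locating or repairing it, and the receiver has to ask for the message again.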
Bar-Joseph's method carries out a similar analysis of the microarray snapshots by "checking" the sum of a set of DNA microarray data points over time (a time series experiment) against the "summary" of the temporal response. If the two sets of results are equal, then what is captured by the DNA microarray time series is real. If the time series results produce a lower value than the microarray summary, the protocol indicates that the researchers have missed a gene's activation somewhere in their time series.
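The comparison in the preceding paragraph can be pictured with a toy example for a single gene. This is only a sketch under stated assumptions, not the published algorithm: it assumes each time point yields one numeric expression value, that the values can simply be added, and that a small `tolerance` absorbs measurement noise. The function name, the tolerance, and the sample numbers are all hypothetical.

```python
def check_sum_status(time_series, summary, tolerance=0.1):
    """Compare the summed time-series values for one gene with its
    summary measurement (toy model; all inputs are illustrative)."""
    total = sum(time_series)
    if abs(total - summary) <= tolerance:
        # The snapshots account for the summary value.
        return "consistent"
    if total < summary:
        # The snapshots add up to less than the summary, so some
        # activation may have fallen between sampling points.
        return "possible_missed_activation"
    # The snapshots add up to more than the summary.
    return "time_series_exceeds_summary"


print(check_sum_status([0.2, 0.5, 0.3], summary=1.0))
print(check_sum_status([0.1, 0.1, 0.1], summary=1.0))
```

The sketch names the third case without interpreting it. What a mismatch means biologically depends on the experimental design, and the real method would also need a statistical treatment of noise that a fixed tolerance does not provide.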
Just as important, according to Bar-Joseph, is whether a DNA microarray summary value exceeds its time series value. If that's the case, then researchers have likely identified gene activity that should be attributed to changes taking place during an experiment -- adding a chemical or changing the temperature, for instance. This aspect of the method provides scientists with the specificity they need to weed out such introduced gene activation from the fundamental gene activation pathways that form the hallmark of processes like cancer or immunity.
To prove the effectiveness of this new method, Bar-Joseph studied the human cell division cycle. Considered one of the most important biological systems, the cell cycle plays a major role in cancer. Using their new method, Bar-Joseph and his colleagues identified many human genes that were not previously known to participate in this system.
"This new set of gene discoveries opens the way to new and more accurate models of the cell cycle system, which in turn can lead to new targets for cancer drugs," said Bar-Joseph.
The new method also overcomes synchronization loss, a vexing problem for scientists who study hundreds or thousands of cells over time, according to Bar-Joseph. Large groups of living cells that start out together at the same biological point in time eventually become desynchronized in their activities, he noted.
"You can compare a group of cells starting out in an experiment like a group of marathoners at the starting line. Over time, some marathoners will be far ahead on the track, while others will fall back." After the race begins, finding one marathoner among the thousands is difficult. Similarly, with asynchronous cells, trying to sort out a single cell response is virtually impossible. But Bar-Joseph has incorporated mathematical tools in his method that can detect genes affected by such asynchrony in a population of cells.