Imagine millions of jigsaw puzzle pieces scattered across a football field, with too few people and too little time available to assemble the picture.
Scientists in the new but fast-growing field of computational genomics are facing a similar dilemma. In recent decades, these researchers have begun to assemble the chemical blueprints of the DNA found in humans, animals, plants and microbes, unlocking a door that will likely lead to better healthcare and greatly expanded life-science knowledge. But a major obstacle now threatens the speedy movement of DNA's secrets into research labs, two scholars in the field are warning.
This logjam has occurred, the researchers say, because the flood of unassembled genetic data is being produced much faster than current computers can turn it into useful information. That's the premise of a new article, co-written by a Johns Hopkins bioinformatics expert and published in the July 2013 issue of IEEE Spectrum. The piece, titled "DNA and the Data Deluge," was co-authored by Michael C. Schatz, an assistant professor of quantitative biology at Cold Spring Harbor Laboratory, in New York state; and Ben Langmead, an assistant professor of computer science in Johns Hopkins' Whiting School of Engineering.
In their article, the authors trace the rapidly increasing speed and declining cost of machines called DNA sequencers, which chop extremely long strands of biochemical components into more manageable small segments. But, the authors point out, these sequencers do not yield important biological information that researchers "can read like a book."
Instead, the article says, the sequencing machines "generate something like an enormous stack of shredded newspapers, without any organization of the fragments. The stack is far too large to deal with manually, so the problem of sifting through all the fragments is delegated to computer programs."
In other words, the sequencers produce the ge
Contact: Phil Sneiderman
Johns Hopkins University