The research environment has changed dramatically since 1997, when an avian flu outbreak in Hong Kong alerted health officials to the danger the virus poses to humans, Janies noted. The technology behind the Human Genome Project has improved enough to allow rapid sequencing of numerous genomes, and avian flu's broad transmission has encouraged scientists to place viral sequence data in the public domain. At the same time, computational power has continued to expand.
Janies and colleagues obtained high-quality avian flu sequences from the repositories at the National Institutes of Health's GenBank and the Global Initiative on Sharing Avian Influenza Data (GISAID). They then focused on two genes within the virus whose mutations are believed to have the most impact on H5N1 behavior: hemagglutinin, which encodes the protein that recognizes the host cell receptor, and neuraminidase, which encodes an enzyme that helps the virus escape one cell so it can enter other cells.
The researchers used 1,646 sequences of hemagglutinin and 1,335 of neuraminidase in this study.
Biologists construct what are called phylogenetic trees to trace evolutionary relationships among species or strains believed to share a common ancestor. These branching diagrams can be built from similarities in physical characteristics, as in the study of dinosaurs, for which genetic data cannot be easily recovered. In the study of influenza, the trees can show how viral strains are related based on shared mutations.
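To make the idea concrete, the sketch below groups four made-up, pre-aligned sequences into a small tree by repeatedly joining the two groups that differ at the fewest positions on average. It is only an illustration: the sequences, strain names and function names are hypothetical, and this simple distance-based clustering is a stand-in for illustration, not the method or scoring system the researchers used.

```python
from itertools import combinations

# Hypothetical toy sequences, already aligned to equal length (not real H5N1 data).
SEQUENCES = {
    "strain_A": "ATGGCATTAC",
    "strain_B": "ATGGCATTAT",
    "strain_C": "ATGACGTTAT",
    "strain_D": "TTGACGTCAT",
}


def hamming(a, b):
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def build_tree(seqs):
    """Average-linkage clustering; returns the tree as nested tuples of names."""
    # Each key is a subtree (a name or a nested tuple); each value lists its members.
    clusters = {name: [name] for name in seqs}
    while len(clusters) > 1:

        def avg_dist(pair):
            left, right = pair
            dists = [
                hamming(seqs[i], seqs[j])
                for i in clusters[left]
                for j in clusters[right]
            ]
            return sum(dists) / len(dists)

        # Join the two subtrees whose members differ least on average.
        left, right = min(combinations(clusters, 2), key=avg_dist)
        merged = clusters.pop(left) + clusters.pop(right)
        clusters[(left, right)] = merged
    return next(iter(clusters))


if __name__ == "__main__":
    print(build_tree(SEQUENCES))
```

In this toy data, strain_A and strain_B differ at one position and are joined first, which mirrors how strains sharing more mutations end up on neighboring branches.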
In the past, scientists including Janies have selected a single phylogenetic tree to represent related viruses that share mutations. But in this paper, the researchers used the power of supercomputers to generate millions of trees representing relationships among these thousands of viruses. They then picked a pool of thousands of high-quality trees based on a scoring system in the
Contact: Daniel Janies, Ohio State University