Jarvis and others thought the new technologies would solve the genome-sequencing challenges. Through a competition called the Assemblathon, the scientists discovered that the PacBio machine had trouble accurately decoding complex regions of the genome of the parrot Melopsittacus undulatus.
The machine had a high error rate, generating the wrong genetic letter at every fifth or sixth spot in a string of DNA. The mistakes made it nearly impossible to create a genome assembly with the very long reads, Jarvis said.
But with a team that included scientists from the DOE Joint Genome Institute and Cold Spring Harbor Laboratory in New York, Phillippy, Koren and Jarvis corrected the PacBio sequencer's errors using shorter, more accurate codes from the next-generation devices. The fix reduces the error rate of the single-molecule, or third-generation, sequencing machine from 15 percent to less than one-tenth of one percent.
"Finally we have been able to assemble the regulatory regions of genes, such as FoxP2 and egr1, that are of interest to us and others in vocal learning behavior," Jarvis said.
He explained that FoxP2 is a gene required for speech development in humans and for vocal learning in birds that learn to imitate sounds, like songbirds and parrots. Egr1 is a gene that controls the brain's ability to reorganize itself based on new experiences.
By being able to decode and organize the DNA that regulates these regions, neuroscientists may be able to better understand what genetic mechanism causes birds to imitate and sing well. They may also be able to collect more information about genetic factors that affect a person's ability to learn how to communicate well and to speak, Jarvis said. He and his team plan to describe the biology of the parrot's genetic code they sequenced in more detail in an upcoming paper.
Contact: Ashley Yeager