Kevin McKernan, Jim Yang, Betty Woolf, Rey Sequerra, Kathy Makowski, Tom Tang
Agencourt Bioscience Corporation, Beverly, MA 01915, USA
Understanding genetic variation and its relationship to drug response and disease susceptibility is an increasingly important consideration in drug design and clinical trials. Since the completion of the Human Genome Project, most human genes can be readily resequenced in diseased or drug-responsive populations. Typical studies involve resequencing 20 to 500 amplicons in 50 or more patients, generating thousands of sequence traces that need to be analyzed. To automate the analysis of such resequencing data efficiently, Agencourt has developed novel algorithms that improve the accuracy of automated procedures while maintaining the sensitivity required for the diverse samples often present in cancer biopsies. In this article, we discuss the common problems encountered in SNP resequencing and the innovative solutions developed at Agencourt.
PolyPhred is currently the most popular software tool for analyzing DNA sequencing reads for SNPs (1,2,3). Other programs such as Mutation Surveyor, NovoSNP (4), and Paracel have certain beneficial features that PolyPhred lacks. However, unlike PolyPhred, they do not offer a Linux-compatible analysis program that can automatically interface with a database or sequencing pipeline to allow the automated assembly of thousands of reads.
Since PolyPhred is the most scalable software tool available, Agencourt opted to design algorithms to enhance a