Modification of program enables prediction of gene transcription

A modification to an "ace" gene prediction program by computer scientists at Washington University in St. Louis now enables scientists to predict where gene transcription starts and where the first splice occurs, thereby defining the first exon of the gene.

The modification to the gene prediction software TWINSCAN is called N-SCAN. Michael Brent, Ph.D., professor of computer science at Washington University in St. Louis, together with Samuel S. Gross, then an undergraduate at Washington University, and Randall H. Brown, Ph.D., a research scientist, report their results in the May 2005 issue of Genome Research. N-SCAN has proven to be the best program available at finding both the transcription start site (TSS) and the complete first exon in both the human and fruit fly genomes.

The addition of N-SCAN to TWINSCAN now provides genomics researchers the wherewithal to find and predict both the protein sequences produced by genes and their untranslated regions. Researchers in recent years have grown increasingly enthusiastic about the significance of untranslated regions. By understanding the functions of these regions, scientists expect to understand more about gene regulation: how genes get turned on and off, the ignition system of our DNA, if you will.

To make the proteins that are the basic micro-machines of life, a region of the genome is copied, or "transcribed," to form a molecule called messenger RNA (mRNA). Some segments of the mRNA are then discarded, and the retained segments are spliced together. Geneticists have traditionally assumed that transcription starts within a few hundred bases of the protein-coding region. However, for nearly 40 percent of known human genes, transcription starts long before the beginning of the protein-coding region. Most of this extra-long 5' untranslated region (UTR) is then discarded by splicing. All present gene-finding systems except N-SCAN either ignore the UTR splice sites or incorrectly incorporate them into some protein-coding segment, making gene prediction a none-too-sure industry.

"We've found that when we add the spliced untranslated regions to our system, we not only get good predictions for UTRs but also improved predictions of the protein-coding region of the gene. By correctly identifying UTRs, we can avoid labeling them incorrectly as part of the protein-coding region," said Brent, who, with various colleagues, developed both TWINSCAN and N-SCAN. "It's important to know these two areas. Some of the signals that regulate transcription reside right near the transcription site. There is a huge amount of biology to be discovered there, and the appreciation of this area is growing daily."

While genomics researchers 15 years ago paid little attention to parts of the genome outside the coding regions, they have since discovered some strange functions in UTRs that have provoked second and third thoughts.

For instance, it recently was discovered that huntingtin, a gene associated with Huntington's disease, has a second protein segment encoded upstream of the main one. This protein in the so-called untranslated region is involved in regulating the gene. Running the modified TWINSCAN on both the human and fruit fly genomes, Brent and colleagues predicted about 25,000 transcription-start sites, compared with a known 6,000.

"In the human genome, we found many extra exons on genes that were already known, or in some cases, spliced UTRs on genes that weren't even known to exist before," Brent said.

The system takes advantage of the scarcity of the CG sequence in the genome, finding so-called CpG "islands," stretches where that sequence is unusually frequent, which are known to be more common near the transcription-start site. It also has a knack for recognizing sequences that indicate splice sites. Over the past two years, TWINSCAN has been finding and predicting genes in numerous genomes that other gene prediction systems have missed. The addition of N-SCAN to the handy system, which scans two genomes simultaneously and has the potential to scan three or more, strengthens it for predicting both coding and non-coding DNA.
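
As a rough illustration of the CpG signal described above, the sketch below scores a DNA window by its GC content and its observed/expected CpG ratio. It is not N-SCAN's actual model, which this article does not detail. The function names are hypothetical, and the 0.5 and 0.6 thresholds are assumptions taken from the commonly cited Gardiner-Garden and Frommer criteria for CpG islands.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are C or G (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("C") + seq.count("G")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CG dinucleotide is possible without both bases
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(window: str,
                          min_gc: float = 0.5,
                          min_ratio: float = 0.6) -> bool:
    """Assumed rule of thumb: high GC content and CpG not depleted."""
    return gc_content(window) > min_gc and cpg_obs_exp(window) > min_ratio


if __name__ == "__main__":
    for window in ("CGCGCGCG", "CCCCGGGG", "ATATATAT"):
        print(window, looks_like_cpg_island(window))
```

In practice such a test would be run over sliding windows of at least a few hundred bases; the eight-base strings here are toy inputs that only show how the ratio behaves.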

"Like any multiple choice question, if you can learn something about one of the choices, it helps you with the other one," Brent said. "By making this integrated model that looks for both kinds of exons in both parts of the gene, we're able to convert a blind guessing game to a multiple choice question ?is it a UTR exon or a protein-coding exon? These kinds of questions are easier to answer now."


'"/>

Source: Washington University in St. Louis

