Also in this issue are detailed reports of experimental validations that complement the GENCODE gene data, along with novel strategies for further annotating the genome. Howald and colleagues developed the RT-PCR-seq method and used it to show that a substantial portion of exons, the protein-coding regions of genes retained by splicing, are not well annotated by unbiased RNA sequencing alone and require a more targeted strategy used in combination with it.
GENCODE has mapped more than 9,500 long non-coding RNAs (lncRNAs), but until now only about 100 have had a cellular function characterized. lncRNAs, which are transcribed in a range of human tissues and play roles in gene regulation, are particularly interesting because they do not seem to be as well conserved evolutionarily as genes that code for proteins. Derrien et al. analyzed the GENCODE lncRNA annotations and integrated them with other ENCODE transcriptome and epigenome data, presenting the most comprehensive lncRNA annotation to date. The authors show that approximately one-third of lncRNAs have arisen in the primate lineage, suggesting that there may be important lncRNA functions yet to be discovered.
2. ENCODE studies clarify the murky world of RNAs
The ENCODE Project's efforts to annotate the genome include the sequencing of RNA, the message transcribed from DNA to code for proteins and perform other cellular
Contact: Peggy Calicchia
Cold Spring Harbor Laboratory