Characterize gene distribution in your cDNA
Brenda Rogers, Douglas McKenzie
We introduce six pairs of control RT-PCR primers that are designed to detect specific human gene transcripts with various expression levels. These primer sets provide insight into whether cDNA synthesis has been effective for all abundance classes of transcripts and whether full-length or near full-length cDNA species are present for those genes. The RT-PCR Primers for Genes of Varying Abundance Levels (human) consist of one primer pair each for a high- and medium-abundance gene transcript, and primer pairs for each of four low-abundance gene transcripts. The primer pairs are positioned toward the 5′ end of the transcript to provide information about the effectiveness of first-strand synthesis. These RT-PCR primers span intron/exon junction(s) and are designed for use under high-stringency annealing conditions (60°C). The primers are designed to have minimal self-complementarity and minimal capacity for self-priming or formation of primer-dimers. To complement the control RT-PCR primers, we also introduce probes for the same transcripts, so analysis of libraries can be conducted using standard probe technologies.
Generating cDNA from RNA is one of the principal methods used in gene discovery and gene expression analysis. While cDNA libraries have always played an important role in conventional screening technologies, they took on added importance in 1991 with the advent of shotgun, single-pass partial sequencing of entire cDNA libraries.1 This new approach generated large collections of expressed sequence tags (ESTs) that did not correspond to known genes. The impact of this technology on new gene discovery has been enormous.
Similarly, the generation of normalized and subtracted cDNA libraries has provided material for the discovery of new genes. Alternative means of gene discovery based on analysis of differential expression of genes [e.g. Differential Display,2 or serial analysis of gene expression (SAGE)3] rely heavily on cDNA generation from RNA. Gene expression analysis, whether conducted on a gene-by-gene basis using RT-PCR or through global expression profiling on solid phase, relies on cDNA synthesis technologies. Clearly, cDNA-based gene discovery and gene expression tools will continue to play a significant role in the genome and postgenome eras.
A critical element in using cDNA for gene discovery or expression analysis is the quality of the cDNA. Investigators often presume that the distribution and abundance of mRNAs are maintained in the cDNAs and the libraries made from them. In practice, many criteria bear on quality (e.g., complete representation, percentage of full-length inserts, lack of ribosomal sequences, lack of mitochondrial/yeast/E. coli sequences, percentage of colonies with inserts, percentage of colonies yielding good sequence).
Typically, one or possibly two features of the cDNA are addressed: representation and cDNA length. To examine these issues, RT-PCR or probe signals derived from traditional housekeeping genes are evaluated. Unfortunately, transcripts for housekeeping genes are abundant, and suboptimal library construction can still yield a strong signal through RT-PCR or by probing. In gene expression analysis, housekeeping genes are typically expressed at higher levels than the gene of interest, which makes comparisons and quantification difficult.
To better analyze library quality, take into account the abundance of the target gene and the placement of RT-PCR primers or probe along the gene transcript, so both representation and insert size can be addressed. With respect to these discovery technologies, representation and cDNA length are particularly important for low-abundance mRNAs, as most new genes are likely to be discovered from this population of transcripts.
DNA hybridization studies indicate that the typical mammalian cell expresses between 10,000 and 30,000 mRNA species at any one time. These mRNAs can be grouped into three distinct abundance classes.4,5 Most prevalent are the approximately 10 mRNAs that account for about 10% of the overall transcripts. The rare transcripts, approximately 10,000 distinct mRNA species present at between 1 and 15 copies per cell, account for about 45% of the total mRNA in the cell.
We used several criteria to select target genes for the Stratagene set of control RT-PCR primers: the gene transcript should be fairly ubiquitous and constitutively expressed, it should be expressed at a fairly uniform level across different sources of RNA, and some information about its abundance level should be known. We applied an arbitrary classification of transcript abundance that was consistent with the original studies of Bishop et al.4 In general, we designated high-abundance genes as those expressed at approximately 1% of the total RNA, medium-abundance genes as those expressed at 0.1 to 1.0%, and low-abundance genes as those expressed at equal to or less than 0.1%. This corresponds to approximately 3000 copies/cell, 300 to 3000 copies/cell, and less than 300 copies/cell, respectively. Using these criteria, it was fairly straightforward to identify suitable high- and medium-abundance gene targets. In contrast, most low-abundance gene transcripts were not ideal candidates because of their somewhat limited tissue expression patterns. To solve this potential problem, we developed primer pairs for several different low-abundance gene transcripts, each of which exhibited broad but not universal tissue expression. Hence, the appropriate low-abundance primer set must be chosen individually for each RNA source under study.
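The abundance classification above can be sketched as a small calculation. This is only an illustration under stated assumptions: the figure of roughly 300,000 transcripts per cell is not given in the text but is implied by equating 1% with about 3000 copies/cell, the function names are hypothetical, and treating exactly 1.0% as high abundance is an arbitrary choice made here.

```python
# Assumption: about 300,000 transcripts per cell, implied by 1% ~ 3000 copies/cell.
TOTAL_TRANSCRIPTS_PER_CELL = 300_000


def copies_per_cell(percent_of_total, total=TOTAL_TRANSCRIPTS_PER_CELL):
    """Convert an abundance expressed as percent of total transcripts to copies/cell."""
    return percent_of_total * total / 100.0


def abundance_class(percent_of_total):
    """Classify a transcript as high, medium, or low abundance.

    Thresholds follow the text: ~1% is high, 0.1 to 1.0% is medium,
    and 0.1% or less is low. The handling of the 1.0% boundary is an assumption.
    """
    if percent_of_total >= 1.0:
        return "high"
    if percent_of_total > 0.1:
        return "medium"
    return "low"


if __name__ == "__main__":
    for pct in (1.0, 0.5, 0.1, 0.01):
        print(pct, abundance_class(pct), copies_per_cell(pct))
```

A transcript at 0.5% of the total would fall in the medium class at roughly 1500 copies/cell under these assumptions.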
When analyzing cDNA quality, we used these control RT-PCR primers in several different ways. For example, in Figure 1 and Figure 2, we used a number of the control primer sets to conduct a PCR analysis on lambda libraries derived from several human tissues. We wanted to detect an amplification product rather than to evaluate the relative amounts of different transcripts. Consequently, we used a large number of cycles (35 cycles) to maximize the signal. Under these conditions, the plateau effect obscured any quantitative relationship that might exist between the abundance of the transcript and accumulation of PCR product. The results in Figure 1 demonstrate that transcripts for β-actin, γ-actin, and protein phosphatase 1 catalytic subunit are present in libraries generated from lung, muscle, spleen, heart, and liver RNA. Other experiments confirmed their expression in the ovary, testis, cerebellum, and pancreas (data not shown).
Figure 2 shows PCR results obtained with primer pairs for the low-abundance genes ARF1 and ARF3 when tested on lambda libraries derived from the lung and spleen. In every instance, the size of the PCR product was consistent with cDNA, not gDNA, as the source of the PCR signal. These results (Figure 1 and Figure 2 and data not shown) are consistent with the broad tissue distribution of these transcripts and the ability of the control RT-PCR primers to detect their presence.
As an alternative, the control RT-PCR primers may be used in a more quantitative manner. Consider following the appearance of the PCR amplification products as a function of cycle number. The generation of PCR amplification products under these conditions will be influenced by primer efficiency and template abundance. Stratagene has determined that each of the primer sets appears to exhibit equal amplification efficiency, as measured by the rate at which PCR product accumulates during the linear phase of the PCR. This is consistent with the supposition that products generated with these primers reflect the amount of template used in the PCR.
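One way to compare amplification efficiency between primer sets is to fit the rate of product accumulation before the plateau. The sketch below is not the procedure Stratagene used. It assumes product grows as N0 × (1 + E)^cycle over the fitted cycles, so that the slope of log2(signal) against cycle number equals log2(1 + E). The function name and the data are hypothetical.

```python
import math


def amplification_efficiency(cycles, signals):
    """Estimate per-cycle efficiency E from pre-plateau data.

    Fits log2(signal) = a + b * cycle by ordinary least squares and
    returns E = 2**b - 1 (E = 1.0 means perfect doubling each cycle).
    """
    if len(cycles) != len(signals) or len(cycles) < 2:
        raise ValueError("need at least two (cycle, signal) pairs")
    ys = [math.log2(s) for s in signals]  # signals must be positive
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, ys))
    slope = sxy / sxx
    return 2 ** slope - 1


if __name__ == "__main__":
    # Hypothetical densitometry readings, not values from the figures.
    cycles = [18, 20, 22, 24]
    signals = [1.0, 3.3, 10.4, 34.0]
    print(round(amplification_efficiency(cycles, signals), 2))
```

Primer sets that return similar estimates over the same cycle range would be consistent with the equal-efficiency observation described above.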
We conducted experiments to evaluate the cycle-dependent accumulation of PCR product for each of the six RT-PCR abundance primer pairs. A single lambda library derived from the liver was used as template in all reactions (Figure 3). The amount of PCR product was measured using quantitative densitometry, and signal intensity was plotted against cycle number (Figure 4). Based on results using real-time PCR,6,7 we expected the cycle at which product is first detected (the threshold cycle) to indicate the abundance of these transcripts. In addition, we expected the absolute yield of the PCR amplification product to depend on the level of input template. Both of these expectations were consistent with the PCR results we obtained. The more abundant the transcript, the earlier the PCR amplification product was detected, and the greater the amount of PCR amplification product observed at the plateau. This suggests that these primers may be useful in semiquantitative measurements of template abundance.
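The threshold-cycle reasoning can be illustrated with a short calculation. This is a minimal sketch, not the analysis used for Figure 4: it assumes densitometry signals sampled at a few cycle numbers, estimates the threshold cycle by linear interpolation between the two readings that bracket a chosen threshold, and converts a difference in threshold cycles to a fold difference in template assuming equal efficiency for both primer sets. The function names, threshold, and data are hypothetical.

```python
def threshold_cycle(cycles, signals, threshold):
    """Return the (fractional) cycle at which signal first reaches threshold.

    Uses linear interpolation between neighboring readings; returns None
    if the signal never reaches the threshold.
    """
    points = list(zip(cycles, signals))
    if points and points[0][1] >= threshold:
        return float(points[0][0])
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 < threshold <= s1:
            return c0 + (threshold - s0) * (c1 - c0) / (s1 - s0)
    return None


def fold_difference(ct_a, ct_b, efficiency=1.0):
    """Fold excess of template A over template B.

    Assumes both targets amplify with the same per-cycle efficiency
    (efficiency=1.0 means doubling each cycle).
    """
    return (1.0 + efficiency) ** (ct_b - ct_a)


if __name__ == "__main__":
    # Hypothetical readings for an abundant and a rare transcript.
    cycles = [15, 20, 25, 30]
    abundant = [5, 40, 90, 100]
    rare = [0, 2, 25, 70]
    ct_hi = threshold_cycle(cycles, abundant, threshold=20)
    ct_lo = threshold_cycle(cycles, rare, threshold=20)
    print(ct_hi, ct_lo, fold_difference(ct_hi, ct_lo))
```

With readings spaced several cycles apart, the interpolated threshold cycle is coarse, so any fold difference derived this way is at best semiquantitative.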
Stratagene introduces a series of control RT-PCR primers for human gene transcripts of differing abundance. Because they are easy to use and generate different types of information, these primers offer an improvement over RT-PCR tools currently available for analyzing cDNA quality.
Table (fragment): PCR product size from RNA (bp). Row: Protein phosphatase 1.