Artifact: Fusion cDNAs from two different genes. Tends to cluster separate genes together and underestimates the number of genes.
Remedy: cDNAs were not ligated but annealed to the vector to create the library, which eliminates fusions.

Artifact: Genomic DNA inserts contaminating the cDNA library. Tends to create artificial clusters and overestimates the number of genes.
Remedy: Adapters were not ligated onto the ends of the cDNAs, so false genomic DNA inserts are very rare.

Artifact: Falsely separating the 5′ and 3′ ends of cDNAs into different clusters. Tends to overestimate the number of genes.
Remedy: Only sequences having a polyA tail are clustered, so 5′ and 3′ ends are not counted twice.

Artifact: Falsely separating splice variants into separate clusters. Tends to overestimate the number of genes.
Remedy: Only sequence contiguous to the polyA tail is used, minimizing the appearance of splice variants in the data.
Stratagene has made a set of distinctive cDNA libraries. Much has been
done to minimize artifacts such as fusion inserts and genomic DNA inserts
(Table 1). Moreover, the cDNAs have been highly normalized, yielding a
very low level of clone redundancy. We have systematically sequenced the
3′ ends of our clones and analyzed the resulting sequences. In the
analysis, we remove sequences lacking polyA tails.
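
The filtering step can be illustrated with a minimal sketch. It is not the pipeline used for the analysis. It assumes reads are given in the sense orientation, so the tail appears as a run of A's at the end of the sequence. The function names, the minimum tail length of 10, and the optional window size are hypothetical choices made for illustration only.

```python
import re

# Assumed threshold: a run of at least MIN_TAIL A's at the very end of a
# read is treated as a polyA tail. The value 10 is illustrative only.
MIN_TAIL = 10


def split_polya(seq, min_tail=MIN_TAIL):
    """Return (body, tail) if seq ends in >= min_tail A's, otherwise None."""
    seq = seq.upper()
    match = re.search(r"A+$", seq)  # the full run of A's at the 3' end
    if match is None or len(match.group()) < min_tail:
        return None
    return seq[:match.start()], match.group()


def keep_polya_reads(reads, min_tail=MIN_TAIL, window=None):
    """Drop reads lacking a polyA tail.

    reads maps a clone name to its sequence. The result maps each kept
    clone name to the sequence directly upstream of the tail, optionally
    limited to the last `window` bases next to the tail.
    """
    kept = {}
    for name, seq in reads.items():
        parts = split_polya(seq, min_tail)
        if parts is None:
            continue  # no recognizable tail: excluded from clustering
        body = parts[0]
        if window is not None:
            body = body[-window:]
        if body:
            kept[name] = body
    return kept


if __name__ == "__main__":
    example = {
        "clone1": "ACGTTGCC" + "A" * 12,  # has a tail, kept
        "clone2": "ACGTTGCATTGC",         # no tail, removed
        "clone3": "GGCCTTAAAA",           # run of A's too short, removed
    }
    print(keep_polya_reads(example))
```

Restricting the retained sequence to a window next to the tail mirrors the idea of using only sequence contiguous to the polyA tail, so that upstream differences such as alternative splicing are less likely to split one gene into several clusters.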