Proteomics is the large-scale study of proteins, particularly their structures and functions. The term was coined by analogy with genomics, and while it is often viewed as the "next step", proteomics is much more complicated than genomics. Most importantly, while the genome is largely constant, the proteome differs from cell to cell and changes constantly through its biochemical interactions with the genome and the environment. A single organism has radically different protein expression in different parts of its body, at different stages of its life cycle and under different environmental conditions.
The entirety of proteins in existence in an organism throughout its life cycle is referred to as the proteome of that organism. On a smaller scale, the entirety of proteins found in a particular cell type under a particular type of stimulation is referred to as the proteome of that cell type.
With the completion of a rough draft of the human genome, many researchers are now looking at how genes and proteins interact to form other proteins. A surprising finding of the Human Genome Project is that there are far fewer protein-coding genes in the human genome than there are proteins in the human proteome (~22,000 genes vs. ~200,000 proteins). The much greater diversity of proteins is thought to be due to alternative splicing and post-translational modification of proteins. This discrepancy implies that protein diversity cannot be fully characterized by gene expression analysis alone, making proteomics a useful tool for characterizing cells and tissues of interest.
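As a rough illustration of the arithmetic behind this discrepancy, the sketch below computes the average number of proteins per gene from the approximate figures above, and an upper bound on the number of distinct protein forms a single gene could yield. The function name `max_protein_forms` and the example gene (3 optional exons, 4 modification sites) are hypothetical. The model assumes that every optional exon and every modification site varies independently, which real genes generally do not satisfy, so the result is a ceiling rather than an estimate.

```python
# Approximate figures quoted in the text.
GENES = 22_000
PROTEINS = 200_000


def max_protein_forms(optional_exons: int, modification_sites: int) -> int:
    """Upper bound on distinct protein forms from one gene.

    Hypothetical simplification: each optional exon is independently
    included or skipped, and each modification site is independently
    modified or unmodified.
    """
    if optional_exons < 0 or modification_sites < 0:
        raise ValueError("counts must be non-negative")
    splice_variants = 2 ** optional_exons          # alternative splicing
    modification_states = 2 ** modification_sites  # post-translational modification
    return splice_variants * modification_states


proteins_per_gene = PROTEINS / GENES
example_forms = max_protein_forms(optional_exons=3, modification_sites=4)

print(f"average proteins per gene: {proteins_per_gene:.1f}")
print(f"upper bound for the example gene: {example_forms}")
```

Even a handful of optional exons and modification sites gives far more potential forms than the roughly nine proteins per gene implied by the figures above, which is why the two mechanisms are considered sufficient to explain the gap.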
Cataloging all human proteins and ascertaining their functions and interactions presents a daunting challenge for scientists. An international collaboration to achieve these goals is being co-ordinated by the Human Proteome Organisation (HUPO).
Branches of proteomics
- Protein separation. All proteomic technologies rely on the ability to separate a complex mixture so that individual proteins are more easily processed with other techniques.
- Protein identification. Well-known methods include low-throughput sequencing by Edman degradation. Higher-throughput proteomic techniques are based on mass spectrometry, commonly peptide mass fingerprinting on simpler instruments, or de novo sequencing on instruments capable of more than one round of mass spectrometry. Antibody-based assays can also be used, but each assay is specific to one protein.
- Protein quantification. Gel-based methods include differential staining of gels with fluorescent dyes (difference gel electrophoresis). Gel-free methods include various tagging or chemical modification methods, such as isotope-coded affinity tags (ICAT) or combined fractional diagonal chromatography (COFRADIC).
- Protein sequence analysis. This is a largely bioinformatic branch, dedicated to searching databases for possible protein or peptide matches, as well as to the functional assignment of domains, the prediction of function from sequence, and the evolutionary relationships of proteins.
- Structural proteomics. This concerns the high-throughput determination of protein structures in three-dimensional space. Common methods are X-ray crystallography and NMR spectroscopy.
- Interaction proteomics. This concerns the investigation of protein interactions at the atomic, molecular and cellular levels.
- Protein modification. Almost all proteins are modified from their directly translated amino-acid sequence, a process called post-translational modification. Specialized methods have been developed to study phosphorylation (phosphoproteomics) and glycosylation (glycoproteomics).
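Peptide mass fingerprinting, mentioned under protein identification above, compares measured peptide masses with masses predicted from candidate protein sequences. The following is a minimal sketch, assuming trypsin as the protease (cleavage after K or R, but not before P), no missed cleavages, and unmodified peptides. The function names, the example sequence and the 0.01 Da tolerance are illustrative assumptions, not part of any particular software package.

```python
import re

# Monoisotopic residue masses in daltons for the 20 standard amino acids.
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence: str) -> list[str]:
    """Split after K or R unless the next residue is P (no missed cleavages).

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONOISOTOPIC[aa] for aa in peptide) + WATER


def count_matches(observed: list[float], sequence: str,
                  tolerance: float = 0.01) -> int:
    """Count observed masses lying within `tolerance` Da of a predicted peptide."""
    predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(mass - p) <= tolerance for p in predicted)
        for mass in observed
    )


candidate = "GAKRPGKAA"  # hypothetical sequence, for illustration only
for peptide in tryptic_digest(candidate):
    print(peptide, round(peptide_mass(peptide), 4))
```

In practice a search would score every sequence in a database this way and rank the candidates. Real tools also account for missed cleavages, modifications and instrument-specific mass accuracy, all of which this sketch ignores.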
Key technologies used in proteomics
- Twyman, R. M. 2004. Principles of proteomics. BIOS Scientific Publishers, New York. ISBN 1859962734. (Covers almost all branches of proteomics.)
- Westermeier, R. and T. Naven. 2002. Proteomics in practice: a laboratory manual of proteome analysis. Wiley-VCH, Weinheim. ISBN 3527303545. (Focused on 2D gels, good on detail.)
- Liebler, D. C. 2002. Introduction to proteomics: tools for the new biology. Humana Press, Totowa, NJ. ISBN 0585418799 (electronic), ISBN 0896039919 (hardback), ISBN 0896039927 (paperback).