In biochemistry, the primary structure of an unbranched biopolymer, such as a molecule of DNA, RNA or protein, is specified by naming every subunit (nucleotide or amino acid) in order from the beginning to the end of the molecule, which yields a nucleotide or peptide sequence. The primary structure, in other words, is a biopolymer's exact chemical composition together with the sequence in which its subunits are arranged.
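The distinction between composition and sequence can be illustrated with a short Python sketch. The function name `composition` and the example sequences are illustrative assumptions, not part of any standard library or database; the point is only that two molecules may contain the same subunits in the same amounts and still differ in primary structure because the order differs.

```python
from collections import Counter


def composition(sequence: str) -> Counter:
    """Count how many times each subunit (one-letter code) occurs.

    Hypothetical helper for illustration: it discards order and keeps
    only which subunits are present and in what numbers.
    """
    return Counter(sequence.upper())


# Two made-up DNA sequences written from beginning to end of the molecule.
seq_a = "GATTACA"
seq_b = "ACATTAG"  # the same nucleotides, arranged in a different order

same_composition = composition(seq_a) == composition(seq_b)
same_primary_structure = seq_a == seq_b
```

Here `same_composition` is true while `same_primary_structure` is false, which is why a primary structure is given as an ordered sequence rather than as a tally of subunits.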
While the primary structure of a biological polymer largely determines the three-dimensional shape that the molecule assumes in vivo, knowing it is often not enough either to deduce this shape (known as the tertiary structure) or to predict localized structuring, such as the formation of loops or helices (called secondary structure). However, knowing the structure of a similar sequence (for example, a member of the same protein family) can often identify the tertiary structure of the given sequence. Sequence families are often determined by sequence clustering, and structural genomics projects therefore aim to produce a set of representative structures that covers the sequence space (the non-redundant sequences).
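The idea of reducing a collection of sequences to non-redundant representatives can be sketched as a simple greedy clustering. This is only a toy illustration under stated assumptions: the similarity measure is the generic string ratio from Python's `difflib`, not a biological alignment score, and the function names, the 0.8 threshold and the example peptides are all hypothetical. Real clustering tools use alignment-based identity and far more efficient search.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Generic string similarity in [0, 1]; a stand-in for sequence identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences, threshold=0.8):
    """Assign each sequence to the first representative it resembles.

    Sequences are visited longest first. A sequence that is not similar
    enough to any existing representative becomes a new representative.
    Returns a dict mapping each representative to its cluster members.
    """
    clusters = {}
    for seq in sorted(sequences, key=len, reverse=True):
        for representative in clusters:
            if similarity(representative, seq) >= threshold:
                clusters[representative].append(seq)
                break
        else:
            # No existing representative was close enough.
            clusters[seq] = [seq]
    return clusters


# Made-up peptide sequences: the first two differ at one position.
peptides = ["MKTAYIAKQR", "MKTAYIAKQK", "GGGGSGGGGS"]
families = greedy_cluster(peptides)
representatives = list(families)
```

With these inputs the first two peptides fall into one cluster and the third forms its own, so `representatives` holds two sequences. In the setting described above, determining one structure per representative would be the way to cover each cluster.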