|Home | About | Journals | Submit | Contact Us | Français|
The frequency and distribution of the rare dinucleotide CpG was examined in 15 mammalian genes. CpG is highly methylated at cytosine in mammalian DNA (1,2) and 5-methylcytosine (5mC) is thought to undergo a transition mutation via deamination to produce thymine (3). This would result in the accumulation of TpG and CpA and depletion of CpG during evolution (4). Consistent with this hypothesis, the gene sample of 26,541 dinucleotides contained CpG at 40% the frequency expected by base composition and the CpG transition products, TpG+CpA, were significantly elevated at 124% of expected random frequency. However, because CpG occurs at only 25% of expected random frequency in the genome, the sampled genes were considerably enriched in this dinucleotide. CpGs were asymmetrically distributed in sequences flanking the genes. 5'-flanking sequences were enriched in CpG at 135% of the frequency expected assuming a symmetrical distribution of all the CpGs in the sampled genes (p less than 0.01), while 3'-flanking regions were depleted in CpG at 40% of expected values (p less than 0.0001). This asymmetry may reflect the role of 5-methylcytosine in gene expression. In contrast the frequencies of GpC and GpT+ ApC did not differ significantly from that predicted by base composition and these dinucleotides were not asymmetrically distributed.