The occurrence of GATC (Dam-recognition) sites in available E. coli DNA sequences (representing about 2% of the chromosome) has been determined by a simple numerical analysis. Our approach was to analyze the nucleotide composition of nine large sequenced DNA stretches ("cantles") in order to identify patterns of GATC distribution and to rationalize such patterns in biological/structural terms. The following observations were made: (i) In addition to oriC, GATC-rich regions are present in numerous locations. (ii) GATC frequency varies widely both between and within DNA cantles, which led to the identification of a void-cluster pattern of GATC arrangement. The distance between two neighboring GATC sites was never greater than 2 kb. (iii) GATC sites are most frequent in translated regions, followed (in decreasing order) by non-coding and non-translated regions. In particular, rRNA- and tRNA-encoding genes exhibit the lowest GATC content.
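As an illustration of the kind of numerical analysis described above, the following minimal Python sketch counts GATC sites in a sequence and reports the spacing between neighboring sites. It is not the authors' procedure: the function names (`gatc_positions`, `gatc_summary`) and the toy sequence are hypothetical and chosen only for this example. Because GATC is its own reverse complement, scanning a single strand is sufficient.

```python
def gatc_positions(seq, site="GATC"):
    """Return the 0-based start positions of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Resume the search one base further on so no occurrence is skipped.
        start = seq.find(site, start + 1)
    return positions


def gatc_summary(seq, site="GATC"):
    """Summarize site count, density per kb, and gaps between neighboring sites."""
    positions = gatc_positions(seq, site)
    # Gap = distance between the start positions of two neighboring sites.
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return {
        "count": len(positions),
        "per_kb": 1000.0 * len(positions) / len(seq) if seq else 0.0,
        "max_gap": max(gaps) if gaps else None,
        "gaps": gaps,
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the E. coli data.
    toy = "AAGATCTTTTGATCGATCAA"
    print(gatc_positions(toy))
    print(gatc_summary(toy))
```

Applied to a real cantle, a large `max_gap` would point to a void and a run of small gaps to a cluster. Computing the same summary over annotated sub-ranges (translated, non-coding, rRNA or tRNA genes) would allow the per-region comparison in observation (iii).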