BMC Genomics. 2009; 10: 593.
Published online Dec 10, 2009. doi:  10.1186/1471-2164-10-593
PMCID: PMC2805694
Expansion of tandem repeats in sea anemone Nematostella vectensis proteome: A source for gene novelty?
Guy Naamati,1 Menachem Fromer,1 and Michal Linial2 (corresponding author)
1School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel
2Department of Biological Chemistry, Institute of Life Sciences, The Sudarsky Center for Computational Biology, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel
Guy Naamati: guy.naamati/at/; Menachem Fromer: fromer/at/; Michal Linial: michall/at/
Received August 15, 2009; Accepted December 10, 2009.
The complete proteome of the starlet sea anemone, Nematostella vectensis, provides insights into gene invention dating back to the Cnidarian-Bilaterian ancestor. With the addition of the complete proteomes of Hydra magnipapillata and Monosiga brevicollis, the investigation of proteins having unique features in early metazoan life has become practical. We focused on the properties and the evolutionary trends of tandem repeat (TR) sequences in Cnidaria proteomes.
We found that 11-16% of N. vectensis proteins contain tandem repeats. Most TRs cover 150-amino-acid segments that are composed of basic units of 5-20 amino acids. In total, the N. vectensis proteome has about 3300 unique TR-units, but only a small fraction of them are shared with the H. magnipapillata, M. brevicollis, or mammalian proteomes. The overall abundance of these TRs stands out relative to that of 14 proteomes representing the diversity among eukaryotes and within the metazoan world. TR-units are characterized by a unique amino acid composition, with cysteine and histidine being over-represented. Structurally, most TR-segments are associated with coiled and disordered regions. Interestingly, 80% of the TR-segments can be read in more than one open reading frame. For over 100 of them, translation of the alternative frames would result in long proteins. Most domain families that are characterized as repeats in eukaryotes are found in the TR-proteomes of Nematostella and Hydra.
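To make the notion of a TR-unit and a TR-segment concrete, the following is a minimal illustrative sketch that scans a protein sequence for exact, back-to-back copies of a unit of 5-20 amino acids. It is not the detection method used in this study: the function name, the `TandemRepeat` record, the exact-match criterion, and the minimum of two copies are all assumptions made for illustration, and real TR detection typically tolerates substitutions and indels between copies.

```python
from typing import List, NamedTuple


class TandemRepeat(NamedTuple):
    start: int   # 0-based index where the repeat segment begins
    unit: str    # the repeated basic unit (hypothetical "TR-unit")
    copies: int  # number of consecutive exact copies of the unit


def find_exact_tandem_repeats(seq: str,
                              min_unit: int = 5,
                              max_unit: int = 20,
                              min_copies: int = 2) -> List[TandemRepeat]:
    """Greedy left-to-right scan for exact tandem repeats.

    Assumption: only perfect copies count; unit lengths follow the
    5-20 amino acid range quoted in the text.
    """
    hits: List[TandemRepeat] = []
    i, n = 0, len(seq)
    while i < n:
        best = None
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            copies = 1
            # Count how many times the unit repeats immediately after itself.
            while seq[i + copies * k: i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies:
                covered = copies * k
                if best is None or covered > best.copies * len(best.unit):
                    best = TandemRepeat(i, unit, copies)
        if best is not None:
            hits.append(best)
            i += best.copies * len(best.unit)  # skip past the reported segment
        else:
            i += 1
    return hits


if __name__ == "__main__":
    # Toy sequence: a 7-residue unit repeated three times between flanks.
    toy = "MK" + "ACDEFGH" * 3 + "LLPQ"
    for hit in find_exact_tandem_repeats(toy):
        print(hit.start, hit.unit, hit.copies)
```

In this sketch the TR-segment is the span of `copies * len(unit)` residues starting at `start`; when several unit lengths match at one position, the one covering the longest span is kept.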
While most TR-proteins are derived from prediction tools and still await experimental validation, supportive evidence exists for hundreds of TR-units in Nematostella. The existence of TR-proteins in early metazoan life may have served as a robust mode for generating novel genes with previously overlooked structural and functional characteristics.
Articles from BMC Genomics are provided here courtesy of BioMed Central.