Although CenH3 is, with few known exceptions, encoded by a single gene in diploid species, in the pea genome we identified two divergent copies, designated as CenH3-1 and CenH3-2
(GenBank accession numbers JF739989 and JF739990), sharing only 55% identity. The CenH3-1 and CenH3-2 proteins differ in both length and sequence, being composed of 123 and 119 amino acid residues, respectively, and sharing 72% identity. Transformation experiments using pea hairy-root tissue cultures expressing constructs containing a cDNA fragment encoding either of the CenH3 variants fused with the yellow fluorescent protein (YFP) gene demonstrated that, despite the sequence differences, both CenH3 variants target all 14 centromeres in diploid nuclei. Immunodetection experiments further showed that both CenH3 variants completely colocalized in interphase and mitotic chromosomes. In sharp contrast to other investigated species, including Vicia faba, which is a close relative of the pea, we found that primary constrictions of all pea chromosomes contain not a single but multiple functional centromere domains. These domains were best distinguished in prophase and prometaphase chromosomes, which exhibited three to five domains clearly separated by chromatin blocks lacking CenH3. With increasing condensation in late metaphase and anaphase, the domains came very close to one another or merged into a single extended layer at the poleward face of the primary constriction. However, even in fully condensed chromosomes, the CenH3 within the layer was not evenly distributed, showing a row of intermingling fluorescent signal spots of varying intensity, resembling a string of beads. The chromosome segments delimited by the two most distant domains were roughly estimated to represent 9.5–18.8% of individual chromosomal DNA, which corresponds to 69–107 Mbp (Table S1). To verify that all CenH3-containing domains are indeed sites of kinetochore formation, we carried out the simultaneous immunodetection of CenH3 and tubulin, a protein of the mitotic spindle which is also localized to the kinetochore. Tubulin signals detected in the kinetochore colocalized with those of CenH3, indicating that all CenH3-containing regions are truly functional centromere domains.
Pea has two variants of the CenH3 that fully colocalize in centromeres of all chromosomes.
Organization of CenH3-containing domains during the cell cycle.
The CenH3-containing domains are fully colocalized with tubulin.
We selected chromosome 3 to investigate the centromere structure and the CenH3 distribution at higher resolution using correlative light fluorescence microscopy (LFM) and field emission scanning electron microscopy (FESEM). In this approach, CenH3-containing regions were detected with FluoroNanogold, allowing the same chromosomes to be investigated with both techniques. Three distinct and strongly labeled regions, located on either side of the longitudinal chromosome axis, were detected using both LFM and FESEM imaging. Secondary electron (SE) imaging of the primary constriction revealed longitudinally oriented fibrillar structures interspersed with chromomeres about 200 nm in diameter. Backscattered electron (BSE) detection of CenH3 markers showed that labeled regions are composed of multiple discrete signals (Ø 10–15 nm) from markers near the surface, as well as diffuse regions from markers in the interior of the centromere. Subsequent investigation with a dual-beam focused ion beam (FIB) and FESEM system allowed direct visualization of CenH3 markers in the centromere interior (Video S1). As measured in 5 nm milling steps, markers occurred between 10 nm and approximately 200 nm from both poleward centromere surfaces (Figure S1). High-resolution 3-D reconstruction of the CenH3 distribution revealed that very few of the CenH3 signals actually occur at the chromosome surface (Video S2), indicative of the presence of other kinetochore factors at the chromatin-microtubule interface.
Organization and DNA sequence composition of CenH3-containing domains in chromosome 3.
In order to uncover the DNA sequence composition of the functional centromere domains, we carried out chromatin immunoprecipitation sequencing (ChIP-seq), which produced approximately 9.5 and 19.7 million 35-nt reads for the ChIP and its input control sample, respectively. As the whole pea genome sequence is not yet available, we used as a reference the sequences obtained by paired-end Illumina sequencing of pea nuclear DNA at 0.48× coverage (20.5 million reads, 100 nt in length). All deep sequencing data related to this study have been deposited in the Sequence Read Archive under the study accession number ERA079142 (http://www.ebi.ac.uk/ena/data/view/ERA079142). Sequences associated with CenH3 were identified based on the ratio between ChIP and input sequences mapped either to sequence clusters representing the most abundant repeats of the pea genome or to each reference read (Figure S2). The latter approach revealed a total of 354 717 reference reads (1.73%) showing at least 10-fold enrichment, which were grouped into sequence clusters based on their mutual similarity, as described previously. Further analysis of the clustered sequence data revealed that the vast majority (99%) of the ChIP-enriched sequences belong to 13 distinct families of satellite DNA and one family of Ty3/gypsy retrotransposons belonging to the CRM clade of chromoviruses. These data suggest that functional centromere domains are established almost exclusively upon repetitive DNA sequences. These repeats differed considerably from one another, not only in their primary sequences but also in the size of their repeating units and their abundance in the genome (Dataset S1). The association of all these repeats with functional centromere domains was confirmed using fluorescence in situ hybridization (FISH) combined with immunodetection of CenH3-1, which allowed the assignment of each CenH3-containing domain to some of the identified satellites. These experiments also showed that only the repeats with a high ChIP/input ratio are specific to functional centromere domains, while those with a lower ChIP/input ratio (e.g. PisTR-B and TR-12) are localized predominantly outside of these domains (data not shown). In addition to the ChIP-enriched satellites, we included in these experiments three families of satellite DNA (TR-2, 4, and 5) that are known to occupy primary constrictions but show no ChIP enrichment (ChIP/input < 1.1); these indeed localized outside of CenH3-containing regions. In contrast to most other species, which possess a relatively high level of sequence homogenization among all centromeres, the DNA sequence composition of the centromere domains in the pea varied between chromosomes as well as between individual domains of the same chromosome. The only exception was chromosome 2, which contains a single centromeric satellite family (TR-11).
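The 10-fold ChIP/input criterion used to select reference reads can be illustrated with a minimal sketch. It assumes that hit counts are scaled by the sizes of the two read sets before the ratio is taken; the text does not spell out the normalization, so this scaling, the function names, and the toy counts are all hypothetical and serve only to show the arithmetic.

```python
# Library sizes and threshold are the approximate values quoted in the text.
CHIP_TOTAL = 9_500_000     # ChIP reads
INPUT_TOTAL = 19_700_000   # input control reads
MIN_FOLD = 10.0            # minimum ChIP/input enrichment


def normalized_ratio(chip_hits, input_hits,
                     chip_total=CHIP_TOTAL, input_total=INPUT_TOTAL):
    """ChIP/input ratio after scaling each count by its library size.

    Assumption: per-library scaling; the source does not state how
    the ratio was normalized.
    """
    if input_hits == 0:
        return None  # ratio is undefined without any input hits
    return (chip_hits / chip_total) / (input_hits / input_total)


def enriched_references(hit_counts, min_fold=MIN_FOLD):
    """Return sorted ids of references whose ratio reaches min_fold.

    hit_counts maps a reference id to a (chip_hits, input_hits) tuple.
    """
    selected = []
    for ref_id, (chip_hits, input_hits) in hit_counts.items():
        ratio = normalized_ratio(chip_hits, input_hits)
        if ratio is not None and ratio >= min_fold:
            selected.append(ref_id)
    return sorted(selected)


# Toy counts, invented for illustration only.
toy_counts = {
    "read_a": (120, 10),  # strongly enriched after normalization
    "read_b": (12, 10),   # only weakly enriched
    "read_c": (5, 0),     # no input hits, so it is skipped
}
enriched = enriched_references(toy_counts)

# Share of reference reads reported as enriched (354 717 of 20.5 million).
percent_enriched = 354_717 / 20_500_000 * 100
```

With these toy counts only `read_a` passes the threshold, and `percent_enriched` reproduces the 1.73% quoted above. References without input hits are skipped rather than given an infinite ratio, which is one of several reasonable conventions.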
Characterization of satellite DNA families identified in the pea.
All to all dot-plot comparison of the pea satellite repeats.
Association of satellite DNA sequences with CenH3-containing domains.
It has already been shown that functional centromere domains of monocentric chromosomes are composed of intermingling subunits, 10 to 50 kbp in length, containing nucleosomes with either CenH3 or canonical H3 histones. Although it is not yet well understood how the centromeric chromatin folds during chromosome condensation in mitosis, all current models postulate that CenH3-containing subunits are brought together toward the poleward face of the centromere to form a single compact kinetochore. As the subunits are relatively small, they can be observed only at the finest resolution of the chromatin fiber but not at the level of condensed mitotic chromosomes. Thus, none of the current models allows for large intermingling domains at the poleward side of mitotic chromosomes, as are observed in the pea. On the other hand, the high-resolution 3-D distribution of CenH3 in individual centromere domains (Video S2) resembles that postulated for single centromere domains of previously investigated centromeres.
From a molecular point of view, therefore, the pea chromosomes have multiple centromere domains, yet they have only one primary constriction at metaphase. Chromosomes with two or more functional centromeres are usually unstable due to the formation of anaphase bridges leading to chromosome breakage. One exception is when the two centromeres are physically so close that they are able to fuse into a single centromere without disturbing mitosis. The maximum distance between two centromeres that still allows faithful segregation of dicentric chromosomes was estimated to be about 20 Mbp. Taking into account the size of the chromosome segments delimited by the two outermost functional centromere domains (Table S1) and the total number of these domains in individual chromosomes, the distance between any two domains is likely either to be below this limit or not to exceed it considerably. This probably allows the multiple domains to act in concert, ensuring that pea chromosomes are stable during mitosis and behave as functional monocentrics.
The high diversity of DNA sequence composition of functional centromere domains observed in the pea is unprecedented, but it concurs with the notion that centromeres are determined epigenetically rather than by their DNA sequence. On the other hand, similarly to most other species investigated thus far, all of the centromere domains in the pea are made up of satellite DNA, indicating that the tandem organization of repeating units co-determines centromere domains. This is consistent with the recently proposed role of repetitive DNA in centromere function, which relies on the formation of covalently closed DNA loops by inter-repeat homologous recombination. However, the tandem arrangement of the repeating units is clearly not the only precondition for a DNA sequence to function as a centromere, because some clusters of satellite DNA located within the primary constrictions are not associated with CenH3.
The structure of large pea centromeres is reminiscent of holocentric, also called polycentric, chromosomes that exhibit numerous discrete centromere domains extending over nearly the entire length of the chromosome. As in the pea, these centromere domains congregate during mitosis to form a composite, linear-like kinetochore. The sizes of the segments of pea chromosomes delimited by the outermost functional centromere domains (Table S1) approach or even exceed the size of entire polycentric chromosomes of some species, including C. elegans (14–21 Mbp) and Luzula nivea (155 Mbp on average). A portion of the centromere domains in Luzula nivea is composed of scattered clusters of satellite LCS1, suggesting that satellite DNA is an important centromere determinant in at least some holocentric chromosomes. Remarkably, the LCS1 satellite is similar to RCS2 (CentO), which is the major centromeric satellite of monocentric chromosomes of some Oryza species.
Although the mechanism of transition from monocentric chromosomes to polycentric ones is not yet known and may differ between organisms, a conceivable scenario for Luzula nivea could be that it occurred as a consequence of the spreading of centromere-competent satellite(s). If this is the case, then pea chromosomes with multiple distinct clusters of CenH3-associated satellites might represent an intermediate “meta-polycentric” type between monocentric and polycentric chromosomes. However, it has been postulated that centromere expansion causes deleterious effects, which in turn create pressure for its suppression, possibly by changes in key factors such as CenH3 or Cenp-C. This explains why centromere expansion is not an infinite process and why the size of centromeres of most eukaryotic species remains limited to relatively small chromosome domains. Therefore, we assume that the expansion of pea centromeres is more likely being suppressed, or has already been suppressed, than continuing further into noncentromeric regions. It is tempting to speculate that the presence of two CenH3 genes in the pea is somehow related to the unusual centromere structure. However, it is impossible to conclude from the available data whether the ancient duplication of CenH3 genes and their diversification occurred before or after the centromere expansion. Further research is necessary to fully understand the causes and effects of these unusual features of pea centromeres. Establishing the pea as a new model organism for centromere investigation will contribute to a better understanding of centromere chromatin organization and dynamics during the cell cycle, as well as of the still elusive role of repetitive DNA in centromere evolution, determination and function.