In this study, we identified 170 human CNVs located within 34 primate hotspot regions of CNV formation. These structurally plastic hotspots appear to have remained active in all three lineages, even though the lineages have been separated by over 25 million years of evolution. The majority of primate hotspots overlap with functional genomic elements, especially genes related to immunity. A significant portion of the genes that overlap primate hotspots appear to have evolved under positive selection (Figure ), and some of these genes are also known to be evolving under balancing selection in humans (for example, the HLA and LILR families). As such, the evolution and maintenance of primate CNV hotspots may be a response to diverse environmental pressures acting on the genes residing in these hotspots. The maintained plasticity may then provide the mutational flexibility for these genes to adapt rapidly to changing selective pressures. It is therefore not surprising that multiple immune system-related genes are variable in copy number across primates, possibly resonating with the 'Red Queen hypothesis': that host immune system genes and parasite defense genes constantly diversify in response to changes in each other's defenses [21].
For example, we observed a significant enrichment of HCR CNVs in a chromosome 19 region corresponding to the leukocyte receptor cluster (LRC). In humans, this 1 Mb region encompasses several families of immunoglobulin (Ig)-like receptor genes, including gene clusters encoding multiple leukocyte Ig-like receptors (LILRs), leukocyte-associated Ig-like receptors (LAIRs) and killer-cell Ig-like receptors (KIRs). The KIRs have a multifaceted role in two processes, immune defense and reproduction, and interact with cell-surface molecules encoded by the MHC class I locus, another region that displays rapid evolution and copy number variation. These epistatic interactions likely require the co-evolution of MHC and KIR, similar to the co-evolution of parasite and host defenses described above. Under ever-changing pathogenic pressures, more of this variation could be maintained, especially among primates, which, due to their complex social dynamics, have higher rates of pathogen transfer [22]. Therefore, at least some of these primate CNV hotspots are likely maintained under dynamic selective pressures, allowing for copy number variability at these loci.
Other gene ontological categories are represented, albeit less frequently, in the observed primate CNV hotspots. For instance, the pepsinogens (PGA family) are precursors for pepsin (a major digestive enzyme) and may be involved in local environmental adaptation of primates [23]. Such adaptation would be akin to that of the amylase-encoding gene in humans, where different copy numbers of the amylase gene evolved as an adaptation to dietary habits [7]. Similarly, genes such as CHYS1, involved in wound healing, are also noteworthy. More surprising are gene families such as PHDB, which may be involved in neural function [24] and, among other functions, testis development [25], respectively. These findings provide an initial framework for functional studies to establish the extent to which the variation in these genes has contributed to primate evolution.
In their classic paper, King and Wilson [26] recognized the similarity between the macromolecules of chimpanzees and humans, noting that regulation of the amount of these macromolecules during different developmental phases may account for most of the phenotypic differences. In this theoretical framework, copy number variation may be one of the major mechanisms regulating expression levels within and between species (Figure ). Indeed, genes that overlap with HCR CNVs were more likely to be differentially expressed between the three primate species studied here and to have evolved under positive selection in primates (Figures and ). Further evidence indicates that intraspecific expression differences are also significantly higher in genes that fall into primate hotspots (Figure ; Figure S9 in Additional file 2). Not surprisingly, in addition to the HCR CNVs that overlap with coding regions of genes, we found that at least two HCR CNVs overlap squarely with known enhancer regions that are highly conserved at the sequence level (Figure ). Redundancy in enhancers has been related to phenotypic robustness in fruit flies (Drosophila melanogaster), especially when exposed to genetic and environmental variability [27]. Hence, the maintenance of copy number variation in enhancer elements in primates may similarly reflect an evolutionary response that maintains phenotypic robustness under varying and rapidly changing selective pressures. By changing the number and position of genes or regulatory elements present in a single genome, CNVs likely impact gene regulation.
Figure 5. Impact of CNVs on gene regulation. (a) There are multiple ways in which CNVs can impact transcription by overlapping coding regions of the genes. (b) Blekhman et al. (2010) used RNA-seq data to determine whether specific genes are differentially expressed ...
In addition, two recent studies demonstrated that copy number variation at one locus can affect expression levels at other loci. One of these studies showed that the expression level of a gene can be changed by altering the copy number of another gene that shares the same promoter region [28]. The other study demonstrated that the expressed pseudogene of PTEN acts as a sponge for microRNAs. As such, deletion of the pseudogene increased the number of free microRNA molecules, which can, in turn, negatively regulate the expression of the parental gene [29