We recruited a group of 200 individuals with idiopathic short stature, together with their families, who were seen in the genetic clinic of the Institute of Human Genetics at the University of Erlangen-Nuremberg, to identify as yet unknown genetic factors of growth retardation. Height standard deviation scores (SDS) were calculated on the basis of the Prader growth charts. We included patients with a height SDS below −2 based on population data or whose height was significantly below the expected target height for their family. Common causes of short stature such as growth hormone deficiency, Ullrich-Turner syndrome, and SHOX deficiency were excluded where applicable. All patients underwent detailed clinical and dysmorphological evaluation by one of the authors (C.T.T.) and were classified as non-specific for any known genetic aberration. Our study group included 131 patients with isolated short stature. Sixty-nine individuals presented with additional features such as malformations or a dysmorphic facial gestalt. The mean height SDS was −2.75. Fifty-two individuals showed severe growth retardation of prenatal onset. Patients with significant body disproportions indicating skeletal dysplasias were considered a distinct aetiological group and were not included in this study. Among patients with disproportionate short stature, only those without radiographic signs suggestive of skeletal dysplasia were retained. A borderline IQ in the range of learning disability was observed in 3%. As these individuals received regular education, no specific developmental assessment was available. A control cohort used to exclude common copy number polymorphisms consisted of 820 individuals originating from the same Central European region with either exfoliation syndrome or psoriatic arthritis, both late-onset disorders not associated with short stature. Copy number variants as a cause of these disorders had previously been excluded and have not been reported in the literature.
Overview of the phenotypic characteristics of the patient group.
Molecular karyotyping of the 200 patient and the 820 control samples was performed using Genome-Wide Human SNP 6.0 or CytoScan HD arrays. All samples met in-house quality criteria. Overall, we detected 6,338 copy number changes, an average of 32 aberrations per affected individual. When comparing the size range of all observed CNVs in patients (6,338 CNVs) and controls (40,935 CNVs), we determined a size threshold at 99.2 kb and found a higher incidence of CNVs longer than 100 kb in affected individuals (p-value 1.188×10⁻⁷). To test for effects of common variants, we performed a genome-wide CNV association analysis, calculating a permutation-based p-value across the CNVs of all individuals. As expected given the small cohort size, genome-wide association at the SNP level with the 20 loci where we identified rare variants was excluded (Figure S1).
CNV discovery and characterization.
Molecular karyotyping and MLPA confirmation of identified loci.
Higher incidence of CNVs with a length of above 100 kb in affected individuals.
To investigate the variants further under a “frequent disease - rare variant” hypothesis and to identify major gene effects, we excluded frequent copy number polymorphisms by screening against the 40,935 CNVs of the 820 control individuals. As we suspected that low-penetrance alleles would also be present in the control group, we only excluded CNVs with an overlap in CNV size of 95% in more than 15 control samples (approx. 2%). In total, 1,211 aberrations >50 kb were retained. In a gene-centric approach, we also excluded aberrations that only affected intronic or intergenic sequences. The remaining 733 CNVs were reviewed for gene content and for familial segregation with the growth phenotype, either by array analysis or by multiplex ligation-dependent probe amplification (MLPA; Table S1). We retained all CNVs that were either de novo in the sporadic cases or co-segregated with the phenotype in the familial cases. In addition, CNVs in both groups had to meet at least one of the following criteria: a) affected genes with previously described human growth phenotypes in the OMIM database, b) murine knock-out phenotypes in the Mouse Genome Informatics database (http://www.informatics.jax.org) matching keywords such as growth retardation and decreased body size (Table S2), c) genes with a possible role in height development and/or bone growth based on their reported function in cell cycle regulation, organization of the cytoskeleton, chromatin remodeling, cilia development, or involvement in important developmental pathways (Table S3), d) loci overlapping non-polymorphic, gene-containing aberrations in the DECIPHER database (http://decipher.sanger.ac.uk) with short stature as one of the described phenotypes.
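The frequency and gene-content filters described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: the `CNV` record, the `hits_exon` flag, and the function names are hypothetical, and the 95% overlap is assumed to mean the fraction of the patient CNV's length that is covered by a control CNV (the text does not say whether a reciprocal overlap was required).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record; coordinates in base pairs."""
    sample: str
    chrom: str
    start: int
    end: int
    hits_exon: bool  # True if the CNV affects at least one exon

    @property
    def size(self) -> int:
        return self.end - self.start


def covered_fraction(cnv: CNV, other: CNV) -> float:
    """Fraction of cnv's length that is covered by other."""
    if cnv.chrom != other.chrom:
        return 0.0
    shared = min(cnv.end, other.end) - max(cnv.start, other.start)
    return max(shared, 0) / cnv.size


def is_common(cnv: CNV, controls: list[CNV],
              min_overlap: float = 0.95, max_controls: int = 15) -> bool:
    """True if more than max_controls control samples carry an overlapping CNV."""
    matching = {c.sample for c in controls
                if covered_fraction(cnv, c) >= min_overlap}
    return len(matching) > max_controls


def filter_rare(patient_cnvs: list[CNV], controls: list[CNV],
                min_size: int = 50_000) -> list[CNV]:
    """Keep CNVs that are >50 kb, affect an exon, and are not common in controls."""
    return [c for c in patient_cnvs
            if c.size > min_size and c.hits_exon and not is_common(c, controls)]
```

With these thresholds, a CNV seen in exactly 15 control samples is retained and one seen in 16 is dropped, mirroring the "more than 15 control samples" wording. The segregation and annotation criteria a) to d) require manual review and are not modelled here.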
Taking all lines of evidence together, we identified a total of 20 likely pathogenic copy number changes, 10 deletions and 10 duplications, in 20 families (10% of the study group). Strikingly, for 19 of the 20 CNVs we found no control CNVs overlapping the RefSeq exons they cover, and only a single control CNV overlapped some exons of the 5p15.33 CNV (Table S3). The size of these 20 CNVs ranged from 109 kb to 14 Mb. All 20 CNVs were independently confirmed by MLPA (Figure S2A). Seven aberrations (35%), 4 deletions and 3 duplications, were de novo (parental relationships confirmed), with an average size of 2,594 kb and an average of 30 genes. Given an expected rate of 6×10⁻³ de novo CNVs per haploid genome per generation in the healthy population, the number of de novo CNVs >50 kb identified in our patients was significantly higher than expected by chance, further supporting the pathogenicity of these variants (p-value 0.03, Fisher's exact test).
In contrast to entities with reduced reproductive fitness, e.g. severe intellectual disability, which show a high rate of de novo CNVs, we anticipated a higher rate of inherited CNVs in short stature, as no reproductive disadvantage is known. This was confirmed by the identification of 13 inherited CNVs with an average size of 1,727 kb and an average of 10 genes.
This group of 20 affected individuals with highly probable pathogenic CNVs consisted of 9 male and 11 female individuals. Interestingly, the mean SD score for height was −3.34, and the SD score distribution of these 20 individuals was significantly lower than that of the total study group (p-value 0.009; Wilcoxon test). Thus, rare pathogenic CNVs are more likely to be identified in patients with severe short stature. Among patients with CNVs, no significant difference was observed for prenatal vs. postnatal onset, proportionate vs. disproportionate growth retardation, or syndromic vs. non-syndromic short stature (Table S4), but the number of affected cases in each group was small and statistical power was limited. However, 40% of the 20 patients had a prenatal onset of short stature compared to 26% of the entire study group, suggesting that genes located in these CNVs disturb central regulatory pathways of embryonic development.
Summary phenotype of patients with identified CNVs.
Height distribution of the study group.
Eight patients had CNVs overlapping 6 known microdeletion/duplication syndromes associated with short stature. Two of these patients (patients 8 and 19) had large inherited deletions covering the complete 1q21.1 microdeletion region, which is known for its phenotypic variability. Short stature is present in about 25–50% of patients with this microdeletion. Other commonly reported signs, such as mild facial anomalies, microcephaly and developmental delay, were also observed in our patients. The inherited 307 kb duplication of patient 12 included the distal end of the TAR syndrome susceptibility locus on 1q21.1 but not the recently reported RBM8A gene region. A 1.6 Mb duplication overlapping the region of the rare 3q29 microdeletion/duplication syndrome was found de novo in one patient with non-syndromic idiopathic short stature (patient 7). Features of the 3q29 duplication syndrome have not been clearly determined, but failure to thrive has occasionally been reported. We also found a 1,363 kb de novo deletion partially overlapping the classical and distal 22q11.22 microdeletion region of DiGeorge/Velo-cardio-facial syndrome and a 259 kb inherited deletion within the distal part of 22q11.22 only (patients 4 and 9, respectively). Correspondingly, these two patients presented with short stature and some mild facial features, but no cardiac defects. Inherited duplications in patients 13 and 15 slightly overlapped the microdeletion regions 2q33 and 1p36. Thus, the clinical presentation of these patients confirmed the broad variability of known microdeletion/duplication syndromes and might highlight potential candidate genes for the short stature phenotype in these entities.
Recent genome-wide association studies (GWAS) found common single nucleotide polymorphisms (SNPs) in at least 180 loci to be significantly associated with height variation in the general population. These associated loci accounted for only up to 10% of the phenotypic variation within the normal range of the Gaussian growth distribution. We investigated whether these loci might be located within our identified rare CNVs. Using LocusZoom, we compared position and gene content with the published genome-wide association dataset of the GIANT consortium based on the CEU 1000genomes Nov 2010 imputation. To identify significant loci, we applied a Bonferroni-corrected significance level of 1.377×10⁻⁶ based on the 36,316 SNPs from the GIANT dataset located in the 20 identified CNVs. Loci in 3 of our CNV regions reached this level of significance (the variants with the best p-values and r² values are shown in Figure S3A and Table S5). The loci with the best p-values were located in the 6.7 Mb deletion 2q36.1–36.3 (patient 2), the 4.5 Mb duplication 2p23.3 (patient 5), and the 14.2 Mb duplication 5q22.1-q23.2 (patient 11). rs11125884, located in the promoter region of EFR2B (patient 5), even reached genome-wide significance based on SNP association (2.8×10⁻¹³). This number of 3 CNVs with significantly associated SNP loci out of the 20 likely pathogenic CNVs was significantly higher than expected by chance (p-value < 1×10⁻³). Our findings not only confirm the published results of the genome-wide association study but also suggest a possible functional link between common variants in growth variation and rare variants involved in severe growth retardation, underlining a major gene effect in short stature.
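The Bonferroni threshold quoted above can be reproduced with a short calculation. The sketch below assumes a family-wise error rate of 0.05, which is not stated explicitly in the text but yields the reported value; the function name and the conventional genome-wide threshold of 5×10⁻⁸ are illustrative additions.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


ALPHA = 0.05             # assumed family-wise error rate (not stated in the text)
N_SNPS = 36_316          # GIANT SNPs located in the 20 identified CNVs
GENOME_WIDE = 5e-8       # conventional GWAS threshold, shown for comparison only

threshold = bonferroni_threshold(ALPHA, N_SNPS)
print(f"Bonferroni threshold: {threshold:.3e}")

# Reported association p-value for rs11125884.
p_rs11125884 = 2.8e-13
print("passes region-wide threshold:", p_rs11125884 < threshold)
print("passes genome-wide threshold:", p_rs11125884 < GENOME_WIDE)
```

Dividing by the number of SNPs inside the CNVs, rather than by all SNPs in the GWAS, restricts the correction to the tests actually performed in this lookup.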
Genome-wide significant association of GWAS loci for height distribution in the 3 CNVs.
A deletion or duplication of one gene or a subset of genes located in a CNV can directly or indirectly impair gene expression. To investigate whether this is the case for the 20 identified CNVs, we performed expression profiling in 11 individuals for whom RNA from lymphocytes was available. Of the 188 genes contained in the CNVs, 58 (31%) showed significant differential expression in the direction of the respective CNV (Table S6). Under the binomial distribution, this number of differentially expressed genes would be expected by chance with a probability of less than 0.001, suggesting that these genes are dosage sensitive and that the identified aberrations lead to haploinsufficiency of these genes. To explore whether these 58 differentially expressed genes cluster in networks known to be involved in growth, we performed pathway analyses using Ingenuity Pathway Analysis (IPA). This analysis identified networks involving cell death, cell cycle and DNA repair (Table S7), further supporting the pathogenicity of these CNVs.
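The binomial argument can be illustrated as an upper-tail probability. The text does not state the per-gene probability of observing differential expression in the expected direction by chance, so the value `P_CHANCE = 0.05` below is a purely hypothetical placeholder, and the function name is illustrative. The sketch shows only the form of the calculation, not the authors' exact test.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly over the tail."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


N_GENES = 188      # genes contained in the CNVs
N_DIFF = 58        # genes differentially expressed in the direction of the CNV
P_CHANCE = 0.05    # hypothetical per-gene chance probability (assumption)

observed_fraction = N_DIFF / N_GENES
tail = binomial_upper_tail(N_DIFF, N_GENES, P_CHANCE)

print(f"observed fraction: {observed_fraction:.1%}")
print("tail probability below 0.001:", tail < 0.001)
```

Under this assumed chance rate only about 9 of the 188 genes would be expected to show a concordant change, so observing 58 lies far in the tail; a different null probability would change the tail value.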
In conclusion, we propose rare CNVs as a relatively common cause of short stature under a major gene effect model. These include duplications as well as deletions of more than 100 kb, at a frequency comparable to that observed in other entities, e.g. intellectual disability. Our findings also provide strong evidence for a “rare variant – frequent disease” hypothesis for short stature.