Non-human primates provide genetic model systems biologically intermediate between humans and other mammalian model organisms. Populations of Caribbean vervet monkeys (Chlorocebus aethiops sabaeus) are genetically homogeneous and large enough to permit well-powered genetic mapping studies of quantitative traits relevant to human health, including expression quantitative trait loci (eQTL). Previous transcriptome-wide investigation in an extended vervet pedigree identified 29 heritable transcripts for which levels of expression in peripheral blood correlate strongly with expression levels in the brain. Quantitative trait linkage analysis using 261 microsatellite markers identified significant (n = 8) and suggestive (n = 4) linkages for 12 of these transcripts, including both cis- and trans-eQTL. Seven transcripts, located on different chromosomes, showed maximum linkage to markers in a single region of vervet chromosome 9; this observation suggests the possibility of a master trans-regulator locus in this region. For one cis-eQTL (at B3GALTL, beta-1,3-glucosyltransferase), we conducted follow-up single nucleotide polymorphism genotyping and fine-scale association analysis in a sample of unrelated Caribbean vervets, localizing this eQTL to a region of <200 kb. These results suggest the value of pedigree and population samples of the Caribbean vervet for linkage and association mapping studies of quantitative traits. The imminent whole genome sequencing of many of these vervet samples will enhance the power of such investigations by providing a comprehensive catalog of genetic variation.
Non-human primates (NHPs) fill a critical need for biomedical models that are more directly relevant to human biology and disease than rodents or other commonly employed model systems, and they also permit longitudinal or invasive investigations that are infeasible in humans (1). The advent of next-generation sequencing has created the opportunity to assay cost-effectively genome-wide genetic variation in NHPs. This possibility has raised interest in using large-scale genetic and genomic investigations of NHPs to elucidate the biological basis of diverse human traits, including HIV/AIDS, cardiometabolic diseases and disorders of brain and behavior (2–4). However, the perceived lack of suitable and sizable NHP samples has thus far limited the implementation of such investigations. We propose here a strategy for mapping complex traits that leverages both the large pedigrees available in NHP research colonies and extensive samples available from wild NHP populations. As we suggest in this report, the Caribbean vervet (also termed the African green monkey, AGM, Chlorocebus aethiops sabaeus) offers particular advantages for the implementation of this approach.
The Caribbean vervet population descends from very small numbers of West African sabaeus monkeys transported to the islands of St Kitts, Nevis and Barbados during the early colonial era (5) (Fig. 1). These monkeys rapidly established feral populations that, in the absence of natural predators, expanded dramatically over a period of ~80–100 generations. Current estimates place the total size of the feral populations on these three islands as high as 50 000–100 000 (Alexis Nisbett, personal communication). Although such island-wide population estimates are imprecise, detailed surveys of a single location in St Kitts have demonstrated the dramatic expansion in feral vervet population size that can occur there over short periods (nearly 4-fold growth over a single decade from 1971 to 1981) (6).
In the 1970s and 1980s, 57 monkeys trapped on the neighboring islands of St Kitts and Nevis were transported to UCLA to found the Vervet Research Colony (VRC). The VRC, which was relocated to the Wake Forest Primate Center in 2008, has been continuously managed as a single extended pedigree (now eight generations deep) and has included more than 2000 monkeys maintained in species-appropriate social housing. The VRC pedigree has been evaluated for a wide range of heritable quantitative traits, including brain and behavior measures (4,7–11), metabolic phenotypes (12) and gene expression levels (13), which have been assayed in multiple tissues.
The size and structure of the VRC pedigree suggest that it should be a powerful resource for initial localization of quantitative trait loci (QTL), given the many opportunities that the extended pedigree provides for observing the transmission of both trait measures and marker alleles. Available evidence supports this expectation; we have reported previously the mapping in the VRC of a locus for central dopamine metabolism (4) and report here the mapping of several expression QTL (eQTL), despite the fact that current studies rely on a sparse genetic map comprised of human microsatellites (14). Genetic investigations of the VRC will soon become considerably more powerful, given whole genome sequencing (WGS) now underway on a pedigree-wide basis.
The 57 monkeys used to establish the VRC were trapped at several locations spanning St Kitts and Nevis. Given that the St Kitts and Nevis vervet population expanded dramatically from a small number of founders, we reasoned that the VRC pedigree must be broadly representative, genetically, of the island populations, and that most variants that contribute to quantitative traits in the VRC pedigree also likely contribute to such traits in the feral St Kitts and Nevis vervet populations. Exceptions to this expectation could include loci for which there has been extensive genetic drift since the separation of the VRC and island populations, and loci for which de novo mutations have occurred in either group since that time.
We further expected that VRC monkeys with similar values on quantitative traits would share, identical-by-descent (IBD), long chromosomal segments containing variants contributing to these traits, and that St Kitts monkeys would share shorter IBD segments containing such variants. This expectation reflects the fact that, in the VRC, as in pedigrees generally, individuals measured for any given trait are separated from their most recent common ancestors by at most a few meiotic steps, whereas in the population sample the individuals measured for traits have been sampled from several dozen distinct locations and therefore are likely separated from their most recent common ancestor by a much larger number of meiotic steps. The predictable relationship between the number of meiotic steps to a common ancestor and the length of a shared segment around a disease gene has been extensively utilized in the design of human genetic mapping studies since early in the positional cloning era, that is, the progression from coarse-scale linkage mapping in pedigrees to fine-scale association mapping in population samples (15). Human studies have also established the importance, to detection of IBD sharing, of demographic history; rapidly expanding founder populations, such as that of the St Kitts and Nevis vervets, typically display particularly extensive and predictable patterns of linkage disequilibrium (LD), compared with more heterogeneous populations (16).
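The relationship between meiotic steps and shared segment length can be made concrete with a textbook approximation, offered here as an illustrative sketch rather than a calculation from this study. Assuming crossovers occur as a Poisson process without interference, the distance from a trait variant to the nearest crossover on either side is exponentially distributed with mean 1/m Morgans after m meioses, so the expected length of the segment shared around the variant is about 2/m Morgans. The function name and the example values of m are hypothetical.

```python
def expected_shared_segment_cm(meioses: int) -> float:
    """Expected length (in cM) of the IBD segment around a fixed locus.

    Assumption (not from the study): crossovers follow a Poisson process
    with no interference, so each flank is exponential with mean
    1/meioses Morgans and the two flanks together average 2/meioses Morgans.
    """
    if meioses < 1:
        raise ValueError("meioses must be a positive integer")
    morgans = 2.0 / meioses
    return morgans * 100.0  # 1 Morgan = 100 cM


if __name__ == "__main__":
    # Illustrative values only: close pedigree relatives vs. distantly
    # related animals sampled from a population.
    for m in (4, 20, 100):
        print(m, "meioses:", round(expected_shared_segment_cm(m), 2), "cM")
```

Under these assumptions, relatives separated by 4 meioses are expected to share about 50 cM around the variant, whereas individuals separated by 100 meioses are expected to share about 2 cM, which is the contrast that motivates linkage mapping in the pedigree followed by association mapping in the population.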
Based on our expectations for allele and segment sharing between the VRC pedigree and island populations, we designed an initial test of a two-stage strategy for genetic mapping of complex traits in the Caribbean vervet. The first stage in this approach consists of genome-wide linkage mapping in the VRC (establishing the strongest coarse-scale regions for follow-up and providing a priori evidence in support of the second-stage association studies). The second stage consists of association analyses that target the candidate regions established in the linkage stage, utilizing large numbers of independent feral vervet monkeys sampled on St Kitts and Nevis to fine-map the linked loci.
In this study, we focus on mapping loci contributing to gene expression phenotypes assayed by genome-wide microarrays. While the long-term focus of genomic investigation of the vervet is primarily on phenotypes hypothesized to directly inform our understanding of human diseases, identification of eQTL has proven invaluable in the genetic dissection of a wide range of complex traits (17). For example, eQTL are significantly overrepresented among single nucleotide polymorphisms (SNPs) that are associated with complex traits in genome-wide association studies (18). In human studies, assessment of variation in gene expression is generally limited to readily accessible tissues, particularly blood, fibroblasts and lymphoblast cell lines (19). In the vervet, by contrast, it is feasible to obtain, in selected individuals, high-quality RNA from any tissue and then to identify transcripts whose expression patterns in that tissue have a strong correlation with their expression patterns in blood or fibroblasts (which are available in most members of the VRC and in large numbers of individuals from population samples). We previously used such an approach to identify 29 vervet transcripts that display expression patterns in blood that are stable, heritable and highly correlated with their expression patterns in the brain (9).
We report here the genetic mapping in the VRC of loci linked to the blood expression levels of these 29 transcripts (eQTL). Among these loci, several markers in a single genome region show linkage to transcripts located on several different vervet chromosomes, suggesting the presence of a master trans-regulator locus. We then describe an association analysis conducted in independently ascertained animals from St Kitts, genotyped with multiple SNPs from one of the cis-eQTL regions [for B3GALTL, a glycosyltransferase which, when recessively mutated, results in Peters plus syndrome, MIM#261540, a developmental delay disorder (20)], that narrowed this region to a chromosome segment of ~170 kb.
We first conducted QTL linkage analysis of 29 gene expression traits (9) using genotypes from 261 genome-wide microsatellite markers in 347 monkeys from the VRC pedigree (see Materials and Methods, and Supplementary Material, Table S1). As shown in Table 1, eight loci (all apparently cis-eQTL) exceeded a threshold for genome-wide significance (LOD > 4.78) obtained by applying the method of Conneely and Boehnke (21) to correct for the analysis of 29 traits. An additional four loci (two cis- and two trans-) exceeded a threshold for suggestive linkage (LOD > 3.27) determined using the same method. These suggestive and significant loci are estimated to account for between ~0.25 and 0.65 of the overall heritability for the specific transcripts to which they are linked (Table 1).
The still incomplete state of the vervet reference genome assembly complicates analysis of the eQTL linkage results, particularly in regions where synteny between the human and vervet genomes is uncertain. Specifically, while the sparse vervet genetic map reflects genotypes obtained from the VRC pedigree (14), the information on the physical position of the transcripts currently derives from the position in the human assembly of the microarray probes used to assay them. It is therefore not yet possible to directly compare the genetic and physical locations of eQTL in any of the several regions that display large-scale rearrangements between the vervet genome (which has 29 autosomes) and the human genome (14,22). An example of such a region is an ~22 cM segment of vervet chromosome 9 that is syntenic with a segment of ~25 Mb on human chromosome 10 (Fig. 2). Seven of the transcripts that we assessed display their peak genome-wide LOD score to markers that map to this region. The strongest linkage evidence (peak LOD scores ≥ 4 for TSPAN14, SLC25A23 and TMEM57) was observed at three microsatellite markers located within a genetic interval of ~3 cM, suggesting that a single locus may be predominantly responsible for the eQTL signal in this region. The location of SLC25A23 and TMEM57 on vervet chromosomes 6 and 20, respectively, further suggests that this eQTL may be a master trans-regulator (23,24). Detailed analysis of this region, however, is currently precluded by the fact that the segment of maximal linkage, although internally syntenic with the corresponding human region, is inverted in the vervet genome compared with that of the human, and high-resolution analysis of the structure of this region in the vervet genome is not yet complete. Additionally, the sparseness of the current genetic marker map makes it difficult to define the inversion region genetically.
To assess the potential utility of independent vervets from the Caribbean population for follow-up of QTL mapping results obtained in the VRC, we conducted an initial genotyping study using wild-trapped monkeys from St Kitts. We considered eQTL for follow-up genotyping based on three criteria: genome-wide significant linkage findings in the VRC (providing strong a priori evidence for the existence of trait-associated variation in the follow-up region), the availability of vervet sequence in the linked genome regions at the time that we were designing follow-up studies and cross-platform validation [by reverse transcription–quantitative polymerase chain reaction (RT–qPCR)] of the expression results. A region on vervet chromosome 3 that included B3GALTL met these criteria. Significant linkage to B3GALTL expression level was observed for two microsatellite markers spanning a distance of >10 cM (D13S1493 and D13S1233, peak LOD scores of 5.76 and 8.65, respectively, Supplementary Material, Fig. S1) and nearly complete vervet sequence data were available for the B3GALTL gene region. RT–qPCR assays of this transcript provided clear experimental evidence indicating that the expression levels measured using the microarray accurately reflected inter-individual variation in expression of this gene among animals of the VRC (validation P-value = 1.08 × 10⁻⁵, Wilcoxon test). Based on these results, we chose to test a vervet sample from the St Kitts population for association with SNPs from the B3GALTL gene region.
Candidate SNPs for fine-mapping of B3GALTL were identified from preliminary data of the Vervet Genome Sequencing Project (VGSP, see Materials and Methods). We selected SNPs that included B3GALTL and flanking regions of ~100 kb, a total of 15 SNPs within a gene region of ~330 kb. We genotyped the 15 B3GALTL SNPs in 279 independent wild-born vervets from St Kitts, sampled at the St Kitts Biomedical Research Foundation, SKBRF (see Materials and Methods and Fig. 3). In order to determine the suitability of the St Kitts samples for following up the linkage result at this locus from the VRC, we compared LD patterns and allele frequencies between the VRC pedigree and St Kitts population samples; for this comparison, we evaluated monkeys from each sample chosen to be as independent from one another as possible. From St Kitts, we selected one monkey from each of the 45 different trapping sites from which they were acquired, and from the VRC we selected 45 monkeys, all of whom were separated from each other by at least four meiotic steps. Thirteen of the 15 SNPs typed in the St Kitts animals were successfully genotyped in the VRC pedigree monkeys.
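One way to check a separation criterion such as "at least four meiotic steps" is to count, for each pair of animals, the fewest meioses linking them through any shared ancestor. The sketch below assumes the pedigree is stored as a mapping from each animal to its recorded parents; the data structure, function names and animal identifiers are hypothetical and do not describe the software actually used for the VRC.

```python
from collections import deque


def ancestor_depths(parents, individual):
    """Map the individual (depth 0) and each ancestor to the fewest
    meioses separating that ancestor from the individual."""
    depths = {individual: 0}
    queue = deque([individual])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in depths:
                depths[parent] = depths[current] + 1
                queue.append(parent)
    return depths


def meiotic_steps(parents, a, b):
    """Fewest meioses linking a and b through a common ancestor,
    or None if they share no recorded ancestor."""
    depths_a = ancestor_depths(parents, a)
    depths_b = ancestor_depths(parents, b)
    shared = depths_a.keys() & depths_b.keys()
    if not shared:
        return None
    return min(depths_a[x] + depths_b[x] for x in shared)


# Hypothetical toy pedigree: child -> (dam, sire)
toy_pedigree = {
    "sib1": ("dam", "sire"),
    "sib2": ("dam", "sire"),
    "cousin1": ("sib1", "mateA"),
    "cousin2": ("sib2", "mateB"),
}

if __name__ == "__main__":
    print(meiotic_steps(toy_pedigree, "sib1", "sib2"))        # full siblings
    print(meiotic_steps(toy_pedigree, "cousin1", "cousin2"))  # first cousins
```

In this toy pedigree, full siblings are separated by two meiotic steps and first cousins by four, so only the cousin pair would satisfy a four-step criterion.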
The correlation in allele frequencies between the pedigree and the population samples is strong [0.804, 95% confidence interval of 0.454–0.939, minor allele frequency range = 6.7–42.2% (St Kitts) and 7.8–41.1% (VRC)] and significant (P = 9.3 × 10⁻⁴), consistent with the assumption that the VRC broadly reflects the genetic diversity of the St Kitts population. The degree to which fine-mapping association studies can resolve the location of a trait-related variant depends upon the number of recombination events that have occurred around the locus of interest since the variant was introduced into the population. The statistic D′ is an indicator of historical recombination between two loci within a given study sample (25). Evaluation of D′ values for all of the pairwise comparisons between the 13 SNPs genotyped in both study samples (Supplementary Material, Fig. S2) revealed a significantly larger proportion of comparisons within the VRC for which D′ = 1 (Chi-square = 4.76, P = 0.03), indicating that less historical recombination has occurred at B3GALTL in the VRC than in the St Kitts population sample. This observation is consistent with the hypothesis that, among St Kitts monkeys, IBD sharing will extend, on average, over shorter chromosomal distances than among VRC monkeys; accordingly, fine-mapping association studies in St Kitts samples should narrow the candidate region of loci mapped in the VRC.
Four of the 15 SNPs typed in the 279 St Kitts monkeys displayed significant association with B3GALTL expression levels after a Bonferroni correction (0.05/15 = 0.003) for multiple comparisons (Table 2). Three of these four SNPs (B3GALTL 2, 4 and 12) were in tight LD with one another (r² > 0.8). The fourth SNP (B3GALTL 16) displayed an r² of ~0.65 with the first three SNPs; however, modeling expression as a function of a linear combination of the four SNPs suggests that all four are detecting a single association signal. The peak association (P = 7.28 × 10⁻⁶) observed at B3GALTL_SNP_2 corresponds to position 31 689 187 bp on human chromosome 13, located ~85 kb upstream of B3GALTL [hg19]. Additionally, two significantly associated SNPs (P < 0.00005) lie in sequences syntenic to introns of this gene, and another associated SNP lies ~15 kb downstream of the gene boundary as predicted from the human sequence. The SNPs genotyped at B3GALTL were insufficient (in terms both of density and of the extent of the genomic region that they cover) to define precisely the boundaries of the association signal at this locus.
We conducted linkage and association analyses of multiple expression traits to assess the potential of the vervet system for large-scale genomic investigations of various quantitative traits. We focused on expression phenotypes based on the assumption that they could have a relatively simple genetic basis, and believed that we could gain substantial empirical information about this potential despite the sparseness of the genetic marker sets currently available in this system. In particular, we expected to observe significant eQTL for transcripts sited in close proximity to informative markers from the vervet microsatellite map. The very strong linkage evidence observed for several cis-eQTL (e.g. three transcripts displayed peak LOD scores > 20, all at microsatellites located at a distance of 2 Mb or less) confirms this expectation.
The linkage results further indicate that despite the extensive bottlenecks in the population from which the VRC founders were drawn, the pedigree maintains sufficient genetic variation to provide excellent power for initial QTL mapping. Consideration of results for the entire set of transcripts and comparison with prior studies further suggest that the availability of polymorphisms that adequately cover the genome will enable trait mapping in the VRC to be routinely successful. We previously identified a significant linkage for brain dopamine metabolism (4) by using SNPs to saturate the most promising region (peak LOD = 2.3) in a microsatellite screening study of the VRC. In the current study, the microsatellite screen identified evidence of at least one locus of this magnitude or greater for 21 of the 29 transcripts evaluated. Given that the microsatellites in the vervet map are individually not nearly as informative as the markers used to construct human linkage maps, and considering the paucity of markers available in the regions containing several of the transcripts (five of the transcripts for which neither suggestive nor significant linkage was observed had no marker within 5 Mb of the probe), we anticipate identifying highly significant eQTL for virtually all of these transcripts, once a next-generation marker map is available, based on WGS data.
It has been proposed that trans-eQTL exert more important biological effects than cis-eQTL (23,26), but they are more challenging to detect. As such, they may provide a clearer indication of the power of a genetic system for the elucidation of complex traits. The observation that almost 25% of the highly selected transcripts assessed here demonstrated their peak linkage signals in a single small chromosomal region on vervet chromosome 9 suggests that future investigations of the vervet system will provide an excellent opportunity to uncover additional loci with comparable trans effects. This expectation is strengthened by the fact that even the most significant cis-eQTL linkages reported here leave a substantial proportion of trait heritability unaccounted for. Extending the approach used to select these transcripts (establishing that their expression patterns in peripheral blood correlate strongly with their expression patterns in the brain) to other typically inaccessible tissues may facilitate the discovery of regulatory loci that are particularly relevant to the function of specific tissue or organ systems.
The localization of the chromosome 9 trans-QTL to a region that includes a large inversion between the vervet and human genome highlights the potential importance of the many structural variations between the vervet genome and that of other primates, as a basis for identifying phenotypic differences between species. Most of the Catarrhini (Old World Monkeys and Hominidae) are characterized by exceptional genome structural stability. In contrast, the Cercopithecini tribe to which the vervet belongs, like the Hylobatidae (gibbons), displays a dramatically accelerated rate of chromosome evolution, with 2N karyotype varying between 54 and 72 (for vervet 2N = 60), reflecting multiple non-centromeric fissions from the ancestral karyotype (27). Analyses of breakpoint regions from numerous structural rearrangements between human and gibbon genomes have suggested that many of these rearrangements could have functional significance, for example by direct interruption of genes (28,29). Unlike gibbons, however, which as endangered species are not available for large-scale investigations of gene expression, epigenetics or higher order phenotypes, vervet pedigree and population samples are ideally suited for comparative phenotypic-structural genomic investigations with both humans and other NHP models such as baboons and macaques.
Although QTL analysis in pedigrees is a powerful means for initial localization of variants contributing to complex phenotypes, the resolution of such analyses is inherently low. Pinpointing the signal identified through QTL linkage studies has proven a major challenge in many species, including mice (30), reflecting the paucity of sufficient genetically independent phenotyped individuals for fine-scale association analyses. The vervet is unique among NHP models in that its most intensively investigated pedigree sample (the VRC) descends entirely from a large but closed ancestral population (St Kitts and Nevis). A common set of alleles is responsible for most genetic variation in both study samples, but the sizable sample of independent individuals available in the island populations offers the opportunity to observe haplotypes at a trait locus that are substantially diminished in length compared with the haplotypes linked to the trait in the pedigree (15).
The high correlation in allele frequencies observed, for B3GALTL region SNPs, between the VRC samples and St Kitts samples drawn from across the island, is consistent with our assumption that the VRC incorporates most of the genetic variation present in the St Kitts population. The observation of greater D′ between SNPs in the VRC compared with the St Kitts samples is also consistent with our expectation of a greater number of historical recombinations in the independently ascertained monkeys from the island population. This expectation is reflected in the trait mapping findings, where the B3GALTL significant linkage signal extends over more than 10 Mb in the VRC, while association evidence in the St Kitts sample is limited to a region of <200 kb. More conclusive confirmation of our hypotheses regarding both the allelic similarity and the extent of haplotype conservation between the VRC and St Kitts population will require genome-wide data, as will be provided by WGS efforts now underway in both study samples.
The WGS of pedigree and population samples, together with the ongoing collection of larger study samples, will provide the opportunity to extend the two-stage pedigree-population approach reported here for eQTL to well-powered investigations of a wide range of quantitative phenotypes. The WGS underway in the VRC will ameliorate the impediments to genetic mapping represented by the inadequacy of current polymorphisms. These studies will provide comprehensive genotype data for >700 vervets; more than 600 monkeys were selected based on their having been phenotyped for multiple measures (from a set of 19 heritable brain and behavior, metabolic and morphometric traits) and nearly 100 monkeys were chosen because they will provide a vervet transcriptome data set comprising RNA sequencing data from 85 brain and peripheral tissues.
For several of the phenotypes assessed in the VRC, it will be possible to conduct follow-up studies in an expanded population sample that now consists of ~1000 monkeys assessed on St Kitts and Nevis. For QTL in which the St Kitts and Nevis samples yield unsatisfactory resolution, it will be feasible to conduct even finer scale association analyses using large samples available from the vervet population of Barbados; preliminary comparative sequence analyses suggest that this population is more distantly related to the VRC than that of St Kitts and Nevis, but is more closely related to the VRC than that of the African populations from which the Caribbean vervets are presumed to derive (Y. Huang, unpublished data).
Traditionally, genetic investigation of NHPs has been limited to the relatively small samples that could be maintained in primate colonies. The development of technologies for tracking and repeat identification of wild or feral NHP populations now enables a wide range of longitudinal investigations of various biomedically important traits that may be relatively inexpensive to conduct compared with human research. International collaborative efforts have already enabled the collection of biomaterials and phenotypic data from more than 1 500 independent vervet monkeys from the Caribbean and African populations (Fig. 4), all of whom have been tagged by microchip and released, and are therefore available for longitudinal studies.
The data from numerous phenotyping assays and genome-wide sequencing, as well as a broad range of biological samples obtained from pedigreed and wild-caught Caribbean and African vervet monkeys, will be made widely available to the scientific community. The majority of the pedigreed animals from which these data were obtained are themselves available for further investigations by the scientific community, particularly for studies that do not interfere with the integrity of the pedigree or its social groups. Investigators can gain access to these materials by contacting the authors and, in the future, through a web-based database. The Integrated Vervet/AGM Research & Resources website provides access to a summary of international vervet phenotyping and genomics efforts and currently available results from VRC microsatellite genotyping, microarray-based gene expression studies and the vervet reference genome sequence, http://www.genomequebec.mcgill.ca/compgen/vervet_research/genomics_genetics/.
The 347 monkeys (all over 2 years of age) from the VRC have been described in previous genotyping and gene expression studies. From the SKBRF, we identified monkeys that had been trapped in independent social groups from sites dispersed throughout the island (Fig. 3). The sample of 279 monkeys included 273 females and 26 males (a preponderance of adult females reflects the natural structure of vervet social groups) with mean (SD) ages of 11.95 (4.5) and 11.62 (4.8) years, respectively. For all of the investigated monkeys, genomic DNA and total RNA samples were obtained from peripheral blood using a PaxGene system (PreAnalyticX) as described previously (13).
Genome-wide gene expression in the VRC monkeys was measured, as described previously, using an Illumina HumanRef-8 v2 chip (GSE15301) (13). A detailed description of the microarray-based strategy used to identify the most promising candidate transcripts for eQTL mapping of transcripts relevant for brain function was presented in Jasinska et al. (13). Briefly, to identify eQTL for which expression patterns in blood are highly correlated with expression patterns in the brain, we created two data sets: (i) a set of matched tissues from eight brain regions and from peripheral blood of 12 VRC monkeys; (ii) a set of duplicate blood samples from 18 VRC monkeys, all collected at two time points. Criteria for selecting transcripts for eQTL analysis included correlated expression pattern between the brain and blood, a high degree of inter-individual variability in expression, longitudinal stability of expression pattern and heritability. Applying these criteria resulted in 29 transcripts for eQTL analyses (Supplementary Material, Table S2).
We validated array-based expression results by using preliminary sequence data from the VGSP to design RT–qPCR amplicons (Supplementary Material, Table S3). The whole-genome sequence data (~4-fold coverage of Roche/454 data aligned to the human reference genome assembly) were derived from a VRC animal (1994–021) from which a vervet BAC library (CHORI-252) was previously created. We performed RT–qPCR using a validation sample set that included 20 monkeys with the most extreme (high and low) expression values from the microarray results. For this experiment, we followed guidelines described by Nolan et al. (31). Briefly, 500 ng of blood-isolated RNA was converted into cDNA using a High Capacity RNA-to-cDNA kit (Applied Biosystems, USA). RT–qPCR reactions were prepared using the SYBR® PCR Master Mix from Applied Biosystems and oligos from Invitrogen, and analyzed with the 7900HT Fast Real Time PCR System (Applied Biosystems). For cDNA amplification, 25 ng of cDNA and 300 nM primers were used per sample reaction and each sample was analyzed in triplicate. Expression changes for the transcript were quantified relative to glyceraldehyde 3-phosphate dehydrogenase (32), selected as an endogenous control, using the Pfaffl method (33). The medians of expression from the bottom tail and top tail as defined by microarray results were compared with RT–qPCR quantification results using a non-parametric Wilcoxon test. We assayed B3GALTL expression in the SKBRF monkeys using the RT–qPCR protocol described above (primers: F-5′-CTT TCA AGT GGG TGA TGA GC-3′, R-5′-AAT GCG TCT GGG AGT CAA TC-3′).
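As an illustration of the relative quantification step, the Pfaffl method expresses the target transcript's change relative to a reference gene while allowing each amplicon its own amplification efficiency. The sketch below is a minimal version of that ratio, assuming efficiencies are given as fold increase per cycle (2.0 for perfect doubling) and that triplicate Ct values are averaged first; the function names and all Ct values are hypothetical and are not data from this study.

```python
from statistics import fmean


def mean_ct(replicate_cts):
    """Average the Ct values of technical replicates (e.g. triplicates)."""
    return fmean(replicate_cts)


def pfaffl_ratio(e_target, ct_target_calibrator, ct_target_sample,
                 e_ref, ct_ref_calibrator, ct_ref_sample):
    """Relative expression ratio of a target gene, efficiency-corrected.

    ratio = E_target ** (Ct_target(calibrator) - Ct_target(sample))
            / E_ref  ** (Ct_ref(calibrator)    - Ct_ref(sample))

    Efficiencies are fold increase per cycle (2.0 = perfect doubling).
    """
    target_change = e_target ** (ct_target_calibrator - ct_target_sample)
    ref_change = e_ref ** (ct_ref_calibrator - ct_ref_sample)
    return target_change / ref_change


if __name__ == "__main__":
    # Hypothetical triplicate Ct values for the target in one sample.
    sample_target_ct = mean_ct([23.1, 22.9, 23.0])
    ratio = pfaffl_ratio(
        e_target=2.0, ct_target_calibrator=25.0,
        ct_target_sample=sample_target_ct,
        e_ref=2.0, ct_ref_calibrator=18.0, ct_ref_sample=18.0,
    )
    print(round(ratio, 2))
```

With equal efficiencies of 2.0, a target that crosses threshold two cycles earlier in the sample than in the calibrator, with no shift in the reference gene, gives a ratio of about 4.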
The VRC pedigree DNA samples were genotyped for genome-wide analyses in a previous study (14) using 261 microsatellite markers that constitute the first-generation vervet genetic map. These animals and the animals sampled from SKBRF were genotyped using SNPs from the B3GALTL region. SNPs were identified from the preliminary data generated by the VGSP, as described above, and were selected based on a minimum of two reads indicating each of two alleles in heterozygous loci, and based on the surrounding sequence being sufficiently complete for assay design.
Genotyping for the 15 B3GALTL SNPs was performed using Sequenom iPLEX technology (SEQUENOM). Multiplex PCR assays were designed using MassARRAY software (v3.1), with default settings of the software and no masking of repeats. Oligos for the PCR reactions and mass-extend reactions were ordered from Integrated DNA Technologies (USA). Reactions for the SNP genotyping were prepared according to the MassARRAY® iPLEX® Gold SNP Genotyping protocol from Sequenom. Multiple positive, negative and non-amplification controls were included.
All of the 279 SKBRF samples were genotyped for each of the 15 SNPs. Forty-five of the VRC monkeys were also genotyped for these SNPs for comparisons of LD and allele frequency with the SKBRF samples; for these monkeys, we obtained acceptable-quality genotypes for 13 of the 15 SNPs.
QTL linkage analysis of the probe intensities of the 29 transcripts was performed in the 347 VRC monkeys using SOLAR (34). As in Jasinska et al. (13), covariates of age, sex and batch (sets of samples run in the same microarray experiment) were considered in each analysis and those significant at P = 0.10 were retained. Correction for testing linkage for multiple expression traits was calculated using a modification of the method of Conneely and Boehnke (21), where the level of correlation among the traits was used to adjust the genome-wide significant LOD score threshold.
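For orientation, a LOD score can be translated into an approximate pointwise P-value. The sketch below uses a common asymptotic approximation for variance-component linkage, in which 2 ln(10) × LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom. It does not implement the Conneely and Boehnke correction or the SOLAR analysis, and the function name is hypothetical.

```python
from math import erfc, log, sqrt


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate pointwise P-value for a variance-component LOD score.

    Assumption: 2*ln(10)*LOD is distributed as a 50:50 mixture of a point
    mass at 0 and a chi-square with 1 df. For a 1-df chi-square,
    P(X > x) = erfc(sqrt(x / 2)), so the mixture P-value is half of that.
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    chi_square = 2.0 * log(10.0) * lod
    return 0.5 * erfc(sqrt(chi_square / 2.0))


if __name__ == "__main__":
    # Thresholds quoted in the text, shown as pointwise (uncorrected) values.
    for lod in (3.27, 4.78):
        print(lod, lod_to_pointwise_p(lod))
```

Under this approximation, a LOD of 3 corresponds to a pointwise P-value of roughly 1 × 10⁻⁴; genome-wide and multi-trait thresholds are stricter because they also account for the number of markers and traits tested.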
For the comparison of the VRC pedigree and population genotype data, we used the following methods: correlations in allele frequencies between the VRC pedigree and the population samples were assessed with a Pearson correlation. LD was measured as D′ between each pair of the SNPs.
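The two statistics named above can be sketched as follows. The example assumes that haplotype frequencies for each SNP pair are already available (how they were phased or estimated is not specified here), and it uses Lewontin's normalization for D′; the function names and the toy frequencies are hypothetical illustrations, not values from the study.

```python
from math import sqrt


def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """|D'| for two biallelic loci, from the frequency of haplotype AB
    and the frequencies of alleles A and B."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


def r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared allelic correlation between two biallelic loci."""
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def pearson(x, y) -> float:
    """Pearson correlation, e.g. between per-SNP allele frequencies
    measured in two samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy case: alleles A and B always travel together (no recombinants).
    print(d_prime(0.5, 0.5, 0.5), r_squared(0.5, 0.5, 0.5))
```

A value of D′ = 1 means that at most three of the four possible haplotypes are observed, which is why a larger share of SNP pairs with D′ = 1 is read as evidence of less historical recombination.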
We used regression analyses to test for association between SNP genotypes and overall gene expression (from both alleles) in the SKBRF population sample. Each genotype was coded as 0, 1 or 2 copies of the minor allele, and expression levels regressed on this variable, assuming an additive model of action of the minor allele.
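The additive coding described above can be illustrated with an ordinary least-squares fit of expression on minor-allele count, together with the Bonferroni threshold for 15 tests. This is a minimal sketch using only the standard library; it reports the slope and its t statistic rather than a P-value (which would come from a t distribution with n − 2 degrees of freedom), it ignores covariates, and the function names and toy data are hypothetical.

```python
from math import sqrt


def additive_regression(genotypes, expression):
    """Regress expression on minor-allele count (0, 1 or 2).

    Returns (slope, intercept, t_statistic), where the slope is the
    estimated change in expression per copy of the minor allele.
    """
    n = len(genotypes)
    if n != len(expression) or n < 3:
        raise ValueError("need equal-length inputs with at least 3 samples")
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e)
              for g, e in zip(genotypes, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(genotypes, expression))
    std_err = sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else float("inf")
    return slope, intercept, t_stat


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    toy_genotypes = [0, 0, 1, 1, 2, 2]                # hypothetical
    toy_expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]   # hypothetical
    print(additive_regression(toy_genotypes, toy_expression))
    print(bonferroni_threshold(0.05, 15))
```

In the toy data, each additional copy of the minor allele raises expression by about one unit, and the Bonferroni threshold for 15 SNPs is 0.05/15, or approximately 0.003.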
We gratefully acknowledge the expertise and technical assistance of Dr Gene Redmond and the entire staff of the St Kitts Biomedical Research Foundation, particularly Messrs. Alexis Nisbett, Ernell (Zyka) Nisbett and O'Neal Whattley. We also wish to acknowledge the contributions of Ms Jennifer Danzy, Dr Alison Grand, Dr Yu Huang, Mr Oliver (Pess) Morton and the Production Sequencing Group, The Genome Institute, Washington University School of Medicine.
Conflict of Interest statement. None declared.
This work was supported by National Institutes of Health grants R01OD010980 and P40OD010965 from the Office of Research Infrastructure Programs/OD (formerly R01RR016300 and P40RR019963 from NCRR), PL1NS062410, UL1DEO19580, P30NS062691, RL1MH083270 and U54HG003079, Genome Quebec and Genome Canada.