The current genetic and recombination maps of the cat have fewer than 3,000 markers and a resolution limit greater than 1 Mb. To complement the first generation domestic cat maps, support higher resolution mapping studies, and aid genome assembly in specific areas as well as in the whole genome, a 15,000Rad radiation hybrid (RH) panel for the domestic cat was generated. Fibroblasts from the female Abyssinian cat that was used to generate the cat genomic sequence were fused to a Chinese hamster cell line (A23), producing 150 hybrid lines. The clones were initially characterized using 39 STR and 1,536 SNP markers. The utility of whole genome amplification (WGA) in preserving and extending RH panel DNA was also tested using ten STR markers; no significant difference in retention was observed. The resolution of the 15,000Rad RH panel was established by constructing framework maps across ten different 1 Mb regions on different feline chromosomes. In these regions, two-point analysis was used to estimate RH distances, which compared favorably with the estimation of physical distances. The study demonstrates that the 15,000Rad RH panel constitutes a powerful tool for constructing high-resolution maps, having an average resolution of 40.1 kb per marker across the ten 1 Mb regions. In addition, the RH panel will complement existing genomic resources for the domestic cat, aid in the accurate reassemblies of the forthcoming cat genomic sequence, and support cross-species genomic comparisons.
Every genome needs a good map [Lewin et al. 2009]. The accuracy and density of genetic map construction for any particular species has improved with the evolution of molecular biology, cell culture, and genetic and genomic techniques. Chromosomal abnormalities, somatic cell hybrid maps, and family-based linkage analyses are techniques that have increased map density and can decipher gene order. Radiation hybrid (RH) mapping provided an additional leap forward as polymorphic markers were not required and resolution could be manipulated by radiation dosage. Sanger-based and next-generation sequencing now suggest the finest levels of resolution, density, and order, although correct assembly is complicated and not error free. Overall, each independent technique has benefitted from the previously built framework maps and all mapping efforts have combined to improve accuracy of orders and distances of genes in the genome.
Successful RH mapping in humans [Cox et al. 1989] pioneered RH mapping efforts in many species such as rat, dog, cow, sheep, horse, pig, and macaque [Chowdhary et al. 2002; Laurent et al. 2007; Murphy et al. 2001; Priat et al. 1998; Watanabe et al. 1999; Womack et al. 1997; Yerle et al. 1998] and non-mammalian vertebrate models including the zebrafish [Geisler et al. 1999; Hukriede et al. 1999]. RH panels were generated for several species initially as lower resolution RH panels, generally less than 5,000Rad for initial genome map construction; later, higher resolution panels, generally 10,000Rad or greater, were developed for several species and used in fine mapping and genome assembly. For example, RH maps for the human genome have increased from 3,000Rad to 50,000Rad, improving resolution from approximately 1.2 Mb to an average of 94 kb between ordered markers [Gyapay et al. 1996; Hudson et al. 1995; Olivier et al. 2001]. Comparisons to other mapping techniques, such as recombination maps and genome assemblies, have indicated that each method has value and that cross-method comparisons will help to generate the ultimate map, correct in orientation, order, and distance between loci.
The domestic cat has become a useful model organism for the analysis of many diseases and disorders that occur in the human population [Lyons et al. 2004; Menotti-Raymond et al. 2010; Meurs et al. 2007; Rah et al. 2005]. To facilitate the cat’s role as a model for human disease, genetic and genomic resources for the cat have been in production for nearly three decades, progressing from somatic cell hybrid panels [O’Brien and Nash 1982] through interspecies and intra-species linkage maps [Menotti-Raymond et al. 1999; Grahn et al. 2005; Cooper et al. 2006] and integration with radiation hybrid panels [Menotti-Raymond et al. 2003b; Sun et al. 2001] to, currently, complete genome sequencing.
Next-generation sequencing techniques are more efficient in the capture of regions of the genome that had been missed by earlier technologies [Wheeler et al. 2008]; however, the assembly of sequence data generated by non-Sanger based, massively parallel sequencing technologies faces unique challenges due to the short read lengths produced [Pop 2009] and higher error rates [Xu et al. 2009]. The existing Sanger-based genome assembly is necessary as a scaffold to map new sequence data [Wheeler et al. 2008]. Although an excellent initial resource, the coverage of the cat genome captured by the 1.9x sequencing effort of the domestic cat was approximately 60% [Pontius et al. 2007]; additional light sequencing of the cat genome for SNP discovery increased depth to 3x and expanded coverage to approximately 80% of the genome [Mullikin et al. 2010]. Despite the expanded coverage, low contig and short scaffold N50s of 4.6 kb and 162 kb, respectively, suggest that significant gaps in the initial cat sequence assemblies could prevent the formation of continuous contigs with significant length. Next-generation sequencing of the domestic cat to obtain 10–13x coverage is nearing completion (see the Felis catus entry in http://www.genome.gov/10002154). However, since the feline 3x sequence contains significant gaps in coverage and multiple unaligned contigs and the scaffold was based on the canine assembly, accurate assembly of the deeper non-Sanger based feline genome sequence will be highly challenging. RH maps have played an important role in facilitating the process of whole genome sequencing and assembly for human, mouse, rat, dog and cattle [Lander et al. 2001; Waterston et al. 2002; Gibbs et al. 2007; Lindblad-Toh et al. 2005; Bovine Genome Sequencing and Analysis Consortium et al. 2009]. To date, the available maps for the cat have fewer than 3,000 markers [Davis et al. 2009; Menotti-Raymond et al. 2009] with a resolution of ~1 Mb; thus, the generation of a high-density map is of great importance to improve the power of studies that require fine-mapping and physical mapping approaches to identify regions of interest. A high-density and high-resolution genome map could strongly support the placement of contigs and resolve placement of the singleton contigs that have yet to be assigned to a chromosome.
To complement the previous physical maps and further resolve the structure of the domestic cat genome, this study reports the construction and initial characterization of a 15,000Rad whole-genome RH panel suitable for high-resolution mapping.
A 15,000Rad RH panel was generated using a fibroblast donor primary culture derived from a female Abyssinian cat fused with an A23 thymidine kinase (TK-) deficient Chinese hamster fibroblast cell line. Fusion, isolation, selection, and initial cloning procedures were as previously described [Chowdhary et al. 2002] with the following modification: 30 sequential doses of 500 cGy (rad) were administered to the donor cells for a total absorbed dose of 150Gy (15,000 cGy) using a 6MV photon beam produced by a linear accelerator (Varian 2100C, Varian Corporation, Palo Alto CA). Dosages were calculated using a computer software program (IMSure, version 1.22, Prodigm, Inc. Chico, CA). Hybrid cell lines (N = 204) were expanded in T25 flasks (Thermo Fisher Scientific, Rochester NY USA) in HATO medium containing DMEM medium with 10% FBS, 1X antibiotic-antimycotic (Invitrogen, Carlsbad CA USA), 1X HAT (Invitrogen), and 1X Ouabain (Sigma-Aldrich, St. Louis MO USA). At the second passage (p2), approximately two thirds of the cells of each hybrid clone were cryopreserved in freezing medium containing 10% DMSO; the remaining cells were returned to T25 flasks (p3), allowed to reach confluency, then expanded in triplicate T75 flasks (p4) and harvested and cryopreserved at confluency. One hundred and fifty clones were successfully cultured and expanded to p4. DNA was isolated from approximately 10 μl of each cell pellet using the Qiagen DNeasy kit (Valencia, CA USA) according to manufacturer’s protocol. In addition, each DNA sample was whole-genome amplified using the Qiagen REPLI-g Mini Kit (Valencia, CA USA) following the amplification of genomic DNA from blood or cells protocol with 3 μl extracted DNA as the template.
Isolated DNA from RH clones was tested for total DNA concentration using a NanoDrop spectrophotometer (Thermo Scientific, Wilmington, DE USA) (supplementary table 1). Thirty-nine STR markers [Menotti-Raymond et al. 2003a; Menotti-Raymond et al. 1999; Menotti-Raymond et al. 2003b] (supplementary table 2) were multiplexed and tested with the 150 RH clones as previously described [Lipinski et al. 2008]. For a validation test, ten markers were tested on both WGA and non-WGA DNA samples (supplementary table 3). In each reaction, 2 μl WGA DNA (diluted 1:10) or non-WGA DNA was amplified as previously described [Lipinski et al. 2008]. All STR genotypes were analyzed using STRand software ([Toonen and Hughes 2001], http://www.vgl.ucdavis.edu/STRand) with semi-automatic calling of alleles. An internal size standard (GeneScan 500 LIZ, Applied Biosystems) was used to determine allele size for each STR. Allele ranges for each marker were established from genotyping the DNA of the donor cat. Additionally, both WGA and non-WGA DNA from five clones were genotyped with 1,536 SNPs via the Illumina GoldenGate assay as described below (data not shown). Retention frequencies for non-WGA and WGA DNA samples were tested for a significant difference between the two conditions using Pearson’s chi-square test [Pearson 1900].
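The comparison of retention between the two DNA preparations can be sketched as a Pearson chi-square test on a 2x2 table of present versus absent calls. The function name and the counts below are hypothetical and chosen only for illustration; they are not the study data, and the sketch assumes no continuity correction and no empty row or column.

```python
import math

def pearson_chi_square_2x2(present_a, total_a, present_b, total_b):
    """Pearson's chi-square test (1 d.f., no continuity correction)
    comparing the proportion of 'present' calls in two conditions."""
    observed = [
        [present_a, total_a - present_a],
        [present_b, total_b - present_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: marker-by-clone tests scored present,
# out of 1,500 tests per condition (non-WGA first, WGA second).
chi2, p_value = pearson_chi_square_2x2(430, 1500, 410, 1500)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above the chosen significance level (for example 0.05) would be read as no detectable difference in retention between the two conditions.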
Whole-genome amplified DNA from the 150 cell lines and the host hamster and donor cat DNA samples were genotyped for 1,536 SNPs using the Illumina GoldenGate genotyping assay analyzed on an Illumina® BeadStation 500G ([Oliphant et al. 2002]; http://www.illumina.com) following the manufacturer’s protocol at the Genetics Core Laboratory at the Texas Biomedical Research Institute. SNPs were selected within random 1 Mb regions on 10 different cat chromosomes (A1, A2, B3, C2, D1, D2, D4, E2, F2 and X) using data from the Felis catus sequencing effort [Mullikin et al. 2010]. Each region contains ~153 SNPs, geometrically distributed across the region and averaging approximately one SNP per 6 kb. GenomeStudio software (version 2010.2) was used to manually score the presence or absence of loci across the RH panel clones. Data were filtered by considering the quality of the SNPs and the quality of the clones, as previously described [McKay et al. 2007]. For the primary filtering, SNPs were not scored in the clones if the SNP successfully amplified in the hamster and could not be distinguished from the cat or if the SNP failed to amplify in the donor cat DNA. Calling thresholds for the SNPs that amplified within a clone were absent (0) for call scores below 0.10, unknown (−) for call scores between 0.10 and 0.15, and present (1) for call scores above 0.15. Finally, secondary filtering removed markers with an “unknown” rate of more than 5%.
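The calling thresholds and the secondary filter described above can be sketched as follows. This is an illustration only: the function names, marker names, and scores are hypothetical, scoring in the study was performed manually in GenomeStudio, and the treatment of scores exactly equal to 0.10 or 0.15 as "unknown" is an assumption.

```python
ABSENT_BELOW = 0.10      # call scores below this are scored absent (0)
PRESENT_ABOVE = 0.15     # call scores above this are scored present (1)
MAX_UNKNOWN_RATE = 0.05  # markers with more unknown calls than this are dropped

def call_marker(score):
    """Convert one call score into an RH vector symbol."""
    if score < ABSENT_BELOW:
        return "0"
    if score > PRESENT_ABOVE:
        return "1"
    return "-"  # assumption: boundary values fall in the unknown class

def filter_markers(scores_by_marker):
    """Return RH vectors for markers that pass the secondary filter."""
    kept = {}
    for marker, scores in scores_by_marker.items():
        calls = [call_marker(s) for s in scores]
        unknown_rate = calls.count("-") / len(calls)
        if unknown_rate <= MAX_UNKNOWN_RATE:
            kept[marker] = "".join(calls)
    return kept

# Hypothetical call scores for two markers typed on 20 clones.
example_scores = {
    "snp_a": [0.02] * 14 + [0.60] * 6,               # no unknown calls
    "snp_b": [0.02] * 12 + [0.12] * 2 + [0.60] * 6,  # 2 of 20 calls unknown
}
vectors = filter_markers(example_scores)
print(vectors)
```

In this toy input the second marker has an unknown rate of 10% and is therefore removed, leaving only the first marker's RH vector.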
The filtered data from each region were used to construct linkage groups and framework maps in CarthaGene-1.2 [de Givry et al. 2005]. Markers were tested to detect duplicate RH vectors using the “merge” command. For markers within a given 1 Mb region, linkage groups were formed by two-point analysis with a threshold LOD score > 3.0. Framework maps were constructed using the “buildfw” command with a saving threshold and an adding threshold of 3. The resulting maps were refined with the commands “flips” (7 1 1), “polish”, and “greedy” (3 1 1 15 25) and tested by “robustness” (−3). For each chromosomal region, ten maps with the highest likelihood were examined to confirm the reliability of major changes detected within each region.
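To illustrate what a two-point analysis estimates, the sketch below applies the textbook haploid, equal-retention RH model to a pair of markers: a breakage probability (theta), the corresponding distance in centiRays, and a LOD score against no linkage (theta = 1). It is not CarthaGene's implementation; the function name is hypothetical, theta is obtained with a simple moment estimator rather than by maximum likelihood, and the vectors are assumed to contain only 0/1 calls.

```python
import math

def two_point_rh(vec_a, vec_b):
    """Two-point estimate for two markers typed on the same clones.

    vec_a, vec_b: equal-length lists of 0/1 calls (1 = marker retained).
    Returns (theta, distance_cR, lod).
    """
    n = len(vec_a)
    both = sum(1 for a, b in zip(vec_a, vec_b) if a == 1 and b == 1)
    neither = sum(1 for a, b in zip(vec_a, vec_b) if a == 0 and b == 0)
    discordant = n - both - neither

    # Retention probability, assumed equal for both markers.
    r = (sum(vec_a) + sum(vec_b)) / (2.0 * n)
    if not 0.0 < r < 1.0:
        raise ValueError("retention must be strictly between 0 and 1")

    # Expected discordant fraction is 2 * theta * r * (1 - r).
    theta = min(1.0, discordant / (2.0 * n * r * (1.0 - r)))

    def log10_likelihood(t):
        terms = (
            (both, (1.0 - t) * r + t * r * r),
            (neither, (1.0 - t) * (1.0 - r) + t * (1.0 - r) ** 2),
            (discordant, t * r * (1.0 - r)),
        )
        return sum(count * math.log10(p) for count, p in terms if count > 0)

    lod = log10_likelihood(theta) - log10_likelihood(1.0)
    if theta >= 1.0:
        distance_cR = math.inf
    else:
        distance_cR = 100.0 * math.log(1.0 / (1.0 - theta))
    return theta, distance_cR, lod

# Hypothetical RH vector for 30 clones, compared with itself
# (complete co-retention) and with a shifted copy (no co-retention).
marker_a = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0] * 3
marker_b = marker_a[1:] + marker_a[:1]
print(two_point_rh(marker_a, marker_a))
print(two_point_rh(marker_a, marker_b))
```

Under this model, a pair would join the same linkage group when its LOD exceeds the chosen threshold (3.0 in the analysis described above).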
Two hundred and four initial cat-hamster fusion lines were identified and propagated; of these, 150 clones were successfully cultured to at least passage 4 and were further characterized to develop the RH panel. To verify accurate representation of clone DNA in WGA samples, ten STRs were genotyped on WGA and non-WGA template DNA from the same clones (supplementary table 3). Average retention frequencies (RFs) for non-WGA and WGA DNA were 28.7% and 27.3%, respectively, and the percentages of clones with greater than 10% RF were 96.7% and 92.7%, respectively (table 1). Comparing presence or absence of each marker for each clone, 3% discordancy was detected between the WGA and non-WGA DNA (supplementary table 3). In addition, WGA and non-WGA DNA from five clones were tested on 1,536 SNPs and no significant differences were detected (data not shown).
Each clone was tested with 39 STR markers distributed across all 18 autosomes and the X chromosome for initial characterization of the panel. Marker RFs are summarized in table 1 and the data matrix is presented in supplementary table 4. Clone RFs ranged from 0% to 54%, with an average of 21% for the panel. Clones C027 and C184 were not positive for any STRs, and clone C087 had the highest RF at 54%. Of the 150 clones tested, 88% had RFs greater than 10%. No markers were located near the selectable TK and HPRT loci. Figure 1 shows the graphical representation of RFs for each clone summarized in supplementary table 1, and supplementary fig. 1 illustrates the distribution of markers within clones.
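As a minimal sketch of how these summary statistics follow from a presence/absence matrix, the code below computes per-clone RF, per-marker RF, the panel average, and the share of clones with RF above 10%. The function name, clone names, and calls are hypothetical toy values, not the data of supplementary table 4.

```python
def retention_summary(matrix):
    """Summarize retention from a dict of clone -> list of 0/1 calls,
    with one call per marker and the same marker order for every clone."""
    n_clones = len(matrix)
    n_markers = len(next(iter(matrix.values())))

    # Fraction of markers retained in each clone.
    clone_rf = {clone: sum(calls) / n_markers for clone, calls in matrix.items()}

    # Fraction of clones retaining each marker.
    marker_rf = [
        sum(calls[i] for calls in matrix.values()) / n_clones
        for i in range(n_markers)
    ]

    panel_rf = sum(clone_rf.values()) / n_clones
    share_above_10 = sum(1 for rf in clone_rf.values() if rf > 0.10) / n_clones
    return clone_rf, marker_rf, panel_rf, share_above_10

# Hypothetical toy panel: four clones typed with five markers.
toy_matrix = {
    "clone_1": [1, 0, 0, 1, 0],
    "clone_2": [0, 0, 0, 0, 0],
    "clone_3": [1, 1, 0, 0, 1],
    "clone_4": [0, 0, 1, 0, 0],
}
clone_rf, marker_rf, panel_rf, share_above_10 = retention_summary(toy_matrix)
print(clone_rf, marker_rf, panel_rf, share_above_10)
```

Unknown calls are ignored here for simplicity; a fuller treatment would exclude them from both numerator and denominator.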
Further characterization of the 150 clones was accomplished via high-throughput SNP genotyping. Of the 1,536 SNPs included in the Illumina GoldenGate assay, 921 (59.96%) passed primary filtering and were successfully typed. Seventy-five SNPs (4.88%) failed in the donor cat and 540 SNPs (35.16%) were positive in hamster; genotyping was therefore not possible for these markers. Of the 921 SNPs that passed primary filtering, 127 SNPs failed secondary filtering, resulting in 794 SNPs available for analysis. For the SNP markers, the average RF was 29.4%, and the percentage of clones with greater than 10% RF was 90.7% (table 1). Clone RFs estimated by SNP genotyping are summarized in supplementary table 1 and illustrated in fig. 1. The clones that were negative for all STRs had RFs > 20% for the SNPs. A general trend of similar RFs was observed between STRs and SNPs (fig. 1).
The 794 SNPs that passed quality control were used to construct framework RH maps (table 2). No SNPs had identical RH vectors as determined by the “merge” command. The 794 markers formed ten linkage groups, one per chromosome region. Framework maps included 181 of the 794 SNPs (22.80%). The RH maps with best likelihoods obtained from the ten initial 1 Mb regions are shown in supplementary fig. 2 and summarized in table 2. A representative map for Chromosome E2 is presented as fig. 2. LOD scores to the next nearest marker in a framework map were generally above 20, ranging from 12.7 to 37.7. Major inversions were suggested on chromosomes A2, B3, D1, and D2, which were consistent across all 10 highest likelihood maps for each region. The overall SNP order, aside from the noted inversions, was conserved across all maps. However, in every map, some SNPs showed a different location in the RH map as compared to the positions identified by the sequence assembly. The length of the RH maps ranged from 178.9 cR15,000 to 489 cR15,000 with an average kb to cR ratio of 2.4 kb/cR15,000 and an average inter-marker distance of 40.1 kb (table 2). Chromosome E2 (fig. 2) and D4 were the most densely mapped chromosomes, with 25 markers assigned to each 1 Mb region and lengths of 489 cR15,000 and 425.4 cR15,000, respectively. Conversely, B3 was the most sparsely mapped chromosome, with 11 markers on the framework map and a length of 178.9 cR15,000. The average map lengths in kb and in cR15,000 as well as kb/cR15,000 are shown in table 2.
The domestic cat is increasingly recognized as a robust animal model for human diseases. Expanding resources, particularly the genome sequencing projects and the resulting SNP discoveries, should facilitate candidate gene identification and analyses. The anticipated deeper coverage should accelerate investigations of simple phenotypes and enable analyses of more complex traits in the cat. However, mutation detection studies are hampered by incorrect gene orders caused by statistical fluctuations in genetic maps and incorrect genome assemblies. Even with very high coverage-derived assemblies of humans and mice, imperfections and artifacts in gene order and distances are a recognized hazard [Kong et al. 2002], which can be resolved with the generation of a high-resolution high-density RH map [Marques et al. 2007].
Besides the genome sequence, many excellent resources are currently available for the domestic cat supporting the identification of diseases and traits. The current cat 5,000Rad RH map contains over 2,000 markers with a resolution of ~1 Mb, which supported the 1.9x and 3x cat genome assemblies. The use of a more robust whole genome assembly from the most closely related species is a common practice for comparative assemblies, as has been demonstrated in the bovine genome [Prasad et al. 2007]. The dog scaffold was used where available to direct the cat assembly below the 1 Mb level, introducing bias towards the dog genome in the cat gene order and inherently incorporating errors into the cat sequence assembly [Pontius et al. 2007]. Thus, a high-resolution gene map for the cat, developed by an independent technology such as the radiation hybrid technique, should augment genome assembly for the cat, thereby facilitating fine mapping and candidate gene studies.
The overall goal was to produce a high-resolution RH map of the cat to facilitate genome assembly. The same Abyssinian cat was used to generate the genome sequence and construct the RH panel to alleviate concerns regarding individual sequence rearrangements and variations due to repeat regions. The cat RH 15,000Rad panel is composed of 150 RH clones, which supports mapping of closely spaced markers. Marker retention has been shown to vary across the genome in RH panels [Cox et al. 1990; James et al. 1994; Walter et al. 1994] and can be estimated with different technologies, such as STRs and high-throughput SNP genotyping. An initial retention frequency (RF) estimate of 21% was determined by testing 39 STRs, with 88% of clones having greater than 10% RF. As expected, this RF estimate is lower than the initial estimate of the 5,000Rad panel [Murphy et al. 1999] but similar to RFs obtained from other species RH panels with high radiation dose, such as the 12,000Rad ovine panel, which reported 87.5% of clones with RF greater than 10% [Laurent et al. 2007]. Retention frequencies were also estimated from the SNP analysis and illustrate the limitations of typing a high-resolution RH panel with few markers. The RF estimate from SNP analysis is higher than that estimated by STR analysis, with 29.4% RF and 90.7% of clones greater than 10% RF. By testing more loci, a more complete estimate of RF can be calculated, which reflects the higher retention expected from using a low-passage RH panel. Both STR and SNP analyses demonstrate the fluctuation in retention in different regions of the genome. Using these data, a subset of 94 clones could easily be selected that has an optimum average retention frequency across the cat genome.
The newer SNP genotyping technologies should support the construction of high-density genetic maps. For this study, Illumina technology was used to genotype 1,536 SNPs on the cat RH panel. The initial selection of SNPs did not consider conservation to hamster, implying the amplification of some SNPs was expected in the hamster background. Approximately 35% of SNPs amplified in the hamster, precluding their analysis in the cat RH panel. In addition, this analysis of the SNPs is the first genotyping of these particular loci in the cat; thus, a 5% failure rate, as determined by failure in the donor cat DNA, was expected due to design and assay failure. Approximately 8% of SNPs failed to amplify robustly, resulting in ambiguous genotyping. Overall, approximately 52% (794 of 1,536) of SNPs were suitable for map construction, supporting the rapid development of a high-density map.
The current 5,000Rad RH panel has a resolution limit exceeding 1 Mb, thus, below this level, the cat genome assembly does not have a secondary mapping method to support assembly or resolve discrepancies. The ten framework maps constructed had an average inter-marker distance of 40.1 kb, suggesting a significant 25-fold increase in resolution as compared to the 5,000Rad feline RH map, and will support genome assembly below 1 Mb. Conservative framework maps for the 15,000Rad cat RH panel were constructed with 181 SNPs, averaging 18 markers covering an average of 900 kb. A majority of markers in the framework maps had consistent order with the suggested SNP positioning. However, several large inversions on five of 10 chromosomes were identified, which were supported by all 10 maps with the highest likelihoods. Each 1 Mb region also had smaller potential inversions. The SNPs were selected and positioned from unpublished versions of the cat genome assembly, therefore the more robust assembly may show more concordance with the RH maps. In addition, large-scale SNP calling methods may need refinement to improve accuracy and objectivity in assigning call scores for RH data, as amplification levels fluctuate unpredictably between markers and minor differences in call rates for even one marker can dramatically change the final marker order of a map.
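The 25-fold figure can be checked with a one-line ratio, assuming the resolution limit of the 5,000Rad panel is taken as approximately 1 Mb (1,000 kb), which is a simplification of the "exceeding 1 Mb" limit stated above:

```latex
\[
\frac{\text{resolution of 5,000Rad map}}{\text{resolution of 15,000Rad framework maps}}
\approx \frac{1000\ \text{kb}}{40.1\ \text{kb}} \approx 24.9 \approx 25
\]
```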
A recognized challenge of RH panels is that as fused RH clones continue to grow and divide, donor DNA is progressively lost [Karere et al. 2010], continually changing the profile of markers present at each passage. Thus, to be comparable, gene maps must be constructed using DNA from the same passage for a particular clone, which often limits the amount of DNA available for mapping. In addition, large-scale culture steps are required to prepare DNA in sufficient quantities for distribution to multiple laboratories and for large-scale marker typing. The efficiency of RH mapping should be increased by using early passage WGA DNA samples, which have higher retention frequencies and sufficient amounts of DNA. Moreover, WGA may also amplify low-copy fragments that would have otherwise been lost to the analysis. To avoid a significant reduction in the retention frequency, and to make the resource available for the community, each clone was whole genome amplified. Promising results with WGA DNA were obtained on a 10,000Rad panel for the Rhesus macaque [Karere et al. 2010] and a 3,000Rad panel for the gilthead seabream [Senger et al. 2006]. In this cat RH panel, WGA was a reliable method for genome amplification as retention frequencies were comparable between WGA and non-WGA DNA, and only 3% discordancy was detected between STR amplification in WGA versus non-WGA DNA. The low requirements of DNA concentration and sample volume for SNP array technologies and the extensive amplification of RH clone DNA by WGA suggest that the cat 15,000Rad panel should be a sufficient resource for additional mapping on higher density arrays and fine mapping of specific regions.
The ultimate map of a genome is not just the complete sequence of the organism, but also the correct assembly of the sequence with chromosomal assignments and orientation. Thus, the production of a whole-genome sequence does not dismiss the need for accurate and efficient mapping techniques. The cat genome sequencing coverage is shallow at ~3x, and the highly fragmented sequence assembly further highlights the need for high-resolution RH maps in the species. This cat 15,000Rad RH panel should provide support for the continued refinement of the domestic cat genome assembly, particularly since the deeper cat sequence was developed with shorter read technologies that have higher error rates. With a comprehensive high-density, high-resolution RH map, marker order rearrangements can be identified and resolved and cross-species genome comparisons can be accomplished, providing a powerful tool for inferring genome function and evolutionary history.
Funding for this project was provided by NIH-NCRR RR016094, the Center for Companion Animal Health at UC Davis, the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health, and the George and Phyllis Miller Feline Health Fund of the San Francisco Foundation. This investigation was partially conducted in a facility constructed with support from Research Facilities Improvement Program Grant Number C06 RR013556 from NCRR, NIH. Cat genome sequence for SNP detection was provided by Dr. Wesley Warren of the Genome Institute at Washington University School of Medicine. The authors would like to thank Paul Lathrop and Hasan Alhaddad for technical support and advice.