Gene targeting in embryonic stem cells has become the principal technology for manipulation of the mouse genome, offering unrivalled accuracy in allele design and access to conditional mutagenesis. To bring these advantages to the wider research community, large-scale mouse knockout programmes are producing a permanent resource of targeted mutations in all protein-coding genes. Here we report the establishment of a high-throughput gene-targeting pipeline for the generation of reporter-tagged, conditional alleles. Computational allele design, 96-well modular vector construction and high-efficiency gene-targeting strategies have been combined to mutate genes on an unprecedented scale. So far, more than 12,000 vectors and 9,000 conditional targeted alleles have been produced in highly germline-competent C57BL/6N embryonic stem cells. High-throughput genome engineering highlighted by this study is broadly applicable to rat and human stem cells and provides a foundation for future genome-wide efforts aimed at deciphering the function of all genes encoded by the mammalian genome.
We sequenced reduced representation libraries by means of Illumina technology to generate over 1.5 Mb of orthologous sequence from a representative of each of the four extant gibbon genera (Nomascus, Hylobates, Symphalangus, and Hoolock). We used these data to assess the evolutionary relationships between the genera by evaluating the likelihoods of all possible bifurcating trees involving the four taxa. Our analyses provide weak support for a tree with Nomascus and Hylobates as sister taxa and with Hoolock and Symphalangus as sister taxa, though bootstrap resampling suggests that other phylogenetic scenarios are also possible. This uncertainty is due to short internal branch lengths and extensive incomplete lineage sorting across taxa. The true phylogenetic relationships among gibbon genera will likely require a more extensive whole-genome sequence analysis.
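With only four taxa, the tree space is small enough to enumerate exhaustively. The Python sketch below is illustrative only and is not the authors' pipeline. It lists the three unrooted bifurcating topologies for the four genera and counts the rooted bifurcating trees with the standard (2n-3)!! formula. The abstract does not say whether rooted or unrooted trees were scored, and the function names are hypothetical.

```python
GENERA = ("Nomascus", "Hylobates", "Symphalangus", "Hoolock")


def quartet_topologies(taxa):
    """Return the three unrooted bifurcating topologies for four taxa.

    Each topology is described by the two sister pairs that sit on either
    side of the single internal branch.
    """
    if len(taxa) != 4:
        raise ValueError("exactly four taxa are required")
    first = taxa[0]
    topologies = []
    for partner in taxa[1:]:
        left = (first, partner)
        right = tuple(t for t in taxa if t not in left)
        topologies.append((left, right))
    return topologies


def rooted_tree_count(n_tips):
    """Number of rooted bifurcating trees on n labelled tips: (2n-3)!!"""
    count = 1
    for k in range(3, 2 * n_tips - 2, 2):
        count *= k
    return count


if __name__ == "__main__":
    for left, right in quartet_topologies(GENERA):
        print(f"({left[0]}, {left[1]}) | ({right[0]}, {right[1]})")
    print("rooted trees for four taxa:", rooted_tree_count(4))
```

The weakly supported tree described above corresponds to the topology that pairs Nomascus with Hylobates and Symphalangus with Hoolock. In a likelihood comparison, each enumerated topology would be scored on the same alignment.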
Gene duplication is an important source of phenotypic change and adaptive evolution. We use a novel genomic approach to identify highly identical sequence missing from the reference genome, confirming that the cortical development gene Slit-Robo Rho GTPase activating protein 2 (SRGAP2) duplicated three times in humans. We show that the promoter and first nine exons of SRGAP2 duplicated from 1q32.1 (SRGAP2A) to 1q21.1 (SRGAP2B) ~3.4 million years ago (mya). Two larger duplications later copied SRGAP2B to chromosome 1p12 (SRGAP2C) and to proximal 1q21.1 (SRGAP2D), ~2.4 and ~1 mya, respectively. Sequence and expression analysis shows SRGAP2C is the most likely duplicate to encode a functional protein and among the most fixed human-specific duplicate genes. Our data suggest a mechanism where incomplete duplication created a novel function at birth, antagonizing parental SRGAP2 function 2–3 mya, a time corresponding to the transition from Australopithecus to Homo and the beginning of neocortex expansion.
The Chelonid fibropapilloma-associated herpesvirus (CFPHV; ChHV5) is believed to be the causative agent of fibropapillomatosis (FP), a neoplastic disease of marine turtles. While clinical signs and pathology of FP are well known, research on ChHV5 has been impeded because no cell culture system for its propagation exists. We have cloned a BAC containing ChHV5 in pTARBAC2.1 and determined its nucleotide sequence. Accordingly, ChHV5 has a type D genome and its predominant gene order is typical for the varicellovirus genus within the alphaherpesvirinae. However, at least four genes that are atypical for an alphaherpesvirus genome were also detected, i.e. two members of the C-type lectin-like domain superfamily (F-lec1, F-lec2), an orthologue to the mouse cytomegalovirus M04 (F-M04) and a viral sialyltransferase (F-sial). Four lines of evidence suggest that these atypical genes are truly part of the ChHV5 genome: (1) the pTARBAC insertion interrupted the UL52 ORF, leaving parts of the gene to either side of the insertion and suggesting that an intact molecule had been cloned. (2) Using FP-associated UL52 (F-UL52) as an anchor and the BAC-derived sequences as a means to generate primers, overlapping PCR was performed with tumor-derived DNA as template, which confirmed the presence of the same stretch of “atypical” DNA in independent FP cases. (3) Pyrosequencing of DNA from independent tumors did not reveal previously undetected viral sequences, suggesting that no apparent loss of viral sequence had happened due to the cloning strategy. (4) The simultaneous presence of previously known ChHV5 sequences and F-sial as well as F-M04 sequences was also confirmed in geographically distinct Australian cases of FP. Finally, transcripts of F-sial and F-M04 but not transcripts of lytic viral genes were detected in tumors from Hawaiian FP-cases. Therefore, we suggest that F-sial and F-M04 may play a role in FP pathogenesis.
Nuclear lamins are usually classified as A-type (lamins A and C) or B-type (lamins B1 and B2). A-type lamins have been implicated in multiple genetic diseases but are not required for cell growth or development. In contrast, B-type lamins have been considered essential in eukaryotic cells, with crucial roles in DNA replication and in the formation of the mitotic spindle. Knocking down the genes for B-type lamins (LMNB1, LMNB2) in HeLa cells has been reported to cause apoptosis. In the current study, we created conditional knockout alleles for mouse Lmnb1 and Lmnb2, with the goal of testing the hypothesis that B-type lamins are crucial for the growth and viability of mammalian cells in vivo. Using the keratin 14-Cre transgene, we bred mice lacking the expression of both Lmnb1 and Lmnb2 in skin keratinocytes (Lmnb1Δ/ΔLmnb2Δ/Δ). Lmnb1 and Lmnb2 transcripts were absent in keratinocytes of Lmnb1Δ/ΔLmnb2Δ/Δ mice, and lamin B1 and lamin B2 proteins were undetectable. But despite an absence of B-type lamins in keratinocytes, the skin and hair of Lmnb1Δ/ΔLmnb2Δ/Δ mice developed normally and were free of histological abnormalities, even in 2-year-old mice. After an intraperitoneal injection of bromodeoxyuridine (BrdU), similar numbers of BrdU-positive keratinocytes were observed in the skin of wild-type and Lmnb1Δ/ΔLmnb2Δ/Δ mice. Lmnb1Δ/ΔLmnb2Δ/Δ keratinocytes did not exhibit aneuploidy, and their growth rate was normal in culture. These studies challenge the concept that B-type lamins are essential for proliferation and vitality of eukaryotic cells.
Cancer genomes frequently undergo genomic instability resulting in the accumulation of chromosomal rearrangements. To date, one of the main challenges has been to confidently and accurately identify these rearrangements using short-read massively parallel sequencing. We were able to improve cancer rearrangement detection by combining two distinct massively parallel sequencing strategies: fosmid-sized (36 kilobases on average) and standard 5 kilobase mate pair libraries. We applied this strategy to map rearrangements in two breast cancer cell lines, MCF7 and HCC1954. We detect and validate a total of 91 somatic rearrangements in MCF7 and 25 in HCC1954, including genomic alterations corresponding to previously reported transcript aberrations in these two cell lines. Each of the genomes contains two types of breakpoints: clustered and dispersed. In both cell lines, the dispersed breakpoints show enrichment for low copy repeats, while the clustered breakpoints associate with high-copy number amplifications. Comparing the two genomes, we observe highly similar structural mutational spectra affecting different sets of genes, pointing to similar histories of genomic instability against the background of very different gene network perturbations.
fosmid ditag; massively parallel sequencing; gene fusion; copy number variation; genomic instability
The importance of miRNAs during development and disease processes is well established. However, most studies have been done in cells or with patient tissues, and therefore the physiological roles of miRNAs are not well understood. To unravel in vivo functions of miRNAs, we have generated conditional, reporter-tagged knockout-first mice for numerous evolutionarily conserved miRNAs. Here we report the generation of 162 miRNA targeting vectors, 64 targeted ES cell lines, and 46 germline-transmitted miRNA knockout mice. In vivo lacZ reporter analysis in 18 lines revealed highly tissue-specific expression patterns and their miRNA expression profiling matched closely with published expression data. Most miRNA knockout mice tested were viable, supporting a mechanism by which miRNAs act redundantly with other miRNAs or other pathways. These data and collection of resources will be of value for the in vivo dissection of miRNA functions in mouse models.
Lamin B1 is essential for neuronal migration and progenitor proliferation during the development of the cerebral cortex. The distinct phenotypes of Lmnb1- and Lmnb2-knockout mice and the differences in the nuclear morphology of cortical neurons in vivo suggest that lamin B1 and lamin B2 have distinct functions in the developing brain.
Neuronal migration is essential for the development of the mammalian brain. Here, we document severe defects in neuronal migration and reduced numbers of neurons in lamin B1–deficient mice. Lamin B1 deficiency resulted in striking abnormalities in the nuclear shape of cortical neurons; many neurons contained a solitary nuclear bleb and exhibited an asymmetric distribution of lamin B2. In contrast, lamin B2 deficiency led to increased numbers of neurons with elongated nuclei. We used conditional alleles for Lmnb1 and Lmnb2 to create forebrain-specific knockout mice. The forebrain-specific Lmnb1- and Lmnb2-knockout models had a small forebrain with disorganized layering of neurons and nuclear shape abnormalities, similar to abnormalities identified in the conventional knockout mice. A more severe phenotype, complete atrophy of the cortex, was observed in forebrain-specific Lmnb1/Lmnb2 double-knockout mice. This study demonstrates that both lamin B1 and lamin B2 are essential for brain development, with lamin B1 being required for the integrity of the nuclear lamina, and lamin B2 being important for resistance to nuclear elongation in neurons.
Previously, we identified the E3 ubiquitin ligase Idol (inducible degrader of the low-density lipoprotein [LDL] receptor [LDLR]) as a posttranscriptional regulator of the LDLR pathway. Idol stimulates LDLR degradation through ubiquitination of its C-terminal domain, thereby limiting cholesterol uptake. Here we report the generation and characterization of mouse embryonic stem cells homozygous for a null mutation in the Idol gene. Cells lacking Idol exhibit markedly elevated levels of the LDLR protein and increased rates of LDL uptake. Furthermore, despite an intact sterol responsive element-binding protein (SREBP) pathway, Idol-null cells exhibit an altered response to multiple regulators of sterol metabolism, including serum, oxysterols, and synthetic liver X receptor (LXR) agonists. The ability of oxysterols and lipoprotein-containing serum to suppress LDLR protein levels is reduced, and the time course of suppression is delayed, in cells lacking Idol. LXR ligands have no effect on LDLR levels in Idol-null cells, indicating that Idol is required for LXR-dependent inhibition of the LDLR pathway. In line with these results, the half-life of the LDLR protein is prolonged in the absence of Idol. Finally, the ability of statins and PCSK9 to alter LDLR levels is independent of, and additive with, the LXR-Idol pathway. These results demonstrate that the LXR-Idol pathway is an important contributor to feedback inhibition of the LDLR by sterols and a biological determinant of cellular LDL uptake.
Gibbons are small, arboreal, highly endangered apes that are understudied compared with other hominoids. At present, there are four recognized genera and approximately 17 species, all likely to have diverged from each other within the last 5–6 My. Although the gibbon phylogeny has been investigated using various approaches (i.e., vocalization, morphology, mitochondrial DNA, karyotype, etc.), the precise taxonomic relationships are still highly debated. Here, we present the first survey of nuclear sequence variation within and between gibbon species with the goal of estimating basic population genetic parameters. We gathered ∼60 kb of sequence data from a panel of 19 gibbons representing nine species and all four genera. We observe high levels of nucleotide diversity within species, indicative of large historical population sizes. In addition, we find low levels of genetic differentiation between species within a genus comparable to what has been estimated for human populations. This is likely due to ongoing or episodic gene flow between species, and we estimate a migration rate between Nomascus leucogenys and N. gabriellae of roughly one migrant every two generations. Together, our findings suggest that gibbons have had a complex demographic history involving hybridization or mixing between diverged populations.
population history; chromosomal rearrangements; genetic diversity; gibbon; population genetics
The killer cell Ig-like receptors (KIR) of natural killer (NK) cells recognize major histocompatibility complex (MHC) class I ligands and function in placental reproduction and immune defense against pathogens. During the evolution of monkeys, great apes and humans, an ancestral KIR3DL gene expanded to become a diverse and rapidly evolving gene family of four KIR lineages. Characterising the KIR locus are three framework regions, defining two intervals of variable gene-content. By analysis of four KIR haplotypes from two species of gibbon, we find that the smaller apes do not conform to these rules. Although diverse and irregular in structure, the gibbon haplotypes are unusually small, containing only two to five functional genes. Comparison with the predicted ancestral hominoid KIR haplotype indicates that modern gibbon KIR haplotypes were formed by a series of deletion events, which created new hybrid genes as well as eliminating ancestral genes. Of the three framework regions, only KIR3DL3 (lineage V), defining the 5’ end of the KIR locus, is present and intact on all gibbon KIR haplotypes. KIR2DL4 (lineage I) defining the central framework region has been a major target for elimination or inactivation, correlating with the absence of its putative ligand, MHC-G, in gibbons. Similarly, the MHC-C driven expansion of lineage III KIR genes in great apes has not occurred in gibbons because they lack MHC-C. Our results indicate that the selective forces shaping the size and organisation of the gibbon KIR locus differed from those acting upon the KIR of other hominoid species.
Comparative Immunology/Evolution; Reproductive Immunology; Natural Killer Cells; Cell Surface Molecules; MHC
We have developed a new approach to screen bacterial artificial chromosome (BAC) libraries by recombination selection. To test this method, we constructed an orangutan BAC library using an E. coli strain (DY380) with temperature-inducible homologous recombination (HR) capability. We amplified one library segment, induced HR at 42°C to make it recombination proficient, and prepared electrocompetent cells for transformation with a kanamycin cassette to target sequences in the orangutan genome through terminal recombineering homologies. Kanamycin-resistant colonies were tested for BACs containing the targeted genes using a PCR assay that confirms the presence of the kanamycin insertion. The results indicate that this is an effective approach for screening clones. The advantage of recombination screening is that it avoids the high costs associated with the preparation, screening, and archival storage of arrayed BAC libraries. In addition, the screening could conceivably be combined with genetic engineering to create knockout and reporter constructs for functional studies.
Bitter taste perception likely evolved as a protective mechanism against the ingestion of harmful compounds in food. The evolution of the taste receptor type 2 (TAS2R) gene family, which encodes the chemoreceptors that are directly responsible for the detection of bitter compounds, has therefore been of considerable interest. Though TAS2R repertoires have been characterized for a number of species, to date the complement of TAS2Rs from just one bird, the chicken, which had a notably small number of TAS2Rs, has been established. Here, we used targeted mapping and genomic sequencing in the white-throated sparrow (Zonotrichia albicollis) and sample sequencing in other closely related birds to reconstruct the history of a TAS2R gene cluster physically linked to the break points of an evolutionary chromosomal rearrangement. In the white-throated sparrow, this TAS2R cluster encodes up to 18 functional bitter taste receptors and likely underwent a large expansion that predates and/or coincides with the radiation of the Emberizinae subfamily into the New World. In addition to signatures of gene birth-and-death evolution within this cluster, estimates of Ka/Ks for the songbird TAS2Rs were similar to those previously observed in mammals, including humans. Finally, comparison of the complete genomic sequence of the cluster from two common haplotypes in the white-throated sparrow revealed a number of nonsynonymous variants and differences in functional gene content within this species. These results suggest that interspecies and intraspecies genetic variability does exist in avian TAS2Rs and that these differences could contribute to variation in bitter taste perception in birds.
bitter taste receptors; molecular evolution; inversion; duplication
Non-obese diabetic (NOD) mice spontaneously develop type 1 diabetes (T1D) due to the progressive loss of insulin-secreting β-cells by an autoimmune driven process. NOD mice represent a valuable tool for studying the genetics of T1D and for evaluating therapeutic interventions. Here we describe the development and characterization by end-sequencing of bacterial artificial chromosome (BAC) libraries derived from NOD/MrkTac (DIL NOD) and NOD/ShiLtJ (CHORI-29), two commonly used NOD substrains. The DIL NOD library is composed of 196,032 BACs and the CHORI-29 library is composed of 110,976 BACs. The average depth of genome coverage of the DIL NOD library, estimated from mapping the BAC end-sequences to the reference mouse genome sequence, was 7.1-fold across the autosomes and 6.6-fold across the X chromosome. Clones from this library have an average insert size of 150 kb and map to over 95.6% of the reference mouse genome assembly (NCBIm37), covering 98.8% of Ensembl mouse genes. By the same metric, the CHORI-29 library has an average depth over the autosomes of 5.0-fold and 2.8-fold coverage of the X chromosome, the reduced X chromosome coverage being due to the use of a male donor for this library. Clones from this library have an average insert size of 205 kb and map to 93.9% of the reference mouse genome assembly, covering 95.7% of Ensembl genes. We have identified and validated 191,841 single nucleotide polymorphisms (SNPs) for DIL NOD and 114,380 SNPs for CHORI-29. In total we generated 229,736,133 bp of sequence for the DIL NOD and 121,963,211 bp for the CHORI-29. These BAC libraries represent a powerful resource for functional studies, such as gene targeting in NOD embryonic stem (ES) cell lines, and for sequencing and mapping experiments.
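The link between donor sex and X chromosome coverage can be checked with simple arithmetic. The Python sketch below is a rough illustration, not the authors' analysis. It assumes that clones sample the donor's DNA uniformly, so a male donor (one X per two autosome sets) yields about half the autosomal depth on the X. The function names are hypothetical, and the coverage figures are taken from the abstract.

```python
def expected_x_coverage(autosome_coverage, donor_sex):
    """Expected X chromosome depth given the autosomal depth.

    Assumption: uniform sampling of donor DNA, with two X copies per diploid
    genome in a female donor and one in a male donor.
    """
    x_copies = {"female": 2, "male": 1}[donor_sex]
    return autosome_coverage * x_copies / 2


def x_to_autosome_ratio(x_coverage, autosome_coverage):
    """Observed X depth as a fraction of autosomal depth."""
    return x_coverage / autosome_coverage


if __name__ == "__main__":
    # CHORI-29: male donor, 5.0-fold autosomal and 2.8-fold X coverage.
    print("CHORI-29 expected X coverage:", expected_x_coverage(5.0, "male"))
    print("CHORI-29 observed X/autosome ratio:", round(x_to_autosome_ratio(2.8, 5.0), 2))
    # DIL NOD: 7.1-fold autosomal and 6.6-fold X coverage.
    print("DIL NOD observed X/autosome ratio:", round(x_to_autosome_ratio(6.6, 7.1), 2))
```

Under this assumption a male-derived library with 5.0-fold autosomal coverage would be expected to show roughly 2.5-fold X coverage, close to the 2.8-fold reported for CHORI-29.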
Bacterial artificial chromosome; NOD/MrkTac; NOD/ShiLtJ; Mouse genome; Non-obese diabetic (NOD); Type 1 diabetes; T1D; Insulin-dependent diabetes; IDD
The prairie vole (Microtus ochrogaster) is a premier animal model for understanding the genetic and neurological basis of social behaviors. Unlike other biomedical models, prairie voles display a rich repertoire of social behaviors including the formation of long-term pair bonds and biparental care. However, due to a lack of genomic resources for this species, studies have been limited to a handful of candidate genes. To provide a substrate for future development of genomic resources for this unique model organism, we report the construction and characterization of a bacterial artificial chromosome (BAC) library from a single male prairie vole and a prairie vole-mouse (Mus musculus) comparative cytogenetic map.
We constructed a prairie vole BAC library (CHORI-232) consisting of 194,267 recombinant clones with an average insert size of 139 kb. Hybridization-based screening of the gridded library at 19 loci established that the library has an average depth of coverage of ~10×. To obtain a small-scale sampling of the prairie vole genome, we generated 3884 BAC end-sequences totaling ~2.8 Mb. One-third of these BAC-end sequences could be mapped to unique locations in the mouse genome, thereby anchoring 1003 prairie vole BAC clones to an orthologous position in the mouse genome. Fluorescence in situ hybridization (FISH) mapping of 62 prairie vole clones with BAC-end sequences mapping to orthologous positions in the mouse genome was used to develop a first-generation genome-wide prairie vole-mouse comparative cytogenetic map. While conserved synteny was observed between this pair of rodent genomes, rearrangements between the prairie vole and mouse genomes were detected, including a minimum of five inversions and 16 inter-chromosomal rearrangements.
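A depth of coverage near 10× can be translated into the chance that any given locus is represented in the library. The Python sketch below uses the Clarke-Carbon relationship, P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one clone and N is the number of clones. It is an illustration under the assumption of random, uniform cloning, not a calculation reported by the authors, and the function names are hypothetical. Because coverage c = N·f, the per-clone fraction can be written as c/N without assuming a genome size.

```python
import math


def probability_locus_present(n_clones, coverage):
    """Clarke-Carbon probability that a given locus is in at least one clone.

    Assumes random, uniform cloning; each clone carries a fraction
    coverage / n_clones of the genome.
    """
    fraction_per_clone = coverage / n_clones
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones


def clones_needed(target_probability, fraction_per_clone):
    """Number of clones needed to reach a target probability of representation."""
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - fraction_per_clone))


if __name__ == "__main__":
    n_clones = 194_267   # recombinant clones in CHORI-232 (from the abstract)
    coverage = 10.0      # approximate depth from hybridization screening
    p = probability_locus_present(n_clones, coverage)
    print(f"probability a locus is represented: {p:.5f}")
    print("clones needed for 99%:", clones_needed(0.99, coverage / n_clones))
```

For large libraries the result is close to 1 - e^(-c), so a 10× library is expected to miss only a very small fraction of loci if cloning is unbiased. Real libraries show cloning bias, so this is an upper bound rather than a guarantee.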
The construction of the prairie vole BAC library and the vole-mouse comparative cytogenetic map represent the first genome-wide modern genomic resources developed for this species. The BAC library will support future genomic, genetic and molecular characterization of this genome and species, and the isolation of clones of high interest to the vole research community will allow for immediate characterization of the regulatory and coding sequences of genes known to play important roles in social behaviors. In addition, these resources provide an excellent platform for future higher resolution cytogenetic mapping and full genome sequencing.
We constructed Drosophila melanogaster BAC libraries with 21-kb and 83-kb inserts in the P(acman) system. Clones representing 12-fold coverage and encompassing more than 95% of annotated genes were mapped onto the reference genome. These clones can be integrated into predetermined attP sites in the genome using ΦC31 integrase to rescue mutations. They can be modified through recombineering, for example to incorporate protein tags and assess expression patterns.
The gibbon family belongs to the superfamily Hominoidea and includes 15 species divided into four genera. Each genus possesses a distinct karyotype with chromosome numbers varying from 38 to 52. This diversity is the result of numerous chromosomal changes that have accumulated during the evolution of the gibbon lineage, a feature unique in comparison with other hominoids and most other primates. Some gibbon species and subspecies rank among the most endangered primates in the world. Breeding programs can be extremely challenging, and hybridization is one of the important factors responsible for the decline of captive gibbons. With fewer than 500 individuals left in the wild, the northern white-cheeked gibbon (Nomascus leucogenys leucogenys, NLE) is the most endangered primate in a successful captive breeding program. We present here the analysis of an inversion that we show to be specific to the northern white-cheeked gibbon and that can be used as one of the criteria to distinguish this subspecies from other gibbon taxa. Because the sequence spanning one of the inversion breakpoints is available, the inversion can be detected by a simple PCR test, even on low-quality DNA. Our results demonstrate the important role of genomics in providing tools for conservation efforts.
The subtelomeric region of mouse chromosome (Chr) 4 harbors loci with effects on behavior, development, and disease susceptibility. Regions near the telomeres are more difficult to map and characterize than other areas because of the unique features of subtelomeric DNA. As a result of these problems, the available mapping information for this part of mouse Chr 4 was insufficient to pursue candidate gene evaluation. Therefore, we sought to characterize the area in greater detail by creating a comprehensive genetic, physical, and comparative map. We constructed a genetic map that contained 30 markers and covered 13.3 cM; then we created a 1.2-Mb sequence-ready BAC contig, representing a 5.1-cM area, and sequenced a 246-kb mouse BAC from this contig. The resulting sequence, as well as approximately 40 kb of previously deposited genomic sequence, yielded a total of 284 kb of sequence, which contained over 20 putative genes. These putative genes were confirmed by matching ESTs or cDNA in the public databases to the genomic sequence and/or by direct sequencing of cDNA. Comparative genome sequence analysis demonstrated conserved synteny between the mouse and the human genomes (1p36.3). DNA from two strains of mice (C57BL/6ByJ and 129P3/J) was sequenced to detect single nucleotide polymorphisms (SNPs). The frequency of SNPs in this region was more than threefold higher than the genome-wide average for comparable mouse strains (129/Sv and C57BL/6J). The resulting SNP map, in conjunction with the sequence annotation and with physical and genetic maps, provides a detailed description of this gene-rich region. These data will facilitate genetic and comparative mapping studies and identification of a large number of novel candidate genes for the trait loci mapped to this region.
BAC libraries generated from restriction-digested genomic DNA display representational bias and lack some sequences. To facilitate completion of genome projects, procedures have been developed to create BACs from DNA physically sheared to create fragments extending up to 200 kb. The DNA fragments were repaired to create blunt ends and ligated to a new BAC vector. This approach has been tested by generating BAC libraries from Drosophila DNA, with average insert lengths between 50 and 150 kb. The libraries lack chimeric clone problems, as determined by mapping paired BAC-end sequences to the assembled fly genome sequence. The utility of “sheared” libraries was demonstrated by closure of a previous clone gap and by isolation of clones from telomeric regions, which were notably absent from previous Drosophila BAC libraries.
bacterial artificial chromosome; BAC; sheared DNA; cloning; vector; adaptor; telomere; centromere and heterochromatin
The mammalian evolutionary history of chromosome 13 was characterized, and evolutionarily new centromeres were compared to two human neocentromeres at 13q21 using chromatin immunoprecipitation and genomic microarrays.
Evolutionary centromere repositioning and human analphoid neocentromeres occurring in clinical cases are, very likely, two stages of the same phenomenon whose properties still remain substantially obscure. Chromosome 13 is the chromosome with the highest number of neocentromeres. We reconstructed the mammalian evolutionary history of this chromosome and characterized two human neocentromeres at 13q21, in search of information that could improve our understanding of the relationship between evolutionarily new centromeres, inactivated centromeres, and clinical neocentromeres.
Chromosome 13 evolution was studied, using FISH experiments, across several diverse superordinal phylogenetic clades spanning >100 million years of evolution. The analysis revealed exceptional conservation among primates (hominoids, Old World monkeys, and New World monkeys), Carnivora (cat), Perissodactyla (horse), and Cetartiodactyla (pig). In contrast, the centromeres in both Old World monkeys and pig have apparently repositioned independently to a central location (13q21). We compared these results to the positions of two human 13q21 neocentromeres using chromatin immunoprecipitation and genomic microarrays.
We show that a gene-desert region at 13q21 of approximately 3.9 Mb in size possesses an inherent potential to form evolutionarily new centromeres over, at least, approximately 95 million years of mammalian evolution. The striking absence of genes may represent an important property, making the region tolerant to the extensive pericentromeric reshuffling during subsequent evolution. Comparison of the pericentromeric organization of chromosome 13 in four Old World monkey species revealed many differences in sequence organization. The region contains clusters of duplicons showing peculiar features.
As farming of Atlantic salmon is growing as an aquaculture enterprise, the need to identify the genomic mechanisms for specific traits is becoming more important in breeding and management of the animal. Traits of importance might be related to growth, disease resistance, food conversion efficiency, color or taste. To identify genomic regions responsible for specific traits, genomic large insert libraries have previously proven to be of crucial importance. These large insert libraries can be screened using gene or genetic markers in order to identify and map regions of interest. Furthermore, large-scale mapping can utilize highly redundant libraries in genome projects, and hence provide valuable data on the genome structure.
Here we report the construction and characterization of a highly redundant bacterial artificial chromosome (BAC) library constructed from a male Atlantic salmon (Salmo salar) of a Norwegian aquaculture strain. The library consists of 305 557 clones, of which approximately 299 000 are recombinants. The average insert size of the library is 188 kbp, representing 18-fold genome coverage. High-density filters, each consisting of 18 432 clones spotted in duplicate, have been produced for hybridization screening and are publicly available. To characterize the library, 15 overgo probes derived from expressed sequence tags (ESTs) and 12 oligo sequences derived from microsatellite markers were used in hybridization screening of the complete BAC library. Secondary hybridizations with individual probes were performed for the clones detected. The BACs positive for the EST probes were fingerprinted and mapped into contigs, yielding an average of 3 contigs for each probe. Clones identified using genomic probes were PCR verified using microsatellite specific primers.
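The coverage figure follows from the clone count and insert size. The Python sketch below is a back-of-the-envelope check rather than the authors' calculation. It uses coverage = recombinant clones × mean insert size / genome size, and inverts that relationship to show the genome size implied by the reported 18-fold coverage. The abstract does not state the genome size that was used, so the implied value is an inference, and the function names are hypothetical.

```python
def library_coverage(n_recombinant, mean_insert_bp, genome_size_bp):
    """Fold coverage of a clone library: total cloned DNA / genome size."""
    return n_recombinant * mean_insert_bp / genome_size_bp


def implied_genome_size_bp(n_recombinant, mean_insert_bp, coverage):
    """Genome size implied by a reported fold coverage."""
    return n_recombinant * mean_insert_bp / coverage


if __name__ == "__main__":
    n_total = 305_557         # all clones (from the abstract)
    n_recombinant = 299_000   # approximate recombinant clones
    mean_insert_bp = 188_000  # 188 kbp average insert

    genome = implied_genome_size_bp(n_recombinant, mean_insert_bp, 18)
    print(f"recombinant fraction: {n_recombinant / n_total:.3f}")
    print(f"total cloned DNA: {n_recombinant * mean_insert_bp / 1e9:.1f} Gbp")
    print(f"implied genome size: {genome / 1e9:.2f} Gbp")
```

The same function can be rerun with an independent genome size estimate to see how sensitive the fold coverage is to that choice.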
Identification of genes and genomic regions of interest is greatly aided by the availability of the CHORI-214 Atlantic salmon BAC library. We have demonstrated the library's ability to identify specific genes and genetic markers using hybridization, PCR and fingerprinting experiments. In addition, multiple fingerprinting contigs indicated pseudo-tetraploidy of the Atlantic salmon genome. The highly redundant CHORI-214 BAC library is expected to be an important resource for mapping and sequencing of the Atlantic salmon genome.
Using the human bacterial artificial chromosome (BAC) fingerprint-based physical map, genome sequence assembly and BAC end sequences, we have generated a fingerprint-validated set of 32 855 BAC clones spanning the human genome. The clone set provides coverage for at least 98% of the human fingerprint map, 99% of the current assembled sequence and has an effective resolving power of 79 kb. We have made the clone set publicly available, anticipating that it will generally facilitate FISH or array-CGH-based identification and characterization of chromosomal alterations relevant to disease.
It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined.
We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences.
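The idea that spatial clusters of CNCSs can flag candidate enhancers can be sketched as a simple interval-grouping step. The Python example below is a hypothetical illustration and not the method used in the study. It groups conserved intervals whose gaps fall within a chosen distance and reports clusters with enough members as candidate regions. The thresholds (`max_gap`, `min_members`) and the example coordinates are invented for demonstration.

```python
def cluster_intervals(intervals, max_gap):
    """Group (start, end) intervals whose gaps are at most max_gap bp."""
    clusters = []
    cluster_end = None
    for start, end in sorted(intervals):
        if clusters and start - cluster_end <= max_gap:
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


def candidate_regions(intervals, max_gap=200, min_members=3):
    """Return (start, end) spans of clusters with at least min_members intervals.

    Both thresholds are arbitrary placeholders, not values from the study.
    """
    regions = []
    for cluster in cluster_intervals(intervals, max_gap):
        if len(cluster) >= min_members:
            regions.append((cluster[0][0], max(end for _, end in cluster)))
    return regions


if __name__ == "__main__":
    # Invented coordinates for conserved non-coding blocks along one locus.
    cncs = [(100, 150), (180, 230), (260, 300), (5000, 5040), (9000, 9050), (9100, 9160)]
    print(candidate_regions(cncs))
```

In practice the thresholds would need to be calibrated against known enhancers, and any prediction would still require experimental validation.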
Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.
In 2007, the International Knockout Mouse Consortium (IKMC) made the ambitious promise to generate mutations in virtually every protein-coding gene of the mouse genome in a concerted worldwide action. Now, 5 years later, the IKMC members have developed high-throughput gene trapping and, in particular, gene-targeting pipelines and generated more than 17,400 mutant murine embryonic stem (ES) cell clones and more than 1,700 mutant mouse strains, most of them conditional. A common IKMC web portal (www.knockoutmouse.org) has been established, allowing easy access to this unparalleled biological resource. The IKMC materials considerably enhance functional gene annotation of the mammalian genome and will have a major impact on future biomedical research.
The evolution of the amniotic egg was one of the great evolutionary innovations in the history of life, freeing vertebrates from an obligatory connection to water and thus permitting the conquest of terrestrial environments [1]. Among amniotes, genome sequences are available for mammals [2] and birds [3–5], but not for non-avian reptiles. Here we report the genome sequence of the North American green anole lizard, Anolis carolinensis. We find that A. carolinensis microchromosomes are highly syntenic with chicken microchromosomes, yet do not exhibit the high GC and low repeat content that are characteristic of avian microchromosomes [3]. Also, A. carolinensis mobile elements are very young and diverse, more so than in any other sequenced amniote genome. This lizard genome’s GC content is also unusual in its homogeneity, unlike the regionally variable GC content found in mammals and birds [6]. We describe and assign sequence to the previously unknown A. carolinensis X chromosome. Comparative gene analysis shows that amniote egg proteins have evolved significantly more rapidly than other proteins. An anole phylogeny resolves basal branches to illuminate the history of their repeated adaptive radiations.