Neurofibromatosis Type 1 (NF1) is a genetic disorder that is driven by the loss of neurofibromin (Nf) protein function. Nf contains a Ras GTPase activating domain (Ras-GAP), which directly regulates Ras signaling. Numerous clinical manifestations are associated with the loss of Nf and increased Ras activity. Ras proteins must be prenylated in order to traffic and functionally localize with target membranes. Hence, Ras is a potential therapeutic target for treating NF1. We have tested the efficacy of two novel farnesyl transferase inhibitors (FTIs), 1 and 2, alone or in combination with lovastatin, on two NF1 malignant peripheral nerve sheath tumor (MPNST) cell lines, NF90-8 and ST88-14. Single treatments of 1, 2, or lovastatin had no effect on MPNST cell proliferation. However, low micromolar combinations of 1 or 2 with lovastatin (FTI/lovastatin) reduced Ras prenylation in both MPNST cell lines. Further, this FTI/lovastatin combination treatment reduced cell proliferation and induced an apoptotic response as shown by morphological analysis, pro-caspase-3/-7 activation, loss of mitochondrial membrane potential, and accumulation of cells with sub-G1 DNA content. Little to no detectable toxicity was observed in normal rat Schwann cells following FTI/lovastatin combination treatment. These data support the hypothesis that combination FTI plus lovastatin therapy may be a potential treatment for NF1 MPNSTs.
De novo mutations affect risk for many diseases and disorders, especially those with early onset. An example is autism spectrum disorders (ASD). Four recent whole-exome sequencing (WES) studies of ASD families revealed a handful of novel risk genes, based on independent de novo loss-of-function (LoF) mutations falling in the same gene, and found that de novo LoF mutations occurred at a twofold higher rate than expected by chance. However successful these studies were, they used only a small fraction of the data, excluding other types of de novo mutations and inherited rare variants. Moreover, such analyses cannot readily incorporate data from case-control studies. An important research challenge in gene discovery, therefore, is to develop statistical methods that accommodate a broader class of rare variation. We develop methods that can incorporate WES data regarding de novo mutations, inherited variants present, and variants identified within cases and controls. TADA, for Transmission And De novo Association, integrates these data by a gene-based likelihood model involving parameters for allele frequencies and gene-specific penetrances. Inference is based on a Hierarchical Bayes strategy that borrows information across all genes to infer parameters that would be difficult to estimate for individual genes. In addition to theoretical development, we validated TADA using realistic simulations mimicking rare, large-effect mutations affecting risk for ASD and show that it has dramatically better power than other common methods of analysis. Thus TADA's integration of various kinds of WES data can be a highly effective means of identifying novel risk genes. Indeed, application of TADA to WES data from subjects with ASD and their families, as well as from a study of ASD subjects and controls, revealed several novel and promising ASD candidate genes with strong statistical support.
The genetic underpinnings of autism spectrum disorder (ASD) have proven difficult to determine, despite a wealth of evidence for genetic causes and ongoing effort to identify genes. Recently investigators sequenced the coding regions of the genomes from ASD children along with their unaffected parents (ASD trios) and identified numerous new candidate genes by pinpointing spontaneously occurring (de novo) mutations in the affected offspring. A gene with a severe (de novo) mutation observed in more than one individual is immediately implicated in ASD; however, the majority of severe mutations are observed only once per gene. These genes create a short list of candidates, and our results suggest about 50% are true risk genes. To strengthen our inferences, we develop a novel statistical method (TADA) that utilizes inherited variation transmitted to affected offspring in conjunction with (de novo) mutations to identify risk genes. Through simulations we show that TADA dramatically increases power. We apply this approach to nearly 1000 ASD trios and 2000 subjects from a case-control study and identify several promising genes. Through simulations and application we show that TADA's integration of sequencing data can be a highly effective means of identifying risk genes.
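The reasoning that a gene hit by de novo mutations in more than one individual is immediately implicated can be illustrated with a simple Poisson baseline. The sketch below is not the TADA likelihood model; it only estimates how unlikely recurrent de novo hits in one gene are by chance. The per-gene mutation rate and the function name are hypothetical values chosen for illustration.

```python
import math


def prob_recurrent_de_novo(n_trios, mu_per_chrom, k=2):
    """Chance probability of seeing at least k de novo mutations in one gene.

    Assumes mutation counts follow a Poisson distribution with mean
    2 * n_trios * mu_per_chrom (two transmitted chromosomes per child).
    This is a simplified null model, not the TADA model.
    """
    lam = 2 * n_trios * mu_per_chrom
    # Probability of observing fewer than k events under the null.
    below_k = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below_k


# Hypothetical inputs: 1000 trios, LoF mutation rate of 2e-6 per gene per chromosome.
p_two_hits = prob_recurrent_de_novo(n_trios=1000, mu_per_chrom=2e-6)
print(f"P(>=2 de novo LoF in a given gene by chance) = {p_two_hits:.2e}")
```

Under these assumed inputs the per-gene chance probability is small, which is why recurrence is persuasive, whereas a single hit per gene carries much weaker evidence and motivates combining it with inherited and case-control data.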
Osteogenesis imperfecta (OI), Ehlers-Danlos syndrome (EDS), and osteopetrosis (OPT) are collectively common inherited skeletal diseases. Evaluation of subjects with these conditions often includes molecular testing, which has important counseling, therapeutic, and sometimes legal implications. Since several different genes have been implicated in these conditions, Sanger sequencing of each gene can be a prohibitively expensive and time-consuming way to reach a molecular diagnosis.
In order to circumvent these problems, we have designed and tested an NGS platform that would allow simultaneous sequencing on a single diagnostic platform of different genes implicated in OI, OPT, EDS, and other inherited conditions leading to low or high bone mineral density. We used a liquid-phase probe library that captures 602 exons (~100 kb) of 34 selected genes and have applied it to test clinical samples from patients with bone disorders.
NGS of the captured exons by Illumina HiSeq2000 resulted in an average coverage of over 900X. The platform was successfully validated by identifying mutations in 6 patients with known mutations. Moreover, in 4 patients with OI or OPT without a prior molecular diagnosis, the assay was able to detect the causative mutations.
In conclusion, our NGS panel provides a fast and accurate method to arrive at a molecular diagnosis in most patients with inherited high or low bone mineral density disorders.
Isoprenylcysteine carboxyl methyltransferases (Icmts) are a class of integral membrane protein methyltransferases localized to the endoplasmic reticulum (ER) membrane in eukaryotes. The Icmts from human (hIcmt) and S. cerevisiae (Ste14p) catalyze the α-carboxyl methyl esterification step in the post-translational processing of CaaX proteins, including the yeast a-factor mating pheromones and both human and yeast Ras proteins. Herein, we evaluated synthetic analogs of two well-characterized Icmt substrates, N-acetyl-S-farnesyl-L-cysteine (AFC) and the yeast a-factor peptide mating pheromone, that contain photoactive benzophenone moieties in either the lipid or peptide portion of the molecule. The AFC-based compounds were substrates for both hIcmt and Ste14p, whereas the a-factor analogs were only substrates for Ste14p. However, the a-factor analogs were found to be micromolar inhibitors of hIcmt. Together, these data suggest that the Icmt substrate binding site is dependent upon features in both the isoprenyl moiety and upstream amino acid composition and that hIcmt and Ste14p have overlapping, yet distinct, substrate specificities. Photocrosslinking and neutravidin-agarose capture experiments with these analogs revealed that both hIcmt and Ste14p were specifically photolabeled to varying degrees with all of the compounds tested. These data suggest that these analogs will be useful for the future identification of the Icmt substrate binding sites.
Icmt; Ste14p; a-factor; photocrosslinking; benzophenone; methyltransferase
Next-generation exome sequencing (ES) and whole genome sequencing (WGS) are powerful new tools for discovering the gene(s) that underlie Mendelian disorders. To accelerate these discoveries, the National Institutes of Health has established three Centers for Mendelian Genomics (CMGs): the Center for Mendelian Genomics at the University of Washington; the Center for Mendelian Disorders at Yale University; and the Baylor-Johns Hopkins Center for Mendelian Genomics at Baylor College of Medicine and Johns Hopkins University. The CMGs will provide ES/WGS and extensive analysis expertise at no cost to collaborating investigators where the causal gene(s) for a Mendelian phenotype has yet to be uncovered. Over the next few years and in collaboration with the global human genetics community, the CMGs hope to facilitate the identification of the genes underlying a very large fraction of all Mendelian disorders (see http://mendelian.org).
mendelian; exome sequencing; commentary
Genitopatellar syndrome (GPS) and Say-Barber-Biesecker-Young-Simpson syndrome (SBBYSS or Ohdo syndrome) have both recently been shown to be caused by distinct mutations in the histone acetyltransferase KAT6B (a.k.a. MYST4/MORF). All variants are de novo dominant mutations that lead to protein truncation. Mutations leading to GPS occur in the proximal portion of the last exon and lead to the expression of a protein without an activation domain. Mutations leading to SBBYSS occur either throughout the gene, leading to nonsense-mediated decay, or more distally in the last exon. Features present only in GPS are contractures, anomalies of the spine, ribs and pelvis, renal cysts, hydronephrosis and agenesis of the corpus callosum. Features present only in SBBYSS include long thumbs and long great toes and lacrimal duct abnormalities. Several features occur in both, such as intellectual disability, congenital heart defects, genital and patellar anomalies. We propose that haploinsufficiency or loss of a function mediated by the C-terminal domain causes the common features, whereas gain-of-function activities would explain the features unique to GPS. Further molecular studies and the compilation of mutations in a database for genotype-phenotype correlations (www.LOVD.nl/KAT6B) might help tease out answers to these questions and understand the developmental programs dysregulated by the different truncations.
KAT6B; MYST4; mutation database; Genitopatellar syndrome; Ohdo Syndrome
The debate regarding the relative merits of whole genome sequencing (WGS) versus exome sequencing (ES) centers around comparative cost, average depth of coverage for each interrogated base, and their relative efficiency in the identification of medically actionable variants from the myriad of variants identified by each approach. Nevertheless, few genomes have been subjected to both WGS and ES, using multiple next generation sequencing platforms. In addition, no personal genome has been so extensively analyzed using DNA derived from peripheral blood as opposed to DNA from transformed cell lines that may either accumulate mutations during propagation or clonally expand mosaic variants during cell transformation and propagation.
We investigated a genome that had been studied previously by SOLiD chemistry using both ES and WGS, and performed six independent ES assays (Illumina GAII (x2), Illumina HiSeq (x2), Life Technologies' Personal Genome Machine (PGM) and Proton) and one additional WGS (Illumina HiSeq).
We compared the variants identified by the different methods and provide insights into the differences among variants identified between ES runs in the same technology platform and among different sequencing technologies. We resolved the true genotypes of medically actionable variants identified in the proband through orthogonal experimental approaches. Furthermore, ES identified an additional SH3TC2 variant (p.M1?) that likely contributes to the phenotype in the proband.
ES identified additional medically actionable variant calls and helped resolve ambiguous single nucleotide variants (SNV) documenting the power of increased depth of coverage of the captured targeted regions. Comparative analyses of WGS and ES reveal that pseudogenes and segmental duplications may explain some instances of apparent disease mutations in unaffected individuals.
Exome sequencing; Whole-genome sequencing; Incidental findings; SH3TC2; Personal genomes; Precision medicine
Czech dysplasia, metatarsal type is an autosomal dominant skeletal disorder that is characterized by early-onset, progressive arthritis, brachydactyly of the 3rd and 4th toes, and characteristic radiographic findings in patients of normal stature. Patients with Czech dysplasia typically present in late childhood or later. In the present report, whole exome sequencing identified a mutation in COL2A1 (c.823C>T, p.R275C) known to be associated with Czech dysplasia in a 3.5 year old female who had a family history of early-onset arthritis and who was asymptomatic except for prominent knees. The use of whole exome sequencing facilitated diagnosis of this rare disease (less than 15 families in the literature) in the presymptomatic period and thus enabled us to provide early anticipatory guidance and genetic counseling for the family.
Czech dysplasia; skeletal dysplasia; prominent knees; early-onset osteoarthritis; Depressed nasal bridge; Brachydactyly of 3rd and 4th toes; Normal stature; Early-onset arthritis
Transposable elements (TEs) are abundant in the human genome, and some are capable of generating new insertions through RNA intermediates. In cancer, the disruption of cellular mechanisms that normally suppress TE activity may facilitate mutagenic retrotranspositions. We performed single-nucleotide resolution analysis of TE insertions in 43 high-coverage whole-genome sequencing data sets from five cancer types. We identified 194 high-confidence somatic TE insertions, as well as thousands of polymorphic TE insertions in matched normal genomes. Somatic insertions were present in epithelial tumors but not in blood or brain cancers. Somatic L1 insertions tend to occur in genes that are commonly mutated in cancer, disrupt the expression of the target genes, and are biased toward regions of cancer-specific DNA hypomethylation, highlighting their potential impact in tumorigenesis.
Human diseases are caused by alleles that encompass the full range of variant types, from single-nucleotide changes to copy-number variants, and these variations span a broad frequency spectrum, from the very rare to the common. The picture emerging from analysis of whole-genome sequences, the 1000 Genomes Project pilot studies, and targeted genomic sequencing derived from very large sample sizes reveals an abundance of rare and private variants. One implication of this realization is that recent mutation may have a greater influence on disease susceptibility or protection than is conferred by variations that arose in distant ancestors.
Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges.
whole-genome sequencing (WGS); exome sequencing; simple nucleotide variation (SNV); structural variation; personal genomics
Since the initial report of targeted enrichment (Albert et al., 2007) we have been evolving the design and utility of capture reagents and methods, while taking advantage of the parallel advances in sequencing platforms. New exome designs target a comprehensive set of coding exons from 6 different gene databases, as well as computationally predicted coding and non-coding elements: regulatory regions, and conserved UTRs. Library automation, reduction of DNA input samples, capture hybridization multiplexing and application of faster read mapping tools such as BWA together allow a rate of >4,300 libraries/captures per month, with >40,000 exome and regional capture libraries completed to date. In addition, a fully integrated informatics and analysis pipeline (Mercury) supports all aspects of data flow and analysis, from the initial data production on the sequencing instrument to annotated variant calls (SNPs and small indels). These laboratory methods and analysis pipelines have been production hardened at the Human Genome Sequencing Center (HGSC) and have now been applied toward clinical exome sequencing. Through a joint collaboration between the Human Genome Sequencing Center and the Medical Genetics Laboratories (MGL) of the Department of Molecular and Human Genetics, clinical exome sequencing and interpretation are now provided through the CAP/CLIA certified Whole Genome Laboratory (WGL). To date, the WGL has completed exome sequencing of 650 patient samples, and final interpretation has been completed for over 450 patients, with causative deleterious mutations identified in 25% of cases. Performance has been maintained to a high standard of 95% of the exome target bases represented at 20X coverage. Overall exome performance metrics, LIMS support, variant analysis and validation of the clinical pipeline for a CAP/CLIA environment will be presented.
Next generation sequencing platforms have greatly reduced sequencing costs, leading to the production of unprecedented amounts of sequence data. BWA is one of the most popular alignment tools due to its relatively high accuracy. However, mapping reads using BWA is still the most time consuming step in sequence analysis. Increasing mapping efficiency would allow the community to better cope with ever expanding volumes of sequence data.
We designed a new program, CGAP-align, that achieves a performance improvement over BWA without sacrificing recall or precision. This is accomplished through the use of Suffix Tarray, a novel data structure combining elements of Suffix Array and Suffix Tree. We also utilize a tighter lower bound estimation for the number of mismatches in a read, allowing for more effective pruning during inexact mapping. Evaluation of both simulated and real data suggests that CGAP-align consistently outperforms the current version of BWA and can achieve over twice its speed under certain conditions, all while obtaining nearly identical results.
CGAP-align is a new time efficient read alignment tool that extends and improves BWA. The increase in alignment speed will be of critical assistance to all sequence-based research and medicine. CGAP-align is freely available to the academic community at http://sourceforge.net/p/cgap-align under the GNU General Public License (GPL).
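As background for the data structures mentioned above, the following minimal sketch shows exact pattern lookup with a plain suffix array and binary search. It does not reproduce the Suffix Tarray structure or the mismatch lower bound used by CGAP-align, whose details are not given here; the function names are hypothetical and the naive construction is only suitable for short example strings.

```python
def build_suffix_array(text):
    """Return suffix start positions sorted by suffix (naive construction)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return sorted start positions of exact matches of pattern in text."""
    m = len(pattern)

    # First suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


reference = "GATTACAGATTA"
sa = build_suffix_array(reference)
print(find_occurrences(reference, sa, "ATTA"))
```

All suffixes that begin with the pattern occupy one contiguous interval of the suffix array, so two binary searches are enough to locate every exact match. Inexact mapping extends this idea by exploring neighbouring intervals while pruning branches that cannot stay within the allowed number of mismatches.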
Elephant endotheliotropic herpesvirus 1A is a member of the Proboscivirus genus and is a major cause of fatal hemorrhagic disease in endangered juvenile Asian elephants worldwide. Here, we report the first complete genome sequence from this genus, obtained directly from necropsy DNA, in which 60 of the 115 predicted genes are not found in any known herpesvirus.
We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analysis of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD.
This study evaluates association of rare variants and autism spectrum disorders (ASD) in case and control samples sequenced by two centers. Before doing association analyses, we studied how to combine information across studies. We first harmonized the whole-exome sequence (WES) data, across centers, in terms of the distribution of rare variation. Key features included filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. After filtering, the vast majority of variant calls from seven samples sequenced at both centers matched. We also evaluated whether one should combine summary statistics from data from each center (meta-analysis) or combine data and analyze it together (mega-analysis). For many gene-based tests, we showed that mega-analysis yields more power. After quality control of data from 1,039 ASD cases and 870 controls and a range of analyses, no gene showed exome-wide evidence of significant association. Our results comport with recent results demonstrating that hundreds of genes affect risk for ASD; they suggest that rare risk variants are scattered across these many genes, and thus larger samples will be required to identify those genes.
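The three filtering criteria named above (fraction of missing data, read depth, and balance of alternative to reference reads) can be sketched as a simple predicate. The thresholds, field names, and class below are hypothetical placeholders; the study's actual cutoffs are not stated here. The allele-balance check as written is meant for heterozygous calls.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    missing_fraction: float  # fraction of samples with no genotype at this site
    depth: int               # read depth supporting this call
    alt_reads: int           # reads carrying the alternative allele
    ref_reads: int           # reads carrying the reference allele


def passes_filters(call, max_missing=0.10, min_depth=10, balance_range=(0.3, 0.7)):
    """Return True if a heterozygous call meets all three (hypothetical) cutoffs."""
    total = call.alt_reads + call.ref_reads
    if total == 0:
        return False  # no allele-level evidence to assess balance
    balance = call.alt_reads / total
    return (
        call.missing_fraction <= max_missing
        and call.depth >= min_depth
        and balance_range[0] <= balance <= balance_range[1]
    )


calls = [
    VariantCall(missing_fraction=0.02, depth=40, alt_reads=19, ref_reads=21),
    VariantCall(missing_fraction=0.02, depth=40, alt_reads=3, ref_reads=37),
    VariantCall(missing_fraction=0.25, depth=40, alt_reads=20, ref_reads=20),
]
kept = [c for c in calls if passes_filters(c)]
print(f"kept {len(kept)} of {len(calls)} calls")
```

Applying one common predicate to both centers' call sets is one way to make the distributions of rare variation comparable before deciding between meta- and mega-analysis.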
Polymicrogyria is a disorder of neuronal development resulting in structurally abnormal cerebral hemispheres characterized by over-folding and abnormal lamination of the cerebral cortex. Polymicrogyria is frequently associated with severe neurologic deficits including intellectual disability, motor problems, and epilepsy. There are acquired and genetic causes of polymicrogyria, but most patients with a presumed genetic etiology lack a specific diagnosis. Here we report using whole-exome sequencing to identify compound heterozygous mutations in the WD repeat domain 62 (WDR62) gene as the cause of recurrent polymicrogyria in a sibling pair. Sanger sequencing confirmed that the siblings both inherited 1-bp (maternal allele) and 2-bp (paternal allele) frameshift deletions, which predict premature truncation of WDR62, a protein that has a role in early cortical development. The probands are from a non-consanguineous family of Northern European descent, suggesting that autosomal recessive PMG due to compound heterozygous mutation of WDR62 might be a relatively common cause of PMG in the population. Further studies to identify mutation frequency in the population are needed.
malformations of cortical development; high-throughput nucleotide sequencing; genetic testing; epilepsy; intellectual disability
schizophrenia; sequencing; SNV; genetic; association; mutation; DISC1
Molecular diagnostics can resolve locus heterogeneity underlying clinical phenotypes that may otherwise be co-assigned as a specific syndrome based on shared clinical features, and can associate phenotypically diverse diseases to a single locus through allelic affinity. Here we describe an apparently novel syndrome, likely caused by de novo truncating mutations in ASXL3, which shares characteristics with Bohring-Opitz syndrome, a disease associated with de novo truncating mutations in ASXL1.
We used whole-genome and whole-exome sequencing to interrogate the genomes of four subjects with an undiagnosed syndrome.
Using genome-wide sequencing, we identified heterozygous, de novo truncating mutations in ASXL3, a transcriptional repressor related to ASXL1, in four unrelated probands. We found that these probands shared similar phenotypes, including severe feeding difficulties, failure to thrive, and neurologic abnormalities with significant developmental delay. Further, they showed less phenotypic overlap with patients who had de novo truncating mutations in ASXL1.
We have identified truncating mutations in ASXL3 as the likely cause of a novel syndrome with phenotypic overlap with Bohring-Opitz syndrome.
We report the design and synthesis of novel FTPA-triazole compounds as potent inhibitors of isoprenylcysteine carboxyl methyltransferase (Icmt), through a focus on thioether and isoprenoid mimetics. These mimetics were coupled utilizing a copper-assisted cycloaddition to assemble the potential inhibitors. Using the resulting triazole from the coupling as an isoprenyl mimetic resulted in the biphenyl substituted FTPA triazole 10n. This lipid-modified analog is a potent inhibitor of Icmt (IC50 = 0.8 ± 0.1 μM; calculated Ki = 0.4 μM).
Isoprenylcysteine carboxyl methyltransferase (Icmt); Ras; prenylcysteine; dipolar cycloaddition; S-farnesyl-thiopropionic acid (FTPA); triazole
Human protein isoprenylcysteine carboxyl methyltransferase (hIcmt) is the enzyme responsible for the α-carboxyl methylation of the C-terminal isoprenylated cysteine of CaaX proteins, including Ras proteins. This specific posttranslational methylation event has been shown to be important for cellular transformation by oncogenic Ras isoforms. This finding led to interest in hIcmt inhibitors as potential anti-cancer agents. Previous analog studies based on N-acetyl-S-farnesylcysteine identified two prenylcysteine-based low micromolar inhibitors (1a and 1b) of hIcmt, each bearing a phenoxyphenyl amide modification. In this study, a focused library of analogs of 1a and 1b was synthesized and screened versus hIcmt, delineating structural features important for inhibition. Kinetic characterization of the most potent analogs 1a and 1b established that both inhibitors exhibited mixed-mode inhibition and that the competitive component predominated. Using the Cheng-Prusoff method, the Ki values were determined from the IC50 values. Analog 1a has a KIC of 1.4 ± 0.2 μM and a KIU of 4.8 ± 0.5 μM, while 1b has a KIC of 0.5 ± 0.07 μM and a KIU of 1.9 ± 0.2 μM. Cellular evaluation of 1b revealed that it alters the subcellular localization of GFP-KRas, and also inhibits both Ras activation and Erk phosphorylation in Jurkat cells.
Isoprenylcysteine carboxylmethyltransferase; methylesterification; Icmt; enzyme inhibition; Ras proteins; anti-cancer agents; prenylation; methyl transferase
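The Cheng-Prusoff relationship mentioned in the abstract above links an IC50 to an inhibition constant. The sketch below shows the standard competitive form and the corresponding expression for mixed inhibition with separate competitive (KIC) and uncompetitive (KIU) constants. The substrate concentration and Km used in the example are hypothetical, since the assay conditions are not given here, and the function names are illustrative only.

```python
def ki_from_ic50_competitive(ic50, substrate_conc, km):
    """Cheng-Prusoff for a competitive inhibitor: Ki = IC50 / (1 + [S]/Km)."""
    return ic50 / (1.0 + substrate_conc / km)


def ic50_mixed(kic, kiu, substrate_conc, km):
    """IC50 predicted for mixed inhibition.

    Derived from v = Vmax*[S] / (Km*(1 + [I]/KIC) + [S]*(1 + [I]/KIU))
    by solving for the [I] that halves the uninhibited rate:
    IC50 = (Km + [S]) / (Km/KIC + [S]/KIU).
    """
    return (km + substrate_conc) / (km / kic + substrate_conc / kiu)


# Hypothetical assay conditions: substrate at its Km (all values in micromolar).
S, KM = 1.0, 1.0
print(ki_from_ic50_competitive(ic50=2.0, substrate_conc=S, km=KM))
print(ic50_mixed(kic=1.4, kiu=4.8, substrate_conc=S, km=KM))
```

When KIU is much larger than KIC, the mixed expression approaches the competitive form, which matches the observation that the competitive component predominated.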
Genetic variants responsible for susceptibility to obesity and its comorbidities among Hispanic children have not been identified. The VIVA LA FAMILIA Study was designed to genetically map childhood obesity and associated biological processes in the Hispanic population. A genome-wide association study (GWAS) entailed genotyping 1.1 million single nucleotide polymorphisms (SNPs) using the Illumina Infinium technology in 815 children. Measured genotype analysis was performed between genetic markers and obesity-related traits, i.e., anthropometry, body composition, growth, metabolites, hormones, inflammation, diet, energy expenditure, substrate utilization and physical activity. Identified genome-wide significant loci: 1) corroborated genes implicated in other studies (MTNR1B, ZNF259/APOA5, XPA/FOXE1 (TTF-2), DARC, CCR3, ABO); 2) localized novel genes in plausible biological pathways (PCSK2, ARHGAP11A, CHRNA3); and 3) revealed novel genes with unknown function in obesity pathogenesis (MATK, COL4A1). Salient findings include a nonsynonymous SNP (rs1056513) in INADL (p = 1.2E-07) for weight; an intronic variant in MTNR1B associated with fasting glucose (p = 3.7E-08); variants in the APOA5-ZNF259 region associated with triglycerides (p = 2.5-4.8E-08); an intronic variant in PCSK2 associated with total antioxidants (p = 7.6E-08); a block of 23 SNPs in XPA/FOXE1 (TTF-2) associated with serum TSH (p = 5.5E-08 to 1.0E-09); a nonsynonymous SNP (p = 1.3E-21) and an intronic SNP (p = 3.6E-13) in DARC associated with MCP-1; an intronic variant in ARHGAP11A associated with sleep duration (p = 5.0E-08); and, after adjusting for body weight, variants in MATK for total energy expenditure (p = 2.7E-08) and in CHRNA3 for sleeping energy expenditure (p = 6.0E-08). Unprecedented phenotyping and high-density SNP genotyping enabled localization of novel genetic loci associated with the pathophysiology of childhood obesity.
Osteogenesis imperfecta (OI) is a spectrum of genetic disorders characterized by bone fragility. It is caused by dominant mutations affecting the synthesis and/or structure of type I procollagen or by recessively inherited mutations in genes responsible for the post-translational processing/trafficking of type I procollagen. Recessive OI type VI is unique among OI types in that it is characterized by an increased amount of unmineralized osteoid, thereby suggesting a distinct disease mechanism. In a large consanguineous family with OI type VI, we performed homozygosity mapping and next-generation sequencing of the candidate gene region to isolate and identify the causative gene. We describe loss of function mutations in serpin peptidase inhibitor, clade F, member 1 (SERPINF1) in two affected members of this family and in an additional unrelated patient with OI type VI. SERPINF1 encodes pigment epithelium-derived factor. Hence, loss of pigment epithelium-derived factor function constitutes a novel mechanism for OI and demonstrates its involvement in bone mineralization.
Brittle bone disease; Collagen type I; Fracture; Matrix proteins; Pigment epithelium-derived factor
Somatostatin receptor type 5 (SSTR5) P335L is a hypofunctional single nucleotide polymorphism of SSTR5 with implications in tumor diagnostics and therapy. The purpose of this study is to determine whether a SSTR5 P335L-specific monoclonal antibody (mAb) could sufficiently differentiate pancreatic neuroendocrine tumor (PNT) patients with different SSTR5 genotypes.
Cellular proliferation rate, SSTR5 mRNA level and SSTR5 protein level were measured by performing MTS assay, qRT-PCR and western blotting and immunohistochemistry, respectively. SSTR5 genotype was determined with the TaqMan SNP Genotyping assay.
1) SSTR5 analogue RPL-1980 inhibited cellular proliferation of CAPAN-1 cells more significantly than that of PANC-1 cells. 2) Only PANC-1 (TT) cells, but not CAPAN-1 (CC) cells expressed SSTR5 P335L. 3) In 29 Caucasian PNT patients, 38% had TT genotype for SSTR5 P335L, 24% had CC genotype for WT SSTR5, and 38% had CT genotype for both SSTR5 P335L and WT SSTR5. 4) Immunohistochemistry using SSTR5 P335L mAb detected immunostaining signals only from the PNT specimens with TT and CT genotypes, but not those with CC genotypes.
A SSTR5 P335L mAb that specifically recognizes SSTR5 P335L, but not WT SSTR5, could differentiate PNT patients with different SSTR5 genotypes, thus providing a potential tool for clinical diagnosis of PNT.
Many genomes have been sequenced to high-quality draft status using Sanger capillary electrophoresis and/or newer short-read sequence data and whole genome assembly techniques. However, even the best draft genomes contain gaps and other imperfections due to limitations in the input data and the techniques used to build draft assemblies. Sequencing biases, repetitive genomic features, genomic polymorphism, and other complicating factors all come together to make some regions difficult or impossible to assemble. Traditionally, draft genomes were upgraded to “phase 3 finished” status using time-consuming and expensive Sanger-based manual finishing processes. For more facile assembly and automated finishing of draft genomes, we present here an automated approach to finishing using long-reads from the Pacific Biosciences RS (PacBio) platform. Our algorithm and associated software tool, PBJelly (publicly available at https://sourceforge.net/projects/pb-jelly/), automates the finishing process using long sequence reads in a reference-guided assembly process. PBJelly also provides “lift-over” co-ordinate tables to easily port existing annotations to the upgraded assembly. Using PBJelly and long PacBio reads, we upgraded the draft genome sequences of a simulated Drosophila melanogaster, the version 2 draft Drosophila pseudoobscura, an assembly of the Assemblathon 2.0 budgerigar dataset, and a preliminary assembly of the Sooty mangabey. With 24× mapped coverage of PacBio long-reads, we addressed 99% of gaps and were able to close 69% and improve 12% of all gaps in D. pseudoobscura. With 4× mapped coverage of PacBio long-reads we saw reads address 63% of gaps in our budgerigar assembly, of which 32% were closed and 63% improved. With 6.8× mapped coverage of mangabey PacBio long-reads we addressed 97% of gaps and closed 66% of addressed gaps and improved 19%. The accuracy of gap closure was validated by comparison to Sanger sequencing on gaps from the original D. pseudoobscura draft assembly and shown to be dependent on initial reference quality.
BACKGROUND AND AIMS
The intestinal microbiomes of healthy children and pediatric patients with irritable bowel syndrome (IBS) are not well defined. Studies in adults have indicated that the gastrointestinal microbiota could be involved in IBS.
We analyzed 71 samples from 22 children with IBS (pediatric Rome III criteria) and 22 healthy children, ages 7–12 years, by 16S rRNA gene sequencing, with an average of 54,287 reads/stool sample (average 454 read length = 503 bases). Data were analyzed using phylogenetic-based clustering (Unifrac), or an operational taxonomic unit (OTU) approach using a supervised machine learning tool (randomForest). Most samples were also hybridized to a microarray that can detect 8,741 bacterial taxa (16S rRNA PhyloChip).
Microbiomes associated with pediatric IBS were characterized by a significantly greater percentage of the class Gammaproteobacteria (0.07% vs 0.89% of total bacteria; P <.05); one prominent component of this group was Haemophilus parainfluenzae. Differences highlighted by 454 sequencing were confirmed by high-resolution PhyloChip analysis. Using supervised learning techniques, we were able to classify different subtypes of IBS with a success rate of 98.5%, using limited sets of discriminant bacterial species. A novel Ruminococcus-like microbe was associated with IBS, indicating the potential utility of microbe discovery for gastrointestinal disorders. A greater frequency of pain correlated with an increased abundance of several bacterial taxa from the genus Alistipes.
Using 16S metagenomics by PhyloChip DNA hybridization and deep 454 pyrosequencing, we associated specific microbiome signatures with pediatric IBS. These findings indicate the important association between gastrointestinal microbes and IBS in children; these approaches might be used in diagnosis of functional bowel disorders in pediatric patients.