Related Articles

1.  Huvariome: a web server resource of whole genome next-generation sequencing allelic frequencies to aid in pathological candidate gene selection 
Background
Next generation sequencing provides clinical research scientists with direct readout of innumerable variants, including personal, pathological and common benign variants. The aim of resequencing studies is to determine the candidate pathogenic variants from individual genomes, or from family-based or tumor/normal genome comparisons. Whilst the use of appropriate controls within the experimental design will minimize the number of false positive variations selected, this number can be reduced further with the use of high quality whole genome reference data to minimize false positive variants prior to candidate gene selection. In addition, the use of platform-related sequencing error models can help in the recovery of ambiguous genotypes from lower coverage data.
Description
We have developed a whole genome database of human genetic variations, Huvariome, determined by whole genome deep sequencing data with high coverage and low error rates. The database was designed to be sequencing technology independent but is currently populated with 165 individual whole genomes consisting of small pedigrees and matched tumor/normal samples sequenced with the Complete Genomics sequencing platform. Common variants have been determined for a Benelux population cohort and represented as genotypes alongside the results of two sets of control data (73 of the 165 genomes), Huvariome Core which comprises 31 healthy individuals from the Benelux region, and Diversity Panel consisting of 46 healthy individuals representing 10 different populations and 21 samples in three Pedigrees. Users can query the database by gene or position via a web interface and the results are displayed as the frequency of the variations as detected in the datasets. We demonstrate that Huvariome can provide accurate reference allele frequencies to disambiguate sequencing inconsistencies produced in resequencing experiments. Huvariome has been used to support the selection of candidate cardiomyopathy related genes which have a homozygous genotype in the reference cohorts. This database allows the users to see which selected variants are common variants (>5% minor allele frequency) in the Huvariome core samples, thus aiding in the selection of potentially pathogenic variants by filtering out common variants that are not listed in one of the other public genomic variation databases. The no-call rate and the accuracy of allele calling in Huvariome provide the user with the possibility of identifying platform dependent errors associated with specific regions of the human genome.
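The common-variant filter described above can be illustrated with a minimal sketch. It assumes a hypothetical `Variant` record holding allele counts from a reference cohort; the field names and the decision to keep variants with no confident calls are assumptions made for illustration, not part of the Huvariome interface. Only the 5% minor allele frequency cutoff comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    # Hypothetical record layout, not the Huvariome schema.
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_count: int       # alternate alleles seen in the reference cohort
    called_alleles: int  # alleles with a confident call (no-calls excluded)


def minor_allele_frequency(v: Variant) -> Optional[float]:
    """Return the minor allele frequency, or None if nothing was called."""
    if v.called_alleles == 0:
        return None
    alt_freq = v.alt_count / v.called_alleles
    return min(alt_freq, 1.0 - alt_freq)


def drop_common(variants: List[Variant], threshold: float = 0.05) -> List[Variant]:
    """Keep variants whose MAF does not exceed the threshold.

    Variants without any confident call are kept for manual review
    (an assumption of this sketch).
    """
    kept = []
    for v in variants:
        maf = minor_allele_frequency(v)
        if maf is None or maf <= threshold:
            kept.append(v)
    return kept


# 31 individuals in a core panel correspond to 62 alleles.
rare = Variant("chr1", 1000, "A", "G", alt_count=1, called_alleles=62)
common = Variant("chr1", 2000, "C", "T", alt_count=20, called_alleles=62)
uncalled = Variant("chr1", 3000, "G", "A", alt_count=0, called_alleles=0)
candidates = drop_common([rare, common, uncalled])
```

Using the minimum of the alternate and reference frequencies keeps the filter symmetric when the reference allele happens to be the rarer one in the cohort.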
Conclusion
Huvariome is a simple to use resource for validation of resequencing results obtained by NGS experiments. The high sequence coverage and low error rates provide scientists with the ability to remove false positive results from pedigree studies. Results are returned via a web interface that displays location-based genetic variation frequency, impact on protein function, association with known genetic variations and a quality score of the variation base derived from Huvariome Core and the Diversity Panel data. These results may be used to identify and prioritize rare variants that, for example, might be disease relevant. In testing the accuracy of the Huvariome database, alleles of a selection of ambiguously called coding single nucleotide variants were successfully predicted in all cases. Data protection of individuals is ensured by restricted access to patient derived genomes from the host institution which is relevant for future molecular diagnostics.
doi:10.1186/2043-9113-2-19
PMCID: PMC3549785  PMID: 23164068
Medical genetics; Medical genomics; Whole genome sequencing; Allele frequency; Cardiomyopathy
2.  Respiratory chain complex I deficiency caused by mitochondrial DNA mutations 
Defects of the mitochondrial respiratory chain are associated with a diverse spectrum of clinical phenotypes, and may be caused by mutations in either the nuclear or the mitochondrial genome (mitochondrial DNA (mtDNA)). Isolated complex I deficiency is the most common enzyme defect in mitochondrial disorders, particularly in children in whom family history is often consistent with sporadic or autosomal recessive inheritance, implicating a nuclear genetic cause. In contrast, although a number of recurrent, pathogenic mtDNA mutations have been described, historically, these have been perceived as rare causes of paediatric complex I deficiency. We reviewed the clinical and genetic findings in a large cohort of 109 paediatric patients with isolated complex I deficiency from 101 families. Pathogenic mtDNA mutations were found in 29 of 101 probands (29%), 21 in MTND subunit genes and 8 in mtDNA tRNA genes. Nuclear gene defects were inferred in 38 of 101 (38%) probands based on cell hybrid studies, mtDNA sequencing or mutation analysis (nuclear gene mutations were identified in 22 probands). Leigh or Leigh-like disease was the most common clinical presentation in both mtDNA and nuclear genetic defects. The median age at onset was higher in mtDNA patients (12 months) than in patients with a nuclear gene defect (3 months). However, considerable overlap existed, with onset varying from 0 to >60 months in both groups. Our findings confirm that pathogenic mtDNA mutations are a significant cause of complex I deficiency in children. In the absence of parental consanguinity, we recommend whole mitochondrial genome sequencing as a key approach to elucidate the underlying molecular genetic abnormality.
doi:10.1038/ejhg.2011.18
PMCID: PMC3137493  PMID: 21364701
respiratory chain; complex I; mitochondrial DNA; mutation; genetic counselling
3.  Mutations in VRK1 Associated With Complex Motor and Sensory Axonal Neuropathy Plus Microcephaly 
JAMA neurology  2013;70(12):1491-1498.
IMPORTANCE
Patients with rare diseases and complex clinical presentations represent a challenge for clinical diagnostics. Genomic approaches are allowing the identification of novel variants in genes for very rare disorders, enabling a molecular diagnosis. Genomics is also revealing a phenotypic expansion whereby the full spectrum of clinical expression conveyed by mutant alleles at a locus can be better appreciated.
OBJECTIVE
To elucidate the molecular cause of a complex neuropathy phenotype in 3 patients by applying genomic sequencing strategies.
DESIGN, SETTING, AND PARTICIPANTS
Three affected individuals from 2 unrelated families presented with a complex neuropathy phenotype characterized by axonal sensorimotor neuropathy and microcephaly. They were recruited into the Centers for Mendelian Genomics research program to identify the molecular cause of their phenotype. Whole-genome, targeted whole-exome sequencing, and high-resolution single-nucleotide polymorphism arrays were performed in genetics clinics of tertiary care pediatric hospitals and biomedical research institutions.
MAIN OUTCOMES AND MEASURES
Whole-genome and whole-exome sequencing identified the variants responsible for the patients’ clinical phenotype.
RESULTS
We identified compound heterozygous alleles in 2 affected siblings from 1 family and a homozygous nonsense variant in the third unrelated patient in the vaccinia-related kinase 1 gene (VRK1). In the latter subject, we found a common haplotype on which the nonsense mutation occurred and that segregates in the Ashkenazi Jewish population.
CONCLUSIONS AND RELEVANCE
We report the identification of disease-causing alleles in 3 children from 2 unrelated families with a previously uncharacterized complex axonal motor and sensory neuropathy accompanied by severe nonprogressive microcephaly and cerebral dysgenesis. Our data raise the question of whether VRK1 mutations disturb cell cycle progression and may result in apoptosis of cells in the nervous system. The application of unbiased genomic approaches allows the identification of potentially pathogenic mutations in unsuspected genes in highly genetically heterogeneous and uncharacterized neurological diseases.
doi:10.1001/jamaneurol.2013.4598
PMCID: PMC4039291  PMID: 24126608
4.  Win on Sunday, Sell on Monday: From the Exome Sequencing of One Boy to the Delivery of Clinical Diagnostics 
For several years, there have been discussions about using both Sanger and whole genome sequencing in clinical practice. In late 2009, the Medical College of Wisconsin initiated the infrastructure to streamline the delivery of current and emerging DNA technologies into state-of-the-art molecular diagnostics. The online publication of our initial case in Genetics in Medicine in late 2010 further intensified our efforts in this endeavor. However, being relatively new to the field of NextGen sequencing, we began with the addition of Sanger diagnostic sequencing to our already successful research core, which at that point had been in operation for almost ten years. This was a great undertaking, as typically, independent research laboratories performing cutting-edge science lack the financial resources and breadth of experience to launch their custom product or application to the diagnostic industry. An independent research laboratory is able to resolve these shortages by partnering with a core laboratory staffed with diagnostic expertise. Due to our lack of diagnostic experience, we quickly aligned the research core to a consortium of individuals with clinical experience to allow us to benefit from established diagnostic facilities on campus. Difficulties faced at the onset of diagnostic startup were many, including large issues such as accreditation program (CAP vs. CLIA), SOP generation and validation, competency and proficiency testing, and reimbursement, as well as smaller problems like semiannual pipette calibration, temperature monitoring, and inventory control. The purpose of this talk is to give insight into efficient ways to resolve these problems, both large and small, and transform a decade or more of research expertise into a viable diagnostic laboratory.
PMCID: PMC3186498
5.  Whole Genome Sequencing versus Traditional Genotyping for Investigation of a Mycobacterium tuberculosis Outbreak: A Longitudinal Molecular Epidemiological Study 
PLoS Medicine  2013;10(2):e1001387.
In an outbreak investigation of Mycobacterium tuberculosis comparing whole genome sequencing (WGS) with traditional genotyping, Stefan Niemann and colleagues found that classical genotyping falsely clustered some strains, and WGS better reflected contact tracing.
Background
Understanding Mycobacterium tuberculosis (Mtb) transmission is essential to guide efficient tuberculosis control strategies. Traditional strain typing lacks sufficient discriminatory power to resolve large outbreaks. Here, we tested the potential of using next generation genome sequencing for identification of outbreak-related transmission chains.
Methods and Findings
During long-term (1997 to 2010) prospective population-based molecular epidemiological surveillance comprising a total of 2,301 patients, we identified a large outbreak caused by an Mtb strain of the Haarlem lineage. The main performance outcome measure of whole genome sequencing (WGS) analyses was the degree of correlation of the WGS analyses with contact tracing data and the spatio-temporal distribution of the outbreak cases. WGS analyses of the 86 isolates revealed 85 single nucleotide polymorphisms (SNPs), subdividing the outbreak into seven genome clusters (two to 24 isolates each), plus 36 unique SNP profiles. WGS results showed that the first outbreak isolates detected in 1997 were falsely clustered by classical genotyping. In 1998, one clone (termed “Hamburg clone”) started expanding, apparently independently from differences in the social environment of early cases. Genome-based clustering patterns were in better accordance with contact tracing data and the geographical distribution of the cases than clustering patterns based on classical genotyping. A maximum of three SNPs were identified in eight confirmed human-to-human transmission chains, involving 31 patients. We estimated the Mtb genome evolutionary rate at 0.4 mutations per genome per year. This rate suggests that Mtb grows in its natural host with a doubling time of approximately 22 h (400 generations per year). Based on the genome variation discovered, emergence of the Hamburg clone was dated back to a period between 1993 and 1997, hence shortly before the discovery of the outbreak through epidemiological surveillance.
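As a back-of-envelope check on the two figures above, the following sketch converts a doubling time into generations per year and derives the per-generation mutation rate they imply. The per-generation value is computed here for illustration only and is not a number stated in the abstract. A 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def generations_per_year(doubling_time_h: float) -> float:
    """Number of doublings that fit into one year."""
    return HOURS_PER_YEAR / doubling_time_h


def mutations_per_generation(rate_per_year: float, doubling_time_h: float) -> float:
    """Per-genome mutation rate per generation implied by a yearly rate."""
    return rate_per_year / generations_per_year(doubling_time_h)


gens = generations_per_year(22)                  # close to the quoted 400
per_gen = mutations_per_generation(0.4, 22)      # roughly 1 per 1000 generations
```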
Conclusions
Our findings suggest that WGS is superior to conventional genotyping for Mtb pathogen tracing and investigating micro-epidemics. WGS provides a measure of Mtb genome evolution over time in its natural host context.
Editors' Summary
Background
Tuberculosis—a contagious bacterial disease that usually infects the lungs—is a major public health problem, particularly in low- and middle-income countries. In 2011, an estimated 8.7 million people developed tuberculosis globally, and 1.4 million people died from the disease. Tuberculosis is second only to HIV/AIDS in terms of global deaths from a single infectious agent. Mycobacterium tuberculosis, the bacterium that causes tuberculosis, is readily spread in airborne droplets when people with active disease cough or sneeze. The characteristic symptoms of tuberculosis include persistent cough, weight loss, fever, and night sweats. Diagnostic tests for the disease include sputum smear analysis (examination of mucus coughed up from the lungs for the presence of M. tuberculosis), mycobacterial culture (growth of M. tuberculosis from sputum), and chest X-rays. Tuberculosis can be cured by taking several antibiotics daily for at least six months, although the recent emergence of multidrug-resistant M. tuberculosis is making tuberculosis harder to treat.
Why Was This Study Done?
Although efforts to reduce the global burden of tuberculosis are showing some improvements, the annual decline in the number of people developing tuberculosis continues to be slow. To develop optimized control strategies, experts need to be able to accurately track M. tuberculosis transmission within human populations. Because M. tuberculosis, like all bacteria, accumulates genetic changes over time, there are many different strains (genetic variants) of M. tuberculosis. Genotyping methods have been developed that identify different bacterial strains by examining specific regions of the bacterial genome (blueprint), but because these methods examine only a small part of the genome, they may not distinguish between related transmission chains. That is, traditional strain genotyping methods may not be able to determine accurately where a tuberculosis outbreak started or how it spread through a population. In this longitudinal cohort study, the researchers compare the ability of whole genome sequencing (WGS), which is rapidly becoming widely available, and traditional genotyping to provide information about a recent German tuberculosis outbreak. In a longitudinal cohort study, a population is followed over time to analyze the occurrence of a specific disease.
What Did the Researchers Do and Find?
During long-term (1997–2010) population-based molecular epidemiological surveillance (disease surveillance that uses molecular techniques rather than reports of illness) in Hamburg and Schleswig-Holstein, the researchers identified a large tuberculosis outbreak caused by M. tuberculosis isolates of the Haarlem lineage using classical strain typing. The researchers examined each of the 86 isolates from this outbreak using WGS and classical genotyping and asked whether the results of these two approaches correlated with contact tracing data (information is routinely collected about the people a patient with tuberculosis has recently met so that these contacts can be tested for tuberculosis and treated if necessary) and with the spatio-temporal distribution of outbreak cases. WGS of the isolates identified 85 single nucleotide polymorphisms (SNPs; genomic sequence variants in which single building blocks, or nucleotides, are altered) that subdivided the outbreak into seven clusters of isolates and 36 unique isolates. The WGS results showed that the first isolates of the outbreak were incorrectly clustered by classical genotyping and that one strain—the “Hamburg clone”—started expanding in 1998. Notably, the genome-based clustering patterns were in better accordance with contact tracing data and with the geographical distribution of cases than clustering patterns based on classical genotyping, and they identified eight confirmed human-to-human transmission chains that involved 31 patients and a maximum of three SNPs. Finally, the researchers used their WGS results to estimate that the Hamburg clone emerged between 1993 and 1997, shortly before the discovery of the tuberculosis outbreak through epidemiological surveillance.
What Do These Findings Mean?
These findings show that WGS can be used to identify specific strains within large tuberculosis outbreaks more accurately than classical genotyping. They also provide new information about the evolution of M. tuberculosis during outbreaks and indicate how WGS data should be interpreted in future genome-based molecular epidemiology studies. WGS has the potential to improve the molecular epidemiological surveillance and control of tuberculosis and of other infectious diseases. Importantly, note the researchers, ongoing reductions in the cost of WGS, the increased availability of “bench top” genome sequencers, and bioinformatics developments should all accelerate the implementation of WGS as a standard method for the identification of transmission chains in infectious disease outbreaks.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001387.
The World Health Organization provides information (in several languages) on all aspects of tuberculosis, including the Global Tuberculosis Report 2012
The Stop TB Partnership is working towards tuberculosis elimination; patient stories about tuberculosis are available (in English and Spanish)
The US Centers for Disease Control and Prevention has information about tuberculosis, including information on tuberculosis genotyping (some information in English and Spanish)
The US National Institute of Allergy and Infectious Diseases also has detailed information on all aspects of tuberculosis
The Tuberculosis Survival Project, which aims to raise awareness of tuberculosis and provide support for people with tuberculosis, provides personal stories about treatment for tuberculosis; the Tuberculosis Vaccine Initiative also provides personal stories about dealing with tuberculosis
MedlinePlus has links to further information about tuberculosis (in English and Spanish)
Wikipedia has a page on whole-genome sequencing (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.1001387
PMCID: PMC3570532  PMID: 23424287
6.  Molecular Findings Among Patients Referred for Clinical Whole-Exome Sequencing 
JAMA  2014;312(18):1870-1879.
IMPORTANCE
Clinical whole-exome sequencing is increasingly used for diagnostic evaluation of patients with suspected genetic disorders.
OBJECTIVE
To perform clinical whole-exome sequencing and report (1) the rate of molecular diagnosis among phenotypic groups, (2) the spectrum of genetic alterations contributing to disease, and (3) the prevalence of medically actionable incidental findings such as FBN1 mutations causing Marfan syndrome.
DESIGN, SETTING, AND PATIENTS
Observational study of 2000 consecutive patients with clinical whole-exome sequencing analyzed between June 2012 and August 2014. Whole-exome sequencing tests were performed at a clinical genetics laboratory in the United States. Results were reported by clinical molecular geneticists certified by the American Board of Medical Genetics and Genomics. Tests were ordered by the patient’s physician. The patients were primarily pediatric (1756 [88%]; mean age, 6 years; 888 females [44%], 1101 males [55%], and 11 fetuses [1%; gender unknown]), demonstrating diverse clinical manifestations most often including nervous system dysfunction such as developmental delay.
MAIN OUTCOMES AND MEASURES
Whole-exome sequencing diagnosis rate overall and by phenotypic category, mode of inheritance, spectrum of genetic events, and reporting of incidental findings.
RESULTS
A molecular diagnosis was reported for 504 patients (25.2%) with 58% of the diagnostic mutations not previously reported. Molecular diagnosis rates for each phenotypic category were 143/526 (27.2%; 95% CI, 23.5%–31.2%) for the neurological group, 282/1147 (24.6%; 95% CI, 22.1%–27.2%) for the neurological plus other organ systems group, 30/83 (36.1%; 95% CI, 26.1%–47.5%) for the specific neurological group, and 49/244 (20.1%; 95% CI, 15.6%–25.8%) for the nonneurological group. The Mendelian disease patterns of the 527 molecular diagnoses included 280 (53.1%) autosomal dominant, 181 (34.3%) autosomal recessive (including 5 with uniparental disomy), 65 (12.3%) X-linked, and 1 (0.2%) mitochondrial. Of 504 patients with a molecular diagnosis, 23 (4.6%) had blended phenotypes resulting from 2 single gene defects. About 30% of the positive cases harbored mutations in disease genes reported since 2011. There were 95 medically actionable incidental findings in genes unrelated to the phenotype but with immediate implications for management in 92 patients (4.6%), including 59 patients (3%) with mutations in genes recommended for reporting by the American College of Medical Genetics and Genomics.
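The diagnosis rates and their confidence intervals can be approximated from the reported counts. The sketch below uses the Wilson score interval, which is an assumption: the abstract does not say which interval method was used, so the bounds may differ from the published ones in the last digit.

```python
import math
from typing import Tuple


def diagnosis_rate(diagnosed: int, tested: int) -> float:
    """Fraction of tested patients who received a molecular diagnosis."""
    return diagnosed / tested


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


overall = diagnosis_rate(504, 2000)          # overall cohort
neuro_rate = diagnosis_rate(143, 526)        # neurological group
neuro_lo, neuro_hi = wilson_interval(143, 526)
```

The Wilson interval is chosen here because it behaves well for proportions away from 50% without needing a statistics library; an exact (Clopper-Pearson) interval would be a reasonable alternative.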
CONCLUSIONS AND RELEVANCE
Whole-exome sequencing provided a potential molecular diagnosis for 25% of a large cohort of patients referred for evaluation of suspected genetic conditions, including detection of rare genetic events and new mutations contributing to disease. The yield of whole-exome sequencing may offer advantages over traditional molecular diagnostic approaches in certain patients.
doi:10.1001/jama.2014.14601
PMCID: PMC4326249  PMID: 25326635
7.  Characterization of mtDNA variation in a cohort of South African paediatric patients with mitochondrial disease 
Mitochondrial disease can be attributed to both mitochondrial and nuclear gene mutations. It has a heterogeneous clinical and biochemical profile, which is compounded by the diversity of the genetic background. Disease-based epidemiological information has expanded significantly in recent decades, but little is known that clarifies the aetiology in African patients. The aim of this study was to investigate mitochondrial DNA variation and pathogenic mutations in the muscle of diagnosed paediatric patients from South Africa. A cohort of 71 South African paediatric patients was included and a high-throughput nucleotide sequencing approach was used to sequence full-length muscle mtDNA. The average coverage of the mtDNA genome was 81±26 per position. After assigning haplogroups, it was determined that although the nature of non-haplogroup-defining variants was similar in African and non-African haplogroup patients, the number of substitutions was significantly higher in African patients. We describe previously reported disease-associated and novel variants in this cohort. We observed a general lack of commonly reported syndrome-associated mutations, which supports clinical observations and confirms general observations in African patients when using single mutation screening strategies based on (predominantly non-African) mtDNA disease-based information. It is finally concluded that this first extensive report on muscle mtDNA sequences in African paediatric patients highlights the need for a full-length mtDNA sequencing strategy, which applies to all populations where specific mutations are not present. This, in addition to nuclear DNA gene mutation and pathogenicity evaluations, will be required to better unravel the aetiology of these disorders in African patients.
doi:10.1038/ejhg.2011.262
PMCID: PMC3355259  PMID: 22258525
mitochondrial DNA; mitochondrial diseases; paediatrics; Africa; high-throughput nucleotide sequencing
8.  Whole-genome haplotyping approaches and genomic medicine 
Genome Medicine  2014;6(9):73.
Genomic information reported as haplotypes rather than genotypes will be increasingly important for personalized medicine. Current technologies generate diploid sequence data that is rarely resolved into its constituent haplotypes. Furthermore, paradigms for thinking about genomic information are based on interpreting genotypes rather than haplotypes. Nevertheless, haplotypes have historically been useful in contexts ranging from population genetics to disease-gene mapping efforts. The main approaches for phasing genomic sequence data are molecular haplotyping, genetic haplotyping, and population-based inference. Long-read sequencing technologies are enabling longer molecular haplotypes, and decreases in the cost of whole-genome sequencing are enabling the sequencing of whole-chromosome genetic haplotypes. Hybrid approaches combining high-throughput short-read assembly with strategic approaches that enable physical or virtual binning of reads into haplotypes are enabling multi-gene haplotypes to be generated from single individuals. These techniques can be further combined with genetic and population approaches. Here, we review advances in whole-genome haplotyping approaches and discuss the importance of haplotypes for genomic medicine. Clinical applications include diagnosis by recognition of compound heterozygosity and by phasing regulatory variation to coding variation. Haplotypes, which are more specific than less complex variants such as single nucleotide variants, also have applications in prognostics and diagnostics, in the analysis of tumors, and in typing tissue for transplantation. Future advances will include technological innovations, the application of standard metrics for evaluating haplotype quality, and the development of databases that link haplotypes to disease.
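To illustrate why phase matters for recognizing compound heterozygosity, here is a minimal sketch. It assumes the input is a list of heterozygous, putatively damaging variants, each already assigned to haplotype 0 or 1 of the individual; the tuple layout and the gene names are hypothetical. A gene is flagged when both of its copies carry at least one such variant.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def compound_het_genes(phased_variants: Iterable[Tuple[str, int]]) -> List[str]:
    """Return genes with damaging heterozygous variants on both haplotypes.

    Each item is (gene, haplotype) with haplotype 0 or 1.
    """
    haplotypes_by_gene = defaultdict(set)
    for gene, haplotype in phased_variants:
        haplotypes_by_gene[gene].add(haplotype)
    return sorted(g for g, haps in haplotypes_by_gene.items() if haps == {0, 1})


calls = [
    ("GENE_A", 0), ("GENE_A", 1),  # in trans: both copies affected
    ("GENE_B", 0), ("GENE_B", 0),  # in cis: one copy remains intact
    ("GENE_C", 1),                 # single heterozygous variant
]
flagged = compound_het_genes(calls)
```

Unphased genotypes cannot separate the `GENE_A` and `GENE_B` cases, since both show two heterozygous variants in the same gene.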
doi:10.1186/s13073-014-0073-7
PMCID: PMC4254418  PMID: 25473435
9.  Diagnostics of Primary Immunodeficiency Diseases: A Sequencing Capture Approach 
PLoS ONE  2014;9(12):e114901.
Primary Immunodeficiencies (PID) are genetically inherited disorders characterized by defects of the immune system, leading to increased susceptibility to infection. Due to the variety of clinical symptoms and the complexity of current diagnostic procedures, accurate diagnosis of PID is often difficult in daily clinical practice. Thanks to the advent of “next generation” sequencing technologies and target enrichment methods, the development of multiplex diagnostic assays is now possible. In this study, we applied a selector-based target enrichment assay to detect disease-causing mutations in 179 known PID genes. The usefulness of this assay for molecular diagnosis of PID was investigated by sequencing DNA from 33 patients, 18 of whom had at least one known causal mutation at the onset of the experiment. We were able to identify the disease-causing mutations in 60% of the investigated patients, indicating that the majority of PID cases could be resolved using a targeted sequencing approach. Causal mutations identified in the unknown patient samples were located in STAT3, IGLL1, RNF168 and PGM3. Based on our results, we propose a stepwise approach for PID diagnostics, involving targeted resequencing, followed by whole transcriptome and/or whole genome sequencing if causative variants are not found in the targeted exons.
doi:10.1371/journal.pone.0114901
PMCID: PMC4263707  PMID: 25502423
10.  Rare chromosomal deletions and duplications in attention-deficit hyperactivity disorder: a genome-wide analysis 
Lancet  2010;376(9750):1401-1408.
Summary
Background
Large, rare chromosomal deletions and duplications known as copy number variants (CNVs) have been implicated in neurodevelopmental disorders similar to attention-deficit hyperactivity disorder (ADHD). We aimed to establish whether burden of CNVs was increased in ADHD, and to investigate whether identified CNVs were enriched for loci previously identified in autism and schizophrenia.
Methods
We undertook a genome-wide analysis of CNVs in 410 children with ADHD and 1156 unrelated ethnically matched controls from the 1958 British Birth Cohort. Children of white UK origin, aged 5–17 years, who met diagnostic criteria for ADHD or hyperkinetic disorder, but not schizophrenia and autism, were recruited from community child psychiatry and paediatric outpatient clinics. Single nucleotide polymorphisms (SNPs) were genotyped in the ADHD and control groups with two arrays; CNV analysis was limited to SNPs common to both arrays and included only samples with high-quality data. CNVs in the ADHD group were validated with comparative genomic hybridisation. We assessed the genome-wide burden of large (>500 kb), rare (<1% population frequency) CNVs according to the average number of CNVs per sample, with significance assessed via permutation. Locus-specific tests of association were undertaken for test regions defined for all identified CNVs and for 20 loci implicated in autism or schizophrenia. Findings were replicated in 825 Icelandic patients with ADHD and 35 243 Icelandic controls.
Findings
Data for full analyses were available for 366 children with ADHD and 1047 controls. 57 large, rare CNVs were identified in children with ADHD and 78 in controls, showing a significantly increased rate of CNVs in ADHD (0·156 vs 0·075; p=8·9×10⁻⁵). This increased rate of CNVs was particularly high in those with intellectual disability (0·424; p=2·0×10⁻⁶), although there was also a significant excess in cases with no such disability (0·125, p=0·0077). An excess of chromosome 16p13.11 duplications was noted in the ADHD group (p=0·0008 after correction for multiple testing), a finding that was replicated in the Icelandic sample (p=0·031). CNVs identified in our ADHD cohort were significantly enriched for loci previously reported in both autism (p=0·0095) and schizophrenia (p=0·010).
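The per-sample CNV rates follow from the counts given above, and the permutation-based significance test mentioned in the Methods can be sketched in a few lines. This is an illustrative one-sided label-shuffling test on per-sample CNV counts, not the authors' implementation; the function names, the number of permutations and the seed are assumptions, and the per-sample data in the usage lines are synthetic.

```python
import random
from typing import Sequence


def cnv_rate(n_cnvs: int, n_samples: int) -> float:
    """Average number of CNVs per sample."""
    return n_cnvs / n_samples


def mean(values: Sequence[int]) -> float:
    return sum(values) / len(values)


def permutation_p_value(case_counts: Sequence[int],
                        control_counts: Sequence[int],
                        n_perm: int = 10_000,
                        seed: int = 0) -> float:
    """One-sided p-value for a higher mean CNV count in cases."""
    rng = random.Random(seed)
    observed = mean(case_counts) - mean(control_counts)
    pooled = list(case_counts) + list(control_counts)
    n_cases = len(case_counts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign case/control labels at random
        diff = mean(pooled[:n_cases]) - mean(pooled[n_cases:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


adhd_rate = cnv_rate(57, 366)       # close to the reported 0.156
control_rate = cnv_rate(78, 1047)   # close to the reported 0.075

# Synthetic per-sample counts, for demonstration only.
p_clear = permutation_p_value([1] * 10, [0] * 10, n_perm=1000)
p_null = permutation_p_value([1, 0, 1, 0], [1, 0, 1, 0], n_perm=1000)
```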
Interpretation
Our findings provide genetic evidence of an increased rate of large CNVs in individuals with ADHD and suggest that ADHD is not purely a social construct.
Funding
Action Research; Baily Thomas Charitable Trust; Wellcome Trust; UK Medical Research Council; European Union.
doi:10.1016/S0140-6736(10)61109-9
PMCID: PMC2965350  PMID: 20888040
11.  An Integrated Approach for Analyzing Clinical Genomic Variant Data from Next-Generation Sequencing 
Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource’s iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease.
doi:10.7171/jbt.15-2601-002
PMCID: PMC4310222  PMID: 25649353
bioinformatics; genetic alterations; Mendelian Genetics; protein information resources
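The filtering step described in this abstract (variant quality, inheritance pattern, impact on protein function) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Variant` fields, the impact labels, the thresholds, and the two inheritance checks are all hypothetical choices made for illustration, and genotypes are written in the VCF-style `0/0`, `0/1`, `1/1` notation.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names and scales here are illustrative assumptions.
    gene: str
    quality: float          # caller quality score
    population_freq: float  # allele frequency in a reference population
    impact: str             # "HIGH", "MODERATE" or "LOW"
    proband_gt: str         # "0/0", "0/1" or "1/1"
    mother_gt: str
    father_gt: str


def fits_recessive(v):
    """Proband homozygous for the variant, both parents heterozygous carriers."""
    return v.proband_gt == "1/1" and v.mother_gt == "0/1" and v.father_gt == "0/1"


def fits_de_novo_dominant(v):
    """Proband heterozygous, variant absent from both parents."""
    return v.proband_gt == "0/1" and v.mother_gt == "0/0" and v.father_gt == "0/0"


def filter_variants(variants, min_quality=30.0, max_freq=0.01,
                    impacts=("HIGH", "MODERATE")):
    """Keep rare, well-supported, protein-affecting variants that fit a model."""
    kept = []
    for v in variants:
        if v.quality < min_quality:
            continue  # poorly supported call
        if v.population_freq > max_freq:
            continue  # too common to explain a rare disease
        if v.impact not in impacts:
            continue  # unlikely to alter protein function
        if not (fits_recessive(v) or fits_de_novo_dominant(v)):
            continue  # does not segregate under either model
        kept.append(v)
    return kept


candidates = [
    Variant("GENE_A", 55.0, 0.0005, "HIGH", "1/1", "0/1", "0/1"),
    Variant("GENE_B", 60.0, 0.2000, "HIGH", "1/1", "0/1", "0/1"),
    Variant("GENE_C", 12.0, 0.0001, "MODERATE", "0/1", "0/0", "0/0"),
]
print([v.gene for v in filter_variants(candidates)])
```

Applying the cheap quality and frequency checks before the inheritance checks is a design choice of this sketch; in a real pipeline the order and thresholds would depend on the cohort and the caller.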
12.  Molecular Diagnosis of Usher Syndrome: Application of Two Different Next Generation Sequencing-Based Procedures 
PLoS ONE  2012;7(8):e43799.
Usher syndrome (USH) is a clinically and genetically heterogeneous disorder characterized by visual and hearing impairments. Clinically, it is subdivided into three subclasses with nine genes identified so far. In the present study, we investigated whether the currently available Next Generation Sequencing (NGS) technologies are already suitable for molecular diagnostics of USH. We analyzed a total of 12 patients, most of whom were negative for previously described mutations in known USH genes upon primer extension-based microarray genotyping. We enriched the NGS template either by whole exome capture or by Long-PCR of the known USH genes. The main NGS sequencing platforms were used: SOLiD for whole exome sequencing, Illumina (Genome Analyzer II) and Roche 454 (GS FLX) for the Long-PCR sequencing. Long-PCR targeting was more efficient, with up to 94% of USH gene regions displaying an overall coverage higher than 25×, whereas whole exome sequencing yielded a similar coverage for only 50% of those regions. Overall, this integrated analysis led to the identification of 11 novel sequence variations in USH genes (2 homozygous and 9 heterozygous) out of 18 detected. However, at least two cases were not genetically solved. Our results highlight the current limitations in the diagnostic use of NGS for USH patients. The limit for whole exome sequencing is linked to the need for high coverage and to the correct interpretation of sequence variations with a non-obvious pathogenic role, whereas the targeted approach suffers from the high genetic heterogeneity of USH, which may also be caused by the presence of additional causative genes yet to be identified.
doi:10.1371/journal.pone.0043799
PMCID: PMC3430670  PMID: 22952768
13.  An Integrative Approach for Interpretation of Clinical NGS Genomic Variant Data 
Antibody (Ab) discovery research has accelerated as monoclonal Ab (mAb)-based biologic strategies have proved efficacious in the treatment of many human diseases, ranging from cancer to autoimmunity. Initial steps in the discovery of therapeutic mAb require epitope characterization and preclinical studies in vitro and in animal models often using limited quantities of Ab. To facilitate this research, our Shared Resource Laboratory (SRL) offers microscale Ab conjugation. Ab submitted for conjugation may or may not be commercially produced, but have not been characterized for use in immunofluorescence applications. Purified mAb and even polyclonal Ab (pAb) can be efficiently conjugated, although the advantages of direct conjugation are more obvious for mAb. To improve consistency of results in microscale (<100 µg) conjugation reactions, we chose to utilize several different varieties of commercial kits. Kits tested were limited to covalent fluorophore labeling. Established quality control (QC) processes to validate fluorophore labeling either rely solely on spectrophotometry or utilize flow cytometry of cells expected to express the target antigen. This methodology is not compatible with microscale reactions using uncharacterized Ab. We developed a novel method for cell-free QC of our conjugates that reflects conjugation quality, but is independent of the biological properties of the Ab itself. QC is critical, as amine reactive chemistry relies on the absence of even trace quantities of competing amine moieties such as those found in the Good buffers (HEPES, MOPS, TES, etc.) or irrelevant proteins. Herein, we present data used to validate our method of assessing the extent of labeling and the removal of free dye by using flow cytometric analysis of polystyrene Ab capture beads to verify product quality. This microscale custom conjugation and QC allows for the rapid development and validation of high quality reagents, specific to the needs of our colleagues and clientele.
Next generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis. Most analysis pipelines do not connect genomic variants to disease and protein specific information during the initial filtering and selection of relevant variants. Robust bioinformatics pipelines were implemented for trimming, genome alignment, SNP, INDEL, or structural variation detection of whole genome or exon-capture sequencing data from Illumina. Quality control metrics were analyzed at each step of the pipeline to ensure data integrity for clinical applications. We further annotate the variants with statistics regarding the diseased population and variant impact. Custom algorithms were developed to analyze the variant data by filtering variants based upon criteria such as quality of variant, inheritance pattern (e.g. dominant, recessive, X-linked), and impact of variant. The resulting variants and their associated genes are linked to Integrated Genome Browser (IGV) in a genome context, and to the PIR iProXpress system for rich protein and disease information. This poster will present detailed analysis of whole exome sequencing performed on patients with facio-skeletal anomalies. We will compare and contrast data analysis methods and report on potential clinically relevant leads discovered by implementing our new clinical variant pipeline. Our variant analysis of these patients and their unaffected family members resulted in more than 500,000 variants.
By applying our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for further filtering of disease relevant variants that impact protein coding genes. Taken together, the integrative approach allows better selection of disease relevant genomic variants by using both genomic and disease/protein centric information. This type of clustering approach can help clinicians better understand the association of variants to the disease phenotype, enabling application to personalized medicine approaches.
PMCID: PMC4162289
14.  Integrating precision medicine in the study and clinical treatment of a severely mentally ill person 
PeerJ  2013;1:e177.
Background. In recent years, there has been an explosion in the number of technical and medical diagnostic platforms being developed. This has greatly improved our ability to more accurately, and more comprehensively, explore and characterize human biological systems on the individual level. Large quantities of biomedical data are now being generated and archived in many separate research and clinical activities, but there exists a paucity of studies that integrate the areas of clinical neuropsychiatry, personal genomics and brain-machine interfaces.
Methods. A single person with severe mental illness was implanted with the Medtronic Reclaim® Deep Brain Stimulation (DBS) Therapy device for Obsessive Compulsive Disorder (OCD), targeting his nucleus accumbens/anterior limb of the internal capsule. Programming of the device and psychiatric assessments occurred in an outpatient setting for over two years. His genome was sequenced and variants were detected in the Illumina Whole Genome Sequencing Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory.
Results. We report here the detailed phenotypic characterization, clinical-grade whole genome sequencing (WGS), and two-year outcome of a man with severe OCD treated with DBS. Since implantation, this man has reported steady improvement, highlighted by a steady decline in his Yale-Brown Obsessive Compulsive Scale (YBOCS) score from ∼38 to a score of ∼25. A rechargeable Activa RC neurostimulator battery has been of major benefit in terms of facilitating a degree of stability and control over the stimulation. His psychiatric symptoms reliably worsen within hours of the battery becoming depleted, thus providing confirmatory evidence for the efficacy of DBS for OCD in this person. WGS revealed that he is a heterozygote for the p.Val66Met variant in BDNF, encoding a member of the nerve growth factor family, and which has been found to predispose carriers to various psychiatric illnesses. He carries the p.Glu429Ala allele in methylenetetrahydrofolate reductase (MTHFR) and the p.Asp7Asn allele in ChAT, encoding choline O-acetyltransferase, with both alleles having been shown to confer an elevated susceptibility to psychoses. We have found thousands of other variants in his genome, including pharmacogenetic and copy number variants. This information has been archived and offered to this person alongside the clinical sequencing data, so that he and others can re-analyze his genome for years to come.
Conclusions. To our knowledge, this is the first study in the clinical neurosciences that integrates detailed neuropsychiatric phenotyping, deep brain stimulation for OCD and clinical-grade WGS with management of genetic results in the medical treatment of one person with severe mental illness. We offer this as an example of precision medicine in neuropsychiatry including brain-implantable devices and genomics-guided preventive health care.
doi:10.7717/peerj.177
PMCID: PMC3792182  PMID: 24109560
Genomics; Deep brain stimulation; Whole genome sequencing; Ethics; Neurosurgery; Obsessive compulsive disorder
15.  Successful transitioning is a matter of the Heart: Integrated Care for Grown-Up Congenital Heart Disease 
Purpose
This study offers a comprehensive overview of the existing guidelines for GUCH/ACHD care and synthesises the recommendations made over the past decade, developing them into an integrated care concept for GUCH/ACHD patients. Its aim is to emphasise the need for more coordinated action by paediatric and adult specialists and by professional and patient organisations to lobby for a concerted implementation of the guidelines for GUCH/ACHD management and an organised transitioning process.
Context
More than a decade ago, discussions picked up on the adequate management of a challenging new patient group: persons with ‘Grown-Up Congenital Heart Disease’ (GUCH), also known as ‘Adult Congenital Heart Disease’ (ACHD) in North America. The various authors acknowledged the demand for highly specialised and trained professionals who could provide the wide array of services needed for this patient group, with a systematic and multi-disciplinary approach. Initial experience was gathered throughout the 1990s in Canada and the UK. Since then, the technological and medical advances in paediatric cardiology, cardiac surgery and related medical fields have improved the health outcomes even further, to the extent that 85%–95% of children with congenital heart disease (CHD) survive into adulthood. However, the efforts to implement the necessary managerial, transitioning and vocational training requirements have not been afforded equal focus.
Data sources
A literature review of the existing guidelines for the management of GUCH patients from national and international cardiology associations, expert interviews.
Case description
The key problem in the management of GUCH patients is a lack of understanding of the importance of a coordinated transitioning process from paediatric to adult care services. Neither the paediatric nor the adult specialists have the proper training for the care of these patients, the former lacking experience with adult patients and the latter being unfamiliar with the complex indications of congenital heart disease. In the different guidelines (e.g. from the American Heart Association or the European Society of Cardiology), it is acknowledged that cooperation and communication between specialists and settings and a managed transitioning process are paramount. In this case, the focus is on developing an integrated care model based on the existing medical guidelines and the requirements a transition process demands. Adolescents with CHD and their parents need to be prepared to adapt to the demands of an adult life. They need information on working and educational options, family planning, and what complications may be expected. Also, the adolescents need to learn to take responsibility for their own life and health, independently of their parents. These are just the most pressing of the challenges GUCH patients face.
(Preliminary) conclusions
Even though specialised GUCH/ACHD centres exist in many countries, they are too few in number to effectively and adequately service the whole population and provide high quality training. A lack of coordination and communication between paediatric and adult health care service providers results in patients being lost in transition from paediatric to adult care settings. This counteracts the excellent services children with congenital heart disease receive nowadays, and which have led to the need for specialised adult services in the first place. It is a waste of time and resources if the efforts made in the paediatric care setting are not followed up adequately once the patients are grown up. This is a classic setting for implementing integrated care, and this study offers a model, based on the available medical guidelines, to do so.
PMCID: PMC3617751
transition from paediatric to adult services; GUCH; implementation of guidelines; integrated care centres
16.  Cryptococcus gattii in North American Pacific Northwest: Whole-Population Genome Analysis Provides Insights into Species Evolution and Dispersal 
mBio  2014;5(4):e01464-14.
ABSTRACT
The emergence of distinct populations of Cryptococcus gattii in the temperate North American Pacific Northwest (PNW) was surprising, as this species was previously thought to be confined to tropical and semitropical regions. Beyond a new habitat niche, the dominant emergent population displayed increased virulence and caused primary pulmonary disease, as opposed to the predominantly neurologic disease seen previously elsewhere. Whole-genome sequencing was performed on 118 C. gattii isolates, including the PNW subtypes and the global diversity of molecular type VGII, to better ascertain the natural source and genomic adaptations leading to the emergence of infection in the PNW. Overall, the VGII population was highly diverse, demonstrating large numbers of mutational and recombinational events; however, the three dominant subtypes from the PNW were of low diversity and were completely clonal. Although strains of VGII were found on at least five continents, all genetic subpopulations were represented or were most closely related to strains from South America. The phylogenetic data are consistent with multiple dispersal events from South America to North America and elsewhere. Numerous gene content differences were identified between the emergent clones and other VGII lineages, including genes potentially related to habitat adaptation, virulence, and pathology. Evidence was also found for possible gene introgression from Cryptococcus neoformans var. grubii that is rarely seen in global C. gattii but that was present in all PNW populations. These findings provide greater understanding of C. gattii evolution in North America and support extensive evolution in, and dispersal from, South America.
IMPORTANCE
Cryptococcus gattii emerged in the temperate North American Pacific Northwest (PNW) in the late 1990s. Beyond a new environmental niche, these emergent populations displayed increased virulence and resulted in a different pattern of clinical disease. In particular, severe pulmonary infections predominated in contrast to presentation with neurologic disease as seen previously elsewhere. We employed population-level whole-genome sequencing and analysis to explore the genetic relationships and gene content of the PNW C. gattii populations. We provide evidence that the PNW strains originated from South America and identified numerous genes potentially related to habitat adaptation, virulence expression, and clinical presentation. Characterization of these genetic features may lead to improved diagnostics and therapies for such fungal infections. The data indicate that there were multiple recent introductions of C. gattii into the PNW. Public health vigilance is warranted for emergence in regions where C. gattii is not thought to be endemic.
doi:10.1128/mBio.01464-14
PMCID: PMC4161256  PMID: 25028429
17.  Molecular Genetic Testing for Mitochondrial Disease: From One Generation to the Next 
Neurotherapeutics  2012;10(2):251-261.
Molecular genetic diagnostic testing for mitochondrial disease has evolved continually since the first genetic basis for a clinical mitochondrial disease syndrome was identified in the late 1980s. Owing to global limitations in both knowledge and technology, few individuals, even among those with strong clinical or biochemical evidence of mitochondrial respiratory chain dysfunction, ever received a definitive molecular diagnosis prior to 2005. Clinically available genetic diagnostic testing options improved by 2006 to include sequencing and deletion analysis of an increasing number of individual nuclear genes linked to mitochondrial disease, genome-wide microarray analysis for chromosomal copy number abnormalities, and mitochondrial DNA whole genome sequence analysis. To assess the collective effect of these tests on the genetic diagnosis of suspected mitochondrial disease, we report here results from a retrospective review of the diagnostic yield in patients evaluated from 2008 to 2011 in the Mitochondrial-Genetics Diagnostic Clinic at The Children’s Hospital of Philadelphia. Among 152 patients aged 6 weeks to 81 years referred for clinical evaluation of multisystem presentations concerning for suspected mitochondrial disease, a genetic etiology was established that confirmed definite mitochondrial disease in 16.4 % and excluded primary mitochondrial disease in 9.2 %. Substantial diagnostic challenges remain owing to the clinical difficulty and frank low yield of a priori selecting individual nuclear genes to sequence based on particular symptomatic or biochemical manifestations of suspected mitochondrial disease. These findings highlight the particular utility of massively parallel nuclear exome sequencing technologies, whose benefits and limitations are explored relative to the clinical genetic diagnostic evaluation of mitochondrial disease.
doi:10.1007/s13311-012-0174-1
PMCID: PMC3625386  PMID: 23269497
Next generation sequencing; massively parallel sequencing; whole exome sequencing; retrospective study; mitochondrial disease diagnosis
18.  The Clinical Impact of Chromosomal Microarray on Paediatric Care in Hong Kong 
PLoS ONE  2014;9(10):e109629.
Objective
To evaluate the clinical impact of chromosomal microarray (CMA) on the management of paediatric patients in Hong Kong.
Methods
We performed NimbleGen 135k oligonucleotide array on 327 children with intellectual disability (ID)/developmental delay (DD), autism spectrum disorders (ASD), and/or multiple congenital anomalies (MCAs) in a university-affiliated paediatric unit from January 2011 to May 2013. The medical records of patients were reviewed in September 2013, focusing on the pathogenic/likely pathogenic CMA findings and their “clinical actionability” based on established criteria.
Results
Thirty-seven patients were reported to have pathogenic/likely pathogenic results, while 40 had findings of unknown significance. This gives a detection rate of 11% for clinically significant (pathogenic/likely pathogenic) findings. The significant findings have prompted clinical actions in 28 out of 37 patients (75.7%), while the findings with unknown significance have led to a further management recommendation in only 1 patient (p < 0.001). Nineteen out of the 28 management recommendations are “evidence-based”, resting on either practice guidelines endorsed by a professional society (n = 9, Level 1) or peer-reviewed publications making medical management recommendations (n = 10, Level 2). CMA results impact medical management by precipitating referral to a specialist (n = 24), diagnostic testing (n = 25), surveillance of complications (n = 19), interventional procedure (n = 7), medication (n = 15) or lifestyle modification (n = 12).
Conclusion
The application of CMA in children with ID/DD, ASD, and/or MCAs in Hong Kong results in a diagnostic yield of ∼11% for pathogenic/likely pathogenic results. Importantly the yield for clinically actionable results is 8.6%. We advocate using diagnostic yield of clinically actionable results to evaluate CMA as it provides information of both clinical validity and clinical utility. Furthermore, it incorporates evidence-based medicine into the practice of genomic medicine. The same framework can be applied to other genomic testing strategies enabled by next-generation sequencing.
doi:10.1371/journal.pone.0109629
PMCID: PMC4198120  PMID: 25333781
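The percentages quoted in this abstract follow directly from the reported counts, and the short sketch below recomputes them. The helper name `proportion` and the variable names are illustrative; the counts (327 children tested, 37 significant findings, 28 of them prompting clinical action) are taken from the abstract.

```python
def proportion(numerator, denominator):
    """Return numerator / denominator as a fraction between 0 and 1."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Counts reported in the abstract above.
patients_tested = 327
significant_findings = 37   # pathogenic / likely pathogenic
findings_acted_on = 28      # significant findings that prompted clinical action

detection_rate = proportion(significant_findings, patients_tested)
action_rate = proportion(findings_acted_on, significant_findings)
actionable_yield = proportion(findings_acted_on, patients_tested)

print(f"detection rate:   {detection_rate:.1%}")    # 11.3%
print(f"action rate:      {action_rate:.1%}")       # 75.7%
print(f"actionable yield: {actionable_yield:.1%}")  # 8.6%
```

The actionable yield uses the whole tested cohort as its denominator, which is why it is lower than the detection rate: it counts only those findings that changed management.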
19.  Whole Genome Sequences of Three Treponema pallidum ssp. pertenue Strains: Yaws and Syphilis Treponemes Differ in Less than 0.2% of the Genome Sequence 
Background
The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis causing strains of Treponema pallidum ssp. pallidum (TPA). Both yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents.
Methodology/Principal Findings
To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strains (Samoa D, CDC-2, Gauthier) were determined using next-generation sequencing techniques. TPE genome sequences were compared to four genomes of TPA strains (Nichols, DAL-1, SS14, Chicago). The genome structure was identical in all three TPE strains with similar length ranging between 1,139,330 bp and 1,139,744 bp. No major genome rearrangements were found when compared to the four TPA genomes. The whole genome nucleotide divergence (dA) between TPA and TPE subspecies was 4.7 and 4.8 times higher than the observed nucleotide diversity (π) among TPA and TPE strains, respectively, corresponding to 99.8% identity between TPA and TPE genomes. A set of 97 (9.9%) TPE genes encoded proteins containing two or more amino acid replacements or other major sequence changes. The TPE divergent genes were mostly from the group encoding potential virulence factors and genes encoding proteins with unknown function.
Conclusions/Significance
Hypothetical genes with genetic differences consistently found between TPE and TPA strains are candidate virulence factors of syphilitic treponemes. Seventeen TPE genes were predicted to be under positive selection, and eleven of them coded either for predicted exported proteins or membrane proteins, suggesting their possible association with the cell surface. Sequence changes between TPE and TPA strains and changes specific to individual strains represent suitable targets for subspecies- and strain-specific molecular diagnostics.
Author Summary
Spirochete Treponema pallidum ssp. pertenue (TPE) is the causative agent of yaws while strains of Treponema pallidum ssp. pallidum (TPA) cause syphilis. Both yaws and syphilis are distinguished on the basis of epidemiological characteristics and clinical symptoms. Neither treponeme can reproduce outside the host organism, which precludes the use of standard molecular biology techniques used to study cultivable pathogens. In this study, we determined high quality whole genome sequences of TPE strains and compared them to known genetic information for T. pallidum ssp. pallidum strains. The genome structure was identical in all three TPE strains and also between TPA and TPE strains. The TPE genome length ranged between 1,139,330 bp and 1,139,744 bp. The overall sequence identity between TPA and TPE genomes was 99.8%, indicating that the two pathogens are extremely closely related. A set of 34 TPE genes (3.5%) encoded proteins containing six or more amino acid replacements or other major sequence changes. These genes more often belonged to the group of genes with predicted virulence and unknown functions suggesting their involvement in infection differences between yaws and syphilis.
doi:10.1371/journal.pntd.0001471
PMCID: PMC3265458  PMID: 22292095
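The comparison in this abstract between divergence across subspecies (dA) and nucleotide diversity within them (π) can be illustrated on toy data. The sketch below assumes the common definitions from population genetics: π as the mean per-site difference over all pairs of sequences within a group, and net divergence as the mean between-group difference minus the average of the two within-group diversities. Whether the authors computed dA exactly this way is an assumption, and the sequences are invented, not treponemal data.

```python
from itertools import combinations, product


def per_site_difference(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs):
    """Mean per-site difference over all pairs within one group (pi)."""
    pairs = list(combinations(seqs, 2))
    return sum(per_site_difference(a, b) for a, b in pairs) / len(pairs)


def between_group_divergence(group_x, group_y):
    """Mean per-site difference over all cross-group pairs (dXY)."""
    pairs = list(product(group_x, group_y))
    return sum(per_site_difference(a, b) for a, b in pairs) / len(pairs)


def net_divergence(group_x, group_y):
    """dXY corrected for within-group diversity (assumed definition of dA)."""
    within = (nucleotide_diversity(group_x) + nucleotide_diversity(group_y)) / 2
    return between_group_divergence(group_x, group_y) - within


# Invented 10-bp alignments standing in for two subspecies.
group_x = ["AAAAAAAAAA", "AAAAAAAAAT"]
group_y = ["CCAAAAAAAA", "CCAAAAAAAT"]

pi_x = nucleotide_diversity(group_x)
d_a = net_divergence(group_x, group_y)
print(f"pi = {pi_x:.2f}, dA = {d_a:.2f}, ratio = {d_a / pi_x:.1f}")
```

A ratio well above 1, as reported in the abstract, means that strains differ far more between subspecies than within either one.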
20.  Germline Variation in Cancer-Susceptibility Genes in a Healthy, Ancestrally Diverse Cohort: Implications for Individual Genome Sequencing 
PLoS ONE  2014;9(4):e94554.
Technological advances coupled with decreasing costs are bringing whole genome and whole exome sequencing closer to routine clinical use. One of the hurdles to clinical implementation is the high number of variants of unknown significance. For cancer-susceptibility genes, the difficulty in interpreting the clinical relevance of the genomic variants is compounded by the fact that most of what is known about these variants comes from the study of highly selected populations, such as cancer patients or individuals with a family history of cancer. The genetic variation in known cancer-susceptibility genes in the general population has not been well characterized to date. To address this gap, we profiled the nonsynonymous genomic variation in 158 genes causally implicated in carcinogenesis using high-quality whole genome sequences from an ancestrally diverse cohort of 681 healthy individuals. We found that all individuals carry multiple variants that may impact cancer susceptibility, with an average of 68 variants per individual. Of the 2,688 allelic variants identified within the cohort, most are very rare, with 75% found in only 1 or 2 individuals in our population. Allele frequencies vary between ancestral groups, and there are 21 variants for which the minor allele in one population is the major allele in another. Detailed analysis of a selected subset of 5 clinically important cancer genes, BRCA1, BRCA2, KRAS, TP53, and PTEN, highlights differences between germline variants and reported somatic mutations. The dataset can serve as a resource of genetic variation in cancer-susceptibility genes in 6 ancestry groups, an important foundation for the interpretation of cancer risk from personal genome sequences.
doi:10.1371/journal.pone.0094554
PMCID: PMC3984285  PMID: 24728327
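The observation that the minor allele in one population can be the major allele in another is easy to make concrete. The sketch below is illustrative only: genotypes are encoded as the number of alternate alleles per diploid individual (0, 1 or 2), and the function names and toy cohorts are assumptions, not the study's data or code.

```python
def alt_allele_frequency(alt_counts):
    """Frequency of the alternate allele given per-individual counts (0, 1, 2)."""
    if not alt_counts:
        raise ValueError("need at least one genotyped individual")
    return sum(alt_counts) / (2 * len(alt_counts))


def minor_allele(alt_freq):
    """Return 'alt' or 'ref', or None when both alleles are equally frequent."""
    if alt_freq == 0.5:
        return None
    return "alt" if alt_freq < 0.5 else "ref"


def minor_allele_flips(pop_a, pop_b):
    """True when the two populations have different minor alleles."""
    minor_a = minor_allele(alt_allele_frequency(pop_a))
    minor_b = minor_allele(alt_allele_frequency(pop_b))
    return minor_a is not None and minor_b is not None and minor_a != minor_b


# Toy cohorts of four diploid individuals each.
population_a = [0, 0, 0, 1]  # alternate allele is rare here
population_b = [2, 2, 1, 2]  # alternate allele is common here

print(alt_allele_frequency(population_a))             # 0.125
print(alt_allele_frequency(population_b))             # 0.875
print(minor_allele_flips(population_a, population_b)) # True
```

For such a variant, a frequency filter tuned on one population would give the opposite answer in the other, which is one reason ancestry-matched reference data matter.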
21.  Joint genotype inference with germline and somatic mutations 
BMC Bioinformatics  2013;14(Suppl 5):S3.
The joint sequencing of related genomes has become an important means to discover rare variants. Normal-tumor genome pairs are routinely sequenced together to find somatic mutations and their associations with different cancers. Parental and sibling genomes reveal de novo germline mutations and inheritance patterns related to Mendelian diseases.
Acute lymphoblastic leukemia (ALL) is the most common paediatric cancer and the leading cause of cancer-related death among children. With the aim of uncovering the full spectrum of germline and somatic genetic alterations in childhood ALL genomes, we conducted whole-exome re-sequencing on a unique cohort of over 120 exomes of childhood ALL quartets, each comprising a patient's tumor and matched-normal material, and DNA from both parents. We developed a general probabilistic model for such quartet sequencing reads mapped to the reference human genome. The model is used to infer joint genotypes at homologous loci across a normal-tumor genome pair and two parental genomes.
We describe the algorithms and data structures for genotype inference and model parameter training. We implemented the methods in an open-source software package (QUADGT) that uses the standard file formats of the 1000 Genomes Project. Our method's utility is illustrated on quartets from the ALL cohort.
doi:10.1186/1471-2105-14-S5-S3
PMCID: PMC3622648  PMID: 23734724
22.  Serum protein profiling of adults and children with Crohn’s disease 
Objectives
Crohn’s disease (CD) and ulcerative colitis (UC), known collectively as inflammatory bowel disease (IBD), are chronic immuno-inflammatory pathologies of unknown etiology. Despite the frequent utilization of biomarkers in medical practice, there is a relative lack of information regarding validated paediatric biomarkers for IBD. Further, biomarkers proven to be efficacious in adults are frequently extrapolated to the paediatric clinical setting without considering that the pathogenesis of many diseases is distinctly different in children. In the current study, proteomics technology was employed to monitor differences in protein expression between adult and paediatric CD patients, in order to identify a panel of candidate protein biomarkers that might be used to improve prognostic-diagnostic accuracy and to advance paediatric medical care.
Methods
Male and female serum samples from 12 adults and 12 children with active CD were subjected to two-dimensional gel electrophoresis. Following the relative quantitation of protein spots exhibiting a differential expression between the two groups by densitometry, the spots were further characterized by MALDI-TOF-MS. Results were confirmed by Western blot analysis.
Results
Clusterin (CLUS) was found to be significantly over-expressed in adults with CD, whereas ceruloplasmin (CERU) and apolipoprotein B-100 (APOB) were found to be significantly over-expressed in children indicating that the expression of these proteins might be implicated in the onset or progression of CD in these two sub-groups of patients.
Conclusions
Interestingly, we found a differential expression of several proteins in adults versus paediatric CD patients. Undoubtedly, future experiments using a larger cohort of CD patients are needed to evaluate the relevance of our preliminary findings.
doi:10.1097/MPG.0000000000000579
PMCID: PMC4276513  PMID: 25250685
Crohn’s disease; ulcerative colitis; proteomics; paediatric; inflammatory bowel disease
23.  Global population-specific variation in miRNA associated with cancer risk and clinical biomarkers 
BMC Medical Genomics  2014;7:53.
Background
MiRNA expression profiling is being actively investigated as a clinical biomarker and diagnostic tool to detect multiple cancer types and stages as well as other complex diseases. Initial investigations, however, have not comprehensively taken into account genetic variability affecting miRNA expression and/or function in populations of different ethnic backgrounds. Therefore, more complete surveys of miRNA genetic variability are needed to assess global patterns of miRNA variation within and between diverse human populations and their effect on clinically relevant miRNA genes.
Methods
Genetic variation in 1524 miRNA genes was examined using whole genome sequencing (60x coverage) in a panel of 69 unrelated individuals from 14 global populations, including European, Asian and African populations.
Results
We identified 33 previously undescribed miRNA variants, and 31 miRNA containing variants that are globally population-differentiated in frequency between African and non-African populations (PD-miRNA). The top 1% of PD-miRNA were significantly enriched for regulation of genes involved in glucose/insulin metabolism and cell division (p < 10⁻⁷), most significantly the mitosis pathway, which is strongly linked to cancer onset. Overall, we identify 7 PD-miRNAs that are currently implicated as cancer biomarkers or diagnostics: hsa-mir-202, hsa-mir-423, hsa-mir-196a-2, hsa-mir-520h, hsa-mir-647, hsa-mir-943, and hsa-mir-1908. Notably, hsa-mir-202, a potential breast cancer biomarker, was found to show significantly high allele frequency differentiation at SNP rs12355840, which is known to affect miRNA expression levels in vivo and subsequently breast cancer mortality.
Conclusion
MiRNA expression profiles represent a promising new category of disease biomarkers. However, population specific genetic variation can affect the prevalence and baseline expression of these miRNAs in diverse populations. Consequently, miRNA genetic and expression level variation among ethnic groups may be contributing in part to health disparities observed in multiple forms of cancer, specifically breast cancer, and will be an essential consideration when assessing the utility of miRNA biomarkers for the clinic.
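As an illustration of what "allele frequency differentiation" between two populations can mean, the sketch below computes Wright's F_ST for a single biallelic variant from the allele frequency in each population. This is a common textbook measure chosen here as an assumption; the abstract does not state which differentiation statistic the authors used, and the function name and equal weighting of the two populations are hypothetical.

```python
def fst_two_populations(p1, p2):
    """Wright's F_ST for one biallelic variant in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Illustrative only; not necessarily the statistic used in the study.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if the two populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # Allele is fixed or absent in both populations: no differentiation.
        return 0.0
    # Mean expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total
```

Values near 0 indicate similar allele frequencies in the two populations; values near 1 indicate that the populations are close to fixed for different alleles.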
doi:10.1186/1755-8794-7-53
PMCID: PMC4159108  PMID: 25169894
miRNA; Biomarkers; Population differentiation; Whole-genome sequencing; African genetic diversity; Disease susceptibility; Cancer; Diabetes
24.  SOLiD™ Sequencing of Genomes of Clinical Isolates of Leishmania donovani from India Confirm Leptomonas Co-Infection and Raise Some Key Questions 
PLoS ONE  2013;8(2):e55738.
Background
Known as a ‘neglected disease’ because relatively little effort has been applied to finding cures, leishmaniasis kills more than 150,000 people every year and debilitates millions more. Visceral leishmaniasis (VL), also called Kala Azar (KA) or black fever in India, claims around 20,000 lives every year. Whole genome analysis presents an excellent means to identify new targets for drug, vaccine and diagnostics development, and also provides an avenue into the biological basis of parasite virulence in the L. donovani complex prevalent in India.
Methodology/Principal Findings
In this study, the next-generation SOLiD™ platform was used for the first time to carry out whole genome sequencing of L. donovani clinical isolates from India. We report the exceptional occurrence of insect trypanosomatids in clinical cases of visceral leishmaniasis (Kala Azar) in India. Whole genome sequencing analysis confirms that isolates sequenced from Kala Azar cases were genetically related to Leptomonas. The co-infection of these patients' splenic aspirates with a species of Leptomonas, and how likely it is that this infection is pathogenic, are key questions that need to be investigated. We discuss our results in the context of several probable hypotheses.
Conclusions/Significance
Our results, in which isolates from unusual cases of Kala Azar were found to be most similar to Leptomonas species, have important clinical implications for the treatment of Kala Azar in India. Leptomonas have been shown to be highly susceptible to several standard leishmaniacides in vitro. There is very little divergence between these two species, viz. Leishmania sp. and L. seymouri, in terms of genomic sequence and organization. The phenomenon of co-infection needs to be examined more extensively from molecular pathogenesis and eco-epidemiological standpoints.
doi:10.1371/journal.pone.0055738
PMCID: PMC3572117  PMID: 23418454
25.  Relationship Estimation from Whole-Genome Sequence Data 
PLoS Genetics  2014;10(1):e1004144.
The determination of the relationship between a pair of individuals is a fundamental application of genetics. Previously, we and others have demonstrated that identity-by-descent (IBD) information generated from high-density single-nucleotide polymorphism (SNP) data can greatly improve the power and accuracy of genetic relationship detection. Whole-genome sequencing (WGS) marks the final step in increasing genetic marker density by assaying all single-nucleotide variants (SNVs), and thus has the potential to further improve relationship detection by enabling more accurate detection of IBD segments and more precise resolution of IBD segment boundaries. However, WGS introduces new complexities that must be addressed in order to achieve these improvements in relationship detection. To evaluate these complexities, we estimated genetic relationships from WGS data for 1490 known pairwise relationships among 258 individuals in 30 families along with 46 population samples as controls. We identified several genomic regions with excess pairwise IBD in both the pedigree and control datasets using three established IBD methods: GERMLINE, fastIBD, and ISCA. These spurious IBD segments produced a 10-fold increase in the rate of detected false-positive relationships among controls compared to high-density microarray datasets. To address this issue, we developed a new method to identify and mask genomic regions with excess IBD. This method, implemented in ERSA 2.0, fully resolved the inflated cryptic relationship detection rates while improving relationship estimation accuracy. ERSA 2.0 detected all 1st through 6th degree relationships, and 55% of 9th through 11th degree relationships in the 30 families. We estimate that WGS data provides a 5% to 15% increase in relationship detection power relative to high-density microarray data for distant relationships. 
Our results identify regions of the genome that are highly problematic for IBD mapping and introduce new software to accurately detect 1st through 9th degree relationships from whole-genome sequence data.
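The idea of identifying and masking genomic regions with excess IBD can be sketched as follows. This is a minimal illustration under stated assumptions, not the ERSA 2.0 algorithm: the function names, the fixed 1 Mb windows, the 5% default threshold, and the choice to discard any segment that touches a masked window are all hypothetical. Segments are assumed to be `(pair_id, chrom, start, end)` tuples with half-open base-pair coordinates.

```python
from collections import defaultdict


def excess_ibd_windows(control_segments, n_pairs, window=1_000_000, max_rate=0.05):
    """Flag windows in which too many unrelated control pairs share IBD.

    control_segments: iterable of (pair_id, chrom, start, end), half-open in bp.
    n_pairs: total number of control pairs that were compared.
    Returns a set of (chrom, window_index) keys. Thresholds are assumptions.
    """
    pairs_by_window = defaultdict(set)
    for pair_id, chrom, start, end in control_segments:
        # Record each window the segment overlaps, once per pair.
        for w in range(start // window, (end - 1) // window + 1):
            pairs_by_window[(chrom, w)].add(pair_id)
    return {
        key
        for key, pairs in pairs_by_window.items()
        if len(pairs) / n_pairs > max_rate
    }


def drop_masked_segments(segments, masked, window=1_000_000):
    """Keep only the segments that overlap no masked window."""
    kept = []
    for pair_id, chrom, start, end in segments:
        windows = range(start // window, (end - 1) // window + 1)
        if not any((chrom, w) in masked for w in windows):
            kept.append((pair_id, chrom, start, end))
    return kept
```

Because the control samples are presumed unrelated, a window shared IBD by many control pairs is more plausibly an artifact than true recent ancestry, which is why the mask is built from controls and then applied to the pedigree segments.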
Author Summary
The determination of the relationship between a pair of individuals is a fundamental application of genetics. The most accurate methods for relationship estimation rely on precise, localized estimates of genetic sharing between individuals. Earlier methods have generated these estimates from high-density genetic marker data. We performed relationship estimation using whole-genome sequence data for 1490 known pairwise relationships among 258 individuals in 30 families along with 46 population samples as controls. Our results demonstrate that complexities specific to whole-genome sequencing result in regions of the genome that are prone to false-positive estimates of genetic sharing. We provide a map of these spurious IBD regions and introduce new methods, implemented in the software package ERSA 2.0, to control for spurious IBD. We show that ERSA 2.0 provides a 5% to 15% increase in relationship detection power for distant relationships with whole-genome sequence data relative to high-density genetic marker data.
doi:10.1371/journal.pgen.1004144
PMCID: PMC3907355  PMID: 24497848
