Related Articles

1.  Next Generation Sequencing in Research and Diagnostics of Ocular Birth Defects 
Molecular genetics and metabolism  2010;100(2):184-192.
Sequence capture enrichment (SCE) strategies and massively parallel next generation sequencing (NGS) are expected to increase the rate of gene discovery for genetically heterogeneous hereditary diseases, but at present, there are very few examples of successful application of these technologic advances in translational research and clinical testing. Our study assessed whether array based target enrichment followed by re-sequencing on the Roche Genome Sequencer FLX (GS FLX) system could be used for novel mutation identification in more than 1000 exons representing 100 candidate genes for ocular birth defects, and as a control, whether these methods could detect two known mutations in the PAX2 gene. We assayed two samples with heterozygous sequence changes in PAX2 that were previously identified by conventional Sanger sequencing. These changes were a c.527G>C (S176T) substitution and a single basepair deletion c.77delG. The nucleotide substitution c.527G>C was easily identified by NGS. A deletion of one base in a long polyG stretch (c.77delG) was not registered initially by the GS Reference Mapper, but was detected in repeated analysis using two different software packages. Different approaches were evaluated for distinguishing false positives (sequencing errors) and benign polymorphisms from potentially pathogenic sequence changes that require further follow-up. Although improvements will be necessary in accuracy, speed, ease of data analysis and cost, our study confirms that NGS can be used in research and diagnostic settings to screen for mutations in hundreds of loci in genetically heterogeneous human diseases.
PMCID: PMC2871986  PMID: 20359920
next generation sequencing; sequence capture; GS FLX; anophthalmia; microphthalmia; coloboma
2.  Targeted sequence capture and GS-FLX Titanium sequencing of 23 hypertrophic and dilated cardiomyopathy genes: implementation into diagnostics 
Journal of Medical Genetics  2013;50(9):614-626.
Genetic evaluation of cardiomyopathies poses a challenge. Multiple genes are involved but no clear genotype–phenotype correlations have been found so far. In the past, genetic evaluation for hypertrophic (HCM) and dilated (DCM) cardiomyopathies was performed by sequential screening of a very limited number of genes. Recent developments in sequencing have increased the throughput, enabling simultaneous screening of multiple genes for multiple patients in a single sequencing run.
The aim was to develop and implement a next generation sequencing (NGS) based genetic test as a replacement for Sanger sequencing.
Methods and Results
In order to increase the number of genes that can be screened in a shorter time period, we enriched all exons of 23 of the most relevant HCM and DCM related genes using on-array multiplexed sequence capture followed by massively parallel pyrosequencing on the GS-FLX Titanium. After optimisation of array based sequence capture it was feasible to reliably detect a large panel of known and unknown variants in HCM and DCM patients, whereby the unknown variants could be confirmed by Sanger sequencing.
The rate of detection of (pathogenic) variants in both HCM and DCM patients was increased due to a larger number of genes studied. Array based target enrichment followed by NGS showed the same accuracy as Sanger sequencing. Therefore, NGS is ready for implementation in a diagnostic setting.
PMCID: PMC3756457  PMID: 23785128
Cardiomyopathy; Diagnostics; Genetics; Molecular genetics
3.  Illumina sequencing of 15 deafness genes using fragmented amplicons 
BMC Research Notes  2014;7(1):509.
Resequencing of deafness related genes using GS FLX massive parallel sequencing of PCR amplicons spanning selected genes has previously been reported as a successful strategy to discover causal variants. The amplicon lengths were designed to be smaller than the sequencing read length of GS FLX technology, but they are longer than the read lengths of Illumina sequencing technology. Fragmentation is thus required to sequence these amplicons using high throughput Illumina technology.
We performed Illumina sequencing in 4 patients on 563 multiplexed amplicons covering the exons of 15 genes involved in the hearing process. After exploring several fragmentation strategies, the amplicons were fragmented using Covaris sonication prior to library preparation. CLC Genomics Workbench was used to analyze the data.
We achieved excellent coverage, with more than 99% of the amplicon bases covered. All variants that were previously validated using Sanger sequencing were also called in this study. Variant calling revealed fewer false positive and false negative results compared to the previous study. For each patient, several variants were found that are reported by ClinVar as possible hearing loss variants.
Migration from GS FLX amplicon sequencing to Illumina amplicon sequencing is straightforward and leads to more accurate results.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-0500-7-509) contains supplementary material, which is available to authorized users.
PMCID: PMC4266979  PMID: 25106482
4.  Analysis of the Pythium ultimum transcriptome using Sanger and Pyrosequencing approaches 
BMC Genomics  2008;9:542.
Pythium species are an agriculturally important genus of plant pathogens, yet are not understood well at the molecular, genetic, or genomic level. They are closely related to other oomycete plant pathogens such as Phytophthora species and are ubiquitous in their geographic distribution and host range. To gain a better understanding of its gene complement, we generated Expressed Sequence Tags (ESTs) from the transcriptome of Pythium ultimum DAOM BR144 (= ATCC 200006 = CBS 805.95) using two high throughput sequencing methods, Sanger-based chain termination sequencing and pyrosequencing-based sequencing-by-synthesis.
A single half-plate pyrosequencing (454 FLX) run on adapter-ligated cDNA from a normalized cDNA population generated 90,664 reads with an average read length of 190 nucleotides following cleaning and removal of sequences shorter than 100 base pairs. After clustering and assembly, a total of 35,507 unique sequences were generated. In parallel, 9,578 reads were generated from a library constructed from the same normalized cDNA population using dideoxy chain termination Sanger sequencing, which upon clustering and assembly generated 4,689 unique sequences. A hybrid assembly of both Sanger- and pyrosequencing-derived ESTs resulted in 34,495 unique sequences with 1,110 sequences (3.2%) that were solely derived from Sanger sequencing alone. A high degree of similarity was seen between P. ultimum sequences and other sequenced plant pathogenic oomycetes with 91% of the hybrid assembly derived sequences > 500 bp having similarity to sequences from plant pathogenic Phytophthora species. An analysis of Gene Ontology assignments revealed a similar representation of molecular function ontologies in the hybrid assembly in comparison to the predicted proteomes of three Phytophthora species, suggesting a broad representation of the P. ultimum transcriptome was present in the normalized cDNA population. P. ultimum sequences with similarity to oomycete RXLR and Crinkler effectors, Kazal-like and cystatin-like protease inhibitors, and elicitins were identified. Sequences with similarity to thiamine biosynthesis enzymes that are lacking in the genome sequences of three Phytophthora species and one downy mildew were identified and could serve as useful phylogenetic markers. Furthermore, we identified 179 candidate simple sequence repeats that can be used for genotyping strains of P. ultimum.
Through these two technologies, we were able to generate a robust set (~10 Mb) of transcribed sequences for P. ultimum. We were able to identify known sequences present in oomycetes as well as identify novel sequences. An ample number of candidate polymorphic markers were identified in the dataset providing resources for phylogenetic and diagnostic marker development for this species. On a technical level, in spite of the depth possible with 454 FLX platform, the Sanger and pyro-based sequencing methodologies were complementary as each method generated sequences unique to each platform.
PMCID: PMC2612028  PMID: 19014603
5.  Diagnostic Screening Workflow for Mutations in the BRCA1 and BRCA2 Genes 
Screening for mutations in large genes is challenging in a molecular diagnostic environment. Sanger-based DNA sequencing methods are largely used; however, massively parallel sequencing (MPS) can accommodate increasing test demands and financial constraints. This study aimed to establish a simple workflow to amplify and screen all coding regions of the BRCA1 and BRCA2 (BRCA1/2) genes by Sanger-based sequencing as well as to assess a MPS approach encompassing multiplex polymerase chain reaction (PCR) and pyrosequencing.
This study was conducted between July 2011 and April 2013. A total of 20 patients were included in the study who had been referred to Genetic Health Services New Zealand (Northern Hub) for BRCA1/2 mutation screening. Patients were randomly divided into a MPS evaluation and validation cohort (n = 10 patients each). Primers were designed to amplify all coding exons of BRCA1/2 (28 and 42 primer pairs, respectively). Primers overlying known variants were avoided to circumvent allelic drop-out. The MPS approach necessitated utilisation of a complementary fragment analysis assay to eliminate apparent false-positives at homopolymeric regions. Variants were filtered on the basis of their frequency and sequence depth.
Sanger-based sequencing of PCR-amplified coding regions was successfully achieved. Sensitivity and specificity of the combined MPS/homopolymer protocol was determined to be 100% and 99.5%, respectively.
In comparison to traditional Sanger-based sequencing, the MPS workflow led to a reduction in both cost and analysis time for BRCA1/2 screening. MPS analysis achieved high analytical sensitivity and specificity, but required complementary fragment analysis combined with Sanger-based sequencing confirmation in some instances.
PMCID: PMC4318608
Massively Parallel Sequencing; BRCA1 Gene; BRCA2 Gene; HBOC Syndrome; Detection, heterozygote
6.  Genetic Testing in Hereditary Breast and Ovarian Cancer Using Massive Parallel Sequencing 
BioMed Research International  2014;2014:542541.
High throughput methods such as next generation sequencing are increasingly used in molecular diagnosis. The aim of this study was to develop a workflow for the detection of BRCA1 and BRCA2 mutations using massive parallel sequencing in a 454 GS Junior bench top sequencer. Our approach was first validated in a panel of 23 patients containing 62 unique variants that had been previously Sanger sequenced. Subsequently, 101 patients with familial breast and ovarian cancer were studied. BRCA1 and BRCA2 exon enrichment was performed by PCR amplification using the BRCA MASTR kit (Multiplicom). Bioinformatic analysis of reads was performed with the AVA software v2.7 (Roche). In total, all 62 variants were detected, resulting in a sensitivity of 100%. 71 false positives were called, resulting in a specificity of 97.35%. All of them corresponded to deletions located in homopolymeric stretches. The analysis of homopolymer stretches of 6 bp or longer using the BRCA HP kit (Multiplicom) increased the specificity of the detection of BRCA1 and BRCA2 mutations to 99.99%. We show here that massive parallel pyrosequencing can be used as a diagnostic strategy to test for BRCA1 and BRCA2 mutations, meeting very stringent sensitivity and specificity parameters and replacing traditional Sanger sequencing at a lower cost.
PMCID: PMC4098986  PMID: 25136594
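The sensitivity and specificity figures in the abstract above follow the usual confusion-matrix definitions. The following Python sketch shows the arithmetic under stated assumptions: the counts of 62 detected known variants and 71 false positives come from the abstract, but the abstract does not report the number of true negatives, so `hypothetical_true_neg` is an illustrative placeholder and the resulting specificity is not the study's figure.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant positions correctly left uncalled: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# From the abstract: all 62 known variants were detected, none were missed.
sens = sensitivity(true_pos=62, false_neg=0)

# Hypothetical denominator: the abstract does not state the true-negative count.
hypothetical_true_neg = 10_000
spec = specificity(true_neg=hypothetical_true_neg, false_pos=71)

print(f"sensitivity = {sens:.2%}")
print(f"specificity = {spec:.2%} (illustrative denominator only)")
```

Because specificity depends on how many negative positions are counted, the same 71 false positives can yield quite different percentages depending on the size of the evaluated region.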
7.  Sequence artefacts in a prospective series of formalin-fixed tumours tested for mutations in hotspot regions by massively parallel sequencing 
BMC Medical Genomics  2014;7:23.
Clinical specimens undergoing diagnostic molecular pathology testing are fixed in formalin due to the necessity for detailed morphological assessment. However, formalin fixation can cause major issues with molecular testing, as it causes DNA damage such as fragmentation and non-reproducible sequencing artefacts after PCR amplification. In the context of massively parallel sequencing (MPS), distinguishing true low frequency variants from sequencing artefacts remains challenging. The prevalence of formalin-induced DNA damage and its impact on molecular testing and clinical genomics remains poorly understood.
The Cancer 2015 study is a population-based cancer cohort used to assess the feasibility of mutational screening using MPS in cancer patients from Victoria, Australia. While blocks were formalin-fixed and paraffin-embedded in different anatomical pathology laboratories, they were centrally extracted for DNA utilising the same protocol, and run through the same MPS platform (Illumina TruSeq Amplicon Cancer Panel). The sequencing artefacts in the 1-10% and the 10-25% allele frequency ranges were assessed in 488 formalin-fixed tumours from the pilot phase of the Cancer 2015 cohort. All blocks were less than 2.5 years of age (mean 93 days).
Consistent with the signature of DNA damage due to formalin fixation, many formalin-fixed samples displayed disproportionate levels of C>T/G>A changes in the 1-10% allele frequency range. Artefacts were less apparent in the 10-25% allele frequency range. Significantly, changes were inversely correlated with coverage indicating high levels of sequencing artefacts were associated with samples with low amounts of available amplifiable template due to fragmentation. The degree of fragmentation and sequencing artefacts differed between blocks sourced from different anatomical pathology laboratories. In a limited validation of potentially actionable low frequency mutations, a NRAS G12D mutation in a melanoma was shown to be a false positive.
These findings indicate that DNA damage following formalin fixation remains a major challenge in laboratories working with MPS. Methodologies that assess, minimise or remove formalin-induced DNA damaged templates as part of MPS protocols will aid in the interpretation of genomic results leading to better patient outcomes.
PMCID: PMC4032349  PMID: 24885028
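One way to look for the formalin signature described in the abstract above is to compare the share of C>T/G>A changes across allele-frequency bins. The Python sketch below is only an illustration of that idea, not the study's method: the variant calls are hypothetical, and the tuple layout `(ref, alt, allele_frequency)` and the function name `signature_fraction` are assumptions made for this example. The bin boundaries (1-10% and 10-25%) are taken from the abstract.

```python
# Base changes the abstract associates with formalin-induced damage.
FORMALIN_SIGNATURE = {("C", "T"), ("G", "A")}


def signature_fraction(calls, low, high):
    """Fraction of calls with low <= allele frequency < high that are C>T or G>A.

    Returns None when no call falls inside the bin.
    """
    in_bin = [call for call in calls if low <= call[2] < high]
    if not in_bin:
        return None
    hits = sum(1 for ref, alt, _ in in_bin if (ref, alt) in FORMALIN_SIGNATURE)
    return hits / len(in_bin)


# Hypothetical calls for one sample: (ref, alt, allele_frequency).
calls = [
    ("C", "T", 0.02), ("G", "A", 0.03), ("C", "T", 0.05), ("A", "G", 0.04),
    ("G", "A", 0.12), ("T", "C", 0.15), ("A", "C", 0.20), ("C", "G", 0.22),
]

low_bin = signature_fraction(calls, 0.01, 0.10)   # 1-10% allele frequency
mid_bin = signature_fraction(calls, 0.10, 0.25)   # 10-25% allele frequency
print(f"C>T/G>A share, 1-10% bin:  {low_bin}")
print(f"C>T/G>A share, 10-25% bin: {mid_bin}")
```

A markedly higher share in the low-frequency bin than in the higher one would be consistent with the pattern the abstract reports, although a real analysis would also need to account for coverage.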
8.  Comprehensive Mutation Analysis for Congenital Muscular Dystrophy: A Clinical PCR-Based Enrichment and Next-Generation Sequencing Panel 
PLoS ONE  2013;8(1):e53083.
The congenital muscular dystrophies (CMDs) comprise a heterogeneous group of heritable muscle disorders with often difficult to interpret muscle pathology, making them challenging to diagnose. Serial Sanger sequencing of suspected CMD genes, while the current molecular diagnostic method of choice, can be slow and expensive. A comprehensive panel test for simultaneous screening of mutations in all known CMD-associated genes would be a more effective diagnostic strategy. Thus, the CMDs are a model disorder group for development and validation of next-generation sequencing (NGS) strategies for diagnostic and clinical care applications. Using a highly multiplexed PCR-based target enrichment method (RainDance) in conjunction with NGS, we performed mutation detection in all CMD genes of 26 samples and compared the results with Sanger sequencing. The RainDance NGS panel showed great consistency in coverage depth, on-target efficiency, versatility of mutation detection, and genotype concordance with Sanger sequencing, demonstrating the test's appropriateness for clinical use. Compared to single tests, a higher diagnostic yield was observed by panel implementation. The panel's limitation is the amplification failure of select gene-specific exons which require Sanger sequencing for test completion. Successful validation and application of the CMD NGS panel to improve the diagnostic yield in a clinical laboratory was shown.
PMCID: PMC3543442  PMID: 23326386
9.  Next generation sequence analysis for mitochondrial disorders 
Genome Medicine  2009;1(10):100.
Mitochondrial disorders can originate from mutations in one of many nuclear genes controlling the organelle function or in the mitochondrial genome (mitochondrial DNA (mtDNA)). The large numbers of potential culprit genes, together with the little guidance offered by most clinical phenotypes as to which gene may be causative, are a great challenge for the molecular diagnosis of these disorders.
We developed a novel targeted resequencing assay for mitochondrial disorders relying on microarray-based hybrid capture coupled to next-generation sequencing. Specifically, we subjected the entire mtDNA genome and the exons and intron-exon boundary regions of 362 known or candidate causative nuclear genes to targeted capture and resequencing. We here provide proof-of-concept data by testing one HapMap DNA sample and two positive control samples.
Over 94% of the targeted regions were captured and sequenced with appropriate coverage and quality, allowing reliable variant calling. Pathogenic mutations blindly tested in patients' samples were 100% concordant with previous Sanger sequencing results: a known mutation in Pyruvate dehydrogenase alpha 1 subunit (PDHA1), a novel splicing and a known coding mutation in Hydroxyacyl-CoA dehydrogenase alpha subunit (HADHA) were correctly identified. Of the additional variants recognized, 90 to 94% were present in dbSNP while 6 to 10% represented new alterations. The novel nonsynonymous variants were all in heterozygote state and mostly predicted to be benign. The depth of sequencing coverage of mtDNA was extremely high, suggesting that it may be feasible to detect pathogenic mtDNA mutations confounded by low level heteroplasmy. Only one sequencing lane of an eight lane flow cell was utilized for each sample, indicating that a cost-effective clinical test can be achieved.
Our study indicates that the use of next generation sequencing technology holds great promise as a tool for screening mitochondrial disorders. The availability of a comprehensive molecular diagnostic tool will increase the capacity for early and rapid identification of mitochondrial disorders. In addition, the proposed approach has the potential to identify new mutations in candidate genes, expanding and redefining the spectrum of causative genes responsible for mitochondrial disorders.
PMCID: PMC2784303  PMID: 19852779
10.  Targeted NGS: A Cost-Effective Approach to Molecular Diagnosis of PIDs 
Background: Primary immunodeficiencies (PIDs) are a diverse group of disorders caused by multiple genetic defects. Obtaining a molecular diagnosis for PID patients using a phenotype-based approach is often complex, expensive, and not always successful. Next-generation sequencing (NGS) methods offer an unbiased genotype-based approach, which can facilitate molecular diagnostics.
Objective: To develop an efficient NGS method to identify variants in PID-related genes.
Methods: We performed HaloPlex custom target enrichment and NGS using the Ion Torrent PGM to screen 173 genes in 11 healthy controls, 13 PID patients previously evaluated with either an identified mutation or SNP, and 120 patients with undiagnosed PIDs. Sensitivity and specificity were determined by comparing NGS and Sanger sequencing results for 33 patients. Run metrics and coverage analyses were done to identify systematic deficiencies.
Results: A molecular diagnosis was identified for 18 of 120 patients who previously lacked a genetic diagnosis, including 9 who had atypical presentations and extensive previous genetic and functional studies. Our NGS method detected variants with 98.1% sensitivity and >99.9% specificity. Uniformity was variable (72–89%), and we were not able to reliably sequence 45 regions (45/2455 or 1.8% of total regions) due to low (<20) average read depth or <90% region coverage; thus, we optimized probe hybridization conditions to improve read-depth and coverage for future analyses, and established criteria to help identify true positives.
Conclusion: While NGS methods are not as sensitive as Sanger sequencing for individual genes, targeted NGS is a cost-effective, first-line genetic test for the evaluation of patients with PIDs. This approach decreases time to diagnosis, increases diagnostic rate, and provides insight into the genotype–phenotype correlation of PIDs in a cost-effective way.
PMCID: PMC4217515  PMID: 25404929
primary immunodeficiency; mutation analysis; Sanger sequencing; next-generation sequencing; genotype–phenotype correlation; SNV; INDEL
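The region-level quality criteria in the Results above (average read depth below 20, or less than 90% of the region covered) can be written as a simple filter. This is a minimal Python sketch under stated assumptions: the two thresholds and the 45/2455 count come from the abstract, while the region names and their metrics are hypothetical and the function `region_fails` is introduced here purely for illustration.

```python
MIN_MEAN_DEPTH = 20          # abstract: average read depth below 20 is unreliable
MIN_COVERED_FRACTION = 0.90  # abstract: under 90% region coverage is unreliable


def region_fails(mean_depth: float, covered_fraction: float) -> bool:
    """True when a target region misses either threshold."""
    return mean_depth < MIN_MEAN_DEPTH or covered_fraction < MIN_COVERED_FRACTION


# Hypothetical regions: name -> (mean read depth, fraction of bases covered).
regions = {
    "region_A": (150.0, 1.00),
    "region_B": (12.5, 0.97),   # fails on depth
    "region_C": (85.0, 0.82),   # fails on coverage
}
failed = sorted(
    name for name, (depth, covered) in regions.items()
    if region_fails(depth, covered)
)
print("failed regions:", failed)

# Reported in the abstract: 45 of 2455 regions could not be reliably sequenced.
reported_failure_rate = 45 / 2455
print(f"reported failure rate: {reported_failure_rate:.1%}")
```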
11.  Screening of the hearing of newborns - Update 
Permanent congenital bilateral hearing loss (CHL) of moderate or greater degree (≥40 dB HL) is a rare disease, with a prevalence of about 1 to 3 per 1000 births. However, it is one of the most frequent congenital diseases. Reliance on physician observation and parental recognition has not been successful in the past in detecting significant hearing loss in the first year of life. With this strategy significant hearing losses have been detected in the second year of life. With two objective technologies based on physiologic response to sound, otoacoustic emissions (OAE) and auditory brainstem response (ABR), hearing screening in the first days of life is made possible.
The objective of this health technology assessment report is to update the evaluation on clinical effectiveness and cost-effectiveness of newborn hearing screening programs. Universal newborn hearing screening (UNHS) (i), selective screening of high risk newborns (ii), and the absence of a systematic screening program are compared for age at identification and age at hearing aid fitting of children with hearing loss. Secondly, the potential benefits of early intervention are analysed. Costs and cost-effectiveness of newborn hearing screening programs are determined. This report is intended to make a contribution to the decision making whether and under which conditions a newborn hearing screening program should be reimbursed by the statutory sickness funds in Germany.
This health technology assessment report updates a former health technology assessment (Kunze et al. 2004 [1]). A systematic review of the literature was conducted, based on a documented search and selection of the literature using predefined inclusion and exclusion criteria and a documented extraction and appraisal of the included studies. To assess the cost-effectiveness of the different screening strategies in Germany the decision analytic Markov state model which had been developed in our former health technology assessment report was updated.
Universal newborn hearing screening programs are able to substantially reduce the age at identification and the age at intervention of children with CHL to six months of age in the German health care setting. High coverage rates, low fail rates and, if tracking systems are implemented, high follow-up rates to diagnostic evaluation for test positives were achieved. New publications on potential benefits of early intervention could not be retrieved. For a final assessment of cost-effectiveness of newborn hearing screening evidence based long-term data are lacking. Decision analytic models with lifelong time horizon assuming that early detection results in improved language abilities and lower educational costs and higher life time productivity showed a potential of UNHS for long term cost savings compared to selective screening and no screening. For the short-term cost-effectiveness with a time horizon up to diagnostic evaluation more evidence based data are available. The average costs per case diagnosed range from 16,000 EURO to 33,600 EURO in Germany and hence are comparable to the cost of other implemented newborn screening programs. Empirical data for cost of selective screening in the German health care setting are lacking. Our decision analytic model shows that selective screening is more cost-effective but detects only 50% of all cases of congenital hearing loss.
There is good evidence that UNHS programs with appropriate quality management can reduce the age at start of intervention below six months. Up to now there is no indication of considerable negative consequences of screening for children with false positive test results and their parents. However, it is more difficult to prove the efficacy of early intervention to improve long-term outcomes. Randomized clinical trials of the efficacy of early intervention for children with CHL are inappropriate for ethical reasons. Prospective cohort studies with long-term outcomes of rare diseases are costly and take a long time; at the same time, substantial benefits of early intervention for language development seem likely.
A UNHS program should be implemented in Germany and be reimbursed by the statutory sickness funds. To achieve high coverage, and because of better conditions for obtaining low false positive rates, UNHS should be performed in hospital after birth. For outpatient deliveries, additional screening measures in an outpatient setting must be provided.
PMCID: PMC3011344  PMID: 21289971
12.  Assessing the feasibility of GS FLX Pyrosequencing for sequencing the Atlantic salmon genome 
BMC Genomics  2008;9:404.
With a whole genome duplication event and wealth of biological data, salmonids are excellent model organisms for studying evolutionary processes, fates of duplicated genes and genetic and physiological processes associated with complex behavioral phenotypes. It is surprising therefore, that no salmonid genome has been sequenced. Atlantic salmon (Salmo salar) is a good representative salmonid for sequencing given its importance in aquaculture and the genomic resources available. However, the size and complexity of the genome combined with the lack of a sequenced reference genome from a closely related fish makes assembly challenging. Given the cost and time limitations of Sanger sequencing as well as recent improvements to next generation sequencing technologies, we examined the feasibility of using the Genome Sequencer (GS) FLX pyrosequencing system to obtain the sequence of a salmonid genome. Eight pooled BACs belonging to a minimum tiling path covering ~1 Mb of the Atlantic salmon genome were sequenced by GS FLX shotgun and Long Paired End sequencing and compared with a ninth BAC sequenced by Sanger sequencing of a shotgun library.
An initial assembly using only GS FLX shotgun sequences (average read length 248.5 bp) with ~30× coverage allowed gene identification, but was incomplete even when 126 Sanger-generated BAC-end sequences (~0.09× coverage) were incorporated. The addition of paired end sequencing reads (additional ~26× coverage) produced a final assembly comprising 175 contigs assembled into four scaffolds with 171 gaps. Sanger sequencing of the ninth BAC (~10.5× coverage) produced nine contigs and two scaffolds. The number of scaffolds produced by the GS FLX assembly was comparable to Sanger-generated sequencing; however, the number of gaps was much higher in the GS FLX assembly.
These results represent the first use of GS FLX paired end reads for de novo sequence assembly. Our data demonstrated that this improved the GS FLX assemblies; however, with respect to de novo sequencing of complex genomes, the GS FLX technology is limited to gene mining and establishing a set of ordered sequence contigs. Currently, for a salmonid reference sequence, it appears that a substantial portion of sequencing should be done using Sanger technology.
PMCID: PMC2532694  PMID: 18755037
13.  Evaluation of human gene variant detection in amplicon pools by the GS-FLX parallel Pyrosequencer 
BMC Genomics  2008;9:464.
A new priority in genome research is large-scale resequencing of genes to understand the molecular basis of hereditary disease and cancer. We assessed the ability of massively parallel pyrosequencing to identify sequence variants in pools. From a large collection of human PCR samples we selected 343 PCR products belonging to 16 disease genes and including a large spectrum of sequence variations previously identified by Sanger sequencing. The sequence variants included SNPs and small deletions and insertions (up to 44 bp), in homozygous or heterozygous state.
The DNA was combined in 4 pools containing from 27 to 164 amplicons and from 8.9 to 50.8 kb to sequence, for a total of 110 kb. Pyrosequencing generated over 80 million base pairs of data. Blind searching for sequence variations with a specifically designed bioinformatics procedure identified 465 putative sequence variants, including 412 true variants and 53 false positives (in or adjacent to homopolymeric tracts), with no false negatives. All known variants in positions covered with at least 30× depth were correctly recognized.
Massively parallel pyrosequencing may be used to simplify and speed the search for DNA variations in PCR products. Our results encourage further studies to evaluate molecular diagnostics applications.
PMCID: PMC2569949  PMID: 18842124
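The call counts reported in the abstract above allow a quick consistency check and a positive predictive value (precision) calculation. The short Python sketch below uses only the figures given there (465 putative variants, 412 true variants, 53 false positives, no false negatives); the variable names are chosen for this example, and the precision value is derived here rather than quoted from the study.

```python
# Counts as reported in the abstract.
true_variants = 412
false_positives = 53
false_negatives = 0
reported_putative = 465

# The true and false calls should add up to the reported total.
putative_calls = true_variants + false_positives
counts_consistent = putative_calls == reported_putative

# Precision (positive predictive value): share of calls that were real.
precision = true_variants / putative_calls

# Recall (sensitivity): share of real variants that were called.
recall = true_variants / (true_variants + false_negatives)

print(f"counts consistent: {counts_consistent}")
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```

Roughly one call in nine was therefore a false positive, which fits the abstract's remark that the errors cluster in or next to homopolymeric tracts.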
14.  Hi-Plex for high-throughput mutation screening: application to the breast cancer susceptibility gene PALB2 
BMC Medical Genomics  2013;6:48.
Massively parallel sequencing (MPS) has revolutionised biomedical research and offers enormous capacity for clinical application. We previously reported Hi-Plex, a streamlined highly-multiplexed PCR-MPS approach, allowing a given library to be sequenced with both the Ion Torrent and TruSeq chemistries. Comparable sequencing efficiency was achieved using material derived from lymphoblastoid cell lines and formalin-fixed paraffin-embedded tumour.
Here, we report high-throughput application of Hi-Plex by performing blinded mutation screening of the coding regions of the breast cancer susceptibility gene PALB2 on a set of 95 blood-derived DNA samples that had previously been screened using Sanger sequencing and high-resolution melting curve analysis (n = 90), or genotyped by Taqman probe-based assays (n = 5). Hi-Plex libraries were prepared simultaneously using relatively inexpensive, readily available reagents in a simple half-day protocol followed by MPS on a single MiSeq run.
We observed that 99.93% of amplicons were represented at ≥10X coverage. All 56 previously identified variant calls were detected and no false positive calls were assigned. Four additional variant calls were made and confirmed upon re-analysis of previous data or subsequent Sanger sequencing.
These results support Hi-Plex as a powerful approach for rapid, cost-effective and accurate high-throughput mutation screening. They further demonstrate that Hi-Plex methods are suitable for and can meet the demands of high-throughput genetic testing in research and clinical settings.
PMCID: PMC3829211  PMID: 24206657
Hi-Plex; Massively parallel sequencing; Mutation screening; PALB2; Molecular diagnostics
15.  Deep sampling of the Palomero maize transcriptome by a high throughput strategy of pyrosequencing 
BMC Genomics  2009;10:299.
In-depth sequencing analysis has not been able to determine the overall complexity of transcriptional activity of a plant organ or tissue sample. In some cases, deep parallel sequencing of Expressed Sequence Tags (ESTs), although not yet optimized for the sequencing of cDNAs, has represented an efficient procedure for validating gene prediction and estimating overall gene coverage. This approach could be very valuable for complex plant genomes. In addition, little emphasis has been given to efforts aiming at an estimation of the overall transcriptional universe found in a multicellular organism at a specific developmental stage.
To explore, in depth, the transcriptional diversity in an ancient maize landrace, we developed a protocol to optimize the sequencing of cDNAs and performed 4 consecutive GS20–454 pyrosequencing runs of a cDNA library obtained from 2 week-old Palomero Toluqueño maize plants. The protocol reported here allowed obtaining over 90% of informative sequences. These GS20–454 runs generated over 1.5 Million reads, representing the largest amount of sequences reported from a single plant cDNA library. A collection of 367,391 quality-filtered reads (30.09 Mb) from a single run was sufficient to identify transcripts corresponding to 34% of public maize ESTs databases; total sequences generated after 4 filtered runs increased this coverage to 50%. Comparisons of all 1.5 Million reads to the Maize Assembled Genomic Islands (MAGIs) provided evidence for the transcriptional activity of 11% of MAGIs. We estimate that 5.67% (86,069 sequences) do not align with public ESTs or annotated genes, potentially representing new maize transcripts. Following the assembly of 74.4% of the reads in 65,493 contigs, real-time PCR of selected genes confirmed a predicted correlation between the abundance of GS20–454 sequences and corresponding levels of gene expression.
A protocol was developed that significantly increases the number, length and quality of cDNA reads using massive 454 parallel sequencing. We show that recurrent 454 pyrosequencing of a single cDNA sample is necessary to attain a thorough representation of the transcriptional universe present in maize, that can also be used to estimate transcript abundance of specific genes. This data suggests that the molecular and functional diversity contained in the vast native landraces remains to be explored, and that large-scale transcriptional sequencing of a presumed ancestor of the modern maize varieties represents a valuable approach to characterize the functional diversity of maize for future agricultural and evolutionary studies.
PMCID: PMC2714558  PMID: 19580677
16.  A massive parallel sequencing workflow for diagnostic genetic testing of mismatch repair genes 
The purpose of this study was to develop a massive parallel sequencing (MPS) workflow for diagnostic analysis of mismatch repair (MMR) genes using the GS Junior system (Roche). A pathogenic variant in one of four MMR genes (MLH1, PMS2, MSH6, and MSH2) is the cause of Lynch Syndrome (LS), which mainly predisposes to colorectal cancer. We used an amplicon-based sequencing method allowing specific and preferential amplification of the MMR genes including PMS2, of which several pseudogenes exist. The amplicons were pooled at different ratios to obtain coverage uniformity and maximize the throughput of a single GS Junior run. In total, 60 previously identified and distinct variants (substitutions and indels) were sequenced by MPS and successfully detected. The heterozygote detection range was from 19% to 63% and dependent on sequence context and coverage. We were able to distinguish between false-positive and true-positive calls in homopolymeric regions by cross-sample comparison and evaluation of flow signal distributions. In addition, we filtered variants according to a predefined status, which facilitated variant annotation. Our study shows that implementation of MPS in routine diagnostics of LS can accelerate sample throughput and reduce costs without compromising sensitivity, compared to Sanger sequencing.
PMCID: PMC3960061  PMID: 24689082
Amplicon sequencing; diagnostics; hereditary colorectal cancer; massive parallel sequencing; mismatch repair; MLH1; MSH2; MSH6; PMS2
17.  Experience of targeted Usher exome sequencing as a clinical test 
We show that massively parallel targeted sequencing of 19 genes provides a new and reliable strategy for molecular diagnosis of Usher syndrome (USH) and nonsyndromic deafness, particularly appropriate for these disorders characterized by a high clinical and genetic heterogeneity and a complex structure of several of the genes involved. A series of 71 patients including Usher patients previously screened by Sanger sequencing plus newly referred patients was studied. Ninety-eight percent of the variants previously identified by Sanger sequencing were found by next-generation sequencing (NGS). NGS proved to be efficient because it offers analysis of all relevant genes, which is laborious to achieve with Sanger sequencing. Among the 13 newly referred Usher patients, both mutations in the same gene were identified in 77% of cases (10 patients) and one candidate pathogenic variant was identified in two additional patients. This work can be considered a pilot for implementing NGS for genetically heterogeneous diseases in clinical service.
PMCID: PMC3907913  PMID: 24498627
Bioinformatics; next-generation sequencing; NSHL; Usher syndrome; variant prioritization
18.  Diagnostic Applications of Next Generation Sequencing in Immunogenetics and Molecular Oncology 
With the introduction of the next generation sequencing (NGS) technologies, remarkable new diagnostic applications have been established in daily routine. Implementation of NGS is challenging in clinical diagnostics, but definite advantages and new diagnostic possibilities make the switch to the technology inevitable. In addition to the higher sequencing capacity, clonal sequencing of single molecules, multiplexing of samples, higher diagnostic sensitivity, workflow miniaturization, and cost benefits are some of the valuable features of the technology. After the recent advances, NGS emerged as a proven alternative to classical Sanger sequencing in the typing of human leukocyte antigens (HLA). By virtue of the clonal amplification of single DNA molecules, ambiguous typing results can be avoided. Simultaneously, a higher sample throughput can be achieved by tagging of DNA molecules with multiplex identifiers and pooling of PCR products before sequencing. In our experience, up to 380 samples can be typed for HLA-A, -B, and -DRB1 in high resolution during every sequencing run. In molecular oncology, NGS shows a markedly increased sensitivity in comparison to conventional Sanger sequencing and is developing into the standard diagnostic tool for detection of somatic mutations in cancer cells, with great impact on personalized treatment of patients.
PMCID: PMC3725031  PMID: 23922545
NGS; HLA; Molecular oncology
19.  Exome sequencing in an SCA14 family demonstrates its utility in diagnosing heterogeneous diseases 
Neurology  2012;79(2):127-131.
Genetic heterogeneity is common in many neurologic disorders. This is particularly true for the hereditary ataxias where at least 36 disease genes or loci have been described for spinocerebellar ataxia and over 100 genes for neurologic disorders that present primarily with ataxia. Traditional genetic testing of a large number of candidate genes delays diagnosis and is expensive. In contrast, recently developed genomic techniques, such as exome sequencing that targets only the coding portion of the genome, offer an alternative strategy to rapidly sequence all genes in a comprehensive manner. Here we describe the use of exome sequencing to investigate a large, 5-generational British kindred with an autosomal dominant, progressive cerebellar ataxia in which conventional genetic testing had not revealed a causal etiology.
Twenty family members were seen and examined; 2 affected individuals were clinically investigated in detail without a genetic or acquired cause being identified. Exome sequencing was performed in one patient where coverage was comprehensive across the known ataxia genes, excluding the known repeat loci which should be examined using conventional analysis.
A novel p.Arg26Gly change in the PRKCG gene, mutated in SCA14, was identified. This variant was confirmed using Sanger sequencing and showed segregation with disease in the entire family.
This work demonstrates the utility of exome sequencing to rapidly screen heterogeneous genetic disorders such as the ataxias. Exome sequencing is more comprehensive, faster, and significantly cheaper than conventional Sanger sequencing, and thus represents a superior diagnostic screening tool in clinical practice.
PMCID: PMC3390538  PMID: 22675081
20.  Screening and Rapid Molecular Diagnosis of Tuberculosis in Prisons in Russia and Eastern Europe: A Cost-Effectiveness Analysis 
PLoS Medicine  2012;9(11):e1001348.
Daniel Winetsky and colleagues investigate eight strategies for screening and diagnosis of tuberculosis within prisons of the former Soviet Union.
Prisons of the former Soviet Union (FSU) have high rates of multidrug-resistant tuberculosis (MDR-TB) and are thought to drive general population tuberculosis (TB) epidemics. Effective prison case detection, though employing more expensive technologies, may reduce long-term treatment costs and slow MDR-TB transmission.
Methods and Findings
We developed a dynamic transmission model of TB and drug resistance matched to the epidemiology and costs in FSU prisons. We evaluated eight strategies for TB screening and diagnosis involving, alone or in combination, self-referral, symptom screening, mass miniature radiography (MMR), and sputum PCR with probes for rifampin resistance (Xpert MTB/RIF). Over a 10-y horizon, we projected costs, quality-adjusted life years (QALYs), and TB and MDR-TB prevalence. Using sputum PCR as an annual primary screening tool among the general prison population most effectively reduced overall TB prevalence (from 2.78% to 2.31%) and MDR-TB prevalence (from 0.74% to 0.63%), and cost US$543/QALY for additional QALYs gained compared to MMR screening with sputum PCR reserved for rapid detection of MDR-TB. Adding sputum PCR to the currently used strategy of annual MMR screening was cost-saving over 10 y compared to MMR screening alone, but produced only a modest reduction in MDR-TB prevalence (from 0.74% to 0.69%) and had minimal effect on overall TB prevalence (from 2.78% to 2.74%). Strategies based on symptom screening alone were less effective and more expensive than MMR-based strategies. Study limitations included scarce primary TB time-series data in FSU prisons and uncertainties regarding screening test characteristics.
In prisons of the FSU, annual screening of the general inmate population with sputum PCR most effectively reduces TB and MDR-TB prevalence, doing so cost-effectively. If this approach is not feasible, the current strategy of annual MMR is both more effective and less expensive than strategies using self-referral or symptom screening alone, and the addition of sputum PCR for rapid MDR-TB detection may be cost-saving over time.
Editors' Summary
Tuberculosis (TB)—a contagious bacterial disease—is a major public health problem, particularly in low- and middle-income countries. In 2010, about nine million people developed TB, and about 1.5 million people died from the disease. Mycobacterium tuberculosis, the bacterium that causes TB, is spread in airborne droplets when people with active disease cough or sneeze. The characteristic symptoms of TB include fever, a persistent cough, and night sweats. Diagnostic tests include sputum smear microscopy (examination of mucus from the lungs for M. tuberculosis bacilli), mycobacterial culture (growth of M. tuberculosis from sputum), and chest X-rays. TB can also be diagnosed by looking for fragments of the M. tuberculosis genetic blueprint in sputum samples (sputum PCR). Importantly, sputum PCR can detect the genetic changes that make M. tuberculosis resistant to rifampicin, a constituent of the cocktail of antibiotics that is used to cure TB. Rifampicin resistance is an indicator of multidrug-resistant TB (MDR-TB), the emergence of which is thwarting ongoing global efforts to control TB.
Why Was This Study Done?
Prisons present unique challenges for TB control. Overcrowding, poor ventilation, and inadequate medical care increase the spread of TB among prisoners, who often come from disadvantaged populations where the prevalence of TB (the proportion of the population with TB) is already high. Prisons also act as reservoirs for TB, recycling the disease back into the civilian population. The prisons of the former Soviet Union, for example, which have extremely high rates of MDR-TB, are thought to drive TB epidemics in the general population. Because effective identification of active TB among prison inmates has the potential to improve TB control outside prisons, the World Health Organization recommends active TB case finding among prisoners using self-referral, screening with symptom questionnaires, or screening with chest X-rays or mass miniature radiography (MMR). But which of these strategies will reduce the prevalence of TB in prisons most effectively, and which is most cost-effective? Here, the researchers evaluate the relative effectiveness and cost-effectiveness of alternative strategies for screening and diagnosis of TB in prisons by modeling TB and MDR-TB epidemics in prisons of the former Soviet Union.
What Did the Researchers Do and Find?
The researchers used a dynamic transmission model of TB that simulates the movement of individuals in prisons in the former Soviet Union through different stages of TB infection to estimate the costs, quality-adjusted life years (QALYs; a measure of disease burden that includes both the quantity and quality of life) saved, and TB and MDR-TB prevalence for eight TB screening/diagnostic strategies over a ten-year period. Compared to annual MMR alone (the current strategy), annual screening with sputum PCR produced the greatest reduction in the prevalence of TB and of MDR-TB among the prison population. Adding sputum PCR for detection of MDR-TB to annual MMR screening did not affect the overall TB prevalence but slightly reduced the MDR-TB prevalence and saved nearly US$2,000 over ten years per model prison of 1,000 inmates, compared to MMR screening alone. Annual sputum PCR was the most cost-effective strategy, costing US$543/QALY for additional QALYs gained compared to MMR screening plus sputum PCR for MDR-TB detection. Other strategies tested, including symptom screening alone or combined with sputum PCR, were either more expensive and less effective or less cost-effective than these two options.
What Do These Findings Mean?
These findings suggest that, in prisons in the former Soviet Union, annual screening with sputum PCR will most effectively reduce TB and MDR-TB prevalence and will be cost-effective. That is, the cost per QALY saved of this strategy is less than the per-capita gross domestic product of any of the former Soviet Union countries. The paucity of primary data on some facets of TB epidemiology in prisons in the former Soviet Union and the assumptions built into the mathematical model limit the accuracy of these findings. Moreover, because most of the benefits of sputum PCR screening come from treating the MDR-TB cases that are detected using this screening approach, these findings cannot be generalized to prison settings without a functioning MDR-TB treatment program or with a very low MDR-TB prevalence. Despite these and other limitations, these findings provide valuable information about the screening strategies that are most likely to interrupt the TB cycle in prisons, thereby saving resources and averting preventable deaths both inside and outside prisons.
Additional Information
Please access these websites via the online version of this summary at
The World Health Organization provides information (in several languages) on all aspects of tuberculosis, including general information on tuberculosis diagnostics and on tuberculosis in prisons; a report published in the Bulletin of the World Health Organization in 2006 describes tough measures taken in Russian prisons to slow the spread of TB
The Stop TB Partnership is working towards tuberculosis elimination; patient stories about tuberculosis are available (in English and Spanish)
The US Centers for Disease Control and Prevention has information about tuberculosis, about its diagnosis, and about tuberculosis in prisons (some information in English and Spanish)
A PLOS Medicine Research Article by Iacopo Baussano et al. describes a systematic review of tuberculosis incidence in prisons; a linked editorial entitled The Health Crisis of Tuberculosis in Prisons Extends beyond the Prison Walls is also available
The Tuberculosis Survival Project, which aims to raise awareness of tuberculosis and provide support for people with tuberculosis, provides personal stories about treatment for tuberculosis; the Tuberculosis Vaccine Initiative also provides personal stories about dealing with tuberculosis
MedlinePlus has links to further information about tuberculosis (in English and Spanish)
PMCID: PMC3507963  PMID: 23209384
21.  Efficient alignment of pyrosequencing reads for re-sequencing applications 
BMC Bioinformatics  2011;12:163.
Over the past few years, new massively parallel DNA sequencing technologies have emerged. These platforms generate massive amounts of data per run, greatly reducing the cost of DNA sequencing. However, these techniques also raise important computational difficulties mostly due to the huge volume of data produced, but also because of some of their specific characteristics such as read length and sequencing errors. Among the most critical problems is that of efficiently and accurately mapping reads to a reference genome in the context of re-sequencing projects.
We present an efficient method for the local alignment of pyrosequencing reads produced by the GS FLX (454) system against a reference sequence. Our approach explores the characteristics of the data in these re-sequencing applications and uses state of the art indexing techniques combined with a flexible seed-based approach, leading to a fast and accurate algorithm which needs very little user parameterization. An evaluation performed using real and simulated data shows that our proposed method outperforms a number of mainstream tools on the quantity and quality of successful alignments, as well as on the execution time.
The proposed methodology was implemented in a software tool called TAPyR--Tool for the Alignment of Pyrosequencing Reads--which is publicly available from
PMCID: PMC3118166  PMID: 21672185
22.  Molecular Diagnosis of Usher Syndrome: Application of Two Different Next Generation Sequencing-Based Procedures 
PLoS ONE  2012;7(8):e43799.
Usher syndrome (USH) is a clinically and genetically heterogeneous disorder characterized by visual and hearing impairments. Clinically, it is subdivided into three subclasses with nine genes identified so far. In the present study, we investigated whether the currently available Next Generation Sequencing (NGS) technologies are already suitable for molecular diagnostics of USH. We analyzed a total of 12 patients, most of whom were negative for previously described mutations in known USH genes upon primer extension-based microarray genotyping. We enriched the NGS template either by whole exome capture or by Long-PCR of the known USH genes. The main NGS sequencing platforms were used: SOLiD for whole exome sequencing, Illumina (Genome Analyzer II) and Roche 454 (GS FLX) for the Long-PCR sequencing. Long-PCR targeting was more efficient with up to 94% of USH gene regions displaying an overall coverage higher than 25×, whereas whole exome sequencing yielded a similar coverage for only 50% of those regions. Overall this integrated analysis led to the identification of 11 novel sequence variations in USH genes (2 homozygous and 9 heterozygous) out of 18 detected. However, at least two cases were not genetically solved. Our results highlight the current limitations in the diagnostic use of NGS for USH patients. The limit for whole exome sequencing is linked to the need for high coverage and to the correct interpretation of sequence variations with a non-obvious pathogenic role, whereas the targeted approach suffers from the high genetic heterogeneity of USH, which may also be caused by the presence of additional causative genes yet to be identified.
PMCID: PMC3430670  PMID: 22952768
23.  The utility of PacBio circular consensus sequencing for characterizing complex gene families in non-model organisms 
BMC Genomics  2014;15(1):720.
Molecular characterization of highly diverse gene families can be time consuming, expensive, and difficult, especially when considering the potential for relatively large numbers of paralogs and/or pseudogenes. Here we investigate the utility of Pacific Biosciences single molecule real-time (SMRT) circular consensus sequencing (CCS) as an alternative to traditional cloning and Sanger sequencing of PCR amplicons for gene family characterization. We target vomeronasal gene receptors, one of the most diverse gene families in mammals, with the goal of better understanding intra-specific V1R diversity of the gray mouse lemur (Microcebus murinus). Our study compares intragenomic variation for two V1R subfamilies found in the mouse lemur. Specifically, we compare gene copy variation within and between two individuals of M. murinus as characterized by different methods for nucleotide sequencing. By including the same individual animal from which the M. murinus draft genome was derived, we are able to cross-validate gene copy estimates from Sanger sequencing versus CCS methods.
We generated 34,088 high quality circular consensus sequences of two diverse V1R subfamilies (here referred to as V1RI and V1RIX) from two individuals of Microcebus murinus. Using a minimum threshold of 7× coverage, we recovered approximately 90% of V1RI sequences previously identified in the draft M. murinus genome (59% being identical at all nucleotide positions). When low coverage sequences were considered (i.e. < 7× coverage) 100% of V1RI sequences identified in the draft genome were recovered. At least 13 putatively novel V1R loci were also identified using CCS technology.
Recent upgrades to the Pacific Biosciences RS instrument have improved the CCS technology and offer an alternative to traditional sequencing approaches. Our results suggest that the Microcebus murinus V1R repertoire has been underestimated in the draft genome. In addition to providing an improved understanding of V1R diversity in the mouse lemur, this study demonstrates the utility of CCS technology for characterizing complex regions of the genome. We anticipate that long-read sequencing technologies such as PacBio SMRT will allow for the assembly of multigene family clusters and serve to more accurately characterize patterns of gene copy variation in large gene families, thus revealing novel micro-evolutionary patterns within non-model organisms.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-720) contains supplementary material, which is available to authorized users.
PMCID: PMC4152597  PMID: 25159659
Chemosensory genes; Microcebus murinus; Multigene family; Pacific Biosciences; Pheromone detection; Single molecule real-time sequencing
24.  A Multi-Site Study Employing High Resolution HLA Genotyping by Next Generation Sequencing 
Tissue antigens  2011;77(3):206-217.
The high degree of polymorphism at HLA class I and class II loci makes high resolution HLA typing challenging. Current typing methods, including Sanger sequencing, yield ambiguous typing results due to incomplete genomic coverage and inability to set phase for HLA haplotype determination. The 454 Life Sciences GS FLX next generation sequencing system coupled with Conexio ATF software can provide very high resolution HLA genotyping. High throughput genotyping can be achieved by use of primers with multiplex identifier (MID) tags to allow pooling of the amplicons generated from different individuals prior to sequencing. We have conducted a double blind study in which eight laboratory sites performed amplicon sequencing using GS FLX standard chemistry and genotyped the same 20 samples for HLA-A, -B, -C, DPB1, DQA1, DQB1, DRB1, and DRB3, DRB4 and DRB5 (DRB3/4/5) in a single sequencing run. The average sequence read length was 250 base pairs (bp) and the average number of sequence reads per amplicon was 672, providing confidence in the allele assignments. Of the 1280 genotypes considered, assignment was possible in 95% of the cases. Failure to assign genotypes was the result of researcher procedural error or the presence of a novel allele rather than a failure of sequencing technology. Concordance with known genotypes, in cases where assignment was possible, ranged from 95.3% to 99.4% for the eight sites, with overall concordance of 97.2%. We conclude that clonal pyrosequencing using the GS FLX platform and Conexio ATF software allows reliable identification of HLA genotypes at high resolution.
PMCID: PMC4205124  PMID: 21299525
DNA Sequencing; GS FLX; HLA genotyping
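As a rough consistency check on the figures in the abstract above, the 1280 genotypes match eight sites each typing 20 samples at eight loci, if DRB3, DRB4 and DRB5 are counted as one combined DRB3/4/5 locus. That grouping is an assumption inferred from the abstract's wording, not something the abstract states outright. The sketch below, with illustrative variable names, reproduces the arithmetic; the assigned count is approximate because the 95% figure is rounded.

```python
# Sketch of the genotype count in the multi-site HLA study abstract.
# Assumption: DRB3, DRB4 and DRB5 are treated as one combined locus (DRB3/4/5).
SITES = 8
SAMPLES_PER_SITE = 20
LOCI = ["HLA-A", "HLA-B", "HLA-C", "DPB1", "DQA1", "DQB1", "DRB1", "DRB3/4/5"]

# Each site genotypes every sample at every locus.
total_genotypes = SITES * SAMPLES_PER_SITE * len(LOCI)

# The abstract reports assignment in 95% of cases (a rounded percentage),
# so this count is only approximate.
approx_assigned = round(total_genotypes * 0.95)

print(total_genotypes, approx_assigned)
```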
25.  Comparison of the Illumina Genome Analyzer and Roche 454 GS FLX for Resequencing of Hypertrophic Cardiomyopathy-Associated Genes 
Next-generation sequencing (NGS) is widely used in biomedical research, but its adoption has been limited in molecular diagnostics. One application of NGS is the targeted resequencing of genes whose mutations lead to an overlapping clinical phenotype. This study evaluated the comparative performance of the Illumina Genome Analyzer and Roche 454 GS FLX for the resequencing of 16 genes associated with hypertrophic cardiomyopathy (HCM). Using a single human genomic DNA sample enriched by long-range PCR (LR-PCR), 40 GS FLX and 31 Genome Analyzer exon variants were identified using ≥30-fold read-coverage and ≥20% read-percentage selection criteria. Twenty-seven platform concordant variants were Sanger-confirmed. The discordant variants segregated into two categories: variants with read coverages ≥30 on one platform but <30-fold on the alternate platform and variants with read percentages ≥20% on one platform but <20% on the alternate platform. All variants with <30-fold coverage were Sanger-confirmed, suggesting that the coverage criterion of ≥30-fold is too stringent for variant discovery. The variants with <20% read percentage were identified as reference sequence based on Sanger sequencing. These variants were found in homopolymer tracts and short-read misalignments, specifically in genes with high identity. The results of the current study demonstrate the feasibility of combining LR-PCR with the Genome Analyzer or GS FLX for targeted resequencing of HCM-associated genes.
PMCID: PMC2884316  PMID: 20592870
sequence analysis; DNA; next-generation sequencing
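The selection criteria described in the abstract above (≥30-fold read coverage and ≥20% read percentage) can be illustrated with a minimal sketch. The `VariantCall` class, the function names, and the example numbers are hypothetical and chosen only for illustration; they are not the authors' pipeline.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; everything else here is illustrative.
MIN_COVERAGE = 30            # minimum fold read coverage
MIN_READ_PERCENTAGE = 20.0   # minimum percentage of reads supporting the variant


@dataclass
class VariantCall:
    """Hypothetical record for one variant on one platform."""
    coverage: int        # reads covering the position
    variant_reads: int   # reads supporting the variant

    @property
    def read_percentage(self) -> float:
        if self.coverage == 0:
            return 0.0
        return 100.0 * self.variant_reads / self.coverage


def passes_filter(call: VariantCall) -> bool:
    """Apply both selection criteria to a single call."""
    return (call.coverage >= MIN_COVERAGE
            and call.read_percentage >= MIN_READ_PERCENTAGE)


def compare_platforms(a: VariantCall, b: VariantCall) -> str:
    """Label a variant as concordant or name the criterion that failed."""
    pass_a, pass_b = passes_filter(a), passes_filter(b)
    if pass_a and pass_b:
        return "concordant"
    if not pass_a and not pass_b:
        return "not called"
    failing = b if pass_a else a
    if failing.coverage < MIN_COVERAGE:
        return "discordant: low coverage"
    return "discordant: low read percentage"


# Made-up example values, not data from the study.
gs_flx = VariantCall(coverage=120, variant_reads=55)
genome_analyzer = VariantCall(coverage=18, variant_reads=9)
print(compare_platforms(gs_flx, genome_analyzer))
```

The two discordance labels mirror the two categories reported in the abstract: variants below 30-fold coverage on one platform, and variants below 20% read percentage on one platform.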
