Related Articles

1.  Whole genome sequencing for lung cancer 
Journal of Thoracic Disease  2012;4(2):155-163.
Lung cancer is a leading cause of cancer related morbidity and mortality globally, and carries a dismal prognosis. Improved understanding of the biology of cancer is required to improve patient outcomes. Next-generation sequencing (NGS) is a powerful tool for whole genome characterisation, enabling comprehensive examination of somatic mutations that drive oncogenesis. Most NGS methods are based on polymerase chain reaction (PCR) amplification of platform-specific DNA fragment libraries, which are then sequenced. These techniques are well suited to high-throughput sequencing and are able to detect the full spectrum of genomic changes present in cancer. However, they require considerable investments in time, laboratory infrastructure, computational analysis and bioinformatic support. Next-generation sequencing has been applied to studies of the whole genome, exome, transcriptome and epigenome, and is changing the paradigm of lung cancer research and patient care. The results of this new technology will transform current knowledge of oncogenic pathways and provide molecular targets of use in the diagnosis and treatment of cancer. Somatic mutations in lung cancer have already been identified by NGS, and large scale genomic studies are underway. Personalised treatment strategies will improve care for those likely to benefit from available therapies, while sparing others the expense and morbidity of futile intervention. Organisational, computational and bioinformatic challenges of NGS are driving technological advances as well as raising ethical issues relating to informed consent and data release. Differentiation between driver and passenger mutations requires careful interpretation of sequencing data. Challenges in the interpretation of results arise from the types of specimens used for DNA extraction, sample processing techniques and tumour content. Tumour heterogeneity can reduce power to detect mutations implicated in oncogenesis. 
Next-generation sequencing will facilitate investigation of the biological and clinical implications of such variation. These techniques can now be applied to single cells and free circulating DNA, and possibly in the future to DNA obtained from body fluids and from subpopulations of tumour. As costs reduce, and speed and processing accuracy increase, NGS technology will become increasingly accessible to researchers and clinicians, with the ultimate goal of improving the care of patients with lung cancer.
PMCID: PMC3378223  PMID: 22833821
High-throughput nucleotide sequencing; DNA sequence analysis; lung neoplasms; non-small cell lung carcinoma; small cell lung carcinoma
2.  Next Generation Sequencing: Advances in Characterizing the Methylome  
Genes  2010;1(2):143-165.
Epigenetic modifications play an important role in lymphoid malignancies. This has been evidenced by the large body of work published using microarray technologies to generate methylation profiles for numerous types and subtypes of lymphoma and leukemia. These studies have shown the importance of defining the epigenome so that we can better understand the biology of lymphoma. Recent advances in DNA sequencing technology have transformed the landscape of epigenomic analysis, as we now have the ability to characterize the genome-wide distribution of chromatin modifications and DNA methylation using next-generation sequencing. To take full advantage of the throughput of next-generation sequencing, many methodologies have been developed and many more are in development. Choosing the appropriate methodology is fundamental to the outcome of next-generation sequencing studies. In this review, published technologies and methodologies applicable to studying the methylome are presented. In addition, progress towards defining the methylome in lymphoma is discussed, along with prospective directions made possible by next-generation sequencing technology. Finally, methodologies are introduced that have not yet been published but that are being explored in the pursuit of defining the lymphoma methylome.
PMCID: PMC3954092
lymphoma; leukemia; next-generation sequencing; methylation; epigenome
3.  Interrogating genomic and epigenomic data to understand Prostate Cancer 
Biochimica et Biophysica Acta  2012;1825(2):186-196.
Major breakthroughs at the beginning of this century in high-throughput technologies have profoundly transformed biological research. Significant knowledge has been gained about biological systems and disease processes such as malignant transformation. In this review, we summarize leading discoveries in prostate cancer research derived from the use of high-throughput approaches powered by microarrays and massively parallel next-generation sequencing (NGS). These include the seminal discovery of chromosomal translocations such as TMPRSS2-ERG gene fusions as well as the identification of critical oncogenes exemplified by the polycomb group protein EZH2. We then demonstrate the power of interrogating genomic and epigenomic data in understanding the plethora of mechanisms of transcriptional regulation. As an example, we review how androgen receptor (AR) binding events are mediated at multiple levels through protein-DNA interaction, histone and DNA modifications, as well as high-order chromatin structural changes.
PMCID: PMC3307852  PMID: 22240201
integrative genomics; FoxA1; EZH2; androgen receptor; nucleosome positioning; transcriptional regulation; gene fusion
4.  Next-generation sequencing in aging research: emerging applications, problems, pitfalls and possible solutions 
Ageing research reviews  2009;9(3):315-323.
Recent technological advances that allow faster and cheaper DNA sequencing are now driving biological and medical research. In this review, we provide an overview of state-of-the-art next-generation sequencing (NGS) platforms and their applications, including genome sequencing and resequencing, transcriptional profiling (RNA-Seq), and high-throughput surveys of DNA-protein interactions (ChIP-Seq) and of the epigenome. In particular, we focus on how new methods made possible by NGS can help unravel the biological and genetic mechanisms of aging, longevity and age-related diseases. However, just as NGS platforms open avenues of discovery not available before, they also give rise to new challenges, in particular in processing, analyzing and interpreting the data. Bioinformatics and software issues plus statistical difficulties in genome-wide studies are discussed, as well as the use of targeted sequencing to decrease costs and facilitate statistical analyses. Lastly, we discuss a number of methods to gather biological insights from massive amounts of data, such as functional enrichment, transcriptional regulation and network analyses. Although new platforms will soon take center stage in the fast-moving field of NGS, the approaches made possible by NGS will be at the basis of molecular biology, genetics and systems biology for years to come, making them instrumental for research on aging.
PMCID: PMC2878865  PMID: 19900591
bioinformatics; epigenetics; functional genomics; senescence; systems biology
5.  Statistical Analyses of Next Generation Sequence Data: A Partial Overview 
Next generation sequencing has revolutionized the status of biological research. For a long time, the gold standard of DNA sequencing was considered to be the Sanger method. However, the commercial launch of next generation sequencing in 2005 made it possible to generate massively parallel, high-resolution DNA sequence data. Its usefulness in genomic applications such as genome-wide detection of SNPs, DNA methylation profiling, mRNA expression profiling and whole-genome re-sequencing is now well recognized. There are several platforms for generating next generation sequencing (NGS) data, which we briefly discuss in this mini overview. With new technologies come new challenges for data analysts. This mini review attempts to present a collection of selected topics in the current development of statistical methods dealing with these novel data types. We believe that knowing the advances and bottlenecks of this technology will help researchers benchmark the analytical tools dealing with these data and will pave the way for its proper application in clinical diagnostics.
PMCID: PMC2989618  PMID: 21113236
DNA; Sequencing; Deep sequencing; High throughput; Sequence reads; RNA; ChIP-seq; Intensities
6.  The impact of next-generation sequencing on genomics 
This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. However, the massive volume of data produced by NGS also presents a significant challenge for data storage, analysis, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle challenges left unsolved by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and will likely change the field for years to come.
PMCID: PMC3076108  PMID: 21477781
Next-generation sequencing; Genomics; Genetic variation; Polymorphism; Targeted sequence enrichment; Bioinformatics
7.  Transcriptome Analysis Using Next-Generation Sequencing Technology 
High throughput RNA sequencing (RNA-Seq) is becoming increasingly utilized as the technology of choice to detect and quantify known and novel transcripts. Multiple next-generation sequencing (NGS) platforms are available that enable transcriptome profiling through RNA-Seq workflows. Demonstrations of the power of RNA-Seq to profile the well annotated transcriptome and also identify novel transcribed regions, gene fusions, and even identify novel classes of RNA are rapidly increasing in the field of RNA research. Our aim has been to develop library preparation methods and tools that aid in the reliable generation of libraries for next generation sequencing from total RNA. Reported here are results from the development of the Ambion® RNA-Seq Library Construction kit optimized for sequencing on the Illumina® next generation sequencing instruments. We show results from two protocols utilizing the same reagents that allow generation of RNA-Seq libraries targeting either the small RNA fraction of total RNA, or the whole transcriptome which includes transcripts larger than 100 base pairs. Results are reported from Illumina® Genome Analyzer II sequencing of both small RNA and transcriptome libraries with a focus on mapping to the miRBase and RefSeq references respectively. We also demonstrate the use of External RNA Control Consortium (ERCC) transcripts as spike-in controls for transcriptome libraries that aid in quality control of the library generation procedure and aid in downstream data analysis. The library construction technology embedded in the Ambion® RNA-Seq Library Construction kit enables researchers to analyze the transcriptome of their research samples in a precise, sensitive and robust manner while maintaining information regarding the genomic DNA strand to which the RNA transcript maps utilizing the Illumina® Genome Analyzer II sequencing platform. 
The workflow and results reported here demonstrate new commercially available options for library construction enabling small RNA and transcriptome profiling and novel discovery using next-generation sequencing technology.
PMCID: PMC3186484
8.  Deep mRNA Sequencing for In Vivo Functional Analysis of Cardiac Transcriptional Regulators: Application to Gαq 
Circulation research  2010;106(9):1459-1467.
Rationale
Transcriptional profiling can detect subclinical heart disease and provide insight into disease etiology and functional status. Current microarray-based methods are expensive and subject to artifact.
Objective
To develop RNA sequencing methodologies using next generation massively parallel platforms for high throughput comprehensive analysis of individual mouse cardiac transcriptomes. To compare the results of sequencing- and array-based transcriptional profiling in the well-characterized Gαq transgenic mouse hypertrophy/cardiomyopathy model.
Methods and Results
The techniques for preparation of individually bar-coded mouse heart RNA libraries for Illumina Genome Analyzer II resequencing are described. RNA sequencing showed that 234 high abundance transcripts (>60 copies/cell) comprised 55% of total cardiac mRNA. Parallel transcriptional profiling of Gαq transgenic and non-transgenic hearts by Illumina RNA sequencing and Affymetrix Mouse Gene 1.0 ST arrays revealed superior dynamic range for mRNA expression and enhanced specificity for reporting low-abundance transcripts by RNA sequencing. Differential mRNA expression in Gαq and non-transgenic hearts correlated well between microarrays and RNA sequencing for highly abundant transcripts. RNA sequencing was superior to arrays for accurately quantifying lower-abundance genes, which represented the majority of the regulated genes in the Gαq transgenic model.
Conclusions
RNA sequencing is rapid, accurate, and sensitive for identifying both abundant and rare cardiac transcripts, and has significant advantages in time- and cost-efficiencies over microarray analysis.
PMCID: PMC2891025  PMID: 20360248
RNA sequencing; microarray; gene regulation; Gq
9.  Tackling Skeletal Muscle Cells Epigenome in the Next-Generation Sequencing Era 
Recent advances in high-throughput technologies have transformed methodologies employed to study cell-specific epigenomes and the approaches to investigate complex cellular phenotypes. Application of next-generation sequencing technology in the skeletal muscle differentiation field is rapidly extending our knowledge on how chromatin modifications, transcription factors and chromatin regulators orchestrate gene expression pathways guiding myogenesis. Here, we review recent biological insights gained by the application of next-generation sequencing techniques to decode the epigenetic profile and gene regulatory networks underlying skeletal muscle differentiation.
PMCID: PMC3371680  PMID: 22701348
10.  Perinatal bisphenol A exposure promotes dose-dependent alterations of the mouse methylome 
BMC Genomics  2014;15:30.
Environmental factors during perinatal development may influence developmental plasticity and disease susceptibility via alterations to the epigenome. Developmental exposure to the endocrine active compound, bisphenol A (BPA), has previously been associated with altered methylation at candidate gene loci. Here, we undertake the first genome-wide characterization of DNA methylation profiles in the liver of murine offspring exposed perinatally to multiple doses of BPA through the maternal diet.
Using a tiered focusing approach, our strategy proceeds from unbiased broad DNA methylation analysis using methylation-based next generation sequencing technology to in-depth quantitative site-specific CpG methylation determination using the Sequenom EpiTYPER MassARRAY platform to profile liver DNA methylation patterns in offspring maternally exposed to BPA during gestation and lactation to doses ranging from 0 BPA/kg (Ctr), 50 μg BPA/kg (UG), or 50 mg BPA/kg (MG) diet (N = 4 per group). Genome-wide analyses indicate non-monotonic effects of DNA methylation patterns following perinatal exposure to BPA, corroborating previous studies using multiple doses of BPA with non-monotonic outcomes. We observed enrichment of regions of altered methylation (RAMs) within CpG island (CGI) shores, but little evidence of RAM enrichment in CGIs. An analysis of promoter regions identified several hundred novel BPA-associated methylation events, and methylation alterations in the Myh7b and Slc22a12 gene promoters were validated. Using the Comparative Toxicogenomics Database, a number of candidate genes that have previously been associated with BPA-related gene expression changes were identified, and gene set enrichment testing identified epigenetically dysregulated pathways involved in metabolism and stimulus response.
In this study, non-monotonic dose dependent alterations in DNA methylation among BPA-exposed mouse liver samples and their relevant pathways were identified and validated. The comprehensive methylome map presented here provides candidate loci underlying the role of early BPA exposure and later in life health and disease status.
PMCID: PMC3902427  PMID: 24433282
Bisphenol A; DNA methylation; Environmental epigenomics; MethylPlex
11.  The salt-responsive transcriptome of chickpea roots and nodules via deepSuperSAGE 
BMC Plant Biology  2011;11:31.
The combination of high-throughput transcript profiling and next-generation sequencing technologies is a prerequisite for genome-wide comprehensive transcriptome analysis. Our recent innovation of deepSuperSAGE is based on an advanced SuperSAGE protocol and its combination with massively parallel pyrosequencing on Roche's 454 sequencing platform. As a demonstration of the power of this combination, we have chosen the salt stress transcriptomes of roots and nodules of the third most important legume crop chickpea (Cicer arietinum L.). While our report is more technology-oriented, it nevertheless addresses a major world-wide problem for crops generally: high salinity. Together with low temperatures and water stress, high salinity is responsible for crop losses of millions of tons of various legume (and other) crops. Continuously deteriorating environmental conditions will combine with salinity stress to further compromise crop yields. As a good example for such stress-exposed crop plants, we started to characterize salt stress responses of chickpeas on the transcriptome level.
We used deepSuperSAGE to detect early global transcriptome changes in salt-stressed chickpea. The salt stress responses of 86,919 transcripts representing 17,918 unique 26 bp deepSuperSAGE tags (UniTags) from roots of the salt-tolerant variety INRAT-93 two hours after treatment with 25 mM NaCl were characterized. Additionally, the expression of 57,281 transcripts representing 13,115 UniTags was monitored in nodules of the same plants. From a total of 144,200 analyzed 26 bp tags in roots and nodules together, 21,401 unique transcripts were identified. Of these, only 363 and 106 specific transcripts, respectively, were commonly up- or down-regulated (>3.0-fold) under salt stress in both organs, witnessing a differential organ-specific response to stress.
Building on recent pioneering work on massive cDNA sequencing in chickpea, more than 9,400 UniTags could be linked to UniProt entries. Additionally, over-representation analysis of gene ontology (GO) categories made it possible to filter out enriched biological processes among the differentially expressed UniTags. Subsequently, the gathered information was further cross-checked with stress-related pathways.
From several filtered pathways, here we focus exemplarily on transcripts associated with the generation and scavenging of reactive oxygen species (ROS), as well as on transcripts involved in Na+ homeostasis. Although both processes are already very well characterized in other plants, the information generated in the present work is of high value. Information on expression profiles and sequence similarity for several hundreds of transcripts of potential interest is now available.
This report demonstrates that the combination of the high-throughput transcriptome profiling technology SuperSAGE with one of the next-generation sequencing platforms allows deep insights into the first molecular reactions of a plant exposed to salinity. Cross validation with recent reports enriched the information about the salt stress dynamics of more than 9,000 chickpea ESTs, and enlarged their pool of alternative transcript isoforms.
As an example of the high resolution of the employed technology, which we term deepSuperSAGE, we demonstrate that ROS-scavenging and -generating pathways undergo strong global transcriptome changes in chickpea roots and nodules as early as 2 hours after onset of moderate salt stress (25 mM NaCl). Additionally, a set of more than 15 candidate transcripts is proposed as potential components of the salt overly sensitive (SOS) pathway in chickpea.
Newly identified transcript isoforms are potential targets for breeding novel cultivars with high salinity tolerance. We demonstrate that these targets can be integrated into breeding schemes by micro-arrays and RT-PCR assays downstream of the generation of 26 bp tags by SuperSAGE.
PMCID: PMC3045889  PMID: 21320317
12.  Challenges of sequencing human genomes 
Briefings in Bioinformatics  2010;11(5):484-498.
Massively parallel sequencing technologies continue to alter the study of human genetics. As the cost of sequencing declines, next-generation sequencing (NGS) instruments and datasets will become increasingly accessible to the wider research community. Investigators are understandably eager to harness the power of these new technologies. Sequencing human genomes on these platforms, however, presents numerous production and bioinformatics challenges. Production issues like sample contamination, library chimaeras and variable run quality have become increasingly problematic in the transition from technology development lab to production floor. Analysis of NGS data, too, remains challenging, particularly given the short-read lengths (35–250 bp) and sheer volume of data. The development of streamlined, highly automated pipelines for data analysis is critical for transition from technology adoption to accelerated research and publication. This review aims to describe the state of current NGS technologies, as well as the strategies that enable NGS users to characterize the full spectrum of DNA sequence variation in humans.
PMCID: PMC2980933  PMID: 20519329
massively parallel sequencing; next generation sequencing; human genome; variant detection; short read alignment; whole genome sequencing
13.  Methyl-Analyzer—whole genome DNA methylation profiling 
Bioinformatics  2011;27(16):2296-2297.
Summary: Methyl-Analyzer is a python package that analyzes genome-wide DNA methylation data produced by the Methyl-MAPS (methylation mapping analysis by paired-end sequencing) method. Methyl-MAPS is an enzymatic-based method that uses both methylation-sensitive and -dependent enzymes covering >80% of CpG dinucleotides within mammalian genomes. It combines enzymatic-based approaches with high-throughput next-generation sequencing technology to provide whole genome DNA methylation profiles. Methyl-Analyzer processes and integrates sequencing reads from methylated and unmethylated compartments and estimates CpG methylation probabilities at single base resolution.
Availability and implementation: Methyl-Analyzer is available at Sample dataset is available for download at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3150045  PMID: 21685051
14.  Monozygotic twins: genes are not the destiny? 
Bioinformation  2011;7(7):369-370.
Monozygotic twins are considered to be genetically identical, yet can show high discordance in their phenotypes and disease susceptibility. Several studies have emphasized the influence of external factors and the role of epigenetic polymorphism in conferring this variability. However, some recent high-resolution studies on DNA methylation show contradictory evidence, which raises questions about the extent of epigenetic variability between twins. The advent of next-generation sequencing technologies now allows us to interrogate multiple epigenomes on a massive scale and understand the role of epigenetic modification, especially DNA methylation, in regulating complex traits. This article briefly discusses the recent key findings and unsolved questions in the area, and speculates on future directions in the field.
PMCID: PMC3280493  PMID: 22355239
monozygotic twins; epigenetics; DNA methylation; next-generation sequencing
15.  Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing 
BMC Bioinformatics  2010;11:572.
Massively parallel sequencing readouts of epigenomic assays are enabling integrative genome-wide analyses of genomic and epigenomic variation. Pash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-Seq and methylome mapping by whole-genome bisulfite sequencing.
Pash 3.0 generally matches the accuracy and speed of niche programs for fast mapping of short reads, and exceeds their performance on longer reads generated by a new generation of massively parallel sequencing technologies. By exploiting longer read lengths, Pash 3.0 maps reads onto the large fraction of genomic DNA that contains repetitive elements and polymorphic sites, including indel polymorphisms.
We demonstrate the versatility of Pash 3.0 by analyzing the interaction between CpG methylation, CpG SNPs, and imprinting based on publicly available whole-genome shotgun bisulfite sequencing data. Pash 3.0 makes use of gapped k-mer alignment, a non-seed based comparison method, which is implemented using multi-positional hash tables. This allows Pash 3.0 to run on diverse hardware platforms, including individual computers with standard RAM capacity, multi-core hardware architectures and large clusters.
PMCID: PMC3001746  PMID: 21092284
16.  Epigenome Sequencing Comes of Age 
Cell  2008;133(3):395-397.
Epigenetic states are responsive to developmental and environmental signals, and as a consequence a eukaryotic cell can have many different epigenomes. In this issue of Cell, Lister et al. (2008) present the floral epigenome of Arabidopsis using next-generation sequencing technology to analyze both DNA methylation at single-base resolution and the expression of small RNAs.
PMCID: PMC3137521  PMID: 18455978
17.  Novel software package for cross-platform transcriptome analysis (CPTRA) 
BMC Bioinformatics  2009;10(Suppl 11):S16.
Next-generation sequencing techniques enable several novel transcriptome profiling approaches. Recent studies indicated that digital gene expression profiling based on short sequence tags has superior performance as compared to other transcriptome analysis platforms including microarrays. However, the transcriptomic analysis with tag-based methods often depends on available genome sequence. The use of tag-based methods in species without genome sequence should be complemented by other methods such as cDNA library sequencing. The combination of different next generation sequencing techniques like 454 pyrosequencing and Illumina Genome Analyzer (Solexa) will enable high-throughput and accurate global gene expression profiling in species with limited genome information. The combination of transcriptome data acquisition methods requires cross-platform transcriptome data analysis platforms, including a new software package for data processing.
Here we presented a software package, CPTRA: Cross-Platform TRanscriptome Analysis, to analyze transcriptome profiling data from separate methods. The software package is available at It was applied to the case study of non-target site glyphosate resistance in horseweed; and the data was mined to discover resistance target gene(s). For the software, the input data included a long-read sequence dataset with proper annotation, and a short-read sequence tag dataset for the quantification of transcripts. By combining the two datasets, the software carries out the unique sequence tag identification, tag counting for transcript quantification, and cross-platform sequence matching functions, whereby the short sequence tags can be annotated with a function, level of expression, and Gene Ontology (GO) classification. Multiple sequence search algorithms were implemented and compared. The analysis highlighted the importance of transport genes in glyphosate resistance and identified several candidate genes for down-stream analysis.
CPTRA is a powerful software package for next generation sequencing-based transcriptome profiling in species with limited genome information. According to our case study, the strategy can greatly broaden the application of the next generation sequencing for transcriptome analysis in species without reference genome sequence.
PMCID: PMC3226187  PMID: 19811681
18.  Emerging patterns of epigenomic variation 
Trends in genetics : TIG  2011;27(6):242-250.
Fuelled by new sequencing technologies, epigenome mapping projects are revealing epigenomic variation at all levels of biological complexity, from species to cells. Comparisons of methylation profiles among species reveal evolutionary conservation of gene body methylation patterns, pointing to the fundamental role of epigenomes in gene regulation. At the human population level, epigenomic changes provide footprints of the effects of genomic variants within the vast non-protein coding fraction of the genome while comparisons of the epigenomes of parents and their offspring point to quantitative epigenomic parent-of-origin effects confounding classical Mendelian genetics. At the organismal level, comparisons of epigenomes from diverse cell types provide insights into cellular differentiation. Finally, comparisons of epigenomes from monozygotic twins help dissect genetic and environmental influences on human phenotypes and longitudinal comparisons reveal aging-associated epigenomic drift. The development of new bioinformatic frameworks for comparative epigenome analysis is putting epigenome maps within reach of researchers across a wide spectrum of biological disciplines.
PMCID: PMC3104125  PMID: 21507501
19.  New strategies and emerging technologies for massively parallel sequencing: applications in medical research 
Genome Medicine  2009;1(4):40.
A variety of techniques that specifically target human gene sequences for differential capture from a genomic sample, coupled with next-generation, massively parallel DNA sequencing instruments, is rapidly supplanting the combination of polymerase chain reaction and capillary sequencing to discover coding variants in medically relevant samples. These studies are most appropriate for the sample numbers necessary to identify both common and rare single nucleotide variants, as well as small insertion or deletion events, which may cause complex inherited diseases. The same massively parallel sequencers are simultaneously being used for whole-genome resequencing and comprehensive, genome-wide variant discovery in studies of somatic diseases such as cancer. Viral and microbial researchers are using next-generation sequences to identify unknown etiologic agents in human diseases, to study the viral and microbial species that occupy surfaces of the human body, and to inform the clinical management of chronic infectious diseases such as human immunodeficiency virus (HIV). Taken together, these approaches are dramatically accelerating the pace of human disease research and are already impacting patient care.
PMCID: PMC2684661  PMID: 19435481
20.  Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries 
Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next-generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background.
PMCID: PMC3495292  PMID: 23193365
21.  Next-generation sequencing: applications beyond genomes 
Biochemical Society Transactions  2008;36(Pt 5):1091-1096.
The development of DNA sequencing more than 30 years ago has profoundly impacted biological research. In the last couple of years, remarkable technological innovations have emerged that allow the direct and cost-effective sequencing of complex samples at unprecedented scale and speed. These next-generation technologies make it feasible to sequence not only static genomes, but also entire transcriptomes expressed under different conditions. These and other powerful applications of next-generation sequencing are rapidly revolutionizing the way genomic studies are carried out. Below, we provide a snapshot of these exciting new approaches to understanding the properties and functions of genomes. Given that sequencing-based assays may increasingly supersede microarray-based assays, we also compare and contrast data obtained from these distinct approaches.
PMCID: PMC2563889  PMID: 18793195
ChIP-Seq; high-throughput sequencing; massively parallel sequencing; microarray; RNA-Seq; transcriptome; yeast; ChIP, chromatin immunoprecipitation; ChIP-on-chip, ChIP using microarrays; NRSF, neuron-restrictive silencer factor; STAT1, signal transducer and activator of transcription 1
22.  Application of Genotyping-by-Sequencing on Semiconductor Sequencing Platforms: A Comparison of Genetic and Reference-Based Marker Ordering in Barley 
PLoS ONE  2013;8(10):e76925.
The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data: the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, a paucity of SNPs that were jointly discovered by the different pipelines, indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley.
Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new sequencing technologies, analysis tools and genomic resources develop.
PMCID: PMC3789676  PMID: 24098570
23.  Sequencing quality assessment tools to enable data-driven informatics for high throughput genomics 
Frontiers in Genetics  2013;4:288.
The processes of quality assessment and control are an active area of research at The Genome Analysis Centre (TGAC). Unlike other sequencing centers that often concentrate on a certain species or technology, TGAC applies expertise in genomics and bioinformatics to a wide range of projects, often requiring bespoke wet lab and in silico workflows. TGAC is fortunate to have access to a diverse range of sequencing and analysis platforms, and we are at the forefront of investigations into library quality and sequence data assessment. We have developed and implemented a number of algorithms, tools, pipelines and packages to ascertain, store, and expose quality metrics across a number of next-generation sequencing platforms, allowing rapid and in-depth cross-platform Quality Control (QC) bioinformatics. In this review, we describe these tools as a vehicle for data-driven informatics, offering the potential to provide richer context for downstream analysis and to inform experimental design.
PMCID: PMC3865868  PMID: 24381581
quality control; sequence analysis; QC; NGS data analysis; bioinformatics tools; run statistics; quality assessment and improvement; contamination screening
24.  Integrated Core Facility Support and Optimization of Next Generation Sequencing Technologies 
New DNA sequencing technologies present an exceptional opportunity for novel and creative applications with the potential for breakthrough discoveries. To support such research efforts, the Cornell University Life Sciences Core Laboratories Center has implemented the Illumina HiSeq 2000 and the Roche 454 GS FLX platforms as academic core facility shared research resources. We have established sample handling methods, LIMS tools and BioHPC informatics analysis pipelines in support of these new technologies. Our genomics core laboratory, in collaboration with our epigenomics core and bioinformatics core, provides sample preparation and data generation services and both project consultation and analysis support for a wide range of possible applications, including de novo or reference based genome assembly, detection of genetic variation, transcriptome sequencing, small RNA profiling, and genome-wide epigenomic measurements of methylation and protein-nucleic acid interactions. Implementation of next generation sequencing platforms as shared resources with multidisciplinary core facility support enables cost effective access and broad based use of these technologies.
PMCID: PMC3186497
25.  The Decade of the Epigenomes? 
Genes & Cancer  2011;2(6):680-687.
The beginning of this century was not only marked by the publication of the first draft of the human genome but also set off a decade of intense research on epigenetic phenomena. Apart from DNA methylation, it became clear that many other factors including a wide range of histone modifications, different shades of chromatin accessibility, and a vast suite of noncoding RNAs comprise the epigenome. With the recent advances in sequencing technologies, it has now become possible to analyze many of these features in depth, allowing for the first time the establishment of complete epigenomic profiles for basically every cell type of interest. Here, we will discuss the recent advances that allow comprehensive epigenetic mapping, highlight several projects that set out to better understand the epigenome, and discuss the impact that epigenomic mapping can have on our understanding of both healthy and diseased cells.
PMCID: PMC3174262  PMID: 21941622
epigenome; chromatin accessibility; DNA methylation; ChIP-seq; RNA-seq
