
Results 1-12 (12)

1.  Genes suppressed by DNA methylation in non-small cell lung cancer reveal the epigenetics of epithelial–mesenchymal transition 
BMC Genomics  2014;15(1):1079.
DNA methylation is associated with aberrant gene expression in cancer, and has been shown to correlate with therapeutic response and disease prognosis in some types of cancer. We sought to investigate the biological significance of DNA methylation in lung cancer.
We integrated the gene expression profiles and data of gene promoter methylation for a large panel of non-small cell lung cancer cell lines, and identified 578 candidate genes with expression levels that were inversely correlated to the degree of DNA methylation. We found these candidate genes to be differentially methylated in normal lung tissue versus non-small cell lung cancer tumors, and segregated by histologic and tumor subtypes. We used gene set enrichment analysis of the genes ranked by the degree of correlation between gene expression and DNA methylation to identify gene sets involved in cellular migration and metastasis. Our unsupervised hierarchical clustering of the candidate genes segregated cell lines according to the epithelial-to-mesenchymal transition phenotype. Genes related to the epithelial-to-mesenchymal transition, such as AXL, ESRP1, HoxB4, and SPINT1/2, were among the nearly 20% of the candidate genes that were differentially methylated between epithelial and mesenchymal cells. Greater numbers of genes were methylated in the mesenchymal cells and their expressions were upregulated by 5-azacytidine treatment. Methylation of the candidate genes was associated with erlotinib resistance in wild-type EGFR cell lines. The expression profiles of the candidate genes were associated with 8-week disease control in patients with wild-type EGFR who had unresectable non-small cell lung cancer treated with erlotinib, but not in patients treated with sorafenib.
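The candidate-gene selection described above (expression inversely correlated with promoter methylation across cell lines) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not state which correlation measure or threshold was used, so the Pearson coefficient, the `cutoff` value, the function names, and the toy values are all assumptions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation undefined, treated as none
    return cov / (sx * sy)

def inversely_correlated(methylation, expression, cutoff=-0.5):
    """Return {gene: r} for genes whose expression falls as methylation rises.

    methylation, expression: dict mapping gene -> list of values, one per
    cell line, in the same cell-line order.
    cutoff: hypothetical threshold on r (not taken from the study).
    """
    hits = {}
    for gene in methylation.keys() & expression.keys():
        r = pearson(methylation[gene], expression[gene])
        if r <= cutoff:
            hits[gene] = r
    return hits

# Toy values invented for illustration; not data from the study.
methylation = {"GENE_A": [0.1, 0.4, 0.7, 0.9], "GENE_B": [0.2, 0.3, 0.8, 0.9]}
expression = {"GENE_A": [9.0, 7.5, 4.0, 2.0], "GENE_B": [3.0, 3.5, 6.0, 7.0]}
candidates = inversely_correlated(methylation, expression)
print(sorted(candidates))
```

In this toy input only GENE_A is selected, because its expression drops as methylation increases, while GENE_B moves in the same direction as its methylation.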
Our results demonstrate that the underlying biology of genes regulated by DNA methylation may have predictive value in lung cancer that can be exploited therapeutically.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-1079) contains supplementary material, which is available to authorized users.
PMCID: PMC4298954  PMID: 25486910
DNA methylation; Epithelial-mesenchymal transition; Erlotinib; Lung cancer
2.  Unlocking the mystery of the hard-to-sequence phage genome: PaP1 methylome and bacterial immunity 
BMC Genomics  2014;15(1):803.
Whole-genome sequencing is an important method for understanding the genetic information, gene function, biological characteristics and survival mechanisms of organisms. Sequencing large genomes is very simple at present. However, we encountered a hard-to-sequence genome, that of Pseudomonas aeruginosa phage PaP1. The shotgun sequencing method failed to complete the sequence of this genome.
After persevering for 10 years and going through three generations of sequencing techniques, we successfully completed the sequence of the PaP1 genome, which is 91,715 bp long. Single-molecule real-time sequencing revealed that this genome contains 51 N-6-methyladenines and 152 N-4-methylcytosines. Three significant modified sequence motifs were predicted, but not all occurrences of these motifs in the genome were methylated. Further investigations revealed a novel immune mechanism of bacteria, in which host bacteria can recognise and repel inserts containing modified bases on a large scale. This mechanism could account for the failure of the shotgun method in PaP1 genome sequencing. The problem was resolved by using the nfi- mutant of Escherichia coli DH5α as a host bacterium to construct a shotgun library.
This work provided insights into the hard-to-sequence phage PaP1 genome and discovered a new mechanism of bacterial immunity. The methylome of phage PaP1 is responsible for the failure of shotgun sequencing and for bacterial immunity mediated by enzyme Endo V activity; this methylome also provides a valuable resource for future studies on PaP1 genome replication and modification, as well as on gene regulation and host interaction.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-803) contains supplementary material, which is available to authorized users.
PMCID: PMC4177049  PMID: 25233860
3.  Data-mining the FlyAtlas online resource to identify core functional motifs across transporting epithelia 
BMC Genomics  2013;14:518.
Comparative analysis of tissue-specific transcriptomes is a powerful technique to uncover tissue functions. Our FlyAtlas online resource provides authoritative gene expression levels for multiple tissues of Drosophila melanogaster (1). Although the main use of such resources is single-gene lookup, there is the potential for powerful meta-analysis to address questions that could not easily be framed otherwise. Here, we illustrate the power of data-mining FlyAtlas data by comparing epithelial transcriptomes to identify a core set of highly expressed genes across the four major epithelial tissues (salivary glands, Malpighian tubules, midgut and hindgut) of both adults and larvae.
Parallel hypothesis-led and hypothesis-free approaches were adopted to identify core genes that underpin insect epithelial function. In the former, gene lists were created from transport processes identified in the literature, and their expression profiles mapped from the online dataset. In the latter, gene enrichment lists were prepared for each epithelium, and genes (both transport related and unrelated) consistently enriched in transporting epithelia identified.
A key set of transport genes, comprising V-ATPases, cation exchangers, aquaporins, potassium and chloride channels, and carbonic anhydrase, was found to be highly enriched across the epithelial tissues, compared with the whole fly. Additionally, a further set of genes that had not been predicted to have epithelial roles was co-expressed with the core transporters, extending our view of what makes a transporting epithelium work. Further insights were obtained by studying the genes uniquely overexpressed in each epithelium; for example, the salivary gland expresses lipases, the midgut expresses organic solute transporters, the tubules specialize in purine metabolism, and the hindgut overexpresses still unknown genes.
Taken together, these data provide a unique insight into epithelial function in this key model insect, and a framework for comparison with other species. They also provide a methodology for function-led data-mining of FlyAtlas and other multi-tissue expression datasets.
PMCID: PMC3734111  PMID: 23895496
Drosophila melanogaster; Functional genomics; Ion transport; Microarrays
4.  Systematic evaluation of genome-wide methylated DNA enrichment using a CpG island array 
BMC Genomics  2011;12:10.
Recent progress in high-throughput technologies has greatly contributed to the development of DNA methylation profiling. Although several reports describe methylome detection by whole-genome bisulfite sequencing, its high cost and heavy demand on bioinformatics analysis prevent its extensive application. Thus, current strategies for the study of mammalian DNA methylomes are still based primarily on genome-wide methylated DNA enrichment combined with DNA microarray detection or sequencing. Methylated DNA enrichment is a key step in microarray-based genome-wide methylation profiling studies, and even in future high-throughput sequencing-based methylome analysis.
In order to evaluate the sensitivity and accuracy of methylated DNA enrichment, we investigated and optimized a number of important parameters to improve the performance of several enrichment assays, including differential methylation hybridization (DMH), microarray-based methylation assessment of single samples (MMASS), and methylated DNA immunoprecipitation (MeDIP). With advantages and disadvantages unique to each approach, we found that assays based on methylation-sensitive enzyme digestion and those based on immunoprecipitation detected different methylated DNA fragments, indicating that they are complementary in their relative ability to detect methylation differences.
Our study provides the first comprehensive evaluation of widely used methodologies for methylated DNA enrichment, and could be helpful for developing a cost-effective approach to DNA methylation profiling.
PMCID: PMC3023747  PMID: 21211017
5.  Genes related to the very early stage of ConA-induced fulminant hepatitis: a gene-chip-based study in a mouse model 
BMC Genomics  2010;11:240.
Due to the high morbidity and mortality of fulminant hepatitis, early diagnosis followed by early effective treatment is the key for prognosis improvement. So far, little is known about the gene expression changes in the early stage of this serious illness. Identification of the genes related to the very early stage of fulminant hepatitis development may provide precise clues for early diagnosis.
Balb/C mice were injected with ConA to induce fulminant hepatitis, which was confirmed by pathological and biochemical examination. After gene chip-based screening, the liver gene expression data were further dissected by ANOVA, gene expression profiling, gene network construction and real-time RT-PCR.
At the very early stage of ConA-triggered fulminant hepatitis, a total of 1,473 genes with differing expression changes were identified. Among these, 26 genes were finally selected for further investigation. Gene network analysis showed that two genes, MPDZ and Acsl1, were located at the core of the network.
At the early stages of fulminant hepatitis, the expression of 26 genes involved in protein transport, transcription regulation and cell metabolism was significantly altered. These genes form a network and show a strong correlation with fulminant hepatitis development. Our study provides several potential targets for the early diagnosis of fulminant hepatitis.
PMCID: PMC2867829  PMID: 20398290
6.  Mapping QTL affecting resistance to Marek's disease in an F6 advanced intercross population of commercial layer chickens 
BMC Genomics  2009;10:20.
Marek's disease (MD) is a T-cell lymphoma of chickens caused by the Marek's disease virus (MDV), an oncogenic avian herpesvirus. MD is a major cause of economic loss to the poultry industry and the most serious and persistent infectious disease concern. A full-sib intercross population, consisting of five independent families was generated by crossing and repeated intercrossing of two partially inbred commercial White Leghorn layer lines known to differ in genetic resistance to MD. At the F6 generation, a total of 1615 chicks were produced (98 to 248 per family) and phenotyped for MD resistance measured as survival time in days after challenge with a very virulent plus (vv+) strain of MDV.
QTL affecting MD resistance were identified by selective DNA pooling using a panel of 15 SNPs and 217 microsatellite markers. Since MHC blood type (BT) is known to affect MD resistance, a total of 18 independent pool pairs were constructed according to family × BT combination, with some combinations represented twice for technical reasons. Twenty-one QTL regions (QTLR) affecting post-challenge survival time were identified, distributed among 11 chromosomes (GGA1, 2, 3, 4, 5, 8, 9, 15, 18, 26 and Z), with about two-thirds of the MD resistance alleles derived from the more MD-resistant parental line. Eight of the QTLR associated with MD resistance were previously identified in a backcross (BC) mapping study with the same parental lines. Of these, seven originated from the more resistant line, and one from the less resistant line.
There was considerable evidence suggesting that MD resistance alleles tend to be recessive. The width of the QTLR for these QTL appeared to be reduced about two-fold in the F6 as compared to that found in the previous BC study. These results provide a firm basis for high-resolution linkage disequilibrium mapping and positional cloning of the resistance genes.
PMCID: PMC2651900  PMID: 19144166
7.  How many human genes can be defined as housekeeping with current expression data? 
BMC Genomics  2008;9:172.
Housekeeping (HK) genes are ubiquitously expressed in all tissue/cell types and constitute a basal transcriptome for the maintenance of basic cellular functions. Partitioning transcriptomes into HK and tissue-specific (TS) genes is fundamental for studying gene expression and cellular differentiation. Although many studies have aimed at large-scale and thorough categorization of human HK genes, a meaningful consensus has yet to be reached.
We collected the two latest gene expression datasets (EST and microarray data) from public databases and analyzed the gene expression profiles in 18 human tissues that are well documented by both data types. Benchmarked against a manually curated HK gene collection (HK408), we demonstrated that present data from EST sampling were far from saturated, and that this inadequacy has limited gene detectability and our understanding of TS expression. Due to a likely over-stringent threshold, microarray data showed a higher false negative rate than EST data, leading to a significant underestimation of HK genes. Based on EST data, we found that 40.0% of the currently annotated human genes were universally expressed in at least 16 of 18 tissues, as compared to only 5.1% specifically expressed in a single tissue. Our current EST-based estimate of human HK genes ranged from 3,140 to 6,909 in number, a ten-fold increase in comparison with previous microarray-based estimates.
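The breadth criteria quoted above (expressed in at least 16 of 18 tissues versus expressed in a single tissue) can be illustrated with a small sketch. The function name, the "intermediate" and "undetected" labels, and the toy gene and tissue names are assumptions made for illustration; how a gene is called "detected" in a tissue is outside the scope of this sketch.

```python
def classify_by_breadth(expressed_in, n_tissues=18, min_broad=16):
    """Label each gene by the number of tissues in which it is detected.

    expressed_in: dict mapping gene -> set of tissue names where it is detected.
    n_tissues, min_broad: the 18-tissue panel and 16-tissue cutoff from the text.
    """
    labels = {}
    for gene, tissues in expressed_in.items():
        k = len(tissues)
        if k > n_tissues:
            raise ValueError(f"{gene}: {k} tissues exceeds panel of {n_tissues}")
        if k >= min_broad:
            labels[gene] = "housekeeping"
        elif k == 1:
            labels[gene] = "tissue-specific"
        elif k == 0:
            labels[gene] = "undetected"      # label assumed for this sketch
        else:
            labels[gene] = "intermediate"    # label assumed for this sketch
    return labels

# Toy input with placeholder names; not data from the study.
panel = [f"tissue_{i}" for i in range(18)]
expressed_in = {
    "gene_w": set(panel),        # detected in all 18 tissues
    "gene_x": set(panel[:16]),   # detected in 16 tissues
    "gene_y": set(panel[:5]),    # detected in 5 tissues
    "gene_z": {panel[0]},        # detected in a single tissue
}
labels = classify_by_breadth(expressed_in)
print(labels)
```

Counting the share of genes with each label over a full annotation would give percentages comparable in kind to the 40.0% and 5.1% figures reported above.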
We concluded that a significant fraction of human genes, at least in the currently annotated data depositories, was broadly expressed. Our understanding of tissue-specific expression was still preliminary and required much more large-scale and high-quality transcriptomic data in future studies. The new HK gene list categorized in this study will be useful for genome-wide analyses on structural and functional features of HK genes.
PMCID: PMC2396180  PMID: 18416810
8.  Variable sexually dimorphic gene expression in laboratory strains of Drosophila melanogaster 
BMC Genomics  2007;8:454.
Wild-type laboratory strains of model organisms are typically kept in isolation for many years, with genetic drift and selection acting on mutational variation to make lineages diverge over time. Natural populations from which such strains are established show that gender-specific interactions in particular drive many aspects of sequence-level and transcriptional-level variation. Here, our goal was to identify genes that display transcriptional variation between laboratory strains of Drosophila melanogaster, and to explore evidence of gender-biased interactions underlying that variability.
Transcriptional variation among the laboratory genotypes studied occurs more frequently in males than in females. Qualitative differences are also apparent, suggesting that genes within particular functional classes disproportionately display variation in gene expression. Our analysis indicates that genes with reproductive functions are most often divergent between genotypes in both sexes; however, a large proportion of female variation can also be attributed to genes without expression in the ovaries.
The present study clearly shows that transcriptional variation between common laboratory strains of Drosophila can differ dramatically due to sexual dimorphism. Much of this variation reflects sex-specific challenges associated with divergent physiological trade-offs, morphology and regulatory pathways operating within males and females.
PMCID: PMC2244638  PMID: 18070343
9.  GO-2D: identifying 2-dimensional cellular-localized functional modules in Gene Ontology 
BMC Genomics  2007;8:30.
Rapid progress in high-throughput biotechnologies (e.g. microarrays) and the exponential accumulation of gene functional knowledge make a systematic understanding of complex human diseases at the level of functional modules a promising goal. Based on Gene Ontology, a large number of automatic tools have been developed for the functional analysis and biological interpretation of high-throughput microarray data.
Unlike existing tools such as Onto-Express and FatiGO, our tool, named GO-2D, identifies 2-dimensional functional modules based on combined GO categories. For example, it refines biological process categories by sorting their genes into different cellular component categories, and then extracts those combined categories enriched with the genes of interest (e.g., the differentially expressed genes) to identify the cellular-localized functional modules. Applications of GO-2D to the analyses of two human cancer datasets show that very specific disease-relevant processes can be identified by using cellular location information.
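The idea of combining biological process and cellular component categories and then testing each combined category for enrichment can be sketched as below. This is not the GO-2D implementation: the abstract does not name the statistical test, so the one-sided hypergeometric test used here is an assumption, as are the function names and the toy annotations.

```python
from math import comb

def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K are in the category."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)

def combined_enrichment(bp, cc, interesting, background):
    """Score every (biological process, cellular component) pair.

    bp, cc: dict mapping term -> set of genes annotated to that term.
    interesting: set of genes of interest (e.g. differentially expressed).
    background: set of all genes considered.
    Returns {(process, component): p-value} for non-empty combined categories.
    """
    selected = interesting & background
    N, n = len(background), len(selected)
    scores = {}
    for p_term, p_genes in bp.items():
        for c_term, c_genes in cc.items():
            module = p_genes & c_genes & background  # the combined category
            if not module:
                continue
            k = len(module & selected)
            scores[(p_term, c_term)] = hypergeom_tail(k, len(module), n, N)
    return scores

# Toy annotations invented for illustration; not real GO data.
background = {f"g{i}" for i in range(1, 21)}
bp = {"apoptosis": {f"g{i}" for i in range(1, 9)}}
cc = {
    "mitochondrion": {"g1", "g2", "g3", "g4", "g15"},
    "nucleus": {"g5", "g6", "g7", "g8", "g16"},
}
interesting = {"g1", "g2", "g3", "g4"}
scores = combined_enrichment(bp, cc, interesting, background)
for pair, p in sorted(scores.items(), key=lambda kv: kv[1]):
    print(pair, p)
```

In the toy input, the process as a whole holds all four genes of interest, but splitting it by location shows that they sit entirely in the mitochondrial subset, which is the kind of refinement the combined categories are meant to expose. A real analysis would also need a multiple-testing correction across the many combined categories.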
For studying complex human diseases, GO-2D can extract functionally compact and detailed modules such as the cellular-localized ones, characterizing disease-relevant modules in terms of both biological processes and cellular locations. The application results clearly demonstrate that the 2-dimensional approach, which is complementary to the current 1-dimensional approach, is powerful for finding modules highly relevant to diseases.
PMCID: PMC1794235  PMID: 17250772
10.  The use of comparative genomic hybridization to characterize genome dynamics and diversity among the serotypes of Shigella 
BMC Genomics  2006;7:218.
Compelling evidence indicates that Shigella species, the etiologic agents of bacillary dysentery, as well as enteroinvasive Escherichia coli, are derived from multiple origins of Escherichia coli and form a single pathovar. To further understand the genome diversity and virulence evolution of Shigella, comparative genomic hybridization microarray analysis was employed to compare the gene content of E. coli K-12 with those of 43 Shigella strains from all lineages.
For the 43 strains subjected to CGH microarray analyses, the common backbone of the Shigella genome was estimated to contain more than 1,900 open reading frames (ORFs), with a mean number of 726 undetectable ORFs. The mosaic distribution of absent regions indicated that insertions and/or deletions have led to the highly diversified genomes of pathogenic strains.
These results support the hypothesis that, by gain and loss of functions, Shigella species became successful human pathogens through convergent evolution from diverse genomic backgrounds. We also found many specific differences between lineages, providing a window into bacterial speciation and taxonomic relationships.
PMCID: PMC3225857  PMID: 16939645
11.  Complete genome sequence of Shigella flexneri 5b and comparison with Shigella flexneri 2a 
BMC Genomics  2006;7:173.
Shigella bacteria cause dysentery, which remains a significant threat to public health. Shigella flexneri is the most common species in both developing and developed countries. Five Shigella genomes have been sequenced, revealing dynamic and diverse features. To investigate the intra-species diversity of S. flexneri genomes further, we have sequenced the complete genome of S. flexneri 5b strain 8401 (abbreviated Sf8401) and compared it with S. flexneri 2a (Sf301).
The Sf8401 chromosome is 4.5-Mb in size, a little smaller than that of Sf301, mainly because the former lacks the SHI-1 pathogenicity island (PAI). Compared with Sf301, there are 6 inversions and one translocation in Sf8401, which are probably mediated by insertion sequences (IS). There are clear differences in the known PAIs between these two genomes. The bacteriophage SfV segment remaining in SHI-O of Sf8401 is clearly larger than the remnants of bacteriophage SfII in Sf301. SHI-1 is absent from Sf8401 but a specific related protein is found next to the pheV locus. SHI-2 is involved in one intra-replichore inversion near the origin of replication, which may change the expression of iut/iuc genes. Moreover, genes related to the glycine-betaine biosynthesis pathway are present only in Sf8401 among the known Shigella genomes.
Our data show that the two S. flexneri genomes are very similar, which suggests a high level of structural and functional conservation between the two serotypes. The differences reflect different selection pressures during evolution. The ancestor of S. flexneri probably acquired SHI-1 and SHI-2 before SHI-O was integrated and the serotypes diverged. SHI-1 was subsequently deleted from the S. flexneri 5b genome by recombination, but stabilized in the S. flexneri 2a genome. These events may have contributed to the differences in pathogenicity and epidemicity between the two serotypes of S. flexneri.
PMCID: PMC1550401  PMID: 16822325
12.  Obtaining reliable information from minute amounts of RNA using cDNA microarrays 
BMC Genomics  2002;3:16.
High density cDNA microarray technology provides a powerful tool to survey the activity of thousands of genes in normal and diseased cells, which helps us both to understand the molecular basis of the disease and to identify potential targets for therapeutic intervention. The promise of this technology has been hampered by the large amount of biological material required for the experiments (more than 50 μg of total RNA per array). We have modified an amplification procedure that requires only 1 μg of total RNA. Analyses of the results showed that most genes that were detected as expressed or differentially expressed using the regular protocol were also detected using the amplification protocol. In addition, many genes that were undetected or weakly detected using the regular protocol were clearly detected using the amplification protocol. We have carried out a series of confirmation studies by northern blotting, western blotting, and immunohistochemistry assays.
Our results showed that most of the new information revealed by the amplification protocol represents real gene activity in the cells.
We have confirmed a powerful and consistent cDNA microarray procedure that can be used to study minute amounts of biological tissue.
PMCID: PMC117130  PMID: 12086591
