Background. Colorectal cancer mostly arises from polyps of the colon. The aim of our study was to examine the association of body mass index (BMI) and serum lipids with colorectal polyps in elderly Chinese people. Methods. The risk of developing colorectal polyps was studied in 244 subjects (212 men and 32 women, 74.63 ± 11.63 years old) who underwent colonoscopy for the first time from January 2008 to July 2012 at the Navy General Hospital, Beijing, China. According to the results of colonoscopy, the subjects were divided into four groups: 112 normal controls, 38 with right colorectal polyps, 53 with left colorectal polyps, and 41 with both right and left colorectal polyps. Total plasma cholesterol, plasma triglycerides, plasma creatinine, blood urea nitrogen, and fasting glucose were determined using a multichannel analyzer. Results. BMI, total cholesterol, triglycerides, creatinine, and blood urea nitrogen differed significantly among the normal control, right colorectal polyps, left colorectal polyps, and both right and left polyps groups. In binary logistic regression analysis, two risk factors were associated with the occurrence of colorectal polyps: BMI and systolic blood pressure. Conclusions. Colorectal polyps were significantly associated with increased BMI, total cholesterol, and triglyceride levels.
Protein microarrays have been developed to study antibody reactivity against a large number of antigens and show broad prospects for clinical application. We developed a viral antigen array by spotting four recombinant antigens and a synthetic peptide, including glycoprotein G of herpes simplex virus (HSV) types 1 and 2, phosphoprotein 150 of cytomegalovirus (CMV), and Rubella virus (RV) core plus glycoproteins E1 and E2 as well as an E1 peptide, at optimal concentrations on activated glass slides to simultaneously detect IgG and IgM against HSV1, HSV2, CMV and RV in clinical specimens of sera and cerebrospinal fluids (CSFs). Positive reference sera were initially used to measure the sensitivity and specificity of the array under optimal conditions. Then clinical specimens of 144 sera and 93 CSFs were tested for IgG and IgM antibodies directed against HSV1, HSV2, CMV and RV by the antigen array. The specificity of the antigen array for viral antibody detection was satisfactory compared with commercial ELISA kits, but the sensitivity of the array varied depending on the quality and antigenic epitopes of the spotted antigens. In short, the recombinant antigen array has the potential to simultaneously detect multiple viral antibodies using a minute amount (3 µl) of sample, which is a particular advantage for detecting viral antibodies in clinical CSFs from patients with suspected neonatal meningitis and encephalitis.
Moorella thermoacetica was long the only model organism used to study the biochemistry of acetogenesis from CO2. Depending on the growth substrate, this Gram-positive bacterium can either form H2 or consume it. Despite the importance of H2 in its metabolism, a hydrogenase from the organism has not yet been characterized. We report here the purification and properties of an electron-bifurcating [FeFe]-hydrogenase from M. thermoacetica and show that the cytoplasmic enzyme efficiently catalyzes both H2 formation and H2 uptake. The purified heterotrimeric iron-sulfur flavoprotein (HydABC) catalyzed the coupled reduction of ferredoxin (Fd) and NAD+ with H2 at 55°C at pH 7.5 at a specific rate of about 100 μmol min−1 mg protein−1 and the reverse reaction, the coupled reduction of protons to H2 with reduced ferredoxin and NADH, at a specific rate of about 10 μmol min−1 mg protein−1 in the stoichiometry Fdox + NAD+ + 2H2 ⇋ Fdred2− + NADH + 3H+. When ferredoxin from Clostridium pasteurianum, NAD+, and the enzyme were incubated at pH 7.0 under 100% H2 in the gas phase (E0′ = −414 mV), more than 95% of the ferredoxin (E0′ = −400 mV) was reduced, which indicated that ferredoxin reduction with H2 is driven by the exergonic reduction of NAD+ (E0′ = −320 mV) with H2. In the absence of NAD+, ferredoxin was not reduced. We identified the genes encoding HydABC within the transcriptional unit hydCBAX and mapped the transcription start site.
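The energetic argument above can be followed with a short standard-state calculation. This is only a sketch: it uses the three potentials quoted in the abstract and the textbook relation ΔG0′ = −nFΔE0′, with two electrons per reaction; the function and variable names are illustrative, and actual cellular concentrations would shift the values.

```python
# Standard free-energy arithmetic for the redox potentials quoted in the abstract.
FARADAY_KJ_PER_V_MOL = 96.485  # Faraday constant in kJ per volt per mole of electrons

# Standard potentials at pH 7 in millivolts, as quoted in the text.
E_H2 = -414.0   # 2 H+ / H2
E_FD = -400.0   # ferredoxin (C. pasteurianum)
E_NAD = -320.0  # NAD+ / NADH


def delta_g_kj(e_donor_mv, e_acceptor_mv, n_electrons=2):
    """Return dG0' = -n * F * (E_acceptor - E_donor) in kJ/mol."""
    delta_e_volts = (e_acceptor_mv - e_donor_mv) / 1000.0
    return -n_electrons * FARADAY_KJ_PER_V_MOL * delta_e_volts


dg_nad = delta_g_kj(E_H2, E_NAD)  # H2 -> NAD+
dg_fd = delta_g_kj(E_H2, E_FD)    # H2 -> ferredoxin
dg_total = dg_nad + dg_fd         # coupled (bifurcating) reaction with 2 H2

print(f"H2 -> NAD+ : {dg_nad:.1f} kJ/mol")
print(f"H2 -> Fd   : {dg_fd:.1f} kJ/mol")
print(f"coupled    : {dg_total:.1f} kJ/mol")
```

Under these standard-state assumptions, most of the driving force of the coupled reaction comes from the NAD+ half-reaction, which is consistent with the observation that ferredoxin was not reduced when NAD+ was absent.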
Identifying a bicluster, or submatrix of a gene expression dataset wherein the genes exhibit similar behavior over the columns, is useful for discovering novel functional gene interactions. In this article, we introduce a new algorithm for finding biClusters with Linear Patterns (CLiP). Instead of solely maximizing Pearson correlation, we introduce a fitness function that also considers the correlation of complementary genes and conditions. This eliminates the need for a priori determination of the bicluster size. We employ both greedy search and a genetic algorithm in optimization, incorporating resampling for more robust discovery. When applied to both real and simulated datasets, our results show that CLiP is superior to existing methods. In analyzing RNA-seq fly and worm time-course data from modENCODE, we uncover a set of similarly expressed genes suggesting maternal dependence. Supplementary Material is available online (at www.liebertonline.com/cmb).
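To make the idea of scoring a bicluster against its complement concrete, the following sketch computes a toy score. The abstract does not give the CLiP fitness function, so `toy_fitness` is a hypothetical stand-in: the mean absolute Pearson correlation of the selected genes over the selected conditions, minus the same quantity over the complementary conditions. The example matrix is invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_abs_corr(matrix, rows, cols):
    """Mean absolute pairwise correlation of the chosen rows over the chosen columns."""
    profiles = [[matrix[r][c] for c in cols] for r in rows]
    pairs = list(combinations(profiles, 2))
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


def toy_fitness(matrix, rows, cols):
    """Hypothetical score (not the CLiP fitness): inside minus complementary-condition correlation."""
    other_cols = [c for c in range(len(matrix[0])) if c not in cols]
    return mean_abs_corr(matrix, rows, cols) - mean_abs_corr(matrix, rows, other_cols)


# Invented data: over columns 0-3, gene 1 = 2 * gene 0 + 1 and gene 2 = 3 - gene 0
# (a linear pattern); columns 4-6 carry no shared pattern.
expr = [
    [1, 2, 4, 3, 7, 1, 4],
    [3, 5, 9, 7, 2, 6, 5],
    [2, 1, -1, 0, 5, 5, 1],
]
inside = mean_abs_corr(expr, [0, 1, 2], [0, 1, 2, 3])
score = toy_fitness(expr, [0, 1, 2], [0, 1, 2, 3])
print(inside, score)
```

Using the absolute correlation lets negatively related genes (gene 2 here) count as part of the same linear pattern; a score that contrasts the bicluster with its complement rewards column sets where the pattern is specific rather than global.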
algorithms; gene clusters; probability
Pathway genes are a group of genes that work cooperatively in the same pathway, constituting a fundamental functional grouping in a biological process. Identifying pathway genes has been one of the major tasks in understanding biological processes. However, due to the difficulty in characterizing/inferring different types of biological gene relationships, as well as several computational issues arising from dealing with high-dimensional biological data, deducing genes in pathways remains challenging.
Results: In this work, we elucidate higher level gene–gene interactions by evaluating the conditional dependencies between genes, i.e. the relationships between genes after removing the influences of a set of previously known pathway genes. These previously known pathway genes serve as seed genes in our model and will guide the detection of other genes involved in the same pathway. The detailed statistical techniques involve the estimation of a precision matrix whose elements are known to be proportional to partial correlations (i.e. conditional dependencies) between genes under appropriate normality assumptions. Likelihood ratio tests on two forms of precision matrices are further performed to see if a candidate pathway gene is conditionally independent of all the previously known pathway genes. When used effectively, this is a promising approach to recover gene relationships that would have otherwise been missed by standard methods. The advantage of the proposed method is demonstrated using both simulation studies and real datasets. We also demonstrated the importance of taking into account experimental dependencies in the simulation and real data studies.
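The link between a precision matrix and conditional dependence can be illustrated with the standard identity ρ_ij·rest = −ω_ij / √(ω_ii ω_jj). The sketch below applies it to an invented three-gene precision matrix; it does not reproduce the estimation procedure or the likelihood ratio tests of the paper, and the matrix values are assumptions chosen for illustration.

```python
from math import sqrt


def partial_correlations(precision):
    """Convert a precision matrix into a matrix of partial correlations."""
    p = len(precision)
    result = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                result[i][j] = -precision[i][j] / sqrt(precision[i][i] * precision[j][j])
    return result


# Invented precision matrix for a chain gene0 - gene1 - gene2.
# Its inverse (the covariance) is (1/4) * [[3, 2, 1], [2, 4, 2], [1, 2, 3]],
# so gene0 and gene2 are marginally correlated (1/3) even though
# their precision entry is zero.
omega = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
pcor = partial_correlations(omega)
print(pcor[0][1], pcor[0][2])
```

In this toy case gene0 and gene2 are correlated marginally but have zero partial correlation once gene1 is accounted for, which is the kind of conditional independence that a seed-gene analysis is meant to detect.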
Supplementary information: Supplementary data are available at Bioinformatics online.
Moorella thermoacetica ferments glucose to three acetic acids. In the oxidative part of the fermentation, the hexose is converted to 2 acetic acids and 2 CO2 molecules with the formation of 2 NADH and 2 reduced ferredoxin (Fdred2−) molecules. In the reductive part, 2 CO2 molecules are reduced to acetic acid, consuming the 8 reducing equivalents generated in the oxidative part. An open question is how the two parts are electronically connected, since two of the four oxidoreductases involved in acetogenesis from CO2 are NADP specific rather than NAD specific. We report here that the 2 NADPH molecules required for CO2 reduction to acetic acid are generated by the reduction of 2 NADP+ molecules with 1 NADH and 1 Fdred2− catalyzed by the electron-bifurcating NADH-dependent reduced ferredoxin:NADP+ oxidoreductase (NfnAB). The cytoplasmic iron-sulfur flavoprotein was heterologously produced in Escherichia coli, purified, and characterized. The purified enzyme was composed of 30-kDa (NfnA) and 50-kDa (NfnB) subunits in a 1-to-1 stoichiometry. NfnA harbors a [2Fe2S] cluster and flavin adenine dinucleotide (FAD), and NfnB harbors two [4Fe4S] clusters and FAD. M. thermoacetica contains a second electron-bifurcating enzyme. Cell extracts catalyzed the coupled reduction of NAD+ and Fd with 2 H2 molecules. The specific activity of this cytoplasmic enzyme was 3-fold higher in H2-CO2-grown cells than in glucose-grown cells. The function of this electron-bifurcating hydrogenase is not yet clear, since H2-CO2-grown cells additionally contain high specific activities of an NADP+-dependent hydrogenase that catalyzes the reduction of NADP+ with H2. This activity is hardly detectable in glucose-grown cells.
Diabetes and obesity, which confer an increased risk of sudden cardiac death, are associated with cardiomyocyte lipid accumulation and altered cardiac electrical properties, manifested by prolongation of the QRS duration and QT interval. It is difficult to distinguish the contribution of cardiomyocyte lipid accumulation versus the contribution of global metabolic defects to the increased incidence of sudden death and electrical abnormalities.
Methods and Results
In order to study the effects of metabolic abnormalities on arrhythmias without the complex systemic effects of diabetes and obesity, we studied cardiac-specific transgenic mice expressing PPARγ1 via the cardiac α-myosin heavy-chain promoter. The PPARγ-transgenic mice develop abnormal accumulation of intracellular lipids and die as young adults, prior to a significant reduction in systolic function. Using implantable ECG telemeters, we found that these mice have prolongation of the QRS and QT intervals, and spontaneous ventricular arrhythmias, including polymorphic ventricular tachycardia and ventricular fibrillation. Isolated cardiomyocytes demonstrated prolonged action potential duration caused by reduced expression and function of the potassium channels responsible for repolarization. Short-term exposure to pioglitazone, a PPARγ agonist, had no effect on mortality or rhythm in WT mice, but further exacerbated the arrhythmic phenotype and increased the mortality in the PPARγ TG mice.
Our findings support an important link between PPARγ activation, cardiomyocyte lipid accumulation, ion channel remodeling and increased cardiac mortality.
arrhythmia; metabolism; ion channels
Repeated morphine exposure can induce behavioral sensitization. Evidence has shown that the central gamma-aminobutyric acid (GABA) system is involved in morphine dependence. However, the effect of the GABAB receptor agonist baclofen on morphine-induced behavioral sensitization in rats is unclear.
We used a morphine-induced behavioral sensitization model in rats to investigate the effects of baclofen on behavioral sensitization. In addition, dopamine release in the shell of the nucleus accumbens was evaluated using in vivo microdialysis.
The present study demonstrated that morphine challenge (3 mg/kg, s.c.) markedly enhanced locomotor activity following 4 consecutive days of morphine administration and a 3-day withdrawal period, which indicated the expression of morphine sensitization. In addition, chronic treatment with baclofen (2.5, 5 mg/kg) significantly inhibited the development of morphine sensitization. Morphine challenge 3 days after repeated morphine administration also produced a significant increase in extracellular dopamine release in the nucleus accumbens. Furthermore, chronic treatment with baclofen decreased the dopamine release induced by morphine challenge.
Our results indicate that the GABA system plays an important role in morphine sensitization in rats and suggest that behavioral sensitization is a promising model for studying the mechanisms underlying drug abuse.
GABA receptor; Morphine; Sensitization; Dopamine; Nucleus accumbens; Baclofen
Timely intervention for cancer requires knowledge of its earliest genetic aberrations. Sequencing of tumors and their metastases reveals numerous abnormalities occurring late in progression. A means to temporally order aberrations in a single cancer, rather than inferring them from serially acquired samples, would define changes preceding even clinically evident disease. We integrate DNA sequence and copy number information to reconstruct the order of abnormalities as individual tumors evolve for two separate cancer types. We detect vast, unreported expansion of simple mutation sharply demarcated by recombinative loss of the second copy of TP53 in cutaneous squamous cell carcinomas (cSCCs) and serous ovarian adenocarcinomas, in the former surpassing 50 mutations per megabase. In cSCCs, we also report diverse secondary mutations in known and novel oncogenic pathways, illustrating how such expanded mutagenesis directly promotes malignant progression. These results reframe paradigms in which TP53 mutation is required later, to bypass senescence induced by driver oncogenes.
mutation; p53; cancer genetics; genomic; Notch
Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in the Chinese population, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3′ UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in the Chinese population. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in the Chinese population.
The growth arrest and DNA damage–inducible 45 (Gadd45) proteins act in many cellular processes. In the liver, Gadd45b (encoding Gadd45β) is the gene most strongly induced early during both compensatory regeneration and drug-induced hyperplasia. The latter response is associated with the dramatic and rapid hepatocyte growth that follows administration of the xenobiotic TCPOBOP (1,4-bis[2-(3,5)-dichloropyridyloxy] benzene), a ligand of the nuclear receptor constitutive androstane receptor (CAR). Here, we have shown that Gadd45b–/– mice have intact proliferative responses following administration of a single dose of TCPOBOP, but marked growth delays. Moreover, early transcriptional stimulation of CAR target genes was weaker in Gadd45b–/– mice than in wild-type animals, and more genes were downregulated. Gadd45β was then found to have a direct role in transcription by physically binding to CAR, and TCPOBOP treatment caused both proteins to localize to a regulatory element for the CAR target gene cytochrome P450 2b10 (Cyp2b10). Further analysis defined separate Gadd45β domains that mediated binding to CAR and transcriptional activation. Although baseline hepatic expression of Gadd45b was broadly comparable to that of other coactivators, its 140-fold stimulation by TCPOBOP was striking and unique. The induction of Gadd45β is therefore a response that facilitates increased transcription, allowing rapid expansion of liver mass for protection against xenobiotic insults.
It was recently found that the cytoplasmic butyryl-coenzyme A (butyryl-CoA) dehydrogenase-EtfAB complex from Clostridium kluyveri couples the exergonic reduction of crotonyl-CoA to butyryl-CoA with NADH and the endergonic reduction of ferredoxin with NADH via flavin-based electron bifurcation. We report here on a second cytoplasmic enzyme complex in C. kluyveri capable of energetic coupling via this novel mechanism. It was found that the purified iron-sulfur flavoprotein complex NfnAB couples the exergonic reduction of NADP+ with reduced ferredoxin (Fdred) and the endergonic reduction of NADP+ with NADH in a reversible reaction: Fdred2− + NADH + 2 NADP+ + H+ = Fdox + NAD+ + 2 NADPH. The role of this energy-converting enzyme complex in the ethanol-acetate fermentation of C. kluyveri is discussed.
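The NfnAB reaction written above can be checked for balance with simple bookkeeping. The sketch below is an illustration only: the species table uses the conventional formal notation (NAD+ and NADP+ written with charge +1, the reduced forms as neutral hydride carriers, reduced ferredoxin as a two-electron carrier with charge 2−), and the dictionary keys are labels invented for this example.

```python
# Per species: (electrons carried relative to the oxidized form,
#               formal charge as written in the equation,
#               transferable hydrogens)
SPECIES = {
    "Fd_ox": (0, 0, 0),
    "Fd_red2-": (2, -2, 0),
    "NAD+": (0, 1, 0),
    "NADH": (2, 0, 1),
    "NADP+": (0, 1, 0),
    "NADPH": (2, 0, 1),
    "H+": (0, 1, 1),
}


def totals(side):
    """Sum electrons, charge and hydrogens over one side of a reaction."""
    return tuple(
        sum(coef * SPECIES[name][k] for name, coef in side.items())
        for k in range(3)
    )


# Fd_red2- + NADH + 2 NADP+ + H+ = Fd_ox + NAD+ + 2 NADPH
left = {"Fd_red2-": 1, "NADH": 1, "NADP+": 2, "H+": 1}
right = {"Fd_ox": 1, "NAD+": 1, "NADPH": 2}

left_totals = totals(left)
right_totals = totals(right)
print("left :", left_totals)
print("right:", right_totals)
print("balanced:", left_totals == right_totals)
```

Both sides carry four electrons, a net formal charge of +1, and two transferable hydrogens, so the equation is balanced as written: one NADH and one reduced ferredoxin together supply the four electrons needed for two NADPH.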
Cavernous malformations (CMs) in the cerebellopontine angle (CPA) are rare, and most of such CMs reported to date are solid and extend from the internal auditory canal into the CPA. In contrast, cystic CMs that arise in the CPA and do not involve the internal auditory canal and dura of the skull base are extremely rare.
A 50-year-old man presented with vertigo and progressive hearing loss in the right ear. MRI examination revealed a lesion in the CPA with solid and cystic components. Surgery was performed. Well-circumscribed adhesion to cranial nerves, the cerebellum, or the brain stem was noted during surgery. The lesion was totally resected. Pathological examination suggested the lesion to be a CM. At 1-year follow-up, the symptoms at presentation had resolved and no complications had occurred.
Although cystic CMs of the CPA have no established imaging features, a diagnosis of CM may be suspected when a cystic lesion is present in the CPA and does not involve the internal acoustic meatus or the dura mater of the skull base. Skillful microsurgical techniques and monitoring of cranial nerves will secure good outcomes for patients with cystic CMs in the CPA.
It is very rare for a foramen magnum arachnoid cyst to cause compression of the spinal cord and syringomyelia, and little treatment experience is currently available. Here we report the case of a 43-year-old male patient who was admitted to the hospital due to weakness and numbness of all 4 limbs, with difficulty in urination and bowel movement. MRI revealed a foramen magnum arachnoid cyst with associated syringomyelia. Posterior fossa decompression and arachnoid cyst excision were performed. Decompression was fully undertaken during surgery; however, only the posterior wall of the arachnoid cyst was excised, because it was almost impossible to remove the whole arachnoid cyst due to the toughness of the cyst and its tight adhesion to the spinal cord. Three months after the surgery, MRI showed a reduction in the size of the arachnoid cyst, but the syrinx remained. Despite this, the patient's symptoms were markedly improved compared with before surgery. Thus, for the treatment of a foramen magnum arachnoid cyst with compression of the spinal cord and syringomyelia, if the arachnoid cyst cannot be completely excised, as much of it as possible should be excised together with complete decompression of the posterior fossa, which can result in a satisfactory outcome.
foramen magnum; arachnoid cyst; syringomyelia
Reoperation as a result of increased intracranial pressure (ICP) associated with cyst formation in an intracranial tumor resection cavity is a rare clinical condition. We report two cases of reoperation as a result of raised ICP associated with cyst formation in the tumor resection cavity, one arising after glioma resection and the other after meningioma resection. In both cases, a “valve”-like structure was noted intraoperatively in the roof region of the tumor resection cavity. Surgical resection of the “valve”-like structure led to slow regression over several months after the reoperation rather than to immediate disappearance of the cyst. Both cases illustrate that the “valve”-like structure formed in the roof region of the tumor resection cavity may be responsible for cyst formation. Its surgical resection provides good long-term outcomes in such patients, although short-term outcomes are unsatisfactory; we speculate that if the resection of the cortical tissue around the “valve”-like structure is sufficiently wide, recurrence of the structure may be avoided.
UDP-glucuronosyltransferases (Ugts) are a supergene family of phase II drug-metabolizing enzymes that catalyze the conjugation of numerous hydrophobic small molecules with UDP-glucuronic acid, converting them into hydrophilic molecules. Here, we report the identification and cloning of the complete zebrafish Ugt gene repertoire. We found that the zebrafish genome contains 45 Ugt genes that can be divided into three families: Ugt1, Ugt2, and Ugt5. Both Ugt1 and Ugt2 have two unlinked clusters: a and b. The Ugt1a, Ugt1b, Ugt2a, and Ugt2b clusters each contain variable and constant regions, similar to those of the protocadherin (Pcdh), immunoglobulin (Ig), and T-cell receptor (Tcr) clusters. Cloning the full-length coding sequences confirmed that each of the variable exons is separately spliced to the set of constant exons within each zebrafish Ugt cluster. Comparative analyses showed that both a and b clusters of the zebrafish Ugt1 and Ugt2 genes have orthologs in other teleosts, suggesting that they may have resulted from the “fish-specific” whole-genome duplication event. The Ugt5 genes are a novel family of Ugt genes that exist in teleosts and amphibians. Their entire open reading frames are encoded by single large exons. The zebrafish Ugt1, Ugt2, and Ugt5 genes can generate additional transcript diversity through alternative splicing. Based on phylogenetic analyses, we propose that the ancestral tetrapod and teleost Ugt1 clusters contained multiple Ugt1 paralogs. After speciation, these ancestral Ugt1 clusters underwent lineage-specific gene loss and duplication. The ancestral vertebrate Ugt2 cluster also underwent lineage-specific duplication. The intronless Ugt5 open reading frames may be derived from retrotransposition followed by gene duplication. They have expanded dramatically in teleosts and have become the most abundant Ugt family in these lineages.
These findings have interesting implications regarding the molecular evolution of genes with diversified variable exons in vertebrates.
Nanometer silicon dioxide (nano-SiO2) has a wide variety of applications in material sciences, engineering and medicine; however, the potential cell biological and proteomic effects of nano-SiO2 exposure and the toxic mechanisms remain far from clear.
Here, we evaluated the effects of amorphous nano-SiO2 (15-nm, 30-nm SiO2) on cellular viability, cell cycle, apoptosis and protein expression in HaCaT cells by using biochemical and morphological analysis, two-dimensional differential gel electrophoresis (2D-DIGE) as well as mass spectrometry (MS). We found that the cellular viability of HaCaT cells was significantly decreased in a dose-dependent manner after treatment with nano-SiO2 and micro-sized SiO2 particles. The IC50 value (50% concentration of inhibition) was associated with the size of SiO2 particles. Exposure to nano-SiO2 and micro-sized SiO2 particles also induced apoptosis in HaCaT cells in a dose-dependent manner. Furthermore, the smaller the SiO2 particle size, the higher the apoptotic rate of the cells. The proteomic analysis revealed that 16 differentially expressed proteins were induced by SiO2 exposure, and that the expression levels of the differentially expressed proteins were associated with the particle size. The 16 proteins were identified by MALDI-TOF-TOF-MS analysis and could be classified into 5 categories according to their functions: oxidative stress-associated proteins; cytoskeleton-associated proteins; molecular chaperones; energy metabolism-associated proteins; and apoptosis and tumor-associated proteins.
These results showed that nano-SiO2 exposure exerted toxic effects and altered protein expression in HaCaT cells. The data indicated the alterations of the proteins, such as the proteins associated with oxidative stress and apoptosis, could be involved in the toxic mechanisms of nano-SiO2 exposure.
Using transcript profile analysis, we explored the nature of the stem cell niche in roots of maize (Zea mays). Toward assessing a role for specific genes in the establishment and maintenance of the niche, we perturbed the niche and simultaneously monitored the spatial expression patterns of genes hypothesized as essential. Our results allow us to quantify and localize gene activities to specific portions of the niche: to the quiescent center (QC) or the proximal meristem (PM), or to both. The data point to molecular, biochemical and physiological processes associated with the specification and maintenance of the niche, and include reduced expression of metabolism-, redox- and certain cell cycle-associated transcripts in the QC, enrichment of auxin-associated transcripts within the entire niche, controls for the state of differentiation of QC cells, a role for cytokinins specifically in the PM portion of the niche, processes (repair machinery) for maintaining DNA integrity and a role for gene silencing in niche stabilization. To provide additional support for the hypothesized roles of the above-mentioned and other transcripts in niche specification, we overexpressed, in Arabidopsis, homologs of representative genes (eight) identified as highly enriched or reduced in the maize root QC. We conclude that the coordinated changes in expression of auxin-, redox-, cell cycle- and metabolism-associated genes suggest the linkage of gene networks at the level of transcription, thereby providing additional insights into events likely associated with root stem cell niche establishment and maintenance.
Electronic supplementary material
The online version of this article (doi:10.1007/s00425-009-1059-3) contains supplementary material, which is available to authorized users.
Quiescent center; Root; Stem cell; Stem cell niche; Zea mays
Neuregulin1 (NRG1) influences the development of white matter connectivity and is implicated in genetic susceptibility to schizophrenia. The cingulum bundle is a white matter structure implicated in schizophrenia. Its anterior component is especially implicated, as it provides reciprocal connections between brain regions with prominent involvement in the disorder. Abnormalities in the structural integrity of the anterior cingulum in patients with schizophrenia have been reported previously. The present study investigated the potential contribution of NRG1 variation to anterior cingulum abnormalities in participants with schizophrenia.
We studied 31 men with schizophrenia and 36 healthy men using diffusion tensor imaging to investigate the association between fractional anisotropy in the anterior cingulum and a single-nucleotide polymorphism (SNP8NRG221533: rs35753505) of NRG1.
Consistent with previous reports, fractional anisotropy was significantly reduced in the anterior cingulum in the schizophrenia group. Moreover, the results revealed a significant group (schizophrenia, control) by genotype (C/C, T carriers, including CT and TT) interaction between genetic variation in NRG1 and diagnosis of schizophrenia, such that the patients with the T allele for SNP8NRG221533 had significantly decreased anterior cingulum fractional anisotropy compared with patients homozygous for the C allele and healthy controls who were T carriers.
Limitations of our study included the small sample size of the TT subgroup and our use of only fractional anisotropy as an index of myelin integrity. In addition, the use of diffusion tensor imaging acquisition methods limited our ability to study other brain regions that may be involved in schizophrenia.
Our results suggest that NRG1 variation may play a role in the pathophysiology of anterior cingulum abnormalities in patients with schizophrenia.
Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification.
In this study, we tested the feasibility of disease classification by integrating a large number of heterogeneous microarray datasets from public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank ratios across datasets. In addition, we systematically map textual annotations of datasets to concepts in the Unified Medical Language System (UMLS), permitting quantitative analysis of the phenotype "distance" between datasets and automated construction of disease classes. We design a new classification approach named ManiSVM, which integrates Manifold data transformation with SVM learning to exploit the data properties. Using leave-one-dataset-out cross-validation, ManiSVM achieved an overall accuracy of 70.7% (68.6% precision and 76.9% recall), with many disease classes achieving an accuracy higher than 80%.
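One way to see why rank-based ratios can make datasets from different platforms comparable is sketched below. The abstract does not define the log-rank ratio, so the definition used here is an assumption: genes are ranked within each sample, ranks are averaged over the disease and control samples of a dataset, and the log2 ratio of the two mean ranks is taken per gene. Function names and the toy values are invented for illustration.

```python
from math import exp, log2


def ranks(values):
    """Rank genes within one sample (1 = lowest expression); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def mean_ranks(samples):
    """Average each gene's within-sample rank over a group of samples."""
    ranked = [ranks(sample) for sample in samples]
    n_genes = len(samples[0])
    return [sum(r[g] for r in ranked) / len(ranked) for g in range(n_genes)]


def log_rank_ratios(disease_samples, control_samples):
    """Assumed definition: log2(mean disease rank / mean control rank) per gene."""
    disease = mean_ranks(disease_samples)
    control = mean_ranks(control_samples)
    return [log2(d / c) for d, c in zip(disease, control)]


# Toy dataset with four genes (columns) and two samples per group.
disease = [[5.0, 1.0, 3.0, 9.0], [6.0, 2.0, 1.5, 8.0]]
control = [[1.0, 4.0, 3.0, 2.0], [0.5, 5.0, 3.5, 2.5]]
lrr = log_rank_ratios(disease, control)

# The same biology measured on a hypothetical platform with a different,
# strictly increasing intensity scale gives identical log-rank ratios.
disease_b = [[exp(v / 3) for v in s] for s in disease]
control_b = [[exp(v / 3) for v in s] for s in control]
lrr_b = log_rank_ratios(disease_b, control_b)
print(lrr)
print(lrr == lrr_b)
```

Because ranks are unchanged by any strictly increasing rescaling of intensities, vectors built this way depend only on the ordering of genes within a sample, which is the property that allows comparison across platforms in this sketch.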
Our results not only demonstrated the feasibility of the integrated disease classification approach, but also showed that the classification accuracy increases with the number of homogenous training datasets. Thus, the power of the integrative approach will increase with the continuous accumulation of microarray data in public repositories. Our study shows that automated disease diagnosis can be an important and promising application of the enormous amount of costly to generate, yet freely available, public microarray data.
Clustering methods are widely used on gene expression data to categorize genes with similar expression profiles. Finding an appropriate (dis)similarity measure is critical to the analysis. In our study, we developed a new measure for clustering the genes when the key factor is the shape of the profile, and when the expression magnitude should also be accounted for in determining the gene relationship. This is achieved by modeling the shape and magnitude parameters separately in a gene expression profile, and then using the estimated shape and magnitude parameters to define a measure in a new feature space.
We explored several transformation schemes to construct the feature spaces: a space whose features are determined by the mutual differences of the original expression components, a space derived from a parametric covariance matrix, and the principal component space of traditional PCA. The first two are newly proposed, and the last is explored for comparison purposes. The new measures we defined in these feature spaces were employed in a K-means clustering procedure to perform analyses. Applying these algorithms to a simulation dataset, a developing mouse retina SAGE dataset, a small yeast sporulation cDNA dataset, and a maize root Affymetrix microarray dataset, we found that the algorithm associated with the first feature space, named TransChisq, showed clear advantages over other methods.
The proposed TransChisq is very promising in capturing meaningful gene expression clusters. This study also demonstrates the importance of data transformations in defining an efficient distance measure. Our method should provide new insights in analyzing gene expression data. The clustering algorithms are available upon request.
Two Poisson-based distances were developed for SAGE data; their application to simulated and experimental mouse retina data shows that they are more appropriate and reliable for analyzing SAGE data than other commonly used distances or similarity measures.
Serial analysis of gene expression (SAGE) data have been poorly exploited by clustering analysis owing to the lack of appropriate statistical methods that consider their specific properties. We modeled SAGE data by Poisson statistics and developed two Poisson-based distances. Their application to simulated and experimental mouse retina data shows that the Poisson-based distances are more appropriate and reliable for analyzing SAGE data than other commonly used distances or similarity measures such as Pearson correlation or Euclidean distance.
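A Poisson-motivated distance for tag counts can be sketched as a chi-square statistic. The abstract does not spell out the two distances, so the form below is an assumption: a gene's observed counts across libraries are compared with the counts expected if the gene followed the cluster's expression shape at its own overall abundance. The function name and the example counts are hypothetical.

```python
def poisson_chi2_distance(gene_counts, cluster_counts):
    """Chi-square distance between a gene's tag counts and a cluster profile.

    Expected counts take the cluster's shape (its share of tags per library)
    scaled to the gene's own total count. Assumes all cluster counts are positive.
    """
    cluster_total = sum(cluster_counts)
    gene_total = sum(gene_counts)
    distance = 0.0
    for observed, cluster_count in zip(gene_counts, cluster_counts):
        expected = gene_total * cluster_count / cluster_total
        distance += (observed - expected) ** 2 / expected
    return distance


# Hypothetical cluster profile summed over its member genes (three libraries).
cluster = [100, 200, 300]

same_shape_high = poisson_chi2_distance([10, 20, 30], cluster)
same_shape_low = poisson_chi2_distance([1, 2, 3], cluster)
opposite_shape = poisson_chi2_distance([30, 20, 10], cluster)
print(same_shape_high, same_shape_low, opposite_shape)
```

In this sketch two genes with the same shape but tenfold different abundance are both at distance zero from the cluster, whereas the deviation of the reversed profile is weighted by the expected counts, which reflects the Poisson property that variance grows with the mean.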
The vertebrate retina is comprised of seven major cell types that are generated in overlapping but well-defined intervals. To identify genes that might regulate retinal development, gene expression in the developing retina was profiled at multiple time points using serial analysis of gene expression (SAGE). The expression patterns of 1,051 genes that showed developmentally dynamic expression by SAGE were investigated using in situ hybridization. A molecular atlas of gene expression in the developing and mature retina was thereby constructed, along with a taxonomic classification of developmental gene expression patterns. Genes were identified that label both temporal and spatial subsets of mitotic progenitor cells. For each developing and mature major retinal cell type, genes selectively expressed in that cell type were identified. The gene expression profiles of retinal Müller glia and mitotic progenitor cells were found to be highly similar, suggesting that Müller glia might serve to produce multiple retinal cell types under the right conditions. In addition, multiple transcripts that were evolutionarily conserved that did not appear to encode open reading frames of more than 100 amino acids in length (“noncoding RNAs”) were found to be dynamically and specifically expressed in developing and mature retinal cell types. Finally, many photoreceptor-enriched genes that mapped to chromosomal intervals containing retinal disease genes were identified. These data serve as a starting point for functional investigations of the roles of these genes in retinal development and physiology.
Spatial and temporal patterns of expression for over 1000 genes in identified retinal cells invite functional investigations into the role of these genes in development and physiology.