Related Articles

1.  Mathematical Modeling of Corticosteroid Pharmacogenomics in Rat Muscle following Acute and Chronic Methylprednisolone Dosing 
Molecular pharmaceutics  2008;5(2):328-339.
The pharmacogenomic effects of a corticosteroid (CS) were assessed in rat skeletal muscle using microarrays. Adrenalectomized (ADX) rats were treated with methylprednisolone (MPL) by either 50 mg/kg intravenous injection or 7-day 0.3 mg/kg/h infusion through subcutaneously implanted pumps. RNAs extracted from individual rat muscles were hybridized to Affymetrix Rat Genome Genechips. Data mining yielded 653 and 2316 CS-responsive probe sets following MPL bolus and infusion treatments. Of these, 196 genes were controlled by MPL under both dosing conditions. Cluster analysis revealed that 124 probe sets exhibited three typical expression dynamic profiles following acute dosing. Cluster A consisted of up-regulated probe sets which were grouped into five subclusters each exhibiting unique temporal patterns during the infusion. Cluster B comprised down-regulated probe sets which were divided into two subclusters with distinct dynamics during the infusion. Cluster C probe sets exhibited delayed down-regulation under both bolus and infusion conditions. Among those, 104 probe sets were further grouped into subclusters based on their profiles following chronic MPL dosing. Several mathematical models were proposed and adequately captured the temporal patterns for each subcluster. Multiple types of dosing regimens are needed to resolve common determinants of gene regulation as chronic exposure results in unexpected differences in gene expression compared to acute dosing. Pharmacokinetic/pharmacodynamic (PK/PD) modeling provides a quantitative tool for elucidating the complexities of CS pharmacogenomics in skeletal muscle.
PMCID: PMC4196382  PMID: 18271548
Microarray studies; pharmacokinetics; pharmacodynamics; mathematical models; computational biology
2.  Application of Scaling Factors in Simultaneous Modeling of Microarray Data from Diverse Chips 
Pharmaceutical research  2007;24(4):643-649.
Purpose
Microarrays have been utilized in many biological, physiological and pharmacological studies as a high-throughput genomic technique. Several generations of Affymetrix GeneChip® microarrays are widely used in gene expression studies. However, differences in intensities of signals for different probe sets that represent the same gene on various types of Affymetrix chips make comparison of datasets complicated.
Materials and Methods
A power coefficient scaling factor was applied in the pharmacokinetic/pharmacodynamic (PK/PD) modeling to account for differences in probe set sensitivities (i.e., signal intensities). Microarray data from muscle and liver following methylprednisolone 50 mg/kg i.v. bolus and 0.3 mg/kg/h infusion regimens were taken as an exemplar.
Results
The scaling factor applied to the pharmacodynamic output function was used to solve the problem of intensity differences between probe sets. This approach yielded consistent pharmacodynamic parameters for the applied models.
Conclusions
Modeling of pharmacodynamic/pharmacogenomic (PD/PG) data from diverse chips should be performed with caution due to differential probe set intensities. In such circumstances, a power scaling factor can be applied in the modeling.
PMCID: PMC4181592  PMID: 17318415
bioinformatics; computational biology; pharmacodynamics; pharmacogenomics; pharmacokinetics
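The idea of a power scaling factor can be illustrated with a minimal sketch. The abstract does not give the functional form, so the code assumes, hypothetically, that two chips report the same underlying response raised to a chip-specific power, signal = R(t)^γ. The response profile, the function names and the γ values are all illustrative and are not taken from the paper.

```python
import math

def underlying_response(t_hours):
    """Hypothetical fold-change profile shared by both chips (baseline = 1)."""
    return 1.0 + 4.0 * (math.exp(-0.1 * t_hours) - math.exp(-0.5 * t_hours))

def chip_signal(t_hours, gamma):
    """Assumed output function: the shared response raised to a chip-specific power."""
    return underlying_response(t_hours) ** gamma

def estimate_gamma(signal_a, signal_b):
    """Estimate the power relating chip B to chip A.

    Uses least squares through the origin on log-transformed signals,
    since log(b) = gamma * log(a) under the assumed model.
    """
    xs = [math.log(a) for a in signal_a]
    ys = [math.log(b) for b in signal_b]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

times = [0.5, 1, 2, 4, 8, 12, 24]
chip_a = [chip_signal(t, 1.0) for t in times]   # reference chip
chip_b = [chip_signal(t, 0.6) for t in times]   # less sensitive probe set
gamma_hat = estimate_gamma(chip_a, chip_b)
print(round(gamma_hat, 3))
```

In a full PK/PD fit the power would presumably be estimated together with the other model parameters; estimating it separately here only keeps the sketch short.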
3.  Deciphering next-generation pharmacogenomics: an information technology perspective 
Open Biology  2014;4(7):140071.
In the post-genomic era, the rapid evolution of high-throughput genotyping technologies and the increased pace of production of genetic research data are continually prompting the development of appropriate informatics tools, systems and databases as we attempt to cope with the flood of incoming genetic information. Alongside new technologies that serve to enhance data connectivity, emerging information systems should contribute to the creation of a powerful knowledge environment for genotype-to-phenotype information in the context of translational medicine. In the area of pharmacogenomics and personalized medicine, it has become evident that database applications providing important information on the occurrence and consequences of gene variants involved in pharmacokinetics, pharmacodynamics, drug efficacy and drug toxicity will become an integral tool for researchers and medical practitioners alike. At the same time, two fundamental issues are inextricably linked to current developments, namely data sharing and data protection. Here, we discuss high-throughput and next-generation sequencing technology and its impact on pharmacogenomics research. In addition, we present advances and challenges in the field of pharmacogenomics information systems which have in turn triggered the development of an integrated electronic ‘pharmacogenomics assistant’. The system is designed to provide personalized drug recommendations based on linked genotype-to-phenotype pharmacogenomics data, as well as to support biomedical researchers in the identification of pharmacogenomics-related gene variants. The provisioned services are tuned in the framework of a single-access pharmacogenomics portal.
PMCID: PMC4118603  PMID: 25030607
whole-genome sequencing; personalized pharmacogenomics profile; informatics solutions; microattribution; drug metabolism; gene variants
4.  A Microarray Analysis of the Temporal Response of Liver to Methylprednisolone: A Comparative Analysis of Two Dosing Regimens 
Endocrinology  2007;148(5):2209-2225.
Microarray analyses were performed on livers from adrenalectomized male Wistar rats chronically infused with methylprednisolone (MPL) (0.3 mg/kg·h) using Alzet mini-osmotic pumps for periods ranging from 6 h to 7 d. Four control and 40 drug-treated animals were killed at 10 different times during drug infusion. Total RNA preparations from the livers of these animals were hybridized to 44 individual Affymetrix RAE230A gene chips, generating data for 15,967 different probe sets for each chip. A series of three filters were applied sequentially. These filters were designed to eliminate probe sets that were not expressed in the tissue, were not regulated by the drug, or did not meet defined quality control standards. These filters eliminated 13,978 probe sets (87.5%) leaving a remainder of 1989 probe sets for further consideration. We previously described a similar dataset obtained from animals after administration of a single dose of MPL (50 mg/kg given iv). That study involved 16 time points over a 72-h period. A similar filtering schema applied to the single-bolus-dose data-set identified 1519 probe sets as being regulated by MPL. A comparison of datasets from the two different dosing regimens identified 358 genes that were regulated by MPL in response to both dosing regimens. Regulated genes were grouped into 13 categories, mainly on gene product function. The temporal profiles of these common genes were subjected to detailed scrutiny. Examination of temporal profiles demonstrates that current perspectives on the mechanism of glucocorticoid action cannot entirely explain the temporal profiles of these regulated genes.
PMCID: PMC4183266  PMID: 17303664
5.  Microarray analysis of the temporal response of skeletal muscle to methylprednisolone: comparative analysis of two dosing regimens 
Physiological genomics  2007;30(3):282-299.
The transcriptional response of skeletal muscle to chronic corticosteroid exposure was examined over 168 h and compared with the response profiles observed following a single dose of corticosteroid. Male adrenalectomized Wistar rats were given a constant-rate infusion of 0.3 mg·kg⁻¹·h⁻¹ methylprednisolone for up to 7 days via subcutaneously implanted minipumps. Four control and forty drug-treated animals were killed at ten different time points during infusion. Muscle total RNAs were hybridized to 44 individual Affymetrix RAE230A gene chips. Previously, we described a filtration approach for identifying genes of interest in microarray data sets developed from tissues of rats treated with methylprednisolone (MPL) following acute dosing. Here, a similar approach involving a series of three filters was applied sequentially to identify genes of interest. These filters were designed to eliminate probe sets that were not expressed in the tissue, not regulated by the drug, or did not meet defined quality control standards. Filtering eliminated 86% of probe sets, leaving a remainder of 2,316 for further consideration. In a previous study, 653 probe sets were identified as MPL regulated following administration of a single (acute) dose of the drug. Comparison of the two data sets yielded 196 genes identified as regulated by MPL in both dosing regimens. Because of receptor downregulation, it was predicted that genes regulated by receptor-glucocorticoid response element interactions would exhibit tolerance in chronic profiles. However, many genes did not exhibit steroid tolerance, indicating that present perspectives on the mechanism of glucocorticoid action cannot entirely explain all temporal profiles.
PMCID: PMC4186702  PMID: 17473217
glucocorticoids; corticosteroids; Affymetrix gene chips; gene expression; time series
6.  Time Course of Microbiologic Outcome and Gene Expression in Candida albicans during and following In Vitro and In Vivo Exposure to Fluconazole
Pharmacodynamics (PD) considers the relationship between drug exposure and effect. The two factors that have been used to distinguish the PD behaviors of antimicrobials are the impact of concentration on the extent of organism killing and the duration of persistent microbiologic suppression (postantibiotic effect). The goals of these studies were (i) to examine the relationship between antimicrobial PD and gene expression and (ii) to gain insight into the mechanism of fluconazole effects persisting following exposure. Microarrays were used to estimate the transcriptional response of Candida albicans to a supra-MIC fluconazole exposure over time in vitro. Fluconazole at four times the MIC was added to a log-phase C. albicans culture, and cells were collected to determine viable growth and for microarray analyses. We identified differential expression of 18% of all genes for at least one of the time points. More genes were upregulated (n = 1,053 [16%]) than downregulated (n = 174 [3%]). Of genes with known function that were upregulated during exposure, most were related to plasma membrane/cell wall synthesis (18%), stress responses (7%), and metabolism (6%). The categories of downregulated genes during exposure included protein synthesis (15%), DNA synthesis/repair (7%), and transport (7%) genes. The majority of genes identified at the postexposure time points were from the protein (17%) and DNA (7%) synthesis categories. In subsequent studies, three genes (CDR1, CDR2, and ERG11) were examined in greater detail (more concentration and time points) following fluconazole exposure in vitro and in vivo. Expression levels from the in vitro and in vivo studies were congruent. CDR1 and CDR2 transcripts were reduced during in vitro fluconazole exposure and during supra-MIC exposure in vivo. However, in the postexposure period, the mRNA abundance of both pumps increased. ERG11 expression increased during exposure and fell in the postexposure period.
The expression of the three genes responded in a dose-dependent manner. In sum, the microarray data obtained during and following fluconazole exposure identified genes both known and unknown to be affected by this drug class. The expanded in vitro and in vivo expression data set underscores the importance of considering the time course of exposure in pharmacogenomic investigations.
PMCID: PMC1426956  PMID: 16569846
7.  Exploring drug action on Mycobacterium tuberculosis using Affymetrix oligonucleotide genechips 
DNA microarrays have rapidly emerged as an important tool for Mycobacterium tuberculosis research. While the microarray approach has generated valuable information, a recent survey has found a lack of correlation among the microarray data produced by different laboratories on related issues, raising a concern about the credibility of research findings. The Affymetrix oligonucleotide array has been shown to be more reliable for interrogating changes in gene expression than other platforms. However, this type of array system has not been applied to the pharmacogenomic study of M. tuberculosis. The goal here was to explore the strength of the Affymetrix array system for monitoring drug-induced gene expression in M. tuberculosis, compare with other related studies, and conduct cross-platform analysis. The genome-wide gene expression profiles of M. tuberculosis in response to drug treatments including INH (isoniazid) and ethionamide were obtained using the Affymetrix array system. Up-regulated or down-regulated genes were identified through bioinformatic analysis of the microarray data derived from the hybridization of RNA samples and gene probes. Based on the Affymetrix system, our method identified all drug-induced genes reported in the original reference work as well as some other genes that have not been recognized previously under the same drug treatment. For instance, the Affymetrix system revealed that Rv2524c (fas) was induced by both INH and ethionamide under the given levels of concentration, as suggested by most of the probe sets implementing this gene sequence. This finding is contradictory to previous observations that the expression of fas is not changed by INH treatment. This example illustrates that the determination of expression change for certain genes is probe-dependent, and the appropriate use of multiple probe-set representation is an advantage with the Affymetrix system. 
Our data also suggest that whereas the up-regulated gene expression pattern reflects the drug’s mode of action, the down-regulated pattern is largely non-specific. According to our analysis, the Affymetrix array system is a reliable tool for studying the pharmacogenomics of M. tuberculosis and lends itself well to the research and development of anti-TB drugs.
PMCID: PMC1557687  PMID: 16246625
Tuberculosis; Drug; Microarray; Genome
8.  Corticosteroid-regulated genes in rat kidney: mining time series array data 
Kidney is a major target for adverse effects associated with corticosteroids. A microarray dataset was generated to examine changes in gene expression in rat kidney in response to methylprednisolone. Four control and 48 drug-treated animals were killed at 16 times after drug administration. Kidney RNA was used to query 52 individual Affymetrix chips, generating data for 15,967 different probe sets for each chip. Mining techniques applicable to time series data that identify drug-regulated changes in gene expression were applied. Four sequential filters eliminated probe sets that were not expressed in the tissue, not regulated by drug, or did not meet defined quality control standards. These filters eliminated 14,890 probe sets (94%) from further consideration. Application of judiciously chosen filters is an effective tool for data mining of time series datasets. The remaining data can then be further analyzed by clustering and mathematical modeling. Initial analysis of this filtered dataset identified a group of genes whose pattern of regulation was highly correlated with prototype corticosteroid enhanced genes. Twenty genes in this group, as well as selected genes exhibiting either downregulation or no regulation, were analyzed for 5′ GRE half-sites conserved across species. In general, the results support the hypothesis that the existence of conserved DNA binding sites can serve as an important adjunct to purely analytic approaches to clustering genes into groups with common mechanisms of regulation. This dataset, as well as similar datasets on liver and muscle, is available online in a format amenable to further analysis by others.
PMCID: PMC3752664  PMID: 15985454
data mining; gene arrays; glucocorticoids; pharmacogenomics; evolutionary conservation
9.  Pharmacogenomics: a systems approach 
Pharmacogenetics and pharmacogenomics involve the study of the role of inheritance in individual variation in drug response, a phenotype that varies from potentially life-threatening adverse drug reactions to equally serious lack of therapeutic efficacy. Pharmacogenetics-pharmacogenomics represents a major component of the movement to ‘individualized medicine’. Pharmacogenetic studies originally focused on monogenic traits, often involving genetic variation in drug metabolism. However, contemporary studies increasingly involve entire ‘pathways’ that include both pharmacokinetics (PKs; factors that influence the concentration of a drug reaching its target(s)) and pharmacodynamics (PDs; factors associated with the drug target(s)), as well as genome-wide approaches. The convergence of advances in pharmacogenetics with rapid developments in human genomics has resulted in the evolution of pharmacogenetics into pharmacogenomics. At the same time, studies of drug response are expanding beyond genomics to encompass pharmacotranscriptomics and pharmacometabolomics to become a systems-based discipline. This discipline is also increasingly moving across the ‘translational interface’ into the clinic and is being incorporated into the drug development process and governmental regulation of that process. The article will provide an overview of the development of pharmacogenetics-pharmacogenomics, the scientific advances that have contributed to the continuing evolution of this discipline, the incorporation of transcriptomic and metabolomic data into attempts to understand and predict variation in drug response phenotypes as well as challenges associated with the ‘translation’ of this important aspect of biomedical science into the clinic.
PMCID: PMC3894835  PMID: 20836007
10.  Pharmacogenomic Responses of Rat Liver to Methylprednisolone: An Approach to Mining a Rich Microarray Time Series 
The AAPS journal  2005;7(1):E156-E194.
A data set was generated to examine global changes in gene expression in rat liver over time in response to a single bolus dose of methylprednisolone. Four control animals and 43 drug-treated animals were humanely killed at 16 different time points following drug administration. Total RNA preparations from the livers of these animals were hybridized to 47 individual Affymetrix RU34A gene chips, generating data for 8799 different probe sets for each chip. Data mining techniques that are applicable to gene array time series data sets in order to identify drug-regulated changes in gene expression were applied to this data set. A series of 4 sequentially applied filters were developed that were designed to eliminate probe sets that were not expressed in the tissue, were not regulated by the drug treatment, or did not meet defined quality control standards. These filters eliminated 7287 probe sets of the 8799 total (82%) from further consideration. Application of judiciously chosen filters is an effective tool for data mining of time series data sets. The remaining data can then be further analyzed by clustering and mathematical modeling techniques.
PMCID: PMC2607485  PMID: 16146338
Data mining; gene arrays; glucocorticoids; mathematical modeling; pharmacogenomics
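The sequential filtering described in this abstract can be sketched as follows. This is only an illustration of applying filters one after another and counting what each removes; the probe-set records, the thresholds (a signal of 50, a two-fold change) and the function name `apply_filters` are hypothetical and are not the criteria used in the study.

```python
def apply_filters(probe_sets, filters):
    """Apply filters in order; return survivors and the count removed at each step."""
    remaining = list(probe_sets)
    removed = []
    for name, keep in filters:
        kept = [p for p in remaining if keep(p)]
        removed.append((name, len(remaining) - len(kept)))
        remaining = kept
    return remaining, removed

# Hypothetical probe-set records: a signal time series plus a quality flag.
probe_sets = [
    {"id": "ps1", "signal": [120, 480, 300, 130], "quality_ok": True},   # regulated
    {"id": "ps2", "signal": [5, 6, 4, 5], "quality_ok": True},           # not expressed
    {"id": "ps3", "signal": [200, 210, 195, 205], "quality_ok": True},   # not regulated
    {"id": "ps4", "signal": [100, 400, 90, 380], "quality_ok": False},   # fails QC
]

# Illustrative criteria only; each filter sees what the previous one kept.
filters = [
    ("expressed", lambda p: max(p["signal"]) >= 50),
    ("regulated", lambda p: max(p["signal"]) / min(p["signal"]) >= 2.0),
    ("quality", lambda p: p["quality_ok"]),
]

survivors, removed = apply_filters(probe_sets, filters)
fraction_removed = 1 - len(survivors) / len(probe_sets)
print([p["id"] for p in survivors], removed, fraction_removed)
```

Because the filters run in sequence, the per-step counts depend on their order, while the final set of survivors does not.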
12.  Analysis of Pharmacokinetics, Pharmacodynamics, and Pharmacogenomics Data Sets Using VizStruct, A Novel Multidimensional Visualization Technique 
Pharmaceutical research  2004;21(5):777-780.
Data visualization techniques for the pharmaceutical sciences have not been extensively investigated. The purpose of this study was to evaluate the usefulness of VizStruct, a multidimensional visualization tool, for applications in pharmacokinetics, pharmacodynamics, and pharmacogenomics.
The VizStruct tool uses the first harmonic of the discrete Fourier transform to map multidimensional data to two dimensions for visualization. The mapping was used to visualize several published pharmacokinetic, pharmacodynamic, and pharmacogenomic data sets. The VizStruct approach was evaluated using simulated population pharmacokinetics data sets, the data from Dalen and colleagues (Clin. Pharmacol. Ther. 63:444−452, 1998) on the kinetics of nortriptyline and its 10-hydroxynortriptyline metabolite in subjects with differing numbers of copies of the CYP2D6 gene, and the gene expression profiling data of Bohen and colleagues (Proc. Natl. Acad. Sci. USA 100:1926−1930, 2003) on follicular lymphoma patients responsive and nonresponsive to rituximab.
The VizStruct mapping preserves the key characteristics of multidimensional data in two dimensions in a manner that facilitates visualization. The mapping is computationally efficient and can be used for cluster detection and class prediction in pharmaceutical data sets. The VizStruct visualization succinctly summarized the salient similarities and differences in the nortriptyline and 10-hydroxynortriptyline pharmacokinetic profiles in subjects with increasing number of CYP2D6 gene copies. In the simulated population pharmacokinetic data sets, it was capable of discriminating the subtle differences between pharmacokinetic profiles derived from 1- and 2-compartment models with the same area under the curve. The two-dimensional VizStruct mapping computed from a subset of 102 informative genes from the Bohen and colleagues data set effectively separated the rituximab responder, rituximab nonresponder, and control subject groups.
The VizStruct approach is a computationally efficient and effective approach for visualizing complex, multidimensional data sets. It could have many useful applications in the pharmaceutical sciences.
PMCID: PMC2607483  PMID: 15180333
microarray; pharmacodynamics; pharmacogenomic modeling; pharmacokinetics; visualization algorithms
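The mapping named in this abstract, the first harmonic of the discrete Fourier transform, can be written in a few lines. The sketch below computes F1 = Σ x_k·exp(−2πik/n) for an n-point profile and uses its real and imaginary parts as the two plotting coordinates. The example profiles and function names are hypothetical, and any normalization or preprocessing the actual VizStruct tool applies is not reproduced here.

```python
import cmath

def first_harmonic(values):
    """Return the first DFT harmonic of a sequence as a complex number."""
    n = len(values)
    return sum(v * cmath.exp(-2j * cmath.pi * k / n) for k, v in enumerate(values))

def to_2d(values):
    """Map an n-dimensional profile to a 2D point (real part, imaginary part)."""
    f1 = first_harmonic(values)
    return (f1.real, f1.imag)

# Hypothetical concentration-time profiles sampled at the same four time points.
profiles = {
    "early_peak": [8.0, 4.0, 2.0, 1.0],
    "late_peak": [1.0, 2.0, 4.0, 8.0],
    "flat": [3.0, 3.0, 3.0, 3.0],
}

points = {name: to_2d(values) for name, values in profiles.items()}
for name, (x, y) in points.items():
    print(name, round(x, 6), round(y, 6))
```

Profiles with different shapes land at different points in the plane, while a constant profile maps to the origin, which gives some intuition for why the mapping can support cluster detection.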
13.  Current clinical evidence on pioglitazone pharmacogenomics 
Pioglitazone is the most widely used thiazolidinedione and acts as an insulin-sensitizer through activation of the Peroxisome Proliferator-Activated Receptor-γ (PPARγ). Pioglitazone is approved for use in the management of type 2 diabetes mellitus (T2DM), but its use in other therapeutic areas is increasing due to pleiotropic effects. In this hypothesis article, the current clinical evidence on pioglitazone pharmacogenomics is summarized and related to variability in pioglitazone response. How genetic variation in the human genome affects the pharmacokinetics and pharmacodynamics of pioglitazone was examined. For pharmacodynamic effects, hypoglycemic and anti-atherosclerotic effects, risks of fracture or edema, and the increase in body mass index in response to pioglitazone based on genotype were examined. The genes CYP2C8 and PPARG are the most extensively studied to date and selected polymorphisms contribute to respective variability in pioglitazone pharmacokinetics and pharmacodynamics. We hypothesized that genetic variation in pioglitazone pathway genes contributes meaningfully to the clinically observed variability in drug response. To test the hypothesis that genetic variation in PPARG associates with variability in pioglitazone response, we conducted a meta-analysis to synthesize the currently available data on the PPARG p.Pro12Ala polymorphism. The results showed that PPARG 12Ala carriers had a more favorable change in fasting blood glucose from baseline as compared to patients with the wild-type Pro12Pro genotype (p = 0.018). Unfortunately, findings for many other genes lack replication in independent cohorts to confirm association; further studies are needed. Also, the biological functionality of these polymorphisms is unknown. 
Based on current evidence, we propose that pharmacogenomics may provide an important tool to individualize pioglitazone therapy and better optimize therapy in patients with T2DM or other conditions for which pioglitazone is being used.
PMCID: PMC3840328  PMID: 24324437
pioglitazone; thiazolidinedione; CYP2C8; cytochrome P450; PPAR; pharmacokinetics; pharmacodynamics
14.  Circadian Variations in Liver Gene Expression: Relationships to Drug Actions 
Chronopharmacology is an important but under-explored aspect of therapeutics. Rhythmic variations in biological processes can influence drug action, including pharmacodynamic responses, due to circadian variations in the availability or functioning of drug targets. We hypothesized that global gene expression analysis can be useful in the identification of circadian regulated genes involved in drug action. Circadian variations in gene expression in rat liver were explored using Affymetrix gene arrays. A rich time series involving animals analyzed at 18 time points within the 24 hour cycle was generated. Of the more than 15,000 probe sets on these arrays, 265 exhibited oscillations with a 24 hour frequency. Cluster analysis yielded 5 distinct circadian clusters, with approximately two-thirds of the transcripts reaching maximum expression during the animal’s dark/active period. Of the 265 probe sets, 107 of potential therapeutic importance were identified. The expression levels of clock genes were also investigated in this study. Five clock genes exhibited circadian variation in liver, and data suggest that these genes may also be regulated by corticosteroids.
PMCID: PMC2561907  PMID: 18562560
15.  Discovering Biological Progression Underlying Microarray Samples 
PLoS Computational Biology  2011;7(4):e1001123.
In biological systems that undergo processes such as differentiation, a clear concept of progression exists. We present a novel computational approach, called Sample Progression Discovery (SPD), to discover patterns of biological progression underlying microarray gene expression data. SPD assumes that individual samples of a microarray dataset are related by an unknown biological process (i.e., differentiation, development, cell cycle, disease progression), and that each sample represents one unknown point along the progression of that process. SPD aims to organize the samples in a manner that reveals the underlying progression and to simultaneously identify subsets of genes that are responsible for that progression. We demonstrate the performance of SPD on a variety of microarray datasets that were generated by sampling a biological process at different points along its progression, without providing SPD any information of the underlying process. When applied to a cell cycle time series microarray dataset, SPD was not provided any prior knowledge of samples' time order or of which genes are cell-cycle regulated, yet SPD recovered the correct time order and identified many genes that have been associated with the cell cycle. When applied to B-cell differentiation data, SPD recovered the correct order of stages of normal B-cell differentiation and the linkage between preB-ALL tumor cells with their cell origin preB. When applied to mouse embryonic stem cell differentiation data, SPD uncovered a landscape of ESC differentiation into various lineages and genes that represent both generic and lineage specific processes. When applied to a prostate cancer microarray dataset, SPD identified gene modules that reflect a progression consistent with disease stages. 
SPD may be best viewed as a novel tool for synthesizing biological hypotheses because it provides a likely biological progression underlying a microarray dataset and, perhaps more importantly, the candidate genes that regulate that progression.
Author Summary
We present a novel computational approach, Sample Progression Discovery (SPD), to discover biological progression underlying a microarray dataset. In contrast to the majority of microarray data analysis methods which identify differences between sample groups (normal vs. cancer, treated vs. control), SPD aims to identify an underlying progression among individual samples, both within and across sample groups. We validated SPD's ability to discover biological progression using datasets of cell cycle, B-cell differentiation, and mouse embryonic stem cell differentiation. We view SPD as a hypothesis generation tool when applied to datasets where the progression is unclear. For example, when applied to a microarray dataset of cancer samples, SPD assumes that the cancer samples collected from individual patients represent different stages during an intrinsic progression underlying cancer development. The inferred relationship among the samples may therefore indicate a trajectory or hierarchy of cancer progression, which serves as a hypothesis to be tested. SPD is not limited to microarray data analysis, and can be applied to a variety of high-dimensional datasets. We implemented SPD using MATLAB graphical user interface, which is available at
PMCID: PMC3077357  PMID: 21533210
16.  Mining biological information from 3D short time-series gene expression data: the OPTricluster algorithm 
BMC Bioinformatics  2012;13:54.
Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space.
We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and to Brassica napus whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples.
Our analysis showed that OPTricluster generally outperforms other well known clustering algorithms such as the TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data.
PMCID: PMC3376030  PMID: 22475802
17.  Chapter 7: Pharmacogenomics 
PLoS Computational Biology  2012;8(12):e1002817.
There is great variation in drug-response phenotypes, and a “one size fits all” paradigm for drug delivery is flawed. Pharmacogenomics is the study of how human genetic information impacts drug response, and it aims to improve efficacy and reduce side effects. In this article, we provide an overview of pharmacogenetics, including pharmacokinetics (PK), pharmacodynamics (PD), gene and pathway interactions, and off-target effects. We describe methods for discovering genetic factors in drug response, including genome-wide association studies (GWAS), expression analysis, and other methods such as chemoinformatics and natural language processing (NLP). We cover the practical applications of pharmacogenomics both in the pharmaceutical industry and in a clinical setting. In drug discovery, pharmacogenomics can be used to aid lead identification, anticipate adverse events, and assist in drug repurposing efforts. Moreover, pharmacogenomic discoveries show promise as important elements of physician decision support. Finally, we consider the ethical, regulatory, and reimbursement challenges that remain for the clinical implementation of pharmacogenomics.
PMCID: PMC3531317  PMID: 23300409
18.  Inferring biochemical reaction pathways: the case of the gemcitabine pharmacokinetics 
BMC Systems Biology  2012;6:51.
The representation of a biochemical system as a network is the precursor of any mathematical model of the processes driving the dynamics of that system. Pharmacokinetics uses mathematical models to describe the interactions among a drug, its metabolites, and their targets and, through the simulation of these models, predicts drug levels and/or the dynamic behavior of drug entities in the body. Therefore, the development of computational techniques for inferring the interaction network of the drug entities and its kinetic parameters from observational data is attracting great interest among pharmacologists. Network inference is a set of mathematical procedures that deduce the structure of a model from the experimental data associated with the nodes of the interaction network. In this paper, we deal with the inference of a pharmacokinetic network from the concentrations of the drug and its metabolites observed at discrete time points.
The method of network inference presented in this paper is inspired by the theory of time-lagged correlation inference for deducing the interaction network, and by a maximum likelihood approach for estimating the kinetic parameters of the network. Both network inference and parameter estimation have been designed specifically to identify systems of biotransformations, at the biochemical level, from noisy time-resolved experimental data. We use our inference method to deduce the metabolic pathway of gemcitabine. The inputs to our inference algorithm are the experimental time series of the concentrations of gemcitabine and its metabolites. The output is the set of reactions of the metabolic network of gemcitabine.
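As a rough illustration of the time-lagged correlation idea, the sketch below correlates one concentration series with a delayed copy of another and picks the lag with the strongest correlation. It is a minimal sketch under stated assumptions, not the authors' inference method: the function names and the toy series are hypothetical, sampling is assumed to be uniform in time, and the maximum likelihood estimation of kinetic parameters is not covered.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_correlation(x, y, lag):
    """Correlation between x(t) and y(t + lag) for a non-negative lag."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag with the largest absolute lagged correlation."""
    return max(range(max_lag + 1),
               key=lambda k: abs(lagged_correlation(x, y, k)))


# Hypothetical toy series: `product` repeats `parent` two time steps later.
parent = [0, 1, 4, 2, 7, 3, 8, 5, 1, 6]
product = [0, 0] + parent[:-2]
lag = best_lag(parent, product, max_lag=3)
```

A strong correlation at a positive lag suggests, but does not prove, that the first species is upstream of the second; on noisy data a significance threshold would be needed before an edge is added to the network.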
Time-lagged correlation based inference, paired with a probabilistic model of parameter inference from metabolite time series, allows the identification of the microscopic pharmacokinetics and pharmacodynamics of a drug with minimal a priori knowledge. In fact, the inference model presented in this paper is completely unsupervised. It takes as input the time series of the concentrations of the parent drug and its metabolites. The method, applied to the case study of gemcitabine pharmacokinetics, shows good accuracy and sensitivity.
PMCID: PMC3536593  PMID: 22640931
19.  Bayesian hierarchical clustering for microarray time series data with replicates and outlier measurements 
BMC Bioinformatics  2011;12:399.
Post-genomic molecular biology has resulted in an explosion of data, providing measurements for large numbers of genes, proteins and metabolites. Time series experiments have become increasingly common, necessitating the development of novel analysis tools that capture the resulting data structure. Outlier measurements at one or more time points present a significant challenge, while potentially valuable replicate information is often ignored by existing techniques.
We present a generative model-based Bayesian hierarchical clustering algorithm for microarray time series that employs Gaussian process regression to capture the structure of the data. By using a mixture model likelihood, our method permits a small proportion of the data to be modelled as outlier measurements, and adopts an empirical Bayes approach which uses replicate observations to inform a prior distribution of the noise variance. The method automatically learns the optimum number of clusters and can incorporate non-uniformly sampled time points. Using a wide variety of experimental data sets, we show that our algorithm consistently yields higher quality and more biologically meaningful clusters than current state-of-the-art methodologies. We highlight the importance of modelling outlier values by demonstrating that noisy genes can be grouped with other genes of similar biological function. We demonstrate the importance of including replicate information, which we find enables the discrimination of additional distinct expression profiles.
By incorporating outlier measurements and replicate values, this clustering algorithm for time series microarray data provides a step towards a better treatment of the noise inherent in measurements from high-throughput genomic technologies. Timeseries BHC is available as part of the R package 'BHC' (version 1.5), which is available for download from Bioconductor (version 2.9 and above) via
PMCID: PMC3228548  PMID: 21995452
20.  Systematic Pharmacogenomics Analysis of a Malay Whole Genome: Proof of Concept for Personalized Medicine 
PLoS ONE  2013;8(8):e71554.
With a higher throughput and lower cost in sequencing, second generation sequencing technology has immense potential for translation into clinical practice and in the realization of pharmacogenomics based patient care. The systematic analysis of whole genome sequences to assess patient to patient variability in pharmacokinetics and pharmacodynamics responses towards drugs would be the next step in future medicine in line with the vision of personalizing medicine.
Genomic DNA obtained from a 55-year-old, self-declared healthy, anonymous male of Malay descent was sequenced. The subject's mother died of lung cancer, and the father had a history of schizophrenia and died at the age of 65. A systematic, intuitive computational workflow/pipeline integrating a custom algorithm in tandem with large datasets of variant annotations and gene functions for genetic variations with pharmacogenomic impact was developed. A comprehensive pathway map of drug transport, metabolism and action was used as a template to map non-synonymous variations with potential functional consequences.
Principal Findings
Over 3 million known variations and 100,898 novel variations in the Malay genome were identified. Further in-depth pharmacogenetics analysis revealed a total of 607 unique variants in 563 proteins, with the eventual identification of 4 drug transport genes, 2 drug metabolizing enzyme genes and 33 target genes harboring deleterious SNVs involved in pharmacological pathways, which could have a potential role in clinical settings.
The current study successfully unravels the potential of personal genome sequencing in understanding the functionally relevant variations with potential influence on drug transport, metabolism and differential therapeutic outcomes. These will be essential for realizing personalized medicine through the use of a comprehensive computational pipeline for systematic data mining and analysis.
PMCID: PMC3751891  PMID: 24009664
21.  Searching for Pharmacogenomic Markers: The Synergy between Omic and Hypothesis-Driven Research 
Disease Markers  2002;17(2):77-88.
With 35,000 genes and hundreds of thousands of protein states to identify, correlate, and understand, it no longer suffices to rely on studies of one gene, gene product, or process at a time. We have entered the “omic” era in biology. But large-scale omic studies of cellular molecules in aggregate rarely can answer interesting questions without the assistance of information from traditional hypothesis-driven research. The two types of science are synergistic. A case in point is the set of pharmacogenomic studies that we and our collaborators have done with the 60 human cancer cell lines of the National Cancer Institute’s drug discovery program. Those cells (the NCI-60) have been characterized pharmacologically with respect to their sensitivity to > 70,000 chemical compounds. We are further characterizing them at the DNA, RNA, protein, and functional levels. Our major aim is to identify pharmacogenomic markers that can aid in drug discovery and design, as well as in individualization of cancer therapy. The bioinformatic and chemoinformatic challenges of this study have demanded novel methods for analysis and visualization of high-dimensional data. Included are the color-coded “clustered image map” and also the MedMiner program package, which captures and organizes the biomedical literature on gene-gene and gene-drug relationships. Microarray transcript expression studies of the 60 cell lines reveal, for example, a gene-drug correlation with potential clinical implications – that between the asparagine synthetase gene and the enzyme-drug L-asparaginase in ovarian cancer cells.
PMCID: PMC3851640  PMID: 11673654
microarray; genomics; proteomics; omics; cancer; cell line; pharmacology; pharmacogenomics; molecular marker; cancer therapy; clustered image map; MedMiner
22.  Dynamics of hepatic gene expression profile in a rat cecal ligation and puncture model 
The Journal of Surgical Research  2011;176(2):583-600.
Sepsis remains a major clinical challenge in intensive care units. The difficulty in developing new and more effective treatments for sepsis exemplifies our incomplete understanding of its underlying pathophysiology. One of the more widely used rodent models for studying polymicrobial sepsis is cecal ligation and puncture (CLP). While a number of CLP studies have investigated the ensuing systemic inflammatory response, they usually focus on a single time point post CLP and therefore fail to describe the dynamics of the response. Furthermore, previous studies mostly use surgery without infection (herein referred to as Sham CLP, SCLP) as a control for the CLP model; however, SCLP represents an aseptic injurious event that also stimulates a systemic inflammatory response. Thus, there is a need to better understand the dynamics and expression patterns of both injury- and sepsis-induced gene expression alterations to identify potential regulatory targets. In this direction, we characterized the response of the liver within the first 24 h in a rat model of SCLP and CLP using a time series of microarray gene expression data.
Rats were randomly divided into three groups: sham, SCLP and CLP. Rats in the CLP group were subjected to laparotomy, cecal ligation and puncture, while those in the SCLP group were subjected to similar procedures without cecal ligation and puncture. Animals were saline resuscitated and sacrificed at defined time points (0, 2, 4, 8, 16, and 24 h). Liver tissues were explanted and analyzed for their gene expression profiles using microarray technology. Unoperated animals (Sham) served as negative controls. After identifying differentially expressed probesets between sham and SCLP or CLP conditions over time, the concatenated data sets corresponding to these differentially expressed probesets in sham and SCLP or CLP groups were combined and analyzed using a “consensus clustering” approach. Promoters of genes that share common characteristics were extracted and compared with gene batteries comprised of co-expressed genes in order to identify putative transcription factors that could be responsible for the co-regulation of those genes.
The SCLP/CLP genes whose expression patterns significantly changed compared to sham over time were identified, clustered, and finally analyzed for pathway enrichment. Our results indicate that both CLP and SCLP triggered the activation of a pro-inflammatory response, enhanced synthesis of acute-phase proteins, increased metabolism and tissue damage markers. Genes triggered by CLP which can be directly linked to bacteria removal functions were absent in SCLP injury. In addition, genes relevant to oxidative stress induced damage were unique to CLP injury, which may be due to the increased severity of CLP injury vs. SCLP injury. Pathway enrichment identified pathways with similar functionality but different dynamics in the two injury models, indicating that the functions controlled by those pathways are under the influence of different transcription factors and regulatory mechanisms. Putatively identified transcription factors, notably including CREB, NF-KB and STAT, were obtained through analysis of the promoter regions in the SCLP/CLP genes. Our results show that while transcription factors such as NF-KB, HOMF, and GATA were common in both injuries for the IL-6 signaling pathway, there were many other transcription factors associated with that pathway which were unique to CLP, including FKHD, HESF and IRFF. There were 17 transcription factors that were identified as important in at least 2 pathways in the CLP injury, but only 7 transcription factors with that property in the SCLP injury. This also supports the hypothesis of unique regulatory modules that govern the pathways present in both the CLP and SCLP response.
By using microarrays to assess multiple genes in a high-throughput manner, we demonstrate that an inflammatory response involving different dynamics and different genes is triggered by SCLP and CLP. From our analysis of the CLP data, the key characteristics of sepsis are a pro-inflammatory response which drives hypermetabolism, immune cell activation, and damage from oxidative stress. This contrasts with SCLP, which triggers a modified inflammatory response leading to no immune cell activation, decreased detoxification potential, and hypermetabolism. Many of the identified transcription factors that drive the CLP-induced response are not found in the SCLP group, suggesting that SCLP and CLP induce different types of inflammatory responses via different regulatory pathways.
PMCID: PMC3368040  PMID: 22381171
sepsis; trauma; gene expression; transcription factor; microarray; inflammation; liver
23.  Epidermal Growth Factor Receptor Mutation (EGFR) Testing for Prediction of Response to EGFR-Targeting Tyrosine Kinase Inhibitor (TKI) Drugs in Patients with Advanced Non-Small-Cell Lung Cancer 
Executive Summary
In February 2010, the Medical Advisory Secretariat (MAS) began work on evidence-based reviews of the literature surrounding three pharmacogenomic tests. This project came about when Cancer Care Ontario (CCO) asked MAS to provide evidence-based analyses on the effectiveness and cost-effectiveness of three oncology pharmacogenomic tests currently in use in Ontario.
Evidence-based analyses have been prepared for each of these technologies. These have been completed in conjunction with internal and external stakeholders, including a Provincial Expert Panel on Pharmacogenetics (PEPP). Within the PEPP, subgroup committees were developed for each disease area. For each technology, an economic analysis was also completed by the Toronto Health Economics and Technology Assessment Collaborative (THETA) and is summarized within the reports.
The following reports can be publicly accessed at the MAS website at: or at
Gene Expression Profiling for Guiding Adjuvant Chemotherapy Decisions in Women with Early Breast Cancer: An Evidence-Based Analysis
Epidermal Growth Factor Receptor Mutation (EGFR) Testing for Prediction of Response to EGFR-Targeting Tyrosine Kinase Inhibitor (TKI) Drugs in Patients with Advanced Non-Small-Cell Lung Cancer: an Evidence-Based Analysis
K-RAS testing in Treatment Decisions for Advanced Colorectal Cancer: an Evidence-Based Analysis
The Medical Advisory Secretariat undertook a systematic review of the evidence on the clinical effectiveness and cost-effectiveness of epidermal growth factor receptor (EGFR) mutation testing compared with no EGFR mutation testing to predict response to tyrosine kinase inhibitors (TKIs), gefitinib (Iressa®) or erlotinib (Tarceva®) in patients with advanced non-small cell lung cancer (NSCLC).
Clinical Need: Target Population and Condition
With an estimated 7,800 new cases and 7,000 deaths last year, lung cancer is the leading cause of cancer deaths in Ontario. Those with unresectable or advanced disease are commonly treated with concurrent chemoradiation or platinum-based combination chemotherapy. Although response rates to cytotoxic chemotherapy for advanced NSCLC are approximately 30 to 40%, all patients eventually develop resistance and have a median survival of only 8 to 10 months. Treatment for refractory or relapsed disease includes single-agent treatment with docetaxel, pemetrexed or EGFR-targeting TKIs (gefitinib, erlotinib). TKIs disrupt EGFR signaling by competing with adenosine triphosphate (ATP) for the binding sites at the tyrosine kinase (TK) domain, thus inhibiting the phosphorylation and activation of EGFRs and the downstream signaling network. Gefitinib and erlotinib have been shown to be either non-inferior or superior to chemotherapy in the first- or second-line setting (gefitinib), or superior to placebo in the second- or third-line setting (erlotinib).
Certain patient characteristics (adenocarcinoma, non-smoking history, Asian ethnicity, female gender) predict for better survival benefit and response to therapy with TKIs. In addition, the current body of evidence shows that somatic mutations in the EGFR gene are the most robust biomarkers for EGFR-targeting therapy selection. Drugs used in this therapy, however, can be costly, up to C$ 2000 to C$ 3000 per month, and they have only approximately a 10% chance of benefiting unselected patients. For these reasons, the predictive value of EGFR mutation testing for TKIs in patients with advanced NSCLC needs to be determined.
The Technology: EGFR mutation testing
EGFR gene sequencing by polymerase chain reaction (PCR) assays is the most widely used method for EGFR mutation testing. PCR assays can be performed at pathology laboratories across Ontario. According to experts in the province, sequencing is not currently done in Ontario due to lack of adequate measurement sensitivity. A variety of new methods have been introduced to increase the measurement sensitivity of the mutation assay. Some technologies, such as single-stranded conformational polymorphism, denaturing high-performance liquid chromatography, and high-resolution melting analysis, have the advantage of facilitating rapid mutation screening of large numbers of samples with high measurement sensitivity but require direct sequencing to confirm the identity of the detected mutations. Other techniques have been developed for the simple but highly sensitive detection of specific EGFR mutations, such as the amplification refractory mutation system (ARMS) and the peptide nucleic acid-locked PCR clamping. Others selectively digest wild-type DNA templates with restriction endonucleases to enrich mutant alleles by PCR. Experts in the province of Ontario have commented that PCR fragment analysis for deletion and point mutation is currently conducted in Ontario, with a measurement sensitivity of 1% to 5%.
Research Questions
In patients with locally-advanced or metastatic NSCLC, what is the clinical effectiveness of EGFR mutation testing for prediction of response to treatment with TKIs (gefitinib, erlotinib) in terms of progression-free survival (PFS), objective response rates (ORR), overall survival (OS), and quality of life (QoL)?
What is the impact of EGFR mutation testing on overall clinical decision-making for patients with advanced or metastatic NSCLC?
What is the cost-effectiveness of EGFR mutation testing in selecting patients with advanced NSCLC for treatment with gefitinib or erlotinib in the first-line setting?
What is the budget impact of EGFR mutation testing in selecting patients with advanced NSCLC for treatment with gefitinib or erlotinib in the second- or third-line setting?
A literature search was performed on March 9, 2010 using OVID MEDLINE, MEDLINE In-Process and Other Non-Indexed Citations, OVID EMBASE, Wiley Cochrane, CINAHL, Centre for Reviews and Dissemination/International Agency for Health Technology Assessment for studies published from January 1, 2004 until February 28, 2010 using the following terms:
Non-Small-Cell Lung Carcinoma
Epidermal Growth Factor Receptor
An automatic literature update program also extracted all papers published from February 2010 until August 2010. Abstracts were reviewed by a single reviewer and for those studies meeting the eligibility criteria full-text articles were obtained. Reference lists were also examined for any additional relevant studies not identified through the search. Articles with unknown eligibility were reviewed with a second clinical epidemiologist, and then a group of epidemiologists, until consensus was established. The quality of evidence was assessed as high, moderate, low or very low according to GRADE methodology.
The inclusion criteria were as follows:
Population: patients with locally advanced or metastatic NSCLC (stage IIIB or IV)
Procedure: EGFR mutation testing before treatment with gefitinib or erlotinib
Language: publication in English
Published health technology assessments, guidelines, and peer-reviewed literature (abstracts, full text, conference abstract)
Outcomes: progression-free survival (PFS), Objective response rate (ORR), overall survival (OS), quality of life (QoL).
The exclusion criteria were as follows:
Studies lacking outcomes specific to those of interest
Studies focused on erlotinib maintenance therapy
Studies focused on gefitinib or erlotinib use in combination with cytotoxic agents or any other drug
Grey literature, where relevant, was also reviewed.
Outcomes of Interest
ORR determined by means of the Response Evaluation Criteria in Solid Tumours (RECIST)
Quality of Evidence
The quality of the Phase II trials and observational studies was based on the method of subject recruitment and sampling, possibility of selection bias, and generalizability to the source population. The overall quality of evidence was assessed as high, moderate, low or very low according to the GRADE Working Group criteria.
Summary of Findings
Since the last published health technology assessment by Blue Cross Blue Shield Association in 2007 there have been a number of phase III trials which provide evidence of predictive value of EGFR mutation testing in patients who were treated with gefitinib compared to chemotherapy in the first- or second-line setting. The Iressa Pan Asian Study (IPASS) trial showed the superiority of gefitinib in terms of PFS in patients with EGFR mutations versus patients with wild-type EGFR (hazard ratio [HR], 0.48; 95% CI, 0.36-0.64 versus HR, 2.85; 95% CI, 2.05-3.98). Moreover, there was a statistically significant increased ORR in patients who received gefitinib and had EGFR mutations compared to patients with wild-type EGFR (71% versus 1%). The First-SIGNAL trial in patients with similar clinical characteristics as IPASS, as well as the NEJ002 and WJTOG3405 trials that included only patients with EGFR mutations, provide confirmation that gefitinib is superior to chemotherapy in terms of improved PFS or higher ORR in patients with EGFR mutations. The INTEREST trial further indicated that patients with EGFR mutations had prolonged PFS and higher ORR when treated with gefitinib compared with docetaxel.
In contrast, there is still a paucity of strong evidence regarding the predictive value of EGFR mutation testing for response to erlotinib in the second- or third-line setting. The BR.21 trial randomized 731 patients with NSCLC who were refractory or intolerant to prior first- or second-line chemotherapy to receive erlotinib or placebo. While the HR of 0.61 (95% CI, 0.51-0.74) favored erlotinib in the overall population, this was not significant in the subsequent retrospective subgroup analysis. A retrospective evaluation of 116 of the BR.21 tumor samples demonstrated that patients with EGFR mutations had significantly higher ORRs when treated with erlotinib compared with placebo (27% versus 7%; P=0.03). However, erlotinib did not confer a significant survival benefit compared with placebo in patients with EGFR mutations (HR, 0.55; 95% CI, 0.25-1.19) versus wild-type (HR, 0.74; 95% CI, 0.52-1.05). The interaction between EGFR mutation status and erlotinib use was not significant (P=0.47). The lack of significance could be attributable to a type II error since only a small sample was available for subgroup analysis.
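The link between the reported confidence intervals and statistical significance can be checked with a short calculation. The sketch below is an approximation and not part of the review's analysis: it assumes the intervals are symmetric on the log scale (a Wald-type interval), recovers the standard error of log(HR) from the rounded interval limits, and derives a two-sided p-value from the normal distribution. The function name is hypothetical.

```python
import math


def hr_significance(hr, ci_low, ci_high, z_crit=1.96):
    """Approximate z statistic and two-sided p-value from an HR and its 95% CI."""
    # Standard error of log(HR), assuming the CI is log(HR) +/- z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hazard ratios quoted above for BR.21 (erlotinib versus placebo).
z_all, p_all = hr_significance(0.61, 0.51, 0.74)   # overall population
z_mut, p_mut = hr_significance(0.55, 0.25, 1.19)   # EGFR mutation subgroup
z_wt, p_wt = hr_significance(0.74, 0.52, 1.05)     # wild-type subgroup
```

Because the subgroup intervals both include 1, their approximate p-values exceed 0.05 even though the point estimate in the mutation subgroup (0.55) is more favorable than the overall estimate; the wide interval reflects the small number of patients, which is consistent with the type II error explanation given above.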
A series of phase II studies have examined the clinical effectiveness of erlotinib in patients known to have EGFR mutations. Evidence from these studies has consistently shown that erlotinib yields a very high ORR (typically 70% vs. 4%) and a prolonged PFS (9 months vs. 2 months) in patients with EGFR mutations compared with patients with wild-type EGFR. However, the prolonged PFS and higher response in patients with EGFR mutations might be due to a better prognostic profile regardless of the treatment received. In the absence of a comparative treatment or placebo control group, it is difficult to determine whether the observed differences in survival benefit in patients with EGFR mutations are attributable to the prognostic or the predictive value of EGFR mutation status.
Based on moderate quality of evidence, patients with locally advanced or metastatic NSCLC with adenocarcinoma histology being treated with gefitinib in the first-line setting are highly likely to benefit from gefitinib if they have EGFR mutations compared to those with wild-type EGFR. This advantage is reflected in improved PFS, ORR and QoL in patients with EGFR mutation who are being treated with gefitinib relative to patients treated with chemotherapy.
Based on low quality of evidence, in patients with locally advanced or metastatic NSCLC who are being treated with erlotinib, the identification of EGFR mutation status selects those who are most likely to benefit from erlotinib relative to patients treated with placebo in the second or third-line setting.
PMCID: PMC3377519  PMID: 23074402
24.  SP12 Biological Pathway-Centric Approach to Integrative Analysis of Array Data as Applied to Mefloquine Neurotoxicity 
Expression profiling of whole genomes, and modern high-throughput proteomics, has created a revolution in the study of disease states. Approaches for gene expression analysis (time series analysis and clustering) have been applied to functional genomics related to cancer research, and have yielded major successes in the pursuit of gene expression signatures. However, these analysis methods are primarily designed to identify correlative or causal relationships between entities, but do not consider the data in the proper biological context of a “biological pathway” model. Pathway models form a cornerstone of systems biology. They provide a framework for (1) systematic interrogation of biochemical interactions, (2) management of the collective knowledge pertaining to cellular components, and (3) discovery of emergent properties of different pathway configurations.
CFD Research Corporation has developed advanced techniques to interpret microarray data in the context of known biological pathways. We have applied this integrative biological pathway-centered approach to the specific problem of identifying a genetic cause for individuals predisposed to mefloquine neurotoxicity. Mefloquine (Lariam) is highly effective against drug-resistant malaria. However, adverse neurological effects (ataxia, mood changes) have been observed in human sub-populations. Microarray experiments were used to quantify the transcriptional response of cells exposed to mefloquine. Canonical pathway models containing the differentially expressed genes were automatically retrieved from the KEGG database, using recently developed software. The canonical pathway models were automatically concatenated together to form the final pathway model. The resultant pathway model was interrogated using a novel signaling control flux (SCF) algorithm that combines Boolean pseudodynamics (BPD) to relax the cumbersome steady-state assumptions of SCF. The SCF-BPD algorithm was used to identify and prioritize pathways critical to adverse effects of mefloquine. Further analysis resulted in the identification of specific sub-cellular targets that may explain mefloquine neurotoxicity in human subpopulations on the basis of known single-nucleotide polymorphisms.
PMCID: PMC2291890
25.  A temporal precedence based clustering method for gene expression microarray data 
BMC Bioinformatics  2010;11:68.
Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing if one time-series can be used for forecasting the other time-series or not.
A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system.
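The pairwise test behind the gene-association matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single lag, returns the F statistic rather than a p-value, and compares it with a hypothetical cutoff; the function names, the toy series and the threshold are all made up for the example, and the subsequent graph-theoretic analysis is not shown.

```python
def _residuals(target, regressor):
    """Residuals of `target` after linear regression on `regressor` plus intercept."""
    n = len(target)
    mt, mr = sum(target) / n, sum(regressor) / n
    sxx = sum((r - mr) ** 2 for r in regressor)
    slope = sum((r - mr) * (t - mt) for r, t in zip(regressor, target)) / sxx
    return [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]


def granger_f(cause, effect):
    """One-lag F statistic for 'past values of cause help forecast effect'."""
    y, y_lag, x_lag = effect[1:], effect[:-1], cause[:-1]
    ey = _residuals(y, y_lag)        # restricted model: effect on its own past
    ex = _residuals(x_lag, y_lag)    # part of the cause not explained by y_lag
    rss_restricted = sum(e * e for e in ey)
    # Reduction in residual sum of squares from adding the lagged cause.
    gain = sum(a * b for a, b in zip(ey, ex)) ** 2 / sum(e * e for e in ex)
    rss_unrestricted = rss_restricted - gain
    dof = len(y) - 3                 # intercept, y_lag and x_lag are estimated
    return gain / (rss_unrestricted / dof)


def association_graph(series, threshold):
    """Directed edges a -> b wherever the F statistic exceeds `threshold`."""
    return {
        a: {b for b in series
            if b != a and granger_f(series[a], series[b]) > threshold}
        for a in series
    }


# Hypothetical toy profiles: y follows x with a one-step delay plus small noise.
x = [1, 3, 2, 5, 4, 7, 3, 6, 2, 8, 5, 9]
y = [0.0, 1.1, 2.9, 2.1, 4.9, 4.1, 6.9, 3.1, 5.9, 2.1, 7.9, 5.1]
graph = association_graph({"x": x, "y": y}, threshold=100.0)
```

In practice the F statistic would be converted to a p-value and corrected for the number of gene pairs tested; the fixed cutoff here only keeps the example free of external dependencies.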
Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits.
PMCID: PMC2841598  PMID: 20113513
