The pharmacogenomic effects of a corticosteroid (CS) were assessed in rat skeletal muscle using microarrays. Adrenalectomized (ADX) rats were treated with methylprednisolone (MPL) by either 50 mg/kg intravenous injection or 7-day 0.3 mg/kg/h infusion through subcutaneously implanted pumps. RNAs extracted from individual rat muscles were hybridized to Affymetrix Rat Genome Genechips. Data mining yielded 653 and 2316 CS-responsive probe sets following MPL bolus and infusion treatments. Of these, 196 genes were controlled by MPL under both dosing conditions. Cluster analysis revealed that 124 probe sets exhibited three typical expression dynamic profiles following acute dosing. Cluster A consisted of up-regulated probe sets which were grouped into five subclusters each exhibiting unique temporal patterns during the infusion. Cluster B comprised down-regulated probe sets which were divided into two subclusters with distinct dynamics during the infusion. Cluster C probe sets exhibited delayed down-regulation under both bolus and infusion conditions. Among those, 104 probe sets were further grouped into subclusters based on their profiles following chronic MPL dosing. Several mathematical models were proposed and adequately captured the temporal patterns for each subcluster. Multiple types of dosing regimens are needed to resolve common determinants of gene regulation as chronic exposure results in unexpected differences in gene expression compared to acute dosing. Pharmacokinetic/pharmacodynamic (PK/PD) modeling provides a quantitative tool for elucidating the complexities of CS pharmacogenomics in skeletal muscle.
Microarray studies; pharmacokinetics; pharmacodynamics; mathematical models; computational biology
Microarrays have been utilized in many biological, physiological and pharmacological studies as a high-throughput genomic technique. Several generations of Affymetrix GeneChip® microarrays are widely used in gene expression studies. However, differences in intensities of signals for different probe sets that represent the same gene on various types of Affymetrix chips make comparison of datasets complicated.
Materials and Methods
A power coefficient scaling factor was applied in the pharmacokinetic/pharmacodynamic (PK/PD) modeling to account for differences in probe set sensitivities (i.e., signal intensities). Microarray data from muscle and liver following methylprednisolone 50 mg/kg i.v. bolus and 0.3 mg/kg/h infusion regimens were taken as an exemplar.
The scaling factor applied to the pharmacodynamic output function was used to solve the problem of intensity differences between probe sets. This approach yielded consistent pharmacodynamic parameters for the applied models.
Modeling of pharmacodynamic/pharmacogenomic (PD/PG) data from diverse chips should be performed with caution due to differential probe set intensities. In such circumstances, a power scaling factor can be applied in the modeling.
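The role of a power scaling factor can be illustrated with a minimal sketch. The abstract does not give the functional form, so the output function below (baseline signal multiplied by the mRNA fold change raised to a probe-set-specific power `gamma`) is an assumption made for illustration, and the function name and numbers are hypothetical.

```python
def scaled_signal(mrna, mrna_baseline, signal_baseline, gamma):
    """Map model-predicted mRNA to an observed probe set signal.

    Assumed (hypothetical) output function: the fold change in mRNA is
    raised to a probe-set-specific power gamma, so two probe sets that
    report the same mRNA dynamics may differ in amplitude while sharing
    the same underlying pharmacodynamic parameters.
    """
    if mrna_baseline <= 0:
        raise ValueError("mrna_baseline must be positive")
    return signal_baseline * (mrna / mrna_baseline) ** gamma


# One underlying mRNA profile (fold change over baseline) read by two
# hypothetical probe sets with different baselines and sensitivities.
fold_profile = [1.0, 2.0, 4.0, 2.0, 1.0]
chip_a = [scaled_signal(m, 1.0, 100.0, 1.0) for m in fold_profile]
chip_b = [scaled_signal(m, 1.0, 250.0, 0.5) for m in fold_profile]
```

Under this assumption, only `gamma` and the baseline signal differ between chips; the dynamic model that produces `fold_profile` is shared, which is one way the fitted pharmacodynamic parameters could remain consistent across probe sets.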
bioinformatics; computational biology; pharmacodynamics; pharmacogenomics; pharmacokinetics
In the post-genomic era, the rapid evolution of high-throughput genotyping technologies and the increased pace of production of genetic research data are continually prompting the development of appropriate informatics tools, systems and databases as we attempt to cope with the flood of incoming genetic information. Alongside new technologies that serve to enhance data connectivity, emerging information systems should contribute to the creation of a powerful knowledge environment for genotype-to-phenotype information in the context of translational medicine. In the area of pharmacogenomics and personalized medicine, it has become evident that database applications providing important information on the occurrence and consequences of gene variants involved in pharmacokinetics, pharmacodynamics, drug efficacy and drug toxicity will become an integral tool for researchers and medical practitioners alike. At the same time, two fundamental issues are inextricably linked to current developments, namely data sharing and data protection. Here, we discuss high-throughput and next-generation sequencing technology and its impact on pharmacogenomics research. In addition, we present advances and challenges in the field of pharmacogenomics information systems which have in turn triggered the development of an integrated electronic ‘pharmacogenomics assistant’. The system is designed to provide personalized drug recommendations based on linked genotype-to-phenotype pharmacogenomics data, as well as to support biomedical researchers in the identification of pharmacogenomics-related gene variants. The provisioned services are tuned in the framework of a single-access pharmacogenomics portal.
whole-genome sequencing; personalized pharmacogenomics profile; informatics solutions; microattribution; drug metabolism; gene variants
Microarray analyses were performed on livers from adrenalectomized male Wistar rats chronically infused with methylprednisolone (MPL) (0.3 mg/kg·h) using Alzet mini-osmotic pumps for periods ranging from 6 h to 7 d. Four control and 40 drug-treated animals were killed at 10 different times during drug infusion. Total RNA preparations from the livers of these animals were hybridized to 44 individual Affymetrix REA230A gene chips, generating data for 15,967 different probe sets for each chip. A series of three filters was applied sequentially. These filters were designed to eliminate probe sets that were not expressed in the tissue, were not regulated by the drug, or did not meet defined quality control standards. These filters eliminated 13,978 probe sets (87.5%), leaving a remainder of 1989 probe sets for further consideration. We previously described a similar dataset obtained from animals after administration of a single dose of MPL (50 mg/kg given iv). That study involved 16 time points over a 72-h period. A similar filtering schema applied to the single-bolus-dose dataset identified 1519 probe sets as being regulated by MPL. A comparison of datasets from the two different dosing regimens identified 358 genes that were regulated by MPL in response to both dosing regimens. Regulated genes were grouped into 13 categories, mainly on the basis of gene product function. The temporal profiles of these common genes were subjected to detailed scrutiny. Examination of temporal profiles demonstrates that current perspectives on the mechanism of glucocorticoid action cannot entirely explain the temporal profiles of these regulated genes.
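The sequential filtering described above can be sketched as follows. This is only an illustration of the general idea (expressed, then regulated, then quality control); the abstract does not state the actual criteria, so the function name, the thresholds `min_signal`, `min_fold`, and `max_cv`, and the toy data are all hypothetical.

```python
from statistics import mean, pstdev


def sequential_filter(profiles, controls, min_signal=50.0, min_fold=1.5, max_cv=0.5):
    """Apply three hypothetical filters in sequence and return surviving probe sets.

    profiles: {probe_id: [treated signal at each time point]}
    controls: {probe_id: [signals from replicate control animals]}
    """
    kept = {}
    for probe_id, series in profiles.items():
        control = controls[probe_id]
        baseline = mean(control)
        # Filter 1: drop probe sets that are never expressed above a floor.
        if max(max(series), baseline) < min_signal:
            continue
        # Filter 2: drop probe sets that never deviate from the control
        # mean by at least min_fold in either direction.
        if baseline <= 0:
            continue
        ratios = [s / baseline for s in series]
        if not any(r >= min_fold or r <= 1.0 / min_fold for r in ratios):
            continue
        # Filter 3: drop probe sets whose control replicates are too
        # variable (coefficient of variation above max_cv).
        if pstdev(control) / baseline > max_cv:
            continue
        kept[probe_id] = series
    return kept


# Toy data: only "up" is expressed, regulated, and has stable controls.
profiles = {
    "low": [10.0, 12.0, 11.0],
    "flat": [100.0, 105.0, 98.0],
    "up": [100.0, 300.0, 150.0],
    "noisy": [100.0, 400.0, 100.0],
}
controls = {
    "low": [10.0, 11.0, 9.0, 10.0],
    "flat": [100.0, 102.0, 99.0, 101.0],
    "up": [100.0, 95.0, 105.0, 100.0],
    "noisy": [20.0, 300.0, 50.0, 30.0],
}
kept = sequential_filter(profiles, controls)
```

Because each filter only removes probe sets, the counts reported in the abstract are internally consistent: 15,967 − 13,978 = 1,989 probe sets remain.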
The transcriptional response of skeletal muscle to chronic corticosteroid exposure was examined over 168 h and compared with the response profiles observed following a single dose of corticosteroid. Male adrenalectomized Wistar rats were given a constant-rate infusion of 0.3 mg•kg−1•h−1 methylprednisolone for up to 7 days via subcutaneously implanted minipumps. Four control and forty drug-treated animals were killed at ten different time points during infusion. Muscle total RNAs were hybridized to 44 individual Affymetrix REA230A gene chips. Previously, we described a filtration approach for identifying genes of interest in microarray data sets developed from tissues of rats treated with methylprednisolone (MPL) following acute dosing. Here, a similar approach involving a series of three filters was applied sequentially to identify genes of interest. These filters were designed to eliminate probe sets that were not expressed in the tissue, were not regulated by the drug, or did not meet defined quality control standards. Filtering eliminated 86% of probe sets, leaving a remainder of 2,316 for further consideration. In a previous study, 653 probe sets were identified as MPL regulated following administration of a single (acute) dose of the drug. Comparison of the two data sets yielded 196 genes identified as regulated by MPL in both dosing regimens. Because of receptor downregulation, it was predicted that genes regulated by receptor-glucocorticoid response element interactions would exhibit tolerance in chronic profiles. However, many genes did not exhibit steroid tolerance, indicating that present perspectives on the mechanism of glucocorticoid action cannot entirely explain all temporal profiles.
glucocorticoids; corticosteroids; Affymetrix gene chips; gene expression; time series
Pharmacodynamics (PD) considers the relationship between drug exposure and effect. The two factors that have been used to distinguish the PD behaviors of antimicrobials are the impact of concentration on the extent of organism killing and the duration of persistent microbiologic suppression (postantibiotic effect). The goals of these studies were (i) to examine the relationship between antimicrobial PD and gene expression and (ii) to gain insight into the mechanism of fluconazole effects persisting following exposure. Microarrays were used to estimate the transcriptional response of Candida albicans to a supra-MIC fluconazole exposure over time in vitro. Fluconazole at four times the MIC was added to a log-phase C. albicans culture, and cells were collected to determine viable growth and for microarray analyses. We identified differential expression of 18% of all genes for at least one of the time points. More genes were upregulated (n = 1,053 [16%]) than downregulated (174 [3%]). Of genes with known function that were upregulated during exposure, most were related to plasma membrane/cell wall synthesis (18%), stress responses (7%), and metabolism (6%). The categories of downregulated genes during exposure included protein synthesis (15%), DNA synthesis/repair (7%), and transport (7%) genes. The majority of genes identified at the postexposure time points were from the protein (17%) and DNA (7%) synthesis categories. In subsequent studies, three genes (CDR1, CDR2, and ERG11) were examined in greater detail (more concentration and time points) following fluconazole exposure in vitro and in vivo. Expression levels from the in vitro and in vivo studies were congruent. CDR1 and CDR2 transcripts were reduced during in vitro fluconazole exposure and during supra-MIC exposure in vivo. However, in the postexposure period, the mRNA abundance of both pumps increased. ERG11 expression increased during exposure and fell in the postexposure period.
The expression of the three genes responded in a dose-dependent manner. In sum, the microarray data obtained during and following fluconazole exposure identified genes both known and unknown to be affected by this drug class. The expanded in vitro and in vivo expression data set underscores the importance of considering the time course of exposure in pharmacogenomic investigations.
Data visualization techniques for the pharmaceutical sciences have not been extensively investigated. The purpose of this study was to evaluate the usefulness of VizStruct, a multidimensional visualization tool, for applications in pharmacokinetics, pharmacodynamics, and pharmacogenomics.
The VizStruct tool uses the first harmonic of the discrete Fourier transform to map multidimensional data to two dimensions for visualization. The mapping was used to visualize several published pharmacokinetic, pharmacodynamic, and pharmacogenomic data sets. The VizStruct approach was evaluated using simulated population pharmacokinetics data sets, the data from Dalen and colleagues (Clin. Pharmacol. Ther. 63:444−452, 1998) on the kinetics of nortriptyline and its 10-hydroxy-nortriptyline metabolite in subjects with differing number of copies of the CYP2D6, and the gene expression profiling data of Bohen and colleagues (Proc. Natl. Acad. Sci. USA 100:1926−1930, 2003) on follicular lymphoma patients responsive and nonresponsive to rituximab.
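The first-harmonic mapping can be sketched in a few lines. The code below computes the first harmonic of the discrete Fourier transform, F1 = Σ x_k · exp(−2πi·k/N), and uses its real and imaginary parts as the two plotting coordinates. Any normalization or scaling that VizStruct applies before the transform is not described in the abstract and is omitted here; the function name is hypothetical.

```python
import cmath


def first_harmonic_map(vector):
    """Map an N-dimensional data point to 2D via the first DFT harmonic.

    Returns (real, imaginary) parts of
    F1 = sum_k x_k * exp(-2*pi*i*k/N), for k = 0 .. N-1.
    """
    n = len(vector)
    if n < 2:
        raise ValueError("need at least two dimensions")
    f1 = sum(x * cmath.exp(-2j * cmath.pi * k / n) for k, x in enumerate(vector))
    return (f1.real, f1.imag)


# A flat profile carries no first-harmonic content and lands at the origin;
# profiles that peak in different dimensions land at different angles.
flat_xy = first_harmonic_map([3.0, 3.0, 3.0, 3.0])
early_xy = first_harmonic_map([1.0, 0.0, 0.0, 0.0])
late_xy = first_harmonic_map([0.0, 1.0, 0.0, 0.0])
```

Each data point costs one pass over its N coordinates, which is consistent with the abstract's description of the mapping as computationally efficient.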
The VizStruct mapping preserves the key characteristics of multidimensional data in two dimensions in a manner that facilitates visualization. The mapping is computationally efficient and can be used for cluster detection and class prediction in pharmaceutical data sets. The VizStruct visualization succinctly summarized the salient similarities and differences in the nortriptyline and 10-hydroxynortriptyline pharmacokinetic profiles in subjects with increasing number of CYP2D6 gene copies. In the simulated population pharmacokinetic data sets, it was capable of discriminating the subtle differences between pharmacokinetic profiles derived from 1- and 2-compartment models with the same area under the curve. The two-dimensional VizStruct mapping computed from a subset of 102 informative genes from the Bohen and colleagues data set effectively separated the rituximab responder, rituximab nonresponder, and control subject groups.
The VizStruct approach is a computationally efficient and effective approach for visualizing complex, multidimensional data sets. It could have many useful applications in the pharmaceutical sciences.
microarray; pharmacodynamics; pharmacogenomic modeling; pharmacokinetics; visualization algorithms
Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space.
We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and to Brassica napus whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples.
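The order preserving concept on the time dimension can be illustrated with a simplified sketch. It is not the OPTricluster algorithm itself: it handles a single sample, ignores the combinatorial step over samples, and breaks ties by time index rather than treating near-equal values as constant. The function names are hypothetical.

```python
from collections import defaultdict


def order_signature(series):
    """Return the permutation of time indices that sorts the series ascending.

    Two genes share a signature when their expression values rise and fall
    in the same relative order, regardless of magnitude.
    """
    return tuple(sorted(range(len(series)), key=lambda i: series[i]))


def group_by_order(expression):
    """Group genes whose short time series share the same ordering.

    expression: {gene_id: [value at each time point]}
    """
    groups = defaultdict(list)
    for gene, series in expression.items():
        groups[order_signature(series)].append(gene)
    return dict(groups)


# g1 and g2 differ tenfold in scale but follow the same ordering
# (t0 < t2 < t1); g3 follows a different one.
expression = {
    "g1": [1.0, 5.0, 3.0],
    "g2": [10.0, 50.0, 30.0],
    "g3": [5.0, 1.0, 3.0],
}
groups = group_by_order(expression)
```

Grouping by ordering rather than by distance is what makes this kind of pattern insensitive to differences in absolute expression level between genes.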
Our analysis showed that OPTricluster generally outperforms other well known clustering algorithms such as the TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data.
In biological systems that undergo processes such as differentiation, a clear concept of progression exists. We present a novel computational approach, called Sample Progression Discovery (SPD), to discover patterns of biological progression underlying microarray gene expression data. SPD assumes that individual samples of a microarray dataset are related by an unknown biological process (i.e., differentiation, development, cell cycle, disease progression), and that each sample represents one unknown point along the progression of that process. SPD aims to organize the samples in a manner that reveals the underlying progression and to simultaneously identify subsets of genes that are responsible for that progression. We demonstrate the performance of SPD on a variety of microarray datasets that were generated by sampling a biological process at different points along its progression, without providing SPD any information of the underlying process. When applied to a cell cycle time series microarray dataset, SPD was not provided any prior knowledge of samples' time order or of which genes are cell-cycle regulated, yet SPD recovered the correct time order and identified many genes that have been associated with the cell cycle. When applied to B-cell differentiation data, SPD recovered the correct order of stages of normal B-cell differentiation and the linkage between preB-ALL tumor cells with their cell origin preB. When applied to mouse embryonic stem cell differentiation data, SPD uncovered a landscape of ESC differentiation into various lineages and genes that represent both generic and lineage specific processes. When applied to a prostate cancer microarray dataset, SPD identified gene modules that reflect a progression consistent with disease stages. 
SPD may be best viewed as a novel tool for synthesizing biological hypotheses because it provides a likely biological progression underlying a microarray dataset and, perhaps more importantly, the candidate genes that regulate that progression.
We present a novel computational approach, Sample Progression Discovery (SPD), to discover biological progression underlying a microarray dataset. In contrast to the majority of microarray data analysis methods, which identify differences between sample groups (normal vs. cancer, treated vs. control), SPD aims to identify an underlying progression among individual samples, both within and across sample groups. We validated SPD's ability to discover biological progression using datasets of cell cycle, B-cell differentiation, and mouse embryonic stem cell differentiation. We view SPD as a hypothesis generation tool when applied to datasets where the progression is unclear. For example, when applied to a microarray dataset of cancer samples, SPD assumes that the cancer samples collected from individual patients represent different stages during an intrinsic progression underlying cancer development. The inferred relationship among the samples may therefore indicate a trajectory or hierarchy of cancer progression, which serves as a hypothesis to be tested. SPD is not limited to microarray data analysis, and can be applied to a variety of high-dimensional datasets. We implemented SPD with a MATLAB graphical user interface, which is available at http://icbp.stanford.edu/software/SPD/.
DNA microarrays have rapidly emerged as an important tool for Mycobacterium tuberculosis research. While the microarray approach has generated valuable information, a recent survey has found a lack of correlation among the microarray data produced by different laboratories on related issues, raising a concern about the credibility of research findings. The Affymetrix oligonucleotide array has been shown to be more reliable for interrogating changes in gene expression than other platforms. However, this type of array system has not been applied to the pharmacogenomic study of M. tuberculosis. The goal here was to explore the strength of the Affymetrix array system for monitoring drug-induced gene expression in M. tuberculosis, compare with other related studies, and conduct cross-platform analysis. The genome-wide gene expression profiles of M. tuberculosis in response to drug treatments including INH (isoniazid) and ethionamide were obtained using the Affymetrix array system. Up-regulated or down-regulated genes were identified through bioinformatic analysis of the microarray data derived from the hybridization of RNA samples and gene probes. Based on the Affymetrix system, our method identified all drug-induced genes reported in the original reference work as well as some other genes that have not been recognized previously under the same drug treatment. For instance, the Affymetrix system revealed that Rv2524c (fas) was induced by both INH and ethionamide under the given levels of concentration, as suggested by most of the probe sets implementing this gene sequence. This finding is contradictory to previous observations that the expression of fas is not changed by INH treatment. This example illustrates that the determination of expression change for certain genes is probe-dependent, and the appropriate use of multiple probe-set representation is an advantage with the Affymetrix system. 
Our data also suggest that whereas the up-regulated gene expression pattern reflects the drug’s mode of action, the down-regulated pattern is largely non-specific. According to our analysis, the Affymetrix array system is a reliable tool for studying the pharmacogenomics of M. tuberculosis and lends itself well in the research and development of anti-TB drugs.
Tuberculosis; Drug; Microarray; Genome
Kidney is a major target for adverse effects associated with corticosteroids. A microarray dataset was generated to examine changes in gene expression in rat kidney in response to methylprednisolone. Four control and 48 drug-treated animals were killed at 16 times after drug administration. Kidney RNA was used to query 52 individual Affymetrix chips, generating data for 15,967 different probe sets for each chip. Mining techniques applicable to time series data that identify drug-regulated changes in gene expression were applied. Four sequential filters eliminated probe sets that were not expressed in the tissue, not regulated by drug, or did not meet defined quality control standards. These filters eliminated 14,890 probe sets (94%) from further consideration. Application of judiciously chosen filters is an effective tool for data mining of time series datasets. The remaining data can then be further analyzed by clustering and mathematical modeling. Initial analysis of this filtered dataset identified a group of genes whose pattern of regulation was highly correlated with prototype corticosteroid enhanced genes. Twenty genes in this group, as well as selected genes exhibiting either downregulation or no regulation, were analyzed for 5′ GRE half-sites conserved across species. In general, the results support the hypothesis that the existence of conserved DNA binding sites can serve as an important adjunct to purely analytic approaches to clustering genes into groups with common mechanisms of regulation. This dataset, as well as similar datasets on liver and muscle, are available online in a format amenable to further analysis by others.
data mining; gene arrays; glucocorticoids; pharmacogenomics; evolutionary conservation
Post-genomic molecular biology has resulted in an explosion of data, providing measurements for large numbers of genes, proteins and metabolites. Time series experiments have become increasingly common, necessitating the development of novel analysis tools that capture the resulting data structure. Outlier measurements at one or more time points present a significant challenge, while potentially valuable replicate information is often ignored by existing techniques.
We present a generative model-based Bayesian hierarchical clustering algorithm for microarray time series that employs Gaussian process regression to capture the structure of the data. By using a mixture model likelihood, our method permits a small proportion of the data to be modelled as outlier measurements, and adopts an empirical Bayes approach which uses replicate observations to inform a prior distribution of the noise variance. The method automatically learns the optimum number of clusters and can incorporate non-uniformly sampled time points. Using a wide variety of experimental data sets, we show that our algorithm consistently yields higher quality and more biologically meaningful clusters than current state-of-the-art methodologies. We highlight the importance of modelling outlier values by demonstrating that noisy genes can be grouped with other genes of similar biological function. We demonstrate the importance of including replicate information, which we find enables the discrimination of additional distinct expression profiles.
By incorporating outlier measurements and replicate values, this clustering algorithm for time series microarray data provides a step towards a better treatment of the noise inherent in measurements from high-throughput genomic technologies. Timeseries BHC is available as part of the R package 'BHC' (version 1.5), which is available for download from Bioconductor (version 2.9 and above) via http://www.bioconductor.org/packages/release/bioc/html/BHC.html?pagewanted=all.
A data set was generated to examine global changes in gene expression in rat liver over time in response to a single bolus dose of methylprednisolone. Four control animals and 43 drug-treated animals were humanely killed at 16 different time points following drug administration. Total RNA preparations from the livers of these animals were hybridized to 47 individual Affymetrix RU34A gene chips, generating data for 8799 different probe sets for each chip. Data mining techniques that are applicable to gene array time series data sets were applied to this data set to identify drug-regulated changes in gene expression. A series of 4 sequentially applied filters was developed to eliminate probe sets that were not expressed in the tissue, were not regulated by the drug treatment, or did not meet defined quality control standards. These filters eliminated 7287 probe sets of the 8799 total (82%) from further consideration. Application of judiciously chosen filters is an effective tool for data mining of time series data sets. The remaining data can then be further analyzed by clustering and mathematical modeling techniques.
Data mining; gene arrays; glucocorticoids; mathematical modeling; pharmacogenomics
Sepsis remains a major clinical challenge in intensive care units. The difficulty in developing new and more effective treatments for sepsis exemplifies our incomplete understanding of its underlying pathophysiology. One of the more widely used rodent models for studying polymicrobial sepsis is cecal ligation and puncture (CLP). While a number of CLP studies have investigated the ensuing systemic inflammatory response, they usually focus on a single time point post CLP and therefore fail to describe the dynamics of the response. Furthermore, previous studies mostly use surgery without infection (herein referred to as Sham CLP, SCLP) as a control for the CLP model; however, SCLP represents an aseptic injurious event that also stimulates a systemic inflammatory response. Thus, there is a need to better understand the dynamics and expression patterns of both injury- and sepsis-induced gene expression alterations to identify potential regulatory targets. In this direction, we characterized the response of the liver within the first 24 h in a rat model of SCLP and CLP using a time series of microarray gene expression data.
Rats were randomly divided into three groups: sham, SCLP and CLP. Rats in the CLP group were subjected to laparotomy, cecal ligation and puncture, while those in the SCLP group were subjected to the same procedures without cecal ligation and puncture. Animals were saline resuscitated and sacrificed at defined time points (0, 2, 4, 8, 16, and 24 h). Liver tissues were explanted and analyzed for their gene expression profiles using microarray technology. Unoperated animals (Sham) served as negative controls. After identifying differentially expressed probesets between sham and SCLP or CLP conditions over time, the concatenated data sets corresponding to these differentially expressed probesets in sham and SCLP or CLP groups were combined and analyzed using a “consensus clustering” approach. Promoters of genes that share common characteristics were extracted and compared with gene batteries composed of co-expressed genes in order to identify putative transcription factors that could be responsible for the co-regulation of those genes.
The SCLP/CLP genes whose expression patterns significantly changed compared to sham over time were identified, clustered, and finally analyzed for pathway enrichment. Our results indicate that both CLP and SCLP triggered the activation of a pro-inflammatory response, enhanced synthesis of acute-phase proteins, increased metabolism and tissue damage markers. Genes triggered by CLP which can be directly linked to bacteria removal functions were absent in SCLP injury. In addition, genes relevant to oxidative stress induced damage were unique to CLP injury, which may be due to the increased severity of CLP injury vs. SCLP injury. Pathway enrichment identified pathways with similar functionality but different dynamics in the two injury models, indicating that the functions controlled by those pathways are under the influence of different transcription factors and regulatory mechanisms. Putatively identified transcription factors, notably including CREB, NF-κB and STAT, were obtained through analysis of the promoter regions in the SCLP/CLP genes. Our results show that while transcription factors such as NF-κB, HOMF, and GATA were common in both injuries for the IL-6 signaling pathway, there were many other transcription factors associated with that pathway which were unique to CLP, including FKHD, HESF and IRFF. There were 17 transcription factors that were identified as important in at least 2 pathways in the CLP injury, but only 7 transcription factors with that property in the SCLP injury. This also supports the hypothesis of unique regulatory modules that govern the pathways present in both the CLP and SCLP response.
By using microarrays to assess multiple genes in a high-throughput manner, we demonstrate that an inflammatory response involving different dynamics and different genes is triggered by SCLP and CLP. From our analysis of the CLP data, the key characteristics of sepsis are a pro-inflammatory response which drives hypermetabolism, immune cell activation, and damage from oxidative stress. This contrasts with SCLP, which triggers a modified inflammatory response leading to no immune cell activation, decreased detoxification potential, and hypermetabolism. Many of the identified transcription factors that drive the CLP-induced response are not found in the SCLP group, suggesting that SCLP and CLP induce different types of inflammatory responses via different regulatory pathways.
sepsis; trauma; gene expression; transcription factor; microarray; inflammation; liver
Pioglitazone is the most widely used thiazolidinedione and acts as an insulin-sensitizer through activation of the Peroxisome Proliferator-Activated Receptor-γ (PPARγ). Pioglitazone is approved for use in the management of type 2 diabetes mellitus (T2DM), but its use in other therapeutic areas is increasing due to pleiotropic effects. In this hypothesis article, the current clinical evidence on pioglitazone pharmacogenomics is summarized and related to variability in pioglitazone response. How genetic variation in the human genome affects the pharmacokinetics and pharmacodynamics of pioglitazone was examined. For pharmacodynamic effects, hypoglycemic and anti-atherosclerotic effects, risks of fracture or edema, and the increase in body mass index in response to pioglitazone based on genotype were examined. The genes CYP2C8 and PPARG are the most extensively studied to date and selected polymorphisms contribute to respective variability in pioglitazone pharmacokinetics and pharmacodynamics. We hypothesized that genetic variation in pioglitazone pathway genes contributes meaningfully to the clinically observed variability in drug response. To test the hypothesis that genetic variation in PPARG associates with variability in pioglitazone response, we conducted a meta-analysis to synthesize the currently available data on the PPARG p.Pro12Ala polymorphism. The results showed that PPARG 12Ala carriers had a more favorable change in fasting blood glucose from baseline as compared to patients with the wild-type Pro12Pro genotype (p = 0.018). Unfortunately, findings for many other genes lack replication in independent cohorts to confirm association; further studies are needed. Also, the biological functionality of these polymorphisms is unknown. 
Based on current evidence, we propose that pharmacogenomics may provide an important tool to individualize pioglitazone therapy and better optimize therapy in patients with T2DM or other conditions for which pioglitazone is being used.
pioglitazone; thiazolidinedione; CYP2C8; cytochrome P450; PPAR; pharmacokinetics; pharmacodynamics
Visualization tools allow researchers to obtain a global view of the interrelationships between the probes or experiments of a gene expression (e.g. microarray) data set. Existing methods include hierarchical clustering and k-means. In recent years, others have proposed applying minimum spanning trees (MST) for microarray clustering. Although MST-based clustering is formally equivalent to the dendrograms produced by hierarchical clustering under certain conditions, visually the two can be quite different.
HAMSTER (Helpful Abstraction using Minimum Spanning Trees for Expression Relations) is an open source system for generating a set of MSTs from the experiments of a microarray data set. While previous works have generated a single MST from a data set for data clustering, we recursively merge experiments and repeat this process to obtain a set of MSTs for data visualization. Depending on the parameters chosen, each tree is analogous to a snapshot of one step of the hierarchical clustering process. We score and rank these trees using one of three proposed schemes. HAMSTER is implemented in C++ and makes use of Graphviz for laying out each MST.
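As a rough illustration of the basic building block, the sketch below builds a single MST over a handful of experiments with Prim's algorithm. It is not HAMSTER's C++ implementation: the Euclidean distance, the function names, and the toy expression profiles are all assumptions made for the example, and the recursive merging and tree scoring steps are not shown.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete distance graph.

    Returns a list of (i, j, distance) edges, where i and j index experiments.
    """
    n = len(profiles)
    if n == 0:
        return []
    # best[j] = (distance, i): cheapest known edge linking experiment j to the tree
    best = {j: (euclidean(profiles[0], profiles[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        dist, i = best.pop(j)
        edges.append((i, j, dist))
        # Experiment j is now in the tree; see whether it offers shorter links
        for k in best:
            d = euclidean(profiles[j], profiles[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges

# Hypothetical toy data: four experiments measured on two probes
toy_profiles = [
    [0.0, 0.0],
    [1.0, 0.0],
    [1.0, 1.0],
    [5.0, 5.0],
]
mst_edges = minimum_spanning_tree(toy_profiles)
for i, j, dist in mst_edges:
    print(f"experiment {i} -- experiment {j}: {dist:.3f}")
```

The resulting edge list could then be handed to a layout tool; how HAMSTER passes trees to Graphviz is not described here, so that step is left out.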
We report on the running time of HAMSTER and demonstrate, using data sets from the NCBI Gene Expression Omnibus (GEO), that the images created by HAMSTER offer insights that differ from the dendrograms of hierarchical clustering. In addition to the C++ program, which is available as open source, we also provide a web-based version (HAMSTER+) which allows users to apply our system through a web browser without any computer programming knowledge.
Researchers may find it helpful to include HAMSTER in their microarray analysis workflow as it can offer insights that differ from hierarchical clustering. We believe that HAMSTER would be useful for certain types of gradient data sets (e.g. time-series data) and data that indicate relationships between cells/tissues. Both the source and the web server variant of HAMSTER are available from http://hamster.cbrc.jp/.
Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing if one time-series can be used for forecasting the other time-series or not.
A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system.
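The pipeline described above can be sketched in a few dozen lines under strong simplifying assumptions. The example below tests lag-1 Granger causality with an F statistic, links two genes when the statistic exceeds an arbitrary cutoff, and reports plain connected components, which stand in for the "highly connected components" of the text. The lag, the cutoff of 50, the synthetic series, and all function names are hypothetical choices for illustration; a real analysis would convert the F statistic to a p-value and select the lag from the data.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' at lag 1 (assumed lag)."""
    target = y[1:]
    restricted = [[1.0, y[t]] for t in range(len(y) - 1)]        # y's own past only
    full = [[1.0, y[t], x[t]] for t in range(len(y) - 1)]        # plus x's past
    rss_r, rss_u = rss(restricted, target), rss(full, target)
    dof = len(target) - 3
    return (rss_r - rss_u) / (rss_u / dof)

def association_components(series, f_cutoff):
    """Link genes whose F statistic exceeds the cutoff; return connected components."""
    names = list(series)
    adj = {n: set() for n in names}
    for a in names:
        for b in names:
            if a != b and granger_f(series[a], series[b]) > f_cutoff:
                adj[a].add(b)
                adj[b].add(a)
    seen, components = set(), []
    for n in names:
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components

# Synthetic example: y follows x with a one-step delay, z is unrelated noise
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(40)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.05) for t in range(1, 40)]
z = [rng.gauss(0, 1) for _ in range(40)]
modules = association_components({"x": x, "y": y, "z": z}, f_cutoff=50.0)
print(modules)
```

Because each test compares a model with and without the candidate gene's past values, the association reflects temporal ordering rather than profile similarity, which is the distinction the abstract draws against distance-based clustering.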
Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits.
Chronopharmacology is an important but under-explored aspect of therapeutics. Rhythmic variations in biological processes can influence drug action, including pharmacodynamic responses, due to circadian variations in the availability or functioning of drug targets. We hypothesized that global gene expression analysis can be useful in the identification of circadian-regulated genes involved in drug action. Circadian variations in gene expression in rat liver were explored using Affymetrix gene arrays. A rich time series involving animals analyzed at 18 time points within the 24-hour cycle was generated. Of the more than 15,000 probe sets on these arrays, 265 exhibited oscillations with a 24-hour period. Cluster analysis yielded 5 distinct circadian clusters, with approximately two-thirds of the transcripts reaching maximum expression during the animal’s dark/active period. Of the 265 probe sets, 107 of potential therapeutic importance were identified. The expression levels of clock genes were also investigated in this study. Five clock genes exhibited circadian variation in liver, and data suggest that these genes may also be regulated by corticosteroids.
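One common way to screen a probe set for a 24-hour rhythm is a single-component cosinor fit; the abstract does not state which method was used, so the sketch below is an assumption offered only to make the idea concrete. It fits a mean level, an amplitude, and a peak time by least squares; the sampling times and the synthetic expression values are hypothetical.

```python
import math

PERIOD_H = 24.0  # period fixed at 24 hours

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_cosinor(times_h, values):
    """Least-squares fit of mesor + amplitude * cos(w * (t - peak_time)).

    Returns (mesor, amplitude, peak_time) with peak_time in hours, 0 <= peak < 24.
    """
    w = 2.0 * math.pi / PERIOD_H
    X = [[1.0, math.cos(w * t), math.sin(w * t)] for t in times_h]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * v for r, v in zip(X, values)) for i in range(3)]
    mesor, beta_cos, beta_sin = solve_linear(XtX, Xty)
    amplitude = math.hypot(beta_cos, beta_sin)
    peak_time = (math.atan2(beta_sin, beta_cos) / w) % PERIOD_H
    return mesor, amplitude, peak_time

# Synthetic probe set: 18 evenly spaced samples, peak 15 h into the cycle
w = 2.0 * math.pi / PERIOD_H
times_h = [i * PERIOD_H / 18 for i in range(18)]
values = [5.0 + 2.0 * math.cos(w * (t - 15.0)) for t in times_h]
mesor, amplitude, peak_time = fit_cosinor(times_h, values)
print(f"mesor={mesor:.2f} amplitude={amplitude:.2f} peak={peak_time:.1f} h")
```

In a real screen the fitted amplitude would be compared against the residual noise, and the peak times could then be used to group probe sets into clusters by phase.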
Pharmacogenetics and pharmacogenomics involve the study of the role of inheritance in individual variation in drug response, a phenotype that varies from potentially life-threatening adverse drug reactions to equally serious lack of therapeutic efficacy. Pharmacogenetics-pharmacogenomics represents a major component of the movement to 'individualized medicine'. Pharmacogenetic studies originally focused on monogenic traits, often involving genetic variation in drug metabolism. However, contemporary studies increasingly involve entire 'pathways' that include both pharmacokinetics (PKs), the factors that influence the concentration of a drug reaching its target(s), and pharmacodynamics (PDs), the factors associated with the drug target(s), as well as genome-wide approaches. The convergence of advances in pharmacogenetics with rapid developments in human genomics has resulted in the evolution of pharmacogenetics into pharmacogenomics. At the same time, studies of drug response are expanding beyond genomics to encompass pharmacotranscriptomics and pharmacometabolomics to become a systems-based discipline. This discipline is also increasingly moving across the 'translational interface' into the clinic and is being incorporated into the drug development process and governmental regulation of that process. The article will provide an overview of the development of pharmacogenetics-pharmacogenomics, the scientific advances that have contributed to the continuing evolution of this discipline, the incorporation of transcriptomic and metabolomic data into attempts to understand and predict variation in drug response phenotypes, as well as challenges associated with the 'translation' of this important aspect of biomedical science into the clinic.
Comprehensively understanding corticosteroid pharmacogenomic effects is an essential step towards an insight into the underlying molecular mechanisms for both beneficial and detrimental clinical effects. Nevertheless, even in a single tissue different methods of corticosteroid administration can induce different patterns of expression and regulatory control structures. Therefore, rich in vivo datasets of pharmacological time-series with two dosing regimens sampled from rat liver are examined for temporal patterns of changes in gene expression and their regulatory commonalities.
The study addresses two issues: (1) identifying significant transcriptional modules coupled with dynamic expression patterns and (2) predicting relevant common transcriptional controls to better understand the underlying mechanisms of corticosteroid adverse effects. In the spirit of meta-analysis, we propose an extended computational approach that builds on the agreement matrix from consensus clustering. It aims to identify gene clusters that share common expression patterns across multiple dosing regimens and to handle the challenges of analyzing microarray data from heterogeneous sources, e.g. different platforms and time-grids in this study. Six significant transcriptional modules coupled with typical patterns of expression have been identified. Functional analysis reveals that virtually all enriched functions (gene ontologies, pathways) in these modules are related to metabolic processes, implying the importance of these modules in adverse effects under the administration of corticosteroids. Relevant putative transcriptional regulators (e.g. RXRF, FKHD, SP1F) are also predicted to provide another source of information towards better understanding the complexities of expression patterns and the underlying regulatory mechanisms of those modules.
We have proposed a framework to identify significant coexpressed clusters of genes across multiple conditions assayed on different microarray platforms, time-grids, and, if applicable, tissues. Analysis of rich in vivo datasets of corticosteroid time-series yielded significant insights into the pharmacogenomic effects of corticosteroids, especially their relevance to metabolic side-effects. This has been illustrated through enriched metabolic functions in those transcriptional modules and the presence of GRE binding motifs in those enriched pathways, providing significant modules for further analysis of pharmacogenomic corticosteroid effects.
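The agreement-matrix idea from consensus clustering can be illustrated with a minimal sketch. Each dataset is clustered on its own, so platform and time-grid differences never have to be reconciled directly; only the resulting cluster labels are compared. This is the basic concept rather than the extended approach of the study, and the gene names, cluster labels, cutoff, and function names below are hypothetical.

```python
from itertools import combinations

def agreement_matrix(partitions):
    """Fraction of partitions in which each pair of genes shares a cluster.

    `partitions` is a list of dicts mapping gene -> cluster label, one dict
    per dataset (e.g. one per dosing regimen). Labels need not match across
    datasets because only within-dataset equality is checked.
    """
    genes = sorted(partitions[0])
    agree = {}
    for a, b in combinations(genes, 2):
        agree[(a, b)] = sum(p[a] == p[b] for p in partitions) / len(partitions)
    return agree

def consensus_modules(partitions, cutoff=1.0):
    """Group genes whose pairwise agreement reaches the cutoff (union-find)."""
    genes = sorted(partitions[0])
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for (a, b), score in agreement_matrix(partitions).items():
        if score >= cutoff:
            parent[find(a)] = find(b)
    modules = {}
    for g in genes:
        modules.setdefault(find(g), set()).add(g)
    return sorted(modules.values(), key=lambda m: sorted(m))

# Hypothetical cluster assignments from two separately clustered datasets
bolus_clusters = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
infusion_clusters = {"g1": "A", "g2": "A", "g3": "A", "g4": "B"}

agreement = agreement_matrix([bolus_clusters, infusion_clusters])
modules = consensus_modules([bolus_clusters, infusion_clusters], cutoff=1.0)
print(agreement[("g1", "g2")], modules)
```

With a cutoff of 1.0 only genes that co-cluster under every regimen are kept together; lowering the cutoff trades strictness for larger modules.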
Gene expression levels in a given cell can be influenced by different factors, such as pharmacological or medical treatments. The response to a given stimulus is usually different for different genes and may depend on time. One of the goals of modern molecular biology is the high-throughput identification of genes associated with a particular treatment or a biological process of interest. From a methodological and computational point of view, analyzing high-dimensional time-course microarray data requires a very specific set of tools which are usually not included in standard software packages. Recently, the authors of this paper developed a fully Bayesian approach which allows one to identify differentially expressed genes in a 'one-sample' time-course microarray experiment, to rank them and to estimate their expression profiles. The method is based on explicit expressions for calculations and, hence, is very computationally efficient.
The software package BATS (Bayesian Analysis of Time Series) presented here implements the methodology described above. It allows a user to automatically identify and rank differentially expressed genes and to estimate their expression profiles when at least 5–6 time points are available. The package has a user-friendly interface. BATS successfully manages various technical difficulties which arise in time-course microarray experiments, such as a small number of observations, non-uniform sampling intervals and replicated or missing data.
BATS is free, user-friendly software for the analysis of both simulated and real microarray time-course experiments. The software, the user manual and a brief illustrative example are freely available online at the BATS website:
The Hedgehog (Hh) pathway is involved in oncogenic transformation and tumor maintenance. The primary objective of this study was to select surrogate tissue to measure messenger ribonucleic acid (mRNA) levels of Hh pathway genes for measurement of pharmacodynamic effect. Expression of Hh pathway specific genes was measured by quantitative real-time polymerase chain reaction (qRT-PCR) and global gene expression using Affymetrix U133 microarrays. Correlations were made between the expression of specific genes determined by qRT-PCR and normalized microarray data. Gene ontology analysis using microarray data for a broader set of Hh pathway genes was performed to identify additional Hh pathway-related markers in the surrogate tissue. RNA extracted from blood, hair follicle, and skin obtained from healthy subjects was analyzed by qRT-PCR for 31 genes, whereas 8 samples were analyzed for a 7-gene subset. Twelve sample sets, each with ≤500 ng total RNA derived from hair, skin, and blood, were analyzed using Affymetrix U133 microarrays. Transcripts for several Hh pathway genes were undetectable in blood using qRT-PCR. Skin was the most desirable matrix, followed by hair follicle. Whether processed by robust multiarray average or microarray suite 5 (MAS5), expression patterns of individual samples showed co-clustered signals; both normalization methods were equally effective for unsupervised analysis. The MAS5-normalized probe sets appeared better suited for supervised analysis. This work provides the basis for selection of a surrogate tissue and an expression analysis-based approach to evaluate pathway-related genes as markers of pharmacodynamic effect with novel inhibitors of the Hh pathway.
Hedgehog; smoothened; biomarkers; cancer; skin
Corticosteroids (CS) regulate many enzymes at both mRNA and protein levels. This study used microarrays to broadly assess regulation of various genes related to the greater urea cycle and employed pharmacokinetic/pharmacodynamic (PK/PD) modeling to quantitatively analyze and compare the temporal profiles of these genes during acute and chronic exposure to methylprednisolone (MPL). One group of adrenalectomized male Wistar rats received an intravenous bolus dose (50 mg/kg) of MPL, whereas a second group received MPL by a subcutaneous infusion (Alzet osmotic pumps) at a rate of 0.3 mg/kg/hr for seven days. The rats were sacrificed at various time points over 72 hours (acute) or 168 hours (chronic) and livers were harvested. Total RNA was extracted and Affymetrix® gene chips (RG_U34A for acute and RAE 230A for chronic) were used to identify genes regulated by CS. Besides five primary urea cycle enzymes, many other genes related to the urea cycle showed substantial changes in mRNA expression. Some genes that were simply up- or down-regulated after acute MPL showed complex biphasic patterns upon chronic infusion, indicating involvement of secondary regulation. For the simplest patterns, indirect response models were used to describe the nuclear steroid-bound receptor mediated increase or decrease in gene transcription (e.g. tyrosine aminotransferase, glucocorticoid receptor). For the biphasic profiles, involvement of a secondary biosignal was assumed (e.g. ornithine decarboxylase, CCAAT/enhancer binding protein) and more complex models were derived. Microarrays were used successfully to explore CS effects on various urea cycle enzyme genes. PD models presented in this report describe testable hypotheses regarding molecular mechanisms and quantitatively characterize the direct or indirect regulation of various genes by CS.
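To make the simplest case concrete, the sketch below simulates a generic indirect response model in which the drug stimulates the zero-order production rate of an mRNA: dR/dt = k_in (1 + S_max C / (SC50 + C)) - k_out R. It is a deliberately reduced illustration, not the models of this report: the drug concentration drives transcription directly rather than through receptor dynamics, and every parameter value, concentration profile, and function name is a hypothetical placeholder.

```python
import math

def simulate_idr(conc, t_end, dt=0.01, k_in=10.0, k_out=0.5, s_max=3.0, sc50=10.0):
    """Euler integration of an indirect response model with stimulated production.

    conc  : function giving drug concentration at time t (hours)
    Returns (times, response); the response starts at baseline k_in / k_out.
    All default parameter values are illustrative assumptions.
    """
    r = k_in / k_out
    times, response = [0.0], [r]
    for i in range(round(t_end / dt)):
        t = i * dt
        c = conc(t)
        stim = s_max * c / (sc50 + c)           # saturable stimulation
        r += dt * (k_in * (1.0 + stim) - k_out * r)
        times.append(t + dt)
        response.append(r)
    return times, response

def bolus_conc(t):
    """Hypothetical mono-exponential decline after an intravenous bolus."""
    return 100.0 * math.exp(-1.0 * t)

def infusion_conc(t):
    """Hypothetical constant concentration during a continuous infusion."""
    return 5.0

_, acute = simulate_idr(bolus_conc, t_end=72.0)       # 72 h acute study window
_, chronic = simulate_idr(infusion_conc, t_end=168.0) # 168 h chronic study window
print(f"acute peak {max(acute):.1f}, acute final {acute[-1]:.1f}, "
      f"chronic final {chronic[-1]:.1f}")
```

In this reduced model the acute profile rises and returns to baseline while the chronic profile settles at a new plateau. A biphasic profile during infusion cannot arise from it, which is consistent with the need for a secondary biosignal noted above.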
urea cycle; corticosteroids; methylprednisolone; pharmacodynamics; genomics
Despite the mounting research on the Arabidopsis transcriptome and the powerful tools available to explore the biology of this model plant, the organization of expression of the Arabidopsis genome is only partially understood. Here, we create a coexpression network from a dataset of 22,746 Affymetrix probes derived from 963 microarray chips that query the transcriptome in response to a wide variety of environmentally, genetically, and developmentally induced perturbations.
Markov chain graph clustering of the coexpression network delineates 998 regulons ranging from one to 1623 genes in size. To assess the significance of the clustering results, the statistical over-representation of GO terms is averaged over this set of regulons and compared to the analogous values for 100 randomly-generated sets of clusters. The set of regulons derived from the experimental data scores significantly better than any of the randomly-generated sets. Most regulons correspond to identifiable biological processes and include a combination of genes encoding related developmental, metabolic pathway, and regulatory functions. In addition, nearly 3000 genes of unknown molecular function or process are assigned to a regulon. Only five regulons contain plastomic genes; four of these are exclusively plastomic. In contrast, expression of the mitochondrial genome is highly integrated with that of nuclear genes; each of the seven regulons containing mitochondrial genes also incorporates nuclear genes. The network of regulons reveals a higher-level organization, with dense local neighborhoods articulated for photosynthetic function, genetic information processing, and stress response.
This analysis creates a framework for generation of experimentally testable hypotheses, gives insight into the concerted functions of Arabidopsis at the transcript level, and provides a test bed for comparative systems analysis.