1.  RS-Predictor: A new tool for predicting sites of cytochrome P450-mediated metabolism applied to CYP 3A4 
This article describes RegioSelectivity-Predictor (RS-Predictor), a new in silico method for generating predictive models of P450-mediated metabolism for drug-like compounds. Within this method, potential sites of metabolism (SOMs) are represented as “metabolophores”: a concept that describes the hierarchical combination of topological and quantum chemical descriptors needed to represent the reactivity of potential metabolic reaction sites. RS-Predictor modeling involves the use of metabolophore descriptors together with multiple-instance ranking (MIRank) to generate an optimized descriptor weight vector that encodes regioselectivity trends across all cases in a training set. The resulting pathway-independent, isozyme-specific regioselectivity model may be used to predict potential metabolic liabilities. In the present work, cross-validated RS-Predictor models were generated for a set of 394 substrates of CYP 3A4 as a proof-of-principle for the method. Rank aggregation was then employed to merge independently generated predictions for each substrate into a single consensus prediction. The resulting consensus RS-Predictor models were shown to reliably identify at least one observed site of metabolism in the top two rank-positions on 78% of the substrates. Comparisons between RS-Predictor and previously described regioselectivity prediction methods reveal new insights into how in silico metabolite prediction methods should be compared.
doi:10.1021/ci2000488
PMCID: PMC3144271  PMID: 21528931
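As an illustration of the consensus step described in entry 1, the sketch below merges several rankings of candidate sites with a Borda count and then checks the "top two rank-positions" criterion. The abstract does not say which rank aggregation scheme RS-Predictor uses, so Borda counting is an assumption here, and the site labels (`C7`, `N3`, `C12`) and function names are hypothetical.

```python
from collections import defaultdict


def borda_consensus(rankings):
    """Merge rankings (each listed best-first) into one consensus ranking.

    Assumption: Borda counting stands in for the unspecified
    rank aggregation method mentioned in the abstract.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, site in enumerate(ranking):
            # The best site in a ranking earns n points, the worst earns 1.
            scores[site] += n - position
    # Sort by descending score; break ties by label so output is deterministic.
    return sorted(scores, key=lambda site: (-scores[site], site))


def top_k_hit(consensus, observed_sites, k=2):
    """True if any experimentally observed site is among the top k predictions."""
    return any(site in observed_sites for site in consensus[:k])


# Three independently generated rankings for one hypothetical substrate.
rankings = [
    ["C7", "N3", "C12"],
    ["N3", "C7", "C12"],
    ["N3", "C12", "C7"],
]
consensus = borda_consensus(rankings)
hit = top_k_hit(consensus, observed_sites={"C7"}, k=2)
```

Averaging `top_k_hit` over all substrates in a test set would give a figure comparable in kind to the 78% reported above, though the actual evaluation protocol is not described in the abstract.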
2.  Examining the sublineage structure of Mycobacterium tuberculosis complex strains with multiple-biomarker tensors 
Strains of the Mycobacterium tuberculosis complex (MTBC) can be classified into coherent lineages of similar traits based on their genotype. We present a tensor clustering framework to group MTBC strains into sublineages of the known major lineages based on two biomarkers: spacer oligonucleotide type (spoligotype) and mycobacterial interspersed repetitive units (MIRU). We represent genotype information of MTBC strains in a high-dimensional array in order to include information about spoligotype, MIRU, and their coexistence using multiple-biomarker tensors. We use multiway models to transform this multidimensional data about the MTBC strains into two-dimensional arrays and use the resulting score vectors in a stable partitive clustering algorithm to classify MTBC strains into sublineages. We validate clusterings using cluster stability and accuracy measures, and find stabilities of each cluster. Based on validated clustering results, we present a sublineage structure of MTBC strains and compare it to the sublineage structures of SpolDB4 and MIRU-VNTRplus.
doi:10.1109/BIBM.2010.5706625
PMCID: PMC3315393  PMID: 22466374
Tuberculosis; Mycobacterium tuberculosis complex; multiway models; clustering; cluster validation
3.  Data-driven insights into deletions of Mycobacterium tuberculosis complex chromosomal DR region using spoligoforests 
Biomarkers of Mycobacterium tuberculosis complex (MTBC) mutate over time. Among the biomarkers of MTBC, spacer oligonucleotide type (spoligotype) and Mycobacterium Interspersed Repetitive Unit (MIRU) patterns are commonly used to genotype clinical MTBC strains. In this study, we present an evolution model of spoligotype rearrangements using MIRU patterns to disambiguate the ancestors of spoligotypes, in a large patient dataset from the United States Centers for Disease Control and Prevention (CDC). Based on the contiguous deletion assumption and rare observation of convergent evolution, we first generate the most parsimonious forest of spoligotypes, called a spoligoforest, using three genetic distance measures. An analysis of topological attributes of the spoligoforest and number of variations at the direct repeat (DR) locus of each strain reveals interesting properties of deletions in the DR region. First, we compare our mutation model to existing mutation models of spoligotypes and find that our mutation model produces as many within-lineage mutation events as other models, with slightly higher segregation accuracy. Second, based on our mutation model, the number of descendant spoligotypes follows a power law distribution. Third, contrary to prior studies, the power law distribution does not plausibly fit to the mutation length frequency. Finally, the total number of mutation events at consecutive DR loci follows a bimodal distribution, which results in accumulation of shorter deletions in the DR region. The two modes are spacers 13 and 40, which are hotspots for chromosomal rearrangements. The change point in the bimodal distribution is spacer 34, which is absent in most MTBC strains. This bimodal separation results in accumulation of shorter deletions, which explains why a power law distribution is not a plausible fit to the mutation length frequency.
doi:10.1109/BIBM.2011.64
PMCID: PMC3279189  PMID: 22343484
tuberculosis; Mycobacterium tuberculosis complex; DR locus; spoligotype; MIRU-VNTR; mutation
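The contiguous deletion assumption in entry 3 can be made concrete with a small check: a child spoligotype is a candidate descendant of a parent only if it can be reached by removing one contiguous run of spacers, with no spacer gained. The sketch below is a minimal illustration under that assumption; it is not the authors' spoligoforest algorithm, the function name is hypothetical, and the 7-character patterns are toy data (real spoligotypes have 43 spacers).

```python
def deleted_block(parent, child):
    """Return (start, end) of the single contiguous deletion that turns
    parent into child, or None if no such deletion exists.

    Patterns are strings of '1' (spacer present) and '0' (spacer absent).
    Indices are 0-based and inclusive.
    """
    if len(parent) != len(child):
        raise ValueError("spoligotypes must have the same number of spacers")
    pairs = list(zip(parent, child))
    # Spacers cannot be regained under the deletion-only model.
    if any(p == "0" and c == "1" for p, c in pairs):
        return None
    lost = [i for i, (p, c) in enumerate(pairs) if p == "1" and c == "0"]
    if not lost:
        return None  # identical patterns: no mutation event
    start, end = lost[0], lost[-1]
    # Every spacer inside the deleted span must be absent in the child;
    # spacers already missing in the parent may sit inside the span.
    if "1" in child[start:end + 1]:
        return None
    return start, end


single = deleted_block("1111011", "1100011")   # spacers 2-3 lost together
spanning = deleted_block("1101011", "1000011")  # span covers an already-absent spacer
two_events = deleted_block("1111111", "1011011")  # two separate losses
```

The length of a deletion is `end - start + 1`; tallying these lengths over the edges of a forest would give a mutation length frequency of the kind discussed in the abstract.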
4.  Sublineage structure analysis of Mycobacterium tuberculosis complex strains using multiple-biomarker tensors 
BMC Genomics  2011;12(Suppl 2):S1.
Background
Strains of Mycobacterium tuberculosis complex (MTBC) can be classified into major lineages based on their genotype. Further subdivision of major lineages into sublineages requires multiple biomarkers along with methods to combine and analyze multiple sources of information in one unsupervised learning model. Typically, spacer oligonucleotide type (spoligotype) and mycobacterial interspersed repetitive units (MIRU) are used for TB genotyping and surveillance. Here, we examine the sublineage structure of MTBC strains with multiple biomarkers simultaneously, by employing a tensor clustering framework (TCF) on multiple-biomarker tensors.
Results
Simultaneous analysis of the spoligotype and MIRU type of strains using TCF on multiple-biomarker tensors leads to coherent sublineages of major lineages with clear and distinctive spoligotype and MIRU signatures. Comparison of tensor sublineages with SpolDB4 families either supports tensor sublineages, or suggests subdivision or merging of SpolDB4 families. High prediction accuracy of major lineage classification with supervised tensor learning on multiple-biomarker tensors validates our unsupervised analysis of sublineages on multiple-biomarker tensors.
Conclusions
TCF on multiple-biomarker tensors achieves simultaneous analysis of multiple biomarkers and suggests a new putative sublineage structure for each major lineage. Analysis of multiple-biomarker tensors gives insight into the sublineage structure of MTBC at the genomic level.
doi:10.1186/1471-2164-12-S2-S1
PMCID: PMC3194230  PMID: 21988942
5.  Determination of Major Lineages of Mycobacterium tuberculosis Complex using Mycobacterial Interspersed Repetitive Units 
We present a novel Bayesian network (BN) to classify strains of Mycobacterium tuberculosis Complex (MTBC) into six major genetic lineages using mycobacterial interspersed repetitive units (MIRUs), a high-throughput biomarker. MTBC is the causative agent of tuberculosis (TB), which remains one of the leading causes of disease and morbidity world-wide. DNA fingerprinting methods such as MIRU are key components of modern TB control and tracking. The BN achieves high accuracy on four large MTBC genotype collections consisting of over 4700 distinct 12-loci MIRU genotypes. The BN captures distinct MIRU signatures associated with each lineage, explaining the excellent performance of the BN. The errors in the BN support the need for additional biomarkers such as the expanded 24-loci MIRU used in CDC genotyping labs since May 2009. The conditional independence assumption of each locus given the lineage makes the BN easily extensible to additional MIRU loci and other biomarkers.
doi:10.1109/BIBM.2009.86
PMCID: PMC2954607  PMID: 20953280
tuberculosis; MIRU-VNTR; Bayesian network; lineages
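Entry 5 states that each locus is conditionally independent given the lineage, which is the structure of a naive Bayes classifier. The sketch below shows that structure on toy data. It is an illustration only: the abstract does not give the published BN's parameterization, so the Laplace smoothing, the function names, the two-locus genotypes, and the lineage labels `A` and `B` are all assumptions.

```python
import math
from collections import Counter


def train(genotypes, lineages, alpha=1.0):
    """Count repeat numbers per locus within each lineage."""
    n_loci = len(genotypes[0])
    lineage_counts = Counter(lineages)
    locus_counts = {l: [Counter() for _ in range(n_loci)] for l in lineage_counts}
    seen_values = [set() for _ in range(n_loci)]
    for genotype, lineage in zip(genotypes, lineages):
        for j, value in enumerate(genotype):
            locus_counts[lineage][j][value] += 1
            seen_values[j].add(value)
    return {
        "lineage_counts": lineage_counts,
        "locus_counts": locus_counts,
        "seen_values": seen_values,
        "alpha": alpha,
        "total": len(lineages),
    }


def log_posteriors(model, genotype):
    """Unnormalized log P(lineage | genotype) for every lineage."""
    alpha = model["alpha"]
    result = {}
    for lineage, count in model["lineage_counts"].items():
        score = math.log(count / model["total"])  # log prior
        for j, value in enumerate(genotype):
            # Laplace smoothing; one extra category is reserved for
            # repeat numbers never seen at this locus during training.
            k = len(model["seen_values"][j]) + 1
            numerator = model["locus_counts"][lineage][j][value] + alpha
            score += math.log(numerator / (count + alpha * k))
        result[lineage] = score
    return result


def predict(model, genotype):
    scores = log_posteriors(model, genotype)
    return max(scores, key=scores.get)


# Toy 2-locus genotypes (hypothetical repeat counts), not real MIRU data.
genotypes = [(2, 3), (2, 3), (2, 4), (5, 1), (5, 1), (4, 1)]
lineages = ["A", "A", "A", "B", "B", "B"]
model = train(genotypes, lineages)
pred_a = predict(model, (2, 3))
pred_b = predict(model, (5, 1))
```

Because each locus contributes one additive log term, extending the model to more loci or to another biomarker only lengthens the genotype tuple, which is consistent with the extensibility claim in the abstract.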
6.  A conformal Bayesian network for classification of Mycobacterium tuberculosis complex lineages 
BMC Bioinformatics  2010;11(Suppl 3):S4.
Background
We present a novel conformal Bayesian network (CBN) to classify strains of Mycobacterium tuberculosis Complex (MTBC) into six major genetic lineages based on two high-throughput biomarkers: mycobacterial interspersed repetitive units (MIRU) and spacer oligonucleotide typing (spoligotyping). MTBC is the causative agent of tuberculosis (TB), which remains one of the leading causes of disease and morbidity world-wide. DNA fingerprinting methods such as MIRU and spoligotyping are key components in the control and tracking of modern TB.
Results
CBN is designed to exploit background knowledge about MTBC biomarkers. It can be trained on large historical TB databases of various subsets of MTBC biomarkers. During TB control efforts not all biomarkers may be available. So, CBN is designed to predict the major lineage of isolates genotyped by any combination of the PCR-based typing methods: spoligotyping and MIRU typing. CBN achieves high accuracy on three large MTBC collections consisting of over 34,737 isolates genotyped by different combinations of spoligotypes, 12 loci of MIRU, and 24 loci of MIRU. CBN captures distinct MIRU and spoligotype signatures associated with each lineage, explaining its excellent performance. Visualization of MIRU and spoligotype signatures yields insight into both how the model works and the genetic diversity of MTBC.
Conclusions
CBN conforms to the available PCR-based biological markers and achieves high performance in identifying major lineages of MTBC. The method can be readily extended as new biomarkers are introduced for TB tracking and control. An online tool (http://www.cs.rpi.edu/~bennek/tbinsight/tblineage) makes the CBN model available for TB control and research efforts.
doi:10.1186/1471-2105-11-S3-S4
PMCID: PMC2863063  PMID: 20438651
7.  Multiway modeling and analysis in stem cell systems biology 
BMC Systems Biology  2008;2:63.
Background
Systems biology refers to multidisciplinary approaches designed to uncover emergent properties of biological systems. Stem cells are an attractive target for this analysis, due to their broad therapeutic potential. A central theme of systems biology is the use of computational modeling to reconstruct complex systems from a wealth of reductionist, molecular data (e.g., gene/protein expression, signal transduction activity, metabolic activity, etc.). A number of deterministic, probabilistic, and statistical learning models are used to understand sophisticated cellular behaviors such as protein expression during cellular differentiation and the activity of signaling networks. However, many of these models are bimodal, i.e., they only consider row-column relationships. In contrast, multiway modeling techniques (also known as tensor models) can analyze multimodal data, which capture much more information about complex behaviors such as cell differentiation. In particular, tensors can be very powerful tools for modeling the dynamic activity of biological networks over time. Here, we review the application of systems biology to stem cells and illustrate application of tensor analysis to model collagen-induced osteogenic differentiation of human mesenchymal stem cells.
Results
We applied Tucker1, Tucker3, and Parallel Factor Analysis (PARAFAC) models to identify protein/gene expression patterns during extracellular matrix-induced osteogenic differentiation of human mesenchymal stem cells. In one case, we organized our data into a tensor of type protein/gene locus link × gene ontology category × osteogenic stimulant, and found that our cells expressed two distinct, stimulus-dependent sets of functionally related genes as they underwent osteogenic differentiation. In a second case, we organized DNA microarray data in a three-way tensor of gene IDs × osteogenic stimulus × replicates, and found that application of tensile strain to a collagen I substrate accelerated the osteogenic differentiation induced by a static collagen I substrate.
Conclusion
Our results suggest gene- and protein-level models whereby stem cells undergo transdifferentiation to osteoblasts, and lay the foundation for mechanistic, hypothesis-driven studies. Our analysis methods are applicable to a wide range of stem cell differentiation models.
doi:10.1186/1752-0509-2-63
PMCID: PMC2527292  PMID: 18625054
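Entry 7 contrasts two-way (row-column) models with multiway models. One way to see the connection is mode-1 unfolding, which flattens a three-way array into a matrix; Tucker1 is commonly described as principal component analysis applied to such an unfolded array. The sketch below shows only the unfolding step, using nested lists. The function name and the toy 2 × 2 × 3 array (standing in for a gene × stimulus × replicate tensor) are hypothetical, and the column ordering is one of several conventions in use.

```python
def unfold_mode1(tensor):
    """Flatten an I x J x K nested list into an I x (J*K) matrix.

    Each row keeps the first-mode index (e.g. gene); columns run over
    all (j, k) pairs with k varying fastest. Other texts order the
    columns differently, so treat this ordering as an assumption.
    """
    return [[value for row in slab for value in row] for slab in tensor]


# Toy tensor: 2 genes x 2 stimuli x 3 replicates (made-up values).
tensor = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
matrix = unfold_mode1(tensor)
```

Tucker3 and PARAFAC, by contrast, keep all three modes and fit a separate factor matrix for each, which is what lets them separate stimulus effects from replicate effects.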
8.  Laminin-5 activates extracellular matrix production and osteogenic gene focusing in human mesenchymal stem cells 
We recently reported that laminin-5, expressed by human mesenchymal stem cells (hMSC), stimulates osteogenic gene expression in these cells in the absence of any other osteogenic stimulus. Here we employ two-dimensional liquid chromatography and tandem mass spectrometry, along with the Database for Annotation, Visualization and Integrated Discovery (DAVID), to obtain a more comprehensive profile of the protein (and hence gene) expression changes occurring during laminin-5-induced osteogenesis of hMSC. Specifically, we compare the protein expression profiles of undifferentiated hMSC, hMSC cultured on laminin-5 (Ln-5 hMSC), and fully differentiated human osteoblasts (hOST) with profiles from hMSC treated with well-established osteogenic stimuli (collagen I, vitronectin, or dexamethasone). We find a marked reduction in the number of proteins (e.g., those involved with calcium signaling and cellular metabolism) expressed in Ln-5 hMSC compared to hMSC, consistent with our previous finding that hOST express far fewer proteins than do their hMSC progenitors, a pattern we call “osteogenic gene focusing.” This focused set, which resembles an intermediate stage between hMSC and mature hOST, mirrors the expression profiles of hMSC exposed to established osteogenic stimuli and includes osteogenic extracellular matrix proteins (collagen, vitronectin) and their integrin receptors, calcium signaling proteins, and enzymes involved in lipid metabolism. These results provide direct evidence that laminin-5 alone stimulates global changes in gene/protein expression in hMSC that lead to commitment of these cells to the osteogenic phenotype, and that this commitment correlates with extracellular matrix production.
doi:10.1016/j.matbio.2006.10.001
PMCID: PMC1852504  PMID: 17137774
Laminin-5; extracellular matrix; mesenchymal stem cells; osteogenesis
9.  Proteomics reveals multiple routes to the osteogenic phenotype in mesenchymal stem cells 
BMC Genomics  2007;8:380.
Background
Recently, we demonstrated that human mesenchymal stem cells (hMSC) stimulated with dexamethasone undergo gene focusing during osteogenic differentiation (Stem Cells Dev 14(6): 1608–20, 2005). Here, we examine the protein expression profiles of three additional populations of hMSC stimulated to undergo osteogenic differentiation via either contact with pro-osteogenic extracellular matrix (ECM) proteins (collagen I, vitronectin, or laminin-5) or osteogenic media supplements (OS media). Specifically, we annotate these four protein expression profiles, as well as profiles from naïve hMSC and differentiated human osteoblasts (hOST), with known gene ontologies and analyze them as a tensor with modes for the expressed proteins, gene ontologies, and stimulants.
Results
Direct component analysis in the gene ontology space identifies three components that account for 90% of the variance between hMSC, osteoblasts, and the four stimulated hMSC populations. The directed component maps the differentiation stages of the stimulated stem cell populations along the differentiation axis created by the difference in the expression profiles of hMSC and hOST. Surprisingly, hMSC treated with ECM proteins lie closer to osteoblasts than do hMSC treated with OS media. Additionally, the second component demonstrates that proteomic profiles of collagen I- and vitronectin-stimulated hMSC are distinct from those of OS-stimulated cells. A three-mode tensor analysis reveals additional focus proteins critical for characterizing the phenotypic variations between naïve hMSC, partially differentiated hMSC, and hOST.
Conclusion
The differences between the proteomic profiles of OS-stimulated hMSC and ECM-hMSC characterize different transitional phenotypes en route to becoming osteoblasts. This conclusion is arrived at via a three-mode tensor analysis validated using hMSC plated on laminin-5.
doi:10.1186/1471-2164-8-380
PMCID: PMC2148065  PMID: 17949499
