Related Articles

1.  xQTL workbench: a scalable web environment for multi-level QTL analysis 
Bioinformatics  2012;28(7):1042-1044.
Summary: xQTL workbench is a scalable web platform for the mapping of quantitative trait loci (QTLs) at multiple levels: for example, gene expression (eQTL), protein abundance (pQTL), metabolite abundance (mQTL) and phenotype (phQTL) data. Popular QTL mapping methods for model organism and human populations are accessible via the web user interface. Large calculations scale easily onto multi-core computers, clusters and Cloud. All data involved can be uploaded and queried online: markers, genotypes, microarrays, NGS, LC-MS, GC-MS, NMR, etc. When new data types become available, xQTL workbench is quickly customized using the Molgenis software generator.
Availability: xQTL workbench runs on all common platforms, including Linux, Mac OS X and Windows. An online demo system, installation guide, tutorials, software and source code are available under the LGPL3 license from http://www.xqtl.org.
Contact: m.a.swertz@rug.nl
doi:10.1093/bioinformatics/bts049
PMCID: PMC3315722  PMID: 22308096
2.  XGAP: a uniform and extensible data model and software platform for genotype and phenotype experiments 
Genome Biology  2010;11(3):R27.
XGAP, a software platform for the integration and analysis of genotype and phenotype data.
We present an extensible software model for the genotype and phenotype community, XGAP. Readers can download a standard XGAP (http://www.xgap.org) or auto-generate a custom version using MOLGENIS with programming interfaces to R-software and web-services or user interfaces for biologists. XGAP has simple load formats for any type of genotype, epigenotype, transcript, protein, metabolite or other phenotype data. Current functionality includes tools ranging from eQTL analysis in mouse to genome-wide association studies in humans.
doi:10.1186/gb-2010-11-3-r27
PMCID: PMC2864567  PMID: 20214801
3.  solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database 
BMC Bioinformatics  2010;11:525.
Background
A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases.
Description
The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL, http://solgenomics.net/qtl/, employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application.
Conclusions
solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes underlying phenotypic variation and markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.
doi:10.1186/1471-2105-11-525
PMCID: PMC2984588  PMID: 20964836
4.  Genetic networks in the mouse retina: Growth Associated Protein 43 and Phosphatase Tensin Homolog network 
Molecular Vision  2011;17:1355-1372.
Purpose
The present study examines the structure and covariance of endogenous variation in gene expression across the recently expanded family of C57BL/6J (B) X DBA/2J (D) Recombinant Inbred (BXD RI) strains of mice. This work is accompanied by a highly interactive database that can be used to generate and test specific hypotheses. For example, we define the genetic network regulating growth associated protein 43 (Gap43) and phosphatase tensin homolog (Pten).
Methods
The Hamilton Eye Institute (HEI) Retina Database within GeneNetwork features the data analysis of 346 Illumina Sentrix BeadChip Arrays (mouse whole genome-6 version 2). Eighty strains of mice are presented, including 75 BXD RI strains, the parental strains (C57BL/6J and DBA/2J), the reciprocal crosses, and the BALB/cByJ mice. Independent biologic samples for at least two animals from each gender were obtained with a narrow age range (48 to 118 days). Total RNA was prepared followed by the production of biotinylated cRNAs, which were pipetted into the Mouse WG-6V2 arrays. The data was globally normalized with rank invariant and stabilization (2z+8).
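The "2z+8" stabilization named above can be illustrated with a short sketch. It assumes the convention usually described for GeneNetwork data sets: each array's log-scale values are converted to z-scores, multiplied by 2 and shifted by 8, so that every array ends up with mean 8 and standard deviation 2. The function name, the use of the population standard deviation and the sample values are hypothetical, and the rank-invariant normalization step is not shown.

```python
from statistics import mean, pstdev


def two_z_plus_eight(values):
    """Rescale one array's log-scale intensities to mean 8 and SD 2."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption of this sketch
    if sigma == 0:
        raise ValueError("cannot standardize a constant array")
    # z-score each value, then apply the 2z + 8 rescaling
    return [2 * (v - mu) / sigma + 8 for v in values]


# Hypothetical log2 intensities for five probes on one array
log2_intensities = [6.1, 7.4, 9.8, 11.2, 8.0]
rescaled = two_z_plus_eight(log2_intensities)
```

After this rescaling, a difference of one unit corresponds to half a standard deviation of that array's expression values.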
Results
The HEI Retina Database is located on the GeneNetwork website. The database was used to extract unique transcriptome signatures for specific cell types in the retina (retinal pigment epithelial, amacrine, and retinal ganglion cells). Two genes associated with axonal outgrowth (Gap43 and Pten) were used to display the power of this new retina database. Bioinformatic tools located within GeneNetwork in conjunction with the HEI Retina Database were used to identify the unique signature Quantitative Trait Loci (QTLs) for Gap43 and Pten on chromosomes 1, 2, 12, 15, 16, and 19. Gap43 and Pten possess networks that are similar to ganglion cell networks that may be associated with axonal growth in the mouse retina. This network involves high correlations of transcription factors (SRY sex determining region Y-box 2 [Sox2], paired box gene 6 [Pax6], and neurogenic differentiation 1 [Neurod1]), and genes involved in DNA binding (proliferating cell nuclear antigen [Pcna] and zinc finger, BED-type containing 4 [Zbed4]), as well as an inhibitor of DNA binding (inhibitor of DNA binding 2, dominant negative helix–loop–helix protein [Id2]). Furthermore, we identified the potential upstream modifiers on chromosome 2 (teashirt zinc finger homeobox 2 [Tshz2], RNA export 1 homolog [Rae1] and basic helix–loop–helix domain containing, class B4 [Bhlhb4]) and on chromosome 15 (RAB, member of RAS oncogene family-like 2a [Rabl2a], phosphomannomutase 1 [Pmm1], copine VIII [Cpne8], and fibulin 1 [Fbln1]).
Conclusions
The endogenous variation in mRNA levels among BXD RI strains can be used to explore and test expression networks underlying variation in retina structure, function, and disease susceptibility. The Gap43 and Pten network highlights the covariance of gene expression and forms a molecular network associated with axonal outgrowth in the adult retina.
PMCID: PMC3108897  PMID: 21655357
5.  The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age 
Computational Quantum Chemistry has developed into a powerful, efficient, reliable and increasingly routine tool for exploring the structure and properties of small to medium sized molecules. Many thousands of calculations are performed every day, some offering results which approach experimental accuracy. However, in contrast to other disciplines, such as crystallography, or bioinformatics, where standard formats and well-known, unified databases exist, this QC data is generally destined to remain locally held in files which are not designed to be machine-readable. Only a very small subset of these results will become accessible to the wider community through publication.
In this paper we describe how the Quixote Project is developing the infrastructure required to convert output from a number of different molecular quantum chemistry packages to a common semantically rich, machine-readable format and to build repositories of QC results. Such an infrastructure offers benefits at many levels. The standardised representation of the results will facilitate software interoperability, for example making it easier for analysis tools to take data from different QC packages, and will also help with archival and deposition of results. The repository infrastructure, which is lightweight and built using Open software components, can be implemented at individual researcher, project, organisation or community level, offering the exciting possibility that in future many of these QC results can be made publicly available, to be searched and interpreted just as crystallography and bioinformatics results are today.
Although we believe that quantum chemists will appreciate the contribution the Quixote infrastructure can make to the organisation and exchange of their results, we anticipate that greater rewards will come from enabling their results to be consumed by a wider community. As the repositories grow they will become a valuable source of chemical data for use by other disciplines in both research and education.
The Quixote project is unconventional in that the infrastructure is being implemented in advance of a full definition of the data model which will eventually underpin it. We believe that a working system which offers real value to researchers based on tools and shared, searchable repositories will encourage early participation from a broader community, including both producers and consumers of data. In the early stages, searching and indexing can be performed on the chemical subject of the calculations, and well defined calculation meta-data. The process of defining more specific quantum chemical definitions, adding them to dictionaries and extracting them consistently from the results of the various software packages can then proceed in an incremental manner, adding additional value at each stage.
Not only will these results help to change the data management model in the field of Quantum Chemistry, but the methodology can be applied to other pressing problems related to data in computational and experimental science.
doi:10.1186/1758-2946-3-38
PMCID: PMC3206452  PMID: 21999363
6.  R/qtl: high-throughput multiple QTL mapping 
Bioinformatics  2010;26(23):2990-2992.
Motivation: R/qtl is free and powerful software for mapping and exploring quantitative trait loci (QTL). R/qtl provides a fully comprehensive range of methods for a wide range of experimental cross types. We recently added multiple QTL mapping (MQM) to R/qtl. MQM adds higher statistical power to detect and disentangle the effects of multiple linked and unlinked QTL compared with many other methods. MQM for R/qtl adds many new features including improved handling of missing data, analysis of tens of thousands of molecular traits, permutation for determining significance thresholds for QTL and QTL hot spots, and visualizations for cis–trans and QTL interaction effects. MQM for R/qtl is the first free and open source implementation of MQM that is multi-platform, scalable and suitable for automated procedures and large genetical genomics datasets.
Availability: R/qtl is free and open source multi-platform software for the statistical language R, and is made available under the GPLv3 license. R/qtl can be installed from http://www.rqtl.org/. R/qtl queries should be directed at the mailing list, see http://www.rqtl.org/list/.
Contact: kbroman@biostat.wisc.edu
doi:10.1093/bioinformatics/btq565
PMCID: PMC2982156  PMID: 20966004
7.  Flow Cytometry Bioinformatics 
PLoS Computational Biology  2013;9(12):e1003365.
Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods.
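The probability binning idea mentioned above (density-guided binary space partitioning) can be sketched as follows. This is a minimal illustration, assuming the usual description of the technique: the events are split recursively at the median of the dimension with the largest variance, so that every bin holds a near-equal number of events. The function name and the eight two-parameter events are hypothetical, and the later step of comparing bin counts between samples is not shown.

```python
from statistics import pvariance


def probability_bins(events, depth):
    """Recursively split events into up to 2**depth bins of near-equal size."""
    if depth == 0 or len(events) < 2:
        return [events]
    n_dims = len(events[0])
    # Pick the dimension along which the events vary the most
    dim = max(range(n_dims), key=lambda d: pvariance([e[d] for e in events]))
    # Sorting and cutting at the midpoint is a median split on that dimension
    ordered = sorted(events, key=lambda e: e[dim])
    half = len(ordered) // 2
    return (probability_bins(ordered[:half], depth - 1)
            + probability_bins(ordered[half:], depth - 1))


# Hypothetical events measured on two fluorescence channels
events = [(1.0, 5.0), (2.0, 1.0), (3.0, 9.0), (4.0, 3.0),
          (5.0, 7.0), (6.0, 2.0), (7.0, 8.0), (8.0, 4.0)]
bins = probability_bins(events, depth=2)
```

Because bin boundaries follow the data, dense regions receive small bins and sparse regions large ones, which is what makes the partition density-guided.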
Open standards, data, and software are also key parts of flow cytometry bioinformatics. Data standards include the widely adopted Flow Cytometry Standard (FCS) defining how data from cytometers should be stored, but also several new standards under development by the International Society for Advancement of Cytometry (ISAC) to aid in storing more detailed information about experimental design and analytical steps. Open data is slowly growing with the opening of the CytoBank database in 2010 and FlowRepository in 2012, both of which allow users to freely distribute their data, and the latter of which has been recommended as the preferred repository for MIFlowCyt-compliant data by ISAC. Open software is most widely available in the form of a suite of Bioconductor packages, but is also available for web execution on the GenePattern platform.
doi:10.1371/journal.pcbi.1003365
PMCID: PMC3867282  PMID: 24363631
8.  Statistical properties of interval mapping methods on quantitative trait loci location: impact on QTL/eQTL analyses 
BMC Genetics  2012;13:29.
Background
Quantitative trait loci (QTL) detection on a large number of phenotypes, like eQTL detection on transcriptomic data, can be dramatically impaired by the statistical properties of interval mapping methods. One of these major outcomes is the high number of QTL detected at marker locations. The present study aims at identifying and specifying the sources of this bias, in particular in the case of analysis of data issued from outbred populations. Analytical developments were carried out in a backcross situation in order to specify the bias and to propose an algorithm to control it. The outbred population context was studied through simulated data sets in a wide range of situations.
The likelihood ratio test was first analyzed under the "one QTL" hypothesis in a backcross population. Designs of sib families were then simulated and analyzed using the QTL Map software. On the basis of the theoretical results in backcross, parameters such as the population size, the density of the genetic map, the QTL effect and the true location of the QTL were taken into account under the "no QTL" and the "one QTL" hypotheses. A combination of two nonparametric tests - the Kolmogorov-Smirnov test and the Mann-Whitney-Wilcoxon test - was used in order to identify the parameters that affected the bias and to specify how much they influenced the estimation of QTL location.
Results
A theoretical expression of the bias of the estimated QTL location was obtained for a backcross type population. We demonstrated a common source of bias under the "no QTL" and the "one QTL" hypotheses and qualified the possible influence of several parameters. Simulation studies confirmed that the bias exists in outbred populations under both the hypotheses of "no QTL" and "one QTL" on a linkage group. The QTL location was systematically closer to marker locations than expected, particularly in the case of low QTL effect, small population size or low density of markers, i.e. designs with low power. Practical recommendations for experimental designs for QTL detection in outbred populations are given on the basis of this bias quantification. Furthermore, an original algorithm is proposed to adjust the location of a QTL, obtained with interval mapping, that co-locates with a marker.
Conclusions
Therefore, one should be attentive when one QTL is mapped at the location of one marker, especially under low power conditions.
doi:10.1186/1471-2156-13-29
PMCID: PMC3386024  PMID: 22520935
QTL; linkage analysis; QTL location; bias
9.  ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining 
BMC Bioinformatics  2008;9(Suppl 9):S5.
Background
New systems biology studies require researchers to understand how interplay among myriads of biomolecular entities is orchestrated in order to achieve high-level cellular and physiological functions. Many software tools have been developed in the past decade to help researchers visually navigate large networks of biomolecular interactions with built-in template-based query capabilities. To further advance researchers' ability to interrogate global physiological states of cells through multi-scale visual network explorations, new visualization software tools still need to be developed to empower the analysis. A robust visual data analysis platform driven by database management systems to perform bi-directional data processing-to-visualizations with declarative querying capabilities is needed.
Results
We developed ProteoLens as a Java-based visual analytic software tool for creating, annotating and exploring multi-scale biological networks. It supports direct database connectivity to either Oracle or PostgreSQL database tables/views, on which SQL statements using both Data Definition Language (DDL) and Data Manipulation Language (DML) may be specified. The robust query languages embedded directly within the visualization software help users to bring their network data into a visualization context for annotation and exploration. ProteoLens supports graph/network represented data in standard Graph Modeling Language (GML) formats, and this enables interoperation with a wide range of other visual layout tools. The architectural design of ProteoLens enables the de-coupling of complex network data visualization tasks into two distinct phases: 1) creating network data association rules, which are mapping rules between network node IDs or edge IDs and data attributes such as functional annotations, expression levels, scores, synonyms, descriptions etc; 2) applying network data association rules to build the network and perform the visual annotation of graph nodes and edges according to associated data values. We demonstrated the advantages of these new capabilities through three biological network visualization case studies: human disease association network, drug-target interaction network and protein-peptide mapping network.
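The two-phase design described above can be sketched in a few lines. This is an illustration only, not ProteoLens code: the association rules are written as plain dictionaries where ProteoLens would derive them from SQL queries, and the function names, node IDs, attribute names and values are all hypothetical. Phase 1 defines rules that map node IDs to attribute values; phase 2 applies them to an edge list and writes the annotated network in GML.

```python
def apply_association_rules(edges, rules):
    """Phase 2: build the node set and annotate it with the phase-1 rules."""
    nodes = {}
    for source, target in edges:
        nodes.setdefault(source, {})
        nodes.setdefault(target, {})
    for attribute, mapping in rules.items():
        for node_id, value in mapping.items():
            if node_id in nodes:  # ignore rules for nodes not in the network
                nodes[node_id][attribute] = value
    return nodes


def to_gml(nodes, edges):
    """Serialize the annotated network as Graph Modeling Language text."""
    index = {node_id: i for i, node_id in enumerate(nodes)}
    lines = ["graph ["]
    for node_id, attrs in nodes.items():
        lines += ["  node [", f"    id {index[node_id]}", f'    label "{node_id}"']
        for key, value in attrs.items():
            rendered = f'"{value}"' if isinstance(value, str) else value
            lines.append(f"    {key} {rendered}")
        lines.append("  ]")
    for source, target in edges:
        lines += ["  edge [", f"    source {index[source]}",
                  f"    target {index[target]}", "  ]"]
    lines.append("]")
    return "\n".join(lines)


# Phase 1: hypothetical association rules (attribute -> node ID -> value)
rules = {
    "kind": {"drugA": "drug", "drugB": "drug", "proteinX": "protein"},
    "score": {"proteinX": 0.87},
}
edges = [("drugA", "proteinX"), ("drugB", "proteinX")]
nodes = apply_association_rules(edges, rules)
gml_text = to_gml(nodes, edges)
```

Keeping the rules separate from the network means the same edge list can be re-annotated with a different rule set without rebuilding the graph.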
Conclusion
The architectural design of ProteoLens makes it suitable for bioinformatics expert data analysts who are experienced with relational database management to perform large-scale integrated network visual explorations. ProteoLens is a promising visual analytic platform that will facilitate knowledge discoveries in future network and systems biology studies.
doi:10.1186/1471-2105-9-S9-S5
PMCID: PMC2537576  PMID: 18793469
10.  WormQTL—public archive and analysis web portal for natural variation data in Caenorhabditis spp 
Nucleic Acids Research  2012;41(Database issue):D738-D743.
Here, we present WormQTL (http://www.wormqtl.org), an easily accessible database enabling search, comparative analysis and meta-analysis of all data on variation in Caenorhabditis spp. Over the past decade, Caenorhabditis elegans has become instrumental for molecular quantitative genetics and the systems biology of natural variation. These efforts have resulted in a valuable amount of phenotypic, high-throughput molecular and genotypic data across different developmental worm stages and environments in hundreds of C. elegans strains. WormQTL provides a workbench of analysis tools for genotype–phenotype linkage and association mapping based on but not limited to R/qtl (http://www.rqtl.org). All data can be uploaded and downloaded using simple delimited text or Excel formats and are accessible via a public web user interface for biologists and R statistic and web service interfaces for bioinformaticians, based on open source MOLGENIS and xQTL workbench software. WormQTL welcomes data submissions from other worm researchers.
doi:10.1093/nar/gks1124
PMCID: PMC3531126  PMID: 23180786
11.  Towards systems genetic analyses in barley: Integration of phenotypic, expression and genotype data into GeneNetwork 
BMC Genetics  2008;9:73.
Background
A typical genetical genomics experiment results in four separate data sets: genotype, gene expression, higher-order phenotypic data and metadata that describe the protocols, processing and the array platform. Used in concert, these data sets provide the opportunity to perform genetic analysis at a systems level. Their predictive power is largely determined by the gene expression dataset where tens of millions of data points can be generated using currently available mRNA profiling technologies. Such large, multidimensional data sets often have value beyond that extracted during their initial analysis and interpretation, particularly if conducted on widely distributed reference genetic materials. Besides quality and scale, access to the data is of primary importance as accessibility potentially allows the extraction of considerable added value from the same primary dataset by the wider research community. Although the number of genetical genomics experiments in different plant species is rapidly increasing, none to date has been presented in a form that allows quick and efficient on-line testing for possible associations between genes, loci and traits of interest by an entire research community.
Description
Using a reference population of 150 recombinant doubled haploid barley lines we generated novel phenotypic, mRNA abundance and SNP-based genotyping data sets, added them to a considerable volume of legacy trait data and entered them into the GeneNetwork . GeneNetwork is a unified on-line analytical environment that enables the user to test genetic hypotheses about how component traits, such as mRNA abundance, may interact to condition more complex biological phenotypes (higher-order traits). Here we describe these barley data sets and demonstrate some of the functionalities GeneNetwork provides as an easily accessible and integrated analytical environment for exploring them.
Conclusion
By integrating barley genotypic, phenotypic and mRNA abundance data sets directly within GeneNetwork's analytical environment we provide simple web access to the data for the research community. In this environment, a combination of correlation analysis and linkage mapping provides the potential to identify and substantiate gene targets for saturation mapping and positional cloning. By integrating datasets from an unsequenced crop plant (barley) in a database that has been designed for an animal model species (mouse) with a well established genome sequence, we prove the importance of the concept and practice of modular development and interoperability of software engineering for biological data sets.
doi:10.1186/1471-2156-9-73
PMCID: PMC2630324  PMID: 19017390
12.  The Gaggle: An open-source software system for integrating bioinformatics software and data sources 
BMC Bioinformatics  2006;7:176.
Background
Systems biologists work with many kinds of data, from many different sources, using a variety of software tools. Each of these tools typically excels at one type of analysis, such as of microarrays, of metabolic networks and of predicted protein structure. A crucial challenge is to combine the capabilities of these (and other forthcoming) data resources and tools to create a data exploration and analysis environment that does justice to the variety and complexity of systems biology data sets. A solution to this problem should recognize that data types, formats and software in this high throughput age of biology are constantly changing.
Results
In this paper we describe the Gaggle, a simple, open-source Java software environment that helps to solve the problem of software and database integration. Guided by the classic software engineering strategy of separation of concerns and a policy of semantic flexibility, it integrates existing popular programs and web resources into a user-friendly, easily-extended environment.
We demonstrate that four simple data types (names, matrices, networks, and associative arrays) are sufficient to bring together diverse databases and software. We highlight some capabilities of the Gaggle with an exploration of Helicobacter pylori pathogenesis genes, in which we identify a putative ricin-like protein, a discovery made possible by simultaneous data exploration using a wide range of publicly available data and a variety of popular bioinformatics software tools.
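The four data types listed above can be pictured with a small sketch. The Gaggle itself is written in Java; the Python classes below are hypothetical stand-ins, not its API, and the relay class, tool names, gene identifiers and values are invented for illustration. The point is that a tool only needs to understand these few message shapes to take part in an exchange.

```python
from dataclasses import dataclass


@dataclass
class NameList:
    species: str
    names: list


@dataclass
class DataMatrix:
    row_names: list
    column_names: list
    values: list  # one list of numbers per row


@dataclass
class Network:
    edges: list  # (source, interaction, target) triples


@dataclass
class AssociativeArray:
    entries: dict


class Relay:
    """Hypothetical hub that forwards a message to every other registered tool."""

    def __init__(self):
        self.tools = {}

    def register(self, name, handler):
        self.tools[name] = handler

    def broadcast(self, sender, message):
        return {name: handler(message)
                for name, handler in self.tools.items() if name != sender}


# A toy matrix viewer that answers a name list with the matching rows
matrix = DataMatrix(["geneA", "geneB", "geneC"], ["t0", "t1"],
                    [[0.1, 0.4], [1.2, 1.5], [-0.3, 0.0]])


def matrix_viewer(message):
    if isinstance(message, NameList):
        return [row for name, row in zip(matrix.row_names, matrix.values)
                if name in message.names]
    return None  # other message types are ignored by this tool


relay = Relay()
relay.register("matrix_viewer", matrix_viewer)
replies = relay.broadcast("network_viewer",
                          NameList("Helicobacter pylori", ["geneB"]))
```

Each tool decides for itself what a received message means, which is one way to read the "semantic flexibility" the authors describe.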
Conclusion
We have integrated diverse databases (for example, KEGG, BioCyc, String) and software (Cytoscape, DataMatrixViewer, R statistical environment, and TIGR Microarray Expression Viewer). Through this loose coupling of diverse software and databases the Gaggle enables simultaneous exploration of experimental data (mRNA and protein abundance, protein-protein and protein-DNA interactions), functional associations (operon, chromosomal proximity, phylogenetic pattern), metabolic pathways (KEGG) and Pubmed abstracts (STRING web resource), creating an exploratory environment useful to 'web browser and spreadsheet biologists', to statistically savvy computational biologists, and those in between. The Gaggle uses Java RMI and Java Web Start technologies and can be found at .
doi:10.1186/1471-2105-7-176
PMCID: PMC1464137  PMID: 16569235
13.  BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains 
Katayama, Toshiaki | Wilkinson, Mark D | Aoki-Kinoshita, Kiyoko F | Kawashima, Shuichi | Yamamoto, Yasunori | Yamaguchi, Atsuko | Okamoto, Shinobu | Kawano, Shin | Kim, Jin-Dong | Wang, Yue | Wu, Hongyan | Kano, Yoshinobu | Ono, Hiromasa | Bono, Hidemasa | Kocbek, Simon | Aerts, Jan | Akune, Yukie | Antezana, Erick | Arakawa, Kazuharu | Aranda, Bruno | Baran, Joachim | Bolleman, Jerven | Bonnal, Raoul JP | Buttigieg, Pier Luigi | Campbell, Matthew P | Chen, Yi-an | Chiba, Hirokazu | Cock, Peter JA | Cohen, K Bretonnel | Constantin, Alexandru | Duck, Geraint | Dumontier, Michel | Fujisawa, Takatomo | Fujiwara, Toyofumi | Goto, Naohisa | Hoehndorf, Robert | Igarashi, Yoshinobu | Itaya, Hidetoshi | Ito, Maori | Iwasaki, Wataru | Kalaš, Matúš | Katoda, Takeo | Kim, Taehong | Kokubu, Anna | Komiyama, Yusuke | Kotera, Masaaki | Laibe, Camille | Lapp, Hilmar | Lütteke, Thomas | Marshall, M Scott | Mori, Takaaki | Mori, Hiroshi | Morita, Mizuki | Murakami, Katsuhiko | Nakao, Mitsuteru | Narimatsu, Hisashi | Nishide, Hiroyo | Nishimura, Yosuke | Nystrom-Persson, Johan | Ogishima, Soichi | Okamura, Yasunobu | Okuda, Shujiro | Oshita, Kazuki | Packer, Nicki H | Prins, Pjotr | Ranzinger, Rene | Rocca-Serra, Philippe | Sansone, Susanna | Sawaki, Hiromichi | Shin, Sung-Ho | Splendiani, Andrea | Strozzi, Francesco | Tadaka, Shu | Toukach, Philip | Uchiyama, Ikuo | Umezaki, Masahito | Vos, Rutger | Whetzel, Patricia L | Yamada, Issaku | Yamasaki, Chisato | Yamashita, Riu | York, William S | Zmasek, Christian M | Kawamoto, Shoko | Takagi, Toshihisa
The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.
doi:10.1186/2041-1480-5-5
PMCID: PMC3978116  PMID: 24495517
BioHackathon; Bioinformatics; Semantic Web; Web services; Ontology; Visualization; Knowledge representation; Databases; Semantic interoperability; Data models; Data sharing; Data integration
14.  Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community 
BMC Bioinformatics  2012;13:42.
Background
A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure.
Results
Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance are also publicly available for download and use by researchers with access to private clouds.
Conclusions
Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.
doi:10.1186/1471-2105-13-42
PMCID: PMC3372431  PMID: 22429538
15.  WormQTLHD—a web database for linking human disease to natural variation data in C. elegans 
Nucleic Acids Research  2013;42(Database issue):D794-D801.
Interactions between proteins are highly conserved across species. As a result, the molecular basis of multiple diseases affecting humans can be studied in model organisms that offer many alternative experimental opportunities. One such organism, Caenorhabditis elegans, has been used to produce much molecular quantitative genetics and systems biology data over the past decade. We present WormQTLHD (Human Disease), a database that quantitatively and systematically links expression Quantitative Trait Loci (eQTL) findings in C. elegans to gene–disease associations in man. WormQTLHD, available online at http://www.wormqtl-hd.org, is a user-friendly set of tools to reveal functionally coherent, evolutionarily conserved gene networks. These can be used to predict novel gene-to-gene associations and the functions of genes underlying the disease of interest. We created a new database that links C. elegans eQTL data sets to human diseases (34 337 gene–disease associations from OMIM, DGA, GWAS Central and NHGRI GWAS Catalogue) based on overlapping sets of orthologous genes associated with phenotypes in these two species. We utilized QTL results, high-throughput molecular phenotypes, classical phenotypes and genotype data covering different developmental stages and environments from the WormQTL database. All software is available as open source, built on MOLGENIS and xQTL workbench.
doi:10.1093/nar/gkt1044
PMCID: PMC3965109  PMID: 24217915
16.  Mediation Analysis Demonstrates That Trans-eQTLs Are Often Explained by Cis-Mediation: A Genome-Wide Analysis among 1,800 South Asians 
PLoS Genetics  2014;10(12):e1004818.
A large fraction of human genes are regulated by genetic variation near the transcribed sequence (cis-eQTL, expression quantitative trait locus), and many cis-eQTLs have implications for human disease. Less is known regarding the effects of genetic variation on expression of distant genes (trans-eQTLs) and their biological mechanisms. In this work, we use genome-wide data on SNPs and array-based expression measures from mononuclear cells obtained from a population-based cohort of 1,799 Bangladeshi individuals to characterize cis- and trans-eQTLs and determine if observed trans-eQTL associations are mediated by expression of transcripts in cis with the SNPs showing trans-association, using Sobel tests of mediation. We observed 434 independent trans-eQTL associations at a false-discovery rate of 0.05, and 189 of these trans-eQTLs were also cis-eQTLs (enrichment P<0.0001). Among these 189 trans-eQTL associations, 39 were significantly attenuated after adjusting for a cis-mediator based on Sobel P<10^-5. We attempted to replicate 21 of these mediation signals in two European cohorts, and while only 7 trans-eQTL associations were present in one or both cohorts, 6 showed evidence of cis-mediation. Analyses of simulated data show that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. Our data demonstrate that trans-associations can become significantly stronger or switch directions after adjusting for a potential mediator. Using simulated data, we demonstrate that this phenomenon is expected in the presence of strong cis-trans confounding and when the measured cis-transcript is correlated with the true (unmeasured) mediator. In conclusion, by applying mediation analysis to eQTL data, we show that a substantial fraction of observed trans-eQTL associations can be explained by cis-mediation. Future studies should focus on understanding the mechanisms underlying widespread cis-mediation and their relevance to disease biology, as well as using mediation analysis to improve eQTL discovery.
Author Summary
Expression quantitative trait locus (eQTL) studies have demonstrated that human genes can be regulated by genetic variation residing close to the gene (cis-eQTLs) or in a distant region or on a different chromosome (trans-eQTLs). While cis-eQTL variants are likely to affect transcription factor binding or chromatin structure, our understanding of the mechanisms underlying trans-eQTLs is incomplete. We hypothesize that a substantial fraction of trans-eQTLs influence expression of distant genes through mediation by expression levels of a cis-transcript. In this paper, we use genome-wide SNPs and expression data for 1,799 South Asians to identify cis- and trans-eQTLs and to test our hypothesis using Sobel tests of mediation. Among 189 observed trans-eQTL associations, we provide evidence of cis-mediation for 39, 6 of which show mediation in an independent European cohort. We used simulated data to demonstrate that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. We also demonstrate how unobserved confounding variables and incorrect mediator selection can bias mediation estimates. In conclusion, we have identified cis-mediators for many trans-eQTLs and described a mediation analysis approach that can be used to validate, characterize, and enhance discovery of trans-eQTLs.
doi:10.1371/journal.pgen.1004818
PMCID: PMC4256471  PMID: 25474530
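The Sobel test named in the abstract above can be illustrated with a short calculation. The sketch below uses the standard first-order Sobel statistic, z = a·b / sqrt(b²·se_a² + a²·se_b²), where a is the SNP-to-cis-transcript effect and b is the cis-transcript-to-trans-transcript effect adjusted for the SNP. The function name and the input numbers are hypothetical, and the abstract does not say which variant of the test or which software the authors used.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return (z, two-sided p) for the indirect (mediated) effect a*b.

    a, se_a: effect of SNP on the candidate cis-mediator and its standard error.
    b, se_b: effect of the mediator on the trans-transcript (adjusted for the
             SNP) and its standard error.
    """
    # First-order (delta-method) standard error of the product a*b.
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical regression estimates, for illustration only.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"Sobel z = {z:.3f}, p = {p:.4g}")
```

In a genome-wide setting, a threshold such as the P<10^-5 quoted in the abstract would be applied to the p-value returned for each SNP, cis-transcript and trans-transcript triple.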
17.  Using MATLAB software with Tomcat server and Java platform for remote image analysis in pathology 
Diagnostic Pathology  2011;6(Suppl 1):S18.
Background
Matlab is one of the most advanced development tools for engineering applications. For our purposes, its most important component is the image processing toolbox, which offers many built-in functions, including mathematical morphology and implementations of many artificial neural networks. It is a very popular platform for building specialized image analysis programs, including in pathology. With the latest version of the Matlab Builder Java toolbox, it is possible to create software that serves as a remote system for image analysis in pathology over the internet. The internet platform can be built on JavaServer Pages, with a Tomcat server as the servlet container.
Methods
In the presented implementation, we propose remote image analysis carried out by Matlab algorithms. These algorithms can be compiled into an executable jar file with the help of the Matlab Builder Java toolbox. The Matlab function must be declared with a set of input data, an output structure with numerical results, and a Matlab web figure. Any function prepared in this manner can be used as a Java function in JavaServer Pages (JSP). The graphical user interface that collects the input data and displays the results (including in graphical form) must be implemented in JSP. Additionally, data storage to a database can be implemented within the Matlab algorithm, alongside the image processing, with the help of the Matlab Database Toolbox. The complete JSP page can be run by a Tomcat server.
Results
The proposed tool for remote image analysis was tested with the Computerized Analysis of Medical Images (CAMI) software developed by the author. The user provides the image and case information (diagnosis, staining, image parameters, etc.). When the analysis is initiated, the input data and image are sent to a servlet on Tomcat. When the analysis is done, the client receives the graphical results as an image with the recognized cells marked, together with the quantitative output. Additionally, the results are stored in a server database. The internet platform was tested on a PC server (Intel Core2 Duo T9600, 2.8 GHz, 4 GB RAM) with 768x576 pixel, 1.28 Mb TIFF images of meningioma tumour (x400, Ki-67/MIB-1). The time consumption was as follows: analysis by CAMI locally on the server took 3.5 seconds, and remote analysis took 26 seconds, of which 22 seconds were spent on data transfer over the internet connection. With a JPG image (102 Kb), the time was reduced to 14 seconds.
Conclusions
The results confirm that the designed remote platform can be useful for pathology image analysis. The time consumption depends mainly on the image size and the speed of the internet connection. The presented implementation can be used for many types of analysis across different stainings, tissues, morphometry approaches, etc. A significant remaining problem is implementing the JSP page in a multithreaded form that can be used by many users in parallel. The presented platform for image analysis in pathology can be especially useful for small laboratories without their own image analysis system.
doi:10.1186/1746-1596-6-S1-S18
PMCID: PMC3073211  PMID: 21489188
18.  Open Source Software Projects of the caBIG™ In Vivo Imaging Workspace Software Special Interest Group 
Journal of Digital Imaging  2007;20(Suppl 1):94-100.
The Cancer Bioinformatics Grid (caBIG™) program was created by the National Cancer Institute to facilitate sharing of IT infrastructure, data, and applications among the National Cancer Institute-sponsored cancer research centers. The program was launched in February 2004 and now links more than 50 cancer centers. In April 2005, the In Vivo Imaging Workspace was added to promote the use of imaging in cancer clinical trials. At the inaugural meeting, four special interest groups (SIGs) were established. The Software SIG was charged with identifying projects that focus on open-source software for image visualization and analysis. To date, two projects have been defined by the Software SIG. The eXtensible Imaging Platform project has produced a rapid application development environment that researchers may use to create targeted workflows customized for specific research projects. The Algorithm Validation Tools project will provide a set of tools and data structures that will be used to capture measurement information and associated needed to allow a gold standard to be defined for the given database against which change analysis algorithms can be tested. Through these and future efforts, the caBIG™ In Vivo Imaging Workspace Software SIG endeavors to advance imaging informatics and provide new open-source software tools to advance cancer research.
doi:10.1007/s10278-007-9061-4
PMCID: PMC2039820  PMID: 17846835
Open source, digital imaging and communications in medicine (DICOM); grid computing; image analysis; imaging informatics; caBIG; XIP; AVT
20.  Detection and validation of stay-green QTL in post-rainy sorghum involving widely adapted cultivar, M35-1 and a popular stay-green genotype B35 
BMC Genomics  2014;15(1):909.
Background
Sorghum [Sorghum bicolor (L.) Moench] is an important dry-land cereal of the world providing food, fodder, feed and fuel. Stay-green (delayed-leaf senescence) is a key attribute in sorghum determining its adaptation to terminal drought stress. The objective of this study was to validate sorghum stay-green quantitative trait loci (QTL) identified in the past, and to identify new QTL in the genetic background of a post-rainy adapted genotype M35-1.
Results
A genetic linkage map based on 245 F9 Recombinant Inbred Lines (RILs) derived from a cross between M35-1 (more senescent) and B35 (less senescent) with 237 markers consisting of 174 genomic, 60 genic and 3 morphological markers was used. The phenotypic data collected for three consecutive post-rainy crop seasons on the RIL population (M35-1 × B35) was used for QTL analysis. Sixty-one QTL were identified for various measures of stay-green trait and each trait was controlled by one to ten QTL. The phenotypic variation explained by each QTL ranged from 3.8 to 18.7%. Co-localization of QTL for more than five traits was observed on two linkage groups i.e. on SBI-09-3 flanked by S18 and Xgap206 markers and, on SBI-03 flanked by XnhsbSFCILP67 and Xtxp31. QTL identified in this study were stable across environments and corresponded to sorghum stay-green and grain yield QTL reported previously. Of the 60 genic SSRs mapped, 14 were closely linked with QTL for ten traits. A genic marker, XnhsbSFCILP67 (Sb03g028240) encoding Indole-3-acetic acid-amido synthetase GH3.5, was co-located with QTL for GLB, GLM, PGLM and GLAM on SBI-03. Genes underlying key enzymes of chlorophyll metabolism were also found in the stay-green QTL regions.
Conclusions
We validated important stay-green QTL reported in the past in sorghum and detected new QTL influencing the stay-green related traits consistently. Stg2, Stg3 and StgB were prominent in their expression. Collectively, the QTL/markers identified are likely candidates for subsequent verification for their involvement in stay-green phenotype using NILs and to develop drought tolerant sorghum varieties through marker-assisted breeding for terminal drought tolerance in sorghum.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-909) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2164-15-909
PMCID: PMC4219115  PMID: 25326366
Stay-green; Sorghum; Post-flowering drought tolerance; Quantitative trait loci; Marker assisted breeding
21.  A repository based on a dynamically extensible data model supporting multidisciplinary research in neuroscience 
Background
Robust, extensible and distributed databases integrating clinical, imaging and molecular data represent a substantial challenge for modern neuroscience. It is even more difficult to provide extensible software environments able to effectively target the rapidly changing data requirements and structures of research experiments. There is an increasing request from the neuroscience community for software tools addressing technical challenges about: (i) supporting researchers in the medical field to carry out data analysis using integrated bioinformatics services and tools; (ii) handling multimodal/multiscale data and metadata, enabling the injection of several different data types according to structured schemas; (iii) providing high extensibility, in order to address different requirements deriving from a large variety of applications simply through a user runtime configuration.
Methods
A dynamically extensible data structure supporting collaborative multidisciplinary research projects in neuroscience has been defined and implemented. We have considered extensibility issues from two different points of view. First, the improvement of data flexibility has been taken into account. This has been done through the development of a methodology for the dynamic creation and use of data types and related metadata, based on the definition of a “meta” data model. This way, users are not constrained to a set of predefined data, and the model can be easily extended and applied to different contexts. Second, users have been enabled to easily customize and extend the experimental procedures in order to track each step of acquisition or analysis. This has been achieved through a process-event data structure, a multipurpose taxonomic schema composed of two generic main objects: events and processes. Then, a repository has been built based on this data model and structure, and deployed on distributed resources thanks to a Grid-based approach. Finally, data integration aspects have been addressed by providing the repository application with an efficient dynamic interface designed to enable the user to both easily query the data depending on defined datatypes and view all the data of every patient in an integrated and simple way.
Results
The results of our work have been twofold. First, a dynamically extensible data model has been implemented and tested based on a “meta” data-model enabling users to define their own data types independently from the application context. This data model has allowed users to dynamically include additional data types without the need of rebuilding the underlying database. Then a complex process-event data structure has been built, based on this data model, describing patient-centered diagnostic processes and merging information from data and metadata. Second, a repository implementing such a data structure has been deployed on a distributed Data Grid in order to provide scalability both in terms of data input and data storage and to exploit distributed data and computational approaches in order to share resources more efficiently. Moreover, data managing has been made possible through a friendly web interface. The driving principle of not being forced to preconfigured data types has been satisfied. It is up to users to dynamically configure the data model for the given experiment or data acquisition program, thus making it potentially suitable for customized applications.
Conclusions
Based on this repository, data managing has been made possible through a friendly web interface. The driving principle of not being forced to preconfigured data types has been satisfied. It is up to users to dynamically configure the data model for the given experiment or data acquisition program, thus making it potentially suitable for customized applications.
doi:10.1186/1472-6947-12-115
PMCID: PMC3560115  PMID: 23043673
Neuroscience; Data models; Multidisciplinary studies
22.  CloudDOE: A User-Friendly Tool for Deploying Hadoop Clouds and Analyzing High-Throughput Sequencing Data with MapReduce 
PLoS ONE  2014;9(6):e98146.
Background
Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis. Using a MapReduce framework, data and workload can be distributed via a network to computers in the cloud to substantially reduce computational latency. Hadoop/MapReduce has been successfully adopted in bioinformatics for genome assembly, mapping reads to genomes, and finding single nucleotide polymorphisms. Major cloud providers offer Hadoop cloud services to their users. However, it remains technically challenging to deploy a Hadoop cloud for those who prefer to run MapReduce programs in a cluster without built-in Hadoop/MapReduce.
Results
We present CloudDOE, a platform-independent software package implemented in Java. CloudDOE encapsulates technical details behind a user-friendly graphical interface, thus liberating scientists from having to perform complicated operational procedures. Users are guided through the user interface to deploy a Hadoop cloud within in-house computing environments and to run applications specifically targeted for bioinformatics, including CloudBurst, CloudBrush, and CloudRS. One may also use CloudDOE on top of a public cloud. CloudDOE consists of three wizards, i.e., Deploy, Operate, and Extend wizards. Deploy wizard is designed to aid the system administrator to deploy a Hadoop cloud. It installs Java runtime environment version 1.6 and Hadoop version 0.20.203, and initiates the service automatically. Operate wizard allows the user to run a MapReduce application on the dashboard list. To extend the dashboard list, the administrator may install a new MapReduce application using Extend wizard.
Conclusions
CloudDOE is a user-friendly tool for deploying a Hadoop cloud. Its smart wizards substantially reduce the complexity and costs of deployment, execution, enhancement, and management. Interested users may collaborate to improve the source code of CloudDOE to further incorporate more MapReduce bioinformatics tools into CloudDOE and support next-generation big data open source tools, e.g., Hadoop BigTop and Spark. Availability: CloudDOE is distributed under Apache License 2.0 and is freely available at http://clouddoe.iis.sinica.edu.tw/.
doi:10.1371/journal.pone.0098146
PMCID: PMC4045712  PMID: 24897343
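The MapReduce pattern that the CloudDOE abstract relies on can be illustrated without a Hadoop cluster. The sketch below is a single-process, pure-Python imitation of the map, shuffle and reduce phases applied to k-mer counting in sequencing reads. It is not CloudDOE code and does not use the Hadoop API; the function names and the two example reads are hypothetical.

```python
from collections import defaultdict


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse the values for one key into a total count."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence inside one process."""
    emitted = [pair for read in reads for pair in map_kmers(read, k)]
    grouped = shuffle(emitted)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Two toy reads, for illustration only.
counts = count_kmers(["ACGTAC", "GTACGT"], k=3)
print(counts)
```

On a real Hadoop cluster, the map and reduce functions would run on different machines and the framework would perform the shuffle over the network, which is where the latency reduction described in the abstract comes from.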
23.  Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it 
BMC Bioinformatics  2006;7:532.
Background
The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources.
Results
This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA) of the Object Management Group (OMG), we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse.
Conclusion
MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository interfaces in bioinformatics. The Pierre MDA is capable of supporting common database access requirements with a variety of auto-generated interfaces and across a variety of repositories. With Pierre, four kinds of interfaces are generated: web, stand-alone application, text-menu, and command line. The kinds of repositories with which Pierre interfaces have been used are relational, XML and object databases.
doi:10.1186/1471-2105-7-532
PMCID: PMC1713253  PMID: 17169146
24.  Synthesis of 53 tissue and cell line expression QTL datasets reveals master eQTLs 
BMC Genomics  2014;15(1):532.
Background
Gene expression genetic studies in human tissues and cells identify cis- and trans-acting expression quantitative trait loci (eQTLs). These eQTLs provide insights into regulatory mechanisms underlying disease risk. However, few studies systematically characterized eQTL results across cell and tissue types. We synthesized eQTL results from >50 datasets, including new primary data from human brain, peripheral plaque and kidney samples, in order to discover features of human eQTLs.
Results
We find a substantial number of robust cis-eQTLs and far fewer trans-eQTLs consistent across tissues. Analysis of 45 full human GWAS scans indicates eQTLs are enriched overall, and above nSNPs, among positive statistical signals in genetic mapping studies, and account for a significant fraction of the strongest human trait effects. Expression QTLs are enriched for gene centricity, higher population allele frequencies, in housekeeping genes, and for coincidence with regulatory features, though there is little evidence of 5′ or 3′ positional bias. Several regulatory categories are not enriched including microRNAs and their predicted binding sites and long, intergenic non-coding RNAs. Among the most tissue-ubiquitous cis-eQTLs, there is enrichment for genes involved in xenobiotic metabolism and mitochondrial function, suggesting these eQTLs may have adaptive origins. Several strong eQTLs (CDK5RAP2, NBPFs) coincide with regions of reported human lineage selection. The intersection of new kidney and plaque eQTLs with related GWAS suggest possible gene prioritization. For example, butyrophilins are now linked to arterial pathogenesis via multiple genetic and expression studies. Expression QTL and GWAS results are made available as a community resource through the NHLBI GRASP database [http://apps.nhlbi.nih.gov/grasp/].
Conclusions
Expression QTLs inform the interpretation of human trait variability, and may account for a greater fraction of phenotypic variability than protein-coding variants. The synthesis of available tissue eQTL data highlights many strong cis-eQTLs that may have important biologic roles and could serve as positive controls in future studies. Our results indicate some strong tissue-ubiquitous eQTLs may have adaptive origins in humans. Efforts to expand the genetic, splicing and tissue coverage of known eQTLs will provide further insights into human gene regulation.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-532) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2164-15-532
PMCID: PMC4102726  PMID: 24973796
eQTL; RNA; Gene expression; Genomics; Transcriptome; GWAS; Genome-wide; Tissue; Cis; Trans
25.  Effect of chromosome substitution on intrinsic exercise capacity in mice 
F1000Research  2014;3:9.
Previous research identified a locus on Chromosome 14 as an important regulator of endurance exercise capacity in mice. The aim of this study was to investigate the effect of chromosome substitution on intrinsic exercise capacity and identify quantitative trait loci (QTL) associated with exercise capacity in mice. Mice from a chromosome substitution strain (CSS) derived from A/J and C57Bl/6J (B6), denoted as B6.A14, were used to assess the contribution of Chromosome 14 to intrinsic exercise capacity. All mice performed a graded exercise test to exhaustion to determine exercise capacity expressed as time (min) or work (kg·m). Exercise time and work were significantly greater in B6 mice than B6.A14 and A/J mice, indicating the presence of a QTL on Chromosome 14 for exercise capacity. To localize exercise-related QTL, 155 B6.A14 x B6 F2 mice were generated for linkage analysis. Suggestive QTL for exercise time (57 cM, 1.75 LOD) and work (57 cM, 2.08 LOD) were identified in the entire B6.A14 x B6 F2 cohort. To identify putative sex-specific QTL, male and female F2 cohorts were analyzed separately. In males, a significant QTL for exercise time (55 cM, 2.28 LOD) and a suggestive QTL for work (55 cM, 2.19 LOD) were identified. In the female cohort, no QTL was identified for time, but a suggestive QTL for work was located at 16 cM (1.8 LOD). These data suggest that one or more QTL on Chromosome 14 regulate exercise capacity. The putative sex-specific QTL further suggest that the genetic architecture underlying exercise capacity is different in males and females. Overall, the results of this study support the use of CSS as a model for the genetic analysis of exercise capacity. Future studies should incorporate the full panel of CSS using male and female mice to dissect the genetic basis for differences in exercise capacity.
doi:10.12688/f1000research.3-9.v1
PMCID: PMC4032107  PMID: 25184035
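The LOD scores quoted in the abstract above are base-10 logarithms of a likelihood ratio comparing a model with a QTL to a model without one. As a rough illustration, the sketch below computes a single-marker LOD by marker regression, assuming normally distributed residuals, in which case LOD = (n/2)·log10(RSS0/RSS1). The abstract does not state which mapping method or software the authors used, so the function name, the method and the toy data are assumptions.

```python
import math


def marker_regression_lod(phenotypes, genotypes):
    """Single-marker LOD score under a normal-residual model (illustrative).

    RSS0 is the residual sum of squares around the overall mean (no QTL);
    RSS1 is the residual sum of squares around the genotype-class means.
    """
    n = len(phenotypes)
    overall_mean = sum(phenotypes) / n
    rss0 = sum((y - overall_mean) ** 2 for y in phenotypes)

    # Group phenotypes by genotype class at the marker.
    by_genotype = {}
    for y, g in zip(phenotypes, genotypes):
        by_genotype.setdefault(g, []).append(y)

    rss1 = 0.0
    for values in by_genotype.values():
        class_mean = sum(values) / len(values)
        rss1 += sum((y - class_mean) ** 2 for y in values)

    if rss1 == 0:
        raise ValueError("genotype classes explain the phenotype perfectly")
    return (n / 2) * math.log10(rss0 / rss1)


# Toy data, for illustration only: four animals, two genotype classes.
lod = marker_regression_lod([1.0, 2.0, 3.0, 4.0], ["A", "A", "B", "B"])
print(f"LOD = {lod:.3f}")
```

A LOD of 2, for example, means the data are 100 times more likely under the QTL model than under the null model at that position; significance thresholds are usually set by permutation rather than by a fixed cut-off.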
