Results 26-50 (133)

26.  Multiplex Degenerate Primer Design for Targeted Whole Genome Amplification of Many Viral Genomes 
Advances in Bioinformatics  2014;2014:101894.
Background. Targeted enrichment improves coverage of highly mutable viruses at low concentration in complex samples. Degenerate primers that anneal to conserved regions can facilitate amplification of divergent, low concentration variants, even when the strain present is unknown. Results. A tool for designing multiplex sets of degenerate sequencing primers to tile overlapping amplicons across multiple whole genomes is described. The new script, run_tiled_primers, is part of the PriMux software. Primers were designed for each segment of South American hemorrhagic fever viruses, tick-borne encephalitis, Henipaviruses, Arenaviruses, Filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus, and Japanese encephalitis virus. Each group is highly diverse with as little as 5% genome consensus. Primer sets were computationally checked for nontarget cross reactions against the NCBI nucleotide sequence database. Primers for murine hepatitis virus were demonstrated in the lab to specifically amplify selected genes from a laboratory cultured strain that had undergone extensive passage in vitro and in vivo. Conclusions. This software should help researchers design multiplex sets of primers for targeted whole genome enrichment prior to sequencing to obtain better coverage of low titer, divergent viruses. Applications include viral discovery from a complex background and improved sensitivity and coverage of rapidly evolving strains or variants in a gene family.
PMCID: PMC4137498  PMID: 25157264
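The degenerate primers mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard IUPAC nucleotide ambiguity codes; the function names `degeneracy` and `expand` are hypothetical and are not part of PriMux or run_tiled_primers, whose interface is not shown in this listing.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of concrete sequences a degenerate primer stands for."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

A primer such as `ACGTRYN` has degeneracy 2 × 2 × 4 = 16; design tools typically cap this number, because each extra ambiguous position dilutes the concentration of any single matching oligonucleotide.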
27.  Prediction of Epitope-Based Peptides for the Utility of Vaccine Development from Fusion and Glycoprotein of Nipah Virus Using In Silico Approach 
Advances in Bioinformatics  2014;2014:402492.
This study aims to design epitope-based peptides for the utility of vaccine development by targeting glycoprotein G and envelope protein F of Nipah virus (NiV) that, respectively, facilitate attachment and fusion of NiV with host cells. Using various databases and tools, immune parameters of conserved sequence(s) from G and F proteins of different isolates of NiV were tested to predict probable epitope(s). Binding analyses of the peptides with MHC class-I and class-II molecules, epitope conservancy, population coverage, and linear B cell epitope prediction were analyzed. Predicted peptides interacted with seven or more MHC alleles and illustrated population coverage of more than 99% and 95%, for G and F proteins, respectively. The predicted class-I nonamers, SLIDTSSTI and EWISIVPNF, superimposed on the putative decameric B cell epitopes, were also identified as core sequences of the most probable class-II 15-mer peptides GPKVSLIDTSSTITI and EWISIVPNFILVRNT. These peptides were further validated for their binding to specific HLA alleles using in silico docking technique. Our in silico analysis suggested that the predicted epitopes, either GPKVSLIDTSSTITI or EWISIVPNFILVRNT, could be a better choice as universal vaccine component against NiV irrespective of different isolates which may elicit both humoral and cell-mediated immunity.
PMCID: PMC4131549  PMID: 25147564
28.  IN-MACA-MCC: Integrated Multiple Attractor Cellular Automata with Modified Clonal Classifier for Human Protein Coding and Promoter Prediction 
Advances in Bioinformatics  2014;2014:261362.
Protein coding and promoter region predictions are very important challenges of bioinformatics (Attwood and Teresa, 2000). The identification of these regions plays a crucial role in understanding the genes. Many novel computational and mathematical methods have been introduced, and existing methods continue to be refined, for predicting each of the two regions separately; still, there is scope for improvement. We propose a classifier that is built with MACA (multiple attractor cellular automata) and MCC (modified clonal classifier) to predict both regions with a single classifier. The proposed classifier is trained and tested with Fickett and Tung (1992) datasets for protein coding region prediction for DNA sequences of lengths 54, 108, and 162. This classifier is trained and tested with MMCRI datasets for protein coding region prediction for DNA sequences of lengths 252 and 354. The proposed classifier is trained and tested with promoter sequences from DBTSS (Yamashita et al., 2006) dataset and nonpromoters from EID (Saxonov et al., 2000) and UTRdb (Pesole et al., 2002) datasets. The proposed model can predict both regions with an average accuracy of 90.5% for promoter and 89.6% for protein coding region predictions. The specificity and sensitivity values of promoter and protein coding region predictions are 0.89 and 0.92, respectively.
PMCID: PMC4123571  PMID: 25132849
29.  Pharmacophore Modeling and Molecular Docking Studies on Pinus roxburghii as a Target for Diabetes Mellitus 
Advances in Bioinformatics  2014;2014:903246.
The present study attempts to establish a relationship between ethnopharmacological claims and bioactive constituents present in Pinus roxburghii against all possible targets for diabetes through molecular docking and to develop a pharmacophore model for the active target. The process of molecular docking involves study of different bonding modes of one ligand with active cavities of target receptors protein tyrosine phosphatase 1-beta (PTP-1β), dipeptidyl peptidase-IV (DPP-IV), aldose reductase (AR), and insulin receptor (IR) with the help of the docking software Molegro Virtual Docker (MVD). From the results of docking score values on different receptors for antidiabetic activity, it is observed that constituents, namely, secoisoresinol, pinoresinol, and cedeodarin, showed the best docking results on almost all the receptors, while the most significant results were observed on AR. Then, LigandScout was applied to develop a pharmacophore model for the active target. LigandScout revealed that 2 hydrogen bond donors pointing towards Tyr 48 and His 110 are a major requirement of the pharmacophore generated. In our molecular docking studies, the active constituent, secoisoresinol, has also shown hydrogen bonding with the His 110 residue, which is a part of the pharmacophore. The docking results have given better insights into the development of better aldose reductase inhibitors to treat diabetes-related secondary complications.
PMCID: PMC4120483  PMID: 25114678
30.  How Good Are Simplified Models for Protein Structure Prediction? 
Advances in Bioinformatics  2014;2014:867179.
Protein structure prediction (PSP) has been one of the most challenging problems in computational biology for several decades. The challenge is largely due to the complexity of the all-atomic details and the unknown nature of the energy function. Researchers have therefore used simplified energy models that consider interaction potentials only between the amino acid monomers in contact on discrete lattices. The restricted nature of the lattices and the energy models poses a twofold concern regarding the assessment of the models. Can a native or a very close structure be obtained when structures are mapped to lattices? Can the contact-based energy models on discrete lattices guide the search towards the native structures? In this paper, we use the protein chain lattice fitting (PCLF) problem to address the first concern; we developed a constraint-based local search algorithm for the PCLF problem for cubic and face-centered cubic lattices and found very close lattice fits for the native structures. For the second concern, we use a number of techniques to sample the conformation space and find correlations between energy functions and the root mean square deviation (RMSD) of the lattice-based structures from the native structures. Our analysis reveals weaknesses in several contact-based energy models that are popular in PSP.
PMCID: PMC4022063  PMID: 24876837
31.  Elementary Flux Mode Analysis of Acetyl-CoA Pathway in Carboxydothermus hydrogenoformans Z-2901 
Advances in Bioinformatics  2014;2014:928038.
Carboxydothermus hydrogenoformans is a carboxydotrophic hydrogenogenic bacterium species that produces molecular hydrogen by utilizing carbon monoxide (CO) or pyruvate as a carbon source. To investigate the underlying biochemical mechanism of hydrogen production, an elementary mode analysis of the acetyl-CoA pathway was performed to determine the intermediate fluxes, in combination with the linear programming (LP) method available in the CellNetAnalyzer software. We hypothesized that addition of enzymes necessary for carbon monoxide fixation and pyruvate dissimilation would enhance the theoretical yield of hydrogen. An in silico gene knockout of the pyk, pykC, and mdh genes of the modeled acetyl-CoA pathway allows a maximum theoretical hydrogen yield of 47.62 mmol/gCDW/h for 1 mole of carbon monoxide (CO) uptake. The obtained hydrogen yield is two times greater than the previous experimental data. Therefore, it could be concluded that this elementary flux mode analysis is a crucial way to achieve efficient hydrogen production through the acetyl-CoA pathway and can act as a model for strain improvement.
PMCID: PMC4009226  PMID: 24822064
32.  Objective and Comprehensive Evaluation of Bisulfite Short Read Mapping Tools 
Advances in Bioinformatics  2014;2014:472045.
Background. Large-scale bisulfite treatment and short read sequencing technology allow comprehensive estimation of methylation states of Cs in the genomes of different tissues, cell types, and developmental stages. Accurate characterization of DNA methylation is essential for understanding genotype-phenotype association, gene and environment interaction, diseases, and cancer. Aligning bisulfite short reads to a reference genome has been a challenging task. We compared five bisulfite short read mapping tools, BSMAP, Bismark, BS-Seeker, BiSS, and BRAT-BW, representing two classes of mapping algorithms (hash table and suffix/prefix tries). We examined their mapping efficiency (i.e., the percentage of reads that can be mapped to the genomes), usability, running time, and effects of changing default parameter settings using both real and simulated reads. We also investigated how preprocessing data might affect mapping efficiency. Conclusion. Among the five programs compared, in terms of mapping efficiency, Bismark performs the best on the real data, followed by BiSS, BSMAP, and finally BRAT-BW and BS-Seeker with very similar performance. If CPU time is not a constraint, Bismark is a good choice of program for mapping bisulfite-treated short reads. Data quality greatly affects mapping efficiency. Although increasing the number of mismatches allowed can increase mapping efficiency, it not only significantly slows down the program but also increases the risk of false positives. Therefore, users should carefully set the related parameters depending on the quality of their sequencing data.
PMCID: PMC4009243  PMID: 24839440
33.  Network Completion for Static Gene Expression Data 
Advances in Bioinformatics  2014;2014:382452.
We tackle the problem of completing and inferring genetic networks under stationary conditions from static data, where network completion is to make the minimum amount of modifications to an initial network so that the completed network is most consistent with the expression data in which addition of edges and deletion of edges are basic modification operations. For this problem, we present a new method for network completion using dynamic programming and least-squares fitting. This method can find an optimal solution in polynomial time if the maximum indegree of the network is bounded by a constant. We evaluate the effectiveness of our method through computational experiments using synthetic data. Furthermore, we demonstrate that our proposed method can distinguish the differences between two types of genetic networks under stationary conditions from lung cancer and normal gene expression data.
PMCID: PMC3984774  PMID: 24826192
34.  Secondary Structure Preferences of Mn2+ Binding Sites in Bacterial Proteins 
Advances in Bioinformatics  2014;2014:501841.
3D structures of proteins with coordinated Mn2+ ions from bacteria with low, average, and high genomic GC-content have been analyzed (149 PDB files were used). Major Mn2+ binders are aspartic acid (6.82% of Asp residues), histidine (14.76% of His residues), and glutamic acid (3.51% of Glu residues). We found out that the motif of secondary structure “beta strand-major binder-random coil” is overrepresented around all the three major Mn2+ binders. That motif may be followed by either alpha helix or beta strand. Beta strands near Mn2+ binding residues should be stable because they are enriched by such beta formers as valine and isoleucine, as well as by specific combinations of hydrophobic and hydrophilic amino acid residues characteristic to beta sheet. In the group of proteins from GC-rich bacteria glutamic acid residues situated in alpha helices frequently coordinate Mn2+ ions, probably, because of the decrease of Lys usage under the influence of mutational GC-pressure. On the other hand, the percentage of Mn2+ sites with at least one amino acid in the “beta strand-major binder-random coil” motif of secondary structure (77.88%) does not depend on genomic GC-content.
PMCID: PMC3977119  PMID: 24778647
35.  A Parallel Framework for Multipoint Spiral Search in ab Initio Protein Structure Prediction 
Advances in Bioinformatics  2014;2014:985968.
Protein structure prediction is computationally a very challenging problem. A large number of existing search algorithms attempt to solve the problem by exploring possible structures and finding the one with the minimum free energy. However, these algorithms perform poorly on large sized proteins due to an astronomically wide search space. In this paper, we present a multipoint spiral search framework that uses parallel processing techniques to expedite exploration by starting from different points. In our approach, a set of random initial solutions are generated and distributed to different threads. We allow each thread to run for a predefined period of time. The improved solutions are stored threadwise. When the threads finish, the solutions are merged together and the duplicates are removed. A selected distinct set of solutions are then split to different threads again. In our ab initio protein structure prediction method, we use the three-dimensional face-centred-cubic lattice for structure-backbone mapping. We use both the low resolution hydrophobic-polar energy model and the high-resolution 20 × 20 energy model for search guiding. The experimental results show that our new parallel framework significantly improves the results obtained by the state-of-the-art single-point search approaches for both energy models on three-dimensional face-centred-cubic lattice. We also experimentally show the effectiveness of mixing energy models within parallel threads.
PMCID: PMC3976798  PMID: 24744779
36.  A Brachytherapy Plan Evaluation Tool for Interstitial Applications 
Advances in Bioinformatics  2014;2014:376207.
Radiobiological metrics such as tumor control probability (TCP) and normal tissue complication probability (NTCP) help in assessing the quality of brachytherapy plans. Application of such metrics in clinics as well as research is still inadequate. This study presents the implementation of two indigenously designed plan evaluation modules: Brachy_TCP and Brachy_NTCP. Evaluation tools were constructed to compute TCP and NTCP from dose volume histograms (DVHs) of any interstitial brachytherapy treatment plan. The computation module was employed to estimate probabilities of tumor control and normal tissue complications in ten cervical cancer patients based on biologically effective equivalent uniform dose (BEEUD). The tumor control and normal tissue morbidity were assessed with clinical followup and were scored. The acute toxicity was graded using common terminology criteria for adverse events (CTCAE) version 4.0. Outcome score was found to be correlated with the TCP/NTCP estimates. Thus, the predictive ability of the estimates was quantified with the clinical outcomes. Biologically effective equivalent uniform dose-based formalism was found to be effective in predicting the complexities and disease control.
PMCID: PMC3934649  PMID: 24665263
37.  Prediction of B-Cell Epitopes in Listeriolysin O, a Cholesterol Dependent Cytolysin Secreted by Listeria monocytogenes 
Advances in Bioinformatics  2014;2014:871676.
Listeria monocytogenes is a gram-positive, foodborne bacterium responsible for disease in humans and animals. Listeriolysin O (LLO) is a required virulence factor for the pathogenic effects of L. monocytogenes. Bioinformatics revealed conserved putative epitopes of LLO that could be used to develop monoclonal antibodies against LLO. Continuous and discontinuous epitopes were located by using four different B-cell prediction algorithms. Three-dimensional molecular models were generated to more precisely characterize the predicted antigenicity of LLO. Domain 4 was predicted to contain five of eleven continuous epitopes. A large portion of domain 4 was also predicted to comprise discontinuous immunogenic epitopes. Domain 4 of LLO may serve as an immunogen for eliciting monoclonal antibodies that can be used to study the pathogenesis of L. monocytogenes as well as develop an inexpensive assay.
PMCID: PMC3909977  PMID: 24523732
38.  Comparing Imputation Procedures for Affymetrix Gene Expression Datasets Using MAQC Datasets 
Advances in Bioinformatics  2013;2013:790567.
Introduction. The microarray datasets from the MicroArray Quality Control (MAQC) project have enabled the assessment of the precision, comparability of microarrays, and other various microarray analysis methods. However, to date no studies that we are aware of have reported the performance of missing value imputation schemes on the MAQC datasets. In this study, we use the MAQC Affymetrix datasets to evaluate several imputation procedures in Affymetrix microarrays. Results. We evaluated several cutting edge imputation procedures and compared them using different error measures. We randomly deleted 5% and 10% of the data and imputed the missing values using imputation tests. We performed 1000 simulations and averaged the results. The results for both 5% and 10% deletion are similar. Among the imputation methods, we observe the local least squares method with k = 4 is most accurate under the error measures considered. The k-nearest neighbor method with k = 1 has the highest error rate among imputation methods and error measures. Conclusions. We conclude for imputing missing values in Affymetrix microarray datasets, using the MAS 5.0 preprocessing scheme, the local least squares method with k = 4 has the best overall performance and k-nearest neighbor method with k = 1 has the worst overall performance. These results hold true for both 5% and 10% missing values.
PMCID: PMC3809938  PMID: 24223587
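The evaluation design in the abstract above (delete known values, impute them, score the error) can be sketched as follows. This is a simplified k-nearest-neighbor imputer written for illustration only; it is not the authors' code, the function names `knn_impute` and `rmse` are hypothetical, and the distance (root mean squared difference over columns observed in both rows) is an assumption.

```python
import math


def knn_impute(rows, k=1):
    """Fill None entries with the mean of that column over the k nearest rows."""
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        missing = [j for j, value in enumerate(row) if value is None]
        if not missing:
            continue
        # Rank the other rows by distance over columns observed in both rows.
        neighbors = []
        for h, other in enumerate(rows):
            if h == i:
                continue
            shared = [j for j in range(len(row))
                      if row[j] is not None and other[j] is not None]
            if not shared:
                continue
            dist = math.sqrt(sum((row[j] - other[j]) ** 2 for j in shared) / len(shared))
            neighbors.append((dist, h))
        neighbors.sort()
        for j in missing:
            # Take the k closest rows that actually have a value in column j.
            donors = [rows[h][j] for _, h in neighbors if rows[h][j] is not None][:k]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


def rmse(truth, estimate, positions):
    """Root mean squared error over the deleted (row, column) positions."""
    squared = [(truth[i][j] - estimate[i][j]) ** 2 for i, j in positions]
    return math.sqrt(sum(squared) / len(squared))
```

In a simulation of the kind described, one would blank a random 5% or 10% of entries, call `knn_impute`, and average `rmse` over many repetitions. Cells for which no donor row exists are left as `None` in this sketch.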
39.  A Multilevel Gamma-Clustering Layout Algorithm for Visualization of Biological Networks 
Advances in Bioinformatics  2013;2013:920325.
Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs.
PMCID: PMC3707208  PMID: 23864855
41.  Reverse Engineering Sparse Gene Regulatory Networks Using Cubature Kalman Filter and Compressed Sensing 
Advances in Bioinformatics  2013;2013:205763.
This paper proposes a novel algorithm for inferring gene regulatory networks which makes use of cubature Kalman filter (CKF) and Kalman filter (KF) techniques in conjunction with compressed sensing methods. The gene network is described using a state-space model. A nonlinear model for the evolution of gene expression is considered, while the gene expression data is assumed to follow a linear Gaussian model. The hidden states are estimated using CKF. The system parameters are modeled as a Gauss-Markov process and are estimated using compressed sensing-based KF. These parameters provide insight into the regulatory relations among the genes. The Cramér-Rao lower bound of the parameter estimates is calculated for the system model and used as a benchmark to assess the estimation accuracy. The proposed algorithm is evaluated rigorously using synthetic data in different scenarios which include different number of genes and varying number of sample points. In addition, the algorithm is tested on the DREAM4 in silico data sets as well as the in vivo data sets from IRMA network. The proposed algorithm shows superior performance in terms of accuracy, robustness, and scalability.
PMCID: PMC3664478  PMID: 23737768
42.  [No title available] 
PMCID: PMC3638690  PMID: 23653640
43.  Gene Regulation, Modulation, and Their Applications in Gene Expression Data Analysis 
Advances in Bioinformatics  2013;2013:360678.
Common microarray and next-generation sequencing data analyses concentrate on tumor subtype classification, marker detection, and transcriptional regulation discovery during biological processes by exploring the correlated gene expression patterns and their shared functions. Genetic regulatory network (GRN) based approaches have been employed in many large studies in order to scrutinize for dysregulation and potential treatment controls. In addition to gene regulation and network construction, the concept of the network modulator that has significant systemic impact has been proposed, and detection algorithms have been developed in past years. Here we provide a unified mathematical description of these methods, followed by a brief survey of these modulator identification algorithms. As an early attempt to extend the concept to a new RNA regulation mechanism, competitive endogenous RNA (ceRNA), within a modulator framework, we provide two applications to illustrate the network construction, modulation effect, and the preliminary findings from these networks. The methods we surveyed and developed are used to dissect the regulated network under different modulators. Beyond these, the concept of “modulation” can be adapted to various biological mechanisms to discover novel gene regulation mechanisms.
PMCID: PMC3610383  PMID: 23573084
44.  Correction of Spatial Bias in Oligonucleotide Array Data 
Advances in Bioinformatics  2013;2013:167915.
Background. Oligonucleotide microarrays allow for high-throughput gene expression profiling assays. The technology relies on the fundamental assumption that observed hybridization signal intensities (HSIs) for each intended target, on average, correlate with their target's true concentration in the sample. However, systematic, nonbiological variation from several sources undermines this hypothesis. Background hybridization signal has been previously identified as one such important source, one manifestation of which appears in the form of spatial autocorrelation. Results. We propose an algorithm, pyn, for the elimination of spatial autocorrelation in HSIs, exploiting the duality of desirable mutual information shared by probes in a common probe set and undesirable mutual information shared by spatially proximate probes. We show that this correction procedure reduces spatial autocorrelation in HSIs; increases HSI reproducibility across replicate arrays; increases differentially expressed gene detection power; and performs better than previously published methods. Conclusions. The proposed algorithm increases both precision and accuracy, while requiring virtually no changes to users' current analysis pipelines: the correction consists merely of a transformation of raw HSIs (e.g., CEL files for Affymetrix arrays). A free, open-source implementation is provided as an R package, compatible with standard Bioconductor tools. The approach may also be tailored to other platform types and other sources of bias.
PMCID: PMC3610395  PMID: 23573083
45.  Spectral Analysis on Time-Course Expression Data: Detecting Periodic Genes Using a Real-Valued Iterative Adaptive Approach 
Advances in Bioinformatics  2013;2013:171530.
Time-course expression profiles and methods for spectrum analysis have been applied for detecting transcriptional periodicities, which are valuable patterns to unravel genes associated with cell cycle and circadian rhythm regulation. However, most of the proposed methods suffer, to a certain extent, from restrictions and large numbers of false positives. Additionally, in some experiments, arbitrarily irregular sampling times as well as the presence of high noise and small sample sizes make accurate detection a challenging task. A novel scheme for detecting periodicities in time-course expression data is proposed, in which a real-valued iterative adaptive approach (RIAA), originally proposed for signal processing, is applied for periodogram estimation. The inferred spectrum is then analyzed using Fisher's hypothesis test. With a proper p-value threshold, periodic genes can be detected. A periodic signal, two nonperiodic signals, and four sampling strategies were considered in the simulations, including both bursts and drops. In addition, two real yeast datasets were used for validation. The simulations and real data analysis reveal that RIAA can perform competitively with the existing algorithms. The advantage of RIAA is manifested when the expression data are highly irregularly sampled and when the number of cycles covered by the sampling time points is very small.
PMCID: PMC3600260  PMID: 23533399
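The spectrum-then-test pipeline in the abstract above can be sketched in a few lines. Note the substitution: this sketch uses an ordinary DFT periodogram on regularly sampled data, not RIAA, which the abstract applies precisely because real sampling is often irregular. The p-value uses Fisher's classical exact formula for the g statistic; the function names are hypothetical.

```python
import cmath
import math


def periodogram(series):
    """Periodogram ordinates at Fourier frequencies k = 1 .. (n - 1) // 2."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    ordinates = []
    for k in range(1, (n - 1) // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        ordinates.append(abs(coeff) ** 2 / n)
    return ordinates


def fisher_g_test(ordinates):
    """Return Fisher's g statistic and its exact p-value under white noise."""
    total = sum(ordinates)
    if total == 0:
        raise ValueError("constant series: the periodogram is identically zero")
    m = len(ordinates)
    g = max(ordinates) / total
    upper = min(m, int(1 / g))
    # P(g > observed) = sum_{j=1}^{floor(1/g)} (-1)^(j-1) C(m, j) (1 - j g)^(m-1)
    p = sum((-1) ** (j - 1) * math.comb(m, j) * (1 - j * g) ** (m - 1)
            for j in range(1, upper + 1))
    return g, min(1.0, max(0.0, p))
```

A gene would be called periodic when the p-value falls below the chosen threshold. The alternating sum can lose precision for long series, so a production implementation would need a more careful evaluation.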
46.  Identification of Robust Pathway Markers for Cancer through Rank-Based Pathway Activity Inference 
Advances in Bioinformatics  2013;2013:618461.
One important problem in translational genomics is the identification of reliable and reproducible markers that can be used to discriminate between different classes of a complex disease, such as cancer. The typical small sample setting makes the prediction of such markers very challenging, and various approaches have been proposed to address this problem. For example, it has been shown that pathway markers, which aggregate the gene activities in the same pathway, tend to be more robust than gene markers. Furthermore, the use of gene expression ranking has been demonstrated to be robust to batch effects and that it can lead to more interpretable results. In this paper, we propose an enhanced pathway activity inference method that uses gene ranking to predict the pathway activity in a probabilistic manner. The main focus of this work is on identifying robust pathway markers that can ultimately lead to robust classifiers with reproducible performance across datasets. Simulation results based on multiple breast cancer datasets show that the proposed inference method identifies better pathway markers that can predict breast cancer metastasis with higher accuracy. Moreover, the identified pathway markers can lead to better classifiers with more consistent classification performance across independent datasets.
PMCID: PMC3600350  PMID: 23533400
47.  An Overview of the Statistical Methods Used for Inferring Gene Regulatory Networks and Protein-Protein Interaction Networks 
Advances in Bioinformatics  2013;2013:953814.
The large influx of data from high-throughput genomic and proteomic technologies has encouraged researchers to seek approaches for understanding the structure of gene regulatory networks and proteomic networks. This work reviews some of the most important statistical methods used for modeling of gene regulatory networks (GRNs) and protein-protein interaction (PPI) networks. The paper focuses on the recent advances in the statistical graphical modeling techniques, state-space representation models, and information theoretic methods that were proposed for inferring the topology of GRNs. It appears that the problem of inferring the structure of PPI networks is quite different from that of GRNs. Clustering and probabilistic graphical modeling techniques are of prime importance in the statistical inference of PPI networks, and some of the recent approaches using these techniques are also reviewed in this paper. Performance evaluation criteria for the approaches used for modeling GRNs and PPI networks are also discussed.
PMCID: PMC3594945  PMID: 23509452
48.  Using Protein Clusters from Whole Proteomes to Construct and Augment a Dendrogram 
Advances in Bioinformatics  2013;2013:191586.
In this paper we present a new ab initio approach for constructing an unrooted dendrogram using protein clusters, an approach that has the potential for estimating relationships among several thousands of species based on their putative proteomes. We employ an open-source software program called pClust that was developed for use in metagenomic studies. Sequence alignment is performed by pClust using the Smith-Waterman algorithm, which is known to give optimal alignment and, hence, greater accuracy than BLAST-based methods. Protein clusters generated by pClust are used to create protein profiles for each species in the dendrogram, these profiles forming a correlation filter library for use with a new taxon. To augment the dendrogram with a new taxon, a protein profile for the taxon is created using BLASTp, and this new taxon is placed into a position within the dendrogram corresponding to the highest correlation with profiles in the correlation filter library. This work was initiated because of our interest in plasmids, and each step is illustrated using proteomes from Gram-negative bacterial plasmids. Proteomes for 527 plasmids were used to generate the dendrogram, and to demonstrate the utility of the insertion algorithm twelve recently sequenced pAKD plasmids were used to augment the dendrogram.
PMCID: PMC3590580  PMID: 23509450
49.  Solving the 0/1 Knapsack Problem by a Biomolecular DNA Computer 
Advances in Bioinformatics  2013;2013:341419.
Solving some mathematical problems, such as NP-complete problems, on conventional silicon-based computers is problematic and takes a very long time. DNA computing is an alternative method of computing which uses DNA molecules for computing purposes. DNA computers have a massive degree of parallel processing capability. The massively parallel processing characteristic of DNA computers is of particular interest in solving NP-complete and hard combinatorial problems. NP-complete problems such as the knapsack problem and other hard combinatorial problems can be easily solved by DNA computers in a very short period of time compared to conventional silicon-based computers. Sticker-based DNA computing is one of the methods of DNA computing. In this paper, sticker-based DNA computing was used for solving the 0/1 knapsack problem. First, a biomolecular solution space was constructed by using appropriate DNA memory complexes. Then, by the application of a sticker-based parallel algorithm using biological operations, the knapsack problem was solved in polynomial time.
PMCID: PMC3588402  PMID: 23509451
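As a conventional-computer analogy to the approach in the abstract above, the sketch below builds the full solution space of item selections and then filters it, which is roughly what a sticker-based scheme does in parallel across DNA memory complexes. It is an illustration only, not the authors' biological protocol; `knapsack_exhaustive` is a hypothetical name, and on silicon the loop visits all 2^n bit strings one at a time.

```python
from itertools import product


def knapsack_exhaustive(weights, values, capacity):
    """Solve 0/1 knapsack by enumerating every selection bit string."""
    best_value = 0
    best_choice = tuple(0 for _ in weights)
    # Each bit string plays the role of one memory complex:
    # bit i set means item i is placed in the knapsack.
    for choice in product((0, 1), repeat=len(weights)):
        total_weight = sum(w for w, bit in zip(weights, choice) if bit)
        if total_weight > capacity:
            continue  # discard selections that exceed the capacity
        total_value = sum(v for v, bit in zip(values, choice) if bit)
        if total_value > best_value:
            best_value, best_choice = total_value, choice
    return best_value, best_choice
```

The number of sequential filtering steps in a DNA scheme can grow polynomially with the number of items, but the amount of DNA needed to represent the solution space still grows exponentially, which is the practical limit on such methods.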
50.  MRMPath and MRMutation, Facilitating Discovery of Mass Transitions for Proteotypic Peptides in Biological Pathways Using a Bioinformatics Approach 
Advances in Bioinformatics  2013;2013:527295.
Quantitative proteomics applications in mass spectrometry depend on the knowledge of the mass-to-charge ratio (m/z) values of proteotypic peptides for the proteins under study and their product ions. MRMPath and MRMutation, web-based bioinformatics software that are platform independent, facilitate the recovery of this information by biologists. MRMPath utilizes publicly available information related to biological pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. All the proteins involved in pathways of interest are recovered and processed in silico to extract information relevant to quantitative mass spectrometry analysis. Peptides may also be subjected to automated BLAST analysis to determine whether they are proteotypic. MRMutation catalogs and makes available, following processing, known (mutant) variants of proteins from the current UniProtKB database. All these results, available via the web from well-maintained, public databases, are written to an Excel spreadsheet, which the user can download and save. MRMPath and MRMutation can be freely accessed. As a system that seeks to allow two or more resources to interoperate, MRMPath represents an advance in bioinformatics tool development. As a practical matter, the MRMPath automated approach represents significant time savings to researchers.
PMCID: PMC3570921  PMID: 23424586