1.  Design, Synthesis, and SAR Studies of 4-Substituted Methoxylbenzoyl-aryl-thiazoles Analogues as Potent and Orally Bioavailable Anticancer Agents 
Journal of medicinal chemistry  2011;54(13):4678-4693.
In a continued effort to improve upon the previously published 4-substituted methoxybenzoyl-arylthiazole (SMART) template, we explored chemodiverse “B” rings and “B” to “C” ring linkage. Further, to overcome the poor aqueous solubility of this series of agents, we introduced polar and ionizable hydrophilic groups to obtain water-soluble compounds. For instance, based on in vivo pharmacokinetic (PK) studies, an orally bioavailable phenyl-aminothiazole (PAT) template was designed and synthesized in which an amino linkage was inserted between “A” and “B” rings of compound 1. The PAT template maintained nanomolar (nM) range potency against cancer cell lines via inhibiting tubulin polymerization and was not susceptible to P-glycoprotein mediated multidrug resistance in vitro, and markedly improved solubility and bioavailability compared with the SMART template (45a–c (PAT) vs 1 (SMART)).
PMCID: PMC4755333  PMID: 21557538
2.  Functional MnO nanoclusters for efficient siRNA delivery 
A non-viral gene delivery nanovehicle based on Alkyl-PEI2k capped MnO nanoclusters was synthesized via a simple, facile method and used for efficient siRNA delivery and magnetic resonance imaging.
PMCID: PMC4620662  PMID: 21991584
3.  Attitudes and Practices of Obstetrician–Gynecologists Regarding Influenza Vaccination in Pregnancy 
Obstetrics and gynecology  2011;118(5):1074-1080.
To assess knowledge, attitudes, and practices of obstetrician–gynecologists (ob-gyns) regarding vaccination of pregnant women during the 2009 H1N1 pandemic.
From February to July 2010, a self-administered mail survey was conducted among a random sample of American College of Obstetricians and Gynecologists (the College) members involved in obstetric care. To assess predictors of routinely offering influenza vaccination, adjusted prevalence ratios and 95% confidence intervals (CIs) were calculated from survey data.
Among 3,096 survey recipients, 1,310 (42.3%) responded to the survey, of whom 873 were eligible for participation. The majority of ob-gyns reported routinely offering both seasonal and 2009 H1N1 influenza vaccination to their pregnant patients (77.6% and 85.6%, respectively) during the 2009–2010 season; 21.1% and 13.3% referred patients to other specialists. Reported reasons for not offering vaccination included inadequate reimbursement, storage limitations, or belief that vaccine should be administered by another provider. Seasonal and 2009 H1N1 influenza vaccination during the first trimester was not recommended by 10.6% and 9.6% of ob-gyns, respectively. Predictors of routinely offering 2009 H1N1 influenza vaccine included: considering primary care and preventive medicine a very important part of practice (adjusted prevalence ratio 1.2, CI 1.01–1.4); observing serious conditions attributed to influenza-like illness (adjusted prevalence ratio 1.1, CI 1.02–1.1); personally receiving 2009 H1N1 influenza vaccination (adjusted prevalence ratio 1.2, CI 1.1–1.4); and practicing in a multispecialty group (adjusted prevalence ratio 1.1, CI 1.1–1.2). Physicians in solo practice were less likely to routinely offer influenza vaccine (adjusted prevalence ratio 0.8, CI 0.7–0.9).
Although most ob-gyns routinely offered influenza vaccination to pregnant patients, vaccination coverage rates may be improved by addressing logistic and financial challenges of vaccine providers.
PMCID: PMC4608446  PMID: 22015875
4.  Reversible Infantile Respiratory Chain Deficiency is a Unique, Genetically Heterogenous Mitochondrial Disease 
Journal of medical genetics  2011;48(10):660-668.
Homoplasmic, maternally inherited m.14674T>C or m.14674T>G mt-tRNAGlu mutations have recently been identified in reversible infantile cytochrome c oxidase deficiency (or “benign COX deficiency”). We sought other genetic defects that may give rise to similar presentations.
We investigated 8 patients from 7 families with clinico-pathological features of infantile reversible cytochrome c oxidase deficiency.
We reviewed the diagnostic features and performed molecular genetic analyses of mitochondrial DNA and nuclear-encoded candidate genes.
Patients presented with subacute onset of profound hypotonia, feeding difficulties and lactic acidosis within the first months of life. Although recovery was remarkable, a mild myopathy persisted into adulthood. Histopathological findings in muscle included increased lipid and/or glycogen content, ragged-red and COX negative fibres. Biochemical studies suggested more generalized abnormalities than pure COX deficiency. Clinical improvement was reflected by normalization of lactic acidosis and histopathological abnormalities. The m.14674T>C mt-tRNAGlu mutation was identified in 4 families, but none had the m.14674T>G mutation. Furthermore, in 2 families we also found pathogenic mutations in the nuclear TRMU gene, which has not previously been associated with this phenotype. In one family, the genetic etiology remains unknown.
Benign COX deficiency is better described as “Reversible Infantile Respiratory Chain Deficiency”. It is genetically heterogeneous, and patients not carrying the m.14674T>C or T>G mt-tRNAGlu mutations may have mutations in the TRMU gene. Diagnosing this disorder at the molecular level is a significant advance for paediatric neurologists and intensive care paediatricians, enabling them to select children with an excellent prognosis for continuing respiratory support from those with severe mitochondrial presentation in infancy.
PMCID: PMC4562368  PMID: 21931168
Neuromuscular Diseases; Mitochondrial myopathies; Respiratory chain
5.  On the Importance of Knowing Your Partner’s Views: Attitude Familiarity is Associated with Better Interpersonal Functioning and Lower Ambulatory Blood Pressure in Daily Life 
Relationships have been linked to significant physical health outcomes. However, little is known about the more specific processes that might be responsible for such links.
The main aim of this study was to examine a previously unexplored and potentially important form of partner knowledge (i.e., attitude familiarity) on relationship processes and cardiovascular function.
In this study, 47 married couples completed an attitude familiarity questionnaire and ambulatory assessments of daily spousal interactions and blood pressure.
Attitude familiarity was associated with better interpersonal functioning between spouses in daily life (e.g., greater partner responsiveness). Importantly, attitude familiarity was also related to lower overall ambulatory systolic blood pressure and diastolic blood pressure.
These data suggest that familiarity with a spouse’s attitudes may be an important factor linking relationships to better interpersonal and physical health outcomes.
PMCID: PMC4560465  PMID: 20878291
Attitudes; Partner knowledge; Relationships; Ambulatory blood pressure; Ecological momentary assessment
6.  Effect of the influenza A (H1N1) live attenuated intranasal vaccine on nitric oxide (FENO) and other volatiles in exhaled breath 
Journal of breath research  2011;5(3):037107.
For the 2009 influenza A (H1N1) pandemic, vaccination and infection control were the main modes of prevention. A live attenuated H1N1 vaccine mimics natural infection and works by evoking a host immune response, but currently there are no easy methods to measure such a response. To determine if an immune response could be measured in exhaled breath, exhaled nitric oxide (FENO) and other exhaled breath volatiles using selected ion flow tube mass spectrometry (SIFT-MS) were measured before and daily for seven days after administering the H1N1 2009 monovalent live intranasal vaccine (FluMist®, MedImmune LLC) in nine healthy healthcare workers (age 35 ± 7 years; five females). On day 3 after H1N1 FluMist® administration there were increases in FENO (MEAN±SEM: day 0 15 ± 3 ppb, day 3 19 ± 3 ppb; p < 0.001) and breath isoprene (MEAN±SEM: day 0 59 ± 15 ppb, day 3 99 ± 17 ppb; p = 0.02). MS analysis identified the greatest number of changes in exhaled breath on day 3 with 137 product ion masses that changed from baseline. The exhaled breath changes on day 3 after H1N1 vaccination may reflect the underlying host immune response. However, further work to elucidate the sources of the exhaled breath changes is necessary.
PMCID: PMC4552053  PMID: 21757798
7.  Region-specific regulation of 5-HT1A receptor expression by Pet-1-dependent mechanisms in vivo 
Journal of neurochemistry  2011;116(6):1066-1076.
Serotonin (5-hydroxytryptamine, 5-HT) neurotransmission is negatively regulated by 5-HT1A autoreceptors on raphe neurons, and is implicated in mood disorders. Pet-1/FEV is an ETS transcription factor expressed exclusively in serotonergic neurons and is essential for serotonergic differentiation, although its regulation of 5-HT receptors has not yet been studied. Here, we show by electrophoretic mobility shift assay that recombinant human Pet-1/FEV binds directly to multiple Pet-1 elements of the human 5-HT1A receptor promoter to enhance its transcriptional activity. In luciferase reporter assays, mutational analysis indicated that while several sites contribute, the Pet-1 site at −1406 bp had the greatest effect on 5-HT1A promoter activity. To address the effect of Pet-1 on 5-HT1A receptor regulation in vivo, we compared the expression of 5-HT1A receptor RNA and protein in Pet-1 null and wild-type littermate mice. In the raphe nuclei of Pet-1−/− mice tryptophan hydroxylase 2 (TPH2) RNA, and 5-HT and TPH immunostaining were greatly reduced, indicating a deficit in 5-HT production. Raphe 5-HT1A RNA and protein levels were also reduced in Pet-1-deficient mice, consistent with an absence of Pet-1-mediated transcriptional enhancement of 5-HT1A autoreceptors in serotonergic neurons. Interestingly, 5-HT1A receptor expression was up-regulated in the hippocampus, but down-regulated in the striatum and cortex. These data indicate that, in addition to transcriptional regulation by Pet-1 in raphe neurons, 5-HT1A receptor expression is regulated indirectly by alterations in 5-HT neurotransmission in a region-specific manner that together may contribute to the aggressive/anxiety phenotype observed in Pet-1 null mice.
PMCID: PMC4540595  PMID: 21182526 CAMSID: cams4997
depression; development; ETS transcription factor; serotonin; transcription; transgenic
8.  Novel Flavohemoglobins of Mycobacteria 
IUBMB life  2011;63(5):337-345.
Flavohemoglobins (flavoHbs) constitute a distinct class of chimeric hemoglobins in which a globin domain is coupled with a ferredoxin reductase such as FAD- and NADH-binding modules. Structural features and active site of heme and reductase domains are highly conserved in various flavoHbs. A new class of flavoHbs, displaying crucial differences in functionally conserved regions of heme and reductase domains, have been identified in mycobacteria. Mining of microbial genome data indicated that the occurrence of such flavoHbs might be restricted to a small group of microbes unlike conventional flavoHbs that are widespread among prokaryotes and lower eukaryotes. One of the representative flavoHbs of this class, encoded by Rv0385 gene (MtbFHb) of Mycobacterium tuberculosis, has been cloned, expressed, and characterized. The ferric and deoxy spectra of MtbFHb displayed a hexacoordinate state indicating that its distal site may be occupied by an intrinsic amino acid or an external ligand and it may not be involved in nitric oxide detoxification. Phylogenetic analysis revealed that mycobacterial flavoHbs constitute a separate cluster distinct from conventional flavoHbs and may have novel function(s).
PMCID: PMC4533980  PMID: 21491561
flavohemoglobin; mycobacteria; electron-transfer; phylogeny; nitric-oxide dioxygenase
9.  The pediatric PRO-SELF©: Pain control program: An effective educational program for parents caring for children at home following tonsillectomy 
The purpose of this paper was to provide a description of the components of the PEDIATRIC PRO-SELF©: Pain Control Program.
Design and Methods
The Program, adapted from studies of this intervention in adults with cancer pain, was tested in two randomized clinical trials of acute pain management in pediatrics.
Key strategies most effective for parents in the pediatric ambulatory surgery setting included use of an educational booklet and timer to facilitate adherence to the prescribed analgesic regimen, as well as interactive nursing support.
Practice Implications
The PEDIATRIC PRO-SELF©: Pain Control Program can be used with parents caring for children at home following tonsillectomy.
PMCID: PMC4523122  PMID: 21951354
Parent teaching; pediatric pain; postoperative pain; self-care; tonsillectomy
10.  Methodological Challenges in Physical Activity Research with Older Adults 
The aging adult population is growing, as is the incidence of chronic illness among older adults. Physical activity has been demonstrated in the literature to be a beneficial component of self-management for chronic illnesses commonly found in the older adult population. Health sciences research seeks to develop new knowledge, practices, and policies that may benefit older adults’ management of chronic illness and quality of life. However, research with the older adult population, though beneficial, includes potential methodological challenges specific to this age group. This article discusses common methodological issues in research among older adults, with a focus on physical activity intervention studies. Awareness and understanding of these issues may facilitate future development of research studies devoted to the aging adult population, through appropriate modification and tailoring of sampling techniques, intervention development, and data measures and collection.
PMCID: PMC4523128  PMID: 21821726
older adult; elderly; physical activity; methods
11.  Ancient Pbx-Hox signatures define hundreds of vertebrate developmental enhancers 
BMC Genomics  2011;12:637.
Gene regulation through cis-regulatory elements plays a crucial role in development and disease. A major aim of the post-genomic era is to be able to read the function of cis-regulatory elements through scrutiny of their DNA sequence. Whilst comparative genomics approaches have identified thousands of putative regulatory elements, our knowledge of their mechanism of action is poor and very little progress has been made in systematically de-coding them.
Here, we identify ancient functional signatures within vertebrate conserved non-coding elements (CNEs) through a combination of phylogenetic footprinting and functional assay, using genomic sequence from the sea lamprey as a reference. We uncover a striking enrichment within vertebrate CNEs for conserved binding-site motifs of the Pbx-Hox hetero-dimer. We further show that these predict reporter gene expression in a segment specific manner in the hindbrain and pharyngeal arches during zebrafish development.
These findings evoke an evolutionary scenario in which many CNEs evolved early in the vertebrate lineage to co-ordinate Hox-dependent gene-regulatory interactions that pattern the vertebrate head. In a broader context, our evolutionary analyses reveal that CNEs are composed of tightly linked transcription-factor binding-sites (TFBSs), which can be systematically identified through phylogenetic footprinting approaches. By placing a large number of ancient vertebrate CNEs into a developmental context, our findings promise to have a significant impact on efforts toward de-coding gene-regulatory elements that underlie vertebrate development, and will facilitate building general models of regulatory element evolution.
PMCID: PMC3261376  PMID: 22208168
Gene regulation; enhancer code; sea lamprey; Hox genes; embryogenesis
12.  Refinement of Bos taurus sequence assembly based on BAC-FISH experiments 
BMC Genomics  2011;12:639.
The sequencing of the cow genome was recently published (Btau_4.0 assembly). A second, alternate cow genome assembly (UMD2), based on the same raw sequence data, was also published. The two assemblies have been subsequently updated to Btau_4.2 and UMD3.1, respectively.
We compared the Btau_4.2 and UMD3.1 alternate assemblies. Inconsistencies were grouped into three main categories: (i) DNA segments showing almost coincidental chromosomal mapping but discordant orientation (inversions); (ii) DNA segments showing a discordant map position along the same chromosome; and (iii) sequences present in one chromosomal assembly but absent in the corresponding chromosome of the other assembly. The latter category mainly consisted of large numbers of scaffolds that were unassigned in Btau_4.2 but successfully mapped in UMD3.1. We sampled 70 inconsistencies and identified appropriate cow BACs for each of them. These clones were then used in FISH experiments on cow metaphase or interphase nuclei to resolve the discrepancies. In almost all instances the FISH results agreed with the UMD3.1 assembly. Occasionally, however, the mapping data of both assemblies were discordant with the FISH results.
Our work demonstrates how FISH, which is assembly independent, can be efficiently used to solve assembly problems frequently encountered using the shotgun approach.
PMCID: PMC3268123  PMID: 22208360
Cow genome; alternate assemblies of cow genomes; genomic comparison; unassigned scaffolds; BAC-FISH mapping
13.  Opposite GC skews at the 5' and 3' ends of genes in unicellular fungi 
BMC Genomics  2011;12:638.
GC-skews have previously been linked to transcription in some eukaryotes. They have been associated with transcription start sites, with the coding strand G-biased in mammals and C-biased in fungi and invertebrates.
We show a consistent and highly significant pattern of GC-skew within genes of almost all unicellular fungi. The pattern of GC-skew is asymmetrical: the coding strand of genes is typically C-biased at the 5' ends but G-biased at the 3' ends, with intermediate skews at the middle of genes. Thus, the initiation, elongation, and termination phases of transcription are associated with different skews. This pattern influences the encoded proteins by generating differential usage of amino acids at the 5' and 3' ends of genes. These biases also affect fourfold-degenerate positions and extend into promoters and 3' UTRs, indicating that skews cannot be accounted for by selection for protein function or translation.
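The asymmetry described in this paragraph can be illustrated with a small calculation. The following is a minimal sketch, assuming the commonly used definition of GC-skew, (G − C)/(G + C), computed on the coding strand; the abstract does not state the exact formula or binning the authors used, and the function names, the three-bin split, and the toy sequence are all hypothetical.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for a coding-strand sequence; 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def skew_profile(gene: str, n_bins: int = 3) -> list:
    """Split a gene into n_bins segments (5' to 3') and return the GC-skew of each."""
    size = len(gene) // n_bins
    bins = [gene[i * size:(i + 1) * size] for i in range(n_bins - 1)]
    bins.append(gene[(n_bins - 1) * size:])  # the last bin takes any remainder
    return [gc_skew(b) for b in bins]


# Toy sequence invented for illustration: C-rich 5' end, balanced middle, G-rich 3' end.
toy_gene = "ATGCCCTCACCA" + "ATGCATGCATGC" + "GGTGAGGGATAA"
profile = skew_profile(toy_gene)
print(profile)  # negative skew = C-biased, positive skew = G-biased
```

A negative first value and a positive last value correspond to the C-biased 5' end and G-biased 3' end reported for unicellular fungi; real analyses would average such profiles over many genes.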
We propose two explanations, the mutational pressure hypothesis, and the adaptive hypothesis. The mutational pressure hypothesis is that different co-factors bind to RNA pol II at different phases of transcription, producing different mutational regimes. The adaptive hypothesis is that cytidine triphosphate deficiency may lead to C-avoidance at the 3' ends of transcripts to control the flow of RNA pol II molecules and reduce their frequency of collisions.
PMCID: PMC3315797  PMID: 22208287
14.  Employing machine learning for reliable miRNA target identification in plants 
BMC Genomics  2011;12:636.
miRNAs are small noncoding RNA molecules, ~21 nucleotides long, formed endogenously in most eukaryotes, which mainly control their target genes post-transcriptionally by interacting with and silencing them. While many tools have been developed for animal miRNA target identification, plant miRNA target identification has seen limited development. Most plant tools center on exact complementarity matching, and very few consider other factors such as multiple target sites and the role of flanking regions.
In the present work, a Support Vector Regression (SVR) approach has been implemented for plant miRNA target identification, utilizing position-specific dinucleotide density variation information around the target sites to yield highly reliable results. It has been named p-TAREF (plant-Target Refiner). p-TAREF was rigorously compared with other prediction tools for plants and was found to perform better in several aspects. Further, p-TAREF was run over the experimentally validated miRNA targets from species such as Arabidopsis, Medicago, rice and tomato, and detected them accurately, suggesting broad usability of p-TAREF across plant species. Using p-TAREF, target identification was done for the complete rice transcriptome, supported by expression and degradome-based data. miR156 was found to be an important component of the rice regulatory system, where control of genes associated with growth and transcription appeared predominant. The entire methodology has been implemented in a multi-threaded parallel architecture in Java to enable fast processing for both the web-server and standalone versions. This also allows it to run in concurrent mode even on a simple desktop computer. In its web-server version, it also provides a facility to gather experimental support for predictions through on-the-spot expression data analysis.
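To make "position-specific dinucleotide density" concrete, the sketch below shows one way such features could be extracted around a candidate target site. It is an illustration only and not the p-TAREF implementation (which the abstract says is written in Java): the window width, flank length, and function names are assumptions, and the regression step itself is omitted. The resulting vectors are the kind of input a support vector regressor could be trained on.

```python
from itertools import product

# The 16 possible RNA dinucleotides, in a fixed order so feature vectors align.
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]


def dinucleotide_density(window: str) -> dict:
    """Fraction of overlapping dinucleotide positions taken by each dinucleotide."""
    window = window.upper().replace("T", "U")
    total = len(window) - 1
    counts = {d: 0 for d in DINUCLEOTIDES}
    for i in range(total):
        pair = window[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
    return {d: (counts[d] / total if total > 0 else 0.0) for d in DINUCLEOTIDES}


def site_features(transcript: str, start: int, end: int,
                  flank: int = 10, width: int = 5) -> list:
    """Concatenate densities of consecutive windows across flank + site + flank.

    `flank` and `width` are hypothetical parameters chosen for illustration.
    """
    lo = max(0, start - flank)
    hi = min(len(transcript), end + flank)
    region = transcript[lo:hi]
    features = []
    for i in range(0, len(region) - width + 1, width):
        density = dinucleotide_density(region[i:i + width])
        features.extend(density[d] for d in DINUCLEOTIDES)
    return features


# Toy usage: a made-up transcript with a 20-nt site at positions 10..30.
vector = site_features("ACGU" * 10, 10, 30, flank=5, width=5)
print(len(vector))  # 6 windows x 16 dinucleotides
```

Keeping one block of 16 densities per window preserves positional information, so a learner can distinguish, for example, a GC-rich flank from a GC-rich site core.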
A machine learning multivariate feature tool has been implemented in parallel and locally installable form for plant miRNA target identification. The performance was assessed and compared through comprehensive testing and benchmarking, suggesting reliable performance and broad usability for transcriptome-wide plant miRNA target identification.
PMCID: PMC3293931  PMID: 22206472
15.  Expression divergence measured by transcriptome sequencing of four yeast species 
BMC Genomics  2011;12:635.
The evolution of gene expression is a challenging problem in evolutionary biology, for which accurate, well-calibrated measurements and methods are crucial.
We quantified gene expression with whole-transcriptome sequencing in four diploid, prototrophic strains of Saccharomyces species grown under the same condition to investigate the evolution of gene expression. We found that variation in expression is gene-dependent, with large variations in each gene's expression between replicates of the same species. This confounds the identification of genes differentially expressed across species. To address this, we developed a statistical approach to establish significance bounds for inter-species differential expression in RNA-Seq data based on the variance measured across biological replicates. This metric estimates the combined effects of technical and environmental variance, as well as Poisson sampling noise, by isolating each component. Despite a paucity of large expression changes, we found a strong correlation between the variance of gene expression change and species divergence (R² = 0.90).
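The idea of a replicate-derived significance bound can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' statistic: it takes the spread of log2 ratios between two replicates of one species as the noise estimate, uses a delta-method approximation for Poisson counting noise as a per-gene floor, and calls a between-species change significant only if it exceeds a z-scaled bound. The pseudocount, the `max` combination rule, the threshold `z`, and all counts are hypothetical.

```python
import math
from statistics import pvariance


def log2_fold_change(a: float, b: float, pseudo: float = 1.0) -> float:
    """log2 ratio of two read counts, with a pseudocount to avoid log(0)."""
    return math.log2((a + pseudo) / (b + pseudo))


def replicate_sd(rep1, rep2) -> float:
    """SD of per-gene log2 ratios between two replicates of the same species."""
    ratios = [log2_fold_change(x, y) for x, y in zip(rep1, rep2)]
    return math.sqrt(pvariance(ratios))


def poisson_sd_log2(count_a: float, count_b: float) -> float:
    """Approximate SD of log2(a/b) expected from Poisson sampling alone."""
    return math.sqrt(1.0 / max(count_a, 1) + 1.0 / max(count_b, 1)) / math.log(2)


def is_divergent(count_a, count_b, rep_sd, z: float = 1.96) -> bool:
    """True if the inter-species change exceeds the noise bound (assumed rule)."""
    noise = max(rep_sd, poisson_sd_log2(count_a, count_b))  # Poisson noise as a floor
    return abs(log2_fold_change(count_a, count_b)) > z * noise


# Made-up counts for four genes in two replicates of one species.
rep1 = [100, 200, 400, 800]
rep2 = [110, 190, 420, 760]
sd = replicate_sd(rep1, rep2)
print(is_divergent(100, 800, sd), is_divergent(100, 105, sd))
```

The point of the floor is that lowly expressed genes carry more counting noise, so the same fold change is less convincing for them than for highly expressed genes.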
We provide an improved methodology for measuring gene expression changes in evolutionarily diverged species using RNA-Seq, where experimental artifacts can mimic evolutionary effects.
GEO Accession Number: GSE32679
PMCID: PMC3296765  PMID: 22206443
RNA-Seq; Comparative transcriptomics; S. cerevisiae; S. paradoxus; S. mikatae; S. bayanus
16.  Thyroid hormone-regulated gene expression in juvenile mouse liver: identification of thyroid response elements using microarray profiling and in silico analyses 
BMC Genomics  2011;12:634.
Disruption of thyroid hormone signalling can alter growth, development and energy metabolism. Thyroid hormones exert their effects through interactions with thyroid receptors that directly bind thyroid response elements and can alter transcriptional activity of target genes. The effects of short-term thyroid hormone perturbation on hepatic mRNA transcription in juvenile mice were evaluated, with the goal of identifying genes containing active thyroid response elements. Thyroid hormone disruption was induced from postnatal day 12 to 15 by adding goitrogens to dams' drinking water (hypothyroid). A subgroup of thyroid hormone-disrupted pups received intraperitoneal injections of replacement thyroid hormones four hours prior to sacrifice (replacement). An additional group received only thyroid hormones four hours prior to sacrifice (hyperthyroid). Hepatic mRNA was extracted and hybridized to Agilent mouse microarrays.
Transcriptional profiling enabled the identification of 28 genes that appeared to be under direct thyroid hormone-regulation. The regulatory regions of the genome adjacent to these genes were examined for half-site sequences that resemble known thyroid response elements. A bioinformatics search identified 33 thyroid response elements in the promoter regions of 13 different genes thought to be directly regulated by thyroid hormones. Thyroid response elements found in the promoter regions of Tor1a, 2310003H01Rik, Hect3d and Slc25a45 were further validated by confirming that the thyroid receptor is associated with these sequences in vivo and that it can bind directly to these sequences in vitro. Three different arrangements of thyroid response elements were identified. Some of these thyroid response elements were located far up-stream (> 7 kb) of the transcription start site of the regulated gene.
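A half-site search of the kind described here can be sketched in a few lines. The abstract does not give the motif or the scoring used, so the sketch below assumes the widely cited nuclear receptor consensus half-site AGGTCA and looks only for one arrangement, a direct repeat with a 4-nt spacer (DR4); the mismatch tolerance, function names, and the toy promoter are hypothetical, and reverse-strand and other arrangements are not handled.

```python
HALF_SITE = "AGGTCA"  # assumed consensus half-site; not stated in the abstract


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_half_sites(seq: str, max_mismatch: int = 1) -> list:
    """Start positions of forward-strand windows resembling the half-site."""
    seq = seq.upper()
    k = len(HALF_SITE)
    return [i for i in range(len(seq) - k + 1)
            if mismatches(seq[i:i + k], HALF_SITE) <= max_mismatch]


def find_direct_repeats(seq: str, spacer: int = 4, max_mismatch: int = 1) -> list:
    """Start positions of two same-strand half-sites separated by `spacer` bases."""
    sites = set(find_half_sites(seq, max_mismatch))
    step = len(HALF_SITE) + spacer
    return sorted(i for i in sites if i + step in sites)


# Toy promoter invented for illustration: perfect half-site, 4-nt spacer,
# then a half-site with one mismatch.
promoter = "TTTT" + "AGGTCA" + "CTGA" + "AGGACA" + "TTTT"
print(find_half_sites(promoter))      # [4, 14]
print(find_direct_repeats(promoter))  # [4]
```

Allowing one mismatch reflects the abstract's wording that candidate sequences "resemble" known elements; tightening `max_mismatch` to 0 would miss the second half-site in this toy example.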
Transcriptional profiling of thyroid hormone disrupted animals coupled with a novel bioinformatics search revealed new thyroid response elements associated with genes previously unknown to be responsive to thyroid hormone. The work provides insight into thyroid response element sequence motif characteristics.
PMCID: PMC3340398  PMID: 22206413
17.  Novel features of ARS selection in budding yeast Lachancea kluyveri 
BMC Genomics  2011;12:633.
The characterization of DNA replication origins in yeast has shed much light on the mechanisms of initiation of DNA replication. However, very little is known about the evolution of origins or the evolution of mechanisms through which origins are recognized by the initiation machinery. This lack of understanding is largely due to the vast evolutionary distances between model organisms in which origins have been examined.
In this study we have isolated and characterized autonomously replicating sequences (ARSs) in Lachancea kluyveri, a pre-whole genome duplication (WGD) budding yeast. Through a combination of experimental work and rigorous computational analysis, we show that L. kluyveri ARSs require a sequence that is similar to but much longer than the ARS Consensus Sequence well defined in Saccharomyces cerevisiae. Moreover, compared with S. cerevisiae and Kluyveromyces lactis, the replication licensing machinery in L. kluyveri seems more tolerant to variations in the ARS sequence composition. It is able to initiate replication from almost all S. cerevisiae ARSs tested and most K. lactis ARSs. In contrast, only about half of the L. kluyveri ARSs function in S. cerevisiae and less than 10% function in K. lactis.
Our findings demonstrate a replication initiation system with novel features and underscore the functional diversity within the budding yeasts. Furthermore, we have developed new approaches for analyzing biologically functional DNA sequences with ill-defined motifs.
PMCID: PMC3306766  PMID: 22204614
18.  Increasing the source/sink ratio in Vitis vinifera (cv Sangiovese) induces extensive transcriptome reprogramming and modifies berry ripening 
BMC Genomics  2011;12:631.
Cluster thinning is an agronomic practice in which a proportion of berry clusters are removed from the vine to increase the source/sink ratio and improve the quality of the remaining berries. Until now no transcriptomic data have been reported describing the mechanisms that underlie the agronomic and biochemical effects of thinning.
We profiled the transcriptome of Vitis vinifera cv. Sangiovese berries before and after thinning at veraison using a genome-wide microarray representing all grapevine genes listed in the latest V1 gene prediction. Thinning increased the source/sink ratio from 0.6 to 1.2 m2 leaf area per kg of berries and boosted the sugar and anthocyanin content at harvest. Extensive transcriptome remodeling was observed in thinned vines 2 weeks after thinning and at ripening. This included the enhanced modulation of genes that are normally regulated during berry development and the induction of a large set of genes that are not usually expressed.
Cluster thinning has a profound effect on several important cellular processes and metabolic pathways including carbohydrate metabolism and the synthesis and transport of secondary products. The integrated agronomic, biochemical and transcriptomic data revealed that the positive impact of cluster thinning on final berry composition reflects a much more complex outcome than simply enhancing the normal ripening process.
PMCID: PMC3283566  PMID: 22192855
19.  Comprehensive analysis of tandem amino acid repeats from ten angiosperm genomes 
BMC Genomics  2011;12:632.
The presence of tandem amino acid repeats (AARs) is one of the signatures of eukaryotic proteins. AARs were thought to be frequently involved in bio-molecular interactions. Comprehensive studies that primarily focused on metazoan AARs have suggested that AARs are evolving rapidly and are highly variable among species. However, there is still controversy over causal factors of this inter-species variation. In this work, we attempted to investigate this topic mainly by comparing AARs in orthologous proteins from ten angiosperm genomes.
Angiosperm AAR content is positively correlated with the GC content of the protein coding sequence. However, based on observations from fungal AARs and insect AARs, we argue that the applicability of this kind of correlation is limited by AAR residue composition and species' life history traits. Angiosperm AARs also tend to be fast evolving and structurally disordered, supporting the results of comprehensive analyses of metazoans. The functions of conserved long AARs are summarized. Finally, we propose that the rapid mRNA decay rate, alternative splicing and tissue specificity are regulatory processes that are associated with angiosperm proteins harboring AARs.
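For readers unfamiliar with how tandem amino acid repeats are located in a protein sequence, here is a minimal sketch. It detects only the simplest case, uninterrupted runs of a single residue, using a regular expression with a backreference; the minimum run length of 5, the function names, and the toy protein are assumptions, and the study's actual AAR definition may be broader.

```python
import re


def find_single_residue_repeats(protein: str, min_len: int = 5) -> list:
    """Return (residue, start, length) for each run of one residue >= min_len."""
    # ([A-Z]) captures a residue; \1{n,} requires at least n further copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(protein.upper())]


def repeat_fraction(protein: str, min_len: int = 5) -> float:
    """Fraction of the protein covered by single-residue repeats."""
    covered = sum(length for _, _, length in
                  find_single_residue_repeats(protein, min_len))
    return covered / len(protein) if protein else 0.0


# Toy protein invented for illustration: a poly-Q and a poly-S tract.
toy_protein = "MSTQQQQQQQAPLSSSSSGK"
print(find_single_residue_repeats(toy_protein))  # [('Q', 3, 7), ('S', 13, 5)]
print(repeat_fraction(toy_protein))              # 0.6
```

A per-protein repeat fraction like this is the kind of quantity that could then be compared against the GC content of the corresponding coding sequence.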
Our investigation suggests that GC content is a predictor of AAR content in the protein coding sequence under certain conditions. Although angiosperm AARs lack conservation and 3D structure, a fraction of the proteins that contain AARs may be functionally important and are under extensive regulation in plant cells.
PMCID: PMC3283746  PMID: 22195734
20.  Genomic signatures and gene networking: challenges and promises 
BMC Genomics  2011;12(Suppl 5):I1.
This is an editorial report of the supplement to BMC Genomics that includes 15 papers selected from the BIOCOMP'10 - The 2010 International Conference on Bioinformatics & Computational Biology as well as other sources with a focus on genomics studies.
BIOCOMP'10 was held on July 12-15 in Las Vegas, Nevada. The congress covered a large variety of research areas, and genomics was one of the major focuses because of the fast development in this field. We set out to launch a supplement to BMC Genomics with manuscripts selected from this congress and invited submissions. With a rigorous peer review process, we selected 15 manuscripts that showed work in cutting-edge genomics fields and proposed innovative methodology. We hope this supplement presents the current computational and statistical challenges faced in genomics studies, and shows the enormous promises and opportunities in the genomic future.
PMCID: PMC3287490  PMID: 22369358
21.  Gene selection and classification for cancer microarray data based on machine learning and similarity measures 
BMC Genomics  2011;12(Suppl 5):S1.
Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money.
To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others.
On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to a random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.
PMCID: PMC3287491  PMID: 22369383
gene selection; microarray; classification; supervised-learning; similarity
22.  A hidden Markov model-based algorithm for identifying tumour subtype using array CGH data 
BMC Genomics  2011;12(Suppl 5):S10.
The recent advancement in array CGH (aCGH) research has significantly improved tumor identification using DNA copy number data. A number of unsupervised learning methods have been proposed for clustering aCGH samples. Two of the major challenges for developing aCGH sample clustering are the high spatial correlation between aCGH markers and the low computing efficiency. A mixture hidden Markov model based algorithm was developed to address these two challenges.
The hidden Markov model (HMM) was used to model the spatial correlation between aCGH markers. A fast clustering algorithm was implemented and real data analysis on glioma aCGH data has shown that it converges to the optimal cluster rapidly and the computation time is proportional to the sample size. Simulation results showed that this HMM based clustering (HMMC) method has a substantially lower error rate than NMF clustering. The HMMC results for glioma data were significantly associated with clinical outcomes.
We have developed a fast clustering algorithm to identify tumor subtypes based on DNA copy number aberrations. The performance of the proposed HMMC method has been evaluated using both simulated and real aCGH data. The software for HMMC in both R and C++ is available on the ND INBRE website.
PMCID: PMC3287492  PMID: 22369459
23.  Predicting adverse side effects of drugs 
BMC Genomics  2011;12(Suppl 5):S11.
Studies of toxicity and unintended side effects can lead to improved drug safety and efficacy. One promising form of study comes from molecular systems biology in the form of "systems pharmacology". Systems pharmacology combines data from clinical observation and molecular biology. This approach is new, however, and there are few examples of how it can practically predict adverse drug reactions (ADRs) to an experimental drug with acceptable accuracy.
We have developed a new and practical computational framework to accurately predict ADRs of trial drugs. We combine clinical observation data with drug target data, protein-protein interaction (PPI) networks, and gene ontology (GO) annotations. We use cardiotoxicity, one of the major causes for drug withdrawals, as a case study to demonstrate the power of the framework. Our results show that an in silico model built on this framework can achieve a satisfactory cardiotoxicity ADR prediction performance (median AUC = 0.771, Accuracy = 0.675, Sensitivity = 0.632, and Specificity = 0.789). Our results also demonstrate the significance of incorporating prior knowledge, including gene networks and gene annotations, to improve future ADR assessments.
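The performance measures quoted above follow standard definitions, which the following minimal sketch illustrates. It is not the authors' evaluation code: the function names and the example counts and scores are hypothetical, and AUC is computed here by the rank-based (pairwise comparison) definition rather than by any particular library.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # fraction of all calls that are correct
        "sensitivity": tp / (tp + fn),      # fraction of true ADR cases detected
        "specificity": tn / (tn + fp),      # fraction of non-ADR cases correctly rejected
    }


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win. This is the pairwise definition of AUC,
    fine for small sets; it is quadratic in the number of cases.
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical counts and scores, for illustration only.
    print(classification_metrics(tp=6, fp=2, tn=8, fn=4))
    print(auc([0.9, 0.8, 0.4], [0.7, 0.3]))
```

Sensitivity and specificity are reported separately because a model can trade one for the other; AUC summarises that trade-off across all decision thresholds.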
Biomolecular network and gene annotation information can significantly improve the accuracy of ADR prediction for drugs under development. The use of PPI networks can increase prediction specificity and the use of GO annotations can increase prediction sensitivity. Using cardiotoxicity as an example, we are able to further identify cardiotoxicity-related proteins within PPI networks expanded from drug targets. The systems pharmacology approach that we developed in this study should be generally applicable to future ADR assessments and predictions for developmental drugs.
PMCID: PMC3287493  PMID: 22369493
24.  Transcriptomic profiles of peripheral white blood cells in type II diabetes and racial differences in expression profiles 
BMC Genomics  2011;12(Suppl 5):S12.
Along with obesity, physical inactivity, and family history of metabolic disorders, African American ethnicity is a risk factor for type 2 diabetes (T2D) in the United States. However, little is known about the differences in gene expression and transcriptomic profiles of blood in T2D between African Americans (AA) and Caucasians (CAU). Microarray analysis of peripheral white blood cells (WBCs) from these two ethnic groups will facilitate our understanding of the underlying molecular mechanism in T2D and help identify genetic biomarkers responsible for the disparities.
A whole human genome oligomicroarray of peripheral WBCs was performed on 144 samples obtained from 84 patients with T2D (44 AA and 40 CAU) and 60 healthy controls (28 AA and 32 CAU). The results showed that 30 genes differed significantly in expression between patients and controls (a fold change of <-1.4 or >1.4 with a P value <0.05). These known genes were mainly clustered in three functional categories: immune responses, lipid metabolism, and organismal injury/abnormalities. Transcriptomic analysis also showed that 574 genes were differentially expressed in AA diseased versus AA control, compared to 200 genes in CAU subjects. Pathway study revealed that "Communication between innate and adaptive immune cells"/"Primary immunodeficiency signaling" are significantly down-regulated in AA patients and "Interferon signaling"/"Complement System" are significantly down-regulated in CAU patients.
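The selection rule stated above (fold change below -1.4 or above 1.4, with P < 0.05) can be expressed as a simple filter. The sketch below is illustrative only: the record layout, the function name and the gene names are hypothetical, and it assumes a signed fold-change convention in which down-regulation is reported as a negative value.

```python
def select_differential(genes, fc_cutoff=1.4, p_cutoff=0.05):
    """Return names of genes passing both the fold-change and P-value cutoffs.

    `genes` is a list of dicts with keys "name", "fold_change" (signed,
    negative meaning down-regulated) and "p_value". This layout is an
    assumption made for the example.
    """
    return [
        g["name"]
        for g in genes
        if abs(g["fold_change"]) > fc_cutoff and g["p_value"] < p_cutoff
    ]


if __name__ == "__main__":
    # Made-up records, not data from the study.
    example = [
        {"name": "GENE_A", "fold_change": 1.9, "p_value": 0.01},
        {"name": "GENE_B", "fold_change": -1.6, "p_value": 0.03},
        {"name": "GENE_C", "fold_change": 1.2, "p_value": 0.001},
        {"name": "GENE_D", "fold_change": -2.5, "p_value": 0.20},
    ]
    print(select_differential(example))
```

Requiring both conditions excludes genes with a large but statistically unreliable change as well as genes with a reliable but very small change.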
These newly identified genetic markers in WBCs provide valuable information about the pathophysiology of T2D and can be used for diagnosis and pharmaceutical drug design. Our results also showed that AA and CAU patients with T2D express genes and pathways differently.
PMCID: PMC3287494  PMID: 22369568
25.  Learning the structure of gene regulatory networks from time series gene expression data 
BMC Genomics  2011;12(Suppl 5):S13.
Dynamic Bayesian Network (DBN) is an approach widely used for reconstruction of gene regulatory networks from time-series microarray data. Its performance in network reconstruction depends on a structure learning algorithm. REVEAL (REVerse Engineering ALgorithm) is one of the algorithms implemented for learning DBN structure and used to reconstruct gene regulatory networks (GRN). However, the two-stage temporal Bayes network (2TBN) structure of DBN that specifies correlation between time slices cannot be obtained by score metrics used in REVEAL.
In this paper, we study a more sophisticated score function for DBNs, first proposed by Nir Friedman for structure learning of both the initial and transition networks of stationary DBNs, that has not yet been used for reconstruction of GRNs. We implemented Friedman's Bayesian Information Criterion (BIC) score function, modified the K2 algorithm to learn Dynamic Bayesian Network structure with this score function, and tested the performance of the algorithm for GRN reconstruction with synthetic time series gene expression data generated by GeneNetWeaver and real yeast benchmark experiment data.
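To make the idea of a BIC score for a transition network concrete, here is a minimal sketch under stated assumptions. It uses the generic BIC form (maximised log-likelihood minus a penalty of half the parameter count times the log of the sample size) for one gene at time t given candidate parent genes at time t-1, with discretised expression states. It is not the authors' implementation, it may differ in detail from Friedman's formulation, and the function name and data layout are hypothetical.

```python
import math
from collections import Counter


def bic_score(series, child, parents, n_states=2):
    """BIC of `child` at time t given `parents` at time t-1.

    `series` is a list of dicts, one per time point, mapping gene name to
    a discrete state in range(n_states). Layout assumed for illustration.
    """
    transitions = list(zip(series[:-1], series[1:]))
    n = len(transitions)

    joint = Counter()          # counts of (parent configuration, child state)
    parent_counts = Counter()  # counts of each parent configuration
    for prev, curr in transitions:
        config = tuple(prev[p] for p in parents)
        joint[(config, curr[child])] += 1
        parent_counts[config] += 1

    # Maximised log-likelihood of the conditional distribution.
    log_lik = sum(
        count * math.log(count / parent_counts[config])
        for (config, _state), count in joint.items()
    )
    # Free parameters: one distribution per parent configuration.
    n_params = (n_states ** len(parents)) * (n_states - 1)
    return log_lik - 0.5 * math.log(n) * n_params


if __name__ == "__main__":
    # Toy series in which gene B copies gene A from the previous time point.
    a = [0, 1, 0, 1, 1, 0, 0, 1]
    b = [0] + a[:-1]
    series = [{"A": x, "B": y} for x, y in zip(a, b)]
    print(bic_score(series, "B", ["A"]), bic_score(series, "B", []))
```

In a K2-style greedy search, a candidate parent would be kept only if adding it raises this score, so the penalty term discourages parents that explain little.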
We implemented an algorithm for DBN structure learning with Friedman's score function, tested it on reconstruction of both synthetic networks and real yeast networks, and compared it with REVEAL in the absence or presence of a preprocessed network generated by Zou & Conzen's algorithm. By introducing a stationary correlation between two consecutive time slices, Friedman's score function showed a higher precision and recall than the naive REVEAL algorithm.
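Precision and recall for network reconstruction are usually computed over directed edges. The sketch below shows that calculation under the assumption that both the predicted and the reference network are given as collections of (regulator, target) pairs; the function name and the example edges are hypothetical and are not taken from the study.

```python
def edge_precision_recall(predicted, reference):
    """Precision and recall of predicted directed edges against a reference.

    Each edge is a (regulator, target) tuple; direction matters.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up edges for illustration.
    predicted = [("A", "B"), ("B", "C"), ("A", "C")]
    reference = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
    print(edge_precision_recall(predicted, reference))
```

Precision penalises spurious predicted edges, while recall penalises true regulatory edges that the method missed, so a method needs to improve both to be considered more accurate overall.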
Friedman's score metric for DBNs can be used to reconstruct transition networks and has great potential to improve the accuracy of gene regulatory network structure prediction with time series gene expression datasets.
PMCID: PMC3287495  PMID: 22369588

