Background & Aims
Farnesoid X receptor (FXR, NR1H4) is a ligand-activated transcription factor, belonging to the nuclear receptor superfamily. FXR is highly expressed in the liver and is essential in regulating bile acid homeostasis. FXR deficiency is implicated in numerous liver diseases and mice with modulation of FXR have been used as animal models to study liver physiology and pathology. We have reported genome-wide binding of FXR in mice by chromatin immunoprecipitation - deep sequencing (ChIP-seq), with results indicating that FXR may be involved in regulating diverse pathways in liver. However, limited information exists for the functions of human FXR and the suitability of using murine models to study human FXR functions.
In the current study, we performed ChIP-seq in primary human hepatocytes (PHHs) treated with a synthetic FXR agonist, GW4064 or DMSO control. In parallel, RNA deep sequencing (RNA-seq) and RNA microarray were performed for GW4064 or control treated PHHs and wild type mouse livers, respectively.
ChIP-seq showed similar profiles of genome-wide FXR binding in humans and mice in terms of motif analysis and pathway prediction. However, RNA-seq and microarray showed more divergent transcriptome profiles between PHHs and mouse livers upon GW4064 treatment.
In summary, we have established genome-wide human FXR binding and transcriptome profiles. These results will aid in determining human FXR functions, as well as in judging to what extent mouse models can be used to study human FXR functions.
Farnesoid X receptor (Fxr) is a ligand-activated nuclear receptor critical for liver function. Reports indicate that functions of Fxr in the liver may overlap with those of hepatocyte nuclear factor 4α (Hnf4α), but studies of their precise genome-wide interaction to regulate gene transcription are lacking. Thus, we compared the genome-wide binding of Fxr and Hnf4α in the liver of mice and characterized their cooperative activity on binding to and activating target gene transcription.
Methods and Results
ChIP-Seq of mouse livers revealed that nearly 50% of the binding sites of Fxr and Hnf4α overlap. Co-immunoprecipitation assays showed a direct Fxr-Hnf4α protein interaction dependent on Fxr activity. Hnf4α bound to shared target sites upstream and in close proximity to Fxr. Moreover, genes co-bound by Fxr and Hnf4α are enriched in complement and coagulation cascades and drug metabolism. Furthermore, transcriptional and binding assays suggest that Hnf4α increases Fxr transcriptional activity; however, binding of Hnf4α can be either Fxr-dependent or -independent at different sites.
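A figure such as the share of overlapping binding sites depends on how two peak sets are intersected. The following is a minimal sketch, not the procedure used in the study: it assumes peaks are hypothetical `(chrom, start, end)` tuples with half-open coordinates and counts a peak as overlapping when it shares at least one base with a peak in the other set. The function name `overlap_fraction` and the toy coordinates are illustrative only.

```python
from collections import defaultdict


def overlap_fraction(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are hypothetical (chrom, start, end) tuples, half-open coordinates.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < b_end and b_start < end for b_start, b_end in by_chrom[chrom]):
            hits += 1
    return hits / len(peaks_a) if peaks_a else 0.0


# Toy coordinates, not data from the study.
fxr = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150), ("chr3", 10, 20)]
hnf4a = [("chr1", 150, 250), ("chr2", 140, 300), ("chr3", 20, 40)]
print(overlap_fraction(fxr, hnf4a))
```

The linear scan per peak is adequate for small sets; genome-scale peak lists would normally be sorted or indexed first.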
Our results showed that Fxr cooperates with Hnf4α in the liver to modulate gene transcription. This study provides the first evidence on a genome-wide scale of both cooperative and independent interactions between Fxr and Hnf4α in regulating gene transcription.
Fxr; Hnf4α; ChIP-Seq; co-regulation; nuclear receptor interaction
Understanding the factors governing protein solubility is key to grasping its underlying mechanisms and may provide insight into diseases related to protein aggregation and misfolding, such as Alzheimer’s disease. In this work, we attempt to identify factors important to protein solubility using feature selection. First, we calculate 1438 features, including physicochemical properties and statistics, for each protein. The Random Forest algorithm is used to select the most informative minimal subset of features based on their predictive performance. A predictive model is built based on 17 selected features. Compared with previous models, our model achieves better performance, with a sensitivity of 0.82, specificity of 0.85, accuracy (ACC) of 0.84, AUC of 0.91 and MCC of 0.67. Furthermore, a model using a redundancy-reduced dataset (sequence identity <= 30%) achieves the same performance as the model without redundancy reduction. Our results provide not only a reliable model for predicting protein solubility but also a list of features important to protein solubility. The predictive model is implemented as a freely available web application at http://shark.abl.ku.edu/ProS/.
Protein solubility; Aggregation; Random Forest; Classification; Feature selection
Fenretinide is significantly more effective in inducing apoptosis in cancer cells than all-trans retinoic acid (ATRA). The current study uses a genome-wide approach to understand the differential roles of fenretinide and ATRA in inducing apoptosis in Huh7 cells. Gene expression and DNA binding induced by fenretinide and ATRA were profiled using microarray and chromatin immunoprecipitation with anti-RXRα antibody. The data showed that fenretinide was not a strong transcription regulator. Fenretinide changed the expression of only 1 093 genes, approximately one third of the number of genes regulated by ATRA (2 811). Biological function annotation demonstrated that both fenretinide and ATRA participated in pathways that determine cell fate and metabolic processes. However, fenretinide specifically induced Fas/TNFα-mediated apoptosis by increasing the expression of pro-apoptotic genes, i.e., DEDD2, CASP8, CASP4, and HSPA1A/B, whereas ATRA induced the expression of BIRC3 and TNFAIP3, which inhibit apoptosis by interacting with TRAF2. In addition, fenretinide inhibited the expression of the genes involved in the RAS/RAF/ERK-mediated survival pathway. In contrast, ATRA increased the expression of SOSC2, BRAF, MEK, and ERK genes. Most genes regulated by fenretinide and ATRA were bound by RXRα, suggesting a direct effect. This study revealed that, by regulating fewer genes, fenretinide acts more specifically and thus has fewer side effects than ATRA. The data also suggested that fenretinide induces apoptosis via death receptor effector and by inhibiting the RAS/RAF/ERK pathway. This study provides insight into how retinoid efficacy can be improved and how side effects in cancer therapy can be reduced.
retinoic acid receptor; retinoid x receptor; nuclear receptor; hepatocellular carcinoma; ChIP-Seq
Summary: Nuclear receptors (NRs) are a class of transcription factors playing important roles in various biological processes. An NR often impacts numerous genes, and different NRs share overlapping target networks. To fulfil the need for a database incorporating binding sites of different NRs under various conditions for easy comparison and visualization to improve our understanding of NR binding mechanisms, we have developed NURBS, a database for experimental and predicted nuclear receptor binding sites of mouse. NURBS currently contains binding sites across the whole mouse genome of 8 NRs identified in 40 chromatin immunoprecipitation with massively parallel DNA sequencing experiments. All datasets are processed using a widely used procedure and the same statistical criteria to ensure the binding sites derived from different datasets are comparable. NURBS also provides predicted binding sites using NR-HMM, a Hidden Markov Model (HMM).
Availability: The GBrowse-based user interface of NURBS is freely accessible at http://shark.abl.ku.edu/nurbs/. NR-HMM and all results can be downloaded for free at the website.
Protein aggregation is a significant problem in the biopharmaceutical industry (protein drug stability) and is associated medically with over 40 human diseases. Although a number of computational models have been developed for predicting aggregation propensity and identifying aggregation-prone regions in proteins, little systematic research has been done to determine physicochemical properties relevant to aggregation and their relative importance to this important process. Such studies may result in not only accurately predicting peptide aggregation propensities and identifying aggregation prone regions in proteins, but also aid in discovering additional underlying mechanisms governing this process.
We use two feature selection algorithms to identify 16 features, out of a total of 560 physicochemical properties, presumably important to protein aggregation. Two predictors (ProA-SVM and ProA-RF) using the selected features are built for predicting peptide aggregation propensity and identifying aggregation-prone regions in proteins. Both methods compare favourably with other state-of-the-art algorithms in cross-validation. The identified important properties are fairly consistent with previous studies and bring some new insights into protein and peptide aggregation. One interesting new finding is that aggregation-prone peptide sequences have similar properties to signal peptide and signal anchor sequences.
Both predictors are implemented in a freely available web application (http://www.abl.ku.edu/ProA/). We suggest that the quaternary structure of protein aggregates, especially soluble oligomers, may allow the formation of new molecular recognition signals that guide aggregate targeting to specific cellular sites.
Aggregation; Amyloid; Peptide; Prediction; Feature selection; Machine learning
The eyes and skin are obvious retinoid target organs. Vitamin A deficiency causes night blindness and retinoids are widely used to treat acne and psoriasis. However, more than 90% of total body retinol is stored in liver stellate cells. In addition, hepatocytes produce the largest amount of retinol binding protein and cellular retinoic acid binding protein to mobilize retinol from the hepatic storage pool and deliver retinol to its receptors, respectively. Furthermore, hepatocytes express the highest amount of retinoid x receptor alpha (RXRα) among all the cell types. Surprisingly, the function of endogenous retinoids in the liver has received very little attention.
Based on the data generated from chromatin immunoprecipitation followed by sequencing, the global DNA binding of transcription factors including retinoid x receptor α (RXRα) along with its partners, i.e., retinoic acid receptor α (RARα), pregnane x receptor (PXR), liver x receptor (LXR), farnesoid x receptor (FXR), and peroxisome proliferator-activated receptor α (PPARα), has been established. Based on the binding, functional annotation illustrated the role of those receptors in regulating hepatic lipid homeostasis. To correlate the DNA binding data with gene expression data, the expression patterns of 576 genes that regulate lipid homeostasis were studied in wild type and liver RXRα-null mice treated with or without RA. The data showed that RA treatment and RXRα-deficiency had opposite effects in regulating lipid homeostasis. A subset of genes (114), which could clearly differentiate the effect of ligand treatment and receptor deficiency, was selected for further functional analysis. The expression data suggested that RA treatment could produce unsaturated fatty acids and induce triglyceride breakdown, bile acid secretion, lipolysis, and retinoid elimination. In contrast, RXRα deficiency might induce the synthesis of saturated fatty acids, triglyceride, cholesterol, bile acids, and retinoids. In addition, DNA binding data indicated extensive cross-talk among RARα, PXR, LXR, FXR, and PPARα in regulating those RA/RXRα-dependent gene expression levels. Moreover, RA reduced serum cholesterol, triglyceride, and bile acid levels in mice.
We have characterized the role of hepatic RA for the first time. Hepatic RA mediated through RXRα and its partners regulates lipid homeostasis.
Nuclear receptor; Retinoid x receptor; Retinoic acid receptor; Farnesoid x receptor; Peroxisome proliferator-activated receptor α; Liver x receptor; Pregnane x receptor; Chromatin immunoprecipitation; Sequencing; Microarray
The current study tests the hypothesis that peroxisome proliferator-activated receptor β (PPARβ) has a role in liver regeneration due to its effect in regulating energy homeostasis and cell proliferation. The role of PPARβ in liver regeneration was studied using two-third partial hepatectomy (PH) in wild-type (WT) and PPARβ-null (KO) mice. In KO mice, liver regeneration was delayed and the number of Ki-67 positive cells reached its peak at 60 hr rather than at 36–48 hr after PH, as observed in WT mice. RNA-sequencing uncovered 1344 transcripts that were differentially expressed in regenerating WT and KO livers. About 70% of those differentially expressed genes involved in glycolysis and fatty acid synthesis pathways failed to be induced during liver regeneration due to PPARβ deficiency. The delayed liver regeneration in KO mice was accompanied by lack of activation of phosphoinositide-dependent kinase 1 (PDK1)/Akt. In addition, the cell proliferation-associated increase of genes encoding E2f transcription factor (E2f) 1–2 and E2f7–8, as well as their downstream target genes, was not noted in KO livers 36–48 hr after PH. E2fs have dual roles in regulating metabolism and proliferation. Moreover, transient steatosis was found only in WT, but not in KO mice 36 hr after PH. These data suggested that PPARβ-regulated PDK1/Akt and E2f signaling that controls metabolism and proliferation is involved in the normal progression of liver regeneration.
Complex diseases induce perturbations to interaction and regulation networks in living systems, resulting in dynamic equilibrium states that differ for different diseases and also normal states. Thus identifying gene expression patterns corresponding to different equilibrium states is of great benefit to the diagnosis and treatment of complex diseases. However, it remains a major challenge to deal with the high dimensionality and small size of available complex disease gene expression datasets currently used for discovering gene expression patterns.
Here we present a phase-only correlation (POC) based classification method for recognizing the type of a complex disease. First, a virtual sample template is constructed for each subclass by averaging all samples of that subclass in a training dataset. Then the label of a test sample is determined by measuring the similarity between the test sample and each template. This novel method can detect the similarity of overall patterns emerging from the differentially expressed genes or proteins while ignoring small mismatches.
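The two steps described above (averaging samples into per-class templates, then labelling a test sample by its similarity to each template) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: samples are treated as 1D feature vectors, the similarity is the peak of a 1D phase-only correlation computed with a naive DFT, and all function names and toy values are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short vectors."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def poc_peak(f, g, eps=1e-12):
    """Peak of the phase-only correlation between two equal-length vectors."""
    F, G = dft(f), dft(g)
    cross = [a * b.conjugate() for a, b in zip(F, G)]
    # Keep only the phase of the cross spectrum (unit magnitude per frequency).
    normalized = [c / abs(c) if abs(c) > eps else 0j for c in cross]
    return max(v.real for v in dft(normalized, inverse=True))


def build_templates(samples, labels):
    """Average all training samples of each class into one template."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def classify(sample, templates):
    """Assign the label whose template gives the highest POC peak."""
    return max(templates, key=lambda label: poc_peak(sample, templates[label]))


# Toy expression vectors, not data from the study.
train = [[5, 1, 0, 2], [3, 1, 0, 2], [0, 2, 1, 1], [0, 4, 1, 1]]
labels = ["A", "A", "B", "B"]
templates = build_templates(train, labels)
predicted = classify([8, 2, 0, 4], templates)  # a rescaled class-A pattern
print(predicted)
```

Because only the phase of the spectrum is used, multiplying a sample by a positive constant leaves the score unchanged. A 1D peak score of this kind is also invariant to circular shifts of the feature order, so feature arrangement matters; the abstract does not say how features are arranged.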
The experimental results obtained on seven publicly available complex disease datasets including microarray and protein array data demonstrate that the proposed POC-based disease classification method is effective and robust for diagnosing complex diseases with regard to the number of initially selected features, and its recognition accuracy is better than or comparable to other state-of-the-art machine learning methods. In addition, the proposed method does not require parameter tuning and data scaling, which can effectively reduce the occurrence of over-fitting and bias.
Retinoid x receptor α (RXRα) is abundantly expressed in the liver and is essential for the function of other nuclear receptors. Using chromatin immunoprecipitation sequencing and mRNA profiling data generated from wild type and RXRα-null mouse livers, the current study identifies the bona-fide hepatic RXRα targets and biological pathways. In addition, based on binding and motif analysis, the molecular mechanism by which RXRα regulates hepatic genes is elucidated in a high-throughput manner.
Close to 80% of hepatic expressed genes were bound by RXRα, while 16% were expressed in an RXRα-dependent manner. Motif analysis predicted direct repeat with a spacer of one nucleotide as the most prevalent RXRα binding site. Many of the 500 strongest binding motifs overlapped with the binding motif of specific protein 1. Biological functional analysis of RXRα-dependent genes revealed that hepatic RXRα deficiency mainly resulted in up-regulation of steroid and cholesterol biosynthesis-related genes and down-regulation of translation- as well as anti-apoptosis-related genes. Furthermore, RXRα bound to many genes that encode nuclear receptors and their cofactors suggesting the central role of RXRα in regulating nuclear receptor-mediated pathways.
This study establishes the relationship between RXRα DNA binding and hepatic gene expression. RXRα binds extensively to the mouse genome. However, DNA binding does not necessarily affect the basal mRNA level. In addition to metabolism, RXRα dictates the expression of genes that regulate RNA processing, translation, and protein folding illustrating the novel roles of hepatic RXRα in post-transcriptional regulation.
Protein acidostability is a common problem in biopharmaceutical and other industries. However, it remains a great challenge to engineer proteins for enhanced acidostability because our knowledge of protein acidostabilization is still very limited. In this paper, we present a comparative study of proteins from bacteria with acidic (AP) and neutral cytoplasms (NP) using an integrated statistical and machine learning approach. We construct a set of 393 non-redundant AP-NP ortholog pairs and calculate a total of 889 sequence-based features for these proteins. The pairwise alignments of these ortholog pairs are used to build a residue substitution propensity matrix between APs and NPs. We use the Gini importance provided by the Random Forest algorithm to rank the relative importance of these features. A scoring function using the 10 most significant features is developed and optimized using a hill climbing algorithm. The accuracy of the scoring function is 86.01% in predicting AP-NP ortholog pairs and 76.65% in predicting non-ortholog AP-NP pairs, suggesting that there are significant differences between APs and NPs which can be used to predict the relative acidostability of proteins. The overall trends uncovered in the study can be used as general guidelines for designing acidostable proteins. To the best of our knowledge, this work represents the first systematic comparative study of acidostable proteins and their non-acidostable orthologs.
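To illustrate how a scoring function can be tuned by hill climbing, the sketch below optimizes the weights of a linear score over feature differences so that the acidic member of each pair scores higher. The linear form, the accuracy objective, the step size, and the toy feature values are all assumptions made for illustration; the abstract does not specify the form of the scoring function or the search details.

```python
def accuracy(weights, pairs):
    """Fraction of (acidic, neutral) feature pairs where the acidic protein scores higher."""
    correct = 0
    for acidic, neutral in pairs:
        score = sum(w * (a - n) for w, a, n in zip(weights, acidic, neutral))
        correct += score > 0
    return correct / len(pairs)


def hill_climb(pairs, n_features, step=0.5, max_rounds=50):
    """Greedy coordinate-wise hill climbing: keep a weight change only if accuracy improves."""
    weights = [0.0] * n_features
    best = accuracy(weights, pairs)
    for _ in range(max_rounds):
        improved = False
        for i in range(n_features):
            for delta in (step, -step):
                candidate = list(weights)
                candidate[i] += delta
                acc = accuracy(candidate, pairs)
                if acc > best:
                    weights, best, improved = candidate, acc, True
        if not improved:
            break  # local optimum: no single-step change helps
    return weights, best


# Hypothetical two-feature pairs: (acidic protein features, neutral ortholog features).
pairs = [((3, 1), (1, 2)), ((2, 0), (1, 1)), ((4, 2), (2, 3)), ((1, 1), (2, 0))]
weights, best = hill_climb(pairs, n_features=2)
print(weights, best)
```

Hill climbing stops at a local optimum, so in practice the search is often restarted from several initial weight vectors.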
Identification of phosphorylation sites by computational methods is becoming increasingly important because it reduces labor-intensive and costly experiments and can improve our understanding of the common properties and underlying mechanisms of protein phosphorylation.
A multitask learning framework for learning four kinase families simultaneously, instead of studying each kinase family of phosphorylation sites separately, is presented in the study. The framework includes two multitask classification methods: the Multi-Task Least Squares Support Vector Machines (MTLS-SVMs) and the Multi-Task Feature Selection (MT-Feat3).
Using the multitask learning framework, we successfully identify 18 common features shared by four kinase families of phosphorylation sites. The reliability of selected features is demonstrated by the consistent performance in two multi-task learning methods.
The selected features can be used to build efficient multitask classifiers with good performance, suggesting they are important to protein phosphorylation across 4 kinase families.
Allergic disease is on the rise worldwide. Effective prevention of allergic disease requires comprehensive understanding of the factors that contribute to its intermediate phenotypes, such as sensitization to common allergens.
To estimate the degree of genetic and environmental contributions to sensitization to food or aeroallergens.
Sensitization was defined as a positive skin prick test to an allergen. We calculated the zygosity-specific concordance rates and odds ratios (ORs) for sensitization to food and aeroallergens in 826 Chinese twin pairs (472 MZ and 354 DZ) aged 12 to 28 years. We also applied structural equation modeling procedures to estimate genetic and environmental influences on sensitization.
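Zygosity-specific concordance can be computed directly from paired sensitization status. The sketch below shows two common definitions, pairwise and probandwise concordance. The abstract does not state which definition was used, and the function name and toy twin pairs are hypothetical.

```python
def concordance(pairs):
    """Concordance rates from (twin1_sensitized, twin2_sensitized) boolean pairs.

    pairwise    = C / (C + D)
    probandwise = 2C / (2C + D)
    where C = pairs with both twins sensitized, D = pairs with exactly one.
    """
    both = sum(1 for a, b in pairs if a and b)
    one = sum(1 for a, b in pairs if a != b)
    if both + one == 0:
        return {"pairwise": float("nan"), "probandwise": float("nan")}
    return {
        "pairwise": both / (both + one),
        "probandwise": 2 * both / (2 * both + one),
    }


# Toy data, not from the study.
mz = [(True, True)] * 6 + [(True, False)] * 4 + [(False, False)] * 10
dz = [(True, True)] * 3 + [(False, True)] * 9 + [(False, False)] * 8
mz_rates = concordance(mz)
dz_rates = concordance(dz)
print(mz_rates, dz_rates)
```

Higher concordance in MZ than in DZ pairs points to a genetic contribution, while MZ concordance below 1 indicates that non-shared environmental factors also matter.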
The concordance rates and risk of sensitization in one twin given the presence vs. the absence of sensitization in the other twin were higher in MZ twins than those in DZ twins. However, a large number of MZ twins were discordant in sensitization to common allergens. These observations suggest both genetic and environmental factors influence sensitization. Consistently, the estimated heritability and individual environmental components of the liability to sensitization ranged from 0.51 to 0.68 and 0.32 to 0.49, respectively, based on the best-fitted structural equation model. We also observed high phenotypic correlations between sensitization to two aeroallergens (cockroach and dust mite: 0.83) and two food allergens (peanut and shellfish: 0.58), but only moderate correlations for the pairs between sensitization to a food and an aeroallergen (0.31-0.46). The shared genetic and environmental factors between paired sensitizations contribute to the observed correlations.
We demonstrated that sensitization to common food and aeroallergens was influenced by both genetic and environmental factors. Moreover, we found that paired allergen sensitizations might share some common sets of genes and environmental factors. This study underscores the need to further delineate unique and/or pleiotropic genetic and environmental factors for allergen sensitization.
Twin; sensitization; positive SPT; structural equation modeling; heritability; environmental factors
Preterm delivery (PTD, <37 weeks of gestation) is a significant clinical and public health problem. Previously, we reported that maternal smoking and metabolic gene polymorphisms of CYP1A1 MspI and GSTT1 synergistically increase the risk of low birth weight. This study investigates the relationship between maternal smoking and metabolic gene polymorphisms of CYP1A1 MspI and GSTT1 with PTD as a whole and in preterm subgroups. This case–control study included 1,749 multi-ethnic mothers (571 with PTD and 1,178 controls) enrolled at Boston Medical Center. After adjusting for covariates, regression analyses were performed to identify individual and joint associations of maternal smoking and two functional variants of CYP1A1 and GSTT1 with PTD. We observed a moderate effect of maternal smoking on PTD (OR = 1.6; 95% CI: 1.1–2.2). We found that compared to non-smoking mothers with low-risk genotypes, there was a significant joint association of maternal smoking, CYP1A1 (Aa/aa) and GSTT1 (absent) genotypes with gestational age (β = −3.37; SE = 0.86; P = 9 × 10⁻⁵) and with PTD (OR = 5.8; 95% CI: 2.0–21.1), respectively. Such joint association was particularly strong in certain preterm subgroups, including spontaneous PTD (OR = 8.3; 95% CI: 2.7–30.6), PTD < 32 weeks (OR = 11.1; 95% CI: 2.9–47.7), and PTD accompanied by histologic chorioamnionitis (OR = 15.6; 95% CI: 4.1–76.7). Similar patterns were observed across ethnic groups. Taken together, maternal smoking significantly increased the risk of PTD among women with high-risk CYP1A1 and GSTT1 genotypes. Such joint associations were strongest among PTD accompanied by histologic chorioamnionitis.
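For readers less familiar with how an odds ratio and its confidence interval arise, the sketch below computes an unadjusted OR from a 2×2 table with a Wald (Woolf) 95% interval. This is a simplification: the study reports covariate-adjusted estimates from regression models, which this calculation does not reproduce. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale."""
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts (Woolf's method).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(30, 20, 70, 80)
print(f"OR = {or_value:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```

Small cell counts widen the interval considerably, which is consistent with the wide intervals typically seen for small subgroups.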
Obesity and allergic diseases have increased dramatically in recent decades. While adiposity has been associated with asthma, associations with allergic sensitization have been inconsistent.
To examine the association of adiposity and lipid profiles with allergic sensitization.
This study included 1,187 rural Chinese twins (653 men) aged 18-39 years, with skin prick tests (SPT), anthropometric and DEXA-assessed adiposity measures, and lipid assessments. Allergic sensitization was defined as positive SPT to ≥1 allergen (9 foods and 5 aeroallergens tested). We applied gender-stratified generalized estimating equations to assess the association of adiposity and serum lipids with allergic sensitization, and structural equation models to estimate the genetic/environmental influences on any observed associations.
Males had lower percent body fat (%BF) (13.9% vs. 28.8%) but higher rates of allergic sensitization (56.2% vs. 36.7%) than females. Males in the highest %BF quartile were 2.1 times more likely to be sensitized than those in the lowest quartile (95%CI 1.3-3.5, P-trend=0.003). In males, the risk of allergic sensitization increased with HDL<40 mg/dl (OR=4.0, 95%CI 1.8-9.2) and higher LDL quartiles (P-trend=0.007). This appeared to be partially explained by shared genetic factors between serum lipid levels and allergic sensitization. In females, lower HDL was associated with increased risk of allergic sensitization.
In this relatively lean Chinese population, higher %BF, lower HDL and higher LDL were associated with greater risk of allergic sensitization, most notable in males. The observed associations between adiposity, serum lipids and allergic sensitization in males appear to be partially explained by common genetic influences on these traits.
DEXA; Body mass index; Adiposity; Serum lipids; Sensitization
The increasing prevalence of food allergy (FA) is a growing clinical and public health problem. The contribution of genetic factors to FA remains largely unknown.
This study examined the pattern of familial aggregation and the degree to which genetic factors contribute to FA and sensitization to food allergens.
This study included 581 nuclear families (2,004 subjects) as part of an ongoing FA study in Chicago, IL, USA. FA was defined by a set of criteria including timing, clinical symptoms obtained via standardized questionnaire interview, and corroborative specific IgE cutoffs for >=95% positive predictive value (PPV) for food allergens measured by Phadia ImmunoCAP. Familial aggregation of FA as well as sensitization to food allergens were examined using generalized estimating equation (GEE) models, with adjustment for important covariates including age, gender, ethnicity and birth order. Heritability was estimated for food-specific IgE measurements.
FA in the index child was a significant and independent predictor of FA in other siblings (OR=2.6, 95%CI:1.2–5.6, p=0.01). There were significant and positive associations among family members (father-offspring, mother-offspring, index-other siblings) for total IgE and specific IgE to all the 9 major food allergens tested in this sample (sesame, peanut, wheat, milk, egg white, soy, walnut, shrimp and cod fish). The estimated heritability of food-specific IgE ranged from 0.15 to 0.35 and was statistically significant for all the 9 tested food allergens.
This family-based study demonstrates strong familial aggregation of food allergy and sensitization to food allergens, especially among siblings. The heritability estimates indicate that food-specific IgE is likely influenced by both genetic and environmental factors. Together, this study provides strong evidence that both host genetic susceptibility and environmental factors determine the complex trait of IgE-mediated food allergy.
familial aggregation; heritability; food allergy; sensitization to food allergens; IgE-mediated
The prevalence of allergic diseases is increasing worldwide, but the reasons are not well understood. Previous studies suggest that this trend may be associated with lifestyle and urbanization.
To describe patterns of sensitization and allergic disease in an unselected agricultural Chinese population.
The data was derived from a community-based twin study in Anqing, China. Skin prick testing was performed to foods and aeroallergens. Atopy was defined as sensitization to ≥1 allergen. Allergic disease was ascertained by self-report. The analysis was stratified by sex and age (children [11-17 years] and adults [≥18 years]) and included 1059 same-sex twin pairs.
Of 2118 subjects, 57.6% were male (n=1220). Ages ranged from 11-71 years; 43.3% were children (n=918). Atopy was observed in 47.2% (n=999) of participants. The most common sensitizing foods were shellfish (16.7%) and peanut (12.3%). The most common sensitizing aeroallergens were dust mite (30.6%) and cockroach (25.2%). Birth order and zygosity had no effect on sensitization rates. Multivariate logistic regression models revealed that risk factors for sensitization include age for foods and sex for aeroallergens. The rates of food allergy and asthma were estimated to be <1%.
Atopic sensitization was common in this rural farming Chinese population, particularly to shellfish, peanut, dust mite, and cockroach. The prevalence of allergic disease, in contrast, was quite low.
Allergen sensitization was far more common than the rate of self-reported allergic disease in this community. Evidence of sensitization is an inadequate marker of allergic disease and better correlates with clinical disease are needed.
Among this large unselected Chinese rural farming community, atopy was observed in nearly half of the study subjects, but the rate of allergic disease was comparatively very low.
aeroallergens; rural; farming community; Chinese; food allergens; prevalence; sensitization; skin prick tests