The identification of novel genes relevant to plant cell wall (PCW) biosynthesis in Populus is an important and challenging problem. We surveyed candidate Populus cell wall genes using a non-targeted approach. First, a genome-wide Populus gene co-expression network (PGCN) was constructed using microarray data available in the public domain. Module detection was then performed, followed by gene ontology (GO) enrichment analysis, to assign functional categories to the modules. Based on GO annotation, the modules involved in PCW biosynthesis were selected and analyzed in detail, including gene annotation and tissue-specific gene expression, to annotate the candidate PCW genes they contain. We examined the overrepresented cis-regulatory elements (CREs) in the gene promoters to understand the possible transcriptionally co-regulated relationships among the genes within the functional modules of cell wall biosynthesis. PGCN contains 6,854 nodes (genes) with 324,238 edges. The topological properties of the network indicate scale-free and modular behavior. A total of 435 modules were identified, of which 67 were characterized by overrepresented GO terms. Six modules involved in cell wall biosynthesis were identified. Module 9 was mainly involved in cellular polysaccharide metabolic process in the primary cell wall, whereas Module 4 comprises genes involved in secondary cell wall biogenesis. In addition, we predicted and analyzed 10 putative CREs in the promoters of the genes in Module 4 and Module 9. The non-targeted approach of gene network analysis and the data presented here can help further identify and characterize cell wall related genes in Populus.
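The network-construction and module-detection steps described above can be illustrated with a minimal sketch. The abstract does not state which correlation measure, cutoff, or module-detection algorithm was used, so the Pearson correlation, the 0.9 threshold, the use of connected components as "modules", and all gene names and expression values below are assumptions made purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_network(profiles, threshold=0.9):
    """Link two genes when |r| between their profiles reaches the (assumed) threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def modules(genes, edges):
    """Connected components, used here as a crude stand-in for module detection."""
    neighbours = {g: set() for g in genes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, found = set(), []
    for g in sorted(genes):
        if g in seen:
            continue
        stack, component = [g], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        found.append(component)
    return found


# Hypothetical expression values across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [5.0, 1.0, 4.0, 2.0],
    "geneD": [2.5, 0.5, 2.1, 1.0],
}
edges = build_network(profiles)
mods = modules(profiles, edges)
```

In this toy data set, geneA pairs with geneB and geneC pairs with geneD, giving two modules. A real analysis would replace the connected-component step with a dedicated clustering method and follow it with GO enrichment.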
The conjugative tetracycline resistance plasmid pCW3 is the paradigm conjugative plasmid in the anaerobic gram-positive pathogen Clostridium perfringens. Two closely related FtsK/SpoIIIE homologs, TcpA and TcpB, are encoded on pCW3, which is significant since FtsK domains are found in coupling proteins of gram-negative conjugation systems. To develop an understanding of the mechanism of conjugative transfer in C. perfringens, we determined the role of these proteins in the conjugation process. Mutation and complementation analysis was used to show that the tcpA gene was essential for the conjugative transfer of pCW3 and that the tcpB gene was not required for transfer. Furthermore, complementation of a pCW3ΔtcpA mutant with divergent tcpA homologs provided experimental evidence that all of the known conjugative plasmids from C. perfringens use a similar transfer mechanism. Functional genetic analysis of the TcpA protein established the essential role in conjugative transfer of its Walker A and Walker B ATP-binding motifs and its FtsK-like RAAG motif. It is postulated that TcpA is the essential DNA translocase or coupling protein encoded by pCW3 and as such represents a key component of the unique conjugation process in C. perfringens.
Cotton fiber is a single-celled seed trichome of major biological and economic importance. In recent years, genomic approaches such as microarray-based expression profiling were used to study fiber growth and development to understand the developmental mechanisms of fiber at the molecular level. The vast volume of microarray expression data generated requires a sophisticated means of data mining in order to extract novel information that addresses fundamental questions of biological interest. One of the ways to approach microarray data mining is to increase the number of dimensions/levels to the analysis, such as comparing independent studies from different genotypes. However, adding dimensions also creates a challenge in finding novel ways for analyzing multi-dimensional microarray data.
Mining of independent microarray studies from Pima and Upland (TM1) cotton using double feature selection and cluster analyses identified species-specific and stage-specific gene transcripts that argue in favor of discrete genetic mechanisms that govern developmental programming of cotton fiber morphogenesis in these two cultivated species. Double feature selection analysis identified the highest number of differentially expressed genes that distinguish the fiber transcriptomes of developing Pima and TM1 fibers. These results were based on the finding that differences in fibers harvested between 17 and 24 days post-anthesis (dpa) represent the greatest expressional distance between the two species. This powerful selection method identified a subset of genes expressed during primary (PCW) and secondary (SCW) cell wall biogenesis in Pima fibers that exhibits an expression pattern that is generally reversed in TM1 at the same developmental stage. Cluster and functional analyses revealed that this subset of genes is primarily regulated during the transition stage that overlaps the termination of PCW and onset of SCW biogenesis, suggesting that these particular genes play a major role in the genetic mechanism that underlies the phenotypic differences in fiber traits between Pima and TM1.
The novel application of double feature selection analysis led to the discovery of species- and stage-specific genetic expression patterns, which are biologically relevant to the genetic programs that underlie the differences in the fiber phenotypes in Pima and TM1. These results promise to have profound impacts on the ongoing efforts to improve cotton fiber traits.
Developing neocortical progenitors express transcription factors in gradients that induce programs of region-specific gene expression. Our previous work identified anteriorly upregulated expression gradients of a number of corticofugal neuron-associated gene probe sets along the anterior–posterior axis of the human neocortex (8-12 postconceptional weeks [PCW]). Here, we demonstrate by real-time polymerase chain reaction, in situ hybridization and immunohistochemistry that 3 such genes, ROBO1, SRGAP1, and CTIP2, are highly expressed anteriorly between 8 and 12 PCW, in comparison with other genes (FEZF2, SOX5) expressed by Layer V, VI, and subplate neurons. All 3 were prominently expressed by early postmitotic neurons in the subventricular zone, intermediate zone, and cortical plate (CP) from 8 to 10 PCW. Between 12 and 15 PCW, expression patterns for ER81 and SATB2 (Layer V), TBR1 (Layer V/VI) and NURR1 (Layer VI) revealed Layer V forming. By 15 PCW, ROBO1 and SRGAP1 expression was confined to Layer V, whereas CTIP2 was expressed throughout the CP anteriorly. We observed ROBO1 and SRGAP1 immunoreactivity in medullary corticospinal axons from 11 PCW onward. Thus, we propose that the coexpression of these 3 markers in the anterior neocortex may mark the early location of the human motor cortex, including its corticospinal projection neurons, allowing further study of their early differentiation.
cerebral cortex; corticospinal tract; regionalization
The transcription factors Emx2 and Pax6 are expressed in the proliferating zones of the developing rodent neocortex, and gradients of expression interact in specifying caudal and rostral identities. Pax6 is also involved in corticoneurogenesis, being expressed by radial glial progenitors that give rise to cells that also sequentially express Tbr2, NeuroD and Tbr1, genes temporally downstream of Pax6. In this study, using in situ hybridization, we analysed the expression of EMX2, PAX6, TBR2, NEUROD and TBR1 mRNA in the developing human cortex between 8 and 12 postconceptional weeks (PCW). EMX2 mRNA was expressed in the ventricular (VZ) and subventricular zones (SVZ), but also in the cortical plate, unlike in the rodent. However, gradients of expression were similar to those of the rodent at all ages studied. PAX6 mRNA expression was limited to the VZ and SVZ. At 8 PCW, PAX6 was highly expressed rostrally but less so caudally, as has been seen in the rodent; however, this gradient disappeared early in corticogenesis, by 9 PCW. There was less restricted compartment-specific expression of TBR2, NEUROD and TBR1 mRNA than in the rodent, where the gradients of expression were similar to those of PAX6 prior to 9 PCW. The gradient disappeared for TBR2 by 10 PCW, and for NEUROD and TBR1 by 12 PCW. These data support recent reports that EMX2 but not PAX6 is more directly involved in arealization, highlighting that analysis of human development allows better spatio-temporal resolution than studies in rodents.
arealization; development; neurogenesis; subventricular zone
We have employed immunohistochemistry for multiple markers to investigate the structure and possible function of the different compartments of human cerebral wall from the formation of cortical plate at 8 postconceptional weeks (PCW) to the arrival of thalamocortical afferents at 17 PCW. New observations include the subplate emerging as a discrete differentiated layer by 10 PCW, characterized by synaptophysin and vesicular gamma-aminobutyric acid transporter expression also seen in the marginal zone, suggesting that these compartments may maintain a spontaneously active synaptic network even before the arrival of thalamocortical afferents. The subplate expanded from 13 to 17 PCW, becoming the largest compartment and differentiated further, with NPY neurons located in the outer subplate and KCC2 neurons in the inner subplate. Glutamate decarboxylase and calretinin-positive inhibitory neurons migrated tangentially and radially from 11.5 PCW, appearing in larger numbers toward the rostral pole. The proliferative zones, marked by Ki67 expression, developed a complicated structure by 12.5 PCW reflected in transcription factor expression patterns, including TBR2 confined to the inner subventricular and outer ventricular zones and TBR1 weakly expressed in the subventricular zone (SVZ). PAX6 was extensively expressed in the proliferative zones such that the human outer SVZ contained a large reservoir of PAX6-positive potential progenitor cells.
cell migration; cortical development; immunohistochemistry; synaptogenesis
The index of microcirculatory resistance (IMR) is useful for assessing microvascular dysfunction after primary percutaneous coronary intervention (PCI). It remains unknown whether the IMR reflects the functional outcome in patients with anterior myocardial infarction (AMI) with or without microvascular obstruction (MO). This study was performed to evaluate the clinical value of the IMR for assessing myocardial injury and predicting microvascular functional recovery in patients with AMI undergoing primary PCI. We enrolled 34 patients with first anterior AMI. After successful primary PCI, the mean distal coronary artery pressure (Pd), coronary wedge pressure (Pcw), mean aortic pressure (Pa), mean transit time (Tmn), and IMR (Pd × hyperemic Tmn) were measured. The presence and extent of MO were measured using cardiac magnetic resonance imaging (MRI). All patients underwent follow-up echocardiography after 6 months. We divided the patients into two groups according to the presence of MO on MRI (present, n = 16; absent, n = 18). The extent of MO correlated with IMR (r = 0.754; P < 0.001), Pcw (r = 0.404; P = 0.031), and Pcw/Pd of infarct-related arteries (r = 0.502; P = 0.016). The IMR was significantly correlated with the Δregional wall motion score index (r = -0.61, P < 0.01) and Δleft ventricular ejection fraction (r = -0.52, P < 0.01), implying that a higher IMR is associated with worse functional improvement. Therefore, intracoronary wedge pressure and IMR, as parameters for specific and quantitative assessment of coronary microvascular dysfunction, are reliable on-site predictors of short-term myocardial viability and left ventricular functional recovery in patients undergoing primary PCI for AMI.
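As a small worked illustration of the two derived indices named above, the sketch below computes the IMR as Pd multiplied by the hyperemic Tmn (the definition given in the abstract) and the Pcw/Pd ratio. The function names and the input values are hypothetical; pressures are assumed to be in mmHg and transit time in seconds.

```python
def imr(pd_mmhg, hyperemic_tmn_s):
    """Index of microcirculatory resistance: Pd x hyperemic Tmn (as defined in the text)."""
    return pd_mmhg * hyperemic_tmn_s


def wedge_ratio(pcw_mmhg, pd_mmhg):
    """Ratio of coronary wedge pressure to distal coronary pressure (Pcw/Pd)."""
    return pcw_mmhg / pd_mmhg


# Hypothetical measurements, not taken from the study.
example_imr = imr(pd_mmhg=60.0, hyperemic_tmn_s=0.5)
example_ratio = wedge_ratio(pcw_mmhg=24.0, pd_mmhg=60.0)
```

With these illustrative inputs the IMR is 30.0 and Pcw/Pd is 0.4.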
Acute Anterior Wall Myocardial Infarction; Coronary Occlusion; Capillary Resistance; Magnetic Resonance Imaging
We present a web-based network-construction system, CINPER (CSBL INteractive Pathway BuildER), to assist a user in building a user-specified gene network for a prokaryotic organism in an intuitive manner. CINPER builds a network model based on different types of information provided by the user and stored in the system. CINPER’s prediction process has four steps: (i) collection of template networks based on (partially) known pathways of related organism(s) from the SEED or BioCyc database and the published literature; (ii) construction of an initial network model based on the template networks using the P-Map program; (iii) expansion of the initial model, based on the association information derived from operons, protein-protein interactions, co-expression modules and phylogenetic profiles; and (iv) computational validation of the predicted models based on gene expression data. To facilitate easy application, CINPER provides an interactive visualization environment for a user to enter, search and edit relevant data and for the system to display (partial) results and prompt for additional data. Evaluation of CINPER on 17 well-studied pathways in the MetaCyc database shows that the program achieves an average recall rate of 76% and an average precision rate of 90% on the initial models, and a higher average recall rate of 87% and an average precision rate of 28% on the final models. The reduced precision rate in the final models versus the initial models reflects the reality that the final models contain large numbers of novel genes that have no experimental evidence and hence are not yet collected in the MetaCyc database. To demonstrate the usefulness of this server, we have predicted an iron homeostasis gene network of Synechocystis sp. PCC6803 using the server. The predicted models along with the server can be accessed at http://csbl.bmb.uga.edu/cinper/.
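The recall and precision rates quoted above can be made concrete with a short sketch. It scores a predicted gene set against a reference pathway and shows, on made-up data, why expanding a model with novel genes tends to raise recall while lowering precision. The gene identifiers are hypothetical and the code is not part of CINPER.

```python
def recall_precision(predicted, reference):
    """Return (recall, precision) of a predicted gene set against a reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    recall = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical reference pathway and two model stages.
reference = {"g1", "g2", "g3", "g4"}
initial_model = {"g1", "g2", "g3"}
final_model = {"g1", "g2", "g3", "g4", "n1", "n2", "n3", "n4"}

initial_scores = recall_precision(initial_model, reference)
final_scores = recall_precision(final_model, reference)
```

Here the initial model scores (0.75, 1.0) and the expanded model scores (1.0, 0.5): every reference gene is now recovered, but the four novel genes count against precision because the reference set does not contain them.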
Four chloramphenicol resistance (Cm) and four tetracycline resistance (Tc) plasmids from Staphylococcus aureus were characterized by restriction endonuclease mapping. All four Tc plasmids had molecular masses of 2.9 megadaltons (Mdaltons) and indistinguishable responses to seven different restriction endonucleases. The four Cm plasmids (pCW6, pCW7, pCW8, and pC221) had molecular masses of 2.6, 2.8, 1.9, and 2.9 Mdaltons, respectively. The four Cm plasmids also differed both in the level of resistance to Cm and in susceptibility to restriction endonucleases. Single restriction endonuclease sites contained within each plasmid included the following: in pCW6 for HindIII, XbaI, HpaII, and BstEII; in pCW7 for HindIII, BstEII, BglII, HaeIII, and HpaII; in pCW8 for HindIII, HaeIII, and HpaII; in pC221 for HindIII, BstEII, and EcoRI. The molecular cloning capabilities of pCW8 and pC221 were determined. Cm and erythromycin resistance (Em) recombinant plasmids pCW12, pCW13, and pCW14 were constructed and used to transform S. aureus 8325-4. A 2.8-Mdalton HindIII fragment from plasmid pI258 was found to encode Em resistance and contain single sites for the restriction endonucleases BglII, PstI, HaeIII, and HpaII. The largest EcoRI fragment (8 Mdaltons) from pI258 contained the HindIII fragment encoding Em resistance intact. Cloning of DNA into the BglII site of pCW14 did not alter Em resistance. Cloning of DNA into the HindIII site of pCW8 and the HindIII and EcoRI sites of pC221 did not disrupt either plasmid replication or Cm resistance.
Understanding of gene regulatory networks requires discovery of expression modules within gene co-expression networks and identification of promoter motifs and corresponding transcription factors that regulate their expression. A commonly used method for this purpose is a top-down approach based on clustering the network into a range of densely connected segments, treating these segments as expression modules, and extracting promoter motifs from these modules. Here, we describe a novel bottom-up approach to identify gene expression modules driven by known cis-regulatory motifs in the gene promoters. For a specific motif, genes in the co-expression network are ranked according to their probability of belonging to an expression module regulated by that motif. The ranking is conducted via motif enrichment or motif position bias analysis. Our results indicate that motif position bias analysis is an effective tool for genome-wide motif analysis. Sub-networks containing the top ranked genes are extracted and analyzed for inherent gene expression modules. This approach identified novel expression modules for the G-box, W-box, site II, and MYB motifs from an Arabidopsis thaliana gene co-expression network based on the graphical Gaussian model. The novel expression modules include those involved in house-keeping functions, primary and secondary metabolism, and abiotic and biotic stress responses. In addition to confirmation of previously described modules, we identified modules that include new signaling pathways. To associate transcription factors that regulate genes in these co-expression modules, we developed a novel reporter system. Using this approach, we evaluated MYB transcription factor-promoter interactions within MYB motif modules.
Gene co-expression networks unite genes with similar expression patterns. From these networks, gene co-expression modules can be identified. A specific family of transcription factor(s) may regulate the genes within a co-expression module. Thus, module identification is important to decipher the gene regulatory network. Previously, module identification relied on clustering the gene network into gene clusters that were then treated as modules. This represents a top-down approach. Here, we introduce a reverse approach aimed at identifying gene co-expression modules regulated by known promoter motifs. For a given promoter motif, we calculated the probability that each gene within the network belongs to a module regulated by that motif via motif enrichment analysis or motif position bias analysis. A sub-network containing the genes with a high probability of belonging to a motif-driven module was then extracted from the gene co-expression network. From this sub-network, the modular structure can be identified via visual inspection. Our bottom-up approach recovered many known and novel modules for the G-box, MYB, W-box and site II element motifs, whose expression may be regulated by the transcription factors that bind to these motifs. Additionally, we developed a rapid transcription factor-promoter interaction screening system to validate predicted interactions.
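One way to picture the motif enrichment ranking is sketched below. The abstract does not specify the statistical test, so the one-sided hypergeometric test, the choice to score each gene by the motif content of its network neighbours, and all gene names are assumptions made for illustration rather than the authors' procedure.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which carry the motif."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)


def rank_genes(neighbours, motif_genes, all_genes):
    """Rank genes by how strongly their neighbours are enriched for the motif."""
    N, K = len(all_genes), len(motif_genes)
    scored = []
    for gene, linked in neighbours.items():
        k = len(linked & motif_genes)  # neighbours whose promoter carries the motif
        scored.append((enrichment_p(k, len(linked), K, N), gene))
    return sorted(scored)  # smallest p-value (strongest enrichment) first


# Hypothetical 10-gene network; three promoters contain the motif.
all_genes = {f"g{i}" for i in range(10)}
motif_genes = {"g0", "g1", "g2"}
neighbours = {
    "g0": {"g1", "g2", "g3"},
    "g5": {"g6", "g7", "g8"},
}
ranking = rank_genes(neighbours, motif_genes, all_genes)
```

In this example g0 ranks first because two of its three neighbours carry the motif, whereas none of the neighbours of g5 do. The top-ranked genes would then be used to extract the sub-network for inspection.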
Motivation: The computational identification of non-coding RNA (ncRNA) genes represents one of the most important and challenging problems in computational biology. Existing methods for ncRNA gene prediction rely mostly on homology information, thus limiting their applications to ncRNA genes with known homologues.
Results: We present a novel de novo prediction algorithm for ncRNA genes using features derived from the sequences and structures of known ncRNA genes in comparison to decoys. Using these features, we have trained a neural network-based classifier and have applied it to Escherichia coli and Sulfolobus solfataricus for genome-wide prediction of ncRNAs. Our method has an average prediction sensitivity and specificity of 68% and 70%, respectively, for identifying windows with potential for ncRNA genes in E.coli. By combining windows of different sizes and using positional filtering strategies, we predicted 601 candidate ncRNAs and recovered 41% of known ncRNAs in E.coli. We experimentally investigated six novel candidates using Northern blot analysis and found expression of three candidates: one represents a potential new ncRNA, one is associated with stable mRNA decay intermediates and one is a case of either a potential riboswitch or transcription attenuator involved in the regulation of cell division. In general, our approach enables the identification of both cis- and trans-acting ncRNAs in partially or completely sequenced microbial genomes without requiring homology or structural conservation.
Availability: The source code and results are available at http://csbl.bmb.uga.edu/publications/materials/tran/.
Supplementary information: Supplementary data are available at Bioinformatics online.
The pathogenesis of avian necrotic enteritis involves NetB, a pore-forming toxin produced by virulent avian isolates of Clostridium perfringens type A. To determine the location and mobility of the netB structural gene, we examined a derivative of the tetracycline-resistant necrotic enteritis strain EHE-NE18, in which netB was insertionally inactivated by the chloramphenicol and thiamphenicol resistance gene catP. Both tetracycline and thiamphenicol resistance could be transferred either together or separately to a recipient strain in plate matings. The separate transconjugants could act as donors in subsequent matings, which demonstrated that the tetracycline resistance determinant and the netB gene were present on different conjugative elements. Large plasmids were isolated from the transconjugants and analyzed by high-throughput sequencing. Analysis of the resultant data indicated that there were actually three large conjugative plasmids present in the original strain, each with its own toxin or antibiotic resistance locus. Each plasmid contained a highly conserved 40-kb region that included plasmid replication and transfer regions that were closely related to the 47-kb conjugative tetracycline resistance plasmid pCW3 from C. perfringens. The plasmids were as follows: (i) a conjugative 49-kb tetracycline resistance plasmid that was very similar to pCW3, (ii) a conjugative 82-kb plasmid that contained the netB gene and other potential virulence genes, and (iii) a 70-kb plasmid that carried the cpb2 gene, which encodes a different pore-forming toxin, beta2 toxin.
The anaerobic bacterium Clostridium perfringens can cause an avian gastrointestinal disease known as necrotic enteritis. Disease pathogenesis is not well understood, although the plasmid-encoded pore-forming toxin NetB is an important virulence factor. In this work, we have shown that the plasmid that carries the netB gene is conjugative and has a 40-kb region that is very similar to replication and transfer regions found within each of the sequenced conjugative plasmids from C. perfringens. We also showed that this strain contained two additional large plasmids that were also conjugative and carried a similar 40-kb region. One of these plasmids encoded beta2 toxin, and the other encoded tetracycline resistance. To our knowledge, this is the first report of a bacterial strain that carries three closely related but different independently conjugative plasmids. These results have significant implications for our understanding of the transmission of virulence and antibiotic resistance genes in pathogenic bacteria.
In a previous study, Staphylococcus aureus purified cell walls (PCW), consisting of peptidoglycan (PG) plus covalently linked teichoic acid (TA), were found to be more active in complement consumption than isolated PG. Isolated TA has now been shown to be capable of activating complement. Mild sonication markedly increased the ability of PG to activate complement but had essentially no effect on the activities of PCW and TA. Optimal sonication of PG did not yield activities equal to those of PCW in dose-response and kinetic studies, which may imply that TA plays some role in complement consumption. Sonication did not lead to solubilization of PCW or PG but may have enhanced the activity of PG in complement consumption by better dispersing PG particles, thereby exposing more surface area. Lysostaphin solubilization of PCW and PG markedly decreased their activities in complement consumption. The PCW of an S. aureus TA-deficient mutant, which were mostly PG, caused similar amounts of complement consumption as the parent strain PCW. Of the treatments of PCW commonly used to isolate PG, formamide and periodate extractions in particular led to PG preparations with lower activities in complement consumption than the PCW from which they were prepared, although these activities were stimulated by sonication. When whole organisms were studied by using a TA-deficient mutant, a mutant with an additional cell surface polymer, and the TA-containing parent strains and complement consumption by these strains was compared, no difference was found in either the rate or the degree of complement activation. This led to experiments demonstrating that both material released extracellularly from staphylococci and the cytoplasmic fraction of S. aureus were active in complement consumption. The results of these experiments indicate that both physical and chemical factors must be considered in studies of complement activation by isolated bacterial cell wall components. 
Under certain conditions, staphylococcal TA may enhance complement activation, but studies with whole organisms clearly show that this cell wall constituent does not play an essential role in this process. In addition, studies of complement consumption with intact organisms have demonstrated that there may be contributions both from cell surface components and from material released by the cells.
Although excystation is crucial to the initiation of infection by Giardia lamblia, little is known about the regulation of this important process. We have been able to reliably induce excystation in vitro by mimicking cyst passage through the stomach and upper small intestine by the exposure of in vitro-derived cysts to an acidic, reducing environment (stage I) followed by protease treatment at a slightly alkaline pH (stage II). Preexposure of cysts to polyclonal rabbit antiserum against purified cyst walls (PCWs) or to wheat germ agglutinin (WGA) inhibited excystation by > 90%. Adsorption of either ligand with PCWs eliminated inhibition, demonstrating specificity for cyst wall epitopes. Inhibition by WGA was reversed by either chitotriose or sialic acid, while inhibition by polyclonal antibodies against PCWs (anti-PCW) was reversed only by sialic acid, which also inhibited binding of both ligands to intact cysts and to cyst wall antigens in immunoblots. Binding of anti-PCW did not affect acidification of cyst cytoplasm during stage I. Exposure of cysts to anti-PCW and WGA prior to, but not after, stage II was sufficient to inhibit excystation, and inhibition could be partially reversed by increasing the protease concentration during stage II. A 7- to 10-fold higher proportion of WGA- and anti-PCW-treated cysts than control cysts remained intact after stage II. Our results suggest that these ligands, which bind cyst wall epitopes, inhibit excystation, most likely by interfering with proteolysis of cyst wall glycoproteins during stage II.
The aim of the study is to describe the work pattern of personal care workers (PCWs) in nursing homes. This knowledge is important for staff performance appraisal, task allocation and scheduling. It will also support funding allocation based on activities.
A time-motion study was conducted in 2010 at two Australian nursing homes. The observation at Site 1 was between the hours of 7:00 and 14:00 or 15:00 for 14 days. One PCW was observed on each day. The observation at Site 2 was from 10:00 to 17:00 for 16 days. One PCW working on a morning shift and another one working on an afternoon shift were observed on each day. Fifty-eight work activities done by PCWs were grouped into eight categories. Activity time, frequency, duration and the switch between two consecutive activities were used as measurements to describe the work pattern.
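The four measurements listed above (activity time, frequency, duration and switches between consecutive activities) can be derived from a simple observation log, as in the following sketch. The log format, the sample entries and the simplifying assumption that only one activity is recorded at a time are hypothetical; the study itself notes that oral communication could occur concurrently with other activities.

```python
from collections import Counter, defaultdict


def summarize(log, end_time):
    """Summarize a chronologically ordered log of (start_minute, category) pairs."""
    total = defaultdict(float)   # minutes spent per category
    frequency = Counter()        # number of episodes per category
    switches = Counter()         # counts of (from_category, to_category) transitions
    for i, (start, category) in enumerate(log):
        last = i + 1 == len(log)
        stop = end_time if last else log[i + 1][0]
        total[category] += stop - start
        frequency[category] += 1
        if not last:
            switches[(category, log[i + 1][1])] += 1
    observed = end_time - log[0][0]
    share = {c: 100 * t / observed for c, t in total.items()}
    mean_duration = {c: total[c] / frequency[c] for c in total}
    return share, frequency, mean_duration, switches


# Hypothetical ten-minute observation.
log = [
    (0, "direct care"),
    (4, "oral communication"),
    (5, "direct care"),
    (8, "documentation"),
]
share, frequency, mean_duration, switches = summarize(log, end_time=10)
```

For this toy log, direct care accounts for 70% of the observed time in two episodes with a mean duration of 3.5 minutes, and three switches between activities are recorded.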
Personal care workers spent about 70.0% of their time on four types of activities consistently at both sites: direct care (30.7%), indirect care (17.6%), infection control (6.4%) and staff break (15.2%). Oral communication was the most frequently observed activity. It could occur independently or concurrently with other activities. At Site 2, PCWs spent significantly more time than their counterparts at Site 1 on oral communication (Site 1: 47.3% vs. Site 2: 63.5%, P = 0.003), transit (Site 1: 3.4% vs. Site 2: 5.5%, P < 0.001) and others (Site 1: 0.5% vs. Site 2: 1.8%, P < 0.001). They spent less time on documentation (Site 1: 4.1% vs. Site 2: 2.3%, P < 0.001). More than two-thirds of the observed activities had a very short duration (1 minute or less). Personal care workers frequently switched within or between oral communication, direct and indirect care activities.
At both nursing homes, direct care, indirect care, infection control and staff break occupied the major part of a PCW’s work; however, oral communication was the most time-consuming activity. Personal care workers frequently switched between activities, suggesting that looking after the elderly in nursing homes is a busy and demanding job.
Determination of PC20-FEV1 during the methacholine bronchial provocation test (MCT) is considered to be impossible in preschool children, as it requires repetitive spirometry sets. The aim of this study was to assess the feasibility of determining PC20-FEV1 in preschool-age children and to compare the results with the wheeze detection (PCW) method.
55 preschool children (ages 2.8–6.4 years) with recurrent respiratory symptoms were recruited. Baseline spirometry and MCT were performed according to ATS/ERS guidelines and the following parameters were determined at baseline and after each inhalation: spirometry-indices, lung auscultation at tidal breathing, oxygen saturation, respiratory and heart rate. Comparison between PCW and PC20-FEV1 and clinical parameters at these end-points was done by paired Student's t-tests.
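For readers unfamiliar with how PC20-FEV1 is obtained, the sketch below applies the log-linear interpolation between the last two methacholine concentrations that is commonly used with guideline-based challenge testing. Whether the authors computed PC20 in exactly this way is an assumption, and the concentrations and FEV1 falls in the example are invented.

```python
from math import log10


def pc20(c1, c2, r1, r2):
    """Interpolate the concentration causing a 20% fall in FEV1.

    c1, r1: second-to-last concentration (mg/ml) and its % fall in FEV1 (< 20)
    c2, r2: final concentration (mg/ml) and its % fall in FEV1 (>= 20)
    """
    if not (r1 < 20.0 <= r2):
        raise ValueError("the 20% fall must lie between the two measurements")
    log_pc20 = log10(c1) + (log10(c2) - log10(c1)) * (20.0 - r1) / (r2 - r1)
    return 10 ** log_pc20


# Hypothetical child: FEV1 fell 10% at 1 mg/ml and 30% at 4 mg/ml.
example_pc20 = pc20(c1=1.0, c2=4.0, r1=10.0, r2=30.0)
```

With these illustrative values the interpolated PC20 is 2 mg/ml, halfway between the two doses on the logarithmic scale.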
Results and discussion
Thirty-six of 55 children (65.4%) successfully performed spirometry sets up to the point of PCW. PC20-FEV1 occurred at a mean concentration of 1.70+/-2.01 mg/ml, whereas PCW occurred at a mean concentration of 4.37+/-3.40 mg/ml (p < 0.05). At PCW, all spirometry parameters were markedly reduced: FVC by 41.3+/-16.4% (mean +/-SD); FEV1 by 44.7+/-14.5%; PEFR by 40.5+/-14.5%; and FEF25–75 by 54.7+/-14.4% (P < 0.01 for all parameters). This reduction was accompanied by desaturation, hyperpnoea, tachycardia and a response to bronchodilators.
Determination of PC20-FEV1 by spirometry is feasible in many preschool children. PC20-FEV1 often appears at a lower provocation dose than PCW. The lower dose may shorten the test and encourage participation. The significant decrease in spirometry indices at PCW suggests that PC20-FEV1 determination may be safer.
We present a new computational method for solving a classical problem, the identification problem of cis-regulatory motifs in a given set of promoter sequences, based on one key new idea. Instead of scoring candidate motifs individually like in all the existing motif-finding programs, our method scores groups of candidate motifs with similar sequences, called motif closures, using a P-value, which has substantially improved the prediction reliability over the existing methods. Our new P-value scoring scheme is sequence length independent, hence allowing direct comparisons among predicted motifs with different lengths on the same footing. We have implemented this method as a Motif Recognition Computer (MREC) program, and have extensively tested MREC on both simulated and biological data from prokaryotic genomes. Our test results indicate that MREC can accurately pick out the actual motif with the correct length as the best scoring candidate for the vast majority of the cases in our test set. We compared our prediction results with two motif-finding programs Cosmo and MEME, and found that MREC outperforms both programs across all the test cases by a large margin. The MREC program is available at http://csbl.bmb.uga.edu/~bingqiang/MREC1/.
Computational identification of blood-secretory proteins, especially proteins with differentially expressed genes in diseased tissues, can provide highly useful information in linking transcriptomic data to proteomic studies for targeted disease biomarker discovery in serum.
A new algorithm for prediction of blood-secretory proteins is presented using an information-retrieval technique called manifold ranking. On a dataset containing 305 known blood-secretory human proteins and a large number of other proteins that are either not blood-secretory or of unknown status, the new method performs better than the previously published method, measured in terms of the area under the recall-precision curve (AUC). A key advantage of the presented method is that it does not explicitly require a negative training set, which is often noisy or difficult to derive for most biological problems, making our method more widely applicable than classification-based data mining methods in general biological studies.
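To illustrate why no negative training set is needed, the following is a minimal sketch of generic manifold ranking, not the authors' implementation. It assumes a Gaussian affinity between feature vectors and the usual iterative update f ← αSf + (1 − α)y, where only the known positives are labeled. The feature vectors, `alpha`, `sigma` and the function name are illustrative assumptions.

```python
import math


def manifold_rank(points, seeds, alpha=0.9, sigma=1.0, iters=200):
    """Score every point by its relevance to the seed (known positive) indices."""
    n = len(points)
    # Gaussian affinity matrix W with a zero diagonal.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    # Symmetric normalisation S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in w]
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Only positives are labeled; everything else starts at 0 (unknown).
    y = [1.0 if i in seeds else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iters):
        f = [
            alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Toy data: two well-separated clusters, one known positive (index 0).
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
scores = manifold_rank(toy, seeds={0})
ranking = sorted(range(len(toy)), key=lambda i: -scores[i])
print(ranking[:3])
```

Unlabeled points near the seed inherit high scores through the affinity graph, so candidates are ranked without ever declaring any protein a negative example.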
We believe that our program will prove to be very useful to biomedical researchers who are interested in finding serum markers, especially when they have candidate proteins derived through transcriptomic or proteomic analyses of diseased tissues. The program, which predicts blood-secretory proteins based on manifold ranking, is accessible at our website http://csbl.bmb.uga.edu/publications/materials/qiliu/blood_secretory_protein.html.
Carbohydrate-active enzymes (CAZymes) are very important to the biotech industry, particularly the emerging biofuel industry, because CAZymes are responsible for the synthesis, degradation and modification of all the carbohydrates on Earth. We have developed a web resource, dbCAN (http://csbl.bmb.uga.edu/dbCAN/annotate.php), that provides automated CAZyme signature domain-based annotation for any protein data set (e.g. proteins from a newly sequenced genome) submitted to our server. To accomplish this, we have explicitly defined a signature domain for every CAZyme family, derived from CDD (conserved domain database) searches and literature curation. We have also constructed a hidden Markov model (HMM) to represent the signature domain of each CAZyme family. These CAZyme family-specific HMMs are our key contribution and the foundation for the automated CAZyme annotation.
We have recently developed a new version of the DOOR operon database, DOOR 2.0, which is available online at http://csbl.bmb.uga.edu/DOOR/ and will be updated on a regular basis. DOOR 2.0 contains genome-scale operons for 2072 prokaryotes with complete genomes, three times the number of genomes covered in the previous version published in 2009. DOOR 2.0 has a number of new features, compared with its previous version, including (i) more than 250 000 transcription units, experimentally validated or computationally predicted based on RNA-seq data, providing a dynamic functional view of the underlying operons; (ii) an integrated operon-centric data resource that provides not only operons for each covered genome but also their functional and regulatory information such as their cis-regulatory binding sites for transcription initiation and termination, gene expression levels estimated based on RNA-seq data and conservation information across multiple genomes; (iii) a high-performance web service for online operon prediction on user-provided genomic sequences; (iv) an intuitive genome browser to support visualization of user-selected data; and (v) a keyword-based Google-like search engine for finding the needed information intuitively and rapidly in this database.
Macrophages and granulocytes seem to play a key role in the pathogenesis of bacterial meningitis. Transforming growth factor beta (TGF-beta) leads to macrophage deactivation, as well as to inhibition of cytokine production and of endothelial granulocyte adhesion. We have investigated the influence of TGF-beta on regional cerebral blood flow (rCBF), intracranial pressure (ICP), and brain edema formation during the early phase of experimental meningitis. Rats which were inoculated intracisternally with live pneumococci or with pneumococcal cell wall hydrolyzed by the M1 muramidase (PCW-M) developed an increase of rCBF and ICP within 4 h postintracisternal challenge. A single intraperitoneal injection of TGF-beta 2 but not of TGF-beta 2 vehicle-control prevented the changes of rCBF. Furthermore, TGF-beta 2 significantly reduced the increase of ICP in rats inoculated with PCW-M. Likewise, the elevation of brain water content after intracisternal injection of pneumococci or PCW-M was blocked by pretreatment of rats with TGF-beta 2. TGF-beta 1 exhibited similar inhibitory effects in PCW-M-injected rats. The beneficial effects of TGF-beta 2 on the initial phase after pneumococcal inoculation seem to be tumor necrosis factor alpha (TNF-alpha)-independent since (a) intracisternal or intraperitoneal injection of neutralizing anti-TNF-alpha antibodies did not significantly influence rCBF, ICP, and brain water content in PCW-M-induced meningitis; and (b) TNF-alpha was only occasionally detected at low levels in cerebrospinal fluid at 4 h after PCW-M application.
Pathway enrichment analysis represents a key technique for analyzing high-throughput omic data, and it can help to link individual genes or proteins found to be differentially expressed under specific conditions to well-understood biological pathways. We present here a computational tool, SEAS, for pathway enrichment analysis over a given set of genes in a specified organism against the pathways (or subsystems) in the SEED database, a popular pathway database for bacteria. SEAS maps a given set of genes of a bacterium to pathway genes covered by SEED through gene ID and/or orthology mapping, and then calculates the statistical significance of the enrichment of each relevant SEED pathway by the mapped genes. Our evaluation of SEAS indicates that the program provides highly reliable pathway mapping results and identifies more organism-specific pathways than similar existing programs. SEAS is publicly released under the GPL license agreement and freely available at http://csbl.bmb.uga.edu/~xizeng/research/seas/.
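The abstract does not specify which statistical test SEAS uses. A common choice for this kind of enrichment is the hypergeometric (one-sided Fisher) test, and the sketch below assumes it. The function names, gene IDs and pathway names are hypothetical and are not part of SEAS or SEED.

```python
from math import comb


def hypergeom_pvalue(genome_size, pathway_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement."""
    total = comb(genome_size, query_size)
    upper = min(pathway_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(genome_size - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(query_genes, pathways, genome_size):
    """Return (pathway, hits, p-value) tuples, most significant first."""
    query = set(query_genes)
    results = []
    for name, members in pathways.items():
        hits = query & set(members)
        if hits:
            p = hypergeom_pvalue(genome_size, len(members), len(query), len(hits))
            results.append((name, len(hits), p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical input: 3 query genes in a 20-gene genome with two pathways.
pathways = {"A": ["g1", "g2", "g3"], "B": ["g4", "g5", "g6", "g7", "g8"]}
for name, hits, p in enrich(["g1", "g2", "g4"], pathways, genome_size=20):
    print(name, hits, round(p, 4))
```

In practice the raw p-values would also need a multiple-testing correction across all pathways tested, which this sketch leaves out.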
Several large-scale gene co-expression networks have been constructed successfully for predicting gene functional modules and cis-regulatory elements in Arabidopsis (Arabidopsis thaliana). However, these networks are usually constructed and analyzed in an ad hoc manner. In this study, we propose a completely parameter-free and systematic method for constructing gene co-expression networks and predicting functional modules as well as cis-regulatory elements.
Our novel method consists of an automated network construction algorithm, a parameter-free procedure to predict functional modules, and a strategy for finding known cis-regulatory elements that is suitable for consensus scanning without prior knowledge of the allowed extent of degeneracy of the motif. We apply the method to study a large collection of gene expression microarray data in Arabidopsis. We estimate that our co-expression network has an accuracy of ~94% and has topological properties similar to those of other biological networks, such as being scale-free and having a high clustering coefficient. Remarkably, among the ~300 predicted modules whose sizes are at least 20, 88% have at least one significantly enriched function, including a few extremely significant ones (ribosome, p < 1E-300; photosynthetic membrane, p < 1.3E-137; proteasome complex, p < 5.9E-126). In addition, we are able to predict cis-regulatory elements for 66.7% of the modules, and the association between the enriched cis-regulatory elements and the enriched functional terms can often be confirmed by the literature. Overall, our results are much more significant than those reported by several previous studies on similar data sets. Finally, we use the co-expression network to dissect the promoters of 19 Arabidopsis genes involved in the metabolism and signaling of the important plant hormone gibberellin, and achieve promising results that reveal interesting insight into the biosynthesis and signaling of gibberellin.
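To make the terms "co-expression network" and "clustering coefficient" concrete, here is a minimal sketch using a conventional fixed-cutoff construction. It is explicitly not the parameter-free method described above: the Pearson cutoff of 0.9, the function names and the toy expression profiles are all assumptions for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_network(expr, cutoff=0.9):
    """Link two genes when their correlation reaches the (assumed) cutoff."""
    adj = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) >= cutoff:
            adj[g1].add(g2)
            adj[g2].add(g1)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Toy profiles: genes a, b, c rise together; gene d does not.
expr = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],
    "c": [1.0, 2.1, 2.9, 4.0],
    "d": [4.0, 1.0, 3.0, 2.0],
}
net = coexpression_network(expr)
print({g: sorted(n) for g, n in net.items()})
print(clustering_coefficient(net, "a"))
```

A high average clustering coefficient means that genes sharing a co-expression partner tend to be co-expressed with each other, which is what makes densely connected modules detectable.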
The results show that our method is highly effective in finding functional modules from real microarray data. Its application to Arabidopsis leads to the discovery of the largest number of annotated Arabidopsis functional modules in the literature. Given the high statistical significance of functional enrichment and the agreement between cis-regulatory and functional annotations, we believe our Arabidopsis gene modules can be used to predict the functions of unknown genes in Arabidopsis, and to understand the regulatory mechanisms of many genes.
A new computational method uses gene expression databases and transcription factor binding specificities to describe regulatory elements in the Drosophila A/P patterning network in unprecedented detail.
Cis-regulatory modules that drive precise spatial-temporal patterns of gene expression are central to the process of metazoan development. We describe a new computational strategy to annotate genomic sequences based on their “pattern generating potential” and to produce quantitative descriptions of transcriptional regulatory networks at the level of individual protein-module interactions. We use this approach to convert the qualitative understanding of interactions that regulate Drosophila segmentation into a network model in which a confidence value is associated with each transcription factor-module interaction. Sequence information from multiple Drosophila species is integrated with transcription factor binding specificities to determine conserved binding site frequencies across the genome. These binding site profiles are combined with transcription factor expression information to create a model to predict module activity patterns. This model is used to scan genomic sequences for the potential to generate all or part of the expression pattern of a nearby gene, obtained from available gene expression databases. Interactions between individual transcription factors and modules are inferred by a statistical method to quantify a factor's contribution to the module's pattern generating potential. We use these pattern generating potentials to systematically describe the location and function of known and novel cis-regulatory modules in the segmentation network, identifying many examples of modules predicted to have overlapping expression activities. Surprisingly, conserved transcription factor binding site frequencies were as effective as experimental measurements of occupancy in predicting module expression patterns or factor-module interactions. Thus, unlike previous module prediction methods, this method predicts not only the location of modules but also their spatial activity pattern and the factors that directly determine this pattern. 
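One ingredient of the approach above is counting transcription factor binding sites in a sequence from the factor's binding specificity. The sketch below shows the generic technique of scanning with a log-odds position weight matrix. It is a simplified illustration, not the authors' model: it ignores cross-species conservation and factor expression, scans one strand only, and the count matrix, pseudocount, threshold and function names are assumptions.

```python
import math


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def count_sites(sequence, matrix, threshold):
    """Count windows on the given strand scoring at or above the threshold.

    The sequence is assumed to contain only A, C, G and T.
    """
    width = len(matrix)
    hits = 0
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits += 1
    return hits


# Hypothetical count matrix for a 4-bp motif with consensus TAAT.
counts = [{"T": 10}, {"A": 10}, {"A": 10}, {"T": 10}]
pwm = log_odds_matrix(counts)
print(count_sites("GGTAATCCTAATGG", pwm, threshold=5.0))
```

Dividing such a count by the sequence length gives a site frequency that can be compared across modules of different sizes.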
As databases of transcription factor specificities and in vivo gene expression patterns grow, analysis of pattern generating potentials provides a general method to decode transcriptional regulatory sequences and networks.
The developmental program specifying segmentation along the anterior-posterior axis of the Drosophila embryo is one of the best studied examples of transcriptional regulatory networks. Previous work has identified the location and function of dozens of DNA segments called cis-regulatory “modules” that regulate several genes in precise spatial patterns in the early embryo. In many cases, transcription factors that interact with such modules have also been identified. We present a novel computational framework that turns a qualitative and fragmented understanding of modules and factor-module interactions into a quantitative, systems-level view. The formalism utilizes experimentally characterized binding specificities of transcription factors and gene expression patterns to describe how multiple transcription factors (working as activators or repressors) act together in a module to determine its regulatory activity. This formalism can explain the expression patterns of known modules, infer factor-module interactions and quantify the potential of an arbitrary DNA segment to drive a gene's expression. We have also employed databases of gene expression patterns to find novel modules of the regulatory network. As databases of binding motifs and gene expression patterns grow, this new approach provides a general method to decode transcriptional regulatory sequences and networks.
Conjugative plasmids encode antibiotic resistance determinants or toxin genes in the anaerobic pathogen Clostridium perfringens. The paradigm conjugative plasmid in this bacterium is pCW3, a 47-kb tetracycline resistance plasmid that encodes the unique tcp transfer locus. The tcp locus consists of 11 genes, intP and tcpA-tcpJ, at least three of which, tcpA, tcpF, and tcpH, are essential for the conjugative transfer of pCW3. In this study we examined protein-protein interactions involving TcpA, the putative coupling protein. Use of a bacterial two-hybrid system identified interactions between TcpA and TcpC, TcpG, and TcpH. This analysis also demonstrated TcpA, TcpC, and TcpG self-interactions, which were confirmed by chemical cross-linking studies. Examination of a series of deletion and site-directed derivatives of TcpA identified the domains and motifs required for these interactions. Based on these results, we have constructed a model for this unique conjugative transfer apparatus.