Plasma membrane proteins play critical roles in cell-to-cell recognition, signal transduction and material transport. Because of their accessibility, membrane proteins constitute the major targets for protein-based drugs. Here, we describe an approach that combines stable isotope labeling by amino acids in cell culture (SILAC), cell surface biotinylation, affinity peptide purification and LC-MS/MS for the identification and quantification of cell surface membrane proteins. We applied the strategy to the quantitative analysis of membrane proteins expressed by a pair of human melanoma cell lines, WM-115 and WM-266-4, which were derived initially from the primary and metastatic tumor sites of the same individual. We were able to identify more than 100 membrane and membrane-associated proteins from these two cell lines, including cell surface histones. We further confirmed the surface localization of histone H2B and three other proteins by immunocytochemical analysis with confocal microscopy. Contamination from cytoplasmic and other nonmembrane-related sources was greatly reduced by using cell surface biotinylation and affinity purification of biotinylated peptides. We also quantified the relative expression of 62 identified proteins in the two types of melanoma cells. The application to quantitative analysis of membrane proteins of primary and metastatic melanoma cells revealed the great potential of the method for the comprehensive identification of tumor progression markers as well as for the discovery of new protein-based therapeutic targets.
Quantitative proteomics; LC-MS/MS; SILAC; membrane proteins; cell surface biotinylation; immunocytochemistry; malignant melanoma; biomarkers
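The relative quantification step described above can be illustrated with a minimal sketch. It assumes that each peptide has already been reduced to a pair of light and heavy intensities, and it summarizes a protein by the median heavy/light ratio of its peptides. The function name, the input layout and the use of the median are assumptions made for illustration; the abstract does not state how ratios were summarized.

```python
from collections import defaultdict
from statistics import median


def protein_ratios(peptides):
    """Summarize SILAC peptide pairs as one heavy/light ratio per protein.

    peptides: iterable of (protein, light_intensity, heavy_intensity).
    Pairs with a non-positive intensity are skipped (assumed unquantifiable).
    """
    by_protein = defaultdict(list)
    for protein, light, heavy in peptides:
        if light > 0 and heavy > 0:
            by_protein[protein].append(heavy / light)
    # The median is an illustrative choice that is robust to outlier peptides.
    return {protein: median(r) for protein, r in by_protein.items()}


# Hypothetical intensities, not data from the study.
example = [
    ("protein_A", 100.0, 200.0),
    ("protein_A", 50.0, 150.0),
    ("protein_A", 80.0, 160.0),
    ("protein_B", 120.0, 60.0),
]
ratios = protein_ratios(example)
```

A ratio above 1 would indicate higher abundance in the heavy-labeled cell line, and a ratio below 1 higher abundance in the light-labeled one.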
The traditional reliance on blood pressure (BP) measurement in the medical setting misses a significant number of individuals with masked hypertension, who have normal clinic BP but persistently high daytime BP when measured out of the office. We suggest that masked hypertension may be a precursor of clinically recognized sustained hypertension and is associated with increased cardiovascular risk compared with consistent normotension. We discuss factors that may contribute to clinic–daytime BP differences as well as the changing relationship between these two measures over time. Anxiety at the time of BP measurement and having been diagnosed as hypertensive appear to be two possible mechanisms. The identification of individuals with masked hypertension is of great clinical importance and requires out-of-office BP screening. Ambulatory BP monitoring is the best established technique for doing this, but home monitoring may be applicable in the future.
blood pressure measurement; masked hypertension; white coat hypertension
Hexavalent chromium (Cr(VI)) is emerging as a major concern for aquatic environments, particularly marine environments. Medaka (Oryzias latipes) has been used as a model species for human and aquatic health, including the marine environment, though few studies have directly compared toxicological responses in medaka to those in humans or other aquatic species. We used a medaka fin cell line to compare the genotoxic response of medaka to Cr(VI) with the response observed in North Atlantic right whale cells, to see if responses in medaka were similar to those of other aquatic species, particularly aquatic mammals. We used the production of chromosomal aberrations as a measure of genotoxicity. We found that in medaka cells, concentrations of 1, 5 and 10 μM sodium chromate damaged 17, 32 and 43% of metaphases, respectively. The same concentrations damaged 14, 24 and 49% of metaphases, respectively, in North Atlantic right whale lung cells and 11, 32 and 41% of metaphases, respectively, in North Atlantic right whale testes cells. These data show that genotoxic responses in medaka are comparable to those seen in North Atlantic right whale cells, consistent with the hypothesis that medaka are a useful model for other aquatic species.
Chromate; Chromium; Genotoxicity; Medaka; North Atlantic right whale; Hexavalent chromium
Oxidative stress is a common stress encountered by living organisms and is due to an imbalance between intracellular reactive oxygen and nitrogen species (ROS, RNS) and cellular antioxidant defence. To defend themselves against ROS/RNS, bacteria possess a subsystem of detoxification enzymes, which are classified with regard to their substrates. To identify such enzymes in prokaryotic genomes, different approaches based on similarity, enzyme profiles or patterns exist. Unfortunately, several problems persist in the annotation, classification and naming of these enzymes due mainly to some erroneous entries in databases, mistake propagation, absence of updating and disparity in function description.
In order to improve the current annotation of oxidative stress subsystems, an innovative platform named OxyGene has been developed. It integrates an original database called OxyDB, holding thoroughly tested anchor-based signatures associated to subfamilies of oxidative stress enzymes, and a new anchor-driven annotator, for ab initio detection of ROS/RNS response genes. All complete Bacterial and Archaeal genomes have been re-annotated, and the results stored in the OxyGene repository can be interrogated via a Graphical User Interface.
OxyGene enables the exploration and comparative analysis of enzymes belonging to 37 detoxification subclasses in 664 microbial genomes. It proposes a new classification that improves both the ontology and the annotation of the detoxification subsystems in prokaryotic whole genomes, while discovering new ORFs and attributing precise function to hypothetical annotated proteins. OxyGene is freely available at:
Mismatched oligonucleotides are widely used on microarrays to differentiate specific from nonspecific hybridization. While many experiments rely on such oligos, the hybridization behavior of various degrees of mismatch (MM) structure has not been extensively studied. Here, we present the results of two large-scale microarray experiments on S. cerevisiae and H. sapiens genomic DNA, to explore MM oligonucleotide behavior with real sample mixtures under tiling-array conditions.
We examined all possible nucleotide substitutions at the central position of 36-nucleotide probes, and found that nonspecific binding by MM oligos depends upon the individual nucleotide substitutions they incorporate: C→A, C→G and T→A (yielding purine-purine mispairs) are most disruptive, whereas A→X substitutions are least disruptive. We also quantify a marked GC skew effect: substitutions raising probe GC content exhibit higher intensity (and vice versa). This skew is small in highly-expressed regions (± 0.5% of total intensity range) and large (± 2% or more) elsewhere. Multiple mismatches per oligo are largely additive in effect: each MM added in a distributed fashion causes an additional 21% intensity drop relative to PM, three-fold more disruptive than adding adjacent mispairs (7% drop per MM).
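The additive behavior reported above can be written as a small linear model. The sketch below takes the 21% and 7% per-mismatch drops from the text and assumes, for illustration only, that the drops add linearly as a fraction of the perfect-match (PM) intensity and that intensity cannot fall below zero. The function and constant names are hypothetical.

```python
# Per-mismatch intensity drops relative to PM, as reported in the text.
DROP_DISTRIBUTED = 0.21  # mismatches spread along the oligo
DROP_ADJACENT = 0.07     # mismatches placed next to each other


def expected_mm_intensity(pm_intensity, n_mismatches, adjacent=False):
    """Estimate MM probe intensity under a simple additive model.

    Assumption: each mismatch removes a fixed fraction of the PM
    intensity, and the result is floored at zero.
    """
    drop = DROP_ADJACENT if adjacent else DROP_DISTRIBUTED
    remaining = max(0.0, 1.0 - drop * n_mismatches)
    return pm_intensity * remaining
```

Under this model, three adjacent mismatches (3 × 7%) reduce intensity by the same amount as a single distributed mismatch (21%), which matches the three-fold difference stated above.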
We investigate several parameters for oligonucleotide design, including the effects of each central nucleotide substitution on array signal intensity and of multiple MM per oligo. To avoid GC skew, individual substitutions should not alter probe GC content. RNA sample mixture complexity may increase the amount of nonspecific hybridization, magnify GC skew and boost the intensity of MM oligos at all levels.
Large scale genome-wide association studies have become popular since the introduction of high throughput genotyping platforms. Efficient management of the vast array of data generated poses many challenges.
We have developed an open source web-based data management system for the large amount of genotype data generated from the Affymetrix GeneChip® Mapping Array and Affymetrix Genome-Wide Human SNP Array platforms. The database supports genotype calling using DM, BRLMM, BRLMM-P or Birdseed algorithms provided by the Affymetrix Power Tools. The genotype and corresponding pedigree data are stored in a relational database for efficient downstream data manipulation and analysis, such as calculation of allele and genotype frequencies, sample identity checking, and export of genotype data in various file formats for analysis using commonly-available software. A novel method for genotyping error estimation is implemented using linkage disequilibrium information from the HapMap project. All functionalities are accessible via a web-based user interface.
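As an illustration of the downstream calculations mentioned above, the following minimal sketch computes genotype and allele frequencies for one SNP. It assumes genotype calls are encoded as "AA", "AB" and "BB", with any other value treated as a missing call. This is not OpenADAM code, and the function names are hypothetical.

```python
from collections import Counter

CALLS = ("AA", "AB", "BB")  # assumed genotype encoding; anything else is a no-call


def genotype_frequencies(genotypes):
    """Return the frequency of each called genotype, ignoring no-calls."""
    called = [g for g in genotypes if g in CALLS]
    if not called:
        return {}
    counts = Counter(called)
    return {g: counts[g] / len(called) for g in CALLS}


def allele_frequencies(genotypes):
    """Return A and B allele frequencies from diploid genotype calls."""
    called = [g for g in genotypes if g in CALLS]
    if not called:
        return {}
    n_alleles = 2 * len(called)
    a_count = sum(g.count("A") for g in called)
    return {"A": a_count / n_alleles, "B": (n_alleles - a_count) / n_alleles}


# Hypothetical calls for a single SNP across six samples.
calls = ["AA", "AB", "AB", "BB", "NoCall", "AA"]
geno = genotype_frequencies(calls)
allele = allele_frequencies(calls)
```

In a relational setting the same counts would typically come from a grouped query per SNP; the Python version only shows the arithmetic.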
OpenADAM provides an open source database system for management of Affymetrix genome-wide association SNP data.
Biomphalaria glabrata is an intermediate snail host for Schistosoma mansoni, one of the important schistosomes infecting man. B. glabrata/S. mansoni provides a useful model system for investigating the intimate interactions between host and parasite. Examining differential gene expression between S. mansoni-exposed schistosome-resistant and susceptible snail lines will identify genes and pathways that may be involved in snail defences.
We have developed a 2053 element cDNA microarray for B. glabrata containing clones from ORESTES (Open Reading frame ESTs) libraries, suppression subtractive hybridization (SSH) libraries and clones identified in previous expression studies. Snail haemocyte RNA, extracted from parasite-challenged resistant and susceptible snails, 2 to 24 h post-exposure to S. mansoni, was hybridized to the custom made cDNA microarray and 98 differentially expressed genes or gene clusters were identified, 94 resistant-associated and 4 susceptible-associated. Quantitative PCR analysis verified the cDNA microarray results for representative transcripts. Differentially expressed genes were annotated and clustered using gene ontology (GO) terminology and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis. 61% of the identified differentially expressed genes have no known function including the 4 susceptible strain-specific transcripts. Resistant strain-specific expression of genes implicated in innate immunity of invertebrates was identified, including hydrolytic enzymes such as cathepsin L, a cysteine proteinase involved in lysis of phagocytosed particles; metabolic enzymes such as ornithine decarboxylase, the rate-limiting enzyme in the production of polyamines, important in inflammation and infection processes, as well as scavenging damaging free radicals produced during production of reactive oxygen species; stress response genes such as HSP70; proteins involved in signalling, such as importin 7 and copine 1, cytoplasmic intermediate filament (IF) protein and transcription enzymes such as elongation factor 1α and EF-2.
Production of the first cDNA microarray for profiling gene expression in B. glabrata provides a foundation for expanding our understanding of pathways and genes involved in the snail internal defence system (IDS). We demonstrate resistant strain-specific expression of genes potentially associated with the snail IDS, ranging from signalling and inflammation responses through to lysis of proteinaceous products (encapsulated sporocysts or phagocytosed parasite components) and processing/degradation of these targeted products by ubiquitination.
The recently constructed river buffalo whole-genome radiation hybrid panel (BBURH5000) has already been used to generate preliminary radiation hybrid (RH) maps for several chromosomes, and buffalo-bovine comparative chromosome maps have been constructed. Here, we present the first-generation whole genome RH map (WG-RH) of the river buffalo generated from cattle-derived markers. The RH maps were aligned to the bovine genome sequence assembly Btau_4.0, providing valuable comparative mapping information for both species.
A total of 3990 markers were typed on the BBURH5000 panel, of which 3072 were cattle-derived SNPs. The remaining 918 were classified as cattle sequence tagged sites (STS), including coding genes, ESTs, and microsatellites. Average retention frequency per chromosome was 27.3%, calculated with 3093 scorable markers distributed in 43 linkage groups covering all autosomes (24) and the X chromosome at a LOD ≥ 8. The estimated total length of the WG-RH map is 36,933 cR5000. Fewer than 15% of the markers (472) could not be placed within any linkage group at a LOD score ≥ 8. Linkage group order for each chromosome was determined by incorporation of markers previously assigned by FISH and by alignment with the bovine genome sequence assembly (Btau_4.0).
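Retention frequency, as used above, is the fraction of hybrid clones in the panel that retain a given marker. The sketch below assumes each marker is scored as a string with one character per clone, where "1" means present, "0" means absent, and any other character is an ambiguous score that is excluded. The encoding and function names are assumptions for illustration, not the format used in the study.

```python
def retention_frequency(scores):
    """Fraction of unambiguously scored clones that retain the marker.

    scores: string with one character per hybrid clone
            ('1' present, '0' absent, anything else ambiguous).
    Returns None when no clone was scored unambiguously.
    """
    informative = [s for s in scores if s in ("0", "1")]
    if not informative:
        return None
    return informative.count("1") / len(informative)


def mean_retention(marker_scores):
    """Average retention frequency over markers with informative scores."""
    values = [retention_frequency(s) for s in marker_scores]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None
```

Averaging `retention_frequency` over the markers assigned to one chromosome would give a per-chromosome value comparable in kind to the 27.3% average reported above.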
We obtained radiation hybrid chromosome maps for the entire river buffalo genome based on cattle-derived markers. The alignments of our RH maps to the current bovine genome sequence assembly (Btau_4.0) indicate regions of possible rearrangements between the chromosomes of both species. The river buffalo represents an important agricultural species whose genetic improvement has lagged behind other species due to limited prior genomic characterization. We present the first-generation RH map which provides a more extensive resource for positional candidate cloning of genes associated with complex traits and also for large-scale physical mapping of the river buffalo genome.
Quantitative polymerase chain reaction (QPCR) is a widely applied analytical method for the accurate determination of transcript abundance. Primers for QPCR have been designed on a genomic scale but non-specific amplification of non-target genes has frequently been a problem. Although several online databases have been created for the storage and retrieval of experimentally validated primers, only a few thousand primer pairs are currently present in existing databases and the primers are not designed for use under a common PCR thermal profile.
We previously reported the implementation of an algorithm to predict PCR primers for most known human and mouse genes. We now report the use of that resource to identify 17483 pairs of primers that have been experimentally verified to amplify unique sequences corresponding to distinct murine transcripts. The primer pairs have been validated by gel electrophoresis, DNA sequence analysis and thermal denaturation profile. In addition to the validation studies, we have determined the uniformity of amplification using the primers and the technical reproducibility of the QPCR reaction using the popular and inexpensive SYBR Green I detection method.
We have identified an experimentally validated collection of murine primer pairs for PCR and QPCR which can be used under a common PCR thermal profile, allowing the evaluation of transcript abundance of a large number of genes in parallel. This feature is increasingly attractive for confirming and/or making more precise data trends observed from experiments performed with DNA microarrays.
One of the most striking features of mammalian and bird chromosomes is the variation in guanine-cytosine (GC) content that occurs over scales of hundreds of kilobases to megabases; this is known as the "isochore" structure. Among other vertebrates, the presence of isochores depends upon the taxon: isochores are clearly present in crocodiles and turtles, but fish genomes appear very homogeneous in GC content. This has suggested a unique isochore origin after the divergence between Sarcopterygii and Actinopterygii, but before that between Sauropsida and mammals. Over more than 30 years of analysis, isochore characteristics have been studied and many important biological properties have been associated with the isochore structure of the human genome. For instance, genes are more compact and their density is highest in GC-rich isochores.
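The large-scale GC variation described above is usually examined by computing GC content in fixed-size windows along a sequence. The following is a minimal sketch of that calculation; the window size, the handling of ambiguous bases and the function names are illustrative assumptions.

```python
def gc_content(seq):
    """GC fraction of a sequence, counting only A, C, G and T.

    Returns None when the sequence contains no unambiguous base.
    """
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None
    return (seq.count("G") + seq.count("C")) / acgt


def windowed_gc(seq, window, step=None):
    """GC content in windows of `window` bases, as (start, gc) pairs.

    Windows do not overlap unless a smaller `step` is given; a trailing
    partial window is dropped.
    """
    step = step or window
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

For isochore-scale analysis the window would be on the order of tens to hundreds of kilobases; a heterogeneous genome then shows long runs of windows with similar GC content, while a homogeneous one shows little variation between windows.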
This paper shows that teleost fish genomes exhibit a "GC segmentation" that shares some of the characteristics of isochores, even though these genomes are notably homogeneous in GC content. The entire genomes of T. nigroviridis and D. rerio are now available, which has made it possible to check whether a mosaic structure associated with isochore properties can be found in these fishes. In this study, hidden Markov models were trained on fish genes (T. nigroviridis and D. rerio) that were classified using the isochore class of their human orthologs. A clear segmentation of these genomes was detected.
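To give a sense of how a hidden Markov model produces a segmentation, the sketch below decodes a two-state model with the Viterbi algorithm. It is a deliberately simplified stand-in and not the model used in the study: the two states, the "low"/"high" observation symbols (for example, discretized window GC content) and all probabilities are hypothetical values chosen for illustration.

```python
import math

# Hypothetical two-state model; every number here is an assumption.
STATES = ("GC_poor", "GC_rich")
START = {"GC_poor": 0.5, "GC_rich": 0.5}
TRANS = {
    "GC_poor": {"GC_poor": 0.9, "GC_rich": 0.1},
    "GC_rich": {"GC_poor": 0.1, "GC_rich": 0.9},
}
EMIT = {
    "GC_poor": {"low": 0.8, "high": 0.2},
    "GC_rich": {"low": 0.2, "high": 0.8},
}


def viterbi(observations):
    """Return the most probable state path for a list of 'low'/'high' symbols."""
    if not observations:
        return []
    # Log-probability of the best path ending in each state.
    score = {s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
             for s in STATES}
    back = []  # one back-pointer table per transition
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[best_prev] + math.log(TRANS[best_prev][s])
                            + math.log(EMIT[s][obs]))
            pointers[s] = best_prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Because switching states is penalized by the low transition probability, an isolated "high" window inside a run of "low" windows stays in the GC-poor segment, which is what lets such a model find segments even when the GC contrast is weak.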
GC content is an excellent indicator of isochores in heterogeneous genomes such as those of mammals. The segmentations we obtained were well correlated with GC content and with other properties associated with GC content, such as gene density, the number of exons per gene and the length of introns. Therefore, GC content is the main property that allows the detection of isochores, but additional biological properties have to be taken into account. This method allows isochores to be detected in homogeneous genomes.
Mus spretus diverged from Mus musculus over one million years ago. These mice are genetically and phenotypically divergent. Despite the value of utilizing M. musculus and M. spretus for quantitative trait locus (QTL) mapping, relatively little genomic information on M. spretus exists, and most of the available sequence and polymorphic data is for one strain of M. spretus, Spret/Ei. In previous work, we mapped fifteen loci for skin cancer susceptibility using four different M. spretus by M. musculus F1 backcrosses. One locus, skin tumor susceptibility 5 (Skts5) on chromosome 12, shows strong linkage in one cross.
To identify potential candidate genes for Skts5, we sequenced 65 named and unnamed genes and coding elements mapping to the peak linkage area in outbred spretus, Spret/EiJ, FVB/NJ, and NIH/Ola. We identified polymorphisms in 62 of 65 genes including 122 amino acid substitutions. To look for polymorphisms consistent with the linkage data, we sequenced exons with amino acid polymorphisms in two additional M. spretus strains and one additional M. musculus strain generating 40.1 kb of sequence data. Eight candidate variants were identified that fit with the linkage data. To determine the degree of variation across M. spretus, we conducted phylogenetic analyses. The relatedness of the M. spretus strains at this locus is consistent with the proximity of region of ascertainment of the ancestral mice.
Our analyses suggest that, if Skts5 on chromosome 12 is representative of other regions in the genome, then published genomic data for Spret/EiJ are likely to be of high utility for genomic studies in other M. spretus strains.
Gene expression is controlled over a wide range at the transcript level through complex interplay between DNA and regulatory proteins, resulting in profiles of gene expression that can be represented as normal, graded, and bimodal (switch-like) distributions. We have previously performed genome-scale identification and annotation of genes with switch-like expression at the transcript level in mouse, using large microarray datasets for healthy tissue, in order to study the cellular pathways and regulatory mechanisms involving this class of genes. We showed that a large population of bimodal mouse genes encoding for cell membrane and extracellular matrix proteins is involved in communication pathways. This study expands on previous results by annotating human bimodal genes, investigating their correspondence to bimodality in mouse orthologs and exploring possible regulatory mechanisms that contribute to bimodality in gene expression in human and mouse.
Fourteen percent of the human genes on the HGU133A array (1847 out of 13076) were identified as bimodal or switch-like. More than 40% were found to have bimodal mouse orthologs. KEGG pathways enriched for bimodal genes included ECM-receptor interaction, focal adhesion, and tight junction, showing strong similarity to the results obtained in mouse. Tissue-specific modes of expression of bimodal genes among brain, heart, and skeletal muscle were common between human and mouse. Promoter analysis revealed a higher than average number of transcription start sites per gene within the set of bimodal genes. Moreover, the bimodal gene set had differentially methylated histones compared to the set of the remaining genes in the genome.
The fact that bimodal genes were enriched within the cell membrane and extracellular environment makes these genes candidate biomarkers for tissue specificity. The commonality of the important roles bimodal genes play in tissue differentiation in both human and mouse indicates the potential value of mouse data in providing context for human tissue studies. The regulation motifs enriched in the bimodal gene set (TATA boxes, alternative promoters, methylation) have known associations with complex diseases, such as cancer, providing further potential for the use of bimodal genes in studying the molecular basis of disease.
We have recently released a comprehensive, manually curated database of mammalian protein complexes called CORUM. Combining CORUM with other resources, we assembled a dataset of over 2700 mammalian complexes. The availability of a rich information resource allows us to search for organizational properties concerning these complexes.
As the complexity of a protein complex in terms of the number of unique subunits increases, we observed that the number of such complexes and the mean non-synonymous to synonymous substitution ratio of associated genes tend to decrease. Similarly, as the number of different complexes a given protein participates in increases, the number of such proteins and the substitution ratio of the associated gene also tends to decrease. These observations provide evidence relating natural selection and the organization of mammalian complexes. We also observed greater homogeneity in terms of predicted protein isoelectric points, secondary structure and substitution ratio in annotated versus randomly generated complexes. A large proportion of the protein content and interactions in the complexes could be predicted from known binary protein-protein and domain-domain interactions. In particular, we found that large proteins interact preferentially with much smaller proteins.
We observed similar trends in yeast and other data. Our results support the existence of conserved relations associated with the mammalian protein complexes.
Skeletal muscle mass can be markedly reduced through a process called atrophy, as a consequence of many diseases or critical physiological and environmental situations. Atrophy is characterised by loss of contractile proteins and reduction of fiber volume. Although in the last decade the molecular aspects underlying muscle atrophy have received increased attention, the fine mechanisms controlling muscle degeneration are still incompletely understood. In this study we applied meta-analysis to gene expression signatures pertaining to different types of muscle atrophy for the identification of novel key regulatory signals implicated in these degenerative processes.
We found a general down-regulation of genes involved in energy production and carbohydrate metabolism and up-regulation of genes for protein degradation and catabolism. Six functional pathways occupy central positions in the molecular network obtained by the integration of atrophy transcriptome and molecular interaction data. They are TGF-β pathway, apoptosis, membrane trafficking/cytoskeleton organization, NFKB pathways, inflammation and reorganization of the extracellular matrix. Protein degradation pathway is evident only in the network specific for muscle short-term response to atrophy. TGF-β pathway plays a central role with proteins SMAD3/4, MYC, MAX and CDKN1A in the general network, and JUN, MYC, GNB2L1/RACK1 in the short-term muscle response network.
Our study offers a general overview of the molecular pathways and cellular processes regulating the establishment and maintenance of atrophic state in skeletal muscle, showing also how the different pathways are interconnected. This analysis identifies novel key factors that could be further investigated as potential targets for the development of therapeutic treatments. We suggest that the transcription factors SMAD3/4, GNB2L1/RACK1, MYC, MAX and JUN, whose functions have been extensively studied in tumours but only marginally in muscle, appear instead to play important roles in regulating muscle response to atrophy.
MicroRNAs (miRNAs) play key roles in mammalian gene expression and several cellular processes, including differentiation, development, apoptosis and cancer pathomechanisms. Recently the biological importance of primary cilia has been recognized in a number of human genetic diseases. Numerous disorders are related to cilia dysfunction, including polycystic kidney disease (PKD). Although involvement of certain genes and transcriptional networks in PKD development has been shown, little is known about how they are regulated at the molecular level.
Given the emerging role of miRNAs in gene expression, we explored the possibilities of miRNA-based regulations in PKD. Here, we analyzed the simultaneous expression changes of miRNAs and mRNAs by microarrays. 935 genes, classified into 24 functional categories, were differentially regulated between PKD and control animals. In parallel, 30 miRNAs were differentially regulated in PKD rats: our results suggest that several miRNAs might be involved in regulating genetic switches in PKD. Furthermore, we describe some newly detected miRNAs, miR-31 and miR-217, in the kidney which have not been reported previously. We determine functionally related gene sets, or pathways to reveal the functional correlation between differentially expressed mRNAs and miRNAs.
We find that the functional patterns of predicted miRNA targets and differentially expressed mRNAs are similar. Our results suggest an important role of miRNAs in specific pathways underlying PKD.
Acute changes in environmental parameters (e.g., O2, pH, UV, osmolarity, nutrients, etc.) evoke a common transcriptomic response in yeast referred to as the "environmental stress response" (ESR) or "common environmental response" (CER). Why such a diverse array of insults should elicit a common transcriptional response remains enigmatic. Previous functional analyses of the networks involved have found that, in addition to up-regulating those for mitigating the specific stressor, the majority appear to be involved in balancing energetic supply and demand and modulating progression through the cell cycle. Here we compared functional and regulatory aspects of the stress responses elicited by the acute inhibition of respiration with antimycin A and oxygen deprivation under catabolite non-repressed (galactose) conditions.
Gene network analyses of the transcriptomic responses revealed both treatments result in the transient (10 – 60 min) down-regulation of MBF- and SBF-regulated networks involved in the G1/S transition of the cell cycle as well as Fhl1 and PAC/RRPE-associated networks involved in energetically costly programs of ribosomal biogenesis and protein synthesis. Simultaneously, Msn2/4 networks involved in hexose import/dissimilation, reserve energy regulation, and autophagy were transiently up-regulated. Interestingly, when cells were treated with antimycin A well before experiencing anaerobiosis these networks subsequently failed to respond to oxygen deprivation. These results suggest the transient stress response is elicited by the acute inhibition of respiration and, we postulate, changes in cellular energetics and/or the instantaneous growth rate, not oxygen deprivation per se. After a considerable delay (≥ 1 generation) under anoxia, predictable changes in heme-regulated gene networks (e.g., Hap1, Hap2/3/4/5, Mot3, Rox1 and Upc2) were observed both in the presence and absence of antimycin A.
This study not only differentiates between the gene networks that respond to respiratory inhibition and those that respond to oxygen deprivation but suggests the function of the ESR or CER is to balance energetic supply/demand and coordinate growth with the cell cycle, whether in response to perturbations that disrupt catabolic pathways or those that require rapidly up-regulating energetically costly programs for combating specific stressors.
Phosphorylation by protein kinases is a common event in many cellular processes. Further, many kinases perform specialized roles and are regulated by non-kinase domains tethered to the kinase domain. Perturbation in the regulation of kinases leads to malignancy. We have identified and analysed putative protein kinases encoded in the genome of the chimpanzee, which is a close evolutionary relative of human.
The shared core biology between chimpanzee and human is characterized by many orthologous protein kinases which are involved in conserved pathways. Domain architectures specific to chimp/human kinases have been observed. Chimp kinases with unique domain architectures are characterized by deletion of one or more non-kinase domains in the human kinases. Interestingly, counterparts of some of the multi-domain human kinases in chimp are characterized by identical domain architectures but with a kinase-like non-kinase domain. Remarkably, for 160 of the 587 chimpanzee kinases, no human orthologue with greater than 95% sequence identity could be identified. Variations in chimpanzee kinases compared to human kinases are also brought about by differences in the functions of domains tethered to the catalytic kinase domain. For example, the heterodimer-forming PB1 domain, related to the fold of the ubiquitin/Ras-binding domain, is seen uniquely tethered to a PKC-like chimpanzee kinase.
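The 95% identity criterion mentioned above can be illustrated with a minimal percent-identity calculation on a pairwise alignment. The sketch assumes the two sequences are already aligned to equal length with "-" as the gap character and that identity is computed over ungapped columns only. Definitions of percent identity vary, so this denominator is one assumption among several possible ones, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def has_close_counterpart(aligned_a, aligned_b, threshold=95.0):
    """True when identity exceeds the threshold (95% as in the text)."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

For example, `percent_identity("MKT-LV", "MKTALI")` compares five ungapped columns, four of which match, giving 80%.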
Though the chimpanzee and human are evolutionarily very close, there are chimpanzee kinases with no close counterpart in the human, suggesting differences in their functions. This analysis provides a direction for experimental analysis of human and chimpanzee protein kinases in order to enhance our understanding of their specific biological roles.
Sodium channels are heteromultimeric, integral membrane proteins that belong to a superfamily of ion channels. Mutations in genes encoding sodium channel proteins have been linked with several inherited genetic disorders such as febrile epilepsy, Brugada syndrome, ventricular fibrillation, long QT syndrome, or channelopathy-associated insensitivity to pain. In spite of the significant effects that sodium channel proteins and genes can have on human health, there is no publicly available resource focused on sodium channels that supports exploration of sodium channel related information.
We report here Dragon Database for Exploration of Sodium Channels in Human (DDESC), which provides comprehensive information related to sodium channels regarding different entities, such as "genes and proteins", "metabolites and enzymes", "toxins", "chemicals with pharmacological effects", "disease concepts", "human anatomy", "pathways and pathway reactions" and their potential links. DDESC is compiled based on text- and data-mining. It allows users to explore potential associations between different entities related to sodium channels in human, as well as to automatically generate novel hypotheses.
DDESC is the first publicly available resource in which information related to sodium channels in human can be explored at different levels. This database is freely accessible for academic and non-profit users via the World Wide Web.
Increasing evidence shows that whole genomes of eukaryotes are almost entirely transcribed into both protein coding genes and an enormous number of non-protein-coding RNAs (ncRNAs). Therefore, revealing the underlying regulatory mechanisms of transcripts becomes imperative. However, for a complete understanding of transcriptional regulatory mechanisms, we need to identify the regions in which they are found. We will call these transcriptional regulation regions, or TRRs, which can be considered functional regions containing a cluster of regulatory elements that cooperatively recruit transcriptional factors for binding and then regulating the expression of transcripts.
We constructed a hierarchical stochastic language (HSL) model for the identification of core TRRs in yeast based on regulatory cooperation among TRR elements. The HSL model trained on yeast achieved comparable accuracy in predicting TRRs in other species, e.g., fruit fly, human, and rice, thus demonstrating the conservation of TRRs across species. The HSL model was also used to identify the TRRs of genes, such as p53 or OsALYL1, as well as microRNAs. In addition, the ENCODE regions were examined with the HSL model, and TRRs were found to be pervasively located in the genomes.
Our findings indicate that 1) the HSL model can be used to accurately predict core TRRs of transcripts across species and 2) core TRRs identified by HSL are proper candidates for the further scrutiny of specific regulatory elements and mechanisms. Meanwhile, the regulatory activity taking place in the abundant numbers of ncRNAs might account for the ubiquitous presence of TRRs across the genome. In addition, we also found that the TRRs of protein coding genes and ncRNAs are similar in structure, with the latter being more conserved than the former.
Seed oil accumulates primarily as triacylglycerol (TAG). While the biochemical pathway for TAG biosynthesis is known, its regulation remains unclear. Previous research identified microsomal diacylglycerol acyltransferase 1 (DGAT1, EC 2.3.1.20) as controlling a rate-limiting step in the TAG biosynthesis pathway. Of note, overexpression of DGAT1 results in substantial increases in oil content and seed size. To further analyze the global consequences of manipulating DGAT1 levels during seed development, a concerted transcriptome and metabolome analysis of transgenic B. napus prototypes was performed.
Using a targeted Brassica cDNA microarray, about 200 genes were differentially expressed in two independent transgenic lines analyzed. Interestingly, 24–33% of the targets showing significant changes have no matching gene in Arabidopsis although these represent only 5% of the targets on the microarray. Further analysis of some of these novel transcripts indicated that several are inducible by ABA in microspore-derived embryos. Of the 200 Arabidopsis genes implicated in lipid biology present on the microarray, 36 were found to be differentially regulated in DGAT transgenic lines. Furthermore, kinetic reverse transcriptase Polymerase Chain Reaction (k-PCR) analysis revealed up-regulation of genes encoding enzymes of the Kennedy pathway involved in assembly of TAGs. Hormone profiling indicated that levels of auxins and cytokinins varied between transgenic lines and untransformed controls, while differences in the pool sizes of ABA and catabolites were only observed at later stages of development.
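The over-representation of targets without an Arabidopsis match can be made concrete with a simple ratio. The following Python is a minimal sketch using only the fractions quoted above (24–33% of significant targets versus 5% of all targets on the array, and 36 of 200 lipid-related genes); the function name `fold_enrichment` is an illustrative assumption. A formal significance test, such as a hypergeometric test, would need the underlying counts, which are not given here.

```python
def fold_enrichment(observed_fraction: float, background_fraction: float) -> float:
    """Ratio of a category's share among significant targets to its share on the array."""
    if not 0.0 <= observed_fraction <= 1.0:
        raise ValueError("observed_fraction must be in [0, 1]")
    if not 0.0 < background_fraction <= 1.0:
        raise ValueError("background_fraction must be in (0, 1]")
    return observed_fraction / background_fraction


# Fractions quoted in the text: targets with no matching Arabidopsis gene.
background = 0.05                          # share of such targets on the microarray
low = fold_enrichment(0.24, background)    # lower bound across the two lines
high = fold_enrichment(0.33, background)   # upper bound across the two lines

# Share of lipid-biology genes on the array that changed in the DGAT lines.
lipid_share = 36 / 200

print(f"Fold enrichment of unmatched targets: {low:.1f}x to {high:.1f}x")
print(f"Lipid-related genes differentially regulated: {lipid_share:.0%}")
```

On these figures, targets without an Arabidopsis match are roughly 4.8- to 6.6-fold over-represented among the significant changes, and 18% of the lipid-related genes on the array were differentially regulated.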
Our results indicate that the increased TAG accumulation observed in transgenic DGAT1 plants is associated with modest transcriptional and hormonal changes during seed development that are not limited to the TAG biosynthesis pathway. These might be associated with feedback or feed-forward effects due to altered levels of DGAT1 activity. The fact that a large fraction of significant amplicons have no matching genes in Arabidopsis compromised our ability to draw concrete inferences from the data at this stage, but has led to the identification of novel genes of potential interest.
The fish pathogen Aliivibrio salmonicida is the causative agent of cold-water vibriosis in marine aquaculture. The Gram-negative bacterium causes tissue degradation, hemolysis and sepsis in vivo.
In total, 4,286 protein coding sequences were identified, and the 4.6 Mb genome of A. salmonicida has a six-partite architecture with two chromosomes and four plasmids. Sequence analysis revealed a highly fragmented genome structure caused by the insertion of an extensive number of insertion sequence (IS) elements. The IS elements can be related to important evolutionary events such as gene acquisition, gene loss and chromosomal rearrangements. New A. salmonicida functional capabilities that may have been acquired through horizontal DNA transfer include genes involved in iron acquisition and protein secretion, which may play roles in pathogenicity. On the other hand, the degeneration of 370 genes and consequent loss of specific functions suggest that A. salmonicida has a reduced metabolic and physiological capacity in comparison to related Vibrionaceae species.
Most prominent is the loss of several genes involved in the utilisation of the polysaccharide chitin. In particular, the disruption of three extracellular chitinases responsible for enzymatic breakdown of chitin makes A. salmonicida unable to grow on the polymer form of chitin. These and other losses could restrict the variety of carrier organisms A. salmonicida can attach to and associate with. Gene acquisition and gene loss may be related to the emergence of A. salmonicida as a fish pathogen.
Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends.
A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to ~11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary.
MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. Although the types and natures of epigenetic boundaries are barely understood at this time, MSLL technology flags both approximate boundaries and methylated genes that deserve additional investigation. MSLL and HMPR sequences provide a valuable resource for maize genome annotation, and are a uniquely valuable complement to any plant genome sequencing project. In order to make these results fully accessible to the community, a web display was developed that shows the alignment of MSLL, HMPR, and other gene-rich sequences to the BACs; this display is continually updated with the latest ESTs and BAC sequences.
The bacterial cell wall is the target of many antibiotics and cell envelope constituents are critical to host-pathogen interactions. To combat resistance development and virulence, a detailed knowledge of the individual factors involved is essential. Members of the LytR-CpsA-Psr family of cell envelope-associated attenuators are relevant for β-lactam resistance, biofilm formation, and stress tolerance, and they are suggested to play a role in cell wall maintenance. However, their precise function is still unknown. This study addresses the occurrence as well as sequence-based characteristics of the LytR-CpsA-Psr proteins.
A comprehensive list of LytR-CpsA-Psr proteins was established, and their phylogenetic distribution and clustering into subgroups was determined. LytR-CpsA-Psr proteins were present in all Gram-positive organisms, except for the cell wall-deficient Mollicutes and one strain of the Clostridiales. In contrast, the majority of Gram-negatives did not contain LytR-CpsA-Psr family members. Despite high sequence divergence, the LytR-CpsA-Psr domains of different subclusters shared a highly similar, predicted mixed α/β-structure, and conserved charged residues. PhoA fusion experiments, using MsrR of Staphylococcus aureus, confirmed membrane topology predictions and extracellular location of its LytR-CpsA-Psr domain.
The LytR-CpsA-Psr domain is unique to bacteria. The presence of diverse subgroups within the LytR-CpsA-Psr family might indicate functional differences, and could explain variations in phenotypes of respective mutants reported. The identified conserved structural elements and amino acids are likely to be important for the function of the domain and will help to guide future studies of the LytR-CpsA-Psr proteins.
Microsatellites or simple sequence repeats (SSRs) are a powerful choice of marker in the study of Phytophthora population biology, epidemiology, ecology, genetics and evolution. A strategy was tested in which the publicly available unigene datasets extracted from genome sequences of P. infestans, P. sojae and P. ramorum were mined for candidate SSR markers that could be applied to a wide range of Phytophthora species.
A first approach, aimed at the identification of polymorphic SSR loci common to many Phytophthora species, yielded 171 reliable sequences containing 211 SSRs. Microsatellites were identified from 16 target species representing the breadth of diversity across the genus. Repeat number ranged from 3 to 16, with most SSRs having seven repeats or fewer and four being the most common repeat number. Trinucleotide repeats such as (AAG)n, (AGG)n and (AGC)n were the most common, followed by pentanucleotide, tetranucleotide and dinucleotide repeats. A second approach was aimed at the identification of useful loci common to a restricted number of species more closely related to P. sojae (P. alni, P. cambivora, P. europaea and P. fragariae). This analysis yielded 10 trinucleotide and 2 tetranucleotide SSRs which were repeated 4, 5 or 6 times.
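To illustrate what mining a sequence for SSRs involves, the following Python is a minimal sketch that scans a DNA string for perfect tandem repeats using a regular expression with a backreference. It is not the pipeline used in this study: the function name `find_ssrs`, the motif length range (di- to pentanucleotide) and the minimum of three repeats are assumptions chosen to match the repeat classes and repeat numbers reported above, and the test sequence is invented.

```python
import re
from typing import List, NamedTuple


class SSR(NamedTuple):
    motif: str    # repeat unit, e.g. "AAG"
    repeats: int  # number of tandem copies
    start: int    # 0-based start position
    end: int      # end position (exclusive)


def find_ssrs(seq: str, min_unit: int = 2, max_unit: int = 5,
              min_repeats: int = 3) -> List[SSR]:
    """Find perfect tandem repeats with motif length min_unit..max_unit."""
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # ([ACGT]{unit}) captures a motif; \1{n,} requires n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AAA" or "ACAC"), so each locus is reported once.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append(SSR(motif, len(match.group(0)) // unit,
                            match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit.start)


# Invented example: an (AAG)4 and an (AC)5 repeat embedded in flanking sequence.
example = "TTGAC" + "AAG" * 4 + "CCGT" + "AC" * 5 + "GGA"
for ssr in find_ssrs(example):
    print(ssr)
```

This sketch only detects perfect repeats; imperfect or compound microsatellites, and the primer design needed to test loci across species, are outside its scope.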
Key studies on inter- and intra-specific variation of the selected microsatellites remain to be carried out. Despite the screening of conserved gene coding regions, the sequence diversity between species was high and the identification of useful SSR loci applicable to anything other than the most closely related pairs of Phytophthora species was challenging. That said, many novel SSR loci for species other than the three 'source species' (P. infestans, P. sojae and P. ramorum) are reported, offering great potential for the investigation of Phytophthora populations. In addition to the presence of microsatellites, many of the amplified regions may represent useful molecular marker regions for other studies as they are highly variable and easily amplifiable from different Phytophthora species.
The Tephritidae family of insects includes the most important agricultural pests of fruits and vegetables, belonging mainly to four genera (Bactrocera, Ceratitis, Anastrepha and Rhagoletis). The olive fruit fly, Bactrocera oleae, is the major pest of the olive fruit. Currently, its control is based on chemical insecticides. Environmentally friendlier methods, such as the Sterile Insect Technique (SIT), have been attempted in the past, albeit with limited success. This was mainly attributed to the lack of knowledge of the insect's behaviour, ecology and the genetic structure of natural populations. The development of molecular markers could facilitate access to the genome and contribute to the solution of the aforementioned problems. We chose to focus on microsatellite markers due to their abundance in the genome, high degree of polymorphism and ease of isolation.
Fifty-eight microsatellite-containing clones were isolated from the olive fly, Bactrocera oleae, bearing a total of sixty-two discrete microsatellite motifs. Forty-two primer pairs were designed on the unique sequences flanking the microsatellite motif and thirty-one of them amplified a PCR product of the expected size. The level of polymorphism was evaluated in wild and laboratory flies and the majority of the markers (93.5%) proved highly polymorphic. Thirteen of them mapped to a unique position on the olive fly polytene chromosomes by in situ hybridization; these can serve as anchors to correlate future genetic and cytological maps of the species, as well as entry points to the genome. Cross-species amplification of these markers in eleven Tephritidae species and sequencing of thirty-one of the amplified products revealed a varying degree of conservation that declines outside the Bactrocera genus.
Microsatellite markers are very powerful tools for genetic and population analyses, particularly in species deprived of any other means of genetic analysis. The presented set of microsatellite markers possesses all features that would render them useful in such analyses. This could also prove helpful for species where SIT is a desired outcome, since the development of effective SIT can be aided by detailed knowledge at the genetic and molecular level. Furthermore, their presented efficacy in several other species of the Tephritidae family not only makes them useful for their analysis but also provides tools for phylogenetic comparisons among them.