1.  MacroH2A1 knockdown effects on the Peg3 imprinted domain 
BMC Genomics  2007;8:479.
MacroH2A1 is a histone variant that is closely associated with the repressed regions of chromosomes. A recent study revealed that this histone variant is highly enriched in the inactive alleles of Imprinting Control Regions (ICRs).
The current study investigates the potential roles of macroH2A1 in genomic imprinting by lowering the cellular levels of the macroH2A1 protein. RNAi-based macroH2A1 knockdown experiments in Neuro2A cells changed the expression levels of a subset of genes, including Peg3 and Usp29 of the Peg3 domain. The expression of these genes was down-regulated, rather than up-regulated, in response to reduced protein levels of the potential repressor macroH2A1. This down-regulation was not accompanied with changes in the DNA methylation status of the Peg3 domain.
MacroH2A1 may not function as a transcriptional repressor for this domain; instead, it may participate in heterochromatin formation, with functions yet to be discovered.
PMCID: PMC2241636  PMID: 18166131
2.  Single nucleotide polymorphisms (SNPs) are highly conserved in rhesus (Macaca mulatta) and cynomolgus (Macaca fascicularis) macaques 
BMC Genomics  2007;8:480.
Macaca fascicularis (cynomolgus or longtail macaques) is the most commonly used non-human primate in biomedical research. Little is known about the genomic variation in cynomolgus macaques or how the sequence variants compare to those of the well-studied related species, Macaca mulatta (rhesus macaque). Previously we identified single nucleotide polymorphisms (SNPs) in portions of 94 rhesus macaque genes and reported that Indian and Chinese rhesus had largely different SNPs. Here we identify SNPs from some of the same genomic regions of cynomolgus macaques (from Indochina, Indonesia, Mauritius and the Philippines) and compare them to the SNPs found in rhesus.
We sequenced a portion of 10 genes in 20 cynomolgus macaques. We identified 69 SNPs in these regions, compared with 71 SNPs found in the same genomic regions of 20 Indian and Chinese rhesus macaques. Thirty-six (52%) of the M. fascicularis SNPs were found in both species. The majority (70%) of the SNPs found in both Chinese and Indian rhesus macaque populations were also present in M. fascicularis. Of the SNPs previously found in a single rhesus population, 38% (Indian) and 44% (Chinese) were also identified in cynomolgus macaques. In an alternative approach, we genotyped 100 cynomolgus DNAs using a rhesus macaque SNP array representing 53 genes and found that 51% (29/57) of the rhesus SNPs were present in M. fascicularis. Comparisons of SNP profiles from cynomolgus macaques imported from breeding centers in China (where M. fascicularis are not native) showed they were similar to those from Indochina.
This study demonstrates a surprisingly high conservation of SNPs between M. fascicularis and M. mulatta, suggesting that the relationship of these two species is closer than that suggested by morphological and mitochondrial DNA analysis alone. These findings indicate that SNP discovery efforts in either species will generate useful resources for both macaque species. Identification of SNPs that are unique to regional populations of cynomolgus macaques indicates that location-specific SNPs could be used to distinguish monkeys of uncertain origin. As an example, cynomolgus macaques obtained from 2 different breeding centers in China were shown to have Indochinese ancestry.
PMCID: PMC2248198  PMID: 18166133
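The overlap percentages quoted in the macaque SNP abstract above can be checked with a few lines of arithmetic. This is only a sketch: the helper name `percent` is hypothetical, and the counts are the ones stated in the abstract (36 of 69 sequenced SNPs, 29 of 57 array SNPs).

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


# Counts taken from the abstract; the variable names are illustrative only.
shared_by_sequencing = percent(36, 69)  # M. fascicularis SNPs also seen in rhesus
shared_on_array = percent(29, 57)       # rhesus array SNPs present in M. fascicularis

print(f"sequencing: {shared_by_sequencing}%  array: {shared_on_array}%")
```

Both approaches land close to one half, which is the basis for the abstract's claim of high SNP conservation between the two species.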
3.  Conservation and divergence of microRNAs in Populus 
BMC Genomics  2007;8:481.
MicroRNAs (miRNAs) are small RNAs (sRNA) ~21 nucleotides in length that negatively control gene expression by cleaving or inhibiting the translation of target gene transcripts. miRNAs have been extensively analyzed in Arabidopsis and rice and partially investigated in other non-model plant species. To date, 109 and 62 miRNA families have been identified in Arabidopsis and rice respectively. However, only 33 miRNAs have been identified from the genome of the model tree species (Populus trichocarpa), of which 11 are Populus specific. The low number of miRNA families previously identified in Populus, compared with the number of families identified in Arabidopsis and rice, suggests that many miRNAs still remain to be discovered in Populus. In this study, we analyzed expressed small RNAs from leaves and vegetative buds of Populus using high throughput pyrosequencing.
Analysis of almost eighty thousand small RNA reads allowed us to identify 123 new sequences belonging to previously identified miRNA families as well as 48 new miRNA families that could be Populus-specific. Comparison of the organization of miRNA families in Populus, Arabidopsis and rice showed that miRNA family sizes were generally expanded in Populus. The putative targets of non-conserved miRNA include both previously identified targets as well as several new putative target genes involved in development, resistance to stress, and other cellular processes. Moreover, almost half of the genes predicted to be targeted by non-conserved miRNAs appear to be Populus-specific. Comparative analyses showed that genes targeted by conserved and non-conserved miRNAs are biased mainly towards development, electron transport and signal transduction processes. Similar results were found for non-conserved miRNAs from Arabidopsis.
Our results suggest that while there is a conserved set of miRNAs among plant species, a large fraction of miRNAs vary among species. The non-conserved miRNAs may regulate cellular, physiological or developmental processes specific to the taxa that produce them, as appears likely to be the case for those miRNAs that have only been observed in Populus. Non-conserved and conserved miRNAs seem to target genes with similar biological functions, indicating that similar selection pressures are acting on both types of miRNAs. The expansion in the number of most conserved miRNAs in Populus relative to Arabidopsis may be linked to the recent genome duplication in Populus, the slow evolution of the Populus genome, or to differences in the selection pressure on duplicated miRNAs in these species.
PMCID: PMC2270843  PMID: 18166134
4.  A large-scale proteomic analysis of human embryonic stem cells 
BMC Genomics  2007;8:478.
Much of our current knowledge of the molecular expression profile of human embryonic stem cells (hESCs) is based on transcriptional approaches. These analyses are only partly predictive of protein expression, however, and do not shed light on post-translational regulation, leaving a large gap in our knowledge of the biology of pluripotent stem cells.
Here we describe the use of two large-scale western blot assays to identify over 600 proteins expressed in undifferentiated hESCs, and highlight over 40 examples of multiple gel mobility variants, which are suspected protein isoforms and/or post-translational modifications. Twenty-two phosphorylation events in cell signaling molecules, as well as potential new markers of undifferentiated hESCs were also identified. We confirmed the expression of a subset of the identified proteins by immunofluorescence and correlated the expression of transcript and protein for key molecules in active signaling pathways in hESCs. These analyses also indicated that hESCs exhibit several features of polarized epithelia, including expression of tight junction proteins.
Our approach complements proteomic and transcriptional analysis to provide unique information on human pluripotent stem cells, and is a framework for the continued analyses of self-renewal.
PMCID: PMC2211323  PMID: 18162134
5.  Gene response profiles for Daphnia pulex exposed to the environmental stressor cadmium reveals novel crustacean metallothioneins 
BMC Genomics  2007;8:477.
Genomic research tools such as microarrays are proving to be important resources to study the complex regulation of genes that respond to environmental perturbations. A first generation cDNA microarray was developed for the environmental indicator species Daphnia pulex, to identify genes whose regulation is modulated following exposure to the metal stressor cadmium. Our experiments revealed interesting changes in gene transcription that suggest their biological roles and their potentially toxicological features in responding to this important environmental contaminant.
Our microarray identified genes reported in the literature to be regulated in response to cadmium exposure, suggested functional attributes for genes that share no sequence similarity to proteins in the public databases, and pointed to genes that are likely members of expanded gene families in the Daphnia genome. Genes identified on the microarray also were associated with cadmium induced phenotypes and population-level outcomes that we experimentally determined. A subset of genes regulated in response to cadmium exposure was independently validated using quantitative-realtime (Q-RT)-PCR. These microarray studies led to the discovery of three genes coding for the metal detoxication protein metallothionein (MT). The gene structures and predicted translated sequences of D. pulex MTs clearly place them in this gene family. Yet, they share little homology with previously characterized MTs.
The genomic information obtained from this study represents an important first step in characterizing microarray patterns that may be diagnostic to specific environmental contaminants and give insights into their toxicological mechanisms, while also providing a practical tool for evolutionary, ecological, and toxicological functional gene discovery studies. Advances in Daphnia genomics will enable the further development of this species as a model organism for the environmental sciences.
PMCID: PMC2234263  PMID: 18154678
6.  Comparative genomic characterization of citrus-associated Xylella fastidiosa strains 
BMC Genomics  2007;8:474.
The xylem-inhabiting bacterium Xylella fastidiosa (Xf) is the causal agent of Pierce's disease (PD) in vineyards and citrus variegated chlorosis (CVC) in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, has been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding of the genomic composition of South American strains – which is particularly important for CVC-associated strains.
This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH), identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria.
Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly identified ORFs, obtained by SSH, represent an approximately 10% increase in our current knowledge of the South American Xf gene pool and include new putative virulence factors, as well as novel potential markers for strain identification. Surprisingly, this list of novel elements includes sequences previously believed to be unique to North American strains, pointing to the necessity of revising the list of specific markers that may be used for identification of distinct Xf strains.
PMCID: PMC2262912  PMID: 18154652
7.  Microarray analysis of iron deficiency chlorosis in near-isogenic soybean lines 
BMC Genomics  2007;8:476.
Iron is one of fourteen mineral elements required for proper plant growth and development of soybean (Glycine max L. Merr.). Soybeans grown on calcareous soils, which are prevalent in the upper Midwest of the United States, often exhibit symptoms indicative of iron deficiency chlorosis (IDC). Yield loss has a positive linear correlation with increasing severity of chlorotic symptoms. As soybean is an important agronomic crop, it is essential to understand the genetics and physiology of traits affecting plant yield. Soybean cultivars vary greatly in their ability to respond successfully to iron deficiency stress. Microarray analyses permit the identification of genes and physiological processes involved in soybean's response to iron stress.
RNA isolated from the roots of two near isogenic lines, which differ in iron efficiency, PI 548533 (Clark; iron efficient) and PI 547430 (IsoClark; iron inefficient), were compared on a spotted microarray slide containing 9,728 cDNAs from root specific EST libraries. A comparison of RNA transcripts isolated from plants grown under iron limiting hydroponic conditions for two weeks revealed 43 genes as differentially expressed. A single linkage clustering analysis of these 43 genes showed 57% of them possessed high sequence similarity to known stress induced genes. A control experiment comparing plants grown under adequate iron hydroponic conditions showed no differences in gene expression between the two near isogenic lines. Expression levels of a subset of the differentially expressed genes were also compared by real time reverse transcriptase PCR (RT-PCR). The RT-PCR experiments confirmed differential expression between the iron efficient and iron inefficient plants for 9 of 10 randomly chosen genes examined. To gain further insight into the iron physiological status of the plants, the root iron reductase activity was measured in both iron efficient and inefficient genotypes for plants grown under iron sufficient and iron limited conditions. Iron inefficient plants failed to respond to decreased iron availability with increased activity of Fe reductase.
These experiments have identified genes involved in the soybean iron deficiency chlorosis response under iron deficient conditions. Single linkage cluster analysis suggests iron limited soybeans mount a general stress response as well as a specialized iron deficiency stress response. Root membrane bound reductase capacity is often correlated with iron efficiency. Under iron-limited conditions, the iron efficient plants had high root membrane bound reductase capacity while the iron inefficient plants' reductase levels remained low, further limiting iron uptake through the root. Many of the genes up-regulated in the iron inefficient NIL are involved in known stress induced pathways. The most striking response of the iron inefficient genotype to iron deficiency stress was the induction of a profusion of signaling and regulatory genes, presumably in an attempt to establish and maintain cellular homeostasis. Genes were up-regulated that point toward an increased transport of molecules through membranes. Genes associated with reactive oxygen species (ROS) and an ROS-defensive enzyme were also induced. The up-regulation of genes involved in DNA repair and RNA stability reflects the inhospitable cellular environment resulting from iron deficiency stress. Other genes were induced that are involved in protein and lipid catabolism, perhaps as an effort to maintain carbon flow and scavenge energy. The under-expression of a key glycolytic gene may result in the iron-inefficient genotype being energetically challenged to maintain a stable cellular environment. These experiments have identified candidate genes and processes for further experimentation to increase our understanding of soybeans' response to iron deficiency stress.
PMCID: PMC2253546  PMID: 18154662
8.  Surviving extreme polar winters by desiccation: clues from Arctic springtail (Onychiurus arcticus) EST libraries 
BMC Genomics  2007;8:475.
Ice, snow and temperatures of -14°C are conditions which most animals would find difficult, if not impossible, to survive in. However, this exactly describes the Arctic winter, and the Arctic springtail Onychiurus arcticus regularly survives these extreme conditions and re-emerges in the spring. It is able to do this by reducing the amount of water in its body to almost zero: a process that is called "protective dehydration". The aim of this project was to generate clones and sequence data in the form of ESTs to provide a platform for the future molecular characterisation of the processes involved in protective dehydration.
Five normalised libraries were produced from both desiccating and rehydrating populations of O. arcticus from stages that had previously been defined as potentially informative for molecular analyses. A total of 16,379 EST clones were generated and analysed using Blast and GO annotation. 40% of the clones produced significant matches against the Swiss-Prot and TrEMBL databases and these were further analysed using GO annotation. Extraction and analysis of GO annotations proved an extremely effective method for identifying generic processes associated with biochemical pathways, proving more efficient than solely analysing Blast data output. A number of genes were identified which have previously been shown to be involved in water transport and desiccation, such as members of the aquaporin family. Identification of these clones in specific libraries associated with desiccation validates the computational analysis by library rather than producing a global overview of all libraries combined.
This paper describes for the first time EST data from the arctic springtail (O. arcticus). This significantly enhances the number of Collembolan ESTs in the public databases, providing useful comparative data within this phylum. The use of GO annotation for analysis has facilitated the identification of a wide variety of ESTs associated with a number of different biochemical pathways involved in the dehydration and recovery process in O. arcticus.
PMCID: PMC2246132  PMID: 18154659
9.  Maintenance of transposon-free regions throughout vertebrate evolution 
BMC Genomics  2007;8:470.
We recently reported the existence of large numbers of regions up to 80 kb long that lack transposon insertions in the human, mouse and opossum genomes. These regions are significantly associated with loci involved in developmental and transcriptional regulation.
Here we report that transposon-free regions (TFRs) are prominent genomic features of amphibian and fish lineages, and that many have been maintained throughout vertebrate evolution, although most transposon-derived sequences have entered these lineages after their divergence. The zebrafish genome contains 470 TFRs over 10 kb and a further 3,951 TFRs over 5 kb, which is comparable to the number identified in mammals. Two thirds of zebrafish TFRs over 10 kb are orthologous to TFRs in at least one mammal, and many have orthologous TFRs in all three mammalian genomes as well as in the genome of Xenopus tropicalis. This indicates that the mechanism responsible for the maintenance of TFRs has been active at these loci for over 450 million years. However, the majority of TFR bases cannot be aligned between distantly related species, demonstrating that TFRs are not the by-product of strong primary sequence conservation. Syntenically conserved TFRs are also more enriched for regulatory genes compared to lineage-specific TFRs.
We suggest that TFRs contain extended regulatory sequences that contribute to the precise expression of genes central to early vertebrate development, and can be used as predictors of important regulatory regions.
PMCID: PMC2241635  PMID: 18093339
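The notion of a transposon-free region in the abstract above can be illustrated as a gap-finding problem over annotated transposon intervals. The sketch below is not the authors' pipeline: the function name, the half-open coordinate convention, and the toy coordinates are all assumptions made for illustration, and a real analysis would also have to deal with assembly gaps and annotation quality.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates in base pairs


def transposon_free_regions(
    transposons: List[Interval],
    chrom_length: int,
    min_length: int = 10_000,
) -> List[Interval]:
    """Return gaps of at least min_length bp that contain no transposon."""
    regions: List[Interval] = []
    cursor = 0  # end of the right-most transposon seen so far
    for start, end in sorted(transposons):
        if start - cursor >= min_length:
            regions.append((cursor, start))
        cursor = max(cursor, end)  # tolerate overlapping annotations
    if chrom_length - cursor >= min_length:
        regions.append((cursor, chrom_length))
    return regions


# Toy chromosome of 50 kb with three annotated transposons (made-up values).
toy = [(12_000, 12_500), (15_000, 16_000), (40_000, 41_000)]
print(transposon_free_regions(toy, 50_000))                    # 10 kb threshold
print(transposon_free_regions(toy, 50_000, min_length=5_000))  # 5 kb threshold
```

Lowering the threshold from 10 kb to 5 kb admits additional, shorter gaps, which mirrors how the abstract reports separate counts for the two size classes.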
10.  The fungus Ustilago maydis and humans share disease-related proteins that are not found in Saccharomyces cerevisiae 
BMC Genomics  2007;8:473.
The corn smut fungus Ustilago maydis is a well-established model system for molecular phytopathology. In addition, it recently became evident that U. maydis and humans share proteins and cellular processes that are not found in the standard fungal model Saccharomyces cerevisiae. This prompted us to do a comparative analysis of the predicted proteome of U. maydis, S. cerevisiae and humans.
At a cut-off of 20% identity over protein length, all three organisms share 1738 proteins, whereas both fungi share only 541 conserved proteins. Despite the evolutionary distance between U. maydis and humans, 777 proteins were shared. When applying a more stringent criterion (≥ 20% identity with a homologue in one organism over at least 50 amino acids and ≥ 10% less in the other organism), we found 681 proteins for the comparison of U. maydis and humans, whereas both fungi share only 622 fungal specific proteins. Finally, we found that S. cerevisiae and humans shared 312 proteins. In the U. maydis to H. sapiens homology set 454 proteins are functionally classified and 42 proteins are related to serious human diseases. However, a large portion of 222 proteins are of unknown function.
The fungus U. maydis has a long history of being a model system for understanding DNA recombination and repair, as well as molecular plant pathology. The identification of functionally uncharacterized genes that are conserved in humans and U. maydis opens the door for experimental work, which promises new insight into the cell biology of the mammalian cell.
PMCID: PMC2262911  PMID: 18096044
11.  Functional characterization of two novel 5' untranslated exons reveals a complex regulation of NOD2 protein expression 
BMC Genomics  2007;8:472.
NOD2 is an innate immune receptor for the bacterial cell wall component muramyl-dipeptide. Mutations in the leucine-rich repeat region of NOD2, which lead to an impaired recognition of muramyl-dipeptide, have been associated with Crohn disease, a human chronic inflammatory bowel disease. Tissue specific constitutive and inducible expression patterns of NOD2 have been described that result from complex regulatory events for which the molecular mechanisms are not yet fully understood.
We have identified two novel exons of the NOD2 gene (designated exon 1a and 1b), which are spliced to the canonical exon 2 and constitute the 5' untranslated region of two alternative transcript isoforms (i.e. exon 1a/1b/2 and exon 1a/2). The two novel transcripts are abundantly expressed and seem to comprise the majority of NOD2 transcripts under physiological conditions. We confirm the expression of the previously known canonical first exon (designated exon 1c) of the gene in unstimulated mononuclear cells. The inclusion of the second alternative exon 1b, which harbours three short upstream open reading frames (uORFs), is downregulated upon stimulation with TNF-α or under pro-inflammatory conditions in the inflamed intestinal mucosa in vivo. Using the different 5' UTR splice forms fused to a firefly luciferase (LUC) reporter we demonstrate a rapamycin-sensitive inhibitory effect of the uORFs on translation efficacy.
The differential usage of two alternative promoters in the NOD2 gene leads to tissue-specific and context-dependent NOD2 transcript isoform patterns. We demonstrate for the first time that context-dependent alternative splicing is linked to uORF-mediated translational repression. The results suggest complex parallel control mechanisms that independently regulate NOD2 expression in the context of inflammatory signaling.
PMCID: PMC2228316  PMID: 18096043
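The upstream open reading frames (uORFs) mentioned in the NOD2 abstract above are short ATG-to-stop reading frames lying within the 5' untranslated region. A minimal sketch of how such frames could be located in a DNA string follows. The function name and the toy sequence are hypothetical, the sequence is not taken from NOD2, and the sketch deliberately ignores uORFs that run past the end of the UTR into the main coding sequence.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr: str) -> List[Tuple[int, int]]:
    """Return (start, end) for each ATG...stop frame fully inside the UTR.

    Coordinates are 0-based; end is exclusive and includes the stop codon.
    """
    seq = utr.upper()
    orfs: List[Tuple[int, int]] = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3))
                break
    return orfs


# Made-up 24-nt example containing two short uORFs.
print(find_uorfs("CCATGAAATAGGGATGCCCTGACC"))
```

Each reported pair brackets one candidate uORF; whether a given frame actually represses translation is an experimental question, as in the luciferase reporter assays the abstract describes.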
12.  AphanoDB: a genomic resource for Aphanomyces pathogens 
BMC Genomics  2007;8:471.
The Oomycete genus Aphanomyces comprises devastating plant and animal pathogens. However, little is known about the molecular mechanisms underlying pathogenicity of Aphanomyces species. In this study, we report on the development of a public database called AphanoDB which is dedicated to Aphanomyces genomic data. As a first step, a large collection of Expressed Sequence Tags was obtained from the legume pathogen A. euteiches, which was then processed and collected into AphanoDB.
Two cDNA libraries of A. euteiches were created: one from mycelium growing on synthetic medium and one from mycelium grown in contact with root tissues of the model legume Medicago truncatula. From these libraries, 18,684 expressed sequence tags were obtained and assembled into 7,977 unigenes which were compared to public databases for annotation. Queries on AphanoDB allow the users to retrieve information for each unigene including similarity to known protein sequences, protein domains and Gene Ontology classification. Statistical analysis of EST frequency from the two different growth conditions was also added to the database.
AphanoDB is a public database with a user-friendly web interface. The sequence report pages are the main web interface which provides all annotation details for each unigene. These interactive sequence report pages are easily available through text, BLAST, Gene Ontology and expression profile search utilities. AphanoDB is available from URL: .
PMCID: PMC2228315  PMID: 18096036
13.  Rice transposable elements are characterized by various methylation environments in the genome 
BMC Genomics  2007;8:469.
Recent studies using high-throughput methods have revealed that transposable elements (TEs) are a comprehensive target for DNA methylation. However, the relationship between TEs and their genomic environment regarding methylation still remains unclear. The rice genome contains representatives of all known TE families with different characteristics of chromosomal distribution, structure, transposition, size, and copy number. Here we studied the DNA methylation state around 12 TEs in nine genomic DNAs from cultivated rice strains and their closely related wild strains.
We employed a transposon display (TD) method to analyze the methylation environments in the genomes. The 12 TE families, consisting of four class I elements, seven class II elements, and one element of a different class, were differentially distributed in the rice chromosomes: some elements were concentrated in the centromeric or pericentromeric regions, but others were located in euchromatic regions. The TD analyses revealed that the TE families were embedded in flanking sequences with different methylation degrees. Each TE had flanking sequences with similar degrees of methylation among the nine rice strains. The class I elements tended to be present in highly methylated regions, while those of the class II elements showed widely varying degrees of methylation. In some TE families, the degrees of methylation were markedly lower than the average methylation state of the genome. In two families, dramatic changes of the methylation state occurred depending on the distance from the TE.
Our results demonstrate that the TE families in the rice genomes can be characterized by the methylation states of their surroundings. The copy number and degree of conservation of the TE family are not likely to be correlated with the degree of methylation. We discuss possible relationships between the methylation state of TEs and their surroundings. This is the first report demonstrating that TEs in the genome are associated with a particular methylation environment that is a feature of a given TE.
PMCID: PMC2222647  PMID: 18093338
14.  Serial Analysis of Gene Expression in Plasmodium berghei salivary gland sporozoites 
BMC Genomics  2007;8:466.
The invasion of Anopheles salivary glands by Plasmodium sporozoites is an essential step for transmission of the parasite to the vertebrate host. Salivary gland sporozoites undergo a developmental programme to express genes required for their journey from the site of the mosquito bite to the liver and subsequent invasion of, and development within, hepatocytes. A Serial Analysis of Gene Expression was performed on Anopheles gambiae salivary glands infected or not with Plasmodium berghei and we report here the analysis of the Plasmodium sporozoite transcriptome.
Annotation of 530 tag sequences homologous to Plasmodium berghei genomic sequences identified 123 genes expressed in salivary gland sporozoites and these genes were classified according to their transcript abundance. A subset of these genes was further studied by quantitative PCR to determine their expression profiles. This revealed that sporozoites modulate their RNA amounts not only between the midgut and salivary glands, but also during their storage within the latter. Among the 123 genes, the expression of 66 is described for the first time in sporozoites of rodent Plasmodium species.
These novel sporozoite expressed genes, especially those expressed at high levels in salivary gland sporozoites, are likely to play a role in Plasmodium infectivity in the mammalian host.
PMCID: PMC2263065  PMID: 18093287
15.  Identification of chromosomal alpha-proteobacterial small RNAs by comparative genome analysis and detection in Sinorhizobium meliloti strain 1021 
BMC Genomics  2007;8:467.
Small untranslated RNAs (sRNAs) seem to be far more abundant than previously believed. The number of sRNAs confirmed in E. coli through various approaches is above 70, with several hundred more sRNA candidate genes under biological validation. Although the total number of sRNAs in any one species is still unclear, their importance in cellular processes has been established. However, unlike protein genes, no simple feature enables the prediction of the location of the corresponding sequences in genomes. Several approaches, of variable usefulness, to identify genomic sequences encoding sRNA have been described in recent years.
We used a combination of in silico comparative genomics and microarray-based transcriptional profiling. This approach to screening identified ~60 intergenic regions conserved between Sinorhizobium meliloti and related members of the alpha-proteobacteria sub-group 2. Of these, 14 appear to correspond to novel non-coding sRNAs and three are putative peptide-coding or 5' UTR RNAs (ORF smaller than 100 aa). The expression of each of these new small RNA genes was confirmed by Northern blot hybridization.
Small non coding RNA (sra) genes can be found in the intergenic regions of alpha-proteobacteria genomes. Some of these sra genes are only present in S. meliloti, sometimes in genomic islands; homologues of others are present in related genomes including those of the pathogens Brucella and Agrobacterium.
PMCID: PMC2245857  PMID: 18093320
16.  Specific elements of the glyoxylate pathway play a significant role in the functional transition of the soybean cotyledon during seedling development 
BMC Genomics  2007;8:468.
The soybean (Glycine max) cotyledon is a specialized tissue whose main function is to serve as a nutrient reserve that supplies the needs of the young plant throughout seedling development. During this process the cotyledons experience a functional transition to a mainly photosynthetic tissue. To identify at the genetic level the specific active elements that participate in the natural transition of the cotyledon from storage to photosynthetic activity, we studied the transcript abundance profile at different time points using a new soybean oligonucleotide chip containing 19,200 probes (70-mer long).
After normalization and statistical analysis we determined that 3,594 genes presented a statistically significant altered expression in relation to the imbibed seed in at least one of the time points defined for the study. Detailed analysis of these data identified individual, specific elements of the glyoxylate pathway that play a fundamental role during the functional transition of the cotyledon from nutrient storage to photosynthesis. The dynamics between glyoxysomes and peroxisomes is evident during this series of events. We also identified several other genes whose products could participate co-ordinately throughout the functional transition and the associated mechanisms of control and regulation, and we described multiple unknown genetic elements that by association have the potential to make a major contribution to this biological process.
We demonstrate that the global transcript profile of the soybean cotyledon during seedling development is extremely active, highly regulated and dynamic. We defined the expression profiles of individual gene family members, enzymatic isoforms and protein subunits and classified them according to their involvement in different functional activities relevant to seedling development and the cotyledonary functional transition in soybean, especially the ones associated with the glyoxylate cycle. Our data suggest that in the soybean cotyledon a very complex and synchronized system of control and regulation of several metabolic pathways is essential to carry out the necessary functions during this developmental process.
PMCID: PMC2234262  PMID: 18093333
17.  Mosquito transcriptome changes and filarial worm resistance in Armigeres subalbatus 
BMC Genomics  2007;8:463.
Armigeres subalbatus is a natural vector of the filarial worm Brugia pahangi, but it rapidly and proficiently kills Brugia malayi microfilariae by melanotic encapsulation. Because B. malayi and B. pahangi are morphologically and biologically similar, the Armigeres-Brugia system serves as a valuable model for studying the resistance mechanisms in mosquito vectors. We have initiated transcriptome profiling studies in Ar. subalbatus to identify molecular components involved in B. malayi refractoriness.
These initial studies assessed the transcriptional response of Ar. subalbatus to B. malayi at 1, 3, 6, 12, 24, 48, and 72 hrs after an infective blood feed. In this investigation, we initiated the first holistic study conducted on the anti-filarial worm immune response in order to effectively explore the functional roles of immune-response genes following a natural exposure to the parasite. Studies assessing the transcriptional response revealed the involvement of unknown and conserved unknowns, cytoskeletal and structural components, and stress and immune responsive factors. The data show that the anti-filarial worm immune response of Ar. subalbatus is a highly complex, tissue-specific process involving varied effector responses working in concert with blood cell-mediated melanization.
This initial study provides a foundation and direction for future studies, which will more fully dissect the nature of the anti-filarial worm immune response in this mosquito-parasite system. The study also argues for continued studies with RNA generated from both hemocytes and whole bodies to fully expound the nature of the anti-filarial worm immune response.
PMCID: PMC2234435  PMID: 18088420
18.  Literature Lab: a method of automated literature interrogation to infer biology from microarray analysis 
BMC Genomics  2007;8:461.
The biomedical literature is a rich source of associative information but too vast for complete manual review. We have developed an automated method of literature interrogation called "Literature Lab" that identifies and ranks associations existing in the literature between gene sets, such as those derived from microarray experiments, and curated sets of key terms (e.g. pathway names and medical subject heading (MeSH) terms).
Literature Lab was developed using differentially expressed gene sets from three previously published cancer experiments and tested on a fourth, novel gene set. When applied to the gene sets from the published data, including an in vitro experiment, an in vivo mouse experiment, and an experiment with human tumor samples, Literature Lab correctly identified known biological processes occurring within each experiment. When applied to a novel set of genes differentially expressed between locally invasive and metastatic prostate cancer, Literature Lab identified a strong association between the pathway term "FOSB" and genes with increased expression in metastatic prostate cancer. Immunohistochemistry subsequently confirmed increased nuclear FOSB staining in metastatic compared to locally invasive prostate cancers.
This work demonstrates that Literature Lab can discover key biological processes by identifying meritorious associations between experimentally derived gene sets and key terms within the biomedical literature.
PMCID: PMC2244637  PMID: 18088408
19.  Construction and characterization of an expressed sequenced tag library for the mosquito vector Armigeres subalbatus 
BMC Genomics  2007;8:462.
The mosquito, Armigeres subalbatus, mounts a distinctively robust innate immune response when infected with the nematode Brugia malayi, a causative agent of lymphatic filariasis. In order to mine the transcriptome for new insight into the cascade of events that takes place in response to infection in this mosquito, 6 cDNA libraries were generated from tissues of adult female mosquitoes subjected to immune-response activation treatments that lead to well-characterized responses, and from aging, naïve mosquitoes. Expressed sequence tags (ESTs) from each library were produced, annotated, and subjected to comparative analyses.
Six libraries were constructed and used to generate 44,940 expressed sequence tags, of which 38,079 passed quality filters to be included in the annotation project and subsequent analyses. All of these sequences were collapsed into clusters resulting in 8,020 unique sequence clusters or singletons. EST clusters were annotated and curated manually within the ASAP (A Systematic Annotation Package for Community Analysis of Genomes) web portal according to BLAST results from comparisons to GenBank, and the Anopheles gambiae and Drosophila melanogaster genome projects.
The resulting dataset is the first of its kind for this mosquito vector and provides a basis for future studies of mosquito vectors regarding the cascade of events that occurs in response to infection, thereby providing insight into vector competence and innate immunity.
PMCID: PMC2262096  PMID: 18088419
20.  Diversity in conserved genes in tomato 
BMC Genomics  2007;8:465.
Tomato has excellent genetic and genomic resources including a broad set of Expressed Sequence Tag (EST) data and high-density genetic maps. In addition, emerging physical maps and bacterial artificial clone sequence data serve as template to investigate genetic variation within the cultivated germplasm pool with the goal to manipulate agriculturally important traits. Unfortunately, the nearly exclusive focus of resource development on interspecific populations for genetic analyses and diversity studies has left a void in our understanding of genotypic variation within tomato breeding programs that focus on intra-specific populations. We describe the results of a study to identify nucleotide variation within tomato breeding germplasm and mapping parents for a set of conserved single-copy ESTs that are orthologous between tomato and Arabidopsis.
Using a pooled sequencing strategy, 967 tomato transcripts were screened for polymorphism in 12 tomato lines. Although intron position was conserved, intron lengths were 2-fold larger in tomato than in Arabidopsis. A total of 1,487 single nucleotide polymorphisms and 282 insertion/deletions were identified, of which 579 and 206 were polymorphic in breeding germplasm, respectively. Fresh market and processing germplasm were clearly divergent, as were Solanum lycopersicum var. cerasiformae and Solanum pimpinellifolium, tomato's closest relatives. The polymorphisms identified serve as marker resources for tomato. The COS is also applicable to other Solanaceae crops.
The results from this research enabled significant progress towards bridging the gap between genetic and genomic resources developed for populations derived from wide crosses and those applicable to intra-specific crosses for breeding in tomato.
PMCID: PMC2249608  PMID: 18088428
21.  Profiling sex-biased gene expression during parthenogenetic reproduction in Daphnia pulex 
BMC Genomics  2007;8:464.
Sexual reproduction is a core biological function that is conserved throughout eukaryotic evolution, yet breeding systems are extremely variable. Genome-wide comparative studies can be effectively used to distinguish genes and regulatory patterns that are constrained to preserve core functions from those that may help to account for the diversity of animal reproductive strategies. We use a custom microarray to investigate gene expression in males and two reproductive stages of females in the crustacean Daphnia pulex. Most Daphnia species reproduce by cyclical parthenogenesis, alternating between sexual and clonal reproduction. Both sex determination and the switch in their mode of reproduction are environmentally induced, making Daphnia an interesting comparative system for the study of sex-biased and reproductive genes.
Patterns of gene expression in females and males reveal that 50% of assayed transcripts show some degree of sex-bias. Female-biased transcription is enriched for translation, metabolic and regulatory genes associated with development. Male-biased expression is enriched for cuticle and protease function. Comparison with well studied arthropods such as Drosophila melanogaster and Anopheles gambiae suggests that female-biased patterns tend to be conserved, whereas male-biased genes are evolving faster in D. pulex. These findings are based on the proportion of female-biased, male-biased, and unbiased genes that share sequence similarity with proteins in other animal genomes.
Some transcriptional differences between males and females appear to be conserved across Arthropoda, including the rapid evolution of male-biased genes which is observed in insects and now in a crustacean. Yet, novel patterns of male-biased gene expression are also uncovered. This study is an important first step towards a detailed understanding of the genetic basis and evolution of parthenogenesis, environmental sex determination, and adaptation to aquatic environments.
PMCID: PMC2245944  PMID: 18088424
22.  Denoising inferred functional association networks obtained by gene fusion analysis 
BMC Genomics  2007;8:460.
Gene fusion detection – also known as the 'Rosetta Stone' method – involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between their un-fused counterpart genes in query genomes. The precision of this method typically improves with an ever-increasing number of reference genomes.
In order to explore the usefulness and scope of this approach for protein interaction prediction and generate a high-quality, non-redundant set of interacting pairs of proteins across a wide taxonomic range, we have exhaustively performed gene fusion analysis for 184 genomes using an efficient variant of a previously developed protocol. By analyzing interaction graphs and applying a threshold that limits the maximum number of possible interactions within the largest graph components, we show that we can reduce the number of implausible interactions due to the detection of promiscuous domains. With this generally applicable approach, we generate a robust set of over 2 million distinct and testable interactions encompassing 696,894 proteins in 184 species or strains, most of which have never been the subject of high-throughput experimental proteomics. We investigate the cumulative effect of increasing numbers of genomes on the fidelity and quantity of predictions, and show that, for large numbers of genomes, predictions do not become saturated but continue to grow linearly, for the majority of the species. We also examine the percentage of component (and composite) proteins with relation to the number of genes and further validate the functional categories that are highly represented in this robust set of detected genome-wide interactions.
We illustrate the phylogenetic and functional diversity of gene fusion events across genomes, and their usefulness for accurate prediction of protein interaction and function.
PMCID: PMC2248599  PMID: 18081932
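The denoising idea in the abstract above (capping the number of interactions allowed within a graph component so that hubs created by promiscuous domains are discarded) can be sketched as follows. This is a minimal illustration, not the authors' protocol: the function name `filter_large_components`, the `max_edges` parameter, and the rule of dropping whole components are all assumptions made for the example, since the abstract does not specify how the threshold is applied.

```python
from collections import defaultdict


def filter_large_components(pairs, max_edges):
    """Keep predicted interaction pairs only if they lie in a connected
    component with at most `max_edges` distinct interactions.

    `pairs` is an iterable of (protein_a, protein_b) identifiers.
    Hypothetical helper: the thresholding rule is an assumption.
    """
    # Deduplicate undirected pairs and ignore self-pairs.
    unique = {tuple(sorted((a, b))) for a, b in pairs if a != b}

    # Build an undirected adjacency map.
    adjacency = defaultdict(set)
    for a, b in unique:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Label connected components with an iterative depth-first search.
    component_of = {}
    next_id = 0
    for start in adjacency:
        if start in component_of:
            continue
        component_of[start] = next_id
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in component_of:
                    component_of[neighbour] = next_id
                    stack.append(neighbour)
        next_id += 1

    # Count interactions per component.
    edge_count = defaultdict(int)
    for a, _ in unique:
        edge_count[component_of[a]] += 1

    # Retain only pairs from components under the threshold.
    return sorted(p for p in unique if edge_count[component_of[p[0]]] <= max_edges)


example_pairs = [("A", "B"), ("B", "A"), ("C", "D"), ("D", "E"), ("C", "E"), ("E", "F")]
kept = filter_large_components(example_pairs, max_edges=2)
```

In the example, the component containing C, D, E and F holds four interactions and is dropped at `max_edges=2`, while the isolated A-B pair is kept. A real analysis would choose the threshold from the observed component size distribution.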
23.  Positional bias of general and tissue-specific regulatory motifs in mouse gene promoters 
BMC Genomics  2007;8:459.
The arrangement of regulatory motifs in gene promoters, or promoter architecture, is the result of mutation and selection processes that have operated over many millions of years. In mammals, tissue-specific transcriptional regulation is related to the presence of specific protein-interacting DNA motifs in gene promoters. However, little is known about the relative location and spacing of these motifs. To fill this gap, we have performed a systematic search for motifs that show significant bias at specific promoter locations in a large collection of housekeeping and tissue-specific genes.
We observe that promoters driving housekeeping gene expression are enriched in particular motifs with strong positional bias, such as YY1, which are of little relevance in promoters driving tissue-specific expression. We also identify a large number of motifs that show positional bias in genes expressed in a highly tissue-specific manner. They include well-known tissue-specific motifs, such as HNF1 and HNF4 motifs in liver, kidney and small intestine, or RFX motifs in testis, as well as many potentially novel regulatory motifs. Based on this analysis, we provide predictions for 559 tissue-specific motifs in mouse gene promoters.
The study shows that motif positional bias is an important feature of mammalian proximal promoters and that it affects both general and tissue-specific motifs. Motif positional constraints define very distinct promoter architectures depending on breadth of expression and type of tissue.
PMCID: PMC2249607  PMID: 18078513
24.  In silico and in vitro comparative analysis to select, validate and test SNPs for human identification 
BMC Genomics  2007;8:457.
Recent advances in human genetics have provided new insights into phenotypic variation and genome variability. Current forensic DNA techniques involve the search for genetic similarities and differences between biological samples. Consequently the selection of ideal genomic biomarkers for human identification is crucial in order to ensure the highest stability and reproducibility of results.
In the present study, we selected 24 SNPs useful for human identification and validated them in 1,040 unrelated samples originating from three different populations (Italian, Benin Gulf and Mongolian). A rigorous in silico selection of these markers provided a list of SNPs with very constant frequencies across the populations tested, as demonstrated by the Fst values. Furthermore, these SNPs also showed a high specificity for the human genome (only 5 SNPs gave positive results when amplified in non-human DNA).
Comparison between in silico and in vitro analysis showed that current SNP databases can efficiently improve and facilitate the selection of markers because most of the analyses performed (Fst, r2, heterozygosity) in more than 1,000 samples confirmed available population data.
PMCID: PMC2222643  PMID: 18076761
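The abstract above uses Fst values to show that allele frequencies are nearly constant across populations. As a hedged illustration of what such a value measures, the sketch below computes Wright's Fst for a single biallelic SNP as (HT - HS) / HT, where HT is the expected heterozygosity of the pooled population and HS is the mean expected heterozygosity within populations. The function name `fst_biallelic` and the optional `sizes` weighting are assumptions for the example; the abstract does not state which Fst estimator the authors used.

```python
def fst_biallelic(allele_freqs, sizes=None):
    """Wright's F_ST for one biallelic SNP.

    `allele_freqs` holds the frequency of one allele in each population.
    `sizes` optionally weights populations by sample size (equal weights
    if omitted). Hypothetical helper, not the estimator from the study.
    """
    if sizes is None:
        sizes = [1.0] * len(allele_freqs)
    total = float(sum(sizes))
    weights = [s / total for s in sizes]

    # Pooled allele frequency and total expected heterozygosity H_T.
    p_bar = sum(w * p for w, p in zip(weights, allele_freqs))
    h_t = 2.0 * p_bar * (1.0 - p_bar)

    # Weighted mean within-population expected heterozygosity H_S.
    h_s = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, allele_freqs))

    # A monomorphic SNP carries no differentiation signal.
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t
```

Identical frequencies in every population give a value of 0, and fixation of alternative alleles in two equally weighted populations gives 1, so markers suited to identification across populations are those with values close to 0.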
25.  Unravelling the molecular control of calvarial suture fusion in children with craniosynostosis 
BMC Genomics  2007;8:458.
Craniosynostosis, the premature fusion of calvarial sutures, is a common craniofacial abnormality. Causative mutations in more than 10 genes have been identified, involving fibroblast growth factor, transforming growth factor beta, and Eph/ephrin signalling pathways. Mutations affect each human calvarial suture (coronal, sagittal, metopic, and lambdoid) differently, suggesting different gene expression patterns exist in each human suture. To better understand the molecular control of human suture morphogenesis we used microarray analysis to identify genes differentially expressed during suture fusion in children with craniosynostosis. Expression differences were also analysed between each unfused suture type, between sutures from syndromic and non-syndromic craniosynostosis patients, and between unfused sutures from individuals with and without craniosynostosis.
We identified genes with increased expression in unfused sutures compared to fusing/fused sutures that may be pivotal to the maintenance of suture patency or in controlling early osteoblast differentiation (i.e. RBP4, GPC3, C1QTNF3, IL11RA, PTN, POSTN). In addition, we have identified genes with increased expression in fusing/fused suture tissue that we suggest could have a role in premature suture fusion (i.e. WIF1, ANXA3, CYFIP2). Proteins of two of these genes, glypican 3 and retinol binding protein 4, were investigated by immunohistochemistry and localised to the suture mesenchyme and osteogenic fronts of developing human calvaria, respectively, suggesting novel roles for these proteins in the maintenance of suture patency or in controlling early osteoblast differentiation. We show that there is limited difference in whole genome expression between sutures isolated from patients with syndromic and non-syndromic craniosynostosis and confirmed this by quantitative RT-PCR. Furthermore, distinct expression profiles for each unfused suture type were noted, with the metopic suture being most disparate. Finally, although calvarial bones are generally thought to grow without a cartilage precursor, we show histologically and by identification of cartilage-specific gene expression that cartilage may be involved in the morphogenesis of lambdoid and posterior sagittal sutures.
This study has provided further insight into the complex signalling network which controls human calvarial suture morphogenesis and craniosynostosis. Identified genes are candidates for targeted therapeutic development and for screening for craniosynostosis-causing mutations.
PMCID: PMC2222648  PMID: 18076769
