The B-hordeins are the major group of prolamin storage proteins in barley (Hordeum vulgare L.) and they are encoded by a small multigene family that is expressed specifically in the developing endosperm. We report the complete nucleotide sequence of a clone of one B-hordein gene (pBHR184). The cloned gene contains no introns and belongs to the B1 sub-family of B-hordein genes. Comparison of the 5'-flanking sequences of pBHR184 with those of related S-rich prolamin genes from wheat shows that several short sequences within 600 bp upstream of the translation initiation codon are strongly conserved. A sequence that is conserved at around -300 bp in the S-rich prolamins is also conserved at similar locations in genes encoding the two major classes of maize prolamin (the Z19 and Z21 zeins) and appears to be unique to prolamin genes. We discuss the possible role of this '-300 element' in the control of gene expression in the developing cereal endosperm.
The effect of sowing date on grain protein, hordein fraction content and malting quality of two-rowed spring barley was investigated using ten commercial cultivars with different grain protein contents, and the relationships among these traits were examined. The results showed that grain protein content and B hordein content increased as the sowing date was postponed and were significantly affected by sowing date, while C and D hordein contents were less influenced by sowing date. There were significant differences in grain protein and hordein fraction content among the ten cultivars. The coefficient of variation of D hordein content was much larger than that of B and C hordein contents, suggesting greater variation in D hordein content caused by different sowing dates. Beta-amylase activity and diastatic power were also significantly affected by sowing date, with malt extract being less affected. Significant differences in the measured malt quality traits were found among the ten cultivars. Grain protein content was significantly and positively correlated with B hordein content and significantly and negatively correlated with malt extract. There was no significant correlation between beta-amylase activity or diastatic power and grain protein content. B hordein was negatively and significantly correlated with malt extract, but there were no significant correlations between C or D hordein and malting quality traits.
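The comparison of variability among hordein fractions rests on the coefficient of variation (standard deviation divided by the mean). A minimal sketch of that calculation follows; the function name and the numbers are hypothetical illustrations, not data from the study, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample coefficient of variation (SD / mean) in percent."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; the coefficient of variation is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical contents measured at three sowing dates (arbitrary units,
# invented for illustration only).
b_hordein = [10.0, 11.0, 12.0]
d_hordein = [1.0, 2.0, 3.0]

cv_b = coefficient_of_variation(b_hordein)
cv_d = coefficient_of_variation(d_hordein)
```

Because the coefficient of variation is scaled by the mean, a fraction present at low abundance can show a large value even when its absolute spread is the same as that of a more abundant fraction.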
Barley (Hordeum vulgare L.); Sowing date; Protein; Hordein; Malting quality
Association mapping is receiving considerable attention in plant genetics for its potential to fine map quantitative trait loci (QTL), validate candidate genes, and identify alleles of interest. In the present study association mapping in barley (Hordeum vulgare L.) is investigated by associating DNA polymorphisms with variation in grain quality traits, plant height, and flowering time to gain further understanding of gene functions involved in the control of these traits. We focused on the four loci BLZ1, BLZ2, BPBF and HvGAMYB that play a role in the regulation of B-hordein expression, the major fraction of the barley storage protein. The association was tested in a collection of 224 spring barley accessions using a two-stage mixed model approach.
Within the sequenced fragments of the four candidate genes we observed different levels of nucleotide diversity. The effect of selection on the candidate genes was tested by Tajima's D, which revealed significant values for BLZ1, BLZ2, and BPBF in the subset of two-rowed barleys. Pair-wise LD estimates between the detected SNPs within each candidate gene revealed different intra-genic linkage patterns. On the basis of a more extensive examination of genomic regions surrounding the four candidate genes we found a sharp decrease of LD (r² < 0.2 within 1 cM) in all but one of the flanking regions.
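For readers unfamiliar with the pair-wise LD statistic, the following is a minimal sketch of how r² between two biallelic SNPs can be computed from phased haplotypes. It assumes alleles are coded 0/1 and that each accession contributes one haplotype (a simplification that is reasonable for inbred lines); the function name and coding scheme are illustrative and not taken from the study.

```python
def ld_r2(hap_a, hap_b):
    """Return r^2 between two biallelic SNPs given 0/1-coded haplotypes."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n  # frequency of allele 1 at SNP B
    # frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Two SNPs in complete LD, and two that are statistically independent.
complete = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
independent = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

An r² threshold such as 0.2 is then applied to these pair-wise values as a function of genetic or physical distance to describe how quickly LD decays.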
Significant marker-trait associations were detected between SNP sites within BLZ1 and flowering time, BPBF and crude protein content, and BPBF and starch content. Most haplotypes occurred at frequencies <0.05 and were therefore excluded from the association analysis. Based on haplotype information, BPBF was associated with crude protein content and starch content, BLZ2 was associated with thousand-grain weight, and BLZ1 was associated with flowering time and plant height.
Differences in nucleotide diversity and LD pattern within the candidate genes BLZ1, BLZ2, BPBF, and HvGAMYB reflect the impact of selection on the nucleotide sequence of the four candidate loci.
Despite significant associations, the analysed candidate genes explained only a minor part of the total genetic variation, although they are known to be important factors influencing the expression of seed quality traits. Therefore, we assume that grain quality as well as plant height and flowering time are influenced by many factors, each contributing a small part to the expression of the phenotype. A genome-wide association analysis could provide a more comprehensive picture of loci involved in the regulation of grain quality, thousand-grain weight and the other agronomic traits that were analysed in this study. However, despite available high-throughput genotyping arrays, the marker density along the barley genome is still insufficient to capture all associations in a whole-genome scan. Therefore, the candidate gene-based approach will continue to play an important role in barley association studies.
As the global population continues to expand, increasing yield in bread wheat is of critical importance as 20% of the world’s food supply is sourced from this cereal. Several recent studies of the molecular basis of grain yield indicate that the cytokinins are a key factor in determining grain yield. In this study, cytokinin gene family members in bread wheat were isolated from four multigene families which regulate cytokinin synthesis and metabolism, the isopentenyl transferases (IPT), cytokinin oxidases (CKX), zeatin O-glucosyltransferases (ZOG), and β-glucosidases (GLU). As bread wheat is hexaploid, each gene family is also likely to be represented on the A, B and D genomes. By using a novel strategy of qRT-PCR with locus-specific primers shared among the three homoeologues of each family member, detailed expression profiles are provided of family members of these multigene families expressed during leaf, spike and seed development.
The expression patterns of individual members of the IPT, CKX, ZOG, and GLU multigene families in wheat are shown to be tissue- and developmentally-specific. For instance, TaIPT2 and TaCKX1 were the most highly expressed family members during early seed development, with relative expression levels of up to 90- and 900-fold higher, respectively, than those in the lowest expressed samples. The expression of two cis-ZOG genes was sharply increased in older leaves, while an extremely high mRNA level of TaGLU1-1 was detected in young leaves.
Key genes with tissue- and developmentally-specific expression have been identified which would be prime targets for genetic manipulation towards yield improvement in bread wheat breeding programmes, utilising TILLING and MAS strategies.
The utility of mining DNA sequence data to understand the structure and expression of cereal prolamin genes is demonstrated by the identification of a new class of wheat prolamins. This previously unrecognized wheat prolamin class, given the name δ-gliadins, is the most direct ortholog of barley γ3-hordeins. Phylogenetic analysis shows that the orthologous δ-gliadins and γ3-hordeins form a distinct prolamin branch that existed separate from the γ-gliadins and γ-hordeins in an ancestral Triticeae prior to the branching of wheat and barley. The expressed δ-gliadins are encoded by a single gene in each of the hexaploid wheat genomes. This single δ-gliadin/γ3-hordein ortholog may be a general feature of the Triticeae tribe since examination of ESTs from three barley cultivars also confirms a single γ3-hordein gene. Analysis of ESTs and cDNAs shows that the genes are expressed in at least five hexaploid wheat cultivars in addition to the diploids Triticum monococcum and Aegilops tauschii. The latter two sequences also allow assignment of the δ-gliadin genes to the A and D genomes, respectively, with the third sequence type assumed to be from the B genome. Two wheat cultivars for which there are sufficient ESTs show different patterns of expression: cv Chinese Spring expresses the genes from the A and B genomes, while cv Recital has ESTs from the A and D genomes. Genomic sequences of Chinese Spring show that the D genome gene is inactivated by tandem premature stop codons. A fourth δ-gliadin sequence occurs in the D genome of both Chinese Spring and Ae. tauschii, but no ESTs match this sequence and the limited genomic sequences indicate a pseudogene containing frame shifts and premature stop codons. Sequencing of BACs covering a 3 Mb region from Ae. tauschii locates the δ-gliadin gene to the complex Gli-1 plus Glu-3 region on chromosome 1.
Analysis of cell-selective gene expression for families of proteins of therapeutic interest is crucial when deducing the influence of genes upon complex traits and disease susceptibility. Presently, there is no convenient tool for examining isoform-selective expression for large gene families. A multigene isoform profiling strategy was developed and used to investigate the inwardly rectifying K+ (Kir) channel family in human leukocytes. Comprised of seven subfamilies, Kir channels have important roles in setting the resting membrane potential in excitable and non-excitable cells.
Gene sequence alignment allowed determination of "islands" of amino acid homology, and sub-family "centred" priming permitted simultaneous co-amplification of each family member. Validation and cross-priming analysis was performed against a panel of cognate Kir channel clones. Radiolabelling and diagnostic restriction digestion of pooled PCR products enabled determination of distinct Kir gene expression profiles in pure populations of human neutrophils, eosinophils and lung mast cells, with conservation of Kir2.0 isoforms amongst the leukocyte subsets. We also identified a Kir2.0 channel product, which may potentially represent a novel family member.
We have developed a novel, rapid and flexible strategy for the determination of gene family isoform composition in any cell type with the additional capacity to detect hitherto unidentified family members and verified its application in a study of Kir channel isoform expression in human leukocytes.
The secretory granules (trichocysts) of Paramecium are characterized by a highly constrained shape that reflects the crystalline organization of their protein contents. Yet the crystalline trichocyst content is composed not of a single protein but of a family of related polypeptides that derive from a family of precursors by protein processing. In this paper we show that a multigene family, of unusually large size for a unicellular organism, codes for these proteins. The family is organized in subfamilies; each subfamily codes for proteins with different primary structures, but within the subfamilies several genes code for nearly identical proteins. For one subfamily, we have obtained direct evidence that the different members are coexpressed. The three subfamilies we have characterized are located on different macronuclear chromosomes. Typical 23-29 nucleotide Paramecium introns are found in one of the regions studied and the intron sequences are more variable than the surrounding coding sequences, providing gene-specific markers. We suggest that this multigene family may have evolved to assure a microheterogeneity of structural proteins necessary for morphogenesis of a complex secretory granule core with a constrained shape and dynamic properties: genetic analysis has shown that correct assembly of the crystalline core is necessary for trichocyst function.
Subtelomeric multigene families of malaria parasites encode virulence determinants. The published genome sequence of Plasmodium vivax revealed the largest subtelomeric multigene family of human malaria parasites, the vir super-family, presently composed of 346 vir genes subdivided into 12 different subfamilies based on sequence homologies detected by BLAST.
A novel computational approach was used to redefine vir genes. First, a protein-weighted graph was built based on BLAST alignments. This graph was processed to ensure that edge weights are not exclusively based on the BLAST score between the two corresponding proteins, but are strongly dependent on their graph neighbours and their associations. Then the Markov Clustering Algorithm was applied to the protein graph. Next, the Homology Block concept was used to further validate this clustering approach. Finally, proteome-wide analysis was carried out to predict new VIR members. Results showed that (i) three previous subfamilies can no longer be classified as vir genes; (ii) most previously unclustered vir genes were clustered into vir subfamilies; (iii) 39 hypothetical proteins were predicted as VIR proteins; (iv) many of these findings are supported by structural and functional evidence, sub-cellular localization studies, gene expression analysis and chromosome localization; and (v) this approach can be used to study other multigene families in malaria.
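The clustering step can be illustrated with a minimal pure-Python sketch of Markov clustering (alternating expansion and inflation of a column-stochastic matrix). This is a generic textbook version under stated assumptions: it takes a symmetric weight matrix directly, omits the neighbour-dependent reweighting and the Homology Block validation described above, and its function names, default inflation value and toy graph are hypothetical rather than the authors' implementation.

```python
def _normalise_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(weights, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a weighted graph; returns sorted lists of node indices."""
    n = len(weights)
    # Add self-loops so every column has a non-zero sum, then normalise.
    m = [[float(weights[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = _normalise_columns(m)
    for _ in range(max_iter):
        expanded = _matmul(m, m)                                  # expansion
        inflated = [[v ** inflation for v in row] for row in expanded]
        new = _normalise_columns(inflated)                        # inflation
        diff = max(abs(new[i][j] - m[i][j])
                   for i in range(n) for j in range(n))
        m = new
        if diff < tol:
            break
    # Rows with non-zero entries are attractors; their non-zero columns
    # are the members of the corresponding cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy similarity graph (hypothetical): proteins 0-2 are mutually similar,
# proteins 3-4 are similar to each other, with no edges between the groups.
toy = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
families = markov_cluster(toy)
```

The inflation parameter controls granularity: larger values tend to split the graph into more, smaller clusters, so any real analysis would need to tune it against known subfamily assignments.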
This methodology, resource and new classification of vir genes will contribute to a new structural framing of this multigene family and other multigene families of malaria parasites, facilitating the design of experiments to understand their role in pathology, which in turn may help furthering vaccine development.
Malaria; Plasmodium vivax; vir genes; VIR proteins; Subtelomeric multigene families; Sequence clustering; Similarity networks; Homology blocks
The tick-borne relapsing fever spirochete Borrelia hermsii evades the mammalian immune system by periodically switching expression among members of two multigene families that encode immunogenic, antigenically distinct outer surface proteins. The type strain, B. hermsii HS1, has at least 40 complete genes and pseudogenes that participate in this multiphasic antigenic variation. Originally termed vmp (for variable major protein) genes, they have been reclassified as vsp (for variable small protein) and vlp (for variable large protein) genes, based on size and amino acid sequence similarities. To date, antigenic variation in B. hermsii has been studied only in the type strain, HS1. Nucleotide sequence comparisons of 23 B. hermsii HS1 genes revealed five distinct groups, the vsp gene family and four subfamilies of vlp genes. We used PCR with family- and subfamily-specific primers, followed by restriction fragment length polymorphism analysis, to compare the vsp and vlp repertoires of HS1 and seven other B. hermsii isolates from Washington, Idaho, and California. This analysis, together with pulsed-field gel electrophoresis genome profiles, revealed that the eight isolates formed three distinct groups, which likely represent clonal lineages. Members of the three groups coexisted in the same geographic area, but they could also be isolated across large geographical distances. This population structure may result from immune selection by the host, as has been proposed for other pathogens with polymorphic antigens.
The transient leaf assay in Nicotiana benthamiana is widely used in plant sciences, with one application being the rapid assembly of complex multigene pathways that produce new fatty acid profiles. This rapid and facile assay would be further improved if it were possible to simultaneously overexpress transgenes while accurately silencing endogenes. Here, we report a draft genome resource for N. benthamiana spanning over 75% of the 3.1 Gb haploid genome. This resource revealed a two-member NbFAD2 family, NbFAD2.1 and NbFAD2.2, and quantitative RT-PCR (qRT-PCR) confirmed their expression in leaves. FAD2 activities were silenced using hairpin RNAi as monitored by qRT-PCR and biochemical assays. Silencing of endogenous FAD2 activities was combined with overexpression of transgenes via the use of the alternative viral silencing-suppressor protein, V2, from Tomato yellow leaf curl virus. We show that V2 permits maximal overexpression of transgenes but, crucially, also allows hairpin RNAi to operate unimpeded. To illustrate the efficacy of the V2-based leaf assay system, endogenous lipids were shunted from the desaturation of 18∶1 to elongation reactions beginning with 18∶1 as substrate. These V2-based leaf assays produced ∼50% more elongated fatty acid products than p19-based assays. Analyses of small RNA populations generated from hairpin RNAi against NbFAD2 confirm that the siRNA population is dominated by 21 and 22 nt species derived from the hairpin. Collectively, these new tools expand the range of uses and possibilities for metabolic engineering in transient leaf assays.
We report a technique for the rapid determination of genomic structure of individual members of human interspersed multigene families which circumvents the requirement for genomic clone isolation. In this approach, vectorette libraries were constructed from human/rodent somatic cell hybrid DNA harbouring single members of the gene family. Using these libraries as PCR templates with nested gene-specific primers in combination with a common vectorette primer resulted in the amplification of gene-specific products suitable for the subsequent determination of intron/exon structure. We have applied this technique to characterise members of two gene families.
The origin of introns and their role (if any) in gene expression, in the evolution of the genome, and in the generation of new expressed sequences are issues that are understood poorly, if at all. Multigene families provide a favorable opportunity for examining the evolutionary history of introns because it is possible to identify changes in intron placement and content since the divergence of family members from a common ancestral sequence. Here we report the complete sequence of the gene encoding the 68-kilodalton (kDa) neurofilament protein; the gene is a member of the intermediate filament multigene family that diverged over 600 million years ago. Five other members of this family (desmin, vimentin, glial fibrillary acidic protein, and type I and type II keratins) are encoded by genes with six or more introns at homologous positions. To our surprise, the number and placement of introns in the 68-kDa neurofilament protein gene were completely anomalous, with only three introns, none of which corresponded in position to introns in any characterized intermediate filament gene. This finding was all the more unexpected because comparative amino acid sequence data suggest a closer relationship of the 68-kDa neurofilament protein to desmin, vimentin, and glial fibrillary acidic protein than between any of these three proteins and the keratins. It appears likely that an mRNA-mediated transposition event was involved in the evolution of the 68-kDa neurofilament protein gene and that subsequent events led to the acquisition of at least two of the three introns present in the contemporary sequence.
Multiple proteins containing a BURP domain have been identified in many different plant species, but not in any other organisms. To date, the molecular function of the BURP domain is still unknown, and no systematic analysis and expression profiling of the gene family in soybean (Glycine max) has been reported.
In this study, multiple bioinformatics approaches were employed to identify all the members of the BURP gene family in soybean. A total of 23 BURP gene types were identified. These genes had diverse structures and were distributed on chromosomes 1, 2, 4, 6, 7, 8, 11, 12, 13, 14, and 18. Phylogenetic analysis suggested that these BURP family genes could be classified into 5 subfamilies, one of which defines a new subfamily, BURPV. Quantitative real-time PCR (qRT-PCR) analysis of transcript levels showed that 15 of the 23 genes had no expression specificity; 7 of them were specifically expressed in some of the tissues; and one of them was not expressed in any of the tissues or organs studied. The results of stress treatments showed that 17 of the 23 identified BURP family genes responded to at least one of the three stress treatments; 6 of them were not influenced by stress treatments even though a stress-related cis-element was identified in the promoter region. No stress-related cis-elements were found in the promoter region of any BURPV member. However, qRT-PCR results indicated that all members of BURPV responded to at least one of the three stress treatments. More significantly, the members of the RD22-like subfamily showed no tissue-specific expression and they all responded to each of the three stress treatments.
We have identified and classified all the BURP domain-containing genes in soybean. Their expression patterns in different tissues and under different stress treatments were detected using qRT-PCR. Fifteen of the 23 BURP genes in soybean had no tissue-specific expression, while 17 of them were stress-responsive. The data provide insight into the evolution of the gene family and suggest that many BURP family genes may be important for plant responses to stress conditions.
The polymerase chain reaction (PCR) is a versatile method to amplify specific DNA with oligonucleotide primers. By designing degenerate PCR primers based on amino acid sequences that are highly conserved among all known gene family members, new members of a multigene family can be identified. The inherent weakness of this approach is that the degenerate primers will amplify previously identified, in addition to new, family members. To specifically address this problem, we synthesized a specific RNA for each known family member so that it hybridized to one strand of the template, adjacent to the 3′-end of the primer, allowing the degenerate primer to bind yet preventing extension by DNA polymerase. To test our strategy, we used known members of the soluble, nitric oxide-sensitive guanylyl cyclase family as our templates and degenerate primers that discriminate this family from other guanylyl cyclases. We demonstrate that amplification of known members of this family is effectively and specifically inhibited by the corresponding RNAs, alone or in combination. This robust method can be adapted to any application where multiple PCR products are amplified, as long as the sequence of the desired and the undesired PCR product(s) is sufficiently distinct between the primers.
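To make concrete why a degenerate primer amplifies known as well as new family members, the sketch below expands a primer written in standard IUPAC nucleotide codes into every concrete sequence it represents. The helper names and the example primer are hypothetical and are not sequences from the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer: three fixed bases followed by three degenerate positions.
example = "ATGRAYN"
n_variants = degeneracy(example)
variants = expand(example)
```

Each expanded sequence is a potential priming site, so any known family member matching one of them will be co-amplified unless its extension is blocked, which is the problem the RNA-based inhibition addresses.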
Homology-dependent gene silencing is achieved in Paramecium by introduction of gene coding regions into the somatic nucleus at high copy number, resulting in reduced expression of all homologous genes. Although a powerful tool for functional analysis, the relationship of this phenomenon to gene silencing mechanisms in other organisms has remained obscure. We report here experiments using the T4a gene, a member of the trichocyst matrix protein (TMP) multigene family encoding secretory proteins, and the ND7 gene, a single copy gene required for exocytotic membrane fusion. Silencing of either gene leads to an exocytosis-deficient phenotype easily scored on individual cells. For each gene we have tested the ability of different combinations of promoter, coding and 3′ non-coding regions to provoke silencing, and analyzed transcription and steady-state RNA in the transformed cells. We provide evidence that homology-dependent gene silencing in Paramecium is post-transcriptional and that both sense and antisense RNA are transcribed from the transgenes, consistent with a role for dsRNA in triggering silencing. Constructs with and without promoters induce gene silencing. However, transgenes that contain 3′ non-coding regions do not induce gene silencing, despite antisense RNA production. We present a model according to which different pathways of RNA metabolism compete for transcripts and propose that the relative efficiencies of dsRNA formation and of 3′ RNA processing of sense transgene transcripts determine the outcome of transformation experiments.
Phospholipases are critical for modification and redistribution of lipid substrates, membrane remodeling and microbial virulence. Among the many different classes of phospholipases, fungal phospholipase B (Plb) proteins show the broadest range of substrate specificity and hydrolytic activity, hydrolyzing acyl ester bonds in phospholipids and lysophospholipids and further catalyzing lysophospholipase-transacylase reactions. The genome of the opportunistic fungal pathogen Candida albicans encodes a PLB multigene family with five putative members; we present the first characterization of this group of potential virulence determinants. CaPLB5, the third member of this multigene family characterized herein, is a putative secretory protein with a predicted GPI-anchor attachment site. Real-time RT-PCR gene expression analysis of CaPLB5 and the additional CaPLB gene family members revealed that filamentous growth and physiologically relevant environmental conditions are associated with increased phospholipase B gene activity. The phenotypes expressed by null mutant and revertant strains of CaPLB5 indicate that this lipid hydrolase plays an important role in cell-associated phospholipase A2 activity and in vivo organ colonization.
GPI anchor; Phospholipase; Lysophospholipase; Candida; Selection marker; Virulence
Sequencing of restriction fragment EcoRI A-SalI C of African swine fever virus has revealed the existence of a multigene family, designated family 505 because of the average number of amino acids in the proteins, composed of seven homologous and tandemly arranged genes. All the genes of family 505 are expressed during infection. Primer extension analysis showed that transcription is initiated a short distance (3 to 62 nucleotides) from the start codon of the corresponding open reading frame. The proteins of family 505 showed similarity to those of family 360 from African swine fever virus. In particular, a striking conservation of three regions at the amino terminus of the polypeptides was observed.
Microinjection at high copy number of plasmids containing only the coding region of a gene into the Paramecium somatic macronucleus led to a marked reduction in the expression of the corresponding endogenous gene(s). The silencing effect, which is stably maintained throughout vegetative growth, has been observed for all Paramecium genes examined so far: a single-copy gene (ND7), as well as members of multigene families (centrin genes and trichocyst matrix protein genes) in which all closely related paralogous genes appeared to be affected. This phenomenon may be related to posttranscriptional gene silencing in transgenic plants and quelling in Neurospora and allows the efficient creation of specific mutant phenotypes, thus providing a potentially powerful tool to study gene function in Paramecium. For the two multigene families that encode proteins that coassemble to build up complex subcellular structures, the analysis presented herein provides the first experimental evidence that the members of these gene families are not functionally redundant.
The immunodominant surface protein, MSP3, is structurally and antigenically polymorphic among strains of Anaplasma marginale. In this study we show that a polymorphic multigene family is at least partially responsible for the variation seen in MSP3. The A. marginale msp3 gene msp3-12 was cloned and expressed in Escherichia coli. With msp3-12 as a probe, multiple, partially homologous gene copies were identified in the genomes of three A. marginale strains. These copies were widely distributed throughout the chromosome. Sequence analysis of three unique msp3 genes, msp3-12, msp3-11, and msp3-19, revealed both conserved and variant regions within the open reading frames. Importantly, msp3 contains amino acid blocks related to another polymorphic multigene family product, MSP2. These data, in conjunction with data presented in previous studies, suggest that multigene families are used to vary important antigenic surface proteins of A. marginale. These findings may provide a basis for studying antigenic variation of the organism in persistently infected carrier cattle.
Babesia bovis, an intraerythrocytic parasite of cattle, establishes persistent infections of extreme duration. This is accomplished, at least in part, through rapid antigenic variation of a heterodimeric virulence factor, the variant erythrocyte surface antigen-1 (VESA1) protein. Previously, the VESA1a subunit was demonstrated to be encoded by a 1α member of the ves multigene family. Since its discovery the 1β branch of this multigene family has been hypothesized to encode the VESA1b polypeptide, but formal evidence for this connection has been lacking. Here, we provide evidence that products of ves1β genes are rapidly variant in antigenicity and size-polymorphic, matching known VESA1b polypeptides. Importantly, the ves1β-encoded antigens are co-precipitated with VESA1a during immunoprecipitation with anti-VESA1a monoclonal antibodies, and antisera to ves1β polypeptide co-precipitate VESA1a. Further, the ves1β-encoded antigens significantly co-localize with VESA1a on the infected-erythrocyte membrane surface of live cells. These characteristics all match known properties of VESA1b, allowing us to conclude that the ves1β gene divergently apposing the ves1α gene within the locus of active ves transcription (LAT) encodes the 1b subunit of the VESA1 cytoadhesion ligand. However, the extent and stoichiometry of VESA1a and 1b co-localization on the surface of individual cells is quite variable, implicating competing effects on transcription, translation, or trafficking of the two subunits. These results provide essential information facilitating further investigation into this parasite virulence factor.
Babesia bovis; antigenic variation; VESA1b; ves multigene family; ves1β gene; cytoadhesion ligand
The unique PE/PPE multigene family of proteins occupies almost 10% of the coding sequence of Mycobacterium tuberculosis (M.tb), the causative agent of human tuberculosis. Although some members of this family have been shown to be involved in pathways essential to M.tb pathogenesis, their precise physiological functions remain largely undefined. Here, we investigate the roles of the conserved members of the ‘PE only’ subfamily Rv0285 (PE5) and Rv1386 (PE15) in mediating host-pathogen interactions. Recombinant Mycobacterium smegmatis strains expressing PE5 and PE15 showed enhanced survival vs controls in J774.1 and THP-1 macrophages - this increase in viable counts was correlated with a reduction in transcript levels of inducible nitric oxide synthase. An up-regulation of anti- and down-regulation of pro-inflammatory cytokine levels was also observed in infected macrophages implying an immuno-modulatory function for these proteins. Induction of IL-10 production upon infection of THP-1 macrophages was associated with increased phosphorylation of the MAP Kinases p38 and ERK1/2, which was abolished in the presence of the pharmacological inhibitors SB203580 and PD98059. The PE5-PPE4 and PE15-PPE20 gene pairs were observed to be co-operonic in M.tb, hinting at an additional level of complexity in the functioning of these proteins. We conclude that M.tb exploits the PE proteins to evade the host immune response by altering the Th1 and Th2 type balance thereby favouring in vivo bacillary survival.
The PE and PPE multigene families of Mycobacterium tuberculosis comprise about 10% of the coding potential of the genome. The function of the proteins encoded by these large gene families remains unknown, although they have been proposed to be involved in antigenic variation and disease pathogenesis. Interestingly, some members of the PE and PPE families are associated with the ESAT-6 (esx) gene cluster regions, which are regions of immunopathogenic importance, and encode a system dedicated to the secretion of members of the potent T-cell antigen ESAT-6 family. This study investigates the duplication characteristics of the PE and PPE gene families and their association with the ESAT-6 gene clusters, using a combination of phylogenetic analyses, DNA hybridization, and comparative genomics, in order to gain insight into their evolutionary history and distribution in the genus Mycobacterium.
The results showed that the expansion of the PE and PPE gene families is linked to the duplications of the ESAT-6 gene clusters, and that members situated in and associated with the clusters represent the most ancestral copies of the two gene families. Furthermore, the emergence of the repeat protein PGRS and MPTR subfamilies is a recent evolutionary event, occurring at defined branching points in the evolution of the genus Mycobacterium. These gene subfamilies are thus present in multiple copies only in the members of the M. tuberculosis complex and close relatives. The study provides a complete analysis of all the PE and PPE genes found in the sequenced genomes of members of the genus Mycobacterium such as M. smegmatis, M. avium paratuberculosis, M. leprae, M. ulcerans, and M. tuberculosis.
This work provides insight into the evolutionary history of the PE and PPE gene families of the mycobacteria, linking the expansion of these families to the duplications of the ESAT-6 (esx) gene cluster regions, and showing that they are composed of subgroups with distinct evolutionary (and possibly functional) characteristics.
Proteins located on Plasmodium falciparum merozoites, the invasive form of the parasite's asexual blood stage, are of considerable interest in vaccine research. Merozoite surface protein 7 (MSP7) forms a complex with MSP1 and is encoded by a member of a multigene family located on chromosome 13. The family codes for MSP7 and five MSP7-related proteins (MSRPs). In the present study, we have investigated the expression of the msrp genes and the effect of their deletion at the asexual blood stage. In addition to msp7, the genes msrp2, msrp3, and msrp5 are transcribed, and their mRNA was easily detected by hybridization analysis, whereas mRNA for msrp1 and msrp4 could be detected only by reverse transcription (RT)-PCR. Notwithstanding evidence of transcription, antibodies to recombinant MSRPs failed to detect specific proteins, except for antibodies to MSRP2. Sequential proteolytic cleavages of MSRP2 resulted in 28- and 25-kDa forms. However, MSRP2 was absent from merozoites; the 25-kDa MSRP2 protein (MSRP2₂₅) was soluble and secreted upon merozoite egress. The msrp genes were deleted by targeted disruption in the 3D7 line, leading to ablation of full-length transcripts. MSRP deletion mutants had no detectable phenotype, with growth and invasion characteristics comparable to those of the parental parasite; only the deletion of MSP7 led to a detectable growth phenotype. Thus, within this family some of the genes are transcribed at a significant level in asexual blood stages, but the corresponding proteins may or may not be detectable. Interactions of the expressed proteins with the merozoite also differ. These results highlight the potential for unexpected differences in protein expression levels within gene families.
Ticks are blood-feeding arachnids that characteristically take a long blood meal. They must therefore counteract host defence mechanisms such as hemostasis, inflammation and the immune response. This is achieved by expressing batteries of salivary proteins encoded by multigene families.
We report the in-depth analysis of a tick multigene family and describe five new anticomplement proteins in Ixodes ricinus. Compared to previously described Ixodes anticomplement proteins, these segregated into a new phylogenetic group or subfamily. These proteins have a novel mechanism of action, as they specifically bind to properdin, leading to the inhibition of C3 convertase and the alternative complement pathway. An excess of non-synonymous over synonymous changes indicated that coding sequences had undergone diversifying selection. Diversification was not associated with structural, biochemical or functional diversity, adaptation to host species or stage specificity, but rather with differences in antigenicity.
Anticomplement proteins from I. ricinus are the first inhibitors that specifically target a positive regulator of complement, properdin. They may provide new tools for investigating the role of properdin in physiological and pathophysiological mechanisms. They may also be useful in disorders affecting the alternative complement pathway. Looking for and detecting the different selection pressures involved will help in understanding the evolution of multigene families and hematophagy in arthropods.
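The comparison of non-synonymous and synonymous changes mentioned above can be illustrated with a minimal sketch. It assumes two aligned, in-frame coding sequences and the standard genetic code, and it only tallies raw codon differences. It is not the method used in the study, which is not specified here; a proper dN/dS estimate would also normalise by the numbers of synonymous and non-synonymous sites. The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Count codon differences between two aligned coding sequences.

    Returns a tuple (synonymous, non_synonymous). Codons containing gaps
    or ambiguous bases are skipped. Hypothetical helper for illustration.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: not counted
        if codon_a == codon_b:
            continue  # identical codons carry no change
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # same amino acid, different codon
        else:
            non_synonymous += 1  # amino acid replaced
    return synonymous, non_synonymous


# Toy sequences (made up): GCT -> GCC keeps alanine, AAA -> AGA changes Lys to Arg.
syn, nonsyn = count_codon_changes("ATGGCTAAA", "ATGGCCAGA")
print(syn, nonsyn)
```

A tally in which non-synonymous changes clearly outnumber synonymous ones is only suggestive; site-normalised rates and a statistical test would be needed before inferring diversifying selection.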
Western blot analysis of proteins from a cell culture isolate (USG3) of the human granulocytic ehrlichiosis (HGE) agent has identified a number of immunoreactive proteins, including major antigenic proteins of 43 and 45 kDa. Peptides derived from the 43- and 45-kDa proteins were sequenced, and degenerate PCR primers based on these sequences were used to amplify DNA from USG3. Sequencing of a 550-bp PCR product revealed that it encodes a protein homologous to the MSP-2 proteins of Anaplasma marginale. Concurrently, an expression library made from USG3 genomic DNA was screened with granulocytic Ehrlichia (GE)-positive immune sera. Analysis of two clones showed that they contain one partial and three full-length highly related genes, suggesting that they are part of a multigene family. Amino acid alignment showed conserved amino- and carboxy-terminal regions which flank a variable region. The conserved regions of these proteins are also homologous to the MSP-2 proteins of A. marginale; thus, they were designated GE MSP-2A (45 kDa), MSP-2B (34 kDa), and MSP-2C (38 kDa). The PCR fragment obtained as a result of peptide sequencing was completely contained within the msp-2A clone, and all of the sequenced peptides were found in the GE MSP-2 proteins. Recombinant MSP-2B protein and an MSP-2A fusion protein were expressed in Escherichia coli and reacted with human sera positive for the HGE agent by immunofluorescence assay. These data suggest that the 43- and 45-kDa proteins of the HGE agent are encoded by members of the GE MSP-2 multigene family.