Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion.
Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger than 1 kb. Excluding the 59 SVs (54 insertions/deletions, 5 inversions) that overlap with N-base gaps in the reference assembly hg19, 666 non-gap SVs remained, and 396 of them (60%) were verified by paired-end data from whole-genome sequencing-based re-sequencing or de novo assembly sequence from fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides valuable information for complex regions with haplotypes in a straightforward fashion. In addition, with long single-molecule labeling patterns, exogenous viral sequences were mapped on a whole-genome scale, and sample heterogeneity was analyzed at a new level.
Our study highlights genome mapping technology as a comprehensive and cost-effective method for detecting structural variation and studying complex regions in the human genome, as well as deciphering viral integration into the host genome.
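The SV tallies reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the abstract; the variable names are illustrative and not taken from the study. The combined support fraction works out to about 91%, close to the roughly 90% quoted in the text.

```python
# Counts taken from the abstract above; variable names are illustrative.
indels, inversions = 708, 17
gap_overlapping = 59        # 54 insertions/deletions + 5 inversions in hg19 N-base gaps
verified_orthogonal = 396   # supported by paired-end or fosmid assembly data
in_dgv = 213                # remaining SVs overlapping the Database of Genomic Variants

total_calls = indels + inversions
non_gap = total_calls - gap_overlapping
unverified = non_gap - verified_orthogonal
supported = verified_orthogonal + in_dgv


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


print(non_gap, unverified, supported)                    # 666 270 609
print(round(percent(verified_orthogonal, non_gap), 1))   # 59.5
print(round(percent(supported, non_gap), 1))             # 91.4
```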
Electronic supplementary material
The online version of this article (doi:10.1186/2047-217X-3-34) contains supplementary material, which is available to authorized users.
Genome mapping; Structural variation; Repeat units; Epstein-Barr virus (EBV) integration
N-Acetylneuraminate lyases (NALs) or sialic acid aldolases catalyze the reversible aldol cleavage of N-acetylneuraminic acid (Neu5Ac, the most common form of sialic acid) to form pyruvate and N-acetyl-D-mannosamine (ManNAc). Although equilibrium favors sialic acid cleavage, these enzymes can be used for high-yield chemoenzymatic synthesis of structurally diverse sialic acids in the presence of excess pyruvate. Engineering these enzymes to synthesize structurally modified natural sialic acids and their non-natural derivatives holds promise in creating novel therapeutic agents. Atomic resolution structures of these enzymes will greatly assist in guiding mutagenic and modeling studies to engineer enzymes with altered substrate specificity. We report here the crystal structures of wild-type Pasteurella multocida N-acetylneuraminate lyase and its K164A mutant. Like other bacterial lyases, it assembles into a homotetramer with each monomer folding into a classic (β/α)8 TIM barrel. Two wild-type structures were determined: one in the absence of substrates, and one trapped as a Schiff base intermediate between Lys164 and pyruvate. Three structures of the K164A variant were determined: one in the absence of substrates and two binary complexes with N-acetylneuraminic acid (Neu5Ac) and N-glycolylneuraminic acid (Neu5Gc), respectively. Both sialic acids bind to the active site in the open-chain ketone form of the monosaccharide. The structures reveal that every hydroxyl group of the linear sugars makes hydrogen bond interactions with the enzyme, and the residues that determine specificity were identified. Additionally, the structures offer some clues to the natural discrimination of sialic acid substrates between the P. multocida and E. coli NALs.
N-Acetylneuraminate lyase; N-Acetylneuraminic acid; N-Glycolylneuraminic acid; Protein crystal structure; Sialic acid; Sialic acid aldolase
Control of inflammation is critical for therapy of infectious diseases. Pathogen-associated and/or danger-associated molecular patterns (PAMPs and DAMPs, respectively) are the two major inducers of inflammation. Because the CD24-Siglec G/10 interactions selectively repress inflammatory response to DAMPs, microbial disruption of the negative regulation would provide a general mechanism to exacerbate inflammation. Here we show that the sialic acid-based pattern recognitions of CD24 by Siglec G/10 are targeted by sialidases in polybacterial sepsis. Sialidase inhibitors protect mice against sepsis by a CD24-Siglecg-dependent mechanism, whereas a targeted mutation of either CD24 or Siglecg exacerbates sepsis. Bacterial sialidase and host CD24 and Siglecg genes interact to determine pathogen virulence. Our data demonstrate a critical role for disrupting sialic acid-based pattern recognitions in microbial virulence and suggest a therapeutic approach to dampen harmful inflammatory response during infection.
Fluorinated Thomsen-Friedenreich (T) antigens were synthesized efficiently from chemically produced fluorinated monosaccharides using a highly efficient one-pot two-enzyme chemoenzymatic approach containing a galactokinase and a D-galactosyl-β1–3-N-acetyl-D-hexosamine phosphorylase. These fluorinated T-antigens were further sialylated to form fluorinated ST-antigens using a one-pot two-enzyme system containing a CMP-sialic acid synthetase and an α2–3-sialyltransferase.
Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
whole-genome sequence; trio-design; population genetics
The major histocompatibility complex (MHC) is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC, and associated regions, focus on minor variants and HLA typing, many of which have been demonstrated to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lacks a coherent, standardized, cost-effective and high-coverage protocol of clinical quality and reliability. In this paper, we present such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed to template upon the 8 annotated human MHC haplotypes, and to encompass the 5 megabases (Mb) of the extended MHC region. We deployed our probes upon three genetically diverse human samples for probe set evaluation, and sequencing data show that ∼97% of the MHC region, and over 99% of the genes in the MHC region, are covered with sufficient depth and good evenness. Of the genotypes called by this capture sequencing, 98% are consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4-digit resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into studies of immune-mediated diseases, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community.
Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18X per individual. Genes showing population-specific allele frequency changes, which represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from EPAS1, a transcription factor involved in response to hypoxia. One SNP at EPAS1 shows a 78% frequency difference between Tibetan and Han samples, representing the fastest allele frequency change observed at any human gene to date. This SNP’s association with erythrocyte abundance supports the role of EPAS1 in adaptation to hypoxia. Thus, a population genomic survey has revealed a functionally important locus in genetic adaptation to high altitude.
Genome-wide association studies have mainly relied on common HapMap sequence variations. Recently, sequencing approaches have allowed analysis of low frequency and rare variants in conjunction with common variants, thereby improving the search for functional variants and thus the understanding of the underlying biology of human traits and diseases. Here, we used a large Icelandic whole genome sequence dataset combined with Danish exome sequence data to gain insight into the genetic architecture of serum levels of vitamin B12 (B12) and folate. Up to 22.9 million sequence variants were analyzed in combined samples of 45,576 and 37,341 individuals with serum B12 and folate measurements, respectively. We found six novel loci associating with serum B12 (CD320, TCN2, ABCD4, MMAA, MMACHC) or folate levels (FOLR3) and confirmed seven loci for these traits (TCN1, FUT6, FUT2, CUBN, CLYBL, MUT, MTHFR). Conditional analyses established that four loci contain additional independent signals. Interestingly, 13 of the 18 identified variants were coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. Contrary to epidemiological studies we did not find consistent association of the variants with cardiovascular diseases, cancers or Alzheimer's disease although some variants demonstrated pleiotropic effects. Although to some degree impeded by low statistical power for some of these conditions, these data suggest that sequence variants that contribute to the population diversity in serum B12 or folate levels do not modify the risk of developing these conditions. Yet, the study demonstrates the value of combining whole genome and exome sequencing approaches to ascertain the genetic and molecular architectures underlying quantitative trait associations.
Genome-wide association studies have in recent years revealed a wealth of common variants associated with common diseases and phenotypes. We took advantage of the advances in sequencing technologies to study the association of low frequency and rare variants in conjunction with common variants with serum levels of vitamin B12 (B12) and folate in Icelanders and Danes. We found 18 independent signals in 13 loci associated with serum B12 or folate levels. Interestingly, 13 of the 18 identified variants are coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. These data indicate that the target genes at all of the loci have been identified. Epidemiological studies have shown a relationship between serum B12 and folate levels and the risk of cardiovascular diseases, cancers, and Alzheimer's disease. We investigated association between the identified variants and these diseases but did not find consistent association.
The relatively short read lengths from next generation sequencing (NGS) technologies still pose a challenge for de novo assembly of complex mammalian genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method based on direct intra-molecule ligation (or molecular linker-free circularization) for NGS.
We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments, and can be extended to 10–20 kb (and even in extremes, up to ∼35 kb). We also characterized the impact of low quality input DNA on the method, and developed a whole-genome amplification (WGA) based protocol using limited input DNA (<1 µg). Using this PE dataset, we accurately assembled the YanHuang (YH) genome, the first sequenced Asian genome, into a scaffold N50 size of >2 Mb, which is over 100 times greater than the initial size produced with only small insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data.
In conclusion, we demonstrate here the effectiveness of this long-range PE sequencing method and its use for the de novo assembly of a large, complex genome using NGS short reads.
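The scaffold N50 quoted above is a standard assembly contiguity metric: the length L such that scaffolds of length L or longer together contain at least half of all assembled bases. A minimal sketch of that calculation follows; the function name and the toy scaffold lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # total is 300; 80 + 70 reaches 150, so N50 is 70
```

Because N50 depends on how many bases sit in long scaffolds, joining short scaffolds with long-range paired-end links raises it even when the total number of assembled bases is unchanged.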
Naturally occurring 8-O-methylated sialic acids, including 8-O-methyl-N-acetylneuraminic acid and 8-O-methyl-N-glycolylneuraminic acid, along with 8-O-methyl-2-keto-3-deoxy-D-glycero-D-galacto-nonulosonic acid (Kdn8Me) and 8-deoxy-Kdn were synthesized from corresponding 5-O-modified six-carbon monosaccharides and pyruvate using a sialic acid aldolase cloned from Pasteurella multocida strain P-1059 (PmNanA). In addition, α2–3- and α2–6-linked sialyltrisaccharides containing Neu5Ac8Me and Kdn8Deoxy were also synthesized using a one-pot multienzyme approach. The strategy reported here provides an efficient approach to produce glycans containing various C8-modified sialic acids for biological evaluations.
Chemoenzymatic synthesis; C8-modification; Sialic acid; Sialoside; Sialyltransferase
Despite the importance of neuraminidase (NA) activity in effective infection by influenza A viruses, limited information exists about the differences of substrate preferences of viral neuraminidases from different hosts or from different strains. Using a high-throughput screening format and a library of twenty α2–3- or α2–6-linked para-nitrophenol-tagged sialylgalactosides, substrate specificity of NAs on thirty-seven strains of human and avian influenza A viruses was studied using intact viral particles. Neuraminidases of all viruses tested cleaved both α2–3- and α2–6-linked sialosides but preferred α2–3-linked ones, and the activity was dependent on the terminal sialic acid structure. In contrast to NAs of other subtypes of influenza A viruses which did not cleave 2-keto-3-deoxy-D-glycero-D-galacto-nonulosonic acid (Kdn) or 5-deoxy Kdn (5d–Kdn), NAs of all N7 subtype viruses tested had noticeable hydrolytic activities on α2–3-linked sialosides containing Kdn or 5d–Kdn. Additionally, group 1 NAs showed efficient activity in cleaving N-azidoacetylneuraminic acid from α2–3-linked sialoside.
Carbohydrate; Influenza A virus; Inhibitors; Neuraminidase; Sialic acid; Sialosides; Substrate specificity studies
Human carcinomas can metabolically incorporate and present the dietary non-human sialic acid Neu5Gc, which differs from the human sialic acid N-acetylneuraminic acid (Neu5Ac) by one oxygen atom. Tumor-associated Neu5Gc can interact with low levels of circulating anti-Neu5Gc antibodies, thereby facilitating tumor progression via chronic inflammation in a human-like Neu5Gc-deficient mouse model. Here we show that human anti-Neu5Gc antibodies can be affinity-purified in substantial amounts from clinically-approved intravenous IgG (IVIG) and used at higher concentrations to suppress growth of the same Neu5Gc-expressing tumors. Hypothesizing that this polyclonal spectrum of human anti-Neu5Gc antibodies also includes potential cancer biomarkers, we then characterize them in cancer and non-cancer patients’ sera, using a novel sialoglycan-microarray presenting multiple Neu5Gc-glycans and control Neu5Ac-glycans. Antibodies against Neu5Gcα2–6GalNAcα1-O-Ser/Thr (GcSTn) were found to be more prominent in patients with carcinomas than with other diseases. This unusual epitope arises from dietary Neu5Gc incorporation into the carcinoma marker Sialyl-Tn, and is the first example of such a novel mechanism for biomarker generation. Finally, human serum or purified antibodies rich in anti-GcSTn-reactivity kill GcSTn-expressing human tumors via complement-dependent-cytotoxicity or antibody-dependent-cellular-cytotoxicity. Such xeno-autoantibodies and xenoautoantigens have potential for novel diagnostics, prognostics and therapeutics in human carcinomas.
Antibodies; Biomarkers; Cancer; Neu5Gc; Sialic acids
Aberrant expression of human sialidases has been shown to associate with various pathological conditions. Despite the effort in sialidase inhibitor design, less attention has been paid to designing specific inhibitors against human sialidases and characterizing the substrate specificity of different sialidases regarding diverse terminal sialic acid forms and sialyl linkages. This is mainly due to the lack of sialoside probes and efficient screening methods, as well as limited access to human sialidases. Low cellular expression level of human sialidase NEU2 hampers its functional and inhibitory studies. Here we report the successful cloning and expression of human sialidase NEU2 in E. coli. About 11 mg of soluble active NEU2 was routinely obtained from 1 L of E. coli cell culture. Substrate specificity studies of the recombinant human NEU2 using twenty para-nitrophenol (pNP)-tagged α2–3- or α2–6-linked sialyl galactosides containing different terminal sialic acid forms including common N-acetylneuraminic acid (Neu5Ac), non-human N-glycolylneuraminic acid (Neu5Gc), 2-keto-3-deoxy-D-glycero-D-galacto-nonulosonic acid (Kdn), or their C5-derivatives in a microtiter plate-based high-throughput colorimetric assay identified a unique structural feature specifically recognized by the human NEU2 but not two bacterial sialidases. The results obtained from substrate specificity studies were used to guide the design of a sialidase inhibitor that was selective against human NEU2. The selectivity of the inhibitor was revealed by the comparison of sialidase crystal structures and inhibitor docking studies.
Carbohydrates; Enzymes; Inhibitors; NEU2; Sialidases
The important roles that carbohydrates play in biological processes and their potential application in diagnosis, therapeutics, and vaccine development have made them attractive synthetic targets. Despite ongoing challenges, tremendous progress has been made in recent years in the synthesis of carbohydrates. Chemical glycosylation methods have become more sophisticated and the synthesis of oligosaccharides has become more predictable. Simplified one-pot glycosylation strategies and automated synthesis are increasingly used to obtain biologically important glycans. On the other hand, chemoenzymatic synthesis continues to be a powerful alternative for obtaining complex carbohydrates. This review highlights recent progress in chemical and chemoenzymatic synthesis of carbohydrates with a particular focus on the methods developed for the synthesis of oligosaccharides, polysaccharides, glycolipids, and glycosylated natural products.
Carbohydrates; chemical synthesis; chemoenzymatic synthesis; oligosaccharides; polysaccharides
Analysis across the genome of patterns of DNA methylation reveals a rich landscape of allele-specific epigenetic modification and consequent effects on allele-specific gene expression.
DNA methylation plays an important role in biological processes in human health and disease. Recent technological advances allow unbiased whole-genome DNA methylation (methylome) analysis to be carried out on human cells. Using whole-genome bisulfite sequencing at 24.7-fold coverage (12.3-fold per strand), we report a comprehensive (92.62%) methylome and analysis of the unique sequences in human peripheral blood mononuclear cells (PBMC) from the same Asian individual whose genome was deciphered in the YH project. PBMC constitute an important source for clinical blood tests world-wide. We found that 68.4% of CpG sites and <0.2% of non-CpG sites were methylated, demonstrating that non-CpG cytosine methylation is minor in human PBMC. Analysis of the PBMC methylome revealed a rich epigenomic landscape for 20 distinct genomic features, including regulatory, protein-coding, non-coding, RNA-coding, and repeat sequences. Integration of our methylome data with the YH genome sequence enabled a first comprehensive assessment of allele-specific methylation (ASM) between the two haploid methylomes of any individual and allowed the identification of 599 haploid differentially methylated regions (hDMRs) covering 287 genes. Of these, 76 genes had hDMRs within 2 kb of their transcriptional start sites of which >80% displayed allele-specific expression (ASE). These data demonstrate that ASM is a recurrent phenomenon and is highly correlated with ASE in human PBMCs. Together with recently reported similar studies, our study provides a comprehensive resource for future epigenomic research and confirms new sequencing technology as a paradigm for large-scale epigenomics studies.
Epigenetic modifications such as addition of methyl groups to cytosine in DNA play a role in regulating gene expression. To better understand these processes, knowledge of the methylation status of all cytosine bases in the genome (the methylome) is required. DNA methylation can differ between the two gene copies (alleles) in each cell. Such allele-specific methylation (ASM) can be due to parental origin of the alleles (imprinting), X chromosome inactivation in females, and other as yet unknown mechanisms. This may significantly alter the expression profile arising from different allele combinations in different individuals. Using advanced sequencing technology, we have determined the methylome of human peripheral blood mononuclear cells (PBMC). Importantly, the PBMC were obtained from the same male Han Chinese individual whose complete genome had previously been determined. This allowed us, for the first time, to study genome-wide differences in ASM. Our analysis shows that ASM in PBMC is higher than can be accounted for by regions known to undergo parent-of-origin imprinting and frequently (>80%) correlates with allele-specific expression (ASE) of the corresponding gene. In addition, our data reveal a rich landscape of epigenomic variation for 20 genomic features, including regulatory, coding, and non-coding sequences, and provide a valuable resource for future studies. Our work further establishes whole-genome sequencing as an efficient method for methylome analysis.
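In bisulfite sequencing, an unmethylated cytosine is converted and read as T, while a methylated cytosine is still read as C, so the per-site methylation level can be estimated as the fraction of reads showing C. The sketch below illustrates that calculation only; the read counts are hypothetical, and the 0.5 calling cutoff is an assumption made for this example rather than the criterion used in the study.

```python
def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting methylation at one cytosine.

    After bisulfite conversion, unmethylated C is read as T while
    methylated C is still read as C.
    """
    total = c_reads + t_reads
    if total == 0:
        return None  # site not covered by any read
    return c_reads / total


# Hypothetical (C reads, T reads) per CpG site; not data from the study.
toy_sites = [(9, 1), (0, 12), (6, 6), (0, 0), (11, 0)]

levels = [methylation_level(c, t) for c, t in toy_sites]
covered = [lv for lv in levels if lv is not None]

CALL_THRESHOLD = 0.5  # assumed cutoff for this sketch only
methylated = sum(lv >= CALL_THRESHOLD for lv in covered)
print(f"{methylated}/{len(covered)} covered sites called methylated")  # 3/4
```

Applying the same per-site calculation separately to reads assigned to each of the two haplotypes is one simple way to look for allele-specific differences, although a real analysis would also need a statistical test on the read counts.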
A convenient chemoenzymatic strategy for synthesizing sialosides containing a C5-diversified sialic acid was developed. The α2,3- and α2,6-linked sialosides containing a 5-azido neuraminic acid synthesized by a highly efficient one-pot three-enzyme approach were converted to C5″-amino sialosides, which were used as common intermediates for chemical parallel synthesis to quickly generate a series of sialosides containing various sialic acid forms.
The sialyl Lewis x tetrasaccharide with a propylamine aglycon was assembled by chemoselective glycosylation from a p-tolyl thioglycosyl donor obtained from an enzymatically synthesized sialodisaccharide. Combining the advantages of highly efficient enzymatic synthesis of sialoside building blocks, and diverse chemical glycosylation, this chemoenzymatic approach is practical for obtaining complex sialosides and their analogues.
Sialyl Lewis x; Sialoside; Oligosaccharide; Chemoenzymatic synthesis
Human heterophile antibodies that agglutinate animal erythrocytes are known to detect the non-human sialic acid N-glycolylneuraminic acid (Neu5Gc). This monosaccharide cannot by itself fill the binding site (paratope) of an antibody and can also be modified and presented in various linkages, on diverse underlying glycans. Thus, we hypothesized that the human anti-Neu5Gc antibody response is diverse and polyclonal. Here we use a novel set of natural and chemoenzymatically-synthesized glycans to show that normal humans have an abundant and diverse spectrum of such anti-Neu5Gc antibodies, directed against a variety of Neu5Gc-containing epitopes. High sensitivity and specificity assays were achieved by using N-acetylneuraminic acid (Neu5Ac)-containing probes (differing from Neu5Gc by one less oxygen atom) as optimal background controls. The commonest anti-Neu5Gc antibodies are of the IgG class. Moreover, the range of reactivity and Ig classes of antibodies varies greatly amongst normal humans, with some individuals having remarkably large amounts, even surpassing levels of some well-known natural blood group and xenoreactive antibodies. We purified these anti-Neu5Gc antibodies from individual human sera using a newly developed affinity method and show that they bind to wild-type but not Neu5Gc-deficient mouse tissues. Moreover, they bind back to human carcinomas that have accumulated Neu5Gc in vivo. As dietary Neu5Gc is primarily found in red meat and milk products, we suggest that this ongoing antigen-antibody reaction may generate chronic inflammation, possibly contributing to the high frequency of diet-related carcinomas and other diseases in humans.
Antibodies; Cell surface molecules; Human; Neu5Gc; Sialic acids
Sialic acid aldolases or N-acetylneuraminate lyases (NanAs) catalyze the reversible aldol cleavage of N-acetylneuraminic acid (Neu5Ac) to form pyruvate and N-acetyl-D-mannosamine (ManNAc). A capillary electrophoresis (CE) assay was developed to directly characterize the activities of NanAs in both Neu5Ac cleavage and Neu5Ac synthesis directions. The assay was used to obtain the pH profile and the kinetic data of a NanA cloned from Pasteurella multocida P-1059 (PmNanA) and a previously reported recombinant Escherichia coli K12 NanA (EcNanA). Both enzymes are active in a broad pH range of 6.0–9.0 in both reaction directions and have similar kinetic parameters. Substrate specificity studies showed that 5-O-methyl-ManNAc, a ManNAc derivative, can be used efficiently as a substrate by PmNanA, but not efficiently by EcNanA, for the synthesis of 8-O-methyl Neu5Ac. In addition, PmNanA (250 mg per liter culture) has a higher expression level (2.5-fold) than EcNanA (94 mg per liter culture). The higher expression level and broader substrate tolerance make PmNanA a better catalyst than EcNanA for the chemoenzymatic synthesis of sialic acids and their derivatives.
aldolase; capillary electrophoresis; Escherichia coli; lyase; NanA; Pasteurella multocida
Group B Streptococcus (GBS) is a common cause of neonatal sepsis and meningitis. A major GBS virulence determinant is its sialic acid (Sia)-capped capsular polysaccharide (CPS). Recently, we discovered the presence and genetic basis of capsular Sia O-acetylation in GBS. We now characterize a GBS Sia O-acetylesterase that modulates the degree of GBS surface O-acetylation. The GBS Sia O-acetylesterase operates cooperatively with the GBS CMP-Sia synthetase, both part of a single polypeptide encoded by the neuA gene. NeuA de-O-acetylation of free 9-O-acetyl-N-acetylneuraminic acid (Neu5,9Ac2) was enhanced by CTP and Mg2+, the substrate and co-factor respectively of the N-terminal GBS CMP-Sia synthetase domain. In contrast, the homologous bi-functional NeuA esterase from E. coli K1 did not display cofactor dependence. Further analyses showed that in vitro, GBS NeuA can operate via two alternate enzymatic pathways: de-O-acetylation of Neu5,9Ac2, followed by CMP-activation of Neu5Ac; or, activation of Neu5,9Ac2, then de-O-acetylation of CMP-Neu5,9Ac2. Consistent with in vitro esterase assays, genetic deletion of GBS neuA led to accumulation of intracellular O-acetylated Sias, and over-expression of GBS NeuA reduced O-acetylation of Sias on the bacterial surface. Site-directed mutagenesis of conserved asparagine residue 301 abolished esterase activity, but preserved CMP-Sia synthetase activity, as evidenced by hyper-O-acetylation of CPS Sias on GBS expressing only the N301A NeuA allele. These studies demonstrate a novel mechanism regulating the extent of capsular Sia O-acetylation in intact bacteria, and provide a genetic strategy for manipulating GBS O-acetylation, in order to explore the role of this modification in GBS pathogenesis and immunogenicity.