Some studies have compared whole neisserial genomes in an attempt to elucidate the basis for differences in invasiveness and pathogenicity. Other than the presence of a capsule and the production of endotoxin, no definitive requirements for virulence and pathogenicity have been found. All sequenced genomes differ in gene content from each other; the absolute number of differences is understandably greater in strains from different clonal complexes. Even within a clonal complex, some genes are present in some strains but not in all. Some of these variably present genes are phage-related and their presence/absence may reflect evolutionary distance since the time of phage acquisition. Many other variably present genes are annotated as encoding hypothetical proteins, with insufficient similarity to well-characterized genes to allow prediction of function.
A phylogenetic tree of genetic relatedness of early strain type, late strain type and 20 available sequenced meningococcal genomes indicates that the serogroup Y strains are most closely related to each other and that they cluster in a branch not closely related to other sequenced genomes. The serogroup Y strains in this study differ in gene content from other sequenced genomes in the same way as other sequenced genomes differ from each other, that is, mainly in phage-related genes and in genes encoding hypothetical proteins. Most notably, the serogroup Y genomes are missing the genes located in the putative islands of horizontal transfer PNM1, IHT-C and IHT-E, all of which appear to be of phage origin. It has been suggested that these genes contribute to virulence and pathogenicity, but their actual roles are currently unknown.
Closely related strains that differ in their ability to cause disease are ideal for investigating the genetic factors that contribute to that ability. An analogous situation is presented in the case of replacement of one invasive strain by another over time, where the analysis involves determination of factors responsible for the decrease in incidence of the earlier strain and increase in incidence of the later strain. In this study, we used pyrosequencing to obtain draft genomes of two closely related clonal complex 23 meningococcal strains from Maryland: one whose ST, OMP profile and PFGE profile were representative of a strain causing disease in the early 1990s and one with a different OMP and PFGE profile which was responsible for disease later in that decade. The early and late strains exhibited antigenic differences which can be postulated to have contributed to the decline of the early strain type and the emergence of the late strain type.
The majority of ORFs in the two genomes had few or no mutations, leading to an overall low level of πN and πS. The highest value of both πN and πS was found in the gene encoding PorB, which is known to be both immunogenic and highly variable. Other genes with relatively high πN and πS included those encoding PorA, FetA, PilV, LbpB, and HpuA. A number of housekeeping genes were represented among those with high πN and πS, including shikimate 5-dehydrogenase (aroE), one of the seven genes used for MLST. The late strain type is a single locus variant of ST-23; single locus variants are considered to have arisen by horizontal gene transfer if the variant allele exhibits extensive differences from the allele found in the parent sequence type. Therefore, the πN and πS values for shikimate 5-dehydrogenase can be used as a marker for horizontal gene transfer. Genes in the early and late genomes whose πN and πS are similar to that of aroE have likely been involved in horizontal gene transfer.
In addition to the overall low values of πN, loci which differed between the early and late strain type genomes exhibited a surplus of πS. This phenomenon has been documented in methicillin-resistant Staphylococcus aureus and Clostridium difficile. Recent mutations show an excess of non-synonymous changes, which will be modified over time by the effects of selection. Alleles which result from horizontal gene transfer have a relative surplus of synonymous changes because in them, selection has already purged deleterious non-synonymous changes. The relative abundance of synonymous over non-synonymous changes in genes which differ between the early and late strains is another indication that these genes have been involved in horizontal gene transfer.
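As an illustration of how πN and πS can be obtained for a single gene, the following Python sketch applies Nei-Gojobori-style counting to one pair of aligned coding sequences (for two sequences, π is simply the pairwise proportion of differences). It is a minimal sketch under stated assumptions and not necessarily the procedure used in this study: the function names are hypothetical, the sequences are assumed to be aligned in frame without gaps, changes to stop codons are counted as non-synonymous when counting sites, and no multiple-hit correction is applied.

```python
from itertools import permutations, product

# Standard genetic code, codons ordered TCAG at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    syn_changes = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Simplification: a change to a stop codon counts as non-synonymous.
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                syn_changes += 1
    return syn_changes / 3


def codon_diffs(c1, c2):
    """Synonymous and non-synonymous differences between two sense codons,
    averaged over all mutational pathways that avoid stop codons."""
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    if not diff_pos:
        return 0.0, 0.0
    sd_total = nd_total = 0.0
    n_paths = 0
    for order in permutations(diff_pos):
        cur, sd, nd, valid = c1, 0, 0, True
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODON_TABLE[nxt] == "*":
                valid = False  # discard pathways passing through a stop codon
                break
            if CODON_TABLE[nxt] == CODON_TABLE[cur]:
                sd += 1
            else:
                nd += 1
            cur = nxt
        if valid:
            sd_total += sd
            nd_total += nd
            n_paths += 1
    if n_paths == 0:
        # Assumption: with no valid pathway, treat every change as non-synonymous.
        return 0.0, float(len(diff_pos))
    return sd_total / n_paths, nd_total / n_paths


def pn_ps(seq1, seq2):
    """Return (pN, pS) for two aligned, in-frame coding sequences."""
    seq1, seq2 = seq1.upper(), seq2.upper()
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    s_sites = s_diff = n_diff = 0.0
    n_codons = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        # Skip codons with ambiguous bases or stop codons.
        if any(c not in CODON_TABLE or CODON_TABLE[c] == "*" for c in (c1, c2)):
            continue
        s_sites += (syn_sites(c1) + syn_sites(c2)) / 2
        sd, nd = codon_diffs(c1, c2)
        s_diff += sd
        n_diff += nd
        n_codons += 1
    n_sites = 3 * n_codons - s_sites
    p_s = s_diff / s_sites if s_sites else 0.0
    p_n = n_diff / n_sites if n_sites else 0.0
    return p_n, p_s
```

Under this counting scheme, a gene whose early and late alleles differ mostly at third codon positions yields pS well above pN, the pattern interpreted above as a signature of an allele that has already been exposed to purifying selection.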
Given that both strain types caused similar rates of disease in the same population during different periods of the same decade, we did not expect major differences in virulence between them. Accordingly, our principal hypothesis was that emergence of the late strain type was primarily due to antigenic changes that allowed escape from population immunity. A number of proteins are known to be immunogenic in N. meningitidis and many others can be predicted to be immunogenic based upon their structure or their function. Our comparison of whole genomes found a preponderance of antigens unchanged but also found differences in a number of loci that contribute to the antigenic profile, and these differences may have been instrumental in disease emergence. Genes exhibiting differences include those encoding antigenic outer membrane proteins, genes involved in pilus structure, function and glycosylation, and genes involved in iron acquisition and uptake. Differences in the genes encoding the antigens PorA, PorB, and FetA are more extensive than previously described. Additional antigens differing between the early and late strain types included putative lipoproteins and the vaccine target Ag473. Neisserial type IV pili (Tfp) are necessary for adhesion of bacterial cells to human mucosal cells and are major antigens. Differences were found in genes that can be expected to affect the Tfp antigen profile, including those encoding the major structural subunit PilE and the pilus tip adhesin PilC. The pilH/I/J loci differed between early and late strains but most strains within the population tested contained the late strain type alleles. These loci may not have been involved in emergence of the late strain type. Alternatively, the late strain type alleles may have been acquired early in the genetic transformation of the early to late strain type. The presence of the early strain type alleles in isolates obtained in three different years (1998, 1999, and 2001) may imply persistence of a sub-population possessing those alleles or multiple instances of horizontal transfer. Pilin glycosylation is believed to be an important mechanism for decreasing immunogenicity of the pilin subunits by masking exposed areas of the protein. In N. meningitidis, the gene encoding pilin glycosylation protein B (PglB) has been documented to exist in two forms, pglB and pglB2. The early strain type contains pglB, while the late strain type contains pglB2. Based upon the number and extent of differences in genes affecting the structure and function of the pilus, it can be theorized that these differences contributed to the change in disease epidemiology.
Iron acquisition and uptake are necessary for survival of N. meningitidis in the human host, where iron is sequestered in a variety of forms, including lactoferrin, transferrin, and hemoglobin/haptoglobin. N. meningitidis has evolved mechanisms for obtaining iron from these sources using several two-component systems. These systems are antigenic, as they require surface exposure to bind their substrates. In addition, they are highly variable, with different alleles found in otherwise similar strains during epidemics, invoking the concept of a “genocloud”, a combination of a dominant strain and its close relatives. The early and late strain types differed at genes encoding the main iron acquisition systems for lactoferrin, transferrin, and hemoglobin/haptoglobin. The alleles for these genes were consistent in a population of late strain type isolates. However, some early strain type isolates contained the late strain type allele for lbpB and the early strain type isolates contained multiple variants of tbpB, which did not match either the early or late allele. These results may indicate selection for recombinants at these loci in a population of isolates against which human immunity has developed.
We also identified genes encoding known and predicted lipoproteins that were highly similar or identical in the two genomes. Factor H binding protein (fHbp), neisserial heparin binding antigen (nhbA) and nadA have been identified as encoding possible targets for vaccines for prevention of serogroup B meningococcal disease as well as that caused by other serogroups. The gene nadA was absent in the early and late strain type genomes, but fHbp and nhbA were highly conserved. This study found 46 other genes encoding predicted lipoproteins that were identical in amino acid sequence between the early and late strain types and an additional 20 with greater than 99% amino acid identity. The majority of the identical lipoproteins were also highly conserved across 20 sequenced genomes, suggesting that some of them may be appropriate for investigation as vaccine targets. Conservation of antigens may indicate that structural constraints based upon function limit variation in those proteins. Conserved antigens are superior vaccine candidates, since highly variable vaccine targets complicate vaccine formulation, requiring the inclusion of many variants. However, highly conserved antigens may be a disadvantage to the bacteria, since their conservation predicts that any immunity they engender will be broad and will therefore limit spread of many strains.
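The grouping of lipoproteins into identical and greater than 99% identical classes can be sketched as follows. This is an illustrative Python example only: the function names, the input layout, and the treatment of gaps (a gap opposite a residue counts as a mismatch) are assumptions, and the sequences are assumed to have been aligned beforehand by a separate tool.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent amino acid identity between two pre-aligned protein sequences.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a gap opposite a residue is counted as a mismatch (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable alignment columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def classify_conservation(aligned_pairs, threshold=99.0):
    """Group genes by identity between early and late strain type proteins.

    aligned_pairs maps a gene name to a tuple (early_seq, late_seq).
    Returns a dict with the sorted gene names in each class.
    """
    classes = {"identical": [], "above_threshold": [], "variable": []}
    for gene, (early, late) in aligned_pairs.items():
        identity = percent_identity(early, late)
        if identity == 100.0:
            classes["identical"].append(gene)
        elif identity > threshold:
            classes["above_threshold"].append(gene)
        else:
            classes["variable"].append(gene)
    return {name: sorted(genes) for name, genes in classes.items()}
```

For example, a 200-residue protein with a single substitution between the two strain types has 99.5% identity and would fall into the second class.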
Only a few genes were found in one of the genomes sequenced in this study but not the other. The majority of these encode hypothetical proteins, but some of the products are predicted to be membrane associated and therefore possibly antigenic. Determining the exact function and importance of these genes will require further investigation.
Sequencing of a single genome provides a complete picture of that particular genome but does not provide information on variations among a population of closely related strains. While some studies have sequenced multiple strains, the relationship of a sequenced genome to a wider population of similar strains is still relatively unexplored. Our results indicate that some genes differ even among isolates from a closely related population. The identity of these genes gives important insights into the ways in which N. meningitidis adapts to the immune response. Allelic variation of known antigens has been demonstrated by targeted Sanger sequencing, but this method requires prior knowledge of the targeted genes. The whole genome approach allows investigation of known targets but also facilitates identification of novel genes whose importance and even function have not previously been determined.
The existence of two strains that are close temporally and geographically, with a majority of genes identical but still some genes highly variant, accentuates the mosaic nature of the meningococcal genome and the ability of this bacterium to acquire new gene variants through horizontal gene transfer and to continue to cause invasive disease. The results of this study suggest that specific genes, mostly encoding antigens, were associated with and were potentially responsible for the expansion of serogroup Y disease. However, this study represents an ecological analysis which makes causality difficult to prove. Additional studies involving multiple examples of clonal emergence of N. meningitidis are needed to determine whether there are consistent antigenic features associated with meningococcal disease emergence.