By revealing complete repertoires of genes, genome sequences provide the key to a better and eventually global understanding of the biology of living organisms. It is widely accepted that this will have important consequences for human health and the economy by leading to the rational design of novel therapies against pathogens infecting humans, livestock or crops [
1]. For example, identifying genes essential for cell viability or pathogenesis would uncover targets for new antibiotics or drugs that selectively interfere with virulence mechanisms of pathogenic species, respectively. The major obstacle is that hundreds of predicted coding sequences (CDSs) in every genome remain uncharacterized. Unraveling gene function on such a large scale requires suitable biological resources, which are lacking in most species.
As shown in
Saccharomyces cerevisiae, the model organism for genomics, the most valuable toolbox for determining gene function on a genome scale is likely to be a comprehensive archived collection of mutants [
2]. In bacteria, archived collections of mutants containing mutations in most or all non-essential genes have been constructed by systematic targeted mutagenesis in model species (
Escherichia coli and
Bacillus subtilis) and the genetically tractable soil species
Acinetobacter baylyi [
3-
5]. Incidentally, this defined the genes necessary to support cellular life (the minimal genome) as those not amenable to mutagenesis. For a few other bacterial species (
Corynebacterium glutamicum,
Francisella novicida,
Mycoplasma genitalium,
Pseudomonas aeruginosa and
Staphylococcus aureus) transposon mutagenesis followed by sequencing of the transposon insertion sites has been used to generate large (but incomplete) archived libraries of mutants [
6-
11]. However, several factors often limit the contribution of these toolboxes to the large-scale unraveling of gene function and/or the design of novel therapies, including: slow growth and complex nutritional requirements (
M. genitalium); the fact that many of these species do not cause disease in humans (
C. glutamicum,
F. novicida); the use of strains for which no accurate genome annotation is available; and the frequent lack of publicly accessible online databases for analysis and distribution of the mutants.
Neisseria meningitidis (the meningococcus) possesses several features that make it a good candidate among human pathogens for the creation of such a biological resource. The meningococcus, which colonizes the nasopharyngeal mucosa of more than 10% of mankind (usually asymptomatically), grows on simple media with a rapid doubling time and has a relatively compact genome of approximately 2.2 Mbp [
12-
15]. Furthermore, it is naturally competent throughout its growth cycle and is therefore a workhorse for genetics. Yet, it is a feared human pathogen because, upon entry into the bloodstream, it causes meningitis and/or septicemia, which can be fatal within hours [
16]. Each year there are approximately 1.2 million cases of meningococcal infection worldwide, mostly in infants, children and adolescents, leading to an estimated 135,000 deaths [
17].
Here we have exploited these meningococcal features to design NeMeSys, a toolbox for
N. meningitidis systematic functional analysis. We opted for strain 8013 (serogroup C), which was isolated at the Institut Pasteur in 1989 from the blood of a 57-year-old male. This strain belongs to the ST-18 clonal complex, often associated with disease in countries of Central and Eastern Europe. It was chosen primarily because it is well characterized (extensively used to study adhesion to human cells and type IV pilus (Tfp) biology) and has been previously used to produce an archived library of approximately 4,500 transposon mutants [
18]. We created NeMeSys by sequencing the genome of strain 8013, annotating it manually using MicroScope, a powerful platform for microbial genome annotation [
19], and sequencing/mapping the transposon insertion sites in 83% of the above mutants, which showed that 924 genes were hit. Taking advantage of
N. meningitidis natural competence for transformation, we designed a targeted
in vitro transposon mutagenesis approach useful for completing the library in the future and validated it by constructing 26 mutants. The current library contains mutants in 947 genes of strain 8013. All these datasets were stored in a publicly accessible thematic database (NeisseriaScope) within MicroScope [
19]. Furthermore, to maximize the potential of NeMeSys for functional analysis and foster its use in the
Neisseria community where multiple strains are used, we have manually (re)annotated the following publicly available genome sequences: four
N. meningitidis clinical isolates from different clonal complexes, strains MC58 (ST-32, serogroup B), Z2491 (ST-4, serogroup A), FAM18 (ST-11, serogroup C) and 053442 (ST-4821, serogroup C) [
12-
15]; one unencapsulated
N. meningitidis carrier isolate (strain α14) [
20]; one isolate of the commensal
N. lactamica (ST-640), which shares the same ecological niche as
N. meningitidis; and two clinical isolates of the closely related human pathogen
N. gonorrhoeae (strains FA 1090 and NCCP11945), which colonizes an entirely different niche (the urogenital tract) [
21]. As above, these genomes have been stored in NeisseriaScope and are publicly accessible. Finally, we present evidence obtained through functional and comparative genomics illustrating how NeMeSys can be used to narrow the gap between sequence and function in the meningococcus.