The KIR are members of the immunoglobulin superfamily (IgSF) and were formerly called Killer-cell Inhibitory Receptors. KIRs have been shown to be highly polymorphic at both the allelic and haplotypic levels (4). They are composed of two or three Ig-domains, a transmembrane region and a cytoplasmic tail, which can be short (activatory) or long (inhibitory). The Leukocyte Receptor Complex (LRC), which encodes the KIR genes, has been shown to be polymorphic, polygenic and complex in a manner similar to the MHC. Because of the complexity of the KIR region and of KIR sequences, a KIR Nomenclature Committee was established in 2002 to undertake the naming of KIR allele sequences. The first KIR Nomenclature report was published in 2003 (5), which coincided with the first release of the IPD-KIR database. The number of officially named KIR alleles has increased since the initial release, which contained 89 alleles. As of September 2009, there are over 450 alleles, which code for over 230 unique protein sequences. As a result, there have been nine further releases of the IPD-KIR database since the initial release, the latest being in February 2009.
The online tools available for IPD-KIR include those for performing allele queries, sequence alignments and cell queries. As the database is based on the work of a nomenclature committee, the website includes links to portable document format (PDF) files of recent nomenclature reports. From the data contained within these reports the database is also able to provide individual allele reports (Figure 1). These pages contain the official allele name, any previous designations, the EMBL, GenBank or DDBJ accession number(s) and a reference linked, wherever possible, to the PubMed abstract. Where possible, additional details on the source of the sequence are also provided. This source material is normally the cell line or DNA from which the allele was isolated and characterised, together with its availability. The information contained within this dataset can be searched independently of the allele data.
Figure 1. Allele report. The figure shows part of the report provided for each KIR allele. The report provides cross-references to an SRS entry (KIR00002), the source entries in EMBL-BANK (AY789055-U24076) and to the seminal citations in PubMed.
Within each IPD section, alleles of a particular gene may differ from each other by as little as a single nucleotide. With sometimes hundreds of alleles for a particular gene, we need to be able to represent graphically where these differences lie. The polymorphic positions are also often conserved within certain groups of alleles. For this reason the sequences are displayed as multiple sequence alignments that highlight the polymorphic positions. These alignments allow a visual interpretation of sequence similarity, so that polymorphic positions and motifs shared by multiple alleles can easily be identified.
The sequence alignments are available via a link from the section homepage. The sequence alignment tool uses the same basic interface for both the IPD-KIR and IPD-MHC databases. The interface provided (Figure 2) lets the user define a number of key variables for the alignment before producing an online output, which can be printed or downloaded. The first step in any alignment is to select the locus of interest. The tool provides a drop-down list of all loci. Selecting a locus automatically updates the list of features that can be aligned, as well as the default reference sequence used for the alignment. The features available for alignment are the nucleotide coding sequence and individual exons, the signal peptide, the mature protein and the full-length protein sequence. The database also includes genomic sequence, covering the exons, introns and some of the 5′ and 3′ untranslated regions, for over half of the alleles in the KIR system. The alignment tool options also allow the user to display a subset of alleles of a particular locus, to omit alleles unsequenced for a particular region and to align against a particular reference or consensus sequence. The alignment tool uses standard formatting for the display of sequence alignments.
The alignment tool does not perform a sequence alignment each time it is used; instead it extracts pre-aligned sequences, allowing for faster access. The alignments adhere to a number of conventions for displaying evolutionary events and numbering. The numbering of the alignments is based upon the sequence of the reference allele. For a nucleotide sequence, the A of the initiation methionine codon (ATG) is denoted nucleotide +1 and the nucleotide 5′ to +1 is numbered −1. There is no nucleotide zero (0). All numbering is based on the ATG of the reference sequence. If a nucleotide sequence is displayed in codons, then the protein numbering is applied. For amino acid-based alignments, the first codon of the mature protein, after cleavage of the signal sequence, is labelled codon 1 and the codon 5′ to this is numbered −1.
In all sequences the following conventions are used. Where there is identity to the reference sequence, the base is displayed as a hyphen (-). Non-identity to the reference sequence is shown by displaying the appropriate base at that position. Where an insertion or deletion has occurred, this is represented by a period (.). If the sequence is unknown at any point in the alignment, this is represented by an asterisk (*). In protein alignments for null alleles, the 'Stop' codons are represented by an X, and the sequence following the termination codon is not marked up and appears blank. The flexibility of the new alignment tool means that, unlike in previous alignments, you can now display a small subset of sequences against an allele of your choice, using a number of display options (Figure 3).
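The numbering conventions described above can be illustrated with a short Python sketch. This is not the code used by the IPD alignment tool; the function names and the 0-based index arguments are assumptions introduced here purely for illustration. It shows how a position can be mapped to a number so that counting runs ..., −2, −1, +1, +2, ... with no zero.

```python
def nucleotide_number(index: int, atg_index: int) -> int:
    """Map a 0-based index in the reference sequence to its alignment number.

    atg_index is the 0-based index of the A of the initiation ATG
    (hypothetical argument, not an IPD field). The A is numbered +1 and
    the base immediately 5' of it is -1, so 0 is never returned.
    """
    offset = index - atg_index
    return offset + 1 if offset >= 0 else offset


def codon_number(codon_index: int, signal_peptide_codons: int) -> int:
    """Map a 0-based codon index (0 = initiation codon) to protein numbering.

    The first codon of the mature protein is numbered 1 and the last
    codon of the signal sequence is numbered -1.
    """
    offset = codon_index - signal_peptide_codons
    return offset + 1 if offset >= 0 else offset
```

For example, with the ATG starting at index 5, `nucleotide_number(5, 5)` gives 1 and `nucleotide_number(4, 5)` gives -1. The same shift applies to codons once the length of the signal sequence is known.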
Figure 2. Alignment interface. The alignment interface provides a user-friendly method of viewing sequence alignments with output options easily selected.
Figure 3. Alignment formats available from IPD. In these alignments a dash (−) indicates identity to the reference sequence and an asterisk (*) denotes an unsequenced base. The first alignment shows the default output for the nucleotide sequence of KIR3DL2.
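The display conventions for identity, non-identity, insertions or deletions and unsequenced positions can be sketched as follows. This is a minimal illustration under stated assumptions, not the IPD implementation: it assumes both sequences are already aligned to the same length, that `.` and `*` are already present in the stored allele sequence, and the function name and toy sequences are hypothetical.

```python
def render_against_reference(reference: str, allele: str) -> str:
    """Render a pre-aligned allele sequence relative to a reference.

    '-' marks identity to the reference, a base marks non-identity,
    '.' marks an insertion or deletion and '*' marks unknown sequence.
    """
    if len(reference) != len(allele):
        raise ValueError("sequences must be pre-aligned to the same length")
    rendered = []
    for ref_base, base in zip(reference, allele):
        if base in ".*":
            rendered.append(base)   # indel or unsequenced: keep the marker
        elif base == ref_base:
            rendered.append("-")    # identical to the reference
        else:
            rendered.append(base)   # differs from the reference
    return "".join(rendered)


# Toy sequences invented for illustration; they are not KIR sequences.
reference = "ATGCGT.A"
print(render_against_reference(reference, "ATGAGTCA"))  # ---A--C-
print(render_against_reference(reference, "ATG.GT.*"))  # ---.--.*
```

Because the sequences are stored pre-aligned, rendering against a different reference only requires calling the function again with another reference string; no realignment is needed.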
Further recent additions to the tools available from the IPD-KIR site include a KIR ligand calculator (6). The ligand calculator allows the user to determine which KIR ligands are present in a transplant setting, based on the HLA typing of a patient and prospective donor. This is of interest because recent transplant strategies based on KIR-ligand mismatch to predict NK cell alloreactivity have resulted in less relapse, less GvHD and better overall survival in patients with acute myeloid leukaemia (AML) (7). The KIR ligands are HLA molecules that can be grouped into three major categories based on the amino acid sequence determining the KIR-binding epitopes in HLA-C and HLA-B molecules. All expressed HLA-C alleles are of the C1 or C2 group (8) and most HLA-B alleles can be classified as either Bw4 or Bw6 (9). The receptors KIR2DL1, KIR2DL2 and KIR3DL1 bind the KIR ligands C2, C1 and Bw4, respectively, resulting in inhibition of NK cell-mediated lysis. The output lists the ligands associated with the HLA typing entered. In the case of two-digit typings, the most common ligand for the allele type is displayed. A link to the full list of alleles matching the type given, and their associated ligands, is also provided. For two-digit typings, any exceptions are also listed along with their motif. For example, B*15 alleles are predominantly Bw6; however, there are a number of alleles, B*1513 for example, which contain the Bw4 motif. The output classifies alleles in the exceptions list as ‘Rare’ if they have a gene frequency of <0.001 and have been seen in fewer than three unrelated individuals (10). This classification allows users to judge whether the exceptions are likely to be seen in their samples.
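The lookup logic described for the ligand calculator can be sketched in Python. This is only an illustration of the rules stated in the text, not the calculator's actual code: the table contents are limited to the B*15 and B*1513 example and the receptor-ligand pairs given above, and all function and variable names are hypothetical.

```python
from typing import Dict, Optional

# Receptor-ligand pairs as stated in the text.
RECEPTOR_LIGAND: Dict[str, str] = {
    "KIR2DL1": "C2",
    "KIR2DL2": "C1",
    "KIR3DL1": "Bw4",
}

# Illustrative tables holding only the example from the text;
# a real table would cover every HLA-B and HLA-C allele group.
MOST_COMMON_LIGAND: Dict[str, str] = {"B*15": "Bw6"}
EXCEPTIONS: Dict[str, str] = {"B*1513": "Bw4"}


def ligand_for(typing: str) -> Optional[str]:
    """Return the ligand for a typing, or None if it is not in the tables."""
    if typing in EXCEPTIONS:
        return EXCEPTIONS[typing]
    # Fall back to the two-digit group, e.g. "B*1501" -> "B*15".
    locus, _, digits = typing.partition("*")
    return MOST_COMMON_LIGAND.get(f"{locus}*{digits[:2]}")


def exceptions_for(two_digit_typing: str) -> Dict[str, str]:
    """List alleles of a two-digit type whose ligand differs from the default."""
    return {a: m for a, m in EXCEPTIONS.items() if a.startswith(two_digit_typing)}


def is_rare(gene_frequency: float, unrelated_individuals_seen: int) -> bool:
    """'Rare' as defined in the text: frequency <0.001 AND fewer than three
    unrelated individuals."""
    return gene_frequency < 0.001 and unrelated_individuals_seen < 3
```

With these tables, `ligand_for("B*15")` returns `"Bw6"`, `ligand_for("B*1513")` returns `"Bw4"`, and `exceptions_for("B*15")` returns the single listed exception. Both conditions must hold for `is_rare` to return `True`, so an allele with a low frequency that has been seen in three or more unrelated individuals is not flagged.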
The typing of KIRs is dependent on up-to-date lists of alleles and primers, and many typing laboratories have spreadsheets detailing probe hit patterns for different alleles. Each time a new database is released it is necessary to update these ever-expanding lists. The new Probe and Primer Search Tool allows users to enter a list of primer sequences; the tool then searches the known coding sequences for these and reports any matches in a file format suitable for cutting and pasting into existing spreadsheets. The tool is currently limited to coding sequence, but as the number of genomic sequences in the database expands, the tool will be modified to search these regions as well. A number of groups involved in developing KIR typing have proposed a Community Standard Reporting Format for KIR Genotyping Data; these guidelines are available from the IPD-KIR website.
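The kind of search performed by a probe and primer tool can be sketched as below. This is a simplified illustration, not the IPD tool itself: it assumes exact, case-insensitive matching on the given strand only (no reverse complement, no mismatches), tab-delimited output as the spreadsheet-friendly format, and hypothetical function names and toy sequences.

```python
import csv
import io
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, int]  # (primer, allele name, 1-based start position)


def search_primers(primers: Iterable[str], coding_sequences: Dict[str, str]) -> List[Hit]:
    """Report every exact occurrence of each primer in each coding sequence."""
    hits: List[Hit] = []
    for primer in primers:
        query = primer.upper()
        for allele, sequence in coding_sequences.items():
            target = sequence.upper()
            start = target.find(query)
            while start != -1:
                hits.append((primer, allele, start + 1))
                start = target.find(query, start + 1)  # allow overlapping hits
    return hits


def hits_as_tsv(hits: Iterable[Hit]) -> str:
    """Format hits as tab-delimited text that pastes cleanly into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["primer", "allele", "position"])
    writer.writerows(hits)
    return buffer.getvalue()


# Toy data invented for illustration; these are not KIR alleles or primers.
toy_sequences = {"toy*001": "ATGTCGCTCA", "toy*002": "ATGTCGCTTA"}
print(hits_as_tsv(search_primers(["TCGCTC", "GCT"], toy_sequences)))
```

In this toy case the first primer matches only `toy*001`, while the second matches both sequences, which is the kind of hit pattern a typing laboratory would record per allele.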
The IPD-KIR database is also being expanded to include KIR sequences from other species; most recently, work has begun on including the sequences of KIR alleles found in rhesus macaques (Macaca mulatta). These sequences have been compiled through direct submission and through data-mining existing sequences from the generalist sequence databanks. The first release of the official Mamu-KIR nomenclature will include 107 alleles covering 13 loci. Further loci will be added once haplotype studies have allowed the identification of the genes present. Sequences from other primate species, such as the crab-eating macaque (Macaca fascicularis), have also been submitted to the database, and these sequences will be included at a later date. The non-human KIR sequences will be included in the IPD-KIR section and will be accessible using the same tools as the human KIR sequences.