Model organisms are becoming increasingly important for the study of complex diseases such as type 1 diabetes (T1D). The non-obese diabetic (NOD) mouse is an experimental model for T1D, having been bred to develop the disease spontaneously in a process that is similar to that in humans. Genetic analysis of the NOD mouse has identified around 50 disease loci, which have the nomenclature Idd for insulin-dependent diabetes, distributed across at least 11 different chromosomes. In total, 21 Idd regions across 6 chromosomes that are major contributors to T1D susceptibility or resistance were selected for finished sequencing and annotation at the Wellcome Trust Sanger Institute. Here we describe the generation of 40.4 megabase pairs of finished sequence from 289 bacterial artificial chromosomes for the NOD mouse. Manual annotation has identified 738 genes in the diabetes-sensitive NOD mouse and 765 genes in homologous regions of the diabetes-resistant C57BL/6J reference mouse across 19 candidate Idd regions. This has allowed us to call variation consequences between homologous exonic sequences for all annotated regions in the two mouse strains. We demonstrate the importance of this resource further by illustrating the technical difficulties that regions of inter-strain structural variation between the NOD mouse and the C57BL/6J reference mouse can cause for current next-generation sequencing and assembly techniques. Furthermore, we have established that the variation rate in the Idd regions is 2.3 times higher than the mean found for the whole genome assembly for the NOD/ShiLtJ genome, which we suggest reflects the fact that positive selection for functional variation in immune genes is beneficial in regard to host defence. In summary, we provide an important resource, which aids the analysis of potential causative genes involved in T1D susceptibility.
The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems.
The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type I interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many unannotated porcine genes. A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited accelerated evolution, compared with 4.1% across the entire genome.
This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.
Immune response; Porcine; Genome annotation; Co-expression network; Phylogenetic analysis; Accelerated evolution
Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.
Microtubule organization and dynamics are essential during axon and dendrite formation and maintenance in neurons. However, little is known about the regulation of microtubule dynamics during synaptic development and function in mammalian neurons. Here, we present evidence that the microtubule plus-end tracking protein CLASP2 (cytoplasmic linker associated protein 2) is a key regulator of axon and dendrite outgrowth that leads to functional alterations in synaptic activity and formation. We found that CLASP2 protein levels steadily increase throughout neuronal development in the mouse brain and are specifically enriched at the growth cones of extending neurites. The shRNA-mediated knockdown of CLASP2 in primary mouse neurons decreased axon and dendritic length, whereas overexpression of human CLASP2 caused the formation of multiple axons, enhanced dendritic branching, and Golgi condensation, implicating CLASP2 in neuronal morphogenesis. In addition, the CLASP2-induced morphological changes led to significant functional alterations in synaptic transmission. CLASP2 overexpression produced a large increase in spontaneous miniature event frequency that was specific to excitatory neurotransmitter release. The changes in presynaptic activity produced by CLASP2 overexpression were accompanied by increases in presynaptic terminal circumference and total synapse number, and by a selective increase in presynaptic proteins that are involved in neurotransmitter release. We also found a smaller increase in miniature event amplitude that was accompanied by an increase in postsynaptic surface expression of GluA1 receptors. Together, these results provide evidence for involvement of the microtubule plus-end tracking protein CLASP2 in cytoskeleton-related mechanisms underlying neuronal polarity and in the interplay between microtubule stabilization and synapse formation and activity.
CLASP2; Golgi; axons; synapses; dendrites
Coupling of heterotrimeric G proteins to activated G protein-coupled receptors results in nucleotide exchange on the Gα subunit, which in turn decreases its affinity for both Gβγ and activated receptors. N-terminal myristoylation of Gα subunits aids in membrane localization of inactive G proteins. Despite the presence of the covalently attached myristoyl group, Gα proteins are highly soluble after GTP binding. This study investigated factors facilitating the solubility of the activated, myristoylated protein. In doing so, we also identified myristoylation-dependent differences in regions of Gα known to play important roles in interactions with receptors, effectors, and nucleotide binding. Amide-hydrogen deuterium exchange and site-directed fluorescence of activated proteins revealed a solvent-protected amino terminus which was enhanced by myristoylation. Furthermore, fluorescence quenching confirmed that the myristoylated amino terminus lies in close proximity to the Switch II region in the activated protein. Myristoylation also stabilized the interaction between the guanine ring and the base of the α5 helix which contacts bound nucleotide. The allosteric effects of myristoylation on protein structure, function, and localization indicate that the myristoylated amino terminus of Gαi functions as a myristoyl switch, with implications for myristoylation in the stabilization of nucleotide binding and in the spatial regulation of G protein signaling.
During homeostatic adjustment in response to alterations in neuronal activity, synaptic expression of AMPA receptors (AMPARs) is globally tuned up or down so that neuronal activity is restored to a physiological range. Given that a central neuron receives multiple presynaptic inputs, whether and how AMPAR synaptic expression is homeostatically regulated at individual synapses remains unclear. In cultured hippocampal neurons, we report that when activity of an individual presynaptic terminal is selectively elevated by light-controlled excitation, AMPAR abundance at the excited synapses is selectively down-regulated in an NMDAR-dependent manner. The reduction in surface AMPARs is accompanied by enhanced receptor endocytosis and is dependent on proteasomal activity. Synaptic activation also leads to a site-specific increase in the ubiquitin ligase Nedd4 and polyubiquitination levels, consistent with AMPAR ubiquitination and degradation in the spine. These results indicate that AMPAR accumulation at individual synapses is subject to autonomous homeostatic regulation in response to synaptic activity.
The bone and immune systems are closely interconnected. The immediate inflammatory response after fracture is known to trigger a healing cascade which plays an important role in bone repair. Toll-like receptor 4 (TLR4) is a member of a highly conserved receptor family and is a critical activator of the innate immune response after tissue injury. TLR4 signaling has been shown to regulate the systemic inflammatory response induced by exposed bone components during long-bone fracture. Here we tested the hypothesis that TLR4 activation affects the healing of calvarial defects. A 1.8 mm diameter calvarial defect was created in wild-type (WT) and TLR4 knockout (TLR4−/−) mice. Bone healing was tested using radiographic, histologic and gene expression analyses. Radiographic and histomorphometric analyses revealed that calvarial healing was accelerated in TLR4−/− mice. More bone was observed in TLR4−/− mice compared to WT mice at postoperative days 7 and 14, although comparable healing was achieved in both groups by day 21. Bone remodeling was detected in both groups on postoperative day 28. In TLR4−/− mice compared to WT mice, gene expression analysis revealed higher expression levels of IL-1β, IL-6, TNF-α, TGF-β1, TGF-β3, PDGF and RANKL and a lower expression level of RANK at earlier time points (≤ 4 days postoperatively), and higher expression levels of IL-1β and lower expression levels of VEGF, RANK, RANKL and OPG at late time points (> 4 days postoperatively). This study provides evidence of accelerated bone healing in TLR4−/− mice with earlier and higher expression of inflammatory cytokines and with increased osteoclastic activity. Further work is required to determine if this is due to inflammation driven by TLR4 activation.
AMPA receptors (AMPARs) are the primary mediators of excitatory synaptic transmission in the brain. Alterations in AMPAR localization and turnover have been considered critical mechanisms underpinning synaptic plasticity and higher brain functions, but the molecular processes that control AMPAR trafficking and stability are still not fully understood. Here, we report that mammalian AMPARs are subject to ubiquitination in neurons and in transfected heterologous cells. Ubiquitination facilitates AMPAR endocytosis, leading to a reduction in AMPAR cell-surface localization and total receptor abundance. Mutation of lysine residues to arginine residues at the GluA1 C-terminus dramatically reduces GluA1 ubiquitination and abolishes ubiquitin-dependent GluA1 internalization and degradation, indicating that the lysine residues, particularly K868, are sites of ubiquitination. We also find that the E3 ligase Nedd4 is enriched in synaptosomes and co-localizes and associates with AMPARs in neurons. Nedd4 expression leads to AMPAR ubiquitination, resulting in reduced AMPAR surface expression and suppressed excitatory synaptic transmission. Conversely, knockdown of Nedd4 by specific siRNAs abolishes AMPAR ubiquitination. These data indicate that Nedd4 is the E3 ubiquitin ligase responsible for AMPAR ubiquitination, a modification that regulates multiple aspects of AMPAR molecular biology including trafficking, localization and stability.
glutamate receptors; AMPA receptors; Nedd4; E3 ligase; ubiquitination; trafficking
Homeostatic synaptic plasticity is a negative-feedback response employed to compensate for functional disturbances in the nervous system. Typically, synaptic activity is strengthened when neuronal firing is chronically suppressed or weakened when neuronal activity is chronically elevated. At both the whole cell and entire network levels, activity manipulation leads to a global up- or downscaling of the transmission efficacy of all synapses. However, the homeostatic response can also be induced locally at subcellular regions or individual synapses. Homeostatic synaptic scaling is expressed mainly via the regulation of α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor (AMPAR) trafficking and synaptic expression. Here we review the recently identified functional molecules and signaling pathways that are involved in homeostatic plasticity, especially the homeostatic regulation of AMPAR localization at excitatory synapses.
While early models of ejaculate allocation predicted that both relative testes and ejaculate size should increase with sperm competition intensity across species, recent models predict that ejaculate size may actually decrease as testes size and sperm competition intensity increase, owing to the confounding effect of potential male mating rate. A recent study demonstrated that ejaculate volume decreased in relation to increased polyandry across bushcricket species, but testes mass was not measured. Here, we recorded testis mass for 21 bushcricket species, while ejaculate (ampulla) mass, nuptial gift mass, sperm number and polyandry data were largely obtained from the literature. Using phylogenetic-comparative analyses, we found that testis mass increased with the degree of polyandry, but decreased with increasing ejaculate mass. We found no significant relationship between testis mass and either sperm number or nuptial gift mass. While these results are consistent with recent models of ejaculate allocation, they could alternatively be driven by substances in the ejaculate that affect the degree of polyandry and/or by a trade-off between resources spent on testes mass versus non-sperm components of the ejaculate.
sperm competition; testes size; ejaculate size; polyandry; sexual selection
Manual annotation of genomic data is extremely valuable for producing an accurate reference gene set but is expensive compared with automatic methods and so has been limited to model organisms. Annotation tools that have been developed at the Wellcome Trust Sanger Institute (WTSI, http://www.sanger.ac.uk/) are being used to fill that gap, as they can be used remotely and so open up viable community annotation collaborations. We introduce the ‘Blessed’ annotator and ‘Gatekeeper’ approach to Community Annotation using the Otterlace/ZMap genome annotation tool. We also describe the strategies adopted for annotation consistency, quality control and viewing of the annotation.
Database URL: http://vega.sanger.ac.uk/index.html
The distribution of the neurotropic alphaherpesviruses (HSV-1, HSV-2, and VZV) was determined in autonomic and sensory ganglia of the head and neck from formalin-fixed human cadavers. HSV-1 and VZV DNA were found in 18/58 and 16/58 trigeminal, 23/58 and 11/58 pterygopalatine, 25/60 and 14/60 ciliary, 25/48 and 11/48 geniculate, 15/50 and 8/50 otic, 14/47 and 4/47 submandibular, 18/58 and 10/58 superior cervical, and 12/36 and 1/36 nodose ganglia, respectively. HSV-2 was not detected in any site. Viral DNA positivity and location were independently distributed among autonomic and sensory ganglia of the human head and neck.
Herpes simplex virus (HSV); varicella-zoster virus (VZV); human ganglia; formalin-fixed tissue; DNA
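The per-ganglion counts reported in the abstract above can be converted to positivity fractions for easier comparison across sites. The following is a minimal Python sketch that uses only the counts quoted in the abstract; the dictionary layout and the function name `positivity` are illustrative assumptions and are not taken from the original study.

```python
# Counts transcribed from the abstract:
# ganglion -> (HSV-1 positive, VZV positive, ganglia tested)
counts = {
    "trigeminal": (18, 16, 58),
    "pterygopalatine": (23, 11, 58),
    "ciliary": (25, 14, 60),
    "geniculate": (25, 11, 48),
    "otic": (15, 8, 50),
    "submandibular": (14, 4, 47),
    "superior cervical": (18, 10, 58),
    "nodose": (12, 1, 36),
}


def positivity(table):
    """Return {ganglion: (HSV-1 fraction, VZV fraction)} (hypothetical helper)."""
    return {
        name: (hsv1 / tested, vzv / tested)
        for name, (hsv1, vzv, tested) in table.items()
    }


if __name__ == "__main__":
    # Print one row per ganglion with both fractions as percentages.
    for name, (hsv1_frac, vzv_frac) in positivity(counts).items():
        print(f"{name:18s} HSV-1 {hsv1_frac:6.1%}  VZV {vzv_frac:6.1%}")
```

The fractions are descriptive only; the sketch does not reproduce the independence analysis mentioned in the abstract.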
Cartilage development and function are dependent on a temporally integrated program of gene expression. With the advent of RNA interference (RNAi), artificial control of these complex programs becomes a possibility, limited only by the ability to regulate and express the RNAi. Using existing methods for production of RNAi constructs, we have constructed a plasmid-based short hairpin RNA (shRNA) expression system under control of the human pol III H1 promoter and supplemented this promoter with DNA binding sites for the cartilage-specific transcription factor Sox9. The resulting shRNA expression system displays robust, Sox9-dependent gene silencing. Dependence on Sox9 expression was confirmed by electrophoretic mobility shift assays. The ability of the system to regulate heterologously expressed Sox9 was demonstrated by Western blot, as a function of both the Sox9 to shRNA ratio and the time from transfection. This novel expression system supports auto-regulatory gene silencing, providing a tissue-specific feedback mechanism for temporal control of gene expression. Its applications for both basic mechanistic studies and therapeutic purposes should facilitate the design and implementation of innovative tissue engineering strategies.
Sox9; shRNA; tissue-specific; regulated expression; inducible gene silencing
Host defense peptides are a critical component of the innate immune system. Human alpha- and beta-defensin genes are subject to copy number variation (CNV) and historically the organization of mouse alpha-defensin genes has been poorly defined. Here we present the first full manual genomic annotation of the mouse defensin region on Chromosome 8 of the reference strain C57BL/6J, and the analysis of the orthologous regions of the human and rat genomes. Problems were identified with the reference assemblies of all three genomes. Defensins have been studied for over two decades and their naming has become a critical issue due to incorrect identification of defensin genes derived from different mouse strains and the duplicated nature of this region.
The defensin gene cluster region on mouse Chromosome 8 A2 contains 98 gene loci: 53 are likely active defensin genes and 22 defensin pseudogenes. Several TATA box motifs were found for human and mouse defensin genes that likely impact gene expression. Three novel defensin genes belonging to the Cryptdin Related Sequences (CRS) family were identified. All additional mouse defensin loci on Chromosomes 1, 2 and 14 were annotated and unusual splice variants identified. Comparison of the mouse alpha-defensins in the three main mouse reference gene sets Ensembl, Mouse Genome Informatics (MGI), and NCBI RefSeq reveals significant inconsistencies in annotation and nomenclature. We are collaborating with the Mouse Genome Nomenclature Committee (MGNC) to establish a standardized naming scheme for alpha-defensins.
Prior to this analysis, there was no reliable reference gene set available for the mouse strain C57BL/6J defensin genes, demonstrating that manual intervention is still critical for the annotation of complex gene families and heavily duplicated regions. Accurate gene annotation is facilitated by the annotation of pseudogenes and regulatory elements. Manually curated gene models will be incorporated into the Ensembl and Consensus Coding Sequence (CCDS) reference sets. Elucidation of the genomic structure of this complex gene cluster on the mouse reference sequence, and adoption of a clear and unambiguous naming scheme, will provide a valuable tool to support studies on the evolution, regulatory mechanisms and biological functions of defensins in vivo.
There is widespread concern that the quality of out-of-hours primary care for patients with complex needs may be at risk now that the new general medical services contract (GMS) has been implemented.
To explore changes in the use of out-of-hours services around the time of implementation of the new contract for patients with complex needs, using patients with cancer as an example.
Design of study
Longitudinal observational study.
Out-of-hours primary care provider covering Devon (adult population 900 000), UK.
Two 1-year periods corresponding to pre- (April 2003 to March 2004) and post-contract implementation (October 2004 to September 2005) were sampled. Call rates per 1000 of the adult population (age ≥16 years) were calculated for all calls (any cause) and cancer-related calls. Anonymised outcome and process measures data were extracted.
Although overall call rates per 1000 population had increased by 26% (185 pre-contract to 233 post-contract), the proportion of cancer-related calls remained relatively constant (2.08% versus 1.96%). Around half (56%) of these callers had advanced cancer needs (including palliative care). By post-contract, the time taken to triage had significantly increased (P<0.001). Although the proportions admitted to hospital or receiving a home visit remained constant, calls where a special message was sent by the out-of-hours clinician to the in-hours team had decreased (P<0.001).
The demand for out-of-hours care for patients with cancer did not alter disproportionately after implementation of the contract. While potential quality indicators (for example, hospital admissions, home visiting rates) remained constant, potentially adverse changes to triage time and communication between out-of-hours and in-hours clinicians were observed. Quality standards and provider databases require further refinement to capture elements of care relevant to patients with complex needs.
cancer care; cohort study; healthcare delivery; out-of-hours medical care; primary health care
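The call-rate figures reported in the results above can be checked with simple arithmetic. The sketch below, in Python, recomputes the relative increase in overall call rate from the two quoted rates. The helper names `percent_change` and `cancer_call_rate` are hypothetical, and the cancer-related rates per 1000 adults are derived here from the reported proportions; they are not values reported by the study.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def cancer_call_rate(overall_rate, cancer_percent):
    """Derived cancer-related calls per 1000 adults (an assumption of this
    sketch: overall rate multiplied by the reported cancer proportion)."""
    return overall_rate * cancer_percent / 100.0


# Figures quoted in the abstract: calls per 1000 adult population.
pre_rate, post_rate = 185, 233
# Proportion of calls that were cancer-related, in percent.
pre_cancer_pct, post_cancer_pct = 2.08, 1.96

if __name__ == "__main__":
    print(f"Overall call rate change: {percent_change(pre_rate, post_rate):.1f}%")
    print(f"Derived cancer-related rate, pre:  {cancer_call_rate(pre_rate, pre_cancer_pct):.2f} per 1000")
    print(f"Derived cancer-related rate, post: {cancer_call_rate(post_rate, post_cancer_pct):.2f} per 1000")
```

The overall change rounds to the 26% stated in the abstract. The derived cancer-related rates are approximate because the published proportions are themselves rounded.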
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome [1], mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome [2,3]. It is also enriched in segmental duplications, ranking third in density among the autosomes [4]. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution [5,6], the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
Fibronectin (FN) isoform expression is altered during chondrocyte commitment and maturation, with cartilage favoring expression of FN isoforms that include the type II repeat extra domain B (EDB) but exclude extra domain A (EDA). We and others have hypothesized that the regulated splicing of FN mRNAs is necessary for progression of chondrogenesis. To test this, we treated the pre-chondrogenic cell line ATDC5 with transforming growth factor-β1 (TGF-β1), which has been shown to modulate expression of the EDA and EDB exons, as well as the late markers of chondrocyte maturation; it also slightly accelerates the early acquisition of a sulfated proteoglycan matrix without affecting cell proliferation. When chondrocytes are treated with TGF-β1, the EDA exon is preferentially excluded at all times, whereas the EDB exon is relatively depleted at early times. This regulated alternative splicing of FN correlates with regulation of alternative splicing of SRp40, a splicing factor facilitating inclusion of the EDA exon. To determine whether overexpression of the SRp40 isoforms altered FN and FN EDA organization, cDNAs encoding these isoforms were overexpressed in ATDC5 cells. Overexpression of the long form of SRp40 yielded an FN organization similar to that seen with TGF-β1 treatment, whereas overexpression of the short form of SRp40 (which facilitates EDA inclusion) increased formation of long, thick FN fibrils. Therefore, we conclude that the effects of TGF-β1 on FN splicing during chondrogenesis may be largely dependent on its effect on SRp40 isoform expression.
Fibronectin; TGF-β1; SRp40; alternative splicing; collagen type II; collagen type I; Alcian blue
The human major histocompatibility complex (MHC) is contained within about 4 Mb on the short arm of chromosome 6 and is recognised as the most variable region in the human genome. The primary aim of the MHC Haplotype Project was to provide a comprehensively annotated reference sequence of a single, human leukocyte antigen-homozygous MHC haplotype and to use it as a basis against which variations could be assessed from seven other similarly homozygous cell lines, representative of the most common MHC haplotypes in the European population. Comparison of the haplotype sequences, including four haplotypes not previously analysed, resulted in the identification of >44,000 variations, both substitutions and indels (insertions and deletions), which have been submitted to the dbSNP database. The gene annotation uncovered haplotype-specific differences and confirmed the presence of more than 300 loci, including over 160 protein-coding genes. Combined analysis of the variation and annotation datasets revealed 122 gene loci with coding substitutions of which 97 were non-synonymous. The haplotype (A3-B7-DR15; PGF cell line) designated as the new MHC reference sequence, has been incorporated into the human genome assembly (NCBI35 and subsequent builds), and constitutes the largest single-haplotype sequence of the human genome to date. The extensive variation and annotation data derived from the analysis of seven further haplotypes have been made publicly available and provide a framework and resource for future association studies of all MHC-associated diseases and transplant medicine.
Major histocompatibility complex; Haplotype; Polymorphism; Retroelement; Genetic predisposition to disease; Population genetics
The sequencing, annotation and comparative analysis of an 8Mb region of pig chromosome 17 allows the coverage and quality of the pig genome sequencing project to be assessed
We describe here the sequencing, annotation and comparative analysis of an 8 Mb region of pig chromosome 17, which provides a useful test region to assess coverage and quality for the pig genome sequencing project. We report our findings comparing the annotation of draft sequence assembled at different depths of coverage.
Within this region we annotated 71 loci, of which 53 are orthologous to human known coding genes. When compared to the syntenic regions in human (20q13.13-q13.33) and mouse (chromosome 2, 167.5 Mb-178.3 Mb), this region was found to be highly conserved with respect to gene order. The most notable difference between the three species is the presence of a large expansion of zinc finger coding genes and pseudogenes on mouse chromosome 2 between Edn3 and Phactr3 that is absent from pig and human. All of our annotation has been made publicly available in the Vertebrate Genome Annotation browser, VEGA. We assessed the impact of coverage on sequence assembly across this region and found, as expected, that increased sequence depth resulted in fewer, longer contigs. One-third of our annotated loci could not be fully re-aligned back to the low coverage version of the sequence, principally because the transcripts are fragmented over several contigs.
We have demonstrated the considerable advantages of sequencing at increased read depths and discuss the implications that lower coverage sequence may have on subsequent comparative and functional studies, particularly those involving complex loci such as GNAS.
The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation by the GENCODE consortium and a refinement of the annotation based on these experimental results.
The GENCODE gene features are divided into eight different categories of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the initial annotation. Of the 420 coding loci tested, 229 RACE products have been sequenced. They supported 5' extensions of 30 loci and new splice variants in 50 loci. In addition, 46 loci without evidence for a coding sequence were validated, consisting of 31 novel and 15 putative transcripts. We assessed the comprehensiveness of the GENCODE annotation by attempting to validate all the predicted exon boundaries outside the GENCODE annotation. Out of 1,215 tested in a subset of the ENCODE regions, 14 novel exon pairs were validated, only two of them in intergenic regions.
In total, 487 loci, of which 434 are coding, have been annotated as part of the GENCODE reference set available from the UCSC browser. Comparison of GENCODE annotation with RefSeq and ENSEMBL shows that only 40% of GENCODE exons are contained within the two sets, which is a reflection of the high number of alternative splice forms with unique exons annotated. Over 50% of coding loci have been experimentally verified by 5' RACE for EGASP, and the GENCODE collaboration is continuing to refine its annotation of 1% of the human genome with the aid of experimental validation.
Although previous lentivirus vector systems have used human immunodeficiency virus type 1 (HIV-1), HIV-2 is less pathogenic in humans and is amenable to pathogenicity testing in a primate model. In this study, an HIV-2 molecular clone that is infectious but apathogenic in macaques was used to first define cis-acting regions that can be deleted to prevent HIV-2 genomic encapsidation and replication without inhibiting viral gene expression. Lentivirus encapsidation determinants are complex and incompletely defined; for HIV-2, some deletions between the major 5′ splice donor and the gag open reading frame have been shown to minimally affect encapsidation and replication. We find that a larger deletion (61 to 75 nucleotides) abrogates encapsidation and replication but does not diminish mRNA expression. This deletion was incorporated into a replication-defective, envelope-pseudotyped, three-plasmid HIV-2 lentivirus vector system that supplies HIV-2 Gag/Pol and accessory proteins in trans from an HIV-2 packaging plasmid. The HIV-2 vectors efficiently transduced marker genes into human T and monocytoid cell lines and, in contrast to a murine leukemia virus-based vector, into growth-arrested HeLa cells and terminally differentiated human macrophages and NTN2 neurons. Vector DNA could be detected in HIV-2 vector-transduced nondividing CD34+ CD38− human hematopoietic progenitor cells but not in those cells transduced with murine vectors. However, stable integration and expression of the reporter gene could not be detected in these hematopoietic progenitors, leaving open the question of the accessibility of these cells to stable lentivirus transduction.