Ion channel mutations are an important cause of rare Mendelian disorders affecting brain, heart, and other tissues. We performed parallel exome sequencing of 237 channel genes in a well characterized human sample, comparing variant profiles of unaffected individuals to those with the most common neuronal excitability disorder, sporadic idiopathic epilepsy. Rare missense variation in known Mendelian disease genes is prevalent in both groups at similar complexity, revealing that even deleterious ion channel mutations confer uncertain risk to an individual depending on the other variants with which they are combined. Our findings indicate that variant discovery via large scale sequencing efforts is only a first step in illuminating the complex allelic architecture underlying personal disease risk. We propose that in silico modeling of channel variation in realistic cell and network models will be crucial to future strategies assessing mutation profile pathogenicity and drug response in individuals with a broad spectrum of excitability disorders.
Voltage- and ligand-gated ion channels comprise one of the largest and best understood functional groups of proteins, with over 400 members spanning nearly 1% of the human genome. Channels are ubiquitously expressed in human tissues, and extensive evolutionary, anatomical, biophysical, and pharmacological information on key pore-forming and regulatory subunits in each family is available (Jegla et al., 2009; Nusser, 2009; Vacher et al., 2008; Wulff et al., 2009). Ion channels are multimeric protein complexes expressed in cell-type specific combinations and exert unique yet functionally overlapping control over excitability and signaling in both the plasma membrane and intracellular organelles. Despite this essential role, nearly half of channel genes are unstudied, and their broader involvement in non-current-related transmembrane and nuclear signaling pathways remains largely unexplored (Kaczmarek, 2006; Matzke et al., 2010). Mendelian mutations link single channel defects with an expanding array of familial episodic and degenerative excitability disorders in the nervous (Catterall et al., 2008), cardiovascular (Demolombe et al., 2005), neuroendocrine (Hiriart and Aguilar-Bryan, 2008; Roepke et al., 2009; Ryan et al., 2010) and immune surveillance systems (Cahalan and Chandy, 2009). Sporadic channel variants are also emerging as prime candidates for risk in complex psychiatric (Huffaker et al., 2009), metabolic (Holmkvist et al., 2009) and metastatic disease (Sontheimer, 2008), and alter ligand binding to the targets of hundreds of clinically valuable drugs (Drolet et al., 2005; Liu et al., 2003; Wulff et al., 2009). Their central position in the biology and therapy of excitability disorders, combined with the imminent arrival of gene-directed medicine provide compelling reasons to explore genomic variation in this exemplary gene set in order to correctly diagnose, predict, and treat a broad spectrum of common human disease (Tucker et al., 2009).
Emerging reports of allelic variability within known disease genes in healthy individuals raise concern that ‘genetic noise’ may confound personal disease prediction, which will fail in the absence of a ‘healthy’ reference genome (Lupski et al., 2010; Ng et al., 2010). Population-based studies can provide a statistical measure of genetic risk as well as implicate novel disease loci, but individualized disease prediction requires identifying how genetic variants contribute to risk in a single individual (Durbin et al., 2010; Manolio et al., 2009). For many common disorders, beginning with cancer (Knudson et al., 1975), it has long been hypothesized that the appearance and severity of the disorder are the simple result of the net accumulation of genetic variants or ‘hits’ in a disease pathway, where crossing an undefined risk threshold divides affected from unaffected individuals (Fraser, 1976; Wray and Goddard, 2010). While recent data suggest this assumption may hold true in some neurological disorders (Davis et al., 2011), it is less clear in the case of ion channels, which display extensive overlapping compensatory control of membrane excitability, as well as dynamic homeostatic regulation during brain development and disease (Marder and Tang, 2010).
Here we performed comparisons of exomic SNP profiles, including the type, relative burden, and pattern of variants within a large ion channel candidate gene set between healthy unaffected individuals and those with severe neurological excitability disease to evaluate personal genetic liability, and explored the value of computational models to assist in personal risk prediction. Epilepsy with no known cause (idiopathic epilepsy, IE) is an ideal condition to study the impact of sporadic genetic channel variation on cortical function, as seizure disorders affect 1-2% of the population, and analyses of the rare Mendelian forms reveal that ion channels are major determinants of the phenotype, since 17/20 confirmed monogenic syndromes arise in individuals heterozygous for a SNP in a channel subunit gene (Reid et al., 2009). We observed remarkable genetic complexity and overlapping patterns of both rare and common variants in known excitability disease genes across both populations, indicating that the potential for clinical expression of these common disorders is embedded in the fabric of all human genomes. A single cell computational model of even the simplest pairwise interaction of two channel mutations reveals the unpredictability of their collective electrical signature. Given the extent of individual variation within this large gene family, scaling multigenic combinatorial functional assays to accurately reflect neural network behavior is impractical. Therefore, personalized ion channel disease prediction will require the integration of genomic profiles with single cell proteomics and computational modeling of dynamic circuit behavior to translate unique channel variant portfolios into more informative predictors of personal risk for excitability disorders.
We performed targeted Sanger sequencing of the coding regions of 237 ion channel subunit genes to expand novel SNP discovery within known pathogenic channels, their nearest phylogenetic relatives, and others previously unexplored by medical resequencing. We examined exonic variation in 139 healthy, neurologically unaffected adults and 152 individuals of similar age and race with IE. The ion channel gene list (Table S1 available online) and comprehensive validated SNP data (Table S2) are provided in the Supplemental Information.
We analyzed >9 million base pair reads, yielding 11,102 SNPs (Supplementary Table 3). From these we assembled a validated dataset of 3,095 SNPs that included all SNP types in similar proportions to those observed in the unvalidated set (Table 1 and Supplementary Table 3). Neither all known members of the 400+ gene family nor their still ill-defined promoter regions could be included for sequencing, and only the first 50 bp of flanking intronic regions were examined. Insertions and deletions (indels) and copy number variants (CNVs) were not reliably detected. While we considered only exonic SNPs for subsequent evaluation, much variation occurred in the noncoding regions (introns, UTRs, promoters), which can nevertheless lead to altered gene transcription (McDaniell et al., 2010). Thus, despite extensive sequence information, our data significantly underestimate the potential density of variants per gene, as well as the extent of variation in each individual's ion channel sequence variation profile (‘channotype’).
First we evaluated the variation in channel genes, including known alternatively spliced isoforms, in our validated dataset. We discovered 989 novel SNPs, of which 351 are synonymous SNPs (sSNPs), 415 are nonsynonymous SNPs (nsSNPs), and 9 are nonsense mutations (Table 1). Of these novel SNPs, 387 were “rare”, defined as present in a solitary individual when more than 145 individuals were genotyped at that position (<1% Minor Allele Frequency (MAF)). The largest genes (Supplemental Figure 1 and Supplementary Table 4), RYR1, RYR2, and RYR3 were among those with the most variation (95, 88, 57 SNPs respectively). The smallest, HTR1B, had 6 SNPs, including 3 novel nsSNPs. Collectively, these data expand the known channel SNP lists in dbSNP, and confirm the existence of rare allelic variation across a broad spectrum of ion channel genes. The rich variation agrees with that emerging from whole genome sequencing of individuals (Durbin et al., 2010; Wheeler et al., 2008), and from >2,100 cases screened for variants in a subset of clinically important cardiac channel genes (Kapplinger et al., 2009). An individual's channotype is apparently unique; in our small cohort we found that no individuals were free of SNPs, and no two channotypes among 291 individuals were identical (Figure 1). Across both groups we found an overlapping variety of SNP types, including sSNPs, nsSNPs, and SNPs in promoter, coding, UTR and intronic regions. Nonsense SNPs were observed in both populations. We found SNPs in both groups for every targeted gene; of the validated SNPs, 1,355 were unique to either population and the majority (1,740) were shared (Table 1).
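The "rare" definition above implies a simple bound on allele frequency, which can be checked with a short calculation. The following is a minimal sketch, assuming diploid genotypes at an autosomal position; the function names are illustrative and are not part of the study's pipeline.

```python
def singleton_maf(n_genotyped: int, copies: int = 1) -> float:
    """Minor allele frequency when the minor allele is seen `copies` times
    among `n_genotyped` diploid individuals (2 alleles per individual)."""
    if n_genotyped <= 0:
        raise ValueError("n_genotyped must be positive")
    return copies / (2 * n_genotyped)


def is_rare_singleton(carriers: int, n_genotyped: int,
                      min_genotyped: int = 145) -> bool:
    """Apply the definition in the text: the variant is present in a single
    individual and more than `min_genotyped` individuals were genotyped."""
    return carriers == 1 and n_genotyped > min_genotyped


# Smallest cohort that satisfies "more than 145 individuals genotyped".
n = 146
print(f"heterozygous singleton: MAF = {singleton_maf(n):.4%}")
print(f"homozygous singleton:   MAF = {singleton_maf(n, copies=2):.4%}")
```

Under these assumptions both values fall below the 1% threshold, so the two conditions in the definition are consistent with each other.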
Truncated channel proteins resulting from nonsense mutations can cause epilepsy (Claes et al., 2001). Similarly, splice site mutations are implicated in channel disease due to dropout of exons from the coding mRNA (Tsuji et al., 2007). Although variants of these most functionally ‘severe’ types rarely occur in our dataset, the 4 nonsense and 8 splice site variants with an observed MAF <0.05 appeared to be enriched in the IE group (Table 1). We randomly shuffled all genotype class labels, one SNP at a time, then tallied the variants by group again, this time under the null hypothesis. After 10,000 such permutations, a comparison of the observed tally against the null distribution resulted in a False Discovery Rate (FDR) of 0.11. These results indicate that the numerical enrichment observed when comparing rare severe SNPs between cases and controls occurs more than 11% of the time by random chance alone. To test whether other SNP types were involved, we expanded the number of SNPs considered functionally severe by biochemical classification. We examined all missense SNPs for their relative rarity, and then used a Blosum80 substitution matrix to score the amino acid substitution caused by the nsSNP as a predictor of its potential to alter ion channel behavior (Supplementary Table 2 and Supplementary Figure 2). Analysis of missense SNPs with MAF<0.05 yielded 205 nsSNPs in the predicted deleterious range (-6 to 0). When these SNPs were included in the permutation analysis with the nonsense and splice site SNPs, the results again did not reach significance (FDR=0.59). Increasing the stringency to evaluate only very rare severe SNPs (MAF<0.01; n=171) yielded an FDR of 0.50. Finally, we considered only very rare nsSNPs with a Blosum80 score of -6 (MAF<0.01; n=38) because these represent the most biochemically disruptive substitutions (e.g., Arg→Cys). After 2,000 permutations, the FDR was 0.25.
This increased stringency for the most destabilizing amino acid mutations still failed to reach significance. Collectively, this unbiased analysis of SNP severity in the coding regions of 237 ion channel genes from our small study population provides no evidence for an ion channel rare variant association in idiopathic epilepsy.
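The tiered severity filters described above (MAF cutoff plus Blosum80 score range) can be expressed compactly. The sketch below is illustrative only: the `Variant` record, its field names, and the example entries are hypothetical, and Blosum80 scores are assumed to have been looked up beforehand rather than computed here.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    name: str                 # hypothetical identifier
    kind: str                 # "missense", "nonsense", or "splice"
    maf: float                # observed minor allele frequency
    blosum80: Optional[int]   # substitution score; None if not missense


def is_severe(v: Variant, maf_cutoff: float = 0.05,
              score_min: int = -6, score_max: int = 0) -> bool:
    """Return True if the variant passes the rarity cutoff and is either
    nonsense/splice or a missense change in the given score range."""
    if v.maf >= maf_cutoff:
        return False
    if v.kind in ("nonsense", "splice"):
        return True
    return (v.kind == "missense" and v.blosum80 is not None
            and score_min <= v.blosum80 <= score_max)


# Hypothetical example records, not taken from the study's data.
variants = [
    Variant("v1", "missense", 0.003, -6),
    Variant("v2", "missense", 0.030, -2),
    Variant("v3", "missense", 0.004, 3),
    Variant("v4", "nonsense", 0.003, None),
    Variant("v5", "missense", 0.200, -6),
]

tier1 = [v.name for v in variants if is_severe(v, maf_cutoff=0.05)]
tier2 = [v.name for v in variants if is_severe(v, maf_cutoff=0.01)]
# Most stringent tier: very rare missense changes scoring exactly -6.
tier3 = [v.name for v in variants
         if v.kind == "missense" and is_severe(v, 0.01, -6, -6)]
print(tier1, tier2, tier3)
```

Each tier narrows the previous one, which mirrors the progressively stricter subsets (n=205, n=171, n=38) passed to the permutation analysis.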
We next compared variation within the 17 known ion channel genes causing familial human epilepsy (hEP genes) (Supplementary Table 1). We found that 96.1% of cases and 66.9% of controls had a missense mutation in at least 1 known hEP gene. The finding of missense variants in disease-producing ion channels in unaffected individuals is not new. A smaller scale study of 1 ion channel (SCN5A) involved in the cardiac arrhythmia LQT3 syndrome revealed that ~3% of 295 unaffected Caucasians have rare missense mutations in this gene (Ackerman et al., 2004). However, we found 300 missense channel variants in 139 unaffected individuals, 23 of which are in hEP genes, signalling that allelic penetrance in channelopathy is underappreciated (Figure 2). The R393H nsSNP in the ion selective pore of the SCN1A gene, believed to cause severe myoclonic epilepsy of infancy (Claes et al., 2001), was also detected once in our study, and only in the control population. Interestingly, in vitro functional studies of this mutation failed to produce measurable sodium current (Ohmori et al., 2006), indicating that deleterious alterations in protein structure in a known hEP gene are insufficient to predict risk of epilepsy. The unexpectedly common finding of missense mutations in known hEP genes in our control cohort supports a biophysical model that other subunits, as yet unrelated to epilepsy, may constitute genetic excitability modifiers, and provides direct evidence validating the multigenic basis for complex inheritance of channelopathy phenotypes.
An attractive mechanism relating monogenic familial epilepsy to multigenic sporadic IE is the load hypothesis, which suggests that if IE is the result of accumulating mutations of small effect in known disease genes, then the “load” or summation of those deleterious mutations will surpass some liability threshold contributing to the overt excitability phenotype. We analyzed hEP gene profiles in both groups, and observed that 77.6% of cases and, surprisingly, 29.5% of controls have missense mutations in 2 or more hEP genes. Superficially, the simple numerical excess of mutations in known hEP genes appears to favor the disease phenotype. However, the most extreme control showed 7 missense mutations in hEP genes, while the most extreme case had 9, indicating that the specific type and pattern of mutation rather than number is an overriding consideration (Figure 3). Individuals in both groups often carry multiple, potentially functionally interacting (both intra and intergenic) variants, some of which are shared between populations. In addition, unique profiles of nsSNPs in hEP genes are defined by pattern despite the same numerical load (Figure 3). A simple hEP load hypothesis is therefore confounded by two factors: 1) gene dosage, which correlates imperfectly with allelic penetrance at the protein level, and 2) the presence of multiple missense SNPs in the same gene that may perturb protein structure in a non-additive way (Figure 3). Inclusion of all other missense mutations in the hEP SNP load analysis added complexity without increasing the power to predict phenotype. For example, among the highest hEP variant-bearing individuals, the most extreme case bore 97 nsSNPs in a total of 62 genes, compared to the most extreme control with 86 nsSNPs in 59 genes.
While specific channel subunits have been identified as Mendelian hEP genes, variants in other members of their gene families may potentiate or suppress similar membrane currents, and alter or mask pathological phenotypes. Since voltage and ligand-gated ion channels assemble as heteromeric complexes with differing stoichiometry, the combinatorial impact of multiple nsSNPs among subunits is felt across both voltage and ligand-gated channel types. For example, examination of sodium channel homologues revealed that 50.7% of cases and 14.4% of controls had 2 or more nsSNPs in their sodium channel genes (Figure 4). Simultaneous co-expression of different sodium channel alpha subunit variants in transgenic mice has been shown to bidirectionally modify epilepsy phenotypes (Hawkins et al., 2010; Singh et al., 2009). Ligand-gated channels provide a second example of complex subunit interactions occurring in both cases and controls. Pentameric GABA receptors form chloride channels that are composed of alpha, beta and gamma subunits, and dysfunction in members of each (GABRA1, GABRB3 and GABRG2) is linked to epilepsy. We found that 24.3% of cases had 2 or more nsSNPs in one of the 6 GABA receptor alpha subunits; this number was also found in 6.5% of controls; 14.5% of cases had 2 or more nsSNPs in one of the 3 beta subunits compared with 5.0% of controls. In the 3 gamma subunits, 1.3% of cases had 2 or more nsSNPs, while control individuals only contained single missense mutations (17.1%).
Given the large SNP variation observed in the control population, and that the majority of SNPs are shared between both groups (Table 1), we considered the possibility that disease causing nsSNPs are masked in the control populations by other SNPs, and found similar patterns of complexity in control channotypes. These occurred either as a second intragenic hit in the same subunit, or heterogenically, where the physiological summation of pathogenic variants in two (or more) subunits could effectively cancel the impact of the deleterious SNP, as shown in mouse models of digenic calcium and potassium channel mutation interactions (Glasscock et al., 2007). We looked for this particular gene family combination in our human cohorts, and found it in both controls (12/139) and cases (55/152). Similarly, we observed combinations of calcium and sodium channel missense mutations in both groups (cases 23/152; controls 7/139) (Figure 4). Interestingly, a few individuals in both populations (2 controls; 6 cases) had rare nsSNP combinations across 3 (SCN, KCN, CACN) gene families.
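Counting individuals who carry nsSNPs in more than one channel gene family, as in the tallies above, amounts to a set-containment check per individual. The following sketch assumes each individual's profile is available as a set of gene symbols bearing a rare nsSNP; the profiles shown are hypothetical, and matching families by gene-symbol prefix is a simplification (for instance, a bare "SCN" prefix would also match the unrelated SCNN1 genes).

```python
FAMILY_PREFIXES = ("SCN", "KCN", "CACN")


def families_hit(genes):
    """Return the set of family prefixes represented in a gene collection."""
    return {p for p in FAMILY_PREFIXES for g in genes if g.startswith(p)}


def count_with_families(profiles, required):
    """Count individuals whose variant-bearing genes span every family
    listed in `required`."""
    required = set(required)
    return sum(1 for genes in profiles.values()
               if required <= families_hit(genes))


# Hypothetical profiles: individual ID -> genes carrying a rare nsSNP.
profiles = {
    "control_1": {"KCNQ2", "CACNA1A"},
    "case_1": {"SCN1A", "KCNA1", "CACNA1H"},
    "case_2": {"GABRA1"},
}

print(count_with_families(profiles, {"KCN", "CACN"}))
print(count_with_families(profiles, {"SCN", "KCN", "CACN"}))
```

A curated gene-to-family table would be a more robust choice than prefixes for real data.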
Although ion channels are widely distributed throughout brain circuits, it is instructive to assess the potential impact of individual intragenic and epistatic interactions beginning within a single cell. We adapted a classic computational model of a single CA3 hippocampal neuron to simulate gain or loss of function alleles in ion currents in order to demonstrate that pairwise combinations (different personal profiles) produce dramatic variation in firing patterns given a constant “two-hit” load (Bower and Beeman, 2007; Traub et al., 1991). If a simple load effect based on equivalent gain/loss of function among two depolarizing currents was operative, we would expect similar emergent firing patterns between two-hit models (e.g., Figure 5A3 vs A4 and Figure 5B3 vs B4). However, the combinatorial effects on firing behavior are dramatically more complex, indicating that the pattern of genetic variation (functional valence of each allele) overrides the load even at the single cell level. The addition of a third hit can also suppress or aggravate spontaneous rhythmic bursting, an important cellular determinant of neural network behavior (Figure 5C).
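The idea of holding the number of "hits" constant while varying which conductances are altered can be illustrated in a far simpler setting. The sketch below is not the CA3 model used in this study: it substitutes the textbook Hodgkin-Huxley squid axon equations with standard parameters, integrated by forward Euler, and treats a gain or loss of function allele as a multiplicative scaling of a maximal conductance. The scaling factors, injected current, and spike criterion are all illustrative assumptions.

```python
import math


def vtrap(x, y):
    """Evaluate x / (exp(x/y) - 1), using a series limit near x = 0."""
    if abs(x / y) < 1e-6:
        return y * (1 - x / (2 * y))
    return x / (math.exp(x / y) - 1)


def rates(v):
    """Standard Hodgkin-Huxley rate constants (1/ms), resting near -65 mV."""
    a_m = 0.1 * vtrap(-(v + 40.0), 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * vtrap(-(v + 55.0), 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n


def count_spikes(g_na_scale=1.0, g_k_scale=1.0, i_inj=10.0,
                 t_ms=100.0, dt=0.01):
    """Count upward crossings of 0 mV during constant current injection.
    The scale arguments mimic gain (>1) or loss (<1) of function."""
    g_na, g_k, g_l = 120.0 * g_na_scale, 36.0 * g_k_scale, 0.3  # mS/cm^2
    e_na, e_k, e_l = 50.0, -77.0, -54.4                          # mV
    v = -65.0
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    m, h, n = a_m / (a_m + b_m), a_h / (a_h + b_h), a_n / (a_n + b_n)
    spikes, above = 0, False
    for _ in range(int(round(t_ms / dt))):
        a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
        i_ion = (g_na * m**3 * h * (v - e_na)
                 + g_k * n**4 * (v - e_k)
                 + g_l * (v - e_l))
        m += dt * (a_m * (1 - m) - b_m * m)
        h += dt * (a_h * (1 - h) - b_h * h)
        n += dt * (a_n * (1 - n) - b_n * n)
        v += dt * (i_inj - i_ion)          # membrane capacitance 1 uF/cm^2
        if v > 0.0 and not above:
            spikes, above = spikes + 1, True
        elif v < -20.0:
            above = False                  # re-arm after repolarization
    return spikes


# Hypothetical two-hit profiles: (sodium scale, potassium scale).
for na, k in [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (1.2, 1.2)]:
    print(f"gNa x{na}, gK x{k}: {count_spikes(na, k)} spikes in 100 ms")
```

Comparing the printed counts across profiles with the same number of altered conductances gives a rough feel for why valence, and not just load, matters; a realistic assessment would require the multi-compartment, multi-current model cited above.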
Appreciation of complex channel variation has far-ranging implications for understanding the individuality of higher nervous system function, and defines critical areas for future research to translate genomic profiling into even rudimentary risk prediction in common sporadic disease (Choi et al., 2009; Gargus, 2003). Here we examined how functional SNP severity and patterns of channel nsSNP variation differ among individuals, and uncovered three general findings of immediate clinical significance for personalized prediction; first, that the architecture of ion channel variation in both groups consists of dense and highly complex patterns of common and rare alleles; second, that structural variants in both known and suspected epilepsy genes appear in otherwise healthy individuals; and third, that individuals with epilepsy typically carry more than one mutation in known human epilepsy genes.
Due to this robust and ubiquitous genetic heterogeneity, we propose that personalized channel SNP pathogenicity in complex disease must, at a minimum, be defined in the context of all other channel subunits present, and that causality cannot be assigned to any particular variant, even those that are known to be functionally deleterious. Many potentially pathogenic variants in known dominant channel genes for epilepsy appear in otherwise healthy individuals in our study, leading us to suggest a channel variant pattern hypothesis rather than mutational load as an oligogenic mechanism for both sporadic epilepsy and protection from disease. Because of their overlapping voltage dependence, even physically non-interacting channel proteins can modulate distantly related channel genes within the same membrane compartment through changes in transmembrane potential. This oligogenic electrical relationship integrates channel alleles within the tissue, and likely plays a role even in ‘monogenic’ channelopathies, since carriers of such mutations often show a spectrum of clinical phenotypes, including the absence of disease, even within the same pedigree (Scheffer et al., 2007). Both in vitro functional expression studies (Rusconi et al., 2007) and transgenic mouse models combining epilepsy gene mutations (Glasscock et al., 2007; Hawkins et al., 2010; Martin et al., 2007; Singh et al., 2009; Song et al., 2004) have demonstrated that the functional impact of an additional mutation can exacerbate or mask the excitability disorder. Thus phenotypic variation in epilepsy and other excitability disorders may arise from a diverse array of channel alleles at a single locus, or a constellation of novel alleles in related or distant subunit genes, and here we show both are abundant. 
Finally, without excluding many other biological pathways that could further modify network excitability (Noebels, 2003), the high incidence of rare variants implicates the ion channel family itself as a preeminent source of candidate background modifier genes for the clinical expression of ion channel disorders, a mechanism that could help explain both the efficacy of, and pharmacoresistance to, antiepileptic drugs that non-specifically modulate multiple ionic currents.
In contrast, we found little evidence or biological rationale for a SNP load effect in ion channelopathy. We found abundant channel gene variation in both groups, yet no significant enrichment in singleton-SNPs within affected individuals, and conclude that biologically, absolute numerical counts of SNP burden hold little predictive value as a global pathogenic measure. The detection of a nsSNP allows little insight into its functional effect. The magnitude of ion flux among channels, even within a family, is highly variable; single channel conductance in different channels ranges from ~1-250 picosiemens (Harmar et al., 2009), and since the SNPs in individual genes are of unequal functional valence, a simple tally of variants across genes without regard to the amplitude of their gain- or loss- of function, would obscure the net magnitude of their contribution to network excitability. The cell type and brain region where the channel isoforms reside, along with their expression levels during development add further complexity. Finally, even when a mutant transcript is expressed in both excitatory and inhibitory cells (potentially counterbalancing the net excitability changes within a microcircuit), the impact of the channelopathy may still be inexplicably uneven. For example, an important epilepsy-related sodium channel gene, SCN1A, is expressed in both hippocampal pyramidal cells and interneurons, yet haploinsufficiency reduces sodium currents only in interneurons; this discrepancy may explain the powerful network hyperexcitability underlying Dravet Syndrome, a severe epilepsy of infancy (Yu et al., 2006). Thus, while membrane current profiles are additive within an electrotonic compartment of a single cell, the sum of genomic channel SNPs poorly predicts the emergent physiology of local and long-range networks throughout the entire nervous system.
With the arrival of discovery platforms capable of efficiently discriminating all ion channel variation in an individual, and in the absence of high throughput systems for evaluating oligogenic combinatorial functional effects in their appropriate cellular context (Tomaselli, 2010), the ability to extract predictive phenotypic information on network behavior from ion channel SNP profiles presents a formidable bioinformatic challenge. Fortunately, current advances in converging technologies will help overcome this complexity. Positional information on channel subunit expression and stoichiometry in identified brain cells is being generated by medium (laser-capture) (Okaty et al., 2009), and higher throughput (bac-trap) methods (Dougherty et al., 2010), and can be combined with neuroproteomic databases evaluating subcellular channel composition within critical neuronal compartments, including dendritic spines, axon initial segment, and pre-synaptic membrane terminals (Bayés and Grant, 2009; Bayés et al., 2011; Bloodgood et al., 2009; Müller et al., 2010; Takamori et al., 2006). New computational approaches, such as quasi-active modular simulations (Kellems et al., 2009) of specific channel combinations at specialized sites are in development (Winden et al., 2009). As algorithms predicting realistic neuronal firing patterns in relevant brain microcircuits are developed, the integration of genomic variation in a matrix with cellular localization in human brain may provide a platform to assess the personalized electrical signature of an oligogenic variant profile on disease-specific networks (Clancy and Rudy, 1999; Dyhrfjeld-Johnsen et al., 2007; Goaillard et al., 2009; Grashow et al., 2009; Gurkiewicz and Korngreen, 2007; Okaty et al., 2009; Thomas et al., 2009; Tobin et al., 2009; Toledo-Rodriguez et al., 2005; Xu and Clancy, 2008).
While predicting disease severity by incorporating ion channel genomic profiles will require these new tools, exome screening to pinpoint variants with specific effects on drug binding offers a more immediate role for large scale channel profiling to contribute to gene-directed medical therapy (Sheets et al., 2010). In epilepsy, nearly one third of patients are refractory to current AED treatments, which with few exceptions, target ion channels. Sequence variants that alter access to drug binding sites are obvious candidate mechanisms for pharmacoresistance (Drolet et al., 2005; Liu et al., 2003), and variant profiles may personalize treatment by identifying ineffective drugs in epilepsy and other excitability disorders where channel modulation is clinically useful.
We evaluated all self-reported white Caucasian and white Hispanic individuals regardless of the age or gender examined at Baylor College of Medicine (BCM)-affiliated hospitals in accordance with the guidelines of the BCM Institutional Review Board. Race and ethnicity were established according to Risch et al. (Tang et al., 2005) based on a structured questionnaire that examined the racial and ethnic origin of an individual back three generations on either side of the family tree. Demographic characteristics of the cohort are detailed in Table S5. DNA from individuals recruited into the study (both cases and controls) was sent to the Coriell Institute for Medical Research for archiving in their cell line repository and is accessible upon request (http://ccr.coriell.org/Sections/Collections/NINDS/Epilepsy/). Please note that the samples we have provided Coriell are not differentiated from the other “epilepsy” samples in the repository.
Cases meeting the accepted criteria for either idiopathic or cryptogenic epilepsy (Commission on Classification and Terminology of the International League Against Epilepsy, 1989) were recruited in accordance with the approved study protocol. In the proposed new classification, the terms “idiopathic” and “cryptogenic” have been revised in favor of the term “genetic epilepsy” when describing the underlying etiology of the disorder (Berg et al., 2010); this term includes seizures occurring in epilepsy patients with a presumed genetic origin.
Study subjects were evaluated for epilepsy by board-certified neurologists according to a standard protocol for the diagnostic evaluation of their epilepsy. The neurological assessment was classified as abnormal if deficits such as hemiparesis, ataxia, or cognitive impairment were present. The presence of clumsiness or other “soft” neurological signs was insufficient to classify a patient as having neurological abnormalities. A structured questionnaire was used to collect information from the patient and any witnesses to the seizures to complement data recorded in the clinic chart.
All study patients underwent at least one of the following: (1) routine sleep-deprived electroencephalogram (EEG), (2) prolonged ambulatory EEG (A-EEG), or (3) video EEG (V-EEG) monitoring. EEG studies were reviewed and interpreted according to International League Against Epilepsy (ILAE) standards by neurologists board-certified in clinical neurophysiology at the Baylor Comprehensive Epilepsy Center.
Brain magnetic resonance imaging (MRI) was performed in all seizure subjects in accordance with standard protocol. Visual analysis of oblique and coronal T1, T2, and FLAIR images was done at BCM-affiliated hospitals. Epileptic seizures and syndromes were classified with information obtained from the history, physical evaluation, and above-specified ancillary diagnostic investigations according to ILAE guidelines (Commission on Classification and Terminology of the International League Against Epilepsy, 1989).
Control subjects were recruited among unrelated visitors, spouses, significant others, and friends of patients. The clinical assessment of controls was limited to answers to a structured general medical questionnaire with specific emphasis on any neurological or cardiac symptoms, including recurrent spells or seizure-like episodes. Only individuals with an entirely negative past and present history were included. Individuals with a family history positive for any neurological disorder outside of cerebrovascular disease and undocumented headache were also excluded.
We selected all known ion channelopathy genes, their family members and functionally similar ion channels (Harmar et al., 2009). This list includes members of the voltage-gated, ligand-gated, and background “leak” channel classes, and a small number of ion channel interacting proteins such as ANK2 (ankyrin-B). Ion channel gene models were defined with the annotations available through the UCSC Genome Browser for the hg17 assembly from March 2004. Exon boundaries were defined using RefSeq coordinates where available. All reported splice isoforms were included for sequencing. Protein sequences from in silico translation of each transcript model were aligned to the reported protein sequence in PubMed. Manual curation of a handful of gene models was required to reconcile discrepancies. Gene models are available on request.
Amplicons selected for PCR amplification and downstream sequencing were single exons smaller than 600 bp in size. For exons larger than 500 bp, multiple amplicons were subdivided to allow for optimal product lengths but still span the entire exon with overlapping ends. When two or more exons with the spanning intron were collectively less than 500 bp in size, the total region was defined as a single amplicon for primer design. This consolidation reduced the number of targets by 10%. For ion channel gene targets in which alternative isoforms exist, all undefined exons (alternatively spliced exons) were targeted as amplicons and reference cDNA sequences from NCBI for the alternative isoform were used for annotation. Primers were designed as 20-mers and were placed 50 bp from either end of an amplicon to ensure accurate reads at the exon/intron boundaries. Primers were generated automatically with Primer3, and primer sequences are available on request.
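The subdivision of large exons into overlapping amplicons can be sketched as a simple tiling routine. This is an illustration under stated assumptions, not the design code used in the study: coordinates are treated as half-open intervals, the 50 bp tile overlap is a hypothetical value (the text specifies only that ends overlap), and the function names are invented for this example.

```python
def pad_exon(start, end, flank=50):
    """Extend an exon by `flank` bp on each side so that reads cover the
    exon/intron boundaries."""
    return max(0, start - flank), end + flank


def tile_exon(start, end, max_len=500, overlap=50):
    """Split the half-open interval [start, end) into tiles no longer than
    `max_len`, each sharing `overlap` bp with the previous tile."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and below max_len")
    if end <= start:
        raise ValueError("end must be greater than start")
    tiles, s = [], start
    while True:
        e = min(s + max_len, end)
        tiles.append((s, e))
        if e == end:
            return tiles
        s = e - overlap   # step back so adjacent tiles overlap


print(tile_exon(0, 1200))      # a large exon split into three tiles
print(tile_exon(100, 400))     # a small exon kept as a single tile
print(pad_exon(100, 400))
```

Primer selection within each tile would then be delegated to a tool such as Primer3, as described above.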
After counseling and informed consent, two vials of 8 ml of blood were drawn from each individual. Genomic DNA was extracted with the Gentra Puregene Blood Kit (QIAGEN) and stored in deidentified bar coded tubes prior to use in amplicon generation. Samples of high-quality, high-molecular-weight genomic DNA were also sent to the Coriell Sample Repository for cell line archiving and are openly accessible (http://www.coriell.org/, Home/Collections/NINDS/Epilepsy).
A high-throughput, parallel Sanger sequencing pipeline was designed to move multiple samples through the amplification, sequencing and subsequent SNP detection at one time. All PCR reactions were performed in a 25 µl reaction volume containing 20–40 ng of individual genomic DNA, 10 pmol of each primer and 12.5 µl of QIAGEN multiplex PCR master mix. General PCR conditions were an initial denaturation and activation step of 15 min at 95 °C, followed by 30 cycles of the following: denaturation at 94 °C for 30 s, annealing at 50 °C–65 °C (depending on target) for 90 s, and a 1 min extension step at 72 °C. A final 10 min completion step at 72 °C followed amplification. Sequencing was performed with Big Dye Terminator Cycle Sequencing v3.1 and visualized with an ABI3700 (PE Applied Biosystems). Sequencing chromatograms were compared to reference gene models and single-nucleotide polymorphisms (SNPs) were detected and annotated using SNPdetector v3.0 (Zhang et al., 2005), and individual channotypes were output for analysis. Validation of SNPs included in this study was performed by visual confirmation of the chromatogram data, the presence of the SNP in dbSNP, the generation of a custom MIP chip, and a combination of Biotage and/or 454 sequencing.
To assess the enrichment and predictive power of rare variants within our study population, we performed a permutation-based rare variant enrichment test for case-control studies that accounts for amplicon-specific group oversampling. After filtering for the desired SNPs and corresponding genotypes, we observed the number of minor alleles in each group. Any observed enrichment of rare variants in either group was compared against the null distribution generated by 10,000 permutations (random shuffling) of the group labels (case versus control) to determine the false discovery rate.
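A label-shuffling test of this general kind can be sketched in a few lines. The code below is a simplified illustration, not the study's implementation: it does not include the correction for amplicon-specific group oversampling, it uses the case-minus-control minor allele tally as an assumed test statistic, and it returns a one-sided empirical proportion rather than the false discovery rate reported in the text. The genotype data shown are hypothetical.

```python
import random


def allele_excess(genotypes, labels):
    """Minor allele tally in cases minus the tally in controls for one SNP.
    `genotypes` holds 0, 1, or 2 minor alleles per individual; `labels` is
    True for cases and False for controls."""
    case = sum(g for g, is_case in zip(genotypes, labels) if is_case)
    return case - (sum(genotypes) - case)


def permutation_proportion(snps, labels, n_perm=10000, seed=0):
    """Fraction of label permutations whose summed excess is at least as
    large as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = sum(allele_excess(s, labels) for s in snps)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        stat = 0
        for s in snps:
            rng.shuffle(shuffled)          # reshuffle one SNP at a time
            stat += allele_excess(s, shuffled)
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 10 cases followed by 10 controls, two SNPs.
labels = [True] * 10 + [False] * 10
snps = [
    [1, 0, 0, 1, 0, 0, 0, 0, 0, 0] + [0] * 10,
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0] + [0, 1] + [0] * 8,
]
print(permutation_proportion(snps, labels, n_perm=2000))
```

With unequal group sizes a raw tally difference is biased toward the larger group, which is one reason an oversampling-aware statistic, as used in the study, is preferable for real data.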
For every nucleotide reference sequence (NM_number) used in this study, we obtained the corresponding protein sequences (NP_number) in NCBI. Protein alignments for all members of a gene family that were structurally related (e.g., SCN1A-SCN11A or GABRG1-GABRG3) were performed using the MUSCLE webserver (Edgar, 2004a, 2004b). For the human epilepsy (hEP) genes, structural models with reference comparative alignments or annotated amino acid sequence were obtained from the literature. These models were used to position individual nonsense or missense mutations on schematic representations of channel structure.
We thank Melissa Lambeth and Marsha Hill for assistance with clinical recruitment, the HGSC sequencing team, and Colleen Clancy and Luay Nakhleh for valuable discussion. The authors acknowledge the support of NIH, NS 049130 (JN), NHGRI (RG), the Gillson Longenbaugh Foundation (JN, RG) and the Blue Bird Circle Foundation.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.