Results 1-10 (10)
 

1.  PyPop: A Software Framework for Population Genomics: Analyzing Large-Scale Multi-Locus Genotype Data 
Software to analyze multi-locus genotype data for entire populations is useful for estimating haplotype frequencies, deviation from Hardy-Weinberg equilibrium and patterns of linkage disequilibrium. These statistical results are important both to those interested in human genome variation and disease predisposition and to those studying evolutionary genetics. As part of the 13th International Histocompatibility and Immunogenetics Working Group (IHWG), we have developed a software framework (PyPop). The primary novelty of this package is that it allows integration of statistics across large numbers of data sets by relying heavily on the XML file format and on the R statistical package for graphical output, while retaining the ability to interoperate with existing software. Although largely developed to address human population data, it can be used for population-based data from any organism. We tested our software on the data from the 13th IHWG, which involved data sets from at least 50 laboratories, each of up to 1000 individuals with 9 MHC loci (both class I and class II), and found that it scales well to large numbers of data sets.
PMCID: PMC3891851  PMID: 12603054
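As an illustration of one statistic named in the abstract above, the following is a minimal sketch of a chi-square test for deviation from Hardy-Weinberg equilibrium at a single locus. It is not PyPop's API or implementation; the function name `hwe_chi_square` and the input layout (one unphased allele pair per individual) are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    genotypes: list of (allele, allele) tuples, one per individual
               (hypothetical input layout chosen for this sketch).
    Returns (chi_square, degrees_of_freedom).
    """
    n = len(genotypes)
    # Treat genotypes as unordered: ("B", "A") counts as ("A", "B").
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    # Allele frequencies estimated by gene counting (2n gene copies).
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # Expected count: n*p^2 for homozygotes, 2*n*p*q for heterozygotes.
        expected = n * freqs[a] ** 2 if a == b else 2 * n * freqs[a] * freqs[b]
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected

    k = len(freqs)
    # Genotype classes minus 1, minus (k - 1) estimated allele frequencies.
    return chi2, k * (k - 1) // 2
```

For highly polymorphic loci such as HLA, many genotype classes have small expected counts, so a chi-square approximation of this kind is only a rough guide.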
2.  Tracking human migrations by the analysis of the distribution of HLA alleles, lineages and haplotypes in closed and open populations 
The human leucocyte antigen (HLA) system shows extensive variation in the number and function of loci and the number of alleles present at any one locus. Allele distribution has been analysed in many populations through the course of several decades, and the implementation of molecular typing has significantly increased the level of diversity revealing that many serotypes have multiple functional variants. While the degree of diversity in many populations is equivalent and may result from functional polymorphism(s) in peptide presentation, homogeneous and heterogeneous populations present contrasting numbers of alleles and lineages at the loci with high-density expression products. In spite of these differences, the homozygosity levels are comparable in almost all of them. The balanced distribution of HLA alleles is consistent with overdominant selection. The genetic distances between outbred populations correlate with their geographical locations; the formal genetic distance measurements are larger than expected between inbred populations in the same region. The latter present many unique alleles grouped in a few lineages consistent with limited founder polymorphism in which any novel allele may have been positively selected to enlarge the communal peptide-binding repertoire of a given population. On the other hand, it has been observed that some alleles are found in multiple populations with distinctive haplotypic associations suggesting that convergent evolution events may have taken place as well. It appears that the HLA system has been under strong selection, probably owing to its fundamental role in varying immune responses. Therefore, allelic diversity in HLA should be analysed in conjunction with other genetic markers to accurately track the migrations of modern humans.
doi:10.1098/rstb.2011.0320
PMCID: PMC3267126  PMID: 22312049
histocompatibility; HLA; migrations; selection; convergent evolution; diversification
3.  Evidence for More than One Parkinson's Disease-Associated Variant within the HLA Region 
PLoS ONE  2011;6(11):e27109.
Parkinson's disease (PD) was recently found to be associated with HLA in a genome-wide association study (GWAS). Follow-up GWASs replicated the PD-HLA association, but their top hits differ. Do the different hits tag the same locus, or is there more than one PD-associated variant within HLA? We show that the top GWAS hits are not correlated with each other (0.00 ≤ r² ≤ 0.15). Using our GWAS (2000 cases, 1986 controls) we conducted step-wise conditional analysis on 107 SNPs with P < 10⁻³ for PD-association; 103 dropped out, four remained significant. Each SNP, when conditioned on the other three, yielded P(SNP1) = 5×10⁻⁴, P(SNP2) = 5×10⁻⁴, P(SNP3) = 4×10⁻³ and P(SNP4) = 0.025. The four SNPs were not correlated (0.01 ≤ r² ≤ 0.20). Haplotype analysis (excluding rare SNP2) revealed increasing PD risk with increasing risk alleles from OR = 1.27, P = 5×10⁻³ for one risk allele to OR = 1.65, P = 4×10⁻⁸ for three. Using an additional 843 cases and 856 controls we replicated the independent effects of SNP1 (P = 0.04 conditioned on SNP4) and SNP4 (P = 0.04 conditioned on SNP1); SNP2 and SNP3 could not be replicated. In pooled GWAS and replication, SNP1 had OR = 1.23 and P = 6×10⁻⁷ conditioned on SNP4; SNP4 had OR = 1.18 and P = 3×10⁻³ conditioned on SNP1; and the haplotype with both risk alleles had OR = 1.48, P = 2×10⁻¹². Genotypic OR increased with the number of risk alleles an individual possessed up to OR = 1.94, P = 2×10⁻¹¹ for individuals who were homozygous for the risk allele at both SNP1 and SNP4. SNP1 is a variant in HLA-DRA and is associated with HLA-DRA, DRB5 and DQA2 gene expression. SNP4 is correlated (r² = 0.95) with variants that are associated with HLA-DQA2 expression, and with the top HLA SNP from the IPDGC GWAS (r² = 0.60). Our findings suggest more than one PD-HLA association; either different alleles of the same gene, or separate loci.
doi:10.1371/journal.pone.0027109
PMCID: PMC3212531  PMID: 22096524
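The r² values quoted in the abstract above measure correlation (linkage disequilibrium) between pairs of SNPs. A minimal sketch of that calculation follows, assuming phased haplotypes coded 0/1 at two biallelic SNPs; the function name `ld_r_squared` and the input layout are hypothetical, and the study itself may have estimated r² from unphased genotypes.

```python
def ld_r_squared(haplotypes):
    """Squared correlation r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) pairs, where a and b are 0 or 1 and give
                the allele carried at SNP A and SNP B on one chromosome
                (assumed phased input for this sketch).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom
```

Values near 0, like those reported for the four SNPs, indicate that one SNP carries little information about the other, which is why conditioning on one does not remove the signal at another.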
4.  Novel sequence feature variant type analysis of the HLA genetic association in systemic sclerosis 
Human Molecular Genetics  2009;19(4):707-719.
We describe a novel approach to genetic association analyses with proteins sub-divided into biologically relevant smaller sequence features (SFs), and their variant types (VTs). SFVT analyses are particularly informative for study of highly polymorphic proteins such as the human leukocyte antigen (HLA), given the nature of its genetic variation: the high level of polymorphism, the pattern of amino acid variability, and that most HLA variation occurs at functionally important sites, as well as its known role in organ transplant rejection, autoimmune disease development and response to infection. Further, combinations of variable amino acid sites shared by several HLA alleles (shared epitopes) are most likely better descriptors of the actual causative genetic variants. In a cohort of systemic sclerosis patients/controls, SFVT analysis shows that a combination of SFs implicating specific amino acid residues in peptide binding pockets 4 and 7 of HLA-DRB1 explains much of the molecular determinant of risk.
doi:10.1093/hmg/ddp521
PMCID: PMC2807365  PMID: 19933168
5.  Sequence Feature Variant Type (SFVT) Analysis of the HLA Genetic Association in Juvenile Idiopathic Arthritis 
The immune response HLA class II DRB1 gene provides the major genetic contribution to Juvenile Idiopathic Arthritis (JIA), with a hierarchy of predisposing through intermediate to protective effects. With JIA, and the many other HLA associated diseases, it is difficult to identify the combinations of biologically relevant amino acid (AA) residues directly involved in disease due to the high level of HLA polymorphism, the pattern of AA variability, including varying degrees of linkage disequilibrium (LD), and the fact that most HLA variation occurs at functionally important sites. In a subset of JIA patients with the clinical phenotype oligoarticular-persistent (OP), we have applied a recently developed novel approach to genetic association analyses with genes/proteins sub-divided into biologically relevant smaller sequence features (SFs), and their “alleles” which are called variant types (VTs). With SFVT analysis, association tests are performed on variation at biologically relevant SFs based on structural (e.g., beta-strand 1) and functional (e.g., peptide binding site) features of the protein. We have extended the SFVT analysis pipeline to additionally include pairwise comparisons of DRB1 alleles within serogroup classes, our extension of the Salamon Unique Combinations algorithm, and LD patterns of AA variability to evaluate the SFVT results; all of which contributed additional complementary information. With JIA-OP, we identified a set of single AA SFs, and SFs in which they occur, particularly pockets of the peptide binding site, that account for the major disease risk attributable to HLA DRB1. These are (in numeric order): AAs 13 (pockets 4 and 6), 37 and 57 (both pocket 9), 67 (pocket 7), 74 (pocket 4), and 86 (pocket 1), and to a lesser extent 30 (pockets 6 and 7) and 71 (pockets 4, 5, and 7).
PMCID: PMC2958177  PMID: 19908388
6.  Genetic variation within the HLA class III influences T1D susceptibility conferred by high risk HLA haplotypes 
Genes and immunity  2010;11(3):209-218.
HLA class II DRB1 and DQB1 represent the major type 1 diabetes (T1D) genetic susceptibility loci; however, other genes in the HLA region are also involved in T1D risk. We analyzed 1411 pedigrees (2865 affected individuals) from the Type 1 Diabetes Genetics Consortium (T1DGC) genotyped for HLA classical loci and for 12 SNPs in the class III region previously shown to be associated with T1D in a subset of 886 pedigrees. Using the transmission disequilibrium test, we compared the proportion of SNP alleles transmitted from within the high risk DR3 and DR4 haplotypes to affected offspring. Markers rs4151659 (mapping to CFB) and rs7762619 (mapping 5′ of LTA) were the most strongly associated with T1D on DR3 (p = 1.2 × 10⁻⁹ and p = 2 × 10⁻¹², respectively) and DR4 (p = 4 × 10⁻¹⁵ and p = 8 × 10⁻⁸, respectively) haplotypes. They remained significantly associated after stratifying individuals in analyses for B*1801, A*0101-B*0801, DPB1*0301, DPB1*0202, DPB1*0401 or DPB1*0402. Rs7762619 and rs4151659 are in strong linkage disequilibrium (LD) (r² = 0.82) with each other, but a joint analysis showed that the association for each SNP was not solely due to LD. Our data support a role for more than one locus in the class III region contributing to risk of T1D.
doi:10.1038/gene.2009.104
PMCID: PMC2858242  PMID: 20054343
Type 1 diabetes; DR3; DR4; linkage disequilibrium; fine mapping
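The transmission disequilibrium test used in the abstract above compares how often heterozygous parents transmit a given allele to affected offspring with how often they do not. The sketch below shows the basic, unstratified form of the statistic only; the function names and the record layout are assumptions for illustration, and it does not reproduce the study's restriction to DR3 and DR4 haplotypes.

```python
def count_transmissions(parent_records, allele):
    """Count transmissions of `allele` from heterozygous parents.

    parent_records: iterable of ((allele1, allele2), transmitted_allele),
                    one record per parent of an affected child
                    (hypothetical layout for this sketch).
    Returns (transmitted, untransmitted).
    """
    transmitted = untransmitted = 0
    for (a1, a2), sent in parent_records:
        # Only parents heterozygous for the allele are informative.
        if a1 == a2 or allele not in (a1, a2):
            continue
        if sent == allele:
            transmitted += 1
        else:
            untransmitted += 1
    return transmitted, untransmitted


def tdt_chi_square(transmitted, untransmitted):
    """McNemar-type TDT statistic, (b - c)^2 / (b + c), with 1 degree of freedom."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / total
```

Under the null hypothesis of no linkage or association, each allele of a heterozygous parent is transmitted with probability 0.5, so the statistic is compared with a chi-square distribution with one degree of freedom.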
7.  Analysis of Maternal–Offspring HLA Compatibility, Parent-of-Origin Effects, and Noninherited Maternal Antigen Effects for HLA–DRB1 in Systemic Lupus Erythematosus 
Arthritis and rheumatism  2010;62(6):1712-1717.
Objective
Genetic susceptibility to systemic lupus erythematosus (SLE) is well established, with the HLA class II DRB1 and DQB1 loci demonstrating the strongest association. However, HLA may also influence SLE through novel biologic mechanisms in addition to genetic transmission of risk alleles. Evidence for increased maternal–offspring HLA class II compatibility in SLE and differences in maternal versus paternal transmission rates (parent-of-origin effects) and nontransmission rates (noninherited maternal antigen [NIMA] effects) in other autoimmune diseases have been reported. Thus, we investigated maternal–offspring HLA compatibility, parent-of-origin effects, and NIMA effects at DRB1 in SLE.
Methods
The cohort comprised 707 SLE families and 188 independent healthy maternal–offspring pairs (total of 2,497 individuals). Family-based association tests were conducted to compare transmitted versus nontransmitted alleles (transmission disequilibrium test) and both maternally versus paternally transmitted (parent-of-origin) and nontransmitted alleles (using the chi-square test of heterogeneity). Analyses were stratified according to the sex of the offspring. Maternally affected offspring DRB1 compatibility in SLE families was compared with paternally affected offspring compatibility and with independent control maternal–offspring pairs (using Fisher’s test) and was restricted to male and nulligravid female offspring with SLE.
Results
As expected, DRB1 was associated with SLE (P < 1 × 10⁻⁴). However, mothers of children with SLE had similar transmission and nontransmission frequencies for DRB1 alleles when compared with fathers, including those for the known SLE risk alleles HLA–DRB1*0301, *1501, and *0801. No association between maternal–offspring compatibility and SLE was observed.
Conclusion
Maternal–offspring HLA compatibility, parent-of-origin effects, and NIMA effects at DRB1 are unlikely to play a role in SLE.
doi:10.1002/art.27426
PMCID: PMC2948464  PMID: 20191587
8.  High-Density SNP Screening of the Major Histocompatibility Complex in Systemic Lupus Erythematosus Demonstrates Strong Evidence for Independent Susceptibility Regions 
PLoS Genetics  2009;5(10):e1000696.
A substantial genetic contribution to systemic lupus erythematosus (SLE) risk is conferred by major histocompatibility complex (MHC) gene(s) on chromosome 6p21. Previous studies in SLE have lacked statistical power and genetic resolution to fully define MHC influences. We characterized 1,610 Caucasian SLE cases and 1,470 parents for 1,974 MHC SNPs, the highly polymorphic HLA-DRB1 locus, and a panel of ancestry informative markers. Single-marker analyses revealed strong signals for SNPs within several MHC regions, as well as with HLA-DRB1 (global p = 9.99×10⁻¹⁶). The most strongly associated DRB1 alleles were: *0301 (odds ratio, OR = 2.21, p = 2.53×10⁻¹²), *1401 (OR = 0.50, p = 0.0002), and *1501 (OR = 1.39, p = 0.0032). The MHC region SNP demonstrating the strongest evidence of association with SLE was rs3117103, with OR = 2.44 and p = 2.80×10⁻¹³. Conditional haplotype and stepwise logistic regression analyses identified strong evidence for association between SLE and the extended class I, class I, class III, class II, and the extended class II MHC regions. Sequential removal of SLE–associated DRB1 haplotypes revealed independent effects due to variation within OR2H2 (extended class I, rs362521, p = 0.006), CREBL1 (class III, rs8283, p = 0.01), and DQB2 (class II, rs7769979, p = 0.003, and rs10947345, p = 0.0004). Further, conditional haplotype analyses demonstrated that variation within MICB (class I, rs3828903, p = 0.006) also contributes to SLE risk independent of HLA-DRB1*0301. Our results for the first time delineate with high resolution several MHC regions with independent contributions to SLE risk. We provide a list of candidate variants based on biologic and functional considerations that may be causally related to SLE risk and warrant further investigation.
Author Summary
Systemic lupus erythematosus (SLE) is an autoimmune disease characterized by autoantibody production and involvement of multiple organ systems. Although the cause of SLE remains unknown, several lines of evidence underscore the importance of genetic factors. As is true for most autoimmune diseases, a substantial genetic contribution to disease risk is conferred by major histocompatibility complex (MHC) gene(s) on chromosome 6. This region of the genome contains a large number of genes that participate in the immune response. However, the full contribution of this genomic region to SLE risk has not yet been defined. In the current study we characterize a large number of SLE patients and family members for approximately 2,000 MHC region variants to identify the specific genes that influence disease risk. Our results, for the first time, implicate four different MHC regions in SLE risk. We provide a list of candidate variants based on biologic and functional considerations that may be causally related to SLE risk and warrant further investigation.
doi:10.1371/journal.pgen.1000696
PMCID: PMC2758598  PMID: 19851445
9.  Balancing selection and heterogeneity across the classical human leukocyte antigen loci: a meta-analytic review of 497 population studies 
Human immunology  2008;69(7):443-464.
This paper presents a meta-analysis of high-resolution human leukocyte antigen (HLA) allele frequency data describing 497 population samples. Most of the datasets were compiled from studies published in eight journals from 1990 to 2007; additional datasets came from the International Histocompatibility Workshops and from the AlleleFrequencies.net database. In all, these data represent approximately 66,800 individuals from throughout the world, providing an opportunity to observe trends that may not have been evident at the time the data were originally analyzed, especially with regard to the relative importance of balancing selection among the HLA loci. Population genetic measures of allele frequency distributions were summarized across populations by locus and geographic region. A role for balancing selection maintaining much of HLA variation was confirmed. Further, the breadth of this meta-analysis allowed the ranking of the HLA loci, with DQA1 and HLA-C showing strongest balancing selection and DPB1 being compatible with neutrality. Comparisons of the allelic spectra reported by studies since 1990 suggest that most of the HLA alleles identified since 2000 are very-low-frequency alleles. The literature-based allele-count data, as well as maps summarizing the geographic distributions for each allele, are available online.
doi:10.1016/j.humimm.2008.05.001
PMCID: PMC2632948  PMID: 18638659
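The abstract above does not name the population genetic measures it summarizes. One common measure of an allele frequency distribution in tests for balancing selection is the expected homozygosity F, the sum of squared allele frequencies. The sketch below computes it from a list of observed allele copies; the function name and input layout are assumptions made for illustration.

```python
from collections import Counter


def expected_homozygosity(alleles):
    """Sum of squared allele frequencies (expected homozygosity F).

    alleles: list of allele names, one entry per sampled gene copy
             (2n entries for n diploid individuals; assumed layout).
    """
    n = len(alleles)
    counts = Counter(alleles)
    return sum((c / n) ** 2 for c in counts.values())
```

Balancing selection tends to even out allele frequencies, which lowers F relative to the value expected under neutrality for the same sample size and number of alleles; comparing the two requires a neutral expectation that this sketch does not compute.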
10.  Conditional genotype analysis: detecting secondary disease loci in linkage disequilibrium with a primary disease locus 
BMC Proceedings  2007;1(Suppl 1):S163.
A number of autoimmune and other diseases have well established HLA associations; in many cases there is strong evidence for the direct involvement of the HLA class II peptide-presenting antigens, e.g., HLA DR-DQ for type 1 diabetes (T1D) and HLA-DR for rheumatoid arthritis (RA). The involvement of additional HLA region genes in the disease process is implicated in these diseases. We have developed a model-free approach to detect these additional disease genes using genotype data; the conditional genotype method (CGM) and overall conditional genotype method (OCGM) use all patient and control data and do not require haplotype estimation. Genotypes at marker genes in the HLA region are stratified and their expected values are determined in a way that removes the effects of linkage disequilibrium (LD) with the peptide-presenting HLA genes directly involved in the disease. A statistic has been developed under the null hypothesis of no additional disease genes in the HLA region for the OCGM method and was applied to the Genetic Analysis Workshop 15 simulated data set of Problem 3, which mimics RA (answers were known). In addition to the primary effect of the HLA DR locus, the effects of the other two HLA region simulated genes involved in disease were detected (gene C, 0 cM from DR, increases RA risk only in women; and gene D, 5.12 cM from DR, rare allele increases RA risk five-fold). No false negatives were found. Power calculations were performed.
PMCID: PMC2367484  PMID: 18466509
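The abstract above says that marker genotypes are stratified and that their expected values are determined so as to remove LD with the primary disease locus, but it does not give the statistic. The sketch below is one plausible reading of the stratification step only, not the authors' CGM or OCGM implementation: within each primary-locus genotype, expected patient counts for each marker genotype are taken from the control frequencies in that same stratum. All names and the input layout are hypothetical.

```python
from collections import Counter, defaultdict


def stratified_expected_counts(patients, controls):
    """Observed vs. expected marker genotype counts within primary-locus strata.

    patients, controls: lists of (primary_genotype, marker_genotype) pairs
                        (assumed layout for this sketch).
    Returns {(primary_genotype, marker_genotype): (observed, expected)},
    where `observed` is the patient count and `expected` is the patient
    stratum size times the control frequency in the same stratum.
    """
    pat_by_stratum = defaultdict(Counter)
    ctl_by_stratum = defaultdict(Counter)
    for primary, marker in patients:
        pat_by_stratum[primary][marker] += 1
    for primary, marker in controls:
        ctl_by_stratum[primary][marker] += 1

    result = {}
    for primary, pat_counts in pat_by_stratum.items():
        ctl_counts = ctl_by_stratum.get(primary)
        if not ctl_counts:
            continue  # no controls with this primary genotype; stratum skipped
        n_pat = sum(pat_counts.values())
        n_ctl = sum(ctl_counts.values())
        for marker in set(pat_counts) | set(ctl_counts):
            expected = n_pat * ctl_counts[marker] / n_ctl
            result[(primary, marker)] = (pat_counts[marker], expected)
    return result
```

Because patients and controls are compared only within the same primary-locus genotype, a marker that differs between them merely through LD with that locus should show observed counts close to expected.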
