High-density lipoprotein (HDL) particles exhibit multiple antiatherogenic effects. They are key players in the reverse cholesterol transport which shuttles cholesterol from peripheral cells (e.g. macrophages) to the liver or other tissues. This complex process is thought to represent the basis for the antiatherogenic properties of HDL particles. The amount of cholesterol transported in HDL particles is measured as HDL cholesterol (HDLC) and is inversely correlated with the risk for coronary artery disease: an increase of 1 mg/dL of HDLC levels is associated with a 2% and 3% decrease of the risk for coronary artery disease in men and women, respectively. Genetically determined conditions with high HDLC levels (e.g. familial hyperalphalipoproteinemia) often coexist with longevity, and higher HDLC levels were found among healthy elderly individuals.
HDLC levels are under considerable genetic control with heritability estimates of up to 80%. The identification and characterization of genetic variants associated with HDLC concentrations can provide new insights into the background of longevity. This review provides an extended overview on the current genetic-epidemiological evidence from association studies on genes involved in HDLC metabolism. It provides a path through the jungle of association studies which are sometimes confusing due to the varying and sometimes erroneous names of genetic variants, positions and directions of associations. Furthermore, it reviews the recent findings from genome-wide association studies which have identified new genes influencing HDLC levels.
The genes identified so far together explain less than 10% of the HDLC variance, which leaves considerable room for genetic variants yet to be identified. These might be found by large population-based genome-wide meta-analyses and by deep-sequencing approaches on the identified genes. The resulting findings will probably lead to a re-drawing and extension of the metabolic pathways involved in HDLC metabolism.
High-density lipoproteins (HDL) are a heterogeneous group of particles composed of a core of cholesteryl ester and triglycerides surrounded by an amphipathic layer of free cholesterol, phospholipids and apolipoproteins (apo) (Klos et al., 2007).
HDL particles exhibit multiple antiatherogenic effects (Kontush et al., 2006). They shuttle cholesterol from peripheral cells (e.g. macrophages) to the liver or other tissues in need of large amounts of cholesterol (Bruce et al., 1998), an important step that relieves the peripheral cells from cholesterol burden (Figure 1). The concept for this complex process was first proposed by Glomset 40 years ago (Glomset 1968) and is called reverse cholesterol transport. It is thought to represent the basis for the antiatherogenic properties of HDL (Kontush et al., 2006; Von Eckardstein et al., 2001). Furthermore, HDL has antioxidative properties due to associated antioxidative enzymes and exerts anti-inflammatory activity in various pathways.
Figure 1 illustrates the major routes of lipoprotein metabolism to allow a better understanding of HDL metabolism (Kwan et al., 2007). HDL precursor particles are secreted as disc-shaped structures by the liver and intestine and can absorb free cholesterol from cell membranes, a process mediated by ABCA1, apoA-I and apoA-IV. While ABCA1 represents the best understood active cholesterol efflux mechanism, the HDL-mediated removal of free cholesterol from the peripheral cells is also aided by other active transporters such as ABCG1 as well as by passive diffusion, which can be enhanced by plasma membrane receptors such as SCARB1 (Cavelier et al., 2006). ApoA-I is the major apolipoprotein of HDL and activates the enzyme lecithin:cholesterol acyltransferase (LCAT), which esterifies the accepted free cholesterol to allow more efficient packaging of the cholesterol for transport. By acquisition of additional apolipoproteins, cholesteryl esters, and triglycerides, HDL3 particles are transformed into spherical HDL2 particles (Dieplinger et al., 1985). Reverse cholesterol transport can take three different routes. First, large HDL particles with multiple copies of apoE can be taken up by the liver via the LDL receptor (Bruce et al., 1998). Second, the accumulated cholesteryl esters from HDL can be selectively taken up by the liver, mediated by SR-B1 (Acton et al., 1996). This receptor is expressed primarily in liver and nonplacental steroidogenic tissues. Third, cholesteryl esters are transferred by the cholesteryl ester transfer protein (CETP) from HDL to triglyceride-rich lipoproteins (Bruce et al., 1998). Serum HDL cholesterol levels are influenced by the complexity of these reverse cholesterol transport processes. Disturbances in the concentrations of apoproteins, function of enzymes, transport proteins, receptors, other lipoproteins, and their clearance from plasma can have a major impact on the anti-atherogenic properties of HDL.
High-density lipoprotein (HDL) particles exhibit multiple antiatherogenic effects (Kontush et al., 2006) and HDL cholesterol (HDLC) concentrations show a strong inverse correlation with the risk of coronary artery disease (CAD) (Lewington et al., 2007). Epidemiological studies highlighted the antiatherogenic function of HDLC and showed that an increase of 1 mg/dL of HDLC levels is associated with a 2% and 3% decrease of the risk for CAD in men and women, respectively (Wilson 1990; Linsel-Nitschke et al., 2005). A recent meta-analysis (Lewington et al., 2007) including prospective observational studies with HDLC measurements available in 150,000 individuals demonstrated a strong negative association with ischemic heart disease mortality in every age group, with no evidence of a threshold beyond which higher HDL cholesterol was no longer associated with lower mortality. On average, 13 mg/dL higher HDL cholesterol was associated with about a third lower ischemic heart disease mortality. Within every age group, the strength of this association was comparable for men and women. Moreover, clinical trials have established that increasing HDLC levels by drugs could reduce CAD risk (Gotto, Jr. 2001; Manninen et al., 1988; Rubins et al., 1999; Kronenberg 2004), thus bringing longer life expectancy. This hypothesis is compatible with observations that familial hyperalphalipoproteinemia often coexists with longevity (Patsch et al., 1981), and that higher HDLC levels are found among healthy elderly aged 85-89 years as compared to those in middle-aged subjects (Nikkila et al., 1990). Accordingly, HDL and molecules involved in HDL metabolism seem to be attractive candidates for longevity-promoting factors (Arai et al., 2004).
Since HDLC levels are under considerable genetic control with heritability estimates of up to 80% (Kronenberg et al., 2002; Perusse et al., 1997; Wang et al., 2005; Goode et al., 2007), the identification and characterization of genetic variants associated with HDLC concentrations can provide useful information on genotype-phenotype relationships and give new insights into the background of longevity.
This article reviews the current genetic-epidemiological evidence from association studies on genes involved in HDL metabolism. Genetic association studies estimate the difference in mean HDLC levels between subjects with a certain genetic variant and those without it, and thus provide a measure of association between the genetic background and HDLC levels in humans. This overview should provide a path through the jungle of association studies, which often report different names and positions for the same genetic variant. In addition, associations are sometimes reported in the direction of the “risk allele”, that is the genetic variant associated with increased CAD risk and thus lower HDLC levels, and sometimes in the direction of the “minor allele”, that is the less frequent variant in the population, which may even differ between study populations. Furthermore, the article reviews the recent findings from genome-wide association studies, which have identified new genes influencing HDLC levels, and offers a first glimpse of future developments. It is not intended to provide an overview of functional studies.
The literature search for genes previously reported for association with HDLC was performed in PubMed using search terms such as “(meta-analysis OR associat* OR epidemiolog*) AND (polymorphism OR genetic OR mutation) AND HDL AND human” with an Entrez Date in PubMed until April 2008 (EDAT: the date the citation was added to PubMed). Results were complemented by knowledge of the lipid-experienced investigators of this study.
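For readers who wish to repeat a comparable search programmatically, the following minimal sketch builds a request URL for the public NCBI E-utilities `esearch` endpoint, restricted by Entrez Date. It is an illustration only and not the procedure used for this review: the function name, the lower date bound, the result limit and the exact cut-off day (30 April 2008) are assumptions, since the text states only "until April 2008". The sketch constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Search term as quoted in the text.
SEARCH_TERM = (
    "(meta-analysis OR associat* OR epidemiolog*) AND "
    "(polymorphism OR genetic OR mutation) AND HDL AND human"
)


def build_esearch_url(term, max_date, min_date="1900/01/01", retmax=100):
    """Return an esearch URL limited by Entrez Date (datetype=edat).

    min_date, max_date and retmax are illustrative assumptions;
    E-utilities expects mindate and maxdate to be supplied together.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "edat",   # filter on the date the citation entered PubMed
        "mindate": min_date,
        "maxdate": max_date,
        "retmax": retmax,
    }
    return ESEARCH_URL + "?" + urlencode(params)


# Assumed cut-off day; the review gives only the month.
url = build_esearch_url(SEARCH_TERM, max_date="2008/04/30")
print(url)
```

The hits returned by such a query would still need the manual eligibility screening described in the following paragraphs.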
Eligible studies for data extraction were meta-analyses, population-based studies, studies in the general population, healthy control populations, hospital-based controls, control populations selected for not having type 2 diabetes mellitus (T2DM), and case-control studies for cardiovascular disease (CVD) or T2DM. Studies which were included in a meta-analysis were only indirectly considered by showing the results of this meta-analysis in the tables instead of each single study. We did not consider studies with special patient populations other than cardiovascular disease or T2DM (e.g. patients with chronic kidney disease).
We excluded studies published in a non-English language for which only the abstract was available in English, as well as small studies with fewer than 500 individuals.
Studies that investigated a large number of genetic variants were considered to provide a statistically significant association only if the association held at a significance level corrected for multiple comparisons. Most reported associations did not sufficiently account for this. Moreover, we omitted studies on polymorphisms whose minor allele frequency (MAF) was so low that, given the number of individuals investigated, fewer than about 30 study participants carried the minor allele.
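The two filters described above can be illustrated with a short sketch. It rests on assumptions that are not stated in the text: genotypes are taken to be in Hardy-Weinberg equilibrium when estimating the number of minor-allele carriers, a Bonferroni correction stands in for the unspecified multiple-comparison correction, and the overall significance level of 0.05 is chosen for illustration. All function names are hypothetical.

```python
def expected_minor_allele_carriers(n_participants, maf):
    """Expected number of participants carrying at least one minor allele.

    Assumes Hardy-Weinberg equilibrium: the non-carrier frequency
    is (1 - maf) squared.
    """
    return n_participants * (1.0 - (1.0 - maf) ** 2)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction (assumed method)."""
    return alpha / n_tests


def has_enough_carriers(n_participants, maf, min_carriers=30):
    """Apply the 'about 30 participants with the minor allele' rule of thumb."""
    return expected_minor_allele_carriers(n_participants, maf) >= min_carriers


# A SNP with MAF 2% in 500 participants yields about 20 expected carriers
# and would be omitted; in 1000 participants it yields about 40.
print(expected_minor_allele_carriers(500, 0.02))
print(has_enough_carriers(500, 0.02), has_enough_carriers(1000, 0.02))

# Testing 20 variants at an overall level of 0.05 requires p < 0.0025 per variant.
print(bonferroni_threshold(0.05, 20))
```

Under these assumptions, a variant with a MAF of 2% passes the carrier filter only in studies of roughly 760 participants or more.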
If the results of a particular polymorphism of a certain study population were considered in more than one publication, only one of these publications was referenced in our tables.
A gene was considered an HDLC candidate gene if at least two eligible studies reported a statistically significant association of genetic variation within or related to the particular gene with HDLC levels. An exception from this rule was made for LCAT for which strong functional evidence was available combined with one large and a few small association studies.
Rarely investigated genetic variations that showed a significant association with HDLC levels in at least one large study (n>1000) are listed in an additional table (Table 11).
Tables 1-11 and the Supplementary Table show the results of all eligible candidate gene association studies that were accessible in PubMed up to the Entrez Date of April 2008. Although we took care not to miss the most important studies, we are aware that the lists of studies dealing with candidate genes for HDL cholesterol may not be complete.
It should be noted that identifying the precise rs numbers of some polymorphisms was very tedious, if not impossible, because rs numbers were often lacking or incorrectly reported. In some of these cases we were only able to derive rs numbers by combining literature reports with databases such as HapMap or NCBI. In some cases the effect size of the investigated SNP on HDL cholesterol was only displayed in a figure, which allowed only an approximate estimate of the numerical value.
For the single nucleotide polymorphisms (SNPs) of the candidate genes reported in Tables 1-9 we imported data from HapMap and constructed an r2-plot using HaploView 4.1 to illustrate the correlation of the SNPs reported for each gene. For APOA1, APOC3 and PON1 no r2-plot was constructed since only one or none of the tabulated SNPs was available in HapMap. Furthermore, rs2303790 of CETP and rs2066718 of ABCA1 are not displayed in the r2-plots since the MAF of these SNPs was 0.0% for the imported HapMap data.
A further chapter reviews the evidence recently derived from genome-wide association (GWA) studies. SNPs derived from these studies are not listed in Tables 1-11 but are provided in an extra table (Table 12).
The CETP gene is located on chromosome 16 and consists of 16 exons and 15 introns and is a member of the lipopolysaccharide binding protein gene family (Yamashita et al., 2001). The gene locus is highly polymorphic with several common polymorphisms as well as rare mutations (Thompson et al., 2007).
CETP is a key plasma protein that influences circulating levels of HDLC by mediating the transfer of esterified cholesterol from HDL to apoB-containing particles in exchange for triglycerides (Tall 1993) (Figure 1). The effect of this transfer is an increase in atherogenic apoB-containing particles and a decrease of HDL (Carlquist et al., 2007). The lipid profile typically seen in subjects with CETP deficiency includes elevated HDL cholesterol levels with LDL cholesterol levels generally in the normal range (Miller et al., 2003). Complete loss of CETP activity due to mutations in the CETP gene can result in up to five times the normal HDLC levels, whereas heterozygous CETP deficiency results in milder (10-30%) increases in HDLC levels (Klos et al., 2007).
A great number of studies investigated the association of CETP polymorphisms with HDLC concentrations and the evidence for an association is very clear (Table 1). Although these SNPs are in strong linkage disequilibrium they are not necessarily strongly correlated (Heid et al., 2008; Thompson et al., 2007; McCaskie et al., 2007). SNPs reported in Table 1 that were genotyped in HapMap were for the most part only weakly correlated with each other (Figure 2). Some of the investigated SNPs showed pronounced additive influences on HDLC levels with changes of 2-3 mg/dL per copy of the minor allele. Meta-analyses including more than 10,000 individuals are available for rs708272 (Taq1B) and rs5882 (Ile405Val) showing a clear association of these SNPs with HDLC levels (Boekholdt et al., 2003; Boekholdt et al., 2005). Recently, a dense genotyping approach in more than 2000 individuals provided a very clear overview on the genetic variability within this gene and its associations with HDLC levels (Thompson et al., 2007). During the review process of this manuscript, a further meta-analysis on CETP was published (Thompson et al., 2008), which is not included in Table 1 since the Entrez Date in PubMed was after our deadline of April 2008. The results of this meta-analysis which included 26 to 72 studies with 39,581 to 68,134 participants for the three common SNPs rs708272 (Taq1B), rs5882 (I405V), and rs1800775 (-629C>A) are in line with the results of the other studies listed in Table 1 and underline the significant influence of the CETP gene on HDLC levels.
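The statement that SNPs can be in strong linkage disequilibrium without being strongly correlated can be made concrete with the two standard pairwise measures, D' and r2. The sketch below assumes that "strong linkage disequilibrium" refers to a measure such as D'; the haplotype frequencies are invented for illustration and do not describe any CETP SNP pair.

```python
def ld_measures(p_a, p_b, p_ab):
    """Return (D', r2) for two biallelic SNPs.

    p_a  : frequency of allele A at the first SNP
    p_b  : frequency of allele B at the second SNP
    p_ab : frequency of the haplotype carrying both A and B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical pair: allele B (10%) occurs only on haplotypes carrying A (50%).
d_prime, r2 = ld_measures(p_a=0.5, p_b=0.1, p_ab=0.1)
print(d_prime, r2)  # D' is 1 (complete LD), r2 is about 0.11 (weak correlation)
```

Because r2 depends on how similar the allele frequencies of the two SNPs are, a pair with D' of 1 can still carry largely independent information in an association analysis.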
Lipoprotein lipase (LPL) is a critical enzyme involved in lipolysis of triglyceride-rich lipoproteins, chylomicrons and VLDL (Miller et al., 2004). LPL does not act directly on HDL, but its action on triglyceride-rich lipoproteins has an important indirect effect on HDL metabolism (Lewis et al., 2005) (Figure 1). Familial LPL deficiency is a rare autosomal recessive disorder characterized by absence of LPL activity and a massive accumulation of chylomicrons in plasma. Homozygous deficiency is associated with severe hypertriglyceridemia and marked reductions in high- and low-density lipoprotein cholesterol levels (Klos et al., 2007).
Our literature search identified nine polymorphisms investigated for an association with HDL cholesterol (Table 2 and Supplementary Table). Most of these SNPs showed a significant association with HDLC levels (Table 2), whereas no study found a significant association for the common polymorphism rs285, which is only weakly correlated with the other SNPs (Supplementary Table and Figure 2). Five of the seven SNPs available in HapMap are located in a block of strong linkage disequilibrium (Heid et al., 2008), and rs320, rs326, rs13702 and rs10105606 in particular show high correlations (Figure 2). A meta-analysis is available for four common polymorphisms, all of which demonstrated significant associations with HDLC (Wittrup et al., 1999): the minor allele of rs328 (S447X, or S474X according to NCBI) in >4000 individuals was associated with increased HDLC levels, whereas the minor alleles of rs268 (N291S) in almost 15,000 individuals, Gly188Glu (no rs number available) in >10,000 individuals and rs1801177 (D9N) in >5000 individuals were associated with lower HDLC levels.
The human hepatic lipase gene, LIPC, is located on chromosome 15q21. It comprises 9 exons and 8 introns, and spans a length of more than 30 kb. It encodes a protein of 449 amino acids with a signal peptide of 23 amino acids (Cai et al., 1989; Ameis et al., 1990). Hepatic lipase is a glycoprotein of approximately 65 kDa and is synthesized in hepatocytes and then secreted and bound to hepatocytes and hepatic endothelial surfaces. Hepatic lipase specifically catalyzes the hydrolysis of triglycerides, diglycerides and phospholipids in native lipoproteins (Miller et al., 2003). It appears to be involved in the selective uptake of cholesterol ester from HDL (Lambert et al., 2000) (Figure 1). To date, few patients with true hepatic lipase deficiency have been identified. All individuals with complete deficiency have had elevated plasma cholesterol and triglyceride concentrations. Furthermore, deficiency is characterized by abnormally triglyceride-rich HDL and LDL particles as well as high levels of HDLC (Cohen et al., 1999; Holleboom et al., 2008).
Seven SNPs in LIPC were reported to be associated with HDLC levels, two of them showing a very consistent association: rs1800588 and rs2070895 (Table 3). Data for six of the seven SNPs are available in HapMap, and the r2-plot indicates a strong correlation between three of these SNPs, which are all located in the promoter region of LIPC (Figure 2). A meta-analysis including more than 24,000 individuals is available for rs1800588 (also described as C-514T or C-480T) showing that one or two copies of the minor allele increase HDLC levels by 1.5 and 3.5 mg/dL, respectively (Isaacs et al., 2004).
Endothelial lipase (encoded by the LIPG gene) was discovered in 1999 independently by two different laboratories (Jaye et al., 1999; Hirata et al., 1999) and is a member of the triglyceride lipase gene family. The enzyme derives its name from its expression in endothelial cells (Jaye et al., 2004). The observation that overexpression of LIPG in mice resulted in a marked reduction of HDLC levels (Jaye et al., 1999) has led to the concept that LIPG may play an important role in HDL metabolism. It promotes the turnover of HDL components and increases the catabolism of apolipoprotein A-I (Figure 1). It was proposed that endothelial lipase, together with hepatic lipase, converts large HDL particles to smaller particles by its phospholipase activity (Jaye et al., 2004). Evidence is increasing that endothelial lipase might play a role in the etiology of the lipoprotein profile characteristic of the metabolic syndrome and that an increased activity of this enzyme is linked to the underlying proinflammatory state in the metabolic syndrome (Lamarche et al., 2007).
In our literature search we found five SNPs of LIPG which were significantly associated with HDL levels (Table 4). The evidence for an association is weaker than for CETP, LPL and LIPC, and the available studies are much smaller than for these three genes. Four of these SNPs were available in HapMap; three of them are correlated with each other, with r2 ranging from 0.31 to 0.96 (Figure 2).
LCAT is associated with HDL and esterifies cholesterol in plasma (Figure 1). ApoA-I serves as a cofactor for this reaction (Miller et al., 2004; Klos et al., 2007). Important knowledge on the function of this gene was derived from patients with LCAT deficiency which results in free cholesterol and phosphatidylcholine deposition in membranes followed by corneal opacification (fish-eye disease), anemia, and renal failure (Miller et al., 2004). Homozygous LCAT deficiency is an underlying cause of two conditions: familial LCAT deficiency and fish-eye disease. While the former is associated with the complete loss of LCAT activity, the latter is associated with a change in the substrate specificity of LCAT that becomes inactive toward HDL cholesterol, while retaining its activity toward LDL cholesterol. Both conditions are characterized by severe hypoalphalipoproteinemia, but only familial LCAT deficiency is strongly associated with premature CAD (Sviridov et al., 2007).
Even though its physiological function makes LCAT an important HDL candidate gene, only a few large studies have investigated the possible role of this gene in determining HDLC levels. Five polymorphisms of LCAT were investigated for an association with HDLC levels, but the findings are not entirely consistent (Table 5). The two SNPs available in HapMap (rs4986970 and rs2292318) are not correlated at all with each other (Figure 2), which could explain the presence of an association with HDLC for one SNP (rs2292318) and the lack of association for the other SNP (rs4986970). An influence of LCAT on HDLC levels is supported by genome-wide association studies (see below). There is an urgent need for large studies investigating the influence of common and rare genetic variations of this candidate gene on HDLC levels.
The SCARB1 gene has been localized to chromosome 12 spanning a region of 75 kb containing 13 exons. The gene encodes a receptor protein of approximately 80 kDa, whose weight can vary based on its extent of glycosylation. SCARB1 is highly expressed in liver and steroidogenic tissue (adrenal, ovaries, and testes) (Cao et al., 1997). It was isolated and characterized as a functional hepatic receptor for HDL (Acton et al., 1996). This receptor participates in the selective uptake of cholesterol ester (Miller et al., 2003), and binds a number of ligands with high affinity, including native HDL (Cao et al., 1997). The selective uptake involves the transfer of cholesterol from the HDL particle and the release of the lipid-poor HDL particle into the plasma (Acton et al., 1996) (Figure 1). It should be mentioned that SCARB1 has predominantly been studied in animal models where it has been shown to control levels of plasma HDLC and non-HDL cholesterol as well as the propensity for atherosclerosis (Miller et al., 2003), but it has to be determined whether this receptor has an equally important function in humans.
Despite the obvious functional evidence for an influence of SCARB1 on HDLC levels, the genetic-epidemiological evidence is relatively weak. Our literature search identified five polymorphic sites studied with HDLC levels with inconsistent results (Table 6). The two variants found in HapMap (rs5888 and rs5891) show no correlation with each other (Figure 2). Large genetic epidemiological studies on this gene are required before a final conclusion can be drawn.
The ATP binding cassette group of transporter proteins is involved in the transfer of various substances across plasma membranes, including ions, peptides, vitamins, and hormones. The ABCA1 transporter is expressed in liver, macrophages and steroidogenic tissues and plays a pivotal role in the initial phase of reverse cholesterol transport by mediating cholesterol and phospholipid efflux from macrophages to HDL (Miller et al., 2003; Cavelier et al., 2006) (Figure 1). Far more than 1000 polymorphic sites within this gene are known. We owe the most important knowledge on ABCA1 and its influence on HDL metabolism to a rare monogenic recessive disorder called Tangier disease, in which patients carry homozygous mutations in ABCA1 (Bodzioch et al., 1999; Rust et al., 1999; Brooks-Wilson et al., 1999). This causes a deposition of cholesteryl esters in the reticuloendothelial organs, which results in orange-colored tonsils. Cholesterol is trapped in macrophages due to the heavily disturbed cholesterol efflux. This results in lipid-depleted HDL particles that are rapidly catabolized, and thus in HDL deficiency.
Since the ABCA1 transporter plays an important physiological role in HDL metabolism, many studies attempted to show an association of this candidate gene with HDLC levels. Table 7 and the Supplementary Table list 19 SNPs of ABCA1 that were investigated in the literature. However, most studies did not reveal large effects of these polymorphisms on HDLC levels although many included a large number of individuals. One explanation for this lack of association might be that many rare (population frequency below 1%) and very rare (also called “private”) mutations in ABCA1, each with a strong effect, together add up to an important influence on HDL metabolism (Cohen et al., 2004). Such rare mutations are difficult to pinpoint in association studies because the number of mutation carriers is small even in large studies. A typical example was recently provided by the Copenhagen City Heart Study (Frikke-Schmidt et al., 2008b), which demonstrated in more than 9000 participants a pronounced effect of four rare mutations in ABCA1 on HDL cholesterol levels (a reduction of 17 mg/dL for heterozygotes vs noncarriers, p<0.001).
The r2-plot of the 9 available SNPs in HapMap shows that most SNPs are only weakly correlated which suggests that the missing associations of the single SNPs with HDL levels are independent of each other (Figure 2).
APOA1 is located in the APOA1/C3/A4 gene cluster on human chromosome 11q23, which also harbors APOA5.
As the major protein component of the HDL particle, apoA-I is an important ligand for HDL binding to cellular receptors, including scavenger receptor class B type 1 and ABCA1 (Rigotti et al., 1997; Remaley et al., 2001). In addition, apoA-I serves as a cofactor for cholesterol esterification and is an important component of reverse cholesterol transport (Miller et al., 2003) (Figure 1). The clinical importance of apoA-I is underscored by premature CVD in families with APOA1 deficiency and by marked atherosclerosis in knockout mouse models (Miller et al., 2003). Patients carrying two functionally relevant mutations in APOA1, being either homozygous or compound heterozygous, have extremely rare states of complete HDLC deficiency (Funke 1997) and present with two different clinical hallmarks, namely xanthomas or corneal opacities (Von Eckardstein 2006).
In our literature search we found three SNPs (rs670, rs5069 and rs5070) which were investigated for an association with HDLC levels. Mostly, large studies found a significant association with HDLC levels, but further large studies are required (Table 8 and Supplementary Table).
Apolipoprotein C-III is a component of triglyceride-rich lipoproteins and HDL particles and is transferred to HDL during the hydrolysis of triglyceride-rich lipoproteins (Miller et al., 2004). The major physiological role of ApoC-III appears to be the inhibition of LPL (Figure 1).
For the APOC3 gene, data on the influence of genetic variability on HDLC levels are heterogeneous. If indeed present, the influence seems to be weak (Table 8).
APOA5 is located proximal to the APOA1/C3/A4 gene cluster and was recently described by two groups (Pennacchio et al., 2001; van der Vliet et al., 2001). The human APOA5 gene consists of four exons and encodes a 369-amino-acid protein that is expressed only in the liver (Hubacek 2005). ApoA-V is predominantly located on TG-rich particles, chylomicrons and VLDL but also on HDL, and there is evidence that ApoA-V serves as an activator of LPL (Hubacek 2005) (Figure 1).
Among the apolipoproteins, APOA5 is the gene with the most pronounced associations with HDLC levels. Many polymorphisms of APOA5 were investigated in the literature, and most of them show an association with decreased levels of HDLC (Table 8). Only four SNPs are available in HapMap; these are only weakly correlated with each other, except for rs662799 and rs651821, which are strongly correlated (Figure 2).
Ghrelin was only recently described by Kojima et al. (Kojima et al., 1999). Ghrelin is an endogenous peptide; its active form is composed of 28 amino acids and circulates in plasma at a concentration of ~800-1000 pg/mL. Ghrelin is an orexigenic peptide predominantly produced by the stomach. Accumulating evidence suggests that ghrelin is a key regulator of body weight (Cummings et al., 2003).
Two studies found a significant inverse association of four polymorphisms with HDLC levels whereas another study found no association of five SNPs with HDLC (Table 9). The effects of the SNPs on HDLC levels seem to be independent from each other since most of the SNPs available in HapMap are not highly correlated (Figure 2).
The LDL-receptor is a cell surface receptor that mediates the uptake of LDL particles from the circulation via receptor-mediated endocytosis (Hobbs et al., 1990). Since this receptor plays an important role in plasma lipoprotein homeostasis (Muallem et al., 2007) (Figure 1), genetic variations in the LDLR gene that alter its expression could contribute to inter-individual differences in lipoprotein levels.
Some of the five literature-reported polymorphisms in the LDLR gene show an association with HDLC levels, whereas others do not (Table 9). Of the four SNPs available in HapMap, rs688 and rs5925 were strongly correlated (r2=0.96) (Figure 2).
Paraoxonase 1 is a member of the paraoxonase family that includes three isoenzymes; their genes are clustered on chromosome 7q21.3-22.1 (Primo-Parmo et al., 1996). PON1, a HDL-associated enzyme, is partly responsible for the antioxidation properties of HDL by protecting low-density lipoproteins against oxidation (Mackness et al., 1991).
The PON1 gene has two widely investigated polymorphisms, the Gln192Arg and the Met55Leu, but the results are not entirely convincing (Table 9). A strong linkage disequilibrium exists between the two polymorphisms (Arca et al., 2002).
ApoE is a part of chylomicrons, VLDL, IDL and of some subspecies of HDL (Sviridov et al., 2007) (Figure 1). The role of APOE polymorphism in determining VLDL-LDL levels is well known, but its independent influence on HDLC concentration is still controversial (Sviridov et al., 2007).
An association between the APOE polymorphism and HDLC levels was found in many but not all studies (Table 10). Since APOE is the best investigated of all known polymorphisms, with far more than 50 publications that investigated an association with HDLC levels, we only referenced large population-based studies with more than 1000 individuals per study. Almost all of these studies found the E2 allele to be associated with increased and the E4 allele with decreased HDL levels compared to the E3 allele.
Table 11 provides the results on genes rarely investigated which showed a significant association with HDLC levels in at least one large study (n>1000). It might be a worthwhile attempt to replicate the shown associations with HDLC in further large studies.
Many studies investigated the association of the APOA4 gene and HDLC levels. The most frequently investigated polymorphisms are the T347S (rs675) and Q360H (rs5110). Especially the large studies with sample sizes above 1000 did not show a significant association of these polymorphisms with HDLC levels (data not shown).
The same holds true for APOB: several studies investigated the association of this gene with HDLC levels. Large studies, however, showed non-significant results (data not shown). A recent genome-wide association study with almost 18,000 individuals from 16 population-based cohorts found apoB to be associated with HDLC levels (p=4.4×10^-8) (Aulchenko et al., 2008).
Common variants are those that appear in the general population with a frequency >1% or even >5% of subjects. Individually, these variants often show rather small associations. Kathiresan et al. illustrated, however, that combining the information on several of these low-impact common variants can contribute to some extent to the prediction of HDLC levels (Kathiresan et al., 2008a). They studied nine SNPs at nine loci in 5287 subjects, created a genotype score on the basis of the number of unfavorable alleles and found decreasing HDLC levels with increasing genotype scores. Another study (Zietz et al., 2006) investigated five polymorphisms in different genes and found a significant association between the number of rarely occurring genotype combinations and HDLC levels, whereas each single polymorphism was not associated with HDLC levels.
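A genotype score of this kind is simply a count of unfavorable alleles across the loci considered. The following minimal sketch shows the idea; it is not the implementation used by Kathiresan et al., and the SNP identifiers, alleles and genotypes are hypothetical placeholders.

```python
def genotype_score(genotypes, unfavorable_alleles):
    """Count unfavorable alleles across loci (unweighted score).

    genotypes           : dict mapping SNP id to a two-letter genotype, e.g. "AG"
    unfavorable_alleles : dict mapping SNP id to the allele assumed to lower HDLC
    Each locus contributes 0, 1 or 2 to the score.
    """
    score = 0
    for snp, allele in unfavorable_alleles.items():
        score += genotypes[snp].count(allele)
    return score


# Hypothetical three-locus example (placeholder SNP names, not real rs numbers).
unfavorable = {"snp1": "A", "snp2": "G", "snp3": "T"}
subject = {"snp1": "AA", "snp2": "AG", "snp3": "CC"}
print(genotype_score(subject, unfavorable))  # 2 + 1 + 0 = 3
```

With nine biallelic loci, as in the study cited above, such a score ranges from 0 to 18, and mean HDLC levels can then be compared across score categories.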
Rare variants are those that appear with a frequency of <1% in the general population. Cohen and colleagues (Cohen et al., 2004) investigated whether rare DNA sequence variants collectively contribute to variation in HDLC levels by sequencing ABCA1, APOA1, and LCAT in individuals from a population-based study. Nonsynonymous sequence variants were eight times more common (16% vs. 2%) in individuals with low HDLC (<5th percentile) than in those with high HDLC levels (>95th percentile). Biochemical studies indicated that most sequence variants in the low HDLC group were functionally important. The authors concluded that rare alleles with major phenotypic effects contribute significantly to low plasma HDLC levels in the general population. This seminal work started a new era with a focus on rare variants, and searching for such variants is a logical next step once common variants have pinpointed a gene as contributing to a certain phenotype.
Other examples of genes with rare variants that have an important influence on HDLC levels include the angiopoietin-like 4 (ANGPTL4) gene, in which the variant E40K (MAF=2%) was associated with significantly higher levels of HDLC (p=4.0×10⁻⁷) in 8726 individuals of the ARIC Study (Romeo et al., 2007). Further examples are rare variants in the alcohol dehydrogenase 2 (ADH2) gene (MAF=2.8%) (Whitfield et al., 2003) and in the cytochrome P450, family 3, subfamily A, polypeptide 4 (CYP3A4) gene (MAF=0.5%) (Yamada et al., 2007), which showed significant associations in 901 and 3787 individuals, respectively.
These studies highlight several important aspects of quantitative traits: 1) multiple common alleles influence trait variation, with each allele conferring a modest effect; 2) although each SNP exerts a modest effect, a combination of SNPs in aggregate can have a substantial influence on HDLC levels; and 3) many rare alleles of candidate genes, each with a strong effect, might have a pronounced influence on the trait under investigation in the studied population. All common alleles together explain less than 5-10% of the variance of HDLC levels in the general population, whereas HDLC levels are under considerable genetic control with heritability estimates of up to 80% (Kronenberg et al., 2002; Perusse et al., 1997). This gap leaves ample room for rare variants as well as for genetic regulatory mechanisms that are not yet well understood.
The recent introduction of microarray technology for genotyping allows several hundred thousand genetic variants to be genotyped in a single person in one step. This enables genome-wide association (GWA) studies, in which a large number of individuals with phenotypes of interest are genotyped at reasonable cost. In contrast to the hypothesis-driven candidate gene approach described in the chapters above, hypothesis-free GWA studies can identify new susceptibility genes without making any a priori biological assumptions. They permit the identification of genes in pathways that were previously not known to be involved in a certain phenotype. GWA studies are therefore a new and very powerful tool to identify genetic contributors to phenotypes and have revolutionized gene hunting (for review see Kronenberg 2008; McCarthy et al., 2008). Never before has such a large number of genes for complex diseases and phenotypes been identified in such a short time. This holds true for complex diseases such as T2DM (Frayling 2007) and cancer as well as for continuous variables such as lipids.
The introduction of GWA studies made it possible to leave the path of the hypothesis-driven search for genes influencing lipid levels, which was based on limited a priori knowledge of the biological and physiological processes involved in lipid metabolism. GWA studies widen the horizon and make it possible to detect new genes involved in lipid metabolism. The identification of genes, however, depends strongly on the power of GWA studies, which is usually quite low: 500,000 SNPs and more are investigated at once, which creates a pronounced multiple testing problem with up to 25,000 false positive findings if the conventional significance threshold of p<0.05 is used. Lowering this threshold to a genome-wide significance level (e.g. p<10⁻⁷) limits the number of false positives, and increasing the number of samples studied provides the power needed to bring the “real” genes out of the jungle of false positive associations. Therefore, the step from the single study to the meta-analysis has never been as short as in times of GWA studies. Researchers in this field realized immediately that they are more successful when they combine their studies, which makes it possible to identify even gene variants that modify the disease risk by only 10% or explain less than 0.5% of the variance of continuous parameters.
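The multiple testing arithmetic above can be checked with a short calculation. This is a rough sketch under the assumption that no SNP is truly associated, so every result below the threshold is a false positive. The Bonferroni correction used here is one common way to arrive at a genome-wide threshold and is an assumption of this example; the text does not name a specific correction method, and the function names are illustrative.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests with p < alpha if no SNP is truly associated."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


n_snps = 500_000  # number of SNPs tested at once, as in the text

naive = expected_false_positives(n_snps, 0.05)
strict = expected_false_positives(n_snps, 1e-7)

print(f"expected false positives at p<0.05: {naive:,.0f}")
print(f"expected false positives at p<1e-7: {strict:.2f}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold(n_snps):.1e}")
```

For 500,000 tests, dividing 0.05 by the number of tests gives a per-test threshold of the same order as the genome-wide level of 10⁻⁷ mentioned above. The stricter threshold only controls false positives; detecting small effects at that threshold is what requires the larger samples.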
To date, eight GWA studies on HDLC levels have been published (Willer et al., 2008; Kathiresan et al., 2008b; Kooner et al., 2008; Wallace et al., 2008; Heid et al., 2008; Kathiresan et al., 2007; Aulchenko et al., 2008; Cashman et al., 2008). Results from these studies are summarized in Table 12. On the one hand, they have confirmed the following genes which were already known from earlier functional as well as association studies: CETP, LPL, LIPC, LIPG, ABCA1, LCAT, and the APOA1C3A4A5 gene cluster. On the other hand, these studies have identified new candidate genes influencing HDLC levels which need further functional characterization.
The most pronounced association among these newly identified genes was observed for GALNT2 (Willer et al., 2008; Kathiresan et al., 2008b). This gene would not have been identified in a hypothesis-driven approach since it does not have a known function immediately connected to lipid metabolism. GALNT2 encodes a widely expressed glycosyltransferase that could potentially modify a lipoprotein or receptor. This gene was also found to be associated with triglyceride concentrations (Willer et al., 2008; Kathiresan et al., 2008b). Another newly identified gene region is located near MVK and MMAB. These neighboring genes are regulated by SREBP2. MVK encodes mevalonate kinase, which catalyzes an early step in cholesterol biosynthesis, and MMAB encodes a protein that participates in a metabolic pathway that degrades cholesterol (Willer et al., 2008). GRIN3A was identified in the initial GWA study with a genome-wide significant p value of 2.5×10⁻⁸, but was not confirmed in the combined analysis with the replication samples (Willer et al., 2008). CLPTM1 was found in two independent GWA studies with a combined p value of 5.79×10⁻⁶. However, this association was not found by other GWA studies (Willer et al., 2008; Heid et al., 2008; Aulchenko et al., 2008; Cashman et al., 2008).
GWA studies currently under way with even larger study samples, and meta-analyses in 50,000 to 100,000 study participants, will show whether the new genes mentioned above can be confirmed. The same applies to the previously known candidate genes mentioned in chapters 3 and 4 which have not yet been identified in the GWA studies published so far. It can be taken for granted that new candidates will be identified with even smaller effect sizes than those of the already identified genes.
Except for three GWA studies (Heid et al., 2008; Cashman et al., 2008; Aulchenko et al., 2008), previous studies were mainly based on study participants ascertained for T2DM or the metabolic syndrome (Willer et al., 2008; Kathiresan et al., 2008b; Wallace et al., 2008; Kooner et al., 2008; Kathiresan et al., 2007), which could have distorted the results. Therefore, meta-analyses of population-based studies might have an advantage in identifying new candidates for HDLC levels because their results are not distorted by disease processes or medication. Such a meta-analysis of 16 European population-based cohorts has been published very recently (Aulchenko et al., 2008). For HDLC, eight regions showed genome-wide significant results. The results of the SNPs with the lowest p value for each of these regions are displayed in Table 12. Five of these regions (LPL, ABCA1, LIPC, CETP, and LIPG) had been identified in earlier GWA studies, APOB was identified for the first time in a GWA study for HDLC, and two regions (CTCF-PRMT8 and MADD-FOLH1) represent entirely novel regions. The MADD-FOLH1 locus on chromosome 11 represents a gene desert close to the centromere with no known gene in the 550 kb flanking region, and the two genes MADD and FOLH1 flanking the locus have not been implicated in lipid metabolism. The CTCF-PRMT8 region harbors a gene encoding a transcriptional regulator that is potentially involved in hormone-dependent gene silencing (Aulchenko et al., 2008). Further studies are needed to examine the potential role of these or neighbouring genes in HDL metabolism.
It is of interest that some loci (LCAT, APOA1C3A4A5, GALNT2, MVK/MMAB, CLPTM1 and GRIN3A) identified in earlier GWA studies of mostly diseased populations could not be confirmed in the meta-analysis of 16 European population-based cohorts. It cannot be excluded that they were false positives in the earlier studies, considering that some were of borderline significance there. Another explanation could be that genetic variation in these genes plays a more pronounced role in populations with metabolic disturbances than in the general population.
One of the most important observations of our and other GWA studies is the fact that strong association signals are observed for SNPs in regions hitherto regarded as “intergenic”. For example, we observed very strong signals for HDLC up to 70kb downstream of LIPG and LPL, and 10kb upstream of CETP (Heid et al., 2008). This is particularly remarkable as current candidate gene studies usually focus on SNPs within the gene ±5kb. While our downstream LPL SNPs showed some correlation with SNPs within the gene previously reported in the literature, our LIPG signals were completely independent of SNPs within the gene and of any SNP reported previously in candidate gene studies. The identified CETP SNP 10kb upstream (rs8999419) was independent of all of the numerous SNPs reported in candidate gene studies and was even located in a recombination hotspot, which strongly supports this region as an independent locus.
These findings are in line with ideas that several regions within and outside of a gene can have independent effects on the gene function and that even regions far away from the gene harboring transcription factor binding sites, miRNAs, other enhancer elements or yet unknown regulatory elements can remotely control gene function. A similar example is reported for chromosomal region 9p21, which was shown to be associated with myocardial infarction (Samani et al., 2007) and T2DM (Scott et al., 2007; Saxena et al., 2007). SNPs with the most significant signals were located more than 100kb upstream of the cyclin-dependent kinase inhibitors CDKN2A and CDKN2B, which supports either long-range effects on one of these genes or the influence of a gene not yet annotated. This supports the relevance of intergenic regions and calls for future functional studies to address this issue.
As we discussed recently (Heid et al., 2008), the relevance of genetic variation far outside of the gene might have far-reaching consequences for candidate gene association studies in general. It is conceivable that plausible candidate genes have been dropped prematurely in the past when no association between intragenic variation and the investigated phenotypes was observed. If regions outside the gene are not considered and the variation located there is not in strong linkage disequilibrium with intragenic variation, an association will be missed. The higher weighting of intragenic regions in previous candidate gene studies results in a biased search for phenotype-influencing gene regions and should probably be avoided.
In line with these observations, the GWA study approach thus evolves as “the better candidate gene study” as it enables a much more comprehensive analysis of all known candidate genes with ad libitum extension beyond gene boundaries and comparability across studies. Previously, candidate gene studies were each investigating different SNPs in different studies with different HDLC measurements and often different analysis models and in most of the studies with a strong focus on the promoter and intragenic regions of these genes.
HDL cholesterol levels have a strong genetic determination, with heritability estimates ranging in most cases between 40% and 60% (Wang et al., 2005; Goode et al., 2007; Kronenberg et al., 2002). As is the case with many intermediate cardiovascular risk phenotypes, low HDLC can be monogenic or purely environmental but is, in most cases, multifactorial or polygenic in origin (Von Eckardstein 2006). The most common inherited form of low HDLC is familial hypoalphalipoproteinemia, which is defined as an HDLC level below the 10th percentile, without secondary cause, and associated with a family history of low HDLC levels. The genetic cause is not fully characterized: some of the cases are due to mutations in HDL structural genes (mostly in one of the three genes APOA1, LCAT or ABCA1), whereas in other cases there appears to be accelerated catabolism of HDL and its apolipoproteins without a specific genetic mutation identified (Klos et al., 2007). In contrast to the rare occurrence of isolated low HDLC in hypoalphalipoproteinemia, very low plasma levels of HDLC are also found in patients with genetically disturbed metabolic pathways which are indirectly linked to HDL metabolism. For example, many patients with lipid storage diseases such as Gaucher’s disease or Niemann-Pick disease, as well as many patients with diabetes mellitus or hypertriglyceridemia, present with low HDLC (Von Eckardstein 2006). In the context of high triglycerides, low HDLC is not only an early symptom but also a very sensitive marker of impaired glucose tolerance and increased lipolysis (Rohrer et al., 2004). These examples reflect the very heterogeneous nature of HDLC, which makes HDL genetics very complex.
Many genes control lipolysis of plasma triglycerides, a process that also affects HDLC levels through the delivery of apolipoproteins and phospholipids to HDL (Holleboom et al., 2008). Several genes influencing both HDLC and triglyceride levels have been identified by candidate gene and hypothesis-free GWA study approaches. For example, associations of common SNPs at the APOA5 locus with triglyceride concentrations as well as HDLC levels are significant and relatively consistent across studies and populations (Lai et al., 2005). Certain variants of LPL are associated with elevated plasma triglyceride and decreased HDLC levels (Busch et al., 2000). The variation of the recognition site for SstI within the 3′-untranslated region of APOC3 has consistently shown an association with both plasma triglyceride and HDLC levels, whereas the results for common variants of some other promising candidate genes, such as LIPC, were less consistent (Busch et al., 2000).
Recent GWA analyses confirmed on the one hand the association of several loci with prior evidence of an association with triglyceride and HDLC levels (LPL, APOA1C3A4A5, and partly LIPC) (Saxena et al., 2007; Wallace et al., 2008; Kooner et al., 2008; Willer et al., 2008; Kathiresan et al., 2008b; Aulchenko et al., 2008; Cashman et al., 2008), and identified on the other hand new loci (GALNT2, APOB) with an influence on both HDL and triglyceride levels (Willer et al., 2008; Kathiresan et al., 2008b; Aulchenko et al., 2008).
The epidemiological evidence for an inverse association of HDLC concentrations with the risk of coronary artery disease is very strong (Lewington et al., 2007), but the causality of this relationship is hard to prove. Since HDLC levels are strongly determined by genetic variants (besides environmental factors), a causal association between HDLC and CAD can be supported by showing an association of these genetic variants with the risk of CAD. The underlying idea is the concept of Mendelian Randomization (Katan 1986; Davey et al., 2003; Kronenberg et al., 2007), which is based on the fact that it is randomly determined at the time of conception which of the two alleles from the father and which of the two alleles from the mother will be transmitted to the child. Since the transmitted alleles persist lifelong, they also determine to a certain extent whether a person is exposed to, for example, low HDLC levels and therefore to the CVD risk associated with low HDLC levels. Therefore, the association between the polymorphism and CVD is less influenced by reverse causation or confounding. Reverse causation would mean that CVD influences the polymorphism, which can practically be excluded. Confounding would mean that, for example, a lifestyle factor such as smoking is associated with the disease (which is often the case) as well as with the polymorphism (which is less probable). This method is therefore well suited to support a causal relationship, which is hardly possible with conventional observational epidemiological studies. The idea of Mendelian Randomization was originally proposed by Katan in the mid-eighties (Katan 1986) and has been used increasingly during the last five years (Davey et al., 2003). To our knowledge, it was first applied in 1992, when Sandholzer and colleagues clearly showed that the apolipoprotein(a) gene locus determines the risk for CVD by its strong influence on lipoprotein(a) concentrations (Sandholzer et al., 1992).
This seminal work demonstrated that high lipoprotein(a) concentrations are a primary, genetically determined risk factor for CVD. This was controversial at the time, since some researchers believed that high lipoprotein(a) concentrations are a consequence rather than a cause of CVD. The causal relationship between lipoprotein(a) and atherosclerosis has in the meantime been confirmed by several studies using the Mendelian Randomization approach (Kronenberg et al., 1999b; Kronenberg et al., 1999a; Wild et al., 1997; Kamstrup et al., 2008; Koch et al., 1997).
Figure 3A shows some considerations concerning the sample size for Mendelian Randomization projects. If one assumes that a particular polymorphism explains an unusually high 30% of an intermediate phenotype and that this intermediate phenotype explains about 10% of the clinical endpoint, the polymorphism itself is expected to explain about 3% of the clinical endpoint (0.30 × 0.10 = 0.03). Single polymorphisms of HDLC genes, however, explain at best 3% of the HDLC levels. Such a polymorphism would therefore explain only 0.3% of the clinical endpoint (0.03 × 0.10 = 0.003). These considerations warn us that very large case-control studies are required to demonstrate an association between a polymorphism and a clinical endpoint. This might be the reason why, although many association studies found an association between various polymorphisms and HDLC levels, most of them were markedly underpowered to consistently find an association of these polymorphisms with clinical endpoints.
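The two multiplications above follow the same simple rule, which the following sketch makes explicit. It assumes, as the text does, that the polymorphism acts on the endpoint only through the intermediate phenotype, so the two explained fractions can simply be multiplied; the function name `endpoint_variance_explained` is illustrative.

```python
def endpoint_variance_explained(polymorphism_to_phenotype: float,
                                phenotype_to_endpoint: float) -> float:
    """Fraction of the clinical endpoint explained by a polymorphism.

    Assumes the polymorphism influences the endpoint only via the
    intermediate phenotype, so the two explained fractions multiply.
    """
    for fraction in (polymorphism_to_phenotype, phenotype_to_endpoint):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return polymorphism_to_phenotype * phenotype_to_endpoint


# Scenario 1 from the text: unusually strong polymorphism (30% of phenotype).
strong = endpoint_variance_explained(0.30, 0.10)
# Scenario 2 from the text: typical HDLC polymorphism (at best 3% of HDLC).
typical = endpoint_variance_explained(0.03, 0.10)

print(f"strong polymorphism explains {strong:.1%} of the endpoint")
print(f"typical HDLC polymorphism explains {typical:.1%} of the endpoint")
```

The tenfold drop between the two scenarios is what drives the need for very large case-control studies: the smaller the explained fraction, the more cases and controls are needed to detect it.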
A recent meta-analysis of three polymorphisms within the CETP gene (Taq1B, I405V and -629C>A) included 92 studies with up to 113,833 healthy participants, and 46 studies on 27,196 coronary cases and 55,338 controls (Thompson et al., 2008). The minor allele of each of the three polymorphisms was associated with a moderate inhibition of CETP activity and, in line with that, with modestly higher HDLC levels and, most importantly, with a weakly reduced coronary risk. The odds ratios (OR) for coronary disease were compatible with the reductions in risk expected for equivalent increases in HDLC concentrations demonstrated in prospective studies (Lewington et al., 2007). Figure 3B illustrates these associations for the Taq1B polymorphism. These results clearly support a causal association of low HDLC levels with CVD. They also demonstrate that large studies are required to detect an association of this kind with certainty: the meta-analysis on the Taq1B polymorphism included 38 studies with a total of 19,035 cases and 32,368 controls. Since the effects of the investigated polymorphisms on HDLC levels were in the range usually expected in association studies, such results and the required sample sizes will be the rule rather than the exception.
Although the Mendelian Randomization approach worked well for CETP, the situation is less clear for ABCA1. A large study of almost 50,000 individuals from Copenhagen showed that heterozygosity for rare loss-of-function mutations in ABCA1 was associated with a substantial, lifelong lowering of plasma levels of HDLC (-17 mg/dL). These variants, however, were not associated with an increased risk of ischemic heart disease (Frikke-Schmidt et al., 2008b). It was speculated that this might be explained by the observation that the lower HDLC levels in carriers of these loss-of-function mutations are not accompanied by high triglyceride concentrations and that low HDLC levels per se are not atherogenic. A dysfunctionality of the HDL lipoprotein that is not necessarily reflected in the HDLC levels could be more important than the HDLC levels themselves. This is in line with another study in the Danish population showing that common non-synonymous genetic variation in ABCA1 predicts the risk of ischemic heart disease independent of HDLC concentrations (Frikke-Schmidt et al., 2008a).
Several decades of research on the genetic contribution to HDL metabolism have identified a large number of genes contributing to the concentrations of antiatherogenic HDLC. The variants detected in these genes together explain less than 10% of the HDLC variance, although the heritability of HDLC is estimated to be up to 80%. This leaves ample room for genetic variants that remain to be identified, which might be accomplished by genome-wide association studies. Since the low-hanging fruit may already have been picked by earlier candidate gene studies and the first genome-wide association studies, future large population-based genome-wide meta-analyses will identify new genes with even smaller effects on HDLC levels. These findings will probably result in a redrawing or extension of the metabolic pathways involved in HDLC metabolism. In addition, in-depth analysis of particular genes by deep-sequencing approaches or similar techniques (Coassin et al., 2008) will extend the previous observation that a large number of rare alleles contributes considerably to a complex phenotype.
Our own research discussed in this review was funded by grants from the “Genomics of Lipid-associated Disorders – GOLD” of the “Austrian Genome Research Programme GEN-AU”, the Austrian National Bank (Project 12531) and the Austrian Heart Fund to F. Kronenberg and by the German National Genome Research Net, the Munich Center of Health Sciences (MC Health) as part of LMUinnovativ, Germany, and the NIH-subcontract from the Children’s Hospital, Boston, USA, under the prime grant 1 R01 DK075787-01A1, CFDA 93.848 supporting the work of I.M. Heid.