This pilot study is one of the first applications of next-generation sequencing in DVT. Sequencing by oligonucleotide ligation and detection (SOLiD) was used to resequence the protein-coding regions of 186 hemostatic/pro-inflammatory genes in DVT cases and controls. This regional sequencing approach enabled the simultaneous analysis of several dozen genes in many samples at a fraction of the cost and computation required for whole-exome and whole-genome analysis. At the current level of multiplexing (i.e. up to 64 samples per SOLiD slide, 16 per spot), the analysis of our target area in one sample costs one tenth as much as high-coverage exome sequencing and one hundredth as much as high-coverage whole-genome sequencing, with similar proportional savings in computational time (3 hours for read mapping vs a few days for whole-exome or more than 1 week for whole-genome datasets) and storage requirements (300 MB per BAM file vs 40 GB for whole-exome or 400 GB for whole-genome sequencing). For these reasons, regional sequencing is well suited to interrogating relatively small genomic areas deemed of particular functional relevance in a disease. Potential applications of this approach are the sequencing of positional candidate genes or genomic loci identified by genome-wide linkage or association analysis [28], the large-scale replication of the initial findings of exome and whole-genome resequencing studies, and the rapid screening of disease genes in Mendelian diseases that have several different causal genes [29].
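As a rough illustration of how the per-sample storage figures quoted above scale to a cohort, the following Python sketch multiplies the quoted BAM sizes by a sample count. The function names and the use of decimal gigabytes (300 MB taken as 0.3 GB) are assumptions made for illustration; only the per-sample sizes and the number of individuals sequenced in this study (22) come from the text.

```python
# Per-sample BAM sizes quoted in the text, in gigabytes.
# Assumption: decimal units, so 300 MB is treated as 0.3 GB.
BAM_SIZE_GB = {
    "regional": 0.3,        # 186-gene target area
    "whole_exome": 40.0,
    "whole_genome": 400.0,
}


def cohort_storage_gb(n_samples):
    """Total BAM storage (GB) per sequencing strategy for n_samples."""
    return {name: size * n_samples for name, size in BAM_SIZE_GB.items()}


def fold_reduction(reference, target="regional"):
    """How many times smaller the target BAM is than the reference BAM."""
    return BAM_SIZE_GB[reference] / BAM_SIZE_GB[target]


if __name__ == "__main__":
    # 22 is the number of individuals sequenced in this study.
    for name, total in cohort_storage_gb(22).items():
        print(f"{name}: {total:.1f} GB")
    for reference in ("whole_exome", "whole_genome"):
        print(f"{reference} / regional: {fold_reduction(reference):.0f}-fold")
```

The sketch only rescales the quoted figures; actual BAM sizes depend on coverage, read length and compression settings.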
In the field of DVT, the importance of being able to analyze all known hemostatic genes (i.e. the 'hemostateome') has recently been highlighted by Fechtel et al. [30]. The simultaneous sequencing of all hemostatic genes in affected individuals is ideal for studying specific combinations of variants in the hemostatic pathway that act in synergy to confer predisposition to DVT. This type of approach has the potential to reveal new disease-predisposing mechanisms, in addition to identifying isolated variants that show statistical association with the disease. In this study, the use of molecular barcoding allowed multiplexing without loss of individual sequence information, which is required to fully exploit the potential of sequence data. Sequencing of the 186 genes in 22 individuals yielded more than 1700 genetic variants of different functional types and frequencies. Annotation of the identified variants revealed several disease-associated variants, a proportion of which had already been reported in association with DVT. Many novel variants with potentially deleterious effects on the function of key hemostatic proteins were also found. These results are consistent with the recent report by Dewey et al. of the genome sequence of a family quartet in which the father had a history of DVT and pulmonary embolism [31]. In that study, the authors identified four novel nonsynonymous variants in DVT-risk genes, as well as other known thrombophilia-associated variants.
In our study, we adopted different analytical approaches to reveal potential associations with the disease at both the single-variant and gene/pathway levels. Although the number of analyzed individuals was very small, it was possible to find statistically significant and biologically plausible associations with DVT. An increased burden of rare missense mutations in anticoagulant genes was found in DVT cases compared to controls. Single-variant association analysis followed by replication genotyping in > 1400 individuals identified an association for the rs6050 SNP in FGA. The excess of rare missense mutations in anticoagulant genes in patients in whom deficiencies of natural anticoagulants had been excluded by biochemical assays suggests that a fraction of idiopathic DVT cases might be affected by 'unrecognized' anticoagulant deficiencies, caused by mutations whose functional effects are not detected by currently used biochemical assays. The association of rs6050, already described by previous candidate-SNP studies, was identified here by an agnostic screening of several dozen genes. The association of rs6050 was reported in 4 studies focusing on venous thromboembolism (VTE), therefore including cases of pulmonary embolism without a diagnosis of DVT [25]. Two of these studies were very small, with a total sample size (cases and controls) of fewer than 500 individuals [25]. Ours is the second largest of all the studies on rs6050, both in terms of DVT cases investigated and overall statistical power. Thus, along with previous reports, this study makes rs6050 one of the most widely replicated variants in DVT. The mechanisms underlying the association of rs6050 with DVT/VTE are not fully understood. FGA rs6050 was reported to result in enhanced coagulation factor XIII-mediated cross-linking of the fibrin alpha-chain [35] and to be associated with increased FGA
]. In this report we found no association of FGA rs6050 with plasma fibrinogen activity, but the fact that rs6050 is a missense SNP and that the amino acid substitution is classified as 'possibly damaging' by PolyPhen-2 suggests that the risk for DVT is conferred by an alteration of fibrinogen alpha-chain function.
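The idea of a gene-set burden comparison, such as the excess of rare missense mutations in anticoagulant genes discussed above, can be sketched in a few lines of Python. This is a minimal sketch that assumes a simple carrier-count design tested with a one-sided Fisher's exact test; it is not necessarily the burden test used in this study. The function name `fisher_one_sided` and the counts in the usage example are hypothetical and are not data from the study.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact p-value for carrier enrichment in cases.

    Returns the probability of seeing at least `case_carriers` carriers among
    the cases under the hypergeometric null, with all margins held fixed.
    """
    n = case_total + control_total           # all individuals
    k = case_carriers + control_carriers     # all carriers of a rare variant
    denom = comb(n, case_total)
    upper = min(k, case_total)
    p = 0.0
    for x in range(case_carriers, upper + 1):
        # Probability that exactly x of the k carriers fall among the cases.
        p += comb(k, x) * comb(n - k, case_total - x) / denom
    return p


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the study):
    # 6 of 10 cases vs 1 of 12 controls carry a rare missense variant
    # in the gene set of interest.
    p_value = fisher_one_sided(6, 10, 1, 12)
    print(f"one-sided p = {p_value:.4f}")
```

Collapsing variants into a per-individual carrier status is the simplest form of burden analysis; with samples this small, an exact test avoids relying on large-sample approximations.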