Example Standard Output Plot
Figure 2 shows an example of a Synthesis-View visualization for analysis across populations. For Figure 2, two input files containing simulated data were used: (1) a file containing the SNP association data and (2) a file containing phenotypic summary information. The "tracks" of data are described below, from top to bottom:
Figure 2 Synthesis-View Standard Output Example Plot. An example Synthesis-View plot generated from simulated data. Data are plotted in tracks, with various information including SNP locations, results of association tests, sample sizes, and phenotypic summary ...
(1) Physical genome track. Synthesis-View provides information on the relative location of each SNP on a given chromosome and how that position relates to other SNPs in the same study. Lines lead from the chromosome locations to the IDs of each SNP. If the "Additional SNP locations" option is selected, the locations of the SNPs within the chromosomal region are indicated. If SNPs are close together, the location of the first SNP in that group is indicated in the plot (to prevent text overlap). When the plot is first generated, an image of the plot that includes embedded links is shown within the web browser. If the results for a SNP are selected within this image, the NCBI SNP database page for that specific SNP is opened in the user's default web browser.
(2) SNP presence/absence track. Not all SNPs may be available for all associations across study groups or populations. Thus, this track provides information on whether a SNP was used in the test of associations through the presence/absence of a colored box corresponding to the group, study, or phenotype.
(3) Significance track. The resultant p-values are plotted as the negative log10 of the p-value. Grey vertical lines behind the points create an "abacus-view" that allows the eye to follow from the SNP presence/absence track down through the lower data tracks. An optional red horizontal line can be drawn at a significance level of choice (in Figure 2 the p-value line is set at 0.01). Figure 2 shows how the p-value for each SNP association can be compared across populations (in this case, across race/ethnicities). Triangles indicate the direction of the genetic effect, and the bottom of each triangle marks the location of the p-value. For example, if the genetic effect is measured as beta, the triangles point up if beta is positive and point down if beta is negative. The direction enables investigators to quickly determine whether the direction of effect is consistent across groups, studies, or phenotypes.
(4) Effect size track. The resultant effect size values (beta values here) are plotted. In Figure 2, this track shows that effect sizes are similar across race/ethnicities. To omit the effect size plot, omit effect size information from the input file.
(5) Coded allele frequency track. The coded allele frequency (CAF), that is, the frequency of the allele chosen by the user for comparison across groups or studies, is optionally plotted so that trends and differences in the data can be observed.
(6) Sample size track. Optional plotting of the sample size for each genotype-phenotype association for each group/study/phenotype is available so the relationships between sample size and other results of the study can be explored. To plot without sample size, omit sample size information columns from the input file. If sample size is provided only as entire group summary information, rather than for individual SNP/phenotype regressions, a separate box will appear at the bottom of the plot with this summary information graphically represented.
(7) Phenotype summary plot. Summary information for a single phenotype across several groups is plotted if a separate file of phenotype summary information is included. This is currently a feature for quantitative traits/continuous data. Future versions will incorporate methods to characterize categorical/case-control phenotype summary information.
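As a minimal sketch of the significance track logic described in item (3), and not the Synthesis-View implementation itself, the following Python computes the plotted height and the triangle direction for a single association. The function name `significance_point`, its return format, and the handling of an effect of exactly zero are assumptions made for illustration.

```python
import math


def significance_point(p_value, beta):
    """Return (height, direction) for one SNP association.

    height is -log10(p), the value plotted on the significance track.
    direction is "up" for a positive beta and "down" for a negative beta.
    A beta of exactly zero is reported as "none"; this is an assumption,
    since the text does not say how a zero effect is drawn.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    height = -math.log10(p_value)
    if beta > 0:
        direction = "up"
    elif beta < 0:
        direction = "down"
    else:
        direction = "none"
    return height, direction


# Height at which the optional red threshold line would sit for p = 0.01.
threshold_height = -math.log10(0.01)
```

An association with a p-value smaller than the chosen threshold has a height greater than `threshold_height`, so it is drawn above the red line.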
Other Options for Standard Output Plot
When all SNPs are available for all tests of association, a legend appears indicating which colors correspond to which group, study, or phenotype. Figure 3 shows an example of this feature, and also shows Synthesis-View being used to plot the results for multiple phenotypes (from Jeff et al. 2010, in preparation). Alternate colors for the data points can be specified by adding a column with the header 'Color' to the phenotype summary file, giving the color selected for each group.
Figure 3 Multiple Phenotype Synthesis-View Plot. An example of plotting the results for tests of association, assuming an additive genetic model, for the same SNPs across multiple phenotypes (listed in the legend). Because all tests of association were available ...
Comparisons can also be made across study stages, as in Figure 4, where the results of Table three from Willer et al. 2008 [3] are plotted, allowing visual comparison between the individual stages of a study and the combined meta-analysis results. Of particular interest in Figure 4 are SNPs rs9989419, rs3764261, and rs1864163, which have the strongest associations with high-density lipoprotein (HDL) cholesterol among the SNPs tested. Plotting the data in this format shows the similar location of these SNPs as well as their strong association with HDL levels. Investigators examining these data could postulate that the close proximity of these SNPs and their similar genetic effect sizes, made evident by Synthesis-View, suggest that the SNPs are in linkage disequilibrium (LD) with one another, prompting further investigation of the data. Plotting the sample size is also useful for (indirectly) visualizing the power of each test of association, which is necessary to interpret non-significant findings.
Figure 4 Tabular Results Moved to Visual Format. The results from Table three of Willer et al. 2008 [3] are plotted, allowing visual comparison between the individual stages of a study and the combined meta-analysis results.
In Figure 5 the same Table three data from Willer et al. [3] are plotted again with different options. When there is a wide range of p-values, the less significant p-values can be compressed when plotted, as is visible in Figure 4 for the SNPs other than rs9989419, rs3764261, and rs1864163. Synthesis-View allows a significance cutoff to be chosen (in this case 1E-33), such that any point more significant than the cutoff is plotted at the cutoff value, in a larger size. This feature allows closer inspection of the less significant p-values. Figure 5 also uses the "jitter" option, in which overlapping points are plotted with horizontal distance between them.
Figure 5 Additional Options in Synthesis-View. The results from Table three of Willer et al. 2008 [3], plotted again with additional options. In this case a significance level of 1E-33 was chosen, and results more significant than that cutoff are plotted at ...
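The cutoff and jitter options can be illustrated with a short Python sketch. It is written under stated assumptions and does not reproduce Synthesis-View's own code: the function names, the `enlarged` flag, and the evenly spaced, centered jitter scheme are all hypothetical choices made for this example.

```python
def clip_p_value(p_value, cutoff=1e-33):
    """Return (plotted_p, enlarged) for one result.

    Results more significant than the cutoff are plotted at the cutoff
    value and flagged so that they can be drawn in a larger size.
    """
    if p_value < cutoff:
        return cutoff, True
    return p_value, False


def jitter_offsets(n_points, spacing=0.1):
    """Return evenly spaced horizontal offsets, centered on zero.

    This is one simple way to separate n overlapping points; the spacing
    used by Synthesis-View itself is not stated in the text.
    """
    center = (n_points - 1) / 2
    return [(i - center) * spacing for i in range(n_points)]
```

With the cutoff of 1E-33 used in Figure 5, a p-value of 1E-60 would be plotted at 1E-33 and marked as enlarged, while a p-value of 0.02 would be plotted unchanged.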
In some settings, studies have multiple phenotypes as well as multiple groups (such as multiple race/ethnicities). In this case, Synthesis-View plots the results for all the phenotypes for each group on separate tracks. Figure 6 shows an example of this feature, where associations were calculated for six phenotypes and a series of SNPs (from Jeff et al. 2010, in preparation), and investigated for three race/ethnicities: Mexican Americans (MA), European Americans (EA), and African Americans (AF). Thus, in this plot, the results across multiple phenotypes are plotted with a track for each race/ethnicity, allowing multi-layered results to be viewed together.
Figure 6 Multiple Phenotypes and Multiple Groups. The results for six phenotypes across three race/ethnicities (from Jeff et al. 2010, in preparation) are plotted. Multiple tracks are available for comparing results across multiple phenotypes from data stratified ...
Another feature available in Synthesis-View is the plotting of a D' or r² plot in Haploview-style format [2] as the bottom-most track, shown in Figure 7 (from Jeff et al. 2010, in preparation). This plot appears when D' or r² data are provided in a separate file.
Figure 7 Synthesis-View plot with r² plotted in Haploview-style format. The results for nine phenotypes (from Jeff et al. 2010, in preparation) are plotted. At the bottom of the image is a Synthesis-View generated r² Haploview-style plot.
Forest Plot/Odds Ratio options in Synthesis-View
Synthesis-View also has an option for plotting odds ratio (OR) results in forest plot format. Figure 8 shows an example from the International Multiple Sclerosis Genetics Consortium (IMSGC) [4], where original IMSGC Multiple Sclerosis GWAS results were investigated for replication in an independent dataset. A Stage 1 analysis was performed to examine the replication of a series of SNP/phenotype associations. In Stage 2 a smaller subset of 19 SNPs was tested for further replication. The results from Stage 1, Stage 2, and the final combined analysis for the 19 SNPs were presented in Table two of the manuscript [4], and the results are presented here in forest plot format using Synthesis-View. The tracks are as follows:
Figure 8 Forest-Plot option in Synthesis-View. Stage 1, Stage 2, and the final Combined results from a study of MS by the International Multiple Sclerosis Genetics Consortium (IMSGC) were presented in Table two of the manuscript [4], and the results are presented here ...
1) The first track, as in the standard Synthesis-View plot, is a physical genome track, displaying the chromosome and relative location of each SNP used in the association tests.
2) The next track is an optional significance track, displaying the p-values. A single color represents each group. In this case, a red line has been placed at a p-value of 0.05.
3) The next three tracks are odds ratio/forest plot tracks. Squares represent the OR point estimates, with lines representing the upper and lower bounds of the 95% confidence intervals. Here the similarity of the results between Stage 1 and Stage 2 is visible. An additional option, not shown, is available: if a result is significant (the confidence interval does not include 1.0), the square can be plotted in a larger size, allowing quick visual identification of significant results in forest plots with a large number of results.
4) The second to last track is the CAF track. Colors match those of the groups of the previous tracks, allowing the user to identify trends in allele frequencies between groups which can aid in interpreting replication of results. The option of horizontal separation of overlapping points was also used here as the CAF measurements were very similar between the analyses.
5) The last track is the sample size track. Case and control sample sizes can be plotted either in separate tracks or, as shown here, in the same track, with closed circles indicating cases and open circles indicating controls. The colors match those of the groups in the previous tracks. This option is also available when CAFs for cases and controls are provided.
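The significance rule used for the forest plot tracks in item 3 can be sketched in Python as follows. This is an illustrative example only: the function names and the default marker sizes are hypothetical, and the only rule taken from the text is that a result is significant when its 95% confidence interval does not include 1.0.

```python
def or_is_significant(ci_lower, ci_upper, null_value=1.0):
    """Return True when the confidence interval excludes the null OR."""
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    return ci_lower > null_value or ci_upper < null_value


def forest_marker_size(ci_lower, ci_upper, base_size=6.0, scale=2.0):
    """Return a marker size, enlarged for significant results.

    base_size and scale are arbitrary illustrative values, not
    Synthesis-View defaults.
    """
    if or_is_significant(ci_lower, ci_upper):
        return base_size * scale
    return base_size
```

An interval of 0.8 to 1.2 includes 1.0 and keeps the base size, whereas intervals of 1.1 to 1.5 or 0.6 to 0.9 exclude 1.0 and are drawn larger.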
The upper panel of Figure 9 shows the plot that appears within the web browser after "generate image" is selected. This plot contains embedded links for each SNP to the NCBI SNP database. The lower panel shows the NCBI SNP page that appeared when the SNP 17419032 was selected with the mouse.
Figure 9 HTML embedded image in Synthesis-View web interface. When a plot is first generated, such as that shown in Figure 8, an image of the plot appears within the web browser (screen capture shown in upper panel). This plot includes embedded HTML links. When the results ...
An alternative way to view OR results is in stacked tracks where the eye moves from top to bottom, closer to the standard Synthesis-View format. If the forest plot option is not chosen in Synthesis-View, the data are plotted in this format by default. Unlike in the forest plots of Figure 8, ORs are plotted as closed circles. When OR results are significant, the closed circle is plotted in a larger size, making significant results easy to distinguish visually.