The Causal Model
Thus, there seems to be clear epidemiological evidence for at least three distinct environmental events contributing to MS pathogenesis. Consequently, it is unnecessary to choose between the vitamin D and the EBV hypotheses; both may be correct. Nevertheless, even if these two environmental events are part of a pathway to adult MS, they may not be on the same pathway, or on the only pathway. Indeed, assuming that each is part of some causal pathway, there are several possible arrangements for how these environmental events might produce MS (see the Figure). No pathway can be excluded entirely although, if prior EBV infection is necessary for adult MS, then pathways 1 and 2 must occur rarely, if at all. Similarly, if pathway 4 were the major pathway, an observable maternal effect would not be anticipated. Consequently, only pathway 3 (implying sequential environmental events) seems to form a necessary part of the causal cascade to adult MS. The first two events may be an appropriately-timed vitamin D deficiency and an appropriately-timed EBV infection (as in the Figure). However, these particular associations with the first and second environmental events are not a necessary component of the Model and, if it turns out that these two factors are not relevant, then two suitable alternatives would simply need to be substituted into the equations without any alteration of the Model itself.
Possible causal pathways (1–4) leading to MS, which include genetic factors (G), vitamin D deficiency (VD), Epstein-Barr virus (EBV) infection, and other unidentified environmental factors (O1–O4).
Definitions for the terms used in the Model are presented in the table "Definition of Terms used in the Model". The life-time probability of getting MS (PMS) will be equated with the prevalence of MS in the general population. This is an approximation, which is accurate only if disease incidence is unchanging, mortality is unaffected, the population size is stable, and the diagnostic sensitivity (over the entire life-span) is unchanged. None of these conditions holds exactly. Nevertheless, this assumption seems reasonable as a rough approximation, especially because the error introduced by increased mortality (in the estimate for individuals currently aged 35–40 years) will be opposite in direction from the others. The probability of genetic susceptibility (PG) will refer to the probability of an individual possessing any of possibly several “susceptible” genotypes and will refer to the susceptibility conferred by genetic determinants present at conception. Subsequent alterations of genetic expression will be considered environmental events, although the genotypes that permit such alterations to occur would still be included within the group of “susceptible” genotypes. Because mitochondrial genes linked to MS have not been found and because, as discussed above, preferential maternal transmission has not been observed, genetic susceptibility will be attributed to nuclear genes.
Definition of Terms used in the Model.
Initially, the probability (PE) will be used to describe the combined occurrence of an entire set of environmental events (taking place at appropriate times) that could cause MS in, at least, some genetic backgrounds. Such an occurrence will be referred to as an individual experiencing a “sufficient” set of environmental events. Here, it is understood that every “sufficient” set of environmental events may not be “sufficient” for every “susceptible” genotype. However, the set of environmental events does need to be “sufficient” for, at least, one such genotype. Under these conditions, the probability that any individual will develop MS is described by the equation:

$$P_{MS} = P_{G} \cdot P_{E|G} \cdot P_{MS|G,E} \qquad (1)$$

In this equation, (PE|G) is the conditional probability of an individual experiencing a “sufficient” set of environmental events given that they are genetically susceptible to MS. The last term (PMS|G, E) is the conditional probability of MS developing in an individual having both an appropriate genetic make-up and experiencing a “sufficient” set of environmental events. As such, this last term allows for different probabilities of developing MS in persons with different combinations of “susceptible” genotypes and “sufficient” environmental exposures. It also allows for a purely stochastic factor, in which only a fraction of individuals will actually develop MS, even when they are genetically susceptible and when they experience an environmental exposure “sufficient” for MS to develop given their particular genotype. Importantly, however, Equation (1) is neither speculative nor controversial and it does not limit the environmental or genetic possibilities. It is simply the definitional statement for these various conditional probabilities.
Nevertheless, although it is clear that genetic susceptibility plays a key role in MS pathogenesis, theoretically, it could also turn out that everyone in the population might become susceptible to MS in response to some very special environmental circumstances, regardless of their genetic make-up. In this case, Equation (1) would require some modification (see Appendix S1). Despite this possibility, however, the available experimental evidence suggests that the large majority (perhaps all) of MS is due to the effect of specific environmental events acting on genetically susceptible individuals and, thus, that Equation (1) adequately characterizes the causal pathway leading to MS (see Appendix S1 for an expanded discussion of these issues).
For a monozygotic-twin of an MS proband, (PG) should be close to 100%. Even though gene copy numbers can vary and epigenetic factors may differ to a degree between monozygotic-twins, especially as the individuals age, in general, any monozygotic-twin of an MS proband will be genetically susceptible if this is conferred solely by nuclear genes inherited at conception. Moreover, as discussed in Appendix S1, for a monozygotic-twin of an MS proband, the lifetime probability of developing MS (PMS) is equal to the proband-wise concordance-rate of MS (CRMZ). Thus, in this special case, where (PG≈1) and where (PMS = CRMZ), Equation (1) simplifies to:

$$CR_{MZ} = P_{E|G} \cdot P_{MS|G,E} \qquad (2)$$
Thus, the product of the conditional probability of experiencing a “sufficient” environmental exposure and the probability of an appropriate outcome from a random (stochastic) process is equal to the proband-wise monozygotic-twin concordance-rate (which is also equal to the penetrance of the complex genetic trait). This relationship is independent of whether any specific factor is in the causal path to MS and the value of [(PE|G)(PMS|G, E)] can be determined regardless of whether genetic susceptibility, environmental exposure, and any stochastic processes are independent of each other. As a result, if both (PMS) and (CRMZ) are known for any particular region, the prevalence (probability) of genetic susceptibility (PG) in the population for that region can always be approximated as:

$$P_{G} \approx \frac{P_{MS}}{CR_{MZ}} \qquad (3)$$
Note that in deriving Equation (3), beyond ascribing susceptibility to nuclear genes, no assumptions have been made with respect to the environmental events, the genetic variables or their interactions. As noted, Equation (1) follows directly from the definition of conditional probability and Equations (2) and (3) are immediate consequences of this. Moreover, each of the potential errors (discussed earlier), which arise from equating (PMS) with the prevalence of MS in the general population, will also arise from equating (CRMZ) with the prevalence of MS in monozygotic-twins of an MS proband. Therefore, in Equation (3), these errors (whatever they are) should largely cancel.
As an example, for far northern populations, the range of estimates for (CRMZ) is 20–30% and for (PMS) is 0.1–0.2%. Therefore, (PG) in these regions can be approximated as 0.3–1.0%. In more southerly regions of North America and Europe, both rates are approximately half those in the north, implying that (PG) is in the same range. In fact, applying Equation (3) to every region in which (PG) can be calculated from the available epidemiological data, (PG) seems to be in a very similar range everywhere, without any obvious north-south gradient. It may be (as is often suggested) that the probability of genetic susceptibility is different in some ethnic populations than in others although, at present, based on the apparent absence of any difference in susceptibility between the ethnically different populations of Europe and North America, such a possibility is pure speculation. Thus, all that can be said at the moment is that throughout Europe and North America, the probability of being genetically susceptible is remarkably consistent.
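The arithmetic behind these ranges can be checked with a minimal sketch, assuming Equation (3) is read as PG ≈ PMS / CRMZ. The function name `susceptibility_bounds` is hypothetical, and the southern inputs are simply the northern ranges halved, which is an assumption standing in for "approximately half".

```python
def susceptibility_bounds(p_ms_range, cr_mz_range):
    """Return (low, high) bounds on PG = PMS / CRMZ.

    The lowest PG pairs the lowest prevalence with the highest
    concordance-rate; the highest PG pairs the opposite extremes.
    """
    p_low, p_high = p_ms_range
    cr_low, cr_high = cr_mz_range
    return p_low / cr_high, p_high / cr_low


# Northern ranges quoted in the text: PMS 0.1-0.2%, CRMZ 20-30%.
north = susceptibility_bounds((0.001, 0.002), (0.20, 0.30))

# Assumed southern ranges: both inputs halved (illustrative only).
south = susceptibility_bounds((0.0005, 0.001), (0.10, 0.15))

for name, (low, high) in (("north", north), ("south", south)):
    print(f"{name}: PG between {100 * low:.2f}% and {100 * high:.2f}%")
```

Because both the numerator and the denominator are halved, the ratio is unchanged, which is why the same range of (PG) appears in both regions.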
Prevalence (probability) of genetic susceptibility in populations in different geographic regions.
Nevertheless, it is clear that a person's genetic make-up is, by far, the most important determinant of MS risk, despite the fact that the contribution of individual genes to that risk seems to be small and that several genome-wide screens have not provided evidence for strong associations other than at the HLA DRB1 locus. Thus, by this probabilistic analysis, over 99% of individuals seem to be genetically incapable of getting MS, regardless of what environmental events they experience during their lives. Paradoxically, however, because the (PG) term is so similar in different areas, it is the environmental events (i.e., the PE term) that determine the observed regional variations in MS epidemiology. In addition, the mechanisms underlying genetic susceptibility to MS are likely to be quite complex. Thus, even though the HLA DRB1*1501 allele has the largest and most consistent association with MS susceptibility of any genetic marker identified to date, approximately 99% of the individuals who carry this susceptibility allele are not even genetically susceptible to getting MS.
For the purposes of this Model, only two time-periods from the Canadian study will be considered in detail although, in fact, the observed change in sex-ratio seems steady and consistent over the entire study-interval and the parameter estimates derived from each of these different sex-ratio data points (see Appendix S1) are quite similar. The first interval considered is (1941–1945) because, of the older data points, this is the oldest with a very narrow confidence interval. The second is (1976–1980), which will be considered “current” because this is the youngest 5-year cohort whose members have lived long enough (i.e., 35–40 years) for MS to declare itself. Also, this time-period is the most likely to match up with “current” estimates of the monozygotic-twin concordance-rates. Obviously, these point estimates are associated with error terms, so that estimates derived from them are only valid within the limits set by these potential errors. Nevertheless, knowing that the sex-ratio in the (1976–1980) time-frame is 3.2 and using an estimate of 0.1% for MS prevalence in Canada, a gender-specific MS prevalence of 152.4 and 47.6 per 100,000 population can be calculated for women and men respectively. Moreover, the proband-wise monozygotic-twin concordance-rate for women (0.34) is significantly larger (p<0.001) than the same rate (0.065) for men.
Table 4. Parameter estimates using the different sex-ratios (Female:Male) reported in Canada over the period from 1931 to 1980*.
From Equation (3), men seem to be 60% more likely to be genetically susceptible to getting MS than are women (i.e., PGM>1.6 PGW). This result is independent of the actual prevalence of MS in Canada and, consequently, the greater current MS prevalence in women is presumably due to the fact that the [(PE|G)(PMS|G, E)] term in Equation (1) is presently larger for women than for men.
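The gender-specific figures quoted above can be reproduced with a short sketch. It assumes that women and men each make up half of the population (an assumption not stated in the text, but one that reproduces the quoted 152.4 and 47.6 per 100,000) and that Equation (3) is applied separately to each sex. The function name `sex_specific_prevalence` is hypothetical.

```python
def sex_specific_prevalence(prevalence, sex_ratio):
    """Split an overall prevalence into (women, men) prevalences.

    Assumes equal numbers of women and men, so each sex-specific
    prevalence is twice that sex's share of the overall prevalence.
    `sex_ratio` is the female:male ratio among cases.
    """
    women = 2 * prevalence * sex_ratio / (1 + sex_ratio)
    men = 2 * prevalence / (1 + sex_ratio)
    return women, men


Z_WOMEN, Z_MEN = 0.34, 0.065  # proband-wise MZ-twin concordance-rates


def susceptibility_ratio(prevalence, sex_ratio):
    """Return PGM / PGW using PG = prevalence / concordance-rate per sex."""
    p_women, p_men = sex_specific_prevalence(prevalence, sex_ratio)
    return (p_men / Z_MEN) / (p_women / Z_WOMEN)


p_women, p_men = sex_specific_prevalence(0.001, 3.2)
print(f"women: {1e5 * p_women:.1f}, men: {1e5 * p_men:.1f} per 100,000")
print(f"PGM / PGW = {susceptibility_ratio(0.001, 3.2):.2f}")
print(f"PGM / PGW = {susceptibility_ratio(0.002, 3.2):.2f}  (prevalence doubled)")
```

The overall prevalence cancels in the ratio, leaving only the sex-ratio and the two concordance-rates, which is why the result does not depend on the actual prevalence of MS in Canada.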
This gender-specific environmental effect might reflect a true difference in exposure (e.g., maybe women use more sun-block or sun-avoidance than men, maybe they spend less time out of doors, or maybe they have better hygiene as children and therefore acquire EBV later). It could also reflect gender-specific differences in vitamin D metabolism, which cause men and women to experience a deficiency at different absolute exposure levels. It may also reflect women having a greater probability of actually developing MS once the necessary environmental and genetic events have come together, or it could be that some combination of these factors contributes to the observed gender-specific differences.
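Regardless of the explanation, however, the existence of gender-specific differences necessitates that Equation (1) be re-written separately for women and men as:

$$P_{MS}(\text{women}) = P_{GW} \cdot P_{EW|GW} \cdot P_{MS|GW,EW}$$

$$P_{MS}(\text{men}) = P_{GM} \cdot P_{EM|GM} \cdot P_{MS|GM,EM}$$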
In the Model, these gender-specific terms [(PEW|GW)(PMS|GW, EW) and (PEM|GM)(PMS|GM, EM)] will be referred to (collectively) as the probability of an “effective” exposure (i.e., an exposure that actually produces disease in a susceptible individual). These gender differences also necessitate the use of gender-specific monozygotic-twin concordance-rates (Zw and Zm), as defined in the table "Definition of Terms used in the Model".
MS epidemiology in Canada has been changing over the 35 year interval between (1941–1945) and (1976–1980). First, MS prevalence may have doubled (in which case C, the ratio of the earlier prevalence to the current prevalence, would be equal to 0.5). Second, the sex-ratio of women-to-men has increased from 2.2 to 3.2. From these two pieces of information, and assuming both that PGW and PGM are unchanged and that the current MS prevalence is 0.1%, it follows that the gender-specific MS prevalence in (1941–1945) was 68.8 and 31.2 per 100,000 population for women and men respectively and that the gender-specific term for the probability of “effective” exposure during this period is (0.307*C = 15.3%) for women and (0.085*C = 4.3%) for men (see Appendix S1). Moreover, knowing these values permits the relationship between the probability of an “effective” exposure and the actual exposure level of the population to be defined rather precisely (see Appendix S1).
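As a rough check on these (1941–1945) figures, the following sketch recomputes them. It is not the Appendix S1 derivation. It assumes equal numbers of women and men, a fixed probability of genetic susceptibility for each sex (estimated as current prevalence divided by the current concordance-rate), and C = 0.5. The function name `per_sex` and all variable names are hypothetical.

```python
def per_sex(prevalence, sex_ratio):
    """Return (women, men) prevalences, assuming a 50:50 population."""
    women = 2 * prevalence * sex_ratio / (1 + sex_ratio)
    men = 2 * prevalence / (1 + sex_ratio)
    return women, men


C = 0.5                        # assumed ratio of earlier to current prevalence
CURRENT_PREVALENCE = 0.001     # 0.1%
Z_WOMEN, Z_MEN = 0.34, 0.065   # current MZ-twin concordance-rates

now_w, now_m = per_sex(CURRENT_PREVALENCE, 3.2)        # 1976-1980
then_w, then_m = per_sex(CURRENT_PREVALENCE * C, 2.2)  # 1941-1945

# Genetic susceptibility per sex, held fixed across the two periods.
pg_w = now_w / Z_WOMEN
pg_m = now_m / Z_MEN

# "Effective" exposure in 1941-1945 = earlier prevalence / susceptibility.
eff_w = then_w / pg_w
eff_m = then_m / pg_m

print(f"1941-1945 prevalence per 100,000: women {1e5 * then_w:.2f}, men {1e5 * then_m:.2f}")
print(f"effective exposure: women {eff_w / C:.3f}*C = {100 * eff_w:.1f}%")
print(f"effective exposure: men   {eff_m / C:.3f}*C = {100 * eff_m:.1f}%")
```

Under these assumptions the coefficients come out near 0.307 and 0.085, matching the values quoted in the text; with a different value of C, only the final percentages change.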
If the hazard-rate for “effective” exposure is constant and the same for men and women, which seems plausible for a stochastic process related only to the actual exposure level received by the population (see Appendix S1 for a discussion of these issues), the most general equations describing the approach to maximum probability of “effective” exposure are exponential and are given by:
In these equations, a and b are positive constants (≤1), which represent the maximum probability of “effective” exposure for men and women, respectively; x (≥0) represents the actual exposure level received by the population; and r (≥0) is the hazard-rate for “effective” exposure. The threshold exposure level at which disease becomes possible is defined as the exposure-level (rx) such that:
The parameter (λ) represents the difference (between men and women) in this threshold exposure-level, so that:
By virtue of a few basic epidemiological observations, we can specify two points on these gender-specific response curves and, therefore, each curve can be defined within narrow limits (see Appendix S1). Thus, based upon the current proband-wise monozygotic-twin concordance-rates, allowing for an accuracy of ±1 SE, and based upon the observed changes in sex-ratio, it can be shown that (b>a), that (0.018<a<0.154), that (0.335<b<0.576), that (2.7<b/a<26.1), and that (λ<−0.1). It is therefore apparent that women are more responsive than men to the environmental changes that have taken place (whatever they are). Despite this, however, men have a lower threshold of actual environmental exposure for the disease to develop than women. Such a circumstance might explain why the earliest reports of MS were often in men and why a 1922 study reported that, of 363 MS patients from the United States and of 1,142 cases from Europe, approximately 58% (in both regions separately) were men. Moreover, it can be shown that, at a minimum, there must have been a 32% increase in the prevalence of MS in Canada over the 35 year interval of study. Some increase in MS prevalence might be expected from better diagnostic techniques although, because this minimum increase in MS prevalence depends only upon the observed change in the sex-ratio (see Appendix S1), this explanation seems unlikely to account for it. Indeed, assuming that MS prevalence has doubled over the 35 year interval, and that the “current” estimates for Zw2, …, and R1 are accurate, every parameter can be determined precisely [i.e., b … 5.751, and λ = −0.373] and the theoretical response curves constructed exactly. Finally, these parameter estimates are quite stable for all of the changes in sex-ratio that have been observed over time in Canada.
Derived response curves for men and women of an “effective” exposure to the environmental factors (PE*) with increasing levels of actual exposure (x), as described in the text and in the Appendix.
Thus, there can be no doubt that the environmental factors have been changing over the past several decades and probably for much longer. Nevertheless, it is possible that only some of these implicated environmental factors have changed and, thus, only these particular factors may be responsible for the changes that have taken place in MS epidemiology over the past several decades. If, as discussed earlier, the sequential pathway 3 plays the dominant role in adult MS pathogenesis, then it follows that the “total” environmental term (PE) can be re-written as:

$$P_{E} = P_{VD} \cdot P_{EBV|VD} \cdot P_{O|VD,EBV}$$
In this equation, (PVD) is the probability of a “sufficient” exposure to vitamin D deficiency, the term (PEBV|VD) is the conditional probability of a “sufficient” EBV exposure given a “sufficient” vitamin D exposure, and (PO|VD, EBV) is the conditional probability of the other “sufficient” environmental exposures given the fact that the individual has already experienced “sufficient” vitamin D and EBV exposures.
If individuals are equally likely to receive a “sufficient” exposure to each of these three environmental events (VD, EBV, and Other), if (PMS|G, E) is 100%, and if the conditional probabilities are approximately equal to the probabilities themselves, then it follows that 63% of northern European and northern North American populations have experienced what would have been a “sufficient” exposure to each of the three implicated environmental events in a genetically susceptible individual. Thus, in this circumstance:
Even in the more southerly regions of these continents, a “sufficient” exposure to each of the events would still be experienced by 53% of the population. It is only because genetic susceptibility is so infrequent that the disease is uncommon. Moreover, if (PMS|G, E) is less than 100% (as seems likely), these numbers will only increase further as this probability declines. Consequently, the necessary environmental exposures in the causal pathway to MS seem likely to be very common events. Indeed, because vitamin D deficiency and EBV infection are both very common population-wide events, this conclusion is fully consistent with these factors being the first two environmental events involved in MS pathogenesis. Thus, vitamin D deficiency (at least to some degree) is anticipated in the large majority of individuals living in low sun-exposure regions and EBV is an extremely prevalent pathogen in human populations.
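The 63% and 53% figures can be illustrated with a small sketch, under the simplifying assumptions stated in the text: three equally likely events whose conditional probabilities are treated as equal to the probabilities themselves. In that case the combined environmental term is the cube of the per-event probability, so the per-event probability is the cube root of the concordance-rate divided by (PMS|G, E). The inputs 0.25 (north) and 0.15 (south) are assumed concordance-rates chosen because they reproduce the quoted percentages; they are not values stated at this point in the text. The function name `per_event_probability` is hypothetical.

```python
def per_event_probability(concordance_rate, p_ms_given_g_e=1.0, n_events=3):
    """Per-event probability when n equally likely events must all occur.

    Uses CRMZ = (PE|G)(PMS|G, E) with (PE|G) approximated as p ** n_events.
    """
    p_environment = concordance_rate / p_ms_given_g_e
    return p_environment ** (1.0 / n_events)


north = per_event_probability(0.25)   # assumed CRMZ for northern regions
south = per_event_probability(0.15)   # assumed CRMZ for southern regions
print(f"north: {100 * north:.0f}%, south: {100 * south:.0f}%")

# If (PMS|G, E) is below 100%, each event must be even more common.
north_half = per_event_probability(0.25, p_ms_given_g_e=0.5)
print(f"north with (PMS|G, E) = 50%: {100 * north_half:.0f}%")
```

Lowering (PMS|G, E) raises the required per-event probability, in line with the statement that these numbers only increase as this probability declines.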