While the analogy with thermometry is instructive, I would propose that there is an even deeper connection between the physicists’ underlying measurement problem and ours, and I believe we can adapt the established foundations of thermodynamics to accelerate the development of an absolute evidence scale. Others have already made various relevant connections: see for instance work by Cox [13], Jaynes [14], and Shannon [15], all of whom borrow methods (e.g., differential equations) and individual concepts (e.g., entropy) from physics in grappling with the foundations of inferential and informational systems. But I think we can take an even more radical approach, and harness the actual foundations of thermodynamics to derive an evidential analogue of the true thermometer. Of course I am not claiming that evidence and temperature literally share a physical basis, but neither am I invoking thermodynamics as mere metaphor. Rather, I am proposing that the two are related by virtue of our ability to represent them within the same underlying mathematical framework. Indeed, work within physics itself has already shown the generality of the framework (see Callen's derivation of thermodynamics [16] from simple symmetry relationships, applicable to all types of systems in macroscopic aggregation, and Caratheodory's axiomatic development [17]).
To give the flavor of how this might go, consider first the fundamental physical construct known as the equation of state. An equation of state gives a complete description of a system in terms of a (generally small) number of interrelated parameters describing the system at a macroscopic level. For instance, the behavior of m moles of an ideal gas can be described in terms of the absolute temperature T (that is, temperature measured on the absolute scale), volume V, and pressure P through the equation of state
$$PV = mRT \tag{1}$$
(with R = gas constant). This equation represents a complete description of the system in the sense that knowing the value of any two of the three parameters (T, V, P) determines the value of the third, while any additional macroscopic properties of the system (e.g., entropy) can be derived once this equation is known. Equation 1 is therefore a particular instance of an equation of state, which can be represented in the more general form
$$f_{IG}(T, V, P) = 0 \tag{2}$$
where the subscript IG (for ideal gas) indicates that the function f will take on a particular form specific to ideal gases.
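The sense in which an equation of state is "complete" can be made concrete with a minimal sketch. The Python below assumes the ideal gas law PV = mRT in SI units; the function names `ideal_gas_residual` and `solve_third` are illustrative inventions rather than anything taken from the physics literature. Fixing any two of T, V, and P for a given m determines the third.

```python
R = 8.314462618  # molar gas constant in J/(mol*K)


def ideal_gas_residual(T, V, P, m):
    """General form f_IG(T, V, P) = P*V - m*R*T; zero when the state is consistent."""
    return P * V - m * R * T


def solve_third(m, T=None, V=None, P=None):
    """Given any two of (T, V, P) for m moles, return the full state (T, V, P).

    Units are assumed to be kelvin, cubic metres, and pascals.
    """
    missing = [name for name, val in (("T", T), ("V", V), ("P", P)) if val is None]
    if len(missing) != 1:
        raise ValueError("exactly one of T, V, P must be left unspecified")
    if T is None:
        T = P * V / (m * R)
    elif V is None:
        V = m * R * T / P
    else:
        P = m * R * T / V
    return T, V, P


# One mole at 273.15 K and one standard atmosphere: the volume follows.
T, V, P = solve_third(1.0, T=273.15, P=101325.0)
print(f"V = {V:.6f} m^3, residual = {ideal_gas_residual(T, V, P, 1.0):.2e}")
```

The residual function plays the role of the general form: any state returned by `solve_third` should make it vanish up to floating-point error.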
A great deal of fundamental theory can be worked out without an explicit definition of T, based only on the assumption that an equation of state in the form of equation 2 exists. This theory can then be related to the behavior of actual gases provided only that we have access to a device for measuring empirical temperature t (that is, temperature as it would be measured by a thermoscope), such that T equals some function g of t. Here the specific law g(t) = T need not be known, but there is a presumption that a device for measuring t exists such that this law could be established in principle. In practice, this simply means that the empirical temperature t must ‘track’ with the absolute temperature, going up when T goes up and down when T goes down.
Statistical models can also be viewed as equations of state, similarly framed in terms of an empirical (‘thermoscopic’) measure of evidence e, even before we have a definition of what would be the absolute evidence E. All we need is an empirical measure that behaves like a thermoscope, that is, one that goes up or down as the evidence goes up or down, at least in simple settings and under normal circumstances (see also [2] for further discussion of how we know when an evidence measure is behaving like a thermoscope). For instance, think of binomial data (N coin tosses of which X land heads) and the simple hypotheses P[heads] = q = 0.05 versus q = 0.5. Let us assume at least for the moment that the LR itself behaves like a thermoscope for this simple system, that is, that it correctly tracks with the evidence, or in other words, that e = LR. (This is probably a safe assumption for comparisons between simple hypotheses [3, 7], but does not necessarily assist us in measuring evidence for compound hypotheses, where unknown parameters can take multiple values.) In this case we have
$$e = \mathrm{LR} = \frac{0.05^{X}\,0.95^{N-X}}{0.5^{N}} \tag{3}$$
Now if we hold e constant and increase N, X will have to change by a compensatory amount; similarly for other relationships among e, N, and X. Indeed, if we fix any two of the three variables in this equation, the value of the third is known and simple to calculate. Thus equation 3 is an equation of state in e, N, and X.
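The compensatory relationship among e, N, and X can be checked numerically. The sketch below assumes that the LR is the ratio of the binomial likelihood at q = 0.05 to that at q = 0.5 (the direction of the ratio is an assumption here), and it treats X as a real number when solving, although observed counts are integers. The function names are hypothetical.

```python
import math

Q1, Q0 = 0.05, 0.5  # the two simple hypotheses; Q1 in the numerator is an assumption

# Per-toss log-LR contributions of a head and of a tail.
LOG_HEAD = math.log(Q1 / Q0)
LOG_TAIL = math.log((1 - Q1) / (1 - Q0))


def lr(n, x):
    """e = LR for x heads in n tosses (the binomial coefficient cancels)."""
    return math.exp(x * LOG_HEAD + (n - x) * LOG_TAIL)


def heads_for(e, n):
    """Solve the equation of state for X, given e and N (real-valued)."""
    return (math.log(e) - n * LOG_TAIL) / (LOG_HEAD - LOG_TAIL)


def tosses_for(e, x):
    """Solve the equation of state for N, given e and X (real-valued)."""
    return (math.log(e) - x * (LOG_HEAD - LOG_TAIL)) / LOG_TAIL


e = lr(20, 2)
print("e for N=20, X=2:", e)
print("X needed to keep e fixed at N=40:", heads_for(e, 40))
print("N recovered from e and X=2:", tosses_for(e, 2))
```

Because log e is linear in both N and X under this assumption, each "solve" is a single line of algebra, which is what makes the system behave like a simple equation of state.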
But what can we do with this equation of state? In physics, the equation of state allows construction of a Carnot engine, a device for quantifying the transformation of heat into work in the absence of any prior operational definition of heat [18]. Note that the Carnot engine is itself a purely mathematical device, based on certain assumptions (e.g., perfect reversibility) which cannot be realized by physical systems. Kelvin and Joule defined the absolute temperature scale by deriving it from mathematical features of the Carnot engine, rather than defining it ab initio [18]. It is hard to overstate the ingeniousness of this maneuver. Without knowing what heat was (this was still the era of caloric, after all) or what precisely was meant by T, mathematical insights based on the Carnot engine yielded both a definition of T and an absolute scale for its measurement. The details are beyond the scope of this essay, but the interested reader is referred to Chang [12] for a lucid presentation that is quite accessible to non-physicists.
Returning to the evidential problem, for a given set of data, the simple equation of state given above (equation 3) for the binomial case represents a ‘system’, which can be plotted as a curve with q on the x-axis and e = LR on the y-axis. Insofar as equation 3 is an equation of state, this plot conveys all of the information in the data regarding the strength of the evidence for or against q = 0.05. The plot itself has some very real, physical properties, including for instance the area A under the curve and the maximizing value Q of q. We can, therefore, reformulate equation 3, which is written in terms of the data (N, X), in terms of properties of the graph itself, such as A and Q. This yields an equation of state in the form
$$f_{BIN}(e, A, Q) = 0 \tag{4}$$
[Here the subscript BIN (for binomial) is a reminder that the particular form of f will be dictated by the behavior of binomial systems.] Equation 4 is a simple reparameterization of equation 3, which also encapsulates the property that if, say, we hold the evidence e constant and increase A by a specific amount, Q will have to shift by a compensatory amount (and similarly for other relationships among e, A, and Q).
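To make A and Q tangible, here is a rough sketch of how they might be computed for given data. It rests on an assumption that the text does not spell out: that the plotted curve is the likelihood ratio of a candidate value q against the fixed reference q = 0.5, traced over q in [0, 1]. Under that reading, Q is the usual maximizing value X/N and A has a closed form via the beta function. All function names are hypothetical.

```python
import math


def lr_curve(q, n, x, q0=0.5):
    """Assumed curve: likelihood at q divided by likelihood at the reference q0."""
    return (q ** x) * ((1 - q) ** (n - x)) / ((q0 ** x) * ((1 - q0) ** (n - x)))


def maximizing_q(n, x):
    """Q: the value of q at which the assumed curve peaks."""
    return x / n


def area_exact(n, x, q0=0.5):
    """A: integral of the assumed curve over [0, 1], i.e. B(x+1, n-x+1) / L(q0)."""
    log_beta = math.lgamma(x + 1) + math.lgamma(n - x + 1) - math.lgamma(n + 2)
    log_ref = x * math.log(q0) + (n - x) * math.log(1 - q0)
    return math.exp(log_beta - log_ref)


def area_numeric(n, x, steps=20000):
    """Trapezoid-rule check on the closed form."""
    h = 1.0 / steps
    total = 0.5 * (lr_curve(0.0, n, x) + lr_curve(1.0, n, x))
    total += sum(lr_curve(i * h, n, x) for i in range(1, steps))
    return total * h


n, x = 20, 2
print("Q =", maximizing_q(n, x))
print("A (closed form) =", area_exact(n, x))
print("A (trapezoid)   =", area_numeric(n, x))
```

New data change N and X, and hence move both A and Q; that is the sense in which incoming information could be said to do ‘work’ on the graph, if this reading of the curve is the intended one.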
In the physical system, T, V, and P constitute macroscopic properties of the system that are affected by changes in energy, in particular by the influx or outflux of heat. We can similarly think of the influx of new information – like the influx of heat – as performing ‘work’ on the graph associated with equation 4, that is, changing its physical properties A and Q. Viewing the statistical system in this way suggests that we should be able to run ‘evidential’ Carnot cycles and to study their properties. This opens the door to derivation of an absolute scale for the measurement of evidence, following the Kelvin-Joule template. There is an enormous amount of mathematical detail remaining to be worked out, and, of course, the devil may well reside in the details. However, if this basic framework is even approximately correct, this means that evidential equations of state can be derived prior to defining what precisely is meant by the evidence, and deployed in ‘evidentialism’ just as they are in thermodynamics, to provide the basis for (and definition of) absolute measurement.
This requires of course that we accept evidential analogues of the laws of thermodynamics, but this is not so farfetched as it might at first appear. While there may not be a comparable physical basis for an evidential version of the 1st law (which stipulates the conservation of energy), certain elementary principles come to mind that could stand in nicely: for instance, the law of total probability, or perhaps, a comparable law constraining total evidential information. The analogue of the 2nd law (which, in one form or another, describes the tendency of systems to equilibrate in their maximum entropy configurations) is perhaps more obscure. However, given the close connection between thermodynamics, statistical mechanics, and the entropy-based information theoretic framework of Shannon [15], it seems reasonable to expect that an entropy-based formulation will also be forthcoming in the evidential case, and this would in turn allow concise formulation of an evidential 2nd law.
Not only would all of this give us a framework for solving our nomic measurement problem, but application of the theory to specific subject-matter domains could produce additional results paralleling those of physics. For example, the amount of heat required to raise the temperature of a liquid by a given amount depends on the liquid, or in other words, different liquids have different specific heats. This means that different liquids have different equations of state, so that the thermal meaning of a given change in volume within a thermoscope is not fixed. Just so, the quantity of data required to change the evidence by a fixed amount will depend upon the hypotheses of interest, among other things. We can think of this as a matter of different ‘specific heats’ for different statistical applications. We will therefore need to derive empirical adjustments to equations of state for particular applications, to ensure that a change of 1° in our measure of the evidence always means the same thing. The Carnot engine also yielded a theoretical upper bound on the efficiency with which heat could be converted to work, generating new metrics for investigating the relative efficiencies of various real engines. Just so, our evidential analogue of the Carnot engine could yield a theoretical upper bound on the efficiency with which data, or the information conveyed by the data, can be converted to evidence, in turn leading to new ways to evaluate mathematical modeling methods by comparison with this upper bound.
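The ‘specific heat’ idea can be illustrated with a back-of-the-envelope sketch. It assumes, purely for illustration, that the evidence is measured by the LR between two simple binomial hypotheses and that the log LR grows at its expected rate per toss (the Kullback-Leibler divergence) when the first hypothesis is true. Neither the function names nor this average-case approximation come from the argument above.

```python
import math


def expected_log_lr_per_toss(q_true, q_alt):
    """Expected log LR per toss (KL divergence) for q_true versus q_alt,
    when the data are actually generated with P[heads] = q_true."""
    return (q_true * math.log(q_true / q_alt)
            + (1 - q_true) * math.log((1 - q_true) / (1 - q_alt)))


def tosses_needed(target_lr, q_true, q_alt):
    """Approximate number of tosses for the LR to reach target_lr on average."""
    return math.ceil(math.log(target_lr) / expected_log_lr_per_toss(q_true, q_alt))


# Same target LR, two different pairs of hypotheses.
for q_true, q_alt in [(0.05, 0.5), (0.4, 0.5)]:
    print(f"q = {q_true} vs q = {q_alt}: about "
          f"{tosses_needed(100, q_true, q_alt)} tosses to reach LR = 100")
```

Well-separated hypotheses reach the target with far fewer tosses than closely spaced ones, which is the statistical counterpart of two liquids needing different amounts of heat for the same rise in temperature.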
It is also important to note that definition of the absolute (Kelvin) scale for temperature did not in any way necessitate replacement of thermoscopes. It simply provided the basis for absolute calibration of existing thermometric devices. Just so, this line of inquiry would not replace other research on statistical methods in biology, although it might call for some adjustments to ensure that all of our empirical outcome measures behave like thermoscopes. The object would be simply to harmonize the evidence scale on which results of a multitude of statistical approaches can be represented, in the process formulating a general theoretical framework and gaining the many ancillary benefits that come from having such a framework in place. Indeed, there is no reason to think that the benefits would be restricted to statistical modeling; non-stochastic methods, such as control theory, might also be incorporated into this framework.