To review the history, theory and current applications of Weibull analyses sufficient to make informed decisions regarding practical use of the analysis in dental material strength testing.
References are made to examples in the engineering and dental literature, but this paper also includes illustrative analyses of Weibull plots, fractographic interpretations, and Weibull distribution parameters obtained for a dense alumina, two feldspathic porcelains, and a zirconia.
Informational sources include Weibull's original articles, later articles specific to applications and theoretical foundations of Weibull analysis, texts on statistics and fracture mechanics and the international standards literature.
The chosen Weibull analyses are used to illustrate technique, the importance of flaw size distributions, physical meaning of Weibull parameters and concepts of “equivalent volumes” to compare measured strengths obtained from different test configurations.
Weibull analysis has a strong theoretical basis and can be of particular value in dental applications, primarily because of test specimen size limitations and the use of different test configurations. Also endemic to dental materials, however, is increased difficulty in satisfying application requirements, such as confirming fracture origin type and diligence in obtaining quality strength data.
This paper reviews Weibull statistics in order to facilitate informed decisions regarding practical use of the analysis as it applies to dental material strength testing. Weibull statistics are commonly used in the engineering community, but to a somewhat lesser extent in the dental field where applicability has been questioned [1,2]. A possible confusing factor is that Waloddi Weibull originally presented his analyses partially on empirical grounds; published theoretical confirmation was not available until many years later. Even as a theoretical basis was being constructed, Weibull, as an engineer, seemed more concerned with what worked [3,4]. In order to assess applicability to the specific field of dental material strength testing, it is helpful to start with a brief history and overview of basic concepts of extreme value theory, fracture mechanics, and flaw populations as they pertain to the theoretical foundation of Weibull analysis.
It is often noted that dental restorative ceramics and composites, while popular in terms of esthetics and biocompatibility, are susceptible to brittle fracture. This type of failure is particularly difficult to predict. Imminent brittle fracture is seldom preceded by warning, such as visible deformation, nor do seemingly identical brittle components appear to break at the same applied stress. Ductile materials deform to evenly distribute stresses throughout a region, but in stiff, brittle materials, stress concentrations at specimen geometry changes, cracks, surface irregularities, pores and other intrinsic flaws are not relieved. Hence the design methodologies and test methods for ductile materials are unsuitable for brittle materials and different test methods and design approaches are needed.
A series of catastrophic events, often epitomized by the Liberty ship and Comet airplane disasters, spectacularly demonstrated the inapplicability of ductile strength testing analyses to brittle failure prediction. Such disasters spurred the development of the science of fracture mechanics in order to understand the conditions of failure through crack growth rather than by ductile mechanisms.
In the early 1920s, Griffith postulated that crack extension in brittle materials occurs when there is sufficient elastic strain energy in the vicinity of a growing crack to form two new surfaces. In the 1950s Irwin built on Griffith's work to associate crack extension with an “energy release rate”. This led to a new parameter, KIc, the fracture toughness, or resistance to crack growth. Irwin's approach enabled strength predictions based on fracture toughness calculations that relate crack extension to the sizes of preexisting cracks or “flaws” within a material. For the most common case of a small flaw in a far field tensile stress field, modern fracture mechanics relates the applied fracture stress at the fracture origin, σf, to a flaw size, c:

$$K_{Ic} = Y \sigma_f \sqrt{c} \qquad (1)$$
where Y is a dimensionless, material-independent constant, related to the flaw shape, location and stress configuration and is called the stress intensity shape factor. “Flaws” are not necessarily inadvertent defects or blemishes in a material. No material is perfectly homogeneous, and all contain some sort of discontinuities on some scale. These discontinuities might be pores, inclusions, distributed microcracks associated with grain boundaries or phase changes during processing, regions of dislocations or slight variations of chemistry, or many other possible variants and combinations. They could also be surface distributed flaws such as grinding cracks. Flaws in this sense are intrinsic to the material or the way it was shaped or processed, and are distributed in some way throughout the material surface or volume. It can be seen from the equation that the smallest strengths are associated with the largest flaw sizes.
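The inverse relation between strength and flaw size can be made concrete with a short calculation. The sketch below assumes the common form of eq. (1), KIc = Y σf √c, solved for σf; the fracture toughness, shape factor and flaw sizes are hypothetical values chosen only for illustration and are not data from this paper.

```python
import math

def fracture_stress(k_ic, y, c):
    """Fracture stress in MPa, from K_Ic (MPa*sqrt(m)), shape factor Y and flaw size c (m).

    Assumes eq. (1) in the form K_Ic = Y * sigma_f * sqrt(c).
    """
    return k_ic / (y * math.sqrt(c))

# Hypothetical material constants, for illustration only.
K_IC = 4.0   # MPa*sqrt(m)
Y = 1.3      # dimensionless stress intensity shape factor

for c_um in (10, 40, 160):
    sigma_f = fracture_stress(K_IC, Y, c_um * 1e-6)
    print(f"flaw size {c_um:4d} um -> fracture stress {sigma_f:6.0f} MPa")
```

Because strength varies as the inverse square root of flaw size, each fourfold increase in flaw size in this sketch halves the predicted fracture stress.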
The idea of failure being associated with a largest flaw, or “weakest link theory”, is not recent. Leonardo da Vinci is reputed to have conducted tests circa 1500 involving baskets suspended by different lengths of wire of nominally identical diameter. Da Vinci gradually filled the baskets with sand and noted that the baskets suspended by shorter wires could hold more sand, an outcome that is expected if it is assumed that there is a lower probability of encountering a large flaw in a shorter wire. Da Vinci did not know exactly where a particular wire would break, but he recommended that multiple tests be done for each wire length, suggesting that there was a statistical variation in the strengths and failure locations of wires of the same length. Since da Vinci's time, much progress has been made both in our ability to identify critical flaw properties and in refining predictions based on weakest link theory and statistical failure probabilities.
Suppose a material has a flaw size distribution such as the normal, or Gaussian, distribution illustrated in Fig. 1. Although the flaws have different sizes, suppose they are all of the same general type that initiate failure. Finally, suppose that small material test specimens are withdrawn from this parent population. Each test specimen will contain a discrete flaw population based on the parent distribution, and the largest flaw within the highly stressed regions of each test specimen will precipitate failure if uniform tension is applied.
If one is very unlucky, a randomly withdrawn material test specimen will contain an unusually large flaw, at the very end of the tail at the right of the parent distribution in Fig. 1. The test specimen will be very weak. If one is very fortunate, however, the largest flaw in a random material test specimen will not be far into the tail. This specimen will be relatively strong. If many random test specimens are withdrawn, most of them will contain a largest flaw that is somewhere within the parent population tail of large flaws. The “largest flaw” distribution might look like the shaded portion of Fig. 1.
The distribution of “largest flaws” of many withdrawn test specimens from a parent population is an example of an extreme value distribution. The “largest flaw” population is not expected to be a symmetric distribution, whether or not the parent population is Gaussian as in the Fig. 1 example.
Suppose now that larger-sized test specimens of the same material are tested. Each of these physically larger test specimens contains more flaws than each of the smaller test specimens in the previous thought experiment. Since a larger test specimen contains more flaws, it is more likely to contain a very large flaw corresponding to the far right portion of the tail in the Fig. 1 parent population. If many larger-sized test specimens are withdrawn, their “largest flaw” distribution will be weighted further to the right within the parent population tail than the previous “largest flaw” distribution of smaller test specimens. The “largest flaw” distributions “march to the right” as the physical sizes of the test specimen increase.
One result of “largest flaw” distributions that depend on the physical size of test specimens is that strength distributions, which are based on flaw sizes, will similarly depend on test specimen size. Strength distributions will inversely “march” left or right in accordance with “largest flaw” distributions in a mathematically predictable fashion. Also, strength distributions would not be expected to be symmetric if the “largest flaw” distributions are not symmetric.
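The thought experiment of the preceding paragraphs can be sketched numerically. The code below draws flaw sizes from a Gaussian parent population, keeps the largest flaw in each simulated test specimen, and compares specimens that contain 100 flaws with specimens that contain 1000. The parent mean, standard deviation and flaw counts are hypothetical, and the function name `largest_flaws` is introduced here only for illustration.

```python
import random
import statistics

def largest_flaws(n_flaws, n_specimens, rng, mean=10.0, sd=2.0):
    """Largest flaw size in each of n_specimens specimens, each holding n_flaws flaws."""
    return [max(rng.gauss(mean, sd) for _ in range(n_flaws))
            for _ in range(n_specimens)]

rng = random.Random(0)                  # fixed seed so the run is repeatable
small = largest_flaws(100, 500, rng)    # smaller test specimens
large = largest_flaws(1000, 500, rng)   # larger test specimens, more flaws each

print("mean largest flaw, small specimens:", round(statistics.mean(small), 2))
print("mean largest flaw, large specimens:", round(statistics.mean(large), 2))
```

The mean of the "largest flaw" distribution is higher for the larger specimens, which is the "march to the right" described above; the corresponding strength distribution moves to lower strengths.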
Currently, many data in the dental literature are simply reported in terms of mean values with standard deviations calculated by assuming symmetric, normal strength distributions about an average. This assumption can still yield insight into material strengths and strength ranges. In fact, the assumed normal strength distribution may not greatly differ from an extreme value distribution in providing strength estimates of similarly sized and stressed test pieces. Unfortunately, such predictions only hold for the specific test specimen size and shape in the particular test configuration under laboratory conditions. As will be shown, besides more accurately characterizing material strength, extreme value statistics are a powerful tool that yield parameters that can be used to relate test strength data to expected strengths for different stress configurations, different test specimen sizes, and different testing conditions.
Returning to Fig. 1, it can be seen that the parent distribution tail at the right is much more important than the rest of the parent distribution containing smaller flaws. Since small flaws in the test specimens do not precipitate failure, it does not matter how small these flaws are or how their sizes are distributed. The left side of the total flaw distribution in Fig. 1 has very little bearing on the “largest flaw” distribution.
The original total flaw population does not have to be normal or symmetric to result in an extreme value distribution shaped similarly to the shaded portion in the figure. The right side of the parent flaw distribution, the tail of large flaws, dominates the shape of the “largest flaw” distribution. This is because only one very large flaw in a test specimen is necessary to be counted as “the largest flaw”, but if the largest flaw is relatively small, all the other flaws in the test specimen have to be smaller. The distribution is thus skewed to the right as in Fig. 1, following the parent distribution tail. Extreme value distributions in general “follow the tail”. Focusing only on the tails allows simple, generalized parametric functions to be used in extreme value statistical models. Such models have been defined to accommodate underlying distribution tails that deviate considerably from the Gaussian shape in the Fig. 1 example.
There are three commonly recognized families of extreme value distributions [11,12,13], where G(x) is the probability distribution function for an outcome being less than x for a sample set of n independent measurements:
|Type I. Gumbel:||$G(x) = \exp\{-\exp[-(x-\mu)/\sigma]\}$||for all $x$|
|Type II. Fréchet:||$G(x) = \exp\{-[(x-\mu)/\sigma]^{-\xi}\}$||for $x \ge \mu$|
|Type III. Weibull:||$G(x) = \exp\{-[(\mu-x)/\sigma]^{\xi}\}$||for $x \le \mu$|
where μ, σ (>0), and ξ (>0) are the location, scale and shape parameters, respectively. Fisher and Tippett are credited [11,13] with defining the extreme value distributions in 1928, when they showed there could only be the three types. Some graphical examples of the probability density functions of the three types of extreme value distributions are shown in Fig. 2.
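For readers who wish to evaluate the three families numerically, a minimal sketch follows. It assumes the standard forms of the three distribution functions, with the Fréchet function taken as 0 at or below μ and the Type III (Weibull) function taken as 1 at or above μ; the function names are illustrative only.

```python
import math

def gumbel_cdf(x, mu, sigma):
    """Type I: G(x) = exp(-exp(-(x - mu) / sigma)), defined for all x."""
    return math.exp(-math.exp(-(x - mu) / sigma))

def frechet_cdf(x, mu, sigma, xi):
    """Type II: G(x) = exp(-((x - mu) / sigma) ** -xi) above mu, else 0."""
    if x <= mu:
        return 0.0
    return math.exp(-(((x - mu) / sigma) ** (-xi)))

def weibull_ev_cdf(x, mu, sigma, xi):
    """Type III: G(x) = exp(-((mu - x) / sigma) ** xi) below mu, else 1."""
    if x >= mu:
        return 1.0
    return math.exp(-(((mu - x) / sigma) ** xi))

# One scale unit from the location parameter, each function returns exp(-1).
print(gumbel_cdf(0.0, 0.0, 1.0))
print(frechet_cdf(1.0, 0.0, 1.0, 2.0))
print(weibull_ev_cdf(-1.0, 0.0, 1.0, 2.0))
```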
All three extreme value distributions have a theoretical basis for characterizing phenomena founded on weakest link theory. For strength dependency on an underlying material flaw distribution, the goodness-of-fit of any of the extreme value distributions depends on the shape of the flaw distribution tail. In this regard, Type III, or the Weibull distribution, is usually considered the best choice because it is bounded (the lowest possible fracture strength is zero), the parameters allow comparatively greater shape flexibility, it can provide reasonably accurate failure forecasts with small numbers of test specimens and it provides a simple and useful graphical plot [4,14]. In what has been hailed as his “hallmark paper” in 1951, Weibull based the wide applicability of the distribution on functional simplicity, satisfaction of necessary boundary conditions and, mostly, good empirical fit. The parameter symbols and form of the extreme value functions are usually written differently for reliability analyses. In the specific case of Weibull fracture strength analysis, the cumulative probability function is written such that the probability of failure, Pf, increases with the fracture stress variable, σ:

$$P_f = 1 - \exp\left[ -\left( \frac{\sigma - \sigma_u}{\sigma_\theta} \right)^{m} \right] \qquad (2)$$
This is known as the Weibull three parameter strength distribution. The threshold stress parameter, σu, represents a minimum stress below which a test specimen will not break. The scale parameter or characteristic stress, σθ, is dependent on the stress configuration and test specimen size. The distribution shape parameter, m, is the Weibull modulus. This is the equation form that Weibull presented in his original publications, directly derived from weakest link theory . He was conservative and disclaimed any theoretical basis, not because of misgivings concerning extreme value or weakest link theory, which was well-established by then, but because he perceived it was “hopeless to expect a theoretical basis for distribution functions of random variables such as strength properties”. Since Weibull's initial publications in 1939 and 1951, however, the science of fracture mechanics has enabled determinations of quantitative functional relationships between strength and flaws in brittle materials, as exemplified in eq. (1).
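The three parameter distribution is straightforward to evaluate. The sketch below assumes the usual form Pf = 1 − exp{−[(σ − σu)/σθ]^m}, with Pf = 0 at or below the threshold stress; the function name and the numerical values are hypothetical.

```python
import math

def weibull_pf(sigma, sigma_theta, m, sigma_u=0.0):
    """Probability of failure at stress sigma for a three parameter Weibull distribution.

    sigma_u is the threshold stress; sigma_u = 0 gives the two parameter form.
    """
    if sigma <= sigma_u:
        return 0.0
    return 1.0 - math.exp(-(((sigma - sigma_u) / sigma_theta) ** m))

# Hypothetical parameters: characteristic stress 400 MPa, Weibull modulus 10.
for stress in (300.0, 350.0, 400.0, 450.0):
    print(f"{stress:5.0f} MPa -> Pf = {weibull_pf(stress, 400.0, 10.0):.3f}")
```

With σu = 0, the probability of failure at σ = σθ is 1 − 1/e, or about 63.2%, regardless of the value of m.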
By 1977, Jayatilaka and Trustrum used fracture mechanics to develop a general expression for the failure probability using several general flaw size distributions suggested by experimental work. They coupled these flaw size distributions with the fracture mechanics criterion, eq. (1), and integrated the risk of breakage over component volumes and then derived a number of different strength distributions [16,17]. Their derivations are too lengthy to repeat here and the reader is encouraged to review their exposition in the original references. They showed that the right side tail of many parent flaw size distributions such as shown in Fig. 1 often can be modeled by a simple power law function: $f(c) = \text{constant} \cdot c^{-n}$, where c is the flaw size. In such cases, the resulting strength distribution is the Weibull distribution with m = 2n − 2. Danzer et al. [18,19,20] have done similar derivations with more generalized flaw size distributions and have reached the same conclusions. Subsequent work, including painstaking measurement of flaw sizes and constructing distributions, has confirmed the power law function for the distribution of large crack sizes and hence the theoretical basis for the Weibull approach [21,22,23]. In other words, a reasonable power law distribution for large flaw sizes, classical fracture mechanics analysis and weakest link theory lead directly to the Weibull strength distribution.
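The link between a power law flaw size tail and the Weibull modulus can be checked with a small Monte Carlo sketch. It is not the derivation of the cited authors: it simply assumes that every flaw size follows f(c) ∝ c^(−n) above a minimum size c0, that each specimen fails from its largest flaw, and that eq. (1) has the form KIc = Y σf √c. Under those assumptions the strengths should follow a two parameter Weibull distribution with m = 2n − 2 and a characteristic strength that can be written in closed form. All numerical values and names are hypothetical.

```python
import math
import random

K_IC, Y, C0 = 4.0, 1.3, 5e-6   # hypothetical: MPa*sqrt(m), shape factor, minimum flaw size (m)
N_EXP = 6.0                    # hypothetical power law exponent n
FLAWS = 200                    # hypothetical number of flaws per specimen

def simulate_strengths(n_specimens, rng):
    """Weakest link strengths: each specimen fails from its largest flaw."""
    strengths = []
    for _ in range(n_specimens):
        # Inverse transform sampling of the power law tail; 1 - random() lies in (0, 1].
        c_max = max(C0 * (1.0 - rng.random()) ** (-1.0 / (N_EXP - 1.0))
                    for _ in range(FLAWS))
        strengths.append(K_IC / (Y * math.sqrt(c_max)))
    return strengths

m = 2.0 * N_EXP - 2.0                                            # predicted Weibull modulus
sigma_theta = K_IC / (Y * math.sqrt(C0)) * FLAWS ** (-1.0 / m)   # predicted characteristic strength

strengths = simulate_strengths(2000, random.Random(0))
for frac in (0.8, 0.9, 1.0, 1.1):
    sigma = frac * sigma_theta
    observed = sum(s <= sigma for s in strengths) / len(strengths)
    predicted = 1.0 - math.exp(-((sigma / sigma_theta) ** m))
    print(f"sigma = {sigma:6.1f} MPa  observed Pf = {observed:.3f}  Weibull Pf = {predicted:.3f}")
```

The observed failure fractions should track the Weibull prediction closely, with differences attributable to sampling scatter and to the finite number of flaws per specimen.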
Today's engineers routinely utilize Weibull statistics for characterizing failure of brittle materials. Numerous and diverse studies in the engineering literature report data in terms of Weibull parameters where the strengths are related to fractographically determined flaw types and sizes [24,25,26]. Such studies substantiate the existence of flaw distributions that lead to strength distributions that can be modeled by Weibull statistics for a wide range of materials. As noted earlier, fractographic examinations are becoming more common in relating flaw types and sizes to strengths of dental restoration materials [27,28,29]. While this shows that characterizing flaw populations in dental materials is possible, this does not prove it is easy or possible in every case. Many dental materials have rough microstructures, where it can be particularly difficult to identify fractographic features [30,31]. Also, it is not unusual to have multiple flaw types present which complicate the Weibull analysis.
Major standards organizations throughout the world have published specific guidelines for reporting ceramic strength in terms of Weibull parameters, including ASTM (C1239), the Japanese Industrial Standards Organization (JIS R1625), the European Committee for Standardization (CEN ENV 843), and the International Organization for Standardization (ISO 20501). These standards are very similar and use the identical Maximum Likelihood Estimation analysis to calculate the Weibull parameters. A newly revised (2008) Ceramic Material for Dentistry standard (ISO 6872) is the sole exception and uses a simpler linear regression calculation in an informative annex. Since many brittle materials used in structural engineering are also used in restorative dentistry, are Weibull analyses appropriate for ceramic dental materials? To answer this, the specific underlying assumptions and conditions inherent in applying Weibull statistics must be examined.
A good data set is required for any credible property determination, and it can be deduced from the previous paragraphs that diligence in test specimen preparation and testing procedures is particularly important when using Weibull statistics. Test specimens breaking from inconsistent machining or handling, or haphazard alignment, are not representative of a specific flaw population, and do not contribute to a valid data set for Weibull analysis. A materials advisory board committee in 1980  concluded that: “Ceramic strength data must meet stringent quality demands if they are to be used to determine the failure probability of a stressed component. Statistical fracture theory is based on the premise that specimen-to-specimen variability of strength is an intrinsic property of the ceramic, reflecting its flaw population and not unassignable measurement errors. Ceramic strength data must be essentially free of experimental error.” Even with meticulous test implementation, however, it is the stipulation of a single flaw population that seems to cause the most difficulty in using Weibull statistics in dental material strength testing.
In eq. (1), the parameter Y distinguishes different types of flaws as well as different test specimen test configurations. A blunt flaw, such as a pore, is more benign than a sharp flaw, such as a microcrack. A material under load may break from a sharp flaw but not break from a blunt flaw of a similar size. Suppose a material contains the two flaw types, small microcracks and large pores. Each flaw type has its own distribution. If all the test specimens break from microcracks, or all the test specimens break from pores, then either the microcrack size distribution or the pore size distribution will govern the strength distribution. If, however, some test specimens break from microcracks, and other test specimens break from pores, the strength distributions resulting from the different flaw populations overlap and an associated extreme value distribution cannot be modeled by one single flaw size distribution tail. In this case, the Weibull strength distribution would not be expected to appear smoothly continuous. If many test specimens are tested and enough of them break from either flaw population, parts of two distinct Weibull distributions may be discernable. Censored statistical analyses must be used in such cases . Bends or kinks in a Weibull distribution function are often indicative of fracture resulting from multiple flaw types.
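The bend or kink produced by two flaw populations can be illustrated with a simple model. The sketch assumes that both populations are present in every test specimen and act independently, so that the survival probabilities multiply and the individual risks (σ/σθ)^m add. This is only one possible way of combining populations, and the characteristic strengths and moduli assigned to "pores" and "microcracks" are hypothetical.

```python
import math

# Hypothetical (characteristic strength in MPa, Weibull modulus) for each flaw type.
PORES = (500.0, 5.0)
MICROCRACKS = (400.0, 20.0)

def risk(sigma, sigma_theta, m):
    """Risk of rupture of one two parameter Weibull population, (sigma / sigma_theta) ** m."""
    return (sigma / sigma_theta) ** m

def weibull_ordinate(sigma, pop_a=PORES, pop_b=MICROCRACKS):
    """ln ln[1 / (1 - Pf)] when either population may cause failure.

    Survival probabilities multiply, so 1 - Pf = exp(-(risk_a + risk_b)).
    """
    return math.log(risk(sigma, *pop_a) + risk(sigma, *pop_b))

def local_slope(sigma, rel_step=1e-4):
    """Slope of the Weibull plot at sigma, by central finite difference in ln(sigma)."""
    hi, lo = sigma * (1.0 + rel_step), sigma * (1.0 - rel_step)
    return (weibull_ordinate(hi) - weibull_ordinate(lo)) / (math.log(hi) - math.log(lo))

for stress in (200.0, 300.0, 380.0, 450.0):
    print(f"{stress:5.0f} MPa -> local slope {local_slope(stress):5.1f}")
```

In this sketch the plot has a slope near the lower modulus at low strengths and near the higher modulus at high strengths, so the combined data do not fall on a single straight line.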
Thus, lack of a good Weibull fit is suggestive of an inconsistent underlying flaw population, assuming the material was tested properly and failed in a brittle manner. Conversely, a good Weibull fit is sometimes taken as indicative of a single, dominant flaw type and confirmation of adequate care in testing procedures. Unless material familiarity and previous testing dictates otherwise, it is prudent to verify the cause of fracture initiation. This is often done fractographically and is encouraged, and in cases required, by the standards for Weibull analyses [32–35]. There even are guidelines and formal standards for fractographic analysis that have been prepared with Weibull analysis in mind [31,38].
Another consideration in using Weibull statistics is the increased number of test specimens that might be needed to characterize an entire strength distribution rather than simply estimate a mean strength value. The optimal number of test specimens depends on many variables, including material and testing costs, the values of the distribution parameters and the desired precision for an intended application. Helpful calculations and tables to make such decisions can be found in the previously cited standards. In the absence of specific requirements, a general rule-of-thumb is that approximately thirty test specimens provide adequate Weibull strength distribution parameters, with more test specimens contributing little towards better uncertainty estimates [39,40,41]. More information regarding optimal test specimen numbers, as well as reasons why Weibull distributions are so often observed in material testing practice are discussed by Danzer et al. . They also detail conditions necessary to obtain a Weibull distribution and suggest alternative statistical approaches for analyzing strength data for materials that do not satisfy these conditions, such as materials with unusual or highly mixed flaw distributions. In this sense, Weibull analysis can be regarded as a special, simple case of a broader statistical approach for analyzing strength data [18–20]. Indeed, Danzer now uses the expression “Weibull material” as one with a single flaw type whose size distribution fits a power law function on the right side tail.
The previous paragraphs highlight some of the assumptions and difficulties in utilizing Weibull statistics in characterizing strength measurements of dental materials. What are the benefits?
It was initially noted that extreme value statistics can be used to predict changes in distributions according to the physical size of the individual test specimens. This is one of the strongest virtues of the Weibull model and what distinguishes it from other distributions. In practical terms, this means that strength values for one test specimen size may be “scaled” to expected strength values for different sized test specimens. This strength scaling permits comparison of strengths of structures with stress gradients such as bend bars or flexed disks. Examples of strength scaling are shown later in this paper. So far we have assumed that all flaws in a body are exposed to the same tensile stresses, but in bodies with stress gradients, some large flaws may be in a low tensile stress or compression region and will not cause fracture. Fracture occurs where there is a critically loaded flaw that has a size, c, and shape factor Y, and a local stress, σ, combination such that the critical fracture toughness, KIc, is reached in accordance with eq. (1).* Figure 3 by Johnson and Tucker  illustrates how fracture origin sites may be scattered in a three point bend bar. For materials with low Weibull moduli (i.e., the flaw sizes are quite variable) fracture can occur from a large portion of the test specimen. On the other hand, the origin sites are concentrated to only the highest stressed regions if the Weibull modulus is large, since flaw sizes are all similar in size.
The validity of the Weibull approach can be tested by its ability to scale strengths for a particular specimen size to another size or testing configuration. A review of ceramic flexural strength data  tabulated a number of studies wherein ceramic strength scaling by Weibull analysis was successful over a size range as much as four orders of magnitude in volume or area! Strength scaling is done by the concept of an “equivalent” volume or area under stress, discussed in a following section. Whether equivalent volumes or equivalent areas are used depends on whether fracture initiates from volume flaws or surface flaws within the stressed region.† Using Weibull statistics to calculate corresponding strengths for different test specimen sizes, test specimen shapes and stress configurations is particularly appealing for use in the dental field where sample sizes and testing configurations are frequently different.
Identification and quantification of different flaw populations can explain strength differences and ultimately lead to improved materials. Such approaches can, for example, isolate the effects of grinding media and surface treatments, or determine whether strength might be improved by reducing porosity. Most important of all, determining the dominant flaw populations and the effects of stress configuration and physical size is necessary for correlations to clinical components.
As noted above in eq. (2), three parameters were used to define the previously introduced extreme value functions, but only two Weibull parameters are generally reported for strengths. The two-parameter Weibull distribution is obtained by setting σu = 0, although the three parameter form is not uncommon. When the three parameter Weibull distribution is used, the lower strength bound, σu, might represent a lower bound strength limit for a data set. This is analogous to a data set that may have been previously proof-tested or inspected to eliminate flaws over a certain size. The lower bound strength could also correspond to a physical limit to a crack size. It is very risky to assume a finite threshold strength exists without careful screening or nondestructive evaluation. Hence, the two-parameter form is most commonly used for simplicity.
Setting σu to zero in eq. (2) and taking the double logarithm of the resulting two-parameter Weibull distribution yields:

$$\ln \ln \left( \frac{1}{1 - P_f} \right) = m \ln \sigma - m \ln \sigma_\theta \qquad (3)$$
The reason the double logarithm of the Weibull equation is used in strength analysis is the ease of accessing information. Appendix A shows how eq. (3) yields the Weibull parameters in a simple graphical representation of the data with a slope of the Weibull modulus, m, and the y-intercept at Pf = .632 (i.e., 63.2 %) associated with σθ, the Weibull characteristic strength. Figure 4 shows an example for alumina. One can perform a simple linear regression analysis to get the Weibull parameters, but many analysts prefer the maximum likelihood estimation approach as discussed in the Appendix. The characteristic strength, σθ, is a location parameter; a large σθ shifts the data to the right, while a small σθ shifts the data to the left. The characteristic strength is the strength value, σ, at Pf = 63.2%, when the left side of eq.(3) = 0. Thus reported Weibull characteristic strength values (Pf = 63.2%) are slightly greater than the mean strength values (Pf = 50%).
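The relation between the characteristic strength and the central values of the distribution can be checked directly. For a two parameter Weibull distribution the mean is σθ Γ(1 + 1/m) and the median, the strength at Pf = 50%, is σθ (ln 2)^(1/m); for m > 1 both fall below σθ. The values of σθ and m in the sketch are hypothetical.

```python
import math

def weibull_mean(sigma_theta, m):
    """Mean strength of a two parameter Weibull distribution."""
    return sigma_theta * math.gamma(1.0 + 1.0 / m)

def weibull_median(sigma_theta, m):
    """Strength at Pf = 50% for a two parameter Weibull distribution."""
    return sigma_theta * math.log(2.0) ** (1.0 / m)

SIGMA_THETA = 400.0   # hypothetical characteristic strength, MPa
for m in (5.0, 10.0, 20.0):
    print(f"m = {m:4.0f}: mean = {weibull_mean(SIGMA_THETA, m):6.1f} MPa, "
          f"median = {weibull_median(SIGMA_THETA, m):6.1f} MPa")
```

The gap between the characteristic strength and the mean narrows as the Weibull modulus increases, that is, as the strength distribution becomes narrower.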
The double logarithm of the reciprocal of (1 − Pf) on the left side of eq. (3) explains the unusual interval spacing of the Pf labels on the ordinate (y) axis of the Weibull graph. Two examples are used in this section to illustrate the previous concepts. The example in Appendix A also illustrates some of these topics.
This first example illustrates strength differences in polycrystalline alumina bars tested in 3-point and 4-point flexure. The bar sizes were all 3 mm × 4 mm × 50 mm. The three-point flexure test specimens had 40 mm outer spans, and the four-point flexure test specimens had 40 mm outer spans and 20 mm inner spans. Stresses were calculated by the formulae:

$$\sigma_{3\text{-pt}} = \frac{3PL}{2bd^{2}}, \qquad \sigma_{4\text{-pt},\ 1/4\text{-point}} = \frac{3PL}{4bd^{2}}$$
where P is the break load, L is the outer span length, b is the test specimen width, and d is the test specimen height. The ¼-point qualifier for the four-point configuration means that the inner loading rollers are located inward by ¼ L from the outer loading rollers.
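As a numerical illustration, the sketch below evaluates the standard beam formulae for three-point and ¼-point four-point flexure of a rectangular bar, using the bar and fixture dimensions given above. The break load of 300 N is a hypothetical value, not a measurement from this study.

```python
def stress_3pt(p, l, b, d):
    """Maximum flexural stress in three-point bending: 3PL / (2 b d^2)."""
    return 3.0 * p * l / (2.0 * b * d ** 2)

def stress_4pt_quarter(p, l, b, d):
    """Maximum flexural stress in 1/4-point four-point bending: 3PL / (4 b d^2)."""
    return 3.0 * p * l / (4.0 * b * d ** 2)

# Loads in N and lengths in mm give stresses in MPa (N/mm^2).
P = 300.0                   # hypothetical break load, N
L, B, D = 40.0, 4.0, 3.0    # outer span, width and height from the text, mm

print("3-point stress:", stress_3pt(P, L, B, D), "MPa")
print("4-point stress:", stress_4pt_quarter(P, L, B, D), "MPa")
```

For the same load and outer span, the maximum stress in the ¼-point four-point configuration is half that in three-point flexure.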
For each of the two test configurations, the stress values were ranked in ascending order, i = 1, 2, 3, …, N, where N is the total number of test specimens and i is the ith datum. Thus, the lowest stress for each configuration represents the first value (i = 1), the next lowest stress value is the second datum (i = 2), etc., and the highest stress is represented by the Nth datum. This enables a ranked probability of failure, Pf(σi), to be assigned to each datum according to:

$$P_f(\sigma_i) = \frac{i - 0.5}{N}$$
Since the fracture stress and the associated Pf for each datum are now known, a graph may be constructed using the left side of eq. (3) on the ordinate and ln (σ) on the abscissa. This comprises Fig. 4, where the 3-point flexure bar data (squares) are at the right, at higher strengths, than the 4-point flexure bar data (circles). There are two common approaches to fit a line through the data: linear regression analysis and maximum likelihood estimation analysis [45,46,47,48]. The pros and cons of each analysis are described in more detail in the Appendix, but, as mentioned above, most world standards use the latter. The MLE analysis was used for the Weibull parameter estimates in Fig. 4 and Table 1. The strength difference for the same material seems large, but this is to be expected according to weakest link theory, as more material is under higher stress in the 4-point configuration, with a higher probability of containing a larger flaw.
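The ranking and plotting steps can be sketched in a few lines using the linear regression approach. The sketch assumes the commonly used probability estimator Pf = (i − 0.5)/N and fits a straight line to ln ln[1/(1 − Pf)] versus ln σ by ordinary least squares; it is not the maximum likelihood procedure of the standards. The strength values are synthetic, generated from a known Weibull distribution, and the function name `weibull_fit_lr` is illustrative.

```python
import math
import random

def weibull_fit_lr(strengths):
    """Estimate (m, sigma_theta) by linear regression on the Weibull plot."""
    data = sorted(strengths)
    n = len(data)
    xs = [math.log(s) for s in data]
    # Ranked probability of failure, assumed estimator Pf = (i - 0.5) / N.
    ys = [math.log(math.log(1.0 / (1.0 - (i - 0.5) / n))) for i in range(1, n + 1)]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    m = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - m * x_bar              # equals -m * ln(sigma_theta)
    return m, math.exp(-intercept / m)

# Synthetic data: 30 strengths from a Weibull distribution with
# characteristic strength 400 MPa and modulus 10 (hypothetical values).
rng = random.Random(1)
sample = [rng.weibullvariate(400.0, 10.0) for _ in range(30)]
m_hat, sigma_theta_hat = weibull_fit_lr(sample)
print(f"m = {m_hat:.1f}, characteristic strength = {sigma_theta_hat:.0f} MPa")
```

With only 30 test specimens the estimated modulus can differ noticeably from the value used to generate the data, which is one reason confidence bounds are normally reported with the parameter estimates.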
It can be seen from Fig. 4 that the slopes (Weibull moduli) indicate the strength distribution widths. The similar slopes suggest that the same flaw types were active in both specimen sets and this indeed was verified by fractographic analysis. The wiggles in the curves are not unusual and are common in small size sample sets. A high modulus, or steep slope, is associated with a narrow strength distribution. This is usually desirable, as materials with high Weibull moduli are more predictable and less likely to break at a stress much lower than a mean value. The characteristic strength, or Weibull scale parameter σθ, indicates the distribution location along the abscissa (x) axis, and is expected to move according to test specimen size or the amount of material that is highly stressed. Thus, the distributions for 3- and 4-point flexure are expected to have the same shape (m value), but move to the right (higher σθ value) as test specimen sizes or stressed volumes decrease. This is directly analogous to the “march” of the “largest flaw” distribution presented in a previous section. It was also stated in previous sections that such measured strength changes due to differences in test specimen size and configuration can be quantitatively predicted using Weibull parameters. This can be accomplished through the concept of equivalent areas or volumes.
Fractographic analysis of the flexure test specimens graphed in Fig. 4 determined the predominant flaw type was volume-distributed porosity or agglomerates associated with porosity. Inclusions caused fracture in some test specimens, and the flaw mix probably contributed to some of the wiggles in the Fig. 4 data. Since the flaws were volume distributed, we will compare the two data sets using an equivalent volume approach. In the Weibull weakest link model, the size/strength relationship can be expressed [43,46,47,48]:

$$\frac{\sigma_1}{\sigma_2} = \left( \frac{V_{E2}}{V_{E1}} \right)^{1/m} \qquad (5)$$
where m is the Weibull modulus, σ1 and σ2 are the mean (or median or characteristic) strengths of test specimens of type 1 and 2 which may have different sizes and stress distributions, and VE1 and VE2 are the associated effective volumes. (Effective areas may be substituted for the effective volumes in eq. (5) for surface flaws, such as machining damage.) A unimodal flaw population that is uniformly distributed throughout the volume and a Weibull two parameter distribution are assumed.
The effective volume approach is illustrated in Fig. 5. In the simplest case of direct uniform tension, VE is the test specimen volume, V. Many test specimens or components, such as flexurally loaded rods or bars, have stress gradients, so that VE < V. Sometimes the relationship between the two is expressed as VE = KV, where K is a dimensionless loading factor and V is the total volume within the outer loading points. As shown in Fig. 5, VE is the volume of a hypothetical tensile test specimen which, when subjected to the stress σmax, has the same probability of fracture as the flexure test specimen stressed at σmax. In other words, a flexure bar of volume V is equivalent to a tensile test specimen of size VE. K is 1 for an ideally loaded tension specimen and is typically less than 1 for parts or test specimens that have stress gradients, i.e., where the stress varies with position within the body.
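Equations for effective volumes and effective areas may be determined from knowledge of the stress state, or looked up in the literature for common configurations, such as flexure of rectangular bars or round rods. For the flexure bars in Fig. 4, the effective volumes can be calculated:
\[ V_{E,\text{3-pt}} = \frac{L_o b d}{2 (m+1)^2} \quad \text{(three-point flexure)} \]
\[ V_{E,\text{4-pt}} = \frac{L_o b d \, (m+2)}{4 (m+1)^2} \quad \text{(¼-point, four-point flexure)} \]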
where Lo is the outer span length, b is the test specimen width, d is the test specimen height, and m is the Weibull modulus. In other words, the effective volumes are equal to the specimen volume (Lobd) within the loading span multiplied by a dimensionless term including the Weibull modulus. The latter term takes into account the stress gradient, but also reflects the influence of the variability in flaw sizes as illustrated in Fig. 4. The portions of the test specimen that lie beyond the fixture outer span are unstressed and do not contribute to the effective volumes. The same is true of the portions stressed in compression. In our example, the flexure bars all have the same outer span length of 40 mm, and same height and width of 3 mm and 4 mm, respectively. The ratio of effective volumes is thus:
\[ \frac{V_{E,\text{4-pt}}}{V_{E,\text{3-pt}}} = \frac{m+2}{2} \]
This can be substituted into eq. (5) to yield a simple expression for determining the expected strength ratio for the 3- and 4-point flexure test configurations:
\[ \frac{\sigma_{\text{3-pt}}}{\sigma_{\text{4-pt}}} = \left( \frac{m+2}{2} \right)^{1/m} \]
Using an average m of 9.2, the three-point strengths should be 1.206 times the four-point strengths, or about 21% higher, in excellent agreement with the experimentally determined ratio of 1.216. This example demonstrates how the Weibull function can be utilized to predict the scaling of strengths to other configurations.
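As a numerical check on the prediction above, the following Python sketch evaluates the two effective volumes and the resulting strength ratio. It assumes the standard effective-volume expressions for rectangular bars in three-point and ¼-point four-point flexure; the function names are illustrative and are not taken from the paper.

```python
def effective_volume_3pt(span, width, height, m):
    """Effective volume, three-point flexure of a rectangular bar (assumed standard form)."""
    return span * width * height / (2.0 * (m + 1.0) ** 2)


def effective_volume_4pt_quarter(outer_span, width, height, m):
    """Effective volume, 1/4-point four-point flexure (assumed standard form)."""
    return outer_span * width * height * (m + 2.0) / (4.0 * (m + 1.0) ** 2)


def strength_ratio(ve_1, ve_2, m):
    """Weakest-link scaling: sigma_1 / sigma_2 = (VE_2 / VE_1) ** (1 / m)."""
    return (ve_2 / ve_1) ** (1.0 / m)


m = 9.2                      # average Weibull modulus quoted in the text
span, b, d = 40.0, 4.0, 3.0  # mm: outer span, width, height from the example

ve3 = effective_volume_3pt(span, b, d, m)
ve4 = effective_volume_4pt_quarter(span, b, d, m)
ratio_3pt_to_4pt = strength_ratio(ve3, ve4, m)

print(f"VE(3-pt) = {ve3:.2f} mm^3, VE(4-pt) = {ve4:.2f} mm^3")
print(f"predicted sigma_3pt / sigma_4pt = {ratio_3pt_to_4pt:.3f}")
```

Because the bar dimensions and outer span are identical in both configurations, they cancel in the ratio, and only the Weibull modulus affects the predicted scaling.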
In this second example, the strengths of two different porcelains, Porcelain 1 and Porcelain 2, are compared. Both are feldspathic porcelains containing well-dispersed crystallites of similar sizes. The porcelains differ, however, in crystalline volume content and crystalline phases, and were obtained from different manufacturers. Porcelain 1 and Porcelain 2 both had cross-sections of 3 mm × 4 mm and both were tested in ¼-point, 4-point flexure. The Porcelain 2 test specimens were shorter than the Porcelain 1 test specimens, however, and were tested using shorter spans. Porcelain 2 was tested with a 20 mm outer span and 10 mm inner span. The longer Porcelain 1 test specimens were tested with the same fixture design, but with a 40 mm outer span and 20 mm inner span.
As in the previous example, the two data sets were each ranked such that each datum could be assigned a failure probability and then graphed using eq. (3). The results are shown in Fig. 6 and Table 1. The slopes are very similar (18.5 and 18.0), and the shorter test specimens had higher strengths, as would be expected from weakest link theory. Are the test specimens truly comparable in terms of strength? If Porcelain 2 were tested in 40 mm × 20 mm fixtures instead of the smaller 20 mm × 10 mm fixtures, would the strengths be similar to Porcelain 1?
Fractographic examination of the materials plotted in Fig. 6 indicated that the two porcelains generally failed from intrinsic flaws that were volume distributed. Once again we return to eq. (5). In this case we need the effective volume ratio for longer and shorter test specimens tested in 4-point flexure. Again using the equation for ¼-point, 4-point flexure:
\[ V_E = \frac{L_o b d \, (m+2)}{4 (m+1)^2} \]
Since all the quantities in the previous equation are the same for the two porcelains except the span lengths, the effective volume ratio is simply:
\[ \frac{V_{E,40\,\text{mm}}}{V_{E,20\,\text{mm}}} = \frac{40\ \text{mm}}{20\ \text{mm}} = 2 \]
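The Weibull modulus of Porcelain 2 is 18.0. Thus, from eq. (5):
\[ \frac{\sigma_{40\,\text{mm}}}{\sigma_{20\,\text{mm}}} = \left( \frac{V_{E,20\,\text{mm}}}{V_{E,40\,\text{mm}}} \right)^{1/m} = \left( \frac{1}{2} \right)^{1/18.0} = 0.962 \]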
The expected strength of Porcelain 2, if the test specimens were longer and tested with 40 mm spans, is only 4% less. The difference is small because the Weibull modulus is large, so a twofold change in effective volume has only a minor effect on strength. Although Weibull scaling predicts a 4% difference in strength if Porcelain 2 is tested with longer spans, the measured difference in strengths of the two porcelains is much greater, about 25%. Utilizing the tables in Ref.  and ASTM C1239, the high moduli and adequate numbers of test specimens for both configurations result in 90% confidence bounds that are sufficiently narrow to indicate that the porcelain strengths are statistically significantly different. Porcelain 2 has a higher calculated strength than Porcelain 1 for a similar test configuration.
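The span-length scaling for Porcelain 2 can be reproduced in a few lines. This is a minimal sketch that assumes, as the text states, that only the outer span differs between the two configurations, so the effective volume ratio equals the span ratio; the function and variable names are illustrative.

```python
def span_scaled_strength_ratio(new_span, ref_span, m):
    """Ratio sigma(new_span) / sigma(ref_span) for the same material, cross-section
    and loading geometry, where effective volume is proportional to outer span."""
    ve_ratio = new_span / ref_span        # VE(new) / VE(ref)
    return (1.0 / ve_ratio) ** (1.0 / m)  # weakest-link scaling


m_porcelain_2 = 18.0  # Weibull modulus of Porcelain 2 from the text
ratio_40_to_20 = span_scaled_strength_ratio(40.0, 20.0, m_porcelain_2)
percent_change = 100.0 * (ratio_40_to_20 - 1.0)

print(f"sigma(40 mm) / sigma(20 mm) = {ratio_40_to_20:.3f}")
print(f"expected change in strength = {percent_change:.1f} %")
```

Rerunning the function with a lower modulus, for example m = 5, shows how much more sensitive a material with a broad strength distribution is to the same doubling of span.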
Figure 7 shows an example where a single Weibull distribution is a poor fit. Thirty-nine commercial 3Y-TZP zirconia bend bars of size 3 mm × 4 mm × 45 mm were tested on 20 mm × 40 mm four-point fixtures. Fractographic analysis was done on every test specimen and revealed that most of the strength limiting flaws were volume-distributed pores between 10 μm and 20 μm in diameter. The six weakest specimens broke from unusually large flaws such as compositional inhomogeneities, inclusions, and gross pores, so it is not surprising that the strength trend is irregular. A proper analysis of this data set would require the use of censored statistical analyses as described in . In this example, the low strength tail of the distribution was readily apparent since a large number of test specimens were available. Had only ten or fifteen specimens been broken, then it is possible that only one or two weak specimens would have been revealed and the low strength problem not detected.
The previous examples demonstrate some of the problems and advantages of using Weibull statistics. A single flaw population is assumed and should be verified, test specimen numbers should be sufficient to determine the Weibull parameters within acceptable confidence bounds, and the calculations and results are more cumbersome than simple determination of a mean stress value. On the other hand, it was possible to quantitatively compare expected strengths for the different materials and test configurations, as well as peruse the plots for an idea of the comparative distribution widths.
As a final note, it should be mentioned that all the test specimens were tested using self-aligning fixtures with rollers, such as shown in Fig. 8. The elastic bands in the figure apply enough force to keep the rollers in place while allowing them to rotate when a flexure load is applied. The allowed rotation is quite important, for the supports and load points will be subject to a frictional force if the rollers are not free to roll. The errors due to friction are significant, but almost always ignored in the dental literature, where frictionless load points are assumed in the calculations. Experimental differences in failure stress of more than 11% have been measured between rigid knife edges and roller-type contact points [51,52,53]. The frictional force prevents the load points from rolling apart, and superimposes a compressive force on the tensile face of the test specimen. Thus, the error results in apparently “stronger” test specimens than would result from rolling supports. Significant errors can also result from misalignment, especially if the bars are not of constant geometry or flat and parallel [43,54,55]. No amount of statistical manipulation can compensate for indiscriminate test procedures.
There is a strong theoretical foundation for Weibull statistical analysis of strength data based on extreme value theory, fracture mechanics, and demonstrated flaw size distributions. However, awareness of the conditions and limitations inherent in Weibull analyses, especially those pertaining to existing flaw populations and quality of data, is important for meaningful application and interpretation. A great deal of useful information is available through Weibull analysis. Among the most useful applications are comparisons of strength values and ranges for different stress configurations, which were demonstrated in the experimental section for several dental materials.
This report was made possible by a grant from NIH, R01-DE17983, and the people and facilities at the National Institute of Standards and Technology and the ADAF Paffenbarger Research Center.
The following example utilizes fictitious data in order to demonstrate how a Weibull strength distribution graph is prepared. Suppose that five test specimens produce strength outcomes of: 255 MPa, 300 MPa, 330 MPa, 295 MPa, and 315 MPa. More than 5 data are advisable for most conditions and this small sample set is for illustrative purposes only. The first step is to order the data from lowest to highest strengths as shown in the second column of Table A1.
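| i | strength (MPa) | X = ln (strength) | Pf = (i − 0.5)/n | Y = ln ln [1/(1 − Pf)] |
| 1 | 255 | 5.541 | 0.1 | −2.250 |
| 2 | 295 | 5.687 | 0.3 | −1.031 |
| 3 | 300 | 5.704 | 0.5 | −0.367 |
| 4 | 315 | 5.753 | 0.7 | 0.186 |
| 5 | 330 | 5.799 | 0.9 | 0.834 |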
The natural logarithms of the stresses are computed and shown in the third column. These values will be plotted along the horizontal axis of a Weibull graph.
Next, a cumulative probability of fracture, Pf, is estimated and assigned to each datum. A commonly used estimator that has low bias when used with linear regression analyses is Pf = (i − 0.5) / n, where i is the ith datum and n is the total number of data points. This is the estimator used in the main text. Many studies [e.g., A1–A4, 45] have shown that for n > 20, this estimator produces the least biased estimates of the Weibull parameters. Using the estimator for the first (i = 1) data point out of n = 5 total points, Pf is estimated to be 0.10 or 10% as shown in the fourth column of Table A1. This means that if many test specimens were broken, it is estimated that 10% of all outcomes would be weaker than the specimen that broke at a stress of 255 MPa. 90% would be stronger.
The next step is to compute the double natural logarithm of [1 / (1−Pf)] in accordance with eq. (3) in the main body of the paper. This is listed in the last column of the table.
A graph is prepared with X = ln (strength) plotted on the horizontal axis, and Y = ln ln [1/(1−Pf)] on the vertical axis. Figure A1 shows a graph with these two axes shown on the right and top sides. For convenience, the axes are often labeled with the values of fracture stress and Pf, as shown along the left and bottom sides of the graph. Note how the values of these parameters are not simply and evenly distributed along the axes, but stretched according to the logarithmic and double logarithmic functions.
Finally, a line is fitted through the data. Linear regression (LR) analysis is commonly used since it is the easiest to understand and can be done with a hand calculator, a simple spreadsheet, or many common graphics software programs. The usual procedure is to regress the ln ln [1/(1−Pf)] values onto ln (fracture stress), or in other words, to minimize the vertical deviations in the graph. The slope of the line is the Weibull modulus, m. The characteristic strength, σθ, is the value of stress for which ln ln [1/(1−Pf)] is zero, or Pf = 63.2%. It is analogous to the median strength, except that the latter is at Pf = 50%. The Weibull modulus and characteristic strength from the linear regression analysis are shown on the right of the line in Fig. A1. (Users should be cautioned that some algorithms and computer programs regress the opposite way, so that horizontal deviations are minimized. Different Weibull parameter estimates are then obtained.)
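The steps described so far (ranking, assigning Pf, transforming, and fitting a line) can be sketched in plain Python. This is a minimal illustration using the five fictitious strengths above and an ordinary least-squares fit that minimizes vertical deviations; the helper names are illustrative and no confidence bounds are computed.

```python
import math

strengths = [255.0, 300.0, 330.0, 295.0, 315.0]  # MPa, fictitious data from the text


def weibull_plot_points(data):
    """Return rows (i, strength, X, Pf, Y) using the Pf = (i - 0.5)/n estimator."""
    ordered = sorted(data)
    n = len(ordered)
    rows = []
    for i, s in enumerate(ordered, start=1):
        pf = (i - 0.5) / n
        x = math.log(s)
        y = math.log(math.log(1.0 / (1.0 - pf)))
        rows.append((i, s, x, pf, y))
    return rows


def fit_line(xs, ys):
    """Least-squares regression of y on x (vertical deviations minimized)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


rows = weibull_plot_points(strengths)
for i, s, x, pf, y in rows:
    print(f"{i}  {s:5.0f}  {x:.3f}  {pf:.1f}  {y:+.3f}")

m_lr, intercept = fit_line([r[2] for r in rows], [r[4] for r in rows])
# Y = m*X - m*ln(sigma_theta), so Y = 0 where ln(sigma) = -intercept/m
sigma_theta_lr = math.exp(-intercept / m_lr)
print(f"Weibull modulus m = {m_lr:.1f}, characteristic strength = {sigma_theta_lr:.0f} MPa")
```

Swapping the arguments of `fit_line` would regress X on Y instead, which is the opposite convention that the text cautions about and generally gives different parameter estimates.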
The Weibull modulus and characteristic strength from linear regression analysis are adequate for many cases, but it should be borne in mind that these are estimates. The confidence bounds or uncertainties on these estimates may be obtained from the literature. In general, estimates of the characteristic strength quickly converge to population values as the number of specimens is increased to ten or more. On the other hand, Weibull modulus estimates can be quite variable for a sample set with only a small number of test specimens or if the data do not fall on a single line. Therefore, it is common to require no fewer than ten test specimens and preferably thirty to obtain good estimates of the Weibull modulus.
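The sensitivity of the modulus estimate to sample size can be illustrated with a small Monte Carlo experiment. The sketch below is an illustration only: the parent distribution (m = 10, σθ = 300 MPa), the number of trials, and the random seed are arbitrary assumptions, and the fit is the simple linear regression with Pf = (i − 0.5)/n described above.

```python
import math
import random
import statistics


def lr_modulus(sample):
    """Weibull modulus from linear regression with Pf = (i - 0.5)/n."""
    ordered = sorted(sample)
    n = len(ordered)
    xs = [math.log(s) for s in ordered]
    ys = [math.log(-math.log(1.0 - (i - 0.5) / n)) for i in range(1, n + 1)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx


def simulate(n, true_m=10.0, true_scale=300.0, trials=2000, seed=1):
    """Mean and standard deviation of the modulus estimate over many
    simulated sample sets of size n drawn from an assumed parent distribution."""
    rng = random.Random(seed)
    estimates = [
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape
        lr_modulus([rng.weibullvariate(true_scale, true_m) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.mean(estimates), statistics.stdev(estimates)


for n in (5, 10, 30):
    mean_m, sd_m = simulate(n)
    print(f"n = {n:2d}: mean m estimate = {mean_m:.2f}, scatter (1 s.d.) = {sd_m:.2f}")
```

The scatter in the estimated modulus shrinks as n grows, which is the behavior behind the recommendation of at least ten and preferably thirty test specimens.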
The literature includes many papers suggesting new and improved Pf estimators for linear regression analyses. There are far too many to list here. Usually Monte Carlo simulations with an assumed Weibull distribution generate many small data sets which are analyzed in turn by the chosen linear regression scheme. Scatter in the computed Weibull parameters as well as bias trends (the parameters on average do not match the assumed parent distribution) are analyzed and compared to results when using the usual Pf = (i − 0.5) / n estimator. Ideally, the results should have low scatter (tight confidence bounds) and negligible bias. Some of the proposed estimators are quite implausible, however. For example, one study suggested the use of Pf = (i − 0.999)/(n + 1000) [A5]. So for a set of 30 specimens, this estimator suggests the first strength datum corresponds to a Pf = 1.0 × 10⁻⁴ %, and, for the last datum, a Pf = 2.8 %. It is unreasonable to assume that the weakest data point of a set of 30 specimens gives useful information about a probability of fracture of about one part in a million. With the traditional Pf = (i − 0.5) / n, one obtains far more plausible estimates of 1.7 % and 98.3 %. These numbers mean that one might expect only 1.7 % of additional test strengths would be weaker than the first datum and 98.3 % would be weaker than the strongest recorded test outcome. Two subsequent papers showed that dramatic correction factors for bias had to be used when using the Pf = (i − 0.999)/(n + 1000) estimator [A6,A7]. For most Weibull analyses, it is not necessary to resort to such exotic probability estimators and, as stated above, leading researchers [45–49, A1–A4] have concluded that for n > 20, the Pf = (i − 0.5) / n estimator gives parameter estimates with small bias and reasonable confidence limits. Users should be cautious with sample sets smaller than 10, however, since bias in the Weibull modulus can be 5% or more [45, A1–A4,A8,A9].
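The arithmetic behind the comparison of the two estimators is easy to verify. The short sketch below evaluates both for the first and last datum of a 30-specimen set; the function names are illustrative.

```python
def pf_traditional(i, n):
    """Usual estimator, Pf = (i - 0.5) / n."""
    return (i - 0.5) / n


def pf_proposed(i, n):
    """Estimator suggested in one study, Pf = (i - 0.999) / (n + 1000)."""
    return (i - 0.999) / (n + 1000)


n = 30
for name, estimator in (("traditional", pf_traditional), ("proposed", pf_proposed)):
    first = 100.0 * estimator(1, n)  # weakest specimen, in percent
    last = 100.0 * estimator(n, n)   # strongest specimen, in percent
    print(f"{name:12s} first datum: {first:.2e} %   last datum: {last:.2f} %")
```

With the second estimator the entire data set is assigned to the lowest few percent of the distribution, which is why large bias corrections become necessary.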
An important and common alternative analysis to fit the data is the Maximum Likelihood Estimation (MLE) approach. It is a more advanced analysis that is preferred by many statisticians since the 90 % or 95 % confidence intervals on the estimates of the Weibull parameters are appreciably tighter than those from linear regression [42,45,A4,A5,A9]. Furthermore, it is not necessary to use a probability estimator for Pf. For these reasons, MLE is incorporated into the comprehensive Weibull standards for analysis of strength data [32–35] which all give identical Weibull parameter estimates and confidence bounds for a particular data set. MLE analysis is strongly preferred for design. MLE analysis, which is explained well in Ref. , estimates the Weibull parameters by maximizing a likelihood function. The MLE analysis is usually described in mathematical terms (e.g., [45–48]), but a simplified text description of how it works is as follows. A first estimate, or “guess,” is made of the Weibull distribution and, for each actual test strength outcome, a probability of occurrence is calculated. For a given test set of, say, n = 30 strengths, the probabilities are summed. Another slightly different Weibull distribution, with different modulus and characteristic strength, is then tried and the probabilities are also summed. The Weibull parameters are iteratively adjusted until the optimum, or “most likely,” parameters are found to fit the actual test data. This iterative analysis is typically done with a computer program.* The MLE estimate for the characteristic strength has negligible bias, but a small correction factor is usually applied to correct or “unbias” the Weibull modulus estimate [32–35, 41]. Users of MLE programs should check whether or not the calculated moduli are corrected for bias. An MLE fitted line is also shown in Fig. A1. In this example, the Weibull MLE and LR parameter estimates are similar.
This is commonly the case for the characteristic strength, but MLE and LR estimates of the Weibull modulus usually differ. Linear regression analyses usually “chase the lowest strength data points” whereas MLE seems to “chase the highest strength data points”. One might ask: which is better? The answer is that each gives reasonable estimates of the Weibull modulus, but, since the confidence intervals for the MLE estimates are tighter, statisticians and designers prefer MLE. For more details on the MLE analysis, the reader should consult Ref.  or  or the Weibull strength standards [32–35]. With the sole exception of the short annex in the dental standard ISO 6872:2008 (which has no information on confidence bounds), the standards all specify strength data analysis by MLE and include instructions on how to determine the confidence bounds.
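For readers who want to see the mechanics, the following is a minimal sketch of a two-parameter Weibull MLE fit using only the Python standard library. It solves the usual likelihood equation for m by bisection and then evaluates σθ in closed form. It is not the procedure of any particular standard, it applies no bias correction to the modulus, and it computes no confidence bounds; the function name and bracket limits are assumptions made for illustration.

```python
import math


def weibull_mle(strengths, iterations=200):
    """Uncorrected maximum likelihood estimates (m, sigma_theta) for a
    two-parameter Weibull distribution. Assumes the strengths are not all equal."""
    n = len(strengths)
    s_max = max(strengths)
    r = [s / s_max for s in strengths]  # normalize so r**m cannot overflow
    log_r = [math.log(v) for v in r]
    mean_log = sum(log_r) / n

    def score(m):
        # Zero of this function is the maximum likelihood estimate of m
        weights = [v ** m for v in r]
        weighted = sum(w * lr for w, lr in zip(weights, log_r)) / sum(weights)
        return weighted - 1.0 / m - mean_log

    lo, hi = 1e-3, 1e3  # assumed bracket; score(m) increases with m
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    m_hat = 0.5 * (lo + hi)
    sigma_theta = s_max * (sum(v ** m_hat for v in r) / n) ** (1.0 / m_hat)
    return m_hat, sigma_theta


m_mle, sigma_theta_mle = weibull_mle([255.0, 300.0, 330.0, 295.0, 315.0])
print(f"MLE (no bias correction): m = {m_mle:.1f}, sigma_theta = {sigma_theta_mle:.0f} MPa")
```

Bisection is used here only because it is easy to follow; production programs typically use faster root-finding schemes and then apply the unbiasing factor for the modulus tabulated in the strength standards.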
There are many reasons why strength data may deviate from a straight line when plotted as shown above. A non-zero threshold strength may cause the trend to curve downward at lower strengths. Bends or wiggles in the trend may be a consequence of small sample sizes (e.g., for n ≤ 10) or may be manifestations of multiple flaw populations. More advanced analyses for bimodal strength distributions are available (e.g., censored statistical analysis as specified in ASTM C 1239 or ISO 20501). Fractographic analysis may help determine the cause of bends or wiggles in a data set.
*For simplicity, we ignore rising R-curve behavior whereby fracture resistance changes with crack size and we also ignore environmentally assisted slow crack growth.
†In rare instances, strength is controlled by edge origin sites, and strength scales with the effective edge lengths.
*Most programs use a more efficient scheme for the actual calculation. The likelihood function is the mathematical product of the probability density function values for a series of experimental strength values. This product (actually the natural logarithm of the product, for mathematical convenience) is differentiated once with respect to m and once with respect to σθ. The two derivatives are set equal to zero to find the maximum, i.e., the maximum likelihood. The two resulting nonlinear equations are then solved iteratively to obtain the maximum likelihood parameter estimates.