Yoccoz et al. (2001)
note that many monitoring programmes either ignore or deal ineffectively with two primary sources of variation in monitoring data: spatial variation and detectability.
Biodiversity, and trends in biodiversity, can vary enormously between locations (reflecting differing habitats, land uses, climates, etc.), so that monitoring programmes should be designed to take account of this spatial variation. This is especially critical if biodiversity is to be monitored at a regional or global (as distinct from site) level. Too often, monitoring programmes are conducted at unrepresentative sites (sometimes called ‘sentinel sites’) and conclusions generalized to the landscape as a whole. For example, the British Butterfly Monitoring Scheme (BMS; Pollard & Yates 1993
) uses transects at sites chosen because they are suitable for butterflies. Such sites are often protected reserves, or atypical in other respects, and trends in abundance may be unrepresentative of what is happening in the wider countryside. Another example is the North American Breeding Bird Survey (NA BBS; Droege 1990
; Link & Sauer 1997
), which is conducted along roads and tracks where habitats are unlikely to be typical of the area as a whole; these habitats may be avoided by some species (Reijnen et al. 1995).
Until recently, the main United Kingdom breeding bird monitoring programme was the Common Birds Census (CBC; Williamson 1964
), which was based on non-random sites selected by volunteer observers. It has now been replaced by the United Kingdom's Breeding Bird Survey (UK BBS; Freeman et al. 2003
), which is based on a stratified random sample of transects from throughout the UK. Although critics argued that volunteer observers would not wish to visit random sites, where bird densities and diversity might often be low, the scheme now has nearly 2000 contributors compared with 200 or fewer for the CBC.
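A stratified random design of this kind can be sketched as follows. The strata, sampling frames and sample sizes below are entirely hypothetical and stand in for whatever regional or habitat strata a real scheme would define; the point is simply that sites are drawn at random within each stratum rather than chosen for their suitability.

```python
import random

# Hypothetical strata, each with a frame of candidate survey squares
# and a target sample size (all names and numbers are illustrative).
strata = {
    "lowland": {"frame": [f"LW{i:03d}" for i in range(500)], "n": 40},
    "upland":  {"frame": [f"UP{i:03d}" for i in range(300)], "n": 20},
    "urban":   {"frame": [f"UR{i:03d}" for i in range(200)], "n": 10},
}

def stratified_sample(strata, seed=1):
    """Draw a simple random sample of squares, without replacement,
    within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(s["frame"], s["n"]) for name, s in strata.items()}

sample = stratified_sample(strata)
```

Because selection within each stratum is random, estimates for the wider region can be built up from the stratum samples without the site-selection bias that affects volunteer-chosen plots.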
Differences in biodiversity measures over time at a single location may reflect real change, or may simply reflect the fact that species were more detectable at some time-points than at others, perhaps because of variable observer effort, time of year, habitat succession affecting the ease with which species could be detected, or many other possible factors. Often no attempt is made to estimate detectability because it adds an unacceptable overhead to surveys, which often span a wide range of species. Thus, the BMS is based on (often incomplete) counts of butterflies within a strip of specified width (usually 5 m but sometimes wider), while the NA BBS is based on counts of all birds detected out to 400 m from the sample point. It is certain that only a proportion of the birds present within 400 m will be detected, even of those vocalizing during the survey, and that this proportion will vary with habitat, environmental conditions and observer skill. By contrast, the UK BBS, relying on volunteer observers, uses line transect sampling, an example of distance sampling (Buckland et al. 2001
), to correct for detectability. Recognizing the difficulty that untrained volunteer observers may have with estimating distances, the UK BBS uses just three distance intervals for tallying detections. However, given the rate at which the price of survey lasers is dropping, it may soon be realistic to expect a volunteer, in developed countries at least, to purchase one, in which case detectability in the context of distance sampling could be measured with greater precision and lower bias.
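The idea behind fitting a detection function to a small number of distance bands can be sketched as follows. This is a minimal illustration, not the UK BBS's actual analysis: the band cut-points, counts and the choice of a half-normal detection function are all assumptions made for the example, and the maximum-likelihood fit uses a crude grid search rather than a proper optimizer.

```python
import math

# Illustrative line-transect data grouped into three distance bands
# (cut-points in metres and counts are hypothetical, not the scheme's).
cuts = [0.0, 25.0, 100.0, 200.0]   # band edges; truncation distance w = 200 m
counts = [60, 45, 15]              # detections tallied per band

def hn_integral(x, sigma):
    """Integral from 0 to x of the half-normal detection function
    g(y) = exp(-y^2 / (2 sigma^2))."""
    return sigma * math.sqrt(math.pi / 2) * math.erf(x / (sigma * math.sqrt(2)))

def grouped_loglik(sigma, cuts, counts):
    """Multinomial log-likelihood of the banded counts given sigma."""
    total = hn_integral(cuts[-1], sigma)
    ll = 0.0
    for j, n in enumerate(counts):
        p = (hn_integral(cuts[j + 1], sigma) - hn_integral(cuts[j], sigma)) / total
        p = max(p, 1e-300)  # floor to avoid log(0) at implausibly small sigma
        ll += n * math.log(p)
    return ll

# Crude grid search for the maximum-likelihood sigma.
sigma_hat = max((s / 10 for s in range(100, 5000)),
                key=lambda s: grouped_loglik(s, cuts, counts))

# Average detection probability within the truncation distance, used to
# correct the raw count for individuals that were present but missed.
p_hat = hn_integral(cuts[-1], sigma_hat) / cuts[-1]
corrected_count = sum(counts) / p_hat
```

Even with only three bands, the falling counts with distance carry enough information to estimate how quickly detectability declines, which is exactly what makes the correction possible without asking volunteers for exact distances.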
The UK BBS is not the only large-scale monitoring programme that successfully deals with both spatial variation and detectability, although other examples are rare. Another is the Waterfowl Breeding Population and Habitat Survey, a spring survey of breeding waterfowl in the north-central USA and Canada. It is conducted by aerial surveys of strips, using a stratified systematic sampling design, and detectability is measured by surveying a proportion of the strips from the ground (where detectability is assumed to be certain) as well as from the air.
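The double-sampling logic of the waterfowl survey can be sketched with a simple ratio estimator. All counts below are hypothetical; the only assumption carried over from the design is that the ground counts on the double-sampled strips are complete.

```python
# Air counts for all surveyed strips, and (air, ground) count pairs for
# the subsample of strips also surveyed from the ground (values invented).
air_all = [34, 51, 20, 44, 62, 18, 40]
double_sampled = [(34, 52), (20, 33), (62, 90)]

# Ratio estimator of visibility: because ground counts are assumed
# complete, the air-to-ground ratio estimates the proportion of birds
# missed from the air, pooled over the double-sampled strips.
air_sub = sum(a for a, g in double_sampled)
ground_sub = sum(g for a, g in double_sampled)
correction = ground_sub / air_sub        # > 1 when the air survey misses birds

# Apply the visibility correction to the full set of air counts.
corrected_total = correction * sum(air_all)
```

The correction factor is estimated only on the strips surveyed twice, then applied to every strip, which is what allows the cheap aerial counts to carry the bulk of the survey effort.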
When the NA BBS, the BMS and the CBC were first established, they were ground-breaking but, as time has passed, their limitations have come to light. Too often, established but flawed methods are retained in order to avoid breaking a long and valuable time series. However, if that time series is compromised to the extent that trend estimates may seriously mislead managers, then the decision to change methods should be made. The British Trust for Ornithology faced this difficult decision when it replaced the CBC by the UK BBS. It addressed the issue of continuity of time series by running the schemes in parallel for several years to allow calibration. Freeman et al. (2003)
found that, despite the fact that CBC sites were non-random, for a large majority of species considered there was no significant difference between population trends calculated from the CBC and the BBS. However, these analyses were restricted to that part of the country where CBC data were sufficient to support a meaningful comparison.
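A comparison of this kind can be sketched by fitting a log-linear trend to each scheme's index over the overlap years and comparing the implied annual rates of change. The indices below are invented for illustration and the fit is ordinary least squares on log indices, a deliberately simple stand-in for the fuller analyses of Freeman et al. (2003).

```python
import math

# Hypothetical annual indices from two schemes run in parallel over the
# same overlap years (an old CBC-like scheme and a new BBS-like one).
years = list(range(1994, 2001))
old_index = [100, 96, 93, 91, 88, 86, 83]
new_index = [100, 97, 92, 90, 89, 85, 84]

def log_linear_slope(years, index):
    """OLS slope of log(index) on year: the average annual log rate of change."""
    x = [y - years[0] for y in years]
    z = [math.log(v) for v in index]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    return (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
            / sum((xi - mx) ** 2 for xi in x))

slope_old = log_linear_slope(years, old_index)
slope_new = log_linear_slope(years, new_index)

# Annual percentage change implied by each scheme; close agreement over
# the overlap supports chaining the new series onto the old one.
pct_old = 100 * (math.exp(slope_old) - 1)
pct_new = 100 * (math.exp(slope_new) - 1)
```

Running the schemes in parallel is what makes this comparison possible at all: without an overlap there is no direct way to test whether a change of method has shifted the trend.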
Given the number of long-term time series of data from non-random sites, there will continue to be considerable interest in how best to estimate trends for wider regions within which these sites fall. Post-stratification can help reduce bias in trends. If the wider region can be divided into strata, within which trends are relatively homogeneous, then the sites that fall within each stratum might be assumed to be representative for that stratum. A common difficulty with this approach is that less-favoured habitat types may have inadequate sample sizes, yet may account for the majority of many populations. For example, some butterfly species occur at high densities within the kind of site that is monitored, and at lower densities (but higher overall abundance) through much larger tracts of less suitable habitat. Supplementing existing schemes with additional sites in under-sampled strata may be a less costly, but technically less satisfactory, option than replacing an existing scheme altogether.
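The butterfly example above can be made concrete with a population-weighted trend across strata. The areas, densities and rates of change below are hypothetical, but they illustrate how a stable trend on monitored reserves can coexist with a marked overall decline once less suitable habitat, which holds most of the population, is weighted in.

```python
# Hypothetical strata for a butterfly species: small reserves at high
# density, and a much larger area of less suitable habitat at low density.
strata = {
    # name: (area in km^2, density per km^2, annual rate of change)
    "reserves":    (50,   40.0, 1.00),   # stable on the monitored reserves
    "countryside": (5000,  1.5, 0.95),   # declining 5% per year elsewhere
}

# Population-weighted trend: weight each stratum's rate of change by the
# share of the total population it holds, not by how many sites were sampled.
pops = {name: area * dens for name, (area, dens, rate) in strata.items()}
total = sum(pops.values())
weighted_rate = sum(pops[name] * strata[name][2] for name in strata) / total
```

Here the reserves hold 2000 individuals against 7500 in the wider countryside, so the overall rate sits much closer to the countryside trend than the reserve trend, which is precisely the bias that monitoring only favoured sites would conceal.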
Monitoring programmes should be designed so that they address the defined objectives. If practicalities lead to a design that cannot meet its objectives, then the programme should be re-examined and other options evaluated. Any programme that seriously attempts to monitor biodiversity should address the two issues of spatial variation and detectability. Danielsen et al. (2003)
argue that designs are too complicated and programmes too costly for developing countries, so that simpler schemes are needed. We wholeheartedly support the response of Yoccoz et al. (2003)
to this, that the ‘why’, ‘what’ and ‘how’ of biological monitoring are important irrespective of available resources. We also endorse the call by Pollock et al. (2002) to measure detectability in large-scale surveys.
In the context of point counts (used, for example, by the NA BBS), Buckland (in preparation)
noted that ‘comparisons of counts across species are invalid, because different species have different detectabilities, and comparisons within a species across different habitats are invalid, because different habitats result in different detectabilities. Even comparisons over time in counts made at the same locations are compromised if habitat succession affects detectability, or if an observer's hearing ability changes over time, or if observers change or, in the case of surveys near roads, if traffic noise increases over time.’ Observer variation in detectability has been well demonstrated for the NA BBS (Sauer et al. 1994
; Kendall et al. 1996
; Link & Sauer 1998a
). Detectability can be safely ignored only if detection is certain (or nearly so) within the sampled plots. It would often be necessary to have very small plots or very narrow strips along transects to ensure this, in which case many potential records beyond the plot are discarded.
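The quoted point can be illustrated with a toy calculation: equal true abundance in two habitats produces very different raw counts when detectability differs, whereas detectability-corrected estimates do not. All numbers are invented.

```python
# Equal true abundance in two habitats, but different detectabilities
# (e.g. birds are harder to see and hear in woodland than in the open).
true_n = {"open": 100, "woodland": 100}
detectability = {"open": 0.8, "woodland": 0.4}

# Expected raw counts suggest a spurious twofold difference between habitats...
raw_counts = {h: true_n[h] * detectability[h] for h in true_n}

# ...while dividing by the detection probability recovers the equal abundances.
corrected = {h: raw_counts[h] / detectability[h] for h in raw_counts}
```

The same arithmetic applies to comparisons across species, observers or years: whenever detectability varies and is not estimated, the raw counts confound real differences in abundance with differences in detection.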
Our favoured strategy is to design a survey that will fully meet its objectives (assuming that those objectives are realistic and achievable). The region of interest may span different administrative areas and possibly several nations. Within some areas, the survey design may be achievable from the outset. In others, sampling may need to be restricted to localities that are safe or accessible, sampling methods may have to be simplified and the number of sampling locations may have to be reduced (achievable without bias, using a stratified sampling scheme such as that used by the UK BBS). However, the full design should remain as a goal for those areas to aspire to. A design that may be unachievable now may well be achievable in 10 years, especially if other areas in the region are able to implement the full design successfully. If the simpler methodologies or reduced sampling rates are carefully planned, this need not compromise the long-term time series; rather, as areas acquire the expertise or resources to upgrade their part of the programme, the aim should be to make upgrades ‘backwards compatible’. That is, it should be possible to extract data from the improved programme that are comparable with those from the simplified programme.
Again, we concur with Yoccoz et al. (2003)
that there is no necessity for sound survey design to lead to a complex monitoring programme; a well-designed programme makes for easy data analysis, whereas a poorly designed one leads to either flawed or complex data analysis, and often both.
More ambitious survey designs are possible in programmes that use a few professional observers than in those that use a large number of volunteers. For the latter, it is difficult to strike a balance between methods that are over-simplistic and methods that are so complex that they compromise both data quality and the goodwill of the volunteers. However, if field methods are simplified to the point that the data cannot possibly answer the objectives of the programme, then the survey fails the volunteers, who have contributed their effort to help achieve those objectives. Moreover, a combination of different sampling techniques may be required to produce an accurate representation of biodiversity. For example, Sørensen et al. (2002)
used six different sampling methods in their investigation of spider diversity in Tanzania. Finally, the challenges of identifying less charismatic taxa—including most invertebrate groups—may impede a comprehensive monitoring programme.