16

Statistical Methods to Assess the Effects of Chemicals on Ecosystems

J. N. R. Jeffers
ESE Consultants, United Kingdom

16.1 INTRODUCTION
16.2 STRATEGIES
16.3 HIERARCHY THEORY
16.4 STATISTICAL CONCEPTS
16.5 BACK TO BASICS
16.6 SELECTED TOPICAL REFERENCES
16.1 INTRODUCTION

Determination of the effects of chemicals on ecosystems is a difficult task. Ecosystems have been defined by Tansley (1935) as "The whole system (in the sense of physics) including not only the organism-complex, but also the whole complex of physical factors forming what we call the environment of the biome - the habitat factors in the widest sense." As such, ecosystems are difficult to investigate. Established traditions of thought in science fragment reality by the application of reductionism. When comprehension of the whole is difficult, it is divided into smaller components, each of which can then be investigated separately. As a result, objects in the real world become identified as autonomous, distinct entities, inherently separate and disconnected from the other objects outside them, which constitute their environment. The intended results of reductionism are understanding, prediction, and control.
Because of the inherent complexity of ecosystems, however, treating parts of the reality in isolation leads to unexpected, unexplained, and often unwanted consequences when the parts are considered together. An ecosystem is greater than the sum of its parts. The general inability of science to build the properties of fragments into reasonable replicas of the properties of wholes is claimed to be calling the entire reductionist-mechanistic paradigm of the physical sciences into question (Patten, 1991).
The problem of holistic science becomes one of system definition. Specifically, determination of a minimal universe sufficient to encompass the indirect effects relevant to a given scope of enquiry and no more is necessary. Having defined and bounded the system, logical methodologies capable of describing and interpreting the variability inherent in ecological systems must be used, because of the genetic mechanisms that control the inheritance and response of individuals to other organisms and to their environment. To understand ecosystems, either as complex physical entities or as a paradigm for ecological science, requires formal methodologies to represent the relationships between organisms, and between organisms and their environment. Mathematical modelling is the systematic methodology that has proven successful in discovering and understanding the underlying processes in ecology.
This chapter seeks to explore some of the important concepts in mathematical analysis and modelling as applied to the estimation of the effects of chemicals on ecosystems. First, appropriate strategies are defined to investigate the impact of chemicals on an ecosystem. The role of hierarchy theory in determining the appropriate level of search and measurement is discussed, together with the constraints imposed by statistical inference in the estimation of system parameters. Ten basic principles are presented as a guide to develop future research projects. Finally, a list of selected references is provided as a source of further information.
16.2 STRATEGIES

Studies of the effects of chemicals on ecosystems can usually be classified broadly into baseline, monitoring, or impact studies. A baseline study is one in which data are collected and analysed for the purpose of defining the present state of the ecosystem. Usually, some environmental change is anticipated, although both the nature of the change and the time of its occurrence may be unknown. However, the present state will be characterized by patterns of spatial and temporal variation. An impact study is one whose purpose is to determine whether a specified impact has caused a change in an ecosystem, and, if so, the nature of that change. The nature of the impact and the fact of its occurrence are, thus, known. A monitoring study uses data to detect change in an ecosystem as that change occurs, and commonly assumes that baseline data are available to provide a standard against which change can be measured. In the simplest cases, the nature of the change may be defined very specifically, but frequently it is not defined at all.
Appropriate strategies for environmental studies have been conveniently summarized by Green (1979) as a decision tree. The choice of strategy for the investigation of an impact on an ecosystem depends essentially on the answers to three basic questions: (1) Has the impact already occurred? (2) Are the time and place of the impact known? (3) Is there a control area?
If the impact has already occurred, the type of impact and its time or location are probably also known. Unless pre-existing baseline data are available, the effects of the impact and the mechanism of those effects must then be inferred from spatial differences between areas differing in their degree of impact. The methods available for such inference are not ideal, principally because the recorded differences in the degree of impact may be confounded with many factors, or, alternatively, may interact strongly with a range of environmental and ecological variables. Specific hypotheses need to be derived from an examination of spatial patterns, and then tested in additional laboratory or field experiments. The same considerations apply when the roles of chemicals in normal ecosystem processes are being investigated, but this aspect is not pursued further here. Known pollution incidents or oil spills are typical examples of this situation.
If an impact of some kind has occurred but the precise timing and location of that impact is unknown, the only available strategy is that of a survey in both time and space. The design of such a survey must take into account all available knowledge of the variability of the component ecosystems; ideally, a prepared sampling frame should be available from past research. For example, the determination of the location and timing of the radionuclide depositions in Britain that resulted from the Chernobyl explosion was made possible by the existence of a land classification that provided a stratification and sampling frame for a survey of grassland ecosystems across the whole of Britain. That survey enabled the precise determination of the location of the deposition of 134Cs and 137Cs, and the assessment of the uptake of the caesium by vegetation and sheep.
More constructive strategies are available when the impact has not already occurred, even if the precise time and location of some future impact is unknown. Given a broad indication of the ecosystems at risk, a baseline survey can be undertaken, and then the system can be monitored to detect when and where an impact occurs. However, even this strategy is not without its difficulties, if the nature of the chemical and the changes that are likely to occur as a result of its impact are unknown in advance. Not everything can be monitored, and past history suggests that effects of chemicals have not always been anticipated correctly, or even at all. This strategy is probably, therefore, the most expensive, and perhaps the least successful.
Where the impact has not occurred, and the time and place of the future impact are known, usually because it will occur as a result of a planned intervention in the functioning of the ecosystem, the appropriate strategy will depend on the presence or absence of one or more control areas on which the impact will not occur. Without control areas, the effect of the impact will need to be inferred from changes taking place in the ecosystem over time. Such a strategy is not without its problems, especially where feedback mechanisms are present in the ecosystem processes that make it difficult to determine the point at which changes first occur, even though the precise timing of the impact is known. Although time-series analysis is a well-developed part of modern statistical methodology, considerable ambiguities persist as a result of lags in the effects of chemicals on ecosystem processes.

By far the most effective strategy arises from prior knowledge of where and when an impact is likely to be of concern, again most probably because it is planned to arise from an experimental intervention, and one or more control areas are designated to provide a comparison with the treated or impacted areas. This strategy permits, at least in theory, the use of an optimum design, including replication and randomization, that will enable the parameters of an ecosystem model to be estimated. Some of the basic statistical concepts associated with such an approach are emphasized later in this chapter.
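The branching logic of the preceding paragraphs can be summarized in a short sketch. The function below is only an illustrative paraphrase of Green's decision tree; the strategy labels condense the four situations just described and are not Green's own wording.

```python
def choose_strategy(impact_occurred: bool,
                    time_and_place_known: bool,
                    control_area_available: bool = False) -> str:
    """Paraphrase of Green's (1979) decision tree for impact studies.

    The returned labels summarize the four situations discussed in the
    text; they are illustrative, not Green's exact terminology.
    """
    if impact_occurred:
        if time_and_place_known:
            # Effects must be inferred from spatial differences between
            # areas differing in their degree of impact.
            return "infer impact from spatial comparisons; test hypotheses experimentally"
        # Neither the timing nor the location of the impact is known.
        return "survey in both time and space, using any existing sampling frame"
    if time_and_place_known and control_area_available:
        # The most powerful case: a designed experiment is possible.
        return "designed experiment with replication, randomization and control areas"
    if time_and_place_known:
        # No control areas: change must be inferred from the time series alone.
        return "monitor temporal change on the impacted area (time-series inference)"
    # The impact has not yet occurred and its location is unknown.
    return "baseline survey of ecosystems at risk, followed by monitoring"


if __name__ == "__main__":
    print(choose_strategy(impact_occurred=False,
                          time_and_place_known=True,
                          control_area_available=True))
```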
Within each strategy, several decisions remain to be made. The number and kinds of variables need to be resolved, and the means by which they are to be derived, coded, and analysed need to be defined. Special considerations that may influence choice of sample unit size and the number of replicate samples need to be explored. The procedures to be used for the preliminary screening of data for aberrant values, failures of assumption, and the estimation of ecosystem parameters require specification. These issues are considered in some detail below.
16.3 HIERARCHY THEORY

Because ecological systems are organised across a range of space and time scales, a way is needed to decide which specific mechanisms must be understood in order to predict system behaviour. The time and the resources available for research are too limited to permit the collection of data for any sizeable portion of the world's ecological systems at the level of detail implied by a reductionist philosophy of scientific investigation. Ecologists are, therefore, beginning to use hierarchy theory to help construct a link between theory and empiricism (Allen and Starr, 1982; O'Neill et al., 1986).
Hierarchy theory asserts that a useful way of dealing with complex, multiscaled systems is to focus on a single phenomenon and a single time-space scale. Unless the problem is limited in this way, it cannot be defined clearly, nor can the proper subsystem with which to work be chosen.
The system of interest (Level 0) will itself be a component of some higher level. The dynamics of the upper level will usually appear as constants or driving variables in a model of Level 0, so that the behaviour of Level 0 will also appear to be constrained, bounded, and controlled by this higher level.
The higher level also provides the significance of the phenomena of interest. If, for example, the effect of a chemical on an organism were being determined, behaviour that is difficult to explain would be observed if attention were limited to the single organism. Only by reference to the higher level, the population, can the significance of that behaviour be revealed.
The next step is to divide Level 0 into components forming the next lower level (Level -1). The components of Level -1 can then be studied to explain the mechanisms operating at Level 0, and these lower level entities appear as state variables in a model of Level 0.
Defined in this way, hierarchy theory dissects a phenomenon out of its complex spatial and temporal context. Understanding the phenomenon depends on referencing the next higher and lower scales of resolution. Levels higher than +1 are too large and too slow to be seen at the 0 level, and can, therefore, usually be ignored. Levels lower than -1 are too small and too fast to appear as anything but background noise in observations of Level 0. The theory focuses attention on a defined subset of behaviour and permits systematic study of very complex systems.
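The roles of the three levels can be made concrete with a toy model: the Level +1 dynamics enter only as a slowly varying driving variable, the Level -1 components are the state variables, and the Level 0 quantity is obtained by aggregating them. This is a minimal sketch; all names, rate constants, and scales are hypothetical.

```python
import numpy as np

# Toy illustration of hierarchy theory levels (all values hypothetical).
# Level +1: regional climate, entering only as a slow driving variable.
# Level  0: the population of interest, the quantity being modelled.
# Level -1: individual organisms, the state variables of the model.

rng = np.random.default_rng(1)
n_individuals = 50                      # Level -1 state variables
biomass = rng.uniform(0.5, 1.5, n_individuals)

def climate_driver(t):
    """Level +1 forcing: so slow that it is nearly constant at Level 0."""
    return 1.0 + 0.01 * np.sin(2 * np.pi * t / 365.0)

growth_rate = 0.05                      # hypothetical per-day rate
for t in range(100):
    # Level -1 dynamics, constrained by the Level +1 driver.
    biomass *= 1.0 + growth_rate * climate_driver(t) * rng.normal(1.0, 0.1, n_individuals)

# Level 0 behaviour is the aggregate of the Level -1 state variables.
population_biomass = biomass.sum()
print(f"Level 0 (population) biomass after 100 days: {population_biomass:.1f}")
```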
O'Neill (1988) develops these concepts to provide a set of possible criteria for the study of global change. The author has adapted these criteria to the study of the impacts of chemicals on ecosystems:
Searching for the fundamental hierarchy. Although searching for one hierarchy that characterizes an ecosystem may be tempting, several different hierarchies may be necessary to address different problem areas. Requiring that a hierarchy fits a priori biases is an unnecessary constraint.
Searching for the fundamental level. Designating the one and only level to which all other phenomena must be reduced is also not fruitful. Indeed, the impact itself is likely to determine the time and space scales that will be emphasized.
Translating principles between levels. Generally, transposing relationships developed at one hierarchical level to higher and lower levels will be impossible. Constraints imposed at higher levels may dominate, and the overall system behaviour may have little resemblance to the behaviour of isolated components. Experimental ecologists are well aware that relationships in the field may be quite different from those measured in the laboratory.
Be prepared to accept innovative approaches. The approaches required to understand the impact of chemicals on ecosystems will not reduce to simple repetitions of approaches used previously. For instance, measuring long-term changes at the scales currently emphasized in ecology is notoriously difficult. Limiting investigators to familiar scales of measurement may result in failure to detect significant trends until the prevention of permanent changes is impossible.
Effects of a higher level on a lower. Higher levels of the hierarchy set constraints or boundary conditions for lower levels. Because the upper level dynamics act as forcing functions or driving variables, given sufficient difference in scale, predicting how the higher level will affect the lower is possible.
Predicting the higher level from the lower. While hierarchy theory predicts how higher levels affect lower ones, moving in the opposite direction is more difficult. Higher level properties are sometimes, but not always, the sum or integral of lower level dynamics. This influence of the lower levels on the higher is commonly known as the "aggregation problem."
Interactive state variables. A useful scale for interfacing different disciplines (e.g., chemistry and ecology) can be found, provided that the state variables of a model from one discipline appear as state variables in the model of the other discipline. Once a problem area is selected, hierarchy theory provides a means of determining a specific scale at which chemical and ecological processes can be interfaced.
Seeking coherent levels. The previous criteria might lead to the belief that any level of resolution can be chosen arbitrarily; however, that is not the case. Scales exist at which predictive capability is improved over slightly larger or slightly smaller scales. The scale at which the predictive power is maximized is the coherent level that makes sense as an isolated object of study. Fortunately, this scale is quite likely to correspond to a traditional level of study within a discipline. While interfacing disciplines at arbitrary levels of resolution would be possible, arbitrary scales do not take advantage of the innate organization in hierarchical systems. Only when focus is placed on coherent levels can advantage be taken of the information and insights that have developed in each discipline about their own systems.
Critical points in parameter space. Once a scale has been selected, the parameters to be measured must be determined. A potential solution can be offered: the normal functioning of the ecosystem is not of great concern. This normal, stable behaviour of the ecosystem is most difficult to monitor. However, concern is raised about the unusual circumstances that may perturb the system. The points of critical change are called bifurcations in the underlying mathematical theory, and a necessary and sufficient condition exists to determine when radical change is apt to occur. The change occurs when the rapid components cease to be stable; that is, when the lower level components fail to return to normal behaviour following a minor perturbation.
A system behaves normally largely because the rapid portions of the system are constrained by higher levels. If the system is perturbed, the rapid components simply return to the slowly changing trajectory. The rate of recovery can be taken as an indicator of the relative stability of the system but, as the system approaches a bifurcation, the recovery becomes slower. Thus, monitoring for a significant impact is possible by monitoring the recovery rate of lower levels in the hierarchy. If response times are increasing, the system is being moved towards a point of radical self-amplifying change. Even though the actual point of change could be precipitated by fine-scaled events, the proximity to any radical point of change will be indicated by a change in recovery times.
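This slowing of recovery as a bifurcation is approached can be illustrated with a one-variable model. The harvested logistic model below, and all of its parameter values, are hypothetical; the point is only that the measured recovery time lengthens as the control parameter approaches its critical value.

```python
import numpy as np

def recovery_time(h, r=1.0, K=10.0, dt=0.01, kick=0.5, tol=0.05):
    """Time for a small perturbation to decay in a harvested logistic model.

    dx/dt = r*x*(1 - x/K) - h   (all parameters hypothetical).
    The upper equilibrium loses stability at h = r*K/4 (a fold bifurcation).
    """
    # Upper (stable) equilibrium of the harvested logistic model.
    x_star = K / 2 * (1 + np.sqrt(1 - 4 * h / (r * K)))
    x = x_star - kick                       # small downward perturbation
    t = 0.0
    while abs(x - x_star) > tol * kick and t < 1000:
        x += dt * (r * x * (1 - x / K) - h)  # simple Euler integration
        t += dt
    return t

# Recovery slows markedly as h approaches the critical value r*K/4 = 2.5.
for h in (0.5, 1.5, 2.0, 2.3, 2.45):
    print(f"harvest rate {h:4.2f}: recovery time {recovery_time(h):6.1f}")
```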
16.4 STATISTICAL CONCEPTS

Any system that includes living organisms is certain to show some degree of heterogeneity, because of the genetic variation that results from sexual reproduction. Models of the reactions of organisms to applications of a chemical that do not allow for variability in the response to those applications are, therefore, mere caricatures.
The statistician regards the measurement of variation as being of greater importance than the measurement of central tendencies, whether as means or as relationships, and most of the extensive theory of mathematical statistics now deals with the measurement of variation and its characterization. Much of that theory has historically concentrated on the continuity of observations, as reflecting some idealized distribution that has desirable mathematical properties. More recently, statisticians have turned to methods that are capable of detecting discontinuities in measurements, often in multidimensional space. Increasingly, statistical methods have embraced qualitative data as important extensions of purely quantitative measures.
The total set of individuals about which inferences are made is said by statisticians to constitute a population. Those individuals may be organisms, communities, societies, or whole ecosystems. While such populations will usually be finite in both time and space, they will often be so large as to make it impossible for every member of the population to be investigated, measured, or counted. Practical scientists are, therefore, usually forced to work with samples drawn from the population, and the assumption must be made that those samples are representative of the defined population. Only then can values calculated from the samples be regarded as unbiased estimates of the corresponding population values.
A statistician's requirement for a set of samples to be regarded as representative of a population is uncompromising. The samples must be taken by an objective and unbiased method. Selection by some form of subjective choice, however carefully guided by personal judgement of the representativeness of the samples, will not satisfy the constraints of statistical inference. Two methods of objective sampling are commonly employed to meet these constraints, namely systematic and random sampling. Unless systematic sampling is repeated, severe problems occur in calculating the precision of estimates derived from the samples and in characterizing the heterogeneity of the population. The simple expedient of ensuring genuinely random choice at an appropriate part of the sampling procedure guarantees the lack of bias, and also provides a methodology to characterize the heterogeneity of the sampled population, and the precision with which population parameters are estimated from the sample (Jeffers, 1988a).
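A minimal sketch of why the random-choice step matters: from a genuinely random sample, the sample mean is an unbiased estimate of the population mean, and its precision can be estimated from the sample itself. The "population" below is simulated, and all of the numbers are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# A hypothetical finite population, e.g. biomass (g) in 10,000 quadrats.
population = rng.lognormal(mean=2.0, sigma=0.6, size=10_000)

n = 50                                                    # sample size
sample = rng.choice(population, size=n, replace=False)    # simple random sample

mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(n)                      # estimated standard error
print(f"population mean : {population.mean():.2f}")
print(f"sample estimate : {mean:.2f} +/- {1.96 * se:.2f} (approx. 95% limits)")
```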
Statistically, interaction is defined as a measure of the extent to which the effect of one factor varies with changes in the strength, grade, or level of other factors in an experiment. As long ago as the mid-1920s, Fisher (1925, 1935) pointed out that interactions could be investigated experimentally only if all the factors were included in the same experiment. Together with his co-workers, he developed the concept of factorial experiments through which some or all of the combinations of factors could be used to determine the strength and character of their interactions, provided that adequate replication of the experimental treatments exists.
Genuine replicates are independent in the sense that the outcome of a given replicate has no effect on the outcome of any others. They represent the total variability affecting replicates in some specified experimental conditions. Replication improves experiments in three principal ways. First, replication is the only way in which a valid estimate can be made of experimental error. Second, experimental results become increasingly precise as the number of replicates increases. Third, replication expands the range of experimental units studied, and, therefore, the extent to which results can be generalized.
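To make the factorial idea concrete, the sketch below simulates a replicated two-factor experiment (a hypothetical chemical dose crossed with a hypothetical soil type) and computes a standard two-way analysis of variance by hand, so that the interaction sum of squares is visible explicitly; scipy is used only for the F-distribution tail probability. With the replicate variation available as an estimate of experimental error, the interaction can be tested; without replication, the interaction and error sums of squares could not be separated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

a_levels, b_levels, reps = 3, 2, 4      # 3 doses x 2 soil types, 4 replicates
# Simulated responses (hypothetical): dose effect, soil effect, and an interaction.
dose_effect = np.array([0.0, -1.0, -3.0])
soil_effect = np.array([0.0, 1.5])
interaction = np.array([[0.0, 0.0], [0.0, -0.5], [0.0, -2.0]])

y = (10.0
     + dose_effect[:, None, None]
     + soil_effect[None, :, None]
     + interaction[:, :, None]
     + rng.normal(0.0, 1.0, size=(a_levels, b_levels, reps)))

grand = y.mean()
a_means = y.mean(axis=(1, 2))
b_means = y.mean(axis=(0, 2))
cell_means = y.mean(axis=2)

ss_a = b_levels * reps * ((a_means - grand) ** 2).sum()
ss_b = a_levels * reps * ((b_means - grand) ** 2).sum()
ss_cells = reps * ((cell_means - grand) ** 2).sum()
ss_ab = ss_cells - ss_a - ss_b                        # interaction sum of squares
ss_error = ((y - cell_means[:, :, None]) ** 2).sum()  # estimable only with replication

df_a, df_b = a_levels - 1, b_levels - 1
df_ab = df_a * df_b
df_error = a_levels * b_levels * (reps - 1)

for name, ss, df in [("dose", ss_a, df_a), ("soil", ss_b, df_b), ("dose x soil", ss_ab, df_ab)]:
    F = (ss / df) / (ss_error / df_error)
    p = stats.f.sf(F, df, df_error)
    print(f"{name:12s} F = {F:6.2f}, p = {p:.4f}")
```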
However, tension exists between the need to replicate and the need to study ecological processes at appropriately large scales. Indeed, large-scale experiments, whether planned or unplanned, are rarely replicable. The serendipity of unplanned experiments usually precludes replication, and even when experiments are planned, the independent experimental units are limited in number. Candidate systems are often so different ecologically that they do not constitute reasonable replicates. Funding levels and logistic limitations also often preclude replication (Carpenter, 1990). Detecting change in ecological time-series (Jassby and Powell, 1990) and Bayesian inference from non-replicated ecological studies (Reckhow, 1990) have been advocated as alternatives to replicated experiments, but the arguments are controversial.
As Walters and Holling (1990) emphasize, a major challenge in justifying and designing experimental management programmes is to expose the uncertainties (in the form of alternative working hypotheses) and the management decision choices, using the tools of statistical decision analysis, in a form that will promote both reasoned choice and a search for imaginative and safe experimental options. Therefore, experimental designs must be identified that distinguish clearly between localized and large-scale effects, and make the best possible use of opportunities for replication and comparison. Such designs also need to permit analysts to make unambiguous assessments of transient responses by ecosystems to chemicals, in the face of uncontrolled environmental factors that may affect treated and control experimental units differently.
16.5 BACK TO BASICS

The following summarizes the important principles to be observed in any study of the effect of chemicals on ecosystems. The similarity of these principles to those suggested by Green (1979) is acknowledged:
1. Clarity is essential to the goals of investigations, and to the means of collecting data to answer the questions posed. Collecting large quantities of data in the hope that they can somehow be synthesized as a model of an ecological system has been tried many times, from the International Biological Programme onwards, and has never resulted in a useful outcome. The only effective approach is first to state the hypothesis and then to collect the data.
2. When measuring the characteristics of any ecosystem, replicate samples are important within each combination of time, location, and any other controlled variable. Only by comparing differences among such combinations with differences within them can statistical significance be established.
3. To the extent possible, equal numbers of randomly allocated samples should be taken for each combination of controlled variables. Taking samples from representative or typical places does not qualify as random sampling.
4. To test whether a condition has an effect, samples must be taken both where the condition is present and where it is absent. Furthermore, interactions between factors can only be estimated from factorial combinations of those factors.
5. Pilot trials are essential to provide a basis for the evaluation of sampling designs and options for statistical analysis. Eliminating this step in project design, for whatever reason, inevitably results in a great loss of time and resources.
6. Variation in the efficiency of sampling among areas frequently biases comparisons between those areas. Sampling methods must therefore be shown to sample the intended population with equal and adequate efficiency over the entire range of sampling conditions encountered.
7. If an area to be sampled has a large-scale environmental pattern, the area should be divided into relatively homogeneous sub-areas, and samples allocated to each sub-area in proportion to its size. Alternatively, if an estimate of the total abundance of some species is required, the allocation of samples should be proportional to the numbers of organisms in each sub-area (see the sketch following this list).
8. Sample unit sizes must be appropriate to the sizes, densities, and spatial patterns of the organisms being sampled. The number of replicate samples required to obtain estimates with the required precision can then be determined from the information obtained in the pilot trials (see the sketch following this list).
9. Appropriate methods of analysis will depend on the ways in which the data have been collected, the hypotheses the data were designed to test, and the nature of the heterogeneity displayed by the data. These "metadata" (data about data) must not be separated from the data themselves before the data are submitted to analysis. A fundamental lack of integration exists between database management systems and classical statistical packages.
10. Important as good computing tools are, they do not substitute for appropriate statistical knowledge about how to analyse structured data. Good tools allow bad methods to be encoded just as easily as good ones. For scientists, the key problems are in knowing what processing can be validly undertaken on the data, and, in its simplest form, what data can be legitimately combined and compared.
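Two of the principles above can be turned into simple arithmetic. The sketch below illustrates the proportional-allocation rule (principle 7) and the determination of replicate numbers from pilot information (principle 8); the sub-area names, areas, pilot standard deviation, and precision target are all hypothetical.

```python
import math

# Principle 7: allocate samples to sub-areas in proportion to their size
# (all names, areas, and totals below are hypothetical; rounding may leave
# the grand total a sample or two off).
sub_area_ha = {"bog": 120, "heath": 300, "grassland": 580}
total_samples = 100
total_ha = sum(sub_area_ha.values())
allocation = {name: round(total_samples * ha / total_ha)
              for name, ha in sub_area_ha.items()}
print("proportional allocation:", allocation)

# Principle 8: replicates needed for a target precision, from pilot-trial data,
# using the large-sample approximation n ~ (z * s / d)^2, where s is the pilot
# standard deviation and d the required half-width of the 95% interval.
pilot_sd = 4.2           # hypothetical pilot-trial standard deviation
target_half_width = 1.0  # required precision, in the same units
z = 1.96
n_required = math.ceil((z * pilot_sd / target_half_width) ** 2)
print("replicates required per combination:", n_required)
```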
16.6 SELECTED TOPICAL REFERENCES

Allen, T.F.H., and Starr, T.B. (1982) Hierarchy: Perspectives for Ecological Complexity. University of Chicago Press, Chicago.
Barnsley, M. (1988) Fractals Everywhere. Academic Press, London.
Beven, K.J., and Moore, I.D. (1991) Terrain Analysis and Distributed Modelling in Hydrology. John Wiley & Sons, Chichester, UK.
Carpenter, S.R. (1990) Large-scale perturbations: opportunities for innovation. Ecology 71(6), 2038-2043.
Checkland, P., and Scholes, J. (1990) Soft Systems Methodology in Action. John Wiley & Sons, Chichester, UK.
Feoli, E., and Orloci, L. (1991) Computer Assisted Vegetation Analysis. Kluwer Academic, Dordrecht, The Netherlands.
Fisher, R.A. (1925) Statistical Methods for Research Workers. Oliver & Boyd, Edinburgh.
Fisher, R.A. (1935) The Design of Experiments. Oliver & Boyd, Edinburgh.
Green, R.H. (1979) Sampling Design and Statistical Methods for Environmental Biologists. John Wiley & Sons, Chichester, UK.
Guariso, G., and Werthner, H. (1989) Environmental Decision Support Systems. Ellis Horwood, Chichester, UK.
Halfon, E. (1979) Theoretical Systems Ecology. Academic Press, London.
Howard, P.J.A. (1991) An Introduction to Environmental Pattern Analysis. Parthenon, Casterton, UK.
Huggett, R.J., Kimerle, R.A., Mehrle, P.M., and Bergman, H.L. (1992) Biomarkers: Biochemical, Physiological and Histological Markers in Anthropogenic Stress. Lewis, Chelsea, Michigan.
Jackson, J.E. (1991) A User's Guide to Principal Components. John Wiley & Sons, Chichester, UK.
Jassby, A.D., and Powell, T.M. (1990) Detecting change in ecological time series. Ecology 71(6), 2044-2052.
Jeffers, J.N.R. (1978) An Introduction to Systems Analysis: With Ecological Applications. Edward Arnold, London.
Jeffers, J.N.R. (1988a) Statistical and mathematical approaches to issues of scales in ecology. In: Rosswall, T., Woodmansee, R.G., and Risser, P.G. (Eds.) Scales and Global Change. John Wiley & Sons, Chichester, UK.
Jeffers, J.N.R. (1988b) Practitioner's Handbook on the Modelling of Dynamic Change in Ecosystems. John Wiley & Sons, Chichester, UK.
Lunn, A.D., and McNeil, D.R. (1991) Computer-Interactive Data Analysis. John Wiley & Sons, Chichester, UK.
Noreen, E.W. (1989) Computer-Intensive Methods for Testing Hypotheses. John Wiley & Sons, Chichester, UK.
O'Neill, R.V. (1988) Hierarchy theory and global change. In: Rosswall, T., Woodmansee, R.G., and Risser, P.G. (Eds.) Scales and Global Change. John Wiley & Sons, Chichester, UK.
O'Neill, R.V., DeAngelis, D.L., Waide, J.B., and Allen, T.F.H. (1986) A Hierarchical Concept of Ecosystems. Princeton University Press, Princeton, New Jersey.
Patten, B.C. (1991) Network ecology: indirect determination of the life-environment relationship in ecosystems. In: Higashi, M., and Burns, T.P. (Eds.) Theoretical Studies of Ecosystems: The Network Perspective. Cambridge University Press, Cambridge.
Reckhow, K.H. (1990) Bayesian inference in non-replicated ecological studies. Ecology 71(6), 2053-2059.
Starfield, A.M., and Bleloch, A.L. (1986) Building Models for Conservation and Wildlife. Macmillan, London.
Swartzman, G.L., and Kaluzny, S.P. (1987) Ecological Simulation Primer. Macmillan, London.
Tansley, A.G. (1935) The use and abuse of vegetational concepts and terms. Ecology 16, 284-307.
Tijms, H.C. (1986) Stochastic Modelling and Analysis. John Wiley & Sons, Chichester, UK.
Walters, C.J., and Holling, C.S. (1990) Large-scale management experiments and learning by doing. Ecology 71(6), 2060-2068.
Whittaker, J. (1990) Graphical Models in Applied Multivariate Statistics. John Wiley & Sons, Chichester, UK.