Ecological communities respond to many factors, but not all of these factors are necessarily of interest to the ecologist. Fortunately, CANOCO has the capability to "factor out" such influences. The result is a partial ordination, which is directly analogous to partial correlation (see Draper and Smith 1981). Partial analysis can be performed for both direct and indirect gradient analysis. The variables to be "factored out" are termed covariates or covariables. Examples of covariables follow.
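The analogy with partial correlation can be made concrete with a minimal sketch. It assumes a single covariable and simple least-squares regression; the variable names and the data (`elevation`, `grazing`, `cover`) are hypothetical and are not taken from CANOCO. Both variables of interest are regressed on the covariable, and the residuals are then correlated.

```python
from statistics import mean


def residuals(y, z):
    """Residuals of a simple least-squares regression of y on covariable z."""
    z_bar, y_bar = mean(z), mean(y)
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sxy / szz
    return [(yi - y_bar) - slope * (zi - z_bar) for zi, yi in zip(z, y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def partial_correlation(x, y, z):
    """Correlation of x and y after the linear effect of z is removed from both."""
    return pearson(residuals(x, z), residuals(y, z))


# Hypothetical plot data: elevation is the covariable to be factored out.
elevation = [100, 200, 300, 400, 500, 600]
grazing = [1, 3, 2, 5, 4, 6]
cover = [12, 15, 19, 20, 26, 27]

partial_r = partial_correlation(grazing, cover, elevation)
print(f"grazing-cover correlation, elevation factored out: {partial_r:.3f}")
```

A partial ordination applies the same idea to a whole species matrix at once, with any number of covariables.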
In a large study, there is often more than one person responsible for collecting the data. Unfortunately, there will inevitably be some subtle variations in how different observers estimate cover, identify species, etc. If observers are represented by dummy variables (or, as in Canoco 5 or higher, as levels within a factor, i.e. a categorical variable), such variation can be factored out. This capability of CANOCO is useful, but should not be an excuse for sloppy collection of data.
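For readers who have not coded dummy variables by hand, here is a small sketch of what the coding looks like. The function name `dummy_code` and the observer names are hypothetical. One level is dropped as the reference, since a full set of indicators would be collinear with the intercept.

```python
def dummy_code(levels):
    """Return (names, rows): 0/1 indicator columns for every level except the first."""
    categories = sorted(set(levels))
    kept = categories[1:]  # the first level (alphabetically) serves as the reference
    rows = [[1 if level == c else 0 for c in kept] for level in levels]
    return kept, rows


# Hypothetical observers, one entry per plot.
observers = ["Ana", "Ben", "Ana", "Cal", "Ben"]
names, rows = dummy_code(observers)

for observer, row in zip(observers, rows):
    print(observer, dict(zip(names, row)))
```

The resulting columns are what would be supplied as covariables in software that does not accept a factor directly.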
Collection of data takes time. In a study involving many plots, it is possible that many weeks separate the first and last plots. The species composition may change substantially during this time period. Unless phenological change is an objective of the study, it is desirable to use the time of the year as a covariable, so that the true site-to-site gradients may be more obvious. Similarly, the year of sampling (coded as a factor or dummy variable) might be a desirable covariable if understanding year-to-year variation is not an objective of the study.
If an observer is interested only in temporal trends within repeatedly sampled plots, then variation among plots is worth "factoring out". This can be done by creating a multilevel factor for plot (or dummy variables for each plot), and by using this factor (or these dummy variables) as covariables.
Suppose you have an experiment which is replicated between several sites or blocks (for example, in a "split plot" design). You can code your sites or blocks as a factor (or dummy variables), so that you can specifically focus on experimental effects.
Many gradients are so well known that it is not worth focusing on them in an analysis. For example, an ecologist studying the effects of grazing practices on vegetation in a mountainous region might wish to use elevation as a covariable.
One of the basic assumptions of statistics is that different observations are independent of each other. However, community data often suffer from spatial dependence (Palmer 1988, 1990, Legendre and Fortin 1989), which can lead to incorrect statistical inference. Use of spatial coordinates as covariables can help factor out some aspects of spatial dependence (Borcard et al. 1992, Økland and Eilertsen 1994). I urge caution in using spatial coordinates as covariables, because not all spatial dependence is a linear (or polynomial) function of location, and because excessive use of polynomial terms in spatial covariables might lead to the arch effect (for some of the reasons, see variance explained and variation partitioning). If data are collected in a transect or grid, it is possible to factor out spatial dependence using a special permutation test (ter Braak 1990, ter Braak and Wiertz 1994). Other kinds of spatial pattern can be factored out using PCNM (Principal Coordinates of Neighbor Matrices).
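As an illustration of what "polynomial terms in spatial covariables" means in practice, the following sketch builds the terms of a polynomial in the plot coordinates. The function name `trend_surface_terms` and the coordinates are hypothetical. Note how quickly the number of terms grows with the degree, which is one reason for the caution urged above.

```python
def trend_surface_terms(x, y, degree=2):
    """Return (names, rows) of polynomial terms x^i * y^j with 1 <= i + j <= degree."""
    exponents = [
        (i, total - i)
        for total in range(1, degree + 1)
        for i in range(total, -1, -1)
    ]
    names = [f"x^{i}*y^{j}" for i, j in exponents]
    rows = [[(xi ** i) * (yi ** j) for i, j in exponents] for xi, yi in zip(x, y)]
    return names, rows


# Hypothetical plot coordinates.
x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 3.0]

names2, rows2 = trend_surface_terms(x, y, degree=2)
names3, rows3 = trend_surface_terms(x, y, degree=3)
print(len(names2), "terms at degree 2;", len(names3), "terms at degree 3")
```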
In an exploratory analysis, an investigator is often interested in revealing all of the important environmental factors which determine species distributions. If all of the environmental variables in a study are entered as covariables, and an indirect gradient analysis is performed (e.g. a partial Detrended Correspondence Analysis or pDCA), then the result indicates the residual variation in species composition. The investigator can then use his or her knowledge and intuition while examining the location of species and samples in ordination space. The distribution of different kinds of species in different parts of ordination space might suggest that some unmeasured gradient is important.
Using covariables does not mean that you are completely factoring out an effect. For example, suppose that you use elevation as a covariable. It is possible that you have an important elevation by aspect interaction which is still "hidden" in the data set. Also, it is possible that elevation is not best expressed as a linear term. If species composition varies as a function of the square root of elevation, you will not be removing the entire "elevation effect". These concerns shouldn't bother us much: a variable is usually so highly correlated with any transformation of itself that removing the linear effect removes the bulk of any nonlinear effect, and interaction effects (except in extreme cases) will also be correlated with the covariable. However, getting the "wrong" transformation dampens what would otherwise be crisp hypotheses. Note that this "dampening" is not unique to gradient analysis: it is a problem with ANCOVA, partial correlation, or any other method which uses covariables.
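The claim that a variable is highly correlated with its own transformation is easy to check. The sketch below uses a hypothetical, evenly spaced elevation gradient (200 to 2000 m); the values are invented for illustration. It computes the correlation between elevation and its square root, and the share of square-root-elevation variance that a linear elevation covariable would leave behind.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical gradient: plots every 100 m from 200 m to 2000 m.
elevation = [float(e) for e in range(200, 2001, 100)]
root_elevation = [sqrt(e) for e in elevation]

r = pearson(elevation, root_elevation)
left_behind = 1 - r ** 2  # fraction of sqrt(elevation) variance not removed

print(f"r = {r:.4f}, fraction left behind = {left_behind:.4f}")
```

The fraction left behind depends on the range of the gradient and on how strongly curved the transformation is, so it is worth running this kind of check on your own data.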
An even more problematic caveat is that you may end up getting rid of important variation. For example, suppose you are interested in the effects of land use on bird communities, but wish to factor out elevation. If land use varies strongly as a function of elevation, then factoring out elevation will also factor out most of the land use effect. Thus, even if the relationship between land use and bird composition is strong, you may not have enough residual variation in land use to find meaningful patterns.
This problem is even more extreme if you factor out, for example, study sites with multiple plots. Any variable that is constant WITHIN a study site, but varies AMONG study sites, will end up with zero residual variation if you factor out study sites.
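A small sketch shows why this happens. In the linear case, regressing a variable on a full set of site dummies leaves the deviations from the site means as residuals. The function name and the data below are hypothetical: `land_use_intensity` is constant within each site, while `soil_moisture` varies within sites.

```python
from collections import defaultdict
from statistics import mean


def factor_out_groups(values, groups):
    """Residuals after removing group (site) means from each value."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    group_mean = {group: mean(vals) for group, vals in by_group.items()}
    return [value - group_mean[group] for value, group in zip(values, groups)]


# Hypothetical data: two sites with three plots each.
sites = ["A", "A", "A", "B", "B", "B"]
land_use_intensity = [1, 1, 1, 3, 3, 3]      # constant within site
soil_moisture = [10, 12, 14, 20, 25, 21]     # varies within site

land_use_resid = factor_out_groups(land_use_intensity, sites)
moisture_resid = factor_out_groups(soil_moisture, sites)

print("land use residuals:", land_use_resid)
print("soil moisture residuals:", moisture_resid)
```

The land use residuals are all zero, so nothing remains for the ordination to relate to species composition, whereas soil moisture keeps its within-site variation.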
It is possible to use the same variable as both a covariable and as an environmental variable in different parts of the same analysis. In variation partitioning, covariables are useful for distinguishing the relative contributions of different groups of variables in explaining species composition. In stepwise ordinations, variables are included in a regression model, one at a time. As soon as they are added, they become covariables. This procedure can be automated in CANOCO.
Permutation tests with covariables are somewhat more complex than without covariables. Ter Braak and Šmilauer (1998) and Legendre and Legendre (1998) discuss some of these complexities. However, the tests become much cleaner if the permutations are "conditioned upon covariables". In almost all cases, this is done when your covariables are dummy (categorical) variables. When you condition upon covariables, you are only permuting your data within each category of your covariables. For example, if your covariables represent the blocks of a split-plot design, you would only permute data within your blocks.
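The idea of permuting only within blocks can be sketched as follows. This illustrates the principle only and is not CANOCO's implementation; the function name `permute_within_blocks` and the block labels are hypothetical.

```python
import random


def permute_within_blocks(blocks, rng):
    """Return a permutation of sample indices that shuffles samples only within their block."""
    members_of = {}
    for index, block in enumerate(blocks):
        members_of.setdefault(block, []).append(index)

    permuted = list(range(len(blocks)))
    for members in members_of.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Each position in the block receives a sample drawn from the same block.
        for target, source in zip(members, shuffled):
            permuted[target] = source
    return permuted


# Hypothetical design: three blocks of three plots each.
blocks = ["b1", "b1", "b1", "b2", "b2", "b2", "b3", "b3", "b3"]
order = permute_within_blocks(blocks, random.Random(42))
print(order)
```

In a permutation test, the rows of the environmental (or species) data would be reordered by `order` many times, and the test statistic recomputed each time. Because no sample ever leaves its block, block differences cannot contribute to the null distribution.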
(see also selected references for self-education)
Borcard, D., P. Legendre, and P. Drapeau. 1992. Partialling out the spatial component of ecological variation. Ecology 73:1045-55.
Draper, N. R., and H. Smith. 1981. Applied Regression Analysis. Second edition. Wiley, New York.
Legendre, P., and M.-J. Fortin. 1989. Spatial pattern and ecological analysis. Vegetatio 80:107-38.
Legendre, P., and L. Legendre. 1998. Numerical Ecology. 2nd English edition. Elsevier, Amsterdam. 853 pages.
Palmer, M. W. 1988. Fractal geometry: a tool for describing spatial patterns of plant communities. Vegetatio 75:91-102.
Palmer, M. W. 1990. Spatial scale and patterns of species-environment relationships in hardwood forests of the North Carolina piedmont. Coenoses 5:79-87.
ter Braak, C. J. F. 1990. Update notes: CANOCO version 3.10. Agricultural Mathematics Group, Wageningen, The Netherlands.
ter Braak, C. J. F., and P. Šmilauer. 1998. CANOCO Reference Manual and User's Guide to Canoco for Windows: Software for Canonical Community Ordination (version 4). Microcomputer Power, Ithaca, NY, USA.
ter Braak, C. J. F., and J. Wiertz. 1994. On the statistical analysis of vegetation change: a wetland affected by water extraction and soil acidification. J. Veg. Sci. 5:361-72.
Økland, R. H., and O. Eilertsen. 1994. Canonical correspondence analysis with variation partitioning: some comments and an application. J. Veg. Sci. 5:117-26.