# EIGENANALYSIS-BASED METHODS

## Eigenanalysis

Eigenanalysis is central to the mathematical discipline of linear (matrix) algebra, and a thorough understanding of ordination methods requires training in linear algebra. However, for our present purposes, it suffices to know that:

• Eigenanalysis is a mathematical operation on a square, symmetric matrix. A square matrix has the same number of rows as columns. A symmetric matrix is the same if you switch rows and columns. Distance and similarity matrices are nearly always square and symmetric.
• It is possible to perform an eigenanalysis analytically (that is, to get exact results) only for very small matrices (e.g. three rows and columns). For large matrices, eigenanalysis requires an iterative approach that eventually "closes in" on the answer (in most cases).
• The answer to an eigenanalysis consists of a series of eigenvalues and eigenvectors. Each eigenvalue has an eigenvector, and there are as many eigenvectors and eigenvalues as there are rows in the initial matrix. Eigenvalues are usually ranked from the greatest to the least. The first eigenvalue is often called the "dominant" or "leading" eigenvalue. Eigenvalues are also often called "latent values".
• The eigenvalue is a measure of the strength of an axis, the amount of variation along an axis, and ideally the importance of an ecological gradient. The precise meaning depends on the ordination method used.
• The eigenvectors are the sample scores, if the rows and columns of the initial matrix represent samples.
• Some texts (e.g. Digby and Kempton 1987) describe "singular value decomposition" instead of eigenanalysis. For ordination purposes the two approaches are equivalent.
• Principal Components Analysis (PCA), Correspondence Analysis (CA, also called Reciprocal Averaging), and Detrended Correspondence Analysis (DCA) are all examples of eigenanalysis-based ordination methods.
• The direct gradient analysis (constrained ordination) techniques of RDA, CCA, and DCCA (not covered in this page) are also eigenanalysis-based techniques.
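
The points above can be illustrated with a minimal Python sketch, assuming NumPy is available. The 3×3 matrix `S` is a made-up similarity matrix, not data from any real study. The sketch ranks the eigenvalues from greatest to least, checks the defining relation between each eigenvalue and its eigenvector, and compares the eigenvalues with the singular values of the same matrix.

```python
import numpy as np

# Hypothetical square, symmetric matrix (e.g. similarities among three samples).
S = np.array([
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])

# eigh is intended for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(S)

# Rank from greatest to least, keeping each eigenvector (a column) with its eigenvalue.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Defining property: S v = lambda v for every eigenvalue/eigenvector pair.
residual = S @ eigenvectors - eigenvectors * eigenvalues

# Singular values of the same matrix, returned in descending order.
singular_values = np.linalg.svd(S, compute_uv=False)

print("eigenvalues (greatest first):", eigenvalues)
print("largest residual:", np.abs(residual).max())
```

For this symmetric, positive-definite example the singular values and the ranked eigenvalues coincide, there are as many eigenvalues as rows, and `residual` is zero to within rounding error.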

## Principal Components Analysis (PCA)

• PCA is available in most statistical packages.
• PCA is often considered a form of "factor analysis".
• PCA is a "rigid rotation" of the data matrix: it does not change the positions of points relative to each other; it just changes the coordinate system.
• PCA is mathematically quite elegant.
• In PCA, axes are created such that the sum of squared perpendicular distances from the objects to each successive ordination axis is minimized (each axis being perpendicular to the earlier ones).
• Axes are linear combinations of species/variables. The weights are known as "coefficients" or "loadings".
• PCA is good for data that are not in the same units, in which case data must be standardized to zero mean and unit variance (this is known as PCA on the correlation matrix).
• PCA can accept negative numbers for variables/species.
• PCA can place new points in an old ordination.
• PCA can find characteristics of any point in ordination space (though this is not standard in most packages).
• Eigenvalues have a meaning: variance explained.
• The sum of the eigenvalues will equal the sum of the variances of all variables in the analyzed matrix.
• If performed on a correlation matrix, each variable has unit variance, so the sum of the eigenvalues will equal the number of variables/species.
• If performed on a covariance matrix, the sum of the eigenvalues will equal the sum of the variances of all species.
• PCA has a serious problem for vegetation data: the horseshoe effect. This is caused by the curvilinearity of species distributions along gradients. Since species response curves are typically unimodal (i.e. very strongly curvilinear), horseshoe effects are common.
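
Several of the PCA properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy; the data matrix `X`, the helper names `pca` and `pairwise_distances`, and the `new_sample` values are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 20 samples (rows) by 4 variables (columns) on very different scales.
X = rng.normal(size=(20, 4)) * np.array([1.0, 10.0, 0.1, 5.0])

def pca(data, standardize=False):
    """PCA by eigenanalysis of the covariance (or correlation) matrix."""
    centered = data - data.mean(axis=0)
    if standardize:
        # Unit variance per variable: equivalent to PCA on the correlation matrix.
        centered = centered / data.std(axis=0, ddof=1)
    cov = np.cov(centered, rowvar=False)
    eigenvalues, loadings = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, loadings = eigenvalues[order], loadings[:, order]
    # Sample scores are linear combinations of the variables, weighted by the loadings.
    scores = centered @ loadings
    return eigenvalues, loadings, scores

def pairwise_distances(points):
    """Euclidean distances among all pairs of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

cov_eigenvalues, cov_loadings, cov_scores = pca(X)
cor_eigenvalues, cor_loadings, cor_scores = pca(X, standardize=True)

# Placing a new point in the old ordination: center it with the ORIGINAL means,
# then apply the ORIGINAL loadings. The existing scores are unaffected.
new_sample = np.array([0.5, -3.0, 0.02, 1.0])
new_score = (new_sample - X.mean(axis=0)) @ cov_loadings

print("covariance PCA, sum of eigenvalues:", cov_eigenvalues.sum())
print("correlation PCA, sum of eigenvalues:", cor_eigenvalues.sum())
```

Here `cov_eigenvalues.sum()` matches the summed variances of the columns of `X`, `cor_eigenvalues.sum()` equals 4 (the number of variables), and the distances among samples are the same before and after the rotation as long as all axes are retained.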

## Correspondence Analysis or Reciprocal Averaging (RA)

• RA has been discovered independently by different scientists.
• Reciprocal Averaging means that sample scores are calculated as a weighted average of species scores, and species scores are calculated as a weighted average of sample scores, and iterations continue until there is no change. However, other algorithms are possible.
• RA simultaneously ordinates species and samples. There are as many axes as there are species or samples, whichever is less.
• The number of axes worth interpreting is a matter of taste, but eigenvalues can be a guide (e.g. if the third eigenvalue is much less than the second eigenvalue, third and higher axes might not be informative).
• RA maximizes the correlation between species scores and sample scores. The eigenvalue is equal to the square of this correlation coefficient.
• An eigenvalue of 1.0 implies that one sample (or group of samples) shares no species with any other sample.
• One can put new points in a Correspondence Analysis without affecting the rest of the ordination.
• As with all the other eigenvector techniques, it is possible to define "passive samples" or "passive species".
• RA has a problem: the arch effect. It is also caused by nonlinearity of distributions along gradients.
• The arch is not as serious as the horseshoe effect of PCA, because the ends of the gradient are not convoluted.
• Another related problem of RA is that the ends of the gradient are compressed.
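
The reciprocal averaging iteration described above can be sketched as follows. This is a minimal illustration assuming NumPy, not the algorithm of any particular package; the function name `reciprocal_averaging`, its parameters, and the abundance matrix `Y` are hypothetical. Species scores are re-centered and re-standardized (weighted by species totals) on each pass so that the iteration does not collapse onto the trivial solution in which all scores are equal. The shrinkage removed by that standardization converges to the first eigenvalue.

```python
import numpy as np

def reciprocal_averaging(Y, max_iter=1000, tol=1e-10):
    """First RA/CA axis by two-way weighted averaging (illustrative sketch)."""
    Y = np.asarray(Y, dtype=float)
    row_totals = Y.sum(axis=1)   # sample totals
    col_totals = Y.sum(axis=0)   # species totals
    col_weights = col_totals / col_totals.sum()

    def center(scores):
        # Remove the weighted mean, which excludes the trivial (constant) axis.
        return scores - np.sum(col_weights * scores)

    def weighted_norm(scores):
        return np.sqrt(np.sum(col_weights * scores ** 2))

    # Arbitrary starting species scores, centered and scaled to unit weighted variance.
    species = center(np.arange(Y.shape[1], dtype=float))
    species = species / weighted_norm(species)

    eigenvalue = 0.0
    for _ in range(max_iter):
        # Sample scores: weighted averages of species scores.
        samples = (Y @ species) / row_totals
        # Species scores: weighted averages of sample scores.
        new_species = center((Y.T @ samples) / col_totals)
        # Shrinkage over one full cycle; this converges to the eigenvalue.
        eigenvalue = weighted_norm(new_species)
        new_species = new_species / eigenvalue
        change = np.max(np.abs(new_species - species))
        species = new_species
        if change < tol:   # "iterations continue until there is no change"
            break

    samples = (Y @ species) / row_totals
    return eigenvalue, samples, species

# Hypothetical abundance matrix: 5 samples (rows) by 5 species (columns).
Y = np.array([
    [5, 3, 1, 0, 0],
    [2, 5, 3, 1, 0],
    [0, 2, 5, 3, 1],
    [0, 0, 2, 5, 3],
    [0, 0, 1, 3, 5],
], dtype=float)

eigenvalue, sample_scores, species_scores = reciprocal_averaging(Y)
print("first eigenvalue:", eigenvalue)
print("sample scores:", sample_scores)
print("species scores:", species_scores)
```

Because every sample in this example shares species with other samples, the returned eigenvalue lies strictly between 0 and 1. The same value can be obtained, within rounding error, from a direct eigenanalysis or singular value decomposition of the suitably standardized matrix, which is one of the "other algorithms" mentioned above.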

## Detrended Correspondence Analysis (DCA)

• DCA is probably the most widely used ordination technique today. It is an extension of RA.
• Both sample scores and species scores are produced.
• The first axis has the same eigenvalue as in RA.
• The arch is removed by dividing the first axis into segments and re-centering the second-axis scores of the samples within each segment.
• Samples are shifted to equalize beta-diversity: species have, on average, a habitat breadth (as measured by standard deviations) of 1. Thus the axes of DCA are a useful measure of beta diversity.
• One cannot easily add new samples to an existing DCA ordination.
• DCA has problems: it is inelegant, is sensitive to the parameters that determine the number of segments, destroys a "true" arch (if it exists), and perhaps creates a "tongue effect" (I suspect that the existence of a tongue effect is more commonly a real attribute of nature than an artifact of DCA).
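
The idea of detrending by segments can be sketched in a few lines. This is a deliberately simplified illustration assuming NumPy, not the procedure of any published DCA program, which is more elaborate (for example, in how segments are smoothed and how axes are rescaled). The function name `detrend_by_segments`, its parameters, and the arched test data are hypothetical.

```python
import numpy as np

def detrend_by_segments(axis1, axis2, n_segments=5):
    """Center axis-2 scores on zero within equal-width segments of axis 1."""
    axis1 = np.asarray(axis1, dtype=float)
    axis2 = np.asarray(axis2, dtype=float)
    edges = np.linspace(axis1.min(), axis1.max(), n_segments + 1)
    # Segment index 0..n_segments-1 per sample; the maximum goes in the last segment.
    segment = np.clip(np.digitize(axis1, edges) - 1, 0, n_segments - 1)
    detrended = axis2.copy()
    for s in range(n_segments):
        members = segment == s
        if members.any():
            detrended[members] -= axis2[members].mean()
    return detrended, segment

# Hypothetical arched configuration: the second axis is a quadratic function of the first.
axis1 = np.linspace(-1.0, 1.0, 41)
axis2 = 1.0 - axis1 ** 2
detrended, segment = detrend_by_segments(axis1, axis2, n_segments=8)

print("range of axis 2 before:", axis2.max() - axis2.min())
print("range of axis 2 after: ", detrended.max() - detrended.min())
```

Changing `n_segments` changes the result, which illustrates the sensitivity to the number of segments noted above. The sketch also removes the arch whether or not it reflects real structure in the data.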

