Title: | Multiscale Codependence Analysis |
---|---|
Description: | Computation of Multiscale Codependence Analysis and spatial eigenvector maps. |
Authors: | Guillaume Guenard and Pierre Legendre, Bertrand Pages |
Maintainer: | Guillaume Guenard <[email protected]> |
License: | GPL-3 |
Version: | 1.2-3 |
Built: | 2024-11-11 04:13:48 UTC |
Source: | https://github.com/guenardg/codep |
Computation of Multiscale Codependence Analysis and spatial eigenvector maps. Multiscale Codependence Analysis (MCA) consists in assessing the coherence of pairs of variables in space (or time) using the product of their correlation coefficients with series of spatial (or temporal) eigenfunctions. That product, which is positive or negative when variables display similar or opposing trends, respectively, is called a codependence coefficient.
The eigenfunctions used in the calculation are obtained in three steps: 1) a distance matrix is calculated from the locations of samples in space (or the sampling organisation through time). 2) From that distance matrix, a matrix of Moran spatial weights is obtained; this is the same matrix as used to calculate Moran's autocorrelation index, hence the name. And 3) the spatial weight matrix is eigenvalue-decomposed after centring the rows and columns of the spatial weight matrix.
The statistical significance of codependence coefficients is tested using parametric or permutational testing of a tau statistic. The 'tau' statistic is the product of the Student's 't' statistics obtained from comparison of the two variables with a given eigenfunction. The 'tau' statistic can take either positive or negative values, thereby allowing one to perform one-tailed or two-tailed testing. For multiple response variables, testing is performed using the 'phi' statistic instead of 'tau'. That statistic follows the distribution of the product of two Fisher-Snedocor F statistics (see product-distribution for details).
Function MCA
performs Multiscale Codependence Analysis
(MCA). Functions test.cdp
and permute.cdp
handle
parametric or permutation testing of the codependence coefficients,
respectively.
Methods are provided to print and plot cdp-class objects
(print.cdp
and plot.cdp
, respectively) as well as
summary (summary.cdp
), fitted values
(fitted.cdp
), residuals (residuals.cdp
), and for
making predictions (predict.cdp
).
Function eigenmap
calculates spatial eigenvector maps following
the approach outlined in Dray et al. (2006), and which are necessary to
calculate MCA
. It returns a eigenmap-class object. The
package also features methods to print (print.eigenmap
) and
plot (plot.eigenmap
) these objects. Function
eigenmap.score
can be used to make predictions for spatial
models built from the eigenfunctions of eigenmap
using
distances between one or more target locations and the sampled locations for
which the spatial eigenvector map was built.
The package also features an exemplary dataset salmon containing 76
sampling site positions along a 1520 m river segment. It also contains
functions cthreshold
and minpermute
, which
compute the testwise type I error rate threshold corresponding to a given
familywise threshold and the minimal number of permutations needed for
testing Multiscale Codependence Analysis given the alpha threshold,
respectively.
The DESCRIPTION file:
Package: | codep |
Type: | Package |
Version: | 1.2-3 |
Date: | 2024-04-15 |
Encoding: | UTF-8 |
Title: | Multiscale Codependence Analysis |
Author: | Guillaume Guenard and Pierre Legendre, Bertrand Pages |
Maintainer: | Guillaume Guenard <[email protected]> |
Description: | Computation of Multiscale Codependence Analysis and spatial eigenvector maps. |
Depends: | R (>= 3.5.0) |
Suggests: | |
Imports: | grDevices, graphics, stats, parallel |
License: | GPL-3 |
LazyLoad: | yes |
NeedsCompilation: | yes |
RoxygenNote: | 7.2.3 |
Repository: | https://guenardg.r-universe.dev |
RemoteUrl: | https://github.com/guenardg/codep |
RemoteRef: | HEAD |
RemoteSha: | 215f84947ca78b30c71000144dcf8beea82c4361 |
Index of help topics:
Doubs The Doubs Fish Data Euclid Calculation of the Euclidean Distance LGDat Legendre and Gallagher Synthetic Example LGTransforms Transformation for Species Abundance Data MCA Multiple-descriptors, Multiscale Codependence Analysis cdp-class Class and Methods for Multiscale Codependence Analysis (MCA) codep-package Multiscale Codependence Analysis cthreshold Familywise Type I Error Rate eigenmap Spatial Eigenvector Maps eigenmap-class Class and Methods for Spatial Eigenvector Maps geodesics Calculation of Geodesic Distances minpermute Number of Permutations for MCA mite The Oribatid Mite Data Set product-distribution Frequency Distributions for MCA Parametric Testing salmon The St. Marguerite River Altantic Salmon Parr Transect weighting-functions Weighting Functions for Spatial Eigenvector Map
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Dray, S.; Legendre, P. and Peres-Neto, P. 2006. Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbor matrices (PCNM). Ecol. Modelling 196: 483-493
Guénard, G., Legendre, P., Boisclair, D., and Bilodeau, M. 2010. Multiscale codependence analysis: an integrated approach to analyse relationships across scales. Ecology 91: 2952-2964
Guénard, G. Legendre, P. 2018. Bringing multivariate support to multiscale codependence analysis: Assessing the drivers of community structure across spatial scales. Meth. Ecol. Evol. 9: 292-304
Legendre, P. and Legendre, L. 2012. Numerical Ecology, 3rd English edition. Elsevier Science B.V., Amsterdam, The Neatherlands.
data(mite) emap <- eigenmap(x = mite.geo, weighting = wf.RBF, wpar = 0.1) emap ## Organize the environmental variables mca0 <- MCA(Y = log1p(mite.species), X = mite.env, emobj = emap) mca0_partest <- test.cdp(mca0, response.tests = FALSE) summary(mca0_partest) plot(mca0_partest, las = 2, lwd = 2) plot(mca0_partest, col = rainbow(1200)[1L:1000], las = 3, lwd = 4, main = "Codependence diagram", col.signif = "white") rng <- list(x = seq(min(mite.geo[,"x"]) - 0.1, max(mite.geo[,"x"]) + 0.1, 0.05), y = seq(min(mite.geo[,"y"]) - 0.1, max(mite.geo[,"y"]) + 0.1, 0.05)) grid <- cbind(x = rep(rng[["x"]], length(rng[["y"]])), y = rep(rng[["y"]], each = length(rng[["x"]]))) newdists <- matrix(NA, nrow(grid), nrow(mite.geo)) for(i in 1L:nrow(grid)) { newdists[i,] <- ((mite.geo[,"x"] - grid[i,"x"])^2 + (mite.geo[,"y"] - grid[i,"y"])^2)^0.5 } spmeans <- colMeans(mite.species) pca0 <- svd(log1p(mite.species) - rep(spmeans, each = nrow(mite.species))) prd0 <- predict( mca0_partest, newdata = list(target = eigenmap.score(emap, newdists)) ) Uprd0 <- (prd0 - rep(spmeans, each = nrow(prd0))) %*% pca0$v %*% diag(pca0$d^-1) ## Printing the response variable prmat <- Uprd0[,1L] dim(prmat) <- c(length(rng$x), length(rng$y)) zlim <- c(min(min(prmat), min(pca0$u[,1L])), max(max(prmat), max(pca0$u[,1L]))) image(z = prmat, x = rng$x, y = rng$y, asp = 1, zlim = zlim, col = rainbow(1200L)[1L:1000], ylab = "y", xlab = "x") points( x = mite.geo[,"x"], y = mite.geo[,"y"], pch = 21, bg = rainbow(1200L)[round(1+(999*(pca0$u[,1L] - zlim[1L])/ (zlim[2L] - zlim[1L])),0)] )
data(mite) emap <- eigenmap(x = mite.geo, weighting = wf.RBF, wpar = 0.1) emap ## Organize the environmental variables mca0 <- MCA(Y = log1p(mite.species), X = mite.env, emobj = emap) mca0_partest <- test.cdp(mca0, response.tests = FALSE) summary(mca0_partest) plot(mca0_partest, las = 2, lwd = 2) plot(mca0_partest, col = rainbow(1200)[1L:1000], las = 3, lwd = 4, main = "Codependence diagram", col.signif = "white") rng <- list(x = seq(min(mite.geo[,"x"]) - 0.1, max(mite.geo[,"x"]) + 0.1, 0.05), y = seq(min(mite.geo[,"y"]) - 0.1, max(mite.geo[,"y"]) + 0.1, 0.05)) grid <- cbind(x = rep(rng[["x"]], length(rng[["y"]])), y = rep(rng[["y"]], each = length(rng[["x"]]))) newdists <- matrix(NA, nrow(grid), nrow(mite.geo)) for(i in 1L:nrow(grid)) { newdists[i,] <- ((mite.geo[,"x"] - grid[i,"x"])^2 + (mite.geo[,"y"] - grid[i,"y"])^2)^0.5 } spmeans <- colMeans(mite.species) pca0 <- svd(log1p(mite.species) - rep(spmeans, each = nrow(mite.species))) prd0 <- predict( mca0_partest, newdata = list(target = eigenmap.score(emap, newdists)) ) Uprd0 <- (prd0 - rep(spmeans, each = nrow(prd0))) %*% pca0$v %*% diag(pca0$d^-1) ## Printing the response variable prmat <- Uprd0[,1L] dim(prmat) <- c(length(rng$x), length(rng$y)) zlim <- c(min(min(prmat), min(pca0$u[,1L])), max(max(prmat), max(pca0$u[,1L]))) image(z = prmat, x = rng$x, y = rng$y, asp = 1, zlim = zlim, col = rainbow(1200L)[1L:1000], ylab = "y", xlab = "x") points( x = mite.geo[,"x"], y = mite.geo[,"y"], pch = 21, bg = rainbow(1200L)[round(1+(999*(pca0$u[,1L] - zlim[1L])/ (zlim[2L] - zlim[1L])),0)] )
A class and set of methods to handle the results of Multiscale Codependence Analysis.
## S3 method for class 'cdp' print(x, ...) ## S3 method for class 'cdp' plot(x, col, col.signif = 2, main = "", ...) ## S3 method for class 'cdp' summary(object, ...) ## S3 method for class 'cdp' fitted(object, selection, components = FALSE, ...) ## S3 method for class 'cdp' residuals(object, selection, ...) ## S3 method for class 'cdp' predict(object, selection, newdata, components = FALSE, ...)
## S3 method for class 'cdp' print(x, ...) ## S3 method for class 'cdp' plot(x, col, col.signif = 2, main = "", ...) ## S3 method for class 'cdp' summary(object, ...) ## S3 method for class 'cdp' fitted(object, selection, components = FALSE, ...) ## S3 method for class 'cdp' residuals(object, selection, ...) ## S3 method for class 'cdp' predict(object, selection, newdata, components = FALSE, ...)
x |
A |
... |
Further parameters to be passed to other functions or methods. |
col |
A vector of color values to be used for plotting the multivariate codependence coefficients. |
col.signif |
Color of the frame used to mark the statistically significant codependence coefficients. |
main |
Text for the main title of the plot. |
object |
A |
selection |
A numeric vector of indices or character vector variable
names to test or force-use. Mandatory if |
components |
A boolean specifying whether the components of fitted or predicted values associated with single eigenfunctions in the map should be returned. |
newdata |
A list with elements $X, $meanY, and $target that contain the information needed to make predictions (see details). |
cdp-class objects contain:
A list with two elements: the first being a copy of the
response ('Y') and the second being a copy of the explanatory variables
('X'). This is the variables that were given to MCA
.
The eigenmap-class object that was given to
MCA
.
A list with five elements: the first ('UpY') is a matrix of the cross-products of structuring variable ('U') and the response variable 'Y', the second ('UpX') is a matrix of the cross-product of the structuring variable and the explanatory variables ('X'), the third ('C') is a 3-dimensional array of the codependence coefficients, the fourth ('B') is a 3-dimensional array of the coregression coefficients, and the fifth ('CM') is a matrix of the multivariate codependence coefficients.
Results of statistical testing as performed by
test.cdp
or permute.cdp
. NULL
if no
testing was performed, such as when only MCA
had been
called. The results of statistical testing is a list containing the
following members:
The number of randomized permutations used by
permute.cdp
for permutation testing. 0 or FALSE
for
parametric testing obtained using test.cdp
.
The indices of codependence coefficient describing statistically significant codependence between 'Y' and 'X', in decreasing order of magnitude.
The testing table (a 5-column matrix) with phi statistics, degrees-of-freedom, and testwise and familywise probabilities of type I (alpha) error. It contains one line for each statistically significant global coefficient (if any) in addition to test results for the first, non-significant coefficient, on which the testing procedure stopped.
Tests of every single response variable (a
3-dimensional array), had such tests been requested while calling the
testing function, NULL
otherwise.
Details about permutation testing not shown in
'test$global' or 'test$response'. NULL
for parametric
testing.
The 'fitted', 'residuals', and 'predict' methods return a matrix of fitted,
residuals, or predicted values, respectively. The 'fitted' and 'predict'
methods return a list a list when argument 'component' is TRUE
. The
list contains the 'fitted' or 'predicted' values as a first element and an
array 'components' as a second. That 3-dimensional array has one matrix for
each statistically significant codependence coefficient.
For making predictions, argument newdata
may contain three elements:
'$X', a matrix of new values of the explanatory variables, '$meanY', a vector
of the predicted mean values of the responses, and '$target', a matrix of
target scores for arbitrary locations within the study area. When no '$X' is
supplied, the descriptor given to MCA
is recycled, while when
no '$meanY' is supplied, the mean values of the response variables given to
MCA
are used.
Finally, when element '$target' is omitted from argument newdata
,
predictions are made at the sites were observations were done. When none of
the above is provided, or if newdata
is omitted when calling the
prediction method, the behaviour of the 'predict' method is identical to
that of the 'fitted' method.
From version 0.7-1, cdp-class replaces the former class
mca
used by codep-package because the standard package MASS
also had S3 methods for a class named mca
that were overwritten by
those of codep-package.
print(cdp)
: Print method for cdp-class objects.
plot(cdp)
: Plot method for cdp-class objects.
summary(cdp)
: Summary method for cdp-class objects.
fitted(cdp)
: Fitted method for cdp-class objects.
residuals(cdp)
: Residuals method for cdp-class objects.
predict(cdp)
: Predict method for cdp-class objects.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Guénard, G., Legendre, P., Boisclair, D., and Bilodeau, M. 2010. Multiscale codependence analysis: an integrated approach to analyse relationships across scales. Ecology 91: 2952-2964
Guénard, G. Legendre, P. 2018. Bringing multivariate support to multiscale codependence analysis: Assessing the drivers of community structure across spatial scales. Meth. Ecol. Evol. 9: 292-304
Function to calculate the testwise type I error rate threshold corresponding to a give familywise threshold.
cthreshold(alpha, nbtest)
cthreshold(alpha, nbtest)
alpha |
The familywise type I error threshold. |
nbtest |
The number of tests performed. |
Type I error rate inflation occurs when a single hypothesis is tested indirectly using inferences about two or more (i.e., a family of) sub-hypotheses. In such situation, the probability of type I error (i.e., the probability of incorrectly rejecting the null hypothesis) of the single, familywise, hypothesis is higher than the lowest, testwise, probabilities. As a consequence, the rejection of null hypothesis for one or more individual tests does not warrant that the correct decision (whether to reject the the null hypothesis on a familywise basis) was taken properly. This function allows to obtain correct, familywise, alpha thresholds in the context of multiple testing. It is base on the Sidak inegality.
The threshold that have to be used for individual tests.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Sidak, Z. 1967. Rectangular Confidence Regions for Means of Multivariate Normal Distributions J. Am. Stat. Assoc. 62: 626-633
Wright, P. S. 1992. Adjusted p-values for simultaneous inference. Biometrics 48: 1005-1013
Legendre, P. and Legendre, L. 1998. Numerical Ecology. Elsevier Science B.V., Amsterdam, The Neatherlands. p. 18
## For a familywise threshold of 5% with 5 tests: cthreshold(0.05, 5) ## The corrected threshold for each test is 0.01020622
## For a familywise threshold of 5% with 5 tests: cthreshold(0.05, 5) ## The corrected threshold for each test is 0.01020622
Fish community composition of the Doubs River, France.
data(Doubs)
data(Doubs)
Contains three matrices:
The abundance of 27 fish species.
Nine environmental variables (all quantitative).
Geographic information of the samples.
Values in 'Doubs.fish' are counts of individuals of each of 27 species observed in a set of 30 sites located along the 453 km long Doubs River, France (see Verneaux 1973 for further details about fishing methods and effort).
contains 11 quantitative variables, namely the slope ('slo'; 1/1000) and mean minimum discharge ('flo' m³/s) of the river, the pH of the water, its harness (Calcium concentration; 'har'; mg/L), phosphate ('pho'; mg/L), nitrate ('nit'; mg/L), and ammonium ('amm'; mg/L), concentration as well as its dissolved oxygen ('oxy'; mg/L) and biological oxygen demand ('bdo'; mg/L).
contains geographical information. 'Lon', the longitude and 'Lat', the latitude of the sample (degree) as well as 'DFS', its distance from the source of the river (km) and 'Alt', altitude (m above see level).
Verneaux, 1973
Verneaux J. 1973. - Cours d'eau de Franche-Comté (Massif du Jura). Recherches écologiques sur le réseau hydrographique du Doubs. Essai de biotypologie. Thèse d'état, Besançon. 257 p.)
Verneaux, J.; Schmitt, V.; Verneaux, V. & Prouteau, C. 2003. Benthic insects and fish of the Doubs River system: typological traits and the development of a species continuum in a theoretically extrapolated watercourse. Hydrobiologia 490: 60-74
Borcard, D.; Gillet, F. & Legendre, P. 2011. Numerical Ecology with R. Springer, New-York, NY, USA.
data(Doubs) summary(Doubs.fish) summary(Doubs.env) summary(Doubs.geo)
data(Doubs) summary(Doubs.fish) summary(Doubs.env) summary(Doubs.geo)
Function to calculate spatial eigenvector maps of a set of locations in a space with an arbitrary number of dimension.
eigenmap( x, alt.coord = NA, weighting = wf.sqrd, boundaries, wpar, tol = .Machine$double.eps^0.5 ) eigenmap.score(emap, target)
eigenmap( x, alt.coord = NA, weighting = wf.sqrd, boundaries, wpar, tol = .Machine$double.eps^0.5 ) eigenmap.score(emap, target)
x |
A set of coordinates defined in one (numeric vector) or many (a
coordinate x dimension matrix) dimensions or, alternatively, a distance
matrix provided by |
alt.coord |
Coordinates to be used when a distance matrix is provided as x. Used for plotting purposes. |
weighting |
The function to obtain the edge weighting matrix (see details). |
boundaries |
When required by argument |
wpar |
Shape parameter for argument |
tol |
The smallest absolute eigenvalue for a spatial eigenfunctions to
be considered as a suitable predictor. Default:
|
emap |
An eigenmap-class object. |
target |
A (generally rectangular) distance matrix between a set of
target locations for which spatially-explicit predictions are being made
(rows), and the reference locations given to function |
When function eigenmap
is given coordinates as its argument
x
, they are treated as Cartesian coordinates and the distances between
them are assumed to be Euclidean. Otherwise (e.g., when geodesic distances
are used), distances have to be provided as the argument x
and
plotting coordinates have to be supplied as argument alt.coord
.
The weighting function (see weighting-functions) must have the
distances as its first argument, optionally an argument named
boundaries
giving the boundaries within which locations are regarded
as neighbours and/or an argument wpar
containing any other weighting
function parameters.
Default values for argument boundaries
are 0 for the minimum value and
NA
for the maximum. For weighting functions with an argument
bounraries
, The upper value NA
indicates the function to take
the minimum value that allow every locations to form a single cluster
following single linkage clustering as a maximum value (obtained internally
from a call to hclust
.
eigenmap()
: Main function for generating an eigenmap-class object from Cartesian
coordinates or pairwise distances.
eigenmap.score()
: Generate scores for arbitrary locations within the scope of an existing
eigenvector map.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Borcard, D. and Legendre, P. 2002. All-scale spatial analysis of ecological data by means of principal coordinates of neighbour matrices. Ecol. Model. 153: 51-68
Dray, S.; Legendre, P. and Peres-Neto, P. 2006. Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbor matrices (PCNM). Ecol. Modelling 196: 483-493
Legendre, P. and Legendre, L. 2012. Numerical Ecology, 3rd English edition. Elsevier Science B.V., Amsterdam, The Netherlands.
### Example 1: A linear transect. data(salmon) ## A warning is issued when no boundaries are provided for a function that ## requires them. ## Example: map <- eigenmap(x = salmon[,"Position"], weighting = wf.binary) map ## plot(map) ## In the following examples, boundaries are provided; they are needed by the ## functions. map <- eigenmap(x = salmon[,"Position"], weighting = wf.binary, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.Drayf1, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.Drayf2, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.Drayf3, boundaries = c(0,20), wpar = 2) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.PCNM, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.sqrd) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.RBF, wpar = 0.001) map ## plot(map) ### Example 2: Using predictor scores smpl <- c(4,7,10,14,34,56,61,64) # A sample to be discarded map <- eigenmap(x = salmon[-smpl,"Position"], weighting = wf.sqrd) scr <- eigenmap.score( map, target = as.matrix(dist(salmon[,"Position"]))[,-smpl] ) ## Scores of sampling points are the eigenvectors scr[smpl,] wh <- 5L # You can try with other vectors. plot(map$U[,wh] ~ salmon[-smpl,"Position"], ylab = expression(U[5]), xlab = "Position along transect") points(y = scr[smpl,wh], x = salmon[smpl,"Position"], pch = 21L, bg = "black") map <- eigenmap(x = salmon[-smpl,"Position"], weighting = wf.binary, boundaries = c(0,20)) scr <- eigenmap.score( map, target = as.matrix(dist(salmon[,"Position"]))[smpl,-smpl]) ## Plot the 8 prediction sites along particular eigenvectors, here ## eigenvector #1: wh <- 1L # One could try the other vectors. plot(map$U[,wh] ~ salmon[-smpl,"Position"], ylab = expression(U[1L]), xlab = "Position along transect (m)") points(y = scr[,wh], x = salmon[smpl,"Position"], pch=21L, bg = "black") map <- eigenmap(x = salmon[-smpl,"Position"], weighting = wf.PCNM, boundaries = c(0,100)) scr <- eigenmap.score( map, target = as.matrix(dist(salmon[,"Position"]))[smpl,-smpl] ) wh <- 1L # You can try with other vectors. plot(map$U[,wh] ~ salmon[-smpl,"Position"], ylab = expression(U[1]), xlab = "Position along transect (m)") points(y = scr[,wh], x = salmon[smpl,"Position"], pch = 21L, bg = "black") ### Example 3: A unevenly sampled surface. data(mite) ## Example using the principal coordinates of the square root of the ## (Euclidean) distances: map <- eigenmap(x = as.matrix(mite.geo), weighting = wf.sqrd) map ## plot(map) ## Example using the radial basis functions (RBF): map <- eigenmap(x = as.matrix(mite.geo), weighting = wf.RBF) map ## plot(map)
### Example 1: A linear transect. data(salmon) ## A warning is issued when no boundaries are provided for a function that ## requires them. ## Example: map <- eigenmap(x = salmon[,"Position"], weighting = wf.binary) map ## plot(map) ## In the following examples, boundaries are provided; they are needed by the ## functions. map <- eigenmap(x = salmon[,"Position"], weighting = wf.binary, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.Drayf1, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.Drayf2, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.Drayf3, boundaries = c(0,20), wpar = 2) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.PCNM, boundaries = c(0,20)) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.sqrd) map ## plot(map) map <- eigenmap(x = salmon[,"Position"], weighting = wf.RBF, wpar = 0.001) map ## plot(map) ### Example 2: Using predictor scores smpl <- c(4,7,10,14,34,56,61,64) # A sample to be discarded map <- eigenmap(x = salmon[-smpl,"Position"], weighting = wf.sqrd) scr <- eigenmap.score( map, target = as.matrix(dist(salmon[,"Position"]))[,-smpl] ) ## Scores of sampling points are the eigenvectors scr[smpl,] wh <- 5L # You can try with other vectors. plot(map$U[,wh] ~ salmon[-smpl,"Position"], ylab = expression(U[5]), xlab = "Position along transect") points(y = scr[smpl,wh], x = salmon[smpl,"Position"], pch = 21L, bg = "black") map <- eigenmap(x = salmon[-smpl,"Position"], weighting = wf.binary, boundaries = c(0,20)) scr <- eigenmap.score( map, target = as.matrix(dist(salmon[,"Position"]))[smpl,-smpl]) ## Plot the 8 prediction sites along particular eigenvectors, here ## eigenvector #1: wh <- 1L # One could try the other vectors. plot(map$U[,wh] ~ salmon[-smpl,"Position"], ylab = expression(U[1L]), xlab = "Position along transect (m)") points(y = scr[,wh], x = salmon[smpl,"Position"], pch=21L, bg = "black") map <- eigenmap(x = salmon[-smpl,"Position"], weighting = wf.PCNM, boundaries = c(0,100)) scr <- eigenmap.score( map, target = as.matrix(dist(salmon[,"Position"]))[smpl,-smpl] ) wh <- 1L # You can try with other vectors. plot(map$U[,wh] ~ salmon[-smpl,"Position"], ylab = expression(U[1]), xlab = "Position along transect (m)") points(y = scr[,wh], x = salmon[smpl,"Position"], pch = 21L, bg = "black") ### Example 3: A unevenly sampled surface. data(mite) ## Example using the principal coordinates of the square root of the ## (Euclidean) distances: map <- eigenmap(x = as.matrix(mite.geo), weighting = wf.sqrd) map ## plot(map) ## Example using the radial basis functions (RBF): map <- eigenmap(x = as.matrix(mite.geo), weighting = wf.RBF) map ## plot(map)
Create and handle spatial eigenvector maps of a set of locations a space with an arbitrary number of dimensions.
## S3 method for class 'eigenmap' print(x, ...) ## S3 method for class 'eigenmap' plot(x, ...)
## S3 method for class 'eigenmap' print(x, ...) ## S3 method for class 'eigenmap' plot(x, ...)
x |
an 'eigenmap-class' object. |
... |
Further parameters to be passed to other functions or methods (currently ignored). |
'eigenmap-class' objects contain:
A matrix of coordinates.
The interval within which pairs of sites are considered as neighbours.
A distance matrix.
The weighting function that had been used.
The weighting function parameter that had been used.
A vector of the eigenvalues obtain from the computation of the eigenvector map.
A matrix of the eigenvectors defining the eigenvector map.
The 'print' method provides the number of the number of orthonormal variables (i.e. basis functions), the number of observations these functions are spanning, and their associated eigenvalues.
The 'plot' method provides a plot of the eigenvalues and offers the
possibility to plot the values of variables for 1- or 2-dimensional sets of
coordinates. plot.eigenmap
opens the default graphical device driver,
i.e., X11
, windows
, or quartz
and recurses through
variable with a left mouse click on the graphical window. A right mouse click
interrupts recursing on X11
and windows
(Mac OS X users should
hit Esc on the quartz
graphical device driver (Mac OS X users).
print(eigenmap)
: Print method for eigenmap-class objects
plot(eigenmap)
: Plot method for eigenmap-class objects
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Borcard, D. and Legendre, P. 2002. All-scale spatial analysis of ecological data by means of principal coordinates of neighbour matrices. Ecol. Model. 153: 51-68
Dray, S.; Legendre, P. and Peres-Neto, P. 2006. Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbor matrices (PCNM). Ecol. Modelling 196: 483-493
Legendre, P. and Legendre, L. 2012. Numerical Ecology, 3rd English edition. Elsevier Science B.V., Amsterdam, The Netherlands.
Function Euclid
carries out the calculation of pairwise Euclidean
distances within a set of coordinates or between two sets thereof, with
optional weights.
Euclid(x, y, squared = FALSE)
Euclid(x, y, squared = FALSE)
x |
A set of coordinates in the form of a |
y |
An optional second set of coordinates in the same dimensions as
argument |
squared |
Should the squared Euclidean distances be returned (default: FALSE). |
When only one set of coordinates is given to the function (i.e.,
when argument y
is omitted), the function returns the pairwise
distances in the form of a 'dist-class' object representing a
lower-triangle matrix. If weights are omitted, the result is identical to
that produced by function dist with argument
method = "euclidean"
(the function's default).
The standard 'R' function used to calculate the Euclidean distance
(dist
), only allows one to calculate pairwise distances between
the rows of a single matrix of Cartesian coordinates and return a
'dist-class' object, which is a one-dimensional array meant to be
interpreted as a lower-triangular matrix. Function Euclid
can also be
provided two data matrices (arguments x
and y
) and output a
rectangular matrix of the Euclidean distances.
A 'dist-class' object or, whenever y
is provided,
a matrix
with as many rows as the number of rows in x
and as many columns as the number of rows in y
.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
The 'dist-class' and associated methods.
## A set of reference points: x <- cbind(c(1,4,5,2,8,4), c(3,6,7,1,3,2)) dimnames(x) <- list(LETTERS[1:6], c("x", "y")) ## The pairwise Euclidean distances among the reference points: d1 <- Euclid(x) d1 ## That result is the same as that obtained from function dist: d2 <- dist(x, method = "euclidean") all(d1 == d2) ## A second set of points: y <- cbind(c(3,5,7), c(3,6,8)) dimnames(y) <- list(LETTERS[7:9], c("x", "y")) ## The distances between the points in y (rows) and x (columns): Euclid(x, y)
## A set of reference points: x <- cbind(c(1,4,5,2,8,4), c(3,6,7,1,3,2)) dimnames(x) <- list(LETTERS[1:6], c("x", "y")) ## The pairwise Euclidean distances among the reference points: d1 <- Euclid(x) d1 ## That result is the same as that obtained from function dist: d2 <- dist(x, method = "euclidean") all(d1 == d2) ## A second set of points: y <- cbind(c(3,5,7), c(3,6,8)) dimnames(y) <- list(LETTERS[7:9], c("x", "y")) ## The distances between the points in y (rows) and x (columns): Euclid(x, y)
Function geodesics
carries out the calculation of pairwise geodesic
distances within a set of coordinates or between two sets thereof, using one
of two calculation approaches.
geodesics( x, y, method = c("haversine", "Vincenty"), radius = 6371000, sma = 6378137, flat = 1/298.257223563, maxiter = 1024L, tol = .Machine$double.eps^0.75 )
geodesics( x, y, method = c("haversine", "Vincenty"), radius = 6371000, sma = 6378137, flat = 1/298.257223563, maxiter = 1024L, tol = .Machine$double.eps^0.75 )
x |
A set of geographic coordinates in the form of a two-column
|
y |
An other two-column |
method |
The calculation method used to obtain the distances (default: haversine method; see details). |
radius |
Radius of the planetary body (when assuming a sphere; default: 6371000 m). |
sma |
Length of the semi-major axis of the planetary body (when assuming a revolution ellipsoid; default: 6378137 m). |
flat |
Flattening of the ellipsoid (default: 1/298.257223563). |
maxiter |
Maximum number of iterations, whenever iterative calculation is involved (default: 1024). |
tol |
Tolerance used when iterative calculation is involved (default:
|
When only one set of coordinates is given to the function (i.e.,
when argument y
is omitted), the function returns the pairwise
distances in the form of a 'dist-class' object representing a
lower-triangle matrix. When the second coordinate set is given, the function
calculates the distances between each coordinate of argument x
and
each coordinate of argument y
.
Two calculation methods are implemented. The first is the haversine formula,
which assume the planetary body to be a sphere. The radius of that sphere is
given to the function as its argument radius
, with the default value
being the mean radius of planet earth. Of the two methods implemented, the
haversine formula is fastest but its precision depend on how well the
planetary body match the sphericity assumption. The second method implemented
is Vincenty's inverse formula, which assumes the the planetary body is a
revolution ellipsoid, which is expected for rotating semi-fluid such as
planet earth. Argument sma
, the length of the semi-major axis,
corresponds to the radius of the circle obtained when the revolution
ellipsoid at the equator, whereas argument flat
correspond to the
compression of the sphere, along the diameter joining the poles, to form the
ellipsoid of revolution. Their default values corresponds to parameters for
planet Earth according to WGS84. These values, along with arguments
maxiter
and tol
, are ignored when using the haversine formula,
whereas the value of argument radius
is ignored when using Vincenty's
inverse formula.
Vincenty's inverse formula is more precise on planet Earth (on the order of
0.5mm) than the haversine formula, but it involves more computation time and
may sometimes fail to converge. This is more likely for pairs of locations
that are nearly antipodal or both (numerically) very close to the equator.
The results returned by the function when using Vincenty's inverse formula
are given a niter
attribute that gives the number of iterations that
were necessary to achieve convergence. Numbers greater than argument
maxiter
are indicative of failed convergence; a warning is issued in
such a circumstance.
Geodesic distance matrices are non metric.
A 'dist-class' object or, whenever argument y
is
provided, a matrix
with as many rows as the number of rows in
argument x
and as many columns as the number of rows in argument
y
.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Vincenty, T. 1975. Direct and Inverse Solutions of Geodesics on the Ellipsoid with application of nested equations. Survey Review XXIII (176): 88-93 doi:10.1179/sre.1975.23.176.88
Inman, J. 1835. Navigation and Nautical Astronomy: For the Use of British Seamen (3 ed.). London, UK: W. Woodward, C. & J. Rivington
The dist-class
and associated methods.
## ### First example: locations spread throughout the world ## coords <- cbind(c(43,22,9,12,-40,72,-86,-22), c(-135,22,0,1,-45,12,27,-139)) res_hav <- geodesics(coords) ## Default: the haversine formula res_hav res_vif <- geodesics(coords, method = "Vincenty") res_vif attr(res_vif,"niter") ## The numbers of iterations res_vif-res_hav ## Absolute difference 200*(res_vif-res_hav)/(res_vif+res_hav) ## Large relative difference ## ### Second example: locations nearer from one another ## coords <- cbind(c(45.01,44.82,45.23,44.74), c(72.03,72.34,71.89,72.45)) res_hav <- geodesics(coords) res_vif <- geodesics(coords, method = "Vincenty") res_vif-res_hav ## Absolute difference 200*(res_vif-res_hav)/(res_vif+res_hav) ## Relative difference are smaller ##
## ### First example: locations spread throughout the world ## coords <- cbind(c(43,22,9,12,-40,72,-86,-22), c(-135,22,0,1,-45,12,27,-139)) res_hav <- geodesics(coords) ## Default: the haversine formula res_hav res_vif <- geodesics(coords, method = "Vincenty") res_vif attr(res_vif,"niter") ## The numbers of iterations res_vif-res_hav ## Absolute difference 200*(res_vif-res_hav)/(res_vif+res_hav) ## Large relative difference ## ### Second example: locations nearer from one another ## coords <- cbind(c(45.01,44.82,45.23,44.74), c(72.03,72.34,71.89,72.45)) res_hav <- geodesics(coords) res_vif <- geodesics(coords, method = "Vincenty") res_vif-res_hav ## Absolute difference 200*(res_vif-res_hav)/(res_vif+res_hav) ## Relative difference are smaller ##
A data set used as a synthetic example in paper Legendre and Gallagher (2001).
data(LGDat)
data(LGDat)
A 19 rows by 10 columns data.frame
.
This synthetic data set is described by Legendre and Gallagher (2001) and was used to test species abundance transformations. Its first column contains geographic locations from 1 to 19 (no particular units are specified). The five columns that follow contain abundances of five species with abundances peaking at 7-8 at different locations (site 1, 5, 10, 15, and 19). The latter are considered "abundant species". For next four columns contains abundances of "rare species" occurring in between the abundance species (abundances from 1 to 4).
Legendre, P. & Gallagher E. D. 2001. Ecologically meaningful transformations for ordination of species data. Oecologia 129: 271-280 doi: 10.1007/s004420100716
data(LGDat) summary(LGDat)
data(LGDat) summary(LGDat)
Calculates the transformed species abundances following Legendre and Gallagher.
LGTransforms( x, method = c("chord", "chisq", "profile", "Hellinger"), offset = 0, power = 1 )
LGTransforms( x, method = c("chord", "chisq", "profile", "Hellinger"), offset = 0, power = 1 )
x |
A species abundance matrix (rows: sites, columns: species). |
method |
The transformation method, one of "chord" (the default), "chisq", "profile", or "Hellinger" (see details). |
offset |
Offset value applied to all the columns of |
power |
Exponent for the power transformation (Box-Cox) applied
to all columns of |
These transformations of species abundances values are useful for
multivariate least squares methods for ordination methods, such as the
principal component analysis, or modelling methods, such as the multiscale
codependence analysis (MCA
), the canonical redundancy analysis
(RDA). They allow one to use least squares methods, which operate on the
basis of the Euclidean metric, on species abundance data, for which the
Euclidean metric have generally inadequate properties (see Legendre &
Gallagher 2001 and Legendre & Borcard 2018, in references below, for a
thorough discussion on the topic).
The power (Box Cox) transformation involves the following equation:
y' = (y + offset)^power if power != 0
y' = log(y + offset) if power == 0
The default values for offset
(0
) and power
(1
)
correspond to applying no transformation besides that specified by argument
methods
.
A matrix of the transformed species abundances.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Legendre P. & Gallagher E. D. 2001. Ecologically meaningful transformations for ordination of species data. Oecologia 129: 271-280 doi: 10.1007/s004420100716
Box G. E. P. & Cox D. R. 1964. An analysis of transformations. Journal of the Royal Statistical Society Series B 26: 211-243
Legendre P. & Borcard D. 2018. Box-Cox-chord transformations for community composition data prior to beta diversity analysis. Ecography 41: 1820-1824. doi: 0.1111/ecog.03498
Legendre, P. & Legendre, L. 2012. Numerical Ecology, Third English Edition. Elsevier B. V. Amsterdam, The Netherlands.
data(Doubs) ## Removing any species that have not been not observed: Doubs.fish -> x x[rowSums(x)!=0,] -> x ## Transforming the abundances LGTransforms(x,"chord") -> chord LGTransforms(x,"chord",offset=1,power=0) -> log.chord LGTransforms(x,"chord",power=0.25) -> pow.chord LGTransforms(x,"chisq") -> chisq LGTransforms(x,"profile") -> sp_pr LGTransforms(x,"Hellinger") -> Helli dist(chord) dist(log.chord) dist(pow.chord) dist(chisq) dist(sp_pr) dist(Helli) ## Legendre & Gallagher synthetic examples: data(LGDat) ## Diastemograms: as.matrix(dist(LGDat[,1L])) -> geo geo[upper.tri(geo)] -> geo ## Raw Euclidean distances par(mfrow=c(1,1), mar=c(5,5,4,2)) as.matrix(dist(LGDat[,-1L])) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo, data=data.frame(geo=geo, eco=eco), xaxp=c(1,18,17), las=1, ylim=c(0,max(eco)), xlab="True geographic distance", ylab="Euclidean distance") ## Euclidean distances on the transformed abundances: par(mfrow=c(3,2), mar=c(3,5,4,2)) LGTransforms(LGDat[,-1L],"chord") -> chord as.matrix(dist(chord)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chord distance",ylim=c(0,max(eco))) LGTransforms(LGDat[,-1L],"chord",offset=1,power=0) -> log.chord as.matrix(dist(log.chord)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chord distance (log(x+1))",ylim=c(0,max(eco))) par(mar=c(4,5,3,2)) LGTransforms(LGDat[,-1L],"chord",power=0.25) -> pow.chord as.matrix(dist(pow.chord)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chord distance (power=0.25)",ylim=c(0,max(eco))) LGTransforms(LGDat[,-1L],"chisq") -> chisq as.matrix(dist(chisq)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chi-square distance",ylim=c(0,max(eco))) par(mar=c(5,5,2,2)) LGTransforms(LGDat[,-1L],"profile") -> sp_pr as.matrix(dist(sp_pr)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Dist. between profiles",ylim=c(0,max(eco))) LGTransforms(LGDat[,-1L],"Hellinger") -> Helli as.matrix(dist(Helli)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Hellinger distance",ylim=c(0,max(eco))) mtext(text="True geographic distance", side=1, line=-1.5, outer=TRUE) mtext(text="Ecological distance", side=2, line=-1.5, outer=TRUE) ## Examples from Legendre & Legendre 2012, page 329 (Figure 7.8): matrix(c(0,0,1,4,1,0,8,1,0),3L,3L) -> LL329 ## D1: Euclidean distance dist(LL329) ## Chord transformation (D3: chord distance) LGTransforms(LL329,"chord") -> tr tr dist(tr) ## "Species profile" transformation (D18) LGTransforms(LL329,"profile") -> tr tr dist(tr) ## Hellinger transformation (D17: Hellinger distance) LGTransforms(LL329,"Hellinger") -> tr tr dist(tr) ## Chi-square transformation (D16: Chi-square distance) LGTransforms(LL329,"chisq") -> tr tr dist(tr)
data(Doubs) ## Removing any species that have not been not observed: Doubs.fish -> x x[rowSums(x)!=0,] -> x ## Transforming the abundances LGTransforms(x,"chord") -> chord LGTransforms(x,"chord",offset=1,power=0) -> log.chord LGTransforms(x,"chord",power=0.25) -> pow.chord LGTransforms(x,"chisq") -> chisq LGTransforms(x,"profile") -> sp_pr LGTransforms(x,"Hellinger") -> Helli dist(chord) dist(log.chord) dist(pow.chord) dist(chisq) dist(sp_pr) dist(Helli) ## Legendre & Gallagher synthetic examples: data(LGDat) ## Diastemograms: as.matrix(dist(LGDat[,1L])) -> geo geo[upper.tri(geo)] -> geo ## Raw Euclidean distances par(mfrow=c(1,1), mar=c(5,5,4,2)) as.matrix(dist(LGDat[,-1L])) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo, data=data.frame(geo=geo, eco=eco), xaxp=c(1,18,17), las=1, ylim=c(0,max(eco)), xlab="True geographic distance", ylab="Euclidean distance") ## Euclidean distances on the transformed abundances: par(mfrow=c(3,2), mar=c(3,5,4,2)) LGTransforms(LGDat[,-1L],"chord") -> chord as.matrix(dist(chord)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chord distance",ylim=c(0,max(eco))) LGTransforms(LGDat[,-1L],"chord",offset=1,power=0) -> log.chord as.matrix(dist(log.chord)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chord distance (log(x+1))",ylim=c(0,max(eco))) par(mar=c(4,5,3,2)) LGTransforms(LGDat[,-1L],"chord",power=0.25) -> pow.chord as.matrix(dist(pow.chord)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chord distance (power=0.25)",ylim=c(0,max(eco))) LGTransforms(LGDat[,-1L],"chisq") -> chisq as.matrix(dist(chisq)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Chi-square distance",ylim=c(0,max(eco))) par(mar=c(5,5,2,2)) LGTransforms(LGDat[,-1L],"profile") -> sp_pr as.matrix(dist(sp_pr)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Dist. between profiles",ylim=c(0,max(eco))) LGTransforms(LGDat[,-1L],"Hellinger") -> Helli as.matrix(dist(Helli)) -> eco eco[upper.tri(eco)] -> eco plot(eco~geo,data=data.frame(geo=geo,eco=eco), xaxp=c(1,18,17),las=1,xlab="",ylab="", main="Hellinger distance",ylim=c(0,max(eco))) mtext(text="True geographic distance", side=1, line=-1.5, outer=TRUE) mtext(text="Ecological distance", side=2, line=-1.5, outer=TRUE) ## Examples from Legendre & Legendre 2012, page 329 (Figure 7.8): matrix(c(0,0,1,4,1,0,8,1,0),3L,3L) -> LL329 ## D1: Euclidean distance dist(LL329) ## Chord transformation (D3: chord distance) LGTransforms(LL329,"chord") -> tr tr dist(tr) ## "Species profile" transformation (D18) LGTransforms(LL329,"profile") -> tr tr dist(tr) ## Hellinger transformation (D17: Hellinger distance) LGTransforms(LL329,"Hellinger") -> tr tr dist(tr) ## Chi-square transformation (D16: Chi-square distance) LGTransforms(LL329,"chisq") -> tr tr dist(tr)
Class, Functions, and methods to perform Multiscale Codependence Analysis (MCA)
MCA(Y, X, emobj) test.cdp(object, alpha = 0.05, max.step, response.tests = TRUE) permute.cdp(object, permute, alpha = 0.05, max.step, response.tests = TRUE) parPermute.cdp( object, permute, alpha = 0.05, max.step, response.tests = TRUE, nnode, seeds, verbose = TRUE, ... )
MCA(Y, X, emobj) test.cdp(object, alpha = 0.05, max.step, response.tests = TRUE) permute.cdp(object, permute, alpha = 0.05, max.step, response.tests = TRUE) parPermute.cdp( object, permute, alpha = 0.05, max.step, response.tests = TRUE, nnode, seeds, verbose = TRUE, ... )
Y |
A numeric matrix or vector containing the response variable(s). |
X |
A numeric matrix or vector containing the explanatory variable(s). |
emobj |
A eigenmap-class object. |
object |
A cdp-class object. |
alpha |
The type I (alpha) error threshold used by the testing procedure. |
max.step |
The maximum number of steps to perform when testing for statistical significance. |
response.tests |
A boolean specifying whether to test individual response variables. |
permute |
The number of random permutations used for testing. When
omitted, the number of permutations is calculated using function
|
nnode |
The number of parallel computation nodes. |
seeds |
Seeds for computation nodes' random number generators when using parallel computation during the permutation test. |
verbose |
Whether to return user notifications. |
... |
Parameters to be passed to function |
Multiscale Codependence Analysis (MCA) allows to calculate correlation-like (i.e.codependence) coefficients between two variables with respect to structuring variables (Moran's eigenvector maps). The purpose of this function is limited to parameter fitting.
Test procedures are handled through test.cdp
(parametric testing) or
permute.cdp
(permutation testing). Moreover, methods are provided for
printing (print.cdp
), displaying a summary of the tests
(summary.cdp
), plotting results (plot.cdp
), calculating
fitted (fitted.cdp
) and residuals values (redisuals.cdp
), and
making predictions (predict.cdp
).
It is noteworthy that the test procedure used by MCA
deviates from the
standard R workflow since intermediate testing functions (test.cdp
and
permute.cdp
) need first to be called before any testing be performed.
Function parPermute.cdp
allows the user to spread the number of
permutation on many computation nodes. It relies on package parallel.
Omitting argument nnode
lets function parallel::detectCores
specify the number of node. Similarly, omitting parameter seeds
lets
the function define the seeds as a set of values drawn from a uniform random
distribution between with minimum value -.Machine$integer.max
and
maximum value .Machine$integer.max
.
A cdp-class object.
MCA()
: Main function to compute the multiscale codependence analysis
test.cdp()
: Parametric statistical testing for multiscale codependence analysis
permute.cdp()
: Permutation testing for multiscale codependence analysis.
parPermute.cdp()
: Permutation testing for multiscale codependence analysis using parallel
processing.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Guénard, G., Legendre, P., Boisclair, D., and Bilodeau, M. 2010. Multiscale codependence analysis: an integrated approach to analyse relationships across scales. Ecology 91: 2952-2964
Guénard, G. Legendre, P. 2018. Bringing multivariate support to multiscale codependence analysis: Assessing the drivers of community structure across spatial scales. Meth. Ecol. Evol. 9: 292-304
### Example 1: St. Marguerite River Salmon Transect data(salmon) ## Converting the data from data frames to to matrices: Abundance <- log1p(as.matrix(salmon[,"Abundance",drop = FALSE])) Environ <- as.matrix(salmon[,3L:5]) ## Creating a spatial eigenvector map: map1 <- eigenmap(x = salmon[,"Position"], weighting = wf.binary, boundaries = c(0,20)) ## Case of a single descriptor: mca1 <- MCA(Y = Abundance, X = Environ[,"Substrate",drop = FALSE], emobj = map1) mca1 mca1_partest <- test.cdp(mca1) mca1_partest summary(mca1_partest) par(mar = c(6,4,2,4)) plot(mca1_partest, las = 3, lwd=2) mca1_pertest <- permute.cdp(mca1) ## Not run: ## or: mca1_pertest <- parPermute.cdp(mca1, permute = 999999) ## End(Not run) mca1_pertest summary(mca1_pertest) plot(mca1_pertest, las = 3) mca1_pertest$UpYXcb$C # Array containing the codependence coefficients ## With all descriptors at once: mca2 <- MCA(Y = log1p(as.matrix(salmon[,"Abundance",drop = FALSE])), X = as.matrix(salmon[,3L:5]), emobj = map1) mca2 mca2_partest <- test.cdp(mca2) mca2_partest summary(mca2_partest) par(mar = c(6,4,2,4)) plot(mca2_partest, las = 3, lwd=2) mca2_pertest <- permute.cdp(mca2) ## Not run: ## or: mca2_pertest <- parPermute.cdp(mca2, permute = 999999) ## End(Not run) mca2_pertest summary(mca2_pertest) plot(mca2_pertest, las = 3, lwd=2) mca2_pertest$UpYXcb$C # Array containing the codependence coefficients mca2_pertest$UpYXcb$C[,1L,] # now turned into a matrix. ### Example 2: Doubs Fish Community Transect data(Doubs) ## Sites with no fish observed are excluded: excl <- which(rowSums(Doubs.fish) == 0) ## Creating a spatial eigenvector map: map2 <- eigenmap(x = Doubs.geo[-excl,"DFS"]) ## The eigenvalues are in map2$lambda, the MEM eigenvectors in matrix map2$U ## MCA with multivariate response data analyzed on the basis of the Hellinger ## distance: Y <- LGTransforms(Doubs.fish[-excl,],"Hellinger") mca3 <- MCA(Y = Y, X=Doubs.env[-excl,], emobj = map2) mca3_pertest <- permute.cdp(mca3) ## Not run: ## or: mca3_pertest <- parPermute.cdp(mca3, permute = 999999) ## End(Not run) mca3_pertest summary(mca3_pertest) par(mar = c(6,4,2,4)) plot(mca3_pertest, las = 2, lwd=2) ## Array containing all the codependence coefficients: mca3_pertest$UpYXcb$C ## Display the results along the transect spmeans <- colMeans(Y) pca1 <- svd(Y - rep(spmeans, each=nrow(Y))) par(mar = c(5,5,2,5) + 0.1) plot(y = pca1$u[,1L], x = Doubs.geo[-excl,"DFS"], pch = 21L, bg = "red", ylab = "PCA1 loadings", xlab = "Distance from river source (km)") ## A regular transect of sites from 0 to 450 (km) spaced by 1 km intervals ## (451 sites in total). It is used for plotting spatially-explicit ## predictions. x <- seq(0,450,1) newdists <- matrix(NA, length(x), nrow(Doubs.geo[-excl,])) for(i in 1L:nrow(newdists)) newdists[i,] <- abs(Doubs.geo[-excl,"DFS"] - x[i]) ## Calculating predictions for the regular transect under the same set of ## environmental conditions from which the codependence model was built. prd1 <- predict(mca3_pertest, newdata = list(target = eigenmap.score(map2, newdists))) ## Projection of the predicted species abundance on pca1: Uprd1 <- (prd1 - rep(spmeans, each = nrow(prd1))) %*% pca1$v %*% diag(pca1$d^-1) lines(y = Uprd1[,1L], x = x, col=2, lty = 1) ## Projection of the predicted species abundance on pca2: plot(y = pca1$u[,2L], x = Doubs.geo[-excl,"DFS"], pch = 21L, bg = "red", ylab = "PCA2 loadings", xlab = "Distance from river source (km)") lines(y = Uprd1[,2L], x = x, col = 2, lty = 1) ## Displaying only the observed and predicted abundance for Brown Trout. par(new = TRUE) plot(y = Y[,"TRU"], Doubs.geo[-excl,"DFS"], pch = 21L, bg = "green", ylab = "", xlab = "", new = FALSE, axes = FALSE) axis(4) lines(y = prd1[,"TRU"], x = x, col = 3) mtext(side = 4, "sqrt(Brown trout rel. abundance)", line = 2.5) ### Example 3: Borcard et al. Oribatid Mite ## Testing the (2-dimensional) spatial codependence between the Oribatid Mite ## community structure and environmental variables, while displaying the ## total effects of the significant variables on the community structure ## (i.e., its first principal component). data(mite) map3 <- eigenmap(x = mite.geo) Y <- LGTransforms(mite.species, "Hellinger") ## Organize the environmental variables mca4 <- MCA(Y = Y, X = mite.env, emobj = map3) mca4_partest <- test.cdp(mca4, response.tests = FALSE) summary(mca4_partest) plot(mca4_partest, las = 2, lwd = 2) plot(mca4_partest, col = rainbow(1200)[1L:1000], las = 3, lwd = 4, main = "Codependence diagram", col.signif = "white") ## Making a regular point grid for plotting the spatially-explicit ## predictions: rng <- list( x = seq(min(mite.geo[,"x"]) - 0.1, max(mite.geo[,"x"]) + 0.1, 0.05), y = seq(min(mite.geo[,"y"]) - 0.1, max(mite.geo[,"y"]) + 0.1, 0.05)) grid <- cbind(x = rep(rng[["x"]], length(rng[["y"]])), y = rep(rng[["y"]], each = length(rng[["x"]]))) newdists <- matrix(NA, nrow(grid), nrow(mite.geo)) for(i in 1L:nrow(grid)) { newdists[i,] <- ((mite.geo[,"x"] - grid[i,"x"])^2 + (mite.geo[,"y"] - grid[i,"y"])^2)^0.5 } spmeans <- colMeans(Y) pca2 <- svd(Y - rep(spmeans, each = nrow(Y))) prd2 <- predict(mca4_partest, newdata = list(target = eigenmap.score(map3, newdists))) Uprd2 <- (prd2 - rep(spmeans, each = nrow(prd2))) %*% pca2$v %*% diag(pca2$d^-1) ## Printing the response variable (first principal component of the mite ## community structure). prmat <- Uprd2[,1L] dim(prmat) <- c(length(rng$x), length(rng$y)) zlim <- c(min(min(prmat), min(pca2$u[,1L])), max(max(prmat), max(pca2$u[,1L]))) image(z = prmat, x = rng$x, y = rng$y, asp = 1, zlim = zlim, col = rainbow(1200L)[1L:1000], ylab = "y", xlab = "x") points( x=mite.geo[,"x"], y=mite.geo[,"y"], pch=21L, bg = rainbow(1200L)[round(1+(999*(pca2$u[,1L]-zlim[1L])/(zlim[2L]-zlim[1L])),0)])
### Example 1: St. Marguerite River Salmon Transect data(salmon) ## Converting the data from data frames to to matrices: Abundance <- log1p(as.matrix(salmon[,"Abundance",drop = FALSE])) Environ <- as.matrix(salmon[,3L:5]) ## Creating a spatial eigenvector map: map1 <- eigenmap(x = salmon[,"Position"], weighting = wf.binary, boundaries = c(0,20)) ## Case of a single descriptor: mca1 <- MCA(Y = Abundance, X = Environ[,"Substrate",drop = FALSE], emobj = map1) mca1 mca1_partest <- test.cdp(mca1) mca1_partest summary(mca1_partest) par(mar = c(6,4,2,4)) plot(mca1_partest, las = 3, lwd=2) mca1_pertest <- permute.cdp(mca1) ## Not run: ## or: mca1_pertest <- parPermute.cdp(mca1, permute = 999999) ## End(Not run) mca1_pertest summary(mca1_pertest) plot(mca1_pertest, las = 3) mca1_pertest$UpYXcb$C # Array containing the codependence coefficients ## With all descriptors at once: mca2 <- MCA(Y = log1p(as.matrix(salmon[,"Abundance",drop = FALSE])), X = as.matrix(salmon[,3L:5]), emobj = map1) mca2 mca2_partest <- test.cdp(mca2) mca2_partest summary(mca2_partest) par(mar = c(6,4,2,4)) plot(mca2_partest, las = 3, lwd=2) mca2_pertest <- permute.cdp(mca2) ## Not run: ## or: mca2_pertest <- parPermute.cdp(mca2, permute = 999999) ## End(Not run) mca2_pertest summary(mca2_pertest) plot(mca2_pertest, las = 3, lwd=2) mca2_pertest$UpYXcb$C # Array containing the codependence coefficients mca2_pertest$UpYXcb$C[,1L,] # now turned into a matrix. ### Example 2: Doubs Fish Community Transect data(Doubs) ## Sites with no fish observed are excluded: excl <- which(rowSums(Doubs.fish) == 0) ## Creating a spatial eigenvector map: map2 <- eigenmap(x = Doubs.geo[-excl,"DFS"]) ## The eigenvalues are in map2$lambda, the MEM eigenvectors in matrix map2$U ## MCA with multivariate response data analyzed on the basis of the Hellinger ## distance: Y <- LGTransforms(Doubs.fish[-excl,],"Hellinger") mca3 <- MCA(Y = Y, X=Doubs.env[-excl,], emobj = map2) mca3_pertest <- permute.cdp(mca3) ## Not run: ## or: mca3_pertest <- parPermute.cdp(mca3, permute = 999999) ## End(Not run) mca3_pertest summary(mca3_pertest) par(mar = c(6,4,2,4)) plot(mca3_pertest, las = 2, lwd=2) ## Array containing all the codependence coefficients: mca3_pertest$UpYXcb$C ## Display the results along the transect spmeans <- colMeans(Y) pca1 <- svd(Y - rep(spmeans, each=nrow(Y))) par(mar = c(5,5,2,5) + 0.1) plot(y = pca1$u[,1L], x = Doubs.geo[-excl,"DFS"], pch = 21L, bg = "red", ylab = "PCA1 loadings", xlab = "Distance from river source (km)") ## A regular transect of sites from 0 to 450 (km) spaced by 1 km intervals ## (451 sites in total). It is used for plotting spatially-explicit ## predictions. x <- seq(0,450,1) newdists <- matrix(NA, length(x), nrow(Doubs.geo[-excl,])) for(i in 1L:nrow(newdists)) newdists[i,] <- abs(Doubs.geo[-excl,"DFS"] - x[i]) ## Calculating predictions for the regular transect under the same set of ## environmental conditions from which the codependence model was built. prd1 <- predict(mca3_pertest, newdata = list(target = eigenmap.score(map2, newdists))) ## Projection of the predicted species abundance on pca1: Uprd1 <- (prd1 - rep(spmeans, each = nrow(prd1))) %*% pca1$v %*% diag(pca1$d^-1) lines(y = Uprd1[,1L], x = x, col=2, lty = 1) ## Projection of the predicted species abundance on pca2: plot(y = pca1$u[,2L], x = Doubs.geo[-excl,"DFS"], pch = 21L, bg = "red", ylab = "PCA2 loadings", xlab = "Distance from river source (km)") lines(y = Uprd1[,2L], x = x, col = 2, lty = 1) ## Displaying only the observed and predicted abundance for Brown Trout. par(new = TRUE) plot(y = Y[,"TRU"], Doubs.geo[-excl,"DFS"], pch = 21L, bg = "green", ylab = "", xlab = "", new = FALSE, axes = FALSE) axis(4) lines(y = prd1[,"TRU"], x = x, col = 3) mtext(side = 4, "sqrt(Brown trout rel. abundance)", line = 2.5) ### Example 3: Borcard et al. Oribatid Mite ## Testing the (2-dimensional) spatial codependence between the Oribatid Mite ## community structure and environmental variables, while displaying the ## total effects of the significant variables on the community structure ## (i.e., its first principal component). data(mite) map3 <- eigenmap(x = mite.geo) Y <- LGTransforms(mite.species, "Hellinger") ## Organize the environmental variables mca4 <- MCA(Y = Y, X = mite.env, emobj = map3) mca4_partest <- test.cdp(mca4, response.tests = FALSE) summary(mca4_partest) plot(mca4_partest, las = 2, lwd = 2) plot(mca4_partest, col = rainbow(1200)[1L:1000], las = 3, lwd = 4, main = "Codependence diagram", col.signif = "white") ## Making a regular point grid for plotting the spatially-explicit ## predictions: rng <- list( x = seq(min(mite.geo[,"x"]) - 0.1, max(mite.geo[,"x"]) + 0.1, 0.05), y = seq(min(mite.geo[,"y"]) - 0.1, max(mite.geo[,"y"]) + 0.1, 0.05)) grid <- cbind(x = rep(rng[["x"]], length(rng[["y"]])), y = rep(rng[["y"]], each = length(rng[["x"]]))) newdists <- matrix(NA, nrow(grid), nrow(mite.geo)) for(i in 1L:nrow(grid)) { newdists[i,] <- ((mite.geo[,"x"] - grid[i,"x"])^2 + (mite.geo[,"y"] - grid[i,"y"])^2)^0.5 } spmeans <- colMeans(Y) pca2 <- svd(Y - rep(spmeans, each = nrow(Y))) prd2 <- predict(mca4_partest, newdata = list(target = eigenmap.score(map3, newdists))) Uprd2 <- (prd2 - rep(spmeans, each = nrow(prd2))) %*% pca2$v %*% diag(pca2$d^-1) ## Printing the response variable (first principal component of the mite ## community structure). prmat <- Uprd2[,1L] dim(prmat) <- c(length(rng$x), length(rng$y)) zlim <- c(min(min(prmat), min(pca2$u[,1L])), max(max(prmat), max(pca2$u[,1L]))) image(z = prmat, x = rng$x, y = rng$y, asp = 1, zlim = zlim, col = rainbow(1200L)[1L:1000], ylab = "y", xlab = "x") points( x=mite.geo[,"x"], y=mite.geo[,"y"], pch=21L, bg = rainbow(1200L)[round(1+(999*(pca2$u[,1L]-zlim[1L])/(zlim[2L]-zlim[1L])),0)])
Calculates the number of permutations suitable for testing Multiscale Codependence Analysis.
minpermute(alpha, nbtest, margin = 1, ru = 3)
minpermute(alpha, nbtest, margin = 1, ru = 3)
alpha |
The familywise type I error threshold allowable for the complete analysis. |
nbtest |
The number of test performed (the number of eigenvectors in the
'mem' object in the case of |
margin |
A margin allowed for the number of permutation. Default value: 1. |
ru |
The magnitude of the round-up to apply to the number of permutations. |
This function calculate the number of permutations for use with
permute.cdp
. Argument margin
allows to apply a safe
margin to the number of permutations. The minimal suitable value for this
parameter is 1. Argument ru
allows one to round-up the number of
permutations. A value of 0 implies no round-up, a value of 1 a round-up to
the next ten, 2 a round-up to the next hundred, and so on. Function
minpermute
is called internally by permute.cdp
in
case permute = NA
. In that case, the margin is set to 10
(margin = 10
) and the outcome is rounded-up to the next thousand
(ru = 3
). This function is meant for users that wish to apply their
own margins and round-up factors to calculate the number of permutations for
use with permute.cdp
.
The minimum number of permutation to be used for
permute.cdp
.
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Guénard, G., Legendre, P., Boisclair, D., and Bilodeau, M. 2010. Multiscale codependence analysis: an integrated approach to analyse relationships across scales. Ecology 91: 2952-2964
## For a 5\% threshold under 50 tests. minpermute(alpha = 0.05, nbtest=50) ## Allowing more margin (implies more computation time). minpermute(alpha = 0.05, nbtest=50, margin=10, ru=3)
## For a 5\% threshold under 50 tests. minpermute(alpha = 0.05, nbtest=50) ## Allowing more margin (implies more computation time). minpermute(alpha = 0.05, nbtest=50, margin=10, ru=3)
Borcard et al's oribatid mite community composition from Lac Geai, Canada.
data(mite)
data(mite)
Contains three matrices:
The abundance of 35 morpho-species of oribatid mites (Acari).
14 environmental variables (quantitative and binary).
The relative coordinates of the samples.
Values in mite.species
are counts of individuals of each of
the morpho-species obtained from 5 cm diameter cores going from the surface
of the peat down to a depth of 7 cm. See Bordard & Legendre (1994) and
reference therein for details about sample treatment and species
identification.
'mite.env' contains two quantitative variables, namely the substratum density (g/L) and water content (percent wet mass over dry mass), in addition to 12 dummy variables. The first seven represent the composition of the substratum: Sphagnum magellacinum (with a majority of S. rubellum), S. rubellum, S. nemorum, (with a majority of S. augustifollium), S. rubellum + S. magellicum (in equal proportions), lignous litter, bare peat, and interface between Sphagnum species. The next three dummy variables represent the presence and abundance of shrubs (Kalmia polifolia, K. angustifolia, and Rhododentron groenlandicum): none, few, and many. The last two dummy variables represent the microtopography of the peat: blanket (flat) or hummock (raised).
'mite.geo' contains the location of the samples, in meters, with respect to the sampling grid. Point (0,0) is the lower left end of the plot for an observer looking from the shore towards the water. The 'x' coordinate is the offset along the shore (from left to right) while the 'y' coordinate is the offset from the shore while moving towards the water (See Borcard & Legendre, 1994, Fig. 1 for details on the sampling area).
Daniel Borcard, Département de sciences biologiques, Université de Montréal, Montréal, Québec, Canada.
Borcard, D. & Legendre, P. 1994. Environmental control and spatial structure in ecological communities: an example using Oribatid mites (Acari, Oribatei). Environ. Ecol. Stat. 1: 37-61
Borcard, D.; P. Legendre & P. Drapeau. 1992. Partialling out the spatial component of ecological variation. Ecology 73: 1045-1055
Legendre, P. 2005. Species associations: the Kendall coefficient of concordance revisited. Journal of Agricultural, Biological and Environmental Statistics 10: 226-245
Borcard, D.; Gillet, F. & Legendre, P. 2011. Numerical Ecology with R. Springer, New-York, NY, USA.
data(mite) summary(mite.species) summary(mite.env) summary(mite.geo)
data(mite) summary(mite.species) summary(mite.env) summary(mite.geo)
Density and distribution functions of the phi statistic, which is the product of two Fisher-Snedecor distributions or the tau statistic, which is the product of two Student's t distributions.
dphi(x, nu1, nu2, tol = .Machine$double.eps^0.5) pphi(q, nu1, nu2, lower.tail = TRUE, tol = .Machine$double.eps^0.5) dtau(x, nu, tol = .Machine$double.eps^0.5) ptau(q, nu, lower.tail = TRUE, tol = .Machine$double.eps^0.5)
dphi(x, nu1, nu2, tol = .Machine$double.eps^0.5) pphi(q, nu1, nu2, lower.tail = TRUE, tol = .Machine$double.eps^0.5) dtau(x, nu, tol = .Machine$double.eps^0.5) ptau(q, nu, lower.tail = TRUE, tol = .Machine$double.eps^0.5)
x , q
|
A vector of quantile. |
nu1 , nu2 , nu
|
Degrees of freedom (>0, may be non-integer; |
tol |
The tolerance used during numerical estimation. |
lower.tail |
Logical; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x]. |
The density distribution of a variable z
that is the product
of two random variables 'x' and 'y' with density distributions f(x) and g(y),
respectively, is the integral f(x) * g(z/x) / abs(x) dx over the intersection
of the domains of 'x' and 'y'.
dphi
estimates density values using numerical integration
(integrate
) the Fisher-Scedecor df
density
distribution function. Following the algebra of Multiscale Codependence
Analysis, f(x) has df1 = nu1 and df2 = nu1 * nu2 degrees of freedom and g(x)
has 'df1 = 1' and 'df2 = nu2' degrees of freedom. Hence, that product
distribution has two parameters.
pphi
integrates dphi
in the interval [0,q] when
lower.tail = TRUE
(the default) and on the interval [q,Inf
]
when lower.tail = FALSE
.
dtau
and ptau
are similar to dphi
and pphi
,
respectively. pphi
integrates dphi
, with f(x) and f(y) being
two Student's t distribution with nu
degrees of freedom. It is called
by functions test.cdp
and permute.cdp
to perform
hypothesis tests for single response variables, in which case unilateral
tests can be performed.
dphi
and dtau
return the density functions, whereas
pphi
and ptau
return the distribution functions.
dphi()
: Probability density function for the phi statistics
pphi()
: Distribution function for the phi statistics
dtau()
: Probability density function for the tau statistics
ptau()
: Distribution function for the tau statistics
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Springer, M. D. 1979. The algebra of random variables. John Wiley and Sons Inc., Hoboken, NJ, USA.
Guénard, G., Legendre, P., Boisclair, D., and Bilodeau, M. 2010. Multiscale codependence analysis: an integrated approach to analyse relationships across scales. Ecology 91: 2952-2964
Guénard, G. Legendre, P. 2018. Bringing multivariate support to multiscale codependence analysis: Assessing the drivers of community structure across spatial scales. Meth. Ecol. Evol. 9: 292-304
### Displays the phi probability distribution for five different numbers ### of degrees of freedom: x <- 10^seq(-4, 0.5, 0.05) plot(y = dphi(x, 1, 10), x = x, type = "l", col = "black", las = 1, ylab = "pdf", ylim = c(0, 0.5)) lines(y = dphi(x, 3, 10), x = x, col = "purple") lines(y = dphi(x, 5, 70), x = x, col = "blue") lines(y = dphi(x, 12, 23), x = x, col = "green") lines(y = dphi(x, 35, 140), x = x, col = "red") ### Displays the density distribution function for 10 degrees of freedom. x <- 10^seq(-4, 0.5, 0.05) y <- dphi(x, 5, 70) plot(y = y, x = x, type = "l", col = "black", las = 1, ylab = "Density", ylim = c(0, 0.5)) polygon(x = c(x[81L:91], x[length(x)], 1), y = c(y[81L:91], 0, 0), col = "grey") text(round(pphi(1, 5, 70, lower.tail=FALSE), 3), x = 1.75, y = 0.05) ## Idem for the tau distribution: x <- c(-(10^seq(0.5, -4, -0.05)), 10^seq(-4, 0.5, 0.05)) plot(y = dtau(x, 1), x = x, type = "l", col = "black", las = 1, ylab = "pdf", ylim = c(0, 0.5)) lines(y = dtau(x, 2), x = x, col = "purple") lines(y = dtau(x, 5), x = x, col="blue") lines(y = dtau(x, 10), x = x, col="green") lines(y = dtau(x, 100), x = x, col="red") y <- dtau(x, 10) plot(y = y, x = x, type = "l", col = "black", las = 1, ylab = "Density", ylim = c(0, 0.5)) polygon(x = c(x[which(x==1):length(x)], x[length(x)],1), y = c(y[which(x==1):length(x)], 0, 0), col = "grey") text(round(ptau(1, 10, lower.tail = FALSE), 3), x = 1.5, y = 0.03) polygon(x = c(-1, x[1L], x[1L:which(x==-1)]), y = c(0, 0, y[1L:which(x==-1)]), col="grey") text(round(ptau(-1, 10), 3), x = -1.5, y = 0.03)
### Displays the phi probability distribution for five different numbers ### of degrees of freedom: x <- 10^seq(-4, 0.5, 0.05) plot(y = dphi(x, 1, 10), x = x, type = "l", col = "black", las = 1, ylab = "pdf", ylim = c(0, 0.5)) lines(y = dphi(x, 3, 10), x = x, col = "purple") lines(y = dphi(x, 5, 70), x = x, col = "blue") lines(y = dphi(x, 12, 23), x = x, col = "green") lines(y = dphi(x, 35, 140), x = x, col = "red") ### Displays the density distribution function for 10 degrees of freedom. x <- 10^seq(-4, 0.5, 0.05) y <- dphi(x, 5, 70) plot(y = y, x = x, type = "l", col = "black", las = 1, ylab = "Density", ylim = c(0, 0.5)) polygon(x = c(x[81L:91], x[length(x)], 1), y = c(y[81L:91], 0, 0), col = "grey") text(round(pphi(1, 5, 70, lower.tail=FALSE), 3), x = 1.75, y = 0.05) ## Idem for the tau distribution: x <- c(-(10^seq(0.5, -4, -0.05)), 10^seq(-4, 0.5, 0.05)) plot(y = dtau(x, 1), x = x, type = "l", col = "black", las = 1, ylab = "pdf", ylim = c(0, 0.5)) lines(y = dtau(x, 2), x = x, col = "purple") lines(y = dtau(x, 5), x = x, col="blue") lines(y = dtau(x, 10), x = x, col="green") lines(y = dtau(x, 100), x = x, col="red") y <- dtau(x, 10) plot(y = y, x = x, type = "l", col = "black", las = 1, ylab = "Density", ylim = c(0, 0.5)) polygon(x = c(x[which(x==1):length(x)], x[length(x)],1), y = c(y[which(x==1):length(x)], 0, 0), col = "grey") text(round(ptau(1, 10, lower.tail = FALSE), 3), x = 1.5, y = 0.03) polygon(x = c(-1, x[1L], x[1L:which(x==-1)]), y = c(0, 0, y[1L:which(x==-1)]), col="grey") text(round(ptau(-1, 10), 3), x = -1.5, y = 0.03)
Juvenile Atlantic salmon (parr) density in a 1520m transect of the St. Marguerite River, Québec, Canada.
data(salmon)
data(salmon)
A 76 rows by 5 columns data.frame
.
Contains (1) the 76 sampling site positions along a 1520 m river segment beginning at a location called ‘Bardsville' (Lat: 48°23’01.59” N ; Long: 70°12'10.05” W), (2) the number of parr (young salmon, ages I+ and II+) observed at the sampling sites, (3) the mean water depths (m), (4) the mean current velocity (m/s), and (5) the mean substrate size (mm). Sampling took place on July 7, 2002, in the 76 sites, each 20 m long. The 'Bardsville' river segment is located in the upper portion of Sainte-Marguerite River, Quebec, Canada.
Daniel Boisclair, Département de sciences biologiques, Université de Montréal, Montréal, Québec, Canada.
Guénard, G., Legendre, P., Boisclair, D., and Bilodeau, M. 2010. Multiscale codependence analysis: an integrated approach to analyse relationships across scales. Ecology 91: 2952-2964
Bouchard, J. and Boisclair, D. 2008. The relative importance of local, lateral, and longitudinal variables on the development of habitat quality models for a river. Can. J. Fish. Aquat. Sci. 65: 61-73
data(salmon) summary(salmon)
data(salmon) summary(salmon)
A set of common distance weighting functions to calculate spatial eignevector
maps using function eigenmap
.
wf.sqrd(d) wf.RBF(d, wpar = 1) wf.PCNM(d, boundaries, wpar = 4) wf.binary(d, boundaries) wf.Drayf1(d, boundaries) wf.Drayf2(d, boundaries, wpar = 1) wf.Drayf3(d, boundaries, wpar = 1)
wf.sqrd(d) wf.RBF(d, wpar = 1) wf.PCNM(d, boundaries, wpar = 4) wf.binary(d, boundaries) wf.Drayf1(d, boundaries) wf.Drayf2(d, boundaries, wpar = 1) wf.Drayf3(d, boundaries, wpar = 1)
d |
A triangular ('dist-class') or rectangular geographic
distance matrix produced by |
wpar |
Where applicable, a parameter controlling the shape of the spatial weighting function. |
boundaries |
Where applicable, a two-element numeric vector containing the lower and upper threshold values used to obtain the connectivity matrix. (see details). |
These functions are meant primarily to be called within functions
eigenmap
and eigenmap.score
. In
eigenmap
, argument d
is a lower-triangular
'dist-class' object and the resulting lower-triangular weight matrix
is used in calculating the spatial eigenvector map. In
eigenmap.score
, d
is a rectangular matrix of the
distances between a set of arbitrary locations (rows) and reference locations
(columns; the locations for which the the spatial eigenvector map has been
built and the resulting rectangular weight matrix is used to calculate
spatial eigenfunction values. These values allow one to use the spatial
information of a data set for making predictions at arbitrary values.
'Wf.sqrd' (default value) consists in taking w_i,j = -0.5*d_i,j and does not involve any truncation.
'Wf.RBF' consists in taking w_i,j = exp(-wpar*d_i,j^2) and does not involve any truncation, where wpar is a non-zero real positive value (default: 1).
'Wf.binary' the spatial weighting matrix is simply the connectivity matrix.
'Wf.PCNM' is a_i,j = 1 - (d_i,j / (wpar*boundaries_2))^2, where wpar is a non-zero real positive value (default: 4).
'Wf.Drayf1' is a_i,j = 1 - (d_i,j / d_max) where d_max is the distance between the two most distant locations in the set.
'Wf.Drayf2' is a_i,j = 1 - (d_i,j / d_max)^wpar, where wpar is a non-zero real positive value (default: 1).
'Wf.Drayf3' is a_i,j = 1 / d_i,j^wpar, where wpar is a non-zero real positive value (default: 1).
Functions Wf.Drayf1
, Wf.Drayf2
, and Wf.Drayf3
were
proposed by Dray et al. (2006) and function PCNM
was proposed by
Legendre and Legendre (2012).
The Wf.sqrd
weighting approach is equivalent to submitting the
elementwise square-root of the distance matrix to a principal coordinate
analysis. It was proposed by Diniz-Filho et al. (2013) and is equivalent, for
evenly spaced transect or surfaces (square or rectangle), to using the basis
functions of type II discrete cosine basis transforms; a fact that has gone
unnoticed by Diniz-Filho et al. (2013).
The radial basis function (RBF) is a widespread kernel method involving sets
of real-valued functions whose values depend on the distance between any
given input coordinate and a set of fixed points (a single fixed point for
each function). It is implemented using function Wf.RBF
using all the
sampling points as the fixed points.
When calculating the connectivity matrix, pairs of location whose distance to
one another are between the boundary values (argument bounraries
) are
considered as neighbours (b_i,j=1) whereas values located below the
minimum and above the maximum are considered as equivalent or distant,
respectively (b_i,j=0 in both cases).
User may implement custom weighting functions. These functions must at the
very least have an argument d
, and can be given arguments
boundaries
and wpar
. Argument wpar
may be a vector with
any number of elements. They should be added to the R-code file
(weighting-functions.R). User-provided weighting functions with an argument
wpar
must come with a valid default value for that parameter since
eigenmap
may internally call it without a formal value.
A 'dist-class' object when argument d
is a
'dist-class' object or a rectangular matrix when argument d
is
a rectangular matrix, either one with the weights as its values.
wf.sqrd()
: Principal coordinates of the square-root distance matrix (Diniz-Filho et al.
2013).
wf.RBF()
: Radial basis functions with the observations as the kernels.
wf.PCNM()
: Borcard & Legendre's (2002) principal coordinates of the neighbour matrix
approach.
wf.binary()
: Dray et al. (2006) Moran's eigenvector maps (distance-based binary
connections without continuous weighting of the neighbours).
wf.Drayf1()
: Dray et al. (2006) Moran's eigenvector maps (distance-based binary
connections with continuous weighting of the neighbours: f1).
wf.Drayf2()
: Dray et al. (2006) Moran's eigenvector maps (distance-based binary
connections with continuous weighting of the neighbours: f2).
wf.Drayf3()
: Dray et al. (2006) Moran's eigenvector maps (distance-based binary
connections with continuous weighting of the neighbours: f3).
Guillaume Guenard and Pierre Legendre, Bertrand Pages Maintainer: Guillaume Guenard <[email protected]>
Borcard, D. and Legendre, P. 2002. All-scale spatial analysis of ecological data by means of principal coordinates of neighbour matrices. Ecol. Model. 153: 51-68
Diniz-Filho, J. A. F.; Diniz, J. V. B. P. L.; Rangel, T. F.; Soares, T. F.; de Campos Telles, M. P.; Garcia Collevatti, R. and Bini, L. M. 2013. A new eigenfunction spatial analysis describing population genetic structure. Genetica 141:479-489.
Dray, S.; Legendre, P. and Peres-Neto, P. 2006. Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbor matrices (PCNM). Ecol. Modelling 196: 483-493
Legendre, P. and Legendre, L. 2012. Numerical Ecology, 3rd English edition. Elsevier Science B.V., Amsterdam, The Netherlands.
locations <- c(1,2,4,7,10,14,17,21) D <- dist(locations) wf.sqrd(D) wf.RBF(D, wpar = 0.1) wf.binary(D, c(0,5)) wf.PCNM(D, c(0,5)) wf.Drayf1(D, c(0,5)) wf.Drayf2(D, c(0,5), 0.5) wf.Drayf3(D, c(0,5), 0.5) emap <- eigenmap(D, locations, wf.Drayf2, c(0,5), 0.5) emap emap <- eigenmap(D, locations, wf.Drayf3, c(0,5), 0.25) emap emap <- eigenmap(D, locations, wf.RBF, wpar = 0.1) emap
locations <- c(1,2,4,7,10,14,17,21) D <- dist(locations) wf.sqrd(D) wf.RBF(D, wpar = 0.1) wf.binary(D, c(0,5)) wf.PCNM(D, c(0,5)) wf.Drayf1(D, c(0,5)) wf.Drayf2(D, c(0,5), 0.5) wf.Drayf3(D, c(0,5), 0.5) emap <- eigenmap(D, locations, wf.Drayf2, c(0,5), 0.5) emap emap <- eigenmap(D, locations, wf.Drayf3, c(0,5), 0.25) emap emap <- eigenmap(D, locations, wf.RBF, wpar = 0.1) emap