| Title: | Predictive Moran's Eigenvector Maps |
|---|---|
| Description: | Calculate Predictive Moran's Eigenvector Maps (pMEM) for spatially-explicit prediction of environmental variables, as defined by Guénard and Legendre (2024) <doi:10.1111/2041-210X.14413>. pMEM extends classical MEM by enabling interpolation and prediction at unsampled locations using spatial weighting functions parameterized by range (and optionally shape). The package implements multiple pMEM types (e.g., exponential, Gaussian, linear) and features a modular architecture that allows programmers to define custom weighting functions. Designed for ecologists, geographers, and spatial analysts working with spatially-structured data. |
| Authors: | Guillaume Guénard [aut, cre] (ORCID: <https://orcid.org/0000-0003-0761-3072>), Pierre Legendre [ctb] (ORCID: <https://orcid.org/0000-0002-3838-3305>) |
| Maintainer: | Guillaume Guénard <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0-1 |
| Built: | 2026-06-07 08:22:31 UTC |
| Source: | https://github.com/guenardg/pmem |
pMEM implements Predictive Moran's Eigenvector Maps, a method for
spatially-explicit prediction of environmental variables using eigen-
decomposition of distance-based spatial weighting matrices.
pMEM extends classical Moran's Eigenvector Maps (MEM) by:
Enabling prediction at unsampled locations via predict()
methods
Supporting tunable distance weighting functions (exponential, Gaussian, power, linear, spherical, hyperbolic, hole_effect)
Optimizing spatial scale parameters via cross-validated model selection
Integrating with regression frameworks (lm, glm, glmnet)
Providing fast C++ backend via Rcpp for efficient eigenvector selection
Core workflow functions:
genDistMetric: Generate distance metric functions
(symmetric Euclidean or asymmetric complex-valued)
genDWF: Generate distance weighting functions
(7 built-in types with tunable parameters)
genSEF: Generate Spatial EigenMap (SEMap) objects
getMinMSE: Fast eigenvector selection via
cross-validated MSE
predict.SEMap: Interpolate eigenfunctions at new
locations
The SEMap-class encapsulates spatial eigenfunctions with
methods:
print(): Display object summary (sites, components,
eigenvalues)
as.matrix(): Extract eigenvectors as numeric/complex matrix
as.data.frame(): Extract eigenvectors as data frame
predict(): Calculate eigenfunction scores at new coordinates
Two spatial ecological datasets for examples and testing:
salmon: Atlantic salmon parr abundance and habitat
variables along a 1520 m river transect (76 sites, 5 variables)
geoMite: Oribatid mite community data with GIS layers
from a peat bog (70 cores, 35 species, 5 polygon layers)
Standard spatial modeling workflow with pMEM:
Load coordinates and response variable(s)
Generate distance metric: m <- genDistMetric()
Generate weighting function: f <- genDWF("Gaussian", range = 50)
Create SEMap object: sef <- genSEF(x, m, f)
Split data into training/testing sets
Select optimal eigenvectors: res <- getMinMSE(U, y, Up, yy)
Fit model with selected eigenvectors: lm(y ~ ., data = sef[, sel])
Predict at new locations: predict(sef, newdata, wh = sel)
pMEM supports directional spatial processes via complex-valued distance metrics:
Use genDistMetric(delta) with non-zero delta
parameter
For 1D transects: delta sets phase shift for
upstream/downstream
For 2D data: theta sets influence angle (direction of flow)
Eigenfunctions have real and imaginary parts representing directionality
Useful for rivers, wind patterns, ocean currents, and other directed processes
getMinMSE() is implemented in C++ via Rcpp (~50× faster than
pure R)
Memory-efficient handling of large eigenvector matrices
Parallelization-ready architecture for future extensions
Key methodological references:
Guénard, G. and Legendre, P. (2024). Spatially-explicit predictions using spatial eigenvector maps. Methods in Ecology and Evolution. <doi:10.1111/2041-210X.14413>
Dray, S., Legendre, P., and Peres-Neto, P.R. (2006). Spatial modelling: A comprehensive framework for Principal Coordinate Analysis of Neighbour Matrices (PCNM). Ecological Modelling, 196: 483–493. <doi:10.1016/j.ecolmodel.2006.02.015>
Moran, P.A.P. (1948). The interpretation of statistical maps. Journal of the Royal Statistical Society B, 10: 243–251. <doi:10.1111/j.2517-6161.1948.tb00012.x>
Guillaume Guénard [aut, cre] (ORCID: <https://orcid.org/0000-0003-0761-3072>), Pierre Legendre [ctb] (ORCID: <https://orcid.org/0000-0002-3838-3305>)
## Quick start: model channel depth along a river transect data("salmon", package = "pMEM") ## Split into training/testing sets: set.seed(123) test <- sample(nrow(salmon), 25) train <- setdiff(seq_len(nrow(salmon)), test) ## Generate spatial eigenfunctions: sef <- genSEF( x = salmon$Position[train], m = genDistMetric(), f = genDWF("Gaussian", range = 50) ) ## Select optimal eigenvectors via cross-validated MSE: res <- getMinMSE( U = as.matrix(sef), y = salmon$Depth[train], Up = predict(sef, salmon$Position[test]), yy = salmon$Depth[test], complete = FALSE ) ## Coefficient of prediction (R² on test set): 1 - res$mse / var(salmon$Depth[test]) #> [1] 0.8108639 ## View SEMap object summary: sef #> A SEMap-class object #> -------------------- #> Number of sites: 51 #> Directional: no #> Number of components: 50 #> Eigenvalues: 3.76546,3.54744,3.43502,...,0.00066,0.00012 #> -------------------- ## More examples: vignette("Using_pMEM", package = "pMEM")## Quick start: model channel depth along a river transect data("salmon", package = "pMEM") ## Split into training/testing sets: set.seed(123) test <- sample(nrow(salmon), 25) train <- setdiff(seq_len(nrow(salmon)), test) ## Generate spatial eigenfunctions: sef <- genSEF( x = salmon$Position[train], m = genDistMetric(), f = genDWF("Gaussian", range = 50) ) ## Select optimal eigenvectors via cross-validated MSE: res <- getMinMSE( U = as.matrix(sef), y = salmon$Depth[train], Up = predict(sef, salmon$Position[test]), yy = salmon$Depth[test], complete = FALSE ) ## Coefficient of prediction (R² on test set): 1 - res$mse / var(salmon$Depth[test]) #> [1] 0.8108639 ## View SEMap object summary: sef #> A SEMap-class object #> -------------------- #> Number of sites: 51 #> Directional: no #> Number of components: 50 #> Eigenvalues: 3.76546,3.54744,3.43502,...,0.00066,0.00012 #> -------------------- ## More examples: vignette("Using_pMEM", package = "pMEM")
genDistMetric returns a function that calculates pairwise distances
between rows of coordinate matrices, with optional asymmetry parameters.
genDistMetric(delta, theta = 0)genDistMetric(delta, theta = 0)
delta |
Optional asymmetry parameter. When provided, the returned
distance metric becomes complex-valued, with modulus equal to the Euclidean
distance and argument determined by |
theta |
Influence angle (in radians) for 2D asymmetric metrics. Used
only when |
When delta is missing (default), the returned function computes the
standard Euclidean distance. When delta is provided, the metric
becomes complex-valued: its modulus equals the Euclidean distance, and its
argument is determined by delta as follows:
1D data (transects): The argument is +delta for
pairs where the second point is downstream (higher coordinate) and
-delta otherwise.
2D data: The argument is the cosine of the angular
difference between the bearing from point A to B and the influence
angle theta.
In both cases, the distance from A → B has the opposite argument sign as B → A, ensuring the pairwise distance matrix is Hermitian with strictly real eigenvalues.
genDistMetric uses a function-generator pattern: it returns a
closure that embeds the specified delta and theta values
in its environment. To change these parameters, call
genDistMetric again with new arguments.
A function with signature function(x, y) that returns a
numeric matrix of pairwise distances. When y is omitted, the function
returns a square symmetric (or Hermitian) matrix of distances among rows of
x. For asymmetric metrics (delta provided), the matrix is
Hermitian with strictly real eigenvalues.
Guillaume Guénard [aut, cre] (ORCID: <https://orcid.org/0000-0003-0761-3072>), Pierre Legendre [ctb] (ORCID: <https://orcid.org/0000-0002-3838-3305>)
## A five-point equidistant transect: n <- 5 x <- (n - 1)*seq(0, 1, length.out=n) ## Symmetric (Euclidean) metric function: mSym <- genDistMetric() ## The pairwise symmetric metric between the rows of x: mSym(x) #> [,1] [,2] [,3] [,4] [,5] #> [1,] 0 1 2 3 4 #> [2,] 1 0 1 2 3 #> [3,] 2 1 0 1 2 #> [4,] 3 2 1 0 1 #> [5,] 4 3 2 1 0 ## Distances between x and a finer grid xx: xx <- (n - 1)*seq(0, 1, 0.05) mSym(x, xx) # Rectangular matrix: 5 rows × 21 columns ## Asymmetric metric with delta = 0.2: mAsy <- genDistMetric(delta = 0.2) mAsy(x) # Hermitian matrix (complex values) Mod(mAsy(x)) # Modulus equals Euclidean distances ## Asymmetric distances between x and xx: mAsy(x, xx)## A five-point equidistant transect: n <- 5 x <- (n - 1)*seq(0, 1, length.out=n) ## Symmetric (Euclidean) metric function: mSym <- genDistMetric() ## The pairwise symmetric metric between the rows of x: mSym(x) #> [,1] [,2] [,3] [,4] [,5] #> [1,] 0 1 2 3 4 #> [2,] 1 0 1 2 3 #> [3,] 2 1 0 1 2 #> [4,] 3 2 1 0 1 #> [5,] 4 3 2 1 0 ## Distances between x and a finer grid xx: xx <- (n - 1)*seq(0, 1, 0.05) mSym(x, xx) # Rectangular matrix: 5 rows × 21 columns ## Asymmetric metric with delta = 0.2: mAsy <- genDistMetric(delta = 0.2) mAsy(x) # Hermitian matrix (complex values) Mod(mAsy(x)) # Modulus equals Euclidean distances ## Asymmetric distances between x and xx: mAsy(x, xx)
genDWF returns a function that transforms pairwise distances into
spatial weights, with support for both real and complex-valued distances.
genDWF( fun = c("linear", "power", "hyperbolic", "spherical", "exponential", "Gaussian", "hole_effect"), range, shape = 1 )genDWF( fun = c("linear", "power", "hyperbolic", "spherical", "exponential", "Gaussian", "hole_effect"), range, shape = 1 )
fun |
Distance weighting function type. One of |
range |
Range parameter (positive numeric). For |
shape |
Shape parameter (positive numeric). Used only for |
All functions return 1 (or 1+0i) at distance 0. Behavior beyond varies.
When distances are complex (from asymmetric metrics via
genDistMetric(delta)), the weighting functions operate on the
modulus while preserving phase information. This enables directional
spatial modeling for transects or flow-influenced systems.
genDWF returns a closure embedding fun, range, and
shape in its environment. To change parameters, call
genDWF again with new arguments.
A function with signature function(d) that transforms
distances into weights. Returns a numeric vector for vector input, or a
numeric/complex matrix for matrix input. For complex-valued distances (from
asymmetric metrics), returns complex-valued weights.
Guillaume Guénard [aut, cre] (ORCID: <https://orcid.org/0000-0003-0761-3072>), Pierre Legendre [ctb] (ORCID: <https://orcid.org/0000-0002-3838-3305>)
## A set of distances from which to show the corresponding weights: d <- seq(0, 5, 0.1) ## The linear function with different ranges: w <- cbind( genDWF(fun = "linear", range = 1)(d), genDWF(fun = "linear", range = 0.5)(d), genDWF(fun = "linear", range = 2)(d) ) dim(w) # 51 × 3 matrix #> [1] 51 3 ## The exponential function: w <- genDWF(fun = "exponential", range = 1)(d) w[1:5] # First 5 weights (starts at 1, decays) #> [1] 1.0000000 0.9048374 0.8187308 0.7408182 0.6703200 ## The hole_effect function (can go negative): w <- genDWF(fun = "hole_effect", range = 1)(d) range(w) # Includes negative values #> [1] -0.2162362 1.0000000 ## Complex-valued distances (asymmetric metrics) with a delta of pi/8: d <- complex(modulus = seq(0, 5, 0.1), argument = pi/8) w <- genDWF(fun = "Gaussian", range = 1)(d) Mod(w[1:5]) # Modulus of complex weights #> 1.0000000 0.9929539 0.9721120 0.9383431 0.8930282 Arg(w[1:5]) # Phase preserved from input distances #> 0.000000000 -0.007071068 -0.028284271 -0.063639610 -0.113137085## A set of distances from which to show the corresponding weights: d <- seq(0, 5, 0.1) ## The linear function with different ranges: w <- cbind( genDWF(fun = "linear", range = 1)(d), genDWF(fun = "linear", range = 0.5)(d), genDWF(fun = "linear", range = 2)(d) ) dim(w) # 51 × 3 matrix #> [1] 51 3 ## The exponential function: w <- genDWF(fun = "exponential", range = 1)(d) w[1:5] # First 5 weights (starts at 1, decays) #> [1] 1.0000000 0.9048374 0.8187308 0.7408182 0.6703200 ## The hole_effect function (can go negative): w <- genDWF(fun = "hole_effect", range = 1)(d) range(w) # Includes negative values #> [1] -0.2162362 1.0000000 ## Complex-valued distances (asymmetric metrics) with a delta of pi/8: d <- complex(modulus = seq(0, 5, 0.1), argument = pi/8) w <- genDWF(fun = "Gaussian", range = 1)(d) Mod(w[1:5]) # Modulus of complex weights #> 1.0000000 0.9929539 0.9721120 0.9383431 0.8930282 Arg(w[1:5]) # Phase preserved from input distances #> 0.000000000 -0.007071068 -0.028284271 -0.063639610 -0.113137085
Spatially-referenced oribatid mite community and environmental data from a peat bog surrounding Lac Geai, Québec, Canada.
A list with five sf data frames:
core: A data frame with 70 rows (peat cores) containing point geometries and 46 fields containing values of environmental variables at the locations of the cores as well as the number of individuals of one of 35 Oribatid species observed in the cores (see details).
water: A data frame with three rows containing polygon
geometries and a single field: a factor variable named
"Type" and specifying whether the polygon represents open water
(value == "Water") or flooded areas (value == "Flooded") at the time of
sampling.
substrate: A data frame with 13 rows containing polygon
geometries and seven fields. The first field is a factor
that specifies one of six substrate classes (see details) and the
remaining six fields are binary variables for each of these six substrate
classes that take the value 1 when the polygon is of the class being
represented by the that variable and otherwise take the value 0.
shrub: A data frame with four rows containing polygon
geometries and a single field: an ordered variable named
"Type" and specifying whether the polygon represents areas with no shrub
(value == "None"), a few shrubs (value == "Few"), or many shrubs
(value == "Many").
topo: A data frame with four rows containing polygon
geometries and a single field: a factor variable named
"Type" and specifying the type of peat micro-topography. There are two
such types: "Blanket" (flat area) and "Hummock" (raised bumps).
All geometries use a local Cartesian coordinate system with units in centimeters, oriented such that X increases from water edge to forest edge, and Y increases along the shoreline. No projected CRS is assigned; treat as a local grid for spatial modeling.
Fields in geoMite$core:
SubsDens: Substrate density (g/L)
WatrCont: Water content of peat (g/L)
Substrate-*: Six binary indicators for substrate type (see substrate classification below)
Shrub: Ordered factor (None < Few < Many) for shrub cover
Topo: Factor with levels "Blanket" (flat) and "Hummock" (raised)
Flooded: Binary indicator of flooding at time of sampling
Species-*: Counts of 35 oribatid species (morphological identification)
Substrate types (non-exclusive; cores may span multiple types):
Sphagn1: Sphagnum magellanicum (majority) + S. rubellum
Sphagn2: Sphagnum rubellum
Sphagn3: Sphagnum nemoreum (+ minority S. angustifolium)
Sphagn4: S. rubellum and S. magellanicum in equal parts
Litter: Ligneous plant litter
Barepeat: Exposed peat without vegetation
These types are not mutually exclusive categories: cores were sometimes taken at the boundary between two or more substrate types and thus belong to multiple categories.
Polygon geometries were digitized from Figure 1 in Borcard et al. (1994) using a 10 mm grid. Due to print resolution limits, effective positional accuracy is approximately 10 cm in both X and Y directions.
The X coordinates corresponds to distances going from the edge of the water to the edge of the forest. The Y coordinates correspond the distances along the lake's shore.
The identification of the Oribatid species was carried out solely on the basis of their morphology as little is known on the ecology of these small animals.
Daniel Borcard, <[email protected]> and Pierre Legendre <[email protected]>
Borcard, D. and Legendre, P. 1994. Environmental Control and Spatial Structure in Ecological Communities: An Example Using Oribatid Mites (Acari, Oribatei). Environ. Ecol. Stat. 1(1): 37-61 <doi:10.1007/BF00714196>
Borcard, D., Legendre, P., and Drapeau, P. 1992. Partialling out the spatial component of ecological variation. Ecology, 73, 1045-1055. doi:10.2307/1940179
Borcard, D.; Legendre, P.; and Gillet, F. 2018. Numerical Ecology with R (2nd Edition) Springer, Cham, Switzerland. doi:10.1007/978-3-319-71404-2
Data set oribatid from package ade4, which is another
version of this data set.
data(geoMite) ## Quick summary of species counts: species_cols <- grep("^Species\\.", names(geoMite$core), value = TRUE) ## Maximum count per species sapply(st_drop_geometry(geoMite$core[species_cols]), max) #> Species.Brachysp Species.Hoplcfpa Species.Oppinova ... #> 42 8 37 ... ## Simple map of core locations over substrate: if(requireNamespace("sf", quietly = TRUE)) { plot(st_geometry(geoMite$substrate), col = "lightgreen", main = "Peat Cores") plot(st_geometry(geoMite$core), pch = 21, bg = "black", add = TRUE) } ## Not run: ## Color definitions: col <- list() col[["substrate"]] <- c(Sphagn1 = "#00ff00", Sphagn2 = "#fffb00", Sphagn3 = "#774b00", Sphagn4 = "#ff8400", Litter = "#ee00ff", Barepeat = "#ff0004") col[["water"]] <- c(Water = "#008cff", Flooded = "#ffffff00", core = "#000000ff") col[["shrub"]] <- c(None = "#dfdfdf", Few = "#a7a7a7", Many = "#5c5c5c") col[["topo"]] <- c(Blanket = "#74cd00", Hummock = "#bc9d00") ## Graphical paramters: p <- par(no.readonly = TRUE) par(mar=c(0,0,1,0), mfrow=c(1L,4L)) ## Substrate: with( geoMite, { plot(st_geometry(substrate), col=col[["substrate"]][substrate$Type], main="Substrate") plot(st_geometry(water[1,]), col=col[["water"]][water[1,]$Type], add=TRUE) plot(st_geometry(water[-1,]), col=col[["water"]][water[-1,]$Type], lty=3L, add=TRUE) plot(st_geometry(core), pch = 21, bg = "black", add=TRUE) } ) ## Shrubs: with( geoMite, { plot(st_geometry(shrub), col = col[["shrub"]][shrub$Type], main="Shrubs") plot(st_geometry(water[1,]), col=col[["water"]][water[1,]$Type], add=TRUE) plot(st_geometry(water[-1,]), col=col[["water"]][water[-1,]$Type], lty=3, add=TRUE) plot(st_geometry(core), pch = 21, bg = "black", add=TRUE) } ) ## Topograghy: with( geoMite, { plot(st_geometry(topo), col = col[["topo"]][topo$Type], main="Topography") plot(st_geometry(water[1,]), col=col[["water"]][water[1,]$Type], add=TRUE) plot(st_geometry(water[-1,]), col=col[["water"]][water[-1,]$Type], lty=3, add=TRUE) plot(st_geometry(core), pch = 21, bg = "black", add=TRUE) } ) ## Legends: with( geoMite, { plot(NA, xlim=c(0,1), ylim=c(0,1), axes = FALSE) legend(x=0, y=0.9, pch=22, pt.cex = 2.5, pt.bg=col[["substrate"]], box.lwd = 0, legend=names(col[["substrate"]]), title="Substrate") legend(x=-0.025, y=0.6, pch=c(22,NA,21), pt.cex=c(2.5,NA,1), pt.bg=col[["water"]], box.lwd=0, lty=c(0L,3L,NA), legend=c("Open water","Flooded area","Peat core")) legend(x=0, y=0.4, pch=22, pt.cex=2.5, pt.bg=col[["shrub"]], box.lwd=0, legend=names(col[["shrub"]]), title="Shrubs") legend(x=0, y=0.2, pch=22, pt.cex=2.5, pt.bg=col[["topo"]], box.lwd=0, legend=names(col[["topo"]]), title="Topography") } ) ### Display the species counts ## Get the species names: unlist( lapply( strsplit(colnames(geoMite$core),".",fixed=TRUE), function(x) if(x[1L] == "Species") x[2L] else NULL ) ) -> spnms ## See the maximum counts for all the species apply(st_drop_geometry(geoMite$core[,paste("Species",spnms,sep=".")]),2L,max) ## Species selection to display: sel <- c("Brachysp","Hoplcfpa","Oppinova","Limncfci","Limncfru") ## Range of counts to display: rng <- log1p(c(0,1000)) colmap <- grey(seq(1,0,length.out=256L)) ## Update the graphical parameters for this example par(mar=c(0,0,2,0), mfrow=c(1L,length(sel) + 1L)) with( geoMite, { ## Display each species in the selection over the substrate map for(sp in sel) { plot(st_geometry(substrate), col=col[["substrate"]][substrate$Type], main=sp) plot(st_geometry(core), pch=21L, add = TRUE, cex=1.5, bg=colmap[1 + 255*log1p(core[[paste("Species",sp,sep=".")]])/rng[2L]]) } ## Display the colour chart for the species counts: par(mar=c(2,7,3,1)) image(z=matrix(seq(0,log1p(1000),length.out=256L),1,256), col=colmap, xaxt="n", yaxt="n", y=seq(0,log1p(1000),length.out=256), xlab="", cex.lab=1.5, ylab=expression(paste("Counts by species (",ind~core^{-1},")"))) axis(2, at=log1p(c(0,1,3,10,30,100,300,1000)), cex.axis = 1.5, label=c(0,1,3,10,30,100,300,1000)) } ) ## Restore graphical parameters: par(p) ## End(Not run)data(geoMite) ## Quick summary of species counts: species_cols <- grep("^Species\\.", names(geoMite$core), value = TRUE) ## Maximum count per species sapply(st_drop_geometry(geoMite$core[species_cols]), max) #> Species.Brachysp Species.Hoplcfpa Species.Oppinova ... #> 42 8 37 ... ## Simple map of core locations over substrate: if(requireNamespace("sf", quietly = TRUE)) { plot(st_geometry(geoMite$substrate), col = "lightgreen", main = "Peat Cores") plot(st_geometry(geoMite$core), pch = 21, bg = "black", add = TRUE) } ## Not run: ## Color definitions: col <- list() col[["substrate"]] <- c(Sphagn1 = "#00ff00", Sphagn2 = "#fffb00", Sphagn3 = "#774b00", Sphagn4 = "#ff8400", Litter = "#ee00ff", Barepeat = "#ff0004") col[["water"]] <- c(Water = "#008cff", Flooded = "#ffffff00", core = "#000000ff") col[["shrub"]] <- c(None = "#dfdfdf", Few = "#a7a7a7", Many = "#5c5c5c") col[["topo"]] <- c(Blanket = "#74cd00", Hummock = "#bc9d00") ## Graphical paramters: p <- par(no.readonly = TRUE) par(mar=c(0,0,1,0), mfrow=c(1L,4L)) ## Substrate: with( geoMite, { plot(st_geometry(substrate), col=col[["substrate"]][substrate$Type], main="Substrate") plot(st_geometry(water[1,]), col=col[["water"]][water[1,]$Type], add=TRUE) plot(st_geometry(water[-1,]), col=col[["water"]][water[-1,]$Type], lty=3L, add=TRUE) plot(st_geometry(core), pch = 21, bg = "black", add=TRUE) } ) ## Shrubs: with( geoMite, { plot(st_geometry(shrub), col = col[["shrub"]][shrub$Type], main="Shrubs") plot(st_geometry(water[1,]), col=col[["water"]][water[1,]$Type], add=TRUE) plot(st_geometry(water[-1,]), col=col[["water"]][water[-1,]$Type], lty=3, add=TRUE) plot(st_geometry(core), pch = 21, bg = "black", add=TRUE) } ) ## Topograghy: with( geoMite, { plot(st_geometry(topo), col = col[["topo"]][topo$Type], main="Topography") plot(st_geometry(water[1,]), col=col[["water"]][water[1,]$Type], add=TRUE) plot(st_geometry(water[-1,]), col=col[["water"]][water[-1,]$Type], lty=3, add=TRUE) plot(st_geometry(core), pch = 21, bg = "black", add=TRUE) } ) ## Legends: with( geoMite, { plot(NA, xlim=c(0,1), ylim=c(0,1), axes = FALSE) legend(x=0, y=0.9, pch=22, pt.cex = 2.5, pt.bg=col[["substrate"]], box.lwd = 0, legend=names(col[["substrate"]]), title="Substrate") legend(x=-0.025, y=0.6, pch=c(22,NA,21), pt.cex=c(2.5,NA,1), pt.bg=col[["water"]], box.lwd=0, lty=c(0L,3L,NA), legend=c("Open water","Flooded area","Peat core")) legend(x=0, y=0.4, pch=22, pt.cex=2.5, pt.bg=col[["shrub"]], box.lwd=0, legend=names(col[["shrub"]]), title="Shrubs") legend(x=0, y=0.2, pch=22, pt.cex=2.5, pt.bg=col[["topo"]], box.lwd=0, legend=names(col[["topo"]]), title="Topography") } ) ### Display the species counts ## Get the species names: unlist( lapply( strsplit(colnames(geoMite$core),".",fixed=TRUE), function(x) if(x[1L] == "Species") x[2L] else NULL ) ) -> spnms ## See the maximum counts for all the species apply(st_drop_geometry(geoMite$core[,paste("Species",spnms,sep=".")]),2L,max) ## Species selection to display: sel <- c("Brachysp","Hoplcfpa","Oppinova","Limncfci","Limncfru") ## Range of counts to display: rng <- log1p(c(0,1000)) colmap <- grey(seq(1,0,length.out=256L)) ## Update the graphical parameters for this example par(mar=c(0,0,2,0), mfrow=c(1L,length(sel) + 1L)) with( geoMite, { ## Display each species in the selection over the substrate map for(sp in sel) { plot(st_geometry(substrate), col=col[["substrate"]][substrate$Type], main=sp) plot(st_geometry(core), pch=21L, add = TRUE, cex=1.5, bg=colmap[1 + 255*log1p(core[[paste("Species",sp,sep=".")]])/rng[2L]]) } ## Display the colour chart for the species counts: par(mar=c(2,7,3,1)) image(z=matrix(seq(0,log1p(1000),length.out=256L),1,256), col=colmap, xaxt="n", yaxt="n", y=seq(0,log1p(1000),length.out=256), xlab="", cex.lab=1.5, ylab=expression(paste("Counts by species (",ind~core^{-1},")"))) axis(2, at=log1p(c(0,1,3,10,30,100,300,1000)), cex.axis = 1.5, label=c(0,1,3,10,30,100,300,1000)) } ) ## Restore graphical parameters: par(p) ## End(Not run)
getMinMSE performs forward selection of orthogonal predictors by
minimizing out-of-sample mean squared error (MSE) on a testing dataset.
getMinMSE(U, y, Up, yy, complete = TRUE)getMinMSE(U, y, Up, yy, complete = TRUE)
U |
Training matrix of orthogonal predictors ( |
y |
Training response vector (length |
Up |
Testing matrix of orthogonal predictors ( |
yy |
Testing response vector (length |
complete |
If |
This function allows one to calculate a simple model, involving only the spatial eigenvectors and a single response variable. The coefficients are estimated on a training data set; the ones that are retained are chosen on the basis of minimizing the mean squared error on the testing data set.
The selection procedure proceeds as follows:
Compute regression coefficients: b = t(U) %*% y (valid
because U is orthonormal).
Sort coefficients by decreasing absolute value; store order in
ord.
Compute null MSE: variance of yy around mean(y).
For each predictor (in sorted order), add its partial prediction to
the model and compute out-of-sample MSE on yy.
Identify the model with minimum MSE; return corresponding coefficient threshold and MSE.
Predictors in U must be orthonormal (columns have unit norm and are
mutually orthogonal). This condition is met by design for spatial
eigenvectors produced by genSEF(), but users must ensure it for
custom predictors.
This function is implemented in C++ via Rcpp for efficiency. For
p = 50 predictors and n = 100 observations, it is
approximately 50× faster than a pure R implementation.
If complete = TRUE, a list with the following elements:
Squared standardized regression coefficients (length
p).
Null model MSE: variance of testing responses around their mean.
MSE values for incremental models (length p + 1);
mse[1] is the null MSE.
Indices of predictors sorted by decreasing absolute coefficient value.
Index of the best model (1-based). Value 1 means the null
model is best; value k + 1 means the first k predictors from
ord are selected.
If complete = FALSE, a list with two elements:
Squared standardized coefficient at the optimal model.
Minimum MSE value achieved.
Guillaume Guénard [aut, cre] (ORCID: <https://orcid.org/0000-0003-0761-3072>), Pierre Legendre [ctb] (ORCID: <https://orcid.org/0000-0002-3838-3305>)
## Loading the 'salmon' dataset data("salmon") seq(1,nrow(salmon),3) -> test # Indices of the testing set. (1:nrow(salmon))[-test] -> train # Indices of the training set. ## A set of locations located 1 m apart: xx <- seq(min(salmon$Position) - 20, max(salmon$Position) + 20, 1) ## Lists to contain the results: mseRes <- list() sel <- list() lm <- list() prd <- list() ## Generate the spatial eigenfunctions: genSEF( x = salmon$Position[train], m = genDistMetric(), f = genDWF("Gaussian",40) ) -> sefTrain ## Spatially-explicit modelling of the channel depth: ## Calculate the minimum MSE model: getMinMSE( U = as.matrix(sefTrain), y = salmon$Depth[train], Up = predict(sefTrain, salmon$Position[test]), yy = salmon$Depth[test] ) -> mseRes[["Depth"]] ## This is the coefficient of prediction: 1 - mseRes$Depth$mse[mseRes$Depth$wh]/mseRes$Depth$nullmse ## Storing graphical parameters: tmp <- par(no.readonly = TRUE) ## Changing the graphical margins: par(mar=c(4,4,2,2)) ## Plot of the MSE values: plot(mseRes$Depth$mse, type="l", ylab="MSE", xlab="order", axes=FALSE, ylim=c(0.005,0.025)) points(x=1:length(mseRes$Depth$mse), y=mseRes$Depth$mse, pch=21, bg="black") axis(1) axis(2, las=1) abline(h=mseRes$Depth$nullmse, lty=3) # Dotted line: the null MSE ## A list of the selected spatial eigenfunctions: sel[["Depth"]] <- sort(mseRes$Depth$ord[1:mseRes$Depth$wh]) ## A linear model built using the selected spatial eigenfunctions: lm( formula = y~., data = cbind( y = salmon$Depth[train], as.data.frame(sefTrain, wh=sel$Depth) ) ) -> lm[["Depth"]] ## Calculating predictions of depth at each 1 m intervals: predict( lm$Depth, newdata = as.data.frame( predict( object = sefTrain, newdata = xx, wh = sel$Depth ) ) ) -> prd[["Depth"]] ## Plot of the predicted depth (solid line), and observed depth for the ## training set (black markers) and testing set (red markers): plot(x=xx, y=prd$Depth, type="l", ylim=range(salmon$Depth, prd$Depth), las=1, ylab="Depth (m)", xlab="Location along the transect (m)") points(x = salmon$Position[train], y = salmon$Depth[train], pch=21, bg="black") points(x = salmon$Position[test], y = salmon$Depth[test], pch=21, bg="red") ## Prediction of the velocity, substrate, and using them to predict the parr ## density. ## Not run: ## Spatially-explicit modelling of the water velocity: ## Calculate the minimum MSE model: getMinMSE( U = as.matrix(sefTrain), y = salmon$Velocity[train], Up = predict(sefTrain, salmon$Position[test]), yy = salmon$Velocity[test] ) -> mseRes[["Velocity"]] ## This is the coefficient of prediction: 1 - mseRes$Velocity$mse[mseRes$Velocity$wh]/mseRes$Velocity$nullmse ## Plot of the MSE values: plot(mseRes$Velocity$mse, type="l", ylab="MSE", xlab="order", axes=FALSE, ylim=c(0.010,0.030)) points(x=1:length(mseRes$Velocity$mse), y=mseRes$Velocity$mse, pch=21, bg="black") axis(1) axis(2, las=1) abline(h=mseRes$Velocity$nullmse, lty=3) ## A list of the selected spatial eigenfunctions: sel[["Velocity"]] <- sort(mseRes$Velocity$ord[1:mseRes$Velocity$wh]) ## A linear model build using the selected spatial eigenfunctions: lm( formula = y~., data = cbind( y = salmon$Velocity[train], as.data.frame(sefTrain, wh=sel$Velocity) ) ) -> lm[["Velocity"]] ## Calculating predictions of velocity at each 1 m intervals: predict( lm$Velocity, newdata = as.data.frame( predict( object = sefTrain, newdata = xx, wh = sel$Velocity ) ) ) -> prd[["Velocity"]] ## Plot of the predicted velocity (solid line), and observed velocity for the ## training set (black markers) and testing set (red markers): plot(x=xx, y=prd$Velocity, type="l", ylim=range(salmon$Velocity, prd$Velocity), las=1, ylab="Velocity (m/s)", xlab="Location along the transect (m)") points(x = salmon$Position[train], y = salmon$Velocity[train], pch=21, bg="black") points(x = salmon$Position[test], y = salmon$Velocity[test], pch=21, bg="red") ## Spatially-explicit modelling of the mean substrate size (D50): ## Calculate the minimum MSE model: getMinMSE( U = as.matrix(sefTrain), y = salmon$Substrate[train], Up = predict(sefTrain, salmon$Position[test]), yy = salmon$Substrate[test] ) -> mseRes[["Substrate"]] ## This is the coefficient of prediction: 1 - mseRes$Substrate$mse[mseRes$Substrate$wh]/mseRes$Substrate$nullmse ## Plot of the MSE values: plot(mseRes$Substrate$mse, type="l", ylab="MSE", xlab="order", axes=FALSE, ylim=c(1000,6000)) points(x=1:length(mseRes$Substrate$mse), y=mseRes$Substrate$mse, pch=21, bg="black") axis(1) axis(2, las=1) abline(h=mseRes$Substrate$nullmse, lty=3) ## A list of the selected spatial eigenfunctions: sel[["Substrate"]] <- sort(mseRes$Substrate$ord[1:mseRes$Substrate$wh]) ## A linear model build using the selected spatial eigenfunctions: lm( formula = y~., data = cbind( y = salmon$Substrate[train], as.data.frame(sefTrain, wh=sel$Substrate) ) ) -> lm[["Substrate"]] ## Calculating predictions of D50 at each 1 m intervals: predict( lm$Substrate, newdata = as.data.frame( predict( object = sefTrain, newdata = xx, wh = sel$Substrate ) ) ) -> prd[["Substrate"]] ## Plot of the predicted D50 (solid line), and observed D50 for the training ## set (black markers) and testing set (red markers): plot(x=xx, y=prd$Substrate, type="l", ylim=range(salmon$Substrate, prd$Substrate), las=1, ylab="D50 (mm)", xlab="Location along the transect (m)") points(x = salmon$Position[train], y = salmon$Substrate[train], pch=21, bg="black") points(x = salmon$Position[test], y = salmon$Substrate[test], pch=21, bg="red") ## Spatially-explicit modelling of Atlantic salmon parr abundance using ## x=channel depth + water velocity + D50 + pMEM: ## Requires suggested package glmnet to perform elasticnet regression: library(glmnet) ## Calculation of the elastic net model (cross-validated): cv.glmnet( y = salmon$Abundance[train], x = cbind( Depth = salmon$Depth[train], Velocity = salmon$Velocity[train], Substrate = salmon$Substrate[train], as.matrix(sefTrain) ), family = "poisson" ) -> cvglm ## Calculating predictions for the test data: predict( cvglm, newx = cbind( Depth = salmon$Depth[test], Velocity = salmon$Velocity[test], Substrate = salmon$Substrate[test], predict(sefTrain, salmon$Position[test]) ), s="lambda.min", type = "response" ) -> yhatTest ## Calculating predictions for the transect (1 m separated data): predict( cvglm, newx = cbind( Depth = prd$Depth, Velocity = prd$Velocity, Substrate = prd$Substrate, predict(sefTrain, xx) ), s = "lambda.min", type = "response" ) -> yhatTransect ## Plot of the predicted Atlantic salmon parr abundance (solid line, with the ## depth, velocity, and D50 also predicted using spatially-explicit ## submodels), the observed abundances for the training set (black markers), ## the observed abundances for the testing set (red markers), and the ## predicted abundances for the testing set calculated on the basis of ## observed depth, velocity, and median substrate grain size: plot(x=xx, y=yhatTransect, type="l", ylim=range(salmon$Abundance,yhatTransect), las=1, ylab="Abundance (fish)", xlab="Location along the transect (m)") points(x=salmon$Position[train], y=salmon$Abundance[train], pch=21, bg="black") points(x=salmon$Position[test], y=salmon$Abundance[test], pch=21, bg="red") points(x=salmon$Position[test], y=yhatTest, pch=21, bg="green") ## End(Not run) ## Restoring previous graphical parameters: par(tmp)## Loading the 'salmon' dataset data("salmon") seq(1,nrow(salmon),3) -> test # Indices of the testing set. (1:nrow(salmon))[-test] -> train # Indices of the training set. ## A set of locations located 1 m apart: xx <- seq(min(salmon$Position) - 20, max(salmon$Position) + 20, 1) ## Lists to contain the results: mseRes <- list() sel <- list() lm <- list() prd <- list() ## Generate the spatial eigenfunctions: genSEF( x = salmon$Position[train], m = genDistMetric(), f = genDWF("Gaussian",40) ) -> sefTrain ## Spatially-explicit modelling of the channel depth: ## Calculate the minimum MSE model: getMinMSE( U = as.matrix(sefTrain), y = salmon$Depth[train], Up = predict(sefTrain, salmon$Position[test]), yy = salmon$Depth[test] ) -> mseRes[["Depth"]] ## This is the coefficient of prediction: 1 - mseRes$Depth$mse[mseRes$Depth$wh]/mseRes$Depth$nullmse ## Storing graphical parameters: tmp <- par(no.readonly = TRUE) ## Changing the graphical margins: par(mar=c(4,4,2,2)) ## Plot of the MSE values: plot(mseRes$Depth$mse, type="l", ylab="MSE", xlab="order", axes=FALSE, ylim=c(0.005,0.025)) points(x=1:length(mseRes$Depth$mse), y=mseRes$Depth$mse, pch=21, bg="black") axis(1) axis(2, las=1) abline(h=mseRes$Depth$nullmse, lty=3) # Dotted line: the null MSE ## A list of the selected spatial eigenfunctions: sel[["Depth"]] <- sort(mseRes$Depth$ord[1:mseRes$Depth$wh]) ## A linear model built using the selected spatial eigenfunctions: lm( formula = y~., data = cbind( y = salmon$Depth[train], as.data.frame(sefTrain, wh=sel$Depth) ) ) -> lm[["Depth"]] ## Calculating predictions of depth at each 1 m intervals: predict( lm$Depth, newdata = as.data.frame( predict( object = sefTrain, newdata = xx, wh = sel$Depth ) ) ) -> prd[["Depth"]] ## Plot of the predicted depth (solid line), and observed depth for the ## training set (black markers) and testing set (red markers): plot(x=xx, y=prd$Depth, type="l", ylim=range(salmon$Depth, prd$Depth), las=1, ylab="Depth (m)", xlab="Location along the transect (m)") points(x = salmon$Position[train], y = salmon$Depth[train], pch=21, bg="black") points(x = salmon$Position[test], y = salmon$Depth[test], pch=21, bg="red") ## Prediction of the velocity, substrate, and using them to predict the parr ## density. ## Not run: ## Spatially-explicit modelling of the water velocity: ## Calculate the minimum MSE model: getMinMSE( U = as.matrix(sefTrain), y = salmon$Velocity[train], Up = predict(sefTrain, salmon$Position[test]), yy = salmon$Velocity[test] ) -> mseRes[["Velocity"]] ## This is the coefficient of prediction: 1 - mseRes$Velocity$mse[mseRes$Velocity$wh]/mseRes$Velocity$nullmse ## Plot of the MSE values: plot(mseRes$Velocity$mse, type="l", ylab="MSE", xlab="order", axes=FALSE, ylim=c(0.010,0.030)) points(x=1:length(mseRes$Velocity$mse), y=mseRes$Velocity$mse, pch=21, bg="black") axis(1) axis(2, las=1) abline(h=mseRes$Velocity$nullmse, lty=3) ## A list of the selected spatial eigenfunctions: sel[["Velocity"]] <- sort(mseRes$Velocity$ord[1:mseRes$Velocity$wh]) ## A linear model build using the selected spatial eigenfunctions: lm( formula = y~., data = cbind( y = salmon$Velocity[train], as.data.frame(sefTrain, wh=sel$Velocity) ) ) -> lm[["Velocity"]] ## Calculating predictions of velocity at each 1 m intervals: predict( lm$Velocity, newdata = as.data.frame( predict( object = sefTrain, newdata = xx, wh = sel$Velocity ) ) ) -> prd[["Velocity"]] ## Plot of the predicted velocity (solid line), and observed velocity for the ## training set (black markers) and testing set (red markers): plot(x=xx, y=prd$Velocity, type="l", ylim=range(salmon$Velocity, prd$Velocity), las=1, ylab="Velocity (m/s)", xlab="Location along the transect (m)") points(x = salmon$Position[train], y = salmon$Velocity[train], pch=21, bg="black") points(x = salmon$Position[test], y = salmon$Velocity[test], pch=21, bg="red") ## Spatially-explicit modelling of the mean substrate size (D50): ## Calculate the minimum MSE model: getMinMSE( U = as.matrix(sefTrain), y = salmon$Substrate[train], Up = predict(sefTrain, salmon$Position[test]), yy = salmon$Substrate[test] ) -> mseRes[["Substrate"]] ## This is the coefficient of prediction: 1 - mseRes$Substrate$mse[mseRes$Substrate$wh]/mseRes$Substrate$nullmse ## Plot of the MSE values: plot(mseRes$Substrate$mse, type="l", ylab="MSE", xlab="order", axes=FALSE, ylim=c(1000,6000)) points(x=1:length(mseRes$Substrate$mse), y=mseRes$Substrate$mse, pch=21, bg="black") axis(1) axis(2, las=1) abline(h=mseRes$Substrate$nullmse, lty=3) ## A list of the selected spatial eigenfunctions: sel[["Substrate"]] <- sort(mseRes$Substrate$ord[1:mseRes$Substrate$wh]) ## A linear model build using the selected spatial eigenfunctions: lm( formula = y~., data = cbind( y = salmon$Substrate[train], as.data.frame(sefTrain, wh=sel$Substrate) ) ) -> lm[["Substrate"]] ## Calculating predictions of D50 at each 1 m intervals: predict( lm$Substrate, newdata = as.data.frame( predict( object = sefTrain, newdata = xx, wh = sel$Substrate ) ) ) -> prd[["Substrate"]] ## Plot of the predicted D50 (solid line), and observed D50 for the training ## set (black markers) and testing set (red markers): plot(x=xx, y=prd$Substrate, type="l", ylim=range(salmon$Substrate, prd$Substrate), las=1, ylab="D50 (mm)", xlab="Location along the transect (m)") points(x = salmon$Position[train], y = salmon$Substrate[train], pch=21, bg="black") points(x = salmon$Position[test], y = salmon$Substrate[test], pch=21, bg="red") ## Spatially-explicit modelling of Atlantic salmon parr abundance using ## x=channel depth + water velocity + D50 + pMEM: ## Requires suggested package glmnet to perform elasticnet regression: library(glmnet) ## Calculation of the elastic net model (cross-validated): cv.glmnet( y = salmon$Abundance[train], x = cbind( Depth = salmon$Depth[train], Velocity = salmon$Velocity[train], Substrate = salmon$Substrate[train], as.matrix(sefTrain) ), family = "poisson" ) -> cvglm ## Calculating predictions for the test data: predict( cvglm, newx = cbind( Depth = salmon$Depth[test], Velocity = salmon$Velocity[test], Substrate = salmon$Substrate[test], predict(sefTrain, salmon$Position[test]) ), s="lambda.min", type = "response" ) -> yhatTest ## Calculating predictions for the transect (1 m separated data): predict( cvglm, newx = cbind( Depth = prd$Depth, Velocity = prd$Velocity, Substrate = prd$Substrate, predict(sefTrain, xx) ), s = "lambda.min", type = "response" ) -> yhatTransect ## Plot of the predicted Atlantic salmon parr abundance (solid line, with the ## depth, velocity, and D50 also predicted using spatially-explicit ## submodels), the observed abundances for the training set (black markers), ## the observed abundances for the testing set (red markers), and the ## predicted abundances for the testing set calculated on the basis of ## observed depth, velocity, and median substrate grain size: plot(x=xx, y=yhatTransect, type="l", ylim=range(salmon$Abundance,yhatTransect), las=1, ylab="Abundance (fish)", xlab="Location along the transect (m)") points(x=salmon$Position[train], y=salmon$Abundance[train], pch=21, bg="black") points(x=salmon$Position[test], y=salmon$Abundance[test], pch=21, bg="red") points(x=salmon$Position[test], y=yhatTest, pch=21, bg="green") ## End(Not run) ## Restoring previous graphical parameters: par(tmp)
Spatially-referenced observations of juvenile Atlantic salmon (parr) density and habitat variables along a 1520 m transect in the Sainte-Marguerite River, Québec, Canada.
A data.frame with 76 rows (sampling sites) and 5 columns:
Numeric: distance along transect from Bardsville
(m).
Integer: count of Atlantic salmon parr observed per 20 m site.
Numeric: mean water depth (m) at thalweg.
Numeric: mean current velocity (m/s) at thalweg.
Numeric: mean substrate grain size (mm, D50) at thalweg.
Data were collected on July 7, 2002, along a 1520 m river segment starting
at Bardsville (48°23'01.59” N, 70°12'10.05” W). The transect was
divided into 76 contiguous 20 m sites. At each site, snorkelers recorded
parr abundance while moving upstream in a zigzag pattern to minimize
disturbance.
Environmental variables were measured at the thalweg (channel centerline) at the midpoint of each 20 m site (i.e., 10 m from each boundary):
Depth: Mean water depth (m)
Velocity: Mean current velocity (m/s)
Substrate: Mean grain size (mm, D50 metric)
Position is a local Cartesian coordinate (meters) increasing
upstream from the Bardsville reference point. No projected CRS is
assigned; treat as a 1D transect for spatial modeling.
Daniel Boisclair, Département de sciences biologiques, Université de Montréal, Montréal, Québec, Canada.
Guénard, G., Legendre, P., Boisclair, D., and Bilodeau, M. 2010. Multiscale codependence analysis: an integrated approach to analyse relationships across scales. Ecology 91: 2952-2964 <doi:10.1890/09-0460.1>
Bouchard, J. and Boisclair, D. 2008. The relative importance of local, lateral, and longitudinal variables on the development of habitat quality models for a river. Can. J. Fish. Aquat. Sci. 65: 61-73 <doi:10.1139/f07-140>
data(salmon) ## Quick summary: summary(salmon) #> Position Abundance Depth Velocity Substrate #> Min. :1280 Min. : 0.000 Min. :0.3600 Min. :0.1900 Min. : 51.0 #> 1st Qu.:1655 1st Qu.: 0.000 1st Qu.:0.5075 1st Qu.:0.4700 1st Qu.:159.5 #> Median :2030 Median : 2.000 Median :0.6050 Median :0.5750 Median :220.0 #> Mean :2030 Mean : 2.092 Mean :0.6184 Mean :0.5868 Mean :213.9 #> 3rd Qu.:2405 3rd Qu.: 3.000 3rd Qu.:0.6900 3rd Qu.:0.6725 3rd Qu.:265.2 #> Max. :2780 Max. :14.000 Max. :1.0600 Max. :1.0700 Max. :381.0 ## Simple plot of parr abundance along the transect: if(interactive()) { plot(salmon$Position, salmon$Abundance, type = "b", pch = 21, bg = "black", xlab = "Position along transect (m)", ylab = "Parr abundance") }data(salmon) ## Quick summary: summary(salmon) #> Position Abundance Depth Velocity Substrate #> Min. :1280 Min. : 0.000 Min. :0.3600 Min. :0.1900 Min. : 51.0 #> 1st Qu.:1655 1st Qu.: 0.000 1st Qu.:0.5075 1st Qu.:0.4700 1st Qu.:159.5 #> Median :2030 Median : 2.000 Median :0.6050 Median :0.5750 Median :220.0 #> Mean :2030 Mean : 2.092 Mean :0.6184 Mean :0.5868 Mean :213.9 #> 3rd Qu.:2405 3rd Qu.: 3.000 3rd Qu.:0.6900 3rd Qu.:0.6725 3rd Qu.:265.2 #> Max. :2780 Max. :14.000 Max. :1.0600 Max. :1.0700 Max. :381.0 ## Simple plot of parr abundance along the transect: if(interactive()) { plot(salmon$Position, salmon$Abundance, type = "b", pch = 21, bg = "black", xlab = "Position along transect (m)", ylab = "Parr abundance") }
SEMap objects store spatial eigenfunctions derived from distance-based
weighting matrices, enabling spatially-explicit prediction at sampled and
unsampled locations.
genSEF(x, m, f, tol = .Machine$double.eps^0.5) ## S3 method for class 'SEMap' print(x, ...) ## S3 method for class 'SEMap' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S3 method for class 'SEMap' as.matrix(x, ...) ## S3 method for class 'SEMap' predict(object, newdata, ...)genSEF(x, m, f, tol = .Machine$double.eps^0.5) ## S3 method for class 'SEMap' print(x, ...) ## S3 method for class 'SEMap' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S3 method for class 'SEMap' as.matrix(x, ...) ## S3 method for class 'SEMap' predict(object, newdata, ...)
x |
For |
m |
A distance metric function (e.g., from |
f |
A distance weighting function (e.g., from |
tol |
Tolerance threshold (positive numeric). Eigenvectors with absolute
eigenvalues below this value are discarded. Default is
|
... |
Additional arguments passed to methods (e.g., |
row.names, optional
|
Passed to |
object |
An |
newdata |
A numeric matrix or vector of coordinates for prediction
( |
An SEMap-class object is a list with class attribute
"SEMap" containing the following components:
showFunction to print object summary.
getIMoranFunction returning Moran's I coefficients for each eigenfunction.
getSEF Function returning eigenvectors; accepts wh
argument to select specific columns.
getLambdaFunction returning eigenvalues.
getPredictor Function computing eigenfunction scores at new
locations; accepts xx (coordinates) and wh (selection).
Predictive Moran's Eigenvector Maps (pMEM) allows one to model the spatial variability of an environmental variable and use the resulting model for making prediction at any location on and around the sampling points.
pMEM eigenfunctions are computed as follows:
Compute pairwise distances from coordinates using m(x, x).
Transform distances to weights using f(d).
Double-center the weight matrix (row and column means subtracted).
Perform eigenvalue decomposition on the centered matrix.
Retain eigenvectors with absolute eigenvalues above tol.
The predict method calculates eigenfunction scores at unsampled
locations by:
Computing distances from training sites to new locations.
Transforming to weights and re-centering using training site centers.
Projecting onto the eigenvector basis via matrix multiplication.
When using asymmetric distance metrics (via genDistMetric(delta)),
eigenfunctions are complex-valued. The real and imaginary parts represent
directional spatial patterns (e.g., upstream vs. downstream in rivers).
Standard workflow:
Generate SEMap object with genSEF().
Extract eigenvectors with as.matrix() or
as.data.frame().
Select optimal subset using getMinMSE or other
criteria.
Fit model (e.g., lm(), glm()) with selected
eigenvectors.
Predict at new locations using predict(SEMap, newdata).
genSEF An SEMap-class object containing
eigenfunctions, eigenvalues, and prediction methods.
print.SEMapNULL (invisibly); prints summary to
console.
as.data.frame.SEMap A data.frame with eigenvectors as
columns.
as.matrix.SEMapA numeric/complex matrix of eigenvectors.
predict.SEMap A matrix of eigenfunction scores at
newdata locations (n_new × k, where k is the
number of eigenvectors).
genSEF(): Predictive Moran's Eigenvector Map (pMEM) Generation
Generates a predictive spatial eigenvector map (a SEMap-class object).
print(SEMap): Print SEMap-class
A print method for SEMap-class objects.
as.data.frame(SEMap): An as.data.frame Method for SEMap-class Objects
A method to extract the spatial eigenvectors from an SEMap-class
object as a data frame.
as.matrix(SEMap): An as.matrix Method for SEMap-class Objects
A method to extract the spatial eigenvectors from an SEMap-class
object as a matrix.
predict(SEMap): A predict Method for SEMap-class Objects
A method to obtain predictions from an SEMap-class object.
Guillaume Guénard [aut, cre] (ORCID: <https://orcid.org/0000-0003-0761-3072>), Pierre Legendre [ctb] (ORCID: <https://orcid.org/0000-0002-3838-3305>)
## Case 1: one-dimensional symmetrical n <- 11 x <- (n - 1)*seq(0, 1, length.out=n) ## Store graphical parameters: tmp <- par(no.readonly = TRUE) par(las=1) sef <- genSEF(x, genDistMetric(), genDWF("Gaussian",3)) sef #> A SEMap-class object #> -------------------- #> Number of sites: 11 #> Directional: no #> Number of components: 10 #> Eigenvalues: 3.28700,1.98782,0.79880,...,0.00005,0.00000 #> -------------------- ## Extract eigenvectors: dim(as.matrix(sef)) # 11 × 10 matrix #> [1] 11 10 ## Predict at new locations: xx <- (n - 1)*seq(0, 1, 0.01) pred <- predict(sef, xx, wh=1:3) dim(pred) # 101 × 3 matrix #> [1] 101 3 ## Quick plot of the first eigenfunction: if(interactive()) { plot(xx, pred[, 1], type="l", xlab="Position", ylab="pMEM_1") points(x, as.matrix(sef)[, 1], pch=21, bg="black") } ## Not run: ## The Second eigenfunction: plot(y = predict(sef, xx, wh=2), x=xx, type="l", ylab="PMEM_2", xlab="x") points(y=as.matrix(sef, wh=2), x=x) plot(y=predict(sef, xx, wh=5), x=xx, type="l", ylab="PMEM_5", xlab="x") points(y=as.matrix(sef, wh=5), x=x) ## Case 2: one-dimensional asymmetrical (each has a real and imaginary parts) sef <- genSEF(x, genDistMetric(delta=pi/8), genDWF("Gaussian",3)) ## First asymmetric eigenfunction: plot(y = Re(predict(sef, xx, wh=1)), x=xx, type="l", ylab="PMEM_1", xlab="x", ylim=c(-0.35,0.35)) lines(y = Im(predict(sef, xx, wh=1)), x=xx, col="red") points(y=Re(as.matrix(sef, wh=1)), x=x) points(y=Im(as.matrix(sef, wh=1)), x=x, col="red") ## Second asymmetric eigenfunction: plot(y=Re(predict(sef, xx, wh=2)), x=xx, type="l", ylab="PMEM_2", xlab="x", ylim=c(-0.45,0.35)) lines(y = Im(predict(sef, xx, wh=2)), x=xx, col="red") points(y = Re(as.matrix(sef, wh=2)), x=x) points(y = Im(as.matrix(sef, wh=2)), x=x, col="red") ## Fifth asymmetric eigenfunction: plot(y = Re(predict(sef, xx, wh=5)), x=xx, type="l", ylab="PMEM_5", xlab="x", ylim=c(-0.45,0.35)) lines(y = Im(predict(sef, xx, wh=5)), x=xx, col="red") points(y = Re(as.matrix(sef, wh=5)), x=x) points(y = Im(as.matrix(sef, wh=5)), x=x, col="red") ## A function to display combinations of the real and imaginary parts: plotAsy <- function(object, xx, wh, a, ylim) { pp <- predict(object, xx, wh=wh) plot(y = cos(a)*Re(pp) + sin(a)*Im(pp), x = xx, type = "l", ylab = "PMEM_5", xlab = "x", ylim=ylim, col="green") invisible(NULL) } ## Display combinations at an angle of 45° (pMEM_5): plotAsy(sef, xx, 5, pi/4, ylim=c(-0.45,0.45)) ## Display combinations for other angles: for(i in 0:15) plotAsy(sef, xx, 5, i*pi/8, ylim=c(-0.45,0.45)) ## Case 3: two-dimensional symmetrical cbind( x = c(-0.5,0.5,-1,0,1,-0.5,0.5), y = c(rep(sqrt(3)/2,2L),rep(0,3L),rep(-sqrt(3)/2,2L)) ) -> x2 seq(min(x2[,1L]) - 0.3, max(x2[,1L]) + 0.3, 0.05) -> xx seq(min(x2[,2L]) - 0.3, max(x2[,2L]) + 0.3, 0.05) -> yy list( x = xx, y = yy, coords = cbind( x = rep(xx, length(yy)), y = rep(yy, each = length(xx)) ) ) -> ss cc <- seq(0,1,0.01) cc <- c(rgb(cc,cc,1),rgb(1,1-cc,1-cc)) sef <- genSEF(x2, genDistMetric(), genDWF("Gaussian",3)) scr <- predict(sef, ss$coords) par(mfrow = c(2,3), mar=0.5*c(1,1,1,1)) for(i in 1L:6) { image(z=matrix(scr[,i],length(ss$x),length(ss$y)), x=ss$x, y=ss$y, asp=1, zlim=max(abs(scr[,i]))*c(-1,1), col=cc, axes=FALSE) points(x = x2[,1L], y = x2[,2L]) } ## Case 4: two-dimensional asymmetrical sef <- genSEF(x2, genDistMetric(delta=pi/8), genDWF("Gaussian",1)) ## Note: default influence angle is 0 (with respect to the abscissa) ## A function to display combinations of the real and imaginary parts (2D): plotAsy2 <- function(object, ss, a) { pp <- predict(object, ss$coords) for(i in 1:6) { z <- cos(a)*Re(pp[,i]) + sin(a)*Im(pp[,i]) image(z=matrix(z,length(ss$x),length(ss$y)), x=ss$x, y=ss$y, asp=1, zlim=max(abs(z))*c(-1,1), col=cc, axes=FALSE) } invisible(NULL) } ## Display combinations at an angle of 22°: plotAsy2(sef, ss, pi/8) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## With an influence of +45° (with respect to the abscissa) sef <- genSEF(x2, genDistMetric(delta=pi/8, theta=pi/4), genDWF("Gaussian",1)) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## With an influence of +90° (with respect to the abscissa) sef <- genSEF(x2, genDistMetric(delta=pi/8, theta=pi/2), genDWF("Gaussian",1)) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## With an influence of -45° (with respect to the abscissa) sef <- genSEF(x2, genDistMetric(delta=pi/8, theta=-pi/4), genDWF("Gaussian",1)) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## End(Not run) ## Reverting to initial graphical parameters: par(tmp)## Case 1: one-dimensional symmetrical n <- 11 x <- (n - 1)*seq(0, 1, length.out=n) ## Store graphical parameters: tmp <- par(no.readonly = TRUE) par(las=1) sef <- genSEF(x, genDistMetric(), genDWF("Gaussian",3)) sef #> A SEMap-class object #> -------------------- #> Number of sites: 11 #> Directional: no #> Number of components: 10 #> Eigenvalues: 3.28700,1.98782,0.79880,...,0.00005,0.00000 #> -------------------- ## Extract eigenvectors: dim(as.matrix(sef)) # 11 × 10 matrix #> [1] 11 10 ## Predict at new locations: xx <- (n - 1)*seq(0, 1, 0.01) pred <- predict(sef, xx, wh=1:3) dim(pred) # 101 × 3 matrix #> [1] 101 3 ## Quick plot of the first eigenfunction: if(interactive()) { plot(xx, pred[, 1], type="l", xlab="Position", ylab="pMEM_1") points(x, as.matrix(sef)[, 1], pch=21, bg="black") } ## Not run: ## The Second eigenfunction: plot(y = predict(sef, xx, wh=2), x=xx, type="l", ylab="PMEM_2", xlab="x") points(y=as.matrix(sef, wh=2), x=x) plot(y=predict(sef, xx, wh=5), x=xx, type="l", ylab="PMEM_5", xlab="x") points(y=as.matrix(sef, wh=5), x=x) ## Case 2: one-dimensional asymmetrical (each has a real and imaginary parts) sef <- genSEF(x, genDistMetric(delta=pi/8), genDWF("Gaussian",3)) ## First asymmetric eigenfunction: plot(y = Re(predict(sef, xx, wh=1)), x=xx, type="l", ylab="PMEM_1", xlab="x", ylim=c(-0.35,0.35)) lines(y = Im(predict(sef, xx, wh=1)), x=xx, col="red") points(y=Re(as.matrix(sef, wh=1)), x=x) points(y=Im(as.matrix(sef, wh=1)), x=x, col="red") ## Second asymmetric eigenfunction: plot(y=Re(predict(sef, xx, wh=2)), x=xx, type="l", ylab="PMEM_2", xlab="x", ylim=c(-0.45,0.35)) lines(y = Im(predict(sef, xx, wh=2)), x=xx, col="red") points(y = Re(as.matrix(sef, wh=2)), x=x) points(y = Im(as.matrix(sef, wh=2)), x=x, col="red") ## Fifth asymmetric eigenfunction: plot(y = Re(predict(sef, xx, wh=5)), x=xx, type="l", ylab="PMEM_5", xlab="x", ylim=c(-0.45,0.35)) lines(y = Im(predict(sef, xx, wh=5)), x=xx, col="red") points(y = Re(as.matrix(sef, wh=5)), x=x) points(y = Im(as.matrix(sef, wh=5)), x=x, col="red") ## A function to display combinations of the real and imaginary parts: plotAsy <- function(object, xx, wh, a, ylim) { pp <- predict(object, xx, wh=wh) plot(y = cos(a)*Re(pp) + sin(a)*Im(pp), x = xx, type = "l", ylab = "PMEM_5", xlab = "x", ylim=ylim, col="green") invisible(NULL) } ## Display combinations at an angle of 45° (pMEM_5): plotAsy(sef, xx, 5, pi/4, ylim=c(-0.45,0.45)) ## Display combinations for other angles: for(i in 0:15) plotAsy(sef, xx, 5, i*pi/8, ylim=c(-0.45,0.45)) ## Case 3: two-dimensional symmetrical cbind( x = c(-0.5,0.5,-1,0,1,-0.5,0.5), y = c(rep(sqrt(3)/2,2L),rep(0,3L),rep(-sqrt(3)/2,2L)) ) -> x2 seq(min(x2[,1L]) - 0.3, max(x2[,1L]) + 0.3, 0.05) -> xx seq(min(x2[,2L]) - 0.3, max(x2[,2L]) + 0.3, 0.05) -> yy list( x = xx, y = yy, coords = cbind( x = rep(xx, length(yy)), y = rep(yy, each = length(xx)) ) ) -> ss cc <- seq(0,1,0.01) cc <- c(rgb(cc,cc,1),rgb(1,1-cc,1-cc)) sef <- genSEF(x2, genDistMetric(), genDWF("Gaussian",3)) scr <- predict(sef, ss$coords) par(mfrow = c(2,3), mar=0.5*c(1,1,1,1)) for(i in 1L:6) { image(z=matrix(scr[,i],length(ss$x),length(ss$y)), x=ss$x, y=ss$y, asp=1, zlim=max(abs(scr[,i]))*c(-1,1), col=cc, axes=FALSE) points(x = x2[,1L], y = x2[,2L]) } ## Case 4: two-dimensional asymmetrical sef <- genSEF(x2, genDistMetric(delta=pi/8), genDWF("Gaussian",1)) ## Note: default influence angle is 0 (with respect to the abscissa) ## A function to display combinations of the real and imaginary parts (2D): plotAsy2 <- function(object, ss, a) { pp <- predict(object, ss$coords) for(i in 1:6) { z <- cos(a)*Re(pp[,i]) + sin(a)*Im(pp[,i]) image(z=matrix(z,length(ss$x),length(ss$y)), x=ss$x, y=ss$y, asp=1, zlim=max(abs(z))*c(-1,1), col=cc, axes=FALSE) } invisible(NULL) } ## Display combinations at an angle of 22°: plotAsy2(sef, ss, pi/8) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## With an influence of +45° (with respect to the abscissa) sef <- genSEF(x2, genDistMetric(delta=pi/8, theta=pi/4), genDWF("Gaussian",1)) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## With an influence of +90° (with respect to the abscissa) sef <- genSEF(x2, genDistMetric(delta=pi/8, theta=pi/2), genDWF("Gaussian",1)) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## With an influence of -45° (with respect to the abscissa) sef <- genSEF(x2, genDistMetric(delta=pi/8, theta=-pi/4), genDWF("Gaussian",1)) ## Combinations at other angles: for(i in 0:23) plotAsy2(sef, ss, i*pi/12) ## End(Not run) ## Reverting to initial graphical parameters: par(tmp)