Part 9. Spatial Statistics

$$ \def\pr{\hbox{Pr}} \def\var{\hbox{var}} \def\cov{\hbox{cov}} \def\tr{\hbox{tr}} \def\corr{\hbox{corr}} \def\dmX{\un{\mathcal{X}}} \def\dmG{\un{\mathcal{G}}} \def\dmK{\un{\mathcal{K}}} \def\dmS{\un{\mathcal{S}}} \def\dmC{\un{\mathcal{C}}} \def\dmZ{\un{\mathcal{Z}}} \def\bma{{\mbox{\boldmath $\alpha$}}} \def\bmb{{\mbox{\boldmath $\beta$}}} \def\bmu{{\mbox{\boldmath $\mu$}}} \def\bme{{\mbox{\boldmath $\epsilon$}}} \def\bmS{{\mbox{\boldmath $\Sigma$}}} \def\bmL{{\mbox{\boldmath $\Lambda$}}} \def\bmd{{\mbox{\boldmath $\delta$}}} \def\bmD{{\mbox{\boldmath $\Delta$}}} \def\bmG{{\mbox{\boldmath $\Gamma$}}} \def\bmphi{{\mbox{\boldmath $\phi$}}} \def\bmPhi{{\mbox{\boldmath $\Phi$}}} \def\bmpsi{{\mbox{\boldmath $\psi$}}} \def\bmtheta{{\mbox{\boldmath $\theta$}}} \def\eq{\begin{equation}} \def\eeq{\end{equation}} \def\i{{\bf i}} \def\un#1{{\bf #1}}$$

The semi-variogram

A random vector corresponding to some two-dimensional (geo)spatial measurement at a point $\un x$:

$$\un Z(\un x),\quad \un x \in \Re^2.$$

For first-order stationarity, the mean vector

$$\langle\un Z(\un x)\rangle = \bf \mu$$

is constant, independent of $\un x$, so without loss of generality we can assume

$$\langle\un Z(\un x)\rangle = \un 0.$$

For second-order stationarity, the *spatial covariance*

$$\un C(\un x,\un h) = \langle \un Z(\un x) \un Z(\un x+\un h)^\top\rangle = \un C(\un h)$$

is also constant, independent of $\un x$. Here $\un h$ is the “lag”.

We have

$$\un C(\un 0) = \langle\un Z(\un x)\un Z(\un x)^\top\rangle = \bf\Sigma$$

and

$$\eqalign{ \un C(-\un h) &= \langle \un Z(\un x) \un Z(\un x-\un h)^\top\rangle\cr &= \langle \un Z(\un x+\un h) \un Z(\un x)^\top\rangle\cr &= \langle (\un Z(\un x) \un Z(\un x+\un h)^\top)^\top\rangle\cr &= \un C(\un h)^\top. }$$
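The transpose relation $\un C(-\un h)=\un C(\un h)^\top$ can be checked directly on empirical covariances. A minimal numpy sketch over a simulated two-channel sequence, with the lag running along the sample index (sample size, lag, and all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
N, h = 1000, 5
Z = rng.standard_normal((N, 2))  # N samples of a 2-component random vector, mean zero

# empirical spatial covariances at lags +h and -h
C_plus = sum(np.outer(Z[i], Z[i + h]) for i in range(N - h)) / (N - h)
C_minus = sum(np.outer(Z[i], Z[i - h]) for i in range(h, N)) / (N - h)

# term by term, the empirical C(-h) is exactly the transpose of C(h)
assert np.allclose(C_minus, C_plus.T)
```

The equality is exact here (not just approximate) because both sums run over the same set of sample pairs, only with the factors in reversed order.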

The (multivariate) *variogram* is defined as the covariance matrix of the random vector $\un Z(\un x)-\un Z(\un x+\un h)$:

$$\eqalign{ 2\bf\Gamma(\un h)&=\langle (\un Z(\un x)-\un Z(\un x+\un h)) (\un Z(\un x)-\un Z(\un x+\un h))^\top\rangle\cr &=\langle\un Z(\un x) \un Z(\un x)^\top\rangle+\langle\un Z(\un x+\un h) \un Z(\un x+\un h)^\top\rangle\cr &\quad\quad -\langle\un Z(\un x) \un Z(\un x+\un h)^\top\rangle-\langle\un Z(\un x+\un h) \un Z(\un x)^\top\rangle\cr &=2\bf\Sigma-\un C(\un h)-\un C(-\un h).}$$

Note that $\bf\Gamma(\un h)$ is symmetric (why?).

For scalar measurements, this is written

$$2\gamma(\un h) = 2 \sigma^2 - C(\un h) - C(-\un h) = 2(\sigma^2-C(\un h))$$

and

$$\gamma(\un h)=\sigma^2-C(\un h)$$

is called the *semi-variogram*.

An estimate for the semi-variogram based on sample measurements is

$$\hat\gamma(\un h) = {1\over 2N(\un h)}\sum_{i=1}^{N(\un h)}[z(\un x_i)-z(\un x_i+\un h)]^2$$

where $N(\un h)$ is the number of sample point pairs separated by $\un h$, both in magnitude and direction.
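For a regularly sampled 1-D transect the lag reduces to an integer step, and the estimator translates directly into code; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def semivariogram(z, max_lag):
    """Experimental semi-variogram of a regularly sampled 1-D transect.

    gamma_hat[h] = 1/(2 N(h)) * sum of (z[i] - z[i+h])**2 over the
    N(h) sample pairs separated by lag h, for h = 0 ... max_lag.
    """
    z = np.asarray(z, dtype=float)
    gamma = np.empty(max_lag + 1)
    for h in range(max_lag + 1):
        d = z[: len(z) - h] - z[h:]   # all N(h) pair differences at lag h
        gamma[h] = 0.5 * np.mean(d ** 2)
    return gamma

gamma = semivariogram(np.sin(np.linspace(0.0, 10.0, 200)), max_lag=20)
# gamma[0] is exactly zero, and gamma grows with lag for a smooth signal
```

For irregularly spaced data one would instead bin the pairs by distance (and possibly direction) before averaging.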

Source: A. A. Nielsen (1994) *Analysis of regularly and irregularly sampled spatial, multivariate and multitemporal data*, PhD Thesis, Lyngby.

A simple, isotropic semi-variogram model (the spherical model):

$$ \gamma(h) = \cases{ 0 & if $h=0$\cr C_0 + C_1\left[{3\over 2}{h\over R}-{1\over 2}\left({h\over R}\right)^3\right] & if $0<h\le R$\cr C_0 + C_1 & if $h>R$} $$

where $h = \|\un h\|$, $C_0$ is the “nugget”, $R$ is the “range of influence” and $C_0+C_1$ is the “sill”.

The parameters can be estimated from the experimental semi-variograms by means of non-linear least-squares methods. See

Journel and Froidevaux (1982) *Anisotropic hole-effect modeling*

Cressie (2015) *Statistics for Spatial Data*, Wiley
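As a sketch of such a non-linear least-squares fit, the spherical-type model above can be written as a Python function and handed to `scipy.optimize.curve_fit`; the synthetic "experimental" values and all parameter names are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

def spherical(h, c0, c1, r):
    """Spherical semi-variogram: nugget c0, sill c0 + c1, range of influence r."""
    h = np.asarray(h, dtype=float)
    g = c0 + c1 * (1.5 * h / r - 0.5 * (h / r) ** 3)
    g = np.where(h > r, c0 + c1, g)   # constant at the sill beyond the range
    return np.where(h == 0.0, 0.0, g) # gamma(0) = 0 by definition

# synthetic experimental semi-variogram values at integer lags
lags = np.arange(1.0, 11.0)
rng = np.random.default_rng(1)
gamma_hat = spherical(lags, 0.1, 0.9, 6.0) + 0.01 * rng.standard_normal(lags.size)

# non-linear least-squares fit of nugget, partial sill and range
(c0, c1, r), _ = curve_fit(spherical, lags, gamma_hat, p0=[0.0, 1.0, 5.0])
```

A reasonable starting guess `p0` matters here, since the model is non-linear in the range parameter.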

Ordinary kriging (assuming second-order stationarity)

Suppose we have sampled the distribution characterized by the random variable $Z$ at the points $\un x_1\dots \un x_m$, so we have the measurements $z(\un x_1)\dots z(\un x_m)$. We want to interpolate to the point $\un x_0$. A linear, unbiased estimator is the sample function

$$\hat Z(\un x_0) = \sum_{i=1}^m w_i Z(\un x_i),$$

provided that

$$\sum_{i=1}^m w_i = 1.$$

That is,

$$\langle \hat Z(\un x_0)\rangle = \sum_{i=1}^m w_i\langle Z(\un x_i)\rangle = \sum_{i=1}^m w_i\langle Z \rangle = \langle Z \rangle.$$

The variance in the estimate is

$$\sigma_E^2 = \langle (\hat Z(\un x_0) - Z(\un x_0))^2 \rangle = \left\langle \sum_{i=0}^m w_iZ(\un x_i)\cdot\sum_{j=0}^m w_jZ(\un x_j)\right\rangle$$

where $w_0 = -1$. Therefore

$$\sigma_E^2 = \sum_{i,j=0}^m w_iw_j\langle Z(\un x_i)Z(\un x_j)\rangle = \sum_{i,j=0}^m w_iw_j C(\un x_i-\un x_j),$$

since

$$ \langle Z(\un x_i)Z(\un x_j)\rangle =\langle Z(\un x_i)Z(\un x_i-(\un x_i- \un x_j))\rangle = C(\un x_i-\un x_j). $$

With $C_{ij} = C(\un x_i-\un x_j)$, the minimum variance estimate is thus the minimum of the Lagrange function

$$L(\un w,\lambda) = \sum_{i,j=0}^m w_iw_j C_{ij} -2\lambda\left(\sum_{i=1}^m w_i -1\right),$$

with respect to $w_i$, $i=1\dots m$, and $\lambda$.

Taking derivatives $\partial L/\partial w_i$ and $\partial L/\partial\lambda$ and equating to 0 gives $$\sum_{j=1}^m w_jC_{ij} - \lambda = C_{i0},\quad i= 1\dots m,\quad \sum_{i=1}^m w_i = 1.$$

In matrix form: $$\pmatrix{C_{11} & \cdots & C_{1m} & 1\cr \vdots & & \vdots & \vdots\cr C_{m1} & \cdots & C_{mm} & 1\cr 1 & \cdots & 1 & 0} \pmatrix{w_1\cr\vdots\cr w_m\cr -\lambda} = \pmatrix{C_{10}\cr\vdots\cr C_{m0}\cr 1}$$ or $$\un C\un w = \un D.$$

We get the required kriging coefficients (and $\lambda$) from

$$\un w = \un C^{-1}\un D.$$

For second-order stationarity, $C_{ii} = \sigma^2$, $i=1\dots m.$ In terms of the semi-variogram,

$$C_{ij} = \sigma^2 - \gamma(\un x_i - \un x_j).$$

We get the kriging variance by substituting back into the expression for $\sigma_E^2$:

$$\eqalign{ \sigma_E^2 &= \sum_{i,j=0}^mw_iw_jC_{ij} = \sum_{j=0}^m w_j \sum_{i=0}^m w_i C_{ij}\cr &= w_0\sum_{i=0}^m w_iC_{i0} + \sum_{j=1}^m w_j(w_0C_{0j} + \sum_{i=1}^mw_iC_{ij})\cr &= -\sum_{i=0}^m w_iC_{i0} + \sum_{j=1}^m w_j(-C_{0j} + \lambda + C_{0j})\cr &= -w_0C_{00} - \sum_{i=1}^mw_iC_{i0} + \lambda \cr &= - \sum_{i=1}^mw_iC_{i0} + C_{00} + \lambda. }$$

Note that, if the point $\un x_0$ coincides with one of the support points, $\un x_0= \un x_j$ say,

$$\sigma_E^2 = - \sum_{i=1}^mw_iC_{ij} + C_{jj} + \lambda = -(C_{jj} + \lambda) + C_{jj} + \lambda = 0.$$
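The kriging system $\un C\un w=\un D$ and the variance expression translate into a few lines of numpy. In the sketch below, the exponential covariance model and all names are illustrative assumptions; any valid (positive definite) covariance function could be substituted:

```python
import numpy as np

def ordinary_kriging(points, z, x0, cov):
    """Ordinary kriging estimate at x0 from samples z at `points` (one row each).

    `cov(h)` is the assumed spatial covariance as a function of distance h.
    Returns the kriged estimate and the kriging variance sigma_E^2.
    """
    m = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    C = np.ones((m + 1, m + 1))        # bordered system matrix
    C[:m, :m] = cov(d)
    C[m, m] = 0.0
    D = np.ones(m + 1)
    D[:m] = cov(np.linalg.norm(points - x0, axis=-1))
    sol = np.linalg.solve(C, D)        # solution vector (w_1, ..., w_m, -lambda)
    w = sol[:m]
    lam = -sol[m]
    est = w @ z
    var = cov(0.0) - w @ D[:m] + lam   # sigma_E^2 = C_00 - sum_i w_i C_i0 + lambda
    return est, var

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
z = np.array([1.0, 2.0, 3.0])
cov = lambda h: np.exp(-np.asarray(h) / 2.0)  # assumed isotropic covariance model
est, var = ordinary_kriging(pts, z, pts[1], cov)
# at a support point the interpolation is exact and sigma_E^2 vanishes
```

Evaluating at a support point reproduces the datum with zero kriging variance, as derived above; at any other point the variance is positive.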

Maximizing spatial correlation: the MAF transform

Recall that we had for the multivariate variogram

$$2\bf\Gamma(\un h)=2\bf\Sigma-\un C(\un h)-\un C(-\un h),$$

where $$\un C(\un h) = \langle\un Z(\un x)\un Z(\un x+\un h)^\top\rangle$$

and

$$\un C(\un 0)=\bf\Sigma,\quad \un C(\un h) = \un C(-\un h)^\top.$$

Now let us look at the spatial covariance of projections $$Y=\un a^\top\un Z$$ of the original data. This is given by

$$\eqalign{ \cov(\un a^\top\un Z(\un x),\un a^\top\un Z(\un x+\un h)) &=\un a^\top \langle\un Z(\un x) \un Z(\un x+\un h)^\top\rangle \un a\cr &= \un a^\top \un C(\un h) \un a\cr &= \un a^\top \un C(-\un h) \un a \quad\hbox{(quadratic form!)}\cr &= {1\over 2}\cdot\un a^\top (\un C(\un h)+\un C(-\un h)) \un a\cr &= {1\over 2}\cdot\un a^\top (2\bf\Sigma-2\bf\Gamma(\un h)) \un a\cr &= \un a^\top\bf\Sigma\un a - \un a^\top\bf\Gamma(\un h)\un a.}$$

The *spatial autocorrelation* of the projections is therefore

$$\eqalign{ \corr(\un a^\top \un Z(\un x),\un a^\top \un Z(\un x+\un h)) &= {\un a^\top \bf\Sigma \un a-\un a^\top \bf\Gamma(\un h) \un a \over \sqrt{\var(\un a^\top \un Z(\un x))\var(\un a^\top \un Z(\un x+\un h))}}\cr &= {\un a^\top \bf\Sigma \un a - \un a^\top \bf\Gamma(\un h) \un a \over \sqrt{(\un a^\top \bf\Sigma \un a)(\un a^\top \bf\Sigma \un a)}}\cr &= 1- {\un a^\top \bf\Gamma(\un h) \un a\over \un a^\top \bf\Sigma \un a}.}$$

The *maximum autocorrelation factor* (MAF) transformation determines that vector $\un a$ which maximizes the spatial autocorrelation. We obtain it by minimizing the *Rayleigh quotient*

$$R(\un a) = {\un a^\top \bf\Gamma(\un h) \un a\over \un a^\top \bf\Sigma \un a}.$$

Setting the vector derivative equal to zero gives

$${\partial R\over \partial \un a}= {1\over \un a^\top \bf\Sigma \un a}\cdot 2\,\bf\Gamma(\un h) \un a -{\un a^\top \bf\Gamma(\un h) \un a\over(\un a^\top \bf\Sigma \un a)^2}\cdot 2\,\bf\Sigma \un a= \un 0$$

or

$$(\un a^\top \bf\Sigma \un a)\bf\Gamma(\un h) \un a = (\un a^\top \bf\Gamma(\un h) \un a)\bf\Sigma \un a.$$

This condition is met when $\un a$ solves the *generalized eigenvalue problem*

$$\bf\Gamma(\un h) \un a = \lambda\bf\Sigma \un a.$$

We can solve this eigenvalue problem as follows. Since $\bf\Sigma$ is symmetric positive definite, there exists a lower triangular matrix $\un L$ such that

$$\bf\Sigma = \un L\un L^\top.$$

So we can write

$$\bf\Gamma(\un h)\un a = \lambda\un L\un L^\top\un a.$$

Equivalently,

$$\un L^{-1}\bf\Gamma(\un h) (\un L^\top)^{-1}\un L^\top\un a = \lambda \un L^\top \un a,$$

or, with $\un b = \un L^\top\un a$,

$$[\un L^{-1}\bf\Gamma(\un h)(\un L^{-1})^\top]\un b = \lambda\un b,$$

a standard eigenvalue problem for the symmetric matrix $\un L^{-1}\bf\Gamma(\un h)(\un L^{-1})^\top$.

Let the eigenvalues be ordered from smallest to largest, $\lambda_1\le\dots \le\lambda_N$, and the corresponding (orthogonal) eigenvectors be $\un b_i$. We have

$$\un b_i^\top \un b_j= \un a_i^\top \un L \un L^\top \un a_j =\un a_i^\top \bf\Sigma \un a_j=\delta_{ij}$$

so that the MAF components $Y_i=\un a_i^\top\un Z$, $\ i=1\dots N$, are orthogonal (uncorrelated) with unit variance. Moreover,

$$\corr(\un a_i^\top \un Z(\un x),\un a_i^\top \un Z(\un x+\un h))=1-\lambda_i,\quad i=1\dots N,$$

and the first MAF component has maximum spatial autocorrelation.
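The whole construction — difference covariance, Cholesky reduction, back-transformation of the eigenvectors — can be sketched in numpy. Here the paired samples $\un Z(\un x)$ and $\un Z(\un x+\un h)$ come from a unit shift along a simulated multivariate transect (names and the unit-lag choice are illustrative):

```python
import numpy as np

def maf(X, Y):
    """MAF transform from paired samples X = Z(x), Y = Z(x + h), rows = observations.

    Solves Gamma(h) a = lambda Sigma a via the Cholesky reduction Sigma = L L^T,
    b = L^T a. Returns the eigenvalues (ascending) and the matrix A whose columns
    are the projection vectors a_i, normalized so that a_i^T Sigma a_j = delta_ij.
    """
    Sigma = np.cov(X, rowvar=False)
    Gamma = 0.5 * np.cov(X - Y, rowvar=False)   # semi-variogram matrix at lag h
    L = np.linalg.cholesky(Sigma)
    Li = np.linalg.inv(L)
    lam, B = np.linalg.eigh(Li @ Gamma @ Li.T)  # standard symmetric eigenproblem
    A = Li.T @ B                                # back-transform: a_i = (L^T)^{-1} b_i
    return lam, A

rng = np.random.default_rng(2)
Z = rng.standard_normal((500, 3)).cumsum(axis=0)  # spatially smooth 3-band series
lam, A = maf(Z[:-1], Z[1:])                       # unit lag along the transect
# smaller lam[i] means higher spatial autocorrelation of MAF component i
```

The columns of `A` diagonalize $\bf\Sigma$ to the identity, so the MAF components are uncorrelated with unit variance, as shown above.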

Allan A. Nielsen (PhD Thesis, 1994):

“The MAFs for irregularly spaced data possess the characteristic that they are orthogonal and they are ordered by decreasing autocorrelation. Typically, low order factors will contain a lot of signal, high order factors will contain a lot of noise. The MAFs thus relate beautifully to [...] interpolation by kriging. Because low-order MAFs contain signal they are expected to have low nugget effects and long ranges of influence. [...] As the factors are orthogonal there is no cross-correlation for small lags ($\un h$). I therefore suggest [...] kriging based on the MAF analysis of [...] multivariate data. In order to obtain kriged versions of the original data the inverse MAF transformation can be applied.”
