A random vector corresponding to some two-dimensional (geo)spatial measurement at a point $\un x$:
$$\un Z(\un x),\quad \un x \in \Re^2.$$For first-order stationarity, the mean vector
$$\langle\un Z(\un x)\rangle = \bf \mu$$is constant, independent of $\un x$, so without loss of generality we can assume
$$\langle\un Z(\un x)\rangle = \un 0.$$For second-order stationarity, the *spatial covariance*
$$\un C(\un x,\un h) = \langle \un Z(\un x) \un Z(\un x+\un h)^\top\rangle = \un C(\un h)$$is also constant, independent of $\un x$. Here $\un h$ is the “lag”.
We have
$$\un C(\un 0) = \langle\un Z(\un x)\un Z(\un x)^\top\rangle = \bf\Sigma$$and
$$\eqalign{ \un C(-\un h) &= \langle \un Z(\un x) \un Z(\un x-\un h)^\top\rangle\cr &= \langle \un Z(\un x+\un h) \un Z(\un x)^\top\rangle\cr &= \langle (\un Z(\un x) \un Z(\un x+\un h)^\top)^\top\rangle\cr &= \un C(\un h)^\top. }$$The (multivariate) *variogram* is defined as the covariance matrix of the random vector $\un Z(\un x)-\un Z(\un x+\un h)$:
$$\eqalign{ 2\bf\Gamma(\un h)&=\langle (\un Z(\un x)-\un Z(\un x+\un h)) (\un Z(\un x)-\un Z(\un x+\un h))^\top\rangle\cr &=\langle\un Z(\un x) \un Z(\un x)^\top\rangle+\langle\un Z(\un x+\un h) \un Z(\un x+\un h)^\top\rangle\cr &\quad\quad -\langle\un Z(\un x) \un Z(\un x+\un h)^\top\rangle-\langle\un Z(\un x+\un h) \un Z(\un x)^\top\rangle\cr &=2\bf\Sigma-\un C(\un h)-\un C(-\un h).}$$Note that $\bf\Gamma(\un h)$ is symmetric (why?).
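These identities can be checked numerically. The sketch below (Python/NumPy; the vector AR(1) series on a regular 1-D transect is an assumed stand-in for a stationary zero-mean random function, and the scalar `h` is a lag along the sampling axis) replaces expectations by raw sample moments over the same point pairs, for which the expansion of $2\bf\Gamma(\un h)$ holds exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for a stationary zero-mean bivariate random
# function: a vector AR(1) series sampled on a regular 1-D transect.
n, h = 2000, 5
Z = np.zeros((n, 2))
for t in range(1, n):
    Z[t] = 0.8 * Z[t - 1] + rng.standard_normal(2)

X, Y = Z[:-h], Z[h:]            # pairs (Z(x), Z(x+h))
S_xx = X.T @ X / len(X)         # <Z(x) Z(x)^T> over the pairs
S_yy = Y.T @ Y / len(Y)         # <Z(x+h) Z(x+h)^T>
C_h  = X.T @ Y / len(X)         # C(h)  = <Z(x) Z(x+h)^T>
C_mh = Y.T @ X / len(X)         # C(-h) = <Z(x+h) Z(x)^T>

D = X - Y
Gamma2 = D.T @ D / len(D)       # 2*Gamma(h), covariance of the differences
```

Expanding $(X-Y)^\top(X-Y)$ shows `Gamma2` equals `S_xx + S_yy - C_h - C_mh` term by term, and `C_mh` is exactly the transpose of `C_h`.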
For scalar measurements, this is written
$$2\gamma(\un h) = 2 \sigma^2 - C(\un h) - C(-\un h) = 2(\sigma^2-C(\un h))$$and
$$\gamma(\un h)=\sigma^2-C(\un h)$$is called the *semi-variogram*.
An estimate for the semi-variogram based on sample measurements is
$$\hat\gamma(\un h) = {1\over 2N(\un h)}\sum_{i=1}^{N(\un h)}[z(\un x_i)-z(\un x_i+\un h)]^2$$where $N(\un h)$ is the number of sample point pairs separated by $\un h$, both in magnitude and direction.
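A minimal sketch of this estimator in Python/NumPy. For irregularly spaced data, exact repetition of a lag vector is rare, so pairs are grouped here into isotropic distance bins of half-width `tol` (the function name and the binning scheme are illustrative assumptions, not from the source):

```python
import numpy as np

def empirical_semivariogram(x, z, lags, tol):
    """Isotropic experimental semi-variogram.

    gamma_hat(h) = 1/(2 N(h)) * sum of (z_i - z_j)^2 over the N(h)
    pairs whose separation distance lies within tol of h.
    """
    x = np.asarray(x, float)           # (m, 2) sample coordinates
    z = np.asarray(z, float)           # (m,)   measurements
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    sq = (z[:, None] - z[None, :]) ** 2
    iu = np.triu_indices(len(z), k=1)  # count each pair once
    d, sq = d[iu], sq[iu]
    gam, counts = [], []
    for h in lags:
        sel = np.abs(d - h) <= tol
        counts.append(sel.sum())
        gam.append(sq[sel].mean() / 2 if sel.any() else np.nan)
    return np.array(gam), np.array(counts)
```

For a spatially uncorrelated field, $\hat\gamma(\un h)\approx\sigma^2$ at every lag, which gives a quick sanity check of the implementation.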
Source: A. A. Nielsen (1994) *Analysis of regularly and irregularly sampled spatial, multivariate and multitemporal data*, PhD Thesis, Lyngby.
A simple, isotropic semi-variogram model:
$$ \gamma(h) = \cases{ 0 & if $h=0$\cr C_0 + C_1\left[{3\over 2}{h\over R}-{1\over 2}\left({h\over R}\right)^3\right] & if $0<h\le R$\cr C_0 + C_1 & if $h>R$.}$$The parameters $C_0$ (nugget), $C_0+C_1$ (sill) and $R$ (range) can be estimated from the experimental semi-variograms by means of non-linear least squares methods. See
Journel and Froidevaux (1982), *Anisotropic hole-effect modeling*.
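As an illustration of such a fit, a sketch in Python using SciPy's `curve_fit` as the non-linear least-squares routine (the "experimental" values here are assumptions, generated from the spherical model itself plus noise):

```python
import numpy as np
from scipy.optimize import curve_fit   # non-linear least squares

def spherical(h, C0, C1, R):
    """Spherical semi-variogram: nugget C0, sill C0 + C1, range R."""
    h = np.asarray(h, float)
    g = np.where(h <= R,
                 C0 + C1 * (1.5 * h / R - 0.5 * (h / R) ** 3),
                 C0 + C1)
    return np.where(h == 0, 0.0, g)

# Hypothetical experimental semi-variogram values for illustration:
lags = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1.0])
ghat = spherical(lags, 0.2, 1.0, 0.7) \
       + 0.02 * np.random.default_rng(1).standard_normal(lags.size)

(C0, C1, R), _ = curve_fit(spherical, lags, ghat,
                           p0=[0.1, 1.0, 0.5], bounds=(1e-6, np.inf))
```

The positivity bounds keep the optimizer away from $R\le 0$, where the model is undefined.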
Suppose we have sampled the random function $Z$ at the points $\un x_1\dots \un x_m$, so we have the measurements $z(\un x_1)\dots z(\un x_m)$. We want to interpolate to the point $\un x_0$. A linear, unbiased estimator is the sample function
$$\hat Z(\un x_0) = \sum_{i=1}^m w_i Z(\un x_i),$$provided that
$$\sum_{i=1}^m w_i = 1.$$That is,
$$\langle \hat Z(\un x_0)\rangle = \sum_{i=1}^m w_i\langle Z(\un x_i)\rangle = \sum_{i=1}^m w_i\langle Z \rangle = \langle Z \rangle.$$The variance in the estimate is
$$\sigma_E^2 = \langle (\hat Z(\un x_0) - Z(\un x_0))^2 \rangle = \left\langle \sum_{i=0}^m w_iZ(\un x_i)\cdot\sum_{j=0}^m w_jZ(\un x_j)\right\rangle$$where $w_0 = -1$. Therefore
$$\sigma_E^2 = \sum_{i,j=0}^m w_iw_j\langle Z(\un x_i)Z(\un x_j)\rangle = \sum_{i,j=0}^m w_iw_j C(\un x_i-\un x_j),$$since
$$ \langle Z(\un x_i)Z(\un x_j)\rangle =\langle Z(\un x_i)Z(\un x_i-(\un x_i- \un x_j))\rangle = C(\un x_i-\un x_j). $$With $C_{ij} = C(\un x_i-\un x_j)$, the minimum variance estimate is thus the minimum of the Lagrange function
$$L(\un w,\lambda) = \sum_{i,j=0}^m w_iw_j C_{ij} -\lambda(\sum_{i=1}^m w_i -1),$$with respect to $w_i$, $i=1\dots m$, and $\lambda$.
Taking derivatives $\partial L/\partial w_i$ and $\partial L/\partial\lambda$ and equating to 0 gives $$\sum_{j=1}^m w_jC_{ij} - \lambda = C_{i0},\quad i= 1\dots m,\quad \sum_{i=1}^m w_i = 1.$$
In matrix form: $$\pmatrix{C_{11} & \cdots & C_{1m} & 1\cr \vdots & \ddots & \vdots & \vdots\cr C_{m1} & \cdots & C_{mm} & 1\cr 1 & \cdots & 1 & 0} \pmatrix{w_1\cr\vdots\cr w_m\cr -\lambda} = \pmatrix{C_{10}\cr\vdots\cr C_{m0}\cr 1}$$ or $$\un C\un w = \un D.$$
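This system is small and can be solved directly. A sketch in Python/NumPy (the function name is illustrative, and the exponential covariance model in the usage lines is an assumed choice; the kriging variance uses the expression $\sigma_E^2 = C_{00} + \lambda - \sum_i w_iC_{i0}$ derived further below):

```python
import numpy as np

def ordinary_kriging(x, z, x0, cov):
    """Solve the kriging system C w = D for the weights, the estimate
    z_hat(x0) = sum_i w_i z_i, and the kriging variance."""
    x = np.asarray(x, float)
    m = len(x)
    C = np.empty((m + 1, m + 1))
    C[:m, :m] = cov(np.linalg.norm(x[:, None] - x[None, :], axis=-1))
    C[m, :m] = C[:m, m] = 1.0          # unbiasedness constraint
    C[m, m] = 0.0
    D = np.empty(m + 1)
    D[:m] = cov(np.linalg.norm(x - x0, axis=-1))
    D[m] = 1.0
    sol = np.linalg.solve(C, D)
    w, lam = sol[:m], -sol[m]          # last unknown is -lambda
    zhat = w @ z
    var = cov(0.0) + lam - w @ D[:m]   # sigma_E^2 = C00 + lambda - sum w_i C_i0
    return zhat, var, w

# Usage sketch: at a support point the estimate reproduces the datum
# and the kriging variance vanishes.
rng = np.random.default_rng(0)
pts = rng.random((6, 2))
vals = rng.standard_normal(6)
expcov = lambda h: np.exp(-np.asarray(h, float))  # assumed covariance model
zhat, var, w = ordinary_kriging(pts, vals, pts[2], expcov)
```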
We get the required kriging coefficients (and $\lambda$) from
$$\un w = \un C^{-1}\un D.$$For second-order stationarity, $C_{ii} = \sigma^2$, $i=1\dots m.$ In terms of the semi-variogram,
$$C_{ij} = \sigma^2 - \gamma(\un x_i - \un x_j).$$We get the kriging variance by substituting back into the expression for $\sigma_E^2$:
$$\eqalign{ \sigma_E^2 &= \sum_{i,j=0}^mw_iw_jC_{ij} = \sum_{j=0}^m w_j \sum_{i=0}^m w_i C_{ij}\cr &= w_0\sum_{i=0}^m w_iC_{i0} + \sum_{j=1}^m w_j(w_0C_{0j} + \sum_{i=1}^mw_iC_{ij})\cr &= -\sum_{i=0}^m w_iC_{i0} + \sum_{j=1}^m w_j(-C_{0j} + \lambda + C_{0j})\cr &= -w_0C_{00} - \sum_{i=1}^mw_iC_{i0} + \lambda \cr &= - \sum_{i=1}^mw_iC_{i0} + C_{00} + \lambda. }$$Note that, if the point $\un x_0$ coincides with one of the support points, $\un x_0= \un x_j$ say,
$$\sigma_E^2 = - \sum_{i=1}^mw_iC_{ij} + C_{jj} + \lambda = -(C_{jj} + \lambda) + C_{jj} + \lambda = 0.$$Recall that we had for the multivariate variogram
$$2\bf\Gamma(\un h)=2\bf\Sigma-\un C(\un h)-\un C(-\un h),$$where $$\un C(\un h) = \langle\un Z(\un x)\un Z(\un x+\un h)^\top\rangle$$
and
$$\un C(\un 0)=\bf\Sigma,\quad \un C(\un h) = \un C(-\un h)^\top.$$Now let us look at the spatial covariance of projections $$Y=\un a^\top\un Z$$ of the original data. These are given by
$$\eqalign{ \cov(\un a^\top\un Z(\un x),\un a^\top\un Z(\un x+\un h)) &=\un a^\top \langle\un Z(\un x) \un Z(\un x+\un h)^\top\rangle \un a\cr &= \un a^\top \un C(\un h) \un a\cr &= \un a^\top \un C(-\un h) \un a \quad\hbox{(quadratic form!)}\cr &= {1\over 2}\cdot\un a^\top (\un C(\un h)+\un C(-\un h)) \un a\cr &= {1\over 2}\cdot\un a^\top (2\bf\Sigma-2\bf\Gamma(\un h)) \un a\cr &= \un a^\top\bf\Sigma\un a - \un a^\top\bf\Gamma(\un h)\un a.}$$The *spatial autocorrelation* of the projections is therefore
$$\eqalign{ \corr(\un a^\top \un Z(\un x),\un a^\top \un Z(\un x+\un h)) &= {\un a^\top \bf\Sigma \un a-\un a^\top \bf\Gamma(\un h) \un a \over \sqrt{\var(\un a^\top \un Z(\un x))\var(\un a^\top \un Z(\un x+\un h))}}\cr &= {\un a^\top \bf\Sigma \un a - \un a^\top \bf\Gamma(\un h) \un a \over \sqrt{(\un a^\top \bf\Sigma \un a)(\un a^\top \bf\Sigma \un a)}}\cr &= 1- {\un a^\top \bf\Gamma(\un h) \un a\over \un a^\top \bf\Sigma \un a}.}$$The *maximum autocorrelation factor* (MAF) transformation determines that vector $\un a$ which maximizes the spatial autocorrelation. We obtain it by minimizing the *Rayleigh quotient*
$$R(\un a) = {\un a^\top \bf\Gamma(\un h) \un a\over \un a^\top \bf\Sigma \un a}.$$Setting the vector derivative equal to zero gives
$${\partial R\over \partial \un a}= {1\over \un a^\top \bf\Sigma \un a}\cdot 2\bf\Gamma(\un h) \un a -{\un a^\top \bf\Gamma(\un h) \un a\over(\un a^\top \bf\Sigma \un a)^2}\cdot 2\bf\Sigma \un a= \un 0$$or
$$(\un a^\top \bf\Sigma \un a)\bf\Gamma(\un h) \un a = (\un a^\top \bf\Gamma(\un h) \un a)\bf\Sigma \un a.$$This condition is met when $\un a$ solves the *generalized eigenvalue problem*
$$\bf\Gamma(\un h) \un a = \lambda\bf\Sigma \un a.$$We can solve this eigenvalue problem as follows. Since $\bf\Sigma$ is symmetric positive definite, there exists a lower triangular matrix $\un L$ such that
$$\bf\Sigma = \un L\un L^\top.$$So we can write
$$\bf\Gamma(\un h)\un a = \lambda\un L\un L^\top\un a.$$Equivalently,
$$\un L^{-1}\bf\Gamma(\un h) (\un L^\top)^{-1}\un L^\top\un a = \lambda \un L^\top \un a,$$or, with $\un b = \un L^\top\un a$,
$$[\un L^{-1}\bf\Gamma(\un h)(\un L^{-1})^\top]\un b = \lambda\un b,$$a standard eigenvalue problem for the symmetric matrix $\un L^{-1}\bf\Gamma(\un h)(\un L^{-1})^\top$.
Let the eigenvalues be ordered from smallest to largest, $\lambda_1\le\dots \le\lambda_N$, and the corresponding (orthonormal) eigenvectors be $\un b_i$. We have
$$\un b_i^\top \un b_j= \un a_i^\top \un L \un L^\top \un a_j =\un a_i^\top \bf\Sigma \un a_j=\delta_{ij}$$so that the MAF components $Y_i=\un a_i^\top\un Z$, $\ i=1\dots N$, are orthogonal (uncorrelated) with unit variance. Moreover,
$$\corr(\un a_i^\top \un Z(\un x),\un a_i^\top \un Z(\un x+\un h))=1-\lambda_i,\quad i=1\dots N,$$and the first MAF component has maximum spatial autocorrelation.
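The whole procedure can be sketched in Python/NumPy for a zero-mean series sampled along one axis (the function name and the default unit lag `h` are assumed choices; `numpy.linalg.eigh` solves the symmetric standard eigenvalue problem):

```python
import numpy as np

def maf(Z, h=1):
    """MAF transform of a zero-mean series Z (n samples x N variables).

    Solves Gamma(h) a = lambda Sigma a via the Cholesky factor of Sigma;
    returns ascending eigenvalues and the matrix A of column vectors a_i.
    """
    Z = np.asarray(Z, float)
    n = len(Z)
    Sigma = Z.T @ Z / n
    D = Z[:-h] - Z[h:]
    Gamma = D.T @ D / (2 * (n - h))    # sample semi-variogram matrix
    L = np.linalg.cholesky(Sigma)      # Sigma = L L^T
    Linv = np.linalg.inv(L)
    M = Linv @ Gamma @ Linv.T          # L^{-1} Gamma (L^{-1})^T, symmetric
    lam, B = np.linalg.eigh(M)         # lambda_1 <= ... <= lambda_N
    A = Linv.T @ B                     # a_i = (L^T)^{-1} b_i
    return lam, A
```

The MAF components are then `Y = Z @ A`; by construction `A.T @ Sigma @ A` is the identity, so the factors are uncorrelated with unit variance, as in the orthogonality relation above.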
Allan A. Nielsen (PhD Thesis 1994):
“The MAFs for irregularly spaced data possess the characteristic that they are orthogonal and they are ordered by decreasing autocorrelation. Typically, low order factors will contain a lot of signal, high order factors will contain a lot of noise. The MAFs thus relate beautifully to [...] interpolation by kriging. Because low-order MAFs contain signal they are expected to have low nugget effects and long ranges of influence. [...] As the factors are orthogonal there is no cross-correlation for small lags ($\un h$). I therefore suggest [...] kriging based on the MAF analysis of [...] multivariate data. In order to obtain kriged versions of the original data the inverse MAF transformation can be applied.”