21.4.4.3. Kriging Models

Kriging is a geostatistical method of spatial data interpolation. The mathematical model of kriging is named after D.G. Krige, who first introduced a version of this spatial prediction process. Kriging has been extensively described in the literature since Sacks et al. proposed the application of kriging in computer experiments.

Unlike physical tests, computer analysis codes are deterministic, so their outputs are not affected by measurement error. Hence, the approximate model can be defined as the combination of a regression model and a departure term:

\(y=\mathbf{X\beta }+z\left( \mathbf{x} \right)\)

where \(y\) is the approximate model, \(\mathbf{X\beta }\) is a polynomial regression model, and \(z\left( \mathbf{x} \right)\) is a Gaussian random process with zero mean and variance \({{\sigma }^{2}}\), that is, \(N\left( 0,{{\sigma }^{2}} \right)\). The regression model \(\left( \mathbf{X\beta } \right)\) approximates the design space globally, while the departure term \(z\left( \mathbf{x} \right)\) represents localized deviations, so that the kriging model interpolates the \({{n}_{s}}\) sampled points.

The covariance matrix of \(z\left( \mathbf{x} \right)\) is given by

\(Cov\left[ z\left( {{\mathbf{x}}_{i}} \right),z\left( {{\mathbf{x}}_{j}} \right) \right]={{\sigma }^{2}}\mathbf{R}\left[ R\left( {{\mathbf{x}}_{i}},{{\mathbf{x}}_{j}} \right) \right]\)

where \(\mathbf{R}\) is the correlation matrix and \(R\left( {{\mathbf{x}}_{i}},{{\mathbf{x}}_{j}} \right)\) is the correlation function between any two of the \({{n}_{s}}\) sampled points. Hence, \(\mathbf{R}\) is an \({{n}_{s}}\times {{n}_{s}}\) symmetric matrix with ones on the diagonal. Many correlation functions \(R\left( {{\mathbf{x}}_{i}},{{\mathbf{x}}_{j}} \right)\) are available. Among them, the Gaussian type is widely used:

\(R\left( {{\mathbf{x}}_{i}},{{\mathbf{x}}_{j}} \right)=\exp \left[ -\sum\limits_{l=1}^{k}{{{\theta }_{l}}{{\left| \mathbf{x}_{i}^{l}-\mathbf{x}_{j}^{l} \right|}^{2}}} \right]\)
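As an illustration, the Gaussian correlation matrix can be assembled with NumPy roughly as follows. This is a minimal sketch, not a reference implementation: the function name `gaussian_corr_matrix` and the demo data are hypothetical, and the exponent is taken with a negative sign so that the correlation decays with distance and equals one on the diagonal.

```python
import numpy as np


def gaussian_corr_matrix(X, theta):
    """Gaussian correlation matrix R for sample points X of shape (n_s, k)."""
    X = np.asarray(X, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # Pairwise coordinate differences, shape (n_s, n_s, k).
    diff = X[:, None, :] - X[None, :, :]
    # Weighted squared distances, summed over the k design variables.
    d2 = np.sum(theta * diff**2, axis=-1)
    return np.exp(-d2)


# Hypothetical data: four sample points in two design variables.
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
R_demo = gaussian_corr_matrix(X_demo, theta=[1.0, 2.0])
```

A larger \({{\theta }_{l}}\) makes the correlation fall off faster along design variable \(l\), so points that differ in that variable influence each other less.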

where \({{\theta }_{l}}\) are the unknown correlation parameters used to fit the model. The estimate \(\tilde{y}\left( \mathbf{x} \right)\) of the response \(y\left( \mathbf{x} \right)\) at an untried point \(\mathbf{x}\) is given by

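\(\tilde{y}\left( \mathbf{x} \right)=\mathbf{X}\left( \mathbf{x} \right)\mathbf{\tilde{\beta }}+{{\mathbf{r}}^{T}}\left( \mathbf{x} \right){{\mathbf{R}}^{-1}}\left( \mathbf{y}-\mathbf{X\tilde{\beta }} \right)\)

Here, \(\mathbf{y}\) is the column vector of the \({{n}_{s}}\) sampled responses, \(\mathbf{X}\) is the regression matrix evaluated at the sampled points, and \(\mathbf{X}\left( \mathbf{x} \right)\) is the row of regression terms evaluated at the untried point.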

The correlation vector between \(\mathbf{x}\) and the sampled points \(\left\{ {{\mathbf{x}}_{1}},{{\mathbf{x}}_{2}},...,{{\mathbf{x}}_{{{n}_{s}}}} \right\}\) is given by:

\(\mathbf{r}\left( \mathbf{x} \right)={{\left[ R\left( \mathbf{x},{{\mathbf{x}}_{1}} \right),R\left( \mathbf{x},{{\mathbf{x}}_{2}} \right),...,R\left( \mathbf{x},{{\mathbf{x}}_{{{n}_{s}}}} \right) \right]}^{T}}\)

In this estimate, the unknown coefficients of the regression model are determined as

\(\mathbf{\tilde{\beta }}={{\left( {{\mathbf{X}}^{T}}{{\mathbf{R}}^{-1}}\mathbf{X} \right)}^{-1}}\left\{ {{\mathbf{X}}^{T}}{{\mathbf{R}}^{-1}}\mathbf{y} \right\}\)
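The estimator and the regression coefficients can be sketched together in NumPy. The example below assumes a constant regression model (a single column of ones for \(\mathbf{X}\)), a fixed and arbitrarily chosen \(\mathbf{\theta }\), and a Gaussian correlation function with a negative exponent. The names `corr`, `fit_beta`, and `predict`, and the test function, are hypothetical and used only for illustration.

```python
import numpy as np


def corr(A, B, theta):
    """Gaussian correlation between rows of A (m x k) and rows of B (n x k)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(theta * diff**2, axis=-1))


def fit_beta(Xs, y, theta):
    """Generalized least-squares estimate of beta (constant regression model)."""
    n_s = Xs.shape[0]
    F = np.ones((n_s, 1))             # regression matrix X: one constant column
    R = corr(Xs, Xs, theta)
    Ri_F = np.linalg.solve(R, F)      # R^{-1} X without forming the inverse
    Ri_y = np.linalg.solve(R, y)      # R^{-1} y
    beta = np.linalg.solve(F.T @ Ri_F, F.T @ Ri_y)
    return beta, F, R


def predict(x_new, Xs, y, theta):
    """Kriging estimate at one or more untried points x_new."""
    beta, F, R = fit_beta(Xs, y, theta)
    r = corr(np.atleast_2d(x_new), Xs, theta)    # each row is r(x)^T
    f_new = np.ones((r.shape[0], 1))             # X(x) for the constant model
    return f_new @ beta + r @ np.linalg.solve(R, y - F @ beta)


# Hypothetical one-variable example with five sampled points.
Xs_demo = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
y_demo = np.sin(2.0 * np.pi * Xs_demo).ravel()
theta_demo = np.array([5.0])

y_at_samples = predict(Xs_demo, Xs_demo, y_demo, theta_demo)
y_new = predict(np.array([0.6]), Xs_demo, y_demo, theta_demo)
```

Evaluating the estimator at the sampled points reproduces the sampled responses to within round-off, which is the interpolation property of the kriging model. Linear solves are used in place of an explicit \({{\mathbf{R}}^{-1}}\) because they are generally better conditioned.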

To determine the unknown correlation parameters \({{\theta }_{l}}\), the estimate of the process variance \({{\tilde{\sigma }}^{2}}\) (not the variance of the observed data) is introduced:

\({{\tilde{\sigma }}^{2}}=\frac{1}{{{n}_{s}}}{{\left( \mathbf{y}-\mathbf{X\tilde{\beta }} \right)}^{T}}{{\mathbf{R}}^{-1}}\left( \mathbf{y}-\mathbf{X\tilde{\beta }} \right)\)

Hence, the correlation parameters \({{\theta }_{l}}\) are determined by solving

\(\underset{\mathbf{\theta }>0}{\mathop{\min }}\,{{\left( \det \mathbf{R}\left( \mathbf{\theta } \right) \right)}^{{1}/{{{n}_{s}}}\;}}{{\tilde{\sigma }}^{2}}\left( \mathbf{\theta } \right)\) or, equivalently, \(\underset{\mathbf{\theta }>0}{\mathop{\max }}\,-\frac{1}{2}\left[ {{n}_{s}}\ln \left( {{\tilde{\sigma }}^{2}}\left( \mathbf{\theta } \right) \right)+\ln \left( \det \mathbf{R}\left( \mathbf{\theta } \right) \right) \right]\)

While any admissible value of \(\mathbf{\theta }\) yields an interpolating model, the best kriging model is found by solving the k-dimensional nonlinear optimization problem given above, in which the only restriction on \(\mathbf{\theta }\) is positivity.
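The search for \(\mathbf{\theta }\) can be sketched as follows. This is a minimal illustration under stated assumptions, not the solver used by any particular tool: it assumes a constant regression model, a single correlation parameter shared by all design variables, and a coarse grid search over \({{\log }_{10}}\theta\) in place of a gradient-based optimizer. Searching in log space keeps \(\theta\) positive automatically. The function name, the sample data, and the search range (chosen so that \(\mathbf{R}\) stays well-conditioned for this data) are hypothetical.

```python
import numpy as np


def neg_log_likelihood(log10_theta, Xs, y):
    """Return 0.5 * [n_s ln(sigma^2) + ln det R], the quantity to minimize."""
    theta = 10.0 ** np.atleast_1d(log10_theta)   # theta > 0 by construction
    n_s = Xs.shape[0]
    diff = Xs[:, None, :] - Xs[None, :, :]
    R = np.exp(-np.sum(theta * diff**2, axis=-1))
    F = np.ones((n_s, 1))                        # constant regression model
    beta = np.linalg.solve(F.T @ np.linalg.solve(R, F),
                           F.T @ np.linalg.solve(R, y))
    resid = y - F @ beta
    sigma2 = resid @ np.linalg.solve(R, resid) / n_s
    # slogdet avoids overflow/underflow that log(det(R)) can suffer from.
    _, logdet = np.linalg.slogdet(R)
    return 0.5 * (n_s * np.log(sigma2) + logdet)


# Hypothetical one-variable example with five sampled points.
Xs_demo = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
y_demo = np.sin(2.0 * np.pi * Xs_demo).ravel() + Xs_demo.ravel()

# Coarse grid over log10(theta), i.e. theta between 1 and 1000.
candidates = np.linspace(0.0, 3.0, 31)
scores = [neg_log_likelihood(c, Xs_demo, y_demo) for c in candidates]
best_theta = 10.0 ** candidates[int(np.argmin(scores))]
```

In practice the grid result would typically serve only as a starting point for a proper optimizer, and each design variable would usually get its own \({{\theta }_{l}}\).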

References

  1. Matheron G. Principles of geostatistics. Economic Geology 1963; 58:1246-1266.

  2. Sacks J, Welch WJ, Mitchell TJ, Wynn HP. Design and analysis of computer experiments. Statistical Science 1989; 4:409-435.

  3. Sacks J, Schiller SB, Welch WJ. Designs for computer experiments. Technometrics 1989; 31:41-47.