Variogram - Geostatistics

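This page presents some concepts about variograms, as part of the SPRING Geostatistics module. The topics presented here are: what a variogram is, the semivariogram parameters, computing the semivariogram from regularly and irregularly spaced samples, theoretical models, nested models, and anisotropy.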


VARIOGRAM - What is it?

The variogram is a basic supporting tool for kriging techniques. It makes it possible to represent quantitatively the variation of a regionalized phenomenon in space (Huijbregts, 1975).

Consider two regionalized variables, X and Y, where X = Z(x) and Y = Z(x+h). Both refer to the same attribute (for instance, the amount of zinc in the soil) measured at two different positions, as shown in the Figure below.

Fig. - Two dimensional sample.

Here x denotes a position in two dimensions, with components (x_i, y_i), and h is the distance vector (magnitude and direction) connecting the two points.

The level of dependency between these two regionalized variables, X and Y, is represented by the variogram, 2γ(h), which is defined as the mathematical expectation of the squared difference between the values at two points in space separated by the distance vector h, that is,

$$2\gamma(h) = E\{[Z(x) - Z(x+h)]^2\} = \mathrm{Var}[Z(x) - Z(x+h)]. \qquad (13)$$

Given a sample z(x_i), i = 1, 2, ..., n, the variogram can be estimated by

$$2\hat{\gamma}(h) = \frac{1}{N(h)} \sum_{i=1}^{N(h)} [z(x_i) - z(x_i+h)]^2, \qquad (14)$$

where:

  • 2γ̂(h) - is the estimated variogram;
  • N(h) - is the number of pairs of measured values z(x_i) and z(x_i+h), separated by a distance vector h;
  • z(x_i) and z(x_i+h) - are the values of the i-th pair of observations of the regionalized variable, collected at the points x_i and x_i+h (i = 1, ..., N(h)), separated by the vector h.


Many authors define the variogram differently from Equation (13), using what is usually referred to as the semivariogram, which is given by:

$$\gamma(h) = \frac{1}{2} E\{[Z(x) - Z(x+h)]^2\}. \qquad (15)$$

Similarly, the semivariogram function can be estimated by:

$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [z(x_i) - z(x_i+h)]^2, \qquad (16)$$

where N(h), z(x_i) and z(x_i+h) are as defined above.


SEMIVARIOGRAM PARAMETERS

The Figure below shows an experimental semivariogram with characteristics close to the ideal case. Its pattern matches what is intuitively expected from field data: the differences {Z(x_i) - Z(x_i + h)} decrease as the separation distance h decreases. Observations that are geographically closer are expected to behave more similarly than observations separated by larger distances. Hence γ(h) is expected to increase with the distance h.

Semivariogram example.

The semivariogram parameters can be observed directly from the Figure:

  • Range (a): the distance within which the samples are spatially correlated. In the Figure above, the range is around 25 m.
  • Platform (C): the semivariogram value corresponding to the range (a); it is usually called the sill in the English-language literature. Beyond this point it is considered that there is no spatial dependency among the samples, because the variance of the difference between pairs of samples (Var[Z(x) - Z(x+h)]) becomes invariant with distance.
  • Nugget Effect (C0): by definition, γ(0) = 0 (see Equation 15). In practice, however, as h approaches 0 (zero), γ(h) approaches a positive value called the Nugget Effect (C0). The value of C0 reveals the discontinuity of the semivariogram for distances smaller than the smallest distance between samples. Part of this discontinuity can also be due to measurement errors (Isaaks and Srivastava, 1989), but it is impossible to quantify whether the largest contribution comes from measurement errors or from small-scale variability not captured by the sampling.
  • Contribution (C1): the difference between the platform (C) and the Nugget Effect (C0).




COMPUTING THE SEMIVARIOGRAM FROM REGULARLY SPACED SAMPLES

Consider a set of regularly spaced samples in two dimensions, as presented in the Figure below.

Fig. - Regularly spaced samples in two dimensions.

To determine the experimental semivariogram in a given direction, for instance 90°, the computation is repeated for all intervals of h. Suppose the distance between two consecutive points is 100 meters (d = 100 m). Then every pair of observations in the 90° direction separated by 100 m is included in the computation. Once this is done, the computation is repeated for the next distance, for instance 200 m, which includes all pairs of observations separated by 200 m. The process is repeated until the desired stopping point is reached. The Figure below illustrates this procedure, which must also be repeated for the other directions (0°, 45° and 135°).

Fig. - Illustration for the semivariogram computation from regularly spaced samples.
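The directional procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the samples are stored as a list of rows, rows run from north to south and columns from west to east, and directions are azimuths measured clockwise from North. The names `STEPS` and `directional_semivariogram` and the grid values are hypothetical.

```python
# Grid offsets (row step, column step) for one lag step in each direction.
# Assumed convention: azimuth clockwise from North, row index grows southward.
STEPS = {0: (-1, 0), 45: (-1, 1), 90: (0, 1), 135: (1, 1)}


def directional_semivariogram(grid, azimuth, n_steps):
    """Return (estimate, number of pairs) for pairs separated by n_steps
    grid steps in the given direction."""
    d_row, d_col = STEPS[azimuth]
    d_row, d_col = d_row * n_steps, d_col * n_steps
    n_rows, n_cols = len(grid), len(grid[0])
    total, count = 0.0, 0
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                total += (grid[r][c] - grid[r2][c2]) ** 2
                count += 1
    if count == 0:
        return None, 0
    return total / (2.0 * count), count


# Hypothetical 3 x 3 grid with a spacing of d = 100 m between neighbours.
grid = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]
for steps in (1, 2):
    print(steps * 100, "m:", directional_semivariogram(grid, 90, steps))
```

With d = 100 m, one step in the 0° or 90° direction corresponds to 100 m, while one step in the 45° or 135° direction corresponds to 100√2 m, so diagonal lags should be plotted at their true distances.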


COMPUTING THE SEMIVARIOGRAM FROM IRREGULARLY SPACED SAMPLES

Consider the set of irregularly spaced samples, in two dimensions, as presented in the Figure below. In this case, to determine the experimental semivariogram, it is required to introduce tolerance limits for direction and distance.

Fig. - Parameters for the semivariogram computation from irregularly spaced samples in two dimensions.
SOURCE: Modified from Deutsch and Journel (1992), p. 45.

Take Lag2 in the Figure above as a reference (a lag is a pre-defined distance used in the semivariogram computation). Suppose a lag increment of 100 m with a tolerance of 50 m, and a measured direction of 45° with an angular tolerance of 22.5°. Then any pair of observations whose separation distance lies between 150 m and 250 m and whose direction lies between 22.5° and 67.5° is included in the semivariogram computation for Lag2. This process is repeated for all other lags.
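The pair selection for Lag2 can be sketched in Python as follows. The sketch assumes samples given as (east, north, value) tuples and azimuths measured clockwise from North. It ignores the bandwidth restriction, and all names and sample values are hypothetical.

```python
import math


def pair_in_lag(p, q, lag_center, lag_tol, azimuth, ang_tol):
    """True if the pair (p, q) falls inside the distance and direction windows."""
    dx, dy = q[0] - p[0], q[1] - p[1]  # east and north components
    dist = math.hypot(dx, dy)
    if dist == 0 or abs(dist - lag_center) > lag_tol:
        return False
    # Azimuth clockwise from North, folded to [0, 180) because a pair
    # has no orientation: 45 degrees and 225 degrees are the same direction.
    angle = math.degrees(math.atan2(dx, dy)) % 180.0
    diff = abs(angle - azimuth % 180.0)
    diff = min(diff, 180.0 - diff)
    return diff <= ang_tol


def lag_semivariogram(samples, lag_center, lag_tol, azimuth, ang_tol):
    """Return (estimate, number of pairs) for one lag and one direction."""
    total, count = 0.0, 0
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            x1, y1, z1 = samples[i]
            x2, y2, z2 = samples[j]
            if pair_in_lag((x1, y1), (x2, y2), lag_center, lag_tol, azimuth, ang_tol):
                total += (z1 - z2) ** 2
                count += 1
    if count == 0:
        return None, 0
    return total / (2.0 * count), count


# Hypothetical irregularly spaced samples: (east in m, north in m, value).
samples = [
    (0.0, 0.0, 10.0),
    (140.0, 140.0, 14.0),
    (200.0, 0.0, 7.0),
    (50.0, 100.0, 12.0),
    (160.0, 120.0, 11.0),
]
# Lag2: 200 m +/- 50 m, direction 45 degrees +/- 22.5 degrees.
gamma_lag2, n_pairs = lag_semivariogram(samples, 200.0, 50.0, 45.0, 22.5)
print(gamma_lag2, n_pairs)
```

In this example only two pairs satisfy both windows, both involving the first sample, so the estimate is (16 + 1) / (2 x 2) = 4.25.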

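Still referring to the Figure above, the bandwidth (BW) is a limiting value that restricts the number of observation pairs used in the semivariogram computation: it caps how far, perpendicular to the direction axis, a pair may deviate as the angular tolerance cone widens with distance.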

The next step is to fit a theoretical model to the experimental semivariogram, as described below.



THEORETICAL MODELS

The graph of the experimental semivariogram, γ̂(h), computed using Equation (16), consists of a series of values, as shown in the Figure above, to which a function must be fitted. It is important that the fitted model represents the trend of γ̂(h) with respect to h. In this way, the estimates obtained from kriging will be more accurate and thus more reliable.

The fitting procedure is not direct and automatic, as it is, for instance, in regression. It is interactive: the interpreter makes a first fit and checks how well the theoretical model matches the experimental semivariogram. Depending on the result, the interpreter may redefine the model, repeating the process until the fit is considered satisfactory.

The models presented here are considered basic models, named isotropic models by Isaaks and Srivastava (1989). They are divided into two types: models with a platform and models without a platform. Models with a platform are referred to in geostatistics as transitive models. Some transitive models reach the platform (C) asymptotically. For these models, the range (a) is arbitrarily defined as the distance corresponding to 95% of the platform. Models without a platform are used to model phenomena that have infinite dispersion capacity.


NUGGET EFFECT MODEL

As discussed in Semivariogram Parameters, many experimental semivariograms present a discontinuity at the origin. When |h| = 0, the semivariogram value is strictly zero. However, as |h| approaches zero, the semivariogram value can be significantly greater than zero; that is, there is a discontinuity at the origin. This discontinuity is modeled using the nugget effect model, defined as:

$$\gamma_0(|h|) = \begin{cases} 0, & |h| = 0 \\ 1, & |h| > 0 \end{cases} \qquad (17)$$

In the geostatistics literature, the nugget effect is not classified as a basic model; it appears as a constant (C0) in the semivariogram equation, with the understanding that this term vanishes when |h| = 0. Strictly, the notation for the nugget effect is C0γ0(|h|), where C0 is the size of the discontinuity at the origin and γ0(|h|) is the normalized nugget effect model presented in Equation 17. This notation is consistent with the presentation of the basic models described here and is convenient when compound models are used.

The most used transitive models are: spherical model (Sph), exponential model (Exp) and gaussian model (Gau). These models are presented in the Figure below with the same range (a).

Fig. - Graphical representation for the normalized transitive models.
SOURCE: Modified from Isaaks and Srivastava (1989), p. 374.


SPHERICAL MODEL

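The spherical model is one of the most used models and is shown in red in the Figure above. The normalized equation of this model is:

$$\mathrm{Sph}(|h|) = \begin{cases} 0, & |h| = 0 \\ 1.5\left(\frac{|h|}{a}\right) - 0.5\left(\frac{|h|}{a}\right)^3, & 0 < |h| \le a \\ 1, & |h| > a \end{cases} \qquad (18)$$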
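As a minimal sketch, the normalized spherical model can be written in Python as below. It assumes the usual form of the model, 1.5(|h|/a) - 0.5(|h|/a)³ up to the range and 1 beyond it; the function name is hypothetical.

```python
def spherical(h, a):
    """Normalized spherical model with range a."""
    h = abs(h)
    if h == 0:
        return 0.0
    if h >= a:
        return 1.0  # the platform is reached exactly at the range
    r = h / a
    return 1.5 * r - 0.5 * r ** 3


# Using the range of about 25 m from the example semivariogram.
for h in (0.0, 12.5, 25.0, 40.0):
    print(h, spherical(h, 25.0))
```

At half the range the model has already reached 0.6875 of the platform, which reflects its nearly linear behavior near the origin.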


EXPONENTIAL MODEL

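Another model often used is the exponential model, which is presented in blue in the Figure above. The normalized equation of this model is:

$$\mathrm{Exp}(|h|) = \begin{cases} 0, & |h| = 0 \\ 1 - \exp\left(-\frac{3|h|}{a}\right), & |h| \ne 0 \end{cases} \qquad (19)$$

This model reaches the platform asymptotically, so a is a practical range, defined as the distance at which the model value is 95% of the platform (Isaaks and Srivastava, 1989). The factor 3 in the exponent expresses this definition, since 1 - exp(-3) ≈ 0.95.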
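The practical range can be checked numerically with a short Python sketch. It assumes the parameterization 1 - exp(-3|h|/a), which is the form consistent with a practical range at 95% of the platform; the function name is hypothetical.

```python
import math


def exponential(h, a):
    """Normalized exponential model with practical range a (assumed form)."""
    h = abs(h)
    if h == 0:
        return 0.0
    return 1.0 - math.exp(-3.0 * h / a)


a = 25.0
print(exponential(a, a))      # about 0.95: the practical range
print(exponential(4 * a, a))  # very close to 1, but never exactly 1
```

Unlike the spherical model, the value at |h| = a is about 0.95 rather than 1, and the platform is only approached as |h| grows.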


GAUSSIAN MODEL

The Gaussian model is a transitive model that is often used to model extremely continuous phenomena (Isaaks and Srivastava, 1989). Its formulation is given by:

$$\mathrm{Gau}(|h|) = \begin{cases} 0, & |h| = 0 \\ 1 - \exp\left(-3\left(\frac{|h|}{a}\right)^2\right), & |h| \ne 0 \end{cases} \qquad (20)$$

Like the exponential model, the Gaussian model reaches the platform asymptotically, and the parameter a is a practical range, defined as the distance at which the model value is 95% of the platform (Isaaks and Srivastava, 1989). What characterizes this model is its parabolic behavior near the origin, as represented in the Figure above by the solid green line.
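The parabolic behavior near the origin can be checked with a short Python sketch. It assumes the parameterization 1 - exp(-3(|h|/a)²), which is the form consistent with a practical range at 95% of the platform; the function name is hypothetical.

```python
import math


def gaussian(h, a):
    """Normalized Gaussian model with practical range a (assumed form)."""
    h = abs(h)
    if h == 0:
        return 0.0
    return 1.0 - math.exp(-3.0 * (h / a) ** 2)


a = 25.0
print(gaussian(a, a))  # about 0.95: the practical range

# Near the origin the model grows like 3 (h / a)^2, so doubling a small
# distance multiplies the model value by roughly four.
ratio = gaussian(0.2, a) / gaussian(0.1, a)
print(ratio)
```

A ratio close to 4 when a small distance is doubled is the signature of a parabola, in contrast with the spherical and exponential models, for which the same ratio is close to 2.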


POWER MODEL

The power model is not a transitive model, so it does not reach a platform. In general, this type of model is used to model phenomena with infinite dispersion capacity. The Figure below shows the power model, which is expressed as:

$$\mathrm{Pow}(|h|) = \begin{cases} 0, & |h| = 0 \\ c\,|h|^{e}, & |h| \ne 0 \end{cases} \qquad (21)$$

where,

  • c is the slope coefficient, and
  • e is the exponent.


Fig. - Graphical representation of the power model.
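A minimal Python sketch of the power model follows, assuming the form c|h|^e with slope coefficient c and exponent e. The function name and the parameter values are hypothetical.

```python
def power(h, c, e):
    """Power model c * |h|**e; it grows without bound, so there is no platform."""
    h = abs(h)
    if h == 0:
        return 0.0
    return c * h ** e


# Hypothetical parameters: c = 0.5, e = 1.5.
for h in (0.0, 1.0, 4.0, 16.0):
    print(h, power(h, 0.5, 1.5))
```

With e = 1 the model reduces to a straight line of slope c, which is why it is sometimes used as a linear model.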

Up to this point the main normalized basic models, which are used to fit the experimental semivariogram, have been presented. In practice, experimental semivariograms have nugget effect values (C0) greater than zero and platform values (C) greater than one, as illustrated in the Figure below.

Fig. - Graphical representation of the experimental semivariogram and theoretical models.

In summary, the semivariograms of the transitive basic models are defined as:


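  • Semivariogram Spherical Model:

$$\gamma(h) = \begin{cases} 0, & |h| = 0 \\ C_0 + C_1\left[1.5\left(\frac{|h|}{a}\right) - 0.5\left(\frac{|h|}{a}\right)^3\right], & 0 < |h| \le a \\ C_0 + C_1, & |h| > a \end{cases} \qquad (22)$$

  • Semivariogram Exponential Model:

$$\gamma(h) = \begin{cases} 0, & |h| = 0 \\ C_0 + C_1\left[1 - \exp\left(-\frac{3|h|}{a}\right)\right], & |h| \ne 0 \end{cases} \qquad (23)$$

  • Semivariogram Gaussian Model:

$$\gamma(h) = \begin{cases} 0, & |h| = 0 \\ C_0 + C_1\left[1 - \exp\left(-3\left(\frac{|h|}{a}\right)^2\right)\right], & |h| \ne 0 \end{cases} \qquad (24)$$

In a similar way, the power model is written as a semivariogram as presented below:

  • Semivariogram Power Model:

$$\gamma(h) = \begin{cases} 0, & |h| = 0 \\ C_0 + c\,|h|^{e}, & |h| \ne 0 \end{cases} \qquad (25)$$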
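The way the nugget effect (C0) and the contribution (C1) scale a normalized model can be sketched in Python as follows. The sketch assumes the normalized forms with a practical range at 95% of the platform for the exponential and Gaussian models. All function names and parameter values are hypothetical.

```python
import math


def sph(h, a):
    """Normalized spherical model."""
    r = min(abs(h) / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3


def exp_model(h, a):
    """Normalized exponential model (assumed practical-range form)."""
    return 1.0 - math.exp(-3.0 * abs(h) / a)


def gau(h, a):
    """Normalized Gaussian model (assumed practical-range form)."""
    return 1.0 - math.exp(-3.0 * (abs(h) / a) ** 2)


def semivariogram(h, c0, c1, a, model):
    """Nugget effect c0 plus contribution c1 times a normalized model."""
    if h == 0:
        return 0.0  # by definition the semivariogram is zero at the origin
    return c0 + c1 * model(h, a)


# Hypothetical parameters: nugget 2, contribution 8 (platform 10), range 25 m.
C0, C1, A = 2.0, 8.0, 25.0
at_half_range = semivariogram(12.5, C0, C1, A, sph)
beyond_range = semivariogram(40.0, C0, C1, A, sph)
near_origin = semivariogram(1e-9, C0, C1, A, sph)
print(at_half_range, beyond_range, near_origin)
```

The value just off the origin is essentially C0, while the value beyond the range is the platform C = C0 + C1, which matches the parameter definitions given earlier.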



NESTED MODELS

Certain phenomena require more complex semivariogram models to explain their spatial variation. These models are combinations of simple models and are named nested models. McBratney and Webster (1986) noticed that nested models are required to explain soil variation deriving from independent formation factors. For instance, a nested model useful for mineral studies and soil research is the double spherical model. McBratney et al. (1982) used it to describe copper and cobalt variations in the soil. This model is defined as:

$$\gamma(h) = C_0 + \gamma_1(h) + \gamma_2(h), \quad |h| > 0, \qquad \gamma_k(h) = \begin{cases} C_k\left[1.5\left(\frac{|h|}{a_k}\right) - 0.5\left(\frac{|h|}{a_k}\right)^3\right], & 0 < |h| \le a_k \\ C_k, & |h| > a_k \end{cases} \quad k = 1, 2, \qquad (26)$$

with γ(h) = 0 when |h| = 0, and where,

  • a1 and C1 are the range and the contribution, respectively, of the first spherical model (γ1(h)), and
  • a2 and C2 are the range and the contribution, respectively, of the second spherical model (γ2(h)).


This model is shown in the Figure below, where the solid lines represent the theoretical models fitted to the experimental semivariogram.


Fig. - Graphical representation of the double spherical model.
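A double spherical model can be sketched in Python as the nugget effect plus the sum of two spherical structures, each with its own range and contribution. The function names and the parameter values are hypothetical.

```python
def sph(h, a):
    """Normalized spherical model with range a."""
    r = min(abs(h) / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3


def double_spherical(h, c0, c1, a1, c2, a2):
    """Nugget c0 plus two nested spherical structures (c1, a1) and (c2, a2)."""
    if h == 0:
        return 0.0
    return c0 + c1 * sph(h, a1) + c2 * sph(h, a2)


# Hypothetical parameters: a short-range and a long-range structure.
params = dict(c0=1.0, c1=3.0, a1=10.0, c2=6.0, a2=40.0)
for h in (0.0, 10.0, 40.0, 50.0):
    print(h, double_spherical(h, **params))
```

At h = a1 the first structure has contributed all of C1 while the second is still rising, and beyond a2 the model stays at the total platform C0 + C1 + C2.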


Depending on the studied phenomenon, other nested models might be required to characterize the spatial variability. For instance: double exponential, exponential with a double spherical, linear with double spherical, etc.



ANISOTROPY

Anisotropy can be detected by comparing the semivariograms obtained for different directions. The directional conventions used in geostatistics are presented in the Figure below.

Fig. - Directional conventions used in geostatistics.

Consider the semivariograms obtained for the 0°, 45°, 90° and 135° directions, shown in the Figure below. They are very similar to one another. This is a simple and less frequent case, in which the spatial distribution of the phenomenon is named isotropic. In this case, a single model is enough to describe the spatial variability of the phenomenon being studied.

Fig. - Graphical representation of isotropic semivariogram.

On the other hand, if the semivariograms are not equal in all directions, the distribution is named anisotropic. If the anisotropy shows up as the same Platform (C) with different Ranges (a) of the same model, it is named Geometric.

Consider the semivariograms illustrated in the Figure below. The points connected by dotted lines are the experimental semivariograms in two orthogonal directions. The semivariogram that reaches the platform first (blue) corresponds to the 120° direction, and the semivariogram with the largest range (red) corresponds to the 30° direction. The solid lines in both directions are the theoretical models fitted to the experimental semivariograms.

Fig. - Graphical representation of the geometrical anisotropy.

A direct way to visualize and compute the geometric anisotropy parameters (factor and angle) is to sketch an ellipse from the ranges obtained in distinct directions, as shown in the Figure below. The conventions that follow were adopted by Deutsch and Journel (1992). The major axis of the ellipse, named the maximum continuity direction, is assigned the larger range (a1). The angle of the maximum continuity direction is measured clockwise from North; its value corresponds to the direction of the largest range. The minor axis defines the range (a2) in the minimum continuity direction, which is orthogonal to the main direction.

Fig. - Graphical representation of the geometrical anisotropy in two dimensions.
SOURCE: Modified from Deutsch and Journel (1992), p. 24.

The geometrical anisotropy factor is defined as the ratio between the range in the minimum continuity direction (a2) and the range in the maximum continuity direction (a1). Defined this way, the geometrical anisotropy factor is always smaller than one, and the anisotropy angle equals the angle of the maximum continuity direction.
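The following Python sketch shows one common way to use the anisotropy factor and angle: the separation vector is rotated onto the ellipse axes and its minor-axis component is stretched by the inverse of the factor, which gives an equivalent distance that can be evaluated with the isotropic model of range a1. The ranges (a1 = 50 m at 30°, a2 = 25 m at 120°) and all function names are hypothetical, and this is not necessarily how SPRING implements the correction.

```python
import math


def anisotropy_factor(a_major, a_minor):
    """Ratio between the minimum and maximum continuity ranges."""
    return a_minor / a_major


def offset(distance, azimuth_deg):
    """East and north components of a separation with the given azimuth,
    measured clockwise from North."""
    t = math.radians(azimuth_deg)
    return distance * math.sin(t), distance * math.cos(t)


def reduced_distance(dx, dy, azimuth_deg, factor):
    """Equivalent isotropic distance for a separation (dx east, dy north)."""
    t = math.radians(azimuth_deg)
    along = dx * math.sin(t) + dy * math.cos(t)   # maximum continuity axis
    across = dx * math.cos(t) - dy * math.sin(t)  # minimum continuity axis
    return math.hypot(along, across / factor)


# Hypothetical ranges: 50 m in the 30 degree direction, 25 m at 120 degrees.
factor = anisotropy_factor(50.0, 25.0)
dx, dy = offset(25.0, 120.0)
h_minor = reduced_distance(dx, dy, 30.0, factor)
dx, dy = offset(50.0, 30.0)
h_major = reduced_distance(dx, dy, 30.0, factor)
print(factor, h_minor, h_major)
```

Both separations map to an equivalent distance of 50 m, so a single model with range a1 = 50 m reaches the platform at 25 m in the 120° direction and at 50 m in the 30° direction.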

There is another type of anisotropy in which the semivariograms present the same Range (a) and different Platforms (C). In this case, the anisotropy is called Zonal. Like isotropy, zonal anisotropy is infrequent in natural phenomena. The most common case is a combination of zonal and geometrical anisotropy, named combined anisotropy.

Consider the semivariograms presented in the Figure below. The points connected by dotted lines correspond to the experimental semivariograms in two orthogonal directions. The semivariogram with the largest platform (blue) corresponds to the 60° direction, and the semivariogram with the smallest platform (red) corresponds to the orthogonal direction, that is, 150°. The models fitted to the semivariograms are represented by solid lines.

Fig. - Graphical representation of the combined anisotropy.

According to Isaaks and Srivastava (1989), cited by Deutsch and Journel (1992, p. 25), zonal anisotropy can be considered a particular case of geometrical anisotropy in which a very large anisotropy factor is used. Under these conditions, the implicit range in the minimum continuity direction is very large, and the semivariogram structure is added only in the maximum continuity direction.



