Geostatistics

This page presents the main concepts needed to understand and use the geostatistics module in SPRING. The topics presented are:
Exploratory Analysis - Geostatistics
REGIONALIZED VARIABLES

The spatial variability of soil characteristics has concerned researchers since the beginning of the twentieth century. Smith (1910) studied the layout of field plots in corn yield experiments, attempting to eliminate the effects of soil variation. Montgomery (1913), concerned with the effect of nitrogen on wheat yield, set up an experiment with 224 plots and measured the grain yield. Several other authors, such as Waynick and Sharp (1919), also studied variations of nitrogen and carbon in the soil.

These procedures were based on classical statistics and used large amounts of sample data to characterize or describe the spatial distribution of the characteristic under study. Classical statistics here means the approach that uses parameters such as the mean and the standard deviation to represent a phenomenon, and that rests on the main hypothesis that variations from one place to another are random.

Krige (1951), working with gold concentration data, concluded that the information given by the variance alone was not enough to explain the phenomenon under study; the distance between observations also had to be taken into account. From this arose the concept of geostatistics, which considers the geographical location and the spatial dependence. Matheron (1963, 1971), building on Krige's observations, developed the theory of regionalized variables, starting from the foundations of geostatistics. Blais and Carlier (1968), quoted by Olea (1975), consider a regionalized variable to be a numerical function with a spatial distribution that varies from one point to another with apparent continuity, but whose variations cannot be represented by a simple mathematical function.
The theory of regionalized variables assumes that the variation of a variable can be expressed as the sum of three components (Burrough, 1987): a) a structural component, associated with a constant mean value or a constant trend; b) a random, spatially correlated component; and c) random noise or residual error. Let x represent a position in one, two or three dimensions. Then the value of the variable Z at x is given by (Burrough, 1987):

$$Z(x) = m(x) + \varepsilon'(x) + \varepsilon'' \qquad (1)$$

where $m(x)$ is the deterministic function describing the structural component of Z at x, $\varepsilon'(x)$ is the random, spatially correlated component, and $\varepsilon''$ is the random noise or residual error.
The figure below illustrates the three main components of the spatial variation. In part (a) the deterministic component varies abruptly, while in part (b) it follows a constant trend.
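The decomposition in Equation (1) can be sketched numerically. The example below is only an illustration under stated assumptions: the linear trend chosen for m(x), the moving-average construction of the spatially correlated component, the noise levels, and all variable names are hypothetical choices and do not come from Burrough (1987) or from SPRING.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

n = 200
x = np.arange(n, dtype=float)  # positions along a 1-D transect

# (a) Structural component m(x): a constant linear trend (assumed form).
m = 10.0 + 0.02 * x

# (b) Spatially correlated random component e'(x): white noise smoothed by a
# moving average, so neighbouring positions share part of their randomness.
window = 15
white = rng.normal(0.0, 1.0, n + window - 1)
eps_corr = np.convolve(white, np.ones(window) / np.sqrt(window), mode="valid")

# (c) Spatially independent noise e'' (assumed Gaussian, zero mean).
eps_noise = rng.normal(0.0, 0.3, n)

# Equation (1): Z(x) = m(x) + e'(x) + e''
z = m + eps_corr + eps_noise


def lag1_correlation(values):
    """Correlation between each value and its immediate neighbour."""
    return np.corrcoef(values[:-1], values[1:])[0, 1]


print("lag-1 correlation of e'(x):", round(lag1_correlation(eps_corr), 3))
print("lag-1 correlation of e'':  ", round(lag1_correlation(eps_noise), 3))
```

The correlated component shows a lag-1 correlation close to one, while the noise term shows a correlation close to zero, which is the distinction between components (b) and (c).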
Figs. (a) and (b) - Main components of the spatial variation. Source: modified from Burrough (1987), p. 155.

Considered Hypotheses

Unlike conventional estimation methods, kriging is based on the theory of regionalized variables. The first step in kriging is to define a suitable function for the deterministic component m(x). To do so, some hypotheses are necessary (Burrough, 1987; David, 1977):
Under the second-order stationarity hypothesis, the deterministic component m(x) is assumed to be constant (there is no trend in the region). Then m(x) equals the expected value of the random variable Z at position x, and the mean difference between the values observed at x and x+h, separated by a distance vector h (modulus and direction), is zero:
$$E[Z(x) - Z(x+h)] = 0 \quad \text{or} \quad E[Z(x)] = E[Z(x+h)] = m(x) = m \qquad (2)$$

where E is the mathematical expectation operator. It is also assumed that the covariance between Z(x) and Z(x+h), separated by a distance vector h, exists and depends only on h. Then:

$$C(h) = \mathrm{Cov}[Z(x), Z(x+h)] = E\{[Z(x) - m]\,[Z(x+h) - m]\} = E[Z(x)\,Z(x+h)] - m^2, \quad \forall x \qquad (3)$$

where $\mathrm{Cov}[Z(x), Z(x+h)]$ is the covariance between Z(x) and Z(x+h). From Equation (3), stationarity of the covariance implies stationarity of the variance:

$$\begin{aligned}
\mathrm{Var}[Z(x)] &= E\{[Z(x) - m]^2\} = E[Z^2(x)] - 2\,E[Z(x)]\,m + m^2 \\
&= E[Z(x)\,Z(x+0)] - 2m^2 + m^2 \\
&= E[Z(x)\,Z(x+0)] - m^2 = C(0), \quad \forall x \qquad (4)
\end{aligned}$$

where Var is the variance operator. Stationarity of the covariance also implies stationarity of the variogram, defined by:

$$2\gamma(h) = E\{[Z(x) - Z(x+h)]^2\} \qquad (5)$$

Equation (5) can be written as:

$$2\gamma(h) = E\{Z^2(x) - 2\,Z(x)\,Z(x+h) + Z^2(x+h)\} = E[Z^2(x)] - 2\,E[Z(x)\,Z(x+h)] + E[Z^2(x+h)] \qquad (6)$$

From Equation (3):

$$E[Z(x)\,Z(x+h)] = C(h) + m^2 \qquad (7)$$

Analogously, from Equation (4):

$$E[Z(x)\,Z(x+0)] = E[Z^2(x)] = C(0) + m^2 \qquad (8)$$

Substituting Equations (7) and (8) into Equation (6), and noting that $E[Z^2(x+h)] = C(0) + m^2$ as well because Equation (4) holds for every x, gives:

$$2\gamma(h) = C(0) + m^2 - 2\,[C(h) + m^2] + C(0) + m^2 = 2\,C(0) - 2\,C(h) \qquad (9)$$

Simplifying Equation (9):

$$\gamma(h) = C(0) - C(h) \qquad (10)$$

where $\gamma(h)$ is the function known in the theory of regionalized variables as the semivariogram, which is half of the variogram. See discussion about variogram.
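The relation in Equation (10) can be checked numerically on a simulated second-order stationary series. The sketch below is an illustration only: the first-order autoregressive process, its parameters, and the helper names `simulate_ar1`, `covariance` and `semivariogram` are assumptions made for this example and are not part of SPRING.

```python
import numpy as np


def simulate_ar1(n, phi=0.8, mean=5.0, sigma=1.0, seed=0):
    """Simulate a stationary AR(1) series: a simple process with constant
    mean and a covariance that depends only on the lag h."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, n)
    z = np.empty(n)
    # Draw the first value from the stationary distribution.
    z[0] = mean + rng.normal(0.0, sigma / np.sqrt(1.0 - phi**2))
    for i in range(1, n):
        z[i] = mean + phi * (z[i - 1] - mean) + noise[i]
    return z


def covariance(z, h):
    """Experimental covariance C(h) for equally spaced data, lag h >= 0."""
    d = z - z.mean()
    return np.mean(d[: len(z) - h] * d[h:])


def semivariogram(z, h):
    """Experimental semivariogram gamma(h): half the mean squared difference."""
    return 0.5 * np.mean((z[: len(z) - h] - z[h:]) ** 2)


z = simulate_ar1(50_000)
lags = np.arange(0, 21)
c = np.array([covariance(z, h) for h in lags])
g = np.array([semivariogram(z, h) for h in lags])

# Equation (10): gamma(h) should match C(0) - C(h) up to sampling error.
max_gap = np.max(np.abs(g - (c[0] - c)))
print("largest |gamma(h) - (C(0) - C(h))| over lags 0..20:", max_gap)
```

The two sides differ only by small edge effects in the experimental estimators, so the gap shrinks as the series gets longer.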
The relation in Equation (10) indicates that, under the second-order stationarity hypothesis, the covariance and the semivariogram are two alternative ways of characterizing the autocorrelation of the pairs Z(x) and Z(x+h) separated by the vector h. The second-order stationarity hypothesis assumes the existence of a covariance and, therefore, of a finite variance (Equation 4). Under this condition, the correlogram, $\rho(h)$, can be defined. Dividing both sides of Equation (10) by C(0) gives:

$$\rho(h) = \frac{C(h)}{C(0)} = 1 - \frac{\gamma(h)}{C(0)} \qquad (11)$$
The restrictions imposed by second-order stationarity, that is, assuming that $\exists\, C(h) \Rightarrow \exists\, \mathrm{Var}[Z(x)] = C(0)$ and also $\Rightarrow \exists\, \gamma(h)$, may not be satisfied by some physical phenomena that have an infinite dispersion capacity (David, 1977). Infinite dispersion capacity $\Rightarrow$
As in the second-order stationarity hypothesis, the intrinsic hypothesis assumes that $E[Z(x)] = m(x) = m,\ \forall x$. In addition, it assumes that the variance of the differences depends only on the distance vector h, that is:

$$\mathrm{Var}[Z(x) - Z(x+h)] = E\{[Z(x) - Z(x+h)]^2\} = 2\gamma(h) \qquad (12)$$

where $2\gamma(h)$ is the variogram defined in Equation (5). According to David (1977), this hypothesis is the one most frequently used in geostatistics, mainly because it is less restrictive: it requires only the existence and stationarity of the variogram, with no restriction on the existence of a finite variance.

An additional consideration, beyond the scope of this work, is the Universal Kriging hypothesis (David, 1977). In this case, m(x) is the drift (main trend), and $C(h)$ and $\gamma(h)$ are assumed to be stationary within a neighborhood of restricted size. Moreover, $E[Z(x)] = m(x)$ is not stationary but is assumed to vary in a regular way within that neighborhood. According to David (1977), not only the covariance and the variogram but also the size of the neighborhood in which the hypothesis remains valid are determined from the experimental values. More on this subject can be found in Olea (1975, 1977), and an application example in Burgess and Webster (1980c).

This work assumes second-order stationarity ($\Rightarrow$ intrinsic hypothesis), which is sufficient for the simple kriging (SK) and ordinary kriging (OK) estimation methods, to be discussed ??????.

Regionalized Variables Characteristics

According to Olea (1975, 1977), the main characteristics of a regionalized variable are:
TABLE - PERCENTAGE OF H2O IN TWO DISTINCT SAMPLES A AND B.
In this table, the individual values in the two samples are exactly the same. Therefore, the sample mean, the sample variance and the histogram of the observed variable are identical in samples A and B. Any analysis that considers no statistics besides the mean, variance and histogram will fail to differentiate the two series. This example emphasizes the importance of measuring the spatial continuity of a regionalized variable: to distinguish the two samples, the relative spatial position of each observation must be taken into account. The spatial continuity of a regionalized variable can be analyzed from the variogram, as described next.
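The sketch below illustrates this point with two hypothetical series; they are not the values of the table above. Both series contain the same ten numbers, one in a smoothly increasing order and one in a scrambled order, and the helper `semivariogram` is a name introduced here only for illustration, assuming equally spaced observations.

```python
import numpy as np

# Hypothetical measurements along a transect (NOT the values from the table).
sample_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
# The same ten values in a different spatial order.
sample_b = np.array([6.0, 1.0, 9.0, 3.0, 10.0, 2.0, 7.0, 5.0, 8.0, 4.0])


def semivariogram(z, h):
    """Experimental semivariogram at integer lag h >= 1, equally spaced data."""
    diffs = z[h:] - z[:-h]
    return 0.5 * np.mean(diffs**2)


# Classical summaries cannot tell the two samples apart.
same_mean = np.isclose(sample_a.mean(), sample_b.mean())
same_var = np.isclose(sample_a.var(), sample_b.var())
bins = np.arange(0.5, 11.5, 1.0)  # one bin per distinct value
same_hist = np.array_equal(
    np.histogram(sample_a, bins)[0], np.histogram(sample_b, bins)[0]
)

# The semivariogram uses the positions of the observations and does differ.
gamma_a = [semivariogram(sample_a, h) for h in (1, 2, 3)]
gamma_b = [semivariogram(sample_b, h) for h in (1, 2, 3)]

print("same mean, variance, histogram:", same_mean, same_var, same_hist)
print("gamma(h) for A, h = 1, 2, 3:", gamma_a)
print("gamma(h) for B, h = 1, 2, 3:", gamma_b)
```

Series A has small semivariogram values at short lags because neighbouring observations are similar, whereas series B already shows large values at lag 1 even though its mean, variance and histogram are the same.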