
Geostatistics

Geostatistical methods most often start from observations of one or more attributes at points, and are concerned with their statistical interpolation to a field or continuous surface assumed to extend across the whole study area. It is of course possible to interpolate deterministically, or to use polynomial regression on the site coordinates to predict a trend surface, but these methods do not give the degree of statistical control available from variogram analysis and subsequent modelling by kriging. Geostatistical methods are also subject to a variant of the modifiable areal unit problem, known as the change of support problem (Cressie, 1996): although a surface is assumed to exist throughout the study area, it is not feasible to gather data at every location in the study area, or to know on the basis of the sample points alone how well they represent it. Geologists are also vitally interested in finding anomalies, perhaps similar to clusters; the same applies to environmental scientists examining the distribution of radioactive isotopes, who are concerned to locate "hot-spots".
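As a point of comparison for the deterministic interpolation mentioned above, the following is a minimal sketch of inverse distance weighting. The choice of this particular method, the function name `idw_predict`, the `power` parameter and the sample values are all illustrative assumptions, not taken from the text. It returns a predicted value but no measure of prediction uncertainty, which is the gap that variogram analysis and kriging address.

```python
import math

def idw_predict(sites, values, target, power=2.0):
    """Inverse distance weighted prediction at `target` (hypothetical helper)."""
    num = 0.0
    den = 0.0
    for (x, y), z in zip(sites, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z  # target coincides with an observation site
        w = 1.0 / d ** power  # nearer sites receive larger weights
        num += w * z
        den += w
    return num / den

# Made-up observations at the corners of a unit square.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]
centre = idw_predict(sites, values, (0.5, 0.5))
```

At the centre of the square all four sites are equidistant, so the prediction is simply the mean of the four values.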

In practice a sample data set may be treated for systematic variation in the first two moments (mean and variance) before geostatistical analysis begins. The next step is to use variograms to explore spatial variability between all pairs of points a specified distance apart. Measures are taken across the whole map, and can be taken assuming isotropy, or in a chosen direction. The chief sources for exploratory variography and variogram modelling are Cressie (1993), Isaaks and Srivastava (1989), and Deutsch and Journel (1992). Over recent years, semivariogram analysis and modelling have attracted growing attention in the spatial analysis of data from fields other than the earth sciences. Among other examples, geostatistical methods have been employed in medical as well as physical geography (Oliver and Webster 1986, Webster, Oliver, Muir and Mann 1994).

The distance measure $\mathbf{h}$ is a vector expressing distance and direction, within specified tolerances, and thus has a natural head and tail. The head and tail variables can be the same, but can also differ; in such bivariate cases causal effect is manifested in the direction and at the distance specified. It is assumed that the same dependency relationships between locations hold irrespective of position in the study area, although the relations may be anisotropic. The classical semivariance measure is:

\[ \gamma(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{i=1}^{N(\mathbf{h})} (x_i - y_i)^2 \]

where $N(\mathbf{h})$ is the number of pairs fulfilling relationship $\mathbf{h}$, $x_i$ is the tail value, and $y_i$ is the head value. The covariogram is similarly defined:

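\[ C(\mathbf{h}) = \frac{1}{N(\mathbf{h})} \sum_{i=1}^{N(\mathbf{h})} x_i y_i - m_{-\mathbf{h}} m_{+\mathbf{h}} \]

where $m_{-\mathbf{h}}$ and $m_{+\mathbf{h}}$ are the means of the tail and head values respectively.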

The semivariance is thus the sum of squared differences between pairs of values separated by $\mathbf{h}$, divided by twice the number of such pairs. This is analogous to the Geary statistic, while the covariogram corresponds to a distance-banded Moran's I statistic (described in sections on the analysis of lattice data below). Many further semivariance estimators are available, providing robustness to outliers, and perhaps a better separation of the structured component, related to the overall distribution of the phenomenon, from its often erratic local behaviour.
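The two measures above can be sketched in a few lines of standard-library Python. The helper names (`lag_pairs`, `semivariance`, `covariogram`), the tolerance parameters and the transect data are illustrative assumptions. The covariogram is taken here to be the mean of the tail-head products minus the product of the tail and head means, a convention that should be checked against the sources cited above.

```python
import math

def lag_pairs(coords, lag, dist_tol, angle_tol_deg):
    """Yield (tail, head) index pairs whose separation matches the vector `lag`.

    A pair qualifies when its length is within `dist_tol` of |lag| and its
    direction is within `angle_tol_deg` degrees of the direction of `lag`.
    """
    lag_len = math.hypot(lag[0], lag[1])
    lag_dir = math.atan2(lag[1], lag[0])
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            if abs(math.hypot(dx, dy) - lag_len) > dist_tol:
                continue
            # Wrap the angular difference into [-pi, pi) before comparing.
            diff = math.atan2(dy, dx) - lag_dir
            diff = abs((diff + math.pi) % (2 * math.pi) - math.pi)
            if math.degrees(diff) <= angle_tol_deg:
                yield i, j

def semivariance(tail, head):
    """Sum of squared tail-head differences over twice the number of pairs."""
    n = len(tail)  # assumes at least one pair
    return sum((x - y) ** 2 for x, y in zip(tail, head)) / (2.0 * n)

def covariogram(tail, head):
    """Mean tail-head product minus the product of tail and head means."""
    n = len(tail)  # assumes at least one pair
    mean_tail = sum(tail) / n
    mean_head = sum(head) / n
    return sum(x * y for x, y in zip(tail, head)) / n - mean_tail * mean_head

# Made-up transect: five sites at unit spacing along the x axis.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
z = [1.0, 2.0, 4.0, 7.0, 11.0]
pairs = list(lag_pairs(coords, lag=(1.0, 0.0), dist_tol=0.1, angle_tol_deg=10.0))
tail = [z[i] for i, _ in pairs]
head = [z[j] for _, j in pairs]
gamma_h = semivariance(tail, head)
cov_h = covariogram(tail, head)
```

Because the lag is directional, only the four pairs pointing in the positive x direction are selected; the reversed pairs differ in direction by 180 degrees and fall outside the angular tolerance.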

Modelling proceeds by fitting one or more of a family of functions to the observed curve, adjusting a number of parameters. The principal advantage of geostatistical methods is realised when the resulting models are used for prediction at other locations within the study area, drawing on the results both of trend analyses and of local dependencies. These predictions yield surfaces of fitted values, perhaps plotted over a regular grid and contoured, and more importantly surfaces of variances, permitting confidence intervals to be constructed around model predictions over the study area. For environmental scientists in general, and mining geologists in particular, attempting to squeeze the most information possible out of each sample core, these methods have proved to be of considerable value. Social science applications are limited chiefly because there are relatively few phenomena which can reasonably be supposed to exist as surfaces of this nature, although by the use of analogy, one might relax this limitation. There are some parallels between work in geostatistics and the treatment of non-stationarity using geographically weighted regression discussed below.
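To make the prediction step concrete, here is a minimal sketch of ordinary kriging at a single target location, returning both a fitted value and a prediction variance. It assumes a spherical semivariogram model with hand-picked parameters (nugget 0, sill 1, range 2) rather than a fitted one; the function names, the small linear solver and the data are illustrative assumptions, not a description of any particular package.

```python
import math

def spherical(h, nugget, sill, a):
    """Spherical semivariogram: rises from `nugget` to `sill` at range `a`."""
    if h == 0.0:
        return 0.0
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(sites, values, target, gamma):
    """Return (prediction, kriging variance) at `target` for semivariogram `gamma`."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    n = len(sites)
    # Semivariances between sites, bordered by the unbiasedness constraint.
    a = [[gamma(dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(dist(s, target)) for s in sites] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance

def model(h):
    # Assumed parameters, chosen by hand for illustration only.
    return spherical(h, nugget=0.0, sill=1.0, a=2.0)

sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]
pred_centre, var_centre = ordinary_kriging(sites, values, (0.5, 0.5), model)
pred_site, var_site = ordinary_kriging(sites, values, (0.0, 0.0), model)
```

With no nugget effect the predictor reproduces the observed value at a sampled site with zero variance, while the variance at the unsampled centre is positive. Repeating the calculation over a regular grid would give the surfaces of fitted values and variances described above.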



Roger Bivand
Fri Mar 5 08:30:34 CET 1999