Kernel Estimation

When we observe spatial point patterns, such as the incidence of crimes or diseases, we can identify areas of high or low intensity of occurrences, and this already gives us some information. Often, however, looking at the raw events does not help much, and a smoothing operation can improve what we learn about the trend of the phenomenon. The objective of kernel estimation is therefore to estimate values where there are no samples, using an interpolation function, and so to provide a smoother view of the phenomenon.

Estimating the intensity of a spatial point pattern is very similar to estimating a probability density. SPRING uses the quartic kernel, and the figure below illustrates the operation. There is a plane with event occurrences and, for each location S where one wants to estimate the intensity, a function like the one in the drawing is centered on S. The points that fall under the function contribute to the intensity value at S. Each point contributes according to its distance from S: the closer the point, the larger its contribution.

[Figure: p1.gif]

We can therefore imagine a rectangular grid overlaid on the plane. The function is applied at each grid cell and a value is estimated, which measures the influence of the samples on that cell. From the grid it is possible to generate an image or perform a slicing, and then observe how the density of points spreads over the region.

[Figure: p2.bmp]

 

Suppose that S represents a location in a region R and that S1, ..., Sn are the locations of n observed events (samples). The intensity at location S (the red point in the figure below) is then estimated using the formula

$$\hat{\lambda}_t(S) = \sum_{h_i \le t} \frac{3}{\pi t^2} \left(1 - \frac{h_i^2}{t^2}\right)^2$$

where hi is the distance between the point S and the observed event location Si, and the sum only includes the points whose distance hi does not exceed t (the points inside the blue circle).

[Figure: p3.bmp]

The region of influence, within which events contribute to the intensity calculation, is a circle of radius t centered on S. The quartic kernel formula shows that an event at location S itself, at zero distance, receives the weight $3/(\pi t^2)$. The weight decreases to zero when the distance reaches t.
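The estimate at a single location can be sketched in a few lines of Python. This is only an illustration, assuming the usual quartic kernel weight $\frac{3}{\pi t^2}(1 - h_i^2/t^2)^2$; the function name `quartic_intensity` and the sample coordinates are hypothetical and are not part of SPRING.

```python
import math

def quartic_intensity(s, events, t):
    """Quartic kernel estimate of the intensity at location s, bandwidth t."""
    total = 0.0
    for x, y in events:
        h = math.hypot(x - s[0], y - s[1])  # distance h_i from s to the event
        if h <= t:  # only events inside the circle of radius t contribute
            total += 3.0 / (math.pi * t ** 2) * (1.0 - h ** 2 / t ** 2) ** 2
    return total

# Hypothetical events: two near the origin, one far away.
events = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
estimate = quartic_intensity((0.0, 0.0), events, t=2.0)
print(estimate)
```

With t = 2, the third event lies outside the circle and adds nothing, while the event located exactly at S contributes the maximum weight 3/(4π).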

In SPRING, the radius t is called the bandwidth, and its choice determines how smooth the resulting surface is. For large bandwidths t the estimated intensity is smooth, and for small bandwidths it tends to show peaks centered on the events Si.
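The effect of the bandwidth can be seen by evaluating the estimator on a rectangular grid with two different values of t. The sketch below assumes the quartic kernel and evaluates it at the center of each cell; the names `kernel_grid` and `quartic_intensity`, the bounding box, and the event coordinates are hypothetical and do not reproduce SPRING's own implementation.

```python
import math

def quartic_intensity(s, events, t):
    """Quartic kernel estimate of the intensity at location s, bandwidth t."""
    total = 0.0
    for x, y in events:
        h = math.hypot(x - s[0], y - s[1])
        if h <= t:
            total += 3.0 / (math.pi * t ** 2) * (1.0 - h ** 2 / t ** 2) ** 2
    return total

def kernel_grid(events, box, res_x, res_y, t):
    """Evaluate the estimator at the center of every cell of a grid.

    box is (x_min, y_min, x_max, y_max); res_x and res_y are the cell sizes.
    """
    x_min, y_min, x_max, y_max = box
    n_cols = math.ceil((x_max - x_min) / res_x)
    n_rows = math.ceil((y_max - y_min) / res_y)
    grid = []
    for r in range(n_rows):
        yc = y_min + (r + 0.5) * res_y
        grid.append([
            quartic_intensity((x_min + (c + 0.5) * res_x, yc), events, t)
            for c in range(n_cols)
        ])
    return grid

events = [(2.0, 2.0), (2.5, 2.0), (7.0, 8.0)]
box = (0.0, 0.0, 10.0, 10.0)
narrow = kernel_grid(events, box, 1.0, 1.0, t=1.0)  # small bandwidth
wide = kernel_grid(events, box, 1.0, 1.0, t=5.0)    # large bandwidth

peak_narrow = max(max(row) for row in narrow)
peak_wide = max(max(row) for row in wide)
cells_narrow = sum(1 for row in narrow for v in row if v > 0)
cells_wide = sum(1 for row in wide for v in row if v > 0)
print(peak_narrow, peak_wide, cells_narrow, cells_wide)
```

For these sample events, the small bandwidth gives a higher peak concentrated in a few cells, whereas the large bandwidth spreads lower values over many more cells.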

[Figure: p7.jpg]

The operation described so far takes only the event locations into account. It helps, for example, to identify crime hotspots or epidemic outbreaks: What are the most dangerous regions? Where can we find disease foci?

It is also possible to use an operation that considers point attributes in addition to location, for example estimating the carbon density at a certain location from a set of known data. Both location and attribute are considered in this case.

This version allows some variations.

The function may take point attributes into account. In the simplest case, where each point corresponds only to an event occurrence, it is an intensity estimator and gives the number of "events per unit area". If there is a value yi associated with each point Si, for instance clay content, one can use the formula

$$\hat{\lambda}_t(S) = \sum_{i=1}^{n} \frac{1}{t^2}\, k\!\left(\frac{h_i}{t}\right) y_i$$

which represents an "attribute quantity per unit area"; here k() is the kernel function, t is the bandwidth and hi is the distance from S to Si. If, for instance, you want to estimate the average value of the attribute, divide this value by the number of "events per unit area" to get

$$\hat{\mu}_t(S) = \frac{\sum_{i=1}^{n} k(h_i/t)\, y_i}{\sum_{i=1}^{n} k(h_i/t)}$$

If you are working with polygons, each polygon can be represented by its centroid, and the tools above can then be applied.
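The attribute-based variations described above can be sketched as follows. This is a minimal illustration that assumes the quartic kernel and samples given as (x, y, value) tuples; for polygons, the coordinates would be those of the centroids. The names `quartic_weight` and `attribute_estimates` are hypothetical, and returning `None` where no sample falls inside the bandwidth is a choice made here only for the example.

```python
import math

def quartic_weight(h, t):
    """Quartic kernel weight of a sample at distance h, for bandwidth t."""
    if h > t:
        return 0.0
    return 3.0 / (math.pi * t ** 2) * (1.0 - h ** 2 / t ** 2) ** 2

def attribute_estimates(s, samples, t):
    """Return (attribute per unit area, events per unit area, local average).

    samples is a list of (x, y, value) tuples.
    """
    quantity = 0.0   # kernel-weighted sum of attribute values
    intensity = 0.0  # kernel-weighted count of events
    for x, y, value in samples:
        w = quartic_weight(math.hypot(x - s[0], y - s[1]), t)
        quantity += w * value
        intensity += w
    # The average is the ratio of the two estimates; undefined with no samples.
    average = quantity / intensity if intensity > 0 else None
    return quantity, intensity, average

# Hypothetical samples with an attribute value each.
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0)]
quantity, intensity, average = attribute_estimates((0.5, 0.0), samples, t=2.0)
print(quantity, intensity, average)
```

At the midpoint between the two samples both receive the same weight, so the local average is simply the mean of their attribute values.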

Suppose you want to estimate "events per population". Just divide the "number of events per area" by the "population per area" (the population density), and you will get:

 

Another application is to use as the population another spatial process that represents it, called the control process. For example, for cases of larynx cancer, the cases of lung cancer in the same area could be used as the population. This case is the ratio between two processes, represented by the formula:

In the formula above, k() stands for a general kernel function. If the quartic kernel is chosen for k(), it must be replaced by:

[Formula image: p10.jpg]

Notice that in some cases the same bandwidth is used for the numerator and for the denominator. If the user wants to use different bandwidths, the bandwidth chosen for the denominator should not be narrower than the one chosen for the numerator. This precaution avoids unacceptable variations in the computed ratio.
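A ratio of two kernel estimates, each with its own bandwidth, can be sketched as below. The sketch assumes the quartic kernel for both processes; the names `kernel_ratio` and `quartic_intensity` and the coordinates are hypothetical, and returning `None` where the denominator is zero is an assumption made for this example, not SPRING's documented behavior.

```python
import math

def quartic_intensity(s, events, t):
    """Quartic kernel estimate of the intensity at location s, bandwidth t."""
    total = 0.0
    for x, y in events:
        h = math.hypot(x - s[0], y - s[1])
        if h <= t:
            total += 3.0 / (math.pi * t ** 2) * (1.0 - h ** 2 / t ** 2) ** 2
    return total

def kernel_ratio(s, numerator_events, denominator_events, t_num, t_den):
    """Ratio between the intensities of two point processes at location s."""
    den = quartic_intensity(s, denominator_events, t_den)
    if den == 0.0:
        return None  # no control event within t_den: ratio undefined
    return quartic_intensity(s, numerator_events, t_num) / den

# Hypothetical cases (numerator) and controls (denominator).
cases = [(0.0, 0.0)]
controls = [(0.0, 0.0), (0.0, 0.0)]
same_band = kernel_ratio((0.5, 0.5), cases, controls, t_num=2.0, t_den=2.0)
print(same_band)
```

With equal bandwidths the kernel weights cancel for coincident points, so one case over two controls gives a ratio of one half. A denominator bandwidth narrower than the numerator's would make a zero or near-zero denominator more likely where the numerator is still positive, which is the kind of instability the recommendation above is meant to avoid.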

NOTE: This module has many options, and the user is responsible for the coherence of the operations. Before describing the execution itself, we present a short summary of the possible kernel operations (simple or ratio).

For example, a simple operation gives the number of observations per unit area, where only the location of the points is considered. Another example is the attribute quantity per unit area. The SPRING interface provides the option of considering an attribute or not, and therefore of performing either of the operations described. Only the left side of the window is visible and, if the chosen infolayer contains polygons, the user must choose Area instead of Point (the default). SPRING will calculate the centroid of each polygon and then work as if the centroids were points.

The ratio operation is used to calculate, for instance, the average value of the attribute. When this option is selected, the right side of the window is also active. It is then necessary to calculate the attribute per area (the numerator, presented on the left side of the window) and the events per area (the denominator, shown on the right side). SPRING calculates the average, which is the ratio between these two values.

Make sure you choose the right options: area or point, with or without attributes, and the correct model.

The options available for the operation depend on and vary according to the IL category. For Thematic ILs, the geoclasses become available; for Cadastral ILs, you should choose the associated objects and attributes; for Digital ILs, the only option is to consider or not the attributes (the default is not to consider them).

The output of these operations in SPRING is a grid. The user should create a thematic model category with the corresponding intensity and color classes so that the grid can be sliced and the outcome observed. It is also useful to create an image model category in order to generate an image from the grid. This image makes it possible to visualize the distribution.

For the ratio operation, the grid will have values that are difficult to slice. Ideally, one would inspect a histogram of the grid to see how the values are distributed. In this case, a percentage of extreme values is usually discarded and the rest is sliced into equal intervals. However, SPRING does not have this feature yet. As an alternative, an image can be generated from this grid and sliced with the slice option of the Image - Contrast function.
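Outside SPRING, the trimming and equal-interval slicing just described could be sketched as follows. This is an illustration only: the name `slice_grid`, the default trim fraction, and the decision to clamp the trimmed extremes into the first and last classes (rather than drop them) are all assumptions of this example.

```python
def slice_grid(grid, n_classes, trim_fraction=0.05):
    """Assign each grid value a class index from 0 to n_classes - 1.

    The class limits are equal intervals between the values left after
    ignoring trim_fraction of the sorted values at each end; values outside
    those limits are clamped into the first or last class.
    """
    values = sorted(v for row in grid for v in row)
    k = int(len(values) * trim_fraction)  # number of values trimmed per end
    low, high = values[k], values[len(values) - 1 - k]
    width = (high - low) / n_classes

    def classify(v):
        if width == 0:
            return 0  # all remaining values are equal: single class
        index = int((v - low) / width)
        return min(max(index, 0), n_classes - 1)

    return [[classify(v) for v in row] for row in grid]

# Hypothetical ratio grid with one extreme value.
ratio_grid = [[0.0, 1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0, 1000.0]]
classes = slice_grid(ratio_grid, n_classes=3, trim_fraction=0.125)
print(classes)
```

Without the trimming step, the single value 1000 would stretch the intervals so much that all the other cells would fall into the first class.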

NOTE: This function operates on points; thus, only the Cadastral, Thematic and DTM categories are valid.


Executing the Kernel function in SPRING:

  • select on the "Control Panel" a Thematic, Cadastral or DTM infolayer (IL) that contains the entity to be analyzed. If the IL is Cadastral or Thematic, 2D points or polygons can be used; if it is Digital, 3D points;
  • click on Analysis - Spatial Statistics - Kernel Estimation... on the main menu;
  • select the operation type by clicking on Simple or Ratio;
  • in the Ratio case, select the category and the IL that hold the denominator data; the Numerator is always the active IL;
  • select the data type, point or area, for the Numerator and, in the Ratio case, for the Denominator. For areas (polygons), the program computes the centroid;
  • select an object (Cadastral case) or a geoclass (Thematic case) for the Numerator and, in the Ratio case, also for the Denominator;
  • select an attribute, in the Cadastral case;
  • type the bandwidth or use the cursor to select one. With the text resources it is possible to select the same value for the Numerator and the Denominator;
  • select whether the attribute will be used. If it is not, only the position is considered;
  • click on Output Category and select, in the "Category List" window, a category of the numerical model;
  • type the name of the IL to be created. This IL will have a grid representation;
  • select a box that contains the samples and points of interest;
  • type the resolution in X and in Y;
  • click on Grid Generation. The result is a grid of intensity values, which has to be sliced into intervals associated with thematic classes previously defined in the conceptual model.

 

See also:

Spatial Analysis