TerraLib and TerraView Wiki Page

Data Access Module

The Data Access module provides the fundamental layer for applications that handle spatial data from different sources, ranging from traditional DBMSs to OGC Web Services.

This module is composed of abstract base classes that must be extended to create Data Access Drivers, which implement all the details needed to access data in a specific format or system.

This module provides the foundation for an application to discover what is stored in a data source and how the data is organized in it.

Keep in mind that this organization is the low-level organization of the data. For instance, in a DBMS, you can find out which tables are stored in a database, the relationships between them, and detailed information about their columns (data type, name and constraints).

It is not the role of this module to provide higher-level metadata about the data stored in a data source. That support is provided by another TerraLib module, the Spatial Metadata module.

This section describes the Data Access module in detail.

Design

As one can see in the class diagram below, the Data Access module provides a basic framework for accessing data.

Data Access Class Diagram

It is designed for extensibility and data interoperability, so you can easily extend it with your own data access implementation.

The requirements that drove this design were:

  • extensible data formats/access: the API must be flexible enough to allow new data source driver implementations for new data formats.
  • data type extension: it must be possible to add new data types to the list of supported ones. The API must provide a way for developers to access new data types that exist only in particular implementations. New data types can be added to all data source drivers or only to some of them. This enables the use of the extensible data types available in object-relational DBMSs.
  • query language extension: it must be feasible to add new functions to the list of supported operations in the query language of a specific driver.
  • dynamic linking and loading: new drivers can be added without explicit linking. The data access architecture must support dynamic loading of data source driver modules, so that new drivers can be installed at any time without recompiling TerraLib or the application.

The yellow classes with names in italics are abstract and must be implemented by data access drivers. Each class is discussed in detail below. See the Doxygen documentation for more details.

DataSource

The DataSource class is the fundamental class of the data access module and it represents a data repository.

It may represent, for instance, a PostgreSQL database, an Oracle database, an OGC Web Feature Service, a directory of ESRI shapefiles, a single shapefile, a directory of TIFF images, a single TIFF image or a data stream.

Each system or file format requires an implementation of this class.

A DataSource exposes the data it contains as a collection of DataSets.

Information about the data stored in a data source may be available through a DataSetType, which contains the dataset name and its structure/schema.
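The relationship between a data source, its datasets and their DataSetType can be pictured with a minimal, self-contained sketch. It does not use the TerraLib API: the Property struct, the InMemoryCatalog class, its method names and the data type strings are assumptions made for illustration only, and the real class interfaces are described in the Doxygen documentation.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified description of one attribute (column) of a dataset.
struct Property
{
  std::string name;
  std::string dataType;
};

// Simplified stand-in for a DataSetType: the dataset name plus its schema.
struct DataSetType
{
  std::string name;
  std::vector<Property> properties;
};

// Toy catalog (hypothetical) that keeps the dataset descriptions in memory.
class InMemoryCatalog
{
public:
  // Registers (or replaces) the description of a dataset.
  void add(const DataSetType& type) { m_types[type.name] = type; }

  // Lists the names of all known datasets, in alphabetical order.
  std::vector<std::string> getDataSetNames() const
  {
    std::vector<std::string> names;
    for (const auto& entry : m_types)
      names.push_back(entry.first);
    return names;
  }

  // Returns the schema of one dataset, or throws if the name is unknown.
  const DataSetType& getDataSetType(const std::string& name) const
  {
    auto it = m_types.find(name);
    if (it == m_types.end())
      throw std::out_of_range("no such dataset: " + name);
    return it->second;
  }

private:
  std::map<std::string, DataSetType> m_types;
};
```

An application would first list the dataset names and then ask for the DataSetType of the one it is interested in, which is the kind of low-level discovery this module is meant to support.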

Besides the descriptive information about the underlying data repository, each data source also provides information about its requirements and capabilities. Applications may use this information to adapt to the abilities of the data source in use.

Each data source driver must have a unique identifier. This identifier is a string (in capital letters) with the data source type name and it is available through the method getType. Examples of identifiers are: POSTGIS, OGR, GDAL, SQLITE, WFS, WCS, SHP, ACCESS.
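One way a string identifier can be used to pick a driver at run time is a registry of factory functions keyed by that identifier. The following is a minimal sketch under stated assumptions, not TerraLib's implementation: the DataSource class is reduced to the getType method, and DriverRegistry, MemoryDataSource and the identifier "MEMORY" are hypothetical names. Loading driver modules from shared libraries is left out.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Reduced abstract base class: every driver provides its own implementation.
class DataSource
{
public:
  virtual ~DataSource() = default;

  // Unique, upper-case identifier of the driver type (e.g. "POSTGIS").
  virtual std::string getType() const = 0;
};

// Hypothetical registry that maps driver identifiers to factory functions.
class DriverRegistry
{
public:
  using Factory = std::function<std::unique_ptr<DataSource>()>;

  // Registers a driver; each identifier may be registered only once.
  void add(const std::string& type, Factory factory)
  {
    if (!m_factories.emplace(type, std::move(factory)).second)
      throw std::runtime_error("driver already registered: " + type);
  }

  // Creates a data source of the requested type, or throws if it is unknown.
  std::unique_ptr<DataSource> make(const std::string& type) const
  {
    auto it = m_factories.find(type);
    if (it == m_factories.end())
      throw std::runtime_error("unknown driver: " + type);
    return it->second();
  }

private:
  std::map<std::string, Factory> m_factories;
};

// Toy driver (hypothetical) standing in for a real implementation.
class MemoryDataSource : public DataSource
{
public:
  std::string getType() const override { return "MEMORY"; }
};
```

Because the application only names the driver by its identifier, it has no compile-time dependency on the concrete driver class. A driver module would call add once when it is loaded, and the application would call make with the identifier it needs.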

A data source is also characterized by a set of parameters used to set up an access channel to its underlying repository. This is referred to as the data source connection information.

The connection information may be provided as an associative container (a set of key-value pairs) through the method setConnectionInfo (see footnote 1). The key-value pairs may specify the maximum number of accepted connections, the user name and password required to establish a connection, the URL of a service, or any other information the data source needs to operate.

The parameters depend on the data source driver, so check the driver documentation for the supported parameters (see footnote 2). For instance, in a PostGIS data source it is usual to use the following syntax:
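As a minimal sketch of the two forms of connection information, the code below builds an associative container of key-value pairs and encodes it as a plain string in the way footnote 2 describes (key=value pairs joined by &, URL-encoded). It is not the driver's documented syntax: the key names (host, port, user, password) and the helper functions urlEncode, toConnectionString and exampleConnectionInfo are assumptions made for illustration, and the real parameter names must be taken from the driver documentation.

```cpp
#include <cctype>
#include <map>
#include <string>

// Percent-encodes every byte except the RFC 3986 unreserved characters.
std::string urlEncode(const std::string& text)
{
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : text)
  {
    if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~')
    {
      out += static_cast<char>(c);
    }
    else
    {
      out += '%';
      out += hex[c >> 4];
      out += hex[c & 0x0F];
    }
  }
  return out;
}

// Joins the pairs as key=value&key=value, URL-encoding keys and values.
std::string toConnectionString(const std::map<std::string, std::string>& info)
{
  std::string out;
  for (const auto& kv : info)
  {
    if (!out.empty())
      out += '&';
    out += urlEncode(kv.first) + '=' + urlEncode(kv.second);
  }
  return out;
}

// Hypothetical connection information; the key names are placeholders,
// not the parameters of any actual driver.
std::map<std::string, std::string> exampleConnectionInfo()
{
  return {
    {"host", "localhost"},
    {"port", "5432"},
    {"user", "postgres"},
    {"password", "p@ss word"}
  };
}
```

Since std::map iterates its keys in sorted order, toConnectionString(exampleConnectionInfo()) yields host=localhost&password=p%40ss%20word&port=5432&user=postgres. The associative container is the form that would be handed to setConnectionInfo, and the encoded string corresponds to the plain connection string mentioned in the footnotes.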

For an in-depth explanation, see the Doxygen documentation of this class.

DataSet

For an in-depth explanation, see the Doxygen documentation of this class.

DataSetType

For an in-depth explanation, see the Doxygen documentation of this class.

1) TODO: we should use a plain connection string through the method setConnectionStr
2) When using a plain string, the information is encoded as a set of key-value pairs: each key is separated from its value by an equals sign (=), the pairs are separated by an ampersand (&), and the pairs must be URL-encoded.