Data Types

The data type module implements the type system that TerraLib uses to handle data coming from different data sources. It plays an important role in TerraLib because each data source has its own set of data types for representing and storing data. This module is integrated with the data access module and provides the foundation for data type extensibility.

Design

The basic idea behind this module is to provide a data type extension mechanism and an abstraction for handling data values in a more general way. This is achieved by providing an abstract class called AbstractData, from which all other data type values must derive, and a data type system where you can register new types (see the DataType and DataTypeManager classes). This module also includes the base classes for describing properties applied to sets of values.

Type, Type Management and Type Codes

Each data type is associated with a type code (an integer value). This code must be unique and must be a standardized value known to all TerraLib modules. The following table shows some data type codes (see Enums.h for a complete and up-to-date list):

Macro name        Type code  Description
TE_UNKNOWN_DT     0          used when the correct data type is unknown
TE_VOID_DT        1          data type for void values
TE_BIT_DT         2          data type for values stored as 1 bit of data
TE_CHAR_DT        3          character data type (1 byte long)
TE_UCHAR_DT       4          unsigned character data type (1 byte long)
TE_INT16_DT       5          signed integer number data type (2 bytes long)
TE_UINT16_DT      6          unsigned integer number data type (2 bytes long)
TE_INT32_DT       7          signed integer number data type (4 bytes long)
TE_UINT32_DT      8          unsigned integer number data type (4 bytes long)
TE_INT64_DT       9          signed integer number data type (8 bytes long)
TE_UINT64_DT      10         unsigned integer number data type (8 bytes long)
TE_BOOLEAN_DT     11         boolean type (true or false)
TE_FLOAT_DT       12         float number (32 bits) data type
TE_DOUBLE_DT      13         double number (64 bits) data type
TE_NUMERIC_DT     14         arbitrary precision data type: Numeric(p, q)
TE_STRING_DT      15         string data type; may be a fixed-length string (blank-padded if needed), a variable-length string with a size limit, or a variable-length string of unlimited size
TE_BYTE_ARRAY_DT  16         binary data (BLOB)
TE_GEOMETRY_DT    17         vectorial geometry data type
TE_DATETIME_DT    18         date and time types
TE_ARRAY_DT       19         multidimensional array of homogeneous elements
TE_COMPOSITE_DT   20         composite type
TE_DATASET_DT     21         used when the type is a DataSet
TE_RASTER_DT      22         used when the type is a raster
TE_CINT16_DT      23         complex signed integer (4 bytes long → 2 + 2)
TE_CINT32_DT      24         complex signed integer (8 bytes long → 4 + 4)
TE_CFLOAT_DT      25         complex float (8 bytes long → 4 + 4)
TE_CDOUBLE_DT     26         complex double (16 bytes long → 8 + 8)
TE_XML_DT         27         data type for XML documents

You can use the macros listed above when working with the built-in types of TerraLib. Although each macro has a fixed code, rely only on the macro names and not on their values, because the values may change in future releases. Note also that these type codes are used by the classes that describe properties.
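
As a minimal sketch of why code should refer to the macro names rather than to literal values, the function below dispatches on a type code using only the macros. The #define lines reproduce a few entries of the table purely so that the sketch compiles on its own; in real code they would come from the TerraLib headers (see Enums.h). The helper describeTypeCode is hypothetical and is not part of TerraLib.

```cpp
#include <cassert>
#include <string>

// Stand-ins for a few TerraLib type codes so that this sketch is self-contained.
// In real code these come from the TerraLib headers (see Enums.h); the numeric
// values must never be hard-coded by client code.
#ifndef TE_UNKNOWN_DT
#define TE_UNKNOWN_DT 0
#define TE_INT32_DT 7
#define TE_DOUBLE_DT 13
#define TE_STRING_DT 15
#endif

// Hypothetical helper: maps a type code to a short human-readable label.
inline std::string describeTypeCode(int typeCode)
{
  switch(typeCode)
  {
    case TE_INT32_DT:  return "32-bit signed integer";
    case TE_DOUBLE_DT: return "64-bit floating point";
    case TE_STRING_DT: return "string";
    default:           return "unknown";
  }
}

// Usage example: the caller passes the macro, never the literal value.
inline void describeTypeCodeExample()
{
  assert(describeTypeCode(TE_INT32_DT) == "32-bit signed integer");
}
```

Because the switch never mentions a literal number, it keeps working if a future release renumbers the codes.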

Besides the numeric codes, there are also two classes that help register the available data types:

Data Type Management Classes

The DataType class stores descriptive information about a given data type.

DataTypeManager is a singleton for managing all data types in the system. There are some basic constraints for data types:

  • No two data types may have the same name.
  • The id of a data type is generated dynamically by the manager (it is the same as the type code).
  • Data type names must be in capital letters, although they can contain numbers and other symbols.
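
The constraints above can be illustrated with a small registry. This is only a sketch under stated assumptions: TypeRegistry, its add and find methods, and the starting id of 1000 are hypothetical names and values chosen for illustration, not the actual DataTypeManager API.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical, simplified registry; it is NOT the TerraLib DataTypeManager API.
class TypeRegistry
{
  public:

    // Singleton access: one registry for the whole program.
    static TypeRegistry& getInstance()
    {
      static TypeRegistry instance;
      return instance;
    }

    // Registers a new type name and returns the id generated for it.
    // Throws std::invalid_argument if the name is empty, contains lowercase
    // letters, or is already registered.
    int add(const std::string& name)
    {
      if(name.empty())
        throw std::invalid_argument("type name must not be empty");

      for(std::string::size_type i = 0; i < name.size(); ++i)
        if(std::islower(static_cast<unsigned char>(name[i])))
          throw std::invalid_argument("type names must use capital letters: " + name);

      if(m_ids.find(name) != m_ids.end())
        throw std::invalid_argument("type name already registered: " + name);

      const int id = m_nextId++;   // the id is generated by the registry
      m_ids[name] = id;
      return id;
    }

    // Returns the id registered for a name, or -1 if the name is unknown.
    int find(const std::string& name) const
    {
      std::map<std::string, int>::const_iterator it = m_ids.find(name);
      return (it == m_ids.end()) ? -1 : it->second;
    }

    TypeRegistry(const TypeRegistry&) = delete;
    TypeRegistry& operator=(const TypeRegistry&) = delete;

  private:

    TypeRegistry() : m_nextId(1000) {}   // assumed start of the dynamic id range

    std::map<std::string, int> m_ids;
    int m_nextId;
};

// Usage example (call once: registering the same name twice throws).
inline void typeRegistryExample()
{
  const int id = TypeRegistry::getInstance().add("MY_POINT_3D");
  assert(TypeRegistry::getInstance().find("MY_POINT_3D") == id);
}
```

Digits and symbols pass the check because only lowercase letters are rejected, which matches the third constraint.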

AbstractData

This is the base class for values that can be retrieved from the data access module using the getValue method of the DataSet or DataSetItem classes. This class provides the basic extensibility for data types. By implementing this interface, you can handle new data type values in the data access module.

AbstractData Class

As the class diagram shows, every data type supported by TerraLib, such as Geometry or Raster, is a subclass of AbstractData and can therefore be handled in the DataSet and DataSetItem API either as a built-in type (for example, via the getGeometry method) or as a general value via the getValue method.
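
The following sketch shows the general idea of a common base class for values. AbstractValue is a hypothetical stand-in for AbstractData; its method names (clone, getTypeCode, toString), the Point3D type and the code MY_POINT3D_DT are assumptions made for illustration and may differ from the real TerraLib signatures.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical, simplified interface modelled on the description above;
// the method names are assumptions, not the exact TerraLib signatures.
class AbstractValue
{
  public:
    virtual ~AbstractValue() {}
    virtual AbstractValue* clone() const = 0;    // deep copy, the caller owns the result
    virtual int getTypeCode() const = 0;         // code identifying the concrete type
    virtual std::string toString() const = 0;    // textual form of the value
};

// Illustrative code for a user-defined type; a real code would be
// obtained when the type is registered.
const int MY_POINT3D_DT = 1000;

// A new data type that any API written against AbstractValue can handle.
class Point3D : public AbstractValue
{
  public:
    Point3D(double x, double y, double z) : m_x(x), m_y(y), m_z(z) {}

    AbstractValue* clone() const { return new Point3D(*this); }
    int getTypeCode() const { return MY_POINT3D_DT; }

    std::string toString() const
    {
      std::ostringstream os;
      os << "POINT Z(" << m_x << " " << m_y << " " << m_z << ")";
      return os.str();
    }

  private:
    double m_x, m_y, m_z;
};

// Generic code only needs the base interface.
inline std::string describe(const AbstractValue& v)
{
  std::ostringstream os;
  os << "type " << v.getTypeCode() << ": " << v.toString();
  return os.str();
}

// Usage example: clone through the base class and describe the copy.
inline std::string point3DExample()
{
  Point3D p(1.0, 2.0, 3.0);
  std::unique_ptr<AbstractValue> copy(p.clone());   // owned deep copy
  assert(copy->getTypeCode() == MY_POINT3D_DT);
  return describe(*copy);
}
```

The describe function never needs to know about Point3D, which is the property that lets a data set return new kinds of values as general values.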

Basic Data Types

For more information on how to create a new data type, see the Data Type Extensibility section below.

Byte Array

The byte array class can be used to represent binary data. Most data sources come with a BLOB (or CLOB) type that can be mapped to the byte array data type. It is a copy constructible type.

A byte array object can be constructed from a new buffer or from a pre-existing one, which avoids the overhead of copying data. A byte array has an internal capacity and an internal pointer that marks how much of the internal buffer is in use.
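
A minimal sketch of these ideas follows. ByteBuffer and its members are hypothetical names used only for illustration; they are not the TerraLib byte array API. The sketch assumes that an adopted buffer was allocated with new[] and that ownership passes to the object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch of a byte buffer with a capacity and a "bytes used" marker.
// It is not the TerraLib byte array class.
class ByteBuffer
{
  public:

    // Allocates a new, empty buffer with the given capacity.
    explicit ByteBuffer(std::size_t capacity)
      : m_data(new char[capacity]), m_capacity(capacity), m_used(0) {}

    // Adopts an existing buffer allocated with new[]; no bytes are copied.
    ByteBuffer(char* data, std::size_t capacity, std::size_t used)
      : m_data(data), m_capacity(capacity), m_used(used)
    {
      assert(data != 0 && used <= capacity);
    }

    // Copy constructor: deep copy of the used part of the buffer.
    ByteBuffer(const ByteBuffer& rhs)
      : m_data(new char[rhs.m_capacity]), m_capacity(rhs.m_capacity), m_used(rhs.m_used)
    {
      std::memcpy(m_data, rhs.m_data, rhs.m_used);
    }

    ByteBuffer& operator=(const ByteBuffer&) = delete;   // left out of the sketch

    ~ByteBuffer() { delete [] m_data; }

    // Appends bytes after the used region; returns false if they do not fit.
    bool append(const char* bytes, std::size_t n)
    {
      if(n > m_capacity - m_used)
        return false;

      std::memcpy(m_data + m_used, bytes, n);
      m_used += n;
      return true;
    }

    const char* data() const { return m_data; }
    std::size_t capacity() const { return m_capacity; }
    std::size_t bytesUsed() const { return m_used; }

  private:

    char* m_data;            // internal buffer
    std::size_t m_capacity;  // total size of the internal buffer
    std::size_t m_used;      // how much of the buffer is in use
};
```

The adopting constructor is what avoids the copy: the object simply stores the pointer it is given and releases it in the destructor.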

Byte Array Class

Date and Time Types

This module introduces an abstract base class named DateTime for date and time classes based on ISO 8601 and ISO 19108. Internally, these classes use the Boost.Date_Time library. The class diagram shows the following classes:

  • Date: a class to represent dates based on the Gregorian Calendar. Internally, it uses boost::gregorian::date.
  • DateDuration: it is a simple day count used for arithmetic with date. Internally, it uses boost::gregorian::date_duration.
  • DatePeriod: it represents a range between two dates. Internally, it uses boost::gregorian::date_period.
  • TimeDuration: it represents time duration. Internally, it uses boost::posix_time::time_duration.
  • TimeInstant: it is a time point composed of a Gregorian date portion and a time portion. Internally, it uses boost::posix_time::ptime.
  • TimePeriod: representation for ranges between two times. Internally, it uses boost::posix_time::time_period.
  • TimeInstantTZ: it is a time point with time zone information. Internally, it uses boost::local_time::local_date_time.
  • TimePeriodTZ: representation for ranges between two times accounting for time zone. Internally, it uses boost::local_time::local_time_period.

Date and Time Classes

Numeric Type

String Type

Array Type

SimpleData

The class SimpleData is a template for atomic data types (integers, floats, strings, booleans and numerics). Most of the atomic types are just typedefs.

SimpleData template class

Requirements on type T:

  • T must be a copy constructible type.
  • T must be usable with output streams via operator<<.

typedef SimpleData<char, TE_CHAR_DT> Char;
typedef SimpleData<unsigned char, TE_UCHAR_DT> UChar;
typedef SimpleData<boost::int16_t, TE_INT16_DT> Int16;
typedef SimpleData<boost::uint16_t, TE_UINT16_DT> UInt16;
typedef SimpleData<boost::int32_t, TE_INT32_DT> Int32;
typedef SimpleData<boost::uint32_t, TE_UINT32_DT> UInt32;
typedef SimpleData<boost::int64_t, TE_INT64_DT> Int64;
typedef SimpleData<boost::uint64_t, TE_UINT64_DT> UInt64;
typedef SimpleData<bool, TE_BOOLEAN_DT> Boolean;
typedef SimpleData<float, TE_FLOAT_DT> Float;
typedef SimpleData<double, TE_DOUBLE_DT> Double;
typedef SimpleData<std::string, TE_NUMERIC_DT> Numeric;
typedef SimpleData<std::string, TE_STRING_DT> String;
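
To make the two requirements on T concrete, here is a reduced sketch of such a template. SimpleValue, the SKETCH_*_DT constants and the member names are hypothetical; the sketch also uses std::int32_t from the standard library instead of boost::int32_t so that it has no external dependency.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Illustrative type codes; real code would use the TE_*_DT macros.
const int SKETCH_INT32_DT = 7;
const int SKETCH_STRING_DT = 15;

// Hypothetical, simplified version of a SimpleData-like template.
// T must be copy constructible and usable with operator<<.
template<class T, int typeCode>
class SimpleValue
{
  public:

    // Copies the value: this is why T must be copy constructible.
    explicit SimpleValue(const T& value) : m_value(value) {}

    const T& getValue() const { return m_value; }

    int getTypeCode() const { return typeCode; }

    std::string toString() const
    {
      std::ostringstream os;
      os << m_value;   // this is why T must support operator<<
      return os.str();
    }

  private:

    T m_value;
};

// Concrete atomic types are then just typedefs.
typedef SimpleValue<std::int32_t, SKETCH_INT32_DT> Int32Value;
typedef SimpleValue<std::string, SKETCH_STRING_DT> StringValue;

// Usage example.
inline void simpleValueExample()
{
  Int32Value v(42);
  assert(v.getTypeCode() == SKETCH_INT32_DT);
  assert(v.toString() == "42");
}
```

Passing the type code as a template argument means each typedef carries its code at compile time, with no extra storage per value.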

Composites

A composite is a data type which can be constructed using primitive data types and other composite types. It is acceptable for the pieces of a composite to be composite types themselves. This type can be used for example to map the traditional composite type of database systems.
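
The nesting described above can be sketched as follows. Value, DoubleValue and CompositeValue are hypothetical classes written for this illustration; Value merely stands in for the abstract data class of the module.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal value interface, standing in for the abstract data class.
struct Value
{
  virtual ~Value() {}
  virtual std::string toString() const = 0;
};

// A primitive piece.
struct DoubleValue : Value
{
  explicit DoubleValue(double v) : value(v) {}

  std::string toString() const
  {
    std::ostringstream os;
    os << value;
    return os.str();
  }

  double value;
};

// A composite is itself a Value, so composites can contain other composites.
class CompositeValue : public Value
{
  public:

    // Adds a piece (primitive or composite) to the end of the composite.
    void add(const std::shared_ptr<Value>& piece)
    {
      assert(piece);   // null pieces are not allowed in this sketch
      m_pieces.push_back(piece);
    }

    std::size_t size() const { return m_pieces.size(); }

    // Renders the pieces in order, separated by commas, inside parentheses.
    std::string toString() const
    {
      std::ostringstream os;
      os << "(";
      for(std::size_t i = 0; i < m_pieces.size(); ++i)
        os << (i ? ", " : "") << m_pieces[i]->toString();
      os << ")";
      return os.str();
    }

  private:

    std::vector<std::shared_ptr<Value> > m_pieces;
};
```

With these classes, a composite holding 0.5 and an inner composite holding 1.5 and 2 renders as "(0.5, (1.5, 2))".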

Composite Data Type

Data Type Mapping

This module introduces a set of classes for dealing with data type conversions. The AbstractDataConverter is a helper class that applications can use to convert data values to the right types. In the data access module, each data source can publish its list of conversion operations, which applications can then use to guide the data conversion process.

Data Type Mapping Classes

The DataConverterManager is a singleton for managing the data type converter objects available in the system. All converters available in the system must be registered in this singleton. Data sources can maintain pointers to these converters, so operations that change this singleton must be used only by data access driver developers.

There are some built-in converters:

  • Int32ToStringConverter: a converter from Int32 data values to String.
  • <color red>This list is to be continued</color>
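
The idea of a central place where converters are registered and looked up by source and target type can be sketched as below. Everything here is an assumption made for illustration: ConverterRegistry, the Int32Converter function-pointer type and int32ToString are hypothetical and much simpler than the AbstractDataConverter and DataConverterManager classes, which operate on data objects.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Illustrative type codes; real code would use TE_INT32_DT and TE_STRING_DT.
const int SKETCH_INT32_DT = 7;
const int SKETCH_STRING_DT = 15;

// Hypothetical converter signature: for brevity, this sketch converts a plain
// 32-bit integer to a std::string instead of working on abstract data objects.
typedef std::string (*Int32Converter)(std::int32_t);

// Converts a 32-bit integer to its decimal text form.
inline std::string int32ToString(std::int32_t v)
{
  std::ostringstream os;
  os << v;
  return os.str();
}

// Hypothetical registry keyed by (source type code, target type code).
class ConverterRegistry
{
  public:

    static ConverterRegistry& getInstance()
    {
      static ConverterRegistry instance;
      return instance;
    }

    // Registers (or replaces) the converter for a pair of type codes.
    void add(int srcType, int dstType, Int32Converter conv)
    {
      assert(conv != 0);
      m_converters[std::make_pair(srcType, dstType)] = conv;
    }

    // Returns the converter for the pair; throws std::out_of_range if none exists.
    Int32Converter get(int srcType, int dstType) const
    {
      std::map<std::pair<int, int>, Int32Converter>::const_iterator it =
        m_converters.find(std::make_pair(srcType, dstType));

      if(it == m_converters.end())
        throw std::out_of_range("no converter registered for this pair of types");

      return it->second;
    }

  private:

    ConverterRegistry() {}

    std::map<std::pair<int, int>, Int32Converter> m_converters;
};
```

An application would first ask the registry for the converter of a given pair and then apply it to each value, which keeps the lookup out of the per-value loop.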

Properties

The property classes can be used to model the definition of properties or sets of values.

Property Classes

The base abstract class Property defines the common information about the value of a given property or set of values. It includes:

  • The data type associated with the property.
  • Any restrictions on the values of the property.

Property Class

The SimpleProperty class represents an atomic property such as an integer or a double. It may have a default value, and it may indicate whether the value is auto-incremented or always required.
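
As a rough illustration of the information such a property carries, consider the sketch below. SimplePropertySketch, its fields and the canOmitValue rule are hypothetical; they only mirror the description above and are not the TerraLib SimpleProperty API.

```cpp
#include <cassert>
#include <string>

// Illustrative type codes; real code would use the TE_*_DT macros.
const int SKETCH_INT32_DT = 7;
const int SKETCH_STRING_DT = 15;

// Hypothetical, simplified property description; not the TerraLib SimpleProperty API.
struct SimplePropertySketch
{
  std::string name;          // property name
  int typeCode;              // data type code of the values
  bool isRequired;           // true when a value must always be present
  bool isAutoIncrement;      // true when values are generated automatically
  bool hasDefault;           // true when defaultValue is meaningful
  std::string defaultValue;  // default value in textual form

  SimplePropertySketch(const std::string& n, int code)
    : name(n), typeCode(code), isRequired(false),
      isAutoIncrement(false), hasDefault(false) {}
};

// Assumed rule for this sketch: a value may be omitted when the property is
// optional, or when the system can supply one (auto-increment or default).
inline bool canOmitValue(const SimplePropertySketch& p)
{
  return !p.isRequired || p.isAutoIncrement || p.hasDefault;
}

// Usage example: a required, auto-incremented identifier.
inline void simplePropertyExample()
{
  SimplePropertySketch id("id", SKETCH_INT32_DT);
  id.isRequired = true;
  id.isAutoIncrement = true;
  assert(canOmitValue(id));
}
```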

SimpleProperty Class

The classes StringProperty, NumericProperty, DateTimeProperty and ArrayProperty refine the SimpleProperty class, adding more semantics to the represented properties.

The CompositeProperty class is a base class for compound properties (non-atomic properties).

CompositeProperty Class

The Geometry Module and Raster Module add new properties (GeometryProperty and RasterProperty) to describe geometric and raster properties.

Exceptions

This module introduces an exception class te::dt::Exception to help catch specific exceptions thrown by this module. The following class diagram shows the exception class in detail:

Data type module exception class

Data Type Extensibility

<color red>TO BE DONE: from this point on, the documentation is under construction.</color>

Creation of New Data Types

You can create a new data type by registering it in the DataTypeManager. For each data type you must supply a unique name, and the manager will then assign it a GID (global ID)…

Besides registering the data type you must also provide an abstract data implementation for the new type….

For each data source driver where this new data type will be used you must register two routines to convert instances of the data type from/to the driver internal representation. These routines allow wide flexibility for data source implementations…

In the case of data sources you should also register it in the data source driver. See the data type code table above if you want to know the codes of the basic data types of TerraLib.

There is also a data type catalog called DataTypeManager, a singleton that keeps information about all available data types in the system. In this singleton you will find all supported data types. Some of the data type codes are reserved for the primitive types in TerraLib. The basic data types all have fixed codes that can be seen in the config definitions.

Registering the Data Type in the TerraLib Query Language

From Theory to Practice

Module Summary

-------------------------------------------------------------------------------
Language          files     blank   comment      code    scale   3rd gen. equiv
-------------------------------------------------------------------------------
C++                  28       536       534      1133 x   1.51 =        1710.83
C/C++ Header         30      1233      2073      1021 x   1.00 =        1021.00
-------------------------------------------------------------------------------
SUM:                 58      1769      2607      2154 x   1.27 =        2731.83
-------------------------------------------------------------------------------

Besides the C++ code there is also…

Final Remarks

  • We still need the Numeric and Array types!
  • We need to add a serialization/deserialization signature to all data types.
  • Think about overloading operator<< and operator>> and about the io type (binary, text, xml, json, db).
  • Remove the copy and allocation overhead in the byte array toString method.
  • We must consider that boost::gregorian::date is stored as a 32-bit integer type and is specifically designed NOT to contain virtual functions, because this design allows for efficient calculation and memory usage with large collections of dates.

If you want more information about the use of data types, please refer to the following classes/concepts:

References

