Dimensions of EO Data Cubes

Henryk Hodam

A dimension refers to an axis of a datacube. This includes all variables (e.g. bands), which are also represented as dimensions. Our example raster datacube has the spatial dimensions x and y and the temporal dimension t. It also has a bands dimension, which describes what kind of information each value in the cube holds.

The following properties are usually available for dimensions:

name
type (potential types include: spatial (raster or vector data), temporal and other data such as bands)
axis (for spatial dimensions) / number
labels (usually exposed through textual or numerical representations, in the metadata as nominal values and/or extents)
reference system / projection
resolution / step size
unit (either explicitly specified or implicitly given by the reference system)
additional information specific to the dimension type (e.g. the geometry types for a dimension containing geometries)

Here is an overview of the dimensions contained in our example raster datacube above:

#	name	type	labels	resolution	reference system
1	`x`	spatial	`466380`, `466580`, `466780`, `466980`, `467180`, `467380`	200m	EPSG:32627
2	`y`	spatial	`7167130`, `7166930`, `7166730`, `7166530`, `7166330`, `7166130`, `7165930`	200m	EPSG:32627
3	`bands`	bands	`blue`, `green`, `red`, `nir`	4 bands	–
4	`t`	temporal	`2020-10-01`, `2020-10-13`, `2020-10-25`	12 days	Gregorian calendar / UTC

Table 1: Overview of the dimensions contained in our example raster datacube
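As a minimal sketch, the dimensions in Table 1 could be written down as plain Python objects. The `Dimension` class and its field names are hypothetical and only mirror the properties listed above; they are not taken from any particular datacube library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

Label = Union[int, str]


@dataclass(frozen=True)
class Dimension:
    """Hypothetical container mirroring the dimension properties listed above."""
    name: str
    type: str                                # "spatial", "temporal", "bands", ...
    labels: Tuple[Label, ...]
    resolution: Optional[str] = None
    reference_system: Optional[str] = None


# Values copied from Table 1.
dimensions = [
    Dimension("x", "spatial",
              (466380, 466580, 466780, 466980, 467180, 467380),
              "200m", "EPSG:32627"),
    Dimension("y", "spatial",
              (7167130, 7166930, 7166730, 7166530, 7166330, 7166130, 7165930),
              "200m", "EPSG:32627"),
    Dimension("bands", "bands", ("blue", "green", "red", "nir"), "4 bands"),
    Dimension("t", "temporal",
              ("2020-10-01", "2020-10-13", "2020-10-25"),
              "12 days", "Gregorian calendar / UTC"),
]

# The number of labels per dimension gives the shape of the cube.
shape = {dim.name: len(dim.labels) for dim in dimensions}

# The cube holds one scalar value per combination of labels.
n_values = 1
for size in shape.values():
    n_values *= size

print(shape)     # {'x': 6, 'y': 7, 'bands': 4, 't': 3}
print(n_values)  # 504
```

The shape follows directly from the labels: 6 x 7 x 4 x 3 = 504 scalar values in the example cube.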

Dimension labels are usually either numerical or text (also known as “strings”). Text labels include textual representations of timestamps or geometries: temporal labels are usually encoded as ISO 8601 compatible dates and/or times, and geometries can be encoded as Well-known Text (WKT) or represented by their IDs.
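To illustrate how textual temporal labels map to date values, the following sketch parses the labels of the t dimension with Python's standard library. The helper name `parse_temporal_label` is an assumption made for this example, and it only handles the ISO 8601 forms that `fromisoformat` accepts.

```python
from datetime import date, datetime


def parse_temporal_label(label: str):
    """Parse an ISO 8601 date or date-time label (hypothetical helper)."""
    try:
        return date.fromisoformat(label)       # e.g. "2020-10-01"
    except ValueError:
        return datetime.fromisoformat(label)   # e.g. "2020-10-01T10:30:00+00:00"


labels = ["2020-10-01", "2020-10-13", "2020-10-25"]
parsed = [parse_temporal_label(label) for label in labels]

# Once parsed, labels can be compared and subtracted like any other date.
print(parsed[1] - parsed[0])  # 12 days, 0:00:00
```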

Dimensions with a natural/inherent order (usually all temporal and spatial raster dimensions) are always sorted. Dimensions without inherent order (usually bands) retain the order in which they have been defined in metadata or processes (e.g. through filter_bands), with new labels simply being appended to the existing labels.
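The two ordering rules can be sketched as follows. The helper `add_label` and the band label `ndvi` are hypothetical, and the sketch assumes ascending order for ordered dimensions (note that the y labels in Table 1 are listed in descending order).

```python
import bisect


def add_label(labels, new_label, ordered):
    """Return a new label list with new_label added (hypothetical helper).

    ordered=True  -> keep ascending order (temporal / spatial raster dimensions)
    ordered=False -> append, preserving definition order (e.g. bands)
    """
    result = list(labels)
    if ordered:
        bisect.insort(result, new_label)
    else:
        result.append(new_label)
    return result


# ISO 8601 date strings sort chronologically when compared as text.
t = add_label(["2020-10-01", "2020-10-25"], "2020-10-13", ordered=True)
bands = add_label(["blue", "green", "red", "nir"], "ndvi", ordered=False)

print(t)      # ['2020-10-01', '2020-10-13', '2020-10-25']
print(bands)  # ['blue', 'green', 'red', 'nir', 'ndvi']
```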

Geometry as a Dimension

A geometry dimension is not included in the example raster datacube above and is not used in the following examples. To illustrate, a vector dimension with two polygons could look like this:

name	type	labels	reference system
`geometry`	vector	`POLYGON((-122.4 37.6,-122.35 37.6,-122.35 37.64,-122.4 37.64,-122.4 37.6))`, `POLYGON((-122.51 37.5,-122.48 37.5,-122.48 37.52,-122.51 37.52,-122.51 37.5))`	EPSG:4326

Table 2: Geometry as a Dimension

A dimension with geometries can consist of points, linestrings, polygons, multi points, multi linestrings, or multi polygons. Geometry types cannot be mixed, except that a single geometry type can be combined with its corresponding multi type in one dimension (e.g. points and multi points).
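The mixing rule can be expressed as a small check. This is a sketch only: the function names are hypothetical, and geometry types are assumed to be given as names such as `Point` or `MultiPoint`.

```python
def base_type(geometry_type: str) -> str:
    """Strip a leading 'Multi' so that Point and MultiPoint compare equal."""
    name = geometry_type.lower()
    return name[len("multi"):] if name.startswith("multi") else name


def can_share_dimension(geometry_types) -> bool:
    """True if all types are one single type, optionally with its multi variant."""
    return len({base_type(t) for t in geometry_types}) == 1


print(can_share_dimension(["Point", "MultiPoint"]))      # True
print(can_share_dimension(["Polygon", "Polygon"]))       # True
print(can_share_dimension(["Point", "Polygon"]))         # False
print(can_share_dimension(["LineString", "MultiPoint"])) # False
```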

EO datacubes contain scalar values (e.g. strings, numbers or boolean values), with all other associated attributes stored in dimensions (e.g. coordinates or timestamps). Attributes such as the CRS or the sensor can also be turned into dimensions. Be advised that in such a case, the uniqueness of pixel coordinates may be affected. Usually, (x, y) refers to a unique location; this changes to (x, y, CRS) when (x, y) values are reused in other coordinate reference systems (e.g. two neighboring UTM zones).

Coordinate Reference System as a Dimension

In the example above, x and y dimension values have a unique relationship to world coordinates through their coordinate reference system (CRS). This implies that a single coordinate reference system is associated with these x and y dimensions. If we want to create a data cube from multiple tiles spanning different coordinate reference systems (e.g. Sentinel-2: different UTM zones), we would have to resample/warp those to a single coordinate reference system. In many cases, this is desired because we want to be able to view the result, which requires it to be available in a single coordinate reference system.

Resampling is, however, costly, involves (some) data loss, and is in general not reversible. Suppose that we want to work only on the spectral and temporal dimensions of a data cube, and do not want to do any resampling. In that case, one could create one data cube for each coordinate reference system. An alternative would be to create a single data cube that contains all tiles and has an additional dimension for the coordinate reference system. In that data cube, x and y no longer point to a unique world coordinate, because identical x and y coordinate pairs occur in each UTM zone. Now, only the combination (x, y, crs) has a unique relationship to the world coordinates.
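The loss of uniqueness can be shown with two pixels from neighboring UTM zones that happen to share the same x and y values. The pixel values below are made up for illustration, and EPSG:32628 is assumed as the neighboring zone.

```python
# Two pixels from neighboring UTM zones with identical (x, y) values.
# The "value" entries are illustrative numbers, not real measurements.
pixels = [
    {"x": 466380, "y": 7167130, "crs": "EPSG:32627", "value": 0.21},
    {"x": 466380, "y": 7167130, "crs": "EPSG:32628", "value": 0.58},
]

# Keyed by (x, y) only, the second pixel silently overwrites the first.
by_xy = {(p["x"], p["y"]): p["value"] for p in pixels}

# Keyed by (x, y, crs), both pixels keep their own entry.
by_xy_crs = {(p["x"], p["y"], p["crs"]): p["value"] for p in pixels}

print(len(by_xy))      # 1
print(len(by_xy_crs))  # 2
```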

On such a crs-dimensioned data cube, several operations make perfect sense, such as apply or reduce_dimension on spectral and/or temporal dimensions. A simple reduction over the crs dimension using sum or mean would typically not make sense. The only meaningful “reduction” (removal) of the crs dimension involves resampling/warping all sub-cubes along the crs dimension to a single, common target coordinate reference system.
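As a rough sketch of a reduction that does make sense, the following computes a temporal mean separately for each entry of the crs dimension. The nested dictionary stands in for a cube with a single pixel and band, and all numbers are illustrative; this is not how reduce_dimension is implemented.

```python
from statistics import mean

# cube[crs][t] -> value for one pixel and one band (illustrative numbers).
cube = {
    "EPSG:32627": {"2020-10-01": 0.2, "2020-10-13": 0.4, "2020-10-25": 0.6},
    "EPSG:32628": {"2020-10-01": 0.1, "2020-10-13": 0.3, "2020-10-25": 0.5},
}

# Reducing over t removes the temporal dimension but keeps the crs dimension:
# each sub-cube is reduced on its own, so no resampling is needed.
temporal_mean = {crs: mean(series.values()) for crs, series in cube.items()}

print(sorted(temporal_mean))  # ['EPSG:32627', 'EPSG:32628']
```

Averaging the two resulting values across the crs dimension would combine pixels that refer to different locations, which is why that reduction is not meaningful without warping first.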

Resolution

The resolution of a dimension describes the interval between observations. This is most obvious for the temporal resolution, where the interval indicates how often observations were made. Spatial resolution describes the pixel spacing, i.e. how many ‘real world meters’ a pixel covers. The number of bands and their wavelength intervals give information about the spectral resolution.
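For regularly spaced dimensions, the resolution can be recovered from the labels themselves. The sketch below uses the labels from Table 1; `step_sizes` is a hypothetical helper and assumes labels that support subtraction.

```python
from datetime import date, timedelta


def step_sizes(labels):
    """Set of absolute differences between consecutive labels."""
    return {abs(b - a) for a, b in zip(labels, labels[1:])}


x = [466380, 466580, 466780, 466980, 467180, 467380]
y = [7167130, 7166930, 7166730, 7166530, 7166330, 7166130, 7165930]
t = [date(2020, 10, 1), date(2020, 10, 13), date(2020, 10, 25)]

print(step_sizes(x))  # {200}
print(step_sizes(y))  # {200}
print(step_sizes(t) == {timedelta(days=12)})  # True
```

A set with a single element indicates a regular step; several elements would indicate irregular spacing, in which case a single resolution value does not fully describe the dimension.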

