The STAC catalog

STAC stands for SpatioTemporal Asset Catalog. It is a community specification that provides a common way to describe and catalog assets connected to space and time, usually but not necessarily on Earth. The specification focuses on organizing and sharing geospatial data in a way that is accessible, interoperable, and scalable. It consists of four semi-independent specifications (Catalog, Collection, Item, and API), which can be used on their own or together, and all of them can be enriched by a variety of extensions. STAC is relatively new, but a growing number of data providers are adopting it, and it is seen as the future of EO data cataloging and discovery. The data model in the dataspace is still evolving toward full compliance with all standardized properties. For these reasons, this tutorial gives STAC more attention than other catalogue protocols.

The components of STAC

The STAC specification is divided into three main parts:

  • STAC specification for static catalogs, which consists of three parts:
    • STAC Items
    • STAC Catalogs
    • STAC Collections
  • STAC API specification for dynamic catalogs.
  • STAC extensions (both for static STAC and the STAC API)

These components are fairly independent, but they work together and use links to express the relationships between them, so that clients can traverse from one to another. Links to the actual spatio-temporal data files that the STAC metadata describes are handled separately and are called STAC Assets. Assets can be made available in Items and Collections.

STAC hierarchy (Image by Matthias Mohr from https://mohr.ws/foss4g/)

STAC Item

A STAC Item is the foundational building block of STAC. It is a GeoJSON Feature supplemented with additional metadata that enables clients to traverse through catalogs. Because an Item is a GeoJSON Feature, any modern GIS or geospatial library can read it. One Item can describe one or more spatio-temporal assets. For example, a common practice for imagery is to make each band in a scene its own STAC Asset and to use a single STAC Item to represent all the bands of that scene.

The STAC Item JSON specification uses standard GeoJSON fields as well as a few additional informational fields to describe the asset(s) more thoroughly.

A STAC Item (like the other components) has required fields that must always be filled in. In the example below, required fields such as type, stac_version, and id are present. The properties field is also required; here it is additionally extended by several STAC extensions, whose fields follow the pattern extension_prefix:field_name (for example, eo:cloud_cover). The extensions in use are listed in the stac_extensions field. The complete STAC Item specification can be found on GitHub.
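The required-field rule and the prefix convention described above can be sketched in a few lines of Python. This is only an illustration, not a replacement for a real validator: the function names and the "core" group label are made up for this example, the sample Item is heavily trimmed, and the required-field list reflects the STAC Item 1.0.0 specification as understood here.

```python
from collections import defaultdict

# Top-level fields required by the STAC Item spec (v1.0.0).
# "bbox" is handled separately: it is required only when geometry is not null.
REQUIRED_ITEM_FIELDS = (
    "type", "stac_version", "id", "geometry", "properties", "links", "assets",
)


def missing_required_fields(item):
    """Return the names of required fields that the Item dict lacks."""
    missing = [name for name in REQUIRED_ITEM_FIELDS if name not in item]
    if item.get("geometry") is not None and "bbox" not in item:
        missing.append("bbox")
    if "datetime" not in item.get("properties", {}):
        missing.append("properties.datetime")
    return missing


def group_properties_by_extension(properties):
    """Group property keys by the prefix before ':' (hypothetical helper).

    Keys without a prefix are collected under the illustrative label "core".
    """
    groups = defaultdict(dict)
    for key, value in properties.items():
        prefix, separator, field = key.partition(":")
        if separator:
            groups[prefix][field] = value
        else:
            groups["core"][key] = value
    return dict(groups)


# Trimmed-down Item; geometry is set to null (allowed by the spec) for brevity.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "S2B_43SCR_20231123_0_L2A",
    "geometry": None,
    "properties": {
        "datetime": "2023-11-23T05:50:10.118000Z",
        "platform": "sentinel-2b",
        "eo:cloud_cover": 0.414053,
        "proj:epsg": 32643,
        "mgrs:utm_zone": 43,
    },
    "links": [],
    "assets": {},
}

missing = missing_required_fields(item)
groups = group_properties_by_extension(item["properties"])
print("missing:", missing)
print("prefixes:", sorted(groups))
```

For production use, validating against the published JSON Schemas is the safer route; the sketch only shows the shape of the rules.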

Example of a Sentinel-2 L2A STAC Item with one band asset.
{
   "type":"Feature",
   "stac_version":"1.0.0",
   "id":"S2B_43SCR_20231123_0_L2A",
   "properties":{
      "created":"2023-11-23T08:25:34.597Z",
      "platform":"sentinel-2b",
      "constellation":"sentinel-2",
      "instruments":[
         "msi"
      ],
      "eo:cloud_cover":0.414053,
      "proj:epsg":32643,
      "mgrs:utm_zone":43,
      "mgrs:latitude_band":"S",
      "mgrs:grid_square":"CR",
      "grid:code":"MGRS-43SCR",
      "view:sun_azimuth":161.976971280106,
      "view:sun_elevation":35.5947611411469,
      "s2:degraded_msi_data_percentage":0,
      "s2:nodata_pixel_percentage":77.337396,
      "s2:saturated_defective_pixel_percentage":0,
      "s2:dark_features_percentage":0,
      "s2:cloud_shadow_percentage":0.002591,
      "s2:vegetation_percentage":0.010072,
      "s2:not_vegetated_percentage":98.205149,
      "s2:water_percentage":0.9589,
      "s2:unclassified_percentage":0.38101,
      "s2:medium_proba_clouds_percentage":0.410817,
      "s2:high_proba_clouds_percentage":0.003235,
      "s2:thin_cirrus_percentage":0,
      "s2:snow_ice_percentage":0.028226,
      "s2:product_type":"S2MSI2A",
      "s2:processing_baseline":"05.09",
      "s2:product_uri":"S2B_MSIL2A_20231123T054139_N0509_R005_T43SCR_20231123T070647.SAFE",
      "s2:generation_time":"2023-11-23T07:06:47.000000Z",
      "s2:datatake_id":"GS2B_20231123T054139_035066_N05.09",
      "s2:datatake_type":"INS-NOBS",
      "s2:datastrip_id":"S2B_OPER_MSI_L2A_DS_2BPS_20231123T070647_S20231123T054133_N05.09",
      "s2:granule_id":"S2B_OPER_MSI_L2A_TL_2BPS_20231123T070647_A035066_T43SCR_N05.09",
      "s2:reflectance_conversion_factor":1.02412472897181,
      "datetime":"2023-11-23T05:50:10.118000Z",
      "s2:sequence":"0",
      "earthsearch:s3_path":"s3://sentinel-cogs/sentinel-s2-l2a-cogs/43/S/CR/2023/11/S2B_43SCR_20231123_0_L2A",
      "earthsearch:payload_id":"roda-sentinel2/workflow-sentinel2-to-stac/ee5536069c6dd3a13ae2dbd530beafeb",
      "earthsearch:boa_offset_applied":true,
      "processing:software":{
         "sentinel2-to-stac":"0.1.1"
      },
      "updated":"2023-11-23T08:25:34.597Z"
   },
   "geometry":{
      "type":"Polygon",
      "coordinates":[
         [
            [
               73.91301155493866,
               32.53264844646899
            ],
            [
               73.65302362425264,
               31.539681428386213
            ],
            [
               74.04974127686076,
               31.543245006457585
            ],
            [
               74.03945448697993,
               32.53367782512506
            ],
            [
               73.91301155493866,
               32.53264844646899
            ]
         ]
      ]
   },
   "links":[
      {
         "rel":"self",
         "type":"application/geo+json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2B_43SCR_20231123_0_L2A"
      },
      {
         "rel":"canonical",
         "href":"s3://sentinel-cogs/sentinel-s2-l2a-cogs/43/S/CR/2023/11/S2B_43SCR_20231123_0_L2A/S2B_43SCR_20231123_0_L2A.json",
         "type":"application/json"
      },
      {
         "rel":"license",
         "href":"https://sentinel.esa.int/documents/247904/690755/Sentinel_Data_Legal_Notice"
      },
      {
         "rel":"derived_from",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l1c/items/S2B_43SCR_20231123_0_L1C",
         "type":"application/geo+json"
      },
      {
         "rel":"parent",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a"
      },
      {
         "rel":"collection",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a"
      },
      {
         "rel":"root",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1"
      },
      {
         "rel":"thumbnail",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2B_43SCR_20231123_0_L2A/thumbnail"
      }
   ],
   "assets":{
      "blue":{
         "href":"https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/43/S/CR/2023/11/S2B_43SCR_20231123_0_L2A/B02.tif",
         "type":"image/tiff; application=geotiff; profile=cloud-optimized",
         "title":"Blue (band 2) - 10m",
         "eo:bands":[
            {
               "name":"blue",
               "common_name":"blue",
               "description":"Blue (band 2)",
               "center_wavelength":0.49,
               "full_width_half_max":0.098
            }
         ],
         "gsd":10,
         "proj:shape":[
            10980,
            10980
         ],
         "proj:transform":[
            10,
            0,
            300000,
            0,
            -10,
            3600000
         ],
         "raster:bands":[
            {
               "nodata":0,
               "data_type":"uint16",
               "bits_per_sample":15,
               "spatial_resolution":10,
               "scale":0.0001,
               "offset":-0.1
            }
         ],
         "roles":[
            "data",
            "reflectance"
         ]
      },
      "thumbnail":{
         "href":"https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/43/S/CR/2023/11/S2B_43SCR_20231123_0_L2A/thumbnail.jpg",
         "type":"image/jpeg",
         "title":"Thumbnail image",
         "roles":[
            "thumbnail"
         ]
      }
   },
   "bbox":[
      73.65302362425264,
      31.539681428386213,
      74.04974127686076,
      32.53367782512506
   ],
   "stac_extensions":[
      "https://stac-extensions.github.io/view/v1.0.0/schema.json",
      "https://stac-extensions.github.io/grid/v1.0.0/schema.json",
      "https://stac-extensions.github.io/mgrs/v1.0.0/schema.json",
      "https://stac-extensions.github.io/raster/v1.1.0/schema.json",
      "https://stac-extensions.github.io/processing/v1.1.0/schema.json",
      "https://stac-extensions.github.io/eo/v1.1.0/schema.json",
      "https://stac-extensions.github.io/projection/v1.1.0/schema.json"
   ],
   "collection":"sentinel-2-l2a"
}
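Because an Item is plain GeoJSON, it can be inspected with nothing more than a JSON parser. The sketch below, written under the assumption that the geometry is a simple Polygon that does not cross the antimeridian, derives the bounding box from the exterior ring and lists the assets that carry a given role. The helper names are illustrative, and the Item is reduced to the fields the helpers read.

```python
def bbox_from_polygon(geometry):
    """Return [west, south, east, north] from a GeoJSON Polygon's exterior ring."""
    ring = geometry["coordinates"][0]
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return [min(lons), min(lats), max(lons), max(lats)]


def assets_with_role(item, role):
    """Map asset key -> href for every asset whose roles include `role`."""
    return {
        key: asset["href"]
        for key, asset in item["assets"].items()
        if role in asset.get("roles", [])
    }


# Values copied from the Sentinel-2 example; all other fields are omitted.
BASE = ("https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/"
        "43/S/CR/2023/11/S2B_43SCR_20231123_0_L2A/")
item = {
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [73.91301155493866, 32.53264844646899],
            [73.65302362425264, 31.539681428386213],
            [74.04974127686076, 31.543245006457585],
            [74.03945448697993, 32.53367782512506],
            [73.91301155493866, 32.53264844646899],
        ]],
    },
    "bbox": [73.65302362425264, 31.539681428386213,
             74.04974127686076, 32.53367782512506],
    "assets": {
        "blue": {"href": BASE + "B02.tif", "roles": ["data", "reflectance"]},
        "thumbnail": {"href": BASE + "thumbnail.jpg", "roles": ["thumbnail"]},
    },
}

derived_bbox = bbox_from_polygon(item["geometry"])
data_assets = assets_with_role(item, "data")
print(derived_bbox == item["bbox"])
print(data_assets)
```

The derived box matches the bbox field of the example Item, which shows how the two fields relate: bbox is a precomputed summary of geometry that clients can use for quick filtering.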

STAC Catalog

A STAC Catalog is an entity that logically groups other Catalogs, Collections, and Items. A Catalog contains links to these other entities and can include additional metadata to describe the entities it contains. A Catalog is usually the starting point for navigating a STAC. More specifically, a catalog.json file contains links to some combination of other STAC Catalogs, Collections, and/or Items. You can think of it as a directory on a computer, although it does not need to mirror the local directory tree.

There are no restrictions on the way STAC Catalogs are organized, so the combination of STAC components within a Catalog is variable and flexible. Many implementations use a set of sub-catalogs that group the Items in some sensible way, e.g. by year as a first level and by month as a second level. A Catalog can also be extended easily, for example with additional metadata that further describes its holdings, as the STAC Collection does.
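The year/month layout mentioned above can be sketched as a small link-following walker. Everything in this example is hypothetical: the DOCUMENTS dictionary stands in for JSON files on disk or behind URLs, the ids and hrefs are invented, and the documents are reduced to the fields the walker reads. Real catalogs often use relative hrefs that must first be resolved against the location of the linking document; that step is skipped here.

```python
# Hypothetical in-memory store: href -> parsed JSON document.
DOCUMENTS = {
    "catalog.json": {
        "type": "Catalog",
        "id": "example-root",
        "links": [
            {"rel": "root", "href": "catalog.json"},
            {"rel": "child", "href": "2023/catalog.json"},
        ],
    },
    "2023/catalog.json": {
        "type": "Catalog",
        "id": "year-2023",
        "links": [
            {"rel": "parent", "href": "catalog.json"},
            {"rel": "child", "href": "2023/11/catalog.json"},
        ],
    },
    "2023/11/catalog.json": {
        "type": "Catalog",
        "id": "month-2023-11",
        "links": [
            {"rel": "parent", "href": "2023/catalog.json"},
            {"rel": "item", "href": "2023/11/scene-a.json"},
            {"rel": "item", "href": "2023/11/scene-b.json"},
        ],
    },
    "2023/11/scene-a.json": {"type": "Feature", "id": "scene-a", "links": []},
    "2023/11/scene-b.json": {"type": "Feature", "id": "scene-b", "links": []},
}


def walk_items(href, documents):
    """Yield the id of every Item reachable from `href` via child/item links."""
    document = documents[href]
    if document["type"] == "Feature":
        yield document["id"]
        return
    for link in document.get("links", []):
        # Only follow downward links; "parent" and "root" would cause loops.
        if link["rel"] in ("child", "item"):
            yield from walk_items(link["href"], documents)


item_ids = list(walk_items("catalog.json", DOCUMENTS))
print(item_ids)
```

Following only the child and item relations is what keeps the traversal a simple tree walk, even though each document also links back to its parent and root.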

STAC Collection

A STAC Collection is similar to a STAC Catalog but includes, and in part requires, additional metadata about the set of Items it contains. It adds fields that describe, for example, the spatial and temporal extent of the data, the license, keywords, and providers. It can also be extended with collection-level metadata that is common to all of its children. For example, a summary could state that all Items underneath hold data at either 10 m or 30 m spatial resolution.

Example of a Sentinel-2 L2A STAC Collection.
{
   "type":"Collection",
   "id":"sentinel-2-l2a",
   "stac_version":"1.0.0",
   "description":"Global Sentinel-2 data from the Multispectral Instrument (MSI) onboard Sentinel-2",
   "links":[
      {
         "rel":"self",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a"
      },
      {
         "rel":"cite-as",
         "href":"https://doi.org/10.5270/S2_-742ikth",
         "title":"Copernicus Sentinel-2 MSI Level-2A (L2A) Bottom-of-Atmosphere Radiance"
      },
      {
         "rel":"license",
         "href":"https://sentinel.esa.int/documents/247904/690755/Sentinel_Data_Legal_Notice",
         "title":"proprietary"
      },
      {
         "rel":"parent",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1"
      },
      {
         "rel":"root",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1"
      },
      {
         "rel":"items",
         "type":"application/geo+json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items"
      },
      {
         "rel":"http://www.opengis.net/def/rel/ogc/1.0/queryables",
         "type":"application/schema+json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/queryables"
      },
      {
         "rel":"aggregate",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/aggregate",
         "method":"GET"
      },
      {
         "rel":"aggregations",
         "type":"application/json",
         "href":"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/aggregations"
      }
   ],
   "stac_extensions":[
      "https://stac-extensions.github.io/item-assets/v1.0.0/schema.json",
      "https://stac-extensions.github.io/view/v1.0.0/schema.json",
      "https://stac-extensions.github.io/scientific/v1.0.0/schema.json",
      "https://stac-extensions.github.io/raster/v1.1.0/schema.json",
      "https://stac-extensions.github.io/eo/v1.0.0/schema.json"
   ],
   "title":"Sentinel-2 Level 2A",
   "extent":{
      "spatial":{
         "bbox":[
            [
               -180,
               -90,
               180,
               90
            ]
         ]
      },
      "temporal":{
         "interval":[
            [
               "2015-06-27T10:25:31.456000Z",
               null
            ]
         ]
      }
   },
   "license":"proprietary",
   "keywords":[
      "sentinel",
      "earth observation",
      "esa"
   ],
   "providers":[
      {
         "name":"ESA",
         "roles":[
            "producer"
         ],
         "url":"https://earth.esa.int/web/guest/home"
      },
      {
         "name":"Sinergise",
         "roles":[
            "processor"
         ],
         "url":"https://registry.opendata.aws/sentinel-2/"
      },
      {
         "name":"AWS",
         "roles":[
            "host"
         ],
         "url":"http://sentinel-pds.s3-website.eu-central-1.amazonaws.com/"
      },
      {
         "name":"Element 84",
         "roles":[
            "processor"
         ],
         "url":"https://element84.com"
      }
   ],
   "summaries":{
      "platform":[
         "sentinel-2a",
         "sentinel-2b"
      ],
      "constellation":[
         "sentinel-2"
      ],
      "instruments":[
         "msi"
      ],
      "gsd":[
         10,
         20,
         60
      ],
      "view:off_nadir":[
         0
      ],
      "sci:doi":[
         "10.5270/s2_-znk9xsj"
      ],
      "eo:bands":[
         {
            "name":"blue",
            "common_name":"blue",
            "description":"Blue (band 2)",
            "center_wavelength":0.49,
            "full_width_half_max":0.098
         }
      ]
   }
}

Source URL: https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a
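The extent field is what lets a client decide quickly whether a Collection can contain data for a given place and time. A minimal sketch of that check follows, assuming a 2D bounding box, using only the first (overall) entry of each extent array, and treating a null interval end as "still ongoing". The helper names are illustrative.

```python
from datetime import datetime


def parse_rfc3339(value):
    """Parse a UTC timestamp such as 2015-06-27T10:25:31.456000Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def within_temporal_extent(collection, moment):
    """True if `moment` falls inside the Collection's overall time interval."""
    start, end = collection["extent"]["temporal"]["interval"][0]
    if start is not None and moment < parse_rfc3339(start):
        return False
    if end is not None and moment > parse_rfc3339(end):
        return False
    return True


def within_spatial_extent(collection, lon, lat):
    """True if the point lies inside the Collection's overall bounding box."""
    west, south, east, north = collection["extent"]["spatial"]["bbox"][0]
    return west <= lon <= east and south <= lat <= north


# Extent copied from the Sentinel-2 L2A Collection example.
collection = {
    "extent": {
        "spatial": {"bbox": [[-180, -90, 180, 90]]},
        "temporal": {"interval": [["2015-06-27T10:25:31.456000Z", None]]},
    }
}

acquired = parse_rfc3339("2023-11-23T05:50:10.118000Z")
before_launch = parse_rfc3339("2014-01-01T00:00:00Z")

print(within_temporal_extent(collection, acquired))
print(within_temporal_extent(collection, before_launch))
print(within_spatial_extent(collection, 73.9, 32.0))
```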

STAC API

STAC API is a dynamic version of a static SpatioTemporal Asset Catalog and provides a RESTful endpoint that enables the search of STAC Items and STAC Collections. STAC Catalogs don’t play a big role in APIs as they are mostly used as an entity for grouping larger static catalogs into smaller chunks, which is usually not needed in the context of a dynamic API.

If the API implements the Filter or Query extension, users can additionally search for specific content based on a set of available metadata fields. Other extensions support more interactive features such as aggregations, or let clients manage the metadata (create, update, or delete entities) through transactions.
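A search request of the kind described above could look like the following sketch. It only builds the HTTP request and does not send it. The /search path appended to the Earth Search root URL is an assumption based on the usual STAC API layout (a client would normally discover it through the root's search link), the query block assumes the server implements the Query extension, and the function name and parameter values are illustrative.

```python
import json
from urllib.request import Request

# Assumed search endpoint: the API root from the examples plus "/search".
SEARCH_URL = "https://earth-search.aws.element84.com/v1/search"


def build_search_request(collections, bbox, datetime_range,
                         max_cloud_cover=None, limit=10):
    """Build (but do not send) a POST request for STAC API item search."""
    body = {
        "collections": collections,
        "bbox": bbox,                # [west, south, east, north]
        "datetime": datetime_range,  # "start/end" interval in RFC 3339
        "limit": limit,
    }
    if max_cloud_cover is not None:
        # Query extension syntax; only valid if the API advertises support.
        body["query"] = {"eo:cloud_cover": {"lt": max_cloud_cover}}
    return Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    collections=["sentinel-2-l2a"],
    bbox=[73.6, 31.5, 74.1, 32.6],
    datetime_range="2023-11-01T00:00:00Z/2023-11-30T23:59:59Z",
    max_cloud_cover=10,
)
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
print(json.dumps(payload, indent=2))
```

Passing the request to urllib.request.urlopen would send it; a conforming API answers with a GeoJSON FeatureCollection whose features are STAC Items.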

A part of the STAC API is built on top of OGC API – Features.

STAC Extension

Extensions to STAC are split into two parts: STAC extensions and STAC API extensions. They are both an important addition to the STAC specifications and can provide either additions to the data model (i.e. additional JSON properties such as eo:cloud_cover) or behavioral changes (e.g. additional types of links or a sorting functionality). Most tend to be about describing a particular domain or type of data.

To find out which extensions a STAC API, Catalog, Collection, or Item object can implement, you can explore the list of STAC extensions or the list of STAC API extensions.
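For a concrete object, the stac_extensions field already records which extensions are in use. The sketch below splits those schema URLs into a name and a version and compares the versions used by the Item and Collection examples. It assumes the name/version/schema.json URL layout seen in those examples; extensions hosted elsewhere may not follow it, and the helper name is hypothetical.

```python
from urllib.parse import urlparse


def parse_extension_url(url):
    """Split an extension schema URL into (name, version).

    Assumes the path layout /<name>/<version>/schema.json.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[2] != "schema.json":
        raise ValueError(f"unexpected extension URL layout: {url}")
    return parts[0], parts[1]


# A subset of the URLs listed in the Item and Collection examples.
ITEM_EXTENSIONS = [
    "https://stac-extensions.github.io/view/v1.0.0/schema.json",
    "https://stac-extensions.github.io/raster/v1.1.0/schema.json",
    "https://stac-extensions.github.io/eo/v1.1.0/schema.json",
    "https://stac-extensions.github.io/projection/v1.1.0/schema.json",
]
COLLECTION_EXTENSIONS = [
    "https://stac-extensions.github.io/item-assets/v1.0.0/schema.json",
    "https://stac-extensions.github.io/view/v1.0.0/schema.json",
    "https://stac-extensions.github.io/raster/v1.1.0/schema.json",
    "https://stac-extensions.github.io/eo/v1.0.0/schema.json",
]

item_versions = dict(parse_extension_url(url) for url in ITEM_EXTENSIONS)
collection_versions = dict(parse_extension_url(url) for url in COLLECTION_EXTENSIONS)

# Extensions used by both objects but in different versions.
differing = {
    name: (item_versions[name], collection_versions[name])
    for name in sorted(item_versions.keys() & collection_versions.keys())
    if item_versions[name] != collection_versions[name]
}
print(differing)
```

Note that the name in the URL is not always the field prefix: in the Item example the projection extension contributes fields prefixed with proj, such as proj:epsg.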