The Open Science Journey – Open Science in geospatial, EO and EO cloud platforms

Finally, let’s see how open science principles are applied in the geospatial field, in Earth observation (EO) and on EO cloud platforms. To begin, we will look at the open science journey and at a research project that has adopted openness and the FAIR principles very well. Then we will look at the role open science plays in today’s geospatial and EO world.

This drag-and-drop game asks you to match each task to its respective step in the open science journey. If you hover over the icons, their descriptions will pop up.

Open Science in the ClirSnow Project

The ClirSnow Project is a great example of how the concepts of openness and FAIR are applied to a real-world research project.

The Open Science Journey
Video content in collaboration with Michael Matiu (University of Trento).

“It seems like a lot of work the first time you do it. And it is. But once you know how to do it, you will use it in every research project, because it actually makes research so much easier. And, it will boost your research impact and credibility. It is really worth it.”

The Role of Open Source Software in Geospatial – The example of GDAL

Open science plays an important role in the geospatial field. Open source software is part of that, and the Geospatial Data Abstraction Library (GDAL) is a great example of how important open source software is in the geospatial world. Paul Ramsey, the co-founder of the PostGIS extension, described GDAL metaphorically in a mapscaping.com podcast: “GDAL is data plumbing, a bit like an international electrical plug set for traveling: it’s got multiple different shaped plugs. Electricity is ‘just’ electrons moving around. But they can move around as DC, AC, 120 volts or 240 volts. Plus, there are all these different ways you can plug and join electrical things. At the core, electricity is electrons vibrating, but it can be quite complex to get your hair dryer spinning.” Howard Butler, a director of the Open Source Geospatial Foundation, said about the importance of GDAL: “[…] It’s open, it provides core functionality, I can’t understand how anybody gets anything done without it.”

The Role of Open Source Software in Geospatial – The example of GDAL
Video content in collaboration with Even Rouault (Main Developer of GDAL).

Open Science in EO Cloud Platforms

  • Code: Workflows and code can easily be shared on EO cloud platforms. There are openly available tutorial notebooks, workflows can be shared as user-defined processes and reused by the community, and user forums share solutions and snippets. openEO, a standardized processing API for EO in the cloud, allows code to be portable between different cloud platforms. This increases reproducibility and collaboration and prevents vendor lock-in.
  • Results: There are multiple ways to share results created on EO cloud platforms. Ideally, they can be ingested into the platform and made available as collections to other users directly upon creation. If the results come with appropriate metadata (e.g. according to the STAC specification), they can easily be registered in publicly available STAC catalogues. Cloud-native data formats, such as Cloud Optimized GeoTIFF (COG), are accessible via HTTPS requests, so instead of sharing a file, only a URL pointing to the file is shared.
  • Publication: If a publication builds on results produced on an EO cloud platform, the results and code can easily be linked to the publication in one of the forms described above. For example, you can publish your openEO process graph and link to it, and provide a link to a STAC catalogue where the results are accessible.
  • FAIRness:
    • Findable: Data is usually presented through a data catalogue that is explicitly made for searching data (e.g. STAC catalogues are used in openEO platform and the Microsoft Planetary Computer). In many cases, searching for data works even without registering on the platform.
    • Accessible: Data access on cloud platforms is usually granted after registration and authentication. Since cloud computing resources can easily be misused, a certain degree of access control is necessary.
    • Interoperable: Processing standards like openEO aim to make code interoperable, which means it is transferable between platforms. Standardised metadata attached to the results, the use of cloud-optimized formats and reingestion of the results into the platform allow the results to be taken up right away. Different sources of satellite data are made interoperable by the cloud platform through the use of data cubes: reprojection, regridding and temporal alignment are performed on the fly.
    • Reusable: To make results reusable for others, they need to be accessible and have an open license. Ideally, a license of choice can be added to the metadata and the results are reingested into the platform as a public collection, available for everyone.
  • Analysis Ready Data (ARD): In the context of EO cloud platforms, analysis ready data are usually satellite data that have been processed to a minimum set of requirements and organized into a form that allows immediate analysis with a minimum of additional user effort and that supports interoperability both through time and with other datasets. This means, for example, that atmospheric correction and cloud masking have already been applied to optical data. Many collections on cloud platforms are analysis ready, so users can start the analysis directly, without the tedious and technically demanding preprocessing steps. Since ‘analysis ready’ can mean different things to different people, CEOS is working on standardizing what analysis ready data are.
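
To make the Code, Results and Reusable points above more concrete, here is a minimal Python sketch that uses only the standard library. It builds two plain JSON documents: a small openEO-style process graph (load a collection, average over time, save as GeoTIFF) and a minimal STAC Item that points to a Cloud Optimized GeoTIFF by URL and carries a license. The helper function names, the collection ID, the band names, the dimension name "t", the bounding box and the example.com URL are illustrative assumptions and are not taken from any particular platform; a real back-end may use different identifiers.

```python
import json


def build_process_graph(collection_id, spatial_extent, temporal_extent, bands):
    """Return an openEO-style process graph as a plain dict (hypothetical helper)."""
    return {
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": collection_id,
                "spatial_extent": spatial_extent,
                "temporal_extent": temporal_extent,
                "bands": bands,
            },
        },
        "mean": {
            "process_id": "reduce_dimension",
            "arguments": {
                "data": {"from_node": "load"},  # output of the "load" node
                "dimension": "t",  # assumed name of the temporal dimension
                "reducer": {
                    "process_graph": {
                        "mean1": {
                            "process_id": "mean",
                            "arguments": {"data": {"from_parameter": "data"}},
                            "result": True,
                        }
                    }
                },
            },
        },
        "save": {
            "process_id": "save_result",
            "arguments": {"data": {"from_node": "mean"}, "format": "GTiff"},
            "result": True,  # marks the final node of the graph
        },
    }


def build_stac_item(item_id, bbox, datetime_utc, asset_href, license_id):
    """Return a minimal STAC Item describing one result file (hypothetical helper)."""
    west, south, east, north = bbox
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"datetime": datetime_utc, "license": license_id},
        "links": [],
        "assets": {
            "result": {
                "href": asset_href,  # sharing this URL replaces sharing the file
                "type": "image/tiff; application=geotiff; profile=cloud-optimized",
                "roles": ["data"],
            }
        },
    }


# Illustrative values only: an area in the Alps and a made-up result URL.
graph = build_process_graph(
    "SENTINEL2_L2A",
    {"west": 11.0, "south": 46.0, "east": 11.5, "north": 46.5},
    ["2022-01-01", "2022-02-01"],
    ["B04", "B08"],
)
item = build_stac_item(
    "mean-composite-2022-01",
    [11.0, 46.0, 11.5, 46.5],
    "2022-01-15T00:00:00Z",
    "https://example.com/results/mean-composite-2022-01.tif",
    "CC-BY-4.0",
)
print(json.dumps({"process_graph": graph}, indent=2))
print(json.dumps(item, indent=2))
```

Both outputs are ordinary JSON, so they can be put under version control, linked from a publication, or registered in a catalogue. This is the sense in which code and results become portable and reusable.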

More Information

View: for more information and links to interesting material.

