Predicting the changing marine conditions with future digital twins

24 April 2023

A pioneering application of machine learning is helping scientists emulate highly complex physical-biogeochemical marine processes and forecast marine oxygen anomalies, with the aim of informing sustainable marine management through future digital twins.

Computer generated coast | Anton Filatov, Unsplash

The ocean is a vast, dynamic and complex place. Being able to fully appreciate its vital role in the Earth system and overall benefit to humans requires extensive knowledge of a myriad of ecosystem and biogeochemical processes, many of which are challenging or impossible to monitor on a regular basis.

To address this issue, scientists have spent more than 50 years developing increasingly sophisticated marine models that bridge the gaps in data and understanding. By bringing together oceanic, atmospheric and terrestrial observational data in a computer model, scientists can explore how elements, processes and functions interact to support the numerous goods and services provided by the marine environment and, ultimately, life on this planet. The models also provide a way to replicate a system digitally, so that changes can be simulated without the ongoing expense of observing them in the lab or the natural environment.

PML is a world leader in marine ecosystem and biogeochemical model development, and its models and outputs are used across the globe by a wide range of environmental scientists. However, these models are complex and computationally expensive, and they require advanced training to generate the output and interpret the information. With this in mind, PML’s expert team has turned its attention to developing the “digital twins” of the future, to make modelling capabilities more accessible for specific purposes outside of academia.

By accurately replicating the traditional model with a simple machine learning tool, end-users will be able to use the digital twin directly, to help inform operational and management decisions relating to various marine activities, such as aquaculture, Blue Carbon initiatives and coastal development.

Machine learning is a type of artificial intelligence that enables computer systems to learn and adapt, by using algorithms and statistical models to analyse and draw conclusions from patterns in data. The machine learning revolution is becoming established in marine science, but its application to emulating important marine biogeochemical models is still rare.

In this new study, led by PML in partnership with the University of Exeter and the National Centre for Earth Observation, the team focused on the prediction of low oxygen levels for aquaculture sites and the accurate estimation of oxygen levels from marine observations.

The study demonstrated that complex models can be accurately replicated with computationally inexpensive machine learning surrogates that successfully predict the simulated oxygen anomalies. It also showed that these emulators can provide essential components of future digital twin applications.
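To make the idea of a surrogate concrete, the short Python sketch below fits a cheap statistical emulator to the outputs of a stand-in "expensive" model and checks it on inputs it has not seen. Everything in it is an illustrative assumption: the toy `expensive_model` function, its two inputs (temperature and mixing), the quadratic least-squares fit and all numbers are invented for this example and do not represent the models, data or machine learning methods used in the study.

```python
import numpy as np

def expensive_model(temperature, mixing):
    """Hypothetical stand-in for a costly simulation; returns a toy 'oxygen' value."""
    return 300.0 - 5.0 * temperature + 20.0 * mixing - 0.1 * temperature**2

def features(temperature, mixing):
    """Quadratic feature matrix used by the surrogate (an assumed, simple choice)."""
    t = np.asarray(temperature, dtype=float)
    m = np.asarray(mixing, dtype=float)
    return np.column_stack([np.ones_like(t), t, m, t**2, t * m, m**2])

def fit_emulator(temperature, mixing, oxygen):
    """Fit the surrogate's coefficients to model output by least squares."""
    coeffs, *_ = np.linalg.lstsq(features(temperature, mixing), oxygen, rcond=None)
    return coeffs

def emulate(coeffs, temperature, mixing):
    """Cheap prediction: one matrix-vector product instead of a model run."""
    return features(temperature, mixing) @ coeffs

# Train the emulator on input/output pairs generated by the "expensive" model.
rng = np.random.default_rng(0)
t_train = rng.uniform(5, 20, 200)
m_train = rng.uniform(0, 1, 200)
coeffs = fit_emulator(t_train, m_train, expensive_model(t_train, m_train))

# Compare emulator and model on held-out inputs.
t_test = rng.uniform(5, 20, 50)
m_test = rng.uniform(0, 1, 50)
max_error = np.max(
    np.abs(emulate(coeffs, t_test, m_test) - expensive_model(t_test, m_test))
)
print(f"max emulator error on held-out inputs: {max_error:.2e}")
```

Because the toy model here is itself quadratic, the fit is essentially exact; a real biogeochemical model would need a more flexible emulator and careful validation against independent simulations and observations.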

Dr Jozef Skakala, study lead and Ecosystem Modeller at PML, commented:

“In this study we were successful in building computationally-inexpensive and efficient machine learning emulators to replicate computationally-expensive and complex physical-biogeochemical models.”

“Our vision is that these emulators, acting as ‘digital twins of the ocean’, would eventually democratize the access to modelling, enabling developing countries and other end-users without access to high-performance computing facilities to investigate a range of real-world scenarios for management and policy-making decisions. Plans are to develop similar tools to explore future climate scenarios for ocean ecosystem health.”

“There are challenges that still need addressing if we are to develop successful digital twins for high resolution applications in the future. The first challenge is that we might need to adapt the observational sampling strategies to the needs of the machine learning models; however, as we have shown in this study, this can often be achieved without a major increase in the observational cost. Another area we need to explore is increasing the model resolution to better approximate the scales of the digital twin applications and ensure that the dynamics relevant to the digital twin application are represented in the model as well as possible.”