Data Assimilation Group

Data assimilation is the science of combining observational data with numerical models, taking into account the uncertainties and errors of both. Data assimilation is interdisciplinary and combines skills from mathematics, computer science and application disciplines like meteorology, oceanography and biogeochemistry. It is not limited to Earth sciences but can be applied to any system for which observations over time and a related model simulating the system over time are available. In general, data assimilation represents probability distributions such that one can estimate both a state and its uncertainty. Data assimilation is related to machine learning, e.g. through the use of optimization methods. However, data assimilation allows a model to learn from the observations over time without extensive prior training, and it naturally accounts for the different uncertainties in the model and observations.

Data assimilation is most widely known for its application to initialize model forecasts, e.g. for weather forecasting. However, its possible uses are much wider: it can generally be applied to improve the model fields and the parameters that control model processes. It is also used to systematically improve the model itself, and to assess the impact of observations, thus supporting the planning of observation campaigns. Related to AWI’s research foci, we can, for example, improve the state and predictions of ocean models by assimilating satellite observations of sea surface temperature or sea surface height. Similarly, ocean-biogeochemical models can profit from the incorporation of satellite ocean chlorophyll data and other observations by correcting the values of biogeochemical fields or by estimating the parameters that control the biogeochemical processes represented by the model.

Our Research

Our research focuses on ensemble-based data assimilation methods and their application. These methods use an ensemble of model realizations to represent the model state estimate and its uncertainty. We work in three fields:

Data Assimilation Methodology

We focus on parallel ensemble assimilation algorithms, like ensemble Kalman or particle filters. These methods are highly scalable and hence well suited for data assimilation with complex models using parallel high-performance computers. Resulting methods are, e.g., the Local Kalman-Nonlinear Ensemble Transform Filter (LKNETF, Nerger, 2022) and contributions to the Nonlinear Ensemble Transform Filter and Smoother (NETF, Kirchgessner et al., 2017, Tödter et al., 2016). We also contribute to the review and assessment of data assimilation methodology (van Leeuwen et al., 2019, Vetra-Carvalho et al., 2018).
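The ensemble idea behind such filters can be illustrated with a minimal sketch of one analysis step of a textbook stochastic (perturbed-observation) ensemble Kalman filter. This is not the LKNETF or NETF, and it does not use the PDAF interface. The function name `enkf_analysis`, its arguments, and the toy setup (three state variables, one observed variable, uncorrelated observation errors with a single variance) are assumptions made purely for illustration.

```python
import numpy as np


def enkf_analysis(ensemble, obs, obs_operator, obs_error_var, rng):
    """One stochastic EnKF analysis step (illustrative sketch only).

    ensemble      : array (n_state, n_members), forecast ensemble
    obs           : array (n_obs,), observed values
    obs_operator  : array (n_obs, n_state), linear observation operator
    obs_error_var : float, observation error variance (assumed uncorrelated)
    rng           : numpy.random.Generator used to perturb the observations
    """
    n_members = ensemble.shape[1]
    n_obs = obs.shape[0]

    # Ensemble perturbations around the mean, in state and observation space
    state_pert = ensemble - ensemble.mean(axis=1, keepdims=True)
    obs_ens = obs_operator @ ensemble
    obs_pert = obs_ens - obs_ens.mean(axis=1, keepdims=True)

    # Sample covariances estimated from the ensemble
    cov_xy = state_pert @ obs_pert.T / (n_members - 1)
    cov_yy = obs_pert @ obs_pert.T / (n_members - 1)
    innov_cov = cov_yy + obs_error_var * np.eye(n_obs)

    # Kalman gain K = cov_xy @ inv(innov_cov); innov_cov is symmetric,
    # so a linear solve on the transposed system avoids an explicit inverse
    gain = np.linalg.solve(innov_cov, cov_xy.T).T

    # Each member assimilates its own randomly perturbed copy of the observations
    perturbed_obs = obs[:, None] + rng.normal(
        0.0, np.sqrt(obs_error_var), size=(n_obs, n_members)
    )
    return ensemble + gain @ (perturbed_obs - obs_ens)


# Toy usage: 50 members, 3 state variables, only the first one is observed
rng = np.random.default_rng(0)
prior = rng.normal(0.0, 1.0, size=(3, 50))
H = np.array([[1.0, 0.0, 0.0]])
y = np.array([2.0])
posterior = enkf_analysis(prior, y, H, 0.1, rng)
```

In this toy case the analysis pulls the ensemble mean of the observed variable toward the observation and reduces its spread, which is how the ensemble represents both the state estimate and its uncertainty. Applications with large models additionally rely on techniques such as localization and parallelization, which this sketch omits.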

Data assimilation software - PDAF

To be able to apply data assimilation with complex coupled models of Earth system components, we develop and maintain the Parallel Data Assimilation Framework (PDAF, Nerger et al., 2005, Nerger and Hiller, 2013, Nerger et al., 2020). PDAF is free open-source software, available at https://github.com/PDAF/PDAF, that enables the research community to apply data assimilation more easily. We also contribute to the development of couplings of PDAF to different models to generate actual assimilative model systems (e.g. Yu et al., 2025, Shao et al., 2024, Tang et al., 2024, Bruggeman et al., 2024, Li et al., 2024).

Applications of Data Assimilation in the Earth System

Our research applying data assimilation concerns different components of the Earth system. For ocean biogeochemistry and the carbon cycle, we study strongly and weakly coupled data assimilation in the North and Baltic Seas (Nerger et al., 2023, Goodliff et al., 2019) as well as in the global ocean (Pradhan et al., 2019, 2020), including uncertainty quantification in biogeochemical modeling with multiple phytoplankton functional groups (Ciavatta et al., 2025, Mamnun et al., 2022, 2023). We also assess the effect of data assimilation on the carbon uptake of the ocean (Bunsen et al., 2025). For coupled model components, we develop strategies to implement strongly coupled data assimilation for Earth system models (Nerger et al., 2020) and apply data assimilation to the coupled atmosphere-ocean model AWI-CM (Tang et al., 2020, 2021, Mu et al., 2020). Further, we contribute to sea-ice data assimilation in cooperation with the Sun Yat-sen University in Zhuhai, China (e.g. Min et al., 2023, Luo et al., 2023, 2020, Liu et al., 2019, Mu et al., 2018, Liang et al., 2017, 2019). Further research applications relate to paleoclimate (Masoum et al., 2024) and atmospheric data assimilation (Shao and Nerger, 2024, Li et al., 2024).

Team lead
Dr. Lars Nerger

The Team
Anna Broschke
Frauke Bunsen
Dr. Chao Min (guest)

Former members
Dr. Anju Sathyanarayanan
Ahmadreza Masoum (guest)
Sophie Vliegen
Dr. Changliang Shao (guest)
Nabir Mamnun
Dr. Yuchen Sun
Chao Min (visiting PhD student)
Dr. Farshid Daryabor
Imke Sievers (visiting PhD student)
Xiaoyu Liu (visiting PhD student)
Dr. Qi Tang
Dr. Michael Goodliff
Dr. Himansu K. Pradhan
Paul Kirchgessner
Dr. Svetlana Losa

Projects

We participate in different research projects:

SPECBIC

In the project SPECBIC - Employing Spectral Radiation for Enhanced Modeling of Biogeochemistry and Carbon Cycling, we aim to develop data assimilation methodology for spectral radiation observations, which contain information about the phytoplankton group composition in the biogeochemical model REcoM. These developments will be combined with approaches from machine learning. We conduct this project jointly with the AWI Sections Marine Biogeosciences, Climate Dynamics and Physical Oceanography.

SOCRA

The project SOCRA - A Surface Ocean CO2 ReAnalysis aims at producing a novel global ocean CO2 reanalysis product by combining the benefits of (i) an ocean circulation and biogeochemical model and (ii) observational data by applying data assimilation. We apply the coupled model FESOM-REcoM for the modeling and PDAF for the data assimilation. We conduct this project jointly with the group Marine Biogeochemical Modeling in the Section Marine Biogeosciences, and with AWI's Climate-Sciences Sections Marine Biogeosciences and Physical Oceanography.

Completed Projects

SEAMLESS (2021-2023)

The project SEAMLESS - Services based on Ecosystem data AssiMiLation: Essential Science and Solutions - was funded by the EU Horizon-2020 program. SEAMLESS aimed at improving the current European capability to simulate and predict the state of marine ecosystems. The project focused on state indicators that are currently monitored and/or simulated routinely by observatories and models of the European Copernicus Marine Services (“CMEMS”). SEAMLESS improved the CMEMS data assimilation methods that integrate the information from monitored and simulated indicators. We built a 1-dimensional prototype that uses PDAF for the data assimilation. Further, at AWI we applied the data assimilation with PDAF using the operational model system of the CMEMS monitoring and forecasting center for the Baltic Sea and assessed the effects of coupled physics/biogeochemical data assimilation.

Further information is available on the web site of SEAMLESS.

InfoWas (2021-2023)

The project InfoWas - Development of a model-based information system for the water quality in the North and Baltic Seas - was a collaborative project with the German Federal Maritime and Hydrographic Agency (BSH). The project focused on concentrations of algae and oxygen in the North Sea and Baltic Sea and developed an oxygen deficit index. We contributed to the project with the development of data assimilation functionality, based on PDAF, for ocean-biogeochemical modeling in order to improve predictions of water quality.

ESM (2017-2021)

The project ESM - Advanced Earth System Modeling Capacity was a cooperation project of 8 research centers of the Helmholtz Association. In the project we developed data assimilation capability for coupled Earth system models. Further, we performed research on the optimal application of data assimilation for coupled models, e.g. accounting for the different temporal and spatial scales of model compartments. For the data assimilation component, we applied the software framework PDAF to the coupled atmosphere-ocean model AWI-CM. The implementation approach was published in Nerger et al. (2020). Tang et al. (2020) described the effects of weakly coupled data assimilation on both the ocean and the atmosphere, while Mu et al. (2020) focused on the effect on the sea ice for building a seamless sea ice prediction system.

More information can be found on the web site of the ESM project.

IPSO (2016-2019)

In the project IPSO (Improving the prediction of photophysiology in the Southern Ocean by accounting for iron limitation, optical properties and spectral satellite data information) the data assimilation group cooperated with the groups Marine Biogeosciences and Phytooptics at AWI. The project aimed at improving the simulation of plankton dynamics and carbon fluxes in the Southern Ocean by enhancing the ecosystem model REcoM. This was achieved by applying data assimilation with PDAF for improving the state representation of REcoM (Pradhan et al., 2019, 2020) and by extending the model to account for light availability in several spectral bands as well as photoprotection and photophysiological effects of iron limitation. Further, model parameterizations for the photophysiology were improved.

MeRamo (2016-2018)

In the project MeRamo (Supporting the authorities that implement the EU Marine Strategy Framework Directive using an assimilative ecosystem model) we developed a data assimilation component for the coupled ocean-biogeochemical forecast model of the German Federal Maritime and Hydrographic Agency (BSH) in the North and Baltic Seas. The data assimilation system uses PDAF and the operational model HBM coupled to the ecosystem model ERGOM, and the work focused on the assimilation of sea surface temperature data. The effect of strongly coupled assimilation is published in Goodliff et al. (2019). The project was funded by the German Ministry for Transport and Digital Infrastructure.

DeMarine (2012-2015)

In the project DeMarine-2 we continued to develop a data assimilation system for the North and Baltic Seas for the German Federal Maritime and Hydrographic Agency (BSH). The data assimilation system uses PDAF and the operational model HBM of the BSH. Initial work was done in the previous project DeMarine Environment (Losa et al., 2012, Losa et al., 2013).

More information on DeMarine is available on the web pages of DeMarine.

Sangoma (EU FP7, years 2011-2015)

We participated in the EU-funded project SANGOMA (Stochastic Assimilation for the Next Generation Ocean Model Applications). In the project, unified tools for data assimilation, new assimilation algorithms and data assimilation benchmark applications were developed to support future operational systems with state-of-the-art data assimilation and related analysis tools.

More information is available on the web site of Sangoma.