Generative Machine Learning Models for Data Assimilation

This is a brief introduction to the 2024 Polymath Jr project that will be run by Ricardo Baptista and Giulio Trigila. 

Broad Goal Of The Project. Generative machine learning aims to characterize properties of probability distributions from samples. These distributions quantify our beliefs about the possible outcomes of an experiment or hypothesis, such as:

  • Is it likely to rain in New York tomorrow?

  • Will the Boston Red Sox win the World Series in 2024?

 

In practice, we summarize our beliefs by computing properties of these distributions, such as their moments and quantiles. This project focuses on the distribution of the state of a dynamical system given a stream of noisy observations, known as the filtering distribution. We will build algorithms that sample from this distribution using tools from optimal transport. When the data arrive sequentially in time and are incorporated online, this procedure is also known as data assimilation.
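To make the idea of sampling a filtering distribution concrete, here is a minimal sketch of one classical baseline, the perturbed-observation ensemble Kalman filter, on a scalar toy problem. It is not the project's method. The linear model, the parameter values `a`, `q`, `r`, and the function names `enkf_step` and `run_filter` are all illustrative assumptions. The update moves each forecast sample to an approximate posterior sample with an affine map, which is exact only in the linear-Gaussian case.

```python
import random
import statistics


def enkf_step(ensemble, y, a=0.9, q=0.5, r=0.5, rng=random):
    """One forecast/analysis cycle of a scalar perturbed-observation EnKF.

    Assumed toy model (hypothetical, for illustration only):
        x_next = a * x + N(0, q^2)   (dynamics)
        y      = x + N(0, r^2)       (noisy observation)
    """
    # Forecast: push every sample through the assumed dynamics.
    forecast = [a * x + rng.gauss(0.0, q) for x in ensemble]
    # Analysis: Kalman gain estimated from the forecast sample variance.
    var = statistics.variance(forecast)
    gain = var / (var + r ** 2)
    # Move each sample toward the data, comparing the real observation
    # with a simulated noisy observation of that sample.
    return [x + gain * (y - (x + rng.gauss(0.0, r))) for x in forecast]


def run_filter(n_steps=50, n_samples=200, seed=0):
    """Assimilate a stream of synthetic observations one at a time."""
    rng = random.Random(seed)
    a, q, r = 0.9, 0.5, 0.5
    truth = rng.gauss(0.0, 1.0)
    ensemble = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    truths, means = [], []
    for _ in range(n_steps):
        # Hidden state evolves; only a noisy observation is revealed.
        truth = a * truth + rng.gauss(0.0, q)
        y = truth + rng.gauss(0.0, r)
        ensemble = enkf_step(ensemble, y, a, q, r, rng)
        truths.append(truth)
        means.append(statistics.fmean(ensemble))
    return truths, means, ensemble
```

Calling `run_filter()` returns the hidden trajectory, the ensemble mean after each observation, and the final ensemble. Moments and quantiles of the filtering distribution can then be estimated directly from the ensemble samples.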

 

Applications of this project range from improving weather forecasts to financial modeling. For example, data assimilation is a core operational procedure for estimating the state of the atmosphere from limited measurements at weather stations. It is run routinely by weather prediction services and atmospheric research centers across the globe. Robust and unbiased methods for these tasks would significantly benefit these applications.

 

General references. 

Relevant papers.
