The International Environmetrics Society (TIES) has launched a new TIES Seminar Series on Data Science for Environmental Sciences (DSES).
Our first webinar will be on Friday, 18 March, at 11.00 am Central Standard Time (UTC-6) (see attached flyer).
You can register for and access the webinar via our website: www.environmetrics.xyz
Speaker: Huikyo Lee, Jet Propulsion Laboratory, Caltech.
Title: Application of topological data analysis to multi-resolution matching and anomaly detection
Abstract: Topology is the study of shapes. Topological data analysis (TDA) is emerging machinery at the interface of algebraic topology, machine learning (ML), and statistics. TDA has shown high utility in a diverse range of applications, from social studies to digital health care to power systems. While geometric methods such as TDA continue to gain popularity in the statistical sciences and ML, from causal inference to deep learning on manifolds, their utility for assessing the spatial characteristics of Earth science datasets remains untapped. Topological information on the inherent shape of the data can provide invaluable insights into the latent data structure and organization, and can play a leading role in understanding the spatiotemporal dynamic patterns of observations and climate models.
Here, I studied the latent shape of temperature maps over the contiguous United States in February, June, and July 2021. The cold wave in February 2021 was an extreme weather event that brought record-breaking low temperatures to North America and caused multiple days of massive blackouts in Texas. From late June through mid-July, an extreme heat wave associated with a strong ridge occurred over western North America. The main objective is to build a robust and reliable methodology that compares spatial patterns from different sources and detects anomalous spatial patterns during extreme temperature events. Specifically, I assessed two temperature datasets: the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2) reanalysis and the Atmospheric Infrared Sounder (AIRS). By applying cubical complexes, I summarized the shape of these two temperature datasets as persistence diagrams (PDs) and calculated the Wasserstein distance between the two PDs. My previous work (Ofori-Boateng et al., 2021) shows that the Wasserstein distance captures differences in spatial patterns and can replace conventional metrics such as bias and root-mean-square deviation (RMSD).
To the best of my knowledge, there is no quantitative metric for measuring differences in spatial patterns between Earth science datasets at different spatial resolutions. In my work, PDs summarized temperature maps during the extreme cold and heat waves. Applying TDA to observational and model datasets has enormous potential, because we can also analyze key spatial structures in the three-dimensional data from sounders and compare them with climate models. In addition, the Wasserstein distance can offer game-changing capabilities for self-organizing maps (SOMs), which are among the most widely used ML tools among atmospheric scientists.
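For readers new to TDA, the pipeline described in the abstract (gridded field, then persistence diagram, then Wasserstein distance) can be illustrated with a toy sketch. This is not the speaker's implementation. It makes several simplifying assumptions that the abstract does not specify: only 0-dimensional features (connected components of sublevel sets) are tracked, grid values sit on vertices with 4-neighbour connectivity, and the distance is the order-1 Wasserstein distance with an L-infinity ground metric, found by brute force. The function names and the two small grids are hypothetical.

```python
from itertools import permutations


def sublevel_persistence_h0(grid):
    """Finite (birth, death) pairs of connected components in the
    sublevel-set filtration of a 2D grid (4-neighbour connectivity).
    The component born at the global minimum never dies and is omitted."""
    rows, cols = len(grid), len(grid[0])
    order = sorted((grid[r][c], r, c) for r in range(rows) for c in range(cols))
    parent, birth = {}, {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        cell = (r, c)
        parent[cell], birth[cell] = cell, value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if nb not in parent:
                continue  # neighbour is outside the grid or not yet added
            a, b = find(cell), find(nb)
            if a == b:
                continue
            # Elder rule: the component with the larger birth value dies.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if birth[young] < value:  # skip zero-persistence pairs
                pairs.append((birth[young], value))
            parent[young] = old
    return sorted(pairs)


def wasserstein_1(diag_a, diag_b):
    """Order-1 Wasserstein distance between two small persistence diagrams
    (L-infinity ground metric). Brute force over all matchings, so it is
    only usable for a handful of points."""
    n, m = len(diag_a), len(diag_b)

    def cost(i, j):
        if i < n and j < m:  # point matched to point
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))
        if i < n:            # point of diag_a matched to the diagonal
            b, d = diag_a[i]
            return (d - b) / 2
        if j < m:            # point of diag_b matched to the diagonal
            b, d = diag_b[j]
            return (d - b) / 2
        return 0.0           # diagonal matched to diagonal

    return min(
        sum(cost(i, j) for i, j in enumerate(perm))
        for perm in permutations(range(n + m))
    )


# Hypothetical toy "temperature maps" on grids of different sizes.
fine_map = [[1, 4, 2],
            [4, 4, 4],
            [3, 4, 0]]
coarse_map = [[1, 4, 2],
              [4, 4, 0]]

pd_fine = sublevel_persistence_h0(fine_map)
pd_coarse = sublevel_persistence_h0(coarse_map)
print(pd_fine, pd_coarse, wasserstein_1(pd_fine, pd_coarse))
```

Because the persistence diagrams are sets of (birth, death) points rather than gridded fields, the two maps can be compared without regridding them to a common resolution. For real datasets, a dedicated TDA library and an optimal-transport solver would replace both functions.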
Hope to see you all there!
Yulia R. Gel