# CMIP6 Hackathon Project - Time of Emergence
This repository is for planning and working on the "Time of Emergence" project for the CMIP6 hackathon at NCAR, Boulder, CO, on 16-18 October 2019. Feel free to open PRs against this repository or to create Wiki pages.
- `catalogs/`: data catalogs that can be used by Intake-ESM (see the example just below this list).
- `environments/`: Conda environment files for the NCAR/Google Cloud deployments.
- `notebooks/`: a place for storing Jupyter Notebooks.
- `README.md`: this document - a description of the repository and project.
- `LICENSE`: MIT license file for your project. Do we want to change this?
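For example, one of these catalogs can be opened with Intake-ESM roughly as follows. This is a minimal sketch: the catalog filename is a placeholder, so substitute whichever JSON file actually lives in `catalogs/`.

```python
import intake

# Open an ESM collection definition from the catalogs/ directory;
# "example-catalog.json" is a placeholder name, not a real file here.
col = intake.open_esm_datastore("catalogs/example-catalog.json")

# Search the catalog for a variable/experiment of interest, then load
# the matching assets into a dictionary of xarray Datasets.
subset = col.search(experiment_id="piControl", variable_id="tas")
dsets = subset.to_dataset_dict()
```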
Please clone this repository onto the compute system we plan to use for the hackathon.
- Open a JupyterLab session.
- Open a terminal in the JupyterLab environment.
- Clone your project: `git clone https://github.com/darothen/cmip6hack-toe.git`
- Get to work!
Additionally, we've created a library of utility tools and functions. We've set these up to be installable as a Python package; to do so, navigate to the folder where you cloned `cmip6hack-toe` and execute the command
$ pip install -e .
This will use setuptools to install an editable copy of the code; that means you can update the code at will without needing to reinstall it!
Zenodo is a data archiving tool that can help make your project citable by assigning a DOI to the project's GitHub repository.
Follow the guidelines here: https://guides.github.com/activities/citable-code
Understanding where, when, and under what conditions climate-forced trends will become "measurable" (that is, distinguishable from background climate variability, or noise) is extremely helpful both for validating our understanding of large-scale climate dynamics and for framing climate risk assessments. Prior works such as Hawkins and Sutton (2012) have analyzed ensembles of model simulations from inter-comparison projects and worked out some important details, such as the idea that, despite experiencing a lower magnitude of warming, the lower latitudes will likely see a measurable "climate change signal" before the higher latitudes - a result which has been replicated by works such as Mahlstein et al. (2011).
Figure from Mahlstein et al. (2011) showing how the projected signal/noise in warming varies by region of the globe
To break down the time of emergence of climate-forced signals, it's often helpful to perform spatial analyses, dividing the globe into larger regions that we might expect to experience similar changes on similar timescales. Understanding what sets the spatial patterns of these regions then often helps elucidate the dynamics that set different regional climates apart from each other - and, more interestingly, how those differences may play out in a warming world.
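As a concrete (and heavily simplified) illustration of the signal-vs-noise framing above, a per-grid-cell time of emergence can be estimated as the first year a smoothed forced signal exceeds some multiple of the pre-industrial variability. The sketch below is an assumption-laden starting point, not our final method: `scenario` and `picontrol` are placeholder annual-mean DataArrays (e.g., surface air temperature) with an integer `year` coordinate.

```python
import numpy as np

def time_of_emergence(scenario, picontrol, baseline=slice(1850, 1900), k=2.0):
    """First year where |signal| exceeds k * noise at each grid cell.

    `scenario` and `picontrol` are placeholder annual-mean xarray
    DataArrays with an integer "year" coordinate.
    """
    # "Noise": interannual variability from the pre-industrial control run.
    noise = picontrol.std(dim="year")

    # "Signal": anomaly relative to an early baseline, smoothed with a
    # decadal rolling mean to damp interannual wiggles.
    anomaly = scenario - scenario.sel(year=baseline).mean(dim="year")
    signal = anomaly.rolling(year=10, center=True).mean()

    # Emergence: the first year the signal-to-noise ratio exceeds k.
    emerged = np.abs(signal) > k * noise
    first_year = emerged.astype(int).idxmax(dim="year")

    # Mask out cells that never emerge within the simulated period.
    return first_year.where(emerged.any(dim="year"))
```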
We will replicate some of the foundational work by Hawkins and Sutton (2012) and Mahlstein and Knutti (2010), but with an emphasis on CMIP6 models. This will entail:
- Performing a spatial clustering analysis on different CMIP6 models to identify regions with similar baseline climates and trends; ideally, we will explore using machine-learning techniques to "learn" these different regions across many different model simulations (see the sketch after this list)
- Analyzing climate trends by aggregating them across each region, both (a) model-by-model and (b) across models
- Developing visualizations and dashboards for exploring our results
- Creating reproducible workflows that automate the entire analysis we develop
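As a starting point for the clustering item above, here is a minimal, hypothetical sketch that groups grid cells by their warming trajectories with scikit-learn's KMeans. The input name (`tas_annual`), the choice of KMeans, and `n_regions` are all illustrative assumptions rather than settled methodology.

```python
from sklearn.cluster import KMeans

def cluster_regions(tas_annual, n_regions=10):
    """Group grid cells into regions with similar temporal evolution.

    `tas_annual` is a placeholder annual-mean xarray DataArray with dims
    ("year", "lat", "lon") and no missing values.
    """
    # Each grid cell becomes one sample whose features are the
    # standardized annual time series at that cell.
    stacked = tas_annual.stack(cell=("lat", "lon")).transpose("cell", "year")
    X = stacked.values
    X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

    # Cluster cells with similar trajectories into candidate regions.
    labels = KMeans(n_clusters=n_regions, n_init=10).fit_predict(X)

    # Re-attach the cluster labels to the (lat, lon) grid for mapping.
    regions = stacked.isel(year=0, drop=True).copy(data=labels)
    return regions.unstack("cell")
```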
Based on this work we would hope to answer a few scientific questions:
- Has our understanding of the time of emergence of regional warming/drying/wetting/etc. trends changed with the data from CMIP6?
- What regions and what trends might we expect to "emerge" first?
TBD - we will use standard dynamic/thermodynamic diagnostics (air temperature, humidity, precipitation, wind components, 500 mb geopotential heights, etc.); a list will be forthcoming in the Data Request thread.
Although it would be interesting to break down these analyses for many combinations of CMIP6 experiments, we will focus on pre-industrial runs (for quantifying the background noise/climate variability in the simulations) and future warming scenarios.
We'll try to use as much of the standard suite of scientific Python tools as possible. In particular, we expect to use xarray, dask, pandas, and scikit-learn. We will probably build some interactive visualization tools for helping to understand our results, and for that we will likely use bokeh and Panel.
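To give a flavor of what such a tool might look like, here is a hypothetical sketch that wires a Panel slider to a bokeh figure; the synthetic signal-to-noise series below stands in for real results.

```python
import numpy as np
import panel as pn
from bokeh.plotting import figure

pn.extension()

# Synthetic stand-in for a real result: one S/N trajectory over time.
years = np.arange(1850, 2101)
rng = np.random.default_rng(0)
sn_ratio = np.abs(np.cumsum(rng.normal(0.01, 0.05, size=years.size)))

threshold = pn.widgets.FloatSlider(name="S/N threshold", start=0.5, end=3.0, value=2.0)

def sn_plot(k):
    # Plot the S/N trajectory alongside the chosen emergence threshold.
    p = figure(width=500, height=300, x_axis_label="Year", y_axis_label="S/N")
    p.line(years, sn_ratio)
    p.line(years, np.full(years.size, k), line_dash="dashed")
    return p

# The bound plot re-renders whenever the slider moves.
dashboard = pn.Column(threshold, pn.bind(sn_plot, threshold))
dashboard.servable()  # serve with `panel serve`, or call .show() interactively
```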
Ideally the entire research chain will be automated with a few Jupyter notebooks and a Snakemake workflow (or a similar workflow management tool) to enhance reproducibility. Any core modules will be developed as an open-source package.
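As a flavor of what that automation could look like, here is a minimal, hypothetical Snakefile (Snakemake's Python-based DSL); every script name and file path below is a placeholder for whatever pipeline we actually build.

```
# Hypothetical Snakefile; all paths and scripts are placeholders.
rule all:
    input:
        "figures/toe_map.png"

# Compute regional trends from raw model output.
rule compute_trends:
    input:
        "data/raw/tas_annual.nc"
    output:
        "data/processed/trends.nc"
    shell:
        "python scripts/compute_trends.py {input} {output}"

# Turn the processed trends into a time-of-emergence map.
rule plot_toe:
    input:
        "data/processed/trends.nc"
    output:
        "figures/toe_map.png"
    shell:
        "python scripts/plot_toe.py {input} {output}"
```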
Anyone! If you can bring some time-series statistical analysis expertise, that would be sweet - we'll definitely need it to solve some of the core science problems!