Working with Prudence regions#

This notebook demonstrates how to work with subregions of a CORDEX domain. Usually, we use subregions in the form of shapefiles or geodataframes, which are handled by regionmask to mask certain geographical regions of a CORDEX domain. Working with 3D boolean masks is mostly based on the excellent documentation provided by regionmask.
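For reference, here is a minimal sketch of how a 3D mask can be built directly from a geodataframe with regionmask; the shapefile path and the dataset ds (with 2D lon/lat coordinates, like a CORDEX domain) are hypothetical:

import geopandas as gpd
import regionmask

# hypothetical shapefile containing the region polygons
gdf = gpd.read_file("my_regions.shp")

# one boolean mask per polygon, stacked along a new "region" dimension;
# ds is assumed to provide 2D lon/lat coordinates
mask = regionmask.mask_3D_geopandas(gdf, ds.lon, ds.lat)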

[5]:
import cartopy.crs as ccrs
import cartopy.feature as cf
import intake
import matplotlib.pyplot as plt
import numpy as np
import regionmask
import xarray as xr
from tqdm.autonotebook import tqdm

import cordex as cx

xr.set_options(keep_attrs=True)
[5]:
<xarray.core.options.set_options at 0x142175510>

We define a plotting function that shows rotated-pole fields with the right projection, coastlines, and country borders.

[6]:
def plot(
    da,
    transform=ccrs.PlateCarree(),
    projection=ccrs.PlateCarree(),
    vmin=None,
    vmax=None,
    borders=True,
    xlocs=range(-180, 180, 2),
    ylocs=range(-90, 90, 2),
    extent=None,
    figsize=(15, 10),
    title="",
):
    """plot a domain using the right projections and transformations with cartopy"""

    plt.figure(figsize=figsize)
    ax = plt.axes(projection=projection)
    if extent:
        ax.set_extent(extent, crs=projection)
    ax.gridlines(
        draw_labels=True, linewidth=0.5, color="gray", xlocs=xlocs, ylocs=ylocs
    )
    da.plot(ax=ax, cmap="terrain", transform=transform, vmin=vmin, vmax=vmax)
    ax.coastlines(resolution="50m", color="black", linewidth=1)
    if borders:
        ax.add_feature(cf.BORDERS)
    if title:
        ax.set_title(title)

Using PRUDENCE regions for masking#

The PRUDENCE regions are included in regionmask in the form of a coordinate table.

[7]:
prudence = regionmask.defined_regions.prudence
prudence
[7]:
<regionmask.Regions 'PRUDENCE'>
Source:   Christensen and Christensen, 2007, Climatic Change 81:7-30 (https:/...
overlap:  True

Regions:
1 BI     British Isles
2 IP Iberian Peninsula
3 FR            France
4 ME        Mid-Europe
5 SC       Scandinavia
6 AL              Alps
7 MD     Mediterranean
8 EA    Eastern Europe

[8 regions]
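The Regions object can also be subset by region abbreviations, e.g. (a quick illustration):

# select only the Mediterranean and Alps regions
prudence[["MD", "AL"]]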

Let’s have a look at the regions using the built-in plotting function of regionmask.

[11]:
plt.figure(figsize=(8, 8))
proj = ccrs.LambertConformal(central_longitude=15)
ax = prudence.plot(add_ocean=True, proj=proj, resolution="50m", label="abbrev")
_images/prudence_9_0.png

As a simple example, we will use the PRUDENCE regions to mask out a region from a dummy CORDEX domain.

[12]:
eur11 = cx.cordex_domain("EUR-11", dummy="topo")
pole = (
    eur11.rotated_latitude_longitude.grid_north_pole_longitude,
    eur11.rotated_latitude_longitude.grid_north_pole_latitude,
)

regionmask includes a function called mask_3D that creates one boolean mask for each region, stacked along a region coordinate, e.g.

[13]:
mask = prudence.mask_3D(eur11.lon, eur11.lat)
mask.region
[13]:
<xarray.DataArray 'region' (region: 8)>
array([1, 2, 3, 4, 5, 6, 7, 8])
Coordinates:
  * region   (region) int64 1 2 3 4 5 6 7 8
    abbrevs  (region) <U2 'BI' 'IP' 'FR' 'ME' 'SC' 'AL' 'MD' 'EA'
    names    (region) <U17 'British Isles' ... 'Eastern Europe'
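Each entry along the region dimension is a 2D boolean mask on the model grid. For instance, summing over the spatial dimensions gives the number of grid cells that fall into each region (a quick sanity check):

# number of EUR-11 grid cells per PRUDENCE region
mask.sum(dim=("rlat", "rlon"))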

The mask contains a region coordinate that we can use to easily mask out the EUR-11 dummy topography, e.g., for the Mediterranean region:

[14]:
md_topo = eur11.topo.where(
    mask.isel(region=(mask.abbrevs == "MD")).squeeze(), drop=True
)
plot(
    md_topo,
    transform=ccrs.RotatedPole(*pole),
    projection=ccrs.RotatedPole(*pole),
    vmin=-300,
    vmax=1500,
)
_images/prudence_15_0.png

Investigate regional warming in PRUDENCE regions#

Now, we will use the regionmask to investigate regional warming in CORDEX datasets. This will only work with proper access to CORDEX data; here, we use a public intake catalog of CORDEX-CMIP5 data on AWS S3. Furthermore, we will use dask to parallelize our computations. For some more detailed explanations, please see also this notebook on CMIP6 data processing.

[15]:
from dask.distributed import Client, progress

client = Client()
client
[15]:

Client: Client-9a80b71c-1a6b-11ee-a9e1-d0a637e987cd
Connection method: Cluster object; Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Let’s open the catalog and search for all RCMs with EC-EARTH forcing. We will investigate the warming of the rcp45 scenario with respect to the historical reference period from 1971 to 2000. The search attributes will return a catalog subset that gives us direct access to the data.

[16]:
url = "https://euro-cordex.s3.eu-central-1.amazonaws.com/catalog/CORDEX-CMIP5.json"
cat = intake.open_esm_datastore(url)
[29]:
attrs = {
    "driving_model_id": "ICHEC-EC-EARTH",  #'MPI-M-MPI-ESM-LR',
    "member": "r12i1p1",
    #  'member': 'r1i1p1',
    "variable_id": "tas",
    "CORDEX_domain": "EUR-11",
    "frequency": "mon",
    "rcm_version_id": "v1",
    "experiment_id": ["historical", "rcp45"],
}

Now we search for our attributes and show the results.

[30]:
selection = cat.search(**attrs, require_all_on="model_id")
selection.df.groupby(
    [
        "model_id",
        "institute_id",
        "experiment_id",
        "driving_model_id",
        "member",
        "frequency",
        "rcm_version_id",
        "version",
    ]
)["variable_id"].unique().apply(list).to_frame()
[30]:
model_id    institute_id  experiment_id  driving_model_id  member   frequency  rcm_version_id  version    variable_id
CCLM4-8-17  CLMcom        historical     ICHEC-EC-EARTH    r12i1p1  mon        v1              v20140515  [tas]
CCLM4-8-17  CLMcom        rcp45          ICHEC-EC-EARTH    r12i1p1  mon        v1              v20140515  [tas]
RACMO22E    KNMI          historical     ICHEC-EC-EARTH    r12i1p1  mon        v1              v20170208  [tas]
RACMO22E    KNMI          rcp45          ICHEC-EC-EARTH    r12i1p1  mon        v1              v20170713  [tas]
RCA4        SMHI          historical     ICHEC-EC-EARTH    r12i1p1  mon        v1              v20131026  [tas]
RCA4        SMHI          rcp45          ICHEC-EC-EARTH    r12i1p1  mon        v1              v20131026  [tas]

We have found some RCM datasets that fulfill our criteria. We can now load them into a dictionary of datasets.

[47]:
dsets = selection.to_dataset_dict(
    xarray_open_kwargs={"consolidated": True, "decode_times": True, "use_cftime": True},
    storage_options={"anon": True},
)

--> The keys in the returned dictionary of datasets are constructed as follows:
        'project_id.CORDEX_domain.institute_id.driving_model_id.experiment_id.member.model_id.rcm_version_id.frequency'
100.00% [6/6 00:01<00:00]
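We can have a look at the keys we got, e.g.:

list(dsets.keys())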

We sort our datasets by the RCM model id and the experiment id so that we can concatenate the historical and scenario data.

[48]:
from collections import defaultdict

dsets_dict = defaultdict(dict)

# sort dsets by rcm and exp_id
for key, dset in dsets.items():
    attrs = dict(zip(cat.key_template.split("."), key.split(".")))
    rcm = attrs["model_id"]
    exp = attrs["experiment_id"]
    dsets_dict[rcm][exp] = dset

Because we want to compare the scenario output to the historical time period, we concatenate the historical and scenario datasets here along the time axis.

[52]:
dsets_concat = {
    rcm: xr.concat(
        [exp["historical"], exp["rcp45"]],
        dim="time",
        coords="minimal",
        compat="override",
    )
    for rcm, exp in dsets_dict.items()
}
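As a quick check (the exact years depend on the datasets), we can verify that each concatenated dataset now spans both the historical and the scenario period:

for rcm, ds in dsets_concat.items():
    # first and last year on the concatenated time axis
    print(rcm, ds.time.dt.year.min().item(), "-", ds.time.dt.year.max().item())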

Here comes the actual math. This function computes a yearly mean over the input period and subtracts the temporal mean over the reference period. Consequently, it computes the change in the annual mean temperature with respect to the reference period.

[53]:
def compute_change(ds, period, ref_period):
    """compute yearly means and compare to a reference period"""
    yearmean = ds.sel(time=period).groupby("time.year").mean(dim="time")  # yearly means
    ref = ds.sel(time=ref_period).mean(dim="time")  # mean over the whole reference period
    results = yearmean - ref
    # restore the dataset attributes (netCDF meta info) on the result
    results.attrs = ds.attrs
    return results

Let’s define the time periods we are interested in and compute the change in the surface temperature for each RCM.

[54]:
ref_period = slice("1971", "2000")
# common_period = slice("1971","2100")
period = slice("2005", "2100")
[55]:
yearmeans = {
    rcm: compute_change(ds, period, ref_period) for rcm, ds in dsets_concat.items()
}

Finally, we will concatenate our results into a single dataset with the model id as a coordinate. This allows us to easily work with the data of all RCMs simultaneously.

Note: Here we have to take care of the slightly different rlat and rlon coordinates that might appear between RCMs, probably due to different cmorizers involved. We can tell xarray to merge all datasets using the coordinates of the first model with the keywords compat="override" and join="override".

[56]:
models = list(dsets_concat.keys())
models
[56]:
['RCA4', 'RACMO22E', 'CCLM4-8-17']
[57]:
model_da = xr.DataArray(
    models, dims="model_id", name="model_id", coords={"model_id": models}
)
changes = xr.concat(
    [ym.tas for ym in yearmeans.values()],
    dim=model_da,
    coords="minimal",
    compat="override",
    join="override",
)
changes
[57]:
<xarray.DataArray 'tas' (model_id: 3, year: 96, rlat: 412, rlon: 424)>
dask.array<concatenate, shape=(3, 96, 412, 424), dtype=float32, chunksize=(1, 1, 412, 424), chunktype=numpy.ndarray>
Coordinates:
    height    float64 2.0
    lat       (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
    lon       (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
  * rlat      (rlat) float64 -23.38 -23.27 -23.16 -23.05 ... 21.62 21.73 21.84
  * rlon      (rlon) float64 -28.38 -28.27 -28.16 -28.05 ... 17.94 18.05 18.16
  * year      (year) int64 2005 2006 2007 2008 2009 ... 2096 2097 2098 2099 2100
  * model_id  (model_id) <U10 'RCA4' 'RACMO22E' 'CCLM4-8-17'
Attributes:
    cell_methods:   time: mean
    grid_mapping:   rotated_pole
    long_name:      Near-Surface Air Temperature
    standard_name:  air_temperature
    units:          K

Now, we have the temperature change with respect to the historical period for all three RCMs in one dataset. Actually, we have not computed anything yet, since we accessed the data lazily and have only worked with metadata so far. This works very well because xarray uses dask under the hood. Now, we will trigger the computation using the compute function.
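Before doing so, we can confirm that changes still wraps a chunked dask array (a quick illustration):

# the underlying data stays a lazy dask array until we call compute
type(changes.data), changes.chunks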

[58]:
%time changes_ = changes.compute()
CPU times: user 32.6 s, sys: 8.12 s, total: 40.7 s
Wall time: 2min 55s

With a proper weighting, we can now have a quick look at the field mean over the whole EUR-11 domain.

[60]:
weight = np.cos(np.deg2rad(changes_.rlat))
changes_.weighted(weight).mean(dim=("rlat", "rlon")).plot(col="model_id")
[60]:
<xarray.plot.facetgrid.FacetGrid at 0x147c52710>
_images/prudence_42_1.png

Add a region coordinate#

Finally, we will compute a field mean for each PRUDENCE region. This can be easily achieved using xarray weighting in combination with the PRUDENCE mask.

[61]:
prudence_changes = changes_.weighted(mask * weight).mean(dim=("rlat", "rlon"))

and again, we trigger the computation (this is nearly instantaneous here, since changes_ is already loaded into memory)…

[62]:
%time prudence_changes_ = prudence_changes.compute()
CPU times: user 400 µs, sys: 6 µs, total: 406 µs
Wall time: 417 µs

Finally, the plotting becomes as easy as…

[63]:
prudence_changes_["region"] = prudence_changes_.abbrevs
prudence_changes_.plot(col="region", col_wrap=3, hue="model_id")
[63]:
<xarray.plot.facetgrid.FacetGrid at 0x149093410>
_images/prudence_49_1.png
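Since the region coordinate now holds the abbreviations, a single region can also be selected and plotted directly, e.g. (a final illustration):

# time series of the Mediterranean warming for all three RCMs
prudence_changes_.sel(region="MD").plot(hue="model_id")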