Working with Prudence regions#

This notebook demonstrates how to work with subregions of a CORDEX domain. Usually, we use subregions in the form of shapefiles or geodataframes, which are handled by regionmask to mask certain geographical regions of a CORDEX domain. Working with 3D boolean masks is mostly based on the excellent documentation provided by regionmask.
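For reference, here is a minimal sketch of how a 3D mask can be built directly from a geodataframe with regionmask; the shapefile path and the dataset ds (with 2D lon/lat coordinates, like a CORDEX domain) are hypothetical:

import geopandas as gpd
import regionmask

# hypothetical shapefile containing the region polygons
gdf = gpd.read_file("my_regions.shp")

# one boolean mask per polygon, stacked along a new "region" dimension;
# ds is assumed to provide 2D lon/lat coordinates
mask = regionmask.mask_3D_geopandas(gdf, ds.lon, ds.lat)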

[5]:
import cartopy.crs as ccrs
import cartopy.feature as cf
import intake
import matplotlib.pyplot as plt
import numpy as np
import regionmask
import xarray as xr
from tqdm.autonotebook import tqdm

import cordex as cx

xr.set_options(keep_attrs=True)
[5]:
<xarray.core.options.set_options at 0x142175510>

We define a plotting function that shows rotated-pole fields with the right projection, coastlines, and country borders.

[6]:
def plot(
    da,
    transform=ccrs.PlateCarree(),
    projection=ccrs.PlateCarree(),
    vmin=None,
    vmax=None,
    borders=True,
    xlocs=range(-180, 180, 2),
    ylocs=range(-90, 90, 2),
    extent=None,
    figsize=(15, 10),
    title="",
):
    """plot a domain using the right projections and transformations with cartopy"""

    plt.figure(figsize=figsize)
    ax = plt.axes(projection=projection)
    if extent:
        ax.set_extent(extent, crs=projection)
    ax.gridlines(
        draw_labels=True, linewidth=0.5, color="gray", xlocs=xlocs, ylocs=ylocs
    )
    da.plot(ax=ax, cmap="terrain", transform=transform, vmin=vmin, vmax=vmax)
    ax.coastlines(resolution="50m", color="black", linewidth=1)
    if borders:
        ax.add_feature(cf.BORDERS)
    if title:
        ax.set_title(title)

Using PRUDENCE regions for masking#

The PRUDENCE regions are included in regionmask in the form of a coordinate table.

[7]:
prudence = regionmask.defined_regions.prudence
prudence
[7]:
<regionmask.Regions 'PRUDENCE'>
Source:   Christensen and Christensen, 2007, Climatic Change 81:7-30 (https:/...
overlap:  True

Regions:
1 BI     British Isles
2 IP Iberian Peninsula
3 FR            France
4 ME        Mid-Europe
5 SC       Scandinavia
6 AL              Alps
7 MD     Mediterranean
8 EA    Eastern Europe

[8 regions]
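The Regions object can also be subset by region abbreviations, e.g. (a quick illustration):

# select only the Mediterranean and Alps regions
prudence[["MD", "AL"]]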

Let’s have a look at the regions using the built-in plotting function of regionmask.

[11]:
plt.figure(figsize=(8, 8))
proj = ccrs.LambertConformal(central_longitude=15)
ax = prudence.plot(add_ocean=True, proj=proj, resolution="50m", label="abbrev")
_images/prudence_9_0.png

As a simple example, we will use the PRUDENCE regions to mask out a region from a dummy CORDEX domain.

[12]:
eur11 = cx.cordex_domain("EUR-11", dummy="topo")
pole = (
    eur11.rotated_latitude_longitude.grid_north_pole_longitude,
    eur11.rotated_latitude_longitude.grid_north_pole_latitude,
)

regionmask includes a function called mask_3D that creates one boolean mask for each region, stacked along a region coordinate, e.g.

[13]:
mask = prudence.mask_3D(eur11.lon, eur11.lat)
mask.region
[13]:
<xarray.DataArray 'region' (region: 8)>
array([1, 2, 3, 4, 5, 6, 7, 8])
Coordinates:
  * region   (region) int64 1 2 3 4 5 6 7 8
    abbrevs  (region) <U2 'BI' 'IP' 'FR' 'ME' 'SC' 'AL' 'MD' 'EA'
    names    (region) <U17 'British Isles' ... 'Eastern Europe'
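Each entry along the region dimension is a 2D boolean mask on the model grid. For instance, summing over the spatial dimensions gives the number of grid cells that fall into each region (a quick sanity check):

# number of EUR-11 grid cells per PRUDENCE region
mask.sum(dim=("rlat", "rlon"))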

The mask contains a region coordinate that we can use to easily mask out the EUR-11 dummy topography, e.g., for the Mediterranean region:

[14]:
md_topo = eur11.topo.where(
    mask.isel(region=(mask.abbrevs == "MD")).squeeze(), drop=True
)
plot(
    md_topo,
    transform=ccrs.RotatedPole(*pole),
    projection=ccrs.RotatedPole(*pole),
    vmin=-300,
    vmax=1500,
)
_images/prudence_15_0.png

Investigate regional warming in PRUDENCE regions#

Now, we will use the regionmask to investigate regional warming in CORDEX datasets. This will only work with proper access to CORDEX data; here, we use a public intake catalog of CORDEX-CMIP5 data on AWS S3. Furthermore, we will use dask to parallelize our computations. For some more detailed explanations, please see also this notebook on CMIP6 data processing.

[15]:
from dask.distributed import Client, progress

client = Client()
client
[15]:

Client: Client-9a80b71c-1a6b-11ee-a9e1-d0a637e987cd
Connection method: Cluster object; Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Let’s open the catalog and search for all RCMs with EC-EARTH forcing. We will investigate the warming of the rcp45 scenario with respect to the historical reference period from 1971 to 2000. The search attributes will return a catalog subset that gives us direct access to the data.

[16]:
url = "https://euro-cordex.s3.eu-central-1.amazonaws.com/catalog/CORDEX-CMIP5.json"
cat = intake.open_esm_datastore(url)
[29]:
attrs = {
    "driving_model_id": "ICHEC-EC-EARTH",  #'MPI-M-MPI-ESM-LR',
    "member": "r12i1p1",
    #  'member': 'r1i1p1',
    "variable_id": "tas",
    "CORDEX_domain": "EUR-11",
    "frequency": "mon",
    "rcm_version_id": "v1",
    "experiment_id": ["historical", "rcp45"],
}

Now we search for our attributes and show the results.

[30]:
selection = cat.search(**attrs, require_all_on="model_id")
selection.df.groupby(
    [
        "model_id",
        "institute_id",
        "experiment_id",
        "driving_model_id",
        "member",
        "frequency",
        "rcm_version_id",
        "version",
    ]
)["variable_id"].unique().apply(list).to_frame()
[30]:
model_id    institute_id  experiment_id  driving_model_id  member   frequency  rcm_version_id  version    variable_id
CCLM4-8-17  CLMcom        historical     ICHEC-EC-EARTH    r12i1p1  mon        v1              v20140515  [tas]
CCLM4-8-17  CLMcom        rcp45          ICHEC-EC-EARTH    r12i1p1  mon        v1              v20140515  [tas]
RACMO22E    KNMI          historical     ICHEC-EC-EARTH    r12i1p1  mon        v1              v20170208  [tas]
RACMO22E    KNMI          rcp45          ICHEC-EC-EARTH    r12i1p1  mon        v1              v20170713  [tas]
RCA4        SMHI          historical     ICHEC-EC-EARTH    r12i1p1  mon        v1              v20131026  [tas]
RCA4        SMHI          rcp45          ICHEC-EC-EARTH    r12i1p1  mon        v1              v20131026  [tas]

We have found some RCM datasets that fulfill our criteria. We can now load them into a dictionary of datasets.

[47]:
dsets = selection.to_dataset_dict(
    xarray_open_kwargs={"consolidated": True, "decode_times": True, "use_cftime": True},
    storage_options={"anon": True},
)

--> The keys in the returned dictionary of datasets are constructed as follows:
        'project_id.CORDEX_domain.institute_id.driving_model_id.experiment_id.member.model_id.rcm_version_id.frequency'
100.00% [6/6 00:01<00:00]
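We can have a look at the keys we got, e.g.:

list(dsets.keys())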

We sort our datasets by the RCM model id and the experiment id so that we can concatenate the historical and scenario data.

[48]:
from collections import defaultdict

dsets_dict = defaultdict(dict)

# sort dsets by rcm and exp_id
for key, dset in dsets.items():
    attrs = dict(zip(cat.key_template.split("."), key.split(".")))
    rcm = attrs["model_id"]
    exp = attrs["experiment_id"]
    dsets_dict[rcm][exp] = dset

Because we want to compare the scenario output to the historical time period, we concatenate the historical and scenario datasets here along the time axis.

[52]:
dsets_concat = {
    rcm: xr.concat(
        [exp["historical"], exp["rcp45"]],
        dim="time",
        coords="minimal",
        compat="override",
    )
    for rcm, exp in dsets_dict.items()
}
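As a quick check (the exact years depend on the datasets), we can verify that each concatenated dataset now spans both the historical and the scenario period:

for rcm, ds in dsets_concat.items():
    # first and last year on the concatenated time axis
    print(rcm, ds.time.dt.year.min().item(), "-", ds.time.dt.year.max().item())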

Here comes the actual math. This function computes a yearly mean over the input period and subtracts the temporal mean over the reference period. Consequently, it computes the change in the annual mean temperature with respect to the reference period.

[53]:
def compute_change(ds, period, ref_period):
    """compute yearly means and compare to a reference period"""
    yearmean = ds.sel(time=period).groupby("time.year").mean(dim="time")  # yearly means
    ref = ds.sel(time=ref_period).mean(dim="time")  # mean over the whole reference period
    results = yearmean - ref
    # restore the dataset attributes (netCDF meta info) on the result
    results.attrs = ds.attrs
    return results

Let’s define the time periods we are interested in and compute the change in the surface temperature for each RCM.

[54]:
ref_period = slice("1971", "2000")
# common_period = slice("1971","2100")
period = slice("2005", "2100")
[55]:
yearmeans = {
    rcm: compute_change(ds, period, ref_period) for rcm, ds in dsets_concat.items()
}

Finally, we will concatenate our results into a single dataset with the model id as a coordinate. This allows us to easily work with the data of all RCMs simultaneously.

Note: Here we have to take care of the slightly different rlat and rlon coordinates that might appear between RCMs, probably due to different cmorizers involved. We can tell xarray to merge all datasets using the coordinates of the first model with the keywords compat="override" and join="override".

[56]:
models = list(dsets_concat.keys())
models
[56]:
['RCA4', 'RACMO22E', 'CCLM4-8-17']
[57]:
model_da = xr.DataArray(
    models, dims="model_id", name="model_id", coords={"model_id": models}
)
changes = xr.concat(
    [ym.tas for ym in yearmeans.values()],
    dim=model_da,
    coords="minimal",
    compat="override",
    join="override",
)
changes
[57]:
<xarray.DataArray 'tas' (model_id: 3, year: 96, rlat: 412, rlon: 424)>
dask.array<concatenate, shape=(3, 96, 412, 424), dtype=float32, chunksize=(1, 1, 412, 424), chunktype=numpy.ndarray>
Coordinates:
    height    float64 2.0
    lat       (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
    lon       (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
  * rlat      (rlat) float64 -23.38 -23.27 -23.16 -23.05 ... 21.62 21.73 21.84
  * rlon      (rlon) float64 -28.38 -28.27 -28.16 -28.05 ... 17.94 18.05 18.16
  * year      (year) int64 2005 2006 2007 2008 2009 ... 2096 2097 2098 2099 2100
  * model_id  (model_id) <U10 'RCA4' 'RACMO22E' 'CCLM4-8-17'
Attributes:
    cell_methods:   time: mean
    grid_mapping:   rotated_pole
    long_name:      Near-Surface Air Temperature
    standard_name:  air_temperature
    units:          K

Now, we have the temperature change with respect to the historical period for all three RCMs in one dataset. Actually, we have not computed anything yet, since we accessed the data lazily and have only worked with metadata so far. This works very well because xarray uses dask under the hood. Now, we will trigger the computation using the compute function.
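Before doing so, we can confirm that changes still wraps a chunked dask array (a quick illustration):

# the underlying data stays a lazy dask array until we call compute
type(changes.data), changes.chunks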

[58]:
%time changes_ = changes.compute()
CPU times: user 32.6 s, sys: 8.12 s, total: 40.7 s
Wall time: 2min 55s

With a proper weighting, we can now have a quick look at the field mean over the whole EUR-11 domain.

[60]:
weight = np.cos(np.deg2rad(changes_.rlat))
changes_.weighted(weight).mean(dim=("rlat", "rlon")).plot(col="model_id")
[60]:
<xarray.plot.facetgrid.FacetGrid at 0x147c52710>
_images/prudence_42_1.png

Add a region coordinate#

Finally, we will compute a field mean for each PRUDENCE region. This can be easily achieved using xarray weighting in combination with the PRUDENCE mask.

[61]:
prudence_changes = changes_.weighted(mask * weight).mean(dim=("rlat", "rlon"))

and again, we trigger the computation (this is nearly instantaneous here, since changes_ is already loaded into memory)…

[62]:
%time prudence_changes_ = prudence_changes.compute()
CPU times: user 400 µs, sys: 6 µs, total: 406 µs
Wall time: 417 µs

Finally, the plotting becomes as easy as…

[63]:
prudence_changes_["region"] = prudence_changes_.abbrevs
prudence_changes_.plot(col="region", col_wrap=3, hue="model_id")
[63]:
<xarray.plot.facetgrid.FacetGrid at 0x149093410>
_images/prudence_49_1.png
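Since the region coordinate now holds the abbreviations, a single region can also be selected and plotted directly, e.g. (a final illustration):

# time series of the Mediterranean warming for all three RCMs
prudence_changes_.sel(region="MD").plot(hue="model_id")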