Working with Prudence regions#
This notebook should demonstrate how to work with subregions of a Cordex domain. Usually, we use subregions in the form of shapefiles or geodataframes which are handley by regionmask to mask certain geophysical regions of a cordex domain. Working with 3D boolean mask is mostly based on the excellent documentation provided by regionmask.
[5]:
import cartopy.crs as ccrs
import intake
import matplotlib.pyplot as plt
import numpy as np
import regionmask
import xarray as xr
from tqdm.autonotebook import tqdm
import cordex as cx
xr.set_options(keep_attrs=True)
[5]:
<xarray.core.options.set_options at 0x142175510>
We define a proper plotting function for showing rotated pole mappings with coastlines and country borders.
[6]:
def plot(
da,
transform=ccrs.PlateCarree(),
projection=ccrs.PlateCarree(),
vmin=None,
vmax=None,
borders=True,
xlocs=range(-180, 180, 2),
ylocs=range(-90, 90, 2),
extent=None,
figsize=(15, 10),
title="",
):
"""plot a domain using the right projections and transformations with cartopy"""
%matplotlib inline
import cartopy.crs as ccrs
import cartopy.feature as cf
import matplotlib.pyplot as plt
plt.figure(figsize=figsize)
ax = plt.axes(projection=projection)
if extent:
# ax.set_extent([ds_sub.rlon.min(), ds_sub.rlon.max(), ds_sub.rlat.min(), ds_sub.rlat.max()], crs=transform)
ax.set_extent(extent, crs=projection)
ax.gridlines(
draw_labels=True, linewidth=0.5, color="gray", xlocs=xlocs, ylocs=ylocs
)
da.plot(ax=ax, cmap="terrain", transform=transform, vmin=vmin, vmax=vmax)
ax.coastlines(resolution="50m", color="black", linewidth=1)
if borders:
ax.add_feature(cf.BORDERS)
if title:
ax.set_title("")
Using prudence regions for masking#
The prudence regions are included in regionmask in form of a coordinates table.
[7]:
prudence = regionmask.defined_regions.prudence
prudence
[7]:
<regionmask.Regions 'PRUDENCE'>
Source: Christensen and Christensen, 2007, Climatic Change 81:7-30 (https:/...
overlap: True
Regions:
1 BI British Isles
2 IP Iberian Peninsula
3 FR France
4 ME Mid-Europe
5 SC Scandinavia
6 AL Alps
7 MD Mediterranean
8 EA Eastern Europe
[8 regions]
Let’s have a look at how the regions look like with the built in plotting function of regionmask.
[11]:
plt.figure(figsize=(8, 8))
proj = ccrs.LambertConformal(central_longitude=15)
ax = prudence.plot(add_ocean=True, proj=proj, resolution="50m", label="abbrev")
As a simple example, we will use the prudence regionmask to mask out a region from a dummy Cordex domain.
[12]:
eur11 = cx.cordex_domain("EUR-11", dummy="topo")
pole = (
eur11.rotated_latitude_longitude.grid_north_pole_longitude,
eur11.rotated_latitude_longitude.grid_north_pole_latitude,
)
regionmask includes a function called mask_3D
to create one mask for each region along a region
coordinate, e.g.
[13]:
mask = prudence.mask_3D(eur11.lon, eur11.lat)
mask.region
[13]:
<xarray.DataArray 'region' (region: 8)> array([1, 2, 3, 4, 5, 6, 7, 8]) Coordinates: * region (region) int64 1 2 3 4 5 6 7 8 abbrevs (region) <U2 'BI' 'IP' 'FR' 'ME' 'SC' 'AL' 'MD' 'EA' names (region) <U17 'British Isles' ... 'Eastern Europe'
The mask contains a region coordinate that will we can use to mask out EUR-11 dummy topography easily, e.g., to mask the Mediterranean:
[14]:
me_topo = eur11.topo.where(
mask.isel(region=(mask.abbrevs == "MD")).squeeze(), drop=True
)
plot(
me_topo,
transform=ccrs.RotatedPole(*pole),
projection=ccrs.RotatedPole(*pole),
vmin=-300,
vmax=1500,
)
Investigate regional warming in Prudence regions#
Now, we will use the regionmask to investiage regional warming of Cordex datasets. This will only work with proper access to Cordex data. This example will work at DKRZ with access to their intake catalog. Furthemore, we will use dask to parallelize our computations. For some more detailled explanations, please see also this notebook on CMIP6 data processing.
[15]:
from dask.distributed import Client, progress
client = Client()
client
[15]:
Client
Client-9a80b71c-1a6b-11ee-a9e1-d0a637e987cd
Connection method: Cluster object | Cluster type: distributed.LocalCluster |
Dashboard: http://127.0.0.1:8787/status |
Cluster Info
LocalCluster
9d3cd8e9
Dashboard: http://127.0.0.1:8787/status | Workers: 4 |
Total threads: 4 | Total memory: 8.00 GiB |
Status: running | Using processes: True |
Scheduler Info
Scheduler
Scheduler-769bf993-9bcc-4f0b-b845-5ebcaaafe84a
Comm: tcp://127.0.0.1:55627 | Workers: 4 |
Dashboard: http://127.0.0.1:8787/status | Total threads: 4 |
Started: Just now | Total memory: 8.00 GiB |
Workers
Worker: 0
Comm: tcp://127.0.0.1:55754 | Total threads: 1 |
Dashboard: http://127.0.0.1:55759/status | Memory: 2.00 GiB |
Nanny: tcp://127.0.0.1:55630 | |
Local directory: /var/folders/_8/qq584wg50hvdrcxg3k0wd9900000gn/T/dask-scratch-space/worker-n46f66dc |
Worker: 1
Comm: tcp://127.0.0.1:55753 | Total threads: 1 |
Dashboard: http://127.0.0.1:55757/status | Memory: 2.00 GiB |
Nanny: tcp://127.0.0.1:55631 | |
Local directory: /var/folders/_8/qq584wg50hvdrcxg3k0wd9900000gn/T/dask-scratch-space/worker-08vq22u5 |
Worker: 2
Comm: tcp://127.0.0.1:55756 | Total threads: 1 |
Dashboard: http://127.0.0.1:55761/status | Memory: 2.00 GiB |
Nanny: tcp://127.0.0.1:55632 | |
Local directory: /var/folders/_8/qq584wg50hvdrcxg3k0wd9900000gn/T/dask-scratch-space/worker-gdf_uj85 |
Worker: 3
Comm: tcp://127.0.0.1:55755 | Total threads: 1 |
Dashboard: http://127.0.0.1:55758/status | Memory: 2.00 GiB |
Nanny: tcp://127.0.0.1:55633 | |
Local directory: /var/folders/_8/qq584wg50hvdrcxg3k0wd9900000gn/T/dask-scratch-space/worker-tmmi3l0g |
Let’s open the catalog and search for all rcm models with EC-EARTH forcing. We will investigate the warming of the rcp45 scenario with respect to the historical reference periond from 1971 to 2000. The search attributes will return a catalog subset that will give us direct access to the data.
[16]:
url = "https://euro-cordex.s3.eu-central-1.amazonaws.com/catalog/CORDEX-CMIP5.json"
cat = intake.open_esm_datastore(url)
[29]:
attrs = {
"driving_model_id": "ICHEC-EC-EARTH", #'MPI-M-MPI-ESM-LR',
"member": "r12i1p1",
# 'member': 'r1i1p1',
"variable_id": "tas",
"CORDEX_domain": "EUR-11",
"frequency": "mon",
"rcm_version_id": "v1",
"experiment_id": ["historical", "rcp45"],
}
Now we search for our attributes and show the results.
[30]:
selection = cat.search(**attrs, require_all_on="model_id")
selection.df.groupby(
[
"model_id",
"institute_id",
"experiment_id",
"driving_model_id",
"member",
"frequency",
"rcm_version_id",
"version",
]
)["variable_id"].unique().apply(list).to_frame()
[30]:
variable_id | ||||||||
---|---|---|---|---|---|---|---|---|
model_id | institute_id | experiment_id | driving_model_id | member | frequency | rcm_version_id | version | |
CCLM4-8-17 | CLMcom | historical | ICHEC-EC-EARTH | r12i1p1 | mon | v1 | v20140515 | [tas] |
rcp45 | ICHEC-EC-EARTH | r12i1p1 | mon | v1 | v20140515 | [tas] | ||
RACMO22E | KNMI | historical | ICHEC-EC-EARTH | r12i1p1 | mon | v1 | v20170208 | [tas] |
rcp45 | ICHEC-EC-EARTH | r12i1p1 | mon | v1 | v20170713 | [tas] | ||
RCA4 | SMHI | historical | ICHEC-EC-EARTH | r12i1p1 | mon | v1 | v20131026 | [tas] |
rcp45 | ICHEC-EC-EARTH | r12i1p1 | mon | v1 | v20131026 | [tas] |
We have found some RCM datasets that fullfill our critera. We can now load them into a dataset dictionary.
[47]:
dsets = selection.to_dataset_dict(
xarray_open_kwargs={"consolidated": True, "decode_times": True, "use_cftime": True},
storage_options={"anon": True},
)
--> The keys in the returned dictionary of datasets are constructed as follows:
'project_id.CORDEX_domain.institute_id.driving_model_id.experiment_id.member.model_id.rcm_version_id.frequency'
We sort our datasets by the RCM model id and the experiment id’s so that we can concatenate the historical and scenario data.
[48]:
from collections import defaultdict
dsets_dict = defaultdict(dict)
# sort dsets by rcm and exp_id
for key, dset in dsets.items():
attrs = dict(zip(cat.key_template.split("."), key.split(".")))
rcm = attrs["model_id"]
exp = attrs["experiment_id"]
dsets_dict[rcm][exp] = dset
Because we want to compare the scenario output to the historical time period, we concatenate the historical and senarion datasets here along the time-axis.
[52]:
dsets_concat = {
rcm: xr.concat(
[exp["historical"], exp["rcp45"]],
dim="time",
coords="minimal",
compat="override",
)
for rcm, exp in dsets_dict.items()
}
Here comes the actual math. This function will compute a yearly mean over the input period and will substract the temporal mean over the reference period. Consequently, this function computes the change in the annual temperature mean in with respect to the reference period.
[53]:
def compute_change(ds, period, ref_period, mask=None):
"""compute yearly means and compare to reference period"""
yearmean = ds.sel(time=period).groupby("time.year").mean(dim="time") # yearly means
ref = ds.sel(time=ref_period).mean(dim="time") # mean over whole period
results = yearmean - ref
# lets do some maintenance to keep netcdf meta info...
results.attrs = ds.attrs
return results
Let’s define out time periods we are interested in and compute the change in the surface temperature for each rcm.
[54]:
ref_period = slice("1971", "2000")
# common_period = slice("1971","2100")
period = slice("2005", "2100")
[55]:
yearmeans = {
rcm: compute_change(ds, period, ref_period) for rcm, ds in dsets_concat.items()
}
Finally, we will concatenate our results into a single dataset with the model id as a coordinate. This will allow us to easily work with the data of all rcms simultaneoulsy.
Note: Here we have to take care for the slightly different rlat and rlon coordinates that might appear in rcms probably due to different cmorizers involded. We can tell xarray to merge all datasets using the coordinates of the first model usint the keywords compat="override"
and join="override".
[56]:
models = list(dsets_concat.keys())
models
[56]:
['RCA4', 'RACMO22E', 'CCLM4-8-17']
[57]:
model_da = xr.DataArray(
models, dims="model_id", name="model_id", coords={"model_id": models}
)
changes = xr.concat(
[ym.tas for ym in yearmeans.values()],
dim=model_da,
coords="minimal",
compat="override",
join="override",
)
changes
[57]:
<xarray.DataArray 'tas' (model_id: 3, year: 96, rlat: 412, rlon: 424)> dask.array<concatenate, shape=(3, 96, 412, 424), dtype=float32, chunksize=(1, 1, 412, 424), chunktype=numpy.ndarray> Coordinates: height float64 2.0 lat (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray> lon (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray> * rlat (rlat) float64 -23.38 -23.27 -23.16 -23.05 ... 21.62 21.73 21.84 * rlon (rlon) float64 -28.38 -28.27 -28.16 -28.05 ... 17.94 18.05 18.16 * year (year) int64 2005 2006 2007 2008 2009 ... 2096 2097 2098 2099 2100 * model_id (model_id) <U10 'RCA4' 'RACMO22E' 'CCLM4-8-17' Attributes: cell_methods: time: mean grid_mapping: rotated_pole long_name: Near-Surface Air Temperature standard_name: air_temperature units: K
Now, we have the temperature change with respect to the historical period for all four RCMs in one dataset. Actually, we have not computed anything since we accessed the data lazily and only worked with meta data so far. This works very well because xarray uses dask under the hood. Now, we will trigger the computation using the compute
function.
[58]:
%time changes_ = changes.compute()
CPU times: user 32.6 s, sys: 8.12 s, total: 40.7 s
Wall time: 2min 55s
With a proper weighting function, we can now have a quick look at the results of field mean over the whole EUR-11 domain.
[60]:
weight = np.cos(np.deg2rad(changes_.rlat))
changes_.weighted(weight).mean(dim=("rlat", "rlon")).plot(col="model_id")
[60]:
<xarray.plot.facetgrid.FacetGrid at 0x147c52710>
Add a regions coordinate#
Finally, we will compute a field mean for each prudence region. This can now be easily achieved using xarray weighting in combination with the prudence mask.
[61]:
prudence_changes = changes_.weighted(mask * weight).mean(dim=("rlat", "rlon"))
and again, we trigger the chunked computation…
[62]:
%time prudence_changes_ = prudence_changes.compute()
CPU times: user 400 µs, sys: 6 µs, total: 406 µs
Wall time: 417 µs
Finally, the plotting becomes as easy as…
[63]:
prudence_changes_["region"] = prudence_changes_.abbrevs
prudence_changes_.plot(col="region", col_wrap=3, hue="model_id")
[63]:
<xarray.plot.facetgrid.FacetGrid at 0x149093410>