Cordex domains#

The domain module should give some tools to work with preconfigured or user defined domains. Domains are defined as xarray datasets that will contain dimensions and coodinates according to CF-conventions.

NOTE: The domain module mostly focuses on working with rotated cordex domains and how they are defined in the cordex archive specifications. However, there are some regional models that use different mappings instead of rotated_pole or rotated_latitude_longitude which we focus on. Any expertise working with those different mappings is highly welcome!

Working with domain information#

[1]:
import cordex as cx

The domain module contains some useful functions to work with cordex meta data, e.g., you can get some domain grid information using

[2]:
cx.domain_info("EUR-11")
[2]:
{'short_name': 'EUR-11',
 'region': 4,
 'long_name': 'Europe',
 'nlon': 424,
 'nlat': 412,
 'll_lon': -28.375,
 'ur_lon': 18.155,
 'll_lat': -23.375,
 'ur_lat': 21.835,
 'dlon': 0.11,
 'dlat': 0.11,
 'pollon': -162.0,
 'pollat': 39.25}

The domain information is stored in a number of csv tables. The module contains a tables dictionary that sorts the tables by resolution or project, e.g.

[3]:
cx.domains.tables.keys()
[3]:
dict_keys(['cordex', 'cordex-high-res', 'cordex-fps', 'cordex-core', 'cordex-regular'])

All available cordex domains are in those tables:

[4]:
cx.domains.table
[4]:
region long_name nlon nlat ll_lon ur_lon ll_lat ur_lat dlon dlat pollon pollat
short_name
SAM-44 1 South America 146 167 143.92000 207.72000 -38.28000 34.76000 0.4400 0.4400 -56.06 70.60
CAM-44 2 Central America 210 113 -52.80000 39.16000 -28.60000 20.68000 0.4400 0.4400 113.98 75.74
NAM-44 3 North America 155 130 -33.88000 33.88000 -28.40000 28.36000 0.4400 0.4400 83.00 42.50
EUR-44 4 Europe 106 103 -28.21000 17.99000 -23.21000 21.67000 0.4400 0.4400 -162.00 39.25
AFR-44 5 Africa 194 201 -24.64000 60.28000 -45.76000 42.24000 0.4400 0.4400 180.00 90.00
WAS-44 6 South Asia 193 130 -32.12000 52.36000 -21.56000 35.20000 0.4400 0.4400 -123.34 79.95
EAS-44 7 East Asia 203 167 -40.92000 47.96000 -26.84000 46.20000 0.4400 0.4400 -64.78 77.61
CAS-44 8 Central Asia 153 100 -34.32000 32.56000 -20.68000 22.88000 0.4400 0.4400 -103.39 43.48
AUS-44 9 Australasia 200 129 142.16000 229.72000 -22.88000 33.44000 0.4400 0.4400 141.38 60.31
ANT-44 10 Antarctica 125 97 152.72000 207.28000 -27.72000 14.52000 0.4400 0.4400 -166.92 6.08
ARC-44 11 Arctic 116 133 -22.88000 27.72000 -24.20000 33.88000 0.4400 0.4400 0.00 6.55
MED-44 12 Mediterranean 98 63 -23.22000 19.46000 -21.34000 5.94000 0.4400 0.4400 198.00 39.25
MNA-44 13 Middle East and North Africa 232 118 -26.40000 75.24000 -6.60000 44.88000 0.4400 0.4400 180.00 90.00
MNA-22 13 Middle East and North Africa 464 236 -26.51000 75.35000 -6.71000 44.99000 0.2200 0.2200 180.00 90.00
SAM-11 1 South America 584 668 143.75500 207.88500 -38.44500 34.92500 0.1100 0.1100 -56.06 70.60
CAM-11 2 Central America 840 452 -52.96500 39.32500 -28.76500 20.84500 0.1100 0.1100 113.98 75.74
NAM-11 3 North America 620 520 -34.04500 34.04500 -28.56500 28.52500 0.1100 0.1100 83.00 42.50
EUR-11 4 Europe 424 412 -28.37500 18.15500 -23.37500 21.83500 0.1100 0.1100 -162.00 39.25
AFR-11 5 Africa 776 804 -24.80500 60.44500 -45.92500 42.40500 0.1100 0.1100 180.00 90.00
WAS-11 6 South Asia 772 520 -32.28500 52.52500 -21.72500 35.36500 0.1100 0.1100 -123.34 79.95
EAS-11 7 East Asia 812 668 -41.08500 48.12500 -27.00500 46.36500 0.1100 0.1100 -64.78 77.61
CAS-11 8 Central Asia 612 400 -34.48500 32.72500 -20.84500 23.04500 0.1100 0.1100 -103.39 43.48
AUS-11 9 Australasia 800 516 141.99500 229.88500 -23.04500 33.60500 0.1100 0.1100 141.38 60.31
ANT-11 10 Antarctica 500 388 152.55500 207.44500 -27.88500 14.68500 0.1100 0.1100 -166.92 6.08
ARC-11 11 Arctic 464 532 -23.04500 27.88500 -24.36500 34.04500 0.1100 0.1100 0.00 6.55
MED-11 12 Mediterranean 392 252 -23.38500 19.62500 -21.50500 6.10500 0.1100 0.1100 198.00 39.25
MNA-11 13 Middle East and North Africa 928 472 -26.56500 75.40500 -6.76500 45.04500 0.1100 0.1100 180.00 90.00
GAR-0275 4 Greater Alpine Region 476 444 -13.51125 -0.44875 -10.93125 1.25125 0.0275 0.0275 -162.00 39.25
SAM-22 1 South America 292 334 143.81000 207.83000 -38.39000 34.87000 0.2200 0.2200 -56.06 70.60
CAM-22 2 Central America 420 226 -52.91000 39.27000 -28.71000 20.79000 0.2200 0.2200 113.98 75.74
NAM-22 3 North America 310 260 -33.99000 33.99000 -28.51000 28.47000 0.2200 0.2200 83.00 42.50
EUR-22 4 Europe 212 206 -28.32000 18.10000 -23.32000 21.78000 0.2200 0.2200 -162.00 39.25
AFR-22 5 Africa 388 402 -24.75000 60.39000 -45.87000 42.35000 0.2200 0.2200 180.00 90.00
WAS-22 6 South Asia 386 260 -32.23000 52.47000 -21.67000 35.31000 0.2200 0.2200 -123.34 79.95
EAS-22 7 East Asia 396 251 -43.23000 43.67000 -22.10000 32.90000 0.2200 0.2200 296.30 61.00
CAS-22 8 Central Asia 306 200 -34.43000 32.67000 -20.79000 22.99000 0.2200 0.2200 -103.39 43.48
AUS-22 9 Australasia 400 258 142.05000 229.83000 -22.99000 33.55000 0.2200 0.2200 141.38 60.31
SEA-22 14 South East Asia 264 194 89.26000 147.12000 -15.18000 27.28000 0.2200 0.2200 180.00 90.00
SAM-44i 1 South America 181 155 -106.25000 -16.25000 -58.25000 18.75000 0.5000 0.5000 NaN NaN
CAM-44i 2 Central America 207 111 -124.75000 -21.75000 -19.75000 35.25000 0.5000 0.5000 NaN NaN
NAM-44i 3 North America 300 129 -171.75000 -22.25000 12.25000 76.25000 0.5000 0.5000 NaN NaN
EUR-44i 4 Europe 221 103 -44.75000 65.25000 21.75000 72.75000 0.5000 0.5000 NaN NaN
AFR-44i 5 Africa 173 179 -25.25000 60.75000 -46.25000 42.75000 0.5000 0.5000 NaN NaN
WAS-44i 6 South Asia 195 124 19.25000 116.25000 -15.75000 45.75000 0.5000 0.5000 NaN NaN
EAS-44i 7 East Asia 227 157 62.75000 175.75000 -18.75000 59.25000 0.5000 0.5000 NaN NaN
CAS-44i 8 Central Asia 260 133 10.75000 140.25000 17.75000 69.75000 0.5000 0.5000 NaN NaN
AUS-44i 9 Australasia 238 133 88.75000 207.25000 -53.25000 12.75000 0.5000 0.5000 NaN NaN
ANT-44i 10 Antarctica 720 70 -179.75000 179.75000 -89.75000 -55.25000 0.5000 0.5000 NaN NaN
ARC-44i 11 Arctic 720 83 -179.75000 179.75000 48.75000 89.75000 0.5000 0.5000 NaN NaN
MED-44i 12 Mediterranean 144 65 -20.75000 51.75000 25.25000 57.25000 0.5000 0.5000 NaN NaN
MNA-44i 13 Middle East and North Africa 206 106 -26.75000 75.75000 -7.25000 45.25000 0.5000 0.5000 NaN NaN
MNA-22i 13 Middle East and North Africa high res. 410 209 -26.62500 75.62500 -6.87500 45.12500 0.2500 0.2500 NaN NaN
EUR-11i 4 Europe high res. 881 408 -44.81250 65.18750 21.81250 72.68750 0.1250 0.1250 NaN NaN
SEA-22i 14 South East Asia 233 172 89.12500 147.12500 -15.37500 27.37500 0.2500 0.2500 NaN NaN

EUR-11 example#

The heart of the module are some functions that create a dataset from the grid information, e.g.

[5]:
eur11 = cx.cordex_domain("EUR-11", dummy="topo")
eur11
[5]:
<xarray.Dataset>
Dimensions:                     (rlon: 424, rlat: 412)
Coordinates:
  * rlon                        (rlon) float64 -28.38 -28.27 ... 18.05 18.16
  * rlat                        (rlat) float64 -23.38 -23.27 ... 21.73 21.84
    lon                         (rlat, rlon) float64 -10.06 -9.964 ... 64.96
    lat                         (rlat, rlon) float64 21.99 22.03 ... 66.75 66.69
Data variables:
    rotated_latitude_longitude  int32 0
    topo                        (rlat, rlon) float32 284.0 246.0 ... 509.0 509.0
Attributes:
    CORDEX_domain:  EUR-11

The dummy='topo' argument means, we want a dummy variable in the dataset to see how the domain looks like. For the dummy topography, we use the cdo topo operator in the background. So maybe you have to install python-cdo, e.g., conda install -c conda-forge python-cdo. Working with xarray datasets means, that we can use all the nice functions of xarray including plotting, e.g.,

[6]:
eur11.topo.plot(cmap="terrain")
[6]:
<matplotlib.collections.QuadMesh at 0x1456ee850>
_images/domains_14_1.png
[7]:
eur11.topo.plot(x="lon", y="lat", cmap="terrain")
[7]:
<matplotlib.collections.QuadMesh at 0x1458587f0>
_images/domains_15_1.png

Let’s define a slightly more sophisticated plotting function that uses cartopy for the right projection with a rotated pole:

[8]:
def plot(da, pole, vmin=None, vmax=None, borders=True, title=None):
    """plot a domain using the right projection with cartopy"""
    %matplotlib inline
    import cartopy.crs as ccrs
    import cartopy.feature as cf
    import matplotlib.pyplot as plt

    plt.figure(figsize=(20, 10))
    projection = ccrs.PlateCarree()
    transform = ccrs.RotatedPole(pole_latitude=pole[1], pole_longitude=pole[0])
    # ax = plt.axes(projection=projection)
    ax = plt.axes(projection=transform)
    # ax.set_extent([ds_sub.rlon.min(), ds_sub.rlon.max(), ds_sub.rlat.min(), ds_sub.rlat.max()], crs=transform)
    ax.gridlines(
        draw_labels=True,
        linewidth=0.5,
        color="gray",
        xlocs=range(-180, 180, 10),
        ylocs=range(-90, 90, 5),
    )
    da.plot(
        ax=ax,
        cmap="terrain",
        transform=transform,
        vmin=vmin,
        vmax=vmax,
        x="rlon",
        y="rlat",
    )
    ax.coastlines(resolution="50m", color="black", linewidth=1)
    if borders:
        ax.add_feature(cf.BORDERS)
    if title is not None:
        ax.set_title(title)
[9]:
pole = (
    eur11.rotated_latitude_longitude.grid_north_pole_longitude,
    eur11.rotated_latitude_longitude.grid_north_pole_latitude,
)
pole
[9]:
(-162.0, 39.25)
[10]:
plot(eur11.topo, pole)
_images/domains_19_0.png

User defined domain#

The domains are actually created from csv tables. To checkout the tables you can have a look at dm.TABLES. This is a dictionary of dataframes created during the import of the model from a number of csv tables that define standard cordex domains. E.g., available tables are:

[11]:
cx.domains.tables["cordex-high-res"]
[11]:
region long_name nlon nlat ll_lon ur_lon ll_lat ur_lat dlon dlat pollon pollat
short_name
SAM-11 1 South America 584 668 143.755 207.885 -38.445 34.925 0.11 0.11 -56.06 70.60
CAM-11 2 Central America 840 452 -52.965 39.325 -28.765 20.845 0.11 0.11 113.98 75.74
NAM-11 3 North America 620 520 -34.045 34.045 -28.565 28.525 0.11 0.11 83.00 42.50
EUR-11 4 Europe 424 412 -28.375 18.155 -23.375 21.835 0.11 0.11 -162.00 39.25
AFR-11 5 Africa 776 804 -24.805 60.445 -45.925 42.405 0.11 0.11 180.00 90.00
WAS-11 6 South Asia 772 520 -32.285 52.525 -21.725 35.365 0.11 0.11 -123.34 79.95
EAS-11 7 East Asia 812 668 -41.085 48.125 -27.005 46.365 0.11 0.11 -64.78 77.61
CAS-11 8 Central Asia 612 400 -34.485 32.725 -20.845 23.045 0.11 0.11 -103.39 43.48
AUS-11 9 Australasia 800 516 141.995 229.885 -23.045 33.605 0.11 0.11 141.38 60.31
ANT-11 10 Antarctica 500 388 152.555 207.445 -27.885 14.685 0.11 0.11 -166.92 6.08
ARC-11 11 Arctic 464 532 -23.045 27.885 -24.365 34.045 0.11 0.11 0.00 6.55
MED-11 12 Mediterranean 392 252 -23.385 19.625 -21.505 6.105 0.11 0.11 198.00 39.25
MNA-11 13 Middle East and North Africa 928 472 -26.565 75.405 -6.765 45.045 0.11 0.11 180.00 90.00

The domains are created using the create_dataset function, e.g.:

[12]:
help(cx.create_dataset)
Help on function create_dataset in module cordex.core.domain:

create_dataset(nlon, nlat, dlon, dlat, ll_lon, ll_lat, pollon=None, pollat=None, name=None, dummy=False, add_vertices=False, attrs=None, mapping_name=None, bounds=None, **kwargs)
    Create domain dataset from grid information.

    Parameters
    ----------
    nlon : int
        longitudal number of grid boxes
    nlat : int
        latitudal number of grid boxes
    dlon : float
        longitudal resolution (degrees)
    dlat : float
        latitudal resolution (degrees)
    ll_lon : float
        lower left longitude (degrees)
    ll_lat : float
        lower left latitude (degrees)
    pollon : float
        pol longitude (degrees)
    pollat : float
        pol latitude (degrees)
    dummy : str or logical
        Name of dummy field, if dummy=topo, the cdo topo operator will be
        used to create some dummy topography data. dummy data is useful for
        looking at the domain with ncview.
    add_vertices : bool
        Add grid boundaries in the global coordinates (lon_vertices and lat_vertices).
    attrs: str or dict
        Global attributes that should be added to the dataset. If `attrs='CORDEX'`
        a set of standard CF global attributes.
    mapping_name: str
        Variable name of the grid mapping, if mapping_name is `None`, the CF standard
        variable name is used.

Let’s create the EUR-11 domain manually from the numbers in the table:

[13]:
eur11_user = cx.create_dataset(
    nlon=424,
    nlat=412,
    dlon=0.11,
    dlat=0.11,
    ll_lon=-28.375,
    ll_lat=-23.375,
    pollon=-162.00,
    pollat=39.25,
    dummy="topo",
)

We can check that this gives the same result as our preconfigured domain.

[14]:
eur11_user.equals(eur11)
[14]:
True

You can now use the create_dataset function to create any domain as an xarray dataset.

Check out the Africa domain!#

[15]:
afr11 = cx.cordex_domain("AFR-11", dummy="topo")
afr11
[15]:
<xarray.Dataset>
Dimensions:                     (rlon: 776, rlat: 804)
Coordinates:
  * rlon                        (rlon) float64 -24.8 -24.7 ... 60.34 60.45
  * rlat                        (rlat) float64 -45.92 -45.81 ... 42.3 42.41
    lon                         (rlat, rlon) float64 -24.81 -24.7 ... 60.44
    lat                         (rlat, rlon) float64 -45.92 -45.92 ... 42.41
Data variables:
    rotated_latitude_longitude  int32 0
    topo                        (rlat, rlon) float32 -4.429e+03 ... 161.0
Attributes:
    CORDEX_domain:  AFR-11
[16]:
pole = (
    afr11.rotated_latitude_longitude.grid_north_pole_longitude,
    afr11.rotated_latitude_longitude.grid_north_pole_latitude,
)
pole
[16]:
(180.0, 90.0)
[17]:
plot(afr11.topo, pole)
_images/domains_33_0.png

Plot all cordex-core domains#

We need a slightly modified plotting routine for this:

[18]:
def plots(dsets, vmin=None, vmax=None, borders=True, title=None):
    """plot a domain using the right projection with cartopy"""
    %matplotlib inline
    import cartopy.crs as ccrs
    import cartopy.feature as cf
    import matplotlib.patheffects as pe
    import matplotlib.pyplot as plt

    plt.figure(figsize=(20, 10))
    projection = ccrs.PlateCarree()
    # transform = ccrs.RotatedPole(pole_latitude=pole[1], pole_longitude=pole[0])
    # ax = plt.axes(projection=projection)
    ax = plt.axes(projection=projection)
    # ax.set_extent([ds_sub.rlon.min(), ds_sub.rlon.max(), ds_sub.rlat.min(), ds_sub.rlat.max()], crs=transform)
    ax.gridlines(
        draw_labels=True,
        linewidth=0.5,
        color="gray",
        xlocs=range(-180, 180, 15),
        ylocs=range(-90, 90, 10),
    )
    # path_effects = [pe.Stroke(linewidth=50, foreground='g'), pe.Normal()]
    for ds in dsets:
        pole = (
            ds.rotated_latitude_longitude.grid_north_pole_longitude,
            ds.rotated_latitude_longitude.grid_north_pole_latitude,
        )
        transform = ccrs.RotatedPole(pole_latitude=pole[1], pole_longitude=pole[0])
        ds.topo.plot(
            ax=ax,
            cmap="terrain",
            transform=transform,
            vmin=vmin,
            vmax=vmax,
            x="rlon",
            y="rlat",
            add_colorbar=False,
        )
    ax.coastlines(resolution="50m", color="black", linewidth=1)
    if borders:
        ax.add_feature(cf.BORDERS)
    if title is not None:
        ax.set_title("")

Now, let’s plot all cordex core domains into one overview:

[19]:
plots(
    [
        cx.cordex_domain(name, dummy="topo")
        for name in cx.domains.tables["cordex-core"].index
    ],
    borders=False,
)
_images/domains_38_0.png

Using the grid information for interpolation#

The gridded Cordex datasets are particular usefule for regridding either with CDO or other interpolation packages.

We will use some CMIP6 model data from the ESGF to show how we can do the regridding:

[20]:
import xarray as xr

# ds = xr.open_dataset("https://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-HR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-HR_historical_r1i1p1f1_gn_199501-199912.nc")
ds = xr.open_dataset(
    "http://esgf-data3.ceda.ac.uk/thredds/dodsC/esg_cmip6/CMIP6/CMIP/MOHC/UKESM1-0-LL/historical/r1i1p1f2/Amon/tas/gn/v20190406/tas_Amon_UKESM1-0-LL_historical_r1i1p1f2_gn_185001-194912.nc"
)
ds
[20]:
<xarray.Dataset>
Dimensions:    (time: 1200, bnds: 2, lat: 144, lon: 192)
Coordinates:
  * time       (time) object 1850-01-16 00:00:00 ... 1949-12-16 00:00:00
  * lat        (lat) float64 -89.38 -88.12 -86.88 -85.62 ... 86.88 88.12 89.38
  * lon        (lon) float64 0.9375 2.812 4.688 6.562 ... 355.3 357.2 359.1
    height     float64 1.5
Dimensions without coordinates: bnds
Data variables:
    time_bnds  (time, bnds) object 1850-01-01 00:00:00 ... 1950-01-01 00:00:00
    lat_bnds   (lat, bnds) float64 -90.0 -88.75 -88.75 ... 88.75 88.75 90.0
    lon_bnds   (lon, bnds) float64 0.0 1.875 1.875 3.75 ... 358.1 358.1 360.0
    tas        (time, lat, lon) float32 ...
Attributes: (12/47)
    Conventions:                     CF-1.7 CMIP-6.2
    activity_id:                     CMIP
    branch_method:                   standard
    branch_time_in_child:            0.0
    branch_time_in_parent:           144000.0
    creation_date:                   2019-04-05T16:02:56Z
    ...                              ...
    variable_id:                     tas
    variant_label:                   r1i1p1f2
    license:                         CMIP6 model data produced by the Met Off...
    cmor_version:                    3.4.0
    tracking_id:                     hdl:21.14100/255d149c-12fc-41f1-878d-034...
    DODS_EXTRA.Unlimited_Dimension:  time

Interpolation using CDO#

We will use CDO’s python bindings to control the cdo operators. Please note, that the python bindings in the background actually execute the cdo commands on the command line. CDO does have a huge IO overhead since it will always write a file between each operator step and will always need data from a file on the filesystem as input. If you give an xarray dataset as input (see below), the python binding will actually trigger a to_netcdf call to write the input as a temporary file to disk. You should be aware of this if you use huge xarray datasets as input.

We will first write the EUR-11 grid into a file on the disk so that we can use it as input to CDO

[21]:
from cdo import Cdo

eur11.to_netcdf("EUR-11_grid.nc")

Now we will remap the first timestep of the CMIP6 modeldata to the EUR-11 grid:

[22]:
remap_cdo = Cdo().remapbil("EUR-11_grid.nc", input=ds.isel(time=0), returnXArray="tas")
remap_cdo
[22]:
<xarray.DataArray 'tas' (rlat: 412, rlon: 424)>
[174688 values with dtype=float32]
Coordinates:
    lon      (rlat, rlon) float64 ...
    lat      (rlat, rlon) float64 ...
  * rlon     (rlon) float64 -28.38 -28.27 -28.16 -28.05 ... 17.94 18.05 18.16
  * rlat     (rlat) float64 -23.38 -23.27 -23.16 -23.05 ... 21.62 21.73 21.84
    height   float64 1.5
Attributes:
    standard_name:  air_temperature
    long_name:      Near-Surface Air Temperature
    units:          K
    grid_mapping:   rotated_latitude_longitude
    comment:        near-surface (usually, 2 meter) air temperature
    original_name:  mo: (stash: m01s03i236, lbproc: 128)
    cell_methods:   area: time: mean
    cell_measures:  area: areacella
    history:        2019-04-05T16:02:56Z altered by CMOR: Treated scalar dime...
    _ChunkSizes:    [  1 144 192]
[23]:
(remap_cdo - 273.5).plot()
[23]:
<matplotlib.collections.QuadMesh at 0x14696f820>
_images/domains_49_1.png

Alternative interpolation methods#

A nice alternative is xesmf since it is based on xarray and will also very nicely work with dask.

[24]:
import xesmf as xe
[25]:
regridder = xe.Regridder(ds, eur11, "bilinear", periodic=True)
regridder
[25]:
xESMF Regridder
Regridding algorithm:       bilinear
Weight filename:            bilinear_144x192_412x424_peri.nc
Reuse pre-computed weights? False
Input grid shape:           (144, 192)
Output grid shape:          (412, 424)
Periodic in longitude?      True
[26]:
remap_xe = regridder(ds.tas.isel(time=0))
[27]:
(remap_xe - 273.5).plot()
[27]:
<matplotlib.collections.QuadMesh at 0x1184813d0>
_images/domains_55_1.png

We can easily compare both approaches

[28]:
(remap_cdo - remap_xe).plot(x="lon", y="lat")
[28]:
<matplotlib.collections.QuadMesh at 0x1460e6a30>
_images/domains_57_1.png

The nice thing about xesmf is that it works together with xarray and will keep all meta information. Another consequence is that xesmf works well with dask and it’s vectorization. That means, if we have a long time axis along we want to regrid, this can easily be parallelized using, e.g., dask.distributed and will also work nicely with large datasets that don’t fit into memory. The xesmf regridder can also store and reuse regridding weights for faster interpolation.