MIP Convert User Guide
Overview
The mip_convert package enables a user to produce the output netCDF files for a MIP using model output files.
graph LR
A[model output .pp data] --> C[MIP Convert + CMOR];
B[model output .nc data] --> C;
C --> D[CF/CMIP6 Compliant .nc data];
- The user makes requests for one or more MIP requested variables by providing specific information (including the appropriate MIP requested variable names) in the user configuration file.
- The information required to produce the MIP requested variables is gathered from the user configuration file, the model to MIP mapping configuration files and the appropriate MIP table, in that order.
- The following steps are then performed for each MIP requested variable name in the user configuration file to produce the output netCDF files:
    - load the relevant data from the model output files into one or more input variables, using Iris and the information provided in the user configuration file and the model to MIP mapping configuration files. A single input variable is loaded when there is a one-to-one or simple arithmetic relationship between the MIP output variable and the input variable; several input variables are loaded when the MIP output variable is an arithmetic combination of two or more input variables.
    - process the input variable(s) to produce the MIP output variable using the information provided in the model to MIP mapping configuration files.
    - save the MIP output variable to an output netCDF file using CMOR and the information provided in the user configuration file and the appropriate MIP table.
Recommended Reading
- The Design Considerations and Overview section in the CMOR Documentation.
Quick Start Guide
- Download a template user configuration file. Two examples are provided below, one with parent information and one without. Parent experiment information is used in CMIP to define where the initial conditions of a simulation are taken from; for example, the historical experiments take their initial conditions from a point in the pre-industrial control run (which needs to be documented). Certain experiments, such as amip (an atmosphere-only simulation of recent history), do not have a parent experiment, so this information is not required. For ad hoc use we recommend the no-parent example:
    - If you intend to convert a dataset without parent information: mip_convert_amip_no_parent.cfg.
    - If you intend to convert a dataset with parent information: mip_convert_piControl_with_parent.cfg.
- Make the appropriate edits to the template user configuration file using the information provided in the "User Configuration File" section and the specified sections in the CMOR Documentation.
- Source an environment with `cdds` and verify that `mip_convert` runs: `mip_convert -h`
- Produce the output netCDF files by running `mip_convert` and passing in the modified user configuration file as an argument: `mip_convert mip_convert_<modified_template_name>.cfg`
- Check the exit code: `echo $?`

    | Exit Code | Meaning |
    |---|---|
    | 0 | No errors were raised during processing. |
    | 1 | An exception was raised and no MIP requested variables were produced. |
    | 2 | One or more MIP requested variables were produced but not all variables were produced. See the `CRITICAL` messages in the log for further information about the MIP requested variables not produced. |

- Check that the output netCDF files are as expected. For help or to report an issue, please see support.
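The exit code check in the last steps can also be scripted. The following is a minimal sketch, not part of MIP Convert: `describe_exit_code` and `run_mip_convert` are hypothetical helper names, and the sketch assumes `mip_convert` is available on the `PATH` of the sourced environment. The meanings are copied from the exit code table above.

```python
import subprocess

# Meanings copied from the exit code table in this guide.
EXIT_CODE_MEANINGS = {
    0: "No errors were raised during processing.",
    1: "An exception was raised and no MIP requested variables were produced.",
    2: "One or more MIP requested variables were produced but not all; "
       "see the CRITICAL messages in the log.",
}


def describe_exit_code(code):
    """Return the documented meaning of an exit code, or a fallback message."""
    return EXIT_CODE_MEANINGS.get(code, f"Undocumented exit code: {code}")


def run_mip_convert(config_path):
    """Run mip_convert on a user configuration file (hypothetical helper).

    Assumes the mip_convert executable is on the PATH.
    """
    result = subprocess.run(["mip_convert", config_path])
    return result.returncode, describe_exit_code(result.returncode)
```

A wrapper script could call `run_mip_convert("mip_convert.cfg")` and treat a return code of 2 as a partial success that still needs the log to be inspected.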
Selected MIP Convert Arguments
| Argument | Description |
|---|---|
| `config_file` | The name of the user configuration file. For more information, please see the MIP Convert user guide. |
| `-s` or `--stream_identifiers` | The stream identifiers to process. If all streams should be processed, do not specify this option. |
| `--relaxed-cmor` | If specified, CMIP6 style validation is not performed by CMOR. If the validation is run then the following fields are not checked: `model_id` (`source_id`), `experiment_id`, `further_info_url`, `grid_label`, `parent_experiment_id`, `sub_experiment_id`. |
| `--mip_era` | The MIP era (e.g. CMIP6). |
| `--external_plugin` | Module path to external CDDS plugin (e.g. `arise.plugin`). |
| `--external_plugin_location` | Path to the external plugin implementation (e.g. `/project/cdds/arise`). |
Example Usage
Run for all streams with full checking of metadata
mip_convert mip_convert.cfg
Run for a single stream in relaxed mode
mip_convert mip_convert.cfg -s ap4 --relaxed_cmor
User Configuration File Reference
The user configuration file provides the information required by MIP Convert to produce the output netCDF files. It contains the following sections, some of which are optional.
| Section | Summary |
|---|---|
| `[COMMON]` | Convenience for setting up shared config values. |
| `[cmor_setup]` | Passed through to CMOR's cmor_setup() routine. |
| `[cmor_dataset]` | Passed through to CMOR's cmor_data_set_json() routine. |
| `[request]` | Configure mip_convert including input data. |
| `[stream_<stream_id>]` | The variables to produce from a particular `<stream_id>`. |
| `[masking]` | Apply polar row masking if needed. |
| `[halo_removal]` | Apply stripping of halo columns and rows if needed. |
| `[slicing_periods]` | Use the slicing period for a particular stream if given. |
| `[global_attributes]` | Add global attributes to the output netCDF. |
COMMON
The optional [COMMON] section is a convenience for defining values that are shared by other sections.
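As an illustration of how shared values might be reused, the sketch below reads a small configuration with Python's standard `configparser`. It assumes the file uses `configparser`'s extended interpolation syntax (`${SECTION:option}`); whether MIP Convert resolves `[COMMON]` values in exactly this way is an assumption, and the option name `root_dir` and the paths are hypothetical.

```python
import configparser

# Hypothetical configuration fragment; root_dir and the paths are made up.
SAMPLE = """
[COMMON]
root_dir = /path/to/project

[cmor_setup]
mip_table_dir = ${COMMON:root_dir}/mip_tables

[cmor_dataset]
output_dir = ${COMMON:root_dir}/output
"""

# ExtendedInterpolation lets one section refer to an option in another.
parser = configparser.ConfigParser(
    interpolation=configparser.ExtendedInterpolation()
)
parser.read_string(SAMPLE)

mip_table_dir = parser.get("cmor_setup", "mip_table_dir")
output_dir = parser.get("cmor_dataset", "output_dir")
print(mip_table_dir)
print(output_dir)
```

Defining the root once means that only one line changes when the files move to a different location.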
cmor_setup
The [cmor_setup] section contains the following options which are used by cmor_setup().
For a description of each option please see the documentation for cmor_setup().
| Option | Required by | Used by | CMOR Name |
|---|---|---|---|
| `mip_table_dir` | MIP Convert | CMOR + MIP Convert | `inpath` |
| `netcdf_file_action` | | CMOR | |
| `set_verbosity` | | CMOR | |
| `exit_control` | | CMOR | |
| `cmor_log_file` | | CMOR | `log_file` |
| `create_subdirectories` | | CMOR | |
Tip
When configuring a user configuration file, the mip_table_dir is likely to be the only value that will need modification.
cmor_dataset
The required cmor_dataset section contains the following options used for cmor_data_set_json() for CMIP6.
| Option | Required by | Used by | Notes |
|---|---|---|---|
| `branch_method` | MIP Convert + CMOR | MIP Convert + CMOR | 1 |
| `calendar` | MIP Convert + CMOR | MIP Convert + CMOR | 2 |
| `comment` | | CMOR | 1, 3 |
| `contact` | | CMOR | 1 |
| `experiment_id` | MIP Convert + CMOR | MIP Convert + CMOR | 1 |
| `grid` | CMOR | CMOR | 1 |
| `grid_label` | CMOR | CMOR | 1 |
| `institution_id` | MIP Convert + CMOR | MIP Convert + CMOR | 1 |
| `license` | CMOR | CMOR | 1 |
| `mip` | CMOR | CMOR | 1, 4 |
| `mip_era` | MIP Convert + CMOR | MIP Convert + CMOR | 1 |
| `model_id` | MIP Convert + CMOR | MIP Convert + CMOR | 1, 5 |
| `model_type` | CMOR | CMOR | 1, 6 |
| `nominal_resolution` | CMOR | CMOR | 1 |
| `output_dir` | MIP Convert + CMOR | MIP Convert + CMOR | 7 |
| `output_file_template` | | CMOR | |
| `output_path_template` | | CMOR | |
| `references` | | CMOR | 1 |
| `sub_experiment_id` | MIP Convert + CMOR | MIP Convert | 1 |
| `variant_info` | | CMOR | 1 |
| `variant_label` | MIP Convert + CMOR | MIP Convert + CMOR | 1 |
Notes
1. For a description of each option, please see the CMIP6 Global Attributes document.
2. Calendar types allowed are:
    - standard
    - gregorian
    - proleptic_gregorian
    - noleap
    - 365_day
    - 360_day
    - julian
    - all_leap
    - 366_day
3. It is recommended to use the `comment` option to record any perturbed physics details.
4. See activity_id. For more examples see the CMIP6 CVs.
5. See source_id. For more examples see the CMIP6 CVs.
6. See source_type.
7. If `create_subdirectories` is set to True, CMOR will construct the full directory structure used by CMIP, as specified in the DRS (Directory Reference Syntax) information held in the CVs, underneath `output_dir`. If `create_subdirectories` is set to False (as is usual within CDDS), files will be written directly to `output_dir`.
MIP Convert determines:
- the `experiment`, `institution`, `source`, `sub_experiment` from the CV file using the `experiment_id`, `institution_id`, `source_id` and `sub_experiment_id`, respectively
- the `forcing_index`, `initialization_index`, `physics_index` and `realization_index` from the `variant_label`
- the `further_info_url` and `tracking_prefix` based on the information from the CV file
- the `history`
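To make the relationship between the `variant_label` and the four indices concrete, here is a minimal sketch. It relies on the CMIP6 convention that a variant label has the form `r<k>i<l>p<m>f<n>`; the function name `parse_variant_label` is hypothetical and this is not necessarily how MIP Convert implements the step.

```python
import re

# CMIP6 variant labels have the form r<k>i<l>p<m>f<n>, e.g. r1i1p1f2.
VARIANT_LABEL_PATTERN = re.compile(r"^r(\d+)i(\d+)p(\d+)f(\d+)$")


def parse_variant_label(variant_label):
    """Split a variant label into its four integer indices."""
    match = VARIANT_LABEL_PATTERN.match(variant_label)
    if match is None:
        raise ValueError(f"Not a valid variant label: {variant_label!r}")
    realization, initialization, physics, forcing = (
        int(group) for group in match.groups()
    )
    return {
        "realization_index": realization,
        "initialization_index": initialization,
        "physics_index": physics,
        "forcing_index": forcing,
    }


indices = parse_variant_label("r1i2p3f4")
print(indices["forcing_index"])
```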
Whenever a parent experiment exists the following options must also be specified.
| Option | Used by | Notes |
|---|---|---|
| `branch_date_in_child` | MIP Convert | 1, 2, 3 |
| `branch_date_in_parent` | MIP Convert | 1, 2, 3 |
| `parent_base_date` | MIP Convert | 1, 2 |
| `parent_experiment_id` | CMOR | 3 |
| `parent_mip_era` | CMOR | 3 |
| `parent_model_id` | CMOR | 3, 4 |
| `parent_time_units` | CMOR | 3 |
| `parent_variant_label` | CMOR | 3 |
Notes
1. CMOR requires `branch_time_in_child` and `branch_time_in_parent`. These are determined by taking the difference (in days) between `branch_date_in_child` / `branch_date_in_parent` (the date in the child / parent experiment from which the experiment branches, given in the `cmor_dataset` section of the user configuration file) and `base_date` (see the `request` section) / `parent_base_date` (the base date of the child / parent experiment). If `branch_date_in_child` or `branch_date_in_parent` is `N/A` then `branch_time_in_parent` is set to 0.
2. Dates should be provided in the form `YYYY-MM-DDThh:mm:ssZ`.
3. For a description of each option, please see the CMIP6 Global Attributes document.
4. See `parent_source_id` in the CMIP6 Global Attributes document.
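The day difference described in note 1 can be illustrated as follows. This is a sketch under stated assumptions rather than MIP Convert's own code: the function name `branch_time_in_days` is hypothetical, only the `360_day` and `proleptic_gregorian` calendars are handled, and the time of day is ignored.

```python
import re
from datetime import date

# Accepts YYYY-MM-DDThh:mm:ss with an optional trailing Z.
DATE_PATTERN = re.compile(r"^(\d{4})-(\d{2})-(\d{2})T\d{2}:\d{2}:\d{2}Z?$")


def _year_month_day(date_string):
    """Extract (year, month, day) as integers from a date string."""
    match = DATE_PATTERN.match(date_string)
    if match is None:
        raise ValueError(f"Unexpected date format: {date_string!r}")
    return tuple(int(part) for part in match.groups())


def branch_time_in_days(branch_date, base_date, calendar="360_day"):
    """Whole days from base_date to branch_date in the given calendar."""
    b_year, b_month, b_day = _year_month_day(branch_date)
    a_year, a_month, a_day = _year_month_day(base_date)
    if calendar == "360_day":
        # Every year has 12 months of exactly 30 days.
        return (
            (b_year - a_year) * 360
            + (b_month - a_month) * 30
            + (b_day - a_day)
        )
    if calendar == "proleptic_gregorian":
        # Python's date type uses the proleptic Gregorian calendar.
        return (date(b_year, b_month, b_day) - date(a_year, a_month, a_day)).days
    raise NotImplementedError(f"Calendar not handled in this sketch: {calendar}")


days = branch_time_in_days("2000-01-01T00:00:00Z", "1850-01-01T00:00:00")
print(days)  # 150 years * 360 days = 54000
```

For the other calendar types listed for the `calendar` option, a calendar-aware library would be needed; that is outside the scope of this sketch.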
request
The required request section contains the following options which are used only by MIP Convert.
| Option | Required | Description | Notes |
|---|---|---|---|
| `ancil_files` | | A space separated list of the full paths to any required ancillary files. | |
| `atmos_timestep` | | The atmospheric model timestep in integer seconds. | 1 |
| `base_date` | Yes | The date in the form `YYYY-MM-DDThh:mm:ss`. | 2 |
| `deflate_level` | | The deflation level when writing the output netCDF file from 0 (no compression) to 9 (maximum compression). | |
| `force_coordinate_rotation` | | If set to True, output data will be forced to include rotated coordinates and true lat-lon coordinates. | |
| `hybrid_heights_file` | | A space separated list of the full paths to the files containing the information about the hybrid heights. | 3 |
| `mask_slice` | Yes | Optional slicing expression for masking data in the form of `n:m,i:j`, or `no_mask`. | 4, 8 |
| `model_output_dir` | Yes | The full path to the root directory containing the model output files. | 5 |
| `mip_convert_plugin` | Yes | The id of the MIP Convert plugin that should be used. | |
| `mip_convert_external_plugin` | | The module path to the external MIP Convert plugin, e.g. `cdds_arise.arise_mip_convert_plugin`. | |
| `mip_convert_external_plugin_location` | | The full path to the external plugin implementation, e.g. `$CDDS_ETC/mapping_plugins/arise`. | |
| `reference_time` | Yes | The reference time used to construct `reftime` and `leadtime` coordinates. Only used if these coordinates are specified in the corresponding variable entries in the MIP table. | |
| `replacement_coordinates_file` | | The full path to the netCDF file containing area variables that refer to the horizontal coordinates that should be used to replace the corresponding values in the model output files. | 6 |
| `run_bounds` | Yes | The start and end time in the form `<start_time> <end_time>`, where `<start_time>` and `<end_time>` are in the form `YYYY-MM-DDThh:mm:ss`. | |
| `shuffle` | | Whether to shuffle when writing the output netCDF file. | |
| `sites_file` | | The full path to the file containing the information about the sites. | 7 |
| `suite_id` | Yes | The suite identifier of the model. | |
Notes
1. The `atmos_timestep` is required for atmospheric tendency diagnostics, which have model to MIP mappings that depend on the atmospheric model timestep, i.e., the expression contains `ATMOS_TIMESTEP`.
2. The `base_date` is used to define the units of the time coordinate in the output netCDF file and is specified by the MIP.
3. The file containing the information about the hybrid heights has the following columns:
    - the `model level number` (int)
    - the `a_value` (float)
    - the `a_lower_bound` (float)
    - the `a_upper_bound` (float)
    - the `b_value` (float)
    - the `b_lower_bound` (float)
    - the `b_upper_bound` (float)
4. If not specified, `mip_convert` will try to retrieve masking expressions from plugins (this is the default behaviour for CMIP6-like processing). Putting `no_mask` into the configuration file allows `mip_convert` to process model output that does not require any masking; custom masks can be specified and passed to `mip_convert` without plugin dependencies.
5. It is expected that the model output files are located in the directory `<model_output_dir>/<suite_id>/<stream_id>/`, where the `<suite_id>` is the suite identifier and the `<stream_id>` is the stream identifier. Note that MIP Convert will load all the files in this directory and then use the `run_bounds` to select the required data; when selecting a short time period from a large number of model output files it is recommended to copy the relevant files to an empty directory to save time when loading.
6. Currently, only CICE horizontal coordinates can be replaced.
7. The file containing the information about the sites has the following columns:
    - the `site number` (int)
    - the `longitude` (float, from 0 to 360) [degrees]
    - the `latitude` (float, from -90 to 90) [degrees]
    - the `orography` (float) [metres] and a `comment` (string).
8. This is usually only used to mask ocean/seaice data as part of a CDDS run.
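The directory layout in note 5 and the `run_bounds` format can be sketched in a few lines. The helper names `stream_directory` and `split_run_bounds` and the example values (`/data/model_output`, `u-ab123`) are hypothetical; the sketch only validates the textual form of the dates and does not interpret them in any calendar.

```python
import re
from pathlib import PurePosixPath

# Matches the YYYY-MM-DDThh:mm:ss form used by run_bounds and base_date.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$")


def stream_directory(model_output_dir, suite_id, stream_id):
    """Build <model_output_dir>/<suite_id>/<stream_id> as a POSIX path."""
    return PurePosixPath(model_output_dir) / suite_id / stream_id


def split_run_bounds(run_bounds):
    """Split '<start_time> <end_time>' and check the form of both parts."""
    parts = run_bounds.split()
    if len(parts) != 2 or not all(DATE_PATTERN.match(part) for part in parts):
        raise ValueError(f"Unexpected run_bounds value: {run_bounds!r}")
    return parts[0], parts[1]


# Hypothetical example values.
directory = stream_directory("/data/model_output", "u-ab123", "ap5")
start_time, end_time = split_run_bounds("1850-01-01T00:00:00 1851-01-01T00:00:00")
print(directory, start_time, end_time)
```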
stream_<stream_id>
The required [stream_<stream_id>] section, where the <stream_id> is the stream identifier, contains options equal to the name of the MIP table and values equal to a space-separated list of MIP requested variable names.
Multiple [stream_<stream_id>] sections can be defined.
Note
All output netCDF files are created for a stream before moving onto the next stream.
Example
Suppose we want to produce the variables Amon/tas, Amon/pr, Emon/ps, Amon/tasmax and day/tasmin using the CMIP6 tables.
- We need two `[stream_<stream_id>]` sections, as the list of MIP requested variables spans streams `ap5` and `ap6`.
- Within each of these sections we specify the MIP table in the form `<MIP_ERA>_<MIP_TABLE>`.
- We then give the variable names as a space-separated list.
[stream_ap5]
CMIP6_Amon = tas pr
CMIP6_Emon = ps
[stream_ap6]
CMIP6_Amon = tasmax
CMIP6_day = tasmin
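To show how these sections map onto individual requests, the sketch below parses the example above with Python's standard `configparser` and expands it into one tuple per MIP requested variable. The function name `requested_variables` is hypothetical, and this is an illustration of the file layout rather than MIP Convert's own parsing code.

```python
import configparser

SAMPLE = """
[stream_ap5]
CMIP6_Amon = tas pr
CMIP6_Emon = ps

[stream_ap6]
CMIP6_Amon = tasmax
CMIP6_day = tasmin
"""


def requested_variables(config_text):
    """Return (stream_id, mip_era, mip_table, variable) for every request."""
    parser = configparser.ConfigParser()
    # Keep option names case-sensitive so that CMIP6_Amon is not lowercased.
    parser.optionxform = str
    parser.read_string(config_text)
    requests = []
    for section in parser.sections():
        if not section.startswith("stream_"):
            continue
        stream_id = section[len("stream_"):]
        for option, value in parser.items(section):
            # Options have the form <MIP_ERA>_<MIP_TABLE>.
            mip_era, mip_table = option.split("_", 1)
            for variable in value.split():
                requests.append((stream_id, mip_era, mip_table, variable))
    return requests


for request in requested_variables(SAMPLE):
    print(request)
```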
masking
The optional masking section is used if a particular stream needs to be masked.
This is usually only used for polar row masking in NEMO and CICE output.
Example
[masking]
stream_inm_cice-T: -1:,180:
halo_removal
The optional halo_removal section is used if haloes need to be removed from particular streams.
Example
[halo_removal]
stream_apa = 5:,:-10
stream_ap6 = 20:-15
Tip
If a cube is used which has the haloes_removed = "true" global attribute, the halo removal process will be skipped for that cube.
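The slicing expressions used here (and in `mask_slice`) resemble Python slice notation. The sketch below assumes they follow that notation, with one comma-separated slice per dimension; which dimension each slice applies to is not stated in this guide, so the row-then-column order in `apply_2d` is an assumption. Both function names are hypothetical.

```python
def parse_slices(expression):
    """Turn an expression such as '5:,:-10' into a tuple of slice objects."""
    slices = []
    for part in expression.split(","):
        # An empty bound (as in '5:') means "no limit" and becomes None.
        bounds = [int(b) if b.strip() else None for b in part.split(":")]
        if not 2 <= len(bounds) <= 3:
            raise ValueError(f"Not a slice expression: {part!r}")
        slices.append(slice(*bounds))
    return tuple(slices)


def apply_2d(grid, expression):
    """Apply a two-part expression to a list of rows (assumed row, column order)."""
    row_slice, column_slice = parse_slices(expression)
    return [row[column_slice] for row in grid[row_slice]]


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Drop the first row and the last column.
trimmed = apply_2d(grid, "1:,:-1")
print(trimmed)
```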
slicing_periods
The optional slicing_periods section sets the slicing period for particular streams.
Streams with no specified slicing period use the default slicing period, which is year.
Example
[slicing_periods]
stream_apa = month
stream_ap6 = year
global_attributes
The global_attributes section takes additional metadata to be added to the global attributes of
every netCDF file produced.
Where a project specifies additional required_global_attributes in the Controlled Vocabulary files
these must be specified in this section. For example, for CMIP6 you must add the following line
here or MIP Convert will fail:
further_info_url = https://furtherinfo.es-doc.org/