MIP Convert User Guide
Overview
The mip_convert package enables a user to produce the output netCDF files for a MIP using model output files.
graph LR
A[model output .pp data] --> C[MIP Convert + CMOR];
B[model output .nc data] --> C;
C --> D[CF/CMIP6 Compliant .nc data];
- The user makes requests for one or more MIP requested variables by providing specific information (including the appropriate MIP requested variable names) in the user configuration file.
- The information required to produce the MIP requested variables is gathered from the user configuration file, the model to MIP mapping configuration files and the appropriate MIP table, in that order.
- The following steps are then performed for each MIP requested variable name in the user configuration file to produce the output netCDF files:
- load the relevant data from the model output files into one or more input variables, using Iris and the information provided in the user configuration file and the model to MIP mapping configuration files. A single input variable is loaded when there is a one-to-one or simple arithmetic relationship between the MIP output variable and the input variable; two or more input variables are loaded when the MIP output variable is based on an arithmetic combination of them.
- process the input variable / input variables to produce the MIP output variable using the information provided in the model to MIP mapping configuration files.
- save the MIP output variable to an output netCDF file using CMOR and the information provided in the user configuration file and the appropriate MIP table.
Recommended Reading
- The Design Considerations and Overview section in the CMOR Documentation.
Quick Start Guide
- Download the template user configuration file mip_convert.cfg.
- Make the appropriate edits to the template user configuration file using the information provided in the "User Configuration File" section and the specified sections in the CMOR Documentation.
- Source an environment with cdds and verify that mip_convert runs.
  mip_convert -h
- Produce the output netCDF files by running mip_convert and passing in the modified user configuration file as an argument.
  mip_convert mip_convert.cfg
- Check the exit code.
  echo $?
  Exit Code | Meaning |
  ---|---|
  0 | No errors were raised during processing. |
  1 | An exception was raised and no MIP requested variables were produced. |
  2 | One or more MIP requested variables were produced but not all variables were produced. See the CRITICAL messages in the log for further information about the MIP requested variables not produced. |
- Check that the output netCDF files are as expected. For help or to report an issue, please see support.
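The exit codes listed above can also be checked from a script rather than with echo $?. The following is a minimal sketch, not part of MIP Convert: the helper names describe_exit_code and run_mip_convert are hypothetical, and it assumes mip_convert is available on the PATH of the sourced environment.

```python
import subprocess

# Meanings taken from the exit code list in this guide.
EXIT_CODE_MEANINGS = {
    0: "No errors were raised during processing.",
    1: "An exception was raised and no MIP requested variables were produced.",
    2: "Some but not all MIP requested variables were produced; "
       "see the CRITICAL messages in the log.",
}


def describe_exit_code(code):
    """Return the documented meaning of a mip_convert exit code."""
    return EXIT_CODE_MEANINGS.get(code, f"Unexpected exit code: {code}")


def run_mip_convert(config_file):
    """Run mip_convert on config_file (hypothetical wrapper).

    Assumes the mip_convert command is on the PATH. Returns the exit
    code together with its documented meaning.
    """
    completed = subprocess.run(["mip_convert", config_file], check=False)
    return completed.returncode, describe_exit_code(completed.returncode)
```

Calling run_mip_convert("mip_convert.cfg") would then return a pair such as (2, "Some but not all ..."), which a wrapper script could use to decide whether to inspect the log.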
Selected MIP Convert Arguments
Argument | Description |
---|---|
config_file | The name of the user configuration file. For more information, please see the MIP Convert user guide. |
-s or --stream_identifiers | The stream identifiers to process. If all streams should be processed, do not specify this option. |
--relaxed-cmor | If specified, CMIP6 style validation is not performed by CMOR. If the validation is run then the following fields are not checked: model_id (source_id), experiment_id, further_info_url, grid_label, parent_experiment_id, sub_experiment_id. |
--mip_era | The MIP era (e.g. CMIP6). |
--external_plugin | Module path to external CDDS plugin (e.g. arise.plugin). |
--external_plugin_location | Path to the external plugin implementation (e.g. /project/cdds/arise). |
Example Usage
Run for all streams with full checking of metadata
mip_convert mip_convert.cfg
Run for a single stream in relaxed mode
mip_convert mip_convert.cfg -s ap4 --relaxed_cmor
User Configuration File Reference
The user configuration file provides the information required by MIP Convert to produce the output netCDF files. It contains the following sections, some of which are optional.
Section | Summary |
---|---|
[COMMON] | Convenience for setting up shared config values. |
[cmor_setup] | Passed through to CMOR's cmor_setup() routine. |
[cmor_dataset] | Passed through to CMOR's cmor_dataset_json() routine. |
[request] | Configure mip_convert including input data. |
[stream_<stream_id>] | The variables to produce from a particular <stream_id>. |
[masking] | Apply polar row masking if needed. |
[halo_removal] | Apply stripping of halo columns and rows if needed. |
[slicing_periods] | Use the slicing period for a particular stream, if given. |
[global_attributes] | Add global attributes to the output netCDF. |
COMMON
The optional [COMMON] section is a convenience for setting up shared configuration values.
cmor_setup
The [cmor_setup] section contains the following options, which are used by cmor_setup().
For a description of each option please see the documentation for cmor_setup().
Option | Required by | Used by | CMOR Name |
---|---|---|---|
mip_table_dir | MIP Convert | CMOR + MIP Convert | inpath |
netcdf_file_action | | CMOR | |
set_verbosity | | CMOR | |
exit_control | | CMOR | |
cmor_log_file | | CMOR | log_file |
create_subdirectories | | CMOR | |
Tip
When configuring a user configuration file, mip_table_dir is likely to be the only value that needs modification.
cmor_dataset
The required [cmor_dataset] section contains the following options, which are used by cmor_dataset_json().
Option | Required by | Used by | Notes |
---|---|---|---|
branch_method | MIP Convert + CMOR | MIP Convert + CMOR | [1] |
calendar | MIP Convert + CMOR | MIP Convert + CMOR | [2] |
comment | | CMOR | [1][3] |
contact | | CMOR | [1] |
experiment_id | MIP Convert + CMOR | MIP Convert + CMOR | [1] |
grid | CMOR | CMOR | [1] |
grid_label | CMOR | CMOR | [1] |
institution_id | MIP Convert + CMOR | MIP Convert + CMOR | [1] |
license | CMOR | CMOR | [1] |
mip | CMOR | CMOR | [1][4] |
mip_era | MIP Convert + CMOR | MIP Convert + CMOR | [1] |
model_id | MIP Convert + CMOR | MIP Convert + CMOR | [1][5] |
model_type | CMOR | CMOR | [1][6] |
nominal_resolution | CMOR | CMOR | [1] |
output_dir | MIP Convert + CMOR | MIP Convert + CMOR | [7] |
output_file_template | | CMOR | |
output_path_template | | CMOR | |
references | | CMOR | [1] |
sub_experiment_id | MIP Convert + CMOR | MIP Convert | [1] |
variant_info | | CMOR | [1] |
variant_label | MIP Convert + CMOR | MIP Convert + CMOR | [1] |
Notes
1. For a description of each option, please see the CMIP6 Global Attributes document.
2. See calendars for allowed values.
3. It is recommended to use the comment to record any perturbed physics details.
4. See MIP.
5. See model identifier.
6. See model type.
7. See outpath in the documentation for cmor_dataset_json.
MIP Convert determines:
- the experiment, institution, source and sub_experiment from the CV file using the experiment_id, institution_id, source_id and sub_experiment_id, respectively
- the forcing_index, initialization_index, physics_index and realization_index from the variant_label
- the further_info_url and tracking_prefix based on the information from the CV file
- the history
Whenever a parent experiment exists the following options must also be specified.
Option | Used by | Notes |
---|---|---|
branch_date_in_child | MIP Convert | [1][2][3] |
branch_date_in_parent | MIP Convert | [1][2][3] |
parent_base_date | MIP Convert | [1][2] |
parent_experiment_id | CMOR | [3] |
parent_mip_era | CMOR | [3] |
parent_model_id | CMOR | [3][4] |
parent_time_units | CMOR | [3] |
parent_variant_label | CMOR | [3] |
Notes
1. CMOR requires branch_time_in_child and branch_time_in_parent. These are determined from the options base_date (see the request section) / parent_base_date (the base date of the child_experiment_id / parent_experiment_id) and branch_date_in_child / branch_date_in_parent (the date in the child_experiment_id / parent_experiment_id from which the experiment branches) from the cmor_dataset section in the user configuration file, by taking the difference (in days) between the branch_date_in_child / branch_date_in_parent and the base_date / parent_base_date. If branch_date_in_child or branch_date_in_parent is N/A then branch_time_in_parent is set to 0.
2. Dates should be provided in the form YYYY-MM-DDThh:mm:ssZ.
3. For a description of each option, please see the CMIP6 Global Attributes document.
4. See parent_source_id in the CMIP6 Global Attributes document.
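The branch time calculation described in note 1 can be illustrated as follows. This is only a sketch under stated assumptions, not the MIP Convert implementation: the function name branch_time_in_days is hypothetical, and Python's datetime uses the proleptic Gregorian calendar, so the result will differ for model calendars such as 360_day.

```python
from datetime import datetime

# Date form given in note 2 (YYYY-MM-DDThh:mm:ssZ).
DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def branch_time_in_days(branch_date, base_date):
    """Difference in days between a branch date and a base date.

    Hypothetical helper. Assumes a Gregorian calendar; a 360_day or
    noleap model calendar would need a calendar-aware library instead.
    Following note 1, a branch date of "N/A" gives a branch time of 0.
    """
    if branch_date == "N/A":
        return 0.0
    branch = datetime.strptime(branch_date, DATE_FORMAT)
    base = datetime.strptime(base_date, DATE_FORMAT)
    return (branch - base).total_seconds() / 86400.0
```

For example, with a base date of 1850-01-01T00:00:00Z and a branch date of 1851-01-01T00:00:00Z, this sketch gives 365.0 days, because 1850 is not a leap year in the Gregorian calendar.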
request
The required [request] section contains the following options, which are used only by MIP Convert.
Option | Required | Description | Notes |
---|---|---|---|
ancil_files | | A space-separated list of the full paths to any required ancillary files. | |
atmos_timestep | | The atmospheric model timestep in integer seconds. | [1] |
base_date | Yes | The date in the form YYYY-MM-DDThh:mm:ss. | [2] |
deflate_level | | The deflation level when writing the output netCDF file, from 0 (no compression) to 9 (maximum compression). | |
force_coordinate_rotation | | If set to True, output data will be forced to include rotated coordinates and true lat-lon coordinates. | |
hybrid_heights_file | | A space-separated list of the full paths to the files containing the information about the hybrid heights. | [3] |
mask_slice | Yes | Optional slicing expression for masking data in the form of n:m,i:j, or no_mask. | [4][8] |
model_output_dir | Yes | The full path to the root directory containing the model output files. | [5] |
reference_time | Yes | The reference time used to construct reftime and leadtime coordinates. Only used if these coordinates are specified in the corresponding variable entries in the MIP table. | |
replacement_coordinates_file | | The full path to the netCDF file containing area variables that refer to the horizontal coordinates that should be used to replace the corresponding values in the model output files. | [6] |
run_bounds | Yes | The start and end time in the form <start_time> <end_time>, where <start_time> and <end_time> are in the form YYYY-MM-DDThh:mm:ss. | |
shuffle | | Whether to shuffle when writing the output netCDF file. | |
sites_file | | The full path to the file containing the information about the sites. | [7] |
suite_id | Yes | The suite identifier of the model. | |
Notes
1. The atmos_timestep is required for atmospheric tendency diagnostics, which have model to MIP mappings that depend on the atmospheric model timestep, i.e., the expression contains ATMOS_TIMESTEP.
2. The base_date is used to define the units of the time coordinate in the output netCDF file and is specified by the MIP.
3. The file containing the information about the hybrid heights has the following columns:
   - the model level number (int)
   - the a_value (float)
   - the a_lower_bound (float)
   - the a_upper_bound (float)
   - the b_value (float)
   - the b_lower_bound (float)
   - the b_upper_bound (float)
4. If not specified, mip_convert will try to retrieve masking expressions from plugins (this is the default behaviour for CMIP6-like processing). Putting no_mask into the configuration file allows mip_convert to process model output that does not require any masking; custom masks can be specified and passed to mip_convert without plugin dependencies.
5. It is expected that the model output files are located in the directory <model_output_dir>/<suite_id>/<stream_id>/, where the <suite_id> is the suite identifier and the <stream_id> is the stream identifier. Note that MIP Convert will load all the files in this directory and then use the run_bounds to select the required data; when selecting a short time period from a large number of model output files it is recommended to copy the relevant files to an empty directory to save time when loading.
6. Currently, only CICE horizontal coordinates can be replaced.
7. The file containing the information about the sites has the following columns:
   - the site number (int)
   - the longitude (float, from 0 to 360) [degrees]
   - the latitude (float, from -90 to 90) [degrees]
   - the orography (float) [metres]
   - a comment (string)
8. This is usually only used to mask ocean/seaice data as part of a CDDS run.
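To make the run_bounds form and the directory layout from note 5 concrete, here is a small sketch. It is illustrative only: the helper names and the example values (/data/model_output, u-ab123, ap5) are hypothetical, and the date parsing assumes Gregorian-valid dates, which will not hold for every model calendar.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Date form given for run_bounds and base_date (YYYY-MM-DDThh:mm:ss).
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_run_bounds(run_bounds):
    """Split '<start_time> <end_time>' into two datetime objects.

    Hypothetical helper; dates that only exist in a 360_day calendar
    (e.g. 30 February) cannot be parsed this way.
    """
    start_text, end_text = run_bounds.split()
    start = datetime.strptime(start_text, DATE_FORMAT)
    end = datetime.strptime(end_text, DATE_FORMAT)
    if end <= start:
        raise ValueError("end time must be after start time")
    return start, end


def expected_stream_dir(model_output_dir, suite_id, stream_id):
    """Build <model_output_dir>/<suite_id>/<stream_id> as described in note 5."""
    return PurePosixPath(model_output_dir) / suite_id / stream_id
```

With the hypothetical values above, expected_stream_dir("/data/model_output", "u-ab123", "ap5") gives /data/model_output/u-ab123/ap5, which is the directory the model output files for stream ap5 would be expected in.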
stream_<stream_id>
The required [stream_<stream_id>] section, where the <stream_id> is the stream identifier, contains options equal to the name of the MIP table and values equal to a space-separated list of MIP requested variable names.
Multiple [stream_<stream_id>] sections can be defined.
Note
All output netCDF files are created for a stream before moving onto the next stream.
Example
Suppose we want to produce the variables Amon/tas, Amon/pr, Emon/ps, Amon/tasmax and day/tasmin using the CMIP6 tables.
- We need two [stream_<stream_id>] sections, as the list of MIP requested variables spans streams ap5 and ap6.
- Within each of these sections we specify the MIP table in the form <MIP_ERA>_<MIP_TABLE>:
- Then we give the variable names as a space-separated list.
[stream_ap5]
CMIP6_Amon: tas pr
CMIP6_Emon: ps
[stream_ap6]
CMIP6_Amon: tasmax
CMIP6_day: tasmin
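As a way of checking a configuration before running, the stream sections can be read back with Python's standard configparser module. This is a sketch only and is not how MIP Convert itself parses the file; the function name requested_variables is hypothetical, and it assumes option names follow the <MIP_ERA>_<MIP_TABLE> form.

```python
import configparser

# The stream sections from the example in this guide.
EXAMPLE = """\
[stream_ap5]
CMIP6_Amon: tas pr
CMIP6_Emon: ps

[stream_ap6]
CMIP6_Amon: tasmax
CMIP6_day: tasmin
"""


def requested_variables(config_text):
    """Map each stream identifier to its '<MIP table>/<variable>' names."""
    parser = configparser.ConfigParser()
    # Keep option names case-sensitive (configparser lowercases by default).
    parser.optionxform = str
    parser.read_string(config_text)
    result = {}
    for section in parser.sections():
        if not section.startswith("stream_"):
            continue
        stream_id = section[len("stream_"):]
        variables = []
        for option, value in parser.items(section):
            # Assumes the option has the form <MIP_ERA>_<MIP_TABLE>.
            _mip_era, mip_table = option.split("_", 1)
            variables.extend(f"{mip_table}/{name}" for name in value.split())
        result[stream_id] = variables
    return result
```

Applied to EXAMPLE, this returns the five requested variables grouped by stream, which makes it easy to spot a variable listed under the wrong stream or MIP table.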
masking
The optional [masking] section is used if a particular stream needs to be masked.
This is usually only used for polar row masking in NEMO & CICE output.
Example
[masking]
stream_inm_cice-T: -1:,180:
halo_removal
The optional [halo_removal] section is used if haloes are to be removed for a particular stream.
Example
[halo_removal]
stream_apa: 5:,:-10
stream_ap6: 20:-15
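The slicing expressions used in the [masking] and [halo_removal] examples resemble Python slice notation. The sketch below shows one way such an expression could be turned into slice objects and applied to a small two-dimensional grid. It rests on assumptions that this guide does not confirm: that each n:m part follows Python slice semantics, and that the first part applies to rows and the second to columns. The function names are hypothetical.

```python
def parse_slice_expression(expression):
    """Turn an expression such as '5:,:-10' into a tuple of slice objects.

    Assumption: each comma-separated part uses Python slice semantics.
    """
    slices = []
    for part in expression.split(","):
        fields = part.strip().split(":")
        if len(fields) not in (2, 3):
            raise ValueError(f"expected the form n:m, got {part!r}")
        bounds = [int(field) if field.strip() else None for field in fields]
        slices.append(slice(*bounds))
    return tuple(slices)


def strip_halo(rows, expression):
    """Keep only the rows and columns selected by a two-part expression.

    Assumption: the first slice selects rows, the second selects columns.
    """
    row_slice, column_slice = parse_slice_expression(expression)
    return [row[column_slice] for row in rows[row_slice]]
```

For instance, on a 3 x 4 nested list, strip_halo(grid, "1:,:-1") would drop the first row and the last column under these assumptions.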
slicing_periods
The optional [slicing_periods] section is used to specify the slicing period for a particular stream.
For all streams that have no specified slicing period, the default slicing period is year.
Example
[slicing_periods]
stream_apa: month
stream_ap6: year
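The default behaviour described above (year when a stream has no entry) can be mimicked with a fallback lookup. This is a sketch using Python's standard configparser, not the MIP Convert implementation, and the function name slicing_period is hypothetical.

```python
import configparser


def slicing_period(config_text, stream_id, default="year"):
    """Return the slicing period for a stream, or the default if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback covers both a missing option and a missing section.
    return parser.get("slicing_periods", f"stream_{stream_id}", fallback=default)
```

With the example section above, stream apa would resolve to month, while a stream that is not listed, such as ap4, would resolve to year.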
global_attributes
Any information provided in the optional [global_attributes] section will be written to the header of the output netCDF files.