cdds.convert
The cdds.convert subpackage provides the functionality for the cdds_convert command (mostly in the further subpackage cdds.convert.configure_workflow), as well as supporting code for a number of commands that can only be used in the context of the Cylc conversion workflow.
cdds_convert
Roughly, the main operations performed by the cdds_convert command are the following:
- Running generate_user_config_files (unless explicitly skipped in the request.cfg).
- Determining Jinja2 variable values for use in the Cylc conversion workflow.
- Copying the Cylc conversion workflow from cdds.workflows.conversion [1] into the user's proc directory (found under conversion).
- Interpolating the Jinja2 variables into the user's copy of the processing workflow.
- Running a specific invocation of cylc vip on the processing workflow.
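The copy and interpolation steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the CDDS implementation: the function names are hypothetical, and a simple regular-expression substitution of `{{ NAME }}` placeholders stands in for real Jinja2 handling.

```python
import re
import shutil
from pathlib import Path

# Matches simple "{{ NAME }}" placeholders only; this is a stand-in for
# Jinja2, not a Jinja2 implementation.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def copy_workflow(source_dir, proc_dir):
    """Copy a workflow template into <proc_dir>/conversion and return that path."""
    target = Path(proc_dir) / "conversion"
    shutil.copytree(source_dir, target, dirs_exist_ok=True)
    return target


def fill_placeholders(text, variables):
    """Replace each {{ NAME }} with str(variables[NAME]).

    A placeholder with no matching variable raises KeyError.
    """
    return PLACEHOLDER.sub(lambda match: str(variables[match.group(1)]), text)


def configure_copy(workflow_dir, variables, filename):
    """Rewrite one file of the copied workflow with its placeholders filled in.

    `filename` is whichever file holds the placeholders (hypothetical here).
    """
    path = Path(workflow_dir) / filename
    path.write_text(fill_placeholders(path.read_text(), variables))
```

Only the user's copy is modified; the packaged template stays untouched, so it can be reused for each request.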
Most of the functionality contained in the modules of cdds.convert.configure_workflow is reasonably straightforward.
The one exception is CalculateISODatetimes, which is used to determine various durations, cycle frequencies, and dates.
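To give a feel for the kind of calculation involved, the sketch below splits a run into cycles and expresses each as an ISO 8601 cycle point and duration. It is an illustration only: `plan_cycles` is a hypothetical helper, it works in whole years, and it ignores the calendar handling that real model data needs.

```python
def plan_cycles(start_year, end_year, cycle_years):
    """Split the half-open interval [start_year, end_year) into cycles.

    Returns a list of (cycle_point, duration) pairs, where cycle_point is an
    ISO 8601 date-time and duration is an ISO 8601 duration. The final cycle
    is shortened if the run length is not a multiple of cycle_years.
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    if end_year <= start_year:
        raise ValueError("end_year must be after start_year")

    cycles = []
    for year in range(start_year, end_year, cycle_years):
        # Number of years actually covered by this cycle.
        length = min(cycle_years, end_year - year)
        cycles.append((f"{year:04d}-01-01T00:00:00Z", f"P{length}Y"))
    return cycles
```

For a run from 1850 up to (but not including) 1862 with five-year cycles, this gives two `P5Y` cycles starting in 1850 and 1855, followed by a final `P2Y` cycle starting in 1860.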
run_mip_convert
The main purpose of run_mip_convert is to:
- Interpolate values into the MIP Convert template configuration files created by generate_user_config_files for a given cycle point.
- Copy the necessary input files to temporary storage on the node.
- Run the mip_convert command in a subprocess.
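The three steps above follow a common "render, stage, run" pattern, sketched below with the standard library. This is not the run_mip_convert code: `run_cycle` is a hypothetical name, the `$name` placeholder syntax of `string.Template` is an assumption about the template format, and the command is passed in by the caller because the actual mip_convert arguments are not covered here.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from string import Template


def run_cycle(template_text, values, input_files, command):
    """Fill a config template, stage inputs in a temp dir, and run a command.

    `command` is an argument list, e.g. [executable, arg, ...]. Returns the
    command's exit code and the rendered configuration text.
    """
    with tempfile.TemporaryDirectory() as staging:
        staging_dir = Path(staging)

        # 1. Interpolate the cycle-specific values into the template.
        rendered = Template(template_text).substitute(values)
        (staging_dir / "mip_convert.cfg").write_text(rendered)

        # 2. Copy the input files into the temporary staging area.
        for source in input_files:
            shutil.copy2(source, staging_dir / Path(source).name)

        # 3. Run the command in a subprocess, using the staging area as cwd.
        result = subprocess.run(
            command, cwd=staging_dir, capture_output=True, text=True, check=False
        )
        return result.returncode, rendered
```

Using a temporary directory as the working directory keeps the staged inputs local to the job and ensures they are cleaned up when the cycle finishes, whether or not the command succeeds.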
mip_batch_concatenate, mip_concatenate, mip_concatenate_organise, mip_concatenate_setup
Because CDDS breaks the conversion process down into small chunks to support scalability and speed of conversion, many small files can be produced. This is not optimal for data management, where files in the range 5-10 GB are preferred.
The concatenate tools identify the files that should be concatenated together based upon "sizing" information from the Model Parameters JSON file (how many years of data should be concatenated together for a given frequency and "shape" of data) and the base date specified in the Request file.
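The grouping logic can be pictured as aligning files to fixed-length windows counted from the base date. The sketch below is an illustration under simplifying assumptions, not the CDDS implementation: `group_for_concatenation` is a hypothetical function, each file is represented only by its start year, and `years_per_file` stands for the sizing value looked up for a given frequency and shape.

```python
from collections import defaultdict


def group_for_concatenation(file_start_years, base_year, years_per_file):
    """Group file start years into windows aligned on base_year.

    Returns a dict mapping (window_start, window_end) to the sorted list of
    start years that fall inside that window (both bounds inclusive).
    """
    if years_per_file <= 0:
        raise ValueError("years_per_file must be positive")

    groups = defaultdict(list)
    for year in sorted(file_start_years):
        # Index of the window containing this year, counted from base_year.
        index = (year - base_year) // years_per_file
        window_start = base_year + index * years_per_file
        window_end = window_start + years_per_file - 1
        groups[(window_start, window_end)].append(year)
    return dict(groups)
```

Because the windows are anchored to the base date rather than to the first file found, output files from separate runs of the same request line up on the same boundaries.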
[1] Historically, the conversion workflow was developed in a separate roses repository (u-ak283). This provided flexibility for modifying the workflow independently (i.e., without needing to re-release CDDS to fix a workflow bug) but incurred greater complexity during development and for initial releases.