Recipe Format

The recipes are text files written in YAML 1.2, a configuration language that is widely used. They are saved with the .yaml extension. Below is a commented example recipe:

category: Category of recipe
title: Name of recipe
description: |
  Extended description that can
  go across multiple lines.

parallel:
  # Specify the operator to run in each step.
  - operator: read.read_cubes

  - operator: filters.filter_cubes
    # Can specify extra keyword arguments as sub-maps.
    constraint:
      # Can nest in another operator to use its output as an argument.
      operator: generate_constraints.generate_stash_constraints
      # Input implicitly taken from the previous step, but can be overridden
      # by using the appropriate keyword argument.
      stash: m01s03i236

  - operator: write.write_cube_to_nc
    # Specify the name of the argument, and its value.
    filename: intermediate/processed_data
    # intermediate is a slightly special folder for partially processed data
    # that needs collating.

# Steps to collate processed data into output.
collate:
  - operator: read.read_cube
    filename: intermediate/*.nc

  # Save a sequence of plots, one per time.
  - operator: plot.plot_spatial_plot

  # Save a single cube with all the processed data.
  - operator: write.write_cube_to_nc

The title and description keys provide a human readable description of what the recipe does. The title is also used to derive the ID of the running recipe, used when running the recipe in a workflow. The category is used to group the produced diagnostics in the output website.

The parallel and collate keys specify lists of processing steps. The steps are run from top to bottom, with each step specifying an operator to run, and optionally any additional inputs to that operator. A parallel step is denoted by a - under the parallel: key. The operators are specified on the operator key. Its value should be a string of the form module.function. For additional inputs the key should be the name of the argument.

The collate: key is used for collating together the output of the parallel steps to produce the final output. This allows for the expensive processing to be parallelised over many compute nodes, with just the final visualisation of the data done in a single job to ensure it has all of the data.

The below code block shows how you can nest operators multiple levels deep. For details of the specific operators involved, and the arguments that can take, see the CSET Operators page.

- operator: filters.filter_cubes
  constraint:
    operator: constraints.combine_constraints
    constraint:
      operator: constraints.generate_stash_constraint
      stash: m01s03i236
    cell_methods_constraint:
      operator: constraints.generate_cell_methods_constraint
      cell_methods: []

Using Recipe Variables

A CSET recipe may contain variables. These are values filled in at runtime. They allow making generic recipes that can handle multiple cases. This prevents the need to have hundreds of recipes for very similar tasks where only minor changes are required such as switching from mean to median or iterating over a number of variable names.

A variable can be added to a recipe by setting a parameter’s value to the variable name, prefixed with a dollar sign. This name may only contain upper case letters and underscores. For example:

parameter: $MY_VARIABLE

When the recipe is run with cset bake the variable is replaced with a value given on the command line. This is done using the variable name as an option, for example:

cset bake -i input -o output -r recipe.yaml --MY_VARIABLE='value'

The given value will be templated into the parameter so what runs is actually:

parameter: value