CSET Operators

This page details the operators contained within CSET. It is automatically generated from the code and its docstrings. Operators should be used via Run an Operator Recipe.

Generic Operators

CSET.operators.aggregate

Operators to aggregate across either 1 or 2 dimensions.

CSET.operators.aggregate.time_aggregate(cube: Cube, method: str, interval_iso: str, **kwargs) Cube

Aggregate cube by its time coordinate.

Aggregates similar (stash) fields in a cube for the specified coordinate and using the method supplied. The aggregated cube will keep the coordinate and add a further coordinate with the aggregated end time points.

Examples are: 1. Generating hourly or 6-hourly precipitation accumulations given an interval for the new time coordinate.

We use the isodate class to convert ISO 8601 durations into time intervals for creating a new time coordinate for aggregation.

We use the lambda function to pass coord and interval into the callable category function in add_categorised to allow users to define their own sub-daily intervals for the new time coordinate.

Parameters:
  • cube (iris.cube.Cube) – Cube to aggregate and iterate over one dimension

  • coordinate (str) – Coordinate to aggregate over i.e. ‘time’, ‘longitude’, ‘latitude’,’model_level_number’.

  • method (str) – Type of aggregate i.e. method: ‘SUM’, getattr creates iris.analysis.SUM, etc.

  • interval_iso (isodate timedelta ISO 8601 object i.e PT6H (6 hours), PT30M (30 mins)) – Interval to aggregate over.

Returns:

cube – Single variable but several methods of aggregation

Return type:

iris.cube.Cube

Raises:

ValueError – If the constraint doesn’t produce a single cube containing a field.

CSET.operators.transect

Operators to extract a transect given a tuple of xy coords to start/finish.

CSET.operators.transect.calc_transect(cube: Cube, startcoords: tuple, endcoords: tuple)

Compute transect between startcoords and endcoords.

Computes a transect for a given cube containing at least latitude and longitude coordinates, using an appropriate sampling interval along the transect based on grid spacing. Also decides a suitable x coordinate to plot along the transect - see notes for details on this.

Parameters:
  • cube (Cube) – An iris cube containing latitude and longitude coordinate dimensions, to compute the transect on.

  • startcoords (tuple) – A tuple containing the start coordinates for the transect using model coordinates, ordered (latitude,longitude).

  • endcoords (tuple) – A tuple containing the end coordinates for the transect using model coordinates, ordered (latitude,longitude).

Returns:

cube – A cube containing at least pressure and the coordinate specified by coord, for the transect specified between startcoords and endcoords.

Return type:

Cube

Notes

This operator uses the iris.Nearest method to interpolate the specific point along the transect. Identification of an appropriate coordinate to plot along the x axis is done by determining the largest distance out of latitude and longitude. For example, if the transect is 90 degrees (west to east), then delta latitude is zero, so it will always plot as a function of longitude. Vice versa if the transect is 180 degrees (south to north). If a transect is 45 degrees, and delta latitude and longitude are the same, it will default to plotting as a function of longitude. Note this doesn’t affect the transect plot, its purely for interpretation with some appropriate x axis labelling/points.

CSET.operators.collapse

Operators to perform various kind of collapse on either 1 or 2 dimensions.

CSET.operators.collapse.collapse(cube: Cube, coordinate: str | list[str], method: str, additional_percent: float = None, **kwargs) Cube

Collapse coordinate(s) of a cube.

Collapses similar (stash) fields in a cube into a cube collapsing around the specified coordinate(s) and method. This could be a (weighted) mean or percentile.

Parameters:
  • cube (iris.cube.Cube) – Cube to collapse and iterate over one dimension

  • coordinate (str | list[str]) – Coordinate(s) to collapse over e.g. ‘time’, ‘longitude’, ‘latitude’, ‘model_level_number’, ‘realization’. A list of multiple coordinates can be given.

  • method (str) – Type of collapse i.e. method: ‘MEAN’, ‘MAX’, ‘MIN’, ‘MEDIAN’, ‘PERCENTILE’ getattr creates iris.analysis.MEAN, etc For PERCENTILE YAML file requires i.e. method: ‘PERCENTILE’ additional_percent: 90

Returns:

cube – Single variable but several methods of aggregation

Return type:

iris.cube.Cube

Raises:

ValueError – If additional_percent wasn’t supplied while using PERCENTILE method.

CSET.operators.collapse.collapse_by_hour_of_day(cube: Cube, method: str, additional_percent: float = None, **kwargs) Cube

Collapse a cube by hour of the day.

Parameters:
  • cube (iris.cube.Cube) – Cube to collapse and iterate over one dimension. It should contain only one time dimension.

  • method (str) – Type of collapse i.e. method: ‘MEAN’, ‘MAX’, ‘MIN’, ‘MEDIAN’, ‘PERCENTILE’. For ‘PERCENTILE’ the additional_percent must be specified.

Returns:

cube – Single variable but several methods of aggregation.

Return type:

iris.cube.Cube

Raises:

ValueError – If additional_percent wasn’t supplied while using PERCENTILE method.

Notes

Collapsing of the cube is around the ‘time’ coordinate. The coordinates are first grouped by the hour of day, and then aggregated by the hour of day to create a diurnal cycle. This operator is applicable for both single forecasts and for multiple forecasts. The hour used is based on the units of the time coordinate. If the time coordinate is in UTC, hour will be in UTC.

To apply this operator successfully there must only be one time dimension. Should a MultiDim exception be raised the user first needs to apply the collapse operator to reduce the time dimensions before applying this operator.

CSET.operators.collapse.collapse_by_lead_time(cube: Cube | CubeList, method: str, additional_percent: float = None, **kwargs) Cube

Collapse a cube around lead time for multiple cases.

First checks if the data can be aggregated by lead time easily. Then collapses by lead time for a specified method using the collapse function.

Parameters:
  • cube (iris.cube.Cube | iris.cube.CubeList) – Cube to collapse by lead time or CubeList that will be converted to a cube before collapsing by lead time.

  • method (str) – Type of collapse i.e. method: ‘MEAN’, ‘MAX’, ‘MIN’, ‘MEDIAN’, ‘PERCENTILE’. For ‘PERCENTILE’ the additional_percent must be specified.

Returns:

cube – Single variable collapsed by lead time based on chosen method.

Return type:

iris.cube.Cube

Raises:

ValueError – If additional_percent wasn’t supplied while using PERCENTILE method.

CSET.operators.constraints

Operators to generate constraints to filter with.

CSET.operators.constraints.combine_constraints(constraint: Constraint = None, **kwargs) Constraint

Operator that combines multiple constraints into one.

Parameters:
  • constraint (iris.Constraint) – First constraint to combine.

  • additional_constraint_1 (iris.Constraint) – Second constraint to combine. This must be a named argument.

  • additional_constraint_2 (iris.Constraint) – There can be any number of additional constraint, they just need unique names.

  • ...

Returns:

combined_constraint

Return type:

iris.Constraint

Raises:

TypeError – If the provided arguments are not constraints.

CSET.operators.constraints.generate_area_constraint(lat_start: float | None, lat_end: float | None, lon_start: float | None, lon_end: float | None, **kwargs) Constraint

Generate an area constraint between latitude/longitude limits.

Operator that takes a set of latitude and longitude limits and returns a constraint that selects grid values only inside that area. Works with the data’s native grid so is defined within the rotated pole CRS.

Alternatively, all arguments may be None to indicate the area should not be constrained. This is useful to allow making subsetting an optional step in a processing pipeline.

Parameters:
  • lat_start (float | None) – Latitude value for lower bound

  • lat_end (float | None) – Latitude value for top bound

  • lon_start (float | None) – Longitude value for left bound

  • lon_end (float | None) – Longitude value for right bound

Returns:

area_constraint

Return type:

iris.Constraint

CSET.operators.constraints.generate_cell_methods_constraint(cell_methods: list, **kwargs) Constraint

Generate constraint from cell methods.

Operator that takes a list of cell methods and generates a constraint from that. Use [] to specify non-aggregated data.

Parameters:

cell_methods (list) – cube.cell_methods for filtering

Returns:

cell_method_constraint

Return type:

iris.Constraint

CSET.operators.constraints.generate_level_constraint(coordinate: str, levels: int | list[int] | str, **kwargs) Constraint

Generate constraint for particular levels on the specified coordinate.

Operator that generates a constraint to constrain to specific model or pressure levels. If no levels are specified then any cube with the specified coordinate is rejected.

Typically coordinate will be "pressure" or "model_level_number" for UM, or "full_levels" or "half_levels" for LFRic.

Parameters:
  • coordinate (str) – Level coordinate name about which to constraint.

  • levels (int | list[int] | str) – CF compliant level points, "*" for retrieving all levels, or [] for no levels.

Returns:

constraint

Return type:

iris.Constraint

Notes

Due to the specification of coordinate as an argument any iterable coordinate can be stratified with this function. Therefore, "realization" is a valid option. Subsequently, levels specifies the ensemble members, or group of ensemble members you wish to constrain your results over.

CSET.operators.constraints.generate_stash_constraint(stash: str, **kwargs) AttributeConstraint

Generate constraint from STASH code.

Operator that takes a stash string, and uses iris to generate a constraint to be passed into the read operator to minimize the CubeList the read operator loads and speed up loading.

Parameters:

stash (str) – stash code to build iris constraint, such as “m01s03i236”

Returns:

stash_constraint

Return type:

iris.AttributeConstraint

CSET.operators.constraints.generate_time_constraint(time_start: str, time_end: str = None, **kwargs) AttributeConstraint

Generate constraint between times.

Operator that takes one or two ISO 8601 date strings, and returns a constraint that selects values between those dates (inclusive).

Parameters:
  • time_start (str | datetime.datetime) – ISO date for lower bound

  • time_end (str | datetime.datetime) – ISO date for upper bound. If omitted it defaults to the same as time_start

Returns:

time_constraint

Return type:

iris.Constraint

CSET.operators.constraints.generate_var_constraint(varname: str, **kwargs) Constraint

Generate constraint from variable name or STASH code.

Operator that takes a CF compliant variable name string, and generates an iris constraint to be passed into the read or filter operator. Can also be passed a STASH code to generate a STASH constraint.

Parameters:

varname (str) – CF compliant name of variable, or a UM STASH code such as “m01s03i236”.

Returns:

varname_constraint

Return type:

iris.Constraint

CSET.operators.filters

Operators to perform various kind of filtering.

CSET.operators.filters.apply_mask(original_field: Cube, mask: Cube) Cube

Apply a mask to given data as a masked array.

Parameters:
  • original_field (iris.cube.Cube) – The field to be masked.

  • mask (iris.cube.Cube) – The mask being applied to the original field.

Returns:

masked_field – A cube of the masked field.

Return type:

iris.cube.Cube

Notes

The mask is first converted to 1s and NaNs before multiplication with the original data.

As discussed in generate_mask, you can combine multiple masks in a recipe using other functions before applying the mask to the data.

Examples

>>> land_points_only = apply_mask(temperature, land_mask)
CSET.operators.filters.filter_cubes(cube: Cube | CubeList, constraint: Constraint, **kwargs) Cube

Filter a CubeList down to a single Cube based on a constraint.

Parameters:
  • cube (iris.cube.Cube | iris.cube.CubeList) – Cube(s) to filter

  • constraint (iris.Constraint) – Constraint to extract

Return type:

iris.cube.Cube

Raises:

ValueError – If the constraint doesn’t produce a single cube.

CSET.operators.filters.filter_multiple_cubes(cubes: Cube | CubeList, **kwargs) CubeList

Filter a CubeList on multiple constraints, returning another CubeList.

Parameters:
  • cube (iris.cube.Cube | iris.cube.CubeList) – Cube(s) to filter

  • constraint (iris.Constraint) – Constraint to extract. This must be a named argument. There can be any number of additional constraints, they just need unique names.

Return type:

iris.cube.CubeList

Raises:

ValueError – The constraints don’t produce a single cube per constraint.

CSET.operators.filters.generate_mask(mask_field: Cube, condition: str, value: float) Cube

Generate a mask to remove data not meeting conditions.

Parameters:
  • mask_field (iris.cube.Cube) – The field to be used for creating the mask.

  • condition (str) – The type of condition applied, six available options: ‘==’,’!=’,’<’,’<=’,’>’, and ‘>=’.

  • value (float) – The value on the right hand side of the condition.

Returns:

mask – Mask meeting the condition applied.

Return type:

iris.cube.Cube

Raises:

ValueError – Unexpected value for condition. Expected ==, !=, >, >=, <, <=.: Got {condition}. Raised when condition is not supported.

Notes

The mask is created in the opposite sense to numpy.ma.masked_arrays. This method was chosen to allow easy combination of masks together outside of this function using misc.addition or misc.multiplication depending on applicability. The combinations can be of any fields such as orography > 500 m, and humidity == 100 %.

The conversion to a masked array occurs in the apply_mask routine, which should happen after all relevant masks have been combined.

Examples

>>> land_mask = generate_mask(land_sea_mask,'==',1)

CSET.operators.misc

Miscellaneous operators.

CSET.operators.misc.addition(addend_1, addend_2)

Addition of two fields.

Parameters:
  • addend_1 (Cube) – Any field to have another field added to it.

  • addend_2 (Cube) – Any field to be added to another field.

Return type:

Cube

Raises:

ValueError, iris.exceptions.NotYetImplementedError – When the cubes are not compatible.

Notes

This is a simple operator designed for combination of diagnostics or creating new diagnostics by using recipes.

Examples

>>> field_addition = misc.addition(kinetic_energy_u, kinetic_energy_v)
CSET.operators.misc.combine_cubes_into_cubelist(first: Cube | CubeList, **kwargs) CubeList

Operator that combines multiple cubes or CubeLists into one.

Parameters:
  • first (Cube | CubeList) – First cube or CubeList to merge into CubeList.

  • second (Cube | CubeList) – Second cube or CubeList to merge into CubeList. This must be a named argument.

  • third (Cube | CubeList) – There can be any number of additional arguments, they just need unique names.

  • ...

Returns:

combined_cubelist – Combined CubeList containing all cubes/CubeLists.

Return type:

CubeList

Raises:

TypeError: – If the provided arguments are not either a Cube or CubeList.

CSET.operators.misc.division(numerator, denominator)

Division of two fields.

Parameters:
  • numerator (Cube) – Any field to have the ratio taken with respect to another field.

  • denominator (Cube) – Any field used to divide another field or provide the reference value in a ratio.

Return type:

Cube

Raises:

ValueError – When the cubes are not compatible.

Notes

This is a simple operator designed for combination of diagnostics or creating new diagnostics by using recipes.

Examples

>>> bowen_ratio = misc.division(sensible_heat_flux, latent_heat_flux)
CSET.operators.misc.multiplication(multiplicand, multiplier)

Multiplication of two fields.

Parameters:
  • multiplicand (Cube) – Any field to be multiplied by another field.

  • multiplier (Cube) – Any field to be multiplied to another field.

Return type:

Cube

Raises:

ValueError – When the cubes are not compatible.

Notes

This is a simple operator designed for combination of diagnostics or creating new diagnostics by using recipes.

Examples

>>> filtered_CAPE_ratio = misc.multiplication(CAPE_ratio, inflow_layer_properties)
CSET.operators.misc.noop(x, **kwargs)

Return its input without doing anything to it.

Useful for constructing diagnostic chains.

Parameters:

x (Any) – Input to return.

Returns:

x – The input that was given.

Return type:

Any

CSET.operators.misc.remove_attribute(cubes: Cube | CubeList, attribute: str | Iterable, **kwargs) CubeList

Remove a cube attribute.

If the attribute is not on the cube, the cube is passed through unchanged.

Parameters:
  • cubes (Cube | CubeList) – One or more cubes to remove the attribute from.

  • attribute (str | Iterable) – Name of attribute (or Iterable of names) to remove.

Returns:

cubes – CubeList of cube(s) with the attribute removed.

Return type:

CubeList

CSET.operators.misc.subtraction(minuend, subtrahend)

Subtraction of two fields.

Parameters:
  • minuend (Cube) – Any field to have another field subtracted from it.

  • subtrahend (Cube) – Any field to be subtracted from to another field.

Return type:

Cube

Raises:

ValueError, iris.exceptions.NotYetImplementedError – When the cubes are not compatible.

Notes

This is a simple operator designed for combination of diagnostics or creating new diagnostics by using recipes. It can be used for model differences to allow for comparisons between the same field in different models or model configurations.

Examples

>>> model_diff = misc.subtraction(temperature_model_A, temperature_model_B)

CSET.operators.plot

Operators to produce various kinds of plots.

CSET.operators.plot.plot_histogram_series(cube: Cube, filename: str = None, sequence_coordinate: str = 'time', stamp_coordinate: str = 'realization', single_plot: bool = False, histtype: str = 'step', **kwargs) Cube

Plot a histogram plot for each vertical level provided.

A histogram plot can be plotted, but if the sequence_coordinate (i.e. time) is present then a sequence of plots will be produced using the time slider functionality to scroll through histograms against time. If a stamp_coordinate is present then postage stamp plots will be produced. If stamp_coordinate and single_plot is True, all postage stamp plots will be plotted in a single plot instead of separate postage stamp plots.

Parameters:
  • cube (Cube) – Iris cube of the data to plot. It should have a single dimension other than the stamp coordinate.

  • filename (str, optional) – Name of the plot to write, used as a prefix for plot sequences. Defaults to the recipe name.

  • sequence_coordinate (str, optional) – Coordinate about which to make a plot sequence. Defaults to "time". This coordinate must exist in the cube and will be used for the time slider.

  • stamp_coordinate (str, optional) – Coordinate about which to plot postage stamp plots. Defaults to "realization".

  • single_plot (bool, optional) – If True, all postage stamp plots will be plotted in a single plot. If False, each postage stamp plot will be plotted separately. Is only valid if stamp_coordinate exists and has more than a single point.

  • histtype (str, optional) – The type of histogram to plot. Options are “step” for a line histogram or “barstacked”, “stepfilled”. “Step” is the default option, but can be changed in the rose-suite.conf configuration.

Returns:

The original cube (so further operations can be applied).

Return type:

Cube

Raises:
  • ValueError – If the cube doesn’t have the right dimensions.

  • TypeError – If the cube isn’t a single cube.

CSET.operators.plot.plot_line_series(cube: Cube | CubeList, filename: str = None, series_coordinate: str = 'time', **kwargs) Cube | CubeList

Plot a line plot for the specified coordinate.

The Cube or CubeList must be 1D.

Parameters:
  • iris.cube.CubeList (iris.cube |) – Cube or CubeList of the data to plot. The individual cubes should have a single dimension. The cubes should cover the same phenomenon i.e. all cubes contain temperature data. We do not support different data such as temperature and humidity in the same CubeList for plotting.

  • filename (str, optional) – Name of the plot to write, used as a prefix for plot sequences. Defaults to the recipe name.

  • series_coordinate (str, optional) – Coordinate about which to make a series. Defaults to "time". This coordinate must exist in the cube.

Returns:

The original Cube or CubeList (so further operations can be applied). plotted data.

Return type:

iris.cube.Cube | iris.cube.CubeList

Raises:
  • ValueError – If the cubes don’t have the right dimensions.

  • TypeError – If the cube isn’t a Cube or CubeList.

CSET.operators.plot.plot_vertical_line_series(cube: Cube, filename: str = None, series_coordinate: str = 'model_level_number', sequence_coordinate: str = 'time', **kwargs) Cube

Plot a line plot against a type of vertical coordinate.

A 1D line plot with y-axis as pressure coordinate can be plotted, but if the sequence_coordinate is present then a sequence of plots will be produced.

The cube must be 1D.

Parameters:
  • cube (Cube) – Iris cube of the data to plot. It should have a single dimension.

  • filename (str, optional) – Name of the plot to write, used as a prefix for plot sequences. Defaults to the recipe name.

  • series_coordinate (str, optional) – Coordinate to plot on the y-axis. Can be pressure or model_level_number for UM, or full_levels or half_levels for LFRic. Defaults to model_level_number. This coordinate must exist in the cube.

  • sequence_coordinate (str, optional) – Coordinate about which to make a plot sequence. Defaults to "time". This coordinate must exist in the cube.

Returns:

The original cube (so further operations can be applied).

Return type:

Cube

Raises:
  • ValueError – If the cube doesn’t have the right dimensions.

  • TypeError – If the cube isn’t a single cube.

CSET.operators.plot.scatter_plot(cube_x: Cube, cube_y: Cube, filename: str = None, one_to_one: bool = True, **kwargs) CubeList

Plot a scatter plot between two variables.

Both cubes must be 1D.

Parameters:
  • cube_x (Cube) – 1 dimensional Cube of the data to plot on y-axis.

  • cube_y (Cube) – 1 dimensional Cube of the data to plot on x-axis.

  • filename (str, optional) – Filename of the plot to write.

  • one_to_one (bool, optional) – If True a 1:1 line is plotted; if False it is not. Default is True.

Returns:

cubes – CubeList of the original x and y cubes for further processing.

Return type:

CubeList

Raises:
  • ValueError – If the cube doesn’t have the right dimensions and cubes not the same size.

  • TypeError – If the cube isn’t a single cube.

Notes

Scatter plots are used for determining if there is a relationship between two variables. Positive relations have a slope going from bottom left to top right; Negative relations have a slope going from top left to bottom right.

A variant of the scatter plot is the quantile-quantile plot. This plot does not use all data points, but the selected quantiles of each variable instead. Quantile-quantile plots are valuable for comparing against observations and other models. Identical percentiles between the variables will lie on the one-to-one line implying the values correspond well to each other. Where there is a deviation from the one-to-one line a range of possibilities exist depending on how and where the data is shifted (e.g., Wilks 2011 [Wilks2011]).

For distributions above the one-to-one line the distribution is left-skewed; below is right-skewed. A distinct break implies a bimodal distribution, and closer values/values further apart at the tails imply poor representation of the extremes.

References

[Wilks2011]

Wilks, D.S., (2011) “Statistical Methods in the Atmospheric Sciences” Third Edition, vol. 100, Academic Press, Oxford, UK, 676 pp.

CSET.operators.plot.spatial_contour_plot(cube: Cube, filename: str = None, sequence_coordinate: str = 'time', stamp_coordinate: str = 'realization', **kwargs) Cube

Plot a spatial variable onto a map from a 2D, 3D, or 4D cube.

A 2D spatial field can be plotted, but if the sequence_coordinate is present then a sequence of plots will be produced. Similarly if the stamp_coordinate is present then postage stamp plots will be produced.

Parameters:
  • cube (Cube) – Iris cube of the data to plot. It should have two spatial dimensions, such as lat and lon, and may also have a another two dimension to be plotted sequentially and/or as postage stamp plots.

  • filename (str, optional) – Name of the plot to write, used as a prefix for plot sequences. Defaults to the recipe name.

  • sequence_coordinate (str, optional) – Coordinate about which to make a plot sequence. Defaults to "time". This coordinate must exist in the cube.

  • stamp_coordinate (str, optional) – Coordinate about which to plot postage stamp plots. Defaults to "realization".

Returns:

The original cube (so further operations can be applied).

Return type:

Cube

Raises:
  • ValueError – If the cube doesn’t have the right dimensions.

  • TypeError – If the cube isn’t a single cube.

CSET.operators.plot.spatial_pcolormesh_plot(cube: Cube, filename: str = None, sequence_coordinate: str = 'time', stamp_coordinate: str = 'realization', **kwargs) Cube

Plot a spatial variable onto a map from a 2D, 3D, or 4D cube.

A 2D spatial field can be plotted, but if the sequence_coordinate is present then a sequence of plots will be produced. Similarly if the stamp_coordinate is present then postage stamp plots will be produced.

This function is significantly faster than spatial_contour_plot, especially at high resolutions, and should be preferred unless contiguous contour areas are important.

Parameters:
  • cube (Cube) – Iris cube of the data to plot. It should have two spatial dimensions, such as lat and lon, and may also have a another two dimension to be plotted sequentially and/or as postage stamp plots.

  • filename (str, optional) – Name of the plot to write, used as a prefix for plot sequences. Defaults to the recipe name.

  • sequence_coordinate (str, optional) – Coordinate about which to make a plot sequence. Defaults to "time". This coordinate must exist in the cube.

  • stamp_coordinate (str, optional) – Coordinate about which to plot postage stamp plots. Defaults to "realization".

Returns:

The original cube (so further operations can be applied).

Return type:

Cube

Raises:
  • ValueError – If the cube doesn’t have the right dimensions.

  • TypeError – If the cube isn’t a single cube.

CSET.operators.read

Operators for reading various types of files from disk.

exception CSET.operators.read.NoDataWarning

Warning that no data has been loaded.

CSET.operators.read.read_cube(loadpath: list[str] | str, constraint: Constraint = None, filename_pattern: str = '*', **kwargs) Cube

Read a single cube from files.

Read operator that takes a path string (can include wildcards), and uses iris to load the cube matching the constraint.

If the loaded data is split across multiple files, a filename_pattern can be specified to select the read files using Unix shell-style wildcards. In this case the loadpath should point to the directory containing the data.

Ensemble data can also be loaded. If it has a realization coordinate already, it will be directly used. If not, it will have its member number guessed from the filename, based on one of several common patterns. For example the pattern emXX, where XX is the realization.

Deterministic data will be loaded with a realization of 0, allowing it to be processed in the same way as ensemble data.

Parameters:
  • loadpath (str | list[str]) – Path to where .pp/.nc files are located

  • constraint (iris.Constraint | iris.ConstraintCombination, optional) – Constraints to filter data by. Defaults to unconstrained.

  • filename_pattern (str, optional) – Unix shell-style glob pattern to match filenames to. Defaults to “*”

Returns:

cubes – Cube loaded

Return type:

iris.cube.Cube

Raises:
  • FileNotFoundError – If the provided path does not exist

  • ValueError – If the constraint doesn’t produce a single cube.

CSET.operators.read.read_cubes(loadpath: list[str] | str, constraint: Constraint = None, filename_pattern: str = '*', **kwargs) CubeList

Read cubes from files.

Read operator that takes a path string (can include wildcards), and uses iris to load the minimal set of cubes matching the constraint.

If the loaded data is split across multiple files, a filename_pattern can be specified to select the read files using Unix shell-style wildcards. In this case the loadpath should point to the directory containing the data.

Ensemble data can also be loaded. If it has a realization coordinate already, it will be directly used. If not, it will have its member number guessed from the filename, based on one of several common patterns. For example the pattern emXX, where XX is the realization.

Deterministic data will be loaded with a realization of 0, allowing it to be processed in the same way as ensemble data.

Data output by XIOS (such as LFRic) has its per-file metadata removed so that the cubes merge across files.

Parameters:
  • loadpath (str | list[str]) – Path to where .pp/.nc files are located. Can include globs.

  • constraint (iris.Constraint | iris.ConstraintCombination, optional) – Constraints to filter data by. Defaults to unconstrained.

  • filename_pattern (str, optional) – Unix shell-style glob pattern to match filenames to. Defaults to “*”

Returns:

cubes – Cubes loaded after being merged and concatenated.

Return type:

iris.cube.CubeList

Raises:

FileNotFoundError – If the provided path does not exist

CSET.operators.regrid

Operators to regrid cubes.

exception CSET.operators.regrid.BoundaryWarning

Selected gridpoint is close to the domain edge.

In many cases gridpoints near the domain boundary contain non-physical values, so caution is advised when interpreting them.

CSET.operators.regrid.regrid_onto_cube(toregrid: Cube | CubeList, target: Cube, method: str, **kwargs) Cube | CubeList

Regrid a cube or CubeList, projecting onto a target cube.

All cubes must have at least 2 spatial (map) dimensions.

Parameters:
  • toregrid (iris.cube | iris.cube.CubeList) – An iris Cube of data to regrid, or multiple cubes to regrid in a CubeList. A minimum requirement is that the cube(s) need to be 2D with a latitude, longitude coordinates.

  • target (Cube) – An iris cube of the data to regrid onto. It needs to be 2D with a latitude, longitude coordinate.

  • method (str) – Method used to regrid onto, etc. Linear will use iris.analysis.Linear()

Returns:

An iris cube of the data that has been regridded, or a CubeList of the cubes that have been regridded in the same order they were passed in toregrid.

Return type:

iris.cube | iris.cube.CubeList

Raises:
  • ValueError – If a unique x/y coordinate cannot be found

  • NotImplementedError – If the cubes grid, or the method for regridding, is not yet supported.

Notes

Currently rectlinear grids (uniform) are supported.

CSET.operators.regrid.regrid_onto_xyspacing(toregrid: Cube | CubeList, xspacing: float, yspacing: float, method: str, **kwargs) Cube | CubeList

Regrid cube or cubelist onto a set x,y spacing.

Regrid cube(s) using specified x,y spacing, which is performed linearly.

Parameters:
  • toregrid (iris.cube | iris.cube.CubeList) – An iris cube of the data to regrid, or multiple cubes to regrid in a cubelist. A minimum requirement is that the cube(s) need to be 2D with a latitude, longitude coordinates.

  • xspacing (float) – Spacing of points in longitude direction (could be degrees, meters etc.)

  • yspacing (float) – Spacing of points in latitude direction (could be degrees, meters etc.)

  • method (str) – Method used to regrid onto, etc. Linear will use iris.analysis.Linear()

Returns:

An iris cube of the data that has been regridded, or a cubelist of the cubes that have been regridded in the same order they were passed in toregrid.

Return type:

iris.cube | iris.cube.CubeList

Raises:
  • ValueError – If a unique x/y coordinate cannot be found

  • NotImplementedError – If the cubes grid, or the method for regridding, is not yet supported.

Notes

Currently rectlinear grids (uniform) are supported.

CSET.operators.regrid.regrid_to_single_point(cube: Cube, lat_pt: float, lon_pt: float, method: str, boundary_margin: int = 8, **kwargs) Cube

Select data at a single point by longitude and latitude.

Selection of model grid point is performed by a regrid function, either selecting the nearest gridpoint to the selected longitude and latitude values or using linear interpolation across the surrounding points.

Parameters:
  • cube (Cube) – An iris cube of the data to regrid. As a minimum, it needs to be 2D with latitude, longitude coordinates.

  • lon_pt (float) – Selected value of longitude: this should be in the range -180 degrees to 180 degrees.

  • lat_pt (float) – Selected value of latitude: this should be in the range -90 degrees to 90 degrees.

  • method (str) – Method used to determine the values at the selected longitude and latitude. The recommended approach is to use iris.analysis.Nearest(), which selects the nearest gridpoint. An alternative is iris.analysis.Linear(), which obtains the values at the selected longitude and latitude by linear interpolation.

  • boundary_margin (int, optional) – Number of grid points from the domain boundary considered “unreliable”. Defaults to 8.

Returns:

cube_rgd – An iris cube of the data at the specified point (this may have time and/or height dimensions).

Return type:

Cube

Raises:
  • ValueError – If a unique x/y coordinate cannot be found; also if, for selecting a single gridpoint, the chosen longitude and latitude point is outside the domain.

  • NotImplementedError – If the cubes grid, or the method for regridding, is not yet supported.

Notes

The acceptable coordinate names for X and Y coordinates are currently described in X_COORD_NAMES and Y_COORD_NAMES. These cover commonly used coordinate types, though a user can append new ones. Currently rectilinear grids (uniform) are supported. Warnings are raised if the selected gridpoint is within boundary_margin grid lengths of the domain boundary as data here is potentially unreliable.

CSET.operators.write

Operators for writing various types of files to disk.

CSET.operators.write.write_cube_to_nc(cube: Cube | CubeList, filename: str = None, overwrite: bool = False, **kwargs) str

Write a cube to a NetCDF file.

This operator expects an iris cube object that will then be saved to disk.

Parameters:
  • cube (iris.cube.Cube | iris.cube.CubeList) – Data to save.

  • filename (str, optional) – Path to save the cubes too. Defaults to the recipe title + .nc

  • overwrite (bool, optional) – Whether to overwrite an existing file. If False the filename will have a unique suffix added. Defaults to False.

Returns:

The inputted cube(list) (so further operations can be applied)

Return type:

Cube | CubeList

Convection Operators

CSET.operators.convection

A module containing different diagnostics for convection.

The diagnostics are calculated from output from the Unified Model, although precalculated values in the required input form may also be used.

CSET.operators.convection.cape_ratio(SBCAPE, MUCAPE, MUCIN, MUCIN_thresh=-75.0)

Ratio of two fields, one filtered to allow physical values to be output.

Parameters:
  • SBCAPE (Cube) – Surface-based convective available potential energy as calculated by the model. If using the UM please use STASH m01s20i114

  • MUCAPE (Cube) – Most-unstable convective available potential energy as calculated by the model. If using the UM please use STASH m01s20i112

  • MUCIN (Cube) – Most-unstable convective inhibition associated with the most-unstable ascent as calculated by the model. If using the UM please use STASH m01s20i113

  • MUCIN_thresh (float, optional, default is -75. J/kg.) – Threshold to filter the MUCAPE by values are realistically realisable.

Return type:

Cube

Notes

This diagnostic is based on Clark et al. (2012) [Clarketal2012]. It is based around the idea that for elevated convection the convective instability is not based at the surface. This utilises two flavours of CAPE: the surface-based CAPE (SBCAPE) and the most-unstable CAPE (MUCAPE). The MUCAPE is filtered by the MUCIN associated with that parcel’s ascent to ensure that any CAPE can at least theoretically be released. The default value is set at -75 J/kg but it can be changes depending on location and users requirements.

\[1 - (\frac{SBCAPE}{MUCAPE})\]

The ratio is defined in this way such that if SBCAPE=MUCAPE the ratio will equal 1. If the ratio was reversed when MUCAPE exists and SBCAPE is zero the ratio would be undefined.

The diagnostic varies smoothly between zero and unity. A value of 0 implies an environment is suitable for surface-based convection. A value of 1 implies an environment is suitable for elevated convection. Values between imply transition convection with values closer to one imply elevated convection is more likely and values closer to zero implying that surface-based convection is more likely.

Further details about this diagnostic for elevated convection identification can be found in Flack et al. (2023) [FlackCAPE2023].

Expected applicability ranges: Convective-scale models will be noisier than parametrized models as they are more responsive to the convection, and thus it may be more sensible to view as a larger spatial average rather than on the native resolution.

Interpretation notes: UM stash for CAPE and CIN are calculated at the end of the timestep. Therefore this diagnostic is applicable after precipitation has occurred, not before as is the usual interpretation of CAPE related diagnostics.

References

[Clarketal2012]

Clark, A. J., Kain J. S., Marsh P. T., Correia J., Xue M., and Kong F., (2012) “Forecasting tornado pathlengths using a three-dimensional object identification algorithm applied to convection-allowing forecasts.” Weather and Forecasting, vol. 27, 1090–1113, doi: 10.1175/WAF-D-11-00147.1

[FlackCAPE2023]

Flack, D.L.A., Lehnert, M., Lean, H.W., and Willington, S. (2023) “Characteristics of Diagnostics for Identifying Elevated Convection over the British Isles in a Convection-Allowing Model.” Weather and Forecasting, vol. 30, 1079-1094, doi: 10.1175/WAF-D-22-0219.1

Examples

>>> CAPE_ratios=convection.cape_ratio(
        SBCAPE,MUCAPE,MUCIN)
>>> iplt.pcolormesh(CAPE_ratios[0,:,:],cmap=mpl.cm.RdBu)
>>> plt.gca().coastlines('10m')
>>> plt.colorbar()
>>> plt.clim(0,1)
>>> plt.show()
>>> CAPE_ratios=convection.cape_ratio(
        SBCAPE,MUCAPE,MUCIN,MUCIN_thresh=-1.5)
>>> iplt.pcolormesh(CAPE_ratios[0,:,:],cmap=mpl.cm.RdBu)
>>> plt.gca().coastlines('10m')
>>> plt.clim(0,1)
>>> plt.colorbar()
>>> plt.show()
CSET.operators.convection.inflow_layer_properties(EIB, BLheight, Orography)

Filter to create a binary mask identifying elevated convection.

Parameters:
  • EIB (Cube) – Effective inflow layer base (precalculated or as identified by the model). If using the UM please use STASH m01s20i119.

  • BLheight (Cube) – Boundary layer height (precalculated or as identified by the model). If using the UM please use STASH m01s00i025.

  • Orography (Cube) – Model or actual orography, expected to be 2 dimensional. If 3 or 4 dimensional cube given converts to 2 dimensions assuming static orography field in ensemble realization and time. If using the UM please use STASH m01s00i033.

Return type:

Cube

Notes

This diagnostic is based on the concept of an effective inflow layer. This concept was first introduced by Thompson et al. (2007) [Thompsonetal2007]. The inflow layer defined the region of air that is most likely to be ingested into the convective event. It is defined by thresholding the CAPE and CIN values: CAPE > 100 J/kg and |CIN| < 250 J/kg.

To turn this into a diagnostic for elevated convection the inflow layer base is filtered against the boundary layer height. The model orography is added to the boundary layer height to ensure reference height consistency as the BL height is defined above ground level and the inflow layer base is defined above sea level in the model output.

\[EIB > BLheight + Orography\]

This is a binary diagnostic. It has a value of 0 to imply the environment is suitable for surface-based convection. It has a value of 1 to indicate the environment is suitable to produce elevated convection.

Further details about this diagnostic for elevated convection identification can be found in Flack et al. (2023) [Flackinf2023].

Expected applicability ranges: Convective-scale models will be noisier than parametrized models as they are more responsive to the convection, and thus it may be more sensible to view as a larger spatial average rather than at native resolution.

Interpretation notes: The effective inflow layer base diagnostic from UM STASH is dependent upon the UM CAPE and CIN diagnostics. These diagnostics are calculated at the end of the timestep. Therefore this diagnostic is applicable after precipitation has occurred, not before as is the usual interpretation of CAPE related diagnostics.

You might encounter warnings with the following text Orography assumed not to vary with time or ensemble member. or Orography assumed not to vary with time and ensemble member. these warnings are expected when the orography files are not 2-dimensional, and do not cause any problems unless ordering is not as expected.

References

[Thompsonetal2007]

Thompson, R. L. Mead, C. M., and Edwards, R., (2007) “Effective Storm-Relative Helicity and Bulk Shear in Supercell Thunderstorm Environments.” Weather and Forecasting, vol. 22, 102-115, doi: 10.1175/WAF969.1

[Flackinf2023]

Flack, D.L.A., Lehnert, M., Lean, H.W., and Willington, S. (2023) “Characteristics of Diagnostics for Identifying Elevated Convection over the British Isles in a Convection-Allowing Model.” Weather and Forecasting, vol. 30, 1079-1094, doi: 10.1175/WAF-D-22-0219.1

Examples

>>> Inflow_properties=convection.inflow_layer_properties(EIB,BLheight,Orography)
>>> iplt.pcolormesh(Inflow_properties[0,:,:],cmap=mpl.cm.Purples)
>>> plt.gca().coastlines('10m')
>>> plt.colorbar()
>>> plt.clim(0,1)
>>> plt.show()

Ensemble Operators

CSET.operators.ensembles

A module containing different diagnostics for ensembles.

The diagnostics here are applicable to ensembles and apply generally to ensembles. They are not just limited to considering ensemble spread.

CSET.operators.ensembles.DKE(u: Cube, v: Cube) Cube

Calculate the Difference Kinetic Energy (DKE).

Parameters:
  • u (iris.cube.Cube) – Iris cube of the u component of the wind field for all members. Must include a realization coordinate; realization must be the first coordinate.

  • v (iris.cube.Cube) – Iris cube of the v component of the wind field same format as u.

Returns:

DKE – An iris cube of the DKE for each of the control - member comparisons.

Return type:

iris.cube.Cube

Notes

The Difference Kinetic Energy, or DKE was first used in an ensemble sense by [Zhangetal2002]. It is calculated as

\[\frac{1}{2}u'u' + \frac{1}{2}v'v'\]

where

\[x' = x_{control} - x_{perturbed}\]

for each member of the ensemble.

The DKE is used to show links between the perturbation growth (growth of ensemble spread) or errors with dynamical features. It can be particularly useful in understanding differences in physical mechanisms between ensemble members. The larger the DKE the greater the difference between the two members being considered.

The DKE can be viewed as a domain average, horizontal integration (to produce profiles), or vertical integration/subsampling (to provide spatial maps). Furthermore, weightings based on volume or mass can be applied to these integrations. However, initially the DKE per unit mass is implemented here.

The DKE is often considered in the form of power spectra to identify the presence of upscale error growth or the dominant scales of error growth. It is often plotted on a logarithmic scale.

References

[Zhangetal2002]

Zhang, F., Snyder, C., and Rotunno, R., (2002) “Mesoscale Predictability of the ‘Surprise’ Snowstorm of 24-25 January 2000.” Monthly Weather Review, vol. 130, 1617-1632, doi: 10.1175/1520-0493(2002)130<1617:MPOTSS>2.0.CO;2

Examples

>>> DKE = ensembles.DKE(u, v)

Mesoscale Operators

CSET.operators.mesoscale

A module containing different diagnostics for mesoscales.

The diagnostics here are applicable at mesoscales and apply generally rather than for specific aspects of mesoscale meteorology (e.g. convection). For specific aspects, the user is referred to other modules available in CSET.

CSET.operators.mesoscale.spatial_perturbation_field(original_field: Cube, apply_gaussian_filter: bool = True, filter_scale: int = 40) Cube

Calculate a spatial perturbation field.

Parameters:
  • original_field (iris.cube.Cube) – Iris cube containing data to smooth, supporting multiple dimensions (at least two spatial dimensions must be supplied, i.e. 2D).

  • apply_gaussian_filter (boolean, optional) – If set to True a Gaussian filter is applied; if set to False a Uniform filter is applied. Default is True.

  • filter_scale (int, optional) – Scale at which to define the filter in grid boxes. If the filter is a Gaussian convolution this value represents the standard deviation of the Gaussian kernel. Default is 40 grid boxes.

Returns:

pert_field – An iris cube of the spatial perturbation field.

Return type:

iris.cube.Cube

Notes

In mesoscale meteorology the perturbation field is more important than the balanced flows for process understanding. This function is designed to create spatial perturbation fields based on smoothing with a Gaussian kernel or a uniform kernel.

The kernels are defined by the filter_scale, which for mesoscale perturbations should be between an approximate cloud separation distance, of 30 km, and synoptic scale variations (1000 km). In practice any value between these ranges should provided broadly consistent results (e.g. [Flacketal2016]). The Gaussian kernel will give greater importance to areas closer to the event and will produce a smooth perturbation field. The uniform kernel will produce a smooth perturbation field but will not give local features as much prominence.

Caution should be applied to boundaries, particularly if the domain is of variable resolution, as some numerical artifacts could be introduced.

References

[Flacketal2016]

Flack, D.L.A., Plant, R.S., Gray, S.L., Lean, H.W., Keil, C. and Craig, G.C. (2016) “Characterisation of Convective Regimes over the British Isles.” Quarterly Journal of the Royal Meteorological Society, vol. 142, 1541-1553. doi:10.1002/qj.2758

Examples

>>> Temperature_perturbation = meso.spatial_perturbation_fields(Temp,
                               gaussian_filter=True,filter_scale=40)
>>> iplt.pcolormesh(Temperature_perturabtion[0,:,:],cmap=mpl.cm.bwr)
>>> plt.gca().coastlines('10m')
>>> plt.clim(-5,5)
>>> plt.colorbar()
>>> plt.show()

Other

CSET.operators.ageofair

Age of air operator.

The age of air diagnostic provides a qualtitative view of how old air is within the domain, by calculating a back trajectory at each grid point at each lead time to determine when air entered through the lateral boundary. This is useful for diagnosing how quickly air ventilates the domain, depending on its size and the prevailing meteorology.

The diagnostic uses the u, v and w components of wind, along with geopotential height to perform the back trajectory. Data is first regridded to 0.5 degrees.

Note: the code here does not consider sub-grid transport, and only uses the postprocessed velocity fields and geopotential height. Its applicability is for large-scale flow O(1000 km), and not small scale flow where mixing is likely to play a larger role.

CSET.operators.ageofair.compute_ageofair(XWIND: Cube, YWIND: Cube, WWIND: Cube, GEOPOT: Cube, plev: int, cyclic: bool = False, multicore=True)

Compute back trajectories for a given forecast.

This allows us to determine when air entered through the boundaries. This will run on all available lead-times, and thus return an age of air cube of shape ntime, nlat, nlon. It supports multiprocesing, by iterating over longitude, or if set as None, will run on a single core, which is easier for debugging. This function supports ensembles, where it will check if realization dimension exists and if so, loop over this axis.

Parameters:
  • XWIND (Cube) – An iris cube containing the x component of wind on pressure levels, on a 0p5 degree grid. Requires 4 dimensions, ordered time, pressure, latitude and longitude. Must contain at least 2 time points to compute back trajectory.

  • YWIND (Cube) – An iris cube in the same format as XWIND.

  • WWIND (Cube) – An iris cube in the same format as XWIND.

  • GEOPOT (Cube) – An iris cube in the same format as XWIND.

  • plev (int) – The pressure level of which to compute the back trajectory on. The function will search to see if this exists and if not, will raise an exception.

  • cyclic (bool) – If cyclic is True, then the code will assume no east/west boundary and if a back trajectory reaches the boundary, it will emerge out of the other side. This option is useful for large domains such as the K-SCALE tropical channel, where there are only north/south boundaries in the domain.

  • multicore (bool) – If true, split up age of air diagnostic to use multiple cores (defaults to number of cores available to the process), otherwise run using a single process, which is easier to debug if developing the code.

Returns:

ageofair_cube – An iris cube of the age of air data, with 3 dimensions (time, latitude, longitude).

Return type:

Cube

Notes

The age of air diagnostic was used in Warner et al. (2023) [Warneretal2023] to identify the relative role of spin-up from initial conditions and lateral boundary conditions over tropical Africa to explore the impact of new data assimilation techniques. A further paper is currently in review ([Warneretal2024]) which applies the diagnostic more widely to the Australian ACCESS convection-permitting models.

References

[Warneretal2023]

Warner, J.L., Petch, J., Short, C., Bain, C., 2023. Assessing the impact of an NWP warm-start system on model spin-up over tropical Africa. QJ, 149( 751), pp.621-636. doi:10.1002/qj.4429

[Warneretal2024]

Diagnosing lateral boundary spin-up in regional models using an age of air diagnostic James L. Warner, Charmaine N. Franklin, Belinda Roux, Shaun Cooper, Susan Rennie, Vinod Kumar. Submitted for Quarterly Journal of the Royal Meteorological Society.