Writing an ANTS based application#
The ANTS library provides a toolkit of functionality common to many ancillary generating applications. The user-written applications used for the standard model configurations within the Momentum partnership are held under the ancillary-file-science project (you must be signed in to an account with appropriate GitHub permissions to access that link).
An application will usually contain code to carry out the following steps:
Read input arguments, such as input and output file names, target grid definition.
Load the data from the input files.
Process (e.g. regrid) the source data to produce the ancillary field(s).
Save the data to the output file.
For each of these steps, the ANTS library contains code to assist with common
operations. In particular, ants.command_parse provides a common
command-line interface, which should be used to ensure a consistent user
interface (UI) across all applications, and ants.io.save contains
routines for saving ancillary data to common file formats. As ANTS is based on
Iris, you can also use any of the available Iris functionality to process
your data.
In the tutorial here, we will write an ancillary generating application that covers all steps of the Ancillary Generation Pipeline. In real use cases though, it is often preferable to write multiple separate applications, connected via a workflow, to avoid writing monolithic, complex, and expensive to run individual applications. See ancillary-file-science and the associated rose-stem workflow for examples of such implementations and breakdowns.
As prerequisites for carrying out this tutorial we assume:
Familiarity with Python
Familiarity with Iris
Access to and ability to activate an environment with ANTS and its dependencies installed
And for the next steps, familiarity with:
Initial setup#
To get started, you will want to check out a copy of ancillary-file-science and create a branch where you will work on your implementation. You will also need an active environment with ANTS installed so that you can carry out interactive development and testing of your application.
Create App Directory#
Having got your working copy of your branch checked out, we need to create our
application in the appropriate place. Ancillary generation applications live in
the “Apps” directory of ancillary-file-science. For now, create a
directory under “Apps” called “Tutorial”, and within that create an
“ancil_tutorial.py” where we will be carrying out the coding exercise in this
tutorial.
In your editor of choice, open up the ancil_tutorial.py file and update it so it contains the following:
"""
Tutorial application
********************

This application was written to carry out the application development
tutorial in ANTS.

The application does the following:

* Loads source and target cubes from the provided filepaths,
* Regrids the source data to the target grid using an area weighted regrid,
* Adds 1.0 to every data value,
* Saves the result to the specified output file path as NetCDF and, optionally,
  also as an ancillary file.

The application implements a few features of ANTS that are common to
ancillary applications.
"""
import ants
import ants.regrid
import iris
from ants import load_cube
from ants.io.save import ancil, netcdf


def load_data():
    return


def add_one():
    return


def regrid_area_weighted():
    return


def _get_parser():
    return


def main():
    return


if __name__ == "__main__":
    main()
This provides us with an initial skeleton from which we will be working. Notice how we have set up placeholders for the majority of the steps we want our app to carry out - we always aim to break down our ancillary generation routines into their constituent parts and avoid “monolithic” main functions containing all the operations wherever possible.
Add Arg Parser#
ANTS provides a common command line arguments parser via
ants.command_parse for use in ancillary applications to provide
arguments such as:
Source filepath(s)
Output filepath
ANTS configuration filepath
Target grid or target landseamask filepath
NetCDF-only save option
This helps ensure a consistent interface across ANTS based applications and
provides a standard entry point for options typically required by them. The
ants.AntsArgParser class used for this can be extended to
include application-specific arguments, as required, by using the
argparse.ArgumentParser.add_argument() method. Here, we will use the out-the-box parser.
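As a rough illustration of adding an application-specific argument, the sketch below uses the standard library's argparse.ArgumentParser as a stand-in for ants.AntsArgParser, so that it runs without ANTS installed. The build_parser name and the --increment option are hypothetical and not part of ANTS, and the sources and -o/--output arguments are simplified imitations of the ones used in this tutorial.

```python
import argparse


def build_parser():
    """Return a stand-in parser with one application-specific option."""
    # Stand-in for ants.AntsArgParser(), used so the sketch runs without
    # ANTS. The two arguments below only imitate its interface (assumption).
    parser = argparse.ArgumentParser(description="Tutorial application")
    parser.add_argument("sources", nargs="+", help="Source filepath(s).")
    parser.add_argument("-o", "--output", required=True, help="Output filepath.")

    # Hypothetical application-specific argument, added with the standard
    # argparse add_argument() method.
    parser.add_argument(
        "--increment",
        type=float,
        default=1.0,
        help="Value to add to every data point (illustrative only).",
    )
    return parser


example_args = build_parser().parse_args(
    ["source.nc", "-o", "output.nc", "--increment", "2.5"]
)
print(example_args.sources, example_args.output, example_args.increment)
```

With a real ants.AntsArgParser instance, the same add_argument() call would be made on the object returned by the constructor before calling parse_args().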
In your previously created ancil_tutorial.py file, update
_get_parser(): as follows:
def _get_parser():
    """Get ANTS argument parser."""
    parser = ants.AntsArgParser(target_grid=True)
    return parser
and also update the if __name__ == "__main__": as:
if __name__ == "__main__":
    args = _get_parser().parse_args()
    main()
Save your changes and you can then inspect the interface by running
python ancil_tutorial.py --help. If all has been done correctly then you
should see help text printed to the command line showing the various options
you can supply via the AntsArgParser.
Notice that when we create our parser we are using the target_grid=True
option. This is because we intend to supply the application with a target grid
for regridding to later in our implementation.
Add File Loading#
For our next step, we are going to need some data. For this, we will be using
the sample_source.nc and sample_target.nc files stored in the
rose-stem/sources directory of ancillary-file-science. Take a moment to
inspect and familiarise yourself with these files using your netCDF tool of
choice e.g. ncdump -h <file> to inspect the contents. As we go about
implementing our application we will use print() statements to provide
debugging to confirm what we have done is what we expect. You should see
outputs consistent with what you saw inspecting the file in your chosen netCDF
tool.
To load our source data into our application we will make use of the
load_cube routine from ANTS. Update the load_data() routine as follows:
def load_data(source_path, target_path):
    """
    Return cubes obtained from loading the `source` and `target` files.

    Parameters
    ----------
    source_path : str
        Filename from which to read source data.
    target_path : str
        Filename from which to read target data.

    Returns
    -------
    tuple[:class:`~iris.cube.Cube`, :class:`~iris.cube.Cube`]
        The loaded `surface_altitude` and `target` cubes.
    """
    surface_altitude_constraint = iris.NameConstraint(standard_name="surface_altitude")
    surface_altitude_cube = load_cube(source_path, surface_altitude_constraint)
    target_cube = load_cube(target_path)
    return surface_altitude_cube, target_cube
and update the main() routine as:
def main(source_path, target_path):
    surface_altitude, target = load_data(source_path, target_path)
    print(surface_altitude)
    print(target)
    return
and if __name__ == "__main__": as:
if __name__ == "__main__":
    args = _get_parser().parse_args()
    main(args.sources, args.target_grid)
Looking at our load_data routine we’ve done a few things worth noting.
Often, source files will contain more than one field, or we want to make sure
we are only loading specific fields (as opposed to whatever is in there). To do
this we make use of Iris constraints. Here we want to specifically load the
surface altitude fields.
Run your application as: python ancil_tutorial.py /path/to/sample_source.nc --target-grid /path/to/sample_target.nc -o output.nc
and you should see details of the loaded iris cubes printed to screen. N.B. we
have had to supply the -o output.nc option as the interface requires it,
even though we have not yet implemented code to save the file itself.
Add Regridding#
Assuming no further processing of our source is necessary, the next thing we
will want to do is to put that data on our target model grid, as supplied to
our application via the --target-grid option. To do this we will be using
the ANTS general regrid interface. Update the def regrid_area_weighted():
section so it is as follows:
def regrid_area_weighted(source, target):
    """Regrid data from `source` cube to `target` grid.

    Uses the ANTS area-weighted regrid scheme.

    Parameters
    ----------
    source : :class:`~iris.cube.Cube`
        Cube containing the data to be regridded.
    target : :class:`~iris.cube.Cube`
        Target cube containing the destination grid.

    Returns
    -------
    :class:`~iris.cube.Cube`
        Data from `source` regridded to `target`.
    """
    scheme = ants.regrid.GeneralRegridScheme(horizontal_scheme="AreaWeighted")
    return source.regrid(target, scheme)
This takes in a source cube and uses the ANTS general regrid AreaWeighted scheme to regrid it to the provided target grid.
We will then integrate this into our main routine by updating it as follows:
def main(source_path, target_path):
    surface_altitude, target = load_data(source_path, target_path)
    print("Before regrid:")
    print(surface_altitude)
    print(target)
    regridded_surface_altitude = regrid_area_weighted(surface_altitude, target)
    print("After regrid:")
    print(regridded_surface_altitude)
    return
Run your application as before. Look at the outputs printed to screen and you should see that the resolution of the regridded surface altitude is different from the original source data.
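To give a feel for what an area-weighted regrid does, the following is a minimal one-dimensional sketch in plain Python. It is a conceptual illustration only, not the ANTS or Iris implementation: each target cell takes the mean of the source cells it overlaps, weighted by the length of the overlap. The function name and the representation of cells as (lower, upper) bounds are assumptions made for the example.

```python
def area_weighted_1d(source_bounds, source_values, target_bounds):
    """Regrid 1D cell values using overlap-weighted means.

    Bounds are (lower, upper) pairs. Each target value is the mean of the
    overlapping source values, weighted by the overlap length.
    """
    result = []
    for target_lower, target_upper in target_bounds:
        weighted_sum = 0.0
        total_weight = 0.0
        for (source_lower, source_upper), value in zip(source_bounds, source_values):
            overlap = min(target_upper, source_upper) - max(target_lower, source_lower)
            if overlap > 0:
                weighted_sum += overlap * value
                total_weight += overlap
        # A target cell with no overlapping source cell has no defined value.
        result.append(weighted_sum / total_weight if total_weight else None)
    return result


# Four source cells of width 1 regridded onto two target cells of width 2.
source_bounds = [(0, 1), (1, 2), (2, 3), (3, 4)]
source_values = [10.0, 20.0, 30.0, 50.0]
target_bounds = [(0, 2), (2, 4)]
print(area_weighted_1d(source_bounds, source_values, target_bounds))
```

On a real latitude-longitude grid the weights are cell areas rather than lengths, but the principle of averaging by overlap is the same.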
Add Some Processing#
With data on our target grid we will then likely want to do some processing.
Here, we will use our add_one() routine to do this. We’ll use this routine
to take in a cube, and return the result of adding 1.0 to the data in there.
Update it as follows:
def add_one(cube):
    """
    Return a modified cube.

    Returned cube is the same as the original cube, except 1.0 is added to
    each data value. Original cube is not modified.

    Note
    ----
    This function operates on a single cube, not a CubeList.

    Parameters
    ----------
    cube : :class:`~iris.cube.Cube`
        Cube to add 1.0 to.

    Returns
    -------
    :class:`~iris.cube.Cube`
        The cube that has had 1.0 added to it.
    """
    result = cube.copy(cube.core_data() + 1.0)
    return result
We will make use of this in our main function by applying it to our
regridded field. Update main() so it looks like this:
def main(source_path, target_path):
    surface_altitude, target = load_data(source_path, target_path)
    print("Before regrid:")
    print(surface_altitude)
    print(target)
    regridded_surface_altitude = regrid_area_weighted(surface_altitude, target)
    print("After regrid:")
    print(regridded_surface_altitude)
    result = add_one(regridded_surface_altitude)
    result.attributes["STASH"] = "m01s00i033"
    print("Results cube:")
    print(result)
    return
Notice that we have also updated the STASH attribute on the cube via the
result.attributes["STASH"] = "m01s00i033" operation. In order to generate
a valid UM ancillary file we need an associated STASH code as this is the only
identifier in the file as to what the data is, both to the model and any user
inspecting the file.
As before, run your program and make sure everything is as expected.
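The STASH string set in main() follows the model, section, item (MSI) pattern: m01s00i033 denotes model 1, section 0, item 33. As a small sanity check before setting the attribute, a string can be validated against that pattern. The parse_stash helper below is hypothetical and written for this illustration; it is not part of ANTS or Iris, and it assumes the zero-padded form shown in this tutorial.

```python
import re

# MSI pattern, assuming the zero-padded form used in this tutorial:
# two-digit model, two-digit section, three-digit item.
_STASH_PATTERN = re.compile(r"m(\d{2})s(\d{2})i(\d{3})")


def parse_stash(stash):
    """Return (model, section, item) integers from an MSI-style string."""
    match = _STASH_PATTERN.fullmatch(stash)
    if match is None:
        raise ValueError(f"Not a valid STASH string: {stash!r}")
    model, section, item = (int(group) for group in match.groups())
    return model, section, item


print(parse_stash("m01s00i033"))
```

A check like this catches typos such as a missing digit early, rather than at the point the ancillary file is written or read by the model.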
Save a File#
The final thing to add is saving the results of our processing.
By default we recommend always saving out a copy of the processed data in
netcdf in addition to any other formats. For saving, we will make use of the
routines in ants.io.save we imported earlier - ancil and netcdf.
Firstly, let's update our if __name__ == "__main__": section to pass through
the argument to main for what file to save to, along with whether we want to
only save in netcdf (a convention we follow across ancillary apps where
applicable) as:
if __name__ == "__main__":
    args = _get_parser().parse_args()
    main(
        args.sources,
        args.target_grid,
        args.output,
        args.netcdf_only,
    )
We will now update our main() section to take these arguments and to run
the savers. While we are in there we will also remove the debug print
statements, add some comments, and put in a docstring.
Update your main() section to look like this:
def main(source_path, target_path, output_path, netcdf_only):
    """
    Regrid a field, add 1 to the data, and save
    the result to a file.

    Parameters
    ----------
    source_path : str
        Filename from which to read source data.
    target_path : str
        Filename from which to read target grid.
    output_path : str
        Filename to write results to.
    netcdf_only : bool
        If True, only write output to netCDF file.
        If False, write output to both netCDF and
        ancillary files.

    Returns
    -------
    :class:`~iris.cube.Cube`
        The processed cube.
    """
    # Load in the source data and target grid
    surface_altitude, target = load_data(source_path, target_path)

    # Regrid surface_altitude to the target grid
    regridded_surface_altitude = regrid_area_weighted(surface_altitude, target)

    # Process the data
    result = add_one(regridded_surface_altitude)
    result.attributes["STASH"] = "m01s00i033"

    # Always save to netcdf file
    netcdf(result, output_path)

    # Also save to ancillary file unless the '--netcdf-only'
    # command line option has been specified
    if not netcdf_only:
        ancil(result, output_path)

    return result
Save your changes and try out your code. Experiment with the --netcdf-only
option and different filenames for your output. You can use mule to inspect
the ancil file to make sure it is as you expect, and your go-to netcdf
tool to inspect the netcdf file.
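Because the saving logic is a simple branch on netcdf_only, it can be exercised without touching the file system. The sketch below isolates that branch in a hypothetical save_outputs helper that receives the savers as arguments, then checks it with recording stand-ins. Both the helper and the stand-ins are assumptions made for illustration and are not ANTS functions; in the application itself the savers would be the netcdf and ancil routines imported from ants.io.save.

```python
def save_outputs(result, output_path, netcdf_only, save_netcdf, save_ancil):
    """Always save netCDF; also save an ancillary unless netcdf_only is set."""
    save_netcdf(result, output_path)
    if not netcdf_only:
        save_ancil(result, output_path)


def make_recorder(label, calls):
    """Return a stand-in saver that records its label and output path."""
    def recorder(result, output_path):
        calls.append((label, output_path))
    return recorder


calls = []
fake_netcdf = make_recorder("netcdf", calls)
fake_ancil = make_recorder("ancil", calls)

# First call imitates --netcdf-only, second call imitates the default.
save_outputs("data", "output.nc", True, fake_netcdf, fake_ancil)
save_outputs("data", "output.nc", False, fake_netcdf, fake_ancil)
print(calls)
```

Passing the savers in as arguments is only one way to make the branch testable; patching the imported functions in a test would achieve much the same thing.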
Next Steps#
With your application now written, your next step would be to add pytest style
unittests for the routines in the application. These would be stored in a
tests subdirectory of the Apps/Tutorial directory you created earlier. We
will not go into detail of doing that here, but because we broke down our code
into constituent parts rather than leaving it all under main() we are more
easily able to test the code. With unittests written, your final step would be
to integrate your new application into the rose-stem framework in
ancillary-file-science, providing a representative workflow with some
cutdown/representative/synthetic low resolution data. For examples of adding
unittests and the up-to-date way of implementing rose-stem testing, see the
example “Sample” application in
ancillary-file-science.
For the latest guidance on what would be expected from a finalised application visit the ANTS Working Practices page.
Summary#
From the above, you should now have created an application that:
Implements the ANTS argument parser
Loads in some specified data from a file
Loads in a target grid
Regrids data to a target grid using an ANTS regridder
Carries out some data processing
Saves data to netcdf and ancil file formats, with the option to save netcdf only
And if you carried out the Next Steps:
Is unittested
Is tested under rose-stem
Advanced Usage#
The tutorial provided above is a deliberately small scale example with low
resolution synthetic data and a low resolution target grid. In real world usage
source datasets are often significant in size and target grids contain many
more points than in the example here. As a result you may find that some of
the operations carried out consume too many resources for the platforms being
run on, or that the operations take too long for your needs. In that situation,
the author will need to look into optimisation of their codes. ANTS provides
some out-the-box support for this via the ants.decomposition framework
for splitting up and parallelising operation, noting that care will be needed
to ensure its usage is appropriate for the codes being parallelised.
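The general idea of splitting work into pieces and processing them in parallel can be sketched with the standard library alone. This is a conceptual illustration only and does not use or mirror the ants.decomposition API; the split_rows, process_block and process_in_blocks names, and the choice of a thread pool, are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, block_size):
    """Split a list of rows into consecutive blocks of at most block_size."""
    return [rows[start:start + block_size] for start in range(0, len(rows), block_size)]


def process_block(block):
    """Stand-in for an expensive operation: add 1.0 to every value."""
    return [[value + 1.0 for value in row] for row in block]


def process_in_blocks(rows, block_size, max_workers=2):
    """Process each block independently, then stitch results back in order."""
    blocks = split_rows(rows, block_size)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves the order of the input blocks.
        processed = list(executor.map(process_block, blocks))
    return [row for block in processed for row in block]


# A small 5 x 3 "grid" of values, processed two rows at a time.
grid = [[float(row * 3 + column) for column in range(3)] for row in range(5)]
print(process_in_blocks(grid, block_size=2))
```

This works cleanly only because each block can be processed independently of the others. Operations that need data from neighbouring blocks do not split this simply, which is one reason care is needed before parallelising.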
Finally, it is always worth taking a look at the existing ancillary generation applications in ancillary-file-science to see what has been done before. It may be that related implementations already exist for what you are intending to do, or that they offer ideas for how to use parts of the API in your processing chain.