So You’ve Been Asked to Code Review an LFRic Core Ticket#
You will need to be familiar with the relevant Fortran Coding Standards for LFRic but it is also important to understand some broader concepts.
As ever when reviewing no rules can be hard and fast. There’s always the possibility of exceptions however your default assumption must be that code which goes against these guidelines is wrong.
Never forget that “but it’s too hard” is rarely a reason for circumventing the guidelines.
If you are unsure on any point or just need moral support simply ask someone from the Core Capabilities Development (CCD) team. We are always happy to help.
PSyKAl model#
The most fundamental concept which drives the LFRic core is the PSyKAl (Parallel System, Kernel, Algorithm) model. This is a means of enforcing separation of concerns, meaning that different parts of a system see only the data pertinent to their operation.
More in-depth documentation is available in the Overview of the PSyKAl design, what follows is an overview of how it relates to reviewing.
Algorithms#
Algorithms deal in whole fields, they have no access to the field data. This is
the level at which conditional logic (if tests and suchlike) should appear.
It is also where you would expect to see namelist derived configuration data used.
In order to operate on fields an algorithm will call invoke(... one or more
kernels.
Kernels#
Kernels are called once per column of data in a field. As such they should not
contain conditional logic such as if tests as this represents a potential
performance problem.
Discovering a kernel which may take different code paths based on conditionals is often a sign that you are actually looking at two or more kernels bundled together in one.
These conditionals often make use of configuration data derived from namelists. This is another code smell you should be on the look out for.
All data needed by a kernel should be passed to it through its argument list, passing data via global variable is contra-indicated even more than normal due to the way it inhibits the use of accelerators such as GPUs.
Kernels should also never emmit log messages. Remember, that message is going to be generated for every column in the field.
Parallel System#
The code which interfaces the field orientated algorithms with the column orientated kernels is the PSy layer. This is generated by the PSyclone tool using metadata from each kernel.
Occasionally the need will arise for functionality not yet implemented by PSyclone. In these cases a “PSyKAl light” implementation may be made. This is a hand written version of the code which would be generated by PSyclone.
It allows progress to be made without waiting for PSyclone to catch up but it is also a valuable tool for PSyclone developers to understand what the missing functionality is.
New PSyKAl light code may not be added on a whim, there must first be a discussion with both CCD and PSyclone teams. There should be evidence of this discussion on the ticket and an associated PSyclone issue.
This issue must be cited in a comment with the PSyKAl light implementation.
Even when a “light” implementation is made, any new kernels should have correct metadata. Don’t let a developer leave it to someone else to try and work out what they were thinking.
Passing values by Global Variables#
The dangers of global variables and passing values by global variable are well understood. In summary they by-pass procedure argument lists and become secret dependencies. This means changing a global variable can have unexpected effects clear across the code base.
As such we have the normal prohibition on the use of globals, however we do allow global data in certain limited circumstances.
The test suite includes a task which checks for global data and reports, as an error, where it is found. This task uses a “dirty list” which contains files which are allowed to contain global variables.
Removing files from this list is always welcome and encouraged. Adding files, on the other hand, is instantly suspicious. The dirty list can only be expanded after a discussion (visible on the ticket) with CCD team.
In particular, adding a file to the dirty list is not an excuse for adding new global variables.
Interface Kernels#
There is a special subset of kernels which do not behave in the normal way.
Instead of performing some transformation on field data they martial the field data into a different form, appropriate for passing to some other code-base.
These are “interface kernels.”
The rule of thumb for interface kernels is one call to the underlying code-base per kernel. If there are two or more calls then it probably means two or more kernels have been pushed together and should be split apart.