Interpreting Output

By default Vernier will write information out into one file per MPI rank. Single file output is not currently supported but is achievable via post-processing techniques.

Each file contains a table of the all the declared regions and their associated walltimes.

The environment variables section at the end of the setting up guide outlines environment variables relevant to Vernier’s output. To reiterate, the two output format options are threads and drhook.

Example “threads” output:

Thread: All                                     Self (s) Total (raw) (s)       Total (s)      Calls
---------------------------------------- --------------- --------------- --------------- ----------
LAPACK_zheev@0                                  0.176658        0.176658        0.176658       3141
PRINT_EIGENVALUES@0                             0.109331        0.109331        0.109331       3141
MAIN@0                                         0.0174985        0.320754        0.323922          1
Z_HAMILTONIAN_2ND@0                           0.00797964      0.00797964      0.00797964       3141
Z_HAMILTONIAN_INIT@0                          0.00691802      0.00691802      0.00691802       3141
__profiler__@0                                0.00316826      0.00316826      0.00316826      15706
Z_HAMILTONIAN_THIRD@0                         0.00236861      0.00236861      0.00236861       3141
  • Self: Total time spent inside the corresponding region only (excluding calls to other regions).

  • Total: Total time spent between the two callipers for this region including calls to elsewhere.

  • The “Total (raw)” column is the total time spent in a code region when the overhead from Vernier is taken into account. In the example above this overhead is only noticable with the overarching main program.

Example “drhook” output:

Profiling on 4 thread(s).

#  % Time         Cumul         Self        Total     # of calls        Self       Total    Routine@
                                                                         (Size; Size/sec; Size/call; MinSize; MaxSize)
    (self)        (sec)        (sec)        (sec)                    ms/call     ms/call

1   54.756        0.178        0.178        0.178           3141       0.057       0.057    LAPACK_zheev@0
2   34.040        0.288        0.110        0.110           3141       0.035       0.035    PRINT_EIGENVALUES@0
3    4.782        0.304        0.016        0.324              1      15.515     324.418    MAIN@0
4    2.500        0.312        0.008        0.008           3141       0.003       0.003    Z_HAMILTONIAN_2ND@0
5    2.141        0.319        0.007        0.007           3141       0.002       0.002    Z_HAMILTONIAN_INIT@0
6    1.033        0.322        0.003        0.003          15706       0.000       0.000    __profiler__@0
7    0.748        0.324        0.002        0.002           3141       0.001       0.001    Z_HAMILTONIAN_THIRD@0
  • % Time: The percentage of the overall time this region took up (calculated from self time).

    Warning

    The percentage is calculated from the highest total walltime of all profiled regions, so it is therefore assumed that a top-level pair of callipers encompassing the entire program is in place.

  • Cumul: The cumulative self time so far (the self times are ordered from high to low, this column sums the self times as it descends down the list).

  • Self: Total time spent inside the corresponding region only (excluding calls to other regions).

  • Total: Total time spent between the two callipers for this region including calls to elsewhere.

  • The self and total time per call (in ms) is also given.

In both examples the @0 appended onto the end of all region names indicates the OpenMP thread number.