Measuring performance
To obtain useful profiling information from a binary, it should be compiled with suitable flags. In particular, debug symbols (-g) let the tools map costs back to source lines.
Perf
To use perf, compile OOFEM with the flag -fno-omit-frame-pointer, so that perf can reconstruct call stacks from the frame pointers.
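As a minimal sketch of how the flag could be passed, assuming a CMake-based build with CMake 3.13 or newer: the function name, the directory names, and the RelWithDebInfo build type are illustrative assumptions, not requirements stated on this page.

```shell
# Sketch: configure and build OOFEM with frame pointers kept.
# Assumptions (hypothetical): sources in ./oofem, build tree in ./build-perf.
configure_oofem_for_perf() {
    src_dir="${1:-oofem}"
    build_dir="${2:-build-perf}"
    # RelWithDebInfo is CMake's optimized build type with debug symbols;
    # -fno-omit-frame-pointer keeps frame pointers for stack unwinding.
    cmake -S "$src_dir" -B "$build_dir" \
        -DCMAKE_BUILD_TYPE=RelWithDebInfo \
        -DCMAKE_CXX_FLAGS="-fno-omit-frame-pointer" \
        -DCMAKE_C_FLAGS="-fno-omit-frame-pointer" &&
    cmake --build "$build_dir"
}
```

Calling configure_oofem_for_perf with no arguments uses the assumed default directories; pass the source and build directories explicitly if your layout differs.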
First record a session using
perf record -g ./oofem -f myinputfile.in
which will generate a perf.data file.
Then you can visualize the results in several ways. A simple, easy-to-read option is to use gprof2dot (together with Graphviz, which provides the dot command) to generate a complete call graph:
perf script | ./gprof2dot.py -f perf | dot -Tsvg -o output.svg
perf has very low overhead, but it only does statistical sampling, so its results are estimates rather than exact counts.
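Because the results come from samples, a short run may collect too few of them to be meaningful. One possible remedy is to raise the sampling frequency with the -F option of perf record and then check the result with the plain-text report. The sketch below is illustrative only: the function name and the 999 Hz default are assumptions, and the kernel may cap the rate actually used.

```shell
# Sketch: sample at a chosen frequency, then print a text summary.
# Assumption: 999 Hz is an illustrative value, not an OOFEM recommendation.
perf_sample_report() {
    input_file="$1"
    freq="${2:-999}"
    # -F sets the sampling frequency in Hz; -g records call stacks.
    perf record -F "$freq" -g ./oofem -f "$input_file" || return 1
    # --stdio prints the report as text instead of opening the interactive UI.
    perf report --stdio
}
```

For example, perf_sample_report myinputfile.in 2000 would request roughly 2000 samples per second.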
Callgrind
Callgrind is a tool in Valgrind. It has a huge overhead, so it should only be used on problems of at most medium size: expect OOFEM to run over a hundred times slower through Valgrind. Simply run
valgrind --tool=callgrind ./oofem -f myinputfile.in
and you will produce a new file named callgrind.out.123456
where the number at the end is the process ID of the profiled run.
Open this file in KCachegrind to visualize the results.
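If a predictable file name or a quick look in the terminal is preferred, the run can be wrapped as in the following sketch. It relies on the --callgrind-out-file option and on callgrind_annotate, which is distributed with Valgrind. The function name and the default output name callgrind.out.oofem are hypothetical choices for illustration.

```shell
# Sketch: run Callgrind with a fixed output name and summarize it as text.
# Assumption: callgrind.out.oofem is an arbitrary name chosen for this example.
callgrind_profile() {
    input_file="$1"
    out_file="${2:-callgrind.out.oofem}"
    # --callgrind-out-file replaces the default callgrind.out.<pid> name.
    valgrind --tool=callgrind --callgrind-out-file="$out_file" \
        ./oofem -f "$input_file" || return 1
    # Print a per-function cost listing to the terminal.
    callgrind_annotate "$out_file"
}
```

The same output file can still be opened in KCachegrind afterwards.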