====== Measuring performance ======

In general, to obtain useful profiling information from binaries, certain compile flags should be used.

===== Perf =====

To use perf, compile OOFEM with the flag ''-fno-omit-frame-pointer'' so that call stacks can be unwound reliably.

First record a session using

''perf record -g ./oofem -f myinputfile.in''

which generates a ''perf.data'' file. You can then visualize the results in several ways. A simple, easy-to-understand method is to use [[http://code.google.com/p/jrfonseca/wiki/Gprof2Dot|Gprof2Dot]] to generate a complete call graph:

''perf script | ./gprof2dot.py -f perf | dot -Tsvg -o output.svg''

Alternatively, use the ncurses interface:

''perf report -G %%--%%sort comm,dso''

Perf has very low overhead, but it only does statistical sampling.

//Users of perf will likely first need to turn off the kernel restrictions before they are able to run ''perf record'' as normal users://

''sudo sh -c %%"%%echo 0 > /proc/sys/kernel/kptr_restrict%%"%%''

===== Callgrind =====

Callgrind is a tool in valgrind, and should only be used on medium-sized problems. It has a huge overhead, so expect OOFEM to run over a hundred times slower under valgrind. Simply run

''valgrind %%--%%tool=callgrind ./oofem -f myinputfile.in''

which produces a new file named ''callgrind.out.123456'', where the number at the end is the process ID of the run. Open this file in KCachegrind to visualize the results.
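
Because the suffix of the ''callgrind.out.*'' file differs from run to run, it can be convenient to pick the newest output file automatically. The following is only a minimal sketch: the function name ''latest_callgrind_out'' is hypothetical and not part of OOFEM or valgrind, and it assumes the output files are in the given directory (the current directory by default).

```shell
#!/bin/sh
# Hypothetical helper (not part of OOFEM or valgrind): print the most
# recently modified callgrind.out.* file in a directory.
# Usage: latest_callgrind_out [directory]   (defaults to the current directory)
latest_callgrind_out() {
    dir="${1:-.}"
    # ls -t sorts by modification time, newest first. Errors are silenced so
    # that nothing is printed when no file matches the pattern.
    ls -t "$dir"/callgrind.out.* 2>/dev/null | head -n 1
}
```

Assuming KCachegrind is installed, the newest profile can then be opened with ''kcachegrind %%"%%$(latest_callgrind_out)%%"%%''.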