op2kcg
The OProfile set of tools, and in particular operf, provide one way of accessing the hardware performance counters under Linux for the purposes of profiling.
$ operf ./a.out
$ opreport -gdf | op2calltree
$ kcachegrind oprof.out.unnamed &
The op2calltree script used above produces profiling information, by line and by instruction, but, despite its name, it cannot process calltree information.
The similarly-named op2calltree.py by Nathaniel Smith is written in Python 2 and converts from the output of opreport -gcf to KCachegrind's callgraph format. It includes the callgraph, but no profiling information.
The script offered here calls opreport itself, twice, and tries to combine the profiling data and the callgraph data.
$ operf -gl ./a.out
$ op2kcg -o out.dat
$ kcachegrind out.dat &
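To illustrate what combining the two kinds of data means, here is a minimal sketch of writing per-line sample counts and call edges into a single file in the callgrind profile format that KCachegrind reads. It is not the code of op2kcg itself: the function name `to_callgrind`, the shape of the two input dictionaries, and the event name `Samples` are all assumptions made for this example, and parsing the opreport output is left out entirely.

```python
def to_callgrind(self_costs, calls, event="Samples"):
    """Merge self costs and call edges into callgrind-format text.

    Hypothetical input shapes, chosen for this sketch only:
      self_costs: {(file, function): {line: samples}}
      calls:      {(file, function): [(callee_file, callee, call_line,
                                       callee_line, count, inclusive)]}
    """
    out = ["# callgrind format", "events: " + event, ""]
    # Emit one record per function seen in either data set.
    for path, func in sorted(set(self_costs) | set(calls)):
        out.append("fl=" + path)
        out.append("fn=" + func)
        # Profiling data: "<line> <cost>" for samples in the function itself.
        for line, cost in sorted(self_costs.get((path, func), {}).items()):
            out.append(f"{line} {cost}")
        # Callgraph data: callee, call count, then the inclusive cost
        # attributed to the calling line.
        for cfile, callee, call_line, callee_line, count, inclusive in calls.get(
            (path, func), []
        ):
            out.append("cfi=" + cfile)
            out.append("cfn=" + callee)
            out.append(f"calls={count} {callee_line}")
            out.append(f"{call_line} {inclusive}")
        out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    # Made-up numbers, purely to show the layout of the output.
    demo_costs = {("a.c", "main"): {10: 5}, ("a.c", "work"): {20: 95}}
    demo_calls = {("a.c", "main"): [("a.c", "work", 11, 20, 1, 95)]}
    print(to_callgrind(demo_costs, demo_calls))
```

A callgraph-only converter would emit just the `cfn=`/`calls=` lines, and a profile-only converter just the `<line> <cost>` lines. Having both in the same record lets KCachegrind show self and inclusive costs together.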
The script comes with no guarantees of correctness, fitness for purpose, or anything else. It is written in Python 3 and can be downloaded as op2kcg.
There is also an example of its usage, a more complicated example with gprof, an example with valgrind/callgrind, and some notes on the call graph.
Known issues
It does not work well with threaded code, even with OMP_NUM_THREADS=1.
Recursive functions may show incorrect counts in kcachegrind: see this bug report.