gprof(1)
NAME
gprof - display call-graph profile data
SYNOPSIS
gprof [-abcCDlsz] [-e function-name] [-E function-name]
[-f function-name] [-F function-name]
[image-file [profile-file...]] [-n number of functions]
DESCRIPTION
The gprof utility produces an execution profile of a pro-
gram. The effect of called routines is incorporated in the
profile of each caller. The profile data is taken from the
call graph profile file that is created by programs compiled
with the -xpg option of cc(1), or by the -pg option with
other compilers, or by setting the LD_PROFILE environment
variable for shared objects. See ld.so.1(1). These compiler
options also link in versions of the library routines that
are compiled for profiling.
The symbol table in the executable image file image-file
(a.out by default) is read and correlated with the call
graph profile file profile-file (gmon.out by default).
First, execution times for each routine are propagated along
the edges of the call graph. Cycles are discovered, and
calls into a cycle are made to share the time of the cycle.
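One common way to discover such cycles is to look for groups of
functions that can all reach one another through calls (strongly
connected components). The Python sketch below illustrates the idea
only. The dictionary representation of the call graph and the name
find_cycles are assumptions made for this example and are not part
of gprof.

```python
def find_cycles(call_graph):
    """Return groups of mutually recursive functions.

    call_graph maps a caller name to a list of callee names
    (a hypothetical representation chosen for this sketch).
    Only groups with more than one member are reported.
    """
    index = {}        # discovery order of each function
    low = {}          # lowest discovery index reachable from it
    stack = []
    on_stack = set()
    cycles = []
    counter = [0]

    def visit(node):
        index[node] = low[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for callee in call_graph.get(node, ()):
            if callee not in index:
                visit(callee)
                low[node] = min(low[node], low[callee])
            elif callee in on_stack:
                low[node] = min(low[node], index[callee])
        if low[node] == index[node]:
            # node is the root of a component: pop its members
            members = []
            while True:
                top = stack.pop()
                on_stack.discard(top)
                members.append(top)
                if top == node:
                    break
            if len(members) > 1:
                cycles.append(sorted(members))

    for node in list(call_graph):
        if node not in index:
            visit(node)
    return cycles


graph = {"main": ["a"], "a": ["b"], "b": ["a", "c"], "c": []}
cycles = find_cycles(graph)
```

Here a and b call each other, so they form one cycle, while main
and c stay outside it.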
The first listing shows the functions sorted according to
the time they represent, including the time of their call
graph descendants. Below each function entry is shown its
(direct) call-graph children and how their times are pro-
pagated to this function. A similar display above the func-
tion shows how this function's time and the time of its des-
cendants are propagated to its (direct) call-graph parents.
Cycles are also shown, with an entry for the cycle as a
whole and a listing of the members of the cycle and their
contributions to the time and call counts of the cycle.
Next, a flat profile is given, similar to that provided by
prof(1). This listing gives the total execution times and
call counts for each of the functions in the program, sorted
by decreasing time. Finally, an index is given, which shows
the correspondence between function names and call-graph
profile index numbers.
A single function may be split into subfunctions for profil-
ing by means of the MARK macro. See prof(5).
Beware of quantization errors. The granularity of the sam-
pling is shown, but remains statistical at best. It is
assumed that the time for each execution of a function can
be expressed by the total time for the function divided by
the number of times the function is called. Thus the time
propagated along the call-graph arcs to parents of that
function is directly proportional to the number of times
that arc is traversed.
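The proportional rule above can be sketched in a few lines of
Python. This is a simplified model for illustration. The function
name propagate_to_parents and the dictionary of arc counts are
assumptions of the example, not gprof internals.

```python
def propagate_to_parents(child_total_time, arc_counts):
    """Split a function's total time among its parents.

    arc_counts maps each parent name to the number of times it
    called the function. Every call is assumed to cost the same:
    total time divided by total number of calls.
    """
    total_calls = sum(arc_counts.values())
    if total_calls == 0:
        return {parent: 0.0 for parent in arc_counts}
    per_call = child_total_time / total_calls
    return {parent: per_call * calls
            for parent, calls in arc_counts.items()}


# A function that used 6.0 seconds, called once by A and twice by B.
shares = propagate_to_parents(6.0, {"A": 1, "B": 2})
```

B receives twice the time that A does, even if the single call from
A was in fact the expensive one. This is the source of the
statistical caveat above.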
The profiled program must call exit(2) or return normally
for the profiling information to be saved in the gmon.out
file.
OPTIONS
The following options are supported:
-a Suppress printing of statically declared functions.
If this option is given, all relevant information
about a static function (for instance, time samples,
calls to other functions, calls from other functions)
is attributed to the function loaded just before the
static function in the a.out file.
-b Brief. Suppress descriptions of each field in the
profile.
-c Discover the static call-graph of the program by a
heuristic which examines the text space of the object
file. Static-only parents or children are indicated
with call counts of 0. Note that for dynamically
linked executables, the linked shared objects' text
segments are not examined.
-C Demangle C++ symbol names before printing them out.
-D Produce a profile file gmon.sum that represents the
difference of the profile information in all specified
profile files. This summary profile file may be given
to subsequent executions of gprof (also with -D) to
summarize profile data across several runs of an a.out
file. See also the -s option.
As an example, suppose function A calls function B n
times in profile file gmon.sum, and m times in profile
file gmon.out. With -D, a new gmon.sum file will be
created showing the number of calls from A to B as n - m.
-e function-name
Suppress printing the graph profile entry for routine
function-name and all its descendants (unless they
have other ancestors that are not suppressed). More
than one -e option may be given. Only one function-
name may be given with each -e option.
-E function-name
Suppress printing the graph profile entry for routine
function-name (and its descendants) as -e, above, and
also exclude the time spent in function-name (and its
descendants) from the total and percentage time compu-
tations. More than one -E option may be given. For
example:
`-E mcount -E mcleanup'
is the default.
-f function-name
Print the graph profile entry only for routine
function-name and its descendants. More than one -f
option may be given. Only one function-name may be
given with each -f option.
-F function-name
Print the graph profile entry only for routine
function-name and its descendants (as -f, above) and
also use only the times of the printed routines in
total time and percentage computations. More than one
-F option may be given. Only one function-name may be
given with each -F option. The -F option overrides
the -E option.
-l Suppress the reporting of graph profile entries for
all local symbols. This option would be the
equivalent of placing all of the local symbols for the
specified executable image on the -E exclusion list.
-n Limit the size of the flat and graph profile listings to
the top n offending functions.
-s Produce a profile file gmon.sum which represents the
sum of the profile information in all of the specified
profile files. This summary profile file may be given
to subsequent executions of gprof (also with -s) to
accumulate profile data across several runs of an
a.out file. See also the -D option.
-z Display routines which have zero usage (as indicated
by call counts and accumulated time). This is useful
in conjunction with the -c option for discovering
which routines were never called. Note that this has
restricted use for dynamically linked executables,
since shared object text space will not be examined by
the -c option.
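The arithmetic behind the -s and -D options can be modelled as
follows. This Python sketch covers call counts only and represents
each profile as a dictionary keyed by (caller, callee). That
representation and the function names are assumptions of the
example. The real gmon.out and gmon.sum files are binary and also
carry time samples.

```python
from collections import Counter


def sum_profiles(profiles):
    """Model of -s: add up call counts across profile files."""
    total = Counter()
    for profile in profiles:
        total.update(profile)   # Counter.update adds the counts
    return dict(total)


def diff_profiles(summed, current):
    """Model of -D: subtract the counts in current from summed."""
    arcs = set(summed) | set(current)
    return {arc: summed.get(arc, 0) - current.get(arc, 0)
            for arc in arcs}


run1 = {("A", "B"): 3}
run2 = {("A", "B"): 2, ("B", "C"): 1}
summed = sum_profiles([run1, run2])
delta = diff_profiles(summed, run2)
```

With n = 5 calls from A to B in the summed profile and m = 2 in the
second run, the difference for that arc is n - m = 3.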
ENVIRONMENT VARIABLES
PROFDIR
If this environment variable contains a value, place
profiling output within that directory, in a file
named pid.programname. pid is the process ID and pro-
gramname is the name of the program being profiled, as
determined by removing any path prefix from the
argv[0] with which the program was called. If the
variable contains a null value, no profiling output is
produced. If the variable is not set, profiling output
is placed in the file gmon.out.
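The rules for choosing the output file can be summarised as a small
Python function. This is a sketch of the behaviour described above
and not the profiling runtime itself. The name profile_output_path
is hypothetical, and the sketch assumes that the gmon.out case
applies when PROFDIR is not set at all.

```python
import posixpath


def profile_output_path(environ, argv0, pid):
    """Return the profile file path, or None if none is written."""
    if "PROFDIR" not in environ:
        return "gmon.out"            # variable not set
    profdir = environ["PROFDIR"]
    if profdir == "":
        return None                  # null value: no output
    # Strip any path prefix from argv[0] to get the program name.
    program = posixpath.basename(argv0)
    return posixpath.join(profdir, f"{pid}.{program}")


unset_case = profile_output_path({}, "/usr/bin/app", 42)
null_case = profile_output_path({"PROFDIR": ""}, "/usr/bin/app", 42)
dir_case = profile_output_path({"PROFDIR": "/tmp/prof"},
                               "/usr/bin/app", 42)
```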
FILES
a.out
executable file containing namelist
gmon.out
dynamic call-graph and profile
gmon.sum
summarized dynamic call-graph and profile
$PROFDIR/pid.programname
ATTRIBUTES
See attributes(5) for descriptions of the following attri-
butes:
____________________________________________________________
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
|_____________________________|_____________________________|
| Availability | SUNWbtool |
|_____________________________|_____________________________|
SEE ALSO
cc(1), ld.so.1(1), prof(1), exit(2), pcsample(2), profil(2),
malloc(3C), malloc(3MALLOC), monitor(3C), attributes(5),
prof(5)
Graham, S.L., Kessler, P.B., McKusick, M.K., `gprof: A Call
Graph Execution Profiler', Proceedings of the SIGPLAN '82
Symposium on Compiler Construction, SIGPLAN Notices, Vol.
17, No. 6, pp. 120-126, June 1982.
Linker and Libraries Guide
NOTES
If the executable image has been stripped and has no symbol
table (.symtab), then gprof will read the dynamic symbol table
(.dynsym), if present. If the dynamic symbol table is used,
then only the information for the global symbols will be
available, and the behavior will be identical to the -a
option.
LD_LIBRARY_PATH must not contain /usr/lib as a component
when compiling a program for profiling. If LD_LIBRARY_PATH
contains /usr/lib, the program will not be linked correctly
with the profiling versions of the system libraries in
/usr/lib/libp.
The times reported in successive identical runs may show
variances because of varying cache-hit ratios that result
from sharing the cache with other processes. Even if a pro-
gram seems to be the only one using the machine, hidden
background or asynchronous processes may blur the data. In
rare cases, the clock ticks initiating recording of the pro-
gram counter may "beat" with loops in a program, grossly
distorting measurements. Call counts are always recorded
precisely, however.
Only programs that call exit or return from main are
guaranteed to produce a profile file, unless a final call to
monitor is explicitly coded.
Functions such as mcount(), _mcount(), moncontrol(), _mon-
control(), monitor(), and _monitor() may appear in the gprof
report. These functions are part of the profiling implemen-
tation and thus account for some amount of the runtime over-
head. Since these functions are not present in an unpro-
filed application, time accumulated and call counts for
these functions may be ignored when evaluating the perfor-
mance of an application.
64-bit profiling
64-bit profiling may be used freely with dynamically linked
executables, and profiling information is collected for the
shared objects if the objects are compiled for profiling.
Care must be taken when interpreting the profile output, since
it is possible for symbols from different shared objects to
have the same name. If name duplication occurs in the pro-
file output, the module id prefix before the symbol name in
the symbol index listing can be used to identify the
appropriate module for the symbol.
When using the -s or -D option to sum multiple profile
files, care must be taken not to mix 32-bit profile files
with 64-bit profile files.
32-bit profiling
32-bit profiling may be used with dynamically linked
executables, but care must be taken. In 32-bit profiling, shared
objects cannot be profiled with gprof. Thus, when a pro-
filed, dynamically linked program is executed, only the
"main" portion of the image is sampled. This means that all
time spent outside of the "main" object, that is, time spent
in a shared object, will not be included in the profile
summary; the total time reported for the program may be less
than the total time used by the program.
Because the time spent in a shared object cannot be
accounted for, the use of shared objects should be minimized
whenever a program is profiled with gprof. To get profile
information on the functions of a library, link the program
to the profiled version of the library (or to the standard
archive version if no profiling version is available)
instead of to the shared object. Versions
of profiled libraries may be supplied with the system in the
/usr/lib/libp directory. Refer to compiler driver documenta-
tion on profiling.
Consider an extreme case. A profiled program dynamically
linked with the shared C library spends 100 units of time in
some libc routine, say, malloc(). Suppose malloc() is called
only from routine B and B consumes only 1 unit of time.
Suppose further that routine A consumes 10 units of time,
more than any other routine in the "main" (profiled) portion
of the image. In this case, gprof will conclude that most of
the time is being spent in A and almost no time is being
spent in B. From this it will be almost impossible to tell
that the greatest improvement can be made by looking at rou-
tine B and not routine A. The value of the profiler in this
case is severely degraded; the solution is to use archives
as much as possible for profiling.
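The numbers in this extreme case can be worked through with a short
Python sketch. It assumes, for simplicity, that A and B are the
only routines in the "main" portion of the image. The function name
visible_shares is made up for the example.

```python
def visible_shares(self_times, profiled):
    """Return each profiled routine's fraction of the sampled time."""
    visible_total = sum(self_times[name] for name in profiled)
    return {name: self_times[name] / visible_total
            for name in profiled}


# Units of time from the example; malloc() lives in the shared libc.
self_times = {"A": 10, "B": 1, "malloc": 100}
profiled = ["A", "B"]   # assumed: the only routines that are sampled

shares = visible_shares(self_times, profiled)
reported_total = sum(self_times[name] for name in profiled)
actual_total = sum(self_times.values())
# B's real cost includes the malloc() time that it triggers.
actual_b_share = (self_times["B"] + self_times["malloc"]) / actual_total

print(f"reported: A {shares['A']:.1%}, B {shares['B']:.1%}")
print(f"actual share of B including malloc(): {actual_b_share:.1%}")
```

Under these assumptions the report covers 11 of the 111 units
actually used. A appears to dominate the profile, although B and
the malloc() calls it makes account for about 91% of the real time.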
BUGS
Parents which are not themselves profiled will have the time
of their profiled children propagated to them, but they will
appear to be spontaneously invoked in the call-graph list-
ing, and will not have their time propagated further. Simi-
larly, signal catchers, even though profiled, will appear to
be spontaneous (although for more obscure reasons). Any pro-
filed children of signal catchers should have their times
propagated properly, unless the signal catcher was invoked
during the execution of the profiling routine, in which case
all is lost.