the memory isn't ever actually lost -- a pointer remains to it --
67
68
but it's not in use. Programs that have leaks like this can
68
69
unnecessarily increase the amount of memory they are using over
70
<div class="sect2" lang="en">
71
<div class="titlepage"><div><div><h3 class="title">
72
<a name="ms-manual.heapprof"></a>6.1.1.&nbsp;Why Use a Heap Profiler?</h3></div></div></div>
73
<p>Everybody knows how useful time profilers are for speeding
74
up programs. They are particularly useful because people are
75
notoriously bad at predicting where the bottlenecks are in their
77
<p>But the story is different for heap profilers. Some
78
programming languages, particularly lazy functional languages
79
like <a href="http://www.haskell.org" target="_top">Haskell</a>, have
80
quite sophisticated heap profilers. But there are few tools as
81
powerful for profiling C and C++ programs.</p>
82
<p>Why is this? Maybe it's because C and C++ programmers must
83
think that they know where the memory is being allocated. After
84
all, you can see all the calls to
85
<code class="computeroutput">malloc()</code> and
86
<code class="computeroutput">new</code> and
87
<code class="computeroutput">new[]</code>, right? But, in a big
88
program, do you really know which heap allocations are being
89
executed, how many times, and how large each allocation is? Can
90
you give even a vague estimate of the memory footprint for your
91
program? Do you know this for all the libraries your program
92
uses? What about administration bytes required by the heap
93
allocator to track heap blocks -- have you thought about them?
94
What about the stack? If you are unsure about any of these
95
things, maybe you should think about heap profiling.</p>
96
<p>Massif can tell you these things.</p>
97
<p>Or maybe it's because it's relatively easy to add basic
98
heap profiling functionality into a program, to tell you how many
99
bytes you have allocated for certain objects, or similar. But
100
this information might be limited to simple totals for the
101
whole program's execution. What about space usage at different
102
points in the program's execution, for example? And
103
reimplementing heap profiling code for each project is a
105
<p>Massif can save you this effort.</p>
70
time. Massif can help identify these leaks.</p>
71
<p>Importantly, Massif tells you not only how much heap memory your
72
program is using, it also gives very detailed information that indicates
73
which parts of your program are responsible for allocating the heap memory.
<div class="sect1" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
110
<a name="ms-manual.using"></a>6.2.&nbsp;Using Massif</h2></div></div></div>
111
<div class="sect2" lang="en">
112
<div class="titlepage"><div><div><h3 class="title">
113
<a name="ms-manual.overview"></a>6.2.1.&nbsp;Overview</h3></div></div></div>
114
<p>First off, as for normal Valgrind use, you probably want to
115
compile with debugging info (the
116
<code class="computeroutput">-g</code> flag). But, as opposed to
117
Memcheck, you probably <span><strong class="command">do</strong></span> want to turn
118
optimisation on, since you should profile your program as it will
120
<p>Then, run your program with <code class="computeroutput">valgrind
121
--tool=massif</code> in front of the normal command
122
line invocation. When the program finishes, Massif will print
123
summary space statistics. It also creates a graph representing
124
the program's heap usage in a file called
125
<code class="filename">massif.pid.ps</code>, which can be read by any
126
PostScript viewer, such as Ghostview.</p>
127
<p>It also puts detailed information about heap consumption in
128
a file <code class="filename">massif.pid.txt</code> (text format) or
129
<code class="filename">massif.pid.html</code> (HTML format), where
130
<span class="emphasis"><em>pid</em></span> is the program's process id.</p>
132
<div class="sect2" lang="en">
133
<div class="titlepage"><div><div><h3 class="title">
134
<a name="ms-manual.basicresults"></a>6.2.2.&nbsp;Basic Results of Profiling</h3></div></div></div>
135
<p>To gather heap profiling information about the program
78
<a name="ms-manual.using"></a>8.2.&nbsp;Using Massif</h2></div></div></div>
79
<p>First off, as for the other Valgrind tools, you should compile with
80
debugging info (the <code class="computeroutput">-g</code> flag). It shouldn't
81
matter much what optimisation level you compile your program with, as this
82
is unlikely to affect the heap memory usage.</p>
83
<p>Then, to gather heap profiling information about the program
<code class="computeroutput">prog</code>, type:</p>
<pre class="screen">
138
% valgrind --tool=massif prog</pre>
139
<p>The program will execute (slowly). Upon completion,
140
summary statistics that look like this will be printed:</p>
141
<pre class="programlisting">
142
==27519== Total spacetime: 2,258,106 ms.B
143
==27519== heap: 24.0%
144
==27519== heap admin: 2.2%
145
==27519== stack(s): 73.7%</pre>
146
<p>All measurements are done in
147
<span class="emphasis"><em>spacetime</em></span>, i.e. space (in bytes) multiplied
148
by time (in milliseconds). Note that because Massif slows a
149
program down a lot, the actual spacetime figure is fairly
150
meaningless; it's the relative values that are
152
<p>Which entries you see in the breakdown depends on the
153
command line options given. The above example measures all the
154
possible parts of memory:</p>
155
<div class="itemizedlist"><ul type="disc">
156
<li><p>Heap: number of words allocated on the heap, via
157
<code class="computeroutput">malloc()</code>,
158
<code class="computeroutput">new</code> and
159
<code class="computeroutput">new[]</code>.</p></li>
160
<li><p>Heap admin: each heap block allocated requires some
161
administration data, which lets the allocator track certain
162
things about the block. It is easy to forget about this, and
163
if your program allocates lots of small blocks, it can add
164
up. This value is an estimate of the space required for this
165
administration data.</p></li>
166
<li><p>Stack(s): the spacetime used by the program's stack(s).
167
(Threaded programs can have multiple stacks.) This includes
168
signal handler stacks.</p></li>
171
<div class="sect2" lang="en">
172
<div class="titlepage"><div><div><h3 class="title">
173
<a name="ms-manual.graphs"></a>6.2.3.&nbsp;Spacetime Graphs</h3></div></div></div>
174
<p>As well as printing summary information, Massif also
175
creates a file representing a spacetime graph,
176
<code class="filename">massif.pid.hp</code>. It will produce a file
177
called <code class="filename">massif.pid.ps</code>, which can be viewed in
178
a PostScript viewer.</p>
179
<p>Massif uses a program called
180
<code class="computeroutput">hp2ps</code> to convert the raw data
181
into the PostScript graph. It's distributed with Massif, but
182
came originally from the
183
<a href="http://www.haskell.org/ghc/" target="_top">Glasgow Haskell
184
Compiler</a>. You shouldn't need to worry about this at all.
185
However, if the graph creation fails for any reason, Massif will
186
tell you, and will leave behind a file named
187
<code class="filename">massif.pid.hp</code>, containing the raw heap
189
<p>Here's an example graph:</p>
190
<div class="mediaobject">
191
<a name="spacetime-graph"></a><img src="images/massif-graph-sm.png" alt="Spacetime Graph">
193
<p>The graph is broken into several bands. Most bands
194
represent a single line of your program that does some heap
195
allocation; each such band represents all the allocations and
196
deallocations done from that line. Up to twenty bands are shown;
197
less significant allocation sites are merged into "other" and/or
198
"OTHER" bands. The accompanying text/HTML file produced by
199
Massif has more detail about these heap allocation bands. Then
200
there are single bands for the stack(s) and heap admin
202
<p><b>Note:&nbsp;</b>it's the height of a band that's important. Don't let the
203
ups and downs caused by other bands confuse you. For example,
204
the <code class="computeroutput">read_alias_file</code> band in the
205
example has the same height all the time it's in existence.</p>
206
<p>The triangles on the x-axis show each point at which a
207
memory census was taken. These aren't necessarily evenly spread;
208
Massif only takes a census when memory is allocated or
209
deallocated. The time on the x-axis is wallclock time, which is
210
not ideal because you can get different graphs for different
211
executions of the same program, due to random OS delays. But
212
it's not too bad, and it becomes less of a problem the longer a
214
<p>Massif takes censuses at an appropriate timescale; censuses
215
take place less frequently as the program runs for longer. There
216
is no point having more than 100-200 censuses on a single
218
<p>The graphs give a good overview of where your program's
219
space use comes from, and how that varies over time. The
220
accompanying text/HTML file gives a lot more information about
224
<div class="sect1" lang="en">
225
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
226
<a name="ms-manual.heapdetails"></a>6.3.&nbsp;Details of Heap Allocations</h2></div></div></div>
227
<p>The text/HTML file contains information to help interpret
228
the heap bands of the graph. It also contains a lot of extra
229
information about heap allocations that you don't see in the
231
<p>Here's part of the information that accompanies the above
233
<div class="blockquote"><blockquote class="blockquote">
234
<div class="literallayout"><p>==&nbsp;0&nbsp;===========================</p></div>
235
<p>Heap allocation functions accounted for 50.8% of measured
238
<div class="itemizedlist"><ul type="disc">
239
<li><p><a name="a401767D1"></a>
240
<a href="#b401767D1" target="_top">22.1%</a>: 0x401767D0:
241
_nl_intern_locale_data (in /lib/i686/libc-2.3.2.so)</p></li>
242
<li><p><a name="a4017C394"></a>
243
<a href="#b4017C394" target="_top">8.6%</a>: 0x4017C393:
244
read_alias_file (in /lib/i686/libc-2.3.2.so)</p></li>
245
<li><p>... ... <span class="emphasis"><em>(several entries omitted)</em></span></p></li>
246
<li><p>and 6 other insignificant places</p></li>
249
<p>The first part shows the total spacetime due to heap
250
allocations, and the places in the program where most memory was
251
allocated (Nb: if this program had been compiled with
252
<code class="computeroutput">-g</code>, actual line numbers would be
253
given). These places are sorted, from most significant to least,
254
and correspond to the bands seen in the graph. Insignificant
255
sites (accounting for less than 0.5% of total spacetime) are
257
<p>That alone can be useful, but often isn't enough. What if
258
one of these functions was called from several different places
259
in the program? Which one of these is responsible for most of
261
<code class="computeroutput">_nl_intern_locale_data()</code>, this
262
question is answered by clicking on the
263
<a href="#b401767D1" target="_top">22.1%</a> link, which takes us to the
264
following part of the file:</p>
265
<div class="blockquote">
266
<a name="b401767D1"></a><blockquote class="blockquote">
267
<div class="literallayout"><p>==&nbsp;1&nbsp;===========================</p></div>
268
<p>Context accounted for <a href="#a401767D1" target="_top">22.1%</a>
269
of measured spacetime</p>
270
<p><code class="computeroutput"> 0x401767D0: _nl_intern_locale_data (in
271
/lib/i686/libc-2.3.2.so)</code></p>
273
<div class="itemizedlist"><ul type="disc"><li><p><a name="a40176F96"></a>
274
<a href="#b40176F96" target="_top">22.1%</a>: 0x40176F95:
275
_nl_load_locale_from_archive (in
276
/lib/i686/libc-2.3.2.so)</p></li></ul></div>
279
<p>At this level, we can see all the places from which
280
<code class="computeroutput">_nl_load_locale_from_archive()</code>
281
was called such that it allocated memory at 0x401767D0. (We can
282
click on the top <a href="#a40176F96" target="_top">22.1%</a> link to go back
283
to the parent entry.) At this level, we have moved beyond the
284
information presented in the graph. In this case, it is only
285
called from one place. We can again follow the link for more
286
detail, moving to the following part of the file.</p>
287
<div class="blockquote"><blockquote class="blockquote">
288
<div class="literallayout"><p>==&nbsp;2&nbsp;===========================</p></div>
289
<p><a name="b40176F96"></a>
290
Context accounted for <a href="#a40176F96" target="_top">22.1%</a> of
291
measured spacetime</p>
292
<p><code class="computeroutput"> 0x401767D0: _nl_intern_locale_data (in
293
/lib/i686/libc-2.3.2.so)</code> <code class="computeroutput">
294
0x40176F95: _nl_load_locale_from_archive (in
295
/lib/i686/libc-2.3.2.so)</code></p>
297
<div class="itemizedlist"><ul type="disc"><li><p><a name="a40176185"></a>22.1%: 0x40176184: _nl_find_locale (in
298
/lib/i686/libc-2.3.2.so)</p></li></ul></div>
300
<p>In this way we can dig deeper into the call stack, to work
301
out exactly what sequence of calls led to some memory being
302
allocated. At this point, with a call depth of 3, the
303
information runs out (thus the address of the child entry,
304
0x40176184, isn't a link). We could rerun the program with a
305
greater <code class="computeroutput">--depth</code> value if we
306
wanted more information.</p>
307
<p>Sometimes you will get a code location like this:</p>
308
<pre class="programlisting">
309
30.8% : 0xFFFFFFFF: ???</pre>
310
<p>The code address isn't really 0xFFFFFFFF -- that's
311
impossible. This is what Massif does when it can't work out what
312
the real code address is.</p>
313
<p>Massif produces this information in a plain text file by
314
default, or HTML with the
315
<code class="computeroutput">--format=html</code> option. The plain
316
text version obviously doesn't have the links, but a similar
317
effect can be achieved by searching on the code addresses. (In
318
Vim, the '*' and '#' searches are ideal for this.)</p>
319
<div class="sect2" lang="en">
320
<div class="titlepage"><div><div><h3 class="title">
321
<a name="ms-manual.accuracy"></a>6.3.1.&nbsp;Accuracy</h3></div></div></div>
322
<p>The information should be pretty accurate. Some
323
approximations made might cause some allocation contexts to be
324
attributed with less memory than they actually allocated, but the
325
amounts should be minuscule.</p>
326
<p>The heap admin spacetime figure is an approximation, as
327
described above. If anyone knows how to improve its accuracy,
328
please let us know.</p>
331
<div class="sect1" lang="en">
332
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
333
<a name="ms-manual.options"></a>6.4.&nbsp;Massif Options</h2></div></div></div>
86
% valgrind --tool=massif prog
88
<p>The program will execute (slowly). Upon completion, no summary
89
statistics are printed to Valgrind's commentary; all of Massif's profiling
90
data is written to a file. By default, this file is called
91
<code class="filename">massif.out.&lt;pid&gt;</code>, where
92
<code class="filename">&lt;pid&gt;</code> is the process ID.</p>
93
<p>To see the information gathered by Massif in an easy-to-read form, use
94
the ms_print script. If the output file's name is
95
<code class="filename">massif.out.12345</code>, type:</p>
97
% ms_print massif.out.12345</pre>
98
<p>ms_print will produce (a) a graph showing the memory consumption over
99
the program's execution, and (b) detailed information about the responsible
100
allocation sites at various points in the program, including the point of
101
peak memory allocation. The use of a separate script for presenting the
102
results is deliberate: it separates the data gathering from its
103
presentation, and means that new methods of presenting the data can be added in
105
<div class="sect2" lang="en">
106
<div class="titlepage"><div><div><h3 class="title">
107
<a name="ms-manual.anexample"></a>8.2.1.&nbsp;An Example Program</h3></div></div></div>
108
<p>An example will make things clear. Consider the following C program
109
(annotated with line numbers) which allocates a number of different blocks
112
1 #include <stdlib.h>
130
19 for (i = 0; i < 10; i++) {
131
20 a[i] = malloc(1000);
138
27 for (i = 0; i < 10; i++) {
146
<div class="sect2" lang="en">
147
<div class="titlepage"><div><div><h3 class="title">
148
<a name="ms-manual.theoutputpreamble"></a>8.2.2.&nbsp;The Output Preamble</h3></div></div></div>
149
<p>After running this program under Massif, the first part of ms_print's
150
output contains a preamble which just states how the program, Massif and
151
ms_print were each invoked:</p>
153
--------------------------------------------------------------------------------
155
Massif arguments: (none)
156
ms_print arguments: massif.out.12797
157
--------------------------------------------------------------------------------
160
<div class="sect2" lang="en">
161
<div class="titlepage"><div><div><h3 class="title">
162
<a name="ms-manual.theoutputgraph"></a>8.2.3.&nbsp;The Output Graph</h3></div></div></div>
163
<p>The next part is the graph that shows how memory consumption occurred
164
as the program executed:</p>
187
0 +----------------------------------------------------------------------->ki
190
Number of snapshots: 25
191
Detailed snapshots: [9, 14 (peak), 24]
193
<p>Why is most of the graph empty, with only a couple of bars at the very
194
end? By default, Massif uses "instructions executed" as the unit of time.
195
For very short-run programs such as the example, most of the executed
196
instructions involve the loading and dynamic linking of the program. The
197
execution of <code class="computeroutput">main</code> (and thus the heap
198
allocations) only occurs at the very end. For a short-running program like
199
this, we can use the <code class="computeroutput">--time-unit=B</code> option
200
to specify that we want the time unit to instead be the number of bytes
201
allocated/deallocated on the heap and stack(s).</p>
202
<p>If we re-run the program under Massif with this option, and then
203
re-run ms_print, we get this more useful graph:</p>
214
| : : # : : : : : : : .
215
| : : # : : : : : : : : .
216
| : : : # : : : : : : : : : ,
217
| @ : : : # : : : : : : : : : @
218
| : @ : : : # : : : : : : : : : @
219
| : : @ : : : # : : : : : : : : : @
220
| : : : @ : : : # : : : : : : : : : @
221
| : : : : @ : : : # : : : : : : : : : @
222
| : : : : : @ : : : # : : : : : : : : : @
223
| : : : : : : @ : : : # : : : : : : : : : @
224
| : : : : : : : @ : : : # : : : : : : : : : @
225
| : : : : : : : : @ : : : # : : : : : : : : : @
226
0 +----------------------------------------------------------------------->KB
229
Number of snapshots: 25
230
Detailed snapshots: [9, 14 (peak), 24]
232
<p>Each vertical bar represents a snapshot, i.e. a measurement of the
233
memory usage at a certain point in time. The text at the bottom shows that
234
25 snapshots were taken for this program, which is one per heap
235
allocation/deallocation, plus a couple of extras. Massif starts by taking
236
snapshots for every heap allocation/deallocation, but as a program runs for
237
longer, it takes snapshots less frequently. It also discards older
238
snapshots as the program goes on; when it reaches the maximum number of
239
snapshots (100 by default, although changeable with the
240
<code class="computeroutput">--max-snapshots</code> option) half of them are
241
deleted. This means that a reasonable number of snapshots are always
243
<p>Most snapshots are <span class="emphasis"><em>normal</em></span>, and only basic
244
information is recorded for them. Normal snapshots are represented in the
245
graph by bars consisting of ':' and '.' characters.</p>
246
<p>Some snapshots are <span class="emphasis"><em>detailed</em></span>. Information about
247
where allocations happened is recorded for these snapshots, as we will see
248
shortly. Detailed snapshots are represented in the graph by bars consisting
249
of '@' and ',' characters. The text at the bottom shows that 3 detailed
250
snapshots were taken for this program (snapshots 9, 14 and 24). By default,
251
every 10th snapshot is detailed, although this can be changed via the
252
<code class="computeroutput">--detailed-freq</code> option.</p>
253
<p>Finally, there is at most one <span class="emphasis"><em>peak</em></span> snapshot. The
254
peak snapshot is a detailed snapshot, and records the point where memory
255
consumption was greatest. The peak snapshot is represented in the graph by
256
a bar consisting of '#' and ',' characters. The text at the bottom shows
257
that snapshot 14 was the peak. Note that for tiny programs that never
258
deallocate heap memory, Massif will not record a peak snapshot.</p>
259
<p>Some more details about the peak: the peak is determined by looking
260
at every allocation, i.e. it is <span class="emphasis"><em>not</em></span> just the peak among
261
the regular snapshots. However, recording the true peak is expensive, and
262
so by default Massif records a peak whose size is within 1% of the size of
263
the true peak. See the description of the
264
<code class="computeroutput">--peak-inaccuracy</code> option below for more
266
<p>The following graph is from an execution of Konqueror, the KDE web
267
browser. It shows what graphs for larger programs look like.</p>
277
| : :@ :@@@ :: :@@#::
278
| ,: :@ :@@@ :: :@@#::
279
| ,:@: :@ :@@@ :: :@@#::.
280
| @@:@: :@ :@@@ :: :@@#:::
281
| ,,: .:: . , .::@@:@: :@ :@@@ :: :@@#:::
282
| .:@@: .: ::: ::: @ :::@@:@: :@ :@@@ :: :@@#:::
283
| ,: ::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
284
| @: ::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::.
285
| @: ::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
286
| , @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
287
| ::@ @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
288
| , :::::@ @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
289
| ..@ :::::@ @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
290
0 +----------------------------------------------------------------------->Mi
293
Number of snapshots: 63
294
Detailed snapshots: [3, 4, 10, 11, 15, 16, 29, 33, 34, 36, 39, 41,
295
42, 43, 44, 49, 50, 51, 53, 55, 56, 57 (peak)]
297
<p>Note that the larger size units are KB, MB, GB, etc. As is typical
298
for memory measurements, these are based on a multiplier of 1024, rather
299
than the standard SI multiplier of 1000. Strictly speaking, they should be
300
written KiB, MiB, GiB, etc.</p>
302
<div class="sect2" lang="en">
303
<div class="titlepage"><div><div><h3 class="title">
304
<a name="ms-manual.thesnapshotdetails"></a>8.2.4.&nbsp;The Snapshot Details</h3></div></div></div>
305
<p>Returning to our example, the graph is followed by the detailed
306
information for each snapshot. The first nine snapshots are normal, so only
307
a small amount of information is recorded for each one:</p>
309
--------------------------------------------------------------------------------
310
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
311
--------------------------------------------------------------------------------
313
1 1,008 1,008 1,000 8 0
314
2 2,016 2,016 2,000 16 0
315
3 3,024 3,024 3,000 24 0
316
4 4,032 4,032 4,000 32 0
317
5 5,040 5,040 5,000 40 0
318
6 6,048 6,048 6,000 48 0
319
7 7,056 7,056 7,000 56 0
320
8 8,064 8,064 8,000 64 0
322
<p>Each normal snapshot records several things.</p>
323
<div class="itemizedlist"><ul type="disc">
324
<li><p>Its number.</p></li>
325
<li><p>The time it was taken. In this case, the time unit is
326
bytes, due to the use of
327
<code class="computeroutput">--time-unit=B</code>.</p></li>
328
<li><p>The total memory consumption at that point.</p></li>
329
<li><p>The number of useful heap bytes allocated at that point.
330
This reflects the number of bytes asked for by the
333
<p>The number of extra heap bytes allocated at that point.
334
This reflects the number of bytes allocated in excess of what the program
335
asked for. There are two sources of extra heap bytes.</p>
336
<p>First, every heap block has administrative bytes associated with it.
337
The exact number of administrative bytes depends on the details of the
338
allocator. By default Massif assumes 8 bytes per block, as can be seen
339
from the example, but this number can be changed via the
340
<code class="computeroutput">--heap-admin</code> option.</p>
341
<p>Second, allocators often round up the number of bytes asked for to a
342
larger number. By default, if N bytes are asked for, Massif rounds N up
343
to the nearest multiple of 8 that is equal to or greater than N. This is
344
typical behaviour for allocators, and is required to ensure that elements
345
within the block are suitably aligned. The rounding size can be changed
346
with the <code class="computeroutput">--alignment</code> option, although it
347
cannot be less than 8, and must be a power of two.</p>
349
<li><p>The size of the stack(s). By default, stack profiling is
350
off as it slows Massif down greatly. Therefore, the stack column is zero
351
in the example.</p></li>
353
<p>The next snapshot is detailed. As well as the basic counts, it gives
354
an allocation tree which indicates exactly which pieces of code were
355
responsible for allocating heap memory:</p>
357
9 9,072 9,072 9,000 72 0
358
99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
359
->99.21% (9,000B) 0x804841A: main (example.c:20)
361
<p>The allocation tree can be read from the top down. The first line
362
indicates all heap allocation functions such as <code class="function">malloc</code>
363
and C++ <code class="function">new</code>. All heap allocations go through these
364
functions, and so all 9,000 useful bytes (which is 99.21% of all allocated
365
bytes) go through them. But how were <code class="function">malloc</code> and new
366
called? At this point, every allocation so far has been due to line 20
367
inside <code class="function">main</code>, hence the second line in the tree. The
368
<code class="computeroutput">-></code> indicates that main (line 20) called
369
<code class="function">malloc</code>.</p>
370
<p>Let's see what the subsequent output shows happened next:</p>
372
--------------------------------------------------------------------------------
373
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
374
--------------------------------------------------------------------------------
375
10 10,080 10,080 10,000 80 0
376
11 12,088 12,088 12,000 88 0
377
12 16,096 16,096 16,000 96 0
378
13 20,104 20,104 20,000 104 0
379
14 20,104 20,104 20,000 104 0
380
99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
381
->49.74% (10,000B) 0x804841A: main (example.c:20)
383
->39.79% (8,000B) 0x80483C2: g (example.c:5)
384
| ->19.90% (4,000B) 0x80483E2: f (example.c:11)
385
| | ->19.90% (4,000B) 0x8048431: main (example.c:23)
387
| ->19.90% (4,000B) 0x8048436: main (example.c:25)
389
->09.95% (2,000B) 0x80483DA: f (example.c:10)
390
->09.95% (2,000B) 0x8048431: main (example.c:23)
392
<p>The first four snapshots are similar to the previous ones. But then
393
the global allocation peak is reached, and a detailed snapshot is taken.
394
Its allocation tree shows that 20,000B of useful heap memory has been
395
allocated, and the lines and arrows indicate that this is from three
396
different code locations: line 20, which is responsible for 10,000B
397
(49.74%); line 5, which is responsible for 8,000B (39.79%); and line 10,
398
which is responsible for 2,000B (9.95%).</p>
399
<p>We can then drill down further in the allocation tree. For example,
400
of the 8,000B asked for by line 5, half of it was due to a call from line
401
11, and half was due to a call from line 25.</p>
402
<p>In short, Massif collates the stack trace of every single allocation
403
point in the program into a single tree, which gives a complete picture of
404
how and why all heap memory was allocated.</p>
405
<p>Note that the tree entries correspond not to functions, but to
406
individual code locations. For example, if function <code class="function">A</code>
407
calls <code class="function">malloc</code>, and function <code class="function">B</code> calls
408
<code class="function">A</code> twice, once on line 10 and once on line 11, then
409
the two calls will result in two distinct stack traces in the tree. In
410
contrast, if <code class="function">B</code> calls <code class="function">A</code> repeatedly
411
from line 15 (e.g. due to a loop), then each of those calls will be
412
represented by the same stack trace in the tree.</p>
413
<p>Note also that every tree entry with children in the example satisfies an
414
invariant: the entry's size is equal to the sum of its children's sizes.
415
For example, the first entry has size 20,000B, and its children have sizes
416
10,000B, 8,000B, and 2,000B. In general, this invariant almost always
417
holds. However, in rare circumstances stack traces can be malformed, in
418
which case a stack trace can be a sub-trace of another stack trace. This
419
means that some entries in the tree may not satisfy the invariant -- the
420
entry's size will be greater than the sum of its children's sizes. Massif
421
can sometimes detect when this happens; if it does, it issues a
424
Warning: Malformed stack trace detected. In Massif's output,
425
the size of an entry's child entries may not sum up
426
to the entry's size as they normally do.
428
<p>However, Massif does not detect and warn about every such occurrence.
429
Fortunately, malformed stack traces are rare in practice.</p>
430
<p>Returning now to ms_print's output, the final part is similar:</p>
432
--------------------------------------------------------------------------------
433
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
434
--------------------------------------------------------------------------------
435
15 21,112 19,096 19,000 96 0
436
16 22,120 18,088 18,000 88 0
437
17 23,128 17,080 17,000 80 0
438
18 24,136 16,072 16,000 72 0
439
19 25,144 15,064 15,000 64 0
440
20 26,152 14,056 14,000 56 0
441
21 27,160 13,048 13,000 48 0
442
22 28,168 12,040 12,000 40 0
443
23 29,176 11,032 11,000 32 0
444
24 30,184 10,024 10,000 24 0
445
99.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
446
->79.81% (8,000B) 0x80483C2: g (example.c:5)
447
| ->39.90% (4,000B) 0x80483E2: f (example.c:11)
448
| | ->39.90% (4,000B) 0x8048431: main (example.c:23)
450
| ->39.90% (4,000B) 0x8048436: main (example.c:25)
452
->19.95% (2,000B) 0x80483DA: f (example.c:10)
453
| ->19.95% (2,000B) 0x8048431: main (example.c:23)
455
->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
457
<p>The final detailed snapshot shows how the heap looked at termination.
458
The 00.00% entry represents the code locations for which memory was
459
allocated and then freed (line 20 in this case, the memory for which was
460
freed on line 28). However, no code location details are given for this
461
entry; by default, Massif only records the details for code locations
462
responsible for more than 1% of useful memory bytes, and ms_print likewise
463
only prints the details for code locations responsible for more than 1%.
464
The entries that do not meet this threshold are aggregated. This avoids
465
filling up the output with large numbers of unimportant entries. The
466
thresholds can be changed with the
467
<code class="computeroutput">--threshold</code> option that both Massif and
468
ms_print support.</p>
470
<div class="sect2" lang="en">
471
<div class="titlepage"><div><div><h3 class="title">
472
<a name="ms-manual.forkingprograms"></a>8.2.5.&nbsp;Forking Programs</h3></div></div></div>
473
<p>If your program forks, the child will inherit all the profiling data that
474
has been gathered for the parent.</p>
475
<p>If the output file format string (controlled by
476
<code class="option">--massif-out-file</code>) does not contain <code class="option">%p</code>, then
477
the outputs from the parent and child will be intermingled in a single output
478
file, which will almost certainly make it unreadable by ms_print.</p>
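<p>To see why the <code class="option">%p</code> specifier matters, the
sketch below models only the substitution of the process ID into the output
file name. It is an illustration, not Valgrind's implementation: the helper
names <code class="computeroutput">expand_out_file</code> and
<code class="computeroutput">outputs_collide</code> are hypothetical, and
other format specifiers are deliberately ignored.</p>

```python
def expand_out_file(fmt, pid):
    """Replace every %p in fmt with the given process ID (simplified model)."""
    return fmt.replace("%p", str(pid))


def outputs_collide(fmt, parent_pid, child_pid):
    """True if parent and child would write to the same output file name."""
    return expand_out_file(fmt, parent_pid) == expand_out_file(fmt, child_pid)


if __name__ == "__main__":
    # The PIDs are arbitrary example values.
    for fmt in ("massif.out.%p", "massif.out"):
        print(fmt, "collides" if outputs_collide(fmt, 4100, 4101) else "distinct")
```

<p>With the default format string the two processes get distinct files; once
<code class="option">%p</code> is dropped, both names are identical, which is
the intermingling case described above.</p>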
481
<div class="sect1" lang="en">
482
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
483
<a name="ms-manual.options"></a>8.3. Massif Options</h2></div></div></div>
334
484
<p>Massif-specific options are:</p>
335
485
<div class="variablelist">
336
486
<a name="ms.opts.list"></a><dl>
339
489
<code class="option">--heap=<yes|no> [default: yes] </code>
342
<dd><p>When enabled, profile heap usage in detail. Without it, the
343
<code class="filename">massif.pid.txt</code> or
344
<code class="filename">massif.pid.html</code> will be very short.</p></dd>
492
<dd><p>Specifies whether heap profiling should be done.</p></dd>
346
494
<a name="opt.heap-admin"></a><span class="term">
347
495
<code class="option">--heap-admin=<number> [default: 8] </code>
350
<dd><p>The number of admin bytes per block to use. This can only
351
be an estimate of the average, since it may vary. The allocator
352
used by <code class="computeroutput">glibc</code> requires somewhere
353
between 4 and 15 bytes per block, depending on various factors. It
354
also requires admin space for freed blocks, although
355
<code class="constant">massif</code> does not count this.</p></dd>
498
<dd><p>If heap profiling is enabled, gives the number of administrative
499
bytes per block to use. This should be an estimate of the average,
500
since it may vary. For example, the allocator used by
501
<code class="computeroutput">glibc</code> requires somewhere between 4 and
502
15 bytes per block, depending on various factors. It also requires
503
admin space for freed blocks, although Massif does not account
357
506
<a name="opt.stacks"></a><span class="term">
358
507
<code class="option">--stacks=<yes|no> [default: no] </code>
361
<dd><p>When enabled, include stack(s) in the profile. Threaded
362
programs can have multiple stacks.</p></dd>
510
<dd><p>Specifies whether stack profiling should be done. This option
511
slows Massif down greatly, and so is off by default. Note that Massif
512
assumes that the main stack has size zero at start-up. This is not
513
true, but measuring the actual stack size is not easy, and it reflects
514
the size of the part of the main stack that a user program actually
515
has control over.</p></dd>
364
517
<a name="opt.depth"></a><span class="term">
365
<code class="option">--depth=<number> [default: 3] </code>
518
<code class="option">--depth=<number> [default: 30] </code>
368
<dd><p>Depth of call chains to present in the detailed heap
369
information. Increasing it will give more information, but
370
<code class="constant">massif</code> will run the program more slowly,
371
using more memory, and produce a bigger
372
<code class="filename">massif.pid.txt</code> or
373
<code class="filename">massif.pid.hp</code> file.</p></dd>
521
<dd><p>Maximum depth of the allocation trees recorded for detailed
522
snapshots. Increasing it will make Massif run somewhat more slowly,
523
use more memory, and produce bigger output files.</p></dd>
375
525
<a name="opt.alloc-fn"></a><span class="term">
376
526
<code class="option">--alloc-fn=<name> </code>
379
<dd><p>Specify a function that allocates memory. This is useful
380
for functions that are wrappers to <code class="function">malloc()</code>,
381
which can fill up the context information uselessly (and give very
382
uninformative bands on the graph). Functions specified will be
383
ignored in contexts, i.e. treated as though they were
384
<code class="function">malloc()</code>. This option can be specified
385
multiple times on the command line, to name multiple
388
<a name="opt.format"></a><span class="term">
389
<code class="option">--format=<text|html> [default: text] </code>
392
<dd><p>Produce the detailed heap information in text or HTML
393
format. The file suffix used will be either
394
<code class="filename">.txt</code> or <code class="filename">.html</code>.</p></dd>
530
<p>Functions specified with this option will be treated as though
531
they were a heap allocation function such as
532
<code class="function">malloc</code>. This is useful for functions that are
533
wrappers to <code class="function">malloc</code> or <code class="function">new</code>,
534
which can fill up the allocation trees with uninteresting information.
535
This option can be specified multiple times on the command line, to
536
name multiple functions.</p>
537
<p>Note that overloaded C++ names must be written in full. Single
538
quotes may be necessary to prevent the shell from breaking them up.
542
--alloc-fn='operator new(unsigned, std::nothrow_t const&amp;)'
547
The full list of functions and operators that are by default
548
considered allocation functions is as follows.</p>
556
operator new(unsigned)
557
operator new(unsigned long)
558
operator new[](unsigned)
559
operator new[](unsigned long)
560
operator new(unsigned, std::nothrow_t const&)
561
operator new[](unsigned, std::nothrow_t const&)
562
operator new(unsigned long, std::nothrow_t const&)
563
operator new[](unsigned long, std::nothrow_t const&)
567
<a name="opt.threshold"></a><span class="term">
568
<code class="option">--threshold=<m.n> [default: 1.0] </code>
571
<dd><p>The significance threshold for heap allocations, as a
572
percentage. Allocation tree entries that account for less than this
573
will be aggregated. Note that this should be specified in tandem with
574
ms_print's option of the same name.</p></dd>
576
<a name="opt.peak-inaccuracy"></a><span class="term">
577
<code class="option">--peak-inaccuracy=<m.n> [default: 1.0] </code>
580
<dd><p>Massif does not necessarily record the actual global memory
581
allocation peak; by default it records a peak only when the global
582
memory allocation size exceeds the previous peak by at least 1.0%.
583
This is because there can be many local allocation peaks along the way,
584
and doing a detailed snapshot for every one would be expensive and
585
wasteful, as all but one of them will be later discarded. This
586
inaccuracy can be changed (even to 0.0%) via this option, but Massif
587
will run drastically slower as the number approaches zero.</p></dd>
589
<a name="opt.time-unit"></a><span class="term">
590
<code class="option">--time-unit=i|ms|B [default: i] </code>
593
<dd><p>The time unit used for the profiling. There are three
594
possibilities: instructions executed (i), which is good for most
595
cases; real (wallclock) time (ms, i.e. milliseconds), which is
596
sometimes useful; and bytes allocated/deallocated on the heap and/or
597
stack (B), which is useful for very short-run programs, and for
598
testing purposes, because it is the most reproducible across different
601
<a name="opt.detailed-freq"></a><span class="term">
602
<code class="option">--detailed-freq=<n> [default: 10] </code>
605
<dd><p>Frequency of detailed snapshots. With
606
<code class="computeroutput">--detailed-freq=1</code>, every snapshot is
609
<a name="opt.max-snapshots"></a><span class="term">
610
<code class="option">--max-snapshots=<n> [default: 100] </code>
613
<dd><p>The maximum number of snapshots recorded. If set to N, for all
614
programs except very short-running ones, the final number of snapshots
615
will be between N/2 and N.</p></dd>
617
<a name="opt.massif-out-file"></a><span class="term">
618
<code class="option">--massif-out-file=<file> [default: massif.out.%p] </code>
621
<dd><p>Write the profile data to <code class="computeroutput">file</code>
622
rather than to the default output file,
623
<code class="computeroutput">massif.out.<pid></code>. The
624
<code class="option">%p</code> and <code class="option">%q</code> format specifiers can be
625
used to embed the process ID and/or the contents of an environment
626
variable in the name, as is the case for the core option
627
<code class="option">--log-file</code>. See <a href="manual-core.html#manual-core.basicopts">Basic Options</a> for details.
630
<a name="opt.alignment"></a><span class="term">
631
<code class="option">--alignment=<n> [default: 1.0] </code>
634
<dd><p>The minimum alignment (and thus size) of heap blocks.</p></dd>
638
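<p>The rule described for <code class="option">--peak-inaccuracy</code> can
be modelled in a few lines. This is a simplified sketch of the rule as
stated in the option's description, not Massif's implementation; the
function name <code class="computeroutput">recorded_peaks</code> and the
sample sizes are assumptions chosen for illustration.</p>

```python
def recorded_peaks(sizes, peak_inaccuracy_pct=1.0):
    """Return (index, size) pairs at which a new peak would be recorded.

    A size is recorded as a peak only when it is larger than the last
    recorded peak by at least peak_inaccuracy_pct percent.
    """
    recorded = []
    peak = 0
    for index, size in enumerate(sizes):
        required = peak * (1.0 + peak_inaccuracy_pct / 100.0)
        if size > peak and size >= required:
            peak = size
            recorded.append((index, size))
    return recorded


if __name__ == "__main__":
    # Hypothetical global allocation sizes over time, in bytes.
    sizes = [1000, 1005, 1020, 900, 1025, 1100]
    print(recorded_peaks(sizes, 1.0))
    print(recorded_peaks(sizes, 0.0))
```

<p>In this model the default setting skips the small rises to 1005 and 1025
bytes, while a setting of 0.0 records every new maximum. That extra work on
each new maximum is the cost the option's description warns about.</p>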
<div class="sect1" lang="en">
639
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
640
<a name="ms-manual.ms_print-options"></a>8.4. ms_print Options</h2></div></div></div>
641
<p>ms_print's options are:</p>
642
<div class="itemizedlist"><ul type="disc">
644
<p><code class="computeroutput">-h, --help</code></p>
645
<p><code class="computeroutput">-v, --version</code></p>
646
<p>Help and version, as usual.</p>
649
<p><code class="option">--threshold=<m.n></code> [default: 1.0]</p>
650
<p>Same as Massif's <code class="computeroutput">--threshold</code>, but
651
applied after profiling rather than during.</p>
654
<p><code class="option">--x=<m.n></code> [default: 72]</p>
655
<p>Width of the graph, in columns.</p>
658
<p><code class="option">--y=<n></code> [default: 20]</p>
659
<p>Height of the graph, in rows.</p>
663
<div class="sect1" lang="en">
664
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
665
<a name="ms-manual.fileformat"></a>8.5. Massif's output file format</h2></div></div></div>
666
<p>Massif's file format is plain text (i.e. not binary) and deliberately
667
easy to read for both humans and machines. Nonetheless, the exact format
668
is not described here. This is because the format is currently very
669
Massif-specific. We plan to make the format more general, and thus suitable
670
for possible use with other tools. Once this has been done, the format will
671
be documented here.</p>
400
675
<br><table class="nav" width="100%" cellspacing="3" cellpadding="2" border="0" summary="Navigation footer">
402
677
<td rowspan="2" width="40%" align="left">
403
<a accesskey="p" href="cl-manual.html"><< 5. Callgrind: a heavyweight profiler</a> </td>
678
<a accesskey="p" href="hg-manual.html"><< 7. Helgrind: a thread error detector</a> </td>
404
679
<td width="20%" align="center"><a accesskey="u" href="manual.html">Up</a></td>
405
<td rowspan="2" width="40%" align="right"> <a accesskey="n" href="hg-manual.html">7. Helgrind: a data-race detector >></a>
680
<td rowspan="2" width="40%" align="right"> <a accesskey="n" href="nl-manual.html">9. Nulgrind: the "null" tool >></a>
408
683
<tr><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td></tr>