1
<?xml version="1.0" encoding="latin1" ?>
2
<!DOCTYPE chapter SYSTEM "chapter.dtd">
7
<year>2001</year><year>2009</year>
8
<holder>Ericsson AB. All Rights Reserved.</holder>
11
The contents of this file are subject to the Erlang Public License,
12
Version 1.1, (the "License"); you may not use this file except in
13
compliance with the License. You should have received a copy of the
14
Erlang Public License along with this software. If not, it can be
15
retrieved online at http://www.erlang.org/.
17
Software distributed under the License is distributed on an "AS IS"
18
basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
19
the License for the specific language governing rights and limitations
24
<title>Processes</title>
25
<prepared>Bjorn Gustavsson</prepared>
27
<date>2007-11-21</date>
29
<file>processes.xml</file>
33
<title>Creation of an Erlang process</title>
35
<p>An Erlang process is lightweight compared to operating
36
systems threads and processes.</p>
38
<p>A newly spawned Erlang process uses 309 words of memory
39
in the non-SMP emulator without HiPE support. (SMP support
40
and HiPE support will both add to this size.) The size can
41
be found out like this:</p>
44
Erlang (BEAM) emulator version 5.6 [async-threads:0] [kernel-poll:false]
46
Eshell V5.6 (abort with ^G)
47
1> <input>Fun = fun() -> receive after infinity -> ok end end.</input>
49
2> <input>{_,Bytes} = process_info(spawn(Fun), memory).</input>
51
3> <input>Bytes div erlang:system_info(wordsize).</input>
54
<p>The size includes 233 words for the heap area (which includes the stack).
55
The garbage collector will increase the heap as needed.</p>
57
<p>The main (outer) loop for a process <em>must</em> be tail-recursive.
58
If not, the stack will grow until the process terminates.</p>
60
<p><em>DO NOT</em></p>
68
Reply = handle_msg(Msg),
72
io:format("Message is processed~n", []).</code>
74
<p>The call to <c>io:format/2</c> will never be executed, but a
75
return address will still be pushed to the stack each time
76
<c>loop/0</c> is called recursively. The correct tail-recursive
77
version of the function looks like this:</p>
87
Reply = handle_msg(Msg),
93
<title>Initial heap size</title>
95
<p>The default initial heap size of 233 words is quite conservative
96
in order to support Erlang systems with hundreds of thousands or
97
even millions of processes. The garbage collector will grow and
98
shrink the heap as needed.</p>
100
<p>In a system that use comparatively few processes, performance
101
<em>might</em> be improved by increasing the minimum heap size using either
102
the <c>+h</c> option for
103
<seealso marker="erts:erl">erl</seealso> or on a process-per-process
104
basis using the <c>min_heap_size</c> option for
105
<seealso marker="erts:erlang#spawn_opt/4">spawn_opt/4</seealso>.</p>
107
<p>The gain is twofold: Firstly, although the garbage collector will
108
grow the heap, it will it grow it step by step, which will be more
109
costly than directly establishing a larger heap when the process
110
is spawned. Secondly, the garbage collector may also shrink the
111
heap if it is much larger than the amount of data stored on it;
112
setting the minimum heap size will prevent that.</p>
114
<warning><p>The emulator will probably use more memory, and because garbage
115
collections occur less frequently, huge binaries could be
116
kept much longer.</p></warning>
118
<p>In systems with many processes, computation tasks that run
119
for a short time could be spawned off into a new process with
120
a higher minimum heap size. When the process is done, it will
121
send the result of the computation to another process and terminate.
122
If the minimum heap size is calculated properly, the process may not
123
have to do any garbage collections at all.
124
<em>This optimization should not be attempted
125
without proper measurements.</em></p>
131
<title>Process messages</title>
133
<p>All data in messages between Erlang processes is copied, with
135
<seealso marker="binaryhandling#refc_binary">refc binaries</seealso>
136
on the same Erlang node.</p>
138
<p>When a message is sent to a process on another Erlang node,
139
it will first be encoded to the Erlang External Format before
140
being sent via an TCP/IP socket. The receiving Erlang node decodes
141
the message and distributes it to the right process.</p>
144
<title>The constant pool</title>
146
<p>Constant Erlang terms (also called <em>literals</em>) are now
147
kept in constant pools; each loaded module has its own pool.
148
The following function</p>
150
<p><em>DO</em> (in R12B and later)</p>
153
element(M, {31,28,31,30,31,30,31,31,30,31,30,31}).</code>
155
<p>will no longer build the tuple every time it is called (only
156
to have it discarded the next time the garbage collector was run), but
157
the tuple will be located in the module's constant pool.</p>
159
<p>But if a constant is sent to another process (or stored in
160
an ETS table), it will be <em>copied</em>.
161
The reason is that the run-time system must be able
162
to keep track of all references to constants in order to properly
163
unload code containing constants. (When the code is unloaded,
164
the constants will be copied to the heap of the processes that refer
165
to them.) The copying of constants might be eliminated in a future
170
<title>Loss of sharing</title>
172
<p>Shared sub-terms are <em>not</em> preserved when a term is sent
173
to another process, passed as the initial process arguments in
174
the <c>spawn</c> call, or stored in an ETS table.
175
That is an optimization. Most applications do not send message
176
with shared sub-terms.</p>
178
<p>Here is an example of how a shared sub-term can be created:</p>
187
kilo_byte(N-1, [Acc|Acc]).</code>
189
<p><c>kilo_byte/1</c> creates a deep list. If we call
190
<c>list_to_binary/1</c>, we can convert the deep list to a binary
194
1> <input>byte_size(list_to_binary(efficiency_guide:kilo_byte())).</input>
197
<p>Using the <c>erts_debug:size/1</c> BIF we can see that the
198
deep list only requires 22 words of heap space:</p>
201
2> <input>erts_debug:size(efficiency_guide:kilo_byte()).</input>
204
<p>Using the <c>erts_debug:flat_size/1</c> BIF, we can calculate
205
the size of the deep list if sharing is ignored. It will be
206
the size of the list when it has been sent to another process
207
or stored in an ETS table:</p>
210
3> <input>erts_debug:flat_size(efficiency_guide:kilo_byte()).</input>
213
<p>We can verify that sharing will be lost if we insert the
214
data into an ETS table:</p>
217
4> <input>T = ets:new(tab, []).</input>
219
5> <input>ets:insert(T, {key,efficiency_guide:kilo_byte()}).</input>
221
6> <input>erts_debug:size(element(2, hd(ets:lookup(T, key)))).</input>
223
7> <input>erts_debug:flat_size(element(2, hd(ets:lookup(T, key)))).</input>
226
<p>When the data has passed through an ETS table,
227
<c>erts_debug:size/1</c> and <c>erts_debug:flat_size/1</c>
228
return the same value. Sharing has been lost.</p>
230
<p>In a future release of Erlang/OTP, we might implement a
231
way to (optionally) preserve sharing. We have no plans to make
232
preserving of sharing the default behaviour, since that would
233
penalize the vast majority of Erlang applications.</p>
238
<title>The SMP emulator</title>
240
<p>The SMP emulator (introduced in R11B) will take advantage of
241
multi-core or multi-CPU computer by running several Erlang schedulers
242
threads (typically, the same as the number of cores). Each scheduler
243
thread schedules Erlang processes in the same way as the Erlang scheduler
244
in the non-SMP emulator.</p>
246
<p>To gain performance by using the SMP emulator, your application
247
<em>must have more than one runnable Erlang process</em> most of the time.
248
Otherwise, the Erlang emulator can still only run one Erlang process
249
at the time, but you must still pay the overhead for locking. Although
250
we try to reduce the locking overhead as much as possible, it will never
251
become exactly zero.</p>
253
<p>Benchmarks that may seem to be concurrent are often sequential.
254
The estone benchmark, for instance, is entirely sequential. So is also
255
the most common implementation of the "ring benchmark"; usually one process
256
is active, while the others wait in a <c>receive</c> statement.</p>
258
<p>The <seealso marker="percept:percept">percept</seealso> application
259
can be used to profile your application to see how much potential (or lack
260
thereof) it has for concurrency.</p>