33
37
<tbody valign="top">
34
38
<tr><th class="docinfo-name">Author:</th>
35
39
<td>Arvid Norberg, <a class="last reference external" href="mailto:arvid@rasterbar.com">arvid@rasterbar.com</a></td></tr>
40
<tr><th class="docinfo-name">Version:</th>
38
44
<div class="contents topic" id="table-of-contents">
39
45
<p class="topic-title first">Table of contents</p>
40
46
<ul class="simple">
41
<li><a class="reference internal" href="#introduction" id="id4">introduction</a></li>
42
<li><a class="reference internal" href="#features" id="id5">features</a></li>
43
<li><a class="reference internal" href="#portability" id="id6">portability</a></li>
44
<li><a class="reference internal" href="#license" id="id7">license</a></li>
47
<li><a class="reference internal" href="#introduction" id="id2">introduction</a></li>
48
<li><a class="reference internal" href="#features" id="id3">features</a><ul>
49
<li><a class="reference internal" href="#extensions" id="id4">extensions</a></li>
50
<li><a class="reference internal" href="#disk-management" id="id5">disk management</a></li>
51
<li><a class="reference internal" href="#network" id="id6">network</a></li>
54
<li><a class="reference internal" href="#highlighted-features" id="id7">highlighted features</a><ul>
55
<li><a class="reference internal" href="#disk-caching" id="id8">disk caching</a></li>
56
<li><a class="reference internal" href="#high-performance-disk-subsystem" id="id9">high performance disk subsystem</a></li>
57
<li><a class="reference internal" href="#network-buffers" id="id10">network buffers</a></li>
58
<li><a class="reference internal" href="#piece-picker" id="id11">piece picker</a></li>
59
<li><a class="reference internal" href="#merkle-hash-tree-torrents" id="id12">merkle hash tree torrents</a></li>
60
<li><a class="reference internal" href="#customizable-file-storage" id="id13">customizable file storage</a></li>
61
<li><a class="reference internal" href="#easy-to-use-api" id="id14">easy to use API</a></li>
64
<li><a class="reference internal" href="#portability" id="id15">portability</a></li>
47
67
<div class="section" id="introduction">
48
68
<h1>introduction</h1>
49
<p>libtorrent is a C++ library that aims to be a good alternative to all the
50
other bittorrent implementations around. It is a
51
library and not a full featured client, although it comes with a working
53
<p>The main goals of libtorrent are:</p>
55
<li>to be cpu efficient</li>
56
<li>to be memory efficient</li>
57
<li>to be very easy to use</li>
69
<p>libtorrent is a feature complete C++ bittorrent implementation focusing
70
on efficiency and scalability. It runs on embedded devices as well as
71
desktops. It boasts a well documented library interface that is easy to
72
use. It comes with a simple bittorrent client demonstrating the use of
60
75
<div class="section" id="features">
62
<p>libtorrent is still being developed, however it is stable. It is an ongoing
63
project (including this documentation). The current state includes the
64
following features:</p>
77
<p>libtorrent is under active development. It is an ongoing project. Its
78
current state supports and includes the following features:</p>
79
<div class="section" id="extensions">
65
81
<ul class="simple">
66
<li>trackerless torrents (using the Mainline kademlia DHT protocol) with
67
some <a class="reference external" href="dht_extensions.html">DHT extensions</a>.</li>
68
<li>support for IPv6</li>
69
<li>NAT-PMP and UPnP support (automatic port mapping on routers that supports it)</li>
70
<li>uses a separate disk I/O thread to not have the disk ever block on network or
71
client interaction. (see <a class="reference external" href="manual.html#threads">threads</a>).</li>
72
<li>supports the bittorrent <a class="reference external" href="extension_protocol.html">extension protocol</a>. See <a class="reference external" href="manual.html#extensions">extensions</a>.</li>
73
<li>supports the uTorrent metadata transfer protocol (i.e. magnet links).</li>
82
<li>plugin interface for implementing custom bittorrent extensions
83
without having to modify libtorrent</li>
84
<li>supports trackerless torrents (using the Mainline kademlia DHT protocol) with
85
some <a class="reference external" href="dht_extensions.html">DHT extensions</a>. <a class="reference external" href="http://bittorrent.org/beps/bep_0005.html">BEP 5</a>.</li>
86
<li>supports the bittorrent <a class="reference external" href="extension_protocol.html">extension protocol</a>. See <a class="reference external" href="manual.html#extensions">extensions</a>. <a class="reference external" href="http://bittorrent.org/beps/bep_0010.html">BEP 10</a>.</li>
87
<li>supports the uTorrent metadata transfer protocol <a class="reference external" href="http://bittorrent.org/beps/bep_0009.html">BEP 9</a> (i.e. magnet links).</li>
74
88
<li>supports the uTorrent peer exchange protocol (PEX).</li>
75
89
<li>supports local peer discovery (multicasts for peers on the same local network)</li>
76
<li>adjusts the length of the request queue depending on download rate.</li>
77
<li>has an adjustable read and write disk cache for improved disk throughput.</li>
78
<li>multitracker extension support (supports both the <a class="reference external" href="http://bittorrent.org/beps/bep_0012.html">specification by John Hoffman</a>
79
and the uTorrent interpretation).</li>
90
<li>multitracker extension support (supports both strict <a class="reference external" href="http://bittorrent.org/beps/bep_0012.html">BEP 12</a> and the
91
uTorrent interpretation).</li>
80
92
<li>tracker scrapes</li>
81
<li>supports both sparse files and compact file allocation (where pieces
82
are kept consolidated on disk)</li>
93
<li>supports lt_trackers extension, to exchange trackers between peers</li>
94
<li><a class="reference external" href="manual.html#http-seeding">HTTP seeding</a>, as specified in <a class="reference external" href="http://bittorrent.org/beps/bep_0017.html">BEP 17</a> and <a class="reference external" href="http://bittorrent.org/beps/bep_0019.html">BEP 19</a>.</li>
95
<li>supports the udp-tracker protocol. (<a class="reference external" href="http://bittorrent.org/beps/bep_0015.html">BEP 15</a>).</li>
96
<li>supports the <tt class="docutils literal"><span class="pre">no_peer_id=1</span></tt> extension that will ease the load off trackers.</li>
97
<li>supports the <tt class="docutils literal"><span class="pre">compact=1</span></tt> tracker parameter.</li>
98
<li>super seeding/initial seeding (<a class="reference external" href="http://bittorrent.org/beps/bep_0016.html">BEP 16</a>).</li>
99
<li>private torrents (<a class="reference external" href="http://bittorrent.org/beps/bep_0027.html">BEP 27</a>).</li>
100
<li>support for IPv6, including <a class="reference external" href="http://bittorrent.org/beps/bep_0007.html">BEP 7</a> and <a class="reference external" href="http://bittorrent.org/beps/bep_0024.html">BEP 24</a>.</li>
101
<li>support for merkle hash tree torrents. This makes the size of torrent files
102
scale well with the size of the content.</li>
105
<div class="section" id="disk-management">
106
<h2>disk management</h2>
108
<li>uses a separate disk I/O thread to not have the disk ever block on network or
109
client interaction. (see <a class="reference external" href="manual.html#threads">threads</a>).</li>
83
110
<li>supports files > 2 gigabytes.</li>
84
<li>serves multiple torrents on a single port and in a single thread</li>
85
111
<li>fast resume support, a way to get rid of the costly piece check at the
86
112
start of a resumed torrent. Saves the storage state, piece_picker state
87
113
as well as all local peers in a separate fast-resume file.</li>
88
<li><a class="reference external" href="manual.html#http-seeding">HTTP seeding</a>, as <a class="reference external" href="http://www.getright.com/seedtorrent.html">specified by Michael Burford of GetRight</a>.</li>
114
<li>has an adjustable read and write disk cache for improved disk throughput.</li>
115
<li>queues torrents for file check, instead of checking all of them in parallel.</li>
116
<li>does not have any requirements on the piece order in a torrent that it
117
resumes. This means it can resume a torrent downloaded by any client.</li>
118
<li>supports both sparse files and compact file allocation (where pieces
119
are kept consolidated on disk)</li>
120
<li>seed mode, where the files on disk are assumed to be complete, and each
121
piece's hash is verified the first time it is requested.</li>
124
<div class="section" id="network">
127
<li>adjusts the length of the request queue depending on download rate.</li>
128
<li>serves multiple torrents on a single port and in a single thread</li>
89
129
<li>piece picking on block-level (as opposed to piece-level).
90
130
This means it can download parts of the same piece from different peers.
91
131
It will also prefer to download whole pieces from single peers if the
92
132
download speed is high enough from that particular peer.</li>
93
<li>supports the <a class="reference external" href="udp_tracker_protocol.html">udp-tracker protocol</a> by Olaf van der Spek.</li>
94
<li>queues torrents for file check, instead of checking all of them in parallel.</li>
95
133
<li>supports http proxies and basic proxy authentication</li>
96
<li>gzipped tracker-responses</li>
134
<li>supports gzipped tracker-responses</li>
97
135
<li>can limit the upload and download bandwidth usage and the maximum number of
98
136
unchoked peers</li>
99
<li>implements fair trade. User settable trade-ratio, must at least be 1:1,
100
but one can choose to trade 1 for 2 or any other ratio that isn't unfair
101
to the other party.</li>
102
<li>supports the <tt class="docutils literal">no_peer_id=1</tt> extension that will ease the load off trackers.</li>
103
137
<li>possibility to limit the number of connections.</li>
104
138
<li>delays have messages if there's no other outgoing traffic to the peer, and
105
139
doesn't send have messages to peers that already has the piece. This saves
107
<li>does not have any requirements on the piece order in a torrent that it
108
resumes. This means it can resume a torrent downloaded by any client.</li>
109
<li>supports the <tt class="docutils literal">compact=1</tt> tracker parameter.</li>
110
141
<li>selective downloading. The ability to select which parts of a torrent you
111
142
want to download.</li>
112
143
<li>ip filter to disallow ip addresses and ip ranges from connecting and
144
being connected.</li>
145
<li>NAT-PMP and UPnP support (automatic port mapping on routers that supports it)</li>
146
<li>implements automatic upload slots, to optimize download rate without spreading
147
upload capacity too thin. The number of upload slots is adjusted based on the
148
peers' download capacity to work even for connections that are orders of
149
magnitude faster than others.</li>
153
<div class="section" id="highlighted-features">
154
<h1>highlighted features</h1>
155
<div class="section" id="disk-caching">
156
<h2>disk caching</h2>
157
<p>All disk I/O in libtorrent is done asynchronously to the network thread, by the
158
disk io thread. When a block is read, the disk io thread reads all subsequent
159
blocks from that piece into the read cache, assuming that the peer requesting
160
the block will also request more blocks from the same piece. This decreases the
161
number of syscalls for reading data. It also decreases delay from seeking.</p>
162
<p>Similarly, for write requests, blocks are cached and flushed to disk once one full
163
piece is complete or the piece is the least recently updated one when more cache
164
space is needed. The cache dynamically allocates space between the write and read
165
cache. The write cache is strictly prioritized over the read cache.</p>
166
<p>The cache blocks that are in used, are locked into physical memory to avoid it
167
being paged out to disk. Allowing the disk cache to be paged out to disk means
168
that it would become extremely inefficient to flush it, since it would have to be
169
read back into physical memory only to be flushed back out to disk again.</p>
170
<p>In order to conserve memory, and system calls, iovec file operations are
171
used to flush multiple cache blocks in a single call.</p>
172
<p>On low-memory systems, the disk cache can be disabled altogether or set to smaller
173
limit, to save memory.</p>
174
<p>The disk caching algorithm is configurable between 'LRU' and 'largest contiguous'.
175
The largest contiguous algorithm is the default and flushes the largest contiguous
176
block of buffers, instead of flushing all blocks belonging to the piece which was
177
written to least recently.</p>
178
<p>For version 0.15 a lot of work went into optimizing the cache algorithm, trying
179
to increase the cache hit rate and utilization. The graph to the left shows the
180
memory utilization in 0.14. This cache is a straight forward, fairly naive, implementation.
181
Every block read will also read all subsequent blocks in that piece into the cache.
182
Whenever we need more space, the entire oldest piece is evicted from the cache. Caching
183
writes always takes presedence over the read cache. Whenever a piece is fully downloaded,
184
it is flushed to disk.</p>
185
<img alt="disk_buffer_before_optimization.png" src="disk_buffer_before_optimization.png" style="width: 49%;" />
186
<img alt="disk_buffer.png" src="disk_buffer.png" style="width: 49%;" />
187
<p>The left graph shows the problem of evicting entire pieces at a time, and waiting until
188
an entire piece is downloaded until flushing it. These graphs were generated for a torrent
189
with fairly large pieces. This means that granularity was poor in 0.14, since it only
190
dealt with entire pieces. In 0.15, the granularity problem has been fixed by evicting one
191
block at a time from the read cache. This maximizes the read cache utilization. The write
192
cache is also flushed when a sufficient number of contiguous blocks have been downloaded
193
for a piece, which is not tied to the piece size anymore. This way the cache scales a lot
194
better with piece sizes.</p>
195
<p>The graph to the right shows the same download but with the new optimized disk cache
196
algorithm. It clearly shows an increased utilization, which means higher read hit rates
197
or smaller caches with maintained hit rate.</p>
199
<div class="section" id="high-performance-disk-subsystem">
200
<h2>high performance disk subsystem</h2>
201
<p>In some circumstances, the disk cache may not suffice to provide maximum performance.
202
One such example is high performance seeding, to a large number of peers, over a fast
203
up-link. In such a case, the amount of RAM may simply not be enough to cache disk
204
reads. When there's not enough RAM to cache disk reads, the disk throughput would
205
typically degrade to perform as poorly as with no cache at all, with the majority
206
of the time spent waiting for the disk head to seek.</p>
207
<p>To solve this problem, libtorrent sorts read requests by their physical offset on the
208
disk. They are processed by having the disk read head sweep back and forth over the drive.</p>
209
<p>This makes libtorrent very suitable for large scale, high-throughput seeding.</p>
210
<img alt="disk_access_no_elevator.png" src="disk_access_no_elevator.png" style="width: 49%;" />
211
<img alt="disk_access_elevator.png" src="disk_access_elevator.png" style="width: 49%;" />
212
<p>These plots illustrates the physical disk offset for reads over time. The left plot
213
is of a run where disk operation re-ordering is turned off and the righ is when it's
214
turned on. The right one has a relatively smooth sine wave shape whereas the left
215
one is more random and involves much longer seeks back and forth over the disk.</p>
216
<p>True physical disk offset queries are only supported on newer linux kernels, Mac OS X and
217
Windows 2000 and up.</p>
219
<div class="section" id="network-buffers">
220
<h2>network buffers</h2>
221
<p>On CPUs with small L2 caches, copying memory can be expensive operations. It is important
222
to keep copying to a minimum on such machines. This mostly applies to embedded systems.</p>
223
<p>In order to minimize the number of times received data is copied, the receive buffer
224
for payload data is received directly into a page aligned disk buffer. If the connection
225
is encrypted, the buffer is decrypted in-place. The buffer is then moved into the disk
226
cache without being copied. Once all the blocks for a piece have been received, or the
227
cache needs to be flushed, all the blocks are passed directly to <tt class="docutils literal"><span class="pre">writev()</span></tt> to flush
228
them in a single syscall. This means a single copy into user space memory, and a single
229
copy back into kernel memory, as illustrated by this figure:</p>
230
<img alt="write_disk_buffers.png" src="write_disk_buffers.png" style="width: 100%;" />
231
<p>When seeding and uploading in general, unnecessary copying is avoided by caching blocks
232
in aligned buffers, that are copied once into the peer's send buffer. The peer's send buffer
233
is not guaranteed to be aligned, even though it is most of the time. The send buffer is
234
then encrypted with the peer specific key and chained onto the <tt class="docutils literal"><span class="pre">iovec</span></tt> for sending.
235
This means there is one user space copy in order to allow unaligned peer requests and
236
peer-specific encryption. This is illustrated by the following figure:</p>
237
<img alt="read_disk_buffers.png" src="read_disk_buffers.png" style="width: 100%;" />
239
<div class="section" id="piece-picker">
240
<h2>piece picker</h2>
241
<p>The piece picker is a central component in a bittorrent implementation. The piece picker
242
in libtorrent is optimized for quickly finding the rarest pieces. It keeps a list of all
243
available pieces sorted by rarity, and pieces with the same rarity, shuffled. The rarest
244
first mode is the dominant piece picker mode. Other modes are supported as well, and
245
used by peers in specific situations.</p>
246
<p>The piece picker allows to combine the availability of a piece with a priority. Together
247
they determine the sort order of the piece list. Pieces with priority 0 will never be
248
picked, which is used for the selective download feature.</p>
249
<p>In order to have as few partially finished pieces as possible, peers have an affinity
250
towards picking blocks from the same pieces as other peers in the same speed category.
251
The speed category is a coarse categorization of peers based on their download rate. This
252
makes slow peers pick blocks from the same piece, and fast peers pick from the same piece,
253
and hence decreasing the likelihood of slow peers blocking the completion of pieces.</p>
254
<p>The piece picker can also be set to download pieces in sequential order.</p>
256
<div class="section" id="merkle-hash-tree-torrents">
257
<h2>merkle hash tree torrents</h2>
258
<p>Merkle hash tree torrents is an extension that lets a torrent file only contain the
259
root hash of the hash tree forming the piece hashes. The main benefit of this feature
260
is that regardless of how many pieces there is in a torrent, the .torrent file will
261
always be the same size. It will only grow with the number of files (since it still
262
has to contain the file names).</p>
263
<p>With regular torrents, clients have to request multiple blocks for pieces, typically
264
from different peers, before the data can be verified against the piece hash. The
265
larger the pieces are, the longer it will take to download a complete piece and verify
266
it. Before the piece is verified, it cannot be shared with the swarm, which means the
267
larger piece sizes, the slower turnaround data has when it is downloaded by peers.
268
Since on average the data has to sit around, waiting, in client buffers before it has
269
been verified and can be uploaded again.</p>
270
<p>Another problem with large piece sizes is that it is harder for a client to pinpoint
271
the malicious or buggy peer when a piece fails, and it will take longer to re-download
272
it and take more tries before the piece succeeds the larger the pieces are.</p>
273
<p>The piece size in regular torrents is a tradeoff between the size of the .torrent file
274
itself and the piece size. Often, for files that are 4 GB, the piece size is 2 or 4 MB,
275
just to avoid making the .torrent file too big.</p>
276
<p>Merkle torrents solves these problems by removing the tradeoff between .torrent size and
277
piece size. With merkle torrents, the piece size can be the minimum block size (16 kB),
278
which lets peers verify every block of data received from peers, immediately. This
279
gives a minimum turnaround time and completely removes the problem of identifying malicious
281
<img alt="merkle_tree.png" src="merkle_tree.png" />
282
<p>The root hash is built by hashing all the piece hashes pair-wise, until they all collapse
283
down to the root.</p>
284
<img align="right" alt="storage.png" class="align-right" src="storage.png" />
286
<div class="section" id="customizable-file-storage">
287
<h2>customizable file storage</h2>
288
<p>libtorrent's storage implementation is customizable. That means a special purpose bittorrent
289
client can replace the default way to store files on disk.</p>
290
<p>When implementing a bittorrent cache, it doesn't matter how the data is stored on disk, as
291
long as it can be retrieved and seeded. In that case a new storage class can be implemented
292
(inheriting from the <tt class="docutils literal"><span class="pre">storage_interface</span></tt> class) that avoids the unnecessary step of mapping
293
slots to files and offsets. The storage can ignore the file boundaries and just store the
294
entire torrent in a single file (which will end up being all the files concatenated). The main
295
advantage of this, other than a slight cpu performance gain, is that all file operations would
296
be page (and sector) aligned. This enables efficient unbuffered I/O, and can potentially
297
lead to more efficient read caching (using the built in disk cache rather than relying on the
298
operating system's disk cache).</p>
299
<p>The storage interface supports operating systems where you can ask for sparse regions
300
(such as Windows and Solaris). The advantage of this is that when checking files, the regions
301
that are known to be sparse can be skipped, which can reduce the time to check a torrent
304
<div class="section" id="easy-to-use-api">
305
<h2>easy to use API</h2>
306
<p>One of the design goals of the libtorrent API is to make common operations simple, but still
307
have it possible to do complicated and advanced operations. This is best illustrated by example
308
code to implement a simple bittorrent client:</p>
309
<pre class="literal-block">
310
#include <iostream>
311
#include "libtorrent/session.hpp"
313
// usage a.out [torrent-file]
314
int main(int argc, char* argv[]) try
316
using namespace libtorrent;
319
s.listen_on(std::make_pair(6881, 6889));
320
add_torrent_params p;
321
p.save_path = "./";
322
p.ti = new torrent_info(argv[1]);
325
// wait for the user to end
327
std::cin.unsetf(std::ios_base::skipws);
331
catch (std::exception& e)
333
std::cerr << ec.what() << std::endl;
337
<p>This client doesn't give the user any status information or progress about the torrent, but
338
it is fully functional.</p>
339
<p>libtorrent also comes with python bindings for easy access for python developers.</p>
116
342
<div class="section" id="portability">
117
343
<h1>portability</h1>
118
<p>libtorrent is portable at least among Windows, MacOS X and other UNIX-systems.
344
<p>libtorrent runs on most major operating systems, including Windows,
345
MacOS X, Linux, BSD and Solaris.
119
346
It uses Boost.Thread, Boost.Filesystem, Boost.Date_time and various other
120
347
boost libraries as well as <a class="reference external" href="http://www.zlib.org">zlib</a> (shipped) and <a class="reference external" href="http://asio.sf.net">asio</a> (shipped). At least version
121
1.33.1 of boost is required.</p>
122
<p>Since libtorrent uses asio, it will take full advantage of high performance
348
1.34.1 of boost is required.</p>
349
<p>libtorrent uses asio, hence it will take full advantage of high performance
123
350
network APIs on the most popular platforms. I/O completion ports on windows,
124
351
epoll on linux and kqueue on MacOS X and BSD.</p>
125
352
<p>libtorrent has been successfully compiled and tested on:</p>
126
353
<ul class="simple">
127
<li>Windows 2000 vc7.1, vc8</li>
128
<li>Linux x86 GCC 3.3, GCC 3.4.2</li>
354
<li>Windows 2000, XP and Vista vc7.1, vc8</li>
355
<li>Linux x86 GCC 3.3, GCC 3.4.2, 4.x</li>
356
<li>Linux PPC GCC 4.1.1</li>
129
357
<li>MacOS X (darwin), (Apple's) GCC 3.3, (Apple's) GCC 4.0</li>
130
<li>SunOS 5.8 GCC 3.1</li>
358
<li>SunOS 5.8 GCC 3.1 and Sunpro</li>
131
359
<li>Cygwin GCC 3.3.3</li>