pt-tcp-model - Transform tcpdump into metrics that permit performance and scalability modeling.

Usage: pt-tcp-model [OPTION...] [FILE]

pt-tcp-model parses and analyzes tcpdump files. With no FILE, or when
FILE is -, it reads standard input.

Dump TCP requests and responses to a file, capturing only the packet headers to
avoid dropped packets, and ignoring any packets without a payload (such as
ack-only packets). Capture port 3306 (MySQL database traffic). Note that to
avoid line breaking in terminals and man pages, the TCP filtering expression
that follows has a line break at the end of the second line; you should omit
this from your tcpdump command.

  tcpdump -s 384 -i any -nnq -tttt \
        'tcp port 3306 and (((ip[2:2] - ((ip[0]&0xf)<<2))
       - ((tcp[12]&0xf0)>>2)) != 0)' \
    > /path/to/tcp-file.txt
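The arithmetic inside the capture filter can be hard to read at a glance. The following is a minimal Python sketch of what the expression computes: the IP total length minus the IP header length minus the TCP header length, which is the payload size. The function name and the sample byte values are illustrative assumptions, not part of pt-tcp-model or tcpdump.

```python
def payload_length(ip_total_length: int, ip_byte0: int, tcp_byte12: int) -> int:
    """Return the TCP payload size in bytes, mirroring the capture filter.

    ip_total_length corresponds to ip[2:2], ip_byte0 to ip[0],
    and tcp_byte12 to tcp[12] in the filter expression.
    """
    # Low nibble of ip[0] is the IP header length in 32-bit words; <<2 converts to bytes.
    ip_header_bytes = (ip_byte0 & 0x0F) << 2
    # High nibble of tcp[12] is the TCP data offset in 32-bit words;
    # masking and shifting right by 2 is the same as (offset * 4).
    tcp_header_bytes = (tcp_byte12 & 0xF0) >> 2
    return ip_total_length - ip_header_bytes - tcp_header_bytes


# Ack-only packet: 20-byte IP header + 20-byte TCP header, nothing else.
print(payload_length(40, 0x45, 0x50))  # 0, so the filter drops it
# Same headers carrying 37 bytes of payload.
print(payload_length(77, 0x45, 0x50))  # 37, so the filter keeps it
```

A result of zero means the packet carries no payload, which is exactly the case the `!= 0` test in the filter excludes.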
Extract individual response times, sorted by end time:

  pt-tcp-model /path/to/tcp-file.txt > requests.txt

Sort the result by arrival time, for input to the next step:

  sort -n -k1,1 requests.txt > sorted.txt

Slice the result into 10-second intervals and emit throughput, concurrency, and
response time metrics for each interval:

  pt-tcp-model --type=requests --run-time=10 sorted.txt > sliced.txt

Transform the result for modeling with Aspersa's usl tool, discarding the first
and last line of each file if you specify multiple files (the first and last
line are normally incomplete observation periods and are aberrant):

  for f in sliced.txt; do
     tail -n +2 "$f" | head -n -1 | awk '{print $2, $3, $7/$4}'
  done
The following section is included to inform users about the potential risks,
whether known or unknown, of using this tool. The two main categories of risks
are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.

pt-tcp-model merely reads and transforms its input, printing it to the output.
It should be very low risk.

At the time of this release, we know of no bugs that could cause serious harm.

The authoritative source for updated information is always the online issue
tracking system. Issues that affect this tool will be marked as such. You can
see a list of such issues at the following URL:
`http://www.percona.com/bugs/pt-tcp-model <http://www.percona.com/bugs/pt-tcp-model>`_.

See also "BUGS" for more information on filing bugs and getting help.
This tool recognizes requests and responses in a TCP stream, and extracts the
"conversations". You can use it to capture the response times of individual
queries to a database, for example. It expects the TCP input to be in the
following format, which should result from the sample shown in the SYNOPSIS:

  <date> <time.microseconds> IP <IP.port> > <IP.port>: <junk>
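To make the expected input format concrete, here is a small Python sketch that splits one such line into its fields. It is only an illustration of the format, not the tool's actual parser (pt-tcp-model is written in Perl); the `Packet` type, `LINE_RE`, and `parse_line` are hypothetical names introduced for this example.

```python
import re
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    timestamp: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# <date> <time.microseconds> IP <IP.port> > <IP.port>: <junk>
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(\S+)\.(\d+) > (\S+)\.(\d+): "
)


def parse_line(line: str) -> Optional[Packet]:
    """Return the parsed fields, or None if the line is not in the expected format."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts, src_ip, src_port, dst_ip, dst_port = m.groups()
    return Packet(ts, src_ip, int(src_port), dst_ip, int(dst_port))


sample = "2009-04-12 09:50:16.804849 IP 127.0.0.1.42167 > 127.0.0.1.3306: tcp 37"
print(parse_line(sample))
```

The port is the last dot-separated component of each endpoint, so the pattern relies on backtracking to split `127.0.0.1.3306` into address and port.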
The tool watches for "incoming" packets to the port you specify with the
"--watch-server" option. This begins a request. If multiple inbound packets
follow each other, then by default the last inbound packet seen determines the
time at which the request is assumed to begin. This is logical if one assumes
that a server must receive the whole SQL statement before beginning execution.

When the first outbound packet is seen, the server is considered to have
responded to the request. The tool might see an inbound packet, but never see a
response. This can happen when the kernel drops packets, for example. As a
result, the tool never prints a request unless it sees the response to it.
However, the tool actually does not print any request until it sees the "last"
outbound packet. It determines this by waiting for either another inbound
packet, or EOF, and then considers the previous inbound/outbound pair to be
complete. As a result, the tool prints requests in a relatively random order.
Most types of analysis require processing in either arrival or completion order.
Therefore, the second type of processing this tool can do requires that you sort
the output from the first stage and supply it as input.

The second type of processing is selected with the "--type" option set to
"requests". In this mode, the tool reads a group of requests and aggregates
them, then emits the aggregated metrics.
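The pairing rules described above can be sketched as a small state machine. This Python example is an illustration under stated assumptions, not the tool's implementation: packets are represented as hypothetical `(timestamp, src, dst)` tuples with `ip:port` strings, each client endpoint is tracked separately, and the default behaviour (last inbound packet starts the request, first outbound packet ends it) is hard-coded.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (timestamp_seconds, "src_ip:port", "dst_ip:port"); a hypothetical packet shape.
PacketT = Tuple[float, str, str]


def pair_requests(packets: Iterable[PacketT], watch_port: int = 3306) -> List[Tuple[str, float, float]]:
    """Return (client, start, end) tuples in the order they are finalized."""
    pending: Dict[str, List[Optional[float]]] = {}  # client -> [start, end]
    finished: List[Tuple[str, float, float]] = []

    def flush(client: str) -> None:
        start, end = pending.pop(client)
        if end is not None:  # never emit a request that got no response
            finished.append((client, start, end))

    for ts, src, dst in packets:
        if dst.endswith(f":{watch_port}"):  # inbound: client -> server
            if src in pending and pending[src][1] is not None:
                flush(src)  # a new inbound packet completes the previous pair
            pending[src] = [ts, None]  # the last inbound packet sets the start
        elif src.endswith(f":{watch_port}"):  # outbound: server -> client
            if dst in pending and pending[dst][1] is None:
                pending[dst][1] = ts  # the first outbound packet sets the end
    for client in list(pending):  # EOF completes whatever is left
        flush(client)
    return finished


packets = [
    (1.0, "10.0.0.2:5000", "10.0.0.1:3306"),  # client A sends a request
    (1.1, "10.0.0.3:6000", "10.0.0.1:3306"),  # client B sends a request
    (1.2, "10.0.0.1:3306", "10.0.0.3:6000"),  # server answers B
    (1.5, "10.0.0.1:3306", "10.0.0.2:5000"),  # server answers A
    (2.0, "10.0.0.3:6000", "10.0.0.1:3306"),  # B's next request, never answered
]
for row in pair_requests(packets):
    print(row)
```

In this run client B's first request is emitted before client A's, even though A arrived first, and B's unanswered second request is never emitted. That is the out-of-order behaviour that makes the sort step necessary.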
In the default mode (parsing tcpdump output), requests are printed out one per
line, in the following format:

  <id> <start> <end> <elapsed> <IP:port>

The ID is an incrementing number, assigned in arrival order in the original TCP
traffic. The start and end timestamps, and the elapsed time, can be customized
with the "--start-end" option.

In "--type=requests" mode, the tool prints out one line per time interval as
defined by "--run-time", with the following columns: ts, concurrency,
throughput, arrivals, completions, busy_time, weighted_time, sum_time,
variance_mean, quantile_time, obs_time. A detailed explanation follows:
ts
 The timestamp that defines the beginning of the interval.

concurrency
 The average number of requests resident in the server during the interval.

throughput
 The number of arrivals per second during the interval.

arrivals
 The number of arrivals during the interval.

completions
 The number of completions during the interval.

busy_time
 The total amount of time during which at least one request was resident in
 the server during the interval.

weighted_time
 The total response time of all the requests resident in the server during the
 interval, including requests that neither arrived nor completed during the
 interval.

sum_time
 The total response time of all the requests that arrived in the interval.

variance_mean
 The variance-to-mean ratio (index of dispersion) of the response times of the
 requests that arrived in the interval.

quantile_time
 The Nth percentile response time for all the requests that arrived in the
 interval. See also "--quantile".

obs_time
 The length of the observation time window. This will usually be the same as the
 interval length, except for the first and last intervals in a file, which might
 have a shorter observation time.
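As a rough illustration of how these columns relate to one another, the following Python sketch computes most of them for a single interval from a list of `(start, end)` pairs. It rests on several assumptions that the text above does not confirm: weighted_time counts only the portion of each request that overlaps the interval, concurrency is weighted_time divided by the observation time, and the variance is the population variance. The function and variable names are hypothetical, and the quantile column is omitted because the tool's quantile method is not described here.

```python
from typing import Dict, List, Tuple

Request = Tuple[float, float]  # (start, end) in seconds; a hypothetical record shape


def interval_metrics(requests: List[Request], t0: float, t1: float) -> Dict[str, float]:
    """Summarize the requests that touch the interval [t0, t1)."""
    obs_time = t1 - t0
    arrived = [(s, e) for s, e in requests if t0 <= s < t1]
    completions = sum(1 for _, e in requests if t0 <= e < t1)

    # Clip every resident request to the interval (assumption: only the
    # overlapping portion counts toward weighted_time).
    spans = sorted((max(s, t0), min(e, t1)) for s, e in requests if s < t1 and e > t0)
    weighted_time = sum(hi - lo for lo, hi in spans)

    # busy_time is the length of the union of those clipped spans.
    busy_time = 0.0
    cur_lo = cur_hi = None
    for lo, hi in spans:
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                busy_time += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        busy_time += cur_hi - cur_lo

    times = [e - s for s, e in arrived]
    sum_time = sum(times)
    mean = sum_time / len(times) if times else 0.0
    # Assumption: population variance; the tool may use a different estimator.
    variance = sum((t - mean) ** 2 for t in times) / len(times) if times else 0.0
    return {
        "concurrency": weighted_time / obs_time,
        "throughput": len(arrived) / obs_time,
        "arrivals": len(arrived),
        "completions": completions,
        "busy_time": busy_time,
        "weighted_time": weighted_time,
        "sum_time": sum_time,
        "variance_mean": variance / mean if mean else 0.0,
    }


requests = [(-1.0, 0.5), (1.0, 3.0), (2.0, 4.0), (9.0, 12.0)]
metrics = interval_metrics(requests, 0.0, 10.0)
print(metrics["arrivals"], metrics["completions"], metrics["busy_time"], metrics["weighted_time"])
# prints: 3 3 4.5 5.5
```

The first request arrived before the interval and the last one completes after it, which is why arrivals and completions count different requests even though both equal 3 here.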
This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.

--config
 Read this comma-separated list of config files; if specified, this must be the
 first option on the command line.

--progress
 type: array; default: time,30

 Print progress reports to STDERR. The value is a comma-separated list with two
 parts. The first part can be percentage, time, or iterations; the second part
 specifies how often an update should be printed, in percentage, seconds, or
 number of iterations.
--quantile
 The percentile for the last column when "--type" is "requests" (default .99).

--run-time
 The size of the aggregation interval in seconds when "--type" is "requests"
 (default 1). Fractional values are permitted.
--start-end
 type: Array; default: ts,end

 Define how the arrival and completion timestamps of a query, and thus its
 response time (elapsed time), are computed. Recall that there may be multiple
 inbound and outbound packets per request and response, and refer to the
 following ASCII diagram. Suppose that a client sends a series of three inbound
 (I) packets to the server, which computes the result and then sends two outbound
 (O) packets:

   I I    I ..................... O    O
   |<---->|<---response time----->|<-->|

 By default, the query is considered to arrive at time ts, and complete at time
 end. However, this might not be what you want. Perhaps you do not want to
 consider the query to have completed until time end1. You can accomplish this
 by setting this option to \ ``ts,end1``\ .
--type
 The type of input to parse (default tcpdump). The permitted types are
 tcpdump and requests.

 For tcpdump input, the parser expects the input to be formatted with the
 following options: \ ``-x -n -q -tttt``\ . For example, if you want to capture
 output from your local machine, you can do something like the following (the
 port must come last on FreeBSD):

   tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 \
     > mysql.tcp.txt
   pt-query-digest --type tcpdump mysql.tcp.txt

 The other tcpdump parameters, such as -s, -c, and -i, are up to you. Just make
 sure the output looks like this (there is a line break in the first line to
 avoid man-page problems):

   2009-04-12 09:50:16.804849 IP 127.0.0.1.42167
          > 127.0.0.1.3306: tcp 37

 All MySQL servers running on port 3306 are automatically detected in the
 tcpdump output. Therefore, if the tcpdump output contains packets from
 multiple servers on port 3306 (for example, 10.0.0.1:3306, 10.0.0.2:3306,
 etc.), all packets/queries from all these servers will be analyzed
 together as if they were one server.

 If you're analyzing traffic for a protocol that is not running on port
 3306, see "--watch-server".
--version
 Show version and exit.
--watch-server
 type: string; default: 10.10.10.10:3306

 This option tells pt-tcp-model which server IP address and port (such as
 "10.0.0.1:3306") to watch when parsing tcpdump for "--type" tcpdump. If you
 don't specify it, the tool watches all servers by looking for any IP address
 using port 3306. If you're watching a server with a non-standard port, this
 won't work, so you must specify the IP address and port to watch.

 Currently, IP address filtering isn't implemented; so even though you must
 specify the option in IP:port form, it ignores the IP and only looks at the port.
The environment variable \ ``PTDEBUG``\  enables verbose debugging output to STDERR.
To enable debugging and capture all output to a file, run the tool like:

  PTDEBUG=1 pt-tcp-model ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes
of output.
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
For a list of known bugs, see `http://www.percona.com/bugs/pt-tcp-model <http://www.percona.com/bugs/pt-tcp-model>`_.

Please report bugs at `https://bugs.launchpad.net/percona-toolkit <https://bugs.launchpad.net/percona-toolkit>`_.
Include the following information in your bug report:

\* Complete command-line used to run the tool

\* MySQL version of all servers involved

\* Output from the tool including STDERR

\* Input files (log/dump/config files, etc.)

If possible, include debugging output by running the tool with \ ``PTDEBUG``\ .
Visit `http://www.percona.com/software/percona-toolkit/ <http://www.percona.com/software/percona-toolkit/>`_ to download the
latest release of Percona Toolkit. Or, get the latest release from the
command line:

  wget percona.com/get/percona-toolkit.tar.gz

  wget percona.com/get/percona-toolkit.rpm

  wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:

  wget percona.com/get/TOOL

Replace \ ``TOOL``\  with the name of any tool.
*********************
ABOUT PERCONA TOOLKIT
*********************

This tool is part of Percona Toolkit, a collection of advanced command-line
tools developed by Percona for MySQL support and consulting. Percona Toolkit
was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and
Daniel Nichter, both of whom are employed by Percona. Visit
`http://www.percona.com/software/ <http://www.percona.com/software/>`_ for more software developed by Percona.
********************************
COPYRIGHT, LICENSE, AND WARRANTY
********************************

This program is copyright 2011 Baron Schwartz, 2011 Percona Inc.
Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue \`man perlgpl' or \`man perlartistic' to read these
licenses.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.