1
.. program:: pt-visual-explain
4
==============================
5
:program:`pt-visual-explain`
6
==============================
14
:program:`pt-visual-explain` - Format EXPLAIN output as a tree.
26
pt-visual-explain [OPTION...] [FILE...]
28
:program:`pt-visual-explain` transforms EXPLAIN output into a tree representation of
29
the query plan. If FILE is given, input is read from the file(s). With no
30
FILE, or when FILE is -, read standard input.
37
pt-visual-explain <file_containing_explain_output>
39
pt-visual-explain -c <file_containing_query>
41
mysql -e "explain select * from mysql.user" | pt-visual-explain
49
The following section is included to inform users about the potential risks,
50
whether known or unknown, of using this tool. The two main categories of risks
51
are those created by the nature of the tool (e.g. read-only tools vs. read-write
52
tools) and those created by bugs.
54
:program:`pt-visual-explain` is read-only and very low-risk.
56
At the time of this release, we know of no bugs that could cause serious harm to
59
The authoritative source for updated information is always the online issue
60
tracking system. Issues that affect this tool will be marked as such. You can
61
see a list of such issues at the following URL:
62
`http://www.percona.com/bugs/pt-visual-explain <http://www.percona.com/bugs/pt-visual-explain>`_.
64
See also :ref:`bugs` for more information on filing bugs and getting help.
70
:program:`pt-visual-explain` reverse-engineers |MySQL|'s EXPLAIN output into a query
71
execution plan, which it then formats as a left-deep tree -- the same way the
72
plan is represented inside |MySQL|. It is possible to do this by hand, or to read
73
EXPLAIN's output directly, but it requires patience and expertise. Many people
74
find a tree representation more understandable.
76
You can pipe input into :program:`pt-visual-explain` or specify a filename at the
77
command line, including the magical '-' filename, which will read from standard
78
input. It can do two things with the input: parse it for something that looks
79
like EXPLAIN output, or connect to a |MySQL| instance and run EXPLAIN on the
82
When parsing its input, :program:`pt-visual-explain` understands three formats: tabular
83
like that shown in the mysql command-line client, vertical like that created by
84
using the \G line terminator in the mysql command-line client, and tab
85
separated. It ignores any lines it doesn't know how to parse.
87
When executing the input, :program:`pt-visual-explain` replaces everything in the input
88
up to the first SELECT keyword with 'EXPLAIN SELECT,' and then executes the
89
result. You must specify :option:`--connect` to execute the input as a query.
91
Either way, it builds a tree from the result set and prints it to standard
92
output. For the following query,
97
select * from sakila.film_actor join sakila.film using(film_id);
99
:program:`pt-visual-explain` generates this query plan:
108
| | possible_keys idx_fk_film_id
110
| key film_actor->idx_fk_film_id
111
| possible_keys idx_fk_film_id
113
| ref sakila.film.film_id
119
possible_keys PRIMARY
122
The query plan is left-deep, depth-first search, and the tree's root is the
123
output node -- the last step in the execution plan. In other words, read it
127
* ``1`` Table scan the 'film' table, which accesses an estimated 952 rows.
131
* ``2`` For each row, find matching rows by doing an index lookup into the
132
film_actor->idx_fk_film_id index with the value from sakila.film.film_id, then a
133
bookmark lookup into the film_actor table.
137
For more information on how to read EXPLAIN output, please see
138
`http://dev.mysql.com/doc/en/explain.html <http://dev.mysql.com/doc/en/explain.html>`_, and this talk titled "Query
139
Optimizer Internals and What's New in the |MySQL| 5.2 Optimizer," from Timour
140
Katchaounov, one of the |MySQL| developers:
141
`http://maatkit.org/presentations/katchaounov_timour.pdf <http://maatkit.org/presentations/katchaounov_timour.pdf>`_.
148
This program is actually a runnable module, not just an ordinary *Perl* script.
149
In fact, there are two modules embedded in it. This makes unit testing easy,
150
but it also makes it easy for you to use the parsing and tree-building
151
functionality if you want.
153
The ExplainParser package accepts a string and parses whatever it thinks looks
154
like EXPLAIN output from it. The synopsis is as follows:
159
require :program:`pt-visual-explain`";
160
my $p = ExplainParser->new();
161
my $rows = $p->parse("some text");
162
# $rows is an arrayref of hashrefs.
165
The ExplainTree package accepts a set of rows and turns it into a tree. For
166
convenience, you can also have it delegate to ExplainParser and parse text for
167
you. Here's the synopsis:
172
require :program:`pt-visual-explain`";
173
my $e = ExplainTree->new();
174
my $tree = $e->parse("some text", \%options);
175
my $output = $e->pretty_print($tree);
184
This section explains the algorithm that converts EXPLAIN into a tree. You may
185
be interested in reading this if you want to understand EXPLAIN more fully, or
186
trying to figure out how this works, but otherwise this section will probably
187
not make your life richer.
189
The tree can be built by examining the id, select_type, and table columns of
190
each row. Here's what I know about them:
192
The id column is the sequential number of the select. This does not indicate
193
nesting; it just comes from counting SELECT from the left of the SQL statement.
194
It's like capturing parentheses in a regular expression. A UNION RESULT row
195
doesn't have an id, because it isn't a SELECT. The source code actually refers
196
to UNIONs as a fake_lex, as I recall.
198
If two adjacent rows have the same id value, they are joined with the standard
199
single-sweep multi-join method.
201
The select_type column tells a) that a new sub-scope has opened b) what kind
202
of relationship the row has to the previous row c) what kind of operation the
206
* SIMPLE means there are no subqueries or unions in the whole query.
210
* PRIMARY means there are, but this is the outermost SELECT.
214
* [DEPENDENT] UNION means this result is UNIONed with the previous result (not
215
row; a result might encompass more than one row).
219
* UNION RESULT terminates a set of UNIONed results.
223
* [DEPENDENT|UNCACHEABLE] SUBQUERY means a new sub-scope is opening. This is the
224
kind of subquery that happens in a WHERE clause, SELECT list or whatnot; it does
225
not return a so-called "derived table."
229
* DERIVED is a subquery in the FROM clause.
233
Tables that are JOINed all have the same select_type. For example, if you JOIN
234
three tables inside a dependent subquery, they'll all say the same thing:
237
The table column usually specifies the table name or alias, but may also say
238
<derivedN> or <unionN,N...N>. If it says <derivedN>, the row represents an
239
access to the temporary table that holds the result of the subquery whose id is
240
N. If it says <unionN,..N> it's the same thing, but it refers to the results it
243
Finally, order matters. If a row's id is less than the one before it, I think
244
that means it is dependent on something other than the one before it. For
251
(select 1 from sakila.film),
252
(select 2 from sakila.film_actor),
253
(select 3 from sakila.actor);
255
| id | select_type | table |
256
+----+-------------+------------+
257
| 1 | PRIMARY | NULL |
258
| 4 | SUBQUERY | actor |
259
| 3 | SUBQUERY | film_actor |
260
| 2 | SUBQUERY | film |
263
If the results were in order 2-3-4, I think that would mean 3 is a subquery of
264
2, 4 is a subquery of 3. As it is, this means 4 is a subquery of the nearest
265
previous recent row with a smaller id, which is 1. Likewise for 3 and 2.
267
This structure is hard to programatically build into a tree for the same reason
268
it's hard to understand by inspection: there are both forward and backward
269
references. <derivedN> is a forward reference to selectN, while <unionM,N> is a
270
backward reference to selectM and selectN. That makes recursion and other
271
tree-building algorithms hard to get right (NOTE: after implementation, I now
272
see how it would be possible to deal with both forward and backward references,
273
but I have no motivation to change something that works). Consider the
280
select 1 from sakila.actor as actor_1
282
select 1 from sakila.actor as actor_2
286
select 1 from sakila.actor as actor_3
288
select 1 from sakila.actor as actor_4
291
| id | select_type | table |
292
+------+--------------+------------+
293
| 1 | PRIMARY | <derived2> |
294
| 2 | DERIVED | actor_1 |
295
| 3 | UNION | actor_2 |
296
| NULL | UNION RESULT | <union2,3> |
297
| 4 | UNION | <derived5> |
298
| 5 | DERIVED | actor_3 |
299
| 6 | UNION | actor_4 |
300
| NULL | UNION RESULT | <union5,6> |
301
| NULL | UNION RESULT | <union1,4> |
304
This would be a lot easier to work with if it looked like this (I've
305
bracketed the id on rows I moved):
310
| id | select_type | table |
311
+------+--------------+------------+
312
| [1] | UNION RESULT | <union1,4> |
313
| 1 | PRIMARY | <derived2> |
314
| [2] | UNION RESULT | <union2,3> |
315
| 2 | DERIVED | actor_1 |
316
| 3 | UNION | actor_2 |
317
| 4 | UNION | <derived5> |
318
| [5] | UNION RESULT | <union5,6> |
319
| 5 | DERIVED | actor_3 |
320
| 6 | UNION | actor_4 |
323
In fact, why not re-number all the ids, so the PRIMARY row becomes 2, and so on?
324
That would make it even easier to read. Unfortunately that would also have the
325
effect of destroying the meaning of the id column, which I think is important to
326
preserve in the final tree. Also, though it makes it easier to read, it doesn't
327
make it easier to manipulate programmatically; so it's fine to leave them
328
numbered as they are.
330
The goal of re-ordering is to make it easier to figure out which rows are
331
children of which rows in the execution plan. Given the reordered list and some
332
row whose table is <union...> or <derived>, it is easy to find the beginning of
333
the slice of rows that should be child nodes in the tree: you just look for the
334
first row whose ID is the same as the first number in the table.
336
The next question is how to find the last row that should be a child node of a
337
UNION or DERIVED. I'll start with DERIVED, because the solution makes UNION
340
Consider how |MySQL| numbers the SELECTs sequentially according to their position
341
in the SQL, left-to-right. Since a DERIVED table encloses everything within it
342
in a scope, which becomes a temporary table, there are only two things to think
343
about: its child subqueries and unions (if any), and its next siblings in the
344
scope that encloses it. Its children will all have an id greater than it does,
345
by definition, so any later rows with a smaller id terminate the scope.
347
Here's an example. The middle derived table here has a subquery and a UNION to
348
make it a little more complex for the example.
355
select film_id from sakila.film limit 1
358
select film_id, actor_id, (select count(*) from sakila.rental) as r
359
from sakila.film_actor limit 1
361
select 1, 1, 1 from sakila.film_actor as dummy
362
) as der_2 using (film_id)
364
select actor_id from sakila.actor limit 1
365
) as der_3 using (actor_id);
368
Here's the output of EXPLAIN:
373
| id | select_type | table |
374
| 1 | PRIMARY | <derived2> |
375
| 1 | PRIMARY | <derived6> |
376
| 1 | PRIMARY | <derived3> |
377
| 6 | DERIVED | actor |
378
| 3 | DERIVED | film_actor |
379
| 4 | SUBQUERY | rental |
380
| 5 | UNION | dummy |
381
| NULL | UNION RESULT | <union3,5> |
382
| 2 | DERIVED | film |
385
The siblings all have id 1, and the middle one I care about is derived3.
386
(Notice |MySQL| doesn't execute them in the order I defined them, which is fine).
387
Now notice that |MySQL| prints out the rows in the opposite order I defined the
388
subqueries: 6, 3, 2. It always seems to do this, and there might be other
389
methods of finding the scope boundaries including looking for the lower boundary
390
of the next largest sibling, but this is a good enough heuristic. I am forced
391
to rely on it for non-DERIVED subqueries, so I rely on it here too. Therefore,
392
I decide that everything greater than or equal to 3 belongs to the DERIVED
395
The rule for UNION is simple: they consume the entire enclosing scope, and to
396
find the component parts of each one, you find each part's beginning as referred
397
to in the <unionN,...> definition, and its end is either just before the next
398
one, or if it's the last part, the end is the end of the scope.
400
This is only simple because UNION consumes the entire scope, which is either the
401
entire statement, or the scope of a DERIVED table. This is because a UNION
402
cannot be a sibling of another UNION or a table, DERIVED or not. (Try writing
403
such a statement if you don't see it intuitively). Therefore, you can just find
404
the enclosing scope's boundaries, and the rest is easy. Notice in the example
405
above, the UNION is over <union3,5>, which includes the row with id 4 -- it
406
includes every row between 3 and 5.
408
Finally, there are non-derived subqueries to deal with as well. In this case I
409
can't look at siblings to find the end of the scope as I did for DERIVED. I
410
have to trust that |MySQL| executes depth-first. Here's an example:
418
select count(film_id)
419
+ (select count(*) from sakila.film)
420
from sakila.film join sakila.film_actor using(film_id)
422
select * from sakila.actor
423
where sakila.actor.actor_id = sakila.film_actor.actor_id
428
| id | select_type | table |
429
| 1 | PRIMARY | actor |
430
| 2 | SUBQUERY | film |
431
| 2 | SUBQUERY | film_actor |
432
| 4 | DEPENDENT SUBQUERY | actor |
433
| 3 | SUBQUERY | film |
436
In order, the tree should be built like this:
443
* See row 2. It's a higher id than 1, so it's a subquery, along with every other
444
row whose id is greater than 2.
448
* Inside this scope, see 2 and 2 and JOIN them. See 4. It's a higher id than 2,
449
so it's again a subquery; recurse. After that, see 3, which is also higher;
454
But the only reason the nested subquery didn't include select 3 is because
455
select 4 came first. In other words, if EXPLAIN looked like this,
460
| id | select_type | table |
461
| 1 | PRIMARY | actor |
462
| 2 | SUBQUERY | film |
463
| 2 | SUBQUERY | film_actor |
464
| 3 | SUBQUERY | film |
465
| 4 | DEPENDENT SUBQUERY | actor |
468
I would be forced to assume upon seeing select 3 that select 4 is a subquery
469
of it, rather than just being the next sibling in the enclosing scope. If this
470
is ever wrong, then the algorithm is wrong, and I don't see what could be done
473
UNION is a little more complicated than just "the entire scope is a UNION,"
474
because the UNION might itself be inside an enclosing scope that's only
475
indicated by the first item inside the UNION. There are only three kinds of
476
enclosing scopes: UNION, DERIVED, and SUBQUERY. A UNION can't enclose a UNION,
477
and a DERIVED has its own "scope markers," but a SUBQUERY can wholly enclose a
478
UNION, like this strange example on the empty table t1:
483
explain select * from t1 where not exists(
484
(select t11.i from t1 t11) union (select t12.i from t1 t12));
486
| id | select_type | table | Extra |
487
+------+--------------+------------+--------------------------------+
488
| 1 | PRIMARY | t1 | const row not found |
489
| 2 | SUBQUERY | NULL | No tables used |
490
| 3 | SUBQUERY | NULL | no matching row in const table |
491
| 4 | UNION | t12 | const row not found |
492
| NULL | UNION RESULT | <union2,4> | |
495
The UNION's backward references might make it look like the UNION encloses the
496
subquery, but studying the query makes it clear this isn't the case. So when a
497
UNION's first row says SUBQUERY, it is this special case.
499
By the way, I don't fully understand this query plan; there are 4 numbered
500
SELECT in the plan, but only 3 in the query. The parens around the UNIONs are
501
meaningful. Removing them will make the EXPLAIN different. Please tell me how
502
and why this works if you know.
504
Armed with this knowledge, it's possible to use recursion to turn the
505
parent-child relationship between all the rows into a tree representing the
508
|MySQL| prints the rows in execution order, even the forward and backward
509
references. At any given scope, the rows are processed as a left-deep tree.
510
|MySQL| does not do "bushy" execution plans. It begins with a table, finds a
511
matching row in the next table, and continues till the last table, when it emits
512
a row. When it runs out, it backtracks till it can find the next row and
513
repeats. There are subtleties of course, but this is the basic plan. This is
514
why |MySQL| transforms all RIGHT OUTER JOINs into LEFT OUTER JOINs and cannot do
517
This means in any given scope, say
522
| id | select_type | table |
523
| 1 | SIMPLE | tbl1 |
524
| 1 | SIMPLE | tbl2 |
525
| 1 | SIMPLE | tbl3 |
528
The execution plan looks like a depth-first traversal of this tree:
540
The JOIN might not be a JOIN. It might be a subquery, for example. This comes
541
from the type column of EXPLAIN. The documentation says this is a "join type,"
542
but I think "access type" is more accurate, because it's "how |MySQL| accesses
544
:program:`pt-visual-explain` decorates the tree significantly more than just turning
545
rows into nodes. Each node may get a series of transformations that turn it
546
into a subtree of more than one node. For example, an index scan not marked
547
with 'Using index' must do a bookmark lookup into the table rows; that is a
548
three-node subtree. However, after the above node-ordering and scoping stuff,
549
the rest of the process is pretty simple.
556
This tool accepts additional command-line arguments. Refer to the
557
"SYNOPSIS" and usage information for details.
560
.. option:: --ask-pass
562
Prompt for a password when connecting to |MySQL|.
566
.. option:: --charset
568
short form: -A; type: string
570
Default character set. If the value is utf8, sets *Perl* 's binmode on
571
``STDOUT`` to utf8, passes the mysql_enable_utf8 option to ``DBD::mysql``, and
572
runs SET NAMES UTF8 after connecting to |MySQL|. Any other value sets
573
binmode on ``STDOUT`` without the utf8 layer, and runs SET NAMES after
574
connecting to |MySQL|.
578
.. option:: --clustered-pk
580
Assume that PRIMARY KEY index accesses don't need to do a bookmark lookup to
581
retrieve rows. This is the case for |InnoDB|.
589
Read this comma-separated list of config files; if specified, this must be the
590
first option on the command line.
594
.. option:: --connect
596
Treat input as a query, and obtain EXPLAIN output by connecting to a |MySQL|
597
instance and running EXPLAIN on the query. When this option is given,
598
:program:`pt-visual-explain` uses the other connection-specific options such as
599
:option:`--user` to connect to the |MySQL| instance. If you have a .my.cnf file,
600
it will read it, so you may not need to specify any connection-specific
605
.. option:: --database
607
short form: -D; type: string
609
Connect to this database.
613
.. option:: --defaults-file
615
short form: -F; type: string
617
Only read mysql options from the given file. You must give an absolute
624
type: string; default: tree
628
The default is a terse pretty-printed tree. The valid values are:
634
===== ================================================
635
tree Pretty-printed terse tree.
636
dump Data::Dumper output (see Data::Dumper for more).
649
short form: -h; type: string
655
.. option:: --password
657
short form: -p; type: string
659
Password to use when connecting.
667
Create the given PID file. The file contains the process ID of the script.
668
The PID file is removed when the script exits. Before starting, the script
669
checks if the PID file already exists. If it does not, then the script creates
670
and writes its own PID to it. If it does, then the script checks the following:
671
if the file contains a PID and a process is running with that PID, then
672
the script dies; or, if there is no process running with that PID, then the
673
script overwrites the file with its own PID and starts; else, if the file
674
contains no PID, then the script dies.
680
short form: -P; type: int
682
Port number to use for connection.
686
.. option:: --set-vars
688
type: string; default: wait_timeout=10000
690
Set these |MySQL| variables. Immediately after connecting to |MySQL|, this
691
string will be appended to SET and executed.
697
short form: -S; type: string
699
Socket file to use for connection.
705
short form: -u; type: string
707
User for login if not current user.
711
.. option:: --version
713
Show version and exit.
722
These DSN options are used to create a DSN. Each option is given like
723
\ ``option=value``\ . The options are case-sensitive, so P and p are not the
724
same option. There cannot be whitespace before or after the \ ``=``\ and
725
if the value contains whitespace it must be quoted. DSN options are
726
comma-separated. See the percona-toolkit manpage for full details.
731
dsn: charset; copy: yes
733
Default character set.
739
dsn: database; copy: yes
747
dsn: mysql_read_default_file; copy: yes
749
Only read default options from the given file
763
dsn: password; copy: yes
765
Password to use when connecting.
773
Port number to use for connection.
779
dsn: mysql_socket; copy: yes
781
Socket file to use for connection.
789
User for login if not current user.
798
The environment variable \ ``PTDEBUG``\ enables verbose debugging output to ``STDERR``.
799
To enable debugging and capture all output to a file, run the tool like:
804
PTDEBUG=1 :program:`pt-visual-explain` ... > FILE 2>&1
807
Be careful: debugging output is voluminous and can generate several megabytes
815
You need *Perl* , ``DBI``, ``DBD::mysql``, and some core packages that ought to be
816
installed in any reasonably new version of *Perl* .
823
For a list of known bugs, see `http://www.percona.com/bugs/pt-visual-explain <http://www.percona.com/bugs/pt-visual-explain>`_.
825
Please report bugs at `https://bugs.launchpad.net/percona-toolkit <https://bugs.launchpad.net/percona-toolkit>`_.
835
COPYRIGHT, LICENSE, AND WARRANTY
836
================================
839
This program is copyright 2007-2011 *Baron Schwartz*, 2011 Percona Inc.
840
Feedback and improvements are welcome.
846
:program:`pt-visual-explain` 1.0.1