14
pt-visual-explain - Format EXPLAIN output as a tree.
22
Usage: pt-visual-explain [OPTION...] [FILE...]
24
pt-visual-explain transforms EXPLAIN output into a tree representation of
25
the query plan. If FILE is given, input is read from the file(s). With no
26
FILE, or when FILE is -, read standard input.
33
pt-visual-explain <file_containing_explain_output>
35
pt-visual-explain -c <file_containing_query>
37
mysql -e "explain select * from mysql.user" | pt-visual-explain
46
The following section is included to inform users about the potential risks,
47
whether known or unknown, of using this tool. The two main categories of risks
48
are those created by the nature of the tool (e.g. read-only tools vs. read-write
49
tools) and those created by bugs.
51
pt-visual-explain is read-only and very low-risk.
53
At the time of this release, we know of no bugs that could cause serious harm to
56
The authoritative source for updated information is always the online issue
57
tracking system. Issues that affect this tool will be marked as such. You can
58
see a list of such issues at the following URL:
59
`http://www.percona.com/bugs/pt-visual-explain <http://www.percona.com/bugs/pt-visual-explain>`_.
61
See also "BUGS" for more information on filing bugs and getting help.
69
pt-visual-explain reverse-engineers MySQL's EXPLAIN output into a query
70
execution plan, which it then formats as a left-deep tree -- the same way the
71
plan is represented inside MySQL. It is possible to do this by hand, or to read
72
EXPLAIN's output directly, but it requires patience and expertise. Many people
73
find a tree representation more understandable.
75
You can pipe input into pt-visual-explain or specify a filename at the
76
command line, including the magical '-' filename, which will read from standard
77
input. It can do two things with the input: parse it for something that looks
78
like EXPLAIN output, or connect to a MySQL instance and run EXPLAIN on the
81
When parsing its input, pt-visual-explain understands three formats: tabular
82
like that shown in the mysql command-line client, vertical like that created by
83
using the \G line terminator in the mysql command-line client, and tab
84
separated. It ignores any lines it doesn't know how to parse.
86
When executing the input, pt-visual-explain replaces everything in the input
87
up to the first SELECT keyword with 'EXPLAIN SELECT,' and then executes the
88
result. You must specify "--connect" to execute the input as a query.
90
Either way, it builds a tree from the result set and prints it to standard
91
output. For the following query,
96
select * from sakila.film_actor join sakila.film using(film_id);
99
pt-visual-explain generates this query plan:
108
| | possible_keys idx_fk_film_id
110
| key film_actor->idx_fk_film_id
111
| possible_keys idx_fk_film_id
113
| ref sakila.film.film_id
119
possible_keys PRIMARY
122
The query plan is left-deep, depth-first search, and the tree's root is the
123
output node -- the last step in the execution plan. In other words, read it
129
Table scan the 'film' table, which accesses an estimated 952 rows.
135
For each row, find matching rows by doing an index lookup into the
136
film_actor->idx_fk_film_id index with the value from sakila.film.film_id, then a
137
bookmark lookup into the film_actor table.
141
For more information on how to read EXPLAIN output, please see
142
`http://dev.mysql.com/doc/en/explain.html <http://dev.mysql.com/doc/en/explain.html>`_, and this talk titled "Query
143
Optimizer Internals and What's New in the MySQL 5.2 Optimizer," from Timour
144
Katchaounov, one of the MySQL developers:
145
`http://maatkit.org/presentations/katchaounov_timour.pdf <http://maatkit.org/presentations/katchaounov_timour.pdf>`_.
153
This program is actually a runnable module, not just an ordinary Perl script.
154
In fact, there are two modules embedded in it. This makes unit testing easy,
155
but it also makes it easy for you to use the parsing and tree-building
156
functionality if you want.
158
The ExplainParser package accepts a string and parses whatever it thinks looks
159
like EXPLAIN output from it. The synopsis is as follows:
164
require "pt-visual-explain";
165
my $p = ExplainParser->new();
166
my $rows = $p->parse("some text");
167
# $rows is an arrayref of hashrefs.
170
The ExplainTree package accepts a set of rows and turns it into a tree. For
171
convenience, you can also have it delegate to ExplainParser and parse text for
172
you. Here's the synopsis:
177
require "pt-visual-explain";
178
my $e = ExplainTree->new();
179
my $tree = $e->parse("some text", \%options);
180
my $output = $e->pretty_print($tree);
190
This section explains the algorithm that converts EXPLAIN into a tree. You may
191
be interested in reading this if you want to understand EXPLAIN more fully, or
192
trying to figure out how this works, but otherwise this section will probably
193
not make your life richer.
195
The tree can be built by examining the id, select_type, and table columns of
196
each row. Here's what I know about them:
198
The id column is the sequential number of the select. This does not indicate
199
nesting; it just comes from counting SELECT from the left of the SQL statement.
200
It's like capturing parentheses in a regular expression. A UNION RESULT row
201
doesn't have an id, because it isn't a SELECT. The source code actually refers
202
to UNIONs as a fake_lex, as I recall.
204
If two adjacent rows have the same id value, they are joined with the standard
205
single-sweep multi-join method.
207
The select_type column tells a) that a new sub-scope has opened b) what kind
208
of relationship the row has to the previous row c) what kind of operation the
214
SIMPLE means there are no subqueries or unions in the whole query.
220
PRIMARY means there are, but this is the outermost SELECT.
226
[DEPENDENT] UNION means this result is UNIONed with the previous result (not
227
row; a result might encompass more than one row).
233
UNION RESULT terminates a set of UNIONed results.
239
[DEPENDENT|UNCACHEABLE] SUBQUERY means a new sub-scope is opening. This is the
240
kind of subquery that happens in a WHERE clause, SELECT list or whatnot; it does
241
not return a so-called "derived table."
247
DERIVED is a subquery in the FROM clause.
251
Tables that are JOINed all have the same select_type. For example, if you JOIN
252
three tables inside a dependent subquery, they'll all say the same thing:
255
The table column usually specifies the table name or alias, but may also say
256
<derivedN> or <unionN,N...N>. If it says <derivedN>, the row represents an
257
access to the temporary table that holds the result of the subquery whose id is
258
N. If it says <unionN,..N> it's the same thing, but it refers to the results it
261
Finally, order matters. If a row's id is less than the one before it, I think
262
that means it is dependent on something other than the one before it. For
269
(select 1 from sakila.film),
270
(select 2 from sakila.film_actor),
271
(select 3 from sakila.actor);
273
| id | select_type | table |
274
+----+-------------+------------+
275
| 1 | PRIMARY | NULL |
276
| 4 | SUBQUERY | actor |
277
| 3 | SUBQUERY | film_actor |
278
| 2 | SUBQUERY | film |
281
If the results were in order 2-3-4, I think that would mean 3 is a subquery of
282
2, 4 is a subquery of 3. As it is, this means 4 is a subquery of the nearest
283
previous recent row with a smaller id, which is 1. Likewise for 3 and 2.
285
This structure is hard to programatically build into a tree for the same reason
286
it's hard to understand by inspection: there are both forward and backward
287
references. <derivedN> is a forward reference to selectN, while <unionM,N> is a
288
backward reference to selectM and selectN. That makes recursion and other
289
tree-building algorithms hard to get right (NOTE: after implementation, I now
290
see how it would be possible to deal with both forward and backward references,
291
but I have no motivation to change something that works). Consider the
298
select 1 from sakila.actor as actor_1
300
select 1 from sakila.actor as actor_2
304
select 1 from sakila.actor as actor_3
306
select 1 from sakila.actor as actor_4
309
| id | select_type | table |
310
+------+--------------+------------+
311
| 1 | PRIMARY | <derived2> |
312
| 2 | DERIVED | actor_1 |
313
| 3 | UNION | actor_2 |
314
| NULL | UNION RESULT | <union2,3> |
315
| 4 | UNION | <derived5> |
316
| 5 | DERIVED | actor_3 |
317
| 6 | UNION | actor_4 |
318
| NULL | UNION RESULT | <union5,6> |
319
| NULL | UNION RESULT | <union1,4> |
322
This would be a lot easier to work with if it looked like this (I've
323
bracketed the id on rows I moved):
328
| id | select_type | table |
329
+------+--------------+------------+
330
| [1] | UNION RESULT | <union1,4> |
331
| 1 | PRIMARY | <derived2> |
332
| [2] | UNION RESULT | <union2,3> |
333
| 2 | DERIVED | actor_1 |
334
| 3 | UNION | actor_2 |
335
| 4 | UNION | <derived5> |
336
| [5] | UNION RESULT | <union5,6> |
337
| 5 | DERIVED | actor_3 |
338
| 6 | UNION | actor_4 |
341
In fact, why not re-number all the ids, so the PRIMARY row becomes 2, and so on?
342
That would make it even easier to read. Unfortunately that would also have the
343
effect of destroying the meaning of the id column, which I think is important to
344
preserve in the final tree. Also, though it makes it easier to read, it doesn't
345
make it easier to manipulate programmatically; so it's fine to leave them
346
numbered as they are.
348
The goal of re-ordering is to make it easier to figure out which rows are
349
children of which rows in the execution plan. Given the reordered list and some
350
row whose table is <union...> or <derived>, it is easy to find the beginning of
351
the slice of rows that should be child nodes in the tree: you just look for the
352
first row whose ID is the same as the first number in the table.
354
The next question is how to find the last row that should be a child node of a
355
UNION or DERIVED. I'll start with DERIVED, because the solution makes UNION
358
Consider how MySQL numbers the SELECTs sequentially according to their position
359
in the SQL, left-to-right. Since a DERIVED table encloses everything within it
360
in a scope, which becomes a temporary table, there are only two things to think
361
about: its child subqueries and unions (if any), and its next siblings in the
362
scope that encloses it. Its children will all have an id greater than it does,
363
by definition, so any later rows with a smaller id terminate the scope.
365
Here's an example. The middle derived table here has a subquery and a UNION to
366
make it a little more complex for the example.
373
select film_id from sakila.film limit 1
376
select film_id, actor_id, (select count(*) from sakila.rental) as r
377
from sakila.film_actor limit 1
379
select 1, 1, 1 from sakila.film_actor as dummy
380
) as der_2 using (film_id)
382
select actor_id from sakila.actor limit 1
383
) as der_3 using (actor_id);
386
Here's the output of EXPLAIN:
391
| id | select_type | table |
392
| 1 | PRIMARY | <derived2> |
393
| 1 | PRIMARY | <derived6> |
394
| 1 | PRIMARY | <derived3> |
395
| 6 | DERIVED | actor |
396
| 3 | DERIVED | film_actor |
397
| 4 | SUBQUERY | rental |
398
| 5 | UNION | dummy |
399
| NULL | UNION RESULT | <union3,5> |
400
| 2 | DERIVED | film |
403
The siblings all have id 1, and the middle one I care about is derived3.
404
(Notice MySQL doesn't execute them in the order I defined them, which is fine).
405
Now notice that MySQL prints out the rows in the opposite order I defined the
406
subqueries: 6, 3, 2. It always seems to do this, and there might be other
407
methods of finding the scope boundaries including looking for the lower boundary
408
of the next largest sibling, but this is a good enough heuristic. I am forced
409
to rely on it for non-DERIVED subqueries, so I rely on it here too. Therefore,
410
I decide that everything greater than or equal to 3 belongs to the DERIVED
413
The rule for UNION is simple: they consume the entire enclosing scope, and to
414
find the component parts of each one, you find each part's beginning as referred
415
to in the <unionN,...> definition, and its end is either just before the next
416
one, or if it's the last part, the end is the end of the scope.
418
This is only simple because UNION consumes the entire scope, which is either the
419
entire statement, or the scope of a DERIVED table. This is because a UNION
420
cannot be a sibling of another UNION or a table, DERIVED or not. (Try writing
421
such a statement if you don't see it intuitively). Therefore, you can just find
422
the enclosing scope's boundaries, and the rest is easy. Notice in the example
423
above, the UNION is over <union3,5>, which includes the row with id 4 -- it
424
includes every row between 3 and 5.
426
Finally, there are non-derived subqueries to deal with as well. In this case I
427
can't look at siblings to find the end of the scope as I did for DERIVED. I
428
have to trust that MySQL executes depth-first. Here's an example:
436
select count(film_id)
437
+ (select count(*) from sakila.film)
438
from sakila.film join sakila.film_actor using(film_id)
440
select * from sakila.actor
441
where sakila.actor.actor_id = sakila.film_actor.actor_id
446
| id | select_type | table |
447
| 1 | PRIMARY | actor |
448
| 2 | SUBQUERY | film |
449
| 2 | SUBQUERY | film_actor |
450
| 4 | DEPENDENT SUBQUERY | actor |
451
| 3 | SUBQUERY | film |
454
In order, the tree should be built like this:
465
See row 2. It's a higher id than 1, so it's a subquery, along with every other
466
row whose id is greater than 2.
472
Inside this scope, see 2 and 2 and JOIN them. See 4. It's a higher id than 2,
473
so it's again a subquery; recurse. After that, see 3, which is also higher;
478
But the only reason the nested subquery didn't include select 3 is because
479
select 4 came first. In other words, if EXPLAIN looked like this,
484
| id | select_type | table |
485
| 1 | PRIMARY | actor |
486
| 2 | SUBQUERY | film |
487
| 2 | SUBQUERY | film_actor |
488
| 3 | SUBQUERY | film |
489
| 4 | DEPENDENT SUBQUERY | actor |
492
I would be forced to assume upon seeing select 3 that select 4 is a subquery
493
of it, rather than just being the next sibling in the enclosing scope. If this
494
is ever wrong, then the algorithm is wrong, and I don't see what could be done
497
UNION is a little more complicated than just "the entire scope is a UNION,"
498
because the UNION might itself be inside an enclosing scope that's only
499
indicated by the first item inside the UNION. There are only three kinds of
500
enclosing scopes: UNION, DERIVED, and SUBQUERY. A UNION can't enclose a UNION,
501
and a DERIVED has its own "scope markers," but a SUBQUERY can wholly enclose a
502
UNION, like this strange example on the empty table t1:
507
explain select * from t1 where not exists(
508
(select t11.i from t1 t11) union (select t12.i from t1 t12));
510
| id | select_type | table | Extra |
511
+------+--------------+------------+--------------------------------+
512
| 1 | PRIMARY | t1 | const row not found |
513
| 2 | SUBQUERY | NULL | No tables used |
514
| 3 | SUBQUERY | NULL | no matching row in const table |
515
| 4 | UNION | t12 | const row not found |
516
| NULL | UNION RESULT | <union2,4> | |
519
The UNION's backward references might make it look like the UNION encloses the
520
subquery, but studying the query makes it clear this isn't the case. So when a
521
UNION's first row says SUBQUERY, it is this special case.
523
By the way, I don't fully understand this query plan; there are 4 numbered
524
SELECT in the plan, but only 3 in the query. The parens around the UNIONs are
525
meaningful. Removing them will make the EXPLAIN different. Please tell me how
526
and why this works if you know.
528
Armed with this knowledge, it's possible to use recursion to turn the
529
parent-child relationship between all the rows into a tree representing the
532
MySQL prints the rows in execution order, even the forward and backward
533
references. At any given scope, the rows are processed as a left-deep tree.
534
MySQL does not do "bushy" execution plans. It begins with a table, finds a
535
matching row in the next table, and continues till the last table, when it emits
536
a row. When it runs out, it backtracks till it can find the next row and
537
repeats. There are subtleties of course, but this is the basic plan. This is
538
why MySQL transforms all RIGHT OUTER JOINs into LEFT OUTER JOINs and cannot do
541
This means in any given scope, say
546
| id | select_type | table |
547
| 1 | SIMPLE | tbl1 |
548
| 1 | SIMPLE | tbl2 |
549
| 1 | SIMPLE | tbl3 |
552
The execution plan looks like a depth-first traversal of this tree:
564
The JOIN might not be a JOIN. It might be a subquery, for example. This comes
565
from the type column of EXPLAIN. The documentation says this is a "join type,"
566
but I think "access type" is more accurate, because it's "how MySQL accesses
569
pt-visual-explain decorates the tree significantly more than just turning
570
rows into nodes. Each node may get a series of transformations that turn it
571
into a subtree of more than one node. For example, an index scan not marked
572
with 'Using index' must do a bookmark lookup into the table rows; that is a
573
three-node subtree. However, after the above node-ordering and scoping stuff,
574
the rest of the process is pretty simple.
582
This tool accepts additional command-line arguments. Refer to the
583
"SYNOPSIS" and usage information for details.
588
Prompt for a password when connecting to MySQL.
594
short form: -A; type: string
596
Default character set. If the value is utf8, sets Perl's binmode on
597
STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and
598
runs SET NAMES UTF8 after connecting to MySQL. Any other value sets
599
binmode on STDOUT without the utf8 layer, and runs SET NAMES after
606
Assume that PRIMARY KEY index accesses don't need to do a bookmark lookup to
607
retrieve rows. This is the case for InnoDB.
615
Read this comma-separated list of config files; if specified, this must be the
616
first option on the command line.
622
Treat input as a query, and obtain EXPLAIN output by connecting to a MySQL
623
instance and running EXPLAIN on the query. When this option is given,
624
pt-visual-explain uses the other connection-specific options such as
625
"--user" to connect to the MySQL instance. If you have a .my.cnf file,
626
it will read it, so you may not need to specify any connection-specific
633
short form: -D; type: string
635
Connect to this database.
641
short form: -F; type: string
643
Only read mysql options from the given file. You must give an absolute
650
type: string; default: tree
654
The default is a terse pretty-printed tree. The valid values are:
661
tree Pretty-printed terse tree.
662
dump Data::Dumper output (see L<Data::Dumper> for more).
675
short form: -h; type: string
683
short form: -p; type: string
685
Password to use when connecting.
693
Create the given PID file. The file contains the process ID of the script.
694
The PID file is removed when the script exits. Before starting, the script
695
checks if the PID file already exists. If it does not, then the script creates
696
and writes its own PID to it. If it does, then the script checks the following:
697
if the file contains a PID and a process is running with that PID, then
698
the script dies; or, if there is no process running with that PID, then the
699
script overwrites the file with its own PID and starts; else, if the file
700
contains no PID, then the script dies.
706
short form: -P; type: int
708
Port number to use for connection.
714
type: string; default: wait_timeout=10000
716
Set these MySQL variables. Immediately after connecting to MySQL, this
717
string will be appended to SET and executed.
723
short form: -S; type: string
725
Socket file to use for connection.
731
short form: -u; type: string
733
User for login if not current user.
739
Show version and exit.
749
These DSN options are used to create a DSN. Each option is given like
750
\ ``option=value``\ . The options are case-sensitive, so P and p are not the
751
same option. There cannot be whitespace before or after the \ ``=``\ and
752
if the value contains whitespace it must be quoted. DSN options are
753
comma-separated. See the percona-toolkit manpage for full details.
758
dsn: charset; copy: yes
760
Default character set.
766
dsn: database; copy: yes
774
dsn: mysql_read_default_file; copy: yes
776
Only read default options from the given file
790
dsn: password; copy: yes
792
Password to use when connecting.
800
Port number to use for connection.
806
dsn: mysql_socket; copy: yes
808
Socket file to use for connection.
816
User for login if not current user.
826
Visit `http://www.percona.com/software/ <http://www.percona.com/software/>`_ to download the latest release of
827
Percona Toolkit. Or, to get the latest release from the command line:
832
wget percona.com/latest/percona-toolkit/PKG
835
Replace \ ``PKG``\ with \ ``tar``\ , \ ``rpm``\ , or \ ``deb``\ to download the package in that
836
format. You can also get individual tools from the latest release:
841
wget percona.com/latest/percona-toolkit/TOOL
844
Replace \ ``TOOL``\ with the name of any tool.
852
The environment variable \ ``PTDEBUG``\ enables verbose debugging output to STDERR.
853
To enable debugging and capture all output to a file, run the tool like:
858
PTDEBUG=1 pt-visual-explain ... > FILE 2>&1
861
Be careful: debugging output is voluminous and can generate several megabytes
870
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
871
installed in any reasonably new version of Perl.
879
For a list of known bugs, see `http://www.percona.com/bugs/pt-visual-explain <http://www.percona.com/bugs/pt-visual-explain>`_.
881
Please report bugs at `https://bugs.launchpad.net/percona-toolkit <https://bugs.launchpad.net/percona-toolkit>`_.
882
Include the following information in your bug report:
885
\* Complete command-line used to run the tool
893
\* MySQL version of all servers involved
897
\* Output from the tool including STDERR
901
\* Input files (log/dump/config files, etc.)
905
If possible, include debugging output by running the tool with \ ``PTDEBUG``\ ;
917
*********************
918
ABOUT PERCONA TOOLKIT
919
*********************
922
This tool is part of Percona Toolkit, a collection of advanced command-line
923
tools developed by Percona for MySQL support and consulting. Percona Toolkit
924
was forked from two projects in June, 2011: Maatkit and Aspersa. Those
925
projects were created by Baron Schwartz and developed primarily by him and
926
Daniel Nichter, both of whom are employed by Percona. Visit
927
`http://www.percona.com/software/ <http://www.percona.com/software/>`_ for more software developed by Percona.
930
********************************
931
COPYRIGHT, LICENSE, AND WARRANTY
932
********************************
935
This program is copyright 2007-2011 Baron Schwartz, 2011 Percona Inc.
936
Feedback and improvements are welcome.
938
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
939
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
940
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
942
This program is free software; you can redistribute it and/or modify it under
943
the terms of the GNU General Public License as published by the Free Software
944
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
945
systems, you can issue \`man perlgpl' or \`man perlartistic' to read these
948
You should have received a copy of the GNU General Public License along with
949
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
950
Place, Suite 330, Boston, MA 02111-1307 USA.
958
Percona Toolkit v1.0.0 released 2011-08-01