14
pt-index-usage - Read queries from a log and analyze how they use indexes.
22
Usage: pt-index-usage [OPTION...] [FILE...]
24
pt-index-usage reads queries from logs and analyzes how they use indexes.
26
Analyze queries in slow.log and print reports:
31
pt-index-usage /path/to/slow.log --host localhost
34
Disable reports and save results to mk database for later analysis:
39
pt-index-usage slow.log --no-report --save-results-database mk
48
The following section is included to inform users about the potential risks,
49
whether known or unknown, of using this tool. The two main categories of risks
50
are those created by the nature of the tool (e.g. read-only tools vs. read-write
51
tools) and those created by bugs.
53
This tool is read-only unless you use "--save-results-database". It reads a
54
log of queries and EXPLAIN them. It also gathers information about all tables
55
in all databases. It should be very low-risk.
57
At the time of this release, we know of no bugs that could cause serious harm to
60
The authoritative source for updated information is always the online issue
61
tracking system. Issues that affect this tool will be marked as such. You can
62
see a list of such issues at the following URL:
63
`http://www.percona.com/bugs/pt-index-usage <http://www.percona.com/bugs/pt-index-usage>`_.
65
See also "BUGS" for more information on filing bugs and getting help.
73
This tool connects to a MySQL database server, reads through a query log, and
74
uses EXPLAIN to ask MySQL how it will use each query. When it is finished, it
75
prints out a report on indexes that the queries didn't use.
77
The query log needs to be in MySQL's slow query log format. If you need to
78
input a different format, you can use pt-query-digest to translate the
79
formats. If you don't specify a filename, the tool reads from STDIN.
81
The tool runs two stages. In the first stage, the tool takes inventory of all
82
the tables and indexes in your database, so it can compare the existing indexes
83
to those that were actually used by the queries in the log. In the second
84
stage, it runs EXPLAIN on each query in the query log. It uses separate
85
database connections to inventory the tables and run EXPLAIN, so it opens two
86
connections to the database.
88
If a query is not a SELECT, it tries to transform it to a roughly equivalent
89
SELECT query so it can be EXPLAINed. This is not a perfect process, but it is
90
good enough to be useful.
92
The tool skips the EXPLAIN step for queries that are exact duplicates of those
93
seen before. It assumes that the same query will generate the same EXPLAIN plan
94
as it did previously (usually a safe assumption, and generally good for
95
performance), and simply increments the count of times that the indexes were
96
used. However, queries that have the same fingerprint but different checksums
97
will be re-EXPLAINed. Queries that have different literal constants can have
98
different execution plans, and this is important to measure.
100
After EXPLAIN-ing the query, it is necessary to try to map aliases in the query
101
back to the original table names. For example, consider the EXPLAIN plan for
107
SELECT * FROM tbl1 AS foo;
110
The EXPLAIN output will show access to table \ ``foo``\ , and that must be translated
111
back to \ ``tbl1``\ . This process involves complex parsing. It is generally very
112
accurate, but there is some chance that it might not work right. If you find
113
cases where it fails, submit a bug report and a reproducible test case.
115
Queries that cannot be EXPLAINed will cause all subsequent queries with the
116
same fingerprint to be blacklisted. This is to reduce the work they cause, and
117
prevent them from continuing to print error messages. However, at least in
118
this stage of the tool's development, it is my opinion that it's not a good
119
idea to preemptively silence these, or prevent them from being EXPLAINed at
120
all. I am looking for lots of feedback on how to improve things like the
121
query parsing. So please submit your test cases based on the errors the tool
130
After it reads all the events in the log, the tool prints out DROP statements
131
for every index that was not used. It skips indexes for tables that were never
132
accessed by any queries in the log, to avoid false-positive results.
134
If you don't specify "--quiet", the tool also outputs warnings about
135
statements that cannot be EXPLAINed and similar. These go to standard error.
137
Progress reports are enabled by default (see "--progress"). These also go to
146
This tool accepts additional command-line arguments. Refer to the
147
"SYNOPSIS" and usage information for details.
152
Prompt for a password when connecting to MySQL.
158
short form: -A; type: string
160
Default character set. If the value is utf8, sets Perl's binmode on
161
STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and
162
runs SET NAMES UTF8 after connecting to MySQL. Any other value sets
163
binmode on STDOUT without the utf8 layer, and runs SET NAMES after
172
Read this comma-separated list of config files; if specified, this must be the
173
first option on the command line.
177
--create-save-results-database
179
Create the "--save-results-database" if it does not exist.
181
If the "--save-results-database" already exists and this option is
182
specified, the database is used and the necessary tables are created if
183
they do not already exist.
189
Create views for "--save-results-database" example queries.
191
Several example queries are given for querying the tables in the
192
"--save-results-database". These example queries are, by default, created
193
as views. Specifying \ ``--no-create-views``\ prevents these views from being
200
short form: -D; type: string
202
The database to use for the connection.
208
short form: -d; type: hash
210
Only get tables and indexes from this comma-separated list of databases.
218
Only get tables and indexes from database whose names match this Perl regex.
224
short form: -F; type: string
226
Only read mysql options from the given file. You must give an absolute pathname.
232
type: Hash; default: non-unique
234
Suggest dropping only these types of unused indexes.
236
By default pt-index-usage will only suggest to drop unused secondary indexes,
237
not primary or unique indexes. You can specify which types of unused indexes
238
the tool suggests to drop: primary, unique, non-unique, all.
240
A separate \ ``ALTER TABLE``\ statement for each type is printed. So if you
241
specify \ ``--drop all``\ and there is a primary key and a non-unique index,
242
the \ ``ALTER TABLE ... DROP``\ for each will be printed on separate lines.
246
--empty-save-results-tables
248
Drop and re-create all pre-existing tables in the "--save-results-database".
249
This allows information from previous runs to be removed before the current run.
261
short form: -h; type: string
271
Ignore this comma-separated list of databases.
275
--ignore-databases-regex
279
Ignore databases whose names match this Perl regex.
287
Ignore this comma-separated list of table names.
289
Table names may be qualified with the database name.
293
--ignore-tables-regex
297
Ignore tables whose names match the Perl regex.
303
short form: -p; type: string
305
Password to use when connecting.
311
short form: -P; type: int
313
Port number to use for connection.
319
type: array; default: time,30
321
Print progress reports to STDERR. The value is a comma-separated list with two
322
parts. The first part can be percentage, time, or iterations; the second part
323
specifies how often an update should be printed, in percentage, seconds, or
324
number of iterations.
332
Do not print any warnings. Also disables "--progress".
340
Print the reports for "--report-format".
342
You may want to disable the reports by specifying \ ``--no-report``\ if, for
343
example, you also specify "--save-results-database" and you only want
344
to query the results tables later.
350
type: Array; default: drop_unused_indexes
352
Right now there is only one report: drop_unused_indexes. This report prints
353
SQL statements for dropping any unused indexes. See also "--drop".
355
See also "--[no]report".
359
--save-results-database
363
Save results to tables in this database. Information about indexes, queries,
364
tables and their usage is stored in several tables in the specified database.
365
The tables are auto-created if they do not exist. If the database doesn't
366
exist, it can be auto-created with "--create-save-results-database". In this
367
case the connection is initially created with no default database, then after
368
the database is created, it is USE'ed.
370
pt-index-usage executes INSERT statements to save the results. Therefore, you
371
should be careful if you use this feature on a production server. It might
372
increase load, or cause trouble if you don't want the server to be written to,
375
This is a new feature. It may change in future releases.
377
After a run, you can query the usage tables to answer various questions about
378
index usage. The tables have the following CREATE TABLE definitions:
380
MAGIC_create_indexes:
385
CREATE TABLE IF NOT EXISTS indexes (
386
db VARCHAR(64) NOT NULL,
387
tbl VARCHAR(64) NOT NULL,
388
idx VARCHAR(64) NOT NULL,
389
cnt BIGINT UNSIGNED NOT NULL DEFAULT 0,
390
PRIMARY KEY (db, tbl, idx)
394
MAGIC_create_queries:
399
CREATE TABLE IF NOT EXISTS queries (
400
query_id BIGINT UNSIGNED NOT NULL,
401
fingerprint TEXT NOT NULL,
402
sample TEXT NOT NULL,
403
PRIMARY KEY (query_id)
412
CREATE TABLE IF NOT EXISTS tables (
413
db VARCHAR(64) NOT NULL,
414
tbl VARCHAR(64) NOT NULL,
415
cnt BIGINT UNSIGNED NOT NULL DEFAULT 0,
416
PRIMARY KEY (db, tbl)
420
MAGIC_create_index_usage:
425
CREATE TABLE IF NOT EXISTS index_usage (
426
query_id BIGINT UNSIGNED NOT NULL,
427
db VARCHAR(64) NOT NULL,
428
tbl VARCHAR(64) NOT NULL,
429
idx VARCHAR(64) NOT NULL,
430
cnt BIGINT UNSIGNED NOT NULL DEFAULT 1,
431
UNIQUE INDEX (query_id, db, tbl, idx)
435
MAGIC_create_index_alternatives:
440
CREATE TABLE IF NOT EXISTS index_alternatives (
441
query_id BIGINT UNSIGNED NOT NULL, -- This query used
442
db VARCHAR(64) NOT NULL, -- this index, but...
443
tbl VARCHAR(64) NOT NULL, --
444
idx VARCHAR(64) NOT NULL, --
445
alt_idx VARCHAR(64) NOT NULL, -- was an alternative
446
cnt BIGINT UNSIGNED NOT NULL DEFAULT 1,
447
UNIQUE INDEX (query_id, db, tbl, idx, alt_idx),
448
INDEX (db, tbl, idx),
449
INDEX (db, tbl, alt_idx)
453
The following are some queries you can run against these tables to answer common
454
questions you might have. Each query is also created as a view (with MySQL
455
v5.0 and newer) if \ ``"--[no]create-views"``\ is true (it is by default).
456
The view names are the strings after the \ ``MAGIC_view_``\ prefix.
458
Question: which queries sometimes use different indexes, and what fraction of
459
the time is each index chosen? MAGIC_view_query_uses_several_indexes:
464
SELECT iu.query_id, CONCAT_WS('.', iu.db, iu.tbl, iu.idx) AS idx,
465
variations, iu.cnt, iu.cnt / total_cnt * 100 AS pct
466
FROM index_usage AS iu
468
SELECT query_id, db, tbl, SUM(cnt) AS total_cnt,
469
COUNT(*) AS variations
471
GROUP BY query_id, db, tbl
473
) AS qv USING(query_id, db, tbl);
476
Question: which indexes have lots of alternatives, i.e. are chosen instead of
477
other indexes, and for what queries? MAGIC_view_index_has_alternates:
482
SELECT CONCAT_WS('.', db, tbl, idx) AS idx_chosen,
483
GROUP_CONCAT(DISTINCT alt_idx) AS alternatives,
484
GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
485
FROM index_alternatives
486
GROUP BY db, tbl, idx
490
Question: which indexes are considered as alternates for other indexes, and for
491
what queries? MAGIC_view_index_alternates:
496
SELECT CONCAT_WS('.', db, tbl, alt_idx) AS idx_considered,
497
GROUP_CONCAT(DISTINCT idx) AS alternative_to,
498
GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
499
FROM index_alternatives
500
GROUP BY db, tbl, alt_idx
504
Question: which of those are never chosen by any queries, and are therefore
505
superfluous? MAGIC_view_unused_index_alternates:
510
SELECT CONCAT_WS('.', i.db, i.tbl, i.idx) AS idx,
511
alt.alternative_to, alt.queries, alt.cnt
514
SELECT db, tbl, alt_idx, GROUP_CONCAT(DISTINCT idx) AS alternative_to,
515
GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
516
FROM index_alternatives
517
GROUP BY db, tbl, alt_idx
519
) AS alt ON i.db = alt.db AND i.tbl = alt.tbl
520
AND i.idx = alt.alt_idx
524
Question: given a table, which indexes were used, by how many queries, with how
525
many distinct fingerprints? Were there alternatives? Which indexes were not
526
used? You can edit the following query's SELECT list to also see the query IDs
527
in question. MAGIC_view_index_usage:
532
SELECT i.idx, iu.usage_cnt, iu.usage_total,
533
ia.alt_cnt, ia.alt_total
536
SELECT db, tbl, idx, COUNT(*) AS usage_cnt,
537
SUM(cnt) AS usage_total, GROUP_CONCAT(query_id) AS used_by
539
GROUP BY db, tbl, idx
540
) AS iu ON i.db=iu.db AND i.tbl=iu.tbl AND i.idx = iu.idx
542
SELECT db, tbl, idx, COUNT(*) AS alt_cnt,
543
SUM(cnt) AS alt_total,
544
GROUP_CONCAT(query_id) AS alt_queries
545
FROM index_alternatives
546
GROUP BY db, tbl, idx
547
) AS ia ON i.db=ia.db AND i.tbl=ia.tbl AND i.idx = ia.idx;
550
Question: which indexes on a given table are vital for at least one query (there
551
is no alternative)? MAGIC_view_required_indexes:
556
SELECT i.db, i.tbl, i.idx, no_alt.queries
559
SELECT iu.db, iu.tbl, iu.idx,
560
GROUP_CONCAT(iu.query_id) AS queries
561
FROM index_usage AS iu
562
LEFT OUTER JOIN index_alternatives AS ia
565
GROUP BY iu.db, iu.tbl, iu.idx
566
) AS no_alt ON no_alt.db = i.db AND no_alt.tbl = i.tbl
567
AND no_alt.idx = i.idx
568
ORDER BY i.db, i.tbl, i.idx, no_alt.queries;
575
type: string; default: wait_timeout=10000
577
Set these MySQL variables. Immediately after connecting to MySQL, this
578
string will be appended to SET and executed.
584
short form: -S; type: string
586
Socket file to use for connection.
592
short form: -t; type: hash
594
Only get indexes from this comma-separated list of tables.
602
Only get indexes from tables whose names match this Perl regex.
608
short form: -u; type: string
610
User for login if not current user.
616
Show version and exit.
626
These DSN options are used to create a DSN. Each option is given like
627
\ ``option=value``\ . The options are case-sensitive, so P and p are not the
628
same option. There cannot be whitespace before or after the \ ``=``\ and
629
if the value contains whitespace it must be quoted. DSN options are
630
comma-separated. See the percona-toolkit manpage for full details.
635
dsn: charset; copy: yes
637
Default character set.
643
dsn: database; copy: yes
645
Database to connect to.
651
dsn: mysql_read_default_file; copy: yes
653
Only read default options from the given file
667
dsn: password; copy: yes
669
Password to use when connecting.
677
Port number to use for connection.
683
dsn: mysql_socket; copy: yes
685
Socket file to use for connection.
693
User for login if not current user.
703
The environment variable \ ``PTDEBUG``\ enables verbose debugging output to STDERR.
704
To enable debugging and capture all output to a file, run the tool like:
709
PTDEBUG=1 pt-index-usage ... > FILE 2>&1
712
Be careful: debugging output is voluminous and can generate several megabytes
721
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
722
installed in any reasonably new version of Perl.
730
For a list of known bugs, see `http://www.percona.com/bugs/pt-index-usage <http://www.percona.com/bugs/pt-index-usage>`_.
732
Please report bugs at `https://bugs.launchpad.net/percona-toolkit <https://bugs.launchpad.net/percona-toolkit>`_.
733
Include the following information in your bug report:
736
\* Complete command-line used to run the tool
744
\* MySQL version of all servers involved
748
\* Output from the tool including STDERR
752
\* Input files (log/dump/config files, etc.)
756
If possible, include debugging output by running the tool with \ ``PTDEBUG``\ ;
765
Visit `http://www.percona.com/software/percona-toolkit/ <http://www.percona.com/software/percona-toolkit/>`_ to download the
766
latest release of Percona Toolkit. Or, get the latest release from the
772
wget percona.com/get/percona-toolkit.tar.gz
774
wget percona.com/get/percona-toolkit.rpm
776
wget percona.com/get/percona-toolkit.deb
779
You can also get individual tools from the latest release:
784
wget percona.com/get/TOOL
787
Replace \ ``TOOL``\ with the name of any tool.
795
Baron Schwartz and Daniel Nichter
798
*********************
799
ABOUT PERCONA TOOLKIT
800
*********************
803
This tool is part of Percona Toolkit, a collection of advanced command-line
804
tools developed by Percona for MySQL support and consulting. Percona Toolkit
805
was forked from two projects in June, 2011: Maatkit and Aspersa. Those
806
projects were created by Baron Schwartz and developed primarily by him and
807
Daniel Nichter, both of whom are employed by Percona. Visit
808
`http://www.percona.com/software/ <http://www.percona.com/software/>`_ for more software developed by Percona.
811
********************************
812
COPYRIGHT, LICENSE, AND WARRANTY
813
********************************
816
This program is copyright 2010-2011 Baron Schwartz, 2011 Percona Inc.
817
Feedback and improvements are welcome.
819
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
820
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
821
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
823
This program is free software; you can redistribute it and/or modify it under
824
the terms of the GNU General Public License as published by the Free Software
825
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
826
systems, you can issue \`man perlgpl' or \`man perlartistic' to read these
829
You should have received a copy of the GNU General Public License along with
830
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
831
Place, Suite 330, Boston, MA 02111-1307 USA.