~percona-toolkit-dev/percona-toolkit/1.0


Viewing changes to docs/user/pt-index-usage.rst

Committer: Daniel Nichter
Date: 2011-12-29 22:16:13 UTC
Revision ID: daniel@percona.com-20111229221613-oprw0q2td34r6zkb

Update release date.

files removed:
docs/user/authors.rst

docs/user/bugs.rst

docs/user/configuration_files.rst

docs/user/copyright_license_and_warranty.rst

docs/user/environment.rst

docs/user/index.rst

docs/user/pt-archiver.rst

docs/user/pt-collect.rst

docs/user/pt-config-diff.rst

docs/user/pt-deadlock-logger.rst

docs/user/pt-diskstats.rst

docs/user/pt-duplicate-key-checker.rst

docs/user/pt-fifo-split.rst

docs/user/pt-find.rst

docs/user/pt-fk-error-logger.rst

docs/user/pt-heartbeat.rst

docs/user/pt-index-usage.rst

docs/user/pt-kill.rst

docs/user/pt-log-player.rst

docs/user/pt-mext.rst

docs/user/pt-mysql-summary.rst

docs/user/pt-online-schema-change.rst

docs/user/pt-pmp.rst

docs/user/pt-query-advisor.rst

docs/user/pt-query-digest.rst

docs/user/pt-show-grants.rst

docs/user/pt-sift.rst

docs/user/pt-slave-delay.rst

docs/user/pt-slave-find.rst

docs/user/pt-slave-restart.rst

docs/user/pt-stalk.rst

docs/user/pt-summary.rst

docs/user/pt-table-checksum.rst

docs/user/pt-table-sync.rst

docs/user/pt-tcp-model.rst

docs/user/pt-trend.rst

docs/user/pt-upgrade.rst

docs/user/pt-variable-advisor.rst

docs/user/pt-visual-explain.rst

docs/user/release_notes.rst

docs/user/system_requirements.rst

docs/user/version.rst

files modified:
Changelog

docs/release_notes.rst

docs/user/pt-index-usage.rst

##############
pt-index-usage
##############

.. highlight:: perl

****
NAME
****

pt-index-usage - Read queries from a log and analyze how they use indexes.

********
SYNOPSIS
********

Usage: pt-index-usage [OPTION...] [FILE...]

pt-index-usage reads queries from logs and analyzes how they use indexes.

Analyze queries in slow.log and print reports:

.. code-block:: perl

   pt-index-usage /path/to/slow.log --host localhost

Disable reports and save results to the mk database for later analysis:

.. code-block:: perl

   pt-index-usage slow.log --no-report --save-results-database mk

*****
RISKS
*****

The following section is included to inform users about the potential risks,
whether known or unknown, of using this tool. The two main categories of risks
are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.

This tool is read-only unless you use "--save-results-database". It reads a
log of queries and EXPLAINs them. It also gathers information about all tables
in all databases. It should be very low-risk.

At the time of this release, we know of no bugs that could cause serious harm to
users.

The authoritative source for updated information is always the online issue
tracking system. Issues that affect this tool will be marked as such. You can
see a list of such issues at the following URL:
`http://www.percona.com/bugs/pt-index-usage <http://www.percona.com/bugs/pt-index-usage>`_.

See also "BUGS" for more information on filing bugs and getting help.

***********
DESCRIPTION
***********

This tool connects to a MySQL database server, reads through a query log, and
uses EXPLAIN to ask MySQL how it will execute each query. When it is finished,
it prints out a report on indexes that the queries didn't use.

The query log needs to be in MySQL's slow query log format. If you need to
input a different format, you can use pt-query-digest to translate the
formats. If you don't specify a filename, the tool reads from STDIN.

The tool runs two stages. In the first stage, the tool takes inventory of all
the tables and indexes in your database, so it can compare the existing indexes
to those that were actually used by the queries in the log. In the second
stage, it runs EXPLAIN on each query in the query log. It uses separate
database connections to inventory the tables and run EXPLAIN, so it opens two
connections to the database.

If a query is not a SELECT, the tool tries to transform it to a roughly
equivalent SELECT query so it can be EXPLAINed. This is not a perfect process,
but it is good enough to be useful.

The tool skips the EXPLAIN step for queries that are exact duplicates of those
seen before. It assumes that the same query will generate the same EXPLAIN plan
as it did previously (usually a safe assumption, and generally good for
performance), and simply increments the count of times that the indexes were
used. However, queries that have the same fingerprint but different checksums
will be re-EXPLAINed. Queries that have different literal constants can have
different execution plans, and this is important to measure.
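The difference between a fingerprint and a checksum can be sketched in Python. This is only an illustration of the idea described above, not the tool's own code: the ``fingerprint`` and ``checksum`` helpers, the regular expressions, and the choice of MD5 are all assumptions made for the example.

```python
import hashlib
import re
from collections import Counter


def fingerprint(query: str) -> str:
    """Crude fingerprint: lowercase, replace literals with '?', collapse spaces."""
    q = query.strip().lower()
    q = re.sub(r"'[^']*'", "?", q)   # quoted string literals
    q = re.sub(r"\b\d+\b", "?", q)   # bare integer literals
    return re.sub(r"\s+", " ", q)


def checksum(query: str) -> str:
    """Checksum of the exact query text, so different literals give different values."""
    return hashlib.md5(query.encode("utf-8")).hexdigest()


def plan_explains(queries):
    """Return the queries that need an EXPLAIN plus a per-checksum counter."""
    seen = Counter()
    to_explain = []
    for q in queries:
        key = checksum(q)
        if seen[key] == 0:
            to_explain.append(q)  # first time this exact text is seen
        seen[key] += 1            # exact duplicates only bump the counter
    return to_explain, seen


log = [
    "SELECT * FROM t WHERE id = 1",
    "SELECT * FROM t WHERE id = 1",
    "SELECT * FROM t WHERE id = 2",
]
to_explain, counts = plan_explains(log)
```

In this sketch all three statements share one fingerprint, but only the first and third are passed on for EXPLAIN, because the second is an exact duplicate of the first.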

After EXPLAIN-ing the query, it is necessary to try to map aliases in the query
back to the original table names. For example, consider the EXPLAIN plan for
the following query:

.. code-block:: perl

   SELECT * FROM tbl1 AS foo;

The EXPLAIN output will show access to table ``foo``, and that must be translated
back to ``tbl1``. This process involves complex parsing. It is generally very
accurate, but there is some chance that it might not work right. If you find
cases where it fails, submit a bug report and a reproducible test case.
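To make the alias problem concrete, here is a deliberately naive Python sketch that recovers table names from aliases with one regular expression. It is far simpler than the parsing the tool performs, and ``alias_map``, ``TABLE_REF``, and the keyword list are hypothetical names invented for this example; subqueries, comma joins, and quoted identifiers are not handled.

```python
import re

# Words that may follow a table name but are not aliases.
KEYWORDS = "WHERE|JOIN|ON|USING|GROUP|ORDER|LIMIT|INNER|LEFT|RIGHT|HAVING"

TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+([\w.]+)"                           # table name
    r"(?:\s+(?:AS\s+)?(?!(?:" + KEYWORDS + r")\b)(\w+))?",  # optional alias
    re.IGNORECASE,
)


def alias_map(query: str) -> dict:
    """Map each alias (or bare table name) in the query to its real table name."""
    aliases = {}
    for table, alias in TABLE_REF.findall(query):
        aliases[alias or table] = table
    return aliases


def real_table(query: str, explain_table: str) -> str:
    """Translate the table name shown by EXPLAIN back to the real table."""
    return alias_map(query).get(explain_table, explain_table)
```

With the query shown above, ``real_table("SELECT * FROM tbl1 AS foo;", "foo")`` yields ``tbl1``.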

Queries that cannot be EXPLAINed will cause all subsequent queries with the
same fingerprint to be blacklisted. This is to reduce the work they cause, and
prevent them from continuing to print error messages. However, at least in
this stage of the tool's development, it is my opinion that it's not a good
idea to preemptively silence these, or prevent them from being EXPLAINed at
all. I am looking for lots of feedback on how to improve things like the
query parsing. So please submit your test cases based on the errors the tool
prints!
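The blacklisting behavior can be pictured with a short Python sketch. It assumes nothing about the tool's internals: ``explain_all``, ``toy_fingerprint``, and ``toy_explain`` are hypothetical stand-ins, and the error is simulated rather than coming from MySQL.

```python
def explain_all(queries, fingerprint, explain):
    """Run explain() on each query, skipping fingerprints that failed before."""
    blacklist = set()
    plans, errors = [], []
    for query in queries:
        fp = fingerprint(query)
        if fp in blacklist:
            continue                  # a query of this shape already failed
        try:
            plans.append(explain(query))
        except Exception as exc:      # a warning would be printed at this point
            blacklist.add(fp)
            errors.append((query, str(exc)))
    return plans, errors


def toy_fingerprint(query):
    # Hypothetical stand-in: the text before the first "=" identifies the shape.
    return query.split("=")[0].strip().lower()


def toy_explain(query):
    # Hypothetical stand-in for sending EXPLAIN to MySQL.
    if "missing_tbl" in query:
        raise RuntimeError("table does not exist")
    return f"plan for: {query}"


plans, errors = explain_all(
    [
        "SELECT * FROM missing_tbl WHERE id = 1",
        "SELECT * FROM missing_tbl WHERE id = 2",
        "SELECT * FROM t WHERE id = 3",
    ],
    toy_fingerprint,
    toy_explain,
)
```

Only the first failing query produces an error entry; the second one has the same fingerprint and is skipped silently.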

******
OUTPUT
******

After it reads all the events in the log, the tool prints out DROP statements
for every index that was not used. It skips indexes for tables that were never
accessed by any queries in the log, to avoid false-positive results.
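The rule for skipping untouched tables can be illustrated in Python. This is a sketch under assumptions: the data structures, the function name ``unused_index_statements``, and the exact statement text are invented for the example and may differ from what the tool prints.

```python
def unused_index_statements(inventory, used_indexes, accessed_tables):
    """Build one ALTER TABLE per table that was queried but has unused indexes.

    inventory:       {(db, tbl): [index_name, ...]} from the first stage
    used_indexes:    set of (db, tbl, index_name) seen in EXPLAIN output
    accessed_tables: set of (db, tbl) touched by at least one logged query
    """
    statements = []
    for (db, tbl), indexes in sorted(inventory.items()):
        if (db, tbl) not in accessed_tables:
            continue  # never queried in the log, so nothing can be concluded
        unused = [i for i in indexes if (db, tbl, i) not in used_indexes]
        if unused:
            drops = ", ".join(f"DROP KEY `{i}`" for i in unused)
            statements.append(f"ALTER TABLE `{db}`.`{tbl}` {drops};")
    return statements


inventory = {
    ("shop", "orders"): ["PRIMARY", "idx_customer", "idx_created"],
    ("shop", "audit"): ["PRIMARY", "idx_ts"],
}
used = {("shop", "orders", "PRIMARY"), ("shop", "orders", "idx_customer")}
accessed = {("shop", "orders")}
statements = unused_index_statements(inventory, used, accessed)
```

Here ``shop.audit`` never appears in the log, so none of its indexes are reported even though none of them were used.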

If you don't specify "--quiet", the tool also outputs warnings about
statements that cannot be EXPLAINed and similar problems. These go to
standard error.

Progress reports are enabled by default (see "--progress"). These also go to
standard error.

*******
OPTIONS
*******

This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.

--ask-pass

 Prompt for a password when connecting to MySQL.


--charset

 short form: -A; type: string

 Default character set. If the value is utf8, sets Perl's binmode on
 STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and
 runs SET NAMES UTF8 after connecting to MySQL. Any other value sets
 binmode on STDOUT without the utf8 layer, and runs SET NAMES after
 connecting to MySQL.


--config

 type: Array

 Read this comma-separated list of config files; if specified, this must be the
 first option on the command line.


--create-save-results-database

 Create the "--save-results-database" if it does not exist.

 If the "--save-results-database" already exists and this option is
 specified, the database is used and the necessary tables are created if
 they do not already exist.


--[no]create-views

 Create views for "--save-results-database" example queries.

 Several example queries are given for querying the tables in the
 "--save-results-database". These example queries are, by default, created
 as views. Specifying ``--no-create-views`` prevents these views from being
 created.


--database

 short form: -D; type: string

 The database to use for the connection.


--databases

 short form: -d; type: hash

 Only get tables and indexes from this comma-separated list of databases.


--databases-regex

 type: string

 Only get tables and indexes from databases whose names match this Perl regex.


--defaults-file

 short form: -F; type: string

 Only read mysql options from the given file. You must give an absolute pathname.


--drop

 type: Hash; default: non-unique

 Suggest dropping only these types of unused indexes.

 By default pt-index-usage suggests dropping only unused secondary indexes,
 not primary or unique indexes. You can specify which types of unused indexes
 the tool suggests dropping: primary, unique, non-unique, all.

 A separate ``ALTER TABLE`` statement for each type is printed. So if you
 specify ``--drop all`` and there is a primary key and a non-unique index,
 the ``ALTER TABLE ... DROP`` for each will be printed on separate lines.
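The effect of the index types accepted by ``--drop`` can be sketched in Python. The helper names are hypothetical, and treating the value as a comma-separated list is an assumption based on how the other Hash-type options are described; the sketch only groups unused indexes by type and does not print SQL.

```python
ALL_TYPES = ("primary", "unique", "non-unique")


def parse_drop(value="non-unique"):
    """Turn a --drop style value into the set of index types to report."""
    types = {t.strip() for t in value.split(",") if t.strip()}
    if "all" in types:
        return set(ALL_TYPES)
    unknown = types - set(ALL_TYPES)
    if unknown:
        raise ValueError(f"unknown index type(s): {sorted(unknown)}")
    return types


def index_type(name, is_unique):
    """Classify an index the way the option description names the types."""
    if name == "PRIMARY":
        return "primary"
    return "unique" if is_unique else "non-unique"


def select_droppable(unused, drop_value="non-unique"):
    """Group unused (name, is_unique) pairs by type, keeping only wanted types."""
    wanted = parse_drop(drop_value)
    by_type = {}
    for name, is_unique in unused:
        kind = index_type(name, is_unique)
        if kind in wanted:
            by_type.setdefault(kind, []).append(name)
    return by_type


unused = [("PRIMARY", True), ("uk_email", True), ("idx_name", False)]
default_choice = select_droppable(unused)
everything = select_droppable(unused, "all")
```

Each key of the returned dictionary corresponds to one separately printed statement in the description above.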

--empty-save-results-tables

 Drop and re-create all pre-existing tables in the "--save-results-database".
 This allows information from previous runs to be removed before the current run.


--help

 Show help and exit.


--host

 short form: -h; type: string

 Connect to host.


--ignore-databases

 type: Hash

 Ignore this comma-separated list of databases.


--ignore-databases-regex

 type: string

 Ignore databases whose names match this Perl regex.


--ignore-tables

 type: Hash

 Ignore this comma-separated list of table names.

 Table names may be qualified with the database name.


--ignore-tables-regex

 type: string

 Ignore tables whose names match this Perl regex.


--password

 short form: -p; type: string

 Password to use when connecting.


--port

 short form: -P; type: int

 Port number to use for connection.


--progress

 type: array; default: time,30

 Print progress reports to STDERR. The value is a comma-separated list with two
 parts. The first part can be percentage, time, or iterations; the second part
 specifies how often an update should be printed, in percentage, seconds, or
 number of iterations.
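As an illustration of the two-part value, the following Python sketch validates a ``--progress`` style string. ``parse_progress`` is a hypothetical helper written for this example; it is not how the tool parses its options.

```python
def parse_progress(value="time,30"):
    """Split a 'kind,interval' string into a validated (kind, interval) pair."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 2:
        raise ValueError("expected two comma-separated parts")
    kind, interval = parts
    if kind not in ("percentage", "time", "iterations"):
        raise ValueError(f"unknown progress type: {kind}")
    every = int(interval)  # raises ValueError if not a whole number
    if every <= 0:
        raise ValueError("interval must be positive")
    return kind, every
```

For example, the default ``time,30`` means one report every 30 seconds, and ``percentage,5`` would mean one report every 5 percent.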

--quiet

 short form: -q

 Do not print any warnings. Also disables "--progress".


--[no]report

 default: yes

 Print the reports for "--report-format".

 You may want to disable the reports by specifying ``--no-report`` if, for
 example, you also specify "--save-results-database" and you only want
 to query the results tables later.


--report-format

 type: Array; default: drop_unused_indexes

 Right now there is only one report: drop_unused_indexes. This report prints
 SQL statements for dropping any unused indexes. See also "--drop".

 See also "--[no]report".


--save-results-database

 type: DSN

 Save results to tables in this database. Information about indexes, queries,
 tables and their usage is stored in several tables in the specified database.
 The tables are auto-created if they do not exist. If the database doesn't
 exist, it can be auto-created with "--create-save-results-database". In this
 case the connection is initially created with no default database, then after
 the database is created, it is selected with USE.

 pt-index-usage executes INSERT statements to save the results. Therefore, you
 should be careful if you use this feature on a production server. It might
 increase load, cause trouble if you don't want the server to be written to,
 and so on.

 This is a new feature. It may change in future releases.

 After a run, you can query the usage tables to answer various questions about
 index usage. The tables have the following CREATE TABLE definitions:

 MAGIC_create_indexes:

 .. code-block:: perl

    CREATE TABLE IF NOT EXISTS indexes (
      db           VARCHAR(64) NOT NULL,
      tbl          VARCHAR(64) NOT NULL,
      idx          VARCHAR(64) NOT NULL,
      cnt          BIGINT UNSIGNED NOT NULL DEFAULT 0,
      PRIMARY KEY  (db, tbl, idx)
    )

 MAGIC_create_queries:

 .. code-block:: perl

    CREATE TABLE IF NOT EXISTS queries (
      query_id     BIGINT UNSIGNED NOT NULL,
      fingerprint  TEXT NOT NULL,
      sample       TEXT NOT NULL,
      PRIMARY KEY  (query_id)
    )

 MAGIC_create_tables:

 .. code-block:: perl

    CREATE TABLE IF NOT EXISTS tables (
      db           VARCHAR(64) NOT NULL,
      tbl          VARCHAR(64) NOT NULL,
      cnt          BIGINT UNSIGNED NOT NULL DEFAULT 0,
      PRIMARY KEY  (db, tbl)
    )

 MAGIC_create_index_usage:

 .. code-block:: perl

    CREATE TABLE IF NOT EXISTS index_usage (
      query_id      BIGINT UNSIGNED NOT NULL,
      db            VARCHAR(64) NOT NULL,
      tbl           VARCHAR(64) NOT NULL,
      idx           VARCHAR(64) NOT NULL,
      cnt           BIGINT UNSIGNED NOT NULL DEFAULT 1,
      UNIQUE INDEX  (query_id, db, tbl, idx)
    )

 MAGIC_create_index_alternatives:

 .. code-block:: perl

    CREATE TABLE IF NOT EXISTS index_alternatives (
      query_id      BIGINT UNSIGNED NOT NULL, -- This query used
      db            VARCHAR(64) NOT NULL,     -- this index, but...
      tbl           VARCHAR(64) NOT NULL,     --
      idx           VARCHAR(64) NOT NULL,     --
      alt_idx       VARCHAR(64) NOT NULL,     -- was an alternative
      cnt           BIGINT UNSIGNED NOT NULL DEFAULT 1,
      UNIQUE INDEX  (query_id, db, tbl, idx, alt_idx),
      INDEX         (db, tbl, idx),
      INDEX         (db, tbl, alt_idx)
    )

 The following are some queries you can run against these tables to answer common
 questions you might have. Each query is also created as a view (with MySQL
 v5.0 and newer) if "--[no]create-views" is true (it is by default).
 The view names are the strings after the ``MAGIC_view_`` prefix.

 Question: which queries sometimes use different indexes, and what fraction of
 the time is each index chosen? MAGIC_view_query_uses_several_indexes:

 .. code-block:: perl

    SELECT iu.query_id, CONCAT_WS('.', iu.db, iu.tbl, iu.idx) AS idx,
       variations, iu.cnt, iu.cnt / total_cnt * 100 AS pct
    FROM index_usage AS iu
       INNER JOIN (
          SELECT query_id, db, tbl, SUM(cnt) AS total_cnt,
            COUNT(*) AS variations
          FROM index_usage
          GROUP BY query_id, db, tbl
          HAVING COUNT(*) > 1
       ) AS qv USING(query_id, db, tbl);

 Question: which indexes have lots of alternatives, i.e. are chosen instead of
 other indexes, and for what queries? MAGIC_view_index_has_alternates:

 .. code-block:: perl

    SELECT CONCAT_WS('.', db, tbl, idx) AS idx_chosen,
       GROUP_CONCAT(DISTINCT alt_idx) AS alternatives,
       GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
    FROM index_alternatives
    GROUP BY db, tbl, idx
    HAVING COUNT(*) > 1;

 Question: which indexes are considered as alternates for other indexes, and for
 what queries? MAGIC_view_index_alternates:

 .. code-block:: perl

    SELECT CONCAT_WS('.', db, tbl, alt_idx) AS idx_considered,
       GROUP_CONCAT(DISTINCT idx) AS alternative_to,
       GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
    FROM index_alternatives
    GROUP BY db, tbl, alt_idx
    HAVING COUNT(*) > 1;

 Question: which of those are never chosen by any queries, and are therefore
 superfluous? MAGIC_view_unused_index_alternates:

 .. code-block:: perl

    SELECT CONCAT_WS('.', i.db, i.tbl, i.idx) AS idx,
       alt.alternative_to, alt.queries, alt.cnt
    FROM indexes AS i
       INNER JOIN (
          SELECT db, tbl, alt_idx, GROUP_CONCAT(DISTINCT idx) AS alternative_to,
             GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
          FROM index_alternatives
          GROUP BY db, tbl, alt_idx
          HAVING COUNT(*) > 1
       ) AS alt ON i.db = alt.db AND i.tbl = alt.tbl
         AND i.idx = alt.alt_idx
    WHERE i.cnt = 0;

 Question: given a table, which indexes were used, by how many queries, with how
 many distinct fingerprints? Were there alternatives? Which indexes were not
 used? You can edit the following query's SELECT list to also see the query IDs
 in question. MAGIC_view_index_usage:

 .. code-block:: perl

    SELECT i.idx, iu.usage_cnt, iu.usage_total,
       ia.alt_cnt, ia.alt_total
    FROM indexes AS i
       LEFT OUTER JOIN (
          SELECT db, tbl, idx, COUNT(*) AS usage_cnt,
             SUM(cnt) AS usage_total, GROUP_CONCAT(query_id) AS used_by
          FROM index_usage
          GROUP BY db, tbl, idx
       ) AS iu ON i.db=iu.db AND i.tbl=iu.tbl AND i.idx = iu.idx
       LEFT OUTER JOIN (
          SELECT db, tbl, idx, COUNT(*) AS alt_cnt,
             SUM(cnt) AS alt_total,
             GROUP_CONCAT(query_id) AS alt_queries
          FROM index_alternatives
          GROUP BY db, tbl, idx
       ) AS ia ON i.db=ia.db AND i.tbl=ia.tbl AND i.idx = ia.idx;

 Question: which indexes on a given table are vital for at least one query (there
 is no alternative)? MAGIC_view_required_indexes:

 .. code-block:: perl

    SELECT i.db, i.tbl, i.idx, no_alt.queries
    FROM indexes AS i
       INNER JOIN (
          SELECT iu.db, iu.tbl, iu.idx,
             GROUP_CONCAT(iu.query_id) AS queries
          FROM index_usage AS iu
             LEFT OUTER JOIN index_alternatives AS ia
                USING(db, tbl, idx)
          WHERE ia.db IS NULL
          GROUP BY iu.db, iu.tbl, iu.idx
       ) AS no_alt ON no_alt.db = i.db AND no_alt.tbl = i.tbl
         AND no_alt.idx = i.idx
    ORDER BY i.db, i.tbl, i.idx, no_alt.queries;
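To see what the required-indexes question returns, the query can be tried on a toy data set. The Python sketch below uses an in-memory SQLite database as a stand-in for MySQL, which is an assumption made purely so the example is self-contained: the column types are simplified, the sample rows are invented, and the anti-join is written with ``ON`` instead of ``USING``.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE indexes (db TEXT, tbl TEXT, idx TEXT, cnt INTEGER DEFAULT 0);
CREATE TABLE index_usage (query_id INTEGER, db TEXT, tbl TEXT, idx TEXT,
                          cnt INTEGER DEFAULT 1);
CREATE TABLE index_alternatives (query_id INTEGER, db TEXT, tbl TEXT, idx TEXT,
                                 alt_idx TEXT, cnt INTEGER DEFAULT 1);

-- Invented sample data: query 1 used PRIMARY, query 2 used idx_a
-- although idx_b was also a candidate.
INSERT INTO indexes (db, tbl, idx, cnt) VALUES
  ('d', 't', 'PRIMARY', 1), ('d', 't', 'idx_a', 1), ('d', 't', 'idx_b', 0);
INSERT INTO index_usage (query_id, db, tbl, idx) VALUES
  (1, 'd', 't', 'PRIMARY'), (2, 'd', 't', 'idx_a');
INSERT INTO index_alternatives (query_id, db, tbl, idx, alt_idx) VALUES
  (2, 'd', 't', 'idx_a', 'idx_b');
""")

REQUIRED_INDEXES = """
SELECT i.db, i.tbl, i.idx, no_alt.queries
FROM indexes AS i
  INNER JOIN (
    SELECT iu.db, iu.tbl, iu.idx, GROUP_CONCAT(iu.query_id) AS queries
    FROM index_usage AS iu
      LEFT OUTER JOIN index_alternatives AS ia
        ON ia.db = iu.db AND ia.tbl = iu.tbl AND ia.idx = iu.idx
    WHERE ia.db IS NULL
    GROUP BY iu.db, iu.tbl, iu.idx
  ) AS no_alt ON no_alt.db = i.db AND no_alt.tbl = i.tbl
    AND no_alt.idx = i.idx
ORDER BY i.db, i.tbl, i.idx, no_alt.queries
"""

rows = conn.execute(REQUIRED_INDEXES).fetchall()
```

With this data only ``PRIMARY`` is returned: ``idx_a`` was used, but an alternative was recorded for it, and ``idx_b`` was never used at all.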

--set-vars

 type: string; default: wait_timeout=10000

 Set these MySQL variables. Immediately after connecting to MySQL, this
 string will be appended to SET and executed.


--socket

 short form: -S; type: string

 Socket file to use for connection.


--tables

 short form: -t; type: hash

 Only get indexes from this comma-separated list of tables.


--tables-regex

 type: string

 Only get indexes from tables whose names match this Perl regex.


--user

 short form: -u; type: string

 User for login if not current user.


--version

 Show version and exit.

***********
DSN OPTIONS
***********

These DSN options are used to create a DSN. Each option is given like
``option=value``. The options are case-sensitive, so P and p are not the
same option. There cannot be whitespace before or after the ``=`` and
if the value contains whitespace it must be quoted. DSN options are
comma-separated. See the percona-toolkit manpage for full details.

* A

  dsn: charset; copy: yes

  Default character set.

* D

  dsn: database; copy: yes

  Database to connect to.

* F

  dsn: mysql_read_default_file; copy: yes

  Only read default options from the given file.

* h

  dsn: host; copy: yes

  Connect to host.

* p

  dsn: password; copy: yes

  Password to use when connecting.

* P

  dsn: port; copy: yes

  Port number to use for connection.

* S

  dsn: mysql_socket; copy: yes

  Socket file to use for connection.

* u

  dsn: user; copy: yes

  User for login if not current user.
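The ``option=value`` syntax can be illustrated with a small Python parser. This is a sketch, not the toolkit's DSN parser: ``parse_dsn`` is a hypothetical helper, the key table simply mirrors the list above, and quoted values or values that themselves contain commas are deliberately not handled.

```python
# Single-letter DSN keys and the names given for them in the list above.
DSN_KEYS = {
    "A": "charset",
    "D": "database",
    "F": "mysql_read_default_file",
    "h": "host",
    "p": "password",
    "P": "port",
    "S": "mysql_socket",
    "u": "user",
}


def parse_dsn(dsn: str) -> dict:
    """Parse 'h=localhost,P=3306' into {'host': 'localhost', 'port': '3306'}."""
    parsed = {}
    for part in dsn.split(","):
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in DSN part: {part!r}")
        if key not in DSN_KEYS:  # keys are case-sensitive: P is not p
            raise ValueError(f"unknown DSN option: {key!r}")
        parsed[DSN_KEYS[key]] = value
    return parsed
```

Because the keys are case-sensitive, ``P=3306`` sets the port while ``p=3306`` would set the password.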

***********
ENVIRONMENT
***********

The environment variable ``PTDEBUG`` enables verbose debugging output to STDERR.
To enable debugging and capture all output to a file, run the tool like:

.. code-block:: perl

   PTDEBUG=1 pt-index-usage ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes
of output.

*******************
SYSTEM REQUIREMENTS
*******************

You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.

****
BUGS
****

For a list of known bugs, see `http://www.percona.com/bugs/pt-index-usage <http://www.percona.com/bugs/pt-index-usage>`_.

Please report bugs at `https://bugs.launchpad.net/percona-toolkit <https://bugs.launchpad.net/percona-toolkit>`_.
Include the following information in your bug report:

* Complete command-line used to run the tool

* Tool "--version"

* MySQL version of all servers involved

* Output from the tool including STDERR

* Input files (log/dump/config files, etc.)

If possible, include debugging output by running the tool with ``PTDEBUG``;
see "ENVIRONMENT".

***********
DOWNLOADING
***********

Visit `http://www.percona.com/software/percona-toolkit/ <http://www.percona.com/software/percona-toolkit/>`_ to download the
latest release of Percona Toolkit. Or, get the latest release from the
command line:

.. code-block:: perl

   wget percona.com/get/percona-toolkit.tar.gz

   wget percona.com/get/percona-toolkit.rpm

   wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:

.. code-block:: perl

   wget percona.com/get/TOOL

Replace ``TOOL`` with the name of any tool.

*******
AUTHORS
*******

Baron Schwartz and Daniel Nichter

*********************
ABOUT PERCONA TOOLKIT
*********************

This tool is part of Percona Toolkit, a collection of advanced command-line
tools developed by Percona for MySQL support and consulting. Percona Toolkit
was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and
Daniel Nichter, both of whom are employed by Percona. Visit
`http://www.percona.com/software/ <http://www.percona.com/software/>`_ for more software developed by Percona.

********************************
COPYRIGHT, LICENSE, AND WARRANTY
********************************

Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ``man perlgpl`` or ``man perlartistic`` to read these
licenses.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.

*******
VERSION
*******

pt-index-usage 1.0.2
