pt-heartbeat - Monitor MySQL replication delay.

Usage: pt-heartbeat [OPTION...] [DSN] --update|--monitor|--check|--stop

pt-heartbeat measures replication lag on a MySQL or PostgreSQL server. You can
use it to update a master or monitor a replica. If possible, MySQL connection
options are read from your .my.cnf file.

Start daemonized process to update test.heartbeat table on master:

   pt-heartbeat -D test --update -h master-server --daemonize

Monitor replication lag on slave:

   pt-heartbeat -D test --monitor -h slave-server

   pt-heartbeat -D test --monitor -h slave-server --dbi-driver Pg

Check slave lag once and exit (using optional DSN to specify slave host):

   pt-heartbeat -D test --check h=slave-server

The following section is included to inform users about the potential risks,
whether known or unknown, of using this tool. The two main categories of risks
are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.

pt-heartbeat merely reads and writes a single record in a table. It should be

At the time of this release, we know of no bugs that could cause serious harm to

The authoritative source for updated information is always the online issue
tracking system. Issues that affect this tool will be marked as such. You can
see a list of such issues at the following URL:
`http://www.percona.com/bugs/pt-heartbeat <http://www.percona.com/bugs/pt-heartbeat>`_.

See also "BUGS" for more information on filing bugs and getting help.

pt-heartbeat is a two-part MySQL and PostgreSQL replication delay monitoring
system that measures delay by looking at actual replicated data. This
avoids reliance on the replication mechanism itself, which is unreliable. (For
example, \ ``SHOW SLAVE STATUS``\ on MySQL).

The first part is an "--update" instance of pt-heartbeat that connects to
a master and updates a timestamp ("heartbeat record") every "--interval"
seconds. Since the heartbeat table may contain records from multiple
masters (see "MULTI-SLAVE HIERARCHY"), the server's ID (@@server_id) is
used to identify records.

The second part is a "--monitor" or "--check" instance of pt-heartbeat
that connects to a slave, examines the replicated heartbeat record from its
immediate master or the specified "--master-server-id", and computes the
difference from the current system time. If replication between the slave and
the master is delayed or broken, the computed difference will be greater than
zero and potentially increase if "--monitor" is specified.
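
The delay calculation described above can be illustrated with a minimal
Python sketch. It is not the tool's implementation: the function name
``heartbeat_lag`` and the ISO-8601 text format of the timestamp are
assumptions made for the example, and the clamp to zero is an illustrative
choice for clocks that are slightly out of step.

```python
from datetime import datetime

def heartbeat_lag(heartbeat_ts, now):
    """Seconds between the replicated heartbeat timestamp and `now`."""
    # Assumed format: ISO-8601 text such as "2011-08-01T12:00:01.250000".
    beat = datetime.fromisoformat(heartbeat_ts)
    # Clamp at zero so a slightly fast master clock never yields negative lag.
    return max(0.0, (now - beat).total_seconds())

# The slave's clock reads 12:00:05 while the newest replicated heartbeat
# was written on the master at 12:00:01.25.
now = datetime(2011, 8, 1, 12, 0, 5)
lag = heartbeat_lag("2011-08-01T12:00:01.250000", now)
print(lag)
```

If replication stops, the replicated timestamp stops advancing while ``now``
keeps moving, so repeated calls report a steadily growing value.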

You must either manually create the heartbeat table on the master or use
"--create-table". See "--create-table" for the proper heartbeat
table structure. The \ ``MEMORY``\ storage engine is suggested, but not
required of course, for MySQL.

The heartbeat table must contain a heartbeat row. By default, a heartbeat
row is inserted if it doesn't exist. This feature can be disabled with the
"--[no]insert-heartbeat-row" option in case the database user does not
have INSERT privileges.

pt-heartbeat depends only on the heartbeat record being replicated to the slave,
so it works regardless of the replication mechanism (built-in replication, a
system such as Continuent Tungsten, etc). It works at any depth in the
replication hierarchy; for example, it will reliably report how far a slave lags
its master's master's master. And if replication is stopped, it will continue
to work and report (accurately!) that the slave is falling further and further

pt-heartbeat has a maximum resolution of 0.01 second. The clocks on the
master and slave servers must be closely synchronized via NTP. By default,
"--update" checks happen on the edge of the second (e.g. 00:01) and
"--monitor" checks happen halfway between seconds (e.g. 00:01.5).
As long as the servers' clocks are closely synchronized and replication
events are propagating in less than half a second, pt-heartbeat will report
zero seconds of delay.

pt-heartbeat will try to reconnect if the connection has an error, but will
not retry if it can't get a connection when it first starts.

The "--dbi-driver" option lets you use pt-heartbeat to monitor PostgreSQL
as well. It is reported to work well with Slony-I replication.

*********************
MULTI-SLAVE HIERARCHY
*********************

If the replication hierarchy has multiple slaves which are masters of
other slaves, like "master -> slave1 -> slave2", "--update" instances
can be run on the slaves as well as the master. The default heartbeat
table (see "--create-table") is keyed on the \ ``server_id``\ column, so
each server will update the row where \ ``server_id=@@server_id``\ .

For "--monitor" and "--check", if "--master-server-id" is not
specified, the tool tries to discover and use the slave's immediate master.
If this fails, or if you want to monitor lag from another master, then you can
specify the "--master-server-id" to use.

For example, if the replication hierarchy is "master -> slave1 -> slave2"
with corresponding server IDs 1, 2 and 3, you can:

   pt-heartbeat --daemonize -D test --update -h master
   pt-heartbeat --daemonize -D test --update -h slave1

Then check (or monitor) the replication delay from master to slave2:

   pt-heartbeat -D test --master-server-id 1 --check slave2

Or check the replication delay from slave1 to slave2:

   pt-heartbeat -D test --master-server-id 2 --check slave2
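
To make the role of "--master-server-id" concrete, here is a small Python
sketch of the lookup it implies. The ``rows`` dictionary is a hypothetical
snapshot of the heartbeat table as seen on slave2, and ``lag_from`` is an
illustrative helper, not part of the tool.

```python
from datetime import datetime

# Hypothetical heartbeat table contents replicated to slave2: one row per
# server running an --update instance, keyed on server_id.
rows = {
    1: datetime(2011, 8, 1, 12, 0, 0),  # row updated on master (server ID 1)
    2: datetime(2011, 8, 1, 12, 0, 3),  # row updated on slave1 (server ID 2)
}

def lag_from(master_server_id, now):
    """Delay in seconds relative to the chosen upstream server's row."""
    return (now - rows[master_server_id]).total_seconds()

now = datetime(2011, 8, 1, 12, 0, 4)
print(lag_from(1, now))  # delay measured from master to slave2
print(lag_from(2, now))  # delay measured from slave1 to slave2
```

Because each server only updates its own row, the two measurements are
independent of each other.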

Stopping the "--update" instance on slave1 will not affect the instance

***********************
MASTER AND SLAVE STATUS
***********************

The default heartbeat table (see "--create-table") has columns for saving
information from \ ``SHOW MASTER STATUS``\ and \ ``SHOW SLAVE STATUS``\ . These
columns are optional. If any are present, their corresponding information

Specify at least one of "--stop", "--update", "--monitor", or "--check".

"--update", "--monitor", and "--check" are mutually exclusive.

"--daemonize" and "--check" are mutually exclusive.

This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.

Prompt for a password when connecting to MySQL.

short form: -A; type: string

Default character set. If the value is utf8, sets Perl's binmode on STDOUT to
utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8
after connecting to MySQL. Any other value sets binmode on STDOUT without the
utf8 layer, and runs SET NAMES after connecting to MySQL.

Check slave delay once and exit. If you also specify "--recurse", the
tool will try to discover slaves of the given slave and check and print
their lag, too. The hostname or IP and port for each slave is printed
before its delay. "--recurse" only works with MySQL.

Read this comma-separated list of config files; if specified, this must be the
first option on the command line.

Create the heartbeat "--table" if it does not exist.

This option causes the table specified by "--database" and "--table" to
be created with the following MAGIC_create_heartbeat table definition:

   CREATE TABLE heartbeat (
     ts                    varchar(26) NOT NULL,
     server_id             int unsigned NOT NULL PRIMARY KEY,
     file                  varchar(255) DEFAULT NULL,    -- SHOW MASTER STATUS
     position              bigint unsigned DEFAULT NULL, -- SHOW MASTER STATUS
     relay_master_log_file varchar(255) DEFAULT NULL,    -- SHOW SLAVE STATUS
     exec_master_log_pos   bigint unsigned DEFAULT NULL  -- SHOW SLAVE STATUS
   );

The heartbeat table requires at least one row. If you manually create the
heartbeat table, then you must insert a row by doing:

   INSERT INTO heartbeat (ts, server_id) VALUES (NOW(), N);

where \ ``N``\ is the server's ID; do not use @@server_id because it will replicate
and slaves will insert their own server ID instead of the master's server ID.

This is done automatically by "--create-table".

A legacy version of the heartbeat table is still supported:

   CREATE TABLE heartbeat (
     id int NOT NULL PRIMARY KEY,
     ts datetime NOT NULL
   );

Legacy tables do not support "--update" instances on each slave
of a multi-slave hierarchy like "master -> slave1 -> slave2".
To manually insert the one required row into a legacy table:

   INSERT INTO heartbeat (id, ts) VALUES (1, NOW());

The tool automatically detects if the heartbeat table is legacy.

See also "MULTI-SLAVE HIERARCHY".

Fork to the background and detach from the shell. POSIX operating systems only.

short form: -D; type: string

The database to use for the connection.

default: mysql; type: string

Specify a driver for the connection; \ ``mysql``\ and \ ``Pg``\ are supported.

short form: -F; type: string

Only read mysql options from the given file. You must give an absolute

Print latest "--monitor" output to this file.

When "--monitor" is given, prints output to the specified file instead of to
STDOUT. The file is opened, truncated, and closed every interval, so it will
only contain the most recent statistics. Useful when "--daemonize" is given.

type: string; default: 1m,5m,15m

Timeframes for averages.

Specifies the timeframes over which to calculate moving averages when
"--monitor" is given. Specify as a comma-separated list of numbers with
suffixes. The suffix can be s for seconds, m for minutes, h for hours, or d for
days. The size of the largest frame determines the maximum memory usage, as up
to the specified number of per-second samples are kept in memory to calculate
the averages. You can specify as many timeframes as you like.
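
As an illustration of the timeframe syntax and of why the largest frame
bounds memory use, the following Python sketch parses a frame list and keeps
a bounded window of samples. The names ``parse_frames`` and
``MovingAverages`` are hypothetical, and one sample per second is assumed;
this is not the tool's own code.

```python
from collections import deque

UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_frames(spec):
    """Turn a list such as "1m,5m,15m" into frame sizes in seconds."""
    return [int(part[:-1]) * UNITS[part[-1]] for part in spec.split(",")]

class MovingAverages:
    def __init__(self, spec):
        self.frames = parse_frames(spec)
        # Only the largest frame's worth of samples is ever retained.
        self.samples = deque(maxlen=max(self.frames))

    def add(self, delay):
        self.samples.append(delay)

    def averages(self):
        recent = list(self.samples)
        result = []
        for size in self.frames:
            window = recent[-size:]
            result.append(sum(window) / len(window) if window else 0.0)
        return result

frames = parse_frames("1m,5m,15m")
stats = MovingAverages("2s,4s")
for delay in (0, 0, 1, 3):
    stats.add(delay)
print(frames, stats.averages())
```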

short form: -h; type: string

--[no]insert-heartbeat-row

Insert a heartbeat row in the "--table" if one doesn't exist.

The heartbeat "--table" requires a heartbeat row, else there's nothing
to "--update", "--monitor", or "--check"! By default, the tool will
insert a heartbeat row if one is not already present. You can disable this
feature by specifying \ ``--no-insert-heartbeat-row``\ in case the database user
does not have INSERT privileges.

type: float; default: 1.0

How often to update or check the heartbeat "--table". Updates and checks
begin on the first whole second then repeat every "--interval" seconds
for "--update" and every "--interval" plus "--skew" seconds for

For example, if at 00:00.4 an "--update" instance is started at 0.5 second
intervals, the first update happens at 00:01.0, the next at 00:01.5, etc.
If at 00:10.7 a "--monitor" instance is started at 0.05 second intervals
with the default 0.5 second "--skew", then the first check happens at
00:11.5 (00:11.0 + 0.5) which will be "--skew" seconds after the last update
which, because the instances are checking at synchronized intervals, happened

The tool waits for and begins on the first whole second just to make the
interval calculations simpler. Therefore, the tool could wait up to 1 second
before updating or checking.

The minimum (fastest) interval is 0.01, and the maximum precision is two
decimal places, so 0.015 will be rounded to 0.02.
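
The timing rules above can be reproduced with a short Python sketch. It is
only a model of the documented behavior: ``normalize_interval`` and
``first_ticks`` are hypothetical names, times are plain seconds, and the
choice to wait for the next second when started exactly on a whole second is
an assumption.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

def normalize_interval(interval):
    """Round to two decimal places (half up) and enforce the 0.01 minimum."""
    value = Decimal(interval).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return max(value, Decimal("0.01"))

def first_ticks(start, interval, skew=0.0, count=3):
    """Times of the first `count` runs for a process started at `start`."""
    # Wait for the next whole second, then add the skew (0 for --update).
    first = math.floor(start) + 1 + skew
    return [round(first + i * interval, 2) for i in range(count)]

rounded = normalize_interval("0.015")
update_ticks = first_ticks(0.4, 0.5)                       # started at 00:00.4
monitor_first = first_ticks(10.7, 0.05, skew=0.5, count=1)  # started at 00:10.7
print(rounded, update_ticks, monitor_first)
```

``Decimal`` is used for the rounding step because binary floating point
cannot represent 0.015 exactly, so ``round(0.015, 2)`` would not give the
documented 0.02.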

If a legacy heartbeat table (see "--create-table") is used, then the
maximum precision is 1s because the \ ``ts``\ column is type \ ``datetime``\ .

Print all output to this file when daemonized.

Calculate delay from this master server ID for "--monitor" or "--check".
If not given, pt-heartbeat attempts to connect to the server's master and
determine its server id.

Monitor slave delay continuously.

Specifies that pt-heartbeat should check the slave's delay every second and
report to STDOUT (or if "--file" is given, to the file instead). The output
is the current delay followed by moving averages over the timeframe given in
"--frames". For example,

   5s [ 0.25s, 0.05s, 0.02s ]
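
For anyone post-processing this output, the line shape shown above can be
mimicked with a few lines of Python. This is a sketch that reproduces the
documented example only; ``format_monitor_line`` is a hypothetical helper,
and the exact spacing the tool emits may differ.

```python
def format_monitor_line(delay, averages):
    """Render the current delay followed by the moving averages."""
    avgs = ", ".join(f"{avg:.2f}s" for avg in averages)
    return f"{delay:g}s [ {avgs} ]"

line = format_monitor_line(5, [0.25, 0.05, 0.02])
print(line)
```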

short form: -p; type: string

Password to use when connecting.

Create the given PID file when daemonized. The file contains the process ID of
the daemonized instance. The PID file is removed when the daemonized instance
exits. The program checks for the existence of the PID file when starting; if
it exists and the process with the matching PID exists, the program exits.
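
The startup check described above follows a common POSIX pattern, sketched
here in Python. The helper name ``pid_file_is_live`` and the temporary file
path are made up for the example, and the tool's own check may differ in
detail.

```python
import os
import tempfile

def pid_file_is_live(path):
    """Return True if `path` is a PID file whose process still exists."""
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return False      # no PID file, or contents are not a number
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)   # signal 0 sends nothing; it only checks existence
    except ProcessLookupError:
        return False      # stale PID file left by a dead process
    except PermissionError:
        return True       # process exists but belongs to another user
    return True

with tempfile.TemporaryDirectory() as tmp:
    pid_path = os.path.join(tmp, "pt-heartbeat.pid")
    missing = pid_file_is_live(pid_path)      # file not created yet
    with open(pid_path, "w") as fh:
        fh.write(str(os.getpid()))            # pretend we are the daemon
    live = pid_file_is_live(pid_path)
print(missing, live)
```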

short form: -P; type: int

Port number to use for connection.

--print-master-server-id

Print the auto-detected or given "--master-server-id". If "--check"
or "--monitor" is specified, specifying this option will print the
auto-detected or given "--master-server-id" at the end of each line.

Check slaves recursively to this depth in "--check" mode.

Try to discover slave servers recursively, to the specified depth. After
discovering servers, run the check on each one of them and print the hostname
(if possible), followed by the slave delay.

This currently works only with MySQL. See "--recursion-method".

Preferred recursion method used to find slaves.

Possible methods are:

===========  ================
processlist  SHOW PROCESSLIST
hosts        SHOW SLAVE HOSTS
===========  ================

The processlist method is preferred because SHOW SLAVE HOSTS is not reliable.
However, the hosts method is required if the server uses a non-standard
port (not 3306). Usually pt-heartbeat does the right thing and finds
the slaves, but you may give a preferred method and it will be used first.
If it doesn't find any slaves, the other methods will be tried.

Use \ ``REPLACE``\ instead of \ ``UPDATE``\ for --update.

When running in "--update" mode, use \ ``REPLACE``\ instead of \ ``UPDATE``\ to set
the heartbeat table's timestamp. The \ ``REPLACE``\ statement is a MySQL extension
to SQL. This option is useful when you don't know whether the table contains
any rows or not. It must be used in conjunction with --update.

Time to run before exiting.

type: string; default: /tmp/pt-heartbeat-sentinel

Exit if this file exists.

type: string; default: wait_timeout=10000

Set these MySQL variables. Immediately after connecting to MySQL, this string
will be appended to SET and executed.

type: float; default: 0.5

How long to delay checks.

The default is to delay checks one half second. Since the update happens as
soon as possible after the beginning of the second on the master, this allows
one half second of replication delay before reporting that the slave lags the
master by one second. If your clocks are not completely accurate or there is
some other reason you'd like to delay the slave more or less, you can tweak this
value. Try setting the \ ``PTDEBUG``\ environment variable to see the effect this

short form: -S; type: string

Socket file to use for connection.

Stop running instances by creating the sentinel file.

This should have the effect of stopping all running
instances which are watching the same sentinel file. If none of
"--update", "--monitor" or "--check" is specified, \ ``pt-heartbeat``\
will exit after creating the file. If one of these is specified,
\ ``pt-heartbeat``\ will wait the interval given by "--interval", then remove
the file and continue working.
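
The sentinel mechanism amounts to a work loop that polls for a file. The
Python sketch below models that idea under simplifying assumptions:
``run_until_sentinel`` is a hypothetical name, the sentinel lives in a
temporary directory rather than ``/tmp/pt-heartbeat-sentinel``, and the
``work`` callback creates the sentinel itself on its third call, standing in
for a separate "--stop" invocation.

```python
import os
import tempfile
import time

def run_until_sentinel(sentinel, interval, work, max_iterations=1000):
    """Call `work` every `interval` seconds until the sentinel file exists."""
    iterations = 0
    while iterations < max_iterations and not os.path.exists(sentinel):
        work()
        iterations += 1
        time.sleep(interval)
    return iterations

with tempfile.TemporaryDirectory() as tmp:
    sentinel = os.path.join(tmp, "pt-heartbeat-sentinel")
    calls = []

    def work():
        calls.append(time.time())
        if len(calls) == 3:
            # Simulate another process running with --stop.
            open(sentinel, "w").close()

    completed = run_until_sentinel(sentinel, 0.01, work)
print(completed)
```

Every instance that watches the same path stops at its next poll, which is
why a distinct "--sentinel" path isolates one group of instances from
another.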

You might find this handy to stop cron jobs gracefully if necessary, or to
replace one running instance with another. For example, if you want to stop
and restart \ ``pt-heartbeat``\ every hour (just to make sure that it is restarted
every hour, in case of a server crash or some other problem), you could use a
\ ``crontab``\ line like this:

   0 * * * * pt-heartbeat --update -D test --stop \
     --sentinel /tmp/pt-heartbeat-hourly

The non-default "--sentinel" will make sure the hourly \ ``cron``\ job stops
only instances previously started with the same options (that is, from the
same \ ``cron``\ job).

See also "--sentinel".

type: string; default: heartbeat

The table to use for the heartbeat.

Don't specify database.table; use "--database" to specify the database.

See "--create-table".

Update a master's heartbeat.

short form: -u; type: string

User for login if not current user.

Show version and exit.

These DSN options are used to create a DSN. Each option is given like
\ ``option=value``\ . The options are case-sensitive, so P and p are not the
same option. There cannot be whitespace before or after the \ ``=``\ and
if the value contains whitespace it must be quoted. DSN options are
comma-separated. See the percona-toolkit manpage for full details.
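
The DSN syntax can be illustrated with a deliberately simplified Python
parser. ``parse_dsn`` is a hypothetical helper that handles only the plain
``option=value`` form; it does not implement quoting of values that contain
whitespace or commas, and the sample values are invented.

```python
def parse_dsn(dsn):
    """Split "k=v,k=v" into a dict, keeping option names case-sensitive."""
    options = {}
    for pair in dsn.split(","):
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed DSN part: {pair!r}")
        options[key] = value
    return options

dsn = parse_dsn("h=slave-server,P=3306,p=secret")
print(dsn)
```

Because keys are stored exactly as written, ``P`` and ``p`` end up as two
separate entries.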

dsn: charset; copy: yes

Default character set.

dsn: database; copy: yes

dsn: mysql_read_default_file; copy: yes

Only read default options from the given file.

dsn: password; copy: yes

Password to use when connecting.

Port number to use for connection.

dsn: mysql_socket; copy: yes

Socket file to use for connection.

User for login if not current user.

Visit `http://www.percona.com/software/ <http://www.percona.com/software/>`_ to download the latest release of
Percona Toolkit. Or, to get the latest release from the command line:

   wget percona.com/latest/percona-toolkit/PKG

Replace \ ``PKG``\ with \ ``tar``\ , \ ``rpm``\ , or \ ``deb``\ to download the package in that
format. You can also get individual tools from the latest release:

   wget percona.com/latest/percona-toolkit/TOOL

Replace \ ``TOOL``\ with the name of any tool.

The environment variable \ ``PTDEBUG``\ enables verbose debugging output to STDERR.
To enable debugging and capture all output to a file, run the tool like:

   PTDEBUG=1 pt-heartbeat ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes

You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.

For a list of known bugs, see `http://www.percona.com/bugs/pt-heartbeat <http://www.percona.com/bugs/pt-heartbeat>`_.

Please report bugs at `https://bugs.launchpad.net/percona-toolkit <https://bugs.launchpad.net/percona-toolkit>`_.
Include the following information in your bug report:

* Complete command-line used to run the tool

* MySQL version of all servers involved

* Output from the tool including STDERR

* Input files (log/dump/config files, etc.)

If possible, include debugging output by running the tool with \ ``PTDEBUG``\ ;

Proven Scaling LLC, SixApart Ltd, Baron Schwartz, and Daniel Nichter

*********************
ABOUT PERCONA TOOLKIT
*********************

This tool is part of Percona Toolkit, a collection of advanced command-line
tools developed by Percona for MySQL support and consulting. Percona Toolkit
was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and
Daniel Nichter, both of whom are employed by Percona. Visit
`http://www.percona.com/software/ <http://www.percona.com/software/>`_ for more software developed by Percona.

********************************
COPYRIGHT, LICENSE, AND WARRANTY
********************************

This program is copyright 2006 Proven Scaling LLC and Six Apart Ltd,
2007-2011 Percona Inc.
Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue \`man perlgpl' or \`man perlartistic' to read these

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.

Percona Toolkit v1.0.0 released 2011-08-01