== Defining a Cluster Node ==

Each node in the cluster will have an entry in the nodes section
containing its UUID, uname, and type.

.Example Heartbeat cluster node entry
[source,XML]
<node id="1186dc9a-324d-425a-966e-d757e693dc86" uname="pcmk-1" type="normal"/>

.Example Corosync cluster node entry
[source,XML]
<node id="101" uname="pcmk-1" type="normal"/>

In normal circumstances, the admin should let the cluster populate
this information automatically from the communications and membership
data. However, for Heartbeat, one can use the `crm_uuid` tool
to read an existing UUID or define a value before the cluster starts.

== Where Pacemaker Gets the Node Name ==

Traditionally, Pacemaker required nodes to be referred to by the value
returned by `uname -n`. This can be problematic for services that
require the `uname -n` to be a specific value (e.g. for a licence).

Since version 2.0.0 of Pacemaker, this requirement has been relaxed
for clusters using Corosync 2.0 or later. The name Pacemaker uses is:

. The value stored in 'corosync.conf' under +ring0_addr+ in the +nodelist+, if it does not contain an IP address; otherwise
. The value stored in 'corosync.conf' under +name+ in the +nodelist+; otherwise
. The value of `uname -n`
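
The fallback order in the list above can be sketched as a small shell function. This is an illustration only, not Pacemaker's implementation: `pick_node_name` and `looks_like_ip` are hypothetical names, the two arguments stand in for the +ring0_addr+ and +name+ values read from 'corosync.conf' (empty when unset), and the IP-address test is deliberately crude.

```shell
# Hypothetical helper: succeed if the argument looks like an IP address.
looks_like_ip() {
    case "$1" in
        *:*)        return 0 ;;   # contains a colon: treat as IPv6
        *[!0-9.]*)  return 1 ;;   # any other non-digit, non-dot: a host name
        *.*.*.*)    return 0 ;;   # only digits and dots: IPv4-like
        *)          return 1 ;;
    esac
}

# pick_node_name RING0_ADDR NAME
# Prints the name chosen by the three-step fallback described above.
pick_node_name() {
    ring0_addr=$1
    name=$2
    if [ -n "$ring0_addr" ] && ! looks_like_ip "$ring0_addr"; then
        printf '%s\n' "$ring0_addr"   # 1. ring0_addr, if it is not an IP address
    elif [ -n "$name" ]; then
        printf '%s\n' "$name"         # 2. name from the nodelist
    else
        uname -n                      # 3. fall back to uname -n
    fi
}
```

For example, `pick_node_name 192.168.122.101 pcmk-1` prints `pcmk-1`, because the first argument is an IP address and the second is set.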

Pacemaker provides the `crm_node -n` command, which displays the name
used by a running cluster.

If a Corosync nodelist is used, `crm_node --name-for-id $number` is also
available to display the name used by the node with the Corosync
+nodeid+ of '$number', for example: `crm_node --name-for-id 2`.

== Describing a Cluster Node ==

indexterm:[Node,attribute]
Beyond the basic definition of a node, the administrator can also
describe the node's attributes, such as how much RAM and disk it has,
what OS or kernel version it runs, perhaps even its physical location. This
information can then be used by the cluster when deciding where to
place resources. For more information on the use of node attributes,

Node attributes can be specified ahead of time or populated later,
when the cluster is running, using `crm_attribute`.

Below is what the node's definition would look like if the admin ran the command:

.The result of using crm_attribute to specify which kernel pcmk-1 is running
[literal]
# crm_attribute --type nodes --node-uname pcmk-1 --attr-name kernel --attr-value `uname -r`

[source,XML]
<node uname="pcmk-1" type="normal" id="101">
  <instance_attributes id="nodes-101">
    <nvpair id="kernel-101" name="kernel" value="2.6.16.46-0.4-default"/>
  </instance_attributes>
</node>

A simpler way to determine the current value of an attribute is to use the `crm_attribute` command again:

[literal]
# crm_attribute --type nodes --node-uname pcmk-1 --attr-name kernel --get-value

By specifying `--type nodes` the admin tells the cluster that this
attribute is persistent. There are also transient attributes, which
are kept in the status section and are "forgotten" whenever the node
rejoins the cluster. The cluster uses this area to store a record of
how many times a resource has failed on that node, but administrators
can also read and write to this section by specifying `--type status`.
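
To make the persistent/transient distinction concrete, here is a minimal sketch that only prints the `crm_attribute` command line for each case, so it can be reviewed before being run on a cluster. The `attr_cmd` function and its "persistent"/"transient" keywords are hypothetical; the options are the ones used in the examples above.

```shell
# attr_cmd LIFETIME NODE NAME VALUE
# LIFETIME is "persistent" (nodes section) or "transient" (status section).
# Prints the command instead of running it.
attr_cmd() {
    case "$1" in
        persistent) section=nodes ;;
        transient)  section=status ;;
        *) echo "unknown lifetime: $1" >&2; return 1 ;;
    esac
    printf 'crm_attribute --type %s --node-uname %s --attr-name %s --attr-value %s\n' \
        "$section" "$2" "$3" "$4"
}
```

Calling `attr_cmd transient pcmk-1 kernel 2.6.16` prints the same command as the persistent case, with `--type status` in place of `--type nodes`.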

=== Adding a New Corosync Node ===

indexterm:[Corosync,Add Cluster Node]
indexterm:[Add Cluster Node,Corosync]

Adding a new node is as simple as installing Corosync and Pacemaker,
and copying '/etc/corosync/corosync.conf' and '/etc/corosync/authkey' (if
it exists) from an existing node. You may need to modify the
+bindnetaddr+ option to match the new node's IP address
(+mcastaddr+, by contrast, must be the same on every node).

If a log message containing "Invalid digest" appears from Corosync,
the keys are not consistent between the machines.
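
One way to check for inconsistent keys is to compare a digest of the key file on each machine rather than the key itself. The sketch below assumes `sha256sum` from GNU coreutils is available; `key_digest` and `same_key` are hypothetical helper names.

```shell
# key_digest FILE: print the SHA-256 digest of FILE.
key_digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# same_key FILE_A FILE_B: succeed only when both files have the same digest.
same_key() {
    [ "$(key_digest "$1")" = "$(key_digest "$2")" ]
}
```

Running `key_digest /etc/corosync/authkey` on each node and comparing the printed values shows whether the copies match without exposing the key.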

=== Removing a Corosync Node ===

indexterm:[Corosync,Remove Cluster Node]
indexterm:[Remove Cluster Node,Corosync]

Because the messaging and membership layers are the authoritative
source for cluster nodes, deleting them from the CIB is not a reliable
solution. First, one must arrange for Corosync to forget about the
node (_pcmk-1_ in the example below).

On the host to be removed:

. Stop the cluster: `/etc/init.d/corosync stop`

Next, from one of the remaining active cluster nodes:

. Tell Pacemaker to forget about the removed host:

This includes deleting the node from the CIB.

NOTE: This procedure only works for versions after 1.1.8.
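
The removal steps can be summarised as a dry-run helper that prints the commands in order without executing them. The command for the Pacemaker step is not shown in the text above; `crm_node -R` (the short form of `crm_node --remove`) is used here as an assumption, and `remove_node_plan` is a hypothetical name.

```shell
# remove_node_plan NODE: print, but do not run, the commands for removing NODE.
remove_node_plan() {
    node=$1
    echo "# on $node:"
    echo "/etc/init.d/corosync stop"
    echo "# on a remaining active node:"
    echo "crm_node -R $node   # assumed command, not taken from the text"
}
```

Calling `remove_node_plan pcmk-1` lists the two commands together with the host each should be run on.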

=== Replacing a Corosync Node ===

indexterm:[Corosync,Replace Cluster Node]
indexterm:[Replace Cluster Node,Corosync]

The five-step guide to replacing an existing cluster node:

. Make sure the old node is completely stopped
. Give the new machine the same hostname and IP address as the old one
. Install the cluster software :-)
. Copy '/etc/corosync/corosync.conf' and '/etc/corosync/authkey' (if it exists) to the new node
. Start the new cluster node

If a log message containing "Invalid digest" appears from Corosync,
the keys are not consistent between the machines.

=== Adding a New CMAN Node ===

indexterm:[CMAN,Add Cluster Node]
indexterm:[Add Cluster Node,CMAN]

=== Removing a CMAN Node ===

indexterm:[CMAN,Remove Cluster Node]
indexterm:[Remove Cluster Node,CMAN]

=== Adding a New Heartbeat Node ===

indexterm:[Heartbeat,Add Cluster Node]
indexterm:[Add Cluster Node,Heartbeat]

Provided you specified +autojoin any+ in 'ha.cf', adding a new node is
as simple as installing Heartbeat and copying 'ha.cf' and 'authkeys'
from an existing node.

If you don't want to use +autojoin+, then after setting up 'ha.cf' and
'authkeys', you must use `hb_addnode` before starting the new node.
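
Which of the two paths applies depends on what 'ha.cf' contains. As a rough sketch, the check below looks for an uncommented +autojoin any+ line and prints the matching next step. `has_autojoin_any` and `next_step` are hypothetical names, and the path to 'ha.cf' is left to the caller because it varies between installations.

```shell
# has_autojoin_any FILE: succeed if FILE has an uncommented "autojoin any" line.
has_autojoin_any() {
    grep -Eq '^[[:space:]]*autojoin[[:space:]]+any([[:space:]]|$)' "$1"
}

# next_step FILE NODE: print what to do before starting NODE.
next_step() {
    if has_autojoin_any "$1"; then
        echo "start $2: autojoin is set to any"
    else
        echo "run hb_addnode for $2 first"
    fi
}
```

The function only reports the next step; it does not call `hb_addnode` itself.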

=== Removing a Heartbeat Node ===

indexterm:[Heartbeat,Remove Cluster Node]
indexterm:[Remove Cluster Node,Heartbeat]

Because the messaging and membership layers are the authoritative
source for cluster nodes, deleting them from the CIB is not a reliable
solution.

First, one must arrange for Heartbeat to forget about the node (_pcmk-1_
in the example below).

On the host to be removed:

. Stop the cluster: `/etc/init.d/heartbeat stop`

Next, from one of the remaining active cluster nodes:

. Tell Heartbeat the node should be removed

. Tell Pacemaker to forget about the removed host:

NOTE: This procedure only works for versions after 1.1.8.

=== Replacing a Heartbeat Node ===

indexterm:[Heartbeat,Replace Cluster Node]
indexterm:[Replace Cluster Node,Heartbeat]

The seven-step guide to replacing an existing cluster node:

. Make sure the old node is completely stopped
. Give the new machine the same hostname as the old one
. Go to an active cluster node and look up the UUID for the old node in '/var/lib/heartbeat/hostcache'
. Install the cluster software
. Copy 'ha.cf' and 'authkeys' to the new node
. On the new node, populate its UUID using `crm_uuid -w` and the UUID from step 3
. Start the new cluster node