a linear, stripe, or snapshot device is located on the failed device
the command will not proceed without a '--force' option. The result
of using the '--force' option is the entire removal and complete
loss of the non-redundant logical volume. If an image or metadata area
of a RAID logical volume is on the failed device, the affected sub-LV is
replaced with an error target device - appearing as <unknown> in 'lvs'
output. RAID logical volumes cannot be completely repaired by vgreduce -
'lvconvert --repair' (listed below) must be used. Once this operation is
complete on volume groups not containing RAID logical volumes, the volume
group will again have a complete and consistent view of the devices it
contains. Thus, all operations will be permitted - including creation,
conversion, and resizing operations. It is currently the preferred method
to call 'lvconvert --repair' on the individual logical volumes to repair
them, followed by 'vgreduce --removemissing' to remove the failed physical
volume's representation from the volume group.
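
As a purely illustrative example - the volume group name 'vg00' here is
hypothetical - the sequence for a failed device holding only non-redundant
LVs might look like:

  # vgreduce --removemissing vg00          (refuses if LV data would be lost)
  # vgreduce --removemissing --force vg00  (also removes the affected LVs)
  # lvs -a -o +devices vg00                (verify nothing still maps to <unknown>)
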
- 'lvconvert --repair <VG/LV>': This action is designed specifically
to operate on individual logical volumes. If, for example, a failed
device happened to contain the images of four distinct mirrors, it would
be necessary to run 'lvconvert --repair' on each of them. The ultimate
result is to leave the faulty device in the volume group, but have no logical
volumes referencing it. (This allows 'vgreduce --removemissing' to
remove the physical volumes cleanly.) In addition to removing mirror or
RAID images that reside on failed devices, 'lvconvert --repair' can also
replace the failed device if there are spare devices available in the
volume group. If run from the command-line, the user is prompted whether
to simply remove the failed portions of the mirror or to also allocate a
replacement. Optionally, the '--use-policies' flag can be specified, which
will cause the operation not to prompt the user, but instead respect
the policies outlined in the LVM configuration file - usually,
/etc/lvm/lvm.conf. Once this operation is complete, the logical volumes
will be consistent. However, the volume group will still be inconsistent -
due to the referenced-but-missing device/PV - and operations will still be
restricted to the aforementioned actions until either the device is
restored or 'vgreduce --removemissing' is run.
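
For example (again with hypothetical names), if the failed device carried
images of two mirrored LVs in 'vg00', the repair sequence might be:

  # lvconvert --repair vg00/lv_mirror1
  # lvconvert --repair vg00/lv_mirror2
  # vgreduce --removemissing vg00

When run with '--use-policies', the answers are taken from lvm.conf instead
of a prompt; in recent LVM2 releases the relevant activation-section settings
look roughly like the following (check your installed lvm.conf for the exact
names and defaults):

  activation {
      mirror_image_fault_policy = "allocate"   # or "remove"
      mirror_log_fault_policy = "allocate"
  }
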
Automated Target Response to Failures:
--------------------------------------
The only LVM target types (i.e. "personalities") that have an automated
response to failures are the mirror and RAID logical volumes. The other target
types (linear, stripe, snapshot, etc) will simply propagate the failure.
[A snapshot becomes invalid if its underlying device fails, but the
origin will remain valid - presuming the origin device has not failed.]

Starting with the "mirror" segment type, there are three types of errors that
a mirror can suffer - read, write, and resynchronization errors. Each is
described in depth below.

Mirror read failures:
If a mirror is 'in-sync' (i.e. all images have been initialized and
choice of when to incur the extra performance costs of replacing
the failed image.

The appropriate time to take permanent corrective action on a mirror
should be driven by policy. There should be a directive that takes
a time or percentage argument. Something like the following:
- mirror_fault_policy_WHEN = "10sec"/"10%"
A time value would signal the amount of time to wait for transient
failures to resolve themselves. The percentage value would signal the
amount a mirror could become out-of-sync before the faulty device is
replaced.
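
A sketch of how such a directive might look if it were added to the
activation section of lvm.conf - 'mirror_fault_policy_WHEN' is a proposal
made by this document, not an existing configuration option:

  activation {
      # Hypothetical: wait 10 seconds for a transient failure to clear
      # before taking permanent corrective action ...
      mirror_fault_policy_WHEN = "10sec"
      # ... or, alternatively, act once the mirror is more than 10% out-of-sync:
      # mirror_fault_policy_WHEN = "10%"
  }
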
A mirror cannot be used unless /some/ corrective action is taken,
however. One option is to replace the failed mirror image with an
error target, forgo the use of 'handle_errors', and simply let the
out-of-sync regions accumulate and be tracked by the log. Mirrors
that have more than 2 images would have to "stack" to perform the
tracking, as each failed image would have to be associated with a
log. If the failure is transient, the device would replace the
error target that was holding its spot and the log that was tracking
the deltas would be used to quickly restore the portions that changed.

One unresolved issue with the above scheme is how to know which
regions of the mirror are out-of-sync when a problem occurs. When
a write failure occurs in the kernel, the log will contain those
regions that are not in-sync. If the log is a disk log, that log
could continue to be used to track differences. However, if the
log was a core log - or if the log device failed at the same time
as an image device - there would be no way to determine which
regions are out-of-sync to begin with as we start to track the
deltas for the failed image. I don't have a solution for this
problem other than to only be able to handle errors in this way
if conditions are right. These issues will have to be ironed out
before proceeding. This could be another case where it is better
to handle failures in the kernel by allowing the kernel to store
updates in various metadata areas.

RAID logical volume device failures are handled differently from the "mirror"
segment type. Discussion of this can be found in lvm2-raid.txt.
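
As with mirrors, policy-driven handling of RAID LV failures is configured in
lvm.conf; in recent releases the relevant activation-section setting looks
roughly like the following (see lvm2-raid.txt and your installed lvm.conf for
details):

  activation {
      raid_fault_policy = "allocate"   # or "warn"
  }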