		crashdc : an automated collection tool
		======================================

crashdc is an automated collection tool. It is intended to be run whenever
a new vmcore file is generated in order to collect basic crash dump data for 
offline analysis. It can also be executed interactively on existing dump files. 

0. Preliminary warning
======================
This is the beta release of crashdc. Please report any issues you find or
comments you may have. We definitely want to hear about any
issue/comment/rant/praise you would like to submit. You can send these to
crashdc-devel@lists.sourceforge.net

1. Introduction
===============
Crash dump files (i.e. vmcore) are becoming larger and larger. It is not
uncommon to encounter files larger than 16 GB, which makes it difficult to
transfer them to a vendor's facilities for analysis. Yet sometimes only a few
standard crash commands are necessary to get a good idea of what caused the
crash.

crashdc is meant to run automatically after creation of the vmcore file. It will
gather the main crash data elements and transfer them into a text file. Normally,
this is only done on the most recent vmcore generated.
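
As an illustration of what "the most recent vmcore" means, here is a minimal
sketch that picks the newest file named vmcore below a dump directory. It is
not crashdc's actual selection logic: the helper name latest_vmcore and the
directory layout are assumptions, and the -printf option requires GNU find.

```shell
#!/bin/sh
# Hypothetical helper (not part of crashdc): print the path of the most
# recently modified file named "vmcore" below the directory given as $1.
latest_vmcore() {
    # %T@ is the modification time in seconds since the epoch (GNU find).
    find "$1" -type f -name vmcore -printf '%T@ %p\n' 2>/dev/null |
        sort -n |        # oldest first, newest last
        tail -n 1 |      # keep only the newest entry
        cut -d' ' -f2-   # drop the timestamp, keep the path
}
```

For example, latest_vmcore /var/crash would print the newest dump, assuming
dumps are saved under /var/crash on the system at hand.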

But when invoked manually using the init.d script with the 'generate' keyword,
it is possible to generate specific reports, using specific modes supplied on
the command line.

2. crashdc operation
====================
The main use of crashdc is to automate the collection of the basic data elements
present in a vmcore file. Its execution can be automated using one of these two
methods:

	* kdump post-save trigger
	* init script

While the kdump method is better integrated in the dump procedure, it can be
limiting, especially since it runs within the kexec reserved space. For
instance, it may be necessary to reserve up to 256 MB of kexec space (SLES11) in
order for crashdc to run properly. This might prove impossible on some
systems with a limited amount of memory.

If this happens, the init script method is a better choice, as it runs during
a normal reboot, late in the boot process, and doesn't require an increase in
kexec memory reservation. It may also be the only possible method in
environments where kexec/kdump is not available at all (e.g. RHEL4).
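
To judge whether the kdump method is workable on a given system, it helps to
know how much memory is currently reserved. The reservation is normally
requested with the crashkernel= kernel parameter, so a minimal sketch can
extract it from a kernel command line. The helper name crashkernel_value is
hypothetical and not part of crashdc.

```shell
#!/bin/sh
# Hypothetical helper (not part of crashdc): print the value of the first
# crashkernel= parameter found in the kernel command line string $1.
# Prints nothing if no reservation was requested.
crashkernel_value() {
    printf '%s\n' "$1" |
        tr ' ' '\n' |                     # one parameter per line
        sed -n 's/^crashkernel=//p' |     # keep only the reservation value
        head -n 1
}
```

On a running system it could be called as
crashkernel_value "$(cat /proc/cmdline)". An empty result means that no kexec
space was requested at boot.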

Automatic execution of crashdc is not required, however. It is possible to use
the init script manually to create crash-data-{date}.txt reports. The same
method can also be used to override the default mode (as defined in
/etc/sysconfig/crashdc) or to provide custom-made crash commands through a file.

Finally, crashdc itself can be used as a command-line tool in situations where
the debuginfo RPM cannot be installed, the kernel is in a non-default location,
or a non-standard environment is required.

For further details on each of these commands, refer to:

	* crashdc(8) : The crashdc command
	* crashdc(7) : The init script
	* crashdc(5) : The configuration file

3. crashdc testing context
==========================
crashdc is currently tested in a limited environment, which consists of standard
installs of RHEL5, SLES10 and SLES11 in virtual machines. No specific
configuration is done except for what is described here.

4. crashdc known limitations
============================

4.1 Local storage only
----------------------
So far, crashdc has been tested on local storage only. This means that it might
not work at all using NFS network storage (or CIFS on SLES). It will not work at
all with ftp/scp, since the vmcore file is sent to another host. To use crashdc
in that case, you have to install it on the remote server where the vmcore file
is stored, and you will not be able to use the automated method.

You can still use the manual method to generate the crash-data-{date}.txt file.

4.2 Same kernel type
--------------------
When the /etc/init.d/crashdc script is invoked manually to generate the
crash-data-{date}.txt file, it assumes that the booted kernel is the same as
the one that generated the vmcore file. If they differ, an error is displayed
and the command fails.
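
The same condition can be checked by hand before invoking the script. The
sketch below compares a kernel release string recorded for the dump with the
running kernel. The helper name kernel_matches is hypothetical, and obtaining
the release of the dumped kernel is left to the reader. This is not the check
that the init script itself performs.

```shell
#!/bin/sh
# Hypothetical pre-check (not part of crashdc).
# $1: kernel release recorded for the dump
# $2: release to compare against (defaults to the running kernel)
kernel_matches() {
    running=${2:-$(uname -r)}
    if [ "$1" = "$running" ]; then
        return 0
    fi
    echo "kernel mismatch: dump=$1 running=$running" >&2
    return 1
}
```

A call such as kernel_matches 2.6.18-8.el5 (an example release string)
succeeds only when the running kernel reports the same release.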