# Description: Manages an Oracle Database as a High-Availability
# Author: Dejan Muhamedagic
# Support: linux-ha@lists.linux-ha.org
# License: GNU General Public License (GPL)
# Copyright: (C) 2006 International Business Machines, Inc.
# This code was inspired by the DB2 resource script
# written by Alan Robertson
# An example usage in /etc/ha.d/haresources:
# node1 10.0.0.170 oracle::RK1::/oracle/10.2::orark1
# See usage() function below for more details...
# OCF instance parameters:
# OCF_RESKEY_home (optional; else read it from /etc/oratab)
# OCF_RESKEY_user (optional; figure it out by checking file ownership)
# OCF_RESKEY_ipcrm (optional; defaults to "instance")
# OCF_RESKEY_clear_backupmode (optional; defaults to "false")
# OCF_RESKEY_shutdown_method (optional; defaults to "checkpoint/abort")
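#
# A minimal sketch of running this agent by hand, outside the cluster
# manager, e.g. while testing. The install path is an assumption (the
# usual location for heartbeat OCF agents; it may differ on your
# system). The SID, home and user are those of the haresources example
# above:
#
#	OCF_ROOT=/usr/lib/ocf \
#	OCF_RESKEY_sid=RK1 \
#	OCF_RESKEY_home=/oracle/10.2 \
#	OCF_RESKEY_user=orark1 \
#	    /usr/lib/ocf/resource.d/heartbeat/oracle monitor
#
# The exit code is then one of the OCF return codes, e.g. OCF_SUCCESS
# or OCF_NOT_RUNNING.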
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs
#######################################################################
methods=`oracle_methods`
methods=`echo $methods | tr ' ' '|'`
$0 manages an Oracle Database instance as an HA resource.
The 'start' operation starts the database.
The 'stop' operation stops the database.
The 'status' operation reports whether the database is running
The 'monitor' operation reports whether the database seems to be working
The 'dumpinstipc' operation prints IPC resources used by the instance
The 'cleanup' operation tries to clean up after Oracle was brutally stopped
The 'validate-all' operation reports whether the parameters are valid
The 'methods' operation reports on the methods $0 supports
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="oracle">
<version>1.0</version>
Resource script for oracle. Manages an Oracle Database instance
<shortdesc lang="en">Manages an Oracle Database instance</shortdesc>
<parameter name="sid" unique="1" required="1">
The Oracle SID (aka ORACLE_SID).
<shortdesc lang="en">sid</shortdesc>
<content type="string" default="" />
<parameter name="home" unique="0">
The Oracle home directory (aka ORACLE_HOME).
If not specified, then the SID along with its home should be listed in
<shortdesc lang="en">home</shortdesc>
<content type="string" default="" />
<parameter name="user" unique="0">
The Oracle owner (aka ORACLE_OWNER).
If not specified, then it is set to the owner of
the file \$ORACLE_HOME/dbs/*\${ORACLE_SID}.ora.
If this does not work for you, just set it explicitly.
<shortdesc lang="en">user</shortdesc>
<content type="string" default="" />
<parameter name="ipcrm" unique="0">
Sometimes IPC objects (shared memory segments and semaphores)
belonging to an Oracle instance might be left behind, which
prevents the instance from starting. It is not easy to figure out
which shared segments belong to which instance, in particular when
several instances are running as the same user.
What we use here is the "oradebug" feature and its "ipc" trace
utility. It is not optimal to parse the debugging information, but
I am not aware of any other way to find out about the IPC
information. In case the format or wording of the trace report
changes, parsing might fail. There are some precautions, however,
to prevent stepping on other people's toes. There is also a
dumpinstipc option which will make us print the IPC objects which
belong to the instance. Use it to see if we parse the trace file
Three settings are possible:
- none: don't mess with IPC and hope for the best (beware: you'll
probably be out of luck, sooner or later)
- instance: try to figure out the IPC stuff which belongs to the
instance and remove only those (default; should be safe)
- orauser: remove all IPC belonging to the user which runs the
instance (don't use this if you run more than one instance as the same
user or if other apps running as this user use IPC)
The default setting "instance" should be safe to use, but in that
case we cannot guarantee that the instance will start. In case IPC
objects were already left around, because, for instance, someone
mercilessly killed Oracle processes, there is no longer any way to
find out which IPC objects should be removed. In that case, human
intervention is necessary, and probably _all_ instances running as
the same user will have to be stopped. The third setting, "orauser",
guarantees IPC objects removal, but it does that based only on IPC
objects ownership, so you should use that only if every instance
runs as a separate user.
Please report any problems. Suggestions/fixes welcome.
<shortdesc lang="en">ipcrm</shortdesc>
<content type="string" default="instance" />
<parameter name="clear_backupmode" unique="0" required="0">
If true, clear Oracle's online backup mode (if active) before opening the database.
<shortdesc lang="en">clear_backupmode</shortdesc>
<content type="boolean" default="false" />
<parameter name="shutdown_method" unique="0" required="0">
How to stop Oracle is a matter of taste it seems. The default
method ("checkpoint/abort") is:
alter system checkpoint;
This should be the fastest safe way to bring the instance down. If
you find "shutdown abort" distasteful, set this attribute to
"immediate" in which case we will
If you still think that there is an even better way to shut down an
Oracle instance we are willing to listen.
<shortdesc lang="en">shutdown_method</shortdesc>
<content type="string" default="checkpoint/abort" />
<action name="start" timeout="120" />
<action name="stop" timeout="120" />
<action name="status" timeout="5" />
<action name="monitor" depth="0" timeout="30" interval="120" />
<action name="validate-all" timeout="5" />
<action name="methods" timeout="5" />
<action name="meta-data" timeout="5" />
# methods: What methods/operations do we support?
# Gather up information about our oracle instance
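#
# Sketch (not used by this agent): the /etc/oratab lookup that is done
# when gathering this information, written as a standalone helper. The
# function name and the optional file argument are hypothetical and
# exist only for illustration. Instead of a regex match on the SID, it
# compares the first colon-separated field literally and skips comment
# lines.
#
oratab_home_sketch() {
	# $1: Oracle SID; $2: oratab file (assumed default: /etc/oratab)
	awk -F: -v sid="$1" '
		/^#/      { next }             # skip comment lines
		$1 == sid { print $2; exit }   # first matching entry wins
	' "${2:-/etc/oratab}"
}
# e.g.: ORACLE_HOME=`oratab_home_sketch "$ORACLE_SID"`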
# get ORACLE_HOME from /etc/oratab if not set
[ x = "x$ORACLE_HOME" ] &&
	ORACLE_HOME=`awk -F: "/^$ORACLE_SID:/"'{print $2}' /etc/oratab`
# is there a better way to find out ORACLE_OWNER?
[ x = "x$ORACLE_OWNER" ] &&
	ORACLE_OWNER=`ls -ld $ORACLE_HOME/. 2>/dev/null | awk 'NR==1{print $3}'`
sqlplus=$ORACLE_HOME/bin/sqlplus
lsnrctl=$ORACLE_HOME/bin/lsnrctl
tnsping=$ORACLE_HOME/bin/tnsping
# Let's make sure a few important things are set...
if [ x = "x$ORACLE_HOME" ]; then
ocf_log info "ORACLE_HOME not set"
return $OCF_ERR_CONFIGURED
if [ x = "x$ORACLE_OWNER" ]; then
ocf_log info "ORACLE_OWNER not set"
return $OCF_ERR_CONFIGURED
# and some important things are there
if [ ! -x "$sqlplus" ]; then
ocf_log info "$sqlplus does not exist"
return $OCF_ERR_INSTALLED
if [ ! -x "$lsnrctl" ]; then
ocf_log err "$lsnrctl does not exist"
return $OCF_ERR_INSTALLED
if [ ! -x "$tnsping" ]; then
ocf_log err "$tnsping does not exist"
return $OCF_ERR_INSTALLED
LD_LIBRARY_PATH=$ORACLE_HOME/lib
LIBPATH=$ORACLE_HOME/lib
TNS_ADMIN=$ORACLE_HOME/network/admin
PATH=$ORACLE_HOME/bin:$ORACLE_HOME/dbs:$PATH
export ORACLE_SID ORACLE_HOME ORACLE_OWNER TNS_ADMIN
export LD_LIBRARY_PATH LIBPATH
PATH=$ORACLE_HOME/bin:$ORACLE_HOME/dbs:$PATH
ORACLE_SID=$ORACLE_SID
ORACLE_HOME=$ORACLE_HOME
ORACLE_OWNER=$ORACLE_OWNER
LD_LIBRARY_PATH=$ORACLE_HOME/lib
LIBPATH=$ORACLE_HOME/lib
TNS_ADMIN=$ORACLE_HOME/network/admin
export ORACLE_SID ORACLE_HOME ORACLE_OWNER TNS_ADMIN
export LD_LIBRARY_PATH LIBPATH
# Run commands as the Oracle owner...
if [ "$US" = "$ORACLE_OWNER" ]; then
su - $ORACLE_OWNER -c ". $envtmpf; $sqlplus -S /nolog"
# Run commands in the oracle admin sqlplus...
echo "connect / as sysdba"
echo "set feedback off"
echo "set heading off"
echo "set pagesize 0"
for func; do $func; done
execsql | grep -v '^Connected' |
# use dbasql_one if the query should result in a single line output
# at times people stuff commands in oracle .profile
# which may produce extra output
# various interesting sql
echo 'select status from v$instance;'
echo 'alter database mount;'
echo 'alter database open;'
echo 'shutdown immediate'
dbstop_checkpoint_abort() {
echo 'alter system checkpoint;'
echo 'shutdown abort'
case "${shutdown_method}" in
dbstop_checkpoint_abort
echo 'alter database end backup;'
echo "select 'COUNT'||count(*) from v\$backup where status='ACTIVE';"
is_clear_backupmode_set(){
[ x"${clear_backupmode}" = x"true" ]
is_instance_in_backup_mode() {
count="`dbasql_one db_backup_mode | sed 's/COUNT//'`"
[ x"$count" != x"0" ]
clear_backup_mode() {
output="`dbasql dbendbackup`"
ocf_log info "Oracle instance $ORACLE_SID alter database end backup: $output"
#echo 'select value from v$parameter where name = \'user_dump_dest\';'
echo "select value from v\$parameter where name = 'user_dump_dest';"
echo "oradebug setmypid"
# print the output of dbstat (for debugging)
echo "Stripped output:"
echo "<`dbasql dbstat`>"
# IPC stuff: not overly complex, but quite involved :-/
local dumpdest=`dbasql_one getdumpdest`
if [ "x$dumpdest" = x -o ! -d "$dumpdest" ]; then
ocf_log warn "$dumpdest is not a directory"
local fcount=`ls -rt $dumpdest | wc -l`
output=`dbasql getipc`
local lastf=`ls -rt $dumpdest | grep -v '^\.*$' | tail -1`
local fcount2=`ls -rt $dumpdest | wc -l`
if [ $((fcount+1)) -eq $fcount2 ]; then
echo $dumpdest/$lastf
ocf_log warn "'dbasql getipc' failed: $output"
test -f "$1" || return 1
$3 == "Shmid" {n=1;next}
if( $3~/^[0-9]+$/ ) print $3;
sort -u | sed 's/^/m:/'
/Semaphore List/ {insems=1;next}
for( i=1; i<=NF; i++ )
if( $i~/^[0-9]+$/ ) print $i;
/system semaphore information/ {exit}
sort -u | sed 's/^/s:/'
# Part 2: OS (ipcs,ipcrm)
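#
# Sketch (not used by this agent): Part 1 emits IPC objects as
# "type:id" (m: for shared memory, s: for semaphores), and Part 2
# splits each one into a type and an id before removing it. The helper
# below shows that split with plain parameter expansion instead of sed.
# Its name is hypothetical.
#
split_ipcobj_sketch() {
	# $1: IPC object such as "m:123"; prints type and id separated by a space
	echo "${1%%:*} ${1#*:}"
}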
filteroraipc() { # is this portable?
grep -w $ORACLE_OWNER | awk '{print $2}'
m) echo "shared memory segment";;
s) echo "semaphore";;
q) echo "message queue";;
ipcs -$what | filteroraipc | grep -w $id >/dev/null 2>&1 ||
ocf_log info "Removing `ipcdesc $what` $id."
for what in m s q; do
for id in `ipcs -$what | filteroraipc`; do
rmipc `echo $ipcobj | sed 's/:/ /'`
# oracle_status: is the Oracle instance running?
# quick check to see if the instance is up
ps -ef | grep -wiqs "[^ ]*[_]pmon_${ORACLE_SID}"
# instance in OPEN state?
local status=`dbasql_one dbstat`
if [ "$status" = OPEN ]; then
ocf_log info "$ORACLE_SID instance state is not OPEN (dbstat output: $status)"
#rm -fr /tmp/.oracle #???
rm -f `ls $ORACLE_HOME/dbs/lk* | grep -i $ORACLE_SID`
ocf_log warn "bad usage: ipcrm set to $IPCRM"
# oracle_start: Start the Oracle instance
# NOTE: We handle instance in the MOUNTED and STARTED states
# We *do not* handle instance in the restricted or read-only
# mode, i.e. it appears as running, but its availability is
# "not for general use"
if is_oracle_up; then
status="`dbasql_one dbstat`"
: nothing to be done, we can leave right now
ocf_log info "Oracle instance $ORACLE_SID already running"
output=`dbasql dbmount`
: we proceed if mounted
output=`dbasql dbstop dbstart_mount`
output="`dbasql dbstart_mount`"
# try to cleanup in case of
# ORA-01081: cannot start already-running ORACLE - shut it down first
if echo "$output" | grep ORA-01081 >/dev/null 2>&1; then
ocf_log info "ORA-01081 error found, trying to cleanup oracle (dbstart_mount output: $output)"
output=`dbasql dbstart_mount`
# oracle instance should be mounted.
status="`dbasql_one dbstat`"
ocf_log err "oracle $ORACLE_SID can not be mounted (status: $status)"
return $OCF_ERR_GENERIC
# Check whether the instance is in online backup mode and,
# if clear_backupmode is set, clear that mode.
# Afterwards, the DB is opened.
if is_clear_backupmode_set && is_instance_in_backup_mode; then
output=`dbasql dbopen`
if ! is_oracle_up; then
ocf_log err "oracle process not running: $output"
return $OCF_ERR_GENERIC
elif ! instance_live; then
ocf_log err "oracle instance $ORACLE_SID not started: $output"
return $OCF_ERR_GENERIC
: cool, we are up and running
ocf_log info "Oracle instance $ORACLE_SID started: $output"
# oracle_stop: Stop the Oracle instance
local status output ipc=""
if is_oracle_up; then
[ "$IPCRM" = "instance" ] && ipc=$(parseipc `dumpinstipc`)
output=`dbasql dbstop`
ocf_log info "Oracle instance $ORACLE_SID already stopped"
ora_kill # kill any processes left
if is_oracle_up; then
ocf_log err "Oracle instance $ORACLE_SID not stopped: $output"
return $OCF_ERR_GENERIC
ocf_log info "Oracle instance $ORACLE_SID stopped: $output"
sleep 1 # give em a chance to cleanup
ocf_log info "Cleaning up for $ORACLE_SID"
# kill the database processes (if any left)
# give them 30 secs to exit cleanly (6 times 5)
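#
# Sketch (not used by this agent): the TERM, wait, KILL escalation
# described above as one self-contained helper. The name and the
# tries/delay arguments are hypothetical; the agent itself relies on
# its own killprocs and ora_kill functions.
#
term_then_kill_sketch() {
	# $1: number of checks; $2: seconds between checks; rest: PIDs
	_tk_tries=$1; _tk_delay=$2; shift 2
	[ $# -gt 0 ] || return 0
	kill -s TERM "$@" 2>/dev/null
	while [ "$_tk_tries" -gt 0 ]; do
		_tk_alive=0
		for _tk_pid in "$@"; do
			kill -0 "$_tk_pid" 2>/dev/null && _tk_alive=1
		done
		[ "$_tk_alive" -eq 0 ] && return 0	# everything has exited
		sleep "$_tk_delay"
		_tk_tries=$((_tk_tries - 1))
	done
	kill -s KILL "$@" 2>/dev/null	# still there: force it
	return 0
}
# e.g.: term_then_kill_sketch 6 5 $oraprocs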
kill -s $sig $* >/dev/null
oraprocs=`eval $procs | awk '{print $1}'`
if [ -z "$oraprocs" ]; then
ocf_log debug "All oracle processes are already stopped."
killprocs TERM $oraprocs
for i in 1 2 3 4 5; do
if [ -z "`eval $procs | awk '{print $1}'`" ]; then
ocf_log debug "All oracle processes are killed."
killprocs KILL `eval $procs | awk '{print $1}'`
# oracle_monitor: Can the Oracle instance do anything useful?
if ! is_oracle_up; then
ocf_log info "oracle process not running"
return $OCF_NOT_RUNNING
if ! instance_live; then
ocf_log info "oracle instance $ORACLE_SID is down"
return $OCF_NOT_RUNNING
#ocf_log info "Oracle instance $ORACLE_SID is alive"
# 'main' starts here...
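#
# Sketch (not used by this agent): a check for the two shutdown methods
# that the meta-data above documents, "checkpoint/abort" (the default)
# and "immediate". The function name is hypothetical.
#
valid_shutdown_method_sketch() {
	case "$1" in
	checkpoint/abort|immediate) return 0 ;;
	*) return 1 ;;
	esac
}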
# These operations don't require OCF instance parameters to be set
methods) oracle_methods
clear_backupmode=${OCF_RESKEY_clear_backupmode:-"false"}
shutdown_method=${OCF_RESKEY_shutdown_method:-"checkpoint/abort"}
case "${shutdown_method}" in
"checkpoint/abort") ;;
*) ocf_log err "unsupported shutdown_method, please read meta-data"
if [ x = "x$OCF_RESKEY_sid" ]
ocf_log err "Please set OCF_RESKEY_sid to the Oracle SID!"
ora_info "$OCF_RESKEY_sid" "$OCF_RESKEY_home" "$OCF_RESKEY_user"
if [ $rc -ne 0 ]; then
ocf_log info "Oracle environment for SID $ORACLE_SID does not exist"
stop) exit $OCF_SUCCESS;;
monitor) exit $OCF_NOT_RUNNING;;
status) exit $LSB_STATUS_STOPPED;;
ocf_log err "Oracle environment for SID $ORACLE_SID broken"
setoraenv # important: set the environment for the SID
dumporaenv > $envtmpf
trap "rm -f $envtmpf" EXIT
procs="ps -e -o pid,args | grep -i \"[o]ra.*$ORACLE_SID\""
if [ "$US" != root -a "$US" != "$ORACLE_OWNER" ]
ocf_log err "$0 must be run as root or $ORACLE_OWNER"
if [ x = "x$OCF_RESKEY_ipcrm" ]
IPCRM="$OCF_RESKEY_ipcrm"
# What kind of method was invoked?
status) if is_oracle_up
echo Oracle instance $ORACLE_SID is running
echo Oracle instance $ORACLE_SID is stopped
exit $OCF_NOT_RUNNING
is_oracle_up && parseipc `dumpinstipc`
if [ "$IPCRM" = "instance" ]; then
ora_cleanup $(parseipc `dumpinstipc`)
monitor) oracle_monitor
validate-all) # OCF_RESKEY_sid was already checked by testoraenv(),
# just exit successfully here.
exit $OCF_ERR_UNIMPLEMENTED;;
# vim:tabstop=4:shiftwidth=4:textwidth=0:wrapmargin=0