Profiling the N1GE 6 System
This document gives a brief summary of the profiling capabilities built
into the Grid Engine software. The profiling facility is one component of
the troubleshooting and profiling environment. In addition to the profiling
component, Grid Engine provides: health monitoring through qping, monitoring
for the qmaster, and monitoring for the scheduler.
The profiling facility can be used to analyse bottlenecks within or outside
the software and to provide guidance for tuning Grid Engine.

The format of the profiling output is not fixed and can change from one
version to another, or even between updates of the same version. It is used
as a troubleshooting tool and will be adapted to new needs whenever they
arise. This document is based on the profiling output of SGE 6.0u8.
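Scheduler profiling is switched on through the scheduler configuration. In SGE 6.0 this is typically done by adding PROFILE=1 to the params attribute; treat the exact parameter name as an assumption and check it against the sched_conf(5) man page of your release:

```
# Enable scheduler profiling (sketch, see sched_conf(5)):
#   qconf -msconf
# then set in the configuration:
params    PROFILE=1
```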
output: <SGE_ROOT>/<CELL>/spool/qmaster/schedd/messages

[1] PROF: sge_mirror processed 30 events in 0.000 s
[2] PROF: static urgency took 0.000 s
[3] PROF: job ticket calculation: init: 0.000 s, pass 0: 0.000 s,
    pass 1: 0.000, pass2: 0.000, calc: 0.000 s
[3a] PROF: job ticket calculation: init: 0.000 s, pass 0: 0.000 s,
    pass 1: 0.000, pass2: 0.000, calc: 0.000 s
[4] PROF: normalizing job tickets took 0.000 s
[5] PROF: create active job orders: 0.000 s
[6] PROF: job-order calculation took 0.010 s
[7] PROF: job sorting took 0.000 s
[8] PROF: job dispatching took 0.000 s (1 fast, 0 comp, 0 pe, 0 res)
[9] PROF: create pending job orders: 0.000 s
[10] PROF: scheduled in 0.010 (u 0.010 + s 0.000 = 0.010):
     1 sequential, 0 parallel, 4 orders, 2 H, 4 Q, 5 QA,
     0 J(qw), 1 J(r), 0 J(s), 0 J(h), 0 J(e), 0 J(x), 1 J(all),
     52 C, 4 ACL, 2 PE, 2 U, 2 D, 96 PRJ, 1 ST, 0 CKPT, 0 RU,
     1 gMes, 0 jMes, 2/1 pre-send, 0/0/0 pe-alg
[11] PROF: send orders and cleanup took: 0.100 (u 0.000,s 0.000) s
[12] PROF: schedd run took: 0.120 s (init: 0.000 s, copy: 0.010 s, run:0.110,
     free: 0.000 s, jobs: 1, categories: 1/1)
[1] - Is generated by the event processing.
      <NR> events - number of events processed (received from the qmaster)
      time s - time needed to update the internal data

[2] - Time needed to compute the urgency policy for all jobs in the system.

[3] - Time needed to compute the different steps of the ticket policy:
      init - time needed to set up the internal data structures
      pass 0 - time for data preparation (first pass)
      pass 1 - time for data preparation (second pass)
      pass 2 - time for data preparation (third pass)
      calc - time for the ticket calculation

[3a] - Same as [3]. This step is used to correct the tickets for the running jobs.

[4] - Time needed to normalize the tickets to a range of 0 to 1.

[5] - Time needed to create ticket update orders for the running jobs and send them.

[6] - Time needed to compute the priority for each job; includes steps [2], [3], [3a], and [4].
[7] - Time needed to sort the job list to reflect the job priority.

[8] - Time needed to assign jobs to the compute resources and send the job start
      orders to the qmaster. The following numbers show the jobs that were assigned:
      <NR> fast - number of sequential jobs without soft requests
      <NR> comp - number of sequential jobs with soft requests
      <NR> pe - number of parallel jobs
      <NR> res - number of reservations
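As an illustration of how line [8] can be consumed, here is a small Python sketch that pulls the dispatch time and the four assignment counters out of such a line. The field layout is assumed to match the 6.0u8 sample above; as noted earlier, the format may change between versions.

```python
import re

def parse_dispatching(line):
    """Parse a 'PROF: job dispatching took ...' line into its fields."""
    m = re.search(
        r"job dispatching took ([\d.]+) s "
        r"\((\d+) fast, (\d+) comp, (\d+) pe, (\d+) res\)", line)
    secs, fast, comp, pe, res = m.groups()
    return {"seconds": float(secs), "fast": int(fast),
            "comp": int(comp), "pe": int(pe), "res": int(res)}

sample = "PROF: job dispatching took 0.000 s (1 fast, 0 comp, 0 pe, 0 res)"
print(parse_dispatching(sample))
```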
[9] - Time needed to create the priority update orders for all pending jobs.

[10] - Time needed to schedule the jobs. The output covers all steps from [2] to
       [9]. The time shows the wall-clock time as well as the user and system time.
       <NR> sequential - number of sequential jobs started
       <NR> parallel - number of parallel jobs started
       <NR> orders - number of orders generated
       <NR> H - number of hosts in the grid
       <NR> Q - number of available queue instances in the grid
       <NR> QA - number of all queue instances in the grid
       <NR> J(xxx) - number of jobs in the different states in the grid
       <NR> C - number of complex attributes in the grid
       <NR> ACL - number of access lists in the grid
       <NR> PE - number of parallel environment objects in the grid
       <NR> U - number of users in the grid
       <NR> D - number of departments in the grid
       <NR> PRJ - number of projects in the grid
       <NR> ST - number of share trees in the grid
       <NR> CKPT - number of checkpointing objects in the grid
       <NR> RU - number of jobs running per user
       <NR> gMes - number of global messages created in this scheduling run
       <NR> jMes - number of job-related messages created in this scheduling run
       <1/2> pre-send - shows the orders sent during the dispatching run [8]
            1 - number of orders sent
       <1/2/3> pe-alg - shows which pe-range algorithm was used during this scheduling run:
            1 - lowest pe range first
            3 - highest pe range first
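The "<NR> <label>" counters in line [10] can be collected mechanically. A hedged Python sketch, assuming the 6.0u8 layout shown above (the composite "2/1 pre-send" and "0/0/0 pe-alg" fields are deliberately skipped, since they carry their own slash notation):

```python
import re

def prof_sched_counters(line):
    """Collect the '<NR> <label>' counters from a 'PROF: scheduled in' line."""
    pairs = re.findall(
        r"(?<![\d/])(\d+) ([A-Za-z]+(?:\([a-z]+\))?)(?=[,\s]|$)", line)
    return {label: int(nr) for nr, label in pairs}

sample = ("PROF: scheduled in 0.010 (u 0.010 + s 0.000 = 0.010): "
          "1 sequential, 0 parallel, 4 orders, 2 H, 4 Q, 5 QA, "
          "0 J(qw), 1 J(r), 0 J(s), 0 J(h), 0 J(e), 0 J(x), 1 J(all), "
          "52 C, 4 ACL, 2 PE, 2 U, 2 D, 96 PRJ, 1 ST, 0 CKPT, 0 RU, "
          "1 gMes, 0 jMes, 2/1 pre-send, 0/0/0 pe-alg")
counters = prof_sched_counters(sample)
print(counters["H"], counters["PRJ"], counters["J(r)"])
```

A dictionary like this makes it easy to watch, for example, the pending-job count J(qw) or the queue-instance count QA drift across scheduling runs.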
[11] - Time needed to send the left-over orders to the qmaster and wait until
       they are processed. In addition, the scheduler cleans up.

[12] - Time needed for the entire scheduling run ([1] to [11]).
       init <NR> - time needed to set up the scheduler for the run
       copy <NR> - time needed to replicate the entire data set
       run <NR> - time needed for the scheduling run
       free <NR> - time needed to free the duplicated data set
       jobs <NR> - number of jobs in the system before the duplication / filtering
       categories <1/2> - number of categories in the system
            1 - number of job categories
            2 - number of job priority categories
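Line [12] gives the overall timing budget of a run, and its sub-times (init, copy, run, free) should roughly add up to the total. A small Python sketch that splits such a line into its fields, assuming the 6.0u8 layout of the sample above:

```python
import re

def parse_schedd_run(line):
    """Split a 'PROF: schedd run took' line into its timing fields."""
    fields = {"total": float(re.search(r"took:\s*([\d.]+)", line).group(1))}
    for key in ("init", "copy", "run", "free"):
        fields[key] = float(re.search(key + r":\s*([\d.]+)", line).group(1))
    fields["jobs"] = int(re.search(r"jobs:\s*(\d+)", line).group(1))
    fields["categories"] = tuple(
        int(n) for n in re.search(r"categories:\s*(\d+)/(\d+)", line).groups())
    return fields

sample = ("PROF: schedd run took: 0.120 s (init: 0.000 s, copy: 0.010 s, "
          "run:0.110, free: 0.000 s, jobs: 1, categories: 1/1)")
print(parse_schedd_run(sample))
```

In the sample, init + copy + run + free = 0.000 + 0.010 + 0.110 + 0.000 = 0.120 s, matching the total; a large gap between the two would itself be worth investigating.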
The qmaster allows profiling of its different threads. The following
qmaster_params switch profiling on (1) or off (0) for the individual threads:

qmaster_params PROF_SIGNAL=[0,1], - profile the signal handling thread
               PROF_MESSAGE=[0,1], - profile the message processing thread
               PROF_DELIVER=[0,1], - profile the event delivery thread
               PROF_TEVENT=[0,1] - profile the timed event thread

output: <SGE_ROOT>/<CELL>/spool/qmaster/messages
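A sketch of how these parameters might be set in the global cluster configuration; the exact syntax, and whether the change takes effect without a qmaster restart, should be checked against the sge_conf(5) man page of your release:

```
# Enable profiling for two qmaster threads (sketch, see sge_conf(5)):
#   qconf -mconf
# then set:
qmaster_params    PROF_MESSAGE=1,PROF_DELIVER=1
```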