~ubuntu-branches/ubuntu/utopic/gridengine/utopic

Viewing changes to doc/devel/rfe/spooling.html

  • Committer: Bazaar Package Importer
  • Author(s): Mark Hymers
  • Date: 2008-06-25 22:36:13 UTC
  • Revision ID: james.westby@ubuntu.com-20080625223613-tvd9xlhuoct9kyhm
Tags: upstream-6.2~beta2
Import upstream version 6.2~beta2

 
1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
 
2
<HTML>
 
3
<HEAD>
 
4
        <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=iso-8859-1">
 
5
        <TITLE>Spooling framework</TITLE>
 
6
        <META NAME="GENERATOR" CONTENT="StarOffice 6.0  (Solaris Sparc)">
 
7
        <META NAME="CREATED" CONTENT="20020524;12211900">
 
8
        <META NAME="CHANGEDBY" CONTENT="Joachim Gabler">
 
9
        <META NAME="CHANGED" CONTENT="20020621;13010600">
 
10
        <META NAME="CLASSIFICATION" CONTENT="Analysis and Redesign">
 
11
        <META NAME="DESCRIPTION" CONTENT="Analysis of the current spooling functionality and possibilities for a redesign">
 
12
        <STYLE>
 
13
        <!--
 
14
                @page { size: 21.59cm 27.94cm; margin-left: 3.18cm; margin-right: 3.18cm; margin-top: 2.54cm; margin-bottom: 2.54cm }
 
15
                TD P { margin-bottom: 0.21cm }
 
16
                P { margin-bottom: 0.21cm }
 
17
                H2.western { font-family: "Albany", sans-serif; font-size: 14pt; font-style: italic }
 
18
                H2.cjk { font-family: "MSung Light SC"; font-size: 14pt; font-style: italic }
 
19
                H2.ctl { font-size: 14pt; font-style: italic }
 
20
                H3.western { font-family: "Albany", sans-serif }
 
21
                H3.cjk { font-family: "MSung Light SC" }
 
22
                H4.western { font-family: "Albany", sans-serif; font-size: 11pt; font-style: italic }
 
23
                H4.cjk { font-family: "MSung Light SC"; font-size: 11pt; font-style: italic }
 
24
                H4.ctl { font-size: 11pt; font-style: italic }
 
25
                H5.western { font-family: "Albany", sans-serif; font-size: 11pt }
 
26
                H5.cjk { font-family: "MSung Light SC"; font-size: 11pt }
 
27
                H5.ctl { font-size: 11pt }
 
28
                TH P { margin-bottom: 0.21cm; font-style: italic }
 
29
        -->
 
30
        </STYLE>
 
31
</HEAD>
 
32
<BODY LANG="de-LU">
 
33
<H1>Spooling framework</H1>
 
34
<H2 CLASS="western">Idea</H2>
 
35
<P>Spooling is done through a spooling framework that can have
different implementations, e.g. spooling in ascii files, in a database
...</P>
 
38
<P>As a first step, spooling for monitoring and accounting is done in
a separate event client that subscribes to a set of object types
and simply spools them through the spooling framework.</P>
 
41
<P>Qmaster still spools its own ascii files. If the spooling framework
proves to be stable, qmaster is switched to the spooling framework and
the Grid Engine admin decides which spooling type to use.</P>
 
44
<P>If qmaster is set to spool into a database and a common production
and reporting database is used, the event client is not needed.</P>
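<P>The idea can be illustrated with a minimal Python sketch. It is not
Grid Engine code: the names SpoolingBackend, AsciiFileBackend,
InMemoryBackend and EventClient are hypothetical, and the in-memory
class only stands in for a database implementation. It shows one
interface with two implementations and an event client that spools
only the object types it subscribed to.</P>

```python
import os
import tempfile
from abc import ABC, abstractmethod


class SpoolingBackend(ABC):
    """Hypothetical interface that every spooling implementation provides."""

    @abstractmethod
    def write(self, object_type, key, attributes):
        """Spool one object given as a dict of attribute names to values."""

    @abstractmethod
    def read(self, object_type, key):
        """Return the spooled attributes of one object as a dict of strings."""


class AsciiFileBackend(SpoolingBackend):
    """One ascii file per object, one 'name value' pair per line."""

    def __init__(self, base_dir):
        self.base_dir = base_dir

    def _path(self, object_type, key):
        return os.path.join(self.base_dir, object_type, key)

    def write(self, object_type, key, attributes):
        os.makedirs(os.path.join(self.base_dir, object_type), exist_ok=True)
        with open(self._path(object_type, key), "w", encoding="ascii") as spool_file:
            for name, value in attributes.items():
                spool_file.write(f"{name} {value}\n")

    def read(self, object_type, key):
        attributes = {}
        with open(self._path(object_type, key), encoding="ascii") as spool_file:
            for line in spool_file:
                name, _, value = line.rstrip("\n").partition(" ")
                attributes[name] = value
        return attributes


class InMemoryBackend(SpoolingBackend):
    """Stand-in for a database implementation; keeps objects in a dict."""

    def __init__(self):
        self.rows = {}

    def write(self, object_type, key, attributes):
        self.rows[(object_type, key)] = {n: str(v) for n, v in attributes.items()}

    def read(self, object_type, key):
        return dict(self.rows[(object_type, key)])


class EventClient:
    """Forwards events for subscribed object types to a spooling backend."""

    def __init__(self, backend, subscribed_types):
        self.backend = backend
        self.subscribed_types = set(subscribed_types)

    def handle_event(self, object_type, key, attributes):
        if object_type not in self.subscribed_types:
            return False  # not subscribed: nothing is spooled
        self.backend.write(object_type, key, attributes)
        return True


# Usage: the same client code works with either implementation.
file_client = EventClient(AsciiFileBackend(tempfile.mkdtemp()), ["queue"])
memory_client = EventClient(InMemoryBackend(), ["queue"])
for client in (file_client, memory_client):
    client.handle_event("queue", "example.q", {"slots": 4, "hostname": "node1"})
    client.handle_event("job", "1", {"owner": "alice"})  # ignored
```

<P>Switching the spooling type then only means handing a different
backend object to the client, which is the decision the text leaves to
the Grid Engine admin.</P>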
 
46
<P><BR><BR>
 
47
</P>
 
48
<H2 CLASS="western">Spooled Objects &ndash; current implementation</H2>
 
49
<P>There is one implementation for each object type. Most objects
are read through a common function, read_object.</P>
 
51
<TABLE WIDTH=100% BORDER=1 BORDERCOLOR="#000000" CELLPADDING=4 CELLSPACING=0>
 
52
        <COL WIDTH=39*>
 
53
        <COL WIDTH=80*>
 
54
        <COL WIDTH=74*>
 
55
        <COL WIDTH=64*>
 
56
        <THEAD>
 
57
                <TR VALIGN=TOP>
 
58
                        <TH WIDTH=15%>
 
59
                                <P>Object</P>
 
60
                        </TH>
 
61
                        <TH WIDTH=31%>
 
62
                                <P>Implementation</P>
 
63
                        </TH>
 
64
                        <TH WIDTH=29%>
 
65
                                <P>Structure</P>
 
66
                        </TH>
 
67
                        <TH WIDTH=25%>
 
68
                                <P>Comment</P>
 
69
                        </TH>
 
70
                </TR>
 
71
        </THEAD>
 
72
        <TBODY>
 
73
                <TR VALIGN=TOP>
 
74
                        <TD WIDTH=15%>
 
75
                                <P>Accounting</P>
 
76
                        </TD>
 
77
                        <TD WIDTH=31%>
 
78
                                <P>daemons/qmaster/job_exit.c,</P>
 
79
                                <P>clients/qacct/qacct.c</P>
 
80
                        </TD>
 
81
                        <TD WIDTH=29%>
 
82
                                <P>Ascii file, one line per record, fixed delimiter</P>
 
83
                        </TD>
 
84
                        <TD WIDTH=25%>
 
85
                                <P>Nothing to do. The same information can come from spooling
 
86
                                with history.</P>
 
87
                        </TD>
 
88
                </TR>
 
89
                <TR VALIGN=TOP>
 
90
                        <TD WIDTH=15%>
 
91
                                <P>Calendar</P>
 
92
                        </TD>
 
93
                        <TD WIDTH=31%>
 
94
                                <P>common/read_write_cal.c</P>
 
95
                        </TD>
 
96
                        <TD WIDTH=29%>
 
97
                                <P>Ascii file per object, one whitespace separated name/value per
 
98
                                line</P>
 
99
                        </TD>
 
100
                        <TD WIDTH=25%>
 
101
                                <P><BR>
 
102
                                </P>
 
103
                        </TD>
 
104
                </TR>
 
105
                <TR VALIGN=TOP>
 
106
                        <TD WIDTH=15%>
 
107
                                <P>Checkpoint Environment</P>
 
108
                        </TD>
 
109
                        <TD WIDTH=31%>
 
110
                                <P>common/read_write_ckpt.c</P>
 
111
                        </TD>
 
112
                        <TD WIDTH=29%>
 
113
                                <P>Ascii file per object, one whitespace separated name/value per
 
114
                                line</P>
 
115
                        </TD>
 
116
                        <TD WIDTH=25%>
 
117
                                <P>sublist: queues, only names, could be stored as string</P>
 
118
                        </TD>
 
119
                </TR>
 
120
                <TR VALIGN=TOP>
 
121
                        <TD WIDTH=15%>
 
122
                                <P>Cluster configuration</P>
 
123
                        </TD>
 
124
                        <TD WIDTH=31%>
 
125
                                <P>common/rw_configuration.c</P>
 
126
                        </TD>
 
127
                        <TD WIDTH=29%>
 
128
                                <P>Ascii file per object, one whitespace separated name/value per
 
129
                                line</P>
 
130
                        </TD>
 
131
                        <TD WIDTH=25%>
 
132
                                <P>Probably merge with host objects</P>
 
133
                        </TD>
 
134
                </TR>
 
135
                <TR VALIGN=TOP>
 
136
                        <TD WIDTH=15%>
 
137
                                <P>Complex</P>
 
138
                        </TD>
 
139
                        <TD WIDTH=31%>
 
140
                                <P>common/sge_complex.c</P>
 
141
                        </TD>
 
142
                        <TD WIDTH=29%>
 
143
                                <P>Ascii file per complex, one line per complex attribute,
 
144
                                whitespace separated fields</P>
 
145
                        </TD>
 
146
                        <TD WIDTH=25%>
 
147
                                <P>Need rules for spooling of complex attributes. On/Off.
 
148
                                Min,Max,Avg in a certain interval.</P>
 
149
                        </TD>
 
150
                </TR>
 
151
                <TR VALIGN=TOP>
 
152
                        <TD WIDTH=15%>
 
153
                                <P>History</P>
 
154
                        </TD>
 
155
                        <TD WIDTH=31%>
 
156
                                <P>common/complex_history.c</P>
 
157
                        </TD>
 
158
                        <TD WIDTH=29%>
 
159
                                <P>Directory for hosts and queues, one file per timestamp,
 
160
                                complex file format</P>
 
161
                        </TD>
 
162
                        <TD WIDTH=25%>
 
163
                                <P>Nothing to do. The same information can come from spooling
 
164
                                with history.</P>
 
165
                        </TD>
 
166
                </TR>
 
167
                <TR VALIGN=TOP>
 
168
                        <TD WIDTH=15%>
 
169
                                <P>Host</P>
 
170
                        </TD>
 
171
                        <TD WIDTH=31%>
 
172
                                <P>common/read_write_host.c</P>
 
173
                        </TD>
 
174
                        <TD WIDTH=29%>
 
175
                                <P>Ascii file per object, one whitespace separated name/value per
 
176
                                line</P>
 
177
                                <P>Admin and submit hosts only contain one attribute, the name</P>
 
178
                        </TD>
 
179
                        <TD WIDTH=25%>
 
180
                                <P>Admin-/Exec-/Submit- hosts are different objects. Should be
 
181
                                merged into one object.</P>
 
182
                        </TD>
 
183
                </TR>
 
184
                <TR VALIGN=TOP>
 
185
                        <TD WIDTH=15%>
 
186
                                <P>Hostgroup</P>
 
187
                        </TD>
 
188
                        <TD WIDTH=31%>
 
189
                                <P>common/read_write_host_group.c</P>
 
190
                        </TD>
 
191
                        <TD WIDTH=29%>
 
192
                                <P><BR>
 
193
                                </P>
 
194
                        </TD>
 
195
                        <TD WIDTH=25%>
 
196
                                <P>Not active</P>
 
197
                        </TD>
 
198
                </TR>
 
199
                <TR VALIGN=TOP>
 
200
                        <TD WIDTH=15%>
 
201
                                <P>Job</P>
 
202
                        </TD>
 
203
                        <TD WIDTH=31%>
 
204
                                <P>daemons/common/read_write_job.c</P>
 
205
                        </TD>
 
206
                        <TD WIDTH=29%>
 
207
                                <P>Directory structure, multiple binary files (cull packing
 
208
                                buffer)</P>
 
209
                                <P>Job script is stored separately</P>
 
210
                        </TD>
 
211
                        <TD WIDTH=25%>
 
212
                                <P><BR>
 
213
                                </P>
 
214
                        </TD>
 
215
                </TR>
 
216
                <TR VALIGN=TOP>
 
217
                        <TD WIDTH=15%>
 
218
                                <P>Manager</P>
 
219
                                <P>Operator</P>
 
220
                        </TD>
 
221
                        <TD WIDTH=31%>
 
222
                                <P>daemons/qmaster/read_write_manop.c</P>
 
223
                        </TD>
 
224
                        <TD WIDTH=29%>
 
225
                                <P>Ascii files, one line per user name</P>
 
226
                        </TD>
 
227
                        <TD WIDTH=25%>
 
228
                                <P>Would be better as an attribute of a user object</P>
 
229
                        </TD>
 
230
                </TR>
 
231
                <TR VALIGN=TOP>
 
232
                        <TD WIDTH=15%>
 
233
                                <P>Messages</P>
 
234
                        </TD>
 
235
                        <TD WIDTH=31%>
 
236
                                <P><BR>
 
237
                                </P>
 
238
                        </TD>
 
239
                        <TD WIDTH=29%>
 
240
                                <P>Ascii files, one line per record, fixed delimiter</P>
 
241
                        </TD>
 
242
                        <TD WIDTH=25%>
 
243
                                <P>No real objects at the moment. But each message has a
 
244
                                structure well suited for storage in database tables.</P>
 
245
                        </TD>
 
246
                </TR>
 
247
                <TR VALIGN=TOP>
 
248
                        <TD WIDTH=15%>
 
249
                                <P>Parallel Environment</P>
 
250
                        </TD>
 
251
                        <TD WIDTH=31%>
 
252
                                <P>common/read_write_pe.c</P>
 
253
                        </TD>
 
254
                        <TD WIDTH=29%>
 
255
                                <P>Ascii file per object, one whitespace separated name/value per
 
256
                                line</P>
 
257
                        </TD>
 
258
                        <TD WIDTH=25%>
 
259
                                <P>sublist: queues, only names, could be stored as string</P>
 
260
                        </TD>
 
261
                </TR>
 
262
                <TR VALIGN=TOP>
 
263
                        <TD WIDTH=15%>
 
264
                                <P>Project</P>
 
265
                        </TD>
 
266
                        <TD WIDTH=31%>
 
267
                                <P>common/read_write_userprj.c</P>
 
268
                        </TD>
 
269
                        <TD WIDTH=29%>
 
270
                                <P>Ascii file per object, one whitespace separated name/value per
 
271
                                line</P>
 
272
                        </TD>
 
273
                        <TD WIDTH=25%>
 
274
                                <P>Usage and long-term usage are sublists, stored as name/value
                                pairs: cpu, mem, io, finished jobs. They could also be stored as
                                single attributes.</P>
 
278
                        </TD>
 
279
                </TR>
 
280
                <TR VALIGN=TOP>
 
281
                        <TD WIDTH=15%>
 
282
                                <P>Queue</P>
 
283
                        </TD>
 
284
                        <TD WIDTH=31%>
 
285
                                <P>common/read_write_queue.c</P>
 
286
                        </TD>
 
287
                        <TD WIDTH=29%>
 
288
                                <P>Ascii file per object, one whitespace separated name/value per
 
289
                                line</P>
 
290
                        </TD>
 
291
                        <TD WIDTH=25%>
 
292
                                <P>Qtype is stored as bitfield, spooled as list of type
 
293
                                identifiers</P>
 
294
                                <P>sublists: thresholds (name/value pairs), owner (string list),
 
295
                                user (string list), xuser (string list), subordinates (string
 
296
                                list), complexes (string list), complex_values (name/value
 
297
                                pairs), projects (string list), xprojects (string list)</P>
 
298
                        </TD>
 
299
                </TR>
 
300
                <TR VALIGN=TOP>
 
301
                        <TD WIDTH=15%>
 
302
                                <P>Sharetree</P>
 
303
                        </TD>
 
304
                        <TD WIDTH=31%>
 
305
                                <P>common/sge_sharetree.c</P>
 
306
                        </TD>
 
307
                        <TD WIDTH=29%>
 
308
                                <P>One ascii file, references by node ids within the file</P>
 
309
                        </TD>
 
310
                        <TD WIDTH=25%>
 
311
                                <P><BR>
 
312
                                </P>
 
313
                        </TD>
 
314
                </TR>
 
315
                <TR VALIGN=TOP>
 
316
                        <TD WIDTH=15%>
 
317
                                <P>User</P>
 
318
                        </TD>
 
319
                        <TD WIDTH=31%>
 
320
                                <P>common/read_write_userprj.c</P>
 
321
                        </TD>
 
322
                        <TD WIDTH=29%>
 
323
                                <P>Ascii file per object, one whitespace separated name/value per
 
324
                                line, special format for project related data</P>
 
325
                        </TD>
 
326
                        <TD WIDTH=25%>
 
327
                                <P><BR>
 
328
                                </P>
 
329
                        </TD>
 
330
                </TR>
 
331
                <TR VALIGN=TOP>
 
332
                        <TD WIDTH=15%>
 
333
                                <P>Usermapping</P>
 
334
                        </TD>
 
335
                        <TD WIDTH=31%>
 
336
                                <P>common/read_write_ume.c</P>
 
337
                        </TD>
 
338
                        <TD WIDTH=29%>
 
339
                                <P><BR>
 
340
                                </P>
 
341
                        </TD>
 
342
                        <TD WIDTH=25%>
 
343
                                <P>Not active</P>
 
344
                        </TD>
 
345
                </TR>
 
346
                <TR VALIGN=TOP>
 
347
                        <TD WIDTH=15%>
 
348
                                <P>Userset</P>
 
349
                        </TD>
 
350
                        <TD WIDTH=31%>
 
351
                                <P>common/read_write_userset.c</P>
 
352
                        </TD>
 
353
                        <TD WIDTH=29%>
 
354
                                <P>Ascii file per object, one whitespace separated name/value per
 
355
                                line</P>
 
356
                        </TD>
 
357
                        <TD WIDTH=25%>
 
358
                                <P><BR>
 
359
                                </P>
 
360
                        </TD>
 
361
                </TR>
 
362
        </TBODY>
 
363
</TABLE>
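<P>The two ascii layouts that dominate the table can be read with very
little code. The following Python sketch is only an illustration under
assumptions: it is not the read_object function mentioned above, the
sample attribute names are invented for the example, and the colon used
as the fixed delimiter is an assumed default.</P>

```python
def parse_name_value_object(text):
    """Parse one spooled object: one whitespace separated name/value per line."""
    attributes = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if not parts:
            continue  # skip empty lines
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        attributes[name] = value
    return attributes


def parse_delimited_records(text, field_names, delimiter=":"):
    """Parse a file with one record per line and a fixed field delimiter."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        values = line.split(delimiter)
        if len(values) != len(field_names):
            raise ValueError(f"expected {len(field_names)} fields: {line!r}")
        records.append(dict(zip(field_names, values)))
    return records


# Illustrative input only; the attribute names are assumptions.
queue = parse_name_value_object("qname     example.q\nhostname  node1\nslots     4\n")
records = parse_delimited_records(
    "example.q:node1:alice\n", ["qname", "hostname", "owner"]
)
```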
 
364
<P STYLE="margin-bottom: 0cm"><BR>
 
365
</P>
 
366
<P STYLE="margin-bottom: 0cm"><BR>
 
367
</P>
 
368
<H2 CLASS="western">Implementation</H2>
 
369
<H3 CLASS="western">Types of spooling</H3>
 
370
<P>Spooling is done in a certain spooling context.</P>
 
371
<P>A spooling context defines how objects are spooled.</P>
<P>Multiple spooling contexts can be used within one process.</P>
<P>Examples of spooling types/destinations:</P>
 
374
<UL>
 
375
        <LI><P>Ascii file, one record per file, name/value pairs per line</P>
 
376
        <LI><P>Ascii file, fixed delimiters for objects and attributes</P>
 
377
        <LI><P>Cull binary file (currently used for jobs, combined with a
        sophisticated directory structure).</P>
        <LI><P>XML files. They could easily replace the Cull binary file
        format, as hierarchies can be implemented in a straightforward and
        readable way.</P>
 
382
        <LI><P>Database files (e.g. Xbase)</P>
 
383
        <LI><P>SQL Database</P>
 
384
        <LI><P>LDAP Repository (for certain objects like users)</P>
 
385
</UL>
 
386
<P>Further information stored in a spooling context:</P>
 
387
<UL>
 
388
        <LI><P>whether to spool historical data (with timestamps) or a snapshot</P>
        <LI><P>information specific to the spooling type, e.g. delimiters for ascii
        file spooling, or file handles and database connections if they are
        to be kept open.</P>
 
392
</UL>
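<P>A spooling context as described above could be modelled roughly as
follows. This is a sketch under assumptions, not a proposed API: the
names SpoolingType and SpoolingContext, the options dictionary and the
in-memory store (which stands in for files or tables) are all
hypothetical.</P>

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class SpoolingType(Enum):
    """Spooling types/destinations taken from the list above."""
    ASCII_NAME_VALUE = "ascii file, name/value pairs per line"
    ASCII_DELIMITED = "ascii file, fixed delimiters"
    CULL_BINARY = "cull binary file"
    XML = "xml file"
    SQL_DATABASE = "sql database"
    LDAP = "ldap repository"


@dataclass
class SpoolingContext:
    spooling_type: SpoolingType
    keep_history: bool = False                   # True: timestamped history, False: snapshot
    options: dict = field(default_factory=dict)  # type specific data, e.g. delimiters
    store: dict = field(default_factory=dict)    # stand-in for files or tables

    def spool(self, key, attributes, timestamp=None):
        record = dict(attributes)
        if self.keep_history:
            stamp = time.time() if timestamp is None else timestamp
            self.store.setdefault(key, []).append((stamp, record))
        else:
            self.store[key] = record  # a snapshot keeps only the latest state


# Two contexts used side by side within one process.
snapshot = SpoolingContext(SpoolingType.ASCII_DELIMITED, options={"delimiter": ":"})
history = SpoolingContext(SpoolingType.SQL_DATABASE, keep_history=True)
for context in (snapshot, history):
    context.spool("example.q", {"slots": 4}, timestamp=100)
    context.spool("example.q", {"slots": 8}, timestamp=200)
```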
 
393
<H3 CLASS="western">Spooling of sublists</H3>
 
394
<P>Many Grid Engine object types contain sublists. 
 
395
</P>
 
396
<P>In the current implementation, these hierarchical data structures
 
397
are stored in different ways:</P>
 
398
<UL>
 
399
        <LI><P>by referencing other objects using string lists, e.g. the
 
400
        queue names in pe objects reference queue objects</P>
 
401
        <LI><P>by using name/value pairs in string lists, e.g. complex
        variables set for queues are stored in a string list containing
        tuples in the format &lt;name&gt;=&lt;value&gt;</P>
 
404
        <LI><P>by using special formats within the same ascii file (e.g. the
 
405
        user object or the sharetree). We should avoid these in the future.</P>
 
406
        <LI><P>by using the cull binary format as spool file format
 
407
        including sublists. We should not differentiate between ascii and
 
408
        cull binary file formats in the future.</P>
 
409
        <LI><P>by using directory hierarchies (e.g. storing array tasks
        within the job's spool directory). For file based storage, we will
        also need them in future implementations.</P>
 
412
</UL>
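<P>The &lt;name&gt;=&lt;value&gt; tuple format mentioned in the list
above is easy to convert to and from a dictionary. The helper names
parse_pairs and format_pairs below are hypothetical and the sample
values are invented; this is a sketch of the format, not existing Grid
Engine code.</P>

```python
def parse_pairs(items):
    """Turn ['name=value', ...] into a dict; a value may itself contain '='."""
    pairs = {}
    for item in items:
        name, separator, value = item.partition("=")
        if not separator or not name:
            raise ValueError(f"not a <name>=<value> tuple: {item!r}")
        pairs[name] = value
    return pairs


def format_pairs(pairs):
    """Inverse of parse_pairs: build the string list that gets spooled."""
    return [f"{name}={value}" for name, value in pairs.items()]


complex_values = parse_pairs(["arch=solaris64", "mem_total=2G"])
```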
 
413
<P><BR><BR>
 
414
</P>
 
415
<P>For the new implementation, we'll have to differentiate between
 
416
file based formats and database storage.</P>
 
417
<P>For file based storage, we should use the following strategies:</P>
 
418
<UL>
 
419
        <LI><P>when referencing other spooled objects, we should store a
        unique key. Lists of such keys can be stored as a string list.</P>
 
421
        <LI><P>name/value pairs can be stored in string lists in the
 
422
        existing format &lt;name&gt;=&lt;value&gt;</P>
 
423
        <LI><P>We'll have to continue the use of directory hierarchies for
 
424
        job spooling due to limitations of the number of files per
 
425
        directory.</P>
 
426
</UL>
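<P>To illustrate why a directory hierarchy keeps the number of files
per directory bounded, the following sketch splits a zero-padded job
number into path components. The layout (2/4/4 digits, a "jobs" root
and a "tasks" subdirectory) and the function name job_spool_path are
assumptions made for the example, not a description of the existing
spool format.</P>

```python
import os


def job_spool_path(base_dir, job_number, task_number=None):
    """Map a job (and optionally an array task) to a nested spool path."""
    if not 0 <= job_number < 10**10:
        raise ValueError("job number must fit into 10 digits")
    digits = f"{job_number:010d}"
    # At most 10000 entries can appear below each of the lower directories.
    path = os.path.join(base_dir, "jobs", digits[:2], digits[2:6], digits[6:])
    if task_number is not None:
        path = os.path.join(path, "tasks", str(task_number))
    return path


job_path = job_spool_path("spool", 12345)
task_path = job_spool_path("spool", 12345, task_number=7)
```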
 
427
<P>For database storage, we should use the following strategies:</P>
 
428
<UL>
 
429
        <LI><P>referencing single other objects can be done by storing a
 
430
        unique key.</P>
 
431
        <LI><P>referencing lists of other objects can also be done by
        storing a string list of keys, if we accept performance
        drawbacks for certain queries, e.g. &ldquo;which pe's contain queue
        xyz&rdquo;.<BR>It would be better to use mapping tables, e.g. a table
        pe_queues that links queues to pe's. Problem: special keywords like
        &ldquo;all&rdquo; would have to be handled either by a pseudo queue
        &ldquo;all&rdquo; or by a mapping entry without a queue reference.</P>
 
438
        <LI><P>name/value pairs have to be stored in additional tables. In
        certain cases these can be extended mapping tables, e.g. mapping
        complex attributes to queues and giving them a value.</P>
 
441
        <LI><P>The hierarchy job &ndash; ja_task &ndash; pe_task can be
        easily implemented by referencing the hierarchically superior object
        from the subordinate object &ndash; pe_tasks reference the ja_task,
        ja_tasks reference the job.</P>
 
445
</UL>
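<P>The mapping table approach can be tried out with an in-memory SQLite
database. The schema below is only a sketch: apart from pe_queues,
which is named in the text, all table and column names are assumptions,
and the keyword all is modelled here as a mapping entry without queue
reference (the second of the two options mentioned above).</P>

```python
import sqlite3

mapping_db = sqlite3.connect(":memory:")
mapping_db.executescript("""
    CREATE TABLE queue (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE pe    (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);

    -- Mapping table linking queues to pe's.
    -- A NULL queue_id encodes the special keyword all.
    CREATE TABLE pe_queues (
        pe_id    INTEGER NOT NULL REFERENCES pe(id),
        queue_id INTEGER REFERENCES queue(id)
    );

    -- Extended mapping table: complex attribute set for a queue, with value.
    CREATE TABLE queue_complex_values (
        queue_id INTEGER NOT NULL REFERENCES queue(id),
        name     TEXT NOT NULL,
        value    TEXT NOT NULL,
        PRIMARY KEY (queue_id, name)
    );

    INSERT INTO queue (id, name) VALUES (1, 'xyz'), (2, 'abc');
    INSERT INTO pe (id, name) VALUES (1, 'mpi'), (2, 'pvm'), (3, 'make');
    INSERT INTO pe_queues (pe_id, queue_id) VALUES (1, 1), (2, 2), (3, NULL);
    INSERT INTO queue_complex_values VALUES (1, 'arch', 'solaris64');
""")


def pes_containing_queue(conn, queue_name):
    """Answer the query 'which pe's contain queue xyz' via the mapping table."""
    rows = conn.execute(
        """
        SELECT DISTINCT pe.name
        FROM pe
        JOIN pe_queues ON pe_queues.pe_id = pe.id
        LEFT JOIN queue ON queue.id = pe_queues.queue_id
        WHERE queue.name = ? OR pe_queues.queue_id IS NULL
        ORDER BY pe.name
        """,
        (queue_name,),
    ).fetchall()
    return [name for (name,) in rows]


def complex_values_of_queue(conn, queue_name):
    """Read name/value pairs of a queue from the extended mapping table."""
    rows = conn.execute(
        """
        SELECT v.name, v.value
        FROM queue_complex_values AS v
        JOIN queue ON queue.id = v.queue_id
        WHERE queue.name = ?
        """,
        (queue_name,),
    ).fetchall()
    return dict(rows)
```

<P>With a string list of keys, the same question would require reading
and parsing the queue list of every pe, which is the performance
drawback the text refers to.</P>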
 
446
<TABLE WIDTH=100% BORDER=1 BORDERCOLOR="#000000" CELLPADDING=4 CELLSPACING=0>
 
447
        <COL WIDTH=64*>
 
448
        <COL WIDTH=64*>
 
449
        <COL WIDTH=64*>
 
450
        <COL WIDTH=64*>
 
451
        <THEAD>
 
452
                <TR VALIGN=TOP>
 
453
                        <TH WIDTH=25%>
 
454
                                <P>reference type</P>
 
455
                        </TH>
 
456
                        <TH WIDTH=25%>
 
457
                                <P>current implementation</P>
 
458
                        </TH>
 
459
                        <TH WIDTH=25%>
 
460
                                <P>new filebased</P>
 
461
                        </TH>
 
462
                        <TH WIDTH=25%>
 
463
                                <P>new database</P>
 
464
                        </TH>
 
465
                </TR>
 
466
        </THEAD>
 
467
        <TBODY>
 
468
                <TR VALIGN=TOP>
 
469
                        <TD WIDTH=25%>
 
470
                                <P>referencing objects</P>
 
471
                        </TD>
 
472
                        <TD WIDTH=25%>
 
473
                                <P>object id from cull</P>
 
474
                        </TD>
 
475
                        <TD WIDTH=25%>
 
476
                                <P>object id from cull</P>
 
477
                        </TD>
 
478
                        <TD WIDTH=25%>
 
479
                                <P>object id, either from cull or database internal serial number</P>
 
480
                        </TD>
 
481
                </TR>
 
482
                <TR VALIGN=TOP>
 
483
                        <TD WIDTH=25%>
 
484
                                <P>list of references</P>
 
485
                        </TD>
 
486
                        <TD WIDTH=25%>
 
487
                                <P>string list or cull sublist</P>
 
488
                        </TD>
 
489
                        <TD WIDTH=25%>
 
490
                                <P>string list</P>
 
491
                        </TD>
 
492
                        <TD WIDTH=25%>
 
493
                                <P>mapping table</P>
 
494
                        </TD>
 
495
                </TR>
 
496
                <TR VALIGN=TOP>
 
497
                        <TD WIDTH=25%>
 
498
                                <P>name/value pairs</P>
 
499
                        </TD>
 
500
                        <TD WIDTH=25%>
 
501
                                <P>string list or cull sublist</P>
 
502
                        </TD>
 
503
                        <TD WIDTH=25%>
 
504
                                <P>string list</P>
 
505
                        </TD>
 
506
                        <TD WIDTH=25%>
 
507
                                <P>mapping table with value</P>
 
508
                        </TD>
 
509
                </TR>
 
510
                <TR VALIGN=TOP>
 
511
                        <TD WIDTH=25%>
 
512
                                <P>subordinate objects</P>
 
513
                        </TD>
 
514
                        <TD WIDTH=25%>
 
515
                                <P>special format or spool in cull binary format</P>
 
516
                        </TD>
 
517
                        <TD WIDTH=25%>
 
518
                                <P>break up such hierarchies (e.g. possible in the user object)
 
519
                                or store data in additional files or directory structure and
 
520
                                reference these files</P>
 
521
                        </TD>
 
522
                        <TD WIDTH=25%>
 
523
                                <P>store them in additional tables and make them reference their
 
524
                                superior object</P>
 
525
                        </TD>
 
526
                </TR>
 
527
                <TR VALIGN=TOP>
 
528
                        <TD WIDTH=25%>
 
529
                                <P>job hierarchy</P>
 
530
                        </TD>
 
531
                        <TD WIDTH=25%>
 
532
                                <P>directory hierarchy</P>
 
533
                        </TD>
 
534
                        <TD WIDTH=25%>
 
535
                                <P>directory hierarchy</P>
 
536
                        </TD>
 
537
                        <TD WIDTH=25%>
 
538
                                <P>subordinate objects reference superior objects</P>
 
539
                        </TD>
 
540
                </TR>
 
541
        </TBODY>
 
542
</TABLE>
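<P>The last table row, subordinate objects referencing their superior
objects, can be sketched with SQLite as follows. The table and column
names are assumptions chosen for the example; only the object names
job, ja_task and pe_task come from the text.</P>

```python
import sqlite3

hierarchy_db = sqlite3.connect(":memory:")
hierarchy_db.executescript("""
    CREATE TABLE job (
        job_number INTEGER PRIMARY KEY,
        owner      TEXT
    );
    -- Each ja_task references the job it belongs to.
    CREATE TABLE ja_task (
        id          INTEGER PRIMARY KEY,
        job_number  INTEGER NOT NULL REFERENCES job(job_number),
        task_number INTEGER NOT NULL
    );
    -- Each pe_task references the ja_task it belongs to.
    CREATE TABLE pe_task (
        id         INTEGER PRIMARY KEY,
        ja_task_id INTEGER NOT NULL REFERENCES ja_task(id),
        hostname   TEXT
    );

    INSERT INTO job VALUES (42, 'alice'), (43, 'bob');
    INSERT INTO ja_task VALUES (1, 42, 1), (2, 42, 2), (3, 43, 1);
    INSERT INTO pe_task VALUES
        (1, 1, 'node1'), (2, 1, 'node2'), (3, 2, 'node3'), (4, 3, 'node1');
""")


def pe_tasks_of_job(conn, job_number):
    """Walk the hierarchy upwards: pe_task -> ja_task -> job."""
    return conn.execute(
        """
        SELECT ja_task.task_number, pe_task.hostname
        FROM pe_task
        JOIN ja_task ON ja_task.id = pe_task.ja_task_id
        WHERE ja_task.job_number = ?
        ORDER BY ja_task.task_number, pe_task.hostname
        """,
        (job_number,),
    ).fetchall()
```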
 
543
<P STYLE="margin-bottom: 0cm"><BR>
 
544
</P>
 
545
<H3 CLASS="western">Spooling policies dependent on component</H3>
 
546
<H4 CLASS="western">Current implementation 
 
547
</H4>
 
548
<P>In the current implementation we have different spooling policies
 
549
dependent on the component that does spooling.</P>
 
550
<P>The main spooling component is the qmaster.</P>
<P>But execd also spools jobs and related information, e.g. queue
or parallel environment information.</P>
<P>The related information reflects the status of the spooled object
at the time the job was delivered to execd.</P>
<P>Execd may also spool different job attributes than qmaster does.</P>
 
558
<H4 CLASS="western">Suggestions for a new implementation</H4>
 
559
<P>Different approaches are possible to address this issue. The
 
560
following will discuss some ideas.</P>
 
561
<H5 CLASS="western">Multiple writing instances to one global database</H5>
 
562
<P>All daemons use a common database. The execds can write directly
 
563
to the database. Qmaster is notified about changes by the database.</P>
 
564
<P>Pros: 
 
565
</P>
 
566
<UL>
 
567
        <LI><P>Reduced message transfer volume between qmaster and execd</P>
 
568
        <LI><P>Reduced spooling overhead in qmaster</P>
 
569
        <LI><P>More accurate data in the database, as data doesn't have to
 
570
        go through qmaster. 
 
571
        </P>
 
572
</UL>
 
573
<P>Cons:</P>
 
574
<UL>
 
575
        <LI><P>Danger of inconsistencies between data in qmaster and data in
 
576
        the database. This problem exists with any implementation, but most
 
577
        probably qmaster should be the instance that holds the most recent
 
578
        information.</P>
 
579
        <LI><P>Scalability issues. It takes away the possibility of local
 
580
        spooling.</P>
 
581
</UL>
 
582
<P>Probably not an option for the near future.</P>
 
583
<H5 CLASS="western">Restrict to file spooling in execd</H5>
 
584
<P>Each execd has its own area for spooling, usually file based,
 
585
either on a local disk (recommended) or via NFS mount.</P>
 
586
<P>Use formats that allow the spooling of hierarchical data, i.e.
 
587
either cull binary format or XML format.</P>
 
588
<P>As execd spools information differently from qmaster (not all, or
other, attributes; a different strategy for sublists), the spooling
implementation has to provide a means to override the spooling
strategies defined as default for certain object types, or two spooling
strategies have to be defined for these object types.</P>
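<P>One way to picture default strategies with per-component overrides
is a two-level lookup. Everything in the following sketch is
hypothetical: the strategy keys, their values and the function name
spooling_strategy are assumptions that only illustrate the override
mechanism.</P>

```python
# Default strategy per object type (values are illustrative assumptions).
DEFAULT_STRATEGIES = {
    "job":   {"format": "database", "sublists": "normalized"},
    "queue": {"format": "database", "sublists": "normalized"},
}

# Overrides per component; execd spools hierarchical job data as is.
COMPONENT_OVERRIDES = {
    "execd": {
        "job": {"format": "cull_binary", "sublists": "embedded"},
    },
}


def spooling_strategy(component, object_type):
    """Return the default strategy, with component specific overrides applied."""
    strategy = dict(DEFAULT_STRATEGIES[object_type])
    overrides = COMPONENT_OVERRIDES.get(component, {})
    strategy.update(overrides.get(object_type, {}))
    return strategy


qmaster_job = spooling_strategy("qmaster", "job")
execd_job = spooling_strategy("execd", "job")
```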
 
593
<P>Pros:</P>
 
594
<UL>
 
595
        <LI><P>spooling load can be easily distributed by using local file
 
596
        systems</P>
 
597
        <LI><P>execd is the only instance that needs to spool hierarchical
        data in non-normalized form, as the sub objects that have to be
        spooled are only valid for the lifetime of the only object type it
        spools (job related data)</P>
 
601
</UL>
 
602
<P>Cons:</P>
 
603
<UL>
 
604
        <LI><P>Different spooling strategies within one cluster have to be
 
605
        implemented</P>
 
606
        <LI><P>spooling remains a bottleneck when NFS has to be used for
 
607
        some reason, e.g. diskless compute engines</P>
 
608
        <LI><P>on very big SMP machines (some hundred processors) spooling
 
609
        could become a bottleneck due to slow file spooling</P>
 
610
</UL>
 
611
<H3 CLASS="western">Cull enhancements</H3>
 
612
<H4 CLASS="western">Definition of attributes</H4>
 
613
<P>The Cull definition will have to contain information about which
fields have to be spooled and how sublists are spooled.</P>
<P>Replace the many similar definitions for the same type by a
combination of flags. Example:</P>
<P>We now have 14 definitions for the string datatype (SGE_STRING,
SGE_STRINGH, SGE_STRING_HU, SGE_KSTRING, ...)</P>
 
619
<P><BR><BR>
 
620
</P>
 
621
<P>A list element definition like 
 
622
</P>
 
623
<P>SGE_KULONGH(JB_job_number)</P>
 
624
<P>could be replaced by 
 
625
</P>
 
626
<P>SGE_ULONG(JG_job_number, HASH | UNIQUE | SPOOL | QIDL_K)</P>
 
627
<P>or</P>
 
628
<P>SGE_LIST_ELEMENT(JG_job_number, ULONG | HASH | UNIQUE | SPOOL |
 
629
SHOW | QIDL_K)</P>
 
630
<P><BR><BR>
 
631
</P>
 
632
<P>A keyword DEFAULT could be used, if no special settings are done
 
633
for a type.</P>
 
634
<P><BR><BR>
 
635
</P>
 
636
<P>Descriptor field mt has lots of free space (currently only uses 4
 
637
bit for the data types from a (32 bit) integer) that could hold the
 
638
following additional information:</P>
 
639
<UL>
 
640
        <LI><P>ARRAY <BR>For an array implementation (optionally to be done
 
641
        in a separate step)</P>
 
642
        <LI><P>HASH<BR>Enable hashing for the field.</P>
 
643
        <LI><P>UNIQUE<BR>Attribute has unique values within one list. This
 
644
        is at the moment only checked for attributes that have hashing
 
645
        enabled, but could be extended to any operations setting values.</P>
 
646
        <LI><P>SPOOL<BR>Shall the attribute be spooled.</P>
 
647
        <LI><P>SHOW<BR>Shall the attribute be shown (e.g. in qconf -s*,
 
648
        qstat -j etc.)</P>
 
649
        <LI><P>CONFIG<BR>Shall the attribute be configurable, i.e. be
 
650
        contained in the temporary files created for qconf -m* operations or
 
651
        for qconf -mattr operations</P>
 
652
</UL>
 
653
<P>Probably we should use a prefix like CULL_ or SGE_ to ensure
 
654
uniqueness, e.g. CULL_HASH instead of HASH.</P>
 
655
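<P>As a minimal sketch of how such flags might share the mt field with
the data type: the low 4 bits keep the type, the bits above them carry
the flags. The bit positions, the helper functions and the type code
EXAMPLE_ULONG_T are assumptions chosen for illustration, not values
taken from the cull sources.</P>

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 4 bits hold the data type, higher bits hold flags. */
enum {
    CULL_TYPE_MASK = 0x0000000F,
    CULL_ARRAY     = 0x00000010,
    CULL_HASH      = 0x00000020,
    CULL_UNIQUE    = 0x00000040,
    CULL_SPOOL     = 0x00000080,
    CULL_SHOW      = 0x00000100,
    CULL_CONFIG    = 0x00000200
};

/* Illustrative type code; not necessarily the value cull uses. */
enum { EXAMPLE_ULONG_T = 3 };

/* Combine a type code and a set of flags into one mt value. */
static uint32_t mt_make(uint32_t type, uint32_t flags)
{
    assert((type & ~(uint32_t)CULL_TYPE_MASK) == 0); /* type fits in 4 bits */
    assert((flags & CULL_TYPE_MASK) == 0);           /* flags avoid type bits */
    return type | flags;
}

/* Extract the data type from an mt value. */
static uint32_t mt_get_type(uint32_t mt)
{
    return mt & CULL_TYPE_MASK;
}

/* Return non-zero if the given flag is set in mt. */
static int mt_has_flag(uint32_t mt, uint32_t flag)
{
    return (mt & flag) != 0;
}
```

<P>Code that only needs the data type would keep working as long as
it masks mt with CULL_TYPE_MASK before comparing.</P>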
<H4 CLASS="western">Tracking of changed attributes</H4>
<P>To be able to interface with a database through mechanisms like
SQL, each object must know which of its attributes have changed.
Otherwise, the whole object has to be spooled on each spooling
function call, even if only a few attributes have changed or the
object has not changed at all.</P>
<P>This could be achieved by wrapping the lMultiType enum type in a
struct and reserving one bit as a &ldquo;changed&rdquo; flag for the
attribute.</P>
<P>Alternatively, a bitfield containing this information could be
added to the lListElem data type; this would consume less memory.</P>
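<P>The bitfield variant might look roughly like the sketch below. The
struct example_elem is a hypothetical stand-in, not the real lListElem,
and all function names are invented for this example.</P>

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a list element; not the real lListElem. */
typedef struct {
    size_t n_attr;          /* number of attributes in the element */
    unsigned char *changed; /* one bit per attribute */
} example_elem;

/* Number of bytes needed to hold one bit per attribute. */
static size_t changed_bytes(size_t n_attr)
{
    return (n_attr + CHAR_BIT - 1) / CHAR_BIT;
}

/* Allocate a cleared bitfield; returns 0 on allocation failure. */
static int elem_init(example_elem *ep, size_t n_attr)
{
    assert(n_attr > 0);
    ep->n_attr = n_attr;
    ep->changed = calloc(changed_bytes(n_attr), 1);
    return ep->changed != NULL;
}

/* To be called by every operation that sets an attribute value. */
static void elem_mark_changed(example_elem *ep, size_t pos)
{
    assert(pos < ep->n_attr);
    ep->changed[pos / CHAR_BIT] |= (unsigned char)(1u << (pos % CHAR_BIT));
}

/* Lets the spooling code decide whether an attribute must be written. */
static int elem_is_changed(const example_elem *ep, size_t pos)
{
    assert(pos < ep->n_attr);
    return (ep->changed[pos / CHAR_BIT] >> (pos % CHAR_BIT)) & 1u;
}

/* To be called after the element has been spooled successfully. */
static void elem_clear_changed(example_elem *ep)
{
    memset(ep->changed, 0, changed_bytes(ep->n_attr));
}

static void elem_free(example_elem *ep)
{
    free(ep->changed);
    ep->changed = NULL;
}
```

<P>One bit per attribute is stored once per element, which is the
reason this variant needs less memory than a flag inside every
attribute value.</P>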
<H4 CLASS="western">Attribute names</H4>
<P>A set of attribute names is generated by the NAMEDEF macros for
each object type.</P>
<P>These attribute names have very limited use in the current
implementation: they are only used for debugging purposes
(lWrite* function calls).</P>
<P>Spooling, information output and configuration changes also need
attribute names. At the moment these names are hardcoded in the
spooling, output and parsing functions.</P>
<P>It would be better to extend the existing NAMEDEF macros so that
they create struct objects containing both the internal attribute
name and an attribute name to be used for the other purposes.</P>
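<P>A possible shape for such a name table is sketched here. The macro
EXAMPLE_NAME, the struct, the EX_ attribute ids and the external names
are all hypothetical; the existing NAMEDEF macros are not reproduced.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pair of names for one attribute. */
typedef struct {
    int nm;                    /* attribute id */
    const char *internal_name; /* name used for debugging output */
    const char *external_name; /* name for spooling, output and parsing */
} example_attr_name;

/* Hypothetical NAMEDEF-like macro that records two names per attribute. */
#define EXAMPLE_NAME(id, external) { id, #id, external }

/* Illustrative attribute ids, not taken from the cull sources. */
enum { EX_job_number = 1, EX_job_name, EX_owner };

static const example_attr_name example_job_names[] = {
    EXAMPLE_NAME(EX_job_number, "job_number"),
    EXAMPLE_NAME(EX_job_name,   "job_name"),
    EXAMPLE_NAME(EX_owner,      "owner")
};

#define EXAMPLE_N_NAMES (sizeof(example_job_names) / sizeof(example_job_names[0]))

/* Name to write when spooling or printing an attribute; NULL if unknown. */
static const char *example_external_name(int nm)
{
    size_t i;

    for (i = 0; i < EXAMPLE_N_NAMES; i++) {
        if (example_job_names[i].nm == nm) {
            return example_job_names[i].external_name;
        }
    }
    return NULL;
}

/* Attribute id for a name read while parsing; -1 if unknown. */
static int example_id_from_external(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < EXAMPLE_N_NAMES; i++) {
        if (strcmp(example_job_names[i].external_name, name) == 0) {
            return example_job_names[i].nm;
        }
    }
    return -1;
}
```

<P>With one table per object type, the spooling, output and parsing
functions could share the same names instead of hardcoding them.</P>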
<H3 CLASS="western">Functions</H3>
<P>create_spooling_context</P>
<P>free_spooling_context</P>
<P><BR><BR>
</P>
<P>spool_prepare</P>
<P>spool_commit</P>
<P><BR><BR>
</P>
<P>spool_object</P>
<P>spool_attribute</P>
<P><BR><BR>
</P>
<H3 CLASS="western">Installation issues</H3>
<P>First step:</P>
<P>Provide an install_monitoring script to set up the event client and
its spooling configuration.</P>
<P>Second step:</P>
<P>In the qmaster installation, decide which spooling type to use and
perform further type specific actions (for an SQL database, query the
user for parameters and test the database).</P>
<P STYLE="margin-bottom: 0cm"><BR>
</P>
<H2 CLASS="western">Implementation proposal</H2>
<P>The implementation can be done in separate steps, each of which can
be tested thoroughly. Time estimates are net times and include
documentation and testing.</P>
<TABLE WIDTH=100% BORDER=1 BORDERCOLOR="#000000" CELLPADDING=4 CELLSPACING=0>
        <COL WIDTH=203*>
        <COL WIDTH=53*>
        <THEAD>
                <TR VALIGN=TOP>
                        <TH WIDTH=79% BGCOLOR="#e6e6ff">
                                <P>task</P>
                        </TH>
                        <TH WIDTH=21% BGCOLOR="#e6e6ff">
                                <P>est. time [weeks]</P>
                        </TH>
                </TR>
        </THEAD>
        <TBODY>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>implement the suggested cull object definition changes</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>implement tracking of attribute changes</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>implement file based spooling. Restrict to the following text
                                file formats:</P>
                                <UL>
                                        <LI><P>one record per file, name/value pairs per line</P>
                                        <LI><P>fixed delimiters for objects and attribute values</P>
                                        <LI><P>XML</P>
                                </UL>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="3" SDNUM="1023;">
                                <P ALIGN=RIGHT>3</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>add a compile time switch that makes qmaster use the new
                                spooling functions for some selected object types. Only
                                for test purposes.</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="1" SDNUM="1023;">
                                <P ALIGN=RIGHT>1</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>implement database storage</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="8" SDNUM="1023;">
                                <P ALIGN=RIGHT>8</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>create an event client that subscribes to all events for all
                                object types and spools them to a database</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>do extensive tests with qmaster spooling to files through
                                some of the new spooling functions and with the event client
                                attached; continue the tests during the next phases.</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP BGCOLOR="#e6e6e6">
                                <P><I><B>Sum essential steps</B></I></P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM BGCOLOR="#e6e6e6" SDVAL="20" SDNUM="1023;">
                                <P ALIGN=RIGHT>20</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>make qmaster and execd use the new spooling framework (compile
                                time option), test different spooling strategies</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="4" SDNUM="1023;">
                                <P ALIGN=RIGHT>4</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>make new spooling framework the default, create means to
                                configure spooling strategies during the installation process 
                                </P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>create install_monitoring that will install the event client
                                separately</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="1" SDNUM="1023;">
                                <P ALIGN=RIGHT>1</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>create means to update the database structure, backup and
                                purging of outdated information</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>build clients that use the database as source of information
                                instead of qmaster (qhost, qstat, qacct)</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP>
                                <P>change qconf and qalter to use the new spooling framework for
                                reading information and for creating and processing the data to
                                be configured.</P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM SDVAL="2" SDNUM="1023;">
                                <P ALIGN=RIGHT>2</P>
                        </TD>
                </TR>
                <TR>
                        <TD WIDTH=79% VALIGN=TOP BGCOLOR="#e6e6e6">
                                <P><I><B>Sum additional steps</B></I></P>
                        </TD>
                        <TD WIDTH=21% VALIGN=BOTTOM BGCOLOR="#e6e6e6" SDVAL="13" SDNUM="1023;">
                                <P ALIGN=RIGHT>13</P>
                        </TD>
                </TR>
        </TBODY>
</TABLE>
<P><BR><BR>
</P>
</BODY>
</HTML>