4
<title>Makeflow User's Manual</title>
9
<style type="text/css">
12
font-family: monospace;
16
border: solid 1px black;
21
<h1>Makeflow User's Manual</h1>
22
<b>Last Updated Febuary 2011</b>
24
Makeflow is Copyright (C) 2009 The University of Notre Dame.
25
This software is distributed under the GNU General Public License.
26
See the file COPYING for details.
30
Makeflow is a <b>workflow engine</b> for distributed computing.
31
It accepts a specification of a large amount of work to be
32
performed, and runs it on remote machines in parallel where possible.
33
In addition, Makeflow is fault-tolerant, so you can use it to coordinate
34
very large tasks that may run for days or weeks in the face of failures.
35
Makeflow is designed to be similar to <b>Make</b>, so if you can write
36
a Makefile, then you can write a Makeflow.
38
You can run a Makeflow on your local machine to test it out.
39
If you have a multi-core machine, then you can run multiple tasks simultaneously.
40
If you have a Condor pool or a Sun Grid Engine batch system, then you can send
41
your jobs there to run. If you don't already have a batch system, Makeflow comes with a
42
system called Work Queue that will let you distribute the load across any collection
43
of machines, large or small.
45
Makeflow is part of the <a href=http://www.cse.nd.edu/~ccl/software>Cooperating Computing Tools</a>. You can download the CCTools from <a href=http://www.cse.nd.edu/~ccl/software/download>this web page</a>, follow the <a href=install.html>installation instructions</a>, and you are ready to go.
47
<h2>The Makeflow Language</h2>
49
The Makeflow language is very similar to Make.
50
A Makeflow script consists of a set of rules.
51
Each rule specifies a set of <i>target files</i> to create,
52
a set of <i>source files</i> needed to create them,
53
and a <i>command</i> that generates the target files from the source files.
55
Makeflow attempts to generate all of the target files in a script.
56
It examines all of the rules and determines which rules must run before
57
others. Where possible, it runs commands in parallel to reduce the
60
Here is a Makeflow that uses the <tt>convert</tt> utility to make an animation.
61
It downloads an image from the web, creates four variations
62
of the image, and then combines them back together into an animation.
63
The first and the last task are marked as LOCAL to force them to
64
run on the controlling machine.
68
CONVERT=/usr/bin/convert
69
URL=http://www.cse.nd.edu/~ccl/images/capitol.jpg
71
capitol.montage.gif: capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg
72
LOCAL $CONVERT -delay 10 -loop 0 capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg capitol.270.jpg capitol.180.jpg capitol.90.jpg capitol.montage.gif
74
capitol.90.jpg: capitol.jpg $CONVERT
75
$CONVERT -swirl 90 capitol.jpg capitol.90.jpg
77
capitol.180.jpg: capitol.jpg $CONVERT
78
$CONVERT -swirl 180 capitol.jpg capitol.180.jpg
80
capitol.270.jpg: capitol.jpg $CONVERT
81
$CONVERT -swirl 270 capitol.jpg capitol.270.jpg
83
capitol.360.jpg: capitol.jpg $CONVERT
84
$CONVERT -swirl 360 capitol.jpg capitol.360.jpg
87
LOCAL $CURL -o capitol.jpg $URL
90
Note that Makeflow differs from Make in a few important ways.
91
Read section 4 below to get all of the details.
93
<h2>Running Makeflow</h2>
95
To try out the example above, copy and paste it into a file named <tt>example.makeflow</tt>.
96
To run it on your local machine:
99
% makeflow example.makeflow
102
Note that if you run it a second time, nothing will happen, because all of the files are built:
105
% makeflow example.makeflow
106
makeflow: nothing left to do
109
Use the <tt>-c</tt> option to clean everything up before trying it again:
112
% makeflow -c example.makeflow
115
If you have access to a batch system running <a href=http://www.sun.com/software/sge>SGE</a>,
116
then you can direct Makeflow to run your jobs there:
119
% makeflow -T sge example.makeflow
122
Or, if you have a <a href=http://www.cs.wisc.edu/condor>Condor Pool</a>,
123
then you can direct Makeflow to run your jobs there:
126
% makeflow -T condor example.makeflow
129
To submit Makeflow as a Condor job that submits more Condor jobs:
132
% condor_submit_makeflow example.makeflow
135
You will notice that a workflow can run very slowly if you submit
136
each batch job to SGE or Condor, because it typically
137
takes 30 seconds or so to start each batch job running. To get
138
around this limitation, we provide the Work Queue system.
139
This allows Makeflow to function as a master process that
140
quickly dispatches work to remote worker processes.
142
To begin, let's assume that you are logged into a machine
143
named <tt>barney.nd.edu</tt>. start your Makeflow like this:
145
% makeflow -T wq example.makeflow
148
Then, submit 10 worker processes to Condor like this:
151
% condor_submit_workers barney.nd.edu 9123 10
152
Submitting job(s)..........
153
Logging submit event(s)..........
154
10 job(s) submitted to cluster 298.
157
Or, submit 10 worker processes to SGE like this:
159
% sge_submit_workers barney.nd.edu 9123 10
162
Or, you can start workers manually on any other machine you can log into:
164
% work_queue_worker barney.nd.edu 9123
167
Once the workers begin running, Makeflow will dispatch multiple
168
tasks to each one very quickly. If a worker should fail, Makeflow
169
will retry the work elsewhere, so it is safe to submit many
170
workers to an unreliable system.
172
When the Makeflow completes, your workers will still be available,
173
so you can either run another Makeflow with the same workers,
174
remove them from the batch system, or wait for them to expire.
175
If you do nothing for 15 minutes, they will automatically exit.
177
Note that <tt>condor_submit_workers</tt> and <tt>sge_submit_workers</tt>
178
are simple shell scripts, so you can edit them directly if you would
179
like to change batch options or other details.
181
<h2>The Fine Details</h2>
183
The Makeflow language is very similar to Make, but it does have a few important differences that you should be aware of.
185
<h3>Get the Dependencies Right</h3>
187
You must be careful to accurately specify <b>all of the files that a rule requires and creates</b>, including any custom executables. This is because Makeflow requires all these information to construct the environment for a remote job. For example, suppose that you have written a simulation program called <tt>mysim.exe</tt> that reads <tt>calib.data</tt> and then produces and output file. The following rule won't work, because it doesn't inform Makeflow what files are neded to execute the simulation:
189
# This is an incorrect rule.
192
./mysim.exe -c calib.data -o output.txt
195
However, the following is correct, because the rule states all of the files needed to run the simulation. Makeflow will use this information to
196
construct a batch job that consists of <tt>mysim.exe</tt> and <tt>calib.data</tt> and uses it to produce <tt>output.txt</tt>:
199
# This is a correct rule.
201
output.txt: mysim.exe calib.data
202
./mysim.exe -c calib.data -o output.txt
204
When a regular file is specified as an input file, it means the command relies on the contents of that file. When a directory is specified as an input file, however, it could mean one of two things. First, the command depends on the contents inside the directory. Second, the command relies on the existence of the directory (for example, you just want to add more things into the directory later, it does not matter what's already in it). <b>Makeflow assumes that an input directory indicates that the command relies on the directory's existence</b>.
206
<h3>No Phony Rules</h3>
208
For a similar reason, you cannot have "phony" rules that don't actually
209
create the specified files. For example, it is common practice to define
210
a <tt>clean</tt> rule in Make that deletes all derived files. This doesn't
211
make sense in Makeflow, because such a rule does not actually create
212
a file named <tt>clean</tt>. Instead use the <tt>-c</tt> option as shown above.
214
<h3>Just Plain Rules</h3>
216
Makeflow does not support all of the syntax that you find in various versions of Make. Each rule must have exactly one command to execute. If you have multiple commands, simply join them together with semicolons. Makeflow allows you to define and use variables, but it does not support pattern rules, wildcards, or special variables like <tt>$<</tt> or <tt>$@</tt>. You simply have to write out the rules longhand, or write a script in your favorite language to generate a large Makeflow.
218
<h3>Local Job Execution</h3>
220
Certain jobs don't make much sense to distribute. For example, if you have a very fast running job that consumes a large amount of data, then it should simply run on the same machine as Makeflow. To force this, simply add the word <tt>LOCAL</tt> to the beginning of the command line in the rule.
222
<h3>Batch Job Refinement</h3>
224
When executing jobs, Makeflow simply uses the default settings in your batch system. If you need to pass additional options, use the <tt>BATCH_OPTIONS</tt> variable or the <tt>-B</tt> option to Makeflow.
226
When using Condor, this string will be added to each submit file. For example, if you want to add <tt>Requirements</tt> and <tt>Rank</tt> lines to your Condor submit files, add this to your Makeflow:
228
BATCH_OPTIONS = Requirements = (Memory>1024)
231
When using SGE, the string will be added to the qsub options. For example, to specify that jobs should be submitted to the <tt>devel</tt> queue:
233
BATCH_OPTIONS = -q devel
236
<h3>Displaying a Makeflow</h3>
238
When run with the <tt>-D</tt> option, Makeflow will emit a diagram of the Makeflow in the
239
<a href=http://www.graphviz.org>Graphviz DOT</a> format. If you have <tt>dot</tt> installed,
240
then you can generate an image of your workload like this:
243
% makeflow -D example.makeflow | dot -T gif > example.gif
246
<h2>Running Makeflow with Work Queue</h2>
247
With the '-T wq' option, Makeflow runs as a master process that dispatches
248
tasks to remote worker processes.
250
<h3>Change master port</h3>
251
The master process listens on a port which the remote workers would connect to.
252
The default port number is 9123. Sometimes, however, the port number might be
253
not available on your system. You can change the default port via the '-p
254
<port number>' option. For example, if you want the master to listen on port
255
9567 by default, you can run the following command:
257
% makeflow -T wq -p 9567 exmaple.makeflow
260
<h3>Set Project Name</h3>
261
The hostname:port is the default way for workers to identify masters. Anyone
262
can start a worker and connect to your master at the hostname:port. Sometimes
263
you don't want other people's workers to work on your project. To
264
<strong>avoid anonymous worker connections</strong>, you will need to set a
265
project name with the '-N' option for both your master and workers and enable
266
the exclusive worker mode on your work queue master with the "-e" option:
268
% makeflow -T wq -N proj1 -e example.makeflow
270
This would inform the master to only allow connections from workers who have a
271
preference on "proj1". If the '-e' option is not specified, the master would
272
accept shared workers submitted by anyone. To add a project preference for the
273
worker, add the '-N' option to the worker as well:
275
% work_queue_worker -N proj1 barney.nd.edu 9123
277
You can even assign mulitple preferred project names to a single worker. The
278
worker would only work for the master whose name is contained in the list of
279
the worker's preferred project names.
281
<h3>Enable Catalog Mode</h3>
282
In the catalog mode, the master would advertise its information, such as
283
project name, running status, and hostname and port number, to a catalog
284
server. The workers (with proper options turned on) would contact the catalog
285
server to ask for available masters and then work for them. The catalog mode
286
allows a set of workers to work for different masters without stopping them
287
and telling them a new master's hostname:port for each master. It also allows
288
different users to share their computing resources. So your workers could work
289
for other's projects when they are idle and other's workers could also work
292
To make use of the catalog mode, you will have to have a catalog server to
293
pass information between potential masters and workers. Say you want to run
294
your catalog server on a machine named barney.nd.edu (the default port that
295
the catalog server will be listening on is 9097, you can change it via the
298
barney% catalog_server
300
Now you have a catalog server listening at barney.nd.edu:9097. To make your
301
masters and workers contact this catalog server, add the '-C hostname:port'
302
option to both the master and worker programs:
304
% makeflow -T wq -C barney.nd.edu:9097 -N proj1 example.makeflow
305
% work_queue_worker -C barney.nd.edu:9097 -s
307
The '-C' option turns on the catalog mode. Note that the '-s' option in the
308
above worker command tells the worker to run in a shared worker mode, as
309
opposed to the default exlusive mode which would let the worker only work for
310
masters that it prefers (set by the -N option).
312
At Notre Dame, we have already had a catalog server runing 24/7. The work
313
queue master and worker program would contact the Notre Dame catalog server by
314
default. Thus, you don't have to specify the catalog server explicitly with
315
the '-C' option unless you have your own catalog server. To turn on the
316
catalog mode in this case you will use the '-a' option, again on both the
317
master and worker programs.
319
% makeflow -T wq -a -N proj1 example.makeflow
320
% work_queue_worker -a -N proj1 -N proj2 -N proj3
322
Note that this time the worker is not run with the '-s' option. Instead, it
323
has three preferred projects. The worker would ask the catalog server for
324
available masters and only work for a master if its project name matches one
325
of the worker's preferences.
327
<h2>For More Information</h2>
329
For the latest information about Makeflow, please visit our <a href=http://www.cse.nd.edu/~ccl/software/makeflow>web site</a> and subscribe to our <a href=http://www.cse.nd.edu/~ccl/software>mailing list</a>.