3
<title>Allpairs User's Manual</title>
7
<style type="text/css">
10
font-family: monospace;
14
border: solid 1px black;
19
<h1>Allpairs User's Manual</h1>
20
<b>Last Updated October 2010</b>
22
Allpairs is Copyright (C) 2009 The University of Notre Dame.
23
This software is distributed under the GNU General Public License.
24
See the file COPYING for details.
31
<a href=/~ccl/software/allpairs/large.gif><img src=/~ccl/software/allpairs/small.gif align=right border=0></a>
35
All-Pairs( array A[i], array B[j], function F(x,y) )<br>
36
returns matrix M where<br>
37
M[i,j] = F( A[i], B[j] )<br>
43
The All-Pairs abstraction computes the Cartesian product of two sets,
44
generating a matrix where each cell M[i,j] contains the output of the function
45
F on objects A[i] and B[j]. You provide two sets of data files (as in the above
46
figure, one set is setA = {A0, A1, A2, A3} and the other set is setB = {B0, B1,
47
B2, B3}) and a function F that computes on them (later in the text we refer to this fuction F as either <b>compare program</b> or <b>compare function</b>. You may optionally provide
48
additional parameters to control the actual computation, such as computing only
49
part of the matrix, using a specified number of CPU cores. The abstraction then
50
runs each of the functions in parallel, automatically handling load balancing,
51
data movement, fault tolerance and so on for you.
53
<h2 id="preparation">All-Pairs on a Single Machine</h2>
55
Let's suppose you have a whole lot of files that you want to compare all to each other, named <tt>a</tt>, <tt>b</tt>, <tt>c</tt>, and so on. Suppose that you also have a program named <tt>compareit</tt> that when invoked as <tt>compareit a b</tt> will compare files <tt>a</tt> and <tt>b</tt> and produce some output summarizing the difference between the two, like this:
57
a b are 45 percent similar
60
To use the allpairs framework, create a file called <tt>set.list</tt> that lists each of your files, one per line:
67
Then, invoke <tt>allpairs_multicore</tt> like this:
69
allpairs_multicore set.list set.list compareit
71
The framework will carry out all possible comparisons of the objects, and print the results one by one:
73
a a are 100 percent similar
74
a b are 45 percent similar
75
a c are 37 percent similar
78
For large sets of objects, allpairs_multicore will use as many cores as you have available, and will carefully manage virtual memory to exploit locality and avoid thrashing. Because of this, you should be prepared for the results to come back in any order.
80
<h2>All-Pairs on a Distributed System</h2>
82
So far, we have introduced how to use All-Pairs abstraction on a single
83
machine. But sometimes the All-Pairs problem is too big to allow a single
84
machine to finish it in a reasonable amount of time, even if the single machine
85
is multicore. So, we have built a <a
86
href=http://www.cse.nd.edu/~ccl/software/workqueue>Work Queue</a>
87
version of the All-Pairs abstraction which allows the users to easily apply the
88
All-Pairs abstraction on clusters, grids or clouds.
90
To use the All-Pairs Work Queue version, you will need to start a All-Pairs
91
master program called <tt>allpairs_master</tt> and a number of workers.
92
The workers will perform the tasks distributed by the master and return the
93
results to the master. The individual tasks that the master program distributes
94
are sub-matrix computation tasks and all the tasks would be performed by the
95
<tt>allpairs_multicore</tt> program on the workers. For end users, the only
96
extra step involved here is starting the workers. Starting the All-Pairs master
97
program is almost identical to starting the All-Pairs multicore program.
99
For example, to run the same example as above on a distributed system:
101
allpairs_master set.list set.list compareit
104
This will start the master process, which will wait for workers to connect.
105
Let's suppose the master is running on a machine named <tt>barney.nd.edu</tt>.
106
If you have access to login to other machines, you could simply start
107
worker processes by hand on each one, like this:
109
% work_queue_worker barney.nd.edu 9123
111
If you have access to a batch system like <a href=http://www.cs.wisc.edu/condor>Condor</a>, you can submit multiple workers at once:
113
% condor_submit_workers barney.nd.edu 9123 10
114
Submitting job(s)..........
115
Logging submit event(s)..........
116
10 job(s) submitted to cluster 298.
119
A similar script is available for Sun Grid Engine:
121
% sge_submit_workers barney.nd.edu 9123 10
124
In the above two examples, the first argument is the port number that the
125
master process will be or is listening on and the second the argument is the
126
number of workers to start. Note that <tt>9123</tt> is the default port
127
number that the master process uses. If you use the '-p' option in the
128
<tt>allpairs_master</tt> to change the listening port, you will need to
129
modify the port argument in the starting worker command accordingly.
131
<p>Once the workers are running, the <tt>allpairs_master</tt> can dispatch tasks
132
to each one very quickly. If a worker should fail, Work Queue will retry the
133
work elsewhere, so it is safe to submit many workers to an unreliable
136
<p>When the All-Pairs master process completes, your workers will
137
still be available, so you can either run another master with the same workers,
138
remove them from the batch system, or wait for them to expire. If you do
139
nothing for 15 minutes, they will automatically exit by default. You
140
can change this worker expiration time by setting the '<tt>-t</tt>' option.</p>
141
<p>Note that <tt>condor_submit_workers</tt> and <tt>sge_submit_workers</tt> are
142
simple shell scripts, so you can edit them directly if you would like to
143
change batch options or other details.</p>
145
<h2>Using an Internal Function</h2>
147
If you have a very fast comparison program (less than a tenth of a second),
148
the allpairs framework may be spending more time starting your program
149
than actually accomplishing real work. If you can express your comparison
150
as a simple function in C, you can embed that into the allpairs framework
151
to achieve significant speedups.
153
To accomplish this, <a href=http://www.cse.nd.edu/~ccl/software/download.shtml>download</a>
154
the CCTools source code, and build it. Then, look for the file <tt>allpairs/src/allpairs_compare.c</tt>.
155
At the top, you will see a function named <tt>allpairs_compare_CUSTOM</tt>, which accepts
156
two memory objects as arguments. Implement your comparison function, and then rebuild
157
the code. Test you code by running <tt>allpairs_multicore</tt> on a small set of data,
158
but specify <tt>CUSTOM</tt> as the name of the comparison program. If your tests succeeed
159
on a small set of data, then proceed to using <tt>allpairs_master</tt>.
161
We have implemented several internal comparison functions as examples, including:
163
<li> BITWISE - Counts the number of bytes different in each object.
164
<li> SWALIGN - Performs a Smith-Waterman alignment on two genomic sequences.
165
<li> IRIS - Performs a similarity comparison between two iris templates.
168
<h2>Tuning Performance</h2>
170
By default, both <tt>allpairs_master</tt> and <tt>allpairs_multicore</tt> will adjust to
171
the proprties of your workload to run it efficiently. <tt>allpairs_master</tt> will run
172
a few sample executions of your comparison program to measure how long it takes, and
173
then break up the work units into tasks that take abuot one minute each. Likewise,
174
<tt>allpairs_multicore</tt> will measure the number of cores and amount of memory
175
available on your system, and then arrange the computation to maximize performance.
177
If you like, you can use the options to further tune how the problem is decomposed:
179
<li> <tt>-t</tt> can be used to inform <tt>allpairs_master</tt> how long (in seconds)
180
it takes to perform each comparison. If given, <tt>allpairs_master</tt> will not
181
sample the execution, and will start the computation immediately.
182
<li> <tt>-x</tt> and <tt>-y</tt> can be used to set the size of the sub-problem
183
dispatched from <tt>allpairs_master</tt> to <tt>allpairs_multicore</tt>
184
<li> <tt>-c</tt> controls the number of cores used by <tt>allpairs_multicore</tt>,
185
which is all available cores by default.
186
<li> <tt>-b</tt> controls the block size of elements maintained in memory by <tt>allpairs_multicore</tt>,
187
which is 3/4 of memory by default.
191
<h2>For More Information</h2>
193
For the latest information about Allpairs, please visit our <a href=http://www.cse.nd.edu/~ccl/software/allpairs>web site</a> and subscribe to our <a href=http://www.cse.nd.edu/~ccl/software/help.shtml>mailing list</a>.