.\" NetPIPE -- Network Protocol Independent Performance Evaluator.
.\" Copyright 1997, 1998 Iowa State University Research Foundation, Inc.
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation. You should have received a copy of the
.\" GNU General Public License along with this program; if not, write to the
.\" Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
.\" Created: Mon Jun 15 1998 by Guy Helmer
.\" Rewritten: Jun 1 2004 by Dave Turner
.\" $Id: netpipe.1,v 1.3 1998/09/24 16:23:59 ghelmer Exp $
.TH netpipe 1 "June 1, 2004" "NetPIPE" "netpipe"
.SH SYNOPSIS
.B NPtcp
[\c
.BI \-h \ receiver_hostname\fR\c
] [\c
.BI \-b \ TCP_buffer_sizes\fR\c
] [options]
.PP
.B NPmpi
[\c
.BI \-machinefile \ hostlist\fR\c
] [-a] [-S] [-z] [options]
.PP
.B NPmpi2
[\c
.BI \-machinefile \ hostlist\fR\c
] [options]
.SH DESCRIPTION
See the TESTING sections below for a more complete description of
how to run NetPIPE in each environment.
The OPTIONS section describes the general options available for
each module.
See the README file from the tar-ball at
http://www.scl.ameslab.gov/Projects/NetPIPE/ for documentation on
the InfiniBand, GM, SHMEM, LAPI, and memcpy modules.
.PP
.B NetPIPE
uses a simple series of ping-pong tests over a range of message
sizes to provide a complete measure of the performance of a network.
It bounces messages of increasing size between two processes, whether across a
network or within an SMP system.
Message sizes are chosen at regular intervals, and with slight perturbations,
to provide a complete evaluation of the communication system.
Each data point involves many ping-pong tests to provide an accurate timing.
Latencies are calculated by halving the round-trip time for small
messages (less than 64 bytes).
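For example, if a 32-byte message averages a round-trip time of 60 microseconds,
the reported latency for that message size is 30 microseconds.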
The communication time for small messages is dominated by the
overhead in the communication layers, meaning that the transmission
time is latency bound.
For larger messages, the communication rate becomes bandwidth limited by
some component of the communication subsystem (PCI bus, network card link, network switch).
.PP
These measurements can be done at the message-passing layer
(MPI, MPI-2, and PVM) or at the native communications layers
that they run upon (TCP/IP, GM for Myrinet cards, InfiniBand,
SHMEM for Cray T3E systems, and LAPI for IBM SP systems).
Recent work aims at measuring some internal system properties, such as
a memcpy module that measures the internal memory copy rates,
and a disk module under development that measures the performance
of various I/O devices.
.PP
Some uses for NetPIPE include:
.PP
Comparing the latency and maximum throughput of various network cards.
.PP
Comparing the performance between different types of networks.
.PP
Looking for inefficiencies in the message-passing layer by comparing it
to the native communication layer.
.PP
Optimizing the message-passing layer and tuning OS and driver parameters
for optimal performance of the communication subsystem.
.PP
.B NetPIPE
is provided with many modules allowing it to interface with a wide
variety of communication layers.
It is fairly easy to write new interfaces for other reliable protocols
by using the existing modules as examples.
.SH TESTING TCP
NPtcp can now be launched in two ways: by manually starting NPtcp on
both systems, or by using the nplaunch script.
To manually start NPtcp, the NetPIPE receiver must be
started first on the remote system using the command:
.PP
.RS
NPtcp [options]
.RE
.PP
then the primary transmitter is started on the local system with the command:
.PP
.RS
NPtcp \-h \fIreceiver_hostname\fR [options]
.RE
.PP
Any options used must be the same on both sides.
.PP
The nplaunch script uses ssh to launch the remote receiver
before starting the local transmitter.
To use rsh, simply change the ssh commands within the nplaunch script to rsh.
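For example, an invocation of the form
.PP
.RS
nplaunch NPtcp \-h \fIreceiver_hostname\fR [options]
.RE
.PP
starts the receiver on \fIreceiver_hostname\fR via ssh and then runs the
local transmitter with the same options (the exact arguments expected by
nplaunch may vary; see the script itself).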
The
.BI \-b \ TCP_buffer_sizes\fR
option sets the TCP socket buffer size, which can greatly influence
the maximum throughput on some systems. A throughput graph that
flattens out suddenly may be a sign of the performance being limited
by the socket buffer sizes.
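For example, to test with 256 KB socket buffers (an illustrative value),
start the receiver with
.B "NPtcp \-b 262144"
and the transmitter with
.B "NPtcp \-h receiver_hostname \-b 262144"
on the other system.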
.SH TESTING MPI and MPI-2
Use of the MPI interface for NetPIPE depends on the MPI implementation
being used.
All will require the number of processes to be specified, usually with a
.B \-np
argument. Cluster environments may require a list of the
hosts being used, either during initialization of MPI (during lamboot
for LAM-MPI) or when each job is run (using a \-machinefile argument
for MPICH).
For LAM-MPI, for example, put the list of hosts in hostlist, boot LAM
with the lamboot command, and run NetPIPE using:
.PP
mpirun \-np 2 NPmpi [NetPIPE options]
.PP
For MPICH use a command like:
.PP
mpirun \-machinefile \fIhostlist\fR \-np 2 NPmpi [NetPIPE options]
.PP
To test the 1-sided communications of the MPI-2 standard, compile
the MPI-2 interface of NetPIPE instead.
Running it as described above, NetPIPE will use 1-sided MPI_Put()
calls in both directions, with each receiver blocking until the
last byte has been overwritten before bouncing the message back.
Use the
.B \-f
option to force usage of a fence to block rather than an overwrite
of the last byte.
The
.B \-g
option will use MPI_Get() functions to transfer the data rather than
MPI_Put() calls.
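For example, assuming the MPI-2 interface is built as an
.B NPmpi2
binary (the name may differ on your installation), a get-based test could
be run with
.BR "mpirun \-np 2 NPmpi2 \-g" .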
.SH TESTING PVM
Start the pvm system using:
.PP
.RS
pvm
.RE
.PP
and add a second machine with the PVM command:
.PP
.RS
add \fIothermachine\fR
.RE
.PP
Exit the PVM command line interface using quit, then run the PVM NetPIPE
receiver on one system with the command:
.PP
.RS
NPpvm [options]
.RE
.PP
and run the PVM NetPIPE transmitter on the other system with the command:
.PP
.RS
NPpvm \-h \fIreceiver_hostname\fR [options]
.RE
.PP
Any options used must be the same on both sides.
.PP
The nplaunch script may also be used with NPpvm as described above for NPtcp.
.SH TESTING METHODOLOGY
.B NetPIPE
tests network performance by sending a number of messages at each
block size, starting from the lower bound on the message sizes.
The message size is incremented until the upper bound on the message size is
reached or the time to transmit a block exceeds one second, whichever
occurs first. Message sizes are chosen at regular intervals, and at
slight perturbations from them, to provide a more complete evaluation
of the communication subsystem.
.PP
The output file may be graphed using a program such as
.BR gnuplot (1).
The output file contains three columns: the number of bytes in the block,
the transfer rate in bits per second, and
the time to transfer the block (half the round-trip time).
The first two columns are normally used to graph the throughput
vs block size, while the third column provides the latency.
A
.B throughput versus block size
graph can be created by graphing bytes versus bits per second.
The
.B gnuplot
commands for such a graph would be:
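.PP
.RS
.nf
set logscale x
plot "np.out" using 1:2 with linespoints
.fi
.RE
.PP
These commands are only an illustration: they assume the default output
file np.out and plot column 1 (block size in bytes) against column 2
(throughput in bits per second) on a logarithmic x axis.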
.SH OPTIONS
.TP
.B \-a
Asynchronous mode: prepost receives (MPI, IB modules).
.TP
.BI \-b \ \fITCP_buffer_sizes\fR
Set the send and receive TCP buffer sizes (TCP module only).
.TP
.B \-B
Burst mode where all receives are preposted at once (MPI, IB modules).
.TP
.B \-f
Use a fence to block for completion (MPI2 module only).
.TP
.B \-g
Use MPI_Get() instead of MPI_Put() (MPI2 module only).
.TP
.BI \-h \ \fIhostname\fR
Specify the name of the receiver host to connect to (TCP, PVM, IB, GM).
.TP
.B \-I
Invalidate cache to measure performance without cache effects (mostly affects
IB and memcpy modules).
.TP
.B \-i
Do an integrity check instead of a performance evaluation.
.TP
.BI \-l \ \fIstarting_msg_size\fR
Specify the lower bound for the size of messages to be tested.
.TP
.BI \-n \ \fInrepeats\fR
Set the number of repeats for each test to a constant.
Otherwise, the number of repeats is chosen to provide an accurate
timing for each test. Be very careful when specifying a low number:
the total time for the ping-pong test must still exceed the timer accuracy.
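For example,
.B "\-n 100"
performs exactly 100 ping-pong exchanges at each block size.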
.TP
.BI \-O \ \fIsource_offset,dest_offset\fR
Specify the source and destination offsets of the buffers from perfect
page alignment.
.TP
.BI \-o \ \fIoutput_filename\fR
Specify the output filename (default is np.out).
.TP
.BI \-p \ \fIperturbation_size\fR
NetPIPE chooses the message sizes at regular intervals, increasing them
exponentially from the lower boundary to the upper boundary.
At each point, it also tests perturbations of 3 bytes above and 3 bytes
below each test point to find idiosyncrasies in the system.
This perturbation value can be changed with this option.
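For example, with a perturbation size of 3, a test point at 1024 bytes is
also measured at 1021 and 1027 bytes.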
.TP
.B \-r
This option resets the TCP sockets after every test (TCP module only).
It is necessary for some streaming tests to get good measurements
since the socket window size may otherwise collapse.
.TP
.B \-s
Set streaming mode where data is only transmitted in one direction.
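For example, a one-directional TCP streaming test can be run by starting
the receiver with
.B "NPtcp \-s \-r"
and the transmitter with
.B "NPtcp \-h receiver_hostname \-s \-r"
(remember that both sides must be given the same options).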
.TP
.B \-S
Use synchronous sends (MPI module only).
.TP
.BI \-u \ \fIupper_bound\fR
Specify the upper bound on the size of messages being tested.
By default, NetPIPE will stop when
the time to transmit a block exceeds one second.
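For example,
.B "\-l 1 \-u 1048576"
restricts the test to message sizes between 1 byte and 1 MB.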
.TP
.B \-z
Receive messages using MPI_ANY_SOURCE (MPI module only).
.TP
.B \-2
Set bi-directional mode where both sides send and receive at the
same time (supported by most modules).
When using this option with MPI, it may also be necessary to use the
.B \-a
option to choose asynchronous communications for MPI to avoid freeze-ups.
For TCP, the maximum test size will be limited by the TCP socket buffer sizes.
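For example, a bi-directional asynchronous MPI test might be run as
.BR "mpirun \-np 2 NPmpi \-2 \-a" .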
.SH FILES
.TP
.I np.out
Default output file for NetPIPE. The file name can be changed with the
.B \-o
option.
.SH AUTHOR
The original NetPIPE core plus TCP and MPI modules were written by
Quinn Snell, Armin Mikler, Guy Helmer, and John Gustafson.
NetPIPE is currently being developed and maintained by Dave Turner
with contributions from many students (Bogdan Vasiliu, Adam Oline,
Xuehua Chen, and Brian Smith).
.PP
Send comments/bug-reports to:
<netpipe@scl.ameslab.gov>.
.PP
Additional information about
.B NetPIPE
can be found on the World Wide Web at
.I http://www.scl.ameslab.gov/Projects/NetPIPE/
.SH BUGS
As of version 3.6.1, there is a bug that causes NetPIPE to segfault on
RedHat Enterprise systems. I will debug this as soon as I get access to a
few such systems. -Dave Turner (turner@ameslab.gov)