This material was prepared as an account of work sponsored by an
agency of the United States Government. Neither the United States
Government nor the United States Department of Energy, nor Battelle,
nor any of their employees, MAKES ANY WARRANTY, EXPRESS OR IMPLIED,
OR ASSUMES ANY LEGAL LIABILITY OR RESPONSIBILITY FOR THE ACCURACY,
COMPLETENESS, OR USEFULNESS OF ANY INFORMATION, APPARATUS, PRODUCT,
SOFTWARE, OR PROCESS DISCLOSED, OR REPRESENTS THAT ITS USE WOULD NOT
INFRINGE PRIVATELY OWNED RIGHTS.

This software and its documentation were produced with United States
Government support under Contract Number DE-AC06-76RLO-1830 awarded
by the United States Department of Energy. The United States
Government retains a paid-up non-exclusive, irrevocable worldwide
license to reproduce, prepare derivative works, perform publicly and
display publicly by or for the US Government, including the right to
distribute to other US Government contractors.

The primary current source of funding for development of GA is the DoE-2000
ACTS project. GA is a part of the ACTS toolkit:

    ./configure && make && make install

should compile the static GA library (libga.a) to use sockets and install
headers and libraries to /usr/local/include and /usr/local/lib, respectively.

Please refer to the INSTALL file for generic build instructions. That is a
good place to start if you are new to using "configure; make; make install"
types of builds. Detailed instructions are covered later in this file.

QUESTIONS/HELP/SUPPORT/BUG-REPORT
=================================

email: hpctools@pnl.gov

If you encounter any problems, please first refer to the file NOTES located in
the same directory and see the GA support webpage:

http://www.emsl.pnl.gov/docs/global/support.html

Please don't hesitate to send us an email. An archive of emails is available

http://groups.google.com/group/hpctools

WHERE IS THE DOCUMENTATION?
===========================

The GA webpage has the most current versions of the Fortran and C documentation
and the User's Manual in the HTML format:

http://www.emsl.pnl.gov/docs/global/

This directory contains the Global Arrays (GA), Aggregate Remote Memory Copy
Interface (ARMCI) run-time library, and Memory Allocator (MA), parallel I/O
libraries (DRA, EAF, SF), TCGMSG, and TCGMSG-MPI packages bundled together.

Global Arrays is a portable Non-Uniform Memory Access (NUMA) shared-memory
programming environment for distributed and shared memory computers. It
augments the message-passing model by providing shared-memory-like access to
distributed dense arrays.

ARMCI provides one-sided remote memory operations used by GA.

DRA (Disk Resident Arrays) is a parallel I/O library that maintains dense 2-dim

SF (Shared Files) is a parallel I/O library that allows noncollective I/O to a

EAF (Exclusive Access Files) is a parallel I/O library that supports I/O to

TCGMSG is a simple and efficient, but increasingly obsolete, message-passing
library.

TCGMSG-MPI is a TCGMSG interface implementation on top of MPI and ARMCI.

MA is a dynamic memory allocator/manager for Fortran and C programs.

GA++ is a C++ binding for global arrays.

See file 'COPYRIGHT' for copying conditions.
See file 'INSTALL' for compilation and installation instructions (generic).
See file 'NEWS' for a list of major changes in the current release.
See file 'AUTHORS' for the names of anyone who has contributed to GA.
See file 'NOTES' for a few platform-specific symptoms and fixes.

DIRECTORY STRUCTURE (ALPHABETICALLY)
====================================

    + config (configuration makefile includes)
    + doc (documentation for ARMCI library)
    + lib (compatibility source files for missing system features)
    + src (source code for ARMCI library)
- build-aux (autotools support scripts)
- cca (common component architecture)
- compat (compatibility source files for missing system features)
- doc (documentation)
- ga++ (C++ Bindings for Global Arrays)
    + doc (contains sample Doxyfile for Doxygen doc generator)
    + src (source code for GA C++ bindings)
    + testing (test programs)
- gaf2c (Fortran-to-C compatibility library and tests)
    + doc (paper & documentation in PostScript, HTML & plain text)
    + src (source code for GA library)
    + testing (GA test programs and performance results)
    + trace (library and programs to generate and process tracefiles)
    + X (xregion visualization program for GA)
- LinAlg (linear algebra software used by GA)
- m4 (autoconf macros)
- ma (Memory Allocator)
    + dra (Disk Resident Array Library code)
    + eaf (Exclusive Access Files Library code)
    + elio ("device" layer for other parallel I/O models)
    + sf (Shared Files Library code)
- tcgmsg (simple, legacy message-passing library)
    + tcgmsg-mpi (TCGMSG on top of MPI)

HOW TO BUILD THE PACKAGE?
=========================

Please refer to the INSTALL file for generic build instructions. That is a
good place to start if you are new to using "configure; make; make install"
types of builds. The following will cover platform-specific considerations as
well as the various optional features of GA. Customizations to the GA build
via the configure script are discussed next.

Configuration Options
---------------------

There are many options available when configuring GA. Although configure can
be safely run within this distribution's root folder, we recommend performing
an out-of-source (aka VPATH) build. This will cleanly separate the generated
Makefiles and compiled object files and libraries from the source code. This
will allow, for example, one build using sockets and another build using
OpenIB for the communication layer to share the same source tree, e.g.::

    mkdir bld_mpi_sockets && cd bld_mpi_sockets && ../configure
    mkdir bld_mpi_openib && cd bld_mpi_openib && ../configure --with-openib
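
Each command line above starts from the distribution root. A minimal dry-run
sketch of the same idea follows: it only prints the commands and wraps each
build in a subshell so that the "cd" of one build does not affect the next.
The helper name vpath_build_cmd is hypothetical and is not part of GA.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for out-of-source (VPATH) builds that
# share one source tree. Nothing is configured or built here.
# Assumption: the printed commands are meant to be run from the
# distribution root, where the configure script lives.

# vpath_build_cmd is a hypothetical helper, not something GA provides.
# $1 = build directory name, remaining arguments = configure options.
vpath_build_cmd() {
    dir=$1
    shift
    cmd="../configure"
    if [ $# -gt 0 ]; then
        cmd="$cmd $*"
    fi
    # The parentheses start a subshell, so the "cd" stays local to it.
    echo "(mkdir -p $dir && cd $dir && $cmd)"
}

sockets_cmd=$(vpath_build_cmd bld_mpi_sockets)
openib_cmd=$(vpath_build_cmd bld_mpi_openib --with-openib)

echo "$sockets_cmd"
echo "$openib_cmd"
```

Pasting the two printed lines into a shell at the distribution root would
create both build directories side by side.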

Regardless of your choice to perform a VPATH build, the following should
hopefully elucidate the myriad options to configure. Only the options
requiring additional details are documented here. ./configure --help will
certainly list more options in addition to limited documentation.

  --disable-f77           Disable Fortran code. This used to be the old
                          GA_C_CORE or NOFORT environment variables which
                          enabled the C++ bindings. However, it is severely
                          broken. There are certain cases where Fortran code is
                          required but this will not inhibit the building of the
                          C++ bindings. In the future we may be able to
                          eliminate the need for the Fortran compiler/linker.
                          Use at your own risk (of missing symbols at link-time.)
  --enable-cxx            Build C++ interface. This will require the C++ linker
                          to locate the Fortran libraries (handled
                          automatically) but user C++ code will require the same
                          considerations (C++ linker, Fortran libraries.)
  --disable-opt           Don't use hard-coded optimization flags. GA is a
                          highly-optimized piece of software. There are certain
                          optimization levels or flags that are known to break
                          the software. If you experience mysterious faults,
                          consider rebuilding without optimization by using this
  --enable-peigs          Enable Parallel Eigensystem Solver interface. This
                          will build the stubs required to call into the peigs
  --enable-checkpoint     Enable checkpointing. Untested. For use with old
                          X-based visualization tool.
  --enable-profile        Enable profiling. Not sure what this does, sorry.
  --enable-trace          Enable tracing. Not sure what this does, sorry.
  --enable-thread-safety  **unsupported** Turn on thread safety.
  --enable-underscoring   Force single underscore for all external Fortran
                          symbols. Usually, configure is able to detect the name
                          mangling scheme of the detected Fortran compiler and
                          will default to using what is detected. This includes
                          any variation of zero, one, or two underscores or
                          whether UPPERCASE or lowercase symbols are used. If
                          you want to force a single underscore which was the
                          default of older GA builds, use this option.
                          Otherwise, you can use the FFLAGS environment variable
                          to override the Fortran compiler's or platform's
                          defaults e.g. configure FFLAGS=-fno-underscoring.
  --enable-i4             Use 4 bytes for Fortran INTEGER size. Otherwise, the
                          default INTEGER size is set to the results of the C
                          sizeof(void*) operator.
  --enable-i8             Use 8 bytes for Fortran INTEGER size. Otherwise, the
                          default INTEGER size is set to the results of the C
                          sizeof(void*) operator.
  --enable-shared         Build shared libraries [default=no]. Useful, for
                          example, if you plan on wrapping GA with an
                          interpreted language such as Python. Otherwise, some
                          systems only support static libraries (or vice versa)
                          but static libraries are the default.

For most of the external software packages an optional argument is allowed
(represented as ARG below.) **ARG can be omitted** or can be one or more
whitespace-separated directories, linker or preprocessor directives. For

    --with-mpi="/path/to/mpi -lmylib -I/mydir"
    --with-mpi=/path/to/mpi/base
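
To make the three kinds of ARG token concrete, here is a small illustrative
sketch that sorts the whitespace-separated tokens of the first example into
directories, linker directives, and preprocessor directives. It is only an
illustration; the rules that configure itself applies may differ, and the
function name classify_arg is hypothetical.

```shell
#!/bin/sh
# Illustration only: classify each whitespace-separated token of an ARG
# string. This is NOT the logic used by GA's configure script; it assumes
# that -I/-D mark preprocessor directives, -l/-L mark linker directives,
# and anything else is a directory.

# classify_arg is a hypothetical helper name.
classify_arg() {
    # $1 is deliberately unquoted in the loop so the shell splits it
    # on whitespace into individual tokens.
    for token in $1; do
        case $token in
            -I*|-D*) echo "cppflag: $token" ;;
            -l*|-L*) echo "ldflag: $token" ;;
            *)       echo "directory: $token" ;;
        esac
    done
}

result=$(classify_arg "/path/to/mpi -lmylib -I/mydir")
echo "$result"
```

For the ARG value shown above, the sketch reports one directory, one linker
directive, and one preprocessor directive, in that order.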

The messaging libraries supported include MPI, TCGMSG, and TCGMSG over MPI. If
you omit their respective --with-* option, MPI is the default. GA can be built
to work with MPI or TCGMSG. Since the TCGMSG package is small (compared to
portable MPI implementations) and compiles fast, it is still bundled with the GA

  --with-mpi=ARG          Select MPI as the messaging library (default). If you
                          omit ARG, we attempt to locate the MPI compiler
                          wrappers. If you supply anything for ARG, we will
                          parse ARG as indicated above.
  --with-tcgmsg           Select TCGMSG as the messaging library; if
                          --with-mpi is also specified then TCGMSG over MPI is
  --with-vampir=ARG       Enable VAMPIR performance tracing.

  --with-blas=ARG         Use external BLAS library; attempt to detect
                          sizeof(INTEGER) used to compile BLAS; if not found, an
                          internal BLAS is built
  --with-blas4=ARG        Use external BLAS library compiled with
  --with-blas8=ARG        Use external BLAS library compiled with
  --with-lapack=ARG       Use external LAPACK library. If not found, an internal
  --with-scalapack=ARG    Use external ScaLAPACK library.

The ARMCI networks supported are listed next. Our ability to automatically
locate required headers and libraries is currently inadequate. Therefore, you
will likely need to specify the optional ARG pointing to the necessary
directories and/or libraries. sockets is the default ARMCI network if nothing
else is

  --with-bgml=ARG         select armci network as IBM BG/L
  --with-cray-shmem=ARG   select armci network as Cray XT shmem
  --with-dcmf=ARG         select armci network as IBM BG/P Deep Computing
  --with-lapi=ARG         select armci network as IBM LAPI
  --with-mpi-spawn=ARG    select armci network as MPI-2 dynamic process mgmt
  --with-openib=ARG       select armci network as InfiniBand OpenIB
  --with-portals=ARG      select armci network as Cray XT portals
  --with-sockets=ARG      select armci network as Ethernet TCP/IP (default)

There are some influential environment variables as documented in configure
--help; however, there are a few that are special to GA.

See --enable-thread-safety. I don't know what this does, sorry.

Fortran compiler flag to set the default INTEGER size. We know about certain
Fortran flags that set the default INTEGER size, but there will certainly be
some new (or old) ones that we don't know about. If the configure test to
determine the correct flag fails, please try setting this variable and

- F2C_HIDDEN_STRING_LENGTH_AFTER_ARGS
If cross compiling, set to either "yes" (default) or "no" (after string).
For compatibility between Fortran and C, a Fortran subroutine written in C
that takes a character string must take an additional argument (one per
character string) indicating the length of the string. This 'hidden'
argument appears either immediately after the string in the argument list
or after all other arguments to the function. This is compiler dependent. We
attempt to detect this behavior automatically, but in the case of
cross-compiled systems it may be necessary to specify the less usual after
string convention if the gaf2c/testarg program crashes.

Special Notes for BLAS
----------------------

BLAS, being a Fortran library, can be compiled with a default INTEGER size of
4 or a promoted INTEGER size of 8. Experience has shown us that most of the
time the default size of INTEGER used is 4. In some cases, however, you may
have an external BLAS library which is using 8-byte INTEGERs. In order to
correctly interface with an external BLAS library, GA must know the size of
INTEGER used by the BLAS library.

configure has the following BLAS-related options: --with-blas, --with-blas4,
and --with-blas8. The latter two will force the INTEGER size to 4 or
8 bytes, respectively. The first option, --with-blas, defaults to 4-byte
INTEGERs; *however*, in the two special cases of using ACML or MKL, it is
possible to detect 8-byte INTEGERs automatically. As documented in the ACML
manual, if the path to the library has "_int64" then 8-byte INTEGERs are used.
As documented in the MKL manual, if the library is "ilp64", then 8-byte

You may always override --with-blas by specifying the INTEGER size using one
of the two more specific options.
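
The naming rule described in this section can be sketched as a simple pattern
match. This is a hedged illustration of the rule as stated here, not the test
that configure actually runs; the function name guess_blas_int_size and the
example paths are hypothetical.

```shell
#!/bin/sh
# Illustration of the rule described above: assume 4-byte INTEGERs unless
# the --with-blas argument contains "_int64" (ACML naming) or "ilp64"
# (MKL naming), in which case assume 8-byte INTEGERs.
# This is a sketch, not the detection code used by GA's configure.

# guess_blas_int_size is a hypothetical helper name.
# $1 = the value that would be passed as --with-blas=ARG.
guess_blas_int_size() {
    case $1 in
        *_int64*|*ilp64*) echo 8 ;;
        *)                echo 4 ;;
    esac
}

# The paths below are made-up examples.
acml_size=$(guess_blas_int_size "/opt/acml/gfortran64_int64 -lacml")
mkl_size=$(guess_blas_int_size "-L/opt/mkl/lib -lmkl_intel_ilp64")
plain_size=$(guess_blas_int_size "-lblas")

echo "ACML example: $acml_size"
echo "MKL example: $mkl_size"
echo "plain example: $plain_size"
```

When the name gives no hint and the library really does use 8-byte INTEGERs,
--with-blas8 remains the explicit way to say so.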

Cross-Compilation Issues
------------------------

Certain platforms cross-compile from a login node for a compute node, or one
might choose to cross-compile for other reasons. Cross-compiling requires the
use of the --host option to configure which indicates to configure that certain
run-time tests should not be executed. See INSTALL for details on use of the

Two of our target platforms are known to require cross-compilation, Cray XT and

It has been noted that configure still succeeds without the use of the --host
flag. If you experience problems without --host, we recommend::

    configure --host=x86_64-unknown-linux-gnu

And if that doesn't work (cross-compilation is not detected) you must then
*force* cross-compilation using both --host and --build together::

    configure --host=x86_64-unknown-linux-gnu --build=x86_64-unknown-linux-gnu

Currently the only way to detect the BGP platform and compile correctly is to

    configure --host=powerpc-bgp-linux

The rest of the configure options apply as usual e.g. --with-dcmf in this case.

Unless otherwise noted you can try to override the default compiler names
detected by configure by defining F77, CC, and CXX for Fortran (77), C, and C++
compilers, respectively. Or when using the MPI compilers MPIF77, MPICC, and
MPICXX for MPI Fortran (77), C, and C++ compilers, respectively::

    configure F77=f90 CC=gcc
    configure MPIF77=mpif90 MPICC=mpicc

Although you can change the compiler at make-time, it will likely fail. Many
platform-specific compiler flags are detected at configure-time based on the
compiler selection. If changing compilers, we recommend rerunning configure as

By this point we assume you have successfully run configure either from the
base distribution directory or from a separate build directory (aka VPATH
build.) You are now ready to run 'make'. You can optionally run parallel
make using the "-j" option which significantly speeds up the build. If using
the MPI compiler wrappers, occasionally using "-j" will cause build failures
because the MPI compiler wrapper creates a temporary symlink to the mpif.h
header. In that case, you won't be able to use the "-j" option. Further, the
influential environment variables used at configure-time can be overridden at
make-time in case problems are encountered. For example::

    ./configure CFLAGS=-Wimplicit

    make CFLAGS="-Wimplicit -g -O0"

One particularly influential make variable is "V" which controls the verbosity
of the make output. This variable corresponds to the --enable-silent-rules and
--disable-silent-rules configure-time options, but I often prefer the make-time
variable::

    make V=0 (configure --enable-silent-rules)
    make V=1 (configure --disable-silent-rules)

Running "make checkprogs" will build most test and example programs. Note that
not all tests are built -- some tests depend on certain features being
detected or enabled during configure. These programs are not intended to be
examples of good GA coding practices because they often include private
headers. However, they help us debug or time our GA library.

Running "make check" will build most test and example programs (See "make
checkprogs" notes above) in addition to running the test suite. The test
suite runs both the serial and parallel tests. The test suite must know how
to launch the parallel tests via the MPIEXEC variable. Please read your MPI
flavor's documentation on how to launch, or if using TCGMSG you will use the
"parallel" tool. For example, the following is the command to launch the test
suite when compiled with OpenMPI::

    make check MPIEXEC="mpiexec -np 4"

All tests have a per-test log file containing the output of the test. So if
the test is global/testing/test.x, the log file would be
global/testing/test.log. The output of failed tests is collected in the
top-level log summary test-suite.log.
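
The naming convention for per-test logs can be expressed with shell suffix
removal. The following is a minimal sketch of that mapping, assuming test
programs always end in ".x" as in the example above; the helper name
log_for_test is hypothetical.

```shell
#!/bin/sh
# Sketch: derive the per-test log file name from a test program path by
# replacing the trailing ".x" with ".log".
# Assumption: test programs end in ".x", as in the example above.

# log_for_test is a hypothetical helper name.
log_for_test() {
    # ${1%.x} strips a trailing ".x" from the first argument.
    echo "${1%.x}.log"
}

log_path=$(log_for_test global/testing/test.x)
echo "$log_path"
```

After a failed "make check", the resulting path can be passed to a pager to
read the output of that one test.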

The test suite will recurse into the ARMCI directory and run the ARMCI test
suite first. If the ARMCI test suite fails, the GA test suite will not run
(the assumption here is that you should fix bugs in the dependent library
first.) To run only the GA test suite, type "make check-ga" with the
appropriate MPIEXEC variable.

How to Run GA Test Programs?
----------------------------

This depends on the system. MPPs like Intel iPSC/860, Delta, Paragon, IBM SPx,
Cray T3/E have their own commands for submitting parallel jobs.

On workstations and clusters, GA programs are run like ordinary message-passing

To run GA programs with MPI, you need to build the package to be compatible
with MPI (see README in ./global and documentation in ./global/doc/ ) and
run it as any other MPI program. The GA package has been tested only with a
limited number of MPI implementations (MPICH, and vendor's: Intel, IBM, Sun,

The TCGMSG `parallel' command (built automatically in
./tcgmsg/ipcv4.0/parallel if needed) is used to start a job on clusters if you
are using TCGMSG as your message-passing library. On workstations, GA-based
programs that use TCGMSG can be run with a single process without the
`parallel' command -- just by typing the program name -- which is useful for
debugging.

a. LINUX64 supports Alpha, Itanium, Opteron, and EM64T processors only.

b. The SGI_N32 version is recommended on all newer SGI boxes including
   the O2, Octane, Origin, Indigo2, and PowerChallenge systems
   unless the system has lots of memory and your program uses
   huge arrays (>4GB) in which case 64-bit addressing is required
   (SGITFP version). In addition, the TARGET_CPU environment
   variable can be used to choose the optimal compiler flags
   for R8000 and R10000 processors.

c. On 64-bit platforms, if you are using BLAS libraries that use
   8-byte integers, then set the following environment variables:

       setenv BLAS_LIB specify_your_blas_library
       e.g. setenv BLAS_LIB -L/usr/lib/libblas.a

d. To turn Async I/O on under Linux, set environment variable USE_LINUXAIO=y

global/src/gaconfig.h has a variable called AVOID_MA_STORAGE. If defined, this
variable forces GA to use ARMCI memory which can lead to better performance on
platforms on which memory needs to be registered for fast communication.

Setting the environment variable MA_USE_ARMCI_MEM forces the MA library to use
ARMCI memory, communication via which can be faster on networks like GM, VIA