regardless of which cpu you specify. I suspect this is a bug, because on
earlier systems, the remaining CPUs were controlled via a logical link to
/sys/devices/system/cpu/cpu0/. In this case, the only way I found to force
the second processor to also run at its peak frequency was to issue the
following as root after setting CPU0 to performance:
   cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor \
      /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor

For non-broken systems, you instead issue the above command with -c <#> appended
to change the performance of each core in turn. For example, to speed up both
processors of a dual system you would issue:
   /usr/bin/cpufreq-selector -g performance -c 0
   /usr/bin/cpufreq-selector -g performance -c 1

On Kubuntu, I had problems with this not working because scaling_max_freq
was set to the minimal speed. To fix this, I first had to increase the max
scaling frequency, which you can do as root with the following commands, where
<#> is replaced by each processor number in turn (e.g., 0 and 1 for a dual
processor system):
   cd /sys/devices/system/cpu/cpu<#>/cpufreq
   cp cpuinfo_max_freq scaling_max_freq
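
If you have many cores, the two commands above can be wrapped in a loop. The
following is only a minimal sketch: the function name raise_max_freq and its
optional root-directory argument are my own inventions (the argument lets you
try the loop on a scratch directory before running it as root on the real
files).

```shell
# Copy cpuinfo_max_freq over scaling_max_freq for every CPU directory.
# $1: root of the CPU tree (default /sys/devices/system/cpu).  Taking the
#     root as an argument is a hypothetical convenience, not part of any tool.
raise_max_freq() (
    root=${1:-/sys/devices/system/cpu}
    for d in "$root"/cpu[0-9]*/cpufreq; do
        [ -r "$d/cpuinfo_max_freq" ] || continue   # no cpufreq info here
        cp "$d/cpuinfo_max_freq" "$d/scaling_max_freq" || exit 1
    done
)
```

From a root shell you would then call raise_max_freq with no argument.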

In Kubuntu 9.10, the only fix I found was to issue:
   echo "performance" | sudo tee \
      /sys/devices/system/cpu/cpuX/cpufreq/scaling_governor
where 'X' is replaced by each of your cpu numbers in turn (e.g., if you have
a quad processor, you would issue this command four times, using X=[0,1,2,3]).
Note that "sudo echo ... > file" fails from a non-root shell, because the
redirection is performed by the calling shell rather than by sudo.
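
Rather than typing the command once per core, you can loop over every cpu
directory. This is a sketch under assumptions: set_all_governors and its
second (root directory) argument are hypothetical names of mine, and it must
be run from a root shell to write the real /sys files.

```shell
# Write one governor name into every CPU's scaling_governor file.
# $1: governor name (default "performance")
# $2: root of the CPU tree (default /sys/devices/system/cpu); hypothetical
#     argument so the loop can be tried on a scratch directory first.
set_all_governors() (
    governor=${1:-performance}
    root=${2:-/sys/devices/system/cpu}
    status=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue        # glob matched nothing, or no cpufreq
        if ! printf '%s\n' "$governor" > "$f"; then
            echo "could not write $f (are you root?)" >&2
            status=1
        fi
    done
    exit $status
)
```

Called with no arguments, it sets every core to performance; pass another
governor name as the first argument to restore your usual setting afterwards.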

Under MacOS or Windows, you may be able to change this under the power settings.
I have reports that issuing "powercfg /q" in cmd.exe under Windows will tell
you whether Windows is throttling or not.

ATLAS config tries to detect if CPU throttling is enabled, but it may not
always detect it, and sometimes may detect it after you have disabled it.
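
Since the detection is not fully reliable, on Linux you may want to check the
governors yourself before configuring. The sketch below assumes the same
/sys layout used in the commands above; the function name report_throttled
and its root-directory argument are hypothetical.

```shell
# Print every CPU whose governor is not "performance".
# Exit status is 1 if any such CPU was found, 0 otherwise.
# $1: root of the CPU tree (default /sys/devices/system/cpu).
report_throttled() (
    root=${1:-/sys/devices/system/cpu}
    found=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        gov=$(cat "$f")
        if [ "$gov" != performance ]; then
            echo "$f: $gov"
            found=1
        fi
    done
    exit $found
)
```

If it prints nothing, every core that exposes a governor is set to performance.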
instead. Note that the fortran interface to BLAS and LAPACK cannot be built
without a fortran compiler.

You typically must build ATLAS's interface routines with the compiler that
you use to do the linking, so that the proper libraries can be found. We
just discussed how to override the fortran choice; if you use a C compiler
that does not seamlessly interoperate with gcc, you may need to override
the C compiler as well. Overriding all of ATLAS's C compilers will typically
mean you can't use the architectural defaults, which will greatly increase
your install time and will potentially decrease your performance by a large
amount. Therefore, it is usually advised to only override the C interface
compiler, leaving the kernel routines to be compiled by the default C compiler
(usually gnu gcc). To override the C interface compiler, simply add these
flags to your configure invocation:
   -C ic <C compiler with path> -F ic 'C compiler flags'
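
As an illustration of how the two flags fit on a configure line, here is a
small sketch that only prints the command so you can inspect it first. The
compiler path and the -O2 flag are made-up placeholders, and "../configure"
assumes you are in a build directory one level below the ATLAS source tree.

```shell
# Hypothetical compiler path and flags; substitute your own.
ic_compiler=/opt/vendor/bin/cc
ic_flags='-O2'

# Print the configure line instead of running it.
print_configure() {
    printf "../configure -C ic %s -F ic '%s'\n" "$ic_compiler" "$ic_flags"
}
print_configure
```

Once the printed line looks right, paste it into your shell along with any
other configure flags you need.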

Note that all compilers used in an ATLAS install must be able to interoperate.
For more compiler-controlling flags, add --help to the configure command.

*********** Important x86 Compiler Advice ***********
If you are on an x86 and are using gcc 4.1, you should be aware that gcc 4.1
produces x87 code that achieves between 56% and 75% of the performance of the
code produced by gcc 3 (i.e. gcc3-produced code can be almost twice as fast as
gcc 4.1's), depending on the architecture. From our own timings, gcc 4.2 is
superior to either 4.1 or 3. Gcc 4.1 produces adequate performance only on
Intel Core machines. See ATLAS/doc/atlas_install.pdf for further details.

*********** Important CLANG Compiler Advice ***********
I have never successfully built ATLAS with clang. Clang produces the wrong
answer on many kernels, and poorer performance than gcc on every kernel
I've timed. I strongly recommend GNU gcc over Clang/LLVM.

********************************** BUILD **************************************
If config finishes without error, start the build/tuning process by:
ATLAS natively builds to a static library (i.e. libs that usually end in
".a" under unix and ".lib" under windows). ATLAS always builds such a library,
but it can also optionally be requested to build a dynamic/shared library
(typically ending in .so for unix, .dll for windows, and .dylib for OS X).
In order to do so, you must tell ATLAS up front to compile with the proper
flags (the same is true when building netlib's LAPACK, see the LAPACK note
below). To do this for GNU compilers, you typically just need to add
flag to your configure line.
This forces ATLAS to be built using position independent code (required for a
dynamic lib). If you use non-gnu compilers, you'll need to use -Fa to
pass the correct flag(s) to append to force position independent code for
each compiler (don't forget the gcc compiler used in the index files).
NOTE: Since gcc uses one fewer int register when compiling with this flag, this
could potentially impact performance of the architectural defaults,
but we have not seen it so far.

After your build is complete, you can cd to the OBJdir/lib directory, and
ask ATLAS to build the .so you want. If you want all libraries, including
the Fortran77 routines, the target choices are:
   shared    : Create shared versions of ATLAS's sequential libs
   ptshared  : Create shared versions of ATLAS's threaded libs
If you want only the C routines (e.g. you don't have a fortran compiler):
   cshared   : Create shared versions of ATLAS's sequential libs
   cptshared : Create shared versions of ATLAS's threaded libs
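
The four target names follow from two yes/no questions (do you have the
Fortran77 routines, and do you want the threaded libs). A minimal sketch of
that mapping, with a hypothetical helper name shared_target:

```shell
# Map two yes/no answers onto the four target names listed above.
# $1: "yes" if the Fortran77 routines were built, otherwise "no"
# $2: "yes" for the threaded libs, "no" for the sequential libs
shared_target() {
    case "$1,$2" in
        yes,no)  echo shared ;;
        yes,yes) echo ptshared ;;
        no,no)   echo cshared ;;
        no,yes)  echo cptshared ;;
        *) echo "usage: shared_target yes|no yes|no" >&2; return 2 ;;
    esac
}
# Example (not run here):
#   cd OBJdir/lib && make "$(shared_target yes no)"
```

Here OBJdir stands for your own build directory, as in the text above.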

****************** NOTE ON BUILDING A FULL LAPACK LIBRARY *********************
If you want to build a full LAPACK library, you must obtain and build
netlib's LAPACK library (ATLAS provides only a few lapack routines natively).
For the routines ATLAS does not provide natively, you will therefore only have
the Fortran77 interface (ATLAS's native lapack routines have both F77 and C
interfaces). Install LAPACK first (minimally, doing a "make lib" after
editing the make.inc for your system).
IMPORTANT NOTE: if you wish to build a dynamic library, remember to specify
-fPIC (assuming gfortran/g77) for both the OPTS and NOOPT macros of
Once the library is built, you have to tell ATLAS to use it during the
configure, which you do by adding the following flag to configure:
   --with-netlib-lapack=<path to lapack>
(e.g. --with-netlib-lapack=/home/whaley/LAPACK3.0/lapack_LINUX.a).

If you want to add lapack symbols to a previously built ATLAS, see
ATLAS/doc/LibReadme.txt for further info.
To build lapack, first download the lapack tarfile from
   www.netlib.org/lapack
Then, pass the flag to configure:
   --with-netlib-lapack-tarfile=/path/to/downloaded/tarfile

***************************** EXTENDED ATLAS TESTING **************************
ATLAS has two extended testers beyond the sanity checks that can be
automatically invoked from the BLDdir. These tests are longer running and
more complex to interpret than the sanity tests, and so not every user will
want to run them. They are particularly recommended for installers who wish
to use a developer release for production code.

--------------------------------- full_test -----------------------------------
The first is a set of testing scripts written by Antoine Petitet, that
randomly generate testcases for a host of ATLAS's testers. This testing
phase may take as long as two days to complete (and almost always takes
at least 4 hours). To perform this long-running test, simply issue:
   make full_test
If you are logged into the host machine remotely, chances are good your
connection will go down before the test completes. Therefore, you
should consider running the above command with nohup.
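
One way to do that is to put the job in the background under nohup and send
all of its output to a log file. The helper name run_detached and the log
file name in the example are only illustrative choices of mine.

```shell
# Run a command immune to hangups, logging stdout and stderr to a file.
# $1: log file; remaining arguments: the command to run.
# Prints the PID of the background job so you can check on it later.
run_detached() {
    log=$1; shift
    nohup "$@" > "$log" 2>&1 &
    echo $!
}
# Example (not run here):
#   run_detached full_test.log make full_test
```

You can then log out, and later inspect the log file to see how far the
tester has progressed.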

At the completion of the tests, the extensive output files will be searched
for errors (much as with the sanity tests), and the output sent to the screen.
If you have lost this screen of data, you can regenerate it with the command:
Running these tests will create a directory BLDdir/bin/AtlasTest where the
tester resides, and your output files will be stored in a $(ARCH) subdir.
If you want to rerun the testers from scratch (rather than just searching
old output), you can simply delete the entire BLDdir/bin/AtlasTest
directory tree, and do "make full_test" again.

----------------------------- lapack_test -------------------------------------
If you have installed the full LAPACK library, then you can run the standard
lapack testers as well. The command you give is:
   make lapack_test_[a,s,f]l_[ab,sb,fb,pt]
The first choice (choose one of three) controls which LAPACK library macro is
used in the link for testing:

   _l  LINK FOR LAPACK        Make.inc MACRO
   ==  =====================  ==============
   a   ATLAS's LAPACK         $(LAPACKlib)
   s   system LAPACK          $(SLAPACKlib)
   f   F77 reference LAPACK   $(FLAPACKlib)

The second choice (choose one of four) controls which BLAS macros are
used in the link for testing:
   _b/pt  LINK FOR BLAS          Make.inc MACRO
   =====  =====================  =========================================
   ab     ATLAS BLAS             $(F77BLASlib) $(CBLASlib) $(ATLASlib)
   sb     system BLAS            $(BLASlib)
   fb     F77 reference BLAS     $(FBLASlib)
   pt     ATLAS' threaded BLAS   $(PTF77BLASlib) $(PTCBLASlib) $(ATLASlib)
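
To see how the bracket notation expands, the following sketch simply prints
every target name that the two tables allow (three LAPACK choices times four
BLAS choices). The function name is hypothetical; it only builds strings and
does not run make.

```shell
# List every target implied by lapack_test_[a,s,f]l_[ab,sb,fb,pt].
lapack_test_targets() {
    for l in a s f; do
        for b in ab sb fb pt; do
            echo "lapack_test_${l}l_${b}"
        done
    done
}
lapack_test_targets
```

Each printed name is a make target, e.g. lapack_test_al_ab or
lapack_test_fl_fb.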

Not all of these combinations will work without user modification of Make.inc.
You will need to fill in values for
if you want to run the lapack tester against these libraries.

Usually, you will want to test your newly installed ATLAS LAPACK & BLAS:
   make lapack_test_al_ab

As before, once the testing is complete, you will get the output of a search
for errors through all output files, and you can search them again with:
   make scope_lapack_test_al_ab

Unfortunately, the lapack testers report errors on almost all platforms.
So, how do you know if you have a real error? Real errors will usually
have residuals in the 10^6 range, rather than O(1) (smaller residuals mean
less error). If you are unsure, the best way is to contrast ATLAS with an
all-F77 version:
   make lapack_test_fl_fb
(To run this test, you will have to build a stock netlib LAPACK library,
and fill out Make.inc's FLAPACKlib macro appropriately.) You can then see
how the errors reported by ATLAS stack up against the all-F77 version:
if they are roughly the same, then you are usually OK.

All the lapack testers create a directory BLDdir/bin/LAPACK_TEST. For
each test you run there will be a subdirectory
   LAOUT_[A,S,F]L_[AB,SB,FB,PT]
where all your output files will be located. Additionally, the results
of the scope (search for error) will be stored in
   BLDdir/bin/LAPACK_TEST/SUMMARY_<lapack>_<blas>

Therefore, a typical round of testing might be:
   make lapack_test_al_ab
   make lapack_test_fl_fb
   # compare SUMMARY_al_ab with SUMMARY_fl_fb to check for error
   make lapack_test_al_pt
   # compare SUMMARY_al_pt with SUMMARY_fl_fb to check for error in parallel lib
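
The "compare" steps can be done by eye, or with diff as in the sketch below.
I make no assumption about the layout of the SUMMARY files; diff just shows
which lines differ, and you still have to judge the residuals yourself. The
function name and its optional directory argument are hypothetical.

```shell
# Show how two scope summaries differ; diff prints nothing if they match.
# $1, $2: the <lapack>_<blas> suffixes, e.g. al_ab and fl_fb
# $3: directory holding the SUMMARY files (default follows the path above)
compare_summaries() {
    dir=${3:-BLDdir/bin/LAPACK_TEST}
    diff "$dir/SUMMARY_$1" "$dir/SUMMARY_$2"
}
# Example (not run here):
#   compare_summaries al_ab fl_fb
```

Replace BLDdir with your own build directory, or pass the directory as the
third argument.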

If you had an error, you might want to find out whether it is in ATLAS's BLAS
or in ATLAS's lapack, so you could do "make lapack_test_fl_ab": if the error
goes away, ATLAS's lapack is implicated; if it persists, ATLAS's BLAS is. If
you filled in the GotoBLAS for the SLAPACKlib & BLASlib macros, you could
scope the error properties of Goto's BLAS and LAPACK.
Many system/vendor LAPACK/BLAS do not provide all of the routines required
to run the LAPACK testers, and some ATLAS testers call ATLAS internal
routines. Therefore, the safest thing if you have missing symbol errors
when building system/vendor tests, is to use ATLAS to pick up any missing
symbols. For instance, here is an example Make.inc excerpt that makes all of
ATLAS's testers work with the GotoBLAS on my Athlon-64 workstation:
   BLASlib = /opt/lib/libgoto_opteronp-r1.26.a \
             $(F77BLASlib) $(CBLASlib) $(ATLASlib)
   SLAPACKlib = /opt/lib/libgoto_opteronp-r1.26.a $(FLAPACKlib)