15
Section 1. Introduction and General Information
17
Q1.2 How do I obtain FFTW?
18
Q1.3 Is FFTW free software?
19
Q1.4 What is this about non-free licenses?
20
Q1.5 In the West? I thought MIT was in the East?
22
Section 2. Installing FFTW
23
Q2.1 Which systems does FFTW run on?
24
Q2.2 Does FFTW run on Windows?
25
Q2.3 My compiler has trouble with FFTW.
26
Q2.4 FFTW does not compile on Solaris, complaining about const.
27
Q2.5 What's the difference between --enable-3dnow and --enable-k7?
28
Q2.6 What's the difference between the fma and the non-fma versions?
29
Q2.7 Which language is FFTW written in?
30
Q2.8 Can I call FFTW from Fortran?
31
Q2.9 Can I call FFTW from C++?
32
Q2.10 Why isn't FFTW written in Fortran/C++?
33
Q2.11 How do I compile FFTW to run in single precision?
36
Q3.1 Why not support the FFTW 2 interface in FFTW 3?
37
Q3.2 Why do FFTW 3 plans encapsulate the input/output arrays and not ju
38
Q3.3 FFTW seems really slow.
39
Q3.4 FFTW slows down after repeated calls.
40
Q3.5 An FFTW routine is crashing when I call it.
41
Q3.6 My Fortran program crashes when calling FFTW.
42
Q3.7 FFTW gives results different from my old FFT.
43
Q3.8 Can I save FFTW's plans?
44
Q3.9 Why does your inverse transform return a scaled result?
45
Q3.10 How can I make FFTW put the origin (zero frequency) at the center
46
Q3.11 How do I FFT an image/audio file in *foobar* format?
47
Q3.12 My program does not link (on Unix).
48
Q3.13 I included your header, but linking still fails.
49
Q3.14 My program crashes, complaining about stack space.
50
Q3.15 FFTW seems to have a memory leak.
51
Q3.16 The output of FFTW's transform is all zeros.
52
Q3.17 How do I call FFTW from the Microsoft language du jour?
53
Q3.18 Can I compute only a subset of the DFT outputs?
55
Section 4. Internals of FFTW
56
Q4.1 How does FFTW work?
57
Q4.2 Why is FFTW so fast?
60
Q5.1 FFTW 1.1 crashes in rfftwnd on Linux.
61
Q5.2 The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
62
Q5.3 The test programs in FFTW 1.2.1 fail when I change FFTW to use sin
63
Q5.4 The test program in FFTW 1.2.1 fails for n > 46340.
64
Q5.5 The threaded code fails on Linux Redhat 5.0
65
Q5.6 FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dime
66
Q5.7 FFTW 2.0's complex transforms give the wrong results with prime fa
67
Q5.8 FFTW 2.1.1's MPI test programs crash with MPICH.
68
Q5.9 FFTW 2.1.2's multi-threaded transforms don't work on AIX.
69
Q5.10 FFTW 2.1.2's complex transforms give incorrect results for large p
70
Q5.11 FFTW 2.1.3's multi-threaded transforms don't give any speedup on S
71
Q5.12 FFTW 2.1.3 crashes on AIX.
16
73
===============================================================================
18
75
Section 1. Introduction and General Information
78
Q1.2 How do I obtain FFTW?
79
Q1.3 Is FFTW free software?
80
Q1.4 What is this about non-free licenses?
81
Q1.5 In the West? I thought MIT was in the East?
21
83
-------------------------------------------------------------------------------
67
129
would neither affect their licensing revenue nor irritate existing
132
-------------------------------------------------------------------------------
134
Question 1.5. In the West? I thought MIT was in the East?
136
Not to an Italian. You could say that we're a Spaghetti Western (with
137
apologies to Sergio Leone).
70
139
===============================================================================
72
141
Section 2. Installing FFTW
143
Q2.1 Which systems does FFTW run on?
144
Q2.2 Does FFTW run on Windows?
145
Q2.3 My compiler has trouble with FFTW.
146
Q2.4 FFTW does not compile on Solaris, complaining about const.
147
Q2.5 What's the difference between --enable-3dnow and --enable-k7?
148
Q2.6 What's the difference between the fma and the non-fma versions?
149
Q2.7 Which language is FFTW written in?
150
Q2.8 Can I call FFTW from Fortran?
151
Q2.9 Can I call FFTW from C++?
152
Q2.10 Why isn't FFTW written in Fortran/C++?
153
Q2.11 How do I compile FFTW to run in single precision?
75
155
-------------------------------------------------------------------------------
77
157
Question 2.1. Which systems does FFTW run on?
79
159
FFTW is written in ANSI C, and should work on any system with a decent C
80
compiler. (See also pageref:runOnWindows::' and
81
pageref:compilerCrashes::'.) FFTW can also take advantage of certain
160
compiler. (See also Q2.2 `Does FFTW run on Windows?', Q2.3 `My compiler
161
has trouble with FFTW.'.) FFTW can also take advantage of certain
82
162
hardware-specific features, such as cycle counters and SIMD instructions,
83
163
but this is optional.
87
167
Question 2.2. Does FFTW run on Windows?
89
It should. FFTW was not developed on Windows, but the source code is
90
essentially straight ANSI C. Many users have reported using FFTW 2 in the
91
past on Windows with various compilers; we are currently awaiting reports
92
for FFTW 3. See also the FFTW Windows installation notes and
93
pageref:compilerCrashes::'
169
Yes, many people have reported successfully using FFTW on Windows with
170
various compilers. FFTW was not developed on Windows, but the source code
171
is essentially straight ANSI C. See also the FFTW Windows installation
172
notes, Q2.3 `My compiler has trouble with FFTW.', and Q3.17 `How do I call
173
FFTW from the Microsoft language du jour?'.
95
175
-------------------------------------------------------------------------------
99
179
Complain fiercely to the vendor of the compiler.
101
FFTW is likely to push compilers to their limits. We have successfully
102
used gcc 3.2.x on x86 and PPC, a recent Compaq C compiler for Alpha,
103
version 6 of IBM's xlc compiler for AIX, Intel's icc versions 5-7, and Sun
104
WorkShop cc version 6. Several compiler bugs have been exposed by FFTW,
105
however. A partial list follows.
181
We have successfully used gcc 3.2.x on x86 and PPC, a recent Compaq C
182
compiler for Alpha, version 6 of IBM's xlc compiler for AIX, Intel's icc
183
versions 5-7, and Sun WorkShop cc version 6.
185
FFTW is likely to push compilers to their limits, however, and several
186
compiler bugs have been exposed by FFTW. A partial list follows.
107
188
gcc 2.95.x for Solaris/SPARC produces incorrect code for the test program
108
189
(workaround: recompile the libbench2 directory with -O2).
110
191
NetBSD/macppc 1.6 comes with a gcc version that also miscompiles the test
111
192
program. (Please report a workaround if you know one.)
113
gcc 3.2.3 for ARM reportedly crashes during compilation. (Please report a
114
workaround if you know one.)
194
gcc 3.2.3 for ARM reportedly crashes during compilation. This bug is
195
reportedly fixed in later versions of gcc.
116
Intel's icc-7.1 compiler build 20030402Z appears to produce incorrect
197
Versions 8.0 and 8.1 of Intel's icc falsely claim to be gcc, so you should
198
specify CC="icc -no-gcc"; this is automatic in FFTW 3.1. icc-8.0.066
199
reportedly produces incorrect code for FFTW 2.1.5, but is fixed in version
200
8.1. icc-7.1 compiler build 20030402Z appears to produce incorrect
117
201
dependencies, causing the compilation to fail. icc-7.1 build 20030307Z
118
202
appears to work fine. (Use icc -V to check which build you have.) As of
119
203
2003/04/18, build 20030402Z appears not to be available any longer on
135
223
Some 3.0.x and 3.1.x versions of gcc on x86 may crash. gcc so-called 2.96
136
224
shipping with RedHat 7.3 crashes when compiling SIMD code. In both cases,
137
please upgrade to gcc-3.2.
139
Intel's icc 6.0 misaligns SSE constants, but FFTW has a workaround.
225
please upgrade to gcc-3.2 or later.
227
Intel's icc 6.0 misaligns SSE constants, but FFTW has a workaround. icc
228
8.x fails to compile FFTW 3.0.x because it falsely claims to be gcc; we
229
believe this to be a bug in icc, but FFTW 3.1 has a workaround.
231
Visual C++ 2003 reportedly produces incorrect code for SSE/SSE2 when
232
compiling FFTW. This bug was reportedly fixed in VC++ 2005;
233
alternatively, you could switch to the Intel compiler. VC++ 6.0 also
234
reportedly produces incorrect code for the file reodft11e-r2hc-odd.c
235
unless optimizations are disabled for that file.
141
237
gcc 2.95 on MacOS X miscompiles AltiVec code (fixed in later versions).
142
238
gcc 3.2.x miscompiles AltiVec permutations, but FFTW has a workaround.
189
285
The fma version tries to exploit the fused multiply-add instructions
190
286
implemented in many processors such as PowerPC, ia-64, and MIPS. The two
191
FFTW packages are otherwise identical.
193
Definitely use the fma version if you have a PowerPC-based system with
194
gcc. This includes all GNU/Linux systems for PowerPC and all MacOS X
287
FFTW packages are otherwise identical. In FFTW 3.1, the fma and non-fma
288
versions were merged together into a single package, and the configure
289
script attempts to automatically guess which version to use.
291
The FFTW 3.1 configure script enables fma by default on PowerPC, Itanium,
292
and PA-RISC, and disables it otherwise. You can force one or the other by
293
using the --enable-fma or --disable-fma flag for configure.
295
Definitely use fma if you have a PowerPC-based system with gcc (or IBM
296
xlc). This includes all GNU/Linux systems for PowerPC and all MacOS X
297
systems. Also use it on PA-RISC and Itanium with the HP/UX compiler.
197
299
Definitely do not use the fma version if you have an ia-32 processor
198
300
(Intel, AMD, etcetera).
200
On other architectures, the situation is not so clear. For example, ia-64
201
has the fma instruction, but gcc-3.2 appears not to exploit it correctly.
202
Other compilers may do the right thing, but we have not tried them.
203
Please send us your feedback so that we can update this FAQ entry.
302
For other architectures/compilers, the situation is not so clear. For
303
example, ia-64 has the fma instruction, but gcc-3.2 appears not to exploit
304
it correctly. Other compilers may do the right thing, but we have not
305
tried them. Please send us your feedback so that we can update this FAQ
205
308
-------------------------------------------------------------------------------
261
364
Section 3. Using FFTW
366
Q3.1 Why not support the FFTW 2 interface in FFTW 3?
367
Q3.2 Why do FFTW 3 plans encapsulate the input/output arrays and not ju
368
Q3.3 FFTW seems really slow.
369
Q3.4 FFTW slows down after repeated calls.
370
Q3.5 An FFTW routine is crashing when I call it.
371
Q3.6 My Fortran program crashes when calling FFTW.
372
Q3.7 FFTW gives results different from my old FFT.
373
Q3.8 Can I save FFTW's plans?
374
Q3.9 Why does your inverse transform return a scaled result?
375
Q3.10 How can I make FFTW put the origin (zero frequency) at the center
376
Q3.11 How do I FFT an image/audio file in *foobar* format?
377
Q3.12 My program does not link (on Unix).
378
Q3.13 I included your header, but linking still fails.
379
Q3.14 My program crashes, complaining about stack space.
380
Q3.15 FFTW seems to have a memory leak.
381
Q3.16 The output of FFTW's transform is all zeros.
382
Q3.17 How do I call FFTW from the Microsoft language du jour?
383
Q3.18 Can I compute only a subset of the DFT outputs?
264
385
-------------------------------------------------------------------------------
431
553
-------------------------------------------------------------------------------
433
Question 3.15. FFTW seems to have a memory leak
555
Question 3.15. FFTW seems to have a memory leak.
435
557
After you create a plan, FFTW caches the information required to quickly
436
recreate the plan. (See pageref:savePlans::') It also maintains a small
437
amount of other persistent memory. You can deallocate all of FFTW's
438
internally allocated memory, if you wish, by calling fftw_cleanup(), as
439
documented in the manual.
558
recreate the plan. (See Q3.8 `Can I save FFTW's plans?') It also
559
maintains a small amount of other persistent memory. You can deallocate
560
all of FFTW's internally allocated memory, if you wish, by calling
561
fftw_cleanup(), as documented in the manual.
563
-------------------------------------------------------------------------------
565
Question 3.16. The output of FFTW's transform is all zeros.
567
You should initialize your input array *after* creating the plan, unless
568
you use FFTW_ESTIMATE: planning with FFTW_MEASURE or FFTW_PATIENT
569
overwrites the input/output arrays, as described in the manual.
571
-------------------------------------------------------------------------------
573
Question 3.17. How do I call FFTW from the Microsoft language du jour?
575
Please *do not* ask us Windows-specific questions. We do not use Windows.
576
We know nothing about Visual Basic, Visual C++, or .NET. Please find the
577
appropriate Usenet discussion group and ask your question there. See also
578
Q2.2 `Does FFTW run on Windows?'.
580
-------------------------------------------------------------------------------
582
Question 3.18. Can I compute only a subset of the DFT outputs?
584
In general, no, an FFT intrinsically computes all outputs from all inputs.
585
In principle, there is something called a *pruned FFT* that can do what
586
you want, but to compute K outputs out of N the complexity is in general
587
O(N log K) instead of O(N log N), thus saving only a small additive factor
588
in the log. (The same argument holds if you instead have only K nonzero
591
There are some specific cases in which you can get the O(N log K)
592
performance benefits easily, however, by combining a few ordinary FFTs.
593
In particular, the case where you want the first K outputs, where K
594
divides N, can be handled by performing N/K transforms of size K and then
595
summing the outputs multiplied by appropriate phase factors. For more
596
details, see pruned FFTs with FFTW.
598
There are also some algorithms that compute pruned transforms
599
*approximately*, but they are beyond the scope of this FAQ.
441
601
===============================================================================
443
603
Section 4. Internals of FFTW
605
Q4.1 How does FFTW work?
606
Q4.2 Why is FFTW so fast?
446
608
-------------------------------------------------------------------------------
486
648
Section 5. Known bugs
489
-------------------------------------------------------------------------------
491
Question 5.1. FFTW 1.1 crashes in rfftwnd on Linux.
493
This bug was fixed in FFTW 1.2. There was a bug in rfftwnd causing an
494
incorrect amount of memory to be allocated. The bug showed up in Linux
495
with libc-5.3.12 (and nowhere else that we know of).
497
-------------------------------------------------------------------------------
499
Question 5.2. The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
501
These bugs were corrected in FFTW 1.2.1. The MPI transforms (really, just
502
the transpose routines) in FFTW 1.2 had bugs that could cause errors in
505
-------------------------------------------------------------------------------
507
Question 5.3. The test programs in FFTW 1.2.1 fail when I change FFTW to use single precision.
509
This bug was fixed in FFTW 1.3. (Older versions of FFTW did work in
510
single precision, but the test programs didn't--the error tolerances in
511
the tests were set for double precision.)
513
-------------------------------------------------------------------------------
515
Question 5.4. The test program in FFTW 1.2.1 fails for n > 46340.
517
This bug was fixed in FFTW 1.3. FFTW 1.2.1 produced the right answer, but
518
the test program was wrong. For large n, n*n in the naive transform that
519
we used for comparison overflows 32 bit integer precision, breaking the
522
-------------------------------------------------------------------------------
524
Question 5.5. The threaded code fails on Linux Redhat 5.0
526
We had problems with glibc-2.0.5. The code should work with glibc-2.0.7.
528
-------------------------------------------------------------------------------
530
Question 5.6. FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dimension >= 65536.
532
This bug was fixed in FFTW 2.0.1. (There was a 32-bit integer overflow
533
due to a poorly-parenthesized expression.)
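
The FAQ does not show the offending expression, so the following is only a
hypothetical illustration of how grouping can decide whether a 32-bit
intermediate overflows once a dimension reaches 65536. The function names
and the expressions are assumptions, not FFTW code.

```c
#include <stdint.h>

/* Returns 1 if a*b fits in a signed 32-bit integer, else 0.
 * The product is formed in 64 bits because signed overflow is
 * undefined behavior in C and cannot be tested after the fact. */
int product_fits_int32(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    return p >= INT32_MIN && p <= INT32_MAX;
}

/* Hypothetical grouping A: (i * n) / m.
 * The intermediate i*n can overflow even when the final value is small. */
int grouping_a_is_safe(int32_t i, int32_t n)
{
    return product_fits_int32(i, n);
}

/* Hypothetical grouping B: i * (n / m), with m nonzero.
 * Dividing first keeps the intermediate small. The two groupings give
 * the same value only when m divides n evenly. */
int grouping_b_is_safe(int32_t i, int32_t n, int32_t m)
{
    return product_fits_int32(i, n / m);
}
```

With i = 40000 and n = m = 65536, grouping A needs 2621440000, which
exceeds the 32-bit limit of 2147483647, while grouping B needs only 40000.
The same arithmetic accounts for the n > 46340 threshold in Question 5.4:
46340*46340 = 2147395600 fits, but 46341*46341 = 2147488281 does not.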
535
-------------------------------------------------------------------------------
537
Question 5.7. FFTW 2.0's complex transforms give the wrong results with prime factors 17 to 97.
539
There was a bug in the complex transforms that could cause incorrect
540
results under (hopefully rare) circumstances for lengths with
541
intermediate-size prime factors (17-97). This bug was fixed in FFTW
544
-------------------------------------------------------------------------------
546
Question 5.8. FFTW 2.1.1's MPI test programs crash with MPICH.
548
This bug was fixed in FFTW 2.1.2. The 2.1/2.1.1 MPI test programs crashed
549
when using the MPICH implementation of MPI with the ch_p4 device (TCP/IP);
550
the transforms themselves worked fine.
552
-------------------------------------------------------------------------------
554
Question 5.9. FFTW 2.1.2's multi-threaded transforms don't work on AIX.
556
This bug was fixed in FFTW 2.1.3. The multi-threaded transforms in
557
previous versions didn't work with AIX's pthreads implementation, which
558
idiosyncratically creates threads in detached (non-joinable) mode by
561
-------------------------------------------------------------------------------
563
Question 5.10. FFTW 2.1.2's complex transforms give incorrect results for large prime sizes.
565
This bug was fixed in FFTW 2.1.3. FFTW's complex-transform algorithm for
566
prime sizes (in versions 2.0 to 2.1.2) had an integer overflow problem
567
that caused incorrect results for many primes greater than 32768 (on
568
32-bit machines). (Sizes without large prime factors are not affected.)
570
-------------------------------------------------------------------------------
572
Question 5.11. FFTW 2.1.3's multi-threaded transforms don't give any speedup on Solaris.
574
This bug was fixed in FFTW 2.1.4. (By default, Solaris creates threads
575
that do not parallelize over multiple processors, so one has to request
576
the proper behavior specifically.)
578
-------------------------------------------------------------------------------
580
Question 5.12. FFTW 2.1.3 crashes on AIX.
582
The FFTW 2.1.3 configure script picked incorrect compiler flags for the
583
xlc compiler on newer IBM processors. This is fixed in FFTW 2.1.4.
585
-conquer to take advantage of the memory
588
For more details (albeit somewhat outdated), see the paper "FFTW: An
589
Adaptive Software Architecture for the FFT", by M. Frigo and S. G.
590
Johnson, *Proc. ICASSP* 3, 1381 (1998), available along with other
591
references at the FFTW web page.
593
===============================================================================
595
Section 5. Known bugs
597
650
Q5.1 FFTW 1.1 crashes in rfftwnd on Linux.
598
651
Q5.2 The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
599
652
Q5.3 The test programs in FFTW 1.2.1 fail when I change FFTW to use sin