1
GCC 4.8 Release Series -- Changes, New Features, and Fixes
1
GCC 4.9 Release Series -- Changes, New Features, and Fixes
2
2
==========================================================
8
GCC now uses C++ as its implementation language. This means that to build GCC
9
from sources, you will need a C++ compiler that understands C++ 2003. For more
10
details on the rationale and specific changes, please refer to the C++
13
To enable the Graphite framework for loop optimizations you now need CLooG
14
version 0.18.0 and ISL version 0.11.1. Both can be obtained from the GCC
15
infrastructure directory. The installation manual contains more information
16
about requirements to build GCC.
18
GCC now uses a more aggressive analysis to derive an upper bound for the
19
number of iterations of loops using constraints imposed by language standards.
20
This may cause non-conforming programs to no longer work as expected, such as
21
SPEC CPU 2006 464.h264ref and 416.gamess. A new option, -fno-aggressive-loop-
22
optimizations, was added to disable this aggressive analysis. In some loops
23
that have a known constant number of iterations, but in which undefined behavior is known
24
to occur in the loop before reaching or during the last iteration, GCC will
25
warn about the undefined behavior in the loop instead of deriving a lower upper
26
bound of the number of iterations for the loop. The warning can be disabled
27
with -Wno-aggressive-loop-optimizations.
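
As a rough illustration of the kind of loop this analysis concerns, the sketch below contrasts a non-conforming loop (shown only in a comment) with a conforming one. The names kCount and sum_table are hypothetical and chosen only for this example; how GCC treats any particular loop depends on the surrounding code and options.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical table size, chosen only for this illustration.
constexpr std::size_t kCount = 4;

// Non-conforming pattern (kept as a comment so this file has no undefined
// behavior):
//   for (std::size_t i = 0; i <= kCount; ++i) total += table[i];
// The final iteration would read table[kCount], one element past the end.
// Since that access is undefined, the compiler may derive a smaller upper
// bound for the loop than the source text suggests.

// Conforming version: every access stays inside the array.
int sum_table(const int (&table)[kCount]) {
    int total = 0;
    for (std::size_t i = 0; i < kCount; ++i) {
        total += table[i];
    }
    return total;
}
```

Calling sum_table on an array holding 1, 2, 3, 4 returns 10 regardless of whether -fno-aggressive-loop-optimizations is given, because the loop never leaves the array.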
29
On ARM, a bug has been fixed in GCC's implementation of the AAPCS rules for
30
the layout of vectors that could lead to wrong code being generated. Vectors
31
larger than 8 bytes in size are now by default aligned to an 8-byte boundary.
32
This is an ABI change: code that makes explicit use of vector types may be
33
incompatible with binary objects built with older versions of GCC. Auto-
34
vectorized code is not affected by this change.
36
On AVR, support has been removed for the command-line option -mshort-calls
37
deprecated in GCC 4.7.
39
On AVR, the configure option --with-avrlibc supported since GCC 4.7.2 is
40
turned on by default for all non-RTEMS configurations. This option arranges
41
for better integration of AVR Libc with avr-gcc. For technical details, see
42
PR54461. To turn off the option in non-RTEMS configurations, use --with-
43
avrlibc=no. If the compiler is configured for RTEMS, the option is always
46
More information on porting to GCC 4.8 from previous versions of GCC can be
47
found in the porting guide (http://gcc.gnu.org/gcc-4.8/porting_to.html) for
50
General Optimizer Improvements (and Changes)
51
============================================
53
- DWARF4 is now the default when generating DWARF debug information. When -g
54
is used on a platform that uses DWARF debugging information, GCC will now
55
default to -gdwarf-4 -fno-debug-types-section.
56
GDB 7.5, Valgrind 3.8.0 and elfutils 0.154 debug information consumers
57
support DWARF4 by default. Before GCC 4.8 the default version used was
58
DWARF2. To make GCC 4.8 generate an older DWARF version use -g together with
59
-gdwarf-2 or -gdwarf-3. The default for Darwin and VxWorks is still
60
-gdwarf-2 -gstrict-dwarf.
62
- A new general optimization level, -Og, has been introduced. It addresses the
63
need for fast compilation and a superior debugging experience while
64
providing a reasonable level of runtime performance. Overall experience for
65
development should be better than the default optimization level -O0.
67
- A new option -ftree-partial-pre was added to control the partial redundancy
68
elimination (PRE) optimization. This option is enabled by default at the -O3
69
optimization level, and it makes PRE more aggressive.
71
- The option -fconserve-space has been removed; it was no longer useful on
72
most targets since GCC supports putting variables into BSS without making
75
- The struct reorg and matrix reorg optimizations (command-line options
76
-fipa-struct-reorg and -fipa-matrix-reorg) have been removed. They did not
77
always work correctly, nor did they work with link-time optimization (LTO),
78
hence were only applicable to programs consisting of a single translation
81
- Several scalability bottlenecks have been removed from GCC's optimization
82
passes. Compilation of extremely large functions, e.g. due to the use of the
83
flatten attribute in the "Eigen" C++ linear algebra templates library, is
84
significantly faster than previous releases of GCC.
8
- The mudflap run time checker has been removed. The mudflap options
9
remain, but do nothing.
11
- Support for a number of older systems and recently unmaintained or
12
untested target ports of GCC has been declared obsolete in GCC 4.9.
13
Unless there is activity to revive them, the next release of
14
GCC will have their sources permanently removed.
16
- The following ports for individual systems on particular
17
architectures have been obsoleted:
19
- Solaris 9 (*-*-solaris2.9). Details can be found in the announcement.
21
More information on porting to GCC 4.9 from previous versions of GCC
22
can be found in the porting guide for this release.
23
See http://gcc.gnu.org/gcc-4.9/porting_to.html.
26
General Optimizer Improvements
27
==============================
30
- AddressSanitizer, a fast memory error detector, is now available on ARM.
32
- UndefinedBehaviorSanitizer (ubsan), a fast undefined behavior
33
detector, has been added and can be enabled via -fsanitize=undefined.
34
Various computations will be instrumented to detect undefined
35
behavior at runtime. UndefinedBehaviorSanitizer is currently
36
available for the C and C++ languages.
86
38
- Link-time optimization (LTO) improvements:
88
- LTO partitioning has been rewritten for better reliability and
89
maintainability. Several important bugs leading to link failures have been
92
- Interprocedural optimization improvements:
94
- A new symbol table has been implemented. It builds on existing callgraph
95
and varpool modules and provides a new API. Unusual symbol visibilities and
96
aliases are handled more consistently leading to, for example, more
97
aggressive unreachable code removal with LTO.
99
- The inline heuristic can now bypass limits on the size of inlined
100
functions when the inlining is particularly profitable. This happens, for
101
example, when loop bounds or array strides get propagated.
103
- Values passed through aggregates (either by value or reference) are now
104
propagated at the inter-procedural level leading to better inlining
105
decisions (for example in the case of Fortran array descriptors) and
108
- AddressSanitizer, a fast memory error detector, has been added and can be
109
enabled via -fsanitize=address. Memory access instructions will be
110
instrumented to detect heap-, stack-, and global-buffer overflow as well as
111
use-after-free bugs. To get nicer stacktraces, use -fno-omit-frame-pointer.
112
The AddressSanitizer is available on IA-32/x86-64/x32/PowerPC/PowerPC64 GNU/
113
Linux and on x86-64 Darwin.
115
- ThreadSanitizer has been added and can be enabled via -fsanitize=thread.
116
Instructions will be instrumented to detect data races. The ThreadSanitizer
117
is available on x86-64 GNU/Linux.
40
- Type merging was rewritten. The new implementation is significantly
41
faster and uses less memory.
43
- Better partitioning algorithm resulting in less streaming during
46
- Early removal of virtual methods reduces the size of object files
47
and improves link-time memory usage and compile time.
49
- Function bodies are now loaded on-demand and released early improving
50
overall memory usage at link time.
52
- C++ hidden keyed methods can now be optimized out.
54
- When using a linker plugin, compiling with the -flto option now
55
generates slim object files (.o) which only contain intermediate
56
language representation for LTO. Use -ffat-lto-objects to create
57
files which additionally contain the object code. To generate
58
static libraries suitable for LTO processing, use gcc-ar and
59
gcc-ranlib; to list symbols from a slim object file use
60
gcc-nm. (Requires that ar, ranlib and nm have been compiled with
63
Memory usage building Firefox with debug enabled was reduced from 15GB
64
to 3.5GB; link time from 1700 seconds to 350 seconds.
66
- Inter-procedural optimization improvements:
68
- New type inheritance analysis module improving devirtualization.
69
Devirtualization now takes into account anonymous name-spaces and
70
the C++11 final keyword.
72
- New speculative devirtualization pass (controlled by
73
-fdevirtualize-speculatively).
74
- Calls that were speculatively made direct are turned back to indirect
75
where direct call is not cheaper.
76
- Local aliases are introduced for symbols that are known to be
77
semantically equivalent across shared libraries improving dynamic
80
- Feedback directed optimization improvements:
82
- Profiling of programs using C++ inline functions is now more reliable.
84
- New time profiling determines typical order in which functions are
87
- A new function reordering pass (controlled by -freorder-functions)
88
significantly reduces startup time of large applications. Until binutils
89
support is completed, it is effective only with link-time optimization.
91
- Feedback driven indirect call removal and devirtualization now handle
92
cross-module calls when link-time optimization is enabled.
120
95
New Languages and Language specific improvements
121
96
================================================
98
- Version 4.0 of the OpenMP specification is now supported for the C
99
and C++ compilers. The new -fopenmp-simd option can be used to
100
enable OpenMP's SIMD directives, while ignoring other OpenMP
101
directives. The new -fsimd-cost-model= option allows tuning the
102
vectorization cost model for loops annotated with OpenMP and Cilk
103
Plus simd directives; -Wopenmp-simd warns when the current
104
cost model overrides simd directives set by the user.
106
- The -Wdate-time option has been added for the C, C++ and Fortran
107
compilers, which warns when the __DATE__, __TIME__ or __TIMESTAMP__
108
macros are used. Those macros might prevent bit-wise-identical
109
reproducible compilations.
115
- GNAT switched to Ada 2012 instead of Ada 2005 by default.
127
- Each diagnostic emitted now includes the original source line and a caret
128
'^' indicating the column. The option -fno-diagnostics-show-caret suppresses
131
- The option -ftrack-macro-expansion=2 is now enabled by default. This allows
132
the compiler to display the macro expansion stack in diagnostics. Combined
133
with the caret information, an example diagnostic showing these two features
136
t.c:1:94: error: invalid operands to binary < (have ‘struct mystruct’ and ‘float’)
137
#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A);
138
__typeof__(B) __b = (B);
139
__a < __b ? __b : __a; })
141
t.c:7:7: note: in expansion of macro 'MYMAX'
145
- A new -Wsizeof-pointer-memaccess warning has been added (also enabled by
146
-Wall) to warn about suspicious length parameters to certain string and
147
memory built-in functions if the argument uses sizeof. This warning warns
148
e.g. about memset (ptr, 0, sizeof (ptr)); if ptr is not an array, but a
149
pointer, and suggests a possible fix, or about
150
memcpy (&foo, ptr, sizeof (&foo));.
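
A small sketch of the pattern this warning targets, assuming two hypothetical helpers (clear_bytes and clear_array) that are not part of GCC or its documentation. The suspicious call is left in a comment; the code itself passes the intended length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: zero `len` bytes starting at `ptr`.
void clear_bytes(char* ptr, std::size_t len) {
    // Suspicious form the warning is meant to catch:
    //   std::memset(ptr, 0, sizeof(ptr));  // sizeof(char*), not the buffer size
    std::memset(ptr, 0, len);
}

// For a real array, sizeof yields the full array size, so this form is fine.
void clear_array(char (&buf)[32]) {
    std::memset(buf, 0, sizeof(buf));
}
```

Passing the length explicitly, or keeping the array type visible through a reference as in clear_array, avoids the pointer-sized sizeof.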
152
- The new option -Wpedantic is an alias for -pedantic, which is now
153
deprecated. The forms -Wno-pedantic, -Werror=pedantic, and -Wno-
154
error=pedantic work in the same way as for any other -W option. One caveat
155
is that -Werror=pedantic is not equivalent to -pedantic-errors, since the
156
latter makes into errors some warnings that are not controlled by
157
-Wpedantic, and the former only affects diagnostics that are disabled when
160
- The option -Wshadow no longer warns if a declaration shadows a function
161
declaration, unless the former declares a function or pointer to function,
162
because this is a common and valid case in real-world code.
121
- Support for colorizing diagnostics emitted by GCC has been added.
122
The option -fdiagnostics-color=auto enables it when outputting to
123
terminals, -fdiagnostics-color=always unconditionally. The
124
GCC_COLORS environment variable can be used to customize the colors
125
or disable coloring. If the GCC_COLORS variable is present in the
126
environment, the default is -fdiagnostics-color=auto, otherwise
127
-fdiagnostics-color=never. Sample diagnostics output:
129
$ g++ -fdiagnostics-color=always -S -Wall test.C
130
test.C: In function ‘int foo()’:
131
test.C:1:14: warning: no return statement in function returning non-void [-Wreturn-type]
134
test.C:2:46: error: template instantiation depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum) instantiating `struct X<100>';
135
template <int N> struct X { static const int value = X<N-1>::value; }; template struct X<1000>;
137
test.C:2:46: recursively required from `const int X<999>::value'
138
test.C:2:46: required from `const int X<1000>::value'
139
test.C:2:88: required from here
141
test.C:2:46: error: incomplete type `X<100>' used in nested name specifier
143
- With the new #pragma GCC ivdep, the user can assert that there are
144
no loop-carried dependencies which would prevent concurrent
145
execution of consecutive iterations using SIMD (single instruction
146
multiple data) instructions.
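
A minimal sketch of how the pragma might be placed, assuming a hypothetical function add_arrays whose caller guarantees that the output array does not overlap the inputs. The pragma is only an assertion by the programmer; whether the loop is actually vectorized depends on the target and the optimization options.

```cpp
#include <cassert>

// Hypothetical kernel: the caller promises that `out` does not overlap
// `lhs` or `rhs`, so no iteration depends on an earlier one.
void add_arrays(int n, int* out, const int* lhs, const int* rhs) {
#pragma GCC ivdep
    for (int i = 0; i < n; ++i) {
        out[i] = lhs[i] + rhs[i];
    }
}
```

If the promise is broken (for example, out aliases lhs at a shifted offset), the assertion is false and the results are not reliable.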
148
- Support for Cilk Plus has been added and can be enabled with the
149
-fcilkplus option. Cilk Plus is an extension to the C and C++
150
languages to support data and task parallelism. The present
151
implementation follows ABI version 1.2; all features but _Cilk_for
152
have been implemented.
158
- ISO C11 atomics (the _Atomic type specifier and qualifier and the
159
<stdatomic.h> header) are now supported.
161
- ISO C11 generic selections (_Generic keyword) are now supported.
163
- ISO C11 thread-local storage (_Thread_local, similar to GNU C
164
__thread) is now supported.
166
- ISO C11 support is now at a similar level of completeness to ISO
167
C99 support: substantially complete modulo bugs, extended
168
identifiers (supported except for corner cases when
169
-fextended-identifiers is used), floating-point issues (mainly but
170
not entirely relating to optional C99 features from Annexes F and
171
G) and the optional Annexes K (Bounds-checking interfaces) and L
174
- A new C extension __auto_type provides a subset of the
175
functionality of C++11 auto in GNU C.
168
- G++ now implements the C++11 thread_local keyword; this differs from the GNU
169
__thread keyword primarily in that it allows dynamic initialization and
170
destruction semantics. Unfortunately, this support requires a run-time
171
penalty for references to non-function-local thread_local variables defined
172
in a different translation unit even if they don't need dynamic
173
initialization, so users may want to continue to use __thread for TLS
174
variables with static initialization semantics.
176
If the programmer can be sure that no use of the variable in a non-defining
177
TU needs to trigger dynamic initialization (either because the variable is
178
statically initialized, or a use of the variable in the defining TU will be
179
executed before any uses in another TU), they can avoid this overhead with
180
the -fno-extern-tls-init option.
182
OpenMP threadprivate variables now also support dynamic initialization and
183
destruction by the same mechanism.
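
The difference between the two keywords can be sketched as follows. The names make_label, label, counter, bump and current_label are hypothetical; the example stays on a single thread and only shows which kind of initializer each keyword accepts.

```cpp
#include <cassert>
#include <string>

// Hypothetical initializer; calling a function makes the initialization dynamic.
std::string make_label() { return std::string("worker-") + "0"; }

// C++11 thread_local: one copy per thread, dynamic initialization and
// destruction are allowed.
thread_local std::string label = make_label();

// GNU __thread: one copy per thread, static initialization only.
__thread int counter = 0;

int bump() { return ++counter; }
const std::string& current_label() { return label; }
```

A variable such as counter, which needs no dynamic initialization, is the kind of case where the text above suggests continuing to use __thread.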
185
- G++ now implements the C++11 attribute syntax, e.g.
187
[[noreturn]] void f();
189
and also the alignment specifier, e.g.
191
alignas(double) int i;
193
- G++ now implements C++11 inheriting constructors, e.g.
195
struct A { A(int); };
196
struct B: A { using A::A; }; // defines B::B(int)
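
A slightly fuller sketch of the same feature, with hypothetical types Base and Derived and the constructor actually defined so that the code can be exercised:

```cpp
#include <cassert>

struct Base {
    explicit Base(int v) : value(v) {}
    int value;
};

struct Derived : Base {
    using Base::Base;  // inherits Base(int), so Derived(int) is available
    int doubled() const { return 2 * value; }
};
```

Constructing Derived d(21) runs the inherited constructor, so d.value is 21 and d.doubled() is 42.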
199
- As of GCC 4.8.1, G++ implements the change to decltype semantics from N3276.
202
decltype(f()) g(); // OK, return type of f() is not required
205
- G++ now supports a -std=c++1y option for experimentation with features
206
proposed for the next revision of the standard, expected around 2017.
207
Currently the only difference from -std=c++11 is support for return type
208
deduction in normal functions, as proposed in N3386.
210
- The G++ namespace association extension, __attribute ((strong)), has been
211
deprecated. Inline namespaces should be used instead.
213
- G++ now supports a -fext-numeric-literal option to control whether GNU
214
numeric literal suffixes are accepted as extensions or processed as C++11
215
user-defined numeric literal suffixes. The flag is on (use suffixes for GNU
216
literals) by default for -std=gnu++*, and -std=c++98. The flag is off (use
217
suffixes for user-defined literals) by default for -std=c++11 and later.
181
- The G++ implementation of C++1y return type deduction for normal
182
functions has been updated to conform to N3638, the proposal
183
accepted into the working paper. Most notably, it adds
184
decltype(auto) for getting decltype semantics rather than the
185
template argument deduction semantics of plain auto:
188
auto i1 = f(); // int
189
decltype(auto) i2 = f(); // int&
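
The comments in the fragment above imply that f returns int&. Under that assumption, the two deduction rules can be checked at compile time; storage, by_value and by_reference are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <type_traits>

int storage = 1;               // hypothetical object for f to refer to
int& f() { return storage; }   // assumed signature, inferred from the comments

auto by_value() { return f(); }                 // plain auto: deduces int
decltype(auto) by_reference() { return f(); }   // decltype(auto): deduces int&

static_assert(std::is_same<decltype(by_value()), int>::value,
              "plain auto drops the reference");
static_assert(std::is_same<decltype(by_reference()), int&>::value,
              "decltype(auto) keeps the reference");
```

Because by_reference returns int&, assigning through it modifies storage, while by_value only returns a copy.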
191
- G++ supports C++1y lambda capture initializers:
195
Actually, they have been accepted since GCC 4.5, but now the
196
compiler doesn't warn about them with -std=c++1y, and supports
197
parenthesized and brace-enclosed initializers as well.
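
A short sketch of capture initializers, using hypothetical helpers make_reader and make_adder. It shows the "=" form and the parenthesized form; the brace-enclosed form is left out because its deduced type has differed between compiler versions.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns a callable that owns the pointer it was given. The init-capture
// creates a new closure member `owned`, initialized from the moved pointer.
auto make_reader(std::unique_ptr<int> p) {
    return [owned = std::move(p)] { return *owned; };
}

// `base` uses the "=" form, `step` the parenthesized form.
auto make_adder() {
    return [base = 10, step(2)](int n) { return base + step * n; };
}
```

Capturing a move-only object such as std::unique_ptr is one case where an init-capture is needed, since a plain by-value capture would require a copy.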
199
- G++ supports C++1y variable length arrays. G++ has supported
200
GNU/C99-style VLAs for a long time, but now additionally supports
201
initializers and lambda capture by reference. In C++1y mode G++
202
will complain about VLA uses that are not permitted by the draft
203
standard, such as forming a pointer to VLA type or applying sizeof
204
to a VLA variable. Note that it now appears that VLAs will not be
205
part of C++14, but will be part of a separate document and then
209
int a[n] = { 1, 2, 3 }; // throws std::bad_array_length if n < 3
210
[&a]{ for (int i : a) { cout << i << endl; } }();
211
&a; // error, taking address of VLA
214
- G++ supports the C++1y [[deprecated]] attribute modulo bugs in the
215
underlying [[gnu::deprecated]] attribute. Classes and functions
216
can be marked deprecated and a diagnostic message added:
220
#if __cplusplus > 201103
221
class [[deprecated("A is deprecated in C++14; Use B instead")]] A;
222
[[deprecated("bar is unsafe; use foo() instead")]]
228
A aa; // warning: 'A' is deprecated : A is deprecated in C++14; Use B instead
229
int j = bar(2); // warning: 'int bar(int)' is deprecated : bar is unsafe; use foo() instead
231
- G++ supports C++1y digit separators. Long numeric literals can be
232
subdivided with a single quote ' to enhance readability:
237
int m = 0'004'000'000;
238
int n = 0b0001'0000'0000'0000'0000'0000;
240
double x = 1.602'176'565e-19;
241
double y = 1.602'176'565e-1'9;
243
- G++ supports C++1y polymorphic lambdas.
245
// a functional object that will increment any type
246
auto incr = [](auto x) { return x++; };
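
As a further sketch, the hypothetical lambda twice below works for any argument type that supports operator+, since each auto parameter makes the call operator a template:

```cpp
#include <cassert>
#include <string>

// One closure type with a templated call operator.
auto twice = [](auto x) { return x + x; };
```

The same object can be called as twice(21) and as twice(std::string("ab")); each call instantiates the call operator for that argument type.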
220
249
Runtime Library (libstdc++)
221
250
---------------------------
223
- Improved experimental support for the new ISO C++ standard, C++11,
252
- Improved support for C++11, including:
254
- support for <regex>;
256
- The associative containers in <map> and <set> and the unordered
257
associative containers in <unordered_map> and <unordered_set>
258
meet the allocator-aware container requirements;
260
- Improved experimental support for the upcoming ISO C++ standard, C++14,
226
- forward_list meets the allocator-aware container requirements;
228
- this_thread::sleep_for(), this_thread::sleep_until() and this_thread::
229
yield() are defined without requiring the configure option
230
--enable-libstdcxx-time;
232
- Improvements to <random>:
234
- SSE optimized normal_distribution.
236
- Use of hardware RNG instruction for random_device on new x86 processors
237
(requires the assembler to support the instruction.)
241
- New random number engine simd_fast_mersenne_twister_engine with an
242
optimized SSE implementation.
244
- New random number distributions beta_distribution, normal_mv_distribution,
245
rice_distribution, nakagami_distribution, pareto_distribution,
246
k_distribution, arcsine_distribution, hoyt_distribution.
248
- Added --disable-libstdcxx-verbose configure option to disable diagnostic
249
messages issued when a process terminates abnormally. This may be useful for
250
embedded systems to reduce the size of executables that link statically to
263
- fixing constexpr member functions without const;
264
- implementation of the std::exchange() utility function;
265
- addressing tuples by type;
266
- implementation of std::make_unique;
267
- implementation of std::shared_lock;
268
- making std::result_of SFINAE-friendly;
269
- adding operator() to integral_constant;
270
- adding user-defined literals for standard library types std::basic_string,
271
std::chrono::duration, and std::complex;
272
- adding two range overloads to non-modifying sequence operations
273
std::equal and std::mismatch;
274
- adding IO manipulators for quoted strings;
275
- adding constexpr members to <utility>, <complex>, <chrono>,
277
- adding compile-time std::integer_sequence;
278
- adding cleaner transformation traits;
279
- making <functional>'s operator functors easier to use and more generic;
281
- An implementation of std::experimental::optional.
283
- An implementation of std::experimental::string_view.
285
- The non-standard function std::copy_exception has been deprecated
286
and will be removed in a future version. std::make_exception_ptr
287
should be used instead.
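
A few of the C++14 library additions listed above can be exercised with a short sketch. The helper names (reset_counter, make_name, int_field, count_indices) are hypothetical; only the std:: facilities come from the library, and the code assumes compilation in C++14 mode or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// std::exchange: store a new value and return the old one.
int reset_counter(int& counter) { return std::exchange(counter, 0); }

// std::make_unique: create an owned object without a bare new-expression.
std::unique_ptr<std::string> make_name() {
    return std::make_unique<std::string>("gcc");
}

// Addressing tuples by type: std::get<int> selects the only int element.
int int_field(const std::tuple<int, std::string>& t) { return std::get<int>(t); }

// std::integer_sequence (here through the index_sequence alias): the pack
// size is known at compile time.
template <std::size_t... Is>
constexpr std::size_t count_indices(std::index_sequence<Is...>) {
    return sizeof...(Is);
}
static_assert(count_indices(std::make_index_sequence<5>{}) == 5,
              "make_index_sequence<5> holds five indices");

// operator() on integral_constant yields the stored value.
static_assert(std::integral_constant<int, 3>{}() == 3,
              "integral_constant is callable");
```

Addressing a tuple by type only compiles when the requested type occurs exactly once in the tuple, which is why the example tuple mixes int and std::string.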
257
293
- Compatibility notice:
259
- Module files: The version of module files (.mod) has been incremented.
260
Fortran MODULEs compiled by earlier GCC versions have to be recompiled,
261
when they are USEd by files compiled with GCC 4.8. GCC 4.8 is not able to
262
read .mod files created by earlier versions; attempting to do so gives an
264
Note: The ABI of the produced assembler data itself has not changed;
265
object files and libraries are fully compatible with older versions except
268
- ABI: Some internal names (used in the assembler/object file) have changed
269
for symbols declared in the specification part of a module. If an affected
270
module - or a file using it via use association - is recompiled, the
271
module and all files which directly use such symbols have to be recompiled
272
as well. This change only affects the following kind of module symbols:
274
- Procedure pointers. Note: C-interoperable function pointers (type
275
(c_funptr)) are not affected nor are procedure-pointer components.
276
- Deferred-length character strings.
278
- The BACKTRACE intrinsic subroutine has been added. It shows a backtrace at
279
an arbitrary place in user code; program execution continues normally
282
- The -Wc-binding-type warning option has been added (disabled by default). It
283
warns if a variable might not be C interoperable; in particular, if the
284
variable has been declared using an intrinsic type with default kind instead
285
of using a kind parameter defined for C interoperability in the intrinsic
286
ISO_C_Binding module. Before, this warning was always printed. The
287
-Wc-binding-type option is enabled by -Wall.
289
- The -Wrealloc-lhs and -Wrealloc-lhs-all warning command-line options have
290
been added, which diagnose when code is inserted for automatic
291
(re)allocation of a variable during assignment. This option can be used to
292
decide whether it is safe to use -fno-realloc-lhs. Additionally, it can be
293
used to find automatic (re)allocation in hot loops. (For arrays, replacing
294
var= by var(:)= disables the automatic reallocation.)
296
- The -Wcompare-reals command-line option has been added. When this is set,
297
warnings are issued when comparing REAL or COMPLEX types for equality and
298
inequality; consider replacing a == b by abs(a-b) < eps with a suitable eps.
299
-Wcompare-reals is enabled by -Wextra.
301
- The -Wtarget-lifetime command-line option has been added (enabled with
302
-Wall), which warns if the pointer in a pointer assignment might outlive its
305
- Reading floating point numbers which use q for the exponential (such as
306
4.0q0) is now supported as vendor extension for better compatibility with
307
old data files. It is strongly recommended to use for I/O the equivalent but
308
standard conforming e (such as 4.0e0).
309
(For Fortran source code, consider replacing the q in floating-point
310
literals by a kind parameter (e.g. 4.0e0_qp with a suitable qp). Note that -
311
in Fortran source code - replacing q by a simple e is not equivalent.)
313
- The GFORTRAN_TMPDIR environment variable for specifying a non-default
314
directory for files opened with STATUS="SCRATCH" is no longer used.
315
Instead gfortran checks the POSIX/GNU standard TMPDIR environment variable.
316
If TMPDIR is not defined, gfortran falls back to other methods to determine
317
the directory for temporary files as documented in the user manual.
321
- Support for unlimited polymorphic variables (CLASS(*)) has been added.
322
Nonconstant character lengths are not yet supported.
326
- Assumed types (TYPE(*)) are now supported.
328
- Experimental support for assumed-rank arrays (dimension(..)) has been
329
added. Note that currently gfortran's own array descriptor is used, which
330
is different from the one defined in TS 29113; see gfortran's header file
331
or use the Chasm Language Interoperability Tools.
295
- Module files: The version of the module files (.mod) has been
296
incremented; additionally, module files are now compressed.
297
Fortran MODULEs compiled by earlier GCC versions have to be
298
recompiled, when they are USEd by files compiled with GCC 4.9,
299
because GCC 4.9 is not able to read .mod files of earlier GCC
300
versions; attempting to do so gives an error message. Note: The
301
ABI of the produced assembler data itself has not changed: object
302
files and libraries are fully compatible with older
303
versions. (Except for the next items.)
307
- Note that the argument passing ABI has changed for scalar dummy
308
arguments of type INTEGER, REAL, COMPLEX and LOGICAL, which
309
have both the VALUE and the OPTIONAL attribute.
311
- Due to the support of finalization, the virtual table
312
associated with polymorphic variables has changed. Therefore,
313
code containing CLASS should be recompiled, including all files
314
which define derived types involved in the type definition used
315
by polymorphic variables. (Note: Due to the incremented module
316
version, trying to mix old code with new code will usually give
319
- GNU Fortran no longer deallocates allocatable variables or
320
allocatable components of variables declared in the main
321
program. Since Fortran 2008, the standard explicitly states that
322
variables declared in the Fortran main program automatically have
325
- When opening files, the close-on-exec flag is set if the system
326
supports such a feature. This is generally considered good
327
practice these days, but if there is a need to pass file
328
descriptors to child processes the parent process must now
329
remember to clear the close-on-exec flag by calling fcntl(),
330
e.g. via ISO_C_BINDING, before executing the child process.
332
- The deprecated command-line option -fno-whole-file has been
333
removed. (-fwhole-file is the default since GCC 4.6.)
334
-fwhole-file/-fno-whole-file continue to be accepted but do not
335
influence the code generation.
337
- The compiler no longer unconditionally warns about DO loops with
338
zero iterations. This warning is now controlled by the
339
-Wzerotrips option, which is implied by -Wall.
341
- The new NO_ARG_CHECK attribute of the !GCC$ directive can be used
342
to disable the type-kind-rank (TKR) argument check for a dummy
343
argument. The feature is similar to ISO/IEC TS 29113:2012's
344
TYPE(*), except that it additionally also disables the rank
345
check. Variables with NO_ARG_CHECK have to be dummy arguments and
346
may only be used as argument to ISO_C_BINDING's C_LOC and as
347
actual argument to another NO_ARG_CHECK dummy argument; also the
348
other constraints of TYPE(*) apply. The dummy arguments should
349
be declared as scalar or assumed-size variable of type type(*)
350
(recommended) – or of type integer, real, complex or
351
logical. With NO_ARG_CHECK, a pointer to the data without further
352
type or shape information is passed, similar to C's void*. Note
353
that also TS 29113's type(*),dimension(..) accepts arguments of
354
any type and rank; contrary to NO_ARG_CHECK assumed-rank
355
arguments pass an array descriptor which contains the array shape
356
and stride of the argument.
360
- Finalization is now supported. Note that finalization is
361
currently only done for a subset of the situations in which it
364
- Experimental support for scalar character components with
365
deferred length (i.e. allocatable string length) in derived
366
types has been added. (Deferred-length character variables are
367
supported since GCC 4.6.)
371
- When STOP or ERROR STOP is used to terminate the execution and
372
any exception (but inexact) is signaling, a warning is printed
373
to ERROR_UNIT, indicating which exceptions are signaling. The
374
-ffpe-summary= command-line option can be used to fine-tune for
375
which exception the warning should be shown.
377
- Rounding on input (READ) is now handled on systems where strtod
378
honours the rounding mode. (For output, rounding is supported
379
since GCC 4.5.) Note that for input, the compatible rounding
380
mode is handled as nearest (i.e., for a tie, rounding to an
381
even last significant [cf. IEC 60559:1989] – while
382
compatible rounds away from zero for a tie).
337
- GCC 4.8.0 implements a preliminary version of the upcoming Go 1.1 release.
338
The library support is not quite complete, due to release timing.
340
- Go has been tested on GNU/Linux and Solaris platforms for various processors
341
including x86, x86_64, PowerPC, SPARC, and Alpha. It may work on other
388
- GCC 4.9 provides a complete implementation of the Go 1.2.1 release.
345
391
New Targets and Target Specific Improvements
346
392
============================================
352
- A new port has been added to support AArch64, the new 64-bit architecture
353
from ARM. Note that this is a separate port from the existing 32-bit ARM
355
- The port provides initial support for the Cortex-A53 and the Cortex-A57
356
processors with the command line options -mcpu=cortex-a53 and
397
- The ARMv8-A crypto and CRC instructions are now supported through
398
intrinsics. These are enabled when the architecture supports these
399
and are available through the -march=armv8-a+crc and
400
-march=armv8-a+crypto options.
402
- Initial support for ILP32 has now been added to the compiler. This
403
is now available through the command line option
404
-mabi=ilp32. Support for ILP32 is considered experimental as the
405
ABI specification is still beta.
407
- Coverage of more of the ISA including the SIMD extensions has been
408
added. The Advanced SIMD intrinsics have also been improved.
410
- The new local register allocator (LRA) is now on by default for the
413
- The REE (Redundant extension elimination) pass has now been enabled
414
by default for the AArch64 backend.
416
- Tuning for the Cortex-A53 and Cortex-A57 has been improved.
418
- Initial big.LITTLE tuning support for the combination of Cortex-A57
419
and Cortex-A53 was added through the -mcpu=cortex-a57.cortex-a53
422
- A number of structural changes have been made to both the ARM
423
and AArch64 backends to facilitate improved code-generation.
363
- Initial support has been added for the AArch32 extensions defined in the
366
- Code generation improvements for the Cortex-A7 and Cortex-A15 CPUs.
368
- A new option, -mcpu=marvell-pj4, has been added to generate code for the
369
Marvell PJ4 processor.
371
- The compiler can now automatically generate the VFMA, VFMS, REVSH and REV16
374
- A new vectorizer cost model for Advanced SIMD configurations to improve the
375
auto-vectorization strategies used.
377
- The scheduler now takes into account the number of live registers to reduce
378
the amount of spilling that can occur. This should improve code performance
379
in large functions. The limit can be removed by using the option -fno-sched-
382
- Improvements have been made to the Marvell iWMMX code generation and support
383
for the iWMMX2 SIMD unit has been added. The option -mcpu=iwmmxt2 can be
384
used to enable code generation for the latter.
386
- A number of code generation improvements for Thumb2 to reduce code size when
387
compiling for the M-profile processors.
389
- The RTEMS (arm-rtems) port has been updated to use the EABI.
391
- Code generation support for the old FPA and Maverick floating-point
392
architectures has been removed. Ports that previously relied on these
393
features have also been removed. This includes the targets:
395
- arm*-*-linux-gnu (use arm*-*-linux-gnueabi)
396
- arm*-*-elf (use arm*-*-eabi)
397
- arm*-*-uclinux* (use arm*-*-uclinux*eabi)
398
- arm*-*-ecos-elf (no alternative)
399
- arm*-*-freebsd (no alternative)
400
- arm*-wince-pe* (no alternative).
406
- Support for the "Embedded C" fixed-point has been added. For details, see
407
the GCC wiki and the user manual. The support is not complete.
408
- A new print modifier %r for register operands in inline assembler is
409
supported. It will print the raw register number without the register prefix
412
/* Return the most significant byte of 'val', a 64-bit value. */
414
unsigned char msb (long long val)
417
__asm__ ("mov %0, %r1+7" : "=r" (c) : "r" (val));
421
The inline assembler in this example will generate code like
425
provided c is allocated to R24 and val is allocated to R8...R15. This works
426
because the GNU assembler accepts plain register numbers without register
429
- Static initializers with 3-byte symbols are supported now:
431
extern const __memx char foo;
432
const __memx void *pfoo = &foo;
434
This requires at least Binutils 2.23.
429
- Use of Advanced SIMD (Neon) for 64-bit scalar computations has been
430
disabled by default. This was found to generate better code in only
431
a small number of cases. It can be turned back on with the
432
-mneon-for-64bits option.
434
- Further support for the ARMv8-A architecture, notably implementing
435
the restriction around IT blocks in the Thumb32 instruction set has
436
been added. The -mrestrict-it option can be used with
437
-march=armv7-a or the -march=armv7ve options to make code
438
generation fully compatible with the deprecated instructions in
441
- Support has now been added for the ARMv7ve variant of the
442
architecture. This can be used by the -march=armv7ve option.
444
- The ARMv8-A crypto and CRC instructions are now supported through
445
intrinsics and are available through the -march=armv8-a+crc
446
and -mfpu=crypto-neon-fp-armv8 options.
448
- LRA is now on by default for the ARM target. This can be turned off
449
using the -mno-lra option. This option is a purely
450
transitionary command line option and will be removed in a future
451
release. We are interested in any bug reports regarding functional and
452
performance regressions with LRA.
454
- A new option -mslow-flash-data to improve performance of programs
455
fetching data on slow flash memory has now been introduced for the
456
ARMv7-M profile cores.
458
- A new option -mpic-data-is-text-relative for targets that allow
459
data segments to be relative to text segments has been added. This
460
is on by default for all targets except VxWorks RTP.
462
- A number of infrastructural changes have been made to both the ARM
463
and AArch64 backends to facilitate improved code-generation.
465
- GCC now supports Cortex-A12 and the Cortex-R7 through the
466
-mcpu=cortex-a12 and -mcpu=cortex-r7 options.
468
- GCC now has tuning for the Cortex-A57 and Cortex-A53 through the
469
-mcpu=cortex-a57 and -mcpu=cortex-a53 options.
471
- Initial big.LITTLE tuning support for the combination of Cortex-A57
472
and Cortex-A53 was added through the -mcpu=cortex-a57.cortex-a53
473
option. Similar support was added for the combination of Cortex-A15
474
and Cortex-A7 through the -mcpu=cortex-a15.cortex-a7 option.
476
- Further performance optimizations for the Cortex-A15 and the
477
Cortex-M4 have been added.
479
- A number of code generation improvements for Thumb2 to reduce code
480
size when compiling for the M-profile processors.
440
- Allow -mpreferred-stack-boundary=3 for the x86-64 architecture with SSE
441
extensions disabled. Since the x86-64 ABI requires 16 byte stack alignment,
442
this is ABI incompatible and intended to be used in controlled environments
443
where stack space is an important limitation. This option will lead to wrong
444
code when functions compiled with 16 byte stack alignment (such as functions
445
from a standard library) are called with a misaligned stack. In this case, SSE
446
instructions may lead to misaligned memory access traps. In addition,
447
variable arguments will be handled incorrectly for 16 byte aligned objects
448
(including x87 long double and __int128), leading to wrong results. You must
449
build all modules with -mpreferred-stack-boundary=3, including any
450
libraries. This includes the system libraries and startup modules.
452
- Support for the new Intel processor codename Broadwell with RDSEED, ADCX,
453
ADOX, PREFETCHW is available through -madx, -mprfchw, -mrdseed command-line
456
- Support for the Intel RTM and HLE intrinsics, built-in functions and code
457
generation is available via -mrtm and -mhle.
459
- Support for the Intel FXSR, XSAVE and XSAVEOPT instruction sets. Intrinsics
460
and built-in functions are available via -mfxsr, -mxsave and -mxsaveopt
463
- New -maddress-mode=[short|long] options for x32. -maddress-mode=short
464
overrides default 64-bit addresses to 32-bit by emitting the 0x67 address-
465
size override prefix. This is the default address mode for x32.
467
- New built-in functions to detect run-time CPU type and ISA:
469
- A built-in function __builtin_cpu_is has been added to detect if the run-
470
time CPU is of a particular type. It returns a positive integer on a match
471
and zero otherwise. It accepts one string literal argument, the CPU name.
472
For example, __builtin_cpu_is("westmere") returns a positive integer if
473
the run-time CPU is an Intel Core i7 Westmere processor. Please refer to
474
the user manual for the list of valid CPU names recognized.
476
- A built-in function __builtin_cpu_supports has been added to detect if the
477
run-time CPU supports a particular ISA feature. It returns a positive
478
integer on a match and zero otherwise. It accepts one string literal
479
argument, the ISA feature. For example, __builtin_cpu_supports("ssse3")
480
returns a positive integer if the run-time CPU supports SSSE3
481
instructions. Please refer to the user manual for the list of valid ISA
484
Caveat: If these built-in functions are called before any static
485
constructors are invoked, like during IFUNC initialization, then the CPU
486
detection initialization must be explicitly run using this newly provided
487
built-in function, __builtin_cpu_init. The initialization needs to be done
488
only once. For example, this is what the invocation would look like inside an
491
static void (*some_ifunc_resolver(void))(void)
493
__builtin_cpu_init();
494
if (__builtin_cpu_is("amdfam10h")) ...
495
if (__builtin_cpu_supports("popcnt")) ...
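As a rough illustration of the run-time detection built-ins outside an IFUNC resolver, the following sketch wraps __builtin_cpu_supports in a small helper. The names cpu_has_ssse3 and pick_impl are hypothetical and chosen only for this example, and the preprocessor guard is an assumption that lets the file still compile on non-x86 targets, where these built-ins are not available.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the running CPU reports SSSE3,
   0 if it does not, and -1 when built for a non-x86 target. */
int cpu_has_ssse3(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Only strictly needed before static constructors have run
       (for example inside an IFUNC resolver); harmless here. */
    __builtin_cpu_init();
    return __builtin_cpu_supports("ssse3") ? 1 : 0;
#else
    return -1;
#endif
}

/* Hypothetical dispatcher: names the code path a caller might select. */
const char *pick_impl(void)
{
    int has = cpu_has_ssse3();
    assert(has >= -1 && has <= 1);
    return has == 1 ? "ssse3" : "generic";
}
```

A caller could print pick_impl() at startup to see which path would be chosen on the current machine.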
498
- Function Multiversioning Support with G++:
499
It is now possible to create multiple function versions each targeting a
500
specific processor and/or ISA. Function versions have the same signature but
501
different target attributes. For example, here is a program with function
504
__attribute__ ((target ("default")))
510
__attribute__ ((target ("sse4.2")))
519
assert ((*p)() == foo());
523
Please refer to this wiki for more information.
525
- The x86 backend has been improved to allow option -fschedule-insns to work
526
reliably. This option can be used to schedule instructions better and leads
527
to improved performance in certain cases.
529
- Windows MinGW-w64 targets (*-w64-mingw*) require at least r5437 from the
532
- Support for new AMD family 15h processors (Steamroller core) is now
533
available through the -march=bdver3 and -mtune=bdver3 options.
535
- Support for new AMD family 16h processors (Jaguar core) is now available
536
through the -march=btver2 and -mtune=btver2 options.
542
- This target now supports the -fstack-usage command-line option.
548
- GCC can now generate code specifically for the R4700, Broadcom XLP and MIPS
549
34kn processors. The associated -march options are -march=r4700, -march=xlp
550
and -march=34kn respectively.
552
- GCC now generates better DSP code for MIPS 74k cores thanks to further
553
scheduling optimizations.
555
- The MIPS port now supports the -fstack-check option.
557
- GCC now passes the -mmcu and -mno-mcu options to the assembler.
559
- Previous versions of GCC would silently accept -fpic and -fPIC for
560
-mno-abicalls targets like mips*-elf. This combination was not intended or
561
supported, and did not generate position-independent code. GCC 4.8 now
562
reports an error when this combination is used.
486
- -mfpmath=sse is now implied by -ffast-math on all targets where
489
- Intel AVX-512 support was added to GCC. That includes inline
490
assembly support, new registers and extending existing ones, new
491
intrinsics (covered by corresponding testsuite), and basic
492
autovectorization. AVX-512 instructions are available via the
493
following GCC switches: AVX-512 foundation instructions: -mavx512f,
494
AVX-512 prefetch instructions: -mavx512pf, AVX-512 exponential and
495
reciprocal instructions: -mavx512er, AVX-512 conflict detection
496
instructions: -mavx512cd.
498
- It is now possible to call x86 intrinsics from select functions in
499
a file that are tagged with the corresponding target attribute
500
without having to compile the entire file with the -mxxx option.
501
This improves the usability of x86 intrinsics and is particularly
502
useful when doing Function Multiversioning.
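A minimal sketch of this pattern, assuming an x86 build with a GCC-compatible compiler: only one function is tagged with the sse4.2 target attribute and uses the _mm_crc32_u8 intrinsic, while the rest of the file is compiled without -msse4.2. The function names and the bitwise CRC-32C fallback are illustrative additions, not part of GCC.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <nmmintrin.h>          /* SSE4.2 intrinsics such as _mm_crc32_u8 */
#define HAVE_X86_TARGET_ATTR 1  /* hypothetical macro local to this sketch */
#endif

/* Portable bitwise CRC-32C (Castagnoli) update for one byte. */
static uint32_t crc32c_byte_generic(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

#ifdef HAVE_X86_TARGET_ATTR
/* Only this function is compiled with SSE4.2 enabled. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_byte_sse42(uint32_t crc, uint8_t byte)
{
    return _mm_crc32_u8(crc, byte);
}
#endif

/* Dispatch at run time so the SSE4.2 path is only taken when supported. */
uint32_t crc32c_byte(uint32_t crc, uint8_t byte)
{
#ifdef HAVE_X86_TARGET_ATTR
    if (__builtin_cpu_supports("sse4.2"))
        return crc32c_byte_sse42(crc, byte);
#endif
    return crc32c_byte_generic(crc, byte);
}
```

The run-time check keeps the binary usable on processors without SSE4.2, which is the same concern Function Multiversioning addresses automatically.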
504
- GCC now supports the new Intel microarchitecture named Silvermont
505
through -march=silvermont.
507
- GCC now supports the new Intel microarchitecture named Broadwell
508
through -march=broadwell.
510
- Options for optimizing for other Intel microarchitectures have been renamed to
511
-march=nehalem, westmere, sandybridge, ivybridge, haswell, bonnell.
513
- -march=generic has been retuned for better support of Intel core
514
and AMD Bulldozer architectures. Performance of AMD K7, K8, Intel
515
Pentium-M, and Pentium4 based CPUs is no longer considered
516
important for generic.
518
- -mtune=intel can now be used to generate code running well on the
519
most current Intel processors, which are Haswell and Silvermont for
522
- Support to encode 32-bit assembly instructions in 16-bit format is
523
now available through the -m16 command-line option.
525
- Better inlining of memcpy and memset that is aware of value ranges
526
and produces shorter alignment prologues.
528
- -mno-accumulate-outgoing-args is now honored when unwind
529
information is output. Argument accumulation is also now turned
530
off for portions of programs optimized for size.
532
- Support for new AMD family 15h processors (Excavator core) is now
533
available through the -march=bdver4 and -mtune=bdver4 options.
539
- A new command-line option -mcpu= has been added to the MSP430
540
backend. This option is used to specify the ISA to be used.
541
Accepted values are msp430 (the default), msp430x and msp430xv2.
542
The ISA is no longer deduced from the -mmcu= option as there are
543
far too many different MCU names. The -mmcu= option is still
544
supported, and this is still used to select linker scripts and
545
generate a C preprocessor symbol that will be recognised by the
546
msp430.h header file.
552
- A new nds32 port supports the 32-bit architecture from Andes
553
Technology Corporation.
555
- The port provides initial support for the V2, V3, V3m instruction
562
- A port for the Altera Nios II has been contributed by Mentor Graphics.
565
565
PowerPC / PowerPC64 / RS6000
566
566
----------------------------
568
- SVR4 configurations (GNU/Linux, FreeBSD, NetBSD) no longer save, restore or
569
update the VRSAVE register by default. The respective operating systems
570
manage the VRSAVE register directly.
572
- Large TOC support has been added for AIX through the command line option
575
- Native Thread-Local Storage support has been added for AIX.
577
- VMX (Altivec) and VSX instruction sets now are enabled implicitly when
578
targeting processors that support those hardware features on AIX 6.1 and
585
- This target will now issue a warning message whenever multiple fast
586
interrupt handlers are found in the same compilation unit. This feature can
587
be turned off by the new -mno-warn-multiple-fast-interrupts command-line
568
- GCC now supports Power ISA 2.07, which includes support for
569
Hardware Transactional Memory (HTM), Quadword atomics and several
570
VMX and VSX additions, including Crypto, 64-bit integer, 128-bit
571
integer and decimal integer operations.
573
- Support for the POWER8 processor is now available through the
574
-mcpu=power8 and -mtune=power8 options. The libitm library has
575
been modified to add an HTM fastpath that automatically uses POWER's
576
HTM hardware instructions when it is executing on an HTM enabled
579
- Support for the new powerpc64le-linux platform has been added. It
580
defaults to generating code that conforms to the ELFV2 ABI.
594
- Support for the IBM zEnterprise zEC12 processor has been added. When using
595
the -march=zEC12 option, the compiler will generate code making use of the
596
following new instructions:
598
- load and trap instructions
599
- 2 new compare and trap instructions
600
- rotate and insert selected bits - without CC clobber
602
The -mtune=zEC12 option enables zEC12 specific instruction scheduling
603
without making use of new instructions.
605
- Register pressure sensitive instruction scheduling is enabled by default.
607
- The ifunc function attribute is enabled by default.
609
- memcpy and memcmp invocations on big memory chunks or with run time lengths
610
are not generated inline anymore when tuning for z10 or higher. The purpose
611
is to make use of the IFUNC optimized versions in Glibc.
586
- Support for the Transactional Execution Facility included with the
587
IBM zEnterprise zEC12 processor has been added. A set of GCC style
588
builtins as well as XLC style builtins are provided. The builtins
589
are enabled by default when using the -march=zEC12 option but can
590
explicitly be disabled with -mno-htm. Using the GCC builtins, libitm
591
also supports hardware transactions on S/390.
593
- The hotpatch feature allows functions to be prepared for hotpatching.
594
A certain number of bytes is reserved before the function entry
595
label plus a NOP is inserted at its very beginning to implement a
596
backward jump when applying a patch. The feature can either be
597
enabled via command line option -mhotpatch for a compilation unit
598
or can be enabled per function using the hotpatch attribute.
600
- The shrink wrap optimization is now supported on S/390 and
603
- A major rework of the routines to determine which registers need to
604
be saved and restored in function prologue/epilogue now allows GCC to
605
use floating point registers as save slots. This will happen for
606
certain leaf functions with -march=z10 or higher.
608
- The LRA rtl pass replaces reload by default on S/390.
614
- The port now allows specifying the RX100, RX200, and RX600
615
processors with the command line options -mcpu=rx100, -mcpu=rx200
617
- The default alignment settings have been reduced to be less aggressive. This
618
results in more compact code for optimization levels other than -Os.
620
- Improved support for the __atomic built-in functions:
622
- A new option -matomic-model=model selects the model for the generated
623
atomic sequences. The following models are supported:
626
Software gUSA sequences (SH3* and SH4* only). On SH4A targets this
627
will now also partially utilize the movco.l and movli.l
628
instructions. This is the default when the target is sh3*-*-linux*
631
Hardware movco.l / movli.l sequences (SH4A only).
633
Software thread control block sequences.
635
Software interrupt flipping sequences (privileged mode only). This
636
is the default when the target is sh1*-*-linux* or sh2*-*-linux*.
638
Generates function calls to the respective __atomic built-in
639
functions. This is the default for SH64 targets or when the target
642
- The option -msoft-atomic has been deprecated. It is now an alias for
643
-matomic-model=soft-gusa.
645
- A new option -mtas makes the compiler generate the tas.b instruction for
646
the __atomic_test_and_set built-in function regardless of the selected
649
- The __sync functions in libgcc now reflect the selected atomic model when
650
building the toolchain.
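To make the role of __atomic_test_and_set concrete, here is a minimal try-lock sketch built on it and on __atomic_clear. The type and function names (tas_lock, tas_trylock, tas_unlock) are hypothetical. Which instruction sequence the compiler emits for the built-in depends on the target and on the selected atomic model (or -mtas), and this sketch does not verify that.

```c
#include <stdbool.h>

/* Hypothetical one-byte lock; false means unlocked. */
typedef struct { bool flag; } tas_lock;

/* Try to take the lock once; returns true on success.
   __atomic_test_and_set returns the previous state of the flag. */
static inline bool tas_trylock(tas_lock *l)
{
    return !__atomic_test_and_set(&l->flag, __ATOMIC_ACQUIRE);
}

/* Release the lock by clearing the flag. */
static inline void tas_unlock(tas_lock *l)
{
    __atomic_clear(&l->flag, __ATOMIC_RELEASE);
}
```

A spinning acquire would simply loop on tas_trylock until it returns true.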
652
- Added support for the mov.b and mov.w instructions with displacement
655
- Added support for the SH2A instructions movu.b and movu.w.
657
- Various improvements to code generated for integer arithmetic.
659
- Improvements to conditional branches and code that involves the T bit. A new
660
option -mzdcbranch tells the compiler to favor zero-displacement branches.
661
This is enabled by default for SH4* targets.
663
- The pref instruction will now be emitted by the __builtin_prefetch built-in
664
function for SH3* targets.
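For illustration, a loop that issues __builtin_prefetch a fixed distance ahead of the element being read might look like the sketch below. The function name and the look-ahead distance of 16 elements are arbitrary assumptions and are not tuned for any SH core.

```c
#include <stddef.h>

/* Sum an array while hinting that data a few iterations ahead will be
   read soon. Arguments: address, 0 = read access, 1 = low temporal
   locality. The hint does not change the computed result. */
long sum_with_prefetch(const int *data, size_t n)
{
    enum { AHEAD = 16 };  /* assumed look-ahead distance */
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + AHEAD < n)
            __builtin_prefetch(&data[i + AHEAD], 0, 1);
        total += data[i];
    }
    return total;
}
```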
666
- The fmac instruction will now be emitted by the fmaf standard function and
667
the __builtin_fmaf built-in function.
669
- The -mfused-madd option has been deprecated in favor of the machine-
670
independent -ffp-contract option. Notice that the fmac instruction will now
671
be generated by default for expressions like a * b + c. This is due to the
672
compiler default setting -ffp-contract=fast.
674
- Added new options -mfsrra and -mfsca to allow the compiler to use the fsrra
675
and fsca instructions on targets other than SH4A (where they are already
678
- Added support for the __builtin_bswap32 built-in function. It is now
679
expanded as a sequence of swap.b and swap.w instructions instead of a
680
library function call.
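A short sketch of the built-in in use, next to an equivalent shift-and-mask version for comparison. Both function names are hypothetical. The two should return the same value on any target, with only the generated instructions differing.

```c
#include <stdint.h>

/* Byte-swap via the built-in; the compiler chooses the instructions. */
uint32_t swap32_builtin(uint32_t v)
{
    return __builtin_bswap32(v);
}

/* Equivalent portable formulation using shifts and masks. */
uint32_t swap32_portable(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}
```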
682
- The behavior of the -mieee option has been fixed and the negative form
683
-mno-ieee has been added to control the IEEE conformance of floating point
684
comparisons. By default -mieee is now enabled and the option -ffinite-math-
685
only implicitly sets -mno-ieee.
687
- Added support for the built-in functions __builtin_thread_pointer and
688
__builtin_set_thread_pointer. This assumes that GBR is used to hold the
689
thread pointer of the current thread. Memory loads and stores relative to
690
the address returned by __builtin_thread_pointer will now also utilize GBR
691
based displacement address modes.
697
- Added optimized instruction scheduling for Niagara4.
703
- Added support for the -mcmodel=MODEL command-line option. The models
704
supported are small and large.
710
- This target now supports the E3V5 architecture via the use of the new
711
-mv850e3v5 command-line option. It also has experimental support for the e3v5
712
LOOP instruction which can be enabled via the new -mloop command-line
719
- This target now supports the -fstack-usage command-line option.
729
- Executables are now linked against shared libgcc by default. The previous
730
default was to link statically, which can still be done by explicitly
731
specifying -static or -static-libgcc on the command line. However it is
732
strongly advised against, as it will cause problems for any application that
733
makes use of DLLs compiled by GCC. It should be alright for a monolithic
734
stand-alone application that only links against the Windows OS DLLs, but
735
offers little or no benefit.
622
- Minor improvements to code generated for integer arithmetic and code
623
that involves the T bit.
625
- Added support for the SH2A clips and clipu instructions. The
626
compiler will now try to utilize them for min/max expressions such
627
as max (-128, min (127, x)).
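The min/max form mentioned above can be written in plain C as in the sketch below. The helper names are hypothetical, and whether a particular compiler version actually selects clips for this exact shape is not something the sketch verifies.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Saturate x to the signed 8-bit range [-128, 127]. */
int clip_s8(int x)
{
    int r = max_int(-128, min_int(127, x));
    assert(r >= -128 && r <= 127);
    return r;
}
```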
629
- Added support for the cmp/str instruction through built-in
630
functions such as __builtin_strlen. When not optimizing for size,
631
the compiler will now expand calls to e.g. strlen as inlined
632
sequences which utilize the cmp/str instruction.
634
- Improved code generated around volatile memory loads and stores.
636
- The option -mcbranchdi has been deprecated. Specifying it
637
will result in a warning and will not influence code generation.
639
- The option -mcmpeqdi has been deprecated. Specifying it
640
will result in a warning and will not influence code generation.
738
644
For questions related to the use of GCC, please consult these web