On olympus.extreme.indiana.edu (sparc-sun-solaris2.6):
Initial version, with -O2 -ftemplate-depth-30 -funroll-loops
Woohoo. Okay, obviously inlining is the key.
Now try new expression templates:
With -O -funroll-loops -DBZ_NEW_EXPRESSION_TEMPLATES
With -O -funroll-loops -DBZ_NEW_EXPRESSION_TEMPLATES -DBZ_NO_INLINE_ET
With -O -funroll-loops -DBZ_NEW_EXPRESSION_TEMPLATES -DBZ_NO_INLINE_ET -DBZ_ETPARMS_CONSTREF
Just -O (this drops -funroll-loops)
-fno-expensive-optimizations
-fno-rerun-cse-after-loop
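
For reference, a minimal sketch of the kind of code these flags act on. This is not Blitz++ source: the names Expr, Sum, VecView, Vec and SKETCH_ETPARM are made up for illustration, and the macro only imitates what -DBZ_ETPARMS_CONSTREF is assumed to toggle (expression operands passed by const reference instead of by value). It shows why inlining matters so much: the whole sum collapses into one loop only if the compiler inlines the nested operator[] calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical switch, assumed to mirror the by-value vs. const-ref
// experiment above; it is NOT the real Blitz++ macro.
#ifdef SKETCH_ETPARMS_CONSTREF
  #define SKETCH_ETPARM(T) const T&
#else
  #define SKETCH_ETPARM(T) T
#endif

// CRTP base so that operator+ only matches expression types.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Leaf node: a cheap-to-copy view of a vector's storage.
struct VecView : Expr<VecView> {
    const double* p;
    std::size_t n;
    VecView(const double* p_, std::size_t n_) : p(p_), n(n_) {}
    double operator[](std::size_t i) const { return p[i]; }
    std::size_t size() const { return n; }
};

// Interior node for lhs[i] + rhs[i]; operands are stored by value,
// so no temporary vector is ever allocated.
template <class L, class R>
struct Sum : Expr<Sum<L, R> > {
    L lhs;
    R rhs;
    Sum(SKETCH_ETPARM(L) l, SKETCH_ETPARM(R) r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Builds the expression tree; performs no arithmetic itself.
template <class L, class R>
inline Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

class Vec {
    std::vector<double> d;
public:
    explicit Vec(std::size_t n, double v = 0.0) : d(n, v) {}
    double& operator[](std::size_t i) { return d[i]; }
    double operator[](std::size_t i) const { return d[i]; }
    std::size_t size() const { return d.size(); }
    VecView view() const { return VecView(d.data(), d.size()); }

    // Evaluates the whole expression tree in a single pass.
    template <class E>
    Vec& operator=(const Expr<E>& e) {
        const E& x = e.self();
        assert(x.size() == d.size());
        for (std::size_t i = 0; i < d.size(); ++i) d[i] = x[i];
        return *this;
    }
};
```

Usage would look like `y = a.view() + b.view() + c.view();` for Vec objects of equal length. Compiling with -DSKETCH_ETPARMS_CONSTREF switches the node constructors to const-reference parameters, which is the kind of change the flag comparison above is measuring.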
On hgar1.cwru.edu (alpha), with KCC:
With +K3 -O3 -DBZ_NEW_EXPRESSION_TEMPLATES -DBZ_NO_INLINE_ET -DBZ_ETPARMS_CONSTREF:
So a speedup of about 2x with KCC, not counting the overhead.
Here are the results for <valarray>: