~ubuntu-branches/ubuntu/feisty/clamav/feisty

Viewing changes to libclamav/c++/llvm/lib/Target/X86/README-FPStack.txt

  • Committer: Bazaar Package Importer
  • Author(s): Kees Cook
  • Date: 2007-02-20 10:33:44 UTC
  • mto: This revision was merged to the branch mainline in revision 16.
  • Revision ID: james.westby@ubuntu.com-20070220103344-zgcu2psnx9d98fpa
Tags: upstream-0.90
Import upstream version 0.90

//===---------------------------------------------------------------------===//
// Random ideas for the X86 backend: FP stack related stuff
//===---------------------------------------------------------------------===//

//===---------------------------------------------------------------------===//

Some targets (e.g. athlons) prefer freep to fstp ST(0):
http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html

//===---------------------------------------------------------------------===//
 
This should use fiadd on chips where it is profitable:
double foo(double P, int *I) { return P+*I; }

We have fiadd patterns now, but the following two have the same cost and
complexity. We need a way to specify that the latter is more profitable.

def FpADD32m  : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (extloadf64f32 addr:$src2)))]>;
                // ST(0) = ST(0) + [mem32]

def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (X86fild addr:$src2, i32)))]>;
                // ST(0) = ST(0) + [mem32int]
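
As a C-level illustration of the two patterns above, the pair of functions
below shows the source shapes involved. The function names are made up for
this note and do not come from the backend. The first adds a float loaded
from memory, the second adds an int loaded from memory, which is the case
where fiadd could apply.

```c
/* Hypothetical helpers, named only for this illustration. */

/* p + *f: a 32-bit float memory operand, extended to double.
   Same shape as the FpADD32m pattern: ST(0) = ST(0) + [mem32]. */
double add_f32_mem(double p, const float *f) { return p + *f; }

/* p + *i: a 32-bit integer memory operand, converted to double.
   Same shape as the FpIADD32m pattern: ST(0) = ST(0) + [mem32int]. */
double add_i32_mem(double p, const int *i) { return p + *i; }
```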
 
//===---------------------------------------------------------------------===//
 
The FP stackifier should handle simple permutations to reduce the number of
shuffle instructions, e.g. turning:

fld P   ->              fld Q
fld Q                   fld P
fxch

or:

fxch    ->              fucomi
fucomi                  jl X
jg X

Ideas:
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html
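
The second rewrite works because swapping the operands of a comparison and
mirroring the condition gives the same result. A minimal C sketch of that
equivalence follows. The function names are hypothetical, and the sketch says
nothing about how the stackifier would detect the case.

```c
/* Original form: compare a with b and take the branch when a > b
   (the fxch; fucomi; jg X sequence). */
int taken_original(double a, double b) { return a > b; }

/* Rewritten form: operands left in their swapped order and the condition
   mirrored (the fucomi; jl X sequence), so no exchange is needed. */
int taken_swapped(double a, double b) { return b < a; }
```

Both functions agree for every input, including NaN operands, where each
comparison is false.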
 
//===---------------------------------------------------------------------===//
 
Add a target-specific hook to the DAG combiner to handle SINT_TO_FP and
FP_TO_SINT when the source operand is already in memory.
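
For reference, the C shapes that give rise to these two nodes with a memory
source look roughly like the functions below (the names are hypothetical).
Note that fild takes its integer operand from memory, so an integer that
already lives in memory needs no extra copy before the conversion.

```c
/* SINT_TO_FP with the integer source in memory: *p can be read
   directly by an integer-load instruction such as fild. */
double int_in_memory_to_double(const int *p) { return (double)*p; }

/* FP_TO_SINT with the floating-point source in memory: *p is loaded
   and then converted with truncation toward zero. */
int double_in_memory_to_int(const double *p) { return (int)*p; }
```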
 
//===---------------------------------------------------------------------===//
 
Open code rint, floor, ceil, trunc:
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html

Open code the sincos[f] libcall.
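
To illustrate what open coding the rounding functions means, the sketch below
computes trunc and floor inline through an integer conversion instead of
calling libm. This is not the technique from the linked patches. The function
names are hypothetical, and the code assumes a finite input whose magnitude
is below 2^63 so that the conversion to long long is defined.

```c
/* Hypothetical inline replacements for trunc() and floor().
   Assumption: x is finite and |x| < 2^63, so the conversion to
   long long is well defined. Signed zero is not preserved. */
double trunc_inline(double x) {
    return (double)(long long)x;   /* C conversion truncates toward zero */
}

double floor_inline(double x) {
    double t = trunc_inline(x);
    /* Truncation rounds negative non-integers up, so step back down by one. */
    return (t > x) ? t - 1.0 : t;
}
```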
 
//===---------------------------------------------------------------------===//

None of the FPStack instructions are handled in
X86RegisterInfo::foldMemoryOperand, which prevents the spiller from
folding spill code into the instructions.

//===---------------------------------------------------------------------===//
 
Currently the x86 codegen isn't very good at mixing SSE and FPStack
code:

unsigned int foo(double x) { return x; }

foo:
        subl $20, %esp
        movsd 24(%esp), %xmm0
        movsd %xmm0, 8(%esp)
        fldl 8(%esp)
        fisttpll (%esp)
        movl (%esp), %eax
        addl $20, %esp
        ret

This just requires being smarter when custom expanding fptoui.
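
One way such an expansion could stay off the FP stack is to use only a signed
32-bit conversion, which SSE2 provides directly (cvttsd2si). The sketch below
is an assumption about what "smarter" might look like, not the approach the
backend takes. The function name is hypothetical, and the input is assumed to
lie in [0, 2^32).

```c
#include <stdint.h>

/* Hypothetical fptoui expansion built from a signed conversion only.
   Assumption: 0 <= x < 2^32. */
uint32_t fptoui_via_signed(double x) {
    const double two31 = 2147483648.0;   /* 2^31 */
    if (x < two31)
        return (uint32_t)(int32_t)x;     /* fits the signed range as is */
    /* Shift into the signed range, convert, then restore the top bit. */
    return (uint32_t)(int32_t)(x - two31) ^ 0x80000000u;
}
```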
 
//===---------------------------------------------------------------------===//