//===---------------------------------------------------------------------===//
// Random ideas for the X86 backend: FP stack related stuff
//===---------------------------------------------------------------------===//

//===---------------------------------------------------------------------===//

Some targets (e.g. Athlons) prefer freep to fstp ST(0):
http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html

//===---------------------------------------------------------------------===//

This should use fiadd on chips where it is profitable:
double foo(double P, int *I) { return P+*I; }

We have fiadd patterns now, but the following two patterns have the same cost
and complexity. We need a way to specify that the latter is more profitable.

def FpADD32m  : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (extloadf64f32 addr:$src2)))]>;
                // ST(0) = ST(0) + [mem32]

def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (X86fild addr:$src2, i32)))]>;
                // ST(0) = ST(0) + [mem32int]

//===---------------------------------------------------------------------===//

The FP stackifier should handle simple permutations to reduce the number of
shuffle instructions, e.g. turning:

fld P   ->      fld Q
fld Q           fld P
fxch

or:

fxch    ->      fucomi
fucomi          jl X
jg X

Ideas:
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html

//===---------------------------------------------------------------------===//

Add a target-specific hook to the DAG combiner to handle SINT_TO_FP and
FP_TO_SINT when the source operand is already in memory.
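
A minimal sketch of the kind of input this would affect. The function names
are hypothetical and do not come from the tree. In both cases the source
operand of the conversion is read from memory, so the x87 load could in
principle take the address directly (fild for the integer case, fld followed
by a store-as-integer for the FP case) rather than going through another
register first:

```c
/* Hypothetical test case: SINT_TO_FP whose integer source is in memory. */
double sint_to_fp_from_mem(const int *p) {
    return *p;
}

/* Hypothetical test case: FP_TO_SINT whose FP source is in memory. */
int fp_to_sint_from_mem(const double *p) {
    return (int)*p; /* C semantics: truncate toward zero */
}
```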

//===---------------------------------------------------------------------===//

Open code rint, floor, ceil, trunc:
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html

Open code the sincos[f] libcall.

//===---------------------------------------------------------------------===//

None of the FPStack instructions are handled in
X86RegisterInfo::foldMemoryOperand, which prevents the spiller from
folding spill code into the instructions.

//===---------------------------------------------------------------------===//

Currently the x86 codegen isn't very good at mixing SSE and FPStack
code:

unsigned int foo(double x) { return x; }

foo:
        subl $20, %esp
        movsd 24(%esp), %xmm0
        movsd %xmm0, 8(%esp)
        fldl 8(%esp)
        fisttpll (%esp)
        movl (%esp), %eax
        addl $20, %esp
        ret

This just requires being smarter when custom expanding fptoui.

//===---------------------------------------------------------------------===//
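
For the fptoui case above, one possible expansion that stays off the FP stack
is sketched below in C. This is an illustration under stated assumptions, not
what the backend currently emits: the function name is hypothetical, and the
input is assumed to be in the range where the C conversion is defined (greater
than -1 and less than 2^32, not NaN). Each cast to int32_t corresponds to a
single signed truncating conversion such as cvttsd2si.

```c
#include <stdint.h>

/* Hypothetical helper: double -> uint32_t using only signed 32-bit
   truncating conversions. Assumes -1 < x < 2^32 and x is not NaN. */
uint32_t fptoui_via_sint(double x) {
    const double two31 = 2147483648.0; /* 2^31 */

    if (x < two31) {
        /* Value fits in a signed 32-bit integer: convert directly. */
        return (uint32_t)(int32_t)x;
    }

    /* Bias into the signed range, convert, then set the top bit again. */
    return (uint32_t)(int32_t)(x - two31) ^ 0x80000000u;
}
```

The subtraction of 2^31 is exact for doubles in this range, so the result
matches a plain (unsigned)x cast for all inputs covered by the assumption.

//===---------------------------------------------------------------------===//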