//===----------------------------------------------------------------------===//
// Representing sign/zero extension of function results
//===----------------------------------------------------------------------===//

Mar 25, 2009  - Initial Revision

Most ABIs specify that functions which return small integers do so in a
specific integer GPR.  This is an efficient way to go, but raises the question:
if the returned value is smaller than the register, what do the high bits hold?

There are three (interesting) possible answers: undefined, zero extended, or
sign extended.  The number of bits in question depends on the data type that
the front-end is referencing (typically i1/i8/i16/i32).
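
To make the three answers concrete, here is a small C sketch of what a 32-bit
register could hold when a 16-bit value is returned in it.  The helper names
(zext16, sext16, check_low_bits_agree) are made up for illustration and are
not part of any ABI or of LLVM.  The "undefined" case is simply any register
value whose low 16 bits match.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension: the bits above bit 15 are all 0. */
static uint32_t zext16(uint16_t v) {
    return (uint32_t)v;
}

/* Sign extension: the bits above bit 15 are copies of bit 15. */
static uint32_t sext16(uint16_t v) {
    uint32_t low = v;
    return (low & 0x8000u) ? (low | 0xFFFF0000u) : low;
}

/* Every answer agrees on the low 16 bits; only the high bits differ. */
static void check_low_bits_agree(uint16_t v) {
    assert((zext16(v) & 0xFFFFu) == v);
    assert((sext16(v) & 0xFFFFu) == v);
}
```

For the value 0x8001, zext16 yields 0x00008001 and sext16 yields 0xFFFF8001.
For 0x7FFF the two agree, because bit 15 is clear.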

Knowing the answer to this is important for two reasons: 1) we want to be able
to implement the ABI correctly.  If the ABI requires us to sign extend the
result, we really do need to do so to preserve correctness.  2) this
information is often useful for optimization purposes, and we want the
mid-level optimizers to be able to process this (e.g. eliminate redundant
extensions).

For example, let's pretend that X86 requires the callee to properly extend the
result of a return (I'm not sure this is the case, but the argument doesn't
depend on this).  Given this, we should compile this:

int a();
short b() { return a(); }

into:

_b:
	subl	$12, %esp
	call	L_a$stub
	addl	$12, %esp
	cwtl
	ret

An optimization example is that we should be able to eliminate the explicit
sign extension in this example:

short y();
int z() {
  return ((int)y() << 16) >> 16;
}

_z:
	subl	$12, %esp
	call	_y
	;;  movswl %ax, %eax   -> not needed because eax is already sext'd
	addl	$12, %esp
	ret

//===----------------------------------------------------------------------===//
// What we have right now.
//===----------------------------------------------------------------------===//

Currently, these sorts of things are modeled by compiling a function to return
the small type with a signext/zeroext marker attached.  For example, we compile
z into:

define i32 @z() nounwind {
entry:
	%0 = tail call signext i16 (...)* @y() nounwind
	%1 = sext i16 %0 to i32
	ret i32 %1
}

and b into:

define signext i16 @b() nounwind {
entry:
	%0 = tail call i32 (...)* @a() nounwind		; <i32> [#uses=1]
	%retval12 = trunc i32 %0 to i16		; <i16> [#uses=1]
	ret i16 %retval12
}

This has some problems: 1) the precise semantics are poorly defined (see
PR3779).  2) some targets might want the caller to extend, while others might
want the callee to extend.  3) the mid-level optimizer doesn't know the size of
the GPR, so it doesn't know that %0 is sign extended up to 32 bits here, and
even if it did, it could not eliminate the sext.  4) the code generator has
historically assumed that the result is extended to i32, which is a problem on
PIC16 (and is also probably wrong on Alpha and other 64-bit targets).
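
Problem 4 can be illustrated with a short C sketch that models a 64-bit
register.  Everything here is an assumption made for illustration: the helper
names (sext16_to_width, is_full_sext16) are hypothetical, and the bits above
the extension width are simply left as zero, whereas a real target could leave
anything there.

```c
#include <assert.h>
#include <stdint.h>

/* Sign extend a 16-bit value to `width` bits of a 64-bit register image.
   Assumption for illustration: bits above `width` are left as zero. */
static uint64_t sext16_to_width(uint16_t v, unsigned width) {
    assert(width >= 16 && width <= 64);
    uint64_t low = v;
    uint64_t ext = (low & 0x8000u) ? (low | ~UINT64_C(0xFFFF)) : low;
    uint64_t mask = (width == 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
    return ext & mask;
}

/* Does the full 64-bit register hold the sign extension of its low 16 bits? */
static int is_full_sext16(uint64_t reg) {
    return reg == sext16_to_width((uint16_t)(reg & 0xFFFFu), 64);
}
```

Extending 0x8001 only to i32 gives 0x00000000FFFF8001, which is not a valid
sign extension across a 64-bit register; extending it to 64 bits gives
0xFFFFFFFFFFFF8001.  A code generator that assumes "extended to i32" on such a
target would therefore model the register contents incorrectly for negative
values.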

//===----------------------------------------------------------------------===//
// The proposal
//===----------------------------------------------------------------------===//

I suggest that we have the front-end fully lower out the ABI issues here to
LLVM IR.  This makes it 100% explicit what is going on and means that there is
no cause for confusion.  For example, the cases above should compile into:

define i32 @z() nounwind {
entry:
	%0 = tail call i32 (...)* @y() nounwind
	%1 = trunc i32 %0 to i16
	%2 = sext i16 %1 to i32
	ret i32 %2
}
define i32 @b() nounwind {
entry:
	%0 = tail call i32 (...)* @a() nounwind
	%retval12 = trunc i32 %0 to i16
	%tmp = sext i16 %retval12 to i32
	ret i32 %tmp
}

In this model, no functions will return an i1/i8/i16 (and on an x86-64 target
that extends results to i64, no i32).  This solves the ambiguity issue, allows
us to fully describe all possible ABIs, and now allows the optimizers to reason
about and eliminate these extensions.

The one thing that is missing is the ability for the front-end and optimizer to
specify/infer the guarantees provided by the ABI to allow other optimizations.
For example, in the y/z case, since y is known to return a sign extended value,
the trunc/sext in z should be eliminable.

This can be done by introducing new sext/zext attributes which mean "I know
that the result of the function is sign extended at least N bits".  Given this,
and given that the attribute is attached to the y function, the mid-level
optimizer could easily eliminate the extensions etc. with existing
functionality.
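
The reasoning behind that elimination can be sketched in C.  This is only one
reading of the proposed attribute, and the helper names (trunc_sext16,
is_sext_from) are hypothetical: if a 32-bit result is already the sign
extension of its low 16 bits, then a trunc to i16 followed by a sext back to
i32 returns the same value, so the pair can be dropped.

```c
#include <assert.h>
#include <stdint.h>

/* trunc i32 -> i16 followed by sext i16 -> i32, on a 32-bit value. */
static uint32_t trunc_sext16(uint32_t x) {
    uint32_t low = x & 0xFFFFu;
    return (low & 0x8000u) ? (low | 0xFFFF0000u) : low;
}

/* Hypothetical reading of the attribute: x is the sign extension of its
   low `from_bits` bits, i.e. bit (from_bits - 1) and all bits above it agree. */
static int is_sext_from(uint32_t x, unsigned from_bits) {
    assert(from_bits >= 1 && from_bits <= 32);
    uint32_t high = x >> (from_bits - 1);
    uint32_t ones = UINT32_MAX >> (from_bits - 1);
    return high == 0 || high == ones;
}
```

Whenever is_sext_from(x, 16) holds, trunc_sext16(x) == x, which is the fact
the optimizer would rely on.  For a value such as 0x00008001 the guarantee does
not hold, and the trunc/sext pair changes the result to 0xFFFF8001.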

The major disadvantage of doing this sort of thing is that it makes the ABI
lowering stuff even more explicit in the front-end, whereas we would like to
eventually move to having the code generator do more of this work.  However,
the sad truth of the matter is that a) that move is unlikely to happen anytime
in the near future, and b) this is no worse than what we have now with the
existing attributes.

C compilers fundamentally have to reason about the target in many ways.
This is ugly and horrible, but a fact of life.