Home | History | Annotate | Download | only in HistoricalNotes
      1 From: Chris Lattner <sabre (a] nondot.org>
      2 To: "Vikram S. Adve" <vadve (a] cs.uiuc.edu>
      3 Subject: Re: LLVM Feedback
      4 
      5 I've included your feedback in the /home/vadve/lattner/llvm/docs directory
      6 so that it will live in CVS eventually with the rest of LLVM.  I've
      7 significantly updated the documentation to reflect the changes you
      8 suggested, as specified below:
      9 
     10 > We should consider eliminating the type annotation in cases where it is
     11 > essentially obvious from the instruction type:
     12 >        br bool <cond>, label <iftrue>, label <iffalse>
     13 > I think your point was that making all types explicit improves clarity
     14 > and readability.  I agree to some extent, but it also comes at the
     15 > cost of verbosity.  And when the types are obvious from people's
     16 > experience (e.g., in the br instruction), it doesn't seem to help as
     17 > much.
     18 
     19 Very true.  We should discuss this more, but my reasoning is more of a
     20 consistency argument.  There are VERY few instructions that can have all
     21 of the types eliminated, and doing so when available unnecessarily makes
     22 the language more difficult to handle.  Especially when you see 'int
     23 %this' and 'bool %that' all over the place, I think it would be
     24 disorienting to see:
     25 
     26   br %predicate, %iftrue, %iffalse
     27 
     28 for branches.  Even just typing that once gives me the creeps. ;)  Like I
     29 said, we should probably discuss this further in person...
     30 
     31 > On reflection, I really like your idea of having the two different
     32 > switch types (even though they encode implementation techniques rather
     33 > than semantics).  It should simplify building the CFG and my guess is it
     34 > could enable some significant optimizations, though we should think
     35 > about which.
     36 
     37 Great.  I added a note to the switch section commenting on how the VM
     38 should just use the instruction type as a hint, and that the
     39 implementation may choose altermate representations (such as predicated
     40 branches).
     41 
     42 > In the lookup-indirect form of the switch, is there a reason not to
     43 > make the val-type uint?
     44 
     45 No.  This was something I was debating for a while, and didn't really feel
     46 strongly about either way.  It is common to switch on other types in HLL's
     47 (for example signed int's are particularly common), but in this case, all
     48 that will be added is an additional 'cast' instruction.  I removed that
     49 from the spec.
     50 
     51 > I agree with your comment that we don't need 'neg'
     52 
     53 Removed.
     54 
     55 > There's a trade-off with the cast instruction:
     56 >  +  it avoids having to define all the upcasts and downcasts that are
     57 >     valid for the operands of each instruction  (you probably have
     58 >     thought of other benefits also)
     59 >  -  it could make the bytecode significantly larger because there could
     60 >     be a lot of cast operations
     61 
     62  + You NEED casts to represent things like:
     63     void foo(float);
     64     ...
     65     int x;
     66     ...
     67     foo(x);
     68    in a language like C.  Even in a Java like language, you need upcasts
     69    and some way to implement dynamic downcasts.
     70  + Not all forms of instructions take every type (for example you can't
     71    shift by a floating point number of bits), thus SOME programs will need
     72    implicit casts.
     73 
     74 To be efficient and to avoid your '-' point above, we just have to be
     75 careful to specify that the instructions shall operate on all common
     76 types, therefore casting should be relatively uncommon.  For example all
     77 of the arithmetic operations work on almost all data types.
     78 
     79 > Making the second arg. to 'shl' a ubyte seems good enough to me.
     80 > 255 positions seems adequate for several generations of machines
     81 
     82 Okay, that comment is removed.
     83 
     84 > and is more compact than uint.
     85 
     86 No, it isn't.  Remember that the bytecode encoding saves value slots into
     87 the bytecode instructions themselves, not constant values.  This is
     88 another case where we may introduce more cast instructions (but we will
     89 also reduce the number of opcode variants that must be supported by a
     90 virtual machine).  Because most shifts are by constant values, I don't
     91 think that we'll have to cast many shifts.  :)
     92 
     93 > I still have some major concerns about including malloc and free in the
     94 > language (either as builtin functions or instructions).
     95 
     96 Agreed.  How about this proposal:
     97 
     98 malloc/free are either built in functions or actual opcodes.  They provide
     99 all of the type safety that the document would indicate, blah blah
    100 blah. :)
    101 
    102 Now, because of all of the excellent points that you raised, an
    103 implementation may want to override the default malloc/free behavior of
    104 the program.  To do this, they simply implement a "malloc" and
    105 "free" function.  The virtual machine will then be defined to use the user
    106 defined malloc/free function (which return/take void*'s, not type'd
    107 pointers like the builtin function would) if one is available, otherwise
    108 fall back on a system malloc/free.
    109 
    110 Does this sound like a good compromise?  It would give us all of the
    111 typesafety/elegance in the language while still allowing the user to do
    112 all the cool stuff they want to...
    113 
    114 >  'alloca' on the other hand sounds like a good idea, and the
    115 >  implementation seems fairly language-independent so it doesn't have the
    116 >  problems with malloc listed above.
    117 
    118 Okay, once we get the above stuff figured out, I'll put it all in the
    119 spec.
    120 
    121 >  About indirect call:
    122 >  Your option #2 sounded good to me.  I'm not sure I understand your
    123 >  concern about an explicit 'icall' instruction?
    124 
    125 I worry too much.  :)  The other alternative has been removed. 'icall' is
    126 now up in the instruction list next to 'call'.
    127 
    128 > I believe tail calls are relatively easy to identify; do you know why
    129 > .NET has a tailcall instruction?
    130 
    131 Although I am just guessing, I believe it probably has to do with the fact
    132 that they want languages like Haskell and lisp to be efficiently runnable
    133 on their VM.  Of course this means that the VM MUST implement tail calls
    134 'correctly', or else life will suck.  :)  I would put this into a future
    135 feature bin, because it could be pretty handy...
    136 
    137 >  A pair of important synchronization instr'ns to think about:
    138 >    load-linked
    139 >    store-conditional
    140 
    141 What is 'load-linked'?  I think that (at least for now) I should add these
    142 to the 'possible extensions' section, because they are not immediately
    143 needed...
    144 
    145 > Other classes of instructions that are valuable for pipeline
    146 > performance:
    147 >    conditional-move            
    148 >    predicated instructions
    149 
    150 Conditional move is effectly a special case of a predicated
    151 instruction... and I think that all predicated instructions can possibly
    152 be implemented later in LLVM.  It would significantly change things, and
    153 it doesn't seem to be very necessary right now.  It would seem to
    154 complicate flow control analysis a LOT in the virtual machine.  I would
    155 tend to prefer that a predicated architecture like IA64 convert from a
    156 "basic block" representation to a predicated rep as part of it's dynamic
    157 complication phase.  Also, if a basic block contains ONLY a move, then
    158 that can be trivally translated into a conditional move...
    159 
    160 > I agree that we need a static data space.  Otherwise, emulating global
    161 > data gets unnecessarily complex.
    162 
    163 Definitely.  Also a later item though.  :)
    164 
    165 > We once talked about adding a symbolic thread-id field to each
    166 > ..
    167 > Instead, it could a great topic for a separate study.
    168 
    169 Agreed.  :)
    170 
    171 > What is the semantics of the IA64 stop bit?
    172 
    173 Basically, the IA64 writes instructions like this:
    174 mov ...
    175 add ...
    176 sub ...
    177 op xxx
    178 op xxx
    179 ;;
    180 mov ...
    181 add ...
    182 sub ...
    183 op xxx
    184 op xxx
    185 ;;
    186 
    187 Where the ;; delimits a group of instruction with no dependencies between
    188 them, which can all be executed concurrently (to the limits of the
    189 available functional units).  The ;; gets translated into a bit set in one
    190 of the opcodes.
    191 
    192 The advantages of this representation is that you don't have to do some
    193 kind of 'thread id scheduling' pass by having to specify ahead of time how
    194 many threads to use, and the representation doesn't have a per instruction
    195 overhead...
    196 
    197 > And finally, another thought about the syntax for arrays :-)
    198 >  Although this syntax:
    199 >         array <dimension-list> of <type>
    200 >  is verbose, it will be used only in the human-readable assembly code so
    201 >  size should not matter.  I think we should consider it because I find it
    202 >  to be the clearest syntax.  It could even make arrays of function
    203 >  pointers somewhat readable.
    204 
    205 My only comment will be to give you an example of why this is a bad
    206 idea.  :)
    207 
    208 Here is an example of using the switch statement (with my recommended
    209 syntax):
    210 
    211 switch uint %val, label %otherwise, 
    212        [%3 x {uint, label}] [ { uint %57, label %l1 }, 
    213                               { uint %20, label %l2 }, 
    214                               { uint %14, label %l3 } ]
    215 
    216 Here it is with the syntax you are proposing:
    217 
    218 switch uint %val, label %otherwise, 
    219        array %3 of {uint, label} 
    220               array of {uint, label}
    221                               { uint %57, label %l1 },
    222                               { uint %20, label %l2 },
    223                               { uint %14, label %l3 }
    224 
    225 Which is ambiguous and very verbose. It would be possible to specify
    226 constants with [] brackets as in my syntax, which would look like this:
    227 
    228 switch uint %val, label %otherwise,
    229        array %3 of {uint, label}  [ { uint %57, label %l1 },
    230                                     { uint %20, label %l2 },
    231                                     { uint %14, label %l3 } ]
    232 
    233 But then the syntax is inconsistent between type definition and constant
    234 definition (why do []'s enclose the constants but not the types??).  
    235 
    236 Anyways, I'm sure that there is much debate still to be had over
    237 this... :)
    238 
    239 -Chris
    240 
    241 http://www.nondot.org/~sabre/os/
    242 http://www.nondot.org/MagicStats/
    243 http://korbit.sourceforge.net/
    244 
    245 
    246