1 SUMMARY 2 ------- 3 4 We met to discuss the LLVM instruction format and bytecode representation: 5 6 ISSUES RESOLVED 7 --------------- 8 9 1. We decided that we shall use a flat namespace to represent our 10 variables in SSA form, as opposed to having a two dimensional namespace 11 of the original variable and the SSA instance subscript. 12 13 ARGUMENT AGAINST: 14 * A two dimensional namespace would be valuable when doing alias 15 analysis because the extra information can help limit the scope of 16 analysis. 17 18 ARGUMENT FOR: 19 * Including this information would require that all users of the LLVM 20 bytecode would have to parse and handle it. This would slow down the 21 common case and inflate the instruction representation with another 22 infinite variable space. 23 24 REASONING: 25 * It was decided that because original variable sources could be 26 reconstructed from SSA form in linear time, that it would be an 27 unjustified expense for the common case to include the extra 28 information for one optimization. Alias analysis itself is typically 29 greater than linear in asymptotic complexity, so this extra analaysis 30 would not affect the runtime of the optimization in a significant 31 way. Additionally, this would be an unlikely optimization to do at 32 runtime. 33 34 35 IDEAS TO CONSIDER 36 ----------------- 37 38 1. Including dominator information in the LLVM bytecode 39 representation. This is one example of an analysis result that may be 40 packaged with the bytecodes themselves. As a conceptual implementation 41 idea, we could include an immediate dominator number for each basic block 42 in the LLVM bytecode program. Basic blocks could be numbered according 43 to the order of occurrence in the bytecode representation. 44 45 2. Including loop header and body information. This would facilitate 46 detection of intervals and natural loops. 47 48 UNRESOLVED ISSUES 49 ----------------- 50 51 1. Will oSUIF provide enough of an infrastructure to support the research 52 that we will be doing? We know that it has less than stellar 53 performance, but hope that this will be of little importance for our 54 static compiler. This could affect us if we decided to do some IP 55 research. Also we do not yet understand the level of exception support 56 currently implemented. 57 58 2. Should we consider the requirements of a direct hardware implementation 59 of the LLVM when we design it? If so, several design issues should 60 have their priorities shifted. The other option is to focus on a 61 software layer interpreting the LLVM in all cases. 62 63 3. Should we use some form of packetized format to improve forward 64 compatibility? For example, we could design the system to encode a 65 packet type and length field before analysis information, to allow a 66 runtime to skip information that it didn't understand in a bytecode 67 stream. The obvious benefit would be for compatibility, the drawback 68 is that it would tend to splinter that 'standard' LLVM definition. 69 70 4. Should we use fixed length instructions or variable length 71 instructions? Fetching variable length instructions is expensive (for 72 either hardware or software based LLVM runtimes), but we have several 73 'infinite' spaces that instructions operate in (SSA register numbers, 74 type spaces, or packet length [if packets were implemented]). Several 75 options were mentioned including: 76 A. Using 16 or 32 bit numbers, which would be 'big enough' 77 B. A scheme similar to how UTF-8 works, to encode infinite numbers 78 while keeping small number small. 79 C. Use something similar to Huffman encoding, so that the most common 80 numbers are the smallest. 81 82 -Chris 83 84