Ok, here are my comments and suggestions about the LLVM instruction set.
We should discuss some now, but can discuss many of them later, when we
revisit synchronization, type inference, and other issues.
(We have discussed some of the comments already.)


o We should consider eliminating the type annotation in cases where it is
  essentially obvious from the instruction type, e.g., in br, it is obvious
  that the first arg. should be a bool and the other args should be labels:

      br bool <cond>, label <iftrue>, label <iffalse>

  I think your point was that making all types explicit improves clarity
  and readability. I agree to some extent, but it also comes at the cost
  of verbosity. And when the types are obvious from people's experience
  (e.g., in the br instruction), it doesn't seem to help as much.


o On reflection, I really like your idea of having the two different switch
  types (even though they encode implementation techniques rather than
  semantics). It should simplify building the CFG and my guess is it could
  enable some significant optimizations, though we should think about which.


o In the lookup-indirect form of the switch, is there a reason not to make
  the val-type uint? Most HLL switch statements (including Java and C++)
  require that anyway. And it would also make the val-type uniform
  in the two forms of the switch.

  I did see the switch-on-bool examples and, while cute, we can just use
  the branch instructions in that particular case.


o I agree with your comment that we don't need 'neg'.


o There's a trade-off with the cast instruction:
  + it avoids having to define all the upcasts and downcasts that are
    valid for the operands of each instruction (you probably have thought
    of other benefits also)
  - it could make the bytecode significantly larger because there could
    be a lot of cast operations


o Making the second arg.
  to 'shl' a ubyte seems good enough to me.
  255 positions seem adequate for several generations of machines
  and a ubyte is more compact than a uint.


o I still have some major concerns about including malloc and free in the
  language (either as builtin functions or instructions). LLVM must be
  able to represent code from many different languages. Languages such as
  C, C++, Java and Fortran 90 would not be able to use our malloc anyway
  because each of them will want to provide a library implementation of it.

  This gets even worse when code from different languages is linked
  into a single executable (which is fairly common in large apps).
  Having a single malloc would just not suffice, and instead would simply
  complicate the picture further because it adds an extra variant in
  addition to the one each language provides.

  Instead, providing a default library version of malloc and free
  (and perhaps a malloc_gc with garbage collection instead of free)
  would make a good implementation available to anyone who wants it.

  I don't recall all your arguments in favor so let's discuss this again,
  and soon.


o 'alloca' on the other hand sounds like a good idea, and the
  implementation seems fairly language-independent so it doesn't have the
  problems with malloc listed above.


o About indirect call:
  Your option #2 sounded good to me. I'm not sure I understand your
  concern about an explicit 'icall' instruction?


o A pair of important synchronization instr'ns to think about:
      load-linked
      store-conditional


o Other classes of instructions that are valuable for pipeline performance:
      conditional-move
      predicated instructions


o I believe tail calls are relatively easy to identify; do you know why
  .NET has a tailcall instruction?


o I agree that we need a static data space. Otherwise, emulating global
  data gets unnecessarily complex.
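

  As a rough sketch of how the load-linked / store-conditional pair listed
  above typically gets used: read a value, compute a new one, and store it
  only if nobody else wrote in between, retrying otherwise. The C11
  compare-exchange loop below is only an analogy for that pattern (on some
  targets compilers lower it to an LL/SC pair, but that is up to the
  compiler). The function name atomic_add_retry is made up for illustration.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically add `delta` to *counter and return the
 * new value, using a retry loop in the load-linked / store-conditional
 * style. */
static int atomic_add_retry(atomic_int *counter, int delta)
{
    /* Plays the role of load-linked: take a snapshot of the current value. */
    int observed = atomic_load(counter);

    /* Plays the role of store-conditional: the store succeeds only if
     * *counter still equals `observed`. On failure (including a spurious
     * one), `observed` is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(counter, &observed, observed + delta)) {
        /* nothing to do here; `observed` was updated for the next attempt */
    }

    /* On success `observed` still holds the value we replaced. */
    return observed + delta;
}
```

  The point of the retry shape is that no lock is held between the read and
  the write, which is what makes the pair attractive as a primitive.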


o About explicit parallelism:

  We once talked about adding a symbolic thread-id field to each
  instruction. (It could be optional so single-threaded codes are
  not penalized.) This could map well to multi-threaded architectures
  while providing easy ILP for single-threaded ones. But it is probably
  too radical an idea to include in a base version of LLVM. Instead, it
  could be a great topic for a separate study.

  What are the semantics of the IA64 stop bit?


o And finally, another thought about the syntax for arrays :-)

  Although this syntax:
      array <dimension-list> of <type>
  is verbose, it will be used only in the human-readable assembly code so
  size should not matter. I think we should consider it because I find it
  to be the clearest syntax. It could even make arrays of function
  pointers somewhat readable.
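
  To illustrate the readability point about arrays of function pointers,
  here is how such an array looks in C declarator syntax. The names
  (add_one, twice, square, op_table) are made up for illustration, and the
  spelled-out form in the comment is only a guess at how the verbose
  notation might read, since the note gives just the general shape
  `array <dimension-list> of <type>`.

```c
/* Hypothetical functions, only here to populate the table. */
static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }
static int square(int x)  { return x * x; }

/* C declarator syntax: the reader has to unwind this from the inside out
 * ("op_table is an array of 3 const pointers to functions taking int and
 * returning int").
 *
 * In the verbose form it might read roughly as
 *     array [3] of <pointer to function (int) returning int>
 * where the exact spelling of the dimension list and the element type is
 * an assumption. */
static int (*const op_table[3])(int) = { add_one, twice, square };
```

  A call through the table, such as op_table[1](4), invokes twice(4).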