1 Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST) 2 From: Chris Lattner <sabre (a] nondot.org> 3 To: Vikram Adve <vadve (a] cs.uiuc.edu> 4 Subject: Re: a few thoughts 5 6 Okay... here are a few of my thoughts on this (it's good to know that we 7 think so alike!): 8 9 > 1. We need to be clear on our goals for the VM. Do we want to emphasize 10 > portability and safety like the Java VM? Or shall we focus on the 11 > architecture interface first (i.e., consider the code generation and 12 > processor issues), since the architecture interface question is also 13 > important for portable Java-type VMs? 14 15 I forsee the architecture looking kinda like this: (which is completely 16 subject to change) 17 18 1. The VM code is NOT guaranteed safe in a java sense. Doing so makes it 19 basically impossible to support C like languages. Besides that, 20 certifying a register based language as safe at run time would be a 21 pretty expensive operation to have to do. Additionally, we would like 22 to be able to statically eliminate many bounds checks in Java 23 programs... for example. 24 25 2. Instead, we can do the following (eventually): 26 * Java bytecode is used as our "safe" representation (to avoid 27 reinventing something that we don't add much value to). When the 28 user chooses to execute Java bytecodes directly (ie, not 29 precompiled) the runtime compiler can do some very simple 30 transformations (JIT style) to convert it into valid input for our 31 VM. Performance is not wonderful, but it works right. 32 * The file is scheduled to be compiled (rigorously) at a later 33 time. This could be done by some background process or by a second 34 processor in the system during idle time or something... 35 * To keep things "safe" ie to enforce a sandbox on Java/foreign code, 36 we could sign the generated VM code with a host specific private 37 key. Then before the code is executed/loaded, we can check to see if 38 the trusted compiler generated the code. This would be much quicker 39 than having to validate consistency (especially if bounds checks have 40 been removed, for example) 41 42 > This is important because the audiences for these two goals are very 43 > different. Architects and many compiler people care much more about 44 > the second question. The Java compiler and OS community care much more 45 > about the first one. 46 47 3. By focusing on a more low level virtual machine, we have much more room 48 for value add. The nice safe "sandbox" VM can be provided as a layer 49 on top of it. It also lets us focus on the more interesting compilers 50 related projects. 51 52 > 2. Design issues to consider (an initial list that we should continue 53 > to modify). Note that I'm not trying to suggest actual solutions here, 54 > but just various directions we can pursue: 55 56 Understood. :) 57 58 > a. A single-assignment VM, which we've both already been thinking 59 > about. 60 61 Yup, I think that this makes a lot of sense. I am still intrigued, 62 however, by the prospect of a minimally allocated VM representation... I 63 think that it could have definite advantages for certain applications 64 (think very small machines, like PDAs). I don't, however, think that our 65 initial implementations should focus on this. :) 66 67 Here are some other auxiliary goals that I think we should consider: 68 69 1. Primary goal: Support a high performance dynamic compilation 70 system. This means that we have an "ideal" division of labor between 71 the runtime and static compilers. Of course, the other goals of the 72 system somewhat reduce the importance of this point (f.e. portability 73 reduces performance, but hopefully not much) 74 2. Portability to different processors. Since we are most familiar with 75 x86 and solaris, I think that these two are excellent candidates when 76 we get that far... 77 3. Support for all languages & styles of programming (general purpose 78 VM). This is the point that disallows java style bytecodes, where all 79 array refs are checked for bounds, etc... 80 4. Support linking between different language families. For example, call 81 C functions directly from Java without using the nasty/slow/gross JNI 82 layer. This involves several subpoints: 83 A. Support for languages that require garbage collectors and integration 84 with languages that don't. As a base point, we could insist on 85 always using a conservative GC, but implement free as a noop, f.e. 86 87 > b. A strongly-typed VM. One question is do we need the types to be 88 > explicitly declared or should they be inferred by the dynamic 89 > compiler? 90 91 B. This is kind of similar to another idea that I have: make OOP 92 constructs (virtual function tables, class heirarchies, etc) explicit 93 in the VM representation. I believe that the number of additional 94 constructs would be fairly low, but would give us lots of important 95 information... something else that would/could be important is to 96 have exceptions as first class types so that they would be handled in 97 a uniform way for the entire VM... so that C functions can call Java 98 functions for example... 99 100 > c. How do we get more high-level information into the VM while keeping 101 > to a low-level VM design? 102 > o Explicit array references as operands? An alternative is 103 > to have just an array type, and let the index computations be 104 > separate 3-operand instructions. 105 106 C. In the model I was thinking of (subject to change of course), we 107 would just have an array type (distinct from the pointer 108 types). This would allow us to have arbitrarily complex index 109 expressions, while still distinguishing "load" from "Array load", 110 for example. Perhaps also, switch jump tables would be first class 111 types as well? This would allow better reasoning about the program. 112 113 5. Support dynamic loading of code from various sources. Already 114 mentioned above was the example of loading java bytecodes, but we want 115 to support dynamic loading of VM code as well. This makes the job of 116 the runtime compiler much more interesting: it can do interprocedural 117 optimizations that the static compiler can't do, because it doesn't 118 have all of the required information (for example, inlining from 119 shared libraries, etc...) 120 121 6. Define a set of generally useful annotations to add to the VM 122 representation. For example, a function can be analysed to see if it 123 has any sideeffects when run... also, the MOD/REF sets could be 124 calculated, etc... we would have to determine what is reasonable. This 125 would generally be used to make IP optimizations cheaper for the 126 runtime compiler... 127 128 > o Explicit instructions to handle aliasing, e.g.s: 129 > -- an instruction to say "I speculate that these two values are not 130 > aliased, but check at runtime", like speculative execution in 131 > EPIC? 132 > -- or an instruction to check whether two values are aliased and 133 > execute different code depending on the answer, somewhat like 134 > predicated code in EPIC 135 136 These are also very good points... if this can be determined at compile 137 time. I think that an epic style of representation (not the instruction 138 packing, just the information presented) could be a very interesting model 139 to use... more later... 140 141 > o (This one is a difficult but powerful idea.) 142 > A "thread-id" field on every instruction that allows the static 143 > compiler to generate a set of parallel threads, and then have 144 > the runtime compiler and hardware do what they please with it. 145 > This has very powerful uses, but thread-id on every instruction 146 > is expensive in terms of instruction size and code size. 147 > We would need to compactly encode it somehow. 148 149 Yes yes yes! :) I think it would be *VERY* useful to include this kind 150 of information (which EPIC architectures *implicitly* encode. The trend 151 that we are seeing supports this greatly: 152 153 1. Commodity processors are getting massive SIMD support: 154 * Intel/Amd MMX/MMX2 155 * AMD's 3Dnow! 156 * Intel's SSE/SSE2 157 * Sun's VIS 158 2. SMP is becoming much more common, especially in the server space. 159 3. Multiple processors on a die are right around the corner. 160 161 If nothing else, not designing this in would severely limit our future 162 expansion of the project... 163 164 > Also, this will require some reading on at least two other 165 > projects: 166 > -- Multiscalar architecture from Wisconsin 167 > -- Simultaneous multithreading architecture from Washington 168 > 169 > o Or forget all this and stick to a traditional instruction set? 170 171 Heh... :) Well, from a pure research point of view, it is almost more 172 attactive to go with the most extreme/different ISA possible. On one axis 173 you get safety and conservatism, and on the other you get degree of 174 influence that the results have. Of course the problem with pure research 175 is that often times there is no concrete product of the research... :) 176 177 > BTW, on an unrelated note, after the meeting yesterday, I did remember 178 > that you had suggested doing instruction scheduling on SSA form instead 179 > of a dependence DAG earlier in the semester. When we talked about 180 > it yesterday, I didn't remember where the idea had come from but I 181 > remembered later. Just giving credit where its due... 182 183 :) Thanks. 184 185 > Perhaps you can save the above as a file under RCS so you and I can 186 > continue to expand on this. 187 188 I think it makes sense to do so when we get our ideas more formalized and 189 bounce it back and forth a couple of times... then I'll do a more formal 190 writeup of our goals and ideas. Obviously our first implementation will 191 not want to do all of the stuff that I pointed out above... be we will 192 want to design the project so that we do not artificially limit ourselves 193 at sometime in the future... 194 195 Anyways, let me know what you think about these ideas... and if they sound 196 reasonable... 197 198 -Chris 199 200