Lecture 22 - 480/580 (compiler construction) a RISCy business Having generated code, we have completed our tour through a typical compiler. In my last few lectures I want to touch on a few additional topics that are interesting but not central to our discussion of a typical compiler. The first of these topics is the RISC vs CISC debate. This is probably the most exciting debate to take place in computer architecture in the 1980's. The idea of the RISC was invented largely by David Patterson and his students at UC, Berkeley. Our own Bruce D'Ambrosio worked for a time on the RISC project. For the interested, more data can be found in back issues of Computer, particular Sept. 1982 (which started the debate). Consider the world as it existed in ancient times; say around 1965: 1) computer memory (air-cooled core memory) was expensive. A machine with 128K words of memory was massive. 2) many, perhaps most, programs were written in assembly language. because of (1) machine designers were very concerned with space; making programs as small as possible. Thus we saw a flowering of special purpose instructions, such as the polynomial arithmetic instructions on the VAX, special subroutine call instructions, complex addressing modes, and the like. People thought that 16 registers were sufficient, because of (1) any more would have been expensive, and (2) people, with limited brainpower, wouldn't know how to keep any more than that number straight. Besides, saving registers was expensive. During the late 60's and 1970's a great deal of effort was spent on making machines faster (this is still true). Given fixed limitations (electricity will only travel so fast through copper), one way to speed up machines is to pipeline instructions. Consider a typical instruction, such as: add 8(16),d0 a typical computer process this instruction in four stages: 1. fetch the instruction from memory, and decode what type of instruction it is. 2. fetch the data from memory or registers, placing it into internal registers. 3. perform the instruction 4. store the result back into memory or registers or wherever. Now, the idea behind pipelining is that you can be doing part (1) for the next instruction while you are doing part (3) say, of the last one. The major bottleneck to this idea is the path between the processor and memory. John Backus coined the term--"the von Neumann bottleneck", when he came up with quite another solution to this problem. The difficulty is that everybody wants to get to memory; instruction fetch certainly must do so, and any time an instruction references memory (and about 30 to 50 percent of instructions typically executed will reference memory) we have collisions. Complex instructions mess up this picture in at least three ways: * complex instructions often have more arguments than simple instructions, thus they take more space. This makes prefetching the next instruction difficult. * complex things take time. Complex instructions take longer to perform than simple instructions; a designer is left with a choice of making some instructions take lots longer than others, or making all instructions take a long time. One is a clear loss, the other means additional complexity in the machine. * branch instructions mess up the pipeline. Depending upon the outcome of the branch, the instruction you have decoded may have to be discarded. complex instruction sets also make the compiler writers part harder, by giving her a large set of choices to make, complex addressing modes to recognize, and so on. During the late 70's many people started to question the CISC philosophy. For example, it was found that many of the more complex instructions on the VAX were actually SLOWER than equivalent sequences of simple instructions. Multiplies, to take another example, were often by a constant amount and could be performed much faster with chains of additions and shifts. Thus there seemed to be no reason to use complex instructions. What Patterson did in the early 1980's was to notice that the world had changed. Memory was (or, at that time, showed promise of soon being) dirt cheap. Furthermore almost nobody wrote in assembly language anymore; people universally wrote in high level languages. Thus the basis upon which computer design for the past 20 years had been made was no longer valid. His revoluation was to completely toss out conventional design, and start over again. In doing so, he came up with a design that overcame the problems of the CISC machine. The basic features of the machine he designed, the Berkeley RISC, was as follows: * complex addressing modes were simply hidden computation; and nothing is free. Lets call an add an add; there would be no addressing modes, instead the computations necessary for addressing modes would be done with instructions. Indeed, this makes programs longer, but since memory is cheap, who cares. * make all instructions have the same format, and recognize the same addressing modes. Thus there is almost no special case code in a code generator for the RISC; instructions differ only in their op code. * make all instructions the same size, and execute in the same length of time. Thus all instructions are one word - this makes prefetching the next instruction easy - since it is just one word long. The idea was a simple two-stage pipeline, in which at each point the processor would be executing one instruction and prefetching the next. As I will note in a moment, they didn't quite succeeed in the latter. Complex operations, such as multiplication or floating point, is done by calling subroutines, or by replacing them with equivalent sequences of simple instructions. * there would be lots of registers; at least 64. Thus most scalar variables could be kept in registers. Since the user never wrote assembly language; they would not have to know how these registers were used. Register saves could be avoided by using a register window (which I talked about last week). Because of the high number of registers, most operations would take place in registers. Thus register allocation is almost trivial, since almost everything can be kept in a register. (This does make pointers a bit harder, since the object being pointed to may be in a register, but that can easily be fixed). * the only instructions to access data would be load and store from and to a register. They only wanted one path between the processor and memory, since fetching the next instruction would hog this path most of the time, loads and stores should be infrequent. When they do occur, however, they have to take more time (and extra memory cycles) in order to gain access to the memory path. * to avoid the problem with the branch instruction, delay the branch until on the instruction AFTER the actual branch instruction. Thus nothing needs to be tossed out of the pipline. Simple-minded code generators could just insert no-ops (null operations) after brach statements. But in a surprising number of cases a good optimizer can make use of these time. (about 90 percent of the no-ops following an unconditional branch and about 20 percent of the the no-ops following a conditional branch can be eliminated). By making these changes the result was a machine that was VERY simple. Now comes the kicker; consider VLSI technology. At any time there is a fixed amount of space that can be used on any chip. Complex designs tended to take up a lot of this space, forcing registers to be small or off the chip altogether. With a simple machine, the processor need occupy very little of the chip (about 8%, versus 50% or most VLSI designs). The rest can be devoted to holding that huge register window. However registers are very regular, whereas processors aren't. But this just means that not only is a RISC smaller, faster, but it is easier to design as well. What does this all have to do with compiler construction? Well, the RISC revolution has really been a revolution; forcing everybody to rethink machine design. In the future more machines will be designed along the RISC lines. This will have the effect of making compiler writing is some sense much easier. -------------and now for something completely different -------------- depending upon time, various other topics that can be discussed bootstrapping. How to get a compiler on to a new machine. Interpreters vs compilers; threaded code, etc.