Lecture 14 - 480/580 (compiler construction) Code Generation for Control Flow Constructs Today we are going to discuss how to go about, abstractly, generating code for control flow constructs. I say abstractly becuase we are really not going to generate code for any real machine; but rather describe general algorithms that must then be mapped onto some machine. Just as our abstract syntax trees that I talked about in previous lectures must be mapped onto real machine instructions at some point. In conventional, von Neumann style architecutes, the basic building blocks we have for control flow are: 1. sequential execution (i.e. statements are executed one after another unless interrupted by branching), 2. unconditional branches to labelled statement, and 3. conditional branches to labelled statements, where the conditions are restricted to arithmetic conditions of less than, le, eq, ne, ge, and gt. From these we build everything else. Let us start with the simplest type of statement; the if-then-else. We have if expression then statement1 else statement2 how do we translate that into our building blocks? Well, we can use a combination of conditional and unconditional branches; if expression IS FALSE then GOTO L1 statement1 GOTO L2 L1: statement 2 L2: Remember I said we only had conditional tests for le, lt, etc. I didn't say anything about how to invert a test - how do we handle the not? Well, let's not worry about that today; that topic deserves an entire lecture to itself (so I will do it next lecture). For the moment, assume it can be done. What are the basic parts: 1. Generating a couple new labels. 2. A branch if a condition is FALSE (it may seem a bit odd that we invert the condition; as we will see today, however, this is very common) 3. labels on statements 4. unconditional branches By generating assembly language are are in a sense being a bit lazy; after all we will then have to pass our output through another tool (the assembler). We could generate machine code directly; however one of the advantages of assembler is that we can leave it to the assembler to take care of matching up labels and locations. Suppose we didn't do that, and we generated machine code directly? How could we handle the forward branches? (describe backpatching); a dying art since most modern compilers do generate assembly language. Lets look at the while: while expression do statement L1: if expression IS FALSE then GOTO L2 statement GOTO L1 L2: same pieces, just put together in different fashion. C allow while loops to have a break or continue statement; how might we translate these? (simple unconditional branches to the appropriate label). Pascal has the statement REPEAT statement UNTIL condition how would you generate code for it? How about FOR i := lower TO upper DO statement oops; what if lower and upper are expressions? we have to introduce temporary variables; variables that the compiler introduces but the user doesn't see; these are treated just like any other local variable. Now lets try something a bit more complex, how about: CASE expression OF case 1: statement 1 case 2: statement 2 case 3: statement 3 ... default: statement n end; How might we do this? a) nested if's b) nested if's with temporary variable c) branch tables On many machines we can make a static (compile time) array of labels (either in the data area or the code area). Suppose we can do such a thing; how could we use it? Under what situations do we want to use version a, version b, version c, how about a combination of all? In a REAL compiler (Bell Labs cc, for example), there are very complex algorithms used to try to figure out the best choice of algorithms for switch (or case) statements.