First cut at defining the "IR With No Name"(r)(tm):

Each statement in the IRWNN is terminated by a carriage return. Comments are
preceded by either the '#' character or the ';' character. Identifiers and
labels are made up of [A-Za-z_][A-Za-z0-9_]* and must be unique across the
program. Locals must be unique within their subroutine.

CONSTANTS:

The simplest units in the IRWNN are constants. These may be formed in the
following ways:

0xdead1eaf  - A prefix of 0x indicates a 32 bit hexadecimal constant.
20          - A number with no prefix is a decimal number.
f@a         - Represents the offset of variable 'a' from function f's stack
              frame.
$global     - Represents the absolute address of global variable 'global'.
              Note that this operator is overloaded as well for local vars.
'c', '\ab'  - Represents a one byte constant. In most cases, this will be
              extended to 32 bits, except when used with the 'data' directive
              (below). A \ denotes an escape code. It may be followed by
              either a hex number or nothing (i.e. '\' == '\5C').
"abcde"     - Represents a 5 byte constant that may only be used with the
              'data' directive.

DEFINING VARIABLES:

The IRWNN supports three fundamental types of variables: global, function
local, and heap allocated (GC'd). These are declared as follows. Note that a
variable must be declared before it is used, but variables may be declared
anywhere (except that locals must be in subroutines).

data X = 'A', 'B'     - Defines a two byte global data object (raw memory)
                        and the identifier X. X holds the address of the
                        global data.
data Y = "AB"         - Defines the same object.
data Z = 0x1234, 1    - Defines an 8 byte object initialized to 0x00001234,
                        0x00000001.
object A = 10, 4, $X  - Defines a three word record named A. The record is
                        initialized as {10, 4, addr of X}. X must be another
                        global to be used with the $ notation. The GC uses
                        objects created with the 'object' directive, along
                        with the stack, as roots when collecting other
                        objects. 'data' objects are not used as roots.
local I               - Defines a variable whose scope is restricted to the
                        subroutine in which it is defined. The contents of
                        the newly initialized variable are undefined. It is
                        illegal to declare two local variables of the same
                        name in the same subroutine.
N = alloc M           - N is set to the address of a newly allocated block
                        of at least M bytes. The contents of the newly
                        allocated memory block are undefined.

BRANCHING AND LABELS:

Intra-subroutine control flow is achieved by using branching and labels:

label:           - Defines an entry point. Labels must be on a line by
                   themselves.
goto label       - Unconditional branch to a label.
if B goto label  - If B != 0, branch to the label.
goto $VT[A]      - Multiple dispatch. VT must be a global object. This looks
                   up element 'A' in table VT and indirects through the
                   table entry. Used for switch statements and OOP.

OPERATORS:

Operators are how all of the interesting things happen. There are two unary
operators defined, and several binary operators. All operators operate on
32 bit values at this time. String manipulation will be handled by the
runtime.

_ = W      - Bitwise copy.
A = $X     - If X is a subroutine local variable, $X evaluates to the
             absolute address of X. This is overloaded as mentioned before
             for globals (where it is a constant, rather than an
             expression). For subroutine local variables, a calculation
             must occur to determine the distance from the current frame.
B = Y[Z]   - Indirect into Y, looking at entry Z. Each entry of Y is
             considered to be a 32 bit number. To indirect on a pointer,
             simply evaluate Ptr[0].
C = A+B    - Addition
D = A-B    - Subtraction. -A is defined as 0-A.
E = A*B    - Multiplication
F = A/B    - Signed division (defined to conform to the x86 idiv)
G = A\B    - Unsigned division (defined to conform to the x86 div)
H = A%B    - Signed modulus (defined to conform to the x86 idiv)
I = A&B    - Bitwise And
J = A|B    - Bitwise Or
K = A^B    - Bitwise Xor. Note that ~A is defined as A^-1.
L = A<<B   - Bit shift left
M = A>>B   - Bit shift right
N = A>>>B  - Arithmetic shift right, preserving sign bit.
O = A<B    - Comparison. The comparison operators are <, >, ==, !=, >=,
             and <=.

SUBROUTINES AND FUNCTIONS:

Subroutines are defined to have a block structure starting with SUB X(...)
and terminating with END SUB X. Subroutines may not be nested.

sub X(A,B,C,AL)  - Define a new subroutine named X, with 4 formal (local)
                   parameters. Note that the access link is explicitly
                   passed, as the last parameter.
end sub X        - Terminate the definition of the subroutine. Instructions
                   emitted after this point go back into the main program.
return           - Return from the subroutine with an undefined return
                   value. The 'return' instruction may only be used inside
                   a subroutine.
return Z         - Return from the subroutine with return value 'Z'.
call X(I,J,K,L)  - Call subroutine X, passing I, J, K, L as parameters.
                   Discard the return value. Parameters must be either
                   constants or identifiers.
M = call Y()     - Call subroutine Y, storing the return value in M.

EXAMPLE PROGRAM:

This simple example program uses a subroutine to multiply two numbers:

    local X                  ; Declare two variables
    local Y
    X = call readint()       ; Call out to the runtime to read an integer
    Y = call readint()       ; Note that the runtime is subject to change.
    local Z
    Z = call Multiply(X, Y)

    sub Multiply(X, Y)       # Our simple function need not be defined
    local Z                  # before it is used.
    Z = X*Y
    return Z
    end sub Multiply

    call printint(Z)
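
The difference between the signed and unsigned arithmetic operators in the OPERATORS section is easiest to see on concrete 32 bit patterns. The Python sketch below is illustrative only and is not part of the IRWNN: the helper names (to_signed, sdiv, udiv, smod, sar) are hypothetical, and it assumes that signed division truncates toward zero and that the remainder takes the sign of the dividend, which is how the x86 signed divide instruction behaves. Division by zero and the overflow case are deliberately not handled.

```python
MASK = 0xFFFFFFFF  # all IRWNN operators work on 32 bit values


def to_signed(x):
    """Interpret a 32 bit pattern as a two's-complement signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def sdiv(a, b):
    """A/B: signed division, assumed to truncate toward zero."""
    sa, sb = to_signed(a), to_signed(b)
    q = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        q = -q
    return q & MASK


def udiv(a, b):
    """A\\B: unsigned division on the raw 32 bit patterns."""
    return ((a & MASK) // (b & MASK)) & MASK


def smod(a, b):
    """A%B: signed modulus; the result is assumed to take the dividend's sign."""
    sa, sb = to_signed(a), to_signed(b)
    r = abs(sa) % abs(sb)
    if sa < 0:
        r = -r
    return r & MASK


def sar(a, n):
    """A>>>B: arithmetic shift right, preserving the sign bit."""
    return (to_signed(a) >> n) & MASK
```

With the bit pattern 0xFFFFFFF9 (which is -7 when read as signed), sdiv(0xFFFFFFF9, 2) gives 0xFFFFFFFD (-3), while udiv(0xFFFFFFF9, 2) gives 0x7FFFFFFC, because the same bits are read as a large positive number.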