//===----------------------------------------------------------------------===// // Reworking and revitalizing the LLVM Code Generator //===----------------------------------------------------------------------===// 10/2/2004 Initial revision This note describes changes to the code generator that we need to make in order for it to be able to adapt and work well for our current and future targets. The current MachineInstr/MachineOperand classes are both a mess, and they are neither expressive enough or simple enough. The following is a specific list of changes that should be made to each class or portion of the code generator. //===----------------------------------------------------------------------===// // An aside about the SparcV9 backend //===----------------------------------------------------------------------===// One thing to note is that the SparcV9 backend is currently the piece that is holding back the progress of the rest of the code generator. The primary reason for this is that it uses a completely different representation for machine instructions (based on Value*'s) than the target independent code generator does. In order to address this, the current SparcV9 code generator can be either converted completely over to the target-independent structure (this seems unlikely) or we can fork the machine instruction/operand representation so that the V9 backend uses one set of classes and everything else uses another. I think this is the best short term solution until V9 can be properly addressed. To split the V9 and target-independent instruction representation, I propose renaming MachineInstr to SparcMachineInstr and MachineOperand to SparcMachineOperand. To support the target independent code generator, a new MachineInst and MachineInst::Operand class The MachineBasicBlock class would then contain two lists of instructions: one for the target independent code generator and one for the sparc backend. //===----------------------------------------------------------------------===// // Two-address instruction lowering //===----------------------------------------------------------------------===// The code generator currently uses a form of "lowering" to reduce instructions from three address form to two address form. This is used on targets that have operands that they both use and define, permitting the code generator to represent them in three address form before register allocation. In particular consider the X86 addition operation. Before register allocation, we might have: %vreg1024 = add %vreg1025, %vreg1026 After register allocation, we require: EAX = add EAX, ECX which gets represented as: add EAX, ECX (in other words, one operand is dropped). There are two problems with this: the first is that deleting the operand from the machine instruction is problematic, and the second is that this is not expressive enough. 1. Any information about the instruction needs to know whether it is being used before or after register allocation (because operand numbers shift around). This is a huge problem moving towards an autogenerated target description. 2. This is not expressive enough. In particular, it is only sufficient for targets that need two operands register allocated to the same register. Some targets need more, and supporting inline assembly will eventually require support for this. In order to address these problems, we should make the following changes: 1. The register allocator/two address elimination passes should not delete the operand. After register allocation, the two operands are required to have identical registers. 2. The TargetInstrInfo "isTwoAddress" flag needs to be replaced. In particular, we need more powerful way of describing this. One way of doing so is to have a per-operand reload constraint, which would indicate that the second operand must contain the same physical register as the first operand. This change can and should be made independently of the first. See the 'TargetInstrDescriptor class' section below for more information. //===----------------------------------------------------------------------===// // ValueType enum //===----------------------------------------------------------------------===// The ValueType enum indicates what types of value an operand may have (somewhat equivalent to the GCC concept of a 'mode'). It currently allows for the integer types i1, i8, i16, i32, i64, and i128, the floating point types 'f32, f64, f80', and the special types Other and isVoid. We need arbitrary sized integers to be able to accurately model targets that have (for example) 13-bit immediate fields. This enum should be extended to allow any integer size from 1 to 64 bits, as well keeping i128. The floating point types may be extended or changed in the future (to support non-IEEE FP), but for now they can remain the same. //===----------------------------------------------------------------------===// // TargetInstrDescriptor class //===----------------------------------------------------------------------===// The TargetInstrDescriptor class needs several important changes. First and easiest, in needs a field to indicate the opcode of the instruction the descriptor corresponds to. Thus, this expression should always be true: assert(TII->get(opcode).Opcode == opcode); The reason for this will become clear later. Second, the TargetInstrDescriptor needs to indicate the number of operands the machine instr has. Third, TargetInstrDescriptor needs to contain information about all of the operands of a machine instruction. In particular, any information about operands of a instruction that is always true (for instructions of that opcode) should be stored here. Basically we want to store as much as we can once, in static memory, instead of once for each instance of a machine instruction. In particular, we should keep use/def information for each operand, a ValueType for each operand, operand constraint information for the register allocator, etc. None of this should be stored in the MachineInstr object, it should all be stored once for each opcode. //===----------------------------------------------------------------------===// // MachineInst class //===----------------------------------------------------------------------===// The MachineInst class, which is the target-independent replacement for the current MachineInstr class, should stay somewhat similar to the current class. In particular, it should look like the following: 1. The Parent, Next, and Previous pointers should be retained. 2. numImplicitRefs and all users should be eliminated, as it's V9 specific. 3. Instead of containing an opcode, it should contain a direct pointer to the TargetInstrDescriptor object for the instruction. If the opcode is needed, clients can use MI->getInstrInfo()->Opcode, or a helper method in MachineInstr can be used. 4. The Opcode/MID pointer should be const upon creation of the machine instruction. Once created, this should never, ever, change. As such, the MID pointer should probably be an MID reference, and the 'replace' and 'setOpcode' methods should go away. 5. The number of operands an instruction contains should never change once created (see note about fixing the 2-addr thing above). As such, the operands list should be allocated at the same time the MachineInstr object is. Because of this, the MachineInstr ctor should be made private, and a new MachineInstr::create method should be used instead. MachineInstr::create would take a MID pointer (which indicates the number of operands the MI has), allocate memory, then call the in-place ctor for MachineInst. Given that all clients call (or should call) BuildMI, this change should be easy. 6. Many of the methods in this class should get zapped. In particular, all the "addOperandOfThisFlavor" methods should go away. Imagine a MachineInst class where mere mortals could understand what all of the methods do!! //===----------------------------------------------------------------------===// // MachineInst::Operand class //===----------------------------------------------------------------------===// The MachineOperand replacement is the hardest part of this whole puzzle to get right. In particular, we want to be able to have the following capabilities in our MI::Operand class: 1. We need the following basic operand types: A. Register B. Subregister C. Immediate D. Memory reference 2. For register references, what we have now is basically sufficient: we need to be able to represent virtual regs before allocation and change them to physregs after. This isn't hard. 3. Subregisters are similar, but also a bit different. Basically, a subregister reference is a pair containing a virtual register and a small constant subregister ID. At register allocation time, this is replaced with a specific physical register. Details of this mechanism are not described in detail here. 4. Immediates need to be generalized substantially. In particular, we need the following capabilities: A. The continued ability to *efficiently* represent immediate integers <= 32 bits in size. B. The ability to represent the addresses of global variables, including constant offsets from their start. This should be done without referring to the LLVM Value*'s for the globals. C. The ability to hold the address of a machine basic block. D. The ability to represent that this is a subreference to a portion of the immediate. In particular, RISC machines often load addresses of globals in multiple parts, and need relocations to indicate which slab of the global address the instruction refers to. E. An abstract FrameIndex or ConstantPoolIndex address. 5. Memory references (loaded and stored addresses) need to be very general. In particular, we need to be able to represent the following forms of memory reference: A. Any immediate. B. Target-specific patterns of almost arbitrary complexity. In particular, we should be able to say that the X86 has "CST + REG + [1,2,4,8] * REG" addressing. In addition, we need to keep an alias set number for the memory reference, or some sort of way to relate to the LLVM code. This alias information should come from the instruction selector. Finally, we need to keep information about alignment. //===--------------------- Implementation The above constraints are fairly onerous to provide, at least in an efficient way. The two obvious implementation strategies would be to either: 1. make the Operand class a union of all of the possible fields with a discriminator indicating the current form (this is how MachineOperand currently works). 2. Make the Operand class a base class, which has subclasses for all of the different forms and things. The problem with the first approach is that it wastes memory for operands that are simple (e.g. register references and small integer immediates, which are very very common) by forcing them to be the size of a union of operands. Also, it is not clear how to support arbitrary target-specific addressing modes. The problem with the second approach is that it is very costly. Instead of being a single memory allocation for the machine instruction, it is an allocation for the instruction plus an allocation for each operand. In addition, traversing the operands would require trapsing around memory in random ways, which is bad bad bad. I propose that we use a hybrid approach. In particular, the Operand class would be a base class that contains a union of SOME of the state for the operand, and, further, derived classes are not allowed to contain extra state. Consider the following sketch: struct Operand { enum OperandType { isReg, isSubReg, isIntImmed, isArbitraryImmediate, isMem } OpDiscriminator; protected: union { struct RegStuff { unsigned RegNo; }; struct SubRegStuff { unsigned VirtReg; unsigned SubRegNo; }; struct IntImmediateStuff { unsigned IntVal; }; struct ArbitraryImmediateStuff { MachineImmediate *MImm; }; struct MemStuff { AddressingMode *Addr; unsigned AliasSet; }; } Contents; public: Operand(OperandType OT); } struct RegOperand : public Operand { unsigned getReg() const { return Contents.RegNo; } void setReg(unsigned R) { ... } ... classof stuff for dyn_cast ... }; struct IntImmOperand : public Operand { unsigned getValue() { return ... } ... classof stuff for dyn_cast ... }; ... In particular, because the derived classes do not add any data members to the base class, we can safely construct an array of Operand's (as part of the MachineInst memory allocation) and cast the elements to the derived classes. All of the variable sized and arbitrary complex stuff in the operands (in the arbitrary immediate and the memory mode), which are (small) class hierarchies of simple classes. One thing that is important to note is that the AddressingMode class is extensible by the target, allowing them to have their arbitrarily wierd and complex addressing mode stuff. The addressing mode class would provide virtual methods to print and emit machine code for an addressing mode. A final important aspect of the AddressingMode and MachineImmediate classes is that all instances of them should be aquired through a ::get method (e.g. MachineGlobalAddress::get("malloc")), and all of them are immutable on construction. These two requirements allow the objects to all be shared and reference counted, allowing us to avoid tons of dynamic memory allocations (e.g. all "[ESP + 12]" objects will be shared by instructions that reference that stack slot. //===----------------------------------------------------------------------===// // A path from here to there //===----------------------------------------------------------------------===// This document describes a lot of ideas and important things that need to be done. When we start making these changes, it is important that we make them incrementally. In particular, there are three things that are independent: * The two-address change can be done at any time, and blocks most of the other things here (how can you have info about operands if the operand indexes are not stable?). * The V9 and target-independent Instruction classes can be split up. Initially the bodies of the classes would be the same, then they would be pruned to eliminate unneeded facilities on both sides, then the t-i one would be enhanced. * ValueType should be enhanced. Once these are under way, it would be reasonable to start adding information about operands to the TargetInstrDescriptor class, then finally the MachineOperand/MachineInst changes can be made. It's a long road, but each little piece is relatively straight-forward!