WO2006110069A1 - Data value coherence in computer systems - Google Patents
- Publication number
- WO2006110069A1 (application PCT/SE2005/000534)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target
- original
- data value
- code
- register
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3628—Software debugging of optimised code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3624—Software debugging by performing operations on the source code, e.g. via a compiler
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
Definitions
- the present invention generally relates to the field of computer technology, and more particularly to computer micro-architecture, compiler technology and debugging techniques, and especially the problem of data value coherence in computer systems when translating program code.
- Compiler technology is a technology to support the translation of computer program code from one form to another, and debugging techniques are generally techniques to debug, or in other words, to find faults in computer programs.
- the data value problem is sometimes formulated as the problem of reporting or tracking the expected values of original registers (or variables in the general high level language case) that are expected at a certain program point.
- Reference [2] relates to a built-in debug support device that realizes a multi-processor simulating environment on plural general purpose computers to improve debugging efficiency.
- Reference [3] concerns a software debug port for a microprocessor.
- When used in conjunction with an on-chip trace cache, the software debug port provides trace information for reconstructing instruction execution flow on the processor and is also capable of examining register contents without halting processor operation.
- Reference [4] relates to a debug support device with a debug exception control part that preserves the register state when a debug exception generation instruction is received from a CPU core, changes the program counter to the address of a debug exception handler, and returns the registers to the state before the debug exception when a restoration instruction is received.
- Reference [5] discloses a debug interface with a compact trace record storage having a plurality of trace data storage elements.
- the storage elements have a format including a trace code field indicative of the type of trace information and a trace data field indicative of the type of trace information data.
- Reference [6] relates to a programmable logic device (PLD) that provides the capability to observe and control the logic state of buried internal nodes.
- the PLD provides shadow storage units for internal nodes such as logic element registers, memory cells and I/O registers.
- a sample/load data path includes bidirectional data buses and shift registers that facilitate the sampling of internal nodes for observing their logical states, and loading of internal nodes for controlling their logical states.
- the invention concerns the general data value problem, and especially the residence problem in a computer system when executing program code translated from a source code representation into a target code representation.
- a basic idea of the invention is to associate references to target data value containers in the target code with corresponding address information of original data value containers of the source code during program code translation, and to store information related to target code instructions together with the associated address information of original data value containers at execution of the target code, so as to uphold a data value view of the original source code representation. In this way, tracking of data values of original source code at execution of translated target code in the target system is supported in a highly efficient manner.
- Examples of data value containers include ordinary micro-computer registers as well as memory-allocated variables in a high-level programming language.
- the invention consequently provides support for register coherence between an original register set of the source code and a target register set of the target code.
- the case involving memory variables reflects a solution to the more general data value residency problem.
- the invention also makes it possible to maintain a data value view of an original computer system after optimizations and adaptations of the program code to a different computer system.
- target code instructions are tagged during code translation with respect to data value coherence to provide an association between target data value containers (such as target registers or target variables) in the target code and original data value containers in the source code.
- the information stored together with the associated original data value container addresses at target code execution includes corresponding target instruction address information and/or corresponding instruction operand values.
- the above tracking information is written into a specially designed extra register file, referred to as a ghost register file in the following, and preferably implemented as a hardware register file in the micro-computer architecture.
- the tagged target instructions include one or more target instructions representing an assignment of an original data value container such as a register or variable in the source code.
- data value container address information of the original source code can be moved from one entry (or register) to another in the ghost register file.
- the invention inherently supports the case where different original register (or variable) values might be assigned a target register (or variable) depending on the execution path taken. This is also true for the case where different instances of an original register (or variable) have been translated into parallel instances of target registers.
- the invention preferably also includes logic for upholding sequential consistency of ghost register file operation.
- the invention preferably also provides operation stream processing for selectively transforming ghost register file operations.
- the operation processing logic typically transforms operations siphoned off the processor's ordinary operations stream into assignment, move, store and nop operations targeted for the extra register file.
- the transformation is normally governed by the original ordinary operation and the preceding transformed operations in the extra register file's so-called read latency window.
- An extra field in the instruction words of the target code, an extra register file and some operation processing logic thus enable the states of data value containers of the original source code, in addition to the states of the target data value containers, to be maintained at target code execution, and, if required, even the states of two different computer architectures.
- the invention provides both individual "compile-time" and "run-time" components, as well as an integrated system combination of such individual components. Examples include a compile-time component operable for performing the so-called tagging, and a run-time component operable for storing relevant information in the ghost register file.
- Fig. 1 is a schematic diagram illustrating a mechanism for supporting data value coherence between original source code and target code according to a preferred embodiment of the invention
- Fig. 2 is a schematic flow diagram of an exemplary overall method for tracking data values of original source code at execution of corresponding target code, including debugging and/or trace analysis;
- Fig. 3 illustrates a ghost register file in an exemplary processor environment according to a particular embodiment of the invention
- Fig. 4 is a schematic diagram of a target register file and a ghost register file showing examples of different uses of the ghost register file;
- Fig. 5 is a schematic diagram of a target register file and a ghost register file showing an example of use of the ghost register file when target registers in the sequential code have been parallelized so that sequential target registers have several parallel instances;
- Fig. 6 illustrates an example of ghost register file pipe step logic for ensuring sequential consistency of the ghost register file operation
- Fig. 7 illustrates an example of operation stream processing logic for selectively transforming ghost register file operations
- Fig. 8 illustrates examples of spill and reload operations in the ghost register file according to a preferred embodiment of the invention.
- the invention will first be described in a general context as a general solution to the data value residency problem. Subsequently, the invention will be exemplified in the contexts of register coherence support and tracking of memory variables in a high-level language, respectively.
- Fig. 1 is a schematic diagram illustrating a mechanism for supporting data value coherence between original source code and target code according to a preferred embodiment of the invention.
- the program code is given in an original source code representation, simply referred to as the source code 10, and operates with respect to a set of data value containers 20 such as registers or memory variables.
- the data value containers of the original source code are referred to as original data value containers.
- the source code 10 is translated into a target code representation, simply referred to as target code 30.
- the target code 30 operates with respect to another set of data value containers 40.
- the data value containers of the target code are referred to as target data value containers.
- the code translation normally includes code transformation, optimizations as well as register allocation and allocation of static variables.
- an additional task during the program code translation is to associate references to target data value containers in target code instructions with corresponding address information of original data value containers of the source code.
- this is accomplished by tagging the target code instructions with the relevant information concerning the original data value containers.
- the tagging process may mark (or tag) an instruction's destination container with the address/name of the original container.
- instruction operand values are assigned to the target data value containers and usually also moved between different target containers.
- information related to the target code instructions is stored together with associated address information of original data value containers in a set of so-called 'ghost' or 'shadow' data value containers 50. In this way, a data value view of the original source code representation can be maintained or upheld, at target code run-time, thereby considerably facilitating debugging and/or trace analysis.
- the data value containers 50 for upholding the data value view of the original source code are preferably implemented in a 'ghost' register file in the micro-computer architecture, although alternative implementations exist including the use of an ordinary transaction memory.
- the 'ghost' data value containers 50 may also be memory positions allocated to variables in a high-level programming language.
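- The association described above can be pictured with a minimal Python sketch. The names `TargetInstr`, `orig` and `run`, the instruction addresses and the container names are illustrative assumptions and do not come from the patent text; for simplicity the ghost containers are keyed here by the name of the original container.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetInstr:
    addr: int                   # target instruction address (hypothetical values below)
    dest: str                   # target data value container written by the instruction
    value: int                  # operand value assigned to the destination
    orig: Optional[str] = None  # tag added at translation time: original container name


def run(program):
    """Execute tagged target instructions and uphold a view of the original containers."""
    target = {}  # target data value containers (40 in Fig. 1)
    ghost = {}   # ghost/shadow data value containers (50 in Fig. 1)
    for ins in program:
        target[ins.dest] = ins.value
        if ins.orig is not None:
            # Store instruction-related information with the original container address.
            ghost[ins.orig] = (ins.addr, ins.value)
    return target, ghost


program = [
    TargetInstr(0x100, "t3", 7, orig="X"),
    TargetInstr(0x104, "t3", 9),            # untagged: t3 is reused for a temporary
    TargetInstr(0x108, "t5", 11, orig="Y"),
]
target, ghost = run(program)
```

- In this sketch the value of `X` remains reportable from the ghost containers even after its target container `t3` has been reused by an untagged instruction.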
- Fig. 2 is a schematic flow diagram of an exemplary overall method for tracking data values of original source code at execution of corresponding target code, including debugging and/or trace analysis.
- the target code instructions are preferably tagged with respect to data value coherence to provide an association between target data value containers (such as target registers or target variables) in the target code and original data value containers in the source code (S1).
- the original source code can be analyzed based on the stored tracking information.
- the debug utility can use the target instruction address information to recreate the values which are active in the optimized target code. If the instruction operand values themselves are stored, tracking of values of original data value containers can be made without reference to any target data value containers.
- the code analyzer and translator, typically a compiler, normally analyzes the code to provide data value container information of the original source code that can be used later, during execution of the resulting target code, together with selected run-time information to provide a data value container view of the original source code.
- the invention thus provides a solution of the data value residency problem, and especially the problem of reporting variable values (in the high-level language translation case) or original register values (in the binary translation case) as they are set by the translated optimized code and/or reporting which target registers they reside in at the trace/breakpoint.
- the registers in the original architecture are named original registers, OREGs.
- the registers pertaining to the target architecture are named target registers, treg:s.
- Example 1:
- ld tregB, ADDR0; // target register tregB gets value from
- If tregB initially got the value of the original register OREG1 from the memory cell at ADDR0 (that is, the first original instruction loaded OREG1 from ADDR0) and tregD contained the value of OREG2, which value did tregE get at L1?
- the answer to that question depends both on which path through the code is executed and on which information the compiler can give about the assignments. This implies a solution to the problem where both dynamic and static information is used.
- the optimization process could also distribute the values in an original register at different time points in the original code, to the same time point in the target code. This is demonstrated in example 2.
- Example 2
- the OREGA register is used for values at different times.
- ld treg17, addr1; ld treg19, addr2; nop; nop;
- the dynamic computation in the target system is normally tapped for information on which original registers (in the binary translation case) or variables (in the high-level translation case) are assigned to which target registers.
- This information is preferably maintained in a 'ghost' register file.
- Original registers (or variables) found there are resident and those not found are not resident.
- An advantage of this approach is that an evicted original register (or variable value) can still be reported as long as it is not evicted by another original register (or variable) value.
- the destination register encoding in the target instruction primitive such as a VLIW or RISC primitive
- the target instruction primitive has an extra field for the original register address. If this field has a valid original register address, the result of the operation will also be written into the ghost registers upholding the coherence of the original register set. This ghost register set will never be read except for trace and debugging purposes. There is thus a tolerance of large operation latencies towards the ghost register file. This means that the ghost register file is off the critical path, that the access to it could be pipelined to meet timing requirements, and that the ghost register file could be placed just about anywhere on the chip. If tracking of 'original register values in target registers' is to be supported, the ghost register file normally transfers emulated register tags between ghost register elements.
- the compiler, which is aware of the mapping from the original register set to the target register set, tags an instruction's destination register with the address/name of the original register if the instruction represents an assignment of the original register in the source code. In the examples above this would mean:
- ld tregB(OREG1), ADDR0; // Value of original register OREG1 is loaded into tregB
- mv tregA(OREG1), tregB; // Value of original register OREG1 is moved into tregA
- ld tregB, ADDR1; // tregB gets new value
- Example (1) implies that to just keep the latest values of original registers it is sufficient to tag the target destination registers with the original destination registers at compile/link time and then to write the value to the ghost register at runtime. To keep track of where the original register values reside at every point in the program, it is necessary to have support at runtime in the ghost register file for the case where different original register values might be assigned a target register depending on the execution path taken. As will be explained below, this is preferably accomplished by means of a special ghost register operation type moving information within the ghost register file.
- FIG. 3 illustrates a ghost register file in an exemplary processor environment according to a particular embodiment of the invention.
- the exemplary processor 100 has four stages: Instruction Fetch 110, Decode 120, Execute 130 and Commit 140.
- the processor system also has an ordinary register file 40 and a ghost register file 50.
- the ghost register operations are decoded from the ordinary code stream in the Decode stage 120 and committed to the ghost register 50 when the ordinary instruction operating upon memory and ordinary architectural register file 40 is committed.
- the write data path from the ghost register file to memory uses the ordinary memory write data paths.
- the ghost register file 50 resides a number of pipe steps away from execution pipe for possibly handling read latency. The actual number is chosen to ease the implementation.
- the memory read data path is omitted as well as other necessary common structures, such as data read address bus, instruction address bus and so forth.
- the tracking information stored in the ghost register file (GRF) 50 can be read by a debug utility 200 for performing debugging and/or trace analysis of the original source code.
- the debug utility may of course also read information from the ordinary register file 40 to support target code debugging.
- the debug utility reads snapshots of the GRF stored in memory. In practice, a snapshot of the GRF is normally taken by ordering the contents of the GRF into memory by executing ghost store operations, just like the ordinary register file is stored to memory by (ordinary) store operations. The actual encoding of the ghost store operation could be in the opcode field or in the address field.
- the ghost register set may for example be under three write modes.
- the write modes include different sets of register write destinations among the instructions which write to registers.
- the first write mode is the case where no register coherency information is written. Only register write destination case 1 is found among instructions in this mode.
- the second write mode is the standard register coherency support case, that is, only destination registers in non-removed statements assigning an original register are tagged.
- Register write destination cases 1 and 2 could be found among instructions in this mode. That is, instructions with destination registers which are not tagged with an original register will be written to a target register, whereas instructions with destination registers which are tagged with an original register will be written both to a target register and a ghost register.
- the third write mode is an extra register coherency support case, that is, the compiler will keep a 'ghost' assignment statement where the destination register is tagged with both the original register tag and a write destination 3 tag even though the real assignment has been removed in an optimization.
- In this write mode, all register write destination cases are found among the instructions. This means that in addition to register write destination cases 1 and 2 (second write mode), the register write destination case 3 is found among instructions in this mode.
- the instructions with register write destination 3 represent instructions that have been removed by the compiler during optimization.
- the third write mode is mostly of theoretical interest in the context of register coherence, since the register allocation cannot separate the assignment of ghost registers from target registers, that is, the register allocation will allocate over removed instructions also.
- the presence of ghost-only assignments in the code might thus introduce effects on the register usage and spill (depending on the register file size).
- the write modes are specified in a processor control register.
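- The relation between the three write modes and the three register write destination (RWD) cases can be summarized in a small Python sketch. The names `WriteMode`, `write_destination` and `void_target` are assumptions made for illustration, and raising an error for a ghost-only assignment outside the third write mode is a choice of this sketch, not something stated in the text.

```python
from enum import IntEnum
from typing import Optional, Tuple


class WriteMode(IntEnum):
    NO_COHERENCE = 1  # first write mode: no register coherency information is written
    STANDARD = 2      # second write mode: tagged destinations also go to a ghost register
    EXTRA = 3         # third write mode: ghost-only assignments are kept as well


def write_destination(mode: WriteMode, or_tag: Optional[str],
                      void_target: bool = False) -> Tuple[int, bool, bool]:
    """Return (RWD case, writes target register, writes ghost register)."""
    if void_target:
        # Assignment removed by optimization, kept only for the ghost register file.
        if mode == WriteMode.EXTRA and or_tag is not None:
            return (3, False, True)
        raise ValueError("ghost-only assignment needs the third write mode and an OR tag")
    if or_tag is not None and mode >= WriteMode.STANDARD:
        return (2, True, True)    # written both to target and ghost register
    return (1, True, False)       # written to the target register only
```

- For instance, a tagged instruction yields case 1 in the first write mode and case 2 in the second, while case 3 only appears in the third write mode for a destination defined as void.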
- the debugger/trace utility will have access to a set of ghost registers as well as the instruction address at which each of the ghost registers was written. It will also have the register write destination identification, so that it can deduce how the ghost register was written. This will enable the utility to recreate the original register values that are active in the optimized target code.
- the ghost register file normally contains as many entries as the number of registers in the target architecture, and preferably also a spill area. If this spill area is as large as the target register set all ghost registers could easily be spilled. Each entry preferably includes the original register number together with the instruction address where the register was written and/or the data value stored.
- each entry preferably includes the following fields:
- when original register values are tracked in target registers, the value field (V) is not needed since the value is found in the corresponding target register.
- when only the values of original registers are tracked, the value field (V) is needed since the target register may be assigned new values, in which case the old value(s) will be evicted from the target register.
- the RWD field is typically produced by a mode value in the processor control register for the second write destination (the first will never be present in a ghost register).
- the third write destination is assigned to the RWD field if the processor is in the third write mode and the target destination register is defined as void.
- the encoding of a target register non-modifying operation could also be done via a bit in the opcode or a mode bit field as well as a bit in target destination register. If only register write destination 2 is needed, the RWD field can be omitted.
- Another option is to use one register entry in the register address map as a void marker. Writes to this register entry will not change its content. This register entry could be used to always produce a zero if used. If the E-bit is set, the value (V) is evicted from the target register at the TMIA by a non-original register value.
- Figs. 4 and 5 illustrate exemplary operations on a ghost register file in the cases of 'sequential code' and 'parallel code', respectively.
- the ghost register file and the target register file are depicted to be adjacent. This is generally not the case, in order not to interfere with routing and placement around the target register file.
- the figures just provide a logical view. Figs. 4 and 5 correspond to examples 1) and 2) described above.
- Example 1 of Fig. 4 is used for showing the different uses of the ghost register file when 'tracking of source register values in target registers' is enabled and when 'tracking values of source registers only' is enabled.
- the V field is present here in both cases.
- Example 1) is presented as sequential code, that is, it has not been parallelized into for example VLIW instruction words. The use of the ghost register is the same irrespective of whether it is written from a RISC instruction word or a VLIW instruction word.
- the solid arrow line in Fig. 4 indicates the operation when tracking values of source registers, and the dashed arrow lines indicate added operations when tracking source register values in target registers. For example, the operation to load the value in ADDR0 into target register tregB, 'ld tregB(OREG1), ADDR0', executed at target instruction address IA implies that the information [OREG1, 2, IA, [ADDR0]] is written to the ghost register 'ghost tregB', where OR is OREG1, RWD is equal to 2, the target machine instruction address is IA, and the value V is taken from the memory position at ADDR0.
- a control bit (EVICT) in the processor control register defines whether non-tagged target register assignments (i.e. a non-valid register address in the ghost register field) should invalidate an earlier assignment of an original register address (OR) in the ghost register. This enables two different uses. If original register values in target registers should be tracked, then EVICT should always be set. The EVICT control bit is thus only meaningful if only the original register values are of interest. If cleared, the original register values are kept even when they are no longer present in the target registers; they will only be overwritten by instructions tagged with an original register. If set, the original register values could be ousted from the ghost register file by a non-tagged assignment. This could be summarized in Table II below:
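- The effect of the EVICT control bit can be sketched in Python as follows. The function names `write_ghost` and `run`, the dictionary layout of a ghost entry, and the concrete values and instruction addresses are assumptions made for illustration; the E-bit of the entry is left out of this sketch.

```python
def write_ghost(grf, dtreg, or_tag, value, ia, evict):
    """Update the ghost entry indexed by target register `dtreg` after a register write."""
    if or_tag is not None:
        # Tagged assignment: record OR, RWD case 2, instruction address and value.
        grf[dtreg] = {"OR": or_tag, "RWD": 2, "TMIA": ia, "V": value}
    elif evict:
        # EVICT set: an untagged assignment invalidates the earlier OR assignment.
        grf.pop(dtreg, None)
    # EVICT cleared and no tag: the earlier original register value is kept.


def run(evict):
    grf = {}
    write_ghost(grf, "tregB", "OREG1", 42, 0x10, evict)  # ld tregB(OREG1), ADDR0
    write_ghost(grf, "tregB", None, 99, 0x18, evict)     # ld tregB, ADDR1 (untagged)
    return grf
```

- With EVICT set, `run(True)` leaves no entry for tregB, which reflects that OREG1 is no longer resident in a target register; with EVICT cleared, the last value of OREG1 stays available.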
- the tracing is disabled and the ghost register memory is read and written to main memory via a ghost register dump routine.
- snapshots or checkpoints of the original architecture register set can be saved for later analysis.
- Example 2 of Fig. 5 is used to show the use of the ghost register file when target registers in the sequential code have been parallelized so that 'sequential' target registers have several parallel 'instances'.
- ghost register file internals
- the ghost register file (including the spill area) is a scratchpad which upholds a register view of the original source code, or an original architecture emulated on a target architecture. This scratchpad is operated upon by two basic types of operations.
- One operation is the assignment operation, which is the side-effect of an arithmetic or load instruction in the execution pipe.
- the other operation is the move operation.
- This move operation originates from the move instruction in the execution pipe. The difference here is that this operation does not have an explicitly named original register to write into the ghost register file.
- the move operation must copy the OR field of the ghost register entry indexed by the move operation's source target register number to the ghost register entry indexed by the move operation's destination target register number.
- move operations in the target register file will not incur move operations in the ghost register file.
- a move operation which is not OR-tagged will, in the non-evict case, be a NOP when it reaches the ghost register file.
- Move operations which are OR-tagged will be transformed to a ghost register assignment operation.
- a ghost store operation is used to write ghost register content into the memory.
- the ghost store operation is used when taking snapshots of the ghost register content at observation points in the code stream.
- the code containing ghost store instructions could be code in exception or interrupt routines or ordinary code depending how you would want to set up the observation. These observation snapshots are then used as input to analyzing debugging software.
- a NOP instruction is an empty operation which does not operate upon the ghost register file.
- the ghost move operation must read from the ghost register file in order to be able to write its OR value into the ghost register file.
- the assignment operation only needs to write to the ghost register file.
- the assignment operation supports the static conveying of OR information from the compiler into the ghost register file.
- the move operation supports the dynamic, execution-dependent conveying of OR information in the ghost register file.
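- The difference between the static assignment operation and the dynamic move operation can be illustrated with a short Python sketch. The ghost register file is modelled as a dictionary from target register name to OR value, and the two-path program in `run` is a hypothetical stand-in for a branch in the target code; the function names are assumptions as well.

```python
def ghost_assign(grf, dtreg, oreg):
    """Assignment: the OR value comes statically from the compiler's tag (write only)."""
    grf[dtreg] = oreg


def ghost_move(grf, dtreg, streg):
    """Move: the OR value is read from the source entry at run time, then written."""
    grf[dtreg] = grf.get(streg)


def run(take_first_path):
    grf = {}
    ghost_assign(grf, "tregB", "OREG1")    # tregB holds the value of OREG1
    ghost_assign(grf, "tregD", "OREG2")    # tregD holds the value of OREG2
    if take_first_path:
        ghost_move(grf, "tregE", "tregB")  # mv tregE, tregB
    else:
        ghost_move(grf, "tregE", "tregD")  # mv tregE, tregD
    return grf["tregE"]
```

- Which original register ends up associated with tregE depends on the path taken, which is why the move operation needs a read port on the ghost register file while the assignment operation does not.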
- a preferred solution is to provide the ghost register file with one write port and one read port per active functional processor unit, i.e. for each active operation-producing entity within the processor. In other words, one read and one write port per issue slot.
- Fig. 6 illustrates an example of ghost register file pipe step logic for ensuring sequential consistency of the ghost register file operation (depicting one operation stream).
- since the ghost register file 50 needs to support as many operation streams as there are functional units in the machine, this logic is to be duplicated.
- Fig. 6 omits the transport and writing of RWD and TMIA elements for brevity.
- the store to memory path i.e. the data path to the execution pipe, and the store address path are also omitted.
- a number of pipe registers (P1, P2, P3, ...) are introduced. These registers contain the RWD flag, the TMIA value (both omitted in Fig. 6), the destination target register number (dtreg), the original register number (oreg, if the operation is an assignment) or the source target register (streg, if the operation is a move) and the operation indicator (assignment or move).
- a forwarding and write control unit 60 preferably monitors the pipe register data and if a move operation is detected it will forward the source target register number (streg) to the read address port in order to have the data (OR value) from register file for writing when the move operation is in the last pipe stage.
- the storage structure (registers) needed is named as elements in a port interface structure, P0, the last pipe step before the ghost register file. Please note that the code is not complete, e.g. omitting the ignore flag handling.
- the store operation processing is omitted as well. Even though the store operation is a GRF read, it does not write to the GRF but to the memory. All operations preceding the store uphold the consistency via move source address forwarding.
- the pseudo code is just given as an exemplary sketch to provide a better understanding of the logic.
- P0.wdata // write data, either the OR value from assignment operations or the OR value from
- the incoming data to the last pipe step in the forwarding and write control unit is data of the previous pipe steps P1, P2, P3.
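- Why forwarding is needed for sequential consistency can be shown with a simplified behavioural model in Python. This is not the pseudo code referred to above: it assumes a single operation stream, a pipe of `depth` steps in front of the ghost register file, and a move whose source entry is read when the move enters the pipe. The function names and the `forward` switch are assumptions of this sketch.

```python
from collections import deque


def run_sequential(ops):
    """Reference model: every operation completes before the next one starts."""
    grf = {}
    for kind, dtreg, src in ops:
        grf[dtreg] = src if kind == "assign" else grf.get(src)
    return grf


def run_pipelined(ops, depth=3, forward=True):
    """Operations trail through `depth` pipe steps before writing the ghost file."""
    grf = {}
    pipe = deque()  # in-flight (dtreg, OR value) pairs, oldest first
    for kind, dtreg, src in ops:
        if kind == "assign":
            value = src                # OR value comes from the instruction tag
        else:
            value = grf.get(src)       # early read of the move's source entry
            if forward:
                for d, v in pipe:      # older writes that are still in flight
                    if d == src:
                        value = v      # the youngest matching write wins
        pipe.append((dtreg, value))
        if len(pipe) > depth:
            d, v = pipe.popleft()      # last pipe step: write to the ghost file
            grf[d] = v
    while pipe:                        # drain the remaining pipe steps
        d, v = pipe.popleft()
        grf[d] = v
    return grf


ops = [
    ("assign", "tregB", "OREG1"),
    ("move", "tregA", "tregB"),
    ("assign", "tregB", "OREG2"),
    ("move", "tregC", "tregA"),
]
```

- With forwarding enabled the pipelined model ends in the same state as the sequential reference; with `forward=False` the moves read stale entries and the original register association is lost.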
- the ghost register file 50 is off the critical path, as the completion of operations towards the ghost register file can trail the operation-producing activities by any appropriate number of cycles. This facilitates the placement of the ghost register file and its small set of support logic anywhere on the chip. If place and route timing problems exist, a number of pipe steps can simply be added in the operation transport path.
- ghost register operation stream processing
- Fig. 7 illustrates an example of operation stream processing logic for selectively transforming ghost register file operations.
- the analysis of preceding operations could be moved backwards from the ghost register file 50 to ensure enough analysis time.
- the operation stream processing logic 70 filters the irrelevant instruction codes from the functional units (FUs), turning them into ghost register file nops. It may furthermore squash the read addresses of forwarded move operations and replace a move operation with an assignment operation if the move was preceded by a source-modifying operation. The new assignment operation gets its destination register value from the preceding source-modifying operation.
- the operation stream processing will also transform some move operations to nops and assignment operations (see description of ghost register operation types above). It may also transform some loads and stores (spills and reloads) to move operations (compare with the description of register allocation below).
- the processing 70 of the operation stream could be placed anywhere and the same goes for the ghost register file 50. There is no need for the ghost register file 50 and the operation processing 70 to be adjacent.
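One possible reading of the filtering and move-replacement behaviour described above is sketched below. Everything here is an assumption made for illustration: the `FuOp` record, the `"move"` instruction code, the tuple encoding of ghost operations and the `window` parameter (standing in for the number of operations still in flight) are hypothetical, and only assignments are treated as source-modifying operations.

```python
from typing import List, NamedTuple, Optional, Tuple


class FuOp(NamedTuple):
    """Instruction code as seen from a functional unit (hypothetical layout)."""
    code: str
    dtreg: Optional[int] = None   # destination target register, if any
    oreg: Optional[int] = None    # original register tag set by the compiler, if any
    streg: Optional[int] = None   # source target register (register moves only)


def process_stream(fu_ops: List[FuOp], window: int = 3) -> List[Tuple]:
    """Turn FU instruction codes into ghost register file operations."""
    out: List[Tuple] = []
    for op in fu_ops:
        if op.code == "move" and op.dtreg is not None and op.streg is not None:
            # Latest in-flight ghost operation that writes the move's source register.
            latest = next(
                (g for g in reversed(out[-window:])
                 if g[0] != "nop" and g[1] == op.streg),
                None,
            )
            if latest is not None and latest[0] == "assign":
                # The forwarded read would be stale: replace the move with an
                # assignment that takes the OR value of the preceding operation.
                out.append(("assign", op.dtreg, latest[2]))
            else:
                out.append(("move", op.dtreg, op.streg))
        elif op.dtreg is not None and op.oreg is not None:
            out.append(("assign", op.dtreg, op.oreg))
        else:
            # Irrelevant for the ghost register file.
            out.append(("nop",))
    return out


ghost_ops = process_stream([
    FuOp("add", dtreg=1, oreg=7),     # tagged: becomes an assignment
    FuOp("cmp"),                      # untagged: becomes a nop
    FuOp("move", dtreg=2, streg=1),   # source modified just before: becomes an assignment
    FuOp("move", dtreg=4, streg=3),   # source untouched: stays a move
])
```

A move whose source was last written by another in-flight move is kept as a move in this sketch; handling that case properly is outside its scope.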
- the utility reads the ghost register data from memory. Each snapshot taken will normally contain the following data:
- the utility will now use the snapshot target machine instruction address to index itself into the target machine code.
- it will find a number of target machine (e.g. VLIW) primitives, each of them annotated with the instruction address of the original code.
- These original code instruction addresses enable the utility to map itself back into the original code.
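The lookup performed by the utility can be pictured as follows. This is a minimal sketch, assuming the target machine code is available as a mapping from target machine instruction address (TMIA) to the primitives of that instruction word; the `Primitive` type, the addresses and the mnemonics are invented for the example.

```python
from typing import Dict, List, NamedTuple


class Primitive(NamedTuple):
    """One target machine primitive inside an instruction word (hypothetical)."""
    mnemonic: str
    original_address: int   # instruction address in the original code


# Hypothetical target code: each TMIA holds one or more primitives.
target_code: Dict[int, List[Primitive]] = {
    0x2000: [Primitive("ld", 0x100), Primitive("add", 0x104)],
    0x2004: [Primitive("st", 0x108)],
}


def original_addresses(snapshot_tmia: int, code: Dict[int, List[Primitive]]) -> List[int]:
    """Map a snapshot's TMIA back to the original code addresses it covers."""
    return sorted({p.original_address for p in code[snapshot_tmia]})


addrs = original_addresses(0x2000, target_code)
```

With the table above, the snapshot taken at TMIA 0x2000 maps back to the original instructions at 0x100 and 0x104.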
Compiler
- the task of the compiler in this scheme is to tag the destination registers of instructions which assign original register values to target registers. This is first done in the translation phase of the compiler, where each source statement is translated to one or several target statements. If the source statement assigns an original register, then at least one of the target statements will assign the value to a virtual or symbolic register which at this point represents the original register.
- the translation phase is the phase in the compiler where a source instruction is translated to a sequence of one or more target instructions. This is normally done by translating from an intermediate form representing the source program or object to an intermediate form representing the target program or object. Normally the translation is done towards a symbolic or virtual register set which is unlimited in size. In other words, the translation phase does not bother to assign target registers to the target instructions. Henceforth, when we speak of target registers we will mean virtual target registers until the register allocation phase. It should be understood, though, that the expression “translation” also encompasses the overall code translation or conversion of the source code into target code, including optimizations and register allocations.
- the task of the compiler in the translation stage is to tag the target destination register, for each translated instruction, with the number or name representing the original destination register.
- This tag is normally kept as an attribute in the intermediate form representing the target program. More specifically it is kept in the data structure representing the translated instruction.
- within the sequence of target instructions that represents a source instruction, the compiler chooses to tag the target instruction which loads the original register value into the target destination register.
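A tag kept as an attribute of the translated instruction could look like the sketch below. The `TargetInstr` structure, the `translate_add` helper and the two-instruction expansion (an add followed by a hypothetical `mask32` primitive) are assumptions for illustration only; the point is that only the instruction producing the final original register value carries the tag.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TargetInstr:
    """Intermediate-form record for one translated instruction (hypothetical)."""
    opcode: str
    dest: Optional[str]                  # virtual target register
    srcs: List[str]
    orig_reg_tag: Optional[str] = None   # original destination register, if tagged


_counter = itertools.count()


def new_vreg() -> str:
    """Virtual registers are unlimited during the translation phase."""
    return f"v{next(_counter)}"


def translate_add(orig_dest: str, a: str, b: str, vreg_of: Dict[str, str]) -> List[TargetInstr]:
    """Translate 'orig_dest = a + b' into two target instructions, tagging the last."""
    tmp, dest = new_vreg(), new_vreg()
    seq = [
        TargetInstr("add", tmp, [vreg_of[a], vreg_of[b]]),              # intermediate, untagged
        TargetInstr("mask32", dest, [tmp], orig_reg_tag=orig_dest),     # final value, tagged
    ]
    vreg_of[orig_dest] = dest   # dest now represents the original register
    return seq


vregs = {"R1": "v100", "R2": "v101"}
seq = translate_add("R3", "R1", "R2", vregs)
```

Later phases would then only need to carry `orig_reg_tag` along with the instruction that defines the value, whatever happens to its register names.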
- the optimization phases are the phases where the target program, normally represented by an intermediate form, is optionally analysed, transformed and tailored to fit a particular target system.
- the register allocator and the instruction scheduler, explained later on, can also be considered optimization phases.
- the task of the compiler in the optimization phases is to keep the original register tag (which the translation phase sets) on the target destination register of the instruction that loads the original register value throughout the transformations.
- Different cases can occur:
Register allocation
- register allocation refers to the case where instances of the unlimited set of virtual or symbolic registers in the code are assigned physical register names from the target architecture. Since there is a limited number of physical registers in a machine, the register allocator sometimes inserts and removes instructions which load, store and move register values in and out of memory and between registers. The storing and loading of register values to memory due to limited availability of free physical registers at a program point is called "spilling of registers". One instance of a virtual register is thus not necessarily mapped to one instance of a physical register.
- the task of the register allocator is, just as for the other phases, to keep the original register tag on the target destination register throughout the register allocation phase, regardless of the different intermediate representations of the target register in the different stages of the register allocation.
- For tracking mode "source register value only", the register allocator only has to concern itself with maintaining the tag in the assignments. This leads to the ghost registers of the ghost register file being assigned values at runtime.
- the spill code generator of the register allocator has to tag the inserted load instruction's destination register with the original register name or number if that can be statically deduced. If not, the spill code has to include a ghost register move from the spilled register to the spill area of the ghost register file and the reload code must include a move from the ghost register file spill area into the area of the ghost register file which corresponds to the target register set.
- the dynamic case may be encoded in the following way:
- a spill store is marked by the compiler so that it reaches the ghost register file.
- the processing logic of the ghost register file transforms the spill store to a move which transfers the ghost register data to the spill area.
- the reload load instruction is marked by the compiler so that the processing logic of the ghost register file transforms it to a move from the spill area into the area of the ghost register file which corresponds to the target register set (the normal load that reaches the ghost register file is transformed to an assignment).
- the reload load's OR-value is assigned the spilled target register number by the compiler; this number is then used as the source register operand in the ghost register move operation. This is expressed in Table III below and also illustrated in Fig. 8.
- Fig. 8 illustrates examples of spill and reload operations in the ghost register file 50 for a "dynamic" case.
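The dynamic spill and reload handling described above can be modelled roughly as follows. This sketch rests on an assumption that is not confirmed here: the spill area is taken to have one slot per target register, indexed by the spilled target register number. The class and method names are hypothetical.

```python
from typing import List, Optional


class GhostRegisterFileWithSpill:
    """Ghost register file with a target register area and a spill area."""

    def __init__(self, n_target_regs: int) -> None:
        self.regs: List[Optional[int]] = [None] * n_target_regs
        # Assumption: one spill slot per target register number.
        self.spill: List[Optional[int]] = [None] * n_target_regs

    def assign(self, dtreg: int, oreg: int) -> None:
        self.regs[dtreg] = oreg

    def spill_store(self, streg: int) -> None:
        # A marked spill store is transformed to a move into the spill area.
        self.spill[streg] = self.regs[streg]

    def reload(self, dtreg: int, or_field: int) -> None:
        # A marked reload carries the spilled target register number in its
        # OR field; it is used as the source operand of the ghost move.
        self.regs[dtreg] = self.spill[or_field]


grf = GhostRegisterFileWithSpill(8)
grf.assign(2, oreg=9)        # target register 2 holds original register 9
grf.spill_store(2)           # register 2 is spilled
grf.assign(2, oreg=4)        # register 2 is reused for original register 4
grf.reload(5, or_field=2)    # the spilled value is reloaded into register 5
```

After the reload, target register 5 is recorded as holding original register 9 while register 2 keeps its new contents, which is the situation a static tag on the reload could not always express.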
- the tracking mode is a processor state
- a compiler that supports both tracking modes must tag the reload code if it is possible or else introduce moves to and from the ghost register file's spill area.
- the instruction scheduler is the phase where instructions are placed in the code stream so that the micro-architecture of the machine is utilised efficiently, the latency and hardware resource constraints are not violated by the code, and the semantics of the program are preserved.
- this phase also covers the parallelization of instructions into VLIW words.
- the linker is the phase where relocatable addresses are resolved into physical addresses.
- the registers are normally not touched here, but if there are link time optimizations, the same rules apply for them as for optimizations in the compiler.
- the operations (assign, nop, move) of the ghost register could instead be directed against an ordinary transaction memory.
- the backward analysis is then left to an analysis stage (implemented in SW) after the off-loading of the trace data or to a debugger.
- the compaction of the data is lost here.
- the ghost register file's support logic is then traded for a larger set of data (operation transactions) and the off-line multipass reconstruction of a ghost register image.
- the trace must be sufficiently large to accommodate actions defining all target registers, definitions which will always be directly accessible when a ghost register file is present.
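When the operations are only recorded as a trace, the backward analysis left to software might proceed as in the sketch below. The tuple encoding of trace entries and the function name `resolve_backward` are assumptions; the sketch walks the trace from the newest entry to the oldest and follows moves back to the defining assignment.

```python
from typing import List, Optional, Tuple


def resolve_backward(trace: List[Tuple], treg: int) -> Optional[int]:
    """Find which original register the target register `treg` holds at the end
    of the trace.

    Trace entries (oldest first) are ("assign", dtreg, oreg),
    ("move", dtreg, streg) or ("nop",).
    """
    wanted = treg
    for op in reversed(trace):
        if op[0] == "nop" or op[1] != wanted:
            continue
        if op[0] == "assign":
            return op[2]      # defining assignment found
        wanted = op[2]        # move: keep searching for the source's definition
    return None               # the trace is too short to contain the definition


trace = [
    ("assign", 1, 7),
    ("move", 2, 1),
    ("assign", 1, 3),
    ("nop",),
    ("move", 4, 2),
]
value_in_r4 = resolve_backward(trace, 4)
value_in_r1 = resolve_backward(trace, 1)
value_in_r5 = resolve_backward(trace, 5)
```

Register 4 resolves to original register 7 through two moves, register 1 to original register 3, and register 5 cannot be resolved because no defining action lies within the trace, which is why the trace has to be large enough.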
- when tracking values of variables, the encoding of the ghost register word differs from the encoding described above. Instead of holding the original register number, the OR-field now represents a variable in a certain context.
- the encoding is done by the translation system (compiler/linker) so that each live variable in a program address range has a unique encoding number.
- This variable encoding field is henceforth called Variable Encoding (VE).
- the encoding must be presented to the debugger/trace system as output from the translation system.
- the TMIA-field is the key to the resolution of the variable encoding, since the TMIA is always included in an address range which always has a unique variable identity for the VE associated with the TMIA.
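The resolution of a variable encoding through the TMIA can be sketched as a range lookup. The table layout, the half-open address ranges and the variable names below are assumptions for illustration; the translation system would supply the real table to the debugger or trace system.

```python
from typing import List, NamedTuple, Optional


class VeEntry(NamedTuple):
    """One row of the encoding table emitted by the translation system (hypothetical)."""
    start: int       # first TMIA of the range (inclusive)
    end: int         # end of the range (exclusive)
    ve: int          # variable encoding number, unique within the range
    variable: str    # variable identity


def resolve_variable(table: List[VeEntry], tmia: int, ve: int) -> Optional[str]:
    """Resolve a (TMIA, VE) pair from a ghost register word to a variable."""
    for entry in table:
        if entry.start <= tmia < entry.end and entry.ve == ve:
            return entry.variable
    return None


# The same VE number is reused for different variables in different ranges.
ve_table = [
    VeEntry(0x1000, 0x1040, 1, "count"),
    VeEntry(0x1000, 0x1040, 2, "limit"),
    VeEntry(0x1040, 0x1080, 1, "total"),
]
first = resolve_variable(ve_table, 0x1010, 1)
second = resolve_variable(ve_table, 0x1050, 1)
```

Here VE 1 resolves to `count` inside the first range and to `total` inside the second, which is why the TMIA has to accompany the VE.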
- the compiler tags each target instruction which loads a register from a variable with a variable encoding number. As in the register coherency case, this leads to assignments in the ghost register file or an equivalent module. When variable values are moved between target registers, this leads, as in the register coherency case, to move operations in the ghost register file.
- the spill area of the ghost register file is not needed in the variable value tracking case, since a register holding a variable value is always spilled to the variable's memory location. The compiler then, of course, has to tag the instruction which reloads the variable into the target register file.
- This approach shows the residency and current values of variables that are present in the registers.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2005800501216A CN101198930B (en) | 2005-04-13 | 2005-04-13 | System and method for supporting data value coherence in computer system |
JP2008506398A JP2008536236A (en) | 2005-04-13 | 2005-04-13 | Data value consistency in a computer system (coherence) |
PCT/SE2005/000534 WO2006110069A1 (en) | 2005-04-13 | 2005-04-13 | Data value coherence in computer systems |
CA002604573A CA2604573A1 (en) | 2005-04-13 | 2005-04-13 | Data value coherence in computer systems |
US11/911,265 US8095915B2 (en) | 2005-04-13 | 2005-04-13 | Data value coherence in computer systems |
EP05736464A EP1869551A1 (en) | 2005-04-13 | 2005-04-13 | Data value coherence in computer systems |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SE2005/000534 WO2006110069A1 (en) | 2005-04-13 | 2005-04-13 | Data value coherence in computer systems |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006110069A1 (en) | 2006-10-19 |
Family
ID=37087276
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/SE2005/000534 WO2006110069A1 (en) | 2005-04-13 | 2005-04-13 | Data value coherence in computer systems |
Country Status (6)
Country | Link |
---|---|
US (1) | US8095915B2 (en) |
EP (1) | EP1869551A1 (en) |
JP (1) | JP2008536236A (en) |
CN (1) | CN101198930B (en) |
CA (1) | CA2604573A1 (en) |
WO (1) | WO2006110069A1 (en) |
2005
- 2005-04-13 JP JP2008506398A (JP2008536236A): active, Pending
- 2005-04-13 CN CN2005800501216A (CN101198930B): not active, Expired - Fee Related
- 2005-04-13 CA CA002604573A (CA2604573A1): not active, Abandoned
- 2005-04-13 EP EP05736464A (EP1869551A1): not active, Withdrawn
- 2005-04-13 WO PCT/SE2005/000534 (WO2006110069A1): active, Application Filing
- 2005-04-13 US US11/911,265 (US8095915B2): not active, Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
CN101198930A (en) | 2008-06-11 |
CA2604573A1 (en) | 2006-10-19 |
JP2008536236A (en) | 2008-09-04 |
CN101198930B (en) | 2011-07-27 |
US20080178157A1 (en) | 2008-07-24 |
EP1869551A1 (en) | 2007-12-26 |
US8095915B2 (en) | 2012-01-10 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 200580050121.6; Country of ref document: CN |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 2005736464; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 11911265; Country of ref document: US |
| | ENP | Entry into the national phase | Ref document number: 2008506398; Country of ref document: JP; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 2604573; Country of ref document: CA |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | NENP | Non-entry into the national phase | Ref country code: RU |
| | WWP | Wipo information: published in national office | Ref document number: 2005736464; Country of ref document: EP |