US8478940B2 - Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses


Info

Publication number
US8478940B2
Authority
US
United States
Prior art keywords
instruction
address
ifu
model
sequence
Prior art date
Legal status
Expired - Fee Related, expires
Application number
US12/476,477
Other versions
US20100306476A1 (en)
Inventor
Akash V. Giri
Darin M. Greene
Alan G. Singletary
Current Assignee
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US12/476,477 (US8478940B2)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignors: GIRI, AKASH V., GREENE, DARIN M., SINGLETARY, ALAN G.
Publication of US20100306476A1
Priority to US13/399,816 (US8533394B2)
Application granted
Publication of US8478940B2
Legal status: Expired - Fee Related
Adjusted expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/22Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • G06F11/2226Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test ALU



Abstract

Instruction fetch unit (IFU) verification is improved by dynamically monitoring the current state of the IFU model and detecting any predetermined states of interest. The instruction address sequence is automatically modified to force a selected address to be fetched next by the IFU model. The instruction address sequence may be modified by inserting one or more new instruction addresses, or by jumping to a non-sequential address in the instruction address sequence. In exemplary implementations, the selected address is a corresponding address for an existing instruction already loaded in the IFU cache, or differs only in a specific field from such an address. The instruction address control is preferably accomplished without violating any rules of the processor architecture by sending a flush signal to the IFU model and overwriting an address register corresponding to a next address to be fetched.

Description

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with United States Government support under Agreement No. HR0011-07-9-0002 awarded by DARPA. THE GOVERNMENT HAS CERTAIN RIGHTS IN THIS INVENTION.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to computer systems, and more specifically to a method of simulating microprocessor operation for verification purposes, particularly operation of the instruction fetch unit within a microprocessor.
2. Description of the Related Art
Microprocessors are used for a wide variety of electronics applications. High-performance computer systems typically use multiple microprocessors to carry out the various program instructions embodied in computer programs such as software applications and operating systems. A conventional microprocessor design is illustrated in FIG. 1. Processor 10 is generally a single integrated circuit superscalar microprocessor, and includes various execution units, registers, buffers, memories, and other functional units which are all formed by integrated circuitry. Processor 10 operates according to reduced instruction set computing (RISC) techniques, and is coupled to a system or fabric bus 12 via a bus interface unit (BIU) 14 within processor 10. BIU 14 controls the transfer of information between processor 10 and other devices coupled to system bus 12, such as a main memory or a second-level (L2) cache memory, by participating in bus arbitration. Processor 10, system bus 12, and the other devices coupled to system bus 12 together form a host data processing system.
BIU 14 is connected to an instruction cache 16 and to a data cache 18 within processor 10. High-speed caches, such as those within instruction cache 16 and data cache 18, enable processor 10 to achieve relatively fast access time to a subset of data or instructions previously transferred from main memory to the caches, thus improving the speed of operation of the host data processing system. Instruction cache 16 is further coupled to a fetcher 20 which fetches instructions for execution from instruction cache 16 during each cycle. Fetcher 20 temporarily stores sequential instructions within an instruction queue 21 for execution by other execution circuitry within processor 10. From the instruction queue 21, instructions pass sequentially through the decode unit 22 where they are translated into simpler operational codes (iops) and numerous control signals used by the downstream units. After being decoded, instructions are processed by the dispatch unit 23, which gathers them into groups suitable for simultaneous processing and dispatches them to the issue unit 42. Instruction cache 16, fetcher 20, instruction queue 21, decode unit 22 and dispatch unit 23 are collectively referred to as an instruction fetch unit 24.
The execution circuitry of processor 10 has multiple execution units for executing sequential instructions, including one or more fixed-point units (FXUs) 26, load-store units (LSUs) 28, floating-point units (FPUs) 30, and branch processing units (BPUs) 32. These execution units 26, 28, 30, and 32 execute one or more instructions of a particular type of sequential instructions during each processor cycle. For example, FXU 26 performs fixed-point mathematical and logical operations such as addition, subtraction, shifts, rotates, and XORing, utilizing source operands received from specified general purpose registers (GPRs) or GPR rename buffers. Following the execution of a fixed-point instruction, FXUs 26 output the data results of the instruction to the GPR rename buffers, which provide temporary storage for the operand data until the instruction is completed by transferring the result data from the GPR rename buffers to one or more of the GPRs. FPUs 30 perform single and double-precision floating-point arithmetic and logical operations, such as floating-point multiplication and division, on source operands received from floating-point registers (FPRs) or FPR rename buffers. FPU 30 outputs data resulting from the execution of floating-point instructions to selected FPR rename buffers, which temporarily store the result data until the instructions are completed by transferring the result data from the FPR rename buffers to selected FPRs. LSUs 28 execute floating-point and fixed-point instructions which either load data from memory (i.e., either the data cache within data cache 18 or main memory) into selected GPRs or FPRs, or which store data from a selected one of the GPRs, GPR rename buffers, FPRs, or FPR rename buffers to system memory. BPUs 32 perform condition code manipulation instructions and branch instructions.
Processor 10 employs both pipelining and out-of-order execution of instructions to further improve the performance of its superscalar architecture, but may alternatively use in-order program execution. For out-of-order processing, instructions can be executed by FXUs 26, LSUs 28, FPUs 30, and BPUs 32 in any order as long as data dependencies are observed. In addition, instructions are processed by each of the FXUs 26, LSUs 28, FPUs 30, and BPUs 32 at a sequence of pipeline stages, in particular, five distinct pipeline stages: fetch, decode/dispatch, execute, finish, and completion.
During the fetch stage, fetcher 20 retrieves one or more instructions associated with one or more memory addresses from instruction cache 16. Sequential instructions fetched from instruction cache 16 are stored by fetcher 20 within instruction queue 21. The instructions are processed by the decode unit 22 and formed into groups by the dispatch unit 23. Issue unit 42 then issues one or more instructions to execution units 26, 28, 30, and 32. Upon dispatch, instructions are also stored within the multiple-slot completion buffer of a completion unit 44 to await completion. Processor 10 tracks the program order of the dispatched instructions during out-of-order execution utilizing unique instruction identifiers.
It can be seen from the foregoing description that the flow of instructions through a state-of-the-art microprocessor is particularly complicated, and timing is critical. It is accordingly incumbent upon the designer to be able to verify proper operation of a new microprocessor design, especially the instruction fetch unit (IFU) 24. Functional verification of IFUs is conventionally accomplished by running computer simulations in which program instructions are fetched from other devices outside of the simulated processor, or from the internal caches within the IFU model, and delivered to the other portions of the simulated processor for execution. The instructions fetched may be part of a special software program written for testing purposes, or may be generated by the verification environment; see, e.g., U.S. Pat. No. 6,212,493.
With specific regard to functional verification of the IFU, there is a different focus compared to the other components of the processor. For significant portions of the IFU, the actual instructions being processed are irrelevant or at most secondary. They are merely pieces of binary data which need to be delivered to the rest of the CPU as requested. Much more important than the instructions themselves are the addresses by which they are retrieved and processed. The instruction addresses control which instructions are fetched, where they are stored in any resident caches, and whether duplications or conflicts exist between different execution threads or storage locations.
Unfortunately, the prior art lacks an effective method of precisely controlling the addresses to be handled by the IFU at any given point in the simulation. Randomly generated instruction address sequences do not allow for the creation of specific simulation scenarios which may be of interest to the designer. The '493 patent provides some improvement by collecting profile data such as addresses and program counter contents, but this approach still requires multiple passes of the simulation. It would, therefore, be desirable to devise an improved method for simulation of an instruction fetch unit which could allow dynamic control of the instruction addresses as the simulation progresses. It would be further advantageous if the method could force a specially selected instruction address to be fetched during the next IFU cycle.
SUMMARY OF THE INVENTION
It is therefore one object of the present invention to provide an improved method of verifying proper operation of an instruction fetch unit design.
It is another object of the present invention to provide such a method which more effectively tests the handling of instruction addresses by the IFU design under certain operational conditions of interest.
It is yet another object of the present invention to provide an IFU verification environment which can dynamically force fetches of specified addresses as a simulation progresses.
The foregoing objects are achieved in a method of testing a design for an IFU by supplying a sequence of instruction addresses to an IFU model which represents the IFU design, fetching one or more of the program instructions according to the instruction address sequence from a memory hierarchy external to the IFU model, detecting that the current state of the IFU model is a predetermined state of interest, and automatically modifying the instruction address sequence to force a selected address to be fetched next by the IFU model. The instruction address sequence may be modified by inserting one or more new instruction addresses, or by jumping to a non-sequential address in the instruction address sequence. In exemplary implementations, the selected address is a corresponding address for an existing instruction already loaded in the IFU cache, an instruction already requested from the external memory hierarchy and in the process of being delivered to the IFU model, or differs only in a specific field from such an address. The instruction address control is preferably accomplished without violating any rules of the processor architecture by sending a flush signal to the IFU model and overwriting an address register corresponding to a next address to be fetched.
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.
FIG. 1 is a block diagram illustrating a conventional construction for a microprocessor which includes an instruction fetch unit;
FIG. 2 is a block diagram of a computer system programmed to carry out verification of an instruction fetch unit design in accordance with one implementation of the present invention;
FIG. 3 is a block diagram of a simulation program having various software modules for dynamically testing an instruction fetch unit design in accordance with one implementation of the present invention; and
FIG. 4 is a chart illustrating the logical flow for controlling the simulation of an instruction fetch unit design by manipulating instruction addresses in accordance with one implementation of the present invention.
The use of the same reference symbols in different drawings indicates similar or identical items.
DESCRIPTION OF THE PREFERRED EMBODIMENT(S)
With reference now to the figures, and in particular with reference to FIG. 2, there is depicted one embodiment 50 of a computer system in which the present invention may be implemented to carry out verification of an instruction fetch unit design. Computer system 50 is a symmetric multiprocessor (SMP) system having a plurality of processors 52 a, 52 b connected to a system bus 54. System bus 54 is further connected to a combined memory controller/host bridge (MC/HB) 56 which provides an interface to system memory 58. System memory 58 may be a local memory device or alternatively may include a plurality of distributed memory devices, preferably dynamic random-access memory (DRAM). There may be additional structures in the memory hierarchy which are not depicted, such as on-board (L1) and second-level (L2) or third-level (L3) caches.
MC/HB 56 also has an interface to peripheral component interconnect (PCI) Express links 60 a, 60 b, 60 c. Each PCI Express (PCIe) link 60 a, 60 b is connected to a respective PCIe adaptor 62 a, 62 b, and each PCIe adaptor 62 a, 62 b is connected to a respective input/output (I/O) device 64 a, 64 b. MC/HB 56 may additionally have an interface to an I/O bus 66 which is connected to a switch (I/O fabric) 68. Switch 68 provides a fan-out for the I/O bus to a plurality of PCI links 60 d, 60 e, 60 f. These PCI links are connected to more PCIe adaptors 62 c, 62 d, 62 e which in turn support more I/O devices 64 c, 64 d, 64 e. The I/O devices may include, without limitation, a keyboard, a graphical pointing device (mouse), a microphone, a display device, speakers, a permanent storage device (hard disk drive) or an array of such storage devices, an optical disk drive, and a network card. Each PCIe adaptor provides an interface between the PCI link and the respective I/O device. MC/HB 56 provides a low latency path through which processors 52 a, 52 b may access PCI devices mapped anywhere within bus memory or I/O address spaces. MC/HB 56 further provides a high bandwidth path to allow the PCI devices to access memory 58. Switch 68 may provide peer-to-peer communications between different endpoints and this data traffic does not need to be forwarded to MC/HB 56 if it does not involve cache-coherent memory transfers. Switch 68 is shown as a separate logical component but it could be integrated into MC/HB 56.
In this embodiment, PCI link 60 c connects MC/HB 56 to a service processor interface 70 to allow communications between I/O device 64 a and a service processor 72. Service processor 72 is connected to processors 52 a, 52 b via a JTAG interface 74, and uses an attention line 76 which interrupts the operation of processors 52 a, 52 b. Service processor 72 may have its own local memory 78, and is connected to read-only memory (ROM) 80 which stores various program instructions for system startup. Service processor 72 may also have access to a hardware operator panel 82 to provide system status and diagnostic information.
In alternative embodiments computer system 50 may include modifications of these hardware components or their interconnections, or additional components, so the depicted example should not be construed as implying any architectural limitations with respect to the present invention.
When computer system 50 is initially powered up, service processor 72 uses JTAG interface 74 to interrogate the system (host) processors 52 a, 52 b and MC/HB 56. After completing the interrogation, service processor 72 acquires an inventory and topology for computer system 50. Service processor 72 then executes various tests such as built-in-self-tests (BISTs), basic assurance tests (BATs), and memory tests on the components of computer system 50. Any error information for failures detected during the testing is reported by service processor 72 to operator panel 82. If a valid configuration of system resources is still possible after taking out any components found to be faulty during the testing then computer system 50 is allowed to proceed. Executable code is loaded into memory 58 and service processor 72 releases host processors 52 a, 52 b for execution of the program code, e.g., an operating system (OS) which is used to launch applications and in particular the IFU verification application of the present invention, results of which may be stored in a hard disk drive of the system (an I/O device 64). While host processors 52 a, 52 b are executing program code, service processor 72 may enter a mode of monitoring and reporting any operating parameters or errors, such as the cooling fan speed and operation, thermal sensors, power supply regulators, and recoverable and non-recoverable errors reported by any of processors 52 a, 52 b, memory 58, and MC/HB 56. Service processor 72 may take further action based on the type of errors or defined thresholds.
While the illustrative implementation provides program instructions embodying the present invention on a hard disk drive of the system (an I/O device 64), those skilled in the art will appreciate that the invention can be embodied in a program product utilizing other computer-readable media. The program instructions may be written in the C++ programming language for an AIX environment. Computer system 50 carries out program instructions for a verification process that uses dynamic controls to manage instruction addresses fetched by an IFU design. Accordingly, a program embodying the invention may include conventional aspects of various simulation tools, and these details will become apparent to those skilled in the art upon reference to this disclosure.
One example of a software program embodying the present invention is illustrated in FIG. 3. Simulation program 90 is comprised of an IFU model 92 and a verification environment 94 having various software modules which mimic the behavior of devices and components interacting with the IFU. IFU model 92 represents the device to be tested, i.e., a proposed instruction fetch unit design for a microprocessor, and in this example includes an instruction cache 96, an address translation table 98, pipelined address registers 100, and control logic 102. Each of these modules within IFU model 92 is programmed to perform certain functions on a cycle-by-cycle basis according to the specifications of the IFU design under test, e.g., reading or writing instructions or instruction addresses, table lookups, accessing status registers, generating coherency or state bits, etc. Details of these functions go beyond the scope of the present invention and are dictated by the desired test parameters, but will become apparent to those skilled in the art upon reference to this disclosure.
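As a concrete illustration of this structure, the sketch below shows one way the four modules of IFU model 92 could be represented in C++, the language mentioned below for the verification code. All type names, field names, and sizes are illustrative assumptions rather than details taken from the patent.

```cpp
// Minimal structural sketch of IFU model 92; names and sizes are assumptions.
#include <array>
#include <cstdint>
#include <unordered_map>

using Address = uint64_t;
using Instruction = uint32_t;

struct CacheLine {
    bool valid = false;
    Address tag = 0;
    std::array<Instruction, 32> words{};     // assumed: 32 instructions per line
};

struct InstructionCache {                    // instruction cache 96
    std::array<CacheLine, 32> lines{};       // assumed: 32 congruence classes
};

struct AddressTranslationTable {             // address translation table 98
    std::unordered_map<Address, Address> effectiveToReal;
};

struct PipelinedAddressRegisters {           // pipelined address registers 100
    std::array<Address, 4> stage{};          // stage[0] holds the next address to fetch
};

struct ControlLogic {                        // control logic 102
    bool fetchPending = false;
    bool flushRequested = false;
};

struct IfuModel {                            // IFU model 92
    InstructionCache icache;
    AddressTranslationTable xlate;
    PipelinedAddressRegisters addrRegs;
    ControlLogic control;
    // Advanced once per simulated cycle according to the design under test.
    void clockCycle() { /* cycle-by-cycle behavior of the IFU design */ }
};
```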
Verification environment 94 includes an external memory hierarchy 104, instruction address generation 106, issue and execution units 108, and sequencing and control 110. External memory hierarchy 104 simulates the entire memory structure outside of the microprocessor, e.g., any second level (L2) or higher caches and system memory. External memory hierarchy 104 responds to address requests from IFU 92 by transmitting binary instructions associated with the requested addresses. Issue and execution units 108 receive an instruction stream dispatched by IFU model 92. Sequencing and control 110 provides IFU model 92 with addresses, state information and control signals directing when and how it should fetch additional instructions. In this embodiment the instruction addresses are initially provided by instruction address generation 106 which may use any convenient method for generating an address sequence, including random generation or sequences adapted to stress targeted functions of IFU model 92, generated either in advance of the simulation or dynamically as requested by IFU model 92.
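A similarly hedged sketch of the four verification-environment modules follows; the interfaces are assumptions chosen only to make the division of responsibilities concrete.

```cpp
// Structural sketch of verification environment 94; interfaces are assumptions.
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

using Address = uint64_t;
using Instruction = uint32_t;

struct ExternalMemoryHierarchy {             // external memory hierarchy 104
    // Answers a fetch request with the binary instructions for that address.
    std::vector<Instruction> respond(Address request) {
        (void)request;
        return std::vector<Instruction>(8, 0u);   // placeholder sector of 8 instructions
    }
};

struct InstructionAddressGeneration {        // instruction address generation 106
    // Random or targeted sequences, built in advance or on demand.
    std::vector<Address> generateSequence(std::size_t length, Address start = 0x1000) {
        std::vector<Address> seq(length);
        for (std::size_t i = 0; i < length; ++i) seq[i] = start + 4 * i;
        return seq;
    }
};

struct IssueAndExecutionUnits {              // issue and execution units 108
    void consume(const std::vector<Instruction>&) { /* sink for the dispatched stream */ }
};

struct SequencingAndControl {                // sequencing and control 110
    std::deque<Address> pending;             // addresses still to be driven into the IFU
    std::size_t pointer = 0;                 // movable pointer into the original sequence
};
```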
Verification environment 94 is augmented with additional capabilities in order to precisely control the address seen by IFU model 92 for the next instruction required by the execution units. Control of this address allows control of the operations performed by the IFU model to access its internal components, and to request instructions on its external interfaces. By opportunistic manipulation of the fetch addresses, complex and interesting scenarios are created within the operational logic of the IFU model which might never otherwise occur with random or iterative simulation. These capabilities are achieved in this implementation using state monitor 112, address selector 114, and address override 116. State monitor 112 reads the current state of IFU model 92 and detects any interesting conditions as predefined by the programmer. The current state of IFU model 92 may for example be based upon characteristics of the internal components such as the instruction cache 96 and address translation table 98, and characteristics of the interface between IFU model 92 and external memory hierarchy 104 such as a recent history of requests and responses.
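The sketch below shows one hedged way to wire up these three capabilities; the observable state fields, the predicate, and the selection policy are placeholders standing in for whatever conditions the programmer predefines.

```cpp
// Sketch of state monitor 112, address selector 114, and address override 116.
// The IfuState fields and the example policies are assumptions for illustration.
#include <cstdint>
#include <functional>

using Address = uint64_t;

struct IfuState {                            // what the monitor can observe
    bool errorSeen = false;                  // error from the IFU or an external module
    bool criticalSectorArrived = false;      // sector containing the requested instruction
    bool lineInvalidatedEarly = false;       // line invalidated before the critical sector
    Address lastRequested = 0;               // most recent fetch request
};

struct StateMonitor {                        // state monitor 112
    std::function<bool(const IfuState&)> interesting;
    bool check(const IfuState& s) const { return interesting && interesting(s); }
};

struct AddressSelector {                     // address selector 114
    std::function<Address(const IfuState&)> choose;
};

struct AddressOverride {                     // address override 116
    // Applied by flushing the IFU and overwriting its next-address register
    // (see the refetch sketch further below).
    void apply(Address /*selected*/) { /* manipulate IFU model and sequencing */ }
};

// One programmer-defined condition: the critical sector has just arrived, so
// pick a colliding address within the same incoming group (offset is an assumption).
static StateMonitor monitor{[](const IfuState& s) { return s.criticalSectorArrived; }};
static AddressSelector selector{[](const IfuState& s) { return s.lastRequested + 8; }};
```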
As an example, when a group of instructions is being sent from external memory hierarchy 104 back to IFU model 92 in response to a fetch request, said group will typically pass sequentially through a pipeline containing a number of different stages. At each stage, operations may be performed on the group of instructions, or actions may be driven within the IFU model, based on the contents or characteristics of the group of instructions. For the purpose of complete verification of the proper performance of the fetch logic, it is desirable to induce collisions between an incoming group of instructions at each of the different stages in the pipeline and a new outgoing request from the fetch logic. One interesting collision scenario would be a request for an instruction that is within the incoming group, but different in address from the instruction originally requested. Another case would involve a colliding request to an address that differs from that originally requested, but which falls into the same congruence class, meaning that certain fields of the address are common and would cause the instructions to map to the same location in the instruction cache 16. These actions may be taken when state monitor 112 detects various states of interest such as an error (from IFU model 92 or from an external module such as an L2 cache), receipt of a particular sector of a multi-instruction cache line which includes the requested instruction, or certain asynchronous events such as a cache line for a requested instruction being invalidated prior to delivery of the critical sector.
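To make the two collision cases concrete, the following sketch checks whether a new request lands inside the incoming group or merely shares its congruence class; the 128-byte line size and 5-bit class field are assumptions for illustration, not values from the patent.

```cpp
// Minimal sketch of the two collision conditions described above.
#include <cstdint>

using Address = uint64_t;

constexpr Address  kLineBytes = 128;                 // assumed cache line size
constexpr unsigned kClassBits = 5;                   // 2^5 = 32 congruence classes

// Collision type 1: new request lies inside the incoming group (same cache
// line) but at a different address than the one originally requested.
bool hitsIncomingGroup(Address newReq, Address originalReq) {
    return (newReq / kLineBytes) == (originalReq / kLineBytes)
        && newReq != originalReq;
}

// Collision type 2: new request is to a different line that maps to the same
// congruence class, so it targets the same location in the instruction cache.
bool sameCongruenceClass(Address newReq, Address originalReq) {
    auto classOf = [](Address a) { return (a / kLineBytes) & ((1u << kClassBits) - 1); };
    return classOf(newReq) == classOf(originalReq)
        && (newReq / kLineBytes) != (originalReq / kLineBytes);
}
```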
Once a predetermined state of interest has been detected, address selector 114 is automatically invoked to generate a new fetch address required to induce the desired test conditions within IFU model 92. Address selector 114 may for example cause IFU model 92 to fetch a cache line for which a request is already outstanding for another thread, or fetch an address which is similar to one currently resident in the cache, but different in a specific field or fields of the local or translated address. Such refetches are preferably accomplished by overriding the state of the IFU under test without violating any rules of the processor architecture. One implementation of such a refetch is to send to the IFU a flush signal of the type that causes execution to resume at the next instruction address, while overwriting the register containing this address with the selected address of interest. These functions may be carried out by an address override 116 programmed to manipulate the appropriate components of IFU model 92 and sequencing and control 110.
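A minimal sketch of that refetch mechanism follows, using stand-in types; the member names are assumptions, but the sequence (raise the flush, overwrite the next-address register, keep sequencing and control consistent) mirrors the description above.

```cpp
// Sketch of the refetch mechanism: flush, then overwrite the next-fetch register.
#include <cstdint>

using Address = uint64_t;

struct IfuModelStub {                       // stand-in for IFU model 92
    Address nextFetchAddress = 0;           // first of the pipelined address registers
    bool flushSignal = false;
};

struct SequencingStub {                     // stand-in for sequencing and control 110
    Address pendingAddress = 0;
};

// Address override 116: the flush causes the IFU to restart fetching at the
// "next instruction address", which has just been replaced with the selected
// address, so no architectural rule is violated.
void overrideNextFetch(IfuModelStub& ifu, SequencingStub& seq, Address selected) {
    ifu.flushSignal = true;                 // flush of the type that resumes at the next address
    ifu.nextFetchAddress = selected;        // overwrite the next-address register
    seq.pendingAddress = selected;          // keep sequencing and control consistent
}
```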
One such modification would be to override a fetch to the next sequential set of addresses with a different set which targets the same location in the instruction cache 16 as the previous request. It is common to use a subset of each instruction address to determine a congruence class which determines which line in an instruction cache will store the instruction group. For example, any arbitrary 5 bits of a 64-bit address might be used to map an instruction address into one of the 32 lines of an instruction cache. If a goal of the verification process is to operate the cache in a full or overfull condition, then new addresses of the same congruence class might be continually inserted, rather than allowing for sequential fetching to occur.
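The worked example below applies this mapping: 5 bits of a 64-bit address select one of 32 cache lines, and stepping the address by a stride that preserves those bits keeps generating new addresses in the same congruence class. The particular bit positions (bits 7 through 11, above a 128-byte line offset) are an assumption for illustration.

```cpp
// Worked example of the congruence-class mapping; bit positions are assumed.
#include <cstdint>
#include <cstdio>

using Address = uint64_t;

constexpr unsigned kClassShift = 7;         // assumed: skip the within-line byte offset
constexpr unsigned kClassBits  = 5;         // 2^5 = 32 instruction cache lines

unsigned congruenceClass(Address a) {
    return (a >> kClassShift) & ((1u << kClassBits) - 1);
}

int main() {
    // To drive the cache toward a full or overfull condition, keep generating
    // new addresses in the same congruence class instead of fetching
    // sequentially: a stride of 2^(7+5) = 4096 bytes preserves the class bits.
    Address base = 0x0000000000010080ULL;
    for (int i = 0; i < 4; ++i) {
        Address a = base + (Address)i * (1ULL << (kClassShift + kClassBits));
        std::printf("address 0x%016llx -> class %u\n",
                    (unsigned long long)a, congruenceClass(a));
    }
    return 0;
}
```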
While it is interesting to have collisions between new fetches and addresses already loaded in the cache, it is also useful to examine collisions between new fetches and requests that were outstanding or in the process of being returned from the L2 cache and delivered to the IFU model. In an exemplary implementation, a single request from the fetcher to the L2 cache results in a cache line being returned. However, a single cache line holds 32 instructions which come back as 4 sectors of 8 instructions each, asynchronously and in any order, so there is interest in a new request that would hit in a line that is partially returned (maybe some sectors in the cache, maybe none yet). The sectors may go through a pipeline of about 6 cycles from the time that the fetcher logic becomes aware of them until they are safely written into the instruction cache. It is of interest to see collisions at each of these steps.
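One way to track this window in the verification environment is sketched below: a small record notes which of the 4 sectors have been seen and how many cycles each still needs before it is safely written to the cache, so the monitor can tell when a new request would collide with a partially returned line. The 6-cycle depth comes from the text above; the data structure itself is an assumption.

```cpp
// Sketch of tracking a cache line whose 4 sectors of 8 instructions return
// asynchronously and spend ~6 cycles in a pipeline before reaching the cache.
#include <array>
#include <cstdint>

struct InFlightLine {
    uint64_t lineAddress = 0;
    std::array<bool, 4> sectorReturned{};   // sector observed by the fetch logic
    std::array<int, 4>  cyclesInPipe{};     // cycles left until written to the cache

    void sectorArrives(int sector) {
        sectorReturned[sector] = true;
        cyclesInPipe[sector] = 6;           // assumed write-pipeline depth
    }
    void clockCycle() {
        for (auto& c : cyclesInPipe)
            if (c > 0) --c;
    }
    // True while the line is outstanding but not yet fully written to the
    // cache: the window where a new request to this line collides with the
    // in-flight return (some sectors may be in the cache, maybe none yet).
    bool collisionWindowOpen() const {
        for (int i = 0; i < 4; ++i)
            if (!sectorReturned[i] || cyclesInPipe[i] > 0) return true;
        return false;
    }
};
```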
The invention may be further understood with reference to the flow chart of FIG. 4 which illustrates an IFU verification process in accordance with one implementation. The process progresses as a time clock for the system (IFU and verification environment) advances cycle-by-cycle, and begins by generating instruction addresses to be sequentially fetched by the IFU and delivered to the execution units (120). The IFU then fetches the first address in the sequence from the external memory hierarchy (122). The state monitor examines the current state of the IFU after the first address fetch to check for any special states of interest (124). If the current state is not a special state, the process proceeds with the IFU delivering the instruction to the execution units (126). However, if a special state is detected, one or more appropriate refetch addresses are calculated (128), and the new addresses are inserted into the address registers of the IFU with appropriate sequencing control signals (130). As an alternative to inserting new addresses, the process may instead jump (in a non-sequential manner) to an address in the original list of addresses from instruction address generation 106, by moving an address pointer of sequencing and control 110 (either ahead or behind). After this dynamic modification of the address sequence, the process continues with instruction delivery (126). The state of the IFU is then examined for verification purposes, i.e., compared to an anticipated state or recorded for later comparison (132). Verification may examine various aspects of the IFU, including for example timing of write operations, control signals and interface, the instruction itself within the cache, pre-decode information, parity bits, partial directory addresses, coherency bits, state machine registers (for instruction relocation or remapping), thread indicator bits, hypervisor permission levels, etc. If there are more instructions remaining in the test sequence (134) the next instruction in the sequence is fetched (136), which will be a new refetch address if the current state of the IFU is a predetermined state of interest. The process thereafter returns to monitoring the IFU state (124). Once all addresses in the sequence have been fetched, the verification results are stored for later review by the designer (138).
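The flow of FIG. 4 can be summarized as a simulation loop; the sketch below keys each step to its reference numeral in comments. Every function body here is a trivial placeholder, since the real work is performed by the simulation models described above.

```cpp
// Compact sketch of the FIG. 4 verification flow; reference numerals in comments.
// All step functions are placeholders standing in for the simulation models.
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Address = uint64_t;

static std::vector<Address> generateAddresses() {                        // 120
    return {0x1000, 0x1004, 0x1008};
}
static void fetchFromMemoryHierarchy(Address) {}                         // 122 / 136
static std::optional<Address> specialStateDetected() {                   // 124
    return std::nullopt;                                                 // placeholder predicate
}
static std::vector<Address> calculateRefetchAddresses(Address a) {       // 128
    return {a};
}
static void insertIntoIfuAddressRegisters(const std::vector<Address>&) {} // 130
static void deliverToExecutionUnits() {}                                 // 126
static void recordOrCompareIfuState() {}                                 // 132
static void storeVerificationResults() {}                                // 138

void runIfuVerification() {
    std::vector<Address> seq = generateAddresses();                      // 120
    for (std::size_t i = 0; i < seq.size(); ++i) {                       // advances cycle by cycle
        fetchFromMemoryHierarchy(seq[i]);                                // 122 / 136
        if (auto hint = specialStateDetected()) {                        // 124
            auto refetch = calculateRefetchAddresses(*hint);             // 128
            insertIntoIfuAddressRegisters(refetch);                      // 130
            seq.insert(seq.begin() + i + 1, refetch.begin(), refetch.end());
        }
        deliverToExecutionUnits();                                       // 126
        recordOrCompareIfuState();                                       // 132
    }                                                                    // 134: more instructions?
    storeVerificationResults();                                          // 138
}
```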
The present invention accordingly provides a much more effective method for testing operation of an IFU, with the ability to dynamically force fetches of specified addresses as a simulation progresses. This feature of the invention allows the designer to tailor IFU testing for special states of interest which might otherwise never be simulated.
Although the invention has been described with reference to specific embodiments, this description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiments, as well as alternative embodiments of the invention, will become apparent to persons skilled in the art upon reference to the description of the invention. It is therefore contemplated that such modifications can be made without departing from the spirit or scope of the present invention as defined in the appended claims.

Claims (10)

What is claimed is:
1. A computer-implemented method of testing a design for an instruction fetch unit (IFU) of a microprocessor, comprising:
supplying a sequence of instruction addresses to an IFU model which represents the IFU design, wherein the instruction addresses correspond to program instructions provided by a memory hierarchy which is external to the IFU model;
fetching one or more of the program instructions according to the instruction address sequence from the external memory hierarchy to the IFU model;
detecting that a current state of the IFU model is a predetermined state of interest;
automatically modifying the instruction address sequence responsive to said detecting to force a selected address to be fetched next by the IFU model, wherein the selected address is specifically based on the predetermined state of interest; and
fetching a program instruction corresponding to the selected address from the external memory hierarchy to the IFU model.
2. The method of claim 1 wherein the instruction address sequence is modified by inserting at least one new instruction address.
3. The method of claim 1 wherein the instruction address sequence is modified by jumping to a non-sequential address in the instruction address sequence.
4. The method of claim 1 wherein the selected address is a corresponding address for an instruction already requested from the external memory hierarchy and in the process of being delivered to the IFU model.
5. The method of claim 1 wherein the selected address differs only in a specific field from a corresponding address for an existing instruction already loaded in a cache of the IFU model.
6. The method of claim 1 wherein:
the IFU model includes a plurality of address registers which are pipelined such that a first one of the address registers is designated to provide a next address to be fetched; and
said modifying is accomplished without violating any rules of the microprocessor architecture by sending a flush signal to the IFU model and overwriting the first address register with the selected address.
7. The method of claim 1 wherein the predetermined state of interest is an error condition.
8. The method of claim 1 wherein the predetermined state of interest is receipt of a particular sector of a multi-instruction cache line which includes a requested instruction.
9. The method of claim 1 wherein the predetermined state of interest is a cache line for a requested instruction being invalidated prior to delivery of a corresponding sector of the cache line.
10. A computer-implemented method of testing a design for an instruction fetch unit (IFU) of a microprocessor, comprising:
supplying a sequence of instruction addresses to an IFU model which represents the IFU design, wherein the instruction addresses correspond to program instructions provided by a memory hierarchy which is external to the IFU model, and the IFU model includes an instruction cache and a plurality of address registers which are pipelined such that a first one of the address registers is designated to provide a next address to be fetched;
fetching one or more of the program instructions according to the instruction address sequence from the external memory hierarchy to the IFU model;
detecting that a current state of the IFU model includes an outstanding fetch request for a cache line which is in the process of being delivered to the instruction cache;
automatically modifying the instruction address sequence responsive to said detecting without violating any rules of the microprocessor architecture by sending a flush signal to the IFU model and overwriting the first address register with a selected address, wherein the selected address is contained in the cache line; and
fetching a program instruction corresponding to the selected address from the external memory hierarchy to the IFU model.
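As an illustration of the mechanism recited in claims 6 and 10, the following sketch models pipelined fetch-address registers together with the flush-and-overwrite step; the class, the method names, and the addresses are hypothetical and are not drawn from the disclosed design.

class PipelinedAddressRegisters:
    # Hypothetical stand-in for the IFU's pipelined fetch-address registers;
    # stages[0] is designated to provide the next address to be fetched.
    def __init__(self, depth=3):
        self.stages = [None] * depth

    def advance(self, incoming_address):
        # Normal sequential operation: present the head address to the fetch
        # logic and shift the pipeline by one stage.
        next_to_fetch = self.stages[0]
        self.stages = self.stages[1:] + [incoming_address]
        return next_to_fetch

    def flush_and_overwrite(self, selected_address):
        # Emulate a flush signal followed by overwriting the first register,
        # so the selected address becomes the next one presented for fetching.
        self.stages = [None] * len(self.stages)
        self.stages[0] = selected_address

regs = PipelinedAddressRegisters()
regs.advance(0x0100)
regs.advance(0x0104)
regs.flush_and_overwrite(0x1000)       # state of interest detected: force a refetch
assert regs.advance(0x0108) == 0x1000  # 0x1000 is presented as the next fetch address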
US12/476,477 2009-06-02 2009-06-02 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses Expired - Fee Related US8478940B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/476,477 US8478940B2 (en) 2009-06-02 2009-06-02 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses
US13/399,816 US8533394B2 (en) 2009-06-02 2012-02-17 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US12/476,477 US8478940B2 (en) 2009-06-02 2009-06-02 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/399,816 Division US8533394B2 (en) 2009-06-02 2012-02-17 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses

Publications (2)

Publication Number Publication Date
US20100306476A1 US20100306476A1 (en) 2010-12-02
US8478940B2 true US8478940B2 (en) 2013-07-02

Family

ID=43221578

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/476,477 Expired - Fee Related US8478940B2 (en) 2009-06-02 2009-06-02 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses
US13/399,816 Expired - Fee Related US8533394B2 (en) 2009-06-02 2012-02-17 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses

Family Applications After (1)

Application Number Title Priority Date Filing Date
US13/399,816 Expired - Fee Related US8533394B2 (en) 2009-06-02 2012-02-17 Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses

Country Status (1)

Country Link
US (2) US8478940B2 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9047068B2 (en) * 2011-10-31 2015-06-02 Dell Products L.P. Information handling system storage device management information access
US20130191584A1 (en) * 2012-01-23 2013-07-25 Honeywell International Inc. Deterministic high integrity multi-processor system on a chip
JP6698691B2 (en) * 2015-01-09 2020-05-27 オプティコン センサ—ズ ヨーロッパ ベー.フェー. Modular wall system and panel body for use in the system
US9940455B2 (en) 2015-02-25 2018-04-10 International Business Machines Corporation Programming code execution management
EP3372367A4 (en) * 2015-11-06 2019-07-10 Furukawa Electric Co., Ltd. Thermoplastic composite material and formed body
CN107665169B (en) * 2016-07-29 2020-07-28 龙芯中科技术有限公司 Method and device for testing processor program
CN109344085B (en) * 2018-11-14 2021-11-23 上海微小卫星工程中心 Method and system for analyzing satellite test data
US10754775B2 (en) * 2018-11-30 2020-08-25 Nvidia Corporation Fast cache invalidation response using cache class attributes
CN111523283B (en) * 2020-04-16 2023-05-26 北京百度网讯科技有限公司 Method and device for verifying processor, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5475852A (en) * 1989-06-01 1995-12-12 Mitsubishi Denki Kabushiki Kaisha Microprocessor implementing single-step or sequential microcode execution while in test mode
US6212493B1 (en) 1998-12-01 2001-04-03 Compaq Computer Corporation Profile directed simulation used to target time-critical crossproducts during random vector testing
US6249892B1 (en) * 1998-10-29 2001-06-19 Advantest Corp. Circuit structure for testing microprocessors and test method thereof
US20050038976A1 (en) * 2003-08-13 2005-02-17 Miller William V. Processor and method for pre-fetching out-of-order instructions


Also Published As

Publication number Publication date
US20120151186A1 (en) 2012-06-14
US8533394B2 (en) 2013-09-10
US20100306476A1 (en) 2010-12-02

Similar Documents

Publication Publication Date Title
US8533394B2 (en) Controlling simulation of a microprocessor instruction fetch unit through manipulation of instruction addresses
JP6507435B2 (en) Instruction emulation processor, method, and system
US7444499B2 (en) Method and system for trace generation using memory index hashing
KR101738212B1 (en) Instruction emulation processors, methods, and systems
EP3716056B1 (en) Apparatus and method for program order queue (poq) to manage data dependencies in processor having multiple instruction queues
US7877580B2 (en) Branch lookahead prefetch for microprocessors
US20170097826A1 (en) System, Method, and Apparatus for Improving Throughput of Consecutive Transactional Memory Regions
US6754856B2 (en) Memory access debug facility
US6266768B1 (en) System and method for permitting out-of-order execution of load instructions
US9135005B2 (en) History and alignment based cracking for store multiple instructions for optimizing operand store compare penalties
US20110219213A1 (en) Instruction cracking based on machine state
US11188341B2 (en) System, apparatus and method for symbolic store address generation for data-parallel processor
US11113164B2 (en) Handling errors in buffers
CN114661347A (en) Apparatus and method for secure instruction set execution, emulation, monitoring and prevention
US20220413860A1 (en) System, Apparatus And Methods For Minimum Serialization In Response To Non-Serializing Register Write Instruction
US5742755A (en) Error-handling circuit and method for memory address alignment double fault
US20100083269A1 (en) Algorithm for fast list allocation and free
US9582286B2 (en) Register file management for operations using a single physical register for both source and result
US7844859B2 (en) Method and apparatus for instruction trace registers
JPH1049373A (en) Method and device for operating multiplex and highly accurate event for pipeline digital processor
US10133620B2 (en) Detecting errors in register renaming by comparing value representing complete error free set of identifiers and value representing identifiers in register rename unit
Karimi et al. On the impact of performance faults in modern microprocessors
US10346171B2 (en) End-to end transmission of redundant bits for physical storage location identifiers between first and second register rename storage structures
CN115934433A (en) System, apparatus, and method for autonomic function testing of a processor
Bilen et al. Performance Evaluation of Embedded Microcomputers for Avionics Applications

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GIRI, AKASH V.;GREENE, DARIN M.;SINGLETARY, ALAN G.;SIGNING DATES FROM 20090527 TO 20090601;REEL/FRAME:022768/0207

AS Assignment

Owner name: ACCURI CYTOMETERS, INC., MICHIGAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BALL, JACK T.;RICH, COLLIN A.;ROGERS, CLARE E.;SIGNING DATES FROM 20090716 TO 20090717;REEL/FRAME:024789/0822

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20170702