WO1998011484A1 - Command processor having history memory - Google Patents

Command processor having history memory

Info

Publication number
WO1998011484A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
history
data
address
store
Prior art date
Application number
PCT/JP1996/002633
Other languages
French (fr)
Japanese (ja)
Inventor
Yukio Umetani
Yoshio Miki
Original Assignee
Hitachi, Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi, Ltd.
Priority to PCT/JP1996/002633
Publication of WO1998011484A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3824 Operand accessing
    • G06F9/383 Operand prefetching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3824 Operand accessing
    • G06F9/383 Operand prefetching
    • G06F9/3832 Value prediction for operands; operand history buffers

Definitions

  • the present invention relates to the configuration of an instruction processing device, and more particularly to an instruction processing device that uses an execution history to increase the speed.
  • the cache memory is a small-capacity high-speed memory placed in order to reduce the access time to the main memory.
  • when a processor reads data or an instruction from the main memory, the address and the data or instruction that was read are registered in the cache memory.
  • when the processing unit accesses the main memory, it first refers to the cache memory, and accesses the main memory only when the data or instruction is not registered there.
  • Japanese Patent Application Laid-Open No. 60-129,839 discloses a technique for speculatively starting the execution of an instruction using the result of a load instruction before the execution of the instruction is completed.
  • a history record is used to store, as a pair, the address of a load instruction and the data previously read from main memory.
  • the use of the history information is limited to reducing the main memory access time or to speculatively starting instruction execution; it is not used to reduce the operation time in the processing device, the address calculation time, or the cache memory access time.
  • a history storage (HS: History Storage) is provided that selectively stores, in correspondence with each executed instruction address, the operation input and result data, the load/store address and data of main memory, the contents of the registers forming the address, and a success count indicating the validity of the history.
  • when the instruction of the same address is executed again, the current input data is compared with the history data in the case of an operation instruction, and the current load/store address is compared with the history in the case of a load/store instruction. If they are the same, the execution of the operation is skipped in the case of the operation instruction and the result data of the history storage is used as the current result; in the case of the load/store instruction, the cache memory access is skipped.
  • the speed-up is achieved by using the result data of the history storage as the current result of the load instruction.
  • for the load/store instruction, if the contents of the registers forming the address are equal between the current value and the history value, the calculation of the address is also skipped. The number of successful skips is reflected in the success count of the history, and when there is no more space in the history storage, information is deleted in ascending order of success count so that effective information remains in the history storage.
  • the history storage has an invalidation field corresponding to each instruction address; when a write is made to an instruction address or a load address recorded in the history storage, the corresponding entry is invalidated.
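The reuse, success-count, and invalidation behavior summarized above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit of the patent: the names `HistoryEntry` and `HistoryStorage`, the `capacity` parameter, and the choice to reset the count when an entry is re-recorded are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    inputs: tuple            # operation input data recorded last time
    result: int              # result data recorded last time
    success_count: int = 0   # how many times this entry allowed a skip


class HistoryStorage:
    """Hypothetical model of a history storage keyed by instruction address."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}

    def execute(self, instr_addr, inputs, operation):
        """Return (result, skipped) for an operation instruction."""
        entry = self.entries.get(instr_addr)
        if entry is not None and entry.inputs == inputs:
            # Inputs match the history: skip the operation, reuse the result.
            entry.success_count += 1
            return entry.result, True
        result = operation(*inputs)
        self._record(instr_addr, HistoryEntry(inputs, result))
        return result, False

    def _record(self, instr_addr, entry):
        # When full, delete the entry with the smallest success count
        # (assumption: ties are broken by insertion order).
        if instr_addr not in self.entries and len(self.entries) >= self.capacity:
            victim = min(self.entries, key=lambda a: self.entries[a].success_count)
            del self.entries[victim]
        self.entries[instr_addr] = entry

    def invalidate(self, instr_addr):
        # Models invalidation after a write to a recorded address.
        self.entries.pop(instr_addr, None)
```

In this model a second call with the same instruction address and the same inputs returns the stored result without calling `operation`, and entries that never produce a skip are the first to be evicted.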
  • the instruction processing device processes an instruction specified by an address.
  • the instruction processing device stores, as history information, a pair of data processed by the instruction and an address of the instruction.
  • a history storage device, a recording unit for recording the history information in the history storage device when executing the instruction, and a re-executing unit that uses the content of the history information stored in the history storage device when the instruction of the same address is executed again.
  • when the instruction is an operation instruction, the data includes data that is an input of an operation and data of an operation result, and the re-executing unit is configured to execute the operation at the present time.
  • when the instruction is a store instruction, the data includes a store destination address and store data accessed by the store instruction, and the re-executing means includes:
  • a first comparator for comparing the store destination address of the store instruction at the present time with the store destination address of the store instruction recorded in the history storage device;
  • a second comparator for comparing the store data of the store instruction at the present time with the store data of the store instruction stored in the history storage device; and means for suppressing the store operation of the current store instruction when the first comparator and the second comparator both indicate a match.
  • the re-executing means has means for executing the store instruction at the present time when the first comparator and the second comparator do not show a match, and in this case, the updating means invalidates the data in the history storage device having the store destination address.
  • when the instruction is a store instruction, the data includes the contents of an address register group forming a store address and the store address, and the re-executing means includes comparing means for comparing the content of the address register group at the current time with the content of the address register group recorded in the history storage device; and
  • means for using the store destination address recorded in the history storage device, instead of calculating the store destination address, when the comparing means indicates a match.
  • the re-executing means includes means for executing the store instruction at the present time when the comparing means does not indicate a match, and in this case, the updating means invalidates the data in the history storage device having the store destination address.
  • when the instruction is a load instruction, the data includes a load source address and load data accessed by the load instruction, and the re-executing means includes a first comparator for comparing the load source address of the load instruction at the current time with the load source address of the instruction stored in the history storage device, and means for using the recorded load data as the load data of the currently executed instruction.
  • when the instruction is a load instruction, the data includes the load data and the contents of a group of address registers forming a load address, and the re-executing means includes comparing means for comparing the current contents of the address register group with the contents of the address register group of the load instruction recorded in the history storage device; and
  • means for using the recorded load data as the load data resulting from the load operation of the current load instruction.
  • the history information recorded in the history storage device has a counter for counting the number of times the re-executing unit has executed the instruction using the history information.
  • the updating unit includes means for updating the counter when an instruction is executed using the history information, and means for deleting the history information in ascending order of the counter value when the storage area of the history storage device is insufficient.
  • FIG. 1 is a diagram showing an overall configuration according to the present invention.
  • FIG. 2 is a detailed configuration diagram of the general-purpose operation unit.
  • FIG. 3 is a detailed configuration diagram of the floating-point operation unit.
  • FIG. 4 is a detailed configuration diagram of the branch operation unit.
  • FIG. 5 is a diagram illustrating an example of an instruction sequence.
  • FIG. 6 is a diagram showing an execution stage flow of an instruction sequence.
  • FIG. 7 is a diagram showing an alternative overall configuration.
  • FIG. 1 shows the configuration of the target instruction processing device.
  • the instruction processor comprises an instruction cache memory 11 that temporarily stores instructions, an instruction control unit 12 that controls instructions in the instruction processor, a branch operation unit 13 that performs branch operations, a general-purpose operation unit 14 that executes fixed-point instructions and load/store instructions, a floating-point operation unit 15 that executes floating-point operations, and a data cache storage 16 that temporarily stores data.
  • although the main storage device is not shown in FIG. 1, the instructions and data to be processed by the instruction processing device are stored in the main storage device.
  • the instruction control unit 12 and the operation units 13, 14, and 15 transfer the above-described instruction and data to and from the main storage device via the instruction cache 11 and the data cache 16.
  • the instruction sequence is read from the instruction cache memory 11 to the instruction buffer 12 in the instruction control unit 12 by the instruction control unit 12, distributed by instruction type to the branch operation unit 13, the general-purpose operation unit 14, and the floating-point operation unit 15, and then executed by each operation unit. Dotted lines indicate the flow of instruction words.
  • the branch operation unit 13 includes an instruction queue (IQB) 133 that receives the distributed partial instruction sequence, a history storage (HSB) 135 that records the history data of each executed instruction along with the instruction address of the instruction,
  • a history queue (HQB) 134 that holds history data corresponding to the instructions in the instruction queue, an instruction processing unit (Br) 136 that performs branch operations, a history processing unit (HeVB) 131, and a condition code (CC) 132.
  • the general-purpose operation unit 14 has an instruction queue (IQG) 143 that receives the distributed partial instruction sequence, a history storage (HSG) 145, and a history queue (HQG) 144 that holds history data corresponding to the instructions in the instruction queue 143.
  • IQG instruction queue
  • HSG history storage
  • HQG history queue
  • the floating-point arithmetic unit 15 has an instruction queue (IQF) 153 for receiving the distributed partial instruction sequence, a history storage (HSF) 155, and a history queue (HQF) 154 for holding history data corresponding to the instructions in the instruction queue.
  • FIG. 2 shows a more detailed configuration of the core general-purpose arithmetic unit 14.
  • the instruction queue (IQG) 143 is a storage means for temporarily storing instructions. For each instruction it consists of an instruction address field (Iadr) 1431 that stores the address of the instruction in main memory,
  • an instruction code field (Op) 1432, fields (S1, S2) 1433 and 1434 that hold the register numbers of the operation input registers or load/store address registers, and a field (R) 1435 that holds the register number for the operation result or load/store data.
  • the register number values in fields 1433, 1434, and 1435 are used to specify the registers in general-purpose registers 142a and 142b.
  • the instruction address is stored when the instruction is registered in the instruction queue. Note that, in the general-purpose registers 142a and 142b, r0 and r31 indicate register numbers. In the figure, r2 to r30 are omitted.
  • History storage (HSG) 145 is a storage device for storing a history of executed instructions, and stores history information on instructions executed in instruction units.
  • the history information includes, for each instruction, an instruction address field (I adr) 1451 for storing the address of the instruction in the main memory, and a condition code field for storing a condition code of an instruction execution result.
  • the history queue (HQG) 144 consists, for each instruction in the instruction queue, of a V field (V) 1441 that indicates the presence or absence of history information, a condition code history field (CC) 1442,
  • history fields (DataS1, DataS2) 1443 and 1444 for the operation input data or the address register contents, a history field (DataR) 1445 for the data stored in the result register,
  • and a history field (Dadr) 1446 for the load/store address.
  • Instructions stacked in the instruction queue 143 are scheduled into the pipeline and executed in order from the bottom of the figure.
  • the operation instruction pipeline begins with reading from the register.
  • the pipeline of the load instruction consists of four stages: address register read (r), address calculation (a), data load (L), and register write (w).
  • the pipeline of the store instruction consists of four stages: address register read (r), address calculation (a), data read (r), and data store (ST).
  • the instruction word registers 211, 212, 213, and 214 are registers for storing the instruction words executed in the first to fourth stages, respectively. The history registers 233 and 234 correspond to the instruction word registers of the second to fourth stages and store part of the history information of each stage (result data or load/store data, and load/store address). The scheduled instruction and its history information flow sequentially through these registers. The line indicating the movement of data from the history register 233 to the history register 234 is omitted from FIG.
  • the storage means 201 to 204 hold a bit (V bit) indicating the presence or absence of history for each stage.
  • the storage means 26 2, 26 3, and 26 4 hold a bit (U bit) indicating the validity of the history determined in the first stage from the second stage to the fourth stage.
  • the movement of U-bit data between the storage means from the second stage to the fourth stage and the movement of V-bit data between the storage means from the second stage to the fourth stage are omitted from FIG.
  • the U bit is set to "1" when the input data of the history or the contents of the address register are all equal to the input data of the corresponding current instruction or the contents of the address register. Otherwise, it is set to "0".
  • Two comparators 241a and 241b are used for the determination when setting the U bit; the U bit is the logical product of the outputs of the two comparators and the V bit (the output of 201).
  • An AND circuit 251 is used to calculate the AND.
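The U-bit rule just described reduces to a three-input logical product. The following Python sketch illustrates it; the function name `u_bit` and its parameter names are hypothetical, and only the roles of comparators 241a and 241b, AND circuit 251, and the V bit come from the text.

```python
def u_bit(v_bit, cur_s1, cur_s2, hist_s1, hist_s2):
    """Return 1 when the history is present and both inputs match it, else 0."""
    match_s1 = cur_s1 == hist_s1   # role of comparator 241a
    match_s2 = cur_s2 == hist_s2   # role of comparator 241b
    # Role of AND circuit 251: logical product of both matches and the V bit.
    return int(bool(v_bit) and match_s1 and match_s2)
```

A single mismatching input, or a missing history (V bit of 0), is enough to force the U bit to 0, in which case the instruction is executed normally.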
  • the general-purpose registers 142a and 142b are used to hold both fixed-point data and memory addresses.
  • the two general-purpose registers 142a and 142b hold the same data, and the contents of the register number specified by the instruction word are read into the address/data registers 222a and 222b of the general-purpose operation unit.
  • Register 242 holds the condition code CC generated as a result of the operation
  • result register 223 holds the result of the second stage operation or address calculation (Adr/Data).
  • the address register 224a is a register that holds, in the third stage, the address generated in the second stage by the load/store instruction, and the register 224b holds the data read from memory (main memory or data cache) by a load instruction, or the store data read from a register by a store instruction.
  • the third comparator 243 is used for comparing the load destination address of the data of the current instruction with the load destination address of the history data in the load instruction.
  • the fourth and fifth comparators 244a and 244b are used in a store instruction to compare, respectively, the current value of the write data (store data) with its history value, and the current value of the write address (store address in main memory) with its history value.
  • the AND circuit 254 is a circuit that calculates the logical product of the comparison results of these two comparators.
  • the procedure for using and generating history information with this configuration is as follows. First, when an instruction is registered in the instruction queue 143, the history storage 145 is searched using the instruction address 1431 as a key. If an entry is found, the V field 1441 of the corresponding entry of the history queue 144 is set to "1", and the corresponding information in the history storage is copied to the information fields of the history queue 144. If the history storage 145 is searched using the instruction address 1431 as a key and no entry is found, the V field is set to "0 (zero)" to indicate that there is no history data for the instruction.
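The registration step can be sketched as a lookup keyed by instruction address. This is a hedged illustration only: the function name `register_instruction` and the use of Python dictionaries for the history storage and the history queue entry are assumptions, while the field names V, CC, DataS1, DataS2, and DataR follow the text.

```python
def register_instruction(instr_addr, history_storage):
    """Build the history queue entry for an instruction being queued.

    history_storage is assumed to map an instruction address to a dict
    of recorded history fields.
    """
    record = history_storage.get(instr_addr)
    if record is None:
        return {"V": 0}        # no history exists for this instruction
    entry = {"V": 1}           # history found: mark it present
    entry.update(record)       # copy the recorded fields into the queue entry
    return entry
```

For example, with a storage holding one record at address `0x40`, registering `0x40` yields an entry with V set to 1 plus the copied fields, and registering any other address yields an entry with V set to 0.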
  • the current instruction values of the input data or address are compared with the history values by the comparators 241a and 241b. More specifically, first, the data of the general-purpose registers 142a and 142b designated by the register numbers S1 and S2 of the instruction word register 211 are read out onto the lines 2101 and 2102, respectively. DataS1 and DataS2 are also read from the history queue 144. The comparator 241a compares the data on line 2101 with DataS1 and outputs whether or not they match to the AND circuit 251 via the line 2103. Similarly, the comparator 241b compares the data on line 2102 with DataS2 and outputs whether they match to the AND circuit 251 via the line 2104.
  • the AND circuit 251 ANDs the results from the comparators 241a and 241b and the V value in 201.
  • when the results from the comparators 241a and 241b both indicate a match and the V value in 201 indicates valid, the U bit is set to "1", indicating that the current instruction values of the input data or address match the history values.
  • the instruction is an arithmetic instruction
  • the second stage is skipped, the instruction word of the third stage is stored in the instruction register 213, and the history information (CC, DataR, ).
  • the history information sent to the history register 233 is written as (HCC, HDataR, HDadr) to distinguish it from the contents of the history queue HQG.
  • the operation is completed by writing the result data (HDataR) of the history register 233 to the designated numbers of the general-purpose registers 142a and 142b via the line 2301 and the line 2001.
  • the second stage is skipped and the instruction is sent to the instruction register 213 of the third stage, and the history information is sent from the history queue 144 to the history register 233.
  • the store data is stored in register 224b (the store data is the contents of the general-purpose registers 142a and 142b specified by the field R of the instruction; in the figure, 142a and 142b are denoted as general-purpose register Gr, and the store data is input from this Gr), and processing then proceeds to the fourth stage.
  • the comparator 244a compares the current value of the store data with the history value HDataR, and if both match, the store operation is suppressed. This has the effect of shortening the execution time of the store instruction.
  • the instruction word of the fourth stage is stored in the instruction word register 214, and the history information (CC, DataR, Dadr) is stored in the history register 234 from the history queue.
  • the read data (HDataR) of the history register 234 is written to the general-purpose registers 142a and 142b via the lines 2403 and 2001, or is sent to the floating-point unit 15 via line 2405 to be written to the floating-point registers 152a and 152b, and the instruction terminates (in FIG. 2, the floating-point unit is denoted by Fu).
  • the operation input data of the registers 222a and 222b are registered in the history storage 145 via the line 2201, the operation is performed in the instruction processing unit 146, and the result is stored in the register 223.
  • the operation result is written from the register 223 of the third stage to the general-purpose registers 142a and 142b via the lines 2302 and 2001, and is registered in the history storage 145 via the line 2002.
  • the load source address is calculated in the second stage and stored in the register 223, and in the third stage, reading of the data indicated by the address stored in the register 223 from main memory (including the data cache) into register 224b is started.
  • the comparator 243 compares the current address from the register 223 with the history address HDadr from the history register 233. When the comparison of the load source address by the comparator 243 shows that the two match, the data reading is canceled (load suppression), the effective count 1457 of the history storage is updated via the lines 2304 and 20033, and the U bit 264 of the fourth stage is set to "1". In the fourth stage, when the U bit 264 is "1", the result data (HDataR) on the history register 234 is transferred to the general-purpose registers 142a and 142b via lines 2403 and 2001.
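The load suppression path can be modeled in a few lines. This is a sketch under assumptions rather than the patent's hardware: the function name `execute_load`, the dictionary keys, and the `read_memory` callback are hypothetical, and the model relies on the invalidation rule described earlier to keep the recorded data consistent with memory.

```python
def execute_load(current_addr, history, read_memory):
    """Return (data, suppressed) for a load instruction.

    history is assumed to be None when no history exists, or a dict with
    keys "Dadr" (load address), "DataR" (load data), and "Cnt" (effective count).
    """
    if history is not None and history["Dadr"] == current_addr:
        # Addresses match (role of comparator 243): cancel the memory read,
        # update the effective count, and reuse the recorded load data.
        history["Cnt"] += 1
        return history["DataR"], True
    # No usable history: perform the normal memory (or cache) read.
    return read_memory(current_addr), False
```

When the address matches, `read_memory` is never called, which is the software analogue of canceling the data read in the third stage.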
  • the address calculation is performed by the second stage computing unit 146, and in the third stage, the calculated address is stored in the register 224a and the data stored in the general-purpose register indicated by the store instruction is set in register 224b.
  • in the fourth stage, the current value of the data and the history value are compared by the comparator 244a, the current value of the address and the history value are compared by the comparator 244b, and the logical product of the comparison results is taken by the AND circuit 254.
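The store check in the fourth stage can be illustrated the same way. In this minimal sketch the function name `execute_store`, the dictionary keys, and the `write_memory` callback are assumptions; the two equality tests stand in for comparators 244a and 244b and their conjunction for AND circuit 254.

```python
def execute_store(addr, data, history, write_memory):
    """Return True when the store is suppressed, False when it is performed.

    history is assumed to be None when no history exists, or a dict with
    keys "Dadr" (store address) and "DataR" (store data).
    """
    if history is not None:
        data_match = history["DataR"] == data   # role of comparator 244a
        addr_match = history["Dadr"] == addr    # role of comparator 244b
        if data_match and addr_match:           # role of AND circuit 254
            return True                         # same data to same address
    write_memory(addr, data)
    return False
```

Suppression is safe in this model only because the same data is already at the same address; if either the address or the data differs, the write is carried out.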
  • FIG. 3 similarly shows a detailed configuration of the floating-point operation unit 15. Since load/store instructions are not handled here, it is simpler than the configuration in FIG. 2.
  • 301, 302, and 303 are stage V bits in the floating-point operation unit
  • 311, 312, and 313 are instruction word registers for storing the instruction words of each stage in the floating-point operation unit.
  • 322a, 322b, and 323 are data registers in the floating-point operation unit
  • 332 and 333 are stage history registers in the floating-point operation unit
  • 341a and 341b are comparators in the floating-point operation unit.
  • 342 is a condition code register in the floating-point operation unit
  • 351 is an AND circuit in the floating-point operation unit
  • 362 and 363 are stage history valid bits in the floating-point operation unit.
  • 3001, 3002, 310, 310, 3201, 3301, 3302, and 3303 are lines in the floating-point operation unit.
  • the instruction queue (IQF) 153 is a storage means for temporarily storing instructions. For each instruction, it consists of an instruction address field (Iadr) 1531, an instruction code field (Op) 1532, fields (S1, S2) 1533 and 1534 that hold the register numbers of the operation input registers, and a field (R) 1535 that holds the register number that stores the operation result.
  • the register number values in fields 1533, 1534, and 1535 are used to specify registers in the floating-point registers 152a and 152b.
  • the history storage (HSF) 155 is a storage means for storing the history of execution and stores history information registered for each instruction. For each instruction it consists of an instruction address field (Iadr) 1551, a condition code field (CC) 1552, fields (DataS1, DataS2) 1553 and 1554 for the operation input data, a field (DataR) 1555 for the operation result, and an effective count (Cnt) field 1557.
  • the history queue (HQF) 154 has a V field 1541, which indicates the presence or absence of history information, the condition code history field (CC) 1542, and the contents of operation input data for each instruction in the instruction queue.
  • the instruction register 311, 312, 313 holds the instruction of each stage.
  • the information held in the history register is only the condition code and the value of the result register.
  • the history register 332 holds the history of the instruction of the second stage, and the history register 333 stores the history data of the instruction of the third stage. Hold.
  • the comparators 341a and 341b are used for comparing the operation input data with the history value of the history queue.
  • the operation of each component shown in FIG. 3 when a floating-point operation instruction is executed is the same as the operation of the corresponding component shown in FIG. 2 when an operation instruction is executed.
  • FIG. 4 shows the detailed configuration of the branch operation unit 13.
  • 401, 402 are stage V bits in the branch operation unit
  • 411 and 412 are instruction word registers for storing stage instruction words in the branch operation unit
  • 431 and 432 are stage history registers in the branch operation unit
  • 441 and 452 are condition code registers in the branch operation unit
  • 442 is a branch establishment register
  • 461 is a branch address addition circuit
  • 462 is a branch address register
  • 471 is a comparator in the branch operation unit
  • 472 is a branch operation unit history valid bit
  • 4022 and 4003 are lines in the branch operation unit.
  • Each line of the instruction queue (IQB) 133 consists of an instruction address field (Iadr) 1331, an instruction code field (Op) 1332, a branch mask field (M) 1333, and a branch destination relative address field (Tadr) 1334.
  • the history queue (HQB) 134 holds, for each instruction in the instruction queue, a V field 1341 indicating the presence or absence of history information, a condition code field (CC) 1342, and a branch destination address (Badr) 1343.
  • the history storage (HSB) 135 consists of an instruction address field (Iadr) 1351, a condition code field (CC) 1352, a branch destination address (Badr) 1353, and an effective count field (Cnt) 1354.
  • the pipeline consists of two stages: branch address calculation (ba) and branch (bb).
  • the instruction registers 411 and 412 hold the instruction word of each stage, and the history registers 431 and 432 hold the history information corresponding to those instructions.
  • the condition code (CC) 441 sent from the general-purpose arithmetic unit (Gu) 14 or the floating-point arithmetic unit (Fu) 15 is compared with the mask of the instruction in the instruction register 411
  • by the comparator 451 to determine whether or not the branch is taken, and the result is set in the register 442.
  • the branch address is determined from the instruction address and the branch destination relative address using the address adder 461 and set in the branch address register 462.
  • condition code (HCC) of the history register 431 is compared with the condition code of the register 441 using the comparator 471, and the result is set in the history valid bit (U bit) 472.
  • the leftmost column is the instruction number ((1), (2), ..., (14)), to its right is the branch destination label (L0), and the next column is the instruction code
  • (LDI, SUB, etc.).
  • the following columns indicate the operands (1, %r9, etc.).
  • %r indicates a general-purpose register
  • %fr indicates a floating-point register.
  • the registers at the right end indicate the registers where the results are placed by operation instructions and load instructions, and the others indicate registers for operation input, store data, or address specification. Numerical value means direct designation of data or address by command word.
  • Instruction numbers (1), (2), (4), (6), (7), (8), and (11) are fixed-point arithmetic instructions, which are executed by the general-purpose arithmetic unit.
  • Instruction numbers (3) and (14) are fixed-point load instructions, also executed by the general-purpose arithmetic unit.
  • Instruction numbers (5), (9) and (12) are floating-point load / store instructions, which are executed by the general-purpose operation unit 14 and the floating-point operation unit 15 in cooperation.
  • the instruction number (10) is a floating-point operation instruction, and is executed by the floating-point operation unit 15.
  • the instruction number (13) is a comparison / branch instruction, which is executed by the general-purpose operation unit 14 and the branch operation unit 13 in cooperation.
  • the LDI (Load Immediate) instructions (1) and (2) transfer the value directly to the result register.
  • the value 1 is transferred to the general-purpose register 9.
  • the LDW (Load Word) instructions in (3) and (14) are the fixed-point data for one word from the memory address specified by the sum of the contents of general-purpose register 30 and the direct value 604. And set it to general-purpose register 3 No. 1.
  • the (4), (7) and (11) LDO (Load Offset) instructions add the specified offset value directly to the contents of the end address register and set it in the result register. In the example of (4), from the contents of general-purpose register 9; , And set the result in general-purpose register 6].
  • The FLDDX (Floating Load Double Indexed) instructions (5) and (9) read floating-point data from the memory address obtained by adding the contents of the two address registers, and set it in the result register.
  • In the example of (5), data is read from the address obtained by adding the contents of general-purpose register 16 to the contents of general-purpose register 22 and is set in floating-point register 8.
  • (6) is a fixed-point arithmetic instruction.
  • The contents of general-purpose register 23 are subtracted from the contents of general-purpose register 8, and the result is set in general-purpose register 17.
  • (8) is an addition instruction.
  • Instruction (10) multiplies the contents of floating-point register 7 by the contents of floating-point register 8 and sets the result in floating-point register 13.
  • The FSTDX (Floating Store Double Indexed) instruction (12) stores the contents of floating-point register 13 at the memory address that is the sum of the contents of general-purpose register 20 and general-purpose register 26.
  • The COMB (Compare and Branch) instruction (13) compares the contents of general-purpose register 9 and general-purpose register 11 and, when the latter is greater than or equal to the former, branches to the instruction labeled L0. That is, this instruction sequence forms a loop.
  • Instruction (14) is located in the delay slot of instruction (13), so it appears after instruction (13) in the listing. However, when the branch in (13) is taken, this instruction is also executed.
  • An instruction with “au” in the address invariance column of FIG. 5 is a load/store instruction whose memory address does not change from one loop iteration to the next. This is clear from the fact that the contents of its address registers are not rewritten. Likewise, u or ⁇ u in the data invariance column means that the contents of the instruction's result register do not change as the loop proceeds. For instruction (5) this is an assumption; for the others it follows from that assumption together with address invariance and the passing of invariant data between instructions.
  • FIG. 6 shows a stage flow of the pipeline when the instruction sequence of FIG. 5 is executed by the configuration of the present invention.
  • the horizontal axis indicates the flow of time from left to right, and the vertical axis indicates the instruction number.
  • The stage flow of the first loop iteration and the stage flows of the second and subsequent iterations are shown superimposed. From the second iteration onward, the instruction numbers are distinguished by appending “'”. Progress speeds up partway through the second and later iterations because the history information is used.
  • r, a, w, L, ST, i w, ia, fw, ba, and bb in FIG. 6 mean the stages shown in the description of FIG. 2, FIG. 3, and FIG. The situation is described below.
  • the operation input data is transmitted via lines 2201 and 3201, and in the third stage the operation result data and condition code are transmitted on lines 2302, 2303 and 3302.
  • the memory address is written to history storage 145 via line 2401 and the load / store data via 2402.
  • For branch instructions, the condition code and the destination address are written to history storage 135 in the second stage via line 4002.
  • The history information is stored in the history storage 144 of the general-purpose operation unit for fixed-point operation instructions and load/store instructions, and in the history storage 155 of the floating-point operation unit for floating-point operation instructions.
  • For a comparison/branch instruction, it is written to the history storages 144 and 155 of both the general-purpose operation unit and the branch operation unit.
  • The address of the instruction in main memory is also recorded in the history storage: the Iadr of the stage that writes the history (the Iadr of one of 212, 213, 214) is written.
  • It is written when history data for the instruction is first written to the history storage; when further history data for the instruction is added later, the new data is added to the line corresponding to this address.
  • the line for writing the instruction address to the history storage is omitted from the figure.
  • Instruction (11) is executed before instruction (10) because instruction (11) is a fixed-point operation instruction and instruction (10) is a floating-point operation instruction, so each is executed by a separate arithmetic unit, and because instruction (10) uses the execution result of instruction (9). Likewise, instruction (12), which is executed first by the general-purpose arithmetic unit and then by the floating-point arithmetic unit, waits partway through its execution because it uses the result of instruction (10), which is still being executed by the floating-point operation unit.
  • After the first loop iteration completes, control returns to instruction (4)' and the next iteration starts.
  • Comparator 241a compares the history value from history queue 144 with the current value read from general-purpose register 142a. Because the contents of general-purpose register 9 were updated by instruction (11) after the history value was stored in the previous loop iteration, the history value and the current value do not match. The output of AND circuit 251 is therefore “0”, “0” is set in valid bit 212, the instruction word 211 of the first stage is sent to the second stage 212, and normal stage operation follows.
  • Comparator 241a compares the history value of general-purpose register 16 with its current value, and comparator 241b compares the history value of general-purpose register 22 with its current value.
  • The current value of general-purpose register 16 was updated from general-purpose register 9 by the immediately preceding instruction (4)' and differs from the history value, so the output of AND circuit 251 is “0” and all stages are executed, as for instruction (4)'.
  • The data read into data register 224b and the history value in history register 234 are compared by comparator 244a. Since they are equal, the valid count field 1457 of the line for instruction (5) in the history storage is incremented by one via lines 2404 and 2003.
  • the data in the data register 224b is sent from the line 2402 to the floating point arithmetic unit (Fu), and is written to the floating point registers 152a and 152b via the line 3302 and the line 3001 in FIG.
  • the first stage compares the current value with the history value of general-purpose register 8 and the current value with the history value of general-purpose register 23.
  • The result value (HdataR) of history register 233 is written to general-purpose registers 142a and 142b via lines 2301 and 2…, which completes the instruction.
  • The following operation instructions (7)' and (8)' are not affected by the preceding instructions, so they operate in the same way as operation instruction (6)', except for the one-cycle wait in the middle of instruction (6)'.
  • the operation of the operation instructions (7) 'and (8)' in the second loop skips the second stage compared to the operation of the operation instructions (7) and (8) in the first loop. Time is shortened.
  • The first stage in FIG. 2 compares the current value of general-purpose register 18 with its history value, and the current value of general-purpose register 31 with its history value.
  • Both comparisons match, and V bit 201 is "1" because the load instruction (9)' is being executed for the second time, so the output of AND circuit 251 is "1". After the valid count 1457 of the history storage is updated via lines 2105 and 2003, the second and third stages are skipped and processing proceeds to the fourth stage.
  • the instruction word of the first-stage instruction register 211 and the history value of the history queue are stored in the second and third stages.
  • The next operation instruction (10)' is executed in the floating-point operation unit 15.
  • Comparator 341a compares the current value of floating-point register 7 with its history value, and comparator 341b compares the current value of floating-point register 8 with its history value.
  • V bit 301 is "1", and the output of AND circuit 351 becomes "1".
  • The valid count 1557 of history storage 155 is updated, the second stage is skipped, and processing proceeds to the third stage.
  • The instruction word in the first-stage instruction register 311 and the history value from the history queue skip the second stage and are moved to the third-stage instruction register 313 and history register 333.
  • The result value (HdataR) of history register 333 is written to floating-point registers 152a and 152b via lines 3301 and 3001.
  • The operation instruction (11)' is executed by the general-purpose operation unit 14. Because the contents of input register 9 are counted up on every iteration, the current value and the history value are never equal, and all stages are executed.
  • The floating-point store instruction (12)' is executed in the general-purpose operation unit 14.
  • The current value of general-purpose register 26 is compared with its history value, and the current value of general-purpose register 20 is compared with its history value.
  • AND circuit 251 outputs "1". As a result, the valid count field 1457 of history storage 145 is updated, the second stage is skipped, and processing proceeds to the third stage (ir).
  • The instruction word in the first-stage instruction register 211 and the history value from history queue 144 skip the second stage and are moved to the third-stage instruction register 213 and history register 233.
  • The store data value of floating-point register 13 is read from the floating-point operation unit 15 via its line 3106 and set in data register 222b of the general-purpose operation unit.
  • Comparator 244a compares the store data in data register 224b with the history store data (HdataR) in history register 234. If they match, the valid count field 1457 of history storage 145 is updated via lines 2404 and 2003, and the data store operation is suppressed through circuit 254. If the comparison in comparator 244a does not match, the store operation is performed, because the previous store data and the current store data differ.
  • In that case, the address information (HAdr) of history register 234 is sent to history storage 145 via lines 2406 and 2002, and the invalidation field 1458 of each line whose data address field 1456 has the same value as HAdr is set to "1" to invalidate that line.
  • The general-purpose operation unit 14 and the branch operation unit 13 cooperate to execute instruction (13)'.
  • The first stage of the general-purpose operation unit 14 compares the current value of general-purpose register 9 with its history value, and the current value of general-purpose register 11 with its history value. Since the current value of general-purpose register 9 differs from the history value, the stages proceed normally.
  • the comparison operation is performed by the fixed-point operation circuit 146, and the condition code (CC) of the comparison result is set in the register 242.
  • The condition code in register 242 is written via lines 2303 and 2002 to the condition code field 1452 of history storage 145, and is sent to the branch operation unit (Bu) 13 at the same time.
  • The condition code is received in register 441 and compared with the mask field of instruction word 411 by comparator 451.
  • The resulting information 442, which indicates whether the branch is taken, is compared by comparator 471 with the condition code (HCC) of history register 431.
  • The valid count field 1354 of history storage 135 is updated via line 4403, and control branches to the beginning of the loop.
  • the instruction (14) 'in the delay slot is a fixed-point load instruction.
  • The current value of general-purpose register 30 is compared with its history value. Since they are equal, the valid count field 1457 of history storage 145 is updated via lines 210 and 2003, the second and third stages are skipped, and processing proceeds to the fourth stage.
  • In the fourth stage, the result data (HDataR) of history register 234 is written to general-purpose registers 142a and 142b via lines 2403 and 2001.
  • the instruction sequence can be executed earlier than the first loop by skipping the stage using the history data.
  • When the history storage has no free space, entries are deleted in ascending order of valid count to create free space. This keeps, as far as possible, the history of instructions whose history records have proven useful in the history storage. For an instruction with no history at execution time, the V fields 1341, 1441, and 1541 of the history queue are "0"; such an instruction is executed without skipping any stage, and its history information is registered in the history storage.
  • As with a store from the own processor, the line of the history storage whose data address 1456 matches the store address is invalidated. If the store address is unknown, all load/store lines in the history storage are invalidated.
  • The history storages described so far are distributed among the operation units. However, as with the history cache (history storage) 17, the history may instead be held in correspondence with the instructions in instruction cache 11.
  • the management of the replacement of the history information stored in the history storage registration of the history data in the history storage or deletion of the history information from the history storage, etc.
  • The history information is temporarily taken into the history buffer (HBUF) 122 at the same time as the instruction sequence is taken into the instruction buffer (IBUF) 122 of the instruction control unit 12, and is then passed to the history queue of each operation unit.
  • The history information updated as a result of instruction execution is returned from the output history queues 137 (HQB), 147 (HQGO) and 157 (HQFO) to the history storage 17 via the history buffer (HBUF) 122.
  • The present invention is not limited to the format of the instruction sequence shown in the example. Any instruction format that can specify input registers, address registers, result registers, and immediate values can be used. Alternatively, the instruction word may specify a plurality of operations in a single instruction, as in a VLIW (Very Long Instruction Word) computer.
  • Figure 1 shows an example of three arithmetic units, but this may be more or less.
  • Since the essence of the invention is to skip execution of part of the stages by associating history information with each instruction, it can be applied to configurations with a plurality of instruction queues or a plurality of arithmetic units, such as multi-pipeline computers and superscalar computers.
  • The number of pipeline stages required to execute each instruction can be reduced, and as a result the execution time of an instruction sequence can be shortened. Also, since unnecessary load/store operations on memory are suppressed, the load of storage management such as cache control and address translation is reduced, and as a result the average execution time of load/store instructions can be shortened. Furthermore, since the instruction history is used when the instruction address in main memory matches, the history information is reused at a high rate and can be used effectively.
  • Industrial applicability
  • The instruction processing device is useful as a device that reads an instruction from a storage device by specifying an address and executes it, and is suitable for application to an instruction processing device that processes instructions in which an input register, an address register, a result register, an immediate value, and the like can be specified.

Abstract

A command processor having a history memory that holds data obtained when commands were executed in the past, such as input data together with arithmetic results, and read/write addresses together with the data at those addresses, wherein the history data is referred to in order to speed up execution of the same commands. If a history remains, the current input data is compared with the history data for an arithmetic operation, and the current read/write data and address are compared with the history for a read/write operation. If they coincide, the arithmetic operation is skipped and the history data is used as its result, or the read/write operation is skipped. This reduces the number of pipeline stages needed to execute each command, so the execution time of a command string can be shortened, and unnecessary load/store operations on memory can be eliminated. As a consequence, memory management such as cache control and address translation becomes more efficient, and the mean execution time of load/store commands can be shortened.

Description

Specification
Instruction processing device with history storage
Technical field
The present invention relates to the configuration of an instruction processing device, and more particularly to an instruction processing device that uses an execution history to increase speed.
Background art
Widely known techniques in this field include cache memory control using the address history of data and instruction words, and a technique that raises prediction accuracy using the taken/not-taken history of branch instructions (Japanese Patent Application No. Sho 63-40026 and others). As is well known, a cache memory is a small, fast memory provided to shorten access time to main memory. When the processing device reads data or an instruction from main memory, it registers the pair of address and data (or instruction) in the cache memory. When the processing device accesses main memory, it first consults the cache memory and accesses main memory only if the data or instruction is not registered there. Because the address history of data and instruction words tends to recur, a pair once registered in the cache memory is likely to be referenced again, so main memory is accessed only rarely and the effective read time for data and instruction words is shortened. When data or an instruction is written to main memory, the corresponding pair in the cache memory is invalidated or rewritten so that the contents of main memory and the cache memory agree. When the cache memory becomes full, one of the registered pairs is deleted to make room.
As for the branch history, the number of times each branch was taken or not taken is held in a special storage device for each branch instruction address, and this information is used to prefetch instructions on the more likely path, which shortens instruction read time in a probabilistic sense.
In addition, Japanese Patent Application Laid-Open No. Sho 60-129839 uses a history storage that holds, as a pair, the address of a load instruction and the data previously read from main memory, so that execution of an instruction that uses the result of the load can be started speculatively before the load instruction completes.
However, in the conventional techniques above, history information is used only to shorten main memory access time or to start instruction execution speculatively; it is not used to shorten operation time, address calculation time, or cache memory access time within the processing device.
An object of the present invention is to provide a history storage device that makes these reductions possible and, to that end, to provide a storage management means that keeps useful information in the history storage.
Disclosure of the invention
To this end, the present invention provides a history storage (HS: History Storage) that selectively stores, for each address of an already executed instruction, the input and result data of an operation, the load/store address and data of main memory, the contents of the registers that form the address, and a success count indicating how useful the history has been. When an instruction at the same address is executed again, the history storage is consulted. If a history remains, then for an operation instruction the current input data is compared with the history data, and for a read/write instruction the current read/write data and address are compared with the history. If they are the same, then for an operation instruction the operation is skipped and the result data in the history storage is used as the current result; for a load/store instruction the access to the cache memory is skipped, and for a load instruction the result data in the history storage is used as the current result. Speed-up is achieved in this way. For a load/store instruction, if the contents of the registers that form the address are also equal between the current value and the history value, the address calculation is skipped as well.
The number of successful skips is reflected in the success count of the history, and when the history storage runs out of space, entries are deleted in ascending order of success count so that useful information remains in the history storage. To keep the history storage consistent with the cache memory and main memory, the history storage has an invalid-indication field for each instruction address, and when a write is made to an instruction address or a load/store address held in the history storage, the corresponding part is invalidated.
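The reuse of an operation result described above can be illustrated with a minimal Python sketch. It is only a software model under stated assumptions, not the hardware of the invention: the names `HistoryLine`, `HistoryStorage`, `execute` and `multiply` are hypothetical, and resetting the success count when a line is re-registered is an assumption of the sketch that the text does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class HistoryLine:
    inputs: Tuple[int, ...]  # input data seen at the previous execution
    result: int              # result data produced at the previous execution
    success_count: int = 0   # how many times this line allowed a skip


class HistoryStorage:
    """Toy history storage (HS) keyed by instruction address (hypothetical model)."""

    def __init__(self) -> None:
        self.lines: Dict[int, HistoryLine] = {}

    def execute(self, instr_addr: int, inputs: Tuple[int, ...],
                operation: Callable[..., int]) -> Tuple[int, bool]:
        """Return (result, skipped) for one execution of an operation instruction."""
        line = self.lines.get(instr_addr)
        if line is not None and line.inputs == inputs:
            # Current inputs equal the recorded inputs: skip the operation
            # and reuse the recorded result.
            line.success_count += 1
            return line.result, True
        # No usable history: execute normally and (re)register the history.
        # Resetting the count here is an assumption of this sketch.
        result = operation(*inputs)
        self.lines[instr_addr] = HistoryLine(inputs, result)
        return result, False


def multiply(a: int, b: int) -> int:
    return a * b


hs = HistoryStorage()
first = hs.execute(0x100, (3, 4), multiply)   # no history yet: executed normally
second = hs.execute(0x100, (3, 4), multiply)  # same inputs: operation skipped
third = hs.execute(0x100, (5, 4), multiply)   # inputs changed: executed again
```

The second element of each returned tuple indicates whether the operation was skipped; in this model only the second call reuses the history.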
Specifically, the above is achieved by an instruction processing device that processes an instruction specified by an address, comprising: a history storage device that stores, as history information, the contents of the data processed by an instruction when it is executed, paired with the address of that instruction; recording means for recording the history information in the history storage device when an instruction is executed; re-execution means for executing an instruction using the contents of the history information stored in the history storage device when an instruction at the same address is executed again; and updating means for updating the contents of the history information stored in the history storage device when the instruction at the same address is re-executed.
In the above instruction processing device, when the instruction is an operation instruction, the data includes the input data of the operation and the operation result data, and the re-execution means has a comparator that compares the current input data of the operation instruction with the input data of the operation instruction recorded in the history storage device, and means for using the operation result data recorded in the history storage device as the current operation result data of the operation instruction when the comparator indicates a match. When the instruction is a store instruction, the data includes the store destination address accessed by the store instruction and the store data, and the re-execution means has a first comparator that compares the current store destination address of the store instruction with the store destination address recorded in the history storage device, a second comparator that compares the current store data of the store instruction with the store data recorded in the history storage device, and means for suppressing the store operation of the current store instruction when both the first comparator and the second comparator indicate a match.
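As an illustration of the two-comparator store suppression, the following Python sketch models memory and the history storage as dictionaries. The names `StoreHistory` and `execute_store` are hypothetical. Invalidating every line recorded for the written address when a store is actually performed follows the invalidation rule stated in this description; re-registering the current store afterwards is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StoreHistory:
    address: int  # store destination address recorded last time
    data: int     # store data recorded last time


def execute_store(memory: Dict[int, int], history: Dict[int, StoreHistory],
                  instr_addr: int, address: int, data: int) -> bool:
    """Return True if the store was performed, False if it was suppressed."""
    line = history.get(instr_addr)
    addr_match = line is not None and line.address == address  # first comparator
    data_match = line is not None and line.data == data        # second comparator
    if addr_match and data_match:
        # Memory already holds this data at this address: suppress the store.
        return False
    memory[address] = data
    # Invalidate every history line recorded for the written address,
    # whichever instruction it belongs to.
    for key in [k for k, v in history.items() if v.address == address]:
        del history[key]
    # Register the current store (an assumption of this sketch).
    history[instr_addr] = StoreHistory(address, data)
    return True


memory: Dict[int, int] = {}
history: Dict[int, StoreHistory] = {}
performed_first = execute_store(memory, history, 0x200, 0x1000, 7)
performed_again = execute_store(memory, history, 0x200, 0x1000, 7)
```

In this model the second call is suppressed because both the address and the data match the recorded history. A store to the same address from a different instruction removes that history, so the original store is performed again the next time.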
Furthermore, the re-execution means has means for executing the current store instruction when the first comparator and the second comparator do not both indicate a match, in which case the updating means invalidates the data in the history storage device that has the store destination address.
Furthermore, when the instruction is a store instruction, the data includes the contents of the address register group that forms the store destination address and the store destination address itself, and the re-execution means has comparison means that compares the current contents of the address register group with the contents of the address register group recorded in the history storage device, and means for suppressing the calculation of the store destination address and using the store destination address recorded in the history storage device when the comparison means indicates a match. The re-execution means also has means for executing the current store instruction when the comparison means does not indicate a match, in which case the updating means invalidates the data in the history storage device that has the store destination address.
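Skipping the address calculation can be sketched in the same spirit. In the following Python model the names `AddressHistory` and `effective_address` are hypothetical, and the 32-bit sum of the address registers is an assumed addressing mode chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class AddressHistory:
    addr_regs: Tuple[int, ...]  # contents of the address registers last time
    address: int                # address formed from them last time


def effective_address(history: Dict[int, AddressHistory], instr_addr: int,
                      addr_regs: Tuple[int, ...]) -> Tuple[int, bool]:
    """Return (address, calculation_skipped) for one load/store instruction."""
    line = history.get(instr_addr)
    if line is not None and line.addr_regs == addr_regs:
        # Address registers unchanged since the last execution:
        # reuse the recorded address instead of recomputing it.
        return line.address, True
    # Assumed addressing mode: 32-bit sum of the address register contents.
    address = sum(addr_regs) & 0xFFFFFFFF
    history[instr_addr] = AddressHistory(addr_regs, address)
    return address, False


addr_history: Dict[int, AddressHistory] = {}
computed = effective_address(addr_history, 0x300, (0x1000, 8))
reused = effective_address(addr_history, 0x300, (0x1000, 8))
```

The comparison is made on the register contents rather than on the address, which is what allows the addition itself to be bypassed.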
Furthermore, when the instruction is a load instruction, the data includes the load source address accessed by the load instruction and the load data, and the re-execution means has a first comparator that compares the current load source address of the load instruction with the load source address of the load instruction recorded in the history storage device, and means for using the load data recorded in the history storage device as the load data of the current load instruction when the first comparator indicates a match.
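The load case can be sketched as follows. The class `LoadHistory` and its methods `load` and `note_store` are hypothetical names; `note_store` models the invalidation of history lines when their data address is written, which this description requires for consistency with memory.

```python
from typing import Dict, Tuple


class LoadHistory:
    """Toy load history: instruction address -> (data address, load data)."""

    def __init__(self) -> None:
        self.lines: Dict[int, Tuple[int, int]] = {}

    def load(self, memory: Dict[int, int], instr_addr: int,
             data_addr: int) -> Tuple[int, bool]:
        """Return (data, memory_access_skipped) for one load instruction."""
        line = self.lines.get(instr_addr)
        if line is not None and line[0] == data_addr:
            # Same load source address as last time: reuse the recorded data.
            return line[1], True
        data = memory[data_addr]
        self.lines[instr_addr] = (data_addr, data)
        return data, False

    def note_store(self, data_addr: int) -> None:
        """Invalidate every line whose data address has been written."""
        self.lines = {k: v for k, v in self.lines.items() if v[0] != data_addr}


load_memory: Dict[int, int] = {0x2000: 42}
load_history = LoadHistory()
loaded = load_history.load(load_memory, 0x400, 0x2000)
loaded_again = load_history.load(load_memory, 0x400, 0x2000)
```

Without the call to `note_store` after a write, the model would return stale data, which is why the invalidation field is part of the scheme.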
Furthermore, when the instruction is a load instruction, the data includes the load data and the contents of the address register group that forms the load source address, and the re-execution means has comparison means that compares the current contents of the address register group of the load instruction with the contents of the address register group of the load instruction recorded in the history storage device, and means for using the load data recorded in the history storage device as the load data resulting from the current load instruction when the comparison means indicates a match. Furthermore, the history information recorded in the history storage device has a counter that counts the number of times the re-execution means has executed an instruction using that history information, and the updating means has means for updating the counter when the re-execution means executes an instruction using the history information, and means for deleting history information in ascending order of counter value when the storage area of the history storage device runs short. In addition, when the history storage device holds no history information for an instruction at execution time, the instruction processing device executes the instruction without using a history and newly registers history information for that instruction in the history storage device.
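The counter-based replacement can be sketched as a small Python model. The class `CountedHistory`, its fixed positive `capacity`, and the tie-breaking rule (the oldest registered line among those with the smallest count is deleted) are assumptions of the sketch; the text specifies only that lines with smaller counter values are deleted first.

```python
from typing import Dict, Optional


class CountedHistory:
    """Toy model of use counters with smallest-count-first replacement."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity         # assumed to be a positive integer
        self.counts: Dict[int, int] = {}  # instruction address -> use count

    def register(self, instr_addr: int) -> Optional[int]:
        """Register a line; return the evicted instruction address, if any."""
        evicted = None
        if instr_addr not in self.counts and len(self.counts) >= self.capacity:
            # Storage is full: delete the line with the smallest use count.
            evicted = min(self.counts, key=self.counts.get)
            del self.counts[evicted]
        self.counts.setdefault(instr_addr, 0)
        return evicted

    def record_use(self, instr_addr: int) -> None:
        """Count one execution that reused this line's history."""
        self.counts[instr_addr] += 1


counted = CountedHistory(capacity=2)
counted.register(0x10)
counted.register(0x14)
counted.record_use(0x10)
counted.record_use(0x10)
counted.record_use(0x14)
victim = counted.register(0x18)  # storage is full, so one line is evicted
```

Here the line for instruction address 0x14 has the smaller count and is the one deleted, so the history that was reused more often survives.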
Achieved by registering with the device. BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 shows the overall configuration according to the invention. FIG. 2 is a detailed configuration diagram of the general-purpose arithmetic unit. FIG. 3 is a detailed configuration diagram of the floating-point arithmetic unit. FIG. 4 is a detailed configuration diagram of the branch arithmetic unit. FIG. 5 shows an example instruction sequence. FIG. 6 shows the execution stage flow of the instruction sequence. FIG. 7 shows an alternative overall configuration.
BEST MODE FOR CARRYING OUT THE INVENTION
The invention is described in more detail below with reference to the accompanying drawings.
FIG. 1 shows the configuration of the instruction processing device in question. The device consists of an instruction cache 11 that temporarily stores instructions, an instruction control unit 12 that controls instructions within the device, a branch arithmetic unit 13 that performs branch operations, a general-purpose arithmetic unit 14 that executes fixed-point instructions and load/store instructions, a floating-point arithmetic unit 15 that executes floating-point operations, and a data cache 16 that temporarily stores data. The main memory is not shown in FIG. 1, but it holds the instructions and data that the instruction processing device processes. The instruction control unit 12 and the arithmetic units 13, 14, and 15 exchange these instructions and data with the main memory through the instruction cache 11 and the data cache 16.
The instruction control unit 12 reads the instruction sequence from the instruction cache 11 into the instruction buffer 121 inside the instruction control unit 12 and distributes the instructions by type to the branch arithmetic unit 13, the general-purpose arithmetic unit 14, and the floating-point arithmetic unit 15, after which each unit executes them. The dotted lines indicate the flow of instruction words.
The branch arithmetic unit 13 includes an instruction queue (IQB) 133 that receives the distributed partial instruction sequence, a history storage (HSB) 135 that records the history data of executed instructions together with their instruction addresses, a history queue (HQB) 134 that holds history data corresponding to the instructions in the instruction queue, an instruction processing section (Br) 136 that performs branch operations, a history processing section (HevB) 131, and a condition code (CC) 132.
The general-purpose arithmetic unit 14 includes an instruction queue (IQG) 143 that receives the distributed partial instruction sequence, a history storage (HSG) 145, a history queue (HQG) 144 that holds history data corresponding to the instructions in instruction queue 143, an instruction processing section (Fixed) 146, a history processing section (HevG) 141, and general-purpose registers (GR) 142.
The floating-point arithmetic unit 15 includes an instruction queue (IQF) 153 that receives the distributed partial instruction sequence, a history storage (HSF) 155, a history queue (HQF) 154 that holds history data corresponding to the instruction queue, an instruction processing section (Float) 156, a history processing section (HevF) 151, and floating-point registers (FR) 152.
FIG. 2 shows a more detailed configuration of the general-purpose arithmetic unit 14, which is the central unit.
The instruction queue (IQG) 143 is a storage means that temporarily holds instructions. For each instruction it consists of an instruction address field (Iadr) 1431 that stores the address of the instruction in main memory, an instruction code field (Op) 1432, fields (S1, S2) 1433 and 1434 that hold the register numbers of the operation input registers or of the load/store address registers, and a field (R) 1435 that holds the number of the register that stores the operation result or the load/store data. The register number values in fields 1433, 1434, and 1435 are used to designate registers in general-purpose registers 142a and 142b. The instruction address is stored when the instruction is registered in the instruction queue. In general-purpose registers 142a and 142b, r0 and r31 denote register numbers; r2 through r30 are omitted from the figure.
The history storage (HSG) 145 is a storage device that records the history of executed instructions; it stores history information for each executed instruction. For each instruction, the history information consists of an instruction address field (Iadr) 1451 that stores the address of the instruction in main memory, a condition code field (CC) 1452 that stores the condition code resulting from execution, fields (DataS1, DataS2) 1453 and 1454 for the operation input data or the contents of the address registers, a field (DataR) 1455 for the operation result, the data loaded from main memory, or the data stored to main memory, a load/store address field (Dadr) 1456 that stores the load source address or store destination address in main memory, a valid count field (Cnt) 1457, and an invalidation field (I) 1458 that indicates the result of store cancellation. The valid count is a counter that records how many times the history information has been used effectively during instruction re-execution, and it is used to decide which history information to delete when the history storage runs out of free space. The instruction address in main memory is registered when a new instruction execution history is added to the history storage. The line for writing the instruction address into the history storage is omitted from the figure.
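The fields of one history storage entry can be pictured as a simple record. The following is only an illustrative sketch in Python: the class name `HistoryEntry`, the attribute names, and the default values are assumptions chosen to mirror the field labels (Iadr, CC, DataS1, DataS2, DataR, Dadr, Cnt, I), not something the specification defines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HistoryEntry:
    """Hypothetical software model of one entry of history storage HSG 145."""
    iadr: int                      # Iadr 1451: instruction address in main memory
    cc: int = 0                    # CC 1452: condition code of the execution result
    data_s1: Optional[int] = None  # DataS1 1453: input data / address register content
    data_s2: Optional[int] = None  # DataS2 1454: input data / address register content
    data_r: Optional[int] = None   # DataR 1455: result, load data, or store data
    dadr: Optional[int] = None     # Dadr 1456: load source / store destination address
    cnt: int = 0                   # Cnt 1457: times the entry was reused effectively
    invalid: bool = False          # I 1458: set when the entry has been invalidated


# Example entry for an add at address 0x1000 whose inputs were 3 and 4.
entry = HistoryEntry(iadr=0x1000, data_s1=3, data_s2=4, data_r=7)
```

A new entry starts with a count of zero and is not invalidated, which matches the idea that the count only grows when the entry is actually reused.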
The history queue (HQG) 144 consists, for each instruction in the instruction queue, of a V field (V) 1441 indicating whether history information exists, a condition code history field (CC) 1442, history fields (DataS1, DataS2) 1443 and 1444 for the operation input data or the contents of the address registers, a history field (DataR) 1445 for the operation result, load result, or store data, and a load/store address history field (Dadr) 1446. When an instruction is registered in the instruction queue, if the history storage 145 holds a history for that instruction, the history is copied from the history storage 145 into the history queue.
Instructions stacked in the instruction queue 143 are pipeline-scheduled and executed in order from the bottom of FIG. 2. The pipeline for an operation instruction has three stages: register read (r), arithmetic operation (a), and register write (w). The pipeline for a load instruction has four stages: address register read (r), address calculation (a), data load (L), and register write (w). The pipeline for a store instruction has four stages: address register read (r), address calculation (a), data read (r), and data store (ST). Instruction word registers 211, 212, 213, and 214 hold the instruction words executed in the first through fourth stages respectively. History registers 232, 233, and 234 correspond to the second through fourth stages of the instruction word registers and hold part of the history information for each stage (the result data or load/store data, and the load/store address). A scheduled instruction word and its history information flow through these registers in order. The line showing data movement from history register 233 to history register 234 is omitted from FIG. 2.
Storage elements 201 through 204 hold, for each stage, a bit (the V bit) indicating whether history exists. Storage elements 262, 263, and 264 hold the bit (the U bit) that indicates the validity of the history, as determined in the first stage, across the second through fourth stages. The movement of U-bit data and of V-bit data between the storage elements of stages 2 through 4 is omitted from FIG. 2. The U bit is "1" when the input data or address register contents in the history are all equal to the input data or address register contents of the corresponding current instruction, and "0" otherwise. The two comparators 241a and 241b (the first and second comparators) are used for the decision when setting the U bit: the U bit is the logical AND of the outputs of these two comparators and the output of the V bit (201). AND circuit 251 computes this AND.
General-purpose registers 142a and 142b hold both fixed-point data and memory addresses. The two register files 142a and 142b hold the same data and send the contents of the register number specified by the instruction word to the fixed-point arithmetic unit 146 through the address/data registers 222a and 222b in the general-purpose arithmetic unit. Register 242 holds the condition code CC generated by an operation, and result register 223 holds the result (Adr/Data) of the second-stage operation or address calculation.
Address register 224a holds, in the third stage, the address generated in the second stage by a load/store instruction. Register 224b holds the data read from memory (main memory or data cache) by a load instruction, or the store data read from a register by a store instruction.
The third comparator 243 is used for a load instruction to compare the load source address of the current instruction with the load source address in the history data. The fourth and fifth comparators 244a and 244b are used for a store instruction to compare, respectively, the current and history values of the write data (store data) and the current and history values of the write address (the store destination address in main memory). AND circuit 254 takes the logical AND of the two comparison results.
The procedure for using and generating history information with this configuration is as follows. First, when an instruction is registered in the instruction queue 143, the history storage 145 is searched using the instruction address 1431 as the key. If an entry is found, the V field 1441 of the corresponding history queue 144 entry is set to "1" and the matching information in the history storage is copied into the information fields of the history queue 144. If no entry is found, the V field is set to "0" to indicate that the instruction has no history data.
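The registration-time lookup can be sketched in a few lines of Python. This is a minimal software model under stated assumptions: the history storage is represented as a dictionary keyed by instruction address, and the function name `enqueue` and the field names are hypothetical.

```python
def enqueue(history_storage, iadr):
    """Build the history-queue entry created when an instruction is enqueued.

    history_storage: dict mapping instruction address -> dict of history fields
    (an assumed stand-in for HSG 145). Returns a dict whose "V" key models the
    V field 1441.
    """
    record = history_storage.get(iadr)
    if record is None:
        # No history for this instruction address: V = 0.
        return {"V": 0}
    # History found: V = 1 and the history fields are copied into the queue entry.
    return {"V": 1, **record}


history_storage = {0x1000: {"data_s1": 3, "data_s2": 4, "data_r": 7}}
hit = enqueue(history_storage, 0x1000)
miss = enqueue(history_storage, 0x2000)
```

Here `hit` carries V = 1 together with a copy of the stored fields, while `miss` carries only V = 0.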
Next, in the first stage, comparators 241a and 241b compare the current values of the input data or addresses with the history values. Specifically, the data in general-purpose registers 142a and 142b designated by the register numbers S1 and S2 in instruction word register 211 are first read out onto lines 2101 and 2102. DataS1 and DataS2 are also read from the history queue 144. Comparator 241a then compares the data on line 2101 with DataS1 and outputs whether they match to AND circuit 251 over line 2103. Likewise, comparator 241b compares the data on line 2102 with DataS2 and outputs whether they match to AND circuit 251 over line 2104. AND circuit 251 takes the logical AND of the results from comparators 241a and 241b and the V value held in 201. When both comparator results indicate a match and the V value in 201 indicates valid, the current values of the input data or addresses are taken to match the history values and the U bit is set to "1".
AND circuit 251 sets the U value, and when U is "1", that is, when the instruction has history data and the current input values equal the history values, the valid count 1457 of the current instruction in the history storage is updated via lines 2105 and 2003.
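The first-stage decision reduces to a three-input AND. The sketch below models it in Python; the function name `u_bit` and its parameters are hypothetical, and the increment of the count by one is an assumption, since the text only says the valid count is "updated".

```python
def u_bit(v, cur_s1, cur_s2, hist_s1, hist_s2):
    """Model of the U-bit decision made in the first stage."""
    match_a = cur_s1 == hist_s1  # comparator 241a: current S1 data vs. DataS1
    match_b = cur_s2 == hist_s2  # comparator 241b: current S2 data vs. DataS2
    return int(bool(v) and match_a and match_b)  # AND circuit 251


# Example: history exists (V = 1) and both inputs equal the recorded inputs.
entry = {"data_s1": 3, "data_s2": 4, "cnt": 0}
u = u_bit(1, 3, 4, entry["data_s1"], entry["data_s2"])
if u == 1:
    entry["cnt"] += 1  # assumed form of the valid count 1457 update
```

If either input differs, or if V is 0, the function returns 0 and the instruction takes the normal execution path.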
The processing for each instruction type when U is "1" is described below.
If the instruction is an operation instruction, the second stage is skipped: the instruction word is sent to the third-stage instruction word register 213, and the history information (CC, DataR, Dadr) is sent from the history queue to history register 233. The history information sent to history register 233 is written (HCC, HDataR, HDadr) to distinguish it from the contents of the history queue HQG. In the third stage, the operation is completed by writing the result data (HDataR) in history register 233 to the designated register number of general-purpose registers 142a and 142b via lines 2301 and 2001.
If the instruction is a store instruction, the second stage is skipped: the instruction word is sent to the third-stage instruction word register 213, and the history information is sent from the history queue 144 to history register 233. In the third stage, the store data is set in register 224b and processing advances to the fourth stage. The store data is the content of general-purpose register 142a or 142b designated by field R of the instruction; for drawing reasons, FIG. 2 labels general-purpose registers 142a and 142b as Gr, and the store data is input from this Gr. In the fourth stage, comparator 244a compares the current value of the store data with the history value HDataR, and when they match the store operation is suppressed. This shortens the execution time of the store instruction. When the comparison does not match, the store is performed normally, and the address (HAdr) in register 234 and the data in register 224b are written to the corresponding entry of the history storage 145 via lines 2406, 2402, and 2002. In addition, if the history storage 145 contains an entry with the same address as the store address, the invalidation field 1458 of that entry is set to "1". This erases the history of earlier store instructions that have the same store address.
If the instruction is a load instruction, the second and third stages are skipped: the instruction word is sent to the fourth-stage instruction word register 214, and the history information (CC, DataR, Dadr) is sent from the history queue to history register 234. In the fourth stage, the load instruction finishes by writing the read data (HDataR) in history register 234 to general-purpose registers 142a and 142b via lines 2403 and 2001, or by sending it to the floating-point arithmetic unit 15 via line 2405 so that it can be written to the floating-point registers 152a and 152b of FIG. 3 (FIG. 2 labels the floating-point arithmetic unit as Fu).
This completes the description of the processing for each instruction type when U is "1".
Next, the processing of each instruction when U is not set in the first stage (U = 0) is described.
For an operation instruction, in the second stage the operation input data in registers 222a and 222b are registered in the history storage 145 via line 2201, the instruction processing section 146 performs the operation, and the result is stored in register 223. In the third stage the result is written from register 223 to general-purpose registers 142a and 142b via lines 2302 and 2001, and is also registered in the history storage 145 via line 2002.
For a load instruction, the load source address is calculated in the second stage and stored in register 223, and in the third stage a data read is started from main memory (including the data cache) at the address held in register 223 into register 224b. At the same time, when V bit 203 is on ("1") and history information exists, comparator 243 compares the current address from register 223 with the history address HAdr from 233. If the load source addresses match, the data read is cancelled (load suppression), the valid count 1457 in the history storage is updated via lines 2304 and 2003, and the fourth-stage U bit 264 is set to "1". In the fourth stage, when U bit 264 is "1", the result data (HDataR) in history register 234 is written to general-purpose registers 142a and 142b via lines 2403 and 2001, or is sent to the floating-point arithmetic unit 15 via line 2405 so that it can be written to the floating-point registers 152a and 152b of FIG. 3. If comparator 243 finds that the load source addresses do not match, the read data obtained in register 224b using the load source address held in register 223 is compared, when V bit 204 is on, with the history data HDataR in history register 234 using comparator 244a. When the two load data match, the valid count 1457 in the history storage is updated via lines 2404 and 2003. The read data in register 224b is written to general-purpose registers 142a and 142b via lines 2402 and 2001, or is sent to the floating-point arithmetic unit 15 so that it can be written to the floating-point registers 152a and 152b of FIG. 3. The address in register 224a and the data in register 224b are also written to the history storage 145 via lines 2401, 2402, and 2002 to update the history information.
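The load path for U = 0 can be summarized as a small function. This is a hedged sketch only: memory is modeled as a Python dictionary, the names `load_u0`, `dadr`, `data_r`, and `cnt` are hypothetical, and the count update is assumed to be an increment by one. It also assumes the history entry has not been invalidated by an intervening store, which this sketch does not model.

```python
def load_u0(memory, hist, base, offset):
    """Model a load whose first-stage U bit was 0.

    hist is the history entry for this load (or None if V is off).
    Returns (value, memory_read_performed).
    """
    addr = base + offset  # stage 2: address calculation into register 223
    if hist is not None and hist["dadr"] == addr:  # comparator 243
        hist["cnt"] += 1                   # valid count 1457 updated
        return hist["data_r"], False       # read cancelled, history data reused
    value = memory[addr]                   # stage 3: normal read into register 224b
    if hist is not None:
        if hist["data_r"] == value:        # comparator 244a: same data, new address
            hist["cnt"] += 1
        hist["dadr"], hist["data_r"] = addr, value  # history information updated
    return value, True


memory = {100: 42, 200: 5}
hist = {"dadr": 100, "data_r": 42, "cnt": 0}
same_addr = load_u0(memory, hist, 96, 4)     # address 100 matches the history
other_addr = load_u0(memory, hist, 190, 10)  # address 200 does not match
```

The first call returns the recorded value without touching `memory`; the second performs the read and refreshes the recorded address and data.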
For a store instruction, the arithmetic unit 146 calculates the address in the second stage, and in the third stage the calculated address is set in register 224a and the store data from the general-purpose register designated by the store instruction is set in register 224b. In the fourth stage, comparator 244a compares the current and history values of the data, comparator 244b compares the current and history values of the address, and AND circuit 254 takes the logical AND of the two results. When the current and history values match for both address and data, the store data is already present at the store destination address, so the store operation is suppressed. Suppressing the store shortens the execution time of the store instruction. When the current and history values do not both match, the store is performed normally, and the address and data are written to the corresponding entry of the history storage 145 via lines 2401, 2402, and 2002. In addition, if the history storage 145 contains an entry whose data address field 1456 matches the store address, the invalidation field 1458 of that entry is set to "1".
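Store suppression and invalidation can be illustrated together. The sketch below is a software model under assumptions: the history storage is a dictionary keyed by instruction address, the function name `store` is hypothetical, and treating an invalidated entry as a miss is an assumed interpretation, since the text does not spell out how the invalidation field is consulted.

```python
def store(memory, history, iadr, addr, data):
    """Model a store instruction; return True if memory was actually written."""
    entry = history.get(iadr)
    if (entry is not None and not entry["invalid"]
            and entry["dadr"] == addr       # comparator 244b: address matches
            and entry["data_r"] == data):   # comparator 244a: data matches
        return False                        # AND circuit 254 true: store suppressed
    memory[addr] = data                     # normal store
    # Invalidate histories of other store instructions that used this address.
    for other_iadr, other in history.items():
        if other_iadr != iadr and other["dadr"] == addr:
            other["invalid"] = True
    # Register (or refresh) the history of this store instruction.
    history[iadr] = {"dadr": addr, "data_r": data, "invalid": False}
    return True


memory, history = {}, {}
first = store(memory, history, 0x10, 100, 7)   # no history yet: store performed
second = store(memory, history, 0x10, 100, 7)  # same address and data: suppressed
third = store(memory, history, 0x20, 100, 9)   # another store overwrites address 100
fourth = store(memory, history, 0x10, 100, 7)  # history was invalidated: store performed
```

The fourth call shows why invalidation matters in this model: without it, the store at 0x10 would be suppressed even though address 100 now holds a different value.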
This completes the description of the processing of each instruction when U is not set in the first stage (U = 0).
In FIG. 2, if no free entry is available when registering into the history storage 145, entries are deleted in ascending order of valid count 1457 to make room, so that useful history information remains. As a result, history information that is reused frequently stays in the history storage and history information that is reused rarely is deleted first, which makes effective use of the history storage area.
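This replacement policy amounts to evicting the least-used entry. The following Python sketch shows one way to model it; the function name `register`, the `capacity` parameter, and the dictionary representation are assumptions made for illustration.

```python
def register(history, capacity, iadr, record):
    """Add a history record, evicting the entry with the smallest count if full."""
    if iadr not in history and len(history) >= capacity:
        # Victim: the entry whose valid count (Cnt 1457) is smallest.
        victim = min(history, key=lambda k: history[k]["cnt"])
        del history[victim]
    history[iadr] = record


history = {
    0x100: {"cnt": 5},  # reused often
    0x200: {"cnt": 0},  # never reused
}
register(history, 2, 0x300, {"cnt": 0})
```

After the call, the entry at 0x200 has been removed and the frequently reused entry at 0x100 is kept alongside the new one.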
FIG. 3 likewise shows the detailed configuration of the floating-point arithmetic unit 15. Because this unit does not handle load/store instructions, it is simpler than FIG. 2.
First, the reference numerals in FIG. 3 are as follows. 301, 302, and 303 are the stage V bits in the floating-point arithmetic unit. 311, 312, and 313 are instruction word registers that hold the instruction word of each stage in the floating-point arithmetic unit. 322a, 322b, and 323 are data registers in the floating-point arithmetic unit. 332 and 333 are stage history registers in the floating-point arithmetic unit. 341a and 341b are comparators in the floating-point arithmetic unit. 342 is the condition code register in the floating-point arithmetic unit. 351 is the AND circuit in the floating-point arithmetic unit. 362 and 363 are the stage history valid bits in the floating-point arithmetic unit. 3001, 3002, 3105, 3106, 3201, 3301, 3302, and 3303 are lines in the floating-point arithmetic unit.
The instruction queue (IQF) 153 is a storage means that temporarily holds instructions. For each instruction it consists of an instruction address field (Iadr) 1531, an instruction code field (Op) 1532, fields (S1, S2) 1533 and 1534 that hold the register numbers of the operation input registers, and a field (R) 1535 that holds the number of the register that stores the operation result. The register number values in fields 1533, 1534, and 1535 are used to designate registers in floating-point registers 152a and 152b.
The history storage (HSF) 155 is a storage means that records execution history. It stores history information registered for each instruction, and for each instruction consists of an instruction address field (Iadr) 1551, a condition code field (CC) 1552, operation input data fields (DataS1, DataS2) 1553 and 1554, an operation result field (DataR) 1555, and a valid count (Cnt) field 1557.
The history queue (HQF) 154 consists, for each instruction in the instruction queue, of a V field 1541 indicating whether history information exists, a condition code history field (CC) 1542, history fields (DataS1, DataS2) 1543 and 1544 for the operation input data, and a history field (DataR) 1545 for the operation result. When an instruction is registered in the instruction queue, if the history storage 155 holds a history for that instruction, the history is copied from the history storage 155 into the history queue.
The pipeline has three stages: floating-point register read (fr), floating-point operation (fa), and floating-point register write (fw). Instruction word registers 311, 312, and 313 hold the instruction word of each stage. The history registers hold only the condition code and the value of the result register; history register 332 holds the history of the instruction in the second stage, and history register 333 holds the history data of the instruction in the third stage. Comparators 341a and 341b are used to compare the operation input data with the history values from the history queue.
When a floating-point operation instruction is executed, each component shown in FIG. 3 operates in the same way as the corresponding component of FIG. 2 does when an operation instruction is executed there.
FIG. 4 shows the detailed configuration of the branch operation unit 13.
First, the reference numerals in FIG. 4 are as follows. 401 and 402 are the stage V bits in the branch operation unit; 411 and 412 are instruction word registers that hold the instruction word of each stage in the branch operation unit; 431 and 432 are the stage history registers in the branch operation unit; 441 and 452 are condition code registers in the branch operation unit; 442 is the branch-taken register; 461 is the branch address adder; 462 is the branch address register; 471 is the comparator in the branch operation unit; 472 is the history valid bit of the branch operation unit; and 4002 and 4003 are lines in the branch operation unit.
Each row of the instruction queue (IQB) 133 consists of an instruction address field (Iadr) 1331, an instruction code field (Op) 1332, a branch mask field (M) 1333, and a branch-target relative address field (Tadr) 1334. The history queue (HQB) 134 holds, for each instruction in the instruction queue, a V field 1341 indicating whether history information is present, a condition code field (CC) 1342, and a branch target address (Badr) 1343. The history storage (HSB) 135 consists of an instruction address field (Iadr) 1351, a condition code field (CC) 1352, a branch target address (Badr) 1353, and a valid count field (Cnt) 1354.
The pipeline has two stages: branch address calculation (ba) and branch (bb). Instruction registers 411 and 412 hold the instruction word of each stage, and history registers 431 and 432 hold the history information corresponding to those instructions.
In the first stage, the condition code (CC) 441 sent from the general-purpose operation unit (Gu) 14 or the floating-point operation unit (Fu) 15 is compared with the mask of the instruction word in instruction register 411 using comparator 451; this determines whether the branch is taken, and the result is set in register 442. In parallel, the branch address is determined from the instruction address and the branch-target relative address using address adder 461 and is set in branch address register 462. In addition, the condition code (HCC) in history register 431 is compared with the condition code in register 441 using comparator 471, and the result is set in the history valid bit (U bit) 472.
In the second stage, when the branch is taken, the condition code in register 452 and the branch address in register 462 are registered in the history storage 135 via line 4002. When the U bit 472 is "1", the valid count (Cnt) field 1354 of the corresponding instruction in the history storage 135 is updated via line 4003. Furthermore, when the branch is taken, the branch address in register 462 is sent to the instruction cache 11 so that the branch target instruction can be read.
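The two branch stages can be summarized in a short sketch. Everything here is illustrative: the function names are hypothetical, the history storage is modelled as a dictionary, and the rule that the branch is taken when the condition code and the mask share a set bit is an assumption, since the text only says that the two are compared by comparator 451.

```python
def branch_stage1(cc, mask, iadr, tadr, hist_cc):
    """Stage ba: decide the branch, form the target address, and check the history."""
    taken = (cc & mask) != 0          # assumed mask rule (role of comparator 451)
    baddr = iadr + tadr               # role of address adder 461
    u_bit = 1 if hist_cc == cc else 0  # role of comparator 471; hist_cc is None if no history
    return taken, baddr, u_bit


def branch_stage2(storage, iadr, cc, baddr, taken, u_bit):
    """Stage bb: register history, update the valid count, and emit the fetch address."""
    if taken:
        # Register the condition code and branch address (line 4002 in the text).
        row = storage.setdefault(iadr, {"cc": cc, "badr": baddr, "cnt": 0})
        row["cc"], row["badr"] = cc, baddr
    if u_bit and iadr in storage:
        # The history matched, so bump the valid count (line 4003 in the text).
        storage[iadr]["cnt"] += 1
    # Address sent to the instruction cache when the branch is taken.
    return baddr if taken else None
```

On the first pass hist_cc is None, so the U bit stays 0 and only the registration happens; on a later pass with the same condition code the valid count is incremented.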
The execution of an instruction sequence using the configuration described above is explained with reference to the example instruction sequence shown in FIG. 5. The example sequence itself is described first.
In the instruction sequence of FIG. 5, the leftmost column gives the instruction number ((1), (2), ..., (14)), the column to its right gives the branch target label (L0), the next column gives the instruction code (LDI, SUB, and so on), and the following columns give the operands (1, %r9, and so on). In the operand columns, %r designates a general-purpose register and %fr designates a floating-point register. For operation and load instructions, the register written at the right end is the register in which the result is placed; the other registers are operation inputs, store data, or address-specifying registers. A numeric value means that the data or address is specified directly in the instruction word.
Instructions (1), (2), (4), (6), (7), (8), and (11) are fixed-point operation instructions and are executed by the general-purpose operation unit 14. Instructions (3) and (14) are fixed-point load instructions and are likewise executed by the general-purpose operation unit. Instructions (5), (9), and (12) are floating-point load/store instructions and are executed cooperatively by the general-purpose operation unit 14 and the floating-point operation unit 15. Instruction (10) is a floating-point operation instruction and is executed by the floating-point operation unit 15. Finally, instruction (13) is a compare-and-branch instruction and is executed cooperatively by the general-purpose operation unit 14 and the branch operation unit 13.
The function of each instruction is as follows.
The LDI (Load Immediate) instructions (1) and (2) transfer an immediate value to the result register. Instruction (1), for example, transfers the value 1 to general-purpose register 9. The LDW (Load Word) instructions (3) and (14) read one word of fixed-point data from the memory address given by the sum of the contents of general-purpose register 30 and the immediate value -604, and set it in general-purpose register 31. The LDO (Load Offset) instructions (4), (7), and (11) add an immediate offset to the contents of the address register and set the sum in the result register. Instruction (4), for example, subtracts 1 from the contents of general-purpose register 9 and sets the result in general-purpose register 16. The FLDDX (Floating Load Double Indexed) instructions (5) and (9) read floating-point data from the memory address obtained by adding the contents of two address registers, and set it in the result register. Instruction (5), for example, reads data from the address obtained by adding the contents of general-purpose register 16 to the contents of general-purpose register 22, and sets it in floating-point register 8. Instruction (6) is a fixed-point subtraction instruction; in this example it subtracts the contents of general-purpose register 23 from the contents of general-purpose register 8 and sets the result in general-purpose register 17. Instruction (8) is an addition instruction. Instruction (10) multiplies the contents of floating-point register 7 by the contents of floating-point register 8 and sets the result in floating-point register 13. The FSTDX (Floating Store Double Indexed) instruction (12) stores the contents of floating-point register 13 at the memory address obtained by adding the contents of general-purpose register 20 and general-purpose register 26. The COMB (Compare and Branch) instruction (13) compares the contents of general-purpose register 9 and general-purpose register 11 and, when the latter is greater than or equal to the former, branches to instruction (4), which carries the label L0. This instruction sequence therefore forms a loop. Instruction (14) is in the delay slot of instruction (13): although it appears after instruction (13), it is also executed when the branch in (13) is taken.
In FIG. 5, an instruction marked "au" in the address invariance column is a load/store instruction whose memory address does not change from one loop iteration to the next; this is evident from the fact that the contents of its address registers are not rewritten inside the loop. An instruction marked "u" or "fu" in the data invariance column is one whose result register contents do not change as the loop proceeds. For instruction (5) this is taken as an assumption; for the others it follows from that assumption, from address invariance, and from the way data is passed between the instructions.
FIG. 6 shows the pipeline stage flow when the instruction sequence of FIG. 5 is executed with the configuration of the present invention. In FIG. 6, the horizontal axis shows time running from left to right, and the vertical axis shows the instruction numbers. Taking the start of the loop at instruction (4) as the origin, the stage flow of the first iteration and the stage flow of the second and subsequent iterations are shown overlaid. Instructions in the second and subsequent iterations are distinguished by a prime mark (') appended to the instruction number. Execution speeds up partway through the second and subsequent iterations because history information is used. The labels r, a, w, L, ST, fr, fa, fw, ba, and bb in FIG. 6 denote the stages introduced in the descriptions of FIG. 2, FIG. 3, and FIG. 4. The details are explained below.
During the first iteration of the loop, no instruction has history information, so the V fields 1341, 1441, and 1541 of the history queues are all "0" and no stage is skipped.
As the stages advance, history is written to the history storages 145 and 155 for operation instructions: the operation input data in the second stage via lines 2201 and 3201, and the operation result data and condition code in the third stage via lines 2302, 2303, 3302, and 3303. For load/store instructions, the memory address is written to the history storage 145 in the fourth stage via line 2401, and the load/store data via line 2402. For branch instructions, the condition code and the branch target address are written to the history storage 135 in the second stage via line 4002.
History information is written to the history storage 145 of the general-purpose operation unit for fixed-point operation instructions and load/store instructions, to the history storage 155 of the floating-point operation unit for floating-point operation instructions, and to both history storages 145 and 135 of the general-purpose operation unit and the branch operation unit for compare-and-branch instructions.
When these instructions are executed, the address of the instruction in main memory is also written to the history storage: the Iadr of the stage that writes the history (the Iadr held in 212, 213, or 214) is written. The address is written when history data for the instruction is first written to the history storage; when further history data for the same instruction is added later, it is added to the row corresponding to this address. The lines used to write the instruction address to the history storage are omitted from the figures.
In FIG. 6, instruction (11) is executed ahead of instruction (10) because instruction (11) is a fixed-point operation instruction and instruction (10) is a floating-point operation instruction, so the two are executed in different operation units, and because instruction (10) uses the execution result of instruction (9). Instruction (12), which is executed first in the general-purpose operation unit and then in the floating-point operation unit, enters a wait state partway through because it uses the execution result of instruction (10), which is still executing in the floating-point operation unit.
When the first iteration of the loop finishes, control returns to instruction (4)' and the next iteration begins. In the first stage (r) of operation instruction (4)', comparator 241a compares, for general-purpose register 9, the history value from history queue field 1443 with the current value read from general-purpose register 142a. The contents of general-purpose register 9 were updated by instruction (11) after the history value was stored in the previous iteration, so the history value and the current value do not match. The output of the AND 251 is therefore "0"; the valid bit 212 is set to "0", the instruction word 211 of the first stage is sent to the second stage 212, and normal stage operation follows.
In executing the following floating-point load instruction (5)', the first stage compares the history value and the current value of general-purpose register 16 with comparator 241a, and the history value and the current value of general-purpose register 22 with comparator 241b. The current value of general-purpose register 16 was updated by the immediately preceding instruction (4)' using the value of general-purpose register 9 and so differs from the history value. The output of AND circuit 251 is therefore "0", and all stages are executed, as for the preceding instruction (4)'. In the fourth stage, the data read into 224b is compared with the history value in history register 234 using comparator 244a; because the two are equal, the value of the valid count field 1457 in the row for instruction (5) in the history storage 145 is incremented by one via lines 2404 and 2003. The data in data register 224b is sent to the floating-point operation unit (Fu) via line 2402 and is written to floating-point registers 152a and 152b via lines 3302 and 3001 of FIG. 3.
For the next operation instruction (6)', the first stage compares the current value and the history value of general-purpose register 8, and the current value and the history value of general-purpose register 23. Both comparisons match, and because this is the second execution of the instruction the V bit 201 is "1", so the output of AND circuit 251 is "1". The instruction word in the first-stage instruction word register 211 and the history values from the history queue skip the second stage and are moved to the third-stage instruction word register 213 and history register 233. The valid count 1457 in the row for instruction (6) in the history storage 145 is incremented via lines 2105 and 2003. In the third stage, the load stage of the preceding load instruction is still in progress, so the instruction waits one cycle; it then writes the result register value (HdataR) from history register 233 to general-purpose registers 142a and 142b via lines 2301 and 2001, and completes.
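The skip decision described for instructions (4)', (5)', and (6)' amounts to an AND of the V bit and the two comparator outputs. The following is a minimal sketch of that decision for a three-stage operation instruction; the function names are hypothetical, the stage labels r, a, w are taken from the FIG. 6 legend, and the register values used in the usage note are invented for illustration.

```python
def history_hit(v_bit, current_inputs, history_inputs):
    """AND of the V bit with the per-operand comparator outputs."""
    comparators = [cur == hist for cur, hist in zip(current_inputs, history_inputs)]
    return v_bit == 1 and all(comparators)


def stages_for_operation(v_bit, current_inputs, history_inputs):
    """Return the stages an operation instruction passes through."""
    if history_hit(v_bit, current_inputs, history_inputs):
        # Inputs unchanged since the recorded execution: the operation stage
        # is skipped and the recorded result is written back directly.
        return ["r", "w"]
    # No history, or an input changed: run every stage as usual.
    return ["r", "a", "w"]
```

With invented values, stages_for_operation(1, (40, 8), (40, 8)) gives the shortened flow of instruction (6)', while stages_for_operation(1, (2, 8), (1, 8)) gives the full flow of instruction (4)', whose input register changes on every iteration.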
The following operation instructions (7)' and (8)' are not affected by a preceding instruction, so they behave in the same way as operation instruction (6)' except that they do not wait the extra cycle. Compared with instructions (7) and (8) in the first iteration, instructions (7)' and (8)' in the second iteration skip the second stage, so their processing time is shorter.
For floating-point load instruction (9)', the first stage of FIG. 2 compares the current value and the history value of general-purpose register 18, and the current value and the history value of general-purpose register 31. Both comparisons match, and because this is the second execution of the load instruction the V bit 201 is "1", so the output of AND circuit 251 is "1". The valid count 1457 of the history storage is updated via lines 2105 and 2003, after which the second and third stages are skipped and the instruction proceeds to the fourth stage. The instruction word in the first-stage instruction word register 211 and the history values from the history queue skip the second and third stages and are moved to the fourth-stage instruction word register 214 and history register 234. In the fourth stage, because the instruction is a load and the history valid bit 264 is "1", the result value (HdataR) in history register 234 is sent to the floating-point operation unit (Fu) 15 via line 2405. The floating-point operation unit 15 of FIG. 3 writes it to floating-point registers 152a and 152b via lines 3302 and 3001.
The next operation instruction (10)' is executed by the floating-point operation unit 15 of FIG. 3. In the first stage (fr), comparator 341a compares the current value and the history value of floating-point register 7, and comparator 341b compares the current value and the history value of floating-point register 8. Both comparisons match, and because this is the second execution of the instruction the V bit 301 is "1", so the output of AND circuit 351 is "1". The valid count 1557 of the history storage 155 is updated via line 3105, after which the second stage is skipped and the instruction proceeds to the third stage. The instruction word in the first-stage instruction word register 311 and the history values from the history queue skip the second stage and are moved to the third-stage instruction word register 313 and history register 333. In the third stage, because the history valid bit 363 is "1", the result value (HdataR) in history register 333 is written to floating-point registers 152a and 152b via lines 3301 and 3001.
Operation instruction (11)' is executed by the general-purpose operation unit 14. The contents of its input register 9 are incremented on every iteration, so the current value never equals the history value and all stages are executed.
Floating-point store instruction (12)' is executed by the general-purpose operation unit 14. In the first stage (r), the current value and the history value of general-purpose register 26 are compared, as are the current value and the history value of general-purpose register 20. Both comparisons match and the V bit is "1", so the output of AND circuit 251 is "1". As a result, the valid count field 1457 of the history storage 145 is updated, the second stage is skipped, and the instruction proceeds to the third stage (fr). The instruction word in the first-stage instruction word register 211 and the history values from history queue 144 skip the second stage and are moved to the third-stage instruction word register 213 and history register 233. In the third stage (fr), the store data value of floating-point register 13 is read via line 3106 of the floating-point operation unit and set in data register 224b of the general-purpose operation unit. In the following fourth stage, comparator 244a compares the store data in data register 224b with the history store data (HdataR) in history register 234; because the two are equal, the valid count field 1457 of the history storage 145 is updated via lines 2404 and 2003, and the data store operation is suppressed through circuit 254. If the comparison by comparator 244a does not match, the current store data differs from the previous store data, so the store operation is performed. In addition, to deal with history information being invalidated by the store operation of this store instruction, the address information (HAdr) in history register 234 is sent to the history storage 145 via lines 2406 and 2002; for every row whose data address field 1456 holds the same value as HAdr, the invalidation field 1458 is set to "1", which invalidates that row.
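The fourth-stage handling of the store can be sketched as follows. This is an illustrative model only: the function name, the dictionary-based memory, and the row layout with "dadr" and "invalid" keys are hypothetical stand-ins for the data address field 1456 and the invalidation field 1458, and the valid count update is left out to keep the example short.

```python
def store_stage4(memory, history_rows, hadr, store_data, hist_store_data):
    """Return True if the store was performed, False if it was suppressed."""
    if hist_store_data is not None and store_data == hist_store_data:
        # Same data as the recorded store to the same address: memory already
        # holds this value, so the store is suppressed.
        return False
    # Data differs (or no history): perform the store.
    memory[hadr] = store_data
    # History rows recorded for this data address may now be stale, so they
    # are marked invalid.
    for row in history_rows:
        if row["dadr"] == hadr:
            row["invalid"] = 1
    return True
```

Suppressing the store is safe in this model because the address and the data both match the recorded execution, so the write would not change memory.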
The following instruction (13)' is a compare-and-branch instruction, so the general-purpose operation unit 14 and the branch operation unit 13 execute it cooperatively. In the first stage of the general-purpose operation unit 14, the current value and the history value of general-purpose register 9 are compared, as are the current value and the history value of general-purpose register 11. Because the current value of general-purpose register 9 differs from its history value, the stages proceed normally. In the second stage, the fixed-point operation circuit 146 performs the comparison and sets the resulting condition code (CC) in register 242. In the third stage, the condition code in register 242 is written to the condition code field 1452 of the history storage 145 via lines 2303 and 2002 and, at the same time, is sent to the branch operation unit (Bu) 13. In the branch operation unit of FIG. 4, the first stage (ba) receives the condition code in register 441 and compares it with the mask field of instruction word 411 in comparator 451; the resulting information 442 determines whether the branch is taken. Comparator 471 also compares the condition code with the condition code (HCC) in history register 431, and because the two are the same, the valid bit U 472 is set to "1". In the following second stage (bb), the valid count field 1354 of the history storage 135 is updated via line 4003, and execution branches to the head of the loop.
Instruction (14)', in the delay slot, is a fixed-point load instruction. In the first stage of FIG. 2, the current value of general-purpose register 30 is compared with its history value; because the two are equal, the valid count field 1457 of the history storage 145 is updated via lines 2105 and 2003, the second and third stages are skipped, and the instruction proceeds to the fourth stage. In the fourth stage, the result data (HDataR) in history register 234 is written to general-purpose registers 142a and 142b via lines 2403 and 2001.
Thus, as the thick lines in FIG. 6 show, the second and subsequent iterations of the loop execute the instruction sequence faster than the first iteration by skipping stages with the help of the history data.
When there is no free space for a new write to the history storage, rows are deleted in ascending order of valid count to make room. This control keeps in the history storage, as far as possible, the histories of those instructions whose history has proved valid most often. An instruction that has no history at execution time has "0" in the V field 1341, 1441, or 1541 of the history queue; it is executed without skipping any stage, and its history information is registered in the history storage.
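The replacement rule for a full history storage can be sketched in a few lines. The function name, the dictionary layout, and the capacity parameter are hypothetical; the text does not say how ties between equal valid counts are resolved, so the sketch simply removes the first row with the smallest count.

```python
def register_history(storage, capacity, iadr, data):
    """Register history for instruction address iadr, evicting if the storage is full."""
    if iadr not in storage and len(storage) >= capacity:
        # Evict the row whose history has been valid least often
        # (ties broken by insertion order, an assumption of this sketch).
        victim = min(storage, key=lambda adr: storage[adr]["cnt"])
        del storage[victim]
    row = storage.setdefault(iadr, {"cnt": 0})
    row["data"] = data
    return row
```

A row that has repeatedly allowed stages to be skipped accumulates a high count and therefore survives, while a row that never matched is the first to go.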
The history storage is read and written simultaneously by the unit that registers instructions in the instruction queue and by each stage of instruction execution. Concentration of accesses can be prevented by dividing the history storage, according to instruction address, into at least (number of instruction stages + 1) banks (5 in this example).
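One way to picture the bank division is a simple interleave on the instruction address. The sketch below assumes fixed 4-byte instructions and modulo interleaving; neither is stated in the text, which only requires at least (stage count + 1) banks selected by instruction address. The names `bank_of`, `NUM_BANKS`, and `INSTR_BYTES` are hypothetical.

```python
NUM_STAGES = 4               # pipeline stages in this example
NUM_BANKS = NUM_STAGES + 1   # at least stages + 1 banks (5 here)
INSTR_BYTES = 4              # assumed fixed instruction width


def bank_of(instr_addr):
    """Select a history-storage bank from the instruction address."""
    return (instr_addr // INSTR_BYTES) % NUM_BANKS


# Five consecutive instructions (one being registered in the queue and
# four in execution stages) map to five different banks, so their
# simultaneous accesses do not collide.
banks = [bank_of(0x1000 + INSTR_BYTES * i) for i in range(NUM_BANKS)]
```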
In a multiprocessor configuration, for example, when another processor stores to a shared storage area, the row of the history storage whose data address 1456 matches the address of that store is invalidated, in the same way as for a store from the local processor. If the store address is unknown, all rows of the history storage relating to loads and stores are invalidated.
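The invalidation rule can be sketched as below. This is an assumption-laden illustration: the row layout (a dict with `kind`, `data_addr`, and `valid` keys) and the function name `invalidate_on_store` are hypothetical, and `None` is used to represent an unknown store address.

```python
def invalidate_on_store(rows, store_addr=None):
    """Clear the valid flag of load/store history rows hit by a store.

    rows: list of dicts with keys 'kind', 'data_addr', and 'valid'.
    store_addr: address written by the store, or None when it is unknown.
    Returns the number of rows invalidated.
    """
    invalidated = 0
    for row in rows:
        # Only valid rows recorded for load or store instructions are affected.
        if row["kind"] not in ("load", "store") or not row["valid"]:
            continue
        # Unknown address: invalidate every load/store row.
        # Known address: invalidate only rows whose data address matches.
        if store_addr is None or row["data_addr"] == store_addr:
            row["valid"] = False
            invalidated += 1
    return invalidated
```

Rows recorded for arithmetic instructions are left untouched in both cases, since they do not depend on memory contents.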
In FIG. 1 the history storage is distributed among the arithmetic units, but it may instead be placed outside arithmetic units 13, 14, and 15, like the history cache (history storage) 17 in FIG. 7, and held in correspondence with the instructions in instruction cache 11. In that case, replacement of the history information held in the history storage (registering history data in the history storage, deleting history information from it, and so on) is managed together with the instruction cache. When an instruction sequence is fetched into the instruction buffer (IBUF) 121 of instruction control unit 12, the corresponding history information is first fetched into the history buffer (HBUF) 122 and then distributed to the history queues 134 (HQB), 144 (HQG), and 154 (HQF) of the arithmetic units. History information updated as a result of instruction execution is returned from the output history queues 137 (HQBO), 147 (HQGO), and 157 (HQFO) to history storage 17 via the history buffer (HBUF) 122.
The present invention is not limited to the instruction format shown in FIG. 5. It can be applied to instructions of any format, provided that the input registers, address registers, result register, immediate values, and the like can be identified. The instruction word format may also be one in which a single instruction specifies multiple operations, as in a VLIW (Very Long Instruction Word) computer.
FIG. 1 shows an example with three arithmetic units, but there may be more or fewer. Moreover, because the essence of the invention is to associate history information with each instruction and skip execution of some of the stages, the invention also applies to configurations in which each unit has multiple instruction queues or arithmetic elements, as in multi-pipeline and superscalar computers.
According to the present invention, as shown in FIG. 6, the number of pipeline stages required to execute each instruction can be reduced, and as a result the execution time of an instruction sequence can be shortened. In addition, because unnecessary load and store operations on memory are suppressed, the load of storage management such as cache control and address translation is reduced, which in turn shortens the average execution time of load and store instructions. Furthermore, because the history of an instruction is used when its instruction address in main storage matches, the history information has a high reuse rate and can be used effectively.

Industrial Applicability

As described above, the instruction processing device according to the present invention is useful for devices that read instructions from a storage device by specifying an address and execute them, and is suited to instruction processing devices that process instructions in which the input registers, address registers, result register, immediate values, and the like can be identified.

Claims

Scope of Claims
1. An instruction processing device for processing instructions specified by addresses, comprising: a history storage device that stores, as history information, the content of the data processed by an instruction when the instruction is executed, paired with the address of the instruction; recording means for recording the history information in the history storage device when an instruction is executed; re-execution means for executing an instruction, when an instruction at the same address is executed again, by using the content of the history information stored in the history storage device; and updating means for updating the content of the history information stored in the history storage device when the instruction at the same address is re-executed.
2. The instruction processing device according to claim 1, wherein, when the instruction is an arithmetic instruction, the data includes the input data of the operation and the result data of the operation, and the re-execution means comprises: a comparator that compares the current input data of the arithmetic instruction with the input data of the arithmetic instruction recorded in the history storage device; and means for using the result data recorded in the history storage device as the current result data of the arithmetic instruction when the comparator indicates a match.
3. The instruction processing device according to claim 1, wherein, when the instruction is a store instruction, the data includes the store destination address accessed by the store instruction and the store data, and the re-execution means comprises: a first comparator that compares the current store destination address of the store instruction with the store destination address of the store instruction recorded in the history storage device; a second comparator that compares the current store data of the store instruction with the store data of the store instruction recorded in the history storage device; and means for suppressing the store operation of the current store instruction when the first comparator and the second comparator both indicate a match.
4. The instruction processing device according to claim 1, wherein the re-execution means comprises means for executing the current store instruction when the first comparator and the second comparator do not both indicate a match, and in that case the updating means invalidates the data in the history storage device having the store destination address.
5. The instruction processing device according to claim 1, wherein, when the instruction is a store instruction, the data includes the contents of the address register group that forms the store destination address and the store destination address, and the re-execution means comprises: comparison means for comparing the current contents of the address register group with the contents of the address register group recorded in the history storage device; and means for suppressing calculation of the store destination address and using the store destination address recorded in the history storage device when the comparison means indicates a match.
6. The instruction processing device according to claim 5, wherein the re-execution means comprises means for executing the current store instruction when the comparison means does not indicate a match, and in that case the updating means invalidates the data in the history storage device having the store destination address.
7. The instruction processing device according to claim 1, wherein, when the instruction is a load instruction, the data includes the load source address accessed by the load instruction and the load data, and the re-execution means comprises: a first comparator that compares the current load source address of the load instruction with the load source address of the load instruction recorded in the history storage device; and means for using the load data recorded in the history storage device as the load data obtained by executing the current load instruction when the first comparator indicates a match.
8. The instruction processing device according to claim 1, wherein, when the instruction is a load instruction, the data includes the load data and the contents of the address register group that forms the load source address, and the re-execution means comprises: comparison means for comparing the current contents of the address register group of the load instruction with the contents of the address register group of the load instruction recorded in the history storage device; and means for using the load data recorded in the history storage device as the load data resulting from the load operation of the current load instruction when the comparison means indicates a match.
9. The instruction processing device according to claim 1, wherein the history information recorded in the history storage device has a counter that counts the number of times the re-execution means has executed an instruction using that history information, and the updating means comprises: means for updating the counter when the re-execution means executes an instruction using the history information; and means for deleting history information in ascending order of counter value when the storage area of the history storage device is insufficient.
10. The instruction processing device according to claim 9, wherein, when the history storage device holds no history information for an instruction at the time the instruction is executed, the instruction processing device executes the instruction without using history and newly registers history information for the instruction in the history storage device.
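As an informal illustration of the store handling recited in claims 3 and 4, the sketch below models the two comparisons and the resulting suppression or invalidation in software. It is not the claimed apparatus: the function name `execute_store`, the dictionary-based history, and the choice to record a fresh entry after a mismatch are assumptions made for the example.

```python
def execute_store(history, instr_addr, store_addr, store_data, memory):
    """Run one store; return True if the memory write was suppressed."""
    entry = history.get(instr_addr)
    # First comparator: current store address vs. recorded store address.
    addr_match = entry is not None and entry["addr"] == store_addr
    # Second comparator: current store data vs. recorded store data.
    data_match = entry is not None and entry["data"] == store_data
    if addr_match and data_match:
        return True  # same value to the same address: suppress the store
    memory[store_addr] = store_data
    # Invalidate every history entry recorded for this destination address.
    for key in [k for k, e in history.items() if e["addr"] == store_addr]:
        del history[key]
    # Record the history of this store (an assumption of this sketch).
    history[instr_addr] = {"addr": store_addr, "data": store_data}
    return False
```

Because any store that actually writes an address removes the entries recorded for that address, a later match on both comparators implies that memory still holds the recorded value, which is what makes suppression safe in this model.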
PCT/JP1996/002633 1996-09-13 1996-09-13 Command processor having history memory WO1998011484A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP1996/002633 WO1998011484A1 (en) 1996-09-13 1996-09-13 Command processor having history memory

Publications (1)

Publication Number Publication Date
WO1998011484A1 true WO1998011484A1 (en) 1998-03-19

Family

ID=14153818

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP1996/002633 WO1998011484A1 (en) 1996-09-13 1996-09-13 Command processor having history memory

Country Status (1)

Country Link
WO (1) WO1998011484A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS60129839A (en) * 1983-12-19 1985-07-11 Hitachi Ltd Information processor
JPS6284340A (en) * 1985-05-07 1987-04-17 Hitachi Ltd Data processor
JPH01255932A (en) * 1988-04-05 1989-10-12 Fujitsu Ltd Instruction processor
JPH0271328A (en) * 1988-09-07 1990-03-09 Nec Corp Control system for branching history table
JPH0667880A (en) * 1992-08-18 1994-03-11 Nec Corp Branch history table control circuit

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003005954A (en) * 2001-06-25 2003-01-10 Pacific Design Kk Data processor and method for controlling the same
JP2006215672A (en) * 2005-02-02 2006-08-17 Nec Corp Information processor, its method and program
US9919060B2 (en) 2009-05-01 2018-03-20 University Court Of The University Of Dundee Treatment or prophylaxis of proliferative conditions
WO2016200501A1 (en) * 2015-05-29 2016-12-15 Intel Corporation Source operand read suppression for graphics processors
US10152452B2 (en) 2015-05-29 2018-12-11 Intel Corporation Source operand read suppression for graphics processors
TWI725023B (en) * 2015-05-29 2021-04-21 美商英特爾股份有限公司 Apparatus, computer-implemented method and machine-readable storage medium for source operand read suppresion for graphics processors
US11760773B2 (en) 2018-02-02 2023-09-19 Maverix Oncology, Inc. Small molecule drug conjugates of gemcitabine monophosphate
US10942851B2 (en) * 2018-11-29 2021-03-09 Intel Corporation System, apparatus and method for dynamic automatic sub-cacheline granularity memory access control

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CN JP KR US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 09254708

Country of ref document: US

122 Ep: pct application non-entry in european phase