WO2012144374A1 - Data processor - Google Patents
- Publication number
- WO2012144374A1, PCT/JP2012/059757, JP2012059757W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- instruction
- code
- pattern
- instruction code
- codes
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
- G06F9/30058—Conditional branch instructions
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
- G06F9/30149—Instruction analysis, e.g. decoding, instruction word fields of variable length instructions
- G06F9/3017—Runtime instruction translation, e.g. macros
- G06F9/30181—Instruction operation extension or modification
- G06F9/30185—Instruction operation extension or modification according to one or more bits in the instruction, e.g. prefix, sub-opcode
- G06F9/3818—Decoding for concurrent execution
- G06F9/382—Pipelined decoding, e.g. using predecoding
Definitions
- The present invention relates to a technique for adding new instruction codes of multiple length to the instruction set of a processor, for example a processor with a 16-bit fixed-length instruction set or a 16/32-bit mixed instruction set.
- In particular, the present invention relates to a technique that is effective when applied to a processor that has delay slot instructions and generates a slot illegal exception.
- a RISC (Reduced Instruction Set Computer) with a 16-bit fixed-length instruction set
- High code efficiency remains indispensable for effective use of on-chip caches, RAMs, and ROMs, even now that memory capacities have increased.
- With a 16-bit fixed-length instruction set, the program size can be reduced, but the number of instructions increases.
- In particular, the number of register-to-register transfer instructions and immediate value transfer instructions increases because of restrictions on operand specification. An increase in the number of instructions degrades performance and increases power consumption.
- In a 16/32-bit mixed instruction set, four codes with 8-bit operands in the 16-bit code space are assigned to 32-bit instructions in order to maintain compatibility.
- In this mixed instruction set, two of the four codes are assigned to two instructions with a 24-bit operand field by making all 16 bits added by the 32-bit extension into an operand field, and the remaining two codes are added 16
- the scale of instruction expansion is reduced.
- The interpretation of a code changes depending on whether it is an instruction code or an extension part. Because whether a code is an extension part depends on the preceding instruction code, sequential decoding is necessary, and it is difficult to determine instruction code types quickly and in parallel as is.
- Patent Document 3 discloses an efficient superscalar instruction issuance method in the prefix method.
- In the prefix scheme represented by Patent Document 3 and the like, no further advantage can be obtained if the number of prefix codes that can be allocated is small.
- An object of the present invention is to provide a data processor in which an instruction code space is expanded while maintaining compatibility with an existing instruction set such as a 16-bit fixed length instruction set.
- Another object of the present invention is to provide a data processor capable of efficiently supplying instructions to a plurality of instruction pipelines even for instructions in an extended instruction code space.
- An instruction set is adopted in which a prohibited combination pattern is additionally defined as a separate instruction.
- The instructions additionally defined by the prohibited combination pattern are defined, for example, so that the instruction dispatch mechanism for the instruction set before the additional definition can be used as is.
- the instruction set before the additional definition is an instruction set including a prefix code
- the instruction additionally defined by the prohibited combination pattern is limited to the same instruction type as the instruction defined only by the second half code of the instruction.
- the prohibited combination pattern is dedicated to the branch instruction, and the branch instruction is used as the latter half pattern.
- the instruction set before the additional definition is an instruction set including a prefix code
- Only patterns not used in the first half are used in the latter half of the prohibited combination pattern, and the instruction type of the prohibited combination pattern is determined by exchanging instruction code type information between adjacent codes.
- the prohibited combination pattern is processed as a double length instruction code.
- FIG. 1 is an explanatory diagram illustrating a “branch instruction with a delay slot” that constitutes a “slot illegal exception pattern”.
- FIG. 2 is an explanatory diagram further illustrating an instruction used as “an instruction that cannot be placed in a delay slot” in addition to the instruction of FIG.
- FIG. 3 is an explanatory diagram illustrating “slot illegal exception patterns” based on combinations of the instructions in FIGS. 1 and 2 categorized according to the number of bits in the operand field.
- FIG. 4 is an explanatory diagram in which the allocation of the upper 4 bits CODE of the instruction code to the instruction type TYPE is classified.
- FIG. 5 is a block diagram illustrating a data processor according to this embodiment.
- FIG. 6 is a block diagram illustrating the configuration of the processor core.
- FIG. 7 is an explanatory diagram illustrating the pipeline configuration of the processor core.
- FIG. 8 is a block diagram illustrating the configuration of the global instruction queue GIQ of the data processor according to this embodiment.
- FIG. 9 is a block diagram illustrating the configuration of the branch instruction search dispatch circuit BR-ISD of the global instruction queue GIQ of FIG.
- FIG. 10 is a block diagram presenting the configuration of the predecoder in the third embodiment.
- the code space is expanded by a combination of two meaningless instructions such as consecutive loads to the same register, and the execution result is stored in the register with an instruction having 12 bits as the instruction noted in the fifth embodiment.
- FIG. 12 is a flowchart for explaining the operation of global instruction queue GIQ in the first embodiment.
- FIG. 13 is a flowchart corresponding to FIG. 12 for explaining the operation of the global instruction queue GIQ in the second embodiment.
- FIG. 14A is a block diagram illustrating a decoder configuration of a branch control unit BRC having one stage of a branch instruction buffer.
- FIG. 14B is a block diagram illustrating a decoder configuration of a branch control unit BRC having a plurality of stages of branch instruction buffers.
- FIG. 15A is a block diagram exemplifying a configuration of a decoder that supplies an instruction code of up to 32 bits and enables two-instruction superscalar execution of 16-bit code and scalar execution of 32-bit code in the fourth embodiment.
- FIG. 15B is a block diagram exemplifying a configuration of a decoder that supplies an instruction code of up to 48 bits and allows the preceding instruction to execute 2-instruction superscalar execution of 16-bit code and 32-bit code scalar execution in the fourth embodiment.
- FIG. 16 is a block diagram illustrating the configuration when the instruction predecoder of FIG. 10 in the third embodiment is adapted to the fourth embodiment, for the case where the fourth embodiment, in which all slot illegal exception patterns are used to define new 32-bit instructions, is implemented in a manner close to the scheme of the third embodiment.
- FIG. 17 is a block diagram showing an example in which the instruction predecoder configured as shown in FIG. 10 in the third embodiment is configured for the fifth embodiment when the fifth embodiment is implemented in a manner close to the scheme of the third embodiment.
- FIG. 18 shows an example in which the instruction predecoder configured as shown in FIG. 16 in the fourth embodiment is configured for the fifth embodiment when the fifth embodiment is implemented in a manner close to the method of FIG. 16 in the fourth embodiment.
- A data processor includes a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially stores a plurality of instruction codes fetched in parallel, and a dispatch circuit (EX-ISD, LS-ISD, BR-ISD) that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines based on the search result.
- This data processor has an instruction set in which a prohibited combination pattern, that is, a combination of a plurality of specific instruction codes for which the original processing of the individual instruction codes is prohibited, is additionally defined as a separate instruction.
- The delay slot included in the 16-bit fixed-length instruction set is a slot for placing the instruction that follows a branch instruction, and the instruction in this slot (the delay slot instruction) is executed before the branch destination instruction. Normally, this slot holds one instruction. If an exception or interrupt occurs between a branch instruction and its delay slot instruction, processing is resumed from the delay slot instruction, and the branch instruction may not be processed correctly. Further, since the branch instruction changes the PC (program counter), instructions that refer to or change the PC are generally prohibited as delay slot instructions. As a result, a pair of a branch instruction and a delay slot instruction is treated as a 32-bit instruction, and the prohibited combination pattern is treated as a slot illegal exception and is not utilized.
- a new instruction is additionally defined in the instruction set using a prohibition pattern such as “a 32-bit pattern by a pair of a branch instruction with a delay slot and an instruction that cannot be placed in the delay slot”.
- the combination pattern prohibited in this specification can be understood as having the following significance.
- It means, for example, a combination pattern of a first instruction code for executing a first process and a second instruction code for executing a second process, where combining the two may cause an error or malfunction in the first process and/or the second process. Accordingly, a combination pattern need not be explicitly designated as prohibited in the instruction set to count as prohibited; any combination pattern that may cause an error or a malfunction qualifies.
- each of the first and second instruction code patterns in the prohibited combination pattern of the specific plurality of instruction codes is a different instruction code.
- Suppose the branch instruction with a delay slot can be either the first half or the second half of the pattern. Then, when branch instructions with delay slots are consecutive, the odd-numbered ones are first halves and the even-numbered ones are second halves, so sequential decoding is necessary to distinguish them, which may reduce the efficiency of instruction allocation to a plurality of instruction pipelines.
- Excluding branch instructions with delay slots from the instructions that cannot be placed in a delay slot can therefore contribute to the efficiency of instruction allocation to a plurality of instruction pipelines.
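The sequential dependence described above can be illustrated with a small sketch. It assumes, purely for illustration, that a branch with a delay slot (type label `BRD`) may serve as either half of a prohibited pair; the function name `pair_roles` and the type labels passed to it are hypothetical and not taken from the specification.

```python
def pair_roles(types):
    """Assign a role to each code by scanning strictly left to right.

    Illustrative assumption: a BRD code followed by another BRD code forms
    a prohibited pair, and a code consumed as a second half cannot start a
    new pair. The role of a code therefore depends on how many BRD codes
    precede it, which is why the scan cannot be done per code in isolation.
    """
    roles = []
    i = 0
    while i < len(types):
        if types[i] == "BRD" and i + 1 < len(types) and types[i + 1] == "BRD":
            roles += ["first", "second"]  # this code and the next form a pair
            i += 2
        else:
            roles.append("single")  # ordinary instruction, not part of a pair
            i += 1
    return roles


if __name__ == "__main__":
    print(pair_roles(["EX", "BRD", "BRD", "BRD", "BRD"]))
```

In a run of consecutive `BRD` codes the same code can come out as a first half, a second half, or a standalone instruction depending on its position in the run, which is the ambiguity the exclusion is meant to remove.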
- When issuing an instruction to the corresponding instruction pipeline, the dispatch circuit regards the code immediately before the instruction as a prefix, attaches it to the instruction, and dispatches the instruction with the attached code to the execution pipeline.
- the decoder of each execution pipeline checks whether the code attached to the instruction is a prefix, and if it is a prefix, decodes the instruction with the prefix.
- Superscalar issue is also possible for these instructions. For example, even when there is a branch instruction with a delay slot, the delay slot instruction may be dispatched to the instruction pipeline for that instruction, and the branch instruction with a delay slot may be dispatched to the instruction pipeline for branch instructions. Thereby, it is possible to efficiently supply instructions to a plurality of instruction pipelines even for instructions in the extended instruction code space.
- When the instruction code supplied as a prefix code candidate forms one of the specific combinations of instruction codes, the instruction pipeline processes the instruction additionally defined by that combination. When the instruction code supplied as a prefix code candidate does not form such a combination, it is ignored.
- the instruction pipeline can execute an execution process for a new instruction having a prohibited combination pattern without performing a specially complicated process when decoding the dispatched instruction code.
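A minimal sketch of the prefix-candidate handling described above, under stated assumptions: the helper names `attach_prefix_candidate`, `decode`, and `forms_pattern_10` are hypothetical, and the pair check covers only one illustrative case (a first half whose upper 4 bits are 1010 or 1011 followed by a second half whose upper 8 bits are 10001001 or 10001011). It is not a model of the actual dispatch circuit.

```python
def attach_prefix_candidate(codes, index):
    """Dispatch side: attach the code just before `index` as a prefix candidate."""
    candidate = codes[index - 1] if index > 0 else None
    return candidate, codes[index]


def forms_pattern_10(candidate, instruction):
    """Illustrative check for one prohibited pair shape (code12 then code8)."""
    first_ok = (candidate >> 12) & 0xF in (0b1010, 0b1011)
    second_ok = (instruction >> 8) & 0xFF in (0b10001001, 0b10001011)
    return first_ok and second_ok


def decode(candidate, instruction, forms_pair):
    """Decoder side: use the candidate only if it completes a defined pair."""
    if candidate is not None and forms_pair(candidate, instruction):
        return ("extended", candidate, instruction)
    return ("plain", instruction)  # candidate is ignored


if __name__ == "__main__":
    codes = [0x6123, 0xA004, 0x8902, 0x6123]
    for i in range(len(codes)):
        print(decode(*attach_prefix_candidate(codes, i), forms_pattern_10))
```

The point of the design is that the dispatch side never has to know whether the attached code is meaningful; the decision is deferred to the decoder of the receiving pipeline.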
- For the instruction code that is detected and supplied at the head of the immediately following instruction code search, the instruction pipeline uses the last instruction code previously supplied as the prefix code candidate for forming the specific combination of a plurality of instruction codes.
- The first-half instruction code pattern in the prohibited combination pattern of the specific instruction codes separately defined by the combination is a branch instruction with a delay slot, and the second-half instruction code pattern is an instruction that cannot be placed in the delay slot.
- the pattern types for the prohibited combination pattern can be easily limited.
- “slot illegal exception pattern” is dedicated to branch instructions, branch instructions are used as the latter half pattern, and prefixes are not used for branch instructions, but in this case, branch instructions are further dispatched in item 8.
- the code immediately after the instruction is added instead of the code immediately before the instruction, and the branch pipeline decoder uses the added code if the instruction is a “slot illegal exception pattern”. Accordingly, the first decoding described in the item 7 can be avoided. At this time, since the latter half pattern of the “slot illegal exception pattern” is a branch instruction, the second point does not occur.
- the instruction pipeline can execute an execution process for a new instruction having a prohibited combination pattern without performing a specially complicated process when decoding the dispatched instruction code.
- For the last instruction code supplied from the dispatch circuit, the instruction pipeline uses the instruction code that is detected and supplied at the head of the immediately following instruction code search as a postfix code candidate for forming the specific combination of a plurality of instruction codes.
- An instruction code determination mechanism is required. Determining instruction codes only in units of the basic length, such as 16 bits, cannot detect prohibited combination patterns of arbitrary instruction types, but such patterns can be detected by exchanging the per-unit determination results between adjacent codes. The first half of the pattern is handled in the same way as a prefix, and the second half is handled in the same way as an ordinary instruction such as a branch instruction, a load/store instruction, or an arithmetic instruction, so the patterns can be freely assigned.
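The exchange of determination results between adjacent codes can be sketched as follows. This is an illustration under assumptions, not the circuit of the embodiment: each basic-length code is assumed to have already been classified independently into two hypothetical flags (`can_be_first`, `can_be_second`), and the two flag sets are assumed to be disjoint, as when only patterns unused in the first half appear in the second half.

```python
def detect_pairs(can_be_first, can_be_second):
    """Mark position i as the first half of a prohibited pair.

    Each position looks only at its own flag and its right neighbour's
    flag, so every position can be evaluated in parallel with no scan
    over the preceding codes.
    """
    n = len(can_be_first)
    return [
        bool(can_be_first[i] and i + 1 < n and can_be_second[i + 1])
        for i in range(n)
    ]


if __name__ == "__main__":
    first = [True, True, False, False]
    second = [False, False, True, False]
    print(detect_pairs(first, second))
```

Because a second-half code can never also be a first-half code under the disjointness assumption, the neighbour check is sufficient and no parity tracking over runs of codes is needed.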
- <The prohibited combination pattern is set to a double-length instruction code (Embodiment 4)>
- the instruction code included in the instruction set is a mixed instruction code in which a basic length and a double-length instruction code are mixed.
- the dispatch circuit supplies the instruction code of the basic length to the instruction pipeline corresponding to the basic length unit, and supplies the instruction code of the double length to the instruction pipeline corresponding to the double length unit.
- the instruction code of the prohibited combination pattern is defined as a double-length instruction code.
- A data processor includes a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially stores a plurality of instruction codes fetched in parallel, and a dispatch circuit (EX-ISD, LS-ISD, BR-ISD) that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines based on the search result.
- This data processor has an instruction set in which a combination pattern of a plurality of instruction codes that is not originally prohibited but is meaningless is treated as a prohibited combination pattern and additionally defined as a separate instruction.
- the concept of the present invention can be applied to combinations of a plurality of instructions that have no meaning on a program other than patterns that generate exceptions such as the prohibited combination patterns. For example, in the case of continuous load to the same register, it is not necessary to execute the first load unless the load destination register of the first load is the source operand of the second load. Such continuous loading into the same register is not prohibited, but the code space can be expanded by applying the present invention while prohibiting such a combination of two instructions.
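The condition for a meaningless pair of consecutive loads can be written as a small predicate. This is a sketch under assumptions: the `Load` record and its field names are hypothetical, and side effects of the first load other than writing its destination register are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    dest: int        # destination register number
    sources: tuple   # register numbers read to form the address


def is_meaningless_pair(first, second):
    """True when the first load's result is overwritten without being used.

    Both loads write the same register, and the second load does not read
    that register as a source operand, so the first load has no visible
    effect on register state.
    """
    return first.dest == second.dest and first.dest not in second.sources


if __name__ == "__main__":
    print(is_meaningless_pair(Load(1, (2,)), Load(1, (3,))))
    print(is_meaningless_pair(Load(1, (2,)), Load(1, (1,))))
```

Pairs that satisfy the predicate are the ones whose code points could be reassigned to new instructions in the way the text describes.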
- A data processor (MPU) includes a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially stores a plurality of instruction codes fetched in parallel, and a dispatch circuit (EX-ISD, LS-ISD, BR-ISD) that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines based on the search result.
- the instruction pipeline further performs processing as a single separate instruction code for a specific combination of instruction codes that would otherwise prohibit the original processing of individual instruction codes.
- the dispatch circuit searches for a combination of the specific plurality of instruction codes and supplies the combination to the corresponding instruction pipeline.
- each of the first and second half instruction code patterns in the prohibited combination pattern of the specific plurality of instruction codes is a different instruction code.
- The first-half instruction code pattern in the prohibited combination pattern of the plurality of specific instruction codes separately defined by the combination is a branch instruction with a delay slot, and the second-half instruction code pattern is an instruction that cannot be placed in the delay slot.
- <The prohibited combination pattern is set to a double-length instruction code when the instruction set includes mixed basic-length and double-length instruction codes (Embodiment 4)>
- the instruction code included in the instruction set is a mixed instruction code in which a basic length and a double-length instruction code are mixed.
- the dispatch circuit supplies the instruction code of the basic length to the instruction pipeline corresponding to the basic length unit, and supplies the instruction code of the double length to the instruction pipeline corresponding to the double length unit.
- the instruction code of the prohibited combination pattern is defined as a double-length instruction code.
- A data processor (MPU) includes a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially stores a plurality of instruction codes fetched in parallel, and a dispatch circuit (EX-ISD, LS-ISD, BR-ISD) that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines based on the search result.
- The instruction pipeline further processes, as a single separate instruction code, a combination of a plurality of specific instruction codes whose combination is not originally prohibited but has no meaning. The dispatch circuit searches for the combination of the plurality of instruction codes and supplies it to the corresponding instruction pipeline.
- Embodiment 1 Consider an example including 4 instructions of FIG. 1 as “branch instructions with delay slots” constituting “illegal slot exception pattern” and 8 instructions of FIGS. 1 and 2 as “instructions that cannot be placed in delay slots”.
- instruction (1) is a program counter (hereinafter referred to as PC) relative unconditional branch instruction with a delay slot
- instruction (2) is a PC relative subroutine call instruction with a delay slot
- instructions (3) and (4) are PC relative conditional branch instructions with delay slots, which branch when the condition flag is satisfied and not satisfied, respectively.
- instructions (5) and (6) in FIG. 2 are PC relative load instructions, which load 16-bit and 32-bit data, respectively.
- Instructions (7) and (8) are PC relative conditional branch instructions, and branch when the condition flag is satisfied and not satisfied, respectively.
- the operand fields are 12 bits for instructions (1), (2), (5), and (6), and 8 bits for instructions (3), (4), (7), and (8).
- “disp8” means an 8-bit displacement
- “disp12” means a 12-bit displacement
- PC means a program counter value
- Rn means a general-purpose register number #n.
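As a small worked example of the field widths above, the following sketch splits a 16-bit instruction code into its code part and operand field. The function names are hypothetical, and the sketch makes no assumption about how the 12-bit operand of a load is subdivided or whether displacements are signed.

```python
def split_code12(code):
    """Split into a 4-bit code and a 12-bit operand field."""
    return (code >> 12) & 0xF, code & 0xFFF


def split_code8(code):
    """Split into an 8-bit code and an 8-bit operand field (disp8)."""
    return (code >> 8) & 0xFF, code & 0xFF


if __name__ == "__main__":
    print([hex(v) for v in split_code12(0xA123)])
    print([hex(v) for v in split_code8(0x8D45)])
```

The first call treats the word as an instruction with a 12-bit operand field, the second as one with an 8-bit operand field.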
- Pattern (9) has code12 in both the first half and the second half of the pattern.
- the code A1 in the first half of the pattern contains 1010 or 1011 which is the code of the instructions (1) and (2).
- the code B1 in the latter half of the pattern contains 1010 or 1011 which is the code of the instructions (5) and (6), and the code C1 contains 1010 or 1011 which is the code of the instructions (1) and (2).
- Pattern (10) has code12 in the first half of the pattern and code8 in the second half of the pattern.
- the code A1 is the same as the pattern (9)
- the code B2 in the second half of the pattern contains 10001001 or 10001011 which is the code of the instructions (7) and (8)
- the code C2 contains 10001101 or 10001111, which are the codes of the instructions (3) and (4).
- Pattern (11) has code 8 in the first half of the pattern and code 12 in the second half of the pattern.
- the code A2 in the first half of the pattern contains 10001101 or 10001111, which are the codes of the instructions (3) and (4), and the codes B1 and C1 are the same as in the pattern (9).
- Pattern (12) has code8 in both the first half and the second half of the pattern.
- the code A2 is the same as the pattern (11), and the codes B2 and C2 are the same as the pattern (10).
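The four patterns differ only in the operand widths of their two halves, which a short lookup makes explicit. The table and function names are illustrative; the numbers follow from the 16-bit halves having either a 12-bit or an 8-bit operand field.

```python
# (first-half operand bits, second-half operand bits) -> pattern number
PATTERNS = {(12, 12): 9, (12, 8): 10, (8, 12): 11, (8, 8): 12}


def describe_pattern(first_bits, second_bits):
    """Return (pattern number, total operand bits, total code bits) of a 32-bit pair."""
    number = PATTERNS[(first_bits, second_bits)]
    operand_bits = first_bits + second_bits
    return number, operand_bits, 32 - operand_bits


if __name__ == "__main__":
    for widths in PATTERNS:
        print(describe_pattern(*widths))
```

Pattern (9) leaves the widest operand field for a newly defined 32-bit instruction and pattern (12) the narrowest.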
- code C1 or C2 When code C1 or C2 is used, that is, when code A1 is used in the first half, code C1 is used in the second half, or code C2 is used extensively when code A2 is used in the first half.
- the superscalar processor of this embodiment classifies instruction codes into prefix PX, branch instruction with delay slot BRD, branch instruction BR, load / store instruction LS, load / store instruction LSX that cannot be placed in the delay slot, and operation instruction EX.
- the assignment of the upper 4 bits CODE of the instruction code to the instruction type TYPE is shown in FIG. Note that the instruction type TYPE-DS in the figure is used in Embodiment 4 to be described later, and is not described here.
- Instructions other than the eight instructions listed in FIGS. 1 and 2 are also classified by the upper 4 bits of the instruction code (CODE).
- the present invention can be applied to any instruction encoding, but is easy to explain by using a specific example.
- the CODE 1000 is also used as a branch instruction with delay slot BRD, a branch instruction BR, and a prefix PX.
- Patterns (3) and (4) in FIG. 1 are branch instructions with delay slots BRD, patterns (7) and (8) in FIG. 2 are branch instructions BR, and the rest are prefixes PX.
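The sharing of CODE 1000 among three instruction types can be sketched as a lookup on the upper 8 bits. This is an illustration, not the predecoder of the embodiment: the function name is hypothetical, and the second delay-slot conditional branch code is assumed to be the 8-bit value 10001111.

```python
BRD_CODES = {0b10001101, 0b10001111}  # instructions (3), (4): branch with delay slot
BR_CODES = {0b10001001, 0b10001011}   # instructions (7), (8): branch without delay slot


def classify_code_1000(code):
    """Classify a 16-bit code whose upper 4 bits are 1000 as BRD, BR, or PX."""
    if (code >> 12) & 0xF != 0b1000:
        raise ValueError("upper 4 bits are not 1000")
    upper8 = (code >> 8) & 0xFF
    if upper8 in BRD_CODES:
        return "BRD"
    if upper8 in BR_CODES:
        return "BR"
    return "PX"  # every remaining 1000xxxx code is treated as a prefix


if __name__ == "__main__":
    for word in (0x8D10, 0x8900, 0x8000):
        print(hex(word), classify_code_1000(word))
```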
- FIG. 5 illustrates a data processor MPU according to the present embodiment.
- the data processor MPU includes a processor core CPU as a center, and includes one or a plurality of memories MEM, an external interface circuit EIF, a built-in peripheral module PER, and the like connected thereto by an internal bus IBUS.
- the data processor MPU is not particularly limited, but is formed on one semiconductor substrate such as single crystal silicon by a CMOS integrated circuit manufacturing technique or the like.
- FIG. 6 illustrates a block configuration of the processor core CPU.
- An instruction fetch unit IFU is arranged near the instruction cache IC, and includes a predecoder PD, a global instruction queue GIQ, and a branch control unit BRC.
- a load / store unit LSU is arranged in the vicinity of the data cache DC, and includes a load / store instruction queue LSIQ that holds a load / store instruction, a load / store instruction decoder LSID, and a load / store instruction execution unit LSE.
- the operation instruction execution unit EXU includes an execution instruction queue EXIQ that holds operation instructions, an operation instruction decoder EXID, and an operation instruction execution unit EXE.
- the bus interface unit BIU is an interface circuit between the processor core CPU and the bus IBUS outside the core.
- FIG. 7 illustrates the pipeline configuration of the processor core CPU.
- the operation instruction pipeline EXPL includes a local instruction buffer EXIB, a local register read EXRR, an operation EX, and a register write back WB.
- the load / store instruction pipeline LSPL includes stages of a local instruction buffer LSIB, a local register read LSRR, an address calculation LSA, data cache accesses DC1 and DC2, and a register write back WB.
- the branch instruction pipeline BRPL includes a branch BR stage.
- the instruction fetch unit IFU fetches the instruction code from the instruction cache IC, predecodes it by the predecoder PD, and stores it in the global instruction queue GIQ of the subsequent global instruction buffer GIB stage.
- In the global instruction buffer GIB stage, one instruction of each category (load/store, operation, and branch) is extracted using the predecode result and the like, and dispatched to the local instruction buffer LSIB stage, the local instruction buffer EXIB stage, and the branch BR stage, respectively.
- The dispatched instructions are stored in the instruction queue LSIQ of the load/store unit LSU, the instruction queue EXIQ of the operation instruction execution unit EXU, and the branch control unit BRC of the instruction fetch unit IFU, respectively.
- In the branch BR stage, branch processing is started immediately when a branch instruction is received.
- In the local instruction buffer EXIB stage, the operation instruction execution unit EXU receives operation instructions into the instruction queue EXIQ at a maximum of one instruction per cycle, and the operation instruction decoder EXID decodes them at most one instruction at a time.
- Register read is then performed in the next local register read EXRR stage, the operation is performed by the operation instruction execution unit EXE in the operation EX stage, and, if the instruction writes a register, the processing result is stored in the register in the register write back WB stage.
- In the local instruction buffer LSIB stage, the load/store unit LSU receives load/store instructions into the instruction queue LSIQ at a maximum of one instruction per cycle, and the load/store instruction decoder LSID decodes them at most one instruction at a time. Register read is then performed in the next local register read LSRR stage, the load/store address is calculated in the address calculation LSA stage, load/store processing is performed in the data cache access DC1 and DC2 stages, and, if the instruction writes a register, the processing result is stored in the register in the register write back WB stage.
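For orientation only, the stage sequences named above can be written down as a small Python table. The stage names come from the pipeline description; the helper `stage_after_gib` and the no-stall assumption are hypothetical and serve only to show how many cycles each pipeline occupies after dispatch from the GIB stage.

```python
# Stages that follow the global instruction buffer GIB stage, as listed in
# the pipeline description. Anything else here is an assumption of the sketch.
PIPELINES = {
    "EXPL": ["EXIB", "EXRR", "EX", "WB"],
    "LSPL": ["LSIB", "LSRR", "LSA", "DC1", "DC2", "WB"],
    "BRPL": ["BR"],
}


def stage_after_gib(pipeline, cycles):
    """Stage occupied `cycles` cycles after GIB (1 = first stage).

    Returns None once the instruction has left the pipeline.
    Hypothetical helper: assumes the instruction never stalls.
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    stages = PIPELINES[pipeline]
    return stages[cycles - 1] if cycles <= len(stages) else None
```

Under these assumptions a load/store instruction is still in DC2 in the fifth cycle after GIB, by which time an operation instruction has already completed write back.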
- FIG. 8 illustrates the configuration of the global instruction queue GIQ of the data processor MPU according to the present embodiment.
- the global instruction queue GIQ includes instruction queue entries GIQ0 to GIQ15 for 16 instructions, a global instruction queue pointer GIQP that specifies a write position, and a global instruction queue pointer decoder GIQP-DEC that decodes the global instruction queue pointer GIQP.
- It also includes an operation instruction pointer EXP, a load/store instruction pointer LSP, and a branch instruction pointer BRP, which advance according to the progress of the operation, load/store, and branch instructions, respectively, and designate the reading positions.
- It further consists of an operation instruction search dispatch circuit EX-ISD, a load/store instruction search dispatch circuit LS-ISD, and a branch instruction search dispatch circuit BR-ISD, which search for and dispatch operation, load/store, and branch instructions according to the respective pointers, and an instruction fetch request generation unit IREQ-GEN.
- The global instruction queue pointer decoder GIQP-DEC decodes the global instruction queue pointer GIQP and asserts, among the global instruction queue update signals GIQU0 to GIQU3, the update signal of the instruction queue entry group pointed to by the global instruction queue pointer GIQP.
- Among the global instruction queue entry groups GIQ0 to GIQ3, GIQ4 to GIQ7, GIQ8 to GIQ11, and GIQ12 to GIQ15, the group whose global instruction queue update signal GIQU0 to GIQU3 is asserted is updated.
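The write side of the queue described above can be sketched as follows. This is a minimal Python model, assuming that GIQP holds a group index from 0 to 3 and that the four fetched codes are written into the selected group at once; the function names `giqp_dec` and `latch_fetched_codes` are hypothetical.

```python
GROUP_SIZE = 4   # entries per global instruction queue entry group
NUM_GROUPS = 4   # GIQ0-3, GIQ4-7, GIQ8-11, GIQ12-15


def giqp_dec(giqp):
    """One-hot decode of GIQP into the update signals GIQU0 to GIQU3."""
    return [group == giqp % NUM_GROUPS for group in range(NUM_GROUPS)]


def latch_fetched_codes(giq, giqp, codes):
    """Write four fetched codes into the group whose update signal is asserted.

    giq is a 16-element list standing in for GIQ0 to GIQ15.
    Returns the next GIQP value (assumed here to advance by one group).
    """
    assert len(giq) == GROUP_SIZE * NUM_GROUPS and len(codes) == GROUP_SIZE
    for group, update in enumerate(giqp_dec(giqp)):
        if update:
            start = group * GROUP_SIZE
            giq[start:start + GROUP_SIZE] = codes
    return (giqp + 1) % NUM_GROUPS
```

Only one update signal is asserted per write, so the other three entry groups keep their contents while the read pointers work through them.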
- The branch instruction search dispatch circuit BR-ISD (e) searches for a branch instruction code among the four instruction codes starting with the instruction code pointed to by the branch instruction pointer BRP, taken from the global instruction queue outputs GIQO0 to GIQO15 output from the global instruction queue entries GIQ0 to GIQ15. (f) If there is a branch instruction code, the branch instruction code valid signal BR-IV is asserted, and the first branch instruction code and the instruction code immediately before it are output as the branch instruction BR-INST. Because the immediately preceding instruction code is also selected, a prefix, when that code is one, can be output together with the instruction code it modifies. (g) The new branch instruction pointer BRP-NEW is output so that the branch instruction pointer BRP points to the instruction code next to the output instruction code.
- When the branch instruction code is found at a position other than the head, the branch prefix candidate valid signal BR-PV is asserted. (o) As a result, the branch control unit BRC judges both the branch instruction code and the immediately preceding instruction code, as a branch prefix candidate, to be valid inputs. (i) If the branch instruction code is found at the head, the branch prefix candidate valid signal BR-PV is negated. (p) Thereby, the branch control unit BRC determines that only the branch instruction code is a valid input.
- If no branch instruction code is found, the branch instruction code valid signal BR-IV is negated. The last valid instruction code in the current search range may then be a branch prefix candidate for a branch instruction code found at the head of the next branch instruction search.
- For this reason, the code immediately after the last valid instruction code is regarded as the branch instruction code, so that the last valid instruction code in the current search range is selected as the instruction code immediately before the branch instruction code.
- It is output as the branch instruction BR-INST together with the code regarded as the branch instruction code. The code regarded as a branch instruction code is substantially meaningless to the branch control unit BRC, but outputting it matches the output control used when there is a valid branch instruction code, which simplifies the control logic.
- The new branch instruction pointer BRP-NEW is output so that the branch instruction pointer BRP points to the code next to the last valid instruction code.
- If the search range contains a valid instruction code, the branch prefix candidate valid signal BR-PV is asserted.
- (n) Thereby, the branch control unit BRC determines that only the immediately preceding instruction code, as a branch prefix candidate, is a valid input.
- If the search range contains no valid instruction code, the branch prefix candidate valid signal BR-PV is negated, and (q) thereby the branch control unit BRC determines that both the instruction code and the branch prefix candidate immediately preceding it are invalid.
- When the range of four instructions to be searched extends into a global instruction queue entry group GIQ0 to GIQ3, GIQ4 to GIQ7, GIQ8 to GIQ11, or GIQ12 to GIQ15 that holds invalid instruction codes, the invalid instruction codes are also included in the search target. Details of the search operation will be described later with reference to FIG.
- Depending on the branch instruction code valid signal BR-IV and the branch prefix candidate valid signal BR-PV: (o) if both are asserted, the branch prefix candidate is output simultaneously with the branch instruction code; (n) if they are negated and asserted, respectively, only the branch prefix candidate is output in advance; (p) if they are asserted and negated, respectively, only the branch instruction code is output and is used together with the branch prefix candidate output in advance; (q) if both are negated, no valid code is output. If the code decoded as a branch prefix candidate is not a branch prefix, the instruction is executed using only the branch instruction code.
- Similarly, the load/store instruction search dispatch circuit LS-ISD outputs the load/store instruction code valid signal LS-IV, the load/store instruction LS-INST, the load/store prefix candidate valid signal LS-PV, and the new load/store instruction pointer LSP-NEW from the global instruction queue outputs GIQO0 to GIQO15 according to the load/store instruction pointer LSP.
- The operation instruction search dispatch circuit EX-ISD likewise outputs the operation instruction code valid signal EX-IV, the operation instruction EX-INST, the operation prefix candidate valid signal EX-PV, and the new operation instruction pointer EXP-NEW from the global instruction queue outputs GIQO0 to GIQO15 according to the operation instruction pointer EXP.
- The instruction fetch request generation unit IREQ-GEN determines, based on the values of the pointers GIQP, EXP, LSP, and BRP, whether the global instruction queue has free space of at least one entry group (GIQ0 to GIQ3, GIQ4 to GIQ7, GIQ8 to GIQ11, or GIQ12 to GIQ15). If there is space, the instruction fetch request signal IREQ is asserted. Space is available when none of the pointers EXP, LSP, and BRP points into the global instruction queue entry group that the global instruction queue pointer GIQP designates as the latch destination of the next fetched instruction codes ICO0 to ICO3.
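The free-space test can be sketched in Python as follows. The sketch reads the condition as "a group is free when no read pointer lies inside the group that GIQP designates", and it assumes that GIQP is a group index (0 to 3) and that EXP, LSP, and BRP are entry indices (0 to 15). The function name `fetch_request` is hypothetical, and the handling of an empty queue immediately after a branch is deliberately left out.

```python
GROUP_SIZE = 4  # entries per global instruction queue entry group


def fetch_request(giqp_group, exp, lsp, brp):
    """Return True (assert IREQ) when no read pointer lies in the write group.

    giqp_group: index of the entry group that would latch ICO0 to ICO3.
    exp, lsp, brp: read pointers as entry indices 0..15 (assumption).
    """
    return all(ptr // GROUP_SIZE != giqp_group for ptr in (exp, lsp, brp))
```

For example, with EXP = 4, LSP = 9, and BRP = 13, group 0 can be overwritten, while group 2 cannot because LSP still points into it.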
- the instruction fetch request signal IREQ is generated from the signals EX-OK, LS-OK, and BR-OK
- the instruction fetch request signal IREQ can be generated one cycle earlier.
- Various timing schemes can be considered, such as generating the signal from the pointers, generating it from the new pointers, or latching it before sending it to the instruction cache IC.
- FIG. 9 illustrates the configuration of the branch instruction search dispatch circuit BR-ISD of the global instruction queue GIQ of FIG. It consists of a pointer decoder P-DEC, instruction code multiplexers M0 to M3, a priority encoder PE, an output instruction code multiplexer MOUT, and a pointer update circuit P-ADV.
- The pointer decoder P-DEC decodes the branch instruction pointer BRP and generates the control signals M0-CNTL to M3-CNTL of the instruction code multiplexers M0 to M3 so that the four instruction codes starting with the instruction code pointed to by the branch instruction pointer BRP are selected.
- According to the control signals M0-CNTL to M3-CNTL, the instruction code multiplexers M0 to M3 select one code each from GIQO0, 4, 8, and 12; GIQO1, 5, 9, and 13; GIQO2, 6, 10, and 14; and GIQO3, 7, 11, and 15, respectively, and output them as the search target instruction codes C0 to C3. As a result, the search target instruction codes C0 to C3 have no fixed order; the instruction order runs cyclically starting from the head instruction code.
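The cyclic selection performed by the multiplexers M0 to M3 can be illustrated with a short Python sketch. The 16-element list standing in for GIQO0 to GIQO15, the wrap-around of the pointer modulo 16, and the function name `select_search_codes` are assumptions of the sketch.

```python
def select_search_codes(giqo, brp):
    """Model the instruction code multiplexers M0 to M3.

    giqo: 16 queue outputs (GIQO0..GIQO15); brp: entry index 0..15.
    Returns (c, order): c[k] is the output Ck of multiplexer Mk, and
    order lists the multiplexer indices in instruction order from the head.
    """
    assert len(giqo) == 16
    c = [None] * 4
    order = []
    for offset in range(4):
        entry = (brp + offset) % 16   # four codes starting at BRP
        k = entry % 4                 # Mk only sees entries k, k+4, k+8, k+12
        c[k] = giqo[entry]
        order.append(k)
    return c, order
```

With BRP = 6 the selected entries are 6, 7, 8, and 9, so C2 holds the head code and the instruction order is C2, C3, C0, C1. This is the cyclic ordering that the priority encoder has to take into account.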
- The priority encoder PE searches the search target instruction codes C0 to C3 by priority encoding, starting from the head instruction code indicated by the branch instruction pointer BRP, for the first branch instruction code. If a branch instruction code is found, that instruction code is selected.
- If none is found, the first invalid instruction code is selected when the search range contains an invalid instruction code, and the head instruction code is selected when it does not.
- The output instruction code multiplexer control signal MOUT-CNTL is output so that the instruction code immediately before the selected instruction code is selected together with it.
- When the head instruction code is selected, the instruction code immediately before it is not a current search target and cannot be selected. Because the order of the search target instruction codes C0 to C3 is cyclic, the last code of the search range is selected as the preceding instruction code instead. Then, if the code is found at the first, second, third, or fourth instruction code, 1, 2, 3, or 4 is added to the branch instruction pointer BRP, respectively; if it is not found, the number of valid instruction codes searched is added. The resulting value is output as the new branch instruction pointer BRP-NEW.
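The search and pointer update described above can be sketched as a single Python function. The sketch works on the four codes already arranged in instruction order and makes several assumptions that are not stated in this form in the embodiment: invalid entries are represented by `None` and occur only at the tail of the window, the pointer wraps modulo 16, the instruction types BR and BRD are both treated as branch instruction codes, and BR-PV is asserted in the not-found case whenever at least one valid code exists. The name `search_branch` is hypothetical.

```python
BRANCH_TYPES = {"BR", "BRD"}  # types handled by the branch search (assumption)


def search_branch(window, brp):
    """Model the priority encoder PE and the pointer update circuit P-ADV.

    window: instruction types of the four codes in instruction order,
            starting at BRP; None marks an invalid entry.
    Returns (br_iv, br_pv, selected, previous, brp_new), where selected
    and previous are positions (0..3) within the window.
    """
    n_valid = 0
    for itype in window:
        if itype is None:
            break
        n_valid += 1
    for pos in range(n_valid):
        if window[pos] in BRANCH_TYPES:
            # Found: the prefix candidate is valid unless the hit is at the head.
            return True, pos != 0, pos, (pos - 1) % 4, (brp + pos + 1) % 16
    # Not found: treat the code after the last valid one as "selected",
    # so that the last valid code becomes the preceding code.
    selected = n_valid % 4
    return False, n_valid > 0, selected, (selected - 1) % 4, (brp + n_valid) % 16
```

When all four codes are valid and none is a branch, the selected position wraps to the head and the preceding position becomes the fourth code, which is the last valid code. This mirrors the cyclic selection described in the text.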
- In this way, even when no branch instruction code is found, the last valid instruction code is selected and output, so that it is supplied properly even if it turns out to be the prefix of a branch instruction code found at the head of the next branch instruction search. That is, when a new branch instruction search finds a branch instruction at the head of the search range, the branch prefix candidate valid signal BR-PV is negated, which instructs the branch control unit BRC to use the branch prefix candidate it already holds rather than the one output at the same time as part of the branch instruction BR-INST.
- The code output as the branch instruction BR-INST is thus always the pair of the selected instruction code and the instruction code immediately before it, regardless of whether a branch instruction code is found. This makes it easy to share part of the output control logic of the branch instruction BR-INST, which can contribute to reducing the logic scale.
- the load store instruction search dispatch circuit LS-ISD and the operation instruction search dispatch circuit EX-ISD have the same structure as the branch instruction search dispatch circuit BR-ISD.
- As a result, instructions with and without a prefix can be issued every cycle for each instruction type, and efficient superscalar instruction issue becomes possible in an instruction set that includes prefixed instructions.
- The above control method can also be applied to an instruction set architecture that allows a plurality of prefix codes, and can contribute to superscalar instruction issue.
- the configuration so far is based on the configuration of the processor disclosed in Patent Document 3 by the present inventor.
- a case where a pair of a branch instruction with a delay slot and a delay slot instruction is to be processed will be described.
- the branch instruction with a delay slot in the first half of the pattern is selected by the branch instruction search dispatch circuit BR-ISD in FIG. 8 and is output as the branch instruction BR-INST.
- In that case, the decoder is configured not to execute the branch instruction either. That is, the dispatch by the branch instruction search dispatch circuit BR-ISD is then a useless dispatch; instead of generating the slot illegal exception, the dispatched instruction is simply cancelled. In short, such a function is added to the instruction decoder.
- The delay slot instruction in the latter half of the pattern is selected by the operation instruction search dispatch circuit EX-ISD, the load/store instruction search dispatch circuit LS-ISD, or the branch instruction search dispatch circuit BR-ISD according to its instruction type, and is output as an operation instruction EX-INST, a load/store instruction LS-INST, or a branch instruction BR-INST with the branch instruction with a delay slot added as a prefix. That is, a pair of a branch instruction with a delay slot and a delay slot instruction is output. If this pair is a “slot illegal exception pattern”, the process proceeds to exception processing.
- the patterns (9), (10), (11), and (12) defined in FIG. 3 are executed as 32-bit instructions, and the slot illegal exception is not generated.
- the latter 16 bits of the pattern defined in FIG. 3 is one of (1) to (8) in FIG. 1 and FIG. 2, and if the instruction dispatch method in FIG.
- the store instruction LS-INST is treated as a branch instruction BR-INST. Therefore, the instruction definition includes the definition of an instruction executed in the same execution pipeline as the latter half of the pattern.
- The branch control unit BRC in FIG. 6 decodes a branch instruction BR-INST consisting of a branch instruction code and the instruction code immediately before it, controls the instruction fetch unit IFU, and manages the instruction flow.
- The branch instruction BR-INST has four states, (o), (p), (n), and (q), depending on the values of the branch instruction code valid signal BR-IV and the branch prefix candidate valid signal BR-PV. The decoding operation of the branch control unit BRC in each case will be described with reference to FIGS. 14A and 14B.
- FIG. 14A shows an example in which the branch instruction buffer has one stage, and includes a branch prefix latch BR-PX and a branch instruction latch BR-I.
- FIG. 14B shows an example in which the branch instruction buffer has a plurality of stages.
- In this example, one or more branch instruction buffers BR-BUF are provided, and the branch prefix multiplexer BR-MPX and the branch instruction multiplexer BR-MPI are controlled to select the instruction codes to be latched in the branch prefix latch BR-PX and the branch instruction latch BR-I. As is true in general and not only for the present invention, having a plurality of buffers gives a timing margin for stopping the instruction supply side when instruction issue stalls.
- When the branch prefix candidate is valid at the same time as the branch instruction code, as in (o) of FIG. 12, the branch instruction input control circuit BRI-CTL determines this from the branch instruction code valid signal BR-IV and the branch prefix candidate valid signal BR-PV, controls the branch prefix latch BR-PX and the branch instruction latch BR-I, and supplies the branch instruction BR-INST to the branch instruction decoder BR-DEC as a branch prefix candidate and a branch instruction.
- The branch instruction decoder BR-DEC decodes it as a branch instruction with a prefix if the branch prefix candidate is actually a prefix, and as a normal branch instruction if it is not, generates a control signal BR-DECO, and proceeds with branch processing.
- When only the branch instruction code is valid, as in (p) of FIG. 12, this is determined in the same way as in (o), and the branch instruction latch BR-I is controlled so that the latter half of the branch instruction BR-INST is supplied to the branch instruction decoder BR-DEC as a branch instruction. If a branch prefix candidate supplied in advance is valid, it is used together with the branch instruction and processed in the same manner as in (o) above. On the other hand, if no branch prefix candidate has been supplied in advance, such as immediately after a branch, the code is processed as a normal branch instruction.
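The input handling of the branch control unit BRC for the four cases (o), (p), (n), and (q) can be sketched as a small stateful Python model. It is a simplification with a single held prefix candidate, roughly corresponding to the one-stage buffer case. The class name `BranchInput`, its methods, and the choice to discard a held candidate in case (o) are assumptions of the sketch.

```python
class BranchInput:
    """Toy model of how BRC combines BR-IV, BR-PV and the two halves of BR-INST."""

    def __init__(self):
        self.held_prefix = None  # prefix candidate supplied in advance, if any

    def clear(self):
        """Drop the held candidate, e.g. immediately after a branch."""
        self.held_prefix = None

    def accept(self, br_iv, br_pv, prev_code, code):
        """Return (prefix_candidate, branch_code) for the decoder, or None."""
        if br_iv and br_pv:          # (o): both halves valid in the same cycle
            self.held_prefix = None
            return (prev_code, code)
        if br_iv:                    # (p): use the candidate held from before
            held, self.held_prefix = self.held_prefix, None
            return (held, code)
        if br_pv:                    # (n): only a prefix candidate arrives
            self.held_prefix = prev_code
        return None                  # (n) and (q): nothing to decode yet
```

A decoder downstream would still have to check whether the returned prefix candidate really is a prefix and otherwise decode the branch code alone, as described for the branch instruction decoder BR-DEC.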
- Some 16-bit fixed-length instruction sets include branch instructions with delay slots.
- The delay slot is a slot for inserting an instruction that follows the branch instruction, and the instruction in this slot (the delay slot instruction) is executed before the branch destination instruction. Normally, this slot holds one instruction. If an exception or interrupt occurs between a branch instruction and its delay slot instruction, processing is resumed from the delay slot instruction and the branch instruction is not processed correctly. Further, since the branch instruction changes the PC, instructions that refer to or change the PC are generally prohibited as delay slot instructions.
- The instruction code space can be expanded by defining a 32-bit instruction set that utilizes this prohibited pattern, that is, the “32-bit pattern formed by a pair of a branch instruction with a delay slot and an instruction that cannot be placed in the delay slot”.
- this pattern is referred to as a “slot illegal exception pattern”.
- A branch instruction with a delay slot is itself an instruction that cannot be placed in the delay slot. Therefore, a branch instruction with a delay slot can be either the first half or the second half of the pattern. When branch instructions with delay slots appear consecutively, the odd-numbered ones are first halves and the even-numbered ones are second halves, so sequential decoding is needed to tell them apart, which makes speeding up difficult.
- instructions other than the branch instruction with a delay slot may be used among the representative eight instructions that cannot be placed in the delay slot.
- The instruction codes are classified into a prefix, a branch instruction with a delay slot, a branch instruction, a load/store instruction, and an operation instruction, and a superscalar processor having three types of execution pipeline (branch, load/store, and operation) is considered. An engineer with ordinary skill can extend this processor to, for example, a processor that has floating point arithmetic instructions as a further classification and a floating point pipeline as a further execution pipeline.
- A branch instruction, a load/store instruction, and an operation instruction are each searched for every cycle. If an instruction is within the search range, the code immediately before it is regarded as a prefix candidate, added to the instruction, and dispatched to the corresponding execution pipeline. The decoder of each execution pipeline checks whether the added code is a prefix and, if it is, uses it in instruction decoding as well, so that prefixed instructions can be issued in superscalar fashion. Even if there is a branch instruction with a delay slot, no special processing is required after dispatch: the branch instruction with a delay slot is dispatched to the branch pipeline and the delay slot instruction is dispatched to the execution pipeline for that instruction. The switching of the instruction fetch and dispatch flow is then delayed by one instruction compared with a normal branch instruction. When a slot illegal exception occurs, high-speed processing is not necessary, so it is sufficient to shift to exception processing as appropriate.
- Embodiment 2 The instruction control form utilizing the “slot illegal exception pattern” described in the first embodiment does not require changing the dispatch mechanism, but there is a point to consider.
- the instruction code is “Slot illegal exception pattern”
- The branch instruction with a delay slot in the first half of the pattern is dispatched to the branch pipeline as a branch instruction, but because the latter half of the pattern does not accompany it, it cannot be executed, and a wasteful dispatch occurs.
- Furthermore, because the latter half of the pattern is regarded as the instruction and dispatched with the first half of the pattern added as a prefix candidate, the latter half of the pattern and the instruction defined by the “slot illegal exception pattern” must be codes of instructions executed in the same execution pipeline. This is a limitation.
- the instruction definition based on the “slot illegal exception pattern” is dedicated to the branch instruction, and the branch instruction is used as the latter half pattern. That is, only the six patterns (1) to (4) and (7) and (8) in FIGS. 1 and 2 are used as the latter 16 bits of the pattern defined in FIG. The prefix is not used for branch instructions.
- In case the selected instruction code is a branch instruction defined by the “slot illegal exception pattern”, the branch instruction search dispatch circuit BR-ISD of the global instruction queue GIQ in FIG. 8 outputs the instruction code immediately after the selected instruction code together with the selected instruction code.
- In the first embodiment, the instruction code immediately before the selected instruction code is output at the same time in preparation for the case where there is a prefix.
- In this embodiment, the prefix is not used for branch instructions, so the immediately preceding instruction code is unnecessary.
- In other words, whereas the first embodiment treats the first half of the prohibited pattern as a prefix and therefore outputs the instruction code immediately before the selected instruction code together with it, this embodiment treats the second half of the prohibited pattern as a postfix.
- BR-PV in FIG. 8 is a postfix candidate valid signal.
- the postfix candidate valid signal BR-PV is negated.
- the new branch instruction pointer BRP-NEW is output so that the branch instruction pointer BRP points to the instruction code next to the output instruction code as in the first embodiment. If negated, the process proceeds by one less than in the first embodiment. Further, since there is no prefix, there is no need to consider the case where the last code of the effective instruction code to be searched considered in the first embodiment is a prefix.
- FIG. 12 Due to the above changes from the first embodiment, the flowchart of FIG. 12 becomes as shown in FIG. Hereinafter, a difference from FIG. 12 will be described based on FIG. In FIG. 13, the main changes from FIG. 12 are underlined.
- The output is changed to the selected instruction code and the instruction code immediately after it; (l) and (o), which were reached from (k) under “valid instruction code present”, are deleted so that the flow proceeds to (m) unconditionally; and the branch prefix in (m) is changed to a postfix.
- When the first half of the branch instruction BR-INST is one of the four instructions (1) to (4) in FIG. 1, that is, a “branch instruction with delay slot”, and the postfix candidate valid signal BR-PV is negated, there is no second half. It is then impossible to distinguish a normal “branch instruction with delay slot” from a “slot illegal exception pattern”, and the branch control unit BRC of the instruction fetch unit IFU in FIG. 6 cannot process the received instruction.
- In that case, the latter half of the “slot illegal exception pattern” is found the next time the branch instruction search succeeds, and is output as the first half of that branch instruction BR-INST. By using this as the latter half of the “slot illegal exception pattern”, processing becomes possible.
- Conversely, when the second half of the branch instruction BR-INST is not used together with the first half and is itself a branch instruction, it is used as the next first half so that it is processed after the preceding branch instruction. At this time, if the first half of the next branch instruction BR-INST has already arrived, the same processing can be performed even for a “branch instruction with delay slot” whose postfix candidate valid signal BR-PV is negated. If it has not arrived, processing is started after it arrives.
- the “slot illegal exception pattern” is dedicated to the branch instruction, the branch instruction is used as the latter half pattern, and the prefix is not used for the branch instruction.
- If the code immediately after the instruction is added instead of the code immediately before it, and the decoder of the branch pipeline uses the added code when the instruction is a “slot illegal exception pattern”, the “slot illegal exception pattern” can be decoded appropriately, and useless dispatch can be avoided.
- Embodiment 3 In the second embodiment, the instruction definition based on the “slot illegal exception pattern” is limited to branch instructions only.
- In this embodiment, the instruction code type determination mechanism is changed so that, when the instruction code type is determined, the “slot illegal exception pattern” can be detected by exchanging information between the instruction code type determination results of adjacent 16-bit codes. To that end, the same code is not used in both the first half and the second half of the “slot illegal exception pattern”. Specifically, a branch instruction with a delay slot is used in the first half, and an instruction that cannot be placed in a delay slot, other than a branch instruction with a delay slot, is used in the second half.
- the first half of the pattern is handled in the same manner as the prefix, and the second half of the pattern is handled in the same manner as any of the branch instruction, load / store instruction, and operation instruction.
- the instruction type TYPE-DS as the latter half of the “slot illegal exception pattern” is classified.
- instructions can be freely assigned to the “slot illegal exception pattern” and can be handled in the same way as prefixed instructions.
- the instruction code distribution function by the dispatch circuits EX-ISD, LS-ISD, and BR-ISD at this time may be the same as in FIG.
- the instruction code type determination in the predecoder PD of the instruction fetch unit IFU of FIG. 6 is changed in the processor configuration described in the first embodiment with reference to FIGS.
- The instruction code type determination may be performed anywhere between instruction fetch and instruction search, for example on the search target instruction codes C0 to C3 of the branch instruction search dispatch circuit BR-ISD in FIG.
- However, if the determination is performed on the search target instruction codes C0 to C3, the circuit becomes complicated because their order relationship is not fixed.
- FIG. 10 is an example of the predecoder PD of FIG. 6 in the present embodiment.
- The instruction codes ICO0 to ICO3 of the four instructions fetched from the instruction cache IC by the instruction fetch unit IFU are predecoded by the predecoders PD0 to PD3 provided for the respective instruction codes. Of the six normal instruction types TYPE in FIG., the branch signals with delay slots BRD-PD0 to BRD-PD3 are output for the branch instruction with delay slot BRD, and the delay slot disabling signals DSNG-PD0 to DSNG-PD3 are output for the branch instruction BR, the load/store instruction LSX that cannot be placed in a delay slot, and the prefix PX. If a 32-bit instruction were allowed in the delay slot, the instruction issuing mechanism would become complicated, so the prefix PX is also classified as an instruction code that cannot be placed in the delay slot and is used as the second half. The instruction type TYPE determination results TYP-PD0 to TYP-PD3 are also output.
- When the branch signal with delay slot (BRD-PD0 to BRD-PD3) from the adjacent preceding instruction is asserted, the instruction type adjustment circuits TYP0 to TYP3 output the instruction type TYPE-DS, which is the instruction type that applies in a “slot illegal exception pattern”; when it is negated, they output TYP-PD0 to TYP-PD3 as they are.
- For the instruction type adjustment circuit TYP0, the adjacent preceding instruction either is covered by the branch signal with delay slot BRD-PD3 of the previous predecode result, or does not exist because the code is the first instruction code of a branch destination. Therefore, in the former case BRD-PD3 is latched in the latch BRDL, in the latter case the value of the latch BRDL is cleared, and the output of the latch BRDL is used.
- When the instruction type is the branch instruction with delay slot BRD and the delay slot disabling signal (DSNG-PD1 to DSNG-PD3) from the adjacent succeeding instruction is asserted, the instruction type adjustment circuits TYP0 to TYP2 change the instruction type to the prefix PX. The outputs of the instruction type adjustment circuits TYP0 to TYP3 are then added to the instruction codes ICO0 to ICO3 to form the predecoder outputs ICPDO0 to ICPDO3.
- the instruction type adjustment circuit TYP3 cannot determine the instruction type if the instruction type is the branch instruction with delay slot BRD because there is no signal from the adjacent succeeding instruction. For this reason, it is assumed that it is a branch instruction with delay slot BRD, is set to the instruction type, is added to the instruction code ICO3, and is used as the predecoder output ICPDO3. If the “slot illegal exception pattern” is later found, the instruction type is changed to the prefix PX by a prefix signal PX-PD3 described later. Since the predecoder output ICPDO3 is latched in the global instruction queue GIQ in FIG. 8, the latched value can be updated by the prefix signal PX-PD3.
- The prefix signal PX-PD3 is generated by the instruction type adjustment circuit TYPEX. It is asserted when the branch signal with delay slot BRD-PD3 latched in the latch BRDL and the delay slot disable signal DSNG-PD0 from the predecoder PD0 are both asserted.
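The adjustment rules for the third embodiment described above can be sketched in Python as follows. This is only an illustrative model, not the circuit of the patent: the names `Pre`, `adjust_types`, and `prefix_signal_px_pd3` are hypothetical, instruction types are modelled as plain strings, and it is assumed that the check for the type BRD is made on the type selected in the first step.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Pre:
    """Hypothetical model of one predecoder result (PD0 to PD3)."""
    typ: str     # TYP-PD: type when the code stands alone
    typ_ds: str  # TYPE-DS: type when the code is the latter half of a pattern
    brd: bool    # BRD-PD: code is a branch instruction with delay slot
    dsng: bool   # DSNG-PD: code cannot be placed in a delay slot


def adjust_types(pre: List[Pre], brdl: bool) -> Tuple[List[str], bool]:
    """Model of TYP0 to TYP3; returns adjusted types and the next BRDL value."""
    out = []
    for i, p in enumerate(pre):
        # TYP0 looks at the latch BRDL, the others at the preceding predecoder.
        prev_brd = brdl if i == 0 else pre[i - 1].brd
        t = p.typ_ds if prev_brd else p.typ
        # TYP0 to TYP2 can see DSNG from the succeeding code; TYP3 cannot.
        if t == "BRD" and i + 1 < len(pre) and pre[i + 1].dsng:
            t = "PX"
        out.append(t)
    return out, pre[-1].brd


def prefix_signal_px_pd3(brdl: bool, dsng_pd0: bool) -> bool:
    """PX-PD3: late correction of the last code of the previous group."""
    return brdl and dsng_pd0
```

In this model the last code of a group keeps the type BRD until the next group arrives, at which point `prefix_signal_px_pd3` decides whether the latched entry has to be rewritten to PX.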
- Embodiment 4. In the third embodiment, ease of implementation was given priority, and instruction encoding was restricted so that the “slot illegal exception pattern” can be detected from the 16-bit instruction code type determination results and information exchanged between adjacent codes. However, a processor that already implements a 16/32-bit mixed-length instruction set gains nothing from this restriction. Therefore, all of the approximately 144.5M “slot illegal exception patterns” should be used for new 32-bit instruction definitions without any restriction.
- In this case, the latter half of a 32-bit instruction may be the same code as the first half of a 16-bit or 32-bit instruction, and whether a given code is the latter half of a 32-bit instruction can be determined only after the instruction types of all the instructions preceding that code have been analyzed sequentially. For this reason, when the determination is made at predecode time after instruction fetch, the sequential analysis must cover all the instruction codes fetched at one time and must complete within the instruction fetch cycle. When instructions are fetched every cycle, the analysis must take one cycle or less, and when a full cycle cannot be spent on predecoding, the sequential analysis must finish within the time that is available.
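The need for sequential analysis can be illustrated with a short Python sketch. It assumes, purely for illustration, a hypothetical predicate `is_32bit_first_half` that looks at a single 16-bit code; in the patent the decision depends on the instruction types involved, so this is a simplification. The function name `mark_boundaries` and the role labels are likewise assumptions.

```python
from typing import Callable, List, Tuple


def mark_boundaries(
    codes: List[int],
    is_32bit_first_half: Callable[[int], bool],
    carry_in: bool = False,
) -> Tuple[List[str], bool]:
    """Classify each 16-bit code as 'single', 'first', or 'second'.

    carry_in is True when the previous fetch group ended with the first
    half of a 32-bit instruction. The returned flag is the carry for the
    next group.
    """
    roles = []
    pending = carry_in
    for code in codes:
        if pending:
            # A latter half is never inspected: it may look like any code.
            roles.append("second")
            pending = False
        elif is_32bit_first_half(code):
            roles.append("first")
            pending = True
        else:
            roles.append("single")
    return roles, pending
```

The role of each code depends on the state carried over from every preceding code, which is why the analysis cannot be done independently per code and has to finish within the fetch cycle.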
- FIG. 15A and FIG. 15B show configuration examples of the instruction issuing unit of the present embodiment. It does not differ substantially from the instruction issuing unit of an ordinary 16/32-bit mixed instruction set; the only addition is a function that decodes instructions using the newly defined “slot illegal exception pattern” as new 32-bit instructions.
- The configuration of FIG. 15A supplies up to 32 bits of instruction code, and supports two-instruction superscalar execution of 16-bit codes and scalar execution of a 32-bit code.
- The configuration of FIG. 15B supplies up to 48 bits of instruction code, and supports two-instruction superscalar execution in which the preceding instruction is a 16-bit code, as well as scalar execution of a 32-bit code.
- In FIG. 15A, the instruction queue aligner IQ-ALN buffers the instruction cache output ICO, outputs the leading 32 bits of code, and latches them in the instruction code latches OP0 and OP1.
- The instruction code latch OP0 output is the preceding 16-bit code or the first 16 bits of a 32-bit code.
- The instruction code latch OP1 output is the subsequent 16-bit code or the latter 16 bits of a 32-bit code.
- The instruction decoder DEC0 receives the outputs of the instruction code latches OP0 and OP1, decodes the preceding 16-bit code or the 32-bit code, and outputs the control signal DECO0 and the invalidation signal INV-DECO1 for the output of the instruction decoder DEC1. The invalidation is performed when the instruction decoded by the instruction decoder DEC0 is a 32-bit code.
- The instruction decoder DEC1 receives the output of the instruction code latch OP1 and decodes the subsequent 16-bit code. If the output of the instruction code latch OP1 is the first half of a 32-bit code, it is not decoded; instead, the instruction queue aligner IQ-ALN is controlled so that this code is supplied to the instruction decoder DEC0 as the output of the instruction code latch OP0 in the next cycle.
- The instruction queue aligner IQ-ALN advances its pointer by 0, 16, or 32 bits according to the instructions that have been successfully decoded and issued by the instruction decoders DEC0 and DEC1, and outputs the next instruction code.
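One issue cycle of the 32-bit window of FIG. 15A can be sketched as follows. The function `issue_15a` and the predicate `is_32bit_first_half` are hypothetical names used only for this illustration, and stalls (a pointer advance of 0 bits) are not modelled.

```python
from typing import Callable, List, Tuple


def issue_15a(
    op0: int,
    op1: int,
    is_32bit_first_half: Callable[[int], bool],
) -> Tuple[List[tuple], int]:
    """Return (issued instructions, pointer advance in bits) for one cycle."""
    if is_32bit_first_half(op0):
        # DEC0 decodes OP0 and OP1 as one 32-bit code and asserts
        # INV-DECO1, so the DEC1 output is invalidated.
        return [(op0, op1)], 32
    issued = [(op0,)]
    if is_32bit_first_half(op1):
        # DEC1 does not decode; OP1 is supplied again as OP0 next cycle.
        return issued, 16
    issued.append((op1,))
    return issued, 32
```

The three return paths correspond to scalar execution of a 32-bit code, issue of a single 16-bit code, and two-instruction superscalar issue of two 16-bit codes.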
- In FIG. 15B, the instruction queue aligner IQ-ALN buffers the instruction cache output ICO, outputs the leading 48 bits of code, and latches them in the instruction code latches OP0, OP1, and OPX.
- The instruction code latch OP0 output is the preceding 16-bit code or the first 16 bits of the preceding 32-bit code.
- The instruction code latch OP1 output is the subsequent 16-bit code, the first half of the subsequent 32-bit code, or the latter 16 bits of the preceding 32-bit code.
- The output of the instruction code latch OPX is the latter 16 bits of the subsequent 32-bit code.
- The instruction decoder DEC0 is the same as in the example of FIG. 15A.
- The instruction decoder DEC1 receives the outputs of the instruction code latches OP1 and OPX and decodes the subsequent 16-bit code or the subsequent 32-bit code.
- The instruction queue aligner IQ-ALN advances its pointer by 0, 16, 32, or 48 bits according to the instructions that have been successfully decoded and issued by the instruction decoders DEC0 and DEC1, and outputs the next instruction code.
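The 48-bit window of FIG. 15B differs from the 32-bit window only in that a 16-bit code can be paired with a following 32-bit code. A minimal sketch, again using the hypothetical names `issue_15b` and `is_32bit_first_half` and ignoring stalls:

```python
from typing import Callable, List, Tuple


def issue_15b(
    op0: int,
    op1: int,
    opx: int,
    is_32bit_first_half: Callable[[int], bool],
) -> Tuple[List[tuple], int]:
    """Return (issued instructions, pointer advance in bits) for one cycle."""
    if is_32bit_first_half(op0):
        # Scalar execution of a 32-bit code held in OP0 and OP1.
        return [(op0, op1)], 32
    if is_32bit_first_half(op1):
        # 16-bit code followed by a 32-bit code held in OP1 and OPX.
        return [(op0,), (op1, opx)], 48
    # Two 16-bit codes.
    return [(op0,), (op1,)], 32
```

Compared with the 32-bit window, the extra latch OPX removes the case in which a 32-bit code in the second position has to wait for the next cycle.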
- FIG. 16 shows an example in which the present embodiment, where all of the “slot illegal exception patterns” are used without restriction for defining new 32-bit instructions, is implemented in a manner close to that of the third embodiment.
- FIG. 3 illustrates how the instruction predecoder, which was as shown in FIG.
- the four instruction codes ICO0 to ICO3 fetched from the instruction cache IC by the instruction fetch unit IFU are predecoded by the predecoders PD0 to PD3 of the respective instruction codes, and the delay slot of the normal six types of instruction types TYPE in FIG. That is, the branch instruction BR, the branch instruction with delay slot BRD, the load store instruction LSX that cannot be placed in the delay slot, or the prefix PX, and the delay slot disabling signals DSNG-PD0 to DSNG-PD3 Is output.
- the instruction type TYPE determination results TYP-PD0 to TYP-PD3 are output.
- The instruction type first half TYP-FH0 to TYP-FH3, X takes one of the following values: 32-bit arithmetic instruction first half EXFH, 32-bit load/store instruction first half LSFH, 32-bit branch instruction first half BRHF, branch instruction with delay slot BRD, and other ETC.
- Other ETC covers the latter half of a 32-bit code, an instruction that cannot be placed in a delay slot, and a 16-bit single code.
- From the instruction type TYPE determination results TYP-PD0 to TYP-PD2, the instruction type first half TYP-FHX, TYP-FH0 to TYP-FH1 received from the instruction type adjustment circuits TYPX, TYP0 to TYP1 of the adjacent preceding instruction codes, and the delay slot disabling signals DSNG-PD1 to DSNG-PD3, the instruction type adjustment circuits TYP0 to TYP2 generate the instruction type first half sent to the instruction type adjusting circuits TYP1 to TYP3 of the adjacent subsequent instruction codes and the instruction type to be added to the instruction codes ICO0 to ICO2.
- For the instruction type first half sent to the succeeding circuit: if the output from the instruction type adjustment circuit TYPX, TYP0 to TYP1 of the adjacent preceding instruction code is other ETC, and the instruction type TYPE determination result TYP-PD0 to TYP-PD2 is the 32-bit operation instruction first half EXFH, the 32-bit load/store instruction first half LSFH, the 32-bit branch instruction first half BRFH, or the branch instruction with delay slot BRD, the determination result is used as is. In all other cases, other ETC is generated as the instruction type first half TYP-FH0 to TYP-FH2.
- For the instruction type added to the instruction code: if the instruction type first half TYP-FHX, TYP-FH0 to TYP-FH1 from the instruction type adjustment circuit of the adjacent preceding instruction code is the 32-bit arithmetic instruction first half EXFH, the arithmetic instruction EX is output; if it is the 32-bit load/store instruction first half LSFH, the load/store instruction LS is output; if it is the 32-bit branch instruction first half BRHF, the branch instruction BR is output; and if it is the branch instruction with delay slot BRD, the instruction type TYPE determination result TYP-PD0 to TYP-PD2 is output after conversion from TYPE to TYPE-DS according to FIG.
- Otherwise, if the instruction type TYPE determination result TYP-PD0 to TYP-PD2 is the branch instruction with delay slot BRD, the prefix code PX is output when the delay slot disabling signal DSNG-PD1 to DSNG-PD3 from the adjacent succeeding instruction is asserted, because the codes then form a “slot illegal exception pattern”, and the branch instruction BR is output when the signal is not asserted. If the determination result is the load/store instruction LSX that cannot be placed in a delay slot, the load/store instruction LS is output. In other cases, the instruction type TYPE determination result TYP-PD0 to TYP-PD2 is output as is.
- The instruction type adjustment circuit TYP3 differs from the instruction type adjustment circuits TYP0 to TYP2 in that it does not receive a signal corresponding to the delay slot disabling signals DSNG-PD1 to DSNG-PD3 from an adjacent subsequent instruction.
- Consequently, for the instruction type to be added to the instruction code ICO3, when the instruction type first half TYP-FH2 from the instruction type adjustment circuit of the adjacent preceding instruction code is other ETC and the instruction type TYPE determination result is the branch instruction with delay slot BRD, the branch instruction BR is always output. In other cases, the circuit operates in the same way as the instruction type adjustment circuits TYP0 to TYP2.
- The instruction type first half TYP-FH3 output by the instruction type adjustment circuit TYP3 is latched and output by the latch TYPL.
- The instruction type adjustment circuit TYPX receives the latch TYPL output and first outputs it as is as the instruction type first half TYP-FHX. It then takes over the processing that the instruction type adjustment circuit TYP3 could not perform because it did not receive a signal corresponding to the delay slot disabling signals DSNG-PD1 to DSNG-PD3 from an adjacent subsequent instruction: if the latch TYPL output is the branch instruction with delay slot BRD and the delay slot disabling signal DSNG-PD0 from the adjacent succeeding instruction is asserted, the prefix signal PX-PD3 is asserted; otherwise it is negated.
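The chained adjustment of FIG. 16 described above can be modelled roughly as follows. This is a sketch under stated assumptions rather than the circuit itself: type names are plain strings, the spelling BRFH is used for the 32-bit branch first half (the text spells it both BRFH and BRHF), `tds` stands for the TYPE-DS value of a code, and the function names `adjust`, `adjust_group`, and `prefix_signal` are hypothetical.

```python
from typing import List, Optional, Tuple

ETC = "ETC"
# First-half type received from the preceding code -> type of the latter half.
FIRST_HALVES = {"EXFH": "EX", "LSFH": "LS", "BRFH": "BR"}


def adjust(tpd: str, tds: str, fh_prev: str,
           dsng_next: Optional[bool]) -> Tuple[str, str]:
    """One adjustment circuit: returns (type added to the code, TYP-FH out).

    dsng_next is None for TYP3, which has no succeeding code in the group.
    """
    if fh_prev in FIRST_HALVES:
        return FIRST_HALVES[fh_prev], ETC
    if fh_prev == "BRD":
        return tds, ETC
    # The preceding output is ETC, so this code starts an instruction.
    fh_out = tpd if (tpd in FIRST_HALVES or tpd == "BRD") else ETC
    if tpd == "BRD":
        return ("PX" if dsng_next else "BR"), fh_out
    if tpd == "LSX":
        return "LS", fh_out
    return tpd, fh_out


def adjust_group(tpds: List[str], tdss: List[str], dsngs: List[bool],
                 fh_in: str = ETC) -> Tuple[List[str], str]:
    """Run the chain over one fetch group; fh_in models TYP-FHX."""
    out, fh = [], fh_in
    for i in range(len(tpds)):
        nxt = dsngs[i + 1] if i + 1 < len(tpds) else None
        t, fh = adjust(tpds[i], tdss[i], fh, nxt)
        out.append(t)
    return out, fh  # fh is what the latch TYPL would hold


def prefix_signal(typl: str, dsng_pd0: bool) -> bool:
    """PX-PD3 as generated by TYPX for the last code of the previous group."""
    return typl == "BRD" and dsng_pd0
```

Because each call to `adjust` needs the first-half type produced by the previous call, the loop in `adjust_group` is inherently serial, which mirrors the sequential analysis that this embodiment requires.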
- When the “slot illegal exception pattern” is handled as a 32-bit instruction, it may be detected by the above method and dispatched in the same manner as in a 16/32-bit mixed instruction set. This method is suitable when the instruction set has already been extended to a 16/32-bit mixed-length instruction set and sequential decoding of instructions is required. In that case there is no merit in restricting the “slot illegal exception pattern” to the above-described patterns that allow efficient superscalar instruction issue, so all of the patterns should be put to use. According to this embodiment, in a processor that already implements a 16/32-bit mixed instruction set, all of the approximately 144.5M “slot illegal exception patterns” can be used for new 32-bit instruction definitions without affecting branch performance.
- Embodiment 5. The concept of the present invention can be applied not only to patterns that generate exceptions, such as the “slot illegal exception pattern” used in each of the above embodiments, but also to combinations of two instructions that are meaningless in a program (specific combination patterns). For example, in the case of consecutive loads to the same register, the first load need not be executed unless its destination register is a source operand of the second load. Such consecutive loads to the same register are not prohibited, but the code space can be expanded by prohibiting this combination of two instructions and applying the present invention.
- The probability that the register numbers match is 1/16. Therefore, the number of usable 32-bit patterns is 15M.
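The condition for a meaningless pair of loads and the 1/16 matching probability can be checked with a short sketch. It assumes 16 general-purpose registers (a 4-bit register field), which is consistent with the stated probability but is not spelled out in this passage; the names `Load` and `first_load_is_redundant` are hypothetical.

```python
from fractions import Fraction
from typing import NamedTuple


class Load(NamedTuple):
    dest: int      # register that receives the loaded value
    addr_src: int  # register used as a source operand for the address


def first_load_is_redundant(first: Load, second: Load) -> bool:
    """True if the first load's result is overwritten without being used."""
    return first.dest == second.dest and second.addr_src != first.dest


NUM_REGS = 16  # assumption: 4-bit register number field
matching = sum(1 for a in range(NUM_REGS) for b in range(NUM_REGS) if a == b)
match_probability = Fraction(matching, NUM_REGS * NUM_REGS)
```

With uniformly distributed register numbers, `match_probability` evaluates to 1/16, so only one sixteenth of all load pairs share a destination register and become available as new code patterns.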
- Compared with the “slot illegal exception pattern”, it takes more time to check that the registers storing the execution results are the same, but many new patterns are obtained.
- The patterns obtained in this way can be detected by sequential decoding as in the fourth embodiment.
- Five instructions may be divided into two instructions and three instructions.
- The three instructions on the succeeding side are left unchanged, and the preceding instruction may be limited to the two instructions that are not used on the succeeding side.
- The resulting number of patterns is 6M.
- FIG. 17 shows an example of how the instruction predecoder, configured as shown in FIG. 10 in the third embodiment, is configured for the present embodiment when the present embodiment is implemented in a manner close to that of the third embodiment.
- The third embodiment uses the “slot illegal exception pattern”, in which the first half is a branch instruction with a delay slot and the second half is an instruction that cannot be placed in a delay slot.
- In the present embodiment, the first half candidate signals FH-PD0 to FH-PD3 and the latter half candidate signals LH-PD0 to LH-PD3 are used instead of the branch signals with delay slots BRD-PD0 to BRD-PD3 and the delay slot impossible signals DSNG-PD0 to DSNG-PD3.
- The comparison circuits CMP01, CMP12, CMP23, and CMP30 detect whether the result storage destination register numbers of adjacent codes match.
- The comparison result is added to the first half candidate signals FH-PD0 to FH-PD3 sent from the preceding instruction code to the succeeding instruction code, and to the latter half candidate signals LH-PD0 to LH-PD3 sent from the succeeding instruction code to the preceding instruction code.
- From these signals, the instruction type adjustment circuits TYP0 to TYP3, X detect the two-instruction combination of this embodiment. The instruction code type is converted to the prefix PX for the first half and to the type of the “meaningless combination of two instructions” for the second half, and is added to the instruction codes ICO0 to ICO3.
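The pairwise detection performed with the candidate signals and the comparison circuits can be sketched as below. The sketch assumes that first-half and latter-half candidates are disjoint sets of codes, so a code is never both halves at once; the names `Cand` and `detect_pairs` and the label `"SECOND"` are hypothetical, and `prev_last` models the last code of the previous group as seen by CMP30.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cand:
    fh: bool   # FH-PD: code may be the first half of a pair
    lh: bool   # LH-PD: code may be the latter half of a pair
    dest: int  # result storage destination register number


def detect_pairs(
    group: List[Cand], prev_last: Optional[Cand] = None
) -> Tuple[List[Optional[str]], bool]:
    """Label codes of one fetch group.

    Returns the labels and a flag saying whether the last code of the
    previous group must be turned into a prefix (analogous to PX-PD3).
    """
    labels: List[Optional[str]] = [None] * len(group)
    # CMP30-style comparison across the group boundary.
    prev_is_prefix = (
        prev_last is not None
        and prev_last.fh
        and group[0].lh
        and prev_last.dest == group[0].dest
    )
    if prev_is_prefix:
        labels[0] = "SECOND"
    # CMP01, CMP12, CMP23: comparisons inside the group.
    for i in range(len(group) - 1):
        a, b = group[i], group[i + 1]
        if a.fh and b.lh and a.dest == b.dest:
            labels[i] = "PX"
            labels[i + 1] = "SECOND"
    return labels, prev_is_prefix
```

All comparisons inside the group are independent of each other, which is what allows this variant to be evaluated in parallel in the manner of the third embodiment.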
- FIG. 18 shows an example of how the instruction predecoder, configured as shown in FIG. 16 in the fourth embodiment, is configured for the present embodiment when the present embodiment is implemented in a manner close to that of FIG. 16 of the fourth embodiment.
- The latter half candidate signals LH-PD0 to LH-PD3 are used in place of the delay slot disabling signals DSNG-PD0 to DSNG-PD3. The comparison circuits CMP01, CMP12, CMP23, and CMP30 detect whether the result storage destination register numbers of adjacent codes match, and the result is added to the latter half candidate signals LH-PD0 to LH-PD3 before they are sent. From these signals, the instruction type adjusting circuits TYP0 to TYP3, X generate the instruction type first half TYP-FH0 to TYP-FH3, X in the same manner as in the fourth embodiment, and append the instruction code type to the instruction codes ICO0 to ICO3.
- The instruction code type is the prefix PX for the first half and the “combination of two instructions that is meaningless in a program” for the second half, and is added to the instruction codes ICO0 to ICO3.
- The instruction code distribution function of the dispatch circuits EX-ISD, LS-ISD, and BR-ISD may be the same as in FIG.
- The concept of the present invention can be applied not only to patterns that generate an exception, such as the “slot illegal exception pattern”, but also to combinations of a plurality of instructions, such as two instructions, that are meaningless in a program.
- It can also be extended to combinations of three or more meaningless instructions, such as three consecutive loads to the same register.
- 15M patterns can be parallel-decoded and implemented in the same manner as in the third embodiment, which requires code sequentially.
- All 6M patterns can be used for the definition of new 32-bit instructions, which can contribute to improved performance and efficiency through instruction extension.
- The present invention can also be applied to a processor having a 32-bit fixed-length instruction set or a 32/64-bit mixed instruction set.
- The present invention relates to a technique for adding new instruction codes of a multiple length to an instruction set, for example in a processor with a 16-bit fixed-length instruction set or a 16/32-bit mixed instruction set. In particular, it can be widely applied to processors that have delay slot instructions and generate a slot illegal exception.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
- Executing Machine-Instructions (AREA)
Abstract
Description
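First, an outline is given of representative embodiments of the invention disclosed in the present application. Reference signs in the drawings that are cited in parentheses in this outline merely illustrate what is included in the concept of the components to which they are attached.
A data processor (MPU) according to a representative embodiment of the present invention has a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially accumulates a plurality of instruction codes fetched in parallel, and dispatch circuits (EX-ISD, LS-ISD, BR-ISD) that search the plurality of instruction codes output from the global instruction queue for each instruction code type and distribute the instruction codes to the instruction pipelines on the basis of the search results. This data processor has an instruction set in which a prohibited combination pattern, formed by a specific combination of a plurality of instruction codes for which the original processing of the individual instruction codes is prohibited, is additionally defined as a separate instruction.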
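In the data processor of item 1, the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is limited to the same instruction type as the instruction code defined by the latter-half instruction code pattern of that combination pattern alone.
In the data processor of item 2, the first-half and latter-half instruction code patterns in the prohibited combination pattern of the specific plurality of instruction codes are different instruction codes.
In the data processor of item 3, when the dispatch circuit detects an instruction code of the target instruction code type within a search unit of the plurality of instruction codes to be searched, it outputs the detected instruction code as valid and outputs the immediately preceding instruction code as a prefix code candidate. When it detects an instruction code of the target instruction code type at the head of the search unit, it outputs that head instruction code as valid. When it cannot detect an instruction code of the target instruction code type up to the tail of the search unit, it outputs the tail instruction code as a prefix code candidate.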
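The search rule of the dispatch circuit in item 3 (valid code plus prefix code candidate) can be sketched as follows. The function name `search_unit` and the representation of a search unit as a list of type strings are assumptions made for this illustration.

```python
from typing import List, Optional, Tuple


def search_unit(
    types: List[str], target: str
) -> Tuple[Optional[int], Optional[int]]:
    """Return (index of the valid code, index of the prefix code candidate).

    Either element is None when the dispatch circuit outputs nothing
    for that role.
    """
    for i, t in enumerate(types):
        if t == target:
            # A hit at the head has no preceding code inside the unit.
            return i, (i - 1 if i > 0 else None)
    # No hit: the tail code may be the prefix of the next unit's head.
    return None, len(types) - 1
```

When nothing is found, the tail code is still forwarded as a prefix code candidate so that a pattern straddling two search units can be completed by the pipeline in the following search.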
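In the data processor of item 4, when the instruction code supplied as a prefix code candidate forms the specific combination of a plurality of instruction codes, the instruction pipeline processes that combination as the additionally defined instruction; when the instruction code supplied as a prefix code candidate does not form the specific combination, the instruction pipeline ignores it.
In the data processor of item 5, the instruction pipeline uses the tail instruction code supplied as a prefix code candidate as the prefix code candidate for forming the specific combination with the instruction code that is detected at the head of the immediately following instruction code search and supplied.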
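In the data processor of item 2, the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is dedicated to branch instructions, and the instruction code of a branch instruction is used for the latter-half instruction code pattern of the prohibited combination pattern.
In the data processor of item 7, in the prohibited combination pattern of the specific plurality of instruction codes separately defined by the combination, the first-half instruction code pattern is a branch instruction with a delay slot, and the latter-half instruction code pattern is a branch instruction, other than a branch instruction with a delay slot, that cannot be placed in a delay slot.
In the data processor of item 7, when the dispatch circuit detects an instruction code of the target instruction code type at a position other than the last in the search unit of the plurality of instruction codes to be searched, it outputs the detected instruction code as valid and outputs the immediately following instruction code as a postfix code candidate. When it detects an instruction code of the target instruction code type at the last position in the search unit, it outputs that last instruction code as valid.
In the data processor of item 9, when the instruction code supplied as a postfix code candidate forms the specific combination of a plurality of instruction codes, the instruction pipeline processes that combination as the additionally defined instruction; when the instruction code supplied as a postfix code candidate does not form the specific combination, the instruction pipeline ignores it.
In the data processor of item 10, for the last instruction code supplied from the dispatch circuit, the instruction pipeline uses the instruction code that is detected at the head of the immediately following instruction code search and supplied as the postfix code candidate for forming the specific combination of a plurality of instruction codes.
In the data processor of item 1, an instruction code pattern different from the first-half instruction code pattern is used for the latter-half instruction code pattern in the prohibited combination pattern of the specific plurality of instruction codes, and the dispatch circuit has a predecoder in its preceding stage. The predecoder determines the instruction code type of each instruction code, exchanges instruction code type information between adjacent instruction codes, and supplies the dispatch circuit with information for determining whether an instruction code is of an instruction type that forms the prohibited combination pattern. By using this information, the dispatch circuit determines the instruction pipeline to which the instruction of the prohibited combination pattern is supplied (even when the instruction type of the instruction separately defined by the prohibited combination pattern differs from the instruction type given by the latter-half instruction code pattern of that pattern).
In the data processor of item 1, the instruction codes included in the instruction set are mixed instruction codes in which basic-length instruction codes and instruction codes of twice the basic length coexist. The dispatch circuit supplies basic-length instruction codes to the corresponding instruction pipeline in basic-length units, and supplies double-length instruction codes to the corresponding instruction pipeline in double-length units. Here, the instruction code of the prohibited combination pattern is defined as a double-length instruction code.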
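A data processor (MPU) according to yet another embodiment of the present invention has a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially accumulates a plurality of instruction codes fetched in parallel, and dispatch circuits (EX-ISD, LS-ISD, BR-ISD) that search the plurality of instruction codes output from the global instruction queue for each instruction code type and distribute the instruction codes to the instruction pipelines on the basis of the search results. This data processor has an instruction set in which a prohibited combination pattern, formed by a combination of a plurality of instruction codes whose combination is not inherently prohibited but is meaningless, is additionally defined as a separate instruction.
A data processor (MPU) according to yet another embodiment of the present invention has a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially accumulates a plurality of instruction codes fetched in parallel, and dispatch circuits (EX-ISD, LS-ISD, BR-ISD) that search the plurality of instruction codes output from the global instruction queue for each instruction code type and distribute the instruction codes to the instruction pipelines on the basis of the search results. The instruction pipelines further process a specific combination of a plurality of instruction codes, for which the original processing of the individual instruction codes is prohibited, as a single separate instruction code. The dispatch circuits search for the specific combination of a plurality of instruction codes and supply it to the corresponding instruction pipeline.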
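In the data processor of item 15, the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is limited to the same instruction type as the instruction code defined by the latter-half instruction code pattern of that combination pattern alone.
In the data processor of item 16, the first-half and latter-half instruction code patterns in the prohibited combination pattern of the specific plurality of instruction codes are different instruction codes.
In the data processor of item 16, the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is dedicated to branch instructions, and the instruction code of a branch instruction is used for the latter-half instruction code pattern of the prohibited combination pattern.
In the data processor of item 18, in the prohibited combination pattern of the specific plurality of instruction codes separately defined by the combination, the first-half instruction code pattern is a branch instruction with a delay slot, and the latter-half instruction code pattern is a branch instruction, other than a branch instruction with a delay slot, that cannot be placed in a delay slot.
In the data processor of item 15, an instruction code pattern different from the first-half instruction code pattern is used for the latter-half instruction code pattern in the prohibited combination pattern of the specific plurality of instruction codes separately defined by the combination. The dispatch circuit has a predecoder in its preceding stage. The predecoder determines the instruction code type of each instruction code, exchanges instruction code type information between adjacent instruction codes, and supplies the dispatch circuit with information for determining whether an instruction code is of an instruction type that forms the prohibited combination pattern. By using this information, the dispatch circuit determines the instruction pipeline to which the instruction of the prohibited combination pattern is supplied (even when the instruction type of the instruction separately defined by the prohibited combination pattern differs from the instruction type given by the latter-half instruction code pattern of that pattern).
In the data processor of item 15, the instruction codes included in the instruction set are mixed instruction codes in which basic-length instruction codes and instruction codes of twice the basic length coexist. The dispatch circuit supplies basic-length instruction codes to the corresponding instruction pipeline in basic-length units, and supplies double-length instruction codes to the corresponding instruction pipeline in double-length units. The instruction code of the prohibited combination pattern is defined as a double-length instruction code.
A data processor (MPU) according to yet another embodiment of the present invention has a plurality of instruction pipelines (EXPL, LSPL, BRPL), a global instruction queue (GIQ) that sequentially accumulates a plurality of instruction codes fetched in parallel, and dispatch circuits (EX-ISD, LS-ISD, BR-ISD) that search the plurality of instruction codes output from the global instruction queue for each instruction code type and distribute the instruction codes to the instruction pipelines on the basis of the search results. The instruction pipelines further process a specific combination of a plurality of instruction codes, whose combination is not inherently prohibited but is meaningless, as a single separate instruction code, and the dispatch circuits search for the specific combination of a plurality of instruction codes and supply it to the corresponding instruction pipeline.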
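The embodiments are described in further detail.
Consider an example in which the “slot illegal exception pattern” comprises the four instructions of FIG. 1 as the “branch instructions with delay slot” and the eight instructions of FIG. 1 and FIG. 2 as the “instructions that cannot be placed in a delay slot”. Instruction (1) of FIG. 1 is a program counter (hereinafter PC) relative unconditional branch instruction with a delay slot, instruction (2) is a PC-relative subroutine call instruction with a delay slot, and instructions (3) and (4) are PC-relative conditional branch instructions with a delay slot that branch when the condition flag is set and when it is not set, respectively. Instructions (5) and (6) of FIG. 2 are PC-relative load instructions that load 16-bit and 32-bit data, respectively. Instructions (7) and (8) are PC-relative conditional branch instructions that branch when the condition flag is set and when it is not set, respectively. The operand field is 12 bits for instructions (1), (2), (5), and (6), and 8 bits for instructions (3), (4), (7), and (8). In FIG. 1 and FIG. 2, “disp8” denotes an 8-bit displacement, “disp12” a 12-bit displacement, PC the program counter value, and Rn the general-purpose register with number #n.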
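The operand-field widths of the FIG. 1 and FIG. 2 instructions given above appear to account for the figure of approximately 144.5M patterns quoted elsewhere in the description. The following check assumes that each instruction contributes 2 to the power of its operand-field width distinct 16-bit codes and that “M” means 2^20; both are assumptions, and the variable names are hypothetical.

```python
# Operand-field widths in bits for the instructions in the example.
FIRST_HALF_BITS = [12, 12, 8, 8]                  # instructions (1)-(4)
SECOND_HALF_BITS = [12, 12, 8, 8, 12, 12, 8, 8]   # instructions (1)-(8)


def code_count(bit_widths):
    """Number of distinct 16-bit codes covered by the given instructions."""
    return sum(2 ** bits for bits in bit_widths)


first_half_codes = code_count(FIRST_HALF_BITS)
second_half_codes = code_count(SECOND_HALF_BITS)
patterns = first_half_codes * second_half_codes
patterns_in_m = patterns / 2 ** 20
```

Under these assumptions there are 8,704 first-half codes and 17,408 latter-half codes, giving 151,519,232 two-code patterns, which is 144.5 in units of 2^20.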
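The instruction control scheme described in Embodiment 1, which exploits the “slot illegal exception pattern”, does not require any change to the dispatch mechanism, but there are points to consider. First, when the instruction code is a “slot illegal exception pattern”, the branch instruction with delay slot in the first half of the pattern is dispatched to the branch pipeline as a branch instruction, but because the latter half of the pattern does not accompany it, it cannot be executed even if the pattern defines a branch instruction, and a wasted dispatch occurs. Second, because the latter half of the pattern is regarded as an instruction and dispatched with the first half attached as its prefix candidate, a constraint is needed that the latter half of the pattern and the “slot illegal exception pattern” be codes of instructions executed in the same execution pipeline.
In Embodiment 2, instruction definition by the “slot illegal exception pattern” was restricted to branch instructions. In this embodiment, to use the “slot illegal exception pattern” for instructions other than branch instructions as well, the instruction code type determination mechanism is changed so that the pattern can be detected at type determination time from the 16-bit-unit instruction code type determination results and information exchanged between adjacent codes. That is, the same code is not used in the first half and the latter half of the “slot illegal exception pattern”. Specifically, branch instructions with delay slot are used for the first half, and instructions that cannot be placed in a delay slot, other than branch instructions with delay slot, are used for the latter half. The first half of the pattern is handled in the same way as a prefix, and the latter half is handled in the same way as a branch instruction, a load/store instruction, or an arithmetic instruction. In the classification of FIG. 4, the code is classified by the instruction type TYPE-DS as the latter half of a “slot illegal exception pattern”. As a result, instructions can be freely assigned to the “slot illegal exception pattern” and handled in the same way as prefixed instructions. The instruction code distribution function of the dispatch circuits EX-ISD, LS-ISD, and BR-ISD in this case may be the same as in FIG. 12.
In Embodiment 3, ease of implementation was given priority, and instruction encoding was restricted so that the “slot illegal exception pattern” can be detected from the 16-bit-unit instruction code type determination results and information exchanged between adjacent codes. However, a processor that already implements a 16/32-bit mixed-length instruction set gains nothing from this restriction. Therefore, all of the approximately 144.5M “slot illegal exception patterns” should be used for new 32-bit instruction definitions without restriction.
The concept of the present invention can be applied not only to patterns that generate exceptions, such as the “slot illegal exception pattern” used in each of the above embodiments, but also to combinations of two instructions that are meaningless in a program (specific combination patterns). For example, in the case of consecutive loads to the same register, the first load need not be executed unless its destination register is a source operand of the second load. Such consecutive loads to the same register are not prohibited, but the code space can be expanded by prohibiting this combination of two instructions and applying the present invention.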
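CPU Processor core
IBUS Internal bus
EIF External interface circuit
PER Built-in peripheral module
IC Instruction cache
IFU Instruction fetch unit
PD Predecoder
GIQ Global instruction queue
BRC Branch control unit
DC Data cache
LSU Load/store unit
LSIQ Load/store instruction queue
LSID Load/store instruction decoder
LSE Load/store instruction execution unit
EXIQ Execution instruction queue
EXID Arithmetic instruction decoder
EXE Arithmetic instruction execution unit
BIU Bus interface unit
IC1, IC2 Instruction cache access
GIB Global instruction buffer
EXPL Pipeline for arithmetic instructions
EXIB Local instruction buffer
EXRR Local register read
EX Arithmetic operation
WB Register write-back
LSPL Pipeline for load/store instructions
LSIB Local instruction buffer
LSRR Local register read
LSA Address calculation
DC1, DC2 Data cache access
BRPL Pipeline for branch instructions
BR Branch
IFU Instruction fetch unit
GIQ0 to GIQ15 Instruction queue entries for 16 instructions
GIQP Global instruction queue pointer
GIQP-DEC Global instruction queue pointer decoder
EXP Arithmetic instruction pointer
LSP Load/store instruction pointer
BRP Branch instruction pointer
EX-ISD Arithmetic instruction search dispatch circuit
LS-ISD Load/store instruction search dispatch circuit
BR-ISD Branch instruction search dispatch circuit
IREQ-GEN Instruction fetch request generator
ICOV Instruction cache output valid signal
GIQU0 to GIQU3 Global instruction queue update signals
GIQ0 to GIQ3, GIQ4 to GIQ7, GIQ8 to GIQ11, GIQ12 to GIQ15 Global instruction queue entry groups
GIQO0 to GIQO15 Global instruction queue outputs
BR-IV Branch instruction code valid signal
BR-INST Branch instruction
BR-PV Branch prefix candidate valid signal
BR-IV Branch instruction code valid signal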
Claims (22)
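- A data processor having a plurality of instruction pipelines, comprising:
a global instruction queue that sequentially accumulates a plurality of instruction codes fetched in parallel; and
a dispatch circuit that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines on the basis of the search results,
wherein the data processor has an instruction set in which a prohibited combination pattern, formed by a specific combination of a plurality of instruction codes for which the original processing of the individual instruction codes is prohibited, is additionally defined as a separate instruction.
- The data processor according to claim 1, wherein the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is limited to the same instruction type as the instruction code defined by the latter-half instruction code pattern of that combination pattern alone.
- The data processor according to claim 2, wherein the first-half and latter-half instruction code patterns in the prohibited combination pattern of the specific plurality of instruction codes are different instruction codes.
- The data processor according to claim 3, wherein, when the dispatch circuit detects an instruction code of the target instruction code type within a search unit of the plurality of instruction codes to be searched, it outputs the detected instruction code as valid and outputs the immediately preceding instruction code as a prefix code candidate; when it detects an instruction code of the target instruction code type at the head of the search unit, it outputs that head instruction code as valid; and when it cannot detect an instruction code of the target instruction code type up to the tail of the search unit, it outputs the tail instruction code as a prefix code candidate.
- The data processor according to claim 4, wherein, when the instruction code supplied as a prefix code candidate forms the specific combination of a plurality of instruction codes, the instruction pipeline processes that combination as the additionally defined instruction, and when the instruction code supplied as a prefix code candidate does not form the specific combination, the instruction pipeline ignores it.
- The data processor according to claim 5, wherein the instruction pipeline uses the tail instruction code supplied as a prefix code candidate as the prefix code candidate for forming the specific combination with the instruction code that is detected at the head of the immediately following instruction code search and supplied.
- The data processor according to claim 2, wherein the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is dedicated to branch instructions, and the instruction code of a branch instruction is used for the latter-half instruction code pattern of the prohibited combination pattern.
- The data processor according to claim 7, wherein, in the prohibited combination pattern of the specific plurality of instruction codes separately defined by the combination, the first-half instruction code pattern is a branch instruction with a delay slot, and the latter-half instruction code pattern is a branch instruction, other than a branch instruction with a delay slot, that cannot be placed in a delay slot.
- The data processor according to claim 7, wherein, when the dispatch circuit detects an instruction code of the target instruction code type at a position other than the last in the search unit of the plurality of instruction codes to be searched, it outputs the detected instruction code as valid and outputs the immediately following instruction code as a postfix code candidate, and when it detects an instruction code of the target instruction code type at the last position in the search unit, it outputs that last instruction code as valid.
- The data processor according to claim 9, wherein, when the instruction code supplied as a postfix code candidate forms the specific combination of a plurality of instruction codes, the instruction pipeline processes that combination as the additionally defined instruction, and when the instruction code supplied as a postfix code candidate does not form the specific combination, the instruction pipeline ignores it.
- The data processor according to claim 10, wherein, for the last instruction code supplied from the dispatch circuit, the instruction pipeline uses the instruction code that is detected at the head of the immediately following instruction code search and supplied as the postfix code candidate for forming the specific combination of a plurality of instruction codes.
- The data processor according to claim 1, wherein an instruction code pattern different from the first-half instruction code pattern is used for the latter-half instruction code pattern in the prohibited combination pattern of the specific plurality of instruction codes,
the dispatch circuit has a predecoder in its preceding stage,
the predecoder determines the instruction code type of each instruction code, exchanges instruction code type information between adjacent instruction codes, and supplies the dispatch circuit with information for determining whether an instruction code is of an instruction type that forms the prohibited combination pattern, and
the dispatch circuit, by using the information for determining, determines the instruction pipeline to which the instruction of the prohibited combination pattern is supplied (even when the instruction type of the instruction separately defined by the prohibited combination pattern differs from the instruction type given by the latter-half instruction code pattern of that pattern).
- The data processor according to claim 1, wherein the instruction codes included in the instruction set are mixed instruction codes in which basic-length instruction codes and instruction codes of twice the basic length coexist,
the dispatch circuit supplies basic-length instruction codes to the corresponding instruction pipeline in basic-length units and supplies double-length instruction codes to the corresponding instruction pipeline in double-length units, and
the instruction code of the prohibited combination pattern is defined as a double-length instruction code.
- A data processor having a plurality of instruction pipelines, comprising:
a global instruction queue that sequentially accumulates a plurality of instruction codes fetched in parallel; and
a dispatch circuit that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines on the basis of the search results,
wherein the data processor has an instruction set in which a prohibited combination pattern, formed by a combination of a plurality of instruction codes whose combination is not inherently prohibited but is meaningless, is additionally defined as a separate instruction.
- A data processor having a plurality of instruction pipelines, comprising:
a global instruction queue that sequentially accumulates a plurality of instruction codes fetched in parallel; and
a dispatch circuit that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines on the basis of the search results,
wherein the instruction pipelines further process a specific combination of a plurality of instruction codes, for which the original processing of the individual instruction codes is prohibited, as a single separate instruction code, and
the dispatch circuit searches for the specific combination of a plurality of instruction codes and supplies it to the corresponding instruction pipeline.
- The data processor according to claim 15, wherein the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is limited to the same instruction type as the instruction code defined by the latter-half instruction code pattern of that combination pattern alone.
- The data processor according to claim 16, wherein the first-half and latter-half instruction code patterns in the prohibited combination pattern of the specific plurality of instruction codes are different instruction codes.
- The data processor according to claim 16, wherein the instruction additionally defined by the prohibited combination pattern of the specific plurality of instruction codes is dedicated to branch instructions, and the instruction code of a branch instruction is used for the latter-half instruction code pattern of the prohibited combination pattern.
- The data processor according to claim 18, wherein, in the prohibited combination pattern of the specific plurality of instruction codes separately defined by the combination, the first-half instruction code pattern is a branch instruction with a delay slot, and the latter-half instruction code pattern is a branch instruction, other than a branch instruction with a delay slot, that cannot be placed in a delay slot.
- The data processor according to claim 15, wherein an instruction code pattern different from the first-half instruction code pattern is used for the latter-half instruction code pattern in the prohibited combination pattern of the specific plurality of instruction codes separately defined by the combination,
the dispatch circuit has a predecoder in its preceding stage,
the predecoder determines the instruction code type of each instruction code, exchanges instruction code type information between adjacent instruction codes, and supplies the dispatch circuit with information for determining whether an instruction code is of an instruction type that forms the prohibited combination pattern, and
the dispatch circuit, by using the information for determining, determines the instruction pipeline to which the instruction of the prohibited combination pattern is supplied (even when the instruction type of the instruction separately defined by the prohibited combination pattern differs from the instruction type given by the latter-half instruction code pattern of that pattern).
- The data processor according to claim 15, wherein the instruction codes included in the instruction set are mixed instruction codes in which basic-length instruction codes and instruction codes of twice the basic length coexist,
the dispatch circuit supplies basic-length instruction codes to the corresponding instruction pipeline in basic-length units and supplies double-length instruction codes to the corresponding instruction pipeline in double-length units, and
the instruction code of the prohibited combination pattern is defined as a double-length instruction code.
- A data processor having a plurality of instruction pipelines, comprising:
a global instruction queue that sequentially accumulates a plurality of instruction codes fetched in parallel; and
a dispatch circuit that searches the plurality of instruction codes output from the global instruction queue for each instruction code type and distributes the instruction codes to the instruction pipelines on the basis of the search results,
wherein the instruction pipelines further process a specific combination of a plurality of instruction codes, whose combination is not inherently prohibited but is meaningless, as a single separate instruction code, and
the dispatch circuit searches for the specific combination of a plurality of instruction codes and supplies it to the corresponding instruction pipeline.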
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/113,058 US9910674B2 (en) | 2011-04-21 | 2012-04-10 | Data processor with extended instruction code space including a prohibition combination pattern as a separate instruction |
JP2013510955A JP5658358B2 (ja) | 2011-04-21 | 2012-04-10 | Data processor |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011-094800 | 2011-04-21 | ||
JP2011094800 | 2011-04-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012144374A1 true WO2012144374A1 (ja) | 2012-10-26 |
Family
ID=47041486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/059757 WO2012144374A1 (ja) | 2011-04-21 | 2012-04-10 | Data processor |
Country Status (3)
Country | Link |
---|---|
US (1) | US9910674B2 (ja) |
JP (1) | JP5658358B2 (ja) |
WO (1) | WO2012144374A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014208054A1 (en) * | 2013-06-28 | 2014-12-31 | International Business Machines Corporation | Optimization of instruction groups across group boundaries |
US9348596B2 (en) | 2013-06-28 | 2016-05-24 | International Business Machines Corporation | Forming instruction groups based on decode time instruction optimization |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180211046A1 (en) * | 2017-01-26 | 2018-07-26 | Intel Corporation | Analysis and control of code flow and data flow |
US11119777B1 (en) * | 2020-04-22 | 2021-09-14 | International Business Machines Corporation | Extended prefix including routing bit for extended instruction format |
CN112199158B (zh) * | 2020-10-16 | 2021-11-23 | 常熟理工学院 | 虚拟机保护的解释例程识别方法、装置、设备及存储介质 |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0895780A (ja) * | 1994-09-20 | 1996-04-12 | Nec Corp | Instruction code encoding scheme |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4439828A (en) * | 1981-07-27 | 1984-03-27 | International Business Machines Corp. | Instruction substitution mechanism in an instruction handling unit of a data processing system |
JP3570287B2 (ja) | 1999-03-31 | 2004-09-29 | セイコーエプソン株式会社 | Microcomputer |
JP3627725B2 (ja) | 2002-06-24 | 2005-03-09 | セイコーエプソン株式会社 | Information processing device and electronic apparatus |
JP5357475B2 (ja) | 2008-09-09 | 2013-12-04 | ルネサスエレクトロニクス株式会社 | Data processor |
-
2012
- 2012-04-10 JP JP2013510955A patent/JP5658358B2/ja active Active
- 2012-04-10 US US14/113,058 patent/US9910674B2/en active Active
- 2012-04-10 WO PCT/JP2012/059757 patent/WO2012144374A1/ja active Application Filing
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014208054A1 (en) * | 2013-06-28 | 2014-12-31 | International Business Machines Corporation | Optimization of instruction groups across group boundaries |
GB2530454A (en) * | 2013-06-28 | 2016-03-23 | Global Foundries Inc | Optimization of instruction groups across group boundaries |
US9348596B2 (en) | 2013-06-28 | 2016-05-24 | International Business Machines Corporation | Forming instruction groups based on decode time instruction optimization |
US9361108B2 (en) | 2013-06-28 | 2016-06-07 | International Business Machines Corporation | Forming instruction groups based on decode time instruction optimization |
US9372695B2 (en) | 2013-06-28 | 2016-06-21 | Globalfoundries Inc. | Optimization of instruction groups across group boundaries |
US9477474B2 (en) | 2013-06-28 | 2016-10-25 | Globalfoundries Inc. | Optimization of instruction groups across group boundaries |
US9678757B2 (en) | 2013-06-28 | 2017-06-13 | International Business Machines Corporation | Forming instruction groups based on decode time instruction optimization |
US9678756B2 (en) | 2013-06-28 | 2017-06-13 | International Business Machines Corporation | Forming instruction groups based on decode time instruction optimization |
GB2530454B (en) * | 2013-06-28 | 2020-05-06 | Global Foundries Inc | Optimization of instruction groups across group boundaries |
Also Published As
Publication number | Publication date |
---|---|
JP5658358B2 (ja) | 2015-01-21 |
JPWO2012144374A1 (ja) | 2014-07-28 |
US20140040600A1 (en) | 2014-02-06 |
US9910674B2 (en) | 2018-03-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5357475B2 (ja) | | Data processor |
JP6043374B2 (ja) | | Method and apparatus for implementing a dynamic out-of-order processor pipeline |
EP0996057B1 (en) | | Data processor with an instruction unit having a cache and a ROM |
US20030079114A1 (en) | | Processor, compiling apparatus, and compile program recorded on a recording medium |
JP3714999B2 (ja) | | Apparatus and method for scanning an instruction queue |
JP2816248B2 (ja) | | Data processor |
US20030154358A1 (en) | | Apparatus and method for dispatching very long instruction word having variable length |
US7454598B2 (en) | | Controlling out of order execution pipelines issue tagging |
JP4230504B2 (ja) | | Data processor |
JP2002333978A (ja) | | VLIW processor |
US20060259742A1 (en) | | Controlling out of order execution pipelines using pipeline skew parameters |
JP5658358B2 (ja) | | Data processor |
US9021236B2 (en) | | Methods and apparatus for storing expanded width instructions in a VLIW memory for deferred execution |
JP2006313422A (ja) | | Arithmetic processing device and method for executing data transfer processing |
JP2009099097A (ja) | | Data processing device |
US9841978B2 (en) | | Processor with a program counter increment based on decoding of predecode bits |
US7725690B2 (en) | | Distributed dispatch with concurrent, out-of-order dispatch |
US9170638B2 (en) | | Method and apparatus for providing early bypass detection to reduce power consumption while reading register files of a processor |
JP2815236B2 (ja) | | Instruction dispatch method for a superscalar microprocessor and method for checking register conflicts |
US20020116599A1 (en) | | Data processing apparatus |
US6119220A (en) | | Method of and apparatus for supplying multiple instruction strings whose addresses are discontinued by branch instructions |
JP7409208B2 (ja) | | Arithmetic processing device |
JP5657760B2 (ja) | | Data processor |
US20040128482A1 (en) | | Eliminating register reads and writes in a scheduled instruction cache |
JP2004152049A (ja) | | Data processing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 12774867; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2013510955; Country of ref document: JP; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWE | WIPO information: entry into national phase | Ref document number: 14113058; Country of ref document: US |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 12774867; Country of ref document: EP; Kind code of ref document: A1 |