WO2016029444A1 - Method for instruction execution and processor - Google Patents
- Publication number
- WO2016029444A1 (PCT/CN2014/085555)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- instruction
- token
- execution unit
- instruction execution
- unit
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4063—Device-to-bus coupling
- G06F13/4068—Electrical coupling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
- G06F9/3016—Decoding the operand specifier, e.g. specifier format
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3854—Instruction completion, e.g. retiring, committing or graduating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
- G06F9/3871—Asynchronous instruction pipeline, e.g. using handshake signals between stages
Definitions
- Embodiments of the present invention relate to the field of computers and, more particularly, to a method of executing instructions and a processor.
Background
- The system is clock-based, or synchronous. Such systems are built from subsystems that use flip-flops (or registers) to store state, and every change from one state to another depends on a global clock signal. The state held in a flip-flop is updated at an edge of the clock signal. The performance of a synchronous processor is limited by its worst case. The usual way to raise the processing power of a synchronous processor is to increase the frequency of the global clock, but a higher frequency results in higher power consumption.
- The instructions of an asynchronous processor can be executed at the fastest possible speed, and the power consumption of an asynchronous processor is significantly lower.
- Embodiments of the present invention provide a method and a processor for executing an instruction that achieve high processing efficiency.
- A method of instruction execution is provided. The method is performed by an execution unit (EU) in a processor; the EU acquires a first instruction from an instruction queue of a bus interface unit (BIU), executes the first instruction, and writes a result of executing the first instruction to a register set. The EU includes a token management unit and at least two instruction execution units, the at least two instruction execution units including a first instruction execution unit and a second instruction execution unit. The method includes:
- the token management unit sends a first read token to the first instruction execution unit, where the first read token instructs the first instruction execution unit to read an operand of the first instruction to be executed;
- the first instruction execution unit reads the operand of the first instruction according to the indication of the first read token;
- the first instruction execution unit releases the first read token after the process of reading the operand of the first instruction starts;
- after determining that the first read token has been released, the token management unit sends a second read token to the second instruction execution unit, where the second read token instructs the second instruction execution unit to read an operand of a second instruction to be executed, the second instruction being the instruction to be executed adjacent to the first instruction in the instruction queue;
- the first instruction execution unit executes the first instruction according to the operand of the first instruction;
- the first instruction execution unit writes a result of executing the first instruction to the register set.
- Writing, by the first instruction execution unit, the result of executing the first instruction to the register set includes:
- the first instruction execution unit receives a first write token sent by the token management unit, where the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register set;
- after executing the first instruction, the first instruction execution unit writes the result of executing the first instruction to the register set according to the indication of the first write token;
- the first instruction execution unit releases the first write token after the process of writing the result of the first instruction to the register set starts, so that the token management unit, after determining that the first write token has been released, sends a second write token to the second instruction execution unit, where the second write token instructs the second instruction execution unit to write the result of executing the second instruction to the register set.
- Releasing, by the first instruction execution unit, the first write token after the process of writing the result of the first instruction to the register set starts includes:
- at a first moment before the completion time of the process of writing the result of the first instruction to the register set, the first instruction execution unit releases the first write token and sends a first write token release message to the token management unit, where the duration between the first moment and the completion time is equal to a preset first threshold, and the first threshold is smaller than the duration of the process of writing the result of the first instruction to the register set;
- sending the second write token to the second instruction execution unit includes:
- the token management unit sends the second write token to the second instruction execution unit upon receiving the first write token release message.
- Releasing, by the first instruction execution unit, the first read token after the process of reading the operand of the first instruction starts includes:
- at a second moment before the end time of the process of reading the operand of the first instruction, the first instruction execution unit releases the first read token and sends a first read token release message to the token management unit, where the duration between the second moment and the end time is equal to a preset second threshold, and the second threshold is smaller than the duration of the process of reading the operand of the first instruction;
- sending the second read token to the second instruction execution unit includes:
- the token management unit sends the second read token to the second instruction execution unit upon receiving the first read token release message.
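- The effect of the early release can be illustrated with a small timing calculation. This is only a sketch: the function name `read_token_schedule` and all durations are hypothetical, and it assumes that the token management unit forwards the token with no delay and that the second unit starts reading as soon as it receives the token.

```python
def read_token_schedule(start, read_duration, threshold):
    """Return (release_time, end_time) for one operand read.

    The unit releases its read token `threshold` time units before the
    read ends; the threshold must be shorter than the read itself.
    """
    if not 0 <= threshold < read_duration:
        raise ValueError("threshold must be non-negative and shorter than the read")
    end_time = start + read_duration
    release_time = end_time - threshold
    return release_time, end_time


# Hypothetical numbers: each read takes 10 time units, the threshold is 3.
release_1, end_1 = read_token_schedule(start=0, read_duration=10, threshold=3)
# Assumption: the second unit begins its read at the moment of release.
release_2, end_2 = read_token_schedule(start=release_1, read_duration=10, threshold=3)
overlap = end_1 - release_1  # time during which both reads are in progress
```

- Under these assumptions the second read overlaps the tail of the first by exactly the threshold, which is where the time saving comes from.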
- The EU further includes a pre-decoding unit.
- The method further includes: the pre-decoding unit pre-decodes the first instruction, determines that the instruction execution unit for executing the first instruction is the first instruction execution unit, sends the first instruction to the first instruction execution unit, and sends the identifier of the first instruction execution unit to the token management unit.
- The method further includes: the pre-decoding unit pre-decodes the second instruction and determines that the instruction execution unit for executing the second instruction is the second instruction execution unit;
- the pre-decoding unit sends the second instruction to the second instruction execution unit and sends the identifier of the second instruction execution unit to the token management unit.
- The method further includes: the token management unit receives the identifier of the first instruction execution unit from the pre-decoding unit;
- sending, by the token management unit, the first read token to the first instruction execution unit includes: the token management unit sends the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit;
- the method further includes: the first instruction execution unit receives the first instruction from the pre-decoding unit;
- the method further includes: the token management unit receives the identifier of the second instruction execution unit from the pre-decoding unit; sending the second read token to the second instruction execution unit includes: sending the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.
- A processor is provided, including an execution unit (EU) configured to acquire a first instruction from an instruction queue of a bus interface unit (BIU), execute the first instruction, and write the result of executing the first instruction to a register set, the EU comprising a token management unit and at least two instruction execution units, the at least two instruction execution units comprising a first instruction execution unit and a second instruction execution unit:
- the token management unit is configured to send a first read token to the first instruction execution unit, where the first read token instructs the first instruction execution unit to read the operand of the first instruction to be executed;
- the first instruction execution unit is configured to read the operand of the first instruction according to the indication of the first read token, and to release the first read token after the process of reading the operand of the first instruction starts;
- the token management unit is further configured to: after determining that the first read token has been released, send a second read token to the second instruction execution unit, where the second read token instructs the second instruction execution unit to read an operand of a second instruction to be executed, the second instruction being the instruction to be executed adjacent to the first instruction in the instruction queue;
- the first instruction execution unit is further configured to execute the first instruction according to the operand of the first instruction, and to write the result of executing the first instruction to the register set.
- the token management unit is further configured to send a first write token to the first instruction execution unit, where the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register set;
- the first instruction execution unit is configured to receive the first write token sent by the token management unit; after executing the first instruction, to write the result of executing the first instruction to the register set according to the indication of the first write token; and to release the first write token after the process of writing the result of the first instruction to the register set starts;
- the token management unit is further configured to: after determining that the first write token has been released, send a second write token to the second instruction execution unit, where the second write token instructs the second instruction execution unit to write the result of executing the second instruction to the register set.
- the first instruction execution unit is specifically configured to release the first read token and send a first read token release message to the token management unit at a second moment before the end time of the process of reading the operand of the first instruction, where the duration between the second moment and the end time is equal to a preset second threshold, and the second threshold is less than the duration of the process of reading the operand of the first instruction;
- the token management unit is specifically configured to send a second read token to the second instruction execution unit when receiving the first read token release message.
- The EU further includes a pre-decoding unit.
- The pre-decoding unit is configured to acquire the first instruction from the instruction queue of the BIU; the pre-decoding unit is further configured to pre-decode the first instruction, determine that the instruction execution unit for executing the first instruction is the first instruction execution unit, send the first instruction to the first instruction execution unit, and send the identifier of the first instruction execution unit to the token management unit.
- The pre-decoding unit is further configured to pre-decode the second instruction and determine that the instruction execution unit for executing the second instruction is the second instruction execution unit;
- the token management unit is further configured to receive the identifier of the first instruction execution unit and the identifier of the second instruction execution unit from the pre-decoding unit; the first instruction execution unit is further configured to receive the first instruction from the pre-decoding unit; the token management unit is specifically configured to send the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit, and to send the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.
- a communication device including a processor and a memory, and the processor and the memory are connected by a bus system,
- the processor including the processor according to the second aspect or any possible implementation manner of the second aspect;
- the memory is configured to store programs and data run by the processor.
- An instruction execution unit releases its read token promptly while the instruction is still executing, which lets the token management unit allocate a read token to another instruction execution unit. The token management unit can thereby coordinate the management and control of read tokens and keep instruction execution efficient.
- FIG. 1 is a schematic diagram of a microcomputer.
- FIG. 2 is a schematic diagram of the structure of a processor.
- FIG. 3 is a flow chart of a method of instruction execution in accordance with one embodiment of the present invention.
- FIG. 4 is a schematic diagram of a pipeline of an instruction execution unit in accordance with an embodiment of the present invention.
- FIG. 5 is a block diagram of an EU in accordance with one embodiment of the present invention.
- FIG. 6 is a flow chart of a method of instruction execution in accordance with another embodiment of the present invention.
- FIG. 7 is a flow chart of a method of instruction execution in accordance with another embodiment of the present invention.
- FIG. 8 is a schematic diagram of a pipeline of an instruction execution unit in accordance with another embodiment of the present invention.
- FIG. 9 is another schematic diagram of a pipeline of an instruction execution unit in accordance with another embodiment of the present invention.
- FIG. 10 is a block diagram of a processor in accordance with another embodiment of the present invention.
- FIG. 11 is a block diagram of a communication device in accordance with one embodiment of the present invention.
- FIG. 12 is a block diagram of a communication device in accordance with another embodiment of the present invention.
- FIG. 13 is a schematic diagram of a processor for use in a smart wireless meter reading system in accordance with an embodiment of the present invention.
Detailed description
- Figure 1 is a schematic diagram of a microcomputer.
- the microcomputer of Fig. 1 includes a microprocessor 11, a memory 12, and an I/O interface 13.
- the microprocessor 11, the memory 12 and the I/O interface 13 are connected by a bus.
- the bus includes an address bus 14, a data bus 15, and a control bus 16. The I/O interface 13 is connected to the external device 21.
- the memory 12 may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory can be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory.
- the volatile memory can be a random access memory (RAM) that acts as an external cache.
- many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
- the microprocessor 11 is a central processing unit (CPU) of a microcomputer, and may also be referred to as a processor.
- the processor is an integrated circuit chip with signal processing capabilities.
- the processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device.
- the embodiments described herein can be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
- the processing unit can be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described herein, or a combination thereof.
- a code segment can represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software grouping, a class, or any combination of instructions, data structures, or program statements.
- a code segment can be coupled to another code segment or to a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, and the like can be passed, forwarded, or sent by any suitable means, including memory sharing, message passing, token passing, and network transmission.
- the techniques described herein can be implemented by modules (e.g., procedures, functions, and so on) that perform the functions described herein.
- the software code can be stored in a memory unit and executed by the processor.
- the memory unit can be implemented in the processor or external to the processor, in the latter case the memory unit can be communicatively coupled to the processor via various means known in the art.
- the processor includes an Execution Unit (EU) and a Bus Interface Unit (BIU).
- the input and output control circuits in the BIU are connected to an external bus and can access the memory 12 or the I/O interface 13.
- the address adder is used to complete the transformation of the logical address to the physical address.
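- The text does not say how the address adder forms the physical address. Purely as an illustration, the sketch below assumes an 8086-style scheme, in which a 16-bit segment value is shifted left by four bits and added to a 16-bit offset to give a 20-bit physical address. Both that scheme and the function name `physical_address` are assumptions, not taken from this document.

```python
def physical_address(segment: int, offset: int) -> int:
    """8086-style logical-to-physical translation (assumed, see lead-in)."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by 4 bits, add the offset, wrap to 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF


example = physical_address(0x1234, 0x0010)
```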
- the BIU fetches instructions from memory 12 and writes the instructions to the instruction queue.
- the EU can get instructions from the BIU's instruction queue.
- the instruction queue is a first-in, first-out (FIFO) queue, so instructions leave the queue in the same order in which they were written.
- the execution part control circuit in the EU is used to execute the instructions. Specifically, the EU fetches instructions from the BIU's instruction queue, predecodes the instructions, and executes the instructions.
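- The division of work between the BIU and the EU around the instruction queue can be modelled as a plain FIFO. This is a minimal sketch: the class `InstructionQueue`, its method names, and the instruction strings are hypothetical.

```python
from collections import deque


class InstructionQueue:
    """Minimal FIFO model of the BIU instruction queue."""

    def __init__(self):
        self._queue = deque()

    def write(self, instruction):
        # BIU side: append an instruction fetched from memory.
        self._queue.append(instruction)

    def fetch(self):
        # EU side: take the oldest instruction, or None if the queue is empty.
        return self._queue.popleft() if self._queue else None


queue = InstructionQueue()
for instr in ["LD r1", "MAC r2", "ST r3"]:
    queue.write(instr)
fetched = [queue.fetch() for _ in range(3)]
```

- The EU sees the instructions in exactly the order in which the BIU wrote them.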
- FIG. 3 is a flow diagram of a method of instruction execution in accordance with one embodiment of the present invention.
- the method shown in Figure 3 is performed by the EU in the processor.
- the EU is used to fetch a first instruction from the BIU's instruction queue, execute the first instruction, and write the result of executing the first instruction to the register set.
- the EU includes a token management unit and at least two instruction execution units, the at least two instruction execution units including a first instruction execution unit and a second instruction execution unit.
- the method shown in FIG. 3 includes:
- 301: the token management unit sends a first read token to the first instruction execution unit, where the first read token instructs the first instruction execution unit to read the operand of the first instruction to be executed.
- 302: the first instruction execution unit reads the operand of the first instruction according to the indication of the first read token.
- 303: the first instruction execution unit releases the first read token after the process of reading the operand of the first instruction begins.
- 304: after determining that the first read token has been released, the token management unit sends a second read token to the second instruction execution unit, where the second read token instructs the second instruction execution unit to read an operand of a second instruction to be executed, the second instruction being the instruction to be executed adjacent to the first instruction in the instruction queue.
- 305: the first instruction execution unit executes the first instruction according to the operand of the first instruction.
- 306: the first instruction execution unit writes the result of executing the first instruction to the register set.
- An instruction execution unit releases its read token promptly while the instruction is still executing, so that the token management unit can allocate the read token to the next instruction execution unit. The token management unit can thereby coordinate the management and control of read tokens and keep instruction execution efficient.
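- The read-token hand-off described above can be traced with a small model. It is a sketch under stated assumptions: the classes `Unit` and `TokenManagementUnit`, the unit names `iEU1` and `iEU2`, and the idea that only one read token is outstanding at a time are illustrative choices rather than details given in the text.

```python
class Unit:
    """Hypothetical stand-in for one instruction execution unit."""

    def __init__(self, name, trace):
        self.name = name
        self.trace = trace
        self.holds_read_token = False

    def start_operand_read(self):
        if not self.holds_read_token:
            raise RuntimeError(f"{self.name} has no read token")
        self.trace.append(f"{self.name} starts reading operands")


class TokenManagementUnit:
    """Hands out the read token and records who holds it."""

    def __init__(self, trace):
        self.trace = trace
        self.holder = None

    def send_read_token(self, unit):
        if self.holder is not None:
            raise RuntimeError("read token has not been released yet")
        self.holder = unit
        unit.holds_read_token = True
        self.trace.append(f"read token -> {unit.name}")

    def release_read_token(self, unit):
        if self.holder is not unit:
            raise RuntimeError(f"{unit.name} does not hold the read token")
        unit.holds_read_token = False
        self.holder = None
        self.trace.append(f"{unit.name} releases read token")


trace = []
tmu = TokenManagementUnit(trace)
first, second = Unit("iEU1", trace), Unit("iEU2", trace)

tmu.send_read_token(first)      # the first read token goes to the first unit
first.start_operand_read()      # the first unit begins reading its operand
tmu.release_read_token(first)   # released once the read has started
tmu.send_read_token(second)     # only now can the second unit get a token
second.start_operand_read()
trace.append("iEU1 executes its instruction")
trace.append("iEU1 writes its result")
```

- In this trace the second unit starts reading before the first unit has executed or written anything, which is the overlap the early release is meant to allow.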
- the token management unit and the instruction execution unit may be logic circuits.
- the token management unit may also be referred to as a Register Token (Reg Token) unit, or a register token circuit.
- the Instruction Execute Unit (iEU) is the execution unit inside the processor.
- the at least two instruction execution units may include a multiply-and-accumulate (MAC) unit, an arithmetic logic unit (ALU), a look-up table (LUT) unit, a load (LD) unit, a store (ST) unit, and so on.
- the ALU consists of an AND Gate and an Or Gate.
- token management unit is connected to each instruction execution unit. Specifically, the connection can be made through an internal bus.
- the EU may also include a pre-decoding unit. Specifically, obtaining the first instruction from the BIU's instruction queue may mean that the pre-decoding unit acquires the first instruction from the BIU's instruction queue. The pre-decoding unit then pre-decodes the first instruction and determines that the instruction execution unit for executing the first instruction is the first instruction execution unit. Subsequently, the pre-decoding unit may send the first instruction to the first instruction execution unit and send the identifier of the first instruction execution unit to the token management unit.
- the token management unit receives the identifier of the first instruction execution unit from the pre-decoding unit. In 301, the token management unit sends the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit.
- the first instruction execution unit receives the first instruction from the pre-decoding unit.
- the instruction to be executed adjacent to the first instruction in the instruction queue is the second instruction.
- the pre-decoding unit fetches the second instruction from the BIU's instruction queue. Further, the pre-decoding unit pre-decodes the second instruction, and determines that the instruction execution unit for executing the second instruction is the second instruction execution unit. Subsequently, the pre-decoding unit may send the second instruction to the second instruction execution unit and send the identification of the second instruction execution unit to the token management unit.
- the token management unit receives the identifier of the second instruction execution unit from the pre-decoding unit. In 304, the token management unit sends the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.
- the identifier may be a serial number, or may be a physical address, or may be other identifiers, which is not limited by the present invention.
- the token management unit may build an identifier sequence from the identifiers of the plurality of instruction execution units. The identifier sequence records the order in which the identifiers were received. Thereafter, the token management unit may perform steps similar to 304 in the order given by the identifier sequence.
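- The pre-decoding and identifier-sequence flow can be sketched as follows. The opcode names, the mapping `UNIT_FOR_OPCODE`, and the function and method names are hypothetical; the text names the unit types (MAC, ALU, LUT, LD, ST) but gives no instruction encoding.

```python
from collections import deque

# Hypothetical opcode-to-unit mapping used only for this illustration.
UNIT_FOR_OPCODE = {
    "mul_acc": "MAC",
    "add": "ALU",
    "lookup": "LUT",
    "load": "LD",
    "store": "ST",
}


class TokenManagementUnit:
    def __init__(self):
        self.id_sequence = deque()  # identifiers in the order received

    def receive_identifier(self, unit_id):
        self.id_sequence.append(unit_id)

    def next_read_token_target(self):
        """Return the identifier of the unit due the next read token."""
        return self.id_sequence.popleft() if self.id_sequence else None


def pre_decode(instruction, units, tmu):
    """Route one instruction to its unit and report that unit's identifier."""
    opcode = instruction.split()[0]
    unit_id = UNIT_FOR_OPCODE[opcode]
    units[unit_id].append(instruction)  # send the instruction to the unit
    tmu.receive_identifier(unit_id)     # send the identifier to the token manager
    return unit_id


units = {name: [] for name in UNIT_FOR_OPCODE.values()}
tmu = TokenManagementUnit()
for instr in ["load r1", "add r2 r1 r3", "store r2"]:
    pre_decode(instr, units, tmu)
grant_order = [tmu.next_read_token_target() for _ in range(3)]
```

- Because the identifiers are queued in arrival order, read tokens are granted in program order even though the instructions go to different units.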
- the first instruction execution unit determines that the operand of the first instruction is ready and reads the operand of the first instruction after receiving the first read token from the token management unit.
- the first instruction execution unit reads the operand of the first instruction from the register set. If the pre-decoding unit determines that the operand source of the first instruction is the first instruction, the first instruction execution unit reads the operand of the first instruction from the first instruction.
- the operand of the first instruction can be read by generating a read clock (clk).
- 306 may include: the token management unit sends a first write token to the first instruction execution unit, where the first write token is used to instruct the first instruction execution unit to write the result of executing the first instruction into the register set; after executing the first instruction, the first instruction execution unit writes the result of the first instruction to the register set according to the indication of the first write token; the first instruction execution unit releases the first write token after the process of writing the result of the first instruction to the register set begins; after determining that the first write token has been released, the token management unit sends a second write token to the second instruction execution unit, where the second write token is used to instruct the second instruction execution unit to write the result of executing the second instruction to the register set. In this way, the token management unit can control the write operations of the instruction execution units by means of write tokens, that is, the token management unit can coordinate and manage the process of writing the register set, thereby ensuring efficiency.
- the token management unit may generate an identifier sequence from the identifiers of the plurality of instruction execution units. The identifier sequence records the order in which the identifiers were received. Thereafter, the token management unit can determine, in accordance with the identifier sequence, which instruction execution unit to send the write token to.
- releasing, by the first instruction execution unit, the first write token after the process of writing the result of the first instruction to the register set begins may include: the first instruction execution unit releases the first write token and sends a first write token release message to the token management unit at a first moment before the completion time of the process of writing the result of the first instruction to the register set, where the duration between the first moment and the completion time is equal to a preset first threshold, and the first threshold is smaller than the duration of the process of writing the result of the first instruction to the register set.
- the first instruction execution unit may release the first write token by using a delay circuit. For example, if the delay circuit is set with a first delay t1, the first write token is released after the first delay t1.
- sending, by the token management unit, the second write token to the second instruction execution unit after determining that the first write token has been released may include: the token management unit sends the second write token to the second instruction execution unit upon receiving the first write token release message.
- the second instruction execution unit can write the result of executing the second instruction to the register set according to the indication of the second write token.
- the first threshold in the embodiment of the present invention is a preset value, and the size of the first threshold is not limited by the present invention. It is only necessary to ensure that the write in 306 and the process in which the second instruction execution unit writes the result of executing the second instruction into the register set do not conflict.
- the instruction execution unit releases the write token before the end of the write operation, so that the token management unit can promptly send the write token to the instruction execution unit of the next instruction. The instruction execution units can thus operate with a high degree of parallelism, which further improves efficiency.
- 303 may include: the first instruction execution unit releases the first read token and sends a first read token release message to the token management unit at a second moment before the end time of the process of reading the operand of the first instruction, where the duration between the second moment and the end time is equal to a preset second threshold, and the second threshold is less than the duration of the process of reading the operand of the first instruction.
- the first instruction execution unit may release the first read token by using a delay circuit. For example, if the delay circuit is set with a second delay t2, the first read token is released after the second delay t2.
- 304 may include: the token management unit transmitting a second read token to the second instruction execution unit upon receiving the first read token release message.
- the second instruction execution unit can read the operand of the second instruction according to the indication of the second read token.
- the second threshold in the embodiment of the present invention is a preset value, and the size of the second threshold is not limited by the present invention. It is only necessary to ensure that the process of reading the operand of the second instruction by the second instruction execution unit does not cause a conflict.
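Both early-release rules (the first threshold for the write token and the second threshold for the read token) reduce to the same arithmetic: the token is released a preset threshold before the guarded operation completes. The following is a minimal sketch of that calculation; the function name and the numeric values are hypothetical, and time is measured in arbitrary units.

```python
def release_time(start, duration, threshold):
    """Return the moment at which a token is released.

    The token is released `threshold` time units before the operation
    completes. The text requires the threshold to be smaller than the
    duration; a threshold of 0 is accepted here only to model the
    baseline of releasing at completion.
    """
    if not 0 <= threshold < duration:
        raise ValueError("threshold must be smaller than the operation duration")
    return start + duration - threshold


# Hypothetical numbers: a write that starts at t=10 and lasts 4 units.
write_start, write_duration, first_threshold = 10, 4, 1
early = release_time(write_start, write_duration, first_threshold)
late = release_time(write_start, write_duration, 0)
print(early, late)  # 13 14
```

With these assumed values the next instruction execution unit could receive its token one time unit sooner than if the token were held until the operation finished.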
- the process of executing the first instruction may be as shown in FIG.
- the process includes:
- the first instruction execution unit reads the first instruction.
- the instruction execution unit may determine the source of the operand of the first instruction in 211, or determine the source of the operand of the first instruction by the instruction decoding in 212, which is not limited by the present invention.
- the first instruction execution unit reads the operand of the first instruction according to the indication of the first read token.
- the first instruction execution unit receives the first read token from the token management unit at time 221 and releases the first read token at time 222. That is, the first instruction execution unit releases the first read token before the end of the process of reading the operand of the first instruction.
- the first instruction execution unit executes the first instruction according to the operand of the first instruction.
- 215, write operation.
- the write operation of 215 refers to writing the result of executing the instruction to the register set. Specifically, having received the first write token from the token management unit, the first instruction execution unit, after executing the first instruction, writes the result of executing the first instruction to the register set according to the indication of the first write token.
- the first instruction execution unit receives the first write token from the token management unit at time 223. And, since the process of executing the first instruction by the first instruction execution unit is not completed at time 223, the first instruction execution unit does not start the write operation at time 223.
- the first instruction execution unit releases the first write token at time 224. That is, the first instruction execution unit releases the first write token before the end of the process of the write operation.
- the figure also includes a phase 220. Phase 220 is a wait phase: after 212, the first instruction execution unit has determined that the operand of the first instruction is ready but has not yet received the first read token from the token management unit. Phase 220 can be understood as the phase of waiting for the first read token.
- the EU continuously acquires the instruction code from the BIU's instruction queue and executes the instruction continuously at full load.
- the process of executing an instruction by any one of the at least two instruction execution units is similar to the process of executing the first instruction by the first instruction execution unit shown in FIG.
- FIG. 5 is a schematic diagram of EU.
- the EU 100 in Fig. 5 includes a pre-decoding unit 101, a token management unit 102, and at least two instruction execution units.
- the at least two instruction execution units shown in FIG. 5 are five, which are an instruction execution unit 1031, an instruction execution unit 1032, an instruction execution unit 1033, an instruction execution unit 1034, and an instruction execution unit 1035.
- the pre-decoding unit 101, the token management unit 102, and the at least two instruction execution units are connected by the internal bus system 104.
- FIG. 6 is a flow chart of a method of instruction execution in accordance with another embodiment of the present invention. It is based on the EU 100 shown in Figure 5, and is described by taking three instructions as an example. The method shown in Figure 6 includes:
- the pre-decoding unit 101 acquires the first instruction.
- the pre-decoding unit 101 acquires the first instruction from the instruction queue of the BIU.
- the pre-decoding unit 101 pre-decodes the first instruction.
- the pre-decoding unit 101 determines that the instruction execution unit that will execute the first instruction is the instruction execution unit 1032. And, the pre-decoding unit 101 can determine that the identification of the instruction execution unit 1032 is the first identification.
- the pre-decoding unit 101 may also determine the source of the operand of the first instruction by pre-decoding.
- 403, the pre-decoding unit 101 sends the first identifier to the token management unit 102, and sends the first instruction to the instruction execution unit 1032.
- the pre-decoding unit 101 may also inform the instruction execution unit 1032 of the source of the operand of the first instruction.
- the pre-decoding unit 101 acquires the second instruction.
- the pre-decoding unit 101 acquires the second instruction from the instruction queue of the BIU, where the second instruction is the instruction in the instruction queue adjacent to the first instruction.
- the pre-decoding unit 101 pre-decodes the second instruction.
- the pre-decoding unit 101 determines that the instruction execution unit that will execute the second instruction is the instruction execution unit 1031. And, the pre-decoding unit 101 can determine that the identification of the instruction execution unit 1031 is the second identification.
- pre-decoding unit 101 may also determine the source of the operands of the second instruction by pre-decoding.
- the pre-decoding unit 101 sends the second identifier to the token management unit 102, and sends the second instruction to the instruction execution unit 1031.
- the pre-decoding unit 101 may also inform the instruction execution unit 1031 of the source of the operand of the second instruction.
- the pre-decoding unit 101 acquires a third instruction.
- the pre-decoding unit 101 acquires the third instruction from the instruction queue of the BIU, where the third instruction is the instruction in the instruction queue adjacent to the second instruction.
- the pre-decoding unit 101 pre-decodes the third instruction.
- the pre-decoding unit 101 determines that the instruction execution unit that will execute the third instruction is the instruction execution unit 1033. And, the pre-decoding unit 101 can determine that the identification of the instruction execution unit 1033 is the third identification.
- pre-decoding unit 101 may also determine the source of the operand of the third instruction by pre-decoding.
- the pre-decoding unit 101 sends the third identifier to the token management unit 102, and sends the third instruction to the instruction execution unit 1033.
- the pre-decoding unit 101 may also inform the instruction execution unit 1033 of the source of the operand of the third instruction.
- the pre-decoding unit 101 continuously acquires instructions from the BIU's instruction queue, pre-decodes each instruction, sends it to the instruction execution unit that will execute it, and sends the identifier of that instruction execution unit to the token management unit 102.
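The acquire, pre-decode, and dispatch loop described in the preceding steps can be sketched as follows. The instruction syntax, the opcode-to-unit table, and the function names are assumptions made purely for illustration; the text only states that pre-decoding determines which instruction execution unit will execute each instruction.

```python
from collections import deque


def predecode(instruction):
    """Hypothetical pre-decoder: map an opcode to an execution unit identifier."""
    unit_for_opcode = {"add": 1032, "mul": 1031, "ld": 1033}  # assumed mapping
    return unit_for_opcode[instruction.split()[0]]


def dispatch(instruction_queue, unit_inboxes, token_manager_ids):
    # Take instructions in queue order, forward each one to its unit, and
    # report that unit's identifier to the token management unit.
    while instruction_queue:
        instruction = instruction_queue.popleft()
        unit_id = predecode(instruction)
        unit_inboxes.setdefault(unit_id, []).append(instruction)
        token_manager_ids.append(unit_id)


queue = deque(["add r1, r2", "mul r3, r4", "ld r5, 0(r6)"])
inboxes, id_sequence = {}, []
dispatch(queue, inboxes, id_sequence)
print(id_sequence)  # [1032, 1031, 1033]
```

The identifier list built here mirrors the example in the text, in which the first, second, and third instructions go to the instruction execution units 1032, 1031, and 1033 respectively.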
- the token management unit 102 sends the first read token to the instruction execution unit 1032.
- the token management unit 102 transmits the first read token to the instruction execution unit 1032 in accordance with the first identification.
- prior to 501, the token management unit 102 generates the first read token.
- the instruction execution unit 1032 reads the operand of the first instruction according to the indication of the first read token.
- 502 is executed after 403 and after 501.
- the source of the operand of the first instruction may be the first instruction, or may be a register group.
- the instruction execution unit 1032 releases the first read token.
- the instruction execution unit 1032 may release the first read token at a first time before the end of the process of reading the operand of the first instruction.
- the duration between the first time and the end time may be less than the first time length threshold.
- the instruction execution unit 1032 sends a first read token release message to the token management unit 102. It will be appreciated that after 503, instruction execution unit 1032 may generate a first read token release message.
- the instruction execution unit 1032 executes the first instruction.
- 505 is executed after the end of 502.
- the token management unit 102 sends the second read token to the instruction execution unit 1031.
- 506 is executed after 406 and after 504. Specifically, after receiving the first read token release message, the token management unit 102 sends the second read token to the instruction execution unit 1031 according to the second identifier.
- token management unit 102 generates a second read token.
- the instruction execution unit 1031 reads the operand of the second instruction according to the indication of the second read token.
- 507 is executed after 406 and after 506.
- the source of the operand of the second instruction may be the second instruction, or may be a register group.
- the instruction execution unit 1031 releases the second read token.
- the instruction execution unit 1031 may release the second read token at a second time before the end time of the process of reading the operand of the second instruction.
- the duration between the second time and the end time may be less than the second time width.
- the instruction execution unit 1031 sends a second read token release message to the token management unit 102. It will be appreciated that after 508, instruction execution unit 1031 may generate a second read token release message.
- the instruction execution unit 1031 executes the second instruction.
- 510 is executed after the end of 507.
- the token management unit 102 sends the third read token to the instruction execution unit 1033.
- 511 is executed after 409. Specifically, after receiving the second read token release message, the token management unit 102 transmits the third read token to the instruction execution unit 1033 according to the third identifier.
- token management unit 102 generates a third read token.
- the instruction execution unit 1033 reads the operand of the third instruction according to the indication of the third read token.
- 512 is executed after 409 and after 511.
- the source of the operand of the third instruction may be the third instruction, or may be a register group.
- the instruction execution unit 1033 releases the third read token.
- the instruction execution unit 1033 may release the third read token at a third time before the end of the process of reading the operand of the third instruction.
- the duration between the third time and the end time may be less than the third time width.
- the instruction execution unit 1033 sends a third read token release message to the token management unit 102. It will be appreciated that after 513, instruction execution unit 1033 may generate a third read token release message.
- the instruction execution unit 1033 executes the third instruction.
- 515 is executed after the end of 512.
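The hand-off of read tokens among the three instruction execution units can be sketched as a simple timeline calculation. This is a toy model under stated assumptions: all operands are ready, every read takes the same time, every unit releases its token the same amount of time before its read ends, and message delivery is instantaneous. The function name and the numbers are hypothetical.

```python
def read_token_timeline(units, read_duration, threshold):
    """Return (unit, grant, release, read_end) tuples for a chain of read tokens."""
    timeline = []
    grant = 0
    for unit in units:
        read_end = grant + read_duration
        release = read_end - threshold   # early release before the read ends
        timeline.append((unit, grant, release, read_end))
        grant = release                  # next token is sent on the release message
    return timeline


timeline = read_token_timeline([1032, 1031, 1033], read_duration=3, threshold=1)
for row in timeline:
    print(row)
# (1032, 0, 2, 3)
# (1031, 2, 4, 5)
# (1033, 4, 6, 7)
```

Under these assumed values each unit starts reading one time unit before the previous unit's read has finished, which is the overlap that early release is meant to allow.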
- the pre-decoding unit 101 acquires a fourth instruction and determines the instruction execution unit that will execute the fourth instruction. For example, assume that the instruction execution unit that executes the fourth instruction is the instruction execution unit 1032. The pre-decoding unit 101 then sends the fourth instruction to the instruction execution unit 1032. It should be noted that the pre-decoding unit 101 sends the fourth instruction to the instruction execution unit 1032 after the instruction execution unit 1031 performs the clearing of the jump. At the same time, the token management unit 102 generates a fourth read token and sends it to the instruction execution unit 1032, so that the instruction execution unit 1032 reads the operand of the fourth instruction and executes the fourth instruction.
- the token management unit 102 directly generates a read token for the instruction execution unit that executes the new instruction, without relying on other read token release messages.
- the method may further include:
- the token management unit 102 sends the first write token to the instruction execution unit 1032.
- 516 is executed after 504. Specifically, the token management unit 102 transmits the first write token to the instruction execution unit 1032 according to the first identification.
- prior to 516, the token management unit 102 generates the first write token.
- the instruction execution unit 1032 writes the result of executing the first instruction into the register set according to the indication of the first write token.
- 517 is performed after 516 and after 505.
- the instruction execution unit 1032 releases the first write token.
- the instruction execution unit 1032 may release the first write token at a fourth time before the end of the process of 517.
- the duration between the fourth time and the end time may be less than the fourth time width.
- the instruction execution unit 1032 sends a first write token release message to the token management unit.
- instruction execution unit 1032 can generate a first write token release message.
- the token management unit 102 sends the second write token to the instruction execution unit 1031.
- 520 is executed after 509 and after 519. Specifically, after receiving the first write token release message, the token management unit 102 sends the second write token to the instruction execution unit 1031 according to the second identifier.
- token management unit 102 generates a second write token.
- the instruction execution unit 1031 writes the result of executing the second instruction into the register set according to the indication of the second write token.
- 521 is performed after 520 and after 510.
- the instruction execution unit 1031 releases the second write token.
- the instruction execution unit 1031 may release the second write token at a fifth time before the end time of the process of 521.
- the duration between the fifth time and the end time may be less than the fifth time width.
- the instruction execution unit 1031 sends a second write token release message to the token management unit.
- instruction execution unit 1031 may generate a second write token release message.
- the token management unit 102 sends the third write token to the instruction execution unit 1033.
- 524 is executed after 514 and after 523. Specifically, after receiving the second write token release message, the token management unit 102 sends the third write token to the instruction execution unit 1033 according to the third identifier.
- token management unit 102 generates a third write token.
- the instruction execution unit 1033 writes the result of executing the third instruction into the register set according to the indication of the third write token.
- 525 is performed after 524 and after 515.
- the instruction execution unit 1033 releases the third write token.
- the instruction execution unit 1033 may release the third write token at a sixth time before the end time of the process of 525.
- the duration between the sixth time and the end time may be less than the sixth time width.
- the instruction execution unit 1033 sends a third write token release message to the token management unit.
- instruction execution unit 1033 may generate a third write token release message.
- if the source of the operand of the second instruction is the execution result of the first instruction, that is, the operand of the second instruction depends on the first instruction, the two instructions have a dependency relationship. In that case, 507 is executed after 517.
- the size of the sequence number does not represent the order of execution, that is, the sequence number before each step cannot be defined as the execution order.
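Because the step numbers do not define the execution order, the order is fixed only by the "executed after" relations stated for each step. One way to check such relations is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) and lists only the constraints spelled out explicitly in the text; it is an illustration of the ordering constraints, not part of the described method.

```python
from graphlib import TopologicalSorter

# step -> set of steps that must come first, as stated in the text.
after = {
    502: {403, 501}, 505: {502},
    506: {406, 504}, 507: {406, 506}, 510: {507},
    511: {409}, 512: {409, 511}, 515: {512},
    516: {504}, 517: {516, 505},
    520: {509, 519}, 521: {520, 510},
    524: {514, 523}, 525: {524, 515},
}


def valid_order(constraints):
    # Raises graphlib.CycleError if the constraints contradict each other.
    return list(TopologicalSorter(constraints).static_order())


order = valid_order(after)
position = {step: i for i, step in enumerate(order)}
assert all(position[p] < position[s] for s, preds in after.items() for p in preds)

# Add the data dependency "507 is executed after 517" and re-check.
with_dep = {**after, 507: after[507] | {517}}
order_dep = valid_order(with_dep)
print(order_dep.index(517) < order_dep.index(507))  # True
```

The sort succeeds in both cases, so the stated constraints are mutually consistent, and a step with a larger number (for example 517) may legitimately precede one with a smaller number (507).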
- FIG. 8 is a schematic diagram of the pipeline of the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033. There is no dependency between the operands of the first instruction, the second instruction, and the third instruction.
- 506, 501, and 511 are processes for acquiring the read token by the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033, respectively.
- 508, 503, and 513 are processes for releasing the read token by the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033, respectively.
- 520, 516, and 524 are processes for acquiring the write token by the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033, respectively.
- 522, 518, and 526 are processes for releasing the write token by the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033, respectively.
- Fig. 9 is another schematic diagram of the pipeline of the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033.
- the operand of the third instruction depends on the execution result of the first instruction.
- 512 is executed after 517. It can be understood that, for the instruction execution unit 1033, the operand of the third instruction is not yet ready when the third read token is acquired at 511. The operand of the third instruction becomes ready after 517, and 512 is then executed.
- the periods marked with diagonal lines are waiting phases.
- the wait phase before 507, 502, and 512 means that the operand of the instruction is not ready or has not yet received the read token;
- the wait phase before 517 and 525 means that the write token has not been received.
- the waiting phase takes a long time.
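The waiting phases can be expressed as a small calculation: a token-guarded phase (an operand read or a register write) starts only when the unit holds the token and the prerequisite is satisfied, whichever comes later. The sketch below uses hypothetical time values for the case in which the operand of the third instruction depends on the result of the first; the function names are assumptions.

```python
def phase_start(token_received, prerequisite_done):
    """Earliest start of a token-guarded phase.

    For a read, the prerequisite is the operand being ready; for a write,
    it is the instruction having finished executing.
    """
    return max(token_received, prerequisite_done)


def waiting_time(idle_since, token_received, prerequisite_done):
    # Length of the wait phase that precedes the guarded phase.
    return phase_start(token_received, prerequisite_done) - idle_since


# Hypothetical timings, in arbitrary units.
third_read_token_at = 4    # token acquired at 511
first_write_done_at = 9    # end of 517, which makes the operand ready
start = phase_start(third_read_token_at, first_write_done_at)
wait = waiting_time(2, third_read_token_at, first_write_done_at)
print(start, wait)  # 9 7
```

In this assumed case the token arrives first, so the wait is determined by the dependency rather than by the token.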
- an instruction execution unit releases the read token promptly during the execution of an instruction, which allows the token management unit to allocate a read token to another instruction execution unit. The token management unit can thus coordinate and manage the read tokens, thereby ensuring the efficiency of instruction execution.
- FIG. 10 is a block diagram of a processor in accordance with one embodiment of the present invention.
- Processor 1000 in FIG. 10 includes EU 100.
- the EU 100 includes a token management unit 1001 and at least two instruction execution units 1002.
- At least two of the instruction execution units 1002 include a first instruction execution unit 10021 and a second instruction execution unit 10022.
- the execution unit EU 100 is configured to acquire a first instruction from an instruction queue of the bus interface unit BIU, execute the first instruction and write the result of executing the first instruction to the register set.
- the token management unit 1001 is configured to send a first read token to the first instruction execution unit 10021, where the first read token is used to instruct the first instruction execution unit 10021 to read the operand of the first instruction to be executed;
- the first instruction execution unit 10021 is configured to read the operand of the first instruction according to the indication of the first read token, and to release the first read token after the process of reading the operand of the first instruction starts;
- the token management unit 1001 is further configured to: after determining that the first read token has been released, send a second read token to the second instruction execution unit 10022, where the second read token is used to instruct the second instruction execution unit 10022 to read the operand of a second instruction to be executed, and the second instruction is the instruction to be executed adjacent to the first instruction in the instruction queue;
- the first instruction execution unit 10021 is further configured to execute the first instruction according to an operand of the first instruction, and write a result of executing the first instruction into the register set.
- an instruction execution unit releases the read token promptly during the execution of an instruction, which allows the token management unit to allocate a read token to another instruction execution unit. The token management unit can thus coordinate and manage the read tokens, thereby ensuring the efficiency of instruction execution.
- the token management unit 1001 is further configured to send a first write token to the first instruction execution unit 10021, where the first write token is used to instruct the first instruction execution unit 10021 to write the result of executing the first instruction to the register set.
- the first instruction execution unit 10021 is configured to receive the first write token sent by the token management unit 1001; after executing the first instruction, write the result of executing the first instruction to the register set according to the indication of the first write token; and release the first write token after the process of writing the result of the first instruction to the register set begins.
- the token management unit 1001 is further configured to: after determining that the first write token has been released, send a second write token to the second instruction execution unit 10022, where the second write token is used to instruct the second instruction execution unit 10022 to write the result of executing the second instruction to the register set.
- the first instruction execution unit 10021 is specifically configured to: release the first write token and send a first write token release message to the token management unit 1001 at a first moment before the completion time of the process of writing the result of the first instruction to the register set, where the duration between the first moment and the completion time is equal to a preset first threshold, and the first threshold is smaller than the duration of the process of writing the result of the first instruction to the register set.
- the token management unit 1001 is specifically configured to send a second write token to the second instruction execution unit 10022 when receiving the first write token release message.
- the first instruction execution unit 10021 is specifically configured to release the first read token and send a first read token release message to the token management unit 1001 at a second moment before the end time of the process of reading the operand of the first instruction, where the duration between the second moment and the end time is equal to a preset second threshold, and the second threshold is less than the duration of the process of reading the operand of the first instruction.
- the token management unit 1001 is specifically configured to send a second read token to the second instruction execution unit 10022 when receiving the first read token release message.
- the EU 100 further includes a pre-decoding unit, where the pre-decoding unit is configured to acquire the first instruction from the instruction queue of the BIU. The pre-decoding unit is further configured to pre-decode the first instruction, determine that the instruction execution unit for executing the first instruction is the first instruction execution unit, send the first instruction to the first instruction execution unit, and send the identifier of the first instruction execution unit to the token management unit.
- the pre-decoding unit is further configured to: obtain the second instruction from the instruction queue of the BIU; pre-decode the second instruction and determine that the instruction execution unit for executing the second instruction is the second instruction execution unit; and send the second instruction to the second instruction execution unit and send the identifier of the second instruction execution unit to the token management unit.
- the token management unit 1001 is further configured to receive the identifier of the first instruction execution unit from the pre-decoding unit, and to receive the identifier of the second instruction execution unit from the pre-decoding unit.
- the first instruction execution unit 10021 is further configured to receive the first instruction from the pre-decoding unit.
- the token management unit 1001 is specifically configured to send the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit, and to send the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.
- at least two instruction execution units 1002 may include multiple instruction execution units. A plurality of instruction execution units may be used to execute different types of instructions, or two of the plurality of instruction execution units may be used to execute the same type of instructions.
- the processing performance of the processor 1000 can be improved by increasing the number of the instruction execution units 1002, that is, chip area can be traded for performance.
- processor 1000 shown in FIG. 10 can implement the processes performed by the processor in the foregoing embodiments of FIG. 3 to FIG. 9. To avoid repetition, details are not described herein again.
- FIG. 11 is a block diagram of a communication device in accordance with one embodiment of the present invention.
- the communication device 700 shown in FIG. 11 includes a processor 701 and a memory 702, and the processor 701 and the memory 702 are connected by a bus system 703.
- the processor 701 includes the processor 1000 illustrated in FIG. 10;
- the memory 702 is configured to store programs and data that are executed by the processor 701.
- the processor in the foregoing embodiment of the present invention can be used in a communication device, and the above processor has high processing efficiency, and the processor can improve the performance of the communication device.
- the processor 701 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in the processor or an instruction in the form of software.
- the processor 701 described above may be a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
- the steps of the method disclosed in the embodiments of the present invention may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
- the software modules can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
- the storage medium can be located in a memory, and the processor 701 reads the information in the memory 702 and, in conjunction with its hardware, performs the steps of the above method.
- the memory 702 in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory can be ROM, PROM, EPROM, EEPROM or flash memory.
- the volatile memory can be RAM, which acts as an external cache.
- many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM.
- the memory of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
- FIG. 12 is a block diagram of a communication device in accordance with another embodiment of the present invention. The device includes a processor 1000 and a memory 2000.
- the processor 1000 includes an EU 100, an instruction fetch unit 104, an instruction cache unit 105, a data buffer unit 106, a multiplexer (MUX) 107, and a register bank 108.
- the EU 100 includes a pre-decoding unit 101, a token management unit 102, and five instruction execution units: an instruction execution unit 1031, an instruction execution unit 1032, an instruction execution unit 1033, an instruction execution unit 1034, and an instruction execution unit 1035.
- the components in FIG. 12 are connected by a bus system, which includes, in addition to a data bus, a power bus, a control bus, and a status signal bus.
- for clarity, the various buses are all labeled as the bus system in FIG. 12.
- the processor 1000 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
- the processor 1000 described above may be a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or carried out.
- the general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
- the storage medium can be located in a memory, and the processor 1000 reads the information in the memory 2000 and, in combination with its hardware, performs the steps of the above method.
- the memory 2000 in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory can be ROM, PROM, EPROM, EEPROM or flash memory.
- the volatile memory can be RAM, which acts as an external cache.
- many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM.
- the memories of the systems and methods described herein are intended to comprise, without being limited to, these and any other suitable types of memory.
- the embodiments described herein can be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
- the processing unit can be implemented in one or more ASICs, DSPs, DSPDs, PLDs, FPGAs, general purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described herein, or combinations thereof.
- a code segment can represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements.
- a code segment can be combined into another code segment or hardware circuit by transmitting and/or receiving information, data, arguments, parameters or memory contents. Information, arguments, parameters, data, etc. can be communicated, forwarded, or transmitted using any suitable means including memory sharing, messaging, token passing, network transmission, and the like.
- the techniques described herein can be implemented by modules (e.g., procedures, functions, and so on) that perform the functions described herein.
- the software code can be stored in a memory unit and executed by the processor.
- the memory unit can be implemented in the processor or external to the processor. In the latter case, the memory unit can be communicatively coupled to the processor via various means known in the art.
- optionally, the instruction execution unit 1031 is a multiply-and-accumulate (MAC) unit, the instruction execution unit 1032 is an arithmetic logic unit (ALU), the instruction execution unit 1033 is a look-up table (LUT) unit, the instruction execution unit 1034 is a load (LD) unit, and the instruction execution unit 1035 is a store (ST) unit.
- Instruction fetch unit 104 is operative to fetch a sequence of instructions from memory 2000.
- the instruction cache unit 105 is used by the instruction fetch unit 104 to buffer the instruction sequence.
- the pre-decoding unit 101 is for acquiring the sequence of instructions from the instruction fetch unit 104.
- the data buffer unit 106 is used by the load unit and the store unit to buffer data.
- a multiplexer may also be called a multi-way converter, a data selector, or a multi-way switch.
- the smart wireless meter reading system shown in FIG. 13 includes a core system 710, a user interface 720, a radio frequency 730, and an APP Micro Control Unit (APP MCU) 740.
- the core system 710 includes a digital signal processor (DSP) 711 and a FLASH 712.
- the radio frequency 730 includes a Radio Frequency Integrated Circuit (RFIC) 731 and a Power Amplifier (PA) 732.
- the FLASH 712 is a memory chip for storing the programs run by the DSP.
- the RFIC 731 can be a wireless signal processing chip. It can be understood that the processor in the foregoing embodiment of the present invention can be applied to the DSP 711 in FIG. 13, and the DSP 711 can be a single-core digital signal processor for processing wireless baseband signals, thereby implementing wireless service transmission.
- the disclosed systems, devices, and methods may be implemented in other ways.
- the device embodiments described above are merely illustrative.
- the division into units is only a division by logical function.
- in actual implementation there may be other ways of dividing; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct connection or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in electrical, mechanical or other form.
- the components displayed as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of the embodiment.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
- the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
- the storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like, which can store program codes.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Advance Control (AREA)
- Executing Machine-Instructions (AREA)
Abstract
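An instruction execution method and a processor. The method includes: a token management unit sends a first read token to a first instruction execution unit, the first read token instructing the first instruction execution unit to read the operands of a first instruction to be executed (S301); the first instruction execution unit reads the operands of the first instruction as instructed by the first read token (S302) and releases the first read token (S303); after determining that the first read token has been released, the token management unit sends a second read token to a second instruction execution unit (S304); the first instruction execution unit executes the first instruction according to its operands (S305) and writes the execution result to a register bank (S306). In this method, an instruction execution unit releases the read token promptly while an instruction is being executed, which makes it easier for the token management unit to allocate a read token to the next instruction execution unit. The token management unit can thus coordinate, manage, and control the read tokens, which helps ensure the efficiency of instruction execution.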
Description
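Instruction Execution Method and Processor
Technical Field
Embodiments of the present invention relate to the computer field, and more specifically, to an instruction execution method and a processor.
Background
In a synchronous processor, the whole system is clock-based, that is, synchronous. Such systems are built from subsystems that store their states in flip-flops (or registers), and the transition from one state to another depends on a global clock signal. State updates in the flip-flops complete on edges of the clock signal. The performance of a synchronous processor is limited by its slowest stage. The usual way to raise the processing capability of a synchronous processor is to raise the frequency of the global clock, but a higher frequency leads to higher power consumption.
An asynchronous processor has no global clock, or even no clock at all. Its subsystems either operate as a pipeline through a bundled communication protocol, or form a self-timed system that drives the functional circuits by controlling the generation and propagation of clocks. Instructions in an asynchronous processor can execute at the highest possible speed, and the power consumption of an asynchronous processor is markedly lower.
Conventional asynchronous processors implement the asynchronous pipeline with Muller C-element circuits and process data in a request-acknowledge manner. Such a pipeline is not compact, however, and access to contended resources by a multi-stage pipeline is overly complex, which results in low processing efficiency.
Summary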
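Embodiments of the present invention provide an instruction execution method and a processor. The method has high processing efficiency.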
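According to a first aspect, an instruction execution method is provided. The method is performed by an execution unit (EU) in a processor. The EU is configured to fetch a first instruction from an instruction queue of a bus interface unit (BIU), execute the first instruction, and write the result of executing the first instruction to a register bank. The EU includes a token management unit and at least two instruction execution units, and the at least two instruction execution units include a first instruction execution unit and a second instruction execution unit. The method includes:
the token management unit sends a first read token to the first instruction execution unit, where the first read token instructs the first instruction execution unit to read the operands of the first instruction to be executed;
the first instruction execution unit reads the operands of the first instruction as instructed by the first read token;
the first instruction execution unit releases the first read token after the process of reading the operands of the first instruction has started;
after determining that the first read token has been released, the token management unit sends a second read token to the second instruction execution unit, where the second read token instructs the second instruction execution unit to read the operands of a second instruction to be executed, and the second instruction is the to-be-executed instruction adjacent to the first instruction in the instruction queue;
the first instruction execution unit executes the first instruction according to the operands of the first instruction; and
the first instruction execution unit writes the result of executing the first instruction to the register bank.
With reference to the first aspect, in a first possible implementation of the first aspect, the writing, by the first instruction execution unit, of the result of executing the first instruction to the register bank includes:
the first instruction execution unit receives a first write token sent by the token management unit, where the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register bank;
after executing the first instruction, the first instruction execution unit writes the result of executing the first instruction to the register bank as instructed by the first write token; and
the first instruction execution unit releases the first write token after the process of writing the result of executing the first instruction to the register bank has started, so that the token management unit, after determining that the first write token has been released, sends a second write token to the second instruction execution unit, where the second write token instructs the second instruction execution unit to write the result of executing the second instruction to the register bank.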
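With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the releasing, by the first instruction execution unit, of the first write token after the process of writing the result of executing the first instruction to the register bank has started includes:
the first instruction execution unit releases the first write token and sends a first write-token release message to the token management unit at a first moment before the completion moment of the process of writing the result of executing the first instruction to the register bank, where the duration between the first moment and the completion moment equals a preset first threshold, and the first threshold is less than the duration of the process of writing the result of executing the first instruction to the register bank;
and the sending, by the token management unit, of the second write token to the second instruction execution unit after determining that the first write token has been released includes:
the token management unit sends the second write token to the second instruction execution unit upon receiving the first write-token release message.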
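With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in a third possible implementation of the first aspect, the releasing, by the first instruction execution unit, of the first read token after the process of reading the operands of the first instruction has started includes:
the first instruction execution unit releases the first read token and sends a first read-token release message to the token management unit at a second moment before the end moment of the process of reading the operands of the first instruction, where the duration between the second moment and the end moment equals a preset second threshold, and the second threshold is less than the duration of the process of reading the operands of the first instruction;
and the sending, by the token management unit, of the second read token to the second instruction execution unit after determining that the first read token has been released includes:
the token management unit sends the second read token to the second instruction execution unit upon receiving the first read-token release message.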
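With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the EU further includes a pre-decoding unit.
The fetching of the first instruction from the instruction queue of the BIU includes: the pre-decoding unit fetches the first instruction from the instruction queue of the BIU.
The method further includes: the pre-decoding unit pre-decodes the first instruction, determines that the instruction execution unit used to execute the first instruction is the first instruction execution unit, sends the first instruction to the first instruction execution unit, and sends the identifier of the first instruction execution unit to the token management unit.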
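With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the method further includes: the pre-decoding unit pre-decodes the second instruction and determines that the instruction execution unit used to execute the second instruction is the second instruction execution unit; and
the pre-decoding unit sends the second instruction to the second instruction execution unit and sends the identifier of the second instruction execution unit to the token management unit.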
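With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, before the token management unit sends the first read token to the first instruction execution unit, the method further includes: the token management unit receives the identifier of the first instruction execution unit from the pre-decoding unit.
The sending, by the token management unit, of the first read token to the first instruction execution unit includes: the token management unit sends the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit.
Before the first instruction execution unit reads the operands of the first instruction as instructed by the first read token, the method further includes: the first instruction execution unit receives the first instruction from the pre-decoding unit.
Before the second read token is sent to the second instruction execution unit, the method further includes: the token management unit receives the identifier of the second instruction execution unit from the pre-decoding unit. The sending of the second read token to the second instruction execution unit includes: sending the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.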
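According to a second aspect, a processor is provided. An execution unit (EU) in the processor is configured to fetch a first instruction from an instruction queue of a bus interface unit (BIU), execute the first instruction, and write the result of executing the first instruction to a register bank. The EU includes a token management unit and at least two instruction execution units, and the at least two instruction execution units include a first instruction execution unit and a second instruction execution unit:
the token management unit is configured to send a first read token to the first instruction execution unit, where the first read token instructs the first instruction execution unit to read the operands of the first instruction to be executed;
the first instruction execution unit is configured to read the operands of the first instruction as instructed by the first read token, and to release the first read token after the process of reading the operands of the first instruction has started;
the token management unit is further configured to send, after determining that the first read token has been released, a second read token to the second instruction execution unit, where the second read token instructs the second instruction execution unit to read the operands of a second instruction to be executed, and the second instruction is the to-be-executed instruction adjacent to the first instruction in the instruction queue; and
the first instruction execution unit is further configured to execute the first instruction according to the operands of the first instruction and to write the result of executing the first instruction to the register bank.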
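With reference to the second aspect, in a first possible implementation of the second aspect, the token management unit is further configured to send a first write token to the first instruction execution unit, where the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register bank;
the first instruction execution unit is specifically configured to receive the first write token sent by the token management unit, to write the result of executing the first instruction to the register bank as instructed by the first write token after executing the first instruction, and to release the first write token after the process of writing the result of executing the first instruction to the register bank has started; and
the token management unit is further configured to send, after determining that the first write token has been released, a second write token to the second instruction execution unit, where the second write token instructs the second instruction execution unit to write the result of executing the second instruction to the register bank.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the token management unit is further configured to send a first write token to the first instruction execution unit, where the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register bank;
the first instruction execution unit is specifically configured to receive the first write token sent by the token management unit, to write the result of executing the first instruction to the register bank as instructed by the first write token after executing the first instruction, and to release the first write token after the process of writing the result of executing the first instruction to the register bank has started; and
the token management unit is further configured to send, after determining that the first write token has been released, a second write token to the second instruction execution unit, where the second write token instructs the second instruction execution unit to write the result of executing the second instruction to the register bank.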
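With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in a third possible implementation of the second aspect, the first instruction execution unit is specifically configured to release the first read token and send a first read-token release message to the token management unit at a second moment before the end moment of the process of reading the operands of the first instruction, where the duration between the second moment and the end moment equals a preset second threshold, and the second threshold is less than the duration of the process of reading the operands of the first instruction; and
the token management unit is specifically configured to send the second read token to the second instruction execution unit upon receiving the first read-token release message.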
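With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the EU further includes a pre-decoding unit.
The pre-decoding unit is configured to fetch the first instruction from the instruction queue of the BIU. The pre-decoding unit is further configured to pre-decode the first instruction, determine that the instruction execution unit used to execute the first instruction is the first instruction execution unit, send the first instruction to the first instruction execution unit, and send the identifier of the first instruction execution unit to the token management unit.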
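With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the pre-decoding unit is further configured to:
fetch the second instruction from the instruction queue of the BIU;
pre-decode the second instruction and determine that the instruction execution unit used to execute the second instruction is the second instruction execution unit; and
send the second instruction to the second instruction execution unit and send the identifier of the second instruction execution unit to the token management unit.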
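With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the token management unit is further configured to receive the identifier of the first instruction execution unit from the pre-decoding unit and to receive the identifier of the second instruction execution unit from the pre-decoding unit. The first instruction execution unit is further configured to receive the first instruction from the pre-decoding unit. The token management unit is specifically configured to send the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit, and to send the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.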
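According to a third aspect, a communication device is provided. The communication device includes a processor and a memory, and the processor and the memory are connected by a bus system.
The processor includes the processor according to the second aspect or any possible implementation of the second aspect.
The memory is configured to store programs and data run by the processor.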
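In the embodiments of the present invention, an instruction execution unit releases the read token promptly while executing an instruction, which makes it easier for the token management unit to allocate a read token to another instruction execution unit. The token management unit can thus coordinate, manage, and control the read tokens, which helps ensure the efficiency of instruction execution.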
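Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a microcomputer.
FIG. 2 is a schematic diagram of the structure of a processor.
FIG. 3 is a flowchart of an instruction execution method according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of the pipeline of an instruction execution unit according to an embodiment of the present invention.
FIG. 5 is a block diagram of an EU according to an embodiment of the present invention.
FIG. 6 is a flowchart of an instruction execution method according to another embodiment of the present invention.
FIG. 7 is a flowchart of an instruction execution method according to another embodiment of the present invention.
FIG. 8 is a schematic diagram of the pipelines of instruction execution units according to another embodiment of the present invention.
FIG. 9 is another schematic diagram of the pipelines of instruction execution units according to another embodiment of the present invention.
FIG. 10 is a block diagram of a processor according to another embodiment of the present invention.
FIG. 11 is a block diagram of a communication device according to an embodiment of the present invention.
FIG. 12 is a block diagram of a communication device according to another embodiment of the present invention.
FIG. 13 is a schematic diagram of a processor according to an embodiment of the present invention used in a smart wireless meter reading system.
Detailed Description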
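The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.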
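FIG. 1 is a schematic diagram of a microcomputer. The microcomputer in FIG. 1 includes a microprocessor 11, a memory 12, and an I/O interface 13, which are connected by buses. As shown in FIG. 1, the buses include an address bus 14, a data bus 15, and a control bus 16. The I/O interface 13 is connected to an external device 21.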
The memory 12 may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DR RAM). Note that the memory described herein is intended to include, without being limited to, these and any other suitable types of memory.
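The microprocessor 11 is the central processing unit (CPU) of the microcomputer and may also be called the processor. The processor is an integrated circuit chip with signal processing capability. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention.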
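It can be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), DSP devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
When the embodiments are implemented in software, firmware, middleware or microcode, program code, or code segments, these may be stored in a machine-readable medium such as a storage component. A code segment may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or to a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, and so on may be passed, forwarded, or sent by any suitable means, including memory sharing, message passing, token passing, and network transmission.
For a software implementation, the techniques described herein may be implemented by modules (for example, procedures and functions) that perform the functions described herein. The software code may be stored in a memory unit and executed by a processor. The memory unit may be implemented inside or outside the processor; in the latter case, the memory unit may be communicatively coupled to the processor by various means known in the art.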
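As shown in FIG. 2, the processor includes an execution unit (EU) and a bus interface unit (BIU).
The input/output control circuit in the BIU is connected to the external bus and can access the memory 12 or the I/O interface 13. The address adder converts logical addresses into physical addresses.
It can be understood that the BIU fetches instructions from the memory 12 and writes them into the instruction queue, and the EU can fetch instructions from the instruction queue of the BIU. The queue is first in, first out (FIFO), so the order of the instructions is preserved.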
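The FIFO behaviour of the BIU instruction queue can be sketched as follows. This is only an illustration: the instruction strings and the helper names `biu_write` and `eu_fetch` are hypothetical and do not come from the patent.

```python
from collections import deque

# Hypothetical instruction strings; the patent does not define an encoding.
instruction_queue = deque()

def biu_write(instruction):
    """BIU side: append a fetched instruction at the tail of the queue."""
    instruction_queue.append(instruction)

def eu_fetch():
    """EU side: take the oldest instruction from the head of the queue."""
    return instruction_queue.popleft()

for text in ("mac r1, r2", "add r3, r4", "lut r5"):
    biu_write(text)

# The EU sees the instructions in exactly the order the BIU wrote them.
fetched = [eu_fetch() for _ in range(3)]
```

Because the queue preserves order, "the instruction adjacent to the first instruction" is simply the next element the EU takes from the queue.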
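The execution control circuit in the EU is used to execute instructions. Specifically, the EU fetches an instruction from the instruction queue of the BIU, pre-decodes the instruction, and executes it.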
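The embodiments of the present invention relate to an improvement of the process by which the EU executes instructions. Specifically, FIG. 3 is a flowchart of an instruction execution method according to an embodiment of the present invention. The method shown in FIG. 3 is performed by the EU in a processor. The EU is configured to fetch a first instruction from the instruction queue of the BIU, execute the first instruction, and write the result of executing the first instruction to a register bank. The EU includes a token management unit and at least two instruction execution units, and the at least two instruction execution units include a first instruction execution unit and a second instruction execution unit. After the first instruction is fetched, the method shown in FIG. 3 includes:
301. The token management unit sends a first read token to the first instruction execution unit, where the first read token instructs the first instruction execution unit to read the operands of the first instruction to be executed.
302. The first instruction execution unit reads the operands of the first instruction as instructed by the first read token.
303. The first instruction execution unit releases the first read token after the process of reading the operands of the first instruction has started.
304. After determining that the first read token has been released, the token management unit sends a second read token to the second instruction execution unit, where the second read token instructs the second instruction execution unit to read the operands of a second instruction to be executed, and the second instruction is the to-be-executed instruction adjacent to the first instruction in the instruction queue.
305. The first instruction execution unit executes the first instruction according to the operands of the first instruction.
306. The first instruction execution unit writes the result of executing the first instruction to the register bank.
In the embodiments of the present invention, an instruction execution unit releases the read token promptly while an instruction is being executed, which makes it easier for the token management unit to allocate a read token to the next instruction execution unit. The token management unit can thus coordinate, manage, and control the read tokens, which helps ensure the efficiency of instruction execution.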
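A small worked calculation suggests why releasing the read token early helps. It is a sketch under strong assumptions that are not in the patent: every instruction runs on a different unit, all instructions have the same stage durations, there are no operand dependencies, the token is released at the end of the read, and a write never takes longer than a read (so the write token never stalls). The function names and the numbers are illustrative only.

```python
def finish_time_early_release(n, read, execute, write):
    """Read token is handed on as soon as the previous unit finishes reading."""
    start_of_last_read = (n - 1) * read
    return start_of_last_read + read + execute + write

def finish_time_hold_until_done(n, read, execute, write):
    """Baseline: the next unit may not start until the previous one has written."""
    return n * (read + execute + write)

# Three instructions, 1 time unit to read, 3 to execute, 1 to write.
early = finish_time_early_release(3, read=1, execute=3, write=1)
late = finish_time_hold_until_done(3, read=1, execute=3, write=1)
```

Under these assumptions the three instructions overlap and finish after 7 time units, compared with 15 when each unit waits for the previous one to finish completely.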
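It should be understood that, in the embodiments of the present invention, the token management unit and the instruction execution units may be logic circuits. The token management unit may also be called a register token (Reg Token) unit or a register token circuit. An instruction execution unit (iEU) is an execution unit inside the processor.
For example, the at least two instruction execution units may include a multiply-and-accumulate (MAC) unit, an arithmetic logic unit (ALU), a look-up table (LUT) unit, a load (LD) unit, a store (ST) unit, and so on. The ALU is built from AND gates and OR gates.
It can be understood that the token management unit is connected to every instruction execution unit, specifically through an internal bus.
Optionally, the EU may further include a pre-decoding unit. Specifically, fetching the first instruction from the instruction queue of the BIU may mean that the pre-decoding unit fetches the first instruction from the instruction queue of the BIU. Further, the pre-decoding unit pre-decodes the first instruction and determines that the instruction execution unit used to execute the first instruction is the first instruction execution unit. The pre-decoding unit may then send the first instruction to the first instruction execution unit and send the identifier of the first instruction execution unit to the token management unit.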
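The pre-decoding step can be sketched as a table lookup followed by two sends. The opcode names, the dictionary-based instruction format, and the `predecode` helper are assumptions made for illustration; only the unit types (MAC, ALU, LUT, LD, ST) come from the text.

```python
# Hypothetical opcode-to-unit table; the opcodes themselves are made up.
UNIT_FOR_OPCODE = {"mac": "MAC", "add": "ALU", "lut": "LUT", "ld": "LD", "st": "ST"}

def predecode(instruction, unit_inbox, token_manager_ids):
    """Pick the execution unit, forward the instruction, report the unit identifier."""
    unit_id = UNIT_FOR_OPCODE[instruction["opcode"]]
    unit_inbox.setdefault(unit_id, []).append(instruction)  # instruction -> unit
    token_manager_ids.append(unit_id)                       # identifier -> token manager
    return unit_id

unit_inbox, token_manager_ids = {}, []
predecode({"opcode": "add", "operands": ["r1", "r2"]}, unit_inbox, token_manager_ids)
predecode({"opcode": "mac", "operands": ["r3", "r4"]}, unit_inbox, token_manager_ids)
```

After these two calls the token manager's list holds the unit identifiers in program order, which is the information it needs to hand out tokens in sequence.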
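Correspondingly, it can be understood that before 301 the token management unit receives the identifier of the first instruction execution unit from the pre-decoding unit, and that in 301 the token management unit sends the first read token to the first instruction execution unit according to that identifier.
Correspondingly, it can be understood that before 302 the first instruction execution unit receives the first instruction from the pre-decoding unit.
In the embodiments of the present invention, assume that the to-be-executed instruction adjacent to the first instruction in the instruction queue is the second instruction. Similarly, the pre-decoding unit fetches the second instruction from the instruction queue of the BIU, pre-decodes it, and determines that the instruction execution unit used to execute the second instruction is the second instruction execution unit. The pre-decoding unit may then send the second instruction to the second instruction execution unit and send the identifier of the second instruction execution unit to the token management unit.
Correspondingly, it can be understood that before 304 the token management unit receives the identifier of the second instruction execution unit from the pre-decoding unit, and that in 304 the token management unit sends the second read token to the second instruction execution unit according to that identifier.
Note that, in the embodiments of the present invention, an identifier may be a sequence number, a physical address, or another kind of identifier; the present invention does not limit this.
Optionally, after receiving the identifiers of multiple instruction execution units from the pre-decoding unit, the token management unit may generate an identifier sequence from them. The identifier sequence records the order in which the identifiers were received. The token management unit may then perform steps similar to 304 one after another according to the identifier sequence.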
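The identifier sequence can be modelled as a queue inside the token manager. The sketch below is one possible reading, not the patented circuit: the class name `ReadTokenManager`, its method names, and the single `holder` field are assumptions; the unit numbers 1032, 1031, and 1033 are the ones used in the FIG. 6 example.

```python
from collections import deque

class ReadTokenManager:
    """Hands the read token to execution units in the order their identifiers arrived."""

    def __init__(self):
        self.sequence = deque()  # identifier sequence, in pre-decode order
        self.holder = None       # unit that currently holds a read token

    def receive_identifier(self, unit_id):
        self.sequence.append(unit_id)
        self._grant_next()

    def on_release_message(self, unit_id):
        if unit_id != self.holder:
            raise ValueError("release message from a unit that holds no token")
        self.holder = None
        self._grant_next()

    def _grant_next(self):
        # Send the next read token only when the previous one has been released.
        if self.holder is None and self.sequence:
            self.holder = self.sequence.popleft()

manager = ReadTokenManager()
for unit_id in (1032, 1031, 1033):
    manager.receive_identifier(unit_id)

grants = [manager.holder]
manager.on_release_message(1032)
grants.append(manager.holder)
manager.on_release_message(1031)
grants.append(manager.holder)
```

The `grants` list ends up in the same order as the identifiers were received, which is the property the identifier sequence is meant to guarantee.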
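Optionally, in 302, the first instruction execution unit reads the operands of the first instruction after it has determined that the operands are ready and has received the first read token from the token management unit.
Specifically, if the pre-decoding unit determines that the operands of the first instruction come from the register bank, the first instruction execution unit reads them from the register bank. If the pre-decoding unit determines that the operands come from the first instruction itself, the first instruction execution unit reads them from the first instruction.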
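The two operand sources can be illustrated with a short helper. The dictionary layout of an instruction and the `"register"` / `"immediate"` labels are hypothetical; the patent only says that operands come either from the register bank or from the instruction itself.

```python
def read_operands(instruction, register_bank):
    """Return operand values from the register bank or from the instruction itself."""
    if instruction["source"] == "register":
        # Operands name registers; look their values up in the register bank.
        return [register_bank[name] for name in instruction["operands"]]
    if instruction["source"] == "immediate":
        # Operands are encoded in the instruction itself.
        return list(instruction["operands"])
    raise ValueError("unknown operand source")

register_bank = {"r1": 2, "r2": 3}
from_registers = read_operands({"source": "register", "operands": ["r1", "r2"]}, register_bank)
from_instruction = read_operands({"source": "immediate", "operands": [7, 9]}, register_bank)
```

Only the register-bank path touches a shared resource, which is presumably why the read of operands is guarded by a token.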
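Optionally, the operands of the first instruction may be read by generating a read clock (clk).
Optionally, in an embodiment, 306 may include: the token management unit sends a first write token to the first instruction execution unit, where the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register bank; after executing the first instruction, the first instruction execution unit writes the result of executing the first instruction to the register bank as instructed by the first write token; the first instruction execution unit releases the first write token after the process of writing the result to the register bank has started; and after determining that the first write token has been released, the token management unit sends a second write token to the second instruction execution unit, where the second write token instructs the second instruction execution unit to write the result of executing the second instruction to the register bank.
In this way, the token management unit can control the write operations of the instruction execution units through write tokens. That is, the token management unit can coordinate and manage the process of writing to the register bank, which helps ensure efficiency.
Optionally, after receiving the identifiers of multiple instruction execution units from the pre-decoding unit, the token management unit may generate an identifier sequence from them. The identifier sequence records the order in which the identifiers were received. The token management unit may then determine, according to the identifier sequence, which instruction execution unit to send the write token to.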
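In an embodiment, the releasing, by the first instruction execution unit, of the first write token after the process of writing the result of executing the first instruction to the register bank has started may include: the first instruction execution unit releases the first write token and sends a first write-token release message to the token management unit at a first moment before the completion moment of the process of writing the result to the register bank, where the duration between the first moment and the completion moment equals a preset first threshold, and the first threshold is less than the duration of the process of writing the result to the register bank.
Optionally, the first instruction execution unit may use a delay circuit to release the first write token. For example, if the delay circuit is set to a first delay t1, the first write token is released after the first delay t1.
Correspondingly, the sending, by the token management unit, of the second write token to the second instruction execution unit after determining that the first write token has been released may include: the token management unit sends the second write token to the second instruction execution unit upon receiving the first write-token release message.
Similarly, it can be understood that after receiving the second write token, the second instruction execution unit may write the result of executing the second instruction to the register bank as instructed by the second write token.
Note that the first threshold in the embodiments of the present invention is a preset value, and the present invention does not limit its size. It is only necessary to ensure that 306 does not conflict with the process in which the second instruction execution unit writes the result of executing the second instruction to the register bank.
In this way, an instruction execution unit releases the write token before its write operation ends, so the token management unit can send the write token to the next instruction execution unit in good time. The instruction execution units can then run in parallel to a large extent, which further improves efficiency.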
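Similarly, in an embodiment, 303 may include: the first instruction execution unit releases the first read token and sends a first read-token release message to the token management unit at a second moment before the end moment of the process of reading the operands of the first instruction, where the duration between the second moment and the end moment equals a preset second threshold, and the second threshold is less than the duration of the process of reading the operands of the first instruction.
Optionally, in 303, the first instruction execution unit may use a delay circuit to release the first read token. For example, if the delay circuit is set to a second delay t2, the first read token is released after the second delay t2.
Correspondingly, 304 may include: the token management unit sends the second read token to the second instruction execution unit upon receiving the first read-token release message.
As with 302, it can be understood that after receiving the second read token, the second instruction execution unit may read the operands of the second instruction as instructed by the second read token.
Note that the second threshold in the embodiments of the present invention is a preset value, and the present invention does not limit its size. It is only necessary to ensure that 302 does not conflict with the process in which the second instruction execution unit reads the operands of the second instruction.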
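The relation between a threshold and the corresponding delay can be written out as simple arithmetic. The sketch assumes that the delays t1 and t2 are measured from the start of the write or read process, which the text does not state explicitly, and that the threshold is positive; the function name and the numbers are illustrative only.

```python
def release_delay(process_duration, threshold):
    """Delay from the start of a read or write process until its token is released.

    The token is released `threshold` time units before the process ends,
    and the threshold must be shorter than the process itself.
    """
    if threshold <= 0 or threshold >= process_duration:
        raise ValueError("threshold must be positive and shorter than the process")
    return process_duration - threshold

# Hypothetical numbers: a 10-unit operand read with a second threshold of 3,
# and an 8-unit register write with a first threshold of 2.
t2 = release_delay(process_duration=10, threshold=3)
t1 = release_delay(process_duration=8, threshold=2)
```

A larger threshold releases the token earlier and gives the next unit more overlap, at the cost of a smaller safety margin against conflicting accesses to the register bank.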
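Specifically, for one instruction execution unit, for example the first instruction execution unit, the process of executing the first instruction may be as shown in FIG. 4. The process includes:
211. Fetch the instruction.
Specifically, the first instruction execution unit fetches the first instruction.
212. Decode the instruction.
Optionally, the instruction execution unit may determine the source of the operands of the first instruction in 211, or may determine it through instruction decoding in 212; the present invention does not limit this.
213. Read the operands of the instruction.
Specifically, after receiving the first read token from the token management unit, the first instruction execution unit reads the operands of the first instruction as instructed by the first read token.
It can be understood that the first instruction execution unit receives the first read token from the token management unit at moment 221 and releases it at moment 222. That is, the first instruction execution unit releases the first read token before the process of reading the operands of the first instruction ends.
214. Execute the instruction.
Specifically, the first instruction execution unit executes the first instruction according to the operands of the first instruction.
215. Write operation.
The write operation in 215 writes the result of executing the instruction to the register bank.
Specifically, after receiving the first write token from the token management unit and after finishing execution of the first instruction, the first instruction execution unit writes the result of executing the first instruction to the register bank as instructed by the first write token.
It can be understood that the first instruction execution unit receives the first write token from the token management unit at moment 223. Because execution of the first instruction has not yet completed at moment 223, the first instruction execution unit does not start the write operation at moment 223.
It can also be understood that the first instruction execution unit releases the first write token at moment 224, that is, before the write operation ends.
In addition, FIG. 4 includes a stage 220. Stage 220 is a waiting stage: after 212, the first instruction execution unit has determined that the operands of the first instruction are ready but has not yet received the first read token from the token management unit. Stage 220 can be understood as the stage of waiting for the first read token.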
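The stages of FIG. 4 can be laid out on a timeline with a short calculation. This is a sketch with made-up durations and token arrival times; the stage names and the `unit_timeline` helper are assumptions, and early token release is not modelled.

```python
def unit_timeline(durations, read_token_at, write_token_at):
    """Return (start, end) for each stage of one execution unit.

    The operand read cannot start before the read token arrives (stage 220 is
    the wait), and the write cannot start before both execution has finished
    and the write token has arrived.
    """
    timeline = {}
    t = 0
    for stage in ("fetch", "decode"):               # stages 211 and 212
        timeline[stage] = (t, t + durations[stage])
        t += durations[stage]
    read_start = max(t, read_token_at)
    timeline["wait_read_token"] = (t, read_start)   # stage 220
    timeline["read"] = (read_start, read_start + durations["read"])  # stage 213
    t = read_start + durations["read"]
    timeline["execute"] = (t, t + durations["execute"])              # stage 214
    t += durations["execute"]
    write_start = max(t, write_token_at)
    timeline["write"] = (write_start, write_start + durations["write"])  # stage 215
    return timeline

timeline = unit_timeline(
    {"fetch": 1, "decode": 1, "read": 1, "execute": 3, "write": 1},
    read_token_at=3,   # plays the role of moment 221
    write_token_at=5,  # plays the role of moment 223, before execution ends
)
```

In this example the write token arrives at time 5 while execution runs until time 7, so the write starts at 7 rather than 5, matching the remark that the unit does not start writing at moment 223.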
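It can be understood that the EU continuously fetches instruction code from the instruction queue of the BIU and executes instructions continuously at full load. The process by which any one of the at least two instruction execution units executes an instruction is similar to the process, shown in FIG. 3, by which the first instruction execution unit executes the first instruction.
Note that the embodiments of the present invention do not limit the number of the at least two instruction execution units. For example, FIG. 5 is a schematic diagram of an EU. The EU 100 in FIG. 5 includes a pre-decoding unit 101, a token management unit 102, and at least two instruction execution units. FIG. 5 shows five instruction execution units: an instruction execution unit 1031, an instruction execution unit 1032, an instruction execution unit 1033, an instruction execution unit 1034, and an instruction execution unit 1035. The pre-decoding unit 101, the token management unit 102, and the at least two instruction execution units are connected through an internal bus system 104.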
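FIG. 6 is a flowchart of an instruction execution method according to another embodiment of the present invention. The description is based on the EU 100 shown in FIG. 5 and uses three instructions as an example. The method shown in FIG. 6 includes:
401. The pre-decoding unit 101 fetches the first instruction.
Specifically, the pre-decoding unit 101 fetches the first instruction from the instruction queue of the BIU.
402. The pre-decoding unit 101 pre-decodes the first instruction.
Specifically, the pre-decoding unit 101 determines that the instruction execution unit that will execute the first instruction is the instruction execution unit 1032, and may determine that the identifier of the instruction execution unit 1032 is a first identifier.
Optionally, the pre-decoding unit 101 may also determine the source of the operands of the first instruction through pre-decoding.
403. The pre-decoding unit 101 sends the first identifier to the token management unit 102 and sends the first instruction to the instruction execution unit 1032.
Optionally, the pre-decoding unit 101 may also inform the instruction execution unit 1032 of the source of the operands of the first instruction.
404. The pre-decoding unit 101 fetches the second instruction.
Specifically, the pre-decoding unit 101 fetches the second instruction from the instruction queue of the BIU. The second instruction is the instruction adjacent to the first instruction in the instruction queue.
405. The pre-decoding unit 101 pre-decodes the second instruction.
Specifically, the pre-decoding unit 101 determines that the instruction execution unit that will execute the second instruction is the instruction execution unit 1031, and may determine that the identifier of the instruction execution unit 1031 is a second identifier.
Optionally, the pre-decoding unit 101 may also determine the source of the operands of the second instruction through pre-decoding.
406. The pre-decoding unit 101 sends the second identifier to the token management unit 102 and sends the second instruction to the instruction execution unit 1031.
Optionally, the pre-decoding unit 101 may also inform the instruction execution unit 1031 of the source of the operands of the second instruction.
407. The pre-decoding unit 101 fetches the third instruction.
Specifically, the pre-decoding unit 101 fetches the third instruction from the instruction queue of the BIU. The third instruction is the instruction adjacent to the second instruction in the instruction queue.
408. The pre-decoding unit 101 pre-decodes the third instruction.
Specifically, the pre-decoding unit 101 determines that the instruction execution unit that will execute the third instruction is the instruction execution unit 1033, and may determine that the identifier of the instruction execution unit 1033 is a third identifier.
Optionally, the pre-decoding unit 101 may also determine the source of the operands of the third instruction through pre-decoding.
409. The pre-decoding unit 101 sends the third identifier to the token management unit 102 and sends the third instruction to the instruction execution unit 1033.
Optionally, the pre-decoding unit 101 may also inform the instruction execution unit 1033 of the source of the operands of the third instruction.
It can be understood that the pre-decoding unit 101 continuously fetches instructions from the instruction queue of the BIU, pre-decodes each instruction, sends it to the instruction execution unit that will execute it, and sends the identifier of that instruction execution unit to the token management unit 102.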
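501. The token management unit 102 sends the first read token to the instruction execution unit 1032.
It can be understood that 501 is performed after 403. Specifically, the token management unit 102 sends the first read token to the instruction execution unit 1032 according to the first identifier.
It can be understood that the token management unit 102 generates the first read token before 501.
502. The instruction execution unit 1032 reads the operands of the first instruction as instructed by the first read token.
It can be understood that 502 is performed after 403 and after 501.
Specifically, the operands of the first instruction may come from the first instruction itself or from the register bank.
503. The instruction execution unit 1032 releases the first read token.
Optionally, the instruction execution unit 1032 may release the first read token at a first moment before the end moment of the process of reading the operands of the first instruction.
The duration between the first moment and the end moment may be less than a first duration threshold.
504. The instruction execution unit 1032 sends a first read-token release message to the token management unit 102.
It can be understood that, after 503, the instruction execution unit 1032 may generate the first read-token release message.
505. The instruction execution unit 1032 executes the first instruction.
It can be understood that 505 is performed after 502 ends.
506. The token management unit 102 sends the second read token to the instruction execution unit 1031.
It can be understood that 506 is performed after 406 and after 504. Specifically, after receiving the first read-token release message, the token management unit 102 sends the second read token to the instruction execution unit 1031 according to the second identifier.
It can be understood that the token management unit 102 generates the second read token before 506.
507. The instruction execution unit 1031 reads the operands of the second instruction as instructed by the second read token.
It can be understood that 507 is performed after 406 and after 506.
Specifically, the operands of the second instruction may come from the second instruction itself or from the register bank.
508. The instruction execution unit 1031 releases the second read token.
Optionally, the instruction execution unit 1031 may release the second read token at a second moment before the end moment of the process of reading the operands of the second instruction.
The duration between the second moment and the end moment may be less than a second duration threshold.
509. The instruction execution unit 1031 sends a second read-token release message to the token management unit 102.
It can be understood that, after 508, the instruction execution unit 1031 may generate the second read-token release message.
510. The instruction execution unit 1031 executes the second instruction.
It can be understood that 510 is performed after 507 ends.
511. The token management unit 102 sends the third read token to the instruction execution unit 1033.
It can be understood that 511 is performed after 409. Specifically, after receiving the second read-token release message, the token management unit 102 sends the third read token to the instruction execution unit 1033 according to the third identifier.
It can be understood that the token management unit 102 generates the third read token before 511.
512. The instruction execution unit 1033 reads the operands of the third instruction as instructed by the third read token.
It can be understood that 512 is performed after 409 and after 511.
Specifically, the operands of the third instruction may come from the third instruction itself or from the register bank.
513. The instruction execution unit 1033 releases the third read token.
Optionally, the instruction execution unit 1033 may release the third read token at a third moment before the end moment of the process of reading the operands of the third instruction.
The duration between the third moment and the end moment may be less than a third duration threshold.
514. The instruction execution unit 1033 sends a third read-token release message to the token management unit 102.
It can be understood that, after 513, the instruction execution unit 1033 may generate the third read-token release message.
515. The instruction execution unit 1033 executes the third instruction.
It can be understood that 515 is performed after 512 ends.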
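In addition, in the embodiments of the present invention, an instruction execution unit may need to jump to a new instruction while executing an instruction. For example, if the instruction execution unit 1031 needs to jump to a fourth instruction while executing the second instruction, the pre-decoding unit 101 fetches the fourth instruction and determines the instruction execution unit that will execute it. Assume, for example, that this is the instruction execution unit 1032.
The pre-decoding unit 101 then sends the fourth instruction to the instruction execution unit 1032. Note that it does so only after the instruction execution unit 1031 has finished the clean-up work for the jump. At the same time, the token management unit 102 generates a fourth read token and sends it to the instruction execution unit 1032, so that the instruction execution unit 1032 can read the operands of the fourth instruction and execute the fourth instruction.
That is, on a jump to a new instruction, the token management unit 102 directly generates a read token for the instruction execution unit that executes the new instruction, without depending on any other read-token release message.
Moreover, it can be understood that after the new instruction is executed, its execution result is not written to the register bank; instead, execution returns to the instruction that was being executed before the jump.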
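Further, as shown in FIG. 7, the following may also be included after the embodiment shown in FIG. 6:
516. The token management unit 102 sends the first write token to the instruction execution unit 1032.
It can be understood that 516 is performed after 504. Specifically, the token management unit 102 sends the first write token to the instruction execution unit 1032 according to the first identifier.
It can be understood that the token management unit 102 generates the first write token before 516.
517. The instruction execution unit 1032 writes the result of executing the first instruction to the register bank as instructed by the first write token.
It can be understood that 517 is performed after 516 and after 505.
518. The instruction execution unit 1032 releases the first write token.
Optionally, the instruction execution unit 1032 may release the first write token at a fourth moment before the end moment of the process of 517.
The duration between the fourth moment and the end moment may be less than a fourth duration threshold.
519. The instruction execution unit 1032 sends a first write-token release message to the token management unit 102.
It can be understood that, after 518, the instruction execution unit 1032 may generate the first write-token release message.
520. The token management unit 102 sends the second write token to the instruction execution unit 1031.
It can be understood that 520 is performed after 509 and after 519. Specifically, after receiving the first write-token release message, the token management unit 102 sends the second write token to the instruction execution unit 1031 according to the second identifier.
It can be understood that the token management unit 102 generates the second write token before 520.
521. The instruction execution unit 1031 writes the result of executing the second instruction to the register bank as instructed by the second write token.
It can be understood that 521 is performed after 520 and after 510.
522. The instruction execution unit 1031 releases the second write token.
Optionally, the instruction execution unit 1031 may release the second write token at a fifth moment before the end moment of the process of 521.
The duration between the fifth moment and the end moment may be less than a fifth duration threshold.
523. The instruction execution unit 1031 sends a second write-token release message to the token management unit 102.
It can be understood that, after 522, the instruction execution unit 1031 may generate the second write-token release message.
524. The token management unit 102 sends the third write token to the instruction execution unit 1033.
It can be understood that 524 is performed after 514 and after 523. Specifically, after receiving the second write-token release message, the token management unit 102 sends the third write token to the instruction execution unit 1033 according to the third identifier.
It can be understood that the token management unit 102 generates the third write token before 524.
525. The instruction execution unit 1033 writes the result of executing the third instruction to the register bank as instructed by the third write token.
It can be understood that 525 is performed after 524 and after 515.
526. The instruction execution unit 1033 releases the third write token.
Optionally, the instruction execution unit 1033 may release the third write token at a sixth moment before the end moment of the process of 525.
The duration between the sixth moment and the end moment may be less than a sixth duration threshold.
527. The instruction execution unit 1033 sends a third write-token release message to the token management unit 102.
It can be understood that, after 526, the instruction execution unit 1033 may generate the third write-token release message.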
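Note that, in the embodiments of the present invention, if the operands of the second instruction come from the execution result of the first instruction, that is, if the operands of the second instruction depend on the first instruction, the two instructions have a dependency. In that case, it can be understood that 507 is performed after 517.
Likewise, if the operands of the third instruction depend on the result of the second instruction, it can be understood that 512 is performed after 521.
Likewise, if the operands of the third instruction depend on the result of the first instruction, it can be understood that 512 is performed after 517.
Note that, in the embodiments shown in FIG. 6 and FIG. 7, the magnitude of a step number does not indicate the order of execution. That is, the number in front of each step must not be taken as a limitation on the order of execution.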
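For the flows shown in FIG. 6 and FIG. 7, FIG. 8 is a schematic diagram of the pipelines of the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033. Here there is no dependency among the operands of the first instruction, the second instruction, and the third instruction.
In FIG. 8, 506, 501, and 511 are the processes in which the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033, respectively, obtain a read token. 508, 503, and 513 are the processes in which the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033, respectively, release a read token. 520, 516, and 524 are the processes in which the three units, respectively, obtain a write token. 522, 518, and 526 are the processes in which the three units, respectively, release a write token.
For the flows shown in FIG. 6 and FIG. 7, FIG. 9 is another schematic diagram of the pipelines of the instruction execution unit 1031, the instruction execution unit 1032, and the instruction execution unit 1033. Here there is a dependency between the operands of the first instruction and the third instruction. Specifically, the operands of the third instruction depend on the execution result of the first instruction.
In FIG. 9, 512 is performed after 517. It can be understood that, for the instruction execution unit 1033, the operands of the third instruction are not yet ready when it obtains the third read token in 511. 512 is performed only after 517, once the operands of the third instruction are ready.
In FIG. 8 and FIG. 9, the hatched time periods are waiting stages. Specifically, a waiting stage before 507, 502, or 512 means that the operands of the instruction are not ready or that the read token has not yet been received; a waiting stage before 517 or 525 means that the write token has not yet been received. Comparing FIG. 9 with FIG. 8 shows that the waiting stages are longer when there are dependencies between instructions.
FIG. 8 and FIG. 9 show that the pipelines run in parallel and that the pipeline layout is compact. In the embodiments of the present invention, an instruction execution unit releases the read token promptly while executing an instruction, which makes it easier for the token management unit to allocate a read token to another instruction execution unit. The token management unit can thus coordinate, manage, and control the read tokens, which helps ensure the efficiency of instruction execution.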
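The difference between the FIG. 8 case (no dependency) and the FIG. 9 case (the third instruction depends on the first) can be reproduced with a small scheduling sketch. It is a simplified model, not the patented timing: every instruction runs on its own unit, stage durations are equal and made up, tokens are released at the end of the read or write rather than early, and the function name `schedule` and the `depends_on` mapping are assumptions.

```python
def schedule(n, read, execute, write, depends_on=None):
    """Return (waits, write_done) for n instructions issued in program order.

    waits[i] is how long unit i holds its read token while its operands are
    not ready; write_done[i] is when instruction i's result is in the register bank.
    depends_on maps a consumer index to the index of its producer.
    """
    depends_on = depends_on or {}
    read_token_free = 0    # time at which the next read token can be granted
    write_token_free = 0   # time at which the next write token can be granted
    waits, write_done = [], []
    for i in range(n):
        token_at = read_token_free
        operands_ready = write_done[depends_on[i]] if i in depends_on else 0
        read_start = max(token_at, operands_ready)
        read_end = read_start + read
        read_token_free = read_end
        execute_end = read_end + execute
        write_start = max(execute_end, write_token_free)
        write_end = write_start + write
        write_token_free = write_end
        waits.append(read_start - token_at)
        write_done.append(write_end)
    return waits, write_done

independent = schedule(3, read=1, execute=3, write=1)
dependent = schedule(3, read=1, execute=3, write=1, depends_on={2: 0})
```

With these numbers the independent case has no operand wait at all, while in the dependent case the third unit waits 3 time units for the first instruction's result, which mirrors the longer waiting stage described for FIG. 9.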
图 10是本发明一个实施例的处理器的框图。 图 10中的处理器 1000包 括 EU 100。 EU 100包括令牌管理单元 1001和至少两个指令执行单元 1002。
其中至少两个指令执行单元 1002包括第一指令执行单元 10021和第二指令 执行单元 10022。
执行部件 EU 100用于从总线接口部件 BIU的指令队列中获取第一指 令,执行所述第一指令并将执行所述第一指令的结果写入寄存器组。具体地, 所述令牌管理单元 1001,用于向所述第一指令执行单元 10021发送第一 读令牌, 所述第一读令牌用于指示所述第一指令执行单元 10021读取待执行 的所述第一指令的操作数;
所述第一指令执行单元 10021, 用于根据所述第一读令牌的指示, 读取 所述第一指令的操作数; 在所述读取所述第一指令的操作数的过程开始后, 释放所述第一读令牌;
所述令牌管理单元 1001,还用于在确定所述释放所述第一读令牌后, 向 所述第二指令执行单元 10022发送第二读令牌, 所述第二读令牌用于指示所 述第二指令执行单元 10022读取待执行的第二指令的操作数, 其中, 所述第 二指令是在所述指令队列中与第一指令相邻的待执行的指令;
所述第一指令执行单元 10021, 还用于根据所述第一指令的操作数, 执 行所述第一指令, 并将执行所述第一指令的结果写入所述寄存器组。
本发明实施例中, 一个指令执行单元在执行指令的过程中, 及时释放读 令牌, 能够便于令牌管理单元向另一个指令执行单元分配读令牌, 这样令牌 管理单元能够对读令牌进行协调管理和控制, 从而能够保证指令执行的效 率。
可选地, 作为一个实施例, 所述令牌管理单元 1001, 还用于向所述第一 指令执行单元 10021发送第一写令牌, 所述第一写令牌用于指示所述第一指 令执行单元 10021将执行所述第一指令的结果写入所述寄存器组。
所述第一指令执行单元 10021, 具体用于接收所述令牌管理单元 1001 发送的所述第一写令牌, 在执行所述第一指令后, 根据所述第一写令牌的指 示, 将执行所述第一指令的结果写入所述寄存器组; 在所述将执行所述第一 指令的结果写入所述寄存器组的过程开始后, 释放所述第一写令牌。
所述令牌管理单元 1001,还用于在确定所述释放所述第一写令牌后, 向 所述第二指令执行单元 10022发送第二写令牌, 所述第二写令牌用于指示所 述第二指令执行单元 10022将执行所述第二指令的结果写入所述寄存器组。
可选地, 作为另一个实施例, 所述第一指令执行单元 10021, 具体用于
在所述将执行所述第一指令的结果写入所述寄存器组的过程的完成时刻前 的第一时刻, 释放所述第一写令牌并向所述令牌管理单元 1001发送第一写 令牌释放消息, 其中, 所述第一时刻至所述完成时刻之间的时长等于预设的 第一阔值, 所述第一阔值小于所述将执行所述第一指令的结果写入所述寄存 器组的过程的时长。
所述令牌管理单元 1001, 具体用于在接收到所述第一写令牌释放消息 时, 向所述第二指令执行单元 10022发送第二写令牌。
可选地, 作为另一个实施例, 所述第一指令执行单元 10021, 具体用于 在所述读取所述第一指令的操作数的过程的结束时刻前的第二时刻,释放所 述第一读令牌并向所述令牌管理单元 1001发送第一读令牌释放消息, 其中, 所述第二时刻至所述结束时刻之间的时长等于预设的第二阔值, 所述第二阔 值小于所述读取所述第一指令的操作数的过程的时长。
所述令牌管理单元 1001, 具体用于在接收到所述第一读令牌释放消息 时, 向所述第二指令执行单元 10022发送第二读令牌。
可选地, 作为另一个实施例, 所述 EU 100还包括预译码单元, 所述预 译码单元,用于从所述 BIU的指令队列中获取所述第一指令; 所述预译码单 元, 还用于对所述第一指令进行预译码, 确定用于执行所述第一指令的指令 执行单元为所述第一指令执行单元,将所述第一指令发送至所述第一指令执 行单元, 并将所述第一指令执行单元的标识发送至所述令牌管理单元。
可选地, 作为另一个实施例, 所述预译码单元, 还用于: 从所述 BIU的 指令队列中获取所述第二指令; 对所述第二指令进行预译码, 确定用于执行 所述第二指令的指令执行单元为所述第二指令执行单元; 将所述第二指令发 送至所述第二指令执行单元, 并将所述第二指令执行单元的标识发送至所述 令牌管理单元。
可选地, 作为另一个实施例, 所述令牌管理单元 1001, 还用于从所述预 译码单元接收所述第一指令执行单元的标识,从所述预译码单元接收所述第 二指令执行单元的标识。 所述第一指令执行单元 10021, 还用于从所述预译 码单元接收所述第一指令。所述令牌管理单元 1001,具体用于根据所述第一 指令执行单元的标识, 向所述第一指令执行单元发送所述第一读令牌; 根据 所述第二指令执行单元的标识, 向所述第二指令执行单元发送所述第二读令 牌。
可理解, 本发明实施例中, 至少两个指令执行单元 1002可以包括多个 指令执行单元。 多个指令执行单元可以用于执行不同类型的指令, 或者, 多 个指令执行单元中的两个指令执行单元也可以用于执行相同类型的指令。
本发明实施例中, 能够通过增加指令执行单元 102的数目来提高处理器 1000的处理性能, 也就是说, 能够达到以面积换速度的效果。
可理解, 图 10所示的处理器 1000能够实现前述图 3至图 9的实施例中 由处理器执行的过程, 为避免重复, 这里不再赘述。
图 11是本发明一个实施例的通信设备的框图。图 11所示的通信设备 700 包括处理器 701和存储器 702, 所述处理器 701和所述存储器 702通过总线 系统 703连接。
所述处理器 701, 包括图 10所述的处理器 1000;
所述存储器 702, 用于存储所述处理器 701运行的程序和数据。
这样, 本发明前述实施例中的处理器能够用于一种通信设备, 且上述处 理器具有较高的处理效率, 进而该处理器能够提高通信设备的性能。
应注意, 处理器 701可能是一种集成电路芯片, 具有信号处理能力。 在 实现过程中, 上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路或 者软件形式的指令完成。上述的处理器 701可以是通用处理器、 DSP、 ASIC, FPGA或者其他可编程逻辑器件、 分立门或者晶体管逻辑器件、 分立硬件组 通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。 结 合本发明实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行 完成, 或者用译码处理器中的硬件及软件模块组合执行完成。 软件模块可以 位于随机存储器, 闪存、 只读存储器, 可编程只读存储器或者电可擦写可编 程存储器、 寄存器等本领域成熟的存储介质中。 该存储介质可以位于存储器 中,处理器 701读取存储器 702中的信息,结合其硬件完成上述方法的步骤。
可以理解, 本发明实施例中的存储器 702可以是易失性存储器或非易失 性存储器, 或可包括易失性和非易失性存储器两者。 其中, 非易失性存储器 可以是 ROM、 PROM, EPROM、 EEPROM 或闪存。 易失性存储器可以是 RAM, 其用作外部高速緩存。 通过示例性但不是限制性说明, 许多形式的 RAM可用, 例如 SRAM、 DRAM, SDRAM、 DDR SDRAM、 ESDRAM、 SLDRAM和 DR RAM。 本文描述的系统和方法的存储器旨在包括但不限于
这些和任意其它适合类型的存储器。
图 12是本发明另一个实施例的通信设备的框图。 包括处理器 1000和存 储器 2000。处理器 1000包括 EU 100、指令提取单元 104、指令緩存单元 105、 数据緩存单元 106、 复用器(multiplexer, MUX ) 107和寄存器组 108。 其中 EU 100包括预译码单元 101、令牌管理单元 102和五个指令执行单元: 指令 执行单元 1031、指令执行单元 1032、指令执行单元 1033、指令执行单元 1034 和指令执行单元 1035。
图 12中的各个组件通过总线系统连接在一起, 其中总线系统除包括数 据总线之外, 还包括电源总线、 控制总线和状态信号总线。 但是为了清楚说 明起见, 在图 12中将各种总线都标为总线系统。
应注意, 处理器 1000可能是一种集成电路芯片, 具有信号处理能力。 在实现过程中, 上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路 或者软件形式的指令完成。 上述的处理器 1000可以是通用处理器、 DSP、 ASIC, FPGA或者其他可编程逻辑器件、 分立门或者晶体管逻辑器件、 分立 硬件组件。 可以实现或者执行本发明实施例中的公开的各方法、 步骤及逻辑 框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器 等。 结合本发明实施例所公开的方法的步骤可以直接体现为硬件译码处理器 执行完成, 或者用译码处理器中的硬件及软件模块组合执行完成。 软件模块 可以位于随机存储器, 闪存、 只读存储器, 可编程只读存储器或者电可擦写 可编程存储器、 寄存器等本领域成熟的存储介质中。 该存储介质可以位于存 储器中, 处理器 1000读取存储器 2000中的信息, 结合其硬件完成上述方法 的步骤。
可以理解, 本发明实施例中的存储器 2000可以是易失性存储器或非易 失性存储器, 或可包括易失性和非易失性存储器两者。 其中, 非易失性存储 器可以是 ROM、 PROM, EPROM、 EEPROM或闪存。 易失性存储器可以是 RAM, 其用作外部高速緩存。 通过示例性但不是限制性说明, 许多形式的 RAM可用, 例如 SRAM、 DRAM, SDRAM、 DDR SDRAM、 ESDRAM、 SLDRAM和 DR RAM。 本文描述的系统和方法的存储器旨在包括但不限于 这些和任意其它适合类型的存储器。
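It can be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more ASICs, DSPs, DSPDs, PLDs, FPGAs, general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.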
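When the embodiments are implemented in software, firmware, middleware or microcode, program code, or code segments, they may be stored in a machine-readable medium such as a storage component. A code segment may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or to a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, and the like may be passed, forwarded, or transmitted by any suitable means, including memory sharing, message passing, token passing, and network transmission.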
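For a software implementation, the techniques described herein may be implemented by modules (for example, procedures and functions) that perform the functions described herein. Software code may be stored in a memory unit and executed by a processor. The memory unit may be implemented within the processor or external to the processor; in the latter case, the memory unit can be communicatively coupled to the processor by various means known in the art.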
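Optionally, assume that the instruction execution unit 1031 is a multiply-accumulate (Multiply And Accumulate, MAC) unit, the instruction execution unit 1032 is an arithmetic logic unit (Arithmetic Logic Unit, ALU), the instruction execution unit 1033 is a look-up table (Look Up Table, LUT) unit, the instruction execution unit 1034 is a load (Load, LD) unit, and the instruction execution unit 1035 is a store (Store, ST) unit.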
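The instruction fetch unit 104 is configured to obtain an instruction sequence from the memory 2000. The instruction cache unit 105 is used by the instruction fetch unit 104 to cache the instruction sequence.
The pre-decoding unit 101 is configured to obtain the instruction sequence from the instruction fetch unit 104.
The data cache unit 106 is used by the load unit and the store unit to cache data.
A multiplexer may also be referred to as a multiway converter, a data selector, a multi-way switch, or the like.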
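As a rough illustration of how a pre-decoding unit might route instructions under the optional assignment above (1031 as MAC, 1032 as ALU, 1033 as LUT, 1034 as LD, 1035 as ST), the following Python sketch maps an instruction mnemonic to a unit and records the order of unit identifiers that would be reported to the token management unit. The mnemonics, the `DISPATCH` table, and the function names are assumptions made for illustration only; the embodiments do not define an instruction set or a software interface.

```python
from enum import Enum


class Unit(Enum):
    """Instruction execution units, keyed by the reference numerals of FIG. 12."""
    MAC = 1031  # multiply-accumulate unit
    ALU = 1032  # arithmetic logic unit
    LUT = 1033  # look-up table unit
    LD = 1034   # load unit
    ST = 1035   # store unit


# Hypothetical mnemonic-to-unit table (not taken from the patent).
DISPATCH = {
    "mac": Unit.MAC,
    "add": Unit.ALU, "sub": Unit.ALU, "and": Unit.ALU,
    "lut": Unit.LUT,
    "ld": Unit.LD,
    "st": Unit.ST,
}


def predecode(instruction: str) -> Unit:
    """Pick the execution unit for an instruction such as 'add r1, r2, r3'."""
    mnemonic = instruction.split()[0].lower()
    try:
        return DISPATCH[mnemonic]
    except KeyError:
        raise ValueError(f"no execution unit for {mnemonic!r}") from None


def dispatch(sequence):
    """Pre-decode a sequence in queue order.

    Returns the (instruction, unit) routing and the list of unit identifiers
    in the order they would be reported to the token management unit.
    """
    routed = [(ins, predecode(ins)) for ins in sequence]
    token_order = [unit.value for _, unit in routed]
    return routed, token_order
```

For example, `dispatch(["ld r1, [r0]", "mac r2, r1, r3", "st [r4], r2"])` routes the three instructions to units 1034, 1031, and 1035 in that order, which is also the order in which read tokens would be handed out in this sketch.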
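FIG. 13 is a schematic diagram of a processor according to an embodiment of the present invention used in a smart wireless meter reading system. The smart wireless meter reading system shown in FIG. 13 includes a core system 710, a user interface 720, a radio frequency part 730, and a meter reading application micro control unit (APP Micro Control Unit, APP MCU) 740. The core system 710 includes a digital signal processor (Digital Signal Processor, DSP) 711 and a FLASH 712. The radio frequency part 730 includes a radio frequency integrated circuit (Radio Frequency Integrated Circuit, RFIC) 731 and a power amplifier (Power Amplifier, PA) 732.
The FLASH 712 is a memory chip configured to store the programs run by the DSP. The RFIC 731 may be a wireless signal processing chip.
Understandably, the processor of the foregoing embodiments of the present invention may be applied to the DSP 711 in FIG. 13. The DSP 711 may be a single-core digital signal processor that processes wireless baseband signals to implement wireless service transmission.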
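A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present invention.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.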
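In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form. Components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.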
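In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.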
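If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.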
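The foregoing is merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceived by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.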
Claims
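1. An instruction execution method, wherein the method is performed by an execution unit (EU) in a processor, and the EU is configured to obtain a first instruction from an instruction queue of a bus interface unit (BIU), execute the first instruction, and write a result of executing the first instruction to a register file, characterized in that the EU comprises a token management unit and at least two instruction execution units, the at least two instruction execution units comprise a first instruction execution unit and a second instruction execution unit, and the method comprises:
sending, by the token management unit, a first read token to the first instruction execution unit, wherein the first read token instructs the first instruction execution unit to read an operand of the first instruction to be executed;
reading, by the first instruction execution unit, the operand of the first instruction as instructed by the first read token;
releasing, by the first instruction execution unit, the first read token after the process of reading the operand of the first instruction has started;
sending, by the token management unit after determining that the first read token has been released, a second read token to the second instruction execution unit, wherein the second read token instructs the second instruction execution unit to read an operand of a second instruction to be executed, and the second instruction is the to-be-executed instruction adjacent to the first instruction in the instruction queue;
executing, by the first instruction execution unit, the first instruction according to the operand of the first instruction; and
writing, by the first instruction execution unit, the result of executing the first instruction to the register file.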
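2. The method according to claim 1, characterized in that the writing, by the first instruction execution unit, the result of executing the first instruction to the register file comprises:
receiving, by the first instruction execution unit, a first write token sent by the token management unit, wherein the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register file;
writing, by the first instruction execution unit after executing the first instruction, the result of executing the first instruction to the register file as instructed by the first write token; and
releasing, by the first instruction execution unit, the first write token after the process of writing the result of executing the first instruction to the register file has started, so that the token management unit, after determining that the first write token has been released, sends a second write token to the second instruction execution unit, wherein the second write token instructs the second instruction execution unit to write a result of executing the second instruction to the register file.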
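3. The method according to claim 2, characterized in that the releasing, by the first instruction execution unit, the first write token after the process of writing the result of executing the first instruction to the register file has started comprises:
releasing, by the first instruction execution unit, the first write token and sending a first write token release message to the token management unit at a first moment before the completion moment of the process of writing the result of executing the first instruction to the register file, wherein the duration between the first moment and the completion moment equals a preset first threshold, and the first threshold is less than the duration of the process of writing the result of executing the first instruction to the register file; and
the sending, by the token management unit after determining that the first write token has been released, a second write token to the second instruction execution unit comprises:
sending, by the token management unit, the second write token to the second instruction execution unit upon receiving the first write token release message.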
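4. The method according to any one of claims 1 to 3, characterized in that the releasing, by the first instruction execution unit, the first read token after the process of reading the operand of the first instruction has started comprises:
releasing, by the first instruction execution unit, the first read token and sending a first read token release message to the token management unit at a second moment before the end moment of the process of reading the operand of the first instruction, wherein the duration between the second moment and the end moment equals a preset second threshold, and the second threshold is less than the duration of the process of reading the operand of the first instruction; and
the sending, by the token management unit after determining that the first read token has been released, a second read token to the second instruction execution unit comprises:
sending, by the token management unit, the second read token to the second instruction execution unit upon receiving the first read token release message.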
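5. The method according to any one of claims 1 to 4, characterized in that the EU further comprises a pre-decoding unit,
the obtaining a first instruction from an instruction queue of a BIU comprises: obtaining, by the pre-decoding unit, the first instruction from the instruction queue of the BIU; and
the method further comprises: pre-decoding, by the pre-decoding unit, the first instruction, determining that the instruction execution unit used to execute the first instruction is the first instruction execution unit, sending the first instruction to the first instruction execution unit, and sending an identifier of the first instruction execution unit to the token management unit.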
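6. The method according to claim 5, characterized in that the method further comprises:
pre-decoding, by the pre-decoding unit, the second instruction, and determining that the instruction execution unit used to execute the second instruction is the second instruction execution unit; and
sending, by the pre-decoding unit, the second instruction to the second instruction execution unit, and sending an identifier of the second instruction execution unit to the token management unit.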
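7. The method according to claim 6, characterized in that:
before the token management unit sends the first read token to the first instruction execution unit, the method further comprises: receiving, by the token management unit, the identifier of the first instruction execution unit from the pre-decoding unit;
the sending, by the token management unit, a first read token to the first instruction execution unit comprises: sending, by the token management unit, the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit;
before the first instruction execution unit reads the operand of the first instruction as instructed by the first read token, the method further comprises: receiving, by the first instruction execution unit, the first instruction from the pre-decoding unit;
before the second read token is sent to the second instruction execution unit, the method further comprises: receiving, by the token management unit, the identifier of the second instruction execution unit from the pre-decoding unit; and
the sending a second read token to the second instruction execution unit comprises: sending the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.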
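8. A processor, wherein an execution unit (EU) in the processor is configured to obtain a first instruction from an instruction queue of a bus interface unit (BIU), execute the first instruction, and write a result of executing the first instruction to a register file, characterized in that the EU comprises a token management unit and at least two instruction execution units, and the at least two instruction execution units comprise a first instruction execution unit and a second instruction execution unit, wherein:
the token management unit is configured to send a first read token to the first instruction execution unit, wherein the first read token instructs the first instruction execution unit to read an operand of the first instruction to be executed;
the first instruction execution unit is configured to read the operand of the first instruction as instructed by the first read token, and to release the first read token after the process of reading the operand of the first instruction has started;
the token management unit is further configured to send, after determining that the first read token has been released, a second read token to the second instruction execution unit, wherein the second read token instructs the second instruction execution unit to read an operand of a second instruction to be executed, and the second instruction is the to-be-executed instruction adjacent to the first instruction in the instruction queue; and
the first instruction execution unit is further configured to execute the first instruction according to the operand of the first instruction, and to write the result of executing the first instruction to the register file.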
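9. The processor according to claim 8, characterized in that:
the token management unit is further configured to send a first write token to the first instruction execution unit, wherein the first write token instructs the first instruction execution unit to write the result of executing the first instruction to the register file;
the first instruction execution unit is specifically configured to receive the first write token sent by the token management unit; after executing the first instruction, write the result of executing the first instruction to the register file as instructed by the first write token; and release the first write token after the process of writing the result of executing the first instruction to the register file has started; and
the token management unit is further configured to send, after determining that the first write token has been released, a second write token to the second instruction execution unit, wherein the second write token instructs the second instruction execution unit to write a result of executing the second instruction to the register file.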
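10. The processor according to claim 9, characterized in that:
the first instruction execution unit is specifically configured to release the first write token and send a first write token release message to the token management unit at a first moment before the completion moment of the process of writing the result of executing the first instruction to the register file, wherein the duration between the first moment and the completion moment equals a preset first threshold, and the first threshold is less than the duration of the process of writing the result of executing the first instruction to the register file; and
the token management unit is specifically configured to send the second write token to the second instruction execution unit upon receiving the first write token release message.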
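11. The processor according to any one of claims 8 to 10, characterized in that:
the first instruction execution unit is specifically configured to release the first read token and send a first read token release message to the token management unit at a second moment before the end moment of the process of reading the operand of the first instruction, wherein the duration between the second moment and the end moment equals a preset second threshold, and the second threshold is less than the duration of the process of reading the operand of the first instruction; and
the token management unit is specifically configured to send the second read token to the second instruction execution unit upon receiving the first read token release message.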
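12. The processor according to any one of claims 8 to 11, characterized in that the EU further comprises a pre-decoding unit,
the pre-decoding unit is configured to obtain the first instruction from the instruction queue of the BIU; and
the pre-decoding unit is further configured to pre-decode the first instruction, determine that the instruction execution unit used to execute the first instruction is the first instruction execution unit, send the first instruction to the first instruction execution unit, and send an identifier of the first instruction execution unit to the token management unit.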
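13. The processor according to claim 12, characterized in that the pre-decoding unit is further configured to:
obtain the second instruction from the instruction queue of the BIU;
pre-decode the second instruction, and determine that the instruction execution unit used to execute the second instruction is the second instruction execution unit; and
send the second instruction to the second instruction execution unit, and send an identifier of the second instruction execution unit to the token management unit.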
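14. The processor according to claim 13, characterized in that:
the token management unit is further configured to receive the identifier of the first instruction execution unit from the pre-decoding unit, and to receive the identifier of the second instruction execution unit from the pre-decoding unit;
the first instruction execution unit is further configured to receive the first instruction from the pre-decoding unit; and
the token management unit is specifically configured to send the first read token to the first instruction execution unit according to the identifier of the first instruction execution unit, and to send the second read token to the second instruction execution unit according to the identifier of the second instruction execution unit.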
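15. A communications device, characterized in that the communications device comprises a processor and a memory, and the processor and the memory are connected by a bus system, wherein:
the processor comprises the processor according to any one of claims 8 to 14; and
the memory is configured to store the programs run by the processor and the associated data.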
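The read-token handoff recited in claims 1 and 4 can be illustrated with a small timing model. The Python sketch below is not the claimed implementation; it assumes that time is counted in whole cycles, that the release message reaches the token management unit with zero delay, and that the next instruction execution unit is idle and starts reading as soon as it receives its token. The names `Instr`, `read_token_schedule`, and `early_release` are hypothetical; `early_release` stands in for the preset second threshold.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    unit: str         # label of the instruction execution unit chosen by pre-decoding
    read_cycles: int  # assumed duration of the operand read, in cycles


def read_token_schedule(queue, early_release):
    """Return [(unit, cycle at which its read token is granted), ...].

    Each unit releases its read token `early_release` cycles before its
    operand read ends, and the next token is granted at that moment.
    """
    grants = []
    now = 0  # the first read token is sent at cycle 0
    for ins in queue:
        # The threshold must be shorter than the read itself, so that the
        # release always happens after the read has started.
        if not 0 <= early_release < ins.read_cycles:
            raise ValueError("threshold must be in [0, read_cycles)")
        grants.append((ins.unit, now))
        # Release time of this token, which is the grant time of the next one
        # (assumption: zero-delay release message).
        now += ins.read_cycles - early_release
    return grants
```

With reads of 3, 2, and 2 cycles on units labelled LD, MAC, and ST, a threshold of 0 grants the tokens at cycles 0, 3, and 5, while a threshold of 1 grants them at cycles 0, 2, and 3, so each following read starts one cycle earlier per handoff. The same model applies to the write tokens of claims 2 and 3 if the read durations are replaced by write durations and the threshold by the first threshold.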
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201480081514.2A CN106796505B (zh) | 2014-08-29 | 2014-08-29 | 指令执行的方法及处理器 |
EP14900708.0A EP3176692A4 (en) | 2014-08-29 | 2014-08-29 | Instruction execution method and processor |
PCT/CN2014/085555 WO2016029444A1 (zh) | 2014-08-29 | 2014-08-29 | 指令执行的方法及处理器 |
US15/445,677 US20170185411A1 (en) | 2014-08-29 | 2017-02-28 | Instruction execution method and processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2014/085555 WO2016029444A1 (zh) | 2014-08-29 | 2014-08-29 | 指令执行的方法及处理器 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/445,677 Continuation US20170185411A1 (en) | 2014-08-29 | 2017-02-28 | Instruction execution method and processor |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016029444A1 true WO2016029444A1 (zh) | 2016-03-03 |
Family
ID=55398655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/085555 WO2016029444A1 (zh) | 2014-08-29 | 2014-08-29 | 指令执行的方法及处理器 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20170185411A1 (zh) |
EP (1) | EP3176692A4 (zh) |
CN (1) | CN106796505B (zh) |
WO (1) | WO2016029444A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110799938A (zh) * | 2018-09-30 | 2020-02-14 | 深圳市大疆创新科技有限公司 | 令牌管理方法、装置、芯片及可移动平台 |
CN109947479A (zh) * | 2019-01-29 | 2019-06-28 | 安谋科技(中国)有限公司 | 指令执行方法及其处理器、介质和系统 |
US11900156B2 (en) * | 2019-09-24 | 2024-02-13 | Speedata Ltd. | Inter-thread communication in multi-threaded reconfigurable coarse-grain arrays |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1744033A (zh) * | 2004-09-02 | 2006-03-08 | 国际商业机器公司 | 流水线处理的方法和设备 |
CN101025681A (zh) * | 2006-02-09 | 2007-08-29 | 国际商业机器公司 | 最小化未调度d缓存缺失流水线停顿的方法和装置 |
CN101136735A (zh) * | 2006-09-12 | 2008-03-05 | 中兴通讯股份有限公司 | 使用通用异步收发报机的半双工串口通信系统及通信方法 |
US20120066690A1 (en) * | 2010-09-15 | 2012-03-15 | Gagan Gupta | System and Method Providing Run-Time Parallelization of Computer Software Using Data Associated Tokens |
CN103874968A (zh) * | 2011-08-03 | 2014-06-18 | 康奈尔大学 | 用于高性能异步电路的节能流水线电路模板 |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5185873A (en) * | 1988-02-19 | 1993-02-09 | Hitachi, Ltd. | Control system with flag indicating two or less data inputs and counter indicating two or more controlling data driven execution method |
US5842033A (en) * | 1992-06-30 | 1998-11-24 | Discovision Associates | Padding apparatus for passing an arbitrary number of bits through a buffer in a pipeline system |
KR100243100B1 (ko) * | 1997-08-12 | 2000-02-01 | 정선종 | 다수의 주프로세서 및 보조 프로세서를 갖는 프로세서의구조 및 보조 프로세서 공유 방법 |
US7930578B2 (en) * | 2007-09-27 | 2011-04-19 | International Business Machines Corporation | Method and system of peak power enforcement via autonomous token-based control and management |
2014
- 2014-08-29 EP EP14900708.0A patent/EP3176692A4/en not_active Withdrawn
- 2014-08-29 WO PCT/CN2014/085555 patent/WO2016029444A1/zh active Application Filing
- 2014-08-29 CN CN201480081514.2A patent/CN106796505B/zh active Active
2017
- 2017-02-28 US US15/445,677 patent/US20170185411A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
EP3176692A4 (en) | 2018-09-19 |
US20170185411A1 (en) | 2017-06-29 |
EP3176692A1 (en) | 2017-06-07 |
CN106796505A (zh) | 2017-05-31 |
CN106796505B (zh) | 2019-03-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14900708 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REEP | Request for entry into the european phase |
Ref document number: 2014900708 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2014900708 Country of ref document: EP |