WO2023142502A1 - Loop instruction processing method, apparatus, chip, electronic device, and storage medium - Google Patents

Loop instruction processing method, apparatus, chip, electronic device, and storage medium

Info

Publication number
WO2023142502A1
WO2023142502A1 PCT/CN2022/120852 CN2022120852W WO2023142502A1 WO 2023142502 A1 WO2023142502 A1 WO 2023142502A1 CN 2022120852 W CN2022120852 W CN 2022120852W WO 2023142502 A1 WO2023142502 A1 WO 2023142502A1
Authority
WO
WIPO (PCT)
Prior art keywords
loop
instruction
layer
instructions
execution
Prior art date
Application number
PCT/CN2022/120852
Other languages
English (en)
French (fr)
Inventor
霍冠廷
王文强
徐宁仪
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023142502A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3005 Arrangements for executing specific machine instructions to perform operations for flow control
    • G06F9/30065 Loop control instructions; iterative instructions, e.g. LOOP, REPEAT
    • G06F9/30069 Instruction skipping instructions, e.g. SKIP
    • G06F9/30098 Register arrangements

Definitions

  • the present disclosure relates to the technical field of computers, and in particular, to a loop instruction processing method, device, chip, electronic equipment, and computer-readable storage medium.
  • the execution process of the for loop statement is usually controlled by the condition judgment branch in the compilation result.
  • each statement in the for loop will be compiled into a corresponding instruction.
  • the for loop statement will be compiled into more complex instructions, thereby affecting the processing efficiency of the processor.
  • Embodiments of the present disclosure at least provide a loop instruction processing method, device, chip, electronic equipment, and computer-readable storage medium.
  • an embodiment of the present disclosure provides a loop instruction processing method applied to a processor, including: acquiring a multi-layer loop instruction, wherein the multi-layer loop instruction includes multiple layers of nested loop instructions; obtaining, from a parameter register of the processor, the loop parameters of each layer of loop instruction in the multi-layer loop instruction; during the execution of each layer of loop instruction, determining the execution logic of the loop body in that layer's loop instruction based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information indicates the real-time execution state of that layer's loop instruction; and controlling the execution of the loop body of that layer's loop instruction based on the execution logic.
  • the instruction function of the multi-layer loop instruction can be realized by executing the loop body of each layer of loop instruction, so that execution of the for statement itself is omitted and the instruction cycle is simplified, thereby improving the execution efficiency of the processor.
  • determining the execution logic of the loop body in the layer's loop instruction based on the state information of the layer's loop instruction and the loop parameters of the layer's loop instruction includes: obtaining the real-time loop count of the layer's loop instruction; comparing the real-time loop count with the loop end parameter of the layer's loop instruction to obtain a comparison result; and determining the execution logic of the loop body in the layer's loop instruction according to the comparison result.
  • the jump logic of the loop body in each layer of loop instructions can be controlled through the real-time loop times of each layer of loop instructions.
  • the execution steps of the for loop can be omitted, so that the instruction cycle is simplified, thereby improving the execution efficiency of the processor.
  • the method further includes: detecting, among the multi-layer loop instructions, a target-layer loop instruction that satisfies an update condition, wherein the update condition is the condition for updating the real-time loop count of the target-layer loop instruction; and updating the real-time loop count of the target-layer loop instruction.
  • the real-time loop count of each layer of loop instruction is maintained by a status register in the processor, and the number of times the loop body of each layer is executed can be controlled by hardware in the processor, so that the real-time execution state of each layer of loop instruction is maintained dynamically by the hardware itself and the loop instructions are implemented efficiently.
  • detecting the target-layer loop instruction satisfying the update condition among the multi-layer loop instructions includes: obtaining the loop jump signal of the layer's loop instruction, wherein the loop jump signal indicates whether to jump to the instruction indicated by the starting PC pointer of the layer's loop instruction; obtaining the execution information of the next-layer loop instruction nested in the layer's loop instruction; and, when the loop jump signal of the layer's loop instruction is determined to be a jump-back signal and the execution information of the next-layer loop instruction is execution completion, determining that the layer's loop instruction is the target-layer loop instruction satisfying the update condition.
  • determining that the loop jump signal of the layer's loop instruction is a jump-back signal includes: when it is detected that the PC pointer points to the last instruction of the layer's loop instruction and that the layer's loop instruction has not yet executed its last loop calculation, determining that the loop jump signal of the layer's loop instruction is a jump-back signal.
  • the loop jump signal and the completion indication signal are maintained by status registers, and hardware in the processor can determine whether the update condition for the real-time loop count is met, so that the real-time execution state of each loop instruction is maintained dynamically by the hardware itself and the loop instructions are implemented efficiently.
  • determining the execution logic of the loop body in the layer's loop instruction based on the state information of the layer's loop instruction and the loop parameters of the layer's loop instruction includes: detecting the instruction end signal of the layer's loop instruction; and determining the execution logic of the loop body in the layer's loop instruction based on the detected instruction end signal and the loop parameters of the layer's loop instruction.
  • detecting the instruction end signal of the layer's loop instruction includes: when it is detected that the layer's loop instruction has executed to the last instruction of its last loop process, determining that the instruction end signal of the layer's loop instruction is detected.
  • the execution logic of the loop body in each layer of loop instructions can be controlled through the instruction end signal of each layer of loop instructions.
  • the execution steps of the for loop can be omitted, so that the instruction cycle is simplified, thereby improving the execution efficiency of the processor.
  • detecting that the layer's loop instruction has executed to the last instruction of its last loop process includes: obtaining the next instruction cycle of the loop instruction pointed to by the PC pointer at the current moment, to obtain a first instruction cycle; and, when it is determined that the first instruction cycle is greater than a target value, that execution of the loop instructions nested in the layer's loop instruction is complete, and that the layer's loop instruction is executing its last loop process, determining that execution has reached the last instruction of the last loop process of the layer's loop instruction; wherein the target value is the sum of the start pointer of the layer's loop instruction and the number of instructions contained in the loop body of the layer's loop instruction.
  • the method includes: determining a first loop instruction among the layers of loop instructions in the multi-layer loop instruction, wherein the first loop instruction is a loop instruction to which an identifier is to be allocated; determining, among preset loop identifiers, an idle loop identifier in an idle state; determining the loop identifier of the first loop instruction based on the idle loop identifier, wherein the loop identifier of the first loop instruction indicates the loop layer number of the first loop instruction; and storing the loop parameters of the first loop instruction in the parameter register based on the loop identifier of the first loop instruction.
  • the method further includes: when the preset loop identifiers contain no idle loop identifier, detecting a second loop instruction in the multi-layer loop instruction, wherein the second loop instruction is a loop instruction whose loop has ended; and determining the loop identifier of the first loop instruction based on the loop identifier of the second loop instruction.
  • detecting the second loop instruction in the multi-layer loop instruction includes: acquiring the instruction working state of each layer of loop instruction in the multi-layer loop instruction; determining, according to the instruction working state, the loop instructions in the multi-layer loop instruction whose execution is complete; and determining the second loop instruction from the loop instructions whose execution is complete.
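  • As an illustration only, the identifier selection described above can be sketched as follows. The function name `allocate_loop_id`, the use of Python sets, and the choice of the lowest-numbered candidate are assumptions made for this sketch; the disclosure does not specify them.

```python
from typing import Iterable, Optional, Set


def allocate_loop_id(preset_ids: Iterable[int],
                     busy_ids: Set[int],
                     finished_ids: Set[int]) -> Optional[int]:
    """Pick a loop identifier for a loop instruction that needs one.

    preset_ids:   all identifiers the hardware provides (hypothetical).
    busy_ids:     identifiers currently assigned to some loop instruction.
    finished_ids: assigned identifiers whose loop has finished executing.
    """
    ordered = sorted(preset_ids)
    # First choice: an identifier that is still idle.
    for loop_id in ordered:
        if loop_id not in busy_ids:
            return loop_id
    # Otherwise reuse the identifier of a loop that has already ended
    # (the "second loop instruction" in the text).
    for loop_id in ordered:
        if loop_id in finished_ids:
            return loop_id
    # No identifier can be assigned right now.
    return None
```

With identifiers 0 and 1 both busy and only loop 1 finished, the sketch returns 1; if neither has finished, it returns `None`.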
  • controlling the execution of the loop body in each layer's loop instruction based on the execution logic includes: when the execution logic is to jump to the instructions pointed to by the starting PC pointers of multiple loop instructions, determining the innermost loop instruction among the multiple loop instructions as the instruction to be jumped to; and jumping to the starting PC pointer of the instruction to be jumped to and executing that instruction.
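  • A minimal sketch of this priority rule, assuming the loops are listed from outermost to innermost and each is described by a hypothetical `(start_pc, wants_jump_back)` pair:

```python
from typing import List, Optional, Tuple


def select_jump_target(loops: List[Tuple[int, bool]]) -> Optional[int]:
    """Return the start PC to jump to, preferring the innermost loop.

    loops is ordered from outermost to innermost; each entry holds the
    loop's starting PC pointer and whether that loop requests a jump back.
    The data layout is an assumption made for this illustration.
    """
    for start_pc, wants_jump_back in reversed(loops):
        if wants_jump_back:
            return start_pc
    # No loop requests a jump back, so execution simply continues.
    return None
```

When both an outer loop starting at PC 0 and an inner loop starting at PC 1 request a jump back, the sketch selects PC 1.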
  • an embodiment of the present disclosure provides a loop instruction processing device, including: a controller, a parameter register, and an arithmetic unit; the controller is used to acquire a multi-layer loop instruction, wherein the multi-layer loop instruction includes multiple layers of nested loop instructions; to obtain, from the parameter register, the loop parameters of each layer of loop instruction in the multi-layer loop instruction; and, during the execution of each layer of loop instruction, to determine the execution logic of the loop body in that layer's loop instruction based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information indicates the real-time execution state of that layer's loop instruction; the arithmetic unit is used to control the execution of the loop body in each layer's loop instruction based on the execution logic.
  • an embodiment of the present disclosure further provides a chip, which is characterized by comprising: the instruction processing device according to any one of the second aspect.
  • an embodiment of the present disclosure further provides an electronic device, including: a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the loop instruction processing method described in any one of the above-mentioned first aspects are executed.
  • an embodiment of the present disclosure further provides an electronic device, including the chip as described in the third aspect.
  • the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the loop instruction processing method described in any one of the above-mentioned first aspects are executed.
  • FIG. 1 shows a flowchart of a loop instruction processing method provided by an embodiment of the present disclosure;
  • FIG. 2 shows a flowchart of a specific method, in a loop instruction processing method provided by an embodiment of the present disclosure, for determining the execution logic of the loop body in each layer's loop instruction based on the state information and the loop parameters of that layer's loop instruction;
  • FIG. 3 shows a flowchart of a specific method, in a loop instruction processing method provided by an embodiment of the present disclosure, for detecting loop instructions that satisfy the update condition among the layers of loop instructions;
  • FIG. 4 shows a flowchart of another specific method, in a loop instruction processing method provided by an embodiment of the present disclosure, for determining the execution logic of the loop body in each layer's loop instruction based on the state information and the loop parameters of that layer's loop instruction;
  • FIG. 5 shows a schematic diagram of a loop instruction processing device provided by an embodiment of the present disclosure;
  • FIG. 6 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • A and/or B may mean three cases: A exists alone, A and B exist simultaneously, or B exists alone.
  • "at least one" herein means any one of multiple items or any combination of at least two of them; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set formed by A, B, and C.
  • Neural network is an algorithmic mathematical model that imitates the behavior characteristics of animal neural networks and performs distributed parallel information processing. It is a nonlinear and adaptive information processing system composed of a large number of processing units interconnected. Research on neural networks can promote or accelerate the development of artificial intelligence.
  • Convolution calculation is one of the most important functions that need to be realized in the neural network.
  • multiple loop statements are usually used for multi-dimensional images and convolution kernels.
  • control information such as loop instructions is very necessary.
  • the execution process of the for loop statement is usually controlled by the condition judgment branch in the compilation result.
  • each statement in the for loop will be compiled into a corresponding instruction.
  • the for loop statement will be compiled into more complex instructions, thereby affecting the processing efficiency of the processor.
  • the present disclosure provides a loop instruction processing method, device, electronic equipment, and computer-readable storage medium.
  • the loop parameters of each layer of loop instruction in the multi-layer loop instruction can be obtained from the parameter register of the processor; during the execution of each layer of loop instruction, the execution logic of the loop body in that layer's loop instruction is determined based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information indicates the real-time execution state of that layer's loop instruction; the loop body in that layer's loop instruction is then executed under the control of the execution logic.
  • the instruction function of the multi-layer loop instruction can be realized by executing the loop body of each layer of loop instruction, so that execution of the for statement itself is omitted and the instruction cycle is simplified, thereby improving the execution efficiency of the processor.
  • the execution subject of the loop instruction processing method provided by the embodiments of the present disclosure is generally an electronic device with a certain computing capability.
  • FIG. 1 is a flow chart of a method for processing loop instructions provided by an embodiment of the present disclosure, the method includes steps S101-S107.
  • S101 Acquire multi-layer loop instructions; wherein, the multi-layer loop instructions include multi-layer nested loop instructions.
  • the multi-level loop instruction may be a loop instruction including N layers of nesting, where N is a positive integer greater than 1.
  • the multi-layer loop instruction may be a 2-layer loop instruction or a 3-layer loop instruction, and the present disclosure does not specifically limit the number of instruction layers of the multi-layer loop instruction.
  • the multi-level loop instruction can be the following for loop instruction:
  • S103 Obtain a loop parameter of each layer of loop instructions in the multi-layer loop instructions from a parameter register of the processor.
  • one or more parameter registers may be pre-determined in the processor for each layer of loop instructions.
  • the parameter register is used to store the loop parameters of each loop instruction.
  • the loop parameters of each loop instruction may include the following parameters: the starting PC pointer of the loop instruction, the instruction quantity of the loop body, the loop step size, the loop end parameter (for example, the number of loops) and other information.
  • Each parameter register contains corresponding layer number information, and the layer number information is the loop layer number of the loop instruction corresponding to the parameter register in the multi-layer loop instruction.
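  • One possible way to picture the contents of the parameter registers is the record below. The class name `LoopParams`, the field names, the layer numbering (1 for the outer loop, 0 for the inner loop), and the concrete values are all hypothetical; only the list of stored parameters comes from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class LoopParams:
    layer: int      # loop layer number that this parameter register serves
    start_pc: int   # starting PC pointer of the loop instruction
    body_len: int   # number of instructions in the loop body
    step: int       # loop step size
    end_count: int  # loop end parameter, here taken as the number of loops


def make_param_registers(loops: Iterable[LoopParams]) -> Dict[int, LoopParams]:
    """Model the parameter registers as a table indexed by layer number."""
    return {params.layer: params for params in loops}


# Hypothetical two-layer example: an outer loop whose body contains an inner loop.
regs = make_param_registers([
    LoopParams(layer=1, start_pc=0, body_len=2, step=1, end_count=2),
    LoopParams(layer=0, start_pc=1, body_len=1, step=1, end_count=2),
])
```

Looking up `regs[0]` then yields the parameters of the loop stored under layer number 0.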
  • S105 During the execution of each layer of loop instruction, determine the execution logic of the loop body in that layer's loop instruction based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction; wherein the state information indicates the real-time execution state of that layer's loop instruction.
  • the multi-level loop instruction can be the following for loop instruction:
  • the loop body in each layer of loop instructions can be executed, thereby omitting the execution of each layer of loop instructions.
  • execution logic of the loop body in each layer of loop instructions can be understood as the jump logic of the loop body in each layer of loop instructions.
  • the loop parameters of each layer of loop instruction in the multi-layer loop instruction can be obtained from the parameter register of the processor; during the execution of each layer of loop instruction, the execution logic of the loop body in that layer's loop instruction is determined based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information indicates the real-time execution state of that layer's loop instruction; the loop body in that layer's loop instruction is then executed under the control of the execution logic.
  • the instruction function of the multi-layer loop instruction can be realized by executing the loop body of each layer of loop instruction, so that execution of the for statement itself is omitted and the instruction cycle is simplified, thereby improving the execution efficiency of the processor.
  • the loop parameters of each layer of loop instructions can be parsed, and the parsed loop parameters can be stored in corresponding parameter registers.
  • the loop parameters of the layer's loop instruction can be obtained from the parameter register of the processor; and, in the process of executing each layer of loop instruction in the multi-layer loop instruction, the state information in the status register of the processor is obtained, the state information indicating the real-time execution state of that layer's loop instruction.
  • the real-time execution state may correspond to the execution state of multiple dimensions, and different state registers may be set for the execution state of each dimension of each layer of loop instructions.
  • the execution logic of the loop body in the loop instruction of this layer can be determined based on the state information of the loop instruction of each layer and the loop parameters of the loop instruction of this layer.
  • step S105 based on the state information of the loop instruction of the layer and the loop parameters of the loop instruction of the layer, determine the execution logic of the loop body in the loop instruction of the layer, specifically including:
  • S203 Determine the execution logic of the loop body in the loop instruction of this layer according to the comparison result.
  • the real-time cycle number of the layer of loop instruction can be obtained, so as to compare the real-time cycle number with the loop end parameter of the layer of loop instruction to obtain a comparison result.
  • the real-time number of loops of each layer of loop instructions may be acquired from the first status register.
  • the first status register is the register, among the status registers of the processor, that stores the real-time loop count of each layer of loop instruction.
  • the loop end parameter can be understood as the maximum number of loops of the loop instruction in this layer.
  • the real-time cycle number can be compared with the maximum cycle number to obtain a comparison result.
  • the comparison result may be that the number of real-time cycles is equal to the maximum number of cycles, or the number of real-time cycles is smaller than the maximum number of cycles.
  • the execution logic of the loop body in the loop instruction of the layer can be determined according to the comparison result, for example, jump back to continue executing the loop instruction of the layer, or execute the loop instruction of the next layer.
  • the real-time loop count of this layer's loop instruction can be compared with the maximum loop count of this layer's loop instruction, so as to determine, according to the comparison result, whether to jump back and continue executing this layer's loop instruction or to execute the loop instruction of the next layer.
  • the jump logic of the loop body in each layer of loop instructions can be controlled through the real-time loop times of each layer of loop instructions.
  • the execution steps of the for loop can be omitted, so that the instruction cycle is simplified, thereby improving the execution efficiency of the processor.
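  • The comparison step can be sketched as below. The function name, the string labels, and the convention that the real-time loop count starts at 1 and equals the maximum loop count during the last loop are assumptions made for this illustration.

```python
def loop_body_action(real_time_count: int, max_count: int) -> str:
    """Decide what happens when the end of a loop body is reached.

    Assumed convention: real_time_count is 1 during the first loop and
    equals max_count during the last loop.
    """
    if real_time_count < max_count:
        # More loops remain: jump back to the loop's starting PC pointer.
        return "jump_back"
    # The count equals the maximum: this was the last loop, so move on.
    return "continue"
```

For a loop with a maximum loop count of 2, the sketch returns `"jump_back"` for a count of 1 and `"continue"` for a count of 2.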
  • the real-time number of loops of each loop instruction can be obtained from the first status register.
  • the number of real-time cycles stored in the first status register needs to be updated in real time.
  • the specific update method is described as follows:
  • the update condition is a condition for updating the real-time cycle times of the target layer loop instruction
  • each layer of loop instruction may be checked to detect whether it satisfies the update condition. If a loop instruction satisfying the update condition is detected, the real-time loop count of that layer's loop instruction may be updated.
  • the real-time loop count of each layer of loop instruction is maintained by a status register in the processor, and the number of times the loop body of each layer is executed can be controlled by hardware in the processor, so that the real-time execution state of each layer of loop instruction is maintained dynamically by the hardware itself and the loop instructions are implemented efficiently.
  • the above step: detecting target layer loop instructions satisfying update conditions in the multi-layer loop instructions specifically includes the following steps:
  • a loop jump signal loop_jump is maintained for each layer of loop instruction in the multi-layer loop instruction, wherein the loop jump signal indicates whether to jump to the instruction indicated by the starting PC pointer of that layer's loop instruction.
  • the loop jump signal of each layer of loop instructions can be obtained, and the execution information of the nested loop instructions in the layer of loop instructions can be obtained, for example, the loop execution is completed or the loop execution is not completed.
  • the loop jump signal loop_jump of each layer of loop instructions can be obtained from the second status register (referred to as register R2 ) of the status registers during the execution of each layer of loop instructions.
  • the second state register is a register in the state register of the processor for storing the loop jump signal loop_jump of each layer of loop instructions.
  • the execution information of the nested loop instructions in the layer of loop instructions can also be obtained from the third status register (recorded as register R3) of the status register.
  • the third status register (register R3) is a register in the status register of the processor for storing the completion indication signal loop_lower_done of the inner loop instruction of each loop instruction.
  • the execution information of the loop instruction nested in the loop instruction of the layer can be understood as the completion indication signal loop_lower_done of the inner loop instruction of the loop instruction of the layer.
  • after the loop jump signal and the execution information are acquired, whether this layer's loop instruction satisfies the update condition may be determined based on the loop jump signal and the execution information.
  • if the loop jump signal loop_jump is pulled high and the completion indication signal loop_lower_done is pulled high, it is determined that this layer's loop instruction meets the update condition; at this time, the real-time loop count stored in the first status register corresponding to this layer's loop instruction can be updated, for example by adding 1 to the real-time loop count.
  • a loop jump signal loop_jump that is pulled high can be understood as a jump-back signal, indicating a jump to the instruction indicated by the starting PC pointer of this layer's loop instruction; a completion indication signal loop_lower_done that is pulled high indicates that all inner loops of this layer's loop instruction have finished executing.
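  • Expressed as code, the update rule amounts to a single conditional increment. This is only a sketch: the function name is hypothetical and the two signals are modelled as booleans, with `True` standing for "pulled high".

```python
def update_loop_count(count: int, loop_jump: bool, loop_lower_done: bool) -> int:
    """Return the new real-time loop count for one layer.

    The count is incremented only when the layer's loop_jump signal is high
    and its loop_lower_done signal is high; otherwise it is left unchanged.
    """
    if loop_jump and loop_lower_done:
        return count + 1
    return count
```

A layer whose `loop_jump` is high while an inner loop is still running (`loop_lower_done` low) therefore keeps its current count.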
  • step S303 determining that the loop jump signal of the layer loop instruction is a jumpback signal specifically includes: detecting that the PC pointer points to the last instruction of the layer loop instruction and detecting When the loop instruction of this layer has not executed the last loop calculation, it is determined that the loop jump signal of the loop instruction of this layer is a jumpback signal.
  • the loop jump signal loop_jump is used to determine whether to jump back to the start PC pointer of the loop instruction of the layer to continue executing the loop instruction at the end of the loop body of the loop instruction of the layer. layer loop directive.
  • condition 1 the conditions for determining that the loop jump signal of the loop instruction at this layer is the back jump signal (or the loop jump signal loop_jump is pulled high) include condition 1 and condition 2.
  • Condition 2 The loop instruction at this layer has not executed the last loop calculation.
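  • The two conditions can be combined into one predicate, sketched below. The parameter names are hypothetical, the address of the loop's last instruction is passed in directly rather than derived, and "has not yet executed its last loop calculation" is modelled as the real-time loop count being smaller than the maximum loop count; these are assumptions of the sketch.

```python
def loop_jump_signal(pc: int, last_pc: int, count: int, max_count: int) -> bool:
    """Return True (pulled high) when the layer should jump back.

    pc:        current PC pointer.
    last_pc:   address of the last instruction of this layer's loop.
    count:     real-time loop count of this layer (assumed to start at 1).
    max_count: loop end parameter, taken as the maximum loop count.
    """
    at_last_instruction = pc == last_pc   # condition 1
    not_last_loop = count < max_count     # condition 2
    return at_last_instruction and not_last_loop
```

The signal stays low whenever either condition fails, for example on the last instruction of the final loop.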
  • the multi-level loop instruction can be the following for loop instruction:
  • the loop jump signal loop_jump1 in the low state can be written in the register R2 corresponding to the outer loop for1, and the loop jump signal loop_jump2 in the low state can be written in the register R2 corresponding to the inner loop for2.
  • the loop jump signal loop_jump1 written in the register R2 corresponding to the outer loop for1 remains unchanged; the low-state loop jump signal loop_jump2 written in the register R2 corresponding to the inner loop for2 is changed to the high state.
  • the low-state loop jump signal loop_jump1 written in the register R2 corresponding to the outer loop for1 is changed to the high state; the high-state loop jump signal loop_jump2 written in the register R2 corresponding to the inner loop for2 is changed to the low state.
  • instruction 4 is not the last instruction of the outer loop for1 and belongs to the last loop calculation of the outer loop for1, so loop_jump1 is pulled low; instruction 4 is also not an instruction of the inner loop for2, so loop_jump2 is pulled low.
  • instruction 5 is an instruction in the last loop calculation of the outer loop for1, so loop_jump1 is pulled low; at this time, the inner loop for2 has executed to the last instruction of its loop and has not yet executed its last loop calculation, so loop_jump2 is pulled high.
  • the loop jump signal loop_jump1 written in the register R2 corresponding to the outer loop for1 remains unchanged; the low-state loop jump signal loop_jump2 written in the register R2 corresponding to the inner loop for2 is changed to the high state.
  • When executing instruction 6, instruction 6 is an instruction in the last loop calculation of the outer loop for1, so loop_jump1 is pulled low; at this time, the inner loop for2 has executed to the last instruction of its loop and is in its last loop calculation, so loop_jump2 is pulled low.
  • the loop jump signal loop_jump1 written in the register R2 corresponding to the outer loop for1 remains unchanged; the high-state loop jump signal loop_jump2 written in the register R2 corresponding to the inner loop for2 is changed to the low state.
  • the real-time loop count (for example, 1) stored in the first status register of the outer loop for1 can be read, and it is determined from this count that the outer loop for1 has not yet performed its last loop pass; it is then determined that the loop jump signal of the outer loop for1 is the jump-back signal (that is, the loop jump signal loop_jump is pulled high). This indicates that execution must jump back to the instruction indicated by the start PC pointer of the outer loop for1 and continue executing the outer loop for1, that is, execute instructions 4, 5 and 6 described above.
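  • As an illustration of the jump-back decision described above, the following minimal Python sketch computes the loop jump signal for one layer and the resulting next PC. The function names, the 1-based real-time loop count, and the layout in which the loop body occupies addresses start_pc to start_pc + body_len - 1 are assumptions made for this example; the disclosure does not fix these details.

```python
def loop_jump_signal(pc, start_pc, body_len, real_time_count, end_param,
                     inner_done=True):
    """Return 1 (pulled high) when execution must jump back, else 0 (pulled low).

    Assumptions for this sketch: real_time_count is 1-based, end_param is the
    total number of passes, and inner_done is True for a loop without a
    nested loop.
    """
    at_last_instruction = pc == start_pc + body_len - 1
    in_last_pass = real_time_count >= end_param
    return 1 if (at_last_instruction and inner_done and not in_last_pass) else 0


def next_pc(pc, start_pc, body_len, real_time_count, end_param, inner_done=True):
    """Jump back to the start PC pointer when the signal is high, else fall through."""
    if loop_jump_signal(pc, start_pc, body_len, real_time_count, end_param,
                        inner_done):
        return start_pc
    return pc + 1
```

  • With a hypothetical three-instruction body starting at address 0 and two passes, next_pc(2, 0, 3, 1, 2) jumps back to address 0, while the same call in the second pass falls through to address 3.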
  • in step S303, determining that the execution information of the next-layer loop instruction nested in this layer's loop instruction is "execution completed" specifically includes: all inner loops nested in this layer of loop instructions
  • loop_id-N:0 means that the loop_jump of the inner loop with the number of layers N to 0 in the loop instruction of this layer is all pulled down.
  • from the loop jump signal of the outer loop for1, the loop jump signal of the inner loop for2, and the execution information of the inner loop for2 in each instruction cycle, it can be seen that, in the instruction cycles corresponding to instruction 2 and instruction 5, the loop jump signal of the inner loop for2 is pulled high, and the inner loop for2 does not contain an embedded loop instruction.
  • the real-time loop count of the inner loop for2 is incremented, that is, a "+1" operation is performed on the real-time loop count in the first status register of the inner loop for2.
  • the loop jump signal of the outer loop for1 is pulled high, and the embedded loop instruction of the outer loop for1 (i.e., the inner loop for2) has finished executing (i.e., the loop jump signal of the inner loop for2 is pulled low); at this time, a "+1" operation is performed on the real-time loop count of the outer loop for1, that is, on the real-time loop count in the first status register of the outer loop for1.
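  • The update condition for the real-time loop count can be expressed as a small predicate. The sketch below is illustrative only; the function and parameter names are hypothetical, and signal levels are modeled as 0 (pulled low) and 1 (pulled high).

```python
def should_increment(loop_jump, has_inner_loop, loop_lower_done=0):
    """Decide whether a layer's real-time loop count gets its "+1" operation.

    A loop without an embedded loop is incremented whenever its loop jump
    signal is high; a loop with an embedded loop additionally requires the
    embedded loop to have finished (completion indication signal high).
    """
    if not loop_jump:
        return False
    if has_inner_loop:
        return bool(loop_lower_done)
    return True


def update_count(count, loop_jump, has_inner_loop, loop_lower_done=0):
    """Return the new value of the first status register for this layer."""
    if should_increment(loop_jump, has_inner_loop, loop_lower_done):
        return count + 1
    return count
```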
  • the first status register includes register R11 and register R12
  • the second status register includes register R21 and register R22
  • the third status register includes register R31.
  • the register R11 is used to store the real-time loop times of the outer loop for1
  • the register R21 is used to store the loop jump signal loop_jump1 of the outer loop for1
  • the register R31 is used to store the completion indication signal loop_lower_done1 of the outer loop for1.
  • the register R12 is used to store the real-time loop times of the inner loop for2
  • the register R22 is used to store the loop jump signal loop_jump2 of the inner loop for2. Since the inner loop for2 has no inner loop, the third status register of the inner loop for2 is not set.
  • Register R11: not updated; register R12: not updated;
  • Register R21: loop_jump1 written as pulled low; register R22: loop_jump2 written as pulled low;
  • Register R31: loop_lower_done1 written as pulled low.
  • Register R11: not updated; register R12: real-time loop count + 1;
  • Register R21: not updated; register R22: updated to loop_jump2 pulled high.
  • Register R11: real-time loop count + 1; register R12: not updated;
  • Register R21: updated to loop_jump1 pulled high; register R22: updated to loop_jump2 pulled low;
  • Register R31: updated to loop_lower_done1 pulled high.
  • Register R11: not updated; register R12: not updated;
  • Register R21: updated to loop_jump1 pulled low; register R22: not updated;
  • Register R31: updated to loop_lower_done1 pulled low.
  • Register R11: not updated; register R12: real-time loop count + 1;
  • Register R21: not updated; register R22: updated to loop_jump2 pulled high.
  • Register R11: not updated; register R12: not updated;
  • Register R21: not updated; register R22: updated to loop_jump2 pulled low;
  • Register R31: updated to loop_lower_done1 pulled high.
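  • The register updates listed above can be reproduced with a short simulation. This is a sketch under stated assumptions: the six groups of updates are taken to correspond, in order, to instructions 1 to 6; the outer loop for1 is assumed to contain one instruction of its own followed by the inner loop for2 with a one-instruction body; and both loops are assumed to run two passes. The function name and the dictionary keys are hypothetical.

```python
def trace_two_level(outer_passes=2, inner_passes=2):
    """Signal levels per executed instruction for: for1 { A; for2 { B } }."""
    rows = []
    for i in range(outer_passes):
        last_outer = i == outer_passes - 1
        # Instruction A belongs only to for1: every signal is low, no counter moves.
        rows.append({"jump1": 0, "jump2": 0, "lower_done1": 0,
                     "inc_r11": False, "inc_r12": False})
        for j in range(inner_passes):
            last_inner = j == inner_passes - 1
            jump2 = 0 if last_inner else 1        # for2 jumps back unless in its last pass
            lower_done1 = 1 if last_inner else 0  # for2 has finished executing
            jump1 = 1 if (last_inner and not last_outer) else 0
            rows.append({"jump1": jump1, "jump2": jump2,
                         "lower_done1": lower_done1,
                         "inc_r11": bool(jump1 and lower_done1),  # R11 "+1"
                         "inc_r12": bool(jump2)})                 # R12 "+1"
    return rows


for number, row in enumerate(trace_two_level(), start=1):
    print(number, row)
```

  • Under these assumptions, loop_jump2 is high only for instructions 2 and 5, and loop_jump1 is high only for instruction 3, which matches the listed updates of registers R21, R22 and R31.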
  • the corresponding real-time loop count can be read from the first status register corresponding to this layer's loop instruction and compared with the loop end parameter, and whether to continue executing this layer's loop instruction is determined according to the comparison result.
  • the loop jump signal and the completion indication signal are maintained in the status registers, so that hardware in the processor can determine the update condition of the real-time loop count; the real-time execution status of each layer of loop instructions is thus dynamically self-maintained, which allows loop instructions to be implemented efficiently.
  • step S105 based on the state information of the loop instruction of the layer and the loop parameters of the loop instruction of the layer, determine the execution logic of the loop body in the loop instruction of the layer, specifically including:
  • S402 Based on the detected instruction end signal and the loop parameters of the loop instruction at this level, determine the execution logic of the loop body in the loop instruction at this level.
  • the execution logic of the loop body in this layer's loop instruction can also be determined based on the instruction end signal loop_end of this layer's loop instruction.
  • the instruction end signal loop_end of this layer's loop instruction can be read from the fourth status register (denoted as register R4) corresponding to this layer's loop instruction. If loop_end is detected to be pulled high, it is determined that this layer's loop instruction has finished executing; if loop_end is detected to be pulled low and the real-time loop count of this layer's loop instruction is less than the loop end parameter, this layer's loop instruction continues to be executed.
  • if it is detected that this layer's loop instruction has executed to the last instruction of its last loop pass, it is determined that the instruction end signal of this layer's loop instruction is detected.
  • an instruction end signal loop_end is maintained for each layer of loop instructions, and the instruction end signal loop_end can be stored in the fourth status register of the status register.
  • a fourth status register may be allocated correspondingly, and the fourth status register is used to store the instruction end signal loop_end of each layer of loop instructions.
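  • The decision based on the instruction end signal loop_end and the loop end parameter can be sketched as follows. The function name and the returned strings are hypothetical; the case in which loop_end is low while the count has already reached the loop end parameter is not described in the text, so the sketch reports it as unspecified rather than guessing.

```python
def layer_action(loop_end, real_time_count, end_param):
    """Map the fourth status register and the loop count to an action."""
    if loop_end:
        # loop_end pulled high: this layer's loop instruction has finished.
        return "finished"
    if real_time_count < end_param:
        # loop_end pulled low and passes remain: keep executing this layer.
        return "continue"
    # Combination not covered by the description.
    return "unspecified"
```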
  • the multi-level loop instruction can be the following for loop instruction:
  • the inner loop for2 loop ends, and at this time, the instruction end signal loop_end2 of the inner loop for2 is pulled high.
  • the sixth instruction is executed and the inner loop for2 ends, so the instruction end signal loop_end2 of the inner loop for2 is pulled high; the outer loop for1 also ends, so the instruction end signal loop_end1 of the outer loop for1 is pulled high.
  • the above step of detecting that this layer's loop instruction has executed to the last instruction of its last loop pass specifically includes the following steps:
  • the execution process of the loop body of the outer loop for1 and the loop body of the inner loop for2 is taken as an example for illustration.
  • the multi-level loop instruction can be the following for loop instruction:
  • the loop instruction executed at the current moment is the embedded loop instruction (for2) of the outer loop for1. If the embedded loop instruction of the outer loop for1 (that is, the inner loop for2) has finished executing, it is determined that this layer's loop instruction has executed to its last instruction. At this time, the instruction indication signal loop_last_ins, which indicates that this layer's loop instruction has reached its last instruction, is pulled high.
  • the above-mentioned instruction indication signal loop_last_ins may be maintained in the fifth status register, and a fifth status register may be maintained for each layer of loop instructions.
  • the above-described "the execution of the embedded loop instruction of this layer's loop instruction is completed" can be understood as the completion indication signal loop_lower_done of this layer's loop instruction being pulled high, wherein the completion indication signal loop_lower_done can be stored in the third status register corresponding to this layer's loop instruction.
  • 1, 2 and 3 are the first loop of the outer loop for1
  • 4, 5 and 6 are the second loop of the outer loop for1.
  • the instruction indication signal loop_last_ins of the outer loop for1 is pulled high, and since this is not the last pass of the outer loop for1, the instruction end signal loop_end of the outer loop for1 is pulled low.
  • the second loop pass is the last loop pass of this layer's loop instruction, so it can be determined that the outer loop for1 has executed to the last instruction of its last loop pass. In this case, the instruction indication signal loop_last_ins of the outer loop for1 is pulled high, and since this is the last pass of the outer loop for1, the instruction end signal loop_end of the outer loop for1 is pulled high.
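  • The detection of the last instruction can be sketched from the three conditions in the text: the next instruction position exceeds a target value (the start pointer plus the number of instructions in the loop body), the embedded loop has finished, and the loop is in its last pass. The function name and parameters are hypothetical, and the strict "greater than" comparison is taken literally from the wording; whether it is strict in practice depends on an addressing convention that the text does not spell out.

```python
def last_instruction_signals(next_pc, start_pc, body_len, inner_done, is_last_pass):
    """Return (loop_last_ins, loop_end) as 0/1 levels for one loop layer.

    inner_done should be True for a loop that has no embedded loop.
    """
    target = start_pc + body_len      # target value from the description
    past_body = next_pc > target      # strict comparison, as worded in the text
    loop_last_ins = 1 if (past_body and inner_done) else 0
    loop_end = 1 if (loop_last_ins and is_last_pass) else 0
    return loop_last_ins, loop_end
```

  • In this model loop_last_ins goes high at the end of every pass, while loop_end goes high only at the end of the last pass, which mirrors the behaviour described for the first and second passes of the outer loop for1.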
  • the execution logic of the loop body in each layer of loop instructions can be controlled through the instruction end signal of each layer of loop instructions.
  • the execution steps of the for loop can be omitted, which simplifies the instruction cycle and thereby improves the execution efficiency of the processor.
  • the method further includes the following steps:
  • S21 Determine a first loop instruction among each layer of loop instructions in the multi-layer loop instructions; wherein the first loop instruction is a loop instruction to which an identifier is yet to be allocated;
  • S22 Determine, among preset loop identifiers, an idle loop identifier in an idle state;
  • S23 Determine the loop identifier of the first loop instruction based on the idle loop identifier; wherein the loop identifier of the first loop instruction is used to indicate the loop layer number of the first loop instruction;
  • S24 Store the loop parameters of the first loop instruction in the parameter register based on the loop identifier of the first loop instruction.
  • a plurality of loop identifiers loop_id may be preset, and a loop identifier loop_id is then dynamically assigned to each layer of loop instructions in the multi-layer loop instructions.
  • the idle loop identifiers in the idle state can be determined among the preset loop identifiers, and the loop identifier for the first loop instruction can be determined among the idle loop identifiers.
  • the loop identifier of the first loop instruction can be determined as the level number information of the first loop instruction (that is, the above-mentioned loop level number).
  • the loop parameter of the first loop instruction may be stored in a parameter register based on the layer number information.
  • at least one parameter register can be determined for the first loop instruction among a plurality of registers of the processor, and a corresponding index can be set for the at least one parameter register; the index is the layer number information of the first loop instruction and indicates that the at least one parameter register stores the loop parameters of the loop instruction of that layer number.
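  • The allocation of a loop identifier and the indexing of the parameter register by that identifier can be sketched as below. The class name, the choice of taking the first idle identifier, and the parameter field names (start_pc, body_len, end_param) are assumptions made for illustration.

```python
class LoopIdAllocator:
    """Hands out preset loop identifiers and files loop parameters under them."""

    def __init__(self, preset_ids):
        self.idle_ids = list(preset_ids)  # identifiers currently in the idle state
        self.param_regs = {}              # parameter registers, indexed by loop_id

    def allocate(self, start_pc, body_len, end_param):
        """Assign an idle identifier to a loop and store its loop parameters.

        Returns the identifier, or None when no idle identifier is left.
        """
        if not self.idle_ids:
            return None
        loop_id = self.idle_ids.pop(0)  # assumption: take the first idle identifier
        self.param_regs[loop_id] = {
            "start_pc": start_pc,
            "body_len": body_len,
            "end_param": end_param,
        }
        return loop_id
```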
  • the method further includes the following steps:
  • a second loop instruction is detected among the multi-layer loop instructions; wherein the second loop instruction is a loop instruction whose loop has ended;
  • the second loop instruction can be detected in the multi-layer loop instruction, where the second loop instruction is the loop instruction at the end of the loop, and the specific detection
  • the instruction working state loop_en of each layer of loop instructions can be obtained in the sixth state register of the state register.
  • a sixth status register may be configured for each layer of loop instructions in the status register, and the sixth status register is used to maintain the instruction working state loop_en of each layer of loop instructions.
  • the loop identifier loop_id allocated for the outer loop for1 can be released, and at this time, the outer loop for1 is the above-mentioned second loop instruction.
  • the loop identifier loop_id allocated for the inner loop for2 can be released; at this time, the inner loop for2 is the above-mentioned second loop instruction.
  • the loop instruction whose instruction execution is completed may be determined in the multi-layered loop instructions in the manner described above, and the second loop instruction may be determined based on the instruction execution completed loop instruction.
  • after the execution of the second loop instruction is completed, the loop identifier loop_id of the second loop instruction is released; at this time, the loop identifier loop_id may be reset.
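  • Finding finished loops from the instruction working state and releasing their identifiers can be sketched as follows. The function name is hypothetical, and the sketch assumes that a low loop_en on an identifier that is currently allocated means the corresponding loop has finished executing.

```python
def reclaim_loop_ids(allocated_ids, loop_en):
    """Split allocated identifiers into released and still-active ones.

    loop_en maps loop_id to the 0/1 level read from the sixth status register.
    """
    released = [i for i in allocated_ids if not loop_en.get(i, 0)]
    still_active = [i for i in allocated_ids if loop_en.get(i, 0)]
    return released, still_active
```

  • A released identifier can then be handed to the next loop instruction that is waiting for one.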
  • the value of loop_id can be set according to the loop layer number of the first loop instruction. For example, if the first loop instruction is the upper-layer loop instruction of the second loop instruction, and the loop layer number of the second loop instruction is N, then loop_id can be set to LOOP_N-1; when the first loop instruction arrives, loop_id-1.
  • the execution of the loop body in the loop instructions of each layer can be controlled based on the execution logic, which specifically includes the following steps:
  • Register R11: real-time loop count not updated; register R12: real-time loop count not updated;
  • Register R21: loop jump signal loop_jump1 pulled low; register R22: loop jump signal loop_jump2 pulled low;
  • Register R41: instruction end signal loop_end1 pulled low; register R42: instruction end signal loop_end2 pulled low;
  • Register R51: instruction indication signal loop_last_ins1 pulled low; register R52: instruction indication signal loop_last_ins2 pulled low;
  • Register R61: instruction working state loop_en1 pulled high; register R62: instruction working state loop_en2 pulled low.
  • Register R11: real-time loop count not updated; register R12: real-time loop count + 1;
  • Register R21: loop jump signal loop_jump1 pulled low; register R22: loop jump signal loop_jump2 pulled high;
  • Register R41: instruction end signal loop_end1 pulled low; register R42: instruction end signal loop_end2 pulled low;
  • Register R51: instruction indication signal loop_last_ins1 pulled low; register R52: instruction indication signal loop_last_ins2 pulled high;
  • Register R61: instruction working state loop_en1 pulled high; register R62: instruction working state loop_en2 pulled high.
  • Register R11: real-time loop count + 1; register R12: real-time loop count not updated;
  • Register R21: loop jump signal loop_jump1 pulled high; register R22: loop jump signal loop_jump2 pulled low;
  • Register R41: instruction end signal loop_end1 pulled low; register R42: instruction end signal loop_end2 pulled high;
  • Register R51: instruction indication signal loop_last_ins1 pulled high; register R52: instruction indication signal loop_last_ins2 pulled high;
  • Register R61: instruction working state loop_en1 pulled high; register R62: instruction working state loop_en2 pulled high.
  • Register R11: real-time loop count not updated; register R12: real-time loop count not updated;
  • Register R21: loop jump signal loop_jump1 pulled low; register R22: loop jump signal loop_jump2 pulled low;
  • Register R41: instruction end signal loop_end1 pulled low; register R42: instruction end signal loop_end2 pulled low;
  • Register R51: instruction indication signal loop_last_ins1 pulled low; register R52: instruction indication signal loop_last_ins2 pulled low;
  • Register R61: instruction working state loop_en1 pulled high; register R62: instruction working state loop_en2 pulled low.
  • Register R11: real-time loop count not updated; register R12: real-time loop count + 1;
  • Register R21: loop jump signal loop_jump1 pulled low; register R22: loop jump signal loop_jump2 pulled high;
  • Register R41: instruction end signal loop_end1 pulled low; register R42: instruction end signal loop_end2 pulled low;
  • Register R51: instruction indication signal loop_last_ins1 pulled low; register R52: instruction indication signal loop_last_ins2 pulled high;
  • Register R61: instruction working state loop_en1 pulled high; register R62: instruction working state loop_en2 pulled high.
  • Register R11: real-time loop count not updated; register R12: real-time loop count not updated;
  • Register R21: loop jump signal loop_jump1 pulled low; register R22: loop jump signal loop_jump2 pulled low;
  • Register R41: instruction end signal loop_end1 pulled low; register R42: instruction end signal loop_end2 pulled high;
  • Register R51: instruction indication signal loop_last_ins1 pulled high; register R52: instruction indication signal loop_last_ins2 pulled high;
  • Register R61: instruction working state loop_en1 pulled high; register R62: instruction working state loop_en2 pulled high.
  • the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • an embodiment of the present disclosure also provides a loop instruction processing device corresponding to the loop instruction processing method. Since the principle by which the device solves the problem is similar to that of the loop instruction processing method described above, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.
  • FIG. 5 is a schematic diagram of a loop instruction processing device provided by an embodiment of the present disclosure
  • the device includes: a controller 10 , a parameter register 20 and an arithmetic unit 30 .
  • the controller 10 is used to: obtain multi-layer loop instructions, wherein the multi-layer loop instructions include loop instructions nested in multiple layers; obtain, from the parameter register 20, the loop parameters of each layer of loop instructions in the multi-layer loop instructions; and, during execution of each layer of loop instructions, determine the execution logic of the loop body in that layer's loop instruction based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information is used to indicate the real-time execution status of that layer's loop instruction.
  • the arithmetic unit 30 is used to control and execute the loop body in the loop instruction of this layer based on the execution logic.
  • the loop parameter of each layer of loop instruction in the multi-layer loop instruction can be obtained in the parameter register of the processor, and during the execution of each layer of loop instruction, Determine the execution logic of the loop body in the layer loop instruction based on the state information of the layer loop instruction and the loop parameters of the layer loop instruction; wherein, the state information is used to indicate the real-time execution status of the layer loop instruction; and then control the execution based on the execution logic The loop body in the loop instruction of this layer.
  • the instruction function of the multi-layer loop instructions can be realized by executing the loop body of each layer of loop instructions, so that execution of the for statement can be omitted and the instruction cycle simplified, thereby improving the execution efficiency of the processor.
  • the controller 10 is further configured to: obtain the real-time cycle number of the loop instruction of the layer; compare the real-time cycle number with the cycle end parameter of the loop instruction of the layer to obtain a comparison result; The execution logic of the loop body in the loop instruction of this layer is determined according to the comparison result.
  • the loop instruction processing device is further configured to: detect, among the multi-layer loop instructions, a target-layer loop instruction that satisfies an update condition, wherein the update condition is the condition for updating the real-time loop count of the target-layer loop instruction; and update the real-time loop count of the target-layer loop instruction.
  • the loop instruction processing device is further configured to: for each layer of loop instructions except the innermost loop instruction in the multi-layer loop instructions, obtain the loop jump signal of the layer loop instruction ; Wherein, the loop jump signal is used to indicate whether to jump to execute the instruction indicated by the initial PC pointer of the loop instruction of this layer; obtain the execution information of the next layer of loop instruction nested in the loop instruction of this layer; If the loop jump signal of the loop instruction at this layer is a jumpback signal and the execution information of the loop instruction at the next layer is execution completed, it is determined that the loop instruction at this layer is the target layer loop instruction satisfying the update condition.
  • the loop instruction processing device is further configured to: when it is detected that the PC pointer points to the last instruction of the loop instruction of the layer, and it is detected that the loop instruction of the layer has not performed the last loop calculation, It is determined that the loop jump signal of the loop instruction of this layer is the jump back signal.
  • the controller 10 is further configured to: detect the instruction end signal of this layer's loop instruction; and determine the execution logic of the loop body in this layer's loop instruction based on the detected instruction end signal and the loop parameters of this layer's loop instruction.
  • the controller 10 is further configured to: when it is detected that this layer's loop instruction has executed to the last instruction of its last loop pass, determine that the instruction end signal of this layer's loop instruction is detected.
  • the controller 10 is further configured to: obtain the next instruction cycle of the loop instruction pointed to by the PC pointer at the current moment, to obtain a first instruction cycle; and, when it is determined that the first instruction cycle is greater than a target value, that the embedded loop instruction of this layer's loop instruction pointed to by the PC pointer has finished executing, and that this layer's loop instruction is executing its last loop pass, determine that execution has reached the last instruction of the last loop pass of this layer's loop instruction;
  • the target value is the sum of the start pointer of the layer loop instruction and the number of instructions contained in the loop body of the layer loop instruction.
  • the loop instruction processing device is further configured to: after acquiring the multi-layer loop instructions, determine a first loop instruction among each layer of loop instructions in the multi-layer loop instructions, wherein the first loop instruction is a loop instruction to which an identifier is yet to be allocated; determine, among preset loop identifiers, an idle loop identifier in an idle state; determine the loop identifier of the first loop instruction based on the idle loop identifier, wherein the loop identifier of the first loop instruction is used to indicate the loop layer number of the first loop instruction; and store the loop parameters of the first loop instruction in the parameter register based on the loop identifier of the first loop instruction.
  • the loop instruction processing device is further configured to: in the case that the preset loop identifiers do not include an idle loop identifier, detect a second loop instruction among the multi-layer loop instructions, wherein the second loop instruction is a loop instruction whose loop has ended; and determine the loop identifier of the first loop instruction based on the loop identifier of the second loop instruction.
  • the loop instruction processing device is further configured to: obtain the instruction working state of each layer of loop instructions in the multi-layer loop instructions; determine, based on the instruction working state, a loop instruction whose execution has completed among the multi-layer loop instructions; and determine the second loop instruction based on the loop instruction whose execution has completed.
  • the arithmetic unit 30 is further configured to: determine the innermost loop instruction among the multi-layer loop instructions as the instruction to be jumped to; and jump to the start PC pointer of the instruction to be jumped to, so as to execute that instruction.
  • an embodiment of the present disclosure further provides an electronic device 600 .
  • FIG. 6 is a schematic structural diagram of an electronic device 600 provided by an embodiment of the present disclosure.
  • the electronic device 600 includes: a processor 61 , a memory 62 and a bus 63 .
  • the memory 62 is used to store execution instructions and includes an internal memory 621 and an external memory 622; the internal memory 621 is used to temporarily store calculation data of the processor 61 and data exchanged with the external memory 622, such as a hard disk.
  • the processor 61 exchanges data with the external memory 622 through the internal memory 621.
  • the processor 61 communicates with the memory 62 through the bus 63, so that the processor 61 executes the following instructions: obtaining multi-layer loop instructions, wherein the multi-layer loop instructions include loop instructions nested in multiple layers; obtaining, from the parameter register of the processor, the loop parameters of each layer of loop instructions in the multi-layer loop instructions; during execution of each layer of loop instructions, determining the execution logic of the loop body in that layer's loop instruction based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information is used to indicate the real-time execution state of that layer's loop instruction; and controlling execution of the loop body in that layer's loop instruction based on the execution logic.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored. When the computer program is run by a processor, the steps of the loop instruction processing method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • Embodiments of the present disclosure also provide a computer program product, the computer program product carries a program code, and the instructions included in the program code can be used to execute the steps of the loop instruction processing method described in the above method embodiment, for details, please refer to the above The method embodiment will not be repeated here.
  • Embodiments of the present disclosure further provide a chip, including the instruction processing device described in any one of the above embodiments. For details, reference may be made to the above device embodiments, which will not be repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • in one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause an electronic device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage medium includes various media that can store program codes such as U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

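The present disclosure provides a loop instruction processing method, apparatus, chip, electronic device, and storage medium. The loop instruction processing method includes: acquiring multi-layer loop instructions, wherein the multi-layer loop instructions comprise loop instructions nested in multiple layers; acquiring, from a parameter register of the processor, the loop parameters of each layer of loop instructions in the multi-layer loop instructions; during execution of each layer of loop instructions, determining the execution logic of the loop body in that layer's loop instruction based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information is used to indicate the real-time execution state of that layer's loop instruction; and controlling execution of the loop body in that layer's loop instruction based on the execution logic.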

Description

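Loop instruction processing method, apparatus, chip, electronic device, and storage medium
Cross-reference statement
This application claims priority to Chinese patent application No. 202210113076.X, filed with the China Patent Office on January 29, 2022, the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of computer technology, and in particular to a loop instruction processing method, apparatus, chip, electronic device, and computer-readable storage medium.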
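Background
In existing computer languages, loop statements are frequently used program constructs. For example, for a for loop statement (for i=0;i<N;i++), the compiler may compile the statement into a data operation branch and a condition judgment branch.
During execution of an existing for loop statement, the execution is usually controlled by the condition judgment branch in the compilation result. When a for loop statement contains multiple layers of nested inner loops, every statement in the for loop is compiled into a corresponding instruction. When the structure of the for loop statement is complex, it is compiled into even more complex instructions, which reduces the processing efficiency of the processor.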
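Summary
Embodiments of the present disclosure provide at least a loop instruction processing method, apparatus, chip, electronic device, and computer-readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a loop instruction processing method, applied to a processor, including: acquiring multi-layer loop instructions, wherein the multi-layer loop instructions comprise loop instructions nested in multiple layers; acquiring, from a parameter register of the processor, the loop parameters of each layer of loop instructions in the multi-layer loop instructions; during execution of each layer of loop instructions, determining the execution logic of the loop body in the loop instructions of each layer based on the state information of that layer's loop instruction and the loop parameters of that layer's loop instruction, wherein the state information is used to indicate the real-time execution state of that layer's loop instruction; and controlling execution of the loop body in that layer's loop instruction based on the execution logic.
In the embodiments of the present disclosure, the execution logic of the loop body in each layer of loop instructions is controlled by maintaining loop parameters and state information, so that the instruction function of the multi-layer loop instructions is realized by executing the loop body of each layer of loop instructions. Execution of the for statement can therefore be omitted and the instruction cycle simplified, which improves the execution efficiency of the processor and achieves efficient execution by the processor.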
在一种可选的实施方式中,所述基于该层循环指令的状态信息和该层循环指令的循环参数,确定该层循环指令中循环体的执行逻辑,包括:获取该层循环指令的实时循环次数;将所述实时循环次数与该层循环指令的循环结束参数进行比对,得到比对结果;根据所述比对结果确定该层循环指令中循环体的执行逻辑。
通过上述描述可知,在本公开实施例中,通过每层循环指令的实时循环次数,可以控制每层循环指令中循环体的跳转逻辑,通过该处理方式,可以省略for循环的执行步骤,从而可以简化指令周期,进而提高了处理器的执行效率,实现了处理器的高效执行。
在一种可选的实施方式中,所述方法还包括:检测所述多层循环指令中满足更新条件的目标层循环指令;其中,所述更新条件为更新所述目标层循环指令的实时循环次数的条件;更新所述目标层循环指令的实时循环次数。
在上述实施方式中,通过处理器中的状态寄存器维护每层循环指令的实时循环次数,可以实现通过处理器中的硬件设备控制每层循环指令的循环体的循环次数,从而可以实现每层循环指令的实时执行状态的动态自维护,进而可以实现循环指令的高效实现。
在一种可选的实施方式中,所述检测所述多层循环指令中满足更新条件的目标层循环指令,包括针对所述多层循环指令中除最内层循环指令之外的每层循环指令:获取该层循环指令的循环跳转信号;其中,所述循环跳转信号用于指示是否跳转执行该层循环指令的起始PC指针所指示的指令;获取该层循环指令内嵌套的下一层循环指令的执行信息;在确定该层循环指令的所述循环跳转信号为回跳信号以及所述下一层循环指令的执行信息为执行完成的情况下,确定该层循环指令为所述满足更新条件的目标层循环指令。
在一种可选的实施方式中,所述确定该层循环指令的所述循环跳转信号为回跳信号,包括:在检测到PC指针指向该层循环指令中的最后一个指令且检测到该层循环指令未执行最后一次循环计算的情况下,确定该层循环指令的循环跳转信号为回跳信号。
在上述实施方式中,通过状态寄存器维护循环跳转信号和完成指示信号,可以实现通过处理器中的硬件设备确定实时循环次数的更新条件,从而实现每层循环指令的实时执行状态的动态自维护,进而可以实现循环指令的高效实现。
在一种可选的实施方式中,所述基于该层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑,包括:检测该层循环指令的指令结束信号;基于检测到的所述指令结束信号和该层循环指令的所述循环参数,确定该层循环指令中循环体的执行逻辑。
在一种可选的实施方式中,所述检测该层循环指令的指令结束信号,包括:在检测到该层循环指令执行至该层循环指令的最后一次循环过程的最后一条指令的情况下,确定检测到该层循环指令的指令结束信号。
通过上述描述可知,在本公开实施例中,通过每层循环指令的指令结束信号,可以控制每层循环指令中循环体的执行逻辑,通过该处理方式,可以省略for循环的执行步骤,从而可以简化指令周期,进而提高了处理器的执行效率,实现了处理器的高效执行。
在一种可选的实施方式中,所述检测到该层循环指令执行至该层循环指令的最后一次循环过程的最后一条指令,包括:获取当前时刻PC指针所指向循环指令的下一个指令周期,得到第一指令周期;在确定出所述第一指令周期大于目标数值,且该PC指针所指向该层循环指令的内嵌循环指令执行完成,以及该层循环指令执行最后一次循环过程的情况下,确定执行至该层循环指令的最后一次循环过程的最后一条指令;其中,所述目标数值为该层循环指令的起始指针和该层循环指令的循环体内所包含的指令数量之和。
在一种可选的实施方式中,在所述获取多层循环指令之后,所述方法包括:在所述多层循环指令中的每层循环指令中确定第一循环指令;其中,所述第一循环指令为待分配指令标识的循环指令;在预设循环标识中确定处于空闲状态的空闲循环标识;基于所述空闲循环标识确定所述第一循环指令的循环标识;其中,所述第一循环指令的循环标识用于指示所述第一循环指令的循环层数;基于所述第一循环指令的循环标识在所述参数寄存器中存储所述第一循环指令的循环参数。
在一种可选的实施方式中,所述方法还包括:在所述预设循环标识中不包含所述空闲循环标识的情况下,在所述多层循环指令中检测第二循环指令;其中,所述第二循环指令为循环结束的循环指令;基于所述第二循环指令的循环标识确定所述第一循环指令的循环标识。
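In an optional implementation, detecting the second loop instruction in the multi-level loop instruction includes: acquiring the instruction working state of each level of loop instruction in the multi-level loop instruction; determining, based on the instruction working state, a loop instruction whose execution has ended; and determining the second loop instruction based on that loop instruction.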
在上述实施方式中,通过为多层循环指令中的每层循环指令动态分配循环标识的方式,可以实现每层循环指令的循环层数的自维护,从而可以提高处理器对多层循环指令的兼容性。
在一种可选的实施方式中,所述基于所述执行逻辑控制执行所述各层循环指令中的循环体,包括:在所述执行逻辑为跳转至所述多层循环指令中多个循环指令的起始PC指针所指向指令的情况下,确定所述多个循环指令中位于最内层的循环指令为待跳转指令;跳转至所述待跳转指令的起始PC指针处执行所述待跳转指令。
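In the above implementation, when the loop jump signals of several loop instructions are pulled high at the same time, taking the innermost of those loop instructions as the instruction to jump to ensures that the multi-level loop instruction runs normally and stably.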
第二方面,本公开实施例提供了一种循环指令处理装置,包括:控制器、参数寄存器以及运算器;所述控制器用于获取多层循环指令;其中,所述多层循环指令包含多层嵌套的循环指令;在所述参数寄存器中获取所述多层循环指令中的每层循环指令的循环参数;以及在所述每层循环指令的执行过程中,基于该层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑;其中,所述状态信息用于指示该层循环指令的实时执行状态;运算器用于基于所述执行逻辑控制执行所述各层循环指令中的循环体。
第三方面,本公开实施例还提供一种芯片,其特征在于,包括:如第二方面中任一项所述的指令处理装置。
第四方面,本公开实施例还提供一种电子设备,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,当电子设备运行时,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行上述第一方面中任一项所述的循环指令处理方法的步骤。
第五方面,本公开实施例还提供一种电子设备,包括如第三方面所述的芯片。
第六方面,本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述第一方面中任一项所述的循环指令处理方法的步骤。
为使本公开的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。
附图说明
为了更清楚地说明本公开实施例的技术方案,下面将对实施例中所需要使用的附图作简单地介绍。这些附图示出了符合本公开的实施例,并与说明书一起用于说明本公开的技术方案。应当理解,以下附图仅示出了本公开的某些实施例,因此不应被看作是对范围的限定,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他相关的附图。
图1示出了本公开实施例所提供的一种循环指令处理方法的流程图;
图2示出了本公开实施例所提供的一种循环指令处理方法中基于所述各层循环指令的状态信息和所述各层循环指令的循环参数确定所述各层循环指令中循环体的执行逻辑的具体方法的流程图;
图3示出了本公开实施例所提供的一种循环指令处理方法中检测所述各层循环指令中满足更新条件的循环指令的具体方法的流程图;
图4示出了本公开实施例所提供的一种循环指令处理方法中基于所述各层循环指令的状态信息和所述各层循环指令的循环参数确定所述各层循环指令中循环体的执行逻辑的具体方法的流程图;
图5示出了本公开实施例所提供的一种循环指令处理装置的示意图;
图6示出了本公开实施例所提供的一种电子设备的示意图。
具体实施方式
为使本公开实施例的目的、技术方案和优点更加清楚,下面将结合本公开实施例中附图,对本公开实施例中的技术方案进行清楚、完整地描述。所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。通常在此处附图中描述和示出的本公开实施例的组件可以以各种不同的配置来布置和设计。因此,以下对在附图中提供的本公开的实施例的详细描述无意限制要求保护的本公开的范围。基于本公开的实施例,本领域技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例,都属于本公开保护的范围。
应注意到,相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步定义和解释。
本文中术语“和/或”,仅仅是描述一种关联关系,表示可以存在三种关系,例如,A和/或B,可以表示单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中术语“至少一种”表示多种中的任意一种或多种中的至少两种的任意组合,例如,包括A、B、C中的至少一种,可以表示包括从A、B和C构成的集合中选择的任意一个或多个元素。
经研究发现,人工智能作为新一轮科技革命和产业变革的核心驱动力,正快速催生新产品、新服务、新业态,重塑着经济社会运行模式,改变人类生产和生活方式。神经网络是一种模仿动物神经网络行为特征,进行分布式并行信息处理的算法数学模型,是由大量处理单元互联组成的非线性、自适应信息处理系统。神经网络的研究能促进或者加快人工智能的发展。
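Convolution is one of the most important functions a neural network needs to implement. During convolution, the image matrix and the convolution kernel are traversed and dot products are computed repeatedly. Traversing the image and the kernel uses for loop statements such as for(i=0;i<N;i++), and multi-dimensional images and kernels usually call for nested loop statements. Application scenarios such as edge computing typically place high demands on computation speed, so efficient execution of control information such as loop instructions is essential.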
在现有for循环语句的执行过程中,通常通过编译结果中的条件判断分支来控制for循环语句的执行过程。当一个for循环语句内包含多层嵌套内循环时,该for循环中的每一条语句将被编译成对应的指令。当for循环语句结构复杂时,该for循环语句将被编译成更加复杂的指令,从而影响了处理器的处理效率。
基于上述研究,本公开提供了一种循环指令处理方法、装置、电子设备及计算机可读存储介质。在本公开实施例中,在获取到多层循环指令之后,可以在处理器的参数寄存器中获取多层循环指令中的每层循环指令的循环参数,并在每层循环指令的执行过程中,基于该层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑;其中,状态信息用于指示该层循环指令的实时执行状态;进而基于执行逻辑控制执行该层循环指令中的循环体。在本公开实施例中,通过维护循环参数和状态信息控制执行各层循环指令中循环体的执行逻辑,可以实现通过执行每层循环指令的循环体来实现多层循环指令的指令功能,从而可以省略执行for语句,简化指令周期,进而提高了处理器的执行效率,实现了处理器的高效执行。
为便于对本实施例进行理解,首先对本公开实施例所公开的一种循环指令处理方法进行详细介绍。本公开实施例所提供的循环指令处理方法的执行主体一般为具有一定计算能力的电子设备。
参见图1所示,为本公开实施例提供的一种循环指令处理方法的流程图,所述方法包括步骤S101~S107。
S101:获取多层循环指令;其中,所述多层循环指令包含多层嵌套的循环指令。
在本公开实施例中,多层循环指令可以为包含N层嵌套的循环指令,其中,N为大于1的正整数。例如,多层循环指令可以为2层循环指令、3层循环指令,本公开对多层循环指令的指令层数不作具体限定。
举例来说,多层循环指令可以为以下for循环指令:
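[Figure PCTCN2022120852-appb-000001: code listing of the two-level nested for loop (outer loop for1, inner loop for2); image not reproduced]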
这里,for(i=0;i<3;i++)可以标记为for1,表示为外循环;for(j=0;j<3;j++)可以标记为for2,表示为内循环,即,外循环for1内嵌套的循环指令。
针对外循环for1来说,该外循环for1的循环内容包括“{a=a+1}”、“for(j=0;j<3;j++)”和“{b=b+1}”;针对内循环for2来说,该内循环for2的循环内容包括“{b=b+1}”。
S103:在所述处理器的参数寄存器中获取所述多层循环指令中的每层循环指令的循环参数。
在本公开实施例中,可以预先在处理器中为每层循环指令确定一个或者多个参数寄存器。这里,参数寄存器用于存储每层循环指令的循环参数。
在本公开实施例中,每层循环指令的循环参数可以包含以下参数:循环指令的起始PC指针、循环体的指令数量、循环步长、循环结束参数(例如,循环次数)等信息。
针对每个参数寄存器包含对应的层数信息,该层数信息为该参数寄存器所对应的循环指令在多层循环指令中的循环层数。
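The parameter register contents described above can be pictured as one record per loop level. The following is only a minimal sketch under stated assumptions: the class name `LoopParams`, its field names, the dictionary `param_regs`, and the sample values are all hypothetical and are not taken from the disclosure, which only lists the kinds of parameters stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopParams:
    """Loop parameters for one loop level (field names are hypothetical)."""
    level: int      # level information that indexes the parameter register
    start_pc: int   # start PC pointer of the loop instruction
    body_len: int   # number of instructions in the loop body
    step: int       # loop step
    max_count: int  # loop end parameter, read here as an iteration count


# Hypothetical parameter register file, indexed by loop level.
param_regs = {}


def store_params(params):
    """Store the parsed parameters in the register that matches their level."""
    param_regs[params.level] = params


# Assumed values for a two-level loop whose outer body spans PCs 0..1
# and whose inner body is the single instruction at PC 1.
store_params(LoopParams(level=1, start_pc=0, body_len=2, step=1, max_count=2))
store_params(LoopParams(level=2, start_pc=1, body_len=1, step=1, max_count=2))
```

Indexing the register file by level mirrors the statement that each parameter register carries the level information of the loop it belongs to.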
S105:在所述每层循环指令的执行过程中,基于该层循环指令的状态信息和该层循环指令的循环参数,确定该层循环指令中循环体的执行逻辑;其中,所述状态信息用于指示该层循环指令的实时执行状态。
举例来说,多层循环指令可以为以下for循环指令:
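[Figure PCTCN2022120852-appb-000002: code listing of the two-level nested for loop (outer loop for1, inner loop for2); image not reproduced]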
上述for循环指令中,{a=a+1}为外循环for1中的循环体,{b=b+1}为内循环for2中的循环体。
通过上述描述可知,在本公开实施例中,在通过各层循环指令的循环参数控制多层循环指令的执行过程中,可以执行每层循环指令中的循环体,从而省略执行每层循环指令中的条件判断分支所对应的指令,例如for(i=0;i<3;i++)和for(j=0;j<3;j++)所对应的指令。通过该处理方式,可以简化指令周期,从而实现处理器的高效执行。
这里,每层循环指令中循环体的执行逻辑可以理解为每层循环指令中循环体的跳转逻辑。
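For example, for the for loop instruction above, the execution logic can be understood as the jump logic between the loop bodies {a=a+1} and {b=b+1} of the two loop levels. Under the control of the loop parameters, the function of the multi-level loop instruction is achieved by executing only the loop bodies {a=a+1} and {b=b+1}, so the instructions corresponding to for(i=0;i<3;i++) and for(j=0;j<3;j++) are never executed.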
S107:基于所述执行逻辑控制执行该层循环指令中的循环体。
在本公开实施例中,在获取到多层循环指令之后,可以在处理器的参数寄存器中获取多层循环指令中的每层循环指令的循环参数,并在每层循环指令的执行过程中,基于该层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑;其中,状态信息用于指示该层循环指令的实时执行状态;进而基于执行逻辑控制执行该层循环指令中的循环体。在本公开实施例中,通过维护循环参数和状态信息控制执行每层循环指令中循环体的执行逻辑,可以实现通过执行每层循环指令的循环体来实现多层循环指令的指令功能,从而可以省略执行for语句,简化指令周期,进而提高了处理器的执行效率,实现了处理器的高效执行。
下面将结合具体实施例描述上述循环指令处理方法。
在本公开实施例中,在获取到上述多层循环指令之后,可以解析得到每层循环指令的循环参数,并将解析到的循环参数存储至对应的参数寄存器中。
在执行多层循环指令中的每层循环指令的过程中,可以在处理器的参数寄存器中获取该层循环指令的循环参数;以及在执行多层循环指令中的每层循环指令的过程中,获取处理器中状态寄存器中的状态信息,该状态信息用于指示该层循环指令的实时执行状态。
这里,实时执行状态可以对应多个维度的执行状态,针对每层循环指令的每个维度下的执行状态,可以分别设置不同的状态寄存器。
之后,可以基于每层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑。
在一个可选的实施方式中,如图2所示,上述步骤S105:基于该层循环指令的状态信息和该层循环指令的循环参数,确定该层循环指令中循环体的执行逻辑,具体包括如下步骤:
S201:获取该层循环指令的实时循环次数;
S202:将所述实时循环次数与该层循环指令的循环结束参数进行比对,得到比对结果。
S203:根据所述比对结果确定该层循环指令中循环体的执行逻辑。
在本公开实施例中,在执行每层循环指令时,可以获取该层循环指令的实时循环次数,从而将该实时循环次数和该层循环指令的循环结束参数进行比对,得到比对结果。
在本公开实施例中,可以从第一状态寄存器中获取每层循环指令的实时循环次数。这里,第一状态寄存器为处理器的状态寄存器中用于存储每层循环指令的实时循环次数的寄存器。
这里,循环结束参数可以理解为该层循环指令的最大循环次数。此时,可以将实时循环次数与最大循环次数进行比对,从而得到比对结果。例如,该比对结果可以为实时循环次数等于最大循环次数,或者实时循环次数小于最大循环次数。
在得到比对结果之后,可以根据该比对结果确定该层循环指令中循环体的执行逻辑,例如,回跳继续执行该层循环指令,或者,执行下一层循环指令。
具体实施时,针对多层循环指令的每层循环指令,在每层循环指令执行到该层循环指令的循环体的最后时,可以将该层循环指令的实时循环次数与该层循环指令的最大循环次数进行比对,从而根据比对结果确定是回跳继续执行该层循环指令,还是执行下一层循环指令。
通过上述描述可知,在本公开实施例中,通过每层循环指令的实时循环次数,可以控制每层循环指令中循环体的跳转逻辑,通过该处理方式,可以省略for循环的执行步骤,从而可以简化指令周期,进而提高了处理器的执行效率,实现处理器的高效执行。
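The comparison in steps S201 to S203 can be sketched as a single function. This is an illustrative sketch only: the function name `next_action` and the two string results are assumptions, and the loop end parameter is read as the maximum iteration count, as the text above suggests.

```python
def next_action(live_count, max_count):
    """Compare the real-time loop count with the loop end parameter.

    Returns "jump_back" while iterations remain, otherwise "fall_through".
    live_count is assumed to count the current pass as well.
    """
    if live_count < max_count:
        return "jump_back"     # jump back and run this loop level again
    return "fall_through"      # this level is finished; move on
```

In this reading, the check happens when execution reaches the end of the loop body, so no separate compare-and-branch instruction has to be fetched.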
通过上述描述可知,可以从第一状态寄存器中获取每层循环指令的实时循环次数。针对第一状态寄存器中存储的实时循环次数,需要实时进行更新,具体更新方法描述如下:
(1)检测所述多层循环指令中满足更新条件的目标层循环指令;其中,所述更新条件为更新所述目标层循环指令的实时循环次数的条件;
(2)更新所述目标层循环指令的实时循环次数。
在本公开实施例中,可以在每层循环指令的执行过程中,检测该层循环指令是否满足更新条件。如果检测到满足该更新条件的循环指令,则可以更新该层循环指令的实时循环次数。
具体实施时,可以确定满足更新条件的循环指令的层数信息,进而根据该层数信息在状态寄存器中确定与该层循环指令相匹配的用于存储该层循环指令的实时循环次数的第一状态寄存器,从而对该第一状态寄存器(记为寄存器R1)中存储的实时循环次数进行更新。
在上述实施方式中,通过处理器中的状态寄存器维护每层循环指令的实时循环次数,可以实现通过处理器中的硬件设备控制每层循环指令的循环体的循环次数,并且可以实现每层循环指令的实时执行状态的动态自维护,从而可以实现循环指令的高效实现。
在一个可选的实施方式中,如图3所示,上述步骤:检测所述多层循环指令中满足更新条件的目标层循环指令,具体包括如下步骤:
S301:针对所述多层循环指令中除最内层循环指令之外的每层循环指令,获取该层循环指令的循环跳转信号;其中,所述循环跳转信号用于指示是否跳转执行该层循环指令的起始PC指针所指示的指令;
S302: acquire the execution information of the next-level loop instruction nested in this level of loop instruction;
S303:在确定该层循环指令的所述循环跳转信号为回跳信号以及所述下一层循环指令的执行信息为执行完成的情况下,确定该层循环指令为所述满足更新条件的目标层循环指令。
在本公开实施例中,针对多层循环指令中的每层循环指令,均维护了一个循环跳转信号loop_jump,其中,该循环跳转信号用于指示是否跳转执行该层循环指令的起始PC指针所指示的指令。
具体实施时,可以获取每层循环指令的循环跳转信号,并获取该层循环指令内嵌套的循环指令的执行信息,例如,循环执行完成或者循环执行未完成。
在本公开实施例中,可以在每层循环指令的执行过程中,从状态寄存器的第二状态寄存器(记为寄存器R2)中获取每层循环指令的循环跳转信号loop_jump。这里,第二状态寄存器为处理器的状态寄存器中用于存储每层循环指令的循环跳转信号loop_jump的寄存器。
在每层循环指令的执行过程中,还可以从状态寄存器的第三状态寄存器(记为寄存器R3)中获取该层循环指令内嵌套的循环指令的执行信息。这里,第三状态寄存器(寄存器R3)为处理器的状态寄存器中用于存储每层循环指令的内层循环指令的完成指示信号loop_lower_done的寄存器。其中,该层循环指令内嵌套的循环指令的执行信息可以理解为该层循环指令的内层循环指令的完成指示信号loop_lower_done。
在获取到循环跳转信号和执行信息之后,可以基于该循环跳转信号和执行信息确定该层循环指令是否满足更新条件。
具体实施时,如果该循环跳转信号loop_jump拉高且完成指示信号loop_lower_done拉高,则确定该层循环指令满足更新条件,此时,可以在该层循环指令所对应的第一状态寄存器中更新该层循环指令的实时循环次数,例如,实时循环次数加1。
这里,循环跳转信号loop_jump拉高可以理解为循环跳转信号为回跳信号,表示跳转执行该层循环指令的起始PC指针所指示的指令;完成指示信号loop_lower_done拉高表示该层循环指令的内层循环全部执行完成。
在一个可选的实施方式中,上述步骤S303中:确定该层循环指令的所述循环跳转信号为回跳信号,具体包括:在检测到PC指针指向该层循环指令的最后一个指令且检测到该层循环指令未执行最后一次循环计算的情况下,确定该层循环指令的循环跳转信号为回跳信号。
通过上述描述可知,在本公开实施例中,循环跳转信号loop_jump用于确定在执行到该层循环指令的循环体的最后时是否回跳至该层循环指令的起始PC指针处继续执行该层循环指令。
这里,确定该层循环指令的循环跳转信号为回跳信号(或者循环跳转信号loop_jump拉高)的条件包括条件1和条件2。
条件1:PC指针到达该层循环指令的最后一个指令。
条件2:该层循环指令还未执行最后一次循环计算。
在确定出该层循环指令满足上述条件1和条件2的情况下,确定该层循环指令的循环跳转信号为回跳信号(也即,循环跳转信号loop_jump拉高)。
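Conditions 1 and 2 combine into a simple predicate. The sketch below is a hedged illustration: the function name matches the signal name `loop_jump` from the text, but the parameter names `pc`, `last_pc`, and `on_last_iteration` are hypothetical.

```python
def loop_jump(pc, last_pc, on_last_iteration):
    """Return True (signal pulled high) when the loop should jump back."""
    at_last_instruction = pc == last_pc       # condition 1
    more_iterations = not on_last_iteration   # condition 2
    return at_last_instruction and more_iterations
```

For a loop level that contains nested loops, "the last instruction" is only reached once the nested loops have finished, which the walkthrough that follows makes explicit.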
举例来说,多层循环指令可以为以下for循环指令:
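[Figure PCTCN2022120852-appb-000003: code listing of the two-level nested for loop (outer loop for1, inner loop for2); image not reproduced]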
上述外循环for1的循环体和内循环for2的循环体的执行过程可以描述为:
1:a=a+1;外循环for1的第一圈循环;
2:b=b+1;内循环for2的第一圈循环;
3:b=b+1;内循环for2的第二圈循环;
4:a=a+1;外循环for1的第二圈循环;
5:b=b+1;内循环for2的第一圈循环;
6:b=b+1;内循环for2的第二圈循环。
在执行上述指令1至指令6的过程中,外循环for1和内循环for2的循环跳转信号loop_jump的具体变化过程描述如下,其中,外循环for1的循环跳转信号loop_jump记为loop_jump1,内循环for2的循环跳转信号loop_jump记为loop_jump2。
(1)指令1(a=a+1):loop_jump1拉低,且loop_jump2拉低。
在执行指令1时,指令1不是外循环for1的最后一条指令,因此,loop_jump1拉低;且指令1不是内循环for2的指令,因此,loop_jump2拉低。
此时,可以在外循环for1所对应的寄存器R2中写入处于拉低状态的循环跳转信号loop_jump1,并在内循环for2所对应的寄存器R2中写入处于拉低状态的循环跳转信号loop_jump2。
(2)指令2(b=b+1):loop_jump1拉低,且loop_jump2拉高。
在执行指令2时,内循环for2执行到该层循环的最后一条指令,且还未执行最后一次循环计算,因此,loop_jump2拉高;此时,虽然外循环for1未执行最后一次循环计算,但是指令2不是外循环for1的最后一条指令,因此,loop_jump1拉低。
此时,外循环for1所对应的寄存器R2中写入的循环跳转信号loop_jump1保持不变;将内循环for2所对应的寄存器R2中写入的处于拉低状态的循环跳转信号loop_jump2修改为处于拉高状态的循环跳转信号loop_jump2。
(3) Instruction 3 (b=b+1): loop_jump1 is pulled high and loop_jump2 is pulled low.
在执行指令3时,内循环for2执行到该层循环的最后一条指令,且为最后一次循环计算,因此,loop_jump2拉低;此时,外循环for1还未执行最后一次循环计算,且指令3是外循环for1的最后一条指令,因此,loop_jump1拉高。
在该指令周期中,将外循环for1所对应的寄存器R2中写入的处于拉低状态的循环跳转信号loop_jump1修改为处于拉高状态的循环跳转信号loop_jump1;将内循环for2所对应的寄存器R2中写入的处于拉高状态的循环跳转信号loop_jump2修改为处于拉低状态的循环跳转信号loop_jump2。
(4)指令4(a=a+1):loop_jump1拉低,且loop_jump2拉低。
在执行指令4时,指令4不是外循环for1的最后一条指令,但是为外循环for1的最后一次循环计算,因此,loop_jump1拉低;且指令4不是内循环for2的指令,因此,loop_jump2拉低。
在该指令周期中,将外循环for1所对应的寄存器R2中写入的处于拉高状态的循环跳转信号loop_jump1修改为处于拉低状态的循环跳转信号loop_jump1;内循环for2所对应的寄存器R2中写入的循环跳转信号loop_jump2保持不变。
(5)指令5(b=b+1):loop_jump1拉低,且loop_jump2拉高。
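When instruction 5 is executed, instruction 5 belongs to the last iteration of outer loop for1, so loop_jump1 is pulled low. At this point inner loop for2 has reached the last instruction of its loop body and has not yet executed its last iteration, so loop_jump2 is pulled high.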
在该指令周期中,外循环for1所对应的寄存器R2中写入的循环跳转信号loop_jump1保持不变;将内循环for2所对应的寄存器R2中写入的处于拉低状态的循环跳转信号loop_jump2修改为处于拉高状态的循环跳转信号loop_jump2。
(6)指令6(b=b+1):loop_jump1拉低,且loop_jump2拉低。
在执行指令6时,指令6为外循环for1的最后一次循环计算中的指令,因此,loop_jump1拉低;此时,内循环for2执行到该层循环的最后一条指令,且为最后一次循环计算,因此,loop_jump2拉低。
在该指令周期中,外循环for1所对应的寄存器R2中写入的循环跳转信号loop_jump1保持不变;将内循环for2所对应的寄存器R2中写入的处于拉高状态的循环跳转信号loop_jump2修改为处于拉低状态的循环跳转信号loop_jump2。
通过上述描述可知,当指令执行至3:b=b+1时,则确定该多层循环指令所对应计算机程序的PC指针当前时刻指向外循环for1的最后一条指令,且检测到外循环for1还未执行最后一轮循环,且外循环for1的第一轮循环中内嵌循环指令执行结束。此时,可以确定满足实时循环次数的更新条件,执行实时循环次数加1的操作。在执行后续指令之前,可以读取外循环for1的第一状态寄存器中所存储的实时循环次数(例如,1),并根据该实时循环次数确定该外循环for1还未执行最后一次循环计算;且确定出外循环for1的循环跳转信号为回跳信号(即,循环跳转信号loop_jump拉高)。此时,表明需要回跳至该外循环for1的起始PC指针所指示的指令处继续执行该外循环for1,即,执行上述4、5和6所描述的指令。
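The six-step walkthrough of loop_jump1 and loop_jump2 can be reproduced with a short simulation. This is a sketch under assumptions, not the hardware logic itself: the function name `trace_loop_jump` is hypothetical, both loops are assumed to run two iterations, and Python loops are used only to enumerate the dynamic instruction sequence.

```python
def trace_loop_jump(outer_max=2, inner_max=2):
    """Return rows of (step, statement, loop_jump1, loop_jump2)."""
    rows = []
    step = 0
    for outer_iter in range(1, outer_max + 1):
        step += 1
        # a=a+1 is not the last instruction of either loop: both signals low.
        rows.append((step, "a=a+1", False, False))
        for inner_iter in range(1, inner_max + 1):
            step += 1
            # Inner loop jumps back while it still has iterations left.
            jump2 = inner_iter < inner_max
            # Outer loop jumps back only after the inner loop is done
            # and only if the outer loop is not on its last iteration.
            jump1 = (not jump2) and outer_iter < outer_max
            rows.append((step, "b=b+1", jump1, jump2))
    return rows


for row in trace_loop_jump():
    print(row)
```

Only step 3 raises loop_jump1, and steps 2 and 5 raise loop_jump2, which agrees with items (1) to (6) above.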
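In an optional implementation, in step S303 above, determining that the execution information of the next-level loop instruction nested in this level is "execution completed" specifically means that the loop jump signals loop_jump of all inner loop instructions nested in this level indicate no jump back, that is, they are all pulled low, expressed as loop_jump[loop_id-N:0]=0, where loop_id-N:0 denotes that the loop_jump signals of the inner loops at levels N down to 0 within this level are all pulled low.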
在本公开实施例中,在确定出该层循环指令内嵌套的所有内循环指令的循环跳转信号loop_jump均拉低的情况下,可以确定该层循环指令的完成指示信号loop_lower_done拉高,即,该层循环指令内嵌套的下一层循环指令的执行信息为执行完成。
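The relation between loop_jump, loop_lower_done, and the counter update condition can be sketched as follows. The list layout is an assumption made for illustration: index 0 is the outermost loop and deeper levels follow, which differs from the bit-slice notation loop_jump[loop_id-N:0] used in the text. The function `meets_update_condition` is a hypothetical name.

```python
def loop_lower_done(loop_jump, level):
    """High when no loop nested inside `level` is requesting a jump back.

    loop_jump is a list of booleans, outermost loop first (assumed layout).
    For the innermost level the slice is empty, so the result is True.
    """
    return not any(loop_jump[level + 1:])


def meets_update_condition(loop_jump, level):
    """The real-time loop count of `level` is updated when its own
    loop_jump is high and all nested loops have completed."""
    return loop_jump[level] and loop_lower_done(loop_jump, level)
```

With two levels, `meets_update_condition([True, False], 0)` corresponds to instruction 3 in the example, where the outer counter is incremented.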
以上述指令1至指令6为例来进行说明,假设该层循环指令为外循环for1,该层循环指令内嵌套的所有内循环指令为内循环for2。针对外循环for1,在执行指令3和指令6时,该层循环指令内嵌套的所有内循环指令的循环跳转信号loop_jump均拉低。
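Combining the loop jump signal of outer loop for1, the loop jump signal of inner loop for2, and the execution information of inner loop for2 in each instruction cycle gives the following. In the instruction cycles of instruction 2 and instruction 5, the loop jump signal of inner loop for2 is pulled high and inner loop for2 contains no nested loop instruction, so the real-time loop count in the first state register of inner loop for2 is incremented by 1. In the instruction cycle of instruction 3, the loop jump signal of outer loop for1 is pulled high and the loop instruction nested in outer loop for1 (that is, inner loop for2) has completed (the loop jump signal of inner loop for2 is pulled low), so the real-time loop count in the first state register of outer loop for1 is incremented by 1.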
下面将继续以上述指令1至指令6为例,对第一状态寄存器至第三状态寄存器中所存储数据的变化过程进行描述。假设第一状态寄存器包括寄存器R11和寄存器R12,第二状态寄存器包括寄存器R21和寄存器R22,第三状态寄存器包括寄存器R31。
其中,寄存器R11用于存储外循环for1的实时循环次数,寄存器R21用于存储外循环for1的循环跳转信号loop_jump1,寄存器R31用于存储外循环for1的完成指示信号loop_lower_done1。寄存器R12用于存储内循环for2的实时循环次数,寄存器R22用于存储内循环for2的循环跳转信号loop_jump2。由于内循环for2无内层循环,因此,未设置内循环for2的第三状态寄存器。
(1)指令1(a=a+1):loop_jump1拉低,且loop_jump2拉低。
寄存器R11:不更新;寄存器R12:不更新;
寄存器R21:写入loop_jump1拉低;寄存器R22:写入loop_jump2拉低;
寄存器R31:写入loop_lower_done1拉低。
(2)指令2(b=b+1):loop_jump1拉低,且loop_jump2拉高。
寄存器R11:不更新;寄存器R12:实时循环次数+1;
寄存器R21:不更新;寄存器R22:更新为loop_jump2拉高;
寄存器R31:不更新。
(3)指令3(b=b+1):loop_jump1拉高,且loop_jump2拉低。
寄存器R11:实时循环次数+1;寄存器R12:不更新;
寄存器R21:更新为loop_jump1拉高;寄存器R22:更新为loop_jump2拉低;
寄存器R31:更新为loop_lower_done1拉高。
(4)指令4(a=a+1):loop_jump1拉低,且loop_jump2拉低。
寄存器R11:不更新;寄存器R12:不更新;
寄存器R21:更新为loop_jump1拉低;寄存器R22:不更新;
寄存器R31:更新为loop_lower_done1拉低。
(5)指令5(b=b+1):loop_jump1拉低,且loop_jump2拉高。
寄存器R11:不更新;寄存器R12:实时循环次数+1;
寄存器R21:不更新;寄存器R22:更新为loop_jump2拉高;
寄存器R31:不更新。
(6)指令6(b=b+1):loop_jump1拉低,且loop_jump2拉低。
寄存器R11:不更新;寄存器R12:不更新;
寄存器R21:不更新;寄存器R22:更新为loop_jump2拉低;
寄存器R31:更新为loop_lower_done1拉高。
在本公开实施例中,在执行每层循环指令之前,可以从与该层循环指令所对应的第一状态寄存器中读取对应的实时循环次数,从而将该实时循环次数与循环结束参数进行比对,并根据比对结果确定是否继续执行该层循环指令。
在上述实施方式中,通过状态寄存器维护循环跳转信号和完成指示信号,可以实现通过处理器中的硬件设备确定实时循环次数的更新条件,从而实现每层循环指令的实时执行状态的动态自维护,进而可以实现循环指令的高效实现。
在一个可选的实施方式中,如图4所示,上述步骤S105:基于该层循环指令的状态信息和该层循环指令的循环参数,确定该层循环指令中循环体的执行逻辑,具体包括如下步骤:
S401:检测该层循环指令的指令结束信号;
S402:基于检测到的所述指令结束信号和该层循环指令的所述循环参数,确定该层循环指令中循环体的执行逻辑。
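In the embodiments of the present disclosure, besides using the real-time loop count described above, the execution logic of the loop body in a loop level can also be determined from the instruction end signal loop_end of that level.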
在图2所描述实施例的基础上,上述过程还可以描述为以下过程。
在本公开实施例中,在执行每层循环指令之前,可以从该层循环指令所对应的第四状态寄存器(记为寄存器R4)中读取该层循环指令的指令结束信号loop_end。如果检测到该指令结束信号loop_end拉高,则确定该层循环指令执行结束;如果检测到该指令结束信号loop_end拉低,且检测到该层循环指令的实时循环次数小于循环结束参数,则继续执行该层循环指令。
在本公开实施例中,在检测到该层循环指令执行至该层循环指令的最后一次循环过程的最后一条指令的情况下,确定检测到该层循环指令的指令结束信号。
这里,针对每层循环指令均维护了一个指令结束信号loop_end,该指令结束信号loop_end可以存储在状态寄存器的第四状态寄存器中。针对每层循环指令,均可以对应分配一个第四状态寄存器,该第四状态寄存器用于存储每层循环指令的指令结束信号loop_end。
举例来说,多层循环指令可以为以下for循环指令:
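[Figure PCTCN2022120852-appb-000004: code listing of the two-level nested for loop (outer loop for1, inner loop for2); image not reproduced]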
上述外循环for1的循环体和内循环for2的循环体的执行过程可以描述为:
1:a=a+1;外循环for1的第一圈循环;
2:b=b+1;内循环for2的第一圈循环;
3:b=b+1;内循环for2的第二圈循环;
4:a=a+1;外循环for1的第二圈循环;
5:b=b+1;内循环for2的第一圈循环;
6:b=b+1;内循环for2的第二圈循环。
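When execution reaches the 3rd instruction, inner loop for2 ends, and its instruction end signal loop_end2 is pulled high. When execution reaches the 6th instruction, inner loop for2 ends again, and loop_end2 is pulled high; outer loop for1 also ends at this point, so its instruction end signal loop_end1 is pulled high.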
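In an optional implementation, the above step of detecting that this level of loop instruction has executed to the last instruction of its last iteration specifically includes the following steps:
S11: acquire the next instruction cycle after the loop instruction of this level that the PC pointer currently points to, obtaining a first instruction cycle;
S12: when the first instruction cycle is greater than a target value, the loop instruction nested in the loop instruction the PC pointer points to has completed, and this level of loop instruction is executing its last iteration, determine that execution has reached the last instruction of the last iteration of this level; the target value is the sum of the start pointer of this level of loop instruction and the number of instructions contained in its loop body.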
在本公开实施例中,以上述外循环for1的循环体和内循环for2的循环体的执行过程为例来进行说明。
举例来说,多层循环指令可以为以下for循环指令:
(1)for(i=1;i<3;i++)
(2)       {a=a+1};
(3)       for(j=1;j<3;j++)
(4)       {b=b+1};
(5)c=c+1。
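Assume that the loop instruction the PC pointer currently points to is (4) "{b=b+1}". The next instruction cycle after it is then (5) "c=c+1". For loop instruction (4) "{b=b+1}", the start pointer (that is, the start PC pointer) points to (3) "for(j=1;j<3;j++)", and the loop body of (4) "b=b+1" contains 1 instruction. The sum of the start pointer and the instruction count is therefore 3+1=4, and the first instruction cycle, 5, is greater than 4.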
通过上述计算过程可以确定当前时刻所执行的循环指令为外循环for1的内嵌循环指令(for2)。如果外循环for1的内嵌循环指令(即,内循环for2)执行完成,则确定检测到该层循环指令执行至该层循环指令的最后一条指令。此时,该层循环指令到达最后一条指令的指令指示信号loop_last_ins拉高。
在本公开实施例中,可以在第五状态寄存器中维护上述指令指示信号loop_last_ins,针对每层循环指令,均可以维护一个第五状态寄存器。
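In the embodiments of the present disclosure, the phrase "the loop instruction nested in this level has completed" used above can be understood as the completion indication signal loop_lower_done of this level being pulled high; loop_lower_done can be stored in the third state register corresponding to this level of loop instruction.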
这里,当该层循环指令内嵌套的所有内循环指令的循环跳转信号loop_jump均拉低,则确定该层循环指令的完成指示信号loop_lower_done拉高。
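The last-instruction test of steps S11 and S12 and its link to loop_end can be written as two small predicates. This is a sketch only: the function names reuse the signal names from the text, while the parameter names and the use of plain integers for PC values are assumptions. The numbers in the usage lines come from the worked example above (start pointer 3, one body instruction, next cycle 5).

```python
def loop_last_ins(next_pc, start_pc, body_len, lower_done):
    """High when the next instruction lies beyond the loop body and
    every nested loop has completed."""
    target = start_pc + body_len   # target value from step S12
    return next_pc > target and lower_done


def loop_end(last_ins, on_last_iteration):
    """High only at the last instruction of the last iteration."""
    return last_ins and on_last_iteration


# Worked example: 5 > 3 + 1, nested loop finished.
print(loop_last_ins(next_pc=5, start_pc=3, body_len=1, lower_done=True))
```

Splitting the two signals keeps "end of one iteration" (loop_last_ins) separate from "end of the whole loop" (loop_end), which is the distinction the first-round and second-round cases below rely on.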
下面以上述指令1至指令6为例来进行举例说明:
1:a=a+1;外循环for1的第一圈循环;
2:b=b+1;内循环for2的第一圈循环;
3:b=b+1;内循环for2的第二圈循环;
4:a=a+1;外循环for1的第二圈循环;
5:b=b+1;内循环for2的第一圈循环;
6:b=b+1;内循环for2的第二圈循环。
其中,1、2和3为外循环for1的第一圈循环,4、5和6为外循环for1的第二圈循环。
针对第一圈循环,如果当前时刻PC指针所指向的循环指令为(4)“{b=b+1}”,则表明当前时刻所执行的指令为第一圈循环中的指令2和3。如果第一圈循环执行至指令3,则表明外循环for1内嵌循环指令完成,此时,可以确定出外循环for1执行至第一圈循环的最后一条指令。在此情况下,外循环for1的指令指示信号loop_last_ins拉高,由于当前不是外循环for1的最后一圈,因此,外循环for1的指令结束信号loop_end拉低。
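For the second round, if the loop instruction the PC pointer currently points to is (4) "{b=b+1}", the instructions being executed are instructions 5 and 6 of the second round. Once the second round reaches instruction 6, the loop instruction nested in outer loop for1 has completed, so outer loop for1 has reached the last instruction of its second round. Because the second round is the last iteration of outer loop for1, outer loop for1 has reached the last instruction of its last iteration. In this case the instruction indication signal loop_last_ins of outer loop for1 is pulled high, and because this is the last round of outer loop for1, its instruction end signal loop_end is pulled high as well.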
通过上述描述可知,在本公开实施例中,通过每层循环指令的指令结束信号,可以控制每层循环指令中循环体的执行逻辑,通过该处理方式,可以省略for循环的执行步骤,从而可以简化指令周期,进而提高了处理器的执行效率,实现了处理器的高效执行。
在一个可选的实施方式中,在获取多层循环指令之后,该方法还包括如下步骤:
S21:在所述多层循环指令中的每层循环指令中确定第一循环指令;其中,所述第一循环指令为待分配指令标识的循环指令;
S22:在预设循环标识中确定处于空闲状态的空闲循环标识;
S23:基于所述空闲循环标识确定所述第一循环指令的循环标识;其中,所述第一循环指令的循环标识用于指示所述第一循环指令的循环层数;
S24:基于所述第一循环指令的循环标识在所述参数寄存器中存储所述第一循环指令的循环参数。
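In the embodiments of the present disclosure, multiple loop identifiers loop_id (the preset loop identifiers) can be set in advance, and a loop identifier loop_id is then dynamically allocated to each level of loop instruction in the multi-level loop instruction.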
具体实施时,在确定出待分配指令标识的循环指令(即,第一循环指令)之后,可以优先在预设循环标识中确定处于空闲状态的空闲循环标识;并在空闲循环标识中确定第一循环指令的循环标识。
在确定出第一循环指令的循环标识之后,可以将该循环标识确定为该第一循环指令的层数信息(也即,上述循环层数)。之后,可以基于该层数信息将该第一循环指令的循环参数存储至参数寄存器中。
具体实施时,可以在处理器的多个寄存器中为该第一循环指令确定至少一个参数寄存器,并为这至少一个参数寄存器设置对应的索引,该索引即为第一循环指令的层数信息,以表示该至少一个参数寄存器为对应存储该层数的循环指令的循环参数的参数寄存器。
在本公开实施例中,该方法还包括如下步骤:
(1)在所述预设循环标识中不包含所述空闲循环标识的情况下,在所述多层循环指令中检测第二循环指令;其中,所述第二循环指令为循环结束的循环指令;
(2)基于所述第二循环指令的循环标识确定所述第一循环指令的循环标识。
在本公开实施例中,如果预设循环标识中不包含空闲循环标识,此时,可以在多层循环指令中检测第二循环指令,这里,第二循环指令为循环结束的循环指令,具体检测过程描述如下:
首先,获取所述多层循环指令中的每层循环指令的指令工作状态;然后,基于所述指令工作状态在所述多层循环指令中确定指令执行结束的循环指令,并基于所述指令执行结束的循环指令确定所述第二循环指令。
具体实施时,可以在状态寄存器的第六状态寄存器中获取每层循环指令的指令工作状态loop_en。可以在状态寄存器为每层循环指令均配置一个第六状态寄存器,该第六状态寄存器用于维护每层循环指令的指令工作状态loop_en。
当来临一条循环指令时,该层循环指令的指令工作状态loop_en拉高;当该层循环结束工作时,该层循环指令的指令工作状态loop_en拉低。
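Dynamic allocation of loop identifiers can be sketched as a small pool keyed by the working state loop_en. This is a simplified illustration under assumptions: the class `LoopIdPool` and its methods are hypothetical, and the sketch treats any identifier whose loop_en is low as free, which folds "idle identifier" and "identifier of a finished loop" into one case.

```python
class LoopIdPool:
    """Hypothetical pool of preset loop identifiers."""

    def __init__(self, n_ids):
        # loop_en[i] high means loop_id i is held by a running loop.
        self.loop_en = [False] * n_ids

    def allocate(self):
        """Return a free loop_id and pull its loop_en high, or None."""
        for loop_id, busy in enumerate(self.loop_en):
            if not busy:
                self.loop_en[loop_id] = True
                return loop_id
        return None  # no free identifier until some loop finishes

    def release(self, loop_id):
        """Pull loop_en low when the loop with this identifier ends."""
        self.loop_en[loop_id] = False


pool = LoopIdPool(2)
first = pool.allocate()
second = pool.allocate()
```

When `allocate` returns `None`, the caller would wait for a loop to finish and then reuse its identifier, which corresponds to detecting the second loop instruction in the text.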
The execution process of the loop bodies of outer loop for1 and inner loop for2 described above is taken as an example:
1:a=a+1;外循环for1的第一圈循环;
2:b=b+1;内循环for2的第一圈循环;
3:b=b+1;内循环for2的第二圈循环;
4:a=a+1;外循环for1的第二圈循环;
5:b=b+1;内循环for2的第一圈循环;
6:b=b+1;内循环for2的第二圈循环。
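When the first instruction "a=a+1" is executed, the instruction working state loop_en of outer loop for1 is pulled high, and it stays high until the first through sixth instructions have all been executed, after which loop_en of outer loop for1 is pulled low.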
此时,可以理解为在上述多层循环指令循环至第六条指令时,为外循环for1分配的循环标识loop_id可以被释放,此时,外循环for1即为上述第二循环指令。
当执行第2条指令、第3条指令、第5条指令和第6条指令时,内循环for2的指令工作状态loop_en拉高。
此时,可以理解为在上述多层循环指令循环至第2条指令、第3条指令、第5条指令和第6条指令时,为内循环for2分配的循环标识loop_id可以被释放,此时,内循环for2即为上述第二循环指令。
在本公开实施例中,可以通过上述所描述的方式在多层循环指令中确定指令执行结束的循环指令,并基于指令执行结束的循环指令确定所述第二循环指令。
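Specifically, once the second loop instruction has finished executing, its loop identifier loop_id is released and can be reset. On reset, the value of loop_id can be set according to the loop level of the first loop instruction. For example, if the first loop instruction is the loop one level above the second loop instruction and the loop level of the second loop instruction is N, loop_id can be set to LOOP_N-1, and loop_id is decremented by 1 when the first loop instruction arrives.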
在上述实施方式中,通过为多层循环指令中的每层循环指令动态分配循环标识的方式,可以实现每层循环指令的循环层数的自维护,从而可以提高处理器对多层循环指令的兼容性。
在本公开实施例中,在按照上述所描述的方式确定出执行逻辑之后,可以基于所述执行逻辑控制执行所述各层循环指令中的循环体,具体包括如下步骤:
(1)在所述执行逻辑为跳转至所述多层循环指令中多个循环指令的起始PC指针所指向指令的情况下,确定所述多个循环指令中位于最内层的循环指令为待跳转指令;
(2)跳转至所述待跳转指令的起始PC指针处执行所述待跳转指令。
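In the embodiments of the present disclosure, if the execution logic indicates that execution must jump back to the instructions pointed to by the start PC pointers of several loop instructions (that is, the loop jump signals loop_jump of several loop instructions are pulled high), the innermost of those loop instructions is taken as the instruction to jump to, and execution jumps to the start PC pointer of that instruction and executes it.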
在上述实施方式中,在多个循环指令的循环跳转信号拉高情况下,通过将多个循环指令中位于最内层的循环指令为待跳转指令的方式,可以保证多层循环指令的正常稳定运行。
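The innermost-first rule can be sketched as a priority selection over the loop_jump signals. The function name `pick_jump_target` and the list layout (outermost loop at index 0) are assumptions made for this illustration.

```python
def pick_jump_target(loop_jump, start_pc):
    """Return the start PC of the innermost loop whose loop_jump is high.

    Both lists are indexed outermost-first (assumed layout).
    Returns None when no loop requests a jump back.
    """
    for level in reversed(range(len(loop_jump))):
        if loop_jump[level]:
            return start_pc[level]
    return None
```

Scanning from the deepest level outward means an inner loop always finishes its remaining iterations before an enclosing loop is allowed to restart.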
下面以下述多层循环指令为例,对本公开实施例中循环指令处理方法进行举例说明:
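[Figure PCTCN2022120852-appb-000005: code listing of the two-level nested for loop (outer loop for1, inner loop for2); image not reproduced]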
通过上述描述可知,针对上述for循环指令,该执行逻辑可以理解为每层循环指令中循环体{a=a+1}和{b=b+1}之间的跳转逻辑。在循环参数的控制下,可以通过执行循环体{a=a+1}和{b=b+1}来实现多层循环指令的指令功能,从而省略执行for(i=0;i<3;i++)和for(j=0;j<3;j++)所对应的指令。
(1)进入到外循环for1的第一轮循环过程,并执行指令1:a=a+1。
这里,在进入到上述多层循环指令之后,可以执行for1中的一个循环体a=a+1。
在执行该指令过程中,针对外循环for1和内循环for2:
寄存器R11:实时循环次数不更新;寄存器R12:实时循环次数不更新;
寄存器R21:循环跳转信号loop_jump1拉低;寄存器R22:循环跳转信号loop_jump2拉低;
寄存器R31:完成指示信号loop_lower_done1拉低;
寄存器R41:指令结束信号loop_end1拉低;寄存器R42:指令结束信号loop_end2拉低;
寄存器R51:指令指示信号loop_last_ins1拉低;寄存器R52:指令指示信号loop_last_ins2拉低;
寄存器R61:指令工作状态loop_en1拉高;寄存器R62:指令工作状态loop_en2拉低。
(2)继续执行外循环for1的第一轮循环过程,并执行指令2:b=b+1。
这里,在进入到上述多层循环指令之后,可以执行for1中的另一个循环体b=b+1。
在执行该指令过程中,针对外循环for1和内循环for2:
寄存器R11:实时循环次数不更新;寄存器R12:实时循环次数+1;
寄存器R21:循环跳转信号loop_jump1拉低;寄存器R22:循环跳转信号loop_jump2拉高;
寄存器R31:完成指示信号loop_lower_done1继续拉低;
寄存器R41:指令结束信号loop_end1拉低;寄存器R42:指令结束信号loop_end2拉低;
寄存器R51:指令指示信号loop_last_ins1拉低;寄存器R52:指令指示信号loop_last_ins2拉高;
寄存器R61:指令工作状态loop_en1拉高;寄存器R62:指令工作状态loop_en2拉高。
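After instruction 2 is executed, register R52 shows that execution has reached the last instruction of an iteration of the inner loop, but register R12 shows that the inner loop has not yet reached its last round: its real-time loop count is less than 2. Register R22 also shows that loop_jump2 is pulled high, so the loop body b=b+1 must be executed again, that is, step (3) below follows.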
(3)继续执行外循环for1的第一轮循环过程,并执行指令3:b=b+1。
这里,在进入到上述多层循环指令之后,由于循环体b=b+1还未执行到最后一轮的最后一条指令,因此,需要回跳继续执行该循环体b=b+1。
在执行该指令过程中,针对外循环for1和内循环for2:
寄存器R11:实时循环次数+1;寄存器R12:实时循环次数不更新;
寄存器R21:循环跳转信号loop_jump1拉高;寄存器R22:循环跳转信号loop_jump2拉低;
寄存器R31:完成指示信号loop_lower_done1更新为拉高;
寄存器R41:指令结束信号loop_end1拉低;寄存器R42:指令结束信号loop_end2拉高;
寄存器R51:指令指示信号loop_last_ins1拉高;寄存器R52:指令指示信号loop_last_ins2拉高;
寄存器R61:指令工作状态loop_en1拉高;寄存器R62:指令工作状态loop_en2拉高。
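After instruction 3 is executed, register R52 shows that execution has reached the last instruction of an iteration of the inner loop, and register R12 shows that the inner loop is in its last round. A high instruction end signal is therefore generated for inner loop for2. Further, register R22 shows that loop_jump2 is pulled low, indicating that the loop body b=b+1 is not executed again.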
在执行指令3之后,通过寄存器R51判断可知,虽然执行至外循环的每轮循环的最后一条指令,但是通过寄存器R11可知,还未执行到外循环的最后一轮,因此,外循环的指令结束信号loop_end1拉低,表明不结束外循环。且通过寄存器R21可知,循环跳转信号loop_jump1拉高,且通过寄存器R11可知,实时循环次数小于2,因此,需要继续执行循环体a=a+1,即,继续执行下述(4)。
(4) Enter the second round of outer loop for1 and execute instruction 4: a=a+1.
这里,在进入到上述多层循环指令的第二轮循环过程之后,可以执行for1中的一个循环体a=a+1。
在执行该指令过程中,针对外循环for1和内循环for2:
寄存器R11:实时循环次数不更新;寄存器R12:实时循环次数不更新;
寄存器R21:循环跳转信号loop_jump1拉低;寄存器R22:循环跳转信号loop_jump2拉低;
寄存器R31:完成指示信号loop_lower_done1拉低;
寄存器R41:指令结束信号loop_end1拉低;寄存器R42:指令结束信号loop_end2拉低;
寄存器R51:指令指示信号loop_last_ins1拉低;寄存器R52:指令指示信号loop_last_ins2拉低;
寄存器R61:指令工作状态loop_en1拉高;寄存器R62:指令工作状态loop_en2拉低。
(5)继续执行外循环for1的第二轮循环过程,并执行指令5:b=b+1。
这里,在进入到上述多层循环指令的第二轮循环过程之后,可以执行for1中的另一个循环体b=b+1。
在执行该指令过程中,针对外循环for1和内循环for2:
寄存器R11:实时循环次数不更新;寄存器R12:实时循环次数+1;
寄存器R21:循环跳转信号loop_jump1拉低;寄存器R22:循环跳转信号loop_jump2拉高;
寄存器R31:完成指示信号loop_lower_done1继续拉低;
寄存器R41:指令结束信号loop_end1拉低;寄存器R42:指令结束信号loop_end2拉低;
寄存器R51:指令指示信号loop_last_ins1拉低;寄存器R52:指令指示信号loop_last_ins2拉高;
寄存器R61:指令工作状态loop_en1拉高;寄存器R62:指令工作状态loop_en2拉高。
在执行指令5之后,通过寄存器R52判断可知,虽然执行至内循环的每轮循环的最后一条指令,但是通过寄存器R12可知,还未执行到内循环的最后一轮。且通过寄存器R22可知,循环跳转信号loop_jump2拉高,因此,需要继续执行循环体b=b+1,即,继续执行下述(6)。
(6)继续执行外循环for1的第二轮循环过程,并执行指令6:b=b+1。
这里,在进入到上述多层循环指令之后,由于循环体b=b+1还未执行到最后一轮的最后一条指令,因此,需要回跳继续执行该循环体b=b+1。
在执行该指令过程中,针对外循环for1和内循环for2:
寄存器R11:实时循环次数不更新;寄存器R12:实时循环次数不更新;
寄存器R21:循环跳转信号loop_jump1拉低;寄存器R22:循环跳转信号loop_jump2拉低;
寄存器R31:完成指示信号loop_lower_done1更新为拉高;
Register R41: instruction end signal loop_end1 is pulled high; register R42: instruction end signal loop_end2 is pulled high;
寄存器R51:指令指示信号loop_last_ins1拉高;寄存器R52:指令指示信号loop_last_ins2拉高;
寄存器R61:指令工作状态loop_en1拉高;寄存器R62:指令工作状态loop_en2拉高。
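After instruction 6 is executed, register R52 shows that execution has reached the last instruction of an iteration of the inner loop, and register R12 shows that the inner loop is in its last round. A high instruction end signal is therefore generated for inner loop for2. Further, register R22 shows that loop_jump2 is pulled low, indicating that the loop body b=b+1 is not executed again.
After instruction 6 is executed, register R51 shows that execution has reached the last instruction of an iteration of the outer loop, and register R11 shows that the outer loop is in its last round. The instruction end signal loop_end1 of the outer loop is therefore pulled high, indicating that the outer loop ends. Register R21 also shows that loop_jump1 is pulled low, so the loop body a=a+1 is not executed again and the whole loop process is complete.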
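The whole walkthrough can be condensed into a small software model of a controller that executes only loop-body instructions and derives every jump from per-level parameters and counters. This is a sketch under stated assumptions, not the disclosed hardware: the function `run`, the dictionary keys, and the choice to reset a level's counter to 1 when that level finishes are all hypothetical.

```python
def run(program, loops):
    """Execute `program` (a list of body instructions) under hardware-style
    loop control. `loops` lists loop levels outermost-first; each entry has
    start_pc, body_len and max_count (assumed parameter names)."""
    counts = [1] * len(loops)   # real-time loop count per level
    executed = []
    pc = 0
    while pc < len(program):
        executed.append(program[pc])
        next_pc = pc + 1
        # Check the innermost level first so the innermost jump wins.
        for level in reversed(range(len(loops))):
            lp = loops[level]
            last_pc = lp["start_pc"] + lp["body_len"] - 1
            if pc != last_pc:
                continue
            if counts[level] < lp["max_count"]:
                counts[level] += 1          # update the real-time count
                next_pc = lp["start_pc"]    # jump back to the start PC
                break
            counts[level] = 1   # level finished; reset (assumption)
        pc = next_pc
    return executed


loops = [
    {"start_pc": 0, "body_len": 2, "max_count": 2},  # outer loop for1
    {"start_pc": 1, "body_len": 1, "max_count": 2},  # inner loop for2
]
print(run(["a=a+1", "b=b+1"], loops))
```

The program memory holds just the two body instructions, yet the executed sequence is the six-instruction trace from steps (1) to (6); no instruction corresponding to a for header is ever fetched.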
本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。
基于同一发明构思,本公开实施例中还提供了与循环指令处理方法对应的循环指令处理装置,由于本公开实施例中的装置解决问题的原理与本公开实施例上述循环指令处理方法相似,因此装置的实施可以参见方法的实施,重复之处不再赘述。
参照图5所示,为本公开实施例提供的一种循环指令处理装置的示意图,所述装置包括:控制器10、参数寄存器20以及运算器30。
控制器10用于获取多层循环指令;其中,所述多层循环指令包含多层嵌套的循环指令;在所述参数寄存器20中获取所述多层循环指令中的每层循环指令的循环参数;以及在所述每层循环指令的执行过程中,基于该层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑;其中,所述状态信息用于指示该层循环指令的实时执行状态。
运算器30用于基于所述执行逻辑控制执行该层循环指令中的循环体。
在本公开实施例中,在获取到多层循环指令之后,可以在处理器的参数寄存器中获取多层循环指令中的每层循环指令的循环参数,并在每层循环指令的执行过程中,基于该层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑;其中,状态信息用于指示该层循环指令的实时执行状态;进而基于执行逻辑控制执行该层循环指令中的循环体。在本公开实施例中,通过维护循环参数和状态信息控制执行每层循环指令中循环体的执行逻辑,可以实现通过执行每层循环指令的循环体来实现多层循环指令的指令功能,从而可以省略执行for语句,简化指令周期,进而提高了处理器的执行效率,实现了处理器的高效执行。
在一种可能的实施方式中,控制器10还用于:获取该层循环指令的实时循环次数;将所述实时循环次数与该层循环指令的循环结束参数进行比对,得到比对结果;根据所述比对结果确定该层循环指令中循环体的执行逻辑。
在一种可能的实施方式中,该循环指令处理装置还用于:检测所述多层循环指令中满足更新条件的目标层循环指令;其中,所述更新条件为更新所述目标层循环指令的实时循环次数的条件;更新所述目标层循环指令的实时循环次数。
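In a possible implementation, the loop instruction processing apparatus is further configured to: for each level of loop instruction in the multi-level loop instruction other than the innermost level, acquire the loop jump signal of that level, where the loop jump signal indicates whether to jump to and execute the instruction indicated by the start PC pointer of that level; acquire the execution information of the next-level loop instruction nested in that level; and, when the loop jump signal of that level is a jump-back signal and the execution information of the next-level loop instruction is "execution completed", determine that level to be the target-level loop instruction that meets the update condition.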
在一种可能的实施方式中,该循环指令处理装置还用于:在检测到PC指针指向该层循环指令的最后一个指令,且检测到该层循环指令未执行最后一次循环计算的情况下,确定该层循环指令的循环跳转信号为回跳信号。
在一种可能的实施方式中,控制器10还用于:检测该层循环指令的指令结束信号;基于检测到的所述指令结束信号和该层循环指令的所述循环参数,确定该层循环指令中循环体的执行逻辑。
在一种可能的实施方式中,控制器10还用于:在检测到该层循环指令执行至该层循环指令的最后一次循环过程的最后一条指令的情况下,确定检测到该层循环指令的指令结束信号。
在一种可能的实施方式中,控制器10还用于:获取当前时刻PC指针所指向该层循环指令的下一个指令周期,得到第一指令周期;在确定出所述第一指令周期大于目标数值,且该PC指针所指向该层循环指令的内嵌循环指令执行完成,以及该层循环指令执行最后一次循环过程的情况下,确定执行至该层循环指令的最后一次循环过程的最后一条指令;其中,所述目标数值为该层循环指令的起始指针和该层循环指令的循环体内所包含的指令数量之和。
在一种可能的实施方式中,该循环指令处理装置还用于:在获取多层循环指令之后,在所述多层循环指令中的每层循环指令中确定第一循环指令;其中,所述第一循环指令为待分配指令标识的循环指令;在预设循环标识中确定处于空闲状态的空闲循环标识;基于所述空闲循环标识确定所述第一循环指令的循环标识;其中,所述第一循环指令的循环标识用于指示所述第一循环指令的循环层数;基于所述第一循环指令的循环标识在所述参数寄存器中存储所述第一循环指令的循环参数。
在一种可能的实施方式中,该循环指令处理装置还用于:在所述预设循环标识中不包含所述空闲循环标识的情况下,在所述多层循环指令中检测第二循环指令;其中,所述第二循环指令为循环结束的循环指令;基于所述第二循环指令的循环标识确定所述第一循环指令的循环标识。
在一种可能的实施方式中,该循环指令处理装置还用于:获取所述多层循环指令中的每层循环指令的指令工作状态;基于所述指令工作状态在所述多层循环指令中确定指令执行结束的循环指令,并基于所述指令执行结束的循环指令确定所述第二循环指令。
在一种可能的实施方式中,运算器30还用于:在所述执行逻辑为跳转至所述多层循环指令中多个循环指令的起始PC指针所指向指令的情况下,确定所述多个循环指令中位于最内层的循环指令为待跳转指令;跳转至所述待跳转指令的起始PC指针处执行所述待跳转指令。
关于装置中的各模块的处理流程、以及各模块之间的交互流程的描述可以参照上述方法实施例中的相关说明,这里不再详述。
对应于图1中的循环指令处理方法,本公开实施例还提供了一种电子设备600。如图6所示,为本公开实施例提供的电子设备600结构示意图,该电子设备600包括:处理器61、存储器62和总线63。
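The memory 62 stores execution instructions and includes an internal memory 621 and an external memory 622. The internal memory 621 temporarily holds operation data of the processor 61 and data exchanged with the external memory 622, such as a hard disk. The processor 61 exchanges data with the external memory 622 through the internal memory 621. When the electronic device 600 runs, the processor 61 communicates with the memory 62 over the bus 63, so that the processor 61 executes the following instructions: acquiring a multi-level loop instruction, where the multi-level loop instruction contains loop instructions nested in multiple levels; acquiring, from a parameter register of the processor, the loop parameters of each level of loop instruction in the multi-level loop instruction; during execution of each level of loop instruction, determining the execution logic of the loop body in that level based on the state information and the loop parameters of that level, where the state information indicates the real-time execution state of that level of loop instruction; and controlling execution of the loop body in that level based on the execution logic.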
本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述方法实施例中所述的循环指令处理方法的步骤。其中,该存储介质可以是易失性或非易失的计算机可读取存储介质。
本公开实施例还提供一种计算机程序产品,该计算机程序产品承载有程序代码,所述程序代码包括的指令可用于执行上述方法实施例中所述的循环指令处理方法的步骤,具体可参见上述方法实施例,在此不再赘述。
本公开实施例还提供一种芯片,包括上述实施例中任一项所述的指令处理装置,具体可参见上述装置实施例,在此不再赘述。
其中,上述计算机程序产品可以具体通过硬件、软件或其结合的方式实现。在一个可选实施例中,所述计算机程序产品具体体现为计算机存储介质,在另一个可选实施例中,计算机程序产品具体体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统和装置的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。在本公开所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,又例如,多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台电子设备(可以是个人计算机,服务器,或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。而前述的存储介质包括U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
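Finally, it should be noted that the embodiments described above are only specific implementations of the present disclosure, used to illustrate its technical solutions rather than to limit them, and the scope of protection of the present disclosure is not limited to them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed here, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and all fall within its scope of protection. The scope of protection of the present disclosure shall therefore be subject to the scope of protection of the claims.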

Claims (17)

  1. 一种指令处理方法,其特征在于,应用于处理器,包括:
    获取多层循环指令;其中,所述多层循环指令包含多层嵌套的循环指令;
    在所述处理器的参数寄存器中获取所述多层循环指令中的每层循环指令的循环参数;
    在所述每层循环指令的执行过程中,
    基于该层循环指令的状态信息和该层循环指令的循环参数,确定该层循环指令中循环体的执行逻辑;其中,所述状态信息用于指示该层循环指令的实时执行状态;
    基于所述执行逻辑控制执行该层循环指令中的循环体。
  2. 根据权利要求1所述的方法,其特征在于,所述基于该层循环指令的状态信息和该层循环指令的循环参数,确定该层循环指令中循环体的执行逻辑,包括:
    获取该层循环指令的实时循环次数;
    将所述实时循环次数与该层循环指令的循环结束参数进行比对,得到比对结果;
    根据所述比对结果确定该层循环指令中循环体的执行逻辑。
  3. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    检测所述多层循环指令中满足更新条件的目标层循环指令;其中,所述更新条件为更新所述目标层循环指令的实时循环次数的条件;
    更新所述目标层循环指令的实时循环次数。
  4. 根据权利要求3所述的方法,其特征在于,所述检测所述多层循环指令中满足更新条件的目标层循环指令,包括:
    针对所述多层循环指令中除最内层循环指令之外的每层循环指令,
    获取该层循环指令的循环跳转信号;其中,所述循环跳转信号用于指示是否跳转执行该层循环指令的起始PC指针所指示的指令;
    获取该层循环指令内嵌套的下一层循环指令的执行信息;
    在确定该层循环指令的所述循环跳转信号为回跳信号以及所述下一层循环指令的执行信息为执行完成的情况下,确定该层循环指令为所述满足更新条件的目标层循环指令。
  5. 根据权利要求4所述的方法,其特征在于,所述确定该层循环指令的所述循环跳转信号为回跳信号,包括:
    在检测到PC指针指向该层循环指令中的最后一个指令,且检测到该层循环指令未执行最后一次循环计算的情况下,确定该层循环指令的循环跳转信号为回跳信号。
  6. 根据权利要求1所述的方法,其特征在于,所述基于该层循环指令的状态信息和该层循环指令的循环参数确定该层循环指令中循环体的执行逻辑,包括:
    检测该层循环指令的指令结束信号;
    基于检测到的所述指令结束信号和该层循环指令的所述循环参数,确定该层循环指令中循环体的执行逻辑。
  7. 根据权利要求6所述的方法,其特征在于,所述检测该层循环指令的指令结束信号,包括:
    在检测到该层循环指令执行至该层循环指令的最后一次循环过程的最后一条指令的情况下,确定检测到该层循环指令的指令结束信号。
  8. 根据权利要求7所述的方法,其特征在于,所述检测到该层循环指令执行至该层循环指令的最后一次循环过程的最后一条指令,包括:
    获取当前时刻PC指针所指向该层循环指令的下一个指令周期,得到第一指令周期;
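    when the first instruction cycle is greater than a target value, the loop instruction nested in the loop instruction of this level that the PC pointer points to has completed, and this level of loop instruction is executing its last iteration, determining that execution has reached the last instruction of the last iteration of this level of loop instruction; wherein the target value is the sum of the start pointer of this level of loop instruction and the number of instructions contained in the loop body of this level of loop instruction.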
  9. 根据权利要求1所述的方法,其特征在于,在所述获取多层循环指令之后,所述方法包括:
    在所述多层循环指令中的每层循环指令中确定第一循环指令;其中,所述第一循环指令为待分配指令标识的循环指令;
    在预设循环标识中确定处于空闲状态的空闲循环标识;
    基于所述空闲循环标识确定所述第一循环指令的循环标识;其中,所述第一循环指令的循环标识用于指示所述第一循环指令的循环层数;
    基于所述第一循环指令的循环标识在所述参数寄存器中存储所述第一循环指令的循环参数。
  10. 根据权利要求9所述的方法,其特征在于,所述方法还包括:
    在所述预设循环标识中不包含所述空闲循环标识的情况下,在所述多层循环指令中检测第二循环指令;其中,所述第二循环指令为循环结束的循环指令;
    基于所述第二循环指令的循环标识确定所述第一循环指令的循环标识。
  11. 根据权利要求10所述的方法,其特征在于,所述在所述多层循环指令中检测第二循环指令,包括:
    获取所述多层循环指令中的每层循环指令的指令工作状态;
    基于所述指令工作状态在所述多层循环指令中确定指令执行结束的循环指令;
    基于所述指令执行结束的循环指令确定所述第二循环指令。
  12. 根据权利要求1所述的方法,其特征在于,所述基于所述执行逻辑控制执行所述各层循环指令中的循环体,包括:
    在所述执行逻辑为跳转至所述多层循环指令中多个循环指令的起始PC指针所指向指令的情况下,确定所述多个循环指令中位于最内层的循环指令为待跳转指令;
    跳转至所述待跳转指令的起始PC指针处执行所述待跳转指令。
  13. 一种指令处理装置,其特征在于,包括:
    参数寄存器;
    控制器,用于:
    获取多层循环指令;其中,所述多层循环指令包含多层嵌套的循环指令;
    在所述参数寄存器中获取所述多层循环指令中的每层循环指令的循环参数;
    在所述每层循环指令的执行过程中,基于该层循环指令的状态信息和该层循环指令的循环参数,确定该层循环指令中循环体的执行逻辑;其中,所述状态信息用于指示该层循环指令的实时执行状态;
    运算器,用于基于所述执行逻辑控制执行该层循环指令中的循环体。
  14. 一种芯片,其特征在于,包括如权利要求13所述的指令处理装置。
  15. 一种电子设备,其特征在于,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,当所述电子设备运行时,所述处理器与所述存储器之间通过所述总线通信,所述机器可读指令被所述处理器执行时实现如权利要求1至12任一项所述的循环指令处理方法的步骤。
  16. 一种电子设备,其特征在于,包括如权利要求14所述的芯片。
  17. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有计算机程序,所述计算机程序被处理器运行时执行如权利要求1至12任一项所述的循环指令处理方法的步骤。
PCT/CN2022/120852 2022-01-29 2022-09-23 循环指令处理方法、装置、芯片、电子设备及存储介质 WO2023142502A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210113076.X 2022-01-29
CN202210113076.XA CN114443142A (zh) 2022-01-29 2022-01-29 循环指令处理方法、装置、芯片、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2023142502A1 true WO2023142502A1 (zh) 2023-08-03

Family

ID=81372379

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/120852 WO2023142502A1 (zh) 2022-01-29 2022-09-23 循环指令处理方法、装置、芯片、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN114443142A (zh)
WO (1) WO2023142502A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117707619A (zh) * 2023-11-02 2024-03-15 上海攸化技术管理咨询有限公司 指令的编码方式、运算单元、运算模块及运算方法
CN117908967A (zh) * 2024-03-15 2024-04-19 北京壁仞科技开发有限公司 支持动态形状的计算的方法、装置、介质和程序产品

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114443142A (zh) * 2022-01-29 2022-05-06 上海阵量智能科技有限公司 循环指令处理方法、装置、芯片、电子设备及存储介质
CN115113934B (zh) * 2022-08-31 2022-11-11 腾讯科技(深圳)有限公司 指令处理方法、装置、程序产品、计算机设备和介质
CN116893850B (zh) * 2023-07-10 2024-05-24 北京辉羲智能科技有限公司 一种硬件循环指令转换方法及编译器

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729054A (zh) * 2017-10-18 2018-02-23 珠海市杰理科技股份有限公司 实现处理器对循环体执行的方法及装置
CN112148367A (zh) * 2019-06-26 2020-12-29 北京百度网讯科技有限公司 用于处理循环指令集合的方法、装置、设备和介质
CN114443142A (zh) * 2022-01-29 2022-05-06 上海阵量智能科技有限公司 循环指令处理方法、装置、芯片、电子设备及存储介质

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117707619A (zh) * 2023-11-02 2024-03-15 上海攸化技术管理咨询有限公司 指令的编码方式、运算单元、运算模块及运算方法
CN117908967A (zh) * 2024-03-15 2024-04-19 北京壁仞科技开发有限公司 支持动态形状的计算的方法、装置、介质和程序产品
CN117908967B (zh) * 2024-03-15 2024-05-31 北京壁仞科技开发有限公司 支持动态形状的计算的方法、装置、介质和程序产品

Also Published As

Publication number Publication date
CN114443142A (zh) 2022-05-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22923306

Country of ref document: EP

Kind code of ref document: A1