WO2023124370A1 - Instruction synchronization apparatus, chip, computer device, and data processing method - Google Patents

Info

Publication number
WO2023124370A1
WO2023124370A1 PCT/CN2022/124511 CN2022124511W WO2023124370A1 WO 2023124370 A1 WO2023124370 A1 WO 2023124370A1 CN 2022124511 W CN2022124511 W CN 2022124511W WO 2023124370 A1 WO2023124370 A1 WO 2023124370A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
queue
waiting
count value
adjustment
Prior art date
Application number
PCT/CN2022/124511
Other languages
French (fr)
Chinese (zh)
Inventor
王文强
孙海涛
何博
徐宁仪
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023124370A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication

Definitions

  • the present disclosure relates to the technical field of artificial intelligence, and in particular to an instruction synchronization device, a chip and computer equipment, and a data processing method.
  • an embodiment of the present disclosure provides an instruction synchronization device, which includes: a counter and a plurality of instruction queues; the counter is coupled to the plurality of instruction queues; each instruction queue of the plurality of instruction queues is used to store instructions, and the instructions include execution instructions and at least one of a trigger instruction and a waiting instruction; the counter is used to perform a first adjustment on the count value in response to receiving the trigger instruction, and to perform a second adjustment on the count value in response to receiving the waiting instruction; the adjustment method of the first adjustment is different from the adjustment method of the second adjustment; wherein the instructions after the waiting instruction in an instruction queue are sent when the count value meets a preset numerical condition, and the preset numerical condition is determined based on the initial count value of the counter, the adjustment method of the first adjustment, and the adjustment method of the second adjustment.
  • the apparatus further includes an execution unit, which is coupled to the plurality of instruction queues and is used to execute the received execution instructions; each instruction queue is used to parse the stored instructions, send the parsed trigger instructions and waiting instructions to the counter, and send the parsed execution instructions to the execution unit.
  • one of the first adjustment and the second adjustment increases the count value according to a first preset step size, and the other decreases the count value according to a second preset step size; the first preset step size is determined based on the number of waiting queues related to the target synchronization event in the plurality of instruction queues, and the second preset step size is determined based on the number of waited queues related to the target synchronization event in the plurality of instruction queues.
  • the trigger instruction carries the first preset step size
  • the wait instruction carries the second preset step size
  • the first preset step size is equal to the product of the number of waiting queues and a preset multiple
  • the second preset step size is equal to the product of the number of waited queues and the preset multiple.
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • the preset numerical condition is: the count value and the target count value satisfy a preset numerical relationship, and the target count value is n*m*a-1+k0, wherein n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value; when the adjustment method of the first adjustment is to increase the count value and the adjustment method of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment method of the first adjustment is to decrease the count value and the adjustment method of the second adjustment is to increase the count value, the numerical relationship is that the count value is smaller than the target count value.
  • the plurality of instruction queues include a waiting queue and a waited queue; the waited queue includes a plurality of trigger instructions, and each trigger instruction in the same waited queue corresponds to a synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter; a target waiting instruction is also included after the first trigger instruction in the waited queue; the first waiting instruction in the waiting queue further includes a target trigger instruction.
  • the number of counters is greater than 1; the trigger instruction and the waiting instruction included in each instruction queue include identification information of a counter, and the instruction queue is used to send the trigger instruction and the waiting instruction that it contains to the corresponding counter according to the identification information of the counter.
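  • As a minimal sketch of routing by counter identification information, the Python below keeps several counters in a dictionary and applies each trigger or waiting instruction to the counter it names. The tuple layout (kind, counter_id, step), the function name route_to_counter, and the choice that a trigger increases the count while a wait decreases it are assumptions made for illustration; the disclosure does not specify an encoding.

```python
def route_to_counter(counters, instruction):
    """Apply one trigger/wait instruction to the counter named by its ID."""
    kind, counter_id, step = instruction  # hypothetical instruction layout
    if counter_id not in counters:
        raise KeyError(f"unknown counter id: {counter_id}")
    if kind == "trigger":
        counters[counter_id] += step  # assumed direction: trigger increases
    elif kind == "wait":
        counters[counter_id] -= step  # assumed direction: wait decreases
    else:
        raise ValueError(f"not a synchronization instruction: {kind}")
    return counters[counter_id]


counters = {0: 0, 1: 0}
route_to_counter(counters, ("trigger", 0, 1))
route_to_counter(counters, ("trigger", 1, 2))
route_to_counter(counters, ("wait", 1, 2))
```

  • In this sketch counter 0 is left holding one outstanding trigger, while counter 1 has been triggered and waited on once and is back at its initial value.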
  • the instruction synchronization device further includes: an arbitration unit, coupled to the plurality of instruction queues and the execution unit, configured to send the execution instructions sent by each of the plurality of instruction queues to the execution unit according to a preset priority.
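  • One way to picture the arbitration step is a fixed-priority selection among the queues that currently offer an execution instruction. The sketch below is only an illustration: the function name arbitrate, the dictionary representation, and the convention that a smaller number means a higher priority are all assumptions, since the disclosure only states that a preset priority is used.

```python
def arbitrate(pending, priority):
    """Pick the execution instruction of the highest-priority non-empty queue.

    pending:  queue name -> execution instruction offered this cycle, or None
    priority: queue name -> rank (assumption: smaller rank wins)
    """
    candidates = [q for q, instruction in pending.items() if instruction is not None]
    if not candidates:
        return None
    winner = min(candidates, key=lambda q: priority[q])
    return winner, pending[winner]


priority = {"queue 1": 2, "queue 2": 0, "queue 3": 1}
pending = {"queue 1": "add", "queue 2": None, "queue 3": "mul"}
choice = arbitrate(pending, priority)
```

  • Queue 2 has the best rank but offers nothing in this cycle, so the sketch forwards the instruction from queue 3.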
  • the instruction synchronization device further includes: a multiplexer, coupled to the plurality of instruction queues, for sending the instruction to a corresponding instruction queue in the plurality of instruction queues.
  • both the trigger instruction and the waiting instruction include identification information of an instruction queue, and the multiplexer is also used to send the trigger instruction and the waiting instruction to the corresponding instruction queue according to the identification information of the instruction queue.
  • an embodiment of the present disclosure provides a chip, and the chip includes: the instruction synchronization device described in any embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a computer device, where the computer device includes: the chip described in any embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a data processing method, which is applied to the instruction synchronization device described in any embodiment of the present disclosure, and the method includes: each instruction queue of a plurality of instruction queues stores instructions, and the instructions include an execution instruction and at least one of a trigger instruction and a waiting instruction; a counter coupled to the plurality of instruction queues performs a first adjustment on the count value in response to receiving the trigger instruction, and performs a second adjustment on the count value in response to receiving the waiting instruction; the adjustment method of the first adjustment is different from the adjustment method of the second adjustment; each instruction queue sends the instructions after the waiting instruction when the count value satisfies a preset numerical condition, and the preset numerical condition is determined based on the initial count value of the counter, the adjustment method of the first adjustment, and the adjustment method of the second adjustment.
  • the method further includes: each instruction queue parses the stored instructions, sends the parsed trigger instructions and waiting instructions to the counter, and sends the parsed execution instructions to an execution unit.
  • the counter performing a first adjustment on the count value in response to receiving the trigger instruction, and performing a second adjustment on the count value in response to receiving the waiting instruction, includes: increasing the count value according to the first preset step size in response to receiving the trigger instruction, and decreasing the count value according to the second preset step size in response to receiving the waiting instruction; or, decreasing the count value according to the first preset step size in response to receiving the trigger instruction, and increasing the count value according to the second preset step size in response to receiving the waiting instruction.
  • the first preset step size is determined based on the number of waiting queues related to the target synchronization event in the plurality of instruction queues
  • the second preset step size is determined based on the number of waited queues related to the target synchronization event in the plurality of instruction queues.
  • the method further includes: acquiring, by the counter, the first preset step size carried in the trigger instruction, and acquiring the second preset step size carried in the waiting instruction.
  • the first preset step size is equal to the product of the number of waiting queues and a preset multiple
  • the second preset step size is equal to the product of the number of waited queues and the preset multiple.
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • the preset numerical condition is: the count value and the target count value satisfy a preset numerical relationship, and the target count value is n*m*a-1+k0, wherein n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value.
  • when the adjustment method of the first adjustment is to increase the count value and the adjustment method of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment method of the first adjustment is to decrease the count value and the adjustment method of the second adjustment is to increase the count value, the numerical relationship is that the count value is smaller than the target count value.
  • the plurality of instruction queues include a waiting queue and a waited queue; the waited queue includes a plurality of trigger instructions, and each trigger instruction in the same waited queue corresponds to a synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter; a target waiting instruction is also included after the first trigger instruction in the waited queue; the first waiting instruction in the waiting queue further includes a target trigger instruction; the first trigger instruction and the first waiting instruction are the corresponding instructions of the synchronization events other than the last synchronization event.
  • the number of counters is greater than 1; the method further includes: each instruction queue acquires the identification information of the counter included in the trigger instructions and waiting instructions in the queue, and, based on that identification information, sends the trigger instructions and waiting instructions in the queue to the corresponding counter.
  • sending the parsed execution instructions to the execution unit includes: sending the parsed execution instructions to the arbitration unit, so that the arbitration unit sends the execution instructions sent by each instruction queue of the plurality of instruction queues to the execution unit.
  • the method further includes: each instruction queue acquires the instruction sent by the multiplexer, and stores the instruction sent by the multiplexer.
  • each instruction queue acquiring the instructions sent by the multiplexer includes: each instruction queue acquiring, based on the identification information of the instruction queue included in the instructions, the instructions that the multiplexer sends to this queue.
  • a trigger instruction and a waiting instruction are inserted into the instruction queue, and the count value of the counter is adjusted through the trigger instruction and the waiting instruction. In this way, the sending order of the instructions in the multiple instruction queues can be controlled, so as to realize instruction synchronization among the multiple instruction queues.
  • the foregoing embodiments implement synchronization between instruction queues in a hardware manner, which improves the efficiency of the hardware system.
  • FIG. 1 is a schematic diagram of an instruction synchronization process.
  • FIG. 2 is a schematic structural diagram of an instruction synchronization device according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an instruction synchronization method according to an embodiment of the disclosure.
  • FIG. 4 is a schematic diagram of an instruction synchronization method according to another embodiment of the disclosure.
  • FIG. 5A is a schematic diagram of an instruction synchronization method when multiple synchronization events are mapped to the same synchronization counter according to an embodiment of the present disclosure.
  • FIG. 5B is a schematic diagram of the instruction sending sequence in the instruction queue in FIG. 5A.
  • FIG. 6 is a schematic structural diagram of an instruction synchronization device according to another embodiment of the present disclosure.
  • FIG. 7 is a flowchart of a data processing method according to an embodiment of the present disclosure.
  • although the terms first, second, third, etc. may be used in the present disclosure to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word “if” as used herein may be interpreted as “at” or “when” or “in response to a determination.”
  • the instructions may be issued to each instruction queue according to the preset distribution rules, and there may or may not be a dependency relationship between instructions in different instruction queues.
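  • where there is no dependency between the instructions in two instruction queues, the instructions in the two instruction queues can be sent in parallel, thereby increasing the degree of parallelism between the instructions; where there is a dependency between the instructions in the two instruction queues, the sending order of those instructions must be synchronized.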
  • FIG. 1 is a schematic diagram of an instruction synchronization process in some embodiments. Assuming that the instructions q1 and q2 in instruction queue 1 depend on the instructions Q1, Q2 and Q3 in instruction queue 2, that is, the instructions q1 and q2 can only be sent after the instructions Q1, Q2 and Q3 have all been sent, the order in which the instructions are sent is shown on the time axis in the figure. Those skilled in the art can understand that the sending sequence shown in the figure is only an exemplary description. In practical applications, the instructions in the same instruction queue are sent sequentially according to the order in which they are stored in the instruction queue, while instructions in different instruction queues that have no dependency relationship may be sent in any order.
  • for example, the sending time of the instruction Q4 may be earlier than the sending time of the instruction q3, or the instruction Q1 may be sent between the instruction q3 and the instruction q4.
  • because the instructions q1 and q2 depend on the instructions Q1, Q2 and Q3, the sending times of the instructions Q1, Q2 and Q3 are all earlier than that of the instruction q1 and all earlier than that of the instruction q2.
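  • The ordering constraint described above can be stated as a small check on a candidate send order. The following Python is an illustrative sketch only: the function name respects_dependencies and the two example orders are made up for this illustration and do not reproduce the time axis of FIG. 1.

```python
def respects_dependencies(send_order, depends_on):
    """Return True if every instruction is sent after all instructions it depends on."""
    position = {name: index for index, name in enumerate(send_order)}
    return all(
        position[earlier] < position[later]
        for later, earlier_list in depends_on.items()
        for earlier in earlier_list
    )


# q1 and q2 (instruction queue 1) depend on Q1, Q2 and Q3 (instruction queue 2).
depends_on = {"q1": ["Q1", "Q2", "Q3"], "q2": ["Q1", "Q2", "Q3"]}

valid = respects_dependencies(["Q1", "Q2", "Q3", "q1", "q2"], depends_on)
invalid = respects_dependencies(["Q1", "q1", "Q2", "Q3", "q2"], depends_on)
```

  • The second order is rejected because q1 is sent before Q2 and Q3.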
  • in one related technology, an instruction counter is deployed for each instruction queue to count the instructions sent by that instruction queue, and other instruction queues realize instruction synchronization by obtaining the count value of the instruction counter of the corresponding instruction queue.
  • in this instruction synchronization method, there are only producers and no consumers (for example, the count of the counter only increases and never decreases), so the counter is at risk of overflow, and the instructions in each instruction queue can only be of the same type.
  • another related technology adopts a producer-consumer model, and deploys a state synchronization counter between every two instruction queues to realize instruction synchronization.
  • This technology deploys the state synchronization counters statically. Assuming that the number of instruction queues is N, the number of state synchronization counters to be deployed is N*(N-1). When N is large, the number of counters is huge and takes up too many system resources.
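  • To give a sense of scale for the N*(N-1) figure quoted above, the short Python sketch below tabulates it for a few queue counts. The function name pairwise_counter_count and the chosen values of N are illustrative assumptions.

```python
def pairwise_counter_count(num_queues):
    """Counters needed when one is statically deployed per ordered pair of queues."""
    if num_queues < 2:
        raise ValueError("at least two instruction queues are required")
    return num_queues * (num_queues - 1)


counts = {n: pairwise_counter_count(n) for n in (2, 4, 8, 16)}
for n, count in counts.items():
    print(f"N={n:2d} -> {count} state synchronization counters")
```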
  • an embodiment of the present disclosure provides an instruction synchronization device, see Figure 2 and Figure 6, the device includes:
  • a counter 201 and a plurality of instruction queues 202 coupled to the counter 201; each instruction queue in the plurality of instruction queues 202 (for example, instruction queue 1, instruction queue 2, instruction queue 3, instruction queue 4, etc.) is used to store instructions;
  • the instructions include an execution instruction, and at least one of a trigger instruction Trigger and a wait instruction Wait;
  • the counter 201 is used to perform a first adjustment on the count value in response to receiving the trigger instruction Trigger, and to perform a second adjustment on the count value in response to receiving the wait instruction Wait; the adjustment method of the first adjustment is different from the adjustment method of the second adjustment;
  • the instructions located after the waiting instruction Wait in an instruction queue 202 are sent when the count value satisfies a preset numerical condition, and the preset numerical condition is determined based on the initial count value of the counter 201, the adjustment method of the first adjustment, and the adjustment method of the second adjustment.
  • the number of counters (also referred to as synchronization counters) 201 can be greater than or equal to 1, and the number of instruction queues 202 can be greater than or equal to 2.
  • An instruction queue 202 may be used to store execution instructions, and the execution instructions are used to instruct the execution unit to perform operations.
  • the execution instruction is sent from the instruction queue 202 to the execution unit 203 for execution.
  • the execution instruction may include, but is not limited to, at least one of various operation instructions such as an addition instruction, a multiplication instruction, and a convolution multiplication instruction.
  • the execution unit may perform operations on the acquired target data in response to the execution instruction.
  • the target data may be data in various forms such as image data, voice data, and text data.
  • Instruction queue 202 may be implemented as a memory.
  • Instructions in one or more instruction queues may have a dependency relationship with instructions in another one or more instruction queues.
  • a trigger instruction Trigger and a wait instruction Wait may be inserted into the instruction queue.
  • the instructions in each instruction queue may have different dependencies at different times. For example, at time t1, the instructions in instruction queue 1 need to wait for the instructions in instruction queue 2 to be sent before they can be sent (this situation is described as instruction queue 1 waiting for instruction queue 2; instruction queue 1 is called the waiting queue, and instruction queue 2 is called the waited queue); at time t2, there is no dependency relationship between the instructions in instruction queue 1 and the instructions in instruction queue 2; at time t3, instruction queue 2 needs to wait for instruction queue 1.
  • the case in which instruction queue 1 waits for instruction queue 2 is called a synchronization event; the waiting queue involved in a synchronization event is called the waiting queue corresponding to the synchronization event, and the waited queue involved in a synchronization event is called the waited queue corresponding to the synchronization event.
  • the same instruction queue may correspond to one or more synchronization events. For example, at a certain moment, an instruction in instruction queue 1 needs to wait for an instruction in instruction queue 2 to be sent before it can be sent; at another moment, an instruction in instruction queue 1 needs to wait for two instructions in instruction queue 3 to be sent before it can be sent; at yet another moment, three instructions in instruction queue 4 need to wait for two instructions in instruction queue 1 to be sent before they can be sent. In this example, instruction queue 1 corresponds to 3 synchronization events.
  • the instruction queue can store the instructions in the order in which they were received, and can also parse the stored instructions. Each instruction may include the type of the instruction, and the instruction queue may determine from the type whether the instruction is an execution instruction, a trigger instruction or a waiting instruction. Referring to FIG. 3, instruction 1, instruction 2, instruction 3, etc. represent execution instructions, T and W represent a trigger instruction and a waiting instruction respectively, and the instructions after the waiting instruction in instruction queue 1 can be sent only after all instructions before the trigger instruction in instruction queue 2 have been sent. After the instruction queue parses out an execution instruction, it can send the execution instruction to the execution unit; after parsing out a trigger instruction or a waiting instruction, it can send the trigger instruction or the waiting instruction to the counter 201.
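  • The parse-and-dispatch behaviour described above can be sketched in Python as follows. The Kind enumeration, the Instruction dataclass and the dispatch function are hypothetical names introduced for this illustration; the disclosure only states that each instruction carries a type that the queue inspects.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    EXEC = "exec"
    TRIGGER = "trigger"
    WAIT = "wait"


@dataclass(frozen=True)
class Instruction:
    kind: Kind  # the instruction type inspected by the queue
    name: str


def dispatch(queue):
    """Split a queue, in stored order, by destination: execution unit or counter."""
    to_execution_unit, to_counter = [], []
    for instruction in queue:
        if instruction.kind is Kind.EXEC:
            to_execution_unit.append(instruction.name)
        else:  # trigger and waiting instructions both go to the counter
            to_counter.append(instruction.name)
    return to_execution_unit, to_counter


queue_2 = [
    Instruction(Kind.EXEC, "instruction 1"),
    Instruction(Kind.EXEC, "instruction 2"),
    Instruction(Kind.TRIGGER, "T"),
    Instruction(Kind.EXEC, "instruction 3"),
]
exec_out, counter_out = dispatch(queue_2)
```

  • This sketch only shows the routing decision; it does not model when a waiting instruction is allowed to leave the queue.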
  • the triggering instruction and the waiting instruction can respectively trigger the counting value of the counter to be adjusted in a first adjustment manner and a second adjustment manner. For example, by sending a trigger instruction to the counter, the counter can be triggered to increase the count value, and by sending a wait instruction to the counter, the counter can be triggered to decrease the count value. Alternatively, by sending a trigger instruction to the counter, the counter can be triggered to decrease the count value, and by sending a wait instruction to the counter, the counter can be triggered to increase the count value.
  • one of the first adjustment and the second adjustment increases the count value according to a first preset step size, and the other decreases the count value according to a second preset step size.
  • for example, each trigger instruction may trigger the counter to add 1 to the count value, and each waiting instruction may trigger the counter to decrease the count value by 1.
  • other positive integers may also be used as the first preset step size and the second preset step size. For example, both the first preset step size and the second preset step size may be 2, in which case each trigger instruction can trigger the counter to add 2 to the count value, and each waiting instruction can trigger the counter to decrease the count value by 2.
  • the first preset step size and the second preset step size may not be equal. Specifically, the first preset step size is determined based on the number of waiting queues related to the target synchronization event in the multiple instruction queues, and the second preset step size is determined based on the number of waited queues related to the target synchronization event in the multiple instruction queues.
  • for example, when n instruction queues wait for m instruction queues, the first preset step size can be set to n*a and the second preset step size can be set to m*a, where a is a positive integer.
  • the trigger instruction carries the first preset step size
  • the wait instruction carries the second preset step size. Since the instruction carries the step size information, the counter can directly read the corresponding step size information from the above instruction after receiving the trigger instruction or the waiting instruction, so as to determine the step size for increasing or decreasing the count value.
  • the trigger instruction may also carry identification information used to characterize the adjustment method of the first adjustment
  • the waiting instruction may also carry identification information used to characterize the adjustment method of the second adjustment, so that the counter can determine which of the two adjustment methods to use when adjusting the count value.
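  • A counter that reads both the step size and the adjustment method from the instruction itself might look like the sketch below. The field names step and increase, and the classes SyncInstruction and SyncCounter, are assumptions for illustration; the disclosure does not define how this information is encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncInstruction:
    kind: str       # "trigger" or "wait"
    step: int       # preset step size carried by the instruction
    increase: bool  # hypothetical flag identifying the adjustment method


class SyncCounter:
    def __init__(self, initial=0):
        self.value = initial

    def apply(self, instruction):
        """Adjust the count using only information carried by the instruction."""
        if instruction.step <= 0:
            raise ValueError("step size must be a positive integer")
        delta = instruction.step if instruction.increase else -instruction.step
        self.value += delta
        return self.value


counter = SyncCounter(initial=0)
after_trigger = counter.apply(SyncInstruction("trigger", step=2, increase=True))
after_wait = counter.apply(SyncInstruction("wait", step=2, increase=False))
```

  • Because the counter takes everything it needs from the instruction, the same counter can serve synchronization events with different step sizes.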
  • the instructions after the waiting instruction in an instruction queue are sent only when the count value of the counter satisfies the preset numerical condition, so that the sending order of the instructions in the instruction queue can be controlled based on the count value of the counter, thereby realizing instruction synchronization among the plurality of instruction queues.
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • the preset numerical condition may be that the count value of the counter and the target count value satisfy a preset numerical relationship.
  • the product of the first preset step size, the second preset step size and a preset multiple may be calculated, the product may be summed with the initial count value, and the target count value may be determined based on the result of the summation.
  • denoting the first preset step size as n, the second preset step size as m, the preset multiple as a, and the initial count value as k0, the target count value can be written as n*m*a-1+k0, where the preset multiple is a positive integer.
  • when the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is less than the target count value.
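  • For the case in which the trigger instruction increases the count value, the target count value and the release condition can be written out as the following Python sketch. The function names target_count and may_send_after_wait are hypothetical, the formula is taken as stated in the text, and the example values (n = 2, m = 3, a = 1, k0 = 0) are chosen only for illustration.

```python
def target_count(n, m, a=1, k0=0):
    """Target count value n*m*a - 1 + k0, as stated in the text."""
    return n * m * a - 1 + k0


def may_send_after_wait(count, n, m, a=1, k0=0):
    """Release condition when triggers increase the count: count > target."""
    return count > target_count(n, m, a, k0)


target = target_count(n=2, m=3)               # 2*3*1 - 1 + 0
blocked = may_send_after_wait(5, n=2, m=3)    # count equal to target: not released
released = may_send_after_wait(6, n=2, m=3)   # count above target: released
```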
  • in the embodiment shown in FIG. 3, both m and n are equal to 1, and it is assumed that the preset multiple a is also equal to 1 and the initial value of the counter is 0.
  • Instruction 4 and instruction 5 in instruction queue 1 need to wait for instruction 1 and instruction 2 in instruction queue 2 to be sent before being sent. Therefore, the trigger instruction T can be inserted after the instruction 2 in the instruction queue 2, and the waiting instruction W can be inserted before the instruction 4 in the instruction queue 1.
  • instruction queue 2 can parse the instructions in this queue and send the parsed instructions sequentially.
  • instruction 1 and instruction 2 are execution instructions, which can be sent to the execution unit in sequence. At the same time, instruction queue 1 can send the execution instructions in its own queue (that is, instruction 1, instruction 2 and instruction 3) in parallel.
  • Instructions with the same label in the two instruction queues, such as instruction 1 and instruction 2, may be the same instruction or different instructions.
  • the label of an instruction is only used to indicate the relative position of the instruction in the instruction queue to which it belongs, and is not intended to indicate the content or type of the instruction. Since the sending and completion times of instructions without dependencies in different instruction queues are random, it is possible that instruction 3 in instruction queue 1 is sent before instruction 2 in instruction queue 2, or that instruction 2 in instruction queue 2 is sent before instruction 3 in instruction queue 1.
  • after instruction queue 2 parses the trigger instruction T, it sends the trigger instruction T to the counter.
  • after the counter receives the trigger instruction T, it increments the count value by 1.
  • after instruction queue 1 parses the waiting instruction W, it can read the count value of the counter. Only when the count value of the counter is greater than 0 does instruction queue 1 send the waiting instruction W and the instructions after it; otherwise, the waiting instruction W and the instructions following it are not sent. When the counter receives the waiting instruction W, it decrements the count value by 1.
  • in this way, a synchronization event is completed. If there are other synchronization events between the two instruction queues, instruction synchronization continues according to the above process.
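  • The one-waits-for-one process above can be exercised with a small simulation. The Python below is a sketch under stated assumptions: the scheduler picks a random ready queue at each step to mimic the random relative timing of independent instructions, the step sizes are both 1, and the names simulate, queues and the seed parameter are invented for this illustration.

```python
import random
from collections import deque


def simulate(queues, seed, initial=0):
    """Send instructions from several queues in a random interleaving.

    "T" adds 1 to the counter; "W" may leave its queue only while the count
    is greater than the initial value, and then subtracts 1.
    """
    rng = random.Random(seed)
    count = initial
    pending = {name: deque(items) for name, items in queues.items()}
    order = []
    while any(pending.values()):
        # A queue is ready unless it is empty or its head is a blocked W.
        ready = [name for name, q in pending.items()
                 if q and not (q[0] == "W" and count <= initial)]
        if not ready:
            raise RuntimeError("every remaining queue is blocked on W")
        name = rng.choice(ready)
        head = pending[name].popleft()
        if head == "T":
            count += 1
        elif head == "W":
            count -= 1
        order.append((name, head))
    return order, count


queues = {
    "queue 1": ["instruction 1", "instruction 2", "instruction 3",
                "W", "instruction 4", "instruction 5"],
    "queue 2": ["instruction 1", "instruction 2", "T", "instruction 3"],
}
order, final_count = simulate(queues, seed=0)
```

  • Whatever interleaving the seed produces, T in queue 2 is sent before W in queue 1, instruction 4 follows W, and the counter ends at its initial value.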
  • in other embodiments, both m and n are positive integers greater than 1, and m and n may not be equal.
  • This situation is described as n instruction queues waiting for m instruction queues.
  • instruction 4 and instruction 5 in each waiting queue (that is, instruction queue A1 to instruction queue An above the counter) need to wait until instruction 3 in all waited queues (that is, instruction queue B1 to instruction queue Bm below the counter) has been sent.
  • each time a waited queue sends a trigger instruction to the counter, the count value of the counter can be increased by n, so that after all waited queues have sent their trigger instructions to the counter, the count value of the counter is n*m.
  • Each waiting queue can read the count value of the counter when it parses the waiting instruction. If the count value is greater than n*m-1, the waiting queue can send the waiting instruction so that the counter decrements the count value by m. After all the waiting queues have sent their waiting instructions, the count value of the counter returns to 0. In this way, a synchronization event is completed.
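  • The counter arithmetic for n instruction queues waiting for m instruction queues can be traced with the sketch below. It only follows the count value: the condition is checked once after all trigger instructions have arrived, and how the hardware evaluates it for several waiting queues relative to the individual decrements is not modelled. The function name trace_sync_event and the example values are assumptions for illustration.

```python
def trace_sync_event(n_waiting, m_waited, a=1, k0=0):
    """Trace the count value through one synchronization event.

    Each of the m waited queues sends one trigger (+n*a); each of the
    n waiting queues then sends one wait (-m*a).
    """
    trigger_step = n_waiting * a
    wait_step = m_waited * a
    count = k0
    trace = [count]
    for _ in range(m_waited):
        count += trigger_step
        trace.append(count)
    # Condition from the text: count greater than n*m*a - 1 + k0.
    released = count > n_waiting * m_waited * a - 1 + k0
    for _ in range(n_waiting):
        count -= wait_step
        trace.append(count)
    return trace, released


trace, released = trace_sync_event(n_waiting=2, m_waited=3)
```

  • With two waiting queues and three waited queues the count climbs to 6 in steps of 2, exceeds the threshold of 5, and then falls back to 0 in steps of 3.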
  • the synchronization process of the synchronization event is similar to the embodiment shown in FIG. 3 and FIG. 4 , and will not be repeated here.
  • the only difference is that after the counter receives a trigger instruction, it subtracts the corresponding value from the count value, which starts at the initial count value (for example, 5), and after the counter receives a waiting instruction, it increases the count value by the corresponding value; only when the count value is less than the initial count value can the instructions after the waiting instruction continue to be sent. In this way, a synchronization event is also completed.
  • a trigger instruction in a later synchronization event may falsely trigger a wait instruction in a previous synchronization event.
  • FIG. 5A and FIG. 5B suppose there are two synchronization events, namely: (1) instruction queues 1 and 2 wait for instruction queues 3, 4, and 5; and (2) instruction queues 1 and 2 wait for instruction queue 3.
  • T1 in instruction queue 3, instruction queue 4, and instruction queue 5, and W1 in instruction queue 1 and instruction queue 2, correspond to the first synchronization event
  • T2 in instruction queue 3 and W2 in instruction queue 1 and instruction queue 2 correspond to the second synchronization event.
  • the command queues obtained by inserting trigger commands and waiting commands in the foregoing embodiments are shown in Case 1 in FIG.
  • the trigger instruction T2 may be sent earlier than instruction queue 4 sends the trigger instruction T1, and the sending time sequence of each instruction is shown in the sending order of the instructions in Case 1 in FIG. 5B.
  • the sending order of the trigger instructions in instruction queues 3, 4, and 5 is shown in the figure.
  • n is 2 and m is 3, that is, the waiting instruction in a waiting queue and the execution instructions after it can be sent only when the count value of the counter is greater than 5.
  • if T2 of instruction queue 3 is sent before T1 of instruction queue 4, then after instruction queue 4 sends T1, the count value of the counter reaches 6, which triggers the waiting queues to send the related instructions.
  • at this point, however, T1 of instruction queue 5 has not been sent, so the related instructions in the waiting queues do not actually meet the sending condition. It can be seen that, in the above case, T2 in instruction queue 3 mistakenly triggers W1 of synchronization event 1.
  • the plurality of instruction queues in the embodiments of the present disclosure include a waited-for queue and a waiting queue; the waited-for queue includes a plurality of trigger instructions, and each trigger instruction in the same waited-for queue corresponds to a synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter; a target waiting instruction is further included after the first trigger instruction in the waited-for queue; and a target trigger instruction is further included after the first waiting instruction in the waiting queue.
  • the number of waited-for queues can be greater than or equal to 1, and the number of waiting queues can also be greater than or equal to 1.
  • the first trigger instructions corresponding to different synchronization events can come from one or more waited-for queues, and the corresponding first waiting instructions can come from one or more waiting queues.
  • the first trigger instruction corresponding to the first synchronization event, the first trigger instruction corresponding to the second synchronization event, ..., and the first trigger instruction corresponding to the (N-1)th synchronization event are each followed by a target waiting instruction; and the first waiting instruction corresponding to the first synchronization event, the first waiting instruction corresponding to the second synchronization event, ..., and the first waiting instruction corresponding to the (N-1)th synchronization event are each followed by a target trigger instruction.
  • the first trigger instruction corresponding to the Nth synchronization event may or may not be followed by a target waiting instruction
  • the first waiting instruction corresponding to the Nth synchronization event may or may not be followed by a target trigger instruction.
  • each of the multiple waited-for queues may include a target waiting instruction after the first trigger instruction in that waited-for queue.
  • a synchronization event is that instruction 1 in instruction queue 2 waits for instruction 2 in instruction queue 3 and instruction 3 in instruction queue 4
  • both instruction queue 3 and instruction queue 4 include the first trigger instruction corresponding to the synchronization event, so that the first trigger instruction in instruction queue 3 is followed by a target waiting instruction, and the first trigger instruction in instruction queue 4 is also followed by a target waiting instruction.
  • each of the multiple waiting queues may include a target trigger instruction after the first waiting instruction in each waiting queue.
  • a synchronization event is that instruction 1 in instruction queue 1 and instruction 1 in instruction queue 2 wait for instruction 2 in instruction queue 3
  • both instruction queue 1 and instruction queue 2 include the first waiting instruction corresponding to the synchronization event, so that the first waiting instruction in instruction queue 1 is followed by a target trigger instruction, and the first waiting instruction in instruction queue 2 is also followed by a target trigger instruction.
  • the waited-for queue Mi includes a target waiting instruction after the first trigger instruction corresponding to the synchronization event S; if the synchronization event S is the last trigger event in a waited-for queue Mj, then no target waiting instruction is included after the first trigger instruction corresponding to the synchronization event S in the waited-for queue Mj.
  • the waiting queue Ni includes a target trigger instruction after the first waiting instruction corresponding to the synchronization event S; if the synchronization event S is the last waiting event in a waiting queue Nj, then no target trigger instruction is included after the first waiting instruction corresponding to the synchronization event S in the waiting queue Nj, as shown in Case 2 in FIG. 5A.
  • the first trigger instruction is T1 in the instruction queue 3, and the target waiting instruction is Wait after T1 in the instruction queue 3.
  • a waiting command is W1 in command queue 1 and W1 in command queue 2
  • the above-mentioned target trigger commands are T after W1 in command queue 1 and T after W1 in command queue 2 .
  • T1 in instruction queue 3 may be sent first. However, because a Wait (W) is inserted after T1 in instruction queue 3, once T1 in instruction queue 3 has been sent while T1 in instruction queue 4 and instruction queue 5 have not, W in instruction queue 3 does not meet the sending condition, so instruction queue 3 enters a waiting state after sending T1. Similarly, W1 in instruction queue 1 and W1 in instruction queue 2 cause instruction queue 1 and instruction queue 2 to enter the waiting state as well. Therefore, T1 in instruction queue 4 and T1 in instruction queue 5 will be sent first. In this way, it is guaranteed that the trigger instruction of the second synchronization event can take effect only after the waiting instructions of the first synchronization event have been handled normally.
  • the number of synchronization events mapped to the same counter is 2, that is, there is at least one waited-for queue (instruction queue 3 in the figure) corresponding to two synchronization events.
  • the number of synchronization events mapped to the same counter may also be greater than 2, and there may also be more than one waited-for queue corresponding to multiple synchronization events.
  • although the waiting queues in the two synchronization events shown in the figure are the same (both are instruction queue 1 and instruction queue 2), in practical applications the waiting queues in different events may also be partly the same or entirely different.
  • the position of the target waiting instruction after the first trigger instruction does not have to be adjacent to the first trigger instruction, as long as it is between the first trigger instruction and the next first trigger instruction after the first trigger instruction.
  • the position of the target trigger instruction after the first waiting instruction does not have to be adjacent to the first waiting instruction, as long as it is between the first waiting instruction and the next first waiting instruction after the first waiting instruction.
  • W after T1 can be in any position between T1 and T2.
  • T after W1 can be in any position between W1 and W2 .
  • false triggering of the synchronization event can also be avoided by inserting a target waiting instruction and a target triggering instruction, which will not be repeated here.
  • the number of counters 201 is greater than one.
  • the trigger instruction and the waiting instruction included in each instruction queue may include identification information of the counter, which is used for the instruction queue to send the trigger instruction and the waiting instruction included in the instruction queue to the corresponding counter.
  • the identification information of the trigger instruction and the identification information of the waiting instruction may be respectively bound with the identification information of the counter, so that the instruction queue sends the trigger instruction and the waiting instruction to the corresponding counter.
  • the instruction synchronization device further includes an execution unit 203, configured to execute the received execution instruction.
  • the number of execution units may be greater than or equal to 1, and one execution unit may receive and process execution instructions of one or more instruction queues.
  • an execution unit may be a subunit in a processing unit capable of executing instructions, and the processing unit may be divided into multiple groups of execution units according to different granularities, and each group of execution units may be used to execute a processing task (for example, addition operation). In different situations, different partition granularities can be adopted according to different actual needs.
  • This division method has high flexibility. When a task requires a small number of execution units, the execution units can be divided into more groups, thereby improving the parallelism of task processing.
  • after the execution unit finishes processing an instruction, it can also return an acknowledgment signal (ACK) to the arbitration unit, so that the arbitration unit can continue to send new instructions.
  • the instruction synchronization device further includes an arbitration unit 204, configured to send the execution instructions sent by each instruction queue in the plurality of instruction queues to the execution unit according to a preset priority.
  • the number of arbitration units 204 may be greater than or equal to 1.
  • FIG. 6 shows that the number of arbitration units is equal to the number of instruction queues, but in practical applications, the numbers of the two may not be equal.
  • the instruction synchronization device further includes a multiplexer 205, configured to send the instruction to a corresponding instruction queue in the plurality of instruction queues.
  • both the triggering instruction and the waiting instruction include identification information of an instruction queue, for the multiplexer 205 to send the triggering instruction and the waiting instruction to the corresponding instruction queue.
  • the present disclosure provides a hardware-implemented dynamic deployment mechanism of the instruction queue state synchronization counter, which saves counter overhead and eliminates the risk of counter overflow; at the same time, the dynamic deployment of the counter is implemented by hardware, and efficient and flexible multi-process scheduling is also realized.
  • the embodiments of the present disclosure have strong scalability, and the number of counters, the number of instruction queues, and the number of execution units can be adjusted according to requirements.
  • the instruction synchronization device of the embodiments of the present disclosure can be applied to processing chips such as artificial intelligence chips, graphics processing chips, etc., to realize efficient and flexible instruction queue deployment and scheduling, thereby improving the parallel efficiency of execution units.
  • Various instructions in the above embodiments can be generated in advance by offline compilation. Each instruction can be sequentially generated and sent to the instruction queue according to the required order during offline compilation.
  • the present disclosure further provides a chip, the chip including the instruction synchronization device described in any embodiment of the present disclosure.
  • the aforementioned chips may be artificial intelligence chips or graphics processing chips, or other types of processing chips.
  • the instruction synchronization device in this chip embodiment reference may be made to the aforementioned embodiments of the instruction synchronization device, and details are not repeated here.
  • An embodiment of the present disclosure further provides a computer device, where the computer device includes the chip described in any embodiment of the present disclosure.
  • the specific functions of the chip reference may be made to the description of the chip embodiments above, and for the sake of brevity, details are not repeated here.
  • an embodiment of the present disclosure also provides a data processing method, which is applied to the instruction synchronization device described in any embodiment of the present disclosure, and the method includes:
  • Step 701 Each instruction queue in the plurality of instruction queues stores an instruction, the instruction includes an execution instruction, and at least one of a trigger instruction and a waiting instruction;
  • Step 702 The counter coupled to the plurality of instruction queues performs a first adjustment on the count value in response to receiving a trigger instruction, and performs a second adjustment on the count value in response to receiving a waiting instruction; the adjustment mode of the first adjustment is different from the adjustment mode of the second adjustment;
  • Step 703 When the count value satisfies a preset numerical condition, each instruction queue sends the instructions located after the waiting instruction in the queue; the preset numerical condition is determined based on the initial count value of the counter, the adjustment mode of the first adjustment, and the adjustment mode of the second adjustment.
  • the method further includes: each instruction queue parses the stored instructions, sends the parsed trigger instructions and waiting instructions to the counter, and sends the parsed execution instructions to the execution unit.
  • the counter performing a first adjustment on the count value in response to receiving the trigger instruction, and performing a second adjustment on the count value in response to receiving the waiting instruction, includes: increasing the count value by a first preset step size in response to receiving the trigger instruction, and decreasing the count value by a second preset step size in response to receiving the waiting instruction; or, decreasing the count value by the first preset step size in response to receiving the trigger instruction, and increasing the count value by the second preset step size in response to receiving the waiting instruction.
  • the first preset step size is determined based on the number of waiting queues related to the target synchronization event in the plurality of instruction queues
  • the second preset step size is determined based on the number of waited-for queues related to the target synchronization event in the plurality of instruction queues.
  • the method further includes: acquiring, by the counter, the first preset step size carried in the trigger instruction, and acquiring the second preset step size carried in the waiting instruction.
  • the first preset step size is equal to the product of the number of waiting queues and a preset multiple
  • the second preset step size is equal to the product of the number of waited-for queues and the preset multiple.
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • the preset numerical condition is: the count value and the target count value satisfy a preset numerical relationship, and the target count value is: n*m*a-1+k0; wherein n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value.
  • when the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is smaller than the target count value.
  • the plurality of instruction queues include a waited-for queue and a waiting queue; the waited-for queue includes a plurality of trigger instructions, and each trigger instruction in the same waited-for queue corresponds to a synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter; a target waiting instruction is further included after the first trigger instruction in the waited-for queue; a target trigger instruction is further included after the first waiting instruction in the waiting queue; the first trigger instruction and the first waiting instruction are the instructions corresponding to synchronization events other than the last synchronization event.
  • the number of counters is greater than 1; the method further includes: each instruction queue acquires the identification information of the counter included in the trigger instructions and waiting instructions in the queue, and, based on that identification information, sends the trigger instructions and waiting instructions in the queue to the corresponding counter.
  • sending the parsed execution instructions to the execution unit includes: sending the parsed execution instructions to the arbitration unit, so that the arbitration unit sends the execution instructions sent by each instruction queue in the plurality of instruction queues to the execution unit.
  • the method further includes: each instruction queue acquires the instruction sent by the multiplexer, and stores the instruction sent by the multiplexer.
  • each instruction queue acquiring the instructions sent by the multiplexer includes: each instruction queue acquiring, based on the identification information of the instruction queue included in the instructions, the instructions that the multiplexer sends to this queue.
  • a typical implementing device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smart phone, personal digital assistant, media player, navigation device, e-mail device, game console, tablet computer, wearable device, or a combination of any of these devices.
  • each embodiment in this specification is described in a progressive manner, the same and similar parts of each embodiment can be referred to each other, and each embodiment focuses on the differences from other embodiments.
  • the description is relatively simple, and for relevant parts, please refer to part of the description of the method embodiment.
  • the device embodiments described above are only illustrative, and the modules described as separate components may or may not be physically separated, and the functions of each module may be integrated in the same or multiple software and/or hardware implementations. Part or all of the modules can also be selected according to actual needs to achieve the purpose of the solution of this embodiment. It can be understood and implemented by those skilled in the art without creative effort.

Abstract

Embodiments of the present disclosure provide an instruction synchronization apparatus, a chip, a computer device, and a data processing method. Trigger instructions and waiting instructions are inserted into instruction queues, a count value of a counter is adjusted according to the trigger instructions and the waiting instructions, and the instructions after a waiting instruction can be sent only when the count value satisfies a preset numerical condition. In this manner, the sending order of the instructions in the plurality of instruction queues can be controlled, so that instruction synchronization among the plurality of instruction queues is realized. According to the embodiments, the synchronization among the instruction queues is realized in hardware, and the efficiency of the hardware system is improved.

Description

Instruction synchronization device, chip, computer device, and data processing method
Cross-Reference to Related Disclosure
This disclosure claims priority to Chinese patent application No. 202111652996.0, filed on December 30, 2021, the entire content of which is incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of artificial intelligence, and in particular to an instruction synchronization device, a chip, a computer device, and a data processing method.
Background
In graphics processors and artificial intelligence accelerator chips, multiple processes are commonly used to increase the computing power of the processor and to process data in parallel. Large numbers of processes working in parallel have synchronization dependencies; for example, an instruction included in one process can be sent only after an instruction included in another process has been sent. It is therefore necessary to design a synchronization mechanism between instructions. However, the instruction synchronization mechanisms in the related art result in low efficiency of the hardware system.
Summary
In a first aspect, an embodiment of the present disclosure provides an instruction synchronization device, which includes a counter and a plurality of instruction queues, the counter being coupled to the plurality of instruction queues. Each instruction queue in the plurality of instruction queues is configured to store instructions, and the instructions include execution instructions and at least one of a trigger instruction and a waiting instruction. The counter is configured to perform a first adjustment on a count value in response to receiving a trigger instruction, and to perform a second adjustment on the count value in response to receiving a waiting instruction; the adjustment mode of the first adjustment is different from the adjustment mode of the second adjustment. An instruction located after the waiting instruction in an instruction queue is sent when the count value satisfies a preset numerical condition, and the preset numerical condition is determined based on the initial count value of the counter, the adjustment mode of the first adjustment, and the adjustment mode of the second adjustment.
In some embodiments, the device further includes an execution unit, which is coupled to the plurality of instruction queues and configured to execute the received execution instructions; each instruction queue is configured to parse the stored instructions, send the parsed trigger instructions and waiting instructions to the counter, and send the parsed execution instructions to the execution unit.
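The interaction between one counter and two instruction queues can be sketched as a small simulation. This is only an illustrative model, not the hardware design: it assumes an initial count value of 0, a trigger that adds 1, a waiting instruction that subtracts 1, and the condition "count value greater than 0", which matches the simplest one-queue-waits-for-one-queue case. The function name `simulate`, the round-robin scheduling, and the instruction labels are hypothetical.

```python
from collections import deque

def simulate(queues, max_rounds=100):
    """Round-robin model: each round, every queue tries to send its head instruction.

    "T" is a trigger instruction, "W" is a waiting instruction, and anything
    else is treated as an execution instruction. Returns the send order of
    execution instructions as (queue_name, instruction) pairs.
    """
    count = 0  # assumed initial count value of the single shared counter
    pending = {name: deque(instrs) for name, instrs in queues.items()}
    sent = []
    for _ in range(max_rounds):
        progressed = False
        for name, queue in pending.items():
            if not queue:
                continue
            head = queue[0]
            if head == "T":
                count += 1        # first adjustment: a trigger increments
            elif head == "W":
                if count <= 0:
                    continue      # condition not met: this queue stays blocked
                count -= 1        # second adjustment: a wait decrements
            else:
                sent.append((name, head))
            queue.popleft()
            progressed = True
        if not progressed:
            break
    return sent

# queue1 must not send instr2 until queue2 has sent its own instr2 (and then T).
order = simulate({
    "queue1": ["instr1", "W", "instr2"],
    "queue2": ["instr1", "instr2", "T", "instr3"],
})
```

In this run, `queue1` stalls at `W` until `queue2` has issued `T`, so `queue1`'s `instr2` is sent only after `queue2`'s `instr2`.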
In some embodiments, the adjustment mode of one of the first adjustment and the second adjustment is to increase the count value by a first preset step size, and the adjustment mode of the other is to decrease the count value by a second preset step size; the first preset step size is determined based on the number of waiting queues related to a target synchronization event in the plurality of instruction queues, and the second preset step size is determined based on the number of waited-for queues related to the target synchronization event in the plurality of instruction queues.
In some embodiments, the trigger instruction carries the first preset step size, and the waiting instruction carries the second preset step size.
In some embodiments, the first preset step size is equal to the product of the number of waiting queues and a preset multiple, and the second preset step size is equal to the product of the number of waited-for queues and the preset multiple.
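The step-size rule can be checked with a few lines of arithmetic. The sketch below assumes the "trigger increases, wait decreases" variant and one trigger per waited-for queue and one waiting instruction per waiting queue; the helper names `step_sizes` and `final_count` are hypothetical.

```python
def step_sizes(num_waiting, num_waited_for, multiple=1):
    """Return (trigger_step, wait_step) for one synchronization event."""
    trigger_step = num_waiting * multiple     # first preset step size
    wait_step = num_waited_for * multiple     # second preset step size
    return trigger_step, wait_step

def final_count(num_waiting, num_waited_for, multiple=1, initial=0):
    """Count value after every trigger and every wait of one event is applied."""
    trigger_step, wait_step = step_sizes(num_waiting, num_waited_for, multiple)
    count = initial
    count += num_waited_for * trigger_step    # one trigger per waited-for queue
    count -= num_waiting * wait_step          # one wait per waiting queue
    return count
```

With 2 waiting queues and 3 waited-for queues, `step_sizes(2, 3)` gives steps of 2 and 3, the three triggers raise the count to 6, and the two waits bring it back to the initial value, so the counter is balanced once the event completes.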
In some embodiments, the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
In some embodiments, the preset numerical condition is: the count value and a target count value satisfy a preset numerical relationship, and the target count value is: n*m*a-1+k0; wherein n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value. When the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is smaller than the target count value.
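As a worked instance of the target count value, the sketch below evaluates n*m*a-1+k0 and the "greater than" relationship for the variant in which triggers increase the count. It deliberately takes a = 1, so that the step sizes coincide with the queue counts, and it does not model the decreasing variant. The function names `target_count` and `may_send` are hypothetical.

```python
def target_count(n, m, a=1, k0=0):
    """Target count value n*m*a - 1 + k0 as stated in the text."""
    return n * m * a - 1 + k0

def may_send(count, n, m, a=1, k0=0):
    """Send condition for the 'trigger increases, wait decreases' variant."""
    return count > target_count(n, m, a, k0)

# Example: n = 2, m = 3, a = 1, k0 = 0 gives a target of 5,
# so the waiting instruction may be sent only once the count reaches 6.
threshold = target_count(2, 3)
```

With these numbers, a count of 4 (two of three triggers received) does not satisfy the condition, while a count of 6 (all three received) does.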
In some embodiments, the plurality of instruction queues include a waited-for queue and a waiting queue; the waited-for queue includes a plurality of trigger instructions, and each trigger instruction in the same waited-for queue corresponds to a synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter; a target waiting instruction is further included after the first trigger instruction in the waited-for queue; and a target trigger instruction is further included after the first waiting instruction in the waiting queue.
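The reason for the extra target waiting and target trigger instructions can be illustrated with counter arithmetic alone. The sketch assumes 2 waiting queues and 3 waited-for queues (queues 3, 4, and 5), a trigger step of 2, and a send condition of "count greater than 5"; it only shows which triggers the counter has seen when the condition first holds, and does not model the inserted guard instructions themselves. The name `first_release` and the queue labels are hypothetical.

```python
N_WAITING, N_WAITED_FOR = 2, 3
THRESHOLD = N_WAITING * N_WAITED_FOR - 1   # 5: send when count > 5

def first_release(trigger_order):
    """Return the triggers the counter has seen when the wait condition first holds."""
    count, seen = 0, []
    for trigger in trigger_order:
        count += N_WAITING          # each trigger adds the number of waiting queues
        seen.append(trigger)
        if count > THRESHOLD:
            return seen
    return None

# Unguarded order: queue 3 sends T2 (second event) before queue 4 sends T1.
unguarded = first_release([("q3", "T1"), ("q3", "T2"), ("q4", "T1"), ("q5", "T1")])

# Order the guard is meant to enforce: all T1 triggers arrive before any T2.
guarded = first_release([("q3", "T1"), ("q4", "T1"), ("q5", "T1"), ("q3", "T2")])
```

In the unguarded order the condition is met after only two of the three T1 triggers, because T2 was counted toward the first event; in the guarded order it is met exactly when the third T1 arrives.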
In some embodiments, the number of counters is greater than 1; the trigger instructions and waiting instructions included in each instruction queue each include identification information of a counter, and the instruction queue is configured to send the trigger instructions and waiting instructions it includes to the corresponding counter according to the identification information of the counter.
In some embodiments, the instruction synchronization device further includes an arbitration unit, which is coupled to the plurality of instruction queues and the execution unit and configured to send, according to a preset priority, the execution instructions sent by each instruction queue in the plurality of instruction queues to the execution unit.
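One simple reading of "preset priority" is a fixed-priority arbiter: among the queues that currently offer an execution instruction, the one earliest in a fixed priority list wins. The text does not specify the arbitration scheme, so this is an assumption, and the function name `arbitrate` and the queue labels are hypothetical.

```python
def arbitrate(offered, priority):
    """Pick the execution instruction to forward to the execution unit.

    offered:  dict mapping queue name -> instruction currently offered, or None
    priority: list of queue names, highest priority first
    Returns (queue_name, instruction), or None if no queue offers anything.
    """
    for name in priority:
        instruction = offered.get(name)
        if instruction is not None:
            return name, instruction
    return None

choice = arbitrate(
    {"queue1": None, "queue2": "add", "queue3": "mul"},
    priority=["queue1", "queue2", "queue3"],
)
```

Here `queue1` has nothing to send, so the arbiter forwards the instruction from `queue2`, the highest-priority queue with a pending instruction.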
In some embodiments, the instruction synchronization device further includes a multiplexer, which is coupled to the plurality of instruction queues and configured to send the instructions to the corresponding instruction queue in the plurality of instruction queues.
In some embodiments, both the trigger instruction and the waiting instruction include identification information of an instruction queue, and the multiplexer is further configured to send the trigger instruction and the waiting instruction to the corresponding instruction queue according to the identification information of the instruction queue.
In a second aspect, an embodiment of the present disclosure provides a chip, which includes the instruction synchronization device described in any embodiment of the present disclosure.
In a third aspect, an embodiment of the present disclosure provides a computer device, which includes the chip described in any embodiment of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure provides a data processing method, which is applied to the instruction synchronization device described in any embodiment of the present disclosure. The method includes: each instruction queue in a plurality of instruction queues stores instructions, the instructions including execution instructions and at least one of a trigger instruction and a waiting instruction; a counter coupled to the plurality of instruction queues performs a first adjustment on a count value in response to receiving a trigger instruction, and performs a second adjustment on the count value in response to receiving a waiting instruction, the adjustment mode of the first adjustment being different from the adjustment mode of the second adjustment; and when the count value satisfies a preset numerical condition, each instruction queue sends the instructions located after the waiting instruction in the queue, the preset numerical condition being determined based on the initial count value of the counter, the adjustment mode of the first adjustment, and the adjustment mode of the second adjustment.
In some embodiments, the method further includes: each instruction queue parsing its stored instructions, sending the parsed trigger instructions and wait instructions to the counter, and sending the parsed execution instructions to an execution unit.
In some embodiments, the counter performing the first adjustment on the count value in response to receiving a trigger instruction, and performing the second adjustment on the count value in response to receiving a wait instruction, includes: increasing the count value by a first preset step size in response to receiving a trigger instruction, and decreasing the count value by a second preset step size in response to receiving a wait instruction; or decreasing the count value by the first preset step size in response to receiving a trigger instruction, and increasing the count value by the second preset step size in response to receiving a wait instruction.
In some embodiments, the first preset step size is determined based on the number of waiting queues, among the plurality of instruction queues, that are related to a target synchronization event, and the second preset step size is determined based on the number of waited-on queues, among the plurality of instruction queues, that are related to the target synchronization event.
In some embodiments, the method further includes: the counter obtaining the first preset step size carried in the trigger instruction, and obtaining the second preset step size carried in the wait instruction.
In some embodiments, the first preset step size is equal to the product of the number of waiting queues and a preset multiple, and the second preset step size is equal to the product of the number of waited-on queues and the preset multiple.
In some embodiments, the preset numerical condition is determined jointly by the initial count value, the first preset step size, the second preset step size, and the preset multiple.
In some embodiments, the preset numerical condition is that the count value and a target count value satisfy a preset numerical relationship, the target count value being n*m*a-1+k0, where n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value.
In some embodiments, for each instruction queue, when the first adjustment increases the count value and the second adjustment decreases the count value, the numerical relationship is that the count value is greater than the target count value; when the first adjustment decreases the count value and the second adjustment increases the count value, the numerical relationship is that the count value is less than the target count value.
In some embodiments, the plurality of instruction queues include a waited-on queue and a waiting queue. The waited-on queue includes a plurality of trigger instructions, each trigger instruction in the same waited-on queue corresponds to one synchronization event, one synchronization event includes a corresponding wait instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter. A target wait instruction follows the first trigger instruction in the waited-on queue, and a target trigger instruction follows the first wait instruction in the waiting queue. The first trigger instruction and the first wait instruction are each the corresponding instruction of a synchronization event other than the last synchronization event.
In some embodiments, there is more than one counter, and the method further includes: each instruction queue obtaining the counter identification information included in the trigger instructions and wait instructions in that queue and, based on that identification information, sending those trigger instructions and wait instructions to the corresponding counters.
In some embodiments, sending the parsed execution instructions to the execution unit includes: sending the parsed execution instructions to an arbitration unit, so that the arbitration unit forwards the execution instructions sent by each of the plurality of instruction queues to the execution unit according to preset priorities.
In some embodiments, the method further includes: each instruction queue obtaining the instructions sent by a multiplexer and storing them.
In some embodiments, each instruction queue obtaining the instructions sent by the multiplexer includes: each instruction queue obtaining the instructions that the multiplexer sends to that queue based on the instruction queue identification information included in the instructions.
In the embodiments of the present disclosure, trigger instructions and wait instructions are inserted into the instruction queues and are used to adjust the count value of the counter. Because the instructions located after a wait instruction can be sent only when the count value satisfies the preset numerical condition, the order in which the instructions in the plurality of instruction queues are sent can be controlled, which achieves instruction synchronization among the plurality of instruction queues. The above embodiments implement synchronization between instruction queues in hardware, which improves the efficiency of the hardware system.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.
Description of Drawings
The accompanying drawings are incorporated into and constitute a part of this specification. They illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.
FIG. 1 is a schematic diagram of an instruction synchronization process.
FIG. 2 is a schematic structural diagram of an instruction synchronization device according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram of an instruction synchronization scheme according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of an instruction synchronization scheme according to another embodiment of the present disclosure.
FIG. 5A is a schematic diagram of an instruction synchronization scheme in which multiple synchronization events are mapped to the same synchronization counter, according to an embodiment of the present disclosure.
FIG. 5B is a schematic diagram of the order in which the instructions in the instruction queues of FIG. 5A are sent.
FIG. 6 is a schematic structural diagram of an instruction synchronization device according to another embodiment of the present disclosure.
FIG. 7 is a flowchart of a data processing method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms used in the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the present disclosure. The singular forms "a", "said", and "the" used in the present disclosure and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. In addition, the term "at least one" herein means any one of a plurality of items or any combination of at least two of them.
It should be understood that although the terms first, second, third, and so on may be used in the present disclosure to describe various information, the information should not be limited by these terms. The terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information and, similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
To help those skilled in the art better understand the technical solutions in the embodiments of the present disclosure, and to make the above objects, features, and advantages of those embodiments clearer, the technical solutions are described in further detail below with reference to the accompanying drawings.
In multi-process parallel processing scenarios, many of the processes working in parallel have synchronization dependencies. For example, an instruction in one process may be sent only after an instruction in another process has been sent.
Instructions may first be distributed to the instruction queues according to preset distribution rules, and instructions in different instruction queues may or may not depend on one another. When the instructions in two instruction queues have no dependency, they can be sent in parallel, which increases the degree of parallelism between instructions. When the instructions in two instruction queues do have a dependency, an instruction synchronization mechanism can be set up between the two queues to ensure that the dependent instructions are sent in a defined order.
FIG. 1 is a schematic diagram of an instruction synchronization process in some embodiments. Assume that instructions q1 and q2 in instruction queue 1 depend on instructions Q1, Q2, and Q3 in instruction queue 2; that is, q1 and q2 can be sent only after Q1, Q2, and Q3 have all been sent. The instructions are then sent in the order shown on the time axis in the figure. Those skilled in the art will understand that the order shown is only an example. In practice, the instructions within one instruction queue are sent in the order in which they were stored in that queue, whereas instructions in different queues that have no dependency on one another may be sent in any order. For example, instruction Q4 may be sent earlier than instruction q3, or instruction Q1 may be sent between instruction q3 and instruction q4. However, because q1 and q2 depend on Q1, Q2, and Q3, instructions Q1, Q2, and Q3 are all sent earlier than q1 and earlier than q2.
In one related technique, an instruction counter is deployed for each instruction queue to count the instructions sent by that queue, and the other instruction queues achieve synchronization by reading the count value of the corresponding queue's instruction counter. In this scheme, however, there is only a producer and no consumer (the count keeps increasing and never decreases), so the counter risks overflowing, and the instructions in each queue must all be of the same type. To address these problems, another related technique adopts a producer-consumer model and deploys a state synchronization counter between every two instruction queues. Because the state synchronization counters are deployed statically, N instruction queues require N*(N-1) state synchronization counters. When N is large, the number of counters is very large and consumes excessive system resources.
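To give a feel for the N*(N-1) growth mentioned above, the following short sketch evaluates the formula for a few queue counts. The function name and the sample values of N are illustrative assumptions, not figures from the disclosure.

```python
def pairwise_sync_counters(num_queues: int) -> int:
    """Counters needed when one counter is statically deployed per ordered
    pair of instruction queues, i.e. N*(N-1) as stated in the text."""
    return num_queues * (num_queues - 1)


# Sample queue counts chosen only for illustration.
for n in (2, 4, 8, 16):
    print(n, pairwise_sync_counters(n))
# prints: 2 2, 4 12, 8 56, 16 240 (one pair per line)
```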
Based on this, an embodiment of the present disclosure provides an instruction synchronization device. Referring to FIG. 2 and FIG. 6, the device includes:
a counter 201, and a plurality of instruction queues 202 coupled to the counter 201; each of the plurality of instruction queues 202 (for example, instruction queue 1, instruction queue 2, instruction queue 3, instruction queue 4, and so on) is configured to store instructions, the instructions including execution instructions and further including at least one of a trigger instruction (Trigger) and a wait instruction (Wait);
the counter 201 is configured to perform a first adjustment on the count value in response to receiving a trigger instruction Trigger, and to perform a second adjustment on the count value in response to receiving a wait instruction Wait; the first adjustment and the second adjustment are performed in different manners;
wherein the instructions located after the wait instruction Wait in an instruction queue 202 are sent when the count value satisfies a preset numerical condition, the preset numerical condition being determined based on the initial count value of the counter 201, the manner of the first adjustment, and the manner of the second adjustment.
In this embodiment, the number of counters 201 (also called synchronization counters) may be greater than or equal to 1, and the number of instruction queues 202 may be greater than or equal to 2. The figures show the case of four counters 201 and four instruction queues 202, but those skilled in the art will understand that this does not limit the present disclosure: the number of counters 201 and the number of instruction queues 202 may each be set to other values, greater or less than 4, as actually needed, and the two numbers may or may not be equal.
An instruction queue 202 may store execution instructions, which indicate the operations to be performed by an execution unit. Execution instructions are sent from the instruction queue 202 to the execution unit 203 for execution. For example, the execution instructions may include, but are not limited to, at least one of various operation instructions such as addition instructions, multiplication instructions, and convolution-multiplication instructions. In response to an execution instruction, the execution unit may operate on acquired target data, which may take various forms such as image data, speech data, or text data. The instruction queue 202 may be implemented as a memory.
Instructions in one or more instruction queues may depend on instructions in one or more other instruction queues. To synchronize instructions that have such dependencies, trigger instructions Trigger and wait instructions Wait can be inserted into the instruction queues. The instructions in the various queues may have different dependencies at different times. For example, at time t1, the instructions in instruction queue 1 can be sent only after the instructions in instruction queue 2 have been sent (this is described as instruction queue 1 waiting for instruction queue 2, where instruction queue 1 is called the waiting queue and instruction queue 2 is called the waited-on queue); at time t2, the instructions in instruction queue 1 and the instructions in instruction queue 2 have no dependency; and at time t3, instruction queue 2 needs to wait for instruction queue 1.
In the example above, instruction queue 1 waiting for instruction queue 2 is called a synchronization event. A waiting queue involved in a synchronization event is called a waiting queue corresponding to that event, and a waited-on queue involved in a synchronization event is called a waited-on queue corresponding to that event. The same instruction queue may correspond to one or more synchronization events. For example, at one moment, one instruction in instruction queue 1 can be sent only after one instruction in instruction queue 2 has been sent; at another moment, one instruction in instruction queue 1 can be sent only after two instructions in instruction queue 3 have been sent; and at yet another moment, three instructions in instruction queue 4 can be sent only after two instructions in instruction queue 1 have been sent. In this example, instruction queue 1 corresponds to three synchronization events.
For ease of understanding, the case of two instruction queues with a dependency is described first. After receiving instructions, an instruction queue can store them in the order in which they were received and can also parse the stored instructions. Each instruction may include its type, from which the instruction queue can determine whether the instruction is an execution instruction, a trigger instruction, or a wait instruction. Referring to FIG. 3, instruction 1, instruction 2, instruction 3, and so on denote execution instructions, and T and W denote a trigger instruction and a wait instruction, respectively. The instructions after the wait instruction in instruction queue 1 can be sent only after all the instructions before the trigger instruction in instruction queue 2 have been sent. After parsing out an execution instruction, the instruction queue can send it to the execution unit; after parsing out a trigger instruction or a wait instruction, it can send that instruction to the counter 201.
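The parse-and-route step described above can be sketched in software as follows. This is only an illustrative model, not the hardware of the disclosure: the `Instruction` class, the `kind` strings, and the two inbox lists standing in for the counter and the execution unit are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    kind: str          # assumed encoding: "exec", "trigger", or "wait"
    payload: str = ""  # hypothetical operand or opcode text


def dispatch(instr: Instruction, counter_inbox: list, exec_inbox: list) -> None:
    """Route a parsed instruction by its type field: trigger and wait
    instructions go to the counter, execution instructions to the execution unit."""
    if instr.kind in ("trigger", "wait"):
        counter_inbox.append(instr)
    elif instr.kind == "exec":
        exec_inbox.append(instr)
    else:
        raise ValueError(f"unknown instruction type: {instr.kind!r}")


counter_inbox, exec_inbox = [], []
for ins in (Instruction("exec", "add"), Instruction("trigger"), Instruction("wait")):
    dispatch(ins, counter_inbox, exec_inbox)
```

In this model the queue only inspects the type field; what the counter does with a trigger or wait instruction is a separate concern.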
A trigger instruction and a wait instruction can cause the counter's count value to be adjusted in a first adjustment manner and a second adjustment manner, respectively. For example, sending a trigger instruction to the counter can cause the counter to increase the count value, and sending a wait instruction to the counter can cause the counter to decrease the count value. Alternatively, sending a trigger instruction can cause the counter to decrease the count value, and sending a wait instruction can cause the counter to increase the count value.
In some embodiments, one of the first adjustment and the second adjustment increases the count value by a first preset step size, and the other decreases the count value by a second preset step size. For example, each trigger instruction may cause the counter to add 1 to the count value, and each wait instruction may cause the counter to subtract 1 from the count value. Besides 1, other positive integers may also be used as the first preset step size and the second preset step size. For example, in the embodiment shown in FIG. 3, the first preset step size and the second preset step size may both be 2, in which case each trigger instruction adds 2 to the count value and each wait instruction subtracts 2. Within one target synchronization event, when the number of waiting queues differs from the number of waited-on queues, the first preset step size and the second preset step size may be unequal. Specifically, the first preset step size is determined based on the number of waiting queues, among the plurality of instruction queues, that are related to the target synchronization event, and the second preset step size is determined based on the number of waited-on queues, among the plurality of instruction queues, that are related to the target synchronization event. For example, if there are n waiting queues and m waited-on queues, the first preset step size can be set to n*a and the second preset step size to m*a, where a is a positive integer.
In some embodiments, the trigger instruction carries the first preset step size, and the wait instruction carries the second preset step size. Because the instructions carry the step size information, the counter can read the step size directly from a trigger instruction or wait instruction after receiving it, and so determine by how much to increase or decrease the count value. Further, the trigger instruction may also carry identification information indicating the manner of the first adjustment, and the wait instruction may also carry identification information indicating the manner of the second adjustment, so that the counter can determine which adjustment manner to apply to the count value.
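As a rough illustration of an instruction that carries both its step size and its adjustment manner, the sketch below models the counter update as a pure function. The field names `step` and `increase` are hypothetical; the disclosure does not specify how this information is encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncInstruction:
    kind: str       # "trigger" or "wait"
    step: int       # preset step size carried in the instruction
    increase: bool  # hypothetical flag standing in for the adjustment-manner identifier


def apply_to_count(count: int, instr: SyncInstruction) -> int:
    """Return the new count value after the counter reads the step size and
    adjustment manner directly from the received instruction."""
    return count + instr.step if instr.increase else count - instr.step


count = 0
count = apply_to_count(count, SyncInstruction("trigger", step=2, increase=True))   # 2
count = apply_to_count(count, SyncInstruction("wait", step=2, increase=False))     # 0
```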
The instructions located after the wait instruction in an instruction queue are sent only when the counter's count value satisfies the preset numerical condition. In this way, the order in which the instructions in the instruction queues are sent can be controlled based on the count value, which achieves instruction synchronization among the plurality of instruction queues. In some embodiments, the preset numerical condition is determined jointly by the initial count value, the first preset step size, the second preset step size, and the preset multiple. Specifically, the preset numerical condition may be that the counter's count value and a target count value satisfy a preset numerical relationship. The product of the first preset step size, the second preset step size, and the preset multiple can be determined, the product can be summed with the initial count value, and the target count value can be determined from the result. Denoting the first preset step size as n, the second preset step size as m, the preset multiple as a, and the initial count value as k0, the target count value can be written as n*m*a-1+k0. In some examples, the preset multiple is a positive integer.
Different numerical relationships can be determined depending on the manners of the first adjustment and the second adjustment. For example, in mode one, the first adjustment increases the count value and the second adjustment decreases the count value, and the numerical relationship is that the count value is greater than the target count value. In mode two, the first adjustment decreases the count value and the second adjustment increases the count value, and the numerical relationship is that the count value is less than the target count value. The solution of the embodiments of the present disclosure is described below using mode one as an example.
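For mode one, the step sizes, the target count value, and the release check can be written out as below. This is a sketch under a stated assumption: n and m are read as the numbers of waiting and waited-on queues, as in the n*a and m*a step-size example and the FIG. 4 discussion. Mode two is not modelled, and the function names are illustrative.

```python
def step_sizes(n_waiting: int, m_waited: int, a: int = 1) -> tuple:
    """First (trigger) and second (wait) preset step sizes: n*a and m*a."""
    return n_waiting * a, m_waited * a


def target_count(n_waiting: int, m_waited: int, a: int = 1, k0: int = 0) -> int:
    """Target count value n*m*a - 1 + k0."""
    return n_waiting * m_waited * a - 1 + k0


def may_send_after_wait(count: int, target: int) -> bool:
    """Mode one only: instructions after a wait instruction may be sent
    when the count value is greater than the target count value."""
    return count > target


# Illustrative values: 2 waiting queues, 3 waited-on queues, a = 1, k0 = 0.
trigger_step, wait_step = step_sizes(2, 3)   # (2, 3)
target = target_count(2, 3)                  # 5
```

With these values, a count of 6 (all three triggers received) passes the check, while a count of 5 or lower does not.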
In the embodiment shown in FIG. 3, m and n are both equal to 1; assume that the preset multiple a is also 1 and that the counter's initial count value is 0. Instruction 4 and instruction 5 in instruction queue 1 can be sent only after instruction 1 and instruction 2 in instruction queue 2 have been sent. A trigger instruction T can therefore be inserted after instruction 2 in instruction queue 2, and a wait instruction W can be inserted before instruction 4 in instruction queue 1. Instruction queue 2 can parse the instructions in its queue and send the parsed instructions in order. Instruction 1 and instruction 2 are both execution instructions and can be sent to the execution unit in turn. Meanwhile, instruction queue 1 can send the execution instructions in its own queue (instruction 1, instruction 2, and instruction 3) in parallel. Instructions with the same label in the two queues, such as instruction 1 and instruction 2, may be the same instruction or different instructions; the label only indicates the relative position of the instruction within its queue and does not indicate its content or type. Because the completion times of independent instructions in different queues are random, instruction 3 in instruction queue 1 may finish being sent before instruction 2 in instruction queue 2, or instruction 2 in instruction queue 2 may finish being sent before instruction 3 in instruction queue 1.
In either case, because a trigger instruction and a wait instruction have been inserted into the two instruction queues respectively, instruction queue 2 sends the trigger instruction T to the counter after parsing it, and the counter increases the count value by 1 after receiving the trigger instruction T. After parsing the wait instruction W, instruction queue 1 reads the count value of the counter. Only when the count value is greater than 0 does instruction queue 1 send the wait instruction W and the instructions after it; otherwise, neither the wait instruction W nor the instructions after it are sent. When the counter receives the wait instruction W, it decreases the count value by 1. This completes one synchronization event. If there are further synchronization events between the two instruction queues, instruction synchronization simply continues according to the same process.
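The one-to-one case above can be sketched in a few lines of Python. This is only an illustrative model, not the disclosed hardware: the class name `SyncCounter` and the methods `trigger` and `try_wait` are hypothetical, and the sketch assumes mode one with an initial count value of 0 and a step of 1.

```python
class SyncCounter:
    """Hypothetical software model of the counter in mode one."""

    def __init__(self, initial=0):
        self.value = initial

    def trigger(self, step=1):
        # A trigger instruction T arrived: first adjustment (increase).
        self.value += step

    def try_wait(self, step=1, threshold=0):
        # The queue that parsed W reads the count value.
        # W is sent only if the count value is greater than the threshold.
        if self.value > threshold:
            self.value -= step  # second adjustment (decrease)
            return True
        return False


counter = SyncCounter()
blocked_first = counter.try_wait()  # queue 1 reaches W before queue 2 has sent T
counter.trigger()                   # queue 2 sends T after its instruction 2
released = counter.try_wait()       # queue 1 retries W and can now proceed
```

In this run the first attempt to send W is refused, the second succeeds, and the count value returns to 0, which matches the completed synchronization event described above.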
In the more general case shown in FIG. 4, m and n are both positive integers greater than 1 and need not be equal. This case is referred to as n instruction queues waiting for m instruction queues. Instruction 4 and instruction 5 in each waiting queue (that is, instruction queue A1 to instruction queue An above the counter) cannot be sent until instruction 3 in every waited-on queue (that is, instruction queue B1 to instruction queue Bm below the counter) has been sent. Assuming that the initial count value is 0 and the preset multiple a is 1, the count value increases by n each time a waited-on queue (for example, instruction queue Bi) sends its trigger instruction to the counter, so the count value is n*m once all waited-on queues have sent their trigger instructions. When a waiting queue parses a wait instruction, it reads the count value of the counter; if the count value is greater than n*m-1, the waiting queue can send the wait instruction, which causes the counter to decrease the count value by m. After all waiting queues have sent their wait instructions, the count value returns to 0. This completes one synchronization event.
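The arithmetic of the n-waits-for-m case can be checked with a short sketch. The function name `sync_event_counts` is hypothetical, and the sketch assumes a preset multiple of 1. It only verifies the counting; the order in which the waiting queues sample the counter is not modeled.

```python
def sync_event_counts(n_waiting, m_waited, initial=0):
    """Return (threshold, peak, final) for one n-waits-for-m event with a = 1."""
    trigger_step = n_waiting  # added per trigger instruction
    wait_step = m_waited      # subtracted per wait instruction
    threshold = n_waiting * m_waited - 1 + initial
    # Count value after all m waited-on queues have sent their trigger.
    peak = initial + m_waited * trigger_step
    # Count value after all n waiting queues have sent their wait instruction.
    final = peak - n_waiting * wait_step
    return threshold, peak, final


threshold, peak, final = sync_event_counts(n_waiting=2, m_waited=3)
```

With 2 waiting queues and 3 waited-on queues, the peak count value is 6, which exceeds the threshold of 5 only after the last trigger instruction, and the count value ends at the initial value of 0.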
Of course, the situation shown in the figure is only an example. In practical applications, the instructions that are in the waiting state may occupy different positions in their respective waiting queues, and the instructions that are being waited on may likewise occupy different positions in their respective waited-on queues.
In embodiments that determine the numerical relationship using mode two, the synchronization process is similar to that of the embodiments shown in FIG. 3 and FIG. 4 and is not repeated here. The only differences are that, after receiving a trigger instruction, the counter subtracts the corresponding value from the count value, which starts at the initial count value (for example, 5); after receiving a wait instruction, the counter adds the corresponding value to the count value; and the instructions after the wait instruction can continue to be sent only when the count value is less than the initial count value. A synchronization event is completed in the same way.
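Mode two can be sketched for the simplest one-to-one case as follows. The class name `ReverseSyncCounter` is hypothetical, and the sketch assumes an initial count value of 5, a step of 1, and the release condition stated above (count value less than the initial count value).

```python
class ReverseSyncCounter:
    """Hypothetical model of mode two: trigger decreases, wait increases."""

    def __init__(self, initial=5):
        self.initial = initial
        self.value = initial

    def trigger(self, step=1):
        # First adjustment in mode two: decrease the count value.
        self.value -= step

    def try_wait(self, step=1):
        # The wait instruction is sent only if the count value is below the initial value.
        if self.value < self.initial:
            self.value += step  # second adjustment in mode two: increase
            return True
        return False


counter = ReverseSyncCounter(initial=5)
blocked_first = counter.try_wait()  # no trigger yet: 5 is not less than 5
counter.trigger()                   # count value drops to 4
released = counter.try_wait()       # 4 < 5, so the wait instruction is sent
```

After the wait instruction is sent, the count value is back at 5, mirroring the return to the initial value in mode one.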
When multiple synchronization events are mapped to one counter, a trigger instruction of a later synchronization event may falsely trigger a wait instruction of an earlier synchronization event. Referring to FIG. 5A and FIG. 5B, suppose there are two synchronization events: (1) instruction queues 1 and 2 wait for instruction queues 3, 4 and 5; and (2) instruction queues 1 and 2 wait for instruction queue 3. T1 in instruction queues 3, 4 and 5 and W1 in instruction queues 1 and 2 correspond to the first synchronization event; T2 in instruction queue 3 and W2 in instruction queues 1 and 2 correspond to the second synchronization event. The instruction queues obtained by inserting trigger instructions and wait instructions as in the foregoing embodiments are shown as case one in FIG. 5A. Because the time at which each instruction queue sends its trigger instruction is random, in some cases trigger instruction T2 in instruction queue 3 may be sent earlier than trigger instruction T1 in instruction queue 4. The resulting sending order is shown as the instruction sending order of case one in FIG. 5B. For brevity, the figure shows only the sending order of the trigger instructions in instruction queues 3, 4 and 5.
Assume that the initial count value is 0 and the preset multiple a is 1, that the counter adds n to the count value each time it receives a trigger instruction and subtracts m each time it receives a wait instruction, and that n and m are respectively the number of waiting queues and the number of waited-on queues in a synchronization event. In the first synchronization event above, n is 2 and m is 3; that is, when the count value is greater than 5, the wait instructions in the waiting queues and the execution instructions after them can be sent. In this embodiment, because T2 of instruction queue 3 is sent before T1 of instruction queue 4, the count value reaches 6 after instruction queue 4 sends T1 (2 each from T1 and T2 of instruction queue 3, plus 2 from T1 of instruction queue 4), which triggers the waiting queues to send the related instructions. In reality, however, T1 of instruction queue 5 has not yet been sent, so the related instructions in the waiting queues do not satisfy the sending condition. In this case, T2 in instruction queue 3 therefore falsely triggers W1 of synchronization event 1.
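The false trigger can be reproduced by replaying the trigger order of case one. This is a minimal sketch under the assumptions stated above (initial count value 0, a = 1, n = 2 for both events); the queue names and the tuple layout are hypothetical.

```python
# (queue, trigger name, step added to the count value), in sending order.
triggers = [
    ("queue3", "T1", 2),
    ("queue3", "T2", 2),  # belongs to the second event but shares the counter
    ("queue4", "T1", 2),
    ("queue5", "T1", 2),
]
threshold_event1 = 2 * 3 - 1  # n*m-1 for the first event (n = 2, m = 3)

count = 0
sent_t1 = set()
released_after = None
missing = None
for queue, name, step in triggers:
    count += step
    if name == "T1":
        sent_t1.add(queue)
    if released_after is None and count > threshold_event1:
        # W1 would be released here; record which T1 is still outstanding.
        released_after = (queue, name)
        missing = {"queue3", "queue4", "queue5"} - sent_t1
```

The count value first exceeds 5 right after instruction queue 4 sends T1, while T1 of instruction queue 5 is still outstanding, which is the false trigger described above.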
To solve the above problem, in the embodiments of the present disclosure the plurality of instruction queues include waited-on queues and waiting queues. A waited-on queue includes multiple trigger instructions; each trigger instruction in the same waited-on queue corresponds to one synchronization event; one synchronization event includes its corresponding wait instructions and its corresponding trigger instructions; and the multiple trigger instructions are used to adjust the count value of the same counter. A target wait instruction is further included after a first trigger instruction in the waited-on queue, and a target trigger instruction is further included after a first wait instruction in the waiting queue.
The number of waiting queues may be greater than or equal to 1, and the number of waited-on queues may also be greater than or equal to 1. The first trigger instructions corresponding to different synchronization events may come from one or more waited-on queues, and the first wait instructions corresponding to different synchronization events may come from one or more waiting queues. For a given instruction queue, assuming that the total number of synchronization events is N, a target wait instruction is included after each of the first trigger instructions corresponding to the 1st, 2nd, ..., (N-1)th synchronization events, and a target trigger instruction is included after each of the first wait instructions corresponding to the 1st, 2nd, ..., (N-1)th synchronization events. The first trigger instruction corresponding to the Nth synchronization event may or may not be followed by a target wait instruction, and the first wait instruction corresponding to the Nth synchronization event may or may not be followed by a target trigger instruction.
If the first trigger instructions corresponding to a synchronization event come from multiple waited-on queues, the first trigger instruction in each of these waited-on queues may be followed by a target wait instruction. For example, if a synchronization event is that instruction 1 in instruction queue 2 waits for instruction 2 in instruction queue 3 and instruction 3 in instruction queue 4, then instruction queue 3 and instruction queue 4 both include a first trigger instruction corresponding to the synchronization event, so the first trigger instruction in instruction queue 3 is followed by a target wait instruction, and the first trigger instruction in instruction queue 4 is also followed by a target wait instruction. Similarly, if the first wait instructions corresponding to a synchronization event come from multiple waiting queues, the first wait instruction in each of these waiting queues may be followed by a target trigger instruction. For example, if a synchronization event is that instruction 1 in instruction queue 1 and instruction 1 in instruction queue 2 wait for instruction 2 in instruction queue 3, then instruction queue 1 and instruction queue 2 both include a first wait instruction corresponding to the synchronization event, so the first wait instruction in instruction queue 1 is followed by a target trigger instruction, and the first wait instruction in instruction queue 2 is also followed by a target trigger instruction.
In another example, if the first trigger instructions corresponding to a synchronization event S come from multiple waited-on queues, and the synchronization event S is not the last trigger event in a waited-on queue Mi, then the first trigger instruction corresponding to the synchronization event S in the waited-on queue Mi is followed by a target wait instruction; if the synchronization event S is the last trigger event in a waited-on queue Mj, then the first trigger instruction corresponding to the synchronization event S in the waited-on queue Mj is not followed by a target wait instruction. Likewise, if the first wait instructions corresponding to a synchronization event S come from multiple waiting queues, and the synchronization event S is not the last wait event in a waiting queue Ni, then the first wait instruction corresponding to the synchronization event S in the waiting queue Ni is followed by a target trigger instruction; if the synchronization event S is the last wait event in a waiting queue Nj, then the first wait instruction corresponding to the synchronization event S in the waiting queue Nj is not followed by a target trigger instruction. This is shown as case two in FIG. 5A.
Referring to case two in FIG. 5A and the instruction sending order of case two in FIG. 5B, the first trigger instruction is T1 in instruction queue 3; the target wait instruction is the W located after T1 in instruction queue 3; the first wait instructions are W1 in instruction queue 1 and W1 in instruction queue 2; and the target trigger instructions are the T located after W1 in instruction queue 1 and the T located after W1 in instruction queue 2. If the instructions in instruction queue 3 are sent quickly, T1 in instruction queue 3 may be sent first. However, because W has been inserted after T1 in instruction queue 3, W in instruction queue 3 does not satisfy the sending condition while T1 in instruction queue 3 has been sent but T1 in instruction queue 4 and T1 in instruction queue 5 have not. Instruction queue 3 therefore enters the waiting state after its T1 has been sent. Similarly, W1 in instruction queue 1 and W1 in instruction queue 2 put instruction queue 1 and instruction queue 2 into the waiting state as well. As a result, T1 in instruction queue 4 and T1 in instruction queue 5 are sent first. This ensures that the trigger instruction of the second synchronization event takes effect only after the wait instructions of the first synchronization event have all occurred normally.
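The ordering guarantee of case two can be explored with a small randomized simulation. This is a sketch under strong simplifying assumptions: the counter is abstracted away, and each wait instruction is modeled as "blocked until all of its matching trigger instructions have been sent". The queue contents follow the description of case two; the names `QUEUES`, `NEEDS` and `run` are hypothetical.

```python
import random

QUEUES = {
    "q1": ["W1", "T", "W2"],
    "q2": ["W1", "T", "W2"],
    "q3": ["T1", "W", "T2"],
    "q4": ["T1"],
    "q5": ["T1"],
}
# Each wait instruction is released once every listed (queue, trigger) was sent.
NEEDS = {
    "W1": [("q3", "T1"), ("q4", "T1"), ("q5", "T1")],
    "W": [("q1", "T"), ("q2", "T")],
    "W2": [("q3", "T2")],
}


def run(seed):
    """Send instructions in a random but dependency-respecting order."""
    rng = random.Random(seed)
    heads = {q: 0 for q in QUEUES}
    sent = []
    while True:
        ready = [
            q for q, instrs in QUEUES.items()
            if heads[q] < len(instrs)
            and all(dep in sent for dep in NEEDS.get(instrs[heads[q]], []))
        ]
        if not ready:
            break
        q = rng.choice(ready)
        sent.append((q, QUEUES[q][heads[q]]))
        heads[q] += 1
    return sent


logs = [run(seed) for seed in range(20)]
```

In every sampled schedule, T2 of instruction queue 3 appears after T1 of instruction queues 4 and 5, because T2 sits behind W, W needs both target trigger instructions T, and each T sits behind a W1 that needs all three T1 instructions.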
In the embodiment shown in FIG. 5A and FIG. 5B, the number of synchronization events mapped to the same counter is 2; that is, there is at least one waited-on queue (instruction queue 3 in the figure) that corresponds to two synchronization events. In practical applications, the number of synchronization events mapped to the same counter may also be greater than 2, and more than one waited-on queue may each correspond to multiple synchronization events. In addition, although the waiting queues of the two synchronization events shown in the figure are the same (both are instruction queue 1 and instruction queue 2), in practical applications the waiting queues of different events may be partly the same or completely different. Furthermore, the target wait instruction after a first trigger instruction does not have to be immediately adjacent to that first trigger instruction; it only needs to lie between that first trigger instruction and the next first trigger instruction that follows it. Similarly, the target trigger instruction after a first wait instruction does not have to be immediately adjacent to that first wait instruction; it only needs to lie between that first wait instruction and the next first wait instruction that follows it. For example, in instruction queue 3 shown in case two of FIG. 5A, the W after T1 can be at any position between T1 and T2. In instruction queue 2 shown in case two of FIG. 5A, the T after W1 can be at any position between W1 and W2. In these cases, false triggering of synchronization events can likewise be avoided by inserting target wait instructions and target trigger instructions, and the details are not repeated here.
In some embodiments, the number of counters 201 is greater than 1. In this case, the trigger instructions and wait instructions in each instruction queue may include identification information of a counter, which the instruction queue uses to send each trigger instruction and wait instruction to the corresponding counter. Alternatively, the identification information of the trigger instructions and of the wait instructions may be bound to the identification information of the counters, so that the instruction queue sends the trigger instructions and wait instructions to the corresponding counters.
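Routing by counter identification information might look like the following sketch. The `SyncInstruction` layout and the field name `counter_id` are assumptions made for illustration; the disclosure does not specify an encoding. The sketch shows only the routing and the adjustment in mode one, not the release condition.

```python
from dataclasses import dataclass


@dataclass
class SyncInstruction:
    kind: str        # "trigger" or "wait"
    counter_id: int  # hypothetical field carrying the counter identification
    step: int = 1


def dispatch(instr, counters):
    """Apply a trigger or wait instruction to the counter it names (mode one)."""
    if instr.kind == "trigger":
        counters[instr.counter_id] += instr.step
    elif instr.kind == "wait":
        counters[instr.counter_id] -= instr.step
    else:
        raise ValueError(f"not a synchronization instruction: {instr.kind}")


counters = {0: 0, 1: 0}
dispatch(SyncInstruction("trigger", counter_id=1), counters)
dispatch(SyncInstruction("trigger", counter_id=0, step=2), counters)
dispatch(SyncInstruction("wait", counter_id=0, step=2), counters)
```

Each counter is adjusted only by the instructions that name it, so independent synchronization events can proceed without sharing a count value.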
In some embodiments, the instruction synchronization apparatus further includes an execution unit 203 configured to execute the execution instructions it receives. The number of execution units may be greater than or equal to 1, and one execution unit may receive and process the execution instructions of one or more instruction queues. An execution unit may be a subunit of a processing unit that has an instruction execution function. The processing unit may be divided into multiple groups of execution units at different granularities, and each group of execution units may be used to perform one processing task (for example, an addition operation). Different division granularities can be adopted according to actual needs. For example, assuming that the processing unit includes R execution units, the R execution units can be divided into s1 groups of r1 execution units each, or into s2 groups of r2 execution units each, where R=s1*r1=s2*r2. This division is highly flexible: when a processing task requires only a few execution units, the execution units can be divided into more groups, which improves the parallelism of task processing. After it finishes processing an instruction, the execution unit may also return an acknowledgment signal (ACK) to the arbitration unit so that the arbitration unit can continue to send new instructions.
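The relation R=s1*r1=s2*r2 can be illustrated by partitioning a list of unit indices at two granularities. The function name `partition_units` and the choice R=8 are illustrative assumptions only.

```python
def partition_units(total_units, group_size):
    """Split unit indices 0..total_units-1 into equal groups of group_size."""
    if total_units % group_size != 0:
        raise ValueError("group_size must divide total_units")
    return [
        list(range(start, start + group_size))
        for start in range(0, total_units, group_size)
    ]


coarse = partition_units(8, 4)  # s1 = 2 groups of r1 = 4 units
fine = partition_units(8, 2)    # s2 = 4 groups of r2 = 2 units
```

The finer partition yields more groups, so more small tasks can run side by side, which is the parallelism benefit described above.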
In some embodiments, the instruction synchronization apparatus further includes an arbitration unit 204 configured to send the execution instructions sent by each of the plurality of instruction queues to the execution unit according to a preset priority. The number of arbitration units 204 may be greater than or equal to 1. FIG. 6 shows the case where the number of arbitration units equals the number of instruction queues, but in practical applications the two numbers need not be equal.
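One possible reading of "preset priority" is fixed-priority arbitration, sketched below. The policy details are assumptions: a lower number means a higher priority, and a higher-priority queue is drained before a lower-priority one. The function name `arbitrate` is hypothetical.

```python
def arbitrate(pending, priority):
    """Return the order in which execution instructions reach the execution unit.

    pending:  dict mapping queue id to its execution instructions, head first.
    priority: dict mapping queue id to an integer, lower meaning higher priority.
    """
    pending = {q: list(instrs) for q, instrs in pending.items()}  # work on a copy
    order = []
    while any(pending.values()):
        # Pick the non-empty queue with the highest priority.
        winner = min((q for q, instrs in pending.items() if instrs),
                     key=lambda q: priority[q])
        order.append((winner, pending[winner].pop(0)))
    return order


order = arbitrate(
    pending={"queue1": ["add", "mul"], "queue2": ["load"]},
    priority={"queue1": 1, "queue2": 0},
)
```

Here the instruction from the higher-priority queue2 is forwarded first, followed by the two instructions of queue1 in their original order.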
In some embodiments, the instruction synchronization apparatus further includes a multiplexer 205 configured to send each instruction to the corresponding instruction queue among the plurality of instruction queues. In some embodiments, the trigger instructions and the wait instructions each include identification information of an instruction queue, which the multiplexer 205 uses to send the trigger instructions and the wait instructions to the corresponding instruction queues.
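The multiplexer's role can be sketched as routing each instruction to the queue named by its identification information. Representing an instruction as a `(queue_id, payload)` pair is an assumption for illustration, as is the function name `route`.

```python
from collections import deque


def route(instructions, num_queues):
    """Append each (queue_id, payload) pair to the instruction queue it names."""
    queues = [deque() for _ in range(num_queues)]
    for queue_id, payload in instructions:
        queues[queue_id].append(payload)
    return queues


queues = route([(0, "exec_a"), (1, "T"), (0, "W"), (0, "exec_b")], num_queues=2)
```

Each queue keeps the relative order in which its instructions arrived, which matters because a wait instruction must stay ahead of the instructions it guards.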
The present disclosure provides a hardware-implemented mechanism for dynamically deploying counters that synchronize instruction queue states, which saves counter overhead and eliminates the risk of counter overflow. Implementing the dynamic deployment of counters in hardware also enables efficient and flexible multi-process scheduling. The embodiments of the present disclosure are highly scalable: the number of counters, the number of instruction queues and the number of execution units can all be adjusted as required. The instruction synchronization apparatus of the embodiments of the present disclosure can be applied to processing chips such as artificial intelligence chips and graphics processing chips to achieve efficient and flexible deployment and scheduling of instruction queues, thereby improving the parallel efficiency of the execution units.
The various instructions in the above embodiments, including trigger instructions, wait instructions and execution instructions, can be generated in advance by offline compilation. During offline compilation, the instructions can be generated in the required order and sent to the instruction queues.
In some embodiments, the present disclosure further provides a chip that includes the instruction synchronization apparatus described in any embodiment of the present disclosure. The chip may be an artificial intelligence chip, a graphics processing chip, or another type of processing chip. For details of the instruction synchronization apparatus in this chip embodiment, refer to the foregoing embodiments of the instruction synchronization apparatus; they are not repeated here.
An embodiment of the present disclosure further provides a computer device that includes the chip described in any embodiment of the present disclosure. For the specific functions of the chip, refer to the description of the chip embodiments above; for brevity, they are not repeated here.
As shown in FIG. 7, an embodiment of the present disclosure further provides a data processing method applied to the instruction synchronization apparatus described in any embodiment of the present disclosure. The method includes:
Step 701: each of a plurality of instruction queues stores instructions, where the instructions include execution instructions and further include at least one of a trigger instruction and a wait instruction;
Step 702: a counter coupled to the plurality of instruction queues performs a first adjustment on a count value in response to receiving a trigger instruction, and performs a second adjustment on the count value in response to receiving a wait instruction, where the adjustment mode of the first adjustment differs from the adjustment mode of the second adjustment;
Step 703: when the count value satisfies a preset numerical condition, each instruction queue sends the instructions located after the wait instruction in its own queue, where the preset numerical condition is determined based on the initial count value of the counter, the adjustment mode of the first adjustment and the adjustment mode of the second adjustment.
In some embodiments, the method further includes: each instruction queue parses its stored instructions, sends the parsed trigger instructions and wait instructions to the counter, and sends the parsed execution instructions to an execution unit.
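The parsing and dispatch behaviour of the method can be sketched for two queues and one counter. This is an illustrative model only: the dictionary layout, the function name `step`, and the fixed interleaving are assumptions, and the sketch uses mode one with a step of 1 and a target count value of 0.

```python
def step(queue, counter, executed):
    """Try to send the head instruction of `queue`; return True if it was sent."""
    if not queue["pending"]:
        return False
    kind, payload = queue["pending"][0]
    if kind == "wait":
        if counter["value"] <= counter["target"]:
            return False              # condition not met: the queue stalls at W
        counter["value"] -= 1         # second adjustment
    elif kind == "trigger":
        counter["value"] += 1         # first adjustment
    else:
        executed.append((queue["name"], payload))  # forwarded for execution
    queue["pending"].pop(0)
    return True


q1 = {"name": "queue1", "pending": [("exec", 1), ("exec", 2), ("exec", 3),
                                    ("wait", None), ("exec", 4), ("exec", 5)]}
q2 = {"name": "queue2", "pending": [("exec", 1), ("exec", 2), ("trigger", None)]}
counter = {"value": 0, "target": 0}
executed = []
while q1["pending"] or q2["pending"]:
    step(q1, counter, executed)  # queue 1 is deliberately given two turns
    step(q1, counter, executed)  # so that it reaches W before queue 2 sends T
    step(q2, counter, executed)
```

Even though queue 1 runs ahead, its instruction 4 is forwarded only after instruction 2 of queue 2, because queue 1 stalls at the wait instruction until the trigger instruction has raised the count value.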
In some embodiments, the counter performing a first adjustment on the count value in response to receiving a trigger instruction and performing a second adjustment on the count value in response to receiving a wait instruction includes: increasing the count value by a first preset step size in response to receiving a trigger instruction, and decreasing the count value by a second preset step size in response to receiving a wait instruction; or decreasing the count value by the first preset step size in response to receiving a trigger instruction, and increasing the count value by the second preset step size in response to receiving a wait instruction.
In some embodiments, the first preset step size is determined based on the number of waiting queues, among the plurality of instruction queues, that are related to a target synchronization event, and the second preset step size is determined based on the number of waited-on queues, among the plurality of instruction queues, that are related to the target synchronization event.
In some embodiments, the method further includes: the counter obtains the first preset step size carried in the trigger instruction, and obtains the second preset step size carried in the wait instruction.
In some embodiments, the first preset step size is equal to the product of the number of waiting queues and a preset multiple, and the second preset step size is equal to the product of the number of waited-on queues and the preset multiple.
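The two step sizes can be written out directly. The function name `preset_steps` is hypothetical, and the example values are illustrative.

```python
def preset_steps(num_waiting, num_waited, multiple=1):
    """Return (first_step, second_step) for one synchronization event."""
    first_step = num_waiting * multiple   # applied per trigger instruction
    second_step = num_waited * multiple   # applied per wait instruction
    return first_step, second_step


first_step, second_step = preset_steps(num_waiting=2, num_waited=3, multiple=4)
# Total adjustment by all triggers equals total adjustment by all waits.
balanced = 3 * first_step == 2 * second_step
```

Because the number of trigger instructions equals the number of waited-on queues and the number of wait instructions equals the number of waiting queues, the two total adjustments cancel, so the count value returns to its initial value after each event.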
In some embodiments, the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size and the preset multiple.
In some embodiments, the preset numerical condition is that the count value and a target count value satisfy a preset numerical relationship, where the target count value is n*m*a-1+k0, n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value.
In some embodiments, when the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is less than the target count value.
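The preset numerical condition can be expressed as a small predicate. The sketch implements the target count value formula exactly as stated, with n, m, a and k0 taken as defined in the text; the function names and the flag `trigger_increases` are hypothetical.

```python
def target_count(n, m, a, k0):
    """Target count value n*m*a-1+k0 as given in the text."""
    return n * m * a - 1 + k0


def condition_met(count, n, m, a, k0, trigger_increases=True):
    """Check the numerical relationship for mode one (True) or mode two (False)."""
    target = target_count(n, m, a, k0)
    return count > target if trigger_increases else count < target


target = target_count(n=2, m=3, a=1, k0=0)
ready_at_6 = condition_met(6, n=2, m=3, a=1, k0=0)
ready_at_5 = condition_met(5, n=2, m=3, a=1, k0=0)
```

For n=2, m=3, a=1 and k0=0 the target count value is 5, so in mode one the condition holds at a count value of 6 but not at 5, matching the example of two queues waiting for three.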
In some embodiments, the plurality of instruction queues include waited-on queues and waiting queues. A waited-on queue includes multiple trigger instructions; each trigger instruction in the same waited-on queue corresponds to one synchronization event; one synchronization event includes its corresponding wait instructions and its corresponding trigger instructions; and the multiple trigger instructions are used to adjust the count value of the same counter. A target wait instruction is further included after a first trigger instruction in the waited-on queue, and a target trigger instruction is further included after a first wait instruction in the waiting queue. The first trigger instruction and the first wait instruction are instructions corresponding to the synchronization events other than the last synchronization event.
In some embodiments, the number of counters is greater than 1, and the method further includes: each instruction queue obtains the identification information of the counter included in the trigger instructions and wait instructions in its own queue and, based on that identification information, sends the trigger instructions and wait instructions in its own queue to the corresponding counter.
In some embodiments, sending the parsed execution instructions to the execution unit includes: sending the parsed execution instructions to an arbitration unit, so that the arbitration unit sends the execution instructions sent by each of the plurality of instruction queues to the execution unit according to a preset priority.
In some embodiments, the method further includes: each instruction queue obtains the instructions sent by a multiplexer and stores the instructions sent by the multiplexer.
In some embodiments, each instruction queue obtaining the instructions sent by the multiplexer includes: each instruction queue obtains the instructions that the multiplexer sends to that queue based on the identification information of the instruction queue included in the instructions.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementing device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail transceiver, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments. The apparatus embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separate, and when implementing the solutions of the embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the solution of an embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
The above are only specific implementations of the embodiments of this specification. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the embodiments of this specification, and these improvements and refinements shall also fall within the scope of protection of the embodiments of this specification.

Claims (15)

  1. An instruction synchronization apparatus, characterized in that the apparatus comprises:
    a counter and a plurality of instruction queues;
    the counter is coupled to the plurality of instruction queues;
    each of the plurality of instruction queues is configured to store instructions, the instructions including execution instructions and further including at least one of a trigger instruction and a wait instruction;
    the counter is configured to perform a first adjustment on a count value in response to receiving the trigger instruction, and to perform a second adjustment on the count value in response to receiving the wait instruction, the adjustment manner of the first adjustment being different from the adjustment manner of the second adjustment;
    wherein an instruction located after the wait instruction in an instruction queue is sent when the count value satisfies a preset numerical condition, the preset numerical condition being determined based on an initial count value of the counter, the adjustment manner of the first adjustment, and the adjustment manner of the second adjustment.
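The mechanism recited in claim 1 can be illustrated with a small software model of one waiting queue and one waited-on queue sharing a counter. This is a sketch under stated assumptions, not the claimed hardware: the step size of 1, the release condition "count value greater than the initial count value", and the choice to apply the second adjustment at the moment the wait is released are all assumptions, and the names `SyncCounter`, `try_wait`, and `run` are hypothetical.

```python
from collections import deque


class SyncCounter:
    """Hypothetical counter: triggers add, waits subtract (assumed step of 1)."""

    def __init__(self, initial=0):
        self.initial = initial
        self.value = initial

    def trigger(self):
        self.value += 1  # first adjustment

    def try_wait(self):
        # Assumed release condition: the count is above its initial value.
        if self.value > self.initial:
            self.value -= 1  # second adjustment, applied on release (assumption)
            return True
        return False


def run(queues, counter):
    """Round-robin over the queues; return execution instructions in dispatch order."""
    pending = [deque(q) for q in queues]
    dispatched = []
    while any(pending):
        progressed = False
        for q in pending:
            if not q:
                continue
            kind, name = q[0]
            if kind == "wait" and not counter.try_wait():
                continue  # instructions behind the wait stay blocked
            if kind == "trigger":
                counter.trigger()
            elif kind == "exec":
                dispatched.append(name)
            q.popleft()
            progressed = True
        if not progressed:
            raise RuntimeError("deadlock: a wait can never be satisfied")
    return dispatched


waiting_queue = [("wait", None), ("exec", "B")]
waited_on_queue = [("exec", "A"), ("trigger", None)]
order = run([waiting_queue, waited_on_queue], SyncCounter())
```

Even though the waiting queue is visited first on every pass, `B` is dispatched only after `A` has been dispatched and the trigger has adjusted the counter.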
  2. The instruction synchronization apparatus according to claim 1, characterized in that:
    the apparatus further comprises an execution unit, the execution unit being coupled to the plurality of instruction queues and configured to execute the execution instructions it receives;
    each instruction queue is configured to parse the stored instructions, send the parsed trigger instructions and wait instructions to the counter, and send the parsed execution instructions to the execution unit.
  3. The instruction synchronization apparatus according to claim 1 or 2, characterized in that the adjustment manner of one of the first adjustment and the second adjustment is to increase the count value by a first preset step size, and the adjustment manner of the other is to decrease the count value by a second preset step size;
    the first preset step size is determined based on the number of waiting queues, among the plurality of instruction queues, that are related to a target synchronization event, and the second preset step size is determined based on the number of waited-on queues, among the plurality of instruction queues, that are related to the target synchronization event.
  4. The instruction synchronization apparatus according to claim 3, characterized in that the trigger instruction carries the first preset step size, and the wait instruction carries the second preset step size.
  5. The instruction synchronization apparatus according to claim 3 or 4, characterized in that the first preset step size is equal to the product of the number of waiting queues and a preset multiple, and the second preset step size is equal to the product of the number of waited-on queues and the preset multiple.
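The step sizes of claim 5 are simple products, and a short calculation shows why they pair naturally. The sketch below is illustrative only: the function name `step_sizes` is hypothetical, and the balance check assumes that each waited-on queue issues exactly one trigger and each waiting queue issues exactly one wait for the event, which the claim itself does not state.

```python
def step_sizes(num_waiting_queues, num_waited_on_queues, multiple=1):
    """Step sizes as recited in claim 5: queue count times the preset multiple."""
    first_step = num_waiting_queues * multiple      # carried by trigger instructions
    second_step = num_waited_on_queues * multiple   # carried by wait instructions
    return first_step, second_step


# Example: 3 waiting queues depend on 2 waited-on queues, preset multiple of 1.
first_step, second_step = step_sizes(num_waiting_queues=3, num_waited_on_queues=2)

# Assumption: one trigger per waited-on queue and one wait per waiting queue.
# The total increase and the total decrease then cancel, so the counter
# returns to its initial value once the event has completed.
net = 2 * first_step - 3 * second_step
```

With these numbers the two triggers add 3 each and the three waits subtract 2 each, so the net change is zero.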
  6. The instruction synchronization apparatus according to any one of claims 3 to 5, characterized in that the preset numerical condition is determined jointly based on the initial count value, the first preset step size, the second preset step size, and the preset multiple.
  7. The instruction synchronization apparatus according to claim 6, characterized in that the preset numerical condition is that the count value and a target count value satisfy a preset numerical relationship, the target count value being:
    n*m*a-1+k0;
    where n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value;
    in the case where the adjustment manner of the first adjustment is to increase the count value and the adjustment manner of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value;
    in the case where the adjustment manner of the first adjustment is to decrease the count value and the adjustment manner of the second adjustment is to increase the count value, the numerical relationship is that the count value is less than the target count value.
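The target count value and the two comparison directions of claim 7 can be written out directly. The sketch below only transcribes the formula and the relationships as recited; the function names `target_count` and `condition_met` and the boolean flag `trigger_increases` are hypothetical.

```python
def target_count(n, m, a, k0):
    """Target count value as recited in claim 7: n*m*a - 1 + k0."""
    return n * m * a - 1 + k0


def condition_met(count, target, trigger_increases=True):
    """Preset numerical relationship of claim 7.

    If triggers increase the count and waits decrease it, the count must be
    greater than the target; in the mirrored case it must be less than it.
    """
    return count > target if trigger_increases else count < target


target = target_count(n=2, m=3, a=1, k0=0)   # 2*3*1 - 1 + 0
released = condition_met(6, target)          # count above the target
blocked = condition_met(5, target)           # count equal to the target
```

Because the comparison is strict, a count value equal to the target does not satisfy the condition in either direction.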
  8. The instruction synchronization apparatus according to any one of claims 1 to 7, characterized in that the plurality of instruction queues include a waited-on queue and a waiting queue; the waited-on queue includes a plurality of trigger instructions, each trigger instruction in the same waited-on queue corresponds to one synchronization event, one synchronization event includes a corresponding wait instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter;
    a target wait instruction further follows a first trigger instruction in the waited-on queue;
    a target trigger instruction further follows a first wait instruction in the waiting queue.
  9. The instruction synchronization apparatus according to any one of claims 1 to 8, characterized in that the number of counters is greater than 1, and the trigger instructions and wait instructions included in each instruction queue each include identification information of a counter;
    the instruction queue is configured to send the trigger instructions and wait instructions it includes to the corresponding counters according to the identification information of the counters.
  10. The instruction synchronization apparatus according to any one of claims 2 to 9, characterized in that the instruction synchronization apparatus further comprises:
    an arbitration unit, coupled to the plurality of instruction queues and the execution unit, and configured to send the execution instructions sent by each of the plurality of instruction queues to the execution unit according to a preset priority.
  11. The instruction synchronization apparatus according to any one of claims 1 to 10, characterized in that the instruction synchronization apparatus further comprises:
    a multiplexer, coupled to the plurality of instruction queues, and configured to send the instructions to the corresponding instruction queues among the plurality of instruction queues.
  12. The instruction synchronization apparatus according to claim 11, characterized in that the trigger instruction and the wait instruction each include identification information of an instruction queue;
    the multiplexer is further configured to send the trigger instruction and the wait instruction to the corresponding instruction queue according to the identification information of the instruction queue.
  13. A chip, characterized in that the chip comprises:
    the instruction synchronization apparatus according to any one of claims 1 to 12.
  14. A computer device, characterized by comprising the chip according to claim 13.
  15. A data processing method, characterized in that it is applied to the instruction synchronization apparatus according to any one of claims 1 to 12, the method comprising:
    each of a plurality of instruction queues storing instructions, the instructions including execution instructions and further including at least one of a trigger instruction and a wait instruction;
    a counter coupled to the plurality of instruction queues performing a first adjustment on a count value in response to receiving the trigger instruction, and performing a second adjustment on the count value in response to receiving the wait instruction, the adjustment manner of the first adjustment being different from the adjustment manner of the second adjustment;
    each instruction queue sending the instructions located after the wait instruction in that queue when the count value satisfies a preset numerical condition, the preset numerical condition being determined based on an initial count value of the counter, the adjustment manner of the first adjustment, and the adjustment manner of the second adjustment.
PCT/CN2022/124511 2021-12-30 2022-10-11 Instruction synchronization apparatus, chip, computer device, and data processing method WO2023124370A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111652996.0 2021-12-30
CN202111652996.0A CN114265717A (en) 2021-12-30 2021-12-30 Instruction synchronization device, chip, computer equipment and data processing method

Publications (1)

Publication Number Publication Date
WO2023124370A1 true WO2023124370A1 (en) 2023-07-06

Family

ID=80831795

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/124511 WO2023124370A1 (en) 2021-12-30 2022-10-11 Instruction synchronization apparatus, chip, computer device, and data processing method

Country Status (2)

Country Link
CN (1) CN114265717A (en)
WO (1) WO2023124370A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114265717A (en) * 2021-12-30 2022-04-01 上海阵量智能科技有限公司 Instruction synchronization device, chip, computer equipment and data processing method
CN115390569A (en) * 2022-08-04 2022-11-25 上海布鲁可积木科技有限公司 Instruction synchronous processing method and system in hunting toy
CN117112025B (en) * 2023-10-18 2023-12-22 北京开源芯片研究院 Method, device, equipment and storage medium for executing instructions of processing component

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6581089B1 (en) * 1998-04-16 2003-06-17 Sony Corporation Parallel processing apparatus and method of the same
CN102207848A (en) * 2010-03-29 2011-10-05 索尼公司 Instruction fetch apparatus, processor and program counter addition control method
CN113138801A (en) * 2021-04-29 2021-07-20 上海阵量智能科技有限公司 Command distribution device, method, chip, computer equipment and storage medium
CN113138802A (en) * 2021-04-29 2021-07-20 上海阵量智能科技有限公司 Command distribution device, method, chip, computer equipment and storage medium
CN113778914A (en) * 2020-06-09 2021-12-10 华为技术有限公司 Apparatus, method, and computing device for performing data processing
CN114265717A (en) * 2021-12-30 2022-04-01 上海阵量智能科技有限公司 Instruction synchronization device, chip, computer equipment and data processing method

Also Published As

Publication number Publication date
CN114265717A (en) 2022-04-01

Similar Documents

Publication Publication Date Title
WO2023124370A1 (en) Instruction synchronization apparatus, chip, computer device, and data processing method
US9430411B2 (en) Method and system for communicating with non-volatile memory
US8453161B2 (en) Method and apparatus for efficient helper thread state initialization using inter-thread register copy
US9390033B2 (en) Method and system for communicating with non-volatile memory via multiple data paths
KR20140117578A (en) Multithreaded computing
CN112491426B (en) Service assembly communication architecture and task scheduling and data interaction method facing multi-core DSP
US20180275891A1 (en) Memory System with Latency Distribution Optimization and an Operating Method thereof
US20060146864A1 (en) Flexible use of compute allocation in a multi-threaded compute engines
US20120180056A1 (en) Heterogeneous Enqueuinig and Dequeuing Mechanism for Task Scheduling
US9377968B2 (en) Method and system for using templates to communicate with non-volatile memory
US10664282B1 (en) Runtime augmentation of engine instructions
US9304772B2 (en) Ordering thread wavefronts instruction operations based on wavefront priority, operation counter, and ordering scheme
US11256543B2 (en) Processor and instruction scheduling method
JP2002287957A (en) Method and device for increasing speed of operand access stage in cpu design using structure such as casche
WO2023125359A1 (en) Task processing method and apparatus
CN111158875A (en) Multi-module-based multi-task processing method, device and system
US10284501B2 (en) Technologies for multi-core wireless network data transmission
CN114371920A (en) Network function virtualization system based on graphic processor accelerated optimization
US11301255B2 (en) Method, apparatus, device, and storage medium for performing processing task
CN113296957A (en) Method and device for dynamically allocating network-on-chip bandwidth
EP3131004A1 (en) Processor and method
CN105723317B (en) Method and system for being communicated with nonvolatile memory
CN113439260A (en) I/O completion polling for low latency storage devices
US10901784B2 (en) Apparatus and method for deferral scheduling of tasks for operating system on multi-core processor
US9921891B1 (en) Low latency interconnect integrated event handling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913634

Country of ref document: EP

Kind code of ref document: A1