WO2023124370A1 - Instruction synchronization apparatus, chip, computer device and data processing method - Google Patents

Instruction synchronization apparatus, chip, computer device and data processing method

Info

Publication number
WO2023124370A1
WO2023124370A1 PCT/CN2022/124511 CN2022124511W WO2023124370A1 WO 2023124370 A1 WO2023124370 A1 WO 2023124370A1 CN 2022124511 W CN2022124511 W CN 2022124511W WO 2023124370 A1 WO2023124370 A1 WO 2023124370A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
queue
waiting
count value
adjustment
Prior art date
Application number
PCT/CN2022/124511
Other languages
English (en)
Chinese (zh)
Inventor
王文强
孙海涛
何博
徐宁仪
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023124370A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication

Definitions

  • the present disclosure relates to the technical field of artificial intelligence, and in particular to an instruction synchronization device, a chip and computer equipment, and a data processing method.
  • an embodiment of the present disclosure provides an instruction synchronization device, which includes: a counter and a plurality of instruction queues; the counter is coupled to the plurality of instruction queues; each instruction queue of the plurality of instruction queues is used to store instructions, and the instructions include execution instructions and at least one of a trigger instruction and a waiting instruction; the counter is used to perform a first adjustment on the count value in response to receiving the trigger instruction, and to perform a second adjustment on the count value in response to receiving the waiting instruction; the adjustment mode of the first adjustment is different from the adjustment mode of the second adjustment; wherein the instructions after the waiting instruction in an instruction queue are sent when the count value satisfies a preset numerical condition, and the preset numerical condition is determined based on the initial count value of the counter, the adjustment mode of the first adjustment, and the adjustment mode of the second adjustment.
  • the apparatus further includes an execution unit, which is coupled to the plurality of instruction queues and is used to execute the received execution instructions; each instruction queue is used to parse the stored instructions, send the parsed trigger instructions and waiting instructions to the counter, and send the parsed execution instructions to the execution unit.
  • one of the first adjustment and the second adjustment increases the count value and the other decreases it, the first adjustment using a first preset step size and the second adjustment using a second preset step size; the first preset step size is determined based on the number of waiting queues related to the target synchronization event in the plurality of instruction queues, and the second preset step size is determined based on the number of waited queues related to the target synchronization event in the plurality of instruction queues.
  • the trigger instruction carries the first preset step size
  • the wait instruction carries the second preset step size
  • the first preset step size is equal to the product of the number of waiting queues and a preset multiple
  • the second preset step size is equal to the product of the number of waited queues and the preset multiple
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • the preset numerical condition is: the count value and the target count value satisfy a preset numerical relationship, and the target count value is n*m*a-1+k0, where n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value; when the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is smaller than the target count value.
  • the plurality of instruction queues include a waiting queue and a waited queue; the waited queue includes a plurality of trigger instructions, and each trigger instruction in the same waited queue corresponds to one synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter; a target waiting instruction is also included after the first trigger instruction in the waited queue; a target trigger instruction is also included after the first waiting instruction in the waiting queue.
  • the number of counters is greater than 1; the trigger instructions and waiting instructions included in each instruction queue include identification information of a counter, and the instruction queue is used to send the trigger instructions and waiting instructions it contains to the corresponding counter according to the identification information of the counter.
  • the instruction synchronization device further includes: an arbitration unit, coupled to the plurality of instruction queues and the execution unit, configured to send the execution instructions sent by each of the plurality of instruction queues to the execution unit according to a preset priority.
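The arbitration described above can be illustrated with a small sketch. The text only says that a preset priority is used; the function name `arbitrate`, the convention that a smaller number means a higher priority, and the one-instruction-per-cycle behaviour are all assumptions made for illustration.

```python
def arbitrate(pending, priority):
    """Issue one execution instruction per cycle from the highest-priority non-empty queue."""
    queues = {name: list(items) for name, items in pending.items()}
    issued = []
    while any(queues.values()):
        # Assumption: a smaller number means a higher priority.
        winner = min((q for q in queues if queues[q]), key=lambda q: priority[q])
        issued.append((winner, queues[winner].pop(0)))
    return issued


pending = {"queue 1": ["add"], "queue 2": ["mul", "conv"]}
priority = {"queue 1": 1, "queue 2": 0}
issued = arbitrate(pending, priority)
```

A fixed priority like this can starve low-priority queues; whether the device guards against that is not stated in the text.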
  • the instruction synchronization device further includes: a multiplexer, coupled to the plurality of instruction queues, for sending the instruction to a corresponding instruction queue in the plurality of instruction queues.
  • both the trigger instruction and the waiting instruction include identification information of an instruction queue, and the multiplexer is also used to send the trigger instruction and the waiting instruction to the corresponding instruction queue according to the identification information of the instruction queue.
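As a rough software analogy for the multiplexer, the sketch below routes each incoming instruction to the queue named by its identification information. The dictionary fields `queue_id` and `op` and the function name `route` are hypothetical; the text does not define an instruction encoding.

```python
from collections import deque


def route(instructions, queue_ids):
    """Append each instruction to the queue selected by its queue_id, preserving arrival order."""
    queues = {qid: deque() for qid in queue_ids}
    for ins in instructions:
        queues[ins["queue_id"]].append(ins["op"])
    return queues


incoming = [
    {"queue_id": 2, "op": "instruction 1"},
    {"queue_id": 1, "op": "instruction 1"},
    {"queue_id": 2, "op": "T"},   # trigger instruction for the waited queue
    {"queue_id": 1, "op": "W"},   # waiting instruction for the waiting queue
]
queues = route(incoming, queue_ids=[1, 2])
```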
  • an embodiment of the present disclosure provides a chip, and the chip includes: the instruction synchronization device described in any embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a computer device, where the computer device includes: the chip described in any embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a data processing method, which is applied to the instruction synchronization device described in any embodiment of the present disclosure, and the method includes: each instruction queue in a plurality of instruction queues stores instructions, and the instructions include an execution instruction and at least one of a trigger instruction and a waiting instruction; a counter coupled to the plurality of instruction queues performs a first adjustment on the count value in response to receiving the trigger instruction, and performs a second adjustment on the count value in response to receiving the waiting instruction; the adjustment mode of the first adjustment is different from the adjustment mode of the second adjustment; when the count value satisfies a preset numerical condition, each instruction queue sends the instructions after the waiting instruction, and the preset numerical condition is determined based on the initial count value of the counter, the adjustment mode of the first adjustment, and the adjustment mode of the second adjustment.
  • the method further includes: each instruction queue parses the stored instructions, sends the parsed trigger instructions and waiting instructions to the counter, and sends the parsed execution instructions to execution unit.
  • the counter performing a first adjustment on the count value in response to receiving the trigger instruction, and performing a second adjustment on the count value in response to receiving the waiting instruction, includes: increasing the count value by the first preset step size in response to receiving the trigger instruction, and decreasing the count value by the second preset step size in response to receiving the waiting instruction; or, decreasing the count value by the first preset step size in response to receiving the trigger instruction, and increasing the count value by the second preset step size in response to receiving the waiting instruction.
  • the first preset step size is determined based on the number of waiting queues related to the target synchronization event in the plurality of instruction queues
  • the second preset step size is determined based on the number of waited queues related to the target synchronization event in the plurality of instruction queues.
  • the method further includes: acquiring, by the counter, the first preset step size carried in the trigger instruction, and acquiring the second preset step size carried in the waiting instruction.
  • the first preset step size is equal to the product of the number of waiting queues and a preset multiple
  • the second preset step size is equal to the product of the number of waited queues and the preset multiple
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • the preset numerical condition is: the count value and the target count value satisfy a preset numerical relationship, and the target count value is n*m*a-1+k0, where n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value.
  • when the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; when the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is smaller than the target count value.
  • the plurality of instruction queues include a waiting queue and a waited queue; the waited queue includes a plurality of trigger instructions, and each trigger instruction in the same waited queue corresponds to one synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter; a target waiting instruction is also included after the first trigger instruction in the waited queue; a target trigger instruction is also included after the first waiting instruction in the waiting queue; the first trigger instruction and the first waiting instruction are the instructions corresponding to synchronization events other than the last synchronization event.
  • the number of counters is greater than 1; the method further includes: each instruction queue acquires the identification information of the counters included in the triggering instructions and waiting instructions in the queue, and based on the triggering instructions in the queue The identification information of the counter included in the instruction and the waiting instruction sends the trigger instruction and the waiting instruction in the queue to the corresponding counter.
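When more than one counter is present, routing by counter identification information might look like the sketch below. The tuple layout `(kind, counter_id)`, the step size of 1, and the function name `apply_sync` are assumptions; the sketch also ignores the blocking behaviour of waiting instructions and only shows which counter each instruction adjusts.

```python
def apply_sync(instructions, num_counters):
    """Apply trigger/wait instructions to the counter selected by their counter_id."""
    counts = [0] * num_counters
    for kind, counter_id in instructions:
        if kind == "trigger":
            counts[counter_id] += 1   # first adjustment (assumed step size 1)
        elif kind == "wait":
            counts[counter_id] -= 1   # second adjustment (assumed step size 1)
        else:
            raise ValueError(f"not a synchronization instruction: {kind}")
    return counts


stream = [("trigger", 0), ("trigger", 1), ("wait", 0)]
counts = apply_sync(stream, num_counters=2)
```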
  • sending the parsed execution instructions to the execution unit includes: sending the parsed execution instructions to the arbitration unit, so that the arbitration unit sends the execution instructions sent by each instruction queue in the plurality of instruction queues to the execution unit.
  • the method further includes: each instruction queue acquires the instruction sent by the multiplexer, and stores the instruction sent by the multiplexer.
  • each instruction queue acquiring the instructions sent by the multiplexer includes: each instruction queue acquiring the instructions that the multiplexer sends to this queue based on the identification information of the instruction queue included in the instructions.
  • a trigger instruction and a waiting instruction are inserted into the instruction queues, and the count value of the counter is adjusted through the trigger instruction and the waiting instruction. The sending order of the instructions in the multiple instruction queues can therefore be controlled, so as to realize instruction synchronization among the multiple instruction queues.
  • the foregoing embodiments implement synchronization between instruction queues in a hardware manner, which improves the efficiency of the hardware system.
  • FIG. 1 is a schematic diagram of an instruction synchronization process.
  • FIG. 2 is a schematic structural diagram of an instruction synchronization device according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an instruction synchronization method according to an embodiment of the disclosure.
  • FIG. 4 is a schematic diagram of an instruction synchronization method according to another embodiment of the disclosure.
  • FIG. 5A is a schematic diagram of an instruction synchronization method when multiple synchronization events are mapped to the same synchronization counter according to an embodiment of the present disclosure.
  • FIG. 5B is a schematic diagram of the instruction sending sequence in the instruction queue in FIG. 5A .
  • FIG. 6 is a schematic structural diagram of an instruction synchronization device according to another embodiment of the present disclosure.
  • FIG. 7 is a flowchart of a data processing method according to an embodiment of the present disclosure.
  • although the terms first, second, third, etc. may be used in the present disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word “if” as used herein may be interpreted as “at” or “when” or “in response to a determination.”
  • the instructions may be issued to each instruction queue according to the preset distribution rules, and there may or may not be a dependency relationship between instructions in different instruction queues.
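  • when there is no dependency between the instructions in two instruction queues, the instructions in the two instruction queues can be sent in parallel, thereby increasing the degree of parallelism between the instructions; when there is a dependency between the instructions in the two instruction queues, the instructions need to be synchronized so that a dependent instruction is sent only after the instructions it depends on.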
  • FIG. 1 is a schematic diagram of an instruction synchronization process in some embodiments. Assume that the instructions q1 and q2 in instruction queue 1 depend on the instructions Q1, Q2 and Q3 in instruction queue 2, that is, the instructions q1 and q2 can only be sent after the instructions Q1, Q2 and Q3 have all been sent; the order in which the instructions are sent is then as shown on the time axis in the figure. Those skilled in the art can understand that the sending sequence shown in the figure is only an exemplary description. In practical applications, the instructions in the same instruction queue are sent sequentially in the order in which they are stored in that queue, and instructions in different instruction queues that have no dependency relationship may be sent in any order.
  • the sending time of the instruction Q4 may be earlier than the sending time of the instruction q3, or the instruction Q1 may be sent between the instruction q3 and the instruction q4.
  • because the instructions q1 and q2 depend on the instructions Q1, Q2 and Q3, the sending times of the instructions Q1, Q2 and Q3 are all earlier than those of both the instruction q1 and the instruction q2.
  • in one related technology, an instruction counter is deployed for each instruction queue to count the instructions sent by that instruction queue, and other instruction queues realize instruction synchronization by obtaining the count value of the instruction counter of the corresponding instruction queue.
  • in this instruction synchronization method there are only producers and no consumers (for example, the count value of the counter only ever increases and never decreases), so the counter is at risk of overflow, and the instructions in each instruction queue can only be instructions of the same type.
  • another related technology adopts a producer-consumer model, and deploys a state synchronization counter between every two instruction queues to realize instruction synchronization.
  • This technology deploys the state synchronization counters statically. Assuming that the number of instruction queues is N, the number of state synchronization counters to be deployed is N*(N-1). When N is large, the number of counters used is huge and takes up too many system resources.
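To give a feel for how quickly N*(N-1) grows, the following snippet evaluates the expression for a few queue counts. The function name and the sample values of N are chosen here for illustration only.

```python
def pairwise_counters(num_queues):
    """Counters needed when every ordered pair of instruction queues gets its own counter."""
    return num_queues * (num_queues - 1)


# Counter count for a few illustrative queue counts.
growth = {n: pairwise_counters(n) for n in (2, 4, 8, 16)}
```

With these sample values the requirement rises from 2 counters for 2 queues to 240 counters for 16 queues.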
  • an embodiment of the present disclosure provides an instruction synchronization device, see Figure 2 and Figure 6, the device includes:
  • a counter 201 and a plurality of instruction queues 202 coupled to the counter 201; each instruction queue in the plurality of instruction queues 202 (for example, instruction queue 1, instruction queue 2, instruction queue 3, instruction queue 4, etc.) is used to store instructions
  • the instructions include an execution instruction, and at least one of a trigger instruction Trigger and a waiting instruction Wait;
  • the counter 201 is used to perform a first adjustment on the count value in response to receiving the trigger instruction Trigger, and to perform a second adjustment on the count value in response to receiving the waiting instruction Wait; the adjustment mode of the first adjustment is different from the adjustment mode of the second adjustment;
  • the instructions located after the waiting instruction Wait in an instruction queue 202 are sent when the count value satisfies a preset numerical condition, and the preset numerical condition is determined based on the initial count value of the counter 201, the adjustment mode of the first adjustment and the adjustment mode of the second adjustment.
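A minimal software model of the counter 201 can make the roles of the two adjustments concrete. This is only a sketch under stated assumptions: the class name `SyncCounter`, its method names, and the choice of the increase-on-trigger mode are hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class SyncCounter:
    """Hypothetical model of counter 201 in the increase-on-trigger mode."""
    initial: int = 0        # initial count value
    trigger_step: int = 1   # first preset step size
    wait_step: int = 1      # second preset step size

    def __post_init__(self):
        self.count = self.initial

    def on_trigger(self):
        # First adjustment: a trigger instruction increases the count value.
        self.count += self.trigger_step

    def may_pass_wait(self, target):
        # Preset numerical condition in this mode: count value > target count value.
        return self.count > target

    def on_wait(self):
        # Second adjustment: a waiting instruction decreases the count value.
        self.count -= self.wait_step


counter = SyncCounter()
blocked_before = not counter.may_pass_wait(target=0)
counter.on_trigger()
released_after = counter.may_pass_wait(target=0)
counter.on_wait()
```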
  • the number of counters (also referred to as synchronous counters) 201 can be greater than or equal to 1, and the number of instruction queues 202 can be greater than or equal to 2.
  • An instruction queue 202 may be used to store execution instructions, and the execution instructions are used to instruct the execution unit to perform operations.
  • the execution instruction is sent from the instruction queue 202 to the execution unit 203 for execution.
  • the execution instruction may include, but is not limited to, at least one of various operation instructions such as an addition instruction, a multiplication instruction, and a convolution multiplication instruction.
  • the execution unit may perform operations on the acquired target data in response to the execution instruction.
  • the target data may be data in various forms such as image data, voice data, and text data.
  • Instruction queue 202 may be implemented as a memory.
  • Instructions in one or more instruction queues may have a dependency relationship with instructions in another one or more instruction queues.
  • a trigger instruction Trigger and a waiting instruction Wait may be inserted into the instruction queues.
  • the instructions in each instruction queue may have different dependencies at different times. For example, at time t1, the instructions in instruction queue 1 need to wait for the instructions in instruction queue 2 to be sent before they can be sent (this situation is described as instruction queue 1 waiting for instruction queue 2; instruction queue 1 is called the waiting queue, and instruction queue 2 is called the waited queue); at time t2, there is no dependency between the instructions in instruction queue 1 and the instructions in instruction queue 2; at time t3, instruction queue 2 needs to wait for instruction queue 1.
  • instruction queue 1 waiting for instruction queue 2 is called a synchronization event
  • the waiting queue involved in a synchronization event is called the waiting queue corresponding to the synchronization event
  • the waited queue involved in a synchronization event is called the waited queue corresponding to the synchronization event.
  • the same instruction queue may correspond to one or more synchronization events. For example, at a certain moment, an instruction in instruction queue 1 needs to wait for an instruction in instruction queue 2 to be sent before it can be sent; at another moment, an instruction in instruction queue 1 needs to wait for two instructions in instruction queue 3 to be sent before it can be sent; at yet another moment, three instructions in instruction queue 4 need to wait for two instructions in instruction queue 1 to be sent before they can be sent. In this example, instruction queue 1 corresponds to 3 synchronization events.
  • the instruction queue can store the instructions in the order in which they were received, and can also parse the stored instructions. Each instruction may include the type of the instruction, and the instruction queue may determine from the type whether the instruction is an execution instruction, a trigger instruction or a waiting instruction. Referring to FIG. 3, instruction 1, instruction 2, instruction 3, etc. represent execution instructions, and T and W represent a trigger instruction and a waiting instruction respectively; the instructions after the waiting instruction in instruction queue 1 can be sent only after all the instructions before the trigger instruction in instruction queue 2 have been sent. After the instruction queue parses out an execution instruction, it can send the execution instruction to the execution unit; after parsing out a trigger instruction or a waiting instruction, it can send the trigger instruction or the waiting instruction to the counter 201.
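The parsing step can be sketched as a simple dispatch on instruction type. The `Kind` enum, the tuple representation of an instruction, and the function name `dispatch` are hypothetical; the sketch only shows where each parsed instruction is sent and deliberately leaves out the blocking effect of a waiting instruction.

```python
from enum import Enum


class Kind(Enum):
    EXEC = "exec"
    TRIGGER = "trigger"
    WAIT = "wait"


def dispatch(queue):
    """Split a stored instruction list by destination, preserving stored order."""
    to_execution_unit, to_counter = [], []
    for kind, name in queue:
        if kind is Kind.EXEC:
            to_execution_unit.append(name)   # execution instructions go to the execution unit
        else:
            to_counter.append(name)          # trigger/waiting instructions go to the counter
    return to_execution_unit, to_counter


queue2 = [
    (Kind.EXEC, "instruction 1"),
    (Kind.EXEC, "instruction 2"),
    (Kind.TRIGGER, "T"),
    (Kind.EXEC, "instruction 3"),
]
exec_out, counter_out = dispatch(queue2)
```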
  • the triggering instruction and the waiting instruction can respectively trigger the counting value of the counter to be adjusted in a first adjustment manner and a second adjustment manner. For example, by sending a trigger instruction to the counter, the counter can be triggered to increase the count value, and by sending a wait instruction to the counter, the counter can be triggered to decrease the count value. Alternatively, by sending a trigger instruction to the counter, the counter can be triggered to decrease the count value, and by sending a wait instruction to the counter, the counter can be triggered to increase the count value.
  • one of the first adjustment and the second adjustment increases the count value and the other decreases it, the first adjustment using a first preset step size and the second adjustment using a second preset step size.
  • each trigger instruction may trigger the counter to add 1 to the count value
  • each waiting instruction may trigger the counter to decrease the count value by 1.
  • other positive integers may also be used as the first preset step size and the second preset step size.
  • both the first preset step size and the second preset step size may be 2.
  • each trigger instruction can trigger the counter to add 2 to the count value
  • each waiting instruction can trigger the counter to decrease the count value by 2.
  • the first preset step size and the second preset step size may also be unequal. Specifically, the first preset step size is determined based on the number of waiting queues related to the target synchronization event in the multiple instruction queues, and the second preset step size is determined based on the number of waited queues related to the target synchronization event in the multiple instruction queues.
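  • for example, when n waiting queues and m waited queues are related to the target synchronization event, the first preset step size can be set to n*a and the second preset step size can be set to m*a, where a is a positive integer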
  • the trigger instruction carries the first preset step size
  • the wait instruction carries the second preset step size. Since the instruction carries the step size information, the counter can directly read the corresponding step size information from the above instruction after receiving the trigger instruction or the waiting instruction, so as to determine the step size for increasing or decreasing the count value.
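The step-size rule above can be written out as a short calculation. In this sketch, `n_waiting` and `m_waited` stand for the numbers of waiting and waited queues of one synchronization event and `a` for the preset multiple; the function name and the dictionary layout of the instructions are assumptions made for illustration.

```python
def step_sizes(n_waiting, m_waited, a=1):
    """Return (first preset step size, second preset step size) as (n*a, m*a)."""
    if n_waiting < 1 or m_waited < 1 or a < 1:
        raise ValueError("n, m and a must be positive integers")
    return n_waiting * a, m_waited * a


first_step, second_step = step_sizes(n_waiting=2, m_waited=3)

# Each instruction carries its own step size, so the counter can read it directly.
trigger = {"type": "trigger", "step": first_step}
wait = {"type": "wait", "step": second_step}
```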
  • the trigger instruction may also carry identification information used to characterize the adjustment mode of the first adjustment
  • the waiting instruction may also carry identification information used to characterize the adjustment mode of the second adjustment, so that the counter can determine which of the two adjustment modes to use when adjusting the count value.
  • the instructions after the waiting instruction in an instruction queue are sent only when the count value of the counter satisfies the preset numerical condition, so that the sending order of the instructions in the instruction queues can be controlled based on the count value of the counter, thereby synchronizing the instructions among the plurality of instruction queues.
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • the preset numerical condition may be that the count value of the counter and the target count value satisfy a preset numerical relationship.
  • the product of the first preset step size, the second preset step size and a preset multiple may be computed, the product may be summed with the initial count value, and the target count value may be determined based on the result of the sum.
  • denoting the first preset step size as n, the second preset step size as m, the preset multiple as a, and the initial count value as k0, the target count value can be written as n*m*a-1+k0.
  • the preset multiple is a positive integer.
  • when the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value
  • when the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is less than the target count value.
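The target count value and the two numerical relationships can be checked with a few lines of code. The symbols n, m, a and k0 follow the text; the function names and the boolean flag `trigger_increases` are hypothetical.

```python
def target_count(n, m, a=1, k0=0):
    """Target count value n*m*a-1+k0, using the symbols from the text."""
    return n * m * a - 1 + k0


def condition_met(count, target, trigger_increases=True):
    """Greater-than when triggers increase the count value, less-than when they decrease it."""
    return count > target if trigger_increases else count < target


simple_target = target_count(n=1, m=1)     # one queue waits for one queue
fanout_target = target_count(n=2, m=3)     # n = 2, m = 3
reverse_target = target_count(n=1, m=1, k0=5)
```

For n = m = a = 1 and k0 = 0 the target is 0, so a waiting instruction passes once a single trigger has raised the count value to 1.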
  • both m and n are equal to 1, and it is assumed that the preset multiple a is also equal to 1, and the initial value of the counter is 0.
  • Instruction 4 and instruction 5 in instruction queue 1 need to wait for instruction 1 and instruction 2 in instruction queue 2 to be sent before being sent. Therefore, the trigger instruction T can be inserted after the instruction 2 in the instruction queue 2, and the waiting instruction W can be inserted before the instruction 4 in the instruction queue 1.
  • instruction queue 2 can parse the instructions in this queue and send the parsed instructions in sequence.
  • instruction 1 and instruction 2 are execution instructions, which can be sent to the execution unit in sequence. At the same time, instruction queue 1 can send the execution instructions in its own queue (i.e., instruction 1, instruction 2 and instruction 3) in parallel.
  • Instructions with the same label in the two instruction queues above, such as instruction 1 and instruction 2, may be the same instruction or different instructions.
  • the label of an instruction is only used to indicate the relative position of the instruction in the instruction queue to which it belongs; it is not intended to indicate the content or type of the instruction. Since the sending and completion times of instructions without dependencies in different instruction queues are random, instruction 3 in instruction queue 1 may be sent before instruction 2 in instruction queue 2, or instruction 2 in instruction queue 2 may be sent before instruction 3 in instruction queue 1.
  • when instruction queue 2 parses out the trigger instruction T, it sends the trigger instruction T to the counter.
  • when the counter receives the trigger instruction T, it increments the count value by 1.
  • when instruction queue 1 parses out the waiting instruction W, it can read the count value of the counter. Only when the count value of the counter is greater than 0 does instruction queue 1 send the waiting instruction W and the instructions after it; otherwise neither the waiting instruction W nor the instructions following it are sent. When the counter receives the waiting instruction W, it decrements the count value by 1.
  • a synchronization event is completed. If there are other synchronization events between the two instruction queues, continue to perform instruction synchronization according to the above process.
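The one-to-one case above can be replayed with a small deterministic simulation. The queue contents follow the description (T after instruction 2 in instruction queue 2, W before instruction 4 in instruction queue 1); the scheduling `order`, which lets instruction queue 1 try to send twice per round so that it reaches W early, is an assumption made to exercise the blocking path.

```python
def run(queues, order):
    """Send instructions queue by queue; a waiting instruction W is held until count > 0."""
    count = 0                        # counter starts at the initial value 0
    pos = {q: 0 for q in queues}     # index of the next instruction in each queue
    sent = []                        # execution instructions in sending order
    while any(pos[q] < len(queues[q]) for q in queues):
        progressed = False
        for q in order:
            if pos[q] >= len(queues[q]):
                continue
            ins = queues[q][pos[q]]
            if ins == "W" and not count > 0:
                continue             # blocked: the counter has not been triggered yet
            if ins == "T":
                count += 1           # trigger instruction: increment by 1
            elif ins == "W":
                count -= 1           # waiting instruction: decrement by 1
            else:
                sent.append((q, ins))
            pos[q] += 1
            progressed = True
        if not progressed:
            raise RuntimeError("no queue can make progress")
    return sent, count


queues = {"queue 1": [1, 2, 3, "W", 4, 5], "queue 2": [1, 2, "T", 3]}
sent, final_count = run(queues, order=["queue 1", "queue 1", "queue 2"])
```

Even though instruction queue 1 runs ahead, its instructions 4 and 5 appear in `sent` only after instruction 2 of instruction queue 2, and the counter ends at its initial value.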
  • both m and n are positive integers greater than 1, and m and n may not be equal.
  • This situation is called n instruction queues waiting for m instruction queues.
  • instruction 4 and instruction 5 in each waiting queue (that is, instruction queue A1 to instruction queue An above the counter) wait for instruction 3 in all waited queues (that is, instruction queue B1 to instruction queue Bm below the counter)
  • each time a trigger instruction is received, the count value of the counter can be increased by n, so that after all waited queues have sent their trigger instructions to the counter, the count value of the counter is n*m.
  • Each waiting queue can read the count value of the counter when it parses out the waiting instruction. If the count value is greater than n*m-1, the waiting queue can send the waiting instruction, so that the counter decrements the count value by m; after all the waiting queues have sent their waiting instructions, the count value of the counter returns to 0. In this way, a synchronization event is completed.
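The arithmetic of the n-waits-for-m case can be traced as follows. The sketch assumes that the release condition is evaluated once, after all m trigger instructions have arrived and before any waiting instruction is applied; the text does not spell out that ordering, and the function name `sync_event_counts` is hypothetical.

```python
def sync_event_counts(n, m, k0=0):
    """Trace the count value through one synchronization event (n waiting, m waited queues)."""
    count = k0
    history = []
    for _ in range(m):               # each of the m waited queues sends one trigger
        count += n                   # trigger step = number of waiting queues
        history.append(count)
    # Assumption: the condition is checked here, before any decrement is applied.
    released = count > n * m - 1 + k0
    for _ in range(n):               # each of the n waiting queues sends one wait
        count -= m                   # wait step = number of waited queues
        history.append(count)
    return released, history


released, history = sync_event_counts(n=2, m=3)
```

For n = 2 and m = 3 the count value climbs to 6 and then falls back to 0, matching the n*m peak and the return to the initial value described above.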
  • The synchronization process of the synchronization event is similar to the embodiment shown in FIG. 3 and FIG. 4, and is not repeated here.
  • The only difference is that after the counter receives a trigger instruction, it subtracts the corresponding value from the initial count value (for example, 5), and after the counter receives a waiting instruction, it increases the count value by the corresponding value. Only when the count value is less than the initial count value can the instructions after the waiting instruction continue to be sent. In this way, a synchronization event is also completed.
  • a trigger instruction in a later synchronization event may falsely trigger a waiting instruction in a previous synchronization event.
  • As shown in FIG. 5A and FIG. 5B, suppose there are two synchronization events, namely: (1) instruction queues 1 and 2 wait for instruction queues 3, 4, and 5; and (2) instruction queues 1 and 2 wait for instruction queue 3.
  • T1 in instruction queues 3, 4, and 5 and W1 in instruction queues 1 and 2 correspond to the first synchronization event
  • T2 in instruction queue 3 and W2 in instruction queues 1 and 2 correspond to the second synchronization event.
  • The instruction queues obtained by inserting trigger instructions and waiting instructions as in the foregoing embodiments are shown in Case 1 in FIG.
  • The time at which instruction queue 3 sends the trigger instruction T2 may be earlier than the time at which instruction queue 4 sends the trigger instruction T1; the sending sequence of the instructions is shown in the sending order for Case 1 in FIG. 5B.
  • The sending order of the trigger instructions in instruction queues 3, 4, and 5 is shown in the figure.
  • Here n is 2 and m is 3; that is, when the count value of the counter is greater than 5, the waiting instruction in a waiting queue and the execution instructions after the waiting instruction can be sent.
  • If T2 of instruction queue 3 is sent before T1 of instruction queue 4, then after instruction queue 4 sends T1 the count value of the counter reaches 6, which triggers the waiting queues to send the related instructions.
  • However, T1 of instruction queue 5 has not yet been sent, so the related instructions in the waiting queues do not actually meet the sending condition. It can be seen that, in this case, T2 in instruction queue 3 falsely triggers W1 of synchronization event 1.
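The false trigger can be reproduced with the numbers given above (n = 2, m = 3, threshold 5). The sketch below is illustrative only; the function name `released_too_early` and the tuple encoding of the send order are assumptions.

```python
N_WAITING = 2                            # instruction queues 1 and 2
M_AWAITED = 3                            # instruction queues 3, 4 and 5
THRESHOLD = N_WAITING * M_AWAITED - 1    # waiting queues send when count > 5


def released_too_early(send_order):
    """Check whether the count crosses the threshold before every T1 arrived.

    send_order is a list of (queue_id, label) trigger instructions in the
    order in which the counter receives them; each one adds N_WAITING.
    """
    count = 0
    t1_seen = set()
    for queue_id, label in send_order:
        count += N_WAITING
        if label == "T1":
            t1_seen.add(queue_id)
        if count > THRESHOLD:
            # Released: the release is false if some awaited queue has not
            # yet sent its T1 for the first synchronization event.
            return t1_seen != {3, 4, 5}
    return False


# T2 of queue 3 overtakes T1 of queues 4 and 5 (the Case 1 ordering).
bad = released_too_early([(3, "T1"), (3, "T2"), (4, "T1"), (5, "T1")])
# All three T1 instructions arrive before T2.
good = released_too_early([(3, "T1"), (4, "T1"), (5, "T1"), (3, "T2")])
```

In the first ordering the count reaches 6 after only queues 3 and 4 have sent T1, which is the false trigger the text describes.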
  • The plurality of instruction queues in the embodiments of the present disclosure include a waiting queue and an awaited queue (the queue being waited for). The awaited queue includes a plurality of trigger instructions, and each trigger instruction in the same awaited queue corresponds to one synchronization event; a synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter. A target waiting instruction is further included after the first trigger instruction in the awaited queue, and a target trigger instruction is further included after the first waiting instruction in the waiting queue.
  • The number of awaited queues can be greater than or equal to 1, and the number of waiting queues can also be greater than or equal to 1.
  • The first trigger instructions corresponding to different synchronization events can come from one or more awaited queues, and the corresponding first waiting instructions may come from one or more waiting queues.
  • The first trigger instruction corresponding to the first synchronization event, the first trigger instruction corresponding to the second synchronization event, ..., and the first trigger instruction corresponding to the (N-1)th synchronization event are each followed by a target waiting instruction; and the first waiting instruction corresponding to the first synchronization event, the first waiting instruction corresponding to the second synchronization event, ..., and the first waiting instruction corresponding to the (N-1)th synchronization event are each followed by a target trigger instruction.
  • The first trigger instruction corresponding to the Nth synchronization event may or may not be followed by a target waiting instruction
  • The first waiting instruction corresponding to the Nth synchronization event may or may not be followed by a target trigger instruction.
  • Each of the multiple awaited queues may include a target waiting instruction after the first trigger instruction in that queue.
  • a synchronization event is that instruction 1 in instruction queue 2 waits for instruction 2 in instruction queue 3 and instruction 3 in instruction queue 4
  • Both instruction queue 3 and instruction queue 4 include the first trigger instruction corresponding to the synchronization event, so the first trigger instruction in instruction queue 3 is followed by a target waiting instruction, and the first trigger instruction in instruction queue 4 is also followed by a target waiting instruction.
  • Each of the multiple waiting queues may include a target trigger instruction after the first waiting instruction in that queue.
  • a synchronization event is that instruction 1 in instruction queue 1 and instruction 1 in instruction queue 2 wait for instruction 2 in instruction queue 3
  • Both instruction queue 1 and instruction queue 2 include the first waiting instruction corresponding to the synchronization event, so the first waiting instruction in instruction queue 1 is followed by a target trigger instruction, and the first waiting instruction in instruction queue 2 is also followed by a target trigger instruction.
  • the awaited queue Mi includes a target waiting instruction after the first trigger instruction corresponding to the synchronization event S; if the synchronization event S is the last trigger event in an awaited queue Mj, then no target waiting instruction is included after the first trigger instruction corresponding to the event S in the awaited queue Mj.
  • the waiting queue Ni includes a target trigger instruction after the first waiting instruction corresponding to the synchronization event S; if the synchronization event S is the last waiting event in a waiting queue Nj, then no target trigger instruction is included after the first waiting instruction corresponding to the synchronization event S in the waiting queue Nj, as shown in Case 2 in FIG. 5A.
  • The first trigger instruction is T1 in instruction queue 3, and the target waiting instruction is the W after T1 in instruction queue 3.
  • The first waiting instructions are W1 in instruction queue 1 and W1 in instruction queue 2
  • The above-mentioned target trigger instructions are the T after W1 in instruction queue 1 and the T after W1 in instruction queue 2.
  • T1 in instruction queue 3 may be sent first. However, because a W is inserted after T1 in instruction queue 3, once T1 in instruction queue 3 has been sent while T1 in instruction queues 4 and 5 has not, the W in instruction queue 3 does not meet the sending condition, so instruction queue 3 enters a waiting state after sending T1. Similarly, W1 in instruction queue 1 and W1 in instruction queue 2 cause instruction queues 1 and 2 to enter the waiting state as well. Therefore, T1 in instruction queue 4 and T1 in instruction queue 5 are sent first. In this way, it is guaranteed that the trigger instruction of the second synchronization event takes effect only after the waiting instructions of the first synchronization event have occurred normally.
  • The number of synchronization events mapped to the same counter is 2; that is, there is at least one awaited queue (instruction queue 3 in the figure) that corresponds to two synchronization events.
  • The number of synchronization events mapped to the same counter may also be greater than 2, and there may be more than one awaited queue that corresponds to multiple synchronization events.
  • Although the waiting queues in the two synchronization events shown in the figure are the same (both are instruction queue 1 and instruction queue 2), in practical applications the waiting queues in different events may also be partly the same or totally different.
  • The target waiting instruction after the first trigger instruction does not have to be adjacent to the first trigger instruction, as long as it is located between that first trigger instruction and the next first trigger instruction after it.
  • The target trigger instruction after the first waiting instruction does not have to be adjacent to the first waiting instruction, as long as it is located between that first waiting instruction and the next first waiting instruction after it.
  • The W after T1 can be at any position between T1 and T2.
  • The T after W1 can be at any position between W1 and W2.
  • false triggering of the synchronization event can also be avoided by inserting a target waiting instruction and a target triggering instruction, which will not be repeated here.
  • the number of counters 201 is greater than one.
  • The trigger instructions and waiting instructions included in each instruction queue may include identification information of a counter, so that the instruction queue can send each trigger instruction and waiting instruction to the corresponding counter.
  • The identification information of the trigger instruction and the identification information of the waiting instruction may each be bound to the identification information of the counter, so that the instruction queue sends the trigger instruction and the waiting instruction to the corresponding counter.
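Routing by counter identifier can be pictured with a minimal sketch. The class names `SyncInstruction` and `CounterBank`, the field names, and the fixed step of 1 are all hypothetical; the text only states that the instructions carry identification information of the counter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncInstruction:
    kind: str        # "trigger" or "wait" (hypothetical encoding)
    counter_id: int  # identification information of the target counter


class CounterBank:
    """A set of counters addressed by the identifier carried in an instruction."""

    def __init__(self, num_counters):
        self.values = [0] * num_counters

    def dispatch(self, instr):
        # Send the instruction to the counter named by its counter_id.
        if instr.kind == "trigger":
            self.values[instr.counter_id] += 1
        elif instr.kind == "wait":
            self.values[instr.counter_id] -= 1
        else:
            raise ValueError(f"not a synchronization instruction: {instr.kind}")


bank = CounterBank(num_counters=2)
bank.dispatch(SyncInstruction("trigger", counter_id=1))
```

After this call only counter 1 has changed, so unrelated synchronization events mapped to counter 0 are unaffected.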
  • the instruction synchronization device further includes an execution unit 203, configured to execute the received execution instruction.
  • the number of execution units may be greater than or equal to 1, and one execution unit may receive and process execution instructions of one or more instruction queues.
  • an execution unit may be a subunit in a processing unit capable of executing instructions, and the processing unit may be divided into multiple groups of execution units according to different granularities, and each group of execution units may be used to execute a processing task (for example, addition operation). In different situations, different partition granularities can be adopted according to different actual needs.
  • This division method has high flexibility. When a task requires a small number of execution units, the execution units can be divided into more groups, thereby improving the parallelism of task processing.
  • After the execution unit finishes processing an instruction, it can also return an acknowledgment signal (ACK) to the arbitration unit, so that the arbitration unit can continue to send new instructions.
  • the instruction synchronization device further includes an arbitration unit 204, configured to send the execution instructions sent by each instruction queue in the plurality of instruction queues to the execution unit according to a preset priority.
  • the number of arbitration units 204 may be greater than or equal to 1.
  • FIG. 6 shows that the number of arbitration units is equal to the number of instruction queues, but in practical applications, the numbers of the two may not be equal.
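One possible reading of "preset priority" is a fixed priority order over the instruction queues. The sketch below assumes that reading; the function name `arbitrate` and the dictionary of pending instructions are illustrative and not taken from the disclosure.

```python
def arbitrate(pending, priority):
    """Pick the execution instruction to forward to the execution unit.

    pending maps a queue id to the execution instruction at the head of that
    queue (or None if the queue has nothing to send). priority lists queue
    ids from highest to lowest priority. Returns (queue_id, instruction),
    or None when no queue has anything to send.
    """
    for queue_id in priority:
        instr = pending.get(queue_id)
        if instr is not None:
            return queue_id, instr
    return None


pending = {0: None, 1: "add", 2: "mul"}
choice = arbitrate(pending, priority=[2, 0, 1])
```

Other policies, such as round-robin, would fit the same interface; the text does not say which one is used.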
  • the instruction synchronization device further includes a multiplexer 205, configured to send the instruction to a corresponding instruction queue in the plurality of instruction queues.
  • Both the trigger instruction and the waiting instruction include identification information of an instruction queue, so that the multiplexer 205 can send the trigger instruction and the waiting instruction to the corresponding instruction queue.
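The multiplexer's role can be sketched as a dispatch on the queue identifier carried by each instruction. The class name `Multiplexer` and the `queue_id` key are hypothetical names used only for illustration.

```python
from collections import deque


class Multiplexer:
    """Delivers each incoming instruction to the queue named in the instruction."""

    def __init__(self, num_queues):
        self.queues = [deque() for _ in range(num_queues)]

    def route(self, instr):
        # instr is a dict; "queue_id" stands for the identification
        # information of the instruction queue carried by the instruction.
        self.queues[instr["queue_id"]].append(instr)


mux = Multiplexer(num_queues=2)
mux.route({"queue_id": 1, "op": "T"})
mux.route({"queue_id": 0, "op": "W"})
```

Because instructions are appended in arrival order, each queue preserves the order in which the compiler emitted its instructions.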
  • the present disclosure provides a hardware-implemented dynamic deployment mechanism of the instruction queue state synchronization counter, which saves counter overhead and eliminates the risk of counter overflow; at the same time, the dynamic deployment of the counter is implemented by hardware, and efficient and flexible multi-process scheduling is also realized.
  • the embodiments of the present disclosure have strong scalability, and the number of counters, the number of instruction queues, and the number of execution units can be adjusted according to requirements.
  • the instruction synchronization device of the embodiments of the present disclosure can be applied to processing chips such as artificial intelligence chips, graphics processing chips, etc., to realize efficient and flexible instruction queue deployment and scheduling, thereby improving the parallel efficiency of execution units.
  • Various instructions in the above embodiments can be generated in advance by offline compilation. Each instruction can be sequentially generated and sent to the instruction queue according to the required order during offline compilation.
  • the present disclosure further provides a chip, the chip including the instruction synchronization device described in any embodiment of the present disclosure.
  • the aforementioned chips may be artificial intelligence chips or graphics processing chips, or other types of processing chips.
  • For the instruction synchronization device in this chip embodiment, reference may be made to the aforementioned embodiments of the instruction synchronization device, and details are not repeated here.
  • An embodiment of the present disclosure further provides a computer device, where the computer device includes the chip described in any embodiment of the present disclosure.
  • For the specific functions of the chip, reference may be made to the description of the chip embodiments above; for the sake of brevity, details are not repeated here.
  • an embodiment of the present disclosure also provides a data processing method, which is applied to the instruction synchronization device described in any embodiment of the present disclosure, and the method includes:
  • Step 701: Each instruction queue in the plurality of instruction queues stores instructions, the instructions including an execution instruction and at least one of a trigger instruction and a waiting instruction;
  • Step 702: The counter coupled to the plurality of instruction queues performs a first adjustment on the count value in response to receiving a trigger instruction, and performs a second adjustment on the count value in response to receiving a waiting instruction; the adjustment mode of the first adjustment is different from the adjustment mode of the second adjustment;
  • Step 703: In the case that the count value satisfies a preset numerical condition, each instruction queue sends the instructions after the waiting instruction in the queue, where the preset numerical condition is determined based on the initial count value of the counter, the adjustment mode of the first adjustment, and the adjustment mode of the second adjustment.
  • The method further includes: each instruction queue parses the stored instructions, sends the parsed trigger instructions and waiting instructions to the counter, and sends the parsed execution instructions to an execution unit.
  • The counter performing a first adjustment on the count value in response to receiving the trigger instruction, and performing a second adjustment on the count value in response to receiving the waiting instruction, includes: increasing the count value by a first preset step size in response to receiving the trigger instruction, and decreasing the count value by a second preset step size in response to receiving the waiting instruction; or, decreasing the count value by the first preset step size in response to receiving the trigger instruction, and increasing the count value by the second preset step size in response to receiving the waiting instruction.
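The two adjustment modes can be captured by one small class with a direction flag. This is a sketch under assumptions: the class name `SyncCounter`, its parameters, and the flag `trigger_increases` are illustrative and do not appear in the disclosure.

```python
class SyncCounter:
    """Counter whose trigger and wait adjustments move in opposite directions."""

    def __init__(self, initial, trigger_step, wait_step, trigger_increases=True):
        self.value = initial
        sign = 1 if trigger_increases else -1
        self.trigger_delta = sign * trigger_step   # first adjustment
        self.wait_delta = -sign * wait_step        # second adjustment

    def on_trigger(self):
        self.value += self.trigger_delta

    def on_wait(self):
        self.value += self.wait_delta


# Mode 1: trigger increases, wait decreases (initial value 0).
up = SyncCounter(initial=0, trigger_step=1, wait_step=1)
up.on_trigger()
up.on_wait()

# Mode 2: trigger decreases, wait increases (initial value 5).
down = SyncCounter(initial=5, trigger_step=1, wait_step=1, trigger_increases=False)
down.on_trigger()
down.on_wait()
```

In both modes one trigger followed by one wait returns the counter to its initial value, which is what lets the same counter be reused for the next synchronization event.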
  • The first preset step size is determined based on the number of waiting queues related to the target synchronization event in the plurality of instruction queues
  • The second preset step size is determined based on the number of awaited queues (the queues being waited for) related to the target synchronization event in the plurality of instruction queues.
  • The method further includes: acquiring, by the counter, the first preset step size carried in the trigger instruction, and acquiring the second preset step size carried in the waiting instruction.
  • The first preset step size is equal to the product of the number of waiting queues and a preset multiple
  • The second preset step size is equal to the product of the number of awaited queues and the preset multiple.
  • the preset numerical condition is jointly determined based on the initial count value, the first preset step size, the second preset step size, and a preset multiple.
  • The preset numerical condition is: the count value and a target count value satisfy a preset numerical relationship, and the target count value is n*m*a-1+k0, where n is the first preset step size, m is the second preset step size, a is the preset multiple, and k0 is the initial count value.
  • In the case where the adjustment mode of the first adjustment is to increase the count value and the adjustment mode of the second adjustment is to decrease the count value, the numerical relationship is that the count value is greater than the target count value; in the case where the adjustment mode of the first adjustment is to decrease the count value and the adjustment mode of the second adjustment is to increase the count value, the numerical relationship is that the count value is smaller than the target count value.
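The target count value can be worked through numerically for the increasing mode. The sketch below reads n as the number of waiting queues and m as the number of awaited queues, as in the earlier n-waits-for-m embodiment, with each trigger adding n*a and each wait subtracting m*a; that mapping of symbols and the function names are assumptions. The decreasing mode is not covered here.

```python
def target_count(n, m, a, k0):
    """Target count value n*m*a - 1 + k0 from the text."""
    return n * m * a - 1 + k0


def count_after_triggers(n, a, k0, triggers):
    """Count after `triggers` trigger instructions, each adding n*a."""
    return k0 + triggers * n * a


def may_send(count, n, m, a, k0):
    """Increasing mode: send only when the count exceeds the target."""
    return count > target_count(n, m, a, k0)


n, m, a, k0 = 2, 3, 2, 5
two = count_after_triggers(n, a, k0, triggers=2)    # one trigger still missing
three = count_after_triggers(n, a, k0, triggers=3)  # all m triggers received
after_waits = three - n * (m * a)                   # n waits, each subtracts m*a
```

With these values the target is 16; two triggers give 13 (not enough), three give 17 (condition met), and the n waiting instructions bring the count back to k0 = 5.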
  • The plurality of instruction queues include a waiting queue and an awaited queue (the queue being waited for); the awaited queue includes a plurality of trigger instructions, and each trigger instruction in the same awaited queue corresponds to one synchronization event. A synchronization event includes a corresponding waiting instruction and a corresponding trigger instruction, and the plurality of trigger instructions are used to adjust the count value of the same counter. A target waiting instruction is further included after the first trigger instruction in the awaited queue, and a target trigger instruction is further included after the first waiting instruction in the waiting queue; the first trigger instruction and the first waiting instruction are instructions corresponding to synchronization events other than the last synchronization event.
  • The number of counters is greater than 1; the method further includes: each instruction queue acquires the identification information of the counter included in the trigger instructions and waiting instructions in the queue, and, based on that identification information, sends the trigger instructions and waiting instructions in the queue to the corresponding counter.
  • Sending the parsed execution instructions to the execution unit includes: sending the parsed execution instructions to the arbitration unit, so that the arbitration unit sends the execution instructions sent by each instruction queue in the plurality of instruction queues to the execution unit.
  • The method further includes: each instruction queue acquires the instructions sent by the multiplexer, and stores the instructions sent by the multiplexer.
  • Each instruction queue acquiring the instructions sent by the multiplexer includes: each instruction queue acquiring, based on the identification information of the instruction queue included in the instructions, the instructions sent by the multiplexer to that queue.
  • A typical implementing device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smart phone, personal digital assistant, media player, navigation device, e-mail device, game console, tablet computer, wearable device, or any combination of these devices.
  • Each embodiment in this specification is described in a progressive manner; the same and similar parts of the embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments.
  • the description is relatively simple, and for relevant parts, please refer to part of the description of the method embodiment.
  • the device embodiments described above are only illustrative, and the modules described as separate components may or may not be physically separated, and the functions of each module may be integrated in the same or multiple software and/or hardware implementations. Part or all of the modules can also be selected according to actual needs to achieve the purpose of the solution of this embodiment. It can be understood and implemented by those skilled in the art without creative effort.

Abstract

Embodiments of the present disclosure relate to an instruction synchronization apparatus, a chip, a computer device, and a data processing method. Trigger instructions and waiting instructions are inserted into instruction queues, a count value of a counter is adjusted according to the trigger instructions and the waiting instructions, and the instructions after the waiting instructions can be sent only when the count value satisfies a preset numerical condition. Consequently, the sending order of the instructions of the plurality of instruction queues can be controlled, so that instruction synchronization among the plurality of instruction queues is achieved. According to the embodiments, synchronization among the instruction queues is implemented in hardware, and the efficiency of the hardware system is improved.
PCT/CN2022/124511 2021-12-30 2022-10-11 Instruction synchronization apparatus, chip, computer device, and data processing method WO2023124370A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111652996.0 2021-12-30
CN202111652996.0A CN114265717A (zh) 2021-12-30 2021-12-30 Instruction synchronization apparatus, chip, computer device, and data processing method

Publications (1)

Publication Number Publication Date
WO2023124370A1 true WO2023124370A1 (fr) 2023-07-06

Family

ID=80831795

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/124511 WO2023124370A1 (fr) 2021-12-30 2022-10-11 Instruction synchronization apparatus, chip, computer device, and data processing method

Country Status (2)

Country Link
CN (1) CN114265717A (fr)
WO (1) WO2023124370A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
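CN114265717A (zh) * 2021-12-30 2022-04-01 上海阵量智能科技有限公司 Instruction synchronization apparatus, chip, computer device, and data processing method
CN115390569A (zh) * 2022-08-04 2022-11-25 上海布鲁可积木科技有限公司 Instruction synchronization processing method and system in a line-following toy
CN117112025B (zh) * 2023-10-18 2023-12-22 北京开源芯片研究院 Instruction execution method, apparatus, device, and storage medium for a processing component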

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6581089B1 (en) * 1998-04-16 2003-06-17 Sony Corporation Parallel processing apparatus and method of the same
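CN102207848A (zh) * 2010-03-29 2011-10-05 索尼公司 Instruction fetch apparatus, processor, and program counter addition control method
CN113138801A (zh) * 2021-04-29 2021-07-20 上海阵量智能科技有限公司 Command distribution apparatus, method, chip, computer device, and storage medium
CN113138802A (zh) * 2021-04-29 2021-07-20 上海阵量智能科技有限公司 Command distribution apparatus, method, chip, computer device, and storage medium
CN113778914A (zh) * 2020-06-09 2021-12-10 华为技术有限公司 Apparatus, method, and computing device for performing data processing
CN114265717A (zh) * 2021-12-30 2022-04-01 上海阵量智能科技有限公司 Instruction synchronization apparatus, chip, computer device, and data processing method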

Also Published As

Publication number Publication date
CN114265717A (zh) 2022-04-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913634

Country of ref document: EP

Kind code of ref document: A1