CN117407182A - Process synchronization method, system, equipment and medium based on Poll instruction - Google Patents


Info

Publication number
CN117407182A
Authority
CN
China
Prior art keywords
poll
instruction
variable value
read
poll instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311713764.0A
Other languages
Chinese (zh)
Other versions
CN117407182B (en
Inventor
钱龙
张博文
孔超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Muxi Integrated Circuit Nanjing Co ltd
Original Assignee
Muxi Integrated Circuit Nanjing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Muxi Integrated Circuit Nanjing Co ltd
Priority to CN202311713764.0A
Publication of CN117407182A
Application granted
Publication of CN117407182B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/526 Mutual exclusion algorithms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management

Abstract

The invention provides a process synchronization method, system, device, and medium based on a Poll instruction, belonging to the field of data processing. The method comprises: writing a Poll instruction into an instruction buffer of a first processor; sending the Poll instruction to a Poll state control module for parsing; reading variable data from a storage unit according to the Poll instruction information obtained by parsing; comparing the read first variable value with the reference variable value carried by the Poll instruction and judging whether the two satisfy the Poll instruction synchronization condition; if they do, judging whether the Poll instruction contains a flush identifier; if it does, writing the reference variable value carried by the Poll instruction into the storage unit and then returning a normal completion status to a second processor, and otherwise returning the normal completion status to the second processor directly; and the second processor choosing, based on the received completion status information, to continue writing Poll instructions or to stop writing them. By executing Poll instructions in parallel across multiple threads or processes, the method allows multiple processes to synchronize on the same variable in parallel without blocking one another and without depending on other processes.

Description

Process synchronization method, system, equipment and medium based on Poll instruction
Technical Field
Embodiments of the present disclosure relate to the field of data processing, and in particular to a process synchronization method, system, device, and medium based on Poll instructions.
Background
Heterogeneous computing mainly refers to joint computing performed by computing units with instruction sets of different architectures and types. Different computing units suit different computing scenarios, so to obtain higher computing performance, each type of computing task needs to be allocated to an appropriate computing unit. In the AI field there are different computing platforms such as CPU+GPU, CPU+FPGA, and CPU+NPU. For example, a CPU is suited to serial computing and logic scheduling, while specifically optimized accelerators such as a GPU, FPGA, or NPU are suited to parallel computing tasks such as matrix computation.
In the big data era, computing power has become the main driving force of the digital economy. With the increasing demand for large-scale data processing, the GPU, as a heterogeneous acceleration chip, has become an important component of computing infrastructure. GPU chips have a huge number of computing cores and powerful instruction sets, and are widely used in fields such as data centers and artificial intelligence. GPUs are commonly used to process massively parallel computing tasks in which multiple threads or processes execute simultaneously. These threads or processes need to be synchronized effectively to ensure data consistency and correctness and to avoid problems such as race conditions.
The prior art generally uses primitives such as semaphores and mutexes to achieve process synchronization. These traditional approaches are typically blocking synchronization operations: while one thread or process holds a mutex, other processes must wait. By locking and unlocking the mutex around a critical section, exclusive access to shared resources can be ensured. Developers and designers can implement blocking synchronization with basic synchronization primitives such as mutexes and condition variables, without introducing additional complex concepts and algorithms; especially for simple synchronization requirements and smaller-scale concurrent tasks, this provides a compact and reliable solution. However, in scenarios with high concurrency and strict real-time requirements, waiting for a synchronization event to complete with a blocking operation can leave a process idle for a long time, which lacks flexibility and reduces the efficiency of parallel computing. Particularly in a highly concurrent environment such as a GPU, blocking a large number of processes wastes computing resources, so conventional synchronization mechanisms may not meet the current demand for efficient synchronization.
Disclosure of Invention
The invention aims to provide a process synchronization method, system, device, and medium based on Poll instructions, so as to at least partially solve the above problems.
According to one aspect of the present disclosure, a method for process synchronization based on Poll instructions is provided, including:
S101, writing a Poll instruction into an instruction cache area of a first processor, wherein the Poll instruction is pre-constructed by a second processor;
S102, in response to the instruction cache area being non-empty, sending the Poll instruction in the instruction cache area to a Poll state control module;
S103, the Poll state control module parsing the Poll instruction and reading variable data from a storage unit according to the Poll instruction information obtained by parsing;
S104, comparing the read first variable value with a reference variable value carried by the Poll instruction and judging whether the two satisfy the Poll instruction synchronization condition; if so, proceeding to step S105, otherwise proceeding to step S106;
S105, judging whether the Poll instruction contains a flush identifier; if so, proceeding to step S1051, otherwise proceeding to step S1052;
S1051, writing the reference variable value carried by the Poll instruction into the storage unit, and proceeding to step S1052;
S1052, returning a normal completion status to the second processor, and proceeding to step S107;
S106, after waiting for the retry interval, reading the variable data from the storage unit again, until the number of retries reaches a preset value or the read data satisfies the synchronization condition; if the read data still does not satisfy the synchronization condition when the number of retries reaches the preset value, proceeding to step S1061; if the re-read data satisfies the synchronization condition, proceeding to step S105;
S1061, returning an abnormal completion status to the second processor, and proceeding to step S107;
S107, the second processor choosing, based on the received completion status information, to continue writing Poll instructions or to stop writing Poll instructions.
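The control flow of steps S103 to S1061 can be sketched in software as follows. This is only an illustrative model, not the hardware design: the names `PollInstruction`, `PollStatus`, and `execute_poll`, and the dictionary standing in for the storage unit, are assumptions introduced here, and the condition is assumed to be applied as `condition(read_value, reference)`.

```python
import operator
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class PollStatus(Enum):
    NORMAL = "normal completion"
    ABNORMAL = "abnormal completion"


@dataclass
class PollInstruction:
    # Hypothetical container for the parsed Poll instruction information.
    address: int
    reference: int
    condition: Callable[[int, int], bool]  # called as condition(read_value, reference)
    flush: bool = False
    max_retries: int = 0
    retry_interval_s: float = 0.0


def execute_poll(instr: PollInstruction, storage: Dict[int, int]) -> PollStatus:
    retries_done = 0
    while True:
        value = storage[instr.address]                    # S103 / S106: read the variable
        if instr.condition(value, instr.reference):       # S104: compare with the reference
            if instr.flush:                               # S105: flush identifier present?
                storage[instr.address] = instr.reference  # S1051: write the reference back
            return PollStatus.NORMAL                      # S1052
        if retries_done >= instr.max_retries:
            return PollStatus.ABNORMAL                    # S1061: retries exhausted
        time.sleep(instr.retry_interval_s)                # S106: wait for the retry interval
        retries_done += 1


# Example: the stored value 9 satisfies "read value >= reference", so 5 is flushed.
storage = {0x10: 9}
status = execute_poll(
    PollInstruction(address=0x10, reference=5, condition=operator.ge, flush=True),
    storage,
)
```

In this model a retry count of 0 means a single read, and a retry count of N allows at most N + 1 reads before the abnormal completion status is returned.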
In some embodiments, the first processor is a heterogeneous acceleration chip and the second processor is a CPU.
In some embodiments, the instruction cache area has a FIFO structure.
In some embodiments, the storage unit is a register or a memory.
In some embodiments, the Poll instruction information includes at least a reference variable value, a flush identifier, a Poll address, a preset number of retries, a Poll storage type, a Poll synchronization condition, and a retry interval,
wherein the Poll address is the address at which variable data is read from the storage unit or at which a flush write is made to the storage unit, and its address range is the space of the memory or the registers,
the Poll storage type is either memory or register,
the Poll synchronization condition includes the read variable value being less than the reference variable value, the read variable value being less than or equal to the reference variable value, the read variable value being equal to the reference variable value, the read variable value not being equal to the reference variable value, the read variable value being greater than or equal to the reference variable value, and the read variable value being greater than the reference variable value.
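As an illustration of the synchronization conditions, the sketch below models them as an enumeration mapped onto comparison operators. It assumes the six conditions are the six standard relational comparisons between the read value and the reference value; the names `StorageType`, `SyncCondition`, and `condition_met` are hypothetical and do not come from the patent.

```python
import operator
from enum import Enum


class StorageType(Enum):
    # The two Poll storage types; the numeric codes are assumptions.
    MEMORY = 0
    REGISTER = 1


class SyncCondition(Enum):
    # Assumed set of six relational conditions.
    LT = "<"
    LE = "<="
    EQ = "=="
    NE = "!="
    GE = ">="
    GT = ">"


_OPERATORS = {
    SyncCondition.LT: operator.lt,
    SyncCondition.LE: operator.le,
    SyncCondition.EQ: operator.eq,
    SyncCondition.NE: operator.ne,
    SyncCondition.GE: operator.ge,
    SyncCondition.GT: operator.gt,
}


def condition_met(cond: SyncCondition, read_value: int, reference: int) -> bool:
    """Return True when `read_value <cond> reference` holds."""
    return _OPERATORS[cond](read_value, reference)


# Example: a read value of 3 is "less than or equal to" a reference of 4.
le_holds = condition_met(SyncCondition.LE, 3, 4)
```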
According to another aspect of the present disclosure, a process synchronization system based on Poll instructions is provided, including:
an instruction read-write module, configured to write a Poll instruction into an instruction buffer area of the first processor, the Poll instruction being pre-constructed by the second processor, and to send the Poll instruction to the Poll state control module in response to the instruction buffer area being non-empty,
a Poll state control module, configured to parse the Poll instruction, read variable data from the storage module according to the Poll instruction information obtained by parsing, compare the read first variable value with the reference variable value carried by the Poll instruction, and judge whether the two satisfy the Poll instruction synchronization condition; when they do, to further judge whether the Poll instruction contains a flush identifier, and if so, to write the reference variable value carried by the Poll instruction into the storage module and return a normal completion status, or if not, to return the normal completion status directly,
and, when the two do not satisfy the Poll instruction synchronization condition, to wait for the retry interval and then read the variable data from the storage module again, until the number of retries reaches a preset value or the read data satisfies the synchronization condition; if the read data still does not satisfy the synchronization condition when the number of retries reaches the preset value, to return an abnormal completion status; if the re-read data satisfies the synchronization condition, to further judge whether the Poll instruction contains a flush identifier, and if so, to write the reference variable value carried by the Poll instruction into the storage module and return the normal completion status, or if not, to return the normal completion status directly,
and a storage module, configured to return the variable value to the Poll state control module in response to a read request from the Poll state control module, or to store the reference variable value in response to a write request from the Poll state control module.
In some embodiments, the first processor is a heterogeneous accelerator chip and the second processor is a CPU.
In some embodiments, the instruction cache is a FIFO structure.
In some embodiments, the storage module is a register or a memory.
In some embodiments, the Poll instruction information includes at least a reference variable value, a flush identifier, a Poll address, a preset number of retries, a Poll storage type, a Poll synchronization condition, and a retry interval,
wherein the Poll address is the address at which variable data is read from the storage module or at which a flush write is made to the storage module, and its address range is the space of the memory or the registers,
the Poll storage type is either memory or register,
the Poll synchronization condition includes the read variable value being less than the reference variable value, the read variable value being less than or equal to the reference variable value, the read variable value being equal to the reference variable value, the read variable value not being equal to the reference variable value, the read variable value being greater than or equal to the reference variable value, and the read variable value being greater than the reference variable value.
The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the steps in the method in any embodiment by calling the computer program stored in the memory.
The embodiment of the application also provides a computer readable storage medium storing a computer program, which is characterized in that: the computer program, when executed by a processor, performs the steps of the method of any of the embodiments above.
The method enables multiple processes to synchronize on the same variable in parallel without blocking one another and without depending on other processes. It reduces the complexity of software operation and effectively improves synchronization efficiency between processes, thereby improving instruction execution efficiency, and it can be widely applied to multi-process synchronization services on heterogeneous acceleration chips. Compared with the prior art, the invention has the following advantages:
1. High efficiency. Multiple processes may operate on the same variable in parallel. Parallel processing avoids blocking between threads/processes, greatly shortens the waiting time of multiple threads/processes and the execution time of instructions, and significantly improves the execution efficiency of the system's multi-thread/multi-process instructions.
2. Simple user operation. A lock-free synchronization mechanism is adopted, and multiple threads/processes operate independently in parallel. Each thread/process only needs to perform instruction read and write operations, and does not need to perform status query, locking, or unlocking operations. This greatly reduces the operational complexity for the Host.
3. Flexible operating range. The user can flexibly target a memory or a register unit and choose flush-write or non-flush synchronization, which accommodates a variety of complex scenarios.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
Fig. 1 is a block diagram of a method for process synchronization based on Poll instructions according to an embodiment of the present application.
Fig. 2 is a block diagram of a Poll instruction based process synchronization system according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following describes the embodiments of the present invention further with reference to the drawings. The description of these embodiments is provided to assist understanding of the present invention, but is not intended to limit the present invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
It should be noted that, in the description of the present invention, orientations or positional relationships indicated by terms such as "upper", "lower", "left", "right", "front", and "rear" are based on those shown in the drawings and are used merely for convenience of description; they are not intended to indicate or imply that the system or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
The terms "first" and "second" in this technical solution merely distinguish identical or similar structures, or corresponding structures that perform similar functions; they do not rank the importance of these structures, nor do they imply order, relative size, or any other meaning.
In addition, unless explicitly stated and limited otherwise, the terms "mounted" and "connected" are to be construed broadly. For example, a connection may be a fixed connection, a removable connection, or an integral connection; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or an internal communication between two structures. Those skilled in the art can understand the specific meaning of these terms in this application in light of its general inventive concept.
Example 1
To address the low efficiency of existing blocking multithread synchronization, this embodiment adopts a process synchronization method based on Poll instructions. The Poll instruction can be used by multiple processes for efficient, non-blocking synchronization operations.
Specifically, referring to fig. 1, a method for process synchronization based on Poll instructions is provided. The method comprises the following specific steps:
S101, writing a Poll instruction into an instruction cache area of the first processor, wherein the Poll instruction is pre-constructed by the second processor.
In some embodiments, the Poll instruction may contain the following information:
1. Reference variable (reference) data. A value to be compared with a variable in memory or in a register. If the comparison between this value and the variable satisfies the user-defined synchronization condition, the variable has reached the value the user wants to synchronize on.
2. Flush identifier. There are two cases: flush synchronization and no flush. The presence of the flush identifier indicates that the variable value needs to be refreshed; its absence indicates that the variable value is not refreshed.
3. Poll address. The address at which the variable is read or at which the flush write is made. The address range is the space of the memory or the registers.
4. Number of Poll retries. Controls the maximum number of times the read operation is re-initiated after a Poll comparison fails, where failure means that the comparison result does not satisfy the synchronization condition. If the value is 0, no further read is initiated after a failed comparison, and the abnormal completion identifier is returned directly. If the value is not 0, the read operation is re-initiated until this number is reached or a value read along the way satisfies the Poll synchronization condition.
5. Poll storage type. There are two types: one is memory, which may be SRAM, DDR, HBM, etc., and the other is a register unit. No limitation is imposed here.
6. Poll synchronization condition. There are six types in total: (1) the variable value is less than the reference data; (2) the variable value is less than or equal to the reference data; (3) the variable value is equal to the reference data; (4) the variable value is not equal to the reference data; (5) the variable value is greater than or equal to the reference data; (6) the variable value is greater than the reference data.
7. Poll interval. Controls the time interval between two consecutive Poll retries. If the value is 0, there is no interval and retries are issued back to back.
S102, in response to the instruction cache area being non-empty, sending the Poll instruction in the instruction cache area to the Poll state control module.
When the instruction buffer is not empty, a new instruction has been written. The instruction read-write module then actively sends a segment of the instruction to the Poll state control module. The instruction length of each transmission is L.
In some embodiments, the instruction buffer is a buffer for storing user instructions and is first-in, first-out. Each user-defined instruction is written to the next address of the instruction buffer. The buffer depth is M bytes, and the address range is 0 to M-1 bytes. After an instruction is written to the last address M-1 of the buffer, writing starts again from address 0. The instruction written first is read first.
In some embodiments, when the instruction buffer is not empty, instructions are read and sent to the Poll state control module in sequence. Each transmission has a length Y, where Y is at least 1 byte and at most the length of the Poll instruction.
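The buffer behaviour described above (depth M, wrap-around from address M-1 to 0, first-in first-out reads in segments of length Y) can be sketched as a byte-addressed ring buffer. The class name `InstructionBuffer`, its methods, and the overflow check are assumptions made for illustration.

```python
class InstructionBuffer:
    """Byte-addressed circular FIFO with a depth of M bytes (illustrative)."""

    def __init__(self, depth: int) -> None:
        self._data = bytearray(depth)
        self._depth = depth
        self._write = 0   # next address to write
        self._read = 0    # next address to read
        self._used = 0    # number of unread bytes

    def is_empty(self) -> bool:
        return self._used == 0

    def write(self, instruction: bytes) -> None:
        if len(instruction) > self._depth - self._used:
            raise BufferError("instruction buffer full")
        for byte in instruction:
            self._data[self._write] = byte
            # After the last address M-1, writing restarts at address 0.
            self._write = (self._write + 1) % self._depth
        self._used += len(instruction)

    def read(self, length: int) -> bytes:
        """Return up to `length` bytes, oldest first."""
        count = min(length, self._used)
        out = bytearray()
        for _ in range(count):
            out.append(self._data[self._read])
            self._read = (self._read + 1) % self._depth
        self._used -= count
        return bytes(out)


# A 48-byte buffer: the second 32-byte instruction wraps past address 47.
buf = InstructionBuffer(48)
first = bytes(range(32))
buf.write(first)
segments = [buf.read(16), buf.read(16)]   # two transmissions of Y = 16 bytes
second = bytes(range(100, 132))
buf.write(second)
wrapped = buf.read(32)
```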
S103, the Poll state control module parses the Poll instruction and reads variable data from the storage unit according to the Poll instruction information obtained by parsing.
The Poll state control module parses the Poll instruction. In some embodiments, it parses out information such as the Poll storage type, reference variable (reference) value, Poll address, Poll flush identifier, Poll synchronization condition, preset number of Poll retries, and Poll retry interval. It then initiates a data read operation to the register or memory module according to the Poll address and Poll storage type. After receiving the read operation, the memory or register module reads the variable value from the address given by the Poll address and returns it to the Poll state control module.
S104, comparing the read first variable value with a reference variable value carried by the Poll instruction, judging whether the first variable value and the reference variable value meet the Poll instruction synchronization condition, if so, entering a step S105, and if not, entering a step S106.
S105, judging whether the Poll instruction contains a flush identifier; if so, proceeding to step S1051, otherwise proceeding to step S1052.
S1051, writing the reference variable value carried by the Poll instruction into the storage unit, and proceeding to step S1052.
S1052, returning the normal completion status to the second processor, and proceeding to step S107.
The returned variable value is compared with the reference data carried by the Poll instruction. If the comparison result satisfies the Poll instruction synchronization condition, it is judged whether the Poll instruction carries a flush identifier. If it does, the reference data is sent to the variable read-write module, which writes the reference value into the corresponding variable storage unit according to the write address, and the Poll state control module returns the normal completion status.
If the Poll instruction does not contain a flush identifier, the reference data does not need to be written into the storage unit, and the Poll state control module returns the normal completion status directly.
S106, after waiting for the retry interval, reading the variable data from the storage unit again, until the number of retries reaches a preset value or the read data satisfies the synchronization condition; proceeding to step S1061 if the read data still does not satisfy the synchronization condition when the number of retries reaches the preset value, and proceeding to step S105 if the read data satisfies the synchronization condition.
S1061, returning the abnormal completion status to the second processor, and proceeding to step S107.
If the comparison result does not satisfy the Poll instruction synchronization condition, the read operation is re-initiated after waiting for the retry interval. At the same time, the number of retries carried by the Poll instruction is compared with the number of reads already performed. If the number of retries has not been reached, the read operation is initiated again. If it has been reached, the abnormal completion identifier is returned. If, during re-reading, a re-read variable value satisfies the Poll instruction synchronization condition, the flush identifier is judged as described above, and the Poll state control module finally returns the normal completion status.
S107, the second processor chooses, based on the received completion status information, to continue writing Poll instructions or to stop writing Poll instructions.
The Poll state control module reports the instruction completion status to the user and stops all operations. The instruction completion status is either normal completion or abnormal completion. After receiving the completion status, the user chooses, according to the status information, to continue writing a new Poll instruction or to stop writing.
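Step S107 leaves the policy to the user. One possible host-side policy, shown here only as an assumption, is to keep writing Poll instructions while they complete normally and to stop at the first abnormal completion. The function `run_host`, the `submit` callback, and the stand-in `fake_submit` are hypothetical.

```python
from typing import Callable, Iterable, List

NORMAL = "normal completion"
ABNORMAL = "abnormal completion"


def run_host(instructions: Iterable[dict],
             submit: Callable[[dict], str],
             stop_on_abnormal: bool = True) -> List[str]:
    """Write instructions one by one and react to each completion status."""
    statuses: List[str] = []
    for instruction in instructions:
        status = submit(instruction)   # write the instruction, wait for its status
        statuses.append(status)
        if status == ABNORMAL and stop_on_abnormal:
            break                      # S107: stop writing Poll instructions
    return statuses


def fake_submit(instruction: dict) -> str:
    # Stand-in for the accelerator: negative references "fail" in this demo.
    return NORMAL if instruction["reference"] >= 0 else ABNORMAL


history = run_host(
    [{"reference": 1}, {"reference": -1}, {"reference": 2}],
    fake_submit,
)
```

With `stop_on_abnormal=False` the same loop models a user who keeps writing new Poll instructions regardless of the returned status.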
Further, to describe the inventive concept of the present disclosure in detail, a specific flow of this embodiment is briefly described below, taking memory as the storage type:
1. User 1 wants to synchronize the desired variable value 64 to the storage location at address 0x32 currently operated on by user 2. The Poll instruction information prepared by the user therefore includes: the Poll type is memory, the Poll address is 0x32, the reference data is 64, the number of Poll retries is 9, the Poll retry interval is 1 us, the Poll flush identifier is set, and the Poll synchronization condition is greater than or equal to. The Poll instruction is 32 bytes long.
2. In the initial state, there is no Poll instruction in the instruction buffer, and the buffer address is 0. The buffer depth is 1 MByte, so the new Poll instruction is written to byte addresses 0-31 of the instruction buffer.
3. The instruction buffer is now non-empty, which indicates that a new instruction has been written into the host buffer, so the instruction read-write module actively sends the instruction to the state control module in segments. Each transmission is 16 B long, so the transfer completes in 2 transmissions.
4. The Poll state control module parses the Poll instruction, extracting information such as the calculation type, reference data, Poll address, number of Poll retries, and flush identifier.
5. A data read operation is initiated to the storage module according to the address and Poll type of the Poll instruction.
6. After the memory receives the read operation, the variable is read from the variable storage location at address 0x32, and the variable value 67 is returned to the Poll state control module.
7. The returned variable value is compared with the reference data carried by the Poll instruction.
8. The variable value is 67, the reference data is 64, and the synchronization type carried by the Poll instruction is greater than or equal to. Since 64 is less than 67, the condition is not satisfied.
9. Because the Poll synchronization condition is not satisfied and the Poll instruction specifies retries, the Poll state control module waits 1 us and initiates the read operation again.
10. After the memory module receives the read operation, the variable is read from the variable storage location at address 0x32, and the variable value 63 is returned to the Poll state control module.
11. The returned variable value is compared with the reference data carried by the Poll instruction.
12. Because 64 is greater than 63, the Poll synchronization condition, namely the reference data being greater than or equal to the variable data, is satisfied. The current Poll has the flush function, so the reference data 64 is sent to the memory module with the corresponding address 0x32.
13. The memory module writes 64 to the variable storage location at address 0x32. At the same time, the Poll state control module returns to the complete state.
14. The Poll state control module reports the instruction completion status to the user and stops all operations. The current instruction completion status is normal completion.
15. After receiving the completion status, the user sees normal completion, which indicates that the variable was updated successfully. The user may choose to continue writing a new Poll instruction or to stop writing.
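The memory walkthrough can be replayed with a small simulation. This is a sketch under stated assumptions: the function names are hypothetical, user 2's update from 67 to 63 is scripted to happen between the two reads, and the comparison is applied the way steps 8 and 12 apply it, with the reference data on the left (`reference >= variable`).

```python
import time
from typing import Callable, Dict, Iterator, Tuple

ADDRESS = 0x32
memory: Dict[int, int] = {ADDRESS: 67}
# Scripted write by the other user, applied after the first read.
other_user_writes: Iterator[int] = iter([63])


def read_variable() -> int:
    value = memory[ADDRESS]
    update = next(other_user_writes, None)
    if update is not None:
        memory[ADDRESS] = update   # the other user changes the variable
    return value


def poll(reference: int,
         satisfied: Callable[[int, int], bool],
         flush: bool,
         max_retries: int,
         interval_s: float) -> Tuple[str, int]:
    reads = 0
    while True:
        value = read_variable()
        reads += 1
        if satisfied(reference, value):
            if flush:
                memory[ADDRESS] = reference   # flush write of the reference data
            return "normal completion", reads
        if reads > max_retries:               # initial read plus all retries used
            return "abnormal completion", reads
        time.sleep(interval_s)                # retry interval (1 us in the example)


# Reference 64, condition "reference >= variable", flush set, 9 retries, 1 us.
status, reads = poll(64, lambda ref, var: ref >= var, True, 9, 1e-6)
```

The first read returns 67 and fails the comparison; the second read returns 63, passes, and triggers the flush, so the location at 0x32 ends up holding 64.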
Further, to describe the inventive concept of the present disclosure in detail, the specific flow of this embodiment is briefly described below, taking a register as the storage type:
1. User 2 wants synchronization to pass as long as the variable value at register unit address 0x400, which is operated on by user 3, is not greater than the expected value 48. The Poll instruction information prepared by the user therefore includes: the Poll storage type is the register type, the Poll address is 0x400, the reference data is 48, the Poll retry count is 9, the Poll retry interval is 1 us, the Poll flag is set to flush, and the Poll synchronization function is less than or equal to. The Poll instruction is 32 bytes long.
2. In the initial state, the instruction buffer holds no Poll instruction and the buffer address is 0. The buffer depth is 1 MByte, so the new Poll instruction is written to byte addresses 0-31 of the instruction buffer.
3. A non-empty instruction buffer indicates that a new instruction has been written into the host buffer, so the instruction read-write module actively sends the instruction to the state control module. Each transfer carries 16 B, so the instruction is sent in 2 transfers.
4. The Poll state control module parses the Poll instruction and obtains information such as the calculation type, reference data, Poll address, Poll retry count, and flush flag.
5. A data read operation is initiated to the register module according to the address and the Poll type of the Poll instruction.
6. After receiving the read operation, the register module reads the variable from the register unit at address 0x400 and returns the variable value 49 to the Poll state control module.
7. The returned variable value is compared with the reference data carried by the Poll instruction.
8. The variable value is 49, the reference data is 48, and the synchronization type carried by the Poll instruction is less than or equal to. Because 48 is less than 49, the Poll synchronization function (reference data less than or equal to variable data) is satisfied. The current Poll instruction has the flush function enabled, so the reference data 48 is issued to the register module, with the corresponding address 0x400.
9. The register module writes the value 48 into the register unit at address 0x400. At the same time, the Poll state control module enters the completion state.
10. The Poll state control module reports the instruction completion status to the user and stops all operations. The current instruction completion status is normal completion.
11. After receiving the completion status, the user sees normal completion, which indicates that the variable was updated successfully. The user may then choose to write a new Poll instruction or to stop writing.
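As a rough illustration of steps 1 to 3 of the register example, the following Python sketch models the Poll instruction fields and the 16-byte transfers out of the instruction buffer. The names `PollInstruction` and `split_into_transfers` are hypothetical, and the zero-filled 32-byte image is only a placeholder, since the disclosure does not specify the binary layout of the instruction.

```python
from dataclasses import dataclass

INSTRUCTION_BYTES = 32   # Poll instruction length given in the example
TRANSFER_BYTES = 16      # bytes sent to the state control module per transfer


@dataclass(frozen=True)
class PollInstruction:
    """Fields listed in step 1; the field names are hypothetical."""
    storage_type: str        # "register" or "memory"
    address: int
    reference: int
    retry_count: int
    retry_interval_us: int
    flush: bool
    sync_function: str       # "<=" stands for "less than or equal to"


def split_into_transfers(image: bytes, size: int = TRANSFER_BYTES) -> list[bytes]:
    """Cut an instruction image into fixed-size transfers."""
    return [image[i:i + size] for i in range(0, len(image), size)]


instr = PollInstruction("register", 0x400, 48, 9, 1, True, "<=")

# Placeholder image: the real encoding is not specified, so zeros stand in.
image = bytes(INSTRUCTION_BYTES)
write_addr = 0                                   # the buffer is empty initially
occupied = range(write_addr, write_addr + INSTRUCTION_BYTES)
transfers = split_into_transfers(image)

print(occupied.start, occupied.stop - 1)         # 0 31
print(len(transfers))                            # 2
```

With a 32-byte instruction and 16-byte transfers, the split yields exactly the two transfers described in step 3.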
The present disclosure employs a lock-free, non-blocking synchronization mechanism, which is particularly suited to scenarios in which highly concurrent multi-process operations must be synchronized. For example, multiple processes may safely enqueue without first locking and blocking.
Meanwhile, access to both memory and register units is supported, so the range of synchronization operations is wider, complex and diverse usage scenarios are covered, and the mechanism is more convenient and flexible for users.
Through the lock-free synchronization mechanism, processes can operate in parallel on the same variable in the storage module and likewise on the same variable in the register module, which greatly simplifies the user's operations.
It will be understood by those skilled in the art that the synchronization in this embodiment is not limited to GPU processes; it may also be applied to other types of heterogeneous accelerator chips such as FPGAs or NPUs.
Example II
To achieve the above objective, this embodiment provides a process synchronization system based on Poll instructions; see fig. 2 for details. The system comprises:
an instruction read-write module, configured to receive the Poll instruction and, in response to the instruction buffer being non-empty, send the Poll instruction to the Poll state control module.
After the user constructs the Poll instruction, the Poll instruction is written into the buffer of the instruction read-write module. The instruction read-write module then reads the instruction from the buffer and sends it to the Poll state control module.
a Poll state control module, configured to parse the Poll instruction, read variable data from the storage module according to the parsed Poll instruction information, compare the read first variable value with the reference variable value carried by the Poll instruction, and judge whether the two satisfy the Poll instruction synchronization condition. When the two satisfy the Poll instruction synchronization condition, the module further judges whether the Poll instruction contains a flush identifier; if so, it writes the reference variable value carried by the Poll instruction into the storage module and returns a normal completion status; if not, it directly returns the normal completion status.
When the two do not satisfy the Poll instruction synchronization condition, the module waits for the retry interval time and then reads variable data from the storage module again, until the retry count reaches a preset value or the read data satisfies the synchronization condition. If the data still does not satisfy the synchronization condition when the retry count reaches the preset value, the module returns an abnormal completion status. If the re-read data satisfies the synchronization condition, the module further judges whether the Poll instruction contains a flush identifier; if so, it writes the reference variable value carried by the Poll instruction into the storage module and returns a normal completion status; if not, it directly returns the normal completion status.
a storage module, configured to return the variable value to the Poll state control module in response to a read-data request from the Poll state control module, or to write the reference variable value into storage in response to a request from the Poll state control module to write the reference variable value.
The Poll state control module parses the instruction to obtain information such as the address carried by the Poll instruction, and then initiates a read operation to either the storage unit or the register unit based on that information. The storage unit or register unit reads the variable and returns it to the Poll state control module, which compares it according to the synchronization function parsed from the Poll instruction. If the comparison succeeds, success information is returned to the user, and the Poll state control module determines from the instruction whether a flush operation is required; if so, the reference data in the Poll instruction is written to the storage unit or register unit. If the comparison fails, read operations and comparisons are repeated up to the Poll retry count carried by the Poll instruction. Finally, the Poll state control module returns the normal or abnormal completion status to the user and waits for the user's next instruction.
The Poll state control module mainly parses Poll instructions, controls the various states (waiting for a variable read, waiting for a variable write, variable comparison, and so on), and initiates read operations and flush write operations.
First, in the initial state, the Poll state control module is idle. When it receives a Poll instruction from the instruction read-write module, it enters the instruction parsing state and obtains information carried by the Poll instruction such as the reference data, Poll address, Poll type, Poll retry count, Poll retry waiting time, Poll synchronization function, and flush flag.
After parsing is complete, the module initiates a read operation to the storage or register module according to the Poll type and the read address, and then enters the state of waiting for data to return. When the storage unit or register unit returns the variable data, the module enters the comparison state and compares the variable data with the reference data. If the two satisfy the relation defined by the synchronization function, the data matches expectations and the module enters the flush state; if the parsed instruction requires a flush, the reference variable value is sent to the storage or register module. After the write operation completes, the module enters the completion state and returns the normal completion flag.
If the currently returned variable data and the current reference data do not satisfy the synchronization function, the data is not yet synchronized with the expected value, and the module enters the retry state. Once the retry interval has elapsed, it initiates a read operation to the storage or register module again and enters the state of waiting for data to return. If, within the Poll retry count, the reference data and the variable data satisfy the synchronization function, the module enters the flush state; if a flush is required, it enters the write-data state, flushes the reference data to the storage or register module, jumps to the completion state, and returns the normal completion flag. If the reference data and the read data still do not satisfy the synchronization function after the Poll retry count is reached, the module jumps to the completion state and returns the abnormal completion flag.
Finally, after entering the completion state, the Poll state control module notifies the user that the Poll instruction has finished executing and returns the normal or abnormal completion flag. It then returns to the idle state to await the next Poll instruction.
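The state transitions described above can be sketched as a small Python state machine. This is a minimal model under stated assumptions, not the hardware implementation: the state names, the dictionary-based instruction, the `read`/`write`/`wait` callbacks, and the operand order `compare(read_value, reference)` are all hypothetical, and the retry count is interpreted as the number of re-reads after the first read.

```python
from enum import Enum, auto
import operator


class State(Enum):
    IDLE = auto()
    PARSE = auto()
    WAIT_DATA = auto()
    COMPARE = auto()
    FLUSH = auto()
    WRITE = auto()
    RETRY = auto()
    DONE = auto()


# Assumed operand order: compare(read_value, reference).
SYNC_FUNCTIONS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
                  "!=": operator.ne, ">=": operator.ge, ">": operator.gt}


def run_poll(instr, read, write, wait=lambda us: None):
    """Walk the states for one Poll instruction; return (status, trace)."""
    trace = [State.IDLE, State.PARSE]
    compare = SYNC_FUNCTIONS[instr["sync"]]
    retries_left = instr["retries"]
    while True:
        trace.append(State.WAIT_DATA)
        value = read(instr["addr"])
        trace.append(State.COMPARE)
        if compare(value, instr["ref"]):
            trace.append(State.FLUSH)
            if instr["flush"]:
                trace.append(State.WRITE)
                write(instr["addr"], instr["ref"])
            status = "normal"
            break
        if retries_left == 0:
            status = "abnormal"
            break
        retries_left -= 1
        trace.append(State.RETRY)
        wait(instr["interval_us"])
    trace += [State.DONE, State.IDLE]
    return status, trace


store = {0x32: 70}            # hypothetical initial value that fails the check


def read(addr):
    return store[addr]


def write(addr, value):
    store[addr] = value


def wait(us):
    store[0x32] = 63          # stands in for another process updating the variable


instr = {"addr": 0x32, "ref": 64, "sync": "<=",
         "retries": 9, "interval_us": 1, "flush": True}
status, trace = run_poll(instr, read, write, wait)
print(status, store[0x32])    # normal 64
```

Recording the visited states in `trace` makes it easy to check that a failed comparison passes through the retry state and that every path ends in the completion state before returning to idle.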
The memory module is mainly used for storing, reading and refreshing variables.
First, variables in the memory or register unit default to an initial value of 0 or another value; this embodiment imposes no limitation. The memory may be implemented as SRAM, DDR, or HBM, which is likewise not limited in this embodiment. The register space holds initialization and key-parameter configuration information configured by the user. The extent of the memory space and the register space is the maximum range representable by the address bit width in the Poll instruction. The data bit width is the reference data bit width in the Poll instruction.
Second, after receiving a read operation from the state control module, the storage module decodes it and reads the variable value from the memory or register unit at the read address. The read latency depends on the memory itself and may be as low as 0, in which case the read data is returned immediately. The memory or register module then returns the variable data to the Poll state control module to complete the read operation.
Finally, after receiving a flush write operation from the Poll state control module, the storage module writes the reference data into the memory or register unit at the corresponding address according to the write address and the write data.
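A toy Python model of the storage module described above may help make the read and flush-write paths concrete. The class name `StorageModule`, the two dictionary-backed spaces, the example bit widths, and the choice to truncate written data to the data bit width are assumptions for illustration only.

```python
class StorageModule:
    """Toy model with separate memory and register spaces (names hypothetical)."""

    def __init__(self, addr_bits: int, data_bits: int, default: int = 0):
        self.addr_limit = 1 << addr_bits        # addresses 0 .. 2**addr_bits - 1
        self.data_mask = (1 << data_bits) - 1   # assumed: writes are truncated
        self.default = default                  # initial value of every variable
        self.spaces = {"memory": {}, "register": {}}

    def _check(self, space: str, addr: int) -> None:
        if space not in self.spaces:
            raise ValueError(f"unknown storage type: {space}")
        if not 0 <= addr < self.addr_limit:
            raise ValueError(f"address out of range: {addr:#x}")

    def read(self, space: str, addr: int) -> int:
        """Return the variable at addr, or the default if never written."""
        self._check(space, addr)
        return self.spaces[space].get(addr, self.default)

    def flush_write(self, space: str, addr: int, value: int) -> None:
        """Write the reference data to addr in the selected space."""
        self._check(space, addr)
        self.spaces[space][addr] = value & self.data_mask


storage = StorageModule(addr_bits=16, data_bits=32)   # widths are examples
print(storage.read("register", 0x400))                # 0
storage.flush_write("register", 0x400, 48)
print(storage.read("register", 0x400))                # 48
```

Keeping the two spaces separate reflects the Poll storage type selecting between memory and register access for the same address value.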
It will be understood by those skilled in the art that the synchronization in this embodiment is not limited to GPU processes; it may also be applied to other types of heterogeneous accelerator chips such as FPGAs or NPUs.
For descriptions of the technical terms and concepts used in this embodiment, refer to the foregoing embodiment; they are not repeated here.
Example III
Correspondingly, an embodiment of the present application also provides an electronic device, which may be a terminal or a server. Fig. 3 is a schematic structural diagram of the electronic device according to an embodiment of the present application.
The electronic device 300 includes a processor 301 having one or more processing cores, a memory 302 having one or more computer-readable storage media, and a computer program stored on the memory 302 and executable on the processor. The processor 301 is electrically connected to the memory 302. It will be appreciated by those skilled in the art that the electronic device structure shown in the figures is not limiting of the electronic device and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
The processor 301 is a control center of the electronic device 300, connects various parts of the entire electronic device 300 using various interfaces and lines, and performs various functions of the electronic device 300 and processes data by running or loading software programs (computer programs) and/or units stored in the memory 302, and calling data stored in the memory 302, thereby performing overall monitoring of the electronic device 300.
In the embodiment of the present application, the processor 301 in the electronic device 300 loads the instructions corresponding to the processes of one or more application programs into the memory 302 according to the following steps, and the processor 301 executes the application programs stored in the memory 302, so as to implement various functions:
S101, writing a Poll instruction into an instruction cache area of a first processor, wherein the Poll instruction is pre-constructed by a second processor,
S102, in response to the instruction cache area being non-empty, sending the Poll instruction in the instruction cache area to the Poll state control module,
S103, the Poll state control module parses the Poll instruction and reads variable data from the storage unit according to the parsed Poll instruction information,
S104, comparing the read first variable value with the reference variable value carried by the Poll instruction and judging whether the two satisfy the Poll instruction synchronization condition; if so, proceeding to step S105, otherwise proceeding to step S106,
S105, judging whether the Poll instruction contains a flush identifier; if yes, proceeding to step S1051, if no, proceeding to step S1052,
S1051, writing the reference variable value carried by the Poll instruction into the storage unit, and proceeding to step S1052,
S1052, returning the normal completion status to the second processor, and proceeding to step S107,
S106, after waiting for the retry interval time, re-reading the variable data from the storage unit until the retry count reaches a preset value or the read data satisfies the synchronization condition; if the read data still does not satisfy the synchronization condition when the retry count reaches the preset value, proceeding to step S1061; if the re-read data satisfies the synchronization condition, proceeding to step S105,
S1061, returning the abnormal completion status to the second processor, and proceeding to step S107,
S107, the second processor chooses, based on the received completion status information, to continue writing Poll instructions or to stop writing Poll instructions.
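Steps S101 to S107 can be sketched end to end in Python as follows. This is an illustrative model under assumptions: the FIFO is a `collections.deque`, the storage unit is a plain dictionary, the comparison operand order is `compare(read_value, reference_value)`, the preset retry count is taken as the number of re-reads after the first read, and stopping at the first abnormal completion is only one possible policy for S107. The function names `execute_poll` and `host_loop` are hypothetical.

```python
from collections import deque
import operator
import time

# Assumed operand order: compare(read_value, reference_value).
COMPARE = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
           "!=": operator.ne, ">=": operator.ge, ">": operator.gt}


def execute_poll(instr, storage):
    """S103 to S1061 for one instruction; returns 'normal' or 'abnormal'."""
    compare = COMPARE[instr["sync"]]
    for attempt in range(instr["retries"] + 1):            # first read + retries
        if attempt:
            time.sleep(instr["interval_us"] / 1_000_000)   # S106: retry interval
        value = storage[instr["addr"]]                     # S103 / S106: read
        if compare(value, instr["ref"]):                   # S104
            if instr["flush"]:                             # S105
                storage[instr["addr"]] = instr["ref"]      # S1051
            return "normal"                                # S1052
    return "abnormal"                                      # S1061


def host_loop(instructions, storage):
    """Second-processor side: write instructions and react to each status."""
    fifo = deque()
    statuses = []
    for instr in instructions:
        fifo.append(instr)                                 # S101
        while fifo:                                        # S102: non-empty
            statuses.append(execute_poll(fifo.popleft(), storage))
        if statuses[-1] != "normal":                       # S107: stop writing
            break
    return statuses


storage = {0x10: 5}
program = [
    {"addr": 0x10, "ref": 8, "sync": "<=", "retries": 2, "interval_us": 1, "flush": True},
    {"addr": 0x10, "ref": 8, "sync": "==", "retries": 0, "interval_us": 1, "flush": False},
    {"addr": 0x10, "ref": 100, "sync": ">=", "retries": 2, "interval_us": 1, "flush": True},
    {"addr": 0x10, "ref": 8, "sync": "==", "retries": 0, "interval_us": 1, "flush": False},
]
print(host_loop(program, storage))   # ['normal', 'normal', 'abnormal']
```

In this run the first instruction passes and flushes 8 into address 0x10, the second confirms that value without flushing, and the third exhausts its retries, so the host stops before writing the fourth.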
Optionally, as shown in fig. 3, the electronic device 300 further includes: a process synchronization system 303, a communication module 304, an input unit 305, and a power supply 306. The processor 301 is electrically connected to the process synchronization system 303, the communication module 304, the input unit 305, and the power supply 306, respectively. Those skilled in the art will appreciate that the electronic device structure shown in fig. 3 is not limiting of the electronic device, which may include more or fewer components than shown, may combine certain components, or may use a different arrangement of components.
The process synchronization system 303 may be used to implement Poll instruction based process synchronization.
The communication module 304 may be used to communicate with other devices.
The input unit 305 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint, iris, facial information, etc.), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
The power supply 306 is used to power the various components of the electronic device 300. Alternatively, the power supply 306 may be logically connected to the processor 301 through a power management system, so as to perform functions of managing charging, discharging, and power consumption management through the power management system. The power supply 306 may also include one or more of any components, such as a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
Example IV
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer readable storage medium having stored therein a plurality of computer programs that can be loaded by a processor to perform the steps of a Poll instruction based process synchronization method provided by embodiments of the present application. For example, the computer program may perform the steps of:
S101, writing a Poll instruction into an instruction cache area of a first processor, wherein the Poll instruction is pre-constructed by a second processor,
S102, in response to the instruction cache area being non-empty, sending the Poll instruction in the instruction cache area to the Poll state control module,
S103, the Poll state control module parses the Poll instruction and reads variable data from the storage unit according to the parsed Poll instruction information,
S104, comparing the read first variable value with the reference variable value carried by the Poll instruction and judging whether the two satisfy the Poll instruction synchronization condition; if so, proceeding to step S105, otherwise proceeding to step S106,
S105, judging whether the Poll instruction contains a flush identifier; if yes, proceeding to step S1051, if no, proceeding to step S1052,
S1051, writing the reference variable value carried by the Poll instruction into the storage unit, and proceeding to step S1052,
S1052, returning the normal completion status to the second processor, and proceeding to step S107,
S106, after waiting for the retry interval time, re-reading the variable data from the storage unit until the retry count reaches a preset value or the read data satisfies the synchronization condition; if the read data still does not satisfy the synchronization condition when the retry count reaches the preset value, proceeding to step S1061; if the re-read data satisfies the synchronization condition, proceeding to step S105,
S1061, returning the abnormal completion status to the second processor, and proceeding to step S107,
S107, the second processor chooses, based on the received completion status information, to continue writing Poll instructions or to stop writing Poll instructions.
For the specific implementation of each operation, refer to the foregoing embodiments; details are not repeated here.
The computer-readable storage medium may comprise read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.
The computer program stored in the storage medium can perform the steps of any Poll instruction based process synchronization method provided in the embodiments of the present application, and can therefore achieve the beneficial effects of any such method; see the foregoing embodiments for details, which are not repeated here.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. It will be apparent to those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, and yet fall within the scope of the invention.

Claims (12)

1. A method for synchronizing processes based on Poll instructions, said method comprising the steps of:
S101, writing a Poll instruction into an instruction cache area of a first processor, wherein the Poll instruction is pre-constructed by a second processor,
S102, in response to the instruction cache area being non-empty, sending the Poll instruction in the instruction cache area to the Poll state control module,
S103, the Poll state control module parses the Poll instruction and reads variable data from the storage unit according to the parsed Poll instruction information,
S104, comparing the read first variable value with the reference variable value carried by the Poll instruction and judging whether the two satisfy the Poll instruction synchronization condition; if so, proceeding to step S105, otherwise proceeding to step S106,
S105, judging whether the Poll instruction contains a flush identifier; if yes, proceeding to step S1051, if no, proceeding to step S1052,
S1051, writing the reference variable value carried by the Poll instruction into the storage unit, and proceeding to step S1052,
S1052, returning the normal completion status to the second processor, and proceeding to step S107,
S106, after waiting for the retry interval time, re-reading the variable data from the storage unit until the retry count reaches a preset value or the read data satisfies the synchronization condition; if the read data still does not satisfy the synchronization condition when the retry count reaches the preset value, proceeding to step S1061; if the re-read data satisfies the synchronization condition, proceeding to step S105,
S1061, returning the abnormal completion status to the second processor, and proceeding to step S107,
S107, the second processor chooses, based on the received completion status information, to continue writing Poll instructions or to stop writing Poll instructions.
2. The method according to claim 1, characterized in that:
the first processor is a heterogeneous acceleration chip, and the second processor is a CPU.
3. The method according to claim 1, characterized in that:
the instruction cache area is of a FIFO structure.
4. The method according to claim 1, characterized in that:
the storage unit is a register or a memory.
5. The method according to any one of claims 1-4, wherein:
the Poll instruction information at least comprises a reference variable value, a flush identifier, a Poll address, a retry number preset value, a Poll storage type, a Poll synchronization condition and a retry interval time,
wherein the Poll address represents the address from which variable data is read from the storage unit or to which the flush is written in the storage unit, and the address range is the space of the memory or register,
the Poll storage type is either memory or register,
the Poll synchronization condition includes the read variable value being less than the reference variable value, the read variable value being less than or equal to the reference variable value, the read variable value being equal to the reference variable value, the read variable value not being equal to the reference variable value, the read variable value being greater than or equal to the reference variable value, and the read variable value being greater than the reference variable value.
6. A Poll instruction based process synchronization system, the system comprising:
an instruction read-write module, configured to write a Poll instruction into an instruction buffer area of a first processor, the Poll instruction being pre-constructed by a second processor, and to send the Poll instruction to the Poll state control module in response to the instruction buffer area being non-empty,
a Poll state control module, configured to parse the Poll instruction, read variable data from the storage module according to the parsed Poll instruction information, compare the read first variable value with the reference variable value carried by the Poll instruction, and judge whether the two satisfy the Poll instruction synchronization condition; when the two satisfy the Poll instruction synchronization condition, further judge whether the Poll instruction contains a flush identifier, and if so, write the reference variable value carried by the Poll instruction into the storage module and return a normal completion status, or if not, directly return the normal completion status;
when the two do not satisfy the Poll instruction synchronization condition, after waiting for the retry interval time, read variable data from the storage module again until the retry count reaches a preset value or the read data satisfies the synchronization condition; if the data read when the retry count reaches the preset value still does not satisfy the synchronization condition, return an abnormal completion status; if the re-read data satisfies the synchronization condition, further judge whether the Poll instruction contains a flush identifier, and if so, write the reference variable value carried by the Poll instruction into the storage module and return a normal completion status, or if not, directly return the normal completion status,
and a storage module, configured to return the variable value to the Poll state control module in response to a read-data request from the Poll state control module, or to write the reference variable value into the storage module in response to a request from the Poll state control module to write the reference variable value.
7. The system according to claim 6, wherein:
the first processor is a heterogeneous acceleration chip, and the second processor is a CPU.
8. The system according to claim 6, wherein:
the instruction cache area is of a FIFO structure.
9. The system according to claim 6, wherein:
the storage module is specifically a register or a memory.
10. The system according to any one of claims 6-9, wherein:
the Poll instruction information at least comprises a reference variable value, a flush identifier, a Poll address, a retry number preset value, a Poll storage type, a Poll synchronization condition and a retry interval time,
wherein the Poll address represents the address from which variable data is read from the storage unit or to which the flush is written in the storage unit, and the address range is the space of the memory or register,
the Poll storage type is either memory or register,
the Poll synchronization condition includes the read variable value being less than the reference variable value, the read variable value being less than or equal to the reference variable value, the read variable value being equal to the reference variable value, the read variable value not being equal to the reference variable value, the read variable value being greater than or equal to the reference variable value, and the read variable value being greater than the reference variable value.
11. An electronic device, characterized in that: comprising a memory storing executable program code and a processor coupled to the memory; wherein the processor invokes executable program code stored in the memory to perform the method of any of claims 1-5.
12. A computer-readable storage medium storing a computer program, characterized in that: the computer program, when executed by a processor, performs the method of any of claims 1-5.
CN202311713764.0A 2023-12-14 2023-12-14 Process synchronization method, system, equipment and medium based on Poll instruction Active CN117407182B (en)

Publications (2)

Publication Number Publication Date
CN117407182A true CN117407182A (en) 2024-01-16
CN117407182B CN117407182B (en) 2024-03-12

Family

ID=89492870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311713764.0A Active CN117407182B (en) 2023-12-14 2023-12-14 Process synchronization method, system, equipment and medium based on Poll instruction

Country Status (1)

Country Link
CN (1) CN117407182B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5214770A (en) * 1988-04-01 1993-05-25 Digital Equipment Corporation System for flushing instruction-cache only when instruction-cache address and data-cache address are matched and the execution of a return-from-exception-or-interrupt command
US5901311A (en) * 1996-12-18 1999-05-04 Intel Corporation Access key protection for computer system data
US20030191793A1 (en) * 1991-03-18 2003-10-09 Dolin Robert A. Task scheduling in an event driven environment
CN105138398A (en) * 2015-09-30 2015-12-09 山东乾云启创信息科技有限公司 SOCKET communication and process management common platform and method under synchronous communication mode
US10929779B1 (en) * 2018-05-22 2021-02-23 Marvell Asia Pte, Ltd. Architecture to support synchronization between core and inference engine for machine learning
KR20210090207A (en) * 2018-11-07 2021-07-19 에이알엠 리미티드 Method and apparatus for implementing lockless data structure
CN113157467A (en) * 2021-05-07 2021-07-23 瑞斯康达科技发展股份有限公司 Multi-process data output method
WO2021164165A1 (en) * 2020-02-20 2021-08-26 苏州浪潮智能科技有限公司 File lock processing method and apparatus, electronic device, and storage medium
CN115269015A (en) * 2022-09-26 2022-11-01 沐曦集成电路(南京)有限公司 Shared variable processing system based on Atomic instruction
WO2023035413A1 (en) * 2021-09-08 2023-03-16 长鑫存储技术有限公司 Read and write test method and apparatus, computer storage medium, and electronic device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MATTEO MONCHIERO et al.: "Efficient Synchronization for Embedded On-Chip Multiprocessors", IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, vol. 14, no. 10, 23 October 2006 (2006-10-23), pages 1049-1062, XP011142360, DOI: 10.1109/TVLSI.2006.884147 *
吴晓慧 et al.: "Microarchitectural Transient Execution Attacks and Defense Methods" (微架构瞬态执行攻击与防御方法), 《软件学报》 (Journal of Software), vol. 31, no. 2, 5 December 2019 (2019-12-05), pages 544-563 *
雷洪 et al.: "Multi-core Parallel High-Performance Computing: OpenMP" (《多核并行高性能计算 OpenMP》), 31 May 2016, 冶金工业出版社 (Metallurgical Industry Press), pages 113-115 *

Also Published As

Publication number Publication date
CN117407182B (en) 2024-03-12

Similar Documents

Publication Publication Date Title
CN108647104B (en) Request processing method, server and computer readable storage medium
US7583268B2 (en) Graphics pipeline precise interrupt method and apparatus
KR100936601B1 (en) Multi-processor system
US20130160028A1 (en) Method and apparatus for low latency communication and synchronization for multi-thread applications
CN113504985B (en) Task processing method and network equipment
US20110265093A1 (en) Computer System and Program Product
CN111767159A (en) Asynchronous system calling system based on coroutine
CN109101662B (en) Block generation method, device, equipment and storage medium
US5371857A (en) Input/output interruption control system for a virtual machine
CN113885945A (en) Calculation acceleration method, equipment and medium
US20110173287A1 (en) Preventing messaging queue deadlocks in a dma environment
CN115269015A (en) Shared variable processing system based on Atomic instruction
CN100583047C (en) Method for synchronizing real-time interruption with multiple progress states
CN117407182B (en) Process synchronization method, system, equipment and medium based on Poll instruction
CN101189579A (en) Behavioral model based multi-threaded architecture
CN109992539B (en) Double-host cooperative working device
CN116561091A (en) Log storage method, device, equipment and readable storage medium
US8219762B1 (en) Computer system and method for leasing memory location to allow predictable access to memory location
CN115080670A (en) Deterministic transaction concurrency control method based on GPU acceleration
CN115269226A (en) Multithreading multistage parallel communication method based on atomic message and shared cache
CN112131238B (en) Transaction state machine design method, processing device and processing method
CN117407181B (en) Heterogeneous computing process synchronization method and system based on barrier instruction
CN117389625B (en) Process synchronization method, system, equipment and medium based on active interrupt instruction
US20230091817A1 (en) Protocol buffer-based cache mirroring method
JPS6393055A (en) Real time type garbage collection back-up device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant