CN112395093A - Multithreading data processing method and device, electronic equipment and readable storage medium - Google Patents

Multithreading data processing method and device, electronic equipment and readable storage medium Download PDF

Info

Publication number
CN112395093A
Authority
CN
China
Prior art keywords
data
processed
instruction
instruction sequence
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011402977.8A
Other languages
Chinese (zh)
Inventor
余银
赵家众
穆涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Longxin Zhongke Hefei Technology Co ltd
Original Assignee
Longxin Zhongke Hefei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Longxin Zhongke Hefei Technology Co ltd filed Critical Longxin Zhongke Hefei Technology Co ltd
Priority to CN202011402977.8A priority Critical patent/CN112395093A/en
Publication of CN112395093A publication Critical patent/CN112395093A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

The application provides a multithreaded data processing method and apparatus, an electronic device, and a readable storage medium. The method determines whether the number of bytes of data to be processed is less than or equal to the width of a register, the data to be processed being data shared by a plurality of threads. When the number of bytes of the data to be processed is less than or equal to the register width, an instruction sequence corresponding to the current thread is generated; the instruction sequence comprises the data to be processed and an atomic operation instruction, and the atomic operation instruction is used to process the data to be processed. The instruction sequence corresponding to the current thread is then executed. In other words, in the embodiments of the present application, when the number of bytes of the data to be processed is less than or equal to the register width, the data to be processed can be written into the instruction sequence and its read or write operation completed through the atomic operation instruction, so that multithreaded data synchronization is guaranteed without a thread lock and the performance overhead of the processor during multithreaded data synchronization is reduced.

Description

Multithreading data processing method and device, electronic equipment and readable storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a multithreading data processing method and device, electronic equipment and a readable storage medium.
Background
Multithreading is a technique, implemented in software or hardware, for executing multiple threads concurrently. It effectively improves the resource utilization of a Central Processing Unit (CPU) and speeds up program response, and it is widely used today.
However, concurrent execution of multiple threads also raises the problem of data synchronization. For example, if thread B accesses and modifies a constant within some data while thread A is accessing that data, thread A may end up with an erroneous access result.
The current way to solve this problem is to use a thread lock: while thread A is accessing the data, no other thread may access it, and only after thread A releases the lock can other threads access the data. This guarantees data synchronization, but using a thread lock introduces significant performance overhead in the processor.
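For reference, the thread-lock approach described above can be sketched at the software level as follows. This is only an illustrative C sketch using POSIX threads; the variable and function names are assumed for illustration and are not part of the present application.

#include <pthread.h>
#include <stdint.h>

/* Illustrative only: shared data protected by a thread lock (mutex). */
static uint64_t shared_data;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread A: read the shared data while holding the lock. */
uint64_t reader(void) {
    pthread_mutex_lock(&lock);      /* other threads are blocked here         */
    uint64_t value = shared_data;
    pthread_mutex_unlock(&lock);    /* only now may other threads access it   */
    return value;
}

/* Thread B: modify the shared data while holding the same lock. */
void writer(uint64_t new_value) {
    pthread_mutex_lock(&lock);
    shared_data = new_value;
    pthread_mutex_unlock(&lock);
}

Every access pays the cost of acquiring and releasing the lock, which is the performance overhead the present application aims to avoid.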
Disclosure of Invention
Embodiments of the present invention provide a multithreaded data processing method and apparatus, an electronic device, and a readable storage medium, which can reduce the performance overhead of a processor during multithreaded data synchronization.
In a first aspect, an embodiment of the present invention provides a multithreading data processing method, including:
determining whether the byte number of data to be processed is smaller than or equal to the width of a register, wherein the data to be processed is shared by a plurality of threads;
when the byte number of the data to be processed is smaller than or equal to the width of a register, generating an instruction sequence corresponding to a current thread, wherein the instruction sequence comprises the data to be processed and an atomic operation instruction, and the atomic operation instruction is used for realizing the processing of the data to be processed;
and executing the instruction sequence corresponding to the current thread.
In a possible design, the generating an instruction sequence corresponding to a current thread includes:
if the current thread performs a read operation, generating a first instruction sequence corresponding to the current thread, wherein the first instruction sequence comprises an atomic load instruction and the data to be processed;
the executing the instruction sequence corresponding to the current thread includes:
executing the atomic load instruction to write the data to be processed to the register.
In a possible design manner, the data to be processed is address information, and the atomic load instruction includes a load operation code, a register identifier, an address of a currently executing instruction, and an offset;
the executing the atomic load instruction to write the data to be processed to the register comprises:
and acquiring the data to be processed stored at the corresponding position in the instruction sequence by using the loading operation code, the address and the offset, and loading the data to be processed into a register corresponding to the register identifier.
In a possible design, the generating an instruction sequence corresponding to a current thread includes:
if the current thread performs a write operation, generating a second instruction sequence corresponding to the current thread, wherein the second instruction sequence comprises an atomic write instruction and the data to be processed;
the executing the instruction sequence corresponding to the current thread includes:
executing the atomic write instruction to update the data to be processed in the instruction sequence.
In a second aspect, an embodiment of the present invention provides a multithreading data processing apparatus applied to a RISC architecture processor, including:
the determining module is used for determining whether the byte number of the data to be processed is smaller than or equal to the width of the register, wherein the data to be processed is data shared by a plurality of threads;
the processing module is used for generating an instruction sequence corresponding to the current thread when the byte number of the data to be processed is smaller than or equal to the width of a register, wherein the instruction sequence comprises the data to be processed and an atomic operation instruction, and the atomic operation instruction is used for realizing the processing of the data to be processed;
and the execution module is used for executing the instruction sequence corresponding to the current thread.
In one possible design, the processing module is configured to:
if the current thread performs a read operation, generating a first instruction sequence corresponding to the current thread, wherein the first instruction sequence comprises an atomic load instruction and the data to be processed;
the execution module is configured to:
executing the atomic load instruction to write the data to be processed to the register.
In a possible design manner, the data to be processed is address information, and the atomic load instruction includes a load operation code, a register identifier, an address of a currently executing instruction, and an offset; the execution module is configured to:
and acquiring the data to be processed stored at the corresponding position in the instruction sequence by using the loading operation code, the address and the offset, and loading the data to be processed into a register corresponding to the register identifier.
In one possible design, the processing module is configured to:
if the current thread performs a write operation, generating a second instruction sequence corresponding to the current thread, wherein the second instruction sequence comprises an atomic write instruction and the data to be processed;
the execution module is configured to:
executing the atomic write instruction to update the data to be processed in the instruction sequence.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of multi-threaded data processing as provided in the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, in which computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the method for processing multithread data provided in the first aspect is implemented.
The multithreading data processing method, the multithreading data processing device, the electronic equipment and the readable storage medium provided by the embodiment of the application determine whether the byte number of the data to be processed is smaller than or equal to the width of the register, and when the byte number of the data to be processed is smaller than or equal to the width of the register, an instruction sequence corresponding to a current thread is generated, wherein the instruction sequence comprises an atomic operation instruction to realize the processing of the data to be processed; and executing an instruction sequence corresponding to the current thread, wherein the data to be processed is data shared by a plurality of threads. In other words, in the embodiment of the present application, when the number of bytes of the to-be-processed data is less than or equal to the width of the register, the to-be-processed data may be written into the instruction sequence, and the read or write operation of the to-be-processed data is completed through the atomic operation instruction, so that the synchronization of the multi-thread data may be ensured without using a thread lock, and the performance overhead of the processor during the multi-thread data synchronization process is reduced.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic hardware structure diagram of an electronic device provided in an embodiment of the present application;
FIG. 2 is a flowchart illustrating a multithreading data processing method according to an embodiment of the present invention;
FIG. 3 is a thread diagram of a multithreading data processing method according to an embodiment of the present invention;
fig. 4 is a schematic diagram of program modules of a multithreading data processing apparatus according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present invention.
The multithreaded data processing method provided by the embodiments of the present application can be applied to electronic devices in various forms, such as mobile terminals, computers, vehicle-mounted terminals, and wearable devices.
Referring to fig. 1, fig. 1 is a schematic diagram of a hardware structure of an electronic device provided in an embodiment of the present application. In the embodiment of the present application, the electronic device 10 includes: a processor 101 and a memory 102; wherein:
memory 102 for storing computer-executable instructions and data.
A processor 101 for executing computer executable instructions stored in the memory, processing data in the memory, and the like.
Alternatively, the memory 102 may be separate or integrated with the processor 101.
When the memory 102 is provided separately, the electronic device further includes a bus 103 for connecting the memory 102 and the processor 101.
Optionally, the processor 101 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, any conventional processor, or the like. The steps of the multithreaded data processing method disclosed in the present application can be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory may include Random Access Memory (RAM) or Non-Volatile Memory (NVM), for example at least one disk memory, and may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus in the figures of the present application is not limited to only one bus or one type of bus.
A processing system, whether it contains multiple processors or a single processor, may run a number of "threads," each of which executes program instructions independently of the other threads. Using multiple processors allows multiple tasks, functions, and even applications to be handled more efficiently and faster. Using multiple threads or processors also means that two or more processors or threads can share the same data stored in the system.
Multithreading is a software or hardware technique for executing multiple threads concurrently; it effectively improves CPU resource utilization and speeds up program response. However, concurrent execution of multiple threads also raises the problem of data synchronization. For example, a Reduced Instruction Set Computer (RISC) processor needs multiple instructions to load a constant into a register. If thread B accesses and modifies a constant within some data (for example, modifies a function address) while thread A is accessing that data, thread A may end up with an erroneous access result. In the prior art, the common way to solve such a problem is to acquire a thread lock before thread B writes and before thread A reads: while thread A is accessing the data, thread B cannot access it, and only after thread A releases the lock can thread B access the data. This guarantees data synchronization, but using a thread lock introduces significant performance overhead in the processor.
To solve the above technical problem, the present application provides a multithreaded data processing method for this situation, which achieves multithreaded data synchronization without a thread lock and thereby reduces the performance overhead of the processor during multithreaded data synchronization.
Referring to FIG. 2, FIG. 2 is a flowchart of a multithreaded data processing method according to an embodiment of the present invention. In a possible embodiment of the present application, the multithreaded data processing method includes:
s201, determining whether the byte number of the data to be processed is smaller than or equal to the width of the register, wherein the data to be processed is shared by a plurality of threads.
The data to be processed is data shared by multiple threads; that is, the data to be processed at a given memory address can be processed by multiple threads.
A register is a small storage area in the CPU used to hold data; it temporarily stores operands participating in an operation and the operation results. It may be any of a general-purpose register, a special-purpose register, or a control register.
In the embodiment of the present application, it can be determined in advance whether the number of bytes of the data to be processed in memory is less than or equal to the width of the register. If the number of bytes is greater than the register width, a thread lock can still be used in the thread to ensure that it is not interrupted by other threads while reading or modifying the data to be processed in memory; if the number of bytes of the data to be processed in memory is less than or equal to the register width, the data to be processed can be read or modified in one step using the instruction sequence.
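As a minimal C sketch of this size check in step S201 (assuming sizeof(uintptr_t) as a stand-in for the general-purpose register width; the helper name is illustrative and not part of the present application):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring step S201: the lock-free instruction-sequence
 * path is taken only when the shared datum fits in one general-purpose register;
 * otherwise the code falls back to a conventional thread lock. */
static bool fits_in_register(size_t nbytes) {
    return nbytes <= sizeof(uintptr_t);   /* register width on the assumed target */
}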
In a possible implementation manner, the data to be processed is address information, for example, the data to be processed is a storage address of the target data in the memory.
It will be appreciated that if the register word length is 32 bits, then for a 64-bit variable M the CPU must read the variable in two steps (for example, first the upper 32 bits and then the lower 32 bits), and the same applies to writes. Without a thread lock, the following problem exists:
For example, when thread A reads the upper 32 bits of variable M, only those upper 32 bits are guaranteed not to be modified by other threads during the read; thread B can still modify the lower 32 bits of M. The upper 32 bits read by thread A may therefore be the original value while the lower 32 bits are the value modified by thread B, producing dirty data and a program error. The present application therefore requires that the number of bytes of the data to be processed in memory be less than or equal to the register width, so that thread A can read all of the data to be processed in one step and the data cannot be partially modified by thread B during the read.
S202, when the byte number of the data to be processed is smaller than or equal to the width of the register, generating an instruction sequence corresponding to the current thread, wherein the instruction sequence comprises the data to be processed and an atomic operation instruction, and the atomic operation instruction is used for realizing the processing of the data to be processed.
Once an atomic operation instruction starts executing, it runs to completion without switching to any other thread; that is, the atomic operation instruction is not interrupted by other threads while it is running.
In a possible implementation manner, the data to be processed may be directly written into the instruction sequence when the instruction sequence corresponding to the current thread is generated.
In another possible implementation, the data to be processed may be stored at its corresponding position in the instruction sequence before the current thread executes. When the current thread executes and the corresponding instruction sequence is generated, the atomic operation instruction is generated based on the position of the data to be processed within the instruction sequence. For example, if in the instruction sequence the data to be processed is located below the atomic operation instruction and separated from it by other instructions at an offset of 12, then the atomic operation instruction necessarily contains information, generated from the data to be processed, indicating that the pointer pc + 12 points to the data to be processed.
And S203, executing the instruction sequence corresponding to the current thread.
When the instruction sequence corresponding to the current thread is executed, the atomic operation instruction in the instruction sequence atomically reads or modifies the data to be processed in that instruction sequence; in other words, the atomic operation instruction is not interrupted by other threads while it reads or modifies the data to be processed. Multithreaded data synchronization can therefore be guaranteed without a thread lock, and the performance overhead of the processor during multithreaded data synchronization is reduced.
For a better understanding of the present application, this embodiment refers to the MIPS instruction set. Without a thread lock, on a MIPS-architecture processor a thread may dynamically generate the following instructions to load a target address into a register:
lui reg, address_bit32Tobit47 \\ shift bits 32 to 47 of the address left by 16 bits and store them in register reg
ori reg, reg, address_bit16Tobit31 \\ bitwise OR register reg with bits 16 to 31 of the address and write the result into register reg
drotr32 reg, reg, 16 \\ rotate the contents of register reg left by 16 bits and store them in register reg
ori reg, reg, address_bit0Tobit15 \\ bitwise OR register reg with bits 0 to 15 of the address and write the result into register reg
jr reg \\ jump to the address stored in the register and execute the corresponding instruction
It will be appreciated that since the address (the data to be processed) changes dynamically, each change requires a thread (thread B) to modify the 1st, 2nd, and 4th instructions so that the latest address is loaded. Another thread (thread A) runs the instruction sequence in a loop and jumps to the latest function for execution. If thread A is interrupted before it has executed the 2nd, 3rd, and 4th instructions and execution switches to thread B, and thread B modifies the three address-carrying instructions, then when execution switches back to thread A the value in register reg is neither the old address nor the new address, and the program executes incorrectly.
In this embodiment, when the number of bytes of the address (the data to be processed) is less than or equal to the register width, the address can be written directly into the instruction sequence, and the read or write operation on the data to be processed can be completed with an atomic operation instruction. Specifically, thread A can load the address (the data to be processed) into register reg in one step through an atomic load instruction, so that no matter where thread A is interrupted while executing the instruction sequence and execution switches to thread B, no data inconsistency occurs. Thread B can update the data to be processed through an atomic write instruction, so that thread A cannot read erroneous data while thread B is updating the data to be processed.
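At the programming-language level, the effect of this scheme, namely a register-width shared value that is read and updated atomically without a lock, can be sketched with C11 atomics. This is only an analogy: the present application embeds the data in the dynamically generated instruction sequence and relies on atomic load and store instructions, whereas the sketch below relies on the compiler, and the names are assumed for illustration.

#include <stdatomic.h>
#include <stdint.h>

/* Illustrative only: a shared function address no wider than one register. */
static _Atomic uintptr_t shared_addr;

/* Thread B: publish a new address with a single atomic store
 * (the role played by the atomic write / store instruction above). */
void publish(uintptr_t new_addr) {
    atomic_store_explicit(&shared_addr, new_addr, memory_order_release);
}

/* Thread A: fetch the address with a single atomic load
 * (the role played by the atomic load instruction above) and call through it. */
void dispatch(void) {
    uintptr_t addr = atomic_load_explicit(&shared_addr, memory_order_acquire);
    void (*fn)(void) = (void (*)(void))addr;
    fn();   /* always the old target or the new one, never a half-updated mix */
}

Because the load and the store are single register-width operations, thread A observes either the old address or the new one and never a partially updated value, and no thread lock is required.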
The multithreaded data processing method provided by this embodiment of the present application ensures that the data to be processed is not changed by thread B while thread A is reading it from memory. Thread A therefore reads either the data as it was before thread B's change or the data after the change is complete, and never data that is half changed by thread B and half unchanged.
Similarly, it is ensured that the data to be processed is not read by thread A while thread B is changing it in memory. That is, once thread B starts changing the data to be processed, thread A can no longer read it until thread B has finished the change, so thread A never reads the data in the middle of thread B's modification.
It should be noted that all RISC-architecture processors, such as RISC-V, LoongArch, and ARM, have this problem; they are not described individually here, and the problem can be addressed through the embodiments described in the present application.
According to the multithreaded data processing method provided by this embodiment of the present application, when the number of bytes of the data to be processed is less than or equal to the register width, an instruction sequence corresponding to the current thread can be generated, the data to be processed written into the instruction sequence, and the read or write operation on the data to be processed completed through the atomic operation instruction. Because the atomic operation instruction cannot be interrupted by other threads during execution, multithreaded data synchronization is guaranteed without a thread lock, and the performance overhead of the processor during multithreaded data synchronization is reduced.
Based on the content described in the foregoing embodiment, in a possible implementation manner, the generating an instruction sequence corresponding to the current thread in step S202 specifically includes:
if the current thread performs a read operation, generating a first instruction sequence corresponding to the current thread, wherein the first instruction sequence comprises an atomic load instruction and the data to be processed.
Executing the instruction sequence corresponding to the current thread in step S203 then specifically includes:
executing the atomic load instruction to write the data to be processed into the register.
That is, in this embodiment of the present application, the data to be processed can be written directly into the instruction sequence and loaded into the register in one step with a single atomic load instruction. Because the atomic load instruction writes the data to be processed into the register atomically, the data to be processed cannot be modified by other threads while it is being written into the register.
In a possible implementation manner, the data to be processed is address information, and the atomic load instruction includes a load opcode, a register identifier, an address of an instruction currently being executed, and an offset.
The executing the atomic load instruction to write the data to be processed into the register includes:
and acquiring the data to be processed stored at the corresponding position in the instruction sequence by using the loading operation code, the address and the offset, and loading the data to be processed into a register corresponding to the register identifier.
In one possible implementation, the load opcode is that of a load instruction.
That is, in this embodiment of the present application, the data to be processed can be written directly into the instruction sequence and loaded into the register in one step with a single load instruction. Writing the data to be processed into the register with the load instruction ensures that it is not modified by other threads while it is being written into the register.
In another possible implementation, generating the instruction sequence corresponding to the current thread in step S202 specifically includes:
if the current thread performs a write operation, generating a second instruction sequence corresponding to the current thread, wherein the second instruction sequence comprises an atomic write instruction and the data to be processed.
Executing the instruction sequence corresponding to the current thread in step S203 then specifically includes:
executing the atomic write instruction to update the data to be processed in the instruction sequence.
That is, in this embodiment of the present application, the data to be processed can be written directly into the instruction sequence, and its update can be completed with a single atomic write instruction. Updating the data to be processed with the atomic write instruction ensures that it is not read by other threads while the update is in progress.
In one possible embodiment, the atomic write instruction comprises a store instruction.
That is, in this embodiment of the present application, the data to be processed can be written directly into the instruction sequence, and its update can be completed with a single store instruction. Updating the data to be processed with the store instruction ensures that it is not read by other threads while the update is in progress.
For a better understanding of this embodiment of the present application, refer to FIG. 3. FIG. 3 is a thread diagram of a multithreaded data processing method according to an embodiment of the present invention.
In FIG. 3, a first thread executes a first instruction sequence to read the data to be processed in the memory and write it into a first register; a second thread executes a second instruction sequence to write the data in a second register into the data to be processed in the memory, that is, to modify the data to be processed in the memory.
It is understood that a thread, sometimes referred to as a Lightweight Process (LWP), is the smallest unit of program execution flow; a standard thread consists of a thread ID, a current instruction pointer (PC), a register set, and a stack. That is, the first register and the second register belong to two different register sets allocated in the processor.
The memory is used to temporarily store operation data in the CPU and data exchanged with an external memory such as a hard disk.
In this embodiment of the present application, when the number of bytes of the data to be processed is less than or equal to the width of the first register, the first thread can, by executing the first instruction sequence, read the data to be processed from memory in one step and write it into the first register for processing. Because the load instruction in the first instruction sequence cannot be interrupted by other threads during execution, the data to be processed is guaranteed not to be modified by other threads while the first thread is reading it, even without a thread lock. Likewise, when the number of bytes of the data to be processed in memory is less than or equal to the width of the second register, the second thread can, by executing the second instruction sequence, modify the data to be processed in memory according to the data in the second register. Because the store instruction in the second instruction sequence cannot be interrupted by other threads during execution, the data to be processed is guaranteed not to be read by other threads while the second thread is modifying it, even without a thread lock. The performance overhead of the processor during multithreaded data synchronization is therefore reduced.
According to the multithreaded data processing method provided by this embodiment of the present application, when the number of bytes of the data to be processed is less than or equal to the register width, the data to be processed is written directly into the instruction sequence. The first thread can read the data to be processed into the register for processing in one step by executing the load instruction, and the second thread can write the data in the register into memory in one step, modifying the data to be processed in memory, by executing the store instruction. Because the load instruction and the store instruction cannot be interrupted by other threads during execution, the data to be processed cannot be modified by other threads while the first thread is reading it, and cannot be read by other threads while the second thread is modifying it, even without a thread lock. The multithread synchronization problem is thus effectively solved and the performance overhead of the processor is reduced.
Based on the multithreading data processing method described in the foregoing embodiment, an embodiment of the present application further provides a multithreading data processing apparatus, and referring to fig. 4, fig. 4 is a schematic diagram of program modules of the multithreading data processing apparatus provided in an embodiment of the present invention, where, in a possible embodiment of the present application, the multithreading data processing apparatus 40 includes:
a determining module 401, configured to determine whether a byte number of data to be processed is smaller than or equal to a width of the register, where the data to be processed is data shared by multiple threads.
A processing module 402, configured to generate an instruction sequence corresponding to a current thread when a byte number of data to be processed is smaller than or equal to a width of a register, where the instruction sequence includes the data to be processed and an atomic operation instruction, and the atomic operation instruction is used to implement processing on the data to be processed.
The execution module 403 is configured to execute an instruction sequence corresponding to the current thread.
In the multithreaded data processing apparatus 40 provided in this embodiment of the present application, when the number of bytes of the data to be processed is less than or equal to the register width, the read or write operation on the data to be processed can be completed through the atomic operation instruction. Because the atomic operation instruction cannot be interrupted by other threads during execution, multithreaded data synchronization is guaranteed without a thread lock, and the performance overhead of the processor during multithreaded data synchronization is reduced.
In a possible implementation, the processing module 402 is specifically configured to:
if the current thread performs a read operation, generating a first instruction sequence corresponding to the current thread, wherein the first instruction sequence comprises an atomic load instruction and the data to be processed.
The execution module 403 is specifically configured to:
executing the atomic load instruction to write the data to be processed into the register.
In a possible implementation manner, the data to be processed is address information, and the atomic load instruction includes a load opcode, a register identifier, an address of an instruction currently being executed, and an offset; the execution module 403 is configured to:
and acquiring the data to be processed stored at the corresponding position in the instruction sequence by using the loading operation code, the address and the offset, and loading the data to be processed into a register corresponding to the register identifier.
In a possible implementation, the processing module 402 is specifically configured to:
if the current thread performs a write operation, generating a second instruction sequence corresponding to the current thread, wherein the second instruction sequence comprises an atomic write instruction and the data to be processed; the execution module 403 is configured to:
executing the atomic write instruction to update the data to be processed in the instruction sequence.
It should be understood that the functions and principles implemented by the functional modules in the multithread data processing apparatus 40 are consistent with the functions and principles implemented by the steps in the multithread data processing method described in the foregoing embodiment, and the detailed implementation process may refer to the description in each embodiment corresponding to the multithread data processing method, which is not described herein again.
Based on the content described in the foregoing embodiments, the embodiments of the present invention further provide a computer-readable storage medium, in which computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the content described in the foregoing embodiments of the multithread data processing method is implemented.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules is only one logical division, and other divisions may be realized in practice, for example, a plurality of modules may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each module may exist alone physically, or two or more modules are integrated into one unit. The unit formed by the modules can be realized in a hardware form, and can also be realized in a form of hardware and a software functional unit.
The integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some of the steps of the methods according to the embodiments of the present application.
The storage medium may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an Application Specific Integrated Circuit (ASIC). Of course, the processor and the storage medium may also reside as discrete components in an electronic device or host device.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method of multithreaded data processing for use in a RISC architecture processor, the method comprising:
determining whether the byte number of data to be processed is smaller than or equal to the width of a register, wherein the data to be processed is shared by a plurality of threads;
when the byte number of the data to be processed is smaller than or equal to the width of a register, generating an instruction sequence corresponding to a current thread, wherein the instruction sequence comprises the data to be processed and an atomic operation instruction, and the atomic operation instruction is used for realizing the processing of the data to be processed;
and executing the instruction sequence corresponding to the current thread.
2. The method of claim 1, wherein generating the instruction sequence corresponding to the current thread comprises:
if the current thread performs a read operation, generating a first instruction sequence corresponding to the current thread, wherein the first instruction sequence comprises an atomic load instruction and the data to be processed;
the executing the instruction sequence corresponding to the current thread includes:
executing the atomic load instruction to write the data to be processed to the register.
3. The method of claim 2, wherein the data to be processed is address information, and the atomic load instruction comprises a load opcode, a register identification, an address of a currently executing instruction, and an offset;
the executing the atomic load instruction to write the data to be processed to the register comprises:
and acquiring the data to be processed stored at the corresponding position in the instruction sequence by using the loading operation code, the address and the offset, and loading the data to be processed into a register corresponding to the register identifier.
4. The method of claim 1, wherein generating the instruction sequence corresponding to the current thread comprises:
if the current thread performs a write operation, generating a second instruction sequence corresponding to the current thread, wherein the second instruction sequence comprises an atomic write instruction and the data to be processed;
the executing the instruction sequence corresponding to the current thread includes:
executing the atomic write instruction to update the data to be processed in the instruction sequence.
5. A multithreaded data processing apparatus, for use in a RISC architecture processor, the apparatus comprising:
the determining module is used for determining whether the byte number of the data to be processed is smaller than or equal to the width of the register, wherein the data to be processed is data shared by a plurality of threads;
the processing module is used for generating an instruction sequence corresponding to the current thread when the byte number of the data to be processed is smaller than or equal to the width of a register, wherein the instruction sequence comprises the data to be processed and an atomic operation instruction, and the atomic operation instruction is used for realizing the processing of the data to be processed;
and the execution module is used for executing the instruction sequence corresponding to the current thread.
6. The apparatus of claim 5, wherein the processing module is configured to:
if the current thread performs a read operation, generating a first instruction sequence corresponding to the current thread, wherein the first instruction sequence comprises an atomic load instruction and the data to be processed;
the execution module is configured to:
executing the atomic load instruction to write the data to be processed to the register.
7. The apparatus of claim 6, wherein the data to be processed is address information, and the atomic load instruction comprises a load opcode, a register identification, an address of a currently executing instruction, and an offset; the execution module is configured to:
and acquiring the data to be processed stored at the corresponding position in the instruction sequence by using the loading operation code, the address and the offset, and loading the data to be processed into a register corresponding to the register identifier.
8. The apparatus of claim 5, wherein the processing module is configured to:
if the current thread performs a write operation, generating a second instruction sequence corresponding to the current thread, wherein the second instruction sequence comprises an atomic write instruction and the data to be processed;
the execution module is configured to:
executing the atomic write instruction to update the data to be processed in the instruction sequence.
9. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
execution of the computer-executable instructions stored in the memory by the at least one processor causes the at least one processor to perform the multithreaded data processing method according to any one of claims 1 to 4.
10. A computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a processor, implement a method of multithreaded data processing as recited in any of claims 1-4.
CN202011402977.8A 2020-12-04 2020-12-04 Multithreading data processing method and device, electronic equipment and readable storage medium Pending CN112395093A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011402977.8A CN112395093A (en) 2020-12-04 2020-12-04 Multithreading data processing method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011402977.8A CN112395093A (en) 2020-12-04 2020-12-04 Multithreading data processing method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN112395093A true CN112395093A (en) 2021-02-23

Family

ID=74604199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011402977.8A Pending CN112395093A (en) 2020-12-04 2020-12-04 Multithreading data processing method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN112395093A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254073A (en) * 2021-05-31 2021-08-13 厦门紫光展锐科技有限公司 Data processing method and device
CN114327815A (en) * 2021-12-10 2022-04-12 龙芯中科技术股份有限公司 Atomicity keeping method, processor and electronic equipment
CN115408153A (en) * 2022-08-26 2022-11-29 海光信息技术股份有限公司 Instruction distribution method, apparatus and storage medium for multithreaded processor
CN115718622A (en) * 2022-11-25 2023-02-28 苏州睿芯通量科技有限公司 Data processing method and device under ARM architecture and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020078307A1 (en) * 2000-12-15 2002-06-20 Zahir Achmed Rumi Memory-to-memory copy and compare/exchange instructions to support non-blocking synchronization schemes
CN101231585A (en) * 2007-01-26 2008-07-30 辉达公司 Virtual architecture and instruction set for parallel thread computing
CN103299272A (en) * 2010-12-07 2013-09-11 超威半导体公司 Programmable atomic memory using stored atomic procedures
CN107111483A (en) * 2014-10-28 2017-08-29 国际商业机器公司 Control the instruction of the access to the shared register of multiline procedure processor
US20170293486A1 (en) * 2016-04-07 2017-10-12 Imagination Technologies Limited Processors supporting atomic writes to multiword memory locations & methods
CN108701027A (en) * 2016-04-02 2018-10-23 英特尔公司 Processor, method, system and instruction for the broader data atom of data width than primary support to be stored to memory

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020078307A1 (en) * 2000-12-15 2002-06-20 Zahir Achmed Rumi Memory-to-memory copy and compare/exchange instructions to support non-blocking synchronization schemes
CN101231585A (en) * 2007-01-26 2008-07-30 辉达公司 Virtual architecture and instruction set for parallel thread computing
CN103299272A (en) * 2010-12-07 2013-09-11 超威半导体公司 Programmable atomic memory using stored atomic procedures
CN107111483A (en) * 2014-10-28 2017-08-29 国际商业机器公司 Control the instruction of the access to the shared register of multiline procedure processor
CN108701027A (en) * 2016-04-02 2018-10-23 英特尔公司 Processor, method, system and instruction for the broader data atom of data width than primary support to be stored to memory
US20170293486A1 (en) * 2016-04-07 2017-10-12 Imagination Technologies Limited Processors supporting atomic writes to multiword memory locations & methods

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254073A (en) * 2021-05-31 2021-08-13 厦门紫光展锐科技有限公司 Data processing method and device
CN113254073B (en) * 2021-05-31 2022-08-26 厦门紫光展锐科技有限公司 Data processing method and device
CN114327815A (en) * 2021-12-10 2022-04-12 龙芯中科技术股份有限公司 Atomicity keeping method, processor and electronic equipment
WO2023104146A1 (en) * 2021-12-10 2023-06-15 龙芯中科技术股份有限公司 Atomicity maintaining method, processor and electronic device
CN115408153A (en) * 2022-08-26 2022-11-29 海光信息技术股份有限公司 Instruction distribution method, apparatus and storage medium for multithreaded processor
CN115408153B (en) * 2022-08-26 2023-06-30 海光信息技术股份有限公司 Instruction distribution method, device and storage medium of multithreaded processor
CN115718622A (en) * 2022-11-25 2023-02-28 苏州睿芯通量科技有限公司 Data processing method and device under ARM architecture and electronic equipment
CN115718622B (en) * 2022-11-25 2023-10-13 苏州睿芯通量科技有限公司 Data processing method and device under ARM architecture and electronic equipment

Similar Documents

Publication Publication Date Title
CN112395093A (en) Multithreading data processing method and device, electronic equipment and readable storage medium
CA2706737C (en) A multi-reader, multi-writer lock-free ring buffer
US8996845B2 (en) Vector compare-and-exchange operation
KR101581177B1 (en) Provision of extended addressing modes in a single instruction multiple data data processor
US20040076069A1 (en) System and method for initializing a memory device from block oriented NAND flash
EP2842041B1 (en) Data processing system and method for operating a data processing system
TWI808869B (en) Hardware processor and processor
CN108628638B (en) Data processing method and device
CN111208933B (en) Method, device, equipment and storage medium for data access
US8788766B2 (en) Software-accessible hardware support for determining set membership
CN104978284A (en) Processor subroutine cache
EP4152146A1 (en) Data processing method and device, and storage medium
CN115640047B (en) Instruction operation method and device, electronic device and storage medium
EP3825848A1 (en) Data processing method and apparatus, and related product
CN109416632B (en) Apparatus and method for processing data
US20030084232A1 (en) Device and method capable of changing codes of micro-controller
CN115905040B (en) Counter processing method, graphics processor, device and storage medium
US11853412B2 (en) Systems and methods for defeating stack-based cyber attacks by randomizing stack frame size
US20170192838A1 (en) Cpu system including debug logic for gathering debug information, computing system including the cpu system, and debugging method of the computing system
CN111984317A (en) System and method for addressing data in a memory
CN115774724A (en) Concurrent request processing method and device, electronic equipment and storage medium
CN115269199A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN110308933B (en) Access instruction determining method, device and storage medium
US20040205701A1 (en) Computer system, virtual machine, runtime representation of object, storage media and program transmission apparatus
US20130166887A1 (en) Data processing apparatus and data processing method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination