CN111666103A - Instruction processing method and device - Google Patents

Instruction processing method and device

Info

Publication number
CN111666103A
CN111666103A (Application CN202010381861.4A)
Authority
CN
China
Prior art keywords
task
instruction
cpu
execution result
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010381861.4A
Other languages
Chinese (zh)
Inventor
刘浩楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Semiconductor Technology Co Ltd
Original Assignee
New H3C Semiconductor Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Semiconductor Technology Co Ltd filed Critical New H3C Semiconductor Technology Co Ltd
Priority to CN202010381861.4A priority Critical patent/CN111666103A/en
Publication of CN111666103A publication Critical patent/CN111666103A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3004Arrangements for executing specific machine instructions to perform operations on memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3889Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5017Task decomposition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5018Thread allocation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The application provides an instruction processing method and device, wherein the method is applied to a hardware acceleration module and comprises the following steps: receiving an acceleration instruction sent by a CPU, wherein the acceleration instruction comprises attribute information of a task to be executed; constructing a first task instruction for completing the task to be executed according to the attribute information of the task to be executed, wherein the first task instruction comprises a task target and a data storage position; and sending the constructed first task instruction to a corresponding task completion module so that the task completion module completes a corresponding task to be executed according to the task target, and sending an execution result and the data storage position to a storage control unit, wherein the storage control unit stores the execution result to a corresponding storage unit according to the data storage position so that the CPU obtains the execution result from the storage unit.

Description

Instruction processing method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method and an apparatus for processing an instruction.
Background
In a computer system, a program usually has a CPU issuing a certain instruction as a start-stop boundary. Most programs can be viewed as an ordered set of instructions. For some programs with lengthy flow or complicated operations, the relevant instructions occupy a significant portion of the entire instruction set space in order to complete the corresponding operations.
If a program has a relatively fixed instruction issuing sequence, that is, a complex workflow is completely fixed on a frame (for example, a special register is called every time a certain program is executed, and only data read from the register is different every time), a CPU generally performs a large amount of repetitive work on such programs or instructions (for example, the CPU needs to issue a plurality of instructions to complete a table lookup, and the construction of each instruction needs a plurality of instructions to complete steps of logical operation, logical judgment, address calculation, and the like), so that great waste is caused on the performance of instruction set space, CPU working efficiency, and the like.
Disclosure of Invention
In view of this, the present application provides an instruction processing method and apparatus, so as to solve the problem in the prior art that, for some fixed flows, a CPU performs a large amount of repetitive work on the fixed flows, which causes great waste on performance such as instruction set space and CPU operating efficiency.
In a first aspect, the present application provides an instruction processing method, where the method is applied to a hardware acceleration module, and the method includes:
receiving an acceleration instruction sent by a CPU, wherein the acceleration instruction comprises an acceleration flag and attribute information of a task to be executed;
constructing, according to the acceleration flag and the attribute information of the task to be executed, a first task instruction for completing the task to be executed, wherein the first task instruction comprises a task target and a data storage position;
and sending the constructed first task instruction to a corresponding task completion module so that the task completion module completes a corresponding task to be executed according to the task target, and sending an execution result and the data storage position to a storage control unit, wherein the storage control unit stores the execution result to a corresponding storage unit according to the data storage position so that the CPU obtains the execution result from the storage unit.
In a second aspect, the present application provides an instruction processing apparatus, the apparatus comprising:
a receiving unit, used for receiving an acceleration instruction sent by a CPU, wherein the acceleration instruction comprises attribute information of a task to be executed;
the construction unit is used for constructing a first task instruction for completing the task to be executed according to the attribute information of the task to be executed, and the first task instruction comprises a task target and a data storage position;
and the sending unit is used for sending the constructed first task instruction to a corresponding task completion module so that the task completion module completes a corresponding task to be executed according to the task target, and sending an execution result and the data storage position to the storage control unit, and the storage control unit stores the execution result to the corresponding storage unit according to the data storage position so that the CPU obtains the execution result from the storage unit.
Therefore, by applying the instruction processing method and apparatus provided by the present application, after the hardware acceleration module receives the acceleration instruction sent by the CPU, it constructs, according to the acceleration flag included in the acceleration instruction and the attribute information of the task to be executed, a first task instruction for completing the task to be executed. The task instruction comprises a task target and a data storage position. The hardware acceleration module sends the task instruction to the corresponding task completion module, so that the task completion module completes the corresponding task to be executed according to the task target and sends the execution result and the data storage position to the storage control unit; the storage control unit stores the execution result to the corresponding storage unit according to the data storage position, and the CPU obtains the execution result from the storage unit.
This approach solves the problem in the prior art that, for certain fixed flows, the CPU performs a large amount of repetitive work, which wastes instruction set space, reduces CPU working efficiency, and otherwise degrades performance. The goal of simplifying the instruction set is achieved, the number of instructions for specific flows is reduced, and the working efficiency of the CPU is improved.
Drawings
Fig. 1 is a flowchart of an instruction processing method according to an embodiment of the present application;
fig. 2 is a structural diagram of an instruction processing apparatus according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the corresponding listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "at the time of", "when", or "in response to a determination", depending on the context.
The following describes the instruction processing method provided in the embodiment of the present application in detail. Referring to fig. 1, fig. 1 is a flowchart of an instruction processing method according to an embodiment of the present application. The method is applied to a hardware acceleration module and specifically comprises the following steps.
Step 110, receiving an acceleration instruction sent by the CPU, wherein the acceleration instruction comprises an acceleration flag and attribute information of a task to be executed.
Specifically, in the embodiment of the present application, the CPU supports a multi-threaded operating mode. When a thread in an active state in the CPU is about to enter a task-fetching or task-completing flow, the CPU generates an acceleration instruction according to the current task, wherein the acceleration instruction comprises an acceleration flag and attribute information of the task to be executed.
The CPU sends the acceleration instruction to the hardware acceleration module. The hardware acceleration module obtains the acceleration flag from the acceleration instruction and starts itself according to the acceleration flag.
The attribute information of the task to be executed specifically refers to information indicating characteristics of the task to be executed, for example, information indicating characteristics of an execution operation such as reading, writing, and comparison.
In the embodiment of the present application, the acceleration instruction further includes a task target, a task address, control information, and the like. The control information specifically indicates the number of task instructions that the hardware acceleration module will subsequently construct. For example, if the control information is 3, the hardware acceleration module needs to construct 3 task instructions subsequently and complete 3 acceleration passes.
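To make the composition of such an acceleration instruction more concrete, below is a minimal C sketch of one possible layout. The field names, widths, and operation codes are assumptions made for illustration; they are not the encoding defined by this application.

#include <stdint.h>

/* Assumed operation kinds derived from the attribute information
 * ("characteristics of an execution operation such as reading,
 * writing, and comparison"). */
typedef enum { OP_READ = 0, OP_WRITE = 1, OP_COMPARE = 2 } op_kind_t;

/* Illustrative layout of an acceleration instruction: acceleration flag,
 * thread number, attribute information of the task to be executed, task
 * target, task address, data storage position, and control information
 * (how many task instructions the module will construct). */
typedef struct {
    uint8_t   accel_flag;    /* non-zero: handled by the hardware acceleration module */
    uint8_t   thread_no;     /* thread the module takes over from the CPU */
    op_kind_t op;            /* attribute information of the task to be executed */
    uint32_t  task_target;   /* what the task completion module must achieve */
    uint64_t  task_addr;     /* address the task operates on */
    uint64_t  result_addr;   /* data storage position for the execution result */
    uint8_t   control_info;  /* number of task instructions to construct (e.g. 3) */
} accel_instr_t;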
Step 120, constructing, according to the acceleration flag and the attribute information of the task to be executed, a first task instruction for completing the task to be executed, wherein the first task instruction comprises a task target and a data storage position.
Specifically, after receiving the acceleration instruction, the hardware acceleration module obtains the acceleration flag and the attribute information of the task to be executed, and constructs, according to the attribute information of the task to be executed, a first task instruction for completing the task to be executed. The first task instruction comprises a task target and a data storage position.
Further, if the acceleration instruction does not include the acceleration flag, the hardware acceleration module executes the acceleration instruction according to the existing flow for completing an instruction issued by the CPU; this is similar to the existing instruction-completion process and is not repeated here.
Still further, the acceleration instruction in step 110 further includes a thread number. According to the thread number, the hardware acceleration module takes over the thread described in step 110 from the CPU before constructing a task instruction for completing the task to be executed, and then performs a series of fixed operations (e.g., Load/Store, DMA, etc.) in place of the CPU. After the thread is released, the CPU switches to other threads and thus remains in a working state. In this way, the software processing flow is simplified and the computing performance of the CPU is optimized.
Step 130, sending the constructed first task instruction to a corresponding task completion module, so that the task completion module completes a corresponding task to be executed according to the task target, and sending an execution result and the data storage position to a storage control unit, where the storage control unit stores the execution result to a corresponding storage unit according to the data storage position, so that the CPU obtains the execution result from the storage unit.
Specifically, after the hardware acceleration module constructs the first task instruction in place of the CPU, it sends the task instruction to the corresponding task completion module. After receiving the task instruction, the task completion module obtains the task target from the task instruction, executes the task to be executed according to the task target, and, after the execution is finished, sends the execution result (which may specifically include a final execution result, intermediate process data, and the like) and the data storage position to the storage control unit.
And after receiving the execution result and the data storage position, the storage control unit stores the execution result into the corresponding storage unit according to the data storage position, so that the CPU acquires the execution result from the storage unit.
Further, after the hardware acceleration module sends the first task instruction to the corresponding task completion module, it periodically scans the storage control unit to check whether an execution result is present there. According to the thread number, the hardware acceleration module can judge whether the execution results stored by the storage control unit include an execution result corresponding to the thread number. If so, the hardware acceleration module acquires the execution result corresponding to the thread number. The execution result comprises the attribute information of the task to be executed that is required for constructing a second task instruction, and the first task instruction and the second task instruction belong to the same task.
It can be understood that, after obtaining the execution result corresponding to the first task instruction, the hardware acceleration module obtains the attribute information of the task to be executed again from the execution result. According to the acquired attribute information, the hardware acceleration module constructs a second task instruction, and steps 120-130 are repeated until the number of constructed task instructions equals the value indicated by the control information.
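Steps 120 and 130, together with the periodic scanning described above, can be pictured as the loop below. It reuses the accel_instr_t sketch given earlier; the helper functions and stand-in types are assumptions that model hardware behavior, not the application's actual circuitry.

#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in types; the real formats are defined by the hardware. */
typedef struct { op_kind_t op; uint64_t addr; } task_attr_t;
typedef struct { task_attr_t attr; uint32_t target; uint64_t result_addr; } task_instr_t;
typedef struct { task_attr_t next_task_attrs; uint64_t data; } exec_result_t;

/* Hypothetical helpers standing in for hardware behavior. */
extern task_attr_t  extract_attrs(const accel_instr_t *ai);
extern task_instr_t build_task_instr(const task_attr_t *a, uint64_t result_addr);
extern void         send_to_completion_module(const task_instr_t *ti);
extern bool         scan_storage_control_unit(uint8_t thread_no, exec_result_t *out);
extern void         wait_one_cycle(void);
extern void         notify_cpu_done(uint8_t thread_no);

/* One acceleration pass per forged task instruction, repeated until the
 * number of constructed instructions equals the control information. */
void run_acceleration(const accel_instr_t *ai)
{
    task_attr_t attrs = extract_attrs(ai);          /* from the acceleration instruction */

    for (uint8_t n = 0; n < ai->control_info; n++) {
        task_instr_t ti = build_task_instr(&attrs, ai->result_addr);
        send_to_completion_module(&ti);             /* same format as a CPU-issued instruction */

        exec_result_t res;
        while (!scan_storage_control_unit(ai->thread_no, &res))
            wait_one_cycle();                       /* periodic scan for this thread's result */

        attrs = res.next_task_attrs;                /* used to construct the next instruction */
    }
    notify_cpu_done(ai->thread_no);                 /* prompt the CPU to end the acceleration flow */
}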
Further, the task completion module may specifically refer to a generic name of a module for completing a task to be executed, and may include a task target analysis module, a task target forwarding module, and a task execution module. The task target analysis module and the task target forwarding module can be arranged in the thread management module.
Further, before the foregoing step 110, the method further includes a process in which the hardware acceleration module negotiates with the CPU a storage address, in the storage unit, for storing the execution result. Through this process, the hardware acceleration module only needs to track the completion of the acceleration instruction and prompt the CPU to finish the acceleration flow; it does not need to provide, in a register, an address where the execution result is stored, and the CPU will autonomously obtain all the execution results from the agreed address.
The CPU and the hardware acceleration module agree on a storage address for the execution result; this address can be an address designated by the CPU in a storage unit that the CPU can access directly. The CPU notifies the hardware acceleration module of the designated address so that the hardware acceleration module treats it as the address where the execution result is stored.
In the embodiment of the present application, the hardware acceleration module resides in the thread management module, and its most important function is to imitate the instructions of the CPU. Besides adding the hardware acceleration module, the present application modifies the original logic circuits of the thread management module as little as possible (for example, the modules involved in the hardware acceleration process need to check the progress of the current acceleration flow to confirm whether they need to participate in the next step, and the newly added module completes the tracking of the acceleration flow). Related or similar behaviors are usually implemented in the same module, and forged instructions sacrifice the ability to select branches among those behaviors; therefore, a hardware tracking mechanism is needed to ensure that, in a flow the CPU cannot participate in, the hardware acceleration module always knows exactly which behavior needs to be performed. The tracking function is implemented by a dedicated status register: data processing is needed throughout the flow, and the analyzing module can read the state of the current thread to judge the progress of the acceleration flow. The status register is updated when a forged instruction is issued and when the instruction's reply is written back.
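As a rough illustration of this tracking mechanism, the per-thread status register could be modeled as below. The state names, the thread count, and the update points are assumptions made for readability; the application only states that the register is updated when a forged instruction is issued and when the instruction's reply is written back.

#include <stdint.h>

#define MAX_THREADS 16   /* assumed number of hardware threads */

/* Hypothetical per-thread acceleration status that the analyzing module can
 * read to judge how far the acceleration flow has progressed. */
typedef enum {
    ST_IDLE,           /* no acceleration in progress for this thread    */
    ST_INSTR_ISSUED,   /* a forged task instruction has been issued      */
    ST_REPLY_WRITTEN,  /* the instruction's reply has been written back  */
    ST_FLOW_DONE       /* all task instructions of the flow are complete */
} accel_state_t;

static accel_state_t status_reg[MAX_THREADS];

/* Update points named in the description. */
static void on_forged_instr_issued(uint8_t thread_no) { status_reg[thread_no] = ST_INSTR_ISSUED; }
static void on_reply_written_back(uint8_t thread_no)  { status_reg[thread_no] = ST_REPLY_WRITTEN; }

/* Other modules check the register to decide whether to take the next step. */
static accel_state_t current_progress(uint8_t thread_no) { return status_reg[thread_no]; }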
Hardware instruction acceleration, rather than acceleration by dedicated function modules, is adopted so that specific functions such as memory access and memory comparison can be completed without adding corresponding function modules. The CPU-peripheral task completion modules that execute each acceleration step still recognize the original instruction format; the hardware acceleration module only needs to construct instructions whose format is consistent with the format of the instructions issued by the CPU, substituting for the information, such as the specific task target and the data storage position, that the CPU originally provided to these task completion modules.
It should be noted that the instructions involved in the acceleration flow have certain dependencies; the result of executing each instruction participates in constructing the next instruction, otherwise instruction acceleration would lose its meaning. All data related to the execution results is written back in the conventional way agreed between the CPU and the hardware acceleration module. The hardware acceleration module only needs to track the completion of the acceleration instruction and prompt the CPU to finish the acceleration flow; it does not need to provide, in a register, an address where the execution result is stored, and the CPU spontaneously acquires all the execution results from the agreed address. With this prior arrangement, the CPU can access any storage address more freely, without being limited by the instruction format, when retrieving the execution results, and hardware logic can be used to change the instruction reply format and delete or add information not required by software.
Therefore, by applying the instruction processing method and apparatus provided by the present application, after the hardware acceleration module receives the acceleration instruction sent by the thread management module, it constructs at least one task instruction for completing the task to be executed according to the attribute information of the task to be executed included in the acceleration instruction. Each task instruction comprises a task target and a data storage position. The hardware acceleration module sends the task instruction to the corresponding task completion module, so that the task completion module completes the corresponding task to be executed according to the task target and stores the execution result at the address corresponding to the data storage position. After receiving a first notification message sent by the task completion module, the hardware acceleration module stores the execution result held at the data storage position into the RAM; the CPU can directly access the RAM and obtain the execution result from it.
This approach solves the problem in the prior art that, for certain fixed flows, the CPU performs a large amount of repetitive work, which wastes instruction set space, lengthens the CPU's instruction execution cycle, and otherwise degrades performance. The goal of simplifying the instruction set is achieved, the number of instructions for specific flows is reduced, and the execution time of the CPU is reduced.
Optionally, after the foregoing step 130, the method further includes a process in which the hardware acceleration module sends a notification message to the CPU; through this process, the CPU is prompted to finish the acceleration flow.
Specifically, after completing all task instructions of the acceleration flow, the hardware acceleration module generates a notification message and sends it to the CPU.
After receiving the notification message, the CPU determines that the hardware acceleration module has constructed all task instructions of the acceleration flow. The CPU then accesses the address, in the storage unit, agreed in advance with the hardware acceleration module, and obtains the execution results from that address.
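From the CPU's side, the end of the flow might look like the fragment below. The notification check and the agreed address are assumptions standing in for whatever signaling and memory map the real design uses.

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

/* Assumed helpers on the CPU side. */
extern bool accel_notification_arrived(uint8_t thread_no);
extern void consume_result(uint64_t value);

/* The CPU waits for the module's notification, then fetches the execution
 * results directly from the storage address agreed before step 110; no
 * register has to carry the result address. */
void cpu_finish_acceleration(uint8_t thread_no,
                             volatile const uint64_t *agreed_addr,
                             size_t n_results)
{
    while (!accel_notification_arrived(thread_no))
        ;   /* the CPU keeps running other threads in the meantime */

    for (size_t i = 0; i < n_results; i++)
        consume_result(agreed_addr[i]);   /* results sit at the pre-agreed address */
}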
Optionally, this embodiment of the present application further includes a process in which the hardware acceleration module yields to other instructions issued by the CPU.
Specifically, in the embodiment of the present application, a high priority is set for instructions issued by the CPU (the software or microcode layer). When the CPU issues a high-priority instruction while the hardware acceleration module is about to issue a forged task instruction, the hardware acceleration module yields to the instruction issued by the CPU. Therefore, the storage unit is also required to store the task instruction currently forged by the hardware acceleration module, and the hardware acceleration module waits for a gap between instructions issued by the CPU before issuing the forged task instruction to the corresponding task completion module.
Further, the hardware acceleration module receives a pause indication. According to the pause indication, the hardware acceleration module stores the currently constructed task instruction into a storage space reserved, in the storage unit, for the thread (the thread indicated by the thread number included in the acceleration instruction). After receiving a start instruction, the hardware acceleration module sends the currently constructed task instruction to the corresponding task completion module.
It can be understood that the storage unit provides an independent storage space for each thread; under this condition, the CPU can request from the hardware acceleration module a service in which multiple threads simultaneously perform task-fetching/task-completing instruction acceleration.
It should be noted that the module sending the pause indication and the start instruction may specifically be any one of the task target analysis module, the task target forwarding module, and the task execution module, that is, any module other than the hardware acceleration module that executes a step of the acceleration flow.
When such a task module determines that an instruction issued by the CPU is currently pending, or that no instruction issued by the CPU is currently pending, it can be triggered to generate a pause indication or a start instruction, respectively, and send it to the hardware acceleration module.
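The pause/start interaction can be sketched as follows. The per-thread holding area and the function names are assumptions used only to illustrate how a forged instruction yields to CPU-issued instructions; task_instr_t and MAX_THREADS follow the earlier sketches.

#include <stdint.h>
#include <stdbool.h>

/* Per-thread holding area (in the storage unit) for a forged task instruction
 * that must wait while the CPU issues its own higher-priority instructions. */
static task_instr_t pending_instr[MAX_THREADS];
static bool         has_pending[MAX_THREADS];

void on_pause_indication(uint8_t thread_no, const task_instr_t *ti)
{
    pending_instr[thread_no] = *ti;   /* park the currently constructed instruction */
    has_pending[thread_no]   = true;
}

void on_start_instruction(uint8_t thread_no)
{
    if (has_pending[thread_no]) {
        /* issue in a gap between CPU-issued instructions */
        send_to_completion_module(&pending_instr[thread_no]);
        has_pending[thread_no] = false;
    }
}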
Optionally, in this embodiment of the present application, after the hardware acceleration module completes all task instructions of the acceleration flow, it returns the thread to the CPU. When the CPU subsequently switches to this thread, it can process the data directly, so that multiple accesses to remote memory are avoided and the latency caused by memory reads/writes and routing is saved.
It should be noted that, in the embodiment of the present application, if there is no correlation among a plurality of instructions in a program, the hardware acceleration module cannot determine the data or execution results required by the CPU, and thus cannot issue forged task instructions in place of the CPU. Therefore, in the embodiment of the present application, the information required by the hardware acceleration module to forge the next task instruction is stored uniformly in the on-chip memory control unit. When forging a task instruction, the hardware acceleration module uniformly acquires this information from the on-chip memory control unit and forges the task instruction using the acquired information.
As can be seen from the above, the hardware acceleration module needs to have the functions of address calculation, instruction encoding and decoding, data processing, and the like, like the CPU, in order to achieve the predetermined target.
By way of example and not limitation, the information carried by the acceleration instruction is shown in Table 1 below.
[Table 1 (presented in the original publication as image BDA0002482446290000091): information carried by the acceleration instruction]
In one example, consider a data comparison task. In the existing flow, two instructions are needed to read data from two address spaces back into an adjacent on-chip memory that the CPU can access quickly, the CPU then initiates a data comparison instruction, and finally the CPU writes the comparison result into a certain register or storage address with a write instruction. After the instruction processing method provided by the embodiment of the present application is adopted, the CPU only needs to carry the three addresses, at one time, in a single acceleration instruction containing the acceleration flag. After receiving the acceleration instruction, the hardware acceleration module actively forges and issues two task instructions in sequence according to the first two addresses (in this example, these task instructions are read instructions). When the hardware acceleration module scans the storage control unit and finds the corresponding execution results, it forges and issues another task instruction (in this example, a comparison instruction). Finally, after the hardware acceleration module again scans the storage control unit and finds the corresponding execution result, the execution result is written back according to the third address, and the CPU is informed that the task has been completed.
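The comparison example can be restated as the short sketch below, built on the illustrative accel_instr_t layout and assumed helpers (current_thread, set_task_addresses, send_acceleration_instruction). It only shows the CPU side handing the three addresses over in one instruction.

#include <stdint.h>

/* Assumed helpers. */
extern uint8_t current_thread(void);
extern void    set_task_addresses(accel_instr_t *ai,
                                  uint64_t addr_a, uint64_t addr_b, uint64_t addr_result);
extern void    send_acceleration_instruction(const accel_instr_t *ai);

/* Old flow: two read instructions, one compare instruction, one write
 * instruction, each built by the CPU. New flow: the CPU carries all three
 * addresses once, in a single acceleration instruction with the flag set. */
void cpu_issue_compare(uint64_t addr_a, uint64_t addr_b, uint64_t addr_result)
{
    accel_instr_t ai = {
        .accel_flag   = 1,                /* let the hardware acceleration module take over */
        .thread_no    = current_thread(),
        .op           = OP_COMPARE,       /* overall task characteristic; the module derives
                                             the two forged reads from it */
        .control_info = 3,                /* two forged reads + one forged compare */
    };
    set_task_addresses(&ai, addr_a, addr_b, addr_result);  /* pack the three addresses */
    send_acceleration_instruction(&ai);

    /* The CPU may now switch to another thread; once notified, it reads the
     * comparison result back from addr_result. */
}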
Based on the same inventive concept, the embodiment of the present application further provides an instruction processing apparatus corresponding to the instruction processing method described in fig. 1. Referring to fig. 2, fig. 2 is a structural diagram of an instruction processing apparatus according to an embodiment of the present application, where the apparatus includes:
a receiving unit 210, configured to receive an acceleration instruction sent by a CPU, where the acceleration instruction includes attribute information of a task to be executed;
a constructing unit 220, configured to construct a first task instruction for completing the task to be executed according to the attribute information of the task to be executed, where the first task instruction includes a task target and a data storage location;
the sending unit 230 is configured to send the constructed first task instruction to a corresponding task completion module, so that the task completion module completes a corresponding task to be executed according to the task target, and sends an execution result and the data storage location to a storage control unit, and the storage control unit stores the execution result to a corresponding storage unit according to the data storage location, so that the CPU obtains the execution result from the storage unit.
Optionally, the apparatus further comprises: a negotiation unit (not shown in the figure) for negotiating with the CPU a memory address for storing the execution result in the memory unit.
Optionally, the acceleration instruction further comprises a thread number; the device further comprises: and a monitoring unit (not shown in the figure) for monitoring the thread indicated by the thread number according to the thread number.
Optionally, the apparatus further comprises: a scanning unit (not shown in the figure) for periodically scanning the storage control unit;
a judging unit (not shown in the figure) for judging whether an execution result corresponding to the thread number exists in the execution results stored by the storage control unit according to the thread number;
and an obtaining unit (not shown in the figure), configured to obtain, if such an execution result exists, the execution result corresponding to the thread number, where the execution result includes attribute information of a task to be executed required for constructing a second task instruction, and the first task instruction and the second task instruction belong to the same task.
Optionally, the sending unit 230 is further configured to send a notification message to the CPU, where the notification message is used to enable the CPU to determine that the task to be executed has been completed.
Optionally, the receiving unit 210 is further configured to receive a pause indication;
the device further comprises: a storage unit (not shown in the figure) for storing a currently constructed task instruction into a storage space reserved for the thread indicated by the thread number in the storage unit according to the suspension indication;
the sending unit 230 is further configured to send the currently constructed task instruction to the corresponding task completion module after receiving the start instruction.
Therefore, by applying the instruction processing device provided by the present application, after receiving the acceleration instruction sent by the CPU, the device constructs, according to the acceleration flag included in the acceleration instruction and the attribute information of the task to be executed, a first task instruction for completing the task to be executed. The task instruction comprises a task target and a data storage position. The device sends the task instruction to the corresponding task completion module, so that the task completion module completes the corresponding task to be executed according to the task target and sends the execution result and the data storage position to the storage control unit; the storage control unit stores the execution result to the corresponding storage unit according to the data storage position, and the CPU obtains the execution result from the storage unit.
This approach solves the problem in the prior art that, for certain fixed flows, the CPU performs a large amount of repetitive work, which wastes instruction set space, reduces CPU working efficiency, and otherwise degrades performance. The goal of simplifying the instruction set is achieved, the number of instructions for specific flows is reduced, and the working efficiency of the CPU is improved.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
For the embodiment of the instruction processing apparatus, since the content of the related method is substantially similar to that of the foregoing method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (12)

1. An instruction processing method, applied to a hardware acceleration module, the method comprising:
receiving an acceleration instruction sent by a CPU, wherein the acceleration instruction comprises an acceleration flag and attribute information of a task to be executed;
constructing, according to the acceleration flag and the attribute information of the task to be executed, a first task instruction for completing the task to be executed, wherein the first task instruction comprises a task target and a data storage position;
and sending the constructed first task instruction to a corresponding task completion module so that the task completion module completes a corresponding task to be executed according to the task target, and sending an execution result and the data storage position to a storage control unit, wherein the storage control unit stores the execution result to a corresponding storage unit according to the data storage position, so that the CPU obtains the execution result from the storage unit.
2. The method of claim 1, wherein before the hardware acceleration module receives an acceleration instruction sent by a CPU, the method further comprises:
and negotiating with the CPU about a storage address used for storing the execution result in the storage unit.
3. The method of claim 1, wherein the acceleration instruction further comprises a thread number;
before the constructing completes the first task instruction of the task to be executed, the method further comprises:
and monitoring the thread indicated by the thread number according to the thread number.
4. The method of claim 3, further comprising:
periodically scanning the storage control unit;
judging whether an execution result corresponding to the thread number exists in the execution results stored by the storage control unit according to the thread number;
and if the first task instruction and the second task instruction belong to the same task, acquiring an execution result corresponding to the thread number, wherein the execution result comprises attribute information of a task to be executed required by constructing the second task instruction, and the first task instruction and the second task instruction belong to the same task.
5. The method of claim 1, further comprising:
and sending a notification message to the CPU, wherein the notification message is used for enabling the CPU to determine that the task to be executed is executed and completed.
6. The method of claim 3, further comprising:
receiving a pause indication;
according to the pause indication, storing a currently constructed task instruction into a storage space reserved for the thread indicated by the thread number in the storage unit;
and after receiving a starting instruction, sending the currently constructed task instruction to a corresponding task completion module.
7. An instruction processing apparatus, characterized in that the apparatus comprises:
a receiving unit, used for receiving an acceleration instruction sent by a CPU, wherein the acceleration instruction comprises attribute information of a task to be executed;
the construction unit is used for constructing a first task instruction for completing the task to be executed according to the attribute information of the task to be executed, and the first task instruction comprises a task target and a data storage position;
and the sending unit is used for sending the constructed first task instruction to a corresponding task completion module so that the task completion module completes a corresponding task to be executed according to the task target, and sending an execution result and the data storage position to the storage control unit, and the storage control unit stores the execution result to the corresponding storage unit according to the data storage position so that the CPU obtains the execution result from the storage unit.
8. The apparatus of claim 7, further comprising:
and the negotiation unit is used for negotiating with the CPU about a storage address used for storing the execution result in the storage unit.
9. The apparatus of claim 7, wherein the acceleration instruction further comprises a thread number; the device further comprises:
and the monitoring unit is used for monitoring the thread indicated by the thread number according to the thread number.
10. The apparatus of claim 9, further comprising:
a scanning unit for periodically scanning the storage control unit;
a judging unit, configured to judge whether an execution result corresponding to the thread number exists in the execution results stored in the storage control unit according to the thread number;
and an obtaining unit, configured to obtain, if such an execution result exists, the execution result corresponding to the thread number, wherein the execution result comprises attribute information of a task to be executed required for constructing a second task instruction, and the first task instruction and the second task instruction belong to the same task.
11. The apparatus according to claim 7, wherein the sending unit is further configured to send a notification message to the CPU, and the notification message is configured to enable the CPU to determine that the task to be executed has been completed.
12. The apparatus of claim 9, wherein the receiving unit is further configured to receive a pause indication;
the device further comprises: the storage unit is used for storing a currently constructed task instruction into a storage space reserved for the thread indicated by the thread number in the storage unit according to the suspension indication;
and the sending unit is also used for sending the currently constructed task instruction to the corresponding task completion module after receiving the starting instruction.
CN202010381861.4A 2020-05-08 2020-05-08 Instruction processing method and device Pending CN111666103A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010381861.4A CN111666103A (en) 2020-05-08 2020-05-08 Instruction processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010381861.4A CN111666103A (en) 2020-05-08 2020-05-08 Instruction processing method and device

Publications (1)

Publication Number Publication Date
CN111666103A true CN111666103A (en) 2020-09-15

Family

ID=72383113

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010381861.4A Pending CN111666103A (en) 2020-05-08 2020-05-08 Instruction processing method and device

Country Status (1)

Country Link
CN (1) CN111666103A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003182A (en) * 2022-01-04 2022-02-01 苏州浪潮智能科技有限公司 Instruction interaction method and device, storage equipment and medium

Similar Documents

Publication Publication Date Title
US6996821B1 (en) Data processing systems and method for batching tasks of the same type in an instruction cache
US7716396B1 (en) Multi-reader multi-writer circular buffer memory
RU2004126679A (en) AGENT, METHOD AND COMPUTER SYSTEM FOR MATCHING IN A VIRTUAL ENVIRONMENT
CN109831520A (en) A kind of timed task dispatching method and relevant apparatus
CN109308213B (en) Multi-task breakpoint debugging method based on improved task scheduling mechanism
US20230274129A1 (en) Method for execution of computational graph in neural network model and apparatus thereof
CN102934102A (en) Multiprocessor system, execution control method and execution control program
CN112083882B (en) SRAM (static random Access memory) dead point processing method, system and device and computer equipment
CN106815080A (en) Distributed diagram data treating method and apparatus
CN111666103A (en) Instruction processing method and device
CN1713134B (en) Virtual machine control structure decoder
CN110706108B (en) Method and apparatus for concurrently executing transactions in a blockchain
CN102214094B (en) Operation is performed via asynchronous programming model
CN111310638A (en) Data processing method and device and computer readable storage medium
CN103988462A (en) A register renaming data processing apparatus and method for performing register renaming
CN108062224B (en) Data reading and writing method and device based on file handle and computing equipment
CN109408532A (en) Data capture method, device, computer equipment and storage medium
CN111258653B (en) Atomic access and storage method, storage medium, computer equipment, device and system
CN112306420A (en) Data read-write method, device and equipment based on storage pool and storage medium
CN102346681B (en) General-purpose simulator
CN111045959A (en) Complex algorithm variable mapping method based on storage optimization
CN112181893A (en) Communication method and system between multi-core processor cores in vehicle controller
EP3291096B1 (en) Storage system and device scanning method
CN105279103A (en) Data management method and apparatus
US7849164B2 (en) Configuring a device in a network via steps

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination