CN116149603A - Operation instruction processing method and system, main processor and coprocessor - Google Patents

Operation instruction processing method and system, main processor and coprocessor

Info

Publication number
CN116149603A
Authority
CN
China
Prior art keywords
operation instruction
instruction
coprocessor
exponentiation
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111396203.3A
Other languages
Chinese (zh)
Inventor
陈东坡
袁博浒
董卫民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Yuefang Technology Co ltd
Original Assignee
Guangdong Yuefang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Yuefang Technology Co ltd filed Critical Guangdong Yuefang Technology Co ltd
Priority to CN202111396203.3A priority Critical patent/CN116149603A/en
Priority to PCT/CN2022/111463 priority patent/WO2023093128A1/en
Publication of CN116149603A publication Critical patent/CN116149603A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/552 Powers or roots, e.g. Pythagorean sums
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

An operation instruction processing method and system, a main processor, and a coprocessor, wherein the method comprises the following steps: the main processor first acquires an operation instruction to be processed and judges whether the operation instruction is a coprocessor operation instruction; when the operation instruction is determined to be a coprocessor operation instruction, the main processor sends the operation instruction to a coprocessor so that the coprocessor executes it, and then obtains the execution result of the operation instruction sent back by the coprocessor. Because coprocessor operation instructions are sent to the coprocessor for execution, the number of instruction fetches performed by the main processor is reduced and the execution efficiency of the main processor is improved.

Description

Operation instruction processing method and system, main processor and coprocessor
Technical Field
The present invention relates to the field of integrated circuits, and in particular, to a method and a system for processing an operation instruction, a main processor, and a coprocessor.
Background
The processor is the operation and control core of a computer system and the final execution unit for information processing and program running. Processors emerged in the age of large-scale integrated circuits, and iterative updates of processor architecture design together with continued advances in integrated circuit technology have driven their development. Since their birth, processors have evolved rapidly: from being dedicated to mathematical computation to widespread use in general-purpose computing; from 4-bit to 8-bit, 16-bit, and 32-bit processors, and finally to 64-bit processors; and from mutually incompatible vendor designs to a variety of instruction set architecture specifications.
Modern processors typically employ pipelining to process instructions in parallel and thereby improve instruction processing efficiency. The instructions processed by the processor comprise branch instructions, logic operation instructions, memory access instructions, and the like.
However, the conventional processor has a problem of low efficiency when executing a logical operation instruction.
Disclosure of Invention
The invention provides a method and a system for processing an operation instruction, a main processor and a coprocessor so as to improve the speed of operation processing.
In order to solve the above problems, the present invention provides an operation instruction processing method including:
acquiring an operation instruction to be processed;
judging whether the operation instruction is a coprocessor operation instruction or not;
when the operation instruction is determined to be a coprocessor operation instruction, sending the operation instruction to a coprocessor so that the coprocessor executes the operation instruction;
and acquiring an execution result of the operation instruction sent by the coprocessor.
Optionally, the coprocessor operation instruction is an exponentiation operation instruction.
Optionally, the exponentiation instruction includes a logical operator field; wherein the logical operator field is used for indicating information of exponentiation logical operation;
the determining whether the operation instruction is an exponentiation operation instruction includes:
analyzing the operation instruction to acquire information of a corresponding logical operator field;
and when the logic operator field obtained by analysis is determined to be a preset numerical value, determining the operation instruction to be an exponentiation operation instruction.
Optionally, when it is determined that the operation instruction is not a coprocessor operation instruction, further comprising:
executing the operation instruction and obtaining an execution result of the operation instruction.
Correspondingly, the embodiment of the invention also provides an operation instruction processing method, which comprises the following steps:
acquiring a coprocessor operation instruction sent by a main processor;
executing the coprocessor operation instruction to obtain a corresponding execution result;
and sending the execution result of the coprocessor operation instruction to the main processor.
Optionally, the coprocessor operation instruction is an exponentiation operation instruction.
Optionally, the exponentiation instruction includes an instruction bit field, a logical operator field, a base register field, an exponent register field, and a destination register field; the instruction bit field is used for indicating information of an instruction type of the operation instruction, the logic operator field is used for indicating information of an exponentiation logic operation, the base register field is used for indicating information of a base register address storing the exponentiation operation, the exponent register field is used for indicating information of an exponent register address storing the exponentiation operation, and the destination register field is used for indicating information of an address storing a result of the exponentiation operation;
the step of executing the operation instruction includes:
parsing the operation instruction to obtain information of the corresponding instruction bit field, logical operator field, base register field, exponent register field, and destination register field;
acquiring address information of the base register and the exponent register from the base register field and the exponent register field respectively, and reading the corresponding base and exponent from the base register address and the exponent register address respectively;
and executing the corresponding exponentiation operation according to the acquired information of the instruction bit field and the logical operator field and the read base and exponent, and acquiring a corresponding execution result.
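The three execution steps above can be illustrated with a minimal Python sketch. The fields are assumed to be already parsed, the register file is modelled as a plain list, and the names `PowFields` and `execute_pow`, as well as the 32-bit wrap-around of the result, are assumptions made purely for illustration; the text does not state the register width or how the power is computed internally.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # assumed 32-bit registers (not stated in the source)


@dataclass
class PowFields:
    """Hypothetical container for the parsed fields of an exponentiation instruction."""
    base_reg: int      # address of the register holding the base
    exponent_reg: int  # address of the register holding the exponent
    dest_reg: int      # address of the register that receives the result


def execute_pow(fields: PowFields, regs: list[int]) -> int:
    """Read base and exponent from the register file, compute base**exponent,
    store the wrapped result in the destination register, and return it."""
    base = regs[fields.base_reg]
    exponent = regs[fields.exponent_reg]
    result = pow(base, exponent, MASK32 + 1)  # modular power keeps the result in 32 bits
    regs[fields.dest_reg] = result
    return result


regs = [0] * 32
regs[6], regs[7] = 3, 4  # base 3, exponent 4
print(execute_pow(PowFields(base_reg=6, exponent_reg=7, dest_reg=5), regs))  # prints 81
```

The sketch returns the result in addition to writing it to the destination register, mirroring the step in which the coprocessor obtains an execution result that is later sent to the main processor.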
Correspondingly, the embodiment of the invention also provides a main processor, which comprises:
the first acquisition unit is suitable for acquiring an operation instruction to be processed;
the judging unit is suitable for judging whether the operation instruction is a coprocessor operation instruction or not;
a sending unit adapted to send the operation instruction to a coprocessor when the operation instruction is determined to be a coprocessor operation instruction, so that the coprocessor executes the operation instruction;
and the second acquisition unit is suitable for acquiring an execution result of the operation instruction sent by the coprocessor.
Correspondingly, the embodiment of the invention also provides a coprocessor, which comprises the following components:
the third acquisition unit is suitable for acquiring a coprocessor operation instruction sent by the main processor;
the execution unit is suitable for executing the coprocessor operation instruction and acquiring a corresponding execution result;
and the second sending unit is suitable for sending the execution result of the coprocessor operation instruction to the main processor.
Correspondingly, the embodiment of the invention also provides an operation instruction processing system, which comprises:
a main processor as described above;
and a coprocessor as described above.
Compared with the prior art, the technical scheme of the invention has the following advantages:
according to the operation instruction processing method in the embodiment of the invention, a main processor first acquires an operation instruction to be processed and judges whether the operation instruction is a coprocessor operation instruction; when the operation instruction is determined to be a coprocessor operation instruction, the main processor sends the operation instruction to a coprocessor so that the coprocessor executes it, and then obtains the execution result of the operation instruction sent back by the coprocessor. Because coprocessor operation instructions are sent to the coprocessor for execution, the number of instruction fetches performed by the main processor can be reduced, the required memory read-write time is shortened, and the execution efficiency of the processor is improved.
Drawings
FIG. 1 is a schematic diagram of an instruction processing system according to an embodiment of the present invention;
FIG. 2 is a flow chart of an operation instruction processing method according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a form of an exponentiation instruction according to an embodiment of the present invention;
FIG. 4 is a table of the numerical settings of operand fields in a RISC-V computer instruction in an embodiment of the present invention;
FIG. 5 is a schematic diagram of an execution unit in a coprocessor according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an exponentiation logic operation unit in an embodiment of the invention.
Detailed Description
As known from the background art, the conventional operation instruction processing method has a problem of low efficiency.
Modern processors typically execute computer instructions in parallel using pipelining techniques to speed up the processing efficiency of the computer instructions. The instructions processed by the processor comprise a branch instruction, a logic operation instruction, a memory access instruction and the like.
In the process of processing logic operation instructions, the read-write time of a memory is one of reasons for influencing the performance of a processor; if the processor needs to rely on the execution result of the logic operation instruction to execute the subsequent instruction, the long time of reading and writing the memory can cause the pipeline to have a long time stop, thereby causing the performance loss of the processor.
For exponentiation operations, existing processors typically compute the result by repeated multiplication. For example, for the exponentiation y = x^n, the instruction codes are as follows:
[Figure: instruction code listing for the exponentiation y = x^n]
The implementation steps of the instruction codes specifically include:
step (1): storing the base number x into a register t0;
step (2): storing the exponent n into a register t1;
step (3): storing 0x1 into a register t2;
step (4): entering a loop (loop);
step (5): multiplying the value of the register t2 by the value of the register t0, and storing the result into the register t2;
step (6): subtracting 1 from the value of the register t1;
step (7): judging whether the value of the register t1 is 0; when the value of the register t1 is 0, exiting the loop; otherwise, returning to step (5) and repeating until the value of the register t1 is 0.
As can be seen from the above description, the processor needs to perform 7 + (n - 1) × 3 steps (n > 1) when performing the exponentiation operation: steps (1) to (4) are executed once, and steps (5) to (7) are executed n times. The larger the value of n, the more steps the processor must execute and, accordingly, the longer the memory read-write time required. Therefore, the conventional processor has a problem of inefficiency in performing the exponentiation operation.
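The step count can be checked with a small Python model of the loop. This is a sketch only: the function name `software_pow` and the convention of counting one unit per numbered step are assumptions for illustration, and the model does not reproduce the actual instruction listing.

```python
def software_pow(x: int, n: int) -> tuple[int, int]:
    """Compute x**n with the multiply/decrement/test loop of steps (1) to (7).

    Returns (result, steps), counting one unit per numbered step.
    Valid for n >= 1, because the loop body runs before the exit test.
    """
    if n < 1:
        raise ValueError("the loop as described requires n >= 1")
    steps = 0
    t0 = x          # step (1): base into t0
    steps += 1
    t1 = n          # step (2): exponent into t1
    steps += 1
    t2 = 1          # step (3): 0x1 into t2
    steps += 1
    steps += 1      # step (4): enter the loop
    while True:
        t2 = t2 * t0    # step (5): multiply
        steps += 1
        t1 = t1 - 1     # step (6): decrement the counter
        steps += 1
        steps += 1      # step (7): test whether t1 is 0
        if t1 == 0:
            break
    return t2, steps


result, steps = software_pow(2, 5)
print(result, steps)  # prints 32 19, and 7 + (5 - 1) * 3 = 19
```

Under this counting convention the number of steps grows linearly with n, which is the cost that a single coprocessor exponentiation instruction is intended to avoid on the main processor.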
In order to solve the above problems, in the technical solution of the embodiment of the present invention, a main processor first acquires an operation instruction to be processed and judges whether the operation instruction is a coprocessor operation instruction; when the operation instruction is determined to be a coprocessor operation instruction, the main processor sends the operation instruction to a coprocessor so that the coprocessor executes it, and then obtains the execution result of the operation instruction sent back by the coprocessor. Because coprocessor operation instructions are sent to the coprocessor for execution, the number of instruction fetches performed by the main processor can be reduced, the required memory read-write time is shortened, and the execution efficiency of the processor is improved.
In order that the above objects, features and advantages of embodiments of the invention may be readily understood, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.
For ease of understanding, the following first describes an operation instruction processing system in an embodiment of the present invention.
Fig. 1 shows the structure of an operation instruction processing system in an embodiment of the present invention. Referring to fig. 1, the operation instruction processing system includes a main processor 10 and a coprocessor 20, wherein the main processor 10 and the coprocessor 20 are coupled to each other.
The main processor 10 is adapted to obtain an operation instruction to be processed; judging whether the operation instruction is a coprocessor operation instruction or not; when the operation instruction is determined to be a coprocessor operation instruction, sending the operation instruction to a coprocessor so that the coprocessor executes the operation instruction; and acquiring an execution result of the operation instruction sent by the coprocessor.
Specifically, the main processor 10 may include a first acquisition unit (not shown), a judging unit (not shown), a sending unit (not shown), and a second acquisition unit (not shown), wherein:
the first acquisition unit is suitable for acquiring an operation instruction to be processed;
the judging unit is suitable for judging whether the operation instruction is a coprocessor operation instruction or not;
the sending unit is suitable for sending the operation instruction to the coprocessor when the operation instruction is determined to be the coprocessor operation instruction, so that the coprocessor executes the operation instruction;
the second obtaining unit is suitable for obtaining the execution result of the operation instruction sent by the coprocessor.
The coprocessor 20 is adapted to acquire an operation instruction sent by the main processor; executing the operation instruction to obtain a corresponding execution result; and sending the execution result of the operation instruction to the main processor.
Specifically, the coprocessor 20 may include a third acquisition unit (not shown), an execution unit (not shown), and a second sending unit (not shown), wherein:
the third acquisition unit is suitable for acquiring a coprocessor operation instruction sent by the main processor;
the execution unit is suitable for executing the coprocessor operation instruction and obtaining a corresponding execution result;
the second sending unit is suitable for sending the execution result of the coprocessor operation instruction to the main processor.
In this embodiment, the main processor 10 is a fifth generation reduced instruction set computer (Reduced Instruction Set Computing-fiVe, RISC-V) microprocessor. In other embodiments, the main processor can also be a complex instruction set computer (Complex Instruction Set Computer, CISC) microprocessor, a very long instruction word (Very Long Instruction Word, VLIW) microprocessor, a processor implementing a combination of instruction sets, or any other processor device, such as a digital signal processor.
In this embodiment, the main processor 10 includes a processor core to execute computer instructions.
In this embodiment, the main processor 10 is a reduced instruction set computer processor, and correspondingly, the processor core of the main processor 10 is a reduced instruction set computer processor core. In other embodiments, the processor core of the main processor 10 may be a processor core of any other type of architecture, such as a CISC processor core, a VLIW processor core, or a hybrid processor core.
The number of processor cores included in the main processor 10 may be set to one or more according to actual needs.
Where the main processor 10 includes a plurality of processor cores, the plurality of processor cores may be homogeneous or heterogeneous in architecture and/or instruction set. For example, some of the plurality of processor cores may execute computer instructions in an in-order manner, while other processor cores may execute computer instructions in an out-of-order manner; alternatively, two or more processor cores may execute the same set of computer instructions, while other processor cores may execute different sets of computer instructions or subsets of the set of computer instructions.
In this embodiment, the main processor 10 is coupled to a memory through a memory interface to obtain and execute computer instructions from the memory.
In other embodiments, a corresponding cache is further provided between the main processor and the memory, so as to accelerate the access speed between the processor core of the main processor and the memory.
The cache typically has a multi-level structure. A three-level cache structure is the most common, divided into a first-level (L1) cache, a second-level (L2) cache, and a third-level (L3) cache. Of course, embodiments of the present invention may also support structures with more or fewer than three levels of cache.
As an example, all or part of the cache is provided within the main processor. As one example, all of the cache is provided within the main processor; in some cases, embodiments of the present invention may also support all of the cache residing outside of the main processor.
As another example, all or part of the cache is disposed in the memory. As one example, all of the cache is disposed in the memory; in some cases, embodiments of the present invention may also support all of the cache being located outside of the memory.
In this embodiment, the main processor 10 is a central processing unit (Central Processing Unit, CPU). In other embodiments, the main processor may also be an accelerator (e.g., a graphics accelerator or digital signal processing unit), a graphics processor (Graphics Processing Unit, GPU), a field programmable gate array, or any other processor with computer instruction execution functionality.
In this embodiment, the coprocessor 20 is a RISC-V microprocessor. In other embodiments, the coprocessor can also be a complex instruction set computer microprocessor, a very long instruction word microprocessor, a processor implementing a combination of instruction sets, or any other processor device, such as a digital signal processor.
In this embodiment, the coprocessor 20 includes a coprocessor core to execute computer instructions.
In this embodiment, the coprocessor 20 is a RISC-V microprocessor, and correspondingly, the coprocessor core is a RISC-V processor core. In other embodiments, the coprocessor core may also be a processor core of any other type of architecture, such as a CISC processor core, a VLIW processor core, or a hybrid processor core.
The number of coprocessor cores included in the coprocessor 20 may be set to one or more according to actual needs.
Where the coprocessor 20 includes multiple coprocessor cores, the multiple coprocessor cores may be homogeneous or heterogeneous in architecture and/or instruction set. For example, some of the coprocessor cores may execute computer instructions in an in-order manner, while other coprocessor cores may execute computer instructions in an out-of-order manner; alternatively, two or more coprocessor cores may execute the same set of computer instructions, while other coprocessor cores may execute different sets of computer instructions or subsets of the set of computer instructions.
In this embodiment, the coprocessor 20 is coupled to the main processor 10 through an interface, so as to receive and execute computer instructions sent by the main processor 10.
It should be noted that the main processor and the coprocessor may also include other circuits. Since these circuits are not necessary for understanding the disclosure of the embodiments of the present invention, they are not described in detail herein.
The operation instruction processing method of the above operation instruction processing system will be described in further detail below.
Fig. 2 is a flow chart of an operation instruction processing method in an embodiment of the invention. Referring to fig. 2, an operation instruction processing method in an embodiment of the present invention specifically includes the following steps:
step S201: the main processor acquires an operation instruction to be processed;
step S202: the main processor judges whether the operation instruction is a coprocessor operation instruction; when the determination result is yes, steps S203 to S205 may be performed; otherwise, step S206 may be performed;
step S203: when the main processor determines that the operation instruction is a coprocessor operation instruction, the main processor sends the operation instruction to a coprocessor;
step S204: the coprocessor receives and executes the operation instruction, acquires a corresponding execution result, and sends the execution result to the main processor;
step S205: the main processor acquires the execution result of the operation instruction sent by the coprocessor;
step S206: the main processor executes the operation instruction to acquire a corresponding execution result.
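The flow of steps S201 to S206 can be sketched in Python as follows. The classes `Instruction`, `Coprocessor`, and `MainProcessor`, the `kind` tag, and the choice of an addition as the locally executed instruction are all hypothetical and serve only to make the dispatch decision concrete; they are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    kind: str  # hypothetical tag; "pow" marks a coprocessor operation instruction
    a: int
    b: int


class Coprocessor:
    def execute(self, instr: Instruction) -> int:
        # S204: receive and execute the instruction, then return the result
        return instr.a ** instr.b


class MainProcessor:
    def __init__(self, coprocessor: Coprocessor) -> None:
        self.coprocessor = coprocessor

    def is_coprocessor_instruction(self, instr: Instruction) -> bool:
        # S202: judge the instruction type
        return instr.kind == "pow"

    def execute_locally(self, instr: Instruction) -> int:
        # S206: the main processor executes the instruction itself
        if instr.kind == "add":
            return instr.a + instr.b
        raise ValueError(f"unsupported instruction kind: {instr.kind}")

    def process(self, instr: Instruction) -> int:
        # S201 is modelled by the caller handing over the fetched instruction
        if self.is_coprocessor_instruction(instr):
            # S203 to S205: send to the coprocessor and collect its result
            return self.coprocessor.execute(instr)
        return self.execute_locally(instr)


cpu = MainProcessor(Coprocessor())
print(cpu.process(Instruction("pow", 2, 10)))  # prints 1024
print(cpu.process(Instruction("add", 2, 3)))   # prints 5
```

In this model the main processor spends a single dispatch on the exponentiation, regardless of the size of the exponent.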
Referring to fig. 2, step S201 is performed, where the main processor acquires an operation instruction to be processed.
In this embodiment, the main processor processes an operation instruction using a five-stage pipeline, which is divided into instruction fetch (Instruction Fetch), decode (Instruction Decode), execute (Execute), memory access (Memory Access), and write back (Write Back). The main processor's acquisition of the operation instruction to be processed corresponds to the instruction fetch stage of the five-stage pipeline.
It will be appreciated that the main processor may also use a pipeline with more or fewer stages, such as a four-stage pipeline, which is not limited herein.
In this embodiment, the operation instruction is stored in a memory, and the corresponding main processor obtains the operation instruction to be processed from the memory. Specifically, the main processor is coupled with the memory through a memory interface, and the main processor correspondingly acquires an operation instruction to be processed from the memory through the memory interface.
A buffer, such as a cache, may also be disposed between the main processor and the memory to speed up access between the main processor and the memory. Correspondingly, the main processor can also acquire the operation instruction to be processed from the cache. Specifically, the main processor is coupled with the cache through a cache interface, and the main processor acquires the operation instruction to be processed from the cache through the cache interface.
In other embodiments, the main processor may further obtain the operation instruction to be processed from a device other than the memory and the cache.
As one example, the main processor may obtain the operation instruction to be processed from a peripheral device. Specifically, the main processor is coupled with the peripheral device through a bus interconnection matrix and a peripheral interface in sequence, and the main processor acquires the operation instruction to be processed from the peripheral device through the bus interconnection matrix and the peripheral interface.
The peripheral device may be an information sensor, such as a global positioning system, an infrared sensor, a laser scanner, etc., a computer information storage device, such as a magnetic disk, an optical disk, a magnetic tape, etc., or a human interaction device, such as a printer, a display, a plotter, a speech synthesizer, etc.
The peripheral interface may be a universal asynchronous receiver/transmitter (Universal Asynchronous Receiver/Transmitter, UART) interface, an inter-integrated circuit (Inter-Integrated Circuit, I2C) interface, a serial peripheral interface (Serial Peripheral Interface, SPI), a general-purpose input/output (General-Purpose Input/Output, GPIO) interface, an inter-IC sound (Inter-IC Sound, I2S) interface, a serial audio interface (Serial Audio Interface, SAI), a controller area network (Controller Area Network, CAN) interface, a USB interface, etc., which may be chosen by those skilled in the art according to actual needs and is not limited herein.
In this embodiment, the operation instruction includes an arithmetic logic operation instruction.
Referring to fig. 2, step S202 is executed, in which the main processor determines whether the operation instruction is a coprocessor operation instruction; when the determination result is yes, step S203 to step S205 may be performed; otherwise, step S206 may be performed.
In a specific implementation, the main processor may parse the acquired operation instruction to determine whether the operation instruction is a coprocessor operation instruction.
Specifically, the operation instruction carries corresponding instruction identification information. The main processor parses the operation instruction to acquire the instruction identification information, and judges from it whether the operation instruction is a coprocessor operation instruction.
For example, when the instruction identification obtained by parsing is a first value, the main processor may determine that the operation instruction is a coprocessor operation instruction; when the instruction identification obtained by parsing is a second value, the main processor may determine that the operation instruction is not a coprocessor operation instruction.
In this embodiment, the coprocessor operation instruction is an exponentiation operation instruction. In other embodiments, the coprocessor operation instruction may also be another logic operation instruction, for example a more complex logic operation instruction that includes exponentiation, such as an exponentiation-add instruction or an exponentiation-divide instruction.
In this embodiment, the operation instruction includes an instruction bit field, a logical operator field, a base register field, an exponent register field, and a destination register field. The instruction bit field is used for indicating instruction type information, namely indicating whether the operation instruction is a main processor operation instruction or a coprocessor instruction, the logic operator field is used for indicating information of the exponentiation logic operation, the base register field is used for indicating information of a base register address for storing the exponentiation operation, the exponent register field is used for indicating information of an exponent register address for storing the exponentiation operation, and the destination register field is used for indicating information of an address for storing a result of the exponentiation operation.
In this embodiment, the main processor parses the operation instruction to obtain the value of the logical operator field in the operation instruction. When the value of the logical operator field equals the corresponding preset value, the operation instruction is determined to be an exponentiation operation instruction; otherwise, the operation instruction is determined not to be an exponentiation operation instruction.
FIG. 3 illustrates a format of an exponentiation instruction in an embodiment of the invention; FIG. 4 illustrates selected combinations of instruction bit fields of a 32-bit reduced instruction set computer instruction.
Referring to fig. 3 and 4 in combination, taking the length of the exponentiation instruction as 32 bits as an example:
For a 32-bit reduced instruction set computer instruction, the bits are numbered 0 through 31 in sequence from left to right. Bits 0 to 6 are the operand field (the opcode field in RISC-V terminology), bits 7 to 11 are the destination register field, bits 12 to 14 are the logical operation extension field, bits 15 to 19 are the base register field, bits 20 to 24 are the exponent register field, and bits 25 to 31 are the instruction bit field.
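The field boundaries listed above can be expressed as shifts and masks. The following Python sketch assumes that bit 0 is the least significant bit of the 32-bit word, as in the standard RISC-V numbering; the function names `encode` and `decode` and the dictionary keys are hypothetical.

```python
def encode(operand: int, dest_reg: int, ext: int,
           base_reg: int, exponent_reg: int, instr_bits: int) -> int:
    """Pack the six fields into a 32-bit word (bit 0 assumed least significant)."""
    return ((instr_bits & 0x7F) << 25      # bits 25-31: instruction bit field
            | (exponent_reg & 0x1F) << 20  # bits 20-24: exponent register field
            | (base_reg & 0x1F) << 15      # bits 15-19: base register field
            | (ext & 0x7) << 12            # bits 12-14: logical operation extension field
            | (dest_reg & 0x1F) << 7       # bits 7-11: destination register field
            | (operand & 0x7F))            # bits 0-6: operand field


def decode(word: int) -> dict[str, int]:
    """Split a 32-bit word into the six fields described in the text."""
    return {
        "operand": word & 0x7F,
        "dest_reg": (word >> 7) & 0x1F,
        "ext": (word >> 12) & 0x7,
        "base_reg": (word >> 15) & 0x1F,
        "exponent_reg": (word >> 20) & 0x1F,
        "instr_bits": (word >> 25) & 0x7F,
    }


word = encode(operand=0b0101011, dest_reg=5, ext=0b000,
              base_reg=6, exponent_reg=7, instr_bits=0)
print(decode(word))
```

Decoding the encoded word returns the same six values, which is a quick way to confirm that the field widths (7 + 5 + 3 + 5 + 5 + 7) cover all 32 bits without overlap.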
The operand field is used to indicate information of a logical operation. In this embodiment, the operand field is used to indicate information of the exponentiation. In addition, the operand field may be used to indicate information about the length of the operation instruction.
As an example, the operand field has a value of "0101011". Specifically, for a 32-bit RISC-V instruction, the bits of the operand field are arranged in the order "ccbbbaa". The value of "aa" (bits 0 and 1) is fixed at "11". The value of "bbb" (bits 2 to 4) cannot be "111"; in this embodiment it is set to "010". Because this embodiment adopts the custom-1 encoding, the value of "cc" (bits 5 and 6) is found to be "01" by referring to the table in fig. 4.
In this embodiment, the logical operation extension field is used to indicate information of an extension operation of the exponentiation operation. Specifically, when the value of the logical operation extension field is a first preset value, the operation instruction is an exponentiation operation instruction; when the value of the logical operation extension field is another preset value, the operation instruction is an extension operation instruction of the exponentiation operation, such as exponentiation addition, exponentiation subtraction, or exponentiation division.
As an example, the value of the logical operation extension field is "0b000". Specifically, for an exponentiation instruction that indicates only an exponentiation, the logical operation extension field in bits 12 to 14 is set to "0b000" in this embodiment. When the operation instruction is an exponentiation extension operation instruction, the logical operation extension field in bits 12 to 14 may be set to a value other than "0b000", such as a value selected from "0b001" to "0b111". Thus, the operand field and the logical operation extension field together serve as the logical operator field, which indicates the exponentiation operation.
As an example, the value of the instruction bit field is "0b0001000". Specifically, in existing 32-bit RISC-V instructions, the funct7 field occupying bits 25 to 31 takes only two values, "0b0000000" and "0b0100000", so many values remain available. In this embodiment, the instruction bit field in bits 25 to 31 is set to "0b0001000" so that the instruction is distinguished from the other existing 32-bit RISC-V instructions.
For the exponentiation instruction, the destination register field (bits 7 to 11), the base register field (bits 15 to 19) and the exponent register field (bits 20 to 24) are filled in with the address of the register to which the operation result is written, the address of the register storing the base, and the address of the register storing the exponent, respectively.
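The field layout described above can be illustrated with a minimal Python sketch that packs and unpacks a 32-bit instruction word. The function names `encode_mpower` and `decode_fields` and the dictionary keys are hypothetical, and the sketch assumes that bit 0 is the least significant bit of the word, as in the standard RISC-V encoding; the field values are those given in this embodiment.

```python
# Field values taken from the embodiment described above.
OPERAND_CUSTOM_1 = 0b0101011   # bits 0-6, "ccbbbaa" = 01 010 11
EXTENSION_MPOWER = 0b000       # bits 12-14, logical operation extension field
INSTRUCTION_BITS = 0b0001000   # bits 25-31, instruction bit field


def encode_mpower(rd: int, rs1: int, rs2: int) -> int:
    """Pack three register numbers (0-31) into one 32-bit instruction word.

    Assumption: bit 0 is the least significant bit of the word.
    """
    for reg in (rd, rs1, rs2):
        if not 0 <= reg < 32:
            raise ValueError("register numbers must fit in 5 bits")
    return (
        OPERAND_CUSTOM_1
        | (rd << 7)                 # destination register field
        | (EXTENSION_MPOWER << 12)  # logical operation extension field
        | (rs1 << 15)               # base register field
        | (rs2 << 20)               # exponent register field
        | (INSTRUCTION_BITS << 25)  # instruction bit field
    )


def decode_fields(word: int) -> dict:
    """Split a 32-bit instruction word back into its six fields."""
    return {
        "operand": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "extension": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "instruction_bits": (word >> 25) & 0x7F,
    }
```

For instance, `decode_fields(encode_mpower(7, 5, 6))` returns the same register numbers together with the three preset field values.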
The above illustrates only one kind of exponentiation instruction in an embodiment of the present invention. Those skilled in the art will appreciate that the exponentiation instruction may take other instruction formats, without limitation, as long as it can be recognized by the main processor and recognized and executed by the coprocessor.
The length of the operation instruction may also be set longer or shorter, such as 16 bits, 48 bits or 64 bits, and those skilled in the art may choose the length as needed, without limitation.
With continued reference to fig. 2, step S203 is performed, where the main processor sends the operation instruction to the coprocessor when determining that the operation instruction is a coprocessor operation instruction.
In a specific implementation, when the operation instruction is determined to be a coprocessor operation instruction, the main processor sends the operation instruction to the coprocessor so that the coprocessor acquires and executes the operation instruction.
In this embodiment, the main processor is coupled to the coprocessor through a coprocessor interface; accordingly, the main processor sends the operation instruction to the coprocessor through the coprocessor interface.
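The decision made by the main processor in steps S202 to S206 can be sketched in Python as follows. This is only an illustration: the callables `run_on_main` and `send_to_coprocessor` are hypothetical stand-ins for the main processor pipeline and the coprocessor interface, and checking the operand field together with the instruction bit field is an assumption about how the instruction type might be recognized.

```python
from typing import Callable

OPERAND_CUSTOM_1 = 0b0101011   # bits 0-6 in this embodiment
INSTRUCTION_BITS = 0b0001000   # bits 25-31 in this embodiment


def is_coprocessor_instruction(word: int) -> bool:
    """Return True when the word carries the preset coprocessor field values."""
    operand = word & 0x7F
    instruction_bits = (word >> 25) & 0x7F
    return operand == OPERAND_CUSTOM_1 and instruction_bits == INSTRUCTION_BITS


def dispatch(
    word: int,
    run_on_main: Callable[[int], object],
    send_to_coprocessor: Callable[[int], object],
) -> object:
    """Route one instruction word and return its execution result."""
    if is_coprocessor_instruction(word):
        # Steps S203 to S205: forward the instruction and collect the result.
        return send_to_coprocessor(word)
    # Step S206: the main processor executes the instruction itself.
    return run_on_main(word)
```

In this sketch the result returned by `dispatch` is what the main processor would then write back to the destination register.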
With continued reference to fig. 2, in step S204, the coprocessor acquires the operation instruction and executes the operation instruction.
In a specific implementation, when the coprocessor acquires the coprocessor operation instruction sent by the main processor, the coprocessor can analyze and operate on the acquired coprocessor operation instruction.
FIG. 5 is a schematic diagram showing the structure of an execution unit in a coprocessor according to an embodiment of the present invention. Referring to fig. 5, the execution unit includes a decoding module 501, an exponentiation logic operation module 502, and a register read/write module 503. The decoding module 501 is coupled to the exponentiation logic operation module 502 and the register read/write module 503, respectively, and the exponentiation logic operation module 502 is also coupled to the register read/write module 503.
When the coprocessor acquires the exponentiation instruction, the decoding module 501 analyzes the exponentiation instruction, acquires information of a logic operator field, an instruction bit field, an exponent register field, a base register field and a destination register field in the exponentiation instruction, sends the information of the logic operator field and the instruction bit field acquired by analysis to the exponentiation logic operation module 502, and sends the information of the exponent register field, the base register field and the destination register field acquired by analysis to the register read-write module 503.
Subsequently, the register read/write module 503 reads the exponent and the base of the exponentiation from the corresponding registers according to the information of the exponent register field and the base register field, obtains the destination register to which the execution result of the exponentiation is to be written according to the information of the destination register field, and sends this information to the exponentiation logic operation module 502.
Then, the exponentiation logic operation module 502 completes the corresponding exponentiation logic operation according to the obtained numerical value of the exponent register, the numerical value of the base register, and the information of the logic operator and the instruction bit.
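The cooperation of the three modules can be modelled roughly as below. This is a simplified sketch, not the circuit itself: the function name `execute_mpower`, the register file modelled as a Python list, and the wrap-around of results to 32 bits are all assumptions made for illustration, and the checks on the logical operator and instruction bit fields are omitted.

```python
from typing import List, Tuple

RESULT_MODULUS = 1 << 32  # assumption: results wrap around to 32 bits


def execute_mpower(word: int, regs: List[int]) -> Tuple[int, int]:
    """Return (destination register number, result) for one instruction word."""
    # Decoding module 501: extract the three register fields.
    rd = (word >> 7) & 0x1F
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F

    # Register read/write module 503: fetch the base and the exponent.
    base = regs[rs1]
    exponent = regs[rs2]

    # Exponentiation logic operation module 502: compute base ** exponent.
    result = pow(base, exponent, RESULT_MODULUS)
    return rd, result
```

The returned pair corresponds to the information the coprocessor would hand back so that the result can be written to the destination register.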
The execution of the exponentiation logic operation module 502 is explained below, taking the exponentiation y = x^n with the exponent n = 10 as an example.
Fig. 6 shows a schematic diagram of the exponentiation logic operation unit in an embodiment of the present invention. Referring to fig. 6, the exponentiation logic operation unit includes a preprocessing center 601, a first selector 602, a second selector 603, a third selector 604, a fourth selector 605, a first multiplier 606, a second multiplier 607, and a third multiplier 608.
Wherein the preprocessing center 601 is coupled to the first selector 602, the second selector 603, the third selector 604 and the fourth selector 605, respectively, the first selector 602 and the second selector 603 are further coupled to the first multiplier 606, respectively, the third selector 604 and the fourth selector 605 are further coupled to the second multiplier 607, respectively, and the first multiplier 606 and the second multiplier 607 are further coupled to the third multiplier 608, respectively.
When the exponentiation instruction is executed, the information of the received instruction bit is sent to the first selector 602, the second selector 603, the third selector 604, and the fourth selector 605. In this embodiment, if the instruction bit of the exponentiation instruction is 1, the information of the instruction bit received by the first selector 602, the second selector 603, the third selector 604, and the fourth selector 605 is 1.
When receiving the base x of the exponentiation, the preprocessing center 601 first calculates x^1, x^2, x^4 and x^8 and sends them to the first selector 602, the second selector 603, the third selector 604 and the fourth selector 605, respectively.
When the exponent n = 10, the binary representation of the exponent is 1010, i.e., n[0] is 0, n[1] is 1, n[2] is 0 and n[3] is 1; these bits are sent to the first selector 602, the second selector 603, the third selector 604, and the fourth selector 605, respectively.
Next, the first selector 602 determines, based on the received value of n[0], whether its output value is the 1 of the instruction bit or the x^1 sent by the preprocessing center 601. Specifically, when the received value of n[0] is 0, the first selector 602 outputs the information of the instruction bit; when the received value of n[0] is 1, the first selector 602 outputs the x^1 sent by the preprocessing center 601. Thus, when the exponent n = 10, the value of n[0] received by the first selector 602 is 0, and the first selector 602 accordingly outputs the information of the instruction bit, i.e., 1, and sends it to the first multiplier 606.
Similarly, the second selector 603 determines, based on the received value of n[1], whether its output value is the 1 of the instruction bit or the x^2 sent by the preprocessing center 601. Specifically, when the received value of n[1] is 0, the second selector 603 outputs the information of the instruction bit; when the received value of n[1] is 1, the second selector 603 outputs the x^2 sent by the preprocessing center 601. When the exponent n = 10, the value of n[1] received by the second selector 603 is 1, and the second selector 603 accordingly outputs the x^2 sent by the preprocessing center 601 and sends it to the first multiplier 606.
The third selector 604 determines, based on the received value of n[2], whether its output value is the 1 of the instruction bit or the x^4 sent by the preprocessing center 601. Specifically, when the received value of n[2] is 0, the third selector 604 outputs the information of the instruction bit; when the received value of n[2] is 1, the third selector 604 outputs the x^4 sent by the preprocessing center 601. When the exponent n = 10, the value of n[2] received by the third selector 604 is 0, and the third selector 604 accordingly outputs the information of the instruction bit, i.e., 1, and sends it to the second multiplier 607.
The fourth selector 605 determines, based on the received value of n[3], whether its output value is the 1 of the instruction bit or the x^8 sent by the preprocessing center 601. Specifically, when the received value of n[3] is 0, the fourth selector 605 outputs the information of the instruction bit; when the received value of n[3] is 1, the fourth selector 605 outputs the x^8 sent by the preprocessing center 601. When the exponent n = 10, the value of n[3] received by the fourth selector 605 is 1, and the fourth selector 605 accordingly outputs the x^8 sent by the preprocessing center 601 and sends it to the second multiplier 607.
Then, the first multiplier 606 multiplies the instruction bit 1 sent by the first selector 602 by the x^2 sent by the second selector 603 and sends the result to the third multiplier 608; the second multiplier 607 multiplies the instruction bit 1 sent by the third selector 604 by the x^8 sent by the fourth selector 605 and sends the result to the third multiplier 608.
Finally, the third multiplier 608 multiplies the x^2 sent by the first multiplier 606 by the x^8 sent by the second multiplier 607 and sends the result, x^10, to the register read/write module.
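The selector and multiplier arrangement walked through above can be mirrored in a short Python sketch. The function name `power_unit_4bit` is hypothetical, and the sketch models only the data flow for a 4-bit exponent, not the timing or width of the hardware.

```python
def power_unit_4bit(x: int, n: int) -> int:
    """Model the four-selector, three-multiplier data flow for 0 <= n <= 15."""
    if not 0 <= n < 16:
        raise ValueError("this sketch only handles a 4-bit exponent")

    # Preprocessing center 601: x^1, x^2, x^4 and x^8.
    powers = [x, x ** 2, x ** 4, x ** 8]
    instruction_bit = 1  # constant passed through when an exponent bit is 0

    # Selectors 602 to 605: pick the precomputed power when n[i] is 1,
    # otherwise pass the instruction bit (1).
    selected = [
        powers[i] if (n >> i) & 1 else instruction_bit
        for i in range(4)
    ]

    first = selected[0] * selected[1]    # first multiplier 606
    second = selected[2] * selected[3]   # second multiplier 607
    return first * second                # third multiplier 608
```

With n = 10 the selectors output 1, x^2, 1 and x^8, so the product is x^10, matching the walk-through.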
The above describes the execution of the exponentiation logic operation unit in the embodiment of the present invention, taking the exponent n = 10 of the exponentiation as an example. It will be appreciated by those skilled in the art that the value of n may be larger or smaller depending on actual needs, without limitation.
When the value of the exponent n of the exponentiation varies, the structure of the exponentiation logic and the operations performed will also vary accordingly, which is not limited herein.
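One conceivable way to scale the structure to wider exponents is sketched below in Python. This generalization is an assumption and is not described in the embodiment: it uses one selector per exponent bit, obtains each power by repeated squaring, and reduces the selector outputs with a pairwise multiplier tree. The function name `power_unit` and the `width` parameter are hypothetical.

```python
def power_unit(x: int, n: int, width: int = 8) -> int:
    """Compute x ** n for an exponent of `width` bits using selectors and a multiplier tree."""
    if width < 1 or not 0 <= n < (1 << width):
        raise ValueError("exponent does not fit in the given width")

    # Preprocessing: x^(2^i) for each bit position, by repeated squaring.
    powers = []
    current = x
    for _ in range(width):
        powers.append(current)
        current *= current

    # One selector per exponent bit: the power when the bit is 1, else 1.
    level = [powers[i] if (n >> i) & 1 else 1 for i in range(width)]

    # Pairwise multiplier tree; pad with 1 when a level has an odd size.
    while len(level) > 1:
        if len(level) % 2:
            level.append(1)
        level = [level[i] * level[i + 1] for i in range(0, len(level), 2)]
    return level[0]
```

With `width=4` this reduces to the four-selector, three-multiplier arrangement of fig. 6.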
When the execution of the exponentiation instruction is finished and the corresponding execution result is obtained, the coprocessor sends the execution result of the exponentiation logic operation to the main processor, so that the main processor obtains the operation result of the exponentiation instruction.
With continued reference to fig. 2, step S205 is performed, where the main processor obtains the execution result of the operation instruction sent by the coprocessor.
When the main processor acquires the execution result of the exponentiation instruction sent by the coprocessor, it writes the execution result of the exponentiation instruction into the corresponding destination register.
Specifically, when the main processor receives the execution result of the exponentiation instruction, the exponentiation instruction is exited from the pipeline, and the result is written back to the corresponding destination register.
In the method for processing an operation instruction in this embodiment, no matter how the value of the exponent n changes, the main processor only needs to execute a small number of instructions to complete the exponentiation calculation y = x^n.
Specifically, according to the exponentiation instruction format shown in FIG. 3, the corresponding instruction code may be represented as:
mpower rd,rs1,rs2;
wherein, mpower represents exponentiation, rd represents a destination register, and rs1 and rs2 represent a base number register and an exponent register respectively.
The result achieved by the instruction code described above is that the number in the base register rs1 is raised to the power of the number in the exponent register rs2, and the result is stored in the destination register rd, that is, x[rd] = x[rs1]^x[rs2].
Accordingly, the assembly code for the exponentiation y = x^n is as follows:
[Figure BDA0003370001470000161: assembly listing, reproduced as an image in the original publication]
The specific implementation steps of the instruction code are as follows:
(1) Storing the immediate x into a register t0;
(2) Storing the immediate n into a register t1;
(3) The numerical value in the register t0 is raised to the power of the numerical value in the register t1, and the result is written into the register t2;
(4) Return.
It can be seen that, no matter how large the value of the exponent n is, the main processor only needs to execute 4 instructions to complete the exponentiation calculation y = x^n. Thus, compared with existing methods of processing exponentiation operations, the operation instruction processing method in the embodiment of the invention, by having the coprocessor execute the exponentiation operation instruction, can significantly reduce the number of instruction fetches of the main processor and improve the execution efficiency of the main processor.
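The four steps listed above can be mimicked with a toy interpreter in Python. Since the original assembly listing is available only as an image, the mnemonics `li` and `ret` used here are assumptions chosen to match the described steps, and `run_program` is a hypothetical helper; only `mpower` and the registers t0, t1 and t2 come from the description.

```python
from typing import Tuple


def run_program(x: int, n: int) -> Tuple[int, int]:
    """Run the four-step sequence and return (value of t2, instructions fetched)."""
    regs = {"t0": 0, "t1": 0, "t2": 0}
    program = [
        ("li", "t0", x),                # (1) store the immediate x into t0
        ("li", "t1", n),                # (2) store the immediate n into t1
        ("mpower", "t2", "t0", "t1"),   # (3) t2 = t0 raised to the power t1
        ("ret",),                       # (4) return
    ]

    fetched = 0
    for instr in program:
        fetched += 1
        op = instr[0]
        if op == "li":
            regs[instr[1]] = instr[2]
        elif op == "mpower":
            # In the embodiment this step is carried out by the coprocessor.
            regs[instr[1]] = regs[instr[2]] ** regs[instr[3]]
        elif op == "ret":
            break
    return regs["t2"], fetched
```

The fetch count returned by the sketch stays at 4 whatever exponent is supplied, which is the property the paragraph above relies on.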
With continued reference to fig. 2, step S206 is performed, where the main processor executes the operation instruction to obtain an execution result of the operation instruction.
When the operation instruction is determined not to be the coprocessor operation instruction, the main processor processes the operation instruction, and writes the execution result of the operation instruction into a corresponding destination register when the execution of the operation instruction is finished.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that the specific examples described are illustrative only and are not intended to limit the scope of the invention. Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be made by one skilled in the art without departing from the spirit and scope of the invention, and the scope of the invention should be assessed accordingly to that of the appended claims.

Claims (10)

1. A method of processing an arithmetic instruction, comprising:
acquiring an operation instruction to be processed;
judging whether the operation instruction is a coprocessor operation instruction or not;
when the operation instruction is determined to be a coprocessor operation instruction, sending the operation instruction to a coprocessor so that the coprocessor executes the operation instruction;
and acquiring an execution result of the operation instruction sent by the coprocessor.
2. The arithmetic instruction processing method according to claim 1, wherein the coprocessor arithmetic instruction is an exponentiation arithmetic instruction.
3. The arithmetic instruction processing method according to claim 2, wherein the exponentiation instruction includes a logical operator field; wherein the logical operator field is used for indicating information of exponentiation logical operation;
the determining whether the operation instruction is an exponentiation operation instruction includes:
analyzing the operation instruction to acquire information of a corresponding logical operator field;
and when the logic operator field obtained by analysis is determined to be a preset numerical value, determining the operation instruction to be an exponentiation operation instruction.
4. The arithmetic instruction processing method according to claim 1, wherein when it is determined that the arithmetic instruction is not a coprocessor arithmetic instruction, further comprising:
executing the operation instruction and obtaining an execution result of the operation instruction.
5. A method of processing an arithmetic instruction, comprising:
acquiring a coprocessor operation instruction sent by a main processor;
executing the coprocessor operation instruction to obtain a corresponding execution result;
and sending the execution result of the coprocessor operation instruction to the main processor.
6. The method according to claim 5, wherein the coprocessor operation instruction is an exponentiation operation instruction.
7. The method of claim 6, wherein the exponentiation instruction includes an instruction bit field, a logical operator field, a base register field, an exponent register field, and a destination register field; the instruction bit field is used for indicating information of an instruction type of the operation instruction, the logic operator field is used for indicating information of an exponentiation logic operation, the base register field is used for indicating information of a base register address storing the exponentiation operation, the exponent register field is used for indicating information of an exponent register address storing the exponentiation operation, and the destination register field is used for indicating information of an address storing a result of the exponentiation operation;
the step of executing the arithmetic instruction includes:
analyzing the operation instruction to obtain information of a corresponding instruction bit field, a logical operator field, a base register field, an exponent register field and a destination register field;
respectively acquiring address information of a base register and an exponent register from the base register field and the exponent register field, and respectively reading corresponding information of the base and the exponent from the base register address and the exponent register address;
and executing the corresponding exponentiation operation according to the acquired information of the instruction bit field and the logical operator field and the read information of the base and the exponent, and acquiring a corresponding execution result.
8. A main processor, comprising:
the first acquisition unit is suitable for acquiring an operation instruction to be processed;
the judging unit is suitable for judging whether the operation instruction is a coprocessor operation instruction or not;
a sending unit adapted to send the operation instruction to a coprocessor when the operation instruction is determined to be a coprocessor operation instruction, so that the coprocessor executes the operation instruction;
and the second acquisition unit is suitable for acquiring an execution result of the operation instruction sent by the coprocessor.
9. A coprocessor, comprising:
the third acquisition unit is suitable for acquiring a coprocessor operation instruction sent by the main processor;
the execution unit is suitable for executing the coprocessor operation instruction and acquiring a corresponding execution result;
and the second sending unit is suitable for sending the execution result of the coprocessor operation instruction to the main processor.
10. An arithmetic instruction processing system, comprising:
the main processor of claim 8;
the coprocessor of claim 9.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111396203.3A CN116149603A (en) 2021-11-23 2021-11-23 Operation instruction processing method and system, main processor and coprocessor
PCT/CN2022/111463 WO2023093128A1 (en) 2021-11-23 2022-08-10 Operation instruction processing method and system, main processor, and coprocessor


Publications (1)

Publication Number Publication Date
CN116149603A true CN116149603A (en) 2023-05-23

Family

ID=86353142


Country Status (2)

Country Link
CN (1) CN116149603A (en)
WO (1) WO2023093128A1 (en)


Also Published As

Publication number Publication date
WO2023093128A1 (en) 2023-06-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination