WO2023142524A1

WO2023142524A1 - Instruction processing method and apparatus, chip, electronic device, and storage medium

Info

Publication number: WO2023142524A1
Application number: PCT/CN2022/124520
Authority: WO
Inventors: 王文强; 霍冠廷; 孙海涛; 夏晓旭; 徐宁仪
Original assignee: 上海商汤智能科技有限公司
Priority date: 2022-01-30
Filing date: 2022-10-11
Publication date: 2023-08-03
Also published as: CN114443143A

Abstract

Provided in the present disclosure are an instruction processing method and apparatus, a chip, an electronic device, and a storage medium. The instruction processing method comprises: acquiring a first computation instruction to be processed; determining an addressing mode of first address information of first access data corresponding to the first computation instruction, wherein the first access data comprises an operand which is used for executing the first computation instruction and/or a computation result of the operand; when the addressing mode is an address accumulation mode, acquiring accumulated step size information on the basis of the first computation instruction, and acquiring second address information of second access data corresponding to a second computation instruction, wherein the second computation instruction is the instruction preceding the first computation instruction; and on the basis of the second address information and the accumulated step size information, determining the first address information, and on the basis of the first address information, executing the first computation instruction.

Description

Instruction processing method, device, chip, electronic device and storage medium

Cross-reference to related application

This application claims priority to Chinese patent application No. 202210114536.0, filed on January 30, 2022, the entire contents of which are incorporated herein by reference.

Technical field

The present disclosure relates to the field of electronic technology, and in particular, to an instruction processing method, device, chip, electronic equipment, and computer-readable storage medium.

Background

A computer works by executing computer instructions arranged in a certain order. Computer instructions are the commands that direct and coordinate the work of the computer's components. Executing an instruction mainly involves accessing storage space and computing on data. Storage space can be accessed either by direct addressing with an immediate value or by indirect addressing using an operation result. Indirect addressing requires the address value to be calculated at run time, so it introduces extra instruction cycles and reduces the execution efficiency of instructions.

Summary

Embodiments of the present disclosure at least provide an instruction processing method, device, chip, electronic device, and computer-readable storage medium.

In a first aspect, an embodiment of the present disclosure provides an instruction processing device, including: an instruction processing unit configured to obtain a first calculation instruction to be processed; and an instruction execution unit configured to obtain accumulation step information based on the first calculation instruction, and obtain second address information of second access data corresponding to a second calculation instruction, wherein the second calculation instruction is the previous instruction of the first calculation instruction; and, based on the second address information and the accumulation step information, determine first address information of first access data corresponding to the first calculation instruction, and execute the first calculation instruction based on the first address information, wherein the first access data includes an operand used to execute the first calculation instruction and/or a calculation result of the operand.

In the embodiment of the present disclosure, after the instruction processing unit acquires the first calculation instruction, the instruction execution unit may acquire the accumulation step information based on the first calculation instruction, acquire the second address information of the second access data corresponding to the second calculation instruction, determine the first address information based on the second address information and the accumulation step information, and execute the first calculation instruction based on the first address information. In this way, the address calculation is hidden in the instruction processing process of the calculation instruction, which reduces extra instruction cycles and improves instruction execution efficiency.

In an optional implementation manner, the instruction processing unit is configured to, after acquiring the first calculation instruction to be processed, determine the addressing mode of the first address information; the instruction execution unit is configured to, when the addressing mode is an address accumulation mode, obtain the accumulation step information based on the first calculation instruction, and obtain the second address information of the second access data corresponding to the second calculation instruction; and, based on the second address information and the accumulation step information, determine the first address information, and execute the first calculation instruction based on the first address information.

In the above embodiment, after the first calculation instruction is acquired, the first address information of the first access data of the first calculation instruction can be determined through the address accumulation mode, and the process of determining the first address information can be hidden in the instruction processing process of the first calculation instruction, which reduces extra instruction cycles and improves instruction execution efficiency.

In an optional implementation manner, the instruction execution unit includes: an accumulation register configured to store the second address information; and a first calculation unit configured to, after obtaining the accumulation step information, accumulate the second address information and the accumulation step information to obtain the first address information.

In an optional implementation manner, the first address information includes a first access address and a first storage address; the instruction execution unit further includes a second calculation unit configured to: obtain the first access address and the first storage address sent by the first calculation unit; obtain the operand stored in the storage location corresponding to the first access address; execute the first calculation instruction with the obtained operand as the first access data to obtain a first calculation result of the first calculation instruction; and store the first calculation result in the storage location corresponding to the first storage address.
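The data path described in the two preceding implementations can be sketched as follows. This is a minimal Python model under stated assumptions, not the disclosed hardware: the class name ExecutionUnit, the flat list standing in for the storage device, and the use of separate step sizes for the access address and the storage address are all hypothetical choices made for illustration.

```python
class ExecutionUnit:
    """Toy model of the accumulation register and the two calculation units."""

    def __init__(self, memory, access_addr, store_addr):
        self.memory = memory
        # Accumulation register: holds the addresses used by the previous
        # instruction (the "second address information").
        self.acc_reg = {"access": access_addr, "store": store_addr}

    def next_addresses(self, access_step, store_step):
        # First calculation unit: previous address + accumulation step.
        self.acc_reg["access"] += access_step
        self.acc_reg["store"] += store_step
        return self.acc_reg["access"], self.acc_reg["store"]

    def execute(self, op, access_step, store_step):
        # Second calculation unit: fetch the operand, compute, store the result.
        access_addr, store_addr = self.next_addresses(access_step, store_step)
        operand = self.memory[access_addr]
        self.memory[store_addr] = op(operand)
        return access_addr, store_addr


memory = [10, 20, 30, 0, 0, 0]
unit = ExecutionUnit(memory, access_addr=0, store_addr=3)
addrs = unit.execute(lambda v: v * 2, access_step=1, store_step=1)
```

In this sketch the address update happens inside execute, so no separate address-update instruction is modeled; after the call, the accumulation register holds the addresses that the next instruction would start from.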

In an optional implementation manner, the instruction processing unit is further configured to: if the addressing mode is a direct addressing mode, obtain the field content of the address field in the first calculation instruction; determine the first address information of the first access data based on the obtained field content; and send the first address information to the instruction execution unit, so that the instruction execution unit executes the first calculation instruction based on the first address information.

In an optional implementation manner, the instruction processing unit includes an instruction decoding unit configured to: determine a first enable flag in the first calculation instruction, where the first enable flag is used to indicate whether the address accumulation mode for the first access data is enabled; and determine, based on the first enable flag, the addressing mode of the first address information of the first access data corresponding to the first calculation instruction.

In an optional implementation manner, the first enable flag includes a plurality of first sub-enable flags, each of which corresponds to one piece of data in the first access data; the instruction decoding unit is configured to: determine, among the plurality of first sub-enable flags, the first sub-enable flag that matches each piece of data in the first access data; and determine, based on the first sub-enable flag that matches each piece of data, the addressing mode of the first address information of that piece of data in the first access data corresponding to the first calculation instruction.
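As an illustration of per-datum sub-enable flags, the following Python sketch decodes one bit per piece of access data. The bit positions, the datum names (operand_a, operand_b, result), and the function name are assumptions made for this example; the disclosure does not fix a particular encoding.

```python
DIRECT = "direct"
ACCUMULATE = "accumulate"

# Hypothetical layout: one sub-enable bit per piece of access data.
SUB_FLAG_BITS = {"operand_a": 0, "operand_b": 1, "result": 2}


def decode_sub_flags(enable_field):
    """Map each piece of access data to the mode selected by its own bit."""
    modes = {}
    for name, bit in SUB_FLAG_BITS.items():
        enabled = (enable_field >> bit) & 1
        modes[name] = ACCUMULATE if enabled else DIRECT
    return modes


# Both operands use address accumulation; the result uses direct addressing.
modes = decode_sub_flags(0b011)
```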

In an optional implementation manner, the instruction decoding unit is configured to: after determining that the first enable flag indicates that the address accumulation mode is enabled, detect a second enable flag in the second calculation instruction, wherein the second enable flag is used to indicate whether the address accumulation mode indicated by the first enable flag in the first calculation instruction is enabled; and determine, based on the second enable flag, the first address information of the first access data corresponding to the first calculation instruction.

In an optional implementation manner, the instruction decoding unit is configured to: when determining that the second enable flag indicates that the address accumulation mode is disabled, determine that the first address information of the first access data is the second address information of the second access data corresponding to the second calculation instruction.
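The interaction between the two enable flags can be sketched as below. This is a hedged reading rather than a definitive implementation: the function name and the None return value are hypothetical, and the branch in which the second enable flag is enabled (previous address plus step) is an assumption, since the text above only states the outcome for the disabled case.

```python
def resolve_with_second_flag(prev_addr, step, first_flag, second_flag):
    """Return the address for the current (first) calculation instruction.

    first_flag:  accumulation enable carried by the current instruction.
    second_flag: enable carried by the previous (second) instruction.
    Returns None when address accumulation does not apply at all.
    """
    if not first_flag:
        return None  # handled by another addressing mode, e.g. direct
    if not second_flag:
        return prev_addr  # accumulation suppressed: reuse the previous address
    return prev_addr + step  # assumed behavior when both flags are enabled
```

For example, with a previous address of 8 and a step of 4, a disabled second flag yields 8 again, which is convenient when consecutive instructions need to read the same location.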

In an optional implementation manner, the second enable flag includes a plurality of second sub-enable flags, each of which corresponds to one piece of data in the first access data; the instruction decoding unit is configured to: determine, among the plurality of second sub-enable flags, the second sub-enable flag that matches each piece of data in the first access data; and determine, based on the second sub-enable flag that matches each piece of data, the first address information of that piece of data in the first access data.

In an optional implementation manner, the instruction content of the first calculation instruction includes at least one continuous first enable flag and/or at least one continuous second enable flag, wherein each first enable flag includes a first identification field and/or first field content, and each second enable flag includes a second identification field and/or second field content. The first identification field is used to indicate the addressing mode of the first calculation instruction, and the first field content is used to indicate the accumulation step information or the first address information. The second identification field is used to indicate the addressing mode of the next calculation instruction after the first calculation instruction, and the second field content is used to indicate the accumulation step information or the first address information corresponding to the execution of that next calculation instruction.

In an optional implementation manner, the instruction processing device further includes: a register file, configured to store at least one of the accumulation step information, the first memory access data, and the first address information.

In an optional implementation manner, the instruction execution unit includes: an instruction decoding unit configured to decode the instruction content of the first calculation instruction to obtain a decoding result; and an instruction issuing unit configured to send, based on the decoding result, an instruction for acquiring the accumulation step information to the register file; wherein the instruction execution unit is configured to determine the first address information of the first access data based on the second address information and the accumulation step information.

In an optional implementation manner, the register file includes a vector register file and a scalar register file, wherein the vector register file is used to store the first access data of the first calculation instruction, and the scalar register file is used to store the accumulation step information and/or the first address information.

In a second aspect, an embodiment of the present disclosure provides an instruction processing method, including: obtaining a first calculation instruction to be processed; determining an addressing mode of first address information of first access data corresponding to the first calculation instruction, wherein the first access data includes an operand used to execute the first calculation instruction and/or a calculation result of the operand; when the addressing mode is an address accumulation mode, obtaining accumulation step information based on the first calculation instruction, and obtaining second address information of second access data corresponding to a second calculation instruction, wherein the second calculation instruction is the previous instruction of the first calculation instruction; and determining the first address information based on the second address information and the accumulation step information, and executing the first calculation instruction based on the first address information.

In a third aspect, an embodiment of the present disclosure provides an instruction processing device, including: a first acquiring unit configured to acquire a first calculation instruction to be processed; a determining unit configured to determine an addressing mode of first address information of first access data corresponding to the first calculation instruction, wherein the first access data includes an operand used to execute the first calculation instruction and/or a calculation result of the operand; a second acquiring unit configured to, when the addressing mode is an address accumulation mode, obtain accumulation step information based on the first calculation instruction, and obtain second address information of second access data corresponding to a second calculation instruction, wherein the second calculation instruction is the previous instruction of the first calculation instruction; and an instruction execution unit configured to determine the first address information based on the second address information and the accumulation step information, and execute the first calculation instruction based on the first address information.

In a fourth aspect, an embodiment of the present disclosure further provides a chip, including the instruction processing device described in any one of the above first aspects.

In a fifth aspect, an embodiment of the present disclosure further provides an electronic device, including a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the instruction processing method described in any one of the above second aspects are implemented.

In a sixth aspect, an embodiment of the present disclosure further provides an electronic device, including the chip described in the fourth aspect.

In a seventh aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the instruction processing method described in any one of the above second aspects are executed.

In order to make the above-mentioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments will be described in detail below together with the accompanying drawings.

Brief description of the drawings

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings required in the embodiments are briefly introduced below. These drawings show embodiments consistent with the present disclosure and are used together with the description to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting its scope. Those skilled in the art can also obtain other related drawings from these drawings.

FIG. 1 shows a flowchart of an instruction processing method provided by an embodiment of the present disclosure;

FIG. 2 shows a flowchart of a specific method for determining the addressing mode of the first address information of the first access data corresponding to the first calculation instruction in the instruction processing method provided by an embodiment of the present disclosure;

FIG. 3 shows a flowchart of another instruction processing method provided by an embodiment of the present disclosure;

FIG. 4 shows a schematic structural diagram of a first instruction processing device provided by an embodiment of the present disclosure;

FIG. 5 shows a schematic structural diagram of a second instruction processing device provided by an embodiment of the present disclosure;

FIG. 6 shows a schematic structural diagram of a third instruction processing device provided by an embodiment of the present disclosure;

FIG. 7 shows a schematic diagram of an instruction processing flow of a first instruction processing device provided by an embodiment of the present disclosure;

FIG. 8 shows a schematic diagram of an instruction processing flow of a second instruction processing apparatus provided by an embodiment of the present disclosure;

FIG. 9 shows a schematic diagram of an instruction processing device provided by an embodiment of the present disclosure;

FIG. 10 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.

Detailed description

In order to make the purpose, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below in conjunction with the drawings in the embodiments of the present disclosure. The described embodiments are only some of the embodiments of the present disclosure, not all of them. The components of the disclosed embodiments generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present disclosure.

It should be noted that like numerals and letters denote similar items in the following figures, therefore, once an item is defined in one figure, it does not require further definition and explanation in subsequent figures.

The term "and/or" herein only describes an association relationship and indicates that three kinds of relationships are possible. For example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of them. For example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set formed by A, B, and C.

The execution of computer instructions mainly involves accessing storage space and computing on data. Storage space can be accessed either by direct addressing with an immediate value or by indirect addressing using an operation result. Indirect addressing requires the address value to be calculated at run time, so it introduces extra instruction cycles and reduces the execution efficiency of instructions.

Based on the above research, the present disclosure provides an instruction processing method. In the embodiment of the present disclosure, after the first calculation instruction is acquired, the addressing mode of the first address information of the first access data corresponding to the first calculation instruction may be determined. When the addressing mode is the address accumulation mode, the accumulation step information can be obtained based on the first calculation instruction, and the second address information of the second access data corresponding to the second calculation instruction can be obtained; the first address information is then determined based on the second address information and the accumulation step information, and the first calculation instruction is executed based on the first address information. Because the first address information is determined through the address accumulation mode, the process of determining it can be hidden in the instruction processing process of the first calculation instruction, which reduces extra instruction cycles and improves instruction execution efficiency.

In order to facilitate the understanding of this embodiment, a method for processing instructions disclosed in this embodiment of the present disclosure is first introduced in detail. The execution subject of the instruction processing method provided by the embodiments of the present disclosure is generally an electronic device with a certain computing capability.

Referring to FIG. 1, which is a flowchart of an instruction processing method provided by an embodiment of the present disclosure, the method includes steps S101 to S107.

S101: Acquire a first computing instruction to be processed.

In the embodiment of the present disclosure, the first calculation instruction to be processed may be obtained from the instruction register. Here, the first calculation instruction may be any type of calculation instruction, for example, a multiply-accumulate instruction macc, which is not specifically limited in the present disclosure.

S103: Determine the addressing mode of the first address information of the first access data corresponding to the first calculation instruction, wherein the first access data includes an operand used to execute the first calculation instruction and/or a calculation result of the operand.

Here, the addressing mode includes an address accumulation mode and a direct addressing mode. Wherein, the address accumulation mode can be understood as determining the first address information (access address information and/or storage address information) of the first memory access data in the storage device according to a preset accumulation algorithm during instruction execution. The direct addressing mode can be understood as obtaining the first address information (access address information and/or storage address information) of the first access data in the storage device based on the address field in the first calculation instruction during the execution of the instruction. Wherein, the storage device may be a memory and a register file in an internal storage device of the instruction processing device, or may be an external storage device of the instruction processing device.
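The difference between the two addressing modes can be sketched in a few lines of Python. The enum, the function name, and the parameter names are hypothetical; the sketch only restates the rule given above: direct addressing takes the address from the instruction's address field, while address accumulation derives it from the previous instruction's address and a step.

```python
from enum import Enum


class AddressingMode(Enum):
    DIRECT = 0
    ACCUMULATE = 1


def resolve_address(mode, address_field, prev_addr, step):
    """Return the address of the access data for the current instruction."""
    if mode is AddressingMode.DIRECT:
        # Direct addressing: use the address field of the instruction itself.
        return address_field
    # Address accumulation: previous instruction's address plus the step.
    return prev_addr + step
```

With an address field of 16, a previous address of 8, and a step of 4, the direct mode yields 16 and the accumulation mode yields 12.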

In the embodiment of the present disclosure, the addressing mode of the first address information of the first access data may be determined based on the instruction content of the first calculation instruction. During specific implementation, it may be determined based on the specified data bit in the first computing instruction whether the addressing mode of the first address information of the first access data is an address accumulation mode or a direct addressing mode.

S105: If the addressing mode is an address accumulation mode, obtain accumulation step information based on the first calculation instruction, and obtain second address information of the second access data corresponding to the second calculation instruction, wherein the second calculation instruction is the previous instruction of the first calculation instruction.

In the embodiment of the present disclosure, the second calculation instruction is the instruction executed immediately before the first calculation instruction. The second address information is the address information (access address information and/or storage address information) of the second access data corresponding to the second calculation instruction in the storage device of the instruction processing device. The second access data includes an operand used to execute the second calculation instruction and/or a calculation result of the operand.

The "operand" in this article refers to the entity that the operator acts on, which specifies the processing object of the instruction to be processed. For example, the instruction to be processed is a comparison instruction, in which the operator specifies the computer to perform a comparison operation, and the operand specifies two values to be compared.

The inventors found that for high-performance computing scenarios, the data access addresses of multiple instructions often have a certain correlation, for example, the addresses are often incremented by a fixed step size. Based on this, in the embodiment of the present disclosure, information of an accumulation step size may be preset. Here, the determination of the accumulation step size information is associated with the storage location of the first memory access data in the storage device of the instruction processing device.

S107: Determine the first address information based on the second address information and the accumulation step information, and execute the first calculation instruction based on the first address information.

After the second address information and the accumulation step information are determined, an accumulation calculation may be performed on the second address information and the accumulation step information to obtain the first address information.

Next, taking the dot product operation in the convolution operation as an example, the problem of extra instruction cycles brought by the indirect addressing mode in the prior art will be described.

Assume that the input data are x=[a1, a2, a3] and y=[b1, b2, b3], with vector x stored in storage locations 0 to 2 and vector y stored in storage locations 3 to 5. The corresponding dot product calculation procedure is as follows:

In the above program code, macc is a multiply-accumulate instruction. In the for loop, the instructions "x_addr=x_addr+1" and "y_addr=y_addr+1" must also be executed each time the multiply-accumulate instruction macc is executed. Therefore, each iteration of the for loop needs to issue and execute three instructions to realize the function of one multiply-accumulate instruction macc, which consumes three instruction cycles. For an instruction processing device with high performance requirements, spending three instruction cycles to implement the function of one instruction reduces processing performance.
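The cost described above can be illustrated with a small Python model. This is an assumed reconstruction for illustration only, not the program listing referred to in the text: the function names are hypothetical, concrete numbers stand in for a1..a3 and b1..b3, and the count of issued instructions is used as a rough proxy for instruction cycles.

```python
def dot_explicit(mem, x_addr, y_addr, n):
    """Baseline: each address update is a separate instruction."""
    acc, issued = 0, 0
    for _ in range(n):
        acc += mem[x_addr] * mem[y_addr]  # macc
        x_addr = x_addr + 1               # x_addr = x_addr + 1
        y_addr = y_addr + 1               # y_addr = y_addr + 1
        issued += 3
    return acc, issued


def dot_accumulate(mem, x_addr, y_addr, n, step=1):
    """Address accumulation: the address update is folded into each macc."""
    acc, issued = 0, 0
    for i in range(n):
        if i > 0:
            # Performed as part of the macc itself, so nothing extra is issued.
            x_addr += step
            y_addr += step
        acc += mem[x_addr] * mem[y_addr]
        issued += 1
    return acc, issued


mem = [1, 2, 3, 4, 5, 6]  # x in locations 0..2, y in locations 3..5
baseline = dot_explicit(mem, 0, 3, 3)
folded = dot_accumulate(mem, 0, 3, 3)
```

Both functions compute the same dot product; in this model the baseline issues nine instructions for three elements, while the accumulation variant issues three.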

In the embodiment of the present disclosure, after the first calculation instruction is acquired, the addressing mode of the first address information of the first access data corresponding to the first calculation instruction may be determined. When the addressing mode is the address accumulation mode, the accumulation step information can be obtained based on the first calculation instruction, and the second address information of the second access data corresponding to the second calculation instruction can be obtained; the first address information is then determined based on the second address information and the accumulation step information, and the first calculation instruction is executed based on the first address information. Because the first address information is determined through the address accumulation mode, the process of determining it can be hidden in the instruction processing process of the first calculation instruction, which reduces extra instruction cycles and improves instruction execution efficiency.

The above instruction processing method will be introduced below in combination with specific implementation manners.

In the embodiment of the present disclosure, firstly, the first calculation instruction to be processed is obtained from the instruction register, and then the addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction is determined.

In an optional implementation manner, as shown in FIG. 2, the above step S103: determining the addressing mode of the first address information of the first memory access data corresponding to the first computing instruction, specifically includes the following steps:

S1031: Determine a first enable flag in the first calculation instruction, where the first enable flag is used to indicate whether the address accumulation mode for the first memory access data is enabled and valid;

S1032: Determine an addressing mode of the first address information of the first memory access data corresponding to the first computing instruction based on the first enabling flag.

In an embodiment of the present disclosure, a specified data bit including the first enable flag may be determined in each data bit of the first calculation instruction, and the value of the specified data bit is determined as the first enable flag.

In a specific implementation, when the first enable flag is a first value, it can be determined that the address accumulation mode of the first access data is enabled; when the first enable flag is a second value, it can be determined that the address accumulation mode of the first access data is disabled.

Here, the first value and the second value may be set according to actual needs, which is not specifically limited in the present disclosure. For example, the first value can be set to "1", indicating that the address accumulation mode of the first access data is enabled; the second value can be set to "0", indicating that the address accumulation mode of the first access data is disabled.

This example describes the case where the first enable flag corresponds to one specified data bit. However, where the first enable flag includes multiple first sub-enable flags (each sub-enable flag corresponding to one piece of data in the first access data), the first enable flag may correspond to multiple specified data bits, as described in subsequent embodiments.

In the embodiment of the present disclosure, after the first calculation instruction is obtained, the value of the designated data bit in the first calculation instruction may be obtained, and then the first enabling flag is determined according to the value of the designated data bit. After the first enabling flag is acquired, it may be determined based on the first enabling flag whether the address accumulation mode for the first memory access data is enabled.

When it is determined that the address accumulation mode is enabled, it can be determined that the addressing mode for the first address information of the first memory access data is the address accumulation mode; when it is determined that the address accumulation mode is disabled, it can be determined that the addressing mode for the first address information of the first memory access data is a non-address-accumulation mode (for example, a direct addressing mode).

In the above embodiment, by determining the first enable flag in the first calculation instruction, and then determining the addressing mode of the first address information of the first memory access data according to the first enable flag, the determination of the first address information can be hidden within the processing of the first calculation instruction itself, thereby avoiding extra instruction cycles and improving the execution efficiency of the instruction.
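Steps S1031 and S1032 can be illustrated with a minimal sketch. The 32-bit instruction word, the bit position `ENABLE_BIT`, and the names `AddressingMode` and `decode_addressing_mode` are assumptions made for illustration only; the disclosure does not fix an instruction encoding. The values "1" and "0" follow the example values given above.

```python
from enum import Enum


class AddressingMode(Enum):
    ACCUMULATE = "address accumulation"
    DIRECT = "direct addressing"


# Hypothetical encoding: bit 31 of a 32-bit instruction word is the
# specified data bit that carries the first enable flag.
ENABLE_BIT = 31


def decode_addressing_mode(instruction_word: int,
                           enable_bit: int = ENABLE_BIT) -> AddressingMode:
    """S1031: read the flag bit; S1032: map it to an addressing mode."""
    flag = (instruction_word >> enable_bit) & 1
    # Flag value 1 means enabled, 0 means disabled (example values only).
    return AddressingMode.ACCUMULATE if flag == 1 else AddressingMode.DIRECT
```

Because the mode is read from a bit of the calculation instruction itself, no separate address-setup instruction is needed to select it.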

After determining the addressing mode of the first address information of the first memory access data, the first address information can be determined based on the addressing mode. The following cases are introduced in turn.

Case 1: The addressing mode is the address accumulation mode.

When the addressing mode is the address accumulation mode, the accumulation step size information may be acquired based on the instruction content of the first calculation instruction.

During specific implementation, the accumulative step size information can be obtained in the register file or in the instruction code based on the instruction content of the first calculation instruction, and the two methods will be introduced respectively below.

Method 1: The method based on the register file.

When the addressing mode is the address accumulation mode, the instruction content of the first calculation instruction includes an index identifier of a register in the register file storing the accumulation step information. In this case, the data in the corresponding register may be read based on the index identifier, and the read data may be determined as the accumulative step size information.

Method 2: A method based on instruction encoding.

When the addressing mode is the address accumulation mode, the instruction content of the first calculation instruction includes accumulation step size information. In this case, the accumulative step size information can be directly determined according to the instruction content of the first calculation instruction.

After the accumulation step size information is determined by Method 1 or Method 2 above, the second address information of the second memory access data corresponding to the second calculation instruction can be obtained, the first address information is determined based on the second address information and the accumulation step size information, and the first calculation instruction is then executed based on the first address information.
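The two ways of obtaining the step size, and the accumulation itself, can be sketched as follows. The function names, the `step_in_register` switch, and the modelling of the register file as a Python sequence are illustrative assumptions, and the accumulation is taken to be a plain addition.

```python
from typing import Sequence


def resolve_step(field_value: int, step_in_register: bool,
                 scalar_regs: Sequence[int]) -> int:
    """Return the accumulation step size carried by an instruction field.

    Method 1: the field is a register index, so read that register.
    Method 2: the field is the step size itself (instruction encoding).
    """
    if step_in_register:
        return scalar_regs[field_value]
    return field_value


def accumulate_address(second_address: int, step: int) -> int:
    """First address = address used by the previous instruction + step."""
    return second_address + step


# Usage: the previous instruction accessed address 100.
scalar_regs = [0, 4, 16]
step = resolve_step(2, step_in_register=True, scalar_regs=scalar_regs)
first_address = accumulate_address(100, step)  # 116
```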

Case 2: The addressing mode is direct addressing mode.

In the case where the addressing mode is the direct addressing mode, the field content of the address field in the first calculation instruction is obtained, and the first address information of the first memory access data is determined based on the obtained field content.

In the embodiment of the present disclosure, when the addressing mode is the direct addressing mode, the first address information of the first memory access data (that is, the above-mentioned first address information) may be acquired based on the instruction content of the first calculation instruction.

During specific implementation, the first address information of the first memory access data can be obtained in the register file or in the instruction code based on the field content of the address field in the first computing instruction, and the two methods will be introduced respectively below.

Method 1: The method based on the register file.

In the case that the addressing mode is the direct addressing mode, the field content of the address field in the first computing instruction includes the index identifier of the register storing the first address information in the register file. In this case, the first address information in the corresponding register can be read based on the index identifier.

Method 2: A method based on instruction encoding.

When the addressing mode is the direct addressing mode, the field content of the address field in the first calculation instruction includes the first address information of the first memory access data. In this case, the first address information of the first fetched data can be determined directly according to the field content of the address field in the first calculation instruction.

After the first address information of the first memory access data is determined according to Method 1 or Method 2 above, the first calculation instruction may be executed based on the first address information.

For example, the instruction form of the first calculation instruction may be opcal(addr1/dlt1=a, addr2/dlt2=b, addr3/dlt3=c), wherein opcal is the above-mentioned first calculation instruction and (addr1/dlt1, addr2/dlt2, addr3/dlt3) is part of the instruction content of the first calculation instruction. Among them, addr1/dlt1 and addr2/dlt2 are the instruction content used to determine the access information of the operands in the first memory access data (for example, the access addresses of the operands), and addr3/dlt3 is the instruction content used to determine the storage information of the calculation result in the first memory access data (for example, the storage address of the calculation result).

In the case that the addressing mode is the address accumulation mode, the part of the instruction content of the first calculation instruction contains the following information: (dlt1=a1, dlt2=b1, dlt3=c1), wherein dlt1, dlt2, and dlt3 indicate that the first enable flag is enabled, and a1, b1, and c1 may be the accumulation step size information, or may be the index identifiers of the registers in the register file that store the accumulation step size information.

In the case that the addressing mode is the direct addressing mode, the part of the instruction content of the first calculation instruction contains the following information: (addr1=a2, addr2=b2, addr3=c2), wherein addr1, addr2, and addr3 indicate that the first enable flag is disabled, and a2, b2, and c2 may be the first address information of the first memory access data, or may be the index identifiers of the registers in the register file that store the first address information of the first memory access data.

In an optional implementation manner, the first enabling flag includes a plurality of first sub-enabling flags; each of the first sub-enabling flags corresponds to one data in the first access data.

Take the first calculation instruction opcal(addr1/dlt1=a, addr2/dlt2=b, addr3/dlt3=c) as an example, where addr/dlt is the above-mentioned first enabling flag: addr indicates that the first enabling flag is disabled, and dlt indicates that the first enabling flag is enabled.

addr1/dlt1, addr2/dlt2 and addr3/dlt3 are the above-mentioned multiple first sub-enabling identifiers, wherein the first sub-enabling identifier addr1/dlt1 is used to indicate the addressing mode of the access address of operand 1, the first sub-enabling identifier addr2/dlt2 is used to indicate the addressing mode of the access address of operand 2, and the first sub-enabling identifier addr3/dlt3 is used to indicate the addressing mode of the storage address of the calculation result of the operands of the first calculation instruction.

In this case, the above step S1032: determining the addressing mode of the first address information of the first access data corresponding to the first calculation instruction based on the first enabling flag includes the following steps:

determining, among the plurality of first sub-enabling identifiers, a first sub-enabling identifier that matches each data in the first memory access data;

determining, based on the first sub-enabling identifier that matches each data in the first memory access data, the addressing mode of the first address information of each data in the first memory access data corresponding to the first calculation instruction.

It should be understood that the above-mentioned first enablement flag may include a plurality of first sub-enablement flags, and each first sub-enablement flag is used to indicate the addressing mode of the first address information of one data in the first memory access data. Here, each data in the first memory access data can be understood as each operand used to execute the first calculation instruction and/or the calculation result of the operand.

That is to say, the addressing modes of the first address information of each operand and/or the calculation result of the operand may be the same or different, which is not specifically limited in the present disclosure.

Here, for the operand, the first address information can be understood as the read address of the operand (that is, the first access address described below); for the calculation result of the operand, the first address information can be understood as the storage location of the calculation result (that is, the first storage address described below).

In a first calculation instruction, the addressing modes of the first address information of each operand and of the calculation result of the operand may not be completely the same. At this time, for each operand and for the calculation result of the operand, the first address information is determined according to the corresponding addressing mode.

During specific implementation, the first sub-enabling identifier that matches each data in the first memory access data may be determined among the plurality of first sub-enabling identifiers, and then, according to the identification content of the matched first sub-enabling identifier (for example, addr1/dlt1), it is determined whether the addressing mode of the first address information corresponding to the data is the address accumulation mode.

In the above embodiment, by setting a corresponding first sub-enabling flag for each data in the first memory access data (that is, the operand and the calculation result of the operand), and controlling the manner of determining the first address information of each data according to the first sub-enabling flag, the application scenarios of the technical solution can be expanded, thereby satisfying the programming requirements of programmers.
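The per-data selection of the addressing mode can be sketched as follows. The `Field` type, the dictionary keys, and the use of plain integers as addresses are assumptions made for illustration; `accumulate=True` models a `dlt` field and `accumulate=False` models an `addr` field of the opcal form discussed above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Field:
    accumulate: bool  # True models "dlt" (enabled), False models "addr"
    value: int        # step size when accumulating, address otherwise


def resolve_addresses(fields: Dict[str, Field],
                      previous: Dict[str, int]) -> Dict[str, int]:
    """Resolve each data item's address with its own sub-enable flag."""
    resolved = {}
    for name, field in fields.items():
        if field.accumulate:
            # Address accumulation: previous instruction's address + step.
            resolved[name] = previous[name] + field.value
        else:
            # Direct addressing: the field gives the address itself.
            resolved[name] = field.value
    return resolved


# Usage: both operands accumulate, the result address is given directly.
fields = {"op1": Field(True, 4), "op2": Field(True, 8),
          "result": Field(False, 200)}
previous = {"op1": 0, "op2": 64, "result": 128}
addresses = resolve_addresses(fields, previous)
# addresses == {"op1": 4, "op2": 72, "result": 200}
```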

On the basis of the implementation described in FIG. 2 above, as shown in FIG. 3 , the method provided by the embodiment of the present disclosure further includes the following steps:

S301: After determining that the first enabling flag indicates that the address accumulation mode is enabled, detect a second enabling flag in the second calculation instruction, wherein the second enabling flag is used to indicate whether the address accumulation mode indicated by the first enabling flag in the first calculation instruction is enabled;

S302: Determine first address information of first memory access data corresponding to the first computing instruction based on the second enabling flag.

For high-performance computing scenarios, multiple nested loops are often required to implement the computing process. The program for each loop level includes an initial instruction and loop instructions, wherein the initial instruction is used to give an initial address. Assume that a loop instruction is executed N times. If the address accumulation mode is enabled for the first loop instruction, then each time the loop is executed, the data address information corresponding to the first loop instruction is obtained by performing an address accumulation calculation on the data address information of the initial address. In this way, the data address information of the first loop instruction will be wrong, resulting in wrong operands being obtained, which seriously affects the calculation result of the instruction.

Based on this, in the embodiment of the present disclosure, the above-mentioned address accumulation mode is extended, so that the extended address accumulation mode can support the loop program more concisely and efficiently. The specific extension method is:

A second enable flag is set in the instruction preceding the first calculation instruction (that is, the second calculation instruction), and the second enable flag is used to indicate whether the address accumulation mode indicated by the first enable flag in the instruction following the second calculation instruction (that is, the first calculation instruction) is enabled.

During specific implementation, when the first enabling flag in the first calculation instruction indicates that the address accumulation mode for the first memory access data is enabled, if the second enabling flag in the second calculation instruction indicates that the address accumulation mode is disabled, it is determined that the address accumulation mode indicated by the first enabling flag in the first calculation instruction is disabled. That is to say, even if the addressing mode is determined to be the address accumulation mode according to the first enabling flag in the first calculation instruction, if the second enabling flag indicates disabled, the addressing mode of the first memory access data in the first calculation instruction may be a direct addressing mode, or the first address information may be determined according to the second address information.

For example, suppose the second calculation instruction is the address initialization instruction in a loop program and the first calculation instruction is the first loop instruction in the loop program. In this case, the first enabling flag in the first calculation instruction can be set to enabled, and the second enabling flag in the second calculation instruction can be set to disabled. Then, for the first loop instruction, the first address information of the first memory access data can be determined through the direct addressing mode, or the second address information of the second memory access data corresponding to the second calculation instruction can be determined as the above-mentioned first address information. For the other loop instructions in the loop program, for example, the Nth loop instruction, where N is greater than 1, whether the address accumulation mode is enabled can be determined according to the second enabling flag in the (N-1)th loop instruction.

In the embodiment of the present disclosure, when the second enable flag is a third value, it can be determined that the address accumulation mode indicated by the first enable flag in the instruction following the second calculation instruction is enabled; when the second enable flag is a fourth value, it can be determined that the address accumulation mode indicated by the first enable flag in the instruction following the second calculation instruction is disabled.

Here, the third value and the fourth value can be set according to actual needs, which is not specifically limited in the present disclosure. For example, the third value may be set to "1", indicating enabled; the fourth value may be set to "0", indicating disabled.

It should be understood that the extension scheme described in the above steps S301 and S302 can be applied not only to the multi-loop calculation process, but also to other calculation processes, for example, a calculation process similar to the multi-loop calculation process, or other scenarios in which the address accumulation mode indicated by the first enable flag in the first calculation instruction needs to be specified as disabled. The present disclosure does not specifically limit this, and the actual implementation shall prevail.

In an optional implementation manner, the above step S302: determining the first address information of the first access data corresponding to the first computing instruction based on the second enabling flag, specifically includes the following steps:

S3021: In the case where it is determined that the second enable flag indicates that the address accumulation mode is disabled, determine that the first address information of the first memory access data corresponding to the first calculation instruction is the second address information of the second memory access data corresponding to the second calculation instruction.

In the embodiment of the present disclosure, in a case where it is determined that the second enabling flag indicates that the address accumulation mode is enabled, the first address information of the first memory access data may be determined according to the address accumulation mode. During specific implementation, the accumulation step size information can be obtained based on the first calculation instruction, and the second address information of the second memory access data corresponding to the second calculation instruction can be obtained; further, the first address information is determined based on the second address information and the accumulation step size information, and the first calculation instruction is executed based on the first address information.

In a case where it is determined that the second enabling flag indicates that the address accumulation mode is disabled, the second address information of the second memory access data corresponding to the second calculation instruction may be determined as the above-mentioned first address information.

The following takes the cyclic program in the above-mentioned embodiment as an example for illustration. It can be seen from the above description that each layer of the loop program in the multiple loop program includes an initial instruction and a loop instruction. Among them, the initial instruction is used to give the initial address.

Assume that the second calculation instruction is the address initialization instruction in the loop program and that the first calculation instruction is the first loop instruction in the loop program. In this case, the second enabling flag in the second calculation instruction can be set to disabled. Then, for the first loop instruction, the second address information of the second memory access data corresponding to the second calculation instruction may be determined as the first address information. For example, the second address information in the address initialization instruction is determined as the first address information of the first calculation instruction.

In the embodiment of the present disclosure, the programmer can set whether the second enabling flag is enabled or disabled according to actual needs. Through this processing method, errors in the data address information of the first loop instruction can be avoided, so that accurate operands can be obtained. At the same time, through this processing method, the flexibility of programs written by users can be improved, so as to meet various programming needs of programmers.

In the above embodiment, by setting the second enabling flag in the instruction preceding the first calculation instruction, and determining according to the second enabling flag whether the address accumulation mode indicated by the first enabling flag in the first calculation instruction is enabled, the applicable scenarios of the technical solution of the present disclosure can be expanded. For a calculation process with multiple loops, the technical solution of the present disclosure can still process the calculation instructions, thereby improving the processing efficiency of loop instructions.
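The interaction of the two flags in a single-level loop can be sketched as follows. The function names and the loop model are illustrative assumptions; in particular, the sketch assumes that the initialization instruction sets its second enabling flag to disabled and that every loop instruction sets its second enabling flag to enabled, which is one possible configuration rather than a requirement of the disclosure.

```python
from typing import List


def resolve_first_address(first_flag_on: bool, second_flag_on: bool,
                          field_value: int, second_address: int) -> int:
    """Resolve one address from the flags of two adjacent instructions.

    field_value is an address when first_flag_on is False and a step
    size when first_flag_on is True.
    """
    if not first_flag_on:
        return field_value               # direct addressing
    if not second_flag_on:
        return second_address            # S3021: reuse the previous address
    return second_address + field_value  # address accumulation


def loop_addresses(initial_address: int, step: int,
                   iterations: int) -> List[int]:
    """Addresses seen by a loop instruction executed `iterations` times."""
    addresses = []
    prev_address = initial_address
    prev_next_flag = False  # second flag of the initialization instruction
    for _ in range(iterations):
        address = resolve_first_address(True, prev_next_flag,
                                        step, prev_address)
        addresses.append(address)
        prev_address = address
        prev_next_flag = True  # loop instructions enable accumulation
    return addresses


print(loop_addresses(100, 4, 3))  # [100, 104, 108]
```

With the flag of the initialization instruction disabled, the first iteration starts exactly at the initial address instead of one step past it.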

In an optional implementation manner, in the case where the second enablement identifier includes a plurality of second sub-enablement identifiers, and each second sub-enablement identifier corresponds to one data in the first memory access data, the above step S302: determining the first address information of the first memory access data corresponding to the first calculation instruction based on the second enabling flag, specifically includes the following steps:

(1) determining a second sub-enablement identifier that matches each data in the first access data among the plurality of second sub-enablement identifiers;

(2) Determine the first address information of each data in the first access data based on the second sub-enabling identifier that matches each data in the first access data.

In the embodiment of the present disclosure, the above-mentioned second enablement flag may include multiple second sub-enablement flags, and each second sub-enablement flag corresponds to one data in the first access data. Here, each data in the first fetch data can be understood as each operand used to execute the first calculation instruction and/or the calculation result of the operand.

In this embodiment of the present disclosure, the second sub-enabling identifier matching each data in the first memory access data may be determined among the multiple second sub-enabling identifiers, and then the first address information of the corresponding data is determined according to the identification value of the matched second sub-enabling identifier.

For example, the first calculation instruction and the second calculation instruction are:

opinit(addr1, addr2, addr3, nxt_dlt0_on, nxt_dlt1_on, nxt_dlt2_off);

opcal(addr1/dlt1, addr2/dlt2, addr3/dlt3, nxt_dlt0_on, nxt_dlt1_on, nxt_dlt2_off).

Wherein, opcal is the above-mentioned first calculation instruction, opinit is the second calculation instruction, and (addr1/dlt1, addr2/dlt2, addr3/dlt3, nxt_dlt_on[2:0]) is part of the instruction content of the first calculation instruction. Among them, addr1/dlt1 and addr2/dlt2 are the instruction content used to determine the access information of the operands in the first memory access data, and addr3/dlt3 is the instruction content used to determine the storage information of the calculation result of the operands in the first memory access data.

It is assumed that the first enabling flag in the first calculation instruction indicates that the address accumulation mode is enabled. According to the second enabling flag (nxt_dlt0_on, nxt_dlt1_on, nxt_dlt2_off) in the second calculation instruction, the address accumulation mode of the access addresses of the operands corresponding to "dlt0" and "dlt1" in the first calculation instruction is enabled, and the address accumulation mode of the storage address of the calculation result corresponding to "dlt2" in the first calculation instruction is disabled.

In this case, the access addresses of the operands corresponding to "dlt0" and "dlt1" can be determined based on the address accumulation mode, and the storage address of the calculation result of the second calculation instruction can be determined as the storage address of the calculation result of the first calculation instruction.

In the above embodiment, by setting a corresponding second sub-enabling flag for each data in the first memory access data, and controlling according to the second sub-enabling flag whether the address accumulation mode of the first address information of each data is enabled, the applicable scenarios of the technical solution can be further expanded, so as to meet the programming needs of programmers.
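The opinit/opcal example can be sketched with the second sub-enabling flags packed into a 3-bit field. The bit ordering (bit i controls data item i), the step sizes, and the addresses are assumptions made for illustration, and the sketch assumes that every first sub-enabling flag of the first calculation instruction is enabled.

```python
from typing import List, Sequence


def decode_next_flags(mask: int, count: int = 3) -> List[bool]:
    """Unpack a flag field such as nxt_dlt_on[2:0]; bit i -> data item i."""
    return [bool((mask >> i) & 1) for i in range(count)]


def resolve_all(mask: int, steps: Sequence[int],
                previous: Sequence[int]) -> List[int]:
    """Resolve every address of the first calculation instruction.

    previous holds the addresses used by the second calculation
    instruction. An enabled flag accumulates; a disabled flag reuses
    the previous address unchanged.
    """
    flags = decode_next_flags(mask, len(previous))
    return [prev + step if on else prev
            for on, step, prev in zip(flags, steps, previous)]


# nxt_dlt0_on, nxt_dlt1_on, nxt_dlt2_off -> mask 0b011
print(resolve_all(0b011, steps=[4, 4, 4], previous=[0, 64, 128]))
# [4, 68, 128]
```

Here both operand addresses advance by their step, while the result address stays at the location used by the preceding instruction.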

The above-mentioned content will be introduced below by taking the above-mentioned dot product calculation program as an example. For the program of dot product calculation described above, after adopting the technical solution provided by the embodiment of the present disclosure, the program can be described as:

Wherein, minit is an address initialization instruction, and the address initialization instruction is the instruction preceding the first instruction in the for loop; that is, minit is the above-mentioned second calculation instruction, and the first macc instruction in the for loop can be the above-mentioned first calculation instruction.

The address initialization instruction includes the initial addresses of vector x, vector y, and the calculation result of vector x and vector y. "nxt_dlt0_on, nxt_dlt1_on, nxt_dlt2_on" in the address initialization instruction respectively indicate that the address accumulation mode corresponding to vector x in the first macc instruction of the for loop is enabled, that the address accumulation mode corresponding to vector y in the first macc instruction of the for loop is enabled, and that the address accumulation mode corresponding to the calculation result of vector x and vector y in the first macc instruction of the for loop is enabled.

At this time, for the first macc instruction of the for loop, the access addresses of vector x and vector y can be determined based on the address accumulation mode, and the storage addresses of the calculation results of vector x and vector y can be determined.

In an optional implementation manner, the above-mentioned instruction processing device includes a register file, and the register file is used to store the accumulation step size information and the first memory access data.

In this case, the above step S105: acquiring the accumulation step information based on the first calculation instruction includes: acquiring the accumulation step information in the register file based on the instruction content of the first calculation instruction.

In this case, the above step S107: determining the first address information based on the second address information and the accumulation step information includes: determining the first address information of the first memory access data based on the second address information and the accumulation step size information.

In the embodiment of the present disclosure, when the addressing mode is the address accumulation mode, the instruction content of the first calculation instruction includes an index identifier of a register in the register file storing the accumulation step information. In this case, the data in the corresponding register may be read based on the index identifier, and the read data may be determined as the accumulative step size information.

After the accumulation step information is determined, the second address information and the accumulation step information may be accumulated and calculated to obtain the first address information of the first access data.

In an optional implementation, the register file includes a vector register file and a scalar register file, the vector register file is used to store the first memory access data of the first calculation instruction, and the scalar register file is used to store the accumulation step size information.

At present, the development of artificial intelligence has put forward higher requirements for computing power. In this context, the computing power density of instruction processing devices continues to increase, and the performance requirements for memory access also increase accordingly. In order to meet memory access requirements, the current trend is to add high-speed internal storage spaces (for example, register files) inside the instruction processing device. The first calculation unit of the instruction processing device can directly access these storage spaces and perform efficient data reuse therein, thereby improving the calculation efficiency of the first calculation unit in the instruction processing device.

In an optional implementation manner, when the first address information includes the first access address and the first storage address, the above step S107: executing the first calculation instruction based on the first address information, specifically includes the following steps:

S1071: Obtain an operand stored in a storage location corresponding to the first access address;

S1072: Execute the first calculation instruction by using the operand obtained from the storage location corresponding to the first access address as the first memory access data, to obtain a first calculation result of the first calculation instruction, and storing the first calculation result in a storage location corresponding to the first storage address.

In the embodiment of the present disclosure, the operand stored in the storage location corresponding to the first access address may be acquired. For example, the operand stored in the register corresponding to the first access address in the register file is obtained, and the first calculation instruction is executed with the obtained operand, so as to obtain the first calculation result of the first calculation instruction; the first calculation result is then stored in the register corresponding to the first storage address in the register file.

In the above embodiment, through the above processing method, the determination process of the first address information can be hidden in the instruction processing process of the first calculation instruction, thereby reducing the extra instruction cycle, and further speeding up the execution efficiency of the instruction.
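Steps S1071 and S1072 can be sketched with the register file modelled as a Python list. The function name `execute`, the list model, and the use of a callable for the operation are assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def execute(register_file: List[int], access_addresses: Sequence[int],
            store_address: int, op: Callable[..., int]) -> int:
    """Read operands, apply the operation, and write the result back."""
    # S1071: fetch the operands stored at the first access addresses.
    operands = [register_file[address] for address in access_addresses]
    # S1072: compute the result and store it at the first storage address.
    result = op(*operands)
    register_file[store_address] = result
    return result


# Usage: multiply the values held in registers 1 and 2 into register 4.
registers = [0] * 8
registers[1], registers[2] = 3, 5
execute(registers, (1, 2), 4, lambda a, b: a * b)
# registers[4] is now 15
```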

Referring to FIG. 4 , which is a schematic diagram of an instruction processing device provided by an embodiment of the present disclosure, the instruction processing device includes:

an instruction processing unit 41 configured to obtain a first computing instruction to be processed;

The instruction execution unit 42 is configured to obtain the accumulation step size information based on the first calculation instruction, and obtain the second address information of the second access data corresponding to the second calculation instruction; wherein, the second calculation instruction is The previous instruction of the first calculation instruction; and based on the second address information and the accumulation step information, determine the first address information of the first memory access data corresponding to the first calculation instruction, and based on The first address information executes the first calculation instruction, wherein the first memory access data includes an operand and/or a calculation result of the operand of the first calculation instruction.

In an optional implementation manner, the instruction processing unit 41 is further configured to, after acquiring the first computing instruction to be processed, determine the address of the first address information of the first access data corresponding to the first computing instruction. address mode.

The instruction execution unit 42 is further configured to obtain the accumulation step size information based on the first calculation instruction when the addressing mode is the address accumulation mode, and obtain the information of the second memory access data corresponding to the second calculation instruction second address information; and determining the first address information based on the second address information and the accumulation step information, and executing the first calculation instruction based on the first address information.

In the embodiment of the present disclosure, after the instruction processing unit acquires the first calculation instruction, it may determine the addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction. When the addressing mode is the address accumulation mode, the instruction execution unit can obtain the accumulation step size information based on the first calculation instruction, obtain the second address information of the second memory access data corresponding to the second calculation instruction, determine the first address information based on the second address information and the accumulation step size information, and execute the first calculation instruction based on the first address information. Because the first address information is determined through the address accumulation mode after the first calculation instruction is obtained, the determination of the first address information can be hidden within the processing of the first calculation instruction itself, which avoids extra instruction cycles and thereby improves instruction execution efficiency.
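The address accumulation described above can be illustrated with a minimal sketch. The class name `AccumulationRegister`, the method `accumulate` and the numeric values are hypothetical and chosen only for illustration; the write-back of the new address into the register is an assumption about how the "previous address" is kept for the following instruction.

```python
from dataclasses import dataclass


@dataclass
class AccumulationRegister:
    """Holds the address used by the previous (second) calculation instruction."""
    second_address: int

    def accumulate(self, step: int) -> int:
        # First address = second address + accumulation step size.
        first_address = self.second_address + step
        # Assumption: the register is then updated so that the following
        # instruction sees this address as its own "second address".
        self.second_address = first_address
        return first_address


reg = AccumulationRegister(second_address=16)
first = reg.accumulate(step=4)      # address for the first calculation instruction
following = reg.accumulate(step=4)  # address for the instruction after it
```

In this sketch no separate address-computation instruction is needed between the two calculation instructions, which is the effect the embodiment attributes to the address accumulation mode.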

In an optional implementation manner, the instruction processing unit 41 is further configured to: if the addressing mode is a direct addressing mode, obtain the field content of the address field in the first calculation instruction; determine the first address information of the first memory access data based on the obtained field content; and send the first address information to the instruction execution unit, so that the instruction execution unit executes the first calculation instruction based on the first address information.

In the embodiment of the present disclosure, when the addressing mode is the direct addressing mode, the first address information of the first memory access data may be acquired based on the instruction content of the first computing instruction.

During specific implementation, the first address information of the first memory access data can be obtained from the register file or from the instruction encoding, based on the instruction content of the first calculation instruction. Specifically, the first address information of the first memory access data can be obtained through the two methods described in the second case of the above embodiment, which will not be described in detail here.
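As a rough illustration of the direct addressing mode, the following sketch resolves an address either from the instruction encoding or from a register file. The function name `resolve_direct_address` and the `field_is_register` selector are hypothetical; the disclosure does not specify how the two sources are distinguished, so the selector is an assumption.

```python
def resolve_direct_address(field: int, field_is_register: bool, register_file: list) -> int:
    """Return the first address information in direct addressing mode."""
    if field_is_register:
        # The address field names a register that holds the address.
        return register_file[field]
    # The address field is the address itself, taken from the instruction encoding.
    return field


register_file = [0, 0, 40, 0]
addr_from_encoding = resolve_direct_address(7, False, register_file)
addr_from_register = resolve_direct_address(2, True, register_file)
```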

In the embodiment of the present disclosure, as shown in FIG. 5, the instruction execution unit 42 includes an accumulation register 421, a first calculation unit 422 and a second calculation unit 423.

The accumulation register 421 is configured to store the second address information.

The first calculation unit 422 is configured to perform accumulation calculation on the second address information and the accumulation step information to obtain the first address information.

The second calculation unit 423 is configured to: obtain the first access address and the first storage address sent by the first calculation unit; obtain the operand stored in the storage location corresponding to the first access address; execute the first calculation instruction using that operand as the first memory access data to obtain a first calculation result of the first calculation instruction; and store the first calculation result in the storage location corresponding to the first storage address.

In the embodiment of the present disclosure, when the instruction execution unit includes a plurality of second calculation units, an address accumulation unit can be provided for each second calculation unit, and the address accumulation unit includes an accumulation register 421 and a first calculation unit 422. Further, the address accumulation unit may include multiple sub-accumulation units, each of which includes an accumulation register 421 and a first calculation unit 422. For example, each second calculation unit operates on multiple operands to obtain a calculation result; in that case, one sub-accumulation unit can be provided for each operand and for the calculation result.

After the first calculation unit accumulates and calculates the second address information and the accumulation step information to obtain the first address information, it may send the first address information to the second calculation unit. Wherein, the first address information includes a first access address and a first storage address.

After obtaining the first address information, the second calculation unit may obtain the operand stored in the storage location corresponding to the first access address. For example, it obtains the operand stored in the register corresponding to the first access address in the register file of the instruction processing device, executes the first calculation instruction with the obtained operand to obtain the first calculation result of the first calculation instruction, and stores the first calculation result in the register corresponding to the first storage address in the register file.
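The cooperation between the sub-accumulation units and the second calculation unit can be sketched as follows. This is only an illustrative model under stated assumptions: the register file is a plain Python list, the names `SubAccumulationUnit` and `execute` are hypothetical, and the calculation is an addition of two operands.

```python
import operator


class SubAccumulationUnit:
    """One accumulation register plus one adder (the first calculation unit)."""

    def __init__(self, start_address: int):
        self.address = start_address  # last address used for this data item

    def step(self, delta: int) -> int:
        self.address += delta
        return self.address


def execute(register_file, src_units, dst_unit, deltas, op):
    """Model of the second calculation unit: read operands, compute, write back."""
    *src_deltas, dst_delta = deltas
    # First access addresses, one per operand.
    access_addresses = [u.step(d) for u, d in zip(src_units, src_deltas)]
    # First storage address for the calculation result.
    storage_address = dst_unit.step(dst_delta)
    operands = [register_file[a] for a in access_addresses]
    result = op(*operands)
    register_file[storage_address] = result
    return result


register_file = [10, 11, 12, 13, 20, 21, 22, 23, 0, 0, 0, 0]
src_units = [SubAccumulationUnit(0), SubAccumulationUnit(4)]
dst_unit = SubAccumulationUnit(8)
r1 = execute(register_file, src_units, dst_unit, (1, 1, 1), operator.add)
r2 = execute(register_file, src_units, dst_unit, (1, 1, 1), operator.add)
```

Each operand and the result keep their own accumulation register, so a single instruction with three steps advances all three addresses independently.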

In the embodiment of the present disclosure, as shown in FIG. 5 , the instruction processing unit 41 includes an instruction reading unit 411 , an instruction decoding unit 412 and an instruction issuing unit 413 .

The instruction reading unit 411 is configured to read the first calculation instruction to be processed.

The instruction decoding unit 412 is configured to determine a first enable flag in the first calculation instruction, wherein the first enable flag is used to indicate whether the address accumulation mode for the first memory access data is enabled; and to determine, based on the first enable flag, the addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction.

The instruction issuing unit is configured to implement data transmission between the instruction processing unit, the instruction execution unit and the register file.

In the embodiment of the present disclosure, after the instruction reading unit reads the first calculation instruction, the instruction decoding unit can obtain the value of a specified data bit in the first calculation instruction and determine the first enable flag according to that value. After the first enable flag is acquired, whether the address accumulation mode for the first memory access data is enabled can be determined based on the first enable flag.

When it is determined that the address accumulation mode is enabled, it can be determined that the addressing mode for the first address information of the first memory access data is the address accumulation mode, and the accumulation step size information can be determined. After that, the instruction issuing unit can transmit the accumulation step size information to the instruction execution unit, so that the instruction execution unit determines the first address information based on the second address information and the accumulation step size information.

Where the first enable flag includes a plurality of first sub-enable flags, each corresponding to one data in the first memory access data, the instruction decoding unit is configured to determine, among the plurality of first sub-enable flags, the first sub-enable flag that matches each data in the first memory access data; and to determine, based on the first sub-enable flag matching each data, the addressing mode of the first address information of each data in the first calculation instruction.

In the above embodiment, a corresponding first sub-enable flag is set for each data in the first memory access data (that is, each operand and the calculation result of the operand), and the addressing mode of the first address information of each data is controlled according to its first sub-enable flag. This expands the application scenarios of the technical solution and meets the programming needs of programmers.
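Decoding one sub-enable flag per data item might look like the sketch below. The bit layout is entirely hypothetical (three 8-bit addr/dlt fields in the low bits, followed by three flag bits); the disclosure only states that the flag is read from a specified data bit, so `FIELD_BITS`, `NUM_FIELDS` and `decode` are illustrative assumptions.

```python
from enum import Enum


class Mode(Enum):
    ADDR = 0  # direct addressing mode
    DLT = 1   # address accumulation mode


FIELD_BITS = 8  # hypothetical width of each addr/dlt field
NUM_FIELDS = 3  # e.g. two operands and one calculation result


def decode(word: int):
    """Return (mode, field value) for each data item of the instruction word."""
    decoded = []
    for i in range(NUM_FIELDS):
        value = (word >> (i * FIELD_BITS)) & ((1 << FIELD_BITS) - 1)
        # Hypothetical layout: sub-enable flag i sits just above the fields.
        flag = (word >> (NUM_FIELDS * FIELD_BITS + i)) & 1
        decoded.append((Mode(flag), value))
    return decoded


# Flags 0b101: data 0 and data 2 use dlt mode, data 1 uses addr mode.
word = (0b101 << 24) | (3 << 16) | (7 << 8) | 1
modes = decode(word)
```

With this layout, the field value is interpreted as an accumulation step when the mode is `DLT` and as an address when the mode is `ADDR`.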

In an optional implementation manner, the above-mentioned instruction decoding unit 412 is configured to, after determining that the first enable flag indicates that the address accumulation mode is enabled, detect a second enable flag in the second calculation instruction, wherein the second enable flag is used to indicate whether the address accumulation mode indicated by the first enable flag in the first calculation instruction takes effect; and to determine, based on the second enable flag, the first address information of the first memory access data corresponding to the first calculation instruction.

From the description of the above embodiments, it can be seen that in the embodiments of the present disclosure, the above address accumulation mode is extended, so that the extended address accumulation mode can support the loop program more concisely and efficiently. The specific extension method is as follows:

During specific implementation, when the first enable flag in the first calculation instruction indicates that the address accumulation mode for the first memory access data is enabled, but the second enable flag in the second calculation instruction is disabled, it is determined that the address accumulation mode indicated by the first enable flag in the first calculation instruction does not take effect. That is, even if the addressing mode is determined to be the address accumulation mode according to the first enable flag in the first calculation instruction, when the second enable flag is disabled, the addressing mode of the first memory access data in the first calculation instruction may be the direct addressing mode, or the first address information may be determined from the second address information.

In an embodiment of the present disclosure, the instruction decoding unit is configured to, when it is determined that the second enable flag is disabled, determine that the first address information of the first memory access data corresponding to the first calculation instruction is the second address information of the second memory access data corresponding to the second calculation instruction.

That is, when the second enable flag is determined to be disabled, the second address information of the second memory access data corresponding to the second calculation instruction may be used directly as the first address information.
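The combined effect of the two enable flags can be summarized in a short sketch. The function and parameter names are hypothetical, and the sketch follows the embodiment in which a disabled second enable flag causes the second address information to be reused unchanged.

```python
def resolve_first_address(second_address, step, first_flag_dlt,
                          prev_second_flag_on, direct_address):
    """Pick the first address information from the two enable flags."""
    if not first_flag_dlt:
        # First enable flag selects direct addressing.
        return direct_address
    if prev_second_flag_on:
        # Accumulation is enabled and takes effect.
        return second_address + step
    # Accumulation requested but suppressed by the previous instruction:
    # reuse the second address information as the first address information.
    return second_address


accumulated = resolve_first_address(32, 4, True, True, 0)
held = resolve_first_address(32, 4, True, False, 0)
direct = resolve_first_address(32, 4, False, True, 8)
```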

The following takes the cyclic program in the above-mentioned embodiment as an example for description. It can be seen from the above description that each layer of the loop program in the multiple loop program includes an initial instruction and a loop instruction. Among them, the initial instruction is used to give the initial address.

In the embodiment of the present disclosure, the programmer can set whether the second enabling flag is enabled or disabled according to actual needs. Through this processing method, an error in the data address information of the first instruction in the loop instruction can be avoided, so that an accurate operand can be obtained. At the same time, through this processing method, the flexibility of programs written by users can be improved, so as to meet various programming needs of programmers.
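One possible reading of how the second enable flag protects the first instruction of a loop is traced below. This is an interpretation, not a statement of the disclosed encoding: the `Instr` record, its field names and the rule that a direct-addressed instruction loads the accumulation register are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dlt: bool         # first enable flag: True selects address accumulation mode
    value: int        # address (addr mode) or accumulation step (dlt mode)
    nxt_dlt_on: bool  # second enable flag, applies to the following instruction


def trace_addresses(program):
    """Return the address each instruction ends up using."""
    addresses = []
    last_address = 0
    prev_nxt_dlt_on = True
    for ins in program:
        if not ins.dlt:
            last_address = ins.value   # initial instruction gives the address
        elif prev_nxt_dlt_on:
            last_address += ins.value  # accumulation takes effect
        # Otherwise accumulation is suppressed and the last address is reused.
        addresses.append((ins.name, last_address))
        prev_nxt_dlt_on = ins.nxt_dlt_on
    return addresses


# The initial instruction disables accumulation for the first loop iteration,
# so the same loop instruction can be repeated without skipping address 100.
program = [Instr("opinit", False, 100, False)] + [Instr("opcal", True, 4, True)] * 3
trace = trace_addresses(program)
```

Without the disabled flag on the initial instruction, the first loop iteration in this model would already use address 104 and the data at the initial address would be skipped.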

In the embodiment of the present disclosure, the second enabling flag includes multiple second sub-enabling flags, and each second sub-enabling flag corresponds to one data in the first access data.

In this case, the instruction decoding unit is configured to determine, among the plurality of second sub-enable flags, the second sub-enable flag that matches each data in the first memory access data; and to determine, based on the second sub-enable flag matching each data, the first address information of each data in the first memory access data.

In the embodiment of the present disclosure, the above-mentioned second enablement flag may include multiple second sub-enablement flags, and each second sub-enablement flag corresponds to one data in the first access data. Here, each data in the first memory access data can be understood as each operand used to execute the first calculation instruction, and the calculation result of the operand.

In the above embodiment, a corresponding second sub-enable flag is set for each data in the first memory access data, and whether the address accumulation mode for the first address information of each data takes effect is controlled according to its second sub-enable flag. This further expands the application scenarios of the technical solution and meets the programming needs of programmers.

In an embodiment of the present disclosure, the instruction content of the first calculation instruction includes at least one first data bit and/or at least one second data bit, wherein each first data bit includes a first enable flag and/or first identification content, and each second data bit includes a second enable flag and/or second identification content. The first enable flag is used to indicate the addressing mode of the first calculation instruction, and the first identification content is used to indicate the accumulation step size information or the first address information. The second enable flag is used to indicate the addressing mode of the calculation instruction following the first calculation instruction, and the second identification content is used to indicate the accumulation step size information or the first address information used when that following calculation instruction is executed.

For example, the first calculation instruction is opinit(addr1/dlt1=a, addr2/dlt2=b, addr3/dlt3=c, nxt_dlt0_on, nxt_dlt1_on, nxt_dlt2_off).

Here, addr1/dlt1=a, addr2/dlt2=b and addr3/dlt3=c are the at least one first data bit described above: addr/dlt is the first enable flag, the value of the data bit is the value of addr/dlt, the corresponding enable flag is *dlt*on/off, and a, b and c are the first identification contents of the first enable flags. nxt_dlt0_on, nxt_dlt1_on and nxt_dlt2_off are the at least one second data bit: nxt_dlt0, nxt_dlt1 and nxt_dlt2 are the second enable flags, and on or off is the second identification content of each second enable flag.

In the embodiment of the present disclosure, as shown in FIG. 6 , the instruction processing device further includes a register file 61; wherein, the register file is used to store at least one of the following: the accumulation step information, the first memory access data and the first address information.

In this case, the instruction decoding unit is configured to decode the instruction content of the first calculation instruction to obtain a decoding result; the instruction issuing unit is configured to send to the register file, based on the decoding result, an instruction for fetching the accumulation step size information; and the instruction execution unit is configured to determine the first address information of the first memory access data based on the second address information and the accumulation step size information.

The above instruction processing method will be introduced below in conjunction with FIG. 7. As shown in FIG. 7, the instruction processing device includes an instruction processing unit, an instruction execution unit and a register file, wherein the instruction execution unit includes an address accumulation unit A and a second calculation unit A, an address accumulation unit B and a second calculation unit B, and an address accumulation unit C and a second calculation unit C. Each of the address accumulation unit A, the address accumulation unit B and the address accumulation unit C includes an accumulation register and a first calculation unit.

On this basis, the instruction processing method can be described as the following process.

The above idea of address accumulation can be applied to the design of the instruction processing device. The instruction processing device is divided into three parts: an instruction processing unit, an instruction execution unit and a register file. The instruction processing unit is responsible for reading, decoding and issuing instructions, and the instruction execution unit is responsible for arithmetic and memory access.

The computing unit in the instruction execution unit supports, for example, the following instructions:

opinit(addr1/dlt1, addr1/dlt1, addr2/dlt2, nxt_dlt_on[2:0]);

opcal(addr1/dlt1, addr1/dlt1, addr1/dlt1, nxt_dlt_on[2:0]).

The process of parsing addresses when the instruction processing device executes the above instructions is shown in FIG. 7 .

1) The instruction reading unit reads the first calculation instruction, and the instruction decoding unit decodes it to parse out the addressing mode (direct addressing addr mode or address accumulation dlt mode) and whether the dlt mode is enabled for the next calculation instruction (nxt_dlt_on/nxt_dlt_off);

2) The instruction issuing unit establishes a communication link with the register file, so as to fetch the value of addr/dlt from the register file or from the instruction encoding through the communication link;

3) For the addr mode, the instruction execution unit directly uses the value of addr to read the register file.

For the dlt mode, the last access address (that is, the second address information corresponding to the second calculation instruction) is maintained in the accumulation register. The first calculation unit accumulates the last access address and the accumulation step size information to obtain the first address information and sends it to the second calculation unit of the instruction execution unit, which uses the first address information to access the register file.

In an optional implementation, the register file includes a vector register file and a scalar register file, the vector register file is used to store the first memory access data of the first calculation instruction, and the scalar register file is used to store the accumulated step size information and/or the first address information.

In this case, the above instruction processing method will be introduced below with reference to FIG. 8. As shown in FIG. 8, the instruction processing device includes an instruction processing unit, an instruction execution unit, a scalar register file and a vector register file, wherein the instruction execution unit includes an address accumulation unit A and a second calculation unit A, an address accumulation unit B and a second calculation unit B, and an address accumulation unit C and a second calculation unit C. Each of the address accumulation unit A, the address accumulation unit B and the address accumulation unit C includes an accumulation register and a first calculation unit.

The technical solution provided by the embodiments of the present disclosure can be extended to an instruction processing device that includes a parallel computing unit, such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit) that includes a SIMD (Single Instruction Multiple Data) unit. The main idea is that the instruction processing device contains two independent register files: the scalar register file mainly stores simple scalar data or control information, and the vector register file stores the SIMD/SIMT (Single Instruction Multiple Threads) data used in parallel computing.

Here, the vector register file is used to store the first memory access data of the first calculation instruction, and the scalar register file is used to store the accumulation step information and/or the first address information.

The process of parsing the address when the instruction processing device executes the above instructions is shown in Figure 8:

1) The instruction reading unit reads the first calculation instruction, and the instruction decoding unit decodes it to parse out the addressing mode (direct addressing addr mode or address accumulation dlt mode) and whether the dlt mode is enabled for the next calculation instruction (nxt_dlt_on/nxt_dlt_off);

3) For the addr mode, the instruction execution unit directly uses the value of addr to read the vector register file.

For the dlt mode, the last access address (that is, the second address information corresponding to the second calculation instruction) is maintained in the accumulation register. The first calculation unit accumulates the last access address and the accumulation step size information to obtain the first address information and sends it to the second calculation unit of the instruction execution unit, which uses that address to access the vector register file.
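The split between the scalar and vector register files can be modeled with the sketch below, in which accumulation steps come from a scalar register file and the operands are whole vectors. All names (`execute_vector_add`, `acc_regs`, `step_reg_indices`) and the four-lane vector width are assumptions for illustration; the calculation is assumed to be an element-wise addition with all three data items in dlt mode.

```python
def execute_vector_add(vector_regs, scalar_regs, acc_regs, step_reg_indices):
    """Resolve three vector-register addresses in dlt mode, then add element-wise."""
    # Read each accumulation step from the scalar register file and
    # accumulate it onto the previously used address.
    for i, reg_index in enumerate(step_reg_indices):
        acc_regs[i] += scalar_regs[reg_index]
    src_a, src_b, dst = acc_regs
    # Use the resulting addresses to access the vector register file.
    vector_regs[dst] = [x + y for x, y in zip(vector_regs[src_a], vector_regs[src_b])]
    return tuple(acc_regs)


vector_regs = [[i] * 4 for i in range(8)]  # 8 vector registers, 4 lanes each
scalar_regs = [1, 2]                       # accumulation steps
acc_regs = [0, 2, 4]                       # last addresses: operand A, operand B, result
addresses = execute_vector_add(vector_regs, scalar_regs, acc_regs, [0, 0, 0])
```

Only small scalar values pass through the address path in this model, while the wide data stays in the vector register file.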

Those skilled in the art can understand that, in the above method of specific implementation, the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process. The specific execution order of each step should be determined by its function and possible internal logic.

Based on the same inventive concept, the embodiment of the present disclosure also provides an instruction processing device corresponding to the instruction processing method. Since the problem-solving principle of the device in the embodiment of the present disclosure is similar to that of the above instruction processing method, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.

Referring to FIG. 9 , which is a schematic diagram of an instruction processing device provided by an embodiment of the present disclosure, the instruction processing device includes:

a first obtaining unit 91, configured to obtain the first calculation instruction to be processed;

a determining unit 92, configured to determine an addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction, wherein the first memory access data includes an operand for executing the first calculation instruction and/or a calculation result of the operand;

a second obtaining unit 93, configured to, when the addressing mode is the address accumulation mode, obtain the accumulation step size information based on the first calculation instruction, and obtain the second address information of the second memory access data corresponding to the second calculation instruction, wherein the second calculation instruction is the instruction preceding the first calculation instruction;

an instruction execution unit 94, configured to determine the first address information based on the second address information and the accumulation step size information, and execute the first calculation instruction based on the first address information.

In a possible implementation manner, the instruction execution unit is further configured to, when the first address information includes the first access address and the first storage address, obtain the operand stored in the storage location corresponding to the first access address; execute the first calculation instruction using the operand obtained from the storage location corresponding to the first access address as the first memory access data, to obtain a first calculation result of the first calculation instruction; and store the first calculation result in a storage location corresponding to the first storage address.

In a possible implementation manner, the instruction processing device is further configured to, when the addressing mode is the direct addressing mode, obtain the field content of the address field in the first calculation instruction; and determine the first address information of the first memory access data based on the obtained field content.

In a possible implementation manner, the determining unit is further configured to determine a first enable flag in the first calculation instruction, wherein the first enable flag is used to indicate whether the address accumulation mode is enabled; and determine, based on the first enable flag, the addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction.

In a possible implementation manner, the determining unit is further configured to, when the first enable flag includes a plurality of first sub-enable flags and each first sub-enable flag corresponds to one data in the first memory access data, determine, among the plurality of first sub-enable flags, the first sub-enable flag that matches each data in the first memory access data; and determine, based on the first sub-enable flag matching each data, the addressing mode of the first address information of each data in the first calculation instruction.

In a possible implementation manner, the instruction processing device is further configured to, after determining that the first enable flag is enabled, detect a second enable flag in the second calculation instruction, wherein the second enable flag is used to indicate whether the address accumulation mode indicated by the first enable flag in the first calculation instruction takes effect; and determine, based on the second enable flag, the first address information of the first memory access data corresponding to the first calculation instruction.

In a possible implementation manner, the instruction processing device is further configured to, when it is determined that the second enable flag is disabled, determine that the first address information of the first memory access data corresponding to the first calculation instruction is the second address information of the second memory access data corresponding to the second calculation instruction.

In a possible implementation manner, the instruction processing device is further configured to, when the second enable flag includes a plurality of second sub-enable flags and each second sub-enable flag corresponds to one data in the first memory access data, determine, among the plurality of second sub-enable flags, the second sub-enable flag that matches each data in the first memory access data; and determine, based on the second sub-enable flag matching each data, the first address information of each data in the first memory access data.

In a possible implementation manner, when the instruction processing device includes a register file used to store the accumulation step size information and the first memory access data, the second obtaining unit is further configured to obtain the accumulation step size information from the register file based on the instruction content of the first calculation instruction; and the instruction execution unit is further configured to determine the first address information of the first memory access data based on the second address information and the accumulation step size information.

In a possible implementation manner, the register file includes a vector register file and a scalar register file, the vector register file is used to store the first memory access data of the first calculation instruction, and the scalar register file is used to store the accumulation step size information.

In a possible implementation manner, the instruction processing apparatus provided in the present disclosure may also be implemented as a processor, which will not be repeated here.

For the description of the processing flow of each module in the device and the interaction flow between the modules, reference may be made to the relevant description in the above method embodiment, and details will not be described here.

The units included in the instruction processing device mentioned in the embodiments of the present application (as shown in FIG. 4 to FIG. 9) may be implemented as part of a processor; or, where the instruction processing device is a multi-core processor, as the processor itself; or as part of a chip, or as various electronic circuits, and so on. The present application does not limit the specific hardware implementation form, as long as the functions described in the embodiments of the present application are realized.

Corresponding to the instruction processing method in FIG. 1 , an embodiment of the present disclosure further provides an electronic device 1000 . As shown in FIG. 10 , it is a schematic structural diagram of an electronic device 1000 provided by an embodiment of the present disclosure, and the electronic device 1000 includes a processor 101 , a memory 102 and a bus 103 .

The memory 102 is used to store execution instructions and includes a memory 1021 and an external memory 1022. The memory 1021, also called internal memory, temporarily stores computing data of the processor 101 and data exchanged with the external memory 1022, such as a hard disk. The processor 101 exchanges data with the external memory 1022 through the memory 1021. When the electronic device 1000 is running, the processor 101 communicates with the memory 102 through the bus 103, so that the processor 101 executes the following instructions:

acquiring the first calculation instruction to be processed; determining the addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction, wherein the first memory access data includes an operand for executing the first calculation instruction and/or a calculation result of the operand; when the addressing mode is the address accumulation mode, acquiring accumulation step size information based on the first calculation instruction, and acquiring the second address information of the second memory access data corresponding to the second calculation instruction, wherein the second calculation instruction is the instruction preceding the first calculation instruction; and determining the first address information based on the second address information and the accumulation step size information, and executing the first calculation instruction based on the first address information.

Embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored. When the computer program is run by a processor, the steps of the instruction processing method described in the foregoing method embodiments are executed. Wherein, the storage medium may be a volatile or non-volatile computer-readable storage medium.

The embodiment of the present disclosure also provides a computer program product that carries program code, and the instructions included in the program code can be used to execute the steps of the instruction processing method described in the above method embodiments. For details, please refer to the above method embodiments, which will not be repeated here.

Embodiments of the present disclosure further provide a chip, which includes the instruction processing device described in any one of the above embodiments. For details, please refer to the above device embodiments, which will not be repeated here.

Wherein, the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) etc. wait.

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the above-described system and device can refer to the corresponding process in the foregoing method embodiments, which will not be repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be implemented through some communication interfaces, and the indirect coupling or communication connection between devices or units may be in electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

If the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes over the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to make an electronic device (which may be a personal computer, a server, a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

Finally, it should be noted that the above-described embodiments are only specific implementations of the present disclosure, used to illustrate the technical solutions of the present disclosure rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the technical field can, within the technical scope disclosed in the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and should be included within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be defined by the protection scope of the claims.

Claims

An instruction processing device, characterized in that it includes:

an instruction processing unit configured to obtain a first calculation instruction to be processed;

an instruction execution unit configured to: obtain accumulation step size information based on the first calculation instruction, and obtain second address information of second memory access data corresponding to a second calculation instruction, wherein the second calculation instruction is the instruction preceding the first calculation instruction; and determine, based on the second address information and the accumulation step size information, first address information of first memory access data corresponding to the first calculation instruction, and execute the first calculation instruction based on the first address information, wherein the first memory access data includes an operand of the first calculation instruction and/or a calculation result of the operand.
The instruction processing device according to claim 1, wherein:

the instruction processing unit is configured to, after acquiring the first calculation instruction to be processed, determine an addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction;

the instruction execution unit is configured to, when the addressing mode is an address accumulation mode, obtain the accumulation step size information based on the first calculation instruction, and obtain the second address information of the second memory access data corresponding to the second calculation instruction; and based on the second address information and the accumulation step size information, determine the first address information, and execute the first calculation instruction based on the first address information.
The instruction processing device according to claim 1 or 2, wherein the instruction execution unit comprises:

an accumulation register configured to store the second address information;

a first calculation unit configured to, after acquiring the accumulation step size information sent by the instruction processing unit, perform an accumulation calculation on the second address information and the accumulation step size information to obtain the first address information.
The instruction processing device according to claim 3, wherein the first address information includes a first access address and a first storage address; the instruction execution unit further comprises: a second calculation unit configured to:

obtain the first access address and the first storage address sent by the first calculation unit, and obtain an operand stored in a storage location corresponding to the first access address;

execute the first calculation instruction by using the operand obtained from the storage location corresponding to the first access address as the first memory access data, to obtain a first calculation result of the first calculation instruction; and

store the first calculation result to a storage location corresponding to the first storage address.
The instruction processing device according to any one of claims 2 to 4, wherein the instruction processing unit is further configured to:

in the case where the addressing mode is a direct addressing mode, acquire field content in an address field in the first calculation instruction; and

determine the first address information of the first memory access data based on the obtained field content, and send the first address information to the instruction execution unit, so that the instruction execution unit executes the first calculation instruction based on the first address information.
The instruction processing device according to any one of claims 1 to 5, wherein the instruction processing unit comprises: an instruction decoding unit configured to:

determine a first enable flag in the first calculation instruction, wherein the first enable flag is used to indicate whether the address accumulation mode for the first memory access data is enabled; and

determine, based on the first enable flag, the addressing mode of the first address information of the first memory access data corresponding to the first calculation instruction.
The instruction processing device according to claim 6, wherein the first enable flag includes a plurality of first sub-enable flags, and each of the first sub-enable flags corresponds to one data item in the first memory access data; the instruction decoding unit is configured to:

determine, among the plurality of first sub-enable flags, the first sub-enable flag that matches each data item in the first memory access data; and

determine, based on the first sub-enable flag that matches each data item in the first memory access data, the addressing mode of the first address information of each data item in the first memory access data corresponding to the first calculation instruction.
The instruction processing device according to claim 6 or 7, wherein the instruction decoding unit is configured to:

after determining that the first enable flag indicates that the address accumulation mode is enabled, detect a second enable flag in the second calculation instruction, wherein the second enable flag is used to indicate whether the address accumulation mode indicated by the first enable flag in the first calculation instruction is enabled; and

determine, based on the second enable flag, the first address information of the first memory access data corresponding to the first calculation instruction.
The instruction processing device according to claim 8, wherein the instruction decoding unit is configured to:

in the case where it is determined that the second enable flag indicates that the address accumulation mode is disabled, determine that the first address information of the first memory access data corresponding to the first calculation instruction is the second address information of the second memory access data corresponding to the second calculation instruction.
The instruction processing device according to claim 8 or 9, wherein the second enable flag includes a plurality of second sub-enable flags, and each of the second sub-enable flags corresponds to one data item in the first memory access data; the instruction decoding unit is configured to:

determine, among the plurality of second sub-enable flags, the second sub-enable flag that matches each data item in the first memory access data; and

determine, based on the second sub-enable flag that matches each data item in the first memory access data, the first address information of each data item in the first memory access data.
The instruction processing device according to any one of claims 8 to 10, wherein the instruction content of the first calculation instruction includes at least one first data bit and/or at least one second data bit, wherein each of the first data bits includes a first enable flag and/or first identification content, and each of the second data bits includes a second enable flag and/or second identification content; the first enable flag is used to indicate the addressing mode of the first calculation instruction, the first identification content is used to indicate the accumulation step size information or the first address information, the second enable flag is used to indicate the addressing mode of the next calculation instruction after the first calculation instruction, and the second identification content is used to indicate the accumulation step size information or the first address information corresponding to the execution of the next calculation instruction.
The instruction processing device according to any one of claims 1 to 11, wherein the instruction processing device further comprises:

A register file, configured to store at least one of the accumulation step information, the first memory access data and the first address information.
The instruction processing device according to claim 12, wherein the instruction execution unit comprises:

an instruction decoding unit configured to decode the instruction content of the first calculation instruction to obtain a decoding result;

an instruction issuing unit configured to send an instruction for obtaining the accumulation step size information to the register file based on the decoding result;

Wherein, the instruction execution unit is configured to determine the first address information of the first memory access data based on the second address information and the accumulation step information.
The instruction processing device according to claim 12 or 13, wherein the register file includes a vector register file and a scalar register file, wherein the vector register file is used to store the first memory access data of the first calculation instruction, and the scalar register file is used to store the accumulation step size information and/or the first address information.
An instruction processing method, characterized by comprising:

acquiring a first calculation instruction to be processed;

determining an addressing mode of first address information of first memory access data corresponding to the first calculation instruction, wherein the first memory access data includes an operand used to execute the first calculation instruction and/or a calculation result of the operand;

in the case where the addressing mode is an address accumulation mode, acquiring accumulation step size information based on the first calculation instruction, and acquiring second address information of second memory access data corresponding to a second calculation instruction, wherein the second calculation instruction is the instruction preceding the first calculation instruction; and

determining the first address information based on the second address information and the accumulation step size information, and executing the first calculation instruction based on the first address information.
A chip, characterized by comprising the instruction processing device according to any one of claims 1 to 14.
An electronic device, characterized in that it includes a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate with each other through the bus; and when the machine-readable instructions are executed by the processor, the steps of the instruction processing method according to claim 15 are implemented.
An electronic device, characterized by comprising the chip according to claim 16.
A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the instruction processing method according to claim 15 are executed.