CN111158756A - Method and apparatus for processing information - Google Patents
- Publication number
- CN111158756A (application CN201911402401.9A)
- Authority
- CN
- China
- Prior art keywords
- instruction
- register
- configuration
- information
- preset
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
- G06F9/3869—Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/30105—Register structure
- G06F9/30109—Register structure having multiple operands in a single register
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/28—Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30076—Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/30101—Special purpose registers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3877—Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor
- G06F9/3879—Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor for non-native instruction execution, e.g. executing a command; for Java instruction set
- G06F9/3881—Arrangements for communication of instructions and data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Executing Machine-Instructions (AREA)
- Advance Control (AREA)
Abstract
Embodiments of the disclosure provide a method and an apparatus for processing information. One embodiment of the method comprises: determining the instruction type of an acquired instruction based on its instruction operation code, wherein the instruction type comprises a special instruction, and the special instruction comprises register selection information, base address information and the length of data to be read; in response to determining that the instruction is a special instruction, sending the special instruction to a preset arithmetic unit for the arithmetic unit to execute the following operation steps: selecting a configuration register group from preset configuration register groups as a target configuration register group according to the register selection information; reading configuration information from the target configuration register group according to the base address information and the length of the data to be read; and executing a preset operation according to the configuration information. This embodiment solves the problem of extending special instructions in a reduced instruction set processor architecture.
Description
Technical Field
The disclosed embodiments relate to the field of computer technologies, and in particular, to a method and an apparatus for processing information.
Background
Reduced Instruction Set Computer (RISC) is a term defined in contrast to Complex Instruction Set Computer (CISC). CISC relies on adding to the hardware architecture of the machine to meet growing performance demands. Computer architecture was long dominated by increasingly complex processors: to narrow the gap between machine operations and high-level languages and to improve the operating characteristics of machines, more and more machine instructions were provided, and instruction systems became more and more complex. In particular, the mismatch between early high-speed CPUs (central processing units) and low-speed memory drove the growth of complex instruction sets, which aimed to reduce the number of memory accesses as far as possible and so increase machine speed. However, with the development of semiconductor process technology, memory speed has increased, especially through the use of caches, resulting in fundamental changes in computer architecture. Alongside the improvements in hardware, software has advanced as well: optimizing compilers emerged that reduce program execution time as far as possible and minimize the memory occupied by machine code. With advanced memory technology and advanced compilers, the CISC architecture is no longer applicable, and the RISC architecture was born. The basic starting point of RISC technology is to reduce the complexity of hardware design and increase instruction execution speed by simplifying the machine instruction system.
In the design of an end-side AI (Artificial Intelligence) inference chip, adopting the RISC instruction set can significantly reduce chip area, cost, and power consumption, which makes the product highly competitive. However, an end-side AI inference chip designed for a specific application field has the application scenarios and typical computation loads of a special-purpose chip, so it is difficult to improve operation efficiency by simply adopting a general RISC instruction set.
Disclosure of Invention
The embodiment of the disclosure provides a method and a device for processing information.
In a first aspect, an embodiment of the present disclosure provides a method for processing information, where the method includes: determining the instruction type of the instruction based on the instruction operation code of the acquired instruction, wherein the instruction type comprises a special instruction, and the special instruction comprises register selection information, base address information and the length of data to be read; in response to determining that the instruction is a special instruction, sending the special instruction to a preset arithmetic unit for the arithmetic unit to execute the following operation steps: selecting a configuration register group from preset configuration register groups as a target configuration register group according to the register selection information; reading configuration information from the target configuration register set according to the base address information and the length of the data to be read; and executing preset operation according to the configuration information.
In some embodiments, the instruction type further includes a general-purpose instruction including first register determination information, second register determination information, and third register determination information, each used for determining a register from a preset register group; and the above method further comprises: in response to determining that the instruction is a general-purpose instruction, performing the following steps: acquiring data in the register determined by the first register determination information as a first operand; acquiring data in the register determined by the second register determination information as a second operand; executing a preset operation on the first operand and the second operand to obtain an operation result; and storing the operation result into the register determined by the third register determination information.
In some embodiments, prior to fetching the instruction, the method further comprises: and storing the preset configuration information into a preset memory.
In some embodiments, the above method further comprises: configuring data transfer configuration information of a DMA controller, wherein the data transfer configuration information is used for indicating data to be transferred; sending a starting instruction to the DMA controller so that the DMA controller executes the following data carrying steps after receiving the starting instruction: and transferring the configuration information stored in the memory to the configuration register group according to the data transfer configuration information.
In some embodiments, the transferring the configuration information stored in the memory to the configuration register set according to the data transfer configuration information includes: determining the state of the configuration register set; transferring configuration information from the memory to the set of configuration registers according to the determined state.
In a second aspect, an embodiment of the present disclosure provides an apparatus for processing information, the apparatus including: a determining unit configured to determine an instruction type of an acquired instruction based on its instruction operation code, wherein the instruction type comprises a special instruction, and the special instruction comprises register selection information, base address information and the length of data to be read; an arithmetic unit configured to send the special instruction to a preset arithmetic unit in response to determining that the instruction is a special instruction, so that the arithmetic unit executes the following operation steps: selecting a configuration register group from preset configuration register groups as a target configuration register group according to the register selection information; reading configuration information from the target configuration register group according to the base address information and the length of the data to be read; and executing a preset operation according to the configuration information.
In some embodiments, the instruction type further includes a general-purpose instruction including first register determination information, second register determination information, and third register determination information, each used for determining a register from a preset register group; and the above apparatus further comprises: an operation unit configured to, in response to determining that the instruction is a general-purpose instruction, perform the following operation steps: acquiring data in the register determined by the first register determination information as a first operand; acquiring data in the register determined by the second register determination information as a second operand; executing a preset operation on the first operand and the second operand to obtain an operation result; and storing the operation result into the register determined by the third register determination information.
In some embodiments, the above apparatus further comprises: a storage unit configured to store preset configuration information to a preset memory.
In some embodiments, the above apparatus further comprises: a configuration unit configured to configure data transfer configuration information of a DMA controller, wherein the data transfer configuration information is used for indicating data to be transferred; and a starting unit configured to send a starting instruction to the DMA controller so that the DMA controller executes a preset data transfer step after receiving the starting instruction, wherein the DMA controller includes a transfer unit configured to transfer the configuration information stored in the memory to the configuration register group according to the data transfer configuration information.
In some embodiments, the transfer unit is further configured to: determining the state of the configuration register set; transferring configuration information from the memory to the set of configuration registers according to the determined state.
In a third aspect, an embodiment of the present disclosure provides a terminal, where the terminal includes: one or more processors; a storage device having one or more programs stored thereon; an operator configured to process the dedicated instruction; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any implementation of the first aspect.
In a fourth aspect, the disclosed embodiments provide a computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method as described in any implementation manner of the first aspect.
The method and the apparatus for processing information provided by the embodiments of the disclosure determine the instruction type of an acquired instruction based on its instruction operation code. If the instruction is determined to be a special instruction, the special instruction is sent to the arithmetic unit so that the arithmetic unit executes the following operation steps: first, selecting a configuration register group from the preset configuration register groups as a target configuration register group according to the register selection information; then, reading the configuration information from the target configuration register group according to the base address information and the length of the data to be read; and finally, executing a preset operation according to the configuration information. Thus, the problem of extending special instructions in a reduced instruction set processor architecture is solved.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram for one embodiment of a method for processing information, according to the present disclosure;
FIG. 3 is a schematic diagram of one hardware portion of a method for processing information suitable for use in implementing embodiments of the present disclosure;
FIG. 4 is a schematic diagram of one application scenario of a method for processing information according to the present disclosure;
FIG. 5 is a flow diagram of yet another embodiment of a method for processing information according to the present disclosure;
FIG. 6 is a schematic block diagram illustrating one embodiment of an apparatus for processing information according to the present disclosure;
FIG. 7 is a block diagram of a computer system suitable for use in implementing a terminal device of an embodiment of the disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 of a method for processing information or an apparatus for processing information to which embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various communication client applications, such as a play application, a search application, an image processing application, an instant messaging tool, social platform software, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having RISC processors and operators, including but not limited to smart phones, tablet computers, smart speakers, in-vehicle voice devices, and so on. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module, which is not particularly limited herein.
The server 105 may be a server providing various services, such as a background server providing support for information displayed on the terminal devices 101, 102, 103. The background server may analyze and perform other processing on the received data such as the request, and feed back the processing result to the terminal devices 101, 102, and 103.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module, which is not particularly limited herein.
It should be noted that the method for processing information provided by the embodiment of the present disclosure is generally executed by the terminal devices 101, 102, 103, and accordingly, the apparatus for processing information is generally disposed in the terminal devices 101, 102, 103. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for processing information in accordance with the present disclosure is shown. The method for processing information comprises the following steps:
In the present embodiment, the execution body of the method for processing information (e.g., the terminal devices 101, 102, 103 shown in FIG. 1) may include a CPU and an operator. The CPU may be a RISC processor. A RISC processor is a microprocessor that executes a reduced set of computer instruction types. In a RISC design, the computer can execute an instruction in each machine cycle, and both simple and complex operations are carried out as sequences of simple instructions. Typically, each instruction has an opcode field that indicates the nature of the operation the instruction performs, and different instructions are distinguished by different encodings of this field. Therefore, the execution body may determine the instruction type of the acquired instruction based on its instruction opcode. Here, the instruction type may include general-purpose instructions and special instructions.
In a computer, a command instructing the computer hardware to execute a certain operation or processing function is called an instruction. The instruction is the smallest functional unit of computer operation, and the role of the hardware is to accomplish the function specified by each instruction. The set of all instructions on a computer is the instruction system of the computer. An instruction system, also known as an instruction set, represents the overall functionality of the computer. RISC and CISC are the two categories into which CPUs are divided according to the characteristics of their instruction sets. In practice, each CPU is designed with a specified series of instructions that cooperate with its hardware circuitry. Here, a general-purpose instruction may refer to an instruction that cooperates with the hardware circuitry of the CPU of the execution body. A special instruction may refer to an instruction that cooperates with the hardware circuitry of the operator of the execution body. Here, the operator may be an AI accelerator added according to the actual application scenario.
Referring to FIG. 3, the upper block shows a logic block diagram of a standard five-stage pipelined RISC processor. The five pipeline stages are: instruction fetch, decode, execute, memory access, and write-back. The instruction fetch stage sends the fetched instruction to the decode stage; the decode stage reads data from the register group according to the information in the instruction format and sends the data to the execute stage; the execute stage performs the operation on the data and sends the operation result to the write-back stage; and the write-back stage selects, by priority, one of the operation results from the execute stage and writes it back to the register group. It should be understood that, since the five-stage pipelined RISC processor is a well-known technology that is widely studied and applied, the functions of the decoding unit, the data dependency scoreboard, the arithmetic logic unit, and the like in the upper block of FIG. 3 are not described in detail.
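As an illustration only, the path of a general-purpose instruction through the decode, execute, and write-back stages can be sketched as follows. The dictionary keys (`op`, `src1`, `src2`, `dest`), the operation table, and the 32-bit result width are assumptions made for this sketch and are not specified in the disclosure; the instruction fetch and memory access stages are omitted.

```python
import operator

# Hypothetical operation table; the disclosure only refers to "a preset operation".
OPERATIONS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}


def decode(registers, instr):
    """Decode stage: read the two source operands from the register group."""
    return registers[instr["src1"]], registers[instr["src2"]]


def execute(op_name, first_operand, second_operand):
    """Execute stage: apply the operation; the 32-bit wrap is an assumption."""
    return OPERATIONS[op_name](first_operand, second_operand) & 0xFFFFFFFF


def write_back(registers, instr, result):
    """Write-back stage: store the result in the destination register."""
    registers[instr["dest"]] = result


def run_general(registers, instr):
    """Pass one general-purpose instruction through the three stages."""
    first_operand, second_operand = decode(registers, instr)
    result = execute(instr["op"], first_operand, second_operand)
    write_back(registers, instr, result)
    return result
```

Here `src1`, `src2`, and `dest` stand in for the first, second, and third register determination information, respectively.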
The lower block in FIG. 3 shows the hardware units and register groups that the operator adds in the decode and execute stages. The hardware units may include an arithmetic unit, a configuration loader, and the like, and the register groups may include an arithmetic register group and configuration register groups. Here, the configuration register groups may have a special structure in which a plurality of configuration register groups coexist (this coexistence may be realized by address-based access or by first-in first-out access). The contents of a configuration register group may be passed to the arithmetic unit of the execute stage through a selector. Which configuration register group's contents are selected is determined by the configuration loader based on the instruction information passed from the decode stage.
The parts outside the lower block of FIG. 3 are additional circuits outside the RISC processor. The DMA (Direct Memory Access) controller is responsible for loading accelerator configuration information into the configuration register groups in batches. Because the number of configuration register groups is limited, the on-chip memory can hold far more configuration information than the configuration register groups can. Therefore, when there are many configuration information groups, only some of them can be loaded into the configuration register groups at first, and the DMA controller must load the remaining groups step by step at the rate at which the arithmetic unit consumes them. It should be noted that the operating information of the DMA controller, i.e., the address and the number of the configuration information groups stored in the on-chip memory, may be preconfigured by the RISC processor. After the RISC processor starts the DMA controller, the DMA controller can operate independently, without intervention by the RISC processor.
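A minimal software model of the address-accessible variant may help make the selection step concrete. It assumes byte-addressed groups of equal size; the class name `ConfigRegisterBank`, the function `run_special`, and the group sizes are hypothetical and chosen only for illustration. The first-in first-out variant is not modeled here.

```python
class ConfigRegisterBank:
    """Several coexisting configuration register groups, selected by index."""

    def __init__(self, group_count, group_size):
        self.groups = [bytearray(group_size) for _ in range(group_count)]

    def _group(self, group_index):
        if not 0 <= group_index < len(self.groups):
            raise IndexError("no such configuration register group")
        return self.groups[group_index]

    def load(self, group_index, data):
        """Fill a group from its start, e.g. on behalf of a DMA transfer."""
        group = self._group(group_index)
        if len(data) > len(group):
            raise ValueError("data does not fit in the group")
        group[:len(data)] = data

    def read(self, group_index, base, length):
        """Selector: return `length` bytes starting at `base` in one group."""
        group = self._group(group_index)
        if base < 0 or length < 0 or base + length > len(group):
            raise IndexError("read outside the configuration register group")
        return bytes(group[base:base + length])


def run_special(bank, register_select, base_address, read_length, operation):
    """Select the target group, read the configuration, then apply the operation."""
    config = bank.read(register_select, base_address, read_length)
    return operation(config)
```

For instance, `run_special(bank, 2, 1, 3, sum)` reads three bytes starting at offset 1 of group 2 and passes them to `sum`, which stands in for the preset operation of the arithmetic unit.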
In practice, RISC instructions use a fixed-length instruction format, e.g., 32 bits or 64 bits, to facilitate hardware decoding. Here, general-purpose instructions and special instructions may have the same fixed length. As an example, the instruction in the present embodiment may be a 32-bit instruction. Bits 0-6 of the instruction may represent the instruction opcode, which may be used to distinguish general-purpose instructions from special instructions. A special instruction may include register selection information, base address information, and the length of the data to be read. By way of example, when the instruction is a special instruction, bits 7-11 may be the register selection information, used to select one of the plurality of configuration register groups (or, in an address-accessible implementation, used as the address portion); bits 15-19 may be the base address information, which identifies a base address within the selected configuration register group; and bits 20-24 may be the length of the data to be read, which identifies the number of bytes to be read from the selected configuration register group.
In some optional implementations of this embodiment, before obtaining the instruction, the method for processing information may further include: storing preset configuration information into a preset memory.
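The example field layout above can be sketched as a small decoder. The sketch assumes that bit 0 is the least significant bit of the 32-bit word, and the opcode value `SPECIAL_OPCODE` is hypothetical, since the disclosure does not give a concrete encoding. The helper `encode_special` exists only to build test words.

```python
from dataclasses import dataclass

# Hypothetical opcode value marking a special instruction (bits 0-6).
SPECIAL_OPCODE = 0x0B


@dataclass(frozen=True)
class SpecialFields:
    register_select: int  # bits 7-11
    base_address: int     # bits 15-19
    read_length: int      # bits 20-24


def bits(word, low, high):
    """Return bits low..high (inclusive) of word, with bit 0 as the LSB."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def is_special(word):
    """Classify the instruction by its opcode in bits 0-6."""
    return bits(word, 0, 6) == SPECIAL_OPCODE


def decode_special(word):
    """Extract the three fields of a special instruction."""
    if not is_special(word):
        raise ValueError("not a special instruction")
    return SpecialFields(bits(word, 7, 11), bits(word, 15, 19), bits(word, 20, 24))


def encode_special(register_select, base_address, read_length):
    """Build a special instruction word; each field must fit in 5 bits."""
    for value in (register_select, base_address, read_length):
        if not 0 <= value < 32:
            raise ValueError("field does not fit in 5 bits")
    return (SPECIAL_OPCODE | (register_select << 7)
            | (base_address << 15) | (read_length << 20))
```

Because each of the three fields is five bits wide, this layout can address at most 32 configuration register groups, 32 base addresses, and a read length of up to 31.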
In this implementation, the execution body may store preset configuration information to a preset memory. As an example, the preset memory may be an on-chip memory other than the RISC processor. As an example, the RISC processor may configure multiple sets of configuration information into the on-chip memory during an initialization phase, where a set of configuration information may refer to information required for the operator to perform one operation.
In some optional implementations, the method for processing information may further include:
first, data transfer configuration information of the DMA controller is configured.
In this implementation, the execution body may configure the data transfer configuration information of the DMA controller. Here, the data transfer configuration information may be used to indicate data to be transferred. For example, the data transfer configuration information may include a base address and a transfer byte length. As an example, during the startup phase, the RISC processor may configure the base address and transfer byte length of the DMA controller.
Then, a starting instruction is sent to the DMA controller, so that the DMA controller executes the following data transfer step after receiving the starting instruction: transferring the configuration information stored in the memory to the configuration register group according to the data transfer configuration information.
In this implementation, the RISC processor may send a start instruction to the DMA controller, so that the DMA controller may execute the following data transfer steps after receiving the start instruction: and transferring the configuration information stored in the memory to the configuration register group according to the data transfer configuration information.
In practice, in the process of executing the end-side AI inference, parameter information and the like required by the neural network inference operation can be obtained in advance, and is fixed before the operation is started, and is represented by the weight value and the configuration information of each layer of the neural network. Based on the principle, the configuration information required by each layer of the computational neural network can be loaded in batch before the arithmetic unit is started, so that the time for loading the configuration information in the running process of the arithmetic unit is saved. In addition, because the DMA controller can be operated independently of the RISC processor, the RISC processor can carry out operation and the DMA controller can carry out configuration information group in parallel, thereby further saving the operation time and improving the operation efficiency.
Optionally, transferring the configuration information stored in the memory to the configuration register group according to the data transfer configuration information may specifically be performed as follows:
First, the state of the configuration register group is determined.
In this implementation, after the DMA controller is started, it may first determine the state of the plurality of configuration register groups, that is, whether they are not full. As an example, the DMA controller may determine whether the configuration register groups are not full based on the relationship between the read pointer of the configuration loader and the DMA controller's own write pointer. Here, a not-full state indicates that there is remaining space in the configuration register groups and data can be transferred into them. Otherwise, the configuration register groups have no remaining space, and data cannot be transferred into them.
Configuration information is then transferred from the memory to the set of configuration registers according to the determined state.
In this implementation, the DMA controller may transfer configuration information from the memory to the configuration register group according to the determined state of the configuration register group. For example, the DMA controller may transfer configuration information from the memory to the configuration register group when the configuration register group is not full. Otherwise, the DMA controller waits and continuously polls the read pointer of the configuration loader until the configuration register group is determined to be not full.
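The not-full check and the transfer step described above can be sketched as follows. This is a simplified software model under stated assumptions, not the patented hardware: the configuration register groups are modeled as a ring of fixed-size buffers, both pointers count whole groups, and the function returns when the groups are full instead of waiting. The names `ConfigRegisterGroups`, `release_oldest` and `dma_transfer` are hypothetical.

```python
class ConfigRegisterGroups:
    """Ring of configuration register groups shared by DMA and loader."""

    def __init__(self, num_groups: int, group_size: int):
        self.groups = [bytearray(group_size) for _ in range(num_groups)]
        self.read_ptr = 0   # advanced by the configuration loader
        self.write_ptr = 0  # advanced by the DMA controller

    def not_full(self) -> bool:
        # Space remains while fewer groups are filled than exist.
        return self.write_ptr - self.read_ptr < len(self.groups)

    def release_oldest(self) -> None:
        """Model the configuration loader finishing with the oldest group."""
        if self.read_ptr == self.write_ptr:
            raise RuntimeError("no filled group to release")
        self.read_ptr += 1


def dma_transfer(memory: bytes, base: int, length: int,
                 regs: ConfigRegisterGroups) -> int:
    """Copy up to `length` bytes starting at `base`, one group at a time.

    Stops early when the register groups are full and returns the number
    of bytes actually transferred, so the caller can resume later.
    """
    if base < 0 or length < 0 or base + length > len(memory):
        raise ValueError("transfer range lies outside the memory")
    group_size = len(regs.groups[0])
    offset, end = base, base + length
    while offset < end and regs.not_full():
        chunk = memory[offset:min(offset + group_size, end)]
        slot = regs.write_ptr % len(regs.groups)
        regs.groups[slot][:len(chunk)] = chunk
        regs.write_ptr += 1
        offset += len(chunk)
    return offset - base
```

In this model the base address and transfer byte length correspond to the data transfer configuration information. A caller that receives fewer bytes than requested can call `dma_transfer` again after the loader releases a group, which stands in for the waiting and polling behavior.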
In this embodiment, if the obtained instruction is determined to be a special instruction, the execution body may send the special instruction to the arithmetic unit, so that the arithmetic unit performs the following operation steps 2021 to 2023.
In this embodiment, after the arithmetic unit receives the dedicated instruction, a set of registers may be selected from the preset sets of configuration registers as a target configuration register set according to the register selection information in the dedicated instruction. For example, in the decoding stage, the decoding unit parses the instruction operation code, and when the instruction is determined to be a special instruction, the register selection information in the special instruction is sent to the configuration loader in the execution stage, and the configuration loader selects one configuration register group from the plurality of configuration register groups as the target configuration register group according to the register selection information.
In this embodiment, after determining the target configuration register set, the arithmetic unit may read the configuration information from the target configuration register set according to the base address information in the dedicated instruction and the length of the data to be read. For example, after determining the target configuration register set, the configuration loader in the arithmetic unit may start reading the configuration information of the length specified by the length of the data to be read to the arithmetic unit at the base address specified by the base address information of the target configuration register set. By way of example, the configuration information may include, but is not limited to, a series of information such as register addresses in an arithmetic register set, register addresses in a register set of a RISC processor, parameter information required for neural network inference operations by an arithmetic unit, and the like. The length of the configuration information is far beyond what the 32-bit RISC instruction can accommodate.
In this embodiment, the operator may perform a preset operation using the configuration information acquired in step 2022. Here, the preset operation may refer to the operation that the fetched dedicated instruction instructs to be performed. For example, the arithmetic unit within the operator can complete the operation indicated by the dedicated instruction using the configuration information obtained in step 2022, and obtain the operation result. The operator may then write the operation result back to the operation register group or to a register group of the RISC processor.
With continued reference to fig. 4, fig. 4 is a schematic diagram of an application scenario of the method for processing information according to the present embodiment. In the application scenario of fig. 4, the RISC processor 401 of the terminal device determines the instruction class of the instruction based on the instruction operation code of the fetched instruction. If the instruction is determined to be a special instruction, the RISC processor sends the special instruction to the preset operator 402, so that the operator 402 performs the following operation steps: first, selecting a configuration register group from the preset configuration register groups as a target configuration register group according to the register selection information; then, reading the configuration information from the target configuration register group according to the base address information and the length of the data to be read; and finally, executing a preset operation according to the configuration information.
The method provided by the above embodiment of the present disclosure uses an operator to process the special instruction, thereby solving the problem of extending special instructions in a reduced instruction set computer (RISC) processor architecture.
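Operation steps 2021 to 2023 can be sketched in a few lines. This is an illustrative model only: each configuration register group is represented as a byte string, the preset operation is passed in as a callable, and the function name `run_dedicated` and its bounds checks are assumptions not found in the source.

```python
from typing import Callable, Sequence


def run_dedicated(reg_select: int, base_addr: int, read_length: int,
                  config_groups: Sequence[bytes],
                  operation: Callable[[bytes], int]) -> int:
    """Model of the operator handling one dedicated instruction."""
    # Step 2021: select the target configuration register group.
    if not 0 <= reg_select < len(config_groups):
        raise ValueError("register selection outside the preset groups")
    target = config_groups[reg_select]

    # Step 2022: read `read_length` bytes starting at `base_addr`.
    if base_addr < 0 or read_length < 0 or base_addr + read_length > len(target):
        raise ValueError("read range outside the target register group")
    config = bytes(target[base_addr:base_addr + read_length])

    # Step 2023: execute the preset operation using the configuration.
    return operation(config)
```

As a usage example, with two 32-byte groups, `run_dedicated(1, 4, 3, groups, sum)` reads three bytes from offset 4 of the second group and passes them to `sum`, which here stands in for the preset operation.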
With further reference to FIG. 5, a flow 500 of yet another embodiment of a method for processing information is shown. The flow 500 of the method for processing information includes the steps of:
In this embodiment, step 501 is similar to step 201 of the embodiment shown in fig. 2, and is not described here again.
In this embodiment, if the obtained instruction is determined to be a dedicated instruction, the execution body may send the dedicated instruction to the operator, so that the operator can perform the following operation steps 5021-5023.
In this embodiment, step 5021 is similar to step 2021 of the embodiment shown in fig. 2, and is not repeated here.
In this embodiment, step 5022 is similar to step 2022 in the embodiment shown in fig. 2, and is not repeated here.
In this embodiment, step 5023 is similar to step 2023 of the embodiment shown in fig. 2, and is not repeated here.
Step 503: in response to determining that the instruction is a general-purpose instruction, operation steps 5031-5034 are performed.
In this embodiment, the instruction category may also include general-purpose instructions. The general-purpose instruction may include first register determination information, second register determination information, and third register determination information for determining a register from a preset register group. As an example, when the instruction is a general-purpose instruction, bits 7-11 of the 32 bits may be used to select the register of the register group that is used to store the result of the instruction operation. Bits 15-19 of the 32 bits may be used to select the register of the register group that is used to store the first operand of the instruction operation. Bits 20-24 of the 32 bits may be used to select the register of the register group that is used to store the second operand of the instruction operation.
Here, when the execution body determines that the acquired instruction is a general-purpose instruction, the execution body may perform the following operation steps 5031 to 5034.
In this embodiment, the execution body may acquire data in the register determined by the first register determination information as the first operand.
In this embodiment, the execution body may acquire data in the register determined by the second register determination information as the second operand.
In this embodiment, the execution body may execute a predetermined operation on the first operand and the second operand. Here, the preset operation may refer to an operation instructed to be performed by the acquired general-purpose instruction.
In this embodiment, the execution body may store the operation result obtained in step 5033 in the register determined by the third register determination information.
As can be seen from fig. 5, compared with the embodiment corresponding to fig. 2, the flow 500 of the method for processing information in the present embodiment highlights the steps performed when the instruction is a general-purpose instruction. Therefore, the scheme described in this embodiment can execute different steps for different instruction categories, thereby realizing the processing of both general-purpose instructions and special instructions.
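Operation steps 5031 to 5034 can be sketched as follows. The sketch assumes that bit 0 is the least significant bit, that bits 15-19 and 20-24 carry the first and second register determination information, and that bits 7-11 carry the third; the text gives the bit ranges but does not state this mapping explicitly. The register file is modeled as a Python list, and `execute_general` is a hypothetical name.

```python
from typing import Callable, List


def field(word: int, lo: int, hi: int) -> int:
    """Return bits lo..hi (inclusive) of word, assuming bit 0 is the LSB."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def execute_general(word: int, regs: List[int],
                    operation: Callable[[int, int], int]) -> int:
    """Model of the execution body handling one general-purpose instruction."""
    dest = field(word, 7, 11)    # register that receives the result
    src1 = field(word, 15, 19)   # register holding the first operand
    src2 = field(word, 20, 24)   # register holding the second operand

    first = regs[src1]                  # step 5031
    second = regs[src2]                 # step 5032
    result = operation(first, second)   # step 5033
    regs[dest] = result                 # step 5034
    return result
```

In this model, an instruction word with 1 in bits 7-11, 2 in bits 15-19 and 3 in bits 20-24, executed with an addition as the preset operation, writes the sum of registers 2 and 3 into register 1.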
With further reference to fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for processing information, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 6, the apparatus 600 for processing information of the present embodiment includes: a determination unit 601 and an arithmetic unit 602. The determining unit 601 is configured to determine an instruction type of the instruction based on an instruction operation code of the obtained instruction, where the instruction type includes a special instruction, and the special instruction includes register selection information, base address information, and a data length to be read; the arithmetic unit 602 is configured to, in response to determining that the instruction is a dedicated instruction, send the dedicated instruction to a preset arithmetic unit, so that the arithmetic unit executes the following arithmetic steps: selecting a configuration register group from preset configuration register groups as a target configuration register group according to the register selection information; reading configuration information from the target configuration register set according to the base address information and the length of the data to be read; and executing preset operation according to the configuration information.
In this embodiment, for the specific processing of the determining unit 601 and the arithmetic unit 602 of the apparatus 600 for processing information and the technical effects thereof, reference may be made to the related descriptions of step 201 and step 202 in the embodiment corresponding to fig. 2, which are not repeated here.
In some optional implementations of this embodiment, the instruction class further includes a general-purpose instruction, and the general-purpose instruction includes first register determination information, second register determination information, and third register determination information for determining a register from a preset register group; and the apparatus 600 further comprises: an operation unit (not shown in the figure) configured to, in response to determining that the instruction is a general-purpose instruction, perform the following operation steps: acquiring data in the register determined by the first register determination information as a first operand; acquiring data in the register determined by the second register determination information as a second operand; executing a preset operation on the first operand and the second operand to obtain an operation result; and storing the operation result into the register determined by the third register determination information.
In some optional implementations of this embodiment, the apparatus 600 further includes: a storage unit (not shown in the drawings) configured to store preset configuration information into a preset memory.
In some optional implementations of this embodiment, the apparatus 600 further includes: a configuration unit (not shown in the figure) configured to configure data transfer configuration information of the DMA controller, wherein the data transfer configuration information is used for indicating data to be transferred; and a starting unit (not shown in the figure) configured to send a starting instruction to the DMA controller for the DMA controller to execute a preset data transfer step after receiving the starting instruction, wherein the DMA controller includes a transferring unit configured to transfer the configuration information stored in the memory to the configuration register set according to the data transfer configuration information.
In some optional implementations of this embodiment, the transfer unit is further configured to: determine the state of the configuration register group; and transfer configuration information from the memory to the configuration register group according to the determined state.
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., the terminal device in fig. 1) 700 suitable for implementing embodiments of the present disclosure is shown. The terminal device shown in fig. 7 is only an example and should not impose any limitation on the functions or the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., a central processing unit, a graphics processor, an operator, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the electronic apparatus 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 7 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication means 709, or may be installed from the storage means 708, or may be installed from the ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of embodiments of the present disclosure.
It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: determining the instruction type of the instruction based on the instruction operation code of the acquired instruction, wherein the instruction type comprises a special instruction, and the special instruction comprises register selection information, base address information and the length of data to be read; in response to determining that the instruction is a special instruction, sending the special instruction to a preset arithmetic unit for the arithmetic unit to execute the following operation steps: selecting a configuration register group from preset configuration register groups as a target configuration register group according to the register selection information; reading configuration information from the target configuration register set according to the base address information and the length of the data to be read; and executing preset operation according to the configuration information.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a determination unit and an arithmetic unit. The names of these units do not in some cases constitute a limitation on the unit itself, and for example, the determination unit may also be described as "a unit that determines the instruction category of the instruction based on the instruction opcode of the fetched instruction".
The foregoing description is only of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is made without departing from the inventive concept as defined above, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.
Claims (12)
1. A method for processing information, the method comprising:
determining the instruction type of the instruction based on the instruction operation code of the acquired instruction, wherein the instruction type comprises a special instruction, and the special instruction comprises register selection information, base address information and the length of data to be read;
in response to determining that the instruction is a special instruction, sending the special instruction to a preset arithmetic unit for the arithmetic unit to execute the following operation steps: selecting a configuration register group from preset configuration register groups as a target configuration register group according to the register selection information; reading configuration information from the target configuration register group according to the base address information and the length of the data to be read; and executing a preset operation according to the configuration information.
2. The method of claim 1, wherein the instruction class further includes a general-purpose instruction including first register determination information, second register determination information, and third register determination information for determining a register from a preset register group; and
the method further comprises the following steps:
in response to determining that the instruction is a general-purpose instruction, performing the following operation steps: acquiring data in the register determined by the first register determination information as a first operand; acquiring data in the register determined by the second register determination information as a second operand; executing a preset operation on the first operand and the second operand to obtain an operation result; and storing the operation result into the register determined by the third register determination information.
3. The method of claim 1, wherein prior to fetching an instruction, the method further comprises:
storing preset configuration information into a preset memory.
4. The method of claim 3, wherein the method further comprises:
configuring data transfer configuration information of a DMA controller, wherein the data transfer configuration information is used for indicating data to be transferred;
sending a starting instruction to the DMA controller so that the DMA controller executes the following data transfer step after receiving the starting instruction: transferring the configuration information stored in the memory to the configuration register group according to the data transfer configuration information.
5. The method of claim 4, wherein the transferring the configuration information stored in the memory to the set of configuration registers according to the data transfer configuration information comprises:
determining a state of the set of configuration registers;
transferring configuration information from the memory to the set of configuration registers according to the determined state.
6. An apparatus for processing information, the apparatus comprising:
the determining unit is configured to determine an instruction type of the instruction based on an instruction operation code of the obtained instruction, wherein the instruction type comprises a special instruction, and the special instruction comprises register selection information, base address information and a data length to be read;
an arithmetic unit configured to send the special instruction to a preset arithmetic unit in response to determining that the instruction is the special instruction, so that the arithmetic unit executes the following arithmetic steps: selecting a configuration register group from preset configuration register groups as a target configuration register group according to the register selection information; reading configuration information from the target configuration register group according to the base address information and the length of the data to be read; and executing a preset operation according to the configuration information.
7. The apparatus of claim 6, wherein the instruction class further includes a general-purpose instruction including first register determination information, second register determination information, and third register determination information for determining a register from a preset register group; and
the device further comprises:
an operation unit configured to, in response to determining that the instruction is a general-purpose instruction, perform the following operation steps: acquiring data in the register determined by the first register determination information as a first operand; acquiring data in the register determined by the second register determination information as a second operand; executing a preset operation on the first operand and the second operand to obtain an operation result; and storing the operation result into the register determined by the third register determination information.
8. The apparatus of claim 6, wherein the apparatus further comprises:
a storage unit configured to store preset configuration information to a preset memory.
9. The apparatus of claim 8, wherein the apparatus further comprises:
a configuration unit configured to configure data transfer configuration information of a DMA controller, wherein the data transfer configuration information is used for indicating data to be transferred;
a starting unit configured to send a starting instruction to the DMA controller so that the DMA controller executes a preset data transfer step after receiving the starting instruction, wherein the DMA controller comprises a transfer unit configured to transfer the configuration information stored in the memory to the configuration register group according to the data transfer configuration information.
10. The apparatus of claim 9, wherein the transfer unit is further configured to:
determining a state of the set of configuration registers;
transferring configuration information from the memory to the set of configuration registers according to the determined state.
11. A terminal, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
an operator configured to process the dedicated instruction;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-5.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911402401.9A CN111158756B (en) | 2019-12-31 | 2019-12-31 | Method and apparatus for processing information |
US16/890,665 US11016769B1 (en) | 2019-12-31 | 2020-06-02 | Method and apparatus for processing information |
JP2020096666A JP6998991B2 (en) | 2019-12-31 | 2020-06-03 | Information processing methods and equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911402401.9A CN111158756B (en) | 2019-12-31 | 2019-12-31 | Method and apparatus for processing information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111158756A (en) | 2020-05-15 |
CN111158756B (en) | 2021-06-29 |
Family
ID=70559410
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911402401.9A Active CN111158756B (en) | 2019-12-31 | 2019-12-31 | Method and apparatus for processing information |
Country Status (3)
Country | Link |
---|---|
US (1) | US11016769B1 (en) |
JP (1) | JP6998991B2 (en) |
CN (1) | CN111158756B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113138800A (en) * | 2021-03-25 | 2021-07-20 | 沐曦集成电路(上海)有限公司 | Encoding and decoding method and computing system for fixed-length instruction set |
WO2022134729A1 (en) * | 2020-12-24 | 2022-06-30 | 苏州浪潮智能科技有限公司 | Risc-v-based artificial intelligence inference method and system |
CN116126252A (en) * | 2023-04-11 | 2023-05-16 | 南京砺算科技有限公司 | Data loading method, graphic processor and computer readable storage medium |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117193861B (en) * | 2023-11-07 | 2024-03-15 | 芯来智融半导体科技(上海)有限公司 | Instruction processing method, apparatus, computer device and storage medium |
CN118626152A (en) * | 2024-08-14 | 2024-09-10 | 北京开源芯片研究院 | Method and device for generating instruction stream, electronic equipment and storage medium |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1220017A (en) * | 1996-03-15 | 1999-06-16 | 微米技术有限公司 | Method and apparatus for performing operation multiple times in response to single instruction |
US6128728A (en) * | 1997-08-01 | 2000-10-03 | Micron Technology, Inc. | Virtual shadow registers and virtual register windows |
CN1716229A (en) * | 2004-06-28 | 2006-01-04 | 富士通株式会社 | Reconfigurable processor and semiconductor devices |
EP1634164A2 (en) * | 2003-06-13 | 2006-03-15 | ARM Limited | Data access program instruction encoding |
CN105426274A (en) * | 2015-11-13 | 2016-03-23 | 上海交通大学 | Soft error-tolerant coarse-grained reconfigurable array |
CN105512092A (en) * | 2014-10-08 | 2016-04-20 | 富士通株式会社 | Arithmetic circuit and control method for arithmetic circuit |
KR20160070631A (en) * | 2014-12-10 | 2016-06-20 | 삼성전자주식회사 | Processor and method for processing command of processor |
CN106557442A (en) * | 2015-09-28 | 2017-04-05 | 北京兆易创新科技股份有限公司 | Chip system |
CN106775579A (en) * | 2016-11-29 | 2017-05-31 | 北京时代民芯科技有限公司 | Floating-point operation accelerator module based on configurable technology |
CN107315715A (en) * | 2016-04-26 | 2017-11-03 | 北京中科寒武纪科技有限公司 | Apparatus and method for performing matrix add/subtract operation |
CN107562549A (en) * | 2017-08-21 | 2018-01-09 | 西安电子科技大学 | Heterogeneous many-core ASIP architecture based on on-chip bus and shared memory |
CN110390385A (en) * | 2019-06-28 | 2019-10-29 | 东南大学 | Configurable parallel general-purpose convolutional neural network accelerator based on BNRP |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000250895A (en) | 1999-03-01 | 2000-09-14 | Sanyo Electric Co Ltd | Data processor |
US8484441B2 (en) * | 2004-03-31 | 2013-07-09 | Icera Inc. | Apparatus and method for separate asymmetric control processing and data path processing in a configurable dual path processor that supports instructions having different bit widths |
US20060179273A1 (en) | 2005-02-09 | 2006-08-10 | Advanced Micro Devices, Inc. | Data processor adapted for efficient digital signal processing and method therefor |
US20180046903A1 (en) | 2016-08-12 | 2018-02-15 | DeePhi Technology Co., Ltd. | Deep processing unit (dpu) for implementing an artificial neural network (ann) |
US10409615B2 (en) * | 2017-06-19 | 2019-09-10 | The Regents Of The University Of Michigan | Configurable arithmetic unit |
US10747286B2 (en) * | 2018-06-11 | 2020-08-18 | Intel Corporation | Dynamic power budget allocation in multi-processor system |
CN109857460B (en) | 2019-02-20 | 2021-09-21 | 南京华捷艾米软件科技有限公司 | Matrix convolution calculation method, interface, coprocessor and system based on RISC-V architecture |
CN110502278B (en) | 2019-07-24 | 2021-07-16 | 瑞芯微电子股份有限公司 | Neural network coprocessor based on RISC-V extended instruction and coprocessing method thereof |
US11593156B2 (en) * | 2019-08-16 | 2023-02-28 | Red Hat, Inc. | Instruction offload to processor cores in attached memory |
- 2019-12-31: CN application CN201911402401.9A (CN111158756B), status Active
- 2020-06-02: US application US16/890,665 (US11016769B1), status Active
- 2020-06-03: JP application JP2020096666A (JP6998991B2), status Active
Non-Patent Citations (2)
Title |
---|
Y. OKADA et al.: "A new delta compression algorithm suitable for program updating in embedded systems", Data Compression Conference, 2003. Proceedings. DCC 2003 * |
HUANG Kaijie (黄凯杰): "A streamlined and efficient FCU control subsystem platform", China Master's Theses Full-text Database * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022134729A1 (en) * | 2020-12-24 | 2022-06-30 | 苏州浪潮智能科技有限公司 | Risc-v-based artificial intelligence inference method and system |
US11880684B2 (en) | 2020-12-24 | 2024-01-23 | Inspur Suzhou Intelligent Technology Co., Ltd. | RISC-V-based artificial intelligence inference method and system |
CN113138800A (en) * | 2021-03-25 | 2021-07-20 | 沐曦集成电路(上海)有限公司 | Encoding and decoding method and computing system for fixed-length instruction set |
CN116126252A (en) * | 2023-04-11 | 2023-05-16 | 南京砺算科技有限公司 | Data loading method, graphic processor and computer readable storage medium |
CN116126252B (en) * | 2023-04-11 | 2023-08-08 | 南京砺算科技有限公司 | Data loading method, graphic processor and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
US11016769B1 (en) | 2021-05-25 |
CN111158756B (en) | 2021-06-29 |
JP2021111313A (en) | 2021-08-02 |
JP6998991B2 (en) | 2022-01-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111158756B (en) | | Method and apparatus for processing information |
US7146613B2 (en) | | JAVA DSP acceleration by byte-code optimization |
CN111857819B (en) | | Apparatus and method for performing matrix add/subtract operation |
JP7012689B2 (en) | | Command execution method and device |
US9043806B2 (en) | | Information processing device and task switching method |
CN112925587B (en) | | Method and device for initializing applications |
CN111651203B (en) | | Device and method for executing vector four-rule operation |
US20230084523A1 (en) | | Data Processing Method and Device, and Storage Medium |
CN113204385B (en) | | Plug-in loading method and device, computing equipment and readable storage medium |
US8707013B2 (en) | | On-demand predicate registers |
CN111651202A (en) | | Device for executing vector logic operation |
CN115604331A (en) | | Data processing system, method and device |
CN114924792A (en) | | Instruction decoding unit, instruction execution unit, and related devices and methods |
KR20230124598A (en) | | Compressed Command Packets for High Throughput and Low Overhead Kernel Initiation |
KR100751063B1 (en) | | Method and apparatus for providing emulation PC-based for developing program of embedded system |
EP2336883A1 (en) | | Programming system in multi-core, and method and program of the same |
CN112882753A (en) | | Program running method and device |
CN111562913B (en) | | Method, device and equipment for pre-creating view component and computer readable medium |
US20090276777A1 (en) | | Multiple Programs for Efficient State Transitions on Multi-Threaded Processors |
CN115951936B (en) | | Chip adaptation method, device, equipment and medium of vectorization compiler |
CN112214244A (en) | | Arithmetic device and operation method thereof |
US20230168898A1 (en) | | Methods and apparatus to schedule parallel instructions using hybrid cores |
CN107145372A (en) | | Information generating method and device |
CN115129394A (en) | | Microservice starting method and device, computer readable medium and electronic equipment |
US9081582B2 (en) | | Microcode for transport triggered architecture central processing units |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||