CN111782274A - Data processing device and related product
- Publication number
- CN111782274A (application number CN201910272513.0A)
- Authority
- CN
- China
- Prior art keywords
- descriptor
- data
- tensor
- instruction
- processing instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/34—Addressing or accessing the instruction operand or the result ; Formation of operand address; Addressing modes
Abstract
The present disclosure relates to a data processing apparatus and related products. The product includes a control module, the control module comprising an instruction cache unit, an instruction processing unit, and a storage queue unit. The instruction cache unit is used for storing computation instructions associated with artificial neural network operations; the instruction processing unit is used for parsing a computation instruction to obtain a plurality of operation instructions; the storage queue unit is configured to store an instruction queue, the instruction queue containing the operation instructions or computation instructions to be executed in queue order. Through this arrangement, the operation efficiency of the related product when running a neural network model can be improved.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data processing apparatus and a related product.
Background
With the continuous development of artificial intelligence technology, the amount and dimensionality of the data to be processed keep increasing. In the related art, a processor generally determines the data address by obtaining parameters from an instruction, and then reads and uses the data according to that data address. This requires technicians to set parameters related to data access (e.g., the interrelations among data, or among data dimensions) when designing instructions, so as to generate the instructions transmitted to the processor to complete data access, which reduces the processing efficiency of the processor.
Disclosure of Invention
In view of this, the present disclosure provides a data processing technical solution.
According to an aspect of the present disclosure, there is provided a data processing apparatus, the apparatus including a control unit and an execution unit, the control unit including a tensor control module, wherein the control unit is configured to: when an operand of a decoded first processing instruction includes the identifier of a descriptor, determine, by the tensor control module according to the identifier of the descriptor, the descriptor storage space corresponding to the descriptor, the descriptor being used for indicating the shape of a tensor; obtain the content of the descriptor from the descriptor storage space; determine, by the tensor control module according to the content of the descriptor, the data address in a data storage space of the data corresponding to the operand of the first processing instruction; and execute, by the tensor control module according to the data address, the data processing corresponding to the first processing instruction.
According to another aspect of the present disclosure, there is provided an artificial intelligence chip comprising a data processing apparatus as described above.
According to another aspect of the present disclosure, there is provided an electronic device including the artificial intelligence chip as described above.
According to another aspect of the present disclosure, a board card is provided, which includes: a storage device, an interface device, a control device, and the artificial intelligence chip described above. The artificial intelligence chip is connected to the storage device, the control device, and the interface device respectively; the storage device is used for storing data; the interface device is used for realizing data transmission between the artificial intelligence chip and external equipment; and the control device is used for monitoring the state of the artificial intelligence chip.
According to the data processing apparatus of the embodiments of the present disclosure, a descriptor indicating the tensor shape is introduced and a tensor control module is provided in the control unit. When an operand of a decoded processing instruction includes a descriptor identifier, the content of the descriptor can be acquired through the tensor control module and the data address determined through the tensor control module, after which the processing instruction is executed. This reduces the complexity of data access and improves the efficiency of data access.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure.
Fig. 2 shows a schematic diagram of a data storage space of a data processing apparatus according to an embodiment of the present disclosure.
Fig. 3 shows a block diagram of a board card according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure. As shown in Fig. 1, the data processing apparatus includes a control unit 11 and an execution unit 12, where the control unit 11 includes a tensor control module 111, and the control unit 11 is configured to:
when the operand of the decoded first processing instruction comprises the identifier of the descriptor, determining a descriptor storage space corresponding to the descriptor through the tensor control module according to the identifier of the descriptor, wherein the descriptor is used for indicating the shape of the tensor;
obtaining the content of the descriptor from the descriptor storage space;
determining, by the tensor control module, a data address in a data storage space of the data corresponding to an operand of the first processing instruction, according to the contents of the descriptor;
and executing data processing corresponding to the first processing instruction through the tensor control module according to the data address.
According to the data processing apparatus of the embodiment of the present disclosure, a descriptor indicating the tensor shape is introduced and a tensor control module is provided in the control unit. When an operand of a decoded processing instruction includes a descriptor identifier, the content of the descriptor can be obtained through the tensor control module and the data address determined through the tensor control module, and the processing instruction is then executed, thereby reducing the complexity of data access and improving the efficiency of data access.
The data processing apparatus may be, for example, a processor, where the processor may include a general-purpose processor (e.g., a central processing unit (CPU) or a graphics processing unit (GPU)) and a special-purpose processor (e.g., an artificial intelligence processor, a scientific computing processor, or a digital signal processor). The present disclosure is not limited as to the type of processor.
In a possible implementation, the data processing apparatus may include a control unit 11 and an execution unit 12. The control unit 11 is configured to control the apparatus, for example by reading instructions from memory or receiving externally supplied instructions, decoding them, and sending micro-operation control signals to the corresponding components. The execution unit 12 is used for executing specific operation instructions and may be, for example, an arithmetic logic unit (ALU), a memory access unit (MAU), or a neural functional unit (NFU). The present disclosure is not limited as to the specific hardware type of the execution unit 12.
In one possible implementation, the data processed by the data processing apparatus may include N-dimensional tensor data (N being an integer greater than or equal to zero, for example N = 1, 2 or 3). A tensor may take many forms of data and may have different dimensions: a scalar may be regarded as a 0-dimensional tensor, a vector as a 1-dimensional tensor, and a matrix as a tensor of 2 or more dimensions. The shape of a tensor includes information such as the number of dimensions of the tensor and the size of each dimension. For example, for a tensor with 2 rows and 4 columns, the shape of the tensor can be described by a descriptor as (2, 4); that is, the tensor is represented by two parameters as a two-dimensional tensor, with the size of its first dimension being 2 and the size of its second dimension being 4. It should be noted that the manner in which descriptors indicate the tensor shape is not limited in the present application. When tensor data are stored in a memory of a data processing apparatus, their shape cannot be determined from their data address (or storage area), and related information such as the interrelation among multiple pieces of tensor data cannot be determined either, so the processor's access to tensor data is inefficient.
In this case, a descriptor (tensor descriptor) may be introduced to indicate the shape of the tensor (tensor data of the N-dimension). The value of N may be determined according to the dimension (order) of the tensor data, or may be set according to the usage requirement of the tensor data. For example, when the value of N is 3, the tensor data is three-dimensional tensor data, and the descriptor may be used to indicate the shape (e.g., offset, size, etc.) of the tensor data in three-dimensional directions. It should be understood that the value of N can be set by those skilled in the art according to actual needs, and the disclosure does not limit this.
In one possible implementation, the descriptor may include an identifier and content, etc., and the identifier of the descriptor may be used to distinguish the descriptor, such as a number; the content of the descriptor may include at least one shape parameter (e.g., a size in each dimension direction of the tensor, etc.) representing the shape of the tensor data, and may further include at least one address parameter (e.g., a reference address of the data reference point) representing the address of the tensor data. The present disclosure does not limit the specific parameters included in the content of the descriptor.
By using the descriptor to indicate the tensor data, the shape of the tensor data can be expressed, and further, the relevant information such as the interrelation among a plurality of tensor data can be determined, thereby improving the access efficiency of the tensor data.
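As a purely illustrative sketch of these two parts of a descriptor (the class and field names below are hypothetical, not structures defined by this disclosure), an identifier plus content holding shape and address parameters could be modeled as:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class TensorDescriptor:
    tr_id: int              # identifier used to distinguish descriptors, e.g. a number
    sizes: Tuple[int, ...]  # shape parameters: size in each dimension direction
    base_addr: int          # address parameter: reference address of the data reference point

# Descriptor for the earlier 2 x 4 example: identifier 0, shape (2, 4).
d = TensorDescriptor(tr_id=0, sizes=(2, 4), base_addr=0x1000)
assert len(d.sizes) == 2  # a two-dimensional tensor
```

With the shape carried in the descriptor rather than in each instruction, the interrelation among tensors can be derived from their descriptors alone.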
In a possible implementation, a tensor control module 111 may be provided in the control unit 11 to implement the operations associated with descriptors during instruction execution, such as registration, modification and deregistration of descriptors; reading and writing of descriptor content; calculation of data addresses; and execution of data access instructions. The tensor control module 111 may be, for example, a tensor interface unit (TIU); the present disclosure does not limit the specific hardware type of the tensor control module. In this way, the operations associated with descriptors are realized by dedicated hardware, further improving the access efficiency of tensor data.
In a possible implementation, upon receiving a processing instruction, the data processing apparatus may decode it through the control unit 11. The control unit 11 is further configured to: decode a received first processing instruction to obtain the decoded first processing instruction, where the decoded first processing instruction includes an operation code and one or more operands, and the operation code is used to indicate the processing type corresponding to the first processing instruction.
In this case, the first processing instruction is decoded by the control unit 11, and the decoded first processing instruction (microinstruction) is obtained. The first processing instruction may include a data access instruction, an operation instruction, a descriptor management instruction, a synchronization instruction, and the like. The present disclosure is not limited to a particular type of first processing instruction and a particular manner of decoding.
The decoded first processing instruction may include an operation code, indicating the type of processing corresponding to the first processing instruction, and one or more operands, indicating the data to be processed. For example, an instruction may be expressed as "Add A, B", where Add is the operation code, A and B are the operands, and the instruction adds A and B. The present disclosure does not limit the number of operands of a decoded instruction or the representation of instructions.
In a possible implementation, if the operand of the first processing instruction decoded by the control unit 11 includes an identifier of a descriptor, a descriptor storage space corresponding to the descriptor may be determined by the tensor control module; after the descriptor storage space is determined, the contents of the descriptor (including information such as a shape and an address characterizing tensor data) can be obtained from the descriptor storage space. Then, according to the content of the descriptor, a data address in the data storage space corresponding to the operand is determined by the tensor control module, and data processing corresponding to the first processing instruction is executed by the tensor control module according to the data address.
That is to say, when an operand of the first processing instruction includes the identifier of a descriptor, the tensor control module may obtain the content of the descriptor from the descriptor storage space according to the identifier of the descriptor, calculate, according to the content of the descriptor, the data address in the data storage space of the data corresponding to that operand, and then execute the corresponding processing according to the data address.
In this way, the descriptor content can be obtained from the descriptor storage space, and then the data address can be obtained, and the address does not need to be transmitted through an instruction in each access, so that the data access efficiency of the processor is improved.
In one possible implementation, the identity and content of the descriptor may be stored in a descriptor storage space, which may be a storage space in an internal memory of the control unit (e.g., a register, an on-chip SRAM, or other media cache, etc.). The data storage space of the tensor data indicated by the descriptors may be a storage space in an internal memory (e.g., an on-chip cache) of the control unit or an external memory (e.g., an off-chip memory) connected to the control unit. The data addresses in the data storage space may be actual physical addresses or virtual addresses. The present disclosure does not limit the location of the descriptor storage space and the data storage space and the type of data address.
In one possible implementation, the identifier and content of a descriptor and the tensor data indicated by the descriptor may all be located in the same storage area. For example, a contiguous block of the on-chip cache with addresses ADDR0-ADDR1023 may be used to store the information related to the descriptor: addresses ADDR0-ADDR31 may store the identifier of the descriptor, addresses ADDR32-ADDR63 may store the content of the descriptor, and addresses ADDR64-ADDR1023 may store the tensor data indicated by the descriptor. Here ADDR does not denote one bit or one byte; it denotes one address unit. The storage areas and their addresses can be determined by those skilled in the art according to the actual situation, and the present disclosure is not limited thereto.
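The partitioning described above can be sketched as a simple layout check (a hypothetical helper; addresses are counted in address units, as noted):

```python
# Hypothetical layout of one contiguous on-chip cache block, in address units.
ID_REGION = range(0, 32)        # ADDR0-ADDR31: identifier of the descriptor
CONTENT_REGION = range(32, 64)  # ADDR32-ADDR63: content of the descriptor
DATA_REGION = range(64, 1024)   # ADDR64-ADDR1023: tensor data indicated by the descriptor

def region_of(addr: int) -> str:
    """Name the region of the block that a given address unit falls into."""
    if addr in ID_REGION:
        return "identifier"
    if addr in CONTENT_REGION:
        return "content"
    if addr in DATA_REGION:
        return "data"
    raise ValueError("address outside the descriptor block")

assert region_of(0) == "identifier"
assert region_of(40) == "content"
assert region_of(64) == "data"
```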
In one possible implementation, the identifier and content of the descriptor and the tensor data indicated by the descriptor may be stored separately in different areas of the internal memory, for example, a register may be used as the descriptor storage space, the identifier and content of the descriptor may be stored in the register, an on-chip cache may be used as the data storage space, and the tensor data indicated by the descriptor may be stored.
In a possible implementation, a Special Register (SR) dedicated to the descriptor may be provided, and the data in the descriptor may be an immediate number or may be obtained from the special register. When the register is used to store the identifier and the content of the descriptor, the identifier of the descriptor may be represented by using the number of the register, for example, when the number of the register is 0, the identifier of the descriptor stored therein is 0. When the descriptor in the register is valid, an area may be allocated in the buffer space according to the size of the tensor data indicated by the descriptor (for example, a tensor buffer unit is created in the buffer for each tensor data) for storing the tensor data. It should be understood that the tensor data may also be stored in a preset buffer space, which is not limited by the present disclosure.
In one possible implementation, the identity and content of the descriptors may be stored in an internal memory and the tensor data indicated by the descriptors may be stored in an external memory. For example, the identification and content of the descriptors may be stored on-chip, and the tensor data indicated by the descriptors may be stored under-chip.
In one possible implementation, the data address of the data storage space corresponding to the descriptor may be a fixed address. For example, separate data storage spaces may be divided for tensor data, each of which has a one-to-one correspondence with the descriptor at the start address of the data storage space. In this case, the control unit may determine a data address of data corresponding to the operand through the tensor control module according to the contents of the descriptor, and then execute the first processing instruction.
In one possible implementation, when the data address of the data storage space corresponding to the identifier of the descriptor is a variable address, the descriptor may be further used to indicate an address of tensor data of the N-dimension, wherein the content of the descriptor may further include at least one address parameter indicating an address of the tensor data. For example, the tensor data is 3-dimensional data, when the descriptor points to an address of the tensor data, the content of the descriptor may include one address parameter indicating the address of the tensor data, such as a start address of the tensor data, or may include a plurality of address parameters of the address of the tensor data, such as a start address of the tensor data + an address offset, or the address parameters of the tensor data based on each dimension. The address parameters can be set by those skilled in the art according to actual needs, and the disclosure does not limit this.
In one possible implementation, the address parameter of the tensor data includes a reference address of a data reference point of the descriptor in a data storage space of the tensor data. Wherein the reference address may be different according to a variation of the data reference point. The present disclosure does not limit the selection of data reference points.
In one possible implementation, the reference address may include the start address of the data storage space. When the data reference point of the descriptor is the first data block of the data storage space, the reference address of the descriptor is the start address of the data storage space; when the data reference point of the descriptor is data other than the first data block in the data storage space, the reference address of the descriptor is the physical address of that data in the data storage space.
In one possible implementation, the shape parameters of the tensor data include at least one of: the size of the data storage space in at least one of the N dimensional directions, the size of the storage region of the tensor data in at least one of the N dimensional directions, the offset of the storage region in at least one of the N dimensional directions, the positions of at least two vertices located at diagonal positions of the N dimensional directions relative to the data reference point, and the mapping relationship between the data description position of the tensor data indicated by the descriptor and the data address. Where the data description position is a mapping position of a point or a region in the tensor data indicated by the descriptor, for example, when the tensor data is 3-dimensional data, the descriptor may represent a shape of the tensor data using three-dimensional space coordinates (x, y, z), and the data description position of the tensor data may be a position of a point or a region in the three-dimensional space to which the tensor data is mapped, which is represented using three-dimensional space coordinates (x, y, z).
It should be understood that the shape parameters representing tensor data can be selected by one skilled in the art based on practical circumstances, and the present disclosure is not limited thereto.
Fig. 2 shows a schematic diagram of a data storage space of a data processing apparatus according to an embodiment of the present disclosure. As shown in Fig. 2, the data storage space 21 stores two-dimensional data in a row-first manner and can be represented by axes (X, Y) (the X axis extends horizontally to the right and the Y axis extends vertically downward). The size in the X-axis direction (the size of each row) is ori_x (not shown in the figure), the size in the Y-axis direction (the total number of rows) is ori_y (not shown in the figure), and the start address PA_start (base address) of the data storage space 21 is the physical address of the first data block 22. The data block 23 is a part of the data in the data storage space 21; its offset 25 in the X-axis direction is denoted offset_x, its offset 24 in the Y-axis direction is denoted offset_y, its size in the X-axis direction is denoted size_x, and its size in the Y-axis direction is denoted size_y.
In one possible implementation, when the data block 23 is defined using a descriptor, the data reference point of the descriptor may be the first data block of the data storage space 21, and the reference address of the descriptor is then the start address PA_start of the data storage space 21. The content of the descriptor of the data block 23 may then be determined by combining the size ori_x of the data storage space 21 in the X-axis direction and its size ori_y in the Y-axis direction with the data block's offset offset_y in the Y-axis direction, its offset offset_x in the X-axis direction, its size size_x in the X-axis direction, and its size size_y in the Y-axis direction.
In one possible implementation, the content of the descriptor can be represented using the following formula (1):

    content = (PA_start, ori_x, offset_x, size_x, ori_y, offset_y, size_y)    (1)
it should be understood that, although the descriptor describes a two-dimensional space in the above example, the dimension of the content representation of the descriptor can be set by those skilled in the art according to the actual situation, and the disclosure does not limit this.
In one possible implementation, the content of the descriptor of the tensor data may be determined according to a reference address of a data reference point of the descriptor in the data storage space, and positions of at least two vertices located at diagonal positions in N dimensional directions relative to the data reference point.
For example, the content of the descriptor of the data block 23 in Fig. 2 may be determined using the reference address PA_base of the descriptor's data reference point in the data storage space, together with the positions of the two diagonal vertices relative to that data reference point. First, the data reference point of the descriptor and its reference address PA_base in the data storage space are determined; for example, one datum (e.g., the datum at position (2, 2)) may be selected in the data storage space 21 as the data reference point, and its physical address in the data storage space taken as the reference address PA_base. Then, the positions of at least two diagonal vertices of the data block 23 relative to the data reference point are determined, for example the diagonal vertices in the top-left to bottom-right direction, where the relative position of the top-left vertex is (x_min, y_min) and the relative position of the bottom-right vertex is (x_max, y_max). The content of the descriptor of the data block 23 can then be determined from the reference address PA_base, the relative position (x_min, y_min) of the top-left vertex, and the relative position (x_max, y_max) of the bottom-right vertex.
In one possible implementation, the content of the descriptor can be represented using the following formula (2):

    content = (PA_base, (x_min, y_min), (x_max, y_max))    (2)
it should be understood that although the above examples use two vertices of the upper left corner and the lower right corner to determine the content of the descriptor, those skilled in the art can set the specific vertex of the at least two vertices according to actual needs, and the disclosure is not limited thereto.
In one possible implementation manner, the content of the descriptor of the tensor data can be determined according to a reference address of the data reference point of the descriptor in the data storage space and a mapping relation between the data description position and the data address of the tensor data indicated by the descriptor. The mapping relationship between the data description position and the data address may be set according to actual needs, for example, when tensor data indicated by the descriptor is three-dimensional space data, the mapping relationship between the data description position and the data address may be defined by using a function f (x, y, z).
In one possible implementation, the content of the descriptor can be represented using the following formula (3):

    content = (PA_base, f(x, y, z))    (3)
it should be understood that, a person skilled in the art may set the mapping relationship between the data description location and the data address according to practical situations, and the disclosure does not limit this.
In the case where the content of the descriptor is expressed by formula (1), for any data point in the tensor data whose data description position is (x_q, y_q), its data address PA2_(x,y) in the data storage space can be determined using the following formula (4):

    PA2_(x,y) = PA_start + (offset_y + y_q - 1) * ori_x + (offset_x + x_q)    (4)
in this way, the control unit can calculate the data address of the tensor data indicated by the descriptor in the data storage space through the tensor control module according to the content of the descriptor, and further execute the processing corresponding to the processing instruction according to the address.
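Formula (4) can be checked directly in a few lines (a sketch that assumes one element per address unit; the function name is hypothetical):

```python
def data_address(pa_start, ori_x, offset_x, offset_y, x_q, y_q):
    """Formula (4): address of the data point with data description position
    (x_q, y_q), assuming one element per address unit."""
    return pa_start + (offset_y + y_q - 1) * ori_x + (offset_x + x_q)

# With PA_start = 100, row size ori_x = 10, block offsets (2, 1), point (1, 1):
# (offset_y + y_q - 1) * ori_x = 10, plus (offset_x + x_q) = 3, plus PA_start = 100.
assert data_address(100, 10, 2, 1, 1, 1) == 113
```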
In a possible implementation manner, registration, modification, deregistration and other management of descriptors can be realized through descriptor management instructions, with corresponding operation codes set for these management instructions. For example, a descriptor may be registered (created) by a descriptor registration instruction (TRCreate); the parameters (shape, address, etc.) of a descriptor may be modified by a descriptor modification instruction; and a descriptor may be deregistered (deleted) by a descriptor deregistration instruction (TRRelease). The present disclosure does not limit the kinds of descriptor management instructions or the specific settings of their operation codes.
In one possible implementation, the control unit is further configured to:
when the first processing instruction is a descriptor registration instruction, acquiring registration parameters of a descriptor in the first processing instruction, wherein the registration parameters comprise at least one of an identifier of the descriptor, a tensor shape and content of tensor data indicated by the descriptor;
determining, by the tensor control module, a first storage area of the content of the descriptor in a descriptor storage space and a second storage area of the content of tensor data indicated by the descriptor in a data storage space according to the registration parameters of the descriptor;
determining the content of the descriptor according to the registration parameters of the descriptor and the second storage area so as to establish the corresponding relationship between the descriptor and the second storage area;
storing the contents of the descriptor in the first storage area.
For example, a descriptor registration instruction may be used to register a descriptor, and the instruction may include a registration parameter for the descriptor. The registration parameters may include at least one of an Identification (ID) of the descriptor, a tensor shape, and contents of tensor data indicated by the descriptor. For example, the registration parameters may include the shape identified as TR0, tensor (number of dimensions, size of dimensions, offset, starting data address, etc.). The present disclosure is not limited to the specific content of the registration parameters.
In a possible implementation manner, when the decoded operation code of the first processing instruction is determined to be the descriptor registration instruction, the control unit may create a corresponding descriptor through the tension control module according to the registration parameter in the first processing instruction.
In one possible implementation, a first storage area of the content of the descriptor in the descriptor storage space and a second storage area of the content of the tensor data indicated by the descriptor in the data storage space may be determined.
For example, if at least one of the storage areas has been previously set, the first storage area and/or the second storage area may be directly determined. For example, it is preset that the descriptor content and the tensor data content are stored in the same storage space, and the storage addresses of the content of the descriptor corresponding to the identifier TR0 of the descriptor are ADDR32-ADDR63, and the storage addresses of the content of the tensor data are ADDR64-ADDR1023, then the two addresses can be directly determined as the first storage area and the second storage area.
In one possible implementation, if there is no preset storage area, a first storage area may be allocated in the descriptor storage space for the descriptor content and a second storage area may be allocated in the data storage space for the tensor data content by the tensor control module. The present disclosure is not so limited.
In a possible implementation manner, according to the tensor shape in the registration parameter and the data address of the second storage area, a corresponding relationship between the tensor shape and the address can be established, and then the descriptor content is determined, so that the corresponding data address can be determined according to the descriptor content during data processing. After the content of the descriptor is determined, it can be stored in the first storage area, completing the registration process of the descriptor.
For example, for the tensor data 23 as shown in fig. 2, the registration parameters may include a start address PA _ start (reference address) of the data storage space 21, an offset amount 25 (denoted as offset _ X) in the X-axis direction, an offset amount 24 (denoted as offset _ Y) in the Y-axis direction, a size (denoted as size _ X) in the X-axis direction, and a size (denoted as size _ Y) in the Y-axis direction. With these parameters, the contents of the descriptor can be expressed as formula (1) and stored in the first storage area, thereby completing the registration process of the descriptor.
In this way, the descriptor can be automatically created according to the descriptor registration instruction, and the correspondence between the tensor indicated by the descriptor and the data address is realized, so that the data address is obtained through the content of the descriptor during data processing, and the data access efficiency of the processor is improved.
In one possible implementation, the control unit is further configured to:
when the first processing instruction is a descriptor cancelling instruction, acquiring the identifier of the descriptor in the first processing instruction;
and according to the identifier of the descriptor, respectively releasing the storage area of the descriptor in the descriptor storage space and the storage area of the content of the tensor data indicated by the descriptor in the data storage space through the tensor control module.
For example, a descriptor deregistration instruction may be used to deregister (delete) a descriptor in order to free up space occupied by the descriptor. At least an identification of the descriptor may be included in the instruction.
In a possible implementation manner, when the instruction is determined to be a descriptor unregistering instruction according to the operation code of the decoded first processing instruction, the control unit may unregister the corresponding descriptor through the tensor control module according to the descriptor identifier in the first processing instruction.
In a possible implementation manner, according to the identifier of the descriptor, the tensor control module may release the storage area of the descriptor in the descriptor storage space and the storage area of the content of the tensor data indicated by the descriptor in the data storage space, so as to release the occupation of the storage areas by the descriptor.
By the method, the space occupied by the descriptor can be released after the use of the descriptor is finished, the limited storage resource can be repeatedly utilized, and the utilization efficiency of the resource is improved.
In one possible implementation, the control unit is further configured to:
when the first processing instruction is a descriptor modification instruction, obtaining modification parameters of a descriptor in the first processing instruction, wherein the modification parameters comprise at least one of identification of the descriptor, tensor shape to be modified and content of tensor data indicated by the descriptor;
determining the content to be updated of the descriptor through the tension control module according to the modification parameter of the descriptor;
and updating the content of the descriptor in the descriptor storage space and/or the content of tensor data in the data storage space through the tensor control module according to the content to be updated.
For example, descriptor modification instructions may be used to modify various parameters of the descriptor, such as identification, tensor shape, and the like. Modification parameters including at least one of an identification of the descriptor, a tensor shape to be modified, and content of tensor data indicated by the descriptor may be included in the instruction. The present disclosure is not limited to the details of modifying the parameters.
In a possible implementation manner, when the decoded opcode of the first processing instruction determines that the instruction is a descriptor modification instruction, according to the modification parameter in the first processing instruction, the control unit may determine, by the tensor control module, to-be-updated content of the descriptor, for example, change a dimension of a tensor from 3 dimensions to 2 dimensions, change a size of the tensor in one or more dimension directions, and the like.
In one possible implementation, upon determining the content to be updated, the tensor control module may update the descriptor content in the descriptor storage space and/or the content of the tensor data in the data storage space to modify the tensor data and enable the updated descriptor content to indicate a shape of the modified tensor data. The scope and the specific updating mode of the content to be updated are not limited in the present disclosure.
In this way, when the tensor data indicated by the descriptor is changed, the descriptor can be directly modified to maintain the correspondence between the descriptor and the tensor data, and the utilization efficiency of resources is improved.
In one possible implementation manner, the control unit further includes a dependency relationship determination module, wherein the control unit is further configured to:
determining whether a second processing instruction with a dependency relationship exists through a dependency relationship judging module according to the identifier of the descriptor, wherein the second processing instruction comprises a processing instruction which is in an instruction queue and is before the first processing instruction and has the identifier of the descriptor in an operand;
blocking or caching the first processing instruction when there is a second processing instruction that has a dependency and does not complete processing.
For example, after the descriptors are set, a dependency relationship determination module may be provided in the control unit to perform determination of dependency relationship between instructions according to the descriptors. If the operand of the decoded first processing instruction comprises the identifier of the descriptor, the control unit may determine whether an instruction with a dependency relationship exists in the preamble instruction of the first processing instruction through the dependency relationship determination module.
In this case, for an instruction (a preamble instruction) preceding the first processing instruction in the instruction queue, the dependency relationship determination module may search for a second processing instruction having an identifier of the descriptor in the operand, and use the searched second processing instruction as a processing instruction having a dependency relationship with the first processing instruction. In case of an operand of a first processing instruction having an identification of a plurality of descriptors, the dependency corresponding to each descriptor may be determined separately, i.e. a preceding instruction in the operand having an identification of at least one of the plurality of descriptors may be treated as a second processing instruction having a dependency.
For example, if the first processing instruction is an operation instruction for descriptor TR0 and the second processing instruction is a write instruction for descriptor TR0, then the second processing instruction has a dependency relationship with the first processing instruction, and the first processing instruction cannot be executed during execution of the second processing instruction. If the second processing instruction includes a synchronization instruction (sync) for the first processing instruction, the second processing instruction has a dependency relationship with the first processing instruction, and the first processing instruction needs to be executed after the second processing instruction completes execution.
In a possible implementation manner, if there is a second processing instruction which has a dependency relationship and does not complete processing, the first processing instruction may be blocked, that is, the execution of the first processing instruction and other instructions after the first processing instruction is suspended, until the execution of the second processing instruction is completed, and then the first processing instruction and other instructions after the first processing instruction are executed.
In a possible implementation manner, if there is a second processing instruction which has a dependency relationship and does not complete processing, the first processing instruction may be cached, that is, the first processing instruction is stored in a preset cache space, without affecting the execution of other instructions. And after the second processing instruction is executed, executing the first processing instruction in the cache space. The present disclosure does not limit the manner in which the first processing instruction is processed in this case.
In this way, the dependency relationship judging module can be introduced to determine the dependency relationship generated by the types between the instructions and the dependency relationship generated by the synchronous instructions, so that the execution sequence of the instructions is ensured, and the correctness of data processing is ensured.
In one possible implementation, the control unit is further configured to:
determining, by the tension control module, a current state of the descriptor according to the identifier of the descriptor, the state of the descriptor including an operable state or an inoperable state;
blocking or caching the first processing instruction while the descriptor is currently in an inoperable state.
For example, a corresponding table of the states of the descriptors, including an operable state or an inoperable state, can be stored in the tensor control module to display the current state of the descriptors.
In one possible implementation, the tension control module may set the current state of the descriptor to an inoperable state in the event that the predecessor instruction of the first processing instruction is currently operating on (e.g., writing to or reading from) the descriptor. In this state, the first processing instruction cannot be executed, and may be blocked or cached. Conversely, in the event that there is no prologue instruction currently operating on the descriptor, the tension control module may set the current state of the descriptor to an operable state. In this state, the first processing instruction can be executed.
In a possible implementation manner, when the descriptor content is stored in a register TR (tensor register), the state correspondence table of the descriptor of the sheet control module may also store the usage condition of the TR, so as to determine whether the TR is occupied or released, thereby implementing management of limited register resources.
In this way, the dependency relationship between the instructions can be judged according to the states of the descriptors, so that the execution sequence of the instructions is ensured, and the correctness of data processing is ensured.
In one possible implementation, the first processing instruction comprises a data access instruction, the operand comprises source data and destination data,
wherein the control unit is configured to:
determining, by the tension control module, a descriptor storage space for a descriptor when at least one of the source data and the destination data includes an identification of the descriptor;
obtaining the content of the descriptor from the descriptor storage space;
determining, by the tension control module, a first data address of the source data and/or a second data address of the destination data according to contents of the descriptor;
data is read from the first data address and written to the second data address.
For example, an operand of a data access instruction includes source data and destination data for reading data from a data address of the source data and writing to a data address of the destination data. When the first processing instruction is a data access instruction, access to tensor data may be achieved by descriptors. Where at least one of the source data and the destination data of the data access instruction includes an identification of a descriptor, a descriptor storage space for the descriptor may be determined by the tensor control module.
In one possible implementation, if the source data includes an identifier of a first descriptor and the destination data includes an identifier of a second descriptor, the control unit may determine, by the tensor control module, a first descriptor storage space of the first descriptor and a second descriptor storage space of the second descriptor respectively; then reading the content of the first descriptor and the content of the second descriptor from the first descriptor storage space and the second descriptor storage space respectively; respectively calculating a first data address of the source data and a second data address of the target data through a tensor control module according to the contents of the first descriptor and the second descriptor; and reading the data from the first data address and writing the data into the second data address, thereby completing the whole access process.
For example, the source data may be data to be read under a chip, the first descriptor of which is identified as TR1, and the destination data is a block of storage space on the chip, the second descriptor of which is identified as TR 2. The control unit 11 may retrieve the content D1 of the first descriptor and the content D2 of the second descriptor from the descriptor storage space, respectively, based on the identification TR1 of the first descriptor in the source data and the identification TR2 of the second descriptor in the destination data. Wherein, the content D1 of the first descriptor and the content D2 of the second descriptor may be represented as follows:
the control unit 11 can obtain the starting physical address PA3 of the source data and the starting physical address PA4 of the destination data respectively through the tensor control module according to the obtained content D1 of the first descriptor and the obtained content D2 of the second descriptor, which are respectively expressed as follows:
PA3=PA_start1+(offset_y1-1)*ori_x1+offset_x1
PA4=PA_start2+(offset_y2-1)*ori_x2+offset_x2
the control unit 11 may determine the first data address and the second data address respectively through the tensor control module according to the starting physical address PA3 of the source data and the starting physical address PA4 of the destination data, and the content D1 of the first descriptor and the content D2 of the second descriptor, and read data from the first data address and write the second data address (through the IO path), thereby completing loading the tensor data indicated by D1 into the storage space indicated by D2.
In one possible implementation, if only the active data includes an identification of the first descriptor, the control unit may determine, by the tensor control module, a first descriptor storage space of the first descriptor; then reading the content of the first descriptor from the first descriptor storage space; calculating a first data address of the source data through a tensor control module according to the content of the first descriptor; and reading data from the first data address and writing the data into a second data address according to the second data address of the destination data in the operand of the instruction, thereby completing the whole access process.
In one possible implementation, if only the destination data includes an identification of the second descriptor, the control unit may determine, by the tensor control module, a second descriptor storage space for the second descriptor; then reading the content of the second descriptor from the second descriptor storage space; calculating a second data address of the target data through a tensor control unit according to the content of the second descriptor; and reading data from the first data address and writing the data into the second data address according to the first data address of the source data in the operand of the instruction, thereby completing the whole access process.
In this way, the descriptor can be used to complete the data access without the need to pass in the data address through the instruction at each access, thereby improving data access efficiency.
In one possible implementation, the first processing instruction comprises an arithmetic instruction, wherein the control unit 11 is configured to:
sending the data address and the first processing instruction to the execution unit when the first processing instruction is an arithmetic instruction,
wherein the execution unit is configured to:
and executing the operation corresponding to the first processing instruction according to the received data address.
For example, when the first processing instruction is an operation instruction, the operation of the tensor data may be realized by the descriptor. When the operand of the operation instruction comprises the identifier of the descriptor, the control unit can determine the descriptor storage space of the descriptor through the tensor control module, then read the content of the descriptor from the descriptor storage space, calculate the address of the data corresponding to the operand through the tensor control module according to the content of the descriptor, and then send the data address and the first processing instruction to the execution unit; and the execution unit reads data from the data address to perform operation according to the received data address, so that the whole operation process is completed.
For example, for an operation instruction Add; a; b, if the operands a and B respectively include identifiers TR3 and TR4 of descriptors, the control unit may determine a descriptor storage space corresponding to TR3 and TR4 through the tensor control module, read contents (such as shape parameters and address parameters) in the descriptor storage space and calculate data addresses of data a and B through the tensor control module according to the contents of the descriptors, for example, data address 1 of data a in the memory is ADDR64-ADDR127, data address 2 of data B in the memory is ADDR1023-ADDR1087, and then send the data address 1 and data address 2 and Add instructions to the execution unit; the execution unit can read data from the data address 1 and the data address 2 respectively, and perform an addition (Add) operation to obtain an operation result (a + B).
In this way, the descriptor can be used for finishing the reading of data during operation, and a data address does not need to be transmitted through an instruction, so that the data operation efficiency is improved.
According to the data processing device disclosed by the embodiment of the disclosure, the descriptor capable of indicating the tensor shape is introduced, so that the address of data can be determined through the descriptor in the data processing instruction operation process, and the instruction generation mode is simplified from the aspect of hardware, so that the complexity of data access is reduced, and the efficiency of data access of a processor is improved.
In a possible implementation manner, an artificial intelligence chip is also disclosed, which comprises the data processing device.
In a possible implementation manner, a board card is further disclosed, which comprises a storage device, an interface device, a control device and the artificial intelligence chip; wherein, the artificial intelligence chip is respectively connected with the storage device, the control device and the interface device; the storage device is used for storing data; the interface device is used for realizing data transmission between the artificial intelligence chip and external equipment; and the control device is used for monitoring the state of the artificial intelligence chip.
Fig. 3 shows a block diagram of a board according to an embodiment of the present disclosure, and referring to fig. 3, the board may include other kit components besides the artificial intelligence chip 389, where the kit components include, but are not limited to: memory device 390, interface device 391 and control device 392;
the memory device 390 is connected to the artificial intelligence chip through a bus for storing data. The memory device may include a plurality of groups of memory cells 393. Each group of the storage units is connected with the artificial intelligence chip through a bus. It is understood that each group of the memory cells may be a DDR SDRAM (Double Data Rate SDRAM).
DDR can double the speed of SDRAM without increasing the clock frequency. DDR allows data to be read out on the rising and falling edges of the clock pulse. DDR is twice as fast as standard SDRAM. In one embodiment, the storage device may include 4 sets of the storage unit. Each group of the memory cells may include a plurality of DDR4 particles (chips). In one embodiment, the artificial intelligence chip may include 4 72-bit DDR4 controllers, and 64 bits of the 72-bit DDR4 controller are used for data transmission, and 8 bits are used for ECC check. It can be understood that when DDR4-3200 particles are adopted in each group of memory cells, the theoretical bandwidth of data transmission can reach 25600 MB/s.
In one embodiment, each group of the memory cells includes a plurality of double rate synchronous dynamic random access memories arranged in parallel. DDR can transfer data twice in one clock cycle. And a controller for controlling DDR is arranged in the artificial intelligence chip and is used for controlling data transmission and data storage of each storage unit.
The interface device is electrically connected with the artificial intelligence chip. The interface device is used for realizing data transmission between the artificial intelligence chip and external equipment (such as a server or a computer). For example, in one embodiment, the interface device may be a standard PCIE interface. For example, the data to be processed is transmitted to the artificial intelligence chip by the server through a standard PCIE interface, so that data transfer is realized. Preferably, when PCIE 3.0X 16 interface transmission is adopted, the theoretical bandwidth can reach 16000 MB/s. In another embodiment, the interface device may also be another interface, and the present application does not limit the concrete expression of the other interface, and the interface unit may implement the switching function. In addition, the calculation result of the chip is still transmitted back to an external device (e.g., a server) by the interface device.
The control device is electrically connected with the artificial intelligence chip. The control device is used for monitoring the state of the artificial intelligence chip. Specifically, the artificial intelligence chip and the control device can be electrically connected through an SPI interface. The control device may include a single chip Microcomputer (MCU). As the artificial intelligence chip can comprise a plurality of processing chips, a plurality of processing cores or a plurality of processing circuits, a plurality of loads can be driven. Therefore, the artificial intelligence chip can be in different working states such as multi-load and light load. The control device can realize the regulation and control of the working states of a plurality of processing chips, a plurality of processing circuits and/or a plurality of processing circuits in the artificial intelligence chip.
In one possible implementation, an electronic device is disclosed that includes the artificial intelligence chip described above. The electronic device comprises a data processing device, a robot, a computer, a printer, a scanner, a tablet computer, an intelligent terminal, a mobile phone, a vehicle data recorder, a navigator, a sensor, a camera, a server, a cloud server, a camera, a video camera, a projector, a watch, an earphone, a mobile storage, a wearable device, a vehicle, a household appliance, and/or a medical device.
The vehicle comprises an airplane, a ship and/or a vehicle; the household appliances comprise a television, an air conditioner, a microwave oven, a refrigerator, an electric cooker, a humidifier, a washing machine, an electric lamp, a gas stove and a range hood; the medical equipment comprises a nuclear magnetic resonance apparatus, a B-ultrasonic apparatus and/or an electrocardiograph.
A1, a data processing device, the device comprising a control unit and an execution unit, the control unit comprising a tension control module, wherein the control unit is configured to:
when the operand of the decoded first processing instruction comprises the identifier of the descriptor, determining a descriptor storage space corresponding to the descriptor through the tensor control module according to the identifier of the descriptor, wherein the descriptor is used for indicating the shape of the tensor;
obtaining the content of the descriptor from the descriptor storage space;
determining, by the tension control module, a data address of data corresponding to an operand of the first processing instruction in a data storage space according to contents of the descriptor;
and executing data processing corresponding to the first processing instruction through the tensor control module according to the data address.
A2, the apparatus of claim A1, the control unit further configured to:
when the first processing instruction is a descriptor registration instruction, acquiring registration parameters of a descriptor in the first processing instruction, wherein the registration parameters comprise at least one of an identifier of the descriptor, a tensor shape and content of tensor data indicated by the descriptor;
determining, by the tensor control module, a first storage area of the content of the descriptor in a descriptor storage space and a second storage area of the content of tensor data indicated by the descriptor in a data storage space according to the registration parameters of the descriptor;
determining the content of the descriptor according to the registration parameters of the descriptor and the second storage area so as to establish the corresponding relationship between the descriptor and the second storage area;
storing the contents of the descriptor in the first storage area.
A3, the apparatus of claim a1 or a2, the control unit further configured to:
when the first processing instruction is a descriptor cancelling instruction, acquiring the identifier of the descriptor in the first processing instruction;
and according to the identifier of the descriptor, respectively releasing the storage area of the descriptor in the descriptor storage space and the storage area of the content of the tensor data indicated by the descriptor in the data storage space through the tensor control module.
A4, the apparatus of any one of claims a1-A3, the control unit further configured to:
when the first processing instruction is a descriptor modification instruction, obtaining modification parameters of a descriptor in the first processing instruction, wherein the modification parameters comprise at least one of identification of the descriptor, tensor shape to be modified and content of tensor data indicated by the descriptor;
determining the content to be updated of the descriptor through the tension control module according to the modification parameter of the descriptor;
and updating the content of the descriptor in the descriptor storage space and/or the content of tensor data in the data storage space through the tensor control module according to the content to be updated.
A5, the apparatus of any one of claims a1-a4, the control unit further comprising a dependency determination module, wherein the control unit is further configured to:
determining whether a second processing instruction with a dependency relationship exists through a dependency relationship judging module according to the identifier of the descriptor, wherein the second processing instruction comprises a processing instruction which is in an instruction queue and is before the first processing instruction and has the identifier of the descriptor in an operand;
blocking or caching the first processing instruction when there is a second processing instruction that has a dependency and does not complete processing.
A6, the apparatus of any one of claims a1-a5, the control unit further configured to:
determining, by the tension control module, a current state of the descriptor according to the identifier of the descriptor, the state of the descriptor including an operable state or an inoperable state;
blocking or caching the first processing instruction when the descriptor is currently in the inoperable state.
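The state gating of claim A6 might look like the following sketch, where the two-state enum and the pending list are hypothetical stand-ins for the hardware mechanism:

```python
# Illustrative sketch of descriptor-state gating (claim A6).
from enum import Enum

class DescriptorState(Enum):
    OPERABLE = 0
    INOPERABLE = 1  # e.g. the descriptor is being modified by another instruction

def dispatch(states, descriptor_id, instruction, pending):
    """Issue the instruction if its descriptor is operable; otherwise
    cache it (here: append to a pending list) and report a block."""
    if states[descriptor_id] is DescriptorState.INOPERABLE:
        pending.append(instruction)  # cache (or block) the instruction
        return False
    return True  # safe to issue
```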
A7, the apparatus of any one of claims A1-A6, the first processing instruction comprising a data access instruction, the operand comprising source data and destination data,
wherein the control unit is configured to:
determining, through the tensor control module, the descriptor storage space of the descriptor when at least one of the source data and the destination data includes the identifier of the descriptor;
obtaining the content of the descriptor from the descriptor storage space;
determining, through the tensor control module, a first data address of the source data and/or a second data address of the destination data according to the content of the descriptor;
reading data from the first data address and writing the data to the second data address.
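The data-access path of claim A7 (resolve both addresses from descriptor content, then copy) can be sketched as follows; the flat list standing in for memory and the `base_address` field are illustrative assumptions:

```python
# Illustrative sketch of a descriptor-driven data-access instruction (claim A7).
# memory is modeled as a flat Python list; descriptor content records a base address.

def resolve_address(descriptor_storage, ident):
    # The content of the descriptor yields the data address.
    return descriptor_storage[ident]["base_address"]

def execute_access(memory, descriptor_storage, src_id, dst_id, length):
    src = resolve_address(descriptor_storage, src_id)  # first data address
    dst = resolve_address(descriptor_storage, dst_id)  # second data address
    # Read from the first data address, write to the second.
    memory[dst:dst + length] = memory[src:src + length]
```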
A8, the apparatus of any one of claims A1-A7, the first processing instruction comprising an arithmetic instruction,
wherein the control unit is further configured to:
sending the data address and the first processing instruction to the execution unit when the first processing instruction is an arithmetic instruction,
wherein the execution unit is configured to:
executing the operation corresponding to the first processing instruction according to the received data address.
A9, the apparatus of any one of claims A1-A8, the descriptor indicating a shape of tensor data of dimension N, N being an integer greater than or equal to zero,
wherein the content of the descriptor comprises at least one shape parameter representing a shape of tensor data.
A10, the apparatus of claim A9, the descriptor further for indicating an address of the tensor data of the N dimensions, wherein the content of the descriptor further comprises at least one address parameter representing the address of the tensor data.
A11, the apparatus of claim A10, wherein the address parameters of the tensor data comprise a reference address of a data reference point of the descriptor in the data storage space of the tensor data;
wherein the shape parameters of the tensor data comprise at least one of:
the size of the data storage space in at least one of the N dimensional directions, the size of the storage area in at least one of the N dimensional directions, the offset of the storage area in at least one of the N dimensional directions, the positions, relative to the data reference point, of at least two vertices at diagonally opposite corners in the N dimensional directions, and the mapping relationship between the data description positions and the data addresses of the tensor data indicated by the descriptor.
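One plausible reading of these shape/address parameters (claim A11) is a strided, row-major mapping from an N-dimensional data description position to a data address; the sketch below assumes that interpretation, which the patent does not spell out:

```python
# Illustrative sketch: mapping a data description position to a data address
# from a reference (base) address, the per-dimension sizes of the data storage
# space, and the per-dimension offsets of the storage area (claim A11).
# Row-major order is an assumption for illustration.

def data_address(base, space_sizes, area_offsets, position):
    """address = base + sum over dims of (offset_d + position_d) * stride_d,
    with strides derived from the storage-space sizes (row-major)."""
    addr = base
    stride = 1
    for size, off, pos in zip(reversed(space_sizes),
                              reversed(area_offsets),
                              reversed(position)):
        addr += (off + pos) * stride
        stride *= size
    return addr
```

For a 4x8 storage space with the storage area offset by (1, 2), position (0, 3) maps to base + (2+3)*1 + (1+0)*8.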
A12, the apparatus of any one of claims A1-A11, the control unit further configured to:
the method comprises the steps of decoding a received first processing instruction to obtain a decoded first processing instruction, wherein the decoded first processing instruction comprises an operation code and one or more operands, and the operation code is used for indicating a processing type corresponding to the first processing instruction.
A13, the apparatus of any one of claims A1-A12, the descriptor storage space being a storage space in an internal memory of the control unit, the data storage space being a storage space in the internal memory of the control unit or an external memory connected to the control unit.
A14, an artificial intelligence chip, the chip comprising a data processing device according to any of claims A1-A13.
A15, an electronic device comprising the artificial intelligence chip of claim A14.
A16, a board card, comprising: a storage device, an interface device, a control device, and the artificial intelligence chip of claim A14;
wherein, the artificial intelligence chip is respectively connected with the storage device, the control device and the interface device;
the storage device is used for storing data;
the interface device is used for realizing data transmission between the artificial intelligence chip and external equipment;
and the control device is used for monitoring the state of the artificial intelligence chip.
A17, the board card of claim A16, wherein the storage device comprises a plurality of groups of storage units, each group of storage units being connected with the artificial intelligence chip through a bus, and the storage units are DDR SDRAM;
the chip comprises a DDR controller for controlling data transmission to and data storage of each storage unit;
the interface device is a standard PCIe interface.
Having described embodiments of the present disclosure, it is noted that the foregoing description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or improvements over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (10)
1. A data processing apparatus, characterized in that the apparatus comprises a control unit and an execution unit, the control unit comprising a tensor control module, wherein the control unit is configured to:
when the operand of the decoded first processing instruction comprises the identifier of the descriptor, determining a descriptor storage space corresponding to the descriptor through the tensor control module according to the identifier of the descriptor, wherein the descriptor is used for indicating the shape of the tensor;
obtaining the content of the descriptor from the descriptor storage space;
determining, through the tensor control module according to the content of the descriptor, a data address in a data storage space of the data corresponding to the operand of the first processing instruction;
executing data processing corresponding to the first processing instruction through the tensor control module according to the data address.
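The four steps of claim 1 (identifier to descriptor storage space, to descriptor content, to data address, to processing) can be sketched end to end as follows; the dictionary storage, the `READ` operation, and the field names are illustrative assumptions:

```python
# Illustrative end-to-end sketch of claim 1: when an operand carries a
# descriptor identifier, look up the descriptor storage space, read the
# descriptor content, derive the data address, and perform the processing.

def process(descriptor_storage, memory, instruction):
    op, ident = instruction["opcode"], instruction["operand"]
    content = descriptor_storage[ident]   # content of the descriptor
    addr = content["base_address"]        # data address derived from the content
    if op == "READ":
        return memory[addr:addr + content["length"]]
    raise NotImplementedError(op)         # other processing types omitted
```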
2. The apparatus of claim 1, wherein the control unit is further configured to:
when the first processing instruction is a descriptor registration instruction, acquiring registration parameters of a descriptor in the first processing instruction, wherein the registration parameters comprise at least one of an identifier of the descriptor, a tensor shape and content of tensor data indicated by the descriptor;
determining, by the tensor control module, a first storage area of the content of the descriptor in a descriptor storage space and a second storage area of the content of tensor data indicated by the descriptor in a data storage space according to the registration parameters of the descriptor;
determining the content of the descriptor according to the registration parameters of the descriptor and the second storage area so as to establish the corresponding relationship between the descriptor and the second storage area;
storing the contents of the descriptor in the first storage area.
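The registration flow of claim 2 (allocate a first storage area for the descriptor content and a second storage area for the tensor data, then bind them) might be sketched as below; the bump allocator stands in for storage management the patent does not specify:

```python
# Illustrative sketch of descriptor registration (claim 2). A simple bump
# allocator models allocation of the second storage area in the data
# storage space; a dict models the descriptor storage space.

class DescriptorTable:
    def __init__(self, data_space_size):
        self.table = {}        # descriptor storage space (first storage areas)
        self.next_free = 0     # bump pointer into the data storage space
        self.data_space_size = data_space_size

    def register(self, ident, shape):
        length = 1
        for dim in shape:
            length *= dim
        assert self.next_free + length <= self.data_space_size, "data space full"
        base = self.next_free          # second storage area for the tensor data
        self.next_free += length
        # The descriptor content binds the shape to the second storage area.
        self.table[ident] = {"shape": shape, "base_address": base,
                             "length": length}
        return base
```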
3. The apparatus of claim 1, wherein the control unit is further configured to:
when the first processing instruction is a descriptor cancelling instruction, acquiring the identifier of the descriptor in the first processing instruction;
releasing, through the tensor control module according to the identifier of the descriptor, the storage area of the descriptor in the descriptor storage space and the storage area of the content of the tensor data indicated by the descriptor in the data storage space.
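The release path of claim 3 frees both storage areas keyed by the descriptor identifier; the free-list representation below is a hypothetical sketch:

```python
# Illustrative sketch of a descriptor release instruction (claim 3): free
# both the descriptor's entry in the descriptor storage space and the
# tensor data's storage area, keyed by the descriptor identifier.

def release(descriptor_table, free_list, ident):
    content = descriptor_table.pop(ident)  # release the descriptor storage area
    # Release the tensor data's storage area back to a free list.
    free_list.append((content["base_address"], content["length"]))
```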
4. The apparatus of claim 1, wherein the control unit is further configured to:
when the first processing instruction is a descriptor modification instruction, obtaining modification parameters of the descriptor in the first processing instruction, wherein the modification parameters comprise at least one of the identifier of the descriptor, the tensor shape to be modified, and the content of the tensor data indicated by the descriptor;
determining the content to be updated of the descriptor through the tensor control module according to the modification parameters of the descriptor;
updating, through the tensor control module according to the content to be updated, the content of the descriptor in the descriptor storage space and/or the content of the tensor data in the data storage space.
5. The apparatus of claim 1, wherein the control unit further comprises a dependency determination module, wherein the control unit is further configured to:
determining, through the dependency determination module according to the identifier of the descriptor, whether a second processing instruction having a dependency relationship exists, wherein the second processing instruction is a processing instruction that precedes the first processing instruction in the instruction queue and has the identifier of the descriptor in an operand;
blocking or caching the first processing instruction when there is a second processing instruction that has a dependency relationship and has not completed processing.
6. The apparatus of claim 1, wherein the control unit is further configured to:
determining, through the tensor control module, the current state of the descriptor according to the identifier of the descriptor, the state of the descriptor comprising an operable state or an inoperable state;
blocking or caching the first processing instruction when the descriptor is currently in the inoperable state.
7. The apparatus of claim 1, wherein the first processing instruction comprises a data access instruction, the operand comprising source data and destination data,
wherein the control unit is configured to:
determining, through the tensor control module, the descriptor storage space of the descriptor when at least one of the source data and the destination data includes the identifier of the descriptor;
obtaining the content of the descriptor from the descriptor storage space;
determining, through the tensor control module, a first data address of the source data and/or a second data address of the destination data according to the content of the descriptor;
reading data from the first data address and writing the data to the second data address.
8. An artificial intelligence chip, wherein the chip comprises a data processing apparatus according to any one of claims 1 to 7.
9. An electronic device, characterized in that the electronic device comprises an artificial intelligence chip according to claim 8.
10. A board card, characterized in that the board card comprises: a storage device, an interface device, a control device, and the artificial intelligence chip of claim 8;
wherein, the artificial intelligence chip is respectively connected with the storage device, the control device and the interface device;
the storage device is used for storing data;
the interface device is used for realizing data transmission between the artificial intelligence chip and external equipment;
and the control device is used for monitoring the state of the artificial intelligence chip.
Priority Applications (12)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910272513.0A CN111782274B (en) | 2019-04-04 | 2019-04-04 | Data processing device and related product |
KR1020207036316A KR102611169B1 (en) | 2019-04-04 | 2020-04-01 | Data processing apparatus and related product |
JP2021510523A JP7073581B2 (en) | 2019-04-04 | 2020-04-01 | Data processing equipment and related products |
KR1020207032017A KR20200142536A (en) | 2019-04-04 | 2020-04-01 | Data processing devices and related products |
EP20785318.5A EP3951666A4 (en) | 2019-04-04 | 2020-04-01 | Data processing apparatus and related product |
PCT/CN2020/082803 WO2020200246A1 (en) | 2019-04-04 | 2020-04-01 | Data processing apparatus and related product |
KR1020207036312A KR102611162B1 (en) | 2019-04-04 | 2020-04-01 | Data processing apparatus and related product |
JP2020198245A JP7150803B2 (en) | 2019-04-04 | 2020-11-30 | Data processing equipment and related products |
JP2020198200A JP7121103B2 (en) | 2019-04-04 | 2020-11-30 | Data processing equipment and related products |
US17/489,671 US11385895B2 (en) | 2019-04-04 | 2021-09-29 | Data processing apparatus and related products |
US17/849,182 US11886880B2 (en) | 2019-04-04 | 2022-06-24 | Data processing apparatus and related products with descriptor management |
US18/531,734 US20240111536A1 (en) | 2019-04-04 | 2023-12-07 | Data processing apparatus and related products |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111782274A true CN111782274A (en) | 2020-10-16 |
CN111782274B CN111782274B (en) | 2023-03-31 |
Family
ID=72755023
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910272513.0A Active CN111782274B (en) | 2019-04-04 | 2019-04-04 | Data processing device and related product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111782274B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103562897A (en) * | 2011-06-10 | 2014-02-05 | 国际商业机器公司 | Store storage class memory information command |
CN107077327A (en) * | 2014-06-30 | 2017-08-18 | 微体系统工程有限公司 | System and method for expansible wide operand instruction |
CN107347253A (en) * | 2015-02-25 | 2017-11-14 | 米雷普里卡技术有限责任公司 | Hardware instruction generation unit for application specific processor |
CN109543832A (en) * | 2018-11-27 | 2019-03-29 | 北京中科寒武纪科技有限公司 | A kind of computing device and board |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114064585A (en) * | 2021-11-10 | 2022-02-18 | 南京信易达计算技术有限公司 | Storage compression system based on domestic AI chip architecture and control method |
CN114064585B (en) * | 2021-11-10 | 2023-10-13 | 南京信易达计算技术有限公司 | Storage compression system based on domestic AI chip architecture and control method |
Also Published As
Publication number | Publication date |
---|---|
CN111782274B (en) | 2023-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111782133A (en) | Data processing method and device and related product | |
JP7239547B2 (en) | Data processing method, apparatus, and related products | |
US12112166B2 (en) | Data processing method and apparatus, and related product for increased efficiency of tensor processing | |
CN111857828B (en) | Processor operation method and device and related product | |
US20240111536A1 (en) | Data processing apparatus and related products | |
CN111782274B (en) | Data processing device and related product | |
CN111831337B (en) | Data synchronization method and device and related product | |
CN113807507B (en) | Data processing method and device and related products | |
CN111783992A (en) | Data processing device and related product | |
CN111831329B (en) | Data processing method and device and related product | |
CN111782267B (en) | Data processing method and device and related product | |
CN113806246A (en) | Data processing device and method and related product | |
CN111831722A (en) | Data synchronization method and device and related product | |
CN111857829B (en) | Processor operation method and device and related products | |
CN114489790A (en) | Data processing device, data processing method and related product | |
CN114489789A (en) | Processing device, processing method and related product | |
CN114489788A (en) | Instruction processing device, instruction processing method and related product | |
CN111857829A (en) | Processor operation method and device and related product | |
CN114489804A (en) | Processing method, processing device and related product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |