CN114282159A - Data processing device, integrated circuit chip, equipment and method for realizing the same - Google Patents

Data processing device, integrated circuit chip, equipment and method for realizing the same

Info

Publication number
CN114282159A
Authority
CN
China
Prior art keywords
data
dimensional
descriptor
address
dimensional data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011036302.6A
Other languages
Chinese (zh)
Inventor
不公告发明人
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cambricon Technologies Corp Ltd
Original Assignee
Cambricon Technologies Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cambricon Technologies Corp Ltd filed Critical Cambricon Technologies Corp Ltd
Priority to CN202011036302.6A priority Critical patent/CN114282159A/en
Priority to PCT/CN2021/110357 priority patent/WO2022062682A1/en
Priority to US18/013,976 priority patent/US20230297270A1/en
Priority to EP21871059.8A priority patent/EP4220448A1/en
Publication of CN114282159A publication Critical patent/CN114282159A/en
Pending legal-status Critical Current

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present disclosure relates to a data processing apparatus, a method, an integrated circuit chip, an electronic device, and a board, wherein the data processing apparatus is included in a computing apparatus, the computing apparatus may be included in a combined processing apparatus, the combined processing apparatus may further include a general interconnect interface and other processing apparatuses. The computing device interacts with other processing devices to jointly complete computing operations specified by a user. The combined processing device may further comprise a storage device connected to the computing device and the other processing device, respectively, for storing data of the computing device and the other processing device. The scheme disclosed by the invention can be widely applied to various conversions of multi-dimensional data, and the efficiency of data conversion is improved.

Description

Data processing device, integrated circuit chip, equipment and method for realizing the same
Technical Field
The present disclosure relates generally to the field of data processing. More particularly, the present disclosure relates to a data processing apparatus, an integrated circuit chip, an electronic device, a board card, and a method implemented by the data processing apparatus.
Background
Operations in the field of artificial intelligence typically involve the processing of multidimensional data (e.g., two-dimensional matrices or three-dimensional arrays). Taking the processing of a two-dimensional matrix as an example, the conversion operation may include transposition, rotation, or mirroring. For such conversion operations, it is currently common to use dedicated matrix operation custom circuits to implement them. However, these matrix operation circuits are relatively complex in design, and the interfaces and functions are relatively fixed, and one type of matrix operation circuit can only process a corresponding type of matrix conversion, and cannot perform multiple conversion operations of the matrix according to actual needs. Therefore, how to obtain a data processing device capable of performing a conversion operation on multidimensional data becomes a problem to be solved in the prior art.
Further, in a computing system, an instruction set is a set of instructions for performing computations and controlling the computing system, and plays a critical role in improving the performance of a computing chip (e.g., a processor) in the computing system. Various types of computing chips (particularly those in the field of artificial intelligence) currently utilize associated instruction sets to perform various general or specific control operations and data processing operations. However, current instruction sets suffer from a number of drawbacks. For example, existing instruction sets are constrained by the hardware architecture and offer poor flexibility. Further, current instructions leave room for improvement in the conversion of various data types, particularly in the description and processing of multidimensional data.
Disclosure of Invention
To address at least the technical problems noted in the background section above, and to provide a computing architecture and instruction system for efficiently processing multidimensional data, aspects of the disclosed solution will be described below.
In a first aspect, the present disclosure provides a data processing apparatus comprising data caching circuitry and data conversion circuitry, wherein the data caching circuitry is configured to perform caching of multidimensional data; and
the data conversion circuit is configured to execute a store and read operation on multi-dimensional data in the data cache circuit according to a data conversion instruction to realize data conversion on the multi-dimensional data, wherein the data conversion instruction comprises a descriptor used for indicating the shape of the multi-dimensional data, and the descriptor is used for determining the storage address of the corresponding multi-dimensional data, and the data conversion circuit is configured to execute the store and read operation on the multi-dimensional data according to the storage address.
In a second aspect, the present disclosure provides an integrated circuit chip comprising a data processing apparatus as described in the first aspect above.
In a third aspect, the present disclosure provides an electronic device comprising an integrated circuit chip as described in the second aspect above.
In a fourth aspect, the present disclosure provides a board card comprising the integrated circuit chip as described in the second aspect above.
In a fifth aspect, the present disclosure provides a method implemented by a data processing apparatus, wherein the data processing apparatus comprises a data caching circuit and a data conversion circuit, the method comprising: caching multidimensional data using the data caching circuitry; and performing, using the data conversion circuit, a store and read operation on the multi-dimensional data in the data cache circuit according to a data conversion instruction to implement data conversion on the multi-dimensional data, wherein the data conversion instruction includes a descriptor indicating a shape of the multi-dimensional data, and the descriptor is used to determine a storage address of the corresponding multi-dimensional data, wherein the store and read operation is performed on the multi-dimensional data according to the storage address using the data conversion circuit.
With the data processing apparatus, the integrated circuit chip, the electronic device, the board card, and the method provided in the foregoing aspects, the present disclosure may implement data conversion on data, for example multidimensional data, using a data conversion instruction. Specifically, by performing the storing and reading operations on the data to be converted in the data cache circuit according to the data conversion instruction, the scheme of the disclosure can implement various operations such as addressing, moving, and reshaping the multidimensional data. Further, since the aforementioned data conversion operation is realized by means of an instruction, the scheme of the present disclosure reduces the modifications required to the hardware architecture and improves the efficiency of data conversion. In addition, by using the descriptor, the scheme of the disclosure facilitates addressing and storing the multidimensional data, thereby increasing the execution speed of the storing and reading operations on the multidimensional data and further improving the efficiency of multidimensional data conversion.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar or corresponding parts and in which:
FIG. 1 is a schematic diagram illustrating a data processing apparatus according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating a data storage space for multidimensional data in accordance with an embodiment of the present disclosure;
FIGS. 3-5 are flow diagrams respectively illustrating various types of operation of a data conversion circuit according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram illustrating a computing device in accordance with an embodiment of the present disclosure;
FIG. 7 is a flow diagram illustrating a method implemented by a data processing apparatus according to an embodiment of the present disclosure;
FIG. 8 is a block diagram illustrating a combined processing device according to an embodiment of the present disclosure; and
FIG. 9 is a schematic diagram illustrating a structure of a board card according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, not all embodiments of the present disclosure. All other embodiments, which can be derived by one skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the scope of protection of the present disclosure.
It should be understood that the terms "first," "second," "third," and "fourth," etc. in the claims, description, and drawings of the present disclosure are used to distinguish between different objects and are not used to describe a particular order. The terms "comprises" and "comprising," when used in the specification and claims of this disclosure, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the disclosure herein is for the purpose of describing particular embodiments only, and is not intended to be limiting of the disclosure. As used in the specification and claims of this disclosure, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the term "and/or" as used in the specification and claims of this disclosure refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
As used in this specification and claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram illustrating a data processing apparatus 100 according to an embodiment of the present disclosure. As shown in FIG. 1, the data processing apparatus 100 includes a data caching circuit 102 and a data conversion circuit 104. In one embodiment, the data caching circuit may be configured to perform caching of multidimensional data. In one exemplary application scenario, the multidimensional data of the present disclosure may include tensor data of two or more dimensions. In one embodiment, the data conversion circuit may be configured to perform a store operation and a read operation on the multidimensional data in the aforementioned data caching circuit according to the data conversion instruction, so as to implement data conversion on the multidimensional data. For example, by storing and reading the data to be converted in different ways, aspects of the present disclosure may perform conversion operations on multidimensional data at various spatial locations to obtain converted data. In one embodiment, the aforementioned data conversion instruction may include a descriptor for indicating the shape of the multidimensional data, and the descriptor may be used to determine the storage address of the corresponding multidimensional data. Further, the data conversion circuit may be configured to perform the store and read operations on the multidimensional data according to the storage address.
In one embodiment, the data conversion instruction of the present disclosure may include an identifier of the descriptor and/or contents of the descriptor, and the contents of the descriptor may include at least one shape parameter representing a shape of the multidimensional data and/or at least one address parameter representing an address of the multidimensional data. In another embodiment, the address parameter of the aforementioned multi-dimensional data may include a reference address of a data reference point of the descriptor in a data storage space of the multi-dimensional data. In another embodiment, the data conversion instruction may include data volume information and/or inter-dimension offset information for performing a store and read operation with respect to each dimension of the multi-dimensional data, and wherein the data volume information and/or the inter-dimension offset information may be determined according to address parameters and/or shape parameters in the descriptor.
Based on the discussion of the above embodiments, in one exemplary implementation scenario, the shape parameters of the multi-dimensional data of the present disclosure include at least one of: the size of the data storage space in at least one direction of N dimensional directions, the size of a storage area of the multi-dimensional data in at least one direction of the N dimensional directions, the offset of the storage area in at least one direction of the N dimensional directions, the positions of at least two vertexes at diagonal positions of the N dimensional directions relative to the data reference point, and the mapping relation between the data description position of the multi-dimensional data indicated by the descriptor and the data address, wherein N is an integer greater than or equal to zero.
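The shape and address parameters listed above can be pictured with a small data structure. The following is only a minimal sketch: the class name `Descriptor`, its field names, the row-major mapping, and the use of element-granular addresses are all assumptions made for illustration and are not defined by the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Hypothetical descriptor holding a few of the shape/address parameters."""
    ident: int             # identifier of the descriptor
    base_addr: int         # reference address of the data reference point
    space_size: tuple      # size of the data storage space per dimension
    region_size: tuple     # size of the storage region per dimension
    region_offset: tuple   # offset of the storage region per dimension

    def address(self, pos):
        """Map a position inside the region to a linear address.

        Assumes a row-major layout with the first dimension outermost.
        """
        addr = 0
        for p, off, size, space in zip(
            pos, self.region_offset, self.region_size, self.space_size
        ):
            if not 0 <= p < size:
                raise IndexError("position lies outside the storage region")
            addr = addr * space + (p + off)
        return self.base_addr + addr


# A 2 x 3 region placed at offset (1, 2) inside a 4 x 8 storage space.
desc = Descriptor(ident=0, base_addr=64, space_size=(4, 8),
                  region_size=(2, 3), region_offset=(1, 2))
print(desc.address((0, 0)))  # 74
print(desc.address((1, 2)))  # 84
```

Here the position (0, 0) of the region lands one row and two columns into the storage space, i.e. at 64 + 1 × 8 + 2 = 74.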
By utilizing the above-mentioned descriptors included in the data conversion instructions, the disclosed aspects can perform various types of conversion operations on multi-dimensional data, which can include, but are not limited to, data mirroring operations, data rotation operations from multiple angles, or data transpose operations. Taking three-dimensional data as an example, a transposition, mirroring or multi-angle (e.g., 90 ° or 180 °) rotation operation of the three-dimensional data may be achieved using the scheme of the present disclosure. In one application scenario, when the data to be converted is a matrix to be converted (i.e. a kind of two-dimensional data), the data buffer circuit may include a buffer memory array for buffering the matrix data written or transformed by the data conversion circuit in a storing operation, or for transferring the matrix data to the data conversion circuit in a reading operation, so that the data conversion circuit transfers the matrix data to an external memory or a computing unit for proper conversion.
In one embodiment, when the data to be converted is multidimensional data, the data conversion instruction may include a descriptor including data amount information and inter-dimension offset information for performing store and read operations with respect to each dimension of the multidimensional data. In an example scenario, the data amount information may include the number of data items to be stored and read in each dimension, and the inter-dimension offset information may include the address interval to be spanned from the current dimension to the next dimension. In another example scenario, the address interval is determined according to the number of data items in the current dimension and the storage footprint of each data item.
As an example, when the multidimensional data has the three dimensions of length, width, and height, then for data in the length or width direction (i.e., one dimension), the data amount information may be information on the number, size, and/or occupied space of the data items in the length or width direction. Further, the inter-dimension offset information may be the offset from one-dimensional data in the length or width direction to two-dimensional data composed of length and width, or the offset from two-dimensional data composed of length and width to three-dimensional data composed of length, width, and height. For example, the inter-dimension offset information may be the number of data items and/or the address space offset spanned from the lower dimension to the next higher dimension.
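To make the relation between the data amount information and the inter-dimension offset concrete, the following minimal sketch derives the address interval of each dimension from the number of data items in the lower dimensions and the footprint of one item. The function name `inter_dim_offsets`, the dense (padding-free) layout, and the lowest-dimension-first ordering are assumptions for illustration only.

```python
def inter_dim_offsets(counts, elem_bytes):
    """Return the address interval (in bytes) along each dimension.

    counts[0] is the number of items in the lowest (innermost) dimension.
    Assumes the data is packed densely with no padding between dimensions.
    """
    offsets = []
    step = elem_bytes
    for n in counts:
        offsets.append(step)  # bytes between neighbours along this dimension
        step *= n             # one full span of this dimension
    return offsets


# Length 4, width 3, height 2, with 2-byte elements.
offsets = inter_dim_offsets([4, 3, 2], 2)
print(offsets)  # [2, 8, 24]
```

With these values, moving from one row to the next spans 4 × 2 = 8 bytes, and moving from one length-by-width plane to the next spans 3 × 8 = 24 bytes.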
To facilitate reading and writing multidimensional data, the present disclosure also proposes that an M-dimensional counter may be defined based on the descriptor, i.e., there are M one-dimensional counters with periods n_1, n_2, n_3, …, n_M, respectively. In counting, when the n-th counter has counted one full period n_n (e.g., from 0 to n_n), the n-th counter may be reset to zero and the (n+1)-th counter incremented by 1. Based on this definition of the M-dimensional counter, the present disclosure proposes maintaining an M-dimensional read counter and read pointer, and an M-dimensional write counter and write pointer. The M-dimensional read counter can be expressed as R_cnt(i_1, i_2, i_3, …, i_M), and accordingly the read pointer can be expressed as R_p = R_addr + i_1*s_0 + i_2*s_1 + … + i_M*s_{M-1}, where R_addr is the read base address. In the reading process, the M-dimensional counter R_cnt is incremented by one after each reading of R_n0 elements. Similarly, the M-dimensional write counter can be expressed as W_cnt(i_1, i_2, i_3, …, i_M), and accordingly the write pointer can be expressed as W_p = W_addr + i_1*s_0 + i_2*s_1 + … + i_M*s_{M-1}, where W_addr is the write base address.
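The cascaded counters and the pointer formula can be sketched as follows. This is an illustrative model, not the circuit itself: the class name `MDimCounter` is hypothetical, each counter is assumed to run from 0 to its period minus 1 before wrapping, the coefficients s_0 … s_{M-1} are assumed to be per-dimension address offsets, and the counter is stepped once per element rather than once per R_n0 elements.

```python
class MDimCounter:
    """M cascaded one-dimensional counters (hypothetical helper)."""

    def __init__(self, periods, base_addr, strides):
        assert len(periods) == len(strides)
        self.periods = list(periods)      # n_1 .. n_M
        self.strides = list(strides)      # s_0 .. s_{M-1}, assumed offsets
        self.base_addr = base_addr        # read or write base address
        self.index = [0] * len(periods)   # (i_1, ..., i_M)

    def pointer(self):
        # base + i_1*s_0 + i_2*s_1 + ... + i_M*s_{M-1}
        return self.base_addr + sum(
            i * s for i, s in zip(self.index, self.strides)
        )

    def step(self):
        # Increment the lowest counter; on wrap, reset it and carry upward.
        for d in range(len(self.index)):
            self.index[d] += 1
            if self.index[d] < self.periods[d]:
                return
            self.index[d] = 0


counter = MDimCounter(periods=[3, 2], base_addr=100, strides=[1, 10])
addresses = []
for _ in range(6):
    addresses.append(counter.pointer())
    counter.step()
print(addresses)  # [100, 101, 102, 110, 111, 112]
```

After the lowest counter completes its period of 3, it resets and the second counter advances, which moves the pointer by the second offset (10 in this example).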
Based on the above-described M-dimensional read and write counters, the scheme of the present disclosure may implement data conversion of data to be converted by storing R_n0 elements into the data cache circuit via the data conversion circuit, and then reading W_n0 elements from the data cache circuit via the data conversion circuit. For example, the data conversion circuit may perform selective output of partial data on multidimensional data, rotate the data at an angle, mirror or transpose the data, or the like, using the aforementioned data storing and reading operations, according to a data conversion instruction.
For example, in one embodiment, the data conversion instruction of the present disclosure may further include the aforementioned store base address information ("W_addr") and read base address information ("R_addr") according to the descriptor. In performing the store and read operations, the data conversion circuit may address the next dimension according to the store base address information and the inter-dimension offset information to perform the store operation, and address the next dimension according to the read base address information and the inter-dimension offset information to perform the read operation. It can be seen that, by utilizing base address information, the data conversion circuit can more accurately and efficiently locate the multidimensional data on which store and read operations are to be performed. Furthermore, introducing the base address information broadens the ways of locating multidimensional data and enlarges the addressing space. In addition, by introducing the base address information and the inter-dimension offset information, the data processing apparatus of the present disclosure may implement various types of operations on the multidimensional data based on the data conversion instruction, such as one or more of a bypass operation, a multi-angle rotation operation, a mirroring operation, or an order transformation operation.
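How a base address combined with inter-dimension offsets can select different operations may be illustrated with a flat buffer. The sketch below is an assumption-laden model: the function name `read_by_descriptor`, the element-granular addresses, and the use of negative offsets to walk backwards are illustrative choices, not details given in the disclosure.

```python
def read_by_descriptor(buffer, base, counts, offsets):
    """Read elements from a flat buffer, lowest dimension first.

    counts[d]  - number of elements to read along dimension d
    offsets[d] - address step along dimension d (may be negative)
    """
    out = []

    def walk(dim, addr):
        if dim < 0:
            out.append(buffer[addr])
            return
        for i in range(counts[dim]):
            walk(dim - 1, addr + i * offsets[dim])

    walk(len(counts) - 1, base)
    return out


cache = [1, 2, 3, 4, 5, 6]  # a 2 x 3 matrix stored row by row

# Bypass: start at the first element and walk forward.
bypass = read_by_descriptor(cache, base=0, counts=[3, 2], offsets=[1, 3])
# 180-degree rotation: start at the last element and walk backward.
rot180 = read_by_descriptor(cache, base=5, counts=[3, 2], offsets=[-1, -3])

print(bypass)  # [1, 2, 3, 4, 5, 6]
print(rot180)  # [6, 5, 4, 3, 2, 1]
```

Only the base address and the sign of the offsets differ between the two reads; the buffer contents are untouched.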
In one exemplary implementation scenario, when the multi-dimensional data is implemented as a two-dimensional matrix, then the data caching circuitry of the present disclosure may include a cache storage array. In one embodiment, the size of the cache memory array may be determined according to the number of rows X and the number of columns Y of the matrix to be converted and the memory space K occupied by the basic elements in the matrix.
Specifically, according to the number of rows X, the number of columns Y, and the storage space K occupied by the basic elements of the matrix to be converted, the data processing apparatus of the present disclosure may further include a cache storage array having a size matching the size of the matrix to be converted, where the storage space occupied by the basic elements of the cache storage array is greater than or equal to K, the number of rows of the cache storage array is greater than or equal to the larger of X and Y, and the number of columns of the cache storage array is greater than or equal to the larger of X and Y. The size of the cache memory array is set to meet the requirement that the matrix to be converted can be stored in the cache memory array according to a preset access mode. For example, when X is not equal to Y, the number of rows and columns is interchanged under the transpose operation, and the number of rows and columns of the cache memory array formed in the above arrangement can support such a change in the number of rows and columns during the matrix conversion process. Of course, when X is equal to Y, the number of rows in the cache memory array is greater than or equal to any one of X and Y, and the number of columns in the cache memory array is greater than or equal to any one of X and Y.
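The sizing rule above amounts to taking the larger of X and Y for both sides of the array. A minimal sketch follows, with the function name `cache_array_shape` and the choice of the smallest admissible size being assumptions for illustration.

```python
def cache_array_shape(x_rows, y_cols, k_bytes):
    """Smallest cache array satisfying the stated constraints.

    Rows and columns are both at least max(X, Y); the per-element
    footprint is at least K.
    """
    side = max(x_rows, y_cols)
    return side, side, k_bytes


rows, cols, elem_bytes = cache_array_shape(3, 5, 2)
total_bytes = rows * cols * elem_bytes
print(rows, cols, total_bytes)  # 5 5 50
```

A 3 × 5 matrix and its 5 × 3 transpose both fit in the resulting 5 × 5 array.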
Based on the above-mentioned cache memory array, the shape of the matrix to be transformed (i.e. the two-dimensional data) can be described as (X, Y) by the descriptor of the present disclosure, that is, the multidimensional data is represented as the two-dimensional data by two parameters, and the size of the first dimension (column) of the multidimensional data is Y and the size of the second dimension (row) of the multidimensional data is X. By using the descriptor, the data conversion circuit disclosed by the disclosure can perform corresponding storing and reading operations on the two-dimensional matrix under different scenes, so that various operations on the two-dimensional matrix, such as transposition, rotation of 270 °, rotation of 90 °, rotation of 180 °, mirroring and the like, can be realized.
Taking the transpose operation as an example, assume that the matrix to be converted is an X × Y matrix, where X may or may not be equal to Y. The data conversion circuit may store the 1st row of the X × Y matrix, in order from the 1st basic element to the Y-th basic element, to the 1st through Y-th basic element positions in the 1st row of the cache memory array, and repeat this operation for the 1st through X-th rows of the X × Y matrix until the X-th row has been stored, in order from the 1st basic element to the Y-th basic element, to the 1st through Y-th basic element positions in the X-th row of the cache memory array. This forms an X × Y intermediate matrix, which may be understood as a copy of the matrix to be converted in the cache memory array. Then, the data conversion circuit reads the 1st basic element of each row of the X × Y intermediate matrix, in order from the 1st row to the X-th row, and splices the X basic elements read, in that order, into one row that serves as the 1st row of the transposed matrix. This operation is repeated for the 1st through Y-th basic elements until the Y-th basic element of each row, from the 1st row to the X-th row, has been read and the X basic elements read have been spliced into one row that serves as the Y-th row of the transposed matrix, thereby forming the transposed matrix.
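The store-then-read sequence for the transpose can be modelled in a few lines. This is a software sketch of the described access pattern only; the function name `transpose_via_cache` and the use of nested Python lists for the cache memory array are assumptions for illustration.

```python
def transpose_via_cache(matrix):
    """Transpose by storing row-wise and reading column-wise."""
    x, y = len(matrix), len(matrix[0])
    side = max(x, y)
    cache = [[None] * side for _ in range(side)]

    # Store: row r of the input goes to row r of the cache, element by element.
    for r in range(x):
        for c in range(y):
            cache[r][c] = matrix[r][c]

    # Read: the c-th element of rows 1..X forms row c of the result.
    return [[cache[r][c] for r in range(x)] for c in range(y)]


transposed = transpose_via_cache([[1, 2, 3],
                                  [4, 5, 6]])
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```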
Taking the 270° rotation operation as an example, it is again assumed that the matrix to be converted is an X × Y matrix, where X may or may not be equal to Y. The data conversion circuit may store the 1st row of the X × Y matrix, in order from the Y-th basic element to the 1st basic element, to the 1st through Y-th basic element positions in the 1st row of the cache memory array, and repeat this operation for the 1st through X-th rows of the X × Y matrix until the X-th row has been stored, in order from the Y-th basic element to the 1st basic element, to the 1st through Y-th basic element positions in the X-th row of the cache memory array. This forms an X × Y intermediate matrix, which may be understood as being formed by mirroring each row of the matrix to be converted within the row. Then, the data conversion circuit may read the 1st basic element of each row of the X × Y intermediate matrix, in order from the 1st row to the X-th row, and splice the X basic elements read, in that order, into one row that serves as the 1st row of the rotated matrix. This operation is repeated for the 1st through Y-th basic elements until the Y-th basic element of each row, from the 1st row to the X-th row, has been read and the X basic elements read have been spliced into one row that serves as the Y-th row of the rotated matrix, thereby forming the matrix rotated by 270°.
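The 270° rotation differs from the transpose only in the store step, where each row is written in reverse order. The sketch below models that access pattern; the function name `rotate270_via_cache` is hypothetical, and reading each column from the first row down to the last is an assumption about the intended read order.

```python
def rotate270_via_cache(matrix):
    """Rotate by storing each row reversed, then reading column-wise."""
    x, y = len(matrix), len(matrix[0])
    side = max(x, y)
    cache = [[None] * side for _ in range(side)]

    # Store: row r is written from its Y-th element back to its 1st element.
    for r in range(x):
        for c in range(y):
            cache[r][c] = matrix[r][y - 1 - c]

    # Read: the c-th element of rows 1..X forms row c of the result.
    return [[cache[r][c] for r in range(x)] for c in range(y)]


rotated = rotate270_via_cache([[1, 2, 3],
                               [4, 5, 6]])
print(rotated)  # [[3, 6], [2, 5], [1, 4]]
```

Under this read order the result equals a 270° clockwise rotation of the input, which is the same as a 90° counter-clockwise rotation.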
Taking the mirror operation as an example, it is again assumed that the matrix to be converted is an X × Y matrix, where X may or may not be equal to Y. The data conversion circuit may store the 1st row of the X × Y matrix, in order from the Y-th basic element to the 1st basic element, to the 1st through Y-th basic element positions in the 1st row of the cache memory array, and repeat this operation for the 1st through X-th rows of the X × Y matrix until the X-th row has been stored, in order from the Y-th basic element to the 1st basic element, to the 1st through Y-th basic element positions in the X-th row of the cache memory array. This forms an X × Y intermediate matrix, which may be understood as being formed by mirroring each row of the matrix to be converted within the row. Then, the data conversion circuit may read the X-th row of the X × Y intermediate matrix, in order from the Y-th basic element to the 1st basic element, as the 1st row of the mirrored matrix, and repeat this operation from the X-th row to the 1st row of the intermediate matrix until the 1st row of the X × Y intermediate matrix has been read, in order from the Y-th basic element to the 1st basic element, as the X-th row of the mirrored matrix, thereby forming the mirrored matrix.
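The mirror sequence reverses each row on the way in and again on the way out, while also reading the rows from last to first. A minimal model of that access pattern follows; the function name `mirror_via_cache` is hypothetical and nested lists stand in for the cache memory array.

```python
def mirror_via_cache(matrix):
    """Mirror by storing rows reversed, then reading rows last-to-first,
    each from its Y-th element back to its 1st element."""
    x, y = len(matrix), len(matrix[0])
    side = max(x, y)
    cache = [[None] * side for _ in range(side)]

    # Store: each row is written in reverse element order.
    for r in range(x):
        for c in range(y):
            cache[r][c] = matrix[r][y - 1 - c]

    # Read: start from the X-th row, and read each row in reverse again.
    return [
        [cache[r][c] for c in range(y - 1, -1, -1)]
        for r in range(x - 1, -1, -1)
    ]


mirrored = mirror_via_cache([[1, 2, 3],
                             [4, 5, 6]])
print(mirrored)  # [[4, 5, 6], [1, 2, 3]]
```

Because the two in-row reversals cancel, the net effect of the steps as described is that the order of the rows is reversed while each row keeps its original element order.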
The composition and operation of the data processing apparatus of the present disclosure are described above in connection with fig. 1. Based on the above description, those skilled in the art can understand that the data processing apparatus of the present disclosure transforms multidimensional data using data transformation instructions, which improves the execution efficiency of multidimensional data transformation. In addition, by performing various kinds of store and read operations on the multi-dimensional data using the data conversion circuit to convert the data, the disclosed scheme simplifies the complexity of the multi-dimensional data conversion operation and speeds up the progress of the multi-dimensional data conversion. Thus, the scheme of the disclosure also reduces the overhead of data processing, and in a computing scenario requiring data conversion, improves the computing efficiency and reduces the computing overhead.
FIG. 2 is a schematic diagram illustrating a data storage space for multidimensional data in accordance with an embodiment of the present disclosure. As previously described, the data conversion operations of the present disclosure further include using the descriptors to indicate (or obtain) shape-related information about the multidimensional data to determine the storage addresses of the multidimensional data, so as to obtain and save the multidimensional data through the aforementioned storage addresses. In addition, based on the foregoing description, those skilled in the art will also appreciate that the multidimensional data of the present disclosure may refer to or represent tensor data having dimensions greater than or equal to two dimensions. Therefore, the following description for multidimensional data also applies to tensor data having dimensions greater than or equal to two dimensions.
In one exemplary implementation, the shape of N-dimensional data may be indicated by a descriptor, N being a positive integer greater than or equal to 2, e.g., N = 2 or 3. The multidimensional data may include various forms of data composition; for example, a matrix may be regarded as multidimensional data of 2 or more dimensions, and data in the "HWC" dimensions may be regarded as multidimensional data of 3 dimensions. The shape of the multidimensional data includes information such as the number of dimensions of the multidimensional data and the size of each dimension. For example, for multidimensional data:
[Figure: example multidimensional data (image not reproduced)]
the shape of the multidimensional data can be described as (2, 4) by a descriptor, that is, the multidimensional data is represented as two-dimensional data by two parameters, and the size of the first dimension (column) of the multidimensional data is 2, and the size of the second dimension (row) of the multidimensional data is 4. It should be noted that, the manner in which the descriptor indicates the shape of the multidimensional data is not limited in the present application.
In an exemplary implementation manner, the value of N may be determined according to the dimension (order) of the multidimensional data, or may be set according to the usage requirement of the multidimensional data. For example, when the value of N is 3, the multi-dimensional data is three-dimensional data and the descriptor can thus be used to indicate the shape (e.g., offset, size, etc.) of the three-dimensional data in the three-dimensional directions. It should be understood that the value of N can be set by those skilled in the art according to practical needs, and the disclosure does not limit this.
In an exemplary implementation, the descriptor may include an identification of the descriptor and/or the content of the descriptor. In this case, the identification of the descriptor may be used to distinguish the descriptor. For example, the identifier of the descriptor may be its number; the content of the descriptor may include at least one shape parameter representing a shape of the multi-dimensional data. For example, when the multidimensional data is 3-dimensional data, of the three dimensions of the multidimensional data, the shape parameters of two dimensions may remain fixed, and the content of the descriptor thereof may include a shape parameter representing another dimension of the multidimensional data.
In an exemplary implementation, the identity and/or content of the descriptor may be stored in a descriptor storage space (internal memory), such as a register, an on-chip static random access memory ("SRAM"), or other media cache, or the like. Accordingly, the multidimensional data indicated by the descriptor may be stored in a data storage space (internal memory or external memory), such as an on-chip cache or an off-chip memory, etc. In view of this, the present disclosure does not limit the specific locations of the descriptor storage space and the data storage space.
In an exemplary implementation, the identity of the descriptor, the content of the descriptor, and the multidimensional data indicated by the descriptor may be stored in the same region of the internal memory. For example, a contiguous region of the on-chip cache, with addresses ADDR0-ADDR1023, may be used to store the descriptor and its associated data. The addresses ADDR0-ADDR63 may be used as the descriptor storage space for storing the identification and content of the descriptor, and the addresses ADDR64-ADDR1023 may be used as the data storage space for storing the multidimensional data indicated by the descriptor. Within the descriptor storage space, the addresses ADDR0-ADDR31 may be used to store the identification of the descriptor, and the addresses ADDR32-ADDR63 may be used to store the content of the descriptor. It should be understood that an address ADDR does not denote 1 bit or one byte; it is used herein to denote one address unit. The descriptor storage space, the data storage space, and their specific addresses may be determined by those skilled in the art according to the actual situation, and the present disclosure is not limited thereto.
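The partition in the preceding example can be sketched as a small lookup. This is only an illustration: the constant names and the function `region_of` are hypothetical, and each index stands for one address unit rather than a byte.

```python
# Hypothetical layout taken from the example above: one contiguous on-chip
# region covering ADDR0-ADDR1023. Each index is one address unit, not a byte.
DESCRIPTOR_ID_RANGE = range(0, 32)        # ADDR0-ADDR31: descriptor identifiers
DESCRIPTOR_CONTENT_RANGE = range(32, 64)  # ADDR32-ADDR63: descriptor contents
DATA_RANGE = range(64, 1024)              # ADDR64-ADDR1023: multidimensional data


def region_of(addr: int) -> str:
    """Return which part of the shared region an address unit falls in."""
    if addr in DESCRIPTOR_ID_RANGE:
        return "descriptor_id"
    if addr in DESCRIPTOR_CONTENT_RANGE:
        return "descriptor_content"
    if addr in DATA_RANGE:
        return "data"
    raise ValueError(f"address {addr} is outside ADDR0-ADDR1023")
```

For instance, `region_of(40)` falls in the descriptor content space, while `region_of(64)` is the first address unit of the data storage space.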
In an exemplary implementation, the identity of the descriptor, the content, and the multidimensional data indicated by the descriptor may be stored in different areas of internal memory. For example, a register may be used as a descriptor storage space, the identifier and content of the descriptor may be stored in the register, an on-chip cache may be used as a data storage space, and the multidimensional data indicated by the descriptor may be stored.
In one exemplary implementation, where a register is used to store the identity and content of a descriptor, the number of the register may be used to represent the identity of the descriptor. For example, when the number of the register is 0, the identifier of the descriptor stored therein is set to 0. When the descriptor in the register is valid, an area may be allocated in the cache space for storing the multidimensional data according to the size of the multidimensional data indicated by the descriptor.
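A minimal sketch of this register-based arrangement follows, assuming a simple sequential allocator for the cache space. The class `DescriptorRegisters`, its fields, and the allocation policy are hypothetical; the disclosure does not specify how the area is allocated.

```python
class DescriptorRegisters:
    """Toy model: the register number doubles as the descriptor identifier."""

    def __init__(self, num_registers: int, cache_size: int):
        self.contents = [None] * num_registers  # descriptor content per register
        self.regions = {}                       # descriptor id -> (start, size)
        self._cache_size = cache_size
        self._next_free = 0                     # assumed sequential allocation

    def set_descriptor(self, reg_no: int, shape: tuple) -> int:
        """Store a shape in register reg_no and reserve cache space for its data."""
        size = 1
        for extent in shape:
            size *= extent
        if self._next_free + size > self._cache_size:
            raise MemoryError("not enough cache space for this descriptor")
        self.contents[reg_no] = shape
        self.regions[reg_no] = (self._next_free, size)
        self._next_free += size
        return reg_no  # identifier of the descriptor equals the register number
```

With `regs = DescriptorRegisters(4, 64)`, calling `regs.set_descriptor(0, (2, 4))` yields descriptor identifier 0 and reserves 8 address units starting at offset 0 of the cache space.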
In an exemplary implementation, the identity and content of the descriptor may be stored in an internal memory and the multidimensional data indicated by the descriptor may be stored in an external memory. For example, the identification and content of the descriptor may be stored on-chip, and the multidimensional data indicated by the descriptor may be stored off-chip.
In one exemplary implementation, the data address of the data storage space corresponding to each descriptor may be a fixed address. For example, separate data storage spaces may be partitioned for the multidimensional data, with the start address of each multidimensional data in the data storage space corresponding one-to-one to a descriptor. In this case, the data conversion circuit of the present disclosure may determine, according to the descriptor, the data address in the data storage space of the data corresponding to the operand.
In an exemplary implementation, when the data address of the data storage space corresponding to the descriptor is a variable address, the descriptor may also be used to indicate the address of the multidimensional data. In this case, the content of the descriptor may further include at least one address parameter representing the address of the multidimensional data. For example, when the multidimensional data is 3-dimensional data and the descriptor points to the address of the multidimensional data, the content of the descriptor may include a single address parameter indicating that address, such as the starting physical address of the multidimensional data, or it may include a plurality of address parameters, such as the starting address of the multidimensional data plus an address offset, or address parameters of the multidimensional data for each dimension. The address parameters can be set by those skilled in the art according to practical needs, and the disclosure does not limit this.
In an exemplary implementation, the address parameter of the multi-dimensional data may include a reference address of a data reference point of the descriptor in a data storage space of the multi-dimensional data. Wherein the reference address may be different according to a variation of the data reference point. The present disclosure does not limit the selection of data reference points.
In one exemplary implementation, the reference address may include a start address of the data storage space. When the data reference point of the descriptor is the first data block of the data storage space, the reference address of the descriptor is the start address of the data storage space. When the data reference point of the descriptor is a data block other than the first data block in the data storage space, the reference address of the descriptor is the address of that data block in the data storage space.
In one exemplary implementation, the shape parameters of the multi-dimensional data include at least one of: the size of the data storage space in at least one direction of the N dimensional directions, the size of the storage area in at least one direction of the N dimensional directions, the offset of the storage area in at least one direction of the N dimensional directions, the positions of at least two vertexes at diagonal positions of the N dimensional directions relative to the data reference point, and the mapping relationship between the data description position and the data address of the multi-dimensional data indicated by the descriptor. Wherein the data description position is a mapping position of a point or a region in the multi-dimensional data indicated by the descriptor. For example, when the multidimensional data is 3-dimensional data, the descriptor may represent a shape of the multidimensional data using three-dimensional space coordinates (x, y, z), and the data description position of the multidimensional data may be a position of a point or a region of the multidimensional data mapped in the three-dimensional space, which is represented using the three-dimensional space coordinates (x, y, z).
It should be understood that those skilled in the art can select the shape parameters representing the multi-dimensional data according to practical situations, and the present disclosure does not limit this. By using the descriptor in the data access process, the association between the data can be established, thereby reducing the complexity of data access and improving the instruction processing efficiency.
In an exemplary implementation, the content of the descriptor of the multidimensional data may be determined according to a reference address of a data reference point of the descriptor in a data storage space of the multidimensional data, a size of the data storage space in at least one of N dimensional directions, a size of the storage area in at least one of N dimensional directions, and/or an offset of the storage area in at least one of N dimensional directions.
As shown in fig. 2, the data storage space 21 stores two-dimensional data in a row-major manner, which can be represented by (x, y) (where the X axis points horizontally to the right and the Y axis points vertically downward). The size in the X-axis direction (the size of each row) is ori_x (not shown in the figure), the size in the Y-axis direction (the total number of rows) is ori_y (not shown in the figure), and the starting address PA_start (reference address) of the data storage space 21 is the physical address of the first data block 22. The data block 23 is partial data in the data storage space 21; its offset 25 in the X-axis direction is denoted as offset_x, its offset 24 in the Y-axis direction is denoted as offset_y, its size in the X-axis direction is denoted as size_x, and its size in the Y-axis direction is denoted as size_y.
In an exemplary implementation, when the descriptor is used to define the data block 23, the data reference point of the descriptor may be the first data block of the data storage space 21, and the reference address of the descriptor may be agreed to be the starting address PA_start of the data storage space 21. The content of the descriptor of the data block 23 may then be determined from the size ori_x of the data storage space 21 along the X axis and its size ori_y along the Y axis, together with the offset offset_y of the data block 23 in the Y-axis direction, its offset offset_x in the X-axis direction, its size size_x in the X-axis direction, and its size size_y in the Y-axis direction.
In one exemplary implementation, the content of the descriptor may be represented using the following equation (1):
X direction: ori_x, offset_x, size_x
Y direction: ori_y, offset_y, size_y
PA_start    (1)
it should be understood that although the content of the descriptor is represented by a two-dimensional space in the above examples, a person skilled in the art can set the specific dimension of the content representation of the descriptor according to practical situations, and the disclosure does not limit this.
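As an illustration, the descriptor content for the data block 23 can be held in a small record whose fields follow the parameter names of FIG. 2. The class name `BlockDescriptor2D` and the bounds check `fits` are hypothetical, and the check assumes zero-based offsets, which the disclosure does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockDescriptor2D:
    """Descriptor content for a 2-D data block, using the names from FIG. 2."""
    pa_start: int   # reference address: start address of the data storage space
    ori_x: int      # size of the storage space along X (elements per row)
    ori_y: int      # size of the storage space along Y (number of rows)
    offset_x: int   # offset of the block along X
    offset_y: int   # offset of the block along Y
    size_x: int     # size of the block along X
    size_y: int     # size of the block along Y

    def fits(self) -> bool:
        """True if the block lies inside the storage space (zero-based offsets assumed)."""
        return (self.offset_x >= 0 and self.offset_y >= 0
                and self.offset_x + self.size_x <= self.ori_x
                and self.offset_y + self.size_y <= self.ori_y)
```

A descriptor such as `BlockDescriptor2D(1000, 16, 8, 3, 2, 4, 3)` describes a 4 × 3 block inside a 16 × 8 storage space, and `fits()` confirms that the block does not extend past either edge.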
In an exemplary implementation, a reference address of the data reference point of the descriptor in the data storage space may be agreed upon. On the basis of the reference address, the content of the descriptor of the multi-dimensional data can be determined according to the positions, relative to the data reference point, of at least two vertices at diagonal positions in the N dimensional directions.
For example, a reference address PA_base of the data reference point of the descriptor in the data storage space may be agreed upon. For example, one piece of data (for example, the data at position (2, 2)) may be selected as the data reference point in the data storage space 21, and the physical address of that data in the data storage space may be used as the reference address PA_base. The content of the descriptor of the data block 23 in fig. 2 can then be determined from the positions of two diagonal vertices relative to the data reference point. First, the positions of at least two diagonal vertices of the data block 23 relative to the data reference point are determined, for example, the vertices on the diagonal running from the top-left to the bottom-right. Then, the content of the descriptor of the data block 23 can be determined according to the reference address PA_base, the relative position (x_min, y_min) of the top-left vertex, and the relative position (x_max, y_max) of the bottom-right vertex.
In one exemplary implementation, the content of the descriptor (with reference to PA_base) can be represented using the following equation (2):
X direction: x_min, x_max
Y direction: y_min, y_max
PA_base    (2)
It should be understood that although the above example uses the two diagonal vertices at the upper-left and lower-right corners to determine the content of the descriptor, those skilled in the art can choose which diagonal vertices to use according to actual needs, and the disclosure does not limit this.
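The relationship between the diagonal-vertex form and the offset/size form can be sketched as follows. The function `vertices_to_offset_size` is hypothetical, and it assumes that both vertices are inclusive and expressed in element units relative to the data reference point; the disclosure leaves these conventions open.

```python
def vertices_to_offset_size(ref, top_left, bottom_right):
    """Convert relative diagonal vertices into an absolute offset and a size.

    ref          -- (x, y) position of the data reference point in the storage space
    top_left     -- (x_min, y_min), relative to the reference point
    bottom_right -- (x_max, y_max), relative to the reference point (inclusive)
    """
    ref_x, ref_y = ref
    x_min, y_min = top_left
    x_max, y_max = bottom_right
    if x_max < x_min or y_max < y_min:
        raise ValueError("bottom-right vertex must not precede top-left vertex")
    offset = (ref_x + x_min, ref_y + y_min)
    size = (x_max - x_min + 1, y_max - y_min + 1)  # inclusive bounds assumed
    return offset, size
```

With the reference point at (2, 2), a top-left vertex of (1, 1), and a bottom-right vertex of (3, 2), the block starts at position (3, 3) and spans 3 × 2 elements.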
In an exemplary implementation manner, the content of the descriptor of the multidimensional data may be determined according to a reference address of the data reference point of the descriptor in the data storage space and a mapping relationship between a data description position and a data address of the multidimensional data indicated by the descriptor. For example, when the multidimensional data indicated by the descriptor is three-dimensional space data, the mapping relationship between the data description position and the data address may be defined by using a function f (x, y, z).
In one exemplary implementation, the content of the descriptor may be represented using the following equation (3):
f(x, y, z)
PA_base    (3)
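One possible mapping between a data description position and a data address is a row-major layout. The sketch below is only an example of such a function f(x, y, z); the helper `make_row_major_map` and its parameters are hypothetical, and the disclosure does not prescribe any particular mapping.

```python
def make_row_major_map(pa_base: int, size_x: int, size_y: int):
    """Build a mapping f(x, y, z) -> data address for a row-major 3-D layout.

    pa_base is the reference address; size_x and size_y are the extents of
    the X and Y dimensions. X varies fastest, then Y, then Z.
    """
    def f(x: int, y: int, z: int) -> int:
        return pa_base + (z * size_y + y) * size_x + x
    return f
```

For example, with `f = make_row_major_map(100, 4, 3)`, the position (1, 2, 0) maps to address 109 and the position (0, 0, 1) maps to address 112, the first element of the second X-Y plane.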
In an exemplary implementation, the descriptor is further configured to indicate an address of the multidimensional data, in which case the content of the descriptor further includes at least one address parameter indicating the address of the multidimensional data. For example, the content of the descriptor may be:
D:
Figure BDA0002705186440000151
where PA is the address parameter. The address parameter may be a logical address or a physical address. The descriptor parsing circuit may obtain a corresponding data address by using PA as any one of a vertex, a middle point, or a preset point of a vector shape in combination with shape parameters in the X direction and the Y direction.
In one exemplary implementation, the address parameter of the multi-dimensional data includes a reference address of a data reference point of the descriptor in a data storage space of the multi-dimensional data, and the reference address includes a start address of the data storage space.
In one possible implementation, the descriptor may further include at least one address parameter indicating an address of the multidimensional data, for example, the content of the descriptor may be:
D:
Figure BDA0002705186440000152
where PA_start is a reference address parameter, which is not described again here.
It should be understood that, the mapping relationship between the data description location and the data address can be set by those skilled in the art according to practical situations, and the disclosure does not limit this.
In an exemplary implementation manner, a default reference address may be set in a task. This reference address is used by the descriptors in the instructions of the task, and the descriptor contents may include shape parameters based on this reference address. The reference address may be determined by setting an environment parameter for the task. The relevant description and usage of the reference address can be found in the above embodiments. In this implementation, the content of the descriptor can be mapped to the data address more quickly.
In an exemplary implementation, the reference address may be included in the content of each descriptor, and the reference address of each descriptor may be different. Compared with setting a common reference address through environment parameters, this approach allows each descriptor to describe data more flexibly and to use a larger data address space.
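The two options above, a task-wide default and a per-descriptor reference (base) address, can be contrasted in a short sketch. Combining them with a fallback order, as done here, is an assumption made for illustration; the function `resolve_reference_address` is hypothetical.

```python
from typing import Optional


def resolve_reference_address(descriptor_base: Optional[int],
                              task_default_base: Optional[int]) -> int:
    """Pick the reference address used to map descriptor content to data addresses.

    A per-descriptor address, if present, takes precedence (assumed policy);
    otherwise the task-wide default set through an environment parameter is used.
    """
    if descriptor_base is not None:
        return descriptor_base
    if task_default_base is not None:
        return task_default_base
    raise ValueError("no reference address available for this descriptor")
```

A descriptor carrying its own address, such as `resolve_reference_address(4096, 1024)`, resolves to 4096, while a descriptor without one falls back to the task default of 1024.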
In an exemplary implementation, the data address in the data storage space of the data corresponding to the operand of the processing instruction may be determined according to the content of the descriptor. The calculation of the data address is automatically completed by hardware, and the calculation methods of the data address are different when the content of the descriptor is represented in different ways. The present disclosure does not limit the specific calculation method of the data address.
For example, when the content of the descriptor in the operand is expressed by formula (1), the offsets of the multidimensional data indicated by the descriptor in the data storage space are offset_x and offset_y, respectively, and its size is size_x × size_y, the starting data address PA1_(x,y) of the multidimensional data indicated by the descriptor in the data storage space may be determined using the following equation (4):
PA1_(x,y) = PA_start + (offset_y - 1) * ori_x + offset_x    (4)
From the data start address PA1_(x,y) determined according to equation (4), combined with the offsets offset_x and offset_y and the sizes size_x and size_y of the storage area, the storage area of the multidimensional data indicated by the descriptor in the data storage space can be determined.
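Equation (4) can be transcribed directly. In the sketch below, `block_start_address` follows the equation as written, while `block_row_addresses` additionally assumes that consecutive rows of the block are ori_x address units apart, which follows from row-major storage but is not stated as a formula in the text. Both function names are hypothetical.

```python
def block_start_address(pa_start: int, ori_x: int,
                        offset_x: int, offset_y: int) -> int:
    """Start address of the block, transcribed from equation (4)."""
    return pa_start + (offset_y - 1) * ori_x + offset_x


def block_row_addresses(pa_start: int, ori_x: int,
                        offset_x: int, offset_y: int, size_y: int) -> list:
    """Start address of each row of the block (row pitch ori_x is assumed)."""
    first = block_start_address(pa_start, ori_x, offset_x, offset_y)
    return [first + row * ori_x for row in range(size_y)]
```

With PA_start = 1000, ori_x = 16, offset_x = 3, and offset_y = 2, equation (4) gives 1000 + (2 - 1) * 16 + 3 = 1019, and a block of three rows begins its rows at 1019, 1035, and 1051.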
In an exemplary implementation, when the operand further includes a data description location for the descriptor, a data address of data corresponding to the operand in the data storage space may be determined according to the content of the descriptor and the data description location. In this way, a portion of the data (e.g., one or more data) in the multi-dimensional data indicated by the descriptor may be processed.
For example, when the content of the descriptor in the operand is expressed by formula (1), the offsets of the multidimensional data indicated by the descriptor in the data storage space are offset_x and offset_y, respectively, its size is size_x × size_y, and the data description position for the descriptor included in the operand is (x_q, y_q), the data address PA2_(x,y) of the multidimensional data indicated by the descriptor in the data storage space may be determined using the following equation (5):
PA2_(x,y) = PA_start + (offset_y + y_q - 1) * ori_x + (offset_x + x_q)    (5)
From the data address PA2_(x,y) determined according to equation (5), combined with the offsets offset_x and offset_y and the sizes size_x and size_y of the storage area, the storage area of the multidimensional data indicated by the descriptor in the data storage space can be determined.
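Equation (5) can be transcribed in the same way. The function name `element_address` is hypothetical; the arithmetic follows the equation as written, so a data description position of (0, 0) reproduces the block start address of equation (4).

```python
def element_address(pa_start: int, ori_x: int,
                    offset_x: int, offset_y: int,
                    x_q: int, y_q: int) -> int:
    """Address of the element at description position (x_q, y_q), per equation (5)."""
    return pa_start + (offset_y + y_q - 1) * ori_x + (offset_x + x_q)
```

With PA_start = 1000, ori_x = 16, offset_x = 3, offset_y = 2, and position (1, 1), the result is 1000 + (2 + 1 - 1) * 16 + (3 + 1) = 1036.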
In an exemplary implementation, when the operand further includes a data description location for the descriptor, a data address of data corresponding to the operand in the data storage space may be determined according to the content of the descriptor and the data description location. In this way, a portion of the data (e.g., one or more data) in the multi-dimensional data indicated by the descriptor may be processed.
For example, when the content of the descriptor in the operand is expressed by formula (2), the offsets of the multidimensional data indicated by the descriptor in the data storage space are offset_x and offset_y, respectively, its size is size_x × size_y, and the data description position for the descriptor included in the operand is (x_q, y_q), the data address PA2_(x,y) of the multidimensional data indicated by the descriptor in the data storage space may be determined using the following equation (6):
PA2_(x,y) = PA_start + (offset_y + y_q - 1) * ori_x + (offset_x + x_q)    (6)
The data processing apparatus of the present disclosure has been described above with reference to fig. 1 and fig. 2. Through the data conversion instruction in combination with the descriptor, the data processing apparatus of the present disclosure can significantly improve the access and conversion efficiency of multidimensional data and reduce the overhead of multidimensional data processing.
Fig. 3-5 are flow diagrams respectively illustrating various types of operations of a data conversion circuit according to an embodiment of the present disclosure.
To perform the operations 300 illustrated in fig. 3, the data processing apparatus of the present disclosure may further include an external memory (e.g., the memory circuit 602 exemplarily illustrated in fig. 6) that stores multi-dimensional data. Based on the external memory, the data conversion instructions of the present disclosure may include a first descriptor and a second descriptor. In this scenario, at step S302, the data conversion circuit of the present disclosure may be configured to read the multidimensional data from the external memory and store it in the data cache circuit according to the first descriptor. Next, at step S304, the data conversion circuit of the present disclosure may be configured to read the multidimensional data out of the data cache circuit and into the external memory according to the second descriptor. By using different descriptors for the store and read operations on the data cache circuit, the disclosed scheme can implement the conversion of multidimensional data. For example, the data conversion circuit of the present disclosure may perform a transformation operation on the multidimensional data, in whole or in blocks, through the first descriptor and then output the multidimensional data as represented by the second descriptor. Depending on the conversion scenario, the foregoing transformation operation may include, but is not limited to, mirroring the multi-dimensional data, rotating it by 90, 180, or 270 degrees, or transposing it.
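As a software analogy of steps S302 and S304, the sketch below transposes a two-dimensional block by loading it into a cache in one order and writing it back in another. The flat list standing in for external memory, the function `transpose_via_cache`, and its parameters are all hypothetical; the actual circuit and descriptor encoding are not specified at this level of detail.

```python
def transpose_via_cache(memory: list, src_start: int, rows: int, cols: int,
                        dst_start: int) -> None:
    """Transpose a rows x cols row-major block of `memory` through a cache.

    The source region plays the role of the first descriptor and the
    destination region (cols x rows) plays the role of the second descriptor.
    """
    # Step S302: read the block from external memory into the cache.
    cache = [memory[src_start + r * cols + c]
             for r in range(rows) for c in range(cols)]
    # Step S304: read the cache column by column and write to external memory.
    for c in range(cols):
        for r in range(rows):
            memory[dst_start + c * rows + r] = cache[r * cols + c]
```

Starting from a 2 × 3 block `[1, 2, 3, 4, 5, 6]` at address 0, calling `transpose_via_cache(memory, 0, 2, 3, 6)` leaves the 3 × 2 block `[1, 4, 2, 5, 3, 6]` at address 6.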
In one or more embodiments, when the data conversion instructions include operating parameters, the data conversion circuitry of the present disclosure may be configured to perform data conversion on the multidimensional data according to the operating parameters. Data conversion operations 400 and 500 performed by the data conversion circuit of the present disclosure in accordance with the foregoing operating parameters will be described below in conjunction with fig. 4 and 5.
As shown in fig. 4, when the data conversion circuit is configured to perform an operation according to the operating parameters, it may, at step S402, store the multidimensional data into the data cache circuit in a first dimension order of the multidimensional data. Next, at step S404, the data conversion circuit of the present disclosure may be configured to read the multi-dimensional data from the data cache circuit in a second dimension order for output. For example, for three-dimensional data in a neural network in the "HWC" dimension order (H represents height, W represents width, and C represents channel), performing operation 400 according to the operating parameters can convert the data into three-dimensional data in the "WCH" or "CWH" dimension order.
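The dimension-order conversion can be illustrated in software by storing data in one order and reading it back in another. The function `reorder_dims` below is a hypothetical helper operating on a flat row-major list; it is a sketch of the idea, not the circuit's method.

```python
from itertools import product


def reorder_dims(flat: list, shape: dict, src_order: str, dst_order: str) -> list:
    """Re-read row-major data stored in src_order (e.g. "HWC") in dst_order.

    shape maps each dimension letter to its size, e.g. {"H": 2, "W": 2, "C": 3}.
    """
    # Row-major strides of the stored (source) layout: last dimension is fastest.
    strides = {}
    stride = 1
    for dim in reversed(src_order):
        strides[dim] = stride
        stride *= shape[dim]
    out = []
    # Walk the destination index space, last destination dimension fastest.
    for index in product(*(range(shape[dim]) for dim in dst_order)):
        coord = dict(zip(dst_order, index))
        out.append(flat[sum(coord[dim] * strides[dim] for dim in src_order)])
    return out
```

For a 2 × 2 × 3 "HWC" array holding the values 0 to 11, `reorder_dims(list(range(12)), {"H": 2, "W": 2, "C": 3}, "HWC", "CWH")` returns `[0, 6, 3, 9, 1, 7, 4, 10, 2, 8, 5, 11]`.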
As shown in fig. 5, at step S502, the data conversion circuit is configured to perform data conversion on the multidimensional data according to the operating parameters. As an example, the data conversion circuit may perform the various operations shown in steps S502-1 to S502-3 according to different operating parameters.
Specifically, as shown in step S502-1, according to the operation parameters, the data conversion circuit of the present disclosure may be configured to perform a store and read operation on one or more portions of the multidimensional data in the data cache circuit to implement data conversion on the one or more portions of the multidimensional data. In other words, the data conversion circuitry of the present disclosure may selectively perform store and read operations on portions of the multi-dimensional data, rather than processing the entire multi-dimensional data, as dictated by the operating parameters.
As shown in step S502-2, according to the operating parameters, the data conversion circuit of the present disclosure may be configured to splice multiple portions of the converted multidimensional data read from the data cache circuit for output. That is to say, the data conversion circuit of the present disclosure not only performs conversion by storing and reading multidimensional data in the data cache circuit, but also performs related post-processing operations on the converted multidimensional data to facilitate subsequent processing. In particular, depending on the operating parameters, the data conversion circuit may select specified portions of the multi-dimensional data (e.g., corresponding to one or more dimensions) and arrange them (which may include ordered splicing) as required by subsequent operations. In one exemplary implementation scenario, this subsequent processing may be performed, for example, by the computation circuitry 604 in fig. 6.
Slightly different from the operation of step S502-2 above, in the operation shown in step S502-3, the data conversion circuit of the present disclosure may be configured to combine, according to the operating parameters, a plurality of portions of the unconverted multidimensional data read from the data cache circuit for output. In this case, the data conversion circuit of the present disclosure does not convert the multidimensional data through store and read operations on the data cache circuit. Instead, it performs a combining operation on the unconverted multidimensional data before outputting it to a subsequent processing unit (e.g., the calculation circuit 604 of fig. 6). In other words, the data conversion circuit of the present disclosure may support enabling and disabling the data conversion function, with the choice made according to the operating parameters. When the data conversion function is disabled based on the operating parameters, the data conversion circuit of the present disclosure does not convert the multidimensional data as a whole, but instead selects and combines portions of it (which may also be specified by the operating parameters) for subsequent processing.
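The difference between steps S502-2 and S502-3 can be sketched with a single helper in which the conversion is optional. The function `select_and_combine`, the representation of portions as (start, length) pairs, and the `convert` callback are hypothetical stand-ins for whatever the operating parameters actually encode.

```python
def select_and_combine(cache: list, parts: list, convert=None) -> list:
    """Select portions of cached data and splice them in the given order.

    parts   -- list of (start, length) pairs selecting portions of the cache
    convert -- optional function applied to each portion; None models the
               case in which the data conversion function is disabled
    """
    out = []
    for start, length in parts:
        piece = cache[start:start + length]
        if convert is not None:
            piece = convert(piece)  # converted portions are spliced (S502-2)
        out.extend(piece)           # unconverted portions are combined (S502-3)
    return out
```

With `cache = list(range(10))`, the call `select_and_combine(cache, [(6, 2), (0, 3)])` returns `[6, 7, 0, 1, 2]`, while passing `convert=lambda p: p[::-1]` returns `[7, 6, 2, 1, 0]`.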
FIG. 6 is a schematic diagram illustrating a computing device 600 according to an embodiment of the present disclosure. As shown in fig. 6, the computing device 600 may include the data processing device 100 described above in connection with fig. 1, i.e., the data cache circuit 102 and the data conversion circuit 104 shown in the figure. Since the data processing apparatus of the present disclosure has been described in detail above with reference to fig. 1 to 5, and the detailed description of the data cache circuit 102 and the data conversion circuit 104 also applies to the computing device 600, the same contents will not be repeated.
As shown in the figure, the computing device of the present disclosure also includes a storage circuit 602 and a computing circuit 604. The computing circuitry and memory circuitry herein may be implemented in various ways depending on the application scenario. In one embodiment, the memory circuit may take the form of a memory, such as a dynamic random access memory ("DRAM") or a double data rate synchronous dynamic random access memory ("DDR SDRAM"), which may be used to store operational data required for the computing circuit to perform operations, or data for exchange with external memory, such as multi-dimensional data according to the present disclosure. When the computing device of the present disclosure is applied to the field of artificial intelligence, the aforementioned operation data or data to be exchanged may be data of various related fields, such as various training data, network model data and parameters in machine learning, and various types of data to be detected (e.g., three-dimensional or four-dimensional image data, etc.).
In another embodiment, the computing circuitry may take the form of a general-purpose or special-purpose processor, or a general-purpose or special-purpose processor core, which may include various types of operators and buses (e.g., a data bus, a control bus, or a broadcast bus). When the disclosed solution is applied to the field of artificial intelligence, the computing circuit can be implemented as or included in a single-core or multi-core deep learning processor to implement various computing operations on multi-dimensional data. In one application scenario, when the computing circuitry is implemented as a processor core, it may be packaged together with the data cache circuitry and the data conversion circuitry to form a processor. In this case, the data cache circuitry may be implemented as a cache of the computing device to store the data (e.g., including multidimensional data) and instructions in memory (e.g., storage circuitry 602) that are most frequently accessed by the computing circuitry, so that the computing circuitry need not read the needed data and instructions from the relatively slow memory.
FIG. 7 is a flow diagram illustrating a method 700 implemented by a data processing apparatus according to an embodiment of the present disclosure. It will be appreciated that the data processing apparatus herein is the data processing apparatus discussed above in connection with fig. 1-6. Therefore, the foregoing description of the data processing apparatus is also applicable to the scheme shown in fig. 7, and the same contents will not be described again.
As shown in fig. 7, at step S702, the method 700 caches multi-dimensional data using a data cache circuit. In accordance with various embodiments of the present disclosure, the data here may be multidimensional data such as a two-dimensional matrix or a three- or four-dimensional array. At step S704, the method 700 uses the data conversion circuit to perform store and read operations on the multidimensional data in the data cache circuit according to a data conversion instruction to implement data conversion on the multidimensional data. As previously described, in one embodiment, the data conversion instructions may include descriptors for indicating the shape of the multidimensional data, and the descriptors are used to determine the memory address of the corresponding multidimensional data. In another embodiment, method 700 includes using the data conversion circuitry to perform store and read operations on the multi-dimensional data according to the memory address. Although not shown in fig. 7, those skilled in the art will appreciate that the method 700 may perform the various operations of the data processing apparatus described in conjunction with fig. 1-6.
Fig. 8 is a block diagram illustrating a combined processing device 800 according to an embodiment of the present disclosure. As shown in fig. 8, the combined processing device 800 includes a computing processing device 802, an interface device 804, other processing devices 806, and a storage device 808. Depending on the application scenario, one or more computing devices 810 may be included in the computing processing device, which may include the data processing device of the present disclosure, and may be configured to perform the operations described herein in conjunction with fig. 1-7.
In various embodiments, the computing processing device of the present disclosure may be configured to perform user-specified operations. In an exemplary application, the computing processing device may be implemented as a single-core artificial intelligence processor or a multi-core artificial intelligence processor. Similarly, one or more computing devices included within a computing processing device may be implemented as an artificial intelligence processor core or as part of a hardware structure of an artificial intelligence processor core. When multiple computing devices are implemented as artificial intelligence processor cores or as part of a hardware structure of an artificial intelligence processor core, computing processing devices of the present disclosure may be considered to have a single core structure or a homogeneous multi-core structure.
In an exemplary operation, the computing processing device of the present disclosure may interact with other processing devices through an interface device to jointly perform user-specified operations. Depending on the implementation, the other processing devices of the present disclosure may include one or more types of general-purpose and/or special-purpose processors, such as Central Processing Units (CPUs), Graphics Processing Units (GPUs), and artificial intelligence processors. These processors may include, but are not limited to, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic, discrete hardware components, etc., and their number may be determined based on actual needs. As previously mentioned, the computing processing device of the present disclosure, considered on its own, may be regarded as having a single-core structure or a homogeneous multi-core structure. However, when considered together, the computing processing device and the other processing devices may be regarded as forming a heterogeneous multi-core structure.
In one or more embodiments, the other processing device may serve as an interface between the computing processing device of the present disclosure and external data and control, performing basic control including, but not limited to, data transfer and starting and/or stopping the computing device. In further embodiments, the other processing devices may also cooperate with the computing processing device to jointly perform computational tasks.
In one or more embodiments, the interface device may be used to transfer data and control instructions between the computing processing device and other processing devices. For example, the computing processing device may obtain input data from other processing devices via the interface device and write the input data into an on-chip storage device (or memory) of the computing processing device. Further, the computing processing device may obtain control instructions from the other processing devices via the interface device and write the control instructions into an on-chip control cache of the computing processing device. Alternatively or additionally, the interface device may also read data from the storage device of the computing processing device and transmit the data to the other processing devices.
Additionally or alternatively, the combined processing device of the present disclosure may further include a storage device. As shown in the figure, the storage device is connected to the computing processing device and the other processing devices, respectively. In one or more embodiments, the storage device may be used to hold data of the computing processing device and/or the other processing devices. For example, the data may be data that cannot be fully retained within the internal or on-chip storage of the computing processing device or the other processing devices.
In some embodiments, the present disclosure also discloses a chip (e.g., chip 902 shown in fig. 9). In one implementation, the chip is a System on Chip (SoC) integrated with one or more combined processing devices as shown in fig. 8. The chip may be connected to other associated components through an external interface device, such as external interface device 906 shown in fig. 9. The associated component may be, for example, a camera, a display, a mouse, a keyboard, a network card, or a Wi-Fi interface. In some application scenarios, other processing units (e.g., video codecs) and/or interface modules (e.g., DRAM interfaces) may be integrated on the chip. In some embodiments, the present disclosure also discloses a chip package structure, which includes the above chip. In some embodiments, the present disclosure also discloses a board card including the above chip package structure. The board card will be described in detail below with reference to fig. 9.
Fig. 9 is a schematic diagram illustrating a structure of a board card 900 according to an embodiment of the disclosure. As shown in fig. 9, the board card includes a memory device 904 for storing data, which includes one or more memory units 910. The memory device may be connected to, and exchange data with, the control device 908 and the chip 902 described above by means of, for example, a bus. Further, the board card also includes an external interface device 906 configured to relay or transfer data between the chip (or the chip in the chip package structure) and an external device 912 (such as a server or a computer). For example, data to be processed may be transferred to the chip by the external device through the external interface device. For another example, the calculation result of the chip may be transmitted back to the external device via the external interface device. Depending on the application scenario, the external interface device may take different interface forms; for example, it may adopt a standard PCIe interface.
In one or more embodiments, the control device in the board card of the present disclosure may be configured to regulate the state of the chip. To this end, in one application scenario, the control device may include a microcontroller unit (MCU) for controlling the operating state of the chip.
From the above description in conjunction with figs. 8 and 9, it will be understood by those skilled in the art that the present disclosure also discloses an electronic device or apparatus, which may include one or more of the above board cards, one or more of the above chips, and/or one or more of the above combined processing devices.
According to different application scenarios, the electronic device or apparatus of the present disclosure may include a server, a cloud server, a server cluster, a data processing apparatus, a robot, a computer, a printer, a scanner, a tablet computer, a smart terminal, a PC device, an internet of things terminal, a mobile terminal, a mobile phone, a vehicle recorder, a navigator, a sensor, a camera, a video camera, a projector, a watch, an earphone, a mobile storage device, a wearable device, a visual terminal, an autonomous driving terminal, a vehicle, a household appliance, and/or a medical device. The vehicles include an airplane, a ship, and/or a motor vehicle; the household appliances include a television, an air conditioner, a microwave oven, a refrigerator, an electric rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, and a range hood; the medical devices include a nuclear magnetic resonance apparatus, a B-mode ultrasound scanner, and/or an electrocardiograph. The electronic device or apparatus of the present disclosure may also be applied to fields such as the internet, the internet of things, data centers, energy, transportation, public administration, manufacturing, education, power grids, telecommunications, finance, retail, construction sites, and healthcare. Further, the electronic device or apparatus disclosed herein may also be used in application scenarios related to artificial intelligence, big data, and/or cloud computing, such as the cloud, the edge, and terminals. In one or more embodiments, an electronic device or apparatus with high computing power according to the present disclosure may be applied to a cloud device (e.g., a cloud server), while an electronic device or apparatus with low power consumption may be applied to a terminal device and/or an edge device (e.g., a smartphone or a camera).
In one or more embodiments, the hardware information of the cloud device and the hardware information of the terminal device and/or the edge device are compatible with each other, so that appropriate hardware resources can be matched from the hardware resources of the cloud device to simulate the hardware resources of the terminal device and/or the edge device according to the hardware information of the terminal device and/or the edge device, and uniform management, scheduling and cooperative work of end-cloud integration or cloud-edge-end integration can be completed.
It is noted that, for the sake of brevity, the present disclosure describes some methods and embodiments thereof as a series of acts and combinations thereof, but those skilled in the art will appreciate that the aspects of the present disclosure are not limited by the order of the acts described. Accordingly, one of ordinary skill in the art will appreciate, in light of the disclosure or teachings of the present disclosure, that certain steps may be performed in other orders or simultaneously. Further, those skilled in the art will appreciate that the embodiments described in this disclosure may be regarded as optional embodiments, in that the acts or modules involved are not necessarily required to practice one or more aspects of the disclosure. In addition, depending on the solution, the descriptions of some embodiments in the present disclosure have different emphases. In view of the above, those skilled in the art will understand that, for portions not described in detail in one embodiment, reference may be made to the descriptions of other embodiments.
In terms of specific implementation, based on the disclosure and teachings of the present disclosure, one skilled in the art will appreciate that the several embodiments disclosed herein may also be implemented in other ways not disclosed herein. For example, the units in the foregoing embodiments of the electronic device or apparatus are divided based on logical function, and there may be other ways of dividing them in an actual implementation. As another example, multiple units or components may be combined or integrated into another system, or some features or functions of a unit or component may be selectively disabled. As regards the connections between different units or components, the connections discussed above in conjunction with the figures may be direct or indirect couplings between the units or components. In some scenarios, the aforementioned direct or indirect coupling involves a communication connection utilizing an interface, where the communication interface may support electrical, optical, acoustic, magnetic, or other forms of signal transmission.
In the present disclosure, units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units. The aforementioned components or units may be co-located or distributed across multiple network elements. In addition, according to actual needs, part or all of the units can be selected to achieve the purpose of the solution of the embodiment of the present disclosure. In addition, in some scenarios, multiple units in embodiments of the present disclosure may be integrated into one unit or each unit may exist physically separately.
In some implementation scenarios, the integrated units may be implemented in the form of software program modules. If implemented in the form of software program modules and sold or used as a stand-alone product, the integrated units may be stored in a computer-readable memory. In this regard, when aspects of the present disclosure are embodied in the form of a software product (e.g., a computer-readable storage medium), the software product may be stored in a memory and may include instructions for causing a computer device (e.g., a personal computer, a server, or a network device) to perform some or all of the steps of the methods described in embodiments of the present disclosure. The memory may include, but is not limited to, a USB flash drive, a flash disk, a Read Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
In other implementation scenarios, the integrated unit may also be implemented in hardware, that is, as a specific hardware circuit, which may include a digital circuit and/or an analog circuit, etc. The physical implementation of the hardware structure of the circuit may include, but is not limited to, physical devices such as transistors or memristors. In view of this, the various devices described herein (e.g., computing devices or other processing devices) may be implemented by suitable hardware processors, such as CPUs, GPUs, FPGAs, DSPs, and ASICs. Further, the aforementioned storage unit or storage device may be any suitable storage medium (including a magnetic storage medium or a magneto-optical storage medium, etc.), and may be, for example, a Resistive Random Access Memory (RRAM), a Dynamic Random Access Memory (DRAM), a Static Random Access Memory (SRAM), an Enhanced Dynamic Random Access Memory (EDRAM), a High Bandwidth Memory (HBM), a Hybrid Memory Cube (HMC), a ROM, a RAM, or the like.
While various embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous modifications, changes, and substitutions will occur to those skilled in the art without departing from the spirit and scope of the present disclosure. It should be understood that various alternatives to the embodiments of the disclosure described herein may be employed in practicing the disclosure. It is intended that the following claims define the scope of the disclosure and that equivalents or alternatives within the scope of these claims be covered thereby.
The foregoing may be better understood in light of the following clauses:
Clause A1, a data processing apparatus comprising a data cache circuit and a data conversion circuit, wherein:
the data cache circuit is configured to cache multi-dimensional data; and
the data conversion circuit is configured to perform store and read operations on the multi-dimensional data in the data cache circuit according to a data conversion instruction to implement data conversion on the multi-dimensional data,
wherein the data conversion instruction includes a descriptor indicating a shape of the multi-dimensional data, and the descriptor is used to determine a storage address of the corresponding multi-dimensional data,
wherein the data conversion circuit is configured to perform the store and read operations on the multi-dimensional data according to the storage address.
Clause A2, the data processing apparatus of clause A1, wherein the data conversion instruction comprises an identification of the descriptor and/or content of the descriptor, the content of the descriptor comprising at least one shape parameter representing a shape of the multi-dimensional data and at least one address parameter representing an address of the multi-dimensional data.
Clause A3, the data processing apparatus of clause A2, wherein the address parameter of the multi-dimensional data comprises a base address of a data reference point of the descriptor in a data storage space of the multi-dimensional data.
Clause A4, the data processing apparatus of clause A2, wherein the shape parameters of the multi-dimensional data comprise at least one of:
the size of the data storage space in at least one direction of N dimensional directions, the size of a storage area of the multi-dimensional data in at least one direction of the N dimensional directions, the offset of the storage area in at least one direction of the N dimensional directions, the positions of at least two vertexes at diagonal positions of the N dimensional directions relative to the data reference point, and the mapping relation between the data description position of the multi-dimensional data indicated by the descriptor and the data address, wherein N is an integer greater than or equal to zero.
Clause A5, the data processing apparatus of clause A2, wherein the data conversion instruction includes data volume information and/or inter-dimension offset information for performing store and read operations with respect to each dimension of the multi-dimensional data, and wherein the data volume information and/or inter-dimension offset information is determined from the address parameters and/or shape parameters in the descriptor.
Clause A6, the data processing apparatus of any one of clauses A1-A5, wherein the data processing apparatus further comprises an external memory storing multi-dimensional data, the data conversion instruction comprising a first descriptor and a second descriptor, wherein the data conversion circuit is configured to:
read the multi-dimensional data from the external memory and store it into the data cache circuit according to the first descriptor; and
read the multi-dimensional data from the data cache circuit and store it into the external memory according to the second descriptor.
Clause A7, the data processing apparatus of any one of clauses A1-A5, wherein the data conversion circuit is configured to perform store and read operations on the multi-dimensional data so as to perform one of the following conversion operations on the multi-dimensional data:
a data mirroring operation, a multi-angle data rotation operation, or a data transpose operation.
Clause A8, the data processing apparatus of any one of clauses A1-A5, wherein the data conversion instruction comprises an operating parameter, and the data conversion circuit is configured to perform data conversion on the multi-dimensional data in accordance with the operating parameter.
Clause A9, the data processing apparatus of clause A8, wherein the data conversion circuit is configured to:
perform, according to the operating parameter, store and read operations on one or more parts of the multi-dimensional data in the data cache circuit, so as to implement data conversion of the one or more parts of the multi-dimensional data.
Clause A10, the data processing apparatus of clause A8, wherein the data conversion circuit is configured to:
splice, according to the operating parameter, a plurality of parts of the converted multi-dimensional data read out from the data cache circuit for output.
Clause A11, the data processing apparatus of clause A8, wherein the data conversion circuit is configured to:
combine, according to the operating parameter, a plurality of parts of the unconverted multi-dimensional data read out from the data cache circuit for output.
Clause A12, the data processing apparatus of clause A8, wherein the data conversion circuit is configured to perform the following operations in accordance with the operating parameter:
storing the multi-dimensional data into the data cache circuit in a first dimension order of the multi-dimensional data; and
reading the multi-dimensional data from the data cache circuit in a second dimension order for output.
Clause A13, an integrated circuit chip comprising the data processing apparatus of any one of clauses A1-A12.
Clause A14, an electronic device comprising the integrated circuit chip of clause A13.
Clause A15, a board card comprising the integrated circuit chip of clause A13.
Clause A16, a method implemented by a data processing apparatus, wherein the data processing apparatus comprises a data cache circuit and a data conversion circuit, the method comprising:
caching multi-dimensional data using the data cache circuit; and
using the data conversion circuit to perform store and read operations on the multi-dimensional data in the data cache circuit according to a data conversion instruction to implement data conversion on the multi-dimensional data,
wherein the data conversion instruction includes a descriptor indicating a shape of the multi-dimensional data, and the descriptor is used to determine a storage address of the corresponding multi-dimensional data,
wherein the data conversion circuit is used to perform the store and read operations on the multi-dimensional data according to the storage address.
Clause A17, the method of clause A16, wherein the data conversion instruction comprises an identification of the descriptor and/or content of the descriptor, the content of the descriptor comprising at least one shape parameter representing a shape of the multi-dimensional data and at least one address parameter representing an address of the multi-dimensional data.
Clause A18, the method of clause A17, wherein the address parameter of the multi-dimensional data includes a base address of a data reference point of the descriptor in a data storage space of the multi-dimensional data.
Clause A19, the method of clause A17, wherein the shape parameters of the multi-dimensional data include at least one of:
the size of the data storage space in at least one direction of N dimensional directions, the size of a storage area of the multi-dimensional data in at least one direction of the N dimensional directions, the offset of the storage area in at least one direction of the N dimensional directions, the positions of at least two vertexes at diagonal positions of the N dimensional directions relative to the data reference point, and the mapping relation between the data description position of the multi-dimensional data indicated by the descriptor and the data address, wherein N is an integer greater than or equal to zero.
Clause A20, the method of clause A17, wherein the data conversion instruction includes data volume information and/or inter-dimension offset information for performing store and read operations with respect to each dimension of the multi-dimensional data, and wherein the data volume information and/or inter-dimension offset information is determined from the address parameters and/or shape parameters in the descriptor.
Clause A21, the method of any one of clauses A16-A20, wherein the data processing apparatus further comprises an external memory storing multi-dimensional data, the data conversion instruction comprising a first descriptor and a second descriptor, wherein the method uses the data conversion circuit to perform the steps of:
reading the multi-dimensional data from the external memory and storing it into the data cache circuit according to the first descriptor; and
reading the multi-dimensional data from the data cache circuit and storing it into the external memory according to the second descriptor.
Clause A22, the method of any one of clauses A16-A20, wherein the data conversion circuit is used to perform store and read operations on the multi-dimensional data so as to perform one of the following conversion operations on the multi-dimensional data:
a data mirroring operation, a multi-angle data rotation operation, or a data transpose operation.
Clause A23, the method of any one of clauses A16-A20, wherein the data conversion instruction includes an operating parameter, and the method includes using the data conversion circuit to perform data conversion on the multi-dimensional data according to the operating parameter.
Clause A24, the method of clause A23, wherein the following step is performed using the data conversion circuit:
performing, according to the operating parameter, store and read operations on one or more parts of the multi-dimensional data in the data cache circuit, so as to implement data conversion of the one or more parts of the multi-dimensional data.
Clause A25, the method of clause A23, wherein the following step is performed using the data conversion circuit:
splicing, according to the operating parameter, a plurality of parts of the converted multi-dimensional data read out from the data cache circuit for output.
Clause A26, the method of clause A23, wherein the following step is performed using the data conversion circuit:
combining, according to the operating parameter, a plurality of parts of the unconverted multi-dimensional data read out from the data cache circuit for output.
Clause A27, the method of clause A23, wherein the data conversion circuit is used to perform the following operations in accordance with the operating parameter:
storing the multi-dimensional data into the data cache circuit in a first dimension order of the multi-dimensional data; and
reading the multi-dimensional data from the data cache circuit in a second dimension order for output.
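The idea of storing data in one dimension order and reading it back in another can be sketched in a few lines of Python. This is only an illustration: the `store` and `read` helpers, the flat-list model of the cache, and the row-major storage layout are assumptions made for the example, not details given in the disclosure.

```python
from itertools import product
from typing import List, Sequence


def store(data: Sequence, shape: Sequence[int]) -> List:
    """Store elements into a flat buffer in the first (row-major) dimension order."""
    size = 1
    for extent in shape:
        size *= extent
    if len(data) != size:
        raise ValueError("data length does not match shape")
    return list(data)


def read(buffer: Sequence, shape: Sequence[int], order: Sequence[int]) -> List:
    """Read the buffer back while iterating the dimensions in `order`.

    `order` lists the stored dimensions from outermost to innermost, so
    order=(1, 0) on a two-dimensional buffer yields its transpose.
    """
    # Row-major strides of the stored layout.
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    out = []
    for idx in product(*(range(shape[d]) for d in order)):
        out.append(buffer[sum(i * strides[d] for i, d in zip(idx, order))])
    return out


cache = store([1, 2, 3, 4, 5, 6], shape=(2, 3))        # rows: [1 2 3], [4 5 6]
transposed = read(cache, shape=(2, 3), order=(1, 0))   # columns read first
```

Here the conversion happens entirely in the read-out order; the cached elements themselves are never moved.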
In the above embodiments of the present disclosure, the description of each embodiment has its own emphasis, and for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments. The technical features of the embodiments may be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the embodiments are described, but any combination of these technical features should be considered to fall within the scope of the present specification as long as the combination involves no contradiction.
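The mirroring and rotation operations named in clauses A7 and A22 can likewise be viewed as different read-out orders over the same cached data. The Python sketch below illustrates this for a two-dimensional matrix. The function names are hypothetical, and rotation is restricted here to multiples of 90 degrees, which is an assumption; the disclosure refers only to rotation at multiple angles.

```python
from typing import List

Matrix = List[List[int]]


def mirror_horizontal(m: Matrix) -> Matrix:
    """Read each cached row from its last element to its first."""
    return [row[::-1] for row in m]


def transpose(m: Matrix) -> Matrix:
    """Read the cached data column by column."""
    return [list(col) for col in zip(*m)]


def rotate_clockwise(m: Matrix, quarter_turns: int) -> Matrix:
    """Rotate by multiples of 90 degrees.

    One quarter turn reads each column from the bottom row to the top row.
    """
    out = [list(row) for row in m]
    for _ in range(quarter_turns % 4):
        out = [list(col) for col in zip(*out[::-1])]
    return out


cached = [[1, 2, 3],
          [4, 5, 6]]
mirrored = mirror_horizontal(cached)
turned = rotate_clockwise(cached, 1)
```

In this model, a quarter turn clockwise is equivalent to a transpose followed by a horizontal mirror, which is one way the three conversions could share the same store and read machinery.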

Claims (27)

1. A data processing apparatus comprising a data cache circuit and a data conversion circuit, wherein:
the data cache circuit is configured to cache multi-dimensional data; and
the data conversion circuit is configured to perform store and read operations on the multi-dimensional data in the data cache circuit according to a data conversion instruction to implement data conversion on the multi-dimensional data,
wherein the data conversion instruction includes a descriptor indicating a shape of the multi-dimensional data, and the descriptor is used to determine a storage address of the corresponding multi-dimensional data,
wherein the data conversion circuit is configured to perform the store and read operations on the multi-dimensional data according to the storage address.
2. The data processing apparatus according to claim 1, wherein the data conversion instruction comprises an identification of a descriptor and/or a content of a descriptor comprising at least one shape parameter representing a shape of the multi-dimensional data and at least one address parameter representing an address of the multi-dimensional data.
3. The data processing apparatus according to claim 2, wherein the address parameter of the multi-dimensional data comprises a base address of a data reference point of the descriptor in a data storage space of the multi-dimensional data.
4. The data processing apparatus of claim 2, wherein the shape parameters of the multi-dimensional data comprise at least one of:
the size of the data storage space in at least one direction of N dimensional directions, the size of a storage area of the multi-dimensional data in at least one direction of the N dimensional directions, the offset of the storage area in at least one direction of the N dimensional directions, the positions of at least two vertexes at diagonal positions of the N dimensional directions relative to the data reference point, and the mapping relation between the data description position of the multi-dimensional data indicated by the descriptor and the data address, wherein N is an integer greater than or equal to zero.
5. The data processing apparatus according to claim 2, wherein the data conversion instruction comprises data volume information and/or inter-dimension offset information for performing a store and read operation with respect to each dimension in the multi-dimensional data, and wherein the data volume information and/or inter-dimension offset information is determined from address parameters and/or shape parameters in the descriptor.
6. The data processing apparatus according to any of claims 1-5, wherein the data processing apparatus further comprises an external memory storing multi-dimensional data, the data conversion instruction comprising a first descriptor and a second descriptor, wherein the data conversion circuit is configured to:
read the multi-dimensional data from the external memory and store it into the data cache circuit according to the first descriptor; and
read the multi-dimensional data from the data cache circuit and store it into the external memory according to the second descriptor.
7. The data processing apparatus according to any of claims 1-5, wherein the data conversion circuit is configured to perform store and read operations on the multi-dimensional data so as to perform one of the following conversion operations on the multi-dimensional data:
a data mirroring operation, a multi-angle data rotation operation, or a data transpose operation.
8. The data processing apparatus according to any of claims 1-5, wherein the data conversion instruction comprises an operating parameter, and the data conversion circuit is configured to perform data conversion on the multi-dimensional data according to the operating parameter.
9. The data processing apparatus of claim 8, wherein the data conversion circuit is configured to:
perform, according to the operating parameter, store and read operations on one or more parts of the multi-dimensional data in the data cache circuit, so as to implement data conversion of the one or more parts of the multi-dimensional data.
10. The data processing apparatus of claim 8, wherein the data conversion circuit is configured to:
splice, according to the operating parameter, a plurality of parts of the converted multi-dimensional data read out from the data cache circuit for output.
11. The data processing apparatus of claim 8, wherein the data conversion circuit is configured to:
combine, according to the operating parameter, a plurality of parts of the unconverted multi-dimensional data read out from the data cache circuit for output.
12. The data processing apparatus according to claim 8, wherein the data conversion circuit is configured to perform the following operations in accordance with the operating parameter:
storing the multi-dimensional data into the data cache circuit in a first dimension order of the multi-dimensional data; and
reading the multi-dimensional data from the data cache circuit in a second dimension order for output.
13. An integrated circuit chip comprising a data processing device according to any one of claims 1 to 12.
14. An electronic device comprising the integrated circuit chip of claim 13.
15. A board card comprising the integrated circuit chip of claim 13.
16. A method implemented by a data processing apparatus, wherein the data processing apparatus comprises a data cache circuit and a data conversion circuit, the method comprising:
caching multi-dimensional data using the data cache circuit; and
using the data conversion circuit to perform store and read operations on the multi-dimensional data in the data cache circuit according to a data conversion instruction to implement data conversion on the multi-dimensional data,
wherein the data conversion instruction includes a descriptor indicating a shape of the multi-dimensional data, and the descriptor is used to determine a storage address of the corresponding multi-dimensional data,
wherein the data conversion circuit is used to perform the store and read operations on the multi-dimensional data according to the storage address.
17. The method of claim 16, wherein the data conversion instruction comprises an identification of a descriptor and/or a content of a descriptor comprising at least one shape parameter representing a shape of the multi-dimensional data and at least one address parameter representing an address of the multi-dimensional data.
18. The method of claim 17, wherein the address parameter of the multi-dimensional data comprises a base address of a data reference point of the descriptor in a data storage space of the multi-dimensional data.
19. The method of claim 17, wherein shape parameters of the multi-dimensional data comprise at least one of:
the size of the data storage space in at least one direction of N dimensional directions, the size of a storage area of the multi-dimensional data in at least one direction of the N dimensional directions, the offset of the storage area in at least one direction of the N dimensional directions, the positions of at least two vertexes at diagonal positions of the N dimensional directions relative to the data reference point, and the mapping relation between the data description position of the multi-dimensional data indicated by the descriptor and the data address, wherein N is an integer greater than or equal to zero.
20. The method of claim 17, wherein the data conversion instruction includes data volume information and/or inter-dimension offset information to perform a store and read operation with respect to each dimension in the multi-dimensional data, and wherein the data volume information and/or inter-dimension offset information is determined according to address parameters and/or shape parameters in the descriptor.
21. The method of any of claims 16-20, wherein the data processing apparatus further comprises an external memory storing multidimensional data, the data conversion instruction comprising a first descriptor and a second descriptor, wherein the method uses the data conversion circuitry to perform the steps of:
reading the multi-dimensional data from the external memory and storing it into the data cache circuit according to the first descriptor; and
reading the multi-dimensional data out of the data cache circuit and storing it into the external memory according to the second descriptor.
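The two-descriptor flow can be sketched in software as follows. This is only an illustration under stated assumptions: the external memory is modelled as a flat Python list, the cache as a dictionary keyed by element index, and `StridedDescriptor` with explicit per-dimension strides is a hypothetical simplification of the descriptor content, not a format taken from the claims.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StridedDescriptor:
    """Hypothetical descriptor: base address, shape, and per-dimension strides."""
    base: int
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]

    def address_of(self, index: Tuple[int, ...]) -> int:
        return self.base + sum(i * s for i, s in zip(index, self.strides))


def load_to_cache(memory: List[int], first: StridedDescriptor) -> Dict[Tuple[int, ...], int]:
    """Read the region described by `first` from external memory into the cache."""
    return {idx: memory[first.address_of(idx)]
            for idx in product(*(range(n) for n in first.shape))}


def store_from_cache(memory: List[int], cache: Dict[Tuple[int, ...], int],
                     second: StridedDescriptor) -> None:
    """Write cached elements to external memory at the addresses given by `second`."""
    for idx, value in cache.items():
        memory[second.address_of(idx)] = value


# Source: a 2x3 row-major matrix at address 0. Destination: address 6, with
# strides chosen so that the written data forms the 3x2 transpose.
memory = list(range(6)) + [0] * 6
first = StridedDescriptor(base=0, shape=(2, 3), strides=(3, 1))
second = StridedDescriptor(base=6, shape=(2, 3), strides=(1, 2))
store_from_cache(memory, load_to_cache(memory, first), second)
```

In this sketch the conversion is carried entirely by the difference between the two descriptors: the data values are never rearranged in the cache, only the addresses used when writing back differ.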
22. The method of any of claims 16-20, wherein a store and read operation is performed on the multi-dimensional data using the data conversion circuitry to perform a conversion operation on the multi-dimensional data that is one of:
a data mirroring operation, a data rotation operation of multiple angles, or a data transpose operation.
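Each of these conversions can be viewed as reading cached elements back in a different order from the one in which they were stored. The sketch below illustrates that idea for two-dimensional data only; the function name `read_converted` and the mode strings are hypothetical, and the cache is assumed to be a non-empty rectangular list of rows.

```python
from typing import List

Matrix = List[List[int]]


def read_converted(cache: Matrix, mode: str) -> Matrix:
    """Read a cached 2-D block back in the order that realizes `mode`."""
    rows, cols = len(cache), len(cache[0])
    if mode == "mirror":      # left-right mirror: reverse the column index
        return [[cache[r][cols - 1 - c] for c in range(cols)] for r in range(rows)]
    if mode == "transpose":   # swap the roles of the row and column indices
        return [[cache[r][c] for r in range(rows)] for c in range(cols)]
    if mode == "rotate90":    # clockwise: transpose while reversing the row index
        return [[cache[rows - 1 - r][c] for r in range(rows)] for c in range(cols)]
    if mode == "rotate180":   # reverse both indices
        return [[cache[rows - 1 - r][cols - 1 - c] for c in range(cols)]
                for r in range(rows)]
    raise ValueError(f"unsupported mode: {mode}")


block = [[1, 2, 3],
         [4, 5, 6]]
mirrored = read_converted(block, "mirror")
transposed = read_converted(block, "transpose")
rotated = read_converted(block, "rotate90")
```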
23. The method of any of claims 16-20, wherein the data conversion instruction includes operational parameters, and the method comprises using the data conversion circuit to perform data conversion on the multi-dimensional data according to the operational parameters.
24. The method of claim 23, wherein the following steps are performed using the data conversion circuit:
performing store and read operations on one or more parts of the multi-dimensional data in the data cache circuit according to the operational parameters, so as to implement data conversion of the one or more parts of the multi-dimensional data.
25. The method of claim 23, wherein the following steps are performed using the data conversion circuit:
splicing, according to the operational parameters, a plurality of parts of the converted multi-dimensional data read out from the data cache circuit for output.
26. The method of claim 23, wherein the following steps are performed using the data conversion circuit:
combining, according to the operational parameters, a plurality of parts of the unconverted multi-dimensional data read out from the data cache circuit for output.
27. The method of claim 23, wherein the data conversion circuit is used to perform the following operations in accordance with the operational parameters:
storing the multi-dimensional data into the data cache circuit according to a first dimension order of the multi-dimensional data; and
reading the multi-dimensional data from the data cache circuit in a second dimension order for output.
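Storing in one dimension order and reading in another can be sketched as a permuted traversal of a flat buffer. This is a minimal illustration, not the claimed circuit: the function `reorder`, the row-major storage layout, and the meaning given to `order` (output dimension k is taken from source dimension `order[k]`) are assumptions made for the example.

```python
from itertools import product
from typing import List, Sequence


def reorder(data: Sequence[int], shape: Sequence[int], order: Sequence[int]) -> List[int]:
    """Store `data` row-major by `shape`, then read it with dimensions permuted by `order`."""
    rank = len(shape)
    if sorted(order) != list(range(rank)):
        raise ValueError("order must be a permutation of the dimensions")
    total = 1
    for extent in shape:
        total *= extent
    if len(data) != total:
        raise ValueError("data length does not match shape")

    # Row-major strides of the first (stored) dimension order.
    strides = [1] * rank
    for d in range(rank - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]

    cache = list(data)  # store step: the buffer keeps the first dimension order
    out = []
    # Read step: iterate with the second dimension order outermost-to-innermost.
    for idx in product(*(range(shape[d]) for d in order)):
        address = sum(i * strides[d] for i, d in zip(idx, order))
        out.append(cache[address])
    return out


transposed = reorder(range(6), shape=(2, 3), order=(1, 0))
permuted = reorder(range(8), shape=(2, 2, 2), order=(2, 0, 1))
```

With `order=(1, 0)` on a 2x3 input the read sequence is the 3x2 transpose; higher-rank permutations follow the same address computation.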
CN202011036302.6A 2020-09-27 2020-09-27 Data processing device, integrated circuit chip, equipment and method for realizing the same Pending CN114282159A (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202011036302.6A CN114282159A (en) 2020-09-27 2020-09-27 Data processing device, integrated circuit chip, equipment and method for realizing the same
PCT/CN2021/110357 WO2022062682A1 (en) 2020-09-27 2021-08-03 Data processing device, integrated circuit chip, device, and implementation method therefor
US18/013,976 US20230297270A1 (en) 2020-09-27 2021-08-03 Data processing device, integrated circuit chip, device, and implementation method therefor
EP21871059.8A EP4220448A1 (en) 2020-09-27 2021-08-03 Data processing device, integrated circuit chip, device, and implementation method therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011036302.6A CN114282159A (en) 2020-09-27 2020-09-27 Data processing device, integrated circuit chip, equipment and method for realizing the same

Publications (1)

Publication Number Publication Date
CN114282159A true CN114282159A (en) 2022-04-05

Family

ID=80867749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011036302.6A Pending CN114282159A (en) 2020-09-27 2020-09-27 Data processing device, integrated circuit chip, equipment and method for realizing the same

Country Status (1)

Country Link
CN (1) CN114282159A (en)

Similar Documents

Publication Publication Date Title
CN110096310B (en) Operation method, operation device, computer equipment and storage medium
US20210150325A1 (en) Data processing method and apparatus, and related product
EP3825842B1 (en) Data processing method and apparatus, and related product
US20240111536A1 (en) Data processing apparatus and related products
US20240004650A1 (en) Data processing method and apparatus, and related product
CN112347186B (en) Data synchronization method and device and related product
CN114282159A (en) Data processing device, integrated circuit chip, equipment and method for realizing the same
WO2021027973A1 (en) Data synchronization method and device, and related products
CN112306945B (en) Data synchronization method and device and related products
CN114691353A (en) Tensor reading method and device and related product
WO2022062682A1 (en) Data processing device, integrated circuit chip, device, and implementation method therefor
CN114489799A (en) Processing method, processing device and related product
CN112395008A (en) Operation method, operation device, computer equipment and storage medium
CN114282160A (en) Data processing device, integrated circuit chip, equipment and implementation method thereof
CN113867800A (en) Computing device, integrated circuit chip, board card, electronic equipment and computing method
CN114692844A (en) Data processing device, data processing method and related product
CN112395009A (en) Operation method, operation device, computer equipment and storage medium
CN111061507A (en) Operation method, operation device, computer equipment and storage medium
CN112395002B (en) Operation method, device, computer equipment and storage medium
CN112232498B (en) Data processing device, integrated circuit chip, electronic equipment, board card and method
CN114489802A (en) Data processing device, data processing method and related product
WO2022001499A1 (en) Computing apparatus, chip, board card, electronic device and computing method
CN112396170B (en) Operation method, device, computer equipment and storage medium
CN114282161A (en) Matrix conversion circuit, matrix conversion method, integrated circuit chip, computing device and board card
CN114489803A (en) Processing device, processing method and related product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination