CN109992541B - Data carrying method, computing device and computer storage medium - Google Patents

Data carrying method, computing device and computer storage medium

Info

Publication number
CN109992541B
CN109992541B CN201711473331.7A CN201711473331A CN109992541B CN 109992541 B CN109992541 B CN 109992541B CN 201711473331 A CN201711473331 A CN 201711473331A CN 109992541 B CN109992541 B CN 109992541B
Authority
CN
China
Prior art keywords
data
target data
data block
storage
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711473331.7A
Other languages
Chinese (zh)
Other versions
CN109992541A (en)
Inventor
王和国
黎立煌
李炜
曹庆新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN201711473331.7A priority Critical patent/CN109992541B/en
Publication of CN109992541A publication Critical patent/CN109992541A/en
Application granted granted Critical
Publication of CN109992541B publication Critical patent/CN109992541B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Abstract

The embodiment of the invention provides a data carrying method, a related product and a computer storage medium. The method is applied to a computing device, the computing device comprises a storage medium, a register unit and a storage control unit, and the method comprises the following steps: the storage control unit acquires a data carrying instruction, wherein the data carrying instruction comprises indication information, and the indication information is used for indicating target data to be carried and a storage format of a target data block; the first data block in which the target data is located comprises C data blocks of size H1*W1, where C is the number of channels required for transmitting the first data block; the target data block comprises C data blocks of size H2*W2, where H2 is less than or equal to H1 and W2 is less than or equal to W1; and then, according to the storage format of the target data block, the target data is carried between the storage medium and the register unit. By adopting the embodiment of the invention, data can be carried efficiently.

Description

Data carrying method, computing device and computer storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data handling method, a related product, and a computer storage medium.
Background
With the development of and intensive research into Artificial Intelligence (AI), chips place increasingly high demands on data supply. In prior-art multi-dimensional Direct Memory Access (DMA) data carrying methods, the carried data is stored at continuous storage addresses. Such methods consider only how to carry the data and not how the data is subsequently processed, which complicates subsequent read/write operations on the data. For example, in a deep learning model, this data storage method is poorly matched to the arithmetic unit, which lowers the data read/write speed and therefore the data processing efficiency.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a data handling method, a related product, and a computer storage medium, which can improve the efficiency of data transmission.
In a first aspect, an embodiment of the present invention discloses a data handling method, which is applied to a computing device, where the computing device includes a storage medium, a register unit, and a storage control unit, and the method includes:
the computing device controls the storage control unit to obtain a data carrying instruction, wherein the data carrying instruction comprises indication information, and the indication information is used for indicating target data to be carried and a storage format of a target data block; the first data block where the target data is located comprises C data blocks of size H1*W1, where C is the number of channels required for transmitting the first data block, and H1 and W1 are the height and width of the first data block respectively; the target data block comprises C data blocks of size H2*W2, where H2 and W2 are the height and width of the target data block respectively, H2 is smaller than or equal to H1, and W2 is smaller than or equal to W1; the storage format of the target data block is used for indicating the storage format of the target data;
and the computing device controls the storage control unit to finish the transportation of the target data between the storage medium and the register unit according to the storage format of the target data block.
In a second aspect, an embodiment of the present invention further provides a computing apparatus, including a storage medium, a register unit, and a storage control unit,
the storage medium and the register unit are used for storing data;
the storage control unit is used for acquiring a data carrying instruction, wherein the data carrying instruction comprises indication information, and the indication information is used for indicating target data to be carried and a storage format of a target data block; the first data block where the target data is located comprises C data blocks of size H1*W1, where C is the number of channels required for transmitting the first data block, and H1 and W1 are the height and width of the first data block respectively; the target data block comprises C data blocks of size H2*W2, where H2 and W2 are the height and width of the target data block respectively, H2 is smaller than or equal to H1, and W2 is smaller than or equal to W1; the storage format of the target data block is used for indicating the storage format of the target data;
the storage control unit is further used for completing the transportation of the target data between the storage medium and the register unit according to the storage format of the target data block.
In a third aspect, an embodiment of the present invention further provides a computing apparatus, including: a processor, a memory, a communication interface, and a bus; the processor, the memory and the communication interface are connected through the bus and complete mutual communication; the memory stores executable program code; the processor executes a program corresponding to the executable program code by reading the executable program code stored in the memory to perform the method as described above in the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing program codes for data handling. The program code comprises instructions for performing the method described in the first aspect above.
A storage control unit in a computing device of an embodiment of the present invention obtains a data carrying instruction, where the data carrying instruction includes indication information, and the indication information is used to indicate target data to be carried and a storage format of a target data block; the first data block where the target data is located comprises C data blocks of size H1*W1, where C is the number of channels required for transmitting the first data block, and H1 and W1 are the height and width of the first data block respectively; the target data block comprises C data blocks of size H2*W2, where H2 and W2 are the height and width of the target data block respectively, H2 is smaller than or equal to H1, and W2 is smaller than or equal to W1; the storage format of the target data block is used for indicating the storage format of the target data; and then, according to the storage format of the target data block, the target data is carried between the storage medium and the register unit. By adopting the embodiment of the invention, data can be carried efficiently.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic structural diagram of a computing device according to an embodiment of the present invention;
FIGS. 2A and 2B are schematic diagrams of two data stores provided by embodiments of the present invention;
FIG. 3 is a flow chart illustrating a data handling method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of yet another data store provided by an embodiment of the present invention;
FIG. 5A is a diagram of image data provided by an embodiment of the invention;
FIGS. 5B and 5C are schematic diagrams of two other data stores provided by embodiments of the present invention;
FIG. 6A is a schematic structural diagram of a first memory control unit according to an embodiment of the present invention;
FIG. 6B is a flowchart illustrating another data handling method according to an embodiment of the present invention;
FIG. 7A is a diagram illustrating a second memory control unit according to an embodiment of the present invention;
FIG. 7B is a flowchart illustrating another data handling method according to an embodiment of the present invention;
FIG. 8A is a schematic structural diagram of another computing device according to an embodiment of the present invention;
FIG. 8B is a schematic structural diagram of another computing device according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," and "third" (if any) in the description and claims of the invention and the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprises" and any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
In order to solve problems such as the low data carrying efficiency in the prior art, the present application provides an efficient data carrying method that is applied to a computing device. FIG. 1 shows a schematic diagram of a possible computing device. As shown in FIG. 1, the computing apparatus 100 includes a storage medium 102, a register unit 104, a storage control unit 106, and an arithmetic unit 108, wherein:
the storage medium 102 is used for storing external data, where external data refers to data stored outside the neural network processor. Accordingly, the storage medium may be an external memory, a cache, a Double Data Rate (DDR) memory, or the like, that is independent of the neural network processor.
The register unit 104 is used for storing internal data, where the internal data refers to data stored inside the neural network processor, and the register unit may be a memory installed inside the neural network processor, such as a cache, a flash memory, and the like. In the present application, the external data and the internal data are collectively referred to as data, which includes, but is not limited to, image data, text data, voice data, and the like. The neural network processor may also be referred to as a deep learning processor, or other processor/chip for processing big data, which is not limited in this application.
The storage control unit 106 (Enhanced Direct Memory Access, EDMA) is configured to implement data transfer between the storage medium and the register unit. Specifically, the storage control unit includes a first storage control unit 1062 (Enhanced Output Direct Memory Access, EoDMA) and a second storage control unit 1064 (Enhanced Input Direct Memory Access, EiDMA). The first storage control unit EoDMA is configured to transfer the data in the register unit outward to the storage medium for storage. The second storage control unit EiDMA is configured to transfer the data in the storage medium inward to the register unit for storage. The handling of data between the register unit and the storage medium is explained in detail below.
The operation unit 108 is used for performing operations on the data in the register unit, such as convolution operations, inner product operations, and the like. It should be understood that, in order to adapt to the data processing requirements of the neural network model (which may also be a deep learning model), a plurality of sets of operation units, also referred to as multiply-accumulate (MAC) operation units, are configured for the data stored in the register unit. Each set of operation units may be used to process the corresponding data stored in the register unit, and each set includes, but is not limited to, any one or a combination of the following: adders, multipliers, non-linear operators, and the like.
Taking the data as an example of image data, assuming that the computing device obtains image data to be processed from the storage medium, the computing device needs to read the image data into the register unit inside the neural network processor, and then input the image data to be processed into the operation unit to perform operation associated with a neural network model (e.g., perform convolutional neural network operation, etc.), thereby obtaining an output result. Accordingly, the computing device may store the output result calculated by the arithmetic unit into the register unit, and optionally may also send the output result stored in the register unit to the storage medium outside the neural network processor for viewing by a user, or the like.
It should be understood that the granularity supporting data processing within a neural network processor is a channel, and in particular refers to a block/data transmitted through a channel. Wherein the channels include an input channel CI and an output channel CO. Wherein, CI is used for realizing data transmission from the storage medium to the register unit; the CO is used to enable data transfer of the register unit to the storage medium. That is, the storage medium needs to transfer data to the register unit through the input channel CI for storage, and the register unit needs to transfer data to the storage medium through the output channel CO for storage.
Furthermore, the minimum granularity of data storage supported in the register unit inside the neural network processor is a pixel. Optionally, a plurality of pixels may be merged and defined as a data slice (slice), and a plurality of data slices may be merged and defined as a data block (bank). That is, the register unit supports storing data in the storage format of the data block (bank).
The number of pixels included in a data slice and the number of data slices included in a data block may be set in a customized manner by the user side or the computing device side, which is not limited in the present application. For example, 7 pixels may be merged into one slice, and 2 slices may be merged into one bank.
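As a rough illustration of the pixel, slice and bank grouping described above, the following Python sketch groups a flat list of pixel values into slices of 7 pixels and banks of 2 slices, using the example values from the text. The function name `group_pixels` and the use of `None` as a marker for positions that hold no valid data are assumptions made for illustration; the application does not prescribe them.

```python
from typing import List, Optional

PIXELS_PER_SLICE = 7  # example value from the text; configurable in practice
SLICES_PER_BANK = 2   # example value from the text; configurable in practice


def group_pixels(pixels: List[int],
                 pad: Optional[int] = None) -> List[List[List[Optional[int]]]]:
    """Group a flat pixel list into banks, each made of slices of pixels.

    The tail is padded with `pad` (a hypothetical marker for invalid data)
    so that every slice and every bank has the same size.
    """
    per_bank = PIXELS_PER_SLICE * SLICES_PER_BANK
    padded = list(pixels)
    remainder = len(padded) % per_bank
    if remainder:
        padded.extend([pad] * (per_bank - remainder))

    banks = []
    for start in range(0, len(padded), per_bank):
        bank = [padded[s:s + PIXELS_PER_SLICE]
                for s in range(start, start + per_bank, PIXELS_PER_SLICE)]
        banks.append(bank)
    return banks


# 16 pixels need two banks of 14 positions; the second bank is mostly padding.
banks = group_pixels(list(range(16)))
```

In this sketch the padding positions play the role of the invalid or empty positions in the register unit; how real hardware marks them is not specified here.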
FIG. 2A shows a schematic diagram of data storage in a register unit. As shown in FIG. 2A, the data blocks bank0 and bank1 corresponding to two channels CI/CO are shown, wherein each data block is composed of two data slices, and each data slice is composed of 7 pixels in the width direction (i.e., row direction). The part corresponding to the black line represents valid data (such as image data) stored in the register unit; the remaining part represents invalid data stored in the register unit, or positions where no data is stored. Taking image data as an example, FIG. 2A shows a schematic diagram of storing 8 × 4 image data in the register unit.
The storage medium external to the neural network processor supports continuous storage of data. Specifically, in the storage medium, data is stored sequentially in the transmission order of the channels CI/CO (e.g., CI0, CI1, CI2, etc.). FIG. 2B shows a schematic diagram of data storage in a storage medium. Referring to FIG. 2B, the data blocks of two CI/CO channels are stored in the storage medium in a continuous manner: all data in the data block corresponding to the CI/CO0 channel is stored first, in the direction of the arrow shown in the figure, and after the data block of the CI/CO0 channel has been stored, all data contained in the data block of the next channel, CI/CO1, is stored. For the data block of each channel, the data in the data block may be stored sequentially, row by row, from low to high. Taking the data block of the CI/CO0 channel in FIG. 2B as an example, the first row of data may be stored first, and then the second row of data.
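The continuous storage order described above (channel after channel, and row after row within each channel) can be sketched in Python as follows. The helper names `linear_offset` and `flatten_blocks`, and the choice to count offsets in elements rather than bytes, are assumptions for illustration only.

```python
def linear_offset(c: int, h: int, w: int, height: int, width: int) -> int:
    """Offset, in elements, of element (c, h, w) when C blocks of size
    height x width are stored one after another, each block row by row."""
    return c * height * width + h * width + w


def flatten_blocks(blocks):
    """Store the blocks contiguously: channel 0 first, then channel 1, ..."""
    flat = []
    for block in blocks:   # channel order: CI/CO0, CI/CO1, ...
        for row in block:  # first row, then second row, ...
            flat.extend(row)
    return flat


blocks = [
    [[0, 1, 2], [3, 4, 5]],        # channel 0, H = 2, W = 3
    [[10, 11, 12], [13, 14, 15]],  # channel 1
]
flat = flatten_blocks(blocks)
```

With this layout, `flat[linear_offset(c, h, w, 2, 3)]` returns the same value as `blocks[c][h][w]`.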
It should be noted that, in the present application, the processing object of the neural network processor in the computing apparatus is a neural network model (which may also be a deep learning model). The neural network model may be represented as N input channels CI and M output channels CO, and the data block transmitted through a channel CI/CO may be represented as H × W; that is, it may be understood as matrix data whose height is H and whose width is W. The data at each point of the matrix may be pixel data with a bit width of X bits, where X is set in a customized manner by the user side or the computing device side; for example, the bit width of data in an existing computer is set to 8 bits.
In the present application, for a larger neural network model, limited by the storage capacity of the register unit inside the neural network processor, the neural network processor cannot completely buffer data (such as intermediate data and result data) operated by the neural network model in the register unit, and therefore, the intermediate data of the neural network model needs to be outputted from the register unit to the external storage medium, and then the relevant data processing of the subsequent neural network model is completed. That is, the data transfer between the register unit and the storage medium is involved in the operation of the neural network processor.
Referring to fig. 3, a flow chart of a data transfer method according to an embodiment of the invention is shown based on the computing apparatus shown in fig. 1. The method as shown in fig. 3 comprises the following implementation steps:
Step S202, the storage control unit 106 obtains a data carrying instruction, where the data carrying instruction includes indication information, and the indication information is used to indicate the target data to be carried and the storage format of the target data block. The storage format of the target data block is used for indicating the storage format adopted when the target data is stored.
In this application, the target data may be data in a first data block, and the first data block may be denoted as C*H1*W1, i.e., it is composed of C data blocks of size H1*W1. Here C represents the number of channels required to transmit the data block, and H1 and W1 are respectively the height and width of the data block. As can be understood from FIG. 2A, H1 is the height of the data block in the height direction (i.e., column direction) and W1 is the width of the data block in the width direction (i.e., row direction).
For example, when an image is transmitted by the computing device, the image data corresponding to the image can be represented as C*H1*W1, where C denotes the number of channels through which the image is transmitted and H1*W1 represents the size of the image in a two-dimensional plane. When C is 1, the image may be a grayscale image; when C is 3, the image may be a color image. For example, for an RGB image, the Red data of the image may be transmitted in one channel, the Green data in one channel, and the Blue data in one channel. Accordingly, the RGB image may be composed of the color data transmitted in the three channels, which is not described in detail in the present application.
The target data block belongs to the first data block. The target data block comprises C data blocks of size H2*W2 and includes the target data, where H2 is less than or equal to H1 and W2 is less than or equal to W1. That is, the target data block is either a part of the first data block or the first data block itself, which is not limited in this application.
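To make the relationship between the first data block (C*H1*W1) and the target data block (C*H2*W2) concrete, the following Python sketch cuts an H2*W2 window out of every channel of a first data block. The function name `extract_target_block` and the `top`/`left` position parameters are hypothetical; the text only requires that H2 is at most H1 and W2 is at most W1.

```python
def extract_target_block(first_block, top, left, h2, w2):
    """Return the C x h2 x w2 sub-block whose upper-left corner is at
    (top, left) in every channel of the C x H1 x W1 first data block."""
    h1 = len(first_block[0])
    w1 = len(first_block[0][0])
    if not (0 < h2 <= h1 and 0 < w2 <= w1):
        raise ValueError("target block must satisfy H2 <= H1 and W2 <= W1")
    if top < 0 or left < 0 or top + h2 > h1 or left + w2 > w1:
        raise ValueError("target block must lie inside the first data block")
    return [[row[left:left + w2] for row in channel[top:top + h2]]
            for channel in first_block]


# First data block with C = 2, H1 = 3, W1 = 4; each value encodes (c, h, w).
first_block = [[[c * 100 + h * 10 + w for w in range(4)]
                for h in range(3)]
               for c in range(2)]
target = extract_target_block(first_block, top=1, left=1, h2=2, w2=2)
```

Choosing `h2 = 3` and `w2 = 4` with `top = left = 0` returns the whole first data block, which corresponds to the case where the target data block is the first data block itself.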
Step S204, the storage control unit 106 completes the transportation of the target data between the storage medium and the register unit according to the storage format of the target data block.
Some specific examples to which the present application relates are set forth below.
The indication information comprises first indication information and second indication information. The first indication information is used for indicating the storage format, in the storage medium, of a first target data block in which the target data is located. In a scenario of carrying data from the storage medium to the register unit, the first indication information is specifically used for indicating how to read the target data to be carried from the storage medium so as to carry it to the register unit. In a scenario of carrying data from the register unit to the storage medium, the first indication information is specifically used for indicating how to store, in the storage medium, the target data to be carried that has been read from the register unit.
Specifically, the first indication information includes any one or a combination of more of the following storage information: the storage address of the ith data block in the storage medium, the width W2 of the ith data block in the width direction (which may also be referred to as a storage length), storage offset values, and so on. The storage address includes, but is not limited to, a storage head address (i.e., a starting storage address), a storage tail address (i.e., an ending storage address), and the like. The storage offset values may include, but are not limited to, a first storage offset value and a second storage offset value. The first storage offset value refers to the storage interval between every two rows of data in the height direction (i.e., column direction) of the ith data block, and may also be understood as the storage interval between the rows of data in the ith data block. The second storage offset value refers to the storage interval between two adjacent data blocks in the storage medium, that is, the storage interval between the data blocks of two adjacent channels. Specifically, it may be the difference between the respective storage head addresses or storage tail addresses of two consecutive data blocks (for example, the ith data block and the (i+1)th data block or the (i-1)th data block) in the storage medium. See, in particular, the description below in connection with FIG. 5B.
Wherein the ith data block is any one of C data blocks in the first target data block, and the size of the data blocks is H2*W2. Wherein the first target data block is a data block in the first data block.
Preferably, the first indication information may be used to indicate a storage address of the first data block and a storage offset value. The data block here refers to the data block constituting the first target data block, the first data block refers to the data block transmitted in the first channel CI/CO0, and the storage address may be a storage head address and/or a storage tail address. The storage offset value may include a difference between a storage head address of a data block of the first channel CI/CO0 and a storage head address of a data block of the second channel CI/CO1 in the storage medium, and may further include a storage interval of every two lines of data in the first data block in the storage medium.
It should be understood that the data blocks of all channels constituting the first target data block have the same size, namely H2*W2, and that the storage interval (i.e., the storage offset value in the present application) between the data blocks of adjacent channels is the same. Therefore, when designing the data carrying instruction, only the storage head/tail address and the storage offset values of the data block of one channel need to be specified, which saves instruction overhead.
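The addressing implied by the first indication information can be sketched as follows. This is a minimal Python illustration, not the format of the actual instruction: the class `FirstIndication`, its field names, and the functions `element_address` and `read_block` are hypothetical. It also assumes that both storage offset values are measured in elements between the start of one row (or block) and the start of the next, and that the block height is supplied separately.

```python
from dataclasses import dataclass


@dataclass
class FirstIndication:
    head_address: int    # storage head address of the block of channel CI/CO0
    width: int           # W2, the storage length of one row
    row_offset: int      # first storage offset: interval between two rows
    channel_offset: int  # second storage offset: interval between two channels


def element_address(info: FirstIndication, c: int, h: int, w: int) -> int:
    """Address of element (c, h, w), derived from one head address and two
    offsets that are shared by every channel."""
    if not 0 <= w < info.width:
        raise IndexError("column index outside the stored row")
    return info.head_address + c * info.channel_offset + h * info.row_offset + w


def read_block(memory, info: FirstIndication, channels: int, height: int):
    """Read a channels x height x W2 target block from a flat memory."""
    return [[[memory[element_address(info, c, h, w)]
              for w in range(info.width)]
             for h in range(height)]
            for c in range(channels)]


memory = list(range(100))  # stand-in for the storage medium
info = FirstIndication(head_address=5, width=2, row_offset=4, channel_offset=12)
block = read_block(memory, info, channels=2, height=3)
```

Because every channel shares the same two offsets, the sketch needs only one head address to locate all C blocks, which mirrors the instruction-overhead saving mentioned above.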
Correspondingly, the second indication information is used for indicating the storage format, in the register unit, of the second target data block in which the target data is located. In a scenario of carrying data from the storage medium to the register unit, the second indication information is specifically used for indicating how to store, in the register unit, the target data to be carried that has been read from the storage medium. In a scenario of carrying data from the register unit to the storage medium, the second indication information is specifically used for indicating how to read the target data to be carried from the register unit so as to store it in the storage medium.
Specifically, the second indication information includes at least one of multidimensional information indicating the target data to be transported or indicating a storage location where the target data needs to be transported/stored. The multidimensional information specifically includes the following:
Dimension 1: a first carrying indication of the data block in the width direction (i.e., row direction). The first carrying indication indicates that one piece of target data to be carried is determined from every K1 consecutive pieces of data. K1 is a positive integer. When K1 is 1, each piece of data stored in the row direction in the data block is target data and needs to be carried.
Specifically, in a scenario of carrying data from the storage medium to the register unit, the first carrying indication is specifically used for indicating that, in every K1 consecutive pieces of data in the width direction, one piece of data is determined for storing the target data to be carried that is read from the storage medium. For example, when K1 is 1, every piece of data stored in the row direction in the data block is target data to be carried that the computing device has acquired from the storage medium, stored continuously.
In a scenario of carrying data from the register unit to the storage medium, the first carrying indication is specifically used for indicating that one piece of data is extracted from every K1 consecutive pieces of data in the width direction as the target data to be carried to the storage medium. For example, when K1 is 1, each piece of data stored in the row direction in the data block is target data to be carried, that is, each piece of data in the row direction needs to be carried by the computing device to the storage medium for storage.
It should be understood that the target data is valid data, and the data stored in the data block may also be valid data or invalid data in addition to the target data, and preferably may be invalid data for distinguishing, which is not limited in this application.
Optionally, the invalid data may be stored and represented in the data block in the form of preset characters, preset numerical values, preset letters, and the like. For example, the invalid data may be replaced by 0 or invalid bubble data, and the like, and the present application is not limited thereto.
Dimension 2: the width W2 of the data block in the width direction (i.e., row direction), i.e., the width of all data contained in the width direction of the data block, wherein the data may be valid data and/or invalid data. Alternatively, the width may also be the width of the valid data in the width direction of the data block, and the like.
Dimension 3: a second carrying indication of the data block in the height direction (i.e., column direction). The second carrying indication indicates that one row of target data to be carried is determined from every K2 consecutive rows of data. K2 is a positive integer. When K2 is 1, each row of data in the column direction in the data block is target data and needs to be carried.
Specifically, in a scenario of carrying data from the storage medium to the register unit, the second carrying indication is specifically used for indicating that, in every K2 consecutive rows of data in the height direction, one row of data is determined for storing the target data to be carried that is read from the storage medium. For example, when K2 is 1, every row of data in the height direction in the data block is used for storing the target data to be carried that the computing device has read from the storage medium.
In a scenario of carrying data from the register unit to the storage medium, the second carrying indication is specifically used for indicating that one row of data is extracted from every K2 consecutive rows of data in the height direction as the target data to be carried to the storage medium. For example, when K2 is 1, each row of data stored in the height direction in the data block is target data to be carried, that is, each row of data in the height direction needs to be carried by the computing device to the storage medium for storage.
For the related description of the target data and the invalid data, reference may be made to the foregoing embodiments, which are not described herein again.
Dimension 4: the height H of the data block in the height direction (i.e., in the column direction) is the height of all data contained in the data block in the height direction. Optionally, the data may be valid data and/or invalid data. Alternatively, the height may refer to a height of valid data in a height direction of the data block, and the like.
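As an illustration of dimension 3, the following minimal sketch shows how a K2 value could drive both directions of carrying. It assumes that the target row is the first row of each group of K2 rows and that invalid data is represented by 0; neither choice is fixed by this application, and the function names are hypothetical.

```python
INVALID = 0  # assumed placeholder for invalid data (0 is one option named in the text)


def scatter_rows(target_rows, k2, width):
    """Storage medium -> register unit: put each target row in the first row
    of a group of k2 consecutive rows; the other rows hold invalid data."""
    block = []
    for row in target_rows:
        block.append(list(row))
        block.extend([[INVALID] * width for _ in range(k2 - 1)])
    return block


def gather_rows(block, k2):
    """Register unit -> storage medium: extract one row out of every k2 rows."""
    return [block[i] for i in range(0, len(block), k2)]


rows = [[1, 2], [3, 4]]
block = scatter_rows(rows, k2=2, width=2)   # 4 rows, every second one invalid
recovered = gather_rows(block, k2=2)        # the 2 original target rows
```

With K2 = 1 both functions leave the rows unchanged, which matches the case where every row is target data.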
It should be understood that the above dimension 1 to dimension 4 define the storage format of the data block of one channel CI/CO in the register unit. Optionally, the multidimensional information may further include:
Dimension 5: the storage interval between every two data blocks in the width direction (i.e., row direction), and/or the number M1 of data blocks stored in the width direction (i.e., row direction), where M1 is a positive integer.
Dimension 6: the storage interval between every two data blocks in the height direction (i.e., column direction), and/or the number M2 of data blocks stored in the height direction (i.e., column direction), where M2 is a positive integer.
It will be appreciated that the storage space of the register unit in the neural network processor is limited. M1 and M2 may be custom-set on the computing device side or the user side, which is not limited in this application. In addition, the processing capability of the neural network processor (i.e., the data carrying capability of the computing device) limits the amount of data that can be carried in each round, so carrying the data may take multiple rounds. Optionally, the multidimensional information may further include:
Dimension 7: the storage interval of the data blocks between every two rounds in the height direction (i.e., column direction). The total number of data blocks contained in each round (i.e., the batch of data blocks referred to herein) is M1*M2.
Optionally, the number M1*M2 of data blocks contained in each round (i.e., each batch of data blocks) may be custom-set on the user side or the computing device side, which is not limited in this application.
Optionally, the indication information may further include other parameter information, for example, information such as a priority of the data block, which is not limited in this application.
Accordingly, a schematic diagram of data storage is given as fig. 4. As shown in fig. 4, dimension 1 to dimension 7 respectively define the storage format of the channel CI/CO data block in the register unit, and also define the target data to be carried in the data block or the storage location of the target data.
Based on this, there are two specific embodiments of step S204 as follows.
In the first embodiment, if the data transfer command is used to instruct to transfer data from the storage medium to the register unit, the storage control unit reads the target data to be transferred in the first target data block from the storage medium according to the instruction of the first instruction information. For example, the storage control unit may obtain all target data that needs to be transported in the first target data block in which the target data is located according to the storage address (specifically, the storage head address and the storage tail address) and the storage offset value of the ith data block in the first indication information. And then, according to the indication of the second indication information, storing the target data into the register unit according to the storage format of the second target data block in which the target data is located.
A specific implementation of step S204 is set forth below.
Taking image data as the target data as an example, fig. 5A shows a schematic plan view of the image data to be carried. This image data may also be referred to herein as a first data block, and the image data (first data block) shown in fig. 5A may be denoted as C*H1*W1, i.e., C data blocks of size H1*W1. As shown in the figure, assume W1 = (2c+d) and H1 = (a+b).
Accordingly, a schematic diagram in which the image data (first data block) is stored in the storage medium is shown in fig. 5B. Wherein, fig. 5B shows that the data blocks in the two channels CI/CO store data in the storage medium in a continuous manner. For example, all data in the data blocks corresponding to the CI/CO0 channel are stored in the first CI/CO0 channel continuously from the low order to the high order in the direction of the arrow shown in the figure, and after the data block of the CI/CO0 channel is stored, all data included in the data block of the next CI/CO1 channel is stored.
Suppose the second indication information in the data transfer instruction in S202 indicates that the register unit supports storing the target data in the storage format of the second target data block, where the second target data block is C*H2*W2, H2 is less than or equal to H1, and W2 is less than or equal to W1. It is assumed here that W2 = c and H2 = a. The size H1*W1 of the first data block where the image data is located is then larger than the size H2*W2 of the second target data block. Accordingly, when the data is actually carried, the first data block needs to be divided so that each second data block obtained by the division has size H2*W2. A second data block is a data block that forms part of the first data block, i.e., a data block within the first data block.
It should be understood that when the size of a second data block after division is not H2*W2, it may be filled with invalid data to reach the size H2*W2 of the second target data block that the register unit supports storing. Optionally, only the width of the second data block in the width direction may be constrained; that is, when the width of a divided second data block in the width direction is not n*W2 (i.e., an integer multiple of W2), the upper bits of the second data block in the width direction may be filled with invalid data so that its width in the width direction reaches n*W2, where n is a positive integer.
Since the data in the divided second data block is all target data, this can also be understood as follows: when the total bit width of the target data (i.e., the total bit width of the second data block in the width direction) does not reach the bit width n*W2*X required by the storage format of the target data block, invalid data may be filled in to reach that bit width, where X represents the bit width of each data item in the target data block and n is a positive integer. Specifically, the invalid data may be filled into the upper bits of the target data block, and the like, which is not described in detail in this application.
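The bit-width padding rule above can be worked through numerically. The sketch below is only an illustration: the function name is hypothetical, and it assumes that n is chosen as the smallest positive integer for which n*W2 data items cover the valid data.

```python
def padded_bit_width(valid_count, w2, x):
    """Return (n, total_bits, padding_bits) for one row holding valid_count
    data items, padded up to a multiple of w2 items of x bits each."""
    # Smallest positive n such that n * w2 >= valid_count (ceiling division).
    n = max(1, -(-valid_count // w2))
    total_bits = n * w2 * x
    padding_bits = total_bits - valid_count * x
    return n, total_bits, padding_bits


# A row with 2 valid items, W2 = 4 and X = 8 bits per item:
# n = 1, 32 bits in total, of which the upper 16 bits are invalid padding.
n, total_bits, padding_bits = padded_bit_width(valid_count=2, w2=4, x=8)
```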
As shown in fig. 5A, 6 second data blocks, which are respectively Tile0 to Tile5, are obtained by dividing the first data block (i.e., image data) by the size c × a of the second target data block. Among them, the dimensions of Tile2 and Tile5 in the width direction as in fig. 5A are obviously not c; accordingly, the upper bits of the second data block may be filled with invalid data so that the size of the second data block in the width direction reaches c. That is, the storage bit width occupied by the divided second data block in the width direction is n × c × X (i.e., a positive integer multiple of c × X). Where X represents the Bit width of each data in the second target data block, e.g., 8 bits, etc.
Accordingly, when the image data in the second data blocks Tile2 and Tile5 is transferred/written into the register unit, since the original width (2c+d) of the image data is known, the valid data in the first target data block (here, the second data block Tile2 or Tile5) where the image data currently to be transferred is located can be determined, and the previously filled invalid data is automatically filtered out so that only the valid data (i.e., the valid image data) is transferred.
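The division and padding described for fig. 5A, and the later filtering of the padding, can be sketched as follows. This is an illustrative sketch under several assumptions that the application does not fix: tiles are numbered row by row (so Tile2 and Tile5 sit on the right edge), both the width and the height are padded, invalid data is 0, and the function names are hypothetical.

```python
INVALID = 0  # assumed placeholder for invalid data


def split_into_tiles(image, tile_h, tile_w):
    """Divide a 2-D block (list of rows) into tile_h x tile_w tiles in
    row-major order, padding short edge tiles with INVALID."""
    h1, w1 = len(image), len(image[0])
    tiles = []
    for top in range(0, h1, tile_h):
        for left in range(0, w1, tile_w):
            tile = []
            for r in range(top, top + tile_h):
                row = image[r][left:left + tile_w] if r < h1 else []
                tile.append(row + [INVALID] * (tile_w - len(row)))
            tiles.append(tile)
    return tiles


def valid_part(tile, index, h1, w1, tile_h, tile_w):
    """Remove the padding of tile number `index`, using only the known
    original size h1 x w1 of the first data block."""
    tiles_per_row = -(-w1 // tile_w)          # ceiling division
    top = (index // tiles_per_row) * tile_h
    left = (index % tiles_per_row) * tile_w
    valid_h = min(tile_h, h1 - top)
    valid_w = min(tile_w, w1 - left)
    return [row[:valid_w] for row in tile[:valid_h]]


# Example with c = 4, d = 2, a = 3, b = 1, so W1 = 2c + d = 10 and H1 = a + b = 4.
image = [[r * 10 + col + 1 for col in range(10)] for r in range(4)]
tiles = split_into_tiles(image, tile_h=3, tile_w=4)          # 6 tiles
tile2_valid = valid_part(tiles[2], 2, h1=4, w1=10, tile_h=3, tile_w=4)
```

In this example Tile2 holds only d = 2 valid columns per row, and the remaining two columns are padding that `valid_part` drops again.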
For ease of understanding, the meanings of the three-digit parameters such as 000, 010, and 020 shown in fig. 5B are as follows. The first digit represents the current channel CI/CO, i.e., the channel used when the current data block is transmitted; the second digit, Tile, is the identifier of a second data block after division, where the second data block is a data block forming part of the first data block; the third digit, Line, indicates the line of the first data block (here, the image data) in which the current data is located. Fig. 5B only illustrates the case in which the image data (i.e., the first data block) consists of 3 lines of data, which is not described in detail in this application.
Wherein DH1 indicates the storage interval between every two rows of data in the width direction of the second data block transmitted in the first channel CI/CO0 (i.e. the first storage offset value mentioned above) as in fig. 5B; DH2 indicates the storage interval (i.e., the second storage offset value described above) for the second block of data transferred between channels in the storage medium.
The target data to be carried is originally located in the larger first data block C*H1*W1. Therefore, when data is transported from the storage medium to the register unit, the data transport instruction needs to indicate which first target data block within the first data block is to be transported, that is, it indicates the storage format in the storage medium of the first target data block corresponding to the target data, so that the target data to be transported can be obtained.
Specifically, the data transfer command is used to instruct that the target data in a first target data block, for example the data of the data block Tile0 shown in fig. 5A, be transferred from the storage medium to the register unit. For example, the first indication information in the data transfer command indicates the storage head address, in the storage medium, of the first target data block Tile0 within the data block of the first channel CI/CO0 (which may also be referred to as a first data block), together with the corresponding storage offset values DH1 and DH2.
Accordingly, the storage control unit may read the target data to be transported from the storage medium according to the first instruction information, i.e., the image data in the C data blocks of size c × a that constitute the first target data block Tile0, as shown in fig. 5A.
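A read driven by a storage head address and the two offset values can be sketched as follows. The sketch models the storage medium as a flat Python list and measures DH1 and DH2 in data items rather than bytes; both are simplifying assumptions, and the function name is hypothetical.

```python
def read_tile(memory, head, dh1, dh2, channels, rows, width):
    """Read one tile as [channel][row][column] from a flat storage medium.

    head: storage head address of the tile in the first channel
    dh1:  interval between two consecutive rows inside one channel
    dh2:  interval between the same tile in two consecutive channels
    """
    tile = []
    for ch in range(channels):
        channel_base = head + ch * dh2
        tile.append([
            memory[channel_base + r * dh1: channel_base + r * dh1 + width]
            for r in range(rows)
        ])
    return tile


# Two channels of a 3 x 5 block stored contiguously, channel after channel,
# so dh1 = 5 (one full row) and dh2 = 15 (one full channel).
memory = list(range(30))
tile = read_tile(memory, head=0, dh1=5, dh2=15, channels=2, rows=2, width=2)
```

Because the rows of a tile are not contiguous in the storage medium, each row is fetched as a separate contiguous run; DH1 and DH2 are what let the storage control unit jump to the next run.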
It should be understood that, as shown in the example of fig. 5A, the size (2c + d) × (a + b) of the original image data (i.e., the first data block in the present application) where the target data is located is significantly larger than the size c × a of the first target data block indicated by the data transfer command. Therefore, the first data block is also divided to obtain the target data in the first target data block while the storage control unit reads the first target data.
Further, after reading the target data, the storage control unit may store the target data into the register unit according to the storage format of the second target data block according to the indication of the second indication information.
Specifically, after the storage control unit reads the target data from the storage medium, as shown in fig. 4, storage of the target data in a row direction of one channel CI/CO data block is completed according to dimension 1 and dimension 2, that is, as shown by an arrow in fig. 4, the target data is sequentially stored from a low level to a high level in the row direction of one channel CI/CO data block. And then, finishing the storage of the target data in the column direction in one channel CI/CO data block according to the dimension 3 and the dimension 4. At this time, the storage of the target data in one channel CI/CO data block can be completed. Then, according to the indication of the indication information, the storage of the target data in the CI/CO data blocks of the other channels in the row/column direction can be completed according to the dimension 5 or the dimension 6, that is, the target data read subsequently is stored in the CI/CO data blocks of the other channels in the row/column direction according to the dimension 5 or the dimension 6 until the storage of one round for the target data is completed. Then, it is also possible to determine whether or not to complete storage of the target data of a plurality of rounds according to dimension 7, based on the actual situation, that is, the data amount of the target data to be transported. The specific storage manner for the target data in each round is the same, and is not described herein again.
For example, as shown in fig. 4, the computing device stores the target data in the data block CI/CO0 of the register unit along the row direction according to dimension 1 and dimension 2 as indicated by the indication information, and then stores the target data in the data block CI/CO0 of the register unit along the column direction according to dimension 3 and dimension 4. That is, the target data is stored into the data block CI/CO0 of the register unit according to dimension 1 to dimension 4, so that the storage of the target data in the data block CI/CO0 is completed first. The target data read subsequently is then stored into the data block CI/CO1 or CI/CO2 corresponding to the row/column direction of the register unit according to dimension 5 or dimension 6, and so on. It is assumed in this application that, according to the processing capability of the neural network processor, the number of channel CI/CO data blocks that each round supports processing is defined as 4. Correspondingly, if the amount of target data to be transported from the storage medium is large and exceeds the amount of target data that can be stored in 4 channel CI/CO data blocks, the storage control unit needs to complete the storage of the target data in the next round according to dimension 7, that is, store the target data to be transported subsequently into the channel CI/CO data blocks of the next round.
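The way channel data blocks are distributed over rounds can be illustrated with a small sketch. It assumes that the blocks of a round are filled along the width direction first (M1 blocks per row) and then along the height direction (M2 rows); the application allows either dimension 5 or dimension 6 to be used, so this order and the function names are assumptions.

```python
def place_channel_block(channel, m1, m2):
    """Map a channel index to (round, height_slot, width_slot) in the
    register unit, with m1 blocks per row and m2 rows per round."""
    per_round = m1 * m2
    round_index, slot = divmod(channel, per_round)
    height_slot, width_slot = divmod(slot, m1)
    return round_index, height_slot, width_slot


def rounds_needed(channels, m1, m2):
    """Number of rounds required to carry `channels` channel data blocks."""
    return -(-channels // (m1 * m2))  # ceiling division


# With M1 = 2 and M2 = 2, each round holds 4 channel data blocks, so 6
# channels (CI/CO0 to CI/CO5) need 2 rounds and CI/CO5 lands in the second.
placement = place_channel_block(5, m1=2, m2=2)
total_rounds = rounds_needed(6, m1=2, m2=2)
```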
Referring to the example shown in fig. 5A, assuming that C is 6, a specific schematic diagram of the first target data block Tile0 stored in the register unit is given in fig. 4, and details of this application are not described in detail.
In an alternative embodiment, after the storage control unit transports the first target data block Tile0 to the register unit, the operation unit may obtain the first target data block Tile0 from the register unit, and perform related operation, such as convolution operation, on the first target data block Tile0 to obtain a result data block. Accordingly, the arithmetic unit may store the result data block in the register unit to wait for an instruction to transfer the result data block to the storage medium, so as to perform the next calculation in the neural network model, and the like, which is not described in detail herein.
In a second embodiment, if the data transfer command is used to instruct data transfer from the register unit to the storage medium, the storage control unit reads the second target data block in which the target data is located from the register unit according to the indication of the second indication information, and then extracts the target data to be transferred from the second target data block. Then, according to the indication of the first indication information, it stores the target data in the storage medium according to the storage format of the first target data block in the storage medium.
Specifically, as shown in fig. 4, the target data to be transported in the row direction of the channel CI/CO data blocks is read and extracted according to dimension 1 and dimension 2, that is, as shown by the arrow direction in fig. 4, the target data to be transported is read sequentially from low to high in the row direction of one channel CI/CO data block. The target data to be carried in the column direction of the channel CI/CO data block is then read and extracted according to dimension 3 and dimension 4. At this point, all the target data to be carried in one channel CI/CO data block has been acquired. Then, according to the indication of the indication information, the storage control unit may complete the reading of the target data to be transported in the other channel CI/CO data blocks in the row/column direction according to dimension 5 or dimension 6 until the reading of the target data in one round is completed. Then, according to the actual situation, such as the number C of first data blocks and the number M1*M2 of data blocks in the batch that each round supports processing, it is determined according to dimension 7 whether to continue reading and carrying the target data in other rounds.
It should be understood that, after the target data to be transported is read from the register unit, the storage control unit may store the target data according to a storage format of the first target data block supported to be stored in the storage medium.
Accordingly, for example, as shown in fig. 4, the computing device reads the target data to be transported in the row direction in the data block CI/CO0 of the register unit according to the indication of the indication information and according to dimension 1 and dimension 2, and continuously sends the target data to the storage medium for storage; that is, the transport of the target data in the row direction of the channel data block CI/CO0 is completed according to dimension 1 and dimension 2. Next, the target data to be transported in the column direction in the data block CI/CO0 of the register unit is read according to dimension 3 and dimension 4 and continuously sent to the storage medium for storage; that is, the transport of the target data in the column direction of the channel data block CI/CO0 is completed according to dimension 3 and dimension 4. In this way, the transport of the target data in the data block CI/CO0 is completed according to dimension 1 to dimension 4. Then, the target data in the other channel CI/CO data blocks in the row/column direction is read and transported according to dimension 5 or dimension 6, and so on, until the target data in one round has been read and transported. It is assumed in this application that, according to the processing capability of the neural network processor, the number of channel CI/CO data blocks that each round supports processing is defined as 4. Accordingly, if the target data to be carried in the register unit is stored in 6 channel CI/CO data blocks, specifically data blocks CI/CO0 through CI/CO5 of fig. 4, the storage control unit needs to complete the reading and transporting of the target data in the next round according to dimension 7.
The following sets forth specific embodiments involved in storing the target data in the storage medium. Specifically, after the target data to be transported is read from the register unit, the storage control unit may store the target data in the storage medium according to the storage format of the first target data block according to the instruction of the first instruction information.
Fig. 5C shows still another schematic diagram of storing data in a storage medium. In connection with the foregoing example shown in fig. 5A or fig. 5B, it can be understood that, according to the indication of the first indication information, the storage control unit may continuously send the target data to the storage medium for storage according to the storage head address A and the two storage offset values (DH1 and DH2) of the data block in the first channel CI/CO0. Here, the black area in fig. 5C indicates the filled invalid data.
The following describes a structural framework of the storage control unit referred to in the present application and a specific embodiment of data handling correspondingly given based on the framework.
Fig. 6A shows a schematic structural diagram of a first storage control unit EoDMA. The first storage control unit shown in fig. 6A includes a first control unit 202, a first read data unit 204, and a first write data unit 206. Optionally, it may also include a first first-in first-out (FIFO) unit 208 and a first authorization unit 210. Optionally, the first control unit 202 includes an instruction storage unit and an instruction control unit, which are not shown in the figure. Wherein:
the instruction storage unit (EoDMA _ iq) is configured to receive the data handling instruction sent by the neural network processor, including the parameters carried in the data handling instruction, such as the indication information.
The instruction control unit (EoDMA _ ctrl) is used for distributing the data handling instruction (specifically, a parameter in the instruction, such as indication information), starting EoDMA, controlling to complete switching between multidimensional information other than dimension 1 in the indication information, and controlling to complete switching between row data in the indication information.
The first authorization unit (Rd _ dm _ grant) is used for taking charge of authorization for reading the register unit (i.e. whether reading of the target data to be carried from the register unit is supported). It should be understood that the authorization of the register unit may relate to information such as the priority of the target data to be transported stored in the register unit or the target data block in which the target data is located, for example, when a plurality of data transport instructions simultaneously need to transport data from the register unit, limited by data transport capability, the register unit may consider the priority of the target data to be transported, such as the higher the priority, the earlier the authorization; and if the priority is lower, the target data with lower priority is transported after the target data with higher priority is transported, and the like.
Optionally, the first authorization unit is further configured to predict, according to the read enable of the first FIFO unit (i.e., the storage space already occupied in the first FIFO unit is learned through the read enable instruction) and the read enable of the first read data unit (i.e., a data read instruction sent to the register unit, from which the amount of target data to be carried in the register unit is learned), whether the first FIFO unit will overflow, whether the storage space occupied in the first FIFO unit exceeds a preset threshold, and the like. Optionally, the first authorization unit is further configured to feed back a corresponding data read authorization indication to the first read data unit, where the data read authorization indication indicates whether reading data from the register unit is authorized. Specifically, when the register unit has not granted authorization, or the storage space occupied in the first FIFO unit exceeds the preset threshold or the first FIFO unit would overflow, the fed-back data read authorization indication indicates that reading the data in the register unit is not authorized; otherwise, it indicates that reading the data in the register unit is authorized.
And the first data reading unit is used for reading target data to be conveyed from the register unit according to the storage format of a second target data block where the target data is located and according to the indication of indication information in the instruction when the data reading authorization indication is used for indicating authorized data reading, and continuously sending the target data to the first FIFO unit for buffering.
Specifically, the first data reading unit, which may also be referred to as a valid data extraction unit (extract valid data), is configured to read the second target data block where the target data is located from the register unit through an interface (e.g., an Xbar interface), and is responsible for extracting the target data based on dimension 1 in the indication information; at the same time, it buffers the extracted target data into the first FIFO unit.
The first FIFO unit is used for buffering the target data to be carried.
The first write data unit, which may also be referred to as a data sending unit (send data unit), is configured to send the buffered target data from the first FIFO unit to the storage medium for storage in the storage format of the first target data block according to the indication of the indication information in the instruction. Optionally, the first write data unit may further be responsible for interfacing with a bus in the computing device and, under the control of the bus, allowing/prohibiting the target data from being continuously written into the storage medium for storage.
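The decision made by the first authorization unit can be summarized in a small sketch. The function and parameter names are hypothetical, and the overflow prediction is assumed to be a simple sum of the current FIFO occupancy and the amount of data about to be read; the application does not specify the exact prediction rule.

```python
def grant_read(register_granted, fifo_used, pending_read, fifo_capacity, threshold):
    """Return True if reading target data from the register unit is authorized.

    register_granted: whether the register unit itself grants the read
    fifo_used:        storage space already occupied in the FIFO unit
    pending_read:     amount of target data the read would add to the FIFO
    """
    if not register_granted:
        return False                      # register unit has not authorized
    if fifo_used > threshold:
        return False                      # occupancy exceeds preset threshold
    if fifo_used + pending_read > fifo_capacity:
        return False                      # the read would overflow the FIFO
    return True


# A FIFO of capacity 8 with a threshold of 6: a read of 2 items is granted
# at occupancy 2, but a read of 3 items at occupancy 6 would overflow.
ok = grant_read(True, fifo_used=2, pending_read=2, fifo_capacity=8, threshold=6)
denied = grant_read(True, fifo_used=6, pending_read=3, fifo_capacity=8, threshold=6)
```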
Correspondingly, based on the EoDMA structure diagram shown in fig. 6A, reference is made to fig. 6B, which is a flowchart diagram of another data handling method provided in the embodiment of the present invention. The method as shown in fig. 6B may comprise the following implementation steps:
Step S301, the first control unit 202 receives the data transfer instruction sent by the neural network processor, and sends the data transfer instruction to the first read data unit 204. The data carrying instruction comprises the indication information, which is used for indicating the carrying of data from the register unit to the storage medium, and optionally also comprises target data to be carried and a storage format of a target data block. The storage format of the first target data block where the target data is located in the storage medium and the storage format of the second target data block where the target data is located in the register unit may be specifically included, which may specifically refer to relevant explanations in the foregoing embodiments, and details are not described here.
Step S302, the first authorization unit 210 determines a data reading authorization indication according to an authorization factor. The data reading authorization indication is used for indicating whether the first read data unit 204 is authorized to read the target data to be carried from the register unit, that is, indicating whether data reading is authorized. The authorization factor comprises any one or a combination of more of the following: whether the occupied storage space in the first FIFO unit exceeds a preset threshold, the priority of the target data block, or other factors for affecting data transmission, etc., which is not limited in the present application.
Step S303, the first data reading unit 204 receives the data transfer instruction, and reads the target data to be transferred from the register unit according to the storage format of the second target data block in the data transfer instruction when the data read authorization instruction is used to instruct to authorize data reading. Correspondingly, the first read data unit 204 may further buffer the read target data into the first FIFO unit 208 according to the storage format of the first target data block according to the indication of the indication information in the instruction. For the reading of the target data, reference may be made to the related description in the foregoing embodiments, and details are not repeated here.
In step S304, the first FIFO unit 208 buffers the target data to be transferred.
In step S305, the first write data unit 206 reads the target data in the storage format of the first target data block from the first FIFO unit 208, and writes/stores the target data in the storage format of the first target data block into the storage medium. As to how to store the target data according to the storage format of the first target data block in the indication information, reference may be made to relevant explanations in the foregoing embodiments, and details are not repeated here.
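The flow of steps S301 to S305 can be sketched as a small software model. This is a heavily simplified illustration, not the hardware behavior: it handles a single channel data block, treats the first row of every K2 rows as target data, authorizes a read whenever the FIFO has a free slot, and models the storage medium as a flat list with a row interval `dh1`. All names are hypothetical.

```python
from collections import deque


def eodma_transfer(register_rows, k2, storage, head, dh1, fifo_depth=4):
    """Carry target rows from a register block to a flat storage medium
    through a bounded FIFO, one row of every k2 rows being target data."""
    fifo = deque()
    next_row = 0                                   # next register row to read
    written = 0                                    # rows already written
    total = len(range(0, len(register_rows), k2))  # number of target rows
    while written < total:
        # S302/S303: read target rows while the FIFO still has room.
        while next_row < len(register_rows) and len(fifo) < fifo_depth:
            fifo.append(register_rows[next_row])   # S304: buffer in the FIFO
            next_row += k2
        # S305: take one buffered row and write it to the storage medium.
        row = fifo.popleft()
        start = head + written * dh1
        storage[start:start + len(row)] = row
        written += 1
    return storage


register_rows = [[1, 2], [0, 0], [3, 4], [0, 0]]   # K2 = 2, rows 1 and 3 invalid
storage = eodma_transfer(register_rows, k2=2, storage=[0] * 8, head=1, dh1=3)
```

The bounded `deque` stands in for the first FIFO unit 208: the read side stalls when it is full, which is the condition the first authorization unit 210 is described as guarding against.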
Fig. 7A is a schematic structural diagram of a second storage control unit EiDMA. The second storage control unit shown in fig. 7A includes a second control unit 302, a second read data unit 304, and a second write data unit 306. Optionally, it may also include a second first-in first-out (FIFO) unit 308 and a second authorization unit 310. Optionally, the second control unit 302 includes an instruction storage unit and an instruction control unit, which are not shown in the figure. Wherein:
the instruction storage unit (EiDMA _ iq) is configured to receive the data handling instruction sent by the neural network processor, wherein parameters such as indication information included in the data handling instruction are included.
The instruction control unit (EiDMA _ ctrl) is responsible for distributing the data handling instructions (specifically, parameters in the instructions, such as indication information) and starting EiDMA.
The second authorization unit (Rd _ dm _ grant) is used for taking charge of authorization for reading the storage medium (i.e. whether reading of the target data to be transported from the storage medium is supported). It should be understood that the authorization of the storage medium may relate to information such as the priority of the target data to be transported stored in the storage medium or the target data block in which the target data is located, for example, when a plurality of data transport instructions simultaneously transport data from the storage medium, limited by data transport capability, the storage medium may consider the priority of the target data to be transported, and the higher the priority, the earlier the authorization; and if the priority is lower, the target data with lower priority is transported after the target data with higher priority is transported, and the like.
Optionally, the second authorization unit is further configured to predict, according to the read enable of the second FIFO unit (i.e., the storage space already occupied in the second FIFO unit is learned through the read enable instruction) and the read enable of the second data reading unit (i.e., a data read instruction sent to the storage medium, from which the amount of target data to be carried in the storage medium is learned), whether the second FIFO unit will overflow, whether the storage space occupied in the second FIFO unit exceeds a preset threshold, and the like. Optionally, the second authorization unit is further configured to feed back a corresponding data reading authorization indication to the second data reading unit, where the data reading authorization indication indicates whether reading data from the storage medium is authorized. Specifically, when the storage medium has not granted authorization, or the storage space occupied in the second FIFO unit exceeds the preset threshold or the second FIFO unit would overflow, the fed-back data read authorization indication indicates that reading the data in the storage medium is not authorized; otherwise, it indicates that reading the data in the storage medium is authorized.
And the second data reading unit (Rd _ ext _ data) is used for reading target data needing to be conveyed from the storage medium according to the instruction of the instruction information in the instruction and sending the target data to the second FIFO unit for buffering when the data reading authorization instruction is used for indicating authorization to read data.
Optionally, the device is also used for interfacing with a bus in the computing device, being controlled by the bus, and enabling/disabling reading of the target data.
The second FIFO unit is used for buffering the target data to be carried.
The second write data unit (Wr_dm_data) is used for sending the buffered target data from the second FIFO unit to the register unit, where it is stored in the storage format of the second data block according to the indication information in the instruction. Specifically, the second write data unit may be configured to control switching between the dimensions of the multidimensional information in the indication information, so as to complete storage of the target data in the register unit.
Based on the schematic structural diagram shown in fig. 7A, fig. 7B is a schematic flow chart of another data handling method according to an embodiment of the present invention. The method as shown in fig. 7B comprises the following implementation steps:
Step S401: the second control unit 302 receives the data carrying instruction sent by the neural network processor and forwards it to the second read data unit 304. The data carrying instruction includes the indication information, which indicates that data is to be carried from the storage medium to the register unit; optionally, it may also indicate the target data to be carried and the storage format of the target data in the register unit, that is, the storage format of the first data block in which the target data is located in the present application. For details, refer to the relevant explanations in the foregoing embodiments, which are not repeated here.
Step S402: the second authorization unit 310 determines the data reading authorization indication according to an authorization factor. The data reading authorization indication indicates whether the second read data unit 304 is authorized to read the target data to be carried from the storage medium, that is, whether data reading is authorized. The authorization factor comprises any one or a combination of the following: whether the occupied storage space in the second FIFO unit exceeds a preset threshold, the priority of the target data block, or other factors that affect data transmission, which is not limited in this application.
Step S403: the second read data unit 304 receives the data carrying instruction and, when the data reading authorization indication indicates that data reading is authorized, reads the target data to be transported from the storage medium according to the storage format of the first target data block in the data carrying instruction. Correspondingly, the second read data unit 304 may further buffer the target data into the second FIFO unit 308 according to the storage format of the second target data block indicated by the indication information in the instruction.
Step S404: the second FIFO unit 308 buffers the target data to be transferred.
Step S405: the second write data unit 306 reads, from the second FIFO unit 308, the target data to be transported that conforms to the storage format of the second target data block, and sends it to the register unit for storage. As to how the target data is stored according to the storage format of the second target data block in the indication information, reference may be made to the relevant explanations in the foregoing embodiments, and details are not repeated here.
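Steps S402 to S405 can be illustrated with a small cycle-by-cycle simulation. This is a sketch under assumptions that are not in the source: the storage medium is modeled as a flat Python list, each read moves one row described by a hypothetical `(start, length)` pair, authorization depends only on whether the FIFO still has room, and the write side drains one row every `write_period` cycles.

```python
from collections import deque


def carry(storage, rows, fifo_depth, write_period):
    """Move rows from storage to a register list through a bounded FIFO.

    Returns (register_rows, denied_cycles). All parameters are illustrative.
    """
    if fifo_depth < 1 or write_period < 1:
        raise ValueError("fifo_depth and write_period must be at least 1")
    fifo = deque()
    register = []
    pending = deque(rows)
    denied = 0
    cycle = 0
    while pending or fifo:
        # S402: authorize a read only while the FIFO has room.
        granted = len(fifo) < fifo_depth
        if pending:
            if granted:
                # S403/S404: read one row and buffer it in the FIFO.
                start, length = pending.popleft()
                fifo.append(storage[start:start + length])
            else:
                denied += 1
        # S405: the write side drains one row every write_period cycles.
        if fifo and cycle % write_period == 0:
            register.append(fifo.popleft())
        cycle += 1
    return register, denied


if __name__ == "__main__":
    data = list(range(12))
    print(carry(data, [(0, 3), (4, 3), (8, 3)], fifo_depth=1, write_period=2))
```

With a FIFO depth of 1 and a writer that drains every second cycle, one read is denied while the FIFO is full, which shows how the authorization indication throttles the read side without losing data.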
Therefore, the storage format of the multidimensional DMA data can be designed according to the data format supported by the neural network processor; that is, data is stored in a storage format matched to the neural network processor. This improves the read/write speed of the data and the efficiency of data transport and data processing. In addition, when the size of the data block to be carried is larger than the size of the data block that the neural network processor supports, the data block to be carried needs to be cut by an instruction so as to meet the data processing requirements of the neural network processor.
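The cutting of an oversized data block can be sketched as follows. This is a minimal illustration assuming a simple row-major tiling; the function name `cut_block` and the tuple layout `(top, left, height, width)` are hypothetical, and the source does not specify how edge tiles are handled.

```python
def cut_block(h1, w1, h2, w2):
    """Cut an h1 x w1 block into tiles no larger than h2 x w2.

    Returns a list of (top, left, height, width) tuples in row-major order.
    Assumption: tiles at the bottom and right edges are simply smaller.
    """
    if min(h1, w1, h2, w2) < 1:
        raise ValueError("all sizes must be positive")
    tiles = []
    for top in range(0, h1, h2):
        for left in range(0, w1, w2):
            tiles.append((top, left, min(h2, h1 - top), min(w2, w1 - left)))
    return tiles


if __name__ == "__main__":
    for tile in cut_block(5, 4, 2, 3):
        print(tile)
```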
Fig. 8A is a schematic structural diagram of another computing device according to an embodiment of the present invention. The computing device of fig. 8A includes a storage medium 70, a register unit 72, and a storage control unit 74, wherein,
the storage medium 70 and the register unit 72 are both used for storing data;
the storage control unit 74 is configured to obtain a data carrying instruction, where the data carrying instruction includes indication information, and the indication information is used to indicate target data to be transported and a storage format of a target data block; the first data block of the target data comprises C data blocks of size H1 × W1, where C is the number of channels required for transmitting the first data block, and H1 and W1 are respectively the height and the width of the first data block; the target data block comprises C data blocks of size H2 × W2, where H2 and W2 are respectively the height and the width of the target data block, H2 is less than or equal to H1, and W2 is less than or equal to W1; the storage format of the target data block is used for indicating the storage format of the target data;
the storage control unit 74 is further configured to complete the transportation of the target data between the storage medium and the register unit according to the storage format of the target data block.
In an optional embodiment, if the target data block is a first target data block, and the indication information is used to indicate the storage format, in the storage medium, of the first target data block in which the target data is located, then
the indication information includes at least one of: a storage address of the i-th data block and a storage offset value, wherein the storage offset value comprises a storage interval of adjacent rows of data of the i-th data block in the storage medium and a storage interval between the i-th data block and the (i+1)-th data block in the storage medium; the i-th data block is any one of the C data blocks of size H2 × W2 forming the first target data block, and the i-th data block is composed of the target data.
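The way a storage address and the two storage intervals locate every element of the first target data block can be sketched as follows. This is an illustration under assumptions that are not stated in the source: addresses are counted in elements rather than bytes, both intervals are measured from start to start, and a single base address is given for the first of the C data blocks. The function name `block_addresses` is hypothetical.

```python
def block_addresses(base, row_interval, block_interval, c, h2, w2):
    """Return addresses[channel][row][col] for C blocks of size h2 x w2.

    base:           address of the first element of the first block
    row_interval:   distance between the starts of adjacent rows in a block
    block_interval: distance between the starts of adjacent blocks
    """
    return [
        [
            [base + ch * block_interval + row * row_interval + col
             for col in range(w2)]
            for row in range(h2)
        ]
        for ch in range(c)
    ]


if __name__ == "__main__":
    for block in block_addresses(10, 8, 40, c=2, h2=2, w2=3):
        print(block)
```

Because the row interval can exceed W2 and the block interval can exceed H2 times the row interval, the same description covers a small block embedded in a larger stored block.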
In an optional embodiment, if the target data block is a second target data block, and the indication information is used to indicate the storage format, in the register unit, of the second target data block in which the target data is located, then
the indication information includes at least one of: a first carrying indication of the data block in the width direction, the width W2 of the data block in the width direction, a second carrying indication of the data block in the height direction, the height H2 of the data block in the height direction, a storage interval between the data blocks in the width direction, the storage number M1 of the data blocks in the width direction, a storage interval between the data blocks in the height direction, the storage number M2 of the data blocks in the height direction, and a storage interval of the batch data blocks in the height direction; wherein the total number of data blocks contained in the batch of data blocks is M1*M2, which indicates a data handling capability of the computing device;
the first carrying indication is used for indicating that one item of target data to be transported is determined from every K1 consecutive data in the width direction of the data block;
the second carrying indication is used for indicating that one item of target data to be transported is determined from every K2 consecutive data in the height direction of the data block;
wherein the data block is any one of the C data blocks of size H2 × W2 constituting the second target data block, and M1, M2, K1 and K2 are all positive integers.
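The effect of the first and second carrying indications can be sketched as a strided selection. This is a hedged illustration: the source only states that one item of target data is determined from every K1 (or K2) consecutive data, so taking the first item of each group is an assumption made here, and the function name `select_targets` is hypothetical.

```python
def select_targets(block, k1, k2):
    """Pick one item from every k1 items per row and every k2 rows.

    block: list of rows (lists) representing one data block.
    Assumption: the first item of each group of consecutive data is kept.
    """
    if k1 < 1 or k2 < 1:
        raise ValueError("k1 and k2 must be positive integers")
    return [row[::k1] for row in block[::k2]]


if __name__ == "__main__":
    block = [[r * 10 + c for c in range(6)] for r in range(4)]
    print(select_targets(block, k1=2, k2=2))
```

With K1 = K2 = 1 every item is carried, so the unstrided case is covered by the same description.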
In an optional embodiment, the storage control unit is a first storage control unit 720, configured to implement the transfer of the target data from the register unit to the storage medium, where the first storage control unit 720 includes a first control unit 721, a first read data unit 722, a first write data unit 723, a first FIFO unit 724, and a first authorization unit 725;
the first control unit 721 is configured to obtain the data carrying instruction, and send the data carrying instruction to the first read data unit 722;
the first authorization unit 725 is configured to determine a data reading authorization indication according to an authorization factor, and send the data reading authorization indication to the first read data unit 722; wherein the data reading authorization indication is used for indicating whether to authorize to read the target data needing to be carried, and the authorization factor comprises at least one of the following: whether the occupied storage space in the first FIFO unit exceeds a preset threshold, the priority of the target data and the priority of the target data block;
the first read data unit 722 is configured to, when the data read authorization indication indicates that data read is authorized, read the target data to be transferred from the register unit according to the storage format of the second target data block, and buffer the target data into the first FIFO unit 724 according to the storage format of the first target data block;
the first write data unit 723 is configured to extract, from the first FIFO unit 724, the target data to be transported that conforms to the storage format of the first target data block, and store the target data in the storage medium.
In an optional embodiment, the storage control unit is a second storage control unit 730, configured to implement transportation of the target data from the storage medium to the register unit; the second storage control unit 730 comprises a second control unit 731, a second read data unit 732, a second write data unit 733, a second FIFO unit 734, and a second authorization unit 735;
the second control unit 731 is configured to obtain the data carrying instruction and send the data carrying instruction to the second read data unit 732;
the second authorization unit 735 is configured to determine a data reading authorization indication according to an authorization factor, and send the data reading authorization indication to the second read data unit 732; wherein the data reading authorization indication is used for indicating whether to authorize reading of the target data to be carried, and the authorization factor comprises at least one of the following: whether the occupied storage space in the second FIFO unit exceeds a preset threshold, the priority of the target data and the priority of the target data block;
the second read data unit 732 is configured to, if the data read authorization indication indicates that data read is authorized, read the target data to be transported from the storage medium according to the storage format of the first target data block, and buffer the target data into the second FIFO unit 734 according to the storage format of the second target data block;
the second write data unit 733 is configured to extract the target data corresponding to the storage format of the second target data block from the second FIFO unit 734, and store the target data in the register unit.
Fig. 8B is a schematic structural diagram of a computing device according to an embodiment of the invention. The computing device may be a device with a neural network processor and a communication network function, such as a smart phone, a tablet computer, or a smart wearable device. As shown in fig. 8B, the computing device according to an embodiment of the present invention may include modules such as a display screen, a button, a speaker, and a sound pickup, and further includes: at least one bus 501, at least one processor 502 connected to the bus 501, at least one memory 503 connected to the bus 501, a communication device 505 that implements a communication function, and a power supply device 504 that supplies power to each power-consuming module of the computing device.
The processor 502 may invoke, via the bus 501, code stored in the memory 503 to perform the associated functions, wherein the memory 503 stores an operating system and a data handling application. The processor 502 is configured to perform all or part of the implementation steps provided by the method embodiments described above.
The embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a program which, when executed, performs some or all of the steps described in the above method embodiments.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps as recited in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of the units is only a division by logical function, and other divisions are possible in an actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A data transfer method applied to a computing device, the computing device comprising: storage medium, register unit and storage control unit, the method comprising:
the storage control unit acquires a data carrying instruction, wherein the data carrying instruction comprises indication information, and the indication information is used for indicating target data to be carried and a storage format of a target data block; the first data block of the target data comprises C data blocks of size H1 × W1, where C is the number of channels required for transmitting the first data block, and H1 and W1 are respectively the height and the width of the first data block; the target data block comprises C data blocks of size H2 × W2, where H2 and W2 are respectively the height and the width of the target data block, H2 is less than or equal to H1, and W2 is less than or equal to W1; the register unit supports data storage in a data block storage format; any data block only contains data of one channel, and the target data are arranged in the data block in the row/column direction; the storage format of the target data block is used for indicating the storage format of the target data;
the storage control unit completes the transportation of the target data between the storage medium and the register unit according to the storage format of the target data block, and when the total bit width of the target data cannot reach the bit width n x W2 required by the storage format of the target data block, fills with invalid data to reach the bit width required by the storage format of the target data block; wherein X represents the bit width of each data in the target data block, and n is a positive integer.
2. The method according to claim 1, wherein if the target data block is a first target data block, and the indication information is used to indicate the storage format, in the storage medium, of the first target data block in which the target data is located, then
the indication information includes at least one of: a storage address of the i-th data block and a storage offset value, wherein the storage offset value comprises a storage interval of adjacent rows of data of the i-th data block in the storage medium and a storage interval between the i-th data block and the (i+1)-th data block or the (i-1)-th data block in the storage medium; the i-th data block is any one of the C data blocks of size H2 × W2 forming the first target data block, and the i-th data block includes the target data.
3. The method according to claim 2, wherein if the target data block is a second target data block, and the indication information is used to indicate the storage format, in the register unit, of the second target data block in which the target data is located, then
the indication information includes at least one of: a first carrying indication of the data block in the width direction, the width W2 of the data block in the width direction, a second carrying indication of the data block in the height direction, the height H2 of the data block in the height direction, a storage interval between the data blocks in the width direction, the storage number M1 of the data blocks in the width direction, a storage interval between the data blocks in the height direction, the storage number M2 of the data blocks in the height direction, and a storage interval of the batch data blocks in the height direction; wherein the total number of data blocks contained in the batch of data blocks is M1*M2, which indicates a data handling capability of the computing device;
the first carrying indication is used for indicating that one item of target data to be carried is determined from every K1 consecutive data in the width direction of the data block;
the second carrying indication is used for indicating that one item of target data to be carried is determined from every K2 consecutive data in the height direction of the data block;
wherein the data block is any one of the C data blocks of size H2 × W2 constituting the second target data block, and M1, M2, K1, and K2 are positive integers.
4. The method of claim 3, wherein the storage control unit comprises a first storage control unit and a second storage control unit,
the storage control unit completes the transportation of the target data between the storage medium and the register unit according to the storage format of the target data block, and comprises the following steps:
the first storage control unit reads the target data to be transported from the register unit according to the storage format of the second target data block, and stores the target data into the storage medium according to the storage format of the first target data block; or,
the second storage control unit reads the target data to be transported from the storage medium according to the storage format of the first target data block, and stores the target data into the register unit according to the storage format of the second target data block.
5. The method of claim 4, wherein the first storage control unit comprises a first control unit, a first read data unit, a first write data unit, a first first-in first-out (FIFO) unit, and a first authorization unit;
the first storage control unit reading the target data to be transported from the register unit according to the storage format of the second target data block, and storing the target data into the storage medium according to the storage format of the first target data block, comprises:
the first control unit sends the acquired data carrying instruction to the first read data unit;
the first authorization unit determines a data reading authorization indication according to an authorization factor and sends the data reading authorization indication to the first read data unit; wherein the data reading authorization indication is used for indicating whether to authorize reading of the target data to be carried, and the authorization factor comprises at least one of the following: whether the occupied storage space in the first FIFO unit exceeds a preset threshold, the priority of the target data and the priority of the target data block;
when the data reading authorization indication indicates that data reading is authorized, the first read data unit reads the target data to be carried from the register unit according to the storage format of the second target data block, and caches the target data into the first FIFO unit according to the storage format of the first target data block;
the first write data unit extracts, from the first FIFO unit, the target data to be carried that conforms to the storage format of the first target data block, and stores the target data into the storage medium.
6. The method of claim 4, wherein the second storage control unit comprises a second control unit, a second read data unit, a second write data unit, a second first-in first-out (FIFO) unit, and a second authorization unit;
the second storage control unit reading the target data to be transported from the storage medium according to the storage format of the first target data block, and storing the target data into the register unit according to the storage format of the second target data block, comprises:
the second control unit sends the acquired data carrying instruction to the second read data unit;
the second authorization unit determines a data reading authorization indication according to an authorization factor and sends the data reading authorization indication to the second read data unit; wherein the data reading authorization indication is used for indicating whether to authorize reading of the target data to be carried, and the authorization factor comprises at least one of the following: whether the occupied storage space in the second FIFO unit exceeds a preset threshold, the priority of the target data and the priority of the target data block;
when the data reading authorization indication indicates that data reading is authorized, the second read data unit reads the target data to be carried from the storage medium according to the storage format of the first target data block, and caches the target data into the second FIFO unit according to the storage format of the second target data block;
the second write data unit extracts, from the second FIFO unit, the target data that conforms to the storage format of the second target data block, and stores the target data in the register unit.
7. A computing device comprising a storage medium, a register unit, and a storage control unit,
the storage medium and the register unit are used for storing data;
the storage control unit is used for acquiring a data carrying instruction, wherein the data carrying instruction comprises indication information, and the indication information is used for indicating target data to be carried and a storage format of a target data block; the first data block of the target data comprises C data blocks of size H1 × W1, where C is the number of channels required for transmitting the first data block, and H1 and W1 are respectively the height and the width of the first data block; the target data block comprises C data blocks of size H2 × W2, where H2 and W2 are respectively the height and the width of the target data block, H2 is less than or equal to H1, and W2 is less than or equal to W1; the register unit supports data storage in a data block storage format; any data block only contains data of one channel, and the target data are arranged in the data block in the row/column direction; the storage format of the target data block is used for indicating the storage format of the target data;
the storage control unit is further configured to complete the transportation of the target data between the storage medium and the register unit according to the storage format of the target data block, and when the total bit width of the target data cannot reach the bit width n × W2 required by the storage format of the target data block, to fill with invalid data to reach the bit width required by the storage format of the target data block; wherein X represents the bit width of each data in the target data block, and n is a positive integer.
8. A computing device, comprising: a processor, a memory, a communication interface, and a bus; the processor, the memory and the communication interface are connected through the bus and communicate with one another; the memory stores executable program code; the processor reads the executable program code stored in the memory and runs a program corresponding to the executable program code, so as to perform the method of any one of claims 1-6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-6.
CN201711473331.7A 2017-12-29 2017-12-29 Data carrying method, computing device and computer storage medium Active CN109992541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711473331.7A CN109992541B (en) 2017-12-29 2017-12-29 Data carrying method, computing device and computer storage medium

Publications (2)

Publication Number Publication Date
CN109992541A CN109992541A (en) 2019-07-09
CN109992541B true CN109992541B (en) 2021-09-14

Family

ID=67109039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711473331.7A Active CN109992541B (en) 2017-12-29 2017-12-29 Data carrying method, computing device and computer storage medium

Country Status (1)

Country Link
CN (1) CN109992541B (en)

Also Published As

Publication number Publication date
CN109992541A (en) 2019-07-09

Similar Documents

Publication Publication Date Title
CN109992541B (en) Data carrying method, computing device and computer storage medium
CN109992542B (en) Data handling method, related product and computer storage medium
CN106294234B (en) A kind of data transmission method and device
US11314457B2 (en) Data processing method for data format conversion, apparatus, device, and system, storage medium, and program product
US11468145B1 (en) Storage of input values within core of neural network inference circuit
CN102566958B (en) Image segmentation processing device based on SGDMA (scatter gather direct memory access)
CN102625110B (en) Caching system and caching method for video data
CN105095903A (en) Electronic equipment and image processing method
CN106293578A (en) Video card, image display device, method for displaying image and system
GB2538797B (en) Managing display data
US20200128264A1 (en) Image processing
CN101546527A (en) Liquid crystal display controller and image scaling method
CN103037137A (en) Image regeneration device and image regeneration method
EP2898414B1 (en) Fast, dynamic cache packing
CN109933560A (en) A kind of intermodule flow control communication means based on FIFO in conjunction with random access memory
CN101894082B (en) Storage device and smartphone system
CN107222793A (en) A kind of method and device of data transfer
CN204231519U (en) Camera chain
US9036874B2 (en) Image processing and recording system preidentifying and prestoring images with predetermined features and method thereof
CN114625891A (en) Multimedia data processing method, device and system
CN112166454A (en) Feature map loading method and device for neural network
CN113542805B (en) Video transmission method and device
CN106293441A (en) A kind of striding equipment performs the method and device of instruction
EP2355441B1 (en) Remote CPU-less decompression
CN105991714B (en) System and method for displaying head portrait of mobile phone client

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant