WO2022160706A1 - Data processing method and apparatus, computer device, and storage medium

Info

Publication number: WO2022160706A1
Application number: PCT/CN2021/115799
Authority: WO
Inventors: 周军; 周亮; 常亮; 吴飞
Original assignee: 成都商汤科技有限公司; 电子科技大学
Priority date: 2021-01-31
Filing date: 2021-08-31
Publication date: 2022-08-04
Also published as: CN112927125B; CN112927125A

Abstract

The present disclosure provides a data processing method and apparatus, a computer device, and a storage medium. The method comprises: grouping a plurality of multipliers in a multiplier array on the basis of an operation step size to obtain a plurality of multiplier groups; and performing in parallel, by using all of the plurality of multiplier groups, data processing tasks corresponding to all the multiplier groups. The present disclosure enables a multiplier array to process all of a plurality of data processing tasks, thereby improving the processing efficiency of the multiplier array on the data processing tasks. In addition, the multiplier array is grouped on the basis of an operation step size to enable a multiplier, which outputs an invalid result of processing a certain data processing task, to output a valid result of processing another data processing task, thereby increasing the utilization rate of the multiplier array, and reducing the waste of computing resources.

Description

Data processing method and apparatus, computer device, and storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

The present disclosure claims priority to Chinese Patent Application No. 202110132573.X, filed on January 31, 2021, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to the field of computer technology, and in particular, to a data processing method, apparatus, computer device, and storage medium.

BACKGROUND

At present, convolutional neural networks mainly rely on a multiplier-adder array for convolution processing. The image data to be processed in a data processing task is stored in the register array corresponding to the multiplier-adder array. However, current data processing methods suffer from a low utilization rate of the multiplier-adder array and a waste of computing resources.

SUMMARY OF THE INVENTION

Embodiments of the present disclosure provide at least a data processing method, apparatus, computer device, and storage medium.

In a first aspect, an embodiment of the present disclosure provides a data processing method, including: grouping a plurality of multiplier-adders in a multiplier-adder array based on an operation step size to obtain a plurality of multiplier-adder groups; and using each of the plurality of multiplier-adder groups to execute, in parallel, the data processing task corresponding to that multiplier-adder group.

In this way, grouping the multiplier-adder array allows it to process multiple data processing tasks at the same time, which improves its processing efficiency for the data processing tasks. In addition, because the multiplier-adder array is grouped based on the operation step size, a multiplier-adder whose processing result would be invalid for one data processing task produces a valid processing result for another data processing task, which improves the utilization rate of the multiplier-adder array and reduces the waste of computing resources.

In a possible implementation manner, in the same row of the multiplier-adder array, the number of multiplier-adders between two adjacent multiplier-adders of the same group is the same and non-zero; and in the same column of the multiplier-adder array, the number of multiplier-adders between two adjacent multiplier-adders of the same group is the same and non-zero.

In this way, the grouping ensures that each multiplier-adder group handles a different data processing task, so that the multiplier-adder array can process multiple data processing tasks at the same time, which improves the efficiency with which the multiplier-adder array processes data processing tasks.
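The spacing property described above can be illustrated with a minimal sketch. It assumes one plausible interleaved layout in which the group index depends on the row and column position modulo the step sizes; the function names `assign_groups` and `same_group_gaps` and the layout itself are hypothetical and are not taken from the disclosure.

```python
def assign_groups(rows, cols, s_x, s_y):
    """Return a rows x cols grid of group indices.

    Assumption (not from the disclosure): the group of the unit at (r, c)
    is (r mod s_y) * s_x + (c mod s_x), i.e. an interleaved layout.
    """
    return [[(r % s_y) * s_x + (c % s_x) for c in range(cols)]
            for r in range(rows)]


def same_group_gaps(line, group):
    """Count the units lying between consecutive members of `group`
    along one row or one column."""
    idx = [i for i, g in enumerate(line) if g == group]
    return [b - a - 1 for a, b in zip(idx, idx[1:])]


grid = assign_groups(4, 4, 2, 2)
for row in grid:
    print(row)

first_row = grid[0]
first_col = [row[0] for row in grid]
print(same_group_gaps(first_row, 0))  # gaps between group-0 units in row 0
print(same_group_gaps(first_col, 2))  # gaps between group-2 units in column 0
```

Under this assumed layout, every gap within a row or a column is equal and non-zero, which is the condition stated in the implementation above.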

In a possible implementation manner, grouping the plurality of multiplier-adders in the multiplier-adder array based on the operation step size includes: determining the number of multiplier-adder groups based on the operation step size; and grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.

In this way, it is ensured that the processing result of each multiplier-adder group for its own data processing task is valid, so that the multiplier-adder array can process multiple data processing tasks at the same time, which improves the processing efficiency of the multiplier-adder array for data processing tasks.

In a possible implementation manner, grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups includes: determining, from the multiplier-adder array, the first target multiplier-adder in each multiplier-adder group; and determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group, based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array.

In this way, once the position of the first target multiplier-adder of each multiplier-adder group in the multiplier-adder array has been determined, the positions of the other target multiplier-adders of that group can be determined from it, which improves the efficiency of grouping the multiplier-adder array.

In a possible implementation manner, determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array includes: for each multiplier-adder in a multiplier-adder group other than the first target multiplier-adder, determining, based on the operation step size, a first positional relationship in the multiplier-adder array between the multiplier-adder and the previous multiplier-adder that is in the same row as, and adjacent to, the multiplier-adder; determining, based on the operation step size and the number of columns of the multiplier-adder array, a second positional relationship in the multiplier-adder array between the multiplier-adder and the previous multiplier-adder that is in the same column as, and adjacent to, the multiplier-adder; and determining the target position of the multiplier-adder in the multiplier-adder array based on the position of the first target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the first positional relationship, and the second positional relationship.

In a possible implementation manner, determining the first target multiplier-adder in each multiplier-adder group from the multiplier-adder array includes: determining a target matrix based on the operation step size and the multiplier-adder array; and determining the position of the first target multiplier-adder of each multiplier-adder group in the multiplier-adder array according to the matrix element values of the target matrix.

In a possible implementation manner, using each multiplier-adder group of the plurality of multiplier-adder groups to execute, in parallel, the data processing task corresponding to that multiplier-adder group includes: storing the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that multiplier-adder group, according to the position of each target multiplier-adder of the group in the multiplier-adder array; for each data processing cycle of a plurality of data processing cycles, reading, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that multiplier-adder group in the data processing cycle; performing parallel processing on the read image data to be processed to obtain the data processing result of each multiplier-adder group in the data processing cycle; and completing the data processing task corresponding to each multiplier-adder group according to the data processing results of that group in the respective data processing cycles.

In this way, the multiplier-adder array ensures that each multiplier-adder group can process the corresponding data processing task by reading corresponding operands in different data processing cycles, and ensures the validity of the processing result of the multiplier-adder array for the data processing task.

In a possible implementation manner, storing the image data to be processed corresponding to a multiplier-adder group into the register array corresponding to that multiplier-adder group, according to the position of each target multiplier-adder of the group in the multiplier-adder array, includes: for each multiplier-adder group, determining the position, in the corresponding register array, of the register that each target multiplier-adder of the multiplier-adder group fixedly reads; and for each multiplier-adder group, storing the image data to be processed corresponding to the multiplier-adder group into the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder of the group in the multiplier-adder array, the position of the register fixedly read by each target multiplier-adder of the group, and the processing order of the operands contained in the image data to be processed during data processing, so that in each data processing cycle, the operand stored in the register fixedly read by each target multiplier-adder corresponds to the matrix element in the matrix operand for that data processing cycle.

In a possible implementation manner, for each data processing cycle of the plurality of data processing cycles, reading, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that multiplier-adder group in the data processing cycle, and performing parallel processing on the read image data to be processed to obtain the data processing result of each multiplier-adder group in the data processing cycle, includes: for the first data processing cycle in which the image data to be processed is processed, controlling each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand corresponding to that target multiplier-adder in the first data processing cycle as a first operand; determining the matrix element, in the matrix operand, corresponding to the first data processing cycle of each multiplier-adder group as a second operand; and determining, for each target multiplier-adder, the product of the first operand and the second operand in the first data processing cycle; and for each non-first data processing cycle in which the image data to be processed is processed, controlling the image data to be processed to move by a preset step size in the register array according to the preset data movement mode corresponding to the data processing cycle; controlling each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand of that target multiplier-adder in the non-first data processing cycle as the first operand; determining the matrix element, in the matrix operand, corresponding to the data processing cycle of each multiplier-adder group as the second operand; and determining, for each target multiplier-adder, the product of the first operand and the second operand in the data processing cycle.

In this way, based on the preset step size and the preset data movement mode, the operands are shifted through the register array in an orderly manner as the data processing cycles advance, which ensures that the corresponding multiplier-adders in the multiplier-adder array obtain valid data and that the processing results of the data processing tasks are valid.
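The idea of moving data past a fixedly read register can be sketched as follows. This is only a toy model under stated assumptions: the movement mode is a whole-array left shift, the preset step size is 1, vacated registers are filled with 0, and each multiplier-adder always reads column 0 of its row. The name `shift_left` and these choices are illustrative, not the disclosed register layout.

```python
def shift_left(register_rows, step=1):
    """Shift every row of the register array left by `step` positions.

    Assumption: registers vacated on the right are filled with 0.
    """
    return [row[step:] + [0] * step for row in register_rows]


# Toy register array holding image data (values are arbitrary).
registers = [[1, 2, 3, 4],
             [5, 6, 7, 8]]

FIXED_COL = 0  # hypothetical: each unit always reads column 0 of its row
seen = []      # operands read by the two units in each cycle

for cycle in range(3):
    if cycle > 0:
        # Non-first cycles: move the data before reading.
        registers = shift_left(registers, step=1)
    seen.append([row[FIXED_COL] for row in registers])

print(seen)
```

Each unit reads from the same register in every cycle, yet receives a different operand because the data, not the read position, moves.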

In a possible implementation manner, completing the data processing task corresponding to each multiplier-adder group according to the data processing results of that group in the respective data processing cycles includes: for each target multiplier-adder in each multiplier-adder group, adding the products obtained by the target multiplier-adder in the respective data processing cycles to obtain a sum value; and completing the data processing task corresponding to each multiplier-adder group based on the sum values corresponding to the target multiplier-adders contained in that multiplier-adder group.
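The per-cycle multiply and final summation described above can be sketched for a single target multiplier-adder. The function name `accumulate` and the sample values are hypothetical; the sketch assumes one first operand (image data) and one second operand (matrix element) per data processing cycle.

```python
def accumulate(first_operands, second_operands):
    """Sum the per-cycle products of one target multiplier-adder.

    first_operands[i]  : operand read from the fixed register in cycle i
    second_operands[i] : matrix element of the matrix operand for cycle i
    """
    if len(first_operands) != len(second_operands):
        raise ValueError("one operand pair is required per cycle")
    total = 0
    for image_value, kernel_value in zip(first_operands, second_operands):
        total += image_value * kernel_value  # product of this cycle
    return total


# Four cycles, e.g. one cycle per element of a 2x2 matrix operand.
sum_value = accumulate([1, 2, 5, 6], [1, 0, 0, 1])
print(sum_value)
```

The sum value of each target multiplier-adder then forms one element of the result for its group's data processing task.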

In a possible implementation manner, the data processing tasks include: convolution processing tasks; images to be processed corresponding to the convolution processing tasks of different multiplier-adder groups are different.

In this way, the multiplier-adder array can process multiple images to be processed at the same time, which improves the processing efficiency of the multiplier-adder array for the images to be processed.

In a second aspect, an embodiment of the present disclosure further provides a data processing apparatus, including a controller, where the controller is configured to: group a plurality of multiplier-adders in a multiplier-adder array based on an operation step size to obtain a plurality of multiplier-adder groups; and use each of the plurality of multiplier-adder groups to execute, in parallel, the data processing task corresponding to that multiplier-adder group.

In a possible implementation manner, when grouping the plurality of multiplier-adders in the multiplier-adder array based on the operation step size, the controller is specifically configured to: determine the number of multiplier-adder groups based on the operation step size; and group the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.

In a possible implementation manner, when grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups, the controller is specifically configured to: determine, from the multiplier-adder array, the first target multiplier-adder in each multiplier-adder group; and determine, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group, based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array.

In a possible implementation manner, when determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array, the controller is specifically configured to: for each multiplier-adder in a multiplier-adder group other than the first target multiplier-adder, determine, based on the operation step size, a first positional relationship in the multiplier-adder array between the multiplier-adder and the previous multiplier-adder that is in the same row as, and adjacent to, the multiplier-adder; determine, based on the operation step size and the number of columns of the multiplier-adder array, a second positional relationship in the multiplier-adder array between the multiplier-adder and the previous multiplier-adder that is in the same column as, and adjacent to, the multiplier-adder; and determine the target position of the multiplier-adder in the multiplier-adder array based on the position of the first target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the first positional relationship, and the second positional relationship.

In a possible implementation manner, when determining the first target multiplier-adder in each multiplier-adder group from the multiplier-adder array, the controller is specifically configured to: determine a target matrix based on the operation step size and the multiplier-adder array; and determine the position of the first target multiplier-adder of each multiplier-adder group in the multiplier-adder array according to the matrix element values of the target matrix.

In a possible implementation manner, when using each multiplier-adder group of the plurality of multiplier-adder groups to execute, in parallel, the data processing task corresponding to that multiplier-adder group, the controller is specifically configured to: store the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that multiplier-adder group, according to the position of each target multiplier-adder of the group in the multiplier-adder array; for each data processing cycle of a plurality of data processing cycles, read, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that multiplier-adder group in the data processing cycle; perform parallel processing on the read image data to be processed to obtain the data processing result of each multiplier-adder group in the data processing cycle; and complete the data processing task corresponding to each multiplier-adder group according to the data processing results of that group in the respective data processing cycles.

In a possible implementation manner, when storing the image data to be processed corresponding to a multiplier-adder group into the register array corresponding to that multiplier-adder group, according to the position of each target multiplier-adder of the group in the multiplier-adder array, the controller is specifically configured to: for each multiplier-adder group, determine the position, in the corresponding register array, of the register that each target multiplier-adder of the multiplier-adder group fixedly reads; and for each multiplier-adder group, store the image data to be processed corresponding to the multiplier-adder group into the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder of the group in the multiplier-adder array, the position of the register fixedly read by each target multiplier-adder of the group, and the processing order of the operands contained in the image data to be processed during data processing, so that in each data processing cycle, the operand stored in the register fixedly read by each target multiplier-adder corresponds to the matrix element in the matrix operand for that data processing cycle.

In a possible implementation manner, when, for each data processing cycle of the plurality of data processing cycles, reading, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that multiplier-adder group in the data processing cycle, and performing parallel processing on the read image data to be processed to obtain the data processing result of each multiplier-adder group in the data processing cycle, the controller is specifically configured to: for the first data processing cycle in which the image data to be processed is processed, control each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand corresponding to that target multiplier-adder in the first data processing cycle as a first operand; determine the matrix element, in the matrix operand, corresponding to the first data processing cycle of each multiplier-adder group as a second operand; and determine, for each target multiplier-adder, the product of the first operand and the second operand in the first data processing cycle; and for each non-first data processing cycle in which the image data to be processed is processed, control the image data to be processed to move by a preset step size in the register array according to the preset data movement mode corresponding to the data processing cycle; control each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand of that target multiplier-adder in the non-first data processing cycle as the first operand; determine the matrix element, in the matrix operand, corresponding to the data processing cycle of each multiplier-adder group as the second operand; and determine, for each target multiplier-adder, the product of the first operand and the second operand in the data processing cycle.

In a possible implementation manner, when completing the data processing task corresponding to each multiplier-adder group according to the data processing results of that group in the respective data processing cycles, the controller is specifically configured to: for each target multiplier-adder in each multiplier-adder group, add the products obtained by the target multiplier-adder in the respective data processing cycles to obtain a sum value; and complete the data processing task corresponding to each multiplier-adder group based on the sum values corresponding to the target multiplier-adders contained in that multiplier-adder group.

In a third aspect, an optional implementation manner of the present disclosure further provides a computer device including a controller and a memory, where the memory stores machine-readable instructions executable by the controller, and the controller is configured to execute the machine-readable instructions stored in the memory; when the machine-readable instructions are executed by the controller, the controller performs the steps of the first aspect above or of any possible implementation of the first aspect.

In a fourth aspect, an optional implementation manner of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when run, performs the steps of the first aspect above or of any possible implementation of the first aspect.

For the description of the effects of the above-mentioned data processing apparatus, computer equipment, and computer-readable storage medium, please refer to the description of the above-mentioned data processing method, which will not be repeated here.

In order to make the above-mentioned objects, features and advantages of the present disclosure more obvious and easy to understand, the preferred embodiments are exemplified below, and are described in detail as follows in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings required in the embodiments are briefly introduced below. These drawings illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting its scope. Other related drawings may be obtained from these drawings.

FIG. 1 shows a flowchart of a data processing method provided by an embodiment of the present disclosure;

FIG. 2 shows an example diagram of a multiplier-adder array provided by an embodiment of the present disclosure;

FIG. 3A, FIG. 3B, and FIG. 3C show example diagrams of movement based on an operation step size provided by an embodiment of the present disclosure;

FIG. 4 shows an example diagram of dividing a multiplier-adder array into four multiplier-adder groups provided by the present disclosure;

FIG. 5 shows an example diagram of a matrix for determining the position of the first target multiplier-adder in each multiplier-adder group in the multiplier-adder array provided by an embodiment of the present disclosure;

FIG. 6A and FIG. 6B show example diagrams of a multiplier-adder array and a corresponding register array provided by an embodiment of the present disclosure;

FIG. 7 shows an example diagram of register array a after the image data to be processed has been shifted left by one step, as a whole, in the register array, in an embodiment of the present disclosure;

FIG. 8 shows a schematic diagram of a data processing apparatus provided by an embodiment of the present disclosure;

FIG. 9 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.

DETAILED DESCRIPTION

In order to make the purposes, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, but not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure generally described and illustrated herein may be arranged and designed in a variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the disclosure as claimed, but merely represents selected embodiments of the disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative work fall within the protection scope of the present disclosure.

After research, it is found that convolutional neural networks mainly rely on multiplier-adder arrays for convolution processing. During convolution processing, the image data to be processed is stored in the register array connected to the multiplier-adder array, and is moved within the register array in different data processing cycles. In each data processing cycle, each multiplier-adder in the multiplier-adder array reads the operand for that cycle from the register (belonging to the register array) connected to it and performs multiplication and/or addition. After a plurality of data processing cycles, the multiplier-adder array outputs the result of the convolution processing of the image data to be processed. When the operation step size is greater than 1, the processing results of some multiplier-adders in the multiplier-adder array are not needed in the result of processing the image data to be processed, so in this case the data processing method suffers from low utilization of the multiplier-adder array and a waste of computing resources.
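The utilization problem can be made concrete with a rough sketch. It assumes, purely for illustration, that the multiplier-adder at row r and column c produces the stride-1 result for output position (r, c), so that with step sizes S_x and S_y only positions whose row index is a multiple of S_y and whose column index is a multiple of S_x are needed. The function name `needed_fraction` and this mapping are assumptions, not details from the disclosure.

```python
def needed_fraction(rows, cols, s_x, s_y):
    """Fraction of multiplier-adders whose result is needed, under the
    assumption that unit (r, c) computes the stride-1 output at (r, c)."""
    needed = sum(1
                 for r in range(rows)
                 for c in range(cols)
                 if r % s_y == 0 and c % s_x == 0)
    return needed / (rows * cols)


print(needed_fraction(4, 4, 1, 1))  # step size 1: every result is used
print(needed_fraction(4, 4, 2, 2))  # step size 2: only a quarter is used
```

Under this assumption, a 4 x 4 array running a single task with step size 2 leaves three quarters of its multiplier-adders producing results that are discarded.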

Based on the above research, the present disclosure provides a data processing method, apparatus, computer device, and storage medium. The multiplier-adder array is grouped based on the operation step size to obtain multiple multiplier-adder groups, so that different multiplier-adder groups in the multiplier-adder array process, in parallel, data processing tasks corresponding to different image data to be processed. That is, the same multiplier-adder array can process multiple pieces of image data to be processed at the same time, with each multiplier-adder group processing one piece. Multiplier-adders that are not used when processing one piece of image data to be processed are used to process other image data to be processed, which improves the utilization rate of the multiplier-adder array, reduces the waste of computing resources, and improves the processing efficiency of the multiplier-adder array for the image data to be processed.

The defects of the above solutions are all results obtained by the inventors after practice and careful research. Therefore, the process of discovering the above problems, and the solutions to those problems proposed hereinafter by the present disclosure, should all be regarded as contributions made by the inventors in the course of the present disclosure.

It should be noted that like numerals and letters refer to like items in the following figures, so once an item is defined in one figure, it does not require further definition and explanation in subsequent figures.

To facilitate understanding of this embodiment, a data processing method disclosed in the embodiments of the present disclosure is first introduced in detail. The method may be executed by a computer device, which includes, for example, a terminal device, a server, or another processing device. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible implementations, the data processing method may be implemented by a processor invoking computer-readable instructions stored in a memory.

The data processing method provided by the embodiments of the present disclosure will be described below.

Referring to FIG. 1, which is a flowchart of a data processing method provided by an embodiment of the present disclosure, the method includes steps S101-S102, wherein:

S101: Grouping multiple multiplier-adders in the multiplier-adder array based on the operation step size to obtain multiple multiplier-adder groups;

S102: Using each multiplier-adder group in the multiple multiplier-adder groups, execute the data processing task corresponding to each multiplier-adder group in parallel.

In the present disclosure, the multiplier-adder array is grouped based on the operation step size to obtain multiple multiplier-adder groups, and the multiple multiplier-adder groups execute their corresponding data processing tasks in parallel. Because the data processing tasks of the multiplier-adder groups differ, the multiplier-adder array can process multiple data processing tasks at the same time, which improves the processing efficiency of the multiplier-adder array. In addition, multiplier-adders that are not used in the data processing of one image to be processed are used to process the data of other images to be processed, which improves the utilization rate of the multiplier-adder array and reduces the waste of computing resources.

The above S101 to S102 will be described in detail below.

For the above S101, the multiplier-adder array is a matrix array composed of a plurality of multiplier-adders. As an example, FIG. 2 shows an example diagram of a multiplier-adder array provided by the present disclosure; the multiplier-adder array includes 16 multiplier-adders arranged in 4 rows and 4 columns. The matrix operand includes, for example, the convolution kernel used when the image data to be processed is processed; the operation step size is, for example, the moving step size of the convolution kernel. Exemplarily, as shown in FIG. 3A, FIG. 3B, and FIG. 3C, moving the convolution kernel with a step size of 2 means S_x = 2 and S_y = 2. The movement proceeds, for example, from target position one shown in FIG. 3A to target position two shown in FIG. 3B, and then from target position two shown in FIG. 3B to target position three shown in FIG. 3C; that is, the kernel moves two pixels at a time horizontally and two pixels at a time vertically. Here, S_x represents the number of pixels moved in the horizontal direction, and S_y represents the number of pixels moved in the vertical direction.
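The kernel movement described above can be sketched by listing the top-left positions a convolution kernel visits for given step sizes. The function name `kernel_positions`, the image and kernel sizes, and the absence of padding are illustrative assumptions rather than values from the disclosure.

```python
def kernel_positions(height, width, k_h, k_w, s_x, s_y):
    """List the (top, left) pixel positions of a k_h x k_w kernel sliding
    over a height x width image, moving s_x pixels horizontally and
    s_y pixels vertically (no padding assumed)."""
    return [(top, left)
            for top in range(0, height - k_h + 1, s_y)
            for left in range(0, width - k_w + 1, s_x)]


# Hypothetical 6x6 image and 2x2 kernel with S_x = 2, S_y = 2.
positions = kernel_positions(6, 6, 2, 2, s_x=2, s_y=2)
print(positions)
```

Consecutive positions in one row differ by S_x pixels, and the first positions of consecutive rows differ by S_y pixels.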

When grouping the multiplier-adders in the multiplier-adder array, for example, the number of multiplier-adder groups can be determined based on the operation step size, and the plurality of multiplier-adders in the multiplier-adder array can then be grouped based on the number of multiplier-adder groups.

In a specific implementation, the relationship between the operation step size and the number GN of multiplier-adder groups is:

GN=S _x *S _y ;

For example, when the operation step size is 2, S _x =2, S _y =2, then the number GN of multiplier-adder groups is 4.

In a specific implementation, an embodiment of the present disclosure provides a specific method for grouping the multiple multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups to obtain multiple multiplier-adder groups, including: determining, in the multiplier-adder array, the first target multiplier-adder of each multiplier-adder group; and determining, in the multiplier-adder array, the target multiplier-adders other than the first target multiplier-adder in each multiplier-adder group based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array.

In a specific implementation, the size of the multiplier-adder array is fixed, whereas the size of the image data to be processed varies with the actual situation; when the image data to be processed is processed in parallel, the utilization rate of the multiplier-adder array may therefore not reach 100% in many cases. Therefore, in the embodiment of the present disclosure, the size information of the multiplier-adder array actually used is first determined based on the operation step size and the size of the multiplier-adder array. The size of the multiplier-adder array includes the number of rows and the number of columns of the multiplier-adder array, and the size information of the multiplier-adder array actually used includes the number of rows and the number of columns of the multiplier-adder array actually used. The relationship between the size information of the multiplier-adder array actually used, the operation step size, and the size of the multiplier-adder array is:

A' _x =A _x -A _x %S _x ;

A' _y =A _y -A _y %S _y ;

Among them, A _x is the number of columns of the multiplier-adder array, and A _y is the number of rows of the multiplier-adder array; A' _x is the number of columns of the multiplier-adder array actually used, and A' _y is the number of rows of the multiplier-adder array actually used; % is the remainder operation. Exemplarily, when the operation step size is 2, S _x =2 and S _y =2, and the size of the multiplier-adder array is A _x =5, A _y =5. The number of columns of the multiplier-adder array actually used is therefore A' _x =A _x -A _x %S _x =5-5%2=4, and the number of rows of the multiplier-adder array actually used is A' _y =A _y -A _y %S _y =5-5%2=4.
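As an illustration only, the two relationships above (the group count GN and the size of the multiplier-adder array actually used) can be sketched in a few lines of Python. The function and parameter names are hypothetical and are not part of the disclosure; the sketch simply reproduces the 5-row, 5-column example with step size 2.

```python
def group_count(s_x, s_y):
    """Number of multiplier-adder groups: GN = S_x * S_y."""
    return s_x * s_y


def used_array_size(a_x, a_y, s_x, s_y):
    """Return (A'_x, A'_y): columns and rows of the array actually used.

    A'_x = A_x - A_x % S_x and A'_y = A_y - A_y % S_y.
    """
    return a_x - a_x % s_x, a_y - a_y % s_y


# Example from the text: a 5 x 5 array with step size 2 in both directions.
used_cols, used_rows = used_array_size(a_x=5, a_y=5, s_x=2, s_y=2)
print(used_cols, used_rows, group_count(2, 2))  # 4 4 4
```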

The first target multiplier-adder of each multiplier-adder group is then determined in the multiplier-adder array actually used.

During specific implementation, for example, the first target multiplier-adder of each multiplier-adder group can be determined as follows: determining the target matrix based on the operation step size and the multiplier-adder array; and determining, according to the matrix element values of the target matrix, the position of the first target multiplier-adder of each multiplier-adder group in the multiplier-adder array; where each matrix element value represents the first target multiplier-adder of one multiplier-adder group.

Exemplarily, the size information of the multiplier-adder array actually used is 4 rows and 4 columns, and when the operation step size is 2, the number of multiplier-adder groups is 4, which means that the target matrix contains two rows and two columns, a total of 4 multiplier-adders. The first multiplier-adder in the multiplier-adder array is used as the first multiplier-adder of the target matrix, that is, the first target multiplier-adder of the first multiplier-adder group; based on the first multiplier-adder of the target matrix, the other multiplier-adders in the target matrix, that is, the first target multiplier-adders of the other multiplier-adder groups, are determined. For example, the positional arrangement numbers of the actually used multiplier-adder array are:

0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15

The first target multiplier-adder of the first multiplier-adder group, that is, the first multiplier-adder of the target matrix, is at position 0, so the target matrix with two rows and two columns can be determined based on the multiplier-adder at position 0. The corresponding positions of the target matrix in the actually used multiplier-adder array are numbered as:

0 1
4 5

Then, the positions of the first target multiplier-adders of the other three multiplier-adder groups in the actually used multiplier-adder array are respectively 1, 4, and 5, as shown in the target matrix in FIG. 4 .
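The example above can be reproduced with a short sketch. It assumes, as the example suggests but the text does not state in general, that the target matrix is the top-left block of S _y rows and S _x columns of the actually used array and that positions are numbered row by row starting from 0; the function name is hypothetical.

```python
def first_target_positions(used_cols, s_x, s_y):
    """Positions of the first target multiplier-adder of each group.

    Assumption of this sketch: the target matrix is the top-left
    S_y x S_x block of the actually used array, numbered row-major from 0.
    """
    return [[row * used_cols + col for col in range(s_x)] for row in range(s_y)]


# 4-column used array, step size 2 in both directions.
target_matrix = first_target_positions(used_cols=4, s_x=2, s_y=2)
print(target_matrix)  # [[0, 1], [4, 5]]
```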

Exemplarily, the target position of the first target multiplier-adder of each multiplier-adder group in the actually used multiplier-adder array can also be determined with reference to the formula corresponding to each matrix element in the matrix shown in FIG. 5. That is to say, each matrix element in the matrix shown in FIG. 5 represents the position of the first target multiplier-adder of one multiplier-adder group in the actually used multiplier-adder array. Among them, A' _x is the number of columns of the multiplier-adder array actually used, A' _y is the number of rows of the multiplier-adder array actually used,

A' _x =A _x -A _x %S _x ;

A' _y =A _y -A _y %S _y ,

A _x is the column number of the multiplier-adder array, A _y is the row number of the multiplier-adder array; S _x is the horizontal movement step of the operation step, and S _y is the vertical movement step of the operation step.

After the first target multiplier-adder of each multiplier-adder group is determined in the actually used multiplier-adder array, the target multiplier-adders other than the first target multiplier-adder in each multiplier-adder group can be determined, for example, based on the following steps 1 to 3:

Step 1: For each multiplier-adder in each multiplier-adder group other than the first target multiplier-adder, determine, based on the operation step size, the first positional relationship in the multiplier-adder array between that multiplier-adder and the previous multiplier-adder that is in the same row and adjacent to it. The first positional relationship between each multiplier-adder PO(i) other than the first target multiplier-adder in the multiplier-adder group and the adjacent previous multiplier-adder PO(i-1) in the same row is, for example:

PO(i-1)+S _x =PO(i).

Exemplarily, the actually used multiplier-adder array is 4 rows and 4 columns, that is, A' _y =4, A' _x =4, and the positional arrangement numbers of the actually used multiplier-adder array are:

0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15

When the operation step size is 2, S _x =2 and S _y =2. As shown in Figure 4, the four different colors represent four multiplier-adder groups: the first multiplier-adder group in black, the second multiplier-adder group in white, the third multiplier-adder group in light gray, and the fourth multiplier-adder group in dark gray. Taking the first multiplier-adder group as an example, the first target multiplier-adder of the first multiplier-adder group is at position 0, so the position PO(A) of another multiplier-adder A of the same group in this row is:

PO(A)=0+S _x =0+2=2,

The position PO(B) of the next multiplier-adder B after position 2 in the same group is:

PO(B)=PO(A)+S _x =2+S _x =2+2=4,

However, because the multiplier-adder array actually used has 4 columns and the maximum position number in this row is 3, the only multiplier-adder in this row that belongs to the same group as the multiplier-adder at position 0 is the multiplier-adder at position 2.
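A minimal sketch of Step 1, assuming row-major position numbers as in the example: starting from a first target multiplier-adder, the relationship PO(i)=PO(i-1)+S _x is applied repeatedly until the end of the row is passed. The helper name is hypothetical.

```python
def same_row_members(first_pos, s_x, used_cols):
    """Positions in the row of first_pos reached by PO(i) = PO(i-1) + S_x."""
    # Last position number in the row that contains first_pos.
    row_end = (first_pos // used_cols) * used_cols + used_cols - 1
    members = []
    pos = first_pos
    while pos <= row_end:
        members.append(pos)
        pos += s_x
    return members


# First group of the example: position 0, step 2, 4 columns.
print(same_row_members(first_pos=0, s_x=2, used_cols=4))  # [0, 2]
```

Position 4 is rejected because it exceeds the maximum position number 3 of the row, which matches the reasoning in the example.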

Step 2: For each multiplier-adder in each multiplier-adder group other than the first target multiplier-adder, determine, based on the operation step size and the number of columns of the multiplier-adder array, the second positional relationship in the multiplier-adder array between that multiplier-adder and the previous multiplier-adder that is in the same column and adjacent to it. The second positional relationship between each multiplier-adder PO(j) other than the first target multiplier-adder in the multiplier-adder group and the adjacent previous multiplier-adder PO(j-1) in the same column is, for example:

PO(j-1)+S _y *A' _x =PO(j).

When the operation step size is 2, S _x =2, S _y =2, as shown in Figure 4, taking the first multiplier-adder group as an example, the first target multiplier-adder of the first multiplier-adder group is at the position 0, then the position PO(C) of another multiplier-adder C of the same group in this column is:

PO(C)=0+S _y *A′ _x =0+2*4=8,

The position PO(D) of the next multiplier-adder D after position 8 in the same group is:

PO(D)=PO(C)+S _y *A′ _x =8+S _y *A′ _x =8+2*4=16,

However, because the multiplier-adder array actually used has 4 rows and 4 columns and the maximum position number in this column is 12, the only multiplier-adder in this column that belongs to the same group as the multiplier-adder at position 0 is the multiplier-adder at position 8.
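Step 2 can be sketched in the same way, again assuming row-major position numbers: the relationship PO(j)=PO(j-1)+S _y *A' _x is applied until the last position of the actually used array is passed. The helper name is hypothetical.

```python
def same_column_members(first_pos, s_y, used_cols, used_rows):
    """Positions in the column of first_pos reached by PO(j) = PO(j-1) + S_y * A'_x."""
    last_pos = used_cols * used_rows - 1  # largest position number in the used array
    members = []
    pos = first_pos
    while pos <= last_pos:
        members.append(pos)
        pos += s_y * used_cols
    return members


# First group of the example: position 0, step 2, 4 x 4 used array.
print(same_column_members(first_pos=0, s_y=2, used_cols=4, used_rows=4))  # [0, 8]
```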

Step 3: Based on the position of the first target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the first positional relationship, and the second positional relationship, determine the target positions in the multiplier-adder array of the target multiplier-adders in the multiplier-adder group other than the first target multiplier-adder.

When the operation step size is 2, S _x =2 and S _y =2. After the position of the first target multiplier-adder of each multiplier-adder group in the actually used multiplier-adder array is calculated, the target positions in the actually used multiplier-adder array of the other target multiplier-adders in the multiplier-adder group can be calculated with reference to the first positional relationship and the second positional relationship above. Specifically, after the target positions of the group's target multiplier-adders in the row and in the column of the first target multiplier-adder are calculated, the target positions of the remaining target multiplier-adders in the multiplier-adder group can be calculated based on the first positional relationship or the second positional relationship. As shown in Figure 4, taking the first multiplier-adder group as an example, the first target multiplier-adder of the first multiplier-adder group is at position 0, and the position PO(A) of the multiplier-adder A in the same row and the same group is 2; then the position PO(E) of the next multiplier-adder E in the same column and the same group as multiplier-adder A is:

PO(E)=2+S _y *A′ _x =2+2*4=2+8=10;

Alternatively, for example, the position PO(C) of the next multiplier-adder C in the same column and the same group as the multiplier-adder at position 0 is 8; then the position PO(E) of the next multiplier-adder E in the same row and the same group as multiplier-adder C is:

PO(E)=8+S _x =8+2=10.

Exemplarily, as shown in FIG. 4, the present disclosure provides an example diagram of dividing the multiplier-adder array into four multiplier-adder groups. Four different colors represent the four multiplier-adder groups: the first multiplier-adder group in black, the second multiplier-adder group in white, the third multiplier-adder group in light gray, and the fourth multiplier-adder group in dark gray. In the same row of the multiplier-adder array, the number of multiplier-adders of other groups between two adjacent multiplier-adders of the same group is the same and is not zero; likewise, in the same column of the multiplier-adder array, the number of multiplier-adders of other groups between two adjacent multiplier-adders of the same group is the same and is not zero.
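Putting steps 1 to 3 together, the full grouping of FIG. 4 can be sketched as follows. This is an illustration under the same assumptions as the examples above (row-major position numbers, first target multiplier-adders in the top-left S _y by S _x block); the function name is hypothetical, and stepping through rows and columns with `range` is equivalent to applying the first and second positional relationships repeatedly.

```python
def build_groups(used_cols, used_rows, s_x, s_y):
    """Return one list of positions per multiplier-adder group.

    Each group starts at (r0, c0) in the top-left S_y x S_x block and then
    advances by S_x within a row and by S_y rows (S_y * A'_x positions)
    within a column.
    """
    groups = []
    for r0 in range(s_y):
        for c0 in range(s_x):
            members = [
                row * used_cols + col
                for row in range(r0, used_rows, s_y)
                for col in range(c0, used_cols, s_x)
            ]
            groups.append(members)
    return groups


groups = build_groups(used_cols=4, used_rows=4, s_x=2, s_y=2)
for index, members in enumerate(groups, start=1):
    print(index, members)
# 1 [0, 2, 8, 10]
# 2 [1, 3, 9, 11]
# 3 [4, 6, 12, 14]
# 4 [5, 7, 13, 15]
```

In this result, position 10 belongs to the first group, which agrees with PO(E)=10 computed above by either route.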

For the above S102, the images to be processed corresponding to the convolution processing tasks of different multiplier-adder groups are different, for example, each multiplier-adder group convolves different data matrices respectively.

When each multiplier-adder group in the multiple multiplier-adder groups is used to execute, in parallel, the data processing task corresponding to that multiplier-adder group, the image data to be processed corresponding to each multiplier-adder group is stored in the register array corresponding to that multiplier-adder group according to the position of each target multiplier-adder of each multiplier-adder group in the multiplier-adder array.

Here, the image data to be processed includes, for example, at least one of the following: the original image to be processed; a sub-image corresponding to any color channel in the original image to be processed; a feature map obtained after feature extraction is performed on the original image; a feature sub-map corresponding to at least one channel in the feature map obtained after feature extraction is performed on the original image; image data obtained after data filling processing is performed on the sub-image corresponding to at least one color channel in the original image; and image data obtained after data filling processing is performed on the feature sub-map corresponding to at least one channel.

Taking the feature map as the image data to be processed as an example, when the image data to be processed is stored in the register array, each of at least some of the registers stores the feature value of one feature point in the image data to be processed, which is also called an operand required by the multiplier-adder.

For each multiplier-adder group, determine the position, in the corresponding register array, of the register that each target multiplier-adder of the multiplier-adder group reads fixedly. As shown in FIG. 6A, the multiplier-adder array includes four multiplier-adder groups, which correspond respectively to the four register arrays shown in FIG. 6B: the black multiplier-adder group corresponds to the black register array a, the white multiplier-adder group corresponds to the white register array b, the light gray multiplier-adder group corresponds to the light gray register array c, and the dark gray multiplier-adder group corresponds to the dark gray register array d. Each target multiplier-adder reads the feature value stored in its fixed register: PE0 reads register A0, PE1 reads register B0, PE2 reads register A2, PE3 reads register B2, PE4 reads register C0, PE5 reads register D0, PE6 reads register C2, PE7 reads register D2, PE8 reads register A8, PE9 reads register B8, PE10 reads register A10, PE11 reads register B10, PE12 reads register C8, PE13 reads register D8, PE14 reads register C10, and PE15 reads register D10.
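The PE-to-register pairs listed above follow a regular pattern, which the sketch below reproduces. The rule used here (register index equals the PE position minus the position of the first target multiplier-adder of its group) is inferred from the listed example only and is not stated in the disclosure; the function name and the letter naming of the register arrays are illustrative.

```python
def fixed_register(pe, used_cols, s_x, s_y, names="ABCD"):
    """Name of the register that multiplier-adder `pe` reads fixedly.

    Inferred rule (an assumption of this sketch): the register index is the
    PE position minus the position of the first target multiplier-adder of
    its group. `names` supplies one letter per group, so it must contain at
    least S_x * S_y letters.
    """
    row, col = divmod(pe, used_cols)
    group = (row % s_y) * s_x + (col % s_x)
    first_target = (row % s_y) * used_cols + (col % s_x)
    return f"{names[group]}{pe - first_target}"


mapping = {f"PE{pe}": fixed_register(pe, used_cols=4, s_x=2, s_y=2) for pe in range(16)}
print(mapping["PE0"], mapping["PE3"], mapping["PE6"], mapping["PE15"])  # A0 B2 C2 D10
```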

For each multiplier-adder group, the image data to be processed corresponding to the multiplier-adder group is stored in the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the position of the register read by each target multiplier-adder of the multiplier-adder group, and the processing order of the operands contained in the image data to be processed during data processing, so that in each data processing cycle the operand stored in the register fixedly read by each target multiplier-adder corresponds to the matrix element of the matrix operand for that processing cycle.

The matrix operand includes, for example, the convolution kernel used in the convolution calculation, that is, a data matrix. Exemplarily, a matrix operand with two rows and two columns provided by the present disclosure is:

W ₀ W ₁
W ₃ W ₂

It includes the matrix elements W ₀ , W ₁ , W ₂ , and W ₃ . The number of operands contained in the image data to be processed corresponding to each multiplier-adder group should be consistent. The image data to be processed corresponding to the first multiplier-adder group shown in FIG. 6A is:

Then the storage rule of the image data to be processed in the black register array a corresponding to the first multiplier-adder group is shown in the upper left part of FIG. 6B , and the image data to be processed corresponding to the second multiplier-adder group is:

Then the storage rule of the to-be-processed image data in the white register array b corresponding to the second multiplier-adder group is shown in the upper right part of FIG. 6B , and the to-be-processed image data corresponding to the third multiplier-adder group is:

Then the storage rule of the to-be-processed image data in the light gray register array c corresponding to the third multiplier-adder group is shown in the lower left part of FIG. 6B , and the to-be-processed image data corresponding to the fourth multiplier-adder group is:

Then, the storage rule of the image data to be processed in the dark gray register array d corresponding to the fourth multiplier-adder group is shown in the lower right part of FIG. 6B .

After the image data to be processed corresponding to each multiplier-adder group is stored in the register array corresponding to that multiplier-adder group, for each data processing cycle in the multiple data processing cycles, the image data to be processed corresponding to each multiplier-adder group in the data processing cycle is read from the register array corresponding to that multiplier-adder group, and the read image data to be processed is processed in parallel to obtain the data processing result of each multiplier-adder group in the data processing cycle.

For the first data processing cycle in which the image data to be processed is processed, each target multiplier-adder in each multiplier-adder group is controlled to read, from the register that it reads fixedly, its operand for the first data processing cycle as the first operand; the matrix element of the matrix operand corresponding to the first data processing cycle for each multiplier-adder group is determined as the second operand; and the product of the first operand and the second operand of each target multiplier-adder in the first data processing cycle is determined.

Exemplarily, the target multiplier-adder PE0 reads the operand a0 from the register A0 that it reads fixedly in the corresponding register array, the target multiplier-adder PE1 reads the operand b0 from the register B0, and the operands read by the other target multiplier-adders follow by analogy and are not repeated here; assuming that the matrix operands are:

Taking the multiplier-adder PE0 as an example, after reading the operand a0, take a0 as the first operand, the matrix element corresponding to the data processing cycle is W ₀ , and take W ₀ as the second operand, and then calculate W ₀ * a0; and store the result in a register.

For each non-first data processing cycle in which the image data to be processed is processed, the image data to be processed is controlled to move by a preset step size in the register array according to the preset data movement mode corresponding to the data processing cycle; each target multiplier-adder in each multiplier-adder group is controlled to read, from the register that it reads fixedly, its operand for the non-first data processing cycle as the first operand; the matrix element of the matrix operand corresponding to the data processing cycle for each multiplier-adder group is determined as the second operand; and the product of the first operand and the second operand of each target multiplier-adder in the data processing cycle is determined.

Exemplarily, taking the second data processing cycle corresponding to the multiplier-adder group shown in FIG. 6A as an example, the preset data movement mode is to move to the left, and the preset step size is 1. FIG. 7 is an example diagram of the register array a after the image data to be processed is shifted to the left by one step in the register array in the embodiment of the present disclosure. As shown in FIG. 7, the multiplier-adder PE0 reads the operand a1 from the register A0, the multiplier-adder PE2 reads the operand a3 from the register A2, and so on for the operands read by the other multiplier-adders, which are not repeated here; the matrix operands are:

Taking the multiplier-adder PE0 as an example, after reading the operand a1, take a1 as the first operand, the matrix element corresponding to the data processing cycle is W ₁ , and take W ₁ as the second operand, and then calculate W ₁ * a1; and store the result in a register.

Similarly, in the third data processing cycle, the image data to be processed can be moved up by one step on the basis of the position shown in Figure 7. At this time, the operand a5 is stored in the register A0, the matrix element corresponding to this data processing cycle is W ₂ , and PE0 can perform the calculation of W ₂ *a5. In the fourth data processing cycle, the image data to be processed can be moved to the right as a whole on the basis of the movement of the third data processing cycle. At this time, the operand a4 is stored in the register A0, the matrix element corresponding to this data processing cycle is W ₃ , and PE0 can perform the calculation of W ₃ *a4. The same is true for the other PEs, which is not repeated here.
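The four data processing cycles described above can be simulated for one multiplier-adder group with the following sketch. Several points are assumptions made only for illustration: the register array a is modeled as a 4-row, 4-column grid holding a0 to a15 row by row, registers A0, A2, A8, and A10 are taken to sit at grid positions (0,0), (0,2), (2,0), and (2,2), and values shifted out on one side re-enter on the other side (wrap-around), because the disclosure does not describe what happens to them. All names are hypothetical.

```python
def rotate(grid, d_row, d_col):
    """Move every value by (d_row, d_col); wrap-around is an assumption of this sketch."""
    rows, cols = len(grid), len(grid[0])
    return [
        [grid[(r - d_row) % rows][(c - d_col) % cols] for c in range(cols)]
        for r in range(rows)
    ]


# Data movement per cycle: none, left by 1, up by 1, right by 1.
MOVES = [(0, 0), (0, -1), (-1, 0), (0, 1)]


def run_group(data, weights, pe_registers):
    """Accumulate weight * operand for each PE over the four cycles.

    weights is [W0, W1, W2, W3] in cycle order; pe_registers maps a PE name
    to the (row, col) of the register it reads fixedly.
    """
    registers = [row[:] for row in data]
    sums = {pe: 0 for pe in pe_registers}
    for weight, (d_row, d_col) in zip(weights, MOVES):
        registers = rotate(registers, d_row, d_col)
        for pe, (r, c) in pe_registers.items():
            sums[pe] += weight * registers[r][c]
    return sums


# Illustrative values: operand a_k simply has the value k.
data = [[4 * r + c for c in range(4)] for r in range(4)]
weights = [1, 10, 100, 1000]  # W0, W1, W2, W3
pe_registers = {"PE0": (0, 0), "PE2": (0, 2), "PE8": (2, 0), "PE10": (2, 2)}
sums = run_group(data, weights, pe_registers)
print(sums["PE0"])  # 4510 = W0*a0 + W1*a1 + W2*a5 + W3*a4
```

With these illustrative weights, each decimal digit group of a result shows which operand met which kernel element, so PE0 can be seen to have read a0, a1, a5, and a4 in that order.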

It can be seen that in each data processing cycle, the PEs that process different image data to be processed all complete the calculation for that data processing cycle; that is, the different multiplier-adder groups complete their calculations in parallel in each data processing cycle. After all data processing cycles, the different multiplier-adder groups complete their final calculations at the same time, which saves system resources.

Here, for different image data to be processed, the corresponding convolution kernels may be different or the same. For example, if the two image data to be processed are different feature submaps of the same feature map, the convolution kernels corresponding to the two image data to be processed are different. If the two to-be-processed image data are image data of different positions of the same feature submap, the convolution kernels corresponding to the two to-be-processed image data are the same.

After the data processing results of each multiplier-adder group in the multiple data processing cycles are obtained, the data processing task corresponding to each multiplier-adder group can be completed according to the data processing results corresponding to that multiplier-adder group in each data processing cycle.

For each target multiplier-adder in each multiplier-adder group, the products obtained by the target multiplier-adder in the data processing cycles are added to obtain a sum value; the data processing task corresponding to each multiplier-adder group is then completed based on the sum values corresponding to the target multiplier-adders contained in that multiplier-adder group.

For example, taking the target multiplier-adder PE0 shown in FIG. 6A as an example, the products calculated by PE0 in four data processing cycles are: W ₀ *a0, W ₁ *a1, W ₂ *a5, W ₃ *a4 ; add the four results:

W ₀ *a0+W ₁ *a1+W ₂ *a5+W ₃ *a4,

The obtained sum is the result value of PE0, which belongs to the processing result matrix of the data processing task corresponding to the first multiplier-adder group. In the processing result matrix of the data processing task corresponding to the first multiplier-adder group, the resulting numerical arrangement is:

Here, if the image data to be convolved is a feature map that includes 16 channels, and the feature sub-maps corresponding to 4 channels are processed each time, the feature sub-maps corresponding to the 16 channels need to be divided into 4 groups, and one group of feature sub-maps is processed each time. If the four groups of feature sub-maps are group a, group b, group c, and group d, then after the 4 feature sub-maps included in group a are processed, the results are accumulated; after the 4 feature sub-maps included in group b are processed, the 4 results corresponding to group b are accumulated, and the accumulated result corresponding to group a and the accumulated result corresponding to group b are accumulated; after the 4 feature sub-maps included in group c are processed, the 4 results corresponding to group c are accumulated, and the accumulated results of group a and group b and the accumulated result corresponding to group c are accumulated; after the 4 feature sub-maps included in group d are processed, the 4 results corresponding to group d are accumulated, and the accumulated results of group a, group b, and group c and the accumulated result corresponding to group d are accumulated. Finally, the accumulated sum of the convolution results corresponding to the 16 channels is obtained.

After the 4 feature sub-maps included in group a are processed, the 4 output results corresponding to group a are a1, a2, a3, and a4. After the 4 feature sub-maps included in group b are processed, the 4 output results corresponding to group b are b1, b2, b3, and b4. At this time, a1+b1=O1, a2+b2=O2, a3+b3=O3, and a4+b4=O4 are executed. After the 4 feature sub-maps included in group c are processed, the 4 output results corresponding to group c are c1, c2, c3, and c4, and O1+c1, O2+c2, O3+c3, and O4+c4 are then executed; and so on, until a1+b1+c1+d1, a2+b2+c2+d2, a3+b3+c3+d3, and a4+b4+c4+d4 are obtained. These four results are then accumulated together to obtain the accumulated sum of the convolution results corresponding to the 16 channels.
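The accumulation order described above can be sketched as follows. For brevity each output result is modeled as a single number, whereas in practice each would be a convolution result matrix accumulated element by element; the function name and the numeric values are illustrative only.

```python
def accumulate_groups(group_outputs):
    """Accumulate per-slot results group by group, then sum the slots.

    group_outputs is a list such as [[a1, a2, a3, a4], [b1, ...], ...].
    Returns (running, total): running holds a1+b1+c1+d1, ..., a4+b4+c4+d4,
    and total is the accumulated sum over all channels.
    """
    running = list(group_outputs[0])
    for outputs in group_outputs[1:]:
        # O_k = O_k + next group's k-th result
        running = [partial + value for partial, value in zip(running, outputs)]
    return running, sum(running)


running, total = accumulate_groups([
    [1, 2, 3, 4],              # group a: a1..a4
    [10, 20, 30, 40],          # group b: b1..b4
    [100, 200, 300, 400],      # group c: c1..c4
    [1000, 2000, 3000, 4000],  # group d: d1..d4
])
print(running, total)  # [1111, 2222, 3333, 4444] 11110
```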

Those skilled in the art can understand that, in the above method of the specific implementation, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.

Based on the same inventive concept, the embodiment of the present disclosure also provides a data processing apparatus corresponding to the data processing method. For its implementation, reference may be made to the implementation of the method; repeated descriptions are omitted.

Referring to FIG. 8, which is a schematic diagram of a data processing apparatus according to an embodiment of the present disclosure, the apparatus includes a controller 801. The controller 801 is configured to: group a plurality of multiplier-adders in the multiplier-adder array based on an operation step size to obtain a plurality of multiplier-adder groups; and use each multiplier-adder group in the plurality of multiplier-adder groups to execute, in parallel, the data processing task corresponding to that multiplier-adder group.

In a possible implementation manner, when grouping a plurality of multiplier-adders in the multiplier-adder array based on an operation step size, the controller 801 is specifically configured to: determine the number of multiplier-adder groups based on the operation step size; and group the multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.

In a possible implementation manner, when grouping a plurality of multiplier-adders in the multiplier-adder array based on the number of the multiplier-adder groups, the controller 801 is specifically configured to: determine, from the multiplier-adder array, the first target multiplier-adder in each multiplier-adder group; and determine, from the multiplier-adder array, the other target multiplier-adders except the first target multiplier-adder in each multiplier-adder group based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array.

In a possible implementation manner, when determining, from the multiplier-adder array, the other target multiplier-adders except the first target multiplier-adder in each multiplier-adder group based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array, the controller 801 is specifically configured to: for each multiplier-adder other than the first target multiplier-adder in each multiplier-adder group, determine, based on the operation step size, the first positional relationship in the multiplier-adder array between the multiplier-adder and the previous multiplier-adder that is in the same row and adjacent to it; determine, based on the operation step size and the number of columns of the multiplier-adder array, the second positional relationship in the multiplier-adder array between the multiplier-adder and the previous multiplier-adder that is in the same column and adjacent to it; and determine the target position of the multiplier-adder in the multiplier-adder array based on the position of the first target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the first positional relationship, and the second positional relationship.

In a possible implementation manner, when determining the first target multiplier-adder in each multiplier-adder group from the multiplier-adder array, the controller 801 is specifically configured to: determine the target matrix based on the operation step size and the multiplier-adder array; and determine, according to the matrix element values of the target matrix, the position of the first target multiplier-adder in each multiplier-adder group in the multiplier-adder array.

In a possible implementation manner, when using each multiplier-adder group in the plurality of multiplier-adder groups to perform, in parallel, the data processing task corresponding to that multiplier-adder group, the controller 801 is specifically configured to: store the image data to be processed corresponding to each multiplier-adder group in the register array corresponding to that multiplier-adder group according to the position of each target multiplier-adder in each multiplier-adder group in the multiplier-adder array; for each data processing cycle in the multiple data processing cycles, read, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that multiplier-adder group in the data processing cycle, and process the read image data to be processed in parallel to obtain the data processing result of each multiplier-adder group in the data processing cycle; and complete the data processing task corresponding to each multiplier-adder group according to the data processing results corresponding to that multiplier-adder group in the data processing cycles.

In a possible implementation manner, when storing the image data to be processed corresponding to each multiplier-adder group in the register array corresponding to that multiplier-adder group according to the position of each target multiplier-adder in each multiplier-adder group in the multiplier-adder array, the controller 801 is specifically configured to: for each multiplier-adder group, determine the position, in the corresponding register array, of the register that each target multiplier-adder of the multiplier-adder group reads fixedly; and for each multiplier-adder group, store the image data to be processed corresponding to the multiplier-adder group in the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the position of the register read by each target multiplier-adder of the multiplier-adder group, and the processing order of the operands contained in the image data to be processed during data processing, so that in each data processing cycle, the operand stored in the register fixedly read by each target multiplier-adder corresponds to the matrix element of the matrix operand corresponding to that processing cycle.

In a possible implementation, when, for each of the plurality of data processing cycles, reading from the register array corresponding to each multiplier-adder group the to-be-processed image data corresponding to that group in the data processing cycle, and processing the read to-be-processed image data in parallel to obtain the data processing result of each multiplier-adder group in the data processing cycle, the controller 801 is specifically configured to: for the first data processing cycle of processing the to-be-processed image data, control each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand corresponding to that target multiplier-adder in the first data processing cycle as a first operand; determine the matrix element, in the matrix operand, corresponding to the first data processing cycle of each multiplier-adder group as a second operand; and determine the product of the first operand and the second operand of each target multiplier-adder in the first data processing cycle; and, for each non-first data processing cycle of processing the to-be-processed image data, control the to-be-processed image data to move by a preset step size in the register array according to the preset data movement mode corresponding to the data processing cycle; control each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand of that target multiplier-adder in the non-first data processing cycle as the first operand; determine the matrix element, in the matrix operand, corresponding to the data processing cycle of each multiplier-adder group as the second operand; and determine the product of the first operand and the second operand of each target multiplier-adder in the data processing cycle.

In a possible implementation, when completing the data processing task corresponding to each multiplier-adder group according to the data processing results corresponding to that multiplier-adder group in the respective data processing cycles, the controller 801 is specifically configured to: for each target multiplier-adder in each multiplier-adder group, add the products obtained by the target multiplier-adder in the respective data processing cycles to obtain a sum value; and complete the data processing task corresponding to each multiplier-adder group based on the sum values corresponding to the target multiplier-adders contained in that multiplier-adder group.
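As an illustration only, the per-cycle multiplication and the final accumulation can be modelled in software. The sketch below is a simplified one-dimensional Python model, not the disclosed hardware: the function name `run_cycles`, the left shift by one register per cycle, and the zero fill at the end of the register row are all assumptions chosen for the example. Each target multiplier-adder always reads the same register position, the data moves through the registers between cycles, and the products of all cycles are summed.

```python
def run_cycles(registers, kernel, mac_positions):
    """Model target multiplier-adders that each read one fixed register.

    registers     -- initial contents of one register row (image operands)
    kernel        -- matrix elements, one second operand per processing cycle
    mac_positions -- register index fixedly read by each target multiplier-adder
    All three names are hypothetical and used only for this sketch.
    """
    regs = list(registers)
    sums = {p: 0 for p in mac_positions}
    for cycle, weight in enumerate(kernel):
        if cycle > 0:
            # Non-first cycle: move the data by an assumed step of one register.
            regs = regs[1:] + [0]
        for p in mac_positions:
            first_operand = regs[p]      # read from the fixedly read register
            second_operand = weight      # matrix element for this cycle
            sums[p] += first_operand * second_operand
    return sums


result = run_cycles([1, 2, 3, 4, 5, 6], kernel=[1, 0, -1], mac_positions=[0, 2])
print(result)  # {0: -2, 2: -2}
```

In this model the multiplier-adder at position 0 accumulates 1·1 + 2·0 + 3·(−1) = −2, and the one at position 2 accumulates 3·1 + 4·0 + 5·(−1) = −2, which is the sum value that the final step of the task would consume.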

For the description of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the foregoing method embodiments, which will not be described in detail here.

The image processing apparatus provided by the embodiments of the present disclosure may include a chip, an AI chip, and the like.

An embodiment of the present disclosure further provides a computer device. FIG. 9 is a schematic structural diagram of the computer device provided by the embodiment of the present disclosure, which includes a controller 910 and a memory 920. The memory 920 stores machine-readable instructions executable by the controller 910, and the controller 910 is configured to execute the machine-readable instructions stored in the memory 920. When the machine-readable instructions are executed by the controller 910, the controller 910 performs the following steps: grouping the multiplier-adders in the multiplier-adder array based on the operation step size to obtain a plurality of multiplier-adder groups; and performing in parallel, by using each of the plurality of multiplier-adder groups, the data processing task corresponding to that multiplier-adder group.
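The grouping step can be illustrated with a minimal Python sketch. It rests on assumptions that this passage does not fix: that an operation step size s yields s × s groups, and that the multiplier-adder at row r, column c belongs to group (r mod s) · s + (c mod s). The function name `group_multiplier_adders` is hypothetical. Under these assumptions, multiplier-adders of the same group are spaced s positions apart in every row and every column, so the interval between adjacent members of a group is equal and non-zero.

```python
def group_multiplier_adders(rows, cols, step):
    """Assign each (row, col) position of a multiplier-adder array to a group.

    Assumed rule (illustrative only): group id = (row % step) * step + (col % step),
    which gives step * step groups.
    """
    if step < 2:
        raise ValueError("this sketch assumes an operation step size of at least 2")
    groups = {g: [] for g in range(step * step)}
    for r in range(rows):
        for c in range(cols):
            groups[(r % step) * step + (c % step)].append((r, c))
    return groups


groups = group_multiplier_adders(rows=4, cols=4, step=2)
print(len(groups))   # 4
print(groups[0])     # [(0, 0), (0, 2), (2, 0), (2, 2)]
```

Each resulting group could then be given its own data processing task, so that positions which would produce invalid results for one task produce valid results for another.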

The above-mentioned memory 920 includes an internal memory 921 and an external memory 922. The internal memory 921 is used to temporarily store operation data of the controller 910 and data exchanged with the external memory 922, such as a hard disk. The controller 910 exchanges data with the external memory 922 through the internal memory 921.

The computer device provided by the embodiment of the present disclosure may include a smart terminal such as a mobile phone, or may also be other devices, servers, etc. that have a camera and can perform image processing, which is not limited here.

For the specific execution process of the above instruction, reference may be made to the steps of the data processing method described in the embodiments of the present disclosure, and details are not repeated here.

Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the data processing method described in the foregoing method embodiments are executed. Wherein, the storage medium may be a volatile or non-volatile computer-readable storage medium.

Embodiments of the present disclosure further provide a computer program product, where the computer program product carries program code, and the instructions included in the program code can be used to execute the steps of the data processing method described in the foregoing method embodiments. For details, reference may be made to the foregoing method embodiments, which are not repeated here.

The above-mentioned computer program product may be implemented by means of hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).

Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working process of the system and apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments, which will not be repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. The apparatus embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection of devices or units through some communication interfaces, and may be in electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.

If implemented in the form of software functional units and sold or used as a stand-alone product, the functions may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solutions of the present disclosure in essence, or the part thereof that contributes to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Finally, it should be noted that the above-mentioned embodiments are only specific implementations of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features; such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. A data processing method, comprising:

grouping the multiplier-adders in a multiplier-adder array based on an operation step size to obtain a plurality of multiplier-adder groups; and

performing in parallel, by using each of the plurality of multiplier-adder groups, the data processing task corresponding to that multiplier-adder group.
2. The data processing method according to claim 1, wherein

in the same row of the multiplier-adder array, the intervals between every two adjacent multiplier-adders of the same group are the same and non-zero, and

in the same column of the multiplier-adder array, the intervals between every two adjacent multiplier-adders of the same group are the same and non-zero.
3. The data processing method according to claim 1 or 2, wherein the grouping of the multiplier-adders in the multiplier-adder array based on the operation step size comprises:

determining the number of multiplier-adder groups based on the operation step size; and

grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.
4. The data processing method according to claim 3, wherein the grouping of the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups comprises:

determining a first target multiplier-adder in each multiplier-adder group from the multiplier-adder array; and

determining, based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array, the positions in the multiplier-adder array of the target multiplier-adders other than the first target multiplier-adder in each multiplier-adder group.
5. The data processing method according to claim 4, wherein the determining, based on the position of the first target multiplier-adder in the multiplier-adder array, the operation step size, and the size of the multiplier-adder array, of the positions in the multiplier-adder array of the target multiplier-adders other than the first target multiplier-adder in each multiplier-adder group comprises:

for each multiplier-adder in each multiplier-adder group other than the first target multiplier-adder,

determining, based on the operation step size, a first positional relationship between the multiplier-adder and the previous multiplier-adder that is in the same row of the multiplier-adder array and adjacent to the multiplier-adder; and

determining, based on the operation step size and the number of columns of the multiplier-adder array, a second positional relationship between the multiplier-adder and the previous multiplier-adder that is in the same column of the multiplier-adder array and adjacent to the multiplier-adder; and

determining the target position of the multiplier-adder in the multiplier-adder array based on the position in the multiplier-adder array of the first target multiplier-adder of the multiplier-adder group, the first positional relationship, and the second positional relationship.
6. The data processing method according to claim 4 or 5, wherein the determining of the first target multiplier-adder in each multiplier-adder group from the multiplier-adder array comprises:

determining a target matrix based on the operation step size and the multiplier-adder array; and

determining, according to the matrix element values of the target matrix, the position in the multiplier-adder array of the first target multiplier-adder in each multiplier-adder group.
7. The data processing method according to any one of claims 1 to 6, wherein the performing in parallel, by using each of the plurality of multiplier-adder groups, of the data processing task corresponding to that multiplier-adder group comprises:

storing, according to the position in the multiplier-adder array of each target multiplier-adder in each multiplier-adder group, the to-be-processed image data corresponding to each multiplier-adder group in the register array corresponding to that multiplier-adder group;

for each data processing cycle of a plurality of data processing cycles,

reading, from the register array corresponding to each multiplier-adder group, the to-be-processed image data corresponding to that multiplier-adder group in the data processing cycle; and

processing the read to-be-processed image data in parallel to obtain the data processing result of each multiplier-adder group in the data processing cycle; and

completing the data processing task corresponding to each multiplier-adder group according to the data processing results corresponding to that multiplier-adder group in the respective data processing cycles.
8. The data processing method according to claim 7, wherein the storing, according to the position in the multiplier-adder array of each target multiplier-adder in each multiplier-adder group, of the to-be-processed image data corresponding to the multiplier-adder group in the register array corresponding to the multiplier-adder group comprises:

for each multiplier-adder group, determining the position, in the corresponding register array, of the register that each target multiplier-adder of the multiplier-adder group fixedly reads; and

for each multiplier-adder group, storing the to-be-processed image data corresponding to the multiplier-adder group in the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder of the group in the multiplier-adder array, the position of the register fixedly read by each target multiplier-adder of the group, and the order in which the operands contained in the to-be-processed image data are processed during data processing, so that, in each data processing cycle, the operand stored in the register fixedly read by each target multiplier-adder corresponds to the matrix element of the matrix operand for that data processing cycle.
9. The data processing method according to claim 7 or 8, wherein, for each data processing cycle of the plurality of data processing cycles, the reading, from the register array corresponding to each multiplier-adder group, of the to-be-processed image data corresponding to that multiplier-adder group in the data processing cycle, and the processing of the read to-be-processed image data in parallel to obtain the data processing result of each multiplier-adder group in the data processing cycle, comprise:

for the first data processing cycle of processing the to-be-processed image data,

controlling each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand corresponding to that target multiplier-adder in the first data processing cycle as a first operand;

determining the matrix element, in the matrix operand, corresponding to the first data processing cycle of each multiplier-adder group as a second operand; and

determining the product of the first operand and the second operand of each target multiplier-adder in the first data processing cycle; and

for each non-first data processing cycle of processing the to-be-processed image data,

controlling the to-be-processed image data to move by a preset step size in the register array according to a preset data movement mode corresponding to the data processing cycle;

controlling each target multiplier-adder in each multiplier-adder group to read, from its fixedly read register, the operand of that target multiplier-adder in the non-first data processing cycle as the first operand;

determining the matrix element, in the matrix operand, corresponding to the data processing cycle of each multiplier-adder group as the second operand; and

determining the product of the first operand and the second operand of each target multiplier-adder in the data processing cycle.
10. The data processing method according to any one of claims 7 to 9, wherein the completing of the data processing task corresponding to each multiplier-adder group according to the data processing results corresponding to that multiplier-adder group in the respective data processing cycles comprises:

for each target multiplier-adder in each multiplier-adder group, adding the products obtained by the target multiplier-adder in the respective data processing cycles to obtain a sum value; and

completing the data processing task corresponding to each multiplier-adder group based on the sum values corresponding to the target multiplier-adders contained in that multiplier-adder group.
11. The data processing method according to any one of claims 1 to 10, wherein:

the data processing tasks include convolution processing tasks; and

the images to be processed corresponding to the convolution processing tasks of different multiplier-adder groups are different.
12. A data processing apparatus, comprising a controller, wherein the controller is configured to:

group the multiplier-adders in a multiplier-adder array based on an operation step size to obtain a plurality of multiplier-adder groups; and

perform in parallel, by using each of the plurality of multiplier-adder groups, the data processing task corresponding to that multiplier-adder group.
13. The data processing apparatus according to claim 12, wherein, when grouping the multiplier-adders in the multiplier-adder array based on the operation step size, the controller is specifically configured to: determine the number of multiplier-adder groups based on the operation step size; and group the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.
14. A computer device, comprising a controller and a memory, wherein the memory stores machine-readable instructions executable by the controller, the controller is configured to execute the machine-readable instructions stored in the memory, and, when the machine-readable instructions are executed by the controller, the controller performs the steps of the data processing method according to any one of claims 1 to 11.
15. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and, when the computer program is run by a computer device, the computer device executes the steps of the data processing method according to any one of claims 1 to 11.