CN112712173A - Method and system for acquiring sparse operation data based on a MAC (multiply-accumulate) multiply-add array - Google Patents

Method and system for acquiring sparse operation data based on a MAC (multiply-accumulate) multiply-add array

Info

Publication number
CN112712173A
Authority
CN
China
Prior art keywords: unit, array, column, matrix, columns
Prior art date
Legal status
Granted
Application number
CN202011640074.3A
Other languages
Chinese (zh)
Other versions
CN112712173B (en)
Inventor
吴小鹏
唐士斌
欧阳鹏
Current Assignee
Beijing Qingwei Intelligent Technology Co ltd
Original Assignee
Beijing Qingwei Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Qingwei Intelligent Technology Co ltd
Priority to CN202011640074.3A
Publication of CN112712173A
Application granted
Publication of CN112712173B
Legal status: Active (Current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods


Abstract

The invention provides a method for acquiring sparse operation data based on a MAC multiply-add array, which comprises the following steps: taking O rows and I columns as one dividing unit in the row and column directions of the sparse weight matrix to be calculated; reading one or more cell blocks along the column direction of the sparse weight matrix to be calculated; generating a plurality of working modes; and integrating the cells into a calculation array with M rows and N columns. When matrix multiplication is realized through the MAC multiply-add array, the converted calculation array is used as the multiplier term, and the effective weight value cells in the row direction of the calculation array can be calculated as characteristic weight values. By dividing the sparse weight matrix to be calculated, the divided units fit the operation structure of the MAC multiply-add array, so that a plurality of non-zero weights are processed in parallel with a small amount of resources, replacing the traditional sparsification flow. The invention also provides a system for acquiring sparse operation data based on the MAC multiply-add array.

Description

Method and system for acquiring sparse operation data based on a MAC (multiply-accumulate) multiply-add array
Technical Field
The invention relates to the field of reconfigurable processors, in particular to neural network hardware acceleration, convolution, fully connected layers, regular sparsification and matrix multiplication. The invention particularly relates to a method and a system for acquiring sparse operation data based on a MAC multiply-add array.
Background
A neural network accelerator accelerates the operation of a neural network algorithm through hardware such as a chip and plays an important role in neural network computation. Traditional neural network operations are generally performed in software on a GPU or CPU, with low speed and low energy efficiency; meanwhile, a customized ASIC or FPGA serves only a single application scenario and can perform only convolution or fully connected operations. Neural network workloads are operation-intensive, and data bandwidth and on-chip weight storage can become the bottleneck of existing neural network accelerators. Sparsification compresses the weight data, relieving the data-bandwidth and on-chip weight-storage problem and improving operation efficiency (or reducing the actual amount of computation).
At present, general (unstructured) sparsification is very unfriendly to hardware, so the hardware structure is optimized around regular sparsification to support sparsity better. Matrix multiplication is a common operation in many algorithms, but its computation pattern differs from that of neural network (NN) operators, so it is usually served only by a customized ASIC; in a special mode, however, the MAC array can support both neural network computation and matrix multiplication.
Disclosure of Invention
The invention aims to provide a method for acquiring sparse operation data based on a MAC multiply-add array, which divides the sparse weight matrix to be calculated so that the divided units fit the operation structure of the MAC multiply-add array, thereby processing a plurality of non-zero weights in parallel with a small amount of resources, in contrast to the traditional sparsification flow.
The invention also aims to provide a system for acquiring sparse operation data based on the MAC multiply-add array, which can enable the divided units to meet the operation structure of the MAC multiply-add array by dividing the sparse weight matrix to be calculated, thereby accelerating the operation speed of the system, reducing the implementation cost and reducing the complexity of hardware implementation.
In a first aspect of the present invention, a method for acquiring sparse operation data based on a MAC multiply-add array is provided, where the MAC multiply-add array is an O-row, I-column matrix. The MAC multiply-add array comprises I × O computing units; I input channels are arranged along the column direction of the MAC multiply-add array, each input channel corresponding to one computing unit, and O output channels are arranged along the row direction of the MAC multiply-add array, each output channel corresponding to one computing unit.
The method for acquiring the sparse operation data based on the MAC multiply-add array comprises the following steps:
step S101, taking the row and column directions of the sparse weight matrix to be calculated as a dividing unit by taking O rows and I columns, and dividing the sparse weight matrix into a plurality of cell blocks along the column direction according to the division of the sparse weight matrix. Each cell block includes a plurality of cells having a significant weight value.
Step S102, reading one or more cell blocks along the column direction of the sparse weight matrix to be calculated. If the number of cells with effective weight values in the one or more cell blocks is equal to I × O/2 and the number of cells with effective weight values in each column of the one or more cell blocks is not more than I, a plurality of working modes corresponding to the one or more cell blocks are generated.
Step S103, reading one or more cell blocks along the column direction of the sparse weight matrix according to the plurality of working modes. The cells with effective weight values in the one or more cell blocks are integrated into a calculation array with M rows and N columns. The effective weight value cells in the calculation array are arranged sequentially along the column direction. The M rows correspond to the O rows. The N columns correspond to the I columns.
And step S104, when matrix multiplication is realized through the MAC multiply-add array, the converted calculation array is used as the multiplier term, and the effective weight value cells in the row direction of the calculation array can be calculated as characteristic weight values.
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, step S102 includes:
1 unit block is read as a first division unit in the column direction of the thinning-out weight matrix to be calculated. If the number of cells of the first partition unit with significant weight values is equal to I × O/2 and the number of cells of the first partition unit with significant weight values is below I in each column of the first partition unit, a first operation mode is generated. Or
And reading 2 unit blocks in the column direction of the sparse weight matrix to be calculated as a second division unit. And if the number of the units of the effective weight values in the second dividing unit is equal to I multiplied by O/2 and the number of the units of the effective weight values in each column of the second dividing unit is below I, generating a second working mode. Or
And reading 4 unit blocks in the column direction of the sparse weight matrix to be calculated as a third division unit. And if the number of the effective weight values in the third dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the third dividing unit is below I, generating a third working mode. Or
And reading 8 unit blocks in the column direction of the sparse weight matrix to be calculated as a fourth division unit. And if the number of the cells of the effective weight value in the fourth dividing cell is equal to I multiplied by O/2 and the number of the cells of the effective weight value in each column of the fourth dividing cell is below I, generating a fourth working mode.
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the step of integrating the units of effective weight values in one or more unit blocks into a calculation array with M rows and N columns in step S103 includes:
One or more cell blocks are read. The cells in each column are read sequentially in row order; whenever the currently read cell carries an effective weight value, it is placed immediately after the previously read effective weight value cell, so that the effective weight value cells are arranged consecutively along the column direction in row order.
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, step S103 includes:
and reading the first division unit along the column direction of the sparse weight matrix according to a first working mode. And integrating the effective weight value units of the first division unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the second division unit along the column direction of the thinning weight matrix according to a second working mode. And integrating the effective weight value units of the second dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the third division unit along the column direction of the sparse weight matrix according to a third working mode. And integrating the effective weight value units of the third dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the fourth division unit along the column direction of the sparse weight matrix according to the fourth working mode. And integrating the effective weight value units of the fourth dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction.
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, after step S104, the method further includes:
and step S105, taking the calculation array as a characteristic input value to realize convolution or full-connection layer calculation in the neural network model of deep learning.
The MAC multiply-add array is an 8 row 8 column matrix, a 16 row 16 column matrix, a 32 row 32 column matrix, a 64 row 64 column matrix, a 16 row 32 column matrix, or a 32 row 64 column matrix. The MAC multiply-add array includes 8 × 8, 16 × 16, 32 × 32, 64 × 64, or 16 × 32, 32 × 64 computing units. The M rows and N columns of computational arrays are 8 rows and 8 columns, 16 rows and 16 columns, 32 rows and 32 columns, 64 rows and 64 columns, 16 rows and 32 columns, or 32 rows and 64 columns of computational arrays.
In a second aspect, the present invention provides a system for acquiring sparse operation data based on a MAC multiply-add array, which is an O-row, I-column matrix. The MAC multiply-add array comprises I × O computing units; I input channels are arranged along the column direction of the MAC multiply-add array, each input channel corresponding to one computing unit, and O output channels are arranged along the row direction of the MAC multiply-add array, each output channel corresponding to one computing unit.
The system for acquiring the sparse operation data based on the MAC multiply-add array comprises: a dividing unit, a generating work mode unit, an integrating unit and a calculating unit, wherein:
and the dividing unit is configured to divide the row and column directions of the sparse weight matrix to be calculated into a plurality of cell blocks along the column direction of the sparse weight matrix by taking O rows and I columns as one dividing unit. Each cell block includes a plurality of cells having a significant weight value.
A generating working mode unit, configured to read one or more cell blocks along the column direction of the sparse weight matrix to be calculated. If the number of cells with effective weight values in the one or more cell blocks is equal to I × O/2 and the number of cells with effective weight values in each column of the one or more cell blocks is not more than I, a plurality of working modes corresponding to the one or more cell blocks are generated.
An integration unit, configured to read one or more cell blocks along the column direction of the sparse weight matrix according to the plurality of working modes. The cells with effective weight values in the one or more cell blocks are integrated into a calculation array with M rows and N columns. The effective weight value cells in the calculation array are arranged sequentially along the column direction. The M rows correspond to the O rows. The N columns correspond to the I columns.
And a calculating unit, configured to use the converted calculation array as the multiplier term when matrix multiplication is realized through the MAC multiply-add array; the effective weight value cells in the row direction of the calculation array can be calculated as characteristic weight values.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the generating working mode unit is further configured to:
1 unit block is read as a first division unit in the column direction of the thinning-out weight matrix to be calculated. If the number of cells of the first partition unit with significant weight values is equal to I × O/2 and the number of cells of the first partition unit with significant weight values is below I in each column of the first partition unit, a first operation mode is generated. Or
And reading 2 unit blocks in the column direction of the sparse weight matrix to be calculated as a second division unit. And if the number of the units of the effective weight values in the second dividing unit is equal to I multiplied by O/2 and the number of the units of the effective weight values in each column of the second dividing unit is below I, generating a second working mode. Or
And reading 4 unit blocks in the column direction of the sparse weight matrix to be calculated as a third division unit. And if the number of the effective weight values in the third dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the third dividing unit is below I, generating a third working mode. Or
And reading 8 unit blocks in the column direction of the sparse weight matrix to be calculated as a fourth division unit. And if the number of the cells of the effective weight value in the fourth dividing cell is equal to I multiplied by O/2 and the number of the cells of the effective weight value in each column of the fourth dividing cell is below I, generating a fourth working mode.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the integration unit, when integrating the cells with effective weight values in one or more cell blocks into a calculation array with M rows and N columns, is configured to:
One or more cell blocks are read. The cells in each column are read sequentially in row order; whenever the currently read cell carries an effective weight value, it is placed immediately after the previously read effective weight value cell, so that the effective weight value cells are arranged consecutively along the column direction in row order.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the integrating unit is configured to:
and reading the first division unit along the column direction of the sparse weight matrix according to a first working mode. And integrating the effective weight value units of the first division unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the second division unit along the column direction of the thinning weight matrix according to a second working mode. And integrating the effective weight value units of the second dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the third division unit along the column direction of the sparse weight matrix according to a third working mode. And integrating the effective weight value units of the third dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the fourth division unit along the column direction of the sparse weight matrix according to the fourth working mode. And integrating the effective weight value units of the fourth dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the calculating unit further includes:
and a convolution calculation unit configured to implement convolution or full-connected layer calculation in the deep-learning neural network model with the calculation array as the characteristic input value.
The MAC multiply-add array is an 8 row 8 column matrix, a 16 row 16 column matrix, a 32 row 32 column matrix, a 64 row 64 column matrix, a 16 row 32 column matrix, or a 32 row 64 column matrix. The MAC multiply-add array includes 8 × 8, 16 × 16, 32 × 32, 64 × 64, or 16 × 32, 32 × 64 computing units. The M rows and N columns of computational arrays are 8 rows and 8 columns, 16 rows and 16 columns, 32 rows and 32 columns, 64 rows and 64 columns, 16 rows and 32 columns, or 32 rows and 64 columns of computational arrays.
The following will further describe characteristics, technical features, advantages and implementation manners of the method and system for acquiring sparse operation data based on the MAC multiply-add array in a clearly understandable manner with reference to the accompanying drawings.
Drawings
Fig. 1 is a flowchart for explaining a method of acquiring sparse operation data based on a MAC multiply-add array according to an embodiment of the present invention.
Fig. 2 is a schematic diagram for explaining the composition of a MAC-based multiply-add array according to an embodiment of the present invention.
Fig. 3 is a diagram for explaining the effective weight values in the cells when the first division unit corresponds to the first operation mode according to an embodiment of the present invention.
Fig. 4 is a diagram for explaining the effective weight values in the unit when the second division unit corresponds to the second operation mode in an embodiment of the present invention.
Fig. 5 is a diagram for explaining the effective weight values in the unit when the third division unit corresponds to the third operation mode in an embodiment of the present invention.
Fig. 6 is a schematic diagram for explaining integration in the method for acquiring the sparsification operation data based on the MAC multiply-add array according to an embodiment of the present invention.
Fig. 7 is a schematic diagram for explaining a combination of a system for acquiring thinning-out operation data based on a MAC multiply-add array according to an embodiment of the present invention.
FIG. 8 is a diagram for illustrating the convolution module supporting the matrix multiplication process in one embodiment of the present invention.
Detailed Description
In order to more clearly understand the technical features, objects and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings, in which the same reference numerals indicate the same or structurally similar but functionally identical elements.
"exemplary" means "serving as an example, instance, or illustration" herein, and any illustration, embodiment, or steps described as "exemplary" herein should not be construed as a preferred or advantageous alternative. For the sake of simplicity, the drawings only schematically show the parts relevant to the present exemplary embodiment, and they do not represent the actual structure and the true scale of the product.
A first aspect of the invention provides a method for acquiring sparse operation data based on a MAC multiply-add array, where the MAC multiply-add array is an O-row, I-column matrix; as shown in figure 2, the MAC multiply-add array is an 8-row, 8-column matrix. The MAC multiply-add array comprises I × O computing units; I input channels are arranged along the column direction of the MAC multiply-add array, each input channel corresponding to one computing unit, and O output channels are arranged along the row direction of the MAC multiply-add array, each output channel corresponding to one computing unit. The MAC multiply-add array is an operation array in hardware.
As shown in fig. 1, the method for acquiring the sparse operation data based on the MAC multiply-add array includes:
step S101, a plurality of cell blocks are obtained according to the sparse weight matrix to be calculated.
In this step, O rows and I columns are taken as one dividing unit in the row and column directions of the sparse weight matrix to be calculated, and the sparse weight matrix is divided into a plurality of cell blocks along the column direction according to the dividing unit. Each cell block includes a plurality of cells with effective weight values.
For example: the dividing unit is 8 rows and 8 columns, and the thinning weight matrix is divided into a plurality of unit blocks along the column direction according to the 8 rows and 8 columns dividing unit.
Step S102, generating a plurality of working modes.
In this step, one or more cell blocks are read along the column direction of the sparse weight matrix to be calculated. If the number of cells with effective weight values in the one or more cell blocks is equal to I × O/2 and the number of cells with effective weight values in each column of the one or more cell blocks is not more than I, a plurality of working modes corresponding to the one or more cell blocks are generated.
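A hedged sketch of the qualification test just described (the function name and the NumPy formulation are assumptions, not the patent's wording): a group of cell blocks generates a working mode when it contains exactly I × O/2 effective (non-zero) weights and no column contains more than I of them.

```python
import numpy as np

def qualifies_for_mode(region, o=8, i=8):
    """region: one or more O x I cell blocks stacked along the column direction."""
    nonzero_total = np.count_nonzero(region)
    per_column = np.count_nonzero(region, axis=0)   # effective weights in each column
    return nonzero_total == i * o // 2 and per_column.max() <= i
```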
Step S103, integrating the calculation array.
In this step, one or more cell blocks are read along the column direction of the sparse weight matrix according to the plurality of working modes. The cells with effective weight values in the one or more cell blocks are integrated into a calculation array with M rows and N columns. The effective weight value cells in the calculation array are arranged sequentially along the column direction. The M rows correspond to the O rows. The N columns correspond to the I columns. For example: the M-row, N-column array is an 8-row, 8-column calculation array.
And step S104, realizing matrix multiplication calculation through the MAC multiplication and addition array.
In this step, when matrix multiplication is realized by the MAC multiply-add array, the converted calculation array is used as the multiplier term, and the effective weight value cells in the row direction of the calculation array can be calculated as characteristic weight values.
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, step S102 includes:
1 unit block is read as a first division unit in the column direction of the thinning-out weight matrix to be calculated. If the number of cells of the first partition unit with significant weight values is equal to I × O/2 and the number of cells of the first partition unit with significant weight values is below I in each column of the first partition unit, a first operation mode is generated.
For example: as shown in fig. 3, the number of cells with effective weight values in the first division unit equals 32, i.e. 8 × 8/2. In column 0 the effective weights are the cells indicated by boxes 0, 1, 2, 3 and 4. In column 1 the effective weights are the cells indicated by boxes 0, 1, 2, 3, 4, 5 and 6. In column 2 the effective weights are the cells indicated by boxes 0, 1 and 2.
In column 3 the effective weights are the cells indicated by boxes 0, 1, 2 and 3. In column 4 the effective weights are the cells indicated by boxes 0, 1 and 2. In column 5 the effective weights are the cells indicated by boxes 0, 1 and 2. In column 6 the effective weights are the cells indicated by boxes 0, 1, 2, 3, 4 and 5. In column 7 the effective weight is the cell indicated by box 0.
An effective weight is a cell whose weight value is non-zero. The largest per-column count of effective weight value cells is 7, in column 1, so no column of the first division unit exceeds 8, and the first working mode is generated.
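The per-column counts quoted above for fig. 3 can be checked directly; the short snippet below only reproduces that arithmetic (5 + 7 + 3 + 4 + 3 + 3 + 6 + 1 effective weights, none of the columns exceeding 8):

```python
per_column = [5, 7, 3, 4, 3, 3, 6, 1]     # effective weights per column in fig. 3
assert sum(per_column) == 8 * 8 // 2      # equals I x O / 2 = 32
assert max(per_column) <= 8               # no column exceeds I = 8, so the first mode applies
```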
In the other case, 2 unit blocks are read in the column direction of the thinning-out weight matrix to be calculated as the second division unit. And if the number of the units of the effective weight values in the second dividing unit is equal to I multiplied by O/2 and the number of the units of the effective weight values in each column of the second dividing unit is below I, generating a second working mode.
For example: as shown in fig. 4, 2 unit blocks, namely unit block 11 and unit block 12, are read along the column direction of the sparse weight matrix to be calculated as the second division unit. The number of cells with effective weight values in unit block 11 and unit block 12 together equals 32, i.e. 8 × 8/2. The largest per-column count of effective weight values is 7, in column 1, so each column of the second division unit has no more than 8, and the second working mode is generated.
And reading 4 unit blocks in the column direction of the sparse weight matrix to be calculated as a third division unit. And if the number of the effective weight values in the third dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the third dividing unit is below I, generating a third working mode.
For example: as shown in fig. 5, 4 unit blocks, namely unit block 21, unit block 22, unit block 23 and unit block 24, are read along the column direction of the sparse weight matrix to be calculated as the third division unit. The number of cells with effective weight values in unit blocks 21, 22, 23 and 24 together equals 32, i.e. 8 × 8/2. The largest per-column count of effective weight values is 7, in column 1, so each column of the third division unit has no more than 8, and the third working mode is generated.
And reading 8 unit blocks in the column direction of the sparse weight matrix to be calculated as a fourth division unit. And if the number of the cells of the effective weight value in the fourth dividing cell is equal to I multiplied by O/2 and the number of the cells of the effective weight value in each column of the fourth dividing cell is below I, generating a fourth working mode. The implementation of the fourth working mode refers to the first, second and third working modes, and is not described in detail.
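Putting the four cases together, mode selection can be pictured as trying 1, 2, 4 and 8 stacked blocks in turn. This is a sketch under the assumption that the blocks of one column strip are simply concatenated along the column direction; `qualifies_for_mode` is the illustrative helper from above, not a name used by the patent:

```python
def select_working_mode(column_strip, o=8, i=8):
    """column_strip: the cell blocks of one strip stacked along the column direction."""
    for mode, n_blocks in enumerate((1, 2, 4, 8), start=1):   # first to fourth working mode
        height = n_blocks * o
        if column_strip.shape[0] >= height and qualifies_for_mode(column_strip[:height, :], o, i):
            return mode
    return None   # no regular-sparsity working mode applies to this strip
```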
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the step of integrating the units of effective weight values in one or more unit blocks into a calculation array with M rows and N columns in step S103 includes:
One or more cell blocks are read. The cells in each column are read sequentially in row order; whenever the currently read cell carries an effective weight value, it is placed immediately after the previously read effective weight value cell, so that the effective weight value cells are arranged consecutively along the column direction in row order.
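One way to read this integration rule (an assumed software analogue of the hardware behaviour, with an extra bookkeeping array that the text only implies): within each column, the effective weight cells are packed consecutively, in their original row order, into the M × N calculation array.

```python
import numpy as np

def compact_columns(region, m=8, n=8):
    """Pack the effective weight cells of each column into an M x N calculation array.

    region is assumed to have at most n columns and at most m effective weights per column.
    """
    packed = np.zeros((m, n), dtype=region.dtype)
    row_index = np.full((m, n), -1, dtype=int)     # original row of each packed cell (assumption)
    for col in range(region.shape[1]):
        rows = np.flatnonzero(region[:, col])      # rows holding effective weights, in row order
        packed[:len(rows), col] = region[rows, col]
        row_index[:len(rows), col] = rows          # kept so matching feature values can be fetched
    return packed, row_index
```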
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, step S103 includes:
and reading the first division unit along the column direction of the sparse weight matrix according to a first working mode. And integrating the effective weight value units of the first division unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the second division unit along the column direction of the thinning weight matrix according to a second working mode. And integrating the effective weight value units of the second dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. As shown in fig. 6, in the second operating mode, the second partitioning unit is integrated into the 8 rows and 8 columns of the computational array. Or
And reading the third division unit along the column direction of the sparse weight matrix according to a third working mode. And integrating the effective weight value units of the third dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the fourth division unit along the column direction of the sparse weight matrix according to the fourth working mode. And integrating the effective weight value units of the fourth dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction.
In another embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, after step S104, the method further includes:
and step S105, taking the calculation array as a characteristic input value to realize convolution or full-connection layer calculation in the neural network model of deep learning.
The MAC multiply-add array is an 8 row 8 column matrix, a 16 row 16 column matrix, a 32 row 32 column matrix, a 64 row 64 column matrix, a 16 row 32 column matrix, or a 32 row 64 column matrix. The MAC multiply-add array includes 8 × 8, 16 × 16, 32 × 32, 64 × 64, or 16 × 32, 32 × 64 computing units. The M rows and N columns of computational arrays are 8 rows and 8 columns, 16 rows and 16 columns, 32 rows and 32 columns, 64 rows and 64 columns, 16 rows and 32 columns, or 32 rows and 64 columns of computational arrays.
In a second aspect, as shown in fig. 7, the present invention provides a system for acquiring sparse operation data based on a MAC multiply-add array, which is an O-row, I-column matrix. The MAC multiply-add array comprises I × O computing units; I input channels are arranged along the column direction of the MAC multiply-add array, each input channel corresponding to one computing unit, and O output channels are arranged along the row direction of the MAC multiply-add array, each output channel corresponding to one computing unit.
The system for acquiring the sparse operation data based on the MAC multiply-add array comprises: a dividing unit 101, a generating operation mode unit 201, an integrating unit 301 and a calculating unit 401, wherein:
the dividing unit 101 is configured to divide the sparse weight matrix into a plurality of cell blocks along the column direction of the sparse weight matrix, wherein the row and column direction of the sparse weight matrix to be calculated is O row I column. Each cell block includes a plurality of cells having a significant weight value.
The generating working mode unit 201 is configured to read one or more cell blocks along the column direction of the sparse weight matrix to be calculated. If the number of cells with effective weight values in the one or more cell blocks is equal to I × O/2 and the number of cells with effective weight values in each column of the one or more cell blocks is not more than I, a plurality of working modes corresponding to the one or more cell blocks are generated.
The integration unit 301 is configured to read one or more cell blocks along the column direction of the sparse weight matrix according to the plurality of working modes. The cells with effective weight values in the one or more cell blocks are integrated into a calculation array with M rows and N columns. The effective weight value cells in the calculation array are arranged sequentially along the column direction. The M rows correspond to the O rows. The N columns correspond to the I columns.
The calculating unit 401 is configured to use the converted calculation array as the multiplier term when matrix multiplication is realized through the MAC multiply-add array, and to calculate the effective weight value cells in the row direction of the calculation array as characteristic weight values.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the generating working mode unit 201 is further configured to:
1 unit block is read as a first division unit in the column direction of the thinning-out weight matrix to be calculated. If the number of cells of the first partition unit with significant weight values is equal to I × O/2 and the number of cells of the first partition unit with significant weight values is below I in each column of the first partition unit, a first operation mode is generated. Or
And reading 2 unit blocks in the column direction of the sparse weight matrix to be calculated as a second division unit. And if the number of the units of the effective weight values in the second dividing unit is equal to I multiplied by O/2 and the number of the units of the effective weight values in each column of the second dividing unit is below I, generating a second working mode. Or
And reading 4 unit blocks in the column direction of the sparse weight matrix to be calculated as a third division unit. And if the number of the effective weight values in the third dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the third dividing unit is below I, generating a third working mode. Or
And reading 8 unit blocks in the column direction of the sparse weight matrix to be calculated as a fourth division unit. And if the number of the cells of the effective weight value in the fourth dividing cell is equal to I multiplied by O/2 and the number of the cells of the effective weight value in each column of the fourth dividing cell is below I, generating a fourth working mode.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the integration unit 301, when integrating the cells with effective weight values in one or more cell blocks into a calculation array with M rows and N columns, is configured to:
One or more cell blocks are read. The cells in each column are read sequentially in row order; whenever the currently read cell carries an effective weight value, it is placed immediately after the previously read effective weight value cell, so that the effective weight value cells are arranged consecutively along the column direction in row order.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the integrating unit 301 is configured to:
and reading the first division unit along the column direction of the sparse weight matrix according to a first working mode. And integrating the effective weight value units of the first division unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the second division unit along the column direction of the thinning weight matrix according to a second working mode. And integrating the effective weight value units of the second dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the third division unit along the column direction of the sparse weight matrix according to a third working mode. And integrating the effective weight value units of the third dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction. Or
And reading the fourth division unit along the column direction of the sparse weight matrix according to the fourth working mode. And integrating the effective weight value units of the fourth dividing unit into a calculation array with M rows and N columns. Effective weight value units in the calculation array are sequentially arranged along the column direction.
In another embodiment of the system for acquiring sparse operation data based on the MAC multiply-add array according to the present invention, the calculating unit 401 further includes:
and a convolution calculation unit configured to implement convolution or full-connected layer calculation in the deep-learning neural network model with the calculation array as the characteristic input value.
The MAC multiply-add array is an 8 row 8 column matrix, a 16 row 16 column matrix, a 32 row 32 column matrix, a 64 row 64 column matrix, a 16 row 32 column matrix, or a 32 row 64 column matrix. The MAC multiply-add array includes 8 × 8, 16 × 16, 32 × 32, 64 × 64, or 16 × 32, 32 × 64 computing units. The M rows and N columns of computational arrays are 8 rows and 8 columns, 16 rows and 16 columns, 32 rows and 32 columns, 64 rows and 64 columns, 16 rows and 32 columns, or 32 rows and 64 columns of computational arrays.
The following describes a preferred embodiment of the method for acquiring sparse operation data based on the MAC multiply-add array according to the present invention.
The invention relates to a hardware implementation of hardware-friendly regular sparsification. The data selection scheme also enables the neural network accelerator to support matrix multiplication.
Preferably, the method for acquiring the sparse operation data based on the MAC multiply-add array includes:
the first step, array arrangement mode: the array arrangement is 8 by 8 arrays, 8 channels are input, and 8 channels are output. As shown in fig. 2.
Second step, regular sparsification: the working modes are divided into 4 levels: 50% sparsity (G1) as shown in fig. 3, 25% sparsity (G2) as shown in fig. 4, 12.5% sparsity (G4) as shown in fig. 5, and so on; the levels may be chosen according to demand and resource allocation.
As shown in fig. 3, fig. 4 and fig. 5, the weights of each MAC array form one group; in G1 mode each block comprises 1 group, in G2 mode each block comprises 2 groups, in G4 mode each block comprises 4 groups, and so on. Each block occupies one line of space in the memory.
The limiting conditions are as follows: first, each column contains no more than 8 non-zero weights, a limit imposed by the output channel resources of the MAC array; second, the number of non-zero weights per line equals 32.
The processing flow is shown in fig. 6, taking G2 as an example, where the left side shows the arrangement of the original non-zero weights and the right side shows the arrangement of the sorted non-zero weights.
Third step, the MAC array supports a matrix multiplication mode: as shown in fig. 8, the convolution module supports matrix multiplication in two steps. First, step S10 transposes the right matrix, where C is the depth-direction data and W is the width-direction data; Co and Ci are the data after the transposition. Then, in step S20, a convolution with a 1 × 1 kernel and stride 1 is performed as a fully connected operation, yielding the result of the matrix multiplication.
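The equivalence exploited in fig. 8 can be checked numerically. The snippet below is only a sanity check of the idea, with arbitrary matrix sizes and names, not a description of the hardware path: a product A(M × K) · B(K × N) equals a 1 × 1, stride-1 convolution in which the K values of each row of A act as input channels (Ci) and each column of the transposed right matrix acts as one output-channel (Co) kernel.

```python
import numpy as np

M, K, N = 4, 8, 5
A = np.random.randn(M, K)              # left matrix: M positions, K depth (Ci) values each
B = np.random.randn(K, N)              # right matrix
kernels = B.T                          # after transposition: N output channels (Co), each a 1x1 kernel over Ci

conv_like = np.stack([A @ kernels[co] for co in range(N)], axis=1)  # per-Co dot products at every position
assert np.allclose(conv_like, A @ B)   # identical to the ordinary matrix product
```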
The invention has the beneficial effects that:
on one hand, the sparse processing flow is friendly to hardware, and a small amount of resources can be utilized to process a plurality of non-zero weights in parallel; in the traditional sparsification process, a non-zero weight needs to be found, the processing speed is low, or a large amount of hardware is needed to be processed in parallel, the cost is high, and the hardware implementation is complex.
On the other hand, the convolution MAC array can support matrix multiplication through a simple data rearrangement.
It should be understood that although the present description is described in terms of various embodiments, not every embodiment includes only a single embodiment, and such description is for clarity purposes only, and those skilled in the art will recognize that the embodiments described herein as a whole may be suitably combined to form other embodiments as will be appreciated by those skilled in the art.
The above-listed detailed description is only a specific description of a possible embodiment of the present invention, and they are not intended to limit the scope of the present invention, and equivalent embodiments or modifications made without departing from the technical spirit of the present invention should be included in the scope of the present invention.

Claims (10)

1. The method for acquiring sparse operation data based on a MAC multiply-add array is characterized in that the MAC multiply-add array is an O-row, I-column matrix; the MAC multiply-add array comprises I × O computing units; I input channels are arranged along the column direction of the MAC multiply-add array, and each input channel corresponds to one computing unit; O output channels are arranged along the row direction of the MAC multiply-add array, and each output channel corresponds to one computing unit;
the method for acquiring the sparse operation data based on the MAC multiply-add array comprises the following steps:
step S101, taking the row and column directions of a sparsification weight matrix to be calculated and the O row and the I column as a dividing unit, dividing the sparsification weight matrix into a plurality of cell blocks along the column direction according to the dividing unit; each cell block comprises a plurality of cells with effective weight values;
step S102, reading one or more unit blocks along the column direction of the sparse weight matrix to be calculated; generating a plurality of operation modes corresponding to the one or more unit blocks if the number of cells of the effective weight value in the one or more unit blocks is equal to I × O/2 and the number of cells of the effective weight value in each column of the one or more unit blocks is below I;
step S103, reading one or more cell blocks along the column direction of the sparse weight matrix according to the plurality of working modes; integrating the cells of significant weight values in one or more cell blocks into a computational array of M rows and N columns; effective weight value units in the calculation array are sequentially arranged along the column direction; the M rows correspond to the O rows; the N columns correspond to the I columns;
and step S104, when the MAC multiplication and addition array is used for realizing matrix multiplication calculation, the calculation array is used as a multiplier item after being converted, and the effective weight value unit in the row direction of the calculation matrix can be used as a characteristic weight value for calculation.
2. The acquisition method according to claim 1, the step S102 comprising:
reading 1 unit block along the column direction of the sparse weight matrix to be calculated as a first dividing unit; if the number of the effective weight values in the first division unit is equal to I multiplied by O/2 and the number of the effective weight values in each row of the first division unit is below I, generating a first working mode; or
Reading 2 unit blocks along the column direction of the sparse weight matrix to be calculated as a second division unit; if the number of the effective weight values in the second dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the second dividing unit is below I, generating a second working mode; or
Reading 4 unit blocks along the column direction of the sparse weight matrix to be calculated as a third division unit; if the number of the effective weight values in the third dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the third dividing unit is below I, generating a third working mode; or
Reading 8 unit blocks along the column direction of the sparse weight matrix to be calculated as a fourth dividing unit; and if the number of the effective weight values in the fourth dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the fourth dividing unit is below I, generating a fourth working mode.
3. The method of claim 2, wherein the step of integrating the cells of significant weight values in one or more cell blocks into a computational array of M rows and N columns in step S103 comprises:
reading one or more cell blocks; and sequentially reading the units according to the row sorting in each column, and if the current reading unit is an effective weight value unit, sequentially arranging the current effective weight value unit and the last effective weight value unit along the row sorting and the column direction.
4. The acquisition method according to claim 2 or 3, the step S103 comprising:
reading the first division unit along the column direction of the sparse weight matrix according to the first working mode; integrating the effective weight value units of the first division unit into a calculation array with M rows and N columns; effective weight value units in the calculation array are sequentially arranged along the column direction; or
Reading the second division unit along the column direction of the sparse weight matrix according to the second working mode; integrating the effective weight value units of the second dividing unit into a calculation array with M rows and N columns; effective weight value units in the calculation array are sequentially arranged along the column direction; or
Reading the third division unit along the column direction of the sparse weight matrix according to the third working mode; integrating the effective weight value units of the third dividing unit into a calculation array with M rows and N columns; effective weight value units in the calculation array are sequentially arranged along the column direction; or
Reading the fourth division unit along the column direction of the sparse weight matrix according to the fourth working mode; integrating the effective weight value units of the fourth dividing unit into a calculation array with M rows and N columns; the effective weight value units in the calculation array are sequentially arranged along the column direction.
5. The acquisition method according to claim 1, further comprising after the step S104:
step S105, the calculation array is used as a characteristic input value to realize convolution or full-connection layer calculation in a neural network model of deep learning;
the MAC multiplication and addition array is an 8-row 8-column matrix, a 16-row 16-column matrix, a 32-row 32-column matrix, a 64-row 64-column matrix, a 16-row 32-column matrix or a 32-row 64-column matrix; the MAC multiply-add array comprises 8 × 8, 16 × 16, 32 × 32, 64 × 64 or 16 × 32, 32 × 64 computing units; the M rows and N columns of computational arrays are 8 rows and 8 columns, 16 rows and 16 columns, 32 rows and 32 columns, 64 rows and 64 columns, 16 rows and 32 columns or 32 rows and 64 columns of computational arrays.
6. The system for acquiring sparse operation data based on a MAC multiply-add array, wherein the MAC multiply-add array is an O-row, I-column matrix; the MAC multiply-add array comprises I × O computing units; I input channels are arranged along the column direction of the MAC multiply-add array, and each input channel corresponds to one computing unit; O output channels are arranged along the row direction of the MAC multiply-add array, and each output channel corresponds to one computing unit;
the system for acquiring the sparse operation data based on the MAC multiply-add array comprises: a dividing unit, a generating work mode unit, an integrating unit and a calculating unit, wherein:
the dividing unit is configured to take the row and column directions of the sparse weight matrix to be calculated and the O row and the I column as one dividing unit, and divide the sparse weight matrix into a plurality of cell blocks along the column direction according to the dividing unit; each cell block comprises a plurality of cells with effective weight values;
the generation working mode unit is configured to read one or more unit blocks along the column direction of the sparse weight matrix to be calculated; generating a plurality of operation modes corresponding to the one or more unit blocks if the number of cells of the effective weight value in the one or more unit blocks is equal to I × O/2 and the number of cells of the effective weight value in each column of the one or more unit blocks is below I;
the integration unit is configured to read one or more unit blocks along a column direction of the thinning weight matrix according to the plurality of operation modes; integrating the cells of significant weight values in one or more cell blocks into a computational array of M rows and N columns; effective weight value units in the calculation array are sequentially arranged along the column direction; the M rows correspond to the O rows; the N columns correspond to the I columns;
the calculation unit is configured to, when the MAC multiply-add array is used to perform matrix multiply calculation, convert the calculation array to be a multiplier item, and the effective weight value unit in the calculation matrix row direction can be calculated as a feature weight value.
7. The acquisition system of claim 6, the generate operating mode unit further configured to:
reading 1 unit block along the column direction of the sparse weight matrix to be calculated as a first dividing unit; if the number of the effective weight values in the first division unit is equal to I multiplied by O/2 and the number of the effective weight values in each row of the first division unit is below I, generating a first working mode; or
Reading 2 unit blocks along the column direction of the sparse weight matrix to be calculated as a second division unit; if the number of the effective weight values in the second dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the second dividing unit is below I, generating a second working mode; or
Reading 4 unit blocks along the column direction of the sparse weight matrix to be calculated as a third division unit; if the number of the effective weight values in the third dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the third dividing unit is below I, generating a third working mode; or
Reading 8 unit blocks along the column direction of the sparse weight matrix to be calculated as a fourth dividing unit; and if the number of the effective weight values in the fourth dividing unit is equal to I multiplied by O/2 and the number of the effective weight values in each column of the fourth dividing unit is below I, generating a fourth working mode.
8. The acquisition system of claim 7, wherein the integration of the effective-weight cells in one or more unit blocks into the calculation array of M rows and N columns is configured to:
read the one or more unit blocks; read the cells in each column sequentially according to their row order; and, if the currently read cell is an effective-weight cell, arrange the current effective-weight cell immediately after the previous effective-weight cell, following the row order and the column direction.
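The packing rule of claim 8 — walk each column in row order and place every effective weight immediately after the previous one — can be sketched as follows. The column-major fill of the M × N calculation array is an assumption consistent with "arranged sequentially along the column direction"; the real hardware mapping and index bookkeeping may differ, and the helper name is illustrative.

    import numpy as np

    def integrate_effective_cells(blocks: list, M: int, N: int) -> np.ndarray:
        """Pack the effective (non-zero) weights of the selected unit blocks
        into an M x N calculation array, column by column, in the order read."""
        merged = np.concatenate(blocks, axis=1)
        packed = np.zeros((M, N), dtype=merged.dtype)
        k = 0                                  # index of the next free slot, counted down the columns
        for col in merged.T:                   # read columns along the column direction
            for value in col:                  # within a column, read cells in row order
                if value != 0:
                    if k >= M * N:
                        raise ValueError("more effective weights than the calculation array can hold")
                    packed[k % M, k // M] = value   # place right after the previous effective cell
                    k += 1
        return packed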
9. The acquisition system according to claim 7 or 8, wherein the integration unit is configured to:
read the first division unit along the column direction of the sparse weight matrix according to the first working mode, and integrate the effective-weight cells of the first division unit into a calculation array of M rows and N columns, the effective-weight cells in the calculation array being arranged sequentially along the column direction; or
read the second division unit along the column direction of the sparse weight matrix according to the second working mode, and integrate the effective-weight cells of the second division unit into a calculation array of M rows and N columns, the effective-weight cells in the calculation array being arranged sequentially along the column direction; or
read the third division unit along the column direction of the sparse weight matrix according to the third working mode, and integrate the effective-weight cells of the third division unit into a calculation array of M rows and N columns, the effective-weight cells in the calculation array being arranged sequentially along the column direction; or
read the fourth division unit along the column direction of the sparse weight matrix according to the fourth working mode, and integrate the effective-weight cells of the fourth division unit into a calculation array of M rows and N columns, the effective-weight cells in the calculation array being arranged sequentially along the column direction.
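Reading claims 6 through 9 together, the sketches above would be chained as below. The shapes, the random test matrix, and the helper names are all illustrative assumptions; with random data the strict "equal to I × O/2" condition will usually fail, which the example handles by checking for None.

    import numpy as np

    O, I = 8, 8
    rng = np.random.default_rng(0)
    # A deliberately sparse O-row weight matrix spanning eight unit blocks.
    weights = rng.integers(1, 8, size=(O, I * 8)) * (rng.random((O, I * 8)) < 0.2)

    blocks = divide_into_unit_blocks(weights, O, I)
    selection = select_working_mode(blocks, I, O)
    if selection is not None:
        mode, group = selection
        calc_array = integrate_effective_cells(blocks[:group], M=O, N=I)
        # calc_array would then be handed to the MAC multiply-add array as the multiplier operand.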
10. The acquisition system of claim 6, wherein the calculation unit further comprises:
a convolution calculation unit configured to perform convolution or fully-connected layer calculation in a deep-learning neural network model with the calculation array as a feature input value;
wherein the MAC multiply-add array is a matrix of 8 rows and 8 columns, 16 rows and 16 columns, 32 rows and 32 columns, 64 rows and 64 columns, 16 rows and 32 columns, or 32 rows and 64 columns; the MAC multiply-add array comprises 8 × 8, 16 × 16, 32 × 32, 64 × 64, 16 × 32 or 32 × 64 computing units; and the calculation array of M rows and N columns is a calculation array of 8 rows and 8 columns, 16 rows and 16 columns, 32 rows and 32 columns, 64 rows and 64 columns, 16 rows and 32 columns, or 32 rows and 64 columns.
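For quick reference, the array sizes enumerated in claim 10 can be written out as (rows, columns) pairs; the constant name and the helper below are illustrative only, not part of the claimed system.

    # (O rows, I columns) pairs listed in claim 10 for the MAC multiply-add array.
    SUPPORTED_MAC_SHAPES = {(8, 8), (16, 16), (32, 32), (64, 64), (16, 32), (32, 64)}

    def is_supported_mac_shape(O: int, I: int) -> bool:
        """True when the requested array size is one of the enumerated options."""
        return (O, I) in SUPPORTED_MAC_SHAPES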
CN202011640074.3A 2020-12-31 2020-12-31 Method and system for acquiring sparse operation data based on MAC multiply-add array Active CN112712173B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011640074.3A CN112712173B (en) 2020-12-31 2020-12-31 Method and system for acquiring sparse operation data based on MAC multiply-add array

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011640074.3A CN112712173B (en) 2020-12-31 2020-12-31 Method and system for acquiring sparse operation data based on MAC multiply-add array

Publications (2)

Publication Number Publication Date
CN112712173A true CN112712173A (en) 2021-04-27
CN112712173B CN112712173B (en) 2024-06-07

Family

ID=75547981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011640074.3A Active CN112712173B (en) 2020-12-31 2020-12-31 Method and system for acquiring sparse operation data based on MAC multiply-add array

Country Status (1)

Country Link
CN (1) CN112712173B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101378303A (en) * 2007-08-31 2009-03-04 华为技术有限公司 Method and apparatus for generating and processing retransmission low-density parity check code
US20180210862A1 (en) * 2017-01-22 2018-07-26 Gsi Technology Inc. Sparse matrix multiplication in associative memory device
CN109992742A (en) * 2017-12-29 2019-07-09 华为技术有限公司 A kind of signal processing method and device
CN209514618U (en) * 2019-02-26 2019-10-18 北京知存科技有限公司 Dynamic bias simulates vector-matrix multiplication operation circuit
CN110110851A (en) * 2019-04-30 2019-08-09 南京大学 A kind of the FPGA accelerator and its accelerated method of LSTM neural network
CN110766157A (en) * 2019-10-21 2020-02-07 中国人民解放军国防科技大学 Multi-sample neural network forward propagation vectorization implementation method
CN111626414A (en) * 2020-07-30 2020-09-04 电子科技大学 Dynamic multi-precision neural network acceleration unit

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114581676A (en) * 2022-03-01 2022-06-03 北京百度网讯科技有限公司 Characteristic image processing method and device and storage medium
CN114581676B (en) * 2022-03-01 2023-09-26 北京百度网讯科技有限公司 Processing method, device and storage medium for feature image

Also Published As

Publication number Publication date
CN112712173B (en) 2024-06-07

Similar Documents

Publication Publication Date Title
CN109447241B (en) Dynamic reconfigurable convolutional neural network accelerator architecture for field of Internet of things
CN111898733B (en) Deep separable convolutional neural network accelerator architecture
WO2022037257A1 (en) Convolution calculation engine, artificial intelligence chip, and data processing method
CN111445012A (en) FPGA-based packet convolution hardware accelerator and method thereof
CN110222818B (en) Multi-bank row-column interleaving read-write method for convolutional neural network data storage
CN112668708B (en) Convolution operation device for improving data utilization rate
CN112286864B (en) Sparse data processing method and system for accelerating operation of reconfigurable processor
CN113283587B (en) Winograd convolution operation acceleration method and acceleration module
WO2018027706A1 (en) Fft processor and algorithm
CN112434801A (en) Convolution operation acceleration method for carrying out weight splitting according to bit precision
CN111340198A (en) Neural network accelerator with highly-multiplexed data based on FPGA (field programmable Gate array)
CN112712173A (en) Method and system for acquiring sparse operation data based on MAC (media Access control) multiply-add array
US7912891B2 (en) High speed low power fixed-point multiplier and method thereof
CN109446478B (en) Complex covariance matrix calculation system based on iteration and reconfigurable mode
CN111610963B (en) Chip structure and multiply-add calculation engine thereof
CN111626410B (en) Sparse convolutional neural network accelerator and calculation method
CN113239591A (en) DCU cluster-oriented large-scale finite element grid parallel partitioning method and device
CN113743046B (en) Integrated layout structure for memory and calculation and integrated layout structure for data splitting and memory and calculation
CN111667052A (en) Standard and nonstandard volume consistency transformation method for special neural network accelerator
CN116882455A (en) Pointwise convolution computing device and method
CN113448624B (en) Data access method, device, system and AI accelerator
CN113656656B (en) Efficient neighbor retrieval method and system for wide-grading discrete element particle system
US20240220203A1 (en) Streaming-based compute unit and method, and artificial intelligence chip
CN112612447B (en) Matrix calculator and full-connection layer calculating method based on same
CN220983883U (en) Matrix computing device, chiplet apparatus and artificial intelligence accelerator device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant