WO2021147196A1

WO2021147196A1 - Convolution operation method, apparatus and device, and storage medium

Info

Publication number: WO2021147196A1
Application number: PCT/CN2020/087105
Authority: WO
Inventors: 董刚; 赵雅倩; 李仁刚; 杨宏斌; 刘海威
Original assignee: 苏州浪潮智能科技有限公司
Priority date: 2020-01-20
Filing date: 2020-04-27
Publication date: 2021-07-29
Also published as: CN111310891A

Abstract

A convolution operation method, apparatus and device, and a storage medium. The method comprises the steps of: reading, from a memory, a sample data matrix and a convolution kernel matrix corresponding to the sample data matrix; executing an expansion operation on the sample data matrix to generate a first intermediate matrix, and executing an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, wherein the first intermediate matrix and the second intermediate matrix have the same number of rows and the same number of columns; and executing a convolution operation on the first intermediate matrix by means of the second intermediate matrix to generate a convolution result. In the method, executing the convolution operation on the first intermediate matrix by means of the second intermediate matrix is equivalent to executing a convolution operation on the sample data matrix by means of the convolution kernel matrix, and the amount of data convolved between the two matrices per unit of time can be increased, thereby helping to ensure the overall efficiency of the convolution operation process. In addition, a convolution operation apparatus and device, and a storage medium are further provided in the present invention, and the beneficial effects thereof are the same as those described above.

Description

Convolution operation method, device, equipment and storage medium

This application claims priority to Chinese patent application No. 202010065274.4, filed with the Chinese Patent Office on January 20, 2020 and entitled "A convolution operation method, device, equipment, and storage medium", the entire content of which is incorporated herein by reference.

Technical field

The present invention relates to the field of deep learning, in particular to a convolution operation method, device, equipment and storage medium.

Background art

Deep learning refers to learning the internal laws and representation levels of sample data. Its ultimate goal is to enable machines to analyze and learn as humans do, and to recognize data such as text, images, and sound. Performing convolution operations on sample data to extract features is currently an important means of realizing deep learning.

Taking image deep learning as an example, the operation of taking the inner product of the sample data and a convolution kernel in different data windows of an image is called convolution, and the calculation process is also called filtering. In essence, it extracts the characteristics of different frequency bands of the image. The convolution kernel, also called a filter, is a set of neurons with fixed weights, usually a square two-dimensional matrix that stores the coefficients used to process the data in the receptive field. Filtering with a convolution kernel can be used to extract specific features, for example the contours of objects in the image or the color depth. Because the sample data acquired in the data window often contains more matrix elements than the convolution kernel, and the difference in the number of elements is large, it is difficult to ensure the overall efficiency of the convolution operation performed on the sample data by the convolution kernel.
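As a non-limiting illustration of the window-by-window inner product described above, the following minimal Python sketch computes a convolution directly. The function name direct_convolve, the stride of 1, the absence of padding, and the example values are assumptions made for illustration and are not taken from the application.

```python
def direct_convolve(sample, kernel):
    """Slide the kernel over the sample and take the inner product in each window.

    Assumes stride 1 and no padding (neither is specified in the text).
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(sample) - k_rows + 1
    out_cols = len(sample[0]) - k_cols + 1
    result = []
    for top in range(out_rows):
        row = []
        for left in range(out_cols):
            acc = 0
            # Inner product of the kernel with the current data window.
            for r in range(k_rows):
                for c in range(k_cols):
                    acc += sample[top + r][left + c] * kernel[r][c]
            row.append(acc)
        result.append(row)
    return result


# Illustrative 3x4 sample and 2x2 kernel (values are made up).
sample = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12]]
kernel = [[1, 0],
          [0, -1]]
feature_map = direct_convolve(sample, kernel)
```

Each output element here touches only as many sample elements as the kernel has coefficients, which is the mismatch in matrix sizes that the background identifies as an efficiency problem.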

It can be seen that providing a convolution operation method to relatively ensure the overall efficiency of the convolution operation process is a problem that needs to be solved by those skilled in the art.

Summary of the invention

The purpose of the present invention is to provide a convolution operation method, device, equipment and storage medium to relatively ensure the overall efficiency of the convolution operation process.

In order to solve the above technical problems, the present invention provides a convolution operation method, including:

Read the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix in the memory;

Perform an expansion operation on the sample data matrix to generate a first intermediate matrix, and perform an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, where the first intermediate matrix and the second intermediate matrix have the same number of rows and the same number of columns;

Perform a convolution operation on the first intermediate matrix through the second intermediate matrix, and generate a convolution result.

Preferably, when the number of sample data matrices is greater than 1, performing a convolution operation on the first intermediate matrix through the second intermediate matrix includes:

Perform a matrix multiplication operation on each first intermediate matrix respectively through the second intermediate matrix and generate a corresponding result matrix;

Perform an accumulation operation on each result matrix.

Preferably, reading the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix in the memory includes:

Read the sample data matrix in the DDR memory, and read the convolution kernel matrix corresponding to the sample data matrix in the HBM2 memory.

Preferably, performing a convolution operation on the first intermediate matrix through the second intermediate matrix includes:

In the DSP operation array, a convolution operation is performed on the first intermediate matrix through the second intermediate matrix.

Preferably, after generating the convolution result, the method further includes:

The convolution result is stored in the storage location corresponding to the sample data matrix in the memory.

Preferably, performing an expansion operation on the sample data matrix to generate the first intermediate matrix includes:

Sequentially extract, from the sample data matrix, process matrices of the same size as the convolution kernel matrix;

Perform a transposition operation on each row of data of each process matrix, and splice the transposed rows into a first transposed data column according to the order of the rows;

Combine the first transposed data columns into the first intermediate matrix according to the adjacency relationship between the corresponding process matrices.
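The three steps above can be sketched as follows. This is a minimal illustration only: the function name expand_sample, the stride of 1, the absence of padding, and the left-to-right, top-to-bottom extraction order are assumptions, since the text does not fix them.

```python
def expand_sample(sample, k_rows, k_cols):
    """Unfold a 2-D sample matrix into the first intermediate matrix.

    Each k_rows x k_cols window ("process matrix") is flattened row by row
    into one column; columns follow the order in which windows are extracted.
    Assumes stride 1, no padding, windows scanned left to right, top to bottom.
    """
    out_rows = len(sample) - k_rows + 1
    out_cols = len(sample[0]) - k_cols + 1
    columns = []
    for top in range(out_rows):
        for left in range(out_cols):
            column = []
            for r in range(k_rows):
                # Each window row is turned on its side and appended in row order.
                column.extend(sample[top + r][left:left + k_cols])
            columns.append(column)
    # Turn the list of columns into a row-major matrix.
    return [list(row) for row in zip(*columns)]


# Illustrative 2x4 sample expanded for a 2x2 kernel (values are made up).
sample = [[1, 2, 3, 4],
          [5, 6, 7, 8]]
first_intermediate = expand_sample(sample, 2, 2)
```

In this example the three adjacent 2x2 windows become the three columns of a 4x3 matrix, so neighbouring windows stay neighbouring columns.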

Preferably, performing an expansion operation on the convolution kernel matrix to generate the second intermediate matrix includes:

Perform a transposition operation on each row of data of the convolution kernel matrix, and splice them into a second transposed data column according to the order between the rows;

Combine a plurality of second transposed data columns into the second intermediate matrix.
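A minimal sketch of the kernel expansion is given below. It assumes that the plurality of second transposed data columns are identical copies of the flattened kernel, and that their count (the hypothetical parameter num_columns) equals the number of columns of the first intermediate matrix; the text states only that the two intermediate matrices have matching sizes.

```python
def expand_kernel(kernel, num_columns):
    """Flatten the kernel row by row into a column, then repeat that column.

    num_columns is assumed to be the column count of the first intermediate
    matrix, so that both intermediate matrices end up with the same shape.
    """
    column = [value for row in kernel for value in row]
    return [[value] * num_columns for value in column]


# Illustrative 2x2 kernel expanded to 3 columns (values are made up).
kernel = [[1, 2],
          [3, 4]]
second_intermediate = expand_kernel(kernel, 3)
```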

Preferably, when the number of dimensions of the sample data matrix is greater than 2, performing an expansion operation on the sample data matrix to generate the first intermediate matrix includes:

Based on each element in the target dimension in the sample data, an expansion operation is sequentially performed to generate a first intermediate matrix.

In addition, the present invention also provides a convolution operation device, including:

The matrix reading module is used to read the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix in the memory;

The preprocessing module is used to perform an expansion operation on the sample data matrix to generate a first intermediate matrix, and perform an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, where the first intermediate matrix and the second intermediate matrix have the same number of rows and the same number of columns;

The convolution execution module is configured to perform a convolution operation on the first intermediate matrix through the second intermediate matrix and generate a convolution result.

Preferably, the convolution execution module includes:

The matrix product module is used to perform matrix multiplication operations on each first intermediate matrix through the second intermediate matrix and generate a corresponding result matrix;

The accumulation module is used to perform accumulation operations on each result matrix.

Preferably, the matrix reading module includes:

The memory reading module is used to read the sample data matrix in the DDR memory, and read the convolution kernel matrix corresponding to the sample data matrix in the HBM2 memory.

In addition, the present invention also provides convolution operation equipment, including:

A memory, used to store a computer program;

A processor, used to implement the steps of the above-mentioned convolution operation method when executing the computer program.

In addition, the present invention also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the convolution operation method as described above are realized.

The convolution operation method provided by the present invention first reads, from a memory, the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix, where the number of rows or columns of the sample data matrix is equal to the number of rows of the convolution kernel matrix. It then performs an expansion operation on the sample data matrix and on the convolution kernel matrix to generate the first intermediate matrix and the second intermediate matrix respectively, the two intermediate matrices having the same number of rows and the same number of columns. Finally, the second intermediate matrix, obtained by expanding the convolution kernel matrix, is used to perform a convolution operation on the first intermediate matrix, obtained by expanding the sample data matrix, to generate the corresponding convolution result. Because the first intermediate matrix is equivalent to the sample data matrix and the second intermediate matrix is equivalent to the convolution kernel matrix, performing the convolution operation on the first intermediate matrix with the second intermediate matrix is equivalent to performing the convolution operation on the sample data matrix with the convolution kernel matrix. The method can also increase the amount of data convolved between the two matrices per unit time, thereby helping to ensure the overall efficiency of the convolution operation process. In addition, the present invention also provides a convolution operation device, equipment and storage medium, and the beneficial effects are the same as those described above.

Description of the drawings

FIG. 1 is a flowchart of a convolution operation method disclosed in an embodiment of the present invention;

Figure 2.a is a schematic diagram of the expansion operation of a sample data matrix in a specific application scenario disclosed in an embodiment of the present invention;

Figure 2.b is a schematic diagram of the expansion operation of a convolution kernel matrix in a specific application scenario disclosed in an embodiment of the present invention;

FIG. 3 is a flowchart of a specific convolution operation method disclosed in an embodiment of the present invention;

FIG. 4 is a schematic diagram of the composition structure of a convolution operation device disclosed in an embodiment of the present invention.

Detailed description

The following describes the technical solutions in the embodiments of the present invention clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

The core of the present invention is to provide a convolution operation method that helps to ensure the overall efficiency of the convolution operation process.

In order to enable those skilled in the art to better understand the solution of the present invention, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

As shown in FIG. 1, an embodiment of the present invention discloses a convolution operation method, including:

Step S10: Read the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix in the memory.

It should be noted that the sample data matrix read in this step can be a data matrix extracted from samples including, but not limited to, pictures, audio, and text. The convolution kernel matrix corresponding to the sample data matrix is the matrix used to perform feature extraction on the sample data matrix. The elements of the convolution kernel matrix are set according to the specific type of feature to be extracted from the sample data matrix. By performing a convolution operation on the sample data matrix, the convolution kernel matrix generates a feature image, that is, the convolution result, and the feature image reflects how the corresponding type of feature is distributed in the sample data matrix. In addition, in this step the sample data matrix and the corresponding convolution kernel matrix may be read from the same memory, or from two independent memories.

Step S11: Perform an expansion operation on the sample data matrix to generate a first intermediate matrix, and perform an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, where the first intermediate matrix and the second intermediate matrix have the same number of rows and the same number of columns.

The focus of this embodiment is that, after the sample data matrix and the corresponding convolution kernel matrix are obtained and before the convolution kernel matrix is used to perform the convolution operation on the sample data matrix, both matrices are first preprocessed, that is, an expansion operation is performed on each of them. The purpose of the expansion operation is to obtain a first intermediate matrix and a second intermediate matrix with the same number of rows and the same number of columns, where the first intermediate matrix is equivalent to the sample data matrix and the second intermediate matrix is equivalent to the convolution kernel matrix. Because the two intermediate matrices have matching row and column counts, a larger amount of data can be convolved between them per unit time during the subsequent convolution operation. In addition, the expansion operation in this step may specifically be performed row by row.
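The matching sizes can be made concrete with a short worked relation. The symbols H, W, k_h, k_w, N, M_1 and M_2 are introduced here for illustration only, and a stride of 1 with no padding is assumed; under those assumptions, a sample data matrix of H rows and W columns and a convolution kernel matrix of k_h rows and k_w columns give:

```latex
N = (H - k_h + 1)(W - k_w + 1), \qquad
M_1,\, M_2 \in \mathbb{R}^{(k_h k_w) \times N}
% Example: H = 3, W = 11, k_h = k_w = 3
% N = (3 - 3 + 1)(11 - 3 + 1) = 9, so M_1 and M_2 are both 9 x 9
```

Here M_1 denotes the first intermediate matrix, with one flattened window per column, and M_2 denotes the second intermediate matrix, with the flattened kernel as each column.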

Step S12: Perform a convolution operation on the first intermediate matrix through the second intermediate matrix, and generate a convolution result.

In this step, after the first intermediate matrix and the second intermediate matrix are obtained, a convolution operation is further performed on the first intermediate matrix through the second intermediate matrix to generate a corresponding convolution result.
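Steps S10 to S12 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names are hypothetical, stride 1 and no padding are assumed, the kernel column is assumed to be replicated, and the convolution of the two intermediate matrices is interpreted as multiplying matching elements and summing down each column.

```python
def expand_sample(sample, k_rows, k_cols):
    """S11 (sample side): one flattened window per column, stride 1, no padding."""
    out_rows = len(sample) - k_rows + 1
    out_cols = len(sample[0]) - k_cols + 1
    columns = [
        [sample[top + r][left + c] for r in range(k_rows) for c in range(k_cols)]
        for top in range(out_rows)
        for left in range(out_cols)
    ]
    return [list(row) for row in zip(*columns)]


def expand_kernel(kernel, num_columns):
    """S11 (kernel side): flatten row by row, then repeat the column."""
    return [[value] * num_columns for row in kernel for value in row]


def convolve_expanded(first, second):
    """S12: multiply matching elements and sum down each column."""
    return [
        sum(first[r][c] * second[r][c] for r in range(len(first)))
        for c in range(len(first[0]))
    ]


def run_convolution(sample, kernel):
    first = expand_sample(sample, len(kernel), len(kernel[0]))
    second = expand_kernel(kernel, len(first[0]))
    # Both intermediate matrices must have the same shape.
    assert (len(first), len(first[0])) == (len(second), len(second[0]))
    return convolve_expanded(first, second)


# S10 stand-in: matrices are given literally instead of being read from memory.
sample = [[1, 2, 3, 4, 5],
          [6, 7, 8, 9, 10],
          [11, 12, 13, 14, 15]]
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
result = run_convolution(sample, kernel)
```

Under these assumptions the result equals the one obtained by sliding the kernel directly over the sample, which is the equivalence the embodiment relies on.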

The convolution operation method provided by the present invention first reads, from a memory, the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix, where the number of rows or columns of the sample data matrix is equal to the number of rows of the convolution kernel matrix. It then performs an expansion operation on the sample data matrix and on the convolution kernel matrix to generate the first intermediate matrix and the second intermediate matrix respectively, the two intermediate matrices having the same number of rows and the same number of columns. Finally, the second intermediate matrix, obtained by expanding the convolution kernel matrix, is used to perform a convolution operation on the first intermediate matrix, obtained by expanding the sample data matrix, to generate the corresponding convolution result. Because the first intermediate matrix is equivalent to the sample data matrix and the second intermediate matrix is equivalent to the convolution kernel matrix, performing the convolution operation on the first intermediate matrix with the second intermediate matrix is equivalent to performing the convolution operation on the sample data matrix with the convolution kernel matrix. The method can also increase the amount of data convolved between the two matrices per unit time, thereby helping to ensure the overall efficiency of the convolution operation process.

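On the basis of the foregoing embodiment, as a preferred implementation manner, performing an expansion operation on the sample data matrix to generate a first intermediate matrix includes:

Sequentially extracting, from the sample data matrix, process matrices of the same size as the convolution kernel matrix;

Performing a transposition operation on each row of data of each process matrix, and splicing the transposed rows into a first transposed data column according to the order of the rows;

Combining the first transposed data columns into the first intermediate matrix according to the adjacency relationship between the corresponding process matrices.

It should be noted that, because this embodiment uses matrix transposition and splicing, the resulting first intermediate matrix is computed row-first and generates only a small amount of intermediate result data during the calculation, which reduces hardware resource overhead.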

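On the basis of the foregoing implementation manner, as a preferred implementation manner, performing an expansion operation on the convolution kernel matrix to generate a second intermediate matrix includes:

Performing a transposition operation on each row of data of the convolution kernel matrix, and splicing the transposed rows into a second transposed data column according to the order of the rows;

Combining a plurality of second transposed data columns into the second intermediate matrix.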

It should be noted that this embodiment can generate the second intermediate matrix relatively efficiently based on the row and column size of the first intermediate matrix, which improves the overall efficiency of the convolution operation.

In order to deepen the understanding of the unfolding operation in the foregoing embodiment, this embodiment is described by way of example. In a specific application scenario, the schematic diagrams of the expansion operation of the sample data matrix and the convolution kernel matrix are shown in Figure 2.a and Figure 2.b, respectively.

As shown in Figure 2.a, the sample data matrix is a 3x11 matrix, and the first intermediate matrix obtained by the expansion operation is a 9x9 matrix. The upper three rows of the 9x9 matrix are obtained by taking three segments of 9 elements each from the first row of the 3x11 matrix, starting at the first, second, and third elements respectively. The lower six rows of the 9x9 matrix are obtained in the same way from the second and third rows of the 3x11 matrix.

As shown in Figure 2.b, the convolution kernel matrix is a 3x3 matrix, and the expansion operation arranges the 3x3 data into a single column in row order and then expands that column into 9 columns, so that the second intermediate matrix is also 9x9.
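The layout described for Figures 2.a and 2.b can be reproduced with the short sketch below. The element values are made up for illustration, and the 9 kernel columns are assumed to be copies of the same flattened kernel; neither detail is taken from the figures themselves.

```python
# 3x11 sample whose elements are numbered 0..32 so positions are easy to trace.
sample = [[row * 11 + col for col in range(11)] for row in range(3)]
kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# Each segment taken from a sample row has 11 - 3 + 1 = 9 elements.
width = len(sample[0]) - len(kernel[0]) + 1

# Figure 2.a: every sample row contributes three rows, starting at its
# first, second, and third element respectively.
first_intermediate = [
    row[offset:offset + width]
    for row in sample
    for offset in range(len(kernel[0]))
]

# Figure 2.b: the kernel is flattened in row order and the column is repeated.
second_intermediate = [[value] * width for row in kernel for value in row]
```

Reading any column of first_intermediate from top to bottom gives one 3x3 window of the sample flattened in row order, which is why this row-wise construction agrees with the window-by-window description of the expansion operation.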

On the basis of the above-mentioned embodiments, the present invention also provides the following series of preferred embodiments.

Referring to FIG. 3, for the case where the number of sample data matrices is greater than 1, an embodiment of the present invention discloses a convolution operation method, including:

Step S20: Read the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix in the memory.

Step S21: Perform an expansion operation on the sample data matrix to generate a first intermediate matrix, and perform an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, where the first intermediate matrix and the second intermediate matrix have the same number of rows and the same number of columns.

Step S22: Perform a matrix multiplication operation on each first intermediate matrix respectively through the second intermediate matrix and generate a corresponding result matrix.

Step S23: Perform an accumulation operation on each result matrix and generate a convolution result.

It is understandable that when the number of sample data matrices is greater than 1, the number of first intermediate matrices generated by the expansion operation is also greater than 1. The second intermediate matrix corresponding to the convolution kernel matrix therefore needs to perform a matrix multiplication operation with every first intermediate matrix to generate the corresponding result matrices, and the result matrices are then accumulated to generate the convolution result for all the sample data matrices. This implementation can relatively ensure the overall accuracy of the convolution operation performed on the sample data matrices when their number is greater than 1.
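Steps S22 and S23 can be sketched as follows. The text does not specify the orientation of the matrix multiplication, so this sketch assumes that the flattened kernel, treated as a single row, is multiplied by each first intermediate matrix; this gives the same values as a per-column multiply-accumulate with a replicated kernel column. The function names and example values are hypothetical.

```python
def expand_sample(sample, k_rows, k_cols):
    """One flattened window per column, assuming stride 1 and no padding."""
    out_rows = len(sample) - k_rows + 1
    out_cols = len(sample[0]) - k_cols + 1
    columns = [
        [sample[top + r][left + c] for r in range(k_rows) for c in range(k_cols)]
        for top in range(out_rows)
        for left in range(out_cols)
    ]
    return [list(row) for row in zip(*columns)]


def multiply_and_accumulate(samples, kernel):
    """Apply one kernel to several sample matrices and sum the results."""
    kernel_row = [value for row in kernel for value in row]
    total = None
    for sample in samples:
        first = expand_sample(sample, len(kernel), len(kernel[0]))
        # S22: a (1 x k) by (k x N) product gives one result row per sample matrix.
        result = [
            sum(weight * first[r][c] for r, weight in enumerate(kernel_row))
            for c in range(len(first[0]))
        ]
        # S23: element-wise accumulation of the result matrices.
        total = result if total is None else [a + b for a, b in zip(total, result)]
    return total


# Two illustrative 2x3 sample matrices and a 2x2 kernel of ones (values made up).
samples = [
    [[1, 2, 3], [4, 5, 6]],
    [[10, 20, 30], [40, 50, 60]],
]
kernel = [[1, 1],
          [1, 1]]
accumulated = multiply_and_accumulate(samples, kernel)
```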

On the basis of the foregoing embodiment, as a preferred implementation manner, reading the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix in the memory includes:

Reading the sample data matrix in the DDR memory, and reading the convolution kernel matrix corresponding to the sample data matrix in the HBM2 memory.

It should be noted that, in this embodiment, the sample data matrix and the convolution kernel matrix are obtained from two different memories: the sample data matrix is read from the DDR memory, and the corresponding convolution kernel matrix is read from the HBM2 memory. The DDR memory and the HBM2 memory can belong to the same arithmetic chip. For example, when both belong to an FPGA chip, the FPGA chip obtains the sample data matrix from the local DDR memory, obtains the convolution kernel matrix from the local HBM2 memory, and performs the convolution operation of the convolution kernel matrix on the sample data matrix within the FPGA chip.

Both the DDR memory and the HBM2 memory in this embodiment can achieve a higher data transfer rate than SDRAM memory at the same bus frequency, so this embodiment can further improve the overall efficiency of the convolution operation.

On the basis of the foregoing embodiment, as a preferred implementation manner, performing a convolution operation on the first intermediate matrix through the second intermediate matrix includes:

In the DSP operation array, performing a convolution operation on the first intermediate matrix through the second intermediate matrix.

It should be noted that the DSP operation array, also called a digital signal processor, is a microprocessor with a special structure. Internally, the DSP chip keeps program and data separate, has a hardware multiplier, and makes extensive use of pipelined operation, and the DSP instructions it provides can be used to implement various digital signal processing algorithms quickly. Therefore, performing the convolution operation on the first intermediate matrix through the second intermediate matrix in the DSP operation array can relatively improve the overall efficiency of that convolution operation.

In addition, as a preferred implementation manner, after generating the convolution result, the method further includes:

Storing the convolution result in the storage location corresponding to the sample data matrix in the memory.

It should be noted that once the convolution kernel matrix has been used to perform the convolution operation on the sample data matrix, the sample data matrix has served its purpose, yet it would continue to occupy memory space and reduce the amount of memory available. Therefore, after the convolution result is generated in this embodiment, the convolution result is stored in the storage location corresponding to the sample data matrix in the memory, so that the convolution result overwrites the original sample data matrix. This preserves the available memory space, avoids wasting memory, reduces the storage pressure on the memory, and helps ensure the overall stability of the convolution operation.
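The overwrite described above can be sketched with a flat list standing in for the memory. The flat-list memory model, the function name store_result_in_place, and its parameters are hypothetical and serve only to illustrate writing the result at the storage location of the sample data matrix.

```python
def store_result_in_place(memory, sample_offset, sample_size, result):
    """Overwrite the sample region of a flat memory with the convolution result."""
    if len(result) > sample_size:
        raise ValueError("result does not fit in the sample region")
    # Equal-length slice assignment: the memory keeps its overall size.
    memory[sample_offset:sample_offset + len(result)] = result
    # Return the region now holding the result; the rest of the old
    # sample region can be reused by the caller.
    return sample_offset, len(result)


# A 20-slot memory in which the sample occupies slots 4..15 (values made up).
memory = [0] * 4 + list(range(1, 13)) + [0] * 4
region = store_result_in_place(memory, sample_offset=4, sample_size=12, result=[7, 8])
```

No additional slots are allocated for the result, which is the memory saving the embodiment aims at.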

In addition, on the basis of the foregoing series of implementation manners, as a preferred implementation manner, when the number of dimensions of the sample data matrix is greater than 2, performing an expansion operation on the sample data matrix to generate the first intermediate matrix includes:

Sequentially performing the expansion operation based on each element in the target dimension of the sample data to generate the first intermediate matrix.

It should be noted that when the number of dimensions of the sample data matrix is greater than 2, this embodiment performs the expansion operation in sequence for each element in the target dimension of the sample data to generate the first intermediate matrix, and then performs the convolution operation on the first intermediate matrix through the second intermediate matrix. The convolution operation between the second intermediate matrix and the first intermediate matrix can thus be carried out sequentially, one element of the target dimension at a time. This relatively reduces the amount of intermediate data generated when the second intermediate matrix performs the convolution operation on the first intermediate matrix corresponding to the same element in the target dimension, thereby reducing hardware resource overhead.
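A minimal sketch of the per-element expansion is given below. It assumes that the target dimension is the leading axis of a three-dimensional sample and uses a generator so that only one first intermediate matrix exists at a time; the function names, the choice of axis, and the example values are assumptions made for illustration.

```python
def expand_sample(plane, k_rows, k_cols):
    """One flattened window per column, assuming stride 1 and no padding."""
    out_rows = len(plane) - k_rows + 1
    out_cols = len(plane[0]) - k_cols + 1
    columns = [
        [plane[top + r][left + c] for r in range(k_rows) for c in range(k_cols)]
        for top in range(out_rows)
        for left in range(out_cols)
    ]
    return [list(row) for row in zip(*columns)]


def expand_along_dimension(tensor, k_rows, k_cols):
    """Yield one first intermediate matrix per element of the target dimension."""
    for plane in tensor:  # assumption: the target dimension is the leading axis
        yield expand_sample(plane, k_rows, k_cols)


# Illustrative 2x2x3 sample: two 2x3 planes along the target dimension.
tensor = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
matrices = list(expand_along_dimension(tensor, 2, 2))
```

In practice a caller would consume the generator one matrix at a time rather than collecting it into a list; the list is used here only to make the output easy to inspect.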

On the other hand, the present invention also provides a convolution operation device. Refer to FIG. 4, which shows a schematic diagram of the composition structure of an embodiment of a convolution operation device, and the device includes:

The matrix reading module 10 is used for reading the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix in the memory.

The preprocessing module 11 is used to perform an expansion operation on the sample data matrix to generate a first intermediate matrix, and perform an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, where the first intermediate matrix and the second intermediate matrix have the same number of rows and the same number of columns.

The convolution execution module 12 is configured to perform a convolution operation on the first intermediate matrix through the second intermediate matrix, and generate a convolution result.

The convolution operation device provided by the present invention first reads, from a memory, the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix, where the number of rows or columns of the sample data matrix is equal to the number of rows of the convolution kernel matrix. It then performs an expansion operation on the sample data matrix and on the convolution kernel matrix to generate the first intermediate matrix and the second intermediate matrix respectively, the two intermediate matrices having the same number of rows and the same number of columns. Finally, the second intermediate matrix, obtained by expanding the convolution kernel matrix, is used to perform a convolution operation on the first intermediate matrix, obtained by expanding the sample data matrix, to generate the corresponding convolution result. Because the first intermediate matrix is equivalent to the sample data matrix and the second intermediate matrix is equivalent to the convolution kernel matrix, performing the convolution operation on the first intermediate matrix with the second intermediate matrix is equivalent to performing the convolution operation on the sample data matrix with the convolution kernel matrix. The device can also increase the amount of data convolved between the two matrices per unit time, thereby helping to ensure the overall efficiency of the convolution operation process.

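In addition, as a preferred implementation manner, the convolution execution module includes:

A matrix product module, used to perform a matrix multiplication operation on each first intermediate matrix through the second intermediate matrix and generate a corresponding result matrix;

An accumulation module, used to perform an accumulation operation on each result matrix.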

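In addition, as a preferred embodiment, the matrix reading module includes:

A memory reading module, used to read the sample data matrix in the DDR memory, and read the convolution kernel matrix corresponding to the sample data matrix in the HBM2 memory.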

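On the other hand, the present invention also provides convolution operation equipment, including:

A memory, used to store a computer program;

A processor, used to implement the steps of the above-mentioned convolution operation method when executing the computer program.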

The convolution operation equipment provided by the present invention first reads, from a memory, the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix, where the number of rows or columns of the sample data matrix is equal to the number of rows of the convolution kernel matrix. It then performs an expansion operation on the sample data matrix and on the convolution kernel matrix to generate the first intermediate matrix and the second intermediate matrix respectively, the two intermediate matrices having the same number of rows and the same number of columns. Finally, the second intermediate matrix, obtained by expanding the convolution kernel matrix, is used to perform a convolution operation on the first intermediate matrix, obtained by expanding the sample data matrix, to generate the corresponding convolution result. Because the first intermediate matrix is equivalent to the sample data matrix and the second intermediate matrix is equivalent to the convolution kernel matrix, performing the convolution operation on the first intermediate matrix with the second intermediate matrix is equivalent to performing the convolution operation on the sample data matrix with the convolution kernel matrix. The equipment can also increase the amount of data convolved between the two matrices per unit time, thereby helping to ensure the overall efficiency of the convolution operation process.

On the other hand, the present invention also provides a computer-readable storage medium with a computer program stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the convolution operation method as described above are realized.

With the computer-readable storage medium provided by the present invention, the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix are first read from a memory, where the number of rows or columns of the sample data matrix is consistent with the number of rows of the convolution kernel matrix. An expansion operation is then performed on the sample data matrix and on the convolution kernel matrix to generate the first intermediate matrix and the second intermediate matrix respectively, the two intermediate matrices having the same number of rows and the same number of columns. Finally, the second intermediate matrix, obtained by expanding the convolution kernel matrix, is used to perform a convolution operation on the first intermediate matrix, obtained by expanding the sample data matrix, to generate the corresponding convolution result. Because the first intermediate matrix is equivalent to the sample data matrix and the second intermediate matrix is equivalent to the convolution kernel matrix, performing the convolution operation on the first intermediate matrix with the second intermediate matrix is equivalent to performing the convolution operation on the sample data matrix with the convolution kernel matrix. The amount of data convolved between the two matrices per unit time can also be increased, thereby helping to ensure the overall efficiency of the convolution operation process.

The above describes in detail the convolution operation method, device, equipment, and storage medium provided by the present invention. The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method. It should be pointed out that those of ordinary skill in the art may make several improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the present invention.

It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article, or device. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.

Claims

A convolution operation method, characterized in that it comprises:

reading, from a memory, a sample data matrix and a convolution kernel matrix corresponding to the sample data matrix;

performing an expansion operation on the sample data matrix to generate a first intermediate matrix, and performing an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, wherein the numbers of rows and columns of the first intermediate matrix and the second intermediate matrix are consistent; and

performing a convolution operation on the first intermediate matrix through the second intermediate matrix, and generating a convolution result.
The convolution operation method according to claim 1, wherein, when the number of sample data matrices is greater than 1, performing the convolution operation on the first intermediate matrix through the second intermediate matrix comprises:

performing a matrix multiplication operation on each of the first intermediate matrices through the second intermediate matrix, and generating a corresponding result matrix; and

performing an accumulation operation on the result matrices.
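
By way of illustration only, the multiply-then-accumulate step of this claim can be sketched in Python as follows. The sketch assumes that each sample data matrix (for example, one input channel) has its own flattened kernel stored as a single row, and that the first intermediate matrices are already available; the data, the layout, and the function names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def accumulate(matrices):
    """Element-wise sum of equally shaped result matrices."""
    total = [[0] * len(matrices[0][0]) for _ in matrices[0]]
    for m in matrices:
        for i, row in enumerate(m):
            for j, v in enumerate(row):
                total[i][j] += v
    return total


# Hypothetical data: two first intermediate matrices, each 4 x 2,
# i.e. two flattened 2 x 2 windows stored as columns.
first_matrices = [
    [[1, 2], [3, 4], [5, 6], [7, 8]],
    [[1, 0], [0, 1], [1, 0], [0, 1]],
]
# Assumed layout: one flattened kernel (1 x 4 row) per sample data matrix.
second_matrices = [
    [[1, 0, 0, 1]],
    [[2, 2, 2, 2]],
]

# One matrix product per sample data matrix, then an element-wise sum.
result_matrices = [matmul(s, f) for s, f in zip(second_matrices, first_matrices)]
convolution_result = accumulate(result_matrices)
```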
The convolution operation method according to claim 1, wherein reading, from the memory, the sample data matrix and the convolution kernel matrix corresponding to the sample data matrix comprises:

reading the sample data matrix from a DDR memory, and reading the convolution kernel matrix corresponding to the sample data matrix from an HBM2 memory.
The convolution operation method according to claim 1, wherein performing the convolution operation on the first intermediate matrix through the second intermediate matrix comprises:

performing, in a DSP operation array, the convolution operation on the first intermediate matrix through the second intermediate matrix.
The convolution operation method according to claim 1, wherein, after generating the convolution result, the method further comprises:

storing the convolution result in a storage location in the memory corresponding to the sample data matrix.
The convolution operation method according to claim 1, wherein performing the expansion operation on the sample data matrix to generate the first intermediate matrix comprises:

sequentially extracting, from the sample data matrix, process matrices with the same size as the convolution kernel matrix;

performing a transposition operation on each row of data of each process matrix, and splicing the transposed rows into a first transposed data column according to the order of the rows; and

combining the corresponding first transposed data columns into the first intermediate matrix according to the adjacency relationship between the process matrices.
The convolution operation method according to claim 6, wherein performing the expansion operation on the convolution kernel matrix to generate the second intermediate matrix comprises:

performing a transposition operation on each row of data of the convolution kernel matrix, and splicing the transposed rows into a second transposed data column according to the order of the rows; and

combining a plurality of the second transposed data columns to form the second intermediate matrix.
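
For illustration only, the kernel-side expansion recited in this claim can be sketched in Python as follows. The claim does not state what distinguishes the plurality of second transposed data columns; this sketch assumes one column per convolution kernel. The first intermediate matrix is taken as given, and all names and data are hypothetical.

```python
def kernel_to_column(kernel):
    """Turn each kernel row into a column piece and splice the pieces in row order."""
    return [v for row in kernel for v in row]


def build_second_intermediate(kernels):
    """Place one flattened kernel in each column of the second intermediate matrix."""
    columns = [kernel_to_column(k) for k in kernels]
    return [list(row) for row in zip(*columns)]


# Two hypothetical 2 x 2 convolution kernels.
kernels = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
]
second = build_second_intermediate(kernels)

# A first intermediate matrix with one flattened 2 x 2 window per column,
# taken as given for this sketch.
first = [[1, 2], [2, 3], [4, 5], [5, 6]]

# outputs[m][n] is kernel m applied to window n, i.e. the product of the
# transposed second intermediate matrix with the first intermediate matrix.
outputs = [
    [sum(second[i][m] * first[i][n] for i in range(len(first)))
     for n in range(len(first[0]))]
    for m in range(len(second[0]))
]
```

Under this assumed layout, every kernel is applied to every window in a single product, so adding kernels only adds columns to the second intermediate matrix.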
The convolution operation method according to any one of claims 1 to 7, wherein, when the number of dimensions of the sample data matrix is greater than 2, performing the expansion operation on the sample data matrix to generate the first intermediate matrix comprises:

generating the first intermediate matrix by sequentially performing the expansion operation based on each element in a target dimension of the sample data matrix.
A convolution operation device, characterized in that it comprises:

a matrix reading module, configured to read, from a memory, a sample data matrix and a convolution kernel matrix corresponding to the sample data matrix;

a preprocessing module, configured to perform an expansion operation on the sample data matrix to generate a first intermediate matrix, and to perform an expansion operation on the convolution kernel matrix to generate a second intermediate matrix, wherein the numbers of rows and columns of the first intermediate matrix and the second intermediate matrix are consistent; and

a convolution execution module, configured to perform a convolution operation on the first intermediate matrix through the second intermediate matrix, and to generate a convolution result.
The convolution operation device according to claim 9, wherein the convolution execution module comprises:

a matrix product module, configured to perform a matrix multiplication operation on each of the first intermediate matrices through the second intermediate matrix, and to generate a corresponding result matrix; and

an accumulation module, configured to perform an accumulation operation on the result matrices.
The convolution operation device according to claim 9, wherein the matrix reading module comprises:

a memory reading module, configured to read the sample data matrix from a DDR memory, and to read the convolution kernel matrix corresponding to the sample data matrix from an HBM2 memory.
A convolution operation device, characterized in that it comprises:

a memory, configured to store a computer program; and

a processor, configured to implement the steps of the convolution operation method according to any one of claims 1 to 8 when executing the computer program.
A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the convolution operation method according to any one of claims 1 to 8 are implemented.