WO2019136750A1 - Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal - Google Patents

Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal Download PDF

Info

Publication number
WO2019136750A1
WO2019136750A1 (PCT/CN2018/072662)
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
zero
artificial intelligence
processed
auxiliary processing
Prior art date
Application number
PCT/CN2018/072662
Other languages
French (fr)
Chinese (zh)
Inventor
肖梦秋
Original Assignee
深圳鲲云信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳鲲云信息科技有限公司 filed Critical 深圳鲲云信息科技有限公司
Priority to PCT/CN2018/072662 priority Critical patent/WO2019136750A1/en
Priority to CN201880002144.7A priority patent/CN109313663B/en
Publication of WO2019136750A1 publication Critical patent/WO2019136750A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to the field of artificial intelligence, and in particular to an artificial intelligence computing auxiliary processing method, apparatus, readable computer storage medium, and terminal.
  • the original data matrix needs to be zero-padded.
  • However, in the prior art the zero-padding operation can usually only be performed in software; the amount of CPU computation involved is very large, which makes zero-padding very inefficient.
  • In view of the above shortcomings of the prior art, an object of the present invention is to provide an artificial intelligence calculation auxiliary processing method and apparatus, a computer-readable storage medium, and a terminal for solving the technical problems of the prior art, such as the low efficiency and heavy computational load of the zero-padding operation.
  • The present invention provides an artificial intelligence calculation auxiliary processing apparatus, comprising: a plurality of storage modules storing a data matrix to be processed; a memory provided with a zero matrix; and a control module for taking the data matrix to be processed out of the storage modules and placing it into the zero matrix in the memory, so that the data matrix to be processed can form, centered on any one of its first matrix elements, a to-be-convolved matrix of size W*W for the convolution kernel matrix to perform convolution calculation according to a preset step size, where W is the size of the convolution kernel matrix.
  • In an embodiment, the data matrix to be processed includes an n*m matrix and the zero matrix includes an N*M zero matrix, where N = n + W - 1 and M = m + W - 1.
  • Taking the data matrix to be processed out of the storage module and placing it into the zero matrix in the memory specifically includes: the control module takes the n*m matrix out of the storage module and places it into the N*M zero matrix of the memory to form a padding matrix, in which rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N are zero, columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero, and the remaining region is the n*m matrix.
  • Taking the data matrix to be processed out of the storage module and placing it into the zero matrix may specifically proceed as follows: the n*m matrix is divided into a plurality of block matrices of the same size, and the control module takes out each block matrix in turn and fills it into the N*M zero matrix according to a preset placement condition.
  • The preset placement condition includes: the control module places each block matrix into the N*M zero matrix in turn according to a preset start address in the memory and the size of the block matrix.
  • the storage module includes a dual storage module.
  • In an embodiment, the artificial intelligence calculation auxiliary processing device includes: a multiplier for multiplying each to-be-convolved matrix with the convolution kernel to obtain a corresponding multiplication result matrix, each multiplication result matrix being aligned with a first matrix element; and an adder for adding the second matrix elements in each multiplication result matrix to obtain a corresponding convolution result value, the convolution result values being aligned with the first matrix elements so as to form a convolution result matrix of size n*m.
  • The present invention also provides an artificial intelligence calculation auxiliary processing method applied to a control module, the method comprising: taking an n*m matrix out of a storage module; and placing the n*m matrix into an N*M zero matrix in the memory, so that the n*m matrix can form, centered on any one of its first matrix elements, a to-be-convolved matrix of size W*W for convolution calculation with the convolution kernel, where N = n + W - 1, M = m + W - 1, and W is the order of the convolution kernel.
  • The control module taking the n*m matrix out of the storage module specifically includes: the n*m matrix is divided into a plurality of block matrices of the same size, and the control module takes out each block matrix in turn.
  • Placing the n*m matrix into the N*M zero matrix of the memory specifically includes: the control module fills each block matrix into the N*M zero matrix in turn according to the start address and the size of the block matrix.
  • the present invention provides a computer readable storage medium having stored thereon a computer program that, when executed by a processor, implements the artificial intelligence calculation auxiliary processing method.
  • The present invention further provides an artificial intelligence calculation auxiliary processing terminal, comprising a processor and a memory; the memory is used to store a computer program, and the processor is configured to execute the computer program stored in the memory so as to cause the terminal to perform the artificial intelligence calculation auxiliary processing method.
  • As described above, the artificial intelligence calculation auxiliary processing method, apparatus, computer-readable storage medium, and terminal of the present invention have the following beneficial effects: a zero-padding operation system is built on a hardware structure, and a zero matrix is preset in the memory so that the zero-padding operation is accomplished simply by placing the data matrix to be processed into it, without computing the number or positions of padded zeros. This greatly reduces the computational load of the system, improves the efficiency of the zero-padding operation, and speeds up the response of operations such as image processing. The present invention therefore effectively overcomes the shortcomings of the prior art and has high industrial value.
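The size-preserving effect described above can be checked with the standard convolution output-size formula. The helper below is an illustrative sketch (not part of the patent; the function name is invented) showing that padding (W - 1)/2 zeros on each side with step size 1 keeps an n*n input at size n*n.

```python
def conv_output_size(n, w, pad, stride=1):
    """Output length along one axis: floor((n + 2*pad - w) / stride) + 1."""
    return (n + 2 * pad - w) // stride + 1

# "Same" padding for an odd-order kernel: pad (W - 1) // 2 zeros per side.
n, w = 5, 3                     # the 5*5 matrix and 3*3 kernel of the embodiment
pad = (w - 1) // 2
assert conv_output_size(n, w, pad) == n   # the 5*5 input stays 5*5
```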
  • FIG. 1 is a schematic diagram of an artificial intelligence calculation auxiliary processing apparatus according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram showing an artificial intelligence calculation auxiliary processing procedure according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram showing an artificial intelligence calculation auxiliary processing procedure according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram showing an artificial intelligence calculation auxiliary processing procedure according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram showing an artificial intelligence calculation auxiliary processing procedure according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram showing an artificial intelligence calculation auxiliary processing procedure according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of an artificial intelligence calculation auxiliary processing device according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram showing an artificial intelligence calculation auxiliary processing method according to an embodiment of the present invention.
  • the invention provides an artificial intelligence calculation auxiliary processing device for performing a zero-padding operation on a data matrix to be processed.
  • the artificial intelligence calculation auxiliary processing device includes a plurality of storage modules 11, a memory 12, and a control module 13.
  • the plurality of storage modules 11 store an n*m matrix, and both n and m are natural numbers greater than or equal to 1.
  • An N*M zero matrix is stored in the memory 12; the control module 13 is configured to take out the n*m matrix and place it in the N*M zero matrix.
  • In an embodiment, the storage module 11 is a dual storage module, which specifically includes two storage modules.
  • The two storage modules always work in a state in which one is being scanned (read out) while the other is processing data, so that each frame of the final output appears to have undergone both processing and scan-out in a single pass. This is in fact achieved through the cooperation of the two memories, thereby roughly doubling data transfer and processing efficiency.
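The ping-pong behaviour of the dual storage module can be sketched in software as two buffers that swap roles every frame. The function below is a hedged illustration only (names and structure invented, not the patented hardware), showing how one buffer is filled while the previously filled one is processed.

```python
def process_frames(frames, process):
    """Model of double buffering: fill one buffer while processing the other."""
    buffers = [None, None]   # the two "storage modules"
    fill, work = 0, 1        # which buffer is being filled / being processed
    results = []
    buffers[fill] = frames[0]                   # prefill the first buffer
    for nxt in frames[1:]:
        fill, work = work, fill                 # swap roles each frame
        buffers[fill] = nxt                     # "scan" the next frame in
        results.append(process(buffers[work]))  # process the previous frame
    fill, work = work, fill
    results.append(process(buffers[work]))      # drain the last frame
    return results
```

With `process_frames([1, 2, 3], lambda x: x * 2)` the frames come out doubled and in order, while at every step the "scan" of the next frame overlaps the processing of the previous one.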
  • the control module implements data transmission by using a DMA method.
  • DMA stands for Direct Memory Access, a mechanism by which a controller can access data in memory directly, without going through the CPU. In DMA mode the CPU only needs to issue an instruction to the control module, which then handles the data transfer and reports back to the CPU once the data has been sent, greatly reducing CPU occupancy and saving system resources.
  • the N*M zero matrix is a matrix composed of (N*M) zero values.
  • the W is the order of the W*W order convolution kernel matrix.
  • the convolution kernel matrix is a weight matrix for performing weighted average calculation on the matrix data, and the effect is equivalent to the filter in the convolution calculation.
  • the order of the weight matrix is odd, which facilitates determining the position of the matrix by the center element of the odd-order matrix.
  • The control module takes the n*m matrix out of the storage module 11 and places it into the N*M zero matrix of the memory 12 to form a padding matrix, so that the n*m matrix can form, centered on any one of its matrix elements, a to-be-convolved matrix of size W*W.
  • the process of convolution calculation of the n*m matrix and the W*W order convolution kernel is described below in a specific embodiment.
  • Referring to FIG. 2-6, schematic diagrams of an artificial intelligence calculation auxiliary processing procedure in an embodiment of the present invention are shown, where:
  • Figure 2 shows the data matrix M1 to be processed and the convolution kernel matrix M2 in this embodiment.
  • the data matrix to be processed is a 5*5 order matrix
  • the convolution kernel matrix is a 3*3 order matrix
  • the values in each matrix are matrix elements of the matrix.
  • Figure 3 shows the zero matrix M3 in this embodiment.
  • Figure 4 shows the fill matrix M4 in this embodiment.
  • the padding matrix is formed after the to-be-processed data matrix M1 is placed in the zero matrix M3.
  • Rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N of the padding matrix are zero, and columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero, the remaining region being the n*m matrix. In this embodiment the first and seventh rows and the first and seventh columns of the padding matrix M4 are all 0, the region from the second to the sixth row and from the second to the sixth column holds the to-be-processed data matrix M1, and the padding matrix M4 is in effect the result of the zero-padding operation on M1.
  • The dotted rectangle R1 in Fig. 4 marks a 3*3 to-be-convolved matrix M401 centered on the matrix element 18; M401 is shown on the right side of Fig. 4.
  • Moving the dotted rectangle R1 to the right with a step size of 1 yields, in turn, the to-be-convolved matrices M401 to M405 centered on the matrix elements of the first data row.
  • Performing the same operation row by row on the remaining rows finally yields 25 to-be-convolved matrices, M401 to M425.
  • The to-be-convolved matrices M401 to M425 are in one-to-one correspondence with the matrix elements of the data matrix M1 to be processed.
  • The step size of the dotted rectangle R1 in this embodiment is 1, i.e. it moves by one matrix element at a time; the present invention does not limit the step size.
  • FIG. 5 is a schematic diagram showing the multiplication of each of the to-be-convolved matrices and the convolution kernel matrix in this embodiment.
  • The artificial intelligence calculation auxiliary processing device includes a multiplier (not shown) for multiplying the convolution kernel matrix with each to-be-convolved matrix. Specifically, the matrix elements of each to-be-convolved matrix are multiplied element-wise with the matrix elements of the convolution kernel matrix to obtain the corresponding multiplication result matrices M501 to M525. It should be noted that the multiplication result matrices M501 to M525 are in one-to-one correspondence with the matrix elements of the data matrix M1 to be processed.
  • Fig. 6 is a diagram showing the addition of the multiplication result matrices in this embodiment.
  • The artificial intelligence calculation auxiliary processing device includes an adder (not shown) which, for each multiplication result matrix M501 to M525, adds up the matrix elements of the multiplication result matrix to obtain the corresponding convolution result value. For example, adding the matrix elements of the multiplication result matrix M501 yields the convolution result value 32; proceeding in the same way for all 25 result matrices finally yields the convolution result matrix M6.
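The whole pipeline of FIGS. 3-6 (preset zero matrix, placement, W*W windows, element-wise multiplication, and summation) can be modelled in a few lines of software. The sketch below is illustrative only: the concrete matrices of the figures are not reproduced here, so the sample input and the identity kernel are invented for the check.

```python
def conv2d_same(data, kernel):
    """Zero-pad an n*m matrix into an (n+W-1)*(m+W-1) preset zero matrix,
    then slide a W*W window with step 1, multiply element-wise, and sum."""
    n, m = len(data), len(data[0])
    w = len(kernel)                        # odd kernel order W
    p = (w - 1) // 2
    N, M = n + w - 1, m + w - 1
    pad = [[0] * M for _ in range(N)]      # the preset zero matrix
    for i in range(n):                     # place the data matrix; no zero
        for j in range(m):                 # positions need to be computed
            pad[i + p][j + p] = data[i][j]
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):                 # window centered on each data element
            out[i][j] = sum(pad[i + a][j + b] * kernel[a][b]
                            for a in range(w) for b in range(w))
    return out

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
sample = [[r * 5 + c for c in range(5)] for r in range(5)]
assert conv2d_same(sample, identity) == sample  # identity kernel returns the input
```

As in the embodiment, the output has the same n*m size as the input, because the zero matrix adds exactly (W - 1)/2 rows and columns of zeros on each side.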
  • the artificial intelligence calculation auxiliary processing device provided by the present invention convolutes the n*m matrix in the storage module with the convolution kernel, and outputs a convolution result matrix of order n*m.
  • The artificial intelligence calculation auxiliary processing device builds a zero-padding operation system on a hardware structure; moreover, the present invention presets a zero matrix in the memory into which the data matrix to be processed is placed to accomplish the zero-padding operation, with no need to compute the number or positions of padded zeros.
  • Compared with computing the zero-padding operation in software on the CPU, the invention greatly reduces the computational load of the system, improves the efficiency of the zero-padding operation, and speeds up the response of operations such as image processing.
  • In an embodiment, the n*m matrix is divided into a plurality of block matrices of the same size, and the control module takes the n*m matrix out of the storage module and places it into the N*M zero matrix of the memory by fetching each block matrix in turn and filling it into the N*M zero matrix according to a preset placement condition. This is described below with a specific embodiment.
  • FIG. 7 a schematic diagram of an artificial intelligence calculation auxiliary processing apparatus in an embodiment of the present invention is shown.
  • the processing device includes a storage module 71 with a data matrix M7 to be processed.
  • the data matrix M7 to be processed is a 4*4 matrix
  • Taking a 2*2 matrix as the block unit, the data matrix M7 can be divided into four block matrices.
  • a rectangular dotted frame R2 represents one of the block matrices.
  • the processing device includes a memory 72 having a zero matrix M8 disposed therein, the zero matrix M8 being a 6*6 matrix.
  • The region of the zero matrix M8 that stores the data matrix M7 to be processed starts from the matrix element in the second row and second column, whose storage address is 0x00220000.
  • The control module (not shown) uses this storage address as the start address and places the first block matrix of the data matrix M7 into the region marked by the dotted rectangle R3.
  • The control module then places each remaining block matrix in its corresponding position in the zero matrix M8 according to the start address and the size of the block matrix. For example, after placing the first block matrix into the zero matrix M8, the control module stores the second block matrix into the zero matrix M8 using the storage address 0x00220004 as its start address, and so on, until the entire data matrix to be processed has been placed into the zero matrix M8.
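The address-driven placement just described can be modelled with a flat array standing in for the memory. The sketch below is a software illustration only: addresses are counted in whole elements rather than bytes (the element width of the actual hardware is not specified here), using the 6*6 zero matrix M8 and the 2*2 blocks of the embodiment.

```python
BASE = 0x00220000                 # address of the element in row 2, column 2 of M8
COLS = 6                          # the zero matrix M8 is 6*6
ORIGIN = 1 * COLS + 1             # flat index of that element in the memory image

def place_block(memory, block, start_addr):
    """Copy one block matrix into the flat memory image at element address start_addr."""
    off_row, off_col = divmod(start_addr - BASE, COLS)
    for r, line in enumerate(block):
        for c, value in enumerate(line):
            memory[ORIGIN + (off_row + r) * COLS + (off_col + c)] = value

memory = [0] * (6 * 6)            # the preset 6*6 zero matrix, flattened row-major
place_block(memory, [[1, 2], [3, 4]], BASE)        # first block: rows 2-3, cols 2-3
place_block(memory, [[5, 6], [7, 8]], BASE + 2)    # second block: rows 2-3, cols 4-5
assert memory[1 * 6 + 1 : 1 * 6 + 5] == [1, 2, 5, 6]
assert memory[2 * 6 + 1 : 2 * 6 + 5] == [3, 4, 7, 8]
```

In the embodiment the second block starts at 0x00220004, reflecting the byte width of the hardware's elements; here the address simply steps by whole elements, which keeps the placement logic the same.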
  • The artificial intelligence calculation auxiliary processing method extracts data from the storage module through the control module and divides the data matrix to be processed into a plurality of block matrices of the same size, thereby greatly improving the efficiency of data extraction and speeding up the system's response.
  • The invention also provides an artificial intelligence calculation auxiliary processing method applied to the control module, which specifically includes: taking the data matrix to be processed out of the storage module; and placing it into the zero matrix in the memory, so that the data matrix to be processed can form, centered on any one of its first matrix elements, a to-be-convolved matrix of size W*W for the convolution kernel matrix to perform convolution calculation according to a preset step size, where W is the size of the convolution kernel matrix.
  • the implementation manner of the artificial intelligence calculation auxiliary processing method is similar to the implementation manner of the artificial intelligence calculation auxiliary processing device, and therefore will not be described again.
  • the aforementioned computer program can be stored in a computer readable storage medium.
  • When executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • the invention also provides an artificial intelligence calculation auxiliary processing terminal, comprising: a processor and a memory.
  • The memory is used to store a computer program, and the processor executes the computer program stored in the memory so as to cause the terminal to perform the artificial intelligence calculation auxiliary processing method.
  • The memory mentioned above may include random access memory (RAM), and may also include non-volatile memory, such as at least one disk storage.
  • The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; or a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • In summary, the artificial intelligence calculation auxiliary processing method and device build a zero-padding operation system on a hardware structure, with a zero matrix preset in the memory into which the data matrix to be processed is placed to realize the zero-padding operation, without computing parameters such as the number or positions of padded zeros. This greatly reduces the computational load of the system, improves the efficiency of the zero-padding operation, and speeds up the response of operations such as image processing. Therefore, the present invention effectively overcomes the shortcomings of the prior art and has high industrial value.

Abstract

An artificial intelligence-based computer-aided processing device, comprising: a plurality of storage modules (11) storing a data matrix to be processed; a memory (12) provided with a zero matrix; and a control module (13) for taking the data matrix to be processed out of the storage modules (11) and placing it into the zero matrix in the memory, such that the data matrix to be processed can form, with any first matrix element thereof as a center, a to-be-convolved matrix of size W*W for the convolution kernel matrix to perform convolution computation according to a preset step size, wherein W is the size of the convolution kernel matrix. By building a zero-padding operation system on a hardware structure and presetting a zero matrix in the memory (12) into which the data matrix to be processed is placed, the zero-padding operation can be implemented without computing parameters such as the number or positions of padded zeros, so that the computational load of the system is greatly reduced, the efficiency of the zero-padding operation is improved, and the response speed of operations such as image processing is increased.

Description

Artificial intelligence calculation auxiliary processing device, method, storage medium, and terminal

Technical field
The present invention relates to the field of artificial intelligence, and in particular to an artificial intelligence calculation auxiliary processing method, apparatus, computer-readable storage medium, and terminal.
Background
Nowadays, with the development of the artificial intelligence industry, technologies in various fields of artificial intelligence have arisen; among them, convolutional neural networks have become a research hotspot in many of these fields.

As early as the 1960s, scientists studying the neurons responsible for local sensitivity and direction selectivity in the cat's cerebral cortex found that their unique network structure could effectively reduce the complexity of a feedback neural network, and thereupon proposed the convolutional neural network. Since then, many more researchers have joined the study of convolutional neural networks.
Generally, in order for the matrix of feature values extracted by convolution to have the same size as the original data matrix before convolution, the original data matrix needs to be zero-padded.

However, in the prior art the zero-padding operation can usually only be performed in software; the amount of CPU computation involved is very large, which makes zero-padding very inefficient.
Summary of the invention
In view of the above shortcomings of the prior art, an object of the present invention is to provide an artificial intelligence calculation auxiliary processing method and apparatus, a computer-readable storage medium, and a terminal for solving the technical problems of the prior art, such as the low efficiency and heavy computational load of the zero-padding operation.
To achieve the above and other related objects, the present invention provides an artificial intelligence calculation auxiliary processing apparatus, comprising: a plurality of storage modules storing a data matrix to be processed; a memory provided with a zero matrix; and a control module for taking the data matrix to be processed out of the storage modules and placing it into the zero matrix in the memory, so that the data matrix to be processed can form, centered on any one of its first matrix elements, a to-be-convolved matrix of size W*W for the convolution kernel matrix to perform convolution calculation according to a preset step size, where W is the size of the convolution kernel matrix.
In an embodiment of the invention, the data matrix to be processed includes an n*m matrix and the zero matrix includes an N*M zero matrix, where N = n + W - 1 and M = m + W - 1.

In an embodiment of the invention, taking the data matrix to be processed out of the storage module and placing it into the zero matrix in the memory specifically includes: the control module takes the n*m matrix out of the storage module and places it into the N*M zero matrix of the memory to form a padding matrix, in which rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N are zero, columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero, and the remaining region is the n*m matrix.
In an embodiment of the invention, taking the data matrix to be processed out of the storage module and placing it into the zero matrix specifically includes: the n*m matrix is divided into a plurality of block matrices of the same size, and the control module takes out each block matrix in turn and fills it into the N*M zero matrix according to a preset placement condition.
In an embodiment of the invention, the preset placement condition includes: the control module places each block matrix into the N*M zero matrix in turn according to a preset start address in the memory and the size of the block matrix.
In an embodiment of the invention, the storage module includes a dual storage module.
In an embodiment of the invention, the artificial intelligence calculation auxiliary processing device includes: a multiplier for multiplying each to-be-convolved matrix with the convolution kernel to obtain a corresponding multiplication result matrix, each multiplication result matrix being aligned with a first matrix element; and an adder for adding the second matrix elements in each multiplication result matrix to obtain a corresponding convolution result value, the convolution result values being aligned with the first matrix elements so as to form a convolution result matrix of size n*m.
To achieve the above and other related objects, the present invention provides an artificial intelligence calculation auxiliary processing method applied to a control module, the method comprising: taking an n*m matrix out of a storage module; and placing the n*m matrix into an N*M zero matrix in the memory, so that the n*m matrix can form, centered on any one of its first matrix elements, a to-be-convolved matrix of size W*W for convolution calculation with the convolution kernel, where N = n + W - 1, M = m + W - 1, and W is the order of the convolution kernel.
In an embodiment of the invention, the control module taking the n*m matrix out of the storage module specifically includes: the n*m matrix is divided into a plurality of block matrices of the same size, and the control module takes out each block matrix in turn.
In an embodiment of the invention, placing the n*m matrix into the N*M zero matrix of the memory specifically includes: the control module fills each block matrix into the N*M zero matrix in turn according to the start address and the size of the block matrix.
To achieve the above and other related objects, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the artificial intelligence calculation auxiliary processing method.
To achieve the above and other related objects, the present invention provides an artificial intelligence calculation auxiliary processing terminal, comprising a processor and a memory; the memory is used to store a computer program, and the processor is configured to execute the computer program stored in the memory so as to cause the terminal to perform the artificial intelligence calculation auxiliary processing method.
As described above, the artificial intelligence calculation auxiliary processing method, apparatus, computer-readable storage medium, and terminal of the present invention have the following beneficial effects: a zero-padding operation system is built on a hardware structure, and a zero matrix is preset in the memory so that the zero-padding operation is accomplished simply by placing the data matrix to be processed into it, without computing parameters such as the number or positions of padded zeros. This greatly reduces the computational load of the system, improves the efficiency of the zero-padding operation, and speeds up the response of operations such as image processing. The present invention therefore effectively overcomes the shortcomings of the prior art and has high industrial value.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of an artificial intelligence computing auxiliary processing device according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of an artificial intelligence computing auxiliary processing procedure according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of an artificial intelligence computing auxiliary processing procedure according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of an artificial intelligence computing auxiliary processing procedure according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of an artificial intelligence computing auxiliary processing procedure according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of an artificial intelligence computing auxiliary processing procedure according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of an artificial intelligence computing auxiliary processing device according to an embodiment of the present invention.
FIG. 8 is a schematic diagram of an artificial intelligence computing auxiliary processing method according to an embodiment of the present invention.
DESCRIPTION OF REFERENCE NUMERALS
11          Storage module
12          Memory
13          Control module
M1          To-be-processed data matrix
M2          Convolution kernel matrix
M3          Zero matrix
M4          Padding matrix
M401~M425   Matrices to be convolved
M501~M525   Multiplication result matrices
M6          Convolution result matrix
M7          To-be-processed data matrix
M8          Zero matrix
71          Storage module
72          Memory
R1          Rectangular dashed box
R2          Rectangular dashed box
R3          Rectangular dashed box
S801~S802   Steps
DETAILED DESCRIPTION OF THE EMBODIMENTS
The embodiments of the present invention are described below by way of specific examples, and those skilled in the art can readily understand other advantages and effects of the present invention from the disclosure of this specification. The present invention may also be implemented or applied through other different specific embodiments, and the details in this specification may be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that, where there is no conflict, the following embodiments and the features in the embodiments may be combined with each other.
It should be noted that the figures provided in the following embodiments merely illustrate the basic concept of the present invention in a schematic manner; the figures therefore show only the components related to the present invention rather than the actual number, shapes, and sizes of components in a real implementation. In practice, the type, number, and proportion of each component may vary arbitrarily, and the component layout may also be more complicated.
The present invention provides an artificial intelligence computing auxiliary processing device for performing a zero-padding operation on a to-be-processed data matrix.
As shown in FIG. 1, an artificial intelligence computing auxiliary processing device according to an embodiment of the present invention includes a plurality of storage modules 11, a memory 12, and a control module 13. The plurality of storage modules 11 store an n*m matrix, where n and m are both natural numbers greater than or equal to 1. An N*M zero matrix is stored in the memory 12, and the control module 13 is configured to take out the n*m matrix and place it in the N*M zero matrix.
Preferably, the storage module 11 is a dual storage module comprising two storage modules: while one of them is scanning out data, the other is processing data; in the next cycle, the module that has finished processing begins to scan out, and the module that was scanning out begins to process. In other words, the two storage modules are always in a state where one is scanning and the other is processing, so that every frame of the final output appears to have gone through both the processing step and the scan-out step, while in fact the two memories complete the work cooperatively, thereby achieving the technical effect of multiplying data transmission and processing efficiency.
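The ping-pong behavior of the dual storage module described above can be sketched in software as follows (a minimal analogy only; the buffer layout and the toy "process" transform are illustrative assumptions, not part of the original disclosure):

```python
# Sketch of the dual ("ping-pong") storage module: in every cycle one
# buffer is processing the incoming frame while the other scans out the
# previously processed frame, and the two roles swap on the next cycle.
def run_ping_pong(frames):
    buffers = [None, None]      # the two storage modules
    outputs = []
    active = 0                  # index of the buffer currently processing
    for frame in frames:
        buffers[active] = [x * 2 for x in frame]   # "process" step (toy transform)
        other = 1 - active
        if buffers[other] is not None:             # "scan out" the previous frame
            outputs.append(buffers[other])
        active = other                             # swap roles for the next cycle
    if buffers[1 - active] is not None:            # flush the last processed frame
        outputs.append(buffers[1 - active])
    return outputs
```

Every output frame has gone through both steps, yet in any given cycle each physical buffer performs only one of them, which is where the throughput gain comes from.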
Preferably, the control module transfers data by DMA. Specifically, DMA stands for Direct Memory Access, a mechanism by which a controller can access data in memory directly without going through the CPU. In DMA mode, the CPU only needs to issue a command to the control module and let the control module handle the data transfer; after the data transfer is complete, the result is reported back to the CPU. This greatly reduces the CPU occupancy rate and saves substantial system resources.
The N*M zero matrix is a matrix composed of N*M zero-valued elements. Specifically,

N = n + W − 1,  M = m + W − 1,

where W is the order of the W*W convolution kernel matrix. The convolution kernel matrix is a weight matrix used to take a weighted average of the matrix data; it acts as the filter in a convolution calculation. The order of the weight matrix is usually taken to be odd, so that the position of the matrix can be determined by the center element of the odd-order matrix.
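Under this definition the padded dimensions follow directly from the input size and the kernel order; a quick sketch (plain Python; the function name is chosen here for illustration):

```python
def padded_shape(n, m, W):
    """Size of the preset zero matrix for an n*m input and a W*W kernel.

    Padding (W - 1) // 2 zeros on every side (W odd) gives
    N = n + W - 1 rows and M = m + W - 1 columns, so that a W*W
    window can be centered on any element of the original matrix.
    """
    assert W % 2 == 1, "the kernel order is taken to be odd"
    return n + W - 1, m + W - 1

# The 5*5 input with a 3*3 kernel from the embodiment yields a 7*7 zero matrix.
```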
Specifically, the control module takes the n*m matrix out of the storage module and places it in the N*M zero matrix of the memory to form a padding matrix. In the padding matrix, rows 1 through (W−1)/2 and rows n+(W−1)/2+1 through N are zero; columns 1 through (W−1)/2 and columns m+(W−1)/2+1 through M are zero; the remaining region holds the n*m matrix.
Taking the n*m matrix out of the storage module 11 and placing it into the N*M zero matrix in the memory 12 forms the padding matrix, so that a W*W matrix to be convolved can be formed centered on any matrix element of the n*m matrix. The following specific embodiment illustrates the process of convolving the n*m matrix with a W*W convolution kernel.
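The placement of the n*m matrix into the preset zero matrix can be sketched as follows (pure Python; the function name and the list-of-lists representation are illustrative assumptions):

```python
def place_into_zero_matrix(data, W):
    """Form the padding matrix by copying an n*m matrix into a preset
    (n+W-1)*(m+W-1) zero matrix, offset by (W-1)//2 in each direction."""
    n, m = len(data), len(data[0])
    pad = (W - 1) // 2
    N, M = n + W - 1, m + W - 1
    filled = [[0] * M for _ in range(N)]     # the preset zero matrix
    for i in range(n):                       # a straight copy: no zero counts
        for j in range(m):                   # or positions need to be computed
            filled[pad + i][pad + j] = data[i][j]
    return filled
```

Note that the padding itself costs nothing beyond the copy: the zeros are already in place in the preset matrix, which is the point of the hardware scheme.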
FIGS. 2 to 6 are schematic diagrams of an artificial intelligence computing auxiliary processing procedure according to an embodiment of the present invention, in which:
FIG. 2 shows the to-be-processed data matrix M1 and the convolution kernel matrix M2 of this embodiment. The to-be-processed data matrix is a 5*5 matrix and the convolution kernel matrix is a 3*3 matrix; the values in each matrix are its matrix elements.
FIG. 3 shows the zero matrix M3 of this embodiment. The zero matrix is set in the memory, and from N = n + W − 1 and M = m + W − 1 with n = m = 5 and W = 3, the zero matrix is a 7*7 matrix.
FIG. 4 shows the padding matrix M4 of this embodiment. The padding matrix is formed by placing the to-be-processed data matrix M1 into the zero matrix M3. Since rows 1 through (W−1)/2 and rows n+(W−1)/2+1 through N are zero, columns 1 through (W−1)/2 and columns m+(W−1)/2+1 through M are zero, and the remaining region holds the n*m matrix, it follows that the first row, the seventh row, the first column, and the seventh column of the padding matrix M4 are all 0, while the region of rows 2 to 6 and columns 2 to 6 holds the to-be-processed data matrix M1. The padding matrix M4 is thus the matrix obtained by applying the zero-padding operation to the to-be-processed data matrix M1.
The rectangular dashed box R1 in FIG. 4 represents the 3*3 matrix to be convolved M401, centered on the matrix element 18 and shown on the right side of FIG. 4. Moving the rectangular dashed box R1 to the right with a stride of 1 successively yields the matrices to be convolved M401 to M405, centered on the matrix elements of the first row of the to-be-processed data. Repeating the same operation row by row for the remaining rows finally yields a total of 25 matrices to be convolved, M401 to M425.
It should be noted that the matrices to be convolved M401 to M425 correspond one-to-one to the matrix elements of the to-be-processed data matrix M1. In addition, although the rectangular dashed box R1 moves with a stride of 1 in this embodiment, i.e., by one matrix element at a time, the present invention does not limit the stride with which the dashed box moves.
FIG. 5 is a schematic diagram of multiplying each matrix to be convolved by the convolution kernel matrix in this embodiment. The artificial intelligence computing auxiliary processing device includes a multiplier (not shown) configured to multiply the convolution kernel matrix by each of the matrices to be convolved. Specifically, each matrix element of a matrix to be convolved is multiplied by the corresponding matrix element of the convolution kernel matrix, yielding the corresponding multiplication result matrices M501 to M525. It should be noted that the multiplication result matrices M501 to M525 correspond one-to-one to the matrix elements of the to-be-processed data matrix M1.
FIG. 6 is a schematic diagram of the addition performed on each multiplication result matrix in this embodiment. The artificial intelligence computing auxiliary processing device includes an adder (not shown) configured to perform the following operation on each of the multiplication result matrices M501 to M525: the matrix elements of the multiplication result matrix are added together to obtain the corresponding convolution result value. For example, adding the matrix elements of the multiplication result matrix M501 gives the convolution result value 32; performing the addition on all 25 result matrices in the same way finally yields the convolution result matrix M6.
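Putting the window extraction, element-wise multiplication, and accumulation steps together, the whole procedure can be sketched as follows (pure Python, self-contained; the actual element values of M1 and M2 appear only in the figures, so the data used in testing are illustrative):

```python
def convolve_same(data, kernel):
    """'Same'-size convolution as in the embodiment: zero-pad, slide a W*W
    window with stride 1, multiply element-wise with the kernel, and sum
    each product window to get one convolution result value."""
    n, m, W = len(data), len(data[0]), len(kernel)
    pad = (W - 1) // 2
    N, M = n + W - 1, m + W - 1
    filled = [[0] * M for _ in range(N)]          # preset zero matrix
    for i in range(n):
        for j in range(m):
            filled[pad + i][pad + j] = data[i][j]
    result = [[0] * m for _ in range(n)]          # n*m convolution result matrix
    for i in range(n):
        for j in range(m):                        # window centered on element (i, j)
            result[i][j] = sum(
                filled[i + a][j + b] * kernel[a][b]
                for a in range(W) for b in range(W)
            )
    return result
```

Because the input is padded to (n+W−1)*(m+W−1), the output has the same n*m size as the input, matching the n*m convolution result matrix described above.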
As can be seen from the above specific embodiment, the artificial intelligence computing auxiliary processing device provided by the present invention convolves the n*m matrix in the storage module with the convolution kernel and outputs a convolution result matrix of order n*m.
It is worth noting that the artificial intelligence computing auxiliary processing device provided by the present invention builds the zero-padding operation in hardware; moreover, a zero matrix is preset in the memory, so that the zero-padding operation is completed simply by placing the to-be-processed data matrix into it, with no need to compute parameters such as the number or positions of the padded zeros. Compared with the prior-art approach of performing zero padding in software on the CPU, the present invention greatly reduces the computational load of the system, improves the efficiency of the zero-padding operation, and speeds up the response of operations such as image processing.
Optionally, in an embodiment, the n*m matrix is divided into a plurality of block matrices of equal size, and the control module takes the n*m matrix out of the storage module and places it in the N*M zero matrix of the memory as follows: the control module takes out each block matrix in turn and fills each block matrix into the N*M zero matrix according to a preset placement condition. This is described below with a specific embodiment.
As shown in FIG. 7, a schematic diagram of an artificial intelligence computing auxiliary processing device according to an embodiment of the present invention, the processing device includes a storage module 71 holding a to-be-processed data matrix M7. The to-be-processed data matrix M7 is a 4*4 matrix; taking 2*2 matrices as block matrices, it can be divided into four block matrices, one of which is represented by the rectangular dashed box R2 in FIG. 7.
The processing device includes a memory 72 holding a zero matrix M8, which is a 6*6 matrix. The region of the zero matrix M8 used to store the to-be-processed data matrix M7 starts at the matrix element in the second row and second column, whose storage address is 0x00220000. A control module (not shown) takes this storage address as the start address and places the first block matrix of the to-be-processed data matrix M7 into the rectangular dashed box R3.
The control module places each block matrix at its corresponding position in the zero matrix M8 in turn, according to the start address and the size of the block matrices. For example, after placing the first block matrix into the zero matrix M8, the control module places the second block matrix into the zero matrix M8 using the storage address 0x00220004 as the start address, and so on, until the entire to-be-processed data matrix has been placed into the zero matrix M8.
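The block-wise placement can be sketched as follows (pure Python; two-dimensional indices stand in for the storage addresses, and the row-major traversal order of the 2*2 blocks is an assumption based on the embodiment):

```python
def place_blocks(data, W, block):
    """Copy a data matrix into the preset zero matrix block by block.

    Each block lands at a position derived from the fixed start offset
    and the block size, so no per-element placement needs computing."""
    n, m = len(data), len(data[0])
    pad = (W - 1) // 2                      # start offset: row 2 / column 2 here
    filled = [[0] * (m + W - 1) for _ in range(n + W - 1)]
    for bi in range(0, n, block):           # take out the block matrices in turn
        for bj in range(0, m, block):
            for i in range(block):
                for j in range(block):
                    filled[pad + bi + i][pad + bj + j] = data[bi + i][bj + j]
    return filled
```

For the 4*4 matrix M7 with 2*2 blocks and a 3*3 kernel, the result is the 6*6 matrix M8 with M7 occupying rows 2 to 5 and columns 2 to 5.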
With the artificial intelligence computing auxiliary processing method provided by the present invention, the control module takes the data out of the storage module, and the to-be-processed data matrix is divided into a plurality of block matrices of equal size, which greatly improves the efficiency of taking out the data and speeds up the response of the system.
The present invention further provides an artificial intelligence computing auxiliary processing method applied to a control module, which specifically includes:
S801: taking a to-be-processed data matrix out of a storage module;
S802: placing the to-be-processed data matrix in a zero matrix in a memory, so that the to-be-processed data matrix can form, centered on any one of its first matrix elements, a matrix to be convolved of size W*W, for a convolution kernel matrix to perform convolution calculation with a preset stride, where W is the size of the convolution kernel matrix.
The implementation of the artificial intelligence computing auxiliary processing method is similar to that of the artificial intelligence computing auxiliary processing device and is therefore not described again.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware related to a computer program. The computer program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
The present invention further provides an artificial intelligence computing auxiliary processing terminal, including a processor and a memory. The memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory, so as to cause the terminal to perform the artificial intelligence computing auxiliary processing method.
The above-mentioned memory may include a random access memory (RAM), and may further include a non-volatile memory, such as at least one magnetic disk memory.
The above-mentioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In summary, the artificial intelligence computing auxiliary processing method and device provided by the present invention build the zero-padding operation in hardware, and a zero matrix is preset in the memory, so that the zero-padding operation is completed simply by placing the to-be-processed data matrix into it, with no need to compute parameters such as the number or positions of the padded zeros. This greatly reduces the computational load of the system, improves the efficiency of the zero-padding operation, and speeds up the response of operations such as image processing. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial value.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes completed by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.

Claims (13)

  1. An artificial intelligence computing auxiliary processing device, comprising:
    a plurality of storage modules storing a to-be-processed data matrix;
    a memory provided with a zero matrix; and
    a control module configured to take the to-be-processed data matrix out of the storage modules and place it in the zero matrix in the memory, so that the to-be-processed data matrix can form, centered on any one of its first matrix elements, a matrix to be convolved of size W*W, for a convolution kernel matrix to perform convolution calculation with a preset stride, where W is the size of the convolution kernel matrix.
  2. The artificial intelligence computing auxiliary processing device according to claim 1, wherein:
    the to-be-processed data matrix comprises an n*m matrix, and the zero matrix comprises an N*M zero matrix,
    where N = n + W − 1 and M = m + W − 1.
  3. The artificial intelligence computing auxiliary processing device according to claim 2, wherein taking the to-be-processed data matrix out of the storage modules and placing it in the zero matrix in the memory specifically comprises:
    the control module takes the n*m matrix out of the storage modules and places it in the N*M zero matrix of the memory to form a padding matrix, wherein in the padding matrix:
    rows 1 through (W−1)/2 and rows n+(W−1)/2+1 through N are zero; columns 1 through (W−1)/2 and columns m+(W−1)/2+1 through M are zero; and the remaining region holds the n*m matrix.
  4. The artificial intelligence computing auxiliary processing device according to claim 2, wherein taking the to-be-processed data matrix out of the storage modules and placing it in the zero matrix in the memory specifically comprises:
    the n*m matrix is divided into a plurality of block matrices of equal size; the control module takes out each block matrix in turn and fills each block matrix into the N*M zero matrix according to a preset placement condition.
  5. The artificial intelligence computing auxiliary processing device according to claim 4, wherein the preset placement condition comprises:
    the control module places each block matrix into the N*M zero matrix in turn according to a preset start address in the memory and the size of the block matrices.
  6. The artificial intelligence computing auxiliary processing device according to claim 1, wherein the storage modules comprise dual storage modules.
  7. The artificial intelligence computing auxiliary processing device according to claim 1, comprising:
    a multiplier configured to multiply each matrix to be convolved by the convolution kernel to obtain a corresponding multiplication result matrix, wherein each multiplication result matrix corresponds to one of the first matrix elements; and
    an adder configured to add the second matrix elements in each multiplication result matrix to obtain a corresponding convolution result value, wherein each convolution result value corresponds to one of the first matrix elements, so as to form a convolution result matrix of size n*m.
  8. An artificial intelligence computing auxiliary processing method applied to a control module, the method comprising:
    taking a to-be-processed data matrix out of a storage module; and
    placing the to-be-processed data matrix in a zero matrix in a memory, so that the to-be-processed data matrix can form, centered on any one of its first matrix elements, a matrix to be convolved of size W*W, for a convolution kernel matrix to perform convolution calculation with a preset stride, where W is the size of the convolution kernel matrix.
  9. The artificial intelligence computing auxiliary processing method according to claim 8, wherein:
    the to-be-processed data matrix comprises an n*m matrix, and the zero matrix comprises an N*M zero matrix,
    where N = n + W − 1 and M = m + W − 1.
  10. The artificial intelligence computing auxiliary processing method according to claim 8, wherein taking the to-be-processed data matrix out of the storage module and placing it in the zero matrix in the memory specifically comprises:
    the n*m matrix is divided into a plurality of block matrices of equal size, which are taken out by the control module in turn.
  11. The artificial intelligence computing auxiliary processing method according to claim 8, wherein taking the to-be-processed data matrix out of the storage module and placing it in the zero matrix in the memory specifically comprises:
    the control module fills each block matrix into the N*M zero matrix in turn according to a start address and the size of the block matrices.
  12. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the artificial intelligence computing auxiliary processing method according to any one of claims 8 to 11.
  13. An artificial intelligence computing auxiliary processing terminal, comprising a processor and a memory, wherein:
    the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory, so as to cause the terminal to perform the artificial intelligence computing auxiliary processing method according to any one of claims 8 to 11.
PCT/CN2018/072662 2018-01-15 2018-01-15 Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal WO2019136750A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/072662 WO2019136750A1 (en) 2018-01-15 2018-01-15 Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal
CN201880002144.7A CN109313663B (en) 2018-01-15 2018-01-15 Artificial intelligence calculation auxiliary processing device, method, storage medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/072662 WO2019136750A1 (en) 2018-01-15 2018-01-15 Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal

Publications (1)

Publication Number Publication Date
WO2019136750A1 true WO2019136750A1 (en) 2019-07-18

Family

ID=65221779

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/072662 WO2019136750A1 (en) 2018-01-15 2018-01-15 Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal

Country Status (2)

Country Link
CN (1) CN109313663B (en)
WO (1) WO2019136750A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111257913A (en) * 2019-11-29 2020-06-09 交通运输部长江通信管理局 Beidou satellite signal capturing method and device

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11481471B2 (en) * 2019-08-16 2022-10-25 Meta Platforms, Inc. Mapping convolution to a matrix processor unit
CN112825151A (en) * 2019-11-20 2021-05-21 上海商汤智能科技有限公司 Data processing method, device and equipment
CN114730331A (en) * 2019-12-18 2022-07-08 华为技术有限公司 Data processing apparatus and data processing method
CN111553224A (en) * 2020-04-21 2020-08-18 中国电子科技集团公司第五十四研究所 Large remote sensing image block distribution method
CN112561943B (en) * 2020-12-23 2022-11-22 清华大学 Image processing method based on data multiplexing of pulse array convolution operation
CN117574036B (en) * 2024-01-16 2024-04-12 北京壁仞科技开发有限公司 Computing device, method of operation, and machine-readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6134571A (en) * 1998-04-29 2000-10-17 Hewlett-Packard Company Implicit DST-based filter operating in the DCT domain
CN1374759A * 2001-03-09 2002-10-16 华为技术有限公司 High-efficiency convolutional coding method
CN104574277A (en) * 2015-01-30 2015-04-29 京东方科技集团股份有限公司 Image interpolation method and image interpolation device
CN107301668A * 2017-06-14 2017-10-27 成都四方伟业软件股份有限公司 Image compression method based on sparse matrices and convolutional neural networks

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100425017C * 2005-12-08 2008-10-08 西安电子科技大学 Precoding-based encoder for parallel convolutional LDPC codes and fast encoding method therefor
CN101192833B * 2006-11-28 2011-12-07 华为技术有限公司 Device and method for parallel encoding of low-density parity-check (LDPC) codes
CN103038660B * 2010-03-23 2016-03-02 马克思-普朗克科学促进协会 Method and apparatus for reconstructing a sequence of MR images using a regularized nonlinear inverse reconstruction process
CN104104394A * 2014-06-13 2014-10-15 哈尔滨工业大学 Signal reconstruction method and system for acquiring the sensing matrix of a random demodulation system based on MLS sequences
CN105334542B * 2015-10-23 2017-07-07 中南大学 Fast, high-accuracy forward modeling method for the gravitational field of complex geological bodies with arbitrary density distribution
CN106447030B (en) * 2016-08-30 2021-09-21 深圳市诺比邻科技有限公司 Method and system for optimizing computing resources of convolutional neural network
CN107451654B * 2017-07-05 2021-05-18 深圳市自行科技有限公司 Method for accelerating convolutional neural network operations, server, and storage medium


Also Published As

Publication number Publication date
CN109313663A (en) 2019-02-05
CN109313663B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
WO2019136750A1 (en) Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal
US20200234124A1 (en) Winograd transform convolution operations for neural networks
CN107862650B Method for accelerating computation of CNN convolutions on two-dimensional images
CN111247527B (en) Method and device for determining characteristic images in convolutional neural network model
CN109416755B (en) Artificial intelligence parallel processing method and device, readable storage medium and terminal
GB2554711A (en) Buffer addressing for a convolutional neural network
WO2019136764A1 (en) Convolver and artificial intelligence processing device using the same
TW201942808A (en) Deep learning accelerator and method for accelerating deep learning operations
WO2019127517A1 (en) Data processing method and device, dma controller, and computer readable storage medium
CN110989920B (en) Energy efficient memory system and method
WO2020199476A1 (en) Neural network acceleration method and apparatus based on pulsation array, and computer device and storage medium
US11164032B2 (en) Method of performing data processing operation
EP3093757A2 (en) Multi-dimensional sliding window operation for a vector processor
WO2019184888A1 (en) Image processing method and apparatus based on convolutional neural network
CN109313723B (en) Artificial intelligence convolution processing method and device, readable storage medium and terminal
WO2024027039A1 (en) Data processing method and apparatus, and device and readable storage medium
JP2023541350A (en) Table convolution and acceleration
US11874898B2 (en) Streaming-based artificial intelligence convolution processing method and apparatus, readable storage medium and terminal
JP7332722B2 (en) Data processing method, device, storage medium and electronic equipment
CN110738317A (en) FPGA-based deformable convolution network operation method, device and system
CN112016522B (en) Video data processing method, system and related components
CN106909320B (en) Method, device and system for expanding and transmitting multidimensional data
US20220405349A1 (en) Data processing method and apparatus, and related product
KR102470027B1 (en) Method and apparatus for extracting image data in parallel from multiple convolution windows, device, and computer-readable storage medium
CN114764615A (en) Convolution operation implementation method, data processing method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18899117

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18899117

Country of ref document: EP

Kind code of ref document: A1