CN111247527B - Method and device for determining characteristic images in convolutional neural network model - Google Patents


Publication number: CN111247527B
Application number: CN201780096076.0A
Authority: CN (China)
Prior art keywords: convolution kernels, convolution, group, kernels, neural network
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN111247527A (en)
Inventor: Hu Hui (胡慧)
Current assignee: Huawei Technologies Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Publication of application CN111247527A; application granted; publication of grant CN111247527B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/40 — Extraction of image or video features

Abstract

A method and a device for determining feature images in a convolutional neural network model, belonging to the technical field of model training. The method comprises the following steps: acquiring a plurality of input images of a target processing layer in a convolutional neural network model (S210); acquiring at least one group of convolution kernels of the target processing layer (S220), where different convolution kernels in the same group contain the same elements but in different arrangement orders; performing convolution calculation on different input images based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and summing the plurality of intermediate matrices to obtain a feature image (S230). Because different convolution kernels in a group contain the same elements, merely rearranged, the method reduces the resources occupied by storing the convolution kernels, the number of times the convolution kernels are read, the amount of computation generated when the convolutional layer determines a feature image, and the system resources consumed in the calculation process.

Description

Method and device for determining characteristic images in convolutional neural network model
Technical Field
The present disclosure relates to the field of model training technology, and more particularly, to a method and apparatus for determining feature images in a convolutional neural network model.
Background
A convolutional neural network is composed of convolutional layers, fully connected layers, activation functions, and so on, and the output of a single convolutional layer comprises a plurality of feature images. When training a convolutional neural network model, a large number of samples need to be processed, and the computation performed in the convolutional layers accounts for roughly 90% of the total computation of the whole training process.
For any convolutional layer, the number of convolution kernels may be determined by the number of input images and the number of output feature images, and a corresponding number of convolution kernels may be generated. Each convolution kernel may be a small matrix, such as a 3×3 matrix, and each input image may be regarded as a large matrix. The processing of the convolutional layer may be as follows: convolve an input image with a convolution kernel, i.e., extract from the input image every sub-matrix of the same size as the convolution kernel, multiply each extracted sub-matrix element-wise with the convolution kernel, and add the products to obtain one value; all the values obtained in this way form an intermediate matrix. Convolving each input image with a convolution kernel yields one intermediate matrix, and adding the intermediate matrices yields a feature image.
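As a concrete illustration of the processing just described, the following minimal NumPy sketch (array sizes, seeds, and names are illustrative, not taken from the patent) convolves each input image with its kernel to produce an intermediate matrix, then sums the intermediate matrices into one feature image:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image, multiplying aligned elements and
    summing them, to produce one intermediate matrix ('valid' convolution)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Extract the sub-matrix the same size as the kernel and
            # accumulate the element-wise products into a single value.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

rng = np.random.default_rng(0)
inputs = [rng.standard_normal((8, 8)) for _ in range(4)]   # 4 input images
kernels = [rng.standard_normal((3, 3)) for _ in range(4)]  # one kernel per image
intermediate = [conv2d_valid(img, k) for img, k in zip(inputs, kernels)]
feature_image = np.sum(intermediate, axis=0)               # one 6x6 feature image
```

This is only a reference implementation of the baseline scheme; the optimization described below avoids repeating this full multiply–add work for every kernel in a group.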
In carrying out the present disclosure, the inventors found that there are at least the following problems:
Because a convolutional neural network comprises many convolutional layers, each convolutional layer has to output many feature images, and each feature image corresponds to many convolution kernels. Each convolution kernel entails a large amount of computation, so the total computation of the whole training process grows exponentially. The computation performed in the convolutional layers is therefore enormous and requires a large amount of processing resources.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides the following technical solutions:
in a first aspect, there is provided a method of determining a feature image in a convolutional neural network model, the method comprising:
acquiring a plurality of input images of a target processing layer in a convolutional neural network model;
acquiring at least one group of convolution kernels of the target processing layer, where different convolution kernels in the same group contain the same elements but in different arrangement orders;
performing convolution calculation on different input images based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and summing the plurality of intermediate matrices to obtain a feature image, where each element of an intermediate matrix is a polynomial obtained, during the convolution calculation, by multiplying the corresponding convolution kernel with the input image element-wise and adding the products.
The method provided by this embodiment acquires a plurality of input images; generates at least one group of convolution kernels, where different convolution kernels in the same group contain the same elements in different arrangement orders; performs convolution calculation on different input images based on each convolution kernel in the at least one group to obtain a plurality of intermediate matrices; and sums the plurality of intermediate matrices to obtain the feature image. Because different convolution kernels in a group contain the same elements, merely rearranged, the method reduces the resources occupied by storing the convolution kernels, the number of times the convolution kernels are read, the amount of computation generated when the convolutional layer determines a feature image, and the system resources consumed in the calculation process.
In one possible implementation manner, the summing the plurality of intermediate matrices to obtain a feature image includes:
adding the polynomials of the elements at the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image;
combining like terms in the polynomial corresponding to each element of the feature image;
and evaluating each polynomial after its like terms have been combined, to obtain the feature image.
A polynomial evaluated without combining like terms requires far more multiply–add operations than one evaluated after like terms have been combined. As the number of convolution kernels in a group increases, the number of places in the overall calculation of the feature image where the amount of computation can be reduced grows greatly, ultimately accelerating the determination of the feature image.
In one possible implementation, before acquiring the at least one set of convolution kernels of the target processing layer, the method further comprises:
randomly generating N convolution kernels, where N is a preset number of groups;
and performing element displacement on each of the N convolution kernels in units of rows and/or in units of columns to obtain M−1 different convolution kernels, where the M−1 convolution kernels and the convolution kernel before element displacement form one group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels per group.
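The grouping step above can be sketched as follows (the exact shift schedule and the helper name are illustrative assumptions, not prescribed by the text): one group is derived from a single randomly generated base kernel by cyclic displacement of its rows and columns, so every kernel in the group holds the same elements in a different order.

```python
import numpy as np

def make_kernel_group(base, m):
    """Build a group of m kernels from one base kernel by cyclic row/column
    element displacement; all kernels share the same elements in a
    different arrangement order."""
    group, seen = [base], {base.tobytes()}
    kh, kw = base.shape
    for dr in range(kh):           # displacement in units of rows
        for dc in range(kw):       # displacement in units of columns
            k = np.roll(np.roll(base, dr, axis=0), dc, axis=1)
            if k.tobytes() not in seen:
                group.append(k)
                seen.add(k.tobytes())
            if len(group) == m:
                return group
    return group

base = np.arange(9, dtype=float).reshape(3, 3)  # randomly generated in practice
group = make_kernel_group(base, 3)
# Every kernel in the group contains exactly the same multiset of elements.
assert all(sorted(k.ravel()) == sorted(base.ravel()) for k in group)
```

Only the base kernel needs to be stored and read; the other kernels of the group are recoverable from it by the displacement rule.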
Because different convolution kernels in a group contain the same elements, merely rearranged, the method reduces the resources occupied by storing the convolution kernels, the number of times the convolution kernels are read, the amount of computation generated when the convolutional layer determines a feature image, and the system resources consumed in the calculation process.
In one possible implementation, the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
In one possible implementation manner, after summing the plurality of intermediate matrices to obtain the feature image, the method further includes:
when an output result of the convolutional neural network model is obtained, determining an adjustment value for each element of each convolution kernel in the at least one group of convolution kernels according to the output result of the convolutional neural network model and a preset output result;
determining the sum of the adjustment values of the same element contained in different convolution kernels in the same group as the corrected adjustment value for that element;
and adjusting the respective convolution kernels based on the corrected adjustment value of each element.
In implementation, the convolutional neural network model contains a plurality of convolutional layers, say Z of them: the first through the (Z−1)-th convolutional layers output feature images, and the last convolutional layer, namely the Z-th, outputs the final output result. When the output result of the convolutional neural network model is obtained, it generally deviates from the preset output result, because the model is still being trained. Based on the error produced by the overall convolutional neural network model, an adjustment value for each element of each convolution kernel in the groups of convolution kernels can be determined. Then, the sum of the adjustment values of the same element contained in different convolution kernels in the same group is determined as the corrected adjustment value for that element.
In a second aspect, there is provided an apparatus for determining a feature image in a convolutional neural network model, the apparatus comprising at least one module for implementing the method for determining a feature image in a convolutional neural network model provided in the first aspect above.
In a third aspect, a terminal is provided, the terminal comprising a processor, a memory, the processor configured to execute instructions stored in the memory; the processor implements the method for determining a feature image in a convolutional neural network model provided in the first aspect described above by executing instructions.
In a fourth aspect, a computer readable storage medium is provided, comprising instructions which, when run on an origin server, cause the origin server to perform the method of determining a feature image in a convolutional neural network model provided in the first aspect above.
In a fifth aspect, a computer program product comprising instructions, which when run on an origin server, causes the origin server to perform the method of determining a feature image in a convolutional neural network model provided in the first aspect above.
The technical scheme provided by the embodiment of the disclosure can comprise the following beneficial effects:
the method provided by this embodiment acquires a plurality of input images; generates multiple groups of convolution kernels, where different convolution kernels in the same group contain the same elements in different arrangement orders; and determines at least one feature image corresponding to the plurality of input images. Because different convolution kernels in a group contain the same elements, merely rearranged, the method reduces the resources occupied by storing the convolution kernels, the number of times the convolution kernels are read, the amount of computation generated when the convolutional layer determines a feature image, and the system resources consumed in the calculation process.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. Apparently, the drawings in the following description show merely some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram illustrating a structure of a terminal according to an exemplary embodiment;
FIG. 2 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
FIG. 3 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
FIG. 4 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
FIG. 5 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
FIG. 6 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
FIG. 7 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
FIG. 8 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
FIG. 9 is a flow diagram illustrating a method of determining a feature image in a convolutional neural network model, in accordance with an exemplary embodiment;
fig. 10 is a schematic structural view showing an apparatus for determining a feature image in a convolutional neural network model according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
The embodiment of the application provides a method for determining a characteristic image in a convolutional neural network model, wherein an execution subject of the method is a terminal.
The terminal may include a processor 110 and a memory 120, and the processor 110 may be connected to the memory 120, as shown in fig. 1. The processor 110 may include one or more processing units. The processor 110 may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP), or it may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device.
The memory 120 may be used to store software programs and modules; in particular, a program may include program code comprising computer operating instructions. The processor 110 performs tasks by reading the software code and modules stored in the memory 120.
In addition, the terminal may further include a receiver 130 and a transmitter 140, which may each be connected to the processor 110; the receiver 130 and the transmitter 140 may be collectively referred to as a transceiver. The transmitter 140 may be used to transmit messages or data, and may include, but is not limited to, at least one amplifier, a tuner, one or more oscillators, a coupler, a low-noise amplifier (LNA), a duplexer, and the like.
An exemplary embodiment of the present disclosure provides a method for determining a feature image in a convolutional neural network model, as shown in fig. 2, the process flow of which may include the steps of:
step S210, a plurality of input images of a target processing layer in a convolutional neural network model are acquired.
In implementation, when training the convolutional neural network model, the structure of the model is designed first: for example, the number of convolutional layers it contains and, for each layer, the number of input images, the number of convolution kernels, and the number of output feature images. During the first training pass, the values of the elements of the convolution kernels used to convolve the input images into output images are random. A convolution kernel may be a matrix, and an element of the convolution kernel is the value at a given position (determined by row and column) in that matrix. For a convolution kernel of size 3×3, there are 9 values in its 3 rows and 3 columns, and these 9 values are its elements. Similarly, a convolution kernel of size 5×5 has 25 values in its 5 rows and 5 columns, which are its elements; other sizes are analogous and are not illustrated here. From the second through the N-th training passes, the element values of the convolution kernels are repeatedly optimized based on the difference between the result output by the convolutional neural network model and the correct result in the sample, so that after the N-th pass this difference is as low as possible.
For a target processing layer (a certain convolutional layer in the convolutional neural network model), the plurality of feature images output by the convolutional layer preceding the target processing layer are processed by other layers, such as a pooling layer and a ReLU (activation function) layer, to obtain the plurality of input images of the target processing layer.
Each feature image in each convolutional layer in the convolutional neural network model may be determined using the methods provided by the present embodiments.
Step S220, at least one set of convolution kernels of the target processing layer is acquired.
The multi-dimensional tensor (a tensor of order three or higher) formed by all the convolution kernels in a group can be a tensor with a special structure arranged according to a certain rule; its purpose is to make the elements within a group repeat, so that when the elements of the convolution kernels are used in calculation, the amount of computation can be reduced by combining like terms. The size of a convolution kernel is typically 3×3 or 5×5, and its height and width are typically equal. The elements contained in different convolution kernels in the same group are identical, but their arrangement orders differ.
In implementation, multiple groups of convolution kernels may be generated, group by group, when the convolutional neural network model is initialized. For example, as shown in fig. 3, a certain convolutional layer has 6 input images in total, with one convolution kernel corresponding to each input image, i.e., 6 convolution kernels. The 6 convolution kernels may be grouped, for example, kernels 1-3 into one group and kernels 4-6 into another.
Optionally, before acquiring the at least one group of convolution kernels of the target processing layer, the method provided by this embodiment may further include: randomly generating N convolution kernels, where N is a preset number of groups; and performing element displacement on each of the N convolution kernels in units of rows and/or in units of columns to obtain M−1 different convolution kernels, where the M−1 convolution kernels and the convolution kernel before element displacement form one group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels per group.
In practice, when initializing the convolutional neural network model, suppose convolution kernels 1-3 form one group and convolution kernels 4-6 form another. First, 2 convolution kernels, kernel 1 and kernel 4, are randomly generated. Next, taking kernel 1 of size 3×3 as an example, kernel 1 is subjected to element displacement in units of rows and in units of columns, as shown in fig. 4, to obtain kernels 2 and 3, where W0-W8 are the elements of the convolution kernel. Kernels 5 and 6 are generated from kernel 4 in the same way.
Optionally, the number M of convolution kernels in each group is greater than 2 and bounded by the product of the number of rows and the number of columns of the convolution kernel — up to 9 for a convolution kernel of size 3×3. Once M exceeds 9, a 3×3 kernel has been shifted by rows and by columns in all possible ways, so the 10th kernel must repeat one of the first 9. That is, to ensure that different convolution kernels in the same group contain the same elements in different arrangement orders, the maximum value of M is controlled to be no greater than the product of the number of rows and the number of columns of the convolution kernel.
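This bound can be checked directly. In the short sketch below (illustrative, assuming NumPy), the cyclic row and column shifts of a 3×3 kernel yield exactly rows × columns = 9 distinct arrangements, so any 10th shifted kernel would have to repeat one of them:

```python
import numpy as np

base = np.arange(9).reshape(3, 3)  # a 3x3 kernel with distinct elements
arrangements = {
    np.roll(np.roll(base, dr, axis=0), dc, axis=1).tobytes()
    for dr in range(3)   # every row-unit displacement
    for dc in range(3)   # every column-unit displacement
}
# At most rows * columns distinct kernels can be formed by such shifts.
assert len(arrangements) == 9
```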
In practice, suppose a group contains 5 convolution kernels, kernels 1-5. As shown in fig. 5, kernels 2 and 3 may first be obtained from kernel 1 by element displacement in units of columns, and kernels 4 and 5 may then be obtained from kernel 1 by element displacement in units of rows; alternatively, kernels 4 and 5 may be obtained from kernel 2 by element displacement in units of rows.
Step S230, based on each convolution kernel in at least one group of convolution kernels, convolution calculation is performed on different input images to obtain a plurality of intermediate matrixes, and the characteristic images are obtained by summing the plurality of intermediate matrixes.
Each element of an intermediate matrix is a polynomial obtained, during the convolution calculation, by multiplying the corresponding convolution kernel with the input image element-wise and adding the products.
In practice, as shown in fig. 6, suppose a convolutional layer has 4 input images and is required to output 2 feature images, feature image 1 and feature image 2. The 4 input images are convolved with convolution kernels 1-4 respectively to obtain intermediate matrices 1-4, and feature image 1 is obtained from intermediate matrices 1-4. The 4 input images are then convolved again with convolution kernels 5-8 to obtain intermediate matrices 5-8, and feature image 2 is obtained from intermediate matrices 5-8.
For fig. 6, convolution kernels 1-4 may be divided into groups, and convolution kernels 5-8 may likewise be divided into groups. This is only an example: in practice, the number of convolution kernels corresponding to one feature image is large, and those kernels may be divided into multiple groups. Within each group, different convolution kernels contain the same elements in different arrangement orders.
As shown in fig. 7, when determining intermediate matrix 1, a block of, for example, 3×3 adjacent elements is extracted from input image 1 each time; the 3×3 adjacent elements are multiplied by the elements of the convolution kernel at the corresponding positions and the products are added, giving one polynomial. The convolution kernel is then shifted across input image 1 by a preset number of rows or columns, and the multiply-and-add operation on the next 3×3 block of adjacent elements is repeated to obtain the next polynomial, until the convolution kernel has traversed all 3×3 blocks of adjacent elements in the input image, yielding intermediate matrix 1.
Optionally, the step of summing the plurality of intermediate matrices to obtain the feature image may include: adding the polynomials of the elements at the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combining like terms in the polynomial corresponding to each element of the feature image; and evaluating each polynomial after its like terms have been combined, to obtain the feature image.
In implementation, with the method provided by this embodiment, the polynomials of a plurality of intermediate matrices may be determined simultaneously through multiple channels, and the polynomials at the same position then added.
As shown in fig. 8, take as an example convolution kernels 1, 2 and 3 performing convolution calculation on the 3×3 adjacent elements at the upper-left corner of input images 1, 2 and 3 respectively, each producing the element in the first row and first column of intermediate matrices 1, 2 and 3. The polynomial corresponding to the first-row, first-column element of intermediate matrix 1 is: W0×a0 + W1×a1 + W2×a2 + W3×a3 + W4×a4 + W5×a5 + W6×a6 + W7×a7 + W8×a8. The polynomial corresponding to the first-row, first-column element of intermediate matrix 2 is: W2×b0 + W0×b1 + W1×b2 + W5×b3 + W3×b4 + W4×b5 + W8×b6 + W6×b7 + W7×b8. The polynomial corresponding to the first-row, first-column element of intermediate matrix 3 is: W1×c0 + W2×c1 + W0×c2 + W4×c3 + W5×c4 + W3×c5 + W7×c6 + W8×c7 + W6×c8.
When determining the feature image, the polynomials of the elements at the same position in all intermediate matrices corresponding to the feature image are added, which includes adding the polynomials of the same-position elements of intermediate matrices 1, 2 and 3. In particular, adding the polynomials corresponding to the first-row, first-column elements of intermediate matrices 1, 2 and 3 gives: W0×a0 + W1×a1 + W2×a2 + W3×a3 + W4×a4 + W5×a5 + W6×a6 + W7×a7 + W8×a8 + W2×b0 + W0×b1 + W1×b2 + W5×b3 + W3×b4 + W4×b5 + W8×b6 + W6×b7 + W7×b8 + W1×c0 + W2×c1 + W0×c2 + W4×c3 + W5×c4 + W3×c5 + W7×c6 + W8×c7 + W6×c8.
Combining like terms in the above expression gives: W0×(a0+b1+c2) + W1×(a1+b2+c0) + W2×(a2+b0+c1) + W3×(a3+b4+c5) + W4×(a4+b5+c3) + W5×(a5+b3+c4) + W6×(a6+b7+c8) + W7×(a7+b8+c6) + W8×(a8+b6+c7). Without combining like terms, evaluating the polynomial takes 27 multiplications and 26 additions in total; after combining like terms, it takes only 9 multiplications and 26 additions to obtain the same result.
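The saving can be verified numerically. The sketch below (NumPy, with the three element permutations transcribed from the polynomials above; the variable names are illustrative) evaluates the sum once term by term (27 multiplications) and once after combining like terms (9 multiplications), and checks that the results agree:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal(9)                        # shared elements W0..W8
a, b, c = (rng.standard_normal(9) for _ in range(3))

# perm_x[i] = index of the W element that multiplies pixel i of patch x,
# read off from the three polynomials for intermediate matrices 1-3.
perm_a = [0, 1, 2, 3, 4, 5, 6, 7, 8]
perm_b = [2, 0, 1, 5, 3, 4, 8, 6, 7]
perm_c = [1, 2, 0, 4, 5, 3, 7, 8, 6]

# Without combining like terms: 27 multiplications.
unmerged = (sum(W[perm_a[i]] * a[i] for i in range(9))
            + sum(W[perm_b[i]] * b[i] for i in range(9))
            + sum(W[perm_c[i]] * c[i] for i in range(9)))

# Combine like terms first: each W element multiplies one pre-summed
# bracket, leaving only 9 multiplications.
brackets = np.zeros(9)
for i in range(9):
    brackets[perm_a[i]] += a[i]
    brackets[perm_b[i]] += b[i]
    brackets[perm_c[i]] += c[i]
merged = float(W @ brackets)

assert np.isclose(unmerged, merged)
```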
Thus 18 multiplications are saved on this small portion of the calculation of the feature image alone. As the number of convolution kernels in a group increases and the overall calculation of the feature image proceeds, the number of places where the amount of computation can be reduced grows greatly, ultimately accelerating the determination of the feature image.
Optionally, after summing the plurality of intermediate matrices to obtain the feature image, the method provided by this embodiment further includes: when an output result of the convolutional neural network model is obtained, determining an adjustment value for each element of each convolution kernel in the at least one group of convolution kernels according to the output result of the convolutional neural network model and a preset output result; determining the sum of the adjustment values of the same element contained in different convolution kernels in the same group as the corrected adjustment value for that element; and adjusting the respective convolution kernels based on the corrected adjustment value of each element.
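A minimal sketch of this correction step (the permutation bookkeeping, names, and toy values are illustrative assumptions, not the patent's formula): the raw adjustment values computed independently for each kernel in a group are summed per shared element, and that single corrected value is then applied wherever the element appears.

```python
import numpy as np

def corrected_updates(grads, perms):
    """Sum the adjustment values that different kernels in a group assign to
    the same underlying element. perms[k][i] names the shared element
    sitting at position i of kernel k."""
    corrected = np.zeros(len(perms[0]))
    for g, perm in zip(grads, perms):
        for i, elem in enumerate(perm):
            corrected[elem] += g[i]
    return corrected

# Three 3x3 kernels sharing elements W0..W8 in different arrangements.
perms = [[0, 1, 2, 3, 4, 5, 6, 7, 8],
         [2, 0, 1, 5, 3, 4, 8, 6, 7],
         [1, 2, 0, 4, 5, 3, 7, 8, 6]]
grads = [np.full(9, g) for g in (1.0, 2.0, 3.0)]  # toy per-kernel adjustments
delta_w = corrected_updates(grads, perms)
# Every shared element accumulates 1 + 2 + 3 = 6; the kernels of the group
# are then all adjusted with this one corrected value per element.
```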
In implementation, a plurality of convolutional layers exist in the convolutional neural network model: the first through (Z-1)-th convolutional layers each output feature images to the next layer, and the last, i.e. the Z-th, convolutional layer outputs the final output result. When the output result of the convolutional neural network model is obtained, it generally deviates from the preset output result, because the model is still in the training process. Based on the error produced by the overall convolutional neural network model, an adjustment value for each element of each convolution kernel in the plurality of groups of convolution kernels can be determined. Then, the sum of the adjustment values of the same element contained in different convolution kernels of the same group is determined as the corrected adjustment value for that element. For example, with respect to fig. 8, if, as in fig. 9, convolution kernel 1, convolution kernel 2 and convolution kernel 3 perform convolution computation over 3 channels on 3×3 adjacent elements of input image 1, input image 2 and input image 3 respectively, then the corrected adjustment value for the same element is calculated by the following formula:
where Δw is the corrected adjustment value corresponding to the adjustment values of the same element, WH is the product of the width and the height of the feature image, δRk is the sensitivity, in which R denotes the R-th feature image of the target processing layer, and w_size² is the product of the width and the height of the convolution kernel.
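The correction step described above can be sketched as follows. This is a hedged illustration, not the patent's implementation: it assumes, as in the example of fig. 8, that the kernels of a group differ by row-wise rotations; the gradient values, shapes and learning rate are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# shifts[i]: how far kernel i's rows are rotated relative to the base layout
shifts = [0, 1, 2]

# Raw adjustment values (e.g. gradients) computed independently per kernel,
# each laid out in that kernel's own element order
grads = [rng.standard_normal((3, 3)) for _ in shifts]

# Corrected adjustment value: sum the adjustment values that belong to the
# same shared element, after mapping each back to the base layout
corrected = sum(np.roll(g, -s, axis=1) for g, s in zip(grads, shifts))

# Update the shared elements once, then rebuild every kernel of the group,
# so all kernels still contain the same elements after the adjustment
lr = 0.01
base = rng.standard_normal((3, 3))
base = base - lr * corrected
kernels = [np.roll(base, s, axis=1) for s in shifts]
```

Because every kernel is rebuilt from the same updated base, the group property (same elements, different orders) is preserved across training steps.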
Finally, a test was performed with the method provided in this embodiment. Specifically, image recognition training was performed on the Cifar10 dataset; the convolutional neural network model was designed as a 3-layer model with a 5×5 convolution kernel in each layer. The test results are shown in the following table:
TABLE 1
A further test was performed with the method provided in this embodiment. Here the convolutional neural network model was trained for the field of image super-resolution, upscaling the original image into a new image 3 times its size. The convolutional neural network model was designed as a 3-layer model with 5×5 convolution kernels. The test results are shown in the following table:
TABLE 2
PSNR is a common metric in image super-resolution applications; a larger PSNR indicates a better super-resolution result. BaseHisrcnn is a convolutional neural network structure applied to image super-resolution.
With the method provided by this embodiment, a plurality of input images are acquired; a plurality of groups of convolution kernels are generated, where different convolution kernels in the same group contain the same elements in different arrangement orders; and at least one feature image corresponding to the plurality of input images is determined. Because the different convolution kernels of a group contain identical elements arranged in different orders, the method reduces the resources occupied by storing the convolution kernels, reduces the number of times the convolution kernels are read, reduces the amount of computation generated when the convolutional layer determines the feature image, and reduces the system computing resources consumed in the calculation process.
Yet another exemplary embodiment of the present disclosure provides an apparatus for determining a feature image in a convolutional neural network model, as shown in fig. 10, the apparatus comprising:
an acquisition module 1010, configured to acquire a plurality of input images of a target processing layer in a convolutional neural network model, and to acquire at least one group of convolution kernels of the target processing layer, wherein different convolution kernels in the same group contain the same elements and the arrangement orders of the elements are different. This module may in particular implement the acquisition functions of step S210 and step S220 described above, as well as other implicit steps.
The determining module 1020 is configured to perform convolution calculation on different input images based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and to sum the plurality of intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained by, in the convolution calculation process, multiplying the corresponding convolution kernel with the input image and adding the products of corresponding elements. This module may in particular implement the function of step S230 described above, as well as other implicit steps.
Optionally, the determining module 1020 is configured to add the polynomials of the elements at the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combine like terms in the polynomial corresponding to each element of the feature image; and evaluate each polynomial after the like terms are combined to obtain the feature image.
Optionally, the apparatus further comprises:
the generating module is used for randomly generating N convolution kernels, wherein N is the preset group number;
and a displacement module, configured to perform element displacement on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels and the convolution kernel before element displacement form a group of convolution kernels of the target processing layer, and M is the preset number of convolution kernels in a group.
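What the generating and displacement modules compute might be sketched as follows. This is a hedged illustration under the assumption that element displacement is realized as a cyclic column-wise shift (the row-unit variant is analogous via the other axis); the function name and all sizes are hypothetical:

```python
import numpy as np

def generate_kernel_groups(n, m, size=3, seed=42):
    """Randomly generate N base kernels, then derive each group's M-1
    further kernels by cyclic column-wise element shifts, so that every
    kernel in a group contains the same elements in a different order
    and only the base kernel needs to be stored."""
    rng = np.random.default_rng(seed)
    groups = []
    for _ in range(n):
        base = rng.standard_normal((size, size))
        groups.append([np.roll(base, s, axis=1) for s in range(m)])
    return groups

groups = generate_kernel_groups(n=2, m=3)  # N=2 groups of M=3 kernels each
```

Storing one base kernel per group instead of M full kernels is what reduces the storage and read cost noted in the embodiment.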
Optionally, the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
Optionally, the determining module 1020 is further configured to determine, when an output result of the convolutional neural network model is obtained, an adjustment value of each element in each convolution kernel in the at least one set of convolution kernels according to the output result of the convolutional neural network model and a preset output result; determining the sum of the adjustment values of the same element contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element;
the apparatus further comprises an adjustment module:
the adjustment module is used for adjusting each convolution kernel based on the corrected adjustment value of each element.
It should be noted that the acquisition module 1010 and the determining module 1020 may be implemented by a processor, by a processor in combination with a memory, or by a processor executing program instructions stored in the memory.
The specific manner in which the various modules of the apparatus of the above embodiment perform their operations has been described in detail in the method embodiments and is not described again here.
Because the different convolution kernels of a group contain identical elements arranged in different orders, the apparatus reduces the resources occupied by storing the convolution kernels, reduces the number of times the convolution kernels are read, reduces the amount of computation generated when the convolutional layer determines the feature image, and reduces the system computing resources consumed in the calculation process.
It should be noted that: the device for determining the feature image in the convolutional neural network model provided in the above embodiment only uses the division of the above functional modules to illustrate when determining the feature image, in practical application, the above functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the terminal is divided into different functional modules to complete all or part of the functions described above. In addition, the device for determining the feature image in the convolutional neural network model provided in the above embodiment belongs to the same concept as the method embodiment for determining the feature image in the convolutional neural network model, and the specific implementation process is detailed in the method embodiment, which is not repeated here.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following its general principles and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (16)

1. A method of determining a feature image in a convolutional neural network model, the method comprising:
acquiring a plurality of input images of a target processing layer in a convolutional neural network model;
acquiring at least one group of convolution kernels of the target processing layer, wherein the elements contained in different convolution kernels in the same group are the same and the arrangement order of the elements is different;
based on each convolution kernel in the at least one group of convolution kernels, respectively performing convolution calculation on different input images to obtain a plurality of intermediate matrices, and summing the plurality of intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained by, in the convolution calculation process, multiplying the corresponding convolution kernel with the input image and adding the products of corresponding elements.
2. The method of claim 1, wherein summing the plurality of intermediate matrices results in a feature image, comprising:
adding polynomials of elements at the same position in the plurality of intermediate matrixes to obtain polynomials corresponding to each element of the characteristic image;
combining like terms in the polynomial corresponding to each element of the feature image respectively;
and evaluating each polynomial after the like terms are combined to obtain the feature image.
3. The method of claim 1, wherein prior to obtaining at least one set of convolution kernels for the target processing layer, the method further comprises:
randomly generating N convolution kernels, wherein N is the preset group number;
and performing element displacement on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels and the convolution kernel before element displacement form a group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels in a group.
4. The method of claim 1, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
5. The method of claim 1, wherein after summing the plurality of intermediate matrices to obtain a feature image, the method further comprises:
when an output result of the convolutional neural network model is obtained, determining an adjustment value of each element in each convolutional kernel in the at least one group of convolutional kernels according to the output result of the convolutional neural network model and a preset output result;
determining the sum of the adjustment values of the same element contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element;
the respective convolution kernels are adjusted based on the corrected adjustment value for each element.
6. An apparatus for determining a feature image in a convolutional neural network model, the apparatus comprising:
the acquisition module is used for acquiring a plurality of input images of a target processing layer in the convolutional neural network model; acquiring at least one group of convolution kernels of the target processing layer, wherein the elements contained in different convolution kernels in the same group are the same and the arrangement order of the elements is different;
a determining module, configured to perform convolution calculation on different input images respectively based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and sum the plurality of intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained by, in the convolution calculation process, multiplying the corresponding convolution kernel with the input image and adding the products of corresponding elements.
7. The apparatus according to claim 6, wherein the determining module is configured to add polynomials of elements at the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combine like terms in the polynomial corresponding to each element of the feature image respectively; and evaluate each polynomial after the like terms are combined to obtain the feature image.
8. The apparatus of claim 6, wherein the apparatus further comprises:
the generating module is used for randomly generating N convolution kernels, wherein N is the preset group number;
and a displacement module, configured to perform element displacement on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels and the convolution kernel before element displacement form a group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels in a group.
9. The apparatus of claim 6, wherein the number of convolution kernels in each group is greater than 2 and less than a product of the number of rows and columns of convolution kernels.
10. The apparatus of claim 6, wherein the determining module is further configured to determine, when the output result of the convolutional neural network model is obtained, an adjustment value for each element in each of the at least one set of convolutional kernels according to the output result of the convolutional neural network model and a preset output result; determining the sum of the adjustment values of the same element contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element;
the apparatus further comprises an adjustment module:
the adjustment module is used for adjusting each convolution kernel based on the corrected adjustment value of each element.
11. A terminal comprising a processor and a memory, wherein:
the processor is configured to acquire a plurality of input images of a target processing layer in a convolutional neural network model stored in the memory; acquire at least one group of convolution kernels of the target processing layer stored in the memory, wherein the elements contained in different convolution kernels of the same group are the same and the arrangement orders of the elements are different; perform convolution calculation on different input images respectively based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices; and sum the plurality of intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained by, in the convolution calculation process, multiplying the corresponding convolution kernel with the input image and adding the products of corresponding elements.
12. The terminal of claim 11, wherein the processor is configured to add polynomials of elements at the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combine like terms in the polynomial corresponding to each element of the feature image respectively; and evaluate each polynomial after the like terms are combined to obtain the feature image.
13. The terminal of claim 11, wherein the processor is further configured to randomly generate N convolution kernels, wherein N is a preset number of groups; and perform element displacement on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels and the convolution kernel before element displacement form a group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels in a group.
14. The terminal of claim 11, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
15. The terminal of claim 11, wherein the processor is further configured to determine, when an output result of the convolutional neural network model is obtained, an adjustment value for each element in each of the at least one set of convolutional kernels according to the output result of the convolutional neural network model and a preset output result; determining the sum of the adjustment values of the same element contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element; the respective convolution kernels are adjusted based on the corrected adjustment value for each element.
16. A computer readable storage medium comprising instructions which, when run on a terminal, cause the terminal to perform the method of any of claims 1-5.
CN201780096076.0A 2017-12-20 2017-12-20 Method and device for determining characteristic images in convolutional neural network model Active CN111247527B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/117503 WO2019119301A1 (en) 2017-12-20 2017-12-20 Method and device for determining feature image in convolutional neural network model

Publications (2)

Publication Number Publication Date
CN111247527A CN111247527A (en) 2020-06-05
CN111247527B true CN111247527B (en) 2023-08-22

Family

ID=66992897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780096076.0A Active CN111247527B (en) 2017-12-20 2017-12-20 Method and device for determining characteristic images in convolutional neural network model

Country Status (2)

Country Link
CN (1) CN111247527B (en)
WO (1) WO2019119301A1 (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541565B (en) * 2019-09-20 2023-08-29 腾讯科技(深圳)有限公司 Convolution calculation data stream mapping method and device
CN110807170B (en) * 2019-10-21 2023-06-27 中国人民解放军国防科技大学 Method for realizing Same convolution vectorization of multi-sample multi-channel convolution neural network
CN112733585B (en) * 2019-10-29 2023-09-05 杭州海康威视数字技术股份有限公司 image recognition method
CN112766474B (en) * 2019-11-04 2024-03-22 北京地平线机器人技术研发有限公司 Method, device, medium and electronic equipment for realizing convolution operation
CN110929623A (en) * 2019-11-15 2020-03-27 北京达佳互联信息技术有限公司 Multimedia file identification method, device, server and storage medium
KR20210082970A (en) 2019-12-26 2021-07-06 삼성전자주식회사 A method and an apparatus for performing convolution operations
CN113052756A (en) * 2019-12-27 2021-06-29 武汉Tcl集团工业研究院有限公司 Image processing method, intelligent terminal and storage medium
CN111241993B (en) * 2020-01-08 2023-10-20 咪咕文化科技有限公司 Seat number determining method and device, electronic equipment and storage medium
CN111414995B (en) * 2020-03-16 2023-05-19 北京君立康生物科技有限公司 Detection processing method and device for micro-target colony, electronic equipment and medium
CN111767928B (en) * 2020-06-28 2023-08-08 中国矿业大学 Method and device for extracting image characteristic information based on convolutional neural network
CN113919405B (en) * 2020-07-07 2024-01-19 华为技术有限公司 Data processing method and device and related equipment
CN114090470B (en) * 2020-07-29 2023-02-17 深圳市中科元物芯科技有限公司 Data preloading device and preloading method thereof, storage medium and computer equipment
CN112016740A (en) * 2020-08-18 2020-12-01 北京海益同展信息科技有限公司 Data processing method and device
CN112149694B (en) * 2020-08-28 2024-04-05 特斯联科技集团有限公司 Image processing method, system, storage medium and terminal based on convolutional neural network pooling module
CN112132279B (en) * 2020-09-23 2023-09-15 平安科技(深圳)有限公司 Convolutional neural network model compression method, device, equipment and storage medium
CN115861043B (en) * 2023-02-16 2023-05-16 深圳市旗云智能科技有限公司 Image data processing method and system based on artificial intelligence
CN116295188B (en) * 2023-05-15 2023-08-11 山东慧点智能技术有限公司 Measuring device and measuring method based on displacement sensor

Citations (5)

Publication number Priority date Publication date Assignee Title
CN104077233A (en) * 2014-06-18 2014-10-01 百度在线网络技术(北京)有限公司 Single-channel convolution layer and multi-channel convolution layer handling method and device
CN104915322A (en) * 2015-06-09 2015-09-16 中国人民解放军国防科学技术大学 Method for accelerating convolution neutral network hardware and AXI bus IP core thereof
CN106326985A (en) * 2016-08-18 2017-01-11 北京旷视科技有限公司 Neural network training method, neural network training device, data processing method and data processing device
CN106682736A (en) * 2017-01-18 2017-05-17 北京小米移动软件有限公司 Image identification method and apparatus
CN107491787A (en) * 2017-08-21 2017-12-19 珠海习悦信息技术有限公司 Local binarization CNN processing method, device, storage medium and processor

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US20160110499A1 (en) * 2014-10-21 2016-04-21 Life Technologies Corporation Methods, systems, and computer-readable media for blind deconvolution dephasing of nucleic acid sequencing data
CN106156781B (en) * 2016-07-12 2019-09-10 北京航空航天大学 Sort convolutional neural networks construction method and its image processing method and device
CN106447030B (en) * 2016-08-30 2021-09-21 深圳市诺比邻科技有限公司 Method and system for optimizing computing resources of convolutional neural network
CN106778584B (en) * 2016-12-08 2019-07-16 南京邮电大学 A kind of face age estimation method based on further feature Yu shallow-layer Fusion Features
CN106971174B (en) * 2017-04-24 2020-05-22 华南理工大学 CNN model, CNN training method and CNN-based vein identification method

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN104077233A (en) * 2014-06-18 2014-10-01 百度在线网络技术(北京)有限公司 Single-channel convolution layer and multi-channel convolution layer handling method and device
CN104915322A (en) * 2015-06-09 2015-09-16 中国人民解放军国防科学技术大学 Method for accelerating convolution neutral network hardware and AXI bus IP core thereof
CN106326985A (en) * 2016-08-18 2017-01-11 北京旷视科技有限公司 Neural network training method, neural network training device, data processing method and data processing device
CN106682736A (en) * 2017-01-18 2017-05-17 北京小米移动软件有限公司 Image identification method and apparatus
CN107491787A (en) * 2017-08-21 2017-12-19 珠海习悦信息技术有限公司 Local binarization CNN processing method, device, storage medium and processor

Non-Patent Citations (1)

Title
A new image feature extraction method based on a deep convolutional neural network without a loss function; Li Ziqiang; China Masters' Theses Full-text Database, Information Science and Technology, No. 9, pp. 7-40 *

Also Published As

Publication number Publication date
WO2019119301A1 (en) 2019-06-27
CN111247527A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
CN111247527B (en) Method and device for determining characteristic images in convolutional neural network model
US11816532B2 (en) Performing kernel striding in hardware
EP3373210B1 (en) Transposing neural network matrices in hardware
EP3555814B1 (en) Performing average pooling in hardware
CN112214726B (en) Operation accelerator
KR102415576B1 (en) Method and system for reducing computational complexity of convolutional neural networks
US10083395B2 (en) Batch processing in a neural network processor
US20190095776A1 (en) Efficient data distribution for parallel processing
CN109313663B (en) Artificial intelligence calculation auxiliary processing device, method, storage medium and terminal
CN111465924A (en) System and method for converting matrix input to vectorized input for a matrix processor
CN108845828B (en) Coprocessor, matrix operation acceleration method and system
CN110109646B (en) Data processing method, data processing device, multiplier-adder and storage medium
US20210271973A1 (en) Operation method and apparatus for network layer in deep neural network
US20200301995A1 (en) Information processing apparatus, information processing method, and program
CN109255438A (en) The method and apparatus for adjusting tensor data
CN113344172A (en) Mapping convolutions to channel convolution engines
CN113538281B (en) Image denoising method, image denoising device, computer equipment and storage medium
US20230267740A1 (en) Video data processing method and system, and relevant assemblies
CN111723906A (en) Accelerated calculation method and system of recurrent neural network and related device
CN112970036A (en) Convolution block array for implementing neural network applications, method of using the same, and convolution block circuit
KR20220158768A (en) Power reduction for accelerating machine learning
CN114549945A (en) Remote sensing image change detection method and related device
CN117456562B (en) Attitude estimation method and device
EP4361892A1 (en) Methods and systems for performing a per channel affine transformation using a neural network accelerator
CN117495571B (en) Data processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant