CN111247527A - Method and device for determining characteristic image in convolutional neural network model - Google Patents


Info

Publication number: CN111247527A (application number CN201780096076.0A)
Authority: CN (China)
Prior art keywords: convolution kernels, convolution, group, kernels, neural network
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN111247527B (en)
Inventor: 胡慧
Current assignee: Huawei Technologies Co Ltd
Original assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd; publication of application CN111247527A; grant published as CN111247527B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A method and a device for determining a feature image in a convolutional neural network model, belonging to the technical field of model training. The method comprises: acquiring a plurality of input images of a target processing layer in a convolutional neural network model (S210); acquiring at least one group of convolution kernels of the target processing layer (S220), wherein different convolution kernels in the same group contain the same elements, arranged in different orders; and performing convolution calculations on the different input images based on each convolution kernel in the at least one group to obtain a plurality of intermediate matrices, which are summed to obtain a feature image (S230). Because different convolution kernels in a group contain the same elements in different orders, the method reduces the resources occupied by storing the convolution kernels, the number of times the convolution kernels are read, the amount of calculation generated when a convolutional layer determines feature images, and the system operation resources consumed in the calculation process.

Description

Method and device for determining characteristic image in convolutional neural network model Technical Field
The present disclosure relates to the field of model training technologies, and in particular, to a method and an apparatus for determining a feature image in a convolutional neural network model.
Background
A convolutional neural network is composed of convolutional layers, fully connected layers, activation functions, and the like, and the output of a single convolutional layer comprises a plurality of feature images. Training a convolutional neural network model requires computation over a large number of samples, and the computation performed in the convolutional layers accounts for about 90% of the total computation of the whole training process.
For any convolutional layer, the number of convolution kernels may be determined according to the number of input images and the number of output feature images, and a corresponding number of convolution kernels may be generated. Each convolution kernel may be a small matrix, such as a 3×3 matrix, and each input image may be regarded as a large matrix. The processing of the convolutional layer may be as follows: an input image is convolved with a convolution kernel. Specifically, every sub-matrix of the same size as the convolution kernel is extracted from the input image, the aligned elements of the extracted sub-matrix and the convolution kernel are multiplied and the products added to obtain one value, and all the values so obtained form an intermediate matrix. Convolving each input image with one convolution kernel yields one intermediate matrix, and the intermediate matrices are added to obtain a feature image.
In carrying out the present disclosure, the inventors found that at least the following problems exist:
because the convolutional neural network contains a large number of convolutional layers, each convolutional layer needs to output a large number of feature images, and the number of convolution kernels corresponding to each feature image is also large. The amount of computation for each convolution kernel is large, so the total amount of computation in the whole training process is enormous. Therefore, the amount of computation generated in the convolutional layers is large and requires a large amount of processing resources.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides the following technical solutions:
in a first aspect, there is provided a method for determining a feature image in a convolutional neural network model, the method comprising:
acquiring a plurality of input images of a target processing layer in a convolutional neural network model;
acquiring at least one group of convolution kernels of the target processing layer, wherein different convolution kernels in the same group contain the same elements and different arrangement sequences of the elements;
and performing convolution calculations on different input images based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and summing the intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained, during the convolution calculation, by multiplying aligned elements of the corresponding convolution kernel and input image and adding the products.
The method provided by this embodiment comprises: acquiring a plurality of input images; generating at least one group of convolution kernels, wherein different convolution kernels in the same group contain the same elements in different orders; and performing convolution calculations on different input images based on each convolution kernel in the at least one group to obtain a plurality of intermediate matrices, which are summed to obtain the feature image. Because different convolution kernels in the same group contain the same elements in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, the amount of calculation generated when the convolutional layer determines feature images is reduced, and the system operation resources consumed in the calculation process are reduced.
In one possible implementation, summing the plurality of intermediate matrices to obtain a feature image includes:
adding the polynomials of the elements at the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image;
combining like terms in the polynomial corresponding to each element of the feature image;
and evaluating each like-term-combined polynomial to obtain the feature image.
The total number of multiplications and additions required after combining like terms is far smaller than the number required when the like terms of the polynomials are not combined. As the number of convolution kernels in a group increases and the whole calculation for determining the feature image is carried out, the places in the calculation where the amount of computation can be reduced increase greatly, which ultimately speeds up determination of the feature image.
In one possible implementation, before obtaining at least one set of convolution kernels of the target processing layer, the method further includes:
randomly generating N convolution kernels, wherein N is a preset group number;
and performing element displacement on each convolution kernel in the N convolution kernels by using a row unit and/or performing element displacement on each convolution kernel by using a column unit to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels and the convolution kernels before element displacement form a group of convolution kernels of the target processing layer, and M is the number of the convolution kernels in a preset group.
Because different convolution kernels in the same group contain the same elements in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, the amount of calculation generated when the convolutional layer determines feature images is reduced, and the system operation resources consumed in the calculation process are reduced.
In one possible implementation, the number of convolution kernels in each group is greater than 2 and not greater than the product of the number of rows and the number of columns of the convolution kernel.
In a possible implementation manner, after summing the plurality of intermediate matrices to obtain a feature image, the method further includes:
when the output result of the convolutional neural network model is obtained, determining the adjustment value of each element in each convolutional kernel in the at least one group of convolutional kernels according to the output result of the convolutional neural network model and a preset output result;
determining the sum of the adjustment values of the same elements contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element;
each convolution kernel is adjusted based on the modified adjustment value for each element.
In implementation, a convolutional neural network model contains a plurality of convolutional layers: the first through (Z-1)-th layers output feature images, and the last, i.e., Z-th, layer outputs the final output result. When the output result of the convolutional neural network model is obtained, an error generally exists between it and the preset output result, because the model is still being trained. Based on the error produced by the entire convolutional neural network model, the adjustment value of each element in each convolution kernel of the plurality of groups of convolution kernels can be determined. Then the sum of the adjustment values of the same element contained in different convolution kernels of the same group is determined as the corrected adjustment value corresponding to that element's adjustment values.
In a second aspect, an apparatus for determining a feature image in a convolutional neural network model is provided, the apparatus comprising at least one module for implementing the method for determining a feature image in a convolutional neural network model provided in the first aspect.
In a third aspect, a terminal is provided that includes a processor and a memory, the processor being configured to execute instructions stored in the memory; by executing the instructions, the processor implements the method for determining a feature image in a convolutional neural network model provided in the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, comprising instructions that, when run on a source server, cause the source server to perform the method for determining a feature image in a convolutional neural network model provided in the first aspect.
In a fifth aspect, a computer program product containing instructions is provided which, when run on a source server, causes the source server to perform the method for determining a feature image in a convolutional neural network model provided in the first aspect.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
the method provided by the embodiment comprises the steps of acquiring a plurality of input images; generating a plurality of groups of convolution kernels, wherein different convolution kernels in the same group comprise the same elements and different arrangement sequences of the elements; at least one characteristic image corresponding to the plurality of input images is determined. By the characteristics that different convolution kernels of the convolution kernels contain the same elements and the arrangement sequences of the elements are different, resources occupied by the stored convolution kernels are reduced, the number of times of reading the convolution kernels is reduced, the calculation amount generated when the convolution layers determine the characteristic images is reduced, and system operation resources consumed in the calculation process are reduced.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without creative effort.
Fig. 1 is a schematic diagram illustrating the structure of a terminal according to an exemplary embodiment;
FIG. 2 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model, according to an exemplary embodiment;
FIG. 3 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model in accordance with an exemplary embodiment;
FIG. 4 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model in accordance with an exemplary embodiment;
FIG. 5 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model, according to an exemplary embodiment;
FIG. 6 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model, according to an exemplary embodiment;
FIG. 7 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model in accordance with an exemplary embodiment;
FIG. 8 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model in accordance with an exemplary embodiment;
FIG. 9 is a schematic flow diagram illustrating a method of determining a feature image in a convolutional neural network model in accordance with an exemplary embodiment;
fig. 10 is a schematic structural diagram illustrating an apparatus for determining a feature image in a convolutional neural network model according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
An embodiment of the present invention provides a method for determining a feature image in a convolutional neural network model; the method is executed by a terminal.
As shown in Fig. 1, the terminal may comprise a processor 110 and a memory 120, and the processor 110 may be connected to the memory 120. The processor 110 may include one or more processing units. The processor 110 may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP), or it may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device.
The memory 120 may be used to store software programs and modules, including program code containing computer operating instructions; the processor 110 performs tasks by reading the software code and modules stored in the memory 120.
In addition, the terminal may further include a receiver 130 and a transmitter 140, which may each be connected to the processor 110; the receiver 130 and the transmitter 140 may be collectively referred to as a transceiver. The transmitter 140 may be used to transmit messages or data, and may include, but is not limited to, at least one amplifier, a tuner, one or more oscillators, a coupler, a low-noise amplifier (LNA), a duplexer, and the like.
An exemplary embodiment of the present disclosure provides a method for determining a feature image in a convolutional neural network model, as shown in fig. 2, a processing flow of the method may include the following steps:
step S210, a plurality of input images of the target processing layer in the convolutional neural network model are acquired.
In implementation, when training the convolutional neural network model, the structure of the model is designed first, such as the number of convolutional layers it contains, the number of input images and convolution kernels in each layer, and the number of output feature images. In the first round of training, the values of the elements of the convolution kernels used to convolve the input images into output images are random. A convolution kernel may be a matrix, and an element of the convolution kernel is the value at any position (determined by row and column) in the matrix. For a convolution kernel of size 3×3 there are 9 values, in 3 rows and 3 columns, and these 9 values are the elements of the convolution kernel. Similarly, for a convolution kernel of size 5×5 there are 25 values, in 5 rows and 5 columns, and these 25 values are its elements; convolution kernels of other sizes follow the same pattern. From the second round of training through the N-th round, the values of the elements of the convolution kernels are continuously optimized by propagating back the difference between the result output by the convolutional neural network model and the correct result in the sample, so that after the N-th round this difference is as small as possible.
For a target processing layer (a certain convolutional layer in the convolutional neural network model), the plurality of input images of that layer are the plurality of output images obtained by passing the feature images output by the preceding convolutional layer through other layers such as a pooling layer and a ReLU layer.
Each feature image in each convolutional layer in the convolutional neural network model can be determined using the method provided by the present embodiment.
Step S220, at least one set of convolution kernels of the target processing layer is obtained.
All the convolution kernels in each group of convolution kernels may form a multidimensional tensor (a three-dimensional or higher-order array is a tensor) with a special structure arranged according to a certain rule. The purpose is to make elements repeat across the convolution kernels of a group, so that when calculations use these elements, the amount of computation can be reduced by combining like terms. The size of a convolution kernel is typically 3×3 or 5×5, and its height and width are typically equal. Different convolution kernels in the same group contain the same elements, arranged in different orders.
In implementation, when initializing the convolutional neural network model, a plurality of groups of convolution kernels may be generated in units of groups. For example, as shown in Fig. 3, a convolutional layer has 6 input images in total, with 6 corresponding convolution kernels. The 6 convolution kernels may be grouped, for example, convolution kernels 1-3 into one group and convolution kernels 4-6 into another.
Optionally, before obtaining at least one set of convolution kernels of the target processing layer, the method provided in this embodiment may further include: randomly generating N convolution kernels, wherein N is a preset group number; and performing element displacement on each convolution kernel in the N convolution kernels by using a row unit and/or performing element displacement on each convolution kernel by using a column unit to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels and the convolution kernels before element displacement form a group of convolution kernels of the target processing layer, and M is the number of the convolution kernels in a preset group.
In implementation, during initialization of the convolutional neural network model, suppose as above that convolution kernels 1-3 form one group and convolution kernels 4-6 another. First, 2 convolution kernels are randomly generated, namely convolution kernel 1 and convolution kernel 4. Next, element displacement is performed on convolution kernel 1: if the size of convolution kernel 1 is 3×3, then, as shown in Fig. 4, element displacement is performed on convolution kernel 1 in units of columns to obtain convolution kernel 2 and convolution kernel 3, where W0-W8 are the elements of the convolution kernel. Convolution kernels 5 and 6 are generated from convolution kernel 4 in the same manner.
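A minimal sketch of this generation step (hypothetical helper names, and assuming "element displacement" means a cyclic shift of whole columns, then whole rows, as Fig. 4 suggests):

```python
# Hypothetical sketch: build one group of M convolution kernels from a single
# randomly generated base kernel by cyclically shifting its columns (and, once
# column shifts are exhausted, its rows). Pure Python, kernels as lists of lists.

def shift_cols(kernel, n):
    """Cyclically shift the columns of a square kernel by n positions."""
    k = len(kernel)
    return [[row[(j - n) % k] for j in range(k)] for row in kernel]

def shift_rows(kernel, n):
    """Cyclically shift the rows of a square kernel by n positions."""
    k = len(kernel)
    return [kernel[(i - n) % k] for i in range(k)]

def make_group(base, m):
    """Return a group of m kernels: the base kernel plus m-1 shifted copies."""
    k = len(base)
    group = [base]
    for n in range(1, m):
        # exhaust column shifts first, then combine with row shifts
        group.append(shift_rows(shift_cols(base, n % k), n // k))
    return group

base = [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]
group = make_group(base, 3)   # e.g. convolution kernels 1-3 of one group
```

Every kernel in the resulting group contains the same nine elements, only arranged in a different order, which is exactly the property the grouping relies on.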
Optionally, the number M of convolution kernels in each group is greater than 2 and does not exceed the product of the number of rows and the number of columns of the convolution kernel, e.g., at most 9 for a convolution kernel of size 3×3. Once M exceeded 9, every possible row-unit and column-unit element displacement of the 3×3 convolution kernel would already have been used, so the 10th convolution kernel would necessarily repeat one of the first 9. That is, to ensure that different convolution kernels in the same group contain the same elements in different orders, M is kept no larger than the product of the number of rows and the number of columns of the convolution kernel.
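This bound can be checked directly, assuming element displacement is a cyclic shift of rows and columns (an illustrative sketch, not from the patent):

```python
# Enumerate every row/column cyclic shift of a 3x3 kernel and count the
# distinct results: there are only 3 column shifts times 3 row shifts,
# so at most 3 * 3 = 9 distinct kernels exist before repetition.

def all_shifts(kernel):
    k = len(kernel)
    seen = set()
    for dr in range(k):          # row displacement
        for dc in range(k):      # column displacement
            shifted = tuple(
                tuple(kernel[(i - dr) % k][(j - dc) % k] for j in range(k))
                for i in range(k))
            seen.add(shifted)
    return seen

kernel = [[0, 1, 2],
          [3, 4, 5],
          [6, 7, 8]]
distinct = all_shifts(kernel)
# with 9 distinct elements, all 9 shift combinations give distinct kernels,
# so a 10th kernel would have to repeat one of them
```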
In implementation, suppose there are 5 convolution kernels 1-5 in one group. As shown in Fig. 5, element displacement may first be performed on convolution kernel 1 in units of columns to obtain convolution kernels 2 and 3, and then in units of rows to obtain convolution kernels 4 and 5. Alternatively, convolution kernels 4 and 5 may be obtained by performing element displacement in units of rows on convolution kernel 2, and so on.
Step S230, performing convolution calculation on different input images respectively based on each convolution kernel in at least one group of convolution kernels to obtain a plurality of intermediate matrices, and summing the intermediate matrices to obtain a feature image.
Each element of an intermediate matrix is a polynomial obtained, during the convolution calculation, by multiplying aligned elements of the corresponding convolution kernel and the input image and adding the products.
In implementation, as shown in Fig. 6, suppose there are 4 input images in the same convolutional layer and the layer needs to output 2 feature images, feature image 1 and feature image 2. Convolution calculations are performed on the 4 input images with convolution kernels 1-4 to obtain intermediate matrices 1-4, from which feature image 1 is obtained. Convolution calculations are then performed on the same 4 input images with convolution kernels 5-8 to obtain intermediate matrices 5-8, from which feature image 2 is obtained.
In Fig. 6, convolution kernels 1-4 may form one group and convolution kernels 5-8 another. This is merely an example; in practice the number of convolution kernels corresponding to one feature image is large, and the convolution kernels corresponding to one feature image may be divided into multiple groups. Within each group, the different convolution kernels contain the same elements arranged in different orders.
As shown in Fig. 7, to determine intermediate matrix 1, 3×3 adjacent elements are taken from input image 1 each time, multiplied by the elements of the convolution kernel at the corresponding positions, and the products added to obtain one polynomial. The convolution kernel is then moved across input image 1 by a preset number of rows or columns, and the multiply-and-add operation is repeated until the convolution kernel has traversed all 3×3 adjacent elements of the input image, yielding intermediate matrix 1.
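An illustrative sketch of this traversal (not from the patent; stride 1 and no padding assumed):

```python
# Slide a k x k kernel over the image; each position multiplies aligned
# elements and sums the products, producing one entry of the intermediate
# matrix, exactly as in the traversal described above.

def convolve(image, kernel):
    k = len(kernel)                        # kernel is a k x k list of lists
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # multiply aligned elements of the patch and the kernel, then add
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(k) for j in range(k))
            row.append(s)
        out.append(row)
    return out

image = [[1, 2, 3, 0],
         [0, 1, 2, 3],
         [3, 0, 1, 2],
         [2, 3, 0, 1]]
kernel = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1]]
intermediate = convolve(image, kernel)     # a 2 x 2 intermediate matrix
```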
Optionally, summing the plurality of intermediate matrices to obtain the feature image may include: adding the polynomials of the elements at the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combining like terms in each of these polynomials; and evaluating each like-term-combined polynomial to obtain the feature image.
In implementation, for the method provided in this embodiment, the polynomials of the intermediate matrices may be determined simultaneously through multiple channels, and then the polynomials at the same position are added.
As shown in Fig. 8, convolution kernels 1, 2, and 3 perform convolution calculations on, for example, the 3×3 adjacent elements at the upper-left corner of input images 1, 2, and 3 respectively, producing the element in the first row and first column of intermediate matrices 1, 2, and 3. The element in the first row and first column of intermediate matrix 1 corresponds to the polynomial: W0·a0 + W1·a1 + W2·a2 + W3·a3 + W4·a4 + W5·a5 + W6·a6 + W7·a7 + W8·a8. The element in the first row and first column of intermediate matrix 2 corresponds to the polynomial: W2·b0 + W0·b1 + W1·b2 + W5·b3 + W3·b4 + W4·b5 + W8·b6 + W6·b7 + W7·b8. The element in the first row and first column of intermediate matrix 3 corresponds to the polynomial: W1·c0 + W2·c1 + W0·c2 + W4·c3 + W5·c4 + W3·c5 + W7·c6 + W8·c7 + W6·c8.
When determining the feature image, the polynomials of like-positioned elements of all the intermediate matrices corresponding to the feature image are added, which includes adding the polynomials of like-positioned elements of intermediate matrices 1, 2, and 3. In particular, adding the polynomials corresponding to the elements in the first row and first column of intermediate matrices 1, 2, and 3 gives: W0·a0 + W1·a1 + W2·a2 + W3·a3 + W4·a4 + W5·a5 + W6·a6 + W7·a7 + W8·a8 + W2·b0 + W0·b1 + W1·b2 + W5·b3 + W3·b4 + W4·b5 + W8·b6 + W6·b7 + W7·b8 + W1·c0 + W2·c1 + W0·c2 + W4·c3 + W5·c4 + W3·c5 + W7·c6 + W8·c7 + W6·c8.
Combining like terms in this formula gives: W0·(a0 + b1 + c2) + W1·(a1 + b2 + c0) + W2·(a2 + b0 + c1) + W3·(a3 + b4 + c5) + W4·(a4 + b5 + c3) + W5·(a5 + b3 + c4) + W6·(a6 + b7 + c8) + W7·(a7 + b8 + c6) + W8·(a8 + b6 + c7). Evaluating the polynomial without combining like terms requires 27 multiplications and 26 additions; after combining like terms, only 9 multiplications and 26 additions are needed.
Thus, 18 multiplications are saved in just this small part of the calculation of the feature image. As the number of convolution kernels in a group increases and the whole calculation for determining the feature image is carried out, the places in the calculation where the amount of computation can be reduced increase greatly, which ultimately speeds up determination of the feature image.
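The saving can be checked numerically; the sketch below (illustrative values, not from the patent) evaluates the first-row, first-column element of the feature image both ways, using the weight arrangement of the Fig. 8 example:

```python
# Compare the uncombined evaluation (27 multiplications) with the
# like-term-combined evaluation (9 multiplications) of the same sum.

w = [0.5, -1.0, 2.0, 1.5, 0.25, -0.75, 3.0, -2.0, 1.0]   # W0..W8
a = list(range(9))                                        # patch of image 1
b = list(range(9, 18))                                    # patch of image 2
c = list(range(18, 27))                                   # patch of image 3

# Index of the weight multiplying each patch element, matching the shifted
# kernels of the example above
idx_a = [0, 1, 2, 3, 4, 5, 6, 7, 8]
idx_b = [2, 0, 1, 5, 3, 4, 8, 6, 7]
idx_c = [1, 2, 0, 4, 5, 3, 7, 8, 6]

# Uncombined form: one multiplication per term, 27 in total
naive = (sum(w[idx_a[i]] * a[i] for i in range(9)) +
         sum(w[idx_b[i]] * b[i] for i in range(9)) +
         sum(w[idx_c[i]] * c[i] for i in range(9)))

# Combined form: first accumulate the inputs that share a weight
# (additions only), then multiply each of the 9 weights exactly once
coeff = [0.0] * 9
for i in range(9):
    coeff[idx_a[i]] += a[i]
    coeff[idx_b[i]] += b[i]
    coeff[idx_c[i]] += c[i]
combined = sum(w[k] * coeff[k] for k in range(9))

assert abs(naive - combined) < 1e-9   # same value, a third of the multiplications
```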
Optionally, after summing the plurality of intermediate matrices to obtain a feature image, the method provided in this embodiment further includes: when the output result of the convolutional neural network model is obtained, determining the adjustment value of each element in each convolutional kernel in at least one group of convolutional kernels according to the output result of the convolutional neural network model and a preset output result; determining the sum of the adjustment values of the same elements contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element; each convolution kernel is adjusted based on the modified adjustment value for each element.
In implementation, a plurality of convolutional layers exist in the convolutional neural network model: the first to the (Z-1)-th convolutional layers output feature images, and the last, i.e. the Z-th, convolutional layer outputs the final output result. When the output result of the convolutional neural network model is obtained, because the model is still being trained, there is generally an error between the output result and a preset output result. Based on the error of the entire convolutional neural network model, the adjustment value of each element in each convolution kernel in the plurality of groups of convolution kernels can be determined. Then, the sum of the adjustment values of the same element contained in different convolution kernels of the same group is determined as the corrected adjustment value corresponding to that element. For example, with respect to fig. 8, assuming that convolution kernel 1, convolution kernel 2 and convolution kernel 3 perform convolution calculations on 3×3 adjacent elements of input image 1, input image 2 and input image 3 over 3 channels, respectively, then, as shown in fig. 9, the corrected adjustment values corresponding to the adjustment values of the same elements are calculated by the following formulas:
[Formulas (1) to (9) for the corrected adjustment values are rendered as images in the original publication (PCTCN2017117503-APPB-000001 to PCTCN2017117503-APPB-000009) and are not reproduced here.]
where Δw is the corrected adjustment value corresponding to the adjustment values of the same element; WH is the product of the width and the height of the feature image; δRk is the sensitivity, in which R denotes the R-th feature image of the target processing layer; and w_size² is the product of the width of the convolution kernel and the height of the convolution kernel.
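The grouping of adjustment values can be illustrated with a small sketch. Because the exact formulas above are rendered as images in the original publication, the normalization factors (WH, δRk, w_size²) are omitted here; the code shows only the core idea, under that stated assumption, that the gradients of the same base element, which sits at a different (shifted) position in each kernel of the group, are aligned and summed. The gradient values are hypothetical.

```python
# Hypothetical per-kernel gradients for a group of 3 kernels (3x3, row-major),
# as backpropagation would produce one gradient per shifted copy.
g = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],  # kernel 1 (shift 0)
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # kernel 2 (shift 1)
    [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],  # kernel 3 (shift 2)
]

def corrected_adjustments(grads):
    """Sum the adjustment values of the same base element across the group.

    Kernel s holds base element (r, j) at column (j + s) % 3 of row r, so
    each gradient is first aligned back to the base layout, then summed.
    """
    dw = []
    for r in range(3):
        for j in range(3):
            dw.append(sum(grads[s][3 * r + (j + s) % 3] for s in range(3)))
    return dw

dw = corrected_adjustments(g)  # one corrected adjustment value per base element
```

For instance, the corrected value for base element (0, 0) sums g1[0][0], g2[0][1] and g3[0][2], matching the element's shifted positions in the group.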
Finally, the method provided in this embodiment was tested. Specifically, image recognition training was performed on the Cifar10 data set, with the convolutional neural network model designed as a 3-layer model and a convolution kernel size of 5×5 in each layer. The test results obtained are shown in the following table:
TABLE 1
[Table 1 is rendered as an image in the original publication (PCTCN2017117503-APPB-000010).]
The method provided in this embodiment was also tested in the field of image super-resolution: the convolutional neural network model was trained to magnify an original image into a new image 3 times its size. The model was designed as a 3-layer model with a convolution kernel size of 5×5. The test results obtained are shown in the following table:
TABLE 2
[Table 2 is rendered as an image in the original publication (PCTCN2017117503-APPB-000011).]
PSNR (peak signal-to-noise ratio) is a commonly used metric in image super-resolution applications; the higher the PSNR, the better the super-resolution effect. BaseHisrcnn is a convolutional neural network structure applied to image super-resolution.
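PSNR is straightforward to compute. The following sketch shows the standard definition, 10·log10(peak²/MSE), applied to two hypothetical flat pixel lists; the values are illustrative only and are not taken from the tables above.

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Two small hypothetical 'images' as flat pixel lists.
ref = [52, 55, 61, 59, 79, 61, 76, 61]
out = [54, 55, 60, 59, 78, 62, 76, 60]
print(round(psnr(ref, out), 2))  # → 48.13
```

Here the mean squared error is exactly 1.0, so the PSNR reduces to 20·log10(255) ≈ 48.13 dB.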
The method provided by this embodiment includes: acquiring a plurality of input images; generating a plurality of groups of convolution kernels, where different convolution kernels in the same group contain the same elements arranged in different orders; and determining at least one feature image corresponding to the plurality of input images. Because different convolution kernels of a group contain the same elements arranged in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, the amount of calculation generated when the convolutional layers determine the feature images is reduced, and the system operation resources consumed in the calculation process are reduced.
Yet another exemplary embodiment of the present disclosure provides an apparatus for determining a feature image in a convolutional neural network model, as shown in fig. 10, the apparatus including:
an obtaining module 1010, configured to obtain a plurality of input images of a target processing layer in a convolutional neural network model, and obtain at least one group of convolution kernels of the target processing layer, where different convolution kernels in the same group contain the same elements arranged in different orders. The obtaining module 1010 may specifically implement the obtaining functions in step S210 and step S220, as well as other implicit steps.
A determining module 1020, configured to perform convolution calculation on different input images respectively based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and sum the plurality of intermediate matrices to obtain a feature image, where each element of the intermediate matrices is a polynomial obtained by multiplying and adding the corresponding convolution kernel and input image during the convolution calculation. The determining module 1020 may specifically implement the function in step S230, as well as other implicit steps.
Optionally, the determining module 1020 is configured to add the polynomials of like-positioned elements in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combine like terms in the polynomial corresponding to each element of the feature image; and evaluate each polynomial in which the like terms have been combined, to obtain the feature image.
Optionally, the apparatus further comprises:
a generating module, configured to randomly generate N convolution kernels, where N is a preset number of groups;
and a shifting module, configured to perform element shifting on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, where the M-1 convolution kernels together with the convolution kernel before element shifting form one group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels in a group.
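The generating and shifting modules can be sketched as follows. This is a minimal illustration under stated assumptions (column-wise cyclic shifts only, a fixed 3×3 kernel size, and hypothetical function names), not the patented implementation.

```python
import random

def make_kernel_group(size=3, m=3, seed=0):
    """Generate one group: a random base kernel plus m-1 cyclic shifts.

    Here each extra kernel shifts every row of the base kernel right by
    s columns; the embodiment also allows shifting in units of rows, or both.
    """
    rng = random.Random(seed)
    base = [[rng.random() for _ in range(size)] for _ in range(size)]
    group = [base]
    for s in range(1, m):
        shifted = [[row[(j - s) % size] for j in range(size)] for row in base]
        group.append(shifted)
    return group

group = make_kernel_group()
# Every kernel in the group contains the same elements, differently arranged,
# so only the base kernel needs to be stored.
```

Storing only the base kernel and the shift amounts is what reduces the storage and read cost described above.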
Optionally, the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
Optionally, the determining module 1020 is further configured to, when the output result of the convolutional neural network model is obtained, determine an adjustment value of each element in each convolution kernel in the at least one group of convolution kernels according to the output result of the convolutional neural network model and a preset output result, and determine the sum of the adjustment values of the same element contained in different convolution kernels of the same group as a corrected adjustment value corresponding to the adjustment value of that element.
The apparatus further comprises an adjusting module, configured to adjust each convolution kernel based on the corrected adjustment value of each element.
It should be noted that the obtaining module 1010 and the determining module 1020 may be implemented by a processor, by a processor and a memory, or by a processor executing program instructions stored in a memory.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Because different convolution kernels of a group contain the same elements arranged in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, the amount of calculation generated when the convolutional layers determine the feature images is reduced, and the system operation resources consumed in the calculation process are reduced.
It should be noted that: the apparatus for determining a feature image in a convolutional neural network model provided in the above embodiment is only illustrated by dividing the above functional modules when determining the feature image, and in practical applications, the function distribution may be completed by different functional modules according to needs, that is, the internal structure of the terminal is divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for determining a feature image in a convolutional neural network model provided in the above embodiment and the method embodiment for determining a feature image in a convolutional neural network model belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment and are not described herein again.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (17)

  1. A method of determining a feature image in a convolutional neural network model, the method comprising:
    acquiring a plurality of input images of a target processing layer in a convolutional neural network model;
    acquiring at least one group of convolution kernels of the target processing layer, wherein different convolution kernels in the same group contain the same elements arranged in different orders; and
    performing convolution calculation on different input images respectively based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and summing the plurality of intermediate matrices to obtain a feature image, wherein each element of the intermediate matrices is a polynomial obtained by multiplying and adding the corresponding convolution kernel and input image during the convolution calculation.
  2. The method of claim 1, wherein summing the plurality of intermediate matrices results in a feature image, comprising:
    adding polynomials of like-positioned elements in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image;
    combining like terms in the polynomial corresponding to each element of the feature image; and
    evaluating each polynomial in which the like terms have been combined, to obtain the feature image.
  3. The method of claim 1, wherein prior to obtaining at least one set of convolution kernels for the target processing layer, the method further comprises:
    randomly generating N convolution kernels, wherein N is a preset group number;
    and performing element shifting on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels together with the convolution kernel before element shifting form one group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels in a group.
  4. The method of claim 1, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
  5. The method of claim 1, wherein after summing the plurality of intermediate matrices to obtain a feature image, the method further comprises:
    when the output result of the convolutional neural network model is obtained, determining the adjustment value of each element in each convolutional kernel in the at least one group of convolutional kernels according to the output result of the convolutional neural network model and a preset output result;
    determining the sum of the adjustment values of the same elements contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element;
    adjusting each convolution kernel based on the corrected adjustment value of each element.
  6. An apparatus for determining a feature image in a convolutional neural network model, the apparatus comprising:
    an acquisition module, configured to acquire a plurality of input images of a target processing layer in a convolutional neural network model, and acquire at least one group of convolution kernels of the target processing layer, wherein different convolution kernels in the same group contain the same elements arranged in different orders; and
    a determining module, configured to perform convolution calculation on different input images respectively based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and sum the plurality of intermediate matrices to obtain the feature image, wherein each element of the intermediate matrices is a polynomial obtained by multiplying and adding the corresponding convolution kernel and input image during the convolution calculation.
  7. The apparatus of claim 6, wherein the determining module is configured to add polynomials of like-positioned elements in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combine like terms in the polynomial corresponding to each element of the feature image; and evaluate each polynomial in which the like terms have been combined, to obtain the feature image.
  8. The apparatus of claim 6, further comprising:
    the generating module is used for randomly generating N convolution kernels, wherein N is a preset group number;
    and a shifting module, configured to perform element shifting on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels together with the convolution kernel before element shifting form one group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels in a group.
  9. The apparatus of claim 6, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
  10. The apparatus according to claim 6, wherein the determining module is further configured to determine, when the output result of the convolutional neural network model is obtained, an adjustment value of each element in each convolution kernel in the at least one set of convolution kernels according to the output result of the convolutional neural network model and a preset output result; determining the sum of the adjustment values of the same elements contained in different convolution kernels in the same group as a corrected adjustment value corresponding to the adjustment value of the same element;
    the apparatus further comprises an adjustment module:
    and the adjusting module is used for adjusting each convolution kernel based on the corrected adjusting value of each element.
  11. A terminal, characterized in that the terminal comprises a processor and a memory, wherein:
    the processor is configured to acquire a plurality of input images of a target processing layer in the convolutional neural network model stored in the memory; acquire at least one group of convolution kernels of the target processing layer stored in the memory, wherein different convolution kernels in the same group contain the same elements arranged in different orders; and perform convolution calculation on different input images respectively based on each convolution kernel in the at least one group of convolution kernels to obtain a plurality of intermediate matrices, and sum the plurality of intermediate matrices to obtain a feature image, wherein each element of the intermediate matrices is a polynomial obtained by multiplying and adding the corresponding convolution kernel and input image during the convolution calculation.
  12. The terminal of claim 11, wherein the processor is configured to add polynomials of like-positioned elements in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; combine like terms in the polynomial corresponding to each element of the feature image; and evaluate each polynomial in which the like terms have been combined, to obtain the feature image.
  13. The terminal of claim 11, wherein the processor is further configured to randomly generate N convolution kernels, wherein N is a preset number of groups; and perform element shifting on each of the N convolution kernels in units of rows and/or in units of columns to obtain M-1 different convolution kernels, wherein the M-1 convolution kernels together with the convolution kernel before element shifting form one group of convolution kernels of the target processing layer, and M is a preset number of convolution kernels in a group.
  14. The terminal of claim 11, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and columns of convolution kernels.
  15. The terminal of claim 11, wherein the processor is further configured to, when the output result of the convolutional neural network model is obtained, determine an adjustment value of each element in each convolution kernel in the at least one group of convolution kernels according to the output result of the convolutional neural network model and a preset output result; determine the sum of the adjustment values of the same element contained in different convolution kernels of the same group as a corrected adjustment value corresponding to the adjustment value of that element; and adjust each convolution kernel based on the corrected adjustment value of each element.
  16. A computer-readable storage medium comprising instructions that, when executed on a terminal, cause the terminal to perform the method of any of claims 1-5.
  17. A computer program product comprising instructions for causing a terminal to perform the method of any of claims 1-5 when the computer program product is run on the terminal.
CN201780096076.0A 2017-12-20 2017-12-20 Method and device for determining characteristic images in convolutional neural network model Active CN111247527B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/117503 WO2019119301A1 (en) 2017-12-20 2017-12-20 Method and device for determining feature image in convolutional neural network model

Publications (2)

Publication Number Publication Date
CN111247527A true CN111247527A (en) 2020-06-05
CN111247527B CN111247527B (en) 2023-08-22

Family

ID=66992897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780096076.0A Active CN111247527B (en) 2017-12-20 2017-12-20 Method and device for determining characteristic images in convolutional neural network model

Country Status (2)

Country Link
CN (1) CN111247527B (en)
WO (1) WO2019119301A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767928A (en) * 2020-06-28 2020-10-13 中国矿业大学 Method and device for extracting image characteristic information based on convolutional neural network
CN112149694A (en) * 2020-08-28 2020-12-29 特斯联科技集团有限公司 Image processing method, system, storage medium and terminal based on convolutional neural network pooling module
CN115861043A (en) * 2023-02-16 2023-03-28 深圳市旗云智能科技有限公司 Image data processing method and system based on artificial intelligence

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541565B (en) * 2019-09-20 2023-08-29 腾讯科技(深圳)有限公司 Convolution calculation data stream mapping method and device
CN110807170B (en) * 2019-10-21 2023-06-27 中国人民解放军国防科技大学 Method for realizing Same convolution vectorization of multi-sample multi-channel convolution neural network
CN112733585B (en) * 2019-10-29 2023-09-05 杭州海康威视数字技术股份有限公司 image recognition method
CN112766474B (en) * 2019-11-04 2024-03-22 北京地平线机器人技术研发有限公司 Method, device, medium and electronic equipment for realizing convolution operation
CN110929623A (en) * 2019-11-15 2020-03-27 北京达佳互联信息技术有限公司 Multimedia file identification method, device, server and storage medium
KR20210082970A (en) 2019-12-26 2021-07-06 삼성전자주식회사 A method and an apparatus for performing convolution operations
CN113052756A (en) * 2019-12-27 2021-06-29 武汉Tcl集团工业研究院有限公司 Image processing method, intelligent terminal and storage medium
CN111241993B (en) * 2020-01-08 2023-10-20 咪咕文化科技有限公司 Seat number determining method and device, electronic equipment and storage medium
CN111414995B (en) * 2020-03-16 2023-05-19 北京君立康生物科技有限公司 Detection processing method and device for micro-target colony, electronic equipment and medium
CN113919405B (en) * 2020-07-07 2024-01-19 华为技术有限公司 Data processing method and device and related equipment
CN114090470B (en) * 2020-07-29 2023-02-17 深圳市中科元物芯科技有限公司 Data preloading device and preloading method thereof, storage medium and computer equipment
CN112016740B (en) * 2020-08-18 2024-06-18 京东科技信息技术有限公司 Data processing method and device
CN112132279B (en) * 2020-09-23 2023-09-15 平安科技(深圳)有限公司 Convolutional neural network model compression method, device, equipment and storage medium
CN116295188B (en) * 2023-05-15 2023-08-11 山东慧点智能技术有限公司 Measuring device and measuring method based on displacement sensor

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077233A (en) * 2014-06-18 2014-10-01 百度在线网络技术(北京)有限公司 Single-channel convolution layer and multi-channel convolution layer handling method and device
CN104915322A (en) * 2015-06-09 2015-09-16 中国人民解放军国防科学技术大学 Method for accelerating convolution neutral network hardware and AXI bus IP core thereof
CN106326985A (en) * 2016-08-18 2017-01-11 北京旷视科技有限公司 Neural network training method, neural network training device, data processing method and data processing device
CN106682736A (en) * 2017-01-18 2017-05-17 北京小米移动软件有限公司 Image identification method and apparatus
CN107491787A (en) * 2017-08-21 2017-12-19 珠海习悦信息技术有限公司 Local binarization CNN processing method, device, storage medium and processor

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016064703A1 (en) * 2014-10-21 2016-04-28 Life Technologies Corporation Methods, systems, and computer-readable media for blind deconvolution dephasing of nucleic acid sequencing data
CN106156781B (en) * 2016-07-12 2019-09-10 北京航空航天大学 Sort convolutional neural networks construction method and its image processing method and device
CN106447030B (en) * 2016-08-30 2021-09-21 深圳市诺比邻科技有限公司 Method and system for optimizing computing resources of convolutional neural network
CN106778584B (en) * 2016-12-08 2019-07-16 南京邮电大学 A kind of face age estimation method based on further feature Yu shallow-layer Fusion Features
CN106971174B (en) * 2017-04-24 2020-05-22 华南理工大学 CNN model, CNN training method and CNN-based vein identification method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077233A (en) * 2014-06-18 2014-10-01 百度在线网络技术(北京)有限公司 Single-channel convolution layer and multi-channel convolution layer handling method and device
US20150371359A1 (en) * 2014-06-18 2015-12-24 Baidu Online Network Technology (Beijing) Co., Ltd Processing method and apparatus for single-channel convolution layer, and processing method and apparatus for multi-channel convolution layer
CN104915322A (en) * 2015-06-09 2015-09-16 中国人民解放军国防科学技术大学 Method for accelerating convolution neutral network hardware and AXI bus IP core thereof
CN106326985A (en) * 2016-08-18 2017-01-11 北京旷视科技有限公司 Neural network training method, neural network training device, data processing method and data processing device
CN106682736A (en) * 2017-01-18 2017-05-17 北京小米移动软件有限公司 Image identification method and apparatus
CN107491787A (en) * 2017-08-21 2017-12-19 珠海习悦信息技术有限公司 Local binarization CNN processing method, device, storage medium and processor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
H. Yalcin and S. Razavi: "Plant classification using convolutional neural networks", 2016 Fifth International Conference on Agro-Geoinformatics (Agro-Geoinformatics), pages 1-5 *
李子强 (Li Ziqiang): "A Novel Image Feature Extraction Method Based on Loss-Function-Free Deep Convolutional Neural Networks", China Master's Theses Full-text Database, Information Science and Technology, no. 9, pages 7-40 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767928A (en) * 2020-06-28 2020-10-13 中国矿业大学 Method and device for extracting image characteristic information based on convolutional neural network
CN111767928B (en) * 2020-06-28 2023-08-08 中国矿业大学 Method and device for extracting image characteristic information based on convolutional neural network
CN112149694A (en) * 2020-08-28 2020-12-29 特斯联科技集团有限公司 Image processing method, system, storage medium and terminal based on convolutional neural network pooling module
CN112149694B (en) * 2020-08-28 2024-04-05 特斯联科技集团有限公司 Image processing method, system, storage medium and terminal based on convolutional neural network pooling module
CN115861043A (en) * 2023-02-16 2023-03-28 深圳市旗云智能科技有限公司 Image data processing method and system based on artificial intelligence
CN115861043B (en) * 2023-02-16 2023-05-16 深圳市旗云智能科技有限公司 Image data processing method and system based on artificial intelligence

Also Published As

Publication number Publication date
WO2019119301A1 (en) 2019-06-27
CN111247527B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN111247527B (en) Method and device for determining characteristic images in convolutional neural network model
US11816532B2 (en) Performing kernel striding in hardware
EP3373210B1 (en) Transposing neural network matrices in hardware
CN112214726B (en) Operation accelerator
KR102415576B1 (en) Method and system for reducing computational complexity of convolutional neural networks
CN109324827B (en) Apparatus, method and system for processing instructions for accessing data
US10083395B2 (en) Batch processing in a neural network processor
US20180341479A1 (en) Accessing data in multi-dimensional tensors using adders
TWI678617B (en) "system, computer-implemented method, and apparatus for accessing data in multi-dimensional tensors using adders"
JP2020524318A (en) Alternate loop limit
CN111465924A (en) System and method for converting matrix input to vectorized input for a matrix processor
CN109313663B (en) Artificial intelligence calculation auxiliary processing device, method, storage medium and terminal
US11580194B2 (en) Information processing apparatus, information processing method, and program
CN106373112B (en) Image processing method and device and electronic equipment
US10963746B1 (en) Average pooling in a neural network
CN111723906A (en) Accelerated calculation method and system of recurrent neural network and related device
CN112970036A (en) Convolution block array for implementing neural network applications, method of using the same, and convolution block circuit
US20210303987A1 (en) Power reduction for machine learning accelerator background
CN114741650A (en) Tensor calculation device, data processor, tensor calculation method, and storage medium
GB2567038B (en) Accessing prologue and epilogue data
CN111985628A (en) Computing device and neural network processor including the same
EP4113278B1 (en) Circuit for handling processing with outliers
EP4361892A1 (en) Methods and systems for performing a per channel affine transformation using a neural network accelerator
US20240192922A1 (en) System and method for handling processing with sparse weights and outliers
CN116955906A (en) Hamiltonian matrix orthogonal normalization method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant