CN114239814A - Training method of convolution neural network model for image processing - Google Patents

Training method of convolution neural network model for image processing

Info

Publication number
CN114239814A
CN114239814A CN202210174146.2A CN202210174146A
Authority
CN
China
Prior art keywords
convolution kernel
convolution
original
kernels
parameters
Prior art date
Legal status
Granted
Application number
CN202210174146.2A
Other languages
Chinese (zh)
Other versions
CN114239814B (en)
Inventor
艾国
杨作兴
房汝明
向志宏
Current Assignee
Hangzhou Yanji Microelectronics Co ltd
Original Assignee
Hangzhou Yanji Microelectronics Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Yanji Microelectronics Co ltd filed Critical Hangzhou Yanji Microelectronics Co ltd
Priority to CN202210174146.2A priority Critical patent/CN114239814B/en
Publication of CN114239814A publication Critical patent/CN114239814A/en
Application granted granted Critical
Publication of CN114239814B publication Critical patent/CN114239814B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a training method of a convolutional neural network model for image processing. A method of training a convolutional neural network model for image processing, the method comprising: constructing a convolutional neural network model to be trained, wherein the parameters of the convolutional neural network model comprise one or more original convolutional kernels and one or more groups of convolutional kernel generation parameters; training the one or more original convolution kernels and the one or more sets of convolution kernel generation parameters using a training set image, wherein in the training, one derivative convolution kernel is generated based on at least a portion of the original convolution kernels using each set of convolution kernel generation parameters, and image features of the training set image are convolved using the one or more original convolution kernels and the generated one or more derivative convolution kernels.

Description

Training method of convolution neural network model for image processing
Technical Field
The present disclosure relates to image processing on the smart device side, and more particularly to a training method of a convolutional neural network model for image processing on the smart device side, an image processing method using a convolutional neural network model thus trained, a computer storage medium having the above methods stored thereon, an image processing apparatus implementing the above methods, and a smart device including the image processing apparatus.
Background
Currently, there is a wide need for image processing techniques in smart devices (e.g., cell phones, tablet computers, smart cameras, smart gates, etc.). For example, in a smart camera, image processing is required to realize functions such as face recognition and beautification.
The convolutional neural network models adopted by existing image processing methods are large, and a large number of parameters need to be stored in memory. However, the memory space of a smart device is usually tight, and it is therefore desirable to occupy less memory space when performing image processing on the smart device side.
Therefore, there is a need to improve the training method of the convolutional neural network model and the corresponding image processing method, so as to miniaturize the trained convolutional neural network model and reduce the memory space and the computing resources occupied by the trained convolutional neural network model at the intelligent device side.
Disclosure of Invention
It is an object of the present disclosure to provide a method of training a convolutional neural network model for image processing.
According to one aspect of the present disclosure, there is provided a training method of a convolutional neural network model for image processing, the method comprising: constructing a convolutional neural network model to be trained, wherein the parameters of the convolutional neural network model comprise one or more original convolutional kernels and one or more groups of convolutional kernel generation parameters; training the one or more original convolution kernels and the one or more sets of convolution kernel generation parameters using a training set image, wherein in the training, one derivative convolution kernel is generated based on at least a portion of the original convolution kernels using each set of convolution kernel generation parameters, and image features of the training set image are convolved using the one or more original convolution kernels and the generated one or more derivative convolution kernels.
According to another aspect of the present disclosure, there is provided an image processing method, characterized in that the method includes: obtaining a convolutional neural network model trained according to the method; generating a corresponding derivative convolution kernel based on each set of convolution kernel generation parameters in the trained convolutional neural network model and at least a portion of the one or more original convolution kernels; and performing convolution processing on the image characteristics of the image to be processed by using the one or more original convolution kernels and the generated one or more derivative convolution kernels.
According to another aspect of the present disclosure, there is provided a computer storage medium having stored thereon executable instructions that, when executed, are capable of implementing the above-described method.
According to another aspect of the present disclosure, there is provided an image processing apparatus characterized in that the apparatus is capable of implementing the above method.
According to another aspect of the present disclosure, a smart device is provided, wherein the smart device includes the above apparatus.
Other features of the present disclosure and advantages thereof will become more apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
fig. 1 shows a schematic diagram of a convolutional neural network model for image processing in the prior art.
Fig. 2 illustrates a flow diagram of a method of training a convolutional neural network model for image processing in accordance with at least one embodiment of the present disclosure.
Fig. 3 illustrates a schematic diagram of a convolutional neural network model for training in accordance with at least one embodiment of the present disclosure.
Fig. 4 illustrates a flow diagram of an image processing method according to at least one embodiment of the present disclosure.
Note that in the embodiments described below, the same reference numerals are used in common between different drawings to denote the same portions or portions having the same functions, and a repetitive description thereof will be omitted. In some cases, similar reference numbers and letters are used to denote similar items, and thus, once an item is defined in one figure, it need not be discussed further in subsequent figures.
For ease of understanding, the positions, sizes, ranges, and the like of the structures shown in the drawings sometimes do not represent the actual positions, sizes, ranges, and the like. Therefore, the present disclosure is not limited to the positions, dimensions, ranges, and the like disclosed in the drawings.
Detailed Description
Various exemplary embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. That is, the structures and methods herein are shown by way of example to illustrate different embodiments of the structures and methods of the present disclosure. Those skilled in the art will understand, however, that they are merely illustrative of exemplary ways in which the disclosure may be practiced and are not exhaustive. Furthermore, the figures are not necessarily drawn to scale, and some features may be exaggerated to show details of particular components.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
Fig. 1 shows a schematic diagram of a prior art convolutional neural network model 100 for image processing.
The convolutional neural network model 100 generally comprises a plurality of convolutional layers, the structure of one of which is schematically shown in fig. 1. Each convolutional layer includes convolution kernels 121, 122, 123 for performing convolution processing on the image features 110 input to that convolutional layer to generate output image features 130. The input image features 110 are input to this convolutional layer from the convolutional layer of the previous stage (or are input to this convolutional layer as the input features of the image to be processed), and the output image features 130 are output to the convolutional layer of the next stage (or are output as the output features of the image to be processed).
In the example of fig. 1, the input image features 110 include 4 channels with a resolution of 6 × 6; the output image features 130 include 3 channels with a resolution of 4 x 4; the convolution kernels 121, 122, 123 each include 4 channels with a resolution of 3 x 3. The number of channels of the convolution kernels 121, 122 and 123 is the same as the number of channels of the input image feature 110, and the number of the convolution kernels 121, 122 and 123 is the same as the number of channels of the output image feature 130.
When the convolution kernels 121, 122, 123 are used to perform convolution processing on the input image features 110, each convolution kernel is convolved with the input image features 110 to obtain the feature values of one corresponding channel of the output image features 130. For example, the feature values (aa₁, ab₁, …, dd₁) of the 1st channel of the output image feature 130 are obtained by performing convolution processing on the input image feature 110 using the convolution kernel 121.
More specifically, a convolution kernel is first multiply-added with a portion of the feature values of the image feature 110 to obtain one feature value of the corresponding channel of the image feature 130. Then, a similar multiply-add operation is performed on the convolution kernel and another portion of the feature values of the image feature 110 to obtain another feature value of that channel of the image feature 130. By analogy, the process of convolving the image feature 110 with the convolution kernel is completed. A multiply-add operation, as referred to herein, multiplies corresponding values together and then adds the products.
For example, when the convolution kernel 121 is used to perform convolution processing on the image feature 110, the convolution kernel 121 may first be multiply-added with the feature values aa₁, ab₁, ac₁, ba₁, bb₁, bc₁, ca₁, cb₁, cc₁; aa₂, ab₂, ac₂, …, ca₂, cb₂, cc₂; aa₃, ab₃, ac₃, …, ca₄, cb₄, cc₄ of the image feature 110 to obtain the feature value aa₁ of the 1st channel of the image feature 130; the convolution kernel 121 is then multiply-added with the feature values ab₁, ac₁, ad₁, bb₁, bc₁, bd₁, cb₁, cc₁, cd₁; ab₂, ac₂, ad₂, …, cb₄, cc₄, cd₄ of the image feature 110 to obtain the feature value ab₁ of the 1st channel of the image feature 130; and so on, until the convolution kernel 121 is multiply-added with the feature values dd₁, de₁, df₁, ed₁, ee₁, ef₁, fd₁, fe₁, ff₁; dd₂, de₂, df₂, …, fd₄, fe₄, ff₄ of the image feature 110 to obtain the feature value dd₁ of the 1st channel of the image feature 130. In this way, all feature values of the 1st channel of the image feature 130 are obtained.
For descriptive convenience, a vector composed of the values at the same coordinate in the different channels of the image features 110, 130 or of the convolution kernels 121, 122, 123 (each of which includes a plurality of channels) will hereinafter be referred to as a channel vector. For example, the vector composed of the feature values aa₁, aa₂, aa₃, aa₄ at coordinate (a, a) (i.e., at row 1, column 1) in the different channels of the image feature 110 is called the channel vector u_aa, the vector composed of the parameters AA₁, AA₂, AA₃, AA₄ at coordinate (A, A) (i.e., at row 1, column 1) in the different channels of the convolution kernel 121 is called the channel vector v_AA, and so on.
When a convolution kernel is multiply-added with a portion of the feature values of the image features 110, the dot product of each channel vector of the convolution kernel with the corresponding channel vector of the image features 110 is calculated, and the sum of these dot products is taken as the feature value at the corresponding coordinate of the corresponding channel of the output image features 130.
For example, when the convolution kernel 121 is multiply-added with the feature values aa₁, ab₁, ac₁, …, ca₁, cb₁, cc₁; aa₂, ab₂, ac₂, …, ca₄, cb₄, cc₄ of the image feature 110, for each channel vector of the convolution kernel 121, namely v_AA = [AA₁, AA₂, AA₃, AA₄], v_AB = [AB₁, AB₂, AB₃, AB₄], v_AC = [AC₁, AC₂, AC₃, AC₄], …, v_CC = [CC₁, CC₂, CC₃, CC₄], the dot product with the corresponding channel vector of the image feature 110, namely u_aa = [aa₁, aa₂, aa₃, aa₄], u_ab = [ab₁, ab₂, ab₃, ab₄], u_ac = [ac₁, ac₂, ac₃, ac₄], …, u_cc = [cc₁, cc₂, cc₃, cc₄], is calculated; that is, u_aa·v_AA, u_ab·v_AB, u_ac·v_AC, …, u_cc·v_CC are calculated respectively. The sum of these dot products is then taken as the feature value aa₁ at coordinate (a, a) of the 1st channel of the image feature 130. Similarly, when the convolution kernel 121 is multiply-added with the feature values ab₁, ac₁, ad₁, …, cb₁, cc₁, cd₁; ab₂, ac₂, ad₂, …, cb₄, cc₄, cd₄ of the image feature 110, the dot products u_ab·v_AA, u_ac·v_AB, u_ad·v_AC, …, u_cd·v_CC are calculated, and their sum is taken as the feature value ab₁ at coordinate (a, b) of the 1st channel of the image feature 130. And so on: the dot products u_dd·v_AA, u_de·v_AB, u_df·v_AC, …, u_ff·v_CC are calculated and their sum is taken as the feature value dd₁ at coordinate (d, d) of the 1st channel of the image feature 130. In this way, all feature values of the 1st channel of the image feature 130 are obtained.
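For concreteness, the multiply-add described above can be written as a short sketch using the example sizes of fig. 1 (the array names and the use of random values are illustrative assumptions, not part of the disclosure):

```python
import numpy as np

x = np.random.randn(4, 6, 6)        # image feature 110, indexed [channel, row, col]
k = np.random.randn(4, 3, 3)        # convolution kernel 121

out = np.zeros((4, 4))              # the 1st channel of image feature 130
for r in range(4):
    for c in range(4):
        # sum of the dot products of the kernel's channel vectors with the
        # corresponding channel vectors of the 3x3 window of the input
        window = x[:, r:r + 3, c:c + 3]
        out[r, c] = np.sum(window * k)
```

Repeating this with kernels 122 and 123 yields the remaining two channels of the output image feature 130.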
It can be seen that the parameters of each level of convolutional layer of the convolutional neural network model 100 of the prior art include the parameters of the convolutional kernels 121, 122, 123, and the number of parameters of each level of convolutional layer is equal to the product of the number of convolutional kernels 121, 122, 123 and the number of channels and resolution of each convolutional kernel. In the example of fig. 1, this level of convolutional layers includes 108 parameters, which is equal to the product of the number of convolutional kernels (3) and the number of channels per convolutional kernel (4) and the resolution (3 × 3= 9). During training, the parameters need to be trained; in the subsequent image processing, these parameters need to be stored in a memory space.
It should be noted that fig. 1 only schematically illustrates the structure of the convolutional neural network model 100. In practical applications, the number of convolution kernels and the number of channels in each level of convolution layer of the convolutional neural network model 100, as well as the number of channels and resolution of the image features processed by it, are generally much larger than in the example of fig. 1. Thus, each convolutional layer of the prior art convolutional neural network model 100 includes a large number of parameters. In image processing, these parameters require a large amount of memory space.
However, on the smart device side, memory space is often tight. Therefore, there is a need for an improved image processing method that achieves a similar effect using less memory space when performing image processing on the smart device side; that is, at the same accuracy, the number of model parameters should be reduced so as to miniaturize the model.
Fig. 2 illustrates a flow diagram of a method 200 of training a convolutional neural network model for image processing in accordance with at least one embodiment of the present disclosure. The method 200 may be used to train an improved convolutional neural network model that is particularly suited for image processing on the smart device side.
At step 201, the method 200 begins.
At step 202, a convolutional neural network model to be trained is constructed. Wherein the parameters of the convolutional neural network model include one or more original convolution kernels and one or more sets of convolution kernel generation parameters. For example, the number of parameters for each set of convolution kernel generation parameters may be the same as the number of original convolution kernels.
At step 204, one or more original convolution kernels and one or more sets of convolution kernel generation parameters are trained using a training set image. Wherein at step 206, a derivative convolution kernel is generated based on at least a portion of the original convolution kernels using each set of convolution kernel generation parameters. For example, each convolution kernel generation parameter in each set of convolution kernel generation parameters may correspond to an original convolution kernel representing a weight of the original convolution kernel at the time the corresponding derivative convolution kernel was generated. At step 208, the image features of the training set images are convolved with the one or more original convolution kernels and the generated one or more derivative convolution kernels.
At step 210, the method 200 ends.
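As a hedged sketch of steps 202-208, one convolutional layer of the model to be trained could be prototyped as follows, with one derivative convolution kernel generated from two original convolution kernels; PyTorch, the class name, and the parameter shapes are illustrative assumptions and not prescribed by the disclosure:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DerivedKernelConv(nn.Module):
    """One convolutional layer whose trainable parameters are two original
    convolution kernels plus one set of convolution kernel generation
    parameters (an illustrative sketch, not a reference implementation)."""

    def __init__(self, in_channels, kernel_size=3):
        super().__init__()
        # original convolution kernels (e.g., 321 and 322 in fig. 3)
        self.originals = nn.Parameter(
            torch.randn(2, in_channels, kernel_size, kernel_size))
        # one set of convolution kernel generation parameters (alpha1, alpha2)
        self.alphas = nn.Parameter(torch.full((2,), 0.5))

    def forward(self, x):
        # step 206: generate one derivative kernel from the original kernels
        derived = (self.alphas[0] * self.originals[0]
                   + self.alphas[1] * self.originals[1]).unsqueeze(0)
        # step 208: convolve with the two originals and the derivative kernel;
        # only self.originals and self.alphas receive gradients
        weight = torch.cat([self.originals, derived], dim=0)
        return F.conv2d(x, weight)
```

Training then proceeds with an ordinary optimizer over the module's parameters; the derivative kernel is rebuilt on the fly in every forward pass and is never stored as a separate parameter.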
The various steps in method 200 are described in more detail below in conjunction with fig. 3.
Fig. 3 illustrates a schematic diagram of a convolutional neural network model 300 for training in accordance with at least one embodiment of the present disclosure. Convolutional neural network model 300 may include multiple convolutional layers, one of which is schematically illustrated in fig. 3. In the illustrated first-level convolutional layer, the input image features 310 are convolved with original convolution kernels 321, 322 and the generated derivative convolution kernel 331, thereby generating output image features 340. Where the derivative convolution kernel 331 is generated based on the original convolution kernels 321, 322 using a set of convolution kernel generation parameters 323.
It can be seen that, similar to the prior art convolutional neural network model 100 in fig. 1, the input image features 310 are also convolved with 3 convolution kernels in the first-level convolutional layer shown in fig. 3. However, the parameters of this convolutional layer of the convolutional neural network model 300 trained according to the aforementioned training method 200 include only the original convolution kernels 321 and 322 and the convolution kernel generation parameters 323, and do not include the derivative convolution kernel 331, because the derivative convolution kernel 331 can be generated based on the original convolution kernels 321 and 322 and the convolution kernel generation parameters 323. Accordingly, as shown in fig. 2, at step 204 of method 200, training is performed only for the original convolution kernels 321, 322 and the convolution kernel generation parameters 323.
Thus, the number of parameters included in the first-level convolutional layer of the convolutional neural network model 300 trained according to the method 200 is equal to the number of convolution kernel generation parameters 323 plus the number of parameters of the original convolution kernels 321 and 322 (i.e., the product of the number of original convolution kernels, the number of channels of each original convolution kernel, and its resolution). In the example of fig. 3, this convolutional layer includes 74 parameters, equal to the number of convolution kernel generation parameters 323 (2) plus the number of parameters of the original convolution kernels 321, 322 (72), where the latter is the product of the number of original convolution kernels (2), the number of channels of each kernel (4), and the resolution of each kernel (3 × 3 = 9).
In contrast, in the prior art example shown in fig. 1, the number of parameters included in the first-level convolutional layer of the convolutional neural network model 100 is 108. In the case of likewise performing convolution processing on the input image features with 3 convolution kernels, the number of parameters included in the first-level convolutional layer of the convolutional neural network model 300 trained by the training method 200 according to the present invention is therefore about 2/3 of that in the prior art. The training method and the image processing method provided by the present invention thus greatly reduce the memory space occupied by the convolutional neural network model used in image processing, and are particularly suitable for image processing on the smart device side.
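The counts quoted above can be checked with a few lines of arithmetic, using the example sizes from fig. 1 and fig. 3:

```python
channels, k = 4, 3                      # channels and resolution of each kernel

prior_art = 3 * channels * k * k        # 3 stored kernels (fig. 1): 108 parameters
proposed = 2 * channels * k * k + 2     # 2 original kernels + 2 generation
                                        # parameters (fig. 3): 74 parameters
print(prior_art, proposed, proposed / prior_art)   # 108 74 0.685...
```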
In the example of fig. 3, a derivative convolution kernel 331 is generated based on the original convolution kernels 321, 322 using a set of convolution kernel generation parameters 323. In other embodiments, a plurality of derivative convolution kernels may be generated based on the original convolution kernels 321, 322 using a plurality of sets of convolution kernel generation parameters, respectively.
In a preferred embodiment, the number of parameters for each set of convolution kernel generation parameters may be the same as the number of original convolution kernels used to generate the corresponding derivative convolution kernels. Each convolution kernel generation parameter may correspond to an original convolution kernel representing a weight of the original convolution kernel at the time of generating the corresponding derivative convolution kernel. For example, as shown in fig. 3, a set of convolution kernel generation parameters 323 includes 2 parameters α 1 and α 2, representing the weights of the original convolution kernels 321 and 322, respectively, in generating the derivative convolution kernel 331.
In a further preferred embodiment, each derivative convolution kernel may be generated from at least a portion of the original convolution kernel linear transformation. Wherein each of the corresponding set of convolution kernel generation parameters may represent a linear transform coefficient of the corresponding original convolution kernel to the derivative convolution kernel. For example, the derivative convolution kernel 331 may be generated by linear transformation of the original convolution kernels 321 and 322, where the parameters α 1 and α 2 represent linear transformation coefficients of the original convolution kernels 321 and 322 to the derivative convolution kernel 331, respectively.
Specifically, the parameter at each coordinate of each channel of the derivative convolution kernel 331 can be obtained by a linear transformation of the parameters at the corresponding coordinate of the corresponding channels of the original convolution kernels 321 and 322, where the linear transformation coefficients applied to the parameters of the original convolution kernels 321 and 322 are α1 and α2, respectively. For example, the parameter AA₁ at coordinate (A, A) of the 1st channel of the derivative convolution kernel 331 may be equal to the parameter AA₁ at coordinate (A, A) of the 1st channel of the original convolution kernel 321 multiplied by α1, plus the parameter AA₁ at coordinate (A, A) of the 1st channel of the original convolution kernel 322 multiplied by α2.
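A minimal sketch of this per-kernel linear combination (the tensor names and the values of α1, α2 are assumed for illustration):

```python
import torch

w321 = torch.randn(4, 3, 3)        # original convolution kernel 321: [channel, row, col]
w322 = torch.randn(4, 3, 3)        # original convolution kernel 322
alpha1, alpha2 = 0.6, 0.4          # the set of convolution kernel generation parameters 323

# every parameter of derivative kernel 331 is the weighted sum of the parameters
# at the same channel and coordinate of kernels 321 and 322
w331 = alpha1 * w321 + alpha2 * w322

# e.g. the parameter at channel 1, coordinate (A, A):
assert torch.isclose(w331[0, 0, 0],
                     alpha1 * w321[0, 0, 0] + alpha2 * w322[0, 0, 0])
```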
In a preferred embodiment, the number of parameters for each set of convolution kernel generation parameters may be the same as the product of the number of original convolution kernels and their number of channels used to generate the corresponding derived convolution kernels. Each convolution kernel generation parameter may correspond to a channel of an original convolution kernel, and represents a weight of the channel of the original convolution kernel when a corresponding derivative convolution kernel is generated. For example, the set of convolution kernel generation parameters 323 may include 8 parameters α 1, α 2, …, α 8 (not shown in fig. 3), representing the weights of the 4 channels of the original convolution kernel 321 and the 4 channels of the original convolution kernel 322, respectively, in generating the derivative convolution kernel 331.
In a further preferred embodiment, each channel of each derived convolution kernel may be generated from a corresponding channel linear transformation of at least a portion of the original convolution kernel. Wherein each of the corresponding set of convolution kernel generation parameters may represent a linear transform coefficient of a corresponding channel of the corresponding original convolution kernel to the channel of the derived convolution kernel. For example, the 4 channels of the derivative convolution kernel 331 may be generated by linear transforms of the 4 channels of the original convolution kernel 321 and the 4 channels of the original convolution kernel 322, where the parameters α 1 and α 2 represent linear transform coefficients of the 1 st channel of the original convolution kernel 321 and the 1 st channel of the original convolution kernel 322 to the 1 st channel of the derivative convolution kernel 331, respectively, the parameters α 3 and α 4 represent linear transform coefficients of the 2 nd channel of the original convolution kernel 321 and the 2 nd channel of the original convolution kernel 322 to the 2 nd channel of the derivative convolution kernel 331, respectively, and so on.
Specifically, the parameter AA₁ at coordinate (A, A) of the 1st channel of the derivative convolution kernel 331 may be equal to the parameter AA₁ at coordinate (A, A) of the 1st channel of the original convolution kernel 321 multiplied by α1, plus the parameter AA₁ at coordinate (A, A) of the 1st channel of the original convolution kernel 322 multiplied by α2; the parameter AA₂ at coordinate (A, A) of the 2nd channel of the derivative convolution kernel 331 may be equal to the parameter AA₂ at coordinate (A, A) of the 2nd channel of the original convolution kernel 321 multiplied by α3, plus the parameter AA₂ at coordinate (A, A) of the 2nd channel of the original convolution kernel 322 multiplied by α4; and so on.
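A corresponding sketch for the per-channel variant (again with assumed names; the 8 generation parameters are arranged as one pair per channel):

```python
import torch

w321 = torch.randn(4, 3, 3)             # original convolution kernel 321
w322 = torch.randn(4, 3, 3)             # original convolution kernel 322
alpha = torch.rand(4, 2)                # alpha[c] = (weight for channel c of 321,
                                        #             weight for channel c of 322)

# each channel of derivative kernel 331 is its own weighted sum of the
# corresponding channels of the two original kernels
w331 = alpha[:, 0, None, None] * w321 + alpha[:, 1, None, None] * w322
```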
In some embodiments, the number of parameters for each set of convolution kernel generation parameters may be the same as the product of the number of original convolution kernels and the number of parameters included therein used to generate the corresponding derivative convolution kernel. Each convolution kernel generation parameter may correspond to a parameter of an original convolution kernel, and represents a weight of the parameter of the original convolution kernel when the corresponding derivative convolution kernel is generated. In some embodiments, each parameter of each derived convolution kernel may be generated from a corresponding parametric linear transform of at least a portion of the original convolution kernel, wherein each of the corresponding set of convolution kernel generation parameters may represent a linear transform coefficient of the corresponding parameter of the original convolution kernel to the parameter of the derived convolution kernel.
In a preferred embodiment, a normalization constraint may be applied to each set of convolution kernel generation parameters during training, so that the parameters within each set are of approximately the same magnitude. In the foregoing preferred embodiments, this means that the weights of the original convolution kernels, or of the channels of the original convolution kernels, to which the parameters in a set correspond are of approximately the same magnitude when the corresponding derivative convolution kernel is generated. In a preferred embodiment, the normalization constraint may include constraining the sum of each set of convolution kernel generation parameters to a fixed value (for example, 1 or 1.2). Further, in some embodiments, the normalization constraint may include constraining the minimum value of the parameters in each set of convolution kernel generation parameters (for example, to 0.1).
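One simple way to keep a set of generation parameters close to such a constraint during training is to rescale it after each optimizer step. The sketch below enforces only the sum constraint (a minimum value could additionally be encouraged, e.g., by clamping before rescaling); the helper name and this enforcement strategy are assumptions, since the disclosure states the constraint but not how to impose it:

```python
import torch

def renormalize(alpha, total=1.0):
    """Rescale one set of convolution kernel generation parameters so that it
    sums to `total` (a simple heuristic, not the disclosure's prescription)."""
    return alpha * (total / alpha.sum())

alphas = torch.tensor([0.3, 0.9])
print(renormalize(alphas))      # tensor([0.2500, 0.7500]), which sums to 1.0
```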
In some embodiments, a derivative convolution kernel may be generated in a non-linear manner based on at least a portion of the original convolution kernels such that the generated derivative convolution kernel is linearly independent of the original convolution kernels. Non-linear approaches have a higher computational complexity in image processing than linear approaches, but can achieve better image processing results because the corresponding channels of the output features generated based on the derivative convolution kernel are linearly independent of the corresponding channels of the output features generated based on the original convolution kernel.
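The disclosure does not name a particular nonlinearity. Purely as an illustration of the idea, a derivative kernel could be formed by passing a weighted combination of the original kernels through an elementwise nonlinear function; the choice of tanh below is an assumption, not the patent's method:

```python
import torch

w321, w322 = torch.randn(4, 3, 3), torch.randn(4, 3, 3)
alpha1, alpha2 = 0.6, 0.4

# illustrative only: tanh is one possible nonlinearity; the resulting kernel
# is no longer a linear combination of the original kernels
w331 = torch.tanh(alpha1 * w321 + alpha2 * w322)
```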
In a preferred embodiment, a derivative convolution kernel may be generated in the training based on every two original convolution kernels. Thus, the number of sets of convolution kernel generation parameters may be half the number of original convolution kernels. For example, in the embodiment shown in FIG. 3, a derivative convolution kernel 331 may be generated based on the original convolution kernels 321, 322. Accordingly, in a preferred embodiment, each set of convolution kernel generation parameters may include two parameters, each representing a weight of a respective original convolution kernel at the time of generation of a respective derived convolution kernel.
In some embodiments, one derivative convolution kernel may be generated in the training based on every three or more original convolution kernels, i.e., the number of sets of convolution kernel generation parameters may be less than half (e.g., one third or less) of the number of original convolution kernels. With the same total number of convolution kernels (including the original convolution kernels and the derivative convolution kernels), generating one derivative convolution kernel based on more original convolution kernels means that the trained convolutional neural network model requires more storage space, but the image processing effect may be better. Furthermore, in some embodiments, one derivative convolution kernel may be generated in the training based on a single original convolution kernel, i.e., the number of sets of convolution kernel generation parameters may also be more than half the number of original convolution kernels.
Further, in some embodiments, in addition to generating the derivative convolution kernel based on the original convolution kernel, a new derivative convolution kernel may be generated based on at least a portion of the generated derivative convolution kernel.
In a preferred embodiment, the parameters for each level of convolutional layers in convolutional neural network model 300 include one or more original convolutional kernels and one or more sets of convolutional kernel generation parameters. In other embodiments, the parameters of the partial convolution layer in the convolutional neural network model 300 may not include convolution kernel generation parameters. That is, in training, convolution processing may be performed using only the original convolution kernel in the partial convolution layer, without generating a derivative convolution kernel to perform convolution processing.
In a preferred embodiment, the manner in which the convolution kernel generation parameters are used in each convolutional layer of the convolutional neural network model 300 to generate a derivative convolution kernel based on the original convolution kernels may be the same. In a particularly preferred embodiment, the number of sets of convolution kernel generation parameters in each convolutional layer of the convolutional neural network model 300 may be half the number of original convolution kernels, i.e., one derivative convolution kernel may be generated based on every two original convolution kernels in the training. Thus, in the case where each convolutional layer performs convolution processing on the input image features using the same number of convolution kernels (including the original convolution kernels and the derivative convolution kernels), the number of parameters included in the convolutional neural network model trained according to the preferred embodiment of the present invention is about 2/3 of that in the prior art. This greatly reduces the memory space occupied by the convolutional neural network model used in image processing and is particularly suitable for image processing on the smart device side.
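Extending the single-layer sketch given earlier, a toy multi-layer model in which every convolutional layer follows this scheme could look like the following; the layer class and sizes are assumptions carried over from that sketch:

```python
import torch
import torch.nn as nn

# every layer stores 2 original kernels + 2 generation parameters but convolves
# with 3 kernels, so the whole model needs roughly 2/3 of the usual parameters
model = nn.Sequential(
    DerivedKernelConv(in_channels=4),   # defined in the earlier training sketch
    nn.ReLU(),
    DerivedKernelConv(in_channels=3),   # the previous layer outputs 3 channels
    nn.ReLU(),
)

features = torch.randn(1, 4, 6, 6)
print(model(features).shape)            # torch.Size([1, 3, 2, 2])
```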
Fig. 4 illustrates a flow diagram of an image processing method 400 in accordance with at least one embodiment of the present disclosure.
At step 401, the method 400 begins.
At step 402, a convolutional neural network model 300, schematically illustrated in fig. 3 and trained in accordance with the method 200 illustrated in fig. 2, is obtained. The parameters of the convolutional neural network model 300 include one or more original convolution kernels 321, 322 and one or more sets of convolution kernel generation parameters 323.
At step 404, a corresponding derivative convolution kernel 331 is generated based on each set of convolution kernel generation parameters 323 and at least a portion of the one or more original convolution kernels 321, 322 in the trained convolutional neural network model 300. The manner in which the corresponding derivative convolution kernel 331 is generated based on the convolution kernel generation parameter 323 and the original convolution kernels 321, 322 is the same as the manner in which the corresponding derivative convolution kernel 331 is generated based on the convolution kernel generation parameter 323 and the original convolution kernels 321, 322 in training at step 206 of the method 200.
At step 406, the image features 310 of the image to be processed are convolved with one or more original convolution kernels 321, 322 and the generated one or more derivative convolution kernels 331. The image features 310 of the images to be processed are convolved here using the original convolution kernels 321, 322 and the derivative convolution kernel 331 in the same way as the image features 310 of the training set images are convolved in the training using the original convolution kernels 321, 322 and the derivative convolution kernel 331 at step 208 of the method 200.
At step 408, the method 400 ends.
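In code, steps 402-406 amount to instantiating the trained layer(s), letting the derivative kernels be regenerated from the stored original kernels and generation parameters, and convolving the image features. A sketch reusing the layer class from the training section (the names and checkpoint handling are assumptions):

```python
import torch

layer = DerivedKernelConv(in_channels=4)       # from the earlier training sketch
# in practice the trained originals and generation parameters would be loaded
# here, e.g. via layer.load_state_dict(...); only they need to be stored
layer.eval()

image_features = torch.randn(1, 4, 6, 6)       # image features 310 of the image to be processed
with torch.no_grad():
    # step 404 (derivative kernel generation) and step 406 (convolution)
    # both happen inside forward()
    output_features = layer(image_features)    # shape (1, 3, 4, 4), as in fig. 3
```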
The methods according to the present disclosure may be implemented in various suitable manners, such as in software, hardware, a combination of software and hardware, and the like.
In another aspect, a computer storage medium may be implemented having executable instructions stored thereon that, when executed, are capable of implementing the above-described methods. In another aspect, the present invention also includes an image processing apparatus capable of implementing the above-described image processing method, as well as a smart device that includes the image processing apparatus. For example, the smart device may be a cell phone, a tablet computer, a camera, a smart camera, and the like.
The terms "front," "back," "top," "bottom," "over," "under," and the like in the description and in the claims, if any, are used for descriptive purposes and not necessarily for describing permanent relative positions. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are, for example, capable of operation in other orientations than those illustrated or otherwise described herein.
As used herein, the word "exemplary" means "serving as an example, instance, or illustration," and not as a "model" that is to be replicated accurately. Any implementation exemplarily described herein is not necessarily to be construed as preferred or advantageous over other implementations. Furthermore, the disclosure is not limited by any expressed or implied theory presented in the preceding technical field, background, brief summary or the detailed description.
As used herein, the term "substantially" is intended to encompass any minor variation resulting from design or manufacturing imperfections, device or component tolerances, environmental influences, and/or other factors. The word "substantially" also allows for differences from a perfect or ideal situation due to parasitics, noise, and other practical considerations that may exist in a practical implementation.
In addition, the foregoing description may refer to elements or nodes or features being "connected" or "coupled" together. As used herein, unless expressly stated otherwise, "connected" means that one element/node/feature is directly connected to (or directly communicates with) another element/node/feature, either electrically, mechanically, logically, or otherwise. Similarly, unless expressly stated otherwise, "coupled" means that one element/node/feature may be mechanically, electrically, logically, or otherwise joined to another element/node/feature in a direct or indirect manner to allow for interaction, even though the two features may not be directly connected. That is, to "couple" is intended to include both direct and indirect joining of elements or other features, including connection with one or more intermediate elements.
In addition, "first," "second," and like terms may also be used herein for reference purposes only, and thus are not intended to be limiting. For example, the terms "first," "second," and other such numerical terms referring to structures or elements do not imply a sequence or order unless clearly indicated by the context.
It will be further understood that the terms "comprises/comprising," "includes" and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the present disclosure, the term "providing" is used broadly to encompass all ways of obtaining an object, and thus "providing an object" includes, but is not limited to, "purchasing," "preparing/manufacturing," "arranging/setting," "installing/assembling," and/or "ordering" the object, and the like.
Those skilled in the art will appreciate that the boundaries between the above-described operations are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be performed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments. However, other modifications, variations, and alternatives are also possible. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the foregoing examples are for purposes of illustration only and are not intended to limit the scope of the present disclosure. The various embodiments disclosed herein may be combined in any combination without departing from the spirit and scope of the present disclosure. It will also be appreciated by those skilled in the art that various modifications may be made to the embodiments without departing from the scope and spirit of the disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (15)

1. A method of training a convolutional neural network model for image processing, the method comprising:
constructing a convolutional neural network model to be trained, wherein the parameters of the convolutional neural network model comprise one or more original convolutional kernels and one or more groups of convolutional kernel generation parameters;
training the one or more original convolution kernels and the one or more sets of convolution kernel generation parameters using a training set image, wherein,
in training, one derivative convolution kernel is generated based on at least a portion of the original convolution kernels using each set of convolution kernel generation parameters, and the image features of the training set images are convolved using the one or more original convolution kernels and the generated one or more derivative convolution kernels.
2. The method of claim 1, wherein
The number of parameters for each set of convolution kernel generation parameters is the same as the number of original convolution kernels used to generate the corresponding derivative convolution kernels, and
each convolution kernel generation parameter corresponds to an original convolution kernel representing a weight of the original convolution kernel at the time the corresponding derivative convolution kernel was generated.
3. The method of claim 2, wherein
Each derived convolution kernel is generated from at least a portion of the original convolution kernel linear transformation, and
each of the corresponding set of convolution kernel generation parameters represents a linear transform coefficient of the corresponding original convolution kernel to the one derivative convolution kernel.
4. The method of claim 1, wherein
The number of parameters of each set of convolution kernel generation parameters is the same as the product of the number of original convolution kernels and the number of channels thereof used to generate the corresponding derivative convolution kernels, and
each convolution kernel generation parameter corresponds to a channel of an original convolution kernel and represents a weight of the channel of the original convolution kernel at the time of generating the corresponding derivative convolution kernel.
5. The method of claim 4, wherein
Each channel of each derivative convolution kernel is generated from a corresponding channel linear transformation of at least a portion of the original convolution kernel, and
each of the corresponding set of convolution kernel generation parameters represents a linear transform coefficient of a corresponding channel of the corresponding original convolution kernel to the one channel of the one derivative convolution kernel.
6. The method of claim 1, wherein
The number of parameters of each group of convolution kernel generation parameters is the same as the product of the number of original convolution kernels used for generating the corresponding derivative convolution kernels and the number of parameters contained in the original convolution kernels.
7. The method of claim 1, wherein
The number of sets of convolution kernel generation parameters is half the number of original convolution kernels, and a derivative convolution kernel is generated in the training based on every two original convolution kernels.
8. The method of claim 1, further comprising
In training, a new derivative convolution kernel is generated based on at least a portion of the generated derivative convolution kernels.
9. The method of claim 1, wherein the method further comprises employing a normalized constraint on each set of convolution kernel generation parameters.
10. The method of claim 9, wherein the normalization constraint includes defining a sum of each set of convolution kernel generation parameters.
11. The method of claim 1, wherein
The convolutional neural network model includes a plurality of convolutional layers, and
the parameters for each level of convolutional layer include one or more original convolutional kernels and one or more sets of convolutional kernel generation parameters.
12. An image processing method, characterized in that the method comprises:
obtaining a convolutional neural network model trained according to the method of any one of claims 1-11;
generating a corresponding derivative convolution kernel based on each set of convolution kernel generation parameters in the trained convolutional neural network model and at least a portion of the one or more original convolution kernels; and
and performing convolution processing on the image characteristics of the image to be processed by utilizing the one or more original convolution kernels and the generated one or more derivative convolution kernels.
13. A computer storage medium having stored thereon executable instructions, which when executed are capable of implementing the method of any one of claims 1-12.
14. An image processing apparatus, characterized in that the apparatus is capable of implementing the method according to claim 12.
15. A smart device, characterized in that it comprises the apparatus according to claim 14.
CN202210174146.2A 2022-02-25 2022-02-25 Training method of convolution neural network model for image processing Active CN114239814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210174146.2A CN114239814B (en) 2022-02-25 2022-02-25 Training method of convolution neural network model for image processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210174146.2A CN114239814B (en) 2022-02-25 2022-02-25 Training method of convolution neural network model for image processing

Publications (2)

Publication Number Publication Date
CN114239814A true CN114239814A (en) 2022-03-25
CN114239814B CN114239814B (en) 2022-07-08

Family

ID=80748138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210174146.2A Active CN114239814B (en) 2022-02-25 2022-02-25 Training method of convolution neural network model for image processing

Country Status (1)

Country Link
CN (1) CN114239814B (en)


Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133960A (en) * 2017-04-21 2017-09-05 武汉大学 Image crack dividing method based on depth convolutional neural networks
CN107886162A (en) * 2017-11-14 2018-04-06 华南理工大学 A kind of deformable convolution kernel method based on WGAN models
CN107943750A (en) * 2017-11-14 2018-04-20 华南理工大学 A kind of decomposition convolution method based on WGAN models
CN107886164A (en) * 2017-12-20 2018-04-06 东软集团股份有限公司 A kind of convolutional neural networks training, method of testing and training, test device
CN110971901A (en) * 2018-09-29 2020-04-07 杭州海康威视数字技术股份有限公司 Convolutional neural network processing method and device
CN110188795A (en) * 2019-04-24 2019-08-30 华为技术有限公司 Image classification method, data processing method and device
CN110807480A (en) * 2019-10-25 2020-02-18 广州思德医疗科技有限公司 Convolution kernel storage method and device in convolution neural network
CN111079905A (en) * 2019-12-27 2020-04-28 北京迈格威科技有限公司 Convolutional neural network processing method, device and electronic system
CN111401524A (en) * 2020-03-17 2020-07-10 深圳市物语智联科技有限公司 Convolutional neural network processing method, device, equipment, storage medium and model
CN111461135A (en) * 2020-03-31 2020-07-28 上海大学 Digital image local filtering evidence obtaining method integrated by convolutional neural network
CN111582454A (en) * 2020-05-09 2020-08-25 北京百度网讯科技有限公司 Method and device for generating neural network model
CN111562612A (en) * 2020-05-20 2020-08-21 大连理工大学 Deep learning microseismic event identification method and system based on attention mechanism
CN111814347A (en) * 2020-07-20 2020-10-23 中国石油大学(华东) Method and system for predicting gas channeling channel in oil reservoir
CN112102281A (en) * 2020-09-11 2020-12-18 哈尔滨市科佳通用机电股份有限公司 Truck brake cylinder fault detection method based on improved Faster Rcnn
CN113269765A (en) * 2021-06-04 2021-08-17 重庆大学 Expandable convolutional neural network training method and CT image segmentation model construction method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DA HE et al.: "Deep Convolutional Neural Network Framework for Subpixel Mapping", IEEE Transactions on Geoscience and Remote Sensing *
张远忠: "Research on Object Detection Based on Parameter-Derivation Networks (基于参数衍生网络的目标检测研究)", China Masters' Theses Full-text Database, Information Science and Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071359A (en) * 2023-03-08 2023-05-05 中汽研新能源汽车检验中心(天津)有限公司 Battery aging degree detection method, electronic equipment and storage medium
CN116071359B (en) * 2023-03-08 2023-06-23 中汽研新能源汽车检验中心(天津)有限公司 Battery aging degree detection method, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114239814B (en) 2022-07-08

Similar Documents

Publication Publication Date Title
Zeng et al. Learning image-adaptive 3d lookup tables for high performance photo enhancement in real-time
US10977001B2 (en) Asymmetric quantization of multiple-and-accumulate operations in deep learning processing
US11403838B2 (en) Image processing method, apparatus, equipment, and storage medium to obtain target image features
CN108376387B (en) Image deblurring method based on aggregation expansion convolution network
JP6365258B2 (en) Arithmetic processing unit
CN114239814B (en) Training method of convolution neural network model for image processing
US20110274368A1 (en) Image processing device, image processing method, and program
CN107967516A (en) A kind of acceleration of neutral net based on trace norm constraint and compression method
CN112889084B (en) Method, system and computer readable medium for improving color quality of image
US20180005113A1 (en) Information processing apparatus, non-transitory computer-readable storage medium, and learning-network learning value computing method
CN112640037A (en) Learning device, inference device, learning model generation method, and inference method
CN110930306A (en) Depth map super-resolution reconstruction network construction method based on non-local perception
CN110222455A (en) A kind of modeling method of asymmetric Hysteresis Model
CN111967582B (en) CNN convolutional layer operation method and CNN convolutional layer operation accelerator
CN112580675B (en) Image processing method and device and computer readable storage medium
CN112784951A (en) Winograd convolution operation method and related product
CN114611700A (en) Model reasoning speed improving method and device based on structural parameterization
CN115004220A (en) Neural network for raw low-light image enhancement
KR20220155737A (en) Apparatus and method for generating super-resolution image using light-weight convolutional neural network
CN111639652B (en) Image processing method, device and computer storage medium
CN114399828B (en) Training method of convolution neural network model for image processing
US8380773B2 (en) System and method for adaptive nonlinear filtering
CN116468902A (en) Image processing method, device and non-volatile computer readable storage medium
CN115564655A (en) Video super-resolution reconstruction method, system and medium based on deep learning
Pugh et al. Equivalence and reduction of 2-D systems

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant