CN112766277A

CN112766277A - Channel adjustment method, device and equipment of convolutional neural network model

Info

Publication number: CN112766277A
Application number: CN202110167609.8A
Authority: CN
Inventors: 张洪光
Original assignee: TP Link Technologies Co Ltd
Current assignee: TP Link Technologies Co Ltd
Priority date: 2021-02-07
Filing date: 2021-02-07
Publication date: 2021-05-07

Abstract

The invention discloses a channel adjustment method of a convolutional neural network model, which comprises the following steps: when the current format of the input picture is different from the input format of the convolutional neural network model, extracting the weight of the first convolutional layer in the convolutional neural network model; adjusting the size of the input picture to obtain a target format of the input picture; adjusting the weight of the convolutional layer according to the target format of the input picture to obtain the target weight of the convolutional layer; assigning the target weight to the convolutional layer. The invention also discloses a channel adjustment apparatus and device of the convolutional neural network model and a computer-readable storage medium. By adopting the embodiments of the invention, the adjustment of channels in models under different frameworks can be supported, and the processing efficiency of the models is improved.

Description

Channel adjustment method, device and equipment of convolutional neural network model

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a channel adjustment method, apparatus and device of a convolutional neural network model.

Background

A Convolutional Neural Network (CNN) is a feed-forward neural network whose artificial neurons respond to surrounding units within a limited coverage area, and it performs well on large-scale image processing. In the field of artificial intelligence, deep learning is a class of machine learning algorithms that use multiple layers to progressively extract higher-level features from raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify parts that are meaningful to humans, which yields higher accuracy in classification and detection tasks. However, owing to the limits of the data set and the preprocessing tools used during training, a trained neural network can only be used in an application environment whose data type and format are consistent with the training set. If a different type of input data is required, for example when pictures in YUV format are to be fed into a model with RGB input, the usual approach is to convert each picture from YUV format to RGB format and then feed it into the model. This format conversion requires computation, and when there are many YUV pictures, converting them one by one takes a long time, which results in long network preprocessing time and low data processing efficiency.

Disclosure of Invention

The embodiments of the invention aim to provide a channel adjustment method, apparatus, device and storage medium of a convolutional neural network model, which can support the adjustment of convolution channels in models under different frameworks and improve the processing efficiency of the convolutional neural network model.

In order to achieve the above object, an embodiment of the present invention provides a channel adjustment method for a convolutional neural network model, including:

when the current format of the input picture is different from the input format of the convolutional neural network model, extracting the weight of a first layer of convolutional layer in the convolutional neural network model;

adjusting the size of the input picture to obtain a target format of the input picture;

adjusting the weight of the convolutional layer according to the target format of the input picture to obtain the target weight of the convolutional layer;

assigning the target weight to the convolutional layer.

As an improvement of the above scheme, the resizing the input picture includes:

taking one channel of the input picture as a target channel;

and adjusting the sizes of the other channels of the input picture according to the size of the target channel.

As an improvement of the above scheme, the adjusting the sizes of the remaining channels of the input picture according to the size of the target channel includes:

and copying sub-channels in the rest channels of the input picture according to the size of the target channel so as to enable the size of the rest channels to be the same as that of the target channel.

As an improvement of the above solution, the adjusting the weight of the convolutional layer according to the target format of the input picture includes:

and acquiring a conversion relation between the target format of the input picture and the input format of the convolutional neural network model, and adjusting the weight of the convolutional layer according to the conversion relation.

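As an improvement of the above scheme, the adjusting the weight of the convolutional layer according to the target format of the input picture includes:

performing a primary format conversion on the weight of the convolutional layer to convert the weight of the convolutional layer from an initial format to a general format; wherein the general format is an array form in an extended program library;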

adjusting the weight of the convolutional layer in the general format according to the target format of the input picture;

and performing secondary format conversion on the weight of the convolutional layer in the general format.

As an improvement of the scheme, the extension program library is a numpy library.

In order to achieve the above object, an embodiment of the present invention further provides a channel adjustment apparatus for a convolutional neural network model, including:

the weight extraction module is used for extracting the weight of the first layer of convolution layer in the convolutional neural network model when the current format of the input picture is different from the input format of the convolutional neural network model;

the picture size adjusting module is used for adjusting the size of the input picture to obtain a target format of the input picture;

the weight adjusting module is used for adjusting the weight of the convolutional layer according to the target format of the input picture to obtain the target weight of the convolutional layer;

and the weight assignment module is used for assigning the target weight to the convolutional layer.

As an improvement of the above solution, the picture resizing module is configured to:

taking one channel of the input picture as a target channel;

In order to achieve the above object, an embodiment of the present invention further provides a channel adjustment apparatus for a convolutional neural network model, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor implements a channel adjustment method for a convolutional neural network model according to any of the above embodiments when executing the computer program.

In order to achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where when the computer program runs, the apparatus where the computer-readable storage medium is located is controlled to execute the channel adjustment method of the convolutional neural network model according to any of the above embodiments.

Compared with the prior art, in the channel adjustment method, apparatus, device and storage medium of the convolutional neural network model disclosed by the embodiments of the invention, first, when the current format of an input picture is different from the input format of the convolutional neural network model, the weight of the first convolutional layer in the convolutional neural network model is extracted; then, the input picture is resized to obtain the target format of the input picture, so that the size of the input picture suits the channels of the convolutional layer in the model; and finally, the weight of the convolutional layer is adjusted according to the target format of the input picture to obtain the target weight of the convolutional layer. Because the format of the weights of the convolutional layer is adjusted in the course of adjusting its channels, the convolutional neural network model can adapt to input pictures in different formats; the adjustment of convolution channels in models under different frameworks is also supported, and the processing efficiency of the convolutional neural network model is improved.

Drawings

FIG. 1 is a flow chart of a channel adjustment method of a convolutional neural network model according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a convolutional neural network model provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of YUV color channels provided by an embodiment of the present invention;

FIG. 4 is a schematic diagram of YUV color channels after picture resizing according to an embodiment of the present invention;

FIG. 5 is a block diagram of a channel adjustment apparatus of a convolutional neural network model according to an embodiment of the present invention;

FIG. 6 is a block diagram of a channel adjustment device of a convolutional neural network model according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, fig. 1 is a flowchart of a channel adjustment method of a convolutional neural network model according to an embodiment of the present invention, where the channel adjustment method of the convolutional neural network model includes:

S1, when the current format of the input picture is different from the input format of the convolutional neural network model, extracting the weight of the first convolutional layer in the convolutional neural network model;

S2, adjusting the size of the input picture to obtain the target format of the input picture;

S3, adjusting the weight of the convolutional layer according to the target format of the input picture to obtain the target weight of the convolutional layer;

S4, assigning the target weight to the convolutional layer.

It should be noted that the channel adjustment method of the convolutional neural network model according to the embodiment of the present invention can be implemented by a controller in an image processing device, and the convolutional neural network model can be applied under various deep learning frameworks, such as TensorFlow, Caffe, Theano, Keras, PyTorch and MXNet. The channel adjustment method provided by the embodiment of the invention supports conversion among multiple picture formats, such as RGB, BGR and YUV. YUV is a color encoding method with three components: "Y" represents brightness (luma), that is, the gray value, while "U" and "V" represent chrominance (chroma), which describes the hue and saturation of the image. In the computer field, YUV is now commonly used to refer to data encoded as YCbCr, so YUV can be loosely regarded as YCbCr.

Referring to fig. 2, fig. 2 is a schematic diagram of a convolutional neural network model provided in an embodiment of the present invention, where the input is an 8 × 3 picture and W0 is the weight of the first convolutional layer of the model, which consists of 4 convolution kernels, each with 3 channels.

Referring to fig. 3, fig. 3 is a schematic diagram of YUV color channels according to an embodiment of the present invention. As can be seen from the figure, in the YU12 data format, U1 and V1 are shared by Y1, Y2, Y7 and Y8. The linear array at the bottom of the figure shows the storage order of YUV420 in memory: Y, U and V are stored sequentially. When reading, the Y components are read out first in sequence, followed by the U components and then the V components.
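The storage order described above can be illustrated with a short NumPy sketch. It assumes the planar YU12 (I420) layout with 2 × 2 chroma subsampling and even picture dimensions; the function name `split_i420` and the 4 × 6 test frame are hypothetical and chosen only so that Y7 sits directly below Y1, as in the figure.

```python
import numpy as np

def split_i420(buf, height, width):
    """Split a flat YU12 (I420) byte buffer into Y, U and V planes.

    Assumes planar storage: all Y samples, then all U, then all V,
    with U and V subsampled by 2 in both directions.
    """
    assert height % 2 == 0 and width % 2 == 0, "even dimensions assumed"
    y_size = height * width
    c_size = (height // 2) * (width // 2)
    data = np.frombuffer(buf, dtype=np.uint8)
    assert data.size == y_size + 2 * c_size, "buffer length does not match I420"
    y = data[:y_size].reshape(height, width)
    u = data[y_size:y_size + c_size].reshape(height // 2, width // 2)
    v = data[y_size + c_size:].reshape(height // 2, width // 2)
    return y, u, v

# Hypothetical 4 x 6 frame: 24 Y samples, then 6 U samples, then 6 V samples.
frame = bytes(range(36))
y_plane, u_plane, v_plane = split_i420(frame, height=4, width=6)
print(y_plane.shape, u_plane.shape, v_plane.shape)
```

With this layout each U and V sample covers a 2 × 2 block of Y samples, which is why the chroma planes have a quarter of the samples of the luma plane.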

Specifically, in step S1, it is first determined whether the format of the input picture is the same as the input format of the convolutional neural network model, and if so, the weight of the convolutional neural network model does not need to be adjusted; if not, extracting the weight of the first layer convolution layer in the convolution neural network model. The weights are weights of convolution kernels in the convolution neural network and are used for performing convolution calculation, and different weights correspond to different output effects.

Specifically, in step S2, the input picture is resized to obtain the target format of the input picture. Firstly, taking one channel of the input picture as a target channel; and then, adjusting the sizes of the rest channels of the input picture according to the size of the target channel.

Exemplarily, the Y channel of the input picture is taken as the target channel, and the sub-channels in the remaining channels of the input picture are copied according to the size of the Y channel, so that the remaining channels have the same size as the target channel. As shown in fig. 4, four U1 values are obtained by copying the first U1 three times.

In the embodiment of the invention, the YUV picture is converted by interpolation into a three-channel YUV format in which all channels have the same size, because the data input into a CNN must be a block array of c × h × w. In the original YUV format the three channels differ in size, so the data cannot be input directly. After padding (copying), the YUV values at each position correspond one to one. Because RGB and YUV are linearly related and the convolution calculation is also linear, the YUV-to-RGB conversion can be merged directly into the convolution kernel, which saves computation. After the U and V channels are adjusted, the sizes of the three YUV channels are consistent; the adjustment requires only simple copying and no arithmetic, so it is very efficient.
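The copying step can be sketched as follows. This is a minimal illustration that assumes 2 × 2 chroma subsampling (as in YUV420), so each U and V sample is repeated twice along each axis; the function name `upsample_to_chw` and the sample values are hypothetical.

```python
import numpy as np

def upsample_to_chw(y, u, v):
    """Copy each U/V sample into a 2 x 2 block so that all three planes
    match the Y plane, then stack them into a (3, h, w) array."""
    u_full = np.repeat(np.repeat(u, 2, axis=0), 2, axis=1)
    v_full = np.repeat(np.repeat(v, 2, axis=0), 2, axis=1)
    assert u_full.shape == y.shape == v_full.shape
    return np.stack([y, u_full, v_full], axis=0)

# Hypothetical planes for a 4 x 6 picture.
y = np.arange(24, dtype=np.uint8).reshape(4, 6)
u = np.arange(100, 106, dtype=np.uint8).reshape(2, 3)
v = np.arange(200, 206, dtype=np.uint8).reshape(2, 3)

z = upsample_to_chw(y, u, v)
print(z.shape)        # channel-first block array, c x h x w
print(z[1, :2, :2])   # the first U sample copied into a 2 x 2 block
```

No arithmetic on the sample values is involved, only copying, which matches the efficiency argument made above.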

Specifically, in step S3, in one embodiment, the adjusting the weight of the convolutional layer according to the target format of the input picture includes:

acquiring a conversion relation between the target format of the input picture and the input format of the convolutional neural network model, and adjusting the weight of the convolutional layer according to the conversion relation.

For example, in the embodiment of the present invention, when an input picture is to be fed into a convolutional neural network model with a different input format, the format conversion may be determined in advance according to the conversion relation between the two formats, so that the format of the input picture corresponds to the input format of the convolutional neural network model. The weight of the convolutional layer is then converted directly according to the same conversion relation, so the target weight of the convolutional layer is obtained with only one format conversion.

In another embodiment, the adjusting the weight of the convolutional layer according to the target format of the input picture includes steps S31 to S33:

S31, performing a primary format conversion on the weight of the convolutional layer to convert the weight of the convolutional layer from an initial format to a general format; wherein the general format is an array form in an extended program library;

S32, adjusting the weight of the convolutional layer in the general format according to the target format of the input picture;

S33, performing a secondary format conversion on the weight of the convolutional layer in the general format.

Specifically, in step S31, the model definition method and the weight storage method of a convolutional neural network model differ between frameworks. For example, in the Caffe framework the model is defined in a prototxt file and the weights are saved in a caffemodel file; in the PyTorch framework the model definition and weights are saved in a pth file; in the TensorFlow framework the model definition and weights are saved in a pb file. Because the weight storage formats differ, convolutional neural network models under different frameworks must be processed differently.

Optionally, the general format is an array in an extended program library. The extended program library may be the numpy library. NumPy (Numerical Python) is an extension library for the Python language that supports large multi-dimensional arrays and matrix operations and provides a large collection of mathematical functions for operating on arrays.

The method for converting the first convolutional layer of a convolutional neural network model into numpy format differs between frameworks. For a convolutional layer under the Caffe framework, the caffe library and the numpy library are called; the model is loaded first and the first-layer convolution kernel is then extracted, and the extracted kernel is already in numpy format. Under the PyTorch framework, the extracted first-layer convolution kernel is in Tensor format, and the W.numpy() function is called to convert it, where W denotes the first-layer convolution kernel. For the TensorFlow framework, reference may be made to format conversion methods in the prior art, which are not described here again.

In the embodiment of the present invention, the purpose of the format conversion of the weights is to find an intermediate format that absorbs the differences between CNN model formats under different frameworks.

Specifically, in step S32, the weight of the convolutional layer in the general format is adjusted according to the target format of the input picture.

For example, the value of the weight may be adjusted to adapt to the target format, and a specific adjustment manner of the value of the weight may be set according to an actual situation, which is not specifically limited in the present invention.

Specifically, in step S33, after the weights of the convolutional layers are adjusted, the weights in the general format are subjected to secondary format conversion to obtain the target weights of the convolutional layers.
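Steps S31 to S33 can be sketched as a round trip through a NumPy array. The sketch below is illustrative only: `ToyConvLayer` is a hypothetical stand-in for a framework-specific layer that stores its weights as nested Python lists, and the helper names are assumptions rather than part of any framework API. In a real framework the two conversion helpers would be replaced by that framework's own calls, for example the W.numpy() conversion mentioned above for PyTorch tensors.

```python
import numpy as np

class ToyConvLayer:
    """Hypothetical stand-in for a framework layer whose native weight
    format is a nested Python list (the "initial format")."""
    def __init__(self, weights):
        self.weights = weights

def to_general_format(layer):
    # S31: initial (framework-native) format -> numpy array.
    return np.asarray(layer.weights, dtype=np.float64)

def from_general_format(layer, array):
    # S33: numpy array -> framework-native format; writing the result
    # back into the layer also plays the role of the assignment in S4.
    layer.weights = array.tolist()

def adjust_first_layer(layer, adjust):
    w = to_general_format(layer)          # S31
    w_target = adjust(w)                  # S32: framework-independent
    from_general_format(layer, w_target)  # S33
    return layer

# Hypothetical first layer: 4 kernels, 3 channels, 3 x 3 spatial size.
layer = ToyConvLayer(np.ones((4, 3, 3, 3)).tolist())
adjust_first_layer(layer, lambda w: w * 2.0)  # placeholder adjustment
print(np.asarray(layer.weights).shape)
```

Only the two conversion helpers depend on the framework; the adjustment passed in as `adjust` operates purely on the NumPy array and can therefore be shared across frameworks.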

Further, in the process of adjusting the weight of the convolutional layer in the general format, the channels of each convolution kernel in the convolutional layer may be converted into channels corresponding to the target format of the input picture according to a preset channel conversion policy.
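One simple channel conversion policy, given here only as an assumed example, is the reordering needed when a BGR picture is fed into a model trained on RGB input: reversing the channel axis of every kernel gives the same output as reversing the channels of the picture. The helper `conv2d_valid` is a hypothetical, unpadded reference convolution written for this sketch, and the weight layout (out_channels, in_channels, kh, kw) is an assumption.

```python
import numpy as np

def conv2d_valid(x, w, b):
    """Reference convolution without padding or stride.
    x: (c, h, w), w: (o, c, kh, kw), b: (o,) -> (o, h-kh+1, w-kw+1)."""
    o, c, kh, kw = w.shape
    _, h, wd = x.shape
    out = np.zeros((o, h - kh + 1, wd - kw + 1))
    for i in range(kh):
        for j in range(kw):
            patch = x[:, i:i + h - kh + 1, j:j + wd - kw + 1]
            out += np.einsum('oc,chw->ohw', w[:, :, i, j], patch)
    return out + b[:, None, None]

rng = np.random.default_rng(0)
w_rgb = rng.standard_normal((4, 3, 3, 3))   # kernels expecting R, G, B order
bias = rng.standard_normal(4)
rgb = rng.standard_normal((3, 8, 8))
bgr = rgb[::-1]                              # same picture in B, G, R order

w_bgr = w_rgb[:, ::-1]                       # reorder kernel channels once
same = np.allclose(conv2d_valid(rgb, w_rgb, bias),
                   conv2d_valid(bgr, w_bgr, bias))
print(same)
```

The reordering is applied once to the weights, so no per-picture channel swap is needed at inference time.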

Illustratively, when a picture in YUV format needs to be input into a model with RGB input, the YUV-to-RGB conversion is merged into the first convolutional layer. The conversion between YUV and RGB is linear, and the convolution calculation, which involves only multiplications and additions, is also linear. The first-layer convolution calculation satisfies the following formula:

$$Y = WX + b \tag{1}$$

where W is the convolution kernel, X is the input RGB picture, and b is the bias.

The matrix expression of the linear relationship between RGB and YUV is as follows:

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = Q \begin{bmatrix} Y \\ U \\ V \end{bmatrix} + a \tag{2}$$

Equation (2) can be simplified as:

$$X = QZ + a \tag{3}$$

Substituting formula (3) into formula (1) gives:

$$Y = W(QZ + a) + b = WQZ + Wa + b = PZ + c \tag{4}$$

where P = WQ, c = Wa + b, and Z is the YUV picture obtained by simple copying in step S2.

The above formula shows that the YUV-to-RGB conversion can be merged into the convolution kernel, so that an RGB model can accept YUV pictures as input while the amount of forward inference computation of the model remains unchanged.
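The merge P = WQ, c = Wa + b can be checked numerically. The sketch below makes several assumptions that are not taken from the text: the coefficients in `Q` and `A` are the full-range BT.601 (JPEG-style) YCbCr-to-RGB values, the weight layout is (out_channels, in_channels, kh, kw), and `conv2d_valid` is a hypothetical unpadded reference convolution. The function and variable names are likewise illustrative.

```python
import numpy as np

def conv2d_valid(x, w, b):
    """Reference convolution without padding or stride.
    x: (c, h, w), w: (o, c, kh, kw), b: (o,) -> (o, h-kh+1, w-kw+1)."""
    o, c, kh, kw = w.shape
    _, h, wd = x.shape
    out = np.zeros((o, h - kh + 1, wd - kw + 1))
    for i in range(kh):
        for j in range(kw):
            patch = x[:, i:i + h - kh + 1, j:j + wd - kw + 1]
            out += np.einsum('oc,chw->ohw', w[:, :, i, j], patch)
    return out + b[:, None, None]

def merge_conversion(w, b, q, a):
    """Fold X = QZ + a into the kernel: returns P = WQ and c = Wa + b."""
    p = np.einsum('ocij,cd->odij', w, q)      # mix input channels per tap
    c = b + np.einsum('ocij,c->o', w, a)      # offset summed over all taps
    return p, c

# Assumed coefficients: full-range BT.601 YCbCr -> RGB.
Q = np.array([[1.0,  0.0,       1.402],
              [1.0, -0.344136, -0.714136],
              [1.0,  1.772,     0.0]])
A = Q @ np.array([0.0, -128.0, -128.0])       # chroma is centered at 128

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3, 3, 3))          # hypothetical RGB kernels
b = rng.standard_normal(4)
Z = rng.uniform(0.0, 255.0, size=(3, 8, 8))    # upsampled YUV picture

X = np.einsum('cd,dhw->chw', Q, Z) + A[:, None, None]   # explicit conversion
P, c = merge_conversion(W, b, Q, A)

y_ref = conv2d_valid(X, W, b)       # convert first, then convolve
y_merged = conv2d_valid(Z, P, c)    # convolve YUV directly with merged kernel
print(np.allclose(y_ref, y_merged))
```

The merged kernel P has the same shape as W, so the first layer costs the same as before. The equality is exact here because the convolution is unpadded; with zero padding the constant term Wa is not uniform at the picture border, so border outputs would need separate care.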

In the embodiment of the invention, the channels in the convolutional neural network model are adjusted according to the format of the input picture, so that the convolutional neural network model can adapt to input pictures in different formats; the adjustment of convolution channels in models under different frameworks is also supported, and the processing efficiency of the convolutional neural network model is improved.

Specifically, in step S4, the target weight is assigned to the convolutional layer.

In the embodiment of the invention, the weights undergo format conversion twice in order to be compatible with various frameworks (Caffe, PyTorch, TensorFlow and the like), since models are stored in different formats under different frameworks.

Compared with the prior art, in the channel adjustment method of the convolutional neural network model disclosed by the embodiment of the invention, first, when the current format of an input picture is different from the input format of the convolutional neural network model, the weight of the first convolutional layer in the convolutional neural network model is extracted; then, the input picture is resized to obtain the target format of the input picture, so that the size of the input picture suits the channels of the convolutional layer in the model; and finally, the weight of the convolutional layer is adjusted according to the target format of the input picture to obtain the target weight of the convolutional layer. Because the format of the weights of the convolutional layer is adjusted in the course of adjusting its channels, the convolutional neural network model can adapt to input pictures in different formats; the adjustment of convolution channels in models under different frameworks is also supported, and the processing efficiency of the convolutional neural network model is improved.

Referring to fig. 5, fig. 5 is a block diagram of a channel adjustment apparatus 10 of a convolutional neural network model according to an embodiment of the present invention, where the channel adjustment apparatus 10 of the convolutional neural network model includes:

the weight extraction module 11 is configured to extract a weight of a first layer of convolutional layer in the convolutional neural network model when a current format of an input picture is different from an input format of the convolutional neural network model;

a picture size adjusting module 12, configured to perform size adjustment on the input picture to obtain a target format of the input picture;

a weight adjusting module 13, configured to adjust the weight of the convolutional layer according to the target format of the input picture to obtain a target weight of the convolutional layer;

a weight assignment module 14, configured to assign the target weight to the convolutional layer.

It should be noted that the channel adjustment device 10 of the convolutional neural network model according to the embodiment of the present invention may be a controller in an image processing device, and the convolutional neural network model can be applied under various deep learning frameworks, such as TensorFlow, Caffe, Theano, Keras, PyTorch and MXNet. The channel adjustment method provided by the embodiment of the invention supports conversion among multiple picture formats, such as RGB, BGR and YUV.

Optionally, the picture resizing module 12 is configured to:

taking one channel of the input picture as a target channel;

Optionally, the adjusting the sizes of the remaining channels of the input picture according to the size of the target channel includes:

and copying sub-channels of the rest channels of the input picture according to the size of the target channel so as to enable the size of the rest channels to be the same as that of the target channel.

Optionally, the weight adjusting module 13 is configured to:

and performing secondary format conversion on the weight of the convolutional layer in the general format to obtain the target weight of the convolutional layer.

Optionally, the extended library is a numpy library.

It should be noted that, for the working process of each module in the channel adjustment apparatus 10 of the convolutional neural network model according to the embodiment of the present invention, please refer to the working process of the channel adjustment method of the convolutional neural network model according to the above embodiment, which is not described herein again.

Compared with the prior art, in the channel adjusting device 10 of the convolutional neural network model disclosed in the embodiment of the invention, first, when the current format of an input picture is different from the input format of the convolutional neural network model, the weight of the first convolutional layer in the convolutional neural network model is extracted; then, the input picture is resized to obtain the target format of the input picture, so that the size of the input picture suits the channels of the convolutional layer in the model; and finally, the weight of the convolutional layer is adjusted according to the target format of the input picture to obtain the target weight of the convolutional layer. Because the format of the weights of the convolutional layer is adjusted in the course of adjusting its channels, the convolutional neural network model can adapt to input pictures in different formats; the adjustment of convolution channels in models under different frameworks is also supported, and the processing efficiency of the convolutional neural network model is improved.

Referring to fig. 6, fig. 6 is a block diagram of a channel adjustment apparatus 20 of a convolutional neural network model according to an embodiment of the present invention, where the channel adjustment apparatus 20 of the convolutional neural network model of the embodiment includes: a processor 21, a memory 22 and a computer program stored in said memory 22 and executable on said processor 21, e.g. step S1. The processor 21 implements the steps in the embodiments of the channel adjustment method of the convolutional neural network model described above when executing the computer program. Alternatively, the processor 21 implements the functions of the modules/units in the above-described device embodiments when executing the computer program.

Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 22 and executed by the processor 21 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the channel adjustment device 20 of the convolutional neural network model.

The channel adjusting device 20 of the convolutional neural network model may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The channel adjusting device 20 of the convolutional neural network model may include, but is not limited to, a processor 21 and a memory 22. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the channel adjustment device 20 of the convolutional neural network model and does not constitute a limitation of the channel adjustment device 20 of the convolutional neural network model, and may include more or fewer components than those shown, or combine certain components, or different components, for example, the channel adjustment device 20 of the convolutional neural network model may also include input and output devices, network access devices, buses, and the like.

The processor 21 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor or any conventional processor. The processor 21 is the control center of the channel adjusting device 20 of the convolutional neural network model and connects the various parts of the entire channel adjusting device 20 through various interfaces and lines.

The memory 22 may be used to store the computer programs and/or modules, and the processor 21 implements various functions of the channel adjusting device 20 of the convolutional neural network model by running or executing the computer programs and/or modules stored in the memory 22 and calling data stored in the memory 22. The memory 22 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory 22 may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.

Wherein, the integrated modules/units of the channel adjusting device 20 of the convolutional neural network model can be stored in a computer readable storage medium if they are implemented in the form of software functional units and sold or used as independent products. Based on such understanding, all or part of the flow of the method according to the above embodiments may be implemented by a computer program, which may be stored in a computer readable storage medium and used by the processor 21 to implement the steps of the above embodiments of the method. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.

It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A channel adjustment method of a convolutional neural network model is characterized by comprising the following steps:

assigning the target weight to the convolutional layer.

2. The channel adjustment method of the convolutional neural network model of claim 1, wherein said resizing the input picture comprises:

taking one channel of the input picture as a target channel;

3. The method for adjusting the channels of the convolutional neural network model as claimed in claim 2, wherein the adjusting the sizes of the remaining channels of the input picture according to the size of the target channel comprises:

4. The method of claim 1, wherein the adjusting the weights of the convolutional layers according to the target format of the input picture comprises:

5. The method of claim 1, wherein the adjusting the weights of the convolutional layers according to the target format of the input picture comprises:

6. The channel adjustment method of the convolutional neural network model of claim 5, wherein the extended program library is a numpy library.

7. A channel adjustment apparatus for a convolutional neural network model, comprising:

8. The channel adjustment apparatus of the convolutional neural network model of claim 7, wherein the picture resizing module is configured to:

taking one channel of the input picture as a target channel;

9. A channel adjustment apparatus of a convolutional neural network model, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the channel adjustment method of the convolutional neural network model according to any one of claims 1 to 6 when executing the computer program.

10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the channel adjustment method of the convolutional neural network model as defined in any one of claims 1 to 6.