CN112598110B - Neural network construction method, device, equipment and medium - Google Patents

Neural network construction method, device, equipment and medium

Info

Publication number
CN112598110B
CN112598110B
Authority
CN
China
Prior art keywords
convolution, layer, convolution layer, target, dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011403214.5A
Other languages
Chinese (zh)
Other versions
CN112598110A (en)
Inventor
张选杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN202011403214.5A priority Critical patent/CN112598110B/en
Publication of CN112598110A publication Critical patent/CN112598110A/en
Application granted granted Critical
Publication of CN112598110B publication Critical patent/CN112598110B/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a neural network construction method, apparatus, device, and medium, wherein the method comprises the following steps: determining a convolution layer to be replaced in an original neural network; and replacing the convolution layer to be replaced with a target convolution layer to obtain a target convolutional neural network. The target convolution kernel parameter of the target convolution layer is obtained by adjusting the convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network; the target convolution layer performs a convolution operation on the feature map to be convolved according to the target convolution kernel parameter to obtain a convolved feature map.

Description

Neural network construction method, device, equipment and medium
Technical Field
The present invention relates to the field of computer processing technologies, and in particular, to a method, an apparatus, a device, and a medium for constructing a neural network.
Background
In recent years, as artificial neural networks have grown more complex, their expressive capability has also improved. However, although increasing the complexity of a neural network model can improve its expressive capability and performance, it also greatly increases the number of parameters and the computational cost; in resource-constrained application scenarios such as embedded devices and autonomous driving, prediction performance cannot be improved by increasing model complexity without limit. Therefore, to accelerate the practical deployment of artificial intelligence, model complexity needs to be reduced as much as possible while preserving model prediction accuracy.
In the related art, a model is generally compressed through model pruning to reduce its parameters or computational complexity as much as possible. However, with this approach, for each convolution layer the corresponding channel parameters must be manually screened out according to the retained channel indices and then gathered. This is very time-consuming, so model prediction remains inefficient.
Disclosure of Invention
In view of the foregoing, a neural network construction method, apparatus, device, and medium according to embodiments of the present invention are provided to overcome or at least partially solve the foregoing problems.
In a first aspect of an embodiment of the present invention, a neural network construction method is disclosed, the method including:
Determining a convolution layer to be replaced from an original neural network;
Replacing the convolution layer to be replaced with a target convolution layer to obtain a target convolution neural network;
the target convolution kernel parameter of the target convolution layer is obtained by adjusting the convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network;
the target convolution layer is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameters to obtain a convolved feature map.
Optionally, the target convolution layer comprises a first convolution layer and a second convolution layer, and the convolution kernels in the first convolution layer and the second convolution layer are 1×1 convolution kernels;
the first convolution layer is used for adjusting the number of input channels in the original convolution kernel parameters according to the clipping rate to obtain adjusted first convolution kernel parameters;
The second convolution layer is used for adjusting the number of output channels in the convolution kernel parameters of the original convolution layer according to the clipping rate and the adjusted first convolution kernel parameters.
Optionally, the target convolution layer further comprises: the first dimension exchange module is connected between the first convolution layer and the second convolution layer, and the second dimension exchange module is connected between the second convolution layer and the output end of the target convolution layer;
the first dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted first convolution kernel parameter to obtain a first convolution kernel parameter after dimension exchange;
The second convolution layer is used for adjusting the number of output channels in the first convolution kernel parameters after the dimension exchange according to the cutting rate to obtain adjusted second convolution kernel parameters;
The second dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted second convolution kernel parameter to obtain the target convolution kernel parameter.
Optionally, the target convolution layer further comprises: the device comprises an attention module, a dimension adjustment module and a convolution module, wherein the attention module is connected between an input end of the target convolution layer and the second convolution layer, the dimension adjustment module is connected between the attention module and the second convolution layer, and the convolution module is connected between the input end of the target convolution layer and an output end of the target convolution layer;
The attention module is used for outputting tensors of preset dimensions according to the number of output channels of the feature map to be convolved input into the input end and the clipping rate;
the dimension adjustment module is used for adjusting the preset dimension to be the same as the dimension of the first convolution kernel parameter after dimension exchange according to the tensor of the preset dimension to obtain the convolution kernel parameter of the second convolution layer;
The convolution module is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameter to obtain a feature map after convolution.
Optionally, the attention module comprises a global pooling layer, a full connection layer and an activation function layer which are sequentially connected;
the output dimension of the full connection layer is the preset dimension, and the activation function of the activation function layer is a Sigmoid activation function.
Optionally, after obtaining the target neural network, the method further comprises:
training the target neural network using a sample image set as training samples to obtain an image processing model, the image processing model performing object recognition or classification.
Optionally, after obtaining the image processing model, the method further comprises:
Obtaining an image to be processed;
Inputting the image to be processed into the image processing model to obtain a processing result output by the image processing model, wherein the processing result is a classification result of the image to be processed or a recognition result of an object contained in the image to be processed.
In a second aspect of the embodiment of the present invention, there is provided a neural network construction apparatus, including:
the convolution layer determining module is used for determining a convolution layer to be replaced from the original neural network;
The replacing module is used for replacing the convolution layer to be replaced with a target convolution layer to obtain a target convolution neural network;
The target convolution kernel parameter of the target convolution layer is obtained by adjusting the convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network; the target convolution layer is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameters to obtain a convolved feature map.
An embodiment of the invention also discloses an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the neural network construction method described in the embodiments of the first aspect is implemented.
An embodiment of the invention also discloses a computer-readable storage medium storing a computer program that, when executed, causes a processor to perform the neural network construction method according to the embodiments of the first aspect of the invention.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, the convolution layer to be replaced in the original neural network can be replaced by the target convolution layer, so that the target convolution neural network is obtained. The target convolution kernel parameter of the target convolution layer is obtained after the convolution kernel parameter of the original convolution layer to be replaced is adjusted according to the clipping rate of the original neural network; in this way, the target convolution layer can be used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameter to obtain the feature map after convolution.
By adopting the neural network construction method of this embodiment, the original convolution kernel parameter in the original neural network undergoes two corresponding convolution operations with the kernel parameters of two convolution layers, so that the numbers of input channels and output channels in the original convolution kernel parameter are clipped. Using the resulting target convolution kernel parameter, this clipping is realized during the convolution of the feature map to be convolved, with no need to clip according to the indices of the output feature map; this avoids low efficiency and high labor cost, and improves both the efficiency of training the model and the efficiency of performing actual image processing with the model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of steps of a neural network construction method according to an embodiment of the present invention;
FIG. 2 is a network architecture diagram of a target neural network in the practice of the present invention;
FIG. 3 is a schematic diagram of an attention sub-module in the practice of the present invention;
Fig. 4 is a block diagram of a neural network building apparatus in the practice of the present invention.
Detailed Description
In order that the above objects, features, and advantages of the present invention will be readily apparent, embodiments of the invention are described in more detail below with reference to the appended drawings. All other embodiments obtained by those skilled in the art based on the embodiments of the invention, without inventive effort, fall within the scope of the invention.
To solve the problems in the related art, the inventor proposes the following technical concept: perform a convolution operation on the network's existing convolution kernels using 1×1 convolution kernels, thereby transforming the convolution kernel dimensions and obtaining the convolution kernel parameters required under the target clipping rate without manual clipping.
Referring to fig. 1, a flowchart illustrating steps of a neural network construction method according to an embodiment of the present application is shown, and as shown in fig. 1, the method may specifically include the following steps:
step S101: the convolutional layer to be replaced is determined from the original neural network.
In this embodiment, the original neural network may be any deep convolutional network, such as a VGG network, ResNet, or DenseNet. The original neural network may comprise a plurality of convolution layers, and the convolution kernel in each convolution layer may be used to convolve the input features. Each convolution layer has an input end for receiving the features sent to that layer and an output end for outputting the processed features; both the input and the output of a convolution layer may be feature maps.
Each convolution layer in the original neural network may be determined to be a convolution layer to be replaced. The convolution layer to be replaced has a convolution kernel whose parameter may be [O, C, K, K], where O is the number of output channels of the layer, C is the number of input channels, and K is the size of the convolution kernel.
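Step S101 can be illustrated with a minimal PyTorch sketch (PyTorch is an assumption; the patent names no framework, and the function name `find_layers_to_replace` is illustrative):

```python
import torch.nn as nn

def find_layers_to_replace(model: nn.Module):
    """Collect every standard Conv2d in the original network as a
    candidate for replacement (step S101). The selection criterion
    here is an illustrative assumption, not the patent's exact rule."""
    candidates = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            # the kernel parameter has shape [O, C, K, K]
            candidates.append((name, tuple(module.weight.shape)))
    return candidates

# Tiny stand-in for an "original neural network" (e.g. a VGG-style stack)
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
)
print(find_layers_to_replace(model))
```

In a real VGG or ResNet, the same traversal yields each convolution layer together with its [O, C, K, K] kernel shape.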
Step S102: and replacing the convolution layer to be replaced with a target convolution layer to obtain the target convolution neural network.
The target convolution kernel parameter of the target convolution layer is obtained by adjusting the convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network; the target convolution layer is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameters to obtain a convolved feature map.
In this embodiment, the target convolution layer has its own target convolution kernel, and the target convolution kernel parameter of the target convolution kernel is obtained after the adjustment of the original convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network. The clipping rate may be a clipping rate for clipping the number of output channels and the number of input channels of the convolution layer, and the clipping rate may be preset according to requirements, and it should be noted that the clipping rate may be a value between 0 and 1.
Specifically, according to the clipping rate of the original neural network, the original convolution layer convolution kernel parameter to be replaced is adjusted, which may mean that according to the clipping rate, the original convolution layer convolution kernel parameter to be replaced is subjected to convolution processing, so as to obtain the target convolution kernel parameter. After the target convolution kernel parameter is obtained, the target convolution layer can carry out convolution processing on the feature map input into the layer according to the target convolution kernel parameter to obtain a feature map after convolution.
Taking the clipping rate as d, the target convolution kernel parameter obtained after convolving the original convolution kernel parameter to be replaced may be [O×d, C×d, K, K]. The number of output channels thus becomes O×d and the number of input channels becomes C×d; since d is a value between 0 and 1, both channel counts are clipped, which saves computation during the convolution process, compresses the model, reduces model parameters, and improves prediction efficiency.
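The shape arithmetic above can be written out directly (the rounding of fractional channel counts to integers is an assumption; the patent does not state a rounding rule):

```python
def clipped_kernel_shape(O, C, K, d):
    """Shape of the target kernel [O*d, C*d, K, K] after clipping both
    channel counts by rate d, with d in (0, 1]. Truncation via int()
    is an illustrative assumption."""
    return (int(O * d), int(C * d), K, K)

# e.g. a layer with 64 input and 128 output channels, 3x3 kernel, d = 0.5
print(clipped_kernel_shape(128, 64, 3, 0.5))
```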
The target convolution kernel parameter is obtained by convolving the original convolution kernel parameter to be replaced according to the clipping rate, which transforms the convolution kernel dimensions and yields the convolution kernel parameters required under the target clipping rate. Clipping of the numbers of input and output channels is realized while the feature map is convolved with these kernel parameters, with no need for manual clipping according to the indices of the output feature map; this avoids low efficiency and high labor cost, improves both the efficiency of training the model and the efficiency of performing actual image processing with the model, and solves the latency caused by parameter selection in dynamic pruning.
Referring to fig. 2 and 3, fig. 2 is a schematic diagram illustrating an overall principle of a neural network construction method according to an embodiment of the present application, fig. 3 is a network structure diagram illustrating a target convolutional layer in a target convolutional neural network according to an embodiment of the present application, and the target neural network constructed according to the embodiment of the present application is described with reference to fig. 2 and 3.
As shown in fig. 3, the target convolution layer includes a first convolution layer and a second convolution layer, where the convolution kernels in the first convolution layer and in the second convolution layer are 1×1 convolution kernels;
The first convolution layer is used for adjusting the number of input channels in the original convolution kernel parameters according to the cutting rate to obtain adjusted first convolution kernel parameters; the second convolution layer is used for adjusting the number of output channels in the original convolution kernel according to the clipping rate and the adjusted first convolution kernel parameter.
A first convolution layer for adjusting the number of input channels and a second convolution layer for adjusting the number of output channels may be included in this embodiment. The output of the first convolution layer is the input of the second convolution layer, so that the purpose of adjusting the original convolution kernel parameters is achieved through the two convolution layers.
In the embodiment of the application, the number of input channels and the number of output channels are cut, so that the number of input channels in the original convolution kernel parameter can be adjusted according to the cutting rate to obtain the adjusted first convolution kernel parameter, and then the number of output channels in the original convolution kernel parameter is adjusted according to the cutting rate and the first convolution kernel parameter, and finally the target convolution kernel parameter is obtained through two-stage adjustment.
As shown in fig. 2, the leftmost block 201 is the original convolution kernel parameter [O, C, K, K]; the clipping rate is d, and the kernel of the first convolution layer is [C×d, C, 1, 1]. After the first convolution, the output first convolution kernel parameter is [O, C×d, K, K], as shown at 202, realizing clipping of the number of input channels in the original convolution kernel parameter. The number of output channels is then adjusted by the kernel [O×d, O, 1, 1] in the second convolution layer, that is, the first convolution kernel parameter [O, C×d, K, K] is convolved with [O×d, O, 1, 1], finally yielding the target convolution kernel parameter [O×d, C×d, K, K], as shown at 205.
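The first-stage operation can be sketched in PyTorch (an assumed framework): the kernel tensor [O, C, K, K] is treated as a "feature map" with batch size O and C channels, so a 1×1 convolution clips its input-channel dimension.

```python
import torch
import torch.nn.functional as F

O, C, K, d = 8, 4, 3, 0.5
w = torch.randn(O, C, K, K)                   # original kernel, block 201 in Fig. 2
reduce_in = torch.randn(int(C * d), C, 1, 1)  # first 1x1 layer's kernel [C*d, C, 1, 1]

# 1x1 convolution over the kernel tensor clips its input channels:
w1 = F.conv2d(w, reduce_in)                   # -> [O, C*d, K, K], block 202 in Fig. 2
print(w1.shape)
```

Here `reduce_in` is random only for illustration; in the constructed network it is a learned parameter of the first convolution layer.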
In one example, as shown in fig. 3, the target convolutional layer further comprises: the device comprises a first dimension exchange module and a second dimension exchange module, wherein the first dimension exchange module is connected between the first convolution layer and the second convolution layer, and the second dimension exchange module is connected between the second convolution layer and the output end of the target convolution layer. As shown in fig. 3, the output of the target convolution layer may be the output at which the convolution module is located.
The first dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted first convolution kernel parameter to obtain a first convolution kernel parameter after dimension exchange; the second convolution layer is used for adjusting the number of output channels in the first convolution kernel parameters after the dimension exchange according to the cutting rate to obtain adjusted second convolution kernel parameters; the second dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted second convolution kernel parameter to obtain the target convolution kernel parameter.
In this example, as can be seen from fig. 2, when the original convolution kernel parameter is processed by the first convolution layer's kernel [C×d, C, 1, 1], the resulting kernel is shown at 202. In practice, to make it easier for the second convolution layer to process the first convolution kernel parameter, the 0th and 1st dimensions of the first convolution kernel parameter may be exchanged, giving the kernel shown at 203. That is, the first convolution layer's kernel [C×d, C, 1, 1] processes the original convolution kernel parameter to output the first convolution kernel parameter [O, C×d, K, K]; its 0th and 1st dimensions are then swapped once, so the dimension-exchanged first convolution kernel parameter is [C×d, O, K, K].
The second convolution layer's kernel [O×d, O, 1, 1] then convolves the dimension-exchanged first convolution kernel parameter [C×d, O, K, K], outputting the second convolution kernel parameter [C×d, O×d, K, K], as shown at 204; finally, the 0th and 1st dimensions of the second convolution kernel parameter are exchanged to obtain the target convolution kernel parameter [O×d, C×d, K, K].
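The full two-stage adjustment with the dimension swaps can be sketched as follows (PyTorch and the function name `adjust_kernel` are illustrative assumptions; `permute(1, 0, 2, 3)` performs the 0th/1st dimension exchange):

```python
import torch
import torch.nn.functional as F

def adjust_kernel(w, reduce_in, reduce_out):
    """Two 1x1 convolutions with dimension swaps in between, as in Fig. 2.
    w: [O, C, K, K]; reduce_in: [C*d, C, 1, 1]; reduce_out: [O*d, O, 1, 1]."""
    w1 = F.conv2d(w, reduce_in)        # clip input channels  -> [O, C*d, K, K]
    w1 = w1.permute(1, 0, 2, 3)        # swap dims 0 and 1    -> [C*d, O, K, K]
    w2 = F.conv2d(w1, reduce_out)      # clip output channels -> [C*d, O*d, K, K]
    return w2.permute(1, 0, 2, 3)      # swap back            -> [O*d, C*d, K, K]

O, C, K, d = 8, 4, 3, 0.5
w_target = adjust_kernel(
    torch.randn(O, C, K, K),
    torch.randn(int(C * d), C, 1, 1),
    torch.randn(int(O * d), O, 1, 1),
)
print(w_target.shape)
```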
As shown in fig. 2, the second convolution layer's kernel [O×d, O, 1, 1] may be obtained from the feature map to be convolved that is input to the target convolution layer, so the second convolution layer's kernel parameters vary with the input. Accordingly, if different target convolution layers in the target neural network have different inputs, their second convolution layers have different convolution kernel parameters.
Then the target convolutional layer may further comprise, as shown in fig. 3: an attention module, a dimension adjustment module, and a convolution module.
The attention module is connected between the input end of the target convolution layer and the second convolution layer, the dimension adjustment module is connected between the attention module and the second convolution layer, and the convolution module is connected between the input end of the target convolution layer and the output end of the target convolution layer;
Specifically, the attention module is used for outputting tensors of preset dimensions according to the number of output channels of the feature map to be convolved input into the input end and the clipping rate.
In one example, the attention module includes a global pooling layer, a fully-connected layer, an activation function layer connected in sequence; the output dimension of the full connection layer is the preset dimension, and the activation function of the activation function layer is a Sigmoid activation function.
As can be seen from fig. 2, the attention module is shown as the GP-FC-Sigmoid module, and the feature map to be convolved input at the input end is inputX, which is fed both to the attention module and to the convolution module conv. After the feature map to be convolved passes sequentially through the global pooling, fully connected, and Sigmoid activation layers of the attention module, a tensor of preset dimension with (O×d)×O elements is output, where O represents the number of output channels of the feature map to be convolved and d is the clipping rate, which may be preset. The dimension of the tensor output by the attention module may be preset; for example, the preset-dimension tensor (O×d)×O is two-dimensional.
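The GP-FC-Sigmoid module can be sketched as a small PyTorch module (the framework, class name, and exact layer sizes are assumptions inferred from the shapes in fig. 2; here the feature map is assumed to have O channels, as in the text):

```python
import torch
import torch.nn as nn

class KernelAttention(nn.Module):
    """Sketch of the attention module: global pooling, a fully connected
    layer whose output dimension is the preset (O*d)*O, and a Sigmoid."""
    def __init__(self, out_channels: int, d: float):
        super().__init__()
        self.o_clip = int(out_channels * d)
        self.pool = nn.AdaptiveAvgPool2d(1)   # global pooling (GP)
        self.fc = nn.Linear(out_channels, self.o_clip * out_channels)
        self.act = nn.Sigmoid()

    def forward(self, x):                     # x: [N, O, H, W]
        v = self.pool(x).flatten(1)           # [N, O]
        return self.act(self.fc(v))           # [N, (O*d)*O]

att = KernelAttention(out_channels=8, d=0.5)
t = att(torch.randn(1, 8, 16, 16))
print(t.shape)
```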
The dimension adjustment module is used for adjusting the preset dimension to be the same as the dimension of the first convolution kernel parameter after dimension exchange according to the tensor of the preset dimension, and obtaining the convolution kernel parameter of the second convolution layer.
After obtaining the tensor of the preset dimension, the dimension adjustment module can convert the tensor of the preset dimension according to the first convolution kernel parameter after dimension exchange to obtain the convolution kernel parameter of the second convolution layer, wherein the dimension of the convolution kernel parameter of the second convolution layer is the same as the dimension of the first convolution kernel parameter after dimension exchange.
For example, given the preset-dimension tensor with (O×d)×O elements, and with the dimension-exchanged first convolution kernel parameter being [C×d, O, K, K], converting the preset-dimension tensor yields the second convolution layer's kernel parameter [O×d, O, 1, 1]. The dimension-exchanged first convolution kernel parameter [C×d, O, K, K] is then convolved with the second convolution layer's kernel parameter [O×d, O, 1, 1], outputting the convolution parameter [C×d, O×d, K, K]; the 0th and 1st dimensions of this second convolution parameter are then exchanged to obtain the target convolution kernel parameter [O×d, C×d, K, K].
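The dimension adjustment itself amounts to a reshape, assuming the attention output holds (O×d)×O values (a PyTorch `view` is used here as an illustrative implementation):

```python
import torch

O, d = 8, 0.5
t = torch.rand(int(O * d) * O)       # tensor from the attention module, (O*d)*O values
k2 = t.view(int(O * d), O, 1, 1)     # reshape to the second layer's kernel [O*d, O, 1, 1]
print(k2.shape)
```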
The convolution module is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameter to obtain a convolved feature map.
After the target convolution kernel parameter is obtained, the target convolution kernel parameter can be used as a convolution kernel of the convolution module, and the feature map to be convolved, which is input to the target convolution layer, is convolved, so that the convolved feature map is input to the next target convolution layer, and the target convolution kernel parameter of the next target convolution layer can also be obtained in the above manner, so that the next target convolution layer can convolve the convolved feature map by utilizing the target convolution kernel parameter of the next target convolution layer.
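Putting the pieces together, an end-to-end sketch of the target convolution layer (fig. 3) might look as follows. This is a sketch under assumptions: PyTorch, integer truncation of clipped channel counts, and averaging the attention output over the batch to get a single second-layer kernel are all illustrative choices not fixed by the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TargetConv(nn.Module):
    """Illustrative target convolution layer: first 1x1 layer, dimension
    swaps, attention-generated second 1x1 kernel, and the final convolution
    of the input feature map with the target kernel."""
    def __init__(self, in_ch, out_ch, k, d):
        super().__init__()
        self.c_clip, self.o_clip = int(in_ch * d), int(out_ch * d)
        self.w = nn.Parameter(torch.randn(out_ch, in_ch, k, k))              # original kernel [O, C, K, K]
        self.reduce_in = nn.Parameter(torch.randn(self.c_clip, in_ch, 1, 1)) # first 1x1 layer
        # attention: global pooling -> FC -> Sigmoid, producing (O*d)*O values
        self.fc = nn.Linear(self.c_clip, self.o_clip * out_ch)
        self.pad = k // 2

    def forward(self, x):                              # x: [N, C*d, H, W]
        w1 = F.conv2d(self.w, self.reduce_in)          # clip inputs -> [O, C*d, K, K]
        w1 = w1.permute(1, 0, 2, 3)                    # swap dims   -> [C*d, O, K, K]
        a = torch.sigmoid(self.fc(x.mean(dim=(2, 3)))) # attention   -> [N, (O*d)*O]
        k2 = a.mean(0).view(self.o_clip, -1, 1, 1)     # second 1x1 kernel [O*d, O, 1, 1]
        w2 = F.conv2d(w1, k2).permute(1, 0, 2, 3)      # target kernel [O*d, C*d, K, K]
        return F.conv2d(x, w2, padding=self.pad)       # convolved feature map

layer = TargetConv(in_ch=8, out_ch=16, k=3, d=0.5)
y = layer(torch.randn(2, 4, 10, 10))
print(y.shape)
```

The input feature map here already has C×d channels, matching the output of a preceding clipped layer, as the surrounding text describes.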
The original convolution kernel parameter in the original neural network undergoes two corresponding convolution operations with the kernel parameters of the two convolution layers, so that the numbers of input channels and output channels of the original convolution kernel parameter are clipped. Using the resulting target convolution kernel parameter, this clipping is realized during the convolution of the feature map to be convolved, with no need for manual clipping according to the indices of the output feature map; this avoids low efficiency and high labor cost, and improves both the efficiency of training the model and the efficiency of performing actual image processing with the model.
By the above method, the target neural network can be obtained. It can be understood that the target neural network can be obtained from any original neural network, and can therefore support any image processing task, such as an image recognition task or an image classification task.
Therefore, in order to enable the target neural network to perform an image processing task, the target neural network may be trained to obtain an image processing model. Specifically, the target neural network may be trained using the sample image set as training samples to obtain an image processing model, and the image processing model performs object recognition or classification.
The sample image set may include a plurality of sample images for the same image processing task, and each sample image may or may not carry a label according to an actual training requirement.
The object recognition may be a face image recognition task, an attribute recognition task, a fingerprint image recognition task, an iris image recognition task, or the like. Object classification may refer to determining the category to which an object belongs. Correspondingly, for the face image recognition task, the sample image set may include a plurality of face images from different faces or the same face; for the attribute recognition task, the sample image set may include a plurality of sample images with different attributes; for the fingerprint image recognition task, the sample image set may include a plurality of fingerprint images from different fingers or the same finger; for the iris image recognition task, the sample image set may include a plurality of iris images from different eyes or the same eye.
In this embodiment, for different image processing tasks, the target neural network may be trained according to the corresponding techniques in the related art to obtain an image processing model, where the structure of the obtained image processing model is consistent with that of the target neural network.
In a specific implementation, when the target neural network is trained with the sample image set as training samples, the target neural network at the end of training can be determined as the image processing model. In practice, training may be deemed finished when the accuracy of object recognition or object classification reaches a preset accuracy, and the target neural network at that time is then determined as the image processing model.
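The "train until a preset accuracy is reached" stop condition can be sketched as a generic loop. The helper names and the dummy stand-in model below are illustrative, not from the patent:

```python
def train_until_accuracy(model, train_step, evaluate, target_acc=0.95, max_epochs=100):
    """Train and stop as soon as the evaluated accuracy reaches the preset value."""
    for _ in range(max_epochs):
        if evaluate(model) >= target_acc:
            break
        train_step(model)
    return model

# Dummy stand-ins: the "model" is a dict whose accuracy rises by 0.1 per training step.
model = {"acc": 0.5}
trained = train_until_accuracy(
    model,
    train_step=lambda m: m.update(acc=m["acc"] + 0.1),
    evaluate=lambda m: m["acc"],
)
print(round(trained["acc"], 1))  # 1.0
```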
In one example, after the image processing model is obtained, image processing may also be performed using the image processing model. Specifically, an image to be processed can be obtained; inputting the image to be processed into the image processing model to obtain a processing result output by the image processing model, wherein the processing result is a classification result of the image to be processed or a recognition result of an object contained in the image to be processed.
The image to be processed may be a face image, a fingerprint image, or an image captured of a specific object. In particular, the image to be processed may be associated with the image processing task to be performed; for example, if face recognition is to be performed, the image to be processed is a face image.
In this embodiment, an image to be processed may be input to an input end of the image processing model, so as to obtain a processing result output by the image processing model, where the processing result is associated with an image processing task. For example, when the image processing task is to identify an object, the processing result may be an identification result of the object included in the image to be processed. Of course, the actual image processing task is not limited to the above-described recognition and classification task, and may be another image processing task.
When the image processing model processes the image to be processed, the convolution kernel parameters of each convolution layer in the model clip the numbers of input and output channels according to the set clipping rate. The numbers of output and input channels are thus clipped automatically while the image to be processed is convolved with these kernel parameters, realizing automatic clipping of the output feature map in the channel dimension. This reduces the computation of each convolution layer, improves the image processing efficiency of the image processing model, and yields the image processing result more quickly.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of the acts described, as some steps may, in accordance with the embodiments, be performed in other orders or concurrently. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the embodiments of the invention.
Referring to fig. 4, a block diagram of a neural network building apparatus according to an embodiment of the present invention is shown, and as shown in fig. 4, the apparatus may specifically include the following modules:
a convolutional layer determining module 401, configured to determine a convolutional layer to be replaced from an original neural network;
A replacing module 402, configured to replace the convolutional layer to be replaced with a target convolutional layer, to obtain a target convolutional neural network;
The target convolution kernel parameter of the target convolution layer is obtained by adjusting the convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network; the target convolution layer is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameters to obtain a convolved feature map.
Optionally, the target convolution layer comprises a first convolution layer and a second convolution layer, and the convolution kernels in the first convolution layer and the second convolution layer are 1×1 convolution kernels;
the first convolution layer is used for adjusting the number of input channels in the original convolution kernel parameters according to the clipping rate to obtain adjusted first convolution kernel parameters;
The second convolution layer is used for adjusting the number of output channels in the convolution kernel parameters of the original convolution layer according to the clipping rate and the adjusted first convolution kernel parameters.
Optionally, the target convolution layer further comprises: the first dimension exchange module is connected between the first convolution layer and the second convolution layer, and the second dimension exchange module is connected between the second convolution layer and the output end of the target convolution layer;
the first dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted first convolution kernel parameter to obtain a first convolution kernel parameter after dimension exchange;
The second convolution layer is used for adjusting the number of output channels in the first convolution kernel parameters after the dimension exchange according to the clipping rate to obtain adjusted second convolution kernel parameters;
The second dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted second convolution kernel parameter to obtain the target convolution kernel parameter.
Optionally, the target convolution layer further comprises: the device comprises an attention module, a dimension adjustment module and a convolution module, wherein the attention module is connected between an input end of the target convolution layer and the second convolution layer, the dimension adjustment module is connected between the attention module and the second convolution layer, and the convolution module is connected between the input end of the target convolution layer and an output end of the target convolution layer;
The attention module is used for outputting tensors of preset dimensions according to the number of output channels of the feature map to be convolved input into the input end and the clipping rate;
the dimension adjustment module is used for adjusting the preset dimension to be the same as the dimension of the first convolution kernel parameter after dimension exchange according to the tensor of the preset dimension to obtain the convolution kernel parameter of the second convolution layer;
The convolution module is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameter to obtain a feature map after convolution.
Optionally, the attention module comprises a global pooling layer, a full connection layer and an activation function layer which are sequentially connected;
the output dimension of the full connection layer is the preset dimension, and the activation function of the activation function layer is a Sigmoid activation function.
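A minimal numpy sketch of such an attention branch, assuming global average pooling, a single fully connected layer mapping the O input channels to the preset dimension O·d·O, and a Sigmoid activation; the sizes, single-layer structure, and random values are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: feature map with O = 8 channels, clipping rate d = 0.5.
O, d = 8, 0.5
Oc = int(O * d)                               # clipped output channel count O·d
rng = np.random.default_rng(1)
x = rng.standard_normal((1, O, 5, 5))         # feature map to be convolved [N, O, H, W]
fc_w = rng.standard_normal((Oc * O, O))       # fully connected layer: O -> preset dimension O·d·O

pooled = x.mean(axis=(2, 3))                  # global pooling layer -> [N, O]
attn = sigmoid(pooled @ fc_w.T)               # activation function layer -> [N, O·d·O]
kernel2 = attn.reshape(Oc, O, 1, 1)           # dimension adjustment -> [O·d, O, 1, 1]
print(kernel2.shape)                          # (4, 8, 1, 1)
```

Because of the Sigmoid, every entry of the resulting second-layer kernel lies in (0, 1), so it acts as a soft per-channel weighting conditioned on the input feature map.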
Optionally, the apparatus may further include the following modules:
The training module is used for training the target neural network by taking the sample image set as a training sample to obtain an image processing model, and the image processing model is used for carrying out object identification or classification.
Optionally, the apparatus may further include the following modules:
the image acquisition module is used for acquiring an image to be processed;
the image processing module is used for inputting the image to be processed into the image processing model to obtain a processing result output by the image processing model, wherein the processing result is a classification result of the image to be processed or a recognition result of an object contained in the image to be processed.
It should be noted that, since the device embodiment is similar to the method embodiment, its description is relatively brief; for relevant details, refer to the description of the method embodiment.
The embodiment of the invention also provides an electronic device, which may comprise a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the neural network construction method described above.
The embodiment of the invention also provides a computer-readable storage medium, in which a stored computer program causes a processor to execute the neural network construction method according to the embodiments of the invention.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or terminal device that comprises the element.
The neural network construction method, device, equipment and storage medium provided by the present invention have been described in detail above; specific examples are used herein to illustrate the principles and embodiments of the present invention, and the above embodiments are only intended to help understand the method and core idea of the present invention. Meanwhile, for those skilled in the art, there will be variations in the specific embodiments and application scope according to the idea of the present invention. In view of the above, the content of this description should not be construed as limiting the present invention.

Claims (8)

1. A method of neural network construction, the method comprising:
Determining a convolution layer to be replaced from an original neural network;
Replacing the convolution layer to be replaced with a target convolution layer to obtain a target convolution neural network;
training the target convolutional neural network by taking a sample image set as a training sample to obtain an image processing model, wherein the image processing model performs object identification or classification;
the target convolution kernel parameter of the target convolution layer is obtained by adjusting the convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network;
The target convolution layer is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameters to obtain a convolved feature map;
The step of adjusting the convolution kernel parameters of the original convolution layer to be replaced according to the clipping rate of the original neural network comprises the following steps:
Performing convolution processing on the convolution kernel parameters of the original convolution layer to be replaced according to the clipping rate to obtain the target convolution kernel parameters;
the target convolution layer comprises a first convolution layer and a second convolution layer, and the convolution kernels in the first convolution layer and the second convolution layer are 1×1 convolution kernels;
The first convolution layer is used for adjusting the number of input channels in the convolution kernel parameters of the original convolution layer according to the clipping rate to obtain adjusted first convolution kernel parameters;
the second convolution layer is used for adjusting the number of output channels in the convolution kernel parameters of the original convolution layer according to the clipping rate and the adjusted first convolution kernel parameters.
2. The method of claim 1, wherein the target convolutional layer further comprises: the first dimension exchange module is connected between the first convolution layer and the second convolution layer, and the second dimension exchange module is connected between the second convolution layer and the output end of the target convolution layer;
the first dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted first convolution kernel parameter to obtain a first convolution kernel parameter after dimension exchange;
The second convolution layer is used for adjusting the number of output channels in the first convolution kernel parameters after the dimension exchange according to the clipping rate to obtain adjusted second convolution kernel parameters;
The second dimension exchange module is used for exchanging the 0 th dimension and the 1 st dimension of the adjusted second convolution kernel parameter to obtain the target convolution kernel parameter.
3. The method of claim 2, wherein the target convolutional layer further comprises: the device comprises an attention module, a dimension adjustment module and a convolution module, wherein the attention module is connected between an input end of the target convolution layer and the second convolution layer, the dimension adjustment module is connected between the attention module and the second convolution layer, and the convolution module is connected between the input end of the target convolution layer and an output end of the target convolution layer;
The attention module is used for outputting tensors of preset dimensions according to the number of output channels of the feature map to be convolved input into the input end and the clipping rate;
the dimension adjustment module is used for adjusting the preset dimension to be the same as the dimension of the first convolution kernel parameter after dimension exchange according to the tensor of the preset dimension to obtain the convolution kernel parameter of the second convolution layer;
The convolution module is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameter to obtain a feature map after convolution.
4. A method according to claim 3, wherein the attention module comprises a global pooling layer, a fully connected layer, an activation function layer connected in sequence;
the output dimension of the full connection layer is the preset dimension, and the activation function of the activation function layer is a Sigmoid activation function.
5. The method of claim 1, wherein after obtaining the image processing model, the method further comprises:
Obtaining an image to be processed;
Inputting the image to be processed into the image processing model to obtain a processing result output by the image processing model, wherein the processing result is a classification result of the image to be processed or a recognition result of an object contained in the image to be processed.
6. A neural network building apparatus, the apparatus comprising:
the convolution layer determining module is used for determining a convolution layer to be replaced from the original neural network;
The replacing module is used for replacing the convolution layer to be replaced with a target convolution layer to obtain a target convolution neural network;
The training module is used for training the target convolutional neural network by taking the sample image set as a training sample to obtain an image processing model, and the image processing model is used for object identification or classification;
The target convolution kernel parameter of the target convolution layer is obtained by adjusting the convolution kernel parameter of the original convolution layer to be replaced according to the clipping rate of the original neural network; the target convolution layer is used for carrying out convolution operation on the feature map to be convolved according to the target convolution kernel parameters to obtain a convolved feature map;
The step of adjusting the convolution kernel parameters of the original convolution layer to be replaced according to the clipping rate of the original neural network comprises the following steps:
performing convolution processing on the convolution kernel parameters of the original convolution layer to be replaced according to the clipping rate, so as to obtain the target convolution kernel parameters;
the target convolution layer comprises a first convolution layer and a second convolution layer, and the convolution kernels in the first convolution layer and the second convolution layer are 1×1 convolution kernels;
The first convolution layer is used for adjusting the number of input channels in the convolution kernel parameters of the original convolution layer according to the clipping rate to obtain adjusted first convolution kernel parameters;
the second convolution layer is used for adjusting the number of output channels in the convolution kernel parameters of the original convolution layer according to the clipping rate and the adjusted first convolution kernel parameters.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the neural network building method of any one of claims 1-5.
8. A computer-readable storage medium, characterized in that a computer program stored therein causes a processor to execute the neural network construction method according to any one of claims 1 to 5.
CN202011403214.5A 2020-12-04 2020-12-04 Neural network construction method, device, equipment and medium Active CN112598110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011403214.5A CN112598110B (en) 2020-12-04 2020-12-04 Neural network construction method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011403214.5A CN112598110B (en) 2020-12-04 2020-12-04 Neural network construction method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112598110A CN112598110A (en) 2021-04-02
CN112598110B true CN112598110B (en) 2024-05-07

Family

ID=75188079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011403214.5A Active CN112598110B (en) 2020-12-04 2020-12-04 Neural network construction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112598110B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762472A (en) * 2021-08-24 2021-12-07 北京地平线机器人技术研发有限公司 Instruction sequence generation method and device of neural network
CN117216459A (en) * 2022-05-31 2023-12-12 北京有竹居网络技术有限公司 Convolution operation method, convolution operation device, electronic device, and storage medium
CN116151352B (en) * 2023-04-13 2024-06-04 中浙信科技咨询有限公司 Convolutional neural network diagnosis method based on brain information path integration mechanism

Citations (9)

Publication number Priority date Publication date Assignee Title
CN108596988A (en) * 2018-03-09 2018-09-28 西安电子科技大学 A kind of compression algorithm for convolutional neural networks
CN109389216A (en) * 2017-08-03 2019-02-26 珠海全志科技股份有限公司 The dynamic tailor method, apparatus and storage medium of neural network
CN109598340A (en) * 2018-11-15 2019-04-09 北京知道创宇信息技术有限公司 Method of cutting out, device and the storage medium of convolutional neural networks
CN110298446A (en) * 2019-06-28 2019-10-01 济南大学 The deep neural network compression of embedded system and accelerated method and system
CN110826684A (en) * 2018-08-08 2020-02-21 北京交通大学 Convolutional neural network compression method, convolutional neural network compression device, electronic device, and medium
CN110971901A (en) * 2018-09-29 2020-04-07 杭州海康威视数字技术股份有限公司 Convolutional neural network processing method and device
CN111199507A (en) * 2019-12-25 2020-05-26 深圳大学 Image steganography analysis method, intelligent terminal and storage medium
CN111680781A (en) * 2020-04-20 2020-09-18 北京迈格威科技有限公司 Neural network processing method, neural network processing device, electronic equipment and storage medium
CN111882040A (en) * 2020-07-30 2020-11-03 中原工学院 Convolutional neural network compression method based on channel number search

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20170206434A1 (en) * 2016-01-14 2017-07-20 Ford Global Technologies, Llc Low- and high-fidelity classifiers applied to road-scene images

Patent Citations (9)

Publication number Priority date Publication date Assignee Title
CN109389216A (en) * 2017-08-03 2019-02-26 珠海全志科技股份有限公司 The dynamic tailor method, apparatus and storage medium of neural network
CN108596988A (en) * 2018-03-09 2018-09-28 西安电子科技大学 A kind of compression algorithm for convolutional neural networks
CN110826684A (en) * 2018-08-08 2020-02-21 北京交通大学 Convolutional neural network compression method, convolutional neural network compression device, electronic device, and medium
CN110971901A (en) * 2018-09-29 2020-04-07 杭州海康威视数字技术股份有限公司 Convolutional neural network processing method and device
CN109598340A (en) * 2018-11-15 2019-04-09 北京知道创宇信息技术有限公司 Method of cutting out, device and the storage medium of convolutional neural networks
CN110298446A (en) * 2019-06-28 2019-10-01 济南大学 The deep neural network compression of embedded system and accelerated method and system
CN111199507A (en) * 2019-12-25 2020-05-26 深圳大学 Image steganography analysis method, intelligent terminal and storage medium
CN111680781A (en) * 2020-04-20 2020-09-18 北京迈格威科技有限公司 Neural network processing method, neural network processing device, electronic equipment and storage medium
CN111882040A (en) * 2020-07-30 2020-11-03 中原工学院 Convolutional neural network compression method based on channel number search

Non-Patent Citations (2)

Title
Dense Fully Convolutional Networks for Crop Recognition from Multitemporal SAR Image Sequences; Laura Elena Cué La Rosa et al.; IGARSS 2018; 7460-7463 *
Pruning optimization based on deep convolutional neural networks; Ma Zhinan et al.; Application of Electronic Technique; Vol. 44, No. 12; 119-122 *

Also Published As

Publication number Publication date
CN112598110A (en) 2021-04-02

Similar Documents

Publication Publication Date Title
CN112598110B (en) Neural network construction method, device, equipment and medium
CN110175671B (en) Neural network construction method, image processing method and device
EP4145353A1 (en) Neural network construction method and apparatus
WO2022116856A1 (en) Model structure, model training method, and image enhancement method and device
JP7065199B2 (en) Image processing methods and equipment, electronic devices, storage media and program products
EP3843004A1 (en) Portrait segmentation method, model training method and electronic device
CN111144561A (en) Neural network model determining method and device
CN111950723A (en) Neural network model training method, image processing method, device and terminal equipment
CN111783937A (en) Neural network construction method and system
CN111008924B (en) Image processing method and device, electronic equipment and storage medium
CN111797992A (en) Machine learning optimization method and device
CN111931901A (en) Neural network construction method and device
CN112132279A (en) Convolutional neural network model compression method, device, equipment and storage medium
CN115018039A (en) Neural network distillation method, target detection method and device
CN114511042A (en) Model training method and device, storage medium and electronic device
CN111242176B (en) Method and device for processing computer vision task and electronic system
CN111814534A (en) Visual task processing method and device and electronic system
CN110503149B (en) Method and system for classifying local features in image
CN111783935A (en) Convolutional neural network construction method, device, equipment and medium
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN112801266B (en) Neural network construction method, device, equipment and medium
CN111951260B (en) Partial feature fusion based convolutional neural network real-time target counting system and method
CN113095506A (en) Machine learning method, system and medium based on end, edge and cloud cooperation
CN111612732B (en) Image quality evaluation method, device, computer equipment and storage medium
CN110490876B (en) Image segmentation method based on lightweight neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant