WO2021110147A1 - Methods and apparatuses for image processing, image training and channel shuffling - Google Patents

Methods and apparatuses for image processing, image training and channel shuffling

Info

Publication number
WO2021110147A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
convolution
image group
subconvolution
feature
Application number
PCT/CN2020/133972
Other languages
French (fr)
Chinese (zh)
Inventor
张昱航
陈长国
杨凤海
Original Assignee
阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited)
Application filed by 阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited)
Publication of WO2021110147A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 - Image enhancement or restoration
    • G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features

Definitions

  • This application relates to the field of computer application technology, and in particular to an image processing method and device, an image training method and device based on a residual network, a channel shuffling method and device for a neural network, a neural network architecture, a computer storage medium, and an electronic device.
  • With the development of artificial intelligence, neural networks are widely used in transportation, medicine, and other fields, for example in image recognition, audio recognition, and human posture recognition.
  • A residual network structure can be used to solve the problems of vanishing gradients, exploding gradients, and network degradation that arise in deep networks.
  • ShuffleNet in the prior art is a lightweight neural network that adopts a residual network structure. However, because it performs channel shuffling after the output at the end of the network, the amount of calculation is large, and because it involves many operations on the intermediate channels, the accuracy of the output data is poor.
  • The present application provides an image processing method to solve the problems of large calculation amount and poor accuracy when processing images in the prior art.
  • This application provides an image processing method, including:
  • performing a convolution operation on an acquired feature image to obtain a convolution image set; splitting the convolution image set to obtain a first convolution image group and a second convolution image group; performing a convolution operation on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and performing channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence to obtain a shuffled feature image.
  • In some embodiments, splitting the convolution image set to obtain the first convolution image group and the second convolution image group includes:
  • splitting the convolution image set based on the number of channels in the convolution image set according to a set splitting requirement, to obtain the first convolution image group and the second convolution image group.
  • In some embodiments, splitting the convolution image set based on the number of channels in the convolution image set according to the set splitting requirement to obtain the first convolution image group and the second convolution image group includes:
  • if the number of channels is even, splitting the convolution image set, in channel order, into a first convolution image group and a second convolution image group with the same number of channels;
  • if the number of channels is odd, taking half of the number of channels and rounding it to an integer; and, in channel order, taking the convolution image corresponding to the channel adjacent to and greater than that integer as the last convolution image of the first convolution image group, and taking the convolution image adjacent to and greater than the channel of the last convolution image as the first convolution image of the second convolution image group.
  • In some embodiments, performing the convolution operation on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group includes:
  • performing a convolution operation on the first convolution image group and the second convolution image group respectively with a preset convolution kernel to obtain the first subconvolution image group and the second subconvolution image group.
  • In some embodiments, performing the convolution operation on the first convolution image group and the second convolution image group respectively with the preset convolution kernel includes: performing a convolution operation on the first convolution image group with a preset first convolution kernel to obtain the first subconvolution image group; and performing a convolution operation on the second convolution image group with a preset second convolution kernel to obtain the second subconvolution image group.
  • In some embodiments, performing channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence to obtain the shuffled feature image includes: splicing the first subconvolution image group and the second subconvolution image group to obtain a spliced feature image set; and
  • performing channel shuffling on the spliced feature image set to obtain the shuffled feature image.
  • In some embodiments, splicing the first subconvolution image group and the second subconvolution image group to obtain the spliced feature image set includes:
  • placing, in channel order, the first convolution map of the second subconvolution image group after the last convolution map of the first subconvolution image group to obtain the spliced feature image set.
  • In some embodiments, performing channel shuffling on the spliced feature image set to obtain the shuffled feature image includes: rearranging the spliced feature image set according to a set channel arrangement requirement to obtain the shuffled feature image.
  • In some embodiments, the method further includes:
  • determining an output feature image according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image.
  • In some embodiments, determining the output feature image according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image includes:
  • adding the image feature data of the feature image and the image feature data of the shuffled feature image, image by image, to determine the output feature image.
  • This application also provides an image processing device, including:
  • a first convolution unit, configured to perform a convolution operation on an acquired feature image to obtain a convolution image set;
  • a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
  • a second convolution unit, configured to perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and
  • a channel shuffling unit, configured to perform channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence to obtain a shuffled feature image.
  • This application also provides an image training method based on a residual network, including:
  • performing a convolution operation on an input feature image based on a residual network model architecture to obtain a convolution image set; splitting the convolution image set to obtain a first convolution image group and a second convolution image group; performing a convolution operation on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; performing channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence to obtain a shuffled feature image; and determining the trained feature image according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image.
  • This application also provides an image training device based on a residual network, including:
  • a first convolution unit, configured to perform a convolution operation on an input feature image based on a residual network model architecture to obtain a convolution image set;
  • a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
  • a second convolution unit, configured to perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group;
  • a channel shuffling unit, configured to perform channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence to obtain a shuffled feature image; and
  • a determining unit, configured to determine the trained feature image according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image.
  • This application also provides a channel shuffling method for a neural network, including:
  • obtaining a convolution image set by performing a convolution operation on a feature image input to the neural network; splitting the convolution image set to obtain a first convolution image group and a second convolution image group; performing a convolution operation on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and performing channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence.
  • In some embodiments, performing channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence includes: splicing the first subconvolution image group and the second subconvolution image group to obtain a spliced feature image set; and
  • performing channel shuffling on the spliced feature image set.
  • In some embodiments, splicing the first subconvolution image group and the second subconvolution image group to obtain the spliced feature image set includes:
  • placing, in channel order, the first convolution map of the second subconvolution image group after the last convolution map of the first subconvolution image group to obtain the spliced feature image set.
  • In some embodiments, performing channel shuffling on the spliced feature image set includes:
  • rearranging the spliced feature image set according to a set channel arrangement requirement.
  • This application also provides a neural network channel shuffling device, including:
  • a first convolution unit, configured to perform a convolution operation on a feature image input to the neural network to obtain a convolution image set;
  • a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
  • a second convolution unit, configured to perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and
  • a shuffling unit, configured to perform channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence.
  • This application also provides a neural network architecture, including:
  • a first convolution layer, configured to perform a convolution operation on an input feature image to obtain a convolution image set;
  • a split layer, configured to split the convolution image set obtained by the first convolution layer to obtain a first convolution image group and a second convolution image group;
  • a second convolution layer, configured to perform convolution operations on the first convolution image group and the second convolution image group obtained by the split layer, to obtain a first subconvolution image group and a second subconvolution image group;
  • a shuffling layer, configured to perform channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group convolved by the second convolution layer, to obtain a shuffled feature image; and
  • a calculation layer, configured to calculate the feature image to be output according to the image feature data of the feature image input to the first convolution layer and the image feature data of the feature image shuffled by the shuffling layer.
  • This application also provides a computer storage medium for storing data generated by a network platform and a program for processing the data generated by the network platform;
  • wherein, when the program is read and executed, it performs the steps of the image processing method described above, or the steps of the residual-network-based image training method described above, or the neural network channel shuffling method described above.
  • This application also provides an electronic device, including:
  • a processor; and a memory, configured to store a program for processing the data generated by the network platform;
  • wherein, when the program is read and executed by the processor, it performs the steps of the image processing method described above, or the steps of the residual-network-based image training method described above, or the neural network channel shuffling method described above.
  • An image processing method provided by the present application obtains a convolution image set by performing a convolution operation on an acquired feature image; the convolution image set is then split to obtain a first convolution image group and a second convolution image group; convolution operations are then performed on the first convolution image group and the second convolution image group to obtain a first subconvolution image group and a second subconvolution image group; finally, channel shuffling is performed on the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image.
  • Because the convolution image set is split, at least two branches are formed, namely the first convolution image group and the second convolution image group, which disperses the calculation amount of feature image processing; and because channel shuffling is performed on the first subconvolution image group and the second subconvolution image group before the feature image is finally output, the processing flow is shortened and the loss of image feature data is reduced, so the accuracy of the final output feature image can be improved.
  • FIG. 1 is a schematic diagram of the structure of a lightweight network in the prior art;
  • FIG. 2 is a flowchart of an embodiment of an image processing method provided by the present application;
  • FIG. 3 is a schematic structural diagram of an embodiment of an image processing device provided by the present application;
  • FIG. 4 is a flowchart of an embodiment of an image training method based on a residual network provided by the present application;
  • FIG. 5 is a schematic structural diagram of a residual network in an embodiment of an image training method based on a residual network provided by the present application;
  • FIG. 6 is a schematic diagram of a training process in an embodiment of an image training method based on a residual network provided by the present application;
  • FIG. 7 is a schematic structural diagram of an embodiment of an image training device based on a residual network provided by the present application;
  • FIG. 8 is a flowchart of an embodiment of a channel shuffling method for a neural network provided by the present application;
  • FIG. 9 is a schematic structural diagram of an embodiment of a neural network channel shuffling device provided by the present application.
  • FIG. 1 is a schematic diagram of a relatively simple deep network in the prior art.
  • The network is a four-layer fully connected network. It is assumed that the output of each layer after activation is f_i(x), where i denotes the i-th layer, x represents the input of the i-th layer (that is, the output of the (i-1)-th layer), and f is the activation function.
  • The backpropagation (BP) algorithm is an optimization algorithm for neural networks based on gradient descent.
  • The problem of network degradation mainly means that, as the network depth increases, a large number of redundant layers appear; these redundant layers cause differences between input and output, which leads to network degradation and in turn reduces the precision or accuracy of image recognition processing.
  • The prior art provides a lightweight network that adopts a residual network structure, thereby avoiding the problems of vanishing (or exploding) gradients and network degradation.
  • However, the lightweight network performs channel shuffling on the data after the output at the end of the network, resulting in a large amount of calculation, and it still suffers from poor image processing accuracy.
  • For this reason, the present application provides an image processing method that can reduce the amount of calculation and improve the accuracy of image processing.
  • FIG. 2 is a flowchart of an embodiment of an image processing method provided by the present application. The method includes:
  • Step S201: Perform a convolution operation on the acquired feature image to obtain a convolution image set.
  • The purpose of step S201 is to perform a dimension-raising operation on the feature image.
  • In a specific implementation, the acquired feature image may be a feature image input through the input terminal of the neural network.
  • A convolution operation is performed on the feature image to obtain a plurality of convolution images, and these convolution images constitute the convolution image set.
  • The convolution operation in step S201 may use a 1×1 convolution kernel to convolve the feature image, so as to obtain multiple convolution maps.
  • There may also be multiple acquired feature images; multiple convolution maps are obtained after convolution operations on them, and these convolution maps may contain new image feature data.
  • The new image feature data is derived from the acquired feature images; the term mainly distinguishes the image feature data after the convolution operation from the image feature data before the convolution operation. A minimal sketch of this dimension-raising step is given below.
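  • As a minimal, hypothetical illustration of this dimension-raising step (not code from the application), the following sketch applies a 1×1 convolution to a feature image to expand its channel count; the channel numbers and spatial size are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

# Hypothetical input: one feature image with 24 channels and 56x56 spatial size.
feature_image = torch.randn(1, 24, 56, 56)

# A 1x1 convolution raises the channel dimension (here 24 -> 48) without
# changing the spatial size; each output channel is one "convolution image",
# and together they form the convolution image set.
pointwise = nn.Conv2d(in_channels=24, out_channels=48, kernel_size=1)

convolution_image_set = pointwise(feature_image)
print(convolution_image_set.shape)  # torch.Size([1, 48, 56, 56])
```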
  • Step S202: Split the convolution image set to obtain a first convolution image group and a second convolution image group.
  • The purpose of step S202 is to divide the convolution image set into at least two branches, so that the amount of calculation can be reduced in subsequent processing.
  • The splitting process of step S202 may split the convolution image set into at least two parts (a one-into-two split), so that the first convolution image group and the second convolution image group are obtained.
  • The splitting requirement can be set to split into two. Specifically, the split can be made at half of the total number of channels of the convolution image set, that is, the set is split into two parts from the middle.
  • The number of parts can also be three or more.
  • In general, the split convolution image set includes n convolution image groups.
  • The splitting of the convolution image set in this embodiment can be divided into two cases, as follows:
  • If the number of channels is even, the convolution image set is split into a first convolution image group and a second convolution image group with the same number of channels; that is, the first convolution image group contains channels 1, 2, 3, ..., (1/2)k and the second convolution image group contains channels (1/2)k+1, ..., k, where k is the total number of convolution images in the convolution image set.
  • If the number of channels is odd, half of the number of channels is taken and rounded to an integer. In channel order, the convolution image corresponding to the channel adjacent to and greater than that integer is taken as the last convolution image of the first convolution image group, and the convolution image adjacent to and greater than the channel of that last convolution image is taken as the first convolution image of the second convolution image group. For example, with k = 7 channels, the first convolution image group contains channels 1 to 4 and the second convolution image group contains channels 5 to 7.
  • In the above, the convolution image set is mainly split evenly into two parts.
  • Alternatively, the channels can be divided according to parity: in channel order, the odd-numbered channels form one group and the even-numbered channels form the other group.
  • Of course, the splitting method is not limited to the above. An illustrative sketch of the split is given below.
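  • The following sketch is one possible reading of the split described above, assuming the convolution image set is held as a channel-stacked tensor; for an odd channel count it gives the extra channel to the first group, matching the rule in the text. The function name and shapes are illustrative only.

```python
import torch

def split_convolution_image_set(conv_images: torch.Tensor):
    """Split an (N, C, H, W) convolution image set along the channel axis.

    Even C: both groups receive C / 2 channels.
    Odd C: the first group receives floor(C / 2) + 1 channels and the second
    group receives the rest, following the rule described in the text.
    """
    channels = conv_images.shape[1]
    first_size = channels // 2 + (channels % 2)
    first_group = conv_images[:, :first_size]
    second_group = conv_images[:, first_size:]
    return first_group, second_group

# Example with an odd channel count of 7: groups of 4 and 3 channels.
conv_set = torch.randn(1, 7, 56, 56)
g1, g2 = split_convolution_image_set(conv_set)
print(g1.shape[1], g2.shape[1])  # 4 3
```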
  • Step S203: Perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group.
  • The purpose of step S203 is to reduce the amount of calculation by performing convolution operations on the first convolution image group and the second convolution image group separately.
  • The same convolution kernel can be used for both the first convolution image group and the second convolution image group; for example, a 5×5 convolution kernel is used for both, to obtain a first subconvolution image group corresponding to the first convolution image group and a second subconvolution image group corresponding to the second convolution image group.
  • Different convolution kernels can also be used for the first convolution image group and the second convolution image group; for example, a 3×3 convolution kernel is used for the first convolution image group and a 5×5 convolution kernel is used for the second convolution image group.
  • The sizes of the convolution kernels used for the first convolution image group and the second convolution image group are not limited to the above 3×3 and 5×5; convolution kernels of other sizes, for example 7×7, may also be used. In this embodiment, a 5×5 convolution kernel is used.
  • The size of the convolution kernel can be preset, or adjusted according to the amount of image feature data of the first convolution image group and the second convolution image group. For example, if the preset convolution kernel size is 5×5, it can be adjusted to 3×3 or 7×7 according to the image feature data of the first convolution image group; similarly, the kernel size can be adjusted when convolving the second convolution image group. In other words, when performing a convolution operation on either the first convolution image group or the second convolution image group, the size of the convolution kernel can be adjusted at any time, making the convolution operation more flexible.
  • After the convolution operation, a plurality of subconvolution images are generated corresponding to the first convolution image group and the second convolution image group.
  • The specific convolution process belongs to the prior art: roughly, the 5×5 convolution kernel is used as a filter that slides over each convolution image, and the values at corresponding positions are multiplied and summed to obtain the subconvolution image of each convolution image in the group.
  • Since in this embodiment the acquired feature image is split into at least two branches and the convolution operation is performed on the two branches separately, the calculation on the feature images is dispersed: each branch convolves 50% of the feature images, which reduces the amount of calculation by 50% compared with the calculation before splitting. A sketch of the two-branch convolution is given below.
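  • The per-branch convolution can be sketched as follows, assuming the two groups are convolved with separately preset kernels (here 3×3 and 5×5, with padding chosen so that the spatial size is preserved and the branches can later be spliced); the kernel sizes and channel counts are illustrative assumptions, and the text notes that this embodiment actually uses a 5×5 kernel.

```python
import torch
import torch.nn as nn

# Hypothetical branch inputs produced by the split (4 and 3 channels).
first_group = torch.randn(1, 4, 56, 56)
second_group = torch.randn(1, 3, 56, 56)

# Preset kernels for each branch; the padding keeps the spatial size unchanged
# so the two subconvolution image groups can be spliced along the channel axis.
first_kernel = nn.Conv2d(4, 4, kernel_size=3, padding=1)
second_kernel = nn.Conv2d(3, 3, kernel_size=5, padding=2)

first_subgroup = first_kernel(first_group)     # first subconvolution image group
second_subgroup = second_kernel(second_group)  # second subconvolution image group
print(first_subgroup.shape, second_subgroup.shape)
# torch.Size([1, 4, 56, 56]) torch.Size([1, 3, 56, 56])
```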
  • After the convolution operation is performed on the first convolution image group and the second convolution image group, the obtained first subconvolution image group and second subconvolution image group need to be restored to the structure before splitting, so the method proceeds to step S204.
  • Step S204: Perform channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence to obtain a shuffled feature image.
  • The purpose of step S204 is to restore the subconvolution image groups obtained by convolution on the split branches to the structure before splitting.
  • Channel shuffling can be understood as the process of disrupting the original channel stacking order in the feature maps generated by the neural network.
  • The shuffled feature image is the feature image with the channel order rearranged.
  • The specific implementation process of step S204 may include:
  • Step S204-1: Splice the first subconvolution image group and the second subconvolution image group to obtain a spliced feature image set;
  • Step S204-2: Perform channel shuffling on the spliced feature image set to obtain the shuffled feature image.
  • The splicing in step S204-1 can be understood as merging, that is, merging the first subconvolution image group and the second subconvolution image group so that the two form one subconvolution image set.
  • Specifically, the splicing can include:
  • placing, in channel order, the first convolution map of the second subconvolution image group after the last convolution map of the first subconvolution image group to obtain the spliced feature image set.
  • The splicing of the first subconvolution image group and the second subconvolution image group can refer to the splitting process above: the last position of the first subconvolution image group and the first position of the second subconvolution image group are joined to form the subconvolution image set.
  • For example, if the first subconvolution image group is K1, K2, K3, K4 and the second subconvolution image group is P1, P2, P3, P4, the spliced result is K1, K2, K3, K4, P1, P2, P3, P4.
  • The specific implementation process of step S204-2 can include:
  • The spliced feature image set K1, K2, K3, K4, P1, P2, P3, P4 is shuffled; if the set channel arrangement requirement is to interleave the two groups one channel at a time, the rearranged order is K1, P1, K2, P2, K3, P3, K4, P4.
  • In this way, the processing result for the acquired feature image is obtained, that is, the shuffled feature image. A sketch of this splice-and-interleave operation is given below.
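  • The splicing and interleaving of step S204 can be sketched as follows. The K1…K4 / P1…P4 example from the text corresponds to two groups of four channels each, and the view/transpose/reshape trick shown is one common way to realize the K1, P1, K2, P2, … ordering; the application does not prescribe a particular implementation.

```python
import torch

def splice_and_shuffle(first_sub: torch.Tensor, second_sub: torch.Tensor) -> torch.Tensor:
    """Splice two subconvolution image groups along the channel axis and
    interleave their channels: K1..Kn, P1..Pn -> K1, P1, K2, P2, ...

    Assumes both groups have the same number of channels, as in the
    K1..K4 / P1..P4 example.
    """
    n, c, h, w = first_sub.shape
    spliced = torch.cat([first_sub, second_sub], dim=1)  # K1..K4, P1..P4
    # View as (N, 2 groups, c channels, H, W), swap the group and channel
    # axes, then flatten back: this yields the interleaved channel order.
    return spliced.view(n, 2, c, h, w).transpose(1, 2).reshape(n, 2 * c, h, w)

# Tiny check with 1x1 "images" so the channel order is easy to read.
K = torch.arange(1, 5, dtype=torch.float32).view(1, 4, 1, 1)    # K1..K4
P = torch.arange(11, 15, dtype=torch.float32).view(1, 4, 1, 1)  # P1..P4
print(splice_and_shuffle(K, P).flatten().tolist())
# [1.0, 11.0, 2.0, 12.0, 3.0, 13.0, 4.0, 14.0]  i.e. K1, P1, K2, P2, ...
```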
  • After the shuffled feature image is obtained, the method may further include:
  • determining an output feature image according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image.
  • Specifically, the image feature data of the feature image and the image feature data of the shuffled feature image may be added, image by image, to determine the output feature image, as sketched below.
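  • A minimal sketch of this image-by-image (element-wise) addition follows; it assumes the input feature image and the shuffled feature image have matching shapes, which is an assumption of the example rather than a detail fixed by the application.

```python
import torch

# Hypothetical tensors with matching shapes: the originally acquired feature
# image and the shuffled feature image produced by the preceding steps.
feature_image = torch.randn(1, 8, 56, 56)
shuffled_feature_image = torch.randn(1, 8, 56, 56)

# Element-wise addition of the image feature data, as in a shortcut
# connection, gives the output feature image.
output_feature_image = feature_image + shuffled_feature_image
print(output_feature_image.shape)  # torch.Size([1, 8, 56, 56])
```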
  • Because the image processing in this embodiment splits the convolution image set to form at least two branches, the calculation amount of feature image processing is dispersed; and because channel shuffling is performed on the first subconvolution image group and the second subconvolution image group before the feature image is finally output, the processing flow is shortened and the loss of image feature data is reduced, so the accuracy of the final output feature image can be improved.
  • The image processing method provided by the embodiments of this application can be applied to pedestrian detection, pedestrian pose estimation, vehicle detection, aircraft detection, ship detection, logistics vehicle detection, drone detection, safe driving estimation, flight attitude estimation, and safety monitoring of factories, aprons, engineering sites, large-scale events, concerts, and other application scenarios, so that in these scenarios the amount of detection calculation can be reduced while the accuracy of the image detection results is ensured.
  • FIG. 3 is a schematic structural diagram of an embodiment of an image processing apparatus provided by the present application.
  • The device includes:
  • The first convolution unit 301 is configured to perform a convolution operation on the acquired feature image to obtain a convolution image set.
  • The purpose of the first convolution unit 301 is to perform a dimension-raising operation on the feature image.
  • In a specific implementation, the acquired feature image may be a feature image input through the input terminal of the neural network.
  • A convolution operation is performed on the feature image to obtain a plurality of convolution images, and these convolution images constitute the convolution image set.
  • The convolution operation may use a 1×1 convolution kernel to convolve the feature image, so as to obtain multiple convolution maps.
  • There may also be multiple acquired feature images; multiple convolution maps are obtained after convolution operations on them, and these convolution maps may contain new image feature data.
  • The splitting unit 302 is configured to split the convolution image set to obtain a first convolution image group and a second convolution image group.
  • The purpose of the splitting unit is to divide the convolution image set produced by the first convolution unit 301 into at least two branches, so that the amount of calculation can be reduced in subsequent processing; that is, convolution calculations are performed on the convolution image groups of the two branches separately, thereby dispersing the calculation amount.
  • The splitting process may split the convolution image set into at least two parts (a one-into-two split), so that the first convolution image group and the second convolution image group are obtained.
  • The splitting requirement can be set to split into two. Specifically, the split can be made at half of the total number of channels of the convolution image set, that is, the set is split into two parts from the middle.
  • The number of parts can also be three or more.
  • In general, the split convolution image set includes n convolution image groups.
  • The splitting of the convolution image set in this embodiment can be divided into two cases, as follows:
  • If the number of channels is even, the convolution image set is split into a first convolution image group and a second convolution image group with the same number of channels; that is, the first convolution image group contains channels 1, 2, 3, ..., (1/2)k and the second convolution image group contains channels (1/2)k+1, ..., k, where k is the total number of convolution images in the convolution image set.
  • If the number of channels is odd, half of the number of channels is taken and rounded to an integer. In channel order, the convolution image corresponding to the channel adjacent to and greater than that integer is taken as the last convolution image of the first convolution image group, and the convolution image adjacent to and greater than the channel of that last convolution image is taken as the first convolution image of the second convolution image group.
  • In the above, the convolution image set is mainly split evenly into two parts.
  • Alternatively, the channels can be divided according to parity: in channel order, the odd-numbered channels form one group and the even-numbered channels form the other group.
  • Of course, the splitting method is not limited to the above.
  • The second convolution unit 303 is configured to perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group.
  • The purpose of the second convolution unit 303 is to reduce the amount of calculation by performing convolution operations on the first convolution image group and the second convolution image group separately.
  • The same convolution kernel can be used for both the first convolution image group and the second convolution image group; for example, a 5×5 convolution kernel is used for both, to obtain a first subconvolution image group corresponding to the first convolution image group and a second subconvolution image group corresponding to the second convolution image group.
  • Different convolution kernels can also be used for the first convolution image group and the second convolution image group; for example, a 3×3 convolution kernel is used for the first convolution image group and a 5×5 convolution kernel is used for the second convolution image group.
  • The sizes of the convolution kernels used for the first convolution image group and the second convolution image group are not limited to the above 3×3 and 5×5; convolution kernels of other sizes, for example 7×7, may also be used. In this embodiment, a 5×5 convolution kernel is used.
  • The size of the convolution kernel can be preset, or adjusted according to the amount of image feature data of the first convolution image group and the second convolution image group. For example, if the preset convolution kernel size is 5×5, it can be adjusted to 3×3 or 7×7 according to the image feature data of the first convolution image group; similarly, the kernel size can be adjusted when convolving the second convolution image group. In other words, when performing a convolution operation on either the first convolution image group or the second convolution image group, the size of the convolution kernel can be adjusted at any time, making the convolution operation more flexible.
  • After the convolution operation, a plurality of subconvolution images are generated corresponding to the first convolution image group and the second convolution image group.
  • The specific convolution process belongs to the prior art: roughly, the 5×5 convolution kernel is used as a filter that slides over each convolution image, and the values at corresponding positions are multiplied and summed to obtain the subconvolution image of each convolution image in the group.
  • Since in this embodiment the acquired feature image is split into at least two branches and the convolution operation is performed on the two branches separately, the calculation on the feature images is dispersed: each branch convolves 50% of the feature images, which reduces the amount of calculation by 50% compared with the calculation before splitting.
  • The channel shuffling unit 304 is configured to perform channel shuffling on the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image.
  • The purpose of the channel shuffling unit 304 is to restore the subconvolution image groups obtained by convolution on the split branches to the structure before splitting.
  • The specific implementation can include:
  • a splicing subunit, configured to splice the first subconvolution image group and the second subconvolution image group to obtain a spliced feature image set; and a shuffling subunit, configured to perform channel shuffling on the spliced feature image set obtained by the splicing subunit to obtain the shuffled feature image.
  • The splicing in the splicing subunit can be understood as merging, that is, the first subconvolution image group and the second subconvolution image group are merged so that the two form one subconvolution image set.
  • Specifically, the splicing can include:
  • placing, in channel order, the first convolution map of the second subconvolution image group after the last convolution map of the first subconvolution image group to obtain the spliced feature image set.
  • The splicing of the first subconvolution image group and the second subconvolution image group can refer to the splitting process above: the last position of the first subconvolution image group and the first position of the second subconvolution image group are joined to form the subconvolution image set.
  • For example, if the first subconvolution image group is K1, K2, K3, K4 and the second subconvolution image group is P1, P2, P3, P4, the spliced result is K1, K2, K3, K4, P1, P2, P3, P4.
  • The specific implementation process of the shuffling subunit can include:
  • The spliced feature image set K1, K2, K3, K4, P1, P2, P3, P4 is shuffled; if the set channel arrangement requirement is to interleave the two groups one channel at a time, the rearranged order is K1, P1, K2, P2, K3, P3, K4, P4.
  • In this way, the processing result for the acquired feature image is obtained, that is, the shuffled feature image.
  • After the shuffled feature image is obtained, the process may further include:
  • determining an output feature image according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image.
  • Specifically, the image feature data of the feature image and the image feature data of the shuffled feature image may be added, image by image, to determine the output feature image.
  • Because the image processing in this embodiment splits the convolution image set to form at least two branches, the calculation amount of feature image processing is dispersed; and because channel shuffling is performed on the first subconvolution image group and the second subconvolution image group before the feature image is finally output, the processing flow is shortened and the loss of image feature data is reduced, thereby improving the accuracy of the final output feature image.
  • FIG. 4 is a flowchart of an embodiment of an image training method based on a residual network provided by the present application.
  • The training method includes:
  • Step S401: Based on the residual network model architecture, perform a convolution operation on the input feature image to obtain a convolution image set.
  • For the specific implementation of step S401, reference may be made to step S201 in the above embodiment of the image processing method provided in this application.
  • What is different in step S401 is the residual network model architecture, that is, the residual network.
  • FIG. 5 is a schematic structural diagram of a residual network in an embodiment of an image training method based on a residual network provided by the present application.
  • The residual network is a network composed of a series of basic residual modules with skip (shortcut) connections; that is, a shortcut (skip connection) is added every two layers to form a residual block, and multiple residual blocks connected together constitute the residual network (a generic sketch is given below).
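  • As a generic illustration of the residual block just described (a shortcut added around two layers, not the specific block of this application), the sketch below shows one residual block; the channel count and kernel size are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Generic residual block: two convolution layers plus a skip (shortcut)
    connection, i.e. output = F(x) + x."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x                      # skip connection around two layers
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + shortcut)  # residual addition

# Stacking several such blocks yields a residual network.
block = BasicResidualBlock()
print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```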
  • Step S402: Split the convolution image set to obtain a first convolution image group and a second convolution image group.
  • FIG. 6 is a schematic diagram of the training process in the embodiment of the image training method based on the residual network provided by the present application; the rightmost part of FIG. 6 shows the first convolution image group and the second convolution image group after splitting.
  • For the specific splitting process of the first convolution image group and the second convolution image group, reference may be made to step S202 in the above embodiment of the image processing method provided in this application, which is not repeated here.
  • The splitting in step S402 in this embodiment is performed inside the residual network, that is, the splitting is performed on the internal branch of the residual network.
  • Step S403: Perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group.
  • For the specific implementation of step S403, refer to FIG. 6: as shown in FIG. 6, an m×m convolution kernel is used to convolve the first convolution image group and the second convolution image group. Further details can be found in step S203 of the image processing method embodiment provided in this application and are not repeated here.
  • Step S404: Perform channel shuffling on the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image.
  • For step S404, refer to FIG. 6, in which the channels of the first subconvolution image group and the second subconvolution image group are interleaved with each other; the specific channel shuffling process can refer to step S204 in the above embodiment of the image processing method provided in this application and is not repeated here.
  • Step S405: Determine the feature image after training according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image.
  • In step S405, the image feature data of the input feature image and the image feature data of the shuffled feature image can be added, image by image, to determine the finally output feature image.
  • Corresponding to the above residual-network-based image training method, this application also discloses an image training device based on a residual network; for the device embodiment, please refer to FIG. 7.
  • Since the device embodiment is basically similar to the method embodiment, the description is relatively simple; for related parts, please refer to the description of the method embodiment.
  • The device embodiments described below are merely illustrative.
  • FIG. 7 is a schematic structural diagram of an embodiment of an image training device based on a residual network provided by the present application.
  • The device includes:
  • The first convolution unit 701 is configured to perform a convolution operation on the input feature image based on the residual network model architecture to obtain a convolution image set.
  • For the specific implementation of the first convolution unit 701, refer to step S401 in the above embodiment of the image training method based on a residual network.
  • The splitting unit 702 is configured to split the convolution image set to obtain a first convolution image group and a second convolution image group.
  • For the specific implementation of the splitting unit 702, reference may be made to step S402 in the above embodiment of the image training method based on a residual network.
  • The second convolution unit 703 is configured to perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group.
  • For the specific implementation of the second convolution unit 703, reference may be made to step S403 in the above embodiment of the image training method based on a residual network.
  • The channel shuffling unit 704 is configured to perform channel shuffling on the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image.
  • For the specific implementation of the channel shuffling unit 704, reference may be made to step S404 in the above embodiment of the image training method based on a residual network.
  • The determining unit 705 is configured to determine the feature image after training according to the acquired image feature data of the feature image and the image feature data of the shuffled feature image.
  • For the specific implementation of the determining unit 705, reference may be made to step S405 in the above embodiment of the image training method based on a residual network.
  • FIG. 8 is a flowchart of an embodiment of a neural network channel shuffling method provided by the present application.
  • The channel shuffling method includes:
  • Step S801: Obtain a convolution image set by performing a convolution operation on the feature image input to the neural network.
  • For the specific implementation of step S801, reference may be made to step S201 in the above embodiment of the image processing method provided by the present application, which is not repeated here.
  • Step S802: Split the convolution image set to obtain a first convolution image group and a second convolution image group.
  • For the specific implementation of step S802, reference may be made to step S202 in the above embodiment of the image processing method provided in this application, which is not repeated here.
  • Step S803: Perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group.
  • For the specific implementation of step S803, reference may be made to step S203 in the above embodiment of the image processing method provided in this application, which is not repeated here.
  • Step S804: Perform channel shuffling on the first subconvolution image group and the second subconvolution image group.
  • For the specific implementation of step S804, reference may be made to step S204 in the above embodiment of the image processing method provided in this application, which is not repeated here.
  • This application also provides a neural network channel shuffling device. As shown in FIG. 9, the device includes:
  • the first convolution unit 901, configured to perform a convolution operation on the feature image input to the neural network to obtain a convolution image set;
  • the splitting unit 902, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
  • the second convolution unit 903, configured to perform a convolution operation on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group; and
  • the shuffling unit 904, configured to perform channel shuffling on the first subconvolution image group and the second subconvolution image group.
  • This application also provides a neural network architecture, including:
  • a first convolution layer, configured to perform a convolution operation on an input feature image to obtain a convolution image set;
  • a split layer, configured to split the convolution image set obtained by the first convolution layer to obtain a first convolution image group and a second convolution image group;
  • a second convolution layer, configured to perform convolution operations on the first convolution image group and the second convolution image group obtained by the split layer, to obtain a first subconvolution image group and a second subconvolution image group;
  • a shuffling layer, configured to perform channel shuffling on the first subconvolution image group and the second subconvolution image group convolved by the second convolution layer, to obtain a shuffled feature image; and
  • a calculation layer, configured to calculate the feature image to be output according to the image feature data of the feature image input to the first convolution layer and the image feature data of the feature image shuffled by the shuffling layer. An illustrative composition of these layers is sketched below.
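  • Purely as an illustrative composition of the layers just listed (first convolution layer, split layer, second convolution layer, shuffling layer, and calculation layer), the following sketch wires them into a single block; all channel counts, the kernel sizes, and the 1×1 projection used to match shapes for the final addition are assumptions made for the example, not details fixed by the application.

```python
import torch
import torch.nn as nn

class ShuffleBlock(nn.Module):
    """Illustrative block: 1x1 convolution, channel split, per-branch
    convolution, interleaving channel shuffle, and a final addition."""

    def __init__(self, in_channels: int = 24, expanded_channels: int = 48):
        super().__init__()
        assert expanded_channels % 2 == 0
        half = expanded_channels // 2
        self.first_conv = nn.Conv2d(in_channels, expanded_channels, kernel_size=1)
        self.branch1 = nn.Conv2d(half, half, kernel_size=5, padding=2)
        self.branch2 = nn.Conv2d(half, half, kernel_size=5, padding=2)
        # Assumed 1x1 projection so the input feature image can be added to
        # the shuffled output when their channel counts differ.
        self.project = nn.Conv2d(in_channels, expanded_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        conv_set = self.first_conv(x)                # first convolution layer
        g1, g2 = conv_set.chunk(2, dim=1)            # split layer
        s1, s2 = self.branch1(g1), self.branch2(g2)  # second convolution layer
        n, c, h, w = s1.shape
        spliced = torch.cat([s1, s2], dim=1)
        shuffled = (spliced.view(n, 2, c, h, w)      # shuffling layer:
                    .transpose(1, 2)                 # interleave the two groups
                    .reshape(n, 2 * c, h, w))
        return self.project(x) + shuffled            # calculation layer

block = ShuffleBlock()
print(block(torch.randn(1, 24, 56, 56)).shape)  # torch.Size([1, 48, 56, 56])
```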
  • This application also provides a computer storage medium for storing data generated by a network platform and a program for processing the data generated by the network platform;
  • wherein, when the program is read and executed, it performs the steps of the image processing method described above, or the steps of the residual-network-based image training method described above, or the neural network channel shuffling method described above.
  • This application also provides an electronic device, including:
  • a processor; and a memory, configured to store a program for processing the data generated by the network platform;
  • wherein, when the program is read and executed by the processor, it performs the steps of the image processing method described above, or the steps of the residual-network-based image training method described above, or the neural network channel shuffling method described above.
  • The computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • The memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in the form of computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology.
  • The information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device.
  • As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • The embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in the present application are an image processing method and apparatus, a residual network-based image training method and apparatus, a channel shuffling method and apparatus for a neural network, a neural network architecture, a computer storage medium and an electronic device. The image processing method comprises: performing a convolution operation on acquired feature images to obtain a convolution image set; splitting the convolution image set to obtain a first convolution image group and a second convolution image group; performing a convolution operation on the first convolution image group and the second convolution image group separately to obtain a first subconvolution image group and a second subconvolution image group; and performing channel shuffling on the first subconvolution image group and the second subconvolution image group to obtain shuffled feature images. Therefore, the amount of calculations during the image processing process is reduced, the loss of image feature data is reduced, and the accuracy of the finally outputted feature images is improved.

Description

Methods and apparatuses for image processing, image training and channel shuffling
This application claims the priority of the Chinese patent application No. 201911242229.5, filed on December 6, 2019 and entitled "Methods and apparatuses for image processing, image training and channel shuffling", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of computer application technology, and in particular to an image processing method and device, an image training method and device based on a residual network, a channel shuffling method and device for a neural network, a neural network architecture, a computer storage medium, and an electronic device.
Background
With the development of artificial intelligence, neural networks are widely used in transportation, medicine, and other fields, for example in image recognition, audio recognition, and human posture recognition.
In existing neural networks, when optimizing the detection of human posture in images, the accuracy of recognition is commonly improved by deepening the network or by using transfer learning. Deepening the network means increasing the number of layers: the more layers the network has, the richer the feature levels that can be extracted, and the deeper the network, the more abstract and semantically meaningful the extracted features. Therefore, a deeper network structure is usually preferred when selecting a neural network, in order to obtain higher-level features. However, deep network structures suffer from problems such as vanishing gradients, exploding gradients, and network degradation.
Given these problems of existing deep networks, a residual network structure can be used to solve the above problems of vanishing gradients, exploding gradients, and network degradation. For example, ShuffleNet in the prior art is a lightweight neural network that adopts a residual network structure; however, because it performs channel shuffling after the output at the end of the network, the amount of calculation is large, and because it involves many operations on the intermediate channels, the accuracy of the output data is poor.
Summary of the invention
The present application provides an image processing method to solve the problems of large calculation amount and poor accuracy when processing images in the prior art.
This application provides an image processing method, including:
performing a convolution operation on an acquired feature image to obtain a convolution image set;
splitting the convolution image set to obtain a first convolution image group and a second convolution image group;
performing a convolution operation on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and
performing channel shuffling on the channels of the first subconvolution image group and the second subconvolution image group in sequence to obtain a shuffled feature image.
In some embodiments, splitting the convolution image set to obtain the first convolution image group and the second convolution image group includes:
splitting the convolution image set according to a set splitting requirement and based on the number of channels in the convolution image set, to obtain the first convolution image group and the second convolution image group.
In some embodiments, splitting the convolution image set according to the set splitting requirement and based on the number of channels in the convolution image set includes:
if the number of channels is even, splitting the convolution image set, in channel order, into a first convolution image group and a second convolution image group with the same number of channels;
if the number of channels is odd, taking half of the number of channels and rounding it to an integer; and,
in channel order, taking the convolution image of the channel adjacent to and greater than that integer as the last convolution image of the first convolution image group, and taking the convolution image adjacent to and greater than the channel of that last convolution image as the first convolution image of the second convolution image group.
In some embodiments, performing convolution operations on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group includes:
performing convolution operations on the first convolution image group and the second convolution image group with preset convolution kernels to obtain the first subconvolution image group and the second subconvolution image group.
In some embodiments, performing the convolution operations with preset convolution kernels includes:
performing a convolution operation on the first convolution image group with a preset first convolution kernel to obtain the first subconvolution image group; and
performing a convolution operation on the second convolution image group with a preset second convolution kernel to obtain the second subconvolution image group.
In some embodiments, shuffling the channel order of the first subconvolution image group and the second subconvolution image group to obtain the shuffled feature image includes:
concatenating the first subconvolution image group and the second subconvolution image group to obtain a concatenated feature image set; and
performing channel shuffling on the concatenated feature image set to obtain the shuffled feature image.
In some embodiments, concatenating the first subconvolution image group and the second subconvolution image group to obtain the concatenated feature image set includes:
placing, in channel order, the first convolution image of the second subconvolution image group after the last convolution image of the first subconvolution image group to obtain the concatenated feature image set.
In some embodiments, performing channel shuffling on the concatenated feature image set to obtain the shuffled feature image includes:
rearranging the concatenated feature image set according to a set channel arrangement requirement to obtain the shuffled feature image.
In some embodiments, the method further includes:
determining an output feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image.
In some embodiments, determining the output feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image includes:
adding the image feature data of the acquired feature image and the image feature data of the shuffled feature image element by element to determine the output feature image.
The present application further provides an image processing apparatus, including:
a first convolution unit, configured to perform a convolution operation on an acquired feature image to obtain a convolution image set;
a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
a second convolution unit, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and
a channel shuffling unit, configured to shuffle the channel order of the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image.
The present application further provides an image training method based on a residual network, including:
performing, based on a residual network model architecture, a convolution operation on an input feature image to obtain a convolution image set;
splitting the convolution image set to obtain a first convolution image group and a second convolution image group;
performing convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group;
shuffling the channel order of the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image; and
determining a trained feature image according to the image feature data of the input feature image and the image feature data of the shuffled feature image.
The present application further provides an image training apparatus based on a residual network, including:
a first convolution unit, configured to perform, based on a residual network model architecture, a convolution operation on an input feature image to obtain a convolution image set;
a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
a second convolution unit, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group;
a channel shuffling unit, configured to shuffle the channel order of the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image; and
a determining unit, configured to determine a trained feature image according to the image feature data of the input feature image and the image feature data of the shuffled feature image.
The present application further provides a channel shuffling method for a neural network, including:
performing a convolution operation on a feature image input into the neural network to obtain a convolution image set;
splitting the convolution image set to obtain a first convolution image group and a second convolution image group;
performing convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and
shuffling the channel order of the first subconvolution image group and the second subconvolution image group.
In some embodiments, shuffling the channel order of the first subconvolution image group and the second subconvolution image group includes:
concatenating the first subconvolution image group and the second subconvolution image group to obtain a concatenated feature image set; and
performing channel shuffling on the concatenated feature image set.
In some embodiments, concatenating the first subconvolution image group and the second subconvolution image group to obtain the concatenated feature image set includes:
placing, in channel order, the first convolution image of the second subconvolution image group after the last convolution image of the first subconvolution image group to obtain the concatenated feature image set.
In some embodiments, performing channel shuffling on the concatenated feature image set includes:
rearranging the concatenated feature image set according to a set channel arrangement requirement.
The present application further provides a channel shuffling apparatus for a neural network, including:
a first convolution unit, configured to perform a convolution operation on a feature image input into the neural network to obtain a convolution image set;
a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
a second convolution unit, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; and
a shuffling unit, configured to shuffle the channel order of the first subconvolution image group and the second subconvolution image group.
The present application further provides a neural network architecture, including:
a first convolution layer, configured to perform a convolution operation on an input feature image to obtain a convolution image set;
a splitting layer, configured to split the convolution image set produced by the first convolution layer to obtain a first convolution image group and a second convolution image group;
a second convolution layer, configured to perform convolution operations on the first convolution image group and the second convolution image group produced by the splitting layer, respectively, to obtain a first subconvolution image group and a second subconvolution image group;
a shuffling layer, configured to shuffle the channel order of the first subconvolution image group and the second subconvolution image group produced by the second convolution layer to obtain a shuffled feature image; and
a calculation layer, configured to calculate the feature image to be output according to the image feature data of the feature image input into the first convolution layer and the image feature data of the feature image shuffled by the shuffling layer.
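For illustration only, the five layers listed above can be assembled into a single block roughly as in the following sketch. It assumes a PyTorch-style implementation, which the present application does not specify; the class name ShuffleResidualBlock, the channel count and the kernel size are all hypothetical, and the 1x1 convolution is made channel-preserving here so that the calculation layer can add the input and the shuffled feature image directly, which is a simplification of the dimension-raising described elsewhere in this application.

```python
import torch
import torch.nn as nn


class ShuffleResidualBlock(nn.Module):
    """Illustrative block: first convolution layer (1x1), splitting layer,
    second convolution layer (one kernel per branch), shuffling layer and
    calculation layer (element-wise addition with the block input)."""

    def __init__(self, channels: int = 48, kernel_size: int = 5):
        super().__init__()
        pad = kernel_size // 2
        half = channels // 2
        self.expand = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.branch1 = nn.Conv2d(half, half, kernel_size, padding=pad, bias=False)
        self.branch2 = nn.Conv2d(channels - half, channels - half, kernel_size,
                                 padding=pad, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.expand(x)                                   # first convolution layer
        half = feats.shape[1] // 2
        first, second = feats[:, :half], feats[:, half:]         # splitting layer
        first, second = self.branch1(first), self.branch2(second)  # second convolution layer
        out = torch.cat([first, second], dim=1)
        n, c, h, w = out.shape
        out = out.view(n, 2, c // 2, h, w).transpose(1, 2).contiguous().view(n, c, h, w)  # shuffling layer
        return x + out                                           # calculation layer


block = ShuffleResidualBlock()
print(block(torch.randn(1, 48, 56, 56)).shape)   # torch.Size([1, 48, 56, 56])
```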
The present application further provides a computer storage medium for storing data generated by a network platform and a program for processing the data generated by the network platform;
when the program is read and executed, it performs the steps of the image processing method described above, or the steps of the image training method based on a residual network described above, or the channel shuffling method for a neural network described above.
The present application further provides an electronic device, including:
a processor; and
a memory for storing a program for processing data generated by a network platform, wherein the program, when read and executed by the processor, performs the steps of the image processing method described above, or the steps of the image training method based on a residual network described above, or the channel shuffling method for a neural network described above.
Compared with the prior art, the present application has the following advantages:
In the image processing method provided by the present application, a convolution operation is performed on an acquired feature image to obtain a convolution image set; the convolution image set is then split into a first convolution image group and a second convolution image group; convolution operations are then performed on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group; finally, channel shuffling is performed on the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image. Because the convolution image set is split into at least two branches, namely the first convolution image group and the second convolution image group, the computation required to process the feature image is spread across the branches; and because the first subconvolution image group and the second subconvolution image group are channel-shuffled before the feature image is finally output, the processing path is shortened and the loss of image feature data is reduced, so the precision of the finally output feature image can be improved.
Description of the drawings
Fig. 1 is a schematic structural diagram of a lightweight network in the prior art;
Fig. 2 is a flowchart of an embodiment of an image processing method provided by the present application;
Fig. 3 is a schematic structural diagram of an embodiment of an image processing apparatus provided by the present application;
Fig. 4 is a flowchart of an embodiment of an image training method based on a residual network provided by the present application;
Fig. 5 is a schematic structural diagram of the residual network in an embodiment of the image training method based on a residual network provided by the present application;
Fig. 6 is a schematic diagram of the training process in an embodiment of the image training method based on a residual network provided by the present application;
Fig. 7 is a schematic structural diagram of an embodiment of an image training apparatus based on a residual network provided by the present application;
Fig. 8 is a flowchart of an embodiment of a channel shuffling method for a neural network provided by the present application;
Fig. 9 is a schematic structural diagram of an embodiment of a channel shuffling apparatus for a neural network provided by the present application.
Detailed description of embodiments
In the following description, numerous specific details are set forth to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from the substance of the present application; the present application is therefore not limited by the specific implementations disclosed below.
The terms used in the present application are only for the purpose of describing particular embodiments and are not intended to limit the application. Expressions such as "a", "first" and "second" used in the present application and in the appended claims are not limitations on quantity or order, but are used to distinguish pieces of information of the same type from one another.
To better understand the image processing method provided by the present application, the background of the technical solution of the present application is first explained with reference to the background section and the application scenarios.
As described in the background above, existing deep neural network structures suffer from gradient vanishing, gradient explosion and network degradation; the gradient vanishing and gradient explosion problems can be understood with reference to Fig. 1. Fig. 1 is a schematic diagram of a relatively simple deep network in the prior art, a four-layer fully connected network. Suppose the activated output of the $i$-th layer is $f_i(x)$, where $x$ denotes the input of the $i$-th layer (that is, the output of layer $i-1$) and $f$ is the activation function. Then $f_{i+1} = f(f_i \cdot w_{i+1} + b_{i+1})$, abbreviated as $f_{i+1} = f(f_i \cdot w_{i+1})$. The BP algorithm (an optimization algorithm for neural networks based on gradient descent) adjusts the parameters along the negative gradient direction of the objective, with the update $w = w + \Delta w$; given a learning rate $\alpha$,
$$\Delta w = -\alpha \,\frac{\partial Loss}{\partial w}.$$
To update the weights of the second hidden layer, the chain rule gives the gradient update
$$\frac{\partial Loss}{\partial w_2} = \frac{\partial Loss}{\partial f_4}\cdot\frac{\partial f_4}{\partial f_3}\cdot\frac{\partial f_3}{\partial f_2}\cdot\frac{\partial f_2}{\partial w_2}.$$
From this formula, $\frac{\partial f_2}{\partial w_2} = f_1$, i.e. the input of the second hidden layer. The remaining factors of the form $\frac{\partial f_{i+1}}{\partial f_i}$ are derivatives of the activation function. If each such factor is greater than 1, then as the number of layers increases the resulting gradient update grows exponentially, i.e. gradient explosion occurs; if each factor is less than 1, the gradient update information decays exponentially, i.e. the gradient vanishes. In short, gradient vanishing refers to the phenomenon that the gradient norm of the parameters decreases exponentially as network depth increases. A very small gradient means the parameters change very slowly, so the learning process stalls until the gradient becomes large enough, which typically takes an exponentially long time and makes the image recognition process time-consuming.
The network degradation problem mainly means that, as network depth increases, a large number of redundant layers appear; these redundant layers introduce discrepancies between input and output, which degrades the network and thus lowers the precision or accuracy of image recognition processing.
In view of the above drawbacks brought by increasing network depth, the prior art provides a lightweight network that adopts a residual network structure so as to avoid the gradient vanishing (explosion) problem and the network degradation problem. However, because this lightweight network performs channel shuffling on the data after the output at the end of the network, the amount of calculation is large, and the lightweight network still suffers from poor image processing precision.
In view of the above, the present application provides an image processing method that can reduce the amount of calculation while improving the precision of image processing.
Please refer to Fig. 2, which is a flowchart of an embodiment of an image processing method provided by the present application. The method includes:
Step S201: performing a convolution operation on an acquired feature image to obtain a convolution image set.
The purpose of step S201 is to raise the dimensionality (number of channels) of the feature image. In a specific implementation, the acquired feature image may be the feature image fed in through the input of the neural network. A convolution operation is performed on the feature image to obtain multiple convolved images, and these convolution images form the convolution image set. In this embodiment, the convolution operation in step S201 may use a 1×1 convolution kernel on the feature image, thereby obtaining multiple convolution maps.
There may also be multiple acquired feature images; convolving them yields multiple convolution maps, which may contain new image feature data. It should be noted that the new image feature data is derived from the acquired feature images; "new" here only marks the difference between the image feature data after the convolution operation and the image feature data before it.
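A minimal sketch of this channel-raising step, assuming a PyTorch-style implementation (the framework and the channel counts are illustrative choices, not taken from this application):

```python
import torch
import torch.nn as nn

# 1x1 convolution used only to raise the channel dimension of the input
# feature image; 24 -> 48 channels is an assumed, illustrative choice.
expand = nn.Conv2d(in_channels=24, out_channels=48, kernel_size=1, bias=False)

x = torch.randn(1, 24, 56, 56)       # input feature image, layout (N, C, H, W)
conv_image_set = expand(x)           # "convolution image set" with 48 channels
print(conv_image_set.shape)          # torch.Size([1, 48, 56, 56])
```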
Step S202: splitting the convolution image set to obtain a first convolution image group and a second convolution image group.
The purpose of step S202 is to divide the convolution image set into at least two branches so that the amount of calculation can be reduced in subsequent processing.
In this embodiment, the splitting in step S202 may divide the convolution image set into at least two parts, thereby obtaining the first convolution image group and the second convolution image group. The splitting requirement may be set to splitting in half, specifically splitting at half of the total number of channels of the convolution image set, i.e. dividing the set into two parts from the middle. Of course, the set may also be split into three or more parts; in other words, the split convolution image set may comprise n convolution image groups.
Since the number of channels may be even or odd, the splitting of the convolution image set in this embodiment can be handled in two cases, as follows.
If the number of channels is even, the convolution image set is split, in channel order, into a first convolution image group and a second convolution image group with the same number of channels: the first convolution image group consists of channels 1, 2, 3, ..., (1/2)k and the second convolution image group consists of channels (1/2)k+1, (1/2)k+2, ..., k, where k is the total number of convolution maps in the convolution image set.
If the number of channels is odd, half of the number of channels is taken and rounded to an integer; then, in channel order, the convolution image of the channel adjacent to and greater than this integer is taken as the last convolution image of the first convolution image group, and the convolution image adjacent to and greater than the channel of that last convolution image is taken as the first convolution image of the second convolution image group. That is, the last convolution image of the first convolution image group corresponds to channel ⌊k/2⌋+1, and the first convolution image of the second convolution image group corresponds to channel ⌊k/2⌋+2.
In this embodiment, the convolution image set is mainly split in half, i.e. divided evenly into two parts. It may of course also be split by channel parity, i.e. according to the channel order, the odd-numbered channels form one group and the even-numbered channels form another group; it may likewise be split into more groups, and the splitting manner is not limited to the above.
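A sketch of the split-in-half case along the channel axis, assuming PyTorch tensors (the tensor sizes are illustrative); the odd-channel rule follows the description above, so the first group keeps the extra channel:

```python
import torch

def channel_split(x: torch.Tensor):
    """Split a (N, C, H, W) convolution image set into two groups along the
    channel axis. For an even channel count the groups are equal; for an odd
    count the first group ends at channel floor(C/2)+1, as described above."""
    c = x.shape[1]
    end_of_first = c // 2 if c % 2 == 0 else c // 2 + 1
    return x[:, :end_of_first], x[:, end_of_first:]

first_group, second_group = channel_split(torch.randn(1, 48, 56, 56))
print(first_group.shape[1], second_group.shape[1])   # 24 24

first_odd, second_odd = channel_split(torch.randn(1, 7, 56, 56))
print(first_odd.shape[1], second_odd.shape[1])       # 4 3
```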
After the convolution image set has been split as described above, convolution operations need to be performed again on the split first convolution image group and second convolution image group, which leads to step S203.
Step S203: performing convolution operations on the first convolution image group and the second convolution image group respectively to obtain the first subconvolution image group and the second subconvolution image group.
The purpose of step S203 is to reduce the amount of calculation by convolving the first convolution image group and the second convolution image group separately.
In this embodiment, the same convolution kernel may be used for the first convolution image group and the second convolution image group, for example a 5×5 kernel for both, thereby obtaining the first subconvolution image group corresponding to the first convolution image group and the second subconvolution image group corresponding to the second convolution image group.
It is understood that different convolution kernels may also be used for the two groups, for example a 3×3 kernel for the first convolution image group and a 5×5 kernel for the second convolution image group.
The kernel sizes used for the first convolution image group and the second convolution image group are not limited to 3×3 and 5×5; kernels of other sizes, such as 7×7, may also be used. In this embodiment a 5×5 kernel is adopted.
The kernel size may be preset, or it may be adjusted according to the amount of image feature data in the first convolution image group and the second convolution image group. For example, if the preset kernel size is 5×5, it may be adjusted to 3×3 or 7×7 according to the image feature data of the first convolution image group; likewise, the kernel size may be adjusted when convolving the second convolution image group. In other words, when performing the convolution operation on either of the two convolution image groups, the kernel size can be adjusted at any time to achieve a better convolution operation.
After the convolution operations are performed on the first convolution image group and the second convolution image group, multiple convolved subconvolution images are generated for each group. The specific convolution process belongs to the prior art: roughly, the 5×5 convolution kernel slides over each convolution image as a filter, the values at corresponding positions are multiplied and summed, and a subconvolution image is obtained for each convolution image in the group. Because in this embodiment the acquired feature image is split into at least two branches and the two branches are convolved separately, the computation on the feature image is spread into two parts, i.e. each convolution operates on 50% of the feature image, so the amount of calculation is reduced by 50% compared with the calculation before splitting.
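A sketch of the two branch convolutions, assuming a PyTorch-style implementation; the 5×5 kernels mirror the size mentioned above (a 3×3 kernel could equally be assigned to one branch), the channel counts are illustrative, and padding is added only to keep the spatial size unchanged:

```python
import torch
import torch.nn as nn

channels = 24  # channels per group after the split (illustrative)

# Each branch gets its own kernel; identical 5x5 kernels are used here.
branch1_conv = nn.Conv2d(channels, channels, kernel_size=5, padding=2, bias=False)
branch2_conv = nn.Conv2d(channels, channels, kernel_size=5, padding=2, bias=False)

first_group = torch.randn(1, channels, 56, 56)
second_group = torch.randn(1, channels, 56, 56)

first_sub = branch1_conv(first_group)     # first subconvolution image group
second_sub = branch2_conv(second_group)   # second subconvolution image group
print(first_sub.shape, second_sub.shape)
```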
After convolving the first convolution image group and the second convolution image group, the obtained first subconvolution image group and second subconvolution image group need to be restored to the structure before splitting; this leads to step S204.
Step S204: performing channel shuffling on the channel order of the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image.
The purpose of step S204 is to restore the subconvolution image groups obtained by convolving the split branches to the structure before splitting. Channel shuffling can be understood as the process of disturbing the original channel stacking order of the feature maps generated by the neural network, and the shuffled feature image is the feature image whose channel order has been rearranged. Step S204 may specifically include:
Step S204-1: concatenating the first subconvolution image group and the second subconvolution image group to obtain a concatenated feature image set;
Step S204-2: performing channel shuffling on the concatenated feature image set to obtain the shuffled feature image.
The concatenation in step S204-1 can be understood as merging, i.e. combining the first subconvolution image group and the second subconvolution image group into one subconvolution image set. The concatenation may specifically include:
placing, in channel order, the first convolution map of the second subconvolution image group after the last convolution map of the first subconvolution image group to obtain the concatenated feature image set. In other words, the concatenation mirrors the splitting described above: under the split-in-half scheme, the end of the first subconvolution image group is joined to the head of the second subconvolution image group. For example, if the first subconvolution image group is K1, K2, K3, K4 and the second subconvolution image group is P1, P2, P3, P4, the result after concatenation is K1, K2, K3, K4, P1, P2, P3, P4.
To ensure that image feature information flows between the channel groups and to improve the representation capability of the image feature information, channel shuffling needs to be performed on the concatenated feature image set produced in step S204-1. Step S204-2 may therefore include:
rearranging the concatenated feature image set according to a set channel arrangement requirement to obtain the shuffled feature image. Continuing the example of step S204-1, the concatenated set K1, K2, K3, K4, P1, P2, P3, P4 is shuffled; if the channel arrangement requirement is 1, (1/2)k+1, 2, (1/2)k+2, ..., (1/2)k, k (here 1, 5, 2, 6, 3, 7, 4, 8), the rearranged result is K1, P1, K2, P2, K3, P3, K4, P4.
Based on the above, the processing result for the acquired feature image, namely the shuffled feature image, is obtained. After the shuffled feature image is obtained, the method may further include:
determining an output feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image. Specifically, the image feature data of the acquired feature image and the image feature data of the shuffled feature image may be added element by element to determine the output feature image.
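A sketch combining the channel shuffle just described with the element-wise residual addition, assuming PyTorch tensors; with two groups of four channels the interleaving reproduces the K1, P1, K2, P2, ... ordering of the example above, and the shapes of the block input and the shuffled feature image are assumed to match:

```python
import torch

def concat_and_shuffle(first_sub: torch.Tensor, second_sub: torch.Tensor) -> torch.Tensor:
    """Concatenate the two subconvolution image groups along the channel axis,
    then interleave their channels (K1..K4, P1..P4 -> K1, P1, K2, P2, ...)."""
    x = torch.cat([first_sub, second_sub], dim=1)     # concatenated feature image set
    n, c, h, w = x.shape
    x = x.view(n, 2, c // 2, h, w).transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

# Tiny channel-order check reproducing the K/P example above.
k = torch.arange(1, 5).view(1, 4, 1, 1)               # stands for K1..K4
p = torch.arange(5, 9).view(1, 4, 1, 1)               # stands for P1..P4
shuffled = concat_and_shuffle(k, p)
print(shuffled.flatten().tolist())                    # [1, 5, 2, 6, 3, 7, 4, 8]

# Element-wise addition of the block input and the shuffled feature image
# gives the output feature image.
x_in = torch.randn(1, 8, 1, 1)
output = x_in + shuffled.float()
print(output.shape)                                   # torch.Size([1, 8, 1, 1])
```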
Because the image processing in this embodiment is split into at least two branches, the computation of the feature image processing is spread across the branches, and because channel shuffling is performed on the first subconvolution image group and the second subconvolution image group before the feature image is finally output, the processing path is shortened and the loss of image feature data is reduced, so the precision of the finally output feature image can be improved.
The image processing method provided by the embodiments of the present application can be applied to scenarios such as pedestrian detection, pedestrian pose estimation, vehicle detection, aircraft detection, ship detection, logistics vehicle detection, unmanned aerial vehicle detection, safe-driving estimation, flight attitude estimation, and safety monitoring of factories, aprons, engineering sites, large-scale events and concerts, so that in these scenarios the output image detection results maintain accuracy while the amount of detection computation is reduced.
The above is a detailed description of an embodiment of the image processing method provided by the present application. Corresponding to the image processing method embodiment described above, the present application further discloses an embodiment of an image processing apparatus; please refer to Fig. 3. Since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly, and the relevant parts can be understood with reference to the description of the method embodiment. The apparatus embodiment described below is merely illustrative.
Please refer to Fig. 3, which is a schematic structural diagram of an embodiment of an image processing apparatus provided by the present application. The apparatus includes:
a first convolution unit 301, configured to perform a convolution operation on an acquired feature image to obtain a convolution image set;
The purpose of the first convolution unit 301 is to raise the dimensionality of the feature image. In a specific implementation, the acquired feature image may be the feature image fed in through the input of the neural network; a convolution operation is performed on it to obtain multiple convolved images, which form the convolution image set. In this embodiment, the convolution operation may use a 1×1 convolution kernel, thereby obtaining multiple convolution maps.
There may also be multiple acquired feature images; convolving them yields multiple convolution maps, which may contain new image feature data.
a splitting unit 302, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
The purpose of the splitting unit 302 is to divide the convolution image set produced by the first convolution unit 301 into at least two branches so that the amount of calculation can be reduced in subsequent processing, i.e. the convolution calculations are performed separately on the convolution image groups of the branches, thereby spreading the computation.
In this embodiment, the splitting may divide the convolution image set into at least two parts, thereby obtaining the first convolution image group and the second convolution image group. The splitting requirement may be set to splitting in half, specifically splitting at half of the total number of channels of the convolution image set, i.e. dividing the set into two parts from the middle. Of course, the set may also be split into three or more parts; in other words, the split convolution image set may comprise n convolution image groups.
Since the number of channels may be even or odd, the splitting of the convolution image set in this embodiment can be handled in two cases, as follows.
If the number of channels is even, the convolution image set is split, in channel order, into a first convolution image group and a second convolution image group with the same number of channels: the first convolution image group consists of channels 1, 2, 3, ..., (1/2)k and the second convolution image group consists of channels (1/2)k+1, (1/2)k+2, ..., k, where k is the total number of convolution maps in the convolution image set.
If the number of channels is odd, half of the number of channels is taken and rounded to an integer; then, in channel order, the convolution image of the channel adjacent to and greater than this integer is taken as the last convolution image of the first convolution image group, and the convolution image adjacent to and greater than the channel of that last convolution image is taken as the first convolution image of the second convolution image group. That is, the last convolution image of the first convolution image group corresponds to channel ⌊k/2⌋+1, and the first convolution image of the second convolution image group corresponds to channel ⌊k/2⌋+2.
In this embodiment, the convolution image set is mainly split in half, i.e. divided evenly into two parts. It may of course also be split by channel parity, i.e. according to the channel order, the odd-numbered channels form one group and the even-numbered channels form another group; it may likewise be split into more groups, and the splitting manner is not limited to the above.
a second convolution unit 303, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first subconvolution image group and a second subconvolution image group;
The purpose of the second convolution unit 303 is to reduce the amount of calculation by convolving the first convolution image group and the second convolution image group separately.
In this embodiment, the same convolution kernel may be used for the first convolution image group and the second convolution image group, for example a 5×5 kernel for both, thereby obtaining the first subconvolution image group corresponding to the first convolution image group and the second subconvolution image group corresponding to the second convolution image group.
It is understood that different convolution kernels may also be used for the two groups, for example a 3×3 kernel for the first convolution image group and a 5×5 kernel for the second convolution image group.
The kernel sizes are not limited to 3×3 and 5×5; kernels of other sizes, such as 7×7, may also be used. In this embodiment a 5×5 kernel is adopted.
The kernel size may be preset, or it may be adjusted according to the amount of image feature data in the first convolution image group and the second convolution image group. For example, if the preset kernel size is 5×5, it may be adjusted to 3×3 or 7×7 according to the image feature data of the first convolution image group; likewise, the kernel size may be adjusted when convolving the second convolution image group. In other words, when performing the convolution operation on either of the two convolution image groups, the kernel size can be adjusted at any time to achieve a better convolution operation.
After the convolution operations are performed on the first convolution image group and the second convolution image group, multiple convolved subconvolution images are generated for each group. The specific convolution process belongs to the prior art: roughly, the 5×5 convolution kernel slides over each convolution image as a filter, the values at corresponding positions are multiplied and summed, and a subconvolution image is obtained for each convolution image in the group. Because in this embodiment the acquired feature image is split into at least two branches and the two branches are convolved separately, the computation on the feature image is spread into two parts, i.e. each convolution operates on 50% of the feature image, so the amount of calculation is reduced by 50% compared with the calculation before splitting.
a channel shuffling unit 304, configured to perform channel shuffling on the first subconvolution image group and the second subconvolution image group to obtain a shuffled feature image.
The purpose of the channel shuffling unit 304 is to restore the subconvolution image groups obtained by convolving the split branches to the structure before splitting. Its specific implementation may include:
a concatenating subunit, configured to concatenate the first subconvolution image group and the second subconvolution image group to obtain a concatenated feature image set; and
a shuffling subunit, configured to perform channel shuffling on the concatenated feature image set obtained by the concatenating subunit to obtain the shuffled feature image.
The concatenation in the concatenating subunit can be understood as merging, i.e. combining the first subconvolution image group and the second subconvolution image group into one subconvolution image set. The concatenation may specifically include:
placing, in channel order, the first convolution map of the second subconvolution image group after the last convolution map of the first subconvolution image group to obtain the concatenated feature image set. In other words, the concatenation mirrors the splitting described above: under the split-in-half scheme, the end of the first subconvolution image group is joined to the head of the second subconvolution image group. For example, if the first subconvolution image group is K1, K2, K3, K4 and the second subconvolution image group is P1, P2, P3, P4, the result after concatenation is K1, K2, K3, K4, P1, P2, P3, P4.
To ensure that image feature information flows between the channel groups and to improve the representation capability of the image feature information, channel shuffling needs to be performed on the concatenated feature image set produced by the concatenating subunit. The shuffling subunit may therefore be implemented as:
rearranging the concatenated feature image set according to a set channel arrangement requirement to obtain the shuffled feature image. Continuing the example given for the concatenating subunit, the concatenated set K1, K2, K3, K4, P1, P2, P3, P4 is shuffled; if the channel arrangement requirement is 1, (1/2)k+1, 2, (1/2)k+2, ..., (1/2)k, k (here 1, 5, 2, 6, 3, 7, 4, 8), the rearranged result is K1, P1, K2, P2, K3, P3, K4, P4.
Based on the above, the processing result for the acquired feature image, namely the shuffled feature image, is obtained. After the shuffled feature image is obtained, the apparatus may further determine an output feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image; specifically, the image feature data of the acquired feature image and the image feature data of the shuffled feature image may be added element by element to determine the output feature image.
Because the image processing in this embodiment is split into at least two branches, the computation of the feature image processing is spread across the branches, and because channel shuffling is performed on the first subconvolution image group and the second subconvolution image group before the feature image is finally output, the processing path is shortened and the loss of image feature data is reduced, thereby improving the precision of the finally output feature image.
基于上述内容,请参考图4所示,图4是本申请还提供一种基于残差网络的图像训练方法实施例的流程图,该训练方法包括:Based on the above content, please refer to FIG. 4. FIG. 4 is a flowchart of an embodiment of an image training method based on a residual network provided by the present application. The training method includes:
步骤S401:基于残差网络模型架构,对输入的特征图像进行卷积操作,获得卷积图像集合;Step S401: Based on the residual network model architecture, perform a convolution operation on the input feature image to obtain a convolution image set;
所述步骤S401的具体实现过程可以参考上述本申请提供的一种图像处理方法实施例中的步骤S201。For the specific implementation process of step S401, reference may be made to step S201 in the above-mentioned image processing method embodiment provided in this application.
对于步骤S401需要说明的是残差网络模型架构,即残差网络。请参考图5所示,图5是本申请提供的一种基于残差网络的图像训练方法实施例中残差网络的结构示意图。所述残差网络是一种带有信息跳跃连接的一系列基础残差模块组成的网络,即每两层增加一个捷径(跳跃连接),构成一个残差块,多个残差块连接在一起即构成一个残差网络。What needs to be explained for step S401 is the residual network model architecture, that is, the residual network. Please refer to FIG. 5, which is a schematic structural diagram of a residual network in an embodiment of an image training method based on a residual network provided by the present application. The residual network is a network composed of a series of basic residual modules with information jump connections, that is, every two layers add a shortcut (jump connection) to form a residual block, and multiple residual blocks are connected together It constitutes a residual network.
步骤S402:拆分所述卷积图像集合,获得第一卷积图像组和第二卷积图像组;Step S402: Split the convolution image set to obtain a first convolution image group and a second convolution image group;
对于步骤S402具体实现过程可以参考图6所示,图6是本申请提供的一种基于残差网络的图像训练方法实施例中训练过程的示意图;图6的最右侧为拆分后的第一卷积图像组和第二卷积图像组,具体拆分过程可以参考上述本申请提供的一种图像处理方法实施例中的步骤S202,此处不再赘述。For the specific implementation process of step S402, refer to FIG. 6. FIG. 6 is a schematic diagram of the training process in the embodiment of the image training method based on the residual network provided by the present application; the rightmost part of FIG. 6 is the first after splitting. For the specific splitting process of the first convolution image group and the second convolution image group, reference may be made to step S202 in the above-mentioned embodiment of the image processing method provided in this application, which will not be repeated here.
需要特别说明的是,在本实施例中步骤S402中的拆分是针对残差网络内部的拆分,即在残差网络内部的支路上进行了拆分。It should be particularly noted that the splitting in step S402 in this embodiment is for splitting inside the residual network, that is, splitting is performed on the internal branch of the residual network.
Step S403: perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain the first sub-convolution image group and the second sub-convolution image group;
For the specific implementation of step S403, reference may be made to FIG. 6, in which an m×m convolution kernel is used to convolve the first convolution image group and the second convolution image group. For the specific convolution process, reference may be made to step S203 in the embodiment of the image processing method provided above in this application, which is not repeated here.
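The per-branch convolution can be sketched as follows; m = 3 and the channel counts are illustrative assumptions, since this application does not fix a particular kernel size here:

```python
import torch
import torch.nn as nn

m = 3  # assumed m x m kernel size for the example
conv_first = nn.Conv2d(4, 4, kernel_size=m, padding=m // 2)   # branch for the first group
conv_second = nn.Conv2d(4, 4, kernel_size=m, padding=m // 2)  # branch for the second group

first_group = torch.randn(1, 4, 32, 32)
second_group = torch.randn(1, 4, 32, 32)
first_sub = conv_first(first_group)       # first sub-convolution image group
second_sub = conv_second(second_group)    # second sub-convolution image group
print(first_sub.shape, second_sub.shape)  # torch.Size([1, 4, 32, 32]) each
```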
Step S404: perform channel shuffling on the first sub-convolution image group and the second sub-convolution image group to obtain a shuffled feature image;
For the specific implementation of step S404, reference may be made to FIG. 6, in which the channels of the first sub-convolution image group and the second sub-convolution image group are interleaved with each other. For the specific channel shuffling process, reference may be made to step S204 in the embodiment of the image processing method provided above in this application, which is not repeated here.
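One common way to realize such interleaving, shown here only as a sketch (this application does not mandate this exact reshape-and-transpose formulation), is to splice the two groups along the channel axis and then rearrange the channels:

```python
import torch

def channel_shuffle(first_sub, second_sub):
    """Splice the two sub-convolution image groups along the channel axis,
    then interleave their channels by reshaping and transposing."""
    x = torch.cat([first_sub, second_sub], dim=1)  # spliced feature image set
    n, c, h, w = x.shape
    x = x.view(n, 2, c // 2, h, w)                 # view as 2 groups of c/2 channels
    x = x.transpose(1, 2).contiguous()             # alternate channels from the two groups
    return x.view(n, c, h, w)                      # shuffled feature image

out = channel_shuffle(torch.randn(1, 4, 32, 32), torch.randn(1, 4, 32, 32))
print(out.shape)  # torch.Size([1, 8, 32, 32])
```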
Step S405: determine the trained feature image of the feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image.
In the specific implementation of step S405, the image feature data of the input feature image and the image feature data of the shuffled feature image may be added element by element to determine the finally output feature image.
The above is a detailed description of the embodiment of the image training method based on a residual network provided by this application. Corresponding to the foregoing embodiment of the image training method based on a residual network, this application also discloses an embodiment of an image training apparatus based on a residual network; please refer to FIG. 7. Since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly, and for relevant details reference may be made to the description of the method embodiment. The apparatus embodiment described below is merely illustrative.
As shown in FIG. 7, FIG. 7 is a schematic structural diagram of an embodiment of an image training apparatus based on a residual network provided by this application. The apparatus includes:
a first convolution unit 701, configured to perform a convolution operation on an input feature image based on a residual network model architecture to obtain a convolution image set;
For the specific implementation of the first convolution unit 701, reference may be made to step S401 in the above embodiment of the image training method based on a residual network.
a splitting unit 702, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
For the specific implementation of the splitting unit 702, reference may be made to step S402 in the above embodiment of the image training method based on a residual network.
a second convolution unit 703, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain the first sub-convolution image group and the second sub-convolution image group;
For the specific implementation of the second convolution unit 703, reference may be made to step S403 in the above embodiment of the image training method based on a residual network.
a channel shuffling unit 704, configured to perform channel shuffling on the first sub-convolution image group and the second sub-convolution image group to obtain a shuffled feature image;
For the specific implementation of the channel shuffling unit 704, reference may be made to step S404 in the above embodiment of the image training method based on a residual network.
a determining unit 705, configured to determine the trained feature image of the feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image.
For the specific implementation of the determining unit 705, reference may be made to step S405 in the above embodiment of the image training method based on a residual network.
Based on the above, this application also provides a channel shuffling method for a neural network. For details, please refer to FIG. 8, which is a flowchart of an embodiment of a channel shuffling method for a neural network provided by this application. The channel shuffling method includes:
Step S801: perform a convolution operation on a feature image input into the neural network to obtain a convolution image set;
For the specific implementation of step S801, reference may be made to step S201 in the embodiment of the image processing method provided above in this application, which is not repeated here.
Step S802: split the convolution image set to obtain a first convolution image group and a second convolution image group;
For the specific implementation of step S802, reference may be made to step S202 in the embodiment of the image processing method provided above in this application, which is not repeated here.
Step S803: perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain the first sub-convolution image group and the second sub-convolution image group;
For the specific implementation of step S803, reference may be made to step S203 in the embodiment of the image processing method provided above in this application, which is not repeated here.
Step S804: perform channel shuffling on the first sub-convolution image group and the second sub-convolution image group.
For the specific implementation of step S804, reference may be made to step S204 in the embodiment of the image processing method provided above in this application, which is not repeated here.
Similarly, this application also provides a channel shuffling apparatus for a neural network. As shown in FIG. 9, the apparatus includes:
a first convolution unit 901, configured to perform a convolution operation on a feature image input into the neural network to obtain a convolution image set;
a splitting unit 902, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
a second convolution unit 903, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain the first sub-convolution image group and the second sub-convolution image group; and
a shuffling unit 904, configured to perform channel shuffling on the first sub-convolution image group and the second sub-convolution image group.
For the channel shuffling apparatus for a neural network provided by this application, reference may be made to the description in the embodiment of the image processing apparatus provided above in this application, which is not repeated here.
Based on the above, this application also provides a neural network architecture, including:
a first convolution layer, configured to perform a convolution operation on an input feature image to obtain a convolution image set;
a split layer, configured to split the convolution image set convolved by the first convolution layer to obtain a first convolution image group and a second convolution image group;
a second convolution layer, configured to perform convolution operations respectively on the first convolution image group and the second convolution image group obtained by the split layer, to obtain the first sub-convolution image group and the second sub-convolution image group;
a shuffle layer, configured to perform channel shuffling on the first sub-convolution image group and the second sub-convolution image group convolved by the second convolution layer, to obtain a shuffled feature image; and
a calculation layer, configured to calculate a feature image to be output according to the image feature data of the feature image input into the first convolution layer and the image feature data of the feature image shuffled by the shuffle layer.
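Putting the layers together, the following PyTorch module is one possible reading of this architecture. It is a hedged sketch only: the 1×1 first convolution, the 3×3 branch convolutions, and the channel counts are assumptions rather than values specified by this application.

```python
import torch
import torch.nn as nn

class SplitShuffleBlock(nn.Module):
    """Illustrative block: first convolution layer, split layer, per-branch
    second convolution layer, shuffle layer, and a calculation layer that
    adds the block input to the shuffled feature image."""
    def __init__(self, channels):
        super().__init__()
        self.first_conv = nn.Conv2d(channels, channels, kernel_size=1)
        half = channels // 2
        self.branch_a = nn.Conv2d(half, half, kernel_size=3, padding=1)
        self.branch_b = nn.Conv2d(half, half, kernel_size=3, padding=1)

    def forward(self, x):
        feats = self.first_conv(x)                   # convolution image set
        a, b = feats.chunk(2, dim=1)                 # split layer
        a, b = self.branch_a(a), self.branch_b(b)    # second convolution layer
        merged = torch.cat([a, b], dim=1)            # splice the sub-convolution groups
        n, c, h, w = merged.shape
        shuffled = (merged.view(n, 2, c // 2, h, w)  # shuffle layer: interleave channels
                    .transpose(1, 2).contiguous().view(n, c, h, w))
        return x + shuffled                          # calculation layer: element-wise addition

print(SplitShuffleBlock(8)(torch.randn(1, 8, 32, 32)).shape)  # torch.Size([1, 8, 32, 32])
```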
Based on the above, this application also provides a computer storage medium for storing data generated by a network platform and a program for processing the data generated by the network platform.
When the program is read and executed, it performs the steps of the image processing method described above, or the steps of the image training method based on a residual network described above, or the channel shuffling method for a neural network described above.
Based on the above, this application also provides an electronic device, including:
a processor; and
a memory, configured to store a program for processing data generated by a network platform, wherein when the program is read and executed by the processor, it performs the steps of the image processing method described above, or the steps of the image training method based on a residual network described above, or the channel shuffling method for a neural network described above.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent storage, random access memory (RAM) and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
Although this application is disclosed above with preferred embodiments, they are not intended to limit this application. Any person skilled in the art may make possible changes and modifications without departing from the spirit and scope of this application; therefore, the protection scope of this application shall be subject to the scope defined by the claims of this application.

Claims (21)

  1. An image processing method, characterized in that it comprises:
    performing a convolution operation on an acquired feature image to obtain a convolution image set;
    splitting the convolution image set to obtain a first convolution image group and a second convolution image group;
    performing convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first sub-convolution image group and a second sub-convolution image group; and
    performing channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group to obtain a shuffled feature image.
  2. The image processing method according to claim 1, wherein the splitting the convolution image set to obtain a first convolution image group and a second convolution image group comprises:
    splitting the convolution image set based on the number of channels in the convolution image set according to a set splitting requirement, to obtain the first convolution image group and the second convolution image group.
  3. The image processing method according to claim 2, wherein the splitting the convolution image set based on the number of channels in the convolution image set according to a set splitting requirement to obtain the first convolution image group and the second convolution image group comprises:
    if the number of channels is even, splitting the convolution image set, in channel order, into a first convolution image group and a second convolution image group having the same number of channels;
    if the number of channels is odd, taking half of the number of channels and rounding it to an integer; and
    in channel order, taking the convolution image corresponding to the channel that is adjacent to and greater than the integer as the last convolution image of the first convolution image group, and taking the convolution image that is adjacent to and greater than the channel of said last convolution image as the first convolution image of the second convolution image group.
  4. The image processing method according to claim 1, wherein the performing convolution operations on the first convolution image group and the second convolution image group to obtain the first sub-convolution image group and the second sub-convolution image group comprises:
    performing convolution operations on the first convolution image group and the second convolution image group respectively with preset convolution kernels to obtain the first sub-convolution image group and the second sub-convolution image group.
  5. The image processing method according to claim 4, wherein the performing convolution operations on the first convolution image group and the second convolution image group respectively with preset convolution kernels to obtain the first sub-convolution image group and the second sub-convolution image group comprises:
    performing a convolution operation on the first convolution image group with a preset first convolution kernel to obtain the first sub-convolution image group; and
    performing a convolution operation on the second convolution image group with a preset second convolution kernel to obtain the second sub-convolution image group.
  6. The image processing method according to claim 1, wherein the performing channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group to obtain a shuffled feature image comprises:
    splicing the first sub-convolution image group and the second sub-convolution image group to obtain a spliced feature image set; and
    performing channel shuffling on the spliced feature image set to obtain the shuffled feature image.
  7. The image processing method according to claim 6, wherein the splicing the first sub-convolution image group and the second sub-convolution image group to obtain a spliced feature image set comprises:
    placing, in channel order, the first convolution image of the second sub-convolution image group after the last convolution image of the first sub-convolution image group to obtain the spliced feature image set.
  8. The image processing method according to claim 6 or 7, wherein the performing channel shuffling on the spliced feature image set to obtain a shuffled feature image comprises:
    rearranging the spliced feature image set according to a set channel arrangement requirement to obtain the shuffled feature image.
  9. The image processing method according to claim 1, further comprising:
    determining an output feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image.
  10. The image processing method according to claim 9, wherein the determining an output feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image comprises:
    adding the image feature data of the feature image and the image feature data of the shuffled feature image element by element to determine the output feature image.
  11. An image processing apparatus, characterized in that it comprises:
    a first convolution unit, configured to perform a convolution operation on an acquired feature image to obtain a convolution image set;
    a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
    a second convolution unit, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first sub-convolution image group and a second sub-convolution image group; and
    a channel shuffling unit, configured to perform channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group to obtain a shuffled feature image.
  12. An image training method based on a residual network, characterized in that it comprises:
    performing a convolution operation on an input feature image based on a residual network model architecture to obtain a convolution image set;
    splitting the convolution image set to obtain a first convolution image group and a second convolution image group;
    performing convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first sub-convolution image group and a second sub-convolution image group;
    performing channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group to obtain a shuffled feature image; and
    determining a trained feature image of the feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image.
  13. An image training apparatus based on a residual network, characterized in that it comprises:
    a first convolution unit, configured to perform a convolution operation on an input feature image based on a residual network model architecture to obtain a convolution image set;
    a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
    a second convolution unit, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first sub-convolution image group and a second sub-convolution image group;
    a channel shuffling unit, configured to perform channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group to obtain a shuffled feature image; and
    a determining unit, configured to determine a trained feature image of the feature image according to the image feature data of the acquired feature image and the image feature data of the shuffled feature image.
  14. A channel shuffling method for a neural network, characterized in that it comprises:
    performing a convolution operation on a feature image input into the neural network to obtain a convolution image set;
    splitting the convolution image set to obtain a first convolution image group and a second convolution image group;
    performing convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first sub-convolution image group and a second sub-convolution image group; and
    performing channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group.
  15. The channel shuffling method for a neural network according to claim 14, wherein the performing channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group comprises:
    splicing the first sub-convolution image group and the second sub-convolution image group to obtain a spliced feature image set; and
    performing channel shuffling on the spliced feature image set.
  16. The channel shuffling method for a neural network according to claim 15, wherein the splicing the first sub-convolution image group and the second sub-convolution image group to obtain a spliced feature image set comprises:
    placing, in channel order, the first convolution image of the second sub-convolution image group after the last convolution image of the first sub-convolution image group to obtain the spliced feature image set.
  17. The channel shuffling method for a neural network according to claim 15 or 16, wherein the performing channel shuffling on the spliced feature image set comprises:
    rearranging the spliced feature image set according to a set channel arrangement requirement.
  18. A channel shuffling apparatus for a neural network, characterized in that it comprises:
    a first convolution unit, configured to perform a convolution operation on a feature image input into the neural network to obtain a convolution image set;
    a splitting unit, configured to split the convolution image set to obtain a first convolution image group and a second convolution image group;
    a second convolution unit, configured to perform convolution operations on the first convolution image group and the second convolution image group respectively to obtain a first sub-convolution image group and a second sub-convolution image group; and
    a shuffling unit, configured to perform channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group.
  19. A neural network architecture, characterized in that it comprises:
    a first convolution layer, configured to perform a convolution operation on an input feature image to obtain a convolution image set;
    a split layer, configured to split the convolution image set convolved by the first convolution layer to obtain a first convolution image group and a second convolution image group;
    a second convolution layer, configured to perform convolution operations respectively on the first convolution image group and the second convolution image group obtained by the split layer, to obtain a first sub-convolution image group and a second sub-convolution image group;
    a shuffle layer, configured to perform channel shuffling on the channel order of the first sub-convolution image group and the second sub-convolution image group convolved by the second convolution layer, to obtain a shuffled feature image; and
    a calculation layer, configured to calculate a feature image to be output according to the image feature data of the feature image input into the first convolution layer and the image feature data of the feature image shuffled by the shuffle layer.
  20. A computer storage medium for storing data generated by a network platform and a program for processing the data generated by the network platform,
    wherein when the program is read and executed, it performs the steps of the image processing method according to any one of claims 1 to 10, or performs the steps of the image training method based on a residual network according to claim 12, or performs the channel shuffling method according to claim 14.
  21. An electronic device, comprising:
    a processor; and
    a memory, configured to store a program for processing data generated by a network platform, wherein when the program is read and executed by the processor, it performs the steps of the image processing method according to any one of claims 1 to 10, or performs the steps of the image training method based on a residual network according to claim 12, or performs the channel shuffling method according to claim 14.
PCT/CN2020/133972 2019-12-06 2020-12-04 Methods and apparatuses for image processing, image training and channel shuffling WO2021110147A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911242229.5 2019-12-06
CN201911242229.5A CN112927174B (en) 2019-12-06 2019-12-06 Image processing, image training channel shuffling method and device

Publications (1)

Publication Number Publication Date
WO2021110147A1 true WO2021110147A1 (en) 2021-06-10

Family

ID=76161551

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/133972 WO2021110147A1 (en) 2019-12-06 2020-12-04 Methods and apparatuses for image processing, image training and channel shuffling

Country Status (2)

Country Link
CN (1) CN112927174B (en)
WO (1) WO2021110147A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115348432B (en) * 2022-08-15 2024-05-07 上海壁仞科技股份有限公司 Data processing method and device, image processing method, electronic equipment and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2017338783B2 (en) * 2016-10-04 2022-02-10 Magic Leap, Inc. Efficient data layouts for convolutional neural networks
CN109492514A (en) * 2018-08-28 2019-03-19 初速度(苏州)科技有限公司 A kind of method and system in one camera acquisition human eye sight direction
CN109409342A (en) * 2018-12-11 2019-03-01 北京万里红科技股份有限公司 A kind of living iris detection method based on light weight convolutional neural networks

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190294929A1 (en) * 2018-03-20 2019-09-26 The Regents Of The University Of Michigan Automatic Filter Pruning Technique For Convolutional Neural Networks
CN109145970A (en) * 2018-08-06 2019-01-04 北京市商汤科技开发有限公司 Question and answer treating method and apparatus, electronic equipment and storage medium based on image
CN110084274A (en) * 2019-03-29 2019-08-02 南京邮电大学 Realtime graphic semantic segmentation method and system, readable storage medium storing program for executing and terminal
CN110009565A (en) * 2019-04-04 2019-07-12 武汉大学 A kind of super-resolution image reconstruction method based on lightweight network
CN110309876A (en) * 2019-06-28 2019-10-08 腾讯科技(深圳)有限公司 Object detection method, device, computer readable storage medium and computer equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG XIANGYU; ZHOU XINYU; LIN MENGXIAO; SUN JIAN: "ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2018, pages 6848-6856, XP055818489, ISBN: 978-1-5386-6420-9, DOI: 10.1109/CVPR.2018.00716 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113393368A (en) * 2021-06-11 2021-09-14 安谋科技(中国)有限公司 Image processing method, medium, and electronic device based on neural network model
CN113393368B (en) * 2021-06-11 2023-08-18 安谋科技(中国)有限公司 Image processing method, medium and electronic equipment based on neural network model
CN113743582A (en) * 2021-08-06 2021-12-03 北京邮电大学 Novel channel shuffling method and device based on stack shuffling
CN113743582B (en) * 2021-08-06 2023-11-17 北京邮电大学 Novel channel shuffling method and device based on stack shuffling
CN115223018A (en) * 2022-06-08 2022-10-21 东北石油大学 Cooperative detection method and device for disguised object, electronic device and storage medium

Also Published As

Publication number Publication date
CN112927174A (en) 2021-06-08
CN112927174B (en) 2024-05-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20895717; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20895717; Country of ref document: EP; Kind code of ref document: A1)