CN117036853A - Image enhancement and target detection accuracy rate improvement method based on joint training

Info

Publication number: CN117036853A
Application number: CN202310960966.9A
Authority: CN
Inventors: 王海文; 方亮; 朱言庆; 侯强
Original assignee: Zhiyang Innovation Technology Co Ltd
Current assignee: Zhiyang Innovation Technology Co Ltd
Priority date: 2023-08-01
Filing date: 2023-08-01
Publication date: 2023-11-10

Abstract

The invention discloses a method for improving image enhancement and target detection accuracy based on joint training. The method belongs to the technical field of artificial intelligence and is used for a target detection system comprising an image enhancement network and a target detection network. The method comprises the following steps: constructing an image degradation network and inserting it in front of the image enhancement network; adding an image size conversion module at the rear end of the image enhancement network to realize adaptive scaling of the image size during training; connecting the image enhancement network and the target detection network in series through a loss function; and constructing a training data set and jointly training the image enhancement network and the target detection network. In use, an image captured in a severe environment passes through the image enhancement network to yield an enhanced image, which is then input into the target detection network. This improves detection accuracy and solves the problem that target detection methods have difficulty detecting targets in severe environments.

Description

Image enhancement and target detection accuracy rate improvement method based on joint training

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to an image enhancement and target detection accuracy improvement method based on joint training.

Background

Object detection technology is widely applied in various industries. In a severe environment, however, an object detection model cannot reliably detect the objects in a scene and has to rely on image preprocessing to enhance image quality.

Existing image enhancement methods fall broadly into traditional methods and deep-learning-based methods. Traditional methods generalize poorly and cannot adapt to changing environments. Deep-learning-based methods generalize better, but the quality of the enhanced image depends on the training data set, and the enhanced image is not well matched to the target detection algorithm.

The invention patent CN114998605B discloses a target detection method for image enhancement guidance under severe imaging conditions, the invention patent application CN111611907A discloses an infrared target detection method for image enhancement, and the invention patent application CN114943869A discloses an airport target detection method for style migration enhancement.

As an image preprocessing method, image enhancement helps to some extent when a target detection algorithm is applied in a severe environment, but the existing methods have certain defects. In the method that guides target detection network training with an image enhancement network, the image enhancement network is trained first and then used as a supervisor to guide the target detection network; because the image enhancement network depends entirely on the feature extraction layer of the target detection network, it cannot achieve a good enhancement effect, and the degree of enhancement is relatively low. The image enhancement method based on image processing adapts poorly to the target detection network. In the style migration method, the data set is enhanced through style migration to obtain a new data set for training the target detection network, which can affect the generalization performance of the network to a certain extent.

Disclosure of Invention

The invention aims to solve the technical problem of providing an image enhancement and target detection accuracy improvement method based on joint training so as to solve the problem that a target detection method is difficult to detect a target in a severe environment.

In order to solve the technical problems, the invention provides the following technical scheme:

an image enhancement and target detection accuracy improvement method based on joint training is used for a target detection system, wherein the target detection system comprises an image enhancement network and a target detection network, and the method comprises the following steps:

constructing an image degradation network, and inserting the image degradation network into the front part of the image enhancement network;

an image size conversion module is added at the rear end of the image enhancement network and is used for realizing the self-adaptive scaling of the image size in the training process;

the image enhancement network and the target detection network are connected in series through a loss function;

and constructing a training data set, and carrying out joint training on the image enhancement network and the target detection network.

Further, the image degradation network is implemented by manually adding degradations and mixing them randomly, wherein the degradations comprise one or more of blurring, noise and color cast.

Further, the image enhancement network is an image enhancement network based on a generative adversarial network (GAN), the GAN-based image enhancement network comprises a generator and a discriminator, the generator adopts a U-Net structure, and the discriminator adopts a multi-scale feature extraction structure.

Furthermore, the image size conversion module is formed by interpolation plus multi-layer convolution stacking.

Further, the target detection network adopts a Yolov5 network.

Further, the loss function includes a loss function of the image enhancement network and a loss function of the target detection network, where the loss function of the image enhancement network is obtained by weighting and adding the adversarial loss, the perceptual loss and the MSE loss.

Further, the training data set comprises an image under a non-severe environment and a tag file containing target annotation frame information in the image, and the tag file accords with a format required by the target detection network training.

Further, the constructing a training data set, performing joint training on the image enhancement network and the target detection network, includes:

inputting an image in a non-severe environment into the image degradation network to obtain an image in a severe environment, inputting that image into the image enhancement network for enhancement, and then inputting the enhanced image into the target detection network through the image size conversion module, wherein all loss functions are updated in the same iteration.

Further, the weights of the image enhancement network and the target detection network after the joint training are respectively stored.

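Further, the constructing a training data set, performing joint training on the image enhancement network and the target detection network, includes:

when the training resources are insufficient, adopting an alternate training method, in which the target detection network is first trained independently to obtain the weight of the target detection network;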

training the image enhancement network by taking the weight of the target detection network as constraint, wherein the weight of the target detection network does not participate in parameter updating at the moment, so as to obtain the weight of the image enhancement network;

enhancing the training data set by using the trained image enhancement network;

training the target detection network again by using the enhanced image to obtain updated weights of the target detection network parameters;

repeating the steps until the image enhancement network and the target detection network reach convergence.

The invention has the following beneficial effects:

the invention relates to a combined training-based image enhancement and target detection accuracy improvement method, which is used for a target detection system, wherein the target detection system comprises an image enhancement network and a target detection network, an image degradation network is firstly constructed, the image degradation network is inserted into the front part of the image enhancement network, then an image size conversion module is added at the rear end of the image enhancement network and used for realizing the self-adaptive scaling of the image size in the training process, the image enhancement network and the target detection network are connected in series through a loss function, a training data set is finally constructed, and the combined training is carried out on the image enhancement network and the target detection network. Therefore, the method and the device perform joint training on the image enhancement network and the target detection network, the image enhancement network has good enhancement effect and high enhancement degree, and the suitability of the image enhancement network and the target detection network is good, the generalization performance of the network is not affected, and the problem that the target detection method is difficult to detect the target in a severe environment is solved. When the invention is used, the image to be detected is input into the target detection system, the enhanced image can be obtained after the image in the severe environment passes through the image enhancement network, and the enhanced image is input into the target detection network, so that the improvement of the detection accuracy can be realized.

Drawings

FIG. 1 is a schematic diagram of a target detection system employing a joint training-based image enhancement and target detection accuracy improvement method of the present invention;

FIG. 2 is a flow chart of the image enhancement and target detection accuracy improvement method based on joint training of the present invention;

FIG. 3 is a schematic diagram of target detection results before and after enhancement in the present invention, where (a) is the target detection result before enhancement and (b) is the target detection result after enhancement.

Detailed Description

In order to make the technical problems, technical solutions and advantages to be solved more apparent, the following detailed description will be given with reference to the accompanying drawings and specific embodiments.

The invention provides a combined training-based image enhancement and target detection accuracy improvement method, which is used for a target detection system, as shown in fig. 1-2, wherein the target detection system comprises an image enhancement network 2 and a target detection network 4, and the method comprises the following steps:

Step 10: constructing an image degradation network 1, and inserting the image degradation network 1 into the front part of the image enhancement network 2;

In this step, the image degradation network 1 is constructed. It can be implemented by manually adding degradations such as blurring, noise and color shift and mixing them randomly. In a specific implementation, the preferred degradations are Gaussian blur, downsampling followed by upsampling, random noise and manually applied color shift; the degradations are selected at random and combined at random with varying severity.
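The random mixing of degradations described above can be illustrated with a minimal sketch in plain Python. It rests on several assumptions that are not taken from the patent: images are nested lists of RGB values in [0, 1], a box filter stands in for the Gaussian blur, and all function names and parameter ranges (`box_blur`, `down_up`, `add_noise`, `color_shift`, `degrade`) are hypothetical.

```python
import random

def clamp(v):
    """Keep a pixel value inside [0, 1]."""
    return max(0.0, min(1.0, v))

def box_blur(img, radius=1):
    """Blur stand-in: mean over a (2r+1)^2 window (a box filter, not a true Gaussian)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            n = len(ys) * len(xs)
            row.append([sum(img[j][i][c] for j in ys for i in xs) / n for c in range(3)])
        out.append(row)
    return out

def down_up(img, factor=2):
    """Downsample by striding, then upsample back by nearest-neighbour repetition."""
    h, w = len(img), len(img[0])
    return [[list(img[(y // factor) * factor][(x // factor) * factor])
             for x in range(w)] for y in range(h)]

def add_noise(img, rng, sigma=0.05):
    """Add zero-mean Gaussian noise to every channel."""
    return [[[clamp(v + rng.gauss(0.0, sigma)) for v in px] for px in row] for row in img]

def color_shift(img, rng, max_shift=0.2):
    """Add one random offset per colour channel to the whole image."""
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return [[[clamp(v + s) for v, s in zip(px, shift)] for px in row] for row in img]

def degrade(img, rng):
    """Apply a random non-empty subset of the degradations, in random order
    and with randomly drawn severity (all ranges here are assumed values)."""
    ops = [
        lambda im: box_blur(im, radius=rng.choice([1, 2])),
        lambda im: down_up(im, factor=rng.choice([2, 4])),
        lambda im: add_noise(im, rng, sigma=rng.uniform(0.01, 0.1)),
        lambda im: color_shift(im, rng),
    ]
    for op in rng.sample(ops, k=rng.randint(1, len(ops))):
        img = op(img)
    return img
```

Calling `degrade(clean_image, random.Random(0))` returns an image of the same size, which is what allows the degraded image to be paired with the annotations of the clean image during training.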

The image enhancement network 2 can be any image enhancement network based on deep learning. The invention preferably adopts an image enhancement network 2 based on a generative adversarial network (Generative Adversarial Networks, GAN), which comprises a generator and a discriminator. The generator adopts a U-Net structure: features are extracted by multiple convolution layers connected in series, the image is reconstructed by deconvolution, and skip connections link the downsampling and upsampling paths to restore the information lost during downsampling. The discriminator adopts a multi-scale feature extraction structure: it extracts features with convolution kernels of different sizes and concatenates the extracted features so that images are discriminated over different receptive fields.

Step 20: an image size conversion module 3 is added at the rear end of the image enhancement network 2 and is used for realizing the self-adaptive scaling of the image size in the training process;

in this step, the image size conversion module 3 may be configured by interpolation plus multi-layer convolution stacking. In specific implementation, the image size conversion module 3 may first read the size of the original image, convert the output image of the image enhancement network 2 into the size of the original image by adopting an interpolation method, and then parameterize the upsampling process by adopting multi-layer convolution on the image with the converted size, so as to reduce the image information loss caused by the upsampling process.
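The interpolation-plus-convolution structure can be sketched as follows. This is an illustration under stated assumptions rather than the patented module: the image is a single-channel nested list, bilinear interpolation is one possible choice of interpolation method, and the fixed `IDENTITY` kernels are placeholders for convolution weights that would be learned during training. All names are hypothetical.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) with bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        # Map the output row to a fractional source row (corner-aligned).
        sy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bot * fy)
        out.append(row)
    return out

def conv3x3(img, kernel):
    """One 3x3 convolution layer (cross-correlation, as in deep learning
    frameworks) with edge replication, so the output keeps the input size."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            row.append(acc)
        out.append(row)
    return out

# Placeholder weights (assumption): a trained module would learn these kernels.
IDENTITY = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

def size_conversion(enhanced, orig_h, orig_w, kernels=(IDENTITY, IDENTITY)):
    """Interpolate to the original image size, then refine with stacked convolutions."""
    resized = bilinear_resize(enhanced, orig_h, orig_w)
    for k in kernels:
        resized = conv3x3(resized, k)
    return resized
```

For example, `size_conversion([[0.0, 1.0], [2.0, 3.0]], 3, 3)` returns a 3 x 3 image; with the identity placeholders the result equals the plain bilinear interpolation, and learned kernels would instead adjust the interpolated pixels.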

Step 30: connecting the image enhancement network 2 and the target detection network 4 in series through a loss function;

In this step, the target detection network 4 may be a target detection network based on deep learning; the present invention preferably adopts a Yolov5 network, which is single-stage and achieves high accuracy in practical target detection.

Preferably, the loss function comprises two parts, the loss function of the image enhancement network 2 and the loss function of the target detection network 4, which are weighted and added; the loss function of the image enhancement network 2 is obtained by weighting and adding the adversarial loss, the perceptual loss and the MSE loss.

Specifically, the loss function may be denoted $L$, the loss function of the image enhancement network 2 may be denoted $L_e$, and the loss function of the target detection network 4 (Yolov5 network) may be denoted $L_o$:

$$L = L_e + L_o$$

Loss function $L_e$ of the image enhancement network 2:

$$L_e = L_A + \lambda_1 L_p + \lambda_2 L_2$$

where $\lambda_1$ and $\lambda_2$ are the weights of the perceptual loss $L_p$ and the MSE loss $L_2$; in the invention, both weights may preferably be 0.5.
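The way the individual terms combine can be shown with a short sketch. The function names are hypothetical, and the scalar arguments stand for loss values that the networks would compute elsewhere; only the weighting and the default value 0.5 come from the text.

```python
def enhancement_loss(l_adv, l_perc, l_mse, lam1=0.5, lam2=0.5):
    """L_e = L_A + lambda_1 * L_p + lambda_2 * L_2."""
    return l_adv + lam1 * l_perc + lam2 * l_mse

def total_loss(l_adv, l_perc, l_mse, l_det, lam1=0.5, lam2=0.5):
    """L = L_e + L_o, where l_det is the detector loss L_o."""
    return enhancement_loss(l_adv, l_perc, l_mse, lam1, lam2) + l_det
```

With illustrative values `total_loss(1.0, 2.0, 4.0, 3.0)`, the enhancement part is 1.0 + 0.5 * 2.0 + 0.5 * 4.0 = 4.0 and the total is 7.0. Because the detector loss enters the same sum, its gradient also reaches the enhancement network during joint training.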

Adversarial loss $L_A$:

Where x is an input image, H (x) is a degraded image obtained by passing the input image x through a degradation module (image degradation network 1), G represents a generator, and D represents a discriminator.
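For orientation, one widely used adversarial objective that is consistent with these definitions is the standard GAN objective below. It is given as an assumption, since the exact expression used by the invention may differ:

```latex
L_A = \mathbb{E}_{x}\big[\log D(x)\big] + \mathbb{E}_{x}\big[\log\big(1 - D(G(H(x)))\big)\big]
```

In this form the discriminator $D$ is trained to score the clean image $x$ high and the generator output $G(H(x))$ low, while the generator $G$ is trained to make its output indistinguishable from $x$.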

Perceptual loss $L_p$:

where the feature extractor is the $j$-th layer of the Vgg-16 network, and $C_j H_j W_j$ denotes the size of the feature map of the $j$-th layer.
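A commonly used perceptual loss that matches these definitions is the feature reconstruction loss below. It is given as an assumption: the symbol $\phi_j$ for the output of the $j$-th Vgg-16 layer is introduced here for illustration, and the exact expression used by the invention may differ:

```latex
L_p = \frac{1}{C_j H_j W_j} \left\lVert \phi_j\big(G(H(x))\big) - \phi_j(x) \right\rVert_2^2
```

Dividing by $C_j H_j W_j$ normalizes the squared feature distance by the number of elements in the feature map, so the loss value does not depend on which layer is chosen.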

MSE loss $L_2$:

$$L_2 = (x - H(x))^2$$

Loss function $L_o$ of the target detection network 4: specifically, the loss function of the Yolov5 network may be employed.

Step 40: a training data set is constructed, and the image enhancement network 2 and the target detection network 4 are jointly trained.

In this step, the training data set should include images in a non-severe environment and a tag file containing information about the target annotation frames in each image, and the tag file should conform to the format required for training the target detection network 4. When the target detection network 4 is a Yolov5 network, the tag file may be a txt file that contains the category information and the annotation frame information of each target.
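As an illustration of such a tag file, Yolov5 label files store one target per line in the form `class x_center y_center width height`, with the four box values normalized to the range 0 to 1 by the image width and height. The sketch below parses this format into pixel coordinates; the function name `parse_yolo_label` and the sample values are hypothetical.

```python
def parse_yolo_label(text, img_w, img_h):
    """Parse Yolov5-style label text into (class, x_min, y_min, x_max, y_max) in pixels."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        cls = int(parts[0])
        xc, yc, w, h = (float(p) for p in parts[1:5])
        boxes.append((
            cls,
            (xc - w / 2) * img_w,
            (yc - h / 2) * img_h,
            (xc + w / 2) * img_w,
            (yc + h / 2) * img_h,
        ))
    return boxes
```

For a 640 x 480 image, the line `0 0.5 0.5 0.25 0.5` describes a target of class 0 whose frame spans x from 240 to 400 and y from 120 to 360. Because the degradation step keeps the image size, the same tag file remains valid for the degraded and enhanced versions of the image.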

As an optional embodiment, constructing a training data set and jointly training the image enhancement network 2 and the target detection network 4 (step 40) may include:

Step A1: an image in a non-severe environment is input into the image degradation network 1 to obtain an image in a severe environment; that image is input into the image enhancement network 2 for enhancement, and the enhanced image is input into the target detection network 4 through the image size conversion module 3; all loss functions are updated in the same iteration.

Thus, the combined training effect can be better ensured. In addition, the weights of the image enhancement network 2 and the weights of the target detection network 4 after the joint training may be respectively stored, that is, after the loss function L converges, the weights of the image enhancement network 2 and the weights of the target detection network 4 are respectively stored, so that the subsequent use is convenient.
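The idea of updating all losses in the same iteration can be shown with a deliberately small toy model. Everything in it is an assumption made for illustration: the two networks are reduced to scalar gains `a` (enhancement) and `b` (detection), the degradation is a fixed darkening by 0.5, squared errors replace the real loss terms, and the gradients are written out by hand instead of using a deep learning framework.

```python
def joint_train(xs, targets, steps=200, lr=0.01):
    """Toy joint training: both parameters are updated from L = L_e + L_o
    within the same iteration."""
    a, b = 1.0, 1.0  # a: enhancer gain, b: detector gain (toy scalar "networks")
    history = []
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = loss = 0.0
        for x, t in zip(xs, targets):
            d = 0.5 * x   # degraded input (fixed darkening, stands in for H(x))
            e = a * d     # enhancement network output
            s = b * e     # detection network output
            loss += ((e - x) ** 2 + (s - t) ** 2) / n
            # The detector loss also contributes to the enhancer gradient.
            grad_a += (2 * (e - x) * d + 2 * (s - t) * b * d) / n
            grad_b += (2 * (s - t) * e) / n
        a -= lr * grad_a  # both updates happen in the same iteration
        b -= lr * grad_b
        history.append(loss)
    return a, b, history

# Weights are returned separately, mirroring the separate storage described above.
enhancer_weight, detector_weight, losses = joint_train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The term `2 * (s - t) * b * d` in `grad_a` is the point of the sketch: the detection error flows back into the enhancement parameter, so the enhancement is shaped by what helps detection and not only by image fidelity.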

As another optional embodiment, constructing a training data set and jointly training the image enhancement network 2 and the target detection network 4 (step 40) may include:

Step B1: when the training resources are insufficient, an alternate training method is adopted, and the target detection network 4 is first trained independently to obtain the weight of the target detection network 4;

Step B2: the image enhancement network 2 is trained with the weight of the target detection network 4 as a constraint, the weight of the target detection network 4 not participating in parameter updating at this time, so as to obtain the weight of the image enhancement network 2;

Step B3: the training data set is enhanced by using the trained image enhancement network 2;

Step B4: the target detection network 4 is trained again with the enhanced images to obtain the updated weight of the target detection network 4;

Step B5: the above steps (i.e. steps B1-B4) are repeated until both the image enhancement network 2 and the target detection network 4 converge.

Thus, steps B1-B5 better ensure the joint training effect when training resources are insufficient. It will be appreciated that when training resources are sufficient, the entire network may be connected in series as described above, with the image enhancement network 2 and the target detection network 4 trained simultaneously.
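The alternate schedule of steps B1-B5 can be sketched with a toy model. The assumptions are loud ones: each network is a single scalar gain, the degradation is a fixed darkening by 0.5, squared errors replace the real losses, and the function names are hypothetical. In the loop, the retraining of step B4 serves as the detector training for the next round.

```python
def mean(vals):
    vals = list(vals)
    return sum(vals) / len(vals)

def train_detector(b, inputs, targets, steps, lr):
    """Train only the detector gain b on the given inputs."""
    for _ in range(steps):
        grad = mean(2 * (b * e - t) * e for e, t in zip(inputs, targets))
        b -= lr * grad
    return b

def train_enhancer(a, b_frozen, degraded, clean, targets, steps, lr):
    """Train only the enhancer gain a; the detector gain is frozen and acts
    as a constraint through the detection term of the loss."""
    for _ in range(steps):
        grad = mean(
            2 * (a * d - x) * d + 2 * (b_frozen * a * d - t) * b_frozen * d
            for d, x, t in zip(degraded, clean, targets)
        )
        a -= lr * grad
    return a

def alternate_train(clean, targets, rounds=5, steps=50, lr=0.01):
    degraded = [0.5 * x for x in clean]   # stands in for the degradation network
    a, b = 1.0, 1.0
    b = train_detector(b, clean, targets, steps, lr)                    # B1
    for _ in range(rounds):                                             # B5
        a = train_enhancer(a, b, degraded, clean, targets, steps, lr)   # B2
        enhanced = [a * d for d in degraded]                            # B3
        b = train_detector(b, enhanced, targets, steps, lr)             # B4
    return a, b
```

Only one set of parameters receives gradients at any time, which is why this schedule needs less memory than training both networks in the same iteration.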

After steps 10-40, the jointly trained target detection system is obtained. It should be noted that during training the target detection system of the invention has the architecture shown in FIG. 1. After training is completed, the image degradation network 1 and the image size conversion module 3 may be removed so that, as in the prior art, only the image enhancement network 2 and the target detection network 4 connected in series are retained. In use, the image to be detected is input into the system; an image captured in a severe environment yields an enhanced image after passing through the image enhancement network 2, and the enhanced image is input into the target detection network 4, thereby improving detection accuracy. Schematic diagrams of the target detection results before and after enhancement are shown in FIG. 3, where (a) is the result before enhancement, in which no target is detected, and (b) is the result after enhancement, in which the target is accurately detected. The comparison of target detection accuracy is shown in Table 1 below.

TABLE 1 comparison of target detection results

Note: in the table, Precision is the precision rate, Recall is the recall rate, and mAP is the mean Average Precision.
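For reference, the metrics named in the note follow their usual definitions, which can be sketched as below. The counts and per-class values in the usage note are hypothetical and are not results from Table 1.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def mean_average_precision(ap_per_class):
    """mAP is the mean of the per-class Average Precision values."""
    return sum(ap_per_class) / len(ap_per_class)
```

With 8 true positives, 2 false positives and 4 missed targets, `precision_recall(8, 2, 4)` gives a precision of 0.8 and a recall of about 0.667; per-class AP values of 0.5 and 1.0 give an mAP of 0.75.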

In summary, the image enhancement and target detection accuracy improvement method based on the combined training is used for a target detection system, the target detection system comprises an image enhancement network and a target detection network, an image degradation network is firstly constructed, the image degradation network is inserted into the front part of the image enhancement network, then an image size conversion module is added at the rear end of the image enhancement network and used for realizing the self-adaptive scaling of the image size in the training process, the image enhancement network and the target detection network are connected in series through a loss function, finally a training data set is constructed, and the combined training is carried out on the image enhancement network and the target detection network. Therefore, the method and the device perform joint training on the image enhancement network and the target detection network, the image enhancement network has good enhancement effect and high enhancement degree, and the suitability of the image enhancement network and the target detection network is good, the generalization performance of the network is not affected, and the problem that the target detection method is difficult to detect the target in a severe environment is solved. When the invention is used, the image to be detected is input into the target detection system, the enhanced image can be obtained after the image in the severe environment passes through the image enhancement network, and the enhanced image is input into the target detection network, so that the improvement of the detection accuracy can be realized.

While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims

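1. An image enhancement and target detection accuracy improvement method based on joint training, used for a target detection system, wherein the target detection system comprises an image enhancement network and a target detection network, characterized in that the method comprises the following steps:

constructing an image degradation network, and inserting the image degradation network into the front part of the image enhancement network;

adding an image size conversion module at the rear end of the image enhancement network for realizing the self-adaptive scaling of the image size in the training process;

connecting the image enhancement network and the target detection network in series through a loss function;

and constructing a training data set, and carrying out joint training on the image enhancement network and the target detection network.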

2. The method of claim 1, wherein the image degradation network is implemented by means of random mixing by manually adding degradation, including one or more of blur, noise, color shift.

3. The method of claim 1, wherein the image enhancement network is an image enhancement network based on a generative adversarial network (GAN), the GAN-based image enhancement network comprising a generator employing a U-Net structure and a discriminator employing a multi-scale feature extraction structure.

4. The method of claim 1, wherein the image size transformation module is configured by interpolation plus multi-layer convolution stacking.

5. The method of claim 1, wherein the target detection network employs a Yolov5 network.

6. The method of claim 1, wherein the loss function comprises a loss function of the image enhancement network and a loss function of the target detection network, the loss function of the image enhancement network being obtained by weighting and adding the adversarial loss, the perceptual loss, and the MSE loss.

7. The method of claim 1, wherein the training dataset comprises images in non-harsh environments and a tag file containing information about target marking frames in the images, the tag file conforming to a format required for training of the target detection network.

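8. The method of claim 1, wherein constructing a training dataset, jointly training the image enhancement network and the target detection network, comprises:

inputting an image in a non-severe environment into the image degradation network to obtain an image in a severe environment, inputting that image into the image enhancement network for enhancement, and then inputting the enhanced image into the target detection network through the image size conversion module, wherein all loss functions are updated in the same iteration.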

9. The method of claim 1, wherein the weights of the jointly trained image enhancement network and the weights of the target detection network are stored separately.

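10. The method according to any of claims 1-9, wherein said constructing a training dataset, jointly training said image enhancement network and target detection network, comprises:

when the training resources are insufficient, adopting an alternate training method, in which the target detection network is first trained independently to obtain the weight of the target detection network;

training the image enhancement network by taking the weight of the target detection network as a constraint, wherein the weight of the target detection network does not participate in parameter updating at this time, so as to obtain the weight of the image enhancement network;

enhancing the training data set by using the trained image enhancement network;

training the target detection network again by using the enhanced images to obtain the updated weight of the target detection network;

repeating the above steps until the image enhancement network and the target detection network reach convergence.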