CN115482501A - Sprinkler identification method integrating data enhancement and target detection network - Google Patents
- Publication number: CN115482501A
- Application number: CN202211006018.3A
- Authority
- CN
- China
- Prior art keywords
- images
- network model
- identification
- training
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06N3/08—Computing arrangements based on biological models; Neural networks; Learning methods
- G06V10/16—Image acquisition using multiple overlapping images; Image stitching
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/776—Validation; Performance evaluation
- G06V10/82—Image or video recognition or understanding using neural networks
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
Abstract
The invention discloses a method for identifying road-sprinkled objects that fuses data enhancement with a target detection network, comprising the following steps: constructing a sprinkled-object image data model by extracting images containing sprinkled objects from historical multi-road-scene monitoring videos in a traffic environment; obtaining a manually labeled image data set and dividing it into a training set, a verification set and a test set in a certain proportion; preprocessing the images of the training set and the verification set with a data enhancement method, inputting the processed images into a YOLO recognition network for training, and obtaining a recognition network model; inputting the images of the test set into the recognition network model and evaluating its precision; and extracting image data from actual multi-road-scene monitoring video, preprocessing it, inputting the preprocessed images into a recognition network model that meets the precision requirement, and outputting the recognition result, thereby identifying road-sprinkled objects.
Description
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a method for recognizing a sprinkled object by fusing data enhancement and a target detection network.
Background
Existing monitoring-video-based techniques for identifying sprinkled objects mainly train a classifier on a large number of sprinkled-object image samples and classify by combining the spatio-temporal characteristics of the objects. A typical pipeline acquires a video frame sequence and extracts moving objects from it; processes the moving objects to obtain suspected sprinkled objects; acquires sprinkled-object sample data and clusters its pixels to obtain a clustering result; and then matches the pixel values of the suspected objects against the clustering result to confirm the sprinkled objects. The prior art has the following defects: the large number of image samples can only be collected and labeled manually, which is time-consuming, repetitive and tedious; the clustering algorithms of traditional classifiers are simple, and good classification precision is obtained only by further screening with the spatio-temporal information of the sprinkled objects; and in complex environments such as rain, snow, fog and night, sprinkled-object images are difficult to acquire, so an effective training database cannot be established and a good target recognition network cannot be trained from a conventional data set. It is therefore highly desirable to provide a sprinkled-object identification method that fuses data enhancement with a target detection network and identifies images in real time, thereby improving the accuracy of image identification.
Disclosure of Invention
Aiming at the lack of sprinkled-object image data in rain and fog scenes, the invention provides a method for identifying sprinkled objects that fuses a data enhancement network with a target detection network: a usable sprinkled-object detection network is trained from a small amount of training data, and sprinkled objects are detected in real-time road monitoring video, which reduces cost and improves image recognition precision.
In order to achieve the above object, the present invention provides a method for identifying a projectile by fusing a data enhancement network and a target detection network, comprising the following steps:
constructing a projectile image data model, extracting images with projectiles according to historical multi-road scene monitoring videos in a traffic environment, acquiring an artificially marked image data set by adopting an artificial marking method, and dividing the image data set into a training set, a verification set and a test set;
preprocessing the images of the training set and the verification set by a data enhancement method, inputting the preprocessed images into a YOLO recognition network for training, and acquiring a recognition network model;
and inputting the images of the test set into the identification network model and performing precision evaluation on it; then extracting image data from the actual multi-road scene monitoring video, preprocessing the image data, inputting the preprocessed images into the identification network model meeting the precision requirement, and outputting an identification result, thereby identifying the road-sprinkled objects.
Optionally, the manually labeled data set is divided into a training set, a validation set and a test set in a 7:2:1 ratio.
Optionally, the preprocessing is performed on the images in the training set and the verification set by using a data enhancement method, and the processed images are input to a YOLO recognition network for training, so as to obtain a recognition network model, which specifically includes:
preprocessing the training set images and the verification set images by a standardization processing and Mosaic data enhancement method, importing a YOLO recognition network for training, and updating network recognition parameters by loss propagation to obtain a recognition network model.
Optionally, the data enhancement method specifically includes:
and randomly selecting a plurality of images from the training set or the verification set for rotation and scaling, finally splicing to obtain a new image, and reserving the information of the labeling frame.
Optionally, the acquiring of the recognition network model specifically includes:
extracting a plurality of images from a monitoring video of a road spill event, manually labeling the spill, dividing a training set and a verification set, preprocessing, enhancing data, processing into 640 × 640 RGB images, importing the RGB images into a YOLO recognition network, updating network parameters through error propagation, and repeating an iterative training process to finally obtain a recognition network model.
Optionally, the accuracy evaluation of the identified network model specifically includes:
performing precision evaluation on the identification network model through the images of the test set, performing precision evaluation on the identification network model by taking mAP as an index, and if the mAP meets the requirement, taking the identification network model meeting the precision requirement as an identification network model for performing actual video test; and if the mAP does not meet the requirement, updating the sample set and the network initialization parameters, and retraining the identification network model.
Optionally, the projectile identification accuracy index mAP is greater than or equal to 90%.
The invention has the following technical effects. A Mosaic data enhancement algorithm processes the sprinkled-object image samples, creating a large number of usable training samples from a small amount of manually labeled image data collected in complex environments. The open-source target recognition algorithm YOLO achieves real-time identification of sprinkled objects in monitoring video images; the algorithm guarantees sufficient accuracy from image features alone, without combining the spatio-temporal information of vehicles, sprinkled objects and other objects. The Mosaic method increases the diversity of the sprinkled-object sample data set and improves the training efficiency of the recognition network when data samples are scarce, solving the problem that an effective database is difficult to establish owing to the lack of sprinkled-object images in complex scenes such as rain, fog, snow and night. Applying the YOLO recognition network to the sprinkled-object recognition task in monitoring video ensures good generalization performance and recognition accuracy.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a schematic flow chart of a method for identifying a projectile that incorporates a data enhancement and target detection network in accordance with an embodiment of the present invention;
fig. 2 is a schematic diagram of a data enhancement method according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
As shown in fig. 1-2, the present embodiment provides a method for identifying a projectile by fusing data enhancement and target detection networks, which includes the following steps:
constructing a projectile image data model, extracting images with projectiles according to historical multi-road scene monitoring videos in a traffic environment, acquiring an artificially marked image data set by adopting an artificial marking method, and dividing the image data set into a training set, a verification set and a test set according to a certain proportion;
preprocessing the images of the training set and the verification set by a data enhancement method, inputting the processed images into a YOLO recognition network for training, and acquiring a recognition network model;
and inputting the images of the test set into the identification network model and carrying out precision evaluation on it; then extracting image data from the actual multi-road scene monitoring video, preprocessing the image data, inputting the preprocessed images into the identification network model meeting the precision requirement, and outputting the identification result, thereby identifying the road-sprinkled objects.
A database building stage:
acquiring a multi-road scene monitoring video under an actual traffic environment, and converting the multi-road scene monitoring video into image data;
for the image with the projectile, the projectile is injected by using a manual labeling method, and a manually labeled data set is divided into a training set, a verification set and a test set according to the following steps of 1.
A training stage:
firstly, preprocessing images of a training set and a verification set by a standardization processing and Mosaic data enhancement method;
and importing images of the training set and the verification set into a YOLO recognition network for network training, and updating network recognition parameters through loss propagation to obtain a converged recognition network.
And (3) a testing stage:
and importing the test set image into the trained recognition network, and evaluating the recognition accuracy of the test set image.
The use stage is as follows:
extracting image data from the actual multi-road scene monitoring video and preprocessing the image data;
and inputting the processed image into an identification network with the accuracy meeting the use requirement, outputting an identification result, and finding the road sprinkled object in time.
In the training stage, a data enhancement method is adopted to increase image diversity, so that the recognition network can learn more feature information from the image data set, improving the recognition precision and generalization of the network. Referring to fig. 2, the Mosaic data enhancement mechanism is as follows: four images are randomly selected from the training set or the verification set, rotated, scaled and otherwise processed, and finally stitched into a new image, with the labeling-frame information retained so that the network can still obtain feature information from the new image.
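Retaining the labeling-frame information through rotation means the box coordinates must be remapped along with the pixels. The sketch below illustrates this for a 90° counterclockwise rotation; it is an illustrative reconstruction, not code from the patent, and the function name and (x1, y1, x2, y2) box format are assumptions.

```python
def rotate_box_90ccw(box, w, h):
    """Map an axis-aligned labeling box through a 90-degree counterclockwise
    image rotation, so the annotation survives the augmentation.

    box: (x1, y1, x2, y2) in an image of width w and height h.
    Returns the box in the rotated image (which has width h and height w).
    """
    x1, y1, x2, y2 = box
    # A pixel (x, y) maps to (y, w - 1 - x) after a 90-degree CCW rotation.
    corners = [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]
    rotated = [(y, w - 1 - x) for (x, y) in corners]
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    return (min(xs), min(ys), max(xs), max(ys))
```

The 180° and 270° cases follow by applying the same mapping two or three times.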
The image rotation, scaling and stitching processes are as follows.

Let the width and height of an image be $W$ and $H$ respectively, and let $(x, y)$ be the coordinates of a pixel of the image; after applying a transformation matrix $A$, the new coordinates are $(x', y')$.

The spatial transformation of a pixel is performed in homogeneous coordinates:

$$[x', y', 1]^{T} = A\,[x, y, 1]^{T}$$

For the rotation transformation, the matrix $A$ is as follows, where $\alpha$ is the angle of counterclockwise rotation about the midpoint $(W/2, H/2)$ of the image; in the present invention $\alpha$ is taken as 90°, 180° or 270°:

$$A = \begin{bmatrix} \cos\alpha & -\sin\alpha & (1-\cos\alpha)\,W/2 + \sin\alpha\,H/2 \\ \sin\alpha & \cos\alpha & (1-\cos\alpha)\,H/2 - \sin\alpha\,W/2 \\ 0 & 0 & 1 \end{bmatrix}$$

For the scaling transformation, the matrix $A$ is as follows, where $S$ is the factor by which the image is magnified or reduced; for pixel positions whose values must be filled after magnification, bilinear interpolation is used to obtain the value:

$$A = \begin{bmatrix} S & 0 & 0 \\ 0 & S & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

After magnification, given four known points $(x_1, y_1)$, $(x_1, y_2)$, $(x_2, y_1)$, $(x_2, y_2)$ with pixel values $Q_{11}$, $Q_{12}$, $Q_{21}$, $Q_{22}$ respectively, the pixel value of a fill point $P = (x, y)$ is calculated as:

$$f(P) = \frac{Q_{11}(x_2-x)(y_2-y) + Q_{21}(x-x_1)(y_2-y) + Q_{12}(x_2-x)(y-y_1) + Q_{22}(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$
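The bilinear interpolation of a fill point from its four known neighbours can be stated directly in code. A minimal sketch (the function name and argument order are our own):

```python
def bilinear(x, y, x1, y1, x2, y2, q11, q12, q21, q22):
    """Bilinearly interpolate the pixel value at fill point P = (x, y).

    q11, q12, q21, q22 are the known pixel values at (x1, y1), (x1, y2),
    (x2, y1) and (x2, y2) respectively.
    """
    denom = (x2 - x1) * (y2 - y1)
    # Each known value is weighted by the area of the rectangle opposite it.
    return (q11 * (x2 - x) * (y2 - y)
            + q21 * (x - x1) * (y2 - y)
            + q12 * (x2 - x) * (y - y1)
            + q22 * (x - x1) * (y - y1)) / denom
```

At the centre of the four points every weight is 1/4, so the result is the plain average of the four known values.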
the Mosaic data enhancement is to respectively intercept partial areas and splice the partial areas into a new image after 4 images are subjected to conversion processing, wherein the widths and heights of the intercepted partial images are 1/4W and 1/4H respectively; therefore, the stitched image still satisfies the size W × H.
In order to identify sprinkled objects in multi-road scenes, 2000 images are extracted from monitoring videos of road spill events and the sprinkled objects are manually labeled; a training set and a verification set are divided; the images are then preprocessed and data-enhanced into 640 × 640 RGB images and imported into the YOLO recognition network; network parameters are updated through error propagation, and the iterative training process is repeated until a loss-converged network model is obtained.
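The training setup can be made concrete as a dataset configuration for an off-the-shelf YOLO trainer. The fragment below is an illustrative sketch, not part of the patent: the directory layout, the single class name `spill`, and the Ultralytics-style config keys are all assumptions.

```yaml
# Hypothetical dataset config for an Ultralytics-style YOLO trainer.
path: datasets/road_spill   # assumed root directory
train: images/train         # preprocessed, Mosaic-augmented frames
val: images/val             # verification split
names:
  0: spill                  # single "sprinkled object" class
```

With such a config, a hedged example invocation would be along the lines of `yolo detect train data=road_spill.yaml model=yolov8n.pt imgsz=640 epochs=100`, after which the trainer reports mAP on the validation split.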
In the testing stage, the test set is used to evaluate the network model, with mAP as the accuracy index. If the mAP meets the requirement, the trained network hyperparameters can be used as the network model for actual video testing; if not, the sample set and the network initialization parameters are updated and the network is retrained. Specifically, the requirement is a sprinkled-object identification accuracy index mAP of at least 90%; in this example the mAP reaches 93%, the accuracy requirement is considered met, and the network hyperparameters can be used directly in the use stage.
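The accept-or-retrain decision of the testing stage is simple enough to state as code; the function name and return labels below are our own:

```python
def accept_model(map_score, threshold=0.90):
    """Decision rule from the testing stage: deploy the trained model if
    mAP meets the threshold, otherwise update the sample set and the
    network initialization parameters and retrain."""
    return "deploy" if map_score >= threshold else "retrain"
```

With the 93% mAP reported in this example, the rule accepts the model for the use stage.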
In the use stage, monitoring video from actual multi-road scenes is ingested; frames are extracted at intervals of 5-10 s as raw image data, compressed into 640 × 640 pixel images, and input into the recognition network to identify sprinkled-object events in real time. Objects identifiable by the network include helmets, dropped cloth, fallen rocks, plastic bags, and the like, and the identification result is output and displayed as "sprinkled object".
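Frame extraction at a fixed time interval reduces to choosing frame indices from the camera frame rate. A small sketch (the function name is assumed; actual frame decoding, e.g. with OpenCV's `VideoCapture`, and the 640 × 640 resize are left as a comment):

```python
def sample_frame_indices(total_frames, fps, interval_s=5):
    """Indices of the frames to extract when sampling one frame every
    interval_s seconds from a video with the given frame rate."""
    step = max(1, int(round(fps * interval_s)))
    return list(range(0, total_frames, step))
    # Each selected frame would then be read (e.g. via cv2.VideoCapture),
    # resized to 640 x 640, and fed to the recognition network.
```

For a 25 fps stream sampled every 5 s this yields one frame per 125, i.e. indices 0, 125, 250, and so on.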
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (7)
1. The method for identifying the sprinkled object by fusing the data enhancement and the target detection network is characterized by comprising the following steps of:
constructing a projectile image data model, extracting images with projectiles according to historical multi-road scene monitoring videos in a traffic environment, acquiring an artificially marked image data set by adopting an artificial marking method, and dividing the image data set into a training set, a verification set and a test set;
preprocessing the images of the training set and the verification set by a data enhancement method, inputting the preprocessed images into a YOLO recognition network for training, and acquiring a recognition network model;
and inputting the images of the test set into the identification network model, performing precision evaluation on the identification network model, extracting image data from the actual multi-road scene monitoring video, preprocessing the image data, inputting the preprocessed image into the identification network model meeting the precision requirement, outputting an identification result, and identifying the road sprinkled object.
2. The method for projectile recognition incorporating a data enhancement and target detection network according to claim 1, wherein the manually labeled data set is divided into a training set, a validation set and a test set in a 7:2:1 ratio.
3. The method for identifying a projectile fusing data enhancement and target detection networks as claimed in claim 1, wherein the data enhancement method is used to pre-process the images of the training set and the verification set, and the processed images are input into a YOLO identification network for training to obtain an identification network model, specifically comprising:
preprocessing the training set images and the verification set images by a standardized processing and Mosaic data enhancement method, importing a YOLO recognition network for training, and updating network recognition parameters by loss propagation to obtain a recognition network model.
4. The method for projectile identification incorporating a data enhancement and target detection network as claimed in claim 3 wherein said data enhancement method specifically comprises:
and randomly selecting a plurality of images from the training set or the verification set for rotation and scaling, finally splicing to obtain a new image, and reserving the information of the labeling frame.
5. The method of claim 3, wherein obtaining a recognition network model specifically comprises:
extracting a plurality of images from a monitoring video of a road spill event, manually labeling the spill, dividing a training set and a verification set, preprocessing, enhancing data, processing into 640 × 640 RGB images, importing the RGB images into a YOLO recognition network, updating network parameters through error propagation, and repeating an iterative training process to finally obtain a recognition network model.
6. The method for projectile identification incorporating a data enhancement and target detection network according to claim 1, wherein said accuracy assessment of the identification network model specifically comprises:
performing precision evaluation on the identification network model through the images of the test set, performing precision evaluation on the identification network model by taking mAP as an index, and if the mAP meets the requirement, taking the identification network model meeting the precision requirement as an identification network model for performing actual video test; and if the mAP does not meet the requirement, updating the sample set and the network initialization parameters, and retraining the identification network model.
7. The method of integrating data enhancement and object detection networks of claim 6, wherein the projectile identification accuracy index mAP is greater than or equal to 90%.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211006018.3A | 2022-08-22 | 2022-08-22 | Sprinkler identification method integrating data enhancement and target detection network |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN115482501A | 2022-12-16 |
Family
ID=84421680

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211006018.3A | Sprinkler identification method integrating data enhancement and target detection network | 2022-08-22 | 2022-08-22 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN115482501A (en) |
Cited By (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116503779A | 2023-04-26 | 2023-07-28 | 中国公路工程咨询集团有限公司 | Pavement casting object identification system and method |

- 2022-08-22: application CN202211006018.3A filed; CN115482501A active, status Pending
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |