CN115482501A - Sprinkler identification method integrating data enhancement and target detection network - Google Patents
- Publication number: CN115482501A
- Application number: CN202211006018.3A
- Authority
- CN
- China
- Prior art keywords
- images
- network model
- identification
- training
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06N3/08—Computing arrangements based on biological models; Neural networks; Learning methods
- G06V10/16—Image acquisition using multiple overlapping images; Image stitching
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/776—Validation; Performance evaluation
- G06V10/82—Image or video recognition or understanding using neural networks
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
Abstract
The invention discloses a method for identifying road-sprinkled objects that fuses data enhancement with a target detection network, comprising the following steps: constructing a sprinkled-object image data model by extracting images containing sprinkled objects from historical multi-road-scene monitoring videos in a traffic environment; obtaining a manually labeled image data set and dividing it into a training set, a verification set and a test set in a certain proportion; preprocessing the images of the training set and the verification set with a data enhancement method, inputting the processed images into a YOLO recognition network for training, and obtaining a recognition network model; inputting the images of the test set into the recognition network model and evaluating its precision; and extracting image data from actual multi-road-scene monitoring video, preprocessing it, inputting the preprocessed images into a recognition network model that meets the precision requirement, and outputting the recognition result, thereby identifying road-sprinkled objects.
Description
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a method for recognizing a sprinkled object by fusing data enhancement and a target detection network.
Background
Existing monitoring-video-based techniques for identifying sprinkled objects mainly train a classifier on a large number of sprinkled-object image samples and classify by combining the spatio-temporal characteristics of the objects. A typical pipeline acquires a video frame sequence and extracts moving objects from it; processes the moving objects to obtain suspected sprinkled objects; acquires sprinkled-object sample data and clusters its pixels to obtain a clustering result; and then matches the pixel values of the suspected objects against the clustering result to confirm the sprinkled objects. The prior art has the following defects: the large number of image samples can only be collected and labeled manually, which is time-consuming, repetitive and tedious; the clustering algorithms of traditional classifiers are simple, and good classification precision is obtained only by further screening with the spatio-temporal information of the sprinkled objects; and in complex environments such as rain, snow, fog and night, sprinkled-object images are difficult to acquire, so an effective training database cannot be established and a good target recognition network cannot be trained from a conventional data set. It is therefore highly desirable to provide a sprinkled-object identification method that fuses data enhancement with a target detection network and identifies images in real time, thereby improving the accuracy of image identification.
Disclosure of Invention
Aiming at the lack of sprinkled-object image data in rain and fog scenes, the invention provides a method for identifying sprinkled objects that fuses a data enhancement network with a target detection network: a usable sprinkled-object detection network is trained from a small amount of training data, and sprinkled objects are detected in real-time road monitoring video, which reduces cost and improves image recognition precision.
In order to achieve the above object, the present invention provides a method for identifying a projectile by fusing a data enhancement network and a target detection network, comprising the following steps:
constructing a projectile image data model, extracting images with projectiles according to historical multi-road scene monitoring videos in a traffic environment, acquiring an artificially marked image data set by adopting an artificial marking method, and dividing the image data set into a training set, a verification set and a test set;
preprocessing the images of the training set and the verification set by a data enhancement method, inputting the preprocessed images into a YOLO recognition network for training, and acquiring a recognition network model;
and inputting the images of the test set into the identification network model and performing precision evaluation on it; then extracting image data from the actual multi-road scene monitoring video, preprocessing the image data, inputting the preprocessed images into the identification network model meeting the precision requirement, and outputting an identification result, thereby identifying the road-sprinkled objects.
Optionally, the manually labeled data set is divided into a training set, a validation set and a test set in a 7:2:1 ratio.
Optionally, the preprocessing is performed on the images in the training set and the verification set by using a data enhancement method, and the processed images are input to a YOLO recognition network for training, so as to obtain a recognition network model, which specifically includes:
preprocessing the training set images and the verification set images by a standardization processing and Mosaic data enhancement method, importing a YOLO recognition network for training, and updating network recognition parameters by loss propagation to obtain a recognition network model.
Optionally, the data enhancement method specifically includes:
and randomly selecting a plurality of images from the training set or the verification set for rotation and scaling, finally splicing to obtain a new image, and reserving the information of the labeling frame.
Optionally, the acquiring of the recognition network model specifically includes:
extracting a plurality of images from a monitoring video of a road spill event, manually labeling the spill, dividing a training set and a verification set, preprocessing, enhancing data, processing into 640 × 640 RGB images, importing the RGB images into a YOLO recognition network, updating network parameters through error propagation, and repeating an iterative training process to finally obtain a recognition network model.
Optionally, the accuracy evaluation of the identified network model specifically includes:
performing precision evaluation on the identification network model through the images of the test set, performing precision evaluation on the identification network model by taking mAP as an index, and if the mAP meets the requirement, taking the identification network model meeting the precision requirement as an identification network model for performing actual video test; and if the mAP does not meet the requirement, updating the sample set and the network initialization parameters, and retraining the identification network model.
Optionally, the projectile identification accuracy index mAP is greater than or equal to 90%.
The invention has the following technical effects. A Mosaic data enhancement algorithm processes the sprinkled-object image samples, creating a large number of usable training samples from a small amount of manually labeled image data collected in complex environments. The open-source target recognition algorithm YOLO achieves real-time identification of sprinkled objects in monitoring video images; the algorithm guarantees sufficient accuracy from image features alone, without combining the spatio-temporal information of vehicles, sprinkled objects and other objects. The Mosaic method increases the diversity of the sprinkled-object sample data set and improves the training efficiency of the recognition network when data samples are scarce, solving the problem that an effective database is difficult to establish owing to the lack of sprinkled-object images in complex scenes such as rain, fog, snow and night. Applying the YOLO recognition network to the sprinkled-object recognition task in monitoring video ensures good generalization performance and recognition accuracy.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a schematic flow chart of a method for identifying a projectile that incorporates a data enhancement and target detection network in accordance with an embodiment of the present invention;
fig. 2 is a schematic diagram of a data enhancement method according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
As shown in fig. 1-2, the present embodiment provides a method for identifying a projectile by fusing data enhancement and target detection networks, which includes the following steps:
constructing a projectile image data model, extracting images with projectiles according to historical multi-road scene monitoring videos in a traffic environment, acquiring an artificially marked image data set by adopting an artificial marking method, and dividing the image data set into a training set, a verification set and a test set according to a certain proportion;
preprocessing the images of the training set and the verification set by a data enhancement method, inputting the processed images into a YOLO recognition network for training, and acquiring a recognition network model;
and inputting the images of the test set into the identification network model and carrying out precision evaluation on it; then extracting image data from the actual multi-road scene monitoring video, preprocessing the image data, inputting the preprocessed images into the identification network model meeting the precision requirement, and outputting the identification result, thereby identifying the road-sprinkled objects.
A database building stage:
acquiring a multi-road scene monitoring video under an actual traffic environment, and converting the multi-road scene monitoring video into image data;
for the image with the projectile, the projectile is injected by using a manual labeling method, and a manually labeled data set is divided into a training set, a verification set and a test set according to the following steps of 1.
A training stage:
firstly, preprocessing images of a training set and a verification set by a standardization processing and Mosaic data enhancement method;
and importing images of the training set and the verification set into a YOLO recognition network for network training, and updating network recognition parameters through loss propagation to obtain a converged recognition network.
And (3) a testing stage:
and importing the test set image into the trained recognition network, and evaluating the recognition accuracy of the test set image.
The use stage is as follows:
extracting image data from the actual multi-road scene monitoring video and preprocessing the image data;
and inputting the processed image into an identification network with the accuracy meeting the use requirement, outputting an identification result, and finding the road sprinkled object in time.
In the training stage, a data enhancement method is adopted to increase image diversity, so that the recognition network can learn more feature information from the image data set, improving the recognition precision and generalization of the network. Referring to fig. 2, the Mosaic data enhancement mechanism is as follows: four images are randomly selected from the training set or the verification set, rotated, scaled and otherwise processed, and finally stitched into a new image, with the labeling-frame information retained so that the network can still obtain feature information from the new image.
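Retaining the labeling-frame information through rotation means the box coordinates must be remapped along with the pixels. The sketch below illustrates this for a 90° counterclockwise rotation; it is an illustrative reconstruction, not code from the patent, and the function name and (x1, y1, x2, y2) box format are assumptions.

```python
def rotate_box_90ccw(box, w, h):
    """Map an axis-aligned labeling box through a 90-degree counterclockwise
    image rotation, so the annotation survives the augmentation.

    box: (x1, y1, x2, y2) in an image of width w and height h.
    Returns the box in the rotated image (which has width h and height w).
    """
    x1, y1, x2, y2 = box
    # A pixel (x, y) maps to (y, w - 1 - x) after a 90-degree CCW rotation.
    corners = [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]
    rotated = [(y, w - 1 - x) for (x, y) in corners]
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    return (min(xs), min(ys), max(xs), max(ys))
```

The 180° and 270° cases follow by applying the same mapping two or three times.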
The image rotation, scaling and stitching processes are as follows.

Let the width and height of an image be $W$ and $H$ respectively, and let $(x, y)$ be the coordinates of a pixel of the image; after applying a transformation matrix $A$, the new coordinates are $(x', y')$.

The spatial transformation of a pixel is performed in homogeneous coordinates:

$$[x', y', 1]^{T} = A\,[x, y, 1]^{T}$$

For the rotation transformation, the matrix $A$ is as follows, where $\alpha$ is the angle of counterclockwise rotation about the midpoint $(W/2, H/2)$ of the image; in the present invention $\alpha$ is taken as 90°, 180° or 270°:

$$A = \begin{bmatrix} \cos\alpha & -\sin\alpha & (1-\cos\alpha)\,W/2 + \sin\alpha\,H/2 \\ \sin\alpha & \cos\alpha & (1-\cos\alpha)\,H/2 - \sin\alpha\,W/2 \\ 0 & 0 & 1 \end{bmatrix}$$

For the scaling transformation, the matrix $A$ is as follows, where $S$ is the factor by which the image is magnified or reduced; for pixel positions whose values must be filled after magnification, bilinear interpolation is used to obtain the value:

$$A = \begin{bmatrix} S & 0 & 0 \\ 0 & S & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

After magnification, given four known points $(x_1, y_1)$, $(x_1, y_2)$, $(x_2, y_1)$, $(x_2, y_2)$ with pixel values $Q_{11}$, $Q_{12}$, $Q_{21}$, $Q_{22}$ respectively, the pixel value of a fill point $P = (x, y)$ is calculated as:

$$f(P) = \frac{Q_{11}(x_2-x)(y_2-y) + Q_{21}(x-x_1)(y_2-y) + Q_{12}(x_2-x)(y-y_1) + Q_{22}(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$
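The bilinear interpolation of a fill point from its four known neighbours can be stated directly in code. A minimal sketch (the function name and argument order are our own):

```python
def bilinear(x, y, x1, y1, x2, y2, q11, q12, q21, q22):
    """Bilinearly interpolate the pixel value at fill point P = (x, y).

    q11, q12, q21, q22 are the known pixel values at (x1, y1), (x1, y2),
    (x2, y1) and (x2, y2) respectively.
    """
    denom = (x2 - x1) * (y2 - y1)
    # Each known value is weighted by the area of the rectangle opposite it.
    return (q11 * (x2 - x) * (y2 - y)
            + q21 * (x - x1) * (y2 - y)
            + q12 * (x2 - x) * (y - y1)
            + q22 * (x - x1) * (y - y1)) / denom
```

At the centre of the four points every weight is 1/4, so the result is the plain average of the four known values.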
the Mosaic data enhancement is to respectively intercept partial areas and splice the partial areas into a new image after 4 images are subjected to conversion processing, wherein the widths and heights of the intercepted partial images are 1/4W and 1/4H respectively; therefore, the stitched image still satisfies the size W × H.
In order to identify sprinkled objects in multi-road scenes, 2000 images are extracted from monitoring videos of road spill events and the sprinkled objects are manually labeled; a training set and a verification set are divided; the images are then preprocessed and data-enhanced into 640 × 640 RGB images and imported into the YOLO recognition network; network parameters are updated through error propagation, and the iterative training process is repeated until a loss-converged network model is obtained.
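The training setup can be made concrete as a dataset configuration for an off-the-shelf YOLO trainer. The fragment below is an illustrative sketch, not part of the patent: the directory layout, the single class name `spill`, and the Ultralytics-style config keys are all assumptions.

```yaml
# Hypothetical dataset config for an Ultralytics-style YOLO trainer.
path: datasets/road_spill   # assumed root directory
train: images/train         # preprocessed, Mosaic-augmented frames
val: images/val             # verification split
names:
  0: spill                  # single "sprinkled object" class
```

With such a config, a hedged example invocation would be along the lines of `yolo detect train data=road_spill.yaml model=yolov8n.pt imgsz=640 epochs=100`, after which the trainer reports mAP on the validation split.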
In the testing stage, the test set is used to evaluate the network model, with mAP as the accuracy index. If the mAP meets the requirement, the trained network hyperparameters can be used as the network model for actual video testing; if not, the sample set and the network initialization parameters are updated and the network is retrained. Specifically, the requirement is a sprinkled-object identification accuracy index mAP of at least 90%; in this example the mAP reaches 93%, the accuracy requirement is considered met, and the network hyperparameters can be used directly in the use stage.
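The accept-or-retrain decision of the testing stage is simple enough to state as code; the function name and return labels below are our own:

```python
def accept_model(map_score, threshold=0.90):
    """Decision rule from the testing stage: deploy the trained model if
    mAP meets the threshold, otherwise update the sample set and the
    network initialization parameters and retrain."""
    return "deploy" if map_score >= threshold else "retrain"
```

With the 93% mAP reported in this example, the rule accepts the model for the use stage.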
In the use stage, monitoring video from actual multi-road scenes is ingested; frames are extracted at intervals of 5-10 s as raw image data, compressed into 640 × 640 pixel images, and input into the recognition network to identify sprinkled-object events in real time. Objects identifiable by the network include helmets, dropped cloth, fallen rocks, plastic bags, and the like, and the identification result is output and displayed as "sprinkled object".
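Frame extraction at a fixed time interval reduces to choosing frame indices from the camera frame rate. A small sketch (the function name is assumed; actual frame decoding, e.g. with OpenCV's `VideoCapture`, and the 640 × 640 resize are left as a comment):

```python
def sample_frame_indices(total_frames, fps, interval_s=5):
    """Indices of the frames to extract when sampling one frame every
    interval_s seconds from a video with the given frame rate."""
    step = max(1, int(round(fps * interval_s)))
    return list(range(0, total_frames, step))
    # Each selected frame would then be read (e.g. via cv2.VideoCapture),
    # resized to 640 x 640, and fed to the recognition network.
```

For a 25 fps stream sampled every 5 s this yields one frame per 125, i.e. indices 0, 125, 250, and so on.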
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (7)
1. The method for identifying the sprinkled object by fusing the data enhancement and the target detection network is characterized by comprising the following steps of:
constructing a projectile image data model, extracting images with projectiles according to historical multi-road scene monitoring videos in a traffic environment, acquiring an artificially marked image data set by adopting an artificial marking method, and dividing the image data set into a training set, a verification set and a test set;
preprocessing the images of the training set and the verification set by a data enhancement method, inputting the preprocessed images into a YOLO recognition network for training, and acquiring a recognition network model;
and inputting the images of the test set into the identification network model, performing precision evaluation on the identification network model, extracting image data from the actual multi-road scene monitoring video, preprocessing the image data, inputting the preprocessed image into the identification network model meeting the precision requirement, outputting an identification result, and identifying the road sprinkled object.
2. The method for projectile recognition incorporating a data enhancement and target detection network according to claim 1, wherein the manually labeled data set is divided into a training set, a validation set and a test set in a 7:2:1 ratio.
3. The method for identifying a projectile fusing data enhancement and target detection networks as claimed in claim 1, wherein the data enhancement method is used to pre-process the images of the training set and the verification set, and the processed images are input into a YOLO identification network for training to obtain an identification network model, specifically comprising:
preprocessing the training set images and the verification set images by a standardized processing and Mosaic data enhancement method, importing a YOLO recognition network for training, and updating network recognition parameters by loss propagation to obtain a recognition network model.
4. The method for projectile identification incorporating a data enhancement and target detection network as claimed in claim 3 wherein said data enhancement method specifically comprises:
and randomly selecting a plurality of images from the training set or the verification set for rotation and scaling, finally splicing to obtain a new image, and reserving the information of the labeling frame.
5. The method of claim 3, wherein obtaining a recognition network model specifically comprises:
extracting a plurality of images from a monitoring video of a road spill event, manually labeling the spill, dividing a training set and a verification set, preprocessing, enhancing data, processing into 640 × 640 RGB images, importing the RGB images into a YOLO recognition network, updating network parameters through error propagation, and repeating an iterative training process to finally obtain a recognition network model.
6. The method for projectile identification incorporating a data enhancement and target detection network according to claim 1, wherein said accuracy assessment of the identification network model specifically comprises:
performing precision evaluation on the identification network model through the images of the test set, performing precision evaluation on the identification network model by taking mAP as an index, and if the mAP meets the requirement, taking the identification network model meeting the precision requirement as an identification network model for performing actual video test; and if the mAP does not meet the requirement, updating the sample set and the network initialization parameters, and retraining the identification network model.
7. The method of integrating data enhancement and object detection networks of claim 6, wherein the projectile identification accuracy index mAP is greater than or equal to 90%.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211006018.3A | 2022-08-22 | 2022-08-22 | Sprinkler identification method integrating data enhancement and target detection network |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN115482501A | 2022-12-16 |
Family
ID=84421680

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211006018.3A | Sprinkler identification method integrating data enhancement and target detection network | 2022-08-22 | 2022-08-22 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN115482501A (en) |
Cited By (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116503779A | 2023-04-26 | 2023-07-28 | 中国公路工程咨询集团有限公司 | Pavement casting object identification system and method |

- 2022-08-22: application CN202211006018.3A filed; CN115482501A active, status Pending
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |