CN116343136A - Road surface casting detection method based on expressway monitoring video - Google Patents

Road surface casting detection method based on expressway monitoring video

Info

Publication number
CN116343136A
CN116343136A (application CN202310177342.XA)
Authority
CN
China
Prior art keywords
model
casting
data set
image
background
Prior art date
Legal status
Pending
Application number
CN202310177342.XA
Other languages
Chinese (zh)
Inventor
孙健
Current Assignee
Jiangsu Ninghang Expressway Co ltd
Original Assignee
Jiangsu Ninghang Expressway Co ltd
Priority date
Filing date
Publication date
Application filed by Jiangsu Ninghang Expressway Co ltd filed Critical Jiangsu Ninghang Expressway Co ltd
Priority to CN202310177342.XA priority Critical patent/CN116343136A/en
Publication of CN116343136A publication Critical patent/CN116343136A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/54 Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements using pattern recognition or machine learning
    • G06V 10/764 Arrangements using classification, e.g. of video objects
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 Arrangements using neural networks
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems


Abstract

The invention provides a road surface casting-object detection method based on expressway surveillance video, belonging to the technical field of intelligent transportation, and comprising the following steps: acquiring a background image data set of expressway road sections and a casting-object image data set; fusing the casting-object images with the road background images to construct a road casting-object data set; performing VIBE background modeling to obtain the background and foreground; improving and optimizing the YOLOv5 network model; and classifying and detecting casting objects with the improved YOLOv5 model. Compared with acquiring an expressway casting-object data set by field survey, the data-set construction method of the invention greatly saves time and cost while improving safety. In addition, the improved YOLOv5 network can track targets efficiently in real time under the roadside viewing angle while improving detection accuracy.

Description

Road surface casting detection method based on expressway monitoring video
Technical Field
The invention belongs to the technical field of intelligent traffic, and particularly relates to a road surface casting object detection method based on a highway monitoring video.
Background
With the rapid development of China's economy, the country's expressway mileage keeps growing. Traffic volume and freight volume continue to rise, and because driving speeds on expressways are high, the accident rate on expressways keeps rising as well.
Congestion caused by traffic accidents results in billions of dollars of lost productivity, property damage and personal injury worldwide every year. Road castings are one of the major causes of disruption to normal transportation. Cargo spilled from trucks and refuse discarded at random by drivers are the main sources of expressway castings. These objects are small and hard for drivers to notice in time, so vehicles cannot avoid them, which leads to traffic accidents.
With the development of informatization, automatic detection of road castings has become a prerequisite for intelligent expressways to reduce the probability of the traffic accidents and congestion they cause.
At present, methods for detecting castings on expressways fall into two categories: traditional manual inspection and automatic detection. Manual inspection is inefficient and has low coverage, while video surveillance coverage of expressways keeps growing, so automatic detection of castings from surveillance video has become a novel and effective approach.
Disclosure of Invention
The invention provides a road surface casting detection method based on a highway monitoring video, and aims to solve the problem of highway casting detection.
The embodiment of the invention provides a road surface casting detection method based on a highway monitoring video, which comprises the following steps:
s1: a base data set is constructed. And acquiring a highway monitoring video, acquiring a road section background image data set, and downloading a corresponding casting image data set from the ImageNet.
S2: and (5) image fusion. And overlapping the center of the throwing object with the randomly selected pavement area in the background image, and then pasting the overlapping pavement area into the scene image to generate a composite image, so as to construct the highway scene throwing object data set.
S3: VIBE background modeling. The background and foreground are acquired.
S4: and constructing a neural network. And (5) building a neural network model based on the YOLOv5 network improvement.
S5: and (5) training an optimization model. Inputting the casting data set into a neural network model for training, and optimizing the model according to the training result to obtain the training weight and the classification result of the casting detection model.
S6: and detecting the casting matters. And detecting the casting object by using the trained deep learning network.
Preferably, since placing casting objects on an actual road to capture a large number of images would be dangerous and expensive, expressway casting-object images formed by fusing real background images with casting-object images are safe and efficient to obtain.
Preferably, because the expressway surveillance camera has a fixed viewing angle and the picture changes relatively little, a background modeling method is adopted; VIBE is an algorithm that models the background and detects the foreground.
Preferably, the neural network is built on the YOLOv5 model, with two improvements made to the YOLOv5 network. One is to reduce the size of the convolution kernel to 1×1. Earlier convolution layers can thus obtain smaller-scale kernels without increasing the number of parameters, and later convolution layers can build higher-level features on this basis, such as structural features at the level of edges, shapes and object types. The other improvement is to add connections between different convolution layers, such as the skip connections used in ResNet.
The beneficial effects of the invention are as follows:
1. according to the invention, road casting objects are detected and identified based on the expressway monitoring video, so that the labor cost is reduced, and the detection efficiency is improved.
2. According to the invention, background pictures obtained from expressway video surveillance are fused with an open-source casting-object image data set, so casting objects need not be placed manually on an actual road surface, which greatly saves labor cost and is safer and more efficient.
3. The invention makes two improvements to the YOLOv5 network: reducing the convolution kernel size to 1×1 and adding ResNet-style skip connections, which enhance the ability to observe object details and increase computational efficiency. Conventional data sets are captured from a frontal view, whereas expressway video surveillance looks at the road from a relatively low side angle with more complex brightness and shadows, making image details difficult for convolution layers to analyze; the convolution layers of the improved YOLOv5 network have a stronger ability to examine image details and can therefore analyze them well.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of a method for detecting road casting based on expressway monitoring video;
FIG. 2 shows the three update strategies of the VIBE model of the present invention;
FIG. 3 is a flow chart of the improved YOLOv5 network detection of the present invention;
FIG. 4 is a flowchart of an implementation of the method for detecting road castings based on the expressway monitoring video.
Detailed Description
In order to make the objects, technical solutions and advantages of the technical solutions of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of specific embodiments of the present invention. It should be noted that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be made by a person skilled in the art without creative efforts, based on the described embodiments of the present invention fall within the protection scope of the present invention.
The invention designs a road surface casting detection method based on a highway monitoring video, and a flow chart of the method is shown in figure 1 and comprises the following steps:
step 1: a base data set is constructed. And obtaining a road section background image based on the expressway monitoring video, and constructing an expressway scene data set. Downloading corresponding throwing object images from the ImageNet data set, supplementing samples aiming at the throwing object characteristics of the expressway, and constructing a throwing object data set covering ten categories, wherein the specific categories are as follows: boxes, cartons, papers, bottles, bags, roadblocks, stones, sand, plastic bags and wraps.
Step 2: Image fusion. The center of the casting object is aligned with a randomly selected road-surface point in the background image, and the object is pasted into the scene image to generate a composite image, thereby constructing the expressway scene casting-object data set. The casting-object image is resized so that the entire object falls within road-surface pixels. Finally, the composite images are inspected manually, those that do not match a natural scene are discarded, and the category, size and position of the casting object in each image are annotated.
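The fusion step can be sketched as follows. This is an illustrative NumPy-only sketch under the assumption that the object crop comes with a binary foreground mask and, centered at the chosen point, fits entirely inside the road region; the function and parameter names are hypothetical, not from the patent:

```python
import numpy as np

def paste_object(background, obj, mask, center):
    """Paste a casting-object crop onto a road background image.

    background: H x W x 3 road scene; obj: h x w x 3 object crop;
    mask: h x w boolean foreground mask of the object;
    center: (row, col) of the randomly chosen road-surface point.
    Assumes the crop, centered at `center`, lies fully inside the image.
    """
    out = background.copy()
    h, w = obj.shape[:2]
    top, left = center[0] - h // 2, center[1] - w // 2
    region = out[top:top + h, left:left + w]   # view into the copy
    region[mask] = obj[mask]                   # copy only object pixels
    return out
```

In the pipeline described above, the paste location would be drawn at random from road-surface pixels, and the resulting composites would still be inspected manually before annotation.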
Step 3: VIBE background modeling. First, a background model is initialized from input video frames free of moving vehicles. For each pixel (x, y), a background model M(x, y) containing N samples is built; the pixel together with its 8-neighborhood pixels forms a sampling space N_B(x, y), from which pixels are randomly selected to initialize the model. The initialization of M(x, y) is given by:
M(x, y) = {v_1(x, y), v_2(x, y), ..., v_N(x, y)}
and the initialization of M_B(x, y) by:
M_B(x, y) = {v_i(x, y) | (x, y) ∈ N_B(x, y)}
where v_i(x, y) is the i-th sample value in the sample set and N is the number of samples, set here to 20. Each pixel value in the current frame is then compared against the established background model to decide whether the pixel belongs to a foreground target. The VIBE model additionally has three different model update strategies, as shown in FIG. 2. VIBE has low complexity and a short initialization time, and it updates automatically to generate a new background promptly when the scene changes.
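As a hedged, single-pixel sketch of this test-and-update scheme: the constants below (N = 20, a matching radius R, a minimum match count and a random subsampling factor) follow commonly cited ViBe defaults and are illustrative, not values fixed by this patent:

```python
import random

N = 20            # samples per pixel model (set to 20 above)
R = 20            # matching radius in gray-level units (assumed default)
MIN_MATCHES = 2   # matches required for "background" (assumed default)
SUBSAMPLING = 16  # 1-in-16 chance of updating the model (assumed default)

def is_background(pixel, samples):
    """Background if at least MIN_MATCHES model samples lie within radius R."""
    matches = sum(abs(pixel - s) <= R for s in samples)
    return matches >= MIN_MATCHES

def update_model(pixel, samples, rng=random):
    """Conservative random update: occasionally overwrite a random sample
    of a background-classified pixel's model with its current value."""
    if rng.randrange(SUBSAMPLING) == 0:
        samples[rng.randrange(N)] = pixel
```

Pixels failing the background test are reported as foreground targets; the random in-place replacement is what lets the model absorb gradual background changes.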
Step 4: Construction of the deep learning network. Two improvements are made on the basis of the YOLOv5 network. One is to reduce the size of the convolution kernel to 1×1, as shown in FIG. 3. Earlier convolution layers can thus obtain smaller-scale kernels without increasing the number of parameters, and later convolution layers can build higher-level features on this basis, such as structural features at the level of edges, shapes and object types. The other improvement is to add connections between different convolution layers, such as the skip connections used in ResNet, so that a layer does not over-emphasize the output of the immediately preceding layer but instead draws on all previous outputs. Compared with the conventional YOLOv5 model, the information passed through the improved model is simpler, because convolution layers are relieved of the burden of holding data from upper layers. The network reasons globally over the complete image and all objects in it. The improved YOLOv5 model divides the input image into an S×S grid; a grid cell is responsible for identifying an object if any part of the object falls within it. Each grid cell predicts B bounding boxes together with confidence scores for those boxes. These confidence scores reflect how confident the model is that a box contains an object and how accurate it believes the box to be. Each predicted box has 5 parameters: x, y, w, h and the confidence score. The (x, y) coordinates give the position of the box center relative to the image boundary, and w and h give the predicted width and height relative to the size of the entire image. The confidence score is defined as the product of Pr(SPILLED LOADS) and the IOU, where the IOU (Intersection over Union) is the ratio of the intersection of the predicted box with a ground-truth box to their union.
The IOU is calculated as:
IOU = area(B_pred ∩ B_gt) / area(B_pred ∪ B_gt)
Pr(SPILLED LOADS) denotes the probability that the predicted box contains a casting object: Pr(SPILLED LOADS) = 1 if the box contains a casting object, otherwise Pr(SPILLED LOADS) = 0. The confidence score is thus set to IOU × Pr(SPILLED LOADS).
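The IOU definition above can be written directly in code. This sketch assumes axis-aligned boxes given as corner coordinates (x1, y1, x2, y2), a representation chosen here for illustration:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # overlap width/height, clamped at zero for disjoint boxes
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```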
At test time, the class-conditional probability is multiplied by the individual box confidence:
Pr(Class_i | SPILLED LOADS) × Pr(SPILLED LOADS) × IOU = Pr(Class_i) × IOU.
This yields a class-specific confidence score for each predicted box, representing both the probability that a casting object of that specific class appears in the box and how well the predicted box fits the casting object.
Each grid cell thus outputs B × (4 + 1) + C = B × 5 + C predictions, and the predictions for the whole image are encoded as an S × S × (B × 5 + C) tensor, where C is the number of casting categories.
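For concreteness, the grid-cell output layout described above can be sketched as a tensor. S and B below are illustrative values only; C = 10 matches the ten casting categories of Step 1:

```python
import numpy as np

S, B, C = 7, 2, 10   # grid size, boxes per cell, casting classes (illustrative)
# Each cell holds B boxes x (x, y, w, h, confidence) plus C class probabilities.
output = np.zeros((S, S, B * 5 + C), dtype=np.float32)
```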
A non-maximum suppression (NMS) algorithm is used to finalize the detections for each category separately. First, a confidence threshold is set, and predicted boxes whose confidence score falls below it are discarded. Then, within each category, the box with the highest confidence is selected, and the IOU between this box and each remaining box is computed; any remaining box whose IOU with the selected box exceeds the suppression threshold has its confidence set to 0. Finally, for each category, the boxes whose confidence remains non-zero are output as the identification results.
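A minimal greedy sketch of this per-category NMS procedure; the two thresholds are illustrative defaults rather than values specified by the patent, and boxes are corner-coordinate tuples:

```python
def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.25):
    """Greedy NMS for one category: returns indices of the boxes kept."""
    def iou(a, b):
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    # discard low-confidence boxes, then visit the rest best-first
    order = sorted((i for i, s in enumerate(scores) if s >= score_threshold),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # suppress boxes overlapping the kept box too strongly
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

In the full pipeline this would run once per casting category on that category's class-specific confidence scores.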
The convolution layers of the improved YOLOv5 model are trained on the synthetic data set to obtain the convolution layer parameters. The classification model uses the first 20 convolution layers of the YOLOv5 model. The developed model is then written to a cfg file, in which the convolution layer parameters are stored. The model is then optimized in combination with the non-maximum suppression (NMS) method. After multiple rounds of optimization, the accuracy is tested on the validation data set, and optimization continues until the accuracy no longer increases.
Step 5: Detection of casting objects. Using the YOLOv5 configuration file, the Python calling interface and the detection weight file produced by training, the pictures to be detected are input for target detection, yielding information such as the category, size, position and confidence of the casting object in each picture. The specific detection steps are shown in FIG. 4.
It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above; the embodiments and descriptions merely illustrate the principles of the invention, and various changes and modifications may be made without departing from its spirit and scope. The scope of the invention is defined by the appended claims and their equivalents.

Claims (7)

1. The road surface casting detection method based on the expressway monitoring video is characterized by comprising the following steps of:
S1: constructing a basic data set: acquiring an expressway monitoring video, acquiring a road-section background image data set, and downloading a corresponding casting-object image data set from ImageNet;
S2: fusing images: aligning the center of the casting object with a randomly selected road-surface point in the background image and pasting it into the scene image to generate a composite image, thereby constructing an expressway scene casting-object data set;
S3: VIBE background modeling: acquiring the background and the foreground;
S4: constructing a network: building a neural network model based on improvements to the YOLOv5 network;
S5: training and optimizing the model: inputting the casting-object data set into the neural network model for training, and optimizing the model according to the training results to obtain the training weights and classification results of the casting-object detection model;
S6: detecting casting objects: detecting casting objects using the trained deep learning network.
2. The method for detecting road surface casting based on expressway monitoring video according to claim 1, wherein the method comprises the steps of: the throwing object image of the road surface is formed by manually combining the expressway monitoring image and the throwing object image.
3. The method for detecting road surface casting based on expressway monitoring video according to claim 1, wherein the method comprises the steps of: the content of the data label comprises the category, the size and the position of the throwing object.
4. The method for detecting road surface casting based on expressway monitoring video according to claim 1, wherein the method comprises the steps of: the VIBE background modeling firstly initializes a background model of an input video frame without vehicle running, and then compares a pixel value in a current image with the established background model to distinguish whether the pixel point is a foreground target pixel point. The VIBE model has low complexity, short model initializing time and capability of automatically updating and generating a new background in time when the background changes.
5. The method for detecting road surface casting based on expressway monitoring video according to claim 1, wherein: the deep learning network is constructed with two improvements made on the basis of the YOLOv5 network; one is to reduce the size of the convolution kernel to 1×1, and the other is to add connections, such as skip connections, between different convolution layers.
6. The method for detecting road surface casting based on expressway monitoring video according to claim 1, wherein: the network performs global computation on the complete image and all objects in the image; the confidence score is expressed as the product of IOU and Pr(SPILLED LOADS); the IOU (Intersection over Union) is the ratio of the intersection of the predicted box with any ground-truth box to their union, calculated as:
IOU = area(B_pred ∩ B_gt) / area(B_pred ∪ B_gt)
Pr(SPILLED LOADS) represents the probability that the predicted box contains a casting object; Pr(SPILLED LOADS) = 1 if the predicted box contains a casting object, otherwise Pr(SPILLED LOADS) = 0.
7. The method for detecting road surface casting based on expressway monitoring video according to claim 1, wherein: the convolution layers of the improved YOLOv5 model are trained on the synthetic data set to obtain the convolution layer parameters; the classification model uses the first 20 convolution layers of the YOLOv5 model; the developed model is then written to a cfg file in which the convolution layer parameters are stored; the model is then optimized in combination with the non-maximum suppression (NMS) method; after multiple rounds of optimization, the accuracy is tested on the validation data set, and optimization continues until the accuracy no longer increases.
CN202310177342.XA 2023-02-24 2023-02-24 Road surface casting detection method based on expressway monitoring video Pending CN116343136A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310177342.XA CN116343136A (en) 2023-02-24 2023-02-24 Road surface casting detection method based on expressway monitoring video


Publications (1)

Publication Number Publication Date
CN116343136A true CN116343136A (en) 2023-06-27

Family

ID=86888589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310177342.XA Pending CN116343136A (en) 2023-02-24 2023-02-24 Road surface casting detection method based on expressway monitoring video

Country Status (1)

Country Link
CN (1) CN116343136A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116994216A (en) * 2023-09-27 2023-11-03 深圳市九洲卓能电气有限公司 Highway casting object detection method and system based on deep learning



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination