CN111091061B - Vehicle scratch detection method based on video analysis - Google Patents
- Publication number
- CN111091061B (application CN201911144384.3A)
- Authority
- CN
- China
- Prior art keywords
- vehicle
- vehicles
- passing
- area
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
Abstract
The invention provides a vehicle scratch detection method based on video analysis, specifically one built on a video semantic segmentation technique. Taking a semantic segmentation network and a tracking algorithm as its basis, the method detects vehicles in surveillance video, tracks them, and judges scratch events, solving the problems of low detection efficiency and high false-detection rate in the prior art.
Description
Technical Field
The invention relates to the technical field of video semantic segmentation, and in particular to a vehicle scratch detection method that applies video semantic segmentation to video analysis.
Background
With the development of the times and the advance of technology, the automobile has become a necessary travel tool for many families. However, as the number of vehicles grows, roadside parking spaces multiply and the usable width of the road narrows, creating a hidden risk that parked vehicles will be scratched by passing vehicles. Finding scratch incidents by manually reviewing surveillance video is time-consuming, labor-intensive, and costly. A method for vehicle tracking, scratch detection, and scratch-evidence preservation based on video analysis is therefore indispensable.
Currently, vehicle scratch detection commonly relies on a convolutional neural network for object detection to locate vehicles in video and judge scratches. This approach is efficient and makes detection flexible, but it frames the vehicle only with a rectangular bounding box, which cannot reflect the vehicle's contour information, so the false-detection rate of scratch judgment remains high. Moreover, the convolutional neural networks used in the prior art are outdated and inefficient, and are ill-suited to training on and processing large numbers of images.
Disclosure of Invention
To overcome these defects of the prior art, the invention provides a vehicle scratch detection method based on video analysis. Taking a semantic segmentation network and a tracking algorithm as its basis, the method detects vehicles in video, tracks them, and judges scratch events, solving the problems of low detection efficiency and high false-detection rate in the prior art.
The vehicle scratch detection method based on video analysis comprises the following steps:
Step 1: calibrate a parking area and a vehicle-passing area in the surveillance video with rectangular frames;
Step 2: define the parked-vehicle set S = {s_i | i = 1, 2, …, I} and the passing-vehicle set G = {g_j | j = 1, 2, …, J}, where s_i is the minimum bounding rectangle of the i-th vehicle in S, I is the number of vehicles in the parked-vehicle set, g_j is the minimum bounding rectangle of the j-th vehicle in G, and J is the number of vehicles in the passing-vehicle set; initially, S and G are both empty sets;
Step 3: obtain one frame of image from the surveillance-video buffer queue;
Step 4: detect vehicles in the passing area of the image from step 3 using a semantic segmentation method based on deep learning, and record the result as C = {c_k | k = 1, 2, …, K}, where c_k is the minimum bounding rectangle of the k-th detected vehicle and K is the number of detected vehicles. For each c_k the logic is: if c_k is completely contained in the parking area and satisfies formula (1) with every s_i, add c_k to S; otherwise, if c_k is not completely contained in the parking area and satisfies formula (2) with every g_j, add c_k to G;
wherein A represents an area calculation function;
Step 5: track and update each vehicle in the parked-vehicle set S from step 4; if a vehicle moves beyond the parking area, remove it from S;
Step 6: track and update each vehicle in the passing-vehicle set G from step 4; if a vehicle moves beyond the vehicle-passing area, remove it from G;
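Steps 5 and 6 can be sketched as follows, assuming axis-aligned rectangles stored as (x, y, w, h) tuples; the containment test and all names are illustrative assumptions, not part of the patent:

```python
# Minimal sketch of steps 5-6: prune tracked vehicles whose minimum
# bounding rectangle has left the calibrated region.

def inside(rect, region):
    """True if rect is fully contained in region; both are (x, y, w, h)."""
    x, y, w, h = rect
    rx, ry, rw, rh = region
    return rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh

def prune(tracked, region):
    """Keep only vehicles still inside the region (step 5 for S, step 6 for G)."""
    return [r for r in tracked if inside(r, region)]

parking_area = (0, 0, 100, 50)
S = [(10, 10, 20, 10), (90, 40, 20, 10)]   # second rect extends past the area
S = prune(S, parking_area)                 # -> only the first rect remains
```

The same `prune` call applies to G with the vehicle-passing area.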
Step 7: for any vehicle s_i in S and any vehicle g_j in G, judge whether a scratch occurs, as follows:
Step 7.1: if s_i and g_j do not intersect, judge that no scratch occurs between s_i and g_j; if they intersect, proceed to the judgment of step 7.2;
Step 7.2: define the boundary-point set of s_i as SP_i = {sp_il | l = 1, 2, …, L_i}, where sp_il is the l-th boundary point of s_i and L_i is the number of its boundary points; define the boundary-point set of g_j as GP_j = {gp_jt | t = 1, 2, …, M_j}, where gp_jt is the t-th boundary point of g_j and M_j is the number of its boundary points. If SP_i and GP_j share no element, judge that no scratch occurs between s_i and g_j; if they share an element, judge that s_i and g_j scratch, record the distance between their center points as d_ij, and save the frame image;
Step 7.3: repeat steps 7.1-7.2. Let d′_ij denote the center-point distance recorded from the previous frames; in a new frame, compare the current distance d_ij with d′_ij: if d_ij < d′_ij, update the recorded distance to d_ij and save the current frame image; if d_ij ≥ d′_ij, keep the previously recorded distance and the previously saved image.
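The step-7 judgment can be sketched as below, assuming rectangles as (x, y, w, h) tuples and boundary points as coordinate sets taken from the segmentation masks; all names are illustrative assumptions:

```python
import math

# Steps 7.1-7.2: no scratch unless the rectangles intersect AND the
# boundary-point sets share at least one element.

def rects_intersect(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def center(rect):
    x, y, w, h = rect
    return (x + w / 2, y + h / 2)

def judge_scratch(s_rect, g_rect, sp, gp):
    """Return (scratch?, center-point distance d_ij or None)."""
    if not rects_intersect(s_rect, g_rect):
        return False, None
    if not (set(sp) & set(gp)):          # shared boundary point => contact
        return False, None
    return True, math.dist(center(s_rect), center(g_rect))

# Step 7.3: across frames, keep the frame with the smallest center distance.
best = None
for frame_dist in [9.0, 4.0, 6.5]:       # d_ij measured in successive frames
    if best is None or frame_dist < best:
        best = frame_dist                # update d_ij and save this frame
```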
Compared with the prior art, the invention has the following beneficial effects:
The method takes semantic segmentation as the basis of vehicle detection, so it can describe the vehicle's contour information rather than a simple rectangular bounding box, improving the accuracy of scratch judgment; when saving information about a scratched vehicle, the optimal image is kept iteratively, ensuring a better visualization effect.
Drawings
Fig. 1 is a calibration image of positioning information of a parking detection area and a passing detection area in a surveillance video.
In the figure: 1-a parking detection area and 2-a vehicle passing detection area.
Detailed Description
The following describes in detail a specific implementation of the vehicle scratch detection method based on the video semantic segmentation technique, with reference to an embodiment.
The vehicle scratch detection method based on video analysis disclosed by the invention, i.e., a vehicle scratch detection method based on a video semantic segmentation technique, specifically comprises the following steps:
Step 1: calibrate a parking detection area 1 and a vehicle-passing detection area 2 in the surveillance video with rectangular frames; the calibration is shown in Fig. 1;
Step 2: define the parked-vehicle set S = {s_i | i = 1, 2, …, I} and the passing-vehicle set G = {g_j | j = 1, 2, …, J}, where s_i is the minimum bounding rectangle of the i-th vehicle in S, I is the number of elements in the parked-vehicle set, g_j is the minimum circumscribed rectangle of the j-th vehicle in G, and J is the number of elements in the passing-vehicle set; initially, S and G are both empty sets;
Step 3: obtain one frame of image from the surveillance-video buffer queue;
Step 4: detect vehicles in the passing area of the image from step 3 using a semantic segmentation method based on deep learning, and record the result as C = {c_k | k = 1, 2, …, K}, where c_k is the minimum bounding rectangle of the k-th detected vehicle and K is the number of detected vehicles. For each c_k the logic is: if c_k is completely contained in the parking area and satisfies formula (1) with every s_i, add c_k to S; otherwise, if c_k is not completely contained in the parking area and satisfies formula (2) with every g_j, add c_k to G;
wherein A represents an area calculation function;
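Formulas (1) and (2) are not reproduced in the text, so the sketch below assumes they are overlap-ratio tests based on the area function A that prevent adding a detection already covered by an existing set member; the threshold and all names are assumptions:

```python
# Sketch of the step-4 assignment of a detection c_k into S or G.
# A() is the area function; overlap thresholds are assumed, not stated.

def area(rect):
    _, _, w, h = rect
    return w * h

def overlap_area(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return w * h if w > 0 and h > 0 else 0

def contained(rect, region):
    x, y, w, h = rect
    rx, ry, rw, rh = region
    return rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh

def assign(c, S, G, parking_area, threshold=0.5):
    """Add c to S if inside the parking area and distinct from every s_i,
    otherwise to G if distinct from every g_j (assumed form of (1)/(2))."""
    if contained(c, parking_area):
        if all(overlap_area(c, s) / area(c) < threshold for s in S):
            S.append(c)
    elif all(overlap_area(c, g) / area(c) < threshold for g in G):
        G.append(c)

S, G = [], []
assign((10, 10, 20, 10), S, G, (0, 0, 100, 50))   # inside  -> parked set
assign((200, 10, 20, 10), S, G, (0, 0, 100, 50))  # outside -> passing set
```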
the vehicle detection method based on semantic segmentation specifically comprises the following steps:
Step 4.1: collect a large number of images from the surveillance viewpoint, annotate the contour information of the vehicles in the collected images to obtain a vehicle semantic segmentation data set, and divide the data set into a training set and a test set at a ratio of 3:1;
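The 3:1 split of step 4.1 might be sketched as follows; the file names and random seed are placeholders:

```python
import random

# Shuffle the annotated samples, then cut at 3/4 for a 3:1 train/test split.
samples = [f"frame_{i:04d}.png" for i in range(100)]
random.seed(0)
random.shuffle(samples)

split = (len(samples) * 3) // 4          # 3:1 ratio -> 75 train, 25 test
train_set, test_set = samples[:split], samples[split:]
```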
Step 4.2: input the divided training set into a first convolutional neural network for feature extraction, and input the extracted features into a second convolutional neural network to train the semantic-segmentation parameters; input the divided test set into the first and second convolutional neural networks, and compare the detection results produced by the trained parameters against the test-set annotations to optimize the parameters;
Step 4.3: as a preferred scheme of the invention, a 10-layer convolutional neural network is selected as the first convolutional neural network for feature extraction, consisting in order of a first convolutional layer, second through eighth bottleneck layers, a ninth convolutional layer, and a tenth pooling layer. The input dimension of the first convolutional neural network is 224 × 224 × 3 and its output dimension is 1 × 1 × 1280; the output dimensions of the layers are: first convolutional layer 112 × 112 × 32, second bottleneck layer 112 × 112 × 16, third bottleneck layer 56 × 56 × 24, fourth bottleneck layer 28 × 28 × 32, fifth bottleneck layer 14 × 14 × 64, sixth bottleneck layer 14 × 14 × 96, seventh bottleneck layer 7 × 7 × 160, eighth bottleneck layer 7 × 7 × 320, ninth convolutional layer 7 × 7 × 1280, and tenth pooling layer 1 × 1 × 1280. Unlike other nonlinear layers, the bottleneck layers retain all necessary information, so no information is lost when activation functions such as ReLU are used;
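The listed dimensions match a MobileNetV2-style feature extractor. The following sketch checks that the stated output sizes are consistent with the implied strides (the strides themselves are inferred from the sizes, not stated in the text):

```python
# Walk the spatial size from the 224x224x3 input through the inferred
# per-layer strides; channels are taken directly from step 4.3.
layers = [
    ("conv1",       2,   32),
    ("bottleneck2", 1,   16),
    ("bottleneck3", 2,   24),
    ("bottleneck4", 2,   32),
    ("bottleneck5", 2,   64),
    ("bottleneck6", 1,   96),
    ("bottleneck7", 2,  160),
    ("bottleneck8", 1,  320),
    ("conv9",       1, 1280),
]

size, shapes = 224, []
for name, stride, channels in layers:
    size //= stride
    shapes.append((name, size, size, channels))
shapes.append(("pool10", 1, 1, 1280))    # global pooling: 7x7 -> 1x1
```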
Step 4.4: as a preferred scheme of the invention, the second convolutional neural network adopts a symmetric encoder-decoder structure to achieve end-to-end pixel-level image segmentation. It uses the first convolutional neural network as a pre-trained network, uses three strided convolutional layers as a spatial path, and then fuses the output features of the two components through a feature fusion module to make the final prediction. At the feature-representation level the two components are not identical: the features output by the pre-trained first convolutional neural network mainly encode contextual information, while the spatial information captured by the spatial path encodes most of the detail, so the two kinds of features cannot simply be combined with fixed weights. To solve this, given features of different levels, the feature fusion module of the second convolutional neural network first concatenates the features output by the pre-trained network with those output by the spatial path, then balances the feature scales through batch normalization; next the concatenated features are pooled into a feature vector and a weight vector is computed, which re-weights the features, performing feature selection and combination. The second convolutional neural network supervises model training with auxiliary loss functions: the main loss function supervises the output of the whole second network, while two added auxiliary loss functions supervise the output of the first convolutional neural network, as in multi-level supervision; formula (3) balances the weights of the main and auxiliary loss functions, and both losses use formula (4):
where X and W are the weights of the main and auxiliary loss functions respectively, K is the total number of samples, l_p is the output function of the whole second convolutional neural network, l_i is the supervision function of the auxiliary loss, X_i is the weight of the main loss function for the i-th sample, W_i is the weight of the auxiliary loss function for the i-th sample, and α is a weight balancing the main and auxiliary loss functions.
where N is the number of input samples, y is the expected output of the second convolutional neural network, a is its actual output, and ln is the natural logarithm; when the loss falls below 0.001, training on the input images is complete.
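The description of formula (4) matches a standard binary cross-entropy loss; since the formula itself is not reproduced in the text, the sketch below assumes that standard form:

```python
import math

# Assumed form of formula (4): binary cross-entropy averaged over N samples,
# with y the expected output and a the actual network output.
def cross_entropy(expected, actual):
    n = len(expected)
    return -sum(y * math.log(a) + (1 - y) * math.log(1 - a)
                for y, a in zip(expected, actual)) / n

loss = cross_entropy([1.0, 0.0], [0.9, 0.1])
# per step 4.4, training would stop once loss < 0.001
```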
Step 4.5: input the training set and the test set into the convolutional neural networks to complete the parameter training for vehicle detection;
Step 5: track and update each vehicle in the parked-vehicle set S from step 4; if a vehicle moves beyond the parking area, remove it from S;
Step 6: track and update each vehicle in the passing-vehicle set G from step 4; if a vehicle moves beyond the vehicle-passing area, remove it from G;
Step 7: for any vehicle s_i in S and any vehicle g_j in G, judge whether a scratch occurs, as follows:
Step 7.1: if s_i and g_j do not intersect, judge that no scratch occurs between s_i and g_j; if they intersect, proceed to the judgment of step 7.2;
Step 7.2: define the boundary-point set of s_i as SP_i = {sp_il | l = 1, 2, …, L_i}, where sp_il is the l-th boundary point of s_i and L_i is the number of its boundary points; define the boundary-point set of g_j as GP_j = {gp_jt | t = 1, 2, …, M_j}, where gp_jt is the t-th boundary point of g_j and M_j is the number of its boundary points. If SP_i and GP_j share no element, judge that no scratch occurs between s_i and g_j; if they share an element, judge that s_i and g_j scratch, record the distance between their center points as d_ij, and save the frame image;
Step 7.3: repeat steps 7.1-7.2. Let d′_ij denote the center-point distance recorded from the previous frames; in a new frame, compare the current distance d_ij with d′_ij: if d_ij < d′_ij, update the recorded distance to d_ij and save the current frame image; if d_ij ≥ d′_ij, keep the previously recorded distance and the previously saved image.
The matters set forth in the embodiment merely illustrate the principles and effects of the invention and are not intended to limit it. Any person skilled in the relevant art may modify or change the above embodiment without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the invention.
Claims (1)
1. A vehicle scratch detection method based on video analysis, characterized by comprising the following steps:
Step 1: calibrate a parking area and a vehicle-passing area in the surveillance video with rectangular frames;
Step 2: define the parked-vehicle set S = {s_i | i = 1, 2, …, I} and the passing-vehicle set G = {g_j | j = 1, 2, …, J}, where s_i is the minimum bounding rectangle of the i-th vehicle in S, I is the number of vehicles in the parked-vehicle set, g_j is the minimum bounding rectangle of the j-th vehicle in G, and J is the number of vehicles in the passing-vehicle set; initially, S and G are both empty sets;
Step 3: obtain one frame of image from the surveillance-video buffer queue;
Step 4: detect vehicles in the passing area of the image from step 3 using a semantic segmentation method based on deep learning, and record the result as C = {c_k | k = 1, 2, …, K}, where c_k is the minimum bounding rectangle of the k-th detected vehicle and K is the number of detected vehicles. For each c_k the logic is: if c_k is completely contained in the parking area and satisfies formula (1) with every s_i, add c_k to S; otherwise, if c_k is not completely contained in the parking area and satisfies formula (2) with every g_j, add c_k to G;
wherein A represents an area calculation function;
Step 5: track and update each vehicle in the parked-vehicle set S from step 4; if a vehicle moves beyond the parking area, remove it from S;
Step 6: track and update each vehicle in the passing-vehicle set G from step 4; if a vehicle moves beyond the vehicle-passing area, remove it from G;
Step 7: for any vehicle s_i in S and any vehicle g_j in G, judge whether a scratch occurs, as follows:
Step 7.1: if s_i and g_j do not intersect, judge that no scratch occurs between s_i and g_j; if they intersect, proceed to the judgment of step 7.2;
Step 7.2: define the boundary-point set of s_i as SP_i = {sp_il | l = 1, 2, …, L_i}, where sp_il is the l-th boundary point of s_i and L_i is the number of its boundary points; define the boundary-point set of g_j as GP_j = {gp_jt | t = 1, 2, …, M_j}, where gp_jt is the t-th boundary point of g_j and M_j is the number of its boundary points. If SP_i and GP_j share no element, judge that no scratch occurs between s_i and g_j; if they share an element, judge that s_i and g_j scratch, record the distance between their center points as d_ij, and save the frame image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911144384.3A CN111091061B (en) | 2019-11-20 | 2019-11-20 | Vehicle scratch detection method based on video analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111091061A CN111091061A (en) | 2020-05-01 |
CN111091061B true CN111091061B (en) | 2022-02-15 |
Family
ID=70394035
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911144384.3A Active CN111091061B (en) | 2019-11-20 | 2019-11-20 | Vehicle scratch detection method based on video analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111091061B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111653103A (en) * | 2020-05-07 | 2020-09-11 | 浙江大华技术股份有限公司 | Target object identification method and device |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103226891B (en) * | 2013-03-26 | 2015-05-06 | 中山大学 | Video-based vehicle collision accident detection method and system |
CN107657237B (en) * | 2017-09-28 | 2020-03-31 | 东南大学 | Automobile collision detection method and system based on deep learning |
CN107945566A (en) * | 2017-11-17 | 2018-04-20 | 张慧 | Curb parking management system and method based on multiple target tracking and deep learning |
CN108898054B (en) * | 2018-05-24 | 2020-08-07 | 合肥工业大学 | Safety belt detection method based on semantic segmentation |
CN109190523B (en) * | 2018-08-17 | 2021-06-04 | 武汉大学 | Vehicle detection tracking early warning method based on vision |
CN109829403B (en) * | 2019-01-22 | 2020-10-16 | 淮阴工学院 | Vehicle anti-collision early warning method and system based on deep learning |
- 2019-11-20: CN application CN201911144384.3A filed; patent CN111091061B granted, status Active
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||