CN109902676B - Dynamic background-based violation detection algorithm - Google Patents


Info

Publication number
CN109902676B
CN109902676B
Authority
CN
China
Prior art keywords
detection
area
frames
network
frame
Prior art date
2019-01-12
Legal status
Active
Application number
CN201910029090.XA
Other languages
Chinese (zh)
Other versions
CN109902676A (en)
Inventor
郑雅羽
王济浩
寇喜超
冯宇
Current Assignee
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date
2019-01-12
Filing date
2019-01-12
Publication date
2021-04-09
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910029090.XA
Publication of CN109902676A
Application granted
Publication of CN109902676B
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a dynamic-background-based illegal parking detection algorithm, which comprises: screening the acquired images, retaining the effective images and labeling them; constructing an instance segmentation network and a correlated region proposal network in a neural network based on dynamic background convolution, and obtaining the model parameters of the best-performing network after training; inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas; and judging a vehicle to be illegally parked when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b. The method is also applicable to pictures acquired by non-fixed camera equipment, and can accurately segment vehicles, parkable areas and no-parking areas against sidewalk backgrounds with complex scenes; it predicts the regions where an object may appear from the relation between pictures shot one after another in time, and the model can be continuously updated iteratively to enhance its robustness; the detection success rate under occlusion is improved, and the model accuracy is greatly improved.

Description

Dynamic background-based violation detection algorithm
Technical Field
The present invention relates to the field of data recognition, data representation, record carriers and the handling of record carriers, and in particular to a violation detection algorithm based on a dynamic background.
Background
With the development of cities and the growth of residents' disposable income, the number of vehicles increases year by year and the pressure on road traffic rises; meanwhile, because the supply of parking spaces lags behind, the problem of "difficult parking" grows increasingly acute, and vehicle owners, for their own convenience, often exploit blind spots of road monitoring to park on sidewalks. Illegally parked vehicles obstruct the normal walking areas and tactile paving for pedestrians, who risk detouring into non-motorized lanes or even motor lanes. To curb this phenomenon, traffic departments regularly patrol the sidewalks on both sides of roads for illegally parked vehicles and collect images with mobile devices, but manual patrol and supervision of illegal parking are inefficient, and gathering photographic evidence is time-consuming.
Adopting intelligent image analysis for illegal parking detection is the future direction of urban traffic management; one such approach is the illegal-parking detection algorithm, a model obtained by training on acquired pictures of real scenes. Once trained, it can automatically detect, from the learned model parameters, whether a newly acquired image contains illegally parked vehicles, and can accurately locate and segment them in the image.
The paper "Rich features technologies for object detection and segmentation" introduces a Convolutional Neural Network (CNN) method into the object detection field, greatly improves the object detection effect, and changes the main research idea of the object detection field, so that the following series of articles include RCNN (Region-based probabilistic Neural Networks), Fast-RCNN and Fast-RCNN, and meanwhile, the author of the Fast-RCNN in 2017 proposed master-RCNN, which represents the highest level in the field.
Most existing illegal-parking detection algorithms presuppose that a vehicle has already been detected: for fixed road-monitoring cameras, the vehicle's specific position is derived from the video sequence and the vehicle detection information using frame differencing, background subtraction and the like. However, these methods are strongly affected by illumination and by the stability of the monitoring device, and they rely on background modeling of the detection scene; when the scene is complex or the vehicle appears deformed because of the shooting angle, vehicle detection degrades and the illegal-parking detection algorithm fails.
In recent years, with the development of deep learning, CNNs have also been introduced into road traffic detection; for example, the invention patent with publication number CN107609491A proposes a convolutional-neural-network-based algorithm for detecting illegally parked vehicles, but it applies deep learning only to vehicle recognition and cannot effectively locate and distinguish the illegally parked vehicles when an image contains several vehicles. The prior art therefore still has the following defects:
1) ordinary CNN image recognition cannot localize the detected targets;
2) for illegally parked vehicles, one wants to locate the specific offending vehicle directly in the acquired image and to extract its vehicle type and license plate information, which requires recognition and detection on high-resolution pictures; an ordinary CNN usually rescales its input, making it difficult to map precise locations back to the high-resolution size;
3) vehicles on sidewalks are densely parked and occlude one another, so category detection alone is not accurate enough, which demands precise segmentation of different vehicles of the same category;
4) the collected pictures have a temporal order and are related in content, whereas an ordinary object detection convolutional network handles each picture in isolation and does not use the associated information between successive images.
Disclosure of Invention
In order to solve the above problems, the invention provides an optimized violation detection algorithm based on a dynamic background that can be iterated continuously to increase robustness. C-Soft-NMS, a non-maximum suppression method improved from Soft-NMS, is fused into the instance segmentation network, improving the detection success rate under occlusion.
The invention adopts the technical scheme that an illegal parking detection algorithm based on a dynamic background comprises the following steps:
step 1: screening the collected images, retaining the effective images and labeling them;
step 2: constructing an instance segmentation network and a correlated region proposal network in a neural network based on dynamic background convolution, training them, and setting a loss function to obtain the model parameters of the best-performing network;
step 3: inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas;
step 4: detecting the overlap of each segmented vehicle with the parkable area and the no-parking area, and judging it to be an illegally parked vehicle when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b; 0 < a < 1 and 0 < b < 1.
Preferably, in step 1, after the acquired image is screened, preprocessing is performed, and an effective image is retained.
Preferably, in step 3, processing the effective image in the best-performing network comprises the following steps:
step 3.1: inputting the processed effective image into the residual network of the instance segmentation network, and outputting the residual feature map I_RES;
step 3.2: inputting the residual feature map I_RES into the region proposal network, removing proposal boxes whose confidence is lower than q, and taking the remaining proposal boxes W_j as the foreground proposal output, where W_j = I_RES · H_RPN, H_RPN denotes the proposal-generating operation, and 0 < q < 0.5;
step 3.3: if the number of processed image frames is greater than or equal to 2, inputting the detection results of the previous two frames into the correlated region proposal network, denoting the foreground proposals it outputs as W_r, and proceeding to the next step; otherwise proceeding directly to step 3.5;
step 3.4: inputting the overlapping anchor boxes of W_j and W_r into the improved Soft-NMS algorithm, removing the overlaps and outputting the foreground proposal boxes W_f;
step 3.5: pooling the foreground proposal boxes to the same size through region-of-interest pooling, and inputting them into the fully connected layer and the fully convolutional network;
step 3.6: outputting a mask classified at the pixel level.
Preferably, said step 3.3 comprises the steps of:
step 3.3.1: obtaining the existing vehicle detection boxes of the frame before the current frame and of the frame two before it, denoted B_{1,i} and B_{2,j} respectively, where i and j index the detection boxes;
step 3.3.2: inputting B_{1,i} and B_{2,j} into a weight-sharing convolutional network and extracting features;
step 3.3.3: pooling the extracted features to the same size through region-of-interest pooling;
step 3.3.4: performing a mutual convolution operation on the detection boxes, convolving each selected B_{1,i} with all detection boxes of B_2 to obtain a confidence queue S_j representing the similarity of B_{1,i} and B_{2,j}, removing the detection boxes whose confidence is smaller than p, where 0.6 < p ≤ 0.8, recording the detection box with the highest confidence, and letting its index be m;
step 3.3.5: determining the area A_m swept by the matched boxes of the previous frame and the frame two before it;
step 3.3.6: determining the search area A_s from the moving area A_m and the matched detection box B_{1,m} of the previous frame;
step 3.3.7: cropping the search area A_s from the original image, sending it together with the previous frame's matched box B_{1,m} corresponding to the search area into a weight-sharing convolutional neural network for feature extraction, and denoting the resulting feature maps A_f and B_{1,f};
step 3.3.8: performing a mutual convolution of the detection-box feature map B_{1,f} over the search-area feature map A_f to obtain a confidence map D, and selecting the region with the highest confidence as a proposed region to be added to the foreground proposal output of the region proposal network.
Preferably, said step 3.3.5 comprises the steps of:
step 3.3.5.1: converting the top-left and bottom-right corner coordinates of the two matched boxes into the corresponding centroid coordinates (x, y) and box width and height w and h;
step 3.3.5.2: obtaining the moving area from the relative positions of the centroids of the two boxes,
coor_mov = min(coor_i, coor_j) for the top-left corner, and coor_mov = max(coor_i, coor_j) for the bottom-right corner,
where coor denotes the coordinates (x, y) of a point, i denotes the matched box from two frames before the current one, and j denotes the matched box of the previous frame;
step 3.3.5.3: converting the coor_mov coordinates back into the corresponding centroid coordinates and box width and height, written (x_m, y_m) and w_m, h_m, to represent the moving area A_m.
Preferably, said step 3.3.6 comprises the steps of:
step 3.3.6.1: computing the displacements Δx and Δy of the centroids of the corresponding matched boxes in the two earlier frames;
step 3.3.6.2: adding the displacements Δx and Δy to the centroid coordinates of the moving area A_m to obtain the search area A_s.
Preferably, said step 3.4 comprises the steps of:
step 3.4.1: defining the overlapping anchor boxes of W_j and W_r as a set of detection boxes;
step 3.4.2: sorting the detection boxes in descending order of confidence, denoting the sorted queue L_1, and initializing an empty queue, denoted L_2;
step 3.4.3: checking whether queue L_1 is empty; if not, proceeding to the next step, otherwise proceeding to step 3.4.7;
step 3.4.4: denoting the detection box with the highest confidence in queue L_1 as W_max, computing the overlap of each remaining detection box with W_max, and updating the confidence of each detection box as
s_i ← s_i · sqrt(1 − iou(W_max, b_i)²) when iou(W_max, b_i) ≥ 0.5, with s_i left unchanged when iou(W_max, b_i) < 0.5,
where b_i denotes the detection box currently operated on, s_i its confidence, and iou the degree of overlap;
step 3.4.5: deleting from queue L_1 the detection boxes whose updated confidence is lower than 0.1;
step 3.4.6: putting W_max into the candidate queue L_2, deleting it from queue L_1, and returning to step 3.4.3;
step 3.4.7: returning queue L_2 as the final result.
Preferably, said step 3.5 comprises the steps of:
step 3.5.1: inputting the same-size foreground proposal boxes into the fully connected layer, and outputting the category of each foreground proposal box and its top-left and bottom-right corner coordinates, where y = X·x, y is an n×1 vector, n is the number of categories, X is an n×m matrix, x is an m×1 vector, and m is the dimension entering the fully connected layer;
step 3.5.2: inputting the coordinates into the fully convolutional network and outputting z = g(y), where g is a convolutional layer whose forward and backward derivative functions are exchanged.
Preferably, the step 4 comprises the steps of:
step 4.1: traversing all the segmented vehicle instances;
step 4.2: computing the overlaps IoU_violation and IoU_parkable of the vehicle's detection box with the no-parking-area detection box and the parkable-area detection box;
step 4.3: when IoU_violation > b and IoU_parkable < a, judging the vehicle instance to be an illegally parked vehicle and adding its index to the queue L_w, otherwise the vehicle is not illegally parked; proceeding to the next step;
step 4.4: if the traversal is not finished, returning to step 4.1, otherwise finishing the detection and returning the queue L_w.
The invention provides an optimized violation detection algorithm based on a dynamic background: the acquired images are screened, and the effective images are retained and labeled; an instance segmentation network and a correlated region proposal network are constructed in a neural network based on dynamic background convolution and trained to obtain the model parameters of the best-performing network; the effective images are input into this network to obtain the segmented vehicles, parkable areas and no-parking areas; the overlap of each segmented vehicle with the parkable and no-parking areas is detected, and a vehicle is judged to be illegally parked when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b.
The invention provides an effective aid to intelligent urban traffic management, with the following beneficial effects:
(1) it provides an effective algorithm for the illegal-parking phenomenon; pictures acquired by non-fixed camera equipment are also applicable, and vehicles, parkable areas and no-parking areas can be accurately segmented against sidewalk backgrounds with complex scenes;
(2) a correlated region proposal network is designed for the case of missed vehicle detections; it predicts the regions where objects may appear from the relation between temporally consecutive pictures, and the model can be continuously updated iteratively to enhance its robustness;
(3) C-Soft-NMS, a non-maximum suppression method improved from Soft-NMS, is fused into the instance segmentation network, improving the detection success rate under occlusion and greatly improving model accuracy.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the correlated region proposal network of the present invention;
FIG. 3 is a flow chart of the modified Soft-NMS algorithm of the present invention.
Detailed Description
The invention will be described in further detail below with reference to examples and the accompanying drawings, without limiting the scope of the invention thereto.
The invention relates to an illegal parking detection algorithm based on a dynamic background. Dynamic background modeling is generally used to eliminate the interference of a complex background with a specific moving object across consecutive video frames and thereby improve detection accuracy, but for a static specific target the traditional dynamic background method is seldom applied. Convolutional neural networks have developed from shallow networks with repeated elements (VGG) to improved deep residual networks (ResNet), which can extract picture information rich enough for different tasks; meanwhile, the region proposal network (RPN) was proposed for the needs of target localization.
The method comprises the following steps.
Step 1: screening the acquired images, retaining the effective images and labeling them.
In step 1, the collected images are screened and then preprocessed, and the effective images are retained.
In the invention, an effective image is one that contains a sidewalk; an image showing non-sidewalk elements such as intersections is not a valid image. Of course, in practice the criteria for an effective image may be adjusted to actual requirements based on the scope of the violation determination.
In the invention, labeling means marking the key positions, i.e. telling the algorithm in advance what standard it needs to reach.
In the present invention, preprocessing generally refers to resizing the image to be detected, for example scaling the short side to 800 with the long side not exceeding 1300; under certain conditions denoising is also required. Those skilled in the art can configure this as required.
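For illustration only, a minimal sketch of this resizing rule follows; Python with OpenCV is an assumption here, as the invention does not prescribe any implementation.

    import cv2  # assumed dependency; any image library with a resize routine works

    def preprocess(image, short_side=800, max_long_side=1300):
        # Scale the short side to `short_side`, but shrink further if the
        # long side would then exceed `max_long_side`.
        h, w = image.shape[:2]
        scale = short_side / min(h, w)
        if scale * max(h, w) > max_long_side:
            scale = max_long_side / max(h, w)
        return cv2.resize(image, (round(w * scale), round(h * scale)))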
Step 2: constructing an instance segmentation network and a correlated region proposal network in the neural network based on dynamic background convolution, training them, and setting a loss function to obtain the model parameters of the best-performing network.
In the invention, the instance segmentation network comprises a residual feature-extraction network (ResNet), a region proposal network (RPN), an improved non-maximum suppression method (C-Soft-NMS), a fully convolutional network (FCN) and a fully connected output layer; when necessary, corresponding loss functions can be set for the different tasks. The correlated region proposal network (CPN) comprises search region restriction (SRR) and a mutual convolution network (CCN); it proposes foreground boxes to assist the training of the instance segmentation network, and a similarity-comparison loss function can be set when needed. Specifically, the method segments targets with the instance segmentation network and refines precision with the correlated region proposal network, so the output result is more accurate.
In the present invention, the convolutional kernel size of the residual extraction network (ResNet) may be set to a × a, where a is 1 or 3; the kernel size of the region proposal network (RPN) may be set to b × b, where b is 3, 5 or 7; the fully convolutional network uses transposed convolution, and its kernel can be set to 2 × 2. Those skilled in the art can set the kernel sizes according to actual requirements, and the optimal network is obtained through training.
In the invention, the model parameters of the best-performing network are the parameters of the convolution kernels that minimize the loss during iterative training; the cross-entropy loss can be used to compute gradients and update the model parameters, which is routine for those skilled in the art and can be chosen and configured as required.
Step 3: inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas.
In step 3, processing the effective image in the best-performing network comprises the following steps.
Step 3.1: inputting the processed effective image into the residual network of the instance segmentation network, and outputting the residual feature map I_RES.
Step 3.2: inputting the residual feature map I_RES into the region proposal network, removing proposal boxes whose confidence is lower than q, and taking the remaining proposal boxes W_j as the foreground proposal output, where W_j = I_RES · H_RPN, H_RPN denotes the proposal-generating operation, and 0 < q < 0.5.
In the present invention, q may be 0.3 in general.
Step 3.3: if the number of processed image frames is greater than or equal to 2, inputting the detection results of the previous two frames into the correlated region proposal network, denoting the foreground proposals it outputs as W_r, and proceeding to the next step; otherwise proceeding directly to step 3.5.
Said step 3.3 comprises the following steps.
Step 3.3.1: obtaining the existing vehicle detection boxes of the frame before the current frame and of the frame two before it, denoted B_{1,i} and B_{2,j} respectively, where i and j index the detection boxes.
In the present invention, B denotes the box of a detected vehicle, the subscript 1 denotes the previous frame, and 2 denotes the frame two before the current one.
Step 3.3.2: inputting B_{1,i} and B_{2,j} into a weight-sharing convolutional network and extracting features.
In the invention, a VGG network is selected for the feature extraction in step 3.3.2.
Step 3.3.3: pooling the extracted features to the same size through region-of-interest pooling; in this example 17 × 17.
Step 3.3.4: performing a mutual convolution operation on the detection boxes, convolving each selected B_{1,i} with all detection boxes of B_2 to obtain a confidence queue S_j representing the similarity of B_{1,i} and B_{2,j}, eliminating the detection boxes whose confidence is smaller than p, where 0.6 < p ≤ 0.8, recording the detection box with the highest confidence, and letting its index be m.
In the present invention, convolving B_{1,i} with all detection boxes of B_2 means operating on all the vehicle detection boxes of B_1 and B_2. In general, p may take 0.8, and the index letter m refers to "match".
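For illustration, a minimal sketch of this mutual convolution matching, assuming Python with PyTorch, 17 × 17 pooled features as in step 3.3.3, and a sigmoid squashing of the raw correlations into confidences (the exact scoring is not spelled out in the text):

    import torch
    import torch.nn.functional as F

    def match_confidences(feat_b1i, feats_b2):
        # feat_b1i: pooled feature of one box from the previous frame, (C, k, k).
        # feats_b2: pooled features of all J boxes from two frames before, (J, C, k, k).
        kernel = feat_b1i.unsqueeze(0)                 # (1, C, k, k)
        raw = F.conv2d(feats_b2, kernel).flatten()     # (J,) full-overlap correlations
        return torch.sigmoid(raw)                      # squash to (0, 1): the queue S_j

    # Usage: drop candidates below p and take the best-matching index m.
    S = match_confidences(torch.randn(256, 17, 17), torch.randn(5, 256, 17, 17))
    p = 0.8
    m = int(S.argmax()) if bool((S >= p).any()) else None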
Step 3.3.5: determining the area A_m swept by the matched boxes of the previous frame and the frame two before it.
Said step 3.3.5 comprises the following steps.
Step 3.3.5.1: converting the top-left and bottom-right corner coordinates of the two matched boxes into the corresponding centroid coordinates (x, y) and box width and height w and h.
Step 3.3.5.2: obtaining the moving area from the relative positions of the centroids of the two boxes,
coor_mov = min(coor_i, coor_j) for the top-left corner, and coor_mov = max(coor_i, coor_j) for the bottom-right corner,
where coor denotes the coordinates (x, y) of a point, i denotes the matched box from two frames before the current one, and j denotes the matched box of the previous frame.
Step 3.3.5.3: converting the coor_mov coordinates back into the corresponding centroid coordinates and box width and height, written (x_m, y_m) and w_m, h_m, to represent the moving area A_m.
In the invention, the coordinates of the four corners of a matched box can be determined from its centroid coordinates and its width and height, and likewise for the moving area.
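A minimal sketch of steps 3.3.5.1 to 3.3.5.3 in Python (the language is an assumption, as is reading the min/max rule as the smallest rectangle covering both matched boxes):

    def corners_to_centroid(x1, y1, x2, y2):
        # Top-left / bottom-right corners -> centroid (x, y) and size (w, h).
        return (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1

    def moving_area(box_i, box_j):
        # box_i: matched box from two frames before; box_j: from the previous
        # frame; both (x1, y1, x2, y2). coor_mov takes the per-corner min/max,
        # i.e. the smallest rectangle covering both boxes (an assumption).
        x1, y1 = min(box_i[0], box_j[0]), min(box_i[1], box_j[1])
        x2, y2 = max(box_i[2], box_j[2]), max(box_i[3], box_j[3])
        return corners_to_centroid(x1, y1, x2, y2)     # (x_m, y_m, w_m, h_m)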
Step 3.3.6: determining the search area A_s from the moving area A_m and the matched detection box B_{1,m} of the previous frame.
Said step 3.3.6 comprises the following steps.
Step 3.3.6.1: computing the displacements Δx and Δy of the centroids of the corresponding matched boxes in the two earlier frames.
Step 3.3.6.2: adding the displacements Δx and Δy to the centroid coordinates of the moving area A_m to obtain the search area A_s.
In the present invention, the search area A_s is derived from the search region of the earlier frames, i.e. from the moving area A_m, whose offset estimates the region of the current frame to be searched.
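Continuing the sketch, steps 3.3.6.1 and 3.3.6.2 shift the moving area by the centroid displacement observed between the two earlier frames (Python is again an assumption):

    def search_area(A_m, box_prev2, box_prev1):
        # A_m: (x_m, y_m, w_m, h_m); boxes are (x1, y1, x2, y2).
        xm, ym, wm, hm = A_m
        dx = (box_prev1[0] + box_prev1[2]) / 2 - (box_prev2[0] + box_prev2[2]) / 2
        dy = (box_prev1[1] + box_prev1[3]) / 2 - (box_prev2[1] + box_prev2[3]) / 2
        return xm + dx, ym + dy, wm, hm                # A_s: same size, shifted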
Step 3.3.7: cropping the search area A_s from the original image, sending it together with the previous frame's matched box B_{1,m} corresponding to the search area into a weight-sharing convolutional neural network for feature extraction, and denoting the resulting feature maps A_f and B_{1,f}.
In the invention, corresponding to the step 3.3.2, a VGG network is used for extracting features.
Step 3.3.8: performing a mutual convolution of the detection-box feature map B_{1,f} over the search-area feature map A_f to obtain a confidence map D, and selecting the region with the highest confidence as a proposed region to be added to the foreground proposal output of the region proposal network.
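For illustration, a minimal sketch of this mutual convolution, again assuming PyTorch: sliding B_{1,f} over A_f is a cross-correlation whose response map plays the role of the confidence map D.

    import torch
    import torch.nn.functional as F

    def confidence_map(A_f, B_1f):
        # A_f: search-area feature map (C, H, W); B_1f: box feature map (C, h, w),
        # with h <= H and w <= W. Output D has shape (H - h + 1, W - w + 1).
        D = F.conv2d(A_f.unsqueeze(0), B_1f.unsqueeze(0))[0, 0]
        row, col = divmod(int(D.argmax()), D.shape[1])   # highest-confidence location
        return D, (row, col)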
Step 3.4: inputting the overlapping anchor boxes of W_j and W_r into the improved Soft-NMS algorithm, removing the overlaps and outputting the foreground proposal boxes W_f.
In the present invention, step 3.4 modifies the confidence of the detection boxes with the improved non-maximum suppression method (C-Soft-NMS). Ordinary non-maximum suppression (NMS) can retain the optimal detection box when a large number of overlapping boxes is present, but when two vehicles overlap closely it produces missed detections; although Soft-NMS has a certain effect on the detection of overlapping vehicles, its linear attenuation is not accurate enough for practical application. Since W_j and W_r contain a large number of overlapping anchor boxes, and the missed detections that may occur when vehicles overlap must be handled, only the improved non-maximum suppression method is used.
Said step 3.4 comprises the following steps.
Step 3.4.1: defining the overlapping anchor boxes of W_j and W_r as a set of detection boxes.
Step 3.4.2: sorting the detection boxes in descending order of confidence, denoting the sorted queue L_1, and initializing an empty queue, denoted L_2.
In the invention, the confidence is the degree of belief, computed by the detection network, that a box belongs to its class.
Step 3.4.3: checking whether queue L_1 is empty; if not, proceeding to the next step, otherwise proceeding to step 3.4.7.
Step 3.4.4: denoting the detection box with the highest confidence in queue L_1 as W_max, computing the overlap of each remaining detection box with W_max, and updating the confidence of each detection box as
s_i ← s_i · sqrt(1 − iou(W_max, b_i)²) when iou(W_max, b_i) ≥ 0.5, with s_i left unchanged when iou(W_max, b_i) < 0.5,
where b_i denotes the detection box currently operated on, s_i its confidence, and iou the degree of overlap.
In the present invention, the confidence of detection boxes whose IoU is greater than 0.5 is attenuated along a circular arc.
Step 3.4.5: deleting from queue L_1 the detection boxes whose updated confidence is lower than 0.1.
In the present invention, a detection box whose confidence falls below 0.1 can be regarded as a non-target box and is therefore deleted.
Step 3.4.6: putting W_max into the candidate queue L_2, deleting it from queue L_1, and returning to step 3.4.3.
Step 3.4.7: returning queue L_2 as the final result.
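Putting steps 3.4.1 to 3.4.7 together, a minimal Python sketch under the circular-arc reading of the confidence update (both the language and that reading are assumptions):

    import math

    def iou(a, b):
        # Overlap of two (x1, y1, x2, y2) boxes.
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / (union + 1e-9)

    def c_soft_nms(boxes, scores, decay_iou=0.5, drop_below=0.1):
        L1, L2 = list(zip(boxes, scores)), []
        while L1:
            k = max(range(len(L1)), key=lambda i: L1[i][1])   # W_max
            best, s_best = L1.pop(k)
            L2.append((best, s_best))
            kept = []
            for b, s in L1:
                o = iou(best, b)
                if o >= decay_iou:                  # attenuate along a circular arc
                    s *= math.sqrt(max(0.0, 1.0 - o * o))
                if s >= drop_below:                 # drop near-zero-confidence boxes
                    kept.append((b, s))
            L1 = kept
        return L2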
Step 3.5: pooling the foreground proposal boxes to the same size through region-of-interest pooling, and inputting them into the fully connected layer and the fully convolutional network.
In the present invention, because objects to be further classified are present in the W_f boxes, they need to be unified to the same size, e.g. 14 × 14, by region-of-interest pooling (ROI Pooling).
In the invention, the fully convolutional network (FCN) and the fully connected layer are used, respectively, to output the object mask of the final segmented instance and its rectangular bounding box and object category; the fully convolutional network uses transposed convolution with a 2 × 2 kernel, and the fully connected layer outputs the specific category of the object and the coordinates of the top-left and bottom-right corners of its bounding rectangle.
Said step 3.5 comprises the following steps.
Step 3.5.1: inputting the same-size foreground proposal boxes into the fully connected layer, and outputting the category of each foreground proposal box and its top-left and bottom-right corner coordinates, where y = X·x, y is an n×1 vector, n is the number of categories, X is an n×m matrix, x is an m×1 vector, and m is the dimension entering the fully connected layer.
In the invention, the categories output for the foreground proposal boxes cover foreground and background, namely the no-parking area, the vehicle and the area that is not a no-parking area, so as to distinguish whether a vehicle is illegally parked.
In the invention, n is 4 when the box coordinates are output: determining a detection box requires four values, either the top-left and bottom-right corner coordinates or the centre coordinates together with the width and height, and both representations require n = 4.
In the present invention, m is the dimension into the fully connected layer, e.g., 2048.
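For illustration, the fully connected mapping y = X·x with the sizes mentioned above (PyTorch and the random values are assumptions):

    import torch

    m, n = 2048, 4          # m: input dimension; n: outputs (e.g. box coordinates)
    X = torch.randn(n, m)   # learned weight matrix of the fully connected layer
    x = torch.randn(m)      # pooled feature vector of one proposal box
    y = X @ x               # y = X . x
    print(y.shape)          # torch.Size([4])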
Step 3.5.2: inputting the coordinates into the fully convolutional network and outputting z = g(y), where g is a convolutional layer whose forward and backward derivative functions are exchanged.
In the present invention, the input of the fully convolutional network is the image region in each detection box; to output a mask classified at the pixel level while maintaining the spatial shape of the picture, a fully convolutional network must be used.
In the invention, the core of the fully convolutional network is the transposed convolution layer. Suppose f is a convolution layer; given input x, the forward output y = f(x) can be computed. In the backward derivation
z = ∂f(x)/∂x,
it is known that z yields an output of the same shape as x; since the derivative of a convolution operation is itself a convolution-type operation, the transposed convolution layer, denoted g, can be legitimately defined as a convolution layer whose forward and backward derivative functions are exchanged. That is, z = g(y); after the fully convolutional network, a mask predicting the object's edges can be output within the object's rectangular bounding box.
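A minimal shape check of this argument, assuming PyTorch and the 2 × 2 transposed-convolution kernel mentioned above (stride 2 is an added assumption):

    import torch
    import torch.nn as nn

    x = torch.randn(1, 1, 14, 14)
    f = nn.Conv2d(1, 1, kernel_size=2, stride=2)            # y = f(x): 7 x 7
    g = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2)   # z = g(y): 14 x 14
    z = g(f(x))
    print(f(x).shape, z.shape)   # (1, 1, 7, 7) and (1, 1, 14, 14) -- z matches x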
Step 3.6: a mask classified based on pixel level is output.
In the invention, the mask is the edge segmentation of the target in the detection frame.
Step 4: detecting the overlap of each segmented vehicle with the parkable area and the no-parking area, and judging it to be an illegally parked vehicle when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b; 0 < a < 1 and 0 < b < 1.
The step 4 comprises the following steps:
step 4.1: traversing all the segmented vehicle instances;
step 4.2: computing the overlaps IoU_violation and IoU_parkable of the vehicle's detection box with the no-parking-area detection box and the parkable-area detection box;
step 4.3: when IoU_violation > b and IoU_parkable < a, judging the vehicle instance to be an illegally parked vehicle and adding its index to the queue L_w, otherwise the vehicle is not illegally parked; proceeding to the next step;
step 4.4: if the traversal is not finished, returning to step 4.1, otherwise finishing the detection and returning the queue L_w.
In the present invention, a may be 0.5 and b may be 0.3.
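For illustration, the step 4 decision rule with these example thresholds (Python; the overlap values below are hypothetical):

    def is_violation(iou_violation, iou_parkable, a=0.5, b=0.3):
        # Flag a vehicle whose overlap with the no-parking area exceeds b
        # while its overlap with the parkable area stays below a.
        return iou_violation > b and iou_parkable < a

    overlaps = [(0.45, 0.10), (0.05, 0.80)]   # (IoU_violation, IoU_parkable) per vehicle
    L_w = [i for i, (v, p) in enumerate(overlaps) if is_violation(v, p)]
    print(L_w)   # [0] -- only the first vehicle is illegally parked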
In the method, the collected images are screened, and the effective images are retained and labeled; an instance segmentation network and a correlated region proposal network are constructed in a neural network based on dynamic background convolution and trained to obtain the model parameters of the best-performing network; the effective images are input into this network to obtain the segmented vehicles, parkable areas and no-parking areas; the overlap of each segmented vehicle with the parkable and no-parking areas is detected, and a vehicle is judged to be illegally parked when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b.
The invention provides an effective algorithm for the illegal-parking phenomenon; pictures acquired by non-fixed camera equipment are also applicable, and vehicles, parkable areas and no-parking areas can be accurately segmented against sidewalk backgrounds with complex scenes. A correlated region proposal network is designed for the case of missed vehicle detections; it predicts the regions where objects may appear from the relation between temporally consecutive pictures, and the model can be continuously updated iteratively to enhance its robustness. C-Soft-NMS, a non-maximum suppression method improved from Soft-NMS, is fused into the instance segmentation network, improving the detection success rate under occlusion and greatly improving model accuracy. The method effectively assists intelligent urban traffic management.

Claims (7)

1. An illegal parking detection algorithm based on a dynamic background, characterized in that it comprises the following steps:
step 1: screening the collected images, retaining the effective images and labeling them;
step 2: constructing an instance segmentation network and a correlated region proposal network in a neural network based on dynamic background convolution, training them, and setting a loss function to obtain the model parameters of the best-performing network;
step 3: inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas;
in step 3, processing the effective image in the best-performing network comprises the following steps:
step 3.1: inputting the processed effective image into the residual network of the instance segmentation network, and outputting the residual feature map I_RES;
step 3.2: inputting the residual feature map I_RES into the region proposal network, removing proposal boxes whose confidence is lower than q, and taking the remaining proposal boxes W_j as the foreground proposal output, where W_j = I_RES · H_RPN, H_RPN denotes the proposal-generating operation, and 0 < q < 0.5;
step 3.3: if the number of processed image frames is greater than or equal to 2, inputting the detection results of the previous two frames into the correlated region proposal network, denoting the foreground proposals it outputs as W_r, and proceeding to the next step; otherwise proceeding directly to step 3.5;
said step 3.3 comprises the steps of:
step 3.3.1: obtaining the existing vehicle detection boxes of the frame before the current frame and of the frame two before it, denoted B_{1,i} and B_{2,j} respectively, where i and j index the detection boxes;
step 3.3.2: inputting B_{1,i} and B_{2,j} into a weight-sharing convolutional network and extracting features;
step 3.3.3: pooling the extracted features to the same size through region-of-interest pooling;
step 3.3.4: performing a mutual convolution operation on the detection boxes, convolving each selected B_{1,i} with all detection boxes of B_2 to obtain a confidence queue S_j representing the similarity of B_{1,i} and B_{2,j}, removing the detection boxes whose confidence is smaller than p, where 0.6 < p ≤ 0.8, recording the detection box with the highest confidence, and letting its index be m;
step 3.3.5: determining the area A_m swept by the matched boxes of the previous frame and the frame two before it;
step 3.3.6: determining the search area A_s from the moving area A_m and the matched detection box B_{1,m} of the previous frame;
step 3.3.7: cropping the search area A_s from the original image, sending it together with the previous frame's matched box B_{1,m} corresponding to the search area into a weight-sharing convolutional neural network for feature extraction, and denoting the resulting feature maps A_f and B_{1,f};
step 3.3.8: performing a mutual convolution of the detection-box feature map B_{1,f} over the search-area feature map A_f to obtain a confidence map D, and selecting the region with the highest confidence as a proposed region to be added to the foreground proposal output of the region proposal network;
step 3.4: inputting the overlapping anchor boxes of W_j and W_r into the C-Soft-NMS algorithm, removing the overlaps and outputting the foreground proposal boxes W_f;
step 3.5: pooling the foreground proposal boxes to the same size through region-of-interest pooling, and inputting them into the fully connected layer and the fully convolutional network;
step 3.6: outputting a mask classified at the pixel level;
step 4: detecting the overlap of each segmented vehicle with the parkable area and the no-parking area, and judging it to be an illegally parked vehicle when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b; 0 < a < 1 and 0 < b < 1.
2. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein in step 1 the collected images are screened and then preprocessed, and the effective images are retained.
3. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.3.5 comprises the steps of:
step 3.3.5.1: converting the top-left and bottom-right corner coordinates of the two matched boxes into the corresponding centroid coordinates (x, y) and box width and height w and h;
step 3.3.5.2: obtaining the moving area from the relative positions of the centroids of the two boxes,
coor_mov = min(coor_i, coor_j) for the top-left corner, and coor_mov = max(coor_i, coor_j) for the bottom-right corner,
where coor denotes the coordinates (x, y) of a point, i denotes the matched box from two frames before the current one, and j denotes the matched box of the previous frame;
step 3.3.5.3: converting the coor_mov coordinates back into the corresponding centroid coordinates and box width and height, written (x_m, y_m) and w_m, h_m, to represent the moving area A_m.
4. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.3.6 comprises the steps of:
step 3.3.6.1: computing the displacements Δx and Δy of the centroids of the corresponding matched boxes in the two earlier frames;
step 3.3.6.2: adding the displacements Δx and Δy to the centroid coordinates of the moving area A_m to obtain the search area A_s.
5. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.4 comprises the steps of:
step 3.4.1: defining the overlapping anchor boxes of W_j and W_r as a set of detection boxes;
step 3.4.2: sorting the detection boxes in descending order of confidence, denoting the sorted queue L_1, and initializing an empty queue, denoted L_2;
step 3.4.3: checking whether queue L_1 is empty; if not, proceeding to the next step, otherwise proceeding to step 3.4.7;
step 3.4.4: denoting the detection box with the highest confidence in queue L_1 as W_max, computing the overlap of each remaining detection box with W_max, and updating the confidence of each detection box as
s_i ← s_i · sqrt(1 − iou(W_max, b_i)²) when iou(W_max, b_i) ≥ 0.5, with s_i left unchanged when iou(W_max, b_i) < 0.5,
where b_i denotes the detection box currently operated on, s_i its confidence, and iou the degree of overlap;
step 3.4.5: deleting from queue L_1 the detection boxes whose updated confidence is lower than 0.1;
step 3.4.6: putting W_max into the candidate queue L_2, deleting it from queue L_1, and returning to step 3.4.3;
step 3.4.7: returning queue L_2 as the final result.
6. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.5 comprises the steps of:
step 3.5.1: inputting the same-size foreground proposal boxes into the fully connected layer, and outputting the category of each foreground proposal box and its top-left and bottom-right corner coordinates, where y = X·x, y is an n×1 vector, n is the number of categories, X is an n×m matrix, x is an m×1 vector, and m is the dimension entering the fully connected layer;
step 3.5.2: inputting the coordinates into the fully convolutional network and outputting z = g(y), where g is a convolutional layer whose forward and backward derivative functions are exchanged.
7. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 4 comprises the steps of:
step 4.1: traversing all the segmented vehicle instances;
step 4.2: computing the overlaps IoU_violation and IoU_parkable of the vehicle's detection box with the no-parking-area detection box and the parkable-area detection box;
step 4.3: when IoU_violation > b and IoU_parkable < a, judging the vehicle instance to be an illegally parked vehicle and adding its index to the queue L_w, otherwise the vehicle is not illegally parked; proceeding to the next step;
step 4.4: if the traversal is not finished, returning to step 4.1, otherwise finishing the detection and returning the queue L_w.
CN201910029090.XA 2019-01-12 2019-01-12 Dynamic background-based violation detection algorithm Active CN109902676B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910029090.XA CN109902676B (en) 2019-01-12 2019-01-12 Dynamic background-based violation detection algorithm


Publications (2)

Publication Number Publication Date
CN109902676A CN109902676A (en) 2019-06-18
CN109902676B (en) 2021-04-09

Family

ID=66943606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910029090.XA Active CN109902676B (en) 2019-01-12 2019-01-12 Dynamic background-based violation detection algorithm

Country Status (1)

Country Link
CN (1) CN109902676B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110610202B (en) * 2019-08-30 2022-07-26 联想(北京)有限公司 Image processing method and electronic equipment
CN111144475A (en) * 2019-12-22 2020-05-12 上海眼控科技股份有限公司 Method and device for determining car seat, electronic equipment and readable storage medium
CN111368687B (en) * 2020-02-28 2022-07-19 成都市微泊科技有限公司 Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN113496158A (en) * 2020-03-20 2021-10-12 中移(上海)信息通信科技有限公司 Object detection model optimization method, device, equipment and storage medium
CN112132037B (en) * 2020-09-23 2024-04-16 平安国际智慧城市科技股份有限公司 Pavement detection method, device, equipment and medium based on artificial intelligence
CN114297107B (en) * 2021-12-29 2024-05-24 成都智明达电子股份有限公司 Label Tag management method, device and medium
CN114566063A (en) * 2022-01-24 2022-05-31 深圳市捷顺科技实业股份有限公司 Intelligent parking space guiding management method and device and storage medium
CN115082903B (en) * 2022-08-24 2022-11-11 深圳市万物云科技有限公司 Non-motor vehicle illegal parking identification method and device, computer equipment and storage medium
CN116433988B (en) * 2023-06-01 2023-08-08 中国标准化研究院 Multi-source heterogeneous image data classification treatment method


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631899A (en) * 2015-12-28 2016-06-01 哈尔滨工业大学 Ultrasonic image motion object tracking method based on gray-scale texture feature
CN107730903A (en) * 2017-06-13 2018-02-23 银江股份有限公司 Parking offense and the car vision detection system that casts anchor based on depth convolutional neural networks
CN107609491A (en) * 2017-08-23 2018-01-19 中国科学院声学研究所 A kind of vehicle peccancy parking detection method based on convolutional neural networks
CN109033950A (en) * 2018-06-12 2018-12-18 浙江工业大学 Vehicle based on multiple features fusion cascade deep model, which is disobeyed, stops detection method
CN109102678A (en) * 2018-08-30 2018-12-28 青岛联合创智科技有限公司 A kind of drowned behavioral value method of fusion UWB indoor positioning and video object detection and tracking technique

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Improving Object Detection With One Line of Code; Navaneeth Bodla et al.; arXiv:1704.04503v1; 2017-04-14; pp. 2961-2969 *
Mask R-CNN; Kaiming He et al.; ICCV 2017; 2017-10-29; pp. 1-10 *

Also Published As

Publication number Publication date
CN109902676A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN109902676B (en) Dynamic background-based violation detection algorithm
CN111598030B (en) Method and system for detecting and segmenting vehicle in aerial image
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN109147331B (en) Road congestion state detection method based on computer vision
Dai et al. Residential building facade segmentation in the urban environment
CN110599537A (en) Mask R-CNN-based unmanned aerial vehicle image building area calculation method and system
Huang et al. Spatial-temproal based lane detection using deep learning
CN107705577B (en) Real-time detection method and system for calibrating illegal lane change of vehicle based on lane line
Balaska et al. Enhancing satellite semantic maps with ground-level imagery
CN113313031B (en) Deep learning-based lane line detection and vehicle transverse positioning method
CN114332644B (en) Large-view-field traffic density acquisition method based on video satellite data
Bu et al. A UAV photography–based detection method for defective road marking
Zhang et al. Image-based approach for parking-spot detection with occlusion handling
CN116597270A (en) Road damage target detection method based on attention mechanism integrated learning network
CN113158954B (en) Automatic detection method for zebra crossing region based on AI technology in traffic offsite
CN116824399A (en) Pavement crack identification method based on improved YOLOv5 neural network
CN114898243A (en) Traffic scene analysis method and device based on video stream
CN114694078A (en) Traffic behavior judgment method based on multi-target tracking
CN114782919A (en) Road grid map construction method and system with real and simulation data enhanced
CN116901089B (en) Multi-angle vision distance robot control method and system
Ng et al. Scalable Feature Extraction with Aerial and Satellite Imagery.
Baduge et al. Assessment of crack severity of asphalt pavements using deep learning algorithms and geospatial system
CN114820931B (en) Virtual reality-based CIM (common information model) visual real-time imaging method for smart city
Nejadasl et al. Optical flow based vehicle tracking strengthened by statistical decisions
CN113409588B (en) Multi-vehicle speed measurement method based on video compression domain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant