CN109902676B - Dynamic background-based violation detection algorithm - Google Patents


Info

Publication number
CN109902676B
CN109902676B
Authority
CN
China
Prior art keywords
detection
area
frames
network
frame
Prior art date
2019-01-12
Legal status
Active
Application number
CN201910029090.XA
Other languages
Chinese (zh)
Other versions
CN109902676A (en)
Inventor
郑雅羽
王济浩
寇喜超
冯宇
Current Assignee
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date
2019-01-12
Filing date
2019-01-12
Publication date
2021-04-09
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910029090.XA
Publication of CN109902676A
Application granted
Publication of CN109902676B
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a dynamic-background-based illegal parking detection algorithm, which comprises: screening the acquired images, retaining the effective images and labeling them; constructing an instance segmentation network and a correlated region proposal network in a neural network based on dynamic background convolution, and obtaining the model parameters of the best-performing network after training; inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas; and judging a vehicle to be illegally parked when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b. The method is also applicable to pictures acquired by non-fixed camera equipment, and can accurately segment vehicles, parkable areas and no-parking areas against sidewalk backgrounds with complex scenes; it predicts the regions where an object may appear from the relation between pictures shot one after another in time, and the model can be continuously updated iteratively to enhance its robustness; the detection success rate under occlusion is improved, and the model accuracy is greatly improved.

Description

Dynamic background-based violation detection algorithm
Technical Field
The present invention relates to the field of data recognition, data representation, record carriers and the handling of record carriers, and in particular to a violation detection algorithm based on a dynamic background.
Background
With the development of cities and the growth of residents' disposable income, the number of vehicles increases year by year and the pressure on road traffic rises; meanwhile, because the supply of parking spaces lags behind, the problem of "difficult parking" grows increasingly acute, and vehicle owners, for their own convenience, often exploit blind spots of road monitoring to park on sidewalks. Illegally parked vehicles obstruct the normal walking areas and tactile paving for pedestrians, who risk detouring into non-motorized lanes or even motor lanes. To curb this phenomenon, traffic departments regularly patrol the sidewalks on both sides of roads for illegally parked vehicles and collect images with mobile devices, but manual patrol and supervision of illegal parking are inefficient, and gathering photographic evidence is time-consuming.
Adopting intelligent image analysis for illegal parking detection is the future direction of urban traffic management; one such approach is the illegal-parking detection algorithm, a model obtained by training on acquired pictures of real scenes. Once trained, it can automatically detect, from the learned model parameters, whether a newly acquired image contains illegally parked vehicles, and can accurately locate and segment them in the image.
The paper "Rich features technologies for object detection and segmentation" introduces a Convolutional Neural Network (CNN) method into the object detection field, greatly improves the object detection effect, and changes the main research idea of the object detection field, so that the following series of articles include RCNN (Region-based probabilistic Neural Networks), Fast-RCNN and Fast-RCNN, and meanwhile, the author of the Fast-RCNN in 2017 proposed master-RCNN, which represents the highest level in the field.
Most existing illegal-parking detection algorithms presuppose that a vehicle has already been detected: for fixed road-monitoring cameras, the vehicle's specific position is derived from the video sequence and the vehicle detection information using frame differencing, background subtraction and the like. However, these methods are strongly affected by illumination and by the stability of the monitoring device, and they rely on background modeling of the detection scene; when the scene is complex or the vehicle appears deformed because of the shooting angle, vehicle detection degrades and the illegal-parking detection algorithm fails.
In recent years, with the development of deep learning, CNNs have also been introduced into road traffic detection; for example, the invention patent with publication number CN107609491A proposes a convolutional-neural-network-based algorithm for detecting illegally parked vehicles, but it applies deep learning only to vehicle recognition and cannot effectively locate and distinguish the illegally parked vehicles when an image contains several vehicles. The prior art therefore still has the following defects:
1) ordinary CNN image recognition cannot localize the detected targets;
2) for illegally parked vehicles, one wants to locate the specific offending vehicle directly in the acquired image and to extract its vehicle type and license plate information, which requires recognition and detection on high-resolution pictures; an ordinary CNN usually rescales its input, making it difficult to map precise locations back to the high-resolution size;
3) vehicles on sidewalks are densely parked and occlude one another, so category detection alone is not accurate enough, which demands precise segmentation of different vehicles of the same category;
4) the collected pictures have a temporal order and are related in content, whereas an ordinary object detection convolutional network handles each picture in isolation and does not use the associated information between successive images.
Disclosure of Invention
In order to solve the above problems, the invention provides an optimized violation detection algorithm based on a dynamic background that can be iterated continuously to increase robustness. C-Soft-NMS, a non-maximum suppression method improved from Soft-NMS, is fused into the instance segmentation network, improving the detection success rate under occlusion.
The invention adopts the technical scheme that an illegal parking detection algorithm based on a dynamic background comprises the following steps:
step 1: screening the collected images, retaining the effective images and labeling them;
step 2: constructing an instance segmentation network and a correlated region proposal network in a neural network based on dynamic background convolution, training them, and setting a loss function to obtain the model parameters of the best-performing network;
step 3: inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas;
step 4: detecting the overlap of each segmented vehicle with the parkable area and the no-parking area, and judging it to be an illegally parked vehicle when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b; 0 < a < 1 and 0 < b < 1.
Preferably, in step 1, after the acquired image is screened, preprocessing is performed, and an effective image is retained.
Preferably, in step 3, processing the effective image in the best-performing network comprises the following steps:
step 3.1: inputting the processed effective image into the residual network of the instance segmentation network, and outputting the residual feature map I_RES;
step 3.2: inputting the residual feature map I_RES into the region proposal network, removing proposal boxes whose confidence is lower than q, and taking the remaining proposal boxes W_j as the foreground proposal output, where W_j = I_RES · H_RPN, H_RPN denotes the proposal-generating operation, and 0 < q < 0.5;
step 3.3: if the number of processed image frames is greater than or equal to 2, inputting the detection results of the previous two frames into the correlated region proposal network, denoting the foreground proposals it outputs as W_r, and proceeding to the next step; otherwise proceeding directly to step 3.5;
step 3.4: inputting the overlapping anchor boxes of W_j and W_r into the improved Soft-NMS algorithm, removing the overlaps and outputting the foreground proposal boxes W_f;
step 3.5: pooling the foreground proposal boxes to the same size through region-of-interest pooling, and inputting them into the fully connected layer and the fully convolutional network;
step 3.6: outputting a mask classified at the pixel level.
Preferably, said step 3.3 comprises the steps of:
step 3.3.1: obtaining the existing vehicle detection boxes of the frame before the current frame and of the frame two before it, denoted B_{1,i} and B_{2,j} respectively, where i and j index the detection boxes;
step 3.3.2: inputting B_{1,i} and B_{2,j} into a weight-sharing convolutional network and extracting features;
step 3.3.3: pooling the extracted features to the same size through region-of-interest pooling;
step 3.3.4: performing a mutual convolution operation on the detection boxes, convolving each selected B_{1,i} with all detection boxes of B_2 to obtain a confidence queue S_j representing the similarity of B_{1,i} and B_{2,j}, removing the detection boxes whose confidence is smaller than p, where 0.6 < p ≤ 0.8, recording the detection box with the highest confidence, and letting its index be m;
step 3.3.5: determining the area A_m swept by the matched boxes of the previous frame and the frame two before it;
step 3.3.6: determining the search area A_s from the moving area A_m and the matched detection box B_{1,m} of the previous frame;
step 3.3.7: cropping the search area A_s from the original image, sending it together with the previous frame's matched box B_{1,m} corresponding to the search area into a weight-sharing convolutional neural network for feature extraction, and denoting the resulting feature maps A_f and B_{1,f};
step 3.3.8: performing a mutual convolution of the detection-box feature map B_{1,f} over the search-area feature map A_f to obtain a confidence map D, and selecting the region with the highest confidence as a proposed region to be added to the foreground proposal output of the region proposal network.
Preferably, said step 3.3.5 comprises the steps of:
step 3.3.5.1: converting the top-left and bottom-right corner coordinates of the two matched boxes into the corresponding centroid coordinates (x, y) and box width and height w and h;
step 3.3.5.2: obtaining the moving area from the relative positions of the centroids of the two boxes,
coor_mov = min(coor_i, coor_j) for the top-left corner, and coor_mov = max(coor_i, coor_j) for the bottom-right corner,
where coor denotes the coordinates (x, y) of a point, i denotes the matched box from two frames before the current one, and j denotes the matched box of the previous frame;
step 3.3.5.3: converting the coor_mov coordinates back into the corresponding centroid coordinates and box width and height, written (x_m, y_m) and w_m, h_m, to represent the moving area A_m.
Preferably, said step 3.3.6 comprises the steps of:
step 3.3.6.1: computing the displacements Δx and Δy of the centroids of the corresponding matched boxes in the two earlier frames;
step 3.3.6.2: adding the displacements Δx and Δy to the centroid coordinates of the moving area A_m to obtain the search area A_s.
Preferably, said step 3.4 comprises the steps of:
step 3.4.1: defining the overlapping anchor boxes of W_j and W_r as a set of detection boxes;
step 3.4.2: sorting the detection boxes in descending order of confidence, denoting the sorted queue L_1, and initializing an empty queue, denoted L_2;
step 3.4.3: checking whether queue L_1 is empty; if not, proceeding to the next step, otherwise proceeding to step 3.4.7;
step 3.4.4: denoting the detection box with the highest confidence in queue L_1 as W_max, computing the overlap of each remaining detection box with W_max, and updating the confidence of each detection box as
s_i ← s_i · sqrt(1 − iou(W_max, b_i)²) when iou(W_max, b_i) ≥ 0.5, with s_i left unchanged when iou(W_max, b_i) < 0.5,
where b_i denotes the detection box currently operated on, s_i its confidence, and iou the degree of overlap;
step 3.4.5: deleting from queue L_1 the detection boxes whose updated confidence is lower than 0.1;
step 3.4.6: putting W_max into the candidate queue L_2, deleting it from queue L_1, and returning to step 3.4.3;
step 3.4.7: returning queue L_2 as the final result.
Preferably, said step 3.5 comprises the steps of:
step 3.5.1: inputting the same-size foreground proposal boxes into the fully connected layer, and outputting the category of each foreground proposal box and its top-left and bottom-right corner coordinates, where y = X·x, y is an n×1 vector, n is the number of categories, X is an n×m matrix, x is an m×1 vector, and m is the dimension entering the fully connected layer;
step 3.5.2: inputting the coordinates into the fully convolutional network and outputting z = g(y), where g is a convolutional layer whose forward and backward derivative functions are exchanged.
Preferably, the step 4 comprises the steps of:
step 4.1: traversing all the segmented vehicle instances;
step 4.2: computing the overlaps IoU_violation and IoU_parkable of the vehicle's detection box with the no-parking-area detection box and the parkable-area detection box;
step 4.3: when IoU_violation > b and IoU_parkable < a, judging the vehicle instance to be an illegally parked vehicle and adding its index to the queue L_w, otherwise the vehicle is not illegally parked; proceeding to the next step;
step 4.4: if the traversal is not finished, returning to step 4.1, otherwise finishing the detection and returning the queue L_w.
The invention provides an optimized violation detection algorithm based on a dynamic background: the acquired images are screened, and the effective images are retained and labeled; an instance segmentation network and a correlated region proposal network are constructed in a neural network based on dynamic background convolution and trained to obtain the model parameters of the best-performing network; the effective images are input into this network to obtain the segmented vehicles, parkable areas and no-parking areas; the overlap of each segmented vehicle with the parkable and no-parking areas is detected, and a vehicle is judged to be illegally parked when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b.
The invention provides an effective aid to intelligent urban traffic management, with the following beneficial effects:
(1) it provides an effective algorithm for the illegal-parking phenomenon; pictures acquired by non-fixed camera equipment are also applicable, and vehicles, parkable areas and no-parking areas can be accurately segmented against sidewalk backgrounds with complex scenes;
(2) a correlated region proposal network is designed for the case of missed vehicle detections; it predicts the regions where objects may appear from the relation between temporally consecutive pictures, and the model can be continuously updated iteratively to enhance its robustness;
(3) C-Soft-NMS, a non-maximum suppression method improved from Soft-NMS, is fused into the instance segmentation network, improving the detection success rate under occlusion and greatly improving model accuracy.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the correlated region proposal network of the present invention;
FIG. 3 is a flow chart of the modified Soft-NMS algorithm of the present invention.
Detailed Description
The invention will be described in further detail below with reference to examples and the accompanying drawings, without limiting the scope of the invention thereto.
The invention relates to an illegal parking detection algorithm based on a dynamic background. Dynamic background modeling is generally used to eliminate the interference of a complex background with a specific moving object across consecutive video frames and thereby improve detection accuracy, but for a static specific target the traditional dynamic background method is seldom applied. Convolutional neural networks have developed from shallow networks with repeated elements (VGG) to improved deep residual networks (ResNet), which can extract picture information rich enough for different tasks; meanwhile, the region proposal network (RPN) was proposed for the needs of target localization.
The method comprises the following steps.
Step 1: screening the acquired images, retaining the effective images and labeling them.
In step 1, the collected images are screened and then preprocessed, and the effective images are retained.
In the invention, an effective image is one that contains a sidewalk; an image showing non-sidewalk elements such as intersections is not a valid image. Of course, in practice the criteria for an effective image may be adjusted to actual requirements based on the scope of the violation determination.
In the invention, labeling means marking the key positions, i.e. telling the algorithm in advance what standard it needs to reach.
In the present invention, preprocessing generally refers to resizing the image to be detected, for example scaling the short side to 800 with the long side not exceeding 1300; under certain conditions denoising is also required. Those skilled in the art can configure this as required.
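For illustration only, a minimal sketch of this resizing rule follows; Python with OpenCV is an assumption here, as the invention does not prescribe any implementation.

    import cv2  # assumed dependency; any image library with a resize routine works

    def preprocess(image, short_side=800, max_long_side=1300):
        # Scale the short side to `short_side`, but shrink further if the
        # long side would then exceed `max_long_side`.
        h, w = image.shape[:2]
        scale = short_side / min(h, w)
        if scale * max(h, w) > max_long_side:
            scale = max_long_side / max(h, w)
        return cv2.resize(image, (round(w * scale), round(h * scale)))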
Step 2: constructing an instance segmentation network and a correlated region proposal network in the neural network based on dynamic background convolution, training them, and setting a loss function to obtain the model parameters of the best-performing network.
In the invention, the instance segmentation network comprises a residual feature-extraction network (ResNet), a region proposal network (RPN), an improved non-maximum suppression method (C-Soft-NMS), a fully convolutional network (FCN) and a fully connected output layer; when necessary, corresponding loss functions can be set for the different tasks. The correlated region proposal network (CPN) comprises search region restriction (SRR) and a mutual convolution network (CCN); it proposes foreground boxes to assist the training of the instance segmentation network, and a similarity-comparison loss function can be set when needed. Specifically, the method segments targets with the instance segmentation network and refines precision with the correlated region proposal network, so the output result is more accurate.
In the present invention, the convolutional kernel size of the residual extraction network (ResNet) may be set to a × a, where a is 1 or 3; the kernel size of the region proposal network (RPN) may be set to b × b, where b is 3, 5 or 7; the fully convolutional network uses transposed convolution, and its kernel can be set to 2 × 2. Those skilled in the art can set the kernel sizes according to actual requirements, and the optimal network is obtained through training.
In the invention, the model parameters of the best-performing network are the parameters of the convolution kernels that minimize the loss during iterative training; the cross-entropy loss can be used to compute gradients and update the model parameters, which is routine for those skilled in the art and can be chosen and configured as required.
Step 3: inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas.
In step 3, processing the effective image in the best-performing network comprises the following steps.
Step 3.1: inputting the processed effective image into the residual network of the instance segmentation network, and outputting the residual feature map I_RES.
Step 3.2: inputting the residual feature map I_RES into the region proposal network, removing proposal boxes whose confidence is lower than q, and taking the remaining proposal boxes W_j as the foreground proposal output, where W_j = I_RES · H_RPN, H_RPN denotes the proposal-generating operation, and 0 < q < 0.5.
In the present invention, q may be 0.3 in general.
Step 3.3: if the number of processed image frames is greater than or equal to 2, inputting the detection results of the previous two frames into the correlated region proposal network, denoting the foreground proposals it outputs as W_r, and proceeding to the next step; otherwise proceeding directly to step 3.5.
Said step 3.3 comprises the following steps.
Step 3.3.1: obtaining the existing vehicle detection boxes of the frame before the current frame and of the frame two before it, denoted B_{1,i} and B_{2,j} respectively, where i and j index the detection boxes.
In the present invention, B denotes the box of a detected vehicle, the subscript 1 denotes the previous frame, and 2 denotes the frame two before the current one.
Step 3.3.2: inputting B_{1,i} and B_{2,j} into a weight-sharing convolutional network and extracting features.
In the invention, a VGG network is selected for the feature extraction in step 3.3.2.
Step 3.3.3: pooling the extracted features to the same size through region-of-interest pooling; in this example 17 × 17.
Step 3.3.4: performing a mutual convolution operation on the detection boxes, convolving each selected B_{1,i} with all detection boxes of B_2 to obtain a confidence queue S_j representing the similarity of B_{1,i} and B_{2,j}, eliminating the detection boxes whose confidence is smaller than p, where 0.6 < p ≤ 0.8, recording the detection box with the highest confidence, and letting its index be m.
In the present invention, convolving B_{1,i} with all detection boxes of B_2 means operating on all the vehicle detection boxes of B_1 and B_2. In general, p may take 0.8, and the index letter m refers to "match".
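For illustration, a minimal sketch of this mutual convolution matching, assuming Python with PyTorch, 17 × 17 pooled features as in step 3.3.3, and a sigmoid squashing of the raw correlations into confidences (the exact scoring is not spelled out in the text):

    import torch
    import torch.nn.functional as F

    def match_confidences(feat_b1i, feats_b2):
        # feat_b1i: pooled feature of one box from the previous frame, (C, k, k).
        # feats_b2: pooled features of all J boxes from two frames before, (J, C, k, k).
        kernel = feat_b1i.unsqueeze(0)                 # (1, C, k, k)
        raw = F.conv2d(feats_b2, kernel).flatten()     # (J,) full-overlap correlations
        return torch.sigmoid(raw)                      # squash to (0, 1): the queue S_j

    # Usage: drop candidates below p and take the best-matching index m.
    S = match_confidences(torch.randn(256, 17, 17), torch.randn(5, 256, 17, 17))
    p = 0.8
    m = int(S.argmax()) if bool((S >= p).any()) else None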
Step 3.3.5: determining the area A_m swept by the matched boxes of the previous frame and the frame two before it.
Said step 3.3.5 comprises the following steps.
Step 3.3.5.1: converting the top-left and bottom-right corner coordinates of the two matched boxes into the corresponding centroid coordinates (x, y) and box width and height w and h.
Step 3.3.5.2: obtaining the moving area from the relative positions of the centroids of the two boxes,
coor_mov = min(coor_i, coor_j) for the top-left corner, and coor_mov = max(coor_i, coor_j) for the bottom-right corner,
where coor denotes the coordinates (x, y) of a point, i denotes the matched box from two frames before the current one, and j denotes the matched box of the previous frame.
Step 3.3.5.3: converting the coor_mov coordinates back into the corresponding centroid coordinates and box width and height, written (x_m, y_m) and w_m, h_m, to represent the moving area A_m.
In the invention, the coordinates of the four corners of a matched box can be determined from its centroid coordinates and its width and height, and likewise for the moving area.
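A minimal sketch of steps 3.3.5.1 to 3.3.5.3 in Python (the language is an assumption, as is reading the min/max rule as the smallest rectangle covering both matched boxes):

    def corners_to_centroid(x1, y1, x2, y2):
        # Top-left / bottom-right corners -> centroid (x, y) and size (w, h).
        return (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1

    def moving_area(box_i, box_j):
        # box_i: matched box from two frames before; box_j: from the previous
        # frame; both (x1, y1, x2, y2). coor_mov takes the per-corner min/max,
        # i.e. the smallest rectangle covering both boxes (an assumption).
        x1, y1 = min(box_i[0], box_j[0]), min(box_i[1], box_j[1])
        x2, y2 = max(box_i[2], box_j[2]), max(box_i[3], box_j[3])
        return corners_to_centroid(x1, y1, x2, y2)     # (x_m, y_m, w_m, h_m)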
Step 3.3.6: determining the search area A_s from the moving area A_m and the matched detection box B_{1,m} of the previous frame.
Said step 3.3.6 comprises the following steps.
Step 3.3.6.1: computing the displacements Δx and Δy of the centroids of the corresponding matched boxes in the two earlier frames.
Step 3.3.6.2: adding the displacements Δx and Δy to the centroid coordinates of the moving area A_m to obtain the search area A_s.
In the present invention, the search area A_s is derived from the search region of the earlier frames, i.e. from the moving area A_m, whose offset estimates the region of the current frame to be searched.
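Continuing the sketch, steps 3.3.6.1 and 3.3.6.2 shift the moving area by the centroid displacement observed between the two earlier frames (Python is again an assumption):

    def search_area(A_m, box_prev2, box_prev1):
        # A_m: (x_m, y_m, w_m, h_m); boxes are (x1, y1, x2, y2).
        xm, ym, wm, hm = A_m
        dx = (box_prev1[0] + box_prev1[2]) / 2 - (box_prev2[0] + box_prev2[2]) / 2
        dy = (box_prev1[1] + box_prev1[3]) / 2 - (box_prev2[1] + box_prev2[3]) / 2
        return xm + dx, ym + dy, wm, hm                # A_s: same size, shifted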
Step 3.3.7: cropping the search area A_s from the original image, sending it together with the previous frame's matched box B_{1,m} corresponding to the search area into a weight-sharing convolutional neural network for feature extraction, and denoting the resulting feature maps A_f and B_{1,f}.
In the invention, corresponding to the step 3.3.2, a VGG network is used for extracting features.
Step 3.3.8: performing a mutual convolution of the detection-box feature map B_{1,f} over the search-area feature map A_f to obtain a confidence map D, and selecting the region with the highest confidence as a proposed region to be added to the foreground proposal output of the region proposal network.
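For illustration, a minimal sketch of this mutual convolution, again assuming PyTorch: sliding B_{1,f} over A_f is a cross-correlation whose response map plays the role of the confidence map D.

    import torch
    import torch.nn.functional as F

    def confidence_map(A_f, B_1f):
        # A_f: search-area feature map (C, H, W); B_1f: box feature map (C, h, w),
        # with h <= H and w <= W. Output D has shape (H - h + 1, W - w + 1).
        D = F.conv2d(A_f.unsqueeze(0), B_1f.unsqueeze(0))[0, 0]
        row, col = divmod(int(D.argmax()), D.shape[1])   # highest-confidence location
        return D, (row, col)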
Step 3.4: inputting the overlapping anchor boxes of W_j and W_r into the improved Soft-NMS algorithm, removing the overlaps and outputting the foreground proposal boxes W_f.
In the present invention, step 3.4 modifies the confidence of the detection boxes with the improved non-maximum suppression method (C-Soft-NMS). Ordinary non-maximum suppression (NMS) can retain the optimal detection box when a large number of overlapping boxes is present, but when two vehicles overlap closely it produces missed detections; although Soft-NMS has a certain effect on the detection of overlapping vehicles, its linear attenuation is not accurate enough for practical application. Since W_j and W_r contain a large number of overlapping anchor boxes, and the missed detections that may occur when vehicles overlap must be handled, only the improved non-maximum suppression method is used.
Said step 3.4 comprises the following steps.
Step 3.4.1: defining the overlapping anchor boxes of W_j and W_r as a set of detection boxes.
Step 3.4.2: sorting the detection boxes in descending order of confidence, denoting the sorted queue L_1, and initializing an empty queue, denoted L_2.
In the invention, the confidence is the degree of belief, computed by the detection network, that a box belongs to its class.
Step 3.4.3: checking whether queue L_1 is empty; if not, proceeding to the next step, otherwise proceeding to step 3.4.7.
Step 3.4.4: denoting the detection box with the highest confidence in queue L_1 as W_max, computing the overlap of each remaining detection box with W_max, and updating the confidence of each detection box as
s_i ← s_i · sqrt(1 − iou(W_max, b_i)²) when iou(W_max, b_i) ≥ 0.5, with s_i left unchanged when iou(W_max, b_i) < 0.5,
where b_i denotes the detection box currently operated on, s_i its confidence, and iou the degree of overlap.
In the present invention, the confidence of detection boxes whose IoU is greater than 0.5 is attenuated along a circular arc.
Step 3.4.5: deleting from queue L_1 the detection boxes whose updated confidence is lower than 0.1.
In the present invention, a detection box whose confidence falls below 0.1 can be regarded as a non-target box and is therefore deleted.
Step 3.4.6: putting W_max into the candidate queue L_2, deleting it from queue L_1, and returning to step 3.4.3.
Step 3.4.7: returning queue L_2 as the final result.
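Putting steps 3.4.1 to 3.4.7 together, a minimal Python sketch under the circular-arc reading of the confidence update (both the language and that reading are assumptions):

    import math

    def iou(a, b):
        # Overlap of two (x1, y1, x2, y2) boxes.
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / (union + 1e-9)

    def c_soft_nms(boxes, scores, decay_iou=0.5, drop_below=0.1):
        L1, L2 = list(zip(boxes, scores)), []
        while L1:
            k = max(range(len(L1)), key=lambda i: L1[i][1])   # W_max
            best, s_best = L1.pop(k)
            L2.append((best, s_best))
            kept = []
            for b, s in L1:
                o = iou(best, b)
                if o >= decay_iou:                  # attenuate along a circular arc
                    s *= math.sqrt(max(0.0, 1.0 - o * o))
                if s >= drop_below:                 # drop near-zero-confidence boxes
                    kept.append((b, s))
            L1 = kept
        return L2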
Step 3.5: pooling the foreground proposal boxes to the same size through region-of-interest pooling, and inputting them into the fully connected layer and the fully convolutional network.
In the present invention, because objects to be further classified are present in the W_f boxes, they need to be unified to the same size, e.g. 14 × 14, by region-of-interest pooling (ROI Pooling).
In the invention, the fully convolutional network (FCN) and the fully connected layer are used, respectively, to output the object mask of the final segmented instance and its rectangular bounding box and object category; the fully convolutional network uses transposed convolution with a 2 × 2 kernel, and the fully connected layer outputs the specific category of the object and the coordinates of the top-left and bottom-right corners of its bounding rectangle.
Said step 3.5 comprises the following steps.
Step 3.5.1: inputting the same-size foreground proposal boxes into the fully connected layer, and outputting the category of each foreground proposal box and its top-left and bottom-right corner coordinates, where y = X·x, y is an n×1 vector, n is the number of categories, X is an n×m matrix, x is an m×1 vector, and m is the dimension entering the fully connected layer.
In the invention, the categories output for the foreground proposal boxes cover foreground and background, namely the no-parking area, the vehicle and the area that is not a no-parking area, so as to distinguish whether a vehicle is illegally parked.
In the invention, n is 4 when the box coordinates are output: determining a detection box requires four values, either the top-left and bottom-right corner coordinates or the centre coordinates together with the width and height, and both representations require n = 4.
In the present invention, m is the dimension into the fully connected layer, e.g., 2048.
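For illustration, the fully connected mapping y = X·x with the sizes mentioned above (PyTorch and the random values are assumptions):

    import torch

    m, n = 2048, 4          # m: input dimension; n: outputs (e.g. box coordinates)
    X = torch.randn(n, m)   # learned weight matrix of the fully connected layer
    x = torch.randn(m)      # pooled feature vector of one proposal box
    y = X @ x               # y = X . x
    print(y.shape)          # torch.Size([4])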
Step 3.5.2: inputting the coordinates into the fully convolutional network and outputting z = g(y), where g is a convolutional layer whose forward and backward derivative functions are exchanged.
In the present invention, the input of the fully convolutional network is the image region in each detection box; to output a mask classified at the pixel level while maintaining the spatial shape of the picture, a fully convolutional network must be used.
In the invention, the core of the fully convolutional network is the transposed convolution layer. Suppose f is a convolution layer; given input x, the forward output y = f(x) can be computed. In the backward derivation
z = ∂f(x)/∂x,
it is known that z yields an output of the same shape as x; since the derivative of a convolution operation is itself a convolution-type operation, the transposed convolution layer, denoted g, can be legitimately defined as a convolution layer whose forward and backward derivative functions are exchanged. That is, z = g(y); after the fully convolutional network, a mask predicting the object's edges can be output within the object's rectangular bounding box.
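A minimal shape check of this argument, assuming PyTorch and the 2 × 2 transposed-convolution kernel mentioned above (stride 2 is an added assumption):

    import torch
    import torch.nn as nn

    x = torch.randn(1, 1, 14, 14)
    f = nn.Conv2d(1, 1, kernel_size=2, stride=2)            # y = f(x): 7 x 7
    g = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2)   # z = g(y): 14 x 14
    z = g(f(x))
    print(f(x).shape, z.shape)   # (1, 1, 7, 7) and (1, 1, 14, 14) -- z matches x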
Step 3.6: a mask classified based on pixel level is output.
In the invention, the mask is the edge segmentation of the target in the detection frame.
Step 4: detecting the overlap of each segmented vehicle with the parkable area and the no-parking area, and judging it to be an illegally parked vehicle when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b; 0 < a < 1 and 0 < b < 1.
The step 4 comprises the following steps:
step 4.1: traversing all the segmented vehicle instances;
step 4.2: computing the overlaps IoU_violation and IoU_parkable of the vehicle's detection box with the no-parking-area detection box and the parkable-area detection box;
step 4.3: when IoU_violation > b and IoU_parkable < a, judging the vehicle instance to be an illegally parked vehicle and adding its index to the queue L_w, otherwise the vehicle is not illegally parked; proceeding to the next step;
step 4.4: if the traversal is not finished, returning to step 4.1, otherwise finishing the detection and returning the queue L_w.
In the present invention, a may be 0.5 and b may be 0.3.
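For illustration, the step 4 decision rule with these example thresholds (Python; the overlap values below are hypothetical):

    def is_violation(iou_violation, iou_parkable, a=0.5, b=0.3):
        # Flag a vehicle whose overlap with the no-parking area exceeds b
        # while its overlap with the parkable area stays below a.
        return iou_violation > b and iou_parkable < a

    overlaps = [(0.45, 0.10), (0.05, 0.80)]   # (IoU_violation, IoU_parkable) per vehicle
    L_w = [i for i, (v, p) in enumerate(overlaps) if is_violation(v, p)]
    print(L_w)   # [0] -- only the first vehicle is illegally parked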
In the method, the collected images are screened, and the effective images are retained and labeled; an instance segmentation network and a correlated region proposal network are constructed in a neural network based on dynamic background convolution and trained to obtain the model parameters of the best-performing network; the effective images are input into this network to obtain the segmented vehicles, parkable areas and no-parking areas; the overlap of each segmented vehicle with the parkable and no-parking areas is detected, and a vehicle is judged to be illegally parked when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b.
The invention provides an effective algorithm for the illegal-parking phenomenon; pictures acquired by non-fixed camera equipment are also applicable, and vehicles, parkable areas and no-parking areas can be accurately segmented against sidewalk backgrounds with complex scenes. A correlated region proposal network is designed for the case of missed vehicle detections; it predicts the regions where objects may appear from the relation between temporally consecutive pictures, and the model can be continuously updated iteratively to enhance its robustness. C-Soft-NMS, a non-maximum suppression method improved from Soft-NMS, is fused into the instance segmentation network, improving the detection success rate under occlusion and greatly improving model accuracy. The method effectively assists intelligent urban traffic management.

Claims (7)

1. An illegal parking detection algorithm based on a dynamic background, characterized in that it comprises the following steps:
step 1: screening the collected images, retaining the effective images and labeling them;
step 2: constructing an instance segmentation network and a correlated region proposal network in a neural network based on dynamic background convolution, training them, and setting a loss function to obtain the model parameters of the best-performing network;
step 3: inputting the effective images into the best-performing network to obtain the segmented vehicles, parkable areas and no-parking areas;
in step 3, processing the effective image in the best-performing network comprises the following steps:
step 3.1: inputting the processed effective image into the residual network of the instance segmentation network, and outputting the residual feature map I_RES;
step 3.2: inputting the residual feature map I_RES into the region proposal network, removing proposal boxes whose confidence is lower than q, and taking the remaining proposal boxes W_j as the foreground proposal output, where W_j = I_RES · H_RPN, H_RPN denotes the proposal-generating operation, and 0 < q < 0.5;
step 3.3: if the number of processed image frames is greater than or equal to 2, inputting the detection results of the previous two frames into the correlated region proposal network, denoting the foreground proposals it outputs as W_r, and proceeding to the next step; otherwise proceeding directly to step 3.5;
said step 3.3 comprises the steps of:
step 3.3.1: obtaining the existing vehicle detection boxes of the frame before the current frame and of the frame two before it, denoted B_{1,i} and B_{2,j} respectively, where i and j index the detection boxes;
step 3.3.2: inputting B_{1,i} and B_{2,j} into a weight-sharing convolutional network and extracting features;
step 3.3.3: pooling the extracted features to the same size through region-of-interest pooling;
step 3.3.4: performing a mutual convolution operation on the detection boxes, convolving each selected B_{1,i} with all detection boxes of B_2 to obtain a confidence queue S_j representing the similarity of B_{1,i} and B_{2,j}, removing the detection boxes whose confidence is smaller than p, where 0.6 < p ≤ 0.8, recording the detection box with the highest confidence, and letting its index be m;
step 3.3.5: determining the area A_m swept by the matched boxes of the previous frame and the frame two before it;
step 3.3.6: determining the search area A_s from the moving area A_m and the matched detection box B_{1,m} of the previous frame;
step 3.3.7: cropping the search area A_s from the original image, sending it together with the previous frame's matched box B_{1,m} corresponding to the search area into a weight-sharing convolutional neural network for feature extraction, and denoting the resulting feature maps A_f and B_{1,f};
step 3.3.8: performing a mutual convolution of the detection-box feature map B_{1,f} over the search-area feature map A_f to obtain a confidence map D, and selecting the region with the highest confidence as a proposed region to be added to the foreground proposal output of the region proposal network;
step 3.4: inputting the overlapping anchor boxes of W_j and W_r into the C-Soft-NMS algorithm, removing the overlaps and outputting the foreground proposal boxes W_f;
step 3.5: pooling the foreground proposal boxes to the same size through region-of-interest pooling, and inputting them into the fully connected layer and the fully convolutional network;
step 3.6: outputting a mask classified at the pixel level;
step 4: detecting the overlap of each segmented vehicle with the parkable area and the no-parking area, and judging it to be an illegally parked vehicle when its overlap with the parkable area is smaller than a preset threshold a and its overlap with the no-parking area is larger than a threshold b; 0 < a < 1 and 0 < b < 1.
2. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein in step 1 the collected images are screened and then preprocessed, and the effective images are retained.
3. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.3.5 comprises the steps of:
step 3.3.5.1: converting the top-left and bottom-right corner coordinates of the two matched boxes into the corresponding centroid coordinates (x, y) and box width and height w and h;
step 3.3.5.2: obtaining the moving area from the relative positions of the centroids of the two boxes,
coor_mov = min(coor_i, coor_j) for the top-left corner, and coor_mov = max(coor_i, coor_j) for the bottom-right corner,
where coor denotes the coordinates (x, y) of a point, i denotes the matched box from two frames before the current one, and j denotes the matched box of the previous frame;
step 3.3.5.3: converting the coor_mov coordinates back into the corresponding centroid coordinates and box width and height, written (x_m, y_m) and w_m, h_m, to represent the moving area A_m.
4. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.3.6 comprises the steps of:
step 3.3.6.1: computing the displacements Δx and Δy of the centroids of the corresponding matched boxes in the two earlier frames;
step 3.3.6.2: adding the displacements Δx and Δy to the centroid coordinates of the moving area A_m to obtain the search area A_s.
5. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.4 comprises the steps of:
step 3.4.1: defining the overlapping anchor boxes of W_j and W_r as a set of detection boxes;
step 3.4.2: sorting the detection boxes in descending order of confidence, denoting the sorted queue L_1, and initializing an empty queue, denoted L_2;
step 3.4.3: checking whether queue L_1 is empty; if not, proceeding to the next step, otherwise proceeding to step 3.4.7;
step 3.4.4: denoting the detection box with the highest confidence in queue L_1 as W_max, computing the overlap of each remaining detection box with W_max, and updating the confidence of each detection box as
s_i ← s_i · sqrt(1 − iou(W_max, b_i)²) when iou(W_max, b_i) ≥ 0.5, with s_i left unchanged when iou(W_max, b_i) < 0.5,
where b_i denotes the detection box currently operated on, s_i its confidence, and iou the degree of overlap;
step 3.4.5: deleting from queue L_1 the detection boxes whose updated confidence is lower than 0.1;
step 3.4.6: putting W_max into the candidate queue L_2, deleting it from queue L_1, and returning to step 3.4.3;
step 3.4.7: returning queue L_2 as the final result.
6. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 3.5 comprises the steps of:
step 3.5.1: inputting the same-size foreground proposal boxes into the fully connected layer, and outputting the category of each foreground proposal box and its top-left and bottom-right corner coordinates, where y = X·x, y is an n×1 vector, n is the number of categories, X is an n×m matrix, x is an m×1 vector, and m is the dimension entering the fully connected layer;
step 3.5.2: inputting the coordinates into the fully convolutional network and outputting z = g(y), where g is a convolutional layer whose forward and backward derivative functions are exchanged.
7. The dynamic-background-based violation detection algorithm as recited in claim 1, wherein said step 4 comprises the steps of:
step 4.1: traversing all the segmented vehicle instances;
step 4.2: computing the overlaps IoU_violation and IoU_parkable of the vehicle's detection box with the no-parking-area detection box and the parkable-area detection box;
step 4.3: when IoU_violation > b and IoU_parkable < a, judging the vehicle instance to be an illegally parked vehicle and adding its index to the queue L_w, otherwise the vehicle is not illegally parked; proceeding to the next step;
step 4.4: if the traversal is not finished, returning to step 4.1, otherwise finishing the detection and returning the queue L_w.
CN201910029090.XA 2019-01-12 2019-01-12 Dynamic background-based violation detection algorithm Active CN109902676B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910029090.XA CN109902676B (en) 2019-01-12 2019-01-12 Dynamic background-based violation detection algorithm


Publications (2)

Publication Number Publication Date
CN109902676A CN109902676A (en) 2019-06-18
CN109902676B (en) 2021-04-09

Family

ID=66943606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910029090.XA Active CN109902676B (en) 2019-01-12 2019-01-12 Dynamic background-based violation detection algorithm

Country Status (1)

Country Link
CN (1) CN109902676B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110610202B (en) * 2019-08-30 2022-07-26 联想(北京)有限公司 Image processing method and electronic equipment
CN111144475A (en) * 2019-12-22 2020-05-12 上海眼控科技股份有限公司 Method and device for determining car seat, electronic equipment and readable storage medium
CN111368687B (en) * 2020-02-28 2022-07-19 成都市微泊科技有限公司 Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN113496158A (en) * 2020-03-20 2021-10-12 中移(上海)信息通信科技有限公司 Object detection model optimization method, device, equipment and storage medium
CN112132037B (en) * 2020-09-23 2024-04-16 平安国际智慧城市科技股份有限公司 Pavement detection method, device, equipment and medium based on artificial intelligence
CN114297107B (en) * 2021-12-29 2024-05-24 成都智明达电子股份有限公司 Label Tag management method, device and medium
CN114566063A (en) * 2022-01-24 2022-05-31 深圳市捷顺科技实业股份有限公司 Intelligent parking space guiding management method and device and storage medium
CN115082903B (en) * 2022-08-24 2022-11-11 深圳市万物云科技有限公司 Non-motor vehicle illegal parking identification method and device, computer equipment and storage medium
CN116433988B (en) * 2023-06-01 2023-08-08 中国标准化研究院 Multi-source heterogeneous image data classification treatment method


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631899A (en) * 2015-12-28 2016-06-01 哈尔滨工业大学 Ultrasonic image motion object tracking method based on gray-scale texture feature
CN107730903A (en) * 2017-06-13 2018-02-23 银江股份有限公司 Parking offense and the car vision detection system that casts anchor based on depth convolutional neural networks
CN107609491A (en) * 2017-08-23 2018-01-19 中国科学院声学研究所 A kind of vehicle peccancy parking detection method based on convolutional neural networks
CN109033950A (en) * 2018-06-12 2018-12-18 浙江工业大学 Vehicle based on multiple features fusion cascade deep model, which is disobeyed, stops detection method
CN109102678A (en) * 2018-08-30 2018-12-28 青岛联合创智科技有限公司 A kind of drowned behavioral value method of fusion UWB indoor positioning and video object detection and tracking technique

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Improving Object Detection With One Line of Code; Navaneeth Bodla et al.; arXiv:1704.04503v1; 2017-04-14; pp. 2961-2969 *
Mask R-CNN; Kaiming He et al.; ICCV 2017; 2017-10-29; pp. 1-10 *

Also Published As

Publication number Publication date
CN109902676A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN109902676B (en) Dynamic background-based violation detection algorithm
CN111598030B (en) Method and system for detecting and segmenting vehicle in aerial image
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN109147331B (en) Road congestion state detection method based on computer vision
Dai et al. Residential building facade segmentation in the urban environment
CN110599537A (en) Mask R-CNN-based unmanned aerial vehicle image building area calculation method and system
Huang et al. Spatial-temproal based lane detection using deep learning
CN107705577B (en) Real-time detection method and system for calibrating illegal lane change of vehicle based on lane line
Balaska et al. Enhancing satellite semantic maps with ground-level imagery
CN113313031B (en) Deep learning-based lane line detection and vehicle transverse positioning method
CN114332644B (en) Large-view-field traffic density acquisition method based on video satellite data
Bu et al. A UAV photography–based detection method for defective road marking
Zhang et al. Image-based approach for parking-spot detection with occlusion handling
CN116597270A (en) Road damage target detection method based on attention mechanism integrated learning network
CN113158954B (en) Automatic detection method for zebra crossing region based on AI technology in traffic offsite
CN116824399A (en) Pavement crack identification method based on improved YOLOv5 neural network
CN114898243A (en) Traffic scene analysis method and device based on video stream
CN114694078A (en) Traffic behavior judgment method based on multi-target tracking
CN114782919A (en) Road grid map construction method and system with real and simulation data enhanced
CN116901089B (en) Multi-angle vision distance robot control method and system
Ng et al. Scalable Feature Extraction with Aerial and Satellite Imagery.
Baduge et al. Assessment of crack severity of asphalt pavements using deep learning algorithms and geospatial system
CN114820931B (en) Virtual reality-based CIM (common information model) visual real-time imaging method for smart city
Nejadasl et al. Optical flow based vehicle tracking strengthened by statistical decisions
CN113409588B (en) Multi-vehicle speed measurement method based on video compression domain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant