CN111597897B - High-speed service area parking space recognition method - Google Patents

High-speed service area parking space recognition method

Info

Publication number
CN111597897B
CN111597897B
Authority
CN
China
Prior art keywords
representing
network
layer
vehicle
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010297837.2A
Other languages
Chinese (zh)
Other versions
CN111597897A (en)
Inventor
邵奇可
卢熠
颜世航
陈一苇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202010297837.2A priority Critical patent/CN111597897B/en
Publication of CN111597897A publication Critical patent/CN111597897A/en
Application granted granted Critical
Publication of CN111597897B publication Critical patent/CN111597897B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The high-speed service area parking space identification method comprises the following steps: 1) Acquire a large number of images from high-altitude cameras in the parking lot and similar vehicle data sets, calibrate the data sets according to field management requirements, and determine the one-stage target detection algorithm model to be used. 2) Construct the parameter-adaptive loss functions. 3) Construct the LOSS function of the one-stage target detection algorithm model. 4) Update the weights of the one-stage target detection algorithm model by gradient descent until the model converges; the trained model then performs vehicle detection in the actual system, and the number of remaining parking spaces is calculated from the preset total number of on-site parking spaces and the current number of vehicles, realizing parking space management. The focal loss function provided by the invention improves the parameter adaptivity of the target detection model and greatly improves the accuracy of target detection.

Description

High-speed service area parking space recognition method
Technical Field
The invention belongs to the technical field of image recognition and computer vision, and relates to a parking space recognition method for a high-speed service area.
Background
At present, for the problem of parking space detection in high-speed service areas, traditional detection methods mainly include microwave radar detection, infrared detection, geomagnetic induction coil detection and radio frequency identification. These methods require dedicated sensing equipment to be installed at every parking space of the service area parking lot, so engineering costs are high, later maintenance is difficult, and the manpower and material resources required are considerable. Alternatively, the security cameras already present in high-speed service area parking lots can identify parking space states in real time, and the parking space information of the area can then be aggregated. Because existing parking lot monitoring equipment is reused, the ground of the parking spaces need not be modified, and equipment maintenance and repair are easy, a video-based parking space detection system has good popularization value.
Identifying parking space states from a security camera video stream places high demands on the accuracy of the recognition algorithm and on the real-time availability of empty-space information in the application scenario. A target detection algorithm based on deep learning is therefore a reasonable choice. Deep-learning-based target detection algorithms divide into two-stage and one-stage models. Although the two-stage convolutional neural network model has better detection precision, its forward inference speed is low and cannot meet the real-time requirement of the business scenario. The traditional one-stage target detection algorithm model has good real-time performance but cannot reach the detection accuracy of the two-stage convolutional neural network model. A high-speed service area parking space identification method based on a parameter-adaptive focal loss function helps improve the detection precision of the system while keeping its real-time performance adequate for the application scenario.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a high-speed service area parking space identification method that improves both detection accuracy and real-time performance.
The invention improves the loss function of the one-stage target detection algorithm model. The loss function serves as the objective function of the gradient descent process in the convolutional neural network and directly influences the training result. Since the quality of the training result directly determines the recognition precision of target detection, the design of the loss function is particularly important.
During training of the one-stage target detection algorithm model, the network encounters a large number of service-area background objects when detecting targets in an image. The loss value of each background object is very small, but background objects far outnumber vehicle targets, so when the loss is computed the many low-probability background loss values overwhelm the vehicle target loss value, and model precision drops sharply. A focal loss function is therefore embedded in the one-stage target detection algorithm model to improve training precision. However, the hyperparameters of the focal loss must be set from empirical values; the function cannot automatically adjust its own hyperparameters according to the predicted class probability value.
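For reference, the standard focal loss that such models embed carries the fixed hyperparameters α and γ described above. A minimal sketch of this baseline (non-adaptive) form, assuming a PyTorch framing with illustrative tensor names, not code from the patent:

    import torch

    def focal_loss(p, y, alpha=0.25, gamma=2.0):
        """Standard focal loss for binary vehicle/background
        classification. p: predicted foreground probability in (0, 1);
        y: label, 1 = vehicle, 0 = background. alpha and gamma are the
        fixed, empirically set hyperparameters that the invention seeks
        to replace with adaptive weights."""
        p_t = torch.where(y == 1, p, 1.0 - p)                   # probability of the true class
        alpha_t = torch.where(y == 1, torch.full_like(p, alpha),
                              torch.full_like(p, 1.0 - alpha))  # class-balance weight
        return -(alpha_t * (1.0 - p_t) ** gamma
                 * torch.log(p_t.clamp(min=1e-7))).mean()

The (1 - p_t)^γ factor is what down-weights the many easy background samples so they do not overwhelm the vehicle target loss.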
Therefore, aiming at the problem that the focal loss function requires manual adjustment of its hyperparameters during training and thus lacks parameter adaptivity, the invention proposes a deep learning loss function based on semi-supervised learning. This loss function improves the hyperparameters through a weighting method, so that the network can adaptively adjust the hyperparameters during gradient descent, further improving learning efficiency.
To solve the above technical problem, a parameter-adaptive focal loss function is adopted to strengthen the network training capability and improve the recognition accuracy of the system.
The high-speed service area parking space identification method comprises the following steps:
Step 1: construct a high-speed service area parking lot data set M, a training data set T, a verification data set V, the number of labeled vehicle categories C, the training data batch size batch, the number of training batches batches, the learning rate l_rate, and the proportionality coefficient ζ between the training data set T and the verification data set V.
Wherein: V ∪ T = M, C ∈ ℕ⁺, ζ ∈ (0,1), batches ∈ ℕ⁺, l_rate ∈ ℝ⁺, batch ∈ ℕ⁺; h_k, w_k ∈ ℕ⁺ represent the height and width of image t_k, and r represents the number of channels of the image.
Step 2: determine the one-stage target detection model to be trained. Set the depth of the convolutional neural network as L, the convolution kernel set of the network convolution layers as G, the network output layer adopting a fully connected form with convolution kernel set A, the network feature map set as U, in which each k-th feature map of the layer-l network has a corresponding number of grids, and the anchor point set as M. In these definitions, the convolution kernels, feature maps and anchor points of the layer-l network are described by their heights, widths and dimensions; each layer-l convolution kernel has a fill size and a convolution stride; f represents the excitation function of the convolution neuron; Θ represents the selected input features; Λ ∈ ℕ⁺ represents the total number of anchor points of the layer-l network; ξ ∈ ℕ⁺ represents the total number of output layer nodes; Φ ∈ ℕ⁺ represents the total number of layer-l network feature maps; and Δ ∈ ℕ⁺ represents the total number of layer-l convolution kernels.
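As a worked example of how the fill and stride parameters above determine feature map sizes, a minimal sketch of the standard convolution output-size formula (function name illustrative; the 3×3 kernel below is an assumption, since the patent defers kernel sizes to Fig. 1):

    def conv_output_size(i, k, p=1, s=1):
        """Spatial output size of a convolution layer:
        o = (i + 2p - k) // s + 1, where i is the input size, k the
        kernel size, p the fill (padding) size and s the stride."""
        return (i + 2 * p - k) // s + 1

    assert conv_output_size(416, 3) == 416  # 3x3 kernel with p = s = 1 preserves size

With the embodiment's defaults of fill 1 and stride 1, a 416-pixel input keeps its spatial size through such a layer.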
Step 3: the focus loss function of the design parameter adaptation is as follows:
LOSS = Σ_l Σ_i Σ_j ( loss_conf^(l,i,j)(t_k) + loss_coord^(l,i,j)(t_k) + loss_class^(l,i,j)(t_k) )
wherein:
loss_conf^(l,i,j)(t_k) represents the loss function, at image t_k, of the confidence of the vehicle sample and the parking lot background sample for the j-th anchor point in the i-th grid on the layer-l network; similarly, loss_coord^(l,i,j)(t_k) represents the loss function of the vehicle prediction frame, and loss_class^(l,i,j)(t_k) represents the loss function of the vehicle class, λ being a parameter of loss_class^(l,i,j)(t_k). The loss functions of the vehicle target and of the parking lot background target are specified as follows:
In these components, the j-th anchor point in the i-th grid on the layer-l network predicts a foreground vehicle probability value and, similarly, a corresponding parking lot background probability value; the abscissa and ordinate of the center point of its prediction frame correspond to the abscissa and ordinate of the center point of the vehicle sample calibration frame; the shortest Euclidean distances from the prediction frame center point to the frame boundary correspond to the shortest Euclidean distances from the calibration frame center point to its boundary; and the predicted value of the vehicle class corresponds to the calibration value of the vehicle class. Indicator variables mark whether the anchor predicts a vehicle sample or a parking lot background sample; these indicators are computed from the anchor overlaps,
wherein the parameter α ∈ (0,1); iou_j represents the overlap ratio between the anchor frame of anchor point m_j in the i-th grid and the vehicle calibration frame, and miou represents the maximum such overlap ratio.
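A minimal sketch of one plausible reading of this indicator calculation, following the usual YOLO responsibility rule (the anchor achieving the maximum overlap miou predicts the vehicle sample, the rest predict background); the patent's formula images are not reproduced here, so both the names and the rule itself are assumptions:

    def assign_anchors(ious):
        """ious: overlap ratios iou_j of each anchor frame in one grid
        with the vehicle calibration frame. Returns 1 for the anchor
        treated as the vehicle (foreground) predictor and 0 for anchors
        treated as parking lot background predictors."""
        miou = max(ious)                           # maximum overlap ratio
        return [int(iou == miou) for iou in ious]  # 1 = vehicle, 0 = background

    print(assign_anchors([0.1, 0.7, 0.3]))  # -> [0, 1, 0]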
Step 4: based on a stage in Step 3And (3) carrying out gradient descent training on the model by using a training set until the model converges. In the model test stage, the total number of parking spaces is set as sum epsilon N + Outputting a test sample of the current video monitoring area to perform target detection, and recording num epsilon N + Indicating the number of vehicles in the parking lot, the empty space s_num=sum-num.
The invention has the advantages that the provided focal loss function improves the parameter adaptivity of the target detection model, improves the detection precision of the system, and keeps the real-time performance of the system adequate for the requirements of the application scenario.
Drawings
Fig. 1 is a network configuration diagram of a convolutional neural network of the present invention.
Fig. 2 is a block diagram of a loss function in a convolutional neural network of the present invention.
Fig. 3 is a flow chart of deployment of a parking space detection algorithm based on a convolutional neural network.
Detailed Description
In order to better explain the technical scheme of the invention, the invention is further described through an embodiment with reference to the attached drawings.
The high-speed service area parking space identification method comprises the following steps:
Step 1: collect a large number of images shot by high-altitude cameras and construct the high-speed service area parking lot data set M. The training data set T contains 10000 images and the verification data set V contains 2000 images; the number of labeled vehicle categories C is 5, namely car, off-road vehicle, large truck, police car and engineering maintenance vehicle; the training data batch size batch is 4, the number of training batches batches is 1000, the learning rate l_rate is 0.001, and the proportionality coefficient ζ between the training data set T and the verification data set V is 0.25. The image height h_k = 416, width w_k = 416 and r = 3, and all images share the same height, width and channel-number settings.
Step 2: determine the one-stage target detection model as YOLOv3 and set the convolutional neural network depth L to 139. The heights, widths and dimensions of the convolution kernels are set as shown in Fig. 1; the fill size of the convolution kernels defaults to 1, the convolution stride defaults to 1, and the excitation function f of the convolution neurons defaults to the leaky_relu excitation function. Anchors are shared within each layer of the network; the anchor set M takes the values {(10,13), (30,61), (156,198)}, with Λ = 3. The network output layer adopts a fully connected form, and its convolution kernel set A takes the values {(1,1,30), (1,1,30), (1,1,30)}, with ξ = 3.
Step 3: as shown in fig. 2, a parameter-adaptive focus LOSS function LOSS is constructed, the value of the parameter α is 0.25, and the value of the parameter λ is 0.5.
Step 4: and training the model by using a gradient descent method based on a loss function of the one-stage target detection algorithm model in Step 3 until the model converges. As shown in fig. 3, the video stream of the camera placed in the parking lot is used for real-time detection, the sum of parking spaces is set to be 10, a test sample of the current video monitoring area is output for target detection, and the remaining parking spaces are calculated according to the detected number of vehicles and the sum of the parking spaces, so that the management of the parking spaces is realized.
The embodiments described in this specification are merely examples of implementation forms of the inventive concept. The scope of protection of the invention should not be construed as limited to the specific forms set forth in the embodiments; it also covers equivalent technical means that those skilled in the art can conceive on the basis of the inventive concept.

Claims (1)

1. The high-speed service area parking space identification method comprises the following steps:
step 1: constructing a high-speed service area parking lot data set M', a training data set T, a verification data set V, a labeling vehicle class number C, a training data batch size batch, a training batch number batches, a learning rate l_rate, and a proportionality coefficient zeta between the training data set T and the verification data set V;
wherein: v ∈T=M', C∈N + ,ζ∈(0,1),batches∈N + ,l_rate∈N + ,batch∈N +Representing the height and width of an image, r representing the number of channels of the image;
step 2: determining a target detection model to be trained at one stage, setting the depth of a convolutional neural network as L, a convolutional kernel set G of a network convolutional layer, adopting a fully-connected mode for a network output layer, a convolutional kernel set A, a network feature map set U,representing the kth feature map in the layer 1 network +.>The corresponding grid quantity and anchor point set M specifically comprise:
wherein: the convolution kernels of the layer-l network are described by their heights, widths and dimensions, as are the feature maps, and the anchor points by their heights and widths; each layer-l convolution kernel has a fill size and a convolution stride; f represents the excitation function of the convolution neuron; Θ represents the selected input features; Λ ∈ ℕ⁺ represents the total number of anchor points of the layer-l network; ξ ∈ ℕ⁺ represents the total number of output layer nodes; Φ ∈ ℕ⁺ represents the total number of layer-l network feature maps; Δ ∈ ℕ⁺ represents the total number of layer-l convolution kernels;
step 3: designing the parameter-adaptive focal loss function, summing the per-anchor components defined below over every layer l, grid i and anchor point j:
LOSS = Σ_l Σ_i Σ_j ( loss_conf^(l,i,j)(t_k) + loss_coord^(l,i,j)(t_k) + loss_class^(l,i,j)(t_k) )
wherein:
loss_conf^(l,i,j)(t_k) represents the loss function, at image t_k, of the confidence of the vehicle sample and the parking lot background sample for the j-th anchor point in the i-th grid on the layer-l network; similarly, loss_coord^(l,i,j)(t_k) represents the loss function of the vehicle prediction frame, and loss_class^(l,i,j)(t_k) represents the loss function of the vehicle class, λ ∈ ℚ being a parameter of loss_class^(l,i,j)(t_k); the loss functions of the vehicle target and of the parking lot background target are specified as follows:
in these components, the j-th anchor point in the i-th grid on the layer-l network predicts a foreground vehicle probability value and, similarly, a corresponding parking lot background probability value; the abscissa and ordinate of the center point of its prediction frame correspond to the abscissa and ordinate of the center point of the vehicle sample calibration frame; the shortest Euclidean distances from the calibration frame center point to the frame boundary correspond to the shortest Euclidean distances from the vehicle sample prediction frame center point to its boundary; the predicted value of the vehicle class corresponds to the calibration value of the vehicle class; and indicator variables mark whether a vehicle sample or a parking lot background sample is predicted, computed from the anchor overlaps,
wherein the parameter α ∈ (0,1); iou_j represents the overlap ratio between the anchor frame of anchor point m_j in the i-th grid and the vehicle calibration frame, and miou represents the maximum such overlap ratio;
step 4: training the model by the gradient descent method using the loss function of the one-stage target detection algorithm model in step 3 until the model converges; in the system operation stage, extracting network feature values with the one-stage target detection model, determining anchor points based on the K-means clustering method, setting the total number of parking spaces as sum ∈ ℕ⁺, and outputting the number num ∈ ℕ⁺ of vehicles detected in the current video monitoring area; the number of empty spaces is then s_num = sum - num.
CN202010297837.2A 2020-04-16 2020-04-16 High-speed service area parking space recognition method Active CN111597897B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010297837.2A CN111597897B (en) 2020-04-16 2020-04-16 High-speed service area parking space recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010297837.2A CN111597897B (en) 2020-04-16 2020-04-16 High-speed service area parking space recognition method

Publications (2)

Publication Number Publication Date
CN111597897A CN111597897A (en) 2020-08-28
CN111597897B (en) 2023-10-24

Family

ID=72187569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010297837.2A Active CN111597897B (en) 2020-04-16 2020-04-16 High-speed service area parking space recognition method

Country Status (1)

Country Link
CN (1) CN111597897B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112686340B (en) * 2021-03-12 2021-07-13 成都点泽智能科技有限公司 Dense small target detection method based on deep neural network


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9916522B2 (en) * 2016-03-11 2018-03-13 Kabushiki Kaisha Toshiba Training constrained deconvolutional networks for road scene semantic segmentation
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
CN109902677A (en) * 2019-01-30 2019-06-18 深圳北斗通信科技有限公司 A kind of vehicle checking method based on deep learning
CN110443208A (en) * 2019-08-08 2019-11-12 南京工业大学 A kind of vehicle target detection method, system and equipment based on YOLOv2

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multi-resolution Networks for Ship Detection in Infrared Remote Sensing Images; Min Zhou et al.; Infrared Physics & Technology; 2018-08-31; full text *
Parking space detection algorithm for high-speed service areas based on deep learning; Shao Qike et al.; Computer Systems & Applications; 2019-05-25; Vol. 28, No. 6; full text *

Also Published As

Publication number Publication date
CN111597897A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN112380952B (en) Power equipment infrared image real-time detection and identification method based on artificial intelligence
CN111597902B (en) Method for monitoring motor vehicle illegal parking
CN109241817B (en) Crop image recognition method shot by unmanned aerial vehicle
CN111079640B (en) Vehicle type identification method and system based on automatic amplification sample
CN111597901A (en) Illegal billboard monitoring method
CN111709336B (en) Expressway pedestrian detection method, equipment and readable storage medium
CN110175615B (en) Model training method, domain-adaptive visual position identification method and device
CN112232151B (en) Iterative polymerization neural network high-resolution remote sensing scene classification method embedded with attention mechanism
CN111126332B (en) Frequency hopping signal classification method based on contour features
CN111723854B (en) Expressway traffic jam detection method, equipment and readable storage medium
CN112435356B (en) ETC interference signal identification method and detection system
CN112749654A (en) Deep neural network model construction method, system and device for video fog monitoring
CN113034378B (en) Method for distinguishing electric automobile from fuel automobile
CN111027347A (en) Video identification method and device and computer equipment
CN111597897B (en) High-speed service area parking space recognition method
CN111597900B (en) Illegal dog walking identification method
CN111239137B (en) Grain quality detection method based on transfer learning and adaptive deep convolution neural network
CN113139594A (en) Airborne image unmanned aerial vehicle target self-adaptive detection method
CN114299567B (en) Model training method, living body detection method, electronic device, and storage medium
CN114549909A (en) Pseudo label remote sensing image scene classification method based on self-adaptive threshold
CN112348820B (en) Remote sensing image semantic segmentation method based on depth discrimination enhancement network
CN116630828B (en) Unmanned aerial vehicle remote sensing information acquisition system and method based on terrain environment adaptation
CN108229271B (en) Method and device for interpreting remote sensing image and electronic equipment
CN110674689B (en) Vehicle re-identification method and system based on feature embedding space geometric constraint
CN116977834A (en) Method for identifying internal and external images distributed under open condition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant