CN111597897A - Parking space identification method for high-speed service area - Google Patents
Parking space identification method for high-speed service area
- Publication number
- Publication number: CN111597897A (application CN202010297837.2A; granted as CN111597897B)
- Authority
- CN
- China
- Prior art keywords
- network
- representing
- vehicle
- loss function
- target detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
A method for identifying parking spaces in a high-speed service area comprises the following steps: 1) collect a large number of images from high-altitude cameras in the parking lot together with other vehicle data sets, calibrate the data sets according to the site-management requirements, and determine the one-stage target detection algorithm model to be used; 2) construct a parameter-adaptive loss function …
Description
Technical Field
The invention belongs to the technical field of image recognition and computer vision, and relates to a parking space recognition method in a high-speed service area.
Background
At present, traditional methods for detecting parking spaces in a high-speed service area mainly include micro-radar detection, infrared detection, geomagnetic induction-coil detection and radio-frequency identification. These methods require a dedicated sensing device to be installed at every parking space of the service-area parking lot, which entails high engineering cost, difficult later maintenance, and high expenditure of manpower and material resources. By contrast, the parking-space state can be identified in real time with the security cameras already installed in the service-area parking lot, and the parking-space information of the area can then be aggregated. Because this approach reuses the existing monitoring equipment, requires no modification of the parking-lot ground, and the equipment is easy to maintain, such a video-based parking-space detection system has great value for wide deployment.
Identifying the parking-space state from the security-camera video stream places high demands on both the accuracy of the recognition algorithm and the timeliness of the vacant-space information in the application scenario. A target detection algorithm based on deep learning is therefore a reasonable choice. Deep-learning target detection algorithms fall into two-stage and one-stage models. Although a two-stage convolutional neural network model achieves better detection accuracy, its forward inference is slow and cannot meet the real-time requirement of the business scenario. A traditional one-stage target detection model offers good real-time performance but cannot reach the detection accuracy of a two-stage model. The high-speed service-area parking-space identification method proposed here, based on a parameter-adaptive focal loss function, helps to improve the detection accuracy of the system while keeping its real-time performance adequate for the application scenario.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a method for identifying parking spaces in a high-speed service area, improving both detection accuracy and real-time performance.
The invention improves the loss function of a one-stage target detection algorithm model. The loss function serves as the objective function of the gradient-descent process in a convolutional neural network and directly influences the training result. Since the quality of the training result directly determines the recognition accuracy of target detection, the design of the loss function is particularly important.
During the training of a one-stage target detection model, the network sees a large number of service-area background objects in each image. Although the loss value of each background object is small, background objects far outnumber vehicle targets, so when the total loss is computed, the accumulated low-probability background loss overwhelms the vehicle target loss and the model accuracy drops sharply. Embedding a focal loss function into the one-stage model improves training accuracy. However, the focal loss contains hyper-parameters that must be set from empirical values and cannot be adjusted automatically according to the predicted class probability.
Therefore, addressing the problems that the focal loss requires manual hyper-parameter tuning and that its parameters are not adaptive during training, the invention proposes a deep-learning loss function based on semi-supervised learning. This loss function improves the hyper-parameters with a weighting method, so that the network hyper-parameters can be adjusted adaptively during gradient descent, thereby improving the learning efficiency of the network.
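The loss-function equations themselves appear only as images in the source, so the exact adaptive form is not recoverable here; the sketch below contrasts the standard focal loss with one plausible parameter-adaptive variant in which the modulating exponent is driven by the predicted probability instead of being hand-tuned. `adaptive_focal_loss` and its `gamma` schedule are illustrative assumptions, not the patent's formula.

```python
import numpy as np

def focal_loss(p, alpha=0.25, gamma=2.0):
    """Standard focal loss for a positive sample predicted with probability p:
    well-classified samples (p near 1) are down-weighted by (1 - p)**gamma."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return -alpha * (1.0 - p) ** gamma * np.log(p)

def adaptive_focal_loss(p, alpha=0.25):
    """Hypothetical parameter-adaptive variant: the exponent is derived from
    the predicted probability itself, so no manual gamma tuning is needed."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    gamma = 2.0 * (1.0 - p)  # assumption: harder (low-p) samples get a larger exponent
    return -alpha * (1.0 - p) ** gamma * np.log(p)

# An easy background sample contributes far less loss than a hard vehicle sample,
# which is the imbalance correction the description argues for.
easy = focal_loss(0.95)
hard = focal_loss(0.30)
```

With γ = 2 the easy sample (p = 0.95) contributes several orders of magnitude less loss than the hard one (p = 0.3), which is exactly the background-suppression effect described above.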
To solve this technical problem, a parameter-adaptive focal loss function is adopted to enhance the network training capability and improve the recognition accuracy of the system.
The method for identifying the parking spaces in the high-speed service area comprises the following steps:
Step 1: construct a high-speed service-area parking-lot data set M, a training data set T, a verification data set V, the number of labelled vehicle categories C, the training batch size batch, the number of training batches batches, the learning rate l_rate, and the proportionality coefficient ζ between the training data set T and the verification data set V.
Wherein V ∪ T = M, C ∈ N+, ζ ∈ (0, 1), batches ∈ N+, l_rate ∈ (0, 1), batch ∈ N+, and tk ∈ R^(hk × wk × r), with hk and wk denoting the height and width of image tk and r its number of channels.
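As a concrete reading of the partition V ∪ T = M with ratio ζ, here is a minimal sketch; `split_dataset` and its `seed` argument are illustrative names, not from the patent:

```python
import random

def split_dataset(M, zeta=0.25, seed=0):
    """Split data set M into a training set T and a verification set V with
    |V| / |T| = zeta, so that V and T together cover M (Step 1)."""
    items = list(M)
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * zeta / (1.0 + zeta))  # |V| = zeta * |T| and |T| + |V| = |M|
    return items[n_val:], items[:n_val]  # T, V

# With |M| = 10000 and zeta = 0.25 this reproduces the embodiment's 8000/2000 split.
T, V = split_dataset(range(10000), zeta=0.25)
```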
Step 2: determine the one-stage target detection model to be trained; set the depth of the convolutional neural network to L, the set of network convolution-layer kernels G, the network output layer in fully connected form with convolution-kernel set A, the set of network feature maps U, the number of grids corresponding to the k-th feature map of the l-th layer network, and the anchor-point set M, specifically defined as follows:
wherein: the kernel, feature-map and anchor terms respectively represent the height, width and dimension of the convolution kernels, feature maps and anchor points of the l-th layer network; further terms denote the padding size of the l-th layer convolution kernels and the convolution stride of the l-th layer network; f denotes the excitation function of the convolution neurons; θ denotes the selected input features; Λ ∈ N+ denotes the total number of anchor points in the l-th layer network; ξ ∈ N+ denotes the total number of output-layer nodes; Φ ∈ N+ denotes the total number of feature maps of the l-th layer; and Δ ∈ N+ denotes the total number of convolution kernels of the l-th layer.
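To make the grid counts concrete: for the embodiment's 416 × 416 input, a YOLOv3-style one-stage detector predicts on three feature maps whose strides are 32, 16 and 8, dividing the image into 13 × 13, 26 × 26 and 52 × 52 grids. The helper below reflects standard YOLOv3 behaviour and is an illustration, not text from the patent:

```python
def yolo_grid_sizes(img_size=416, strides=(32, 16, 8)):
    """Number of grid cells per side for each detection feature map of a
    YOLOv3-style detector: the k-th map has (img_size // stride_k)^2 cells."""
    return [img_size // s for s in strides]

grids = yolo_grid_sizes()  # [13, 26, 52] for a 416x416 input
```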
Step 3: design the parameter-adaptive focal loss function as follows:
wherein:
the confidence loss term corresponds to the j-th anchor point in the i-th grid of the l-th layer network and covers the vehicle samples and parking-lot background samples in image tk; likewise, further terms give the loss function of the vehicle prediction box and the loss function of the vehicle category, with λ a loss-function parameter. The loss terms for vehicle objects and parking-lot background objects are respectively expressed as follows:
the foreground-vehicle probability is that predicted by the j-th anchor point in the i-th grid of the l-th layer network; similarly, there is a corresponding parking-lot background probability. Further terms give the abscissa and ordinate of the centre of the prediction box of the j-th anchor point in the i-th grid of the l-th layer network and, likewise, the abscissa and ordinate of the centre of the vehicle-sample calibration box; then the shortest Euclidean distances from the centre of the prediction box to its boundaries and, likewise, from the centre of the vehicle-sample calibration box to its boundaries. The category term is the vehicle-category value predicted by the j-th anchor point in the i-th grid of the l-th layer network; likewise there are the calibrated vehicle category, an indicator of whether a vehicle sample is predicted, and an indicator of whether a parking-lot background sample is predicted, computed as follows:
wherein the parameter α ∈ (0, 1); iou_j denotes the overlap ratio between the anchor box of anchor point m_j in the i-th grid and the vehicle calibration box, and miou denotes the maximum such overlap ratio.
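The overlap computation can be sketched as follows; `responsible_anchor` is a hypothetical helper name for selecting the anchor that attains the maximum overlap ratio miou:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def responsible_anchor(anchor_boxes, calibration_box):
    """Return (index, miou): the anchor whose box has the maximum overlap
    ratio with the vehicle calibration (ground-truth) box."""
    ious = [iou(a, calibration_box) for a in anchor_boxes]
    best = max(range(len(ious)), key=ious.__getitem__)
    return best, ious[best]
```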
Step 4: using the loss function of the one-stage target detection algorithm model from Step 3, train the model by the gradient descent method on the training set until it converges. In the model testing stage, set the total number of parking spaces sum ∈ N+, run target detection on a test sample of the current video-monitoring area, and record the number of detected vehicles num ∈ N+; the number of vacant parking spaces is then s_num = sum − num.
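The vacancy count itself is a single subtraction; the clamp at zero is an added safeguard (our assumption, not stated in the patent) against spurious duplicate detections:

```python
def vacant_spaces(total_spaces, detected_vehicles):
    """Step 4: s_num = sum - num, clamped at zero for robustness."""
    return max(total_spaces - detected_vehicles, 0)

s_num = vacant_spaces(10, 7)  # embodiment: sum = 10 total spaces
```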
The invention has the advantages that the proposed focal loss function improves the parameter adaptability of the target detection model, improves the detection accuracy of the system, and ensures that the real-time performance of the system meets the requirements of the application scenario.
Drawings
Fig. 1 is a network configuration diagram of the convolutional neural network of the present invention.
Fig. 2 is a diagram of a loss function structure in the convolutional neural network of the present invention.
Fig. 3 is a flowchart of the deployment of the parking space detection algorithm based on the convolutional neural network provided by the present invention.
Detailed Description
In order to better explain the technical scheme of the invention, the invention is further described below through an embodiment with reference to the accompanying drawings.
The method for identifying the parking spaces in the high-speed service area comprises the following steps:
Step 1: acquire a large amount of image data captured by high-altitude cameras and construct a high-speed service-area parking-lot data set M of 10000 images, a training data set T of 8000 images and a verification data set V of 2000 images. The number of labelled vehicle categories C is 5, namely car, off-road vehicle, large truck, police car and engineering-maintenance vehicle. The training batch size batch is 4, the number of training batches batches is 1000, the learning rate l_rate is 0.001, and the proportionality coefficient ζ between the training set T and the verification set V is 0.25. The image height hk = 416, width wk = 416 and channel number r = 3 are consistent across all images.
Step 2: determine the one-stage target detection model as YOLOv3 and set the depth of the convolutional neural network L = 139. The height, width and dimension of the convolution kernels are shown in Fig. 1; the padding size of the convolution kernels defaults to 1 and the convolution stride defaults to 1. The excitation function f of the convolution neurons is the leaky_relu function. Anchor points are shared across the layers, with the anchor-point set M = {(10, 13), (30, 61), (156, 198)} and Λ = 3. The network output layer is fully connected, with the convolution-kernel set A = {(1, 1, 30), (1, 1, 30), (1, 1, 30)} and ξ = 3.
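Claim 1 determines anchor points with a K-means clustering method, but the distance metric is not stated; the sketch below assumes the 1 − IoU distance conventional for YOLO anchor clustering, applied to the (width, height) pairs of the labelled boxes:

```python
import random

def kmeans_anchors(wh_pairs, k=3, iters=50, seed=0):
    """K-means over labelled box (w, h) pairs using 1 - IoU as the distance
    (an assumption; the claim names K-means but not the metric)."""
    def iou_wh(a, b):
        # IoU of two boxes aligned at a common corner, as in YOLO anchor clustering.
        inter = min(a[0], b[0]) * min(a[1], b[1])
        return inter / (a[0] * a[1] + b[0] * b[1] - inter)

    rng = random.Random(seed)
    centers = rng.sample(list(wh_pairs), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for wh in wh_pairs:  # assign each box to the centre with maximum IoU
            clusters[max(range(k), key=lambda i: iou_wh(wh, centers[i]))].append(wh)
        centers = [  # recompute each centre as the mean (w, h) of its cluster
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centers)

anchors = kmeans_anchors(
    [(10, 13), (11, 12), (30, 61), (29, 60), (156, 198), (150, 200)], k=3
)
```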
Step 3: as shown in Fig. 2, construct the parameter-adaptive focal LOSS function LOSS, where the parameter α takes the value 0.25 and the parameter λ takes the value 0.5.
Step 4: using the loss function of the one-stage target detection algorithm model from Step 3, train the model by the gradient descent method on the training set until it converges. As shown in Fig. 3, detection runs in real time on the video stream of the cameras installed in the parking lot: with the total number of parking spaces sum set to 10, a test sample of the current video-monitoring area is passed through target detection, and the remaining parking spaces are computed from the number of detected vehicles and the total number of spaces, thereby realising management of the parking spaces.
The embodiments described in this specification merely illustrate implementations of the inventive concept. The scope of the present invention should not be considered limited to the specific forms set forth in the embodiments, but also covers equivalents that those skilled in the art may conceive within the inventive concept.
Claims (1)
1. The method for identifying the parking spaces in the high-speed service area comprises the following steps:
Step 1: constructing a high-speed service-area parking-lot data set M, a training data set T, a verification data set V, a number of labelled vehicle categories C, a training batch size batch, a number of training batches batches, a learning rate l_rate, and a proportionality coefficient ζ between the training data set T and the verification data set V;
wherein V ∪ T = M, C ∈ N+, ζ ∈ (0, 1), batches ∈ N+, l_rate ∈ (0, 1), batch ∈ N+, and tk ∈ R^(hk × wk × r), with hk and wk denoting the height and width of image tk and r its number of channels;
Step 2: determining the one-stage target detection model to be trained; setting the depth of the convolutional neural network to L, the set of network convolution-layer kernels G, the network output layer in fully connected form with convolution-kernel set A, the set of network feature maps U, the number of grids corresponding to the k-th feature map of the l-th layer network, and the anchor-point set M, specifically:
wherein: the kernel, feature-map and anchor terms respectively represent the height, width and dimension of the convolution kernels, feature maps and anchor points of the l-th layer network; further terms denote the padding size of the l-th layer convolution kernels and the convolution stride of the l-th layer network; f denotes the excitation function of the convolution neurons; θ denotes the selected input features; Λ ∈ N+ denotes the total number of anchor points in the l-th layer network; ξ ∈ N+ denotes the total number of output-layer nodes; Φ ∈ N+ denotes the total number of feature maps of the l-th layer; and Δ ∈ N+ denotes the total number of convolution kernels of the l-th layer;
Step 3: designing a parameter-adaptive focal loss function, specifically as follows:
wherein:
the confidence loss term corresponds to the j-th anchor point in the i-th grid of the l-th layer network and covers the vehicle samples and parking-lot background samples in image tk; likewise, further terms give the loss function of the vehicle prediction box and the loss function of the vehicle category, with λ ∈ Q a loss-function parameter; the loss terms for vehicle objects and parking-lot background objects are respectively expressed as follows:
the foreground-vehicle probability is that predicted by the j-th anchor point in the i-th grid of the l-th layer network; similarly, there is a corresponding parking-lot background probability; further terms give the abscissa and ordinate of the centre of the prediction box of the j-th anchor point in the i-th grid of the l-th layer network and, likewise, the abscissa and ordinate of the centre of the vehicle-sample calibration box; then the shortest Euclidean distances from the centre of the prediction box to its boundaries and, likewise, from the centre of the vehicle-sample calibration box to its boundaries; the category term is the vehicle-category value predicted by the j-th anchor point in the i-th grid of the l-th layer network; likewise there are the calibrated vehicle category, an indicator of whether a vehicle sample is predicted, and an indicator of whether a parking-lot background sample is predicted, computed as follows:
wherein the parameter α ∈ (0, 1); iou_j denotes the overlap ratio between the anchor box of anchor point m_j in the i-th grid and the vehicle calibration box, and miou denotes the maximum such overlap ratio;
Step 4: using the loss function of the one-stage target detection algorithm model of Step 3, training the model by the gradient descent method until it converges; in the system operation stage, extracting network feature values with the one-stage target detection model, determining anchor points based on a K-means clustering method, setting the total number of parking spaces sum ∈ N+, outputting the number num ∈ N+ of vehicles detected in the current video-monitoring area, and obtaining the number of vacant parking spaces s_num = sum − num.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010297837.2A | 2020-04-16 | 2020-04-16 | High-speed service area parking space recognition method |

Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN111597897A | 2020-08-28 |
| CN111597897B (granted, active) | 2023-10-24 |

Family
ID=72187569
Cited By (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112686340A (granted as CN112686340B, 2021-07-13) | 2021-03-12 | 2021-04-20 | | Dense small target detection method based on deep neural network |

Patent Citations (4)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9916522B2 | 2016-03-11 | 2018-03-13 | Kabushiki Kaisha Toshiba | Training constrained deconvolutional networks for road scene semantic segmentation |
| WO2019144575A1 | 2018-01-24 | 2019-08-01 | 中山大学 | Fast pedestrian detection method and device |
| CN109902677A | 2019-01-30 | 2019-06-18 | 深圳北斗通信科技有限公司 | A vehicle detection method based on deep learning |
| CN110443208A | 2019-08-08 | 2019-11-12 | 南京工业大学 | A vehicle target detection method, system and equipment based on YOLOv2 |

Non-Patent Citations (2)

- Min Zhou et al.: "Multi-resolution Networks for Ship Detection in Infrared Remote Sensing Images", Infrared Physics & Technology
- Shao Qike et al.: "Parking-space detection algorithm for high-speed service areas based on deep learning", Computer Systems & Applications
Similar Documents

| Publication | Title |
|---|---|
| CN112380952B | Power equipment infrared image real-time detection and identification method based on artificial intelligence |
| KR102263397B1 | Method for acquiring sample images for inspecting label among auto-labeled images to be used for learning of neural network and sample image acquiring device using the same |
| CN111597901A | Illegal billboard monitoring method |
| CN114023062B | Traffic flow information monitoring method based on deep learning and edge calculation |
| CN111079640B | Vehicle type identification method and system based on automatic amplification sample |
| CN111597902A | Motor vehicle illegal parking monitoring method |
| CN111709336B | Expressway pedestrian detection method, equipment and readable storage medium |
| CN110717387A | Real-time vehicle detection method based on unmanned aerial vehicle platform |
| CN112435356B | ETC interference signal identification method and detection system |
| CN112365482A | Crossed chromosome image example segmentation method based on chromosome trisection feature point positioning |
| CN111723854A | Method and device for detecting traffic jam of highway and readable storage medium |
| CN111540203B | Method for adjusting green light passing time based on fast-RCNN |
| CN111597900B | Illegal dog walking identification method |
| CN111597897A | Parking space identification method for high-speed service area |
| CN114612847A | Method and system for detecting distortion of Deepfake video |
| CN116630828B | Unmanned aerial vehicle remote sensing information acquisition system and method based on terrain environment adaptation |
| CN112597995A | License plate detection model training method, device, equipment and medium |
| CN113780462B | Vehicle detection network establishment method based on unmanned aerial vehicle aerial image and application thereof |
| CN115984723A | Road damage detection method, system, device, storage medium and computer equipment |
| CN116309270A | Binocular image-based transmission line typical defect identification method |
| CN116152750A | Vehicle feature recognition method based on monitoring image |
| CN115861595A | Multi-scale domain self-adaptive heterogeneous image matching method based on deep learning |
| CN115223087A | Group control elevator traffic mode identification method |
| CN114548376A | Intelligent transportation system-oriented vehicle rapid detection network and method |
| CN114495160A | Pedestrian detection method and system based on improved RFBNet algorithm |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |