CN110210350B - Rapid parking space detection method based on deep learning - Google Patents

Rapid parking space detection method based on deep learning

Info

Publication number
CN110210350B
Authority
CN
China
Prior art keywords
parking space
image
convolution
feature
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910429977.8A
Other languages
Chinese (zh)
Other versions
CN110210350A (en)
Inventor
陈慧岩
陈建松
熊光明
黄书昊
齐建永
龚建伟
吴绍斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beili Huidong Beijing Technology Co ltd
Bit Intelligent Vehicle Technology Co ltd
Beijing Institute of Technology BIT
Original Assignee
Beili Huidong Beijing Technology Co ltd
Bit Intelligent Vehicle Technology Co ltd
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beili Huidong Beijing Technology Co ltd, Bit Intelligent Vehicle Technology Co ltd, Beijing Institute of Technology BIT filed Critical Beili Huidong Beijing Technology Co ltd
Priority to CN201910429977.8A priority Critical patent/CN110210350B/en
Publication of CN110210350A publication Critical patent/CN110210350A/en
Application granted granted Critical
Publication of CN110210350B publication Critical patent/CN110210350B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/586Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of parking space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a rapid parking space detection method based on deep learning, belonging to the field of driving technology, which addresses the poor environmental adaptability and the large model computation cost of existing parking space detection. The method includes an offline step: collecting image data containing parking spaces offline and building training and verification data sets; training, evaluating and optimizing a neural network model that performs semantic segmentation of the parking space sidelines in the image data. It also includes an online step: collecting image data containing parking spaces online, using the trained neural network model to segment the parking space sidelines and obtain a sideline mask, then fitting, clustering and combining the sideline mask to obtain geometric shapes composed of sidelines; and screening the geometric shapes against preset shape discrimination conditions to determine the parking spaces. The invention has strong environmental adaptability; the model is small, has a low computation cost and needs few computing resources; the system cost is low, giving it potential for large-scale application.

Figure 201910429977

Description

Rapid parking space detection method based on deep learning
Technical Field
The invention relates to the technical field of driving, in particular to a quick parking space detection method based on deep learning.
Background
Parking space detection and positioning are the basis of automatic parking systems and parking assistance systems. Among existing methods, those not based on deep learning detect parking spaces from hand-crafted sideline features, and they fail when the sideline markings are unclear, when building shadows or reflections from standing water are present, when the camera image is blurred, and in similar conditions. Deep-learning-based methods, in contrast, generally have a large model size, a large computation cost and high demands on computing hardware, which raises system cost and hinders large-scale deployment on vehicles. A method that combines a robust detection rate with low system cost therefore has significant practical value.
Disclosure of Invention
In view of the above analysis, the present invention aims to provide a fast parking space detection method based on deep learning, which solves the problems of poor environment adaptability and large model calculation amount in parking space detection.
The purpose of the invention is mainly realized by the following technical scheme:
a quick parking space detection method based on deep learning comprises the following steps:
an off-line step: acquiring image data including parking spaces offline, and establishing a training and verification data set; training, evaluating and optimizing a neural network model; the neural network model is used for performing semantic segmentation on a parking space sideline in the image data;
an online step: acquiring image data containing parking spaces on line, performing parking space sideline semantic segmentation by using a trained neural network model to obtain a parking space sideline mask, and fitting, clustering and combining the obtained sideline masks to obtain a geometric shape consisting of sidelines; and screening the geometric shapes according to the set shape discrimination conditions to determine the parking spaces.
Further, the offline step specifically includes:
1) acquiring a plurality of groups of picture data containing parking spaces in an off-line manner, marking the side line areas of the parking spaces in the pictures, and constructing a training and verifying data set;
2) constructing a lightweight deep learning semantic segmentation model based on a channel compression convolution mode, and performing model parameter training by using a training data set;
3) establishing an evaluation standard, evaluating the trained model by using a verification data set, and adjusting model parameters;
4) optimizing and accelerating the evaluated model.
Further, the lightweight deep learning semantic segmentation model based on the channel compression convolution mode comprises:
a preprocessing unit for performing convolution and max pooling on an input image of size W1×H1×3 to reduce its width and height dimensions, concatenating the convolution result with the max pooling result, and outputting a preprocessed image of size W2×H2×N2;
a down-sampling feature extraction unit for sequentially performing two stages of down-sampling on the preprocessed image, reducing its width and height dimensions, extracting edge semantic features, and outputting a down-sampled image of size W3×H3×N3;
an up-sampling feature extraction unit for sequentially performing two stages of up-sampling on the down-sampled image, increasing its width and height dimensions, recovering edge semantic features, and outputting an up-sampled binary image of size W2×H2×2;
a model output unit for performing interpolation on the up-sampled binary image and outputting a binary image of size W1×H1×2;
the two-stage up-sampling process corresponds to the two-stage down-sampling process; wherein the first stage upsampling process corresponds to the second stage downsampling process, and the second stage upsampling process corresponds to the first stage downsampling process.
Furthermore, in each stage of down-sampling, the trunk structure first reduces the channel dimension of the input through a 1 × 1 convolution kernel, then reduces the width and height dimensions through a 3 × 3 convolution kernel, and finally expands the channel dimension through a 1 × 1 convolution kernel to obtain the down-sampling trunk output; the lateral structure first reduces the width and height dimensions through a pooling layer and then expands the channel dimension through a 1 × 1 convolution kernel to obtain the down-sampling lateral output; finally, the trunk output and the lateral output are added element by element to obtain the down-sampling result.
Furthermore, in each stage of up-sampling, the trunk structure first reduces the channel dimension of the input through a 1 × 1 convolution kernel, then extracts features and increases the width and height dimensions through a 3 × 3 deconvolution, and then expands the channel dimension through a 1 × 1 convolution to obtain the up-sampling trunk output; the input to the lateral connection structure is the feature output by the corresponding down-sampling stage, which has the same width and height dimensions, and it is added element by element to the trunk output to fuse feature information from different levels.
Furthermore, after each convolution layer which is subjected to convolution operation, a batch normalization layer, a linear mapping layer and a linear rectification layer are sequentially connected, and batch normalization operation, linear mapping operation and linear rectification operation are carried out on convolution operation results to realize normalization and nonlinear transformation of output characteristics.
Further, the optimizing and accelerating the neural network model comprises,
a. extracting and fusing parameters in all the convolution layers, the batch normalization layer and the linear mapping layer in the model; the fused parameters include:
fused convolutional layer weights
$$w_{new} = \frac{\gamma \, w_{old}}{\sqrt{var + \varepsilon}}$$
fused convolutional layer bias
$$b_{new} = \frac{\gamma \, (b_{old} - mean)}{\sqrt{var + \varepsilon}} + \beta$$
where w_old is the convolutional layer weight before fusion and b_old is the convolutional layer bias before fusion;
γ and β are the parameters of the linear mapping layer;
mean and var are the mean and variance of all features in the batch normalization layer;
ε is a small value greater than 0;
b. quantizing the model to low precision using FP16.
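A minimal sketch of the parameter fusion in step a, assuming the standard folding of a batch normalization layer and a linear mapping (scale) layer into the preceding convolution. The function names and the flattened per-channel weight layout are illustrative choices, not taken from the patent; `conv_bn_scale` is the unfused reference path and `conv_only` is what remains after fusion.

```python
import math

def fold_bn(w_old, b_old, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm and scale parameters into conv weights.

    w_old holds one flattened kernel per output channel; the other
    arguments are per-output-channel lists (illustrative layout).
    """
    w_new, b_new = [], []
    for c in range(len(w_old)):
        s = gamma[c] / math.sqrt(var[c] + eps)
        w_new.append([s * w for w in w_old[c]])
        b_new.append(s * (b_old[c] - mean[c]) + beta[c])
    return w_new, b_new

def conv_only(x, w, b):
    """One convolution position: dot product of the patch with each kernel."""
    return [sum(wi * xi for wi, xi in zip(w[c], x)) + b[c] for c in range(len(w))]

def conv_bn_scale(x, w, b, gamma, beta, mean, var, eps=1e-5):
    """Reference path: convolution, then normalization, then linear mapping."""
    out = []
    for c, y in enumerate(conv_only(x, w, b)):
        y = (y - mean[c]) / math.sqrt(var[c] + eps)
        out.append(gamma[c] * y + beta[c])
    return out
```

Because the fused layer gives the same output as the three-layer chain, the normalization and mapping layers can be dropped at inference time, which is where the speed-up comes from.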
Further, the online step specifically includes:
1) in the driving process of the vehicle, a camera installed on the vehicle is adopted to collect the image information including the parking space on line;
2) performing parking space sideline semantic segmentation on the picture information by using the neural network model;
3) scanning the parking space sideline segmentation result line by line, extracting the center point of each continuous region, and fitting straight lines to these points using the Hough transform to obtain each parking space sideline contained in the picture;
4) clustering and combining the sidelines to form geometric regions, and judging a geometric region that satisfies the preset parking space discrimination conditions to be a parking space.
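The line-by-line scan in step 3) can be sketched as follows. This is an illustrative reading of the step, assuming the segmentation result is a binary mask stored as nested lists with 1 for sideline pixels; the function name `row_run_centers` is hypothetical. The straight-line fitting itself is not shown; in practice the extracted points would be passed to a Hough transform routine.

```python
def row_run_centers(mask):
    """Return (row, column) centers of each horizontal run of 1s in a binary mask."""
    centers = []
    for r, row in enumerate(mask):
        start = None
        # A trailing 0 is appended as a sentinel so a run touching the
        # right border is closed like any other run.
        for c, v in enumerate(list(row) + [0]):
            if v and start is None:
                start = c
            elif not v and start is not None:
                centers.append((r, (start + c - 1) / 2.0))
                start = None
    return centers
```

Reducing each run to its center thins a sideline that is several pixels wide down to one point per row, so the subsequent line fit works on far fewer points than the full mask.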
Further, the clustering the edge includes:
1) judging whether the included angle Δθ between two straight line segments among the sidelines is smaller than a clustering angle threshold θT;
2) judging whether the distance between the two straight line segments is smaller than the pixel distance of the parking space sideline in the image, where the distance between the two segments is the distance from the center point of either segment to the straight line on which the other segment lies;
3) judging whether the distance between the nearest points of the two straight line segments is smaller than a threshold dT, where dT is given by the expression in Figure BDA0002068672810000041 and Ls is the pixel length of the short side of the parking space in the image;
4) clustering straight line segments satisfying 1) to 3).
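The three clustering conditions can be sketched as simple geometric predicates. This is a hedged illustration: segments are assumed to be pairs of (x, y) endpoints, the nearest-point distance is approximated by the nearest pair of endpoints, and the thresholds `theta_t`, `sideline_px` and `d_t` are passed in by the caller because the expression for the distance threshold is not reproduced in the text. All function names are hypothetical.

```python
import math

def angle_between(s1, s2):
    """Acute angle in radians between the directions of two segments."""
    a1 = math.atan2(s1[1][1] - s1[0][1], s1[1][0] - s1[0][0])
    a2 = math.atan2(s2[1][1] - s2[0][1], s2[1][0] - s2[0][0])
    d = abs(a1 - a2) % math.pi
    return min(d, math.pi - d)

def midpoint_to_line(s1, s2):
    """Distance from the midpoint of s1 to the infinite line through s2."""
    mx = (s1[0][0] + s1[1][0]) / 2.0
    my = (s1[0][1] + s1[1][1]) / 2.0
    (x1, y1), (x2, y2) = s2
    cross = (y2 - y1) * (mx - x1) - (x2 - x1) * (my - y1)
    return abs(cross) / math.hypot(x2 - x1, y2 - y1)

def nearest_endpoint_gap(s1, s2):
    """Smallest endpoint-to-endpoint distance (a simplification of condition 3)."""
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) for p in s1 for q in s2)

def same_sideline(s1, s2, theta_t, sideline_px, d_t):
    """True when the two segments satisfy conditions 1) to 3)."""
    return (angle_between(s1, s2) < theta_t
            and midpoint_to_line(s1, s2) < sideline_px
            and nearest_endpoint_gap(s1, s2) < d_t)
```

Segments for which `same_sideline` holds would be merged into one sideline; two nearly collinear fragments of a worn marking pass all three tests, while a perpendicular sideline of the same space fails the angle test.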
Further, the image data including the parking space is acquired by a calibrated camera installed on the vehicle, and the calibrating of the camera includes:
1) calibrating the intrinsic and extrinsic parameters of the camera offline, these parameters being used to eliminate image distortion caused by the camera lens;
2) calibrating the inverse perspective transformation matrix of the camera offline, so that the forward-looking image is converted into a top view and the shape distortion of the parking space caused by perspective imaging is eliminated.
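As an illustration of how a calibrated inverse perspective transformation matrix is applied, the sketch below maps a single pixel through a 3 × 3 matrix in homogeneous coordinates. The matrix values and the function name `apply_homography` are hypothetical; a real system would obtain the matrix from the offline calibration and warp the whole image rather than one point.

```python
def apply_homography(h, x, y):
    """Map pixel (x, y) through a 3x3 matrix h given as nested lists."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    # Dividing by the third homogeneous coordinate is what makes the
    # mapping projective rather than affine.
    return u / w, v / w
```

The same matrix has to be used for the training images and for the online images, so that the network always sees parking spaces in the top-view geometry it was trained on.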
The scheme of the invention can realize at least one of the following beneficial effects:
1. Detecting parking spaces with deep learning gives strong environmental adaptability and good detection results under shadow, occlusion, ground reflection, worn parking space markings and similar conditions.
2. The model structure uses a lightweight convolution module design based on channel compression. Compared with a network built from standard convolutions, the model is small, has a low computation cost and needs few computing resources, so detection is efficient and the hardware performance requirement is low.
3. Fusing the network weights further increases network speed, enough to reach real-time detection on an embedded platform, so the system is low in cost and has potential for large-scale application.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, wherein like reference numerals are used to designate like parts throughout.
FIG. 1 is a flowchart of a rapid parking space detection method according to an embodiment of the present invention;
FIG. 2 is a diagram of the parking space segmentation network model architecture according to an embodiment of the present invention;
FIG. 3 is a diagram of the overall calculation process of a standard convolution according to an embodiment of the present invention;
FIG. 4 is a diagram of a single-step standard convolution calculation according to an embodiment of the present invention;
FIG. 5 is a diagram of the information flow of a standard convolution in the width and height dimensions according to an embodiment of the present invention;
FIG. 6 is a diagram of the information flow of a standard convolution in the channel dimension according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and which together with the embodiments of the invention serve to explain the principles of the invention.
The embodiment discloses a quick parking space detection method based on deep learning, as shown in fig. 1, comprising the following steps:
step S1, offline step: acquiring image data including parking spaces offline, and establishing a training and verification data set; training, evaluating and optimizing a neural network model; the neural network model is used for performing semantic segmentation on a parking space sideline in the image data;
the establishing process comprises the following steps:
1) acquiring a plurality of groups of image data containing parking spaces in an off-line manner, marking the side line areas of the parking spaces in the images, and constructing a training and verifying data set;
2) constructing a lightweight deep learning semantic segmentation model based on a channel compression convolution mode, and performing model parameter training by using a training data set;
3) establishing an evaluation standard, evaluating the trained model by using a verification data set, and adjusting model parameters;
4) optimizing and accelerating the evaluated model.
Specifically, to make the trained neural network model more accurate, the training and verification data sets are built from image data collected near as many types of parking spaces as possible; the ratio of samples in the training data set to samples in the verification data set is approximately 5:1;
for example, 10000-30000 pictures are collected as training samples for the deep learning model, and 2000-6000 pictures are collected as verification samples;
since the top view is output after the calibration of the camera in step S1, the samples in the training and verification data sets need to be converted into the top view, and the conversion method adopts the same inverse perspective transformation method as that in step S1.
Specifically, marking the parking space borderline area of the sample in the training and verification data set is realized by adopting manual marking in a top view;
in the marking process, open source software such as a labelme tool is adopted to carry out image pixel level marking, the side line area of the parking space is marked as 1 through marking, and the background area is marked as 0.
Specifically, the lightweight deep learning semantic segmentation model based on channel compression convolution is built on the open source deep learning framework Caffe and comprises:
a preprocessing unit for performing convolution and max pooling on an input image of size W1×H1×3 to reduce its width and height dimensions, concatenating the convolution result with the max pooling result, and outputting a preprocessed image of size W2×H2×N2; reducing the width and height dimensions in preprocessing greatly reduces the computation of subsequent processing;
a down-sampling feature extraction unit for sequentially performing two stages of down-sampling on the preprocessed image, reducing its width and height dimensions, extracting edge semantic features, and outputting a down-sampled image of size W3×H3×N3;
an up-sampling feature extraction unit for sequentially performing two stages of up-sampling on the down-sampled image, increasing its width and height dimensions, recovering edge semantic features, and outputting an up-sampled binary image of size W2×H2×2;
a model output unit for performing interpolation on the up-sampled binary image and outputting a binary image of size W1×H1×2;
the two-stage up-sampling process corresponds to the two-stage down-sampling process; wherein the first stage upsampling process corresponds to the second stage downsampling process, and the second stage upsampling process corresponds to the first stage downsampling process.
In each stage of down-sampling, the trunk structure first reduces the channel dimension of the input through a 1 × 1 convolution kernel, then reduces the width and height dimensions through a 3 × 3 convolution kernel, and finally expands the channel dimension through a 1 × 1 convolution kernel to obtain the down-sampling trunk output; the lateral structure first reduces the width and height dimensions through a pooling layer and then expands the channel dimension through a 1 × 1 convolution kernel to obtain the down-sampling lateral output; finally, the trunk output and the lateral output are added element by element to obtain the down-sampling result.
Preferably, after the first-stage downsampling processing, the image feature data is subjected to feature extraction through a certain number of serial same-dimensional feature extraction modules, and after output, the second-stage downsampling processing is performed; after the second-stage down-sampling processing, feature extraction is carried out on the image feature data through a certain number of serial same-dimensional feature extraction modules;
In each stage of up-sampling, the trunk structure first reduces the channel dimension of the input through a 1 × 1 convolution kernel, then extracts features and increases the width and height dimensions through a 3 × 3 deconvolution, and then expands the channel dimension through a 1 × 1 convolution to obtain the up-sampling trunk output; the input to the lateral connection structure is the feature output by the corresponding down-sampling stage, which has the same width and height dimensions, and it is added element by element to the trunk output to fuse feature information from different levels.
Preferably, after the first-stage up-sampling processing, feature extraction is performed on the image feature data through a certain number of series same-dimensional feature extraction modules, and after output, second-stage up-sampling processing is performed; and after the second-stage up-sampling processing, performing feature extraction on the image feature data through a same-dimension feature extraction module.
Preferably, after each convolution layer which is subjected to convolution operation, a batch normalization layer, a linear mapping layer and a linear rectification layer are sequentially connected, and batch normalization operation, linear mapping operation and linear rectification operation are carried out on convolution operation results to realize normalization of output characteristics, so that convergence speed of network training is accelerated.
The model will be described below by taking as an example an RGB color image having an input image width and height of 448 × 448 (i.e., an image size of 448 × 448 × 3). The network structure using the model is shown in fig. 2;
1) The preprocessing unit applies a 3 × 3 standard convolution with a stride of 2 to the input image to extract features and reduce dimensions, outputting a feature of size 224 × 224 × 13; in parallel, max pooling produces a feature of size 224 × 224 × 3. The two features are concatenated into a feature of size 224 × 224 × 16, which then passes through the batch normalization (BatchNorm), linear mapping and linear rectification operations to give the preprocessing output feature of size 224 × 224 × 16.
2) The down-sampling feature extraction unit extracts features and reduces the feature width and height dimensions. It contains two down-sampling modules. The first-stage down-sampling module takes an input feature of size 224 × 224 × 16 and outputs a feature of size 112 × 112 × 64; it is followed by 3 serial conventional feature extraction modules, which preserve the feature dimensions and pass a feature of size 112 × 112 × 64 to the second-stage down-sampling module. The second-stage down-sampling module outputs a feature of size 56 × 56 × 128; it is followed by 16 serial conventional feature extraction modules, which likewise preserve the feature dimensions and pass a feature of size 56 × 56 × 128 to the up-sampling feature extraction unit.
The down-sampling module is built as follows. The trunk structure first uses a 1 × 1 convolution to reduce the channel dimension to 1/4 of the input channel dimension, and then applies a standard 3 × 3 convolution with a stride of 2, so the width and height of its output are half those of its input while the channel dimension is unchanged; a 1 × 1 convolution then expands the channel dimension without changing the width and height. The lateral connection structure first applies max pooling with a stride of 2 to reduce the width and height dimensions and then uses a 1 × 1 convolution to expand the channel dimension. The trunk output and the lateral output are then added element by element to fuse the features. Each convolution is followed by batch normalization, linear mapping and linear rectification, which normalize the output features and speed up convergence of network training. In addition, the input features of the module can be connected to the features of matching scale in a later up-sampling module, fusing features from different levels and improving segmentation accuracy.
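To make the effect of channel compression concrete, the sketch below counts the convolution weights in the trunk of such a module (1 × 1 reduce to 1/4 of the input channels, 3 × 3, 1 × 1 expand) and compares them with a single standard 3 × 3 convolution mapping the same input channels to the same output channels. Biases, the lateral 1 × 1 convolution and the normalization parameters are ignored, and the function names are illustrative.

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution, ignoring biases."""
    return k * k * c_in * c_out

def bottleneck_trunk_params(c_in, c_out, ratio=4):
    """Weights in a 1x1 reduce, 3x3, 1x1 expand trunk with c_in/ratio inner channels."""
    mid = c_in // ratio
    return (conv_params(1, c_in, mid)
            + conv_params(3, mid, mid)
            + conv_params(1, mid, c_out))
```

For a module taking 16 channels to 64 channels, `bottleneck_trunk_params(16, 64)` gives 464 weights, against 9216 for `conv_params(3, 16, 64)`. The saving comes from running the expensive 3 × 3 kernel only on the compressed channels.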
The conventional feature extraction module is constituted as follows: in the trunk structure, 1 × 1 convolution is firstly adopted for channel dimension reduction, then 3 × 3 standard convolution is adopted for feature extraction, and then a 1 × 1 convolution mode is adopted for channel dimension expansion; in the lateral connection structure, the pixel-by-pixel addition operation is directly carried out on the input features and the convolution output features of the main network; moreover, batch normalization operation is used after each convolution operation, and normalization of output characteristics is realized through linear mapping operation and linear rectification operation, so that convergence speed of network training is accelerated.
3) The up-sampling feature extraction unit expands the feature width and height dimensions and extracts features. It contains two up-sampling modules. The first-stage up-sampling module takes an input feature of size 56 × 56 × 128 and outputs a feature of size 112 × 112 × 64; it is followed by 3 serial conventional feature extraction modules, which preserve the feature dimensions and pass a feature of size 112 × 112 × 64 to the second-stage up-sampling module. The second-stage up-sampling module takes an input feature of size 112 × 112 × 64 and outputs a feature of size 224 × 224 × 16; it is followed by one conventional feature extraction module that reduces the channel dimension and passes a feature of size 224 × 224 × 2 to the model output unit.
The upsampling module is composed as follows: the method comprises the steps that firstly, 1 × 1 convolution is adopted for a trunk structure to reduce channel dimensionality into 1/4 of input channel dimensionality, then 3 × 3 deconvolution is adopted for feature extraction and width and height dimensionality lifting, and then a 1 × 1 convolution mode is adopted for channel dimensionality expansion; the input features of the lateral connection structure are features corresponding to the width and the height dimensions in the downsampling module, and the features obtained by convolution processing in the main structure are subjected to element-by-element addition operation, so that feature information of different fused layers is obtained, and the segmentation accuracy of the network model is improved. After each convolution operation, batch normalization operation, linear mapping operation and linear rectification operation are used for realizing normalization and nonlinear transformation of output characteristics, and the convergence speed of network training is accelerated conveniently.
4) The model output unit performs interpolation on the input feature of size 224 × 224 × 2 and outputs a binary image of size 448 × 448 × 2;
the 448 × 448 RGB color image is converted into a 448 × 448 × 2 binary image by a network model, where 1 in the binary image is a parking space borderline and a non-parking space borderline is 0.
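The feature scales quoted for the up-sampling unit and the output unit can be checked with a small shape-bookkeeping sketch. It assumes square feature maps (56 × 56, 112 × 112, 224 × 224) and that each up-sampling stage and the final interpolation double width and height; the function names are illustrative and not taken from the patent.

```python
def upsample_stage(shape, out_channels):
    """Double width and height and set the channel count of the stage output."""
    height, width, _ = shape
    return (2 * height, 2 * width, out_channels)


def same_size_stage(shape, out_channels=None):
    """Conventional feature extraction: width and height stay unchanged."""
    height, width, channels = shape
    return (height, width, channels if out_channels is None else out_channels)


def trace_decoder(shape=(56, 56, 128)):
    """List (stage name, feature shape) through the up-sampling and output units."""
    trace = [("input", shape)]
    shape = upsample_stage(shape, 64)          # first-stage up-sampling module
    trace.append(("upsample_1", shape))
    for index in range(3):                     # 3 same-dimension modules
        shape = same_size_stage(shape)
        trace.append((f"conventional_{index + 1}", shape))
    shape = upsample_stage(shape, 16)          # second-stage up-sampling module
    trace.append(("upsample_2", shape))
    shape = same_size_stage(shape, 2)          # channel-reduction module
    trace.append(("channel_reduce", shape))
    shape = upsample_stage(shape, 2)           # interpolation in the output unit
    trace.append(("output", shape))
    return trace


for name, shape in trace_decoder():
    print(f"{name:15s} {shape}")
```

The last entry of the trace matches the 448 × 448 × 2 binary image produced by the model output unit.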
The network model makes heavy use of 1 × 1 convolutions to reduce the feature channel dimension, which greatly reduces the computation of the feature extraction units while still maintaining a good detection result. The specific analysis is as follows:
In a deep convolutional neural network, a large part of the computation comes from the convolutional layers or fully connected layers; in a semantic segmentation network, fully connected layers are rarely used, so convolution accounts for most of the computation. The standard convolution process is shown in fig. 3 and fig. 4. In a standard convolution, the width, height and channel number of the input feature are denoted W, H and N, the number of convolution kernels in the layer is M, and each kernel has dimension K × K × N, where K is the spatial scale of the kernel and N is the number of input channels. The kernel slides along the width and height directions of the image with the set step size and, at each position, is multiplied element by element with the corresponding image region and summed; the result at each position is the response of the kernel to the input data in that local region, and the results of one kernel over all positions of the input image form the output feature of that kernel. With a step size of 1, each kernel yields a result with the same width and height as the input feature, of dimension H × W × 1, and the results of all kernels in the layer together give an output feature of dimension H × W × M.
For standard convolution with a step size of 1, the number of multiplications for one convolution kernel is $H \times W \times N \times K^2$.
Therefore, in the convolutional layer, the number of multiplications performed on the input features is $H \times W \times N \times K^2 \times M$.
Taking a 3 × 3 convolution as an example, the information flow in the width and height spatial dimensions of the image and in the channel dimension is shown in fig. 5 and fig. 6.
In the width and height dimensions of the image, the connections are local (within a 3 × 3 pixel range), whereas in the channel dimension the connections are full, that is, every input channel is connected to every output channel, so the computation in the channel dimension is the same as in a fully connected layer. To address this, the embodiment adopts a convolution based on a channel compression module, which greatly reduces the computation, as follows:
When the input feature dimension is H × W × N, the output feature dimension is H × W × M and the compressed feature dimension is H × W × C, the computation of the compression convolution, the intermediate conventional convolution and the expansion convolution is $H \times W \times N \times C$, $C \times W \times H \times K^2 \times C$ and $C \times W \times H \times M$, respectively;
The amount of computation relative to standard convolution is
$$\frac{H W N C + H W C^2 K^2 + H W C M}{H W N K^2 M} = \frac{C\,(N + C K^2 + M)}{N K^2 M}$$
With 64 input and output channels and a compressed channel dimension of 16, the channel-compression convolution requires 9.7% of the computation of the standard convolution. The channel-compression convolution therefore has a lower computational cost than the standard convolution. Compared with mainstream segmentation networks, the network structure designed in this patent has a smaller model size and a higher computation speed.
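The multiplication counts above can be tallied directly. The sketch below is only a counting aid under the stated definitions (step size 1, compression to C channels by a 1 × 1 convolution, a K × K convolution on C channels, expansion to M channels by a 1 × 1 convolution); the function names are illustrative. Evaluated literally, these three terms give a ratio of about 11.8% for N = M = 64, C = 16, K = 3, the same order as the 9.7% quoted in the text; the source does not show the arithmetic behind its figure.

```python
def standard_conv_mults(h, w, n, m, k):
    """Multiplications of a standard K x K convolution, N -> M channels, stride 1."""
    return h * w * n * k * k * m


def compressed_conv_mults(h, w, n, m, c, k):
    """Multiplications of the channel-compression convolution (three stages)."""
    squeeze = h * w * n * c          # 1 x 1 convolution, N -> C channels
    middle = h * w * c * k * k * c   # K x K convolution, C -> C channels
    expand = h * w * c * m           # 1 x 1 convolution, C -> M channels
    return squeeze + middle + expand


def cost_ratio(h, w, n, m, c, k):
    """Compressed cost divided by standard cost; independent of h and w."""
    return compressed_conv_mults(h, w, n, m, c, k) / standard_conv_mults(h, w, n, m, k)


ratio = cost_ratio(h=112, w=112, n=64, m=64, c=16, k=3)
print(f"compressed / standard = {ratio:.3f}")
```

Because H and W cancel, the ratio depends only on the channel counts and the kernel size, which is why the saving holds at every stage of the network that uses the same channel configuration.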
In tests, the lightweight neural network model runs at 20 ms/frame on an NVIDIA GTX1060 graphics card without any network acceleration optimization, and takes 103 ms on the NVIDIA TX2 embedded artificial intelligence platform; the size of the network model is 2.7M. After the subsequent neural network optimization, the running speed on the NVIDIA TX2 embedded artificial intelligence platform reaches 30 ms/frame.
Preferably, during training, the neural network model is evaluated on the verification data set. The evaluation criterion is pixel segmentation accuracy, that is, the ratio of the number of pixels correctly segmented by the network to the total number of pixels, which is also used as the optimization target for training the network. The training platform is an NVIDIA TITAN X, the training solver is Adam, the number of training steps is 70000 to 100000, and the number of images input into the neural network in each step is 6; this number depends on the performance of the graphics card used for training.
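The pixel segmentation accuracy criterion can be written as a short function. This is a minimal sketch that assumes the prediction and the label are same-sized 2-D lists of 0/1 values; the name `pixel_accuracy` is illustrative.

```python
def pixel_accuracy(prediction, label):
    """Ratio of correctly segmented pixels to all pixels.

    Both arguments are 2-D lists of class indices (0 = background,
    1 = parking space border line) with identical shapes.
    """
    if len(prediction) != len(label) or any(
        len(p_row) != len(l_row) for p_row, l_row in zip(prediction, label)
    ):
        raise ValueError("prediction and label must have the same shape")
    total = sum(len(row) for row in label)
    if total == 0:
        raise ValueError("empty image")
    correct = sum(
        p == l
        for p_row, l_row in zip(prediction, label)
        for p, l in zip(p_row, l_row)
    )
    return correct / total


prediction = [[0, 1], [1, 0]]
label = [[0, 1], [0, 0]]
print(pixel_accuracy(prediction, label))  # 3 of 4 pixels agree
```

Because border-line pixels are typically a small fraction of the image, this ratio can be high even for a poor mask, so it is best read together with a visual check of the segmentation.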
Preferably, the optimization and acceleration of the neural network model includes:
a. The parameters of all the convolutional layers, batch normalization layers (BatchNorm layers) and linear mapping layers (Scale layers) in the model are extracted and fused, which greatly reduces the computation of the network. The fusion principle is as follows:
The batch normalization layer (BatchNorm layer) and the linear mapping layer (Scale layer) accelerate the training and convergence of the neural network by normalizing the data. When the network is deployed, however, only forward inference is performed and no parameters need to be updated by backpropagation; the batch normalization layer and the linear mapping layer then only apply a linear transformation to the data, which introduces redundant computation and slows the network down. Since every batch normalization layer and linear mapping layer in the network follows a convolutional layer, their parameters can be fused directly into the convolutional layer parameters.
The batch normalization layer computes
$$x_{norm} = \frac{x - \mathrm{mean}}{\sqrt{\mathrm{var} + \varepsilon}}$$
In the formula, $x$ is the output of the preceding convolutional layer, mean and var are the mean and variance of all the features in the normalization layer, and $\varepsilon$ is a very small value greater than 0 that prevents the denominator from being 0;
The linear mapping layer (Scale layer) applies a linear transformation to the data:
$$y = \gamma \, x_{norm} + \beta$$
In the formula, $\gamma$ and $\beta$ are the linear mapping layer parameters;
The parameters after fusion preferably include:
Fused convolutional layer weights
$$w_{new} = \frac{\gamma}{\sqrt{\mathrm{var} + \varepsilon}} \, w_{old}$$
Fused convolutional layer bias
$$b_{new} = \frac{\gamma}{\sqrt{\mathrm{var} + \varepsilon}} \, (b_{old} - \mathrm{mean}) + \beta$$
In the formula, $w_{old}$ is the convolutional layer weight before fusion and $b_{old}$ is the convolutional layer bias before fusion.
Once model training is finished, the parameters of the convolutional layer are fixed and can be multiplied directly with the combined result of the other two layers, giving the fusion of all three network layers. The model after parameter fusion is effectively compressed: the size of the network model is reduced from 2.7M to 1.8M, a compression ratio of 33%, and the inference time of the network is reduced from 103 ms to about 78 ms, a speed-up of nearly 24.3%, which is a clear improvement in speed.
b. The model is quantized to low precision using FP16, which further reduces the computation and memory requirements of the model so that it can perform real-time detection on an embedded platform. The network is trained using 32-bit floating-point arithmetic; because the many local connections of a neural network give it strong adaptive capacity, replacing the parameters with low-precision 16-bit half-precision floating-point values can double the computation speed without noticeably degrading the segmentation result. After TensorRT-based half-precision quantization, the running time of the network on the NVIDIA TX2 platform is reduced from 78 ms to 31 ms, and the segmentation frame rate exceeds 30 frames per second.
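Both optimizations can be illustrated on a toy layer. The sketch below assumes the usual convolution, BatchNorm, Scale ordering and uses a 1 × 1 convolution at a single pixel (a dot product per output channel) so that it stays in pure Python; all names and numbers are illustrative. The half-precision step only rounds values through the IEEE 754 binary16 format with the standard `struct` module; it stands in for, and does not reproduce, the TensorRT FP16 path mentioned in the text.

```python
import math
import struct


def conv1x1(x, weights, biases):
    """1 x 1 convolution at one pixel: one dot product per output channel."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def bn_scale(y, mean, var, gamma, beta, eps=1e-5):
    """BatchNorm followed by Scale, applied per output channel."""
    return [
        g * (v - m) / math.sqrt(s + eps) + b
        for v, m, s, g, b in zip(y, mean, var, gamma, beta)
    ]


def fuse_bn_scale(weights, biases, mean, var, gamma, beta, eps=1e-5):
    """Fold the BatchNorm and Scale parameters into the convolution parameters."""
    fused_w, fused_b = [], []
    for row, b, m, s, g, bt in zip(weights, biases, mean, var, gamma, beta):
        factor = g / math.sqrt(s + eps)
        fused_w.append([factor * w for w in row])
        fused_b.append(factor * (b - m) + bt)
    return fused_w, fused_b


def to_fp16(value):
    """Round a float to IEEE 754 half precision and convert it back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


x = [0.5, -1.0, 2.0]
weights = [[0.2, -0.1, 0.4], [1.0, 0.3, -0.5]]
biases = [0.1, -0.2]
mean, var = [0.3, -0.1], [1.5, 0.8]
gamma, beta = [1.2, 0.7], [0.05, -0.3]

reference = bn_scale(conv1x1(x, weights, biases), mean, var, gamma, beta)
fused_w, fused_b = fuse_bn_scale(weights, biases, mean, var, gamma, beta)
fused = conv1x1(x, fused_w, fused_b)
half_w = [[to_fp16(w) for w in row] for row in fused_w]

print(reference, fused)
print(half_w)
```

The fused layer reproduces the three-layer output up to floating-point rounding, and the half-precision weights differ from the fused weights by a small relative error, which is consistent with the text's observation that the segmentation result is not noticeably degraded.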
Step S2, online step: image data containing parking spaces is acquired online; the trained neural network model performs semantic segmentation of the parking space border lines to obtain a border-line mask; the resulting masks are fitted, clustered and combined to obtain geometric shapes composed of border lines; and the geometric shapes are screened according to the set shape discrimination conditions to determine the parking spaces.
The method specifically comprises the following steps:
1) While the vehicle is driving, a camera mounted on the vehicle collects image information containing parking spaces online;
2) The neural network model established in step S1 performs semantic segmentation of the parking space border lines on the image information to obtain a parking space border-line mask;
3) The semantic segmentation result of the parking space border lines is scanned line by line, the center points of the continuous regions in the segmentation result are extracted, and on this basis straight lines are fitted using the Hough transform to obtain each border line of the parking spaces contained in the image.
4) Clustering and combining the border lines and judging the geometric shape of the parking space:
In the image, occlusion, shadows or worn paint can split a single border line into several straight line segments. The straight line segments fitted by the Hough transform are therefore clustered, and segments that satisfy all of the following positional relations are merged into the same straight line segment, which reduces the computation and the mismatch rate when the parking space border lines are combined:
a. The two straight line segments are approximately parallel ($|\Delta\theta| < \theta_T$, $\theta_T = 5°$);
b. The distance between the straight line segments is smaller than the pixel distance of the parking space border line in the image, where the distance between two straight line segments is defined as the distance from the center point of either segment to the straight line on which the other segment lies;
c. The distance between the nearest points of the two straight line segments is less than a threshold $d_T$. Let $L_s$ be the length of the short side of the parking space in the image; then
Figure BDA0002068672810000151
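The three clustering conditions can be sketched as a predicate on a pair of segments. This is a hedged illustration, not the patented implementation: segments are assumed to be non-degenerate pairs of (x, y) endpoints in pixels, `border_px` stands for the pixel distance of the border line used in condition b, and `d_t` is passed in as a plain parameter because its defining formula is given only in the figure. The gap in condition c is computed from endpoint-to-segment distances, which equals the true segment distance whenever the two segments do not cross.

```python
import math


def direction_deg(seg):
    """Undirected orientation of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_diff_deg(a, b):
    """Smallest angle between two undirected segments."""
    d = abs(direction_deg(a) - direction_deg(b))
    return min(d, 180.0 - d)


def point_line_distance(p, seg):
    """Distance from point p to the infinite line through seg."""
    (x1, y1), (x2, y2) = seg
    num = abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1))
    return num / math.hypot(x2 - x1, y2 - y1)


def point_segment_distance(p, seg):
    """Distance from point p to the closest point on the finite segment."""
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))


def midpoint(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def nearest_gap(a, b):
    """Smallest endpoint-to-segment distance between the two segments."""
    return min(
        [point_segment_distance(p, b) for p in a]
        + [point_segment_distance(p, a) for p in b]
    )


def same_border_line(a, b, border_px, d_t, theta_t_deg=5.0):
    """True when segments a and b satisfy conditions a, b and c together."""
    return (
        angle_diff_deg(a, b) < theta_t_deg
        and point_line_distance(midpoint(a), b) < border_px
        and nearest_gap(a, b) < d_t
    )


left = ((0.0, 0.0), (10.0, 0.0))
right = ((14.0, 0.5), (30.0, 1.0))   # same line, interrupted by a short gap
print(same_border_line(left, right, border_px=4.0, d_t=6.0))
```

Segments that pass the predicate can then be merged, for example by refitting one line through all of their endpoints; the merge step is not specified in the text and is left out here.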
In an actual scene, a parking space is a rectangle or a parallelogram enclosed by 4 or 3 border lines (parking lines of special shapes are outside the scope of the technique described in this patent), and its enclosed area, the parallelism of its opposite sides and the distance between its opposite sides all meet certain standards. The scale of the standard parking space is therefore calibrated from the size of actual parking spaces in the image; then all combinations of 4 border lines and of 3 border lines are exhaustively enumerated from the extracted straight line segments and screened by the following rules to obtain the parking spaces that satisfy the geometric relations:
a. The area enclosed by the border lines is within the calibrated area range, that is, $|S - S_C| < S_d$, where $S_C$ is the calibrated standard area, equal to the arithmetic mean of the areas of 10 to 30 randomly selected standard parking spaces in the top-view image, and $S_d$ is the threshold on the difference between the detected parking space area and the calibrated parking space area, equal to half of the range of the areas of 10 to 30 randomly selected standard parking spaces in the top-view image.
b. Opposite border lines are parallel: let the included angle between opposite border lines be $\theta_d$; then $|\theta_d| < 5°$. When there are only three border lines, only the one pair of opposite sides is checked.
c. The distance between opposite border lines is within a certain range: for the pair of long sides, $|D_l - D_{Cl}| < D_{dl}$, where $D_{Cl}$ is the calibrated standard distance between the long sides, equal to the arithmetic mean, over 10 to 30 randomly selected standard parking spaces in the top-view image, of the distance from the center point of either long side to the other long side, and $D_{dl}$ is the threshold on the difference between the detected and the calibrated long-side distance, equal to half of the range of the long-side distances of 10 to 30 randomly selected standard parking spaces in the top-view image. The same constraint applies to the distance between the short sides, that is, $|D_s - D_{Cs}| < D_{ds}$.
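Rules a to c can be sketched for the four-border-line case as follows. The sketch assumes the candidate is given as four ordered corner points in the top-view image and that each calibration pair is (mean, half of the range) computed from sample measurements; the three-border-line case and the enumeration of combinations are omitted, and all names and sample values are illustrative.

```python
import math


def calibrate(samples):
    """Return (arithmetic mean, half of the range) of sample measurements."""
    return sum(samples) / len(samples), (max(samples) - min(samples)) / 2.0


def polygon_area(corners):
    """Shoelace formula for a simple polygon with ordered (x, y) corners."""
    n = len(corners)
    twice = sum(
        corners[i][0] * corners[(i + 1) % n][1]
        - corners[(i + 1) % n][0] * corners[i][1]
        for i in range(n)
    )
    return abs(twice) / 2.0


def side_length(side):
    (x1, y1), (x2, y2) = side
    return math.hypot(x2 - x1, y2 - y1)


def angle_between_deg(side_a, side_b):
    """Smallest angle between two undirected sides, in degrees."""
    def direction(side):
        (x1, y1), (x2, y2) = side
        return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    d = abs(direction(side_a) - direction(side_b))
    return min(d, 180.0 - d)


def side_distance(side_a, side_b):
    """Distance from the center point of side_a to the line through side_b."""
    (ax1, ay1), (ax2, ay2) = side_a
    (x1, y1), (x2, y2) = side_b
    px, py = (ax1 + ax2) / 2.0, (ay1 + ay2) / 2.0
    num = abs((x2 - x1) * (y1 - py) - (x1 - px) * (y2 - y1))
    return num / math.hypot(x2 - x1, y2 - y1)


def is_parking_space(corners, area_cal, long_cal, short_cal, max_angle_deg=5.0):
    """Apply rules a, b and c to a quadrilateral with four ordered corners."""
    sides = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    pair_a, pair_b = (sides[0], sides[2]), (sides[1], sides[3])
    # Rule b: both pairs of opposite sides must be nearly parallel.
    if any(angle_between_deg(p, q) >= max_angle_deg for p, q in (pair_a, pair_b)):
        return False
    # Rule a: enclosed area close to the calibrated area.
    s_c, s_d = area_cal
    if abs(polygon_area(corners) - s_c) >= s_d:
        return False
    # Rule c: distances between the long sides and between the short sides.
    if side_length(sides[0]) >= side_length(sides[1]):
        long_pair, short_pair = pair_a, pair_b
    else:
        long_pair, short_pair = pair_b, pair_a
    d_cl, d_dl = long_cal
    d_cs, d_ds = short_cal
    return (
        abs(side_distance(*long_pair) - d_cl) < d_dl
        and abs(side_distance(*short_pair) - d_cs) < d_ds
    )


area_cal = calibrate([1200.0, 1250.0, 1300.0])   # made-up sample areas
long_cal = calibrate([24.0, 25.0, 26.0])         # made-up long-side distances
short_cal = calibrate([48.0, 50.0, 52.0])        # made-up short-side distances
slot = [(0.0, 0.0), (25.0, 0.0), (25.0, 50.0), (0.0, 50.0)]
print(is_parking_space(slot, area_cal, long_cal, short_cal))
```

Checking parallelism first is a cheap way to discard most combinations before the area and distance tests, which matters when every combination of extracted segments is enumerated.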
Preferably, the image data containing parking spaces is acquired by a calibrated camera mounted on the vehicle; calibrating the camera removes the image distortion caused by the camera and the shooting angle and resolves the deformation of parking spaces caused by the perspective effect;
Specifically, this comprises:
1) Offline calibration of the internal and external parameters of the camera:
The camera introduces image distortion when capturing pictures. The internal and external parameters of the camera are obtained by calibration using images captured by the camera, and the images are then corrected to remove the distortion caused by imaging through the camera lens.
2) Calibration of the inverse perspective transformation matrix of the camera:
Because the camera is mounted on the vehicle, the shape of a parking space is distorted by the perspective effect when it is imaged. By calibrating the inverse perspective transformation matrix of the camera, the forward-looking image can be converted into a top view, which resolves the distortion of the parking space shape caused by the perspective imaging of the camera.
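Applying a calibrated inverse perspective transformation to image points amounts to a 3 × 3 homography followed by a division by the third homogeneous coordinate. The sketch below shows only that step; the matrix `ipm` is a made-up example, not a calibrated matrix from the patent, and the function names are illustrative.

```python
def apply_homography(matrix, point):
    """Map an image point (x, y) through a 3 x 3 homography."""
    x, y = point
    xs = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    ys = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (xs / w, ys / w)


def warp_points(matrix, points):
    """Map a list of image points into the top view."""
    return [apply_homography(matrix, p) for p in points]


# Hypothetical matrix: rows far from the camera (small y) are scaled differently
# from rows close to it, which is the kind of correction a top view needs.
ipm = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.001, 1.0],
]
print(warp_points(ipm, [(100.0, 200.0), (100.0, 400.0)]))
```

In practice the whole image is usually warped with a library routine such as OpenCV's `cv2.warpPerspective`, and the matrix is estimated from point correspondences between the camera image and the ground plane; the point-wise form above is enough for transforming fitted border-line endpoints.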
More preferably, the online step may use two cameras, arranged on the left and right sides of the vehicle, which detect the parking spaces on the left and right sides of the vehicle respectively and project them into the vehicle body coordinate system, thereby increasing the detection range. This scheme gives good detection results for both perpendicular and parallel parking spaces.
The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention falls within the scope of protection of the present invention.

Claims (7)

1. A rapid parking space detection method based on deep learning, characterized by comprising:
an offline step: collecting image data containing parking spaces offline and establishing training and verification data sets; training, evaluating and optimizing a neural network model; the neural network model being used for semantic segmentation of the parking space border lines in the image data; the neural network model being a lightweight deep learning semantic segmentation model;
an online step: collecting image data containing parking spaces online, using the trained neural network model to perform semantic segmentation of the parking space border lines to obtain a parking space border-line mask, and fitting, clustering and combining the obtained border-line masks to obtain geometric shapes composed of border lines; screening the geometric shapes according to set shape discrimination conditions to determine the parking spaces;
wherein clustering the border lines comprises:
1) judging whether the included angle $\Delta\theta$ between two straight line segments among the border lines is smaller than a clustering angle threshold $\theta_T$;
2) judging whether the distance between the two straight line segments is smaller than the pixel distance of the parking space border line in the image, the distance between the two straight line segments being the distance from the center point of either straight line segment to the straight line on which the other straight line segment lies;
3) judging whether the distance between the nearest points of the two straight line segments is smaller than a threshold $d_T$, where
Figure FDA0003147795280000011
and $L_s$ is the length of the short side of the parking space in the image;
4) clustering the straight line segments that satisfy 1) to 3);
the lightweight deep learning semantic segmentation model comprising a preprocessing unit, a down-sampling feature extraction unit, an up-sampling feature extraction unit and a model output unit;
the preprocessing unit being used to reduce the width and height dimensions of the input image;
the down-sampling feature extraction unit being used for feature extraction and for reducing the width and height dimensions of the features, the down-sampling feature extraction unit containing two stages of down-sampling modules;
each stage of down-sampling module being composed as follows: a trunk structure, which first uses a 1 × 1 convolution to reduce the channel dimension of the features to 1/4 of the input feature channel dimension, then performs a standard 3 × 3 convolution with a step size of 2, so that the width and height dimensions of the output features are reduced to half of those of the input features while the channel dimension of the output features equals that of the input features, and then uses a 1 × 1 convolution to expand the feature channel dimension, leaving the width and height dimensions of the features unchanged; a lateral connection structure, which first uses a max pooling operation with a step size of 2 to reduce the width and height dimensions of the features and then uses a 1 × 1 convolution to expand the channel dimension of the features; the features output by the trunk structure and the features output by the lateral connection structure then being added element by element to achieve feature fusion;
the up-sampling feature extraction unit being used to expand the width and height dimensions of the features and to extract features, obtaining an up-sampled binary image whose width and height are the same as those of the image output by the preprocessing unit, the up-sampling feature extraction unit containing two stages of up-sampling modules;
each stage of up-sampling module being composed as follows: a trunk structure, which first uses a 1 × 1 convolution to reduce the channel dimension to 1/4 of the input channel dimension, then uses a 3 × 3 deconvolution for feature extraction and to increase the width and height dimensions, and then uses a 1 × 1 convolution to expand the channel dimension; a lateral connection structure, whose input features are the features in the down-sampling module with the same width and height dimensions, the input features being added element by element to the features obtained by the convolutions in the trunk structure to obtain fused feature information of different levels;
the model output unit being used to perform interpolation on the up-sampled binary image and to output a binary image whose width and height are the same as those of the input image, in which 1 marks a parking space border line and 0 marks everything else.
2. The rapid parking space detection method according to claim 1, characterized in that the offline step specifically comprises:
1) collecting multiple groups of picture data containing parking spaces offline, marking the parking space border-line regions in the pictures, and constructing training and verification data sets;
2) constructing a lightweight deep learning semantic segmentation model based on channel-compression convolution, and training the model parameters using the training data set;
3) establishing an evaluation criterion, evaluating the trained model using the verification data set, and adjusting the model parameters;
4) optimizing and accelerating the evaluated model.
3. The rapid parking space detection method according to claim 2, characterized in that the lightweight deep learning semantic segmentation model based on channel-compression convolution comprises:
a preprocessing unit, which performs convolution and max pooling on an input image of size $W_1 \times H_1 \times 3$, reduces the width and height dimensions of the image, concatenates the results of the convolution and of the max pooling, and outputs a preprocessed image of size $W_2 \times H_2 \times N_2$;
a down-sampling feature extraction unit, which sequentially performs two stages of down-sampling on the preprocessed image, reduces the width and height dimensions of the image, extracts border-line semantic features, and outputs a down-sampled image of size $W_3 \times H_3 \times N_3$;
an up-sampling feature extraction unit, which sequentially performs two stages of up-sampling on the down-sampled image, increases the width and height dimensions of the image, restores the border-line semantic features, and outputs an up-sampled binary image of size $W_2 \times H_2 \times 2$;
a model output unit, which performs interpolation on the up-sampled binary image and outputs a binary image of size $W_1 \times H_1 \times 2$;
the two stages of up-sampling corresponding to the two stages of down-sampling, wherein the first-stage up-sampling corresponds to the second-stage down-sampling and the second-stage up-sampling corresponds to the first-stage down-sampling.
4. The rapid parking space detection method according to any one of claims 1 to 3, characterized in that every convolutional layer that performs a convolution operation is followed in sequence by a batch normalization layer, a linear mapping layer and a linear rectification layer, which apply a batch normalization operation, a linear mapping operation and a linear rectification operation to the convolution result to normalize and nonlinearly transform the output features.
5. The rapid parking space detection method according to claim 4, characterized in that optimizing and accelerating the neural network model comprises:
a. extracting and fusing the parameters of all the convolutional layers, batch normalization layers and linear mapping layers in the model, the fused parameters including:
the fused convolutional layer weights
$$w_{new} = \frac{\gamma}{\sqrt{\mathrm{var} + \varepsilon}} \, w_{old}$$
the fused convolutional layer bias
$$b_{new} = \frac{\gamma}{\sqrt{\mathrm{var} + \varepsilon}} \, (b_{old} - \mathrm{mean}) + \beta$$
where $w_{old}$ is the convolutional layer weight before fusion and $b_{old}$ is the convolutional layer bias before fusion; $\gamma$ and $\beta$ are the linear mapping layer parameters; mean and var are the mean and variance of all the features in the normalization layer; $\varepsilon$ is a very small value greater than 0;
b. quantizing the model to low precision using FP16.
6. The rapid parking space detection method according to claim 1, characterized in that the online step specifically comprises:
1) while the vehicle is driving, collecting picture information containing parking spaces online using a camera mounted on the vehicle;
2) performing semantic segmentation of the parking space border lines on the picture information using the neural network model;
3) scanning the semantic segmentation result of the parking space border lines line by line, extracting the center points of the continuous regions in the segmentation result, and on this basis fitting straight lines using the Hough transform to obtain each border line of the parking spaces contained in the picture;
4) clustering and combining the border lines to enclose geometric regions, and, according to set parking space discrimination conditions, identifying the geometric regions that satisfy the conditions as parking spaces.
7. The rapid parking space detection method according to claim 1, characterized in that the image data containing parking spaces is collected by a calibrated camera mounted on the vehicle, the calibration of the camera comprising:
1) offline calibration of the internal and external parameters of the camera, used to eliminate the image distortion caused by imaging through the camera lens;
2) offline calibration of the inverse perspective transformation matrix of the camera, used to convert the forward-looking image into a top view and to eliminate the distortion of the parking space shape caused by the perspective imaging of the camera.
CN201910429977.8A 2019-05-22 2019-05-22 Rapid parking space detection method based on deep learning Active CN110210350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910429977.8A CN110210350B (en) 2019-05-22 2019-05-22 Rapid parking space detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910429977.8A CN110210350B (en) 2019-05-22 2019-05-22 Rapid parking space detection method based on deep learning

Publications (2)

Publication Number Publication Date
CN110210350A CN110210350A (en) 2019-09-06
CN110210350B true CN110210350B (en) 2021-12-21

Family

ID=67788167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910429977.8A Active CN110210350B (en) 2019-05-22 2019-05-22 Rapid parking space detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN110210350B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991452B (en) * 2019-12-03 2023-09-19 深圳市捷顺科技实业股份有限公司 Parking space frame detection method, device, equipment and readable storage medium
JP7346267B2 (en) * 2019-12-04 2023-09-19 キヤノン株式会社 Information processing device, relay device, system, information processing device control method, relay device control method and program
CN110992267A (en) * 2019-12-05 2020-04-10 北京科技大学 A wear particle recognition method based on DPSR and Lightweight CNN
CN111179272B (en) * 2019-12-10 2024-01-05 中国科学院深圳先进技术研究院 Rapid semantic segmentation method for road scene
CN111178236B (en) * 2019-12-27 2023-06-06 清华大学苏州汽车研究院(吴江) Parking space detection method based on deep learning
CN111368846B (en) * 2020-03-19 2022-09-09 中国人民解放军国防科技大学 Road ponding identification method based on boundary semantic segmentation
CN112365434B (en) * 2020-11-10 2022-10-21 大连理工大学 Unmanned aerial vehicle narrow passage detection method based on double-mask image segmentation
CN112600221B (en) * 2020-12-08 2023-03-03 深圳供电局有限公司 Reactive compensation device configuration method, device, equipment and storage medium
CN112560945B (en) * 2020-12-14 2024-08-09 珠海格力电器股份有限公司 Equipment control method and system based on emotion recognition
CN112991171B (en) * 2021-03-08 2023-07-28 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and storage medium
CN113283429B (en) * 2021-07-21 2021-09-21 四川泓宝润业工程技术有限公司 Liquid level meter reading method based on deep convolutional neural network
CN113658268B (en) * 2021-08-04 2024-07-12 智道网联科技(北京)有限公司 Verification method and device for camera calibration result, electronic equipment and storage medium
CN113762272B (en) * 2021-09-10 2024-06-14 北京精英路通科技有限公司 Road information determining method and device and electronic equipment
CN114882727B (en) * 2022-03-15 2023-09-05 深圳市德驰微视技术有限公司 Parking space detection method based on domain controller, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106373426B (en) * 2016-09-29 2019-02-12 成都通甲优博科技有限责任公司 Parking stall based on computer vision and violation road occupation for parking monitoring method
CN107516110B (en) * 2017-08-22 2020-02-18 华南理工大学 A Semantic Clustering Method for Medical Question Answering Based on Ensemble Convolutional Coding

Also Published As

Publication number Publication date
CN110210350A (en) 2019-09-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant