CN111461083A - Rapid vehicle detection method based on deep learning - Google Patents

Rapid vehicle detection method based on deep learning

Info

Publication number
CN111461083A
CN111461083A (application CN202010452151.6A)
Authority
CN
China
Prior art keywords
detection
prediction
network
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010452151.6A
Other languages
Chinese (zh)
Inventor
王国栋 (Wang Guodong)
王亮亮 (Wang Liangliang)
徐洁 (Xu Jie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao University
Original Assignee
Qingdao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University filed Critical Qingdao University
Priority to CN202010452151.6A priority Critical patent/CN111461083A/en
Publication of CN111461083A publication Critical patent/CN111461083A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a rapid vehicle detection method based on deep learning. After feature extraction by a basic network, a multi-scale dilated convolution module in the detection network performs feature-layer fusion and feature extraction, followed by detection at three different scales; training on a data set yields a weight model of the detection algorithm, and a BN-layer-based pruning strategy is added on top of this model for retraining, producing a pruned model. The method depends little on the system environment, occupies little memory, and offers a high detection speed with high detection precision; it can be conveniently deployed in an edge-end system, and, by generating a model from training data, can be widely applied to various engineering applications requiring target detection technology.

Description

Rapid vehicle detection method based on deep learning
Technical Field
The invention belongs to the technical field of machine vision and deep learning, relates to a target detection technology, and particularly relates to a rapid vehicle detection method based on deep learning.
Background
The rapid vehicle detection method based on deep learning aims to provide, for vehicles on urban roads, a detection algorithm with simple environment dependence, small memory occupation and high running speed that meets detection-accuracy requirements and is easy to deploy in an edge-end system. A small edge-end system can process the acquired data directly, improving the timeliness and accuracy of information analysis while reducing the load on the central data processing system. The algorithm can be applied to the construction of the Internet of Things, smart cities and the like.
A two-stage detection algorithm in deep learning (one with a region proposal module, which selects regions of interest in advance and then detects them further) achieves high detection accuracy, but its hardware requirements are high, and its detection speed on ordinary equipment cannot meet the requirement of multi-target high-definition video real-time detection. The YOLO (You Only Look Once) algorithm treats object detection as a regression problem and, based on a single end-to-end network, maps the original input image directly to the positions and classes of objects, with no explicit region-of-interest extraction stage, which greatly improves detection speed: in image detection tests, YOLOv1 reaches 45 FPS and its simplified version reaches 155 FPS, meeting the speed required for real-time detection of high-frame-rate high-definition video, though at a considerable cost in accuracy. YOLOv3 greatly improved detection accuracy while reducing speed; it can still meet real-time detection requirements on a high-performance computer configured for deep learning, but it can hardly do so on small edge-end systems with limited hardware. A detection algorithm that runs fast with modest hardware requirements while retaining adequate accuracy is therefore still lacking for edge-end deployment.
CN201910915495.3 relates to a vehicle detection method based on an improved YOLOv3: the convolutional neural network structure between the Darknet layer and the three yolo layers is redesigned; a YOLO-TN network is designed drawing on the weight-sharing idea of TridentNet; model pruning is carried out on the YOLO-TN convolutional neural network; a vehicle detection data set is constructed and the vehicle position information in it is labeled; vehicle detection models based on YOLO-TN and YOLOv3 are trained respectively to complete the vehicle detection task, and their detection results are compared.
CN201711104408.3 discloses a vehicle detection method based on deep learning, which first trains a deep learning network with a vehicle database, then feeds the picture to be detected into the trained network and obtains its class information through a single forward propagation; according to the class information, the largest weight among the parameters is obtained and superposed on the feature map of the last convolution layer, which is then image-fused with the picture to be detected, finally achieving accurate positioning of the vehicle. This effectively solves the problems of environmental interference, illumination, obstacles and low accuracy that arise when traditional image processing algorithms are used for vehicle detection, and the method is applicable to vehicle detection in different scenes.
CN201910065786.8 discloses a vehicle detection method based on deep learning. The method uses the R-FCN algorithm to detect vehicles, avoiding the manual feature design needed in traditional vehicle detection while improving accuracy and robustness. The vehicle detection method comprises the following steps: A. defining the vehicle vision task; B. making a vehicle detection data set; C. determining a shared convolutional network structure; D. optimizing the shared convolutional network with a stochastic depth method; E. training the overall R-FCN model to obtain the final vehicle detection network; F. testing the vehicle detection network with new samples to obtain their detection results.
CN201811322079.4 provides a vehicle detection method based on deep learning. Constructing a training set and a verification set; performing data amplification on the training set; constructing a vehicle detection network; and training and predicting the vehicle detection network. The vehicle detection method based on deep learning fully considers the diversity of application scene weather and the complexity of vehicle types, and uses the faster-rcnn network based on the resnet101, so that the vehicle detection speed is ensured, and the vehicle detection accuracy is improved.
CN201810539356.0 discloses a vehicle detection method based on deep learning, which combines Edge Boxes and an improved Faster R-CNN model to detect vehicles in a complex environment, firstly, the Edge Boxes are used to process images, and more accurate vehicle candidate regions are preliminarily extracted; and secondly, inputting the candidate region into an improved Faster R-CNN model to further finely position the vehicle and obtaining a final detection result through classification and judgment. Meanwhile, in order to enhance the detection capability of the model on small-size vehicles and the discrimination capability of the model, convolution features of different layers are combined to supplement detailed information of some vehicles, and a difficult sample mining strategy is added in a training stage, so that the model focuses on difficult samples, and the background of the vehicles and suspected vehicles can be well distinguished.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a rapid vehicle detection method based on deep learning, which is used for solving the problems that a general target detection algorithm based on deep learning occupies a large amount of memory and has low detection speed; the method can be applied to, but not limited to, a road monitoring system for automatically detecting, identifying and uploading the result of the vehicle.
The invention constructs a detection network based on deep learning, in which a feature layer is detected at three scales after the basic network. A multi-scale dilated convolution module (MDC module) is used in the detection-network part: the module takes the output of the basic network as input and applies three-channel convolution with 3 × 3 dilated convolution kernels at dilation rates of 1, 2 and 5 respectively; the outputs of the three channels are then channel-fused together with the input, and the fused feature layer is passed into the detection network at the three scales to detect and identify the target object. This increases sensitivity to target objects at different scales and raises the recall rate of the algorithm.
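As an illustrative sketch of this module (our own PyTorch reading of the description, not code from the patent — the class name `MDCModule` and the channel counts are invented for illustration), the three-branch dilated convolution with channel fusion can be written as:

```python
import torch
import torch.nn as nn

class MDCModule(nn.Module):
    """Multi-scale dilated convolution (MDC) sketch: three parallel 3x3
    convolutions with dilation rates 1, 2 and 5, whose outputs are
    channel-concatenated together with the module input."""
    def __init__(self, channels: int):
        super().__init__()
        # padding == dilation keeps the spatial size unchanged for a 3x3 kernel
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r)
            for r in (1, 2, 5)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outs = [branch(x) for branch in self.branches]
        outs.append(x)                 # fuse the input itself as a fourth channel group
        return torch.cat(outs, dim=1)  # channel depth grows from C to 4C

feat = torch.randn(1, 64, 13, 13)      # a feature map from the basic network
fused = MDCModule(64)(feat)
print(fused.shape)                     # torch.Size([1, 256, 13, 13])
```

Because padding equals the dilation rate for each branch, all three outputs keep the input's spatial size, so they can be concatenated with the input along the channel axis.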
The technical scheme provided by the invention is as follows:
a rapid vehicle detection method based on deep learning is characterized in that a multi-scale cavity convolution module is used for fusion and feature extraction of feature layers in a detection network through feature extraction of a basic network, and then detection of three different scales is carried out, so that feature information of a target object in a feature map is enhanced, a feature edge is strengthened, a detection algorithm is sensitive to the target object, and algorithm detection precision is increased; and then training the data set to obtain a weight model of the detection algorithm, and adding a pruning strategy based on a BN layer on the basis of the weight model for retraining so as to obtain a pruned model. The method specifically comprises the following steps:
1) construction of detection algorithms
Constructing a basic feature extraction network of the detection algorithm. The basic module DBL of the detection algorithm consists of a convolution layer (CONV), batch normalization (BN) and an activation function (Leaky ReLU); the RES module follows the residual module structure and consists of a pixel padding module (padding), a DBL module and a classical residual unit. After the basic network extracts features, three parallel channels process the input feature layer with 3 × 3 dilated convolutions using the classical dilation rates [1, 2, 5], forming the multi-scale dilated convolution module. The output of the basic network passes through three DBL structures and is input into the multi-scale dilated convolution module; the processing result is then divided into three scales, and an FPN structure using channel fusion, upsampling and DBL structures performs channel integration and feature fusion on the output of the multi-scale dilated convolution module. Finally the extracted feature map is processed at the three scales, each scale's feature layer generating a feature map of depth 3 × (5 + n), where 3 is the number of prediction boxes per grid cell, 5 comprises the 4 position values of a prediction box (centre coordinates and width and height) plus 1 confidence value, and n is the number of predicted classes; the prediction results of the three scales are fused and classified, and redundant overlapping prediction boxes are removed to obtain the final detection result of the predicted position and class of the target object;
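A minimal PyTorch sketch of the DBL basic block (convolution + batch normalization + Leaky ReLU) described above — the helper name `dbl` and the kernel/stride defaults are our assumptions, not the patent's exact configuration:

```python
import torch
import torch.nn as nn

def dbl(in_ch: int, out_ch: int, k: int = 3, stride: int = 1) -> nn.Sequential:
    """DBL basic block: convolution + batch normalization + Leaky ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, stride, padding=k // 2, bias=False),  # BN makes the bias redundant
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )

block = dbl(3, 32)
out = block(torch.randn(2, 3, 64, 64))
print(out.shape)  # torch.Size([2, 32, 64, 64])
```

Setting `bias=False` in the convolution is a common idiom when BN follows, since BN's own shift absorbs any bias; the BN scaling factor here is also what the later pruning step operates on.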
2) using cross entropy loss function
A Leaky ReLU function is selected as the activation function, which accelerates convergence and avoids vanishing gradients. The loss function comprises four parts: target object position information (x, y), prediction box width and height (w, h), predicted class information (class), and predicted target object confidence. To address the problem that, when prediction boxes of different sizes have the same relative error, the larger box produces a larger error value, the mean square error of the predicted width and height relative to the real box width and height is taken and multiplied by a coefficient of 0.5, balancing the errors of prediction boxes of different sizes and the proportion of the width-height error in the total error; the remaining three parts compute the relative error directly with a cross entropy loss function. During training, the total loss value is the sum of the four loss values. The loss functions are as follows:
$$xy_{loss}=\sum_{i=0}^{s^2}\sum_{j=0}^{B}1_{ij}^{obj}\Big[-x_i\log\hat{x}_i-(1-x_i)\log(1-\hat{x}_i)-y_i\log\hat{y}_i-(1-y_i)\log(1-\hat{y}_i)\Big]$$

$$wh_{loss}=\lambda\sum_{i=0}^{s^2}\sum_{j=0}^{B}1_{ij}^{obj}\Big[(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\Big]$$
where $xy_{loss}$ is the position-coordinate error of the object and $wh_{loss}$ is the prediction-box width-height error; $s^2$ with s ∈ (26, 52, 104) means that the feature map at each of the 3 scales is divided into s × s grid cells, each responsible for predicting target objects falling into it; B takes the value 3, meaning each grid cell has 3 bounding boxes of different scales,
$1_{ij}^{obj}$ indicates that the object falls into the j-th bounding box of grid cell i; λ takes the value 0.5; (x, y, w, h) are the actual position coordinates and width and height of the detected target object in the annotation data, and $(\hat{x},\hat{y},\hat{w},\hat{h})$ are the position coordinates and width and height of the detected target object predicted in the forward computation;
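The box-regression part of the loss above can be sketched numerically as follows — a NumPy illustration under our reading of the text (cross entropy on the centre coordinates, 0.5 × squared error on width and height); the function names are invented:

```python
import numpy as np

def bce(p, t):
    """Element-wise binary cross entropy, clipped for numerical stability."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

def box_loss(pred, target, lam=0.5):
    """pred, target: (..., 4) arrays of (x, y, w, h) for responsible boxes.
    xy terms use cross entropy; wh terms use squared error scaled by lam = 0.5."""
    xy_loss = bce(pred[..., 0:2], target[..., 0:2]).sum()
    wh_loss = lam * ((pred[..., 2:4] - target[..., 2:4]) ** 2).sum()
    return xy_loss + wh_loss

pred = np.array([[0.5, 0.5, 1.0, 1.0]])
target = np.array([[0.5, 0.5, 0.0, 0.0]])
print(round(box_loss(pred, target), 4))  # 2.3863  (= 2 ln 2 + 0.5 * 2)
```

In the full loss, the class and confidence terms (also cross entropy per the text) would be added on top, and the sums would run over the s × s × B responsible boxes at each of the three scales.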
3) training algorithm to obtain weight model
Collecting pictures of the target object at different places, in different time periods and under different weather conditions as the original pictures of the data set, labeling the target objects in the pictures to generate a data set, and training the constructed deep learning algorithm on the data set with a certain strategy to generate a weight model;
4) Performing model pruning on the trained weight model
Using a BN-layer pruning method: a scaling factor γ is introduced for each channel of the BN layer and multiplied with that channel's output; the network weights and scaling factors are trained jointly; finally the channels with small scaling factors are removed directly, and the pruned network is fine-tuned to obtain the pruned weight model; the objective function adopts the formula:
$$L=\sum_{(x,y)}l\big(f(x,W),y\big)+\lambda\sum_{\gamma\in\Gamma}g(\gamma)$$
where (x, y) represents the training data and labels, W denotes the trainable parameters of the network, the first term is the ordinary training loss of the CNN, g(γ) is the penalty term on the scaling factors, and λ is the balance factor between the two terms; here g(s) = |s|, i.e. L1 regularization, which is also widely used for sparsification;
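A small sketch of the arithmetic behind this pruning strategy — our illustration, not code from the patent; the helper names and the keep-ratio selection criterion are assumptions (the patent only says channels with small scaling factors are removed, without fixing a ratio):

```python
import numpy as np

def slimming_penalty(gammas, lam=1e-4):
    """The lam * sum(|gamma|) L1 term added to the training loss,
    summed over all BN scaling factors."""
    return lam * float(np.abs(np.asarray(gammas, dtype=float)).sum())

def prune_mask(gammas, keep_ratio=0.7):
    """Boolean mask keeping the channels with the largest |gamma|."""
    g = np.abs(np.asarray(gammas, dtype=float))
    k = max(1, int(round(keep_ratio * g.size)))
    thresh = np.sort(g)[::-1][k - 1]   # k-th largest magnitude
    return g >= thresh

gammas = [0.9, 0.01, 0.5, 0.02]
print(round(slimming_penalty(gammas, lam=0.1), 3))      # 0.143
print(prune_mask(gammas, keep_ratio=0.5).tolist())      # [True, False, True, False]
```

Because the L1 term drives many γ toward zero during joint training, the magnitude ranking cleanly separates channels worth keeping from those safe to remove before fine-tuning.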
5) final compression model testing
The test pictures in the data set are input into the network, and the target objects in the test pictures are marked out in the test run, thereby realizing the rapid vehicle detection method based on deep learning.
The rapid vehicle detection method based on deep learning disclosed by the invention constructs a rapid vehicle detection algorithm by analyzing the advantages and disadvantages of various deep-learning-based target detection algorithms and using a residual module, a multi-scale dilated convolution module, a feature pyramid structure and the like. The residual module effectively prevents network degradation problems such as vanishing and exploding gradients; the multi-scale dilated convolution increases the association between local features and the whole feature map and strengthens the feature information; and the feature pyramid structure enhances the robustness of the detection algorithm to the scale of the target object, so that vehicle detection under complex conditions has higher precision. Because a deep learning algorithm typically generates a large model requiring a large amount of computation, it is difficult to deploy effectively at the edge end; to make the algorithm more suitable for engineering, a model channel pruning strategy is adopted: a scaling factor γ is introduced for each channel and multiplied with the channel's output, the network weights and scaling factors are trained jointly, and finally the channels with small scaling factors are removed directly and the pruned network is fine-tuned. With similar detection precision, the finally generated model detects faster and occupies less memory, achieving a better result.
In the invention, a basic feature extraction network of the detection algorithm is constructed, and a multi-scale cavity convolution module and a feature pyramid structure are used in the detection network, so that the performance of the detection algorithm is improved. And then, a pruning strategy based on a BN layer channel is adopted for the improved algorithm, so that the model volume is reduced and the detection speed is increased on the premise of ensuring the detection precision. The method has the advantages of small dependence on system environment, less memory occupation, high detection speed and high detection precision. The method can be conveniently arranged in an edge end system, and the model can be widely applied to various engineering applications requiring target detection technology through training data generation.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a rapid vehicle detection method based on deep learning. The method can be applied to engineering implementation, and a smaller weight model can be obtained by training with enough data sets and compressing the model. The model is easy to arrange in an edge end system, and the vehicle can be detected quickly and accurately.
Drawings
FIG. 1 is a schematic diagram of an overall structural framework of a target detection algorithm provided by the present invention;
the figure is a schematic diagram of the overall network structure of the detection algorithm and is divided into a basic network and a detection network.
Fig. 2 is a supplementary illustration of the structure of the module in the framework diagram of the detection algorithm of fig. 1.
FIG. 3 is an illustration of the structure of the multi-scale dilated convolution module (MDC module) of FIG. 1.
FIG. 4 is an illustration of the detection structures of Y1, Y2, and Y3 in FIG. 1.
Fig. 5 is a schematic diagram of a fusion process of the detection results of the three scales in fig. 4.
Fig. 6 is a schematic diagram of the detection result of the detection algorithm.
Detailed Description
The invention will be further described by way of examples, without in any way limiting the scope of the invention, with reference to the accompanying drawings.
The invention provides a rapid vehicle detection method based on deep learning, which improves the precision of the detection algorithm by constructing a basic network and using a multi-scale dilated convolution module and a feature pyramid structure in the detection network. The algorithm is trained on a data set with vehicles as target objects to obtain a weight model, and the trained weight model is then pruned, compressing the detection algorithm model and greatly increasing the detection speed of the algorithm.
As shown in Fig. 1, after the basic network, the extracted feature layer is input into the MDC module, where dilated convolutions at 3 scales further strengthen the feature points; the processing results of the three channels are then channel-concatenated with the input, and the resulting feature layer is fed into the subsequent detection network for classification and regression. Fig. 2 shows the modular construction of the algorithm framework. Fig. 3 illustrates the structure of the multi-scale dilated convolution module of the present invention. The specific implementation comprises the following steps:
1) Constructing the detection algorithm.
After the basic network extracts features, three parallel channels apply 3 × 3 dilated convolutions (dilation rates 1, 2 and 5) to the input feature layer, forming the multi-scale dilated convolution module. The output of the basic network passes through three DBL structures and is input into the multi-scale dilated convolution module; the processing result is then divided into three scales, and an FPN structure using channel fusion, upsampling and DBL structures performs channel integration and feature fusion on the output of the multi-scale dilated convolution module. Finally the extracted feature map is processed at the three scales, each scale's feature layer generating a feature map of depth 3 × (5 + n): 3 prediction boxes per grid cell, each with 4 position values (centre coordinates and width and height) and 1 confidence value, plus n class predictions. The predictions of the three scales are fused, and redundant overlapping boxes are removed to obtain the final detected position and class of the target object.
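YOLO-style detectors of the kind described here typically remove redundant overlapping candidate boxes with IoU-based non-maximum suppression; the following plain-Python/NumPy sketch shows the standard greedy procedure (our assumption of the usual technique, not code from the patent):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = list(np.argsort(scores)[::-1])  # highest-confidence box first
    keep = []
    while order:
        i = order.pop(0)
        keep.append(int(i))
        # drop every remaining box that overlaps the kept one too much
        order = [j for j in order if iou(boxes[i], boxes[j]) < thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with IoU 0.81, above the 0.5 threshold, so it is suppressed, while the distant third box survives.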
2) A cross entropy loss function is used.
To address the problem that, when prediction boxes of different sizes have the same relative error, the larger box produces a larger error value, the mean square error of the predicted width and height relative to the real box width and height is taken and multiplied by a coefficient of 0.5, balancing the errors of prediction boxes of different sizes and the proportion of the width-height error in the total error; the remaining three parts compute the relative error directly with a cross entropy loss function. During training, the total loss value is the sum of the four loss values. The loss functions are as follows:
$$xy_{loss}=\sum_{i=0}^{s^2}\sum_{j=0}^{B}1_{ij}^{obj}\Big[-x_i\log\hat{x}_i-(1-x_i)\log(1-\hat{x}_i)-y_i\log\hat{y}_i-(1-y_i)\log(1-\hat{y}_i)\Big]$$

$$wh_{loss}=\lambda\sum_{i=0}^{s^2}\sum_{j=0}^{B}1_{ij}^{obj}\Big[(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\Big]$$
where $xy_{loss}$ is the position-coordinate error of the object and $wh_{loss}$ is the prediction-box width-height error. $s^2$ with s ∈ (26, 52, 104) means that the feature map at each of the 3 scales is divided into s × s grid cells, each responsible for predicting target objects falling into it; B takes the value 3, meaning each grid cell has 3 bounding boxes of different scales.
$1_{ij}^{obj}$ indicates that the object falls into the j-th bounding box of grid cell i. λ takes the value 0.5. (x, y, w, h) are the actual position coordinates and width and height of the detected target object in the annotation data, and $(\hat{x},\hat{y},\hat{w},\hat{h})$ are the position coordinates and width and height of the detected target object predicted in the forward computation.
3) The training algorithm obtains a weight model.
Pictures are extracted from road surveillance video, collected at different places, in different time periods and under different weather conditions, as the original pictures of the data set. Car windows in the pictures are labeled, the window standing in for the vehicle as the detected target object, to generate the data set. To increase the amount of data and the generalization ability of the weight model, the data set is augmented by increasing image contrast, increasing image saturation and image decolorization. Images of size 416 × 416 are taken as input. During training, the learning rate is gradually increased to 0.002 over the first 2000 steps, reduced to 0.0002 at step 40000 and to 0.0001 at step 45000, and training runs to step 60000 to generate the weight model (the learning-rate adjustment varies slightly across training runs on different data sets).
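The learning-rate schedule described above can be sketched as a step function — our illustration; the linear shape of the warm-up within the first 2000 steps is an assumption, since the text only says the rate is gradually increased:

```python
def learning_rate(step: int) -> float:
    """Schedule from the text: warm up to 0.002 over the first 2000 steps,
    drop to 0.0002 at step 40000 and to 0.0001 at step 45000;
    training stops at step 60000."""
    if step < 2000:
        return 0.002 * step / 2000   # assumed linear warm-up
    if step < 40000:
        return 0.002
    if step < 45000:
        return 0.0002
    return 0.0001

print(learning_rate(30000), learning_rate(50000))  # 0.002 0.0001
```

Such a warm-up-then-decay schedule is common in YOLO-style training: the low initial rate stabilizes the randomly initialized detection head, and the two late drops let the weights settle before the model is saved.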
4) Performing model pruning on the trained weight model.
A BN-layer pruning method is used: a scaling factor γ is introduced for each channel of the BN layer and multiplied with that channel's output. The network weights and scaling factors are trained jointly; finally the channels with small scaling factors are removed directly, and the pruned network is fine-tuned to obtain the pruned weight model. The objective function adopts the formula:
$$L=\sum_{(x,y)}l\big(f(x,W),y\big)+\lambda\sum_{\gamma\in\Gamma}g(\gamma)$$
where (x, y) represents the training data and labels, W denotes the trainable parameters of the network, the first term is the ordinary training loss of the CNN, g(γ) is the penalty term on the scaling factors, and λ is the balance factor between the two terms; here g(s) = |s|, i.e. L1 regularization, which is also widely used for sparsification.
5) Finally testing the compressed model.
The YOLOv3 algorithm and the present algorithm are trained on the same data set made in step 3) (with the car window as the detected target object) to obtain weight models, which are then tested and compared in a GeForce GTX 1080Ti graphics card environment; the results are shown in Table 1.
Table 1. Test comparison of the YOLOv3 algorithm model and the present algorithm model
[Table 1 is reproduced as an image in the original publication.]
As the results in Table 1 show, the detection model finally generated by the present method is superior to the YOLOv3 detection algorithm: its model size makes it easier to deploy in small edge-end systems, and it completes detection and identification of vehicles on the road with a higher detection speed and a better recall rate.
It is noted that the disclosed embodiments are intended to aid in further understanding of the invention, but those skilled in the art will appreciate that: various substitutions and modifications are possible without departing from the spirit and scope of the invention and appended claims. Therefore, the invention should not be limited to the embodiments disclosed, but the scope of the invention is defined by the appended claims.

Claims (4)

1. A rapid vehicle detection method based on deep learning is characterized by comprising the following steps:
1) construction of detection algorithms
Constructing a basic feature extraction network of the detection algorithm, wherein the basic module DBL in the network structure consists of a convolution layer (CONV), batch normalization (BN) and an activation function (Leaky ReLU), and the RES module follows a residual module structure, consisting of a pixel padding module (padding), a DBL module and a classical residual unit; after the basic network extracts features, three parallel channels process the input feature layer with classical dilated convolution operations, forming a multi-scale dilated convolution module; after passing through three DBL structures, the output of the basic network is input into the multi-scale dilated convolution module, the processing result is then divided into three scales, and an FPN structure using channel fusion, upsampling and DBL structures performs channel and feature integration and fusion on the output of the multi-scale dilated convolution module; finally the extracted feature map is processed at three scales, each scale's feature layer generating a feature map of depth 3 × (5 + n), where 3 is the number of prediction boxes per grid cell, 5 comprises the 4 position values of a prediction box (centre coordinates and width and height) plus 1 confidence value, and n is the number of predicted classes; the prediction results of the three scales are fused and classified, and redundant overlapping prediction boxes are removed to obtain the final detection result of the predicted position and class of the target object;
2) Using a cross-entropy loss function
The Leaky ReLU function is selected as the activation function to accelerate convergence and avoid vanishing gradients. The loss function comprises four parts: the target object position information (x, y), the prediction box width and height (w, h), the predicted category information (class), and the predicted target object confidence. To address the problem that prediction boxes of different sizes would otherwise contribute error in the same proportion, the mean square error of the prediction box width and height relative to the real box width and height is taken and multiplied by a coefficient 0.5, balancing the errors of prediction boxes of different sizes and the proportion of the width-height error in the total error; the remaining three parts directly compute the relative error with a cross-entropy loss function. During training, the total loss value is the sum of the four loss values. The loss functions are as follows:
$$xy_{loss}=\sum_{i=0}^{s^{2}}\sum_{j=0}^{B}1_{ij}^{obj}\left[-x\log\hat{x}-(1-x)\log(1-\hat{x})-y\log\hat{y}-(1-y)\log(1-\hat{y})\right]$$

$$wh_{loss}=\lambda\sum_{i=0}^{s^{2}}\sum_{j=0}^{B}1_{ij}^{obj}\left[(w-\hat{w})^{2}+(h-\hat{h})^{2}\right]$$
wherein xy_loss is the position coordinate error of the target object and wh_loss is the prediction box width-height error; in s², s takes the values 26, 52 and 104 respectively, meaning the feature maps at the 3 scales are divided into s × s grids (grid cells), each grid being responsible for predicting target objects that fall into it; B takes the value 3, meaning each grid has 3 bounding boxes of different scales,
1_ij^obj indicates that the target object falls into the jth bounding box of grid i, λ takes the value 0.5, (x, y, w, h) are the actual position coordinates and width and height of the detected target object in the annotation data, and (x̂, ŷ, ŵ, ĥ) are the position coordinates and width and height of the detected target object predicted in the forward computation;
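The four-part loss above, together with the Leaky ReLU activation, can be sketched numerically as follows (a minimal numpy sketch assuming sigmoid-activated predictions in [0, 1] for the cross-entropy terms; the function names, tensor layout and the alpha value are assumptions):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # keeps a small gradient for negative inputs, avoiding dead units
    return np.where(x > 0, x, alpha * x)

def bce(p, t, eps=1e-7):
    # elementwise binary cross-entropy between prediction p and target t
    p = np.clip(p, eps, 1.0 - eps)
    return -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

def detection_loss(pred, target, obj_mask, lam=0.5):
    # pred/target: (..., 5 + n) laid out as [x, y, w, h, conf, class scores...];
    # obj_mask is the 1_ij^obj indicator selecting boxes responsible for an object
    xy_loss = np.sum(obj_mask[..., None] * bce(pred[..., 0:2], target[..., 0:2]))
    # width/height use mean square error scaled by the coefficient lam = 0.5
    wh_loss = lam * np.sum(obj_mask[..., None] * (pred[..., 2:4] - target[..., 2:4]) ** 2)
    # confidence is supervised for every box, classes only where an object exists
    conf_loss = np.sum(bce(pred[..., 4], target[..., 4]))
    cls_loss = np.sum(obj_mask[..., None] * bce(pred[..., 5:], target[..., 5:]))
    return xy_loss + wh_loss + conf_loss + cls_loss
```

The total loss returned is the sum of the four parts, as the claim specifies.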
3) Training the algorithm to obtain a weight model
Collecting vehicle pictures from different places, in different time periods and under different weather conditions as the original pictures of the data set, and annotating the target objects in the pictures to generate the data set; training the constructed deep learning algorithm on the data set with a certain strategy to generate a weight model;
4) Performing model pruning on the trained weight model
Using a BN-layer pruning method: a scaling factor γ is introduced for each channel of the BN layer and multiplied by the output of that channel; the network weights and the scaling factors are trained jointly; finally, the channels with small scaling factors are removed directly, and the pruned network is fine-tuned to obtain the pruned weight model; the objective function adopts the formula:
$$L=\sum_{(x,y)}l\big(f(x,W),\,y\big)+\lambda\sum_{\gamma\in\Gamma}g(\gamma)$$
where (x, y) represents the training data and labels, W denotes the trainable parameters of the network, the first term is the training loss function of the CNN, g(γ) is a sparsity-inducing penalty term on the scaling factors, and λ is the balance factor between the two terms; here g(x) = |x| is taken, i.e., L1 regularization, which is also widely used for sparsification;
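The channel-selection part of this pruning scheme can be sketched as follows (an illustrative numpy sketch: the L1 penalty that is added to the training loss, and a mask keeping only channels whose |γ| exceeds a global percentile threshold; the `lam` and `prune_ratio` values are assumptions):

```python
import numpy as np

def l1_penalty(gammas, lam=1e-4):
    # g(gamma) = |gamma|: the L1 term added to the CNN training loss,
    # pushing BN scaling factors toward zero during joint training
    return lam * np.sum(np.abs(gammas))

def prune_mask(gammas, prune_ratio=0.5):
    # keep only channels whose scaling-factor magnitude exceeds a global
    # percentile threshold; masked-out channels are removed, then the
    # remaining network is fine-tuned
    thresh = np.percentile(np.abs(gammas), prune_ratio * 100.0)
    return np.abs(gammas) > thresh
```

After training with the penalty, channels whose γ has been driven near zero contribute little to the output and can be removed with minor accuracy loss.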
5) Testing the final compressed model
Inputting the test pictures in the data set into a network, and marking the vehicles in the test images after running the test;
Through the above steps, the rapid vehicle detection method based on deep learning is realized.
2. The rapid vehicle detection method based on deep learning as claimed in claim 1, wherein in step 1) the input feature layer is processed in three parallel channels by three dilated convolution operations with dilation rates [1, 2, 5] and a convolution kernel size of 3 × 3.
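A 3 × 3 kernel with dilation rate d covers an effective receptive field of k + (k − 1)(d − 1) input pixels, so rates [1, 2, 5] cover 3, 5 and 11 pixels respectively. Below is a naive single-channel sketch of a 'same'-padded dilated convolution (illustrative only; the function names are assumptions, and a real implementation would use a deep learning framework):

```python
import numpy as np

def effective_kernel(k, d):
    # a k x k kernel with dilation d spans k + (k - 1)*(d - 1) input pixels
    return k + (k - 1) * (d - 1)

def dilated_conv2d_same(x, w, d):
    # naive 'same'-padded single-channel 2D dilated convolution;
    # pad = d*(k-1)//2 keeps the output the same size as the input
    k = w.shape[0]
    pad = d * (k - 1) // 2
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            # sample the input with stride d inside the kernel window
            patch = xp[i:i + d * (k - 1) + 1:d, j:j + d * (k - 1) + 1:d]
            out[i, j] = np.sum(patch * w)
    return out
```

Running the same feature layer through rates 1, 2 and 5 in parallel yields three outputs of identical spatial size but increasingly large receptive fields, which is what allows the module to be fused channel-wise.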
3. The rapid vehicle detection method based on deep learning as claimed in claim 1, wherein in step 3) pictures are extracted from road surveillance video, and pictures from different places and time periods under different weather conditions are collected as the original pictures of the data set; the windows in the pictures are annotated, with the window replacing the vehicle as the detection target, to generate the data set; to increase the amount of training data, the data set is expanded by increasing image contrast, increasing image saturation and decolorizing images, improving the generalization ability of the weight model; images of size 416 × 416 are used as input; the learning rate is gradually increased to 0.002 in the first 2000 training steps, reduced to 0.0002 at step 40000 and to 0.0001 at step 45000, and the weight model is generated at step 60000.
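The training schedule in this claim (warm-up to 0.002 over the first 2000 steps, then step decays at 40000 and 45000 steps) can be sketched as follows; the function name and the linear shape of the warm-up are assumptions, since the claim says only "gradually increased":

```python
def learning_rate(step, base=0.002, warmup=2000):
    # linear warm-up to the base rate over the first `warmup` steps,
    # then step decay at 40000 and 45000 steps, matching the claim
    if step < warmup:
        return base * step / warmup
    if step < 40000:
        return base
    if step < 45000:
        return 0.0002
    return 0.0001
```

Training runs to step 60000, at which point the weight model is saved.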
4. The method of any one of claims 1 to 3, applied to, but not limited to, automatic detection and identification of vehicles by a road monitoring system and uploading of the results.
CN202010452151.6A 2020-05-26 2020-05-26 Rapid vehicle detection method based on deep learning Pending CN111461083A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010452151.6A CN111461083A (en) 2020-05-26 2020-05-26 Rapid vehicle detection method based on deep learning

Publications (1)

Publication Number Publication Date
CN111461083A true CN111461083A (en) 2020-07-28

Family

ID=71685417

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010452151.6A Pending CN111461083A (en) 2020-05-26 2020-05-26 Rapid vehicle detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN111461083A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898539A (en) * 2020-07-30 2020-11-06 国汽(北京)智能网联汽车研究院有限公司 Multi-target detection method, device, system, equipment and readable storage medium
CN112101175A (en) * 2020-09-09 2020-12-18 沈阳帝信人工智能产业研究院有限公司 Expressway vehicle detection and multi-attribute feature extraction method based on local images
CN112308019A (en) * 2020-11-19 2021-02-02 中国人民解放军国防科技大学 SAR ship target detection method based on network pruning and knowledge distillation
CN112507861A (en) * 2020-12-04 2021-03-16 江苏科技大学 Pedestrian detection method based on multilayer convolution feature fusion
CN112560933A (en) * 2020-12-10 2021-03-26 中邮信息科技(北京)有限公司 Model training method and device, electronic equipment and medium
CN112580665A (en) * 2020-12-18 2021-03-30 深圳赛安特技术服务有限公司 Vehicle money identification method and device, electronic equipment and storage medium
CN112668663A (en) * 2021-01-05 2021-04-16 南京航空航天大学 Aerial photography car detection method based on YOLOv4
CN112801027A (en) * 2021-02-09 2021-05-14 北京工业大学 Vehicle target detection method based on event camera
CN113033284A (en) * 2020-12-22 2021-06-25 迪比(重庆)智能科技研究院有限公司 Vehicle real-time overload detection method based on convolutional neural network
CN113111889A (en) * 2021-03-10 2021-07-13 国网浙江省电力有限公司宁波供电公司 Target detection network processing method for edge computing terminal
CN113554084A (en) * 2021-07-16 2021-10-26 华侨大学 Vehicle re-identification model compression method and system based on pruning and light-weight convolution
CN113657174A (en) * 2021-07-21 2021-11-16 北京中科慧眼科技有限公司 Vehicle pseudo-3D information detection method and device and automatic driving system
CN114120246A (en) * 2021-10-12 2022-03-01 吉林大学 Front vehicle detection algorithm based on complex environment
CN114201289A (en) * 2021-10-27 2022-03-18 山东师范大学 Target detection method and system based on edge computing node and cloud server
CN114332688A (en) * 2021-12-14 2022-04-12 浙江省交通投资集团有限公司智慧交通研究分公司 Vehicle detection method under highway monitoring video scene
CN114359880A (en) * 2022-03-18 2022-04-15 北京理工大学前沿技术研究院 Riding experience enhancement method and device based on intelligent learning model and cloud
CN115082695A (en) * 2022-05-31 2022-09-20 中国科学院沈阳自动化研究所 Transformer substation insulator string modeling and detecting method based on improved Yolov5
CN116630904A (en) * 2023-04-28 2023-08-22 淮阴工学院 Small target vehicle detection method integrating non-adjacent jump connection and multi-scale residual error structure
CN118397403A (en) * 2024-07-01 2024-07-26 合肥市正茂科技有限公司 Training method, device, equipment and medium for low-illumination vehicle image detection model

Citations (2)

Publication number Priority date Publication date Assignee Title
CN109829400A (en) * 2019-01-18 2019-05-31 青岛大学 A kind of fast vehicle detection method
CN110796168A (en) * 2019-09-26 2020-02-14 江苏大学 Improved YOLOv 3-based vehicle detection method

Non-Patent Citations (3)

Title
JOSEPH REDMON et al.: "YOLOv3: An Incremental Improvement", arXiv preprint *
LIANG-CHIEH CHEN et al.: "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs", arXiv preprint *
WANG Liangliang, WANG Guodong, et al.: "A Fast Vehicle Detection Algorithm Based on Car-Window Features", Journal of Qingdao University *


Similar Documents

Publication Publication Date Title
CN111461083A (en) Rapid vehicle detection method based on deep learning
CN111126202B (en) Optical remote sensing image target detection method based on void feature pyramid network
CN110188705B (en) Remote traffic sign detection and identification method suitable for vehicle-mounted system
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN108509978B (en) Multi-class target detection method and model based on CNN (CNN) multi-level feature fusion
CN111008562B (en) Human-vehicle target detection method with feature map depth fusion
CN113392960B (en) Target detection network and method based on mixed hole convolution pyramid
CN114202672A (en) Small target detection method based on attention mechanism
CN113780211A (en) Lightweight aircraft detection method based on improved yolk 4-tiny
CN111401293B (en) Gesture recognition method based on Head lightweight Mask scanning R-CNN
CN112733693B (en) Multi-scale residual error road extraction method for global perception high-resolution remote sensing image
CN114187311A (en) Image semantic segmentation method, device, equipment and storage medium
CN113850324B (en) Multispectral target detection method based on Yolov4
CN113436210B (en) Road image segmentation method fusing context progressive sampling
CN113361528B (en) Multi-scale target detection method and system
CN111353544A (en) Improved Mixed Pooling-Yolov 3-based target detection method
CN111723660A (en) Detection method for long ground target detection network
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN116468740A (en) Image semantic segmentation model and segmentation method
CN116342894A (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN110909656B (en) Pedestrian detection method and system integrating radar and camera
CN116597326A (en) Unmanned aerial vehicle aerial photography small target detection method based on improved YOLOv7 algorithm
CN117853955A (en) Unmanned aerial vehicle small target detection method based on improved YOLOv5
CN117197687A (en) Unmanned aerial vehicle aerial photography-oriented detection method for dense small targets
CN113537013A (en) Multi-scale self-attention feature fusion pedestrian detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200728