CN114898327B - Vehicle detection method based on lightweight deep learning network - Google Patents

Vehicle detection method based on lightweight deep learning network

Info

Publication number
CN114898327B
CN114898327B (Application CN202210250838.0A)
Authority
CN
China
Prior art keywords
vehicle
frame
image
frames
algorithm
Prior art date
Legal status
Active
Application number
CN202210250838.0A
Other languages
Chinese (zh)
Other versions
CN114898327A (en)
Inventor
贺宜
鲁曼可
曹博
巴继东
李泽
Current Assignee
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date
Filing date
Publication date
Application filed by Wuhan University of Technology WUT filed Critical Wuhan University of Technology WUT
Priority to CN202210250838.0A priority Critical patent/CN114898327B/en
Publication of CN114898327A publication Critical patent/CN114898327A/en
Application granted granted Critical
Publication of CN114898327B publication Critical patent/CN114898327B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a vehicle detection method based on a lightweight deep learning network. The method comprises the steps of collecting original videos of road vehicles to obtain a road vehicle image data set; introducing the PSO particle swarm algorithm, improving the particle fitness function, and optimizing the width and height dimensions of the vehicle labeling frames; adopting the distance intersection over union (DIOU) as the index for measuring label similarity, and optimizing the width and height dimensions of the vehicle prior frames by combining the PSO particle swarm algorithm with the K-means clustering algorithm; adding depthwise separable convolution to the YOLOv3 model by modifying its Res module; and training the lightweight deep learning network to detect road vehicle types. The method accelerates the convergence of the K-means clustering algorithm and the PSO particle swarm optimization algorithm to obtain optimal prior frame sizes for road vehicles, assists the YOLOv3 deep learning network in generating accurate target prediction frames, and reduces the number of operation parameters on a large scale while improving detection accuracy, so that the detection speed of the algorithm is further improved and vehicle types in traffic scenes are detected in real time.

Description

Vehicle detection method based on lightweight deep learning network
Technical Field
The invention belongs to the technical field of image recognition and target detection, and particularly relates to a vehicle detection method based on a lightweight deep learning network.
Background
In recent years, as traffic construction has further increased, urban road networks have become increasingly complex and the demand for detecting traffic objects on roads has gradually grown, which places higher requirements on both accuracy and detection speed. Meanwhile, with the rapid development of computer hardware, large-scale computing power has improved unprecedentedly, so target detection algorithms based on deep learning have gradually become mainstream. However, deep network models often involve a huge amount of parameter computation and therefore cannot meet the speed required for real-time detection; how to further improve the detection speed of the model while guaranteeing accuracy has become the key to optimizing target detection algorithms.
Existing algorithms fall mainly into two categories. The first is the two-stage R-CNN series: the first stage generates candidate regions, and the second stage classifies the target objects that may exist in those regions and performs frame regression on the candidate frames to adjust their positions. These algorithms, which include R-CNN, Fast R-CNN and Faster R-CNN, guarantee accuracy but detect slowly. The other category is one-stage algorithms, which omit the candidate-region generation stage and directly output the class probabilities and position coordinates of the target objects after processing the input image; they include the SSD and YOLO series. These algorithms increase detection speed but lose some accuracy and perform poorly on small-scale information. Existing network models can distinguish and classify traffic targets in real detection environments but cannot meet the growing demand for detection speed, and existing larger models suffer from many computation parameters and large memory occupation, so they cannot well meet the requirements of large-scale application.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a vehicle detection method and system based on a lightweight deep learning network. By reducing parameters and simplifying computation, it solves the problems of large memory occupation, many parameters and complex computation in general deep-learning-based target detection algorithms, so that the detection network achieves a higher detection speed; meanwhile, by improving the quality of the prior frames, the target detection network maintains high detection precision, thereby meeting the requirement of identifying the types and positions of vehicles in traffic scenes in real time.
The invention provides a method that combines the PSO particle swarm optimization algorithm with the K-means clustering algorithm to optimize the calculation of the prior frame sizes for the vehicle dataset images, and introduces the distance intersection over union (DIOU) as the similarity index between the vehicle rectangular labeling frames and the vehicle prior frames. This accelerates the convergence of the K-means clustering algorithm and the PSO particle swarm optimization algorithm, yields optimal prior frame sizes for the vehicle dataset images, and assists the YOLOv3 deep learning network in generating accurate target prediction frames. Meanwhile, the residual module of the YOLOv3 algorithm is modified with a depthwise separable convolution network, so that the number of operation parameters can be reduced on a large scale while improving detection accuracy, the memory occupied by the model is reduced, the detection speed of the algorithm is further improved, and the type and position of vehicles in traffic scenes are detected in real time.
In order to achieve the above object, the technical solution of the present invention is a vehicle detection method based on a lightweight deep learning network, which is characterized by comprising the following steps:
Step 1: collecting an initial video of a vehicle through a road monitoring camera, and transmitting the initial video to a calculation processing host for frame extraction to obtain a plurality of road vehicle images; manually labeling each vehicle labeling frame in each road vehicle image, and further manually labeling the vehicle category in each vehicle labeling frame in each road vehicle image; extracting the width of each vehicle marking frame in each road vehicle image and the height of each vehicle marking frame in each road vehicle image to form a wide-high data set;
Step 2: introducing a PSO particle swarm optimization algorithm, taking the marked frame width and height dimensions as variables to be optimized, improving a particle fitness function, and obtaining optimized K marked frame width and height dimensions by utilizing the global searching capability of the PSO algorithm;
Step 3: taking the optimized width and height dimensions of the K marking frames as initial values of priori frames to be generated by a K-means clustering algorithm, calculating the distance intersection ratio of each marking frame to each generated priori frame, and clustering by the K-means clustering algorithm to generate clustered optimized priori frame width and height dimensions;
Step4: adding depth separable convolution to a Res residual error module of YOLOv < 3 > deep learning network to obtain a lightweight YOLOv < 3 > deep learning network, inputting priori frame width and height data and a vehicle image dataset which are obtained by optimizing a K-means algorithm and a PSO algorithm and are suitable for the dataset into the lightweight YOLOv < 3 > deep learning network for training to obtain a trained lightweight YOLOv < 3 > deep learning network model;
step 5: and transmitting the traffic video acquired in real time to a calculation processing host for frame extraction to obtain a plurality of real-time road vehicle images, and further predicting by using the trained lightweight YOLOv deep learning network model to obtain a vehicle prediction frame in the plurality of real-time road vehicle images and the category of vehicles in the prediction frame.
Preferably, in step 1, each vehicle labeling frame in each road vehicle image is:
Box_{m,n} = (x_{m,n}, y_{m,n}, w_{m,n}, h_{m,n}), m ∈ [1, M], n ∈ [1, N]
The width and height data set in step 1 is:
Φ = (w_{m,n}, h_{m,n}), m ∈ [1, M], n ∈ [1, N]
where Box_{m,n} denotes the nth vehicle labeling frame in the mth road vehicle image, Φ denotes the data set of the width and height dimensions of all vehicle labeling frames, x_{m,n} and y_{m,n} denote the abscissa and ordinate of the center point of the nth vehicle labeling frame in the mth road vehicle image, w_{m,n} denotes its width and h_{m,n} denotes its height; M denotes the number of road vehicle images and N denotes the number of vehicle labeling frames in each road vehicle image;
In step 1, the vehicle category in each circumscribed rectangular frame of each vehicle in each road vehicle image is:
type_{m,n}, type_{m,n} ∈ [1, 3]
where type_{m,n} = 1 indicates that the vehicle type in the nth vehicle labeling frame in the mth road vehicle image is a car, type_{m,n} = 2 indicates that it is a bus, and type_{m,n} = 3 indicates that it is a truck;
Preferably, the step of obtaining the optimized width and height dimensions of the K labeling frames through the PSO algorithm in step 2 comprises the following steps:
Step 2.1: randomly selecting K labeling frames from all the vehicle labeling frames of all the road vehicle images in step 1;
Step 2.2: taking the width and height data c_k = (w_k, h_k), k ∈ [1, K], of the K labeling frames as the initial values of the particle population center positions of the PSO particle swarm algorithm and initializing them;
where c_k = (w_k, h_k) denotes the width and height dimension data of the K randomly selected labeling frames;
Step 2.3: initializing the particle velocities V_k = 0, the individual optimal positions P_best(k) with the corresponding individual extrema f(P_best(k)), and the group optimal position G_best with the corresponding global extremum f(G_best); the particle swarm population size is N, i.e. the swarm consists of N particles p_j, j ∈ (1, 2, …, N);
Step 2.4: calculating the distance intersection over union (DIOU) between each particle p_j and the center point c_k = (w_k, h_k), and using it to construct the improved particle fitness function;
Step 2.5: comparing the fitness calculation results fit, and updating the individual extrema f(P_best(k)) and individual optimal positions P_best(k) of the particle swarm as well as the global extremum f(G_best) and the group optimal position G_best of the particle swarm.
Step 2.6: when the maximum number of iterations is reached, the algorithm ends; the optimal group position G_best = (P_1, P_2, P_3, …, P_K) is obtained through PSO optimization, the particle positions in the optimal group correspond to the optimized K particle coordinates P_k = (w'_k, h'_k), k ∈ [1, K], which are the optimized labeling frame width and height dimensions.
Preferably, in step 3, clustering is performed by the K-means clustering algorithm to generate the cluster-optimized prior frame width and height dimensions; the specific process is as follows:
Step 3.1: first, the width and height dimensions P_k = (w'_k, h'_k), k ∈ [1, K], of the K labeling frames obtained by the PSO particle swarm optimization algorithm are read and used as the initial values of the K prior frames to be generated by the K-means clustering algorithm.
Step 3.2: the distance intersection over union of all vehicle labeling frames of each road vehicle image to the K prior frames is calculated, the distance values of all vehicle labeling frames to the K prior frames are further calculated according to the improved distance formula d, the distance values are compared, and each labeling frame is assigned to the class of the prior frame with which it has the smallest distance value.
The distance intersection over union (DIOU) between all vehicle labeling frames of each road vehicle image and the generated K prior frames is calculated as:
DIOU = IOU − ρ²(b, b_Box)/c²
The improved distance formula is:
d = 1 − DIOU
where b and b_Box denote the center points of the prior frame (anchor frame) and the labeling frame (bounding frame) respectively, ρ denotes the Euclidean distance between the two center points, and c denotes the diagonal length of the smallest enclosing box that can contain the prior frame and the labeling frame at the same time.
Step 3.3: and aiming at the labeling frames of a certain class of prior frames, sequencing the labeling frames according to the width and the height, and taking the intermediate value as a new class of prior frames to update the width and the height of the prior frames.
Step 3.4: and calculating the distance intersection ratio and the distance value of each new prior frame and all the vehicle marking frames of each road vehicle image, and carrying out new classification according to the steps.
Step 3.5: repeating the steps 3.3 and 3.4 until the width and height dimensions of the prior frame are not updated, outputting the K prior frame width and height dimensions (w' k,h″k) optimized by the K-means clustering algorithm, and K E [1, K ].
Preferably, the lightweight YOLOv3 deep learning network described in step 4 is:
The Res residual module in the YOLOv3 network model is modified with a depthwise separable convolution network. The basic module in the original YOLOv3 network structure is the DBL, which consists of a convolution layer, scale normalization and a Leaky_relu activation function, while the Res module draws on the residual structure of ResNet. The invention adopts depthwise separable convolution to replace the basic convolution operation in the Res module of the YOLOv3 network: after the first 1×1 convolution operation, a 3×3 channel-by-channel convolution is performed. The channel-by-channel convolution splits the multi-channel feature map of the previous layer into several single-channel feature maps, performs a single-channel convolution on each of them, and then stacks them together again; after the channel-by-channel convolution, another 1×1 point-by-point convolution is performed, whose function is to carry out a weighting operation in the depth direction to obtain a new feature map. Adding the depthwise separable convolution to the residual module of YOLOv3 greatly reduces the amount of parameter computation in the network, and the memory occupied by the YOLOv3 algorithm model is reduced accordingly.
The loss function model of the lightweight deep learning network described in step 4 is:
The loss function of the YOLOv3 algorithm is designed mainly from three aspects: bounding-box coordinate prediction error, bounding-box confidence error and classification prediction error. The YOLOv3 loss function can be expressed as:
where G is the number of grids into which the image is divided, B is the number of predicted bounding boxes in each grid, i indexes the cells and j indexes the prior frames (anchor frames); an indicator term denotes whether the jth prior frame (anchor frame) of the ith cell is responsible for predicting the object, taking the value 1 or 0; further terms denote the abscissa and ordinate of the center point and the width of the nth vehicle target frame predicted by the ith grid of the mth image of the image training set; type_{m,n,s,i} denotes the category of the nth vehicle target frame in the ith grid of the mth image of the image training set, and the corresponding predicted term denotes the category predicted by that grid; a further indicator term denotes whether there is no target in the jth anchor frame of the ith grid; the predicted confidence term denotes the vehicle class confidence of the nth vehicle target frame predicted by the ith grid of the mth image of the image training set, and p_i(type_{m,n,s,i}) denotes the true vehicle class confidence of the nth vehicle target frame of the ith grid of the mth image of the image training set.
Compared with the prior art, the invention has the following beneficial effects. The PSO particle swarm optimization algorithm is adopted with the labeling frame width and height dimensions as the variables to be optimized, and the optimized labeling frame width and height dimensions are used as the initial values for prior frame generation in the K-means clustering algorithm. In the calculation of the PSO algorithm and the K-means algorithm, the distance intersection over union (DIOU) is introduced to replace the intersection over union (IOU) and the distance formula is improved, so that the distance between the center points of the labeling frame and the prior frame can be directly minimized during the calculation; this solves the problem that the IOU cannot accurately reflect the true overlap of the two frames and accelerates the convergence of the network calculation. Through the optimization of the prior frame width and height dimensions, the prior frame sizes better match the real values, which assists the YOLOv3 deep learning network in generating more accurate prediction frames and improves the detection accuracy of the YOLOv3 algorithm for vehicle types and positions. In the residual module of the YOLOv3 model, depthwise separable convolution replaces the basic convolution operation, which greatly reduces the number of parameters involved in computation, reduces the memory of the model and increases the detection speed. The technical scheme of the invention can rapidly and accurately identify the vehicles in traffic video, realizes real-time position recognition of traffic targets, and has a promising application prospect.
Drawings
Fig. 1: a general flow chart of the method in the embodiment of the invention;
Fig. 2: a flow chart of the combined particle swarm optimization (PSO) and K-means clustering algorithm in the embodiment of the invention;
Fig. 3: a structure diagram of the improved residual module ResNet-DS in the embodiment of the invention;
Fig. 4: a schematic diagram of the modified YOLOv3 network structure in the embodiment of the invention.
Detailed Description
The following describes the technical scheme provided by the invention in detail with reference to the accompanying drawings:
As shown in fig. 1, the invention provides a vehicle detection method and system based on a lightweight deep learning network, comprising:
the image deep learning system is characterized by comprising: the road monitoring camera, the calculation processing host and the display screen;
the road monitoring camera, the calculation processing host and the display screen are connected in sequence;
the road monitoring camera is used for collecting an initial image of a road vehicle and transmitting the initial image to the calculation processing host;
The calculation processing host is used for identifying the type of the road vehicle from the initial image of the road vehicle to obtain a prediction frame of the road vehicle, and the corresponding vehicle type and confidence in the prediction frame of the road vehicle;
the display screen is used for displaying a prediction frame of the road vehicle, and the corresponding vehicle category and the confidence level in the prediction frame of the road vehicle.
The calculation processing host is configured with: an i9-10980XE CPU; an RTX 3080 GPU; an X299 motherboard; 16 GB of DDR4 3000 MHz memory; and a GW-EPS1250DA power supply;
step 1: the method comprises the steps of collecting an initial video of a vehicle through a road monitoring camera, transmitting the initial video to a calculation processing host for frame extraction, obtaining a plurality of vehicle images, and constructing an image data set of the vehicle. Each vehicle in each image in the manual labeling image data set is externally connected with a rectangular frame to form a labeling frame, and the types of the vehicles are further manually labeled to obtain the width and height size information of the labeling frame;
In step 1, each vehicle labeling frame in each road vehicle image is:
Box_{m,n} = (x_{m,n}, y_{m,n}, w_{m,n}, h_{m,n}), m ∈ [1, M], n ∈ [1, N]
The width and height data set in step 1 is:
Φ = (w_{m,n}, h_{m,n}), m ∈ [1, M], n ∈ [1, N]
where Box_{m,n} denotes the nth vehicle labeling frame in the mth road vehicle image, Φ denotes the data set of the width and height dimensions of all vehicle labeling frames, x_{m,n} and y_{m,n} denote the abscissa and ordinate of the center point of the nth vehicle labeling frame in the mth road vehicle image, w_{m,n} denotes its width and h_{m,n} denotes its height; M denotes the number of road vehicle images and N denotes the number of vehicle labeling frames in each road vehicle image;
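The following is a minimal sketch, for illustration only, of how the width-height data set Φ of step 1 can be assembled from the manual annotations. The Pascal VOC-style XML layout and the category names used here are assumptions made for the example; the patent does not fix an annotation file format.

```python
# Sketch of Step 1: collecting the (width, height) pairs of all vehicle
# labeling frames into the data set Phi. The XML layout and class names are
# assumptions for illustration only.
import glob
import xml.etree.ElementTree as ET

CLASS_IDS = {"car": 1, "bus": 2, "truck": 3}  # type_{m,n} = 1/2/3 as in the patent

def build_wh_dataset(annotation_dir):
    """Return a list of (w, h) tuples, one per vehicle labeling frame."""
    wh_dataset = []
    for xml_path in glob.glob(f"{annotation_dir}/*.xml"):
        root = ET.parse(xml_path).getroot()
        for obj in root.iter("object"):
            if obj.find("name").text not in CLASS_IDS:
                continue
            box = obj.find("bndbox")
            xmin = float(box.find("xmin").text)
            ymin = float(box.find("ymin").text)
            xmax = float(box.find("xmax").text)
            ymax = float(box.find("ymax").text)
            wh_dataset.append((xmax - xmin, ymax - ymin))  # (w_{m,n}, h_{m,n})
    return wh_dataset
```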
Step 2: introducing a PSO particle swarm optimization algorithm, taking the marked frame width and height dimensions as variables to be optimized, improving a particle fitness function, and obtaining optimized K marked frame width and height dimensions by utilizing the global searching capability of the PSO algorithm;
In step 2, the optimized width and height dimensions of the K labeling frames are obtained through the PSO algorithm, which comprises the following steps:
Step 2.1: randomly selecting K labeling frames from all the vehicle labeling frames of all the road vehicle images in step 1;
Step 2.2: taking the width and height data c_k = (w_k, h_k), k ∈ [1, K], of the K labeling frames as the initial values of the particle population center positions of the PSO particle swarm algorithm and initializing them;
where c_k = (w_k, h_k) denotes the width and height dimension data of the K randomly selected labeling frames;
Step 2.3: initializing the particle velocities V_k = 0, the individual optimal positions P_best(k) with the corresponding individual extrema f(P_best(k)), and the group optimal position G_best with the corresponding global extremum f(G_best); the particle swarm population size is N, i.e. the swarm consists of N particles p_j, j ∈ (1, 2, …, N);
Step 2.4: calculating the distance intersection over union (DIOU) between each particle p_j and the center point c_k = (w_k, h_k), and using it to construct the improved particle fitness function;
Step 2.5: comparing the fitness calculation results fit, and updating the individual extrema f(P_best(k)) and individual optimal positions P_best(k) of the particle swarm as well as the global extremum f(G_best) and the group optimal position G_best of the particle swarm.
Step 2.6: when the maximum number of iterations is reached, the algorithm ends; the optimal group position G_best = (P_1, P_2, P_3, …, P_K) is obtained through PSO optimization, the particle positions in the optimal group correspond to the optimized K particle coordinates P_k = (w'_k, h'_k), k ∈ [1, K], which are the optimized labeling frame width and height dimensions.
Step 3: the width and height dimensions of K marking frames obtained after optimizing the PSO particle swarm optimization algorithm are used as initial values of priori frames to be generated by the K-means clustering algorithm, and the distance intersection ratio (DIOU) of each marking frame and each generated priori frame is calculated and clustered to obtain the width and height dimensions of the clustered and optimized priori frames;
and 3, clustering through a K-means clustering algorithm to generate a priori frame width height after clustering optimization, wherein the method comprises the following specific processes:
Step 3.1: firstly, reading the width and height dimensions P k=(w′k,h′k of K marked frames obtained by a PSO particle swarm optimization algorithm), wherein K is E [1, K ] which is used as the initial value of K priori frames to be generated by a K-means clustering algorithm.
Step 3.2: calculating the distance intersection ratio of all the vehicle marking frames of each road vehicle image to K prior frames, further calculating the distance values of all the vehicle marking frames of each road vehicle image to the K prior frames according to an improved distance formula d, comparing the distance values, and classifying marking frames with the smallest distance values with the K prior frames into one type.
Step 3.3: aiming at the labeling frames of a certain class of prior frames, sorting the labeling frames according to the width and the height, and taking an intermediate value as a new class of prior frames to update the width and the height of the prior frames;
step 3.4: and calculating the distance intersection ratio and the distance value of each new prior frame and all the vehicle marking frames of each road vehicle image, and carrying out new classification according to the steps.
Step 3.5: repeating the steps 3.3 and 3.4 until the width and height dimensions of the prior frame are not updated, outputting the K prior frame width and height dimensions (w' k,h″k) optimized by the K-means clustering algorithm, and K E [1, K ].
The distance intersection over union (DIOU) formula is:
DIOU = IOU − ρ²(b, b_Box)/c²
The improved distance formula is:
d = 1 − DIOU
where b and b_Box denote the center points of the prior frame (anchor frame) and the labeling frame (bounding frame) respectively, ρ denotes the Euclidean distance between the two center points, and c denotes the diagonal length of the smallest enclosing box that can contain both the prior frame and the labeling frame. Replacing the intersection over union (IOU) in the distance formula of the original K-means clustering algorithm with the distance intersection over union (DIOU) solves the problem that the original distance formula cannot clearly distinguish the cases in which the prior frame and the labeling frame are in a containment relation or a separation relation.
Meanwhile, the distance between the center points of the prior frame and the labeling frame can be directly minimized by the improved distance formula, so that the convergence of the K-means clustering algorithm can be accelerated.
Combining the PSO particle swarm optimization algorithm with the K-means clustering algorithm solves the problem that the K-means clustering result is strongly influenced by its initial values, makes the prior frame sizes better match the real values, and improves the detection accuracy of YOLOv3 by improving the prior frame quality.
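The following is a minimal sketch of the DIOU-based K-means clustering of step 3, seeded with the PSO result. The diou_wh helper is duplicated from the PSO sketch so the block runs on its own; its corner alignment is an illustrative assumption, while the distance d = 1 − DIOU and the median-based centroid update follow the improved distance formula and step 3.3 as described above.

```python
import statistics

def diou_wh(box_wh, anchor_wh):
    # duplicated from the PSO sketch so this block runs on its own
    w1, h1 = box_wh
    w2, h2 = anchor_wh
    inter = min(w1, w2) * min(h1, h2)
    iou = inter / (w1 * h1 + w2 * h2 - inter)
    rho2 = ((w1 - w2) / 2.0) ** 2 + ((h1 - h2) / 2.0) ** 2
    c2 = max(w1, w2) ** 2 + max(h1, h2) ** 2
    return iou - rho2 / c2

def kmeans_diou(boxes, init_anchors, max_iters=300):
    """DIOU-based K-means seeded with the PSO result (Steps 3.1-3.5)."""
    anchors = [tuple(a) for a in init_anchors]                    # Step 3.1: P_k = (w'_k, h'_k)
    for _ in range(max_iters):
        clusters = [[] for _ in anchors]
        for b in boxes:                                           # Step 3.2: assign by d = 1 - DIOU
            k = min(range(len(anchors)), key=lambda j: 1.0 - diou_wh(b, anchors[j]))
            clusters[k].append(b)
        new_anchors = [                                           # Step 3.3: median update
            (statistics.median(w for w, _ in cl), statistics.median(h for _, h in cl))
            if cl else anchors[k]
            for k, cl in enumerate(clusters)
        ]
        if new_anchors == anchors:                                # Step 3.5: stop when unchanged
            break
        anchors = new_anchors                                     # Step 3.4: reclassify next round
    return anchors                                                # (w''_k, h''_k)

# typical usage: anchors = kmeans_diou(wh_dataset, pso_anchors(wh_dataset, K=9))
```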
Step4: adding depth separable convolution to a Res residual error module of YOLOv < 3 > deep learning network to obtain a lightweight YOLOv < 3 > deep learning network, inputting priori frame width and height data and a vehicle image dataset which are obtained by optimizing a K-means algorithm and a PSO algorithm and are suitable for the dataset into the lightweight YOLOv < 3 > deep learning network for training to obtain a trained lightweight YOLOv < 3 > deep learning network model;
The lightweight YOLOv3 deep learning network described in step 4 is as follows:
The Res residual module in the YOLOv3 network model is modified with a depthwise separable convolution network. The basic module in the original YOLOv3 network structure is the DBL, which consists of a convolution layer, scale normalization and a Leaky_relu activation function, while the Res module draws on the residual structure of ResNet. The invention replaces the basic convolution operation in the Res module of the YOLOv3 network with depthwise separable convolution.
As shown in fig. 3, in the new residual module ResNet-DS of the YOLOv3 network, after the first 1×1 convolution operation a 3×3 channel-by-channel convolution is performed. The channel-by-channel convolution splits the multi-channel feature map of the previous layer into several single-channel feature maps, performs a single-channel convolution on each of them, and then stacks them together again; after the channel-by-channel convolution, another 1×1 point-by-point convolution is performed, whose function is to carry out a weighting operation in the depth direction to obtain a new feature map. Adding the depthwise separable convolution to the residual module of YOLOv3 greatly reduces the amount of parameter computation in the network, and the memory occupied by the YOLOv3 algorithm model is reduced accordingly.
The parameter computation involved in the improved YOLOv3 residual module is analyzed as follows:
According to the convolution operation of the original YOLOv3 algorithm, let the input of a convolution layer be a 3-channel image and let the convolution layer have N filters, each filter comprising k = 3 convolution kernels of size 3×3 (one per input channel). Thus the number of parameters N_1 of the original YOLOv3 convolution layer is:
N_1 = N × k × 3 × 3 = 27N
For the depthwise separable convolution, in the channel-by-channel convolution the channels and convolution kernels correspond one to one, with one convolution kernel responsible for one channel. Therefore one 3-channel image convolved channel by channel generates 3 feature maps. Each filter contains only one convolution kernel of size 3×3, so the number of parameters N_DW involved in the channel-by-channel convolution part is:
N_DW = 3 × 3 × 3 = 27
The channel-by-channel convolution performs the convolution independently on each channel of the input layer, so the number of feature maps after the channel-by-channel convolution equals the number of input channels; however, this operation does not combine the feature information across channels, so the channel-by-channel convolution cannot expand the number of feature maps. The feature maps after the channel-by-channel convolution are therefore weighted and combined in the depth direction by the point-by-point convolution to generate new feature maps. The kernel size of the point-by-point convolution is 1×1×M, where M is the number of channels of the previous layer; since the previous layer has 3 channels, M = 3. Since this convolution has N filters and uses 1×1 kernels, the number of parameters N_PW involved in the point-by-point convolution is:
N_PW = 3 × 1 × 1 × N = 3N
The number of convolution layer parameters N_2 of the modified YOLOv3 algorithm is therefore:
N_2 = N_DW + N_PW = 27 + 3N
The parameter count 27 + 3N of the improved depthwise separable convolution is significantly smaller than the 27N of the original operation. Because convolution layers usually involve many filters, i.e. a large N, the parameter reduction reaches 87.9% when N exceeds 100. The parameter scale of the improved YOLOv3 algorithm is therefore greatly reduced, the memory occupied by the improved YOLOv3 model is greatly reduced, and the training and detection speed of the improved algorithm is increased; meanwhile, since the depthwise separable convolution modification is applied only to the residual module, the recognition accuracy of the modified algorithm remains at a high level.
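The following is a minimal PyTorch sketch of the ResNet-DS block described above: a 1×1 convolution, a 3×3 channel-by-channel (depthwise) convolution and a 1×1 point-by-point convolution, each with scale normalization and a Leaky ReLU as in the DBL module, plus the residual skip connection. The channel widths and hyper-parameters are assumptions inferred from this description, not taken from a reference implementation of the patent.

```python
# Sketch of the ResNet-DS residual block: 1x1 conv -> 3x3 depthwise conv ->
# 1x1 pointwise conv, each followed by batch normalization and Leaky ReLU,
# with a residual skip connection. Channel widths are illustrative.
import torch
import torch.nn as nn

class ResNetDS(nn.Module):
    def __init__(self, channels):
        super().__init__()
        hidden = channels // 2
        self.block = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),      # 1x1 reduce
            nn.BatchNorm2d(hidden),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1,
                      groups=hidden, bias=False),                        # 3x3 channel-by-channel
            nn.BatchNorm2d(hidden),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),      # 1x1 point-by-point
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.1, inplace=True),
        )

    def forward(self, x):
        return x + self.block(x)          # residual connection as in ResNet

# Quick check mirroring the parameter derivation above (3-channel input, N filters):
N = 128
standard = N * 3 * 3 * 3          # N_1 = 27N
separable = 3 * 3 * 3 + 3 * N     # N_DW + N_PW = 27 + 3N
print(standard, separable, 1 - separable / standard)   # reduction ~88% for large N
```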
The loss function model of the lightweight deep learning network described in step 4 is:
The loss function of the YOLOv3 algorithm is designed mainly from three aspects: bounding-box coordinate prediction error, bounding-box confidence error and classification prediction error. The YOLOv3 loss function can be expressed as:
where G is the number of grids into which the image is divided, B is the number of predicted bounding boxes in each grid, i indexes the cells and j indexes the prior frames (anchor frames); an indicator term denotes whether the jth prior frame (anchor frame) of the ith cell is responsible for predicting the object, taking the value 1 or 0; further terms denote the abscissa and ordinate of the center point and the width of the nth vehicle target frame predicted by the ith grid of the mth image of the image training set; type_{m,n,s,i} denotes the category of the nth vehicle target frame in the ith grid of the mth image of the image training set, and the corresponding predicted term denotes the category predicted by that grid; a further indicator term denotes whether there is no target in the jth anchor frame of the ith grid; the predicted confidence term denotes the vehicle class confidence of the nth vehicle target frame predicted by the ith grid of the mth image of the image training set, and p_i(type_{m,n,s,i}) denotes the true vehicle class confidence of the nth vehicle target frame of the ith grid of the mth image of the image training set.
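The following is a deliberately simplified sketch of a loss with the three components described above (coordinate error, confidence error, classification error). The tensor layout, the weighting factors and the choice of MSE and binary cross-entropy are illustrative assumptions; the patent's exact loss formula is not reproduced in this text.

```python
# Simplified illustration of a YOLOv3-style loss with three components:
# coordinate error, confidence error and classification error. All layout
# and weighting choices here are assumptions for the sketch.
import torch
import torch.nn.functional as F

def yolo_like_loss(pred, target, obj_mask, lambda_coord=5.0, lambda_noobj=0.5):
    """pred/target: (..., 4 box coords + 1 objectness + C class scores);
    obj_mask: float 0/1 tensor with the same leading shape as pred[..., 0]."""
    noobj_mask = 1.0 - obj_mask
    coord = F.mse_loss(pred[..., :4] * obj_mask[..., None],
                       target[..., :4] * obj_mask[..., None], reduction="sum")
    conf_obj = F.binary_cross_entropy_with_logits(
        pred[..., 4], target[..., 4], weight=obj_mask, reduction="sum")
    conf_noobj = F.binary_cross_entropy_with_logits(
        pred[..., 4], target[..., 4], weight=noobj_mask, reduction="sum")
    cls = F.binary_cross_entropy_with_logits(
        pred[..., 5:], target[..., 5:],
        weight=obj_mask[..., None], reduction="sum")
    return lambda_coord * coord + conf_obj + lambda_noobj * conf_noobj + cls
```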
Step 5: and transmitting the traffic video acquired in real time to a calculation processing host for frame extraction to obtain a plurality of road vehicle images, and further predicting by using the trained lightweight YOLOv deep learning network model to obtain a vehicle prediction frame in the plurality of road vehicle images and the category of vehicles in the prediction frame.
The specific process of detection with the lightweight YOLOv3 deep learning network in step 5 is as follows:
The vehicle images extracted from the video are input into the improved YOLOv3 network for feature extraction; down-sampling is performed in the Darknet backbone network through several convolution operations with stride 2, obtaining feature maps at the three scales 13×13, 26×26 and 52×52. The K optimized prior frame sizes obtained in step 3 are distributed over the three feature maps. After the prior frames are predicted on the 13×13-scale feature map, the subsequent candidate frame information is obtained directly after further convolution operations. For the 26×26 scale, the 13×13-scale feature map is first up-sampled and then added to the 26×26-scale feature map, and the subsequent candidate frame information is output after several convolution operations. For the 52×52 scale, the 26×26-scale feature map is first up-sampled and then added to the 52×52-scale feature map, and the subsequent candidate frame information is likewise output after several convolution operations. The generated candidate frames are then screened with the Soft-NMS suppression algorithm, and finally high-precision vehicle prediction bounding boxes and vehicle categories are output.
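The following is a minimal sketch of the Soft-NMS screening mentioned above, using the Gaussian score-decay variant. The box format, sigma and score threshold are illustrative assumptions; the patent does not specify them in this text.

```python
# Sketch of Soft-NMS candidate-frame screening (Gaussian decay variant).
import math

def iou(a, b):
    """IOU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, sigma=0.5, score_thresh=0.001):
    """boxes: list of [x1, y1, x2, y2, score]; returns the kept detections."""
    boxes = [list(b) for b in boxes]
    kept = []
    while boxes:
        boxes.sort(key=lambda b: b[4], reverse=True)
        best = boxes.pop(0)                                  # keep the highest-scoring box
        kept.append(best)
        for b in boxes:
            b[4] *= math.exp(-iou(best, b) ** 2 / sigma)     # Gaussian score decay
        boxes = [b for b in boxes if b[4] > score_thresh]    # drop boxes whose score collapsed
    return kept
```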
The above examples are merely illustrative of the preferred embodiments of the present invention and are not intended to limit the scope of the present invention, and various modifications and improvements made by those skilled in the art to the technical solution of the present invention should fall within the protection scope of the present invention without departing from the design spirit of the present invention.

Claims (3)

1. The vehicle detection method based on the lightweight deep learning network is characterized by comprising the following steps of:
Step 1: collecting an initial video of a vehicle through a road monitoring camera, and transmitting the initial video to a calculation processing host for frame extraction to obtain a plurality of road vehicle images; manually labeling each vehicle labeling frame in each road vehicle image, and further manually labeling the vehicle category in each vehicle labeling frame in each road vehicle image; extracting the width of each vehicle marking frame in each road vehicle image and the height of each vehicle marking frame in each road vehicle image to form a wide-high data set;
Step 2: introducing a PSO particle swarm optimization algorithm, taking the marked frame width and height dimensions as variables to be optimized, improving a particle fitness function, and obtaining optimized K marked frame width and height dimensions by utilizing the global searching capability of the PSO algorithm;
Step 3: taking the optimized width and height dimensions of the K marking frames as initial values of priori frames to be generated by a K-means clustering algorithm, calculating the distance intersection ratio of each marking frame to each generated priori frame, and clustering by the K-means clustering algorithm to generate clustered optimized priori frame width and height dimensions;
Step4: adding depth separable convolution to a Res residual error module of YOLOv < 3 > deep learning network to obtain a lightweight YOLOv < 3 > deep learning network, inputting priori frame width and height data and a vehicle image dataset which are obtained by optimizing a K-means algorithm and a PSO algorithm and are suitable for the dataset into the lightweight YOLOv < 3 > deep learning network for training to obtain a trained lightweight YOLOv < 3 > deep learning network model;
Step 5: transmitting the traffic video acquired in real time to a calculation processing host for frame extraction to obtain a plurality of real-time road vehicle images, and further predicting by using a trained lightweight YOLOv3 deep learning network model to obtain a vehicle prediction frame in the plurality of real-time road vehicle images and the category of vehicles in the prediction frame;
In step 1, each vehicle labeling frame in each road vehicle image is:
Box_{m,n} = (x_{m,n}, y_{m,n}, w_{m,n}, h_{m,n}), m ∈ [1, M], n ∈ [1, N]
The width and height data set in step 1 is:
Φ = (w_{m,n}, h_{m,n}), m ∈ [1, M], n ∈ [1, N]
where Box_{m,n} denotes the nth vehicle labeling frame in the mth road vehicle image, Φ denotes the data set of the width and height dimensions of all vehicle labeling frames, x_{m,n} and y_{m,n} denote the abscissa and ordinate of the center point of the nth vehicle labeling frame in the mth road vehicle image, w_{m,n} denotes its width and h_{m,n} denotes its height; M denotes the number of road vehicle images, N denotes the number of vehicle labeling frames in each road vehicle image;
In step 1, the vehicle category in each circumscribed rectangular frame of each vehicle in each road vehicle image is:
type_{m,n}, type_{m,n} ∈ [1, 3]
where type_{m,n} = 1 indicates that the vehicle type in the nth vehicle labeling frame in the mth road vehicle image is a car, type_{m,n} = 2 indicates that it is a bus, and type_{m,n} = 3 indicates that it is a truck;
In step 2, the optimized width and height dimensions of the K labeling frames are obtained through the PSO algorithm, which comprises the following steps:
Step 2.1: randomly selecting K labeling frames from all the vehicle labeling frames of all the road vehicle images in step 1;
Step 2.2: taking the width and height data c_k = (w_k, h_k), k ∈ [1, K], of the K labeling frames as the initial values of the particle population center positions of the PSO particle swarm algorithm and initializing them;
where c_k = (w_k, h_k) denotes the width and height dimension data of the K randomly selected labeling frames;
Step 2.3: initializing the particle velocities V_k = 0, the individual optimal positions P_best(k) with the corresponding individual extrema f(P_best(k)), and the group optimal position G_best with the corresponding global extremum f(G_best); the particle swarm population size is N, i.e. the swarm consists of N particles p_j, j ∈ (1, 2, …, N);
Step 2.4: calculating the distance intersection over union DIOU between each particle p_j and the center point c_k = (w_k, h_k), and using it to construct the improved particle fitness function;
Step 2.5: comparing the fitness calculation results fit, and updating the individual extrema f(P_best(k)) and individual optimal positions P_best(k) of the particle swarm as well as the global extremum f(G_best) and the group optimal position G_best of the particle swarm;
Step 2.6: when the maximum number of iterations is reached, the algorithm ends; the optimal group position G_best = (P_1, P_2, P_3, …, P_K) is obtained through PSO optimization, the particle positions in the optimal group correspond to the optimized K particle coordinates P_k = (w'_k, h'_k), k ∈ [1, K], which are the optimized labeling frame width and height dimensions.
2. The method for vehicle detection based on a lightweight deep learning network of claim 1, wherein,
In step 3, clustering is performed by the K-means clustering algorithm to generate the cluster-optimized prior frame width and height dimensions; the specific process is as follows:
Step 3.1: first, the width and height dimensions P_k = (w'_k, h'_k), k ∈ [1, K], of the K labeling frames obtained by the PSO particle swarm optimization algorithm are read and used as the initial values of the K prior frames to be generated by the K-means clustering algorithm;
Step 3.2: the distance intersection over union of all vehicle labeling frames of each road vehicle image to the K prior frames is calculated, the distance values of all vehicle labeling frames to the K prior frames are further calculated according to the improved distance formula d, the distance values are compared, and each labeling frame is assigned to the class of the prior frame with which it has the smallest distance value;
The distance intersection over union of all vehicle labeling frames of each road vehicle image and the generated K prior frames is calculated; the DIOU formula is:
DIOU = IOU − ρ²(b, b_Box)/c²
The improved distance formula is:
d = 1 − DIOU
where b and b_Box denote the center points of the prior frame and the labeling frame respectively, ρ denotes the Euclidean distance between the two center points, and c denotes the diagonal length of the smallest enclosing box that can contain the prior frame and the labeling frame at the same time;
Step 3.3: for the labeling frames assigned to a given class of prior frame, the labeling frames are sorted by width and by height, and the median values are taken as the new prior frame of that class to update the prior frame width and height;
Step 3.4: the distance intersection over union and the distance value between each new prior frame and all vehicle labeling frames of each road vehicle image are calculated, and a new classification is carried out according to the above steps;
Step 3.5: Steps 3.3 and 3.4 are repeated until the prior frame width and height dimensions are no longer updated, and the K prior frame width and height dimensions (w″_k, h″_k), k ∈ [1, K], optimized by the K-means clustering algorithm are output.
3. The method for vehicle detection based on a lightweight deep learning network of claim 1, wherein,
The lightweight YOLOv3 deep learning network described in step 4 is:
The Res residual module in the YOLOv3 network model is modified with a depthwise separable convolution network; the basic module in the original YOLOv3 network structure is the DBL, which consists of a convolution layer, scale normalization and a Leaky_relu activation function, while the Res module draws on the residual structure of ResNet; in the Res module of the YOLOv3 network, depthwise separable convolution replaces the basic convolution operation: after the first 1×1 convolution operation, a 3×3 channel-by-channel convolution is performed, in which the multi-channel feature map of the previous layer is split into several single-channel feature maps, a single-channel convolution is performed on each of them, and they are then stacked together again; after the channel-by-channel convolution, another 1×1 point-by-point convolution is performed, whose function is to carry out a weighting operation in the depth direction to obtain a new feature map; adding the depthwise separable convolution operation to the residual module of YOLOv3 greatly reduces the amount of parameter computation in the network, so that the memory occupied by the YOLOv3 algorithm model is reduced;
the loss function model of the lightweight deep learning network described in step 4 is:
The loss function of the YOLOv3 algorithm is designed from three aspects: bounding-box coordinate prediction error, bounding-box confidence error and classification prediction error; the YOLOv3 loss function can be expressed as:
where G is the number of grids into which the image is divided, B is the number of predicted bounding boxes in each grid, i indexes the cells and j indexes the prior frames; an indicator term denotes whether the jth prior frame of the ith cell is responsible for predicting the object, taking the value 1 or 0; further terms denote the abscissa and ordinate of the center point and the width of the nth vehicle target frame predicted by the ith grid of the mth image of the image training set; type_{m,n,s,i} denotes the category of the nth vehicle target frame in the ith grid of the mth image of the image training set, and the corresponding predicted term denotes the category predicted by that grid; a further indicator term denotes whether there is no target in the jth anchor frame of the ith grid; the predicted confidence term denotes the vehicle class confidence of the nth vehicle target frame predicted by the ith grid of the mth image of the image training set, and p_i(type_{m,n,s,i}) denotes the true vehicle class confidence of the nth vehicle target frame of the ith grid of the mth image of the image training set.
CN202210250838.0A 2022-03-15 2022-03-15 Vehicle detection method based on lightweight deep learning network Active CN114898327B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210250838.0A CN114898327B (en) 2022-03-15 2022-03-15 Vehicle detection method based on lightweight deep learning network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210250838.0A CN114898327B (en) 2022-03-15 2022-03-15 Vehicle detection method based on lightweight deep learning network

Publications (2)

Publication Number Publication Date
CN114898327A CN114898327A (en) 2022-08-12
CN114898327B true CN114898327B (en) 2024-04-26

Family

ID=82715326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210250838.0A Active CN114898327B (en) 2022-03-15 2022-03-15 Vehicle detection method based on lightweight deep learning network

Country Status (1)

Country Link
CN (1) CN114898327B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100492B (en) * 2022-08-26 2023-04-07 摩尔线程智能科技(北京)有限责任公司 Yolov3 network training and PCB surface defect detection method and device
CN115565068B (en) * 2022-09-30 2023-04-18 宁波大学 Full-automatic detection method for breakage of high-rise building glass curtain wall based on light-weight deep convolutional neural network
CN116258721A (en) * 2023-05-16 2023-06-13 成都数之联科技股份有限公司 OLED panel defect judging method, device, equipment and medium
CN116863419A (en) * 2023-09-04 2023-10-10 湖北省长投智慧停车有限公司 Method and device for lightening target detection model, electronic equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
WO2021244079A1 (en) * 2020-06-02 2021-12-09 苏州科技大学 Method for detecting image target in smart home environment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10713542B2 (en) * 2018-10-24 2020-07-14 The Climate Corporation Detection of plant diseases with multi-stage, multi-scale deep learning
US11288507B2 (en) * 2019-09-27 2022-03-29 Sony Corporation Object detection in image based on stochastic optimization

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
WO2021244079A1 (en) * 2020-06-02 2021-12-09 苏州科技大学 Method for detecting image target in smart home environment

Also Published As

Publication number Publication date
CN114898327A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN114898327B (en) Vehicle detection method based on lightweight deep learning network
CN110991311B (en) Target detection method based on dense connection deep network
CN110929577A (en) Improved target identification method based on YOLOv3 lightweight framework
CN110348384B (en) Small target vehicle attribute identification method based on feature fusion
CN112016605B (en) Target detection method based on corner alignment and boundary matching of bounding box
CN112232371B (en) American license plate recognition method based on YOLOv3 and text recognition
CN111126359A (en) High-definition image small target detection method based on self-encoder and YOLO algorithm
CN111967313B (en) Unmanned aerial vehicle image annotation method assisted by deep learning target detection algorithm
CN111428625A (en) Traffic scene target detection method and system based on deep learning
CN111178451A (en) License plate detection method based on YOLOv3 network
CN112084890A (en) Multi-scale traffic signal sign identification method based on GMM and CQFL
CN113205026A (en) Improved vehicle type recognition method based on fast RCNN deep learning network
CN110659601A (en) Depth full convolution network remote sensing image dense vehicle detection method based on central point
CN112712102A (en) Recognizer capable of simultaneously recognizing known radar radiation source individuals and unknown radar radiation source individuals
CN116091946A (en) Yolov 5-based unmanned aerial vehicle aerial image target detection method
CN115131747A (en) Knowledge distillation-based power transmission channel engineering vehicle target detection method and system
CN115019039A (en) Example segmentation method and system combining self-supervision and global information enhancement
CN114550134A (en) Deep learning-based traffic sign detection and identification method
CN114648667A (en) Bird image fine-granularity identification method based on lightweight bilinear CNN model
CN111612803B (en) Vehicle image semantic segmentation method based on image definition
CN113496260A (en) Grain depot worker non-standard operation detection method based on improved YOLOv3 algorithm
CN117173697A (en) Cell mass classification and identification method, device, electronic equipment and storage medium
Huo et al. Traffic sign recognition based on improved SSD model
CN111832463A (en) Deep learning-based traffic sign detection method
CN114882490B (en) Unlimited scene license plate detection and classification method based on point-guided positioning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant