CN112270252A - Multi-vehicle target identification method for improving YOLOv2 model - Google Patents
Multi-vehicle target identification method for improving YOLOv2 model
- Publication number
- CN112270252A, CN202011158555.0A
- Authority
- CN
- China
- Prior art keywords
- target
- model
- value
- training
- layers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/54—Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a multi-vehicle target identification method based on an improved YOLOv2 model. First, sample data are collected in an actual traffic environment and divided into training-set and test-set sample images at a ratio of 7:3. Next, data enhancement is performed on the training-set sample images, including random scaling and adjustment of exposure and saturation, and the processed images are used as input for model training. Target-region feature vectors of the processed training set are then extracted through an improved Darknet-19 network, and the training set is input into the Darknet-19 network model for training to obtain a detection and identification model. Finally, the test set is input into the model for testing to obtain the multi-target vehicle identification result. The invention addresses the low detection rate, poor robustness and unsatisfactory classification of prior-art methods for multi-target road vehicle detection and vehicle type classification.
Description
Technical Field
The invention belongs to the technical field of image detection and classification, and particularly relates to a multi-vehicle target identification method for improving a YOLOv2 model.
Background
Image detection and image classification are important components of image processing and are widely applied in many fields, such as remote sensing image identification, military and criminal investigation, modern biomedicine and intelligent transportation. However, conventional target detection and identification methods, such as cascade classifiers based on Haar features, mainly detect specific targets and are of limited use for multi-class targets; their region selection process is complex and their detection and identification efficiency is low. Their hand-selected feature extraction is highly subjective, has poor robustness and generalizes weakly, so accurate identification is difficult to achieve in practical applications. Compared with these traditional methods, deep learning has obvious advantages, and vehicle detection and identification based on deep learning has become a current research trend.
Current deep-learning-based detection algorithms fall broadly into three directions: first, methods that extract candidate regions and classify the corresponding regions, mainly by deep learning, such as RCNN, SPP-net, Fast-RCNN and R-FCN; second, regression methods based on deep learning, such as YOLO and SSD; and third, the RRC method combined with the RNN algorithm and the Deformable CNN method combined with DPM. Vehicle detection methods based on CNN, R-CNN and Fast-RCNN models cannot achieve real-time detection in practical applications in terms of detection precision and detection speed. YOLOv2 is a real-time object detection algorithm that follows the design concept of end-to-end training and real-time detection: it goes directly from the input image to the detection output, takes the target position and the confidence score of the corresponding position directly as output, and omits the step of generating candidate boxes, which greatly shortens the detection time. The detection speed of YOLO can reach 45 FPS, but its detection and identification precision is slightly lower than that of Fast-RCNN. To improve the detection and identification precision, the invention improves the network model on the basis of YOLOv2, raising the precision and the robustness of the algorithm while keeping the original speed.
Disclosure of Invention
The invention aims to provide a multi-vehicle target identification method for improving a YOLOv2 model, and solves the problems of low detection rate, poor robustness and unsatisfactory classification effect of the prior art for multi-target detection of road vehicles and vehicle type classification methods.
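The technical scheme adopted by the invention is that the multi-vehicle target identification method for improving the YOLOv2 model is implemented according to the following steps:
step 1, collecting sample data in an actual traffic environment, and dividing the sample data into sample images of a training set and a test set at a ratio of 7:3;
step 2, performing data enhancement on the sample images of the training set, including random scaling of the sample images and adjustment of exposure and saturation, so that the processed images serve as input for model training;
step 3, extracting the target-region feature vectors of the training set processed in the step 2 through an improved Darknet-19 network;
step 4, inputting the training set in the step 2 into a Darknet-19 network model for training to obtain a detection and identification model;
step 5, inputting the test set obtained in the step 2 into the model obtained in the step 4 for testing to obtain a multi-target vehicle identification result.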
The present invention is also characterized in that,
the step 1 is as follows:
step 1.1, filming vehicles in a real-time road traffic environment, extracting frames from the captured video as images, and deleting pictures of poor image quality;
step 1.2, labeling the vehicles in the selected pictures with the LabelImg labeling tool: framing the target areas, classifying the vehicles in the target areas and creating labels, the labels being car, bus, van and truck; an xml file is generated for each picture, and the xml files are finally distributed at random using Matlab to generate a training set, a test set and a verification set, forming a complete data set;
step 1.3, the training set comprises 3 folders, namely Annotations, ImageSets and JPEGImages; the XML files are stored in the Annotations folder, each XML file corresponds to one image and stores the position and category information of each labeled target, and each XML file has the same name as its original image; text files named train.txt and test.txt are stored in the Main folder under the ImageSets folder, and their content is the names of the images to be used for training or testing; the JPEGImages folder stores the original images, named according to a uniform rule.
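The random distribution of labeled images into training and test lists can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the patent performs the distribution in Matlab and also produces a verification set, whereas the sketch below covers only the 7:3 training/test split, and the function names `split_dataset` and `write_image_sets`, the fixed seed and the directory argument are hypothetical.

```python
import os
import random


def split_dataset(image_names, train_ratio=0.7, seed=0):
    """Shuffle image names and split them into training and test lists."""
    names = sorted(image_names)      # sort first so the result depends only on the seed
    rng = random.Random(seed)        # hypothetical fixed seed for reproducibility
    rng.shuffle(names)
    n_train = round(len(names) * train_ratio)
    return names[:n_train], names[n_train:]


def write_image_sets(train_names, test_names, main_dir):
    """Write train.txt and test.txt (one image name per line) into main_dir."""
    os.makedirs(main_dir, exist_ok=True)
    for file_name, names in (("train.txt", train_names), ("test.txt", test_names)):
        with open(os.path.join(main_dir, file_name), "w") as handle:
            handle.write("\n".join(names) + "\n")
```

Calling `split_dataset` on the image names (without extension) and passing the two lists to `write_image_sets` with the path of the Main folder under ImageSets would produce the two text files described in step 1.3.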
The step 3 is as follows:
step 3.1, adjusting the number of convolution layers, the number of pooling layers, the number of BN layers and the activation function in the Darknet-19 network to obtain the improved YOLOv2-S network, wherein the YOLOv2-S network comprises 20 convolution layers, 5 pooling layers, 20 batch normalization (BN) layers and the Leaky-Linear activation function;
step 3.2, extracting the feature vectors, specifically as follows:
(1) layers 1, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24 and 31 are convolution layers, layers 2, 4, 8, 12 and 18 are maximum pooling layers, layers 26 and 29 are route layers, and layer 32 is the detection layer;
(2) among the convolution layers, the convolution kernels of layers 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and 23 are set to 3 × 3, with depths of 32, 64, 128, 256, 512, 1024 and 1024 respectively, and the convolution kernels of layers 6, 10, 14, 16, 20, 22, 24 and 27 are set to 1 × 1, with depths of 64, 128, 256, 512, 256, 1024 and 5030 respectively;
(3) the pooling kernels of the maximum pooling layers 2, 4, 8, 12 and 18 are set to 2 × 2, with a stride of 2;
(4) the route layers merge layers, that is, they fuse the features of several layers and pass them on together to the next layer; the route layer at layer 29 combines layer 28 and convolution layer 25 and outputs a feature vector.
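The effect of the five 2 × 2, stride-2 pooling layers and of a route layer on tensor shapes can be checked with a small calculation. The sketch below assumes that each pooling layer halves the spatial size, that convolutions preserve it, and that a route layer concatenates its inputs along the channel axis; the function names and the channel counts in the usage note are hypothetical, since the depths of the routed layers are not given here.

```python
def feature_map_size(input_size, num_pool_layers, stride=2):
    """Spatial size after a sequence of stride-2 max-pooling layers."""
    size = input_size
    for _ in range(num_pool_layers):
        size //= stride              # each 2 x 2 / stride-2 pooling halves width and height
    return size


def route_output_channels(*input_channels):
    """Channel count of a route layer that concatenates its inputs channel-wise."""
    return sum(input_channels)
```

With the 416 × 416 network input used for training, `feature_map_size(416, 5)` gives a 13 × 13 grid; routing two layers with, say, 1024 and 256 channels (illustrative values only) would give `route_output_channels(1024, 256) == 1280`.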
The step 4 is as follows:
step 4.1, inputting a training set, wherein the process is as follows:
step 4.1.1, dividing the picture obtained in the step 2 into s × s cells; if the center of a target to be identified falls into a cell, that cell is responsible for detecting the target; each cell then directly predicts the positions of the required B bounding boxes, and each bounding box yields 5 predicted values: $(t_x, t_y)$, $(t_w, t_h)$ and a Confidence.
The offset of the center of each bounding box from the boundary of the cell in which it lies is $\sigma(t_x)$, $\sigma(t_y)$; $(t_w, t_h)$ represent the actual width and height of the target in proportion to the width and height of the whole image; the distance between the cell of the bounding box and the upper-left corner of the image is $(c_x, c_y)$; and the width and height of the prior bounding box corresponding to the cell are $(p_w, p_h)$. Here x represents the length of the cell, y the width of the cell, w the width of the bounding box and h the height of the bounding box. The real position of the bounding box is then:
$$b_x = \sigma(t_x) + c_x$$
$$b_y = \sigma(t_y) + c_y$$
$$b_w = p_w e^{t_w}$$
$$b_h = p_h e^{t_h}$$
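The four position formulas above can be written out as a short Python function. This is a minimal sketch, assuming that σ is the logistic sigmoid and that all quantities are expressed in grid-cell units; the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, which maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw predictions (tx, ty, tw, th) into a box (bx, by, bw, bh)."""
    bx = sigmoid(tx) + cx            # center x: offset inside the cell plus cell position
    by = sigmoid(ty) + cy            # center y
    bw = pw * math.exp(tw)           # width: prior width scaled by e^tw
    bh = ph * math.exp(th)           # height: prior height scaled by e^th
    return bx, by, bw, bh
```

Because the sigmoid output stays between 0 and 1, the decoded center always remains inside the cell at $(c_x, c_y)$, which is what makes that cell responsible for the target.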
The Confidence of a bounding box expresses both the probability that the box contains a target to be detected and the precision of its predicted position; it is the product of that probability and the IOU between the bounding box and the real position. The Confidence of a candidate box is calculated as shown in the following formula:
$$\mathrm{Confidence} = \Pr(\mathrm{Object}) \times \mathrm{IOU}_{\mathrm{pred}}^{\mathrm{truth}}$$
wherein truth and pred denote the real value and the predicted value used in the IOU, and Pr(Object) represents the probability that a target exists in the grid cell: if a target exists in a grid cell, the value of Pr(Object) is 1; if no target appears, the value of Pr(Object) is 0, so that the Confidence is also 0;
step 4.1.2, clustering the real target boxes of the targets to be recognized that are labeled in the training set, using the area intersection-over-union (IOU) value as the evaluation index to obtain the initial candidate boxes of the predicted targets in the training set, and inputting the initial candidate boxes into the YOLOv2-S network model as initial parameters, specifically as follows:
the real target boxes of the training data set are clustered by the K-means method with the distance formula
$$d(\mathrm{box}, \mathrm{centroid}) = 1 - \mathrm{IOU}(\mathrm{box}, \mathrm{centroid})$$
wherein IOU(box, centroid) is the area intersection-over-union of the predicted target box and the real target box, and a box is taken as an initial target box when IOU(box, centroid) is not less than the threshold of 0.5;
the area intersection-over-union IOU(box, centroid) is given by the following formula:
$$\mathrm{IOU}(\mathrm{box}, \mathrm{centroid}) = \frac{box_{pred} \cap box_{truth}}{box_{pred} \cup box_{truth}}$$
wherein $box_{pred}$ represents the area of the predicted target box and $box_{truth}$ represents the area of the real target box; the ratio of their intersection to their union is the average intersection-over-union between the real target boxes and the predicted initial candidate boxes;
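The clustering with the distance 1 − IOU can be sketched in plain Python as follows. Several details are assumptions rather than part of the method as described: boxes are represented only by (width, height) and compared as if they shared a corner, the first k boxes serve as initial centroids, and a centroid is updated to the mean width and height of its cluster. The 0.5 threshold mentioned above is not applied here.

```python
def iou_wh(box, centroid):
    """IOU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_iou(boxes, k, max_iterations=100):
    """Cluster (width, height) boxes using d = 1 - IOU(box, centroid)."""
    centroids = [tuple(b) for b in boxes[:k]]   # assumption: first k boxes as seeds
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            distances = [1.0 - iou_wh(box, c) for c in centroids]
            clusters[distances.index(min(distances))].append(box)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:                          # mean width and height of the cluster
                updated.append((sum(w for w, _ in members) / len(members),
                                sum(h for _, h in members) / len(members)))
            else:                                # keep the centroid of an empty cluster
                updated.append(old)
        if updated == centroids:                 # assignments have stabilised
            break
        centroids = updated
    return centroids
```

The returned centroids play the role of the initial candidate boxes that are passed to the network as prior box sizes.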
step 4.1.3, when a target exists in a grid cell, the class of the target must be predicted, which is expressed by the conditional probability Pr(Class | Object); the value obtained by class prediction is multiplied by the Confidence of the candidate box to obtain the confidence C(M) of a class M, as shown in the following formula:
$$C(M) = \Pr(\mathrm{Class}_M \mid \mathrm{Object}) \times \mathrm{Confidence}$$
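The two confidence values described in steps 4.1.1 and 4.1.3 reduce to simple products. The sketch below assumes the box Confidence is the object probability times the IOU between the predicted and real boxes, and that the class confidence multiplies this by the conditional class probability; the function and argument names are hypothetical.

```python
def box_confidence(pr_object, iou_pred_truth):
    """Confidence of a candidate box: object probability times IOU with the real box."""
    return pr_object * iou_pred_truth


def class_confidence(pr_class_given_object, pr_object, iou_pred_truth):
    """Confidence C(M) for class M: class probability times the box confidence."""
    return pr_class_given_object * box_confidence(pr_object, iou_pred_truth)
```

When no target lies in the grid cell, `pr_object` is 0 and both values are 0 regardless of the class probability.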
step 4.2, performing 70000 iterations of training on the training set obtained in the step 1 using the Darknet-19 network, with the network input of the model set to 416 × 416, the weight decay set to 0.0005, the momentum set to 0.9 and the learning rate set to 0.001; training stops when the loss value output on the training data set falls below a threshold Q or the preset maximum number of iterations N is reached, giving the trained YOLOv2-S network model;
The loss function loss(object) consists of five terms. The first term is the coordinate loss computed for the anchors that predict a target, the third term is the confidence loss for those anchors, and the fifth term is their category loss; the second term adds a constraint that encourages each prediction to regress directly towards its own anchor box, and the fourth term is computed only for anchor boxes whose IOU is below the IOU threshold. The quantities appearing in the loss are: the error coefficient of the predicted coordinates; the error coefficient of the confidence for boxes that do not contain an identified object; $S^2$, the number of grid cells into which the input image is divided; B, the number of target boxes predicted by each grid cell; the abscissa and ordinate of the center point of the predicted target, and the width and height of the predicted target; the indicator that the i-th grid cell, in which the j-th candidate box is located, is responsible for detecting the object; the indicator that the i-th grid cell, in which the j-th candidate box is located, is not responsible for detecting the object; the actual abscissa and ordinate of the center of the target box; the predicted confidence that a target exists in the i-th grid cell of the j-th candidate box; the predicted probability that the target in the i-th grid cell of the j-th candidate box belongs to a given category; and the real probability that this target belongs to the given category;
step 4.3, the training process comprises forward propagation and backward propagation, and the model is saved every 1000 iterations; the momentum is 0.9, optimization uses stochastic gradient descent, the initial learning rate is 0.001 and the decay coefficient is set to 0.0005; the learning rate (learning_rate) is 0.001 for the first 10000 iterations, 0.0001 for iterations 10000 to 45000, and 0.00001 thereafter, finally giving the network model for detection and identification.
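The stepwise learning-rate schedule of step 4.3 can be expressed as a small lookup function. The text does not say which rate applies exactly at iterations 10000 and 45000, so the sketch assumes the lower rate takes effect from those iterations onward; the function name is hypothetical.

```python
def learning_rate(iteration):
    """Learning rate for a given training iteration (0-based)."""
    if iteration < 10000:
        return 0.001                 # first 10000 iterations
    if iteration < 45000:
        return 0.0001                # iterations 10000 to 45000
    return 0.00001                   # remaining iterations, up to 70000
```

A training loop would call `learning_rate(i)` at each iteration and save the model whenever `i % 1000 == 0`.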
The step 5 is as follows:
step 5.1, loading the network weight trained in the step 4, and inputting the test set obtained in the step 2 into the network trained in the step 4 to obtain a multi-scale feature map;
step 5.2, applying non-maximum suppression and retaining the prior box with the highest confidence score to obtain the final detection boxes and the classification result of the multi-target vehicles;
step 5.3, testing the existing original YOLOv2, YOLOv2-voc and YOLOv3 network models on the prepared data set;
step 5.4, evaluating the performance of the obtained model using the Recall, Precision and F1 evaluation indexes, which are calculated as follows:
$$\mathrm{Precision} = \frac{\mathrm{Correct}}{\mathrm{Proposal}}$$
$$\mathrm{Recall} = \frac{\mathrm{Correct}}{\mathrm{Total}}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
wherein Total represents the actual number of bounding boxes, that is, the number of actual targets to be detected; Correct represents the number of correctly detected bounding boxes: after a picture is fed into the network, the network outputs a number of candidate bounding boxes, each with a confidence probability; for the bounding boxes whose probability exceeds a set threshold, the IOU with the actual bounding boxes (the content of the txt files in labels) is calculated and the bounding box with the largest IOU is found, and if that maximum exceeds the preset IOU threshold, the count is increased by 1; Proposal represents the number of detected bounding boxes whose probability exceeds the set threshold; Precision represents the precision value; Recall represents the recall rate, which is the ratio of the number of detected targets to the number of all targets in the verification set; F1 represents the F1 score (F1-Score), also called the balanced F score, which is defined as the harmonic mean of precision and recall and so takes both into account; its value lies between 0 and 1, and a higher F1 indicates a better result.
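The three evaluation indexes can be computed directly from the counts Correct, Proposal and Total defined above. The sketch assumes Precision = Correct / Proposal and Recall = Correct / Total, which follows from those definitions, and takes F1 as the harmonic mean of the two; the function name and the zero-division handling are assumptions.

```python
def detection_metrics(correct, proposal, total):
    """Return (precision, recall, f1) from detection counts."""
    precision = correct / proposal if proposal else 0.0   # correct among thresholded detections
    recall = correct / total if total else 0.0            # correct among actual targets
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```

For illustrative counts (not results from the patent) of 8 correct detections out of 10 proposals with 16 actual targets, the function returns a precision of 0.8 and a recall of 0.5.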
The beneficial effects of the invention are as follows: the multi-vehicle target identification method based on the improved YOLOv2 model achieves end-to-end detection and identification of multiple target vehicles in actual traffic scenes, offers higher accuracy and robustness than traditional methods, and can identify multiple vehicle instances in an image sample in a single pass. The method improves on the basic Darknet-19 network model, increasing both the operation speed and the identification accuracy for multi-target vehicles. The invention thus provides an effective method for identifying multi-target vehicles, and a large number of experiments show that it has strong robustness and better identification performance than existing multi-target vehicle identification methods.
Drawings
FIG. 1 is a general flow chart of a multi-vehicle object recognition method of the present invention that improves the YOLOv2 model;
FIG. 2 is a graph of regression target calculation in a multiple vehicle target identification method of the present invention with an improved YOLOv2 model;
FIG. 3 is a model structure diagram of a multi-vehicle object recognition method of the present invention with improved YOLOv2 model;
FIG. 4 is an experimental comparison of the multi-vehicle target recognition method of the present invention with an improved YOLOv2 model, wherein diagram (a) is the YOLOv2 model, diagram (b) is the YOLOv2-voc model, diagram (c) is the YOLOv3 model, and diagram (d) is the YOLOv2-voc_mul model;
FIG. 5 is a partial experimental result display diagram of a multi-vehicle target identification method of the invention with an improved YOLOv2 model.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in the flow chart of FIG. 1, the multi-vehicle target recognition method of the invention for improving the YOLOv2 model is implemented according to the following steps:
the step 1 is as follows:
step 1.1, filming vehicles in a real-time road traffic environment, extracting frames from the captured video as images, and deleting pictures of poor image quality;
step 1.2, labeling the vehicles in the selected pictures with the LabelImg labeling tool: framing the target areas, classifying the vehicles in the target areas and creating labels, the labels being car, bus, van and truck; an xml file is generated for each picture, and the xml files are finally distributed at random using Matlab to generate a training set, a test set and a verification set, forming a complete data set;
step 1.3, the training set comprises 3 folders, namely Annotations, ImageSets and JPEGImages; the XML files are stored in the Annotations folder, each XML file corresponds to one image and stores the position and category information of each labeled target, and each XML file has the same name as its original image; text files named train.txt and test.txt are stored in the Main folder under the ImageSets folder, and their content is the names of the images to be used for training or testing; the JPEGImages folder stores the original images, named according to a uniform rule.
the step 3 is as follows:
step 3.1, adjusting the number of convolution layers, the number of pooling layers, the number of BN layers and the activation function in the Darknet-19 network to obtain the improved YOLOv2-S network, wherein the YOLOv2-S network comprises 20 convolution layers, 5 pooling layers, 20 batch normalization (BN) layers and the Leaky-Linear activation function;
step 3.2, extracting the feature vectors, specifically as follows:
(1) layers 1, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24 and 31 are convolution layers, layers 2, 4, 8, 12 and 18 are maximum pooling layers, layers 26 and 29 are route layers, and layer 32 is the detection layer;
(2) among the convolution layers, the convolution kernels of layers 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and 23 are set to 3 × 3, with depths of 32, 64, 128, 256, 512, 1024 and 1024 respectively, and the convolution kernels of layers 6, 10, 14, 16, 20, 22, 24 and 27 are set to 1 × 1, with depths of 64, 128, 256, 512, 256, 1024 and 5030 respectively;
(3) the pooling kernels of the maximum pooling layers 2, 4, 8, 12 and 18 are set to 2 × 2, with a stride of 2;
(4) the route layers merge layers, that is, they fuse the features of several layers and pass them on together to the next layer; the route layer at layer 29 combines layer 28 and convolution layer 25 and outputs a feature vector.
As shown in fig. 2 to fig. 3, step 4, inputting the training set in step 2 into a Darknet-19 network model for training to obtain a model for detection and recognition;
the step 4 is as follows:
step 4.1, inputting a training set, wherein the process is as follows:
step 4.1.1, dividing the picture obtained in the step 2 into s × s cells; if the center of a target to be identified falls into a cell, that cell is responsible for detecting the target; each cell then directly predicts the positions of the required B bounding boxes, and each bounding box yields 5 predicted values: $(t_x, t_y)$, $(t_w, t_h)$ and a Confidence.
The offset of the center of each bounding box from the boundary of the cell in which it lies is $\sigma(t_x)$, $\sigma(t_y)$; $(t_w, t_h)$ represent the actual width and height of the target in proportion to the width and height of the whole image; the distance between the cell of the bounding box and the upper-left corner of the image is $(c_x, c_y)$; and the width and height of the prior bounding box corresponding to the cell are $(p_w, p_h)$. Here x represents the length of the cell, y the width of the cell, w the width of the bounding box and h the height of the bounding box. The real position of the bounding box is then:
$$b_x = \sigma(t_x) + c_x$$
$$b_y = \sigma(t_y) + c_y$$
$$b_w = p_w e^{t_w}$$
$$b_h = p_h e^{t_h}$$
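The rule that the cell containing a target's center is responsible for detecting it can be illustrated as an index calculation. The sketch assumes pixel coordinates with the origin at the upper-left corner of the image and an s × s grid covering the whole image; the function name is hypothetical.

```python
def responsible_cell(center_x, center_y, image_width, image_height, s):
    """Return (row, col) of the grid cell that contains the target center."""
    col = min(int(center_x / image_width * s), s - 1)    # clamp centers on the right edge
    row = min(int(center_y / image_height * s), s - 1)   # clamp centers on the bottom edge
    return row, col
```

For a 416 × 416 input divided into a 13 × 13 grid, a target centered at pixel (208, 208) is assigned to cell (6, 6).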
The Confidence of a bounding box expresses both the probability that the box contains a target to be detected and the precision of its predicted position; it is the product of that probability and the IOU between the bounding box and the real position. The Confidence of a candidate box is calculated as shown in the following formula:
$$\mathrm{Confidence} = \Pr(\mathrm{Object}) \times \mathrm{IOU}_{\mathrm{pred}}^{\mathrm{truth}}$$
wherein truth and pred denote the real value and the predicted value used in the IOU, and Pr(Object) represents the probability that a target exists in the grid cell: if a target exists in a grid cell, the value of Pr(Object) is 1; if no target appears, the value of Pr(Object) is 0, so that the Confidence is also 0;
step 4.1.2, clustering the real target boxes of the targets to be recognized that are labeled in the training set, using the area intersection-over-union (IOU) value as the evaluation index to obtain the initial candidate boxes of the predicted targets in the training set, and inputting the initial candidate boxes into the YOLOv2-S network model as initial parameters, specifically as follows:
the real target boxes of the training data set are clustered by the K-means method with the distance formula
$$d(\mathrm{box}, \mathrm{centroid}) = 1 - \mathrm{IOU}(\mathrm{box}, \mathrm{centroid})$$
wherein IOU(box, centroid) is the area intersection-over-union of the predicted target box and the real target box, and a box is taken as an initial target box when IOU(box, centroid) is not less than the threshold of 0.5;
the area intersection-over-union IOU(box, centroid) is given by the following formula:
$$\mathrm{IOU}(\mathrm{box}, \mathrm{centroid}) = \frac{box_{pred} \cap box_{truth}}{box_{pred} \cup box_{truth}}$$
wherein $box_{pred}$ represents the area of the predicted target box and $box_{truth}$ represents the area of the real target box; the ratio of their intersection to their union is the average intersection-over-union between the real target boxes and the predicted initial candidate boxes;
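For boxes that have a position as well as a size, the ratio of intersection area to union area can be computed as in the following sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) corner coordinates, a representation chosen here for illustration only; the function name is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h                            # zero when the boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two 2 × 2 boxes offset by one unit in each direction overlap in a 1 × 1 square, so their IOU is 1 / (4 + 4 − 1) = 1/7.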
step 4.1.3, when a target exists in the grid cell, its category also needs to be predicted; this is expressed by the conditional probability Pr(class | object), and the value obtained by class prediction is multiplied by the Confidence of the candidate frame to obtain the confidence C(M) of a certain category M, as shown in the following formula:
$$C(M) = \Pr(\mathrm{class}_M \mid \mathrm{object}) \times \Pr(\mathrm{object}) \times \mathrm{IOU}_{\mathrm{pred}}^{\mathrm{truth}}$$
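As a worked example of the Confidence and class-confidence products described in steps 4.1.1 and 4.1.3, the sketch below multiplies Pr(object), the IOU and the conditional class probability. The function names and the numeric values are hypothetical and only illustrate the arithmetic.

```python
def box_confidence(pr_object, iou_truth_pred):
    """Confidence = Pr(object) * IOU between the predicted and the real box."""
    return pr_object * iou_truth_pred


def class_confidence(pr_class_given_object, pr_object, iou_truth_pred):
    """C(M) = Pr(class M | object) * Confidence of the candidate frame."""
    return pr_class_given_object * box_confidence(pr_object, iou_truth_pred)


# A grid cell that contains a target (Pr(object) = 1) with an assumed IOU of 0.8
# and an assumed class probability of 0.9 for category "car".
c_car = class_confidence(0.9, 1.0, 0.8)

# A grid cell without a target: Pr(object) = 0 forces the confidence to 0.
c_empty = class_confidence(0.9, 0.0, 0.8)
```

The second call shows why an empty grid cell contributes no class score regardless of the class probabilities it outputs.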
step 4.2, using the Darknet-19 network to carry out 70000 iterations of training on the training set obtained in step 1, setting the network input of the model to 416 × 416, adopting the gradient descent algorithm, setting decay to 0.0005, momentum to 0.9 and the learning rate to 0.001, and stopping training once the loss value output on the training data set is smaller than a certain threshold Q or the preset maximum number of iterations N is reached, to obtain the trained YOLOv2-S network model;
the loss function loss(object) consists of five terms:
the first term of the loss function is the coordinate loss of the anchor boxes of the predicted target, the third term is the confidence loss of the anchor boxes of the predicted target, and the fifth term is the category loss of the anchor boxes of the predicted target; the second term adds a constraint so that each prediction regresses directly toward its own anchor box, and the fourth term is calculated only for anchor boxes that are below the IOU threshold. In the loss function, one error coefficient weights the predicted coordinates and another weights the confidence of boxes that do not contain a target; S² represents the number of grid cells into which the input image is divided; B represents the number of target frames predicted by each grid cell; the predicted quantities are the abscissa and the ordinate of the center point of the predicted target and the width and the height of the predicted target; one indicator expresses that the j-th candidate box of the i-th grid cell is responsible for detecting the object, and another expresses that it is not responsible for detecting the object; the real quantities are the actual abscissa and the actual ordinate of the center point of the target frame; the remaining quantities are the prediction confidence that a target exists in the j-th candidate box of the i-th grid cell, the predicted probability that this target belongs to a certain category, and the real probability that this target belongs to that category;
step 4.3, the training process specifically includes forward propagation and backward propagation, and the model is saved every 1000 iterations; the momentum is 0.9, optimization is performed by stochastic gradient descent, the initial learning rate is 0.001 and the decay coefficient is set to 0.0005; the learning rate (learning_rate) is 0.001 for the first 10000 iterations, 0.0001 for iterations 10000 to 45000 and 0.00001 thereafter, and the network model for detection and identification is finally obtained.
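The learning-rate steps and the checkpoint interval of step 4.3 can be written as a small schedule function. This is only a sketch under stated assumptions: the text does not say whether iteration 10000 and iteration 45000 belong to the earlier or the later stage, so the boundaries below are a choice, and the names `learning_rate` and `should_save` are hypothetical.

```python
def learning_rate(iteration):
    """Piecewise-constant learning rate following the three stages of step 4.3."""
    if iteration < 10000:      # first 10000 iterations
        return 0.001
    if iteration < 45000:      # iterations 10000 to 45000 (boundary is an assumption)
        return 0.0001
    return 0.00001             # remaining iterations up to 70000


def should_save(iteration, every=1000):
    """True when a checkpoint is due, i.e. every 1000 completed iterations."""
    return iteration > 0 and iteration % every == 0


# Iterations at which the rate changes during a 70000-iteration run.
changes = [i for i in range(1, 70000) if learning_rate(i) != learning_rate(i - 1)]
```

Here `changes` lists the two iterations at which the rate drops by a factor of ten.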
Step 5, inputting the test set in step 2 into the model obtained in step 4 for testing to obtain the multi-target vehicle identification result.
Step 5 is specifically as follows:
step 5.1, loading the network weights trained in step 4, and inputting the test set obtained in step 2 into the network trained in step 4 to obtain multi-scale feature maps;
step 5.2, applying non-maximum suppression to retain the prior frame with the highest confidence score, to obtain the finally identified detection frames and the classification results of the multi-target vehicles;
step 5.3, testing the existing original YOLOv2, YOLOv2-voc and YOLOv3 network models with the prepared data set;
step 5.4, evaluating the performance of the obtained model with the evaluation indices recall rate (Recall), precision (Precision) and F1 value, which are calculated as follows:
$$\mathrm{Precision} = \frac{\mathrm{Correct}}{\mathrm{Proposal}}, \quad \mathrm{Recall} = \frac{\mathrm{Correct}}{\mathrm{Total}}, \quad F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
wherein Total represents the number of actual bounding boxes, namely the number of actual targets to be detected; Correct represents the number of correctly detected bounding boxes: after a picture is put into the network, the network detects a number of bounding boxes, each with a confidence probability; for each bounding box whose probability is larger than a set threshold, the IOU with the actual bounding boxes (the content of the txt files in labels) is calculated and the actual bounding box with the largest IOU is found, and if this maximum value is larger than the preset IOU threshold, the count value is increased by 1; Proposal represents the number of detected bounding boxes whose probability is larger than the set threshold; Precision represents the precision; Recall represents the recall rate, which is the ratio of the number of detected targets to the number of all targets in the verification set; F1 represents the F1 score (F1-Score), also called the balanced F score, which is defined as the harmonic mean of precision and recall and therefore takes both the recall and the precision of the model into account; its value ranges from 0 to 1, and the higher the F1, the better the effect.
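The non-maximum suppression used in step 5.2 can be sketched as a greedy procedure. The code below is an illustration under assumptions, not the implementation of the invention: frames are given as (x1, y1, x2, y2) corner coordinates, the suppression threshold of 0.5 is an assumed value that the text does not specify, all frames are treated as one category, and the names `iou_xyxy` and `nms` are hypothetical.

```python
def iou_xyxy(a, b):
    """IOU of two frames given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the frames kept by greedy non-maximum suppression."""
    # Visit frames from the highest confidence score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a frame only if it does not overlap a kept frame too strongly.
        if all(iou_xyxy(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two overlapping detections of one vehicle and one separate detection.
detections = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
confidences = [0.9, 0.8, 0.7]
kept = nms(detections, confidences)
```

In this example the second frame overlaps the first with an IOU of about 0.68 and is suppressed, so only the first and third frames remain.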
FIG. 4 shows the verification results of the 4 models on the training and verification sets, wherein (a) is the YOLOv2 model, (b) is the YOLOv2-voc model, (c) is the YOLOv3 model, and (d) is the YOLOv2-voc_mul model. It can be seen that the recall rates of all 4 models fluctuate strongly at first, but as the number of detected targets increases, the recall rate of the YOLOv2 model gradually stabilizes at 96%, that of YOLOv2-voc tends to 94.5%, and that of the improved YOLOv2-voc_mul model stabilizes at 95.5%, while the recall rate of the YOLOv3 model fluctuates between 40% and 60%; this shows that the 3 YOLOv2-based models achieve good accuracy against a simple background, whereas the accuracy of YOLOv3 is low. The accuracy curve of the YOLOv2 model fluctuates strongly; the accuracy curve of the YOLOv2-voc model jumps as the number of targets increases and then gradually stabilizes at 98.6%; the improved YOLOv2-voc_mul model stabilizes at about 99.2% after a relatively small jump and maintains good accuracy and stability; the accuracy curve of the YOLOv3 model jumps strongly, and its final accuracy fluctuates around 60%. Comparing the IOU curves of the 4 models, the IOU of the YOLOv2 model fluctuates between 0.75 and 0.83, that is, the stability of its detections is low; the IOU of the YOLOv2-voc and YOLOv2-voc_mul models is improved relative to YOLOv2 and stays between 0.8 and 0.83; the curves show that the IOU of YOLOv2-voc_mul is similar to that of YOLOv2-voc and fluctuates around 0.83 as the number of targets increases; the IOU of YOLOv3 fluctuates only between 0.4 and 0.7, making it the least stable of the 4 models.
FIG. 5 shows part of the detection results of the YOLOv2-voc_mul model. The results show that the different vehicle categories are detected and correctly labeled as car, bus, van or truck.
Table 1 lists the evaluation indices of the multi-vehicle target identification method based on the improved YOLOv2 model of the present invention.
Table 1 Evaluation indices
Model | Total | Correct | Proposal | Precision (%) | Recall (%) | F1 (%) |
---|---|---|---|---|---|---|
YOLOv2 | 154 | 147 | 152 | 96.71 | 95.45 | 96.07 |
YOLOv2-voc | 154 | 143 | 147 | 97.28 | 92.86 | 95.01 |
YOLOv3 | 154 | 84 | 151 | 55.63 | 54.55 | 55.08 |
YOLOv2-S | 154 | 146 | 148 | 98.62 | 94.81 | 96.67 |
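The percentage columns of Table 1 follow from the count columns. The sketch below recomputes them for the YOLOv3 row, assuming Precision = Correct / Proposal and Recall = Correct / Total, which is how the quantities are described in step 5.4; the function name `evaluate` is hypothetical.

```python
def evaluate(total, correct, proposal):
    """Return (precision, recall, f1) as fractions between 0 and 1."""
    precision = correct / proposal   # correct detections among all detections
    recall = correct / total         # correct detections among all real targets
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# YOLOv3 row of Table 1: Total = 154, Correct = 84, Proposal = 151.
p, r, f1 = evaluate(total=154, correct=84, proposal=151)
print(f"{100 * p:.2f} {100 * r:.2f} {100 * f1:.2f}")  # prints: 55.63 54.55 55.08
```

Because F1 is a harmonic mean, it stays close to the lower of the two values, which is why a model cannot reach a high F1 through precision or recall alone.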
Claims (5)
1. A multi-vehicle target identification method based on an improved YOLOv2 model, characterized by comprising the following steps:
step 1, collecting sample data in an actual traffic environment, and dividing the sample images into a training set and a test set at a ratio of 7:3;
step 2, performing data enhancement on the sample images of the training set, including random scaling of the sample images and adjustment of exposure and saturation, so that the processed images are used as the input of the training model;
step 3, extracting the target region feature vectors of the training set processed in step 2 through an improved Darknet-19 network;
step 4, inputting the training set in step 2 into the Darknet-19 network model for training to obtain a detection and identification model;
step 5, inputting the test set in step 2 into the model obtained in step 4 for testing to obtain the multi-target vehicle identification result.
2. The method for identifying multiple vehicle targets based on an improved YOLOv2 model according to claim 1, wherein the step 1 is as follows:
step 1.1, recording vehicles in a real-time road traffic environment, extracting frames from the recorded video as images, and deleting pictures with poor image quality;
step 1.2, labeling the vehicles in the selected pictures with the LabImage labeling tool: framing each target area, classifying the vehicle in the target area and creating its label, wherein the labels are car, bus, van and truck and one xml file is generated for each picture; finally, randomly distributing the xml files with Matlab to generate a training set, a test set and a verification set that together form the complete data set;
step 1.3, the training set comprises 3 folders, namely Annotations, ImageSets and JPEGImages; the XML files are stored in the Annotations folder, each XML file corresponds to one image, has the same name as the corresponding original image, and stores the position and category information of each marked target; text files named train.txt and test.txt are stored in the Main folder under the ImageSets folder, and their content is the names of the images to be used for training or testing; the JPEGImages folder stores the original images, which are named according to a uniform rule.
3. The method for identifying multiple vehicle targets based on an improved YOLOv2 model according to claim 2, wherein the step 3 is as follows:
step 3.1, adjusting the number of convolution layers, the number of pooling layers, the number of BN layers and the activation function in the Darknet-19 network to obtain the improved YOLOv2-S network, wherein the YOLOv2-S network comprises 20 convolution layers, 5 pooling layers and 20 batch normalization (BN) layers, and uses the Leaky-Linear activation function;
step 3.2, extracting the feature vectors, specifically as follows:
(1) layers 1, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24 and 31 are convolution layers, layers 2, 4, 8, 12 and 18 are maximum pooling layers, layers 26 and 29 are route layers, and layer 32 is the detection layer;
(2) among the convolution layers, the convolution kernels of layers 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and 23 are set to 3 × 3, with depths of 32, 64, 128, 256, 512, 1024 and 1024 respectively, and the convolution kernels of layers 6, 10, 14, 16, 20, 22, 24 and 27 are set to 1 × 1, with depths of 64, 128, 256, 512, 256, 1024 and 5030 respectively;
(3) the kernels of the maximum pooling layers, layers 2, 4, 8, 12 and 18, are set to 2 × 2 with a stride of 2;
(4) the route layers are used for layer merging, namely the features of several layers are fused and output together to the next layer; the route layer at layer 29 merges layer 28 and layer 25 and outputs a feature vector.
4. The method for identifying multiple vehicle targets based on the improved YOLOv2 model of claim 3, wherein the step 4 is as follows:
step 4.1, inputting a training set, wherein the process is as follows:
step 4.1.1, dividing the picture obtained in step 2 into S × S cells; if the center of a target to be identified falls into a cell, that cell is responsible for detecting the target; each cell then directly predicts the positions of the required B bounding boxes, and each bounding box obtains 5 predicted values, namely $(t_x, t_y)$, $(t_w, t_h)$ and a Confidence;
the offset of the center of each bounding box from the boundary of the cell in which it is located is $(\sigma(t_x), \sigma(t_y))$; $(t_w, t_h)$ describes the actual width and height of the target relative to the width and height of the whole image; the distance of the cell from the upper left corner of the image is $(c_x, c_y)$; the width and height of the prior bounding box corresponding to the cell are $(p_w, p_h)$; the subscript x refers to the length direction of the cell, y to the width direction of the cell, w to the width of the bounding box and h to the height of the bounding box; the real position of the bounding box is as follows:
$$b_x = \sigma(t_x) + c_x$$
$$b_y = \sigma(t_y) + c_y$$
$$b_w = p_w e^{t_w}$$
$$b_h = p_h e^{t_h}$$
the Confidence of a bounding box expresses both the probability that the box contains a target to be detected and the accuracy of its predicted position; it is the product of that probability and the IOU between the bounding box and the real position, and the Confidence of a candidate frame is calculated as shown in the following formula:
$$\mathrm{Confidence} = \Pr(\mathrm{object}) \times \mathrm{IOU}_{\mathrm{pred}}^{\mathrm{truth}}$$
wherein truth denotes the real box and pred denotes the predicted box, so that $\mathrm{IOU}_{\mathrm{pred}}^{\mathrm{truth}}$ is the IOU between them; Pr(object) represents the probability that a target exists in the grid cell: if a target exists in the grid cell, the value of Pr(object) is 1; if no target appears, the value of Pr(object) is 0, and the value of Confidence is therefore also 0;
step 4.1.2, clustering the real target frames of the targets to be recognized that are marked in the training set, using the intersection over union (IOU) value as the evaluation index to obtain the initial candidate frames of the predicted targets in the training set, and inputting the initial candidate frames into the YOLOv2-S network model as initial parameters, wherein the specific steps are as follows:
the real target frames of the training data set are clustered by the K-means method with the distance formula d(box, centroid) = 1 - IOU(box, centroid), wherein IOU(box, centroid) is the intersection over union of the predicted target frame and the real target frame; a candidate frame predicted with an IOU(box, centroid) not less than the threshold of 0.5 is taken as an initial target frame;
the intersection over union IOU(box, centroid) is calculated as shown in the following formula:
$$\mathrm{IOU}(box, centroid) = \frac{box_{pred} \cap box_{truth}}{box_{pred} \cup box_{truth}}$$
wherein $box_{pred}$ represents the area of the predicted target frame and $box_{truth}$ represents the area of the real target frame; the ratio of their intersection to their union is the intersection over union of the real target frame and the predicted initial candidate frame;
step 4.1.3, when a target exists in the grid cell, its category also needs to be predicted; this is expressed by the conditional probability Pr(class | object), and the value obtained by class prediction is multiplied by the Confidence of the candidate frame to obtain the confidence C(M) of a certain category M, as shown in the following formula:
$$C(M) = \Pr(\mathrm{class}_M \mid \mathrm{object}) \times \Pr(\mathrm{object}) \times \mathrm{IOU}_{\mathrm{pred}}^{\mathrm{truth}}$$
step 4.2, using the Darknet-19 network to carry out 70000 iterations of training on the training set obtained in step 1, setting the network input of the model to 416 × 416, adopting the gradient descent algorithm, setting decay to 0.0005, momentum to 0.9 and the learning rate to 0.001, and stopping training once the loss value output on the training data set is smaller than a certain threshold Q or the preset maximum number of iterations N is reached, to obtain the trained YOLOv2-S network model;
the loss function loss(object) consists of five terms:
the first term of the loss function is the coordinate loss of the anchor boxes of the predicted target, the third term is the confidence loss of the anchor boxes of the predicted target, and the fifth term is the category loss of the anchor boxes of the predicted target; the second term adds a constraint so that each prediction regresses directly toward its own anchor box, and the fourth term is calculated only for anchor boxes that are below the IOU threshold. In the loss function, one error coefficient weights the predicted coordinates and another weights the confidence of boxes that do not contain a target; S² represents the number of grid cells into which the input image is divided; B represents the number of target frames predicted by each grid cell; the predicted quantities are the abscissa and the ordinate of the center point of the predicted target and the width and the height of the predicted target; one indicator expresses that the j-th candidate box of the i-th grid cell is responsible for detecting the object, and another expresses that it is not responsible for detecting the object; the real quantities are the actual abscissa and the actual ordinate of the center point of the target frame; the remaining quantities are the prediction confidence that a target exists in the j-th candidate box of the i-th grid cell, the predicted probability that this target belongs to a certain category, and the real probability that this target belongs to that category;
step 4.3, the training process specifically includes forward propagation and backward propagation, and the model is saved every 1000 iterations; the momentum is 0.9, optimization is performed by stochastic gradient descent, the initial learning rate is 0.001 and the decay coefficient is set to 0.0005; the learning rate (learning_rate) is 0.001 for the first 10000 iterations, 0.0001 for iterations 10000 to 45000 and 0.00001 thereafter, and the network model for detection and identification is finally obtained.
5. The method for identifying multiple vehicle targets based on the improved YOLOv2 model of claim 4, wherein the step 5 is as follows:
step 5.1, loading the network weights trained in step 4, and inputting the test set obtained in step 2 into the network trained in step 4 to obtain multi-scale feature maps;
step 5.2, applying non-maximum suppression to retain the prior frame with the highest confidence score, to obtain the finally identified detection frames and the classification results of the multi-target vehicles;
step 5.3, testing the existing original YOLOv2, YOLOv2-voc and YOLOv3 network models with the prepared data set;
step 5.4, evaluating the performance of the obtained model with the evaluation indices recall rate (Recall), precision (Precision) and F1 value, which are calculated as follows:
$$\mathrm{Precision} = \frac{\mathrm{Correct}}{\mathrm{Proposal}}, \quad \mathrm{Recall} = \frac{\mathrm{Correct}}{\mathrm{Total}}, \quad F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
wherein Total represents the number of actual bounding boxes, namely the number of actual targets to be detected; Correct represents the number of correctly detected bounding boxes: after a picture is put into the network, the network detects a number of bounding boxes, each with a confidence probability; for each bounding box whose probability is larger than a set threshold, the IOU with the actual bounding boxes (the content of the txt files in labels) is calculated and the actual bounding box with the largest IOU is found, and if this maximum value is larger than the preset IOU threshold, the count value is increased by 1; Proposal represents the number of detected bounding boxes whose probability is larger than the set threshold; Precision represents the precision; Recall represents the recall rate, which is the ratio of the number of detected targets to the number of all targets in the verification set; F1 represents the F1 score (F1-Score), also called the balanced F score, which is defined as the harmonic mean of precision and recall and therefore takes both the recall and the precision of the model into account; its value ranges from 0 to 1, and the higher the F1, the better the effect.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011158555.0A CN112270252A (en) | 2020-10-26 | 2020-10-26 | Multi-vehicle target identification method for improving YOLOv2 model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112270252A true CN112270252A (en) | 2021-01-26 |
Family
ID=74342539
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011158555.0A Pending CN112270252A (en) | 2020-10-26 | 2020-10-26 | Multi-vehicle target identification method for improving YOLOv2 model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112270252A (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113139945A (en) * | 2021-02-26 | 2021-07-20 | 山东大学 | Intelligent image detection method, equipment and medium for air conditioner outdoor unit based on Attention + YOLOv3 |
CN113780270A (en) * | 2021-03-23 | 2021-12-10 | 京东鲲鹏(江苏)科技有限公司 | Target detection method and device |
CN112949750A (en) * | 2021-03-25 | 2021-06-11 | 清华大学深圳国际研究生院 | Image classification method and computer readable storage medium |
CN112926681A (en) * | 2021-03-29 | 2021-06-08 | 复旦大学 | Target detection method and device based on deep convolutional neural network |
CN112926681B (en) * | 2021-03-29 | 2022-11-29 | 复旦大学 | Target detection method and device based on deep convolutional neural network |
CN113076858A (en) * | 2021-03-30 | 2021-07-06 | 深圳技术大学 | Vehicle information detection method based on deep learning, storage medium and terminal device |
CN112990065A (en) * | 2021-03-31 | 2021-06-18 | 上海海事大学 | Optimized YOLOv5 model-based vehicle classification detection method |
CN112990065B (en) * | 2021-03-31 | 2024-03-22 | 上海海事大学 | Vehicle classification detection method based on optimized YOLOv5 model |
CN113283307A (en) * | 2021-04-30 | 2021-08-20 | 北京雷石天地电子技术有限公司 | Method and system for identifying object in video and computer storage medium |
CN113134683A (en) * | 2021-05-13 | 2021-07-20 | 兰州理工大学 | Laser marking method and device based on machine learning |
CN113343785A (en) * | 2021-05-19 | 2021-09-03 | 山东大学 | YOLO ground mark detection method and equipment based on perspective downsampling and storage medium |
CN113298167A (en) * | 2021-06-01 | 2021-08-24 | 北京思特奇信息技术股份有限公司 | Character detection method and system based on lightweight neural network model |
CN113538390B (en) * | 2021-07-23 | 2023-05-09 | 仲恺农业工程学院 | Quick identification method for shaddock diseases and insect pests |
CN113537106B (en) * | 2021-07-23 | 2023-06-02 | 仲恺农业工程学院 | Fish ingestion behavior identification method based on YOLOv5 |
CN113537106A (en) * | 2021-07-23 | 2021-10-22 | 仲恺农业工程学院 | Fish feeding behavior identification method based on YOLOv5 |
CN113538389A (en) * | 2021-07-23 | 2021-10-22 | 仲恺农业工程学院 | Pigeon egg quality identification method |
CN113538389B (en) * | 2021-07-23 | 2023-05-09 | 仲恺农业工程学院 | Pigeon egg quality identification method |
CN113538390A (en) * | 2021-07-23 | 2021-10-22 | 仲恺农业工程学院 | Quick identification method for shaddock diseases and insect pests |
CN113808200B (en) * | 2021-08-03 | 2023-04-07 | 嘉洋智慧安全科技(北京)股份有限公司 | Method and device for detecting moving speed of target object and electronic equipment |
CN113808200A (en) * | 2021-08-03 | 2021-12-17 | 嘉洋智慧安全生产科技发展(北京)有限公司 | Method and device for detecting moving speed of target object and electronic equipment |
CN113743233B (en) * | 2021-08-10 | 2023-08-01 | 暨南大学 | Vehicle model identification method based on YOLOv5 and MobileNet V2 |
CN113743233A (en) * | 2021-08-10 | 2021-12-03 | 暨南大学 | Vehicle model identification method based on YOLOv5 and MobileNet V2 |
CN113808080B (en) * | 2021-08-12 | 2023-10-24 | 常州大学 | Method for detecting number of interference fringes of glass panel of camera hole of mobile phone |
CN113808080A (en) * | 2021-08-12 | 2021-12-17 | 常州大学 | Method for detecting number of interference fringes of glass panel of mobile phone camera hole |
CN113850799A (en) * | 2021-10-14 | 2021-12-28 | 长春工业大学 | YOLOv5-based trace DNA extraction workstation workpiece detection method |
CN113850799B (en) * | 2021-10-14 | 2024-06-07 | 长春工业大学 | YOLOv5-based trace DNA extraction workstation workpiece detection method |
CN113963299A (en) * | 2021-10-26 | 2022-01-21 | 大连民族大学 | Table tennis ball detection method based on improved YOLO V4 algorithm |
CN114387520A (en) * | 2022-01-14 | 2022-04-22 | 华南农业大学 | Precision detection method and system for intensive plums picked by robot |
CN114387520B (en) * | 2022-01-14 | 2024-05-14 | 华南农业大学 | Method and system for accurately detecting dense plums for robot picking |
CN114648513A (en) * | 2022-03-29 | 2022-06-21 | 华南理工大学 | Motorcycle detection method based on self-labeling data augmentation |
CN114972807A (en) * | 2022-05-17 | 2022-08-30 | 北京百度网讯科技有限公司 | Method and device for determining image recognition accuracy, electronic equipment and medium |
CN117612021A (en) * | 2023-10-19 | 2024-02-27 | 广州大学 | Remote sensing extraction method and system for agricultural plastic greenhouse |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112270252A (en) | Multi-vehicle target identification method for improving YOLOv2 model | |
CN111062413B (en) | Road target detection method and device, electronic equipment and storage medium | |
CN110059554B (en) | Multi-branch target detection method based on traffic scene | |
CN109902677B (en) | Vehicle detection method based on deep learning | |
CN110188705B (en) | Remote traffic sign detection and identification method suitable for vehicle-mounted system | |
CN109087510B (en) | Traffic monitoring method and device | |
CN110348384B (en) | Small target vehicle attribute identification method based on feature fusion | |
CN107784288B (en) | Iterative positioning type face detection method based on deep neural network | |
CN111461083A (en) | Rapid vehicle detection method based on deep learning | |
CN112016605B (en) | Target detection method based on corner alignment and boundary matching of bounding box | |
CN111275044A (en) | Weak supervision target detection method based on sample selection and self-adaptive hard case mining | |
CN108960074B (en) | Small-size pedestrian target detection method based on deep learning | |
CN110288017B (en) | High-precision cascade target detection method and device based on dynamic structure optimization | |
CN112084890B (en) | Method for identifying traffic signal sign in multiple scales based on GMM and CQFL | |
CN109858327B (en) | Character segmentation method based on deep learning | |
CN109087337B (en) | Long-time target tracking method and system based on hierarchical convolution characteristics | |
CN111079540A (en) | Target characteristic-based layered reconfigurable vehicle-mounted video target detection method | |
CN115170611A (en) | Complex intersection vehicle driving track analysis method, system and application | |
CN114332921A (en) | Pedestrian detection method based on improved clustering algorithm for Faster R-CNN network | |
CN114529581A (en) | Multi-target tracking method based on deep learning and multi-task joint training | |
CN115620518A (en) | Intersection traffic conflict discrimination method based on deep learning | |
CN116964588A (en) | Target detection method, target detection model training method and device | |
CN114219936A (en) | Object detection method, electronic device, storage medium, and computer program product | |
CN112819100A (en) | Multi-scale target detection method and device for unmanned aerial vehicle platform | |
CN116311004A (en) | Video moving target detection method based on sparse optical flow extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||