CN110929577A - An improved target recognition method based on YOLOv3 lightweight framework


Info

Publication number
CN110929577A
Authority
CN
China
Prior art keywords
yolov3
tiny
data set
sample data
network model
Prior art date
Legal status
Pending
Application number
CN201911013341.1A
Other languages
Chinese (zh)
Inventor
陈名松
张泽功
吴泳蓉
吴冉冉
Current Assignee
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority: CN201911013341.1A
Publication: CN110929577A
Legal status: Pending

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/50 - Context or environment of the image
    • G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/23 - Clustering techniques
    • G06F 18/232 - Non-hierarchical techniques
    • G06F 18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions, with fixed number of clusters, e.g. K-means clustering


Abstract


The invention discloses an improved target recognition method based on the lightweight framework of YOLOv3, which performs target detection and recognition by combining YOLOv3-tiny, the lightweight version of YOLOv3, with SENet to obtain YOLOv3-tiny-SE. Specifically, the method includes: collecting pictures of vehicles, pedestrians and traffic environments under different road conditions, driving environments and weather conditions; preprocessing and augmenting the collected data; building and refining a target recognition sample set and labeling it; dividing the sample set into a training set and a test set; embedding the SENet structure in YOLOv3-tiny to obtain YOLOv3-tiny-SE; training YOLOv3-tiny-SE on the training set, testing it on the test set, and then comparing its performance with YOLOv3-tiny. The target recognition method proposed by the invention has strong generalization ability, speeds up target detection, improves the accuracy of small-target detection, and improves the robustness of the model parameters to noise.


Description

Improved target identification method based on YOLOv3 lightweight framework
Technical Field
The invention relates to the field of computer vision and deep learning, in particular to an improved target identification method based on the lightweight framework of YOLOv3.
Background
In unmanned driving, real-time road-condition video images (containing information about pedestrians, vehicles, traffic signs and the like) collected by camera devices such as driving recorders are modeled to obtain drive-by-wire state parameters, which are then fed into the vehicle's decision and control network model to control vehicle behavior. Target detection is a precondition of behavior decision, and deep-learning-based target detection not only ensures the accuracy of multi-target detection and classification but also meets real-time processing requirements. Currently, mainstream machine-learning-based target detection methods fall into two major families: methods based on region proposals and methods based on regression.
Region-proposal-based methods mainly include R-CNN, SPP-Net, Fast R-CNN and Faster R-CNN. Fast R-CNN greatly reduces computation time by sharing the feature layers; in addition, classification switches from an SVM model to SoftMax, and classification and regression are carried out simultaneously in a multi-task fashion, so detection time is reduced to some extent. However, the selective search that produces all the candidate boxes is very time-consuming and creates a computational bottleneck. Faster R-CNN instead extracts candidate boxes directly with an RPN network, and region proposal, classification and regression share convolutional features, further improving speed.
A representative regression-based method is YOLO, which simplifies the whole target detection pipeline: video frame images are scaled to a uniform size, but in the concrete implementation each grid cell predicts only two bounding boxes, and those two boxes share a single class. As a result, YOLO's accuracy on small targets is not high enough and its generalization ability is weak, which cannot meet the multi-target detection requirements of unmanned driving.
Disclosure of Invention
The invention aims to provide an improved target identification method based on the lightweight framework of YOLOv3, which solves the problems in the prior art: it improves the robustness of the model parameters to noise while also improving the target detection speed and the accuracy of small-target detection.
In order to achieve this purpose, the invention provides the following scheme. The invention provides an improved target identification method based on the lightweight framework of YOLOv3, comprising the following steps:
S1, collecting pictures of vehicles, pedestrians and traffic environments under different road conditions, driving environments and weather conditions, and making an initial sample data set. Specifically, step S1 includes:
S11, starting a driving recorder or a high-definition camera mounted on the vehicle, and shooting driving information in the road traffic environment in real time;
S12, splitting the obtained driving video into frames and extracting each frame image, to obtain driving image sequence sets under different driving environments;
S13, screening the driving image sequence set obtained in step S12, and selecting driving images under different illumination conditions, traffic periods and environmental backgrounds;
S14, annotating the selected driving images with a labeling tool, framing the target areas (comprising vehicles, pedestrians and traffic signs), and labeling them, to make the initial sample data set.
S2, preprocessing and augmenting the picture data in the initial sample data set to obtain a target recognition sample data set. Specifically, step S2 includes: processing the characteristic parameters of the targets to be recognized in the initial sample data set obtained in step S1 through translation, rotation, saturation adjustment, exposure adjustment and noise addition, to obtain a complete sample data set.
S3, dividing the obtained target recognition sample data set into a training set and a test set.
S4, embedding the SENet structure in the YOLOv3-tiny method framework to obtain the YOLOv3-tiny-SE network model. Specifically, step S4 includes:
embedding the SENet structure in the YOLOv3-tiny method, after each pooling layer and after the convolutional layer before the final output: by modifying the YOLOv3-tiny.cfg file, the SENet structure is added after the pooling layers of layers 2, 4, 6, 8, 10 and 12 and after the convolutional layers of layers 13, 14, 15, 19 and 22, and the feature channel values of the SENet global pooling layer inputs (16, 32, 64, 128, 256, 512, 1024, 256, 512, 128 and 256) are specified as the numbers of feature channels output by the embedding layers, yielding the YOLOv3-tiny-SE network model.
S5, training the YOLOv3-tiny-SE network model on the training set. Specifically, step S5 includes:
S51, after the sample data set has been augmented and its parameters labeled in step S2, recalculating the anchor box values for the completed sample data set; the anchor box values for the traffic environment are computed with the K-means clustering method as follows: read the labeled data set, randomly take the width and height values of one picture as a coordinate point and initial cluster center, and iterate with the K-means clustering method to obtain the specific anchor box values;
S52, setting the hyperparameters and network parameters for training, inputting the training set into the YOLOv3-tiny-SE network model for multi-task training, and saving the trained network model weight file.
S6, testing the performance of YOLOv3-tiny-SE on the test set. Specifically, step S6 includes:
S61, loading the trained network model weight file obtained in step S52, inputting the test set into the trained YOLOv3-tiny-SE network model, and obtaining multi-scale feature maps through the convolutional layers, pooling layers, SENet structures and up-sampling layer;
S62, activating the network-predicted x, y, confidence and class probability with a logistic function, and obtaining the coordinates, confidence and class probability of all prediction boxes through threshold judgment;
S63, removing redundant detection boxes from the result of step S62 through non-maximum suppression, producing the final target detection boxes and recognition results.
S7, comparing the performance test results of YOLOv3-tiny-SE on the test set obtained in step S6 with the performance of YOLOv3-tiny, to obtain the performance comparison result.
The invention provides the following technical effects. Aiming at the problems in the prior art that target detection is slow and small targets are detected inaccurately in complex environments, the lightweight version YOLOv3-tiny of YOLOv3 is combined with the SENet structure to obtain the YOLOv3-tiny-SE network model, which is then used for target detection and recognition. The invention also proposes an improved activation function, the PSReLU function, which is used to activate the model. With this target recognition method, real-time road-condition video images collected by camera devices such as driving recorders can be processed quickly and accurately in real time, providing a scientific basis for the decision control of vehicle behavior in automatic driving.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a block diagram of the system of the present application;
FIG. 2 is a diagram of the improved PSReLU activation function of the present application;
FIG. 3 is a diagram of a YOLOv3-tiny-SE network model structure;
FIG. 4 is a SENet structure diagram.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1-4, the invention provides an improved target identification method based on the lightweight framework of YOLOv3, which specifically comprises the following steps:
S1, collecting pictures of vehicles, pedestrians and traffic environments under different road conditions, driving environments and weather conditions, and making an initial sample data set. This comprises the following steps:
S11, starting a driving recorder or a high-definition camera mounted on the vehicle, and shooting driving videos in the road traffic environment in real time;
S12, splitting the obtained driving video into frames and extracting each frame image, to obtain driving image sequence sets under different driving environments (a sketch of this framing step follows the list);
S13, screening the driving image sequence set, and selecting driving images under different illumination conditions, traffic periods and environmental backgrounds;
S14, annotating the selected driving images with the LabelImg sample labeling tool, framing the target areas (specifically three classes: vehicles, pedestrians and traffic signs) and attaching labels, to make the initial sample data set.
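As a rough illustration of the framing step S12, the sketch below extracts every N-th frame of a recorded driving video, assuming OpenCV; the function name, file paths and sampling stride are illustrative assumptions rather than details taken from the patent.

```python
import os
import cv2

def extract_frames(video_path: str, out_dir: str, stride: int = 30) -> int:
    """Save every `stride`-th frame of a driving video as a JPEG image."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    saved = index = 0
    while True:
        ok, frame = cap.read()
        if not ok:                      # end of the video stream
            break
        if index % stride == 0:         # subsample to avoid near-duplicate frames
            cv2.imwrite(os.path.join(out_dir, f"frame_{index:06d}.jpg"), frame)
            saved += 1
        index += 1
    cap.release()
    return saved

# e.g. extract_frames("dashcam.mp4", "frames", stride=30)
```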
S2, preprocessing and augmenting the data of the initial sample data set, refining it into the target recognition sample data set. The concrete steps are as follows:
the initial sample data set is processed programmatically: on the basis of the existing initial sample data set, the characteristic parameters of the targets to be recognized are processed through translation, rotation, saturation and exposure adjustment, and noise addition. This enlarges the sample data, yields the target recognition sample data set that completes the initial one, and improves the generalization ability of the neural network (a sketch of these operations follows).
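A minimal sketch of the augmentation operations named above (translation, rotation, saturation and exposure adjustment, noise addition), assuming OpenCV and NumPy; the shift, angle, scaling factors and noise magnitude are illustrative assumptions. Note that for translation and rotation the bounding-box labels would have to be transformed along with the image.

```python
import cv2
import numpy as np

def augment(image: np.ndarray) -> list:
    """Return shifted, rotated, colour-adjusted and noisy variants of one image."""
    h, w = image.shape[:2]
    variants = []
    # translation: shift by 5% of the image size
    m = np.float32([[1, 0, 0.05 * w], [0, 1, 0.05 * h]])
    variants.append(cv2.warpAffine(image, m, (w, h)))
    # rotation: 10 degrees about the image centre
    r = cv2.getRotationMatrix2D((w / 2, h / 2), 10, 1.0)
    variants.append(cv2.warpAffine(image, r, (w, h)))
    # saturation and exposure: scale the S and V channels in HSV space
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[..., 1] *= 1.3                  # saturation
    hsv[..., 2] *= 1.2                  # exposure
    hsv = np.clip(hsv, 0, 255).astype(np.uint8)
    variants.append(cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
    # additive Gaussian noise
    noise = np.random.normal(0, 10, image.shape)
    variants.append(np.clip(image.astype(np.float32) + noise, 0, 255).astype(np.uint8))
    return variants
```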
And S3, dividing the obtained target recognition sample data set into a training set and a test set in a ratio of 7:3, 8:2 or 8:1 (a simple split sketch follows).
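A minimal sketch of the split, assuming a plain random shuffle; the 8:2 ratio used here is one of the options listed above.

```python
import random

def split_dataset(samples: list, train_ratio: float = 0.8):
    """Shuffle the sample list and split it, e.g. 8:2 as listed above."""
    shuffled = samples[:]              # copy so the original order is kept
    random.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# train_set, test_set = split_dataset(all_samples, 0.8)
```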
S4, embedding the SENet structure in the YOLOv3-tiny method to obtain the YOLOv3-tiny-SE network model. This comprises the following steps:
S41, improving the lightweight framework of YOLOv3 as laid out in Table 1, embedding the SENet structure into the YOLOv3-tiny framework to obtain the improved YOLOv3-tiny-SE network model shown in FIG. 3;
S42, YOLOv3-tiny, the lightweight framework of YOLOv3, has the overall network architecture shown in Table 1: 13 convolutional layers, 6 pooling layers, 2 fusion layers, 1 up-sampling layer and 2 output layers of different scales. Compared with YOLOv3, this architecture drops the residual layers, replacing them with a series of pooling layers, and omits some of the feature-extraction convolutional layers and the FPN network, thereby simplifying the network, reducing computational complexity and improving recognition speed;
S43, YOLOv3-tiny handles target detection and recognition with the same idea as YOLOv3: YOLOv3 performs a Batch Normalization (BN) operation after the convolution of each convolutional layer to avoid overfitting during network training, and then uses a Leaky-ReLU function as the activation function after batch normalization;
S44, YOLOv3 adds an FPN (Feature Pyramid Network) structure on the basis of the previous two generations of the method to improve recognition accuracy for targets of multiple scales. The concrete steps are as follows:
firstly, an image pyramid is built for the image and its levels are fed into the corresponding networks, and target detection is carried out on feature maps of different depths; the feature map of a deeper layer is up-sampled and fused with the feature map of the current layer, so that the current feature map also obtains the deeper layer's information, organically fusing low-level and high-level semantic information. This improves detection precision and remedies the shortcomings of the two earlier versions; introducing the FPN into the YOLOv3 framework raises the precision of small-target recognition and makes traffic sign recognition more effective (a sketch of this fusion follows);
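A minimal sketch of the up-sample-and-fuse operation just described, assuming PyTorch; the tensor shapes and channel counts are illustrative assumptions, not YOLOv3's actual configuration.

```python
import torch
import torch.nn.functional as F

# The deeper, semantically stronger feature map (here 13 x 13) is up-sampled
# and concatenated with the shallower map (26 x 26), so the shallower scale
# also receives high-level semantic information.
deep = torch.randn(1, 256, 13, 13)       # illustrative shapes only
shallow = torch.randn(1, 256, 26, 26)
up = F.interpolate(deep, scale_factor=2, mode="nearest")
fused = torch.cat([up, shallow], dim=1)  # 1 x 512 x 26 x 26
```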
s45 and YOLOv3-tiny-SE network model as shown in figure 3, the SEnet structure firstly performs global average pooling on input feature maps to obtain feature maps (c is the number of feature channels) with the size of cx1 × 1, then passes through two full connection layers, performs the processes of firstly reducing and then increasing dimensions, finally performs nonlinear processing by using a Sigmoid function to obtain weights with the size of cx1 × 1, and then performs multiplication operation on the weights and the original input feature maps at corresponding positions to obtain the final output result;
s46, embedding a SENet structure in a YOLOv3-tiny method; the method comprises the following specific steps:
embedding a SENet structure after each pooling layer and after the convolutional layer before final output, adding the SENet structure after the pooling layers of the 2 nd, 4 th, 6 th, 8 th, 10 th and 12 th layers and the convolutional layers of the 13 th, 14 th, 15 th, 19 th and 22 th layers by modifying a YOLOv3-tiny file, and specifying the characteristic channel values 16, 32 th, 64 th, 128 th, 256 th, 512 th, 1024 th, 256 th, 512 th and 256 th layers of the global pooling layer of the SENet structure as the number of the characteristic channels output by the embedding layers to obtain a YOLOv3-tiny-SE network model;
the network depth of S47 and YOLOV3-tiny originally is 24 layers, and becomes 35 layers after embedding SENet structure, the main purpose of embedding SENet network is to strengthen useful information and compress useless information, wherein the concrete steps of the embedded SENet structure take a second layer of pooling layer as an example, the feature map output by the pooling layer is 208 x 16, and is also the input feature map size of a Global pooling layer (Global posing), the feature map of 1 x 16 is obtained after Global averaging pooling, then the feature map of 1 x 1 is obtained after dimensionality reduction through a first Full connection (Full connected), the feature map of 1 x 16 is obtained after dimensionality increase through a second Full connection (Full connected), finally the weight value of 1 x 16 is obtained after activation of a Sigmoid function, and finally the weight value of 1 x 16 is obtained by multiplying the feature map with the input feature map 208 and the weight value of 16 is obtained.
TABLE 1
(The layer-by-layer architecture of YOLOv3-tiny/YOLOv3-tiny-SE is reproduced as images in the original publication; the listing is not recoverable here.)
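A minimal sketch of the SENet block as described in S45 and S47, written in PyTorch rather than the Darknet .cfg form actually used by the patent; the ReLU between the two fully connected layers and the reduction ratio of 16 are assumptions borrowed from the original SENet design. With 16 input channels this reproduces the 16 → 1 → 16 example given for the second pooling layer.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation block: global average pooling, two fully
    connected layers (reduce then restore the channel dimension), a Sigmoid,
    and channel-wise rescaling of the input feature map."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                   # squeeze: c x 1 x 1
        hidden = max(channels // reduction, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),                            # assumption: standard SENet
            nn.Linear(hidden, channels),
            nn.Sigmoid(),                                     # weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                          # excitation: rescale channels

# e.g. after the second pooling layer (208 x 208 x 16 output):
se = SEBlock(16)                       # 16 -> 1 -> 16 channel weights
y = se(torch.randn(1, 16, 208, 208))   # same shape as the input
```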
S5, training the YOLOv3-tiny-SE network model on the training set, specifically comprising the following steps:
S51, clustering the labeled ground-truth target boxes of the targets to be recognized in the training set, taking the area intersection-over-union (IOU) as the evaluation index to obtain initial candidate boxes for the targets predicted on the training set, and inputting these as initial parameters into the YOLOv3-tiny-SE network model. Concretely:
the ground-truth target boxes of the training data set are clustered with the K-means method under the distance formula d(box, centroid) = 1 - IOU(box, centroid), where IOU(box, centroid) is the area intersection-over-union of the predicted target box and the ground-truth target box; taking IOU(box, centroid) as the evaluation criterion, a predicted candidate box whose value is not less than 0.5 is used as an initial target box;
the area intersection-over-union IOU(box, centroid) is given by the following formula:

$$\mathrm{IOU}(box, centroid)=\frac{box_{pred}\cap box_{truth}}{box_{pred}\cup box_{truth}}$$

where $box_{pred}$ and $box_{truth}$ denote the areas of the predicted target box and the ground-truth target box respectively; the ratio of the intersection to the union of the two boxes is the area intersection-over-union of the ground-truth box and the predicted initial candidate box (a sketch of this clustering step follows);
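A minimal sketch of the anchor clustering in S51, assuming NumPy and that (width, height) pairs are compared with a shared corner, as is usual for anchor K-means; k = 6 is an assumption based on YOLOv3-tiny predicting on two scales with three anchors each.

```python
import numpy as np

def iou_wh(boxes: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """IOU between (w, h) pairs, boxes and centroids sharing a common corner."""
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0])
             * np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_c = (centroids[:, 0] * centroids[:, 1])[None, :]
    return inter / (area_b + area_c - inter)

def kmeans_anchors(boxes: np.ndarray, k: int = 6, iters: int = 100) -> np.ndarray:
    """K-means with d = 1 - IOU as the distance, as described in S51."""
    centroids = boxes[np.random.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # minimal distance d = 1 - IOU is equivalent to maximal IOU
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids
```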
s53, calling the initial weight of the YOLOv3-tiny network, setting the super parameters, the learning rate, the iteration step number N and the size of batch _ size, wherein the super parameters can be adjusted according to the obtained model data; then inputting the training data set into a Yolov3-tiny-SE network model for training until the loss value output by the training data set is smaller than a certain threshold Q1 or reaches the preset maximum iteration number N, and stopping training to obtain a well-trained Yolov3-tiny-SE network model; the method comprises the following specific steps:
calling initial network weight of YOLOv3-tiny, inputting a training data set into a YOLOv3-tiny network for training, outputting a loss function value, continuously training and adjusting the network weight and a bias value according to the loss function value until the loss function value output by the training set is smaller than a threshold value Q1 or the training is stopped after the maximum iteration number N is reached to obtain a trained YOLOv3-tiny-SE network model;
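A minimal sketch of the stopping rule just described, assuming a PyTorch-style model, data loader, optimizer and loss function; the threshold Q1 = 0.05 and the iteration cap are illustrative values, not taken from the patent.

```python
def train_until(model, loader, optimizer, loss_fn, q1: float = 0.05,
                max_iters: int = 50000):
    """Keep training until the loss drops below the threshold Q1 or the
    preset maximum number of iterations N is reached."""
    it = 0
    model.train()
    while it < max_iters:
        for images, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()        # adjust weights and biases from the loss value
            optimizer.step()
            it += 1
            if loss.item() < q1 or it >= max_iters:
                return model
    return model
```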
the loss function (object) is expressed by the following formula
Figure BDA0002244857710000101
Each term of the loss function corresponds to a loss of the prediction center coordinate, a loss of the prediction bounding box, a loss of the prediction confidence degree and a loss of the prediction category. The loss functions of the prediction center coordinates and the bounding boxes are expressed by error square sum, and the loss functions of the prediction categories and the confidence degrees are expressed by courtyard cross entropy loss functions;
in the above formula, λcoordError coefficients that are predicted coordinates; lambda [ alpha ]noobjectAn error coefficient that does not contain a confidence level when identifying an object; k2Indicating the number of meshes into which the input image is divided; m represents the predicted target frame number of each grid; x is the number ofi,yi,wi,hiRespectively representing the abscissa and ordinate of the predicted center point of the target and the width and height,
Figure BDA0002244857710000111
respectively representing the horizontal and vertical coordinates, the width and the height of the central point of a real target;
Figure BDA0002244857710000112
the ith grid representing the jth candidate box is responsible for detecting the object;
Figure BDA0002244857710000113
indicating that the ith grid in which the jth candidate box is positioned is not responsible for detecting the object; ciAnd
Figure BDA0002244857710000114
respectively representing the prediction confidence coefficient and the real confidence coefficient of the target to be detected in the ith grid; p is a radical ofi(c) And
Figure BDA0002244857710000115
respectively representing a prediction probability value and a real probability value of the target identification in the ith network belonging to a certain category;
the activation function of YOLOv3 after convolution layer adopts a Leaky-ReLU function, and the expression of the function is shown as the following formula:
Figure BDA0002244857710000116
the Leaky-ReLU function is evolved from the ReLU function, values obtained by the ReLU function when x is less than or equal to 0 are all 0, so that the problem that the neuron weight cannot be updated possibly occurs along with training, the problem is not large in influence on a deep neural network, but the problem is large in influence on a neural network with a shallow layer number, so that the output of the Leaky-ReLU function, which is 0 in a complex number field, is changed into a linear function with a small slope on the basis of ReLu, the output of a negative number field is reserved, but the parameter a is a proper parameter value determined through artificial prior and repeated training for many times, and the noise robustness in an inactive state cannot be ensured. Based on the above problem, the present embodiment proposes an improved activation function PSReLU (Parametric Soft plus-ReLU) function as shown in fig. 2, and the function expression is as follows:
Figure BDA0002244857710000121
in the positive value domain, the YOLOv3-tiny adopts an activation function, leak-ReLU, which is the same as a ReLU function, in the negative value domain, a Softplus function is adopted, log2 units are shifted downwards, and a parameter α is taken as a learnable parameter in the network, and back propagation training is carried out in the network to be jointly optimized with other network layers.
S6, testing the performance of the YOLOv3-tiny-SE network model on the test set, specifically comprising the following steps:
S61, loading the trained network weights, inputting the test set into the trained network, and obtaining multi-scale feature maps through the convolutional layers, pooling layers, SENet structures and up-sampling layer;
S62, activating the network-predicted x, y, confidence and class probability with a logistic function, and obtaining the coordinates, confidence and class probability of all prediction boxes through threshold judgment;
S63, removing redundant detection boxes from the result through non-maximum suppression (NMS), producing the final target detection boxes and recognition results (a sketch of this suppression step follows the list);
S64, comparing the effect of the native YOLOv3-tiny model using the original activation function with that of the improved YOLOv3-tiny-SE network model: performance tests are run on the native YOLOv3-tiny model with the improved activation function and with the original one, and likewise on the YOLOv3-tiny-SE network model with the improved and the original activation function;
S65, inputting the test set obtained in step S3 into each of the networks of step S64 for performance detection, obtaining the final model-performance evaluation indices: mean Average Precision (mAP), detected frames per second (FPS) and recall.
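A minimal sketch of the non-maximum suppression in S63, assuming NumPy, corner-format boxes (x1, y1, x2, y2) and an IOU threshold of 0.45; the threshold value is an assumption.

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.45) -> list:
    """Greedy non-maximum suppression; returns the indices of kept boxes."""
    order = scores.argsort()[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # intersection of box i with the remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]     # drop boxes overlapping box i too much
    return keep
```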
S7, comparing the performance test result of the YOLOv3-tiny-SE network model obtained in the step S6 on the test set with the performance of YOLOv3-tiny to obtain a performance comparison result.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Various modifications and improvements of the technical solution of the present invention may be made by those skilled in the art without departing from the spirit of the present invention, and the technical solution of the present invention is to be covered by the protection scope defined by the claims.

Claims (6)

1. An improved target recognition method based on the lightweight framework of YOLOv3, characterized by comprising the following steps:
S1, collecting pictures of vehicles, pedestrians and traffic environments under different road conditions, driving environments and weather conditions, and making an initial sample data set;
S2, preprocessing and augmenting the picture data in the initial sample data set to obtain a target recognition sample data set;
S3, dividing the obtained target recognition sample data set into a training set and a test set;
S4, embedding the SENet structure in the YOLOv3-tiny method framework to obtain the YOLOv3-tiny-SE network model;
S5, training the YOLOv3-tiny-SE network model on the training set;
S6, testing the performance of YOLOv3-tiny-SE on the test set;
S7, comparing the performance test results of the YOLOv3-tiny-SE network model on the test set obtained in step S6 with those of YOLOv3-tiny, to obtain a performance comparison result.

2. The improved target recognition method based on the lightweight framework of YOLOv3 according to claim 1, characterized in that step S1 specifically comprises:
S11, turning on a driving recorder or a high-definition camera mounted on the vehicle, and shooting driving information in the road traffic environment in real time;
S12, splitting the obtained driving video into frames and extracting each frame image, to obtain driving image sequence sets under different driving environments;
S13, screening the driving image sequence set obtained in step S12, and selecting driving images under different illumination conditions, traffic periods and environmental backgrounds;
S14, annotating the selected driving images with a labeling tool, framing the target areas (comprising vehicles, pedestrians and traffic signs), and labeling them, to make the initial sample data set.

3. The improved target recognition method based on the lightweight framework of YOLOv3 according to claim 1, characterized in that step S2 specifically comprises: processing the characteristic parameters of the targets to be recognized in the initial sample data set obtained in step S1 through translation, rotation, saturation and exposure adjustment, and noise addition, to obtain a complete sample data set.

4. The improved target recognition method based on the lightweight framework of YOLOv3 according to claim 1, characterized in that step S4 specifically comprises: embedding the SENet structure in the YOLOv3-tiny method, after each pooling layer and after the convolutional layer before the final output: by modifying the YOLOv3-tiny.cfg file, the SENet structure is added after the pooling layers of layers 2, 4, 6, 8, 10 and 12 and after the convolutional layers of layers 13, 14, 15, 19 and 22, and the feature channel values of the SENet global pooling layer inputs (16, 32, 64, 128, 256, 512, 1024, 256, 512, 128 and 256) are specified as the numbers of feature channels output by the embedding layers, obtaining the YOLOv3-tiny-SE network model.

5. The improved target recognition method based on the lightweight framework of YOLOv3 according to claim 1, characterized in that step S5 specifically comprises:
S51, after the sample data set has been augmented and its parameters labeled in step S2, recalculating the anchor box values for the completed sample data set; the anchor box values for the traffic environment are computed with the K-means clustering method as follows: read the labeled data set, randomly take the width and height values of one picture as a coordinate point and initial cluster center, and iterate with the K-means clustering method to obtain the specific anchor box values;
S52, setting the hyperparameters and network parameters for training, then inputting the training set into the YOLOv3-tiny-SE network model for multi-task training, and saving the trained network model weight file.

6. The improved target recognition method based on the lightweight framework of YOLOv3 according to claim 1, characterized in that step S6 specifically comprises:
S61, loading the trained network model weight file obtained in step S52, inputting the test set into the trained YOLOv3-tiny-SE network model, and obtaining multi-scale feature maps through the convolutional layers, pooling layers, SENet structures and up-sampling layer;
S62, activating the network-predicted x, y, confidence and class probability with a logistic function, and obtaining the coordinates, confidence and class probability of all prediction boxes through threshold judgment;
S63, removing redundant detection boxes from the result of step S62 through non-maximum suppression, producing the final target detection boxes and recognition results.
CN201911013341.1A (priority date 2019-10-23, filed 2019-10-23) - An improved target recognition method based on YOLOv3 lightweight framework - Pending - CN110929577A

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911013341.1A CN110929577A (en) 2019-10-23 2019-10-23 An improved target recognition method based on YOLOv3 lightweight framework


Publications (1)

Publication Number Publication Date
CN110929577A true CN110929577A (en) 2020-03-27

Family

ID=69849270




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20231013