CN111046787A - Pedestrian detection method based on improved YOLO v3 model - Google Patents
Pedestrian detection method based on improved YOLO v3 model
- Publication number
- CN111046787A (application number CN201911257993.XA)
- Authority
- CN
- China
- Prior art keywords
- model
- target
- yolo
- grid
- pedestrian
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a pedestrian detection method based on an improved YOLO v3 model, which comprises the following steps: selecting training samples; performing K-means clustering on the samples to obtain new anchor values, and replacing the original data-set anchor parameters in the YOLO v3 model with the new values; introducing an Inception module and pruning it to obtain the improved YOLO v3 model; and detecting pedestrians with the improved YOLO v3 model to obtain detection results. The method solves the problem that the features extracted by the original YOLO v3 model are too limited, and improves pedestrian detection accuracy.
Description
Technical Field
The invention relates to a pedestrian detection method based on a neural network, and in particular to a pedestrian detection method based on an improved YOLO v3 model.
Background
Pedestrian detection is a branch of the target detection field. The urgent need for pedestrian detection technology is reflected in many areas, such as intelligent transportation, security video surveillance and autonomous driving. In the early days, owing to the limitations of computer hardware, pedestrian detection was mainly image-based and only needed to determine whether a pedestrian was present in an image. Nowadays, with the rapid development of microelectronics and computer technology, the technology is required not only to detect pedestrians in simple background environments, but also to detect them accurately under strong interference from the external environment, such as strong light, weak light and occlusion. Meanwhile, detection is no longer limited to still images: real-time detection is required, and functions such as tracking and behavior recognition need to be added on top of detection. In addition, with the rapid development of deep learning in recent years, more and more deep learning models are being widely applied to computer vision, in technologies ranging from ubiquitous license plate recognition and pedestrian detection to advanced driver assistance. Compared with traditional pedestrian detection methods, pedestrian detection based on convolutional neural networks greatly improves detection accuracy and speed; however, the features extracted by the existing YOLO v3 model are too limited, so its recognition accuracy is not high.
Disclosure of Invention
The invention aims to provide a pedestrian detection method based on an improved YOLO v3 model, which solves the problem that the features extracted by the original YOLO v3 model are too limited and improves pedestrian detection accuracy.
In a first aspect, the present invention provides a pedestrian detection method based on an improved YOLO v3 model, including:
step 1, selecting training samples;
step 2, performing K-means clustering on the samples to obtain new anchor values, and replacing the original data-set anchor parameters in the YOLO v3 model with the new values;
step 3, introducing an Inception module and pruning it to obtain the improved YOLO v3 model;
and step 4, detecting pedestrians with the improved YOLO v3 model to obtain detection results.
Further, the step 1 is specifically: pedestrian images are extracted from the public data sets pascal voc2007 and pascal voc2012 respectively, and training samples are selected with a training set to test set ratio of 2:1.
Further, the step 2 is specifically:
a nonlinear mapping θ is used to map the samples x_i (i = 1, 2, ..., l) into a high-dimensional space G, i.e. the mapped samples are θ(x_1), θ(x_2), ..., θ(x_l);
K-means clustering is performed in the high-dimensional space to optimize the objective function
J = Σ_{k=1}^{K} Σ_{x_i ∈ C_k} ||θ(x_i) − m_k||²   (1)
where the sample mean m_k of cluster C_k is obtained from
m_k = (1 / |C_k|) Σ_{x_i ∈ C_k} θ(x_i)   (2)
In the kernel space, the kernel distance between two feature points is calculated as
||θ(x_i) − θ(x_j)||² = N(x_i, x_i) − 2N(x_i, x_j) + N(x_j, x_j)   (3)
where N is a kernel function.
All sample subsets obtained by clustering are merged; the merged set contains K target categories, and the mean of each of the K categories is calculated as
x̄_i = (1 / n_i) Σ_{x ∈ C_i} x   (4)
where n_i is the number of samples in category i and x̄_i is the mean of the i-th category.
The distance between any two class means is calculated as
I = |x̄_i − x̄_j|²   (5)
If the distance between the means of two target categories is smaller than a preset threshold, the two categories are merged into one; the class-mean distances are then recomputed by formula (5), and the merging continues until a final clustering result is obtained;
finally, anchor values matching the model are calculated from the resulting clusters, and the data-set anchor parameters in the original YOLO v3 model are replaced with the new anchor values.
Further, the step 3 is specifically: an Inception module is introduced and then pruned and optimized. The pruned Inception module is mainly composed of a 3 × 3 convolution branch and a 5 × 5 convolution branch, and the 5 × 5 convolution is in turn replaced by two consecutive 3 × 3 convolution layers. The route layer of the YOLO v3 model merges the two output paths of different receptive fields into one output layer, which is passed to the next convolution network for further feature extraction; the pruned Inception module is then placed into the original YOLO v3 model to obtain the improved YOLO v3 model.
Further, the step 4 is specifically:
step 4a: the image to be detected is divided into blocks; when input to the model, the image size is adjusted adaptively to a square, and the image is then divided into an N × N grid;
step 4b: when the center point of a target falls inside a grid cell, that cell is responsible for classifying the target and detecting its position, and the following operations are carried out:
when the center point of a target falls into one of the N × N grid cells, that cell generates B prediction boxes to detect the target; that is, each cell has B bounding boxes generated from the anchor predictions, together with a confidence score CS indicating whether the cell contains a target, which jointly reflects the probability that a target exists in the bounding box under the current model and the accuracy of the predicted target position:
CS = Pr(Object) × IOU_pred^truth   (6)
where Pr(Object) indicates whether the center point of an object is contained in the cell (1 if so, 0 otherwise), and IOU_pred^truth is the intersection over union between the bounding box predicted by the cell and the ground-truth bounding box of the object;
each cell generates B predicted bounding boxes to detect targets within it, and each predicted bounding box contains 5 parameters [x, y, w, h, confidence], where [x, y] are the coordinates of the target center within the cell, [w, h] are the width and height of the predicted box, and confidence is the intersection over union of the predicted box and the ground-truth box; each cell also corresponds to a predicted value C_i for whether it contains a target of a given class, expressed as
C_i = Pr(Class_i | Object)   (7)
step 4c: each cell obtained in step 4b contains 5 parameters, represented by the vector y_i as follows:
y_i = [b_x, b_y, b_w, b_h, c]   (8)
where (b_x, b_y) are the coordinates of the target center, (b_w, b_h) are the width and height of the bounding box generated by the network for the target, and c is the overall confidence score of the prediction box;
step 4d: after the N × N grid cells have been predicted, the parameters of all cells are sorted and summarized, and the detection result for the whole image is output.
One or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
the embodiment of the application provides an improved pedestrian detection method based on improved YOLOv 3. Firstly, a model training data set is manufactured, and a new anchors value is calculated through K value clustering to replace the original YOLOv3 data set parameter; then, improving the YOLOv3 model by blending different sizes of receptive fields, and cutting the size of the receptive fields of the improved model, namely introducing a tiny-interception module to further enrich the extracted characteristics and reduce the number of network layers; and finally, training by adopting the improved YOLOv3 model, and applying the trained model to a pedestrian detection scene to realize pedestrian detection. The method provided by the invention improves the accuracy and robustness of the pedestrian detection algorithm and obtains better detection effect in the aspects of subjective vision and objective evaluation indexes.
The foregoing is only an overview of the technical solution of the present invention. In order that the technical means of the invention may be understood more clearly, and that the above and other objects, features and advantages of the invention may become more apparent, embodiments of the invention are described below.
Drawings
The invention will be further described below by way of embodiments with reference to the accompanying drawings.
FIG. 1 is a schematic block diagram of the system of the present invention;
fig. 2 is a schematic diagram of the small network of the present invention that replaces the 5 × 5 convolution.
FIG. 3a is the first detection result of the original model.
FIG. 3b is the first detection result of the model of the present invention.
FIG. 4a is the second detection result of the original model.
FIG. 4b is the second detection result of the model of the present invention.
FIG. 5a is the third detection result of the original model.
FIG. 5b is the third detection result of the model of the present invention.
FIG. 6a is the fourth detection result of the original model.
FIG. 6b is the fourth detection result of the model of the present invention.
Detailed Description
As shown in fig. 1, the present invention provides a pedestrian detection method based on an improved YOLO v3 model, comprising:
step 1, selecting training samples;
step 2, performing K-means clustering on the samples to obtain new anchor values, and replacing the original data-set anchor parameters in the YOLO v3 model with the new values;
step 3, introducing an Inception module and pruning it to obtain the improved YOLO v3 model;
and step 4, detecting pedestrians with the improved YOLO v3 model to obtain detection results.
The step 1 is specifically as follows: pedestrian images are extracted from the public data sets pascal voc2007 and pascal voc2012 respectively, and training samples are selected with a training set to test set ratio of 2:1.
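As an illustrative sketch only (Python is assumed; the disclosure does not prescribe an implementation language, and the directory layout shown is hypothetical), the 2:1 division of the extracted pedestrian images may be organized as follows:

```python
# Illustrative sketch: dividing the extracted pedestrian images into a training
# set and a test set at a 2:1 ratio. The directory name is hypothetical.
import random
from pathlib import Path

def split_dataset(image_dir: str, seed: int = 0):
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    n_train = (2 * len(images)) // 3          # 2:1 train/test split
    return images[:n_train], images[n_train:]

# With the 2094 extracted images this yields 1396 training and 698 test images.
train_set, test_set = split_dataset("voc_pedestrian_images/")
```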
The step 2 is specifically as follows:
a nonlinear mapping θ is used to map the samples x_i (i = 1, 2, ..., l) into a high-dimensional space G, i.e. the mapped samples are θ(x_1), θ(x_2), ..., θ(x_l);
K-means clustering is performed in the high-dimensional space to optimize the objective function
J = Σ_{k=1}^{K} Σ_{x_i ∈ C_k} ||θ(x_i) − m_k||²   (1)
where the sample mean m_k of cluster C_k is obtained from
m_k = (1 / |C_k|) Σ_{x_i ∈ C_k} θ(x_i)   (2)
In the kernel space, the kernel distance between two feature points is calculated as
||θ(x_i) − θ(x_j)||² = N(x_i, x_i) − 2N(x_i, x_j) + N(x_j, x_j)   (3)
where N is a kernel function (a computational sketch of this kernel distance is given after this step).
All sample subsets obtained by clustering are merged; the merged set contains K target categories, and the mean of each of the K categories is calculated as
x̄_i = (1 / n_i) Σ_{x ∈ C_i} x   (4)
where n_i is the number of samples in category i and x̄_i is the mean of the i-th category.
The distance between any two class means is calculated as
I = |x̄_i − x̄_j|²   (5)
If the distance between the means of two target categories is smaller than a preset threshold, the two categories are merged into one; the class-mean distances are then recomputed by formula (5), and the merging continues until a final clustering result is obtained;
finally, anchor values matching the model are calculated from the resulting clusters, and the data-set anchor parameters in the original YOLO v3 model are replaced with the new anchor values.
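A minimal computational sketch of the kernel distance of formula (3) follows (Python with NumPy is assumed, and a Gaussian kernel is used purely as an example of the kernel function N; the disclosure does not fix a particular kernel):

```python
# Illustrative sketch of formula (3): the squared distance between theta(x_i)
# and theta(x_j) evaluated only through the kernel function N (here an assumed
# Gaussian/RBF kernel), without forming the high-dimensional mapping explicitly.
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    return float(np.exp(-gamma * np.sum((a - b) ** 2)))

def kernel_distance_sq(x_i, x_j):
    # ||theta(x_i) - theta(x_j)||^2 = N(x_i, x_i) - 2 N(x_i, x_j) + N(x_j, x_j)
    return rbf_kernel(x_i, x_i) - 2.0 * rbf_kernel(x_i, x_j) + rbf_kernel(x_j, x_j)

print(kernel_distance_sq(np.array([40.0, 90.0]), np.array([55.0, 120.0])))
```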
The step 3 is specifically as follows: an Inception module is introduced and then pruned and optimized. The pruned Inception module is mainly composed of a 3 × 3 convolution branch and a 5 × 5 convolution branch, and the 5 × 5 convolution is in turn replaced by two consecutive 3 × 3 convolution layers. The route layer of the YOLO v3 model merges the output results of the two different receptive fields (the single 3 × 3 convolution layer and the two stacked 3 × 3 convolution layers) into one output layer, which is passed to the next convolution network for further feature extraction; the pruned Inception module is then placed into the original YOLO v3 model to obtain the improved YOLO v3 model.
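A minimal sketch of the pruned Inception block described in this step is shown below (PyTorch is assumed for illustration; the disclosure itself builds on the YOLO v3/Darknet configuration, and the channel sizes are hypothetical). Channel concatenation plays the role of the route layer:

```python
# Illustrative sketch: two branches of different receptive fields - a single 3x3
# convolution, and a 5x5 receptive field built from two stacked 3x3 convolutions -
# whose outputs are concatenated, mimicking the YOLO v3 route layer.
import torch
import torch.nn as nn

class TinyInception(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        branch_ch = out_ch // 2
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(in_ch, branch_ch, 3, padding=1), nn.LeakyReLU(0.1))
        # two consecutive 3x3 convolutions replacing one 5x5 convolution
        self.branch5x5 = nn.Sequential(
            nn.Conv2d(in_ch, branch_ch, 3, padding=1), nn.LeakyReLU(0.1),
            nn.Conv2d(branch_ch, branch_ch, 3, padding=1), nn.LeakyReLU(0.1))

    def forward(self, x):
        return torch.cat([self.branch3x3(x), self.branch5x5(x)], dim=1)

y = TinyInception(256, 256)(torch.randn(1, 256, 13, 13))  # -> [1, 256, 13, 13]
```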
The step 4 is specifically as follows:
step 4a: the image to be detected is divided into blocks; when input to the model, the image size is adjusted adaptively to a square, and the image is then divided into an N × N grid;
step 4b: when the center point of a target falls inside a grid cell, that cell is responsible for classifying the target and detecting its position, and the following operations are carried out (a computational sketch of the confidence score is given after step 4d):
when the center point of a target falls into one of the N × N grid cells, that cell generates B prediction boxes to detect the target; that is, each cell has B bounding boxes generated from the anchor predictions, together with a confidence score CS indicating whether the cell contains a target, which jointly reflects the probability that a target exists in the bounding box under the current model and the accuracy of the predicted target position:
CS = Pr(Object) × IOU_pred^truth   (6)
where Pr(Object) indicates whether the center point of an object is contained in the cell (1 if so, 0 otherwise), and IOU_pred^truth is the intersection over union between the bounding box predicted by the cell and the ground-truth bounding box of the object;
each cell generates B predicted bounding boxes to detect targets within it, and each predicted bounding box contains 5 parameters [x, y, w, h, confidence], where [x, y] are the coordinates of the target center within the cell, [w, h] are the width and height of the predicted box, and confidence is the intersection over union of the predicted box and the ground-truth box; each cell also corresponds to a predicted value C_i for whether it contains a target of a given class, expressed as
C_i = Pr(Class_i | Object)   (7)
step 4c: each cell obtained in step 4b contains 5 parameters, represented by the vector y_i as follows:
y_i = [b_x, b_y, b_w, b_h, c]   (8)
where (b_x, b_y) are the coordinates of the target center, (b_w, b_h) are the width and height of the bounding box generated by the network for the target, and c is the overall confidence score of the prediction box;
step 4d: after the N × N grid cells have been predicted, the parameters of all cells are sorted and summarized, and the detection result for the whole image is output.
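The confidence score of formula (6) in step 4b can be illustrated by the following minimal sketch (Python is assumed; the boxes are hypothetical corner-format coordinates [x1, y1, x2, y2]):

```python
# Illustrative sketch of formula (6): CS = Pr(Object) * IOU between a predicted
# box and the ground-truth box, both given as [x1, y1, x2, y2].
def iou(box_a, box_b):
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

pr_object = 1.0                                   # the cell contains an object center
cs = pr_object * iou([50, 60, 150, 260], [55, 70, 160, 250])
print(round(cs, 3))
```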
One specific embodiment of the present invention:
In order to further explain the technical solution of the invention, the invention is described in detail through the following specific embodiment.
Input: an image containing pedestrians to be recognized, and sample images containing pedestrians used for learning.
1. Sample data preparation
Pedestrian images are extracted from the public data sets pascal voc2007 and pascal voc2012 respectively; 2094 images are extracted in total and divided into a training set and a test set at a ratio of 2:1, denoted TR and TE respectively, where x denotes an input sample image, y denotes the label corresponding to the sample image, and N denotes the number of samples in each set.
2. K-means clustering of the samples to obtain new anchor values
According to the TR training set in step 1, the width and height of each pedestrian box are read as the data to be classified, and the cluster centers are initialized, the coordinates of a cluster center describing the width and height of a rectangular box. The IOU between each cluster center and the rectangular box described by each data item is then calculated, and with the distance 1 − IOU as the classification basis, 9 groups of anchors are finally obtained; the predictions cover the predicted center-point coordinates, the anchor box width and height, and the predicted target class.
IOU = S_intersection / S_union, where S represents the area of a rectangular box.
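A minimal sketch of this anchor clustering is given below (Python with NumPy is assumed; the list of box widths and heights is hypothetical). The distance is 1 − IOU between a box and a cluster center, both treated as width/height pairs aligned at a common corner:

```python
# Illustrative sketch: K-means over (width, height) pairs with d = 1 - IOU,
# yielding k anchor sizes (k = 9 in this embodiment). Box sizes are hypothetical.
import numpy as np

def iou_wh(wh, centers):
    inter = np.minimum(wh[0], centers[:, 0]) * np.minimum(wh[1], centers[:, 1])
    union = wh[0] * wh[1] + centers[:, 0] * centers[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes_wh, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iters):
        # assign each box to the center with the smallest 1 - IOU distance
        assign = np.array([np.argmin(1.0 - iou_wh(wh, centers)) for wh in boxes_wh])
        for j in range(k):
            if np.any(assign == j):              # update only non-empty clusters
                centers[j] = boxes_wh[assign == j].mean(axis=0)
    return centers

boxes = np.array([[32, 96], [40, 120], [28, 80], [64, 180], [50, 150], [36, 110],
                  [72, 200], [24, 70], [56, 160], [44, 130], [30, 88], [60, 170]], float)
print(kmeans_anchors(boxes, k=9))
```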
3 improving the original YOLO v3 network by introducing a tiny-Inception module
3.1 introduction of multiple size convolution kernels
An Inception module is introduced into the original YOLO v3 framework, and the training set TR is used to train and construct the new model.
(1) The input-output relationship: X denotes the abstract or hierarchical features extracted from the input sample by the feature extraction function, x is the input image, W is the convolution kernel, b is the bias, Y is the predicted value of the sample image, and θ is the softmax logistic regression parameter.
(2) The classifier output is the softmax probability y(k) = P(Y = k | X) = exp(T_k·X) / Σ_j exp(T_j·X), where T is the classifier coefficient and P represents the probability that the sample's prediction result is class k.
The final output class is y = argmax_k { y(k) }   (4)
(3) An objective function is constructed on the training set using the cross entropy, where R(W) and R(θ) are regularization terms used to sparsify the parameters and prevent overfitting, and λ1 and λ2 are the sparsity coefficients.
(4) The optimization formula for the parameter update, in which J(W, b; θ) denotes the objective function constructed with the cross entropy in (3); its parameters are consistent with those of the classifier.
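A minimal sketch of the cross-entropy objective with regularization named in (3) follows (Python with NumPy is assumed; the L1 form chosen for R(W) and R(θ) and the values of λ1 and λ2 are assumptions, since the disclosure only names the terms):

```python
# Illustrative sketch: softmax cross entropy plus sparsity-inducing regularization
# terms lambda1*R(W) + lambda2*R(theta). The L1 form of R is an assumption.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def objective(logits, label, W, theta, lam1=1e-4, lam2=1e-4):
    p = softmax(logits)
    cross_entropy = -np.log(p[label] + 1e-12)
    return cross_entropy + lam1 * np.abs(W).sum() + lam2 * np.abs(theta).sum()

print(objective(np.array([2.0, 0.5]), 0, np.ones((3, 3)), np.ones(2)))
```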
3.2 pruning the Inception module
In the convolutional layers, one 5 × 5 convolution kernel is replaced by two consecutive 3 × 3 convolution kernels. A 5 × 5 kernel is computationally expensive, requiring 25/9 times the computation of a 3 × 3 kernel; after the improvement, the computation saved is 1 − (9 + 9)/25 = 7/25 = 28% (checked in the sketch below). As shown in fig. 2, the improved Inception module combines two different receptive fields, and the two outputs are then merged by the route layer of the YOLO v3 network to form the input data of the next layer.
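The quoted saving can be checked with the following minimal sketch (Python assumed; the count is per output position for a single input/output channel, so the feature-map size cancels out):

```python
# Illustrative check of the saving: multiply-accumulates per output position for
# one 5x5 kernel versus two stacked 3x3 kernels.
ops_5x5 = 5 * 5
ops_two_3x3 = 3 * 3 + 3 * 3
saving = 1 - ops_two_3x3 / ops_5x5
print(f"saving = {saving:.0%}")   # 28%
```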
4 pedestrian detection
4.1 image adaptive adjustment and partitioning
An input image of arbitrary size is first adaptively resized to 224 × 224, and the image is then divided by a grid of size N × N to serve as the new input; the resolution of the input image domain is 224 × 224, and the output y ∈ [0, 1, 2, …, 1396].
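A minimal pre-processing sketch for this step follows (Python with OpenCV/NumPy is assumed; N = 7 is a hypothetical grid size used only for the example):

```python
# Illustrative sketch: resize an arbitrary-size image to the 224 x 224 square
# input and locate the grid cell responsible for a given point when the image
# is divided into an N x N grid.
import cv2
import numpy as np

def to_square_input(img, n):
    resized = cv2.resize(img, (224, 224))
    return resized, 224 // n                     # resized image and cell side length

img = np.zeros((480, 640, 3), dtype=np.uint8)    # stand-in for an arbitrary-size image
resized, cell = to_square_input(img, n=7)
row, col = 60 // cell, 120 // cell               # cell owning the point (x=120, y=60)
print(resized.shape, (row, col))
```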
4.2 pedestrian prediction
After the image to be recognized passes through the improved YOLO v3 network and the prediction of the N × N grid cells is completed, the position of the target pedestrian is determined from the network output, the coordinate values and confidence score of the rectangular box enclosing the target pedestrian are output, and finally the parameters of all cells are sorted and summarized to output the detection result of the whole image.
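The sorting and summarizing of the per-cell predictions may be sketched as follows (Python assumed; the prediction vectors and the confidence threshold are hypothetical):

```python
# Illustrative sketch: collect per-cell prediction vectors [bx, by, bw, bh, c],
# keep boxes above a confidence threshold, and sort them by confidence.
def collect_detections(grid_predictions, conf_thresh=0.5):
    detections = []
    for bx, by, bw, bh, c in grid_predictions:
        if c >= conf_thresh:
            # convert center/size form to corner form for the final output
            detections.append((bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2, c))
    return sorted(detections, key=lambda d: d[-1], reverse=True)

print(collect_detections([(112, 96, 40, 120, 0.83), (60, 50, 30, 80, 0.21)]))
```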
5 simulation experiment
The effects of the present invention can be further illustrated by the following simulation experiment. To ensure the objectivity of the experiment, the images are various pedestrian images extracted from the public data sets pascal voc2007 and pascal voc2012, of which 1396 images containing people of various kinds are taken as training samples and the remaining 698 images are used as the test sample set. The experiment is compared with the original YOLO v3 model algorithm.
In order to quantitatively evaluate the performance advantage of the improved algorithm, the experiment uses the same data set for training and testing, and the test results are then analyzed.
Table 1 Comprehensive performance test results
From the data in Table 1, it can be seen that the improved algorithm achieves a small improvement in precision, recall and average IOU over the original YOLO v3 model.
Table 2 Comparison of AP values
The pedestrian samples in the test set are tested, and the comparison of the AP (average precision) values for the pedestrian category is shown in Table 2. The improved network architecture demonstrates better detection performance than the original network. In addition, when the improved model is used to detect pedestrian images, as shown in fig. 3a to 6b, it improves to a certain extent in the completeness of the prediction boxes, accuracy, missed detections and false detections. The improved YOLO model uses anchor scales computed for the pedestrian data set (pascal voc2007 + pascal voc2012), and a richer layer structure is obtained by extracting features at different scales. Test results show that precision and recall reach 79% and 74% respectively, a slight improvement over the original YOLO v3, while the AP value of the new model for pedestrians improves by 1.72%.
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.
Claims (5)
1. A pedestrian detection method based on an improved YOLO v3 model, characterized in that the method comprises the following steps:
step 1, selecting training samples;
step 2, performing K-means clustering on the samples to obtain new anchor values, and replacing the original data-set anchor parameters in the YOLO v3 model with the new values;
step 3, introducing an Inception module and pruning it to obtain the improved YOLO v3 model;
and step 4, detecting pedestrians with the improved YOLO v3 model to obtain detection results.
2. The pedestrian detection method based on the improved YOLO v3 model of claim 1, wherein the step 1 is specifically: pedestrian images are extracted from the public data sets pascal voc2007 and pascal voc2012 respectively, and training samples are selected with a training set to test set ratio of 2:1.
3. The pedestrian detection method based on the improved YOLO v3 model of claim 1, wherein the step 2 is specifically:
a nonlinear mapping θ is used to map the samples x_i (i = 1, 2, ..., l) into a high-dimensional space G, i.e. the mapped samples are θ(x_1), θ(x_2), ..., θ(x_l);
K-means clustering is performed in the high-dimensional space to optimize the objective function
J = Σ_{k=1}^{K} Σ_{x_i ∈ C_k} ||θ(x_i) − m_k||²   (1)
where the sample mean m_k of cluster C_k is obtained from
m_k = (1 / |C_k|) Σ_{x_i ∈ C_k} θ(x_i)   (2)
In the kernel space, the kernel distance between two feature points is calculated as
||θ(x_i) − θ(x_j)||² = N(x_i, x_i) − 2N(x_i, x_j) + N(x_j, x_j)   (3)
where N is a kernel function.
All sample subsets obtained by clustering are merged; the merged set contains K target categories, and the mean of each of the K categories is calculated as
x̄_i = (1 / n_i) Σ_{x ∈ C_i} x   (4)
where n_i is the number of samples in category i and x̄_i is the mean of the i-th category.
The distance between any two class means is calculated as
I = |x̄_i − x̄_j|²   (5)
If the distance between the means of two target categories is smaller than a preset threshold, the two categories are merged into one; the class-mean distances are then recomputed by formula (5), and the merging continues until a final clustering result is obtained;
finally, anchor values matching the model are calculated from the resulting clusters, and the data-set anchor parameters in the original YOLO v3 model are replaced with the new anchor values.
4. The pedestrian detection method based on the improved YOLO v3 model of claim 1, wherein the step 3 is specifically: an Inception module is introduced and then pruned and optimized; the pruned Inception module is mainly composed of a 3 × 3 convolution branch and a 5 × 5 convolution branch, and the 5 × 5 convolution is in turn replaced by two consecutive 3 × 3 convolution layers; the route layer of the YOLO v3 model merges the two output paths of different receptive fields into one output layer, which is passed to the next convolution network for further feature extraction; and the pruned Inception module is placed into the original YOLO v3 model to obtain the improved YOLO v3 model.
5. The pedestrian detection method based on the improved YOLO v3 model of claim 1, wherein the step 4 is specifically:
step 4a: the image to be detected is divided into blocks; when input to the model, the image size is adjusted adaptively to a square, and the image is then divided into an N × N grid;
step 4b: when the center point of a target falls inside a grid cell, that cell is responsible for classifying the target and detecting its position, and the following operations are carried out:
when the center point of a target falls into one of the N × N grid cells, that cell generates B prediction boxes to detect the target; that is, each cell has B bounding boxes generated from the anchor predictions, together with a confidence score CS indicating whether the cell contains a target, which jointly reflects the probability that a target exists in the bounding box under the current model and the accuracy of the predicted target position:
CS = Pr(Object) × IOU_pred^truth   (6)
where Pr(Object) indicates whether the center point of an object is contained in the cell (1 if so, 0 otherwise), and IOU_pred^truth is the intersection over union between the bounding box predicted by the cell and the ground-truth bounding box of the object;
each cell generates B predicted bounding boxes to detect targets within it, and each predicted bounding box contains 5 parameters [x, y, w, h, confidence], where [x, y] are the coordinates of the target center within the cell, [w, h] are the width and height of the predicted box, and confidence is the intersection over union of the predicted box and the ground-truth box; each cell also corresponds to a predicted value C_i for whether it contains a target of a given class, expressed as
C_i = Pr(Class_i | Object)   (7)
step 4c: each cell obtained in step 4b contains 5 parameters, represented by the vector y_i as follows:
y_i = [b_x, b_y, b_w, b_h, c]   (8)
where (b_x, b_y) are the coordinates of the target center, (b_w, b_h) are the width and height of the bounding box generated by the network for the target, and c is the overall confidence score of the prediction box;
step 4d: after the N × N grid cells have been predicted, the parameters of all cells are sorted and summarized, and the detection result for the whole image is output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911257993.XA CN111046787A (en) | 2019-12-10 | 2019-12-10 | Pedestrian detection method based on improved YOLO v3 model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111046787A true CN111046787A (en) | 2020-04-21 |
Family
ID=70235386
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911257993.XA Pending CN111046787A (en) | 2019-12-10 | 2019-12-10 | Pedestrian detection method based on improved YOLO v3 model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111046787A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934121A (en) * | 2019-02-21 | 2019-06-25 | 江苏大学 | A kind of orchard pedestrian detection method based on YOLOv3 algorithm |
CN110059554A (en) * | 2019-03-13 | 2019-07-26 | 重庆邮电大学 | A kind of multiple branch circuit object detection method based on traffic scene |
CN110276247A (en) * | 2019-05-09 | 2019-09-24 | 南京航空航天大学 | A kind of driving detection method based on YOLOv3-Tiny |
Non-Patent Citations (1)
Title |
---|
GE Wen et al., "Application of the improved YOLOV3 algorithm in pedestrian recognition", Computer Engineering and Applications *
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814749A (en) * | 2020-08-12 | 2020-10-23 | Oppo广东移动通信有限公司 | Human body feature point screening method and device, electronic equipment and storage medium |
CN111950500A (en) * | 2020-08-21 | 2020-11-17 | 成都睿芯行科技有限公司 | Real-time pedestrian detection method based on improved YOLOv3-tiny in factory environment |
CN112418212A (en) * | 2020-08-28 | 2021-02-26 | 西安电子科技大学 | Improved YOLOv3 algorithm based on EIoU |
CN112418212B (en) * | 2020-08-28 | 2024-02-09 | 西安电子科技大学 | YOLOv3 algorithm based on EIoU improvement |
CN112347938B (en) * | 2020-11-09 | 2023-09-26 | 南京机电职业技术学院 | People stream detection method based on improved YOLOv3 |
CN112347938A (en) * | 2020-11-09 | 2021-02-09 | 南京机电职业技术学院 | People stream detection method based on improved YOLOv3 |
CN112560682A (en) * | 2020-12-16 | 2021-03-26 | 重庆守愚科技有限公司 | Valve automatic detection method based on deep learning |
CN112598056A (en) * | 2020-12-21 | 2021-04-02 | 北京工业大学 | Software identification method based on screen monitoring |
CN112633299A (en) * | 2020-12-30 | 2021-04-09 | 深圳市优必选科技股份有限公司 | Target detection method, network, device, terminal equipment and storage medium |
CN112633299B (en) * | 2020-12-30 | 2024-01-16 | 深圳市优必选科技股份有限公司 | Target detection method, network, device, terminal equipment and storage medium |
CN113158897A (en) * | 2021-04-21 | 2021-07-23 | 新疆大学 | Pedestrian detection system based on embedded YOLOv3 algorithm |
CN113011390A (en) * | 2021-04-23 | 2021-06-22 | 电子科技大学 | Road pedestrian small target detection method based on image partition |
CN113609895A (en) * | 2021-06-22 | 2021-11-05 | 上海中安电子信息科技有限公司 | Road traffic information acquisition method based on improved Yolov3 |
CN113935410A (en) * | 2021-10-13 | 2022-01-14 | 甘肃同兴智能科技发展有限责任公司 | Electric power customer portrait method based on cross-correlation density clustering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111046787A (en) | Pedestrian detection method based on improved YOLO v3 model | |
CN109978893B (en) | Training method, device, equipment and storage medium of image semantic segmentation network | |
Feng et al. | A review and comparative study on probabilistic object detection in autonomous driving | |
CN111062413B (en) | Road target detection method and device, electronic equipment and storage medium | |
CN108416250B (en) | People counting method and device | |
CN107563372B (en) | License plate positioning method based on deep learning SSD frame | |
US20210089895A1 (en) | Device and method for generating a counterfactual data sample for a neural network | |
CN103020978B (en) | SAR (synthetic aperture radar) image change detection method combining multi-threshold segmentation with fuzzy clustering | |
CN110309747B (en) | Support quick degree of depth pedestrian detection model of multiscale | |
CN110348437B (en) | Target detection method based on weak supervised learning and occlusion perception | |
CN107784288B (en) | Iterative positioning type face detection method based on deep neural network | |
CN110322445B (en) | Semantic segmentation method based on maximum prediction and inter-label correlation loss function | |
CN106778687A (en) | Method for viewing points detecting based on local evaluation and global optimization | |
CN112861970B (en) | Fine-grained image classification method based on feature fusion | |
CN111368634B (en) | Human head detection method, system and storage medium based on neural network | |
CN112287983B (en) | Remote sensing image target extraction system and method based on deep learning | |
CN113609895A (en) | Road traffic information acquisition method based on improved Yolov3 | |
CN110738132A (en) | target detection quality blind evaluation method with discriminant perception capability | |
CN110781970A (en) | Method, device and equipment for generating classifier and storage medium | |
CN112801227A (en) | Typhoon identification model generation method, device, equipment and storage medium | |
CN112149664A (en) | Target detection method for optimizing classification and positioning tasks | |
CN114549909A (en) | Pseudo label remote sensing image scene classification method based on self-adaptive threshold | |
CN112528058B (en) | Fine-grained image classification method based on image attribute active learning | |
CN111797795A (en) | Pedestrian detection algorithm based on YOLOv3 and SSR | |
CN116030300A (en) | Progressive domain self-adaptive recognition method for zero-sample SAR target recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||