CN112884810A - Pedestrian tracking method based on YOLOv3 - Google Patents

Pedestrian tracking method based on YOLOv3

Info

Publication number
CN112884810A
Authority
CN
China
Prior art keywords
target
tracking
frame
video
color
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110290409.1A
Other languages
Chinese (zh)
Other versions
CN112884810B (en)
Inventor
张德慧
张德育
吕艳辉
徐子睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Ligong University
Original Assignee
Shenyang Ligong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Ligong University filed Critical Shenyang Ligong University
Priority to CN202110290409.1A priority Critical patent/CN112884810B/en
Publication of CN112884810A publication Critical patent/CN112884810A/en
Application granted granted Critical
Publication of CN112884810B publication Critical patent/CN112884810B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/40 Image enhancement or restoration by the use of histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/90 Determination of colour characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a pedestrian tracking method based on YOLOv3, and relates to the field of target tracking in computer vision. The method comprises three parts: target detection, target matching and target prediction. The target detection part uses YOLOv3 to identify all pedestrians in the field of view; the target matching part matches the detection results against a template based on the color features of the object and locks onto the target to be tracked; the target prediction part predicts the position of the target in the next frame and narrows the detection range, thereby improving tracking accuracy. The invention can track a single selected pedestrian in a video with a tracking accuracy of about 99% and a tracking speed of about 22 frames per second, meeting real-time requirements.

Description

Pedestrian tracking method based on YOLOv3
Technical Field
The invention relates to the field of target tracking application in computer vision, in particular to a pedestrian tracking method based on YOLOv 3.
Background
The YOLO (You Only Look Once) algorithm is a target detection algorithm based on deep learning proposed by Redmon et al. in 2016; it stands out among deep-learning detection algorithms for advantages such as its simple network structure and high detection speed. YOLOv3 is the third version of this family of algorithms and is popular in industry for its high stability.
As an important application field of computer vision, target tracking has developed through three stages: early classical target tracking algorithms, target tracking algorithms based on correlation filtering, and today's target tracking algorithms based on deep learning, the last of which has become the mainstream of current target tracking technology. In 2012, deep-learning-based methods achieved great success in image recognition and related fields, the most representative being the AlexNet network. Taking that point as a watershed, deep-learning-based target detection and tracking algorithms began to come to the fore in computer vision, and achieved excellent performance in the VOT (Visual Object Tracking) 2017 challenge. Although the processing speed of deep-learning-based target tracking does not exceed that of correlation-filtering-based tracking, its tracking accuracy far surpasses both correlation-filtering-based and early classical algorithms.
Disclosure of Invention
To address the low single-target tracking accuracy and poor real-time performance of deep-learning-based target tracking algorithms, the invention provides a pedestrian tracking method based on YOLOv3 that achieves high accuracy while meeting the requirements of real-time target tracking.
The technical scheme of the invention is a pedestrian tracking method based on YOLOv3, which specifically comprises the following steps:
step 1: a user manually selects a tracking target in a first frame of a video to be detected by using machine vision software, and the tracking target is used as a template;
The video is paused at the first frame, waiting for the user to manually frame the target to be tracked: pressing the left mouse button opens drawing mode; moving the mouse frames the target to be tracked; releasing the left mouse button closes drawing mode; the drawn result is confirmed and used as the template.
Step 2: detecting all pedestrians in the video by using a YOLOv3 algorithm;
Using the COCO data set preset by the official YOLO release, the 'person' label is set as the detection criterion and only targets labeled 'person' are retained; all objects labeled 'person' in the video, i.e., all pedestrians, are detected using the YOLOv3 algorithm.
Step 2.1: adjusting the size of an input picture to be a fixed size;
step 2.2: detecting an object specified by the COCO data set through a Darknet-53 neural network;
step 2.3: setting size classification standards, and outputting in three branches, namely large, medium and small according to different target sizes;
Step 3: matching all pedestrian detection results against the tracking target template using a color histogram algorithm, and locking the tracking target;
step 3.1: analyzing the image color characteristics by using a color histogram algorithm;
The two pictures to be compared are resized to the same dimensions and their color histograms are computed; for a three-channel color picture, three color histograms are counted for R, G and B respectively; finally the Bhattacharyya coefficient is calculated, its formula being
$$\rho(P, P') = \sum_{i=1}^{N} \sqrt{P(i)\,P'(i)}$$
where ρ is the Bhattacharyya coefficient, P and P' are the color histograms of the two pictures, P(i) and P'(i) are the i-th components of the two histograms, and N is the total number of components;
step 3.2: taking the image color features as the matching condition, the Bhattacharyya coefficient between each detected pedestrian and the template is calculated, and the pedestrian detection result with the largest Bhattacharyya coefficient, i.e. the smallest image difference, is taken as the tracking target.
Step 4: predicting the position of the tracking target in the next frame of the video using a K neighborhood algorithm, narrowing the detection range and improving tracking accuracy;
The K neighborhood algorithm is applied to the pedestrian detection and tracking result of the current frame: taking the target rectangle detected in the current frame as the base, an adjacent area is searched in the next frame, with the center point of the search rectangle coinciding with the center point of the base rectangle, as shown in the following formula:
$$W_{search} = K \cdot W_{object}, \qquad H_{search} = K \cdot H_{object}$$
where $W_{search}$ and $H_{search}$ denote the width and height of the rectangular search area, $W_{object}$ and $H_{object}$ denote the width and height of the target rectangle in the previous frame, and K is the expansion ratio of the prediction frame relative to the detection frame.
Step 5: judging whether the video is finished:
if the video has not been fully processed, jump to step 2; if the video has been processed, the program ends and pedestrian tracking is complete.
The beneficial effects of adopting this technical solution are as follows:
The invention provides a pedestrian tracking method based on YOLOv3 that greatly improves single-target tracking accuracy; the target can be freely specified by the user, giving the method strong generality. It overcomes the defects of low tracking accuracy, poor real-time performance and fixed tracking targets in the prior art.
Drawings
FIG. 1 is a flow chart of the pedestrian tracking method based on YOLOv3 according to the present invention;
FIG. 2 is a schematic diagram of a statistical histogram of colors according to the present invention;
FIG. 3 is a flow chart of a K neighborhood algorithm in an embodiment of the present invention;
FIG. 4 is a diagram illustrating the K neighborhood algorithm results of the present invention;
FIG. 5 is a schematic diagram illustrating the effect of the pedestrian tracking method based on YOLOv3 according to the present invention;
FIG. 6 is a flow chart of the YOLOv3 algorithm in an embodiment of the present invention;
fig. 7 is a flowchart of a color histogram algorithm in an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The invention provides a pedestrian tracking method based on YOLOv3, the flow chart of which is shown in FIG. 1; the method comprises the following steps:
step 1: a user manually selects a tracking target in a first frame of a video to be detected by using machine vision software, and the tracking target is used as a template;
The video is paused at the first frame and the user manually frames the target to be tracked in it: pressing the left mouse button opens drawing mode; the mouse is then moved with the left button held down so that the selection box wraps the target as tightly as possible, reducing interference from background information; once framing is finished, the left mouse button is released, drawing mode is closed, and the framed portion is used as the template.
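As an illustration only (this sketch is not part of the patent text), the frame-selection interaction of step 1 maps naturally onto OpenCV's built-in ROI selector, which implements exactly the press-drag-release sequence described above; the video path and window name are hypothetical placeholders:

```python
import cv2

# Minimal sketch of step 1: pause on the first frame and let the user
# drag a box around the pedestrian to be tracked.
cap = cv2.VideoCapture("input.mp4")  # hypothetical input video
ok, first_frame = cap.read()
assert ok, "could not read the first frame"

# selectROI blocks until the user confirms with ENTER/SPACE; the drag
# interaction corresponds to the press/move/release steps in the text.
x, y, w, h = cv2.selectROI("select target", first_frame, False)  # no crosshair
template = first_frame[y:y + h, x:x + w]  # the framed patch becomes the template
cv2.destroyWindow("select target")
```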
In this embodiment, the simulation is performed on an R740 server with an Intel Xeon CPU and an Nvidia Titan X GPU; the operating system is Ubuntu Server 16.04 (Linux) and the OpenCV version is 3.4.5;
step 2: all pedestrians in the video were detected using the YOLOv3 algorithm, as shown in fig. 6;
Using the COCO data set preset by the official YOLO release, the 'person' label is set as the detection criterion and only targets labeled 'person' are retained; all objects labeled 'person' in the video, i.e., all pedestrians, are detected using the YOLOv3 algorithm.
Step 2.1: adjusting the size of the input picture to a fixed size, in this embodiment, the size is 416 × 416;
step 2.2: detecting an object specified by the COCO data set through a Darknet-53 neural network;
step 2.3: setting size classification standards, and outputting in three branches, namely large, medium and small according to different target sizes;
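As a minimal sketch of step 2 (an assumption-laden illustration, not the patent's own code), the detection stage can be approximated with OpenCV's DNN module loading the standard Darknet configuration and weights; the file names, confidence threshold and function name are assumptions, and COCO class id 0 is the 'person' label:

```python
import cv2
import numpy as np

# Hypothetical paths to the standard YOLOv3 Darknet files.
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")

def detect_persons(frame, conf_thresh=0.5):
    """Return [x, y, w, h] boxes for every COCO 'person' (class id 0)."""
    H, W = frame.shape[:2]
    # Step 2.1: resize the input to the fixed 416 x 416 network size.
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    # Step 2.3: the three output branches correspond to the
    # large / medium / small detection scales.
    outs = net.forward(net.getUnconnectedOutLayersNames())
    boxes = []
    for out in outs:
        for row in out:  # row = [cx, cy, w, h, objectness, 80 class scores]
            scores = row[5:]
            if np.argmax(scores) == 0 and scores[0] > conf_thresh:
                cx, cy, bw, bh = row[:4] * np.array([W, H, W, H])
                boxes.append([max(0, int(cx - bw / 2)),
                              max(0, int(cy - bh / 2)), int(bw), int(bh)])
    return boxes
```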
Step 3: matching all pedestrian detection results against the tracking target template using a color histogram algorithm, and locking the tracking target;
step 3.1: analyzing the image color characteristics by using a color histogram algorithm, as shown in fig. 7;
the sizes of two pictures to be compared are adjusted to be consistent, a statistical color histogram is calculated, and for the three-channel color picture, three color histograms of R, G and B are respectively calculated, as shown in FIG. 2, R is a red component of a color image pixel value, G is a green component of the color image pixel value, and B is a blue component of the color image pixel value. Finally, calculating the Babbitt coefficient, wherein the larger the obtained Babbitt coefficient is, the more similar the pictures are; the calculation formula of the Babbitt coefficient is
$$\rho(P, P') = \sum_{i=1}^{N} \sqrt{P(i)\,P'(i)}$$
where ρ is the Bhattacharyya coefficient, P and P' are the color histograms of the two pictures, P(i) and P'(i) are the i-th components of the two histograms, and N is the total number of components; in this embodiment N is 256. The candidate box with the smallest difference is taken as the tracking target and its frame is displayed on the screen.
Step 3.2: and calculating the color difference between all pedestrians and the template in the detection result and calculating the Papanicolaou coefficient of the pedestrians by taking the image color characteristics as matching conditions, and taking the pedestrian detection result with the maximum Papanicolaou coefficient, namely the minimum image difference as a tracking target.
Step 4: predicting the position of the tracking target in the next frame of the video using the K neighborhood algorithm shown in FIG. 3, narrowing the detection range and improving tracking accuracy; the result is shown in FIG. 4;
The K neighborhood algorithm is applied to the pedestrian detection and tracking result of the current frame: taking the target rectangle detected in the current frame as the base, an adjacent area is searched in the next frame, with the center point of the search rectangle coinciding with the center point of the base rectangle, as shown in the following formula:
$$W_{search} = K \cdot W_{object}, \qquad H_{search} = K \cdot H_{object}$$
where $W_{search}$ and $H_{search}$ denote the width and height of the rectangular search area, $W_{object}$ and $H_{object}$ denote the width and height of the target rectangle in the previous frame, and K is the expansion ratio of the prediction frame relative to the detection frame; in this embodiment, K = 2.
Narrowing the field of view through the target prediction algorithm reduces the amount of computation and improves tracking accuracy.
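The search-region computation of step 4 is simple enough to sketch directly; clipping the search rectangle to the frame bounds is an added assumption not stated in the text:

```python
def k_neighborhood(box, frame_shape, K=2):
    """Expand (x, y, w, h) by factor K about its center: W_search = K * W_object."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2          # search rectangle keeps the same center
    ws, hs = K * w, K * h                  # H_search = K * H_object; K = 2 here
    H, W = frame_shape[:2]
    x0, y0 = max(0, int(cx - ws / 2)), max(0, int(cy - hs / 2))
    x1, y1 = min(W, int(cx + ws / 2)), min(H, int(cy + hs / 2))
    return x0, y0, x1 - x0, y1 - y0        # clipped to the frame bounds
```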
Step 5: judging whether the video is finished:
if the video has not been fully processed, jump to step 2; if the video has been processed, the program ends and pedestrian tracking is complete.
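Putting the pieces together, a hypothetical main loop for steps 2 through 5 might look as follows; it reuses the helpers sketched above, and the coordinate offset maps detections made inside the search region back to full-frame coordinates:

```python
def crop(img, box):
    """Cut the (x, y, w, h) box out of the image."""
    x, y, w, h = box
    return img[y:y + h, x:x + w]

search_box = None  # None means: search the whole frame (first iteration)
while True:
    ok, frame = cap.read()
    if not ok:                      # step 5: no frames left, tracking is done
        break
    ox, oy = (search_box[0], search_box[1]) if search_box is not None else (0, 0)
    view = crop(frame, search_box) if search_box is not None else frame
    candidates = detect_persons(view)                        # step 2
    if candidates:
        best = max(candidates,                               # step 3: lock target
                   key=lambda b: bhattacharyya_coefficient(template, crop(view, b)))
        x, y, w, h = best
        target = (x + ox, y + oy, w, h)                      # back to frame coords
        search_box = k_neighborhood(target, frame.shape)     # step 4: next region
cap.release()
```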
The final experimental results are shown in FIG. 5: the tracking accuracy is 99.5% and the tracking speed is about 22 frames per second, which meets the requirements of practical applications.

Claims (3)

1. A pedestrian tracking method based on YOLOv3 is characterized by comprising the following steps:
step 1: a user manually selects a tracking target in a first frame of a video to be detected by using machine vision software, and the tracking target is used as a template;
the video is paused at the first frame, waiting for the user to manually frame the target to be tracked: pressing the left mouse button opens drawing mode; moving the mouse frames the target to be tracked; releasing the left mouse button closes drawing mode; the drawn result is confirmed and used as the template;
step 2: detecting all pedestrians in the video by using a YOLOv3 algorithm;
using the COCO data set preset by the official YOLO release, the 'person' label is set as the detection criterion and only targets labeled 'person' are retained; all objects labeled 'person' in the video, i.e., all pedestrians, are detected using the YOLOv3 algorithm;
step 2.1: adjusting the size of an input picture to be a fixed size;
step 2.2: detecting an object specified by the COCO data set through a Darknet-53 neural network;
step 2.3: setting size classification standards, and outputting in three branches, namely large, medium and small according to different target sizes;
Step 3: matching all pedestrian detection results against the tracking target template using a color histogram algorithm, and locking the tracking target;
and 4, step 4: predicting the position of a tracking target in the next frame of the video by using a K neighborhood algorithm on the pedestrian detection tracking result of the current frame, reducing the detection range and improving the tracking accuracy;
and 5: judging whether the video is finished:
if the video is not processed, jumping to the step 2; and if the video is processed, ending the program and completing the pedestrian tracking.
2. The method for pedestrian tracking based on YOLOv3 as claimed in claim 1, wherein step 3 specifically comprises the steps of:
step 3.1: analyzing the image color characteristics by using a color histogram algorithm;
the two pictures to be compared are resized to the same dimensions and their color histograms are computed; for a three-channel color picture, three color histograms are counted for R, G and B respectively; finally the Bhattacharyya coefficient is calculated, its formula being
$$\rho(P, P') = \sum_{i=1}^{N} \sqrt{P(i)\,P'(i)}$$
where ρ is the Bhattacharyya coefficient, P and P' are the color histograms of the two pictures, P(i) and P'(i) are the i-th components of the two histograms, and N is the total number of components;
step 3.2: taking the image color features as the matching condition, the Bhattacharyya coefficient between each detected pedestrian and the template is calculated, and the pedestrian detection result with the largest Bhattacharyya coefficient, i.e. the smallest image difference, is taken as the tracking target.
3. The method as claimed in claim 1, wherein the K neighborhood takes the target rectangle detected in the current frame as the base; in the next frame, an adjacent area is searched with this rectangle as reference, and the center point of the search rectangle coincides with the center point of the base rectangle, as shown in the following formula:
$$W_{search} = K \cdot W_{object}, \qquad H_{search} = K \cdot H_{object}$$
where $W_{search}$ and $H_{search}$ denote the width and height of the rectangular search area, $W_{object}$ and $H_{object}$ denote the width and height of the target rectangle in the previous frame, and K is the expansion ratio of the prediction frame relative to the detection frame.
CN202110290409.1A 2021-03-18 2021-03-18 Pedestrian tracking method based on YOLOv3 Active CN112884810B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110290409.1A CN112884810B (en) 2021-03-18 2021-03-18 Pedestrian tracking method based on YOLOv3

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110290409.1A CN112884810B (en) 2021-03-18 2021-03-18 Pedestrian tracking method based on YOLOv3

Publications (2)

Publication Number Publication Date
CN112884810A true CN112884810A (en) 2021-06-01
CN112884810B CN112884810B (en) 2024-02-02

Family

ID=76041053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110290409.1A Active CN112884810B (en) 2021-03-18 2021-03-18 Pedestrian tracking method based on YOLOv3

Country Status (1)

Country Link
CN (1) CN112884810B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018133666A1 (en) * 2017-01-17 2018-07-26 腾讯科技(深圳)有限公司 Method and apparatus for tracking video target
CN107563313A (en) * 2017-08-18 2018-01-09 北京航空航天大学 Multiple target pedestrian detection and tracking based on deep learning
CN108509859A (en) * 2018-03-09 2018-09-07 南京邮电大学 A kind of non-overlapping region pedestrian tracting method based on deep neural network
WO2019237536A1 (en) * 2018-06-11 2019-12-19 平安科技(深圳)有限公司 Target real-time tracking method and apparatus, and computer device and storage medium
CN110516705A (en) * 2019-07-19 2019-11-29 平安科技(深圳)有限公司 Method for tracking target, device and computer readable storage medium based on deep learning
CN111126152A (en) * 2019-11-25 2020-05-08 国网信通亿力科技有限责任公司 Video-based multi-target pedestrian detection and tracking method
CN111241931A (en) * 2019-12-30 2020-06-05 沈阳理工大学 Aerial unmanned aerial vehicle target identification and tracking method based on YOLOv3
CN111582062A (en) * 2020-04-21 2020-08-25 电子科技大学 Re-detection method in target tracking based on YOLOv3

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
任珈民; 宫宁生; 韩镇阳: "Multi-object tracking algorithm based on YOLOv3 and Kalman filtering" (in Chinese), 计算机应用与软件 (Computer Applications and Software), no. 05 *
王超; 苏湛: "Research on Camshift target tracking based on Kalman and SURF" (in Chinese), 软件导刊 (Software Guide), no. 01 *

Also Published As

Publication number Publication date
CN112884810B (en) 2024-02-02

Similar Documents

Publication Publication Date Title
CN105005766B (en) A kind of body color recognition methods
US9020195B2 (en) Object tracking device, object tracking method, and control program
US7869631B2 (en) Automatic skin color model face detection and mean-shift face tracking
US7218759B1 (en) Face detection in digital images
US6404900B1 (en) Method for robust human face tracking in presence of multiple persons
CN102426649B (en) Simple steel seal digital automatic identification method with high accuracy rate
CN109271937B (en) Sports ground marker identification method and system based on image processing
JP4373840B2 (en) Moving object tracking method, moving object tracking program and recording medium thereof, and moving object tracking apparatus
CN111161313B (en) Multi-target tracking method and device in video stream
CN108470356B (en) Target object rapid ranging method based on binocular vision
EP1300804A2 (en) Face detecting method by skin color recognition
CN107230219B (en) Target person finding and following method on monocular robot
CN113052170B (en) Small target license plate recognition method under unconstrained scene
CN109961016B (en) Multi-gesture accurate segmentation method for smart home scene
CN110991398A (en) Gait recognition method and system based on improved gait energy map
CN111242074A (en) Certificate photo background replacement method based on image processing
CN111310768A (en) Saliency target detection method based on robustness background prior and global information
CN110956184A (en) Abstract diagram direction determination method based on HSI-LBP characteristics
Zeng et al. Adaptive foreground object extraction for real-time video surveillance with lighting variations
CN107194954B (en) Player tracking method and device of multi-view video
Hu et al. Fast face detection based on skin color segmentation using single chrominance Cr
CN110310303B (en) Image analysis multi-target tracking method
Tan et al. Gesture segmentation based on YCb'Cr'color space ellipse fitting skin color modeling
CN112884810B (en) Pedestrian tracking method based on YOLOv3
JP4181313B2 (en) Scene content information adding device and scene content information adding program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant