CN112581496A - Multi-target pedestrian trajectory tracking method based on reinforcement learning - Google Patents
- Publication number
- CN112581496A (application number CN201910934151.7A)
- Authority
- CN
- China
- Prior art keywords
- target
- tracking
- reinforcement learning
- track
- learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention provides a multi-target pedestrian trajectory tracking method based on reinforcement learning, which tracks the trajectories of multiple pedestrians in videos of complex scenes by means of deep reinforcement learning. The method comprises the following steps: assigning to each tracked target a single-target tracker trained by deep reinforcement learning, to obtain the trajectory of each target; detecting the positions of the targets in the current frame with a high-precision target detector; and associating the detection results with the tracked trajectories using the Hungarian algorithm, according to the current appearance and position information of the targets, thereby achieving continuous multi-target pedestrian trajectory tracking over the video sequence. The invention combines the advantages of deep learning and reinforcement learning to track target positions more accurately. In addition, data association uses a cost matrix that fuses appearance and position features, which effectively alleviates problems such as occlusion and missed detection and improves multi-target trajectory tracking accuracy.
Description
Technical Field
The invention relates to a multi-target tracking problem in the field of machine learning, in particular to a multi-target pedestrian trajectory tracking method based on reinforcement learning.
Background
With artificial intelligence research elevated to the level of national strategy, the state has called for strengthening the research, development, and application of a new generation of artificial intelligence, expanding intelligent living by developing intelligent industries, and upgrading traditional industries through new technologies, new business forms, and new models. In recent years, target tracking, a topic of intensive research in computer vision, has received much attention from scholars in China and abroad.
Target tracking is the process of continuously inferring the state of a target in a video sequence: the task is to locate the target in each frame of the video and then associate it across frames to form a motion trajectory. Target tracking can be divided into single-target tracking and multi-target tracking. Multi-target tracking is the more complex problem, because each target must be tracked effectively and the mutual interference among different targets must also be resolved. Although multi-target tracking is highly challenging, it is in great demand in many scenarios, and multi-pedestrian tracking in particular has outstanding practical value and application prospects. It is widely applied in fields such as intelligent surveillance, autonomous driving, robot visual navigation, and human-computer interaction.
Traditional multi-target tracking algorithms include multi-hypothesis tracking, multi-target tracking based on correlation filtering, and approximate multi-target tracking based on local flow features; most of these methods cannot handle problems such as occlusion among targets and trajectory drift in complex scenes. With the rapid development of deep learning in recent years, target detection accuracy has improved steadily, which has to some extent advanced detection-based multi-target tracking. Detection-based tracking is nevertheless limited by the accuracy of target position prediction and struggles to achieve very good results. Meanwhile, deep reinforcement learning algorithms, which combine deep learning with reinforcement learning, have achieved many excellent results on decision problems and have become a new research direction for multi-target tracking.
Disclosure of Invention
The invention aims to provide a multi-target pedestrian trajectory tracking method based on reinforcement learning, which converts a tracking task into a Markov decision process for solving, trains a neural network by utilizing a mode of combining deep learning and reinforcement learning, and predicts and tracks the position of a target.
For convenience of explanation, the following concepts are first introduced:
Markov Decision Process (MDP): a mathematical model of sequential decision making, used to model the stochastic policies available to an agent, and the returns it can achieve, in an environment whose system state has the Markov property. An MDP is built on a pair of interacting objects, the agent and the environment, and its elements include states, actions, policies, and rewards. In an MDP, the agent perceives the current system state and acts on the environment according to its policy, thereby changing the state of the environment and receiving a reward; the accumulation of rewards over time is called the return.
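The interaction between agent and environment described above can be sketched on a toy problem. This is a minimal illustration only, not the tracker of the invention: the one-dimensional environment, the hand-written policy, the reward values of +1 and -1, and the discount factor `gamma` are all assumptions introduced for the example. With `gamma=1.0` the return is the plain accumulation of rewards mentioned in the definition.

```python
def step(offset, action):
    """Toy environment (hypothetical).

    `offset` is the integer distance between a tracking box and its target;
    `action` is -1, 0 or +1. The next state depends only on the current
    state and action, which is the Markov property.
    """
    new_offset = offset + action
    reward = 1.0 if new_offset == 0 else -1.0  # +1 when the box is on target
    return new_offset, reward


def policy(offset):
    """Hand-written policy: always move the box toward the target."""
    if offset > 0:
        return -1
    if offset < 0:
        return 1
    return 0


def rollout(offset, horizon=5, gamma=1.0):
    """Run the agent for `horizon` steps and accumulate the (discounted) return."""
    ret, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(offset)                # agent perceives state, picks action
        offset, reward = step(offset, action)  # environment changes state, pays reward
        ret += discount * reward
        discount *= gamma
    return offset, ret


final_offset, ret = rollout(offset=3)
print(final_offset, ret)
```

Starting three units away, the agent collects two penalties while approaching and three rewards once aligned.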
Reinforcement Learning (RL): also known as evaluative learning, one of the paradigms and methodologies of machine learning. It describes and solves the problem of an agent that, while interacting with the environment, learns a policy to maximize its return or to achieve a specific goal.
The invention specifically adopts the following technical scheme:
A multi-target pedestrian trajectory tracking method based on reinforcement learning, characterized by:
a. converting the tracking task into a Markov decision process for solving;
b. training a single-target tracking network by combining supervised learning and reinforcement learning;
c. fusing target appearance information and position information to perform data association on the multi-target tracking trajectories;
the method mainly comprises the following steps:
(1) training a single-target tracking network by combining deep learning and reinforcement learning;
(2) assigning a single-target tracker obtained in step (1) to each tracked target in the current video frame, and tracking the positions of the multiple targets simultaneously;
(3) detecting targets in the current video frame with a high-precision target detector, and extracting their appearance features and position information;
(4) generating a cost matrix from the appearance similarity and position similarity between the set of tracked trajectories and the set of detection results, and performing multi-target data association with the Hungarian algorithm to obtain the tracking result of the current frame;
(5) taking the tracking result of the current frame as the input to the next round of tracking, and repeating steps (2) to (4) to track the multi-target pedestrian trajectories over the whole video sequence.
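The data association of step (4) can be sketched as follows. The patent does not state how appearance and position similarity are measured or weighted, so this sketch assumes cosine similarity between appearance feature vectors, intersection over union (IoU) between bounding boxes, an equal weighting `w_app=0.5`, and a gating threshold `max_cost`; the dictionary keys `feat` and `box` are likewise hypothetical. The assignment is solved with a standard potentials-based implementation of the Hungarian algorithm written in plain Python.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


def cost_matrix(tracks, dets, w_app=0.5):
    """cost = 1 - fused similarity (assumed weighting, not taken from the patent)."""
    return [[1.0 - (w_app * cosine(t["feat"], d["feat"])
                    + (1.0 - w_app) * iou(t["box"], d["box"]))
             for d in dets] for t in tracks]


def hungarian(cost):
    """Minimum-cost assignment as (row, col) pairs; requires rows <= cols."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # row and column potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:                            # grow an augmenting path
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):             # update potentials
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:                         # flip matches along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def associate(tracks, dets, max_cost=0.5):
    """Return sorted (track_index, detection_index) matches below the gate."""
    if not tracks or not dets:
        return []
    cost = cost_matrix(tracks, dets)
    if len(tracks) <= len(dets):
        pairs = hungarian(cost)
    else:  # more tracks than detections: solve the transposed problem
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(r, c) for c, r in hungarian(transposed)]
    return sorted((r, c) for r, c in pairs if cost[r][c] <= max_cost)


tracks = [{"box": (10, 10, 50, 100), "feat": (1.0, 0.0)},
          {"box": (200, 20, 240, 110), "feat": (0.0, 1.0)}]
dets = [{"box": (202, 22, 242, 112), "feat": (0.1, 0.9)},
        {"box": (12, 11, 52, 101), "feat": (0.9, 0.1)},
        {"box": (400, 50, 440, 140), "feat": (0.7, 0.7)}]
matches = associate(tracks, dets)
print(matches)
```

In this example the third detection matches no trajectory; a full tracker would typically treat such a detection as a candidate new target, and an unmatched trajectory as possibly occluded, though the patent does not spell out that bookkeeping.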
The invention has the beneficial effects that:
(1) The incentive mechanism of reinforcement learning is fully exploited: the machine learns the optimal decision automatically, which improves target tracking performance.
(2) The video sequence is detected frame by frame during tracking, which effectively alleviates problems such as occlusion and trajectory drift.
(3) The appearance features and position information of the target are fused for data association, which guards against false detections, missed detections, and trajectory loss caused by target occlusion.
(4) Supervised learning and reinforcement learning are combined, which addresses the low accuracy of traditional methods and increases the research value of the method.
Drawings
FIG. 1 is a diagram of a single target tracker network architecture based on reinforcement learning.
Detailed Description
The present invention is described in further detail below with reference to the drawings and examples. It should be noted that the following examples only illustrate the invention and should not be construed as limiting its scope; insubstantial modifications and adaptations made by those skilled in the art on the basis of the above disclosure still fall within the scope of protection of the invention.
The multi-target pedestrian trajectory tracking method based on reinforcement learning specifically comprises the following steps:
(1) pre-training the single-target tracking network by supervised learning so that the network is able to select correct actions, and then further optimizing the network parameters with a reinforcement learning policy gradient algorithm so that the network can predict and track the target position;
(2) assigning a single-target tracker obtained in step (1) to each tracked target in the current video frame and tracking the positions of the multiple targets, to obtain the appearance features and position information of each target in the current frame;
(3) detecting targets in the current video frame with a high-precision target detector, and extracting the appearance features and position information of the target inside each detected bounding box;
(4) generating a cost matrix from the appearance similarity and position similarity between the set of tracked trajectories and the set of detection results, performing multi-target data association with the Hungarian algorithm so that tracked trajectories and detections are matched one to one, and thereby obtaining the tracking result of the current frame;
(5) taking the tracking result of the current frame as the input to the next round of tracking, and repeating steps (2) to (4) to track the multi-target pedestrian trajectories over the whole video sequence.
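The two-stage training of step (1) can be illustrated on a toy problem. The sketch below is not the network of the invention: it replaces the neural network with a small table of logits, and the state set, the action set, the expert labels, and the +1/-1 reward are all assumptions made for the example. It first pre-trains a softmax policy by supervised learning on expert actions, then refines it with a REINFORCE-style policy gradient update.

```python
import math
import random

# Hypothetical discrete states and actions for a one-dimensional tracker.
STATES = ("target_left", "aligned", "target_right")
ACTIONS = ("left", "stay", "right")
EXPERT = {"target_left": 0, "aligned": 1, "target_right": 2}  # index into ACTIONS


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_prob(probs, action):
    """Gradient of log softmax(logits)[action] w.r.t. the logits."""
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def pretrain_supervised(theta, steps=20, lr=0.5):
    """Stage 1: raise the log-likelihood of the expert action in every state."""
    for _ in range(steps):
        for s in STATES:
            g = grad_log_prob(softmax(theta[s]), EXPERT[s])
            theta[s] = [t + lr * gi for t, gi in zip(theta[s], g)]


def finetune_policy_gradient(theta, episodes=300, lr=0.1, seed=0):
    """Stage 2: REINFORCE update, theta += lr * reward * grad log pi(a|s)."""
    rng = random.Random(seed)
    for _ in range(episodes):
        s = rng.choice(STATES)
        probs = softmax(theta[s])
        a = rng.choices(range(len(ACTIONS)), weights=probs)[0]  # sample from policy
        # Stand-in reward; a real tracker might score box overlap instead.
        reward = 1.0 if a == EXPERT[s] else -1.0
        g = grad_log_prob(probs, a)
        theta[s] = [t + lr * reward * gi for t, gi in zip(theta[s], g)]


def greedy_action(theta, s):
    return max(range(len(ACTIONS)), key=lambda i: theta[s][i])


theta = {s: [0.0, 0.0, 0.0] for s in STATES}   # tabular "network parameters"
pretrain_supervised(theta)
finetune_policy_gradient(theta)
learned = {s: ACTIONS[greedy_action(theta, s)] for s in STATES}
print(learned)
```

Pre-training gives the policy a sensible starting point, so the sampled actions in the second stage are mostly useful ones; this is one common motivation for combining the two stages, stated here as general background rather than as a claim of the patent.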
Claims (4)
1. A multi-target pedestrian trajectory tracking method based on reinforcement learning, characterized by:
a. converting the tracking task into a Markov decision process for solving;
b. training a single-target tracking network by combining supervised learning and reinforcement learning;
c. fusing target appearance information and position information to perform data association on the multi-target tracking trajectories;
the method mainly comprises the following steps:
(1) training a single-target tracking network by combining deep learning and reinforcement learning;
(2) assigning a single-target tracker obtained in step (1) to each tracked target in the current video frame, and tracking the positions of the multiple targets simultaneously;
(3) detecting targets in the current video frame with a high-precision target detector, and extracting their appearance features and position information;
(4) generating a cost matrix from the appearance similarity and position similarity between the set of tracked trajectories and the set of detection results, and performing multi-target data association with the Hungarian algorithm to obtain the tracking result of the current frame;
(5) taking the tracking result of the current frame as the input to the next round of tracking, and repeating steps (2) to (4) to track the multi-target pedestrian trajectories over the whole video sequence.
2. The reinforcement learning-based multi-target pedestrian trajectory tracking method according to claim 1, characterized in that in step (1), each single-target tracker serves as an independent agent for which a Markov decision process is constructed, the mapping from tracked-target states to actions is learned through a reward and penalty mechanism, and the tracking policy is optimized using temporal information.
3. The reinforcement learning-based multi-target pedestrian trajectory tracking method according to claim 1, characterized in that in step (1), the single-target tracking network is trained by combining supervised learning and reinforcement learning, the network parameters are optimized over multiple rounds, and the accuracy of position prediction and tracking is improved.
4. The reinforcement learning-based multi-target pedestrian trajectory tracking method according to claim 1, characterized in that in step (4), the appearance features and position information of the tracked trajectories and the detection results are extracted and fused into a similarity cost matrix, so that the feature differences between different targets are fully exploited, target confusion in complex scenes is reduced, and the tracking algorithm is robust to some degree against problems such as target occlusion, false detection, and missed detection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910934151.7A CN112581496A (en) | 2019-09-29 | 2019-09-29 | Multi-target pedestrian trajectory tracking method based on reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910934151.7A CN112581496A (en) | 2019-09-29 | 2019-09-29 | Multi-target pedestrian trajectory tracking method based on reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112581496A (en) | 2021-03-30 |
Family
ID=75111277
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910934151.7A Pending CN112581496A (en) | 2019-09-29 | 2019-09-29 | Multi-target pedestrian trajectory tracking method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112581496A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108447076A (en) * | 2018-03-16 | 2018-08-24 | Tsinghua University | Multi-target tracking method based on deep reinforcement learning |
CN108921873A (en) * | 2018-05-29 | 2018-11-30 | Fuzhou University | Markov decision online multi-object tracking method based on kernelized correlation filter optimization |
CN109919974A (en) * | 2019-02-21 | 2019-06-21 | University of Shanghai for Science and Technology | Online multi-object tracking method based on multi-candidate association under the R-FCN framework |
US20190266420A1 (en) * | 2018-02-27 | 2019-08-29 | TuSimple | System and method for online real-time multi-object tracking |
CN110288627A (en) * | 2019-05-22 | 2019-09-27 | Jiangsu University | An online multi-object tracking method based on deep learning and data association |
-
2019
- 2019-09-29 CN CN201910934151.7A patent/CN112581496A/en active Pending
Non-Patent Citations (2)
Title |
---|
NICOLAI WOJKE,ALEX BEWLEY,DIETRICH PAULUS: "Simple online and realtime tracking with a deep association metric", 《2017 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP)》 * |
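Du Tingting: "Research on object detection and tracking technology based on the MDP model", 《China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology》 * |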
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107545582B (en) | Video multi-target tracking method and device based on fuzzy logic | |
CN106169188B (en) | A kind of method for tracing object based on the search of Monte Carlo tree | |
Bisagno et al. | Group lstm: Group trajectory prediction in crowded scenarios | |
Luber et al. | People tracking in rgb-d data with on-line boosted target models | |
Krebs et al. | A survey on leveraging deep neural networks for object tracking | |
CN105957105B (en) | The multi-object tracking method and system of Behavior-based control study | |
Rakai et al. | Data association in multiple object tracking: A survey of recent techniques | |
CN101799968B (en) | Detection method and device for oil well intrusion based on video image intelligent analysis | |
CN107146237B (en) | Target tracking method based on online state learning and estimation | |
CN104091348A (en) | Multi-target tracking method integrating obvious characteristics and block division templates | |
CN108447076B (en) | Multi-target tracking method based on deep reinforcement learning | |
CN106355604A (en) | Target image tracking method and system | |
CN114638855A (en) | Multi-target tracking method, equipment and medium | |
CN111739053A (en) | Online multi-pedestrian detection tracking method under complex scene | |
CN110193828A (en) | Method and device for identifying state of mobile robot | |
CN103971384A (en) | Node cooperation target tracking method of wireless video sensor | |
Li et al. | Multi-target tracking with trajectory prediction and re-identification | |
Mittal et al. | Pedestrian detection and tracking using deformable part models and Kalman filtering | |
CN112581496A (en) | Multi-target pedestrian trajectory tracking method based on reinforcement learning | |
Craye et al. | RL-IAC: An exploration policy for online saliency learning on an autonomous mobile robot | |
CN108153519A (en) | A kind of Intelligent target tracking universal design frame | |
Singh et al. | A greedy data association technique for multiple object tracking | |
CN116224319A (en) | Track re-association method for micro-motion and moving targets in millimeter wave radar | |
Zhou et al. | SA-SGAN: A Vehicle Trajectory Prediction Model Based on Generative Adversarial Networks | |
Baiget et al. | Finding prototypes to estimate trajectory development in outdoor scenarios |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20210330 |