CN112581496A - Multi-target pedestrian trajectory tracking method based on reinforcement learning - Google Patents

Multi-target pedestrian trajectory tracking method based on reinforcement learning

Info

Publication number
CN112581496A
Authority
CN
China
Prior art keywords
target
tracking
reinforcement learning
track
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910934151.7A
Other languages
Chinese (zh)
Inventor
卿粼波
许盛宇
何小海
苏婕
吴晓红
牛通
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-29
Filing date
2019-09-29
Publication date
2021-03-30
Application filed by Sichuan University
Priority to CN201910934151.7A
Publication of CN112581496A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Abstract

The invention provides a multi-target pedestrian trajectory tracking method based on reinforcement learning, which tracks the trajectories of multiple pedestrians in videos of complex scenes by means of deep reinforcement learning. The method comprises the following steps: assigning to each tracked target a single-target tracker trained by deep reinforcement learning, to obtain the trajectory of each target; detecting the positions of the targets in the current frame with a high-precision target detector; and associating the detection results with the tracked trajectories using the Hungarian algorithm, according to the current appearance information and position information of the targets, thereby achieving continuous multi-target pedestrian trajectory tracking over the video sequence. The invention combines the advantages of deep learning and reinforcement learning to track target positions more accurately. In addition, data association uses a cost matrix that fuses appearance and position features, which effectively avoids problems such as occlusion and missed detection and improves multi-target trajectory tracking accuracy.

Description

Multi-target pedestrian trajectory tracking method based on reinforcement learning
Technical Field
The invention relates to a multi-target tracking problem in the field of machine learning, in particular to a multi-target pedestrian trajectory tracking method based on reinforcement learning.
Background
As artificial intelligence research has been elevated to the level of national strategy, the state has called for strengthening the research, development and application of new-generation artificial intelligence, expanding intelligent living by developing intelligent industries, and upgrading traditional industries through new technologies, new business forms and new models. In recent years, target tracking, a heavily researched topic in computer vision, has received much attention from scholars at home and abroad.
Target tracking is the process of continuously inferring the state of a target in a video sequence; the task is to locate the target in each frame of a video and then associate these locations across frames to form a pedestrian motion trajectory. Target tracking can be divided into single-target tracking and multi-target tracking. Compared with single-target tracking, the multi-target problem is more complex because each target must be tracked effectively and mutual interference among different targets must also be resolved. Although multi-target tracking is very challenging, it is in great demand in many scenarios, and multi-pedestrian tracking in particular has outstanding practical value and application prospects. It is widely applied in fields such as intelligent surveillance, autonomous driving, robot visual navigation and human-computer interaction.
Traditional multi-target tracking algorithms include multi-hypothesis tracking, multi-target tracking based on correlation filtering, approximate multi-target tracking based on local flow features, and the like; most of these methods cannot solve problems such as multi-target occlusion and trajectory drift in complex scenes. With the rapid development of deep learning in recent years, target detection accuracy has improved continuously, which has to some extent advanced detection-based multi-target tracking. However, such methods are limited by the accuracy of target position prediction and struggle to achieve very good results. With the rapid development of machine learning, deep reinforcement learning algorithms, which combine deep learning and reinforcement learning, have achieved many excellent results on decision-making problems and have become a new research direction for multi-target tracking.
Disclosure of Invention
The invention aims to provide a multi-target pedestrian trajectory tracking method based on reinforcement learning, which converts the tracking task into a Markov decision process to be solved, trains a neural network by combining deep learning and reinforcement learning, and predicts and tracks the position of a target.
For convenience of explanation, the following concepts are first introduced:
Markov Decision Process (MDP): a Markov decision process is a mathematical model of sequential decision-making, used to model the stochastic policies and returns achievable by an agent in an environment whose system state has the Markov property. An MDP is built on a pair of interacting objects, the agent and the environment, and its elements include states, actions, policies and rewards. In an MDP, the agent perceives the current system state and acts on the environment according to its policy, thereby changing the state of the environment and receiving a reward; the accumulation of rewards over time is called the return.
Reinforcement Learning (RL): also known as evaluative learning, is one of the paradigms and methodologies of machine learning. It is used to describe and solve the problem of an agent that, while interacting with the environment, learns a policy in order to maximize its return or achieve a specific goal.
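As an illustration only, and not as part of the disclosed method, the following Python sketch shows the MDP elements defined above on a toy one-dimensional tracking task. The state (an integer box position), the action set `ACTIONS`, the reward rule, the hand-written `greedy_policy` (standing in for a learned policy) and the discount factor `gamma` are all assumptions introduced here; the disclosure does not specify them.

```python
# Toy MDP sketch. Every name and number below is a hypothetical illustration,
# not the state, action or reward design of the disclosed tracker.
ACTIONS = {"left": -1, "right": 1, "stop": 0}


def step(state, action, target):
    """Environment transition: apply an action, return (next_state, reward)."""
    next_state = state + ACTIONS[action]
    closer = abs(next_state - target) < abs(state - target)
    on_target = next_state == target
    # Reward moving toward (or staying on) the target, punish anything else.
    reward = 1.0 if (closer or on_target) else -1.0
    return next_state, reward


def greedy_policy(state, target):
    """Hand-written policy used in place of a learned one (assumption)."""
    if state < target:
        return "right"
    if state > target:
        return "left"
    return "stop"


def run_episode(start, target, max_steps=20):
    """Let the agent interact with the environment; collect the rewards."""
    state, rewards = start, []
    for _ in range(max_steps):
        action = greedy_policy(state, target)
        state, reward = step(state, action, target)
        rewards.append(reward)
        if action == "stop":  # the episode ends once the agent stops
            break
    return rewards


def discounted_return(rewards, gamma=0.9):
    """Return G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


rewards = run_episode(start=0, target=3)
print(rewards, discounted_return(rewards, gamma=0.5))
```

In a reinforcement learning setting, the policy would be adjusted so that the expected value of `discounted_return` increases, rather than being fixed by hand as it is here.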
The invention specifically adopts the following technical scheme:
A multi-target pedestrian trajectory tracking method based on reinforcement learning, characterized by comprising:
a. converting the tracking task into a Markov decision process to be solved;
b. training a single-target tracking network by combining supervised learning and reinforcement learning;
c. fusing target appearance information and position information to perform data association on the multi-target tracking trajectories;
The method mainly comprises the following steps:
(1) training a single-target tracking network by combining deep learning and reinforcement learning;
(2) assigning a single-target tracker obtained in step (1) to each tracked target in the current video frame, and tracking the positions of the multiple targets simultaneously;
(3) detecting targets in the current video frame with a high-precision target detector, and extracting appearance features and position information;
(4) generating a cost matrix from the appearance similarity and position similarity between the set of tracked trajectories and the set of detection results, and performing multi-target data association using the Hungarian algorithm to obtain the tracking result of the current frame;
(5) taking the tracking result of the current frame as the input to the next round of tracking, and repeating steps (2)-(4) to track the multi-target pedestrian trajectories through the entire video sequence.
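Purely as an illustrative sketch, the per-frame loop of steps (2) to (5) could be organized as follows in Python. The callables `track_one`, `detect` and `associate` are hypothetical placeholders for the trained single-target tracker, the target detector and the data-association step; their signatures are assumptions made here. Keeping the predicted box for an unmatched track is likewise an assumption, and creating new tracks from unmatched detections is left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def track_sequence(
    frames: Sequence,
    initial_tracks: Dict[int, Box],
    track_one: Callable[[object, Box], Box],
    detect: Callable[[object], List[Box]],
    associate: Callable[[Dict[int, Box], List[Box]], Dict[int, int]],
) -> List[Dict[int, Box]]:
    """Run steps (2)-(5) over a video; return the per-frame track boxes."""
    tracks = dict(initial_tracks)
    history = []
    for frame in frames:
        # Step (2): one single-target tracker prediction per tracked target.
        predicted = {tid: track_one(frame, box) for tid, box in tracks.items()}
        # Step (3): run the detector on the current frame.
        detections = detect(frame)
        # Step (4): match tracks to detections; maps track id -> detection index.
        matches = associate(predicted, detections)
        # Step (5): matched tracks take the detected box; unmatched tracks keep
        # the tracker's prediction (assumed policy). The result feeds the next frame.
        tracks = {
            tid: detections[matches[tid]] if tid in matches else box
            for tid, box in predicted.items()
        }
        history.append(dict(tracks))
    return history
```

Passing the three components in as callables keeps the loop independent of any particular tracker, detector or association rule.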
The invention has the beneficial effects that:
(1) The incentive mechanism of reinforcement learning is fully exploited: the machine automatically learns the optimal decisions, which improves the target tracking performance.
(2) The video sequence is detected frame by frame during tracking, which effectively avoids problems such as occlusion and trajectory drift.
(3) The appearance features and position information of the targets are fused for data association, which prevents false detections, missed detections and track loss caused by target occlusion.
(4) Supervised learning and reinforcement learning are combined, which addresses the low accuracy of traditional methods and increases the research value.
Drawings
FIG. 1 is a diagram of a single target tracker network architecture based on reinforcement learning.
Detailed Description
The present invention is further described in detail below with reference to the drawing and examples. It should be noted that the following examples only illustrate the present invention and should not be construed as limiting its scope of protection; insubstantial modifications and adaptations made by those skilled in the art on the basis of the above disclosure still fall within the scope of protection of the present invention.
The multi-target pedestrian trajectory tracking method based on reinforcement learning specifically comprises the following steps:
(1) pre-training a single-target tracking network by supervised learning so that the network is able to select correct actions, and then further optimizing the network parameters with a reinforcement learning policy gradient algorithm so that the network can predict and track the target position;
(2) assigning a single-target tracker obtained in step (1) to each tracked target in the current video frame, and tracking the positions of the multiple targets to obtain the appearance features and position information of the targets in the current frame;
(3) detecting targets in the current video frame with a high-precision target detector, and extracting the appearance features and position information of the target inside each detected bounding box;
(4) generating a cost matrix from the appearance similarity and position similarity between the set of tracked trajectories and the set of detection results, performing multi-target data association using the Hungarian algorithm so that tracked trajectories and detections are matched one to one, and finally obtaining the tracking result of the current frame;
(5) taking the tracking result of the current frame as the input to the next round of tracking, and repeating steps (2)-(4) to track the multi-target pedestrian trajectories through the entire video sequence.
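As a hedged sketch of the association in step (4), the Python code below builds a fused cost matrix and solves the assignment with a compact, pure-Python implementation of the Hungarian algorithm. The disclosure does not give the cost formula, so the choice of cosine similarity for appearance, intersection over union (IoU) for position, the weight `w_app`, and the `box`/`feat` dictionary layout are all assumptions made here for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine_similarity(f, g):
    dot = sum(x * y for x, y in zip(f, g))
    norm = math.sqrt(sum(x * x for x in f)) * math.sqrt(sum(y * y for y in g))
    return dot / norm if norm > 0 else 0.0


def build_cost_matrix(tracks, detections, w_app=0.5):
    """cost = w_app * (1 - appearance sim) + (1 - w_app) * (1 - IoU); assumed form."""
    return [[w_app * (1.0 - cosine_similarity(t["feat"], d["feat"]))
             + (1.0 - w_app) * (1.0 - iou(t["box"], d["box"]))
             for d in detections] for t in tracks]


def hungarian(cost):
    """Minimum-cost assignment for an n x m matrix with n <= m (potentials method)."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)
    p, way = [0] * (m + 1), [0] * (m + 1)  # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # augment along the alternating path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def associate(tracks, detections, w_app=0.5):
    """Return sorted (track_index, detection_index) pairs."""
    if not tracks or not detections:
        return []
    cost = build_cost_matrix(tracks, detections, w_app)
    if len(cost) > len(cost[0]):  # more tracks than detections: solve the transpose
        pairs = hungarian([list(col) for col in zip(*cost)])
        return sorted((t, d) for d, t in pairs)
    return sorted(hungarian(cost))
```

A practical tracker would usually also reject matched pairs whose cost exceeds some threshold, so that an occluded target is not forced onto an unrelated detection; that gating step is omitted from this sketch.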

Claims (4)

1. A multi-target pedestrian trajectory tracking method based on reinforcement learning, characterized by comprising:
a. converting the tracking task into a Markov decision process to be solved;
b. training a single-target tracking network by combining supervised learning and reinforcement learning;
c. fusing target appearance information and position information to perform data association on the multi-target tracking trajectories;
the method mainly comprises the following steps:
(1) training a single-target tracking network by combining deep learning and reinforcement learning;
(2) assigning a single-target tracker obtained in step (1) to each tracked target in the current video frame, and tracking the positions of the multiple targets simultaneously;
(3) detecting targets in the current video frame with a high-precision target detector, and extracting appearance features and position information;
(4) generating a cost matrix from the appearance similarity and position similarity between the set of tracked trajectories and the set of detection results, and performing multi-target data association using the Hungarian algorithm to obtain the tracking result of the current frame;
(5) taking the tracking result of the current frame as the input to the next round of tracking, and repeating steps (2)-(4) to track the multi-target pedestrian trajectories through the entire video sequence.
2. The reinforcement learning-based multi-target pedestrian trajectory tracking method according to claim 1, characterized in that, in step (1), each single-target tracker acts as an independent agent for which a Markov decision process is constructed; the mapping from the state of the tracked target to the action to be taken is learned through a reward and punishment mechanism, and the tracking policy is optimized in combination with temporal information.
3. The reinforcement learning-based multi-target pedestrian trajectory tracking method according to claim 1, characterized in that, in step (1), the single-target tracking network is trained by combining supervised learning and reinforcement learning, and the network parameters are optimized over multiple rounds to improve the accuracy of position prediction and tracking.
4. The reinforcement learning-based multi-target pedestrian trajectory tracking method according to claim 1, characterized in that, in step (4), the appearance features and position information of the tracked trajectories and the detection results are extracted and fused to form a similarity cost matrix, which fully exploits the feature differences between different targets, reduces target confusion in complex scenes, and gives the tracking algorithm a certain robustness to problems such as target occlusion, false detection and missed detection.
CN201910934151.7A 2019-09-29 2019-09-29 Multi-target pedestrian trajectory tracking method based on reinforcement learning Pending CN112581496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910934151.7A CN112581496A (en) 2019-09-29 2019-09-29 Multi-target pedestrian trajectory tracking method based on reinforcement learning

Publications (1)

Publication Number Publication Date
CN112581496A (en) 2021-03-30

Family

ID=75111277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910934151.7A Pending CN112581496A (en) 2019-09-29 2019-09-29 Multi-target pedestrian trajectory tracking method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN112581496A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108447076A (en) * 2018-03-16 2018-08-24 清华大学 (Tsinghua University) Multi-object tracking method based on deep reinforcement learning
CN108921873A (en) * 2018-05-29 2018-11-30 福州大学 (Fuzhou University) Markov decision online multi-object tracking method based on kernelized correlation filter optimization
CN109919974A (en) * 2019-02-21 2019-06-21 上海理工大学 (University of Shanghai for Science and Technology) Online multi-object tracking method based on multi-candidate association in the R-FCN framework
US20190266420A1 (en) * 2018-02-27 2019-08-29 TuSimple System and method for online real-time multi-object tracking
CN110288627A (en) * 2019-05-22 2019-09-27 江苏大学 (Jiangsu University) Online multi-object tracking method based on deep learning and data association

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
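NICOLAI WOJKE, ALEX BEWLEY, DIETRICH PAULUS: "Simple online and realtime tracking with a deep association metric", 2017 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) *
杜婷婷 (DU Tingting): "Research on object detection and tracking technology based on the MDP model" (基于MDP模型的物体检测与跟踪技术研究), China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology series *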

Similar Documents

Publication Publication Date Title
CN107545582B (en) Video multi-target tracking method and device based on fuzzy logic
CN106169188B (en) Object tracking method based on Monte Carlo tree search
Bisagno et al. Group lstm: Group trajectory prediction in crowded scenarios
Luber et al. People tracking in rgb-d data with on-line boosted target models
Krebs et al. A survey on leveraging deep neural networks for object tracking
CN105957105B (en) Multi-object tracking method and system based on behavior learning
Rakai et al. Data association in multiple object tracking: A survey of recent techniques
CN101799968B (en) Detection method and device for oil well intrusion based on video image intelligent analysis
CN107146237B (en) Target tracking method based on online state learning and estimation
CN104091348A (en) Multi-target tracking method integrating obvious characteristics and block division templates
CN108447076B (en) Multi-target tracking method based on deep reinforcement learning
CN106355604A (en) Target image tracking method and system
CN114638855A (en) Multi-target tracking method, equipment and medium
CN111739053A (en) Online multi-pedestrian detection tracking method under complex scene
CN110193828A (en) Method and device for identifying state of mobile robot
CN103971384A (en) Node cooperation target tracking method of wireless video sensor
Li et al. Multi-target tracking with trajectory prediction and re-identification
Mittal et al. Pedestrian detection and tracking using deformable part models and Kalman filtering
CN112581496A (en) Multi-target pedestrian trajectory tracking method based on reinforcement learning
Craye et al. RL-IAC: An exploration policy for online saliency learning on an autonomous mobile robot
CN108153519A (en) General design framework for intelligent target tracking
Singh et al. A greedy data association technique for multiple object tracking
CN116224319A (en) Track re-association method for micro-motion and moving targets in millimeter wave radar
Zhou et al. SA-SGAN: A Vehicle Trajectory Prediction Model Based on Generative Adversarial Networks
Baiget et al. Finding prototypes to estimate trajectory development in outdoor scenarios

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210330