CN111539979B - Human body front tracking method based on deep reinforcement learning

Human body front tracking method based on deep reinforcement learning

Info

Publication number
CN111539979B
CN111539979B (application CN202010341730.3A)
Authority
CN
China
Prior art keywords
tracker
human body
tracking
target
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010341730.3A
Other languages
Chinese (zh)
Other versions
CN111539979A (en)
Inventor
张雅帆
张堃博
孙哲南
胡清华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Zhongke Intelligent Identification Co ltd
Tianjin University
Original Assignee
Tianjin Zhongke Intelligent Identification Industry Technology Research Institute Co ltd
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Zhongke Intelligent Identification Industry Technology Research Institute Co ltd, Tianjin University filed Critical Tianjin Zhongke Intelligent Identification Industry Technology Research Institute Co ltd
Priority to CN202010341730.3A priority Critical patent/CN111539979B/en
Publication of CN111539979A publication Critical patent/CN111539979A/en
Application granted granted Critical
Publication of CN111539979B publication Critical patent/CN111539979B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a human body front tracking method based on deep reinforcement learning, which comprises the following steps: S1: building a plurality of Unreal Engine 4 (UE4) virtual environments for training and testing; S2: constructing a convolutional neural network and an Actor-Critic network; S3: feeding the tracker's observation view into the convolutional neural network and training the network model until it converges; S4: testing the tracking effect in a UE4 virtual test scene; S5: migrating the model that meets the requirements after testing to a real scene. Unlike traditional tracking work, which must implement human body detection and camera control as two separate functional modules, the invention integrates the two with an end-to-end active tracking method: no human body detection is needed, the video stream of the tracker's view is taken as input, and the most effective tracking action is output directly, saving the complex pipeline of traditional human body tracking.

Description

Human body front tracking method based on deep reinforcement learning
Technical Field
The invention relates to the technical field of computer vision and machine learning, in particular to the field of human body tracking. It can be used for a range of intelligent service robots, face or iris acquisition systems that require no user cooperation, and the like, and specifically concerns a human body front tracking method based on deep reinforcement learning.
Background
Human body tracking is the process of accurately detecting and tracking the position of a human body in a complex environment, using a continuous video sequence as input. In real-life settings such as shopping-mall monitoring and traffic control, the camera is generally static, i.e., the tracking background does not change over a given time period; this is called static human body tracking. In recent years, social development has placed new demands on human body tracking: when the camera is mounted on a mobile robot and its position changes, the background of the captured images changes as well; this is called dynamic human body tracking. The latter is the main difficulty to overcome in the current human body tracking field. Dynamic human body tracking is significant for scientific research and of practical value in many social fields.
As computer technology advances, intelligent robots are likely to take over more and more service-industry tasks that currently demand human labor. Take shopping-mall guidance as an example: a mobile intelligent robot captures a real-time video sequence, first obtains the customer's exact position through human body detection, and then performs human body tracking. For the comfort of human communication, the robot generally needs to move to the front of the person being served for face-to-face interaction, which improves service quality and the satisfaction of the person served. Other applications, such as nursing robots, educational service robots, and home service robots, are also being widely developed.
With the continuous growth of the computing power of hardware such as GPUs, deep learning-based methods have gradually shown clear advantages. Applying deep reinforcement learning to human body tracking can further improve its real-time performance and efficiency.
Disclosure of Invention
The invention aims to provide a human body front tracking method based on deep reinforcement learning which takes a video sequence as input and directly outputs the actions the tracker should take, without human body detection, so as to realize end-to-end active frontal human body tracking. The invention trains in low-cost, label-free virtual environments; the abundant data effectively suppresses the overfitting problem that easily arises when training convolutional neural networks, yielding better generalization and the ability to cope with applications in uncontrolled scenes.
In order to achieve the purpose of the invention, the invention provides a human body front tracking method based on deep reinforcement learning, which comprises the following specific steps:
s1, establishing a UE4 virtual training and testing environment with rich variation in illumination, background, and human body surface texture;
s2, constructing a convolutional neural network connected to an Actor-Critic network;
s3, inputting the tracker's view into the network constructed in step S2 as a video stream until the successful tracking time exceeds 300 seconds, at which point the model is considered converged;
the reinforcement learning algorithm realizes frontal tracking by automatically learning to maximize the final reward value; the reward-and-penalty function is set as follows:
r = A − ( |√(Δx² + Δy²) − d| / c + λ·|ω| + β·|θ| )
where r is the reward-and-penalty value the model assigns to each action the tracker executes, A is a manually set upper limit on that value, Δx and Δy are the offsets between the target and the tracker along the x and y axes, d is the ideal distance to be maintained between the target and the tracker, ω and θ are the angles by which the tracker and the target, respectively, must rotate to directly face each other, and c, λ and β are normalization parameters;
s4, testing the model trained by the method of step S3 in a UE4 virtual test environment, and outputting the successful tracking time;
and S5, to verify the model's performance in the real world, migrating the model that meets the requirements after testing in the virtual environment into a real scene, and evaluating the tracking effect from the output successful tracking time combined with human observation.
Since the data input to the neural network constructed in step S2 is a continuous video stream of the tracker's view, an LSTM structure is attached after the multilayer convolutional neural network, and the subsequent Actor-Critic network module directly outputs the action the tracker should take to stay in front of the target human body.
The tracker keeps tracking the front of the human body target by simultaneously reducing the difference between the actual and expected target-tracker distance and the two rotation angles.
Compared with the prior art, the method helps improve the efficiency, convenience and comfort of daily life; its beneficial effects are embodied in the following aspects:
1. the invention applies deep reinforcement learning to human body frontal tracking for the first time and can automatically learn, without manual involvement, the actions the tracker needs to take to maintain frontal tracking.
2. The invention uses an end-to-end tracking method, dispensing with the complex pipeline of traditional human body tracking: target detection and camera control need not be handled as two separate modules, and the action the tracker should take is obtained directly from the input video sequence.
3. Unlike the prior art, which needs large amounts of labeled data to train a convolutional neural network, the method trains the model in a UE4 virtual environment, and the abundant data effectively suppresses the overfitting problem that easily arises during such training.
The invention can track a human target from the front and can be applied to fields such as mobile intelligent robots and biometric acquisition and recognition; face-to-face tracking effectively improves the comfort of communication and the satisfaction of those served.
Drawings
FIG. 1 is a flow chart of an end-to-end active human body front tracking method based on deep reinforcement learning according to the present invention;
FIG. 2 is an example of a virtual training environment used in the present invention; during actual training the illumination and the textures of characters and background vary. The lower-right window shows the tracker's view; the tracker here faces the target frontally, a state counted as successful tracking;
FIG. 3 illustrates the calculation of the angles by which the tracker and the target, respectively, need to rotate to face each other;
fig. 4 is a flow chart for front tracking of human targets in the real world using the proposed method of the invention.
Detailed Description
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is described in further detail below with reference to the figures and specific examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
In practical applications, human body tracking faces many challenges, especially in mobile human body tracking, where the image background can change in real time. A mobile robot must coordinate an image processing module and a movement control module: the former has to handle changes in illumination and distance, while the latter must ensure that movement commands are issued and executed in real time.
Traditional mobile human body tracking methods generally handle the image processing module and the camera control module separately, whereas the deep reinforcement learning method designed by the invention considers them jointly, using end-to-end learning to omit the cumbersome intermediate steps. Meanwhile, the virtual training and testing environments are built with UE4 software, and the abundant data avoids the overfitting problem.
As shown in fig. 1, the human body front tracking method based on deep reinforcement learning provided by the invention comprises the following steps:
s1, establishing a UE4 virtual training and testing environment which comprises abundant illumination changes and changes of backgrounds and human body surface textures, so that overfitting can be effectively inhibited; as shown in fig. 2.
Step S1 specifically comprises: given the difficulty of acquiring and labeling data in real scenes, a virtual environment resembling the real scene, built with UE4 software, is used to train and test the model. To effectively suppress overfitting during training, the invention adds rich illumination changes in the virtual environment and randomly swaps different human targets and various background textures; the virtual environment thereby also provides abundant low-cost training data. A per-episode randomization of this kind is sketched below.
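The patent gives no code for this step; the following minimal Python sketch only illustrates the kind of per-episode domain randomization described. The wrapper methods set_light_intensity, set_background_texture, and swap_human_model, and all the asset names, are hypothetical stand-ins for the corresponding UE4 scene controls, not a real SDK.

    import random

    class RandomizedUE4Env:
        """Hypothetical wrapper that re-randomizes the UE4 scene at every episode reset."""
        LIGHT_LEVELS = [0.3, 0.6, 1.0, 1.5]                      # illumination changes
        TEXTURES = ["brick", "grass", "wood", "marble"]          # background textures
        HUMANS = ["male_casual", "female_coat", "child_sport"]   # human body targets

        def __init__(self, env):
            self.env = env  # underlying UE4 environment interface (assumed)

        def reset(self):
            self.env.set_light_intensity(random.choice(self.LIGHT_LEVELS))
            self.env.set_background_texture(random.choice(self.TEXTURES))
            self.env.swap_human_model(random.choice(self.HUMANS))
            return self.env.reset()  # first tracker-view frame of the new episode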
S2, constructing a convolutional neural network and connecting an Actor-Critic network;
the step S2 specifically comprises the following steps: because the data input into the neural network is a continuous video stream, the neural network is connected with an LSTM structure suitable for learning serialized data after a plurality of layers of convolutional neural networks, and a subsequent Actor-Critic network module takes the characteristics extracted by the convolutional neural networks as input and directly outputs the action which a tracker should take to realize that the data are always maintained on the front of a target human body.
S3, inputting the tracker's view into the network constructed in step S2 as a video stream until the successful tracking time exceeds 300 seconds, at which point the model converges;
the step S3 specifically includes: the reinforcement learning algorithm realizes positive tracking by maximizing a final reward value, and the following formula is a reward and punishment function set by the invention:
r = A − ( |√(Δx² + Δy²) − d| / c + λ·|ω| + β·|θ| )
the method includes the steps that r is a reward and penalty value given by a model to each execution action of a tracker, A is an artificially set reward and penalty value upper limit, Δ x and Δ y represent offsets between a target and the tracker on an x axis and a y axis, d is an ideal distance expected to be maintained between the target and the tracker, ω and θ are angles at which the tracker and the target need to rotate opposite to each other, respectively, see fig. 3, arrows a and B in fig. 3 represent orientations of the tracker and the target, respectively, and c, λ and β are normalized parameters. The invention can obtain the real-time position coordinates and Euler angles of the target and the tracker in the virtual environment by calling the method function provided by the UE4 software and interacting with the virtual environment, and then respectively calling arctan (delta x, delta y) and arctan (delta y, delta x) to calculate omega and theta. The purpose that the tracker always tracks the front of the human body target is achieved by simultaneously reducing the difference value between the actual distance and the expected distance between the human body target and the tracker and two rotation angles. The criterion of whether the model converges is whether the human target and the tracker are in a face-to-face state and maintain a proper distance for 300 seconds.
S4, testing the model trained by the method of step S3 in a UE4 virtual test environment, and outputting the successful tracking time;
the step S4 specifically comprises the following steps: in the stage, the test environment which is completely different from the training environment is used for testing the generalization capability of the converged model, and by means of an API document disclosed by UE4 software, a method function is called to determine whether the human body target and the tracker are in a face-to-face state or not and maintain a proper ideal distance, and the total time for successful tracking is output.
And S5, to verify the model's performance in the real world, the model that meets the requirements after testing in the virtual environment is migrated to a real scene, and the tracking effect is evaluated from the output successful tracking time combined with human observation.
Specifically, in a real scene, a camera is mounted on a four-wheel-drive autonomous mobile platform, and the video stream it captures serves as the input of the trained network model. The flow chart of this stage is shown in figure 4.
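A minimal OpenCV inference loop for this stage is sketched below, reusing the TrackerNet sketch from step S2. The Drive stub, the action-set size, the checkpoint filename, and the preprocessing are hypothetical; only the camera-stream-in, action-out structure comes from the text.

    import cv2
    import torch

    class Drive:                       # hypothetical stand-in for the platform's motion API
        def execute(self, action):
            print("action:", action)   # replace with the platform's real motor commands

    model = TrackerNet(n_actions=7)                  # action-set size is an assumption
    model.load_state_dict(torch.load("tracker.pt"))  # weights trained in the UE4 env
    model.eval()

    drive = Drive()
    cap = cv2.VideoCapture(0)  # camera mounted on the four-wheel-drive mobile platform
    state = None
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(cv2.resize(frame, (84, 84)), cv2.COLOR_BGR2RGB)
        x = torch.from_numpy(rgb).permute(2, 0, 1).float().div(255)
        with torch.no_grad():
            logits, _, state = model(x[None, None], state)  # LSTM state carries the history
        drive.execute(logits.argmax(-1).item())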
Examples of applications of the invention are listed below:
application example 1: the human body front tracking method based on deep reinforcement learning is applied to a mobile intelligent service robot.
The invention can be applied to mobile intelligent service robots, such as home service robots, teaching service robots, and companion robots. With this technology, the intelligent robot follows the target face to face at all times, ready to provide service or respond to other instructions; this reduces the human labor required, improves service quality, and increases the engagement of the service and the satisfaction of those served. Unlike traditional mobile human body tracking work, which splits image processing and camera control into two modules, the end-to-end active human body frontal tracking method based on deep reinforcement learning considers the two jointly, fully exploiting the strengths of deep learning in image processing and the excellent performance of reinforcement learning on complex, multi-faceted, serialized data, and solves the human body frontal tracking problem end to end.
Application example 2: the human body front tracking method based on deep reinforcement learning is applied to face or iris acquisition and recognition equipment that requires no user cooperation.
The invention can be applied to biometric acquisition equipment that requires no user cooperation. Face and iris recognition are biometric technologies that identify a person from their biological features: a camera acquires an image or video stream containing the face or iris, and identity recognition then proceeds automatically. Biometric identification requires high-quality biometric information: images under natural light without overexposure, with the biometric features unoccluded and at sufficient resolution. Traditional biometric acquisition equipment requires the person to cooperate by standing in a restricted area, whereas the tracking method used by the invention lets the mobile robot move to the front of the human body automatically and maintain a proper distance without any cooperation, which greatly facilitates biometric acquisition and identity recognition.
The technical means not described in detail in the present application are known techniques.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and such modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (1)

1. A human body front tracking method based on deep reinforcement learning is characterized by comprising the following specific steps:
s1, establishing a UE4 virtual training and testing environment with rich variation in illumination, background, and human body surface texture;
s2, constructing a convolutional neural network connected to an Actor-Critic network;
s3, inputting the tracker's view into the network constructed in S2 as a video stream until the successful tracking time exceeds 300 seconds, at which point the model is converged;
the reinforcement learning algorithm realizes frontal tracking by automatically learning to maximize the final reward value; the reward-and-penalty function is set as follows:
r = A − ( |√(Δx² + Δy²) − d| / c + λ·|ω| + β·|θ| )
where r is the reward-and-penalty value the model assigns to each action the tracker executes, A is a manually set upper limit on that value, Δx and Δy are the offsets between the target and the tracker along the x and y axes, d is the ideal distance to be maintained between the target and the tracker, ω and θ are the angles by which the tracker and the target, respectively, must rotate to face each other, and c, λ and β are normalization parameters; the method functions provided by the UE4 software are called, through interaction with the virtual environment, to obtain the real-time position coordinates and Euler angles of the target and the tracker, after which arctan(Δx, Δy) and arctan(Δy, Δx) are called to calculate ω and θ; simultaneously reducing the difference between the actual and expected target-tracker distance and the two rotation angles keeps the tracker tracking the front of the human target at all times;
s4, testing the model trained by the method of step S3 in a UE4 virtual test environment, and outputting the successful tracking time;
s5, to verify the model's performance in the real world, the model that meets the requirements after testing in the virtual environment is migrated to a real scene, and the tracking effect is evaluated from the output successful tracking time combined with human observation;
since the data input to the neural network constructed in step S2 is a continuous video stream of the tracker's view, an LSTM structure is attached after the multilayer convolutional neural network, and the subsequent Actor-Critic network module takes the features extracted by the convolutional network as input and directly outputs the action the tracker should take to stay in front of the target human body at all times.
CN202010341730.3A 2020-04-27 2020-04-27 Human body front tracking method based on deep reinforcement learning Active CN111539979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010341730.3A CN111539979B (en) 2020-04-27 2020-04-27 Human body front tracking method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010341730.3A CN111539979B (en) 2020-04-27 2020-04-27 Human body front tracking method based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN111539979A CN111539979A (en) 2020-08-14
CN111539979B true CN111539979B (en) 2022-12-27

Family

ID=71967571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010341730.3A Active CN111539979B (en) 2020-04-27 2020-04-27 Human body front tracking method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN111539979B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107813310A (en) * 2017-11-22 2018-03-20 浙江优迈德智能装备有限公司 One kind is based on the more gesture robot control methods of binocular vision
CN108305275A (en) * 2017-08-25 2018-07-20 深圳市腾讯计算机系统有限公司 Active tracking method, apparatus and system
CN110084307A (en) * 2019-04-30 2019-08-02 东北大学 A kind of mobile robot visual follower method based on deeply study
CN110503661A (en) * 2018-05-16 2019-11-26 武汉智云星达信息技术有限公司 A kind of target image method for tracing based on deeply study and space-time context

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10695911B2 (en) * 2018-01-12 2020-06-30 Futurewei Technologies, Inc. Robot navigation and object tracking

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108305275A (en) * 2017-08-25 2018-07-20 深圳市腾讯计算机系统有限公司 Active tracking method, apparatus and system
WO2019037498A1 (en) * 2017-08-25 2019-02-28 腾讯科技(深圳)有限公司 Active tracking method, device and system
CN107813310A (en) * 2017-11-22 2018-03-20 浙江优迈德智能装备有限公司 One kind is based on the more gesture robot control methods of binocular vision
CN110503661A (en) * 2018-05-16 2019-11-26 武汉智云星达信息技术有限公司 A kind of target image method for tracing based on deeply study and space-time context
CN110084307A (en) * 2019-04-30 2019-08-02 东北大学 A kind of mobile robot visual follower method based on deeply study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
W. Luo et al., "End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning", arXiv:1808.03405v2 [cs.CV], 2019-02-12, pp. 1-16 *

Also Published As

Publication number Publication date
CN111539979A (en) 2020-08-14

Similar Documents

Publication Publication Date Title
CN108491880B (en) Object classification and pose estimation method based on neural network
Aloimonos Purposive and qualitative active vision
CN109800689A (en) A kind of method for tracking target based on space-time characteristic fusion study
CN110000785A (en) Agriculture scene is without calibration robot motion's vision collaboration method of servo-controlling and equipment
CN110084307A (en) A kind of mobile robot visual follower method based on deeply study
CN108960067A (en) Real-time train driver motion recognition system and method based on deep learning
CN108229587A (en) A kind of autonomous scan method of transmission tower based on aircraft floating state
CN1648840A (en) Head carried stereo vision hand gesture identifying device
CN110045740A (en) A kind of Mobile Robot Real-time Motion planing method based on human behavior simulation
CN106895824A (en) Unmanned plane localization method based on computer vision
CN106127125A (en) Distributed DTW human body behavior intension recognizing method based on human body behavior characteristics
CN113516108B (en) Construction site dust suppression data matching processing method based on data identification
CN110097574A (en) A kind of real-time pose estimation method of known rigid body
CN112418171A (en) Zebra fish spatial attitude and heart position estimation method based on deep learning
CN111539979B (en) Human body front tracking method based on deep reinforcement learning
Kyrkou C 3 Net: end-to-end deep learning for efficient real-time visual active camera control
CN111523495B (en) End-to-end active human body tracking method in monitoring scene based on deep reinforcement learning
CN113723277A (en) Learning intention monitoring method and system integrating multi-mode visual information
CN113569849A (en) Car fills electric pile interface detection intelligent interaction system based on computer vision
CN117392568A (en) Method for unmanned aerial vehicle inspection of power transformation equipment in complex scene
CN114998573A (en) Grabbing pose detection method based on RGB-D feature depth fusion
CN114187663A (en) Method for controlling unmanned aerial vehicle by posture based on radar detection gray level graph and neural network
CN112598742A (en) Stage interaction system based on image and radar data
CN111798514A (en) Intelligent moving target tracking and monitoring method and system for marine ranching
Ji et al. Behavior inference based on joint node motion under the low quality and small-scale sample size

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee after: Tianjin University

Patentee after: Tianjin Zhongke intelligent identification Co.,Ltd.

Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee before: Tianjin University

Patentee before: TIANJIN ZHONGKE INTELLIGENT IDENTIFICATION INDUSTRY TECHNOLOGY RESEARCH INSTITUTE Co.,Ltd.