CN110084307A - Mobile robot visual following method based on deep reinforcement learning - Google Patents

Mobile robot visual following method based on deep reinforcement learning Download PDF

Info

Publication number
CN110084307A
CN110084307A CN201910361528.4A
Authority
CN
China
Prior art keywords
model
robot
cnn
training
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910361528.4A
Other languages
Chinese (zh)
Other versions
CN110084307B (en)
Inventor
张云洲
王帅
庞琳卓
刘及惟
王磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201910361528.4A priority Critical patent/CN110084307B/en
Publication of CN110084307A publication Critical patent/CN110084307A/en
Application granted granted Critical
Publication of CN110084307B publication Critical patent/CN110084307B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/12 Target-seeking control
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Manipulator (AREA)

Abstract

The invention proposes a mobile robot visual following method based on deep reinforcement learning. It adopts a framework of "supervised pre-training on simulated images + model transfer + RL". First, a small amount of data is collected in the real environment, and the data set is automatically expanded using computer programs and image processing techniques, so that a large simulated data set adapted to real scenes is obtained in a short time and used for supervised training of the direction control model of the following robot. Second, a CNN model for robot direction control is built and trained in a supervised manner on the automatically constructed simulated data set, serving as the pre-training model. The knowledge of the pre-trained model is then transferred into a DRL-based control model, and the robot performs the following task in the real environment; combined with the reinforcement learning mechanism, the robot improves its direction control performance while following the target and interacting with the environment. The method is not only highly robust but also substantially reduces cost.

Description

Mobile robot visual following method based on deep reinforcement learning
Technical field
The invention belongs to the field of intelligent robot technology and relates to a mobile robot visual following method based on deep reinforcement learning.
Background technique
With technological progress and social development, more and more intelligent robots appear in people's lives. The following robot is one of the novel systems that has received significant attention in recent years. It can be used in complex environments such as hospitals, shopping malls, or schools as an assistant to its owner, carrying out following movements, which brings great convenience to people's lives. A following robot should have autonomous perception, recognition, decision-making, and motion functions, be able to identify a specific target, and, combined with a corresponding control system, follow that target in complex scenes.
Current research on following robot systems is typically based on visual sensors or multi-sensor combinations. The former usually acquires visual images with a stereo camera, which requires cumbersome calibration steps and has difficulty adapting to strong outdoor illumination; the latter, due to the addition of extra sensors, increases system cost and introduces a complicated data fusion process. To guarantee robust tracking in dynamic unknown environments, complex hand-designed features are usually required, which considerably increases labor cost, time cost, and computing resources. In addition, traditional following robot systems usually split the whole system into two parts, a target tracking module and a robot motion control module; in such a pipeline design, errors occurring in an earlier module are passed on to subsequent modules, so errors accumulate and are gradually amplified, ultimately having a larger impact on system performance.
In conclusion current tradition follows robot system to have the shortcomings that hardware cost and design cost are excessively high, and nothing Method adapts to the variability and complexity of indoor and outdoor surroundings completely under hardware simplicity support, is easy to happen robot with losing target person The case where, the robustness of system for tracking is reduced, therefore seriously affected application of the trailing type robot in real life.
Summary of the invention
To address the deficiencies of current traditional following robot designs, the present invention provides a mobile robot visual following method based on deep reinforcement learning.
The present invention uses a monocular color camera as the robot's only input sensor and introduces convolutional neural networks (Convolutional Neural Network, CNN) and deep reinforcement learning (Deep Reinforcement Learning, DRL) into the following robot system, eliminating the complicated hand-designed features of traditional following robot systems. The robot can learn control strategies directly from field-of-view images, which greatly reduces the possibility of losing the target and allows the robot to better adapt to illumination changes, background object interference, target disappearance, and pedestrian interference in complex environments. Meanwhile, the introduction of deep reinforcement learning enables the following robot to continuously learn from experience while interacting with the environment and to keep improving its level of intelligence.
The present invention adopts a framework of "supervised pre-training on simulated images + model transfer + RL". First, a small amount of data is collected in the real environment, and the data set is automatically expanded using computer programs and image processing techniques, so that a large simulated data set adapted to real scenes is obtained in a short time and used for supervised training of the direction control model of the following robot. Second, a CNN model for robot direction control is built and trained in a supervised manner on the automatically constructed simulated data set, serving as the pre-training model. Then the knowledge of the pre-trained model is transferred into a DRL-based control model, and the robot performs the following task in the real environment; combined with the reinforcement learning (Reinforcement Learning, RL) mechanism, the robot improves its direction control performance while following the target and interacting with the environment.
The specific technical solution is as follows:
A mobile robot visual following method based on deep reinforcement learning comprises the following steps:
Step 1: automatic construction of the data set;
To reduce the cost of data collection and quickly obtain large-scale training data, the present invention designs an automated data set construction method using computer programs and image processing techniques. A small amount of data is first collected in a simple experimental scene, and the obtained experimental data are then expanded on a large scale using the image mask technique, so that a large amount of data adapted to complex indoor and outdoor scenes can be obtained in a short time, substantially reducing the cost of manually collecting and labeling data.
(1) Prepare a simple scene in which the followed target is easy to distinguish from the background. In the simple scene, acquire field-of-view images of the target person at different positions in the robot's field of view from the following robot's viewpoint;
(2) Prepare complex scene images covering the application scenarios of the following robot, such as indoor and outdoor scenes and street views. Because the followed target person can be separated from the background easily in the simple scene, the image mask technique can be used to extract the target person from the background of the simple scene and superimpose it on a complex scene, yielding an image of the target person in the complex scene; the synthesized complex scene image is directly assigned the action space label of the corresponding simple scene image;
The image mask technique mainly consists of designing a two-dimensional matrix (i.e., a mask) for the region of interest of an image and multiplying it element-wise with the image to be processed; the result is exactly the region of interest to be extracted.
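As a concrete illustration of this extraction-and-compositing step, the following Python sketch uses OpenCV and NumPy to cut the target person out of a simple-scene frame with a binary mask and paste the result onto a complex background. The file names and the HSV threshold values are placeholders chosen for the example, not values specified in the patent.

```python
import cv2
import numpy as np

# Load a simple-scene frame and a complex background (file names are placeholders).
simple_frame = cv2.imread("simple_scene.png")        # target person in front of a plain background
complex_bg = cv2.imread("complex_scene.png")
complex_bg = cv2.resize(complex_bg, (simple_frame.shape[1], simple_frame.shape[0]))

# Build a binary mask by thresholding in HSV; the bounds below are illustrative only.
hsv = cv2.cvtColor(simple_frame, cv2.COLOR_BGR2HSV)
lower, upper = np.array([0, 80, 80]), np.array([30, 255, 255])
mask = cv2.inRange(hsv, lower, upper)                # 255 on the target, 0 on the background
mask3 = cv2.merge([mask, mask, mask]) // 255         # 3-channel 0/1 mask

# Element-wise multiplication keeps only the region of interest, then composite onto the background.
target_only = simple_frame * mask3
composite = complex_bg * (1 - mask3) + target_only
cv2.imwrite("synthetic_sample.png", composite)
```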
Step 2: building and training of the CNN-based direction control model;
The CNN-based direction control model is responsible for predicting, from the robot's field-of-view image, the direction action to be taken next. Supervised training of the model on the large-scale simulated data set constructed automatically in Step 1 gives the model a relatively high level of direction control. The knowledge learned by this model is then moved, by means of model transfer, into the DRL-based direction control model as the latter's prior knowledge about the direction control strategy.
For each image acquired by the robot's monocular color camera, the three RGB channels are first converted to HSV channels before the image is fed to the CNN as input. The CNN model is then trained in a supervised manner on the data set constructed automatically in Step 1, so that the CNN can output the corresponding action state from the robot's field-of-view input image;
Step 3: model transfer;
The present invention uses a model transfer approach in which the strategy learned by the CNN direction control model serves as the prior knowledge of the DRL-based direction control model. Although the outputs of the CNN model and the DRL model have different meanings (the CNN model outputs the probability of each direction action, whereas the DRL model usually outputs value estimates of the direction actions), they have the same output dimension; in general, a direction action with a larger output probability in the CNN model also has a higher corresponding value estimate.
The CNN parameter weights trained in Step 2 are transferred to the DRL model as its initial parameters, so that the DRL model starts from the same control level as the CNN model;
Step 4: building and training of the DRL-based direction control model;
The DRL model is responsible for further improving performance through the RL mechanism, on the basis of the prior knowledge obtained from the CNN model. The introduction of the RL mechanism allows the robot to collect experience while interacting with the environment and to improve its own knowledge at the same time, so that the robot reaches a higher level of following direction control than the CNN model.
The DRL model initialized with the parameters transferred in Step 3 is deployed on the robot side; by continuously interacting with the environment, the robot can constantly update the model and learn and adapt to the environment it is currently in.
Further, in the above Step 2: the images acquired by the robot's monocular color camera have a size of 640 × 480. Before being input to the neural network, the three RGB channels are converted to HSV channels and each 640 × 480 image is resized to 60 × 80. The images acquired at 4 adjacent time steps are merged as the network input, so the final input layer contains 4 × 3 = 12 channels, each of size 60 × 80.
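A minimal sketch of this preprocessing step in Python (OpenCV and NumPy); the BGR channel ordering and the deque-based frame buffer are implementation assumptions rather than details fixed by the patent:

```python
from collections import deque

import cv2
import numpy as np

FRAME_HISTORY = 4  # images from 4 adjacent time steps are stacked into one network input

def preprocess(frame_bgr: np.ndarray) -> np.ndarray:
    """Convert a 640x480 BGR camera frame into a 60x80x3 HSV image scaled to [0, 1]."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    small = cv2.resize(hsv, (80, 60))                # (width, height) -> 60 rows x 80 columns
    return small.astype(np.float32) / 255.0

frames = deque(maxlen=FRAME_HISTORY)

def build_network_input(frame_bgr: np.ndarray) -> np.ndarray:
    """Stack the 4 most recent preprocessed frames into a 12-channel 60x80 input."""
    processed = preprocess(frame_bgr)
    frames.append(processed)
    while len(frames) < FRAME_HISTORY:               # at start-up, pad by repeating the first frame
        frames.appendleft(processed)
    return np.concatenate(list(frames), axis=-1)     # shape (60, 80, 12)
```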
Further, in the above Step 2: the CNN structure consists of 8 layers, including 3 convolutional layers, 2 pooling layers, 2 fully connected layers, and an output layer. The convolutional layers extract features from the input image; the pooling layers reduce the dimensionality of the extracted features to lower the computation required for forward propagation. From front to back, the convolution kernel sizes of the three convolutional layers are 8 × 8, 4 × 4, and 2 × 2, respectively. Both pooling layers use max pooling with a size of 2 × 2. After the three convolutions, the features are fed to two fully connected layers of 384 nodes each, followed by the output layer, which produces a multi-dimensional output in which each dimension represents an action in the corresponding direction; there are three direction actions in total: forward, left, and right. A ReLU activation function is applied after each of the three convolutional layers and the two fully connected layers to introduce non-linearity. The CNN parameters are updated with the cross-entropy loss function, expressed as:
L(x, y′) = −Σᵢ y′ᵢ · log f(x)ᵢ
where y′ is the label data of the sample, a three-dimensional one-hot vector whose dimension equal to 1 indicates the correct action, and f(x) denotes the CNN model's predicted probability for each action dimension.
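To make the architecture concrete, the following PyTorch sketch builds such a network. The patent fixes the kernel sizes (8 × 8, 4 × 4, 2 × 2), the 2 × 2 max pooling, the two 384-node fully connected layers, the ReLU activations, and the 3-way output; the strides, padding, and the resulting flattened feature size are assumptions needed to make the example self-consistent.

```python
import torch
import torch.nn as nn

class DirectionCNN(nn.Module):
    """3 convolutional layers (8x8, 4x4, 2x2 kernels), 2 max-pooling layers (2x2),
    two 384-node fully connected layers, and a 3-way output (forward / left / right)."""

    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(12, 32, kernel_size=8, stride=2, padding=3), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 64, kernel_size=2), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 2 * 4, 384), nn.ReLU(),   # 64 x 2 x 4 follows from a 12 x 60 x 80 input
            nn.Linear(384, 384), nn.ReLU(),
            nn.Linear(384, 3),                       # logits for forward / left / right
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Supervised pre-training with the cross-entropy loss on the synthetic data set.
model = DirectionCNN()
criterion = nn.CrossEntropyLoss()                    # applies the softmax internally
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

dummy_batch = torch.rand(8, 12, 60, 80)              # 4 stacked HSV frames = 12 channels
dummy_labels = torch.randint(0, 3, (8,))
optimizer.zero_grad()
loss = criterion(model(dummy_batch), dummy_labels)
loss.backward()
optimizer.step()
```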
Further, the DRL model in the above Step 3 is specifically a DQN model. The transfer process is as follows: remove the Softmax layer of the trained CNN network and directly assign the weight parameters of the preceding layers to the DQN model.
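Continuing the sketch above, this transfer can be expressed in PyTorch roughly as follows; here the DQN value network is assumed to reuse the DirectionCNN layout, with its three outputs reinterpreted as Q-value estimates, and loading of the pre-trained checkpoint is omitted because the file name would be implementation-specific.

```python
import copy

# Pre-trained supervised direction-control model from Step 2 (DirectionCNN as sketched above);
# loading its checkpoint is omitted here and would be implementation-specific.
pretrained = DirectionCNN()

# The DQN value network reuses the same layer layout; its three outputs are now read as
# Q-value estimates for forward / left / right rather than as softmax class probabilities
# (the Softmax is simply never applied, which corresponds to "removing the Softmax layer").
q_network = DirectionCNN()
q_network.load_state_dict(copy.deepcopy(pretrained.state_dict()))
```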
Further, in the above Step 4: DQN approximates the value function with a neural network, i.e., the input of the neural network is the current state value s and the output is the predicted value Q_θ(s, a). At each time step, the environment provides a state value s; the agent obtains the values Q_θ(s, a) of all actions for this s from the value function network, then selects an action using the ε-greedy algorithm and makes a decision; after receiving this action a, the environment returns a reward value r and the next state s′. This constitutes one step. The parameters of the value function network are updated according to r. DQN uses the mean squared error objective function:
L(θ) = E[(r + γ·max_{a′} Q_θ(s′, a′) − Q_θ(s, a))²]
where s′ and a′ are the state and action at the next moment, γ is a hyperparameter (the discount factor), and θ denotes the model parameters;
During training, the parameters are updated as:
θ ← θ + α·(r + γ·max_{a′} Q_θ(s′, a′) − Q_θ(s, a))·∇_θ Q_θ(s, a)
where α is the learning rate.
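A compact sketch of one such update step in PyTorch, continuing from the transfer sketch above. The replay buffer, batch size, discount factor, and periodically synchronized target network are implementation choices commonly paired with DQN, not details spelled out in the patent text.

```python
import copy
import random
from collections import deque

import torch
import torch.nn.functional as F

GAMMA, EPSILON, BATCH_SIZE = 0.99, 0.1, 32
replay_buffer = deque(maxlen=10_000)       # holds (state, action, reward, next_state, done) tensors
target_network = copy.deepcopy(q_network)  # periodically re-synchronized with q_network
optimizer = torch.optim.Adam(q_network.parameters(), lr=1e-4)

def select_action(state: torch.Tensor) -> int:
    """Epsilon-greedy choice among the three direction actions (0: forward, 1: left, 2: right)."""
    if random.random() < EPSILON:
        return random.randrange(3)
    with torch.no_grad():
        return int(q_network(state.unsqueeze(0)).argmax(dim=1))

def dqn_update() -> None:
    """One gradient step on the mean squared TD error over a sampled mini-batch."""
    if len(replay_buffer) < BATCH_SIZE:
        return
    batch = random.sample(replay_buffer, BATCH_SIZE)
    states, actions, rewards, next_states, dones = (torch.stack(x) for x in zip(*batch))
    q_sa = q_network(states).gather(1, actions.long().view(-1, 1)).squeeze(1)
    with torch.no_grad():
        td_target = rewards + GAMMA * target_network(next_states).max(dim=1).values * (1 - dones)
    loss = F.mse_loss(q_sa, td_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```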
When the final deep reinforcement learning algorithm is applied on the physical robot, the real-time field-of-view images collected by the robot's monocular color camera serve as the state input of the DRL algorithm, and the action space output by the algorithm is the set of direction control signals; by executing the direction control commands, the robot can follow the target person's movement in real time.
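On the robot, this amounts to a control loop of roughly the following shape (a sketch only: camera.read() and robot.drive() stand in for whatever camera-capture and motor-command interfaces the platform provides and are not APIs named in the patent; build_network_input and q_network come from the earlier sketches).

```python
import torch

ACTIONS = ("forward", "left", "right")

def follow_loop(camera, robot, q_network) -> None:
    """Map each camera frame to a direction command using the trained value network."""
    q_network.eval()
    while True:
        ok, frame = camera.read()                 # hypothetical camera-capture interface
        if not ok:
            break
        state = torch.from_numpy(build_network_input(frame)).permute(2, 0, 1)  # (12, 60, 80)
        with torch.no_grad():
            action = int(q_network(state.unsqueeze(0)).argmax(dim=1))
        robot.drive(ACTIONS[action])              # hypothetical motion-command interface
```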
Aiming at the problems that following robot systems face in practical applications, the method of the invention proposes an intelligent following robot system based on a deep reinforcement learning algorithm. The end-to-end design merges the tracking module and the direction control module of a traditional following robot system, preventing error propagation and accumulation between modules, so that the robot directly learns the mapping from targets to behavior strategies. Compared with traditional following robot systems, the system not only has higher robustness but also substantially reduces hardware and labor costs, which can add momentum to the popularization of following robots in real life.
Detailed description of the invention
Fig. 1 is the flow chart of the invention.
Fig. 2 is a schematic diagram of the automatic data set construction process of the invention.
Fig. 3 shows example images synthesized by the automatic data set construction process of the invention.
The subfigures are described as follows:
(a) (b) (c) (d) are example pictures collected by the robot in the simple scene, with the target person at different positions;
(e) (f) (g) (h) are examples of complex scene pictures collected from the internet;
(i) (j) (k) (l) are example images of the synthetic data set after image mask processing;
(a)(e)(i), (b)(f)(j), (c)(g)(k) and (d)(h)(l) respectively show the complete image mask synthesis process and its effect for the target person at different positions in the simple image.
Fig. 4 shows the correspondence between input images and the action space in the invention.
Fig. 5 is the architecture diagram of the following robot system of the invention.
Specific embodiment
The software environment of this embodiment is Ubuntu 14.04, the mobile robot is a TurtleBot2, and the robot's input sensor is a monocular color camera with a resolution of 640 × 480.
Step 1: automatic construction process of the data set
For the supervised direction control model of the following robot in the present invention, the input is the camera field-of-view image of the following robot and the output is the action the robot should take at the current moment. The construction of the whole data set therefore consists of two parts: acquisition of the input field-of-view images and labeling of the output actions.
Prepare a simple scene in which the followed target is easy to distinguish from the background. In this simple scene, acquire field-of-view images of the target person at multiple different positions in the robot's field of view from the following robot's viewpoint. Download a certain number of more complex scene images from the internet, mainly covering common application scenarios of following robots such as indoor and outdoor scenes and street views. Because the followed target person can be separated from the background easily in the simple scene, the image mask technique can be used to extract the target person from the background and superimpose the extracted region on the complex scenes obtained from the internet, yielding images of the target person in complex scenes; the synthesized complex scene images can directly be assigned the action space labels of the corresponding simple scene images. A schematic diagram of the automatic data set construction process is shown in Fig. 2. The simple scene images, the internet complex scene images, and the results of the automatic data set construction process are shown in Fig. 3.
After the images containing the target person at different positions have been collected in the simple scene, the image mask can be designed directly by setting a color threshold, because the color of the tracked target differs considerably from the background color of the simple scene. Applying this mask to the robot's field-of-view image yields a binary image separating the tracked target from the background, from which the contour of the target person can be extracted; at this point the image values of the background are all 0 and the image values of the tracked target person are 1. The target person's image region can then be superimposed on a complex scene picture. The action label is obtained by computing the mean horizontal position of the pixels with value 1 (the tracked target person) in the binary image produced by the mask.
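A sketch of this labeling rule in Python; the horizontal thresholds that separate left / forward / right are assumptions for illustration, since the patent only states that the label is derived from the mean horizontal position of the target pixels in the binary mask.

```python
import numpy as np

def action_label_from_mask(mask: np.ndarray) -> int:
    """Derive the action label (0: forward, 1: left, 2: right) from a 0/1 target mask.

    The label is based on the mean horizontal position of the target pixels,
    normalized by the image width."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("no target pixels found in the mask")
    mean_x = xs.mean() / mask.shape[1]               # 0.0 = far left edge, 1.0 = far right edge
    if mean_x < 0.4:                                 # target has drifted left  -> turn left
        return 1
    if mean_x > 0.6:                                 # target has drifted right -> turn right
        return 2
    return 0                                         # roughly centered         -> move forward
```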
Step 2: building and training process of the CNN-based direction control model
The images acquired by the monocular color camera have a size of 640 × 480. Before being input to the neural network, the three RGB channels are converted to HSV channels and each 640 × 480 image is resized to 60 × 80. In the present invention, the images acquired at 4 adjacent time steps are merged as the network input; since a single image is a three-channel HSV image, the final input layer contains 4 × 3 = 12 channels, each of size 60 × 80. The CNN model is then trained in a supervised manner on the automatically constructed data set, so that the CNN network can output the corresponding action state from the robot's field-of-view input image.
Step 3: model transfer process
The DQN model finally used in the present invention has a structure similar to the CNN direction control network described above, but the last Softmax layer is removed, so the network directly outputs a value prediction for each state-action pair rather than a probability distribution over actions. The model transfer strategy used in the present invention is therefore: remove the Softmax layer of the trained CNN network and directly assign the weight parameters of the preceding layers to the DQN model, thereby achieving the transfer of prior knowledge.
Step 4: training process of the DRL-based direction control model
After the model transfer is completed, the DRL model can be deployed on the robot side; by continuously interacting with the environment, the robot can constantly update the model, learn the environment it is currently in, and improve the robustness of following. During this process, the algorithm outputs actions from a discrete action space to control the robot; in the present invention the action space of the following robot is the set containing the commands left, right, and forward, and the correspondence between the action space and the input image is shown in Fig. 4.
Data in RL carry no independent labels; only the reward signal fed back by the environment implicitly indicates how good or bad an action was, so the design of the reward function is one of the most important links in a successful RL application. The reward function for the following robot's direction control in the present invention is designed as follows: the user connects remotely to the local side of the following robot and observes the robot's field-of-view image. The flag STOP is initially 0, indicating that following has not terminated. When the user finds that his or her position in the robot's field-of-view image has deviated from the center, the user sends a stop message through a handheld device; when the robot side receives this message, it knows that following has failed, sets STOP to 1, and commands the robot to stop moving. This design is convenient for the user on the one hand and yields a more accurate reward signal on the other, which accelerates the convergence of the model. At this point, the reward function can be expressed in terms of the STOP flag: the robot receives the constant C when STOP = 1,
where C is negative.
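A minimal sketch of such a reward signal; the value C = -1.0 and the choice of 0 as the per-step reward while following continues are assumptions, since the patent only specifies that a negative constant C is given when the user-triggered STOP flag marks a failed follow.

```python
C = -1.0  # negative penalty for losing the target; the exact value is an assumption

def reward(stop_flag: int) -> float:
    """Reward signal driven by the user-triggered STOP flag.

    stop_flag == 1 means the user reported that the target has drifted out of center,
    so the episode ends with the negative penalty C; otherwise a neutral reward is given."""
    return C if stop_flag == 1 else 0.0
```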
Validation experiments on the TurtleBot2 robot show that this method can accurately follow a specific target person with high robustness.

Claims (5)

1. A mobile robot visual following method based on deep reinforcement learning, characterized by comprising the following steps:
Step 1: automatic construction of the data set;
(1) preparing a simple scene in which the followed target is easy to distinguish from the background; in the simple scene, acquiring field-of-view images of the target person at different positions in the robot's field of view from the following robot's viewpoint;
(2) preparing complex scene images of the application scenarios of the following robot, extracting the target person from the background of the simple scene using the image mask technique, and superimposing the target person on the complex scenes to obtain images of the target person in the complex scenes, the synthesized complex scene images being directly assigned the action space labels of the corresponding simple scenes;
Step 2: building and training of the CNN-based direction control model;
training the CNN model in a supervised manner on the data set constructed automatically in Step 1, so that the CNN can output the corresponding action state from the robot's field-of-view input image; for each image acquired by the robot's monocular color camera, converting its three RGB channels to HSV channels before feeding it to the CNN as the input image, after which the network can output the corresponding action state;
Step 3: model transfer;
transferring the CNN parameter weights trained in Step 2 to the DRL model as its initial parameters, so that the DRL model obtains the same control level as the CNN model;
Step 4: building and training of the DRL-based direction control model;
deploying the DRL model initialized with the parameters transferred in Step 3 on the robot side and, by continuously interacting with the environment, enabling the robot to constantly update the model and learn the environment it is currently in.
2. The mobile robot visual following method based on deep reinforcement learning according to claim 1, characterized in that, in Step 2: the images acquired by the robot's monocular color camera have a size of 640 × 480; before being input to the neural network, the three RGB channels are converted to HSV channels and each 640 × 480 image is resized to 60 × 80; the images acquired at 4 adjacent time steps are merged as the network input, so that the final input layer contains 4 × 3 = 12 channels, each of size 60 × 80.
3. The mobile robot visual following method based on deep reinforcement learning according to claim 1, characterized in that, in Step 2: the CNN structure consists of 8 layers, including 3 convolutional layers, 2 pooling layers, 2 fully connected layers, and an output layer; from front to back, the convolution kernel sizes of the three convolutional layers are 8 × 8, 4 × 4, and 2 × 2, respectively; both pooling layers use max pooling with a size of 2 × 2; after the third convolution, the features are fed to two fully connected layers of 384 nodes each, followed by the output layer, which produces a multi-dimensional output in which each dimension represents an action in the corresponding direction, three direction actions in total: forward, left, and right; a ReLU activation function is applied after each of the three convolutional layers and the two fully connected layers to introduce non-linearity; the CNN parameters are updated with the cross-entropy loss function, expressed as:
L(x, y′) = −Σᵢ y′ᵢ · log f(x)ᵢ
where y′ is the label data of the sample, a three-dimensional one-hot vector whose dimension equal to 1 indicates the correct action, and f(x) denotes the CNN model's predicted probability for each action dimension.
4. The mobile robot visual following method based on deep reinforcement learning according to claim 1, characterized in that the DRL model in Step 3 is specifically a DQN model, and the transfer process is: removing the Softmax layer of the trained CNN network and directly assigning the weight parameters of the preceding layers to the DQN model.
5. The mobile robot visual following method based on deep reinforcement learning according to claim 4, characterized in that, in Step 4: the DQN approximates the value function with a neural network, i.e., the input of the neural network is the current state value s and the output is the predicted value Q_θ(s, a); at each time step, the environment provides a state value s, the agent obtains the values Q_θ(s, a) of all actions for this s from the value function network, then selects an action using the ε-greedy algorithm and makes a decision, and the environment returns a reward value r and the next state s′ after receiving this action a; this constitutes one step; the parameters of the value function network are updated according to r; the DQN uses the mean squared error objective function:
L(θ) = E[(r + γ·max_{a′} Q_θ(s′, a′) − Q_θ(s, a))²]
where s′ and a′ are the state and action at the next moment, γ is a hyperparameter, and θ denotes the model parameters;
during training, the parameters are updated as:
θ ← θ + α·(r + γ·max_{a′} Q_θ(s′, a′) − Q_θ(s, a))·∇_θ Q_θ(s, a), where α is the learning rate.
CN201910361528.4A 2019-04-30 2019-04-30 Mobile robot vision following method based on deep reinforcement learning Expired - Fee Related CN110084307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910361528.4A CN110084307B (en) 2019-04-30 2019-04-30 Mobile robot vision following method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910361528.4A CN110084307B (en) 2019-04-30 2019-04-30 Mobile robot vision following method based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN110084307A true CN110084307A (en) 2019-08-02
CN110084307B CN110084307B (en) 2021-06-18

Family

ID=67418184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910361528.4A Expired - Fee Related CN110084307B (en) 2019-04-30 2019-04-30 Mobile robot vision following method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN110084307B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728368A (en) * 2019-10-25 2020-01-24 中国人民解放军国防科技大学 Acceleration method for deep reinforcement learning of simulation robot
CN111523495A (en) * 2020-04-27 2020-08-11 天津中科智能识别产业技术研究院有限公司 End-to-end active human body tracking method in monitoring scene based on deep reinforcement learning
CN111539979A (en) * 2020-04-27 2020-08-14 天津大学 Human body front tracking method based on deep reinforcement learning
CN111578940A (en) * 2020-04-24 2020-08-25 哈尔滨工业大学 Indoor monocular navigation method and system based on cross-sensor transfer learning
CN112297012A (en) * 2020-10-30 2021-02-02 上海交通大学 Robot reinforcement learning method based on self-adaptive model
CN112702423A (en) * 2020-12-23 2021-04-23 杭州比脉科技有限公司 Robot learning system based on Internet of things interactive entertainment mode
CN112731804A (en) * 2019-10-29 2021-04-30 北京京东乾石科技有限公司 Method and device for realizing path following
CN112799401A (en) * 2020-12-28 2021-05-14 华南理工大学 End-to-end robot vision-motion navigation method
CN113011526A (en) * 2021-04-23 2021-06-22 华南理工大学 Robot skill learning method and system based on reinforcement learning and unsupervised learning
CN113031441A (en) * 2021-03-03 2021-06-25 北京航空航天大学 Rotary mechanical diagnosis network automatic search method based on reinforcement learning
CN113158778A (en) * 2021-03-09 2021-07-23 中国电子科技集团公司第五十四研究所 SAR image target detection method
CN113156959A (en) * 2021-04-27 2021-07-23 东莞理工学院 Self-supervision learning and navigation method of autonomous mobile robot in complex scene
CN113485326A (en) * 2021-06-28 2021-10-08 南京深一科技有限公司 Autonomous mobile robot based on visual navigation
TWI751511B (en) * 2019-09-05 2022-01-01 日商三菱電機股份有限公司 Inference device, machine control system and learning device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2017101165A4 (en) * 2017-08-25 2017-11-02 Liu, Yichen MR Method of Structural Improvement of Game Training Deep Q-Network
US20180018562A1 (en) * 2016-07-14 2018-01-18 Cside Japan Inc. Platform for providing task based on deep learning
CN108549928A (en) * 2018-03-19 2018-09-18 清华大学 Visual tracking method and device based on continuous moving under deeply learning guide
CN108932735A (en) * 2018-07-10 2018-12-04 广州众聚智能科技有限公司 A method of generating deep learning sample
CN109242882A (en) * 2018-08-06 2019-01-18 北京市商汤科技开发有限公司 Visual tracking method, device, medium and equipment
CN109341689A (en) * 2018-09-12 2019-02-15 北京工业大学 Vision navigation method of mobile robot based on deep learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018562A1 (en) * 2016-07-14 2018-01-18 Cside Japan Inc. Platform for providing task based on deep learning
AU2017101165A4 (en) * 2017-08-25 2017-11-02 Liu, Yichen MR Method of Structural Improvement of Game Training Deep Q-Network
CN108549928A (en) * 2018-03-19 2018-09-18 清华大学 Visual tracking method and device based on continuous moving under deeply learning guide
CN108932735A (en) * 2018-07-10 2018-12-04 广州众聚智能科技有限公司 A method of generating deep learning sample
CN109242882A (en) * 2018-08-06 2019-01-18 北京市商汤科技开发有限公司 Visual tracking method, device, medium and equipment
CN109341689A (en) * 2018-09-12 2019-02-15 北京工业大学 Vision navigation method of mobile robot based on deep learning

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Ofir Nachum et al.: "Bridging the Gap Between Value and Policy Based Reinforcement Learning", arXiv:1702.08892v1 *
Volodymyr Mnih et al.: "Human-level control through deep reinforcement learning", Nature *
李昌: "Research on Robust Object Tracking Methods Based on Multimodal Video", China Master's Theses Full-text Database, Information Science and Technology *
杨杰 et al.: "Medical Image Analysis, 3D Reconstruction and Their Applications", 31 January 2015 *
阳岳生: "Research on Visual Object Tracking Algorithms Based on Machine Learning", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI751511B (en) * 2019-09-05 2022-01-01 日商三菱電機股份有限公司 Inference device, machine control system and learning device
CN110728368A (en) * 2019-10-25 2020-01-24 中国人民解放军国防科技大学 Acceleration method for deep reinforcement learning of simulation robot
CN110728368B (en) * 2019-10-25 2022-03-15 中国人民解放军国防科技大学 Acceleration method for deep reinforcement learning of simulation robot
CN112731804A (en) * 2019-10-29 2021-04-30 北京京东乾石科技有限公司 Method and device for realizing path following
CN111578940A (en) * 2020-04-24 2020-08-25 哈尔滨工业大学 Indoor monocular navigation method and system based on cross-sensor transfer learning
CN111578940B (en) * 2020-04-24 2021-05-11 哈尔滨工业大学 Indoor monocular navigation method and system based on cross-sensor transfer learning
CN111523495A (en) * 2020-04-27 2020-08-11 天津中科智能识别产业技术研究院有限公司 End-to-end active human body tracking method in monitoring scene based on deep reinforcement learning
CN111539979A (en) * 2020-04-27 2020-08-14 天津大学 Human body front tracking method based on deep reinforcement learning
CN111523495B (en) * 2020-04-27 2023-09-01 天津中科智能识别产业技术研究院有限公司 End-to-end active human body tracking method in monitoring scene based on deep reinforcement learning
CN111539979B (en) * 2020-04-27 2022-12-27 天津大学 Human body front tracking method based on deep reinforcement learning
CN112297012A (en) * 2020-10-30 2021-02-02 上海交通大学 Robot reinforcement learning method based on self-adaptive model
CN112297012B (en) * 2020-10-30 2022-05-31 上海交通大学 Robot reinforcement learning method based on self-adaptive model
CN112702423A (en) * 2020-12-23 2021-04-23 杭州比脉科技有限公司 Robot learning system based on Internet of things interactive entertainment mode
CN112702423B (en) * 2020-12-23 2022-05-03 杭州比脉科技有限公司 Robot learning system based on Internet of things interactive entertainment mode
CN112799401A (en) * 2020-12-28 2021-05-14 华南理工大学 End-to-end robot vision-motion navigation method
CN113031441B (en) * 2021-03-03 2022-04-08 北京航空航天大学 Rotary mechanical diagnosis network automatic search method based on reinforcement learning
CN113031441A (en) * 2021-03-03 2021-06-25 北京航空航天大学 Rotary mechanical diagnosis network automatic search method based on reinforcement learning
CN113158778A (en) * 2021-03-09 2021-07-23 中国电子科技集团公司第五十四研究所 SAR image target detection method
CN113011526A (en) * 2021-04-23 2021-06-22 华南理工大学 Robot skill learning method and system based on reinforcement learning and unsupervised learning
CN113011526B (en) * 2021-04-23 2024-04-26 华南理工大学 Robot skill learning method and system based on reinforcement learning and unsupervised learning
CN113156959A (en) * 2021-04-27 2021-07-23 东莞理工学院 Self-supervision learning and navigation method of autonomous mobile robot in complex scene
CN113156959B (en) * 2021-04-27 2024-06-04 东莞理工学院 Self-supervision learning and navigation method for autonomous mobile robot in complex scene
CN113485326A (en) * 2021-06-28 2021-10-08 南京深一科技有限公司 Autonomous mobile robot based on visual navigation

Also Published As

Publication number Publication date
CN110084307B (en) 2021-06-18

Similar Documents

Publication Publication Date Title
CN110084307A (en) A kind of mobile robot visual follower method based on deeply study
Ruan et al. Mobile robot navigation based on deep reinforcement learning
CN106648103B (en) A kind of the gesture tracking method and VR helmet of VR helmet
CN109800864B (en) Robot active learning method based on image input
CN106966298B (en) Assembled architecture intelligence hanging method based on machine vision and system
CN104589356B (en) The Dextrous Hand remote operating control method caught based on Kinect human hand movement
CN113110482B (en) Indoor environment robot exploration method and system based on priori information heuristic method
CN110135249A (en) Human bodys' response method based on time attention mechanism and LSTM
CN109045676B (en) Chinese chess recognition learning algorithm and robot intelligent system and method based on algorithm
CN108198221A (en) A kind of automatic stage light tracking system and method based on limb action
CN101127078A (en) Unmanned machine vision image matching method based on ant colony intelligence
CN111176309B (en) Multi-unmanned aerial vehicle self-group mutual inductance understanding method based on spherical imaging
CN101154289A (en) Method for tracing three-dimensional human body movement based on multi-camera
CN107818333A (en) Robot obstacle-avoiding action learning and Target Searching Method based on depth belief network
CN112651262A (en) Cross-modal pedestrian re-identification method based on self-adaptive pedestrian alignment
CN115147488B (en) Workpiece pose estimation method and grabbing system based on dense prediction
CN113255514B (en) Behavior identification method based on local scene perception graph convolutional network
CN113370217A (en) Method for recognizing and grabbing object posture based on deep learning for intelligent robot
CN109508686A (en) A kind of Human bodys' response method based on the study of stratification proper subspace
CN110059597A (en) Scene recognition method based on depth camera
CN114036969A (en) 3D human body action recognition algorithm under multi-view condition
Cheng et al. A grasp pose detection scheme with an end-to-end CNN regression approach
Chen et al. Improving registration of augmented reality by incorporating DCNNS into visual SLAM
CN114998573A (en) Grabbing pose detection method based on RGB-D feature depth fusion
Liu et al. Robotic picking in dense clutter via domain invariant learning from synthetic dense cluttered rendering

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210618

CF01 Termination of patent right due to non-payment of annual fee