CN106094516A - A robot adaptive grasping method based on deep reinforcement learning - Google Patents
A robot adaptive grasping method based on deep reinforcement learning
- Publication number: CN106094516A
- Application number: CN201610402319.6A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—. . . electric
- G05B13/04—. . . involving the use of models or simulators
- G05B13/042—. . . in which a parameter or coefficient is automatically adjusted to optimise the performance
Abstract
The invention provides a robot adaptive grasping method based on deep reinforcement learning. The steps are: while still some distance from the target to be grasped, the robot captures a photo of the target with its front camera, computes the target's position from the photo by binocular ranging, and uses the computed position for navigation. Once the target is within the manipulator's grasping range, the robot photographs the target again with the front camera and feeds the photo to a pre-trained DDPG-based deep reinforcement learning network for data dimensionality reduction and feature extraction. A control strategy for the robot is derived from the feature extraction result, with which the robot controls its motion path and the pose of the manipulator, thereby achieving adaptive grasping of the target. The method can adaptively grasp objects that differ in size and shape and whose positions are not fixed, and has good prospects for market application.
Description
Technical field
The present invention relates to a method for a robot to grasp objects, in particular to a robot adaptive grasping method based on deep reinforcement learning.
Background art
An autonomous robot is a highly intelligent service robot with the ability to learn from its external environment. To realize basic functions such as localization, movement, and grasping, the robot must be equipped with a manipulator and gripper, fuse information from multiple sensors, and apply machine learning (such as deep learning and reinforcement learning) to interact with the external environment, thereby achieving perception, decision-making, and action. Most grasping robots today operate only where the size, shape, and position of the object to be grasped are relatively fixed, and their grasping techniques rely mainly on sensors such as ultrasonic, infrared, and laser rangefinders. Their range of use is therefore limited: they cannot adapt to more complex grasping environments or to objects whose size, shape, and position are not fixed. At present, existing vision-based robotic grasping techniques struggle with the "curse of dimensionality" posed by high-dimensional, high-volume visual input; moreover, neural networks trained by machine learning are difficult to converge and cannot directly process the input images. Overall, the control technology of vision-based grasping service robots has not yet reached satisfactory results and still needs further optimization in practice.
Summary of the invention
The technical problem to be solved by the present invention is that existing methods cannot adapt to more complex grasping environments or to objects whose size, shape, and position are not fixed.
To solve this problem, the invention provides a robot adaptive grasping method based on deep reinforcement learning, comprising the following steps:
Step 1: while still some distance from the target to be grasped, the robot captures a photo of the target with its front camera, computes the target's position from the photo by binocular ranging, and uses the computed position for robot navigation.
Step 2: the robot moves according to the navigation; once the target is within the manipulator's grasping range, the robot photographs the target with the front camera and feeds the photo to a pre-trained DDPG-based deep reinforcement learning network for data dimensionality reduction and feature extraction.
Step 3: a control strategy for the robot is derived from the feature extraction result; the robot uses the control strategy to control its motion path and the pose of the manipulator, thereby achieving adaptive grasping of the target.
As a further refinement of the scheme of the present invention, the specific steps of computing the target's position from the photo by binocular ranging in step 1 are:
Step 1.1: obtain the focal length f of the cameras, the distance T_x between the centres of the left and right cameras, and the physical distances x_l and x_r from the target point's projections on the left and right image planes to the left edges of the respective image planes. The left and right image planes are rectangular, lie on the same imaging plane, and the projections of the two cameras' optical centres lie at the centres of the corresponding image planes. The disparity d is then:
d = x_l - x_r (1)
Step 1.2: use the principle of similar triangles to set up the reprojection matrix Q:
Q = [ 1  0  0       -c_x
      0  1  0       -c_y
      0  0  0        f
      0  0  -1/T_x  (c_x - c_x')/T_x ] (2)
Q · [x y d 1]^T = [X Y Z W]^T (3)
In formulas (2) and (3), (X, Y, Z) are the coordinates of the target point in the three-dimensional coordinate system whose origin is the optical centre of the left camera, W is the scale coefficient of the rotation-translation transformation, (x, y) are the coordinates of the target point in the left image plane, c_x and c_y are the offsets between the origins of the left and right image-plane coordinate systems and the origin of the three-dimensional coordinate system, and c_x' is the corrected value of c_x.
Step 1.3: the space distance from the target point to the imaging plane is computed as:
Z = f·T_x / d (4)
Taking the optical centre of the left camera as the robot's position, the coordinates (X, Y, Z) of the target point are used as the navigation goal for robot navigation.
As a further refinement of the scheme of the present invention, the specific steps of applying the pre-trained DDPG-based deep reinforcement learning network to the photo for data dimensionality reduction and feature extraction in step 2 are:
Step 2.1: the target grasping process satisfies the conditions of reinforcement learning and has the Markov property, so the set of observations and actions before time t is:
s_t = (x_1, a_1, ..., a_{t-1}, x_t) = x_t (5)
In formula (5), x_t and a_t are the observation at time t and the action taken.
Step 2.2: the expected return of the grasping process is described by the action-value function:
Q^π(s_t, a_t) = E[R_t | s_t, a_t] (6)
In formula (6), R_t = Σ_{i=t}^{T} γ^{i-t} r(s_i, a_i) is the discounted sum of future rewards obtained from time t, γ ∈ [0,1] is the discount factor, r(s_t, a_t) is the reward function at time t, T is the time at which grasping ends, and π is the grasping policy.
Since the target grasping policy π is predetermined and deterministic, it can be written as a function μ: S → A, where S is the state space and A is the N-dimensional action space. Applying the Bellman equation to formula (6) gives:
Q^μ(s_t, a_t) = E_{s_{t+1}~E}[ r(s_t, a_t) + γ Q^μ(s_{t+1}, μ(s_{t+1})) ] (7)
In formula (7), s_{t+1}~E means the observation at time t+1 is obtained from the environment E, and μ(s_{t+1}) is the action to which the function μ maps the observation at time t+1.
Step 2.3: using the principle of maximum likelihood estimation, the policy evaluation network Q(s, a | θ^Q) with weight parameters θ^Q is updated by minimizing the loss function:
L(θ^Q) = E_{μ'}[ (Q(s_t, a_t | θ^Q) - y_t)^2 ] (8)
In formula (8), y_t = r(s_t, a_t) + γ Q(s_{t+1}, μ(s_{t+1}) | θ^Q) is the value given by the target evaluation network, and μ' is the target policy.
Step 2.4: for the policy function μ(s | θ^μ) with actual parameters θ^μ, the gradient obtained by the chain rule is:
∇_{θ^μ} J ≈ E[ ∇_a Q(s, a | θ^Q) |_{s=s_t, a=μ(s_t)} · ∇_{θ^μ} μ(s | θ^μ) |_{s=s_t} ] (9)
The gradient computed by formula (9) is the policy gradient, which is then used to update the policy function μ(s | θ^μ).
Step 2.5: the network is trained with an off-policy algorithm. The sample data used in training are drawn from the same sample buffer to minimize the correlation between samples, and a target Q-value network is trained at the same time; that is, an experience replay mechanism and a target Q-value network method are used to update the target networks. The slow update rule used is:
θ^{Q'} ← τθ^Q + (1-τ)θ^{Q'} (10)
θ^{μ'} ← τθ^μ + (1-τ)θ^{μ'} (11)
In formulas (10) and (11), τ is the update rate, with τ << 1. This constructs a DDPG-based deep reinforcement learning network that converges.
Step 2.6: the constructed deep reinforcement learning network is applied to the photo for data dimensionality reduction and feature extraction, obtaining the robot's control strategy.
As a further refinement of the scheme of the present invention, the deep reinforcement learning network in step 2.6 consists of an image input layer, two convolutional layers, two fully connected layers, and an output layer. The image input layer takes as input the image containing the object to be grasped; the convolutional layers extract features, i.e. a deep representation of the image; the fully connected layers and the output layer form a deep network which, after training, outputs control instructions from the input feature information, namely the angles of the manipulator's servos and the speed of the DC motors driving the carrier cart. Two convolutional layers and two fully connected layers are chosen so that image features can be extracted effectively while the neural network still converges easily during training.
The beneficial effects of the present invention are: (1) during pre-training of the neural network, using an experience replay mechanism and random sampling to determine the input images effectively solves the problem that consecutive photos are strongly correlated, which would violate the neural network's requirement that input data be independent of each other; (2) data dimensionality reduction is achieved through deep learning, and the target Q-value network technique continually adjusts the weight matrices of the neural network, ensuring as far as possible that the trained neural network converges; (3) the trained DDPG-based deep reinforcement learning network achieves data dimensionality reduction and object feature extraction and directly yields the robot's motion control strategy, effectively solving the "curse of dimensionality" problem.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system structure of the present invention;
Fig. 2 is a flow chart of the method of the present invention;
Fig. 3 is a plan view of the binocular ranging method of the present invention;
Fig. 4 is a perspective view of the binocular ranging technique of the present invention;
Fig. 5 is a schematic diagram of the composition of the DDPG-based deep reinforcement learning network of the present invention.
Detailed description of the invention
As shown in Fig. 1, the system for robot adaptive grasping based on the deep reinforcement learning method of the present invention comprises: an image processing system, a wireless communication system, and a robot motion system.
The image processing system consists mainly of a camera mounted at the front of the robot together with Matlab software; the wireless communication system consists mainly of a WIFI module; the robot motion system consists mainly of a base cart and a manipulator. First, the DDPG (deep deterministic policy gradient) based deep reinforcement learning network must be pre-trained on a dynamics simulation platform; during this process an experience replay mechanism and a target Q-value network are generally both used to ensure that the network converges during pre-training. The image processing system then acquires an image of the target object and passes the image information to a computer through the wireless communication system. While the robot is still far from the object to be grasped, binocular ranging is used to obtain the object's position, which is used for robot navigation.
When the robot has moved to where the manipulator can reach the object, it photographs the object again, and the pre-trained DDPG-based deep reinforcement learning network performs data dimensionality reduction, extracts features, and yields the robot's control strategy. Finally, the control strategy is sent to the robot motion system through the wireless communication system to control the robot's motion state, achieving accurate grasping of the target object.
During pre-training, the RGB image of the target object is first converted to a grayscale image with Matlab; the experience replay mechanism is then used so that the correlation between consecutive photos is as small as possible, satisfying the neural network's requirement that input data be independent of each other, and the images fed to the neural network are finally obtained by random sampling. Data dimensionality reduction is achieved through deep learning, and the target Q-value network technique continually adjusts the weight matrices of the neural network, finally yielding a converged network.
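The RGB-to-grayscale conversion performed here in Matlab can be sketched in Python. The luminance weights below are the standard ITU-R BT.601 coefficients (the same ones Matlab's rgb2gray uses); the 2×2 test image is purely illustrative:

```python
import numpy as np

def rgb_to_gray(img):
    """Convert an RGB image of shape (H, W, 3) to grayscale
    using ITU-R BT.601 luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return img[..., :3] @ weights

# A 2x2 pure-red image: every grayscale pixel becomes 0.299.
red = np.zeros((2, 2, 3))
red[..., 0] = 1.0
gray = rgb_to_gray(red)
```

The grayscale image has one third of the data of the RGB input, which is the first, trivial stage of the dimensionality reduction described above.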
The robot is controlled by an Arduino board with an on-board WIFI module; the manipulator is composed of four servos, giving four degrees of freedom in total, and the base cart is driven by DC motors. The image processing system consists mainly of the camera, its image transmission software, and Matlab. Photos of the target object taken by the camera are transferred to the computer through the WIFI module on the Arduino board and processed in Matlab.
In operation, the system proceeds as follows:
Step 1: first, the DDPG (deep deterministic policy gradient) based deep reinforcement learning network is pre-trained on a dynamics simulation platform; an experience replay mechanism and a target Q-value network are generally both used to ensure that the network converges during pre-training.
Step 2: an image of the target object is acquired with the camera mounted at the front of the robot, and the image information is passed to the computer through the WIFI module.
Step 3: while the robot is still far from the object to be grasped, binocular ranging is used to obtain the target object's position, which is used for robot navigation.
Step 4: when the robot has moved to where the manipulator can reach the object, it photographs the object again, and the pre-trained DDPG-based deep reinforcement learning network performs data dimensionality reduction, extracts features, and yields the robot's control strategy.
Step 5: the control information is sent to the robot motion system through the WIFI module, achieving accurate grasping of the target object.
As shown in Figs. 3 and 4, binocular ranging exploits the fact that the difference between the horizontal imaging coordinates of the target point in the left and right views (i.e. the disparity) is inversely proportional to the distance from the target point to the imaging plane. Normally the focal length is measured in pixels; the camera centre distance is determined by the actual size of the calibration-board checkerboard and the value we input, usually in millimetres (to improve precision, we set it to the 0.1 mm scale); and the disparity is also in pixels. The pixel units in numerator and denominator therefore cancel, and the distance from the target point to the imaging plane has the same unit as the camera centre distance.
As shown in Fig. 5, the DDPG-based deep reinforcement learning network consists mainly of an image input layer, two convolutional layers, two fully connected layers, and an output layer. The deep network architecture realizes data dimensionality reduction: the convolutional layers extract features and the output layer outputs control information.
As shown in Fig. 2, the present invention provides a robot adaptive grasping method based on deep reinforcement learning, comprising the following steps:
Step 1: while still some distance from the target to be grasped, the robot captures a photo of the target with its front camera, computes the target's position from the photo by binocular ranging, and uses the computed position for robot navigation.
Step 2: the robot moves according to the navigation; once the target is within the manipulator's grasping range, the robot photographs the target with the front camera and feeds the photo to the pre-trained DDPG-based deep reinforcement learning network for data dimensionality reduction and feature extraction.
Step 3: a control strategy for the robot is derived from the feature extraction result; the robot uses the control strategy to control its motion path and the pose of the manipulator, thereby achieving adaptive grasping of the target.
The specific steps of computing the target's position from the photo by binocular ranging in step 1 are:
Step 1.1: obtain the focal length f of the cameras, the distance T_x between the centres of the left and right cameras, and the physical distances x_l and x_r from the target point's projections on the left and right image planes to the left edges of the respective image planes. The left and right image planes are rectangular, lie on the same imaging plane, and the projections of the two cameras' optical centres lie at the centres of the corresponding image planes, i.e. at the projections of O_l and O_r on the imaging plane. The disparity d is then:
d = x_l - x_r (1)
Step 1.2: use the principle of similar triangles to set up the reprojection matrix Q:
Q = [ 1  0  0       -c_x
      0  1  0       -c_y
      0  0  0        f
      0  0  -1/T_x  (c_x - c_x')/T_x ] (2)
Q · [x y d 1]^T = [X Y Z W]^T (3)
In formulas (2) and (3), (X, Y, Z) are the coordinates of the target point in the three-dimensional coordinate system whose origin is the optical centre of the left camera, W is the scale coefficient of the rotation-translation transformation, (x, y) are the coordinates of the target point in the left image plane, c_x and c_y are the offsets between the origins of the left and right image-plane coordinate systems and the origin of the three-dimensional coordinate system, and c_x' is the corrected value of c_x (the two values generally differ little, and in the present invention they are taken as approximately equal).
Step 1.3: the space distance from the target point to the imaging plane is computed as:
Z = f·T_x / d (4)
Taking the optical centre of the left camera as the robot's position, the coordinates (X, Y, Z) of the target point are used as the navigation goal for robot navigation.
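Steps 1.1–1.3 above can be sketched numerically. The snippet below assumes the simplifying case c_x = c_x' noted above, in which the depth reduces to Z = f·T_x/d, and recovers X and Y by similar triangles; all calibration values are illustrative, not taken from the patent:

```python
# Illustrative calibration values, not taken from the patent.
f  = 700.0            # focal length, in pixels
Tx = 60.0             # distance between the two camera centres, in mm
cx, cy = 320.0, 240.0 # principal point of the left image plane, in pixels

xl, xr = 400.0, 365.0 # horizontal coordinates of the target point
d = xl - xr           # disparity, formula (1)

Z = f * Tx / d        # depth of the target point, formula (4), in mm
# X and Y follow from similar triangles using the left-image pixel (x, y).
x, y = xl, 260.0
X = (x - cx) * Z / f
Y = (y - cy) * Z / f
```

Because f and d are both in pixels, the units cancel and Z comes out in the unit of T_x (millimetres), exactly as the dimensional argument around Figs. 3 and 4 states.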
The specific steps of applying the pre-trained DDPG-based deep reinforcement learning network to the photo for data dimensionality reduction and feature extraction in step 2 are:
Step 2.1: the target grasping process satisfies the conditions of reinforcement learning and has the Markov property, so the set of observations and actions before time t is:
s_t = (x_1, a_1, ..., a_{t-1}, x_t) = x_t (5)
In formula (5), x_t and a_t are the observation at time t and the action taken.
Step 2.2: the expected return of the grasping process is described by the action-value function:
Q^π(s_t, a_t) = E[R_t | s_t, a_t] (6)
In formula (6), R_t = Σ_{i=t}^{T} γ^{i-t} r(s_i, a_i) is the discounted sum of future rewards obtained from time t, γ ∈ [0,1] is the discount factor, r(s_t, a_t) is the reward function at time t, T is the time at which grasping ends, and π is the grasping policy.
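The discounted return R_t appearing in formula (6) can be computed directly from a reward sequence; the rewards and discount factor below are illustrative:

```python
def discounted_return(rewards, gamma):
    """R_t = sum over i of gamma**(i-t) * r_i for the rewards
    remaining from time t, as in formula (6)."""
    total = 0.0
    for i, r in enumerate(rewards):
        total += (gamma ** i) * r
    return total

# Three remaining steps with rewards 1, 0, 2 and discount 0.5:
# R = 1 + 0.5*0 + 0.25*2 = 1.5
R = discounted_return([1.0, 0.0, 2.0], 0.5)
```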
Since the target grasping policy π is predetermined and deterministic, it can be written as a function μ: S → A, where S is the state space and A is the N-dimensional action space. Applying the Bellman equation to formula (6) gives:
Q^μ(s_t, a_t) = E_{s_{t+1}~E}[ r(s_t, a_t) + γ Q^μ(s_{t+1}, μ(s_{t+1})) ] (7)
In formula (7), s_{t+1}~E means the observation at time t+1 is obtained from the environment E, and μ(s_{t+1}) is the action to which the function μ maps the observation at time t+1.
Step 2.3: using the principle of maximum likelihood estimation, the policy evaluation network Q(s, a | θ^Q) with weight parameters θ^Q is updated by minimizing the loss function:
L(θ^Q) = E_{μ'}[ (Q(s_t, a_t | θ^Q) - y_t)^2 ] (8)
In formula (8), y_t = r(s_t, a_t) + γ Q(s_{t+1}, μ(s_{t+1}) | θ^Q) is the value given by the target evaluation network, and μ' is the target policy.
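The target value y_t and the squared error of formula (8) for a single transition can be sketched as follows; the Q-values are stand-in numbers, not outputs of a real critic network:

```python
def td_target(r, gamma, q_next, done=False):
    """y_t = r(s_t, a_t) + gamma * Q(s_{t+1}, mu(s_{t+1})),
    the target value used in formula (8)."""
    return r if done else r + gamma * q_next

def critic_loss(q_pred, y):
    """Squared TD error (Q(s_t, a_t) - y_t)**2 for one transition."""
    return (q_pred - y) ** 2

y = td_target(r=1.0, gamma=0.9, q_next=2.0)   # y = 1.0 + 0.9*2.0 = 2.8
loss = critic_loss(q_pred=2.5, y=y)           # (2.5 - 2.8)**2 = 0.09
```

In practice the loss is the mean of this quantity over a sampled minibatch, which is exactly the expectation E_{μ'} in formula (8).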
Step 2.4: for the policy function μ(s | θ^μ) with actual parameters θ^μ, the gradient obtained by the chain rule is:
∇_{θ^μ} J ≈ E[ ∇_a Q(s, a | θ^Q) |_{s=s_t, a=μ(s_t)} · ∇_{θ^μ} μ(s | θ^μ) |_{s=s_t} ] (9)
The gradient computed by formula (9) is the policy gradient, which is then used to update the policy function μ(s | θ^μ).
Step 2.5: the network is trained with an off-policy algorithm. The sample data used in training are drawn from the same sample buffer to minimize the correlation between samples, and a target Q-value network is trained at the same time; that is, an experience replay mechanism and a target Q-value network method are used to update the target networks. The slow update rule used is:
θ^{Q'} ← τθ^Q + (1-τ)θ^{Q'} (10)
θ^{μ'} ← τθ^μ + (1-τ)θ^{μ'} (11)
In formulas (10) and (11), τ is the update rate, with τ << 1. This constructs a DDPG-based deep reinforcement learning network that converges.
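The slow update of formulas (10) and (11) is a simple convex blend of the current and target weights; a minimal sketch with illustrative two-element weight vectors:

```python
import numpy as np

def soft_update(target, source, tau):
    """theta' <- tau*theta + (1 - tau)*theta', formulas (10) and (11)."""
    return tau * source + (1.0 - tau) * target

theta_q  = np.array([1.0, 2.0])   # current critic weights (illustrative)
theta_qp = np.array([0.0, 0.0])   # target-network weights
theta_qp = soft_update(theta_qp, theta_q, tau=0.001)
```

With τ << 1 the target network trails the trained network very slowly, which is what keeps the target value y_t stable enough for the critic to converge.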
Step 2.6: the constructed deep reinforcement learning network is applied to the photo for data dimensionality reduction and feature extraction, obtaining the robot's control strategy. The deep reinforcement learning network consists of an image input layer, two convolutional layers, two fully connected layers, and an output layer; two convolutional layers and two fully connected layers are chosen so that image features can be extracted effectively while the neural network still converges easily during training. The image input layer takes as input the image containing the object to be grasped; the convolutional layers extract features, i.e. a deep representation of the image such as lines, edges, and arcs; the fully connected layers and the output layer form a deep network which, after training, outputs control instructions from the input feature information, namely the angles of the manipulator's servos and the speed of the DC motors driving the carrier cart.
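How the two convolutional layers shrink the image can be checked with the standard output-size formula for a valid convolution; the input resolution, kernel sizes, and strides below are assumptions for illustration only, since the patent does not specify them:

```python
def conv_out(size, kernel, stride):
    """Spatial output size of a valid convolution:
    floor((size - kernel) / stride) + 1."""
    return (size - kernel) // stride + 1

h = w = 84                   # assumed grayscale input resolution
h = w = conv_out(h, 8, 4)    # convolutional layer 1 -> 20 x 20
h = w = conv_out(h, 4, 2)    # convolutional layer 2 -> 9 x 9
flat = h * w                 # features fed to the two fully connected layers
```

Under these assumed sizes, an 84×84 input is reduced to 81 spatial positions per feature map before the fully connected layers, illustrating the dimensionality reduction the architecture is chosen for.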
In the present invention, using an experience replay mechanism and random sampling to determine the input images during pre-training of the neural network effectively solves the problem that consecutive photos are strongly correlated, which would violate the neural network's requirement that input data be independent of each other. Data dimensionality reduction is achieved through deep learning, and the target Q-value network technique continually adjusts the weight matrices of the neural network, ensuring as far as possible that the trained network converges. The trained DDPG-based deep reinforcement learning network achieves data dimensionality reduction and object feature extraction and directly yields the robot's motion control strategy, effectively solving the "curse of dimensionality" problem.
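The experience replay mechanism used throughout can be sketched as a bounded buffer sampled uniformly at random; the buffer size, batch size, and integer stand-in transitions are illustrative:

```python
import random
from collections import deque

buffer = deque(maxlen=1000)        # bounded replay buffer (illustrative size)

# Store transitions (s_t, a_t, r_t, s_{t+1}); integers stand in for states.
for t in range(50):
    buffer.append((t, t % 4, 1.0, t + 1))

# Uniform random sampling breaks the temporal correlation between
# consecutive photos, which is the independence property argued for above.
random.seed(0)
batch = random.sample(list(buffer), 8)
```

Because `deque(maxlen=...)` discards the oldest transitions once full, the buffer also keeps memory bounded during long pre-training runs.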
Claims (4)
1. A robot adaptive grasping method based on deep reinforcement learning, characterised by comprising the steps of:
Step 1: while still some distance from the target to be grasped, the robot captures a photo of the target with its front camera, computes the target's position from the photo by binocular ranging, and uses the computed position for robot navigation;
Step 2: the robot moves according to the navigation; once the target is within the manipulator's grasping range, the robot photographs the target with the front camera and feeds the photo to a pre-trained DDPG-based deep reinforcement learning network for data dimensionality reduction and feature extraction;
Step 3: a control strategy for the robot is derived from the feature extraction result; the robot uses the control strategy to control its motion path and the pose of the manipulator, thereby achieving adaptive grasping of the target.
2. The robot adaptive grasping method based on deep reinforcement learning according to claim 1, characterised in that the specific steps of computing the target's position from the photo by binocular ranging in step 1 are:
Step 1.1: obtain the focal length f of the cameras, the distance T_x between the centres of the left and right cameras, and the physical distances x_l and x_r from the target point's projections on the left and right image planes to the left edges of the respective image planes; the left and right image planes are rectangular, lie on the same imaging plane, and the projections of the two cameras' optical centres lie at the centres of the corresponding image planes; the disparity d is then:
d = x_l - x_r (1)
Step 1.2: use the principle of similar triangles to set up the reprojection matrix Q:
Q = [ 1  0  0       -c_x
      0  1  0       -c_y
      0  0  0        f
      0  0  -1/T_x  (c_x - c_x')/T_x ] (2)
Q · [x y d 1]^T = [X Y Z W]^T (3)
In formulas (2) and (3), (X, Y, Z) are the coordinates of the target point in the three-dimensional coordinate system whose origin is the optical centre of the left camera, W is the scale coefficient of the rotation-translation transformation, (x, y) are the coordinates of the target point in the left image plane, c_x and c_y are the offsets between the origins of the left and right image-plane coordinate systems and the origin of the three-dimensional coordinate system, and c_x' is the corrected value of c_x;
Step 1.3: the space distance from the target point to the imaging plane is computed as:
Z = f·T_x / d (4)
Taking the optical centre of the left camera as the robot's position, the coordinates (X, Y, Z) of the target point are used as the navigation goal for robot navigation.
Robot self-adapting grasping method based on deeply study the most according to claim 1 and 2, its feature exists
In, the deeply learning network based on DDPG utilizing training in advance to cross in step 2 carries out Data Dimensionality Reduction feature to photo and carries
Take concretely comprises the following steps:
Step 2.1, utilize target capture process meet intensified learning and meet the condition of Markov character, calculate t it
Front observed quantity and the collection of action are combined into:
st=(x1,a1,...,at-1,xt)=xt (5)
In formula (5), xtAnd atThe observed quantity being respectively t and the action taked;
Step 2.2, use the action-value function to describe the expected return of the grasping process:
Qπ(st, at) = E[Rt | st, at] (6)
In formula (6), Rt = Σi=t..T γ^(i−t) r(si, ai) is the discounted sum of future returns obtained from time t onward, γ ∈ [0, 1] is the discount factor, r(st, at) is the reward function at time t, T is the time at which grasping ends, and π is the grasping policy;
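The discounted return Rt in formula (6) can be computed directly; the reward sequence below is made up for illustration.

```python
def discounted_return(rewards, gamma):
    """R_t = sum over i of gamma**(i - t) * r_i, evaluated at t = 0."""
    return sum((gamma ** i) * r for i, r in enumerate(rewards))

# A grasp that only pays off at the final step, discounted by gamma = 0.9:
print(discounted_return([0.0, 0.0, 1.0], gamma=0.9))  # 0.9**2 * 1 = 0.81
```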
Since the target grasping policy π is predetermined and deterministic, it is denoted as a function μ: S → A, where S is the state space and A is the N-dimensional action space. Applying the Bellman equation to formula (6) gives:
Qμ(st, at) = E[r(st, at) + γ·Qμ(st+1, μ(st+1))], st+1 ~ E (7)
In formula (7), st+1 ~ E indicates that the observation at time t+1 is obtained from the environment E, and μ(st+1) is the action to which the observation at time t+1 is mapped by the function μ;
Step 2.3, following the principle of maximum-likelihood estimation, update the policy-evaluation network Q(s, a | θQ) with weight parameters θQ by minimizing the loss function:
L(θQ) = Eμ′[(Q(st, at | θQ) − yt)^2] (8)
In formula (8), yt = r(st, at) + γ·Q(st+1, μ(st+1) | θQ) is the output of the target policy-evaluation network, and μ′ is the target policy;
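A minimal numerical sketch of the target value yt and the loss in formula (8); the Q-values below are stand-in numbers rather than outputs of a trained critic network.

```python
import numpy as np

def td_target(r, q_next, gamma):
    # y_t = r(s_t, a_t) + gamma * Q(s_{t+1}, mu(s_{t+1}) | theta_Q)
    return r + gamma * q_next

def critic_loss(q_values, targets):
    # L(theta_Q) = mean of (Q(s_t, a_t | theta_Q) - y_t)^2 over a minibatch
    q = np.asarray(q_values)
    y = np.asarray(targets)
    return float(np.mean((q - y) ** 2))

y = td_target(r=1.0, q_next=2.0, gamma=0.5)   # y = 1 + 0.5*2 = 2.0
print(critic_loss([1.5, 2.5], [y, y]))        # (0.5**2 + 0.5**2) / 2 = 0.25
```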
Step 2.4, for the policy function μ(s | θμ) with actual parameters θμ, the gradient obtained by the chain rule is:
∇θμ J ≈ Eμ′[ ∇a Q(s, a | θQ)|s=st, a=μ(st) · ∇θμ μ(s | θμ)|s=st ] (9)
The gradient computed by formula (9) is the policy gradient, which is then used to update the policy function μ(s | θμ);
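The chain-rule update of step 2.4 can be demonstrated in one dimension with analytically differentiable stand-ins: a quadratic critic Q(s, a) = −(a − 2s)^2, whose best action is a* = 2s, and a linear actor μ(s) = θ·s. Both functions are invented purely for illustration.

```python
def grad_a_Q(s, a):
    return -2.0 * (a - 2.0 * s)   # dQ/da for Q(s, a) = -(a - 2s)^2

def grad_theta_mu(s):
    return s                      # d(mu)/d(theta) for mu(s) = theta * s

theta, lr = 0.0, 0.1
for s in [1.0, 0.5, 1.5, 1.0] * 50:      # sampled states
    a = theta * s                         # action chosen by the actor
    # Ascent along the policy gradient, formula (9): dQ/da * dmu/dtheta
    theta += lr * grad_a_Q(s, a) * grad_theta_mu(s)
print(round(theta, 3))  # the actor converges toward the optimal theta = 2.0
```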
Step 2.5, train the network with an off-policy algorithm: the sample data used during network training are drawn from the same sample buffer so as to minimize the correlation between samples, and a target Q-value network is used while training the neural network, i.e. the experience replay mechanism and the target Q-value network method are used to update the target networks. The slow update strategy used is:
θQ′ ← τθQ + (1 − τ)θQ′ (10)
θμ′ ← τθμ + (1 − τ)θμ′ (11)
In formulas (10) and (11), τ is the update rate with τ ≪ 1. In this way a DDPG-based deep reinforcement learning network is constructed, and it is a convergent neural network;
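The slow target-network updates (10) and (11) amount to an exponential moving average of the online weights; the sketch below uses plain Python lists as stand-ins for network parameters.

```python
def soft_update(target, online, tau):
    """theta' <- tau * theta + (1 - tau) * theta', applied element-wise."""
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online, target)]

theta_q = [1.0, 2.0]      # online critic weights (illustrative values)
theta_q_t = [0.0, 0.0]    # target critic weights
for _ in range(3):
    theta_q_t = soft_update(theta_q_t, theta_q, tau=0.001)
print(theta_q_t)  # each weight creeps toward the online value at rate tau
```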
Step 2.6, use the constructed deep reinforcement learning network to perform dimensionality-reduction feature extraction on the photo and obtain the robot control strategy.
The deep reinforcement learning based robot adaptive grasping method according to claim 3, characterized in that the deep reinforcement learning network in step 2.6 consists of an image input layer, two convolutional layers, two fully connected layers and an output layer. The image input layer is used to input the image containing the object to be grasped; the convolutional layers extract features, i.e. a deep representation of the image; the fully connected layers and the output layer constitute a deep network which, once trained, outputs control instructions as soon as feature information is fed into it, i.e. it controls the servo angles of the robot's mechanical arm and the speed of the DC motor driving the carrying trolley.
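The layer structure described in this claim (image input, two convolutional layers, two fully connected layers, one output layer) can be sketched as a naive NumPy forward pass. The input size, filter counts, kernel sizes and the 8-dimensional output (servo angles plus motor speed) are illustrative assumptions, not dimensions given in the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, kernels):
    """Naive 'valid' convolution of a 2-D map with a bank of k x k kernels."""
    n, k, _ = kernels.shape
    H, W = x.shape
    out = np.empty((n, H - k + 1, W - k + 1))
    for f in range(n):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[f, i, j] = np.sum(x[i:i + k, j:j + k] * kernels[f])
    return np.maximum(out, 0.0)   # ReLU

image = rng.random((16, 16))                                   # image input layer
c1 = conv2d(image, rng.standard_normal((4, 3, 3)))             # conv layer 1
c2 = np.stack([conv2d(ch, rng.standard_normal((1, 3, 3)))[0]   # conv layer 2
               for ch in c1])
feat = c2.reshape(-1)                       # dimensionality-reduced feature vector

W1 = rng.standard_normal((32, feat.size))   # fully connected layer 1
h = np.maximum(W1 @ feat, 0.0)
W2 = rng.standard_normal((8, 32))           # fully connected layer 2 / output
control = W2 @ h                            # 8-dim control instruction
print(control.shape)  # (8,): e.g. arm servo angles and DC motor speed
```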
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610402319.6A CN106094516A (en) | 2016-06-08 | 2016-06-08 | A kind of robot self-adapting grasping method based on deeply study |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106094516A true CN106094516A (en) | 2016-11-09 |
Family
ID=57228280
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610402319.6A Pending CN106094516A (en) | 2016-06-08 | 2016-06-08 | A kind of robot self-adapting grasping method based on deeply study |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106094516A (en) |
Cited By (54)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106600650A (en) * | 2016-12-12 | 2017-04-26 | 杭州蓝芯科技有限公司 | Binocular visual sense depth information obtaining method based on deep learning |
CN106780605A (en) * | 2016-12-20 | 2017-05-31 | 芜湖哈特机器人产业技术研究院有限公司 | A kind of detection method of the object crawl position based on deep learning robot |
CN106737673A (en) * | 2016-12-23 | 2017-05-31 | 浙江大学 | A kind of method of the control of mechanical arm end to end based on deep learning |
CN106873585A (en) * | 2017-01-18 | 2017-06-20 | 无锡辰星机器人科技有限公司 | One kind navigation method for searching, robot and system |
CN106970594A (en) * | 2017-05-09 | 2017-07-21 | 京东方科技集团股份有限公司 | A kind of method for planning track of flexible mechanical arm |
CN107092254A (en) * | 2017-04-27 | 2017-08-25 | 北京航空航天大学 | A kind of design method for the Household floor-sweeping machine device people for strengthening study based on depth |
CN107139179A (en) * | 2017-05-26 | 2017-09-08 | 西安电子科技大学 | A kind of intellect service robot and method of work |
CN107168110A (en) * | 2016-12-09 | 2017-09-15 | 陈胜辉 | A kind of material grasping means and system |
CN107186708A (en) * | 2017-04-25 | 2017-09-22 | 江苏安格尔机器人有限公司 | Trick servo robot grasping system and method based on deep learning image Segmentation Technology |
CN107367929A (en) * | 2017-07-19 | 2017-11-21 | 北京上格云技术有限公司 | Update method, storage medium and the terminal device of Q value matrixs |
CN107450593A (en) * | 2017-08-30 | 2017-12-08 | 清华大学 | A kind of unmanned plane autonomous navigation method and system |
CN107450555A (en) * | 2017-08-30 | 2017-12-08 | 唐开强 | A kind of Hexapod Robot real-time gait planing method based on deeply study |
CN107479501A (en) * | 2017-09-28 | 2017-12-15 | 广州智能装备研究院有限公司 | 3D parts suction methods based on deep learning |
CN107479368A (en) * | 2017-06-30 | 2017-12-15 | 北京百度网讯科技有限公司 | A kind of method and system of the training unmanned aerial vehicle (UAV) control model based on artificial intelligence |
CN107562052A (en) * | 2017-08-30 | 2018-01-09 | 唐开强 | A kind of Hexapod Robot gait planning method based on deeply study |
CN107748566A (en) * | 2017-09-20 | 2018-03-02 | 清华大学 | A kind of underwater autonomous robot constant depth control method based on intensified learning |
CN108051999A (en) * | 2017-10-31 | 2018-05-18 | 中国科学技术大学 | Accelerator beam path control method and system based on deeply study |
CN108052004A (en) * | 2017-12-06 | 2018-05-18 | 湖北工业大学 | Industrial machinery arm autocontrol method based on depth enhancing study |
CN108305275A (en) * | 2017-08-25 | 2018-07-20 | 深圳市腾讯计算机系统有限公司 | Active tracking method, apparatus and system |
CN108321795A (en) * | 2018-01-19 | 2018-07-24 | 上海交通大学 | Start-stop of generator set configuration method based on depth deterministic policy algorithm and system |
CN108415254A (en) * | 2018-03-12 | 2018-08-17 | 苏州大学 | Waste recycling robot control method and device based on deep Q network |
CN108536011A (en) * | 2018-03-19 | 2018-09-14 | 中山大学 | A kind of Hexapod Robot complicated landform adaptive motion control method based on deeply study |
CN108594804A (en) * | 2018-03-12 | 2018-09-28 | 苏州大学 | Automatic driving control method for distribution trolley based on deep Q network |
CN108873687A (en) * | 2018-07-11 | 2018-11-23 | 哈尔滨工程大学 | A kind of Intelligent Underwater Robot behavior system knot planing method based on depth Q study |
CN109063827A (en) * | 2018-10-25 | 2018-12-21 | 电子科技大学 | It takes automatically in the confined space method, system, storage medium and the terminal of specific luggage |
CN109116854A (en) * | 2018-09-16 | 2019-01-01 | 南京大学 | A kind of robot cooperated control method of multiple groups based on intensified learning and control system |
CN109344877A (en) * | 2018-08-31 | 2019-02-15 | 深圳先进技术研究院 | A kind of sample data processing method, sample data processing unit and electronic equipment |
CN109358628A (en) * | 2018-11-06 | 2019-02-19 | 江苏木盟智能科技有限公司 | A kind of container alignment method and robot |
CN109407603A (en) * | 2017-08-16 | 2019-03-01 | 北京猎户星空科技有限公司 | A kind of method and device of control mechanical arm crawl object |
CN109483534A (en) * | 2018-11-08 | 2019-03-19 | 腾讯科技(深圳)有限公司 | A kind of grasping body methods, devices and systems |
CN109523029A (en) * | 2018-09-28 | 2019-03-26 | 清华大学深圳研究生院 | For the adaptive double from driving depth deterministic policy Gradient Reinforcement Learning method of training smart body |
CN109760046A (en) * | 2018-12-27 | 2019-05-17 | 西北工业大学 | Robot for space based on intensified learning captures Tum bling Target motion planning method |
CN109807882A (en) * | 2017-11-20 | 2019-05-28 | 株式会社安川电机 | Holding system, learning device and holding method |
CN109909998A (en) * | 2017-12-12 | 2019-06-21 | 北京猎户星空科技有限公司 | A kind of method and device controlling manipulator motion |
WO2019155061A1 (en) * | 2018-02-09 | 2019-08-15 | Deepmind Technologies Limited | Distributional reinforcement learning using quantile function neural networks |
CN110202583A (en) * | 2019-07-09 | 2019-09-06 | 华南理工大学 | A kind of Apery manipulator control system and its control method based on deep learning |
CN110293549A (en) * | 2018-03-21 | 2019-10-01 | 北京猎户星空科技有限公司 | Mechanical arm control method, device and neural network model training method, device |
CN110323981A (en) * | 2019-05-14 | 2019-10-11 | 广东省智能制造研究所 | A kind of method and system controlling permanent magnetic linear synchronous motor |
CN110328668A (en) * | 2019-07-27 | 2019-10-15 | 南京理工大学 | Robotic arm path planing method based on rate smoothing deterministic policy gradient |
CN110394804A (en) * | 2019-08-26 | 2019-11-01 | 山东大学 | A kind of robot control method, controller and system based on layering thread frame |
CN110400345A (en) * | 2019-07-24 | 2019-11-01 | 西南科技大学 | Radioactive waste based on deeply study, which pushes away, grabs collaboration method for sorting |
CN110427021A (en) * | 2018-05-01 | 2019-11-08 | 本田技研工业株式会社 | System and method for generating automatic driving vehicle intersection navigation instruction |
CN110691676A (en) * | 2017-06-19 | 2020-01-14 | 谷歌有限责任公司 | Robot crawling prediction using neural networks and geometrically-aware object representations |
CN110722556A (en) * | 2019-10-17 | 2020-01-24 | 苏州恒辉科技有限公司 | Movable mechanical arm control system and method based on reinforcement learning |
CN111347411A (en) * | 2018-12-20 | 2020-06-30 | 中国科学院沈阳自动化研究所 | Two-arm cooperative robot three-dimensional visual recognition grabbing method based on deep learning |
WO2020134254A1 (en) * | 2018-12-27 | 2020-07-02 | 南京芊玥机器人科技有限公司 | Method employing reinforcement learning to optimize trajectory of spray painting robot |
CN111618847A (en) * | 2020-04-22 | 2020-09-04 | 南通大学 | Mechanical arm autonomous grabbing method based on deep reinforcement learning and dynamic motion elements |
CN112347900A (en) * | 2020-11-04 | 2021-02-09 | 中国海洋大学 | Monocular vision underwater target automatic grabbing method based on distance estimation |
US10926416B2 (en) | 2018-11-21 | 2021-02-23 | Ford Global Technologies, Llc | Robotic manipulation using an independently actuated vision system, an adversarial control scheme, and a multi-tasking deep learning architecture |
CN112734759A (en) * | 2021-03-30 | 2021-04-30 | 常州微亿智造科技有限公司 | Method and device for determining trigger point of flying shooting |
CN112757284A (en) * | 2019-10-21 | 2021-05-07 | 佳能株式会社 | Robot control apparatus, method and storage medium |
CN113836788A (en) * | 2021-08-24 | 2021-12-24 | 浙江大学 | Acceleration method for flow industry reinforcement learning control based on local data enhancement |
CN114454160A (en) * | 2021-12-31 | 2022-05-10 | 中国人民解放军国防科技大学 | Mechanical arm grabbing control method and system based on kernel least square soft Bellman residual reinforcement learning |
CN117516530A (en) * | 2023-09-28 | 2024-02-06 | 中国科学院自动化研究所 | Robot target navigation method and device |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080133053A1 (en) * | 2006-11-29 | 2008-06-05 | Honda Motor Co., Ltd. | Determination of Foot Placement for Humanoid Push Recovery |
CN102521205A (en) * | 2011-11-23 | 2012-06-27 | 河海大学常州校区 | Multi-Agent based robot combined search system by reinforcement learning |
CN102902271A (en) * | 2012-10-23 | 2013-01-30 | 上海大学 | Binocular vision-based robot target identifying and gripping system and method |
CN203390936U (en) * | 2013-04-26 | 2014-01-15 | 上海锡明光电科技有限公司 | Self-adaption automatic robotic system realizing dynamic and real-time capture function |
CN104778721A (en) * | 2015-05-08 | 2015-07-15 | 哈尔滨工业大学 | Distance measuring method of significant target in binocular image |
CN105115497A (en) * | 2015-09-17 | 2015-12-02 | 南京大学 | Reliable indoor mobile robot precise navigation positioning system and method |
CN105137967A (en) * | 2015-07-16 | 2015-12-09 | 北京工业大学 | Mobile robot path planning method with combination of depth automatic encoder and Q-learning algorithm |
CN105425828A (en) * | 2015-11-11 | 2016-03-23 | 山东建筑大学 | Robot anti-impact double-arm coordination control system based on sensor fusion technology |
CN105459136A (en) * | 2015-12-29 | 2016-04-06 | 上海帆声图像科技有限公司 | Robot vision grasping method |
CN105637540A (en) * | 2013-10-08 | 2016-06-01 | 谷歌公司 | Methods and apparatus for reinforcement learning |
- 2016-06-08 CN CN201610402319.6A patent/CN106094516A/en active Pending
Non-Patent Citations (3)
Title |
---|
TIMOTHY P. LILLICRAP et al.: "Continuous Control with Deep Reinforcement Learning", Google DeepMind, ICLR 2016 *
SHI Zhongzhi: "Mind Computation", Tsinghua University Press, 31 August 2015 *
CHEN Qiang: "Three-dimensional Reconstruction Based on Binocular Stereo Vision", Graphics & Image *
Cited By (88)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107168110A (en) * | 2016-12-09 | 2017-09-15 | 陈胜辉 | A kind of material grasping means and system |
CN106600650A (en) * | 2016-12-12 | 2017-04-26 | 杭州蓝芯科技有限公司 | Binocular visual sense depth information obtaining method based on deep learning |
CN106780605A (en) * | 2016-12-20 | 2017-05-31 | 芜湖哈特机器人产业技术研究院有限公司 | A kind of detection method of the object crawl position based on deep learning robot |
CN106737673B (en) * | 2016-12-23 | 2019-06-18 | 浙江大学 | A method of the control of mechanical arm end to end based on deep learning |
CN106737673A (en) * | 2016-12-23 | 2017-05-31 | 浙江大学 | A kind of method of the control of mechanical arm end to end based on deep learning |
CN106873585A (en) * | 2017-01-18 | 2017-06-20 | 无锡辰星机器人科技有限公司 | One kind navigation method for searching, robot and system |
CN107186708A (en) * | 2017-04-25 | 2017-09-22 | 江苏安格尔机器人有限公司 | Trick servo robot grasping system and method based on deep learning image Segmentation Technology |
CN107186708B (en) * | 2017-04-25 | 2020-05-12 | 珠海智卓投资管理有限公司 | Hand-eye servo robot grabbing system and method based on deep learning image segmentation technology |
CN107092254B (en) * | 2017-04-27 | 2019-11-29 | 北京航空航天大学 | A kind of design method of the Household floor-sweeping machine device people based on depth enhancing study |
CN107092254A (en) * | 2017-04-27 | 2017-08-25 | 北京航空航天大学 | A kind of design method for the Household floor-sweeping machine device people for strengthening study based on depth |
CN106970594A (en) * | 2017-05-09 | 2017-07-21 | 京东方科技集团股份有限公司 | A kind of method for planning track of flexible mechanical arm |
CN106970594B (en) * | 2017-05-09 | 2019-02-12 | 京东方科技集团股份有限公司 | A kind of method for planning track of flexible mechanical arm |
CN107139179B (en) * | 2017-05-26 | 2020-05-29 | 西安电子科技大学 | Intelligent service robot and working method |
CN107139179A (en) * | 2017-05-26 | 2017-09-08 | 西安电子科技大学 | A kind of intellect service robot and method of work |
CN110691676A (en) * | 2017-06-19 | 2020-01-14 | 谷歌有限责任公司 | Robot crawling prediction using neural networks and geometrically-aware object representations |
US11554483B2 (en) | 2017-06-19 | 2023-01-17 | Google Llc | Robotic grasping prediction using neural networks and geometry aware object representation |
US11150655B2 (en) | 2017-06-30 | 2021-10-19 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and system for training unmanned aerial vehicle control model based on artificial intelligence |
CN107479368A (en) * | 2017-06-30 | 2017-12-15 | 北京百度网讯科技有限公司 | A kind of method and system of the training unmanned aerial vehicle (UAV) control model based on artificial intelligence |
CN107367929A (en) * | 2017-07-19 | 2017-11-21 | 北京上格云技术有限公司 | Update method, storage medium and the terminal device of Q value matrixs |
CN109407603B (en) * | 2017-08-16 | 2020-03-06 | 北京猎户星空科技有限公司 | Method and device for controlling mechanical arm to grab object |
CN109407603A (en) * | 2017-08-16 | 2019-03-01 | 北京猎户星空科技有限公司 | A kind of method and device of control mechanical arm crawl object |
CN108305275A (en) * | 2017-08-25 | 2018-07-20 | 深圳市腾讯计算机系统有限公司 | Active tracking method, apparatus and system |
CN107450555A (en) * | 2017-08-30 | 2017-12-08 | 唐开强 | A kind of Hexapod Robot real-time gait planing method based on deeply study |
CN107450593A (en) * | 2017-08-30 | 2017-12-08 | 清华大学 | A kind of unmanned plane autonomous navigation method and system |
CN107450593B (en) * | 2017-08-30 | 2020-06-12 | 清华大学 | Unmanned aerial vehicle autonomous navigation method and system |
CN107562052A (en) * | 2017-08-30 | 2018-01-09 | 唐开强 | A kind of Hexapod Robot gait planning method based on deeply study |
CN107748566B (en) * | 2017-09-20 | 2020-04-24 | 清华大学 | Underwater autonomous robot fixed depth control method based on reinforcement learning |
CN107748566A (en) * | 2017-09-20 | 2018-03-02 | 清华大学 | A kind of underwater autonomous robot constant depth control method based on intensified learning |
CN107479501A (en) * | 2017-09-28 | 2017-12-15 | 广州智能装备研究院有限公司 | 3D parts suction methods based on deep learning |
CN108051999A (en) * | 2017-10-31 | 2018-05-18 | 中国科学技术大学 | Accelerator beam path control method and system based on deeply study |
CN109807882A (en) * | 2017-11-20 | 2019-05-28 | 株式会社安川电机 | Holding system, learning device and holding method |
US11338435B2 (en) | 2017-11-20 | 2022-05-24 | Kabushiki Kaisha Yaskawa Denki | Gripping system with machine learning |
CN109807882B (en) * | 2017-11-20 | 2022-09-16 | 株式会社安川电机 | Gripping system, learning device, and gripping method |
CN108052004B (en) * | 2017-12-06 | 2020-11-10 | 湖北工业大学 | Industrial mechanical arm automatic control method based on deep reinforcement learning |
CN108052004A (en) * | 2017-12-06 | 2018-05-18 | 湖北工业大学 | Industrial machinery arm autocontrol method based on depth enhancing study |
CN109909998B (en) * | 2017-12-12 | 2020-10-02 | 北京猎户星空科技有限公司 | Method and device for controlling movement of mechanical arm |
CN109909998A (en) * | 2017-12-12 | 2019-06-21 | 北京猎户星空科技有限公司 | A kind of method and device controlling manipulator motion |
CN108321795A (en) * | 2018-01-19 | 2018-07-24 | 上海交通大学 | Start-stop of generator set configuration method based on depth deterministic policy algorithm and system |
CN108321795B (en) * | 2018-01-19 | 2021-01-22 | 上海交通大学 | Generator set start-stop configuration method and system based on deep certainty strategy algorithm |
WO2019155061A1 (en) * | 2018-02-09 | 2019-08-15 | Deepmind Technologies Limited | Distributional reinforcement learning using quantile function neural networks |
EP3701432A1 (en) * | 2018-02-09 | 2020-09-02 | DeepMind Technologies Limited | Distributional reinforcement learning using quantile function neural networks |
US11610118B2 (en) | 2018-02-09 | 2023-03-21 | Deepmind Technologies Limited | Distributional reinforcement learning using quantile function neural networks |
US11887000B2 (en) | 2018-02-09 | 2024-01-30 | Deepmind Technologies Limited | Distributional reinforcement learning using quantile function neural networks |
CN108415254B (en) * | 2018-03-12 | 2020-12-11 | 苏州大学 | Waste recycling robot control method based on deep Q network |
CN108594804B (en) * | 2018-03-12 | 2021-06-18 | 苏州大学 | Automatic driving control method for distribution trolley based on deep Q network |
CN108594804A (en) * | 2018-03-12 | 2018-09-28 | 苏州大学 | Automatic driving control method for distribution trolley based on deep Q network |
CN108415254A (en) * | 2018-03-12 | 2018-08-17 | 苏州大学 | Waste recycling robot control method and device based on deep Q network |
CN108536011A (en) * | 2018-03-19 | 2018-09-14 | 中山大学 | A kind of Hexapod Robot complicated landform adaptive motion control method based on deeply study |
CN110293549B (en) * | 2018-03-21 | 2021-06-22 | 北京猎户星空科技有限公司 | Mechanical arm control method and device and neural network model training method and device |
CN110293549A (en) * | 2018-03-21 | 2019-10-01 | 北京猎户星空科技有限公司 | Mechanical arm control method, device and neural network model training method, device |
CN110427021A (en) * | 2018-05-01 | 2019-11-08 | 本田技研工业株式会社 | System and method for generating automatic driving vehicle intersection navigation instruction |
CN110427021B (en) * | 2018-05-01 | 2024-04-12 | 本田技研工业株式会社 | System and method for generating navigation instructions for an autonomous vehicle intersection |
CN108873687A (en) * | 2018-07-11 | 2018-11-23 | 哈尔滨工程大学 | A kind of Intelligent Underwater Robot behavior system knot planing method based on depth Q study |
CN109344877A (en) * | 2018-08-31 | 2019-02-15 | 深圳先进技术研究院 | A kind of sample data processing method, sample data processing unit and electronic equipment |
CN109344877B (en) * | 2018-08-31 | 2020-12-11 | 深圳先进技术研究院 | Sample data processing method, sample data processing device and electronic equipment |
CN109116854A (en) * | 2018-09-16 | 2019-01-01 | 南京大学 | A kind of robot cooperated control method of multiple groups based on intensified learning and control system |
CN109523029B (en) * | 2018-09-28 | 2020-11-03 | 清华大学深圳研究生院 | Self-adaptive double-self-driven depth certainty strategy gradient reinforcement learning method |
CN109523029A (en) * | 2018-09-28 | 2019-03-26 | 清华大学深圳研究生院 | For the adaptive double from driving depth deterministic policy Gradient Reinforcement Learning method of training smart body |
CN109063827B (en) * | 2018-10-25 | 2022-03-04 | 电子科技大学 | Method, system, storage medium and terminal for automatically taking specific luggage in limited space |
CN109063827A (en) * | 2018-10-25 | 2018-12-21 | 电子科技大学 | It takes automatically in the confined space method, system, storage medium and the terminal of specific luggage |
CN109358628A (en) * | 2018-11-06 | 2019-02-19 | 江苏木盟智能科技有限公司 | A kind of container alignment method and robot |
CN109483534A (en) * | 2018-11-08 | 2019-03-19 | 腾讯科技(深圳)有限公司 | A kind of grasping body methods, devices and systems |
US10926416B2 (en) | 2018-11-21 | 2021-02-23 | Ford Global Technologies, Llc | Robotic manipulation using an independently actuated vision system, an adversarial control scheme, and a multi-tasking deep learning architecture |
CN111347411A (en) * | 2018-12-20 | 2020-06-30 | 中国科学院沈阳自动化研究所 | Two-arm cooperative robot three-dimensional visual recognition grabbing method based on deep learning |
CN111347411B (en) * | 2018-12-20 | 2023-01-24 | 中国科学院沈阳自动化研究所 | Two-arm cooperative robot three-dimensional visual recognition grabbing method based on deep learning |
CN109760046A (en) * | 2018-12-27 | 2019-05-17 | 西北工业大学 | Robot for space based on intensified learning captures Tum bling Target motion planning method |
WO2020134254A1 (en) * | 2018-12-27 | 2020-07-02 | 南京芊玥机器人科技有限公司 | Method employing reinforcement learning to optimize trajectory of spray painting robot |
CN110323981A (en) * | 2019-05-14 | 2019-10-11 | 广东省智能制造研究所 | A kind of method and system controlling permanent magnetic linear synchronous motor |
CN110202583A (en) * | 2019-07-09 | 2019-09-06 | 华南理工大学 | A kind of Apery manipulator control system and its control method based on deep learning |
CN110400345A (en) * | 2019-07-24 | 2019-11-01 | 西南科技大学 | Radioactive waste based on deeply study, which pushes away, grabs collaboration method for sorting |
CN110400345B (en) * | 2019-07-24 | 2021-06-15 | 西南科技大学 | Deep reinforcement learning-based radioactive waste push-grab cooperative sorting method |
CN110328668A (en) * | 2019-07-27 | 2019-10-15 | 南京理工大学 | Robotic arm path planing method based on rate smoothing deterministic policy gradient |
CN110328668B (en) * | 2019-07-27 | 2022-03-22 | 南京理工大学 | Mechanical arm path planning method based on speed smooth deterministic strategy gradient |
CN110394804B (en) * | 2019-08-26 | 2022-08-12 | 山东大学 | Robot control method, controller and system based on layered thread framework |
CN110394804A (en) * | 2019-08-26 | 2019-11-01 | 山东大学 | A kind of robot control method, controller and system based on layering thread frame |
CN110722556A (en) * | 2019-10-17 | 2020-01-24 | 苏州恒辉科技有限公司 | Movable mechanical arm control system and method based on reinforcement learning |
CN112757284A (en) * | 2019-10-21 | 2021-05-07 | 佳能株式会社 | Robot control apparatus, method and storage medium |
CN112757284B (en) * | 2019-10-21 | 2024-03-22 | 佳能株式会社 | Robot control device, method, and storage medium |
CN111618847A (en) * | 2020-04-22 | 2020-09-04 | 南通大学 | Mechanical arm autonomous grabbing method based on deep reinforcement learning and dynamic motion elements |
CN111618847B (en) * | 2020-04-22 | 2022-06-21 | 南通大学 | Mechanical arm autonomous grabbing method based on deep reinforcement learning and dynamic motion elements |
CN112347900B (en) * | 2020-11-04 | 2022-10-14 | 中国海洋大学 | Monocular vision underwater target automatic grabbing method based on distance estimation |
CN112347900A (en) * | 2020-11-04 | 2021-02-09 | 中国海洋大学 | Monocular vision underwater target automatic grabbing method based on distance estimation |
CN112734759A (en) * | 2021-03-30 | 2021-04-30 | 常州微亿智造科技有限公司 | Method and device for determining trigger point of flying shooting |
CN113836788B (en) * | 2021-08-24 | 2023-10-27 | 浙江大学 | Acceleration method for flow industrial reinforcement learning control based on local data enhancement |
CN113836788A (en) * | 2021-08-24 | 2021-12-24 | 浙江大学 | Acceleration method for flow industry reinforcement learning control based on local data enhancement |
CN114454160A (en) * | 2021-12-31 | 2022-05-10 | 中国人民解放军国防科技大学 | Mechanical arm grabbing control method and system based on kernel least square soft Bellman residual reinforcement learning |
CN114454160B (en) * | 2021-12-31 | 2024-04-16 | 中国人民解放军国防科技大学 | Mechanical arm grabbing control method and system based on kernel least square soft Belman residual error reinforcement learning |
CN117516530A (en) * | 2023-09-28 | 2024-02-06 | 中国科学院自动化研究所 | Robot target navigation method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106094516A (en) | A kind of robot self-adapting grasping method based on deeply study | |
WO2021160184A1 (en) | Target detection method, training method, electronic device, and computer-readable medium | |
CN105787439B (en) | A kind of depth image human synovial localization method based on convolutional neural networks | |
Fan et al. | Learning collision-free space detection from stereo images: Homography matrix brings better data augmentation | |
US10475209B2 (en) | Camera calibration | |
CN107450555A (en) | A kind of Hexapod Robot real-time gait planing method based on deeply study | |
CN110175566A (en) | A kind of hand gestures estimating system and method based on RGBD converged network | |
CN107909061A (en) | A kind of head pose tracks of device and method based on incomplete feature | |
CN109479088A (en) | The system and method for carrying out multiple target tracking based on depth machine learning and laser radar and focusing automatically | |
US20220153298A1 (en) | Generating Motion Scenarios for Self-Driving Vehicles | |
CN111368755A (en) | Vision-based pedestrian autonomous following method for quadruped robot | |
CN107397658B (en) | Multi-scale full-convolution network and visual blind guiding method and device | |
CN109901572A (en) | Automatic Pilot method, training method and relevant apparatus | |
CN103150728A (en) | Vision positioning method in dynamic environment | |
CN105760894A (en) | Robot navigation method based on machine vision and machine learning | |
US20220137647A1 (en) | System and method for operating a movable object based on human body indications | |
CN114851201B (en) | Mechanical arm six-degree-of-freedom visual closed-loop grabbing method based on TSDF three-dimensional reconstruction | |
CN110334701A (en) | Collecting method based on deep learning and multi-vision visual under the twin environment of number | |
CN109062229A (en) | The navigator of underwater robot system based on binocular vision follows formation method | |
CN111489392B (en) | Single target human motion posture capturing method and system in multi-person environment | |
CN106056633A (en) | Motion control method, device and system | |
CN107363834A (en) | A kind of mechanical arm grasping means based on cognitive map | |
CN106371442A (en) | Tensor-product-model-transformation-based mobile robot control method | |
CN114719848A (en) | Unmanned aerial vehicle height estimation method based on neural network fused with visual and inertial navigation information | |
US20210035326A1 (en) | Human pose estimation system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161109 |