CN110032189A - A kind of intelligent storage method for planning path for mobile robot not depending on map - Google Patents
- Publication number
- CN110032189A, CN201910323366.5A, CN201910323366A
- Authority
- CN
- China
- Prior art keywords
- mobile robot
- data
- target point
- target
- angle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
- G05D1/0214—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory in accordance with safety or protection criteria, e.g. avoiding hazardous areas
- G05D1/0221—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
- G05D1/0223—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving speed control of the vehicle
- G05D1/0231—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
- G05D1/0238—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using obstacle or wall sensors
- G05D1/024—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using obstacle or wall sensors in combination with a laser
- G05D1/0276—Control of position or course in two dimensions specially adapted to land vehicles using signals provided by a source external to the vehicle
Abstract
The invention claims a map-independent path-planning method for an intelligent-warehouse mobile robot, comprising the steps: S1, the robot is first trained in a simulated environment; S2, for navigation in the actual environment, the robot selects actions with the deep deterministic policy gradient (DDPG) network whose parameters were saved in S1. The method effectively solves the path-planning problem in unknown environments, and the simulated training effectively improves obstacle-avoidance performance in unknown environments.
Description
Technical field
The invention belongs to the technical field of robot path planning, and relates to a map-independent path-planning method for an intelligent-warehouse mobile robot that uses a laser sensor.
Background technique
Path planning is a key capability of an autonomous mobile robot: the robot should reach its destination as quickly and accurately as possible while safely and effectively avoiding the obstacles in the environment. When the environment map is completely known, mature solutions already exist for avoiding obstacles safely and reaching the destination accurately. When the map is unknown, however, and the robot must rely solely on the sparse readings of a laser sensor, the real-time and accuracy requirements on the obstacle-avoidance algorithm during navigation are much higher; directly reusing a known-map method in an unknown environment is likely to make obstacle avoidance, and therefore the whole navigation task, fail.
Research on dynamic obstacle avoidance for mobile robots focuses mainly on detecting obstacles effectively and on designing and optimizing the collision-avoidance control algorithm, so that the robot can complete its navigation task quickly and accurately. Obstacle detection relies on the robot's on-board measurement sensors, which measure the distance and position of an obstacle and judge its motion state. Commonly used sensors of this kind include sonar, infrared, laser, and vision sensors. Each sensor has its weaknesses: sonar readings degrade badly on sound-absorbing materials, causing errors, and vision sensors suffer large errors in poor lighting.
Among dynamic obstacle-avoidance algorithms, common choices are the artificial potential field method, VFH-type algorithms, neural-network methods, genetic algorithms, fuzzy logic, and the rolling-window method. Each has its own advantages and disadvantages; the artificial potential field method, for example, is computationally cheap and runs in real time, but tends to get trapped in local minima.
Summary of the invention
The invention aims to solve the above problems of the prior art by proposing a map-independent path-planning method for an intelligent-warehouse mobile robot. Compared with conventional methods it has two advantages: 1. the laser sensor uses fewer beams, yet reliable real-time path planning is still achieved, lowering the robot's sensor cost; 2. path planning is possible without building a map of the physical environment. The technical scheme of the invention is as follows:
A map-independent path-planning method for an intelligent-warehouse mobile robot comprises the following steps.
S1: train first in a simulated environment.
a1: Before the robot moves, randomly initialize the target-point coordinates (xt, yt) and the target radius Rm; xt and yt are the X and Y coordinates of the target centre in the static map, and Rm denotes a square region of side length dmin centred on (xt, yt), within which the robot is considered to have reached the destination. Set the robot's current pose (x, y, θr), where x, y are its current position coordinates and θr is the angle between the robot's instantaneous direction of motion and the X axis. Navigation is planned from the target's position (θ, d) in the robot's polar coordinates, where θ is the target's bearing in the robot's polar coordinates and d is the target's distance from the robot centre, and the robot advances at a fixed speed.
a2: During navigation, the environment data Li detected by the robot's laser sensor and the target-position data Di are preprocessed and converted into features, which are then fused into the environment state Si.
a3: Using the deep deterministic policy gradient (DDPG) method, obtain the next action a, and after a is executed, update the weights and biases of the neurons in the policy sub-network through the reward feedback; a ∈ W means the robot's deflection angle when executing the action lies within the range W.
a4: Judge whether the robot has reached the target point (xt, yt); if not, return to a2 and continue navigating, otherwise end the navigation.
a5: After the navigation ends, update the evaluation-network parameters of the DDPG method according to the reward value; once the training success rate reaches the target success rate, save the policy sub-network and evaluation-network parameters of the DDPG method.
S2: For navigation of the actual mobile robot (whose environment may differ from the simulated one), actions are selected with the DDPG network whose parameters were saved in S1.
Further, in step a2 the laser data Li and target-position data Di are preprocessed, converted into features, and fused into the environment state Si, specifically: the laser readings Li (i = 1, 2, ..., 10) are preprocessed and converted into the environment feature parameters Lfi (i = 1, 2, ..., 10). The target-position data are first partitioned, yielding the region distance data Di (i = 11, 12, 13), where D11 is the current robot heading relative to the X axis, D12 is the distance to the target point, and D13 is the target's bearing relative to the robot's own forward direction; these are then converted into the distance feature parameters Dfi (i = 11, 12, 13). Given a defined maximum distance dm, the laser ranges are normalized into feature values Lfi = Li ÷ dm (i = 1, 2, ..., 10), and the target data into Df11 = D11 ÷ π, Df12 = D12 ÷ dm, Df13 = D13 ÷ π. The laser feature values and the target-position feature values are then fused, by concatenating the ten laser features with the three target features, into the current environment feature data Sf1 to Sf13.
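The preprocessing above can be sketched in a few lines of Python. This is a minimal sketch under two assumptions: fusion is simple concatenation of the ten laser features and three target features (the patent's fusion expression appears only as a figure), and the numeric value of dm is illustrative, since the patent does not give one.

```python
import math

D_MAX = 5.0  # assumed laser maximum range dm, in metres (illustrative; not given in the patent)

def fuse_features(laser, heading, goal_dist, goal_angle, d_max=D_MAX):
    """Normalise 10 laser ranges and 3 target values into the 13-dim state Sf1..Sf13.

    laser      : list of 10 raw ranges L1..L10
    heading    : robot heading vs. the X axis (D11), in radians
    goal_dist  : distance to the target point (D12)
    goal_angle : target bearing vs. the robot's forward axis (D13), in radians
    """
    lf = [li / d_max for li in laser]                               # Lfi = Li / dm
    df = [heading / math.pi, goal_dist / d_max, goal_angle / math.pi]  # Dfi = D11/pi, D12/dm, D13/pi
    return lf + df                                                   # concatenate -> Sf1..Sf13
```

With dm = 5, a reading at full range maps to 1.0 and a target dead ahead at half range maps to (0, 0.5, 0) in the last three slots, so every component of the fused state lies in [-1, 1].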
Further, the target-position data are first partitioned; the purpose of the partition is to obtain the best angle towards the target, yielding after processing the distance datum D13, which is the target's bearing relative to the robot's own forward direction. Specifically: taking the direction straight ahead of the robot as the reference origin, a clockwise angle is negative and a counter-clockwise angle is positive, which yields the optimal angle towards the target position; the absolute value of the angle is less than or equal to 180°.
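One way to realise this angle convention in Python; the computation via atan2 is an assumption consistent with the description (forward axis as the zero reference, counter-clockwise positive, magnitude bounded by 180°), not the patent's own formula.

```python
import math

def target_bearing(x, y, theta_r, xt, yt):
    """Signed angle D13 from the robot's forward axis to the target, in degrees.

    The forward axis is the zero reference; counter-clockwise angles are positive,
    clockwise angles negative, and |angle| <= 180 degrees.
    theta_r is the robot heading relative to the X axis, in radians.
    """
    bearing = math.atan2(yt - y, xt - x) - theta_r  # world-frame goal direction minus heading
    # wrap into (-pi, pi] so the magnitude never exceeds 180 degrees
    while bearing <= -math.pi:
        bearing += 2 * math.pi
    while bearing > math.pi:
        bearing -= 2 * math.pi
    return math.degrees(bearing)
```

For example, a target straight ahead gives 0°, one directly to the robot's left gives +90°, and one directly to its right gives -90°.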
Further, the DDPG method in step a3 is as follows. The action is the output of the policy sub-network with an added disturbance Nt, expressed as
a = A(s | μA) + Nt
where s is the state, μA the policy sub-network parameters, Nt the disturbance, and A the DDPG action policy. When the robot needs to avoid obstacles dynamically, the fused data of the current moment are fed to the DDPG network as input, which then outputs the next action a. After a is executed in the environment, the DDPG network parameters are updated according to the reward value; in the evaluation network:
Q(s, a) = Q(s, a) + α(r + γQ(s', a') − Q(s, a))
where Q is the value function, (s, a) the state-action pair at time t, r the reward of the behaviour at time t, Q(s', a') the Q value computed in the new state for the action taken at time t + 1, α the learning rate, and γ the discount factor.
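The two update rules above can be sketched with scalar stand-ins for the actor and critic. This is a minimal illustration only: the patent's policy and evaluation networks are neural sub-networks, and the `noise_scale`, `alpha`, and `gamma` values here are assumptions, not taken from the patent.

```python
import random

def select_action(policy, state, noise_scale=0.1):
    """Action selection a = A(s | muA) + Nt: deterministic policy plus Gaussian exploration noise."""
    return policy(state) + random.gauss(0.0, noise_scale)

def td_update(q, r, q_next, alpha=0.01, gamma=0.99):
    """One evaluation-network step: Q(s,a) <- Q(s,a) + alpha * (r + gamma*Q(s',a') - Q(s,a))."""
    return q + alpha * (r + gamma * q_next - q)
```

With the noise scale set to zero the selection is purely deterministic, which matches how the saved policy is reused at deployment time in step S2.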
Further, the action a is designed to be selected within a fixed continuous interval.
Further, the reward R is designed as follows. To define the reward function, the state S of the mobile robot is first classified:
1) safe state SS: the set of states in which the robot collides with no obstacle in the environment;
2) non-secure state NS: the set of states in which the robot collides with some obstacle in the environment;
3) winning state WS: the state in which the robot reaches the target.
The reward function is then defined according to the robot's state.
Further, step a4 is as follows: from the robot's current coordinates (x, y), judge whether it has reached the target point (xt, yt). If the robot's distance to (xt, yt) is within the target range Rm, the robot has arrived in the target region; if min{L1, L2, ..., L10} ≤ C, where the Li are the obstacle distances returned by the laser sensor and C is the robot's length, the robot has collided with an obstacle. In either case, WS or NS, the current navigation episode ends. Otherwise the target has not yet been reached and navigation must continue: return to step a2 and repeat until the target point is reached.
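The step-a4 check can be sketched as a small classifier over the three states. Two assumptions are made explicit here: the arrival test uses the Euclidean distance to (xt, yt) against Rm (the patent's exact condition appears only as a figure), and a minimum laser range at or below the robot length C is taken to signal a collision.

```python
import math

def episode_status(x, y, xt, yt, laser, r_m, c):
    """Classify the step outcome: 'WS' goal reached, 'NS' collision, 'SS' keep navigating.

    laser : the ten range readings L1..L10
    r_m   : target radius Rm
    c     : robot length C
    """
    if math.hypot(x - xt, y - yt) <= r_m:
        return "WS"   # winning state: inside the target region, navigation ends
    if min(laser) <= c:
        return "NS"   # non-secure state: an obstacle is closer than the robot length
    return "SS"       # safe state: continue from step a2
```

The SS branch is the one that loops back to step a2, so a navigation episode is a sequence of SS steps ending in exactly one WS or NS.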
Advantages and beneficial effects of the invention:
The invention provides a map-independent path-planning method for an intelligent-warehouse mobile robot. Through deep learning, the method effectively solves the path-planning problem in unknown environments, and the simulated training effectively improves obstacle avoidance in the real environment.
Detailed description of the invention
Fig. 1 is the mobile robot's target-point perception model of the preferred embodiment of the invention;
Fig. 2 is the mobile robot's laser-sensor obstacle perception model;
Fig. 3 is the overall flow chart of step S1;
Fig. 4 is the overall flow chart of step S2.
Specific embodiment
The technical solutions in the embodiments of the invention are described below clearly and in detail with reference to the drawings; the described embodiments are only a part of the embodiments of the invention.
The technical solution with which the invention solves the above technical problem is:
As shown in Figs. 3 and 4, a map-independent path-planning method for an intelligent-warehouse mobile robot comprises the following steps:
S1: train first in a simulated environment;
a1: set the target of the robot's motion: randomly initialize the target-point coordinates (xt, yt) and the target radius Rm; xt, yt are the X and Y coordinates of the target centre in the static map, and Rm denotes a square region of side length dmin centred on (xt, yt), within which the robot is considered to have arrived; set the robot's current pose (x, y, θr), where x, y are its current position coordinates and θr is the angle between the robot's instantaneous direction of motion and the X axis; plan the path from the target's position (θ, d) in the robot's polar coordinates and advance at a fixed speed, where θ is the target's bearing in the robot's polar coordinates and d is the target's distance from the robot centre;
a2: during navigation, the environment data Li detected by the robot's laser sensor and the target-position data Di are preprocessed, converted into features, and fused into the environment state Si;
a3: obtain the next action a with the DDPG method; a ∈ W means the robot's deflection angle when executing the action lies within the range W;
a4: judge whether the robot has reached the target point (xt, yt) or collided; if neither, return to a2 and continue navigating, and if the target point has been reached, end the navigation;
a5: after the navigation ends, update the policy sub-network and evaluation-network parameters of the DDPG method according to the reward value; once the training success rate reaches the target success rate, save the DDPG network parameters.
S2: for navigation of the mobile robot in the actual environment (which may differ from the simulated one), actions are selected with the DDPG network whose parameters were saved in S1.
Further, in step a2 the laser data Li and target-position data Di are preprocessed, converted into features, and fused into the environment state Si, specifically: the laser readings Li (i = 1, 2, ..., 10) are preprocessed and converted into the environment feature parameters Lfi (i = 1, 2, ..., 10); the target-position data are first partitioned, yielding the region distance data Di (i = 11, 12, 13), where D11 is the current robot heading relative to the X axis, D12 is the distance to the target point (i.e. d), and D13 is the target's bearing relative to the robot's own forward direction (i.e. θ); these are then converted into the distance feature parameters Dfi (i = 11, 12, 13). Given the defined maximum distance dm, the laser range values are normalized into the feature values Lfi = Li ÷ dm (i = 1, 2, ..., 10), and the target data into Df11 = D11 ÷ π, Df12 = D12 ÷ dm, Df13 = D13 ÷ π; the laser feature values and the target-position feature values are then fused, by concatenating the ten laser features with the three target features, into the current environment feature data Sf1 to Sf13.
Further, the target-position data are first partitioned; the purpose of the partition is to obtain the best angle towards the target, yielding after processing the distance datum D13, the target's bearing relative to the robot's own forward direction, specifically: taking the direction straight ahead of the robot as the reference origin, a clockwise angle is negative and a counter-clockwise angle is positive, which yields the optimal angle towards the target position; the absolute value of the angle is less than or equal to 180°.
Further, the DDPG method in step a3 is as follows: the action is produced by the policy sub-network; the current state is fed to the policy sub-network as input, and its output, with an added disturbance Nt, is the action a, expressed as
a = A(s | μA) + Nt (2)
where s is the state, μA the policy sub-network parameters, and A the DDPG action policy. When the robot needs path planning, the fused data of the current moment are fed to the DDPG network as input; after the DDPG decision, the next action a is output. After a is executed, the DDPG network parameters are updated according to the reward value; in the evaluation network:
Q(s, a) = Q(s, a) + α(r + γQ(s', a') − Q(s, a)) (3)
where Q is the value function, (s, a) the state-action pair at time t, r the reward of the behaviour at time t, Q(s', a') the Q value computed in the new state for the action taken at time t + 1, α the learning rate, and γ the discount factor.
Further, the action a is designed to be selected within a fixed continuous interval.
Further, in step a5 the reward R is designed as follows. To define the reward function, the state S of the mobile robot is first classified:
1) safe state SS: the set of states in which the robot collides with no obstacle in the environment;
2) non-secure state NS: the set of states in which the robot collides with some obstacle in the environment;
3) winning state WS: the state in which the robot reaches the target.
The reward function is defined according to the robot's state as follows: when the robot reaches the target and the state is the winning state WS, R = 10; when the robot collides with an obstacle and the state is the non-secure state NS, R = −5; when the robot has neither collided nor reached the terminal, the state is the safe state SS and R = (di − di+1)/dm, where di is the distance to the target point at the current moment and di+1 the distance at the next moment.
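The three-way reward definition above maps directly to code; a small Python sketch, with the state labels and parameter names chosen here for illustration:

```python
def reward(state, d_now, d_next, d_max):
    """Reward R from the patent's three-way state classification.

    state  : 'WS' goal reached, 'NS' collision, 'SS' otherwise
    d_now  : distance to the target at the current moment (di)
    d_next : distance to the target at the next moment (di+1)
    d_max  : normalising maximum distance dm
    """
    if state == "WS":
        return 10.0                      # reached the target
    if state == "NS":
        return -5.0                      # collided with an obstacle
    return (d_now - d_next) / d_max      # SS: positive when the robot moved closer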
Further, step a4 is as follows: from the robot's current coordinates (x, y), judge whether it has reached the target point (xt, yt). If the robot's distance to (xt, yt) is within the target range Rm, the robot has arrived in the target region; if min{L1, L2, ..., L10} ≤ C, where the Li are the obstacle distances returned by the laser sensor and C is the robot's length, the robot has collided with an obstacle. In either case, WS or NS, the current navigation episode ends. Otherwise the target has not yet been reached and navigation must continue: return to step a2 and repeat until the target point is reached.
Further, step S2 is as follows: during navigation of the physical mobile robot, the robot inherits the network parameters from step S1 and selects the action of the current moment with the DDPG method, until it reaches the target region.
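The S2 deployment loop can be sketched under assumed interfaces for the trained policy, the sensing, and the actuation; none of these names come from the patent, and the step budget is an illustrative safeguard.

```python
def navigate(policy, sense, act, at_goal, max_steps=1000):
    """Deployment loop for step S2: the trained policy is reused without retraining.

    policy   : trained actor, fused state -> steering action (assumed interface)
    sense    : () -> fused 13-dim state
    act      : action -> None, executes the motion at fixed forward speed
    at_goal  : () -> bool, True once the robot is inside the target region
    Returns True if the target region was reached within max_steps.
    """
    for _ in range(max_steps):
        if at_goal():
            return True               # target region reached
        act(policy(sense()))          # no exploration noise at deployment time
    return False                      # step budget exhausted without arriving
```

Note that, unlike training, the loop adds no disturbance Nt to the policy output and never updates the network parameters.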
The above embodiments should be understood as merely illustrating, not limiting, the invention. After reading the content recorded herein, a skilled person may make various changes or modifications to the invention, and such equivalent changes and modifications likewise fall within the scope of the claims of the invention.
Claims (7)
1. A map-independent path-planning method for an intelligent-warehouse mobile robot, characterized by comprising the following steps:
S1: the mobile robot is first trained in a simulated environment;
a1: before the robot moves, randomly initialize the target-point coordinates (xt, yt) and the target radius Rm; xt, yt are the X and Y coordinates of the target centre in the static map, and Rm denotes a square region of side length dmin centred on (xt, yt), within which the robot is considered to have arrived; set the robot's current pose (x, y, θr), where x, y are its current position coordinates and θr is the angle between the robot's instantaneous direction of motion and the X axis; plan the path from the target's position (θ, d) in the robot's polar coordinates and advance at a fixed speed, where θ is the target's bearing in the robot's polar coordinates and d is the target's distance from the robot centre;
a2: during navigation, the environment data Li detected by the robot's laser sensor and the target-position data Di are preprocessed, converted into features, and fused into the environment state Si;
a3: using the deep deterministic policy gradient (DDPG) method, obtain the next action a; a ∈ W means the robot's deflection angle when executing the action lies within the range W;
a4: judge whether the robot has reached the target point (xt, yt); if not, return to a2 and continue navigating, and if it has arrived, end the navigation;
a5: after the navigation ends, update the policy sub-network and evaluation-network parameters of the DDPG method according to the reward value; once the training success rate reaches the target success rate, save the DDPG network parameters;
S2: during navigation in the actual environment, the robot's actions are selected with the DDPG network whose parameters were saved in S1.
2. The map-independent path-planning method for an intelligent-warehouse mobile robot according to claim 1, characterized in that in step a2 the laser data Li and target-position data Di are preprocessed, converted into features, and fused into the environment state Si, specifically:
the laser readings Li (i = 1, 2, ..., 10) are preprocessed and converted into the environment feature parameters Lfi (i = 1, 2, ..., 10); the target-position data are first partitioned, yielding the region distance data Di (i = 11, 12, 13), where D11 is the current robot heading relative to the X axis, D12 is the distance to the target point (i.e. d), and D13 is the target's bearing relative to the robot's own forward direction (i.e. θ); the Di are then converted into the distance feature parameters Dfi (i = 11, 12, 13); given the defined maximum distance dm, the laser ranges are normalized into the feature values Lfi = Li ÷ dm (i = 1, 2, ..., 10), and the target data into Df11 = D11 ÷ π, Df12 = D12 ÷ dm, Df13 = D13 ÷ π; the laser feature values and the target-position feature values are then fused, by concatenating the ten laser features with the three target features, into the current environment feature data Sf1 to Sf13.
3. The map-independent path-planning method for an intelligent-warehouse mobile robot according to claim 2, characterized in that the target-position data are first partitioned, after which the datum D13 is obtained; D13 is the target's bearing relative to the robot's own forward direction, specifically: taking the direction straight ahead of the robot as the reference origin, a clockwise angle is negative and a counter-clockwise angle is positive, which yields the optimal angle towards the target position; the absolute value of the angle is less than or equal to 180°.
4. The map-independent path-planning method for an intelligent-warehouse mobile robot according to claim 1, characterized in that the DDPG method in step a3 specifically comprises: the action is the output of the policy sub-network with an added disturbance:
a = A(s | μA) + Nt
where s is the state, μA the policy sub-network parameters, Nt the disturbance, and A the DDPG action policy; when the robot needs to avoid obstacles dynamically, the fused data of the current moment are fed to the DDPG network as input, which then outputs the next action a; after a is executed in the environment, the DDPG network parameters are updated according to the reward value; in the evaluation network:
Q(s, a) = Q(s, a) + α(r + γQ(s', a') − Q(s, a))
where Q is the value function, (s, a) the state-action pair at time t, r the reward of the behaviour at time t, Q(s', a') the Q value computed in the new state for the action taken at time t + 1, α the learning rate, and γ the discount factor.
5. The map-independent path-planning method for an intelligent-warehouse mobile robot according to claim 4, characterized in that the action a is designed to be selected within a fixed continuous interval.
6. The map-independent path-planning method for an intelligent-warehouse mobile robot according to claim 4, characterized in that the reward R is designed as follows: to define the reward function, the state S of the mobile robot is first classified:
1) safe state SS: the set of states in which the robot collides with no obstacle in the environment;
2) non-secure state NS: the set of states in which the robot collides with some obstacle in the environment;
3) winning state WS: the state in which the robot reaches the target;
the reward function is defined according to the robot's state.
7. The map-independent path-planning method for an intelligent-warehouse mobile robot according to claim 1, characterized in that step a4 specifically comprises:
judging from the robot's current coordinates (x, y) whether it has reached the target point (xt, yt); if the robot's distance to (xt, yt) is within the target range Rm, the robot has arrived in the target region; if min{L1, L2, ..., L10} ≤ C, where the Li are the obstacle distances returned by the laser sensor and C is the robot's length, the robot has collided with an obstacle; in either case, WS or NS, the current navigation ends; otherwise the target point has not yet been reached and navigation must continue: return to step a2 and repeat until the target point is reached.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910323366.5A CN110032189A (en) | 2019-04-22 | 2019-04-22 | A kind of intelligent storage method for planning path for mobile robot not depending on map |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110032189A true CN110032189A (en) | 2019-07-19 |
Family
ID=67239486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910323366.5A Pending CN110032189A (en) | 2019-04-22 | 2019-04-22 | A kind of intelligent storage method for planning path for mobile robot not depending on map |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110032189A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106548486A (en) * | 2016-11-01 | 2017-03-29 | 浙江大学 | A kind of unmanned vehicle location tracking method based on sparse visual signature map |
CN108255182A (en) * | 2018-01-30 | 2018-07-06 | 上海交通大学 | A kind of service robot pedestrian based on deeply study perceives barrier-avoiding method |
CN109445440A (en) * | 2018-12-13 | 2019-03-08 | 重庆邮电大学 | The dynamic obstacle avoidance method with improvement Q learning algorithm is merged based on sensor |
Non-Patent Citations (1)
Title |
---|
SONG Yu et al.: "Mobile robot path planning based on improved SARSA(λ)", Journal of Changchun University of Technology (《长春工业大学学报》) *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113062601A (en) * | 2021-03-17 | 2021-07-02 | 同济大学 | Q learning-based concrete distributing robot trajectory planning method |
CN113062601B (en) * | 2021-03-17 | 2022-05-13 | 同济大学 | Q learning-based concrete distributing robot trajectory planning method |
CN113140104A (en) * | 2021-04-14 | 2021-07-20 | 武汉理工大学 | Vehicle queue tracking control method and device and computer readable storage medium |
CN113848974A (en) * | 2021-09-28 | 2021-12-28 | 西北工业大学 | Aircraft trajectory planning method and system based on deep reinforcement learning |
CN113848974B (en) * | 2021-09-28 | 2023-08-15 | 西安因诺航空科技有限公司 | Aircraft trajectory planning method and system based on deep reinforcement learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108762264B (en) | Dynamic obstacle avoidance method of robot based on artificial potential field and rolling window | |
CN107063280A (en) | A kind of intelligent vehicle path planning system and method based on control sampling | |
CN109445440B (en) | Dynamic obstacle avoidance method based on sensor fusion and improved Q learning algorithm | |
CN111780777A (en) | Unmanned vehicle route planning method based on improved A-star algorithm and deep reinforcement learning | |
CN108762281A (en) | It is a kind of that intelligent robot decision-making technique under the embedded Real-time Water of intensified learning is associated with based on memory | |
CN110032189A (en) | A kind of intelligent storage method for planning path for mobile robot not depending on map | |
CN107894773A (en) | A kind of air navigation aid of mobile robot, system and relevant apparatus | |
CN109784201B (en) | AUV dynamic obstacle avoidance method based on four-dimensional risk assessment | |
CN109597404A (en) | Road roller and its controller, control method and system | |
JP7469850B2 (en) | Path determination device, robot, and path determination method | |
CN110174118A (en) | Robot multiple-objective search-path layout method and apparatus based on intensified learning | |
WO2020136978A1 (en) | Path determination method | |
CN110850880A (en) | Automatic driving system and method based on visual sensing | |
Almasri et al. | Development of efficient obstacle avoidance and line following mobile robot with the integration of fuzzy logic system in static and dynamic environments | |
CN113291318A (en) | Unmanned vehicle blind area turning planning method based on partially observable Markov model | |
Chen et al. | Automatic overtaking on two-way roads with vehicle interactions based on proximal policy optimization | |
Jaafra et al. | Robust reinforcement learning for autonomous driving | |
CN113341999A (en) | Forklift path planning method and device based on optimized D-x algorithm | |
Yu et al. | Road-following with continuous learning | |
Lin et al. | Robust unmanned surface vehicle navigation with distributional reinforcement learning | |
Lee et al. | Autonomous lane keeping based on approximate Q-learning | |
Li et al. | An efficient deep reinforcement learning algorithm for Mapless navigation with gap-guided switching strategy | |
JP2020149095A (en) | Inverted pendulum robot | |
CN111413974B (en) | Automobile automatic driving motion planning method and system based on learning sampling type | |
Hacene et al. | Toward safety navigation in cluttered dynamic environment: A robot neural-based hybrid autonomous navigation and obstacle avoidance with moving target tracking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||