CN110032189A - Map-independent path planning method for an intelligent warehousing mobile robot - Google Patents

Map-independent path planning method for an intelligent warehousing mobile robot

Info

Publication number
CN110032189A
CN110032189A (application CN201910323366.5A)
Authority
CN
China
Prior art keywords
mobile robot
data
target point
target
angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910323366.5A
Other languages
Chinese (zh)
Inventor
魏长赟
张鹏鹏
蔡帛良
倪福生
蒋爽
顾磊
李洪彬
刘增辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Campus of Hohai University filed Critical Changzhou Campus of Hohai University
Priority to CN201910323366.5A priority Critical patent/CN110032189A/en
Publication of CN110032189A publication Critical patent/CN110032189A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS › G05 — CONTROLLING; REGULATING › G05D — Systems for controlling or regulating non-electric variables › G05D1/00 — Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot › G05D1/02 — Control of position or course in two dimensions › G05D1/021 — specially adapted to land vehicles
    • G05D1/0214 — with means for defining a desired trajectory in accordance with safety or protection criteria, e.g. avoiding hazardous areas
    • G05D1/0221 — with means for defining a desired trajectory involving a learning process
    • G05D1/0223 — with means for defining a desired trajectory involving speed control of the vehicle
    • G05D1/024 — using optical position detecting means, using obstacle or wall sensors in combination with a laser
    • G05D1/0276 — using signals provided by a source external to the vehicle

Abstract

The invention claims a map-independent path planning method for an intelligent warehousing mobile robot, comprising the steps of: S1: first training the robot in a simulated environment; S2: navigating the mobile robot in the real environment, selecting actions with a deep deterministic policy gradient (DDPG) method that uses the network parameters saved in S1. The method effectively solves the path planning problem in unknown environments, and the simulated training effectively improves obstacle-avoidance performance in unknown environments.

Description

Map-independent path planning method for an intelligent warehousing mobile robot
Technical field
The invention belongs to the technical field of robot path planning, and relates to a map-independent path planning method for an intelligent warehousing mobile robot that uses a laser sensor.
Background technique
Path planning is one of the key capabilities of an autonomous mobile robot: the robot should reach its destination as quickly and accurately as possible while safely and reliably avoiding the obstacles in its environment. When the environment map is completely known, mature solutions already exist for avoiding obstacles safely and arriving at the destination accurately. When the map is unknown, however, and the robot can rely only on the sparse readings of a laser sensor, the obstacle-avoidance algorithm must meet much stricter real-time and accuracy requirements; simply reusing methods designed for known environments is likely to cause avoidance failures and, ultimately, navigation failure.
Research on dynamic obstacle avoidance for mobile robots focuses mainly on effective obstacle detection and on the design and optimization of collision-avoidance control algorithms, so that navigation tasks can be completed accurately and quickly. Obstacle detection relies on the robot's on-board sensors to measure the distance, position, and motion state of obstacles. Commonly used sensors include sonar, infrared, laser, and vision sensors. Each has its weaknesses: sonar performance degrades severely on sound-absorbing materials, leading to errors, and vision sensors produce large errors in poor lighting.
Among dynamic obstacle-avoidance algorithms, the most common methods include the artificial potential field method, VFH-family algorithms, neural networks, genetic algorithms, fuzzy logic, and the rolling-window method, each with its own strengths and weaknesses. The artificial potential field method, for example, is computationally cheap and has good real-time performance, but is prone to local minima.
Summary of the invention
The present invention aims to solve the above problems of the prior art by proposing a map-independent path planning method for an intelligent warehousing mobile robot. Compared with conventional methods it has two advantages: 1. the laser sensor uses fewer beams, yet reliable real-time path planning is still achieved, reducing the robot's sensor cost; 2. path planning is possible without building a map of the physical environment. The technical solution of the invention is as follows:
A map-independent path planning method for an intelligent warehousing mobile robot comprises the following steps. S1: first train the robot in a simulated environment. a1: at the start of each run, randomly initialize the target point coordinates (xt, yt) and the target radius Rm; xt and yt are the X and Y coordinates of the target center in the static map, and Rm denotes a square region of side dmin centered on (xt, yt) — reaching anywhere inside this region counts as arrival. Set the robot's current pose (x, y, θr), where x, y are its current position coordinates and θr is the angle between its instantaneous direction of motion and the X axis. Navigation is planned from the target's position (θ, d) in the robot's polar coordinates, where θ is the angular coordinate of the target and d is its distance from the robot center, and the robot advances at a fixed speed. a2: during navigation, preprocess and featurize the environment data Li detected by the on-board laser sensor and the target position data Di, then fuse them into the environment state Si. a3: use the deep deterministic policy gradient (DDPG) method to obtain the next action a, and after a is executed use the reward feedback to adjust the weights and biases of the neurons in the policy sub-network; a ∈ W means that the deflection angle produced by the action lies within the range W. a4: check whether the robot has reached the target point (xt, yt); if not, return to a2 and continue navigating; if it has, end the navigation run. a5: after each run, use the reward values to update the evaluation-network parameters of the DDPG method; once the training success rate reaches the target success rate, save the policy sub-network and evaluation-network parameters of the DDPG method. S2: for navigation of the actual mobile robot (the environment may differ from the simulated one), select actions with the DDPG method using the network parameters saved in S1.
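The a1–a4 loop above can be sketched as a single runnable training episode. Everything here is illustrative: the actor is stubbed as a two-parameter linear map from the polar target coordinates (θ, d) to a steering command, the world is an empty square with no obstacles, and all names are hypothetical — in the patent the actor and critic are neural networks.

```python
import math
import random

def run_episode(policy_weights, noise_scale=0.1, max_steps=200, seed=0):
    """One simulated training episode (steps a1-a4) for a point robot."""
    rng = random.Random(seed)
    xt, yt = rng.uniform(-5, 5), rng.uniform(-5, 5)   # a1: random target point
    x, y, theta_r = 0.0, 0.0, 0.0                     # a1: initial pose (x, y, theta_r)
    v, rm = 0.1, 0.2                                  # fixed forward speed, target radius
    for _ in range(max_steps):
        d = math.hypot(xt - x, yt - y)                # a2: target in robot polar coords
        theta = math.atan2(yt - y, xt - x) - theta_r
        a = policy_weights[0] * theta + policy_weights[1] * d  # actor output A(s | mu_A)
        a += rng.gauss(0.0, noise_scale)              # a3: a = A(s | mu_A) + N_t
        theta_r += max(-0.5, min(0.5, a))             # deflection clipped to the range W
        x += v * math.cos(theta_r)                    # advance at fixed speed
        y += v * math.sin(theta_r)
        if math.hypot(xt - x, yt - y) <= rm:          # a4: target reached -> end episode
            return True
    return False
```

With a simple proportional steering policy, `run_episode([1.0, 0.0], noise_scale=0.0)` reaches the target; in the patent the episode outcome would instead feed the a5 network updates.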
Further, in step a2 the laser sensor data Li and the target position data Di are preprocessed and featurized and then fused into the environment state Si, specifically as follows. The laser readings Li (i = 1, 2, …, 10) are preprocessed and converted into environment feature parameters Lfi (i = 1, 2, …, 10). The target position data are first partitioned, yielding the region data Di (i = 11, 12, 13), where D11 is the current robot heading relative to the X axis, D12 is the distance to the target point, and D13 is the angle of the target point relative to the robot's own direction of advance; these are then converted into the distance feature parameters Dfi (i = 11, 12, 13). Using a defined maximum distance dm, the laser distances are normalized to feature values Lfi = Li ÷ dm (i = 1, 2, …, 10), and the target data are normalized to Df11 = D11 ÷ π, Df12 = D12 ÷ dm, Df13 = D13 ÷ π. The laser feature values and the target-position feature values are then fused into the current environment feature data Sf1~Sf13 by concatenation: Sfi = Lfi for i = 1, …, 10 and Sfi = Dfi for i = 11, 12, 13.
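The normalization and fusion just described can be written out directly. This is a minimal sketch under the stated definitions (10 laser beams, dm the maximum distance, angles in radians); `fuse_state` is a hypothetical helper name, not one from the patent.

```python
import math

def fuse_state(laser, d11, d12, d13, dm):
    """Fuse laser and target features into the 13-dim state Sf1..Sf13.

    laser: raw ranges L1..L10; d11: robot heading vs. the X axis (rad);
    d12: distance d to the target; d13: angle theta of the target relative
    to the robot's direction of advance (rad); dm: maximum distance.
    """
    assert len(laser) == 10
    lf = [li / dm for li in laser]                    # Lfi = Li / dm
    df = [d11 / math.pi, d12 / dm, d13 / math.pi]     # Df11, Df12, Df13
    return lf + df                                    # concatenate: Sf1..Sf13
```

For example, `fuse_state([2.0]*10, math.pi/2, 1.0, -math.pi/4, dm=4.0)` yields ten 0.5 entries followed by `[0.5, 0.25, -0.25]`.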
Further, the target position data are first partitioned; the purpose of partitioning is to obtain the best angle toward the target, i.e. the data D13, the angle of the target point relative to the robot's own direction of advance. Specifically, the direction straight ahead of the robot is taken as the reference; clockwise angles are negative and counterclockwise angles are positive, giving the optimal angle toward the target position with an absolute value of at most 180°.
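The sign convention above (straight ahead as reference, clockwise negative, counterclockwise positive, |angle| ≤ 180°) can be sketched as follows; `angle_to_target` is an illustrative helper, not a name from the patent.

```python
import math

def angle_to_target(x, y, theta_r, xt, yt):
    """Signed angle from the robot's forward direction to the target point.

    Straight ahead is the 0 reference; counterclockwise is positive and
    clockwise negative; the result is wrapped into (-pi, pi], i.e. at most
    180 degrees in absolute value.
    """
    bearing = math.atan2(yt - y, xt - x)   # direction of target in the world frame
    diff = bearing - theta_r               # relative to the robot's heading
    while diff <= -math.pi:                # wrap into (-pi, pi]
        diff += 2.0 * math.pi
    while diff > math.pi:
        diff -= 2.0 * math.pi
    return diff
```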
Further, the DDPG method in step a3 is specified as follows. Actions are selected by the policy sub-network's output with an added disturbance Nt:

a = A(s | μA) + Nt

where s is the state, μA denotes the policy sub-network parameters, Nt is the disturbance, and A is the DDPG action policy. When the robot must avoid obstacles dynamically, the fused data of the current moment are fed into the DDPG network as input, which outputs the next action a; after a is executed in the environment, the DDPG network parameters are updated according to the resulting reward. In the evaluation network:

Q(s, a) = Q(s, a) + α(r + γQ(s', a') − Q(s, a))

where Q is the value function, (s, a) is the state–action pair at time t, r is the reward for the behavior at time t, Q(s', a') is the Q value of the action taken at time t+1 in the new state, α is the learning rate, and γ is the discount factor.
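The update formula above is the standard temporal-difference target; a tabular stand-in makes the arithmetic concrete. In the patent the evaluation network is a neural network trained toward this target, so the dictionary here is only an illustration, and all names are hypothetical.

```python
def critic_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """Apply Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a)).

    q is a dict keyed by (state, action) pairs; unseen pairs default to 0.
    Returns the updated Q(s, a).
    """
    td_target = r + gamma * q.get((s_next, a_next), 0.0)   # r + gamma * Q(s', a')
    q[(s, a)] = q.get((s, a), 0.0) + alpha * (td_target - q.get((s, a), 0.0))
    return q[(s, a)]
```

Starting from an empty table, a reward of 1.0 moves Q(s, a) from 0 to alpha × 1.0 = 0.1; once the successor pair has value, the discounted bootstrap term also contributes.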
Further, the action a is designed to be selected from a fixed continuous interval.
Further, the design of the reward value R is as follows. To define the reward function, the robot's state S is first classified as:
1) safe state SS: any state in which the robot collides with no obstacle in the environment;
2) non-safe state NS: any state in which the robot collides with an obstacle in the environment;
3) winning state WS: the state in which the robot reaches the target.
The reward function is defined according to the robot's state.
Further, step a4 is specified as follows: whether the robot has reached the target point (xt, yt) is judged from its current coordinates (x, y). If √((x − xt)² + (y − yt)²) ≤ Rm, the robot has arrived within the target region (state WS); if min{L1, L2, …, L10} < C, where Li are the obstacle distances returned by the laser sensor and C is the robot's length, the robot has collided with an obstacle (state NS); in either case — WS or NS — the current navigation run ends. Otherwise the robot has not yet reached the target and must continue navigating: return to step a2 and repeat until the target point is reached.
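The step-a4 test can be sketched as one function. Note the text is read here as: arrival when the distance to (xt, yt) is within the target radius Rm, and collision when the closest of the ten laser returns drops below the robot length C — the physically sensible direction of that comparison. `episode_status` is an illustrative name.

```python
import math

def episode_status(x, y, xt, yt, laser, rm, c):
    """Classify the robot's situation at step a4.

    Returns 'WS' when the robot is within the target radius rm,
    'NS' when the closest laser return is below the robot length c
    (collision), and 'continue' otherwise (go back to step a2).
    """
    if math.hypot(x - xt, y - yt) <= rm:
        return 'WS'
    if min(laser) < c:
        return 'NS'
    return 'continue'
```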
It advantages of the present invention and has the beneficial effect that:
The present invention provides a map-independent path planning method for an intelligent warehousing mobile robot. Through deep learning, the method effectively solves the path planning problem in unknown environments, and the simulated training effectively improves obstacle-avoidance performance in the real environment.
Detailed description of the invention
Fig. 1 shows the target-point perception model of the mobile robot in the preferred embodiment provided by the invention;
Fig. 2 shows the obstacle perception model of the robot's laser sensor;
Fig. 3 is the overall flowchart of step S1;
Fig. 4 is the overall flowchart of step S2.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and in detail below with reference to the drawings. The described embodiments are only a part of the embodiments of the invention.
The technical solution that the present invention solves above-mentioned technical problem is:
As shown in Figs. 3 and 4, a map-independent path planning method for an intelligent warehousing mobile robot comprises the following steps.
S1: first train the robot in a simulated environment.
a1: Set the target of the robot's motion: randomly initialize the target point coordinates (xt, yt) and the target radius Rm; xt and yt are the X and Y coordinates of the target center in the static map, and Rm denotes a square region of side dmin centered on (xt, yt) — reaching anywhere inside this region counts as arrival. Set the robot's current pose (x, y, θr), where x, y are its current position coordinates and θr is the angle between its instantaneous direction of motion and the X axis; plan the path from the target's position (θ, d) in the robot's polar coordinates, where θ is the angular coordinate of the target and d is its distance from the robot center, and advance at a fixed speed.
a2: During navigation, preprocess and featurize the environment data Li detected by the on-board laser sensor and the target position data Di, then fuse them into the environment state Si.
a3: Use the DDPG method to obtain the next action a; a ∈ W means that the deflection angle produced by the action lies within the range W.
a4: Check whether the robot has reached the target point (xt, yt) or collided; if neither, return to a2 and continue navigating; if the target has been reached, end the run.
a5: After each run, use the reward values to update the policy sub-network and evaluation-network parameters of the DDPG method; once the training success rate reaches the target success rate, save the DDPG network parameters.
S2: Navigate the mobile robot in the real environment (which may differ from the simulated one), selecting actions with the DDPG method using the network parameters saved in S1.
Further, in step a2 the laser sensor data Li and the target position data Di are preprocessed and featurized and then fused into the environment state Si, specifically as follows. The laser readings Li (i = 1, 2, …, 10) are preprocessed and converted into environment feature parameters Lfi (i = 1, 2, …, 10). The target position data are first partitioned, yielding the region data Di (i = 11, 12, 13), where D11 is the current robot heading relative to the X axis, D12 is the distance to the target point (i.e. d), and D13 is the angle of the target point relative to the robot's own direction of advance (i.e. θ); these are then converted into the distance feature parameters Dfi (i = 11, 12, 13). Using the defined maximum distance dm, the laser distances are normalized to Lfi = Li ÷ dm (i = 1, 2, …, 10) and the target data to Df11 = D11 ÷ π, Df12 = D12 ÷ dm, Df13 = D13 ÷ π; the two sets of feature values are then fused into the current environment feature data Sf1~Sf13 by concatenation: Sfi = Lfi for i = 1, …, 10 and Sfi = Dfi for i = 11, 12, 13.
Further, the target position data are first partitioned; the purpose of partitioning is to obtain the best angle toward the target, i.e. the data D13, the angle of the target point relative to the robot's own direction of advance. Specifically, the direction straight ahead of the robot is taken as the reference; clockwise angles are negative and counterclockwise angles are positive, giving the optimal angle toward the target position with an absolute value of at most 180°.
Further, the DDPG method in step a3 is specified as follows. The selected action is the output of the policy sub-network: the current state is fed as input, the policy sub-network computes the action a, and a disturbance Nt is added:

a = A(s | μA) + Nt (2)

where s is the state, μA denotes the policy sub-network parameters, and A is the DDPG action policy. When the robot needs path planning, the fused data of the current moment are input to the DDPG network, which, after the DDPG decision, outputs the next action a; after a is executed, the DDPG network parameters are updated according to the resulting reward. In the evaluation network:

Q(s, a) = Q(s, a) + α(r + γQ(s', a') − Q(s, a)) (3)

where Q is the value function, (s, a) is the state–action pair at time t, r is the reward for the behavior at time t, Q(s', a') is the Q value of the action taken at time t+1 in the new state, α is the learning rate, and γ is the discount factor.
Further, the action a is designed to be selected from a fixed continuous interval.
Further, in step a5 the reward value R is designed as follows. To define the reward function, the robot's state S is first classified as:
1) safe state SS: any state in which the robot collides with no obstacle in the environment;
2) non-safe state NS: any state in which the robot collides with an obstacle in the environment;
3) winning state WS: the state in which the robot reaches the target.
The reward function is then defined by state: when the robot reaches the target (state WS), R = 10; when the robot collides with an obstacle (state NS), R = −5; when the robot has neither collided nor reached the goal (state SS), R = (di − di+1)/dm, where di is the distance to the target at the current moment and di+1 the distance at the next moment.
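The reward definition above translates directly into code; the values (10, −5, and normalized progress) are taken from the text, while the function name and signature are illustrative.

```python
def reward(state, d_i=0.0, d_next=0.0, dm=1.0):
    """Reward R by state class: WS -> 10, NS -> -5,
    SS -> (d_i - d_next) / dm, i.e. normalized progress toward the target."""
    if state == 'WS':                 # target reached
        return 10.0
    if state == 'NS':                 # collision with an obstacle
        return -5.0
    return (d_i - d_next) / dm        # safe state: reward progress made
```

Note that the SS term is positive when the robot moved closer to the target during the step and negative when it moved away, which shapes the policy toward steady progress.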
Further, step a4 is specified as follows: whether the robot has reached the target point (xt, yt) is judged from its current coordinates (x, y). If √((x − xt)² + (y − yt)²) ≤ Rm, the robot has arrived within the target region (state WS); if min{L1, L2, …, L10} < C, where Li are the obstacle distances returned by the laser sensor and C is the robot's length, the robot has collided with an obstacle (state NS); in either case — WS or NS — the current navigation run ends. Otherwise the robot has not yet reached the target and must continue navigating: return to step a2 and repeat until the target point is reached.
Further, step S2 is specified as follows: during navigation of the physical mobile robot, the robot inherits the network parameters from step S1 and selects the current action with the DDPG method until it reaches the target region.
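Step S2 — freezing the trained parameters and querying only the policy sub-network, with no exploration noise and no further updates — can be sketched as a generic loop. The four callables are assumptions standing in for the real robot interfaces, not names from the patent.

```python
def navigate(select_action, get_state, step, reached, max_steps=500):
    """Deployment loop for step S2: act greedily with the frozen actor.

    select_action: frozen policy, a = A(s | mu_A) with no added noise;
    get_state: returns the fused feature vector; step: executes an action
    on the robot; reached: True once the target region is entered.
    """
    for _ in range(max_steps):
        s = get_state()
        if reached(s):
            return True
        step(select_action(s))       # execute the chosen action on the robot
    return False
```

In a toy 1-D world with a bang-bang policy this converges in a few steps; on the real robot the callables would wrap the laser driver, the actor network, and the motor interface.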
The above embodiments should be understood as merely illustrating, not limiting, the scope of the invention. After reading the present disclosure, a person skilled in the art may make various changes or modifications to the invention, and such equivalent variations and modifications likewise fall within the scope of the claims of the invention.

Claims (7)

1. A map-independent path planning method for an intelligent warehousing mobile robot, characterized by comprising the following steps:
S1: first training the mobile robot in a simulated environment;
a1: setting the target of the robot's motion: randomly initializing the target point coordinates (xt, yt) and the target radius Rm, xt and yt being the X and Y coordinates of the target center in the static map and Rm denoting a square region of side dmin centered on (xt, yt), arrival anywhere inside this region counting as reaching the destination; setting the robot's current pose (x, y, θr), x, y being its current position coordinates and θr the angle between its instantaneous direction of motion and the X axis; planning the path from the target's position (θ, d) in the robot's polar coordinates, θ being the angular coordinate of the target and d its distance from the robot center, and advancing at a fixed speed;
a2: during navigation, preprocessing and featurizing the environment data Li detected by the on-board laser sensor and the target position data Di, then fusing them into the environment state Si;
a3: using the deep deterministic policy gradient (DDPG) method to obtain the next action a, a ∈ W meaning that the deflection angle produced by the action lies within the range W;
a4: checking whether the robot has reached the target point (xt, yt); if not, returning to a2 and continuing to navigate; if so, ending the navigation run;
a5: after each run, using the reward values to update the policy sub-network and evaluation-network parameters of the DDPG method, and saving the DDPG network parameters once the training success rate reaches the target success rate;
S2: navigating the mobile robot in the real environment, selecting the robot's actions with the DDPG method using the network parameters saved in S1.
2. The map-independent path planning method for an intelligent warehousing mobile robot according to claim 1, characterized in that in step a2 the laser sensor data Li and the target position data Di are preprocessed and featurized and then fused into the environment state Si, specifically:
the laser readings Li (i = 1, 2, …, 10) are preprocessed and converted into environment feature parameters Lfi (i = 1, 2, …, 10); the target position data are first partitioned, yielding the region data Di (i = 11, 12, 13), where D11 is the current robot heading relative to the X axis, D12 is the distance to the target point (i.e. d), and D13 is the angle of the target point relative to the robot's own direction of advance (i.e. θ); the Di are then converted into the distance feature parameters Dfi (i = 11, 12, 13); using the defined maximum distance dm, the laser distances are normalized to Lfi = Li ÷ dm (i = 1, 2, …, 10) and the target data to Df11 = D11 ÷ π, Df12 = D12 ÷ dm, Df13 = D13 ÷ π; the laser feature values and the target-position feature values are then fused into the current environment feature data Sf1~Sf13 by concatenation: Sfi = Lfi for i = 1, …, 10 and Sfi = Dfi for i = 11, 12, 13.
3. The map-independent path planning method for an intelligent warehousing mobile robot according to claim 2, characterized in that the target position data are first partitioned to obtain the data D13, the angle of the target point relative to the robot's own direction of advance, specifically: the direction straight ahead of the robot is taken as the reference; clockwise angles are negative and counterclockwise angles are positive, giving the optimal angle toward the target position with an absolute value of at most 180°.
4. The map-independent path planning method for an intelligent warehousing mobile robot according to claim 1, characterized in that the DDPG method in step a3 specifically comprises: selecting the action as the output of the policy sub-network with an added disturbance:

a = A(s | μA) + Nt

where s is the state, μA denotes the policy sub-network parameters, Nt is the disturbance, and A is the DDPG action policy; when the robot must avoid obstacles dynamically, the fused data of the current moment are input to the DDPG network, which outputs the next action a; after a is executed in the environment, the DDPG network parameters are updated according to the resulting reward, in the evaluation network:

Q(s, a) = Q(s, a) + α(r + γQ(s', a') − Q(s, a))

where Q is the value function, (s, a) is the state–action pair at time t, r is the reward for the behavior at time t, Q(s', a') is the Q value of the action taken at time t+1 in the new state, α is the learning rate, and γ is the discount factor.
5. The map-independent path planning method for an intelligent warehousing mobile robot according to claim 4, characterized in that the action a is designed to be selected from a fixed continuous interval.
6. The map-independent path planning method for an intelligent warehousing mobile robot according to claim 4, characterized in that the reward value R is designed as follows: to define the reward function, the robot's state S is first classified as:
1) safe state SS: any state in which the robot collides with no obstacle in the environment;
2) non-safe state NS: any state in which the robot collides with an obstacle in the environment;
3) winning state WS: the state in which the robot reaches the target;
the reward function being defined according to the robot's state.
7. The map-independent path planning method for an intelligent warehousing mobile robot according to claim 1, characterized in that step a4 specifically comprises:
judging from the robot's current coordinates (x, y) whether it has reached the target point (xt, yt): if √((x − xt)² + (y − yt)²) ≤ Rm, the robot has arrived within the target region (state WS); if min{L1, L2, …, L10} < C, where Li are the obstacle distances returned by the laser sensor and C is the robot's length, the robot has collided with an obstacle (state NS); in either case the current navigation run ends; otherwise the robot has not yet reached the target and must continue navigating, returning to step a2 until the target point is reached.
CN201910323366.5A 2019-04-22 2019-04-22 Map-independent path planning method for an intelligent warehousing mobile robot Pending CN110032189A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910323366.5A CN110032189A (en) 2019-04-22 2019-04-22 A kind of intelligent storage method for planning path for mobile robot not depending on map

Publications (1)

Publication Number Publication Date
CN110032189A true CN110032189A (en) 2019-07-19

Family

ID=67239486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910323366.5A Pending CN110032189A (en) 2019-04-22 2019-04-22 A kind of intelligent storage method for planning path for mobile robot not depending on map

Country Status (1)

Country Link
CN (1) CN110032189A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548486A (en) * 2016-11-01 2017-03-29 浙江大学 A kind of unmanned vehicle location tracking method based on sparse visual signature map
CN108255182A (en) * 2018-01-30 2018-07-06 上海交通大学 A kind of service robot pedestrian based on deeply study perceives barrier-avoiding method
CN109445440A (en) * 2018-12-13 2019-03-08 重庆邮电大学 The dynamic obstacle avoidance method with improvement Q learning algorithm is merged based on sensor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SONG YU ET AL.: "Mobile robot path planning based on improved SARSA(λ)", Journal of Changchun University of Technology *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113062601A (en) * 2021-03-17 2021-07-02 同济大学 Q learning-based concrete distributing robot trajectory planning method
CN113062601B (en) * 2021-03-17 2022-05-13 同济大学 Q learning-based concrete distributing robot trajectory planning method
CN113140104A (en) * 2021-04-14 2021-07-20 武汉理工大学 Vehicle queue tracking control method and device and computer readable storage medium
CN113848974A (en) * 2021-09-28 2021-12-28 西北工业大学 Aircraft trajectory planning method and system based on deep reinforcement learning
CN113848974B (en) * 2021-09-28 2023-08-15 西安因诺航空科技有限公司 Aircraft trajectory planning method and system based on deep reinforcement learning

Similar Documents

Publication Publication Date Title
CN108762264B (en) Dynamic obstacle avoidance method of robot based on artificial potential field and rolling window
CN107063280A (en) A kind of intelligent vehicle path planning system and method based on control sampling
CN109445440B (en) Dynamic obstacle avoidance method based on sensor fusion and improved Q learning algorithm
CN111780777A (en) Unmanned vehicle route planning method based on improved A-star algorithm and deep reinforcement learning
CN108762281A An embedded real-time underwater intelligent robot decision-making method based on memory association and reinforcement learning
CN110032189A (en) A kind of intelligent storage method for planning path for mobile robot not depending on map
CN107894773A A navigation method, system and related apparatus for a mobile robot
CN109784201B (en) AUV dynamic obstacle avoidance method based on four-dimensional risk assessment
CN109597404A (en) Road roller and its controller, control method and system
JP7469850B2 (en) Path determination device, robot, and path determination method
CN110174118A (en) Robot multiple-objective search-path layout method and apparatus based on intensified learning
WO2020136978A1 (en) Path determination method
CN110850880A (en) Automatic driving system and method based on visual sensing
Almasri et al. Development of efficient obstacle avoidance and line following mobile robot with the integration of fuzzy logic system in static and dynamic environments
CN113291318A (en) Unmanned vehicle blind area turning planning method based on partially observable Markov model
Chen et al. Automatic overtaking on two-way roads with vehicle interactions based on proximal policy optimization
Jaafra et al. Robust reinforcement learning for autonomous driving
CN113341999A (en) Forklift path planning method and device based on optimized D-x algorithm
Yu et al. Road-following with continuous learning
Lin et al. Robust unmanned surface vehicle navigation with distributional reinforcement learning
Lee et al. Autonomous lane keeping based on approximate Q-learning
Li et al. An efficient deep reinforcement learning algorithm for Mapless navigation with gap-guided switching strategy
JP2020149095A (en) Inverted pendulum robot
CN111413974B (en) Automobile automatic driving motion planning method and system based on learning sampling type
Hacene et al. Toward safety navigation in cluttered dynamic environment: A robot neural-based hybrid autonomous navigation and obstacle avoidance with moving target tracking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination