Specific embodiment
In order to which those skilled in the art is made to more fully understand the present invention program, below in conjunction in the embodiment of the present invention
The technical solution in the embodiment of the present invention is clearly and completely described in attached drawing, it is clear that described embodiment is only
The embodiment of a part of the invention, instead of all the embodiments.Based on the embodiments of the present invention, ordinary skill people
Member's all other embodiments obtained without making creative work should all belong to the model that the present invention protects
It encloses.
It should be noted that term " first " in description and claims of this specification and above-mentioned attached drawing, "
Two " etc. be the object for distinguishing similar, and specific sequence or precedence are described without being used for.It should be appreciated that it uses in this way
Data can be interchanged in the appropriate case, so as to the embodiment of the present invention described herein can in addition to illustrating herein or
Sequence other than those of description is implemented.In addition, term " comprising " and " having " and their any deformation, it is intended that cover
Cover it is non-exclusive include, be not necessarily limited to for example, containing the process of series of steps or unit, method, system, product or equipment
Those steps or unit clearly listed, but may include not listing clearly or for these processes, method, product
Or the intrinsic other steps of equipment or unit.
First, the part noun or term occurred during the embodiment of the present application is described is suitable for following solution
It releases:
The method of exhaustion:The approximate range of answer is determined according to the partial condition of topic, and within this range to all possible
Situation is verified one by one, until the whole circumstances verify.If some situation authenticator synthesis purpose full terms, for this problem
One solution;If not complying with the full terms of topic after whole circumstances verification, this subject is without solution.The method of exhaustion is also referred to as enumerated
Method.
MDP:Markovian decision process (Markov Decision Process, MDP), MDP considers action, that is, is
Next state of uniting is not only related with current state, also related with the action currently taken.
Boltzman distribution mechanisms:That is ANALOGY OF BOLTZMANN DISTRIBUTION, is the selection of typical random neural network, state with probability come
It determines.
Q-learning algorithms:Enhance learning algorithm, be based on finite state and the learning algorithm for having line model, be used for
Machine learning learns a circumstances not known by unsupervised training (unsupervised training).
It is extensive:After a certain reaction is contacted with certain stimulation formation condition, this reaction also can be with other similar stimulations
Conditioned connection to a certain degree is formed, this process is referred to as extensive.
Artificial Potential Field Method:It is the movement by robot in ambient enviroment, is designed to a kind of be abstracted in artificial gravitational field
Movement, target point to mobile robot generate " gravitation ", barrier to mobile robot generate " repulsion ", finally by ask resultant force
To control the movement of mobile robot.
Fuzzy logic:It establishes on the basis of multi valued logic, fuzziness thinking, language is studied with the method for fuzzy set
The science of form and its rule.
According to embodiments of the present invention, a kind of embodiment of the method for path planning is provided, it should be noted that in attached drawing
The step of flow illustrates can perform in the computer system of such as a group of computer-executable instructions, although also,
Logical order is shown in flow chart, but in some cases, it can perform shown with the sequence being different from herein or retouch
The step of stating.
Fig. 1 is a kind of flow chart of optional paths planning method according to embodiments of the present invention, as shown in Figure 1, the party
Method includes the following steps:
Step S102 obtains the current location of target robot and the position of candidate target point, wherein, current location and time
Selecting the position of target point, the position of candidate target point is located at currently in default path planning, and in default path planning
After position, the position candidate of next position that will be reached by current location as target robot;
Step S104, according to current location and the position of candidate target point, whether the position for determining candidate target point is mesh
Next position that scalar robot will reach, if so, step S106 is performed, if it is not, return to step S102;
Step S106, using the position of candidate target point as next position of current location in physical planning path.
By the above embodiment, target robot can obtain the current location of target robot and candidate target point
Position, wherein, the position of current location and candidate target point in default path planning, is waited in default path planning
The position of target point is selected to be located at after current location, the next position that will be reached by current location as target robot
The position candidate put, and according to current location and the position of candidate target point, whether the position for determining candidate target point is target
Next position that robot will reach, if so, using the position of candidate target point as present bit in physical planning path
The next position put, if it is not, may return to the current location for obtaining target robot and the position of candidate target point
Step, and whether the position for redefining candidate target point is next position that target robot will reach, until candidate
The position of target point is next position that target robot will reach.According to the above embodiment, target robot can be with
The continuous position for obtaining current location and candidate target point, makes Path selection, thus according to the routing information got,
Rational path planning is made, solves the technical issues of path planning efficiency is low.
For the above embodiment, it can be applied in various robots, which includes intelligent robot, military
Robot and service robot etc..Robot can make Path selection according to the environmental information detected.
Optionally, target robot can select the intelligent robot in path, can be set in the target robot
Multiple sensors are put, the environmental information around target robot can be got by the sensor in real time, wherein, environment letter
Breath can include it is a variety of, as obstacle position information and candidate target point location information and target robot to candidate mesh
Routing information between punctuate.Wherein, which can include target robot to the distance and barrier of barrier
With the range information of candidate target point etc..
Optionally, the sensor in the above embodiment can include a variety of, such as:Sonar sensor, infrared ray sensing
Device, velocity sensor etc., by each sensor can detect target robot between barrier, candidate target point away from
From can also detect the size of barrier and the edge of barrier by the sensor in target robot.
For above-described embodiment, target robot can first determine that target robot is current before path planning is carried out
Location information and the end point location information finally to be reached, and rational path is made according to current location and final position and is advised
It draws.
Another optional embodiment, in the technical solution of step S102 offers, obtains the current of target robot
Position and the position of candidate target point, wherein, the position of current location and candidate target point all in default path planning, and
In default path planning, the position of candidate target point is located at after current location, as target robot by current location
The position candidate of next position that will be reached.Wherein, candidate target point can be that target robot can select next time
Traveling target point, by obtaining the current location of target robot and the location determination of candidate target point, so that it is determined that target
The direction that robot can walk in next step.Optionally, the direction that target robot can walk can include it is multiple, for example,
Front, left front and right front etc..
Optionally, above-mentioned candidate target point can be multiple, and candidate target point is relative to target robot current location
There can be different directions, target robot is when selecting candidate target point, it may be determined that candidate target point to target robot
Distance, candidate target point can also be determined the distance between to the terminal finally to be reached.
Optionally, it in the technical solution provided in step S104, according to current location and the position of candidate target point, determines
Whether the position of candidate target point is next position that target robot will reach.Wherein, candidate target point is being determined
When whether position is next position that target robot will reach, target robot can be obtained to each candidate target point
The distance between, at the same need to get target robot between candidate target point obstacle information (such as barrier it is big
Small, barrier edge), and according to the information got, whether the position for determining candidate target point is that target robot will
The next position to be reached.
Optionally, it is behind next position that target robot will reach in the position for determining candidate target point, holds
Row step S106 provide technical solution, using the position of candidate target point as in physical planning path current location it is next
Position.If it is determined that next position that the position of candidate target point, which will not be target robot, to be reached, then return to step
S102, whether the position for redefining other candidate target points is next position that target robot will reach, you can with
According to current location and the position of candidate target point, whether the position for judging other candidate target points again is that target robot will
The next position to be reached.Until determining that target robot thinks next position to be achieved.
Another optional embodiment, according to current location and the position of candidate target point, determines candidate target point
Whether position is that next position that target robot will reach includes:Obtain current location and candidate target point position it
Between barrier information, wherein, the information of barrier includes the marginal position of barrier, and marginal position is close in barrier
The position at the edge of default path planning;Marginal position based on barrier determines the environment shape of target robot and barrier
State information;According to environmental state information, determine that whether to be that target robot will reach next for the position of candidate target point
Position.
Optionally, the barrier in the above embodiment can include static-obstacle thing and dynamic barrier, wherein, it is static
Barrier can be immotile barrier, not moved relative to target robot, position is relatively-stationary, example
Such as, stone;Can be the real-time barrier for changing position, for example, mobile trolley for dynamic barrier.Target machine
Sensor in people can get the obstacle information in preset range in real time, which can be target robot
Around a range, for example, using the radius in the artificial center of circle of target machine as 5 meters of border circular areas.Target robot can not
The environmental information of surrounding is got disconnectedly, and the information of acquisition is stored.
Wherein, target robot can establish an environmental model according to the environmental information got, wherein, at this
In environmental model, it is stored with the routing information and obstacle information passed by, and the environment detected recently according to sensor
Information updates environmental model.Optionally, target robot needs timely to update to the information of dynamic barrier.
Wherein, when obtaining obstacle information, the size of barrier and the location information of barrier can be obtained, it can also
The information of the marginal position of barrier is got, the marginal position of the barrier can be proximate to the margin location of candidate target point
It puts.
Another optional embodiment after obstacle information is got, can be based on obstacle information, determine target
The environmental state information of robot and barrier, wherein, the environmental state information of target robot and barrier includes at least down
At least one state:Success status;Safe condition;Precarious position;Status of fail.And according to the environmental state information got, really
Whether the position for determining candidate target point is next position that target robot will reach.
Optionally, the marginal position based on barrier determines that the environmental state information of target robot and barrier includes:
Determine the distance between marginal position of current location and barrier;If distance is less than the first preset distance, target machine is determined
The environmental state information of people and barrier is status of fail;If distance is not less than the first preset distance and less than the second pre- spacing
From the environmental state information for determining target robot and barrier is precarious position;If distance not less than the second preset distance and
Less than third preset distance, the environmental state information for determining target robot and barrier is safe condition;If distance is not less than
Third preset distance and less than the 4th preset distance, the environmental state information for determining target robot and barrier are successful shape
State;Wherein, the first preset distance is less than the second preset distance, and the second preset distance is less than third preset distance, the pre- spacing of third
From less than the 4th preset distance.
Wherein, the first preset distance can be very short, such as 0.1 meter, i.e., the barrier may be in target robot to time
It selects on the path between target point so that target robot can not reach the candidate target point by the path, if at this point, detection
It is less than the first preset distance to distance between target robot and the marginal position of barrier, then can be determined that target robot not
The candidate target point, i.e. status of fail can be reached.Optionally, the second preset distance can be more than the first preset distance and be less than
The numerical value of third preset distance, for example, 0.6 meter, i.e., the edge of the barrier may target robot to candidate target point it
Between path on, if target robot is wanted to reach the candidate target point, need to get around the edge of the barrier, at this point, target
Robot is precarious position to reach candidate target point.
Another optional embodiment, third preset distance can be pre-set threshold value, the third preset distance
It may be greater than the second preset distance and be less than the numerical value of the 4th preset distance, for example, 1.2 meters, that is, the barrier detected
The path that the Edge Distance target robot reaches candidate target point is nearer, and still, which does not interfere target robot to arrive
Up to the candidate target point, at this point it is possible to which it is safe condition to determine that target robot reaches candidate target point.Optionally, the 4th is pre-
Set a distance can be pre-set numerical value, can be more than third preset distance, for example, 2 meters, i.e. the obstacle distance target
Robot reaches the path of candidate target point farther out, at this point, barrier does not influence the Path selection of target robot, it can
To determine that target robot reaches candidate target point as success status at this time.
Wherein, target robot select it is next will reach position when, can be to reaching each candidate target point
Path analyzed.
For the above embodiment, can sliding-model control be carried out to ambient condition by fuzzy logic, it optionally, can be with
The distance between target robot and barrier edge according to size are obscured and are:Low coverage, middle-range and long distance.Wherein, low coverage
Represent that target robot is nearer apart from the edge of barrier, the range that low coverage represents can be predetermined to second in the first preset distance
Between distance;Middle-range represents that target robot is more than low coverage but the range represented less than long distance, middle-range apart from the edge of barrier
It can be in the second preset distance between the 4th preset distance;During long distance represents that target robot is more than apart from the edge of barrier
Away from the range that long distance represents may be greater than the 4th preset distance.
Optionally, robot can judge that the target robot reaches candidate mesh by the fuzzy distance that sensor returns
Whether punctuate can walk, for example, sensor return be long distance information when, it can be determined that go out the Edge Distance mesh of barrier
Scalar robot farther out, will not interfere robot to advance, if the low coverage information that sensor returns, it can be determined that go out the edge of barrier
Distance objective robot is nearer, if robot moves on, may bump against with barrier, which may interfere with target
Robot advances.
Optionally, the marginal position based on barrier determines that the environmental state information of target robot and barrier may be used also
To include:Marginal position based on current location and barrier determines to reach the of the marginal position of barrier from current location
One vector, wherein, the length of primary vector is the first length, and the direction of primary vector is first direction;According to current location and
The position of candidate target point determines the secondary vector from the position of current location arrival candidate target point, wherein, secondary vector
Length is the second length, and the direction of secondary vector is second direction;In the case where the first length is more than the second length, mesh is determined
The environmental state information of scalar robot and barrier is success status;In the first length no more than in the case of the second length, examine
Survey first direction and the angle of second direction;If angle is more than the first predetermined angle, it is determined that target robot and barrier
Environmental state information is safe condition;If angle is more than the second predetermined angle and less than or equal to the first predetermined angle, it is determined that mesh
The environmental state information of scalar robot and barrier is precarious position;If angle is less than or equal to the first predetermined angle, it is determined that mesh
The environmental state information of scalar robot and barrier is status of fail, wherein, the second predetermined angle is less than the first predetermined angle.
For the above embodiment, be by detect target robot current location relative to barrier edge first to
Amount and target robot current location determine that target robot arrives relative to the angle between the secondary vector of candidate target point
Ambient condition up between candidate target point.In this embodiment, if the first representative length of primary vector is more than the
The second length represented by two vectors, then can obtain barrier relative to candidate target point position reach target robot it
Between it is distant, barrier will not hamper completely target robot reach candidate target point, at this point it is possible to determine target machine
Device people reaches the candidate target point as success status.Optionally, in the case where the first length is not more than the second length, Ke Yijian
The angle of first direction and second direction is surveyed, if angle is more than the first predetermined angle, it is determined that target robot and barrier
Environmental state information is safe condition, wherein, which can be larger numerical value, such as 30 degree, first direction
It is more than the first predetermined angle with the angle of second direction, then can determines that barrier will not hamper target robot and reach time
Select target point.
Wherein, the second predetermined angle is less than the first predetermined angle, if angle is more than the second predetermined angle and less than or equal to the
One predetermined angle, it is determined that target robot and the environmental state information of barrier are precarious position, and the second predetermined angle can be with
It is smaller angle numerical value, for example, 10 degree, it is more than the second predetermined angle in the angle of primary vector and secondary vector and is less than
During the first predetermined angle, at this point, the edge of barrier may hamper in target robot to the path of candidate target point
Target robot reaches the candidate target point at this point it is possible to determine that it is precarious position.Another optional embodiment, is being pressed from both sides
Angle is less than or equal to the first predetermined angle, it is determined that target robot and the environmental state information of barrier are status of fail, at this time
Target robot cannot reach candidate target point.
Another optional embodiment, it is next in the position for determining candidate target point to be that target robot will reach
After a position, current location can also be recorded to the routing information of next position, wherein, routing information includes current location
Range information, current location to next position between next position obstacle information, current location to barrier
The distance between information.Target robot can update above-mentioned environmental model according to the routing information of record.
Optionally, when target machine records routing information of the current location to next position, it may be determined that outbound path
Evaluation of estimate, which can be obtained according to above-mentioned environmental state information, i.e., target robot reaches next position
Afterwards, the path passed by can be fed back accordingly, such as:- 1,0,1 etc., which can be above-mentioned ambient condition
Information is corresponding, for example, in status of fail, evaluation of estimate is -1;During safe condition, evaluation of estimate 1;During precarious position, evaluation
Be worth is 0;During success status, evaluation of estimate 2.
Optionally, according to environmental state information, whether the position for determining candidate target point is that target robot will reach
Next position include:It determines from the position that current location reaches each candidate target point, is performed needed for target robot
Candidate actions;By the corresponding candidate actions of each candidate target point and environmental state information, each candidate target point is determined
Evaluation of estimate;Obtain the maximum value in the corresponding evaluation of estimate of multiple candidate target points;According to the corresponding ambient condition letter in current location
Breath and current action and maximum value determine that whether to be that target robot will reach next for the position of candidate target point
Position.
Wherein, the candidate actions performed needed for target robot, can be target robot and each candidate target point
Action selection, for example, advance, turn left and turn right etc..It is (i.e. successful according to the candidate actions and above-mentioned environmental state information
State, safe condition, precarious position or status of fail), the evaluation of estimate of each candidate target is obtained, by obtaining multiple candidates
Maximum value in the corresponding evaluation of estimate of target point, according to the corresponding environmental state information in current location and current action and most
Big value, whether the position for determining candidate target point is next position that target robot will reach, you can to select to evaluate
Maximum value in value, and action selection is made in combining target robot current location, to reach next position.
By the above embodiment, target robot can pass through evaluation of estimate, the corresponding environmental state information in current location
Whether the position that candidate target point is determined with current action is next position that target robot will reach, so as in real time
Update routing information, then, by the routing information passed by, constantly path optimizing selects, and finally, can reach home.
Optionally, target robot needs to select multiple candidate target points, no when proceeding by path planning
Disconnectedly according to the path passed by, Path selection is made.In this embodiment, target robot is by candidate target point
After next position of the position as current location in physical planning path, the next of selection can be moved to from current location
A position, and current location is recorded to the routing information of next position.I.e. target robot can constantly learn to having walked
Outbound path action selection is done in the path crossed according to the routing information learnt.
Optionally, according to the corresponding environmental state information in current location and current action and maximum value, candidate's mesh is determined
Whether the position of punctuate is that next position that target robot will reach includes:Candidate target point is determined by equation below
Position whether be next position that target robot will reach:
Q (state, action)=R (state, action)+Gamma*MaxQ (next state, all actions),
Wherein, state is target robot current location, and action is optional action, and Gamma is constant, R (state,
Action it is) to be selected according to the action that current location is made, positions of the next state for candidate target point, all actions
It is selected for all actions, MaxQ (next state, all actions) is position of the target robot according to candidate target point
Maximum value in the evaluation of estimate of the everything of selection, Q (state, action) are according to the corresponding ambient condition in current location
Information and current action and maximum value, it is next position to determine optimal candidate target point.
Target robot can with by the above-mentioned means, constantly learn, robot will from a target point to another
Target point constantly detects, until it reaches the terminal in path.Target robot, can be according to commenting when path is selected
The size of value makes optimum path planning, selects the larger represented path of evaluation of estimate, constantly updates the evaluation of estimate in path, most
The optimal path between starting position to final position can be determined eventually, which can be effective avoiding obstacles
Path.
Wherein, when above-mentioned formula is used to determine next position, a matrix can be established, in the matrix, and is built
Corresponding path planning parameter is found, for example, target point A can arrive target point B, then it is 1 that can determine its parameter, and target point A is not
Target point C can be arrived, then it is -1 that can determine its parameter, wherein, the specific implementation step of the matrix can include:Setting is current
Target point is initial target point;Since current target point, the action for having highest evaluation of estimate is found;Under setting current target point is
One target point (the candidate target point target point i.e. corresponding to highest evaluation of estimate);The above embodiment is repeated, until current mesh
Punctuate is terminal.
Optionally, obtain between current location and barrier apart from when, can target machine be detected according to sensor
The distance between device people current location and barrier determine distance, and target robot is detected by sensor in target robot
Behind the distance between current location and barrier, when can be handled in the processing end in feeding back to target robot, it can obscure
Its specifically distance, feeds back corresponding range information to processing end.Wherein, processing end can analyze current goal robot and week
Enclose ambient condition.
By the above embodiment, target robot can carry out path planning, Ke Yi during continuous study
In circumstances not known, exploration is made to environment and knowledge is obtained from evaluation, improves action strategy to accomplish the end in view.
Here is according to a particular embodiment of the invention.
Embodiment of the present invention is to be based on intensified learning in circumstances not known into walking along the street for robot path planning
Diameter plans that the advantage of intensified learning is, by with environmental interaction trial and error, constantly to learn.The embodiment of the present invention proposes will be strong
Chemistry is practised and fuzzy logic be combined with each other, and sliding-model control is carried out to ambient condition using fuzzy logic, to ensure state demarcation
Reasonability, and complete using intensified learning selection strategy obstacle-avoiding route planning in the circumstances not known of robot.
Optionally, when carrying out sliding-model control to ambient condition using fuzzy logic, robot and barrier can first be determined
Hinder the distance of object, wherein, the distance between robot and barrier are divided into low coverage, middle-range and long distance.
Optionally, intensified learning can include such as lower part:
Environmental model:The information that robot can be returned according to sensor establishes the model of a currently known environment,
In the model, it can be detected with recorder people current location information, directional information, the gait of march of current robot, sensor
The target point to be reached to obstacle information and robot.Wherein, robot current location information can for robot
Manage position;Distance and the barrier of the obstacle information that sensor detects including barrier and robot are relative to machine
The direction of people.In this embodiment, barrier can include static-obstacle thing and dynamic barrier, for static-obstacle thing,
Robot can determine how avoidance, for dynamic barrier, robot according to the obstacle position information once detected
The obstacle position information that can be real-time detected according to sensor updates environmental model, so as to preferably carry out path planning.
Penalty:Since Q-learning algorithms are the MDP models based on finite state and limited action, input or defeated
It must be finite discrete state to go out, and in embodiments of the present invention, extensive processing can be carried out to enhancing study, by robot
State is set as s, and action is set as a.Optionally, the distance of robot and target point can be set as d, when may be set in d=1,
The distance of robot and terminal is successful distance, and d=2 is conflict distance, and d=3 is risk distance, and d=4 is failure distance, ring
Border state can include success status, safe condition, precarious position and status of fail, can return to numerical value in success status
2, return to numerical value 2 during safe condition, when precarious position returns to numerical value 0, and when status of fail returns to numerical value -1.
Wherein it is possible to according to the distance between robot and target point relationship, robot current location is obtained relative to end
The state of point position, and according to the ambient condition between the robot and target point detected, select next target point.
By the above embodiment, can constantly be determined to obtain the optimal value in path according to penalty.
Q value function:The selection in the path can be realized by following formula:
Q (state, action)=R (state, action)+Gamma*MaxQ (next state, all actions)
Wherein, state is robotary, and action is action, and Q (state, action) is represented by Q value function
One element of matrix value, R (state, action) is the value of respective element in R matrixes, and Gamma is Variable Learning, MaxQ
(next state, all actions) is the maximum value in the evaluation of estimate of robot NextState everything.Robot can be with
By the above-mentioned means, constantly learning, robot will constantly be explored from a state to another state, until it is arrived
Up to dbjective state.Our such each time explore is referred to as a scene.Each scene is exactly target robot from initial state
Reach the process of dbjective state.Each target robot reaches dbjective state, which can enter next field
Jing Zhong.State s execution acts the evaluation function of a, and hereafter presses enhanced signal discount when optimal action sequence performs, only
The status switch from initial state to dbjective state is tracked to use matrix Q.This algorithm is found in matrix Q for current state
The next step for having highest evaluation of estimate acts.
Optionally, above-mentioned Q matrixes occupation mode can include:First, robotary is initialized, it then can be by machine
The current state of people is set as init state;Since current state, the action for having highest q value is found;It can be by highest q value
Represented state is expressed as NextState, and it is NextState to set current state;Judge whether current state is target-like
State, if so, without finding the state of robot again, if it is not, continuing to execute above-mentioned since current state, searching highest q value
Action, until current state be dbjective state.By the above embodiment, can return from initial state to dbjective state
Status switch.
Act selection strategy:In order to select more actions acted while consider Q value function maximum as far as possible, the present invention is real
Apply the action progress robot obstacle-avoiding that mode chooses Q value maximums using Boltzmann distribution mechanisms.Wherein, action selection includes
Advance, turn left, turn right.
By the above embodiment, sliding-model control can be carried out to ambient condition using fuzzy logic, ensure that state
What is divided is reasonable, then can complete robot in circumstances not known using intensified learning according to the environmental model of real-time update
Obstacle-avoiding route planning.
Fig. 2 is the flowchart 2 of the optional paths planning method of another kind according to embodiments of the present invention, as shown in Fig. 2,
This method comprises the following steps:
Step S201, robot initial current state.
Step S203 judges whether current state is dbjective state.
If it is not, step S205 is performed, if so, performing step S209.
Step S205 inquires highest evaluation of estimate from current state, and finds the corresponding action of highest evaluation of estimate.
Step S207, setting current state are NextState.
Step S209, robot reach the corresponding target point of NextState.
The embodiment of the present invention, robot can first initialize current state when carrying out path planning, which can
To be robot current location and the ambient condition, then, robot may determine that whether current state is dbjective state,
If to judge current state be not dbjective state, highest evaluation of estimate is inquired from current state, and find the most higher assessment
Corresponding action is worth, it is NextState then to set current state, if judging current state for dbjective state, robot
The corresponding target point of NextState can be reached.I.e. robot can when selecting path, according to current state and evaluation of estimate come
Determine the next target point that will be reached.
In the embodiment, using the paths planning method of enhancing study, as robot constantly moves, constantly obtain
Environmental information around robot, and pass through and constantly choose state and update routing information, current state can be calculated
Under optimal path, ensure that robot can carry out effective avoidance in circumstances not known, and path optimizing selects, robot can be with
In continuous processing state space, optimal path is searched out.
Fig. 3 is the structure chart of the optional path planning apparatus of another kind according to embodiments of the present invention, as shown in figure 3, should
Device includes:First acquisition unit 31, for obtaining the position of the current location of target robot and candidate target point, wherein,
Current location and the position of candidate target point are in default path planning, and in default path planning, candidate target point
Position is located at after current location, the candidate bit of next position that will be reached by current location as target robot
It puts;First determination unit 33, for the position according to current location and candidate target point, determine candidate target point position whether
It is next position that target robot will reach;Second determination unit 35, the second determination unit, in candidate target point
Position be next position that target robot will reach in the case of, using the position of candidate target point as physical planning
Next position of current location in path;And in the position of candidate target point not to be that target robot will reach next
In the case of position, first acquisition unit is returned, and is redefined until the position of candidate target point is that target robot will
The next position reached.
By the above embodiment, target robot can obtain the current of target robot by first acquisition unit 31
Position and the position of candidate target point, wherein, the position of current location and candidate target point in default path planning, and
In default path planning, the position of candidate target point is located at after current location, as target robot by current location
The position candidate of next position that will be reached, and pass through the first determination unit 33 according to current location and candidate target point
Position, whether the position for determining candidate target point is next position that target robot will reach, and passes through second and determine
Unit 35 is in the case of next position that the position of candidate target point is that target robot will reach, by candidate target point
Next position of the position as current location in physical planning path;And in the position of candidate target point it is not target machine
In the case of next position that people will reach, first acquisition unit is returned, and redefines the position until candidate target point
It is next position that target robot will reach to put.According to the above embodiment, target robot can be obtained constantly
Current location and the position of candidate target point, make Path selection, so as to according to the routing information got, make rational
Path planning solves the technical issues of path planning efficiency is low.
Optionally, the first determination unit includes:First acquisition module, for obtaining the position of current location and candidate target point
The information of barrier between putting, wherein, the information of barrier includes the marginal position of barrier, and marginal position is in barrier
Close to the position at the edge of default path planning;First determining module for the marginal position based on barrier, determines target machine
Device people and the environmental state information of barrier;Second determining module, for according to environmental state information, determining candidate target point
Whether position is next position that target robot will reach.
Another optional embodiment, the first determining module include:First determination sub-module, for determining current location
The distance between marginal position of barrier;Second determination sub-module if being less than the first preset distance for distance, determines mesh
The environmental state information of scalar robot and barrier is status of fail;Third determination sub-module, if being not less than first for distance
Preset distance and less than the second preset distance, the environmental state information for determining target robot and barrier is precarious position;The
Four determination sub-modules, if for distance not less than the second preset distance and less than third preset distance, determine target robot with
The environmental state information of barrier is safe condition;5th determination sub-module, if for distance not less than third preset distance and
Less than the 4th preset distance, the environmental state information for determining target robot and barrier is success status;Wherein, first is predetermined
Distance is less than the second preset distance, and the second preset distance is less than third preset distance, and third preset distance is less than the 4th pre- spacing
From.
Wherein, the second determining module includes:6th determination sub-module, for determining to reach each candidate mesh from current location
In the position of punctuate, the candidate actions of execution needed for target robot;7th determination sub-module, for by each candidate target point
Corresponding candidate actions and environmental state information determine the evaluation of estimate of each candidate target point;Second acquisition module, for obtaining
Maximum value in the corresponding evaluation of estimate of multiple candidate target points;8th determination sub-module, for according to the corresponding ring in current location
Border status information and current action and maximum value, whether the position for determining candidate target point is that target robot will reach
Next position.
Optionally, device further includes:Recording unit, for whether being target robot in the position for determining candidate target point
After the next position that will be reached, the routing information of record current location to next position, wherein, routing information includes
Current location to the range information of next position, current location between next position obstacle information, current location
To the distance between barrier information.
Another aspect according to embodiments of the present invention additionally provides a kind of robot, which is characterized in that including:Above-mentioned reality
Apply the path planning apparatus of any one of mode.
The embodiment of the present invention, as robot constantly moves, is constantly obtained using the paths planning method of enhancing study
Environmental information around robot, and pass through and constantly choose candidate target point and update routing information, it can be calculated and work as
Optimal path under preceding state ensure that robot can carry out effective avoidance in circumstances not known, and path optimizing selects, machine
People can search out optimal path in continuous processing state space.
The embodiments of the present invention are for illustration only, do not represent the quality of embodiment.
In the above embodiment of the present invention, all emphasize particularly on different fields to the description of each embodiment, do not have in some embodiment
The part of detailed description may refer to the associated description of other embodiment.
In several embodiments provided herein, it should be understood that disclosed technology contents can pass through others
Mode is realized.Wherein, the apparatus embodiments described above are merely exemplary, such as the division of the unit, Ke Yiwei
A kind of division of logic function, can there is an other dividing mode in actual implementation, for example, multiple units or component can combine or
Person is desirably integrated into another system or some features can be ignored or does not perform.Another point, shown or discussed is mutual
Between coupling, direct-coupling or communication connection can be INDIRECT COUPLING or communication link by some interfaces, unit or module
It connects, can be electrical or other forms.
The unit illustrated as separating component may or may not be physically separate, be shown as unit
The component shown may or may not be physical unit, you can be located at a place or can also be distributed to multiple
On unit.Some or all of unit therein can be selected according to the actual needs to realize the purpose of this embodiment scheme.
In addition, each functional unit in each embodiment of the present invention can be integrated in a processing unit, it can also
That each unit is individually physically present, can also two or more units integrate in a unit.Above-mentioned integrated list
The form that hardware had both may be used in member is realized, can also be realized in the form of SFU software functional unit.
If the integrated unit is realized in the form of SFU software functional unit and is independent product sale or uses
When, it can be stored in a computer read/write memory medium.Based on such understanding, technical scheme of the present invention is substantially
The part to contribute in other words to the prior art or all or part of the technical solution can be in the form of software products
It embodies, which is stored in a storage medium, is used including some instructions so that a computer
Equipment (can be personal computer, server or network equipment etc.) perform each embodiment the method for the present invention whole or
Part steps.And aforementioned storage medium includes:USB flash disk, read-only memory (ROM, Read-Only Memory), arbitrary access are deposited
Reservoir (RAM, Random Access Memory), mobile hard disk, magnetic disc or CD etc. are various can to store program code
Medium.
The above is only the preferred embodiment of the present invention, it is noted that for the ordinary skill people of the art
For member, various improvements and modifications may be made without departing from the principle of the present invention, these improvements and modifications also should
It is considered as protection scope of the present invention.