CN108827312A - Cooperative game path planning method based on a neural network and an artificial potential field - Google Patents

Cooperative game path planning method based on a neural network and an artificial potential field

Info

Publication number
CN108827312A
Authority
CN
China
Prior art keywords
agent
moment
neural network
repulsion
potential field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810907242.7A
Other languages
Chinese (zh)
Other versions
CN108827312B (en)
Inventor
张菁
何友
彭应宁
李刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201810907242.7A priority Critical patent/CN108827312B/en
Publication of CN108827312A publication Critical patent/CN108827312A/en
Application granted granted Critical
Publication of CN108827312B publication Critical patent/CN108827312B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations

Abstract

The present invention proposes a cooperative game path planning method based on a neural network and an artificial potential field, belonging to the field of agent real-time path planning. In an offline phase, the method first constructs a training sample set from optimized paths of the agents and trains a BP neural network module, obtaining a trained BP neural network module. In the online phase, the current position and environment information of each agent are fed into the trained BP neural network module, which outputs the repulsion gain for that instant; the attraction of the target point on the agent and the repulsion of the threat zone on the agent are then computed, and their resultant force is obtained. Each agent moves according to the resultant force, and at the next instant the repulsion gain is updated from the new position and environment information, until the path planning termination condition is reached. The invention better solves the path planning problem in cooperative game scenarios and adapts better to real-time path planning with moving targets and obstacles.

Description

Cooperative game path planning method based on a neural network and an artificial potential field
Technical field
The invention belongs to the field of agent real-time path planning, and in particular relates to a cooperative game path planning method based on a neural network and an artificial potential field.
Background technique
Cooperative game path planning based on a neural network and an artificial potential field means that multiple agents use a combination of an improved neural network and an artificial potential field to plan paths cooperatively, with reaching the target area as the objective and avoiding obstacle areas as the constraint. Here "game" refers to the fact that the target area and the obstacles react antagonistically, in real time, to the motion of the agents. The cooperative game path planning problem is an important branch of the real-time path planning problem for particles in two-dimensional space, and research on it is of great significance for practical applications such as robot pursuit and robot soccer.
Existing methods such as artificial potential fields and neural networks all have certain shortcomings when used to solve the cooperative game path planning problem. The artificial potential field method is a common approach to real-time multi-agent path planning. Borrowing the idea of a physical field, it constructs in a virtual space an attractive potential field originating from the target and a repulsive potential field originating from the obstacles; the resultant force they generate drives the motion of the object (which may be a robot, an aircraft, or a vehicle). The method is highly practical and has the advantages of an intuitive principle, a simple and easily implemented algorithm structure, good adaptability to dynamic environments, low computational cost, good real-time performance, and no need to construct a configuration space (C-space) in advance. Through long-term research, the well-known problems of the artificial potential field method, such as local minima, goals non-reachable with obstacles nearby (GNRON), oscillation when crossing narrow passages, and dynamic obstacles, have been largely solved. However, existing artificial potential field methods give insufficient consideration to cooperative game scenarios: the antagonistic feedback of the environment and the cooperation among agents are difficult to describe quantitatively with rules and strategies. Although methods based on control theory can perform quantitative optimization, their computational complexity often exceeds real-time requirements.
The method proposed by S. S. Ge et al. in "New Potential Functions for Mobile Robot Path Planning" (2000) is currently the most classical artificial potential field method. It improves the construction of the repulsion function on the basis of the original artificial potential field method first proposed by Khatib in "Real-time obstacle avoidance for manipulators and mobile robots" (1986), and largely overcomes the goals-non-reachable-with-obstacles-nearby (GNRON) problem of the original method. The method mainly comprises the following steps:
1) Compute the attraction of the target point on the agent.
The attractive potential field is given by formula (1):

U_att(p) = (1/2) k_a ρ^n(p, p_goal)  (1)

where k_a is the attraction gain, ρ(p, p_goal) = ||p_goal - p|| is the Euclidean distance from the agent to the target point, p is the position coordinate of the agent, p_goal is the position coordinate of the target point, and n ∈ (0, +∞), usually taken as n = 2 by convention.
The attraction is obtained as the negative gradient of the attractive potential field, as shown in formula (2):

F_att(p) = -∇U_att(p) = (n/2) k_a ρ^(n-2)(p, p_goal) (p_goal - p)  (2)
2) Compute the repulsion of the threat zone on the agent.
The repulsive potential field is given by formula (3):

U_rep(p) = (1/2) k_r (1/ρ(p, p_threat) - 1/p_0)^2 ρ^n(p, p_goal) for ρ(p, p_threat) ≤ p_0, and U_rep(p) = 0 otherwise  (3)

where k_r is the repulsion gain, ρ(p, p_threat) = ||p_threat - p|| is the Euclidean distance from the agent to the closest point on the threat zone boundary, p_threat is the position coordinate of that closest point, p_0 is the maximum influence range of the repulsion, and n ∈ (0, +∞), usually taken as n = 2 by convention. It is precisely the introduction of the goal-distance factor ρ^n(p, p_goal) that prevents the repulsion from overwhelming the attraction, and thus the goal from becoming unreachable, when the obstacle is close to the target. The repulsion is obtained as the negative gradient of the repulsive potential field, as shown in formula (4):

F_rep(p) = -∇U_rep(p)  (4)
3) Compute the resultant force.
The resultant force is the vector sum of the attraction and the repulsion; it is then further converted into a motion direction and speed, where the conversion method depends on the specific path planning scenario.
At each instant, the agent moves according to the resultant force computed for the current instant until the next instant. If the agent reaches the target point at the next instant, path planning ends; otherwise planning continues with the next step, returning to step 1).
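Steps 1) to 3) above can be sketched in a few lines of Python. This is a minimal illustration assuming n = 2, unit gains, and the Ge-Cui goal-distance factor in the repulsion; the function names, parameter defaults, and example coordinates are assumptions of this sketch, not taken from the patent:

```python
import numpy as np

def attract_force(p, p_goal, k_a=1.0, n=2):
    # F_att = -grad U_att for U_att = (1/2) k_a rho^n; for n = 2 this is k_a (p_goal - p)
    d = p_goal - p
    rho = np.linalg.norm(d)
    if rho == 0.0:
        return np.zeros_like(p)
    return 0.5 * n * k_a * rho ** (n - 2) * d

def repulse_force(p, p_threat, p_goal, k_r=1.0, p0=2.0, n=2):
    # Ge-Cui repulsion: zero beyond the influence range p0; the rho_g factor
    # keeps the goal reachable when the threat zone is close to it (GNRON fix)
    d = p - p_threat
    rho = np.linalg.norm(d)
    if rho >= p0 or rho == 0.0:
        return np.zeros_like(p)
    rho_g = np.linalg.norm(p_goal - p)
    inv = 1.0 / rho - 1.0 / p0
    f_away = k_r * inv * rho_g ** n / rho ** 2 * (d / rho)               # push away from threat
    f_goal = 0.5 * n * k_r * inv ** 2 * rho_g ** (n - 2) * (p_goal - p)  # pull toward goal
    return f_away + f_goal

# step 3): the resultant force is the vector sum of attraction and repulsion
p = np.array([0.0, 0.0])
goal = np.array([5.0, 0.0])
threat = np.array([1.0, 0.5])
total = attract_force(p, goal) + repulse_force(p, threat, goal)
```

For n = 2 the attraction reduces to a linear spring toward the goal, which is why that value is the usual convention.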
The main drawback of this method is its poor adaptability to dynamic scenarios: the potential field functions cannot take the motion of the target and obstacles into account, so path planning with prediction and pursuit is not possible.
Neural networks, as a powerful machine learning and knowledge representation method, have been applied to path planning. Some studies map neurons and network architectures into the configuration space and use the neuron outputs as a potential field to guide the motion of the object. Other studies use neural networks for knowledge preparation before path planning, such as classifying environments, predicting obstacle motion patterns, and evaluating functional equations. Neural networks have also been combined with other methods, including fuzzy control, genetic algorithms, and simulated annealing. However, many studies use only the basic architecture and forward-propagation process of neural networks and do not make full use of their learning ability.
To adapt to real-time path planning scenarios, some studies propose adaptively adjusting the coefficients of the path planning equations in various ways according to the dynamically changing situation. A typical example is "Soccer robot path planning based on the fuzzy neuron potential field method" proposed by Lin Zhixiong in Computer Simulation in 2014. This method first improves the artificial potential field functions, then adds a fuzzy neural network module that adjusts the improved potential field function coefficients in real time; the fuzzy neural network module is generated by offline training from manually defined fuzzy rules and fuzzy-set membership functions. Its main steps are as follows:
1) Construct and train the fuzzy neural network module.
Define the fuzzy rule set and obtain the fuzzy neural network module by offline training.
2) Adaptively optimize the potential field function coefficients.
Using the fuzzy neural network module obtained in step 1), the improved potential field function coefficients (the target relative velocity parameter α and the obstacle offset parameter β) are optimized online in real time: at each time interval, the current state information is fed to the fuzzy neural network as input, and a forward pass outputs the optimized potential field function coefficients (α and β).
3) Path planning.
The adaptively optimized potential field function coefficients (α and β) obtained in step 2) are substituted into the following potential field functions to compute the attraction and repulsion, and finally the resultant force.
3-1) Compute the attraction of the target point on the agent.
The attraction is computed as shown in formula (5):

F_att(p) = k_a ρ(p, p_goal) + α(V_g - V)  (5)

where k_a is the attraction gain, ρ(p, p_goal) = ||p_goal - p|| is the Euclidean distance from the agent to the target point, p is the position coordinate of the agent, p_goal is the position coordinate of the target point, V_g is the target velocity, V is the agent velocity, and n ∈ (0, +∞), usually taken as n = 2 by convention.
3-2) Compute the repulsion of the threat zone on the agent.
The repulsion is computed as shown in formula (6), where k_r is the repulsion gain, ρ(p, p_threat) = ||p_threat - p|| is the Euclidean distance from the agent to the closest point on the threat zone boundary, p_threat is the position coordinate of that closest point, p_0 is the maximum influence range of the repulsion (beyond this range the repulsion of the threat zone on the agent falls to 0), V_0 is the obstacle velocity, and n ∈ (0, +∞), usually taken as n = 2 by convention.
3-3) Compute the resultant force.
The resultant force is the vector sum of the attraction and the repulsion; it is then further converted into a motion direction and speed for the agent, where the conversion method depends on the specific path planning scenario.
At each instant, the agent moves according to the resultant force computed for the current instant until the next instant. If the agent reaches the target point, path planning ends; otherwise planning continues with the next step, jumping to step 3-1).
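The velocity-augmented attraction of formula (5) can be illustrated as follows. Interpreting the distance term as acting along the unit vector toward the target is an assumption made for this sketch, as are the function name and parameter values:

```python
import numpy as np

def fuzzy_attract(p, p_goal, v, v_goal, k_a=1.0, alpha=0.5):
    # formula (5): the distance term pulls toward the goal; the alpha term steers
    # the agent to match the target's velocity (target relative velocity parameter)
    d = p_goal - p
    rho = np.linalg.norm(d)
    unit = d / rho if rho > 0.0 else np.zeros_like(d)
    return k_a * rho * unit + alpha * (v_goal - v)

f = fuzzy_attract(np.array([0.0, 0.0]), np.array([3.0, 4.0]),
                  v=np.array([0.1, 0.2]), v_goal=np.array([0.1, 0.2]))
```

When the agent already matches the target's velocity, the α term vanishes and only the distance term remains.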
The main drawback of this method is its limited adaptability to cooperative path planning in game scenarios. The cooperation and game aspects of the planning are mainly embodied in the formulation of the fuzzy rules, yet fuzzy rules must be written by hand: their quality is hard to guarantee and they are difficult to obtain, inferior to what can be learned from samples. On the other hand, although this method modifies the formulas of the classical artificial potential field method, the modifications are applied directly to the attraction and repulsion functions rather than to the potential field functions; according to the principle of deriving force from field, the theoretical basis for skipping the field functions and modifying the force functions directly still needs to be proved.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings of the prior art by proposing a cooperative game path planning method based on a neural network and an artificial potential field. The invention better solves the path planning problem in cooperative game scenarios, exploits the learning ability of neural networks, and has good practicality and good dynamic adaptability.
The present invention proposes a cooperative game path planning method based on a neural network and an artificial potential field, characterized in that the method comprises the following steps:
1) off-line phase;
1-1) Construct the training sample set, as follows:
1-1-1) From the R agents participating in cooperative game path planning, arbitrarily choose the r-th agent, denoted F_r, and obtain one optimized path for F_r. The optimized path consists of T steps, i.e., it contains the position coordinates of agent F_r at instants 1 to T.
For this optimized path, the position coordinates of the agent at the t-th instant and the corresponding environment form the input of a training sample. The position coordinates of the agent at the t-th instant are (F_r_x_t, F_r_y_t), where F_r_x_t is the X coordinate and F_r_y_t the Y coordinate of the r-th agent at instant t. The environment at the t-th instant consists of: the motion direction F_r_v_t of F_r at instant t, the target position (G_x_t, G_y_t) at instant t, the target motion direction G_v_t at instant t, the obstacle position (O_x_t, O_y_t) at instant t, the obstacle motion direction O_v_t at instant t, and the positions at instant t of the other cooperating agents besides F_r, denoted (F_1_x_t, F_1_y_t), (F_2_x_t, F_2_y_t), … (F_R_x_t, F_R_y_t). The potential field function coefficient of instant t, i.e., the repulsion gain k_r, is the output of the training sample; the k_r of instant t is solved by substituting the agent position difference between instants t+1 and t into the artificial potential field functions. This optimized path thus generates T-1 training samples altogether.
1-1-2) Repeat step 1-1-1) to obtain M optimized paths in total over the R agents, and generate the corresponding training samples for each optimized path; the training samples of all optimized paths form the training sample set.
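The sample construction of steps 1-1-1) and 1-1-2) can be sketched as below. The input layout (agent position and heading, target position and heading, obstacle position and heading, other agents' positions) follows the text; the dictionary keys, the assumption of at least two cooperating agents, and the solve_kr callback, which stands in for substituting the step displacement into the potential field functions to recover k_r, are assumptions of this sketch:

```python
import numpy as np

def build_samples(paths, env, solve_kr):
    """paths: dict r -> (T, 2) array of agent r's optimized-path positions.
    env: per-instant dicts with agent headings, target and obstacle states.
    solve_kr: callable recovering the repulsion gain k_r from the t -> t+1
    position difference (placeholder for the potential-field inversion)."""
    samples = []
    for r, path in paths.items():
        T = len(path)
        for t in range(T - 1):                    # an optimized path yields T-1 samples
            others = [paths[q][t] for q in sorted(paths) if q != r]  # assumes R >= 2
            x = np.hstack([path[t], env[t]["agent_dir"][r],
                           env[t]["goal_pos"], env[t]["goal_dir"],
                           env[t]["obs_pos"], env[t]["obs_dir"],
                           np.hstack(others)])
            samples.append((x, solve_kr(path[t + 1] - path[t], env[t])))
    return samples
```

Each sample pairs a flat input vector with the scalar output k_r, matching the single-output network of step 1-2).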
1-2) construct BP neural network module;
1-3) Using the training sample set of step 1-1), train the BP neural network module established in step 1-2) offline with the momentum gradient descent method; when the training cutoff condition is reached, the trained BP neural network module is obtained.
2) on-line stage;
2-1) Obtain the initial position and corresponding environment information of each of the R agents; denote the current instant as t and initialize the step counter c = 0.
2-2) Feed the current position and environment information of each agent into the trained BP neural network module obtained in step 1-3); the network module outputs the repulsion gain k_r of the corresponding agent at the current instant.
For the r-th agent F_r, the input of the BP neural network module is: the position (F_r_x_t, F_r_y_t) of F_r at instant t, the motion direction F_r_v_t of F_r at instant t, the target position (G_x_t, G_y_t) at instant t, the target motion direction G_v_t at instant t, the obstacle position (O_x_t, O_y_t) at instant t, the obstacle motion direction O_v_t at instant t, and the positions at instant t of the other cooperating agents besides F_r: (F_1_x_t, F_1_y_t), (F_2_x_t, F_2_y_t), … (F_R_x_t, F_R_y_t). The output of the BP neural network module is the repulsion gain k_r of agent F_r at instant t.
2-3) Perform artificial potential field path planning at the current instant to obtain the resultant force on each agent, as follows:
2-3-1) Compute the attraction of the target point on each agent.
For each agent, the attractive potential field is computed as shown in formula (1):

U_att(p) = (1/2) k_a ρ^n(p, p_goal)  (1)

where k_a is the attraction gain, set to 1; ρ(p, p_goal) = ||p_goal - p|| is the Euclidean distance from the agent to the target point; p is the position coordinate of the agent; p_goal is the position coordinate of the target point; and n ∈ (0, +∞).
The attraction is obtained as the negative gradient of the attractive potential field, as shown in formula (2):

F_att(p) = -∇U_att(p) = (n/2) k_a ρ^(n-2)(p, p_goal) (p_goal - p)  (2)
2-3-2) Compute the repulsion of the threat zone on each agent.
For each agent, the repulsive potential field is computed as shown in formula (3):

U_rep(p) = (1/2) k_r (1/ρ(p, p_threat) - 1/p_0)^2 ρ^n(p, p_goal) for ρ(p, p_threat) ≤ p_0, and U_rep(p) = 0 otherwise  (3)

where k_r is the repulsion gain of the current instant obtained in step 2-2); ρ(p, p_threat) = ||p_threat - p|| is the Euclidean distance from the agent to the closest point on the threat zone boundary; p_threat is the position coordinate of that closest point; and p_0 is the maximum influence range of the repulsion.
The repulsion is obtained as the negative gradient of the repulsive potential field, as shown in formula (4):

F_rep(p) = -∇U_rep(p)  (4)
2-3-3) Compute the resultant force.
The resultant force on each agent is the vector sum of the attraction and repulsion acting on it; the direction of this vector is the direction of the agent's motion at the next instant.
2-4) Each agent moves according to the resultant force computed in step 2-3) until instant t+1; the step counter is updated, c = c + 1, and the following decision is made:
If at instant t+1 any of the R agents has reached the target point and no agent has entered a threat zone, path planning succeeds and the method ends. If at instant t+1 any of the R agents has entered a threat zone, path planning fails and the method ends. If at instant t+1 no agent has met the success or failure condition, let t = t + 1, take instant t+1 as the new current instant, and return to step 2-2) to continue path planning for the next instant. When the step counter c reaches the preset upper limit of A steps and no agent has yet met the success or failure condition, path planning is recorded as a timeout and the method ends.
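The online loop of steps 2-1) to 2-4) can be outlined as follows. Here predict_kr stands in for the trained BP network of step 2-2) and step_force for the resultant-force computation of step 2-3); the radii, function names, and displacement model are illustrative assumptions of this sketch:

```python
import numpy as np

def plan_online(agents, goal, threats, predict_kr, step_force,
                max_steps=600, goal_radius=1.0, threat_radius=1.0):
    """One run of the online stage: returns ('success'|'failure'|'timeout', steps)."""
    pos = {r: np.array(p, dtype=float) for r, p in agents.items()}
    for c in range(max_steps):                        # step counter with upper limit A
        for r in pos:
            k_r = predict_kr(r, pos)                  # network gives this instant's gain
            pos[r] = pos[r] + step_force(r, pos, k_r) # move along the resultant force
        if any(np.linalg.norm(p - z) <= threat_radius
               for p in pos.values() for z in threats):
            return "failure", c + 1                   # some agent entered a threat zone
        if any(np.linalg.norm(p - goal) <= goal_radius for p in pos.values()):
            return "success", c + 1                   # an agent reached the target area
    return "timeout", max_steps                       # neither condition within A steps
```

Checking the failure condition before the success condition enforces the rule that success requires no agent to be inside a threat zone.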
The features of the present invention and beneficial effect are:
The present invention better solves the path planning problem in cooperative game scenarios and adapts better to real-time path planning with moving targets and obstacles, fully taking into account the antagonistic changes the target and the obstacles make in response to the motion of the agents during path planning. The invention requires no manually formulated fuzzy rules, which reduces the difficulty of use and increases practicality; it exploits the learning ability of neural networks and, by learning directly from cooperative game samples, improves the effectiveness of solving the cooperative game path planning problem.
The invention proposes using a BP neural network to dynamically optimize the artificial potential field function coefficients, and designs a BP neural network whose input is the current state of the agents and the environment and whose output is the potential field function coefficients, together with the corresponding sample processing and learning method.
The invention improves the effectiveness of game path planning under multi-agent cooperation and improves the adaptability of real-time path planning, which is of great significance for practical applications such as robot pursuit and robot soccer.
Detailed description of the invention
Fig. 1 is the overall flow chart of the method of the present invention.
Fig. 2 is a top plan view of the path planning region of the red and blue aircraft in the embodiment of the present invention.
Fig. 3 is a schematic diagram of the path planning result obtained with the traditional classical method in the embodiment of the present invention.
Fig. 4 is a schematic diagram of the path planning result obtained with the method of the present invention in the embodiment of the present invention.
Specific embodiment
The present invention proposes a cooperative game path planning method based on a neural network and an artificial potential field, which is further described below with reference to the drawings and a specific embodiment.
The method is divided into an offline phase and an online application phase; the overall flow is shown in Fig. 1 and comprises the following steps:
1) off-line phase;
1-1) Construct the training sample set, as follows:
1-1-1) For the cooperative game path planning problem with R agents (R a positive integer), obtain one "optimized path" for the r-th agent F_r (r ≤ R). An optimized path is a path that well satisfies the objective and constraints of the cooperative game path planning problem; its degree of optimization affects the performance of the neural network obtained by training. Optimized paths can be obtained by various means, such as manual path planning or simulated path planning based on an optimization algorithm. The optimized path consists of T steps (T a positive integer, T > 1), i.e., it contains the position coordinates of agent F_r at instants 1 to T. For this optimized path, the agent coordinates at the t-th instant and the corresponding environment form the input of a training sample. The position coordinates of the agent at the t-th instant are (F_r_x_t, F_r_y_t), where F_r_x_t is the X coordinate and F_r_y_t the Y coordinate of the r-th agent at instant t, with t ≤ T. The environment at the t-th instant consists of: the motion direction F_r_v_t of F_r at instant t, the target position (G_x_t, G_y_t) at instant t, the target motion direction G_v_t at instant t, the obstacle position (O_x_t, O_y_t) at instant t, the obstacle motion direction O_v_t at instant t, and the positions at instant t of the other cooperating agents besides F_r, denoted (F_1_x_t, F_1_y_t), (F_2_x_t, F_2_y_t), … (F_R_x_t, F_R_y_t). The potential field function coefficient of instant t (the repulsion gain k_r) is the output of the training sample; the k_r of instant t is solved by substituting the agent position difference between instants t+1 and t into the artificial potential field functions. Since an optimized path contains T steps (i.e., T instants in total), it generates T-1 training samples altogether (each sample comprising the corresponding input and output); the number of samples affects the performance of the neural network obtained by training.
1-1-2) Repeat step 1-1-1) to obtain M optimized paths in total over the R agents (M a positive integer; the larger M, the better the training effect; the agents can be considered identical except for their initial positions, so as long as M is large enough it is not necessary for every agent to have its own optimized path), and generate the corresponding training samples for each optimized path. The training samples of all optimized paths form the training sample set.
1-2) construct BP neural network module;
In the present invention, a BP neural network module with a single hidden layer is used. The number of hidden neurons is set heuristically and from empirical data (for example, it can be set to 24 for a game path planning problem with 2 cooperating agents), and the activation function is the Sigmoid function.
1-3) The BP neural network module established in step 1-2) is trained offline with the training sample set obtained in step 1-1), yielding the trained BP neural network module.
Using the samples of the training sample set obtained in step 1-1), the BP neural network module is trained offline with the momentum gradient descent method, with a mean squared error below 0.001 or 100000 training iterations as the training cutoff condition, to obtain the trained BP neural network module. At this point preparation is complete and the online application phase of real-time path planning can begin.
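Step 1-3) can be sketched as a plain-NumPy single-hidden-layer network trained with momentum gradient descent, stopping on MSE < 0.001 or 100000 iterations as in the text. The weight initialization, learning rate, and full-batch training are assumptions of this sketch, not specified in the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, y, hidden=24, lr=0.1, momentum=0.9,
             max_iter=100000, mse_tol=1e-3, seed=0):
    """Single hidden layer (Sigmoid), linear output = repulsion gain k_r,
    full-batch momentum gradient descent with the cutoff of step 1-3)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1));          b2 = np.zeros(1)
    vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
    vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    mse = float("inf")
    for _ in range(max_iter):
        h = sigmoid(X @ W1 + b1)                    # hidden layer forward pass
        out = h @ W2 + b2
        err = out - y
        mse = float(np.mean(err ** 2))
        if mse < mse_tol:                           # MSE cutoff condition
            break
        g_out = 2.0 * err / len(X)                  # backpropagation
        gW2 = h.T @ g_out;  gb2 = g_out.sum(0)
        g_h = (g_out @ W2.T) * h * (1.0 - h)
        gW1 = X.T @ g_h;    gb1 = g_h.sum(0)
        vW2 = momentum * vW2 - lr * gW2; W2 += vW2  # momentum updates
        vb2 = momentum * vb2 - lr * gb2; b2 += vb2
        vW1 = momentum * vW1 - lr * gW1; W1 += vW1
        vb1 = momentum * vb1 - lr * gb1; b1 += vb1
    predict = lambda Xn: (sigmoid(Xn @ W1 + b1) @ W2 + b2).ravel()
    return predict, mse
```

The returned predict closure is the trained module used in the online stage; feeding it the step-2-2) input vector yields the repulsion gain for that instant.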
2) on-line stage;
2-1) Obtain the initial position and corresponding environment information of each of the R agents. This information is known and can be obtained with existing sensor technology such as radar. Denote the current instant as t and initialize the step counter c = 0.
2-2) Feed the current position and environment information of each agent into the trained BP neural network module obtained in step 1-3); the network module outputs the potential field function coefficient (the repulsion gain k_r) of the corresponding agent at the current instant, so that the potential field function coefficient is adaptively optimized by the BP neural network module.
For the r-th agent F_r, the input of the BP neural network module is: the position (F_r_x_t, F_r_y_t) of F_r at instant t, the motion direction F_r_v_t of F_r at instant t, the target position (G_x_t, G_y_t) at instant t, the target motion direction G_v_t at instant t, the obstacle position (O_x_t, O_y_t) at instant t, the obstacle motion direction O_v_t at instant t, and the positions at instant t of the other cooperating agents besides F_r: (F_1_x_t, F_1_y_t), (F_2_x_t, F_2_y_t), … (F_R_x_t, F_R_y_t). The output of the BP neural network module is the potential field function coefficient (the repulsion gain k_r) of agent F_r at instant t.
In the online application phase of real-time path planning, starting from the initial position, the planner decides at each instant where each agent moves in its next step. Using the BP neural network module trained in step 1-3), the position and environment of the agent at the current instant are taken as input, and a forward pass outputs the potential field function coefficient (the repulsion gain k_r) of that instant, achieving online real-time computation of the approximately optimal repulsion gain k_r for the current environment. At the next instant, the changed position and environment are fed in and the BP neural network module output updates the repulsion gain k_r, achieving adaptive optimization.
2-3) Perform artificial potential field path planning at the current instant to obtain the resultant force on each agent, as follows:
2-3-1) Compute the attraction of the target point on each agent.
For each agent, the attractive potential field is computed as shown in formula (1).
Here the attraction gain k_a is set to 1. This further reduces the output dimension of the BP neural network module, which improves precision and reduces computational complexity: since the essence of the resultant potential field is the comparison between the attractive and repulsive potential fields, by a scaling argument it suffices to adaptively optimize only the repulsion gain k_r. ρ(p, p_goal) = ||p_goal - p|| is the Euclidean distance from the agent to the target point, p is the position coordinate of the agent, p_goal is the position coordinate of the target point, and n ∈ (0, +∞), usually taken as n = 2 by convention.
The attraction is obtained as the negative gradient of the attractive potential field, as shown in formula (2).
2-3-2) Compute the repulsion of the threat zone on each agent.
For each agent, the repulsive potential field is computed as shown in formula (3), where k_r is the repulsion gain of the current instant obtained in step 2-2); ρ(p, p_threat) = ||p_threat - p|| is the Euclidean distance from the agent to the closest point on the threat zone boundary (it is precisely the introduction of this goal-distance factor that prevents the repulsion from overwhelming the attraction, and thus the goal from becoming unreachable, when the obstacle is close to the target); p_threat is the position coordinate of that closest point; p_0 is the maximum influence range of the repulsion, beyond which the repulsion of the threat zone on the agent falls to 0; and n ∈ (0, +∞), usually taken as n = 2 by convention.
The repulsion is obtained as the negative gradient of the repulsive potential field, as shown in formula (4).
2-3-3) Compute the resultant force.
The resultant force on each agent is the vector sum of the attraction and repulsion acting on it; the direction of this vector is the direction of the agent's motion at the next instant.
2-4) Each agent moves according to the resultant force computed in step 2-3) until instant t+1; the step counter is updated, c = c + 1, and the following decision is made:
If at instant t+1 any of the R agents has reached the target point and no agent has entered a threat zone, path planning succeeds and the method ends. If at instant t+1 any of the R agents has entered a threat zone, path planning fails and the method ends. If at instant t+1 no agent has met the success or failure condition, let t = t + 1, take instant t+1 as the new current instant, return to step 2-2), and continue with path planning for the next instant (i.e., the next step). When the step counter c reaches the preset upper limit of A steps (A a positive integer; in this embodiment A = 600) and no agent has yet met the success or failure condition, path planning ends with a timeout.
The present invention is described in more detail below with reference to a specific embodiment.
The present embodiment takes as an example cooperative game path planning in which 2 red aircraft (red aircraft A and red aircraft B) play against 1 blue aircraft (a target drone).
Background:
Fig. 2 shows a top plan view of the path planning region of the two sides' aircraft. Initial positions: red aircraft A is at the lower left of the figure, red aircraft B at the lower right, and the blue target drone at the top. The target drone presents a common target area to the 2 red aircraft, and a separate threat zone to each of red aircraft A and red aircraft B. Both sides obtain the other side's real-time position and heading information through intelligence support. The areas on the two sides of the target drone form the target area and the region in front of it is the threat zone; the red aircraft carry out cooperative game path planning with entering the target area as the objective and avoiding the threat zone as the constraint.
Basic assumptions:
Both aircraft are treated as particles in a two-dimensional space; 1 unit of distance in the top view corresponds to 1 km of real airspace, and one time-domain step corresponds to 1 second. Both sides fly at their maximum speed throughout: the red aircraft at 0.3 km/s (about Mach 0.9) and the blue aircraft at 0.5 km/s (about Mach 1.5).
Initial positions:
The two red aircraft fly in formation with a spacing of 100 km. Taking the red heading as 0 degrees, the target drone appears at random within the −90 to +90 degree sector ahead of the red two-aircraft formation, at a distance of about 300 km.
Target zone:
Bounded by the dotted lines in Fig. 2, it extends within ±90 degrees of the target drone's heading, with a maximum radius of up to 110 km.
Threat zone:
Equivalent to the obstacle in a conventional path-planning problem. It depends on the relative velocity of the two sides, so red A and red B face different threat zones because their headings differ. Its boundary is shown as a dashed line in Fig. 2; it is largest, up to 105 km, when the two aircraft fly head-on.
Decision conditions:
If any red aircraft enters a threat zone, the path planning "fails". If any red aircraft enters a target zone without having entered a threat zone, the path planning "succeeds". If, after the maximum number of steps (set to 600), no red aircraft has entered either a target zone or a threat zone, the result is a "draw".
Game environment:
Based on the current positions and headings resulting from the red aircraft's latest path-planning decisions, the target drone adjusts its heading in real time according to the strategy "point the heading at the red aircraft that would enter the threat zone first".
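The drone's heading rule can be sketched as follows (a sketch under the assumption that "would enter the threat zone first" is approximated by straight-line time-to-reach, i.e. remaining distance divided by closing speed; the function name and signature are illustrative):

```python
import numpy as np

def drone_heading(drone_pos, red_positions, red_velocities, threat_radius=105.0):
    """Point the drone's heading at the red aircraft estimated to enter
    the threat zone first (smallest time-to-reach under straight-line motion)."""
    best_t, best_dir = float("inf"), None
    for p, v in zip(red_positions, red_velocities):
        rel = drone_pos - p                      # vector from the red aircraft to the drone
        dist = np.linalg.norm(rel)
        closing = np.dot(v, rel / dist)          # closing speed toward the drone
        if closing <= 0:
            continue                             # flying away: never enters the zone
        t = max(dist - threat_radius, 0.0) / closing
        if t < best_t:
            best_t, best_dir = t, -rel / dist    # unit heading: drone -> that red aircraft
    return best_dir                              # None if no red aircraft is closing
```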
This embodiment applies the proposed cooperative game path planning method based on a neural network and an artificial potential field, comprising the following steps:
1) Offline stage;
1-1) Construct the training sample set;
Optimized paths of red A starting from 6 different initial positions are known, i.e. 6 optimized paths in total. For each optimized path, the coordinates at time t together with the corresponding environment serve as the input of one training sample, and the potential field coefficient at time t (the repulsion gain k_r) serves as its output, where k_r at time t is obtained by substituting the agent's position difference between times t+1 and t into the artificial potential field function and solving. If an optimized path contains T steps (i.e. T time instants in total), it generates T−1 training samples (each with its corresponding input and output).
All 6 optimized paths are processed in this way, each generating its training samples; the training samples of all optimized paths form the training sample set.
In this embodiment, red B participates in the construction of the training sample set only as part of the "environment".
Since A and B are identical agents, the same processing could also be applied to B and its samples added to the training set, but this embodiment builds the training sample set from A's optimized paths only.
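The sample construction of step 1-1) can be sketched as follows (a sketch; `solve_kr` is a hypothetical helper standing in for inverting the artificial potential field relation from the position difference between times t+1 and t, which depends on formulas (1)-(4)):

```python
def build_samples(path, env_features, solve_kr):
    """Turn one optimized path of T positions into T-1 (input, output) pairs:
    input  = position at time t plus the environment features at time t,
    output = repulsion gain k_r solved from the step taken between t and t+1."""
    samples = []
    T = len(path)
    for t in range(T - 1):
        x = list(path[t]) + list(env_features[t])             # sample input
        kr = solve_kr(path[t], path[t + 1], env_features[t])  # sample output
        samples.append((x, kr))
    return samples                                            # T-1 samples per path
```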
1-2) Construct the BP neural network module;
A single-hidden-layer BP neural network module is used, with 24 hidden-layer neurons and the sigmoid activation function.
When planning red A's path for time t+1, the inputs of the BP neural network module are: red A's position at time t (FA_x_t, FA_y_t), red A's direction of motion FA_v_t at time t, the position of the closest point of the target zone at time t (G_x_t, G_y_t), the target drone's direction of motion G_v_t at time t, the position of the closest point of the obstacle at time t (O_x_t, O_y_t), and the position of the other red aircraft B at time t (FB_x_t, FB_y_t). The output of the BP neural network module is the optimized potential field coefficient of red A at time t (the repulsion gain k_r).
1-3) Train the BP neural network module established in step 1-2) offline with the training sample set obtained in step 1-1), yielding the trained BP neural network module;
Using the samples of the training sample set of step 1-1), the BP neural network module is trained offline by momentum gradient descent, with the stop condition that the mean squared error falls below 0.001 or the training reaches 100,000 iterations. Preparation is then complete, and the method can enter the online stage of real-time path planning.
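A minimal NumPy sketch of the network described in steps 1-2) and 1-3): 10 inputs (the quantities listed above), 24 sigmoid hidden neurons, a single output k_r, and momentum-gradient-descent training with the stated stop conditions (MSE below 0.001 or 100,000 iterations). The learning rate, momentum value, and weight initialization are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class BPNet:
    """Single-hidden-layer BP network: n_in inputs -> 24 sigmoid units -> 1 output."""
    def __init__(self, n_in=10, n_hidden=24, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)
        self.vel = [np.zeros_like(p) for p in (self.W1, self.b1, self.W2, self.b2)]

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)      # hidden activations
        return self.h @ self.W2 + self.b2            # linear output: k_r

    def train(self, X, y, lr=0.05, momentum=0.9, mse_goal=1e-3, max_iter=100_000):
        """Momentum gradient descent; stop at MSE < mse_goal or max_iter rounds."""
        y = y.reshape(-1, 1)
        for _ in range(max_iter):
            out = self.forward(X)
            err = out - y
            mse = float(np.mean(err ** 2))
            if mse < mse_goal:
                break
            # backpropagation of the MSE loss
            d_out = 2 * err / len(X)
            dW2 = self.h.T @ d_out
            db2 = d_out.sum(0)
            d_h = (d_out @ self.W2.T) * self.h * (1 - self.h)   # sigmoid derivative
            dW1 = X.T @ d_h
            db1 = d_h.sum(0)
            # momentum update of all parameters
            params = (self.W1, self.b1, self.W2, self.b2)
            grads = (dW1, db1, dW2, db2)
            for i, (p, g) in enumerate(zip(params, grads)):
                self.vel[i] = momentum * self.vel[i] - lr * g
                p += self.vel[i]
        return mse
```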
2) Online stage;
2-1) Obtain the initial position of each agent and the corresponding environment information; denote the current time as t and initialize the step counter c = 0;
2-2) Feed the position and environment information of each agent at the current time into the trained BP neural network module obtained in step 1-3); the network module outputs the potential field coefficient (repulsion gain k_r) of that agent at the current time, so that the potential field coefficient is adaptively optimized by the BP neural network module;
In the online real-time planning stage, starting from the initial positions, the next position of each red aircraft (A and B) is planned at every time step. With the neural network module trained in step 1-3), the agent's position and environment at the current time are fed forward as input, and the network outputs the potential field coefficient (repulsion gain k_r) for that time step, thus computing online, in real time, an approximately optimal repulsion gain k_r for the current environment. At the next time step, the changed position and environment are input and the BP neural network module's output updates k_r, achieving adaptive optimization.
2-3) Perform path planning based on the artificial potential field at the current time to obtain the resultant force on each agent at the current time. The specific steps are as follows:
2-3-1) Compute the attraction of the target point on the agent;
For each agent (a red aircraft in this embodiment), the attractive potential field is computed as in formula (1):
U_att(p) = (1/2)·k_a·ρ^n(p, p_goal)  (1)
where k_a is set to 1 and n = 2.
The attraction is obtained as the negative gradient of the attractive potential field, as in formula (2):
F_att(p) = −∇U_att(p) = k_a·ρ(p, p_goal)  (2)
2-3-2) Compute the repulsion of the threat zone on the agent;
The repulsive potential field is computed as in formula (3):
U_rep(p) = (1/2)·k_r·(1/ρ(p, p_threat) − 1/p_0)^n  if ρ(p, p_threat) ≤ p_0, and 0 otherwise  (3)
where n = 2.
The repulsion is obtained as the negative gradient of the repulsive potential field, as in formula (4):
F_rep(p) = −∇U_rep(p) = k_r·(1/ρ(p, p_threat) − 1/p_0)·(1/ρ²(p, p_threat))·∇ρ(p, p_threat)  if ρ(p, p_threat) ≤ p_0, and 0 otherwise  (4)
2-3-3) Compute the resultant force;
The resultant force on each agent is the vector sum of that agent's attraction and repulsion, and the direction of this vector is the direction of the agent's (i.e. the red aircraft's) next move.
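The computations of step 2-3) can be sketched as follows (a sketch using the standard attraction/repulsion forms matching formulas (2) and (4) with n = 2 and k_a = 1 as stated in the embodiment; the helper signatures are illustrative):

```python
import numpy as np

def attraction(p, p_goal, ka=1.0):
    """Formula (2): F_att = k_a * rho(p, p_goal), directed toward the goal."""
    return ka * (p_goal - p)           # magnitude k_a*rho, direction p -> goal

def repulsion(p, p_threat, kr, p0):
    """Formula (4): repulsion from the closest threat-boundary point,
    zero outside the influence range p0."""
    d = p - p_threat                   # points from the threat toward the agent
    rho = np.linalg.norm(d)
    if rho > p0:
        return np.zeros_like(p)
    return kr * (1.0 / rho - 1.0 / p0) * (1.0 / rho ** 2) * (d / rho)

def resultant(p, p_goal, p_threat, kr, p0, ka=1.0):
    """Step 2-3-3): vector sum of attraction and repulsion; its direction
    is the agent's next direction of motion."""
    return attraction(p, p_goal, ka) + repulsion(p, p_threat, kr, p0)
```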
2-4) Each red aircraft moves according to the resultant force computed in step 2-3) until time t+1, updates the step counter c = c + 1, and evaluates:
If at time t+1 any red aircraft has reached a target zone and no red aircraft has entered a threat zone, the path planning ends in success; if any red aircraft has entered a threat zone, the path planning ends in failure; if at time t+1 no red aircraft has met either the success or the failure condition, set t = t + 1, take time t+1 as the new current time, return to step 2-2), and continue with the path planning for the next time step. If the step counter c reaches 600 steps while the red aircraft have still met neither condition, the path planning ends in a draw.
Effect:
Fig. 3 shows the path-planning result of a classical method (the method proposed by S. S. Ge et al., 2000, in "New Potential Functions for Mobile Robot Path Planning"): a top view of the path planning of both sides, with red A at the lower middle of the figure, red B at the lower left, and the blue target drone at the upper right. In that method, the repulsion gain k_r and the attraction gain take statistically averaged optimal values. As Fig. 3 shows, because the potential field coefficients cannot be adaptively optimized: first, the planned paths of the red aircraft are insensitive to whether the target drone turns its heading toward them or away, so no game behavior arises; second, although the two red aircraft, one near the target drone and one far from it, show similar motion trends, no cooperation emerges. The final path-planning result is a failure. With the method of the present invention in the same scenario, the result is as shown in Fig. 4: the initial positions of both sides are unchanged, and the potential field coefficients adapt to the real-time relative situation. Red B, being farther from the target drone, adaptively reduces its repulsion gain k_r so that it closes on the drone quickly and supports its teammate, which embodies the cooperative capability. Near the end of the planned paths, when a red aircraft is about to enter either the target zone or a threat zone, its repulsion gain k_r rises sharply whenever the drone's heading points at it, letting it escape the threat zone, and falls back rapidly once the drone's heading turns away, letting it seize the moment to close on the target zone, which embodies the game capability.
In addition, the performance and computational cost of the method were verified statistically on a large sample set and compared with the classical method, as shown in Table 1. The sample set comprises path planning from 1000 random initial positions. The statistics show that, compared with the classical method, the success rate of the proposed method improves markedly, by 32.3 percentage points, and the number of steps needed for success decreases by about 15.37%, fully demonstrating the method's improvement of path-planning performance. On the other hand, although the forward propagation of the neural network introduces additional computational cost, increasing it by 4 orders of magnitude, the cost remains at the millisecond level; considering that further software and hardware optimization can be expected in practical applications, the computational cost of the method is acceptable.
Table 1. Overall performance and cost comparison between the method of the present invention and the classical method
Claims (1)
1. A cooperative game path planning method based on a neural network and an artificial potential field, characterized in that the method comprises the following steps:
1) offline stage;
1-1) construct the training sample set; the specific steps are as follows:
1-1-1) arbitrarily select the r-th agent, denoted F_r, from the R agents participating in the cooperative game path planning, and obtain one optimized path of F_r; if the optimized path comprises T steps, it contains the position coordinates of agent F_r at times 1 to T;
for this optimized path, take the agent's position coordinates at time t together with the corresponding environment as the input of one training sample, where the position coordinates at time t are (F_r_x_t, F_r_y_t), F_r_x_t being the X coordinate and F_r_y_t the Y coordinate of the r-th agent at time t, and the environment at time t comprises: the direction of motion F_r_v_t of F_r at time t, the target position (G_x_t, G_y_t) at time t, the target's direction of motion G_v_t at time t, the obstacle position (O_x_t, O_y_t) at time t, the obstacle's direction of motion O_v_t at time t, and the positions at time t of the cooperating agents other than F_r, denoted (F_1_x_t, F_1_y_t), (F_2_x_t, F_2_y_t), … (F_R_x_t, F_R_y_t); take the potential field coefficient at time t, i.e. the repulsion gain k_r, as the output of this training sample, where k_r at time t is obtained by substituting the agent's position difference between times t+1 and t into the artificial potential field function and solving; this optimized path then generates T−1 training samples;
1-1-2) repeat step 1-1-1) to obtain M optimized paths in total over the R agents, and generate the corresponding training samples for each optimized path; form the training sample set from the training samples of all optimized paths;
1-2) construct a BP neural network module;
1-3) perform offline training of the BP neural network module established in step 1-2) with the training sample set of step 1-1), using momentum gradient descent; when the training stop condition is reached, the trained BP neural network module is obtained;
2) online stage;
2-1) obtain the initial position and corresponding environment information of each of the R agents, denote the current time as t, and initialize the step counter c = 0;
2-2) feed the position and environment information of each agent at the current time into the trained BP neural network module obtained in step 1-3); the network module outputs the repulsion gain k_r of that agent at the current time;
for the r-th agent F_r, the inputs of the BP neural network module are: the position (F_r_x_t, F_r_y_t) of F_r at time t, the direction of motion F_r_v_t of F_r at time t, the target position (G_x_t, G_y_t) at time t, the target's direction of motion G_v_t at time t, the obstacle position (O_x_t, O_y_t) at time t, the obstacle's direction of motion O_v_t at time t, and the positions (F_1_x_t, F_1_y_t), (F_2_x_t, F_2_y_t), … (F_R_x_t, F_R_y_t) at time t of the cooperating agents other than F_r; the output of the BP neural network module is the repulsion gain k_r of agent F_r at time t;
2-3) perform path planning based on the artificial potential field at the current time to obtain the resultant force on each agent at the current time; the specific steps are as follows:
2-3-1) compute the attraction of the target point on the agent;
for each agent, the attractive potential field is computed as in formula (1):
U_att(p) = (1/2)·k_a·ρ^n(p, p_goal)  (1)
where k_a is the attraction gain, set to 1; ρ(p, p_goal) = ||p_goal − p|| is the Euclidean distance from the agent to the target point; p is the agent's position coordinate; p_goal is the target point's position coordinate; n ∈ (0, +∞);
the attraction is obtained as the negative gradient of the attractive potential field, as in formula (2):
F_att(p) = −∇U_att(p) = k_a·ρ(p, p_goal)  (2)
2-3-2) compute the repulsion of the threat zone on the agent;
for each agent, the repulsive potential field is computed as in formula (3):
U_rep(p) = (1/2)·k_r·(1/ρ(p, p_threat) − 1/p_0)^n  if ρ(p, p_threat) ≤ p_0, and 0 otherwise  (3)
where k_r is the repulsion gain at the current time obtained in step 2-2); ρ(p, p_threat) = ||p_threat − p|| is the Euclidean distance from the agent to the closest point of the threat zone boundary; p_threat is the position coordinate of that closest point; p_0 is the maximum influence range of the repulsion;
the repulsion is obtained as the negative gradient of the repulsive potential field, as in formula (4):
F_rep(p) = −∇U_rep(p) = k_r·(1/ρ(p, p_threat) − 1/p_0)·(1/ρ²(p, p_threat))·∇ρ(p, p_threat)  if ρ(p, p_threat) ≤ p_0, and 0 otherwise  (4)
2-3-3) compute the resultant force;
the resultant force on each agent is the vector sum of that agent's attraction and repulsion, and the direction of this vector is the direction of the agent's motion at the next time step;
2-4) each agent moves according to the resultant force computed in step 2-3) until time t+1; update the step counter c = c + 1 and evaluate:
if at time t+1 any of the R agents has reached the target point and no agent has entered the threat zone, the path planning succeeds and the method ends; if at time t+1 any of the R agents has entered the threat zone, the path planning fails and the method ends; if at time t+1 no agent has met either the success or the failure condition, set t = t + 1, take time t+1 as the new current time, return to step 2-2), and continue with the path planning for the next time step; when the step counter c reaches the preset step limit A and no agent has yet met a success or failure condition, the path planning is recorded as a time-out and the method ends.
CN201810907242.7A 2018-08-08 2018-08-08 Cooperative game path planning method based on neural network and artificial potential field Active CN108827312B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810907242.7A CN108827312B (en) 2018-08-08 2018-08-08 Cooperative game path planning method based on neural network and artificial potential field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810907242.7A CN108827312B (en) 2018-08-08 2018-08-08 Cooperative game path planning method based on neural network and artificial potential field

Publications (2)

Publication Number Publication Date
CN108827312A true CN108827312A (en) 2018-11-16
CN108827312B CN108827312B (en) 2021-10-08

Family

ID=64152803

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810907242.7A Active CN108827312B (en) 2018-08-08 2018-08-08 Cooperative game path planning method based on neural network and artificial potential field

Country Status (1)

Country Link
CN (1) CN108827312B (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109434836A (en) * 2018-12-14 2019-03-08 浙江大学 A kind of manipulator Artificial Potential Field space path planing method of combination ball tree-model
CN109579853A (en) * 2019-01-24 2019-04-05 燕山大学 Inertial navigation indoor orientation method based on BP neural network
CN110108292A (en) * 2019-06-12 2019-08-09 山东师范大学 Vehicle navigation path planing method, system, equipment and medium
CN110275527A (en) * 2019-05-29 2019-09-24 南京航空航天大学 A kind of multi-agent system motion control method based on improvement mimicry physical method
CN110356405A (en) * 2019-07-23 2019-10-22 桂林电子科技大学 Vehicle auxiliary travelling method, apparatus, computer equipment and readable storage medium storing program for executing
CN110488872A (en) * 2019-09-04 2019-11-22 中国人民解放军国防科技大学 A kind of unmanned plane real-time route planing method based on deeply study
CN110646009A (en) * 2019-09-27 2020-01-03 北京邮电大学 DQN-based vehicle automatic driving path planning method and device
CN111123976A (en) * 2019-12-24 2020-05-08 一飞智控(天津)科技有限公司 Unmanned aerial vehicle cluster path planning processing method based on artificial potential field and unmanned aerial vehicle
CN111367285A (en) * 2020-03-18 2020-07-03 华东理工大学 Coordinated formation and path planning method for wheeled mobile trolleys
CN111443733A (en) * 2020-05-25 2020-07-24 深圳市道通智能航空技术有限公司 Unmanned aerial vehicle flight control method and device and unmanned aerial vehicle
CN112269396A (en) * 2020-10-14 2021-01-26 北京航空航天大学 Unmanned aerial vehicle cluster cooperative confrontation control method for eagle pigeon-imitated intelligent game
CN112947562A (en) * 2021-02-10 2021-06-11 西北工业大学 Multi-unmanned aerial vehicle motion planning method based on artificial potential field method and MADDPG
CN112987784A (en) * 2021-02-26 2021-06-18 辽宁工程技术大学 Decision method for unmanned aerial vehicle cluster cooperative countermeasure
CN113110604A (en) * 2021-04-28 2021-07-13 江苏方天电力技术有限公司 Path dynamic planning method based on artificial potential field
CN113126647A (en) * 2019-12-31 2021-07-16 北京理工大学 Collaborative guidance method based on leader and follower principle
CN113155128A (en) * 2021-03-31 2021-07-23 西安电子科技大学 Indoor pedestrian positioning method based on cooperative game UWB and inertial navigation
CN113759900A (en) * 2021-08-12 2021-12-07 中南大学 Inspection robot track planning and real-time obstacle avoidance method and system based on obstacle area prediction
CN114055471A (en) * 2021-11-30 2022-02-18 哈尔滨工业大学 Mechanical arm online motion planning method combining neural motion planning algorithm and artificial potential field method
CN114254722A (en) * 2021-11-17 2022-03-29 中国人民解放军军事科学院国防科技创新研究院 Game countermeasure oriented multi-intelligent model fusion method
CN114779821A (en) * 2022-05-25 2022-07-22 四川大学 Unmanned aerial vehicle self-adaptive repulsion coefficient path planning method based on deep learning

Citations (2)

Publication number Priority date Publication date Assignee Title
CN107589752A (en) * 2017-07-25 2018-01-16 天津大学 Unmanned plane cooperates with formation realization method and system with ground robot
CN108255165A (en) * 2016-12-29 2018-07-06 广州映博智能科技有限公司 Robot path planning method based on neighborhood potential field

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN108255165A (en) * 2016-12-29 2018-07-06 广州映博智能科技有限公司 Robot path planning method based on neighborhood potential field
CN107589752A (en) * 2017-07-25 2018-01-16 天津大学 Unmanned plane cooperates with formation realization method and system with ground robot

Non-Patent Citations (3)

Title
YONG-KYUN NA: "Hybrid Control for Autonomous Mobile Robot Navigation Using Neural Network Based Behavior Modules and Environment Classification", AUTONOMOUS ROBOTS *
WU Zhengping: "AUV Path Planning Based on an Improved Artificial Potential Field Method", Control and Instruments in Chemical Industry *
LI Hong: "Main Methods in Path Planning for Autonomous Mobile Robots", China Electric Power Education *

Cited By (30)

Publication number Priority date Publication date Assignee Title
CN109434836A (en) * 2018-12-14 2019-03-08 浙江大学 A kind of manipulator Artificial Potential Field space path planing method of combination ball tree-model
CN109579853B (en) * 2019-01-24 2021-02-26 燕山大学 Inertial navigation indoor positioning method based on BP neural network
CN109579853A (en) * 2019-01-24 2019-04-05 燕山大学 Inertial navigation indoor orientation method based on BP neural network
CN110275527A (en) * 2019-05-29 2019-09-24 南京航空航天大学 A kind of multi-agent system motion control method based on improvement mimicry physical method
CN110108292A (en) * 2019-06-12 2019-08-09 山东师范大学 Vehicle navigation path planing method, system, equipment and medium
CN110356405A (en) * 2019-07-23 2019-10-22 桂林电子科技大学 Vehicle auxiliary travelling method, apparatus, computer equipment and readable storage medium storing program for executing
CN110488872A (en) * 2019-09-04 2019-11-22 中国人民解放军国防科技大学 A kind of unmanned plane real-time route planing method based on deeply study
CN110646009A (en) * 2019-09-27 2020-01-03 北京邮电大学 DQN-based vehicle automatic driving path planning method and device
CN111123976A (en) * 2019-12-24 2020-05-08 一飞智控(天津)科技有限公司 Unmanned aerial vehicle cluster path planning processing method based on artificial potential field and unmanned aerial vehicle
CN113126647A (en) * 2019-12-31 2021-07-16 北京理工大学 Collaborative guidance method based on leader and follower principle
CN113126647B (en) * 2019-12-31 2022-07-19 北京理工大学 Collaborative guidance method based on leader and follower principle
CN111367285A (en) * 2020-03-18 2020-07-03 华东理工大学 Coordinated formation and path planning method for wheeled mobile trolleys
CN111443733A (en) * 2020-05-25 2020-07-24 深圳市道通智能航空技术有限公司 Unmanned aerial vehicle flight control method and device and unmanned aerial vehicle
WO2021238743A1 (en) * 2020-05-25 2021-12-02 深圳市道通智能航空技术股份有限公司 Flight control method and apparatus for unmanned aerial vehicle, and unmanned aerial vehicle
CN111443733B (en) * 2020-05-25 2023-09-15 深圳市道通智能航空技术股份有限公司 Unmanned aerial vehicle flight control method and device and unmanned aerial vehicle
CN112269396A (en) * 2020-10-14 2021-01-26 北京航空航天大学 Unmanned aerial vehicle cluster cooperative confrontation control method for eagle pigeon-imitated intelligent game
CN112947562A (en) * 2021-02-10 2021-06-11 西北工业大学 Multi-unmanned aerial vehicle motion planning method based on artificial potential field method and MADDPG
CN112947562B (en) * 2021-02-10 2021-11-30 西北工业大学 Multi-unmanned aerial vehicle motion planning method based on artificial potential field method and MADDPG
CN112987784A (en) * 2021-02-26 2021-06-18 辽宁工程技术大学 Decision method for unmanned aerial vehicle cluster cooperative countermeasure
CN112987784B (en) * 2021-02-26 2024-03-26 辽宁工程技术大学 Decision method for unmanned aerial vehicle cluster cooperative countermeasure
CN113155128A (en) * 2021-03-31 2021-07-23 西安电子科技大学 Indoor pedestrian positioning method based on cooperative game UWB and inertial navigation
CN113110604A (en) * 2021-04-28 2021-07-13 江苏方天电力技术有限公司 Path dynamic planning method based on artificial potential field
CN113110604B (en) * 2021-04-28 2022-08-23 江苏方天电力技术有限公司 Path dynamic planning method based on artificial potential field
CN113759900A (en) * 2021-08-12 2021-12-07 中南大学 Inspection robot track planning and real-time obstacle avoidance method and system based on obstacle area prediction
CN114254722A (en) * 2021-11-17 2022-03-29 中国人民解放军军事科学院国防科技创新研究院 Game countermeasure oriented multi-intelligent model fusion method
CN114254722B (en) * 2021-11-17 2022-12-06 中国人民解放军军事科学院国防科技创新研究院 Multi-intelligent-model fusion method for game confrontation
CN114055471B (en) * 2021-11-30 2022-05-10 哈尔滨工业大学 Mechanical arm online motion planning method combining neural motion planning algorithm and artificial potential field method
CN114055471A (en) * 2021-11-30 2022-02-18 哈尔滨工业大学 Mechanical arm online motion planning method combining neural motion planning algorithm and artificial potential field method
CN114779821A (en) * 2022-05-25 2022-07-22 四川大学 Unmanned aerial vehicle self-adaptive repulsion coefficient path planning method based on deep learning
CN114779821B (en) * 2022-05-25 2023-06-27 四川大学 Unmanned aerial vehicle self-adaptive repulsive force coefficient path planning method based on deep learning

Also Published As

Publication number Publication date
CN108827312B (en) 2021-10-08

Similar Documents

Publication Publication Date Title
CN108827312A (en) A kind of coordinating game model paths planning method based on neural network and Artificial Potential Field
Ruan et al. Mobile robot navigation based on deep reinforcement learning
CN106444738B (en) Method for planning path for mobile robot based on dynamic motion primitive learning model
Chen et al. Stabilization approaches for reinforcement learning-based end-to-end autonomous driving
CN110608743A (en) Multi-unmanned aerial vehicle collaborative route planning method based on multi-population chaotic grayling algorithm
Xia et al. Cooperative task assignment and track planning for multi-UAV attack mobile targets
Zhu et al. Task assignment and path planning of a multi-AUV system based on a Glasius bio-inspired self-organising map algorithm
CN106502250B (en) The path planning algorithm of multi-robot formation in three-dimensional space
CN111930121B (en) Mixed path planning method for indoor mobile robot
CN107414830B (en) A kind of carrying machine human arm manipulation multi-level mapping intelligent control method and system
CN108362284A (en) A kind of air navigation aid based on bionical hippocampus cognitive map
Wu et al. Navigating assistance system for quadcopter with deep reinforcement learning
Ciou et al. Composite reinforcement learning for social robot navigation
Cao et al. UAV path planning based on improved particle swarm algorithm
Panigrahi et al. A novel intelligent mobile robot navigation technique for avoiding obstacles using RBF neural network
Gong et al. UAV cooperative air combat maneuvering confrontation based on multi-agent reinforcement learning
Palm et al. Particle swarm optimization of potential fields for obstacle avoidance
Mengying et al. Online path planning algorithms for unmanned air vehicle
CN116562332B (en) Robot social movement planning method in man-machine co-fusion environment
Duo et al. A deep reinforcement learning based mapless navigation algorithm using continuous actions
Chen et al. A multirobot cooperative area coverage search algorithm based on bioinspired neural network in unknown environments
CN105467841B (en) A kind of class nerve control method of humanoid robot upper extremity exercise
Lu et al. Autonomous mobile robot navigation in uncertain dynamic environments based on deep reinforcement learning
Yang et al. An Online Interactive Approach for Crowd Navigation of Quadrupedal Robots
Chen et al. Neural network predictive control for mobile robot using PSO with controllable random exploration velocity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant