CN110673486A

CN110673486A - Multi-spacecraft pursuit and escape control method based on dynamic game theory

Info

Publication number: CN110673486A
Application number: CN201911003658.7A
Authority: CN
Inventors: 师鹏; 徐添; 邓忠民; 张冉; 赵育善; 王逍
Original assignee: Beijing University of Aeronautics and Astronautics
Current assignee: Beihang University; Beijing University of Aeronautics and Astronautics
Priority date: 2019-10-22
Filing date: 2019-10-22
Publication date: 2020-01-10
Anticipated expiration: 2039-10-22
Also published as: CN110673486B

Abstract

The invention relates to a multi-spacecraft pursuit and escape control method based on dynamic game theory. According to the space environment of the spacecraft, all perturbation forces are ignored and a multi-spacecraft dynamics model (the dynamics equation) is established in the central gravitational field of the Earth; the model comprises three roles: a target spacecraft, a defense spacecraft and an attack spacecraft. A payoff (payment) function of the spacecraft and a terminal target set of the pursuit and escape game are determined from the role relationships in the dynamics model, thereby establishing a multi-spacecraft game model. Dynamic game theory is then applied to obtain a two-point boundary value problem comprising the differential equations of the costate (co-modal) vector, the new dynamics equations expressed through the costate vector, and the transversality condition on time. This two-point boundary value problem is solved by combining a particle swarm algorithm with a nonlinear programming method, which yields the optimal control strategy and the optimal trajectories for the spacecraft pursuit.

Description

Multi-spacecraft pursuit and escape control method based on dynamic game theory

Technical Field

The invention relates to a multi-spacecraft pursuit and escape control method based on a dynamic game theory, and belongs to the technical field of spacecraft control.

Background

By analogy with pursuit on the ground, a relative-motion scenario in which at least two spacecraft take part under the action of control forces, for the purpose of pursuit or counter-pursuit, is called spacecraft space pursuit. Space pursuit technology plays an important role in spacecraft safety, on-orbit servicing and related areas. Because the mission objectives of the spacecraft conflict, their control strategies can be analyzed with the ideas of non-cooperative dynamic games. In a dynamic game (ISAACS R. Differential Games [M]. New York: John Wiley and Sons, 1965: 1-5.), at least one participant can use state information from the earlier course of the confrontation to decide its action at the current moment; if the goals of the participants are not identical, the game is non-cooperative. Since game theory can consider the controls of several spacecraft simultaneously, its application to pursuit and escape problems has attracted the attention of many researchers.

Hayoun (HAYOUN S Y, SHIMA T. A two-on-one linear pursuit-evasion game with bounded controls [J]. Journal of Optimization Theory and Applications, 2017, 174(3): 837-857.) studied, on the basis of a linear model, the many-on-one problem of two pursuers and one evader, and gave a closed-loop control strategy under bounded controls.

Rusnak (RUSNAK I. The lady, the bandits and the body guards - a two team dynamic game [J]. IFAC Proceedings Volumes, 2005, 38(1): 441-446.), in studying games with several participants, divided the participants into two teams: the defending team comprises a protector and a protected party, and the attacking team is the attacker. The author held that the problem can be discussed over two time histories, taking the possible interception of the attacker by the protector as the dividing node, and proposed solving it with a multi-objective optimization method. Yifang Liu et al. (LIU Y F, LI R F, WANG S Q. Orbital three-player differential game using semi-direct collocation with nonlinear programming [C]// Proceedings of 2016 2nd International Conference on Control Science and Systems Engineering. Piscataway, NJ: IEEE Press, 2016: 217-222.) proposed a pursuit and escape game model with three spacecraft, defined a terminal target set and spacecraft index functions suited to the model, and solved it with the method of Pontani (CONWAY B A, PONTANI M. Numerical solution of the three-dimensional orbital pursuit-evasion game [J]. Journal of Guidance, Control, and Dynamics, 2009, 32(2): 474-487.) to obtain the control strategies and optimal trajectories of the spacecraft.
The linear model adopted by Hayoun is suited to relatively close ranges, and although it is a multi-spacecraft game it contains no defense spacecraft, so the roles are limited. The Rusnak model contains all three parties (pursuer, evader and defender), but analyzing the problem over two time histories leads to a complicated index function and is cumbersome to use. The model of Yifang Liu et al. is a simplified multi-spacecraft game model because the protected object, namely the target spacecraft, has no propulsion; it uses the relative terminal positions of the three spacecraft as the index function to simplify the solution, but the index function then duplicates the terminal target set.

In summary, existing models of multi-spacecraft pursuit are incomplete: they split the multi-spacecraft game into several two-spacecraft games and model and solve it over several time histories, which makes the models complicated. Most solution strategies also require an initial guess, so convergence is difficult to guarantee.

Disclosure of Invention

The technical problem solved by the invention is as follows: a multi-spacecraft pursuit and escape control method based on dynamic game theory is provided whose model is simpler in form than traditional models, whose solution method requires no initial guess, and whose resulting control law effectively achieves interception or tracking.

The technical scheme of the invention is as follows: a multi-spacecraft pursuit escape control method based on a dynamic game theory comprises the following steps:

First, according to the space environment at the orbital altitude of the spacecraft, all perturbation forces are ignored and a multi-spacecraft dynamics model (the dynamics equation) is established in the central gravitational field of the Earth; the model comprises three roles: a target spacecraft, a defense spacecraft and an attack spacecraft. The payoff function of the spacecraft and the terminal target set of the pursuit and escape game are determined from the role relationships in the dynamics model, thereby establishing the multi-spacecraft game model.

Second, dynamic game theory is applied to the multi-spacecraft game model established in the first step to obtain a two-point boundary value problem, which comprises the differential equations of the costate (co-modal) vector, the new dynamics equations expressed through the costate vector, and the transversality condition on time.

Third, the two-point boundary value problem obtained in the second step is solved by combining a particle swarm algorithm with a nonlinear programming method, which yields the optimal control strategy and the optimal trajectories for the spacecraft pursuit.

In the first step, the game model of the multi-spacecraft is as follows:

(1) the dynamic equation of the spacecraft is:

where i = D, E, A denote the defense spacecraft, the target (escaping) spacecraft and the attack spacecraft respectively; μ = 3.986005 × 10¹⁴ m³/s² is the gravitational constant of the Earth; r is the magnitude of the spacecraft position vector; x, y and z are the three-axis position components of the spacecraft; T is the control magnitude per unit mass; and θ and α are the control direction angles, defined as follows: θ is the angle between the control vector and the z axis of the inertial frame, and α is the angle between the projection of the control vector onto the x-y plane and the y axis.

(2) According to the role relationships of the spacecraft, the payoff function is determined as follows:

where t₀ and t_f are the start and end times of the game; r_AE and r_DA are the distance between the attack and target spacecraft and the distance between the defense and attack spacecraft respectively; k₁ and k₃ are the payoff weights; and k₂ and k₄ are the rates of change of the payoff with distance.

(3) The target set of spacecraft is as follows:

ψ(r_E(t_f),r_D(t_f),r_A(t_f),t_f)＝F₁(ξ₁)F₂(ξ₂)＝0

I₁＝{(r_A(t_f),r_E(t_f)) | ||r_A(t_f)-r_E(t_f)||＜ε₁}

I₂＝{(r_D(t_f),r_A(t_f)) | ||r_D(t_f)-r_A(t_f)||＜ε₂}

where r_i(t_f) is the terminal position vector of spacecraft i; ξ₁ and ξ₂ denote the combined terminal positions of the attack-target pair and of the attack-defense pair respectively; I₁ and I₂ are the sets of terminal positions of the attack-target pair and of the attack-defense pair that satisfy the capture relationship; and ε₁ ≥ 0 and ε₂ ≥ 0 are the capture ranges of the attack spacecraft and the defense spacecraft respectively.
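The membership tests behind the target set can be illustrated with a short sketch. It is an illustration only, not part of the described method: the function names are hypothetical, positions are plain 3-tuples in any consistent length unit, and the order in which the two conditions are tested when both hold is an arbitrary choice made here.

```python
import math

def in_capture_set(r_attack, r_target, eps1):
    """True if the attack-target pair lies in I1: ||r_A - r_E|| < eps1."""
    return math.dist(r_attack, r_target) < eps1

def in_intercept_set(r_defense, r_attack, eps2):
    """True if the defense-attack pair lies in I2: ||r_D - r_A|| < eps2."""
    return math.dist(r_defense, r_attack) < eps2

def game_outcome(r_attack, r_target, r_defense, eps1, eps2):
    """Return which terminal condition, if any, ends the game."""
    # Assumption: interception is tested first when both conditions hold.
    if in_intercept_set(r_defense, r_attack, eps2):
        return "attacker intercepted"
    if in_capture_set(r_attack, r_target, eps1):
        return "target captured"
    return None  # neither set is reached, the game continues
```

Calling `game_outcome` at each integration step gives a simple stopping test for a simulated engagement.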

The two-point boundary value problem of the second step is as follows:

x(t_f) must satisfy the terminal target set

where x is the position and velocity vector of the spacecraft, λ is the costate (co-modal) vector, t is time, f(·) is the new spacecraft dynamics equation expressed through the costate vector, the costate equation is derived from game theory, ψ′ is the boundary form of the terminal target set, and x(t_f) is the terminal value of the state.

The third step is specifically realized as follows:

(1) A coarse solution of the two-point boundary value problem is obtained with a particle swarm algorithm, whose fitness function is:

where c_i denotes the boundary constraints and the time transversality condition of the two-point boundary value problem of the second step, and k_i > 0 is the weight coefficient of each constraint.

The particles are updated as follows:

where w is the inertia weight; w_max and w_min are the upper and lower bounds of the inertia weight; n is the current generation; M is the total number of generations; c₁ and c₂ are the acceleration constants; r₁ and r₂ are random numbers between 0 and 1; the update also uses the individual extremum in the d-th dimension of the i-th particle; and g^d is the global extremum in the d-th dimension.

(2) The coarse solution of the two-point boundary value problem is refined with a nonlinear programming toolbox, using the particle swarm fitness function as the objective function, to obtain the numerical solution of the equations.

(3) The solution is substituted back into the initial value problem to obtain the optimal control strategy and the optimal trajectories for the spacecraft pursuit.
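The coarse-search stage of item (1) can be sketched as follows. The update formulas themselves are not reproduced in this text, so the sketch assumes the standard particle swarm form that matches the symbols listed above: a linearly decreasing inertia weight w = w_max - (w_max - w_min)·n/M, and a velocity update driven by the individual and global extrema. The function name, the default parameter values and the clamping of positions to fixed bounds are assumptions made for illustration.

```python
import random

def pso_minimize(fitness, bounds, n_particles=30, n_iter=200,
                 w_max=0.9, w_min=0.4, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm minimizer; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]            # individual extrema
    p_val = [fitness(p) for p in pos]
    g_idx = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g_idx][:], p_val[g_idx]   # global extremum
    for n in range(n_iter):
        # Assumed schedule: inertia weight decreases linearly with generation n.
        w = w_max - (w_max - w_min) * n / n_iter
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (p_best[i][d] - pos[i][d])
                             + c2 * r2 * (g_best[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Simplification: clamp to fixed bounds instead of resizing them.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = pos[i][:], val
                if val < g_val:
                    g_best, g_val = pos[i][:], val
    return g_best, g_val
```

The returned point would then serve as the starting point of the refinement in item (2), for example with `scipy.optimize.minimize(fitness, x0, method="Nelder-Mead")` if SciPy is available.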

Compared with the prior art, the invention has the advantages that:

(1) The invention establishes the three-dimensional multi-spacecraft game model under a unified time history. Compared with models that use several time histories, it yields a simpler and clearer payoff function, avoids deriving formulas for each time history separately, and is easier to understand.

(2) The payoff function adopted by the invention is based on the instantaneous relative positions of the spacecraft, so that the target spacecraft and the defense spacecraft can cooperate effectively and the model is closer to reality.

(3) The invention requires no initial guess for the solution and improves the convergence of the solving process.

Drawings

FIG. 1 is a flow chart of a method implementation of the present invention;

FIG. 2 is a schematic diagram of the coordinate system used in the present invention;

FIG. 3 is a diagram of the spacecraft motion trajectory of the present invention;

FIG. 4 is a diagram of the relative distance of the defense spacecraft and the target spacecraft of the present invention;

FIG. 5 is a diagram of the relative distances of the attacking and defending spacecraft of the present invention;

FIG. 6 is a diagram of the control direction angles of the spacecraft of the present invention.

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings and examples.

As shown in FIG. 1, the invention is a multi-spacecraft pursuit and escape control method based on dynamic game theory. For a game system of three spacecraft, it establishes, under a unified time history, a game model in which the target spacecraft has maneuvering capability, and it obtains the control strategies and optimal trajectories of the spacecraft with a new index function and a new solution method. The method is as follows:

The invention establishes a multi-spacecraft dynamics model in the central gravitational field of the Earth and gives the payoff function of each spacecraft and the terminal target set of the pursuit and escape game. Using the necessary conditions for the existence of a saddle-point solution, it obtains the first-order and second-order conditions on the control strategy of each spacecraft and converts the game into a two-point boundary value problem, which is solved by combining a particle swarm algorithm with a nonlinear programming method. This yields the optimal control strategy and the optimal trajectories of the spacecraft pursuit and escape.

1. Spacecraft attack and defense model

The model comprises a target spacecraft, an attack spacecraft and a defense spacecraft. The target spacecraft is what the attack spacecraft tries to capture, and the defense spacecraft tries to intercept the attack spacecraft in order to protect the target spacecraft. The game terminates when the attack spacecraft captures the target spacecraft or when the defense spacecraft intercepts the attack spacecraft.

1.1 Dynamics model

The dynamics model of the spacecraft is established in the vernal-equinox geocentric inertial coordinate system. As shown in FIG. 2, the origin of coordinates is at the center of the Earth, the X axis points toward the vernal equinox, and the Z axis is perpendicular to the equatorial plane and points toward the North Pole.

Considering only the central gravity of the Earth and ignoring all perturbations and the mass consumption of the spacecraft, the dynamics equation of the spacecraft takes the following form:

where i = E, D, A denote the target, defense and attack spacecraft; μ = 3.986005 × 10¹⁴ m³/s² is the gravitational constant of the Earth; and u is the control per unit mass, parameterized by T, θ and α. T is the magnitude of the control, and θ and α are the control direction angles of the spacecraft, defined as follows: θ is the angle between the control vector and the z axis of the inertial frame, and α is the angle between the projection of the control vector onto the x-y plane and the y axis. The control magnitude T is assumed constant over the whole game, so the control parameters of the game are the direction angles of the thrust.
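A numerical right-hand side for this model can be sketched as below. The equation itself is not reproduced in this text, so the sketch relies on two assumptions: the control enters as a thrust acceleration added to two-body gravity, and α is measured from the y axis toward the x axis, which gives thrust components T·(sin θ sin α, sin θ cos α, cos θ). The function names are hypothetical.

```python
import math

MU = 3.986005e14  # m^3/s^2, gravitational constant of the Earth as given in the text

def thrust_accel(T, theta, alpha):
    """Thrust acceleration components from magnitude T and direction angles.

    Assumed convention: theta is measured from the z axis, and alpha is
    measured from the y axis toward the x axis within the x-y plane.
    """
    return (T * math.sin(theta) * math.sin(alpha),
            T * math.sin(theta) * math.cos(alpha),
            T * math.cos(theta))

def state_derivative(state, T, theta, alpha, mu=MU):
    """Time derivative of (x, y, z, vx, vy, vz): central gravity plus thrust."""
    x, y, z, vx, vy, vz = state
    r = math.sqrt(x * x + y * y + z * z)
    k = -mu / r ** 3                      # two-body gravity coefficient
    ux, uy, uz = thrust_accel(T, theta, alpha)
    return (vx, vy, vz, k * x + ux, k * y + uy, k * z + uz)
```

The same function serves all three spacecraft; only the state, the thrust magnitude and the direction angles differ between them.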

If the control magnitude of the target spacecraft were always larger than that of the attack spacecraft, the target spacecraft could not be captured as long as its control direction were chosen properly, and the terminal target set would lose its meaning. The following assumption is therefore made:

T_D ＞ T_A ＞ T_E (2)

The dynamics equation can be expanded as:

where x, y and z are the three-axis position components of the spacecraft. The states of the three spacecraft are written in a unified form as:

where each component is

The dynamics equation above can then be written as:

1.2 Terminal target set

Since the game duration t_f is unknown, a terminal target set must be defined; the game terminates when the relative positions of the spacecraft meet the requirements of the set. The target set is built from the terminal relative positions:

ψ(r_E(t_f),r_D(t_f),r_A(t_f),t_f)＝F₁(ξ₁)F₂(ξ₂)＝0 (5)

where the functions used are defined as:

In these expressions, r_i(t_f) is the terminal position vector of spacecraft i; ξ₁ and ξ₂ denote the combined terminal positions of the attack-target pair and of the attack-defense pair respectively; I₁ and I₂ are the sets of terminal positions of the attack-target pair and of the attack-defense pair that satisfy the capture relationship; and ε₁ ≥ 0 and ε₂ > 0 are the capture radii of the attack spacecraft and the defense spacecraft. From the above, the terminal time of the game is defined as:

2. Solving the spacecraft pursuit strategy

2.1 Payoff function of the players

A payoff function expresses the objective each spacecraft expects to achieve; commonly used payoff functions include terminal distance constraints, engagement time and fuel consumption. In the multi-spacecraft game studied here, however, the engagement time is not a point of conflict among the spacecraft; because the thrust per unit mass is assumed constant during the game, fuel consumption is in essence equivalent to engagement time; and a terminal distance constraint duplicates the terminal target set in both role and form. Conventional payoff functions therefore do not suit the multi-spacecraft game model.

In the invention, an index containing the instantaneous relative positions of the spacecraft is used as the payoff function. The following instantaneous payoff is defined first:

where r_AE and r_DA are the distance between the attack and target spacecraft and the distance between the defense and attack spacecraft respectively; k₁ and k₃ are payoff weights that measure how the attack spacecraft trades off moving away from the defense spacecraft against continuing to track the target spacecraft; and k₂ and k₄ are the rates of change of the payoff with distance.

During the game, the attack spacecraft evaluates its position relative to the target spacecraft and the defense spacecraft in order to choose between capturing the target first and avoiding interception first. The target spacecraft and the defense spacecraft, conversely, hope to cooperate to intercept the attack spacecraft, or at least to delay the capture of the target spacecraft. The payoff function of the game is expressed as:

A complete three-spacecraft pursuit-defense model is thus defined. To treat the problem with dynamic game theory, the Nash equilibrium solution and the necessary conditions for its existence must be introduced so that the game can be converted into a two-point boundary value problem.

2.2 Nash equilibrium solution and constraint processing

In a single-objective game with n players, let s_i denote the benefit of player i, let c ＝ (c₁, c₂, ···, c_i, ···, c_n) denote a combination of control strategies, where c_i is the control strategy chosen by the i-th player, and let C denote the set of admissible strategy combinations. A strategy combination c* ＝ (c₁*, c₂*, ···, c_n*) ∈ C is a Nash equilibrium if and only if, for every player i and every admissible c_i:

s_i(c_i*, c_-i*) ≥ s_i(c_i, c_-i*)

where c_-i denotes the control strategies of the players other than the i-th, and c_i* is the optimal strategy choice of the i-th player.
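The definition can be checked mechanically on a small finite game. The sketch below is only an illustration of the equilibrium condition: the game treated in this document is a continuous differential game, whereas the example uses a hypothetical two-player table with two strategies each, and the function names are not taken from the source.

```python
from itertools import product

def is_nash(payoffs, profile, strategy_counts):
    """True if no player gains by deviating alone from `profile`.

    payoffs(profile) returns a tuple of benefits s_i, one per player;
    profile is a tuple holding one strategy index per player.
    """
    base = payoffs(profile)
    for i, n_i in enumerate(strategy_counts):
        for alt in range(n_i):
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if payoffs(deviated)[i] > base[i]:
                return False  # player i does better by switching to alt
    return True

def pure_nash_equilibria(payoffs, strategy_counts):
    """Enumerate every pure-strategy Nash equilibrium of a finite game."""
    profiles = product(*(range(n) for n in strategy_counts))
    return [p for p in profiles if is_nash(payoffs, p, strategy_counts)]

# Hypothetical example: a prisoner's dilemma, 0 = cooperate, 1 = defect.
TABLE = {(0, 0): (-1, -1), (0, 1): (-3, 0), (1, 0): (0, -3), (1, 1): (-2, -2)}
equilibria = pure_nash_equilibria(TABLE.__getitem__, (2, 2))
```

In this example mutual defection is the only profile from which neither player can improve by changing strategy alone.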

To obtain the control strategies of the spacecraft, the Hamiltonian is introduced:

H＝L+λ^T f(x,u,t)＝L+λ_A^T f_A(x_A,u_A,t)+λ_D^T f_D(x_D,u_D,t)+λ_E^T f_E(x_E,u_E,t) (11)

where λ is the costate (co-modal) vector. From the necessary conditions for the existence of a saddle point, the differential equations of the costate vector are:

Inspection shows that the costate differential equations of the three spacecraft are similar in form, and that for each spacecraft the components along the three axes are also similar, so only the costate differential equations of the defense spacecraft are derived below. Let H₁＝L and H₂＝λ^T f(x,u,t);

then:

It can be seen that the costate vector is a function of the state, which is written as:

The terminal value of the costate vector is related to the terminal target set. The terminal constraint of equation (5) is converted to:

ψ′＝(||r_A(t_f)-r_E(t_f)||-ε₁)·(||r_D(t_f)-r_A(t_f)||-ε₂)＝0 (16)

This is the boundary form of the terminal target set: the game ends when the distance between the attack spacecraft and the target spacecraft is exactly ε₁, or when the distance between the attack spacecraft and the defense spacecraft is exactly ε₂. According to game theory, the terminal value of the costate vector lies along the outward normal of the boundary of the terminal target set, and its magnitude is given by the partial derivative of the boundary of the target set:

The terminal costate values of the three spacecraft are similar in form; that of the attack spacecraft is:

The terminal costate values of the other two spacecraft are obtained in the same way, and the following relations are observed:

λ_A,x(t_f)+λ_E,x(t_f)+λ_D,x(t_f)＝0

λ_A,y(t_f)+λ_E,y(t_f)+λ_D,y(t_f)＝0 (19)

λ_A,z(t_f)+λ_E,z(t_f)+λ_D,z(t_f)＝0

A saddle-point strategy solution must satisfy:

If the control is unconstrained, the first-order conditions for optimal control of the spacecraft follow from this equation:

Assuming sin θ_i ≠ 0, then

Substituting any α into equation (21) yields two values of θ, so that four combinations of control direction angles are obtained; a second-order condition on the control equation is therefore required:

For the attack spacecraft, the Hessian matrix h must satisfy h ≥ 0; for the defense spacecraft and the target spacecraft it must satisfy h ≤ 0. At any instant only one pair of control direction angles satisfies the first-order and second-order conditions simultaneously. The resulting control equation is written as:

Substituting equation (22) into equation (4) gives:

The state at the initial time is given as:

ψ₀(x_{A,0}, x_{D,0}, x_{E,0}, t₀)＝0 (24)

Since the game duration is free, the transversality condition on time is:

Equations (15), (17) and (23)-(25) convert the dynamic game of three spacecraft into a two-point boundary value problem. Its unknowns are the state vectors x_A, x_D, x_E of the three spacecraft, their controls u_A, u_D, u_E, the costate vectors λ_A, λ_D, λ_E, and the game duration t_f.
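The costate differential equations of this section follow the usual pattern in which the rate of λ is the negative partial derivative of H with respect to the state. As a hedged sketch, the code below evaluates only the part contributed by H₂ = λᵀf for one spacecraft under two-body gravity; the part contributed by H₁ = L depends on the payoff formula, which is not reproduced in this text, and is therefore left out. The split of λ into a position part `lam_r` and a velocity part `lam_v`, and the function name, are assumptions made for illustration.

```python
import math

MU = 3.986005e14  # m^3/s^2, gravitational constant of the Earth as given in the text

def costate_rates_gravity(r_vec, lam_r, lam_v, mu=MU):
    """Costate rates produced by H2 = lam_r . v + lam_v . (-mu r / |r|^3 + u).

    Returns (lam_r_dot, lam_v_dot) with
      lam_r_dot = -dH2/dr = mu * (lam_v / r^3 - 3 (lam_v . r) r / r^5)
      lam_v_dot = -dH2/dv = -lam_r
    The thrust term u does not depend on the state, so it does not appear.
    """
    r = math.sqrt(sum(c * c for c in r_vec))
    dot = sum(l * c for l, c in zip(lam_v, r_vec))
    lam_r_dot = tuple(mu * (lv / r ** 3 - 3.0 * dot * c / r ** 5)
                      for lv, c in zip(lam_v, r_vec))
    lam_v_dot = tuple(-lr for lr in lam_r)
    return lam_r_dot, lam_v_dot
```

Integrating these rates together with the state equations gives the differential part of the two-point boundary value problem once the payoff-dependent term is added to `lam_r_dot`.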

3. Solving strategy based on particle swarm and downhill simplex method

For the multi-spacecraft game model, both convergence and the accuracy of the result are very sensitive to the initial value of the costate (co-modal) vector λ, so the method uses a particle swarm algorithm, an optimization algorithm that requires no initial guess. The accuracy of particle swarm optimization is, however, limited by the population size and the number of generations, and balancing run time against accuracy rules out very large populations or generation counts, so in some cases the algorithm cannot guarantee convergence accuracy. The particle swarm result is therefore refined further with the downhill simplex method, a nonlinear programming method for unconstrained problems. When solving the established model, the downhill simplex method can use the fitness function of the particle swarm algorithm directly, which saves the time and difficulty of constraint handling.
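The refinement stage can be illustrated with a compact downhill simplex routine. This is a simplified variant written for illustration, not the toolbox routine used by the inventors: it uses reflection, expansion, inside contraction and shrink steps with conventional coefficients, and the function name, the initial step and the stopping rule are assumptions.

```python
def nelder_mead(f, x0, step=0.1, max_iter=3000, tol=1e-8):
    """Minimize f from x0 with a basic downhill simplex; returns (x_best, f_best)."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex offset along each coordinate axis.
    simplex = [list(x0)] + [
        [x + (step if d == k else 0.0) for d, x in enumerate(x0)] for k in range(n)
    ]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=values.__getitem__)
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        best, worst = simplex[0], simplex[-1]
        size = max(abs(p[d] - best[d]) for p in simplex for d in range(n))
        if size < tol and values[-1] - values[0] < tol:
            break  # simplex has collapsed and the values agree
        centroid = [sum(p[d] for p in simplex[:-1]) / n for d in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_r = f(reflected)
        if f_r < values[0]:
            # Reflection beat the best vertex, so try going further.
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_e = f(expanded)
            simplex[-1], values[-1] = (expanded, f_e) if f_e < f_r else (reflected, f_r)
        elif f_r < values[-2]:
            simplex[-1], values[-1] = reflected, f_r
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_c = f(contracted)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    k = min(range(n + 1), key=values.__getitem__)
    return simplex[k], values[k]
```

In the two-stage scheme, `x0` would be the best particle found by the swarm and `f` the same fitness function, so no separate constraint handling is needed in the refinement.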

1) Fitness function

In the established two-point boundary value problem, the weighted sum of the relevant constraints is taken as the fitness function of the particle swarm algorithm:

where c_i denotes the constraints, which consist of equations (16), (17) and (25); k_i > 0 is the weight of each constraint; and n_c is the number of constraints.
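A fitness function of this kind can be sketched in a few lines. The exact formula is not reproduced in this text; the sketch assumes a weighted sum of squared constraint residuals, which is zero exactly when every constraint is met. Summing absolute values instead would be an equally plausible reading. The function name is hypothetical.

```python
def constraint_fitness(residuals, weights):
    """Weighted sum of squared constraint residuals (assumed form).

    residuals: values c_i of the boundary and transversality constraints,
               each of which is zero when the constraint is satisfied.
    weights:   positive weights k_i, one per constraint.
    """
    if len(residuals) != len(weights):
        raise ValueError("one weight is required per constraint")
    if any(k <= 0 for k in weights):
        raise ValueError("weights must be positive")
    return sum(k * c * c for k, c in zip(weights, residuals))
```

A candidate solution that satisfies all constraints scores zero, so both the particle swarm and the simplex refinement drive this value toward zero.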

2) Particle setup

Each particle consists of the initial costate values of the three spacecraft (6 × 3) and the game duration t_f, 19 dimensions in total. Because the limits of each dimension cannot be determined in advance, the costate vector is normalized and the particle swarm algorithm handles the bounds with a dynamic interval. An initial guess interval (a_i, b_i) is given for the particles, where i = 1, 2, ···, 19 is the particle dimension, and N particles are generated randomly within it. If, during the particle update, some element j of the current global best particle leaves (a_j, b_j), an expansion factor γ > 1 is applied to widen the interval to the new limits (γ·a_j, γ·b_j), M particles are regenerated within the new limits, the N fittest particles are selected from the resulting N + M particles, and the iteration continues to correct the range of the limits.
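The dynamic-interval handling can be sketched as follows. The sketch assumes that each interval straddles zero (a_j < 0 < b_j), so that multiplying both ends by γ widens it, and that a fitter particle is one with a smaller fitness value. The function names and the default value of γ are hypothetical.

```python
import random

def exceeded_dims(g_best, bounds):
    """Indices j at which the global best lies outside (a_j, b_j)."""
    return [j for j, (v, (a, b)) in enumerate(zip(g_best, bounds)) if not (a < v < b)]

def expand_bounds(bounds, dims, gamma=1.5):
    """Scale the listed intervals to (gamma * a_j, gamma * b_j)."""
    widened = list(bounds)
    for j in dims:
        a, b = widened[j]
        widened[j] = (gamma * a, gamma * b)
    return widened

def reseed_and_select(particles, fitness, bounds, m, rng):
    """Add m fresh particles within bounds and keep the N fittest of N + m."""
    fresh = [[rng.uniform(a, b) for a, b in bounds] for _ in range(m)]
    pool = sorted(particles + fresh, key=fitness)  # smaller fitness is better
    return pool[:len(particles)]
```

Inside a swarm loop, `exceeded_dims` would be evaluated after each update of the global best, and the other two functions called only when it returns a non-empty list.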

For the element t_f, a small quantity δ is set as the lower limit on time; this limit is not changed during the iterations, that is:

4. Simulation setup

To verify the effectiveness of the established game model and of the solution strategy, the pursuit model with three spacecraft is simulated numerically. First, to avoid an ill-conditioned matrix caused by large differences in magnitude among the parameters of the equations, the equations are nondimensionalized: the equatorial radius of the Earth R_E is taken as the distance unit (DU), and the time unit (TU) is defined such that the gravitational constant of the Earth is μ = 1, giving:

where g is the gravitational acceleration of the Earth. The thrust magnitudes of the spacecraft are T_A ＝ 0.04g, T_D ＝ 0.1g and T_E ＝ 0.01g, and a capture range is set for the spacecraft. The initial state of each spacecraft is given by orbital elements which denote, in order, the semi-major axis, eccentricity, orbital inclination, right ascension of the ascending node, argument of perigee (ω) and mean anomaly (M). For a circular orbit, the mean argument of latitude u ＝ ω + M is used instead. The initial orbital elements of the spacecraft are:

TABLE 1 Initial orbital elements of the spacecraft

In the payoff of the spacecraft, the coefficients of the instantaneous payoff are k₁＝k₂＝k₃＝k₄＝1. The particle swarm size is 150 and the number of generations is 200; the simplex method is iterated 3000 times, and the accuracy requirement is 10⁻⁸.
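The canonical units used in this setup can be computed directly. The sketch below assumes an equatorial radius of 6378137 m, a value not stated in the text, and assumes that g is the gravitational acceleration at that radius, μ/R_E², so that 1 g equals 1 DU/TU². The constant and function names are hypothetical.

```python
import math

MU = 3.986005e14   # m^3/s^2, gravitational constant of the Earth as given in the text
R_E = 6378137.0    # m, assumed equatorial radius (not stated in the text)

DU = R_E                          # distance unit in meters
TU = math.sqrt(DU ** 3 / MU)      # time unit in seconds, chosen so that mu = 1 DU^3/TU^2
G0 = MU / R_E ** 2                # m/s^2, assumed meaning of g

def accel_to_canonical(a_si):
    """Convert an acceleration from m/s^2 to DU/TU^2."""
    return a_si * TU ** 2 / DU

def time_to_canonical(t_seconds):
    """Convert a time from seconds to TU."""
    return t_seconds / TU

T_A = accel_to_canonical(0.04 * G0)   # attack spacecraft thrust, DU/TU^2
T_D = accel_to_canonical(0.1 * G0)    # defense spacecraft thrust, DU/TU^2
```

Under these assumptions one TU is roughly 807 s, and thrust levels expressed as multiples of g carry over unchanged as numbers in DU/TU².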

The invention establishes a three-dimensional game model of multiple spacecraft that treats the game process under a unified time history and avoids a piecewise index function. The analysis based on dynamic games considers the control strategies of several spacecraft simultaneously and is closer to the adversarial requirements of the scenario.

The model uses an index containing the instantaneous relative positions of the spacecraft. This index strengthens the link between the target spacecraft and the defense spacecraft so that they cooperate to a degree, and it prompts the attack spacecraft to weigh the relative importance of its two games.

The model is solved by combining the particle swarm algorithm with the downhill simplex method. The traditional particle swarm algorithm is modified in its handling of particle bounds: a dynamic interval is used, which avoids having to guess the particle interval precisely.

The simulation results are shown in FIGS. 3 to 6. As FIG. 3 shows, with the given initial states and control conditions the defense spacecraft successfully intercepts the attack spacecraft; the interception time is 856.9 s and the terminal relative distance is 1.16 m. FIG. 4 shows the distance between the target spacecraft and the defense spacecraft when the target spacecraft performs the optimal maneuver and when it moves naturally. The distance is smaller under the optimal maneuver: while ensuring its own safety, the target spacecraft puts greater payoff pressure on the attack spacecraft by approaching the defense spacecraft, and thereby cooperates effectively with the defense spacecraft.

FIG. 5 shows the relative distance between the attack and defense spacecraft; all three axis components converge almost to 0. FIG. 6 shows the time histories of the control direction angles of the three spacecraft. The curves are smooth overall, with an abrupt change near the end of the game: the velocity costate is close to zero at that time, so the direction angles solved from equation (21) are singular there and should be evaluated with L'Hôpital's rule.
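The origin of this singularity can be seen from a sketch of how direction angles follow from the velocity costate. The sketch rests on several assumptions: the thrust components are T·(sin θ sin α, sin θ cos α, cos θ), the Hamiltonian depends on the control only through the product of the velocity costate and the thrust, the attack spacecraft minimizes while the defense and target spacecraft maximize, and the thrust therefore points along or against the velocity costate. The function name and the threshold `eps` are hypothetical.

```python
import math

def control_angles(lam_v, maximize, eps=1e-12):
    """Direction angles (theta, alpha) aligned with the velocity costate.

    maximize=True  gives thrust along  lam_v (assumed: defense and target).
    maximize=False gives thrust against lam_v (assumed: attack spacecraft).
    Returns None when |lam_v| is below eps, where the direction is undefined.
    """
    norm = math.sqrt(sum(c * c for c in lam_v))
    if norm < eps:
        return None  # singular case: the angles cannot be read off directly
    sign = 1.0 if maximize else -1.0
    dx, dy, dz = (sign * c / norm for c in lam_v)
    theta = math.acos(max(-1.0, min(1.0, dz)))  # angle from the z axis
    alpha = math.atan2(dx, dy)                  # angle from the y axis toward x
    return theta, alpha
```

As the norm of the velocity costate approaches zero, small numerical changes in its components produce large swings in both angles, which is consistent with the abrupt change seen near the end of the game.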

In summary, by analyzing the spacecraft pursuit-defense problem, the invention constructs a game model under a single time history, solves a multi-player optimal control problem with dynamic game theory, and provides optimal control strategies and optimal trajectories that serve the interests of each spacecraft in the model. The model is simple and realistic, the solution method requires no initial guess, and the approach is widely applicable.

Although particular embodiments of the present invention have been described above, it will be appreciated by those skilled in the art that these are merely examples and that many variations or modifications may be made to these embodiments without departing from the principles and implementations of the invention, the scope of which is therefore defined by the appended claims.

Claims

1. A multi-spacecraft pursuit and escape control method based on dynamic game theory, characterized by comprising the following steps:

first, establishing a dynamics model of multiple spacecraft in the Earth's central gravitational field, the model also being called the dynamics equation and comprising three roles: a target spacecraft, a defending spacecraft and an attacking spacecraft; determining the payoff function of the spacecraft and the termination target set of the pursuit and escape game according to the role relationships of the spacecraft in the dynamics model, thereby establishing a multi-spacecraft game model;

2. The multi-spacecraft pursuit and escape control method based on dynamic game theory as claimed in claim 1, characterized in that: in the first step, the multi-spacecraft game model is as follows:

(1) the dynamics equation of the spacecraft is:

where i = D, E, A denote the defending spacecraft, the target spacecraft and the attacking spacecraft respectively; μ = 3.986005 × 10¹⁴ m³/s² is the Earth's gravitational constant; r is the magnitude of the spacecraft's position vector; x, y and z are the three-axis position components of the spacecraft; T is the control magnitude per unit mass of the spacecraft; θ and α are the control direction angles of the spacecraft, defined as follows: θ is the angle between the control vector and the z axis of the inertial frame, and α is the angle between the projection of the control vector onto the x-y plane and the y axis;

(2) according to the role relationships of the spacecraft in the dynamics model, the payoff function is:

where t₀ and t_f are the start and end times of the game; r_AE and r_DA are the relative distance between the attacking and target spacecraft and the relative distance between the defending and attacking spacecraft, respectively; k₁ and k₃ are the payoff weights; k₂ and k₄ are the rates of change of the payoff with distance;

(3) the termination target set of the spacecraft is as follows:

ψ(r_E(t_f), r_D(t_f), r_A(t_f), t_f) = F₁(ξ₁)F₂(ξ₂) = 0

I₁ = {(r_A(t_f), r_E(t_f)) | ||r_A(t_f) - r_E(t_f)|| < ε₁}

I₂ = {(r_D(t_f), r_A(t_f)) | ||r_D(t_f) - r_A(t_f)|| < ε₂}

where r_i(t_f) is the terminal position vector of spacecraft i; ξ₁ and ξ₂ are the combinations of terminal positions of the attack-target spacecraft pair and the attack-defense spacecraft pair, respectively; I₁ and I₂ are the sets of terminal positions of the attack-target pair and of the attack-defense pair that satisfy the capture relationship; ε₁ ≥ 0 and ε₂ ≥ 0 are the capture ranges of the attacking spacecraft and the defending spacecraft, respectively;

the dynamics equation, the payoff function and the termination target set together form the multi-spacecraft game model.
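For illustration only, the dynamics of item (1) may be sketched as below. The expression is an assumption reconstructed from the stated definitions (two-body gravity with constant μ, plus a control acceleration of magnitude T whose direction is given by θ from the z axis and α from the y axis); in particular, the sign of the x component of the control is assumed, and the function names are hypothetical.

```python
import math

MU = 3.986005e14  # Earth's gravitational constant, m^3/s^2 (value from the text)


def acceleration(pos, T, theta, alpha):
    """Inertial acceleration of one spacecraft (perturbations ignored).

    pos   : (x, y, z) position in metres
    T     : control magnitude per unit mass, m/s^2
    theta : angle between the control vector and the z axis, rad
    alpha : angle between the x-y projection of the control and the y axis, rad
    """
    x, y, z = pos
    r = math.sqrt(x * x + y * y + z * z)
    k = -MU / r ** 3                       # central gravity coefficient
    sin_t = math.sin(theta)
    ax = k * x + T * sin_t * math.sin(alpha)  # assumed sign convention
    ay = k * y + T * sin_t * math.cos(alpha)
    az = k * z + T * math.cos(theta)
    return ax, ay, az


def state_derivative(state, T, theta, alpha):
    """Time derivative of the state (x, y, z, vx, vy, vz)."""
    x, y, z, vx, vy, vz = state
    ax, ay, az = acceleration((x, y, z), T, theta, alpha)
    return (vx, vy, vz, ax, ay, az)
```

The same function would be evaluated once for each role i = D, E, A with that spacecraft's own state and control direction angles.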

3. The multi-spacecraft pursuit and escape control method based on dynamic game theory as claimed in claim 1, characterized in that: in the second step, the two-point boundary value equations obtained from the two-point boundary value problem comprise:

the differential equation of the costate vector:

where x is the position and velocity vector of the spacecraft, whose terminal value must satisfy the termination target set of the first step, λ is the costate vector, and t is time;

the new dynamics equation expressed in terms of the costate vector:

this equation is derived jointly from the dynamics equation of the first step and game theory;

the transversality condition on time:

where x(t_f) is the terminal value of the state and ψ′ is the boundary form of the termination target set, expressed as follows:

ψ′ = (||r_A(t_f) - r_E(t_f)|| - ε₁)·(||r_D(t_f) - r_A(t_f)|| - ε₂) = 0

where r_i(t_f) is the terminal position vector of spacecraft i, and ε₁ ≥ 0 and ε₂ ≥ 0 are the capture ranges of the attacking spacecraft and the defending spacecraft, respectively;

the above equations together form the two-point boundary value equations.
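As a minimal sketch of how the boundary form ψ′ might be evaluated numerically, the following computes the product of the two capture margins and classifies a terminal configuration. The function names and the order in which the two capture conditions are tested are assumptions made for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two position vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def psi_boundary(r_A, r_E, r_D, eps1, eps2):
    """Boundary form psi' of the termination target set.

    It is zero when the attacker is exactly at capture range eps1 from the
    target, or the defender is exactly at capture range eps2 from the attacker.
    """
    return (distance(r_A, r_E) - eps1) * (distance(r_D, r_A) - eps2)


def game_outcome(r_A, r_E, r_D, eps1, eps2):
    """Classify a terminal configuration (test order is an assumption)."""
    if distance(r_D, r_A) < eps2:
        return "defender intercepts attacker"
    if distance(r_A, r_E) < eps1:
        return "attacker captures target"
    return "game continues"
```

In a shooting-type solution, `psi_boundary` would serve as one of the terminal residuals to be driven to zero together with the transversality condition.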

4. The multi-spacecraft pursuit and escape control method based on dynamic game theory as claimed in claim 1, characterized in that: the third step is specifically implemented as follows:

where c_i denote the boundary constraints and the time transversality condition of the two-point boundary value equations in the second step, and k_i > 0 is the weight coefficient of each constraint;

the particles are updated as follows:

is the individual extremum of the i-th particle in the d-th dimension, and g^d is the global extremum in the d-th dimension;

(2) using a nonlinear programming toolbox to refine the coarse solution of the two-point boundary value equations, with the objective function identical to the fitness function of the particle swarm algorithm, thereby obtaining the numerical solution of the equations, namely the initial value λ₀ of the costate vector λ;

(3) substituting the solution into the initial value equation, which has the form:

x(0) = x₀, λ(0) = λ₀

where x is the position and velocity vector of the spacecraft, λ is the costate vector, t is time, x₀ is the initial position and velocity of the spacecraft, λ₀ is the numerical solution of the two-point boundary value equations, f(·) is the new dynamics equation,

is the costate equation;

thereby numerically solving for the optimal control strategies θ_i, α_i and the optimal trajectories x_i of the spacecraft pursuit and escape,

where θ and α are the control direction angles of the spacecraft, and i denotes the defending spacecraft, the target spacecraft and the attacking spacecraft.
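The two-stage solution of the third step (coarse particle swarm search followed by local refinement) may be illustrated by the sketch below. Several assumptions apply: the fitness is taken as a weighted sum of squared constraint residuals k_i·c_i², the particle update is the common inertia-weight form, a simple compass (pattern) search stands in for the nonlinear programming toolbox, and `residuals` is a hypothetical two-variable toy problem rather than the actual boundary and transversality conditions.

```python
import random


def residuals(lam):
    # Hypothetical stand-in for the constraints c_i(lambda_0).
    return [lam[0] ** 2 + lam[1] - 3.0, lam[0] - lam[1] + 1.0]


def fitness(lam, weights=(1.0, 1.0)):
    # Assumed fitness: weighted sum of squared residuals, weights k_i > 0.
    return sum(k * c * c for k, c in zip(weights, residuals(lam)))


def pso(f, dim, bounds, n_particles=30, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Coarse global search; no initial guess is required."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # individual extrema
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global extremum
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def compass_refine(f, x0, step=0.1, tol=1e-10, max_iter=10000):
    """Local refinement of the coarse solution (stand-in for an NLP solver)."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for d in range(len(x)):
            for s in (step, -step):
                trial = x[:]
                trial[d] += s
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5  # shrink the pattern when no direction improves
    return x, fx


coarse, coarse_val = pso(fitness, dim=2, bounds=(-5.0, 5.0))
lam0, lam0_val = compass_refine(fitness, coarse)
```

In the method of the claim, the refined `lam0` would play the role of λ₀ and be combined with x₀ to integrate the initial value problem forward in time.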