CN109085754B

CN109085754B - Spacecraft pursuit game method based on neural network

Info

Publication number: CN109085754B
Application number: CN201810827228.6A
Authority: CN
Inventors: 袁源; 张鹏; 孙冲; 于洋; 万文娅; 李晨
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2018-07-25
Filing date: 2018-07-25
Publication date: 2020-09-04
Anticipated expiration: 2038-07-25
Also published as: GB2577371A; CN109085754A; GB201910670D0

Abstract

The invention discloses a neural-network-based spacecraft pursuit-evasion game method. A discrete system model of the spacecraft pursuit-evasion game is first established; an adaptive dynamic programming iterative control strategy with synchronous convergence for the zero-sum game is then designed, and the optimal control strategy is approximated with a neural network. Because the control strategy is based on adaptive dynamic programming and converges synchronously for the zero-sum game, the method can ensure that the system performance reaches the optimum.

Description

Spacecraft pursuit game method based on neural network

Technical Field

The invention belongs to the field of spacecraft pursuit-evasion control, relates to a zero-sum game optimal control algorithm based on adaptive dynamic programming, and in particular relates to a spacecraft pursuit-evasion game method based on a neural network.

Background

In recent years, the relative motion of spacecraft has received much attention from scholars and research institutions in China and abroad, and the spacecraft pursuit-evasion game in particular has been studied intensively. At present, most spacecraft pursuit-evasion game algorithms are based on a linear dynamic model of the spacecraft, and a linear feedback controller is designed using linear quadratic differential game theory. In an actual system, however, the dynamic model of the spacecraft is strongly nonlinear, and a general linear controller greatly reduces the control performance of the system. It is therefore important to design an adaptive nonlinear controller for the nonlinear model of the spacecraft.

Various game control algorithms have been proposed for the spacecraft pursuit-evasion game. In the spacecraft field, common control methods include the linear L_∞ control method, the linear optimal control method, and the like. However, the actual spacecraft model is nonlinear, and a controller designed with these methods inevitably reduces the control performance of the system. A nonlinear controller for the nonlinear spacecraft system is therefore urgently needed. At present, optimal controllers for nonlinear systems mainly use adaptive dynamic programming algorithms. For the discrete zero-sum game problem, however, no existing adaptive dynamic programming algorithm makes the control strategies converge simultaneously.

Disclosure of Invention

The invention aims to provide a neural-network-based spacecraft pursuit-evasion game method that overcomes the defects of the prior art. The invention adopts a control strategy based on adaptive dynamic programming with synchronous convergence for the zero-sum game, so that the system performance reaches the optimum.

In order to achieve the purpose, the invention adopts the following technical scheme:

A spacecraft pursuit-evasion game method based on a neural network comprises the following steps:

Step 1: establish a discrete system model of the spacecraft pursuit-evasion game;

Step 2: design an adaptive dynamic programming iterative control strategy with synchronous convergence for the zero-sum game, and approximate the optimal control strategy with a neural network.

Further, step 1 specifically comprises:

In the Euler-Hill coordinate system, the nonlinear relative motion equation of the spacecraft is:

where x, y and z are the position coordinates of the spacecraft in the Euler-Hill coordinate system; u_x, u_y and u_z are the control inputs; μ is the gravitational constant; r_c and r_d are the orbit radii of the primary and secondary spacecraft, respectively; ν is the true anomaly of the primary spacecraft orbit; ẋ denotes the first derivative of x, and ẍ denotes the second derivative of x;

Let

The nonlinear relative motion equation can then be written in state-space form:

where

η is the state vector, u is the input vector, B is the input coefficient matrix, and I is the identity matrix;

Therefore, the nonlinear dynamic models of the primary and secondary spacecraft are:

where η_p and u_p are the state vector and control input of the primary spacecraft, and η_e and u_e are the state vector and control input of the secondary spacecraft. Subtracting formula (3) from formula (2) gives the nonlinear pursuit-evasion game model:

where the error state is the difference between η_p and η_e, B_p is the input matrix of the primary spacecraft, and B_e is the input matrix of the secondary spacecraft;

Since the function f(·) has derivatives of every order at any point η_e, the Taylor expansion of f(η_p) about the point η_e is:

which further gives:

where the remainder term is a higher-order infinitesimal of the difference between η_p and η_e, and ∇ is the gradient operator, defined as:

Substituting formula (6) into formula (4) gives:

Then, using the Euler discretization method, equation (7) is discretized as:

where the discrete state is the value of the difference between η_p and η_e at time k, u_p,k is the input of the primary spacecraft at time k, and u_e,k is the input of the secondary spacecraft at time k.
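The chain from the error model to the discrete model can be written out as a short derivation. This is only a sketch under stated assumptions: the symbol \tilde{\eta} for the difference between η_p and η_e and the sampling period T are notation introduced here for illustration, the higher-order remainder is dropped in the third line, and the patent's own formulas (4) to (8) may differ in form.

```latex
\begin{aligned}
\dot{\tilde{\eta}} &= f(\eta_p) - f(\eta_e) + B_p u_p - B_e u_e,
  \qquad \tilde{\eta} = \eta_p - \eta_e \\
f(\eta_p) &= f(\eta_e) + \nabla f(\eta_e)\,\tilde{\eta} + o(\|\tilde{\eta}\|) \\
\dot{\tilde{\eta}} &\approx \nabla f(\eta_e)\,\tilde{\eta} + B_p u_p - B_e u_e \\
\tilde{\eta}_{k+1} &= \tilde{\eta}_k
  + T\left[\nabla f(\eta_e)\,\tilde{\eta}_k + B_p u_{p,k} - B_e u_{e,k}\right]
\end{aligned}
```

The last line is the forward-Euler step: the continuous derivative is replaced by the difference quotient over one sampling period.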

Further, step 2 specifically includes:

Step 2.1: initialize the error threshold, the two admissible control strategies, and the weight matrix, and let s ← 0; where the error threshold is a positive number, the admissible control strategies serve as the initial control strategies, the weight matrix takes its initial value, and s is the iteration index;

Step 2.2: calculate the Hamiltonian residual:

where Q and R are positive definite matrices; γ is a preset positive number; the two control strategies and the weight matrix take their values at iteration s; and e_k,s+1 is the residual. The neural network approximation is expressed as:

where σ(·) is the neural network basis function. The weight matrix is updated as follows:

where θ is a real number between 0 and 1;

Step 2.3: let s ← s + 1 and calculate the value function and the control strategies:

Step 2.4: evaluate the stopping condition. If it is not satisfied, return to step 2.2; if it is satisfied, stop the iteration and output the control strategy.
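The loop structure of steps 2.1 to 2.4 (initialize, iterate, test against a threshold, output both strategies) can be illustrated with a much simpler stand-in. The sketch below is not the patent's update rule, whose formulas are not reproduced in this text: it swaps in the standard value iteration for a scalar linear-quadratic zero-sum game, x_{k+1} = a x + b u + c w with stage cost q x² + r u² − γ² w². All names (`zero_sum_value_iteration`, `a`, `b`, `c`, `eps`) are hypothetical. Both players' gains are computed from the same value iterate, which mirrors the idea of updating the two strategies together.

```python
def zero_sum_value_iteration(a, b, c, q, r, gamma, eps=1e-10, max_iter=10_000):
    """Scalar stand-in for steps 2.1-2.4 (assumed LQ game, not the patent's formulas).

    Dynamics: x_{k+1} = a*x + b*u + c*w; u minimizes and w maximizes
    sum(q*x^2 + r*u^2 - gamma^2*w^2). The value function is V(x) = P*x^2.
    """
    k = b * b / r - c * c / gamma**2
    P = 0.0  # step 2.1: initial value-function weight, iteration index s = 0
    for s in range(1, max_iter + 1):
        if gamma**2 - c * c * P <= 0.0:
            raise ValueError("no saddle point: gamma is too small for this P")
        # steps 2.2-2.3: one sweep of the game Riccati recursion
        P_next = q + a * a * P / (1.0 + k * P)
        converged = abs(P_next - P) < eps  # step 2.4: threshold test
        P = P_next
        if converged:
            break
    else:
        raise RuntimeError("value iteration did not converge")
    lam = 1.0 + k * P
    K_u = a * b * P / (r * lam)         # minimizing player: u = -K_u * x
    K_w = a * c * P / (gamma**2 * lam)  # maximizing player: w = +K_w * x
    return P, K_u, K_w, s


# Example call with arbitrary illustrative numbers.
P, K_u, K_w, sweeps = zero_sum_value_iteration(a=0.9, b=1.0, c=0.5, q=1.0, r=1.0, gamma=2.0)
```

In the patent's setting the scalar P would be replaced by neural network weights and the closed-form recursion by the residual-driven weight update; only the control flow carries over.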

Compared with the prior art, the invention has the following beneficial technical effects:

The discrete adaptive dynamic programming pursuit-evasion game controller designed by the invention is convenient to implement in engineering practice. In addition, the synchronously convergent adaptive dynamic programming iterative algorithm designed by the invention ensures that the control strategies converge synchronously to their optimal values. The algorithm can handle the strong nonlinearity present in the system, obtains an approximately optimal control strategy, and ensures that the system performance is optimal.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is a diagram of simulation results of the present invention.

Detailed Description

The invention is described in further detail below:

To address the strong nonlinearity of the spacecraft model, the invention provides an optimal control strategy based on adaptive dynamic programming with synchronous convergence for the zero-sum game. First, a discrete system model of the spacecraft pursuit-evasion game is established; second, an adaptive dynamic programming iterative control strategy with synchronous convergence for the zero-sum game is designed; finally, the optimal control strategy is approximated with a neural network.

As shown in fig. 1, the specific steps are as follows:

1. Establishing the discrete pursuit-evasion game model

In the Euler-Hill coordinate system, the nonlinear relative motion equation of the spacecraft is:

where x, y and z are the position coordinates of the spacecraft in the Euler-Hill coordinate system; u_x, u_y and u_z are the control inputs; μ is the gravitational constant; r_c and r_d are the orbit radii of the primary and secondary spacecraft, respectively; ν is the true anomaly of the primary spacecraft orbit; ẋ denotes the first derivative of x, and ẍ denotes the second derivative of x.

Let

The nonlinear relative motion equation can then be written in state-space form:

where

η is the state vector, u is the input vector, B is the input coefficient matrix, and I is the identity matrix.
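The state-space form can be sketched in Python. This is a minimal sketch under several assumptions that are not confirmed by the text: a circular primary orbit (so the true anomaly rate is constant), the commonly used nonlinear relative motion equations in the Euler-Hill frame, the state ordering η = [x, y, z, ẋ, ẏ, ż], and an input matrix B = [0; I] that feeds the inputs into the acceleration rows only. The names `relative_dynamics` and `MU` are illustrative, and the patent's own equation may differ in form.

```python
import math

# Earth's gravitational parameter in m^3/s^2 (assumed central body).
MU = 3.986004418e14


def relative_dynamics(eta, u, r_c):
    """Return d(eta)/dt = f(eta) + B u for eta = [x, y, z, vx, vy, vz].

    Assumes a circular primary orbit of radius r_c (metres).
    """
    x, y, z, vx, vy, vz = eta
    ux, uy, uz = u
    n = math.sqrt(MU / r_c**3)  # constant rate of the true anomaly
    # Orbit radius of the secondary spacecraft seen from the central body.
    r_d = math.sqrt((r_c + x) ** 2 + y**2 + z**2)
    ax = 2.0 * n * vy + n * n * x - MU * (r_c + x) / r_d**3 + MU / r_c**2
    ay = -2.0 * n * vx + n * n * y - MU * y / r_d**3
    az = -MU * z / r_d**3
    # B = [0; I]: the control inputs enter only the acceleration rows.
    return [vx, vy, vz, ax + ux, ay + uy, az + uz]


# Example: state derivative 100 m ahead in the radial direction, no thrust.
eta_dot = relative_dynamics([100.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0], 7.0e6)
```

With zero relative state and zero input the derivative vanishes, which is a quick sanity check that the origin is an equilibrium of the relative motion.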

Therefore, the nonlinear dynamic models of the primary and secondary spacecraft are:

where η_p and u_p are the state vector and control input of the primary spacecraft, and η_e and u_e are the state vector and control input of the secondary spacecraft. Subtracting formula (3) from formula (2) gives the nonlinear pursuit-evasion game model:

where the error state is the difference between η_p and η_e, B_p is the input matrix of the primary spacecraft, and B_e is the input matrix of the secondary spacecraft.

Since the function f(·) has derivatives of every order at any point η_e, the Taylor expansion of f(η_p) about the point η_e is:

which further gives:

where the remainder term is a higher-order infinitesimal of the difference between η_p and η_e, and ∇ is the gradient operator, defined as:

Substituting formula (6) into formula (4) gives:

Then, using the Euler discretization method, system (7) is discretized with the sampling period T as:

where the discrete state is the value of the difference between η_p and η_e at time k, u_p,k is the input of the primary spacecraft at time k, and u_e,k is the input of the secondary spacecraft at time k.
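The Euler discretization with sampling period T can be sketched as a single update function. This is a minimal sketch assuming the linearized error dynamics have the form d(err)/dt = A·err + B_p·u_p − B_e·u_e, where `A` stands in for the gradient term evaluated at η_e; the names `mat_vec`, `euler_step`, `A` and `err` are hypothetical and the matrices are plain nested lists.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def euler_step(A, B_p, B_e, err, u_p, u_e, T):
    """One forward-Euler step of d(err)/dt = A err + B_p u_p - B_e u_e."""
    drift = mat_vec(A, err)      # linearized dynamics acting on the error state
    push_p = mat_vec(B_p, u_p)   # contribution of the primary spacecraft input
    push_e = mat_vec(B_e, u_e)   # contribution of the secondary spacecraft input
    return [e + T * (d + p - q) for e, d, p, q in zip(err, drift, push_p, push_e)]


# Example with a toy two-state system (position error and velocity error).
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
err_next = euler_step(A, B, B, err=[1.0, 0.0], u_p=[2.0], u_e=[0.5], T=0.1)
```

Forward Euler is only first-order accurate, so the sampling period has to be small relative to the time scale of the dynamics for the discrete model to track the continuous one.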

2. Design of the adaptive dynamic programming pursuit-evasion game algorithm

The neural-network-based synchronous adaptive dynamic programming iterative algorithm for the pursuit-evasion game is as follows:

1) Initialize the error threshold, the two admissible control strategies, and the weight matrix, where the error threshold is a very small positive number, the admissible control strategies serve as the initial control strategies, the weight matrix takes its initial value, and s is the iteration index. In this example, let s ← 0.

2) Calculate the Hamiltonian residual:

where Q and R are positive definite matrices; γ is a preset positive number; the two control strategies and the weight matrix take their values at iteration s; and e_k,s+1 is the residual. In this example, Q = diag([1 1 1 1 1 1]), R = diag([1 1 1]), and γ = 20.

The neural network approximation is expressed as:

where σ(·) is the neural network basis function. In this example, σ(·) is defined as

σ(x) = [tanh(x₁) tanh(x₂) tanh(x₃) tanh(x₄) tanh(x₅) tanh(x₆)]^T

The weight matrix is updated as follows:

where θ is a real number between 0 and 1.
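The tanh basis function of this example can be written directly in Python. The `sigma` function follows the definition given in the text; the `approximate` helper is an assumption added for illustration, showing one common way such a basis is used (a linear-in-weights output W^T σ(x)), and may not match how the patent combines the weight matrix with σ.

```python
import math


def sigma(x):
    """Basis function from the example: element-wise tanh of the state."""
    return [math.tanh(x_i) for x_i in x]


def approximate(W, x):
    """Assumed linear-in-weights output W^T sigma(x).

    W is a list of rows, one row per basis function, one column per output.
    """
    s = sigma(x)
    n_out = len(W[0])
    return [sum(W[i][j] * s[i] for i in range(len(s))) for j in range(n_out)]


# Example: six basis functions, one scalar output, all weights equal to 1.
W = [[1.0] for _ in range(6)]
value = approximate(W, [0.5, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Because tanh is bounded and smooth, the basis stays in (−1, 1) for any state, which keeps the approximator's output bounded by the size of the weights.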

3) Let s ← s + 1 and calculate the value function and the control strategies:

4) Evaluate the stopping condition. If it is not satisfied, return to step 2). Otherwise, stop the iteration and output the control strategy.

A simulation was carried out with the method of the invention, and FIG. 2 shows the resulting state error. As can be seen from the figure, the state error eventually converges to 0, indicating that the primary spacecraft has tracked the secondary spacecraft and can remain stable. The neural-network-based spacecraft pursuit-evasion game method is therefore effective.

Claims

1. A spacecraft pursuit-evasion game method based on a neural network, characterized by comprising the following steps:

Step 1: establishing a discrete system model of the spacecraft pursuit-evasion game, specifically:

ẋ denotes the first derivative of x, and ẍ denotes the second derivative of x;

let

where

η_p and u_p are the state vector and control input of the primary spacecraft, and η_e and u_e are the state vector and control input of the secondary spacecraft, and the nonlinear pursuit-evasion game model is obtained by subtracting formula (3) from formula (2):

where

further obtaining:

where the remainder term is a higher-order infinitesimal of the difference between η_p and η_e, and ∇ is the gradient operator, defined as:

substituting formula (6) into formula (4) gives:

then, using the Euler discretization method, equation (7) is discretized as:

where the discrete state is the value of the difference between η_p and η_e at time k, u_p,k is the input of the primary spacecraft at time k, and u_e,k is the input of the secondary spacecraft at time k;

2. The spacecraft pursuit-evasion game method based on a neural network according to claim 1, wherein step 2 specifically comprises:

step 2.1: initializing an error threshold