CN109917647B

CN109917647B - Teaching and learning algorithm improved based on teaching strategy and liquid-filled spacecraft optimization sliding mode control method

Info

Publication number: CN109917647B
Application number: CN201910166219.1A
Authority: CN
Inventors: 肖玲斐; 何虹兴; 申斌; 马磊明; 叶志锋
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2019-03-06
Filing date: 2019-03-06
Publication date: 2020-12-11
Anticipated expiration: 2039-03-06
Also published as: CN109917647A

Abstract

The invention provides a teaching and learning algorithm improved through its teaching strategy, and a sliding mode control method for a liquid-filled spacecraft optimized with that algorithm. Drawing on actual teaching practice, the invention adjusts the teaching strategy of the teaching and learning algorithm through group learning and in-depth mutual learning among teachers, yielding a teaching and learning algorithm based on teaching strategy improvement (SMTLBO). For a liquid-filled spacecraft, a typical under-actuated system with strong nonlinearity and strong coupling, an intermediate variable is introduced to decouple the system and a time-varying sliding mode surface is constructed. The system state variables are used as the input of the SMTLBO algorithm, which dynamically computes the sliding mode surface parameters at the current moment, so that these parameters are adjusted dynamically. The designed sliding mode controller ensures the stability of the liquid-filled spacecraft system.

Description

Teaching and learning algorithm improved based on teaching strategy and liquid-filled spacecraft optimization sliding mode control method

Technical Field

The invention relates to a control system optimization technology, in particular to a teaching and learning method based on teaching strategy improvement and a liquid-filled spacecraft optimization sliding mode control method.

Background

At present, spacecraft are commonly required to be small, light and capable of long on-orbit operation, so most of them use liquid fuel with a high calorific value as propellant. As the aerospace industry develops, spacecraft take on more and more tasks, and the proportion of liquid propellant they carry gradually increases. During the dynamic motion of a spacecraft, the liquid fuel inside it tends to slosh, which disturbs the spacecraft and affects the stability of its control. Once the disturbance and impact of the sloshing fuel exceed the regulation range of the control system or the load limit of the structure, the control system can become unstable or the structure can be damaged. Sloshing of the liquid fuel during spacecraft operation is therefore an important problem that has to be addressed.

Because conventional passive control means often greatly increase the overall mass of the spacecraft, researchers at home and abroad now widely adopt active control methods. Mahmut Reyhanoglu designed a Lyapunov-based nonlinear feedback control law; considering the parameter uncertainty caused by fuel consumption during spacecraft maneuvers, Schchenjian et al. designed an adaptive hierarchical sliding mode controller; Hesham Shaker, Taogang et al. studied the equivalent pendulum model of a spacecraft with liquid sloshing and designed adaptive pole-placement attitude control for the spacecraft in translation. However, active control cannot act on the liquid sloshing directly; it can only suppress the sloshing indirectly by adjusting the spacecraft attitude, so the liquid-filled spacecraft is usually regarded as an under-actuated system. An under-actuated system is a nonlinear system whose number of independent control variables is smaller than its number of degrees of freedom; such a system has a simple structure and lends itself to overall dynamic analysis and experiment. At the same time, because of strong nonlinearity, parameter perturbation, multi-objective control requirements, limited control authority and similar factors, an under-actuated system is complex enough to serve as a convenient test bed for studying and verifying various algorithms. In the literature (X. Rong, U. Ozguner. Sliding Mode Control of a Class of Underactuated Systems [J]. Automatica, 2008, 44: 233-.

Teaching-Learning-Based Optimization (TLBO) is a swarm intelligence optimization algorithm proposed by Rao et al. in 2010; it simulates the teaching process of a teacher and the learning process of students to obtain an optimal solution. The algorithm has few parameters, a simple structure and a concise concept, together with high solution accuracy, fast convergence and strong convergence capability. Compared with classical intelligent optimization algorithms such as particle swarm optimization, it has only two parameters, the population size and the number of iterations, so it largely avoids the low computational efficiency or local convergence that improper parameter settings can cause. As an emerging intelligent optimization algorithm, however, TLBO still has problems, such as a tendency toward premature convergence and a rapid loss of population diversity. These stem from the "teaching" process, which is essentially a process in which the students quickly approach a single teacher. During this process the diversity of the population is lost rapidly and the population comes to resemble the teacher individual, so the algorithm easily falls into local convergence.

Disclosure of Invention

The purpose of the invention is as follows: in order to promote the research on the theory and the application of the intelligent optimization algorithm in the control system, the invention aims at the attitude control problem of the liquid-filled spacecraft,

the technical scheme is as follows:

a teaching and learning algorithm improved based on a teaching strategy comprises the following steps:

1) initializing the class: randomly generating each student $X^j = (x_1^j, x_2^j, \ldots, x_d^j)$, j = 1, 2, …, NP, of the class in the search space according to the following formula:

$$x_i^j = x_i^L + \mathrm{rand}(0,1) \times (x_i^U - x_i^L) \quad (1)$$

wherein $x_i^U$ and $x_i^L$ are respectively the upper and lower bounds of the i-th dimension, i = 1, 2, …, d;

2) dividing the class into num groups, evaluating the fitness values of the kth group, and selecting the individual with the best fitness value as the teacher individual of the kth group

The teaching process of the individual i in the kth group member is represented by the following formula (2) and formula (3):

in the formula:

and

respectively represent the values of the ith student of the kth group before and after learning; k = 1, 2, …, num;

wherein

is the mean value of the kth group, TF_i^k = 1 + rand(0, 0.5) is the teaching factor of the teacher, and r_i^k = rand(0.5, 1) is the learning step length of the student;

3) for each student

another member of the same group

is selected at random, and the difference between the student and

is analyzed to make a learning adjustment;

wherein

is the value of the student after the learning adjustment, and

is the value of the student before the learning adjustment;

4) for each teacher individual

another teacher individual

is randomly selected for a learning adjustment:

4.1) let k = 1; for the teacher individual

randomly select another teacher individual

4.2) let y = 1; substitute the y-th dimension component of

for the y-th dimension component of

to generate a trial solution; if the fitness value of the trial solution is better than that of

the trial solution replaces it;

4.3) let y = y + 1 and repeat step 4.2) until y = d, at which point the mutual learning of this teacher is finished;

4.4) let k = k + 1 and repeat steps 4.2) and 4.3) until k = num, at which point the learning exchange of all teachers is finished;

4.5) compare the fitness values of all teachers and take the teacher with the best fitness value as the global optimal solution X_best;

5) self-learning by all members: for any individual Xⁱ, the self-learning process operates as follows:

wherein randn is a normally distributed random number,

X^U and X^L are respectively the maximum and minimum values of the actual individuals in the population; T is the maximum allowed number of iterations, and t is the current iteration number;

6) judging whether a termination condition is met, if so, terminating the algorithm and outputting an optimal solution; otherwise, jumping to the step 2), and continuing the iterative computation.

A liquid-filled spacecraft optimization sliding-mode control method based on teaching strategy improved teaching and learning algorithm comprises the following steps:

1) the liquid fuel of the liquid-filled spacecraft is modeled as an equivalent simple pendulum; the x-axis and z-axis form the inertial coordinate system; the spacecraft has a constant thrust T and an axial velocity v_x along the body; the dry mass of the spacecraft is m, and the mass of the fuel in the tank is m_f; the moment of inertia of the spacecraft about the center of the tank is I, and the moment of inertia of the fuel about the center of the tank is I_f; the distance from the center of mass of the spacecraft to the suspension point of the pendulum is b, and the pendulum length is a; the attitude angle of the spacecraft is θ, and the liquid sloshing angle is represented by the swing angle of the equivalent pendulum;

the control inputs of the system are the transverse control force F acting at the center of mass of the spacecraft and the pitching moment M_y; v_z is the transverse velocity produced by the control inputs;

is the energy dissipation coefficient;

2) obtaining a liquid-filled spacecraft dynamics equation of liquid sloshing:

equation (6) is simplified to:

Let

We then have:

wherein

Through the transformation of equation (11), the control inputs F and M_y of the system are converted into u₁ and u₂; accordingly, equations (7) to (9) are converted into the following form:

wherein

3) designing a sliding mode controller whose design objective is to make the system state variables

reach zero in finite time; the system is divided into a subsystem containing

which is the under-actuated subsystem (15), and a subsystem containing

which is the fully actuated subsystem (16):

for the subsystem (16), a control law is designed according to a general sliding mode control method:

where c₂ is the sliding mode surface parameter of subsystem (16), and ρ₂ and ₂ are the coefficients of the exponential reaching term and the constant-rate reaching term corresponding to the subsystem sliding mode surface s₂, satisfying c₂, ρ₂, ₂ > 0;

For subsystem (15), a new state variable η is introduced to replace

Differentiating both sides of equation (19) gives:

thereby converting the subsystem (15) into the form:

for the subsystem (21), define

The sliding mode function is designed as follows

s₁＝μ₁-Mμ₂ (23)

where M = [m₁, m₂] is the sliding mode surface matrix.

Let

Then

The control law of subsystem (21) is designed as follows:

where ρ₁ and ₁ are the coefficients of the exponential reaching term and the constant-rate reaching term corresponding to s₁, satisfying ρ₁, ₁ > 0;

Define the Lyapunov function as:

then

The sliding mode reaching condition is satisfied, so there exists a time t₁ such that for t ≥ t₁, s₁ = μ₁ − Mμ₂ = 0; that is, on the sliding surface s₁ = 0 we have μ₁ = Mμ₂;

Then:

where Δ_μ is a vector of very small real numbers; substituting

we obtain

where

the solution for M is:

4) solving for the value of M with the teaching and learning algorithm improved based on the teaching strategy.

In step 4), the sliding mode controller of the liquid-filled spacecraft is designed in the neighborhood of the equilibrium point

from which we obtain

The value of M is taken as the solution target, with

as input; from the current

M(t) is calculated such that

approximates A_{μ→0} μ₂(t); define

The fitness function is taken as

With the number of iterations G = 20 and the population size NP = 20, the population is divided evenly into num = 3 groups, and the optimal solution M(t) at the current time is obtained.

Advantageous effects: drawing on actual teaching practice, the invention provides a teaching and learning algorithm improved based on the teaching strategy through group learning and in-depth mutual learning among teachers. For the liquid-filled spacecraft, an intermediate variable is introduced to decouple the system and a time-varying sliding mode surface is constructed. The system state variables are used as the input of the SMTLBO algorithm, the sliding mode surface parameters at the current moment are obtained through dynamic SMTLBO calculation, and the parameters are thereby adjusted dynamically. The designed sliding mode controller ensures the stability of the liquid-filled spacecraft system.

Drawings

Fig. 1 is a box plot of the SMTLBO calculation results on the Rosenbrock function.

Fig. 2 is a box plot of the SMTLBO calculation results on the Griewank function.

Fig. 3 is a box plot of the SMTLBO calculation results on the Rastrigin function.

Fig. 4 is a distribution diagram of the optimal solution calculated by SMTLBO in one run with the number of iterations G = 25.

Fig. 5 is a liquid-filled spacecraft model.

FIGS. 6 to 12 are graphs of the simulation results for the case where

holds.

Fig. 13 is a graph showing the variation of M calculated by SMTLBO in a dynamic process.

FIGS. 14 to 22 are graphs of the simulation results for the cases where

and

hold.

Detailed Description

The invention is further elucidated with reference to the drawings and the embodiments.

Basic TLBO algorithm:

for one optimization problem:

search space

Any search point in the space is X = (x₁, x₂, …, x_d), where d is the dimension of the search space (the number of decision variables),

and

are the upper and lower bounds of each dimension, respectively, and f(X) is the objective function. Let

be a point in the search space, and let

be a decision variable of the point X^j; NP is the number of search points (i.e., the population size). The corresponding concepts in the basic TLBO algorithm are: 1) Class: the set of all search points in the search space is called the class; 2) Student: any individual in the class

is called a student; 3) Teacher: the best-performing student X_best in the class is called the teacher, denoted by X_teacher in the invention.

Thus, a class may be represented in the form:

where X^j (j = 1, 2, …, NP) denotes a student in the class and X_teacher = argmax f(X^j) (j = 1, 2, …, NP). NP is the number of students, and d is the number of subjects the students study.

The teaching and learning optimization algorithm (TLBO) is as follows:

1) Initializing the class: each student in the class is randomly generated in the search space according to the following formula:

2) The "teaching" stage: in the teaching stage of the TLBO algorithm, the best-performing student X_best is selected as the teacher X_teacher. The students learn according to the difference between the teacher and the student average Mean in each subject, so that each student's score improves to some extent and the class average rises. Note that the amount of knowledge a student can acquire depends on the difference between the teacher and the class average, on the teaching factor of the teacher and on the learning ability of the student, so the room for improvement in the teaching stage is limited.

Assume that the students' subject scores follow a normal distribution. At the beginning the class average is Mean_A = 30: the average is low and the distribution is wide. After the teacher has taught repeatedly, the class average gradually rises to Mean_B = 80: the scores improve and the distribution becomes concentrated. The specific teaching method is given by formulas (2) and (3).

$$X_{new}^i = X_{old}^i + \mathrm{difference} \quad (2)$$

$$\mathrm{difference} = r_i \times (X_{teacher} - TF_i \times Mean) \quad (3)$$

where $X_{old}^i$ and $X_{new}^i$ respectively represent the values of the i-th student before and after learning, and Mean represents the average of all students. The formulas contain two important parameters: the teaching factor of the teacher, $TF_i = \mathrm{round}[1 + \mathrm{rand}(0,1)]$, and the learning step length of the student, $r_i = \mathrm{rand}(0,1)$. The former characterizes the teaching ability of the teacher, and the latter the learning ability of the student.
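The teaching stage can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes that a higher objective value is better (following X_teacher = argmax f), that the update simply adds the difference of formula (3) to the student's current value (the usual TLBO form), and that one scalar r_i is drawn per student. The function name `teacher_phase` and the `rng` argument are hypothetical.

```python
import random


def teacher_phase(students, fitness, rng=random):
    """One 'teaching' step of basic TLBO (sketch; higher fitness is better)."""
    d = len(students[0])
    teacher = max(students, key=fitness)            # X_teacher = argmax f(X^j)
    mean = [sum(s[i] for s in students) / len(students) for i in range(d)]
    updated = []
    for x_old in students:
        tf = round(1 + rng.random())                # TF_i = round[1 + rand(0,1)], i.e. 1 or 2
        r = rng.random()                            # r_i = rand(0,1), assumed scalar per student
        # difference = r_i * (X_teacher - TF_i * Mean), added to the old value
        x_new = [x_old[i] + r * (teacher[i] - tf * mean[i]) for i in range(d)]
        updated.append(x_new)
    return updated
```

The sketch returns the candidate values only; whether a candidate replaces the old value is decided by the separate update operation described in the text.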

3) The "learning" stage: the "learning" stage refers to mutual learning among students, in which students learn by comparing and analyzing the differences between them. For each student Xⁱ (i = 1, 2, …, NP), a learning partner X^j (j = 1, 2, …, NP, j ≠ i) is randomly chosen from the class, and Xⁱ makes a learning adjustment by analyzing the difference between itself and X^j. This is similar to the differential mutation operator of the differential evolution algorithm; the difference is that the learning step r in the teaching and learning algorithm uses a different learning factor for each student. Students Xⁱ and X^j compare their objective function values (i.e., their learning results) and the worse one moves toward the better one; in this way the students learn from each other and improve. The specific adjustment process is given by formula (4):

where r_i is the learning step length of the i-th student, and r_i = U(0,1).

4) The "update" operation: each student performs the update operation after the "teaching" stage and after the "learning" stage. The purpose of the update operation is to replace an inferior individual with the better individual obtained after learning, so as to improve the average performance of all students. The update operation is as follows:

End.
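The learning stage and the update operation can be sketched together in Python. The form used here is the one commonly given for TLBO and is an assumption with respect to formula (4): the student moves away from a worse partner or toward a better one, and the new value is kept only if it improves the objective (higher is better, as with X_teacher = argmax f). The name `learner_phase` is hypothetical.

```python
import random


def learner_phase(students, fitness, rng=random):
    """One 'learning' step with greedy update (sketch; needs at least 2 students)."""
    n, d = len(students), len(students[0])
    result = [list(s) for s in students]            # work on a copy
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])   # learning partner, j != i
        r = rng.random()                                   # r_i = U(0,1)
        xi, xj = result[i], result[j]
        # Assumed form: step along (X^i - X^j) if X^i is better, otherwise along (X^j - X^i).
        sign = 1.0 if fitness(xi) > fitness(xj) else -1.0
        x_new = [xi[t] + sign * r * (xi[t] - xj[t]) for t in range(d)]
        # Update operation: keep the new value only if it is better.
        if fitness(x_new) > fitness(xi):
            result[i] = x_new
    return result
```

Because of the greedy update, no student's objective value can get worse in this step, which is the property the update operation is meant to guarantee.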

In the standard teaching and learning algorithm, the teaching process is a process in which the population members approach a single good individual; this easily causes a loss of population diversity and leads to local convergence. Having a single teacher teach the entire class is not only inefficient but also prone to local convergence, so the invention adopts a grouped teaching method and proposes SMTLBO. When the main parameters of the algorithm are initialized, the population members are divided evenly into num groups, and the subsequent teaching process is carried out on this basis.

1) Grouped teaching process

Evaluating the adaptive value of the kth subgroup, and selecting the individual with the optimal adaptive value as the teacher individual of the kth subgroup

Thus, the teaching process for individual i among the kth group members can be represented by equations (5) and (6):

in the formula:

and

respectively represent the values of the ith student of the kth group before and after learning. Unlike in TLBO, grouping makes the members of each group differ considerably, so using the class-wide student average of TLBO would degrade the teaching result. Grouping also reduces the number of students each group's teacher is responsible for, so the teaching process can be more targeted. Therefore, we may let:

The other two parameters, the teaching factor TF_i^k and the learning step length r_i^k, are changed accordingly. Considering that students are more receptive, understand better and are more willing to learn under targeted teaching, we may let: TF_i^k = 1 + rand(0, 0.5), r_i^k = rand(0.5, 1).
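A minimal Python sketch of the grouped teaching process follows. Only the parameter ranges TF_i^k = 1 + rand(0, 0.5) and r_i^k = rand(0.5, 1) come from the text. The group mean used below is the plain arithmetic mean of the subgroup, and the update mirrors the basic TLBO difference formula; both are assumptions, since the corresponding formulas are not reproduced here. The helper names are hypothetical.

```python
import random


def split_into_groups(students, num):
    """Divide the class evenly into num groups (round-robin split, an assumption)."""
    return [students[k::num] for k in range(num)]


def grouped_teacher_phase(groups, fitness, rng=random):
    """Teaching step applied inside each group (sketch; higher fitness is better)."""
    new_groups = []
    for group in groups:
        d = len(group[0])
        teacher = max(group, key=fitness)           # teacher individual of the k-th group
        # ASSUMPTION: plain subgroup mean; the patent's own group-mean formula may differ.
        mean = [sum(s[i] for s in group) / len(group) for i in range(d)]
        new_group = []
        for x_old in group:
            tf = 1 + rng.uniform(0.0, 0.5)          # TF_i^k = 1 + rand(0, 0.5)
            r = rng.uniform(0.5, 1.0)               # r_i^k = rand(0.5, 1)
            new_group.append(
                [x_old[i] + r * (teacher[i] - tf * mean[i]) for i in range(d)]
            )
        new_groups.append(new_group)
    return new_groups
```

With NP = 12 and num = 3, as in the benchmark settings of this description, `split_into_groups` yields three groups of four students each.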

2) Group learning process

The group learning process consists of two parts: mutual learning among the members within a group, and mutual learning among the group representatives. Mutual learning within a group follows the basic teaching and learning algorithm: for a member

another member of the same group

is selected at random, and the difference between the member and

is analyzed to make a learning adjustment; the specific adjustment method follows equation (4).

Mutual learning among group representatives is mutual learning among the teachers of the groups. Since the quality of the teacher individual is one of the most important factors determining the fitness of its group, the teacher individuals must learn from each other comprehensively and in depth. For a teacher individual

another teacher individual

is randomly selected for a learning adjustment. Unlike learning between students, learning between teachers is not limited to the difference between the two; it also includes the exchange of knowledge and experience. The invention therefore introduces the idea of co-evolution: the teacher individuals exchange internal information through their components in the same dimension, and the current teacher individual is optimized dimension by dimension. The specific steps are as follows:

2.1) let k = 1; for the teacher individual

randomly select another teacher individual

2.2) let y = 1; substitute the y-th dimension component of

for the y-th dimension component of the current teacher individual to generate a trial solution; if the fitness value of the trial solution is better than that of the current teacher individual,

the trial solution replaces it.

2.3) let y = y + 1 and repeat step 2.2) until y = d, at which point the iteration ends.

2.4) let k = k + 1 and repeat steps 2.2) and 2.3) until k = num, at which point the learning exchange of all teachers ends.

2.5) compare the fitness values of all teachers and take the teacher with the best fitness value as the global optimal solution X_best.
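Steps 2.1) to 2.5) can be sketched in Python as follows. The sketch assumes a higher fitness value is better and re-selects the partner teacher for every k; the function name `teacher_exchange` is hypothetical and at least two teachers are required.

```python
import random


def teacher_exchange(teachers, fitness, rng=random):
    """Dimension-by-dimension mutual learning among group teachers (sketch)."""
    teachers = [list(t) for t in teachers]          # work on a copy
    num, d = len(teachers), len(teachers[0])
    for k in range(num):
        # 2.1) pick another teacher at random
        other = rng.choice([m for m in range(num) if m != k])
        for y in range(d):
            # 2.2) trial solution: copy the y-th component from the other teacher
            trial = list(teachers[k])
            trial[y] = teachers[other][y]
            if fitness(trial) > fitness(teachers[k]):
                teachers[k] = trial                 # keep the trial only if it is better
    # 2.5) the best teacher is taken as the global optimal solution X_best
    best = max(teachers, key=fitness)
    return teachers, best
```

For example, with the objective −(x₁² + x₂²) and the two teachers (0, 5) and (5, 0), the first teacher takes the zero second component of the other one, and the exchange yields X_best = (0, 0).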

3) Full-member self-learning process

Compared with the basic teaching and learning algorithm, a self-learning process is added so that the algorithm does not lose population diversity prematurely during the teaching and mutual learning processes. As the algorithm proceeds, the fitness of the population keeps improving and its distribution becomes more concentrated, so an adaptive learning step is used to adjust the local search capability and to avoid the low late-stage convergence accuracy caused by a fixed-step local search. For any individual Xⁱ, the self-learning process operates as follows:

where randn is a normally distributed random number,

X^U and X^L are respectively the maximum and minimum values of the actual individuals in the population, T is the maximum allowed number of iterations, and t is the current iteration number.

In order to prevent the self-learning process from causing the deterioration of the population adaptive value, the teacher individual of the current iteration is compared with the teacher individual of the previous generation, and the teacher individual with the better adaptive value is reserved.
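The self-learning formula itself is not reproduced in this text, so the following Python sketch is only one plausible reading of the description: a normally distributed perturbation scaled by the current spread of the population (X^U − X^L, taken per dimension) and by a factor that shrinks linearly from 1 to 0 as t approaches T. The linear schedule, the per-dimension spread and the function names are all assumptions.

```python
import random


def self_learning(population, t, T, rng=random):
    """Adaptive-step self-learning (sketch under the assumptions stated above)."""
    d = len(population[0])
    upper = [max(x[i] for x in population) for i in range(d)]   # X^U per dimension
    lower = [min(x[i] for x in population) for i in range(d)]   # X^L per dimension
    shrink = 1.0 - t / T                                        # ASSUMED step schedule
    return [
        [x[i] + rng.gauss(0.0, 1.0) * shrink * (upper[i] - lower[i]) for i in range(d)]
        for x in population
    ]


def keep_better_teacher(prev_teacher, curr_teacher, fitness):
    """Retain whichever of the two teachers has the better (higher) fitness."""
    return curr_teacher if fitness(curr_teacher) > fitness(prev_teacher) else prev_teacher
```

The second helper corresponds to the safeguard described above: the teacher of the current iteration is compared with the teacher of the previous generation and the better one is kept, so self-learning cannot worsen the best solution found so far.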

The invention selects the Rosenbrock function, the Griewank function and the Rastrigin function as test functions. The population size is set to NP = 12, the maximum number of iterations to G = 100 and the number of subgroups to num = 3. Each function is computed independently 20 times, and the results are collected and compared with previous results. The statistical results are shown in Table 1:

Table 1: Statistics of the calculation results for the three test functions

As can be seen from Table 1 and Figs. 1 to 3, SMTLBO is clearly better than the other algorithms on the Rosenbrock function and converges fully to the global optimal solution, which shows that SMTLBO has strong convergence and high computational efficiency. The statistics on the multimodal Griewank and Rastrigin functions also demonstrate the effectiveness of SMTLBO. Although the magnitude of the fitness value is within the acceptable range, the standard deviation of the SMTLBO results is relatively large, which indicates that SMTLBO was not stable over the 20 runs and that several runs gave relatively poor results. Given the strong convergence of the algorithm, it tends to fall into local convergence when the number of iterations is insufficient. Fig. 4 plots the distribution of the optimal solution calculated by SMTLBO in one run with the number of iterations G = 25. The figure shows that the local optimal solutions are distributed in the neighborhood of the global optimal solution X = 0, and that the components in a number of dimensions are trapped at the local optimum x_i = 1. This shows that SMTLBO risks falling into local convergence on multimodal functions and that its local search capability is relatively weak. As can also be seen from Figs. 1 to 3, most of the 20 runs converge to the global optimal solution, and only a few runs end with relatively large fitness values. SMTLBO is therefore reliable in most cases, but it remains prone to local convergence on multimodal problems.
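For reference, the three test functions can be written in Python using their standard textbook definitions; the dimension and search ranges used in the experiments are not stated in this text and are therefore left to the caller. All three are minimization problems with a global minimum value of 0.

```python
import math


def rosenbrock(x):
    """Rosenbrock function; global minimum 0 at x = (1, ..., 1)."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


def griewank(x):
    """Griewank function; global minimum 0 at x = (0, ..., 0)."""
    s = sum(v * v for v in x) / 4000.0
    p = math.prod(math.cos(v / math.sqrt(i + 1)) for i, v in enumerate(x))
    return s - p + 1.0


def rastrigin(x):
    """Rastrigin function; global minimum 0 at x = (0, ..., 0)."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)
```

The Rastrigin function has local minima close to every integer coordinate, so a component that settles near 1 instead of 0 adds roughly 1 to the function value. To use these functions with a "higher is better" convention, their negative can be passed as the fitness.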

Fig. 5 shows the liquid-filled spacecraft model. Under microgravity the liquid surface is dominated by surface tension, so its sloshing can be approximated by an equivalent simple pendulum; with the suspension point of the pendulum taken as the center of the spherical tank, the liquid sloshing angle is equivalent to the swing angle of the pendulum, as shown in Fig. 5.

The fuel of the liquid-filled spacecraft, modeled as an equivalent simple pendulum, is assumed to be rigid. The x-axis and z-axis form the inertial coordinate system. The spacecraft has a constant thrust T and an axial velocity v_x along the body; the dry mass of the spacecraft is m, and the mass of the fuel in the tank is m_f; the moment of inertia of the spacecraft about the center of the tank is I, and the moment of inertia of the fuel about the center of the tank is I_f; the distance from the center of mass of the spacecraft to the suspension point of the pendulum is b, and the pendulum length is a; the attitude angle of the spacecraft is θ, and the liquid sloshing angle is represented by the swing angle of the equivalent pendulum;

is the energy dissipation coefficient.

The spacecraft dynamics equation of the liquid sloshing can be obtained:

Considering that the thrust T is large, the effect of liquid sloshing on the acceleration in the x-axis direction cannot be ignored entirely but is small, so v_x can be regarded as an external variable, and equation (8) can be simplified to:

Let

We then have:

wherein

Through the transformation of equation (13), the control inputs F and M_y of the system are converted into u₁ and u₂, so that equations (9) to (11) can be converted into the following form:

wherein

A controller can thus be designed on the simplified system. The design objective of the controller is to make the system state variables

reach zero in finite time, thereby suppressing the liquid sloshing and keeping the attitude of the spacecraft stable.

As can be seen from equations (14) to (16), the number of independent control variables of the system is smaller than its number of degrees of freedom, so the liquid-filled spacecraft is an under-actuated system, and general design methods cannot effectively control all of its variables.

The system is divided into a subsystem containing

which is the under-actuated subsystem (17), and a subsystem containing

which is the fully actuated subsystem (18):

For subsystem (18), a control law can be designed according to the general sliding mode control method:

where c₂ is the sliding mode surface parameter of subsystem (18), and ρ₂ and ₂ are the coefficients of the exponential reaching term and the constant-rate reaching term corresponding to the subsystem sliding mode surface s₂, satisfying c₂, ρ₂, ₂ > 0. It is easy to verify that this control law satisfies the sliding mode reaching condition of subsystem (18), and that

can reach zero in finite time.

For subsystem (17), the designed control law u₂ contains only terms related to

and, when

we have u₂ → 0. Therefore, in subsystem (17),

and u₂ can be regarded as external variables, and subsystem (17) can be treated as a single-input, multi-output under-actuated system.

To eliminate the control input u₁ that appears in

a new state variable η is introduced to replace

Differentiating both sides of equation (21) gives:

the subsystem (17) can thus be converted into the following form:

for the subsystem (23), define

The sliding mode function is designed as follows

s₁＝μ₁-Mμ₂ (25)

where M = [m₁, m₂] is the sliding mode surface matrix.

Let

Then

The control law of subsystem (23) can be designed as follows:

where ρ₁ and ₁ are the coefficients of the exponential reaching term and the constant-rate reaching term corresponding to s₁, satisfying ρ₁, ₁ > 0.

Define the Lyapunov function as:

then

The sliding mode reaching condition is satisfied, so there exists a time t₁ such that for t ≥ t₁, s₁ = μ₁ − Mμ₂ = 0; that is, on the sliding surface s₁ = 0 we have μ₁ = Mμ₂.
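The finite-time reaching behaviour can be illustrated numerically. The sketch below assumes the usual exponential reaching law, ds/dt = −ρ·s − k·sgn(s), for which V = s²/2 gives dV/dt = −ρ·s² − k·|s| ≤ 0. The symbol `k` for the constant-rate gain is hypothetical, since the original symbol is lost in this text, and the values 2 and 0.2 in the usage note are the gains listed for s₁ in the embodiment.

```python
import math


def simulate_reaching(s0, rho, k, dt=1e-3, t_end=5.0):
    """Euler simulation of ds/dt = -rho*s - k*sgn(s) (assumed reaching law)."""
    s, t = s0, 0.0
    history = [(t, s)]
    while t < t_end:
        sgn = (s > 0) - (s < 0)
        s_next = s + dt * (-rho * s - k * sgn)
        if s * s_next < 0:          # crossed s = 0 within one step: stay on the surface
            s_next = 0.0
        s = s_next
        t += dt
        history.append((t, s))
    return history


def reaching_time(s0, rho, k):
    """Analytic time at which s reaches zero under the assumed reaching law."""
    return math.log(1.0 + rho * abs(s0) / k) / rho
```

For s(0) = 1, ρ = 2 and k = 0.2, the analytic reaching time is ln(11)/2, about 1.2 s, and the simulated |s| decreases monotonically to zero and stays there, which is the behaviour the reaching condition expresses.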

If M is chosen appropriately so that μ₁ → 0 when μ₂ → 0, the control objective is achieved. In the vicinity of the equilibrium point

we have

and therefore:

where Δ_μ is a vector of very small real numbers; substituting

we obtain

where

Considering that

and u₂ are both regarded as external variables, and that there exists a time t₁ such that for t ≥ t₁

holds, if the vector M can be designed so that the matrix A_μ is Hurwitz, then μ₂ reaches zero in finite time and remains stable, so that μ₁ → 0 and subsystem (17) is exponentially stable.
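For a second-order matrix, the Hurwitz property can be checked directly from the characteristic polynomial s² + c₁s + c₀, where c₁ = −trace and c₀ = determinant: the matrix is Hurwitz exactly when c₁ > 0 and c₀ > 0. The Python sketch below assumes A_μ is 2×2, which is consistent with the quadratic (s + 2)² = s² + 4s + 4 used in this description; the example matrix is a hypothetical companion form, not the patent's A_μ.

```python
def char_poly_2x2(a):
    """Return (c1, c0) of the characteristic polynomial s^2 + c1*s + c0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return -trace, det


def is_hurwitz_2x2(a):
    """A real 2x2 matrix is Hurwitz iff both polynomial coefficients are positive."""
    c1, c0 = char_poly_2x2(a)
    return c1 > 0 and c0 > 0


# Hypothetical companion matrix with both eigenvalues at -2.
A_EXAMPLE = [[0.0, 1.0], [-4.0, -4.0]]
```

Here `char_poly_2x2(A_EXAMPLE)` gives the coefficients (4, 4), matching s² + 4s + 4, so matching the coefficients of A_μ to this polynomial places both eigenvalues at −2.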

Considering

if the eigenvalues of A_μ are both chosen as −2, then from (s + 2)² = s² + 4s + 4 the following correspondence is obtained:

thus, a solution for M is obtained:

3. Solving for the value of M with the teaching and learning algorithm improved based on the teaching strategy:

The sliding mode controller of the liquid-filled spacecraft is designed in the neighborhood of the equilibrium point

It performs very well for small deviations, but effective control performance is not necessarily guaranteed over a large deviation range, because for large deviations the M calculated from the small-deviation model does not necessarily make the matrix A_μ Hurwitz. Therefore, for the under-actuated system to remain stable under more severe conditions, the design of M should be taken into account when designing the control law, so that μ₂ is exponentially stable even when the deviation is large.

We obtain

which is known to be a Hurwitz matrix, so the design of M may take A_{μ→0} as the reference. To obtain a suitable M throughout the control process, the invention uses the teaching and learning algorithm improved based on the teaching strategy to calculate and dynamically adjust M = [m₁, m₂], so as to ensure that μ₁ and μ₂ are globally exponentially stable.

With the value of M as the quantity to be solved and

as input, the strategy-improved teaching and learning algorithm is used to solve for it. The aim is to use the current values of

to compute a suitable M(t) such that

approximates A_{μ→0} μ₂(t). Define

The fitness function may then be taken as

Set the number of iterations G = 20 and the population size NP = 20, divide the population evenly into 3 groups, and apply the strategy-improved teaching and learning algorithm to obtain the optimal solution M(t) at the current time.
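The grouped search described above can be sketched in code. This is a minimal illustration, not the patented algorithm: the update formulas are not reproduced in this text, so the standard teaching and learning (TLBO) updates are used as stand-ins. It keeps the stated settings (G = 20, NP = 20, 3 groups), a teaching factor of 1 + rand(0, 0.5), a learning step of rand(0.5, 1), and a component-wise exchange between group teachers. The function name `smtlbo`, its arguments, the greedy acceptance rule, and the omission of the normally distributed perturbation step are all assumptions.

```python
import numpy as np

def smtlbo(fitness, lower, upper, n_pop=20, n_groups=3, n_iter=20, seed=0):
    """Minimise `fitness` with a grouped teaching-learning search (illustrative).

    Assumes n_groups >= 2 and at least two members per group.
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    # Initialise the class uniformly inside the search box.
    pop = lower + rng.random((n_pop, dim)) * (upper - lower)
    fit = np.array([fitness(x) for x in pop])
    # Split the class into groups of (nearly) equal size.
    groups = np.array_split(rng.permutation(n_pop), n_groups)

    def try_accept(i, candidate):
        # Greedy selection (assumed): keep a candidate only if it improves.
        candidate = np.clip(candidate, lower, upper)
        value = fitness(candidate)
        if value < fit[i]:
            pop[i], fit[i] = candidate, value

    for _ in range(n_iter):
        for idx in groups:
            teacher = pop[idx[np.argmin(fit[idx])]].copy()
            mean = pop[idx].mean(axis=0)
            for i in idx:
                # Teaching phase: move relative to the group teacher and mean.
                tf = 1.0 + rng.uniform(0.0, 0.5)   # teaching factor
                step = rng.uniform(0.5, 1.0)       # learning step size
                try_accept(i, pop[i] + step * (teacher - tf * mean))
            for i in idx:
                # Learning phase: adjust using a random member of the same group.
                j = rng.choice(idx[idx != i])
                if fit[i] < fit[j]:
                    direction = pop[i] - pop[j]
                else:
                    direction = pop[j] - pop[i]
                try_accept(i, pop[i] + rng.uniform(0.5, 1.0) * direction)
        # Teacher mutual learning: try each component of another group's teacher.
        teachers = [idx[np.argmin(fit[idx])] for idx in groups]
        for k in teachers:
            other = rng.choice([t for t in teachers if t != k])
            for y in range(dim):
                trial = pop[k].copy()
                trial[y] = pop[other][y]
                try_accept(k, trial)

    best = int(np.argmin(fit))
    return pop[best].copy(), float(fit[best])

# Toy usage with a hypothetical fitness: squared distance to a made-up target.
target = np.array([1.0, -2.0])
m_best, f_best = smtlbo(lambda m: float(np.sum((m - target) ** 2)),
                        lower=[-5.0, -5.0], upper=[5.0, 5.0])
```

In the controller, the fitness would instead measure how closely the dynamics obtained with a candidate M match the reference A_{μ→0} μ₂(t) at the current time step, and the search would be rerun at every step.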

The curves of M computed by SMTLBO are not necessarily two smooth curves; rather, they show some error and high-frequency oscillation around two smooth curves. Therefore, after M at the current time is obtained through SMTLBO, a mean filtering stage should be added to reduce or even eliminate the oscillation.
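One simple form of such a mean filter is a causal moving average applied to each component of M. This is a sketch under assumptions: the window length and the function name `moving_average` are illustrative, and the text does not specify the filter that was actually used.

```python
import numpy as np

def moving_average(samples, window=5):
    """Causal moving average: each output row is the mean of the last
    `window` input rows (fewer at the start of the sequence)."""
    samples = np.asarray(samples, dtype=float)
    out = np.empty_like(samples)
    for t in range(len(samples)):
        start = max(0, t - window + 1)
        out[t] = samples[start:t + 1].mean(axis=0)
    return out

# Each row is one time step of a made-up sequence [m1, m2].
m_history = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
m_smooth = moving_average(m_history, window=2)
```

A longer window removes more of the oscillation but makes the filtered M lag further behind the value computed at the current time, so the window length is a trade-off.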

The specific embodiment is as follows:

The simulation parameters in the liquid-sloshing spacecraft dynamics equations (8) to (11) are chosen as follows:

m = 600 kg, m_f = 100 kg, I = 720 kg·m²

I_f = 90 kg·m², a = 0.32 m, b = 0.25 m

T＝500N，

assume initial values of:

θ₀＝2°，

v_x0＝25m/s

v_z0＝1.1m/s，

the controller parameters are set as:

c₂＝5，ρ₂＝5，₂＝0.5，ρ₁＝2，₁＝0.2

When

the simulation results are shown in FIGS. 6 to 12. FIG. 6 shows how v_x varies, FIG. 7 shows how v_z varies, FIG. 8 shows how θ and

vary, FIG. 9 shows how

and

vary, FIG. 10 shows the control input u₁, FIG. 11 shows the sliding variable s₁, and FIG. 12 is the phase-plane plot of s₁. These figures show that the designed control law accomplishes the control task well. Because v_x is treated as an external variable, it accelerates under the thrust along the x axis and the magnitude of its acceleration is essentially unaffected, so the simulation result agrees with the theoretical analysis. Because the subsystem containing

that is, subsystem (18), is treated as a fully actuated system when its sliding mode controller is designed, and its structure is simple, θ and

are controlled well. For the subsystem containing

the system takes longer to reach equilibrium, because in addition to the sliding mode reaching phase the controller design involves the self-stabilization of (μ₁, μ₂). During the control process, the state variables

first deviate from the equilibrium point, but the deviation stays within an acceptable range and does not compromise control stability. The state-variable curves in FIGS. 6 to 9 demonstrate the effectiveness of the control method. The curve of s₁ in FIG. 11 and the phase-plane plot in FIG. 12 show that, because the M computed by SMTLBO oscillates somewhat, the sliding variable s₁ also oscillates during the reaching phase; nevertheless, s₁ still follows a dominant trend, and the oscillation decays as the sliding surface is approached. The phase-plane plot in FIG. 12 also shows that the sliding surface s₁ designed in the invention is a nonlinear sliding surface, and that (μ₁, μ₂) eventually falls onto it, so the sliding mode reaching condition of subsystem (17) is satisfied.

As can be seen from FIG. 13, although mean filtering is used to suppress the high-frequency oscillation, the resulting M still oscillates and differs somewhat from the M calculated by equation (31). The oscillation arises mainly because the inputs at each time instant

differ and the algorithm itself has a small accuracy error, so high-frequency oscillation appears in the value of M. The larger oscillation amplitude of m₂ is due to the fitness value being insensitive to changes in the magnitude of m₂, which reduces the accuracy with which m₂ is computed. The error arises because mean filtering cannot achieve an ideal filtering effect on the M computed by SMTLBO: the high-frequency oscillation in M has certain limitations and is not conventional noise, so some error is unavoidable. Even with this oscillation and error, however, the A_μ computed from M is still Hurwitz, and μ₁ and μ₂ can still complete the self-stabilization process, which indicates that the method adopted by the invention is feasible.

To verify that, when

deviates substantially from the equilibrium point, the control method of the invention still achieves the control objective, simulation experiments were run for

and

respectively; the simulation results are shown in FIGS. 14 to 22.

FIGS. 14 to 18 show, for different

how each system variable varies. FIG. 14 shows, for different

how v_z varies; FIG. 15 shows, for different

how

varies; FIG. 16 shows, for different

how

varies; FIG. 17 shows, for different

how s₁ varies; FIG. 18 shows, for different

how u₁ varies; FIG. 19 shows, for

how M varies; FIG. 20 shows, for

the phase-plane plot of s₁; FIG. 21 shows, for

how M varies; FIG. 22 shows, for

the phase-plane plot of s₁. The figures show that, for different

the designed control law achieves the control objective. Apart from s₁, which oscillates because of the sliding surface parameter M, the other variables show no oscillation in their dynamic response. The state variables first deviate from the equilibrium point, but the deviation is small and within an acceptable range, and system stability is not affected. FIGS. 19 to 22 show, for different

how M varies, together with the phase-plane plots. The variation of M is regular, and M finally converges near

and the phase-plane plots still show that, although chattering occurs about the sliding surface during the dynamic response, it does not prevent the state from reaching the sliding surface and maintaining sliding motion, and the chattering becomes smaller as the state approaches the dynamic sliding surface. The numerical simulation results show that the designed controller can effectively control the system under large deviations, which demonstrates the effectiveness of dynamically designing the control law with the SMTLBO algorithm; the method is applicable to control law design for complex underactuated systems.

Although the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the details of the foregoing embodiments. Various equivalent changes (such as in number, shape, or position) may be made to the technical solution within the technical spirit of the present invention, and all such equivalents fall within the scope of protection of the present invention.

Claims

1. An optimized sliding mode control method for a liquid-filled spacecraft based on a teaching and learning algorithm improved by a teaching strategy, characterized in that the method comprises the following steps:

1) the liquid-filled spacecraft is modeled as an equivalent simple pendulum; the x axis and z axis belong to the inertial coordinate system; the liquid-filled spacecraft has constant thrust T and velocity v_x along the body axis; the dry mass of the liquid-filled spacecraft is m, and the mass of the fuel in the tank is m_f; the moment of inertia of the liquid-filled spacecraft about the tank center is I, and the moment of inertia of the fuel about the tank center is I_f; the distance from the spacecraft center of mass to the suspension point of the simple pendulum is b, and the pendulum length is a; the attitude angle of the spacecraft is θ, and the liquid slosh angle, i.e. the swing angle of the equivalent simple pendulum, is

is the energy dissipation factor;

2) obtaining a liquid-filled spacecraft dynamics equation of liquid sloshing:

equation (6) is simplified to:

let

then:

where

where

reach zero in finite time; the system is divided into a subsystem containing

which is the underactuated subsystem (15), and a subsystem containing

which is the fully actuated subsystem (16):

where c₂ is the sliding surface parameter of subsystem (16), and ρ₂ and ₂ are the gains of the exponential reaching term and the constant-rate reaching term for the sliding surface s₂ of the subsystem, satisfying c₂, ρ₂, ₂ > 0;

For subsystem (15), a new state variable η is introduced to replace

Differentiating both sides of equation (19) gives:

thereby converting the subsystem (15) into the form:

for the subsystem (21), define

The sliding mode function is designed as follows

s₁＝μ₁-Mμ₂ (23)

where M = [m₁, m₂] is the sliding mode surface matrix.

Let

Then

The control law of the subsystem (21) is designed as follows:

where ρ₁ and ₁ are the gains of the exponential reaching term and the constant-rate reaching term for s₁, respectively, and satisfy ρ₁, ₁ > 0;

Define the Lyapunov function as:

then

The sliding mode reaching condition is satisfied, and there exists a time t₁ such that for t ≥ t₁, s₁ = μ₁ - Mμ₂ = 0; that is, on the sliding surface s₁ = 0, μ₁ = Mμ₂;

Then:

where Δ_μ is a vector of very small real numbers; substituting this into

gives

where

the solution to M is:

4) solving for the value of M with the teaching and learning algorithm improved by a teaching strategy comprises the following steps:

41) initializing the class: each student in the class is generated randomly in the search space

The generation follows the formula:

where

and

42) dividing the class members into num groups, evaluating the fitness values of the k-th group, and selecting the individual with the best fitness value as the teacher individual of the k-th group

where:

and

respectively denote the values of the i-th student of the k-th group before and after learning; k = 1, 2, …, num;

where

is the mean of the k-th group, TF_i^k = 1 + rand(0, 0.5) is the teaching factor of the teacher, and r_i^k = rand(0.5, 1) is the learning step size of the student;

43) for a student

randomly select a member of the group

and analyze the difference between the student and

to make a learning adjustment;

where

is the value of the student after the learning adjustment, and

is the value of the student before the learning adjustment;

44) for a teacher individual

randomly select another teacher individual

and carry out a learning adjustment;

44.1) let k = 1; for the teacher individual

randomly select another teacher individual

44.2) let y equal 1, use

Is substituted by the y-component

It is replaced;

44.3) let y = y + 1 and repeat step 44.2) until y = d, whereupon the mutual learning is finished;

44.4) let k = k + 1 and repeat steps 44.2) and 44.3) until k = num, whereupon the learning exchange among all teachers is finished;

44.5) compare the fitness values of all teachers and take the teacher with the best fitness value as the global optimal solution X_best;

where randn denotes a normally distributed random number, and

X^U and X^L respectively denote the maximum and minimum values of the actual individuals in the population; e is the maximum allowed number of iterations, and t is the current iteration number;

2. The optimized sliding mode control method for a liquid-filled spacecraft according to claim 1, characterized in that: in step 4), the sliding mode controller of the liquid-filled spacecraft is designed in the neighborhood of the equilibrium point

which gives

With the value of M as the quantity to be solved and v_z,

as input, the current values v_z(t),

are used to compute M(t) such that

approximates A_{μ→0} μ₂(t); define

The fitness function is then taken as