CN108919828A

CN108919828A - Aerial vehicle trajectory optimization method based on artificial memory

Info

Publication number: CN108919828A
Application number: CN201810768925.9A
Authority: CN
Inventors: 崔乃刚; 韦常柱; 赵宏宇; 关英姿; 李�浩
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Priority date: 2018-07-13
Filing date: 2018-07-13
Publication date: 2018-11-30

Abstract

The present invention proposes an aerial vehicle trajectory optimization method based on artificial memory and belongs to the field of aerial vehicle trajectory design. To address the low convergence accuracy and slow convergence of existing aerial vehicle trajectory optimization methods, the invention develops an artificial-memory trajectory optimization method. The conventional trajectory optimization problem is first modeled according to memory principles; a trial-solution state evolution operator then guarantees the randomness of the trial solutions in every iteration; a stimulus-type judgment model and a memory update model are established next; finally, the artificial memory optimization method is applied to the aerial vehicle trajectory optimization problem. The invention is intended for aerial vehicle trajectory optimization design under complex constraints.

Description

An aerial vehicle trajectory optimization method based on artificial memory

Technical field

The present invention relates to an aerial vehicle trajectory optimization method based on artificial memory and belongs to the technical field of aerial vehicle trajectory optimization design.

Background technique

Boost-phase penetration trajectory optimization design amounts to solving a nonlinear optimal control problem with path and terminal constraints. There are many trajectory optimization methods, broadly divided into analytical methods and numerical methods; numerical methods have flourished along with the rapid development of computer technology. Stryk, Betts and others have summarized and analyzed most of the numerical methods in current use, and both authors divide numerical methods into indirect methods and direct methods. Indirect methods do not optimize the performance index function directly; instead they use the Pontryagin maximum principle to convert the optimal control problem into a Hamiltonian boundary value problem. Direct methods use parameterization to convert the continuous-space optimal control problem into a nonlinear programming (NLP) problem and obtain the optimal trajectory by solving that problem numerically. Direct methods are more widely applied than indirect methods; common ones include direct shooting, direct multiple shooting, collocation and differential inclusion. Yong Enmi, Chen Cong, Huang Guoqiang and others have also analyzed trajectory optimization methods in depth, given a standardized description of the trajectory optimization problem, set out different classifications of trajectory optimization methods, and summarized recent progress in the field.

It is generally held that numerical methods can be divided into indirect and direct methods according to how the problem is transformed and, according to whether they rely on gradient information, roughly into two further classes: precise search algorithms and random search algorithms.

A so-called precise search algorithm is a search method based on gradient information. During its iterative search, such an algorithm determines the best gradient direction at the current step by computing the gradient of a penalty function constructed from the performance index and the constraints, and continues the search along that direction at the next iteration until an optimal solution is found. Note that, because a precise search algorithm relies on gradient information, the solution it finds is usually a local optimum, and global optimality cannot be guaranteed.
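As a minimal illustration of why a gradient-based search usually returns a local optimum, the sketch below runs plain gradient descent on a one-dimensional function with two minima. The function `f`, the step size and the two starting points are assumptions chosen for illustration only; they do not come from the invention.

```python
def f(x: float) -> float:
    """Illustrative objective with two minima (assumed, not from the source)."""
    return x**4 - 3.0 * x**2 + x


def grad_f(x: float) -> float:
    """Analytic derivative of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0


def gradient_descent(x0: float, step: float = 0.01, iters: int = 5000) -> float:
    """Follow the negative gradient from x0 for a fixed number of iterations."""
    x = x0
    for _ in range(iters):
        x -= step * grad_f(x)
    return x


# Two starting points converge to two different stationary points.
left = gradient_descent(-2.0)
right = gradient_descent(2.0)
print(left, f(left))
print(right, f(right))
```

Both runs stop where the gradient vanishes, but only the run started on the left reaches the lower of the two minima; the result depends on the starting point.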

Aerial vehicle trajectory optimization design increasingly involves complex optimization problems that are multivariable, multi-objective, multi-constraint, nonlinear and multi-extremal, with irregular objective and constraint forms, which poses a severe challenge to the traditional optimization algorithms, namely precise search algorithms. In recent years, with the continued development of computer technology, researchers have studied random search methods such as genetic algorithms, ant colony algorithms, particle swarm algorithms and differential evolution algorithms extensively and in depth.

Summary of the invention

The object of the invention is to solve the problem that existing trajectory optimization methods have poor global optimality, by providing an aerial vehicle trajectory optimization method based on artificial memory.

The object of the invention is achieved through the following technical solution: an aerial vehicle trajectory optimization method based on artificial memory, comprising the following steps:

Step 1: Establish the standard trajectory optimization problem model;

Step 2: Design a trial-solution state evolution operator that provides a large number of trial solutions for each iterative search;

Step 3: Judge the type of stimulus from the change in the objective function value obtained with the trial solution;

Step 4: Design the forgetting model and the memory update model;

Step 5: Optimize the trajectory of the aerial vehicle using the artificial memory algorithm.

Further, step 1 is specifically:

The standard trajectory optimization problem model is:

min f(X), s.t. g_i(X) ≥ 0, i=1,2,…,K, X∈S⊂Rⁿ

where X=(x₁,x₂,…,x_n) is the optimization control vector, S is the feasible region or search space, Rⁿ denotes the n-dimensional Euclidean space, f(X) is the objective function, g_i(X) ≥ 0 denotes the inequality constraints, and K is the number of inequality constraints;

The objective function of the above optimization problem is rewritten as:

In the formula, F_max is a positive number whose magnitude differs greatly from that of f(X); it is used to screen out trial solutions that cannot satisfy all the constraints, so that the system can reject them or process them otherwise;

Each memory element is designed as a data tuple structure comprising the trial solution, the memory residue, the system time, the memory state and the forgetting state, i.e.

M_i=(X_i,m_i,t_i,s_i,f_i), i=1,2 ..., N

In the formula, point X_i denotes the trial solution and N is the number of memory elements; m_i is the memory residue, which characterizes how strongly the memory element remembers the stimulus, with m_i≥0; t_i is the system time, i.e. the elapsed search time; s_i is the memory state, which takes one of three values I, S and L, denoting the instantaneous, short-term and long-term memory states respectively; f_i is the forgetting state: f_i=1 indicates that memory element M_i has forgotten trial solution X_i, and f_i=0 indicates that it is still remembered clearly.
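The problem model and the memory element of step 1 can be sketched in Python as follows. The rewritten objective is not reproduced in this text, so the sketch assumes a common form: return f(X) when every constraint g_i(X) ≥ 0 holds and the large constant F_max otherwise. The value of `F_MAX`, the class name `MemoryElement` and the field names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Assumed value: it only needs to be far larger than any feasible f(X).
F_MAX = 1.0e12

Vector = Sequence[float]


def penalized_objective(
    x: Vector,
    f: Callable[[Vector], float],
    constraints: Sequence[Callable[[Vector], float]],
    f_max: float = F_MAX,
) -> float:
    """Return f(x) if every g_i(x) >= 0, otherwise the constant f_max (assumed form)."""
    if all(g(x) >= 0.0 for g in constraints):
        return f(x)
    return f_max


@dataclass
class MemoryElement:
    """One memory element M_i = (X_i, m_i, t_i, s_i, f_i)."""

    x: List[float]            # trial solution X_i
    m: float = 0.0            # memory residue m_i >= 0
    t: float = 0.0            # system time t_i
    s: str = "I"              # memory state: "I", "S" or "L"
    forgotten: bool = False   # forgetting state f_i (True corresponds to f_i = 1)
```

An infeasible trial solution then always scores worse than any feasible one, which makes it easy to reject:

```python
objective = lambda x: sum(v * v for v in x)
g = [lambda x: 1.0 - x[0]]          # feasible when x[0] <= 1
penalized_objective([2.0, 0.0], objective, g)   # returns F_MAX
```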

Further, step 2 is specifically:

Assume that, during the search for the optimal solution of the optimization problem, the properties of point X_i at the current time (t) are influenced by the states of the other points. Let the set formed by the other points at the previous time (t-1), i.e. the trial solution set, be

To guarantee both global convergence and convergence speed, the trial-solution search strategies are as follows:

1) Average Strategy

In this case the long-term memory system holds a certain number of memory elements, i.e. good trial solutions. L trial solutions are picked from it at random, and the trial solution at the current time (t) is set to the average of these L trial solutions at the previous time (t-1), i.e. the trial solution value corresponding to point X_i is:

In the formula, i₁,i₂,…,i_L are the indices of the L trial solutions taken from the trial solution set X, and n is the number of variables in point X_i;

2) Orthogonal strategy

Two memory elements, with indices i₁ and i₂, are selected at random from the trial solution set X, and an orthogonal weighting operation is applied to the corresponding trial solutions; the trial solution value corresponding to point X_i is then:

In the formula, a is a random number between 0 and 1, i.e. a=rand (0,1);

3) Projection strategy

Two memory elements, with indices i₁ and i₂, are selected at random from the trial solution set X, and a projection operation is applied to the corresponding trial solutions; the trial solution value corresponding to point X_i is then:

4) Expansion strategy

One memory element, with index k, is selected at random from the trial solution set X, and an expansion operation is applied to the corresponding trial solution; the trial solution value corresponding to point X_i is then:

In the formula, the coefficient r₁=rand (-1,1);

5) Chase strategy

The optimal trial solution of the previous time and the optimal trial solution of the current time are selected from the trial solution set X, and a chase operation is applied to the corresponding trial solutions; the trial solution value corresponding to point X_i is then:

In the formula, the coefficients r₂=rand (0.4,0.6), r₃=1.3rand (0,1), r₄=1.8rand (0,1);

Each of the above 5 strategies may be executed at random in any search step; over the whole optimization search, each strategy is executed roughly the same number of times.
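Two parts of step 2 can be sketched with what the text states: the averaging strategy and the random choice among the five strategies. The formulas of the other four strategies are not reproduced in this text, so they are deliberately left out. The function names, the strategy labels and drawing the L solutions without replacement are assumptions for illustration.

```python
import random
from typing import List, Sequence

# Labels for the five strategies; the fifth is the "Chevy" (chase) strategy.
STRATEGIES = ("average", "orthogonal", "projection", "expansion", "chase")


def average_strategy(
    long_term: Sequence[Sequence[float]], L: int, rng: random.Random
) -> List[float]:
    """Component-wise mean of L trial solutions drawn at random from long-term memory.

    Drawing without replacement is an assumption; the text only says "at random".
    """
    chosen = rng.sample(list(long_term), L)
    n = len(chosen[0])
    return [sum(sol[j] for sol in chosen) / L for j in range(n)]


def pick_strategy(rng: random.Random) -> str:
    """Uniform random choice, so each strategy is used about equally often."""
    return rng.choice(STRATEGIES)


rng = random.Random(0)
pool = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
new_trial = average_strategy(pool, L=3, rng=rng)

counts = {name: 0 for name in STRATEGIES}
for _ in range(5000):
    counts[pick_strategy(rng)] += 1
```

A uniform choice is the simplest way to satisfy the requirement that every strategy is executed roughly the same number of times over the whole search.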

Further, step 3 is specifically:

1) Effective stimulus

For point X_i, if the objective function value in the search at the current time (t) is lower than at the previous time (t-1), i.e.

then the trial solution at the current time is better than that at the previous time, and the stimulus is an effective stimulus;

2) Ineffective stimulus

Correspondingly, for point X_i, if the objective function value in the search at the current time (t) is not lower than at the previous time (t-1), i.e.

then the trial solution at the current time shows no improvement, and the stimulus is an ineffective stimulus.
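The stimulus judgment of step 3 reduces to one comparison of objective values for a minimization problem. A minimal sketch follows; the function name and the string labels are illustrative assumptions.

```python
def classify_stimulus(f_prev: float, f_curr: float) -> str:
    """Classify the stimulus produced by a trial solution.

    f_prev: objective value of point X_i at the previous time (t-1)
    f_curr: objective value of point X_i at the current time (t)
    """
    # A strictly lower objective value means the trial solution improved.
    return "effective" if f_curr < f_prev else "ineffective"
```

An unchanged objective value counts as ineffective, because the text requires the value to decrease for a stimulus to be effective.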

Further, step 4 is specifically:

The forgetting model is designed using the Ebbinghaus forgetting curve model, so the forgetting function is:

In the formula, t is the time, starting from 0; Δt is a fixed time increment; α > 0 is the forgetting speed coefficient;

An effective stimulus is known to strengthen memory and so increase the memory residue; let the increment of the memory residue be Δm_i(t). The larger the absolute value of the objective function increment, the more effective the stimulus is compared with earlier ones, which shows up directly as a larger increment Δm_i(t) of the memory residue. Using a direct proportional relationship, the memory residue increment Δm_i(t) is:

In the formula, the positive number h is the effective-stimulus proportionality coefficient, taken as a constant, which adjusts the size of the change in memory residue caused by a stimulus;

Memory is now divided into two parts, the existing old memory and the new memory increment generated by the stimulus, so the memory residue is expressed as

m_i(t+ Δ t)=m_i(t)+Δm_i(t)

In addition, memory should also include the recall of old memories. According to the Ebbinghaus forgetting curve, the law by which memories are forgotten is fixed; only the forgetting speed differs;

To distinguish the old memory from the recall of the old memory, the memory residue at the current time is expressed in the following form:

m_i(t+ Δ t)=m_i(t)e^-αΔt+λm_i(t)e^-bΔt

In the formula, the function λm_i(t)e^-bΔt is the memory value of the recall; b is a constant with b > α, i.e. a recalled memory is forgotten faster; λ is a positive coefficient that adjusts the forgetting speed of the recall, with λ < 1, i.e. the recall cannot exceed the memory residue of the original memory;

The above formula can then be rewritten as

m_i(t+ Δ t)=m_i(t)(e^-αΔt+λe^-bΔt)

Taking all parts of the memory model into account, the final memory update model is obtained, expressed as

m_i(t+ Δ t)=m_i(t)(e^-αΔt+λe^-bΔt)+Δm_i(t)

Let β=e^-αΔt+λe^-bΔt. When the time increment Δt is fixed, β is also a fixed constant, and the memory update model simplifies to the following form:

m_i(t+ Δ t)=βm_i(t)+Δm_i(t), β∈{β_I,β_S,β_L}

In the formula, the coefficients β_I, β_S and β_L are the forgetting speeds of instantaneous, short-term and long-term memory respectively; from the forgetting speeds of memory in the three memory systems, 0 < β_L < β_S < β_I < 1.
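The update model of step 4 can be sketched in Python as follows. The function names and the numerical values of α, b and λ are assumptions for illustration. The increment rule is also an assumption: the text states only that the increment is proportional, with coefficient h, to the absolute change in objective value, so the sketch uses h times that change for an effective stimulus and zero otherwise.

```python
import math


def retention_factor(alpha: float, b: float, lam: float, dt: float) -> float:
    """beta = exp(-alpha*dt) + lam*exp(-b*dt), with b > alpha and 0 < lam < 1."""
    return math.exp(-alpha * dt) + lam * math.exp(-b * dt)


def memory_increment(h: float, f_prev: float, f_curr: float) -> float:
    """Assumed rule: h * |change in objective| for an effective stimulus, else 0."""
    if f_curr < f_prev:
        return h * abs(f_curr - f_prev)
    return 0.0


def update_memory(m: float, beta: float, delta_m: float = 0.0) -> float:
    """m(t + dt) = beta * m(t) + delta_m(t)."""
    return beta * m + delta_m


# Illustrative values only (alpha, b and lam are not given in the text).
beta = retention_factor(alpha=5.0, b=20.0, lam=0.5, dt=0.1)
m_next = update_memory(2.0, beta, memory_increment(1.2, f_prev=10.0, f_curr=7.5))
```

Precomputing β once is possible because Δt is fixed, which is the simplification the text describes.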

Further, step 5 is specifically:

The parameters are set as follows:

1) Time increment Δt=0.1; the number of system scans, i.e. the maximum number of iterations, is G=10000;

2) Number of elements of the optimization control vector X: n=100;

3) Number of memory elements N=100; number of memory elements selected from the long-term memory system L=3;

4) Effective-stimulus proportionality coefficient h=1.2;

5) Forgetting speed coefficients of instantaneous, short-term and long-term memory: β_I=0.55, β_S=0.35, β_L=0.15.
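The parameter settings listed above can be collected in one container, as in the sketch below. The class name `AMOParameters`, the field names and the `is_consistent` check are hypothetical; only the numerical values and the ordering 0 < β_L < β_S < β_I < 1 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AMOParameters:
    """Parameter values of step 5 (hypothetical container)."""

    dt: float = 0.1              # time increment
    max_iterations: int = 10000  # number of system scans G
    n: int = 100                 # elements of the control vector X
    N: int = 100                 # number of memory elements
    L: int = 3                   # elements drawn from long-term memory
    h: float = 1.2               # effective-stimulus proportionality coefficient
    beta_I: float = 0.55         # instantaneous memory coefficient
    beta_S: float = 0.35         # short-term memory coefficient
    beta_L: float = 0.15         # long-term memory coefficient

    def is_consistent(self) -> bool:
        """Check the relations stated in the text plus basic sanity conditions."""
        ordered = 0.0 < self.beta_L < self.beta_S < self.beta_I < 1.0
        return ordered and self.h > 0.0 and 0 < self.L <= self.N and self.dt > 0.0


params = AMOParameters()
```

Freezing the dataclass keeps the settings from changing during a run.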

Advantages of the present invention: The invention provides an aerial vehicle trajectory optimization method based on artificial memory. It establishes the standard trajectory optimization problem model, designs an artificial memory system composed of instantaneous, short-term and long-term memory systems, builds the memory update model from the Ebbinghaus forgetting model, and uses the trial-solution state evolution operator to design the trial solutions of each iteration so that they have strong randomness, finally obtaining the globally optimal solution of the trajectory optimization problem. By studying the aerial vehicle trajectory optimization problem with artificial memory based on memory principles, the invention effectively improves the global optimality of trajectory optimization and has broad application prospects in trajectory optimization design with complex constraints.

Description of the drawings

Fig. 1 is a flow chart of the aerial vehicle trajectory optimization method based on artificial memory according to the present invention.

Specific embodiment

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

With reference to Fig. 1, the present invention proposes an aerial vehicle trajectory optimization method based on artificial memory, comprising the following steps:

Step 1: Establish the standard trajectory optimization problem model, in which the objective function takes the form of a penalty function;

Step 2: Design a trial-solution state evolution operator that provides a large number of trial solutions for each iterative search, and provide 5 search strategies to guarantee the randomness of the search;

Step 4: Design the forgetting model of memory according to the Ebbinghaus forgetting curve model, and derive the memory update model from it;

The trajectory optimization problem model in step 1 is:

where X=(x₁,x₂,…,x_n) is the optimization control vector, S is the feasible region or search space, Rⁿ denotes the n-dimensional Euclidean space, f(X) is the objective function, g_i(X) ≥ 0 denotes the inequality constraints, and K is the number of inequality constraints.

For ease of use, the objective function of the above optimization problem is rewritten as

In the formula, F_max is a positive number whose magnitude differs greatly from that of f(X); it is used to screen out trial solutions that cannot satisfy all the constraints, so that the system can reject them or process them otherwise.

M_i=(X_i,m_i,t_i,s_i,f_i), i=1,2 ..., N

In the formula, point X_i denotes the trial solution and N is the number of memory elements; m_i is the memory residue, which characterizes how strongly the memory element remembers the stimulus, with m_i≥0; t_i is the system time, i.e. the elapsed search time; s_i is the memory state, which takes one of three values I, S and L, denoting the instantaneous, short-term and long-term memory states respectively; f_i is the forgetting state: f_i=1 indicates that memory element M_i has forgotten trial solution X_i, and f_i=0 indicates that it is still remembered fairly clearly.

In step 2, the trial-solution state evolution operator is designed as follows:

Assume that, during the search for the optimal solution of the above optimization problem, the properties of point X_i at the current time (t) are influenced by the states of the other points. Let the set formed by the other points at the previous time (t-1), i.e. the trial solution set, be

To guarantee both global convergence and convergence speed, the trial-solution search strategies given here are as follows.

1) Average Strategy

In this case the long-term memory system holds a certain number of memory elements, i.e. good trial solutions. L trial solutions are picked from it at random, and the trial solution at the current time (t) is set to the average of these L trial solutions at the previous time (t-1), i.e. the trial solution value corresponding to point X_i is:

In the formula, i₁,i₂,…,i_L are the indices of the L trial solutions taken from the trial solution set X, and n is the number of variables in point X_i.

2) Orthogonal strategy

In the formula, a is a random number between 0 and 1, i.e. a=rand (0,1).

3) Projection strategy

4) Expansion strategy

In the formula, the coefficient r₁=rand (-1,1).

5) Chase strategy

In the formula, the coefficients r₂=rand (0.4,0.6), r₃=1.3rand (0,1), r₄=1.8rand (0,1).

Each of the above 5 strategies may be executed at random in any search step; over the whole optimization search, each strategy is executed roughly the same number of times. Each strategy has its own emphasis. The average strategy makes full use of the best trial solutions already available and is simple to compute, but it is prone to falling into a local optimum. The orthogonal, projection and expansion strategies are strongly random and can avoid the local-optimum problem, but they cannot guarantee convergence speed. The chase strategy offers a compromise in performance, at the cost of a larger amount of computation.

From the above analysis, the trial-solution operator can generate N trial solutions with the 5 search strategies before each search begins, which on the one hand ensures fast convergence and on the other hand ensures the randomness of the search, so that the search proceeds smoothly and the optimal solution is obtained quickly.

In step 3, the type of stimulus generated by a trial solution is judged as follows:

As described above, the algorithm judges the stimulus type from the stimulus generated by the trial solution, i.e. the change in the objective function value.

1) Effective stimulus

then the trial solution at the current time is better than that at the previous time, and the stimulus is an effective stimulus.

2) Ineffective stimulus

In step 4, the memory update model is derived as follows:

On the basis of many years of research, the German psychologist H. Ebbinghaus proposed the well-known Ebbinghaus forgetting curve, which describes how, after the brain receives an external stimulus, the memory gradually decays (is forgotten) as time passes. The present invention uses this model to design the forgetting model, so the forgetting function is:

In the formula, t is the time, starting from 0; Δt is a fixed time increment; α > 0 is the forgetting speed coefficient. Note that, for the same stimulus, the resulting memory is forgotten at different speeds in the TM, STM and LTM systems. There are therefore three forgetting speed coefficients, α_I, α_S and α_L, with α_I > α_S > α_L, i.e. memories in the TM, STM and LTM systems are forgotten progressively more slowly.
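The decay behaviour described above can be sketched as follows. The forgetting function itself is not reproduced in this text; the sketch assumes the exponential form m(t+Δt)=m(t)e^-αΔt, which matches the decay term m_i(t)e^-αΔt that appears in the update formulas of this step. The three α values are illustrative assumptions chosen only to satisfy α_I > α_S > α_L.

```python
import math

# Assumed coefficients; only the ordering alpha_I > alpha_S > alpha_L is from the text.
ALPHA = {"I": 3.0, "S": 1.0, "L": 0.2}


def decay(m: float, alpha: float, dt: float) -> float:
    """One forgetting step: m(t + dt) = m(t) * exp(-alpha * dt) (assumed form)."""
    return m * math.exp(-alpha * dt)


def residue_after(m0: float, alpha: float, dt: float, steps: int) -> float:
    """Apply the forgetting step repeatedly with no stimulus in between."""
    m = m0
    for _ in range(steps):
        m = decay(m, alpha, dt)
    return m


# Memory residue left after 10 steps of dt = 0.1 in each memory system.
remaining = {
    state: residue_after(1.0, alpha, dt=0.1, steps=10)
    for state, alpha in ALPHA.items()
}
```

With no stimulus, the instantaneous system keeps the least residue and the long-term system the most, which is the ordering the text describes.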

According to the memory model, memory on the one hand decays gradually over time and, on the other hand, is strengthened if an effective stimulus occurs in the meantime. As shown above, an effective stimulus strengthens memory and increases the memory residue; let the increment of the memory residue be Δm_i(t). The larger the absolute value of the objective function increment, the more effective the stimulus is compared with earlier ones, which shows up directly as a larger increment Δm_i(t) of the memory residue. Using a direct proportional relationship, the memory residue increment Δm_i(t) can be written as:

In the formula, the positive number h is the effective-stimulus proportionality coefficient, taken as a constant, which adjusts the size of the change in memory residue caused by a stimulus.

m_i(t+ Δ t)=m_i(t)+Δm_i(t)

In addition, memory should also include the recall of old memories. According to the Ebbinghaus forgetting curve, the law by which memories are forgotten is fixed; only the forgetting speed differs.

m_i(t+ Δ t)=m_i(t)e^-αΔt+λm_i(t)e^-bΔt

In the formula, the function λm_i(t)e^-bΔt is the memory value of the recall; b is a constant with b > α, i.e. a recalled memory is forgotten faster; λ is a positive coefficient that adjusts the forgetting speed of the recall, with λ < 1, i.e. the recall cannot exceed the memory residue of the original memory.

The above formula can then be rewritten as

m_i(t+ Δ t)=m_i(t)(e^-αΔt+λe^-bΔt)

Taking all parts of the memory model into account, such as forgetting, recall and memory enhancement (by an effective stimulus), the final memory update model is obtained, expressed as

m_i(t+ Δ t)=m_i(t)(e^-αΔt+λe^-bΔt)+Δm_i(t)

In step 5, the artificial memory algorithm is applied to the aerial vehicle trajectory optimization problem. The parameters are set as follows:

1) Time increment Δt=0.1; the number of system scans, i.e. the maximum number of iterations, is G=10000.

2) Number of elements of the control vector X: n=100.

3) Number of memory elements N=100; number of memory elements selected from the long-term memory system L=3.

4) Effective-stimulus proportionality coefficient h=1.2.

The present invention addresses the difficulty existing trajectory optimization methods have in guaranteeing global optimality. It applies the principles of artificial memory to the standard trajectory optimization problem model and studies the conventional trajectory optimization problem. By using different trial-solution search strategies, it guarantees the randomness of the algorithm, effectively increases the convergence speed of trajectory optimization, and further reduces the probability of falling into a local minimum. In addition, the trial-solution state evolution operator effectively improves the global optimality of the trajectory optimization problem and ensures convergence within the given precision, supporting the practical application of the artificial memory algorithm.

The aerial vehicle trajectory optimization method based on artificial memory provided by the present invention has been described in detail above. A specific example is used herein to illustrate the principle and implementation of the invention, and the description of the above embodiment is intended only to help in understanding the method of the invention and its core idea. A person skilled in the art may, following the idea of the invention, change the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims

1. An aerial vehicle trajectory optimization method based on artificial memory, characterized in that it comprises the following steps:

Step 1: Establish the standard trajectory optimization problem model;

Step 4: Design the forgetting model and the memory update model;

2. The aerial vehicle trajectory optimization method based on artificial memory according to claim 1, characterized in that step 1 is specifically:

The standard trajectory optimization problem model is:

In the formula, F_max is a positive number whose magnitude differs greatly from that of f(X); it is used to screen out trial solutions that cannot satisfy all the constraints, so that the system can reject them or process them otherwise;

Each memory element is designed as a data tuple structure comprising the trial solution, the memory residue, the system time, the memory state and the forgetting state, i.e.

M_i=(X_i,m_i,t_i,s_i,f_i), i=1,2 ..., N

In the formula, point X_i denotes the trial solution and N is the number of memory elements; m_i is the memory residue, which characterizes how strongly the memory element remembers the stimulus, with m_i≥0; t_i is the system time, i.e. the elapsed search time; s_i is the memory state, which takes one of three values I, S and L, denoting the instantaneous, short-term and long-term memory states respectively; f_i is the forgetting state: f_i=1 indicates that memory element M_i has forgotten trial solution X_i, and f_i=0 indicates that it is still remembered clearly.

3. The aerial vehicle trajectory optimization method based on artificial memory according to claim 2, characterized in that step 2 is specifically:

Assume that, during the search for the optimal solution of the optimization problem, the properties of point X_i at the current time (t) are influenced by the states of the other points. Let the set formed by the other points at the previous time (t-1), i.e. the trial solution set, be

1) Average Strategy

In this case the long-term memory system holds a certain number of memory elements, i.e. good trial solutions. L trial solutions are picked from it at random, and the trial solution at the current time (t) is set to the average of these L trial solutions at the previous time (t-1), i.e. the trial solution value corresponding to point X_i is:

In the formula, i₁,i₂,…,i_L are the indices of the L trial solutions taken from the trial solution set X, and n is the number of variables in point X_i;

2) Orthogonal strategy

Two memory elements, with indices i₁ and i₂, are selected at random from the trial solution set X, and an orthogonal weighting operation is applied to the corresponding trial solutions; the trial solution value corresponding to point X_i is then:

In the formula, a is a random number between 0 and 1, i.e. a=rand (0,1);

3) Projection strategy

Two memory elements, with indices i₁ and i₂, are selected at random from the trial solution set X, and a projection operation is applied to the corresponding trial solutions; the trial solution value corresponding to point X_i is then:

4) Expansion strategy

In the formula, the coefficient r₁=rand (-1,1);

5) Chase strategy

Each of the above 5 strategies may be executed at random in any search step; over the whole optimization search, each strategy is executed roughly the same number of times.

4. The aerial vehicle trajectory optimization method based on artificial memory according to claim 3, characterized in that step 3 is specifically:

1) Effective stimulus

For point X_i, if the objective function value in the search at the current time (t) is lower than at the previous time (t-1), i.e.

2) Ineffective stimulus

Correspondingly, for point X_i, if the objective function value in the search at the current time (t) is not lower than at the previous time (t-1), i.e.

5. The aerial vehicle trajectory optimization method based on artificial memory according to claim 4, characterized in that step 4 is specifically:

An effective stimulus is known to strengthen memory and so increase the memory residue; let the increment of the memory residue be Δm_i(t). The larger the absolute value of the objective function increment, the more effective the stimulus is compared with earlier ones, which shows up directly as a larger increment Δm_i(t) of the memory residue. Using a direct proportional relationship, the memory residue increment Δm_i(t) is:

In the formula, the positive number h is the effective-stimulus proportionality coefficient, taken as a constant, which adjusts the size of the change in memory residue caused by a stimulus;

Memory is now divided into two parts, the existing old memory and the new memory increment generated by the stimulus, so the memory residue is expressed as

m_i(t+ Δ t)=m_i(t)+Δm_i(t)

In addition, memory should also include the recall of old memories. According to the Ebbinghaus forgetting curve, the law by which memories are forgotten is fixed; only the forgetting speed differs;

m_i(t+ Δ t)=m_i(t)e^-αΔt+λm_i(t)e^-bΔt

In the formula, the function λm_i(t)e^-bΔt is the memory value of the recall; b is a constant with b > α, i.e. a recalled memory is forgotten faster; λ is a positive coefficient that adjusts the forgetting speed of the recall, with λ < 1, i.e. the recall cannot exceed the memory residue of the original memory;

The above formula can then be rewritten as

m_i(t+ Δ t)=m_i(t)(e^-αΔt+λe^-bΔt)

m_i(t+ Δ t)=m_i(t)(e^-αΔt+λe^-bΔt)+Δm_i(t)

Let β=e^-αΔt+λe^-bΔt. When the time increment Δt is fixed, β is also a fixed constant, and the memory update model simplifies to the following form:

In the formula, the coefficients β_I, β_S and β_L are the forgetting speeds of instantaneous, short-term and long-term memory respectively; from the forgetting speeds of memory in the three memory systems, 0 < β_L < β_S < β_I < 1.

6. The aerial vehicle trajectory optimization method based on artificial memory according to claim 5, characterized in that step 5 is specifically:

The parameters are set as follows:

2) Number of elements of the optimization control vector X: n=100;

4) Effective-stimulus proportionality coefficient h=1.2;