CN116185061A - Middle section guidance method based on integrated transfer learning - Google Patents

Publication number
CN116185061A
Authority
CN
China
Prior art keywords
neural network
training
new
learner
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211516761.3A
Other languages
Chinese (zh)
Inventor
何绍溟
金天宇
王江
李虹言
刘子超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202211516761.3A priority Critical patent/CN116185061A/en
Publication of CN116185061A publication Critical patent/CN116185061A/en
Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10 Simultaneous control of position or course in three dimensions
    • G05D1/107 Simultaneous control of position or course in three dimensions specially adapted for missiles
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00 Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Traffic Control Systems (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention discloses a guidance method based on integrated (ensemble) transfer learning (ETLS). The method generates an optimal guidance command in real time and, after the scenario changes, adapts quickly to the new working environment through fine-tuning with very little new data, with performance almost as good as before. Several conventionally trained deep neural networks (DNNs) are combined with a meta-learner, which reduces the optimal midcourse guidance control problem for a new aircraft to finding an optimal weighting function and an optimal bias function. Both functions can be determined quickly from a small amount of data, which avoids the time cost and data shortage of retraining a new network. A midcourse control command that meets the terminal speed and accuracy requirements can therefore be produced in a very short time for a new aircraft or a new application scenario.

Description

Middle section guidance method based on integrated transfer learning
Technical Field
The invention relates to a midcourse guidance method for aircraft, and in particular to a midcourse (middle-section) guidance method based on integrated transfer learning.
Background
During its mission, an aircraft passes through three stages: launch, midcourse guidance and terminal guidance. Midcourse guidance takes the longest time and is the most critical stage of the guidance process.
The guidance system is central to achieving a high hit rate, and the quality of the guidance algorithm directly affects the hit accuracy of the missile. The algorithms in wide use today are mature analytical and numerical methods, known as traditional guidance algorithms, which ensure high guidance accuracy within a foreseeable range of conditions. In recent years researchers have introduced machine learning into the guidance field, producing a series of data-based guidance algorithms, typically based on deep learning and reinforcement learning. These methods need large amounts of data and time to train deep neural networks (DNNs). Once trained, a deep neural network generates results quickly and at low computational cost.
However, an inherent disadvantage of this approach is poor generalization. A well-trained deep neural network often fails to perform satisfactorily in an entirely new mission scenario, and in most cases does not work properly at all. This means that when the application scenario changes, a new DNN must be retrained. Because training is time-consuming and requires a large amount of labeled data, data-based guidance algorithms are difficult to apply to tasks that provide only small amounts of data or that are time-critical.
As a result, when a new aircraft is designed or a mature aircraft is used in a new application scenario, the midcourse control system often cannot meet the guidance requirements because sufficient data are lacking; the aircraft cannot reach the maximum terminal speed at the end of midcourse guidance, and the final hit accuracy also suffers.
In view of these problems, the inventors analyzed data-based midcourse guidance methods in depth with the aim of designing a guidance method based on integrated transfer learning (ETLS) that solves them.
Disclosure of Invention
To overcome these problems, the inventors carried out intensive research and designed a guidance method based on integrated transfer learning (ETLS). The method generates an optimal guidance command in real time and, after the scenario changes, adapts quickly to the new working environment through fine-tuning with very little new data, with performance almost as good as before. Several conventionally trained DNNs are combined with a meta-learner, which reduces the optimal midcourse guidance control problem for a new aircraft to finding an optimal weighting function and an optimal bias function. Both functions can be determined quickly from a small amount of data, which avoids the time cost and data shortage of retraining a new network. A midcourse control command that meets the terminal speed and accuracy requirements can therefore be produced in a very short time for a new aircraft or a new application scenario, thereby completing the invention.
Specifically, the present invention aims to provide a midcourse guidance method based on integrated transfer learning, characterized in that an optimal control command $a_c^{new}$ is obtained in real time during the midcourse guidance phase; the optimal control command $a_c^{new}$ drives the rudder servos of the aircraft so that the aircraft flies along the predetermined trajectory and completes the midcourse guidance task with maximized terminal speed.
The optimal control command $a_c^{new}$ is obtained in real time by feeding the state vector S of the aircraft into a pre-trained network E in real time.
The training process of the network E comprises the following steps:
step 1, training to obtain at least 5 DNN neural networks to form a base learner;
step 2, connecting the base learner with the meta-learner to obtain a network E, that is, taking the output of the base learner as the input of the meta-learner;
and step 3, training the network E through a small amount of aircraft training data to obtain the trained network E.
In the step 1, each DNN neural network corresponds to an application scenario, that is, application scenarios targeted by each DNN neural network are different.
In the step 1, the DNN neural network is a deep feed-forward neural network, and the DNN neural network has 3 hidden layers, 20 neurons in each layer, and the neurons in each hidden layer are fully connected with the neurons in the previous layer.
Wherein in the step 1, the training process of the DNN neural network includes:
step a, normalizing and grouping training data;
step b, inputting training set data into the DNN neural network, and comparing the predicted value with a standard value in the training set to obtain a loss;
step c, error back propagation and parameter updating;
step d, after the neural network completes one training epoch, inputting the validation-set and test-set data into the neural network and calculating the loss of the network as a measure of its generalization ability; training is stopped when the loss decreases to a set value or the maximum number of epochs is reached.
The meta-learner is a single hidden layer feedforward neural network with at least 5 inputs, namely the outputs of the at least 5 DNNs; the output of the meta-learner is the optimal control command $a_c^{new}$.
The algorithm in the single hidden layer feedforward neural network is as follows:
Figure BDA0003967911730000031
where i is the index of an input of the single hidden layer feedforward neural network;
n is the number of inputs of the single hidden layer feedforward neural network;
$a_{ci}$ is the i-th input of the single hidden layer feedforward neural network;
$C_j$ is the weighting function;
$b_j$ is the bias function.
Wherein in step 3, the small amount refers to less than 500 sets of data.
The invention has the beneficial effects that:
(1) In the midcourse guidance method based on integrated transfer learning, several neural networks, the base learners, are each trained for a different aerodynamic model, and a small feedforward neural network then learns the mapping from the old optimal controls to the optimal control in the new environment, so that the method adapts rapidly to a new environment with insufficient data;
(2) According to the middle-stage guidance method based on the integrated transfer learning, training data can be greatly reduced under the condition of ensuring guidance performance;
(3) According to the middle-stage guidance method based on integrated transfer learning, the adaptation to the new environment can be completed in a few seconds, and the method is suitable for being used under the condition of strict time requirements;
(4) The middle-section guidance method based on the integrated transfer learning provided by the invention can be flexibly applied to other scenes, such as minimum control energy guidance, minimum time guidance and the like.
Drawings
Fig. 1 shows a schematic structural diagram of a DNN according to a preferred embodiment of the present invention;
Fig. 2 shows the optimal guidance commands for different aerodynamic parameters under the same initial conditions;
Fig. 3 shows a schematic structural diagram of a meta-learner according to a preferred embodiment of the present invention;
Fig. 4 shows the training loss and validation loss of the meta-learner for different numbers of neurons in Example 1;
Fig. 5 shows a comparison of position errors in Example 2;
Fig. 6 shows a comparison of speed errors in Example 2;
Fig. 7 shows a comparison of terminal angle errors in Example 2;
Fig. 8 shows a comparison of time errors in Example 2;
Fig. 9 shows a comparison of position errors in Example 3;
Fig. 10 shows a comparison of speed errors in Example 3;
Fig. 11 shows a comparison of terminal angle errors in Example 3;
Fig. 12 shows a comparison of time errors in Example 3;
Fig. 13 shows a comparison of position errors in Example 4;
Fig. 14 shows a comparison of speed errors in Example 4;
Fig. 15 shows a comparison of terminal angle errors in Example 4;
Fig. 16 shows a comparison of time errors in Example 4;
Fig. 17 shows the aircraft trajectory in Example 5;
Fig. 18 shows the speed as a function of time in Example 5;
Fig. 19 shows the ballistic inclination angle as a function of time in Example 5;
Fig. 20 shows the control command as a function of time in Example 5.
Detailed Description
The invention is further described in detail below by means of the figures and examples. The features and advantages of the present invention will become more apparent from the description.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
In the midcourse guidance method based on integrated transfer learning, the optimal control command $a_c^{new}$ is obtained in real time during midcourse guidance. The rate at which the command can be updated is bounded by the computation speed of the neural network. The network used in the present application takes no more than 1 ms per evaluation, about 0.3-0.6 ms, so the optimal control command can be computed at 1000 Hz or more. The update frequency is preferably set to 20 Hz; higher frequencies give higher accuracy. The optimal control command $a_c^{new}$ drives the rudder servos of the aircraft so that the aircraft flies along the predetermined trajectory and completes the midcourse guidance task with maximized terminal speed. Terminal speed maximization means maximizing the speed of the aircraft at the handover point between the midcourse and terminal guidance phases when the aircraft flies along the optimal trajectory. In the present application, the aircraft may be a newly designed aircraft or an existing aircraft used in a new application scenario; in either case only a small amount of flight trajectory data is needed to obtain the optimal control command $a_c^{new}$ that drives the rudder servos.
Preferably, the optimal control command $a_c^{new}$ is obtained in real time by feeding the state vector S of the aircraft into a pre-trained network E in real time. The state vector S is measured in real time by sensors on board the aircraft. The choice of state vector is not fixed in this application; appropriate parameters must be selected for each task. For example, the state vector S may include
Figure BDA0003967911730000061
where x and y are the horizontal and vertical coordinates of the aircraft,
Figure BDA0003967911730000062
is the ballistic inclination angle, V is the aircraft speed, and $x_f$, $y_f$ and
Figure BDA0003967911730000063
are the terminal position coordinates and the terminal ballistic inclination angle.
In a preferred embodiment, the training process of the network E comprises the following steps:
step 1, training to obtain at least 5 DNN neural networks to form a base learner;
step 2, connecting the base learner with the meta-learner to obtain a network E, that is, taking the output of the base learner as the input of the meta-learner; the transfer learning module in the present application consists of the meta-learner;
and step 3, training the network E through a small amount of aircraft training data to obtain the trained network E.
Preferably, in step 1, each DNN corresponds to one application scenario, that is, each DNN targets a different application scenario. An application scenario comprises the system dynamics model, the operating environment and the task objective; two scenarios that differ in any one of these aspects count as two different application scenarios.
For example, suppose there are five types of aircraft, referred to as aircraft 1 to aircraft 5. Because their configurations differ, their aerodynamic coefficients also differ, as shown in Tables 1 and 2. Table 1 gives the baseline aerodynamic coefficients, and the coefficients in Table 2 are all variations of this baseline. Aircraft 1 to 5 have sufficient flight data, so each has a trained DNN, numbered $B_1$ to $B_5$. These are the base learners, each able to control its corresponding aircraft through the midcourse guidance task of maximizing terminal speed under a terminal angle constraint. If a new aircraft with the aerodynamic coefficients shown in Table 3 is now put into service, it is difficult to train a new DNN for it because supporting data are lacking; the ETLS proposed in the present application addresses exactly this data-scarce situation.
TABLE 1 aerodynamic coefficient baseline
Figure BDA0003967911730000071
Table 2 aerodynamic coefficients of aircraft 1 to 5
Figure BDA0003967911730000072
TABLE 3 aerodynamic coefficients of a New aircraft
Figure BDA0003967911730000081
Preferably, training the base learners requires a method for generating the data set. The optimal control problem is highly nonlinear, so no analytical solution exists and it can only be solved numerically; many numerical methods exist, but their convergence and computation speed vary. The present application selects hp-FRPM as the solution method. This method is insensitive to the initial guess and converges quickly; in the exemplary scenario of this application, an optimal control problem is solved in about 1 second on average. The algorithm can be called directly through the existing solver GPOPS-II. The greatest advantage of hp-FRPM is that it adaptively adds or removes discrete points according to how steeply the state and control variables change: more points are placed where the variables change sharply, and fewer where they change gently. This makes the distribution of the solved optimal data more reasonable and helps the neural network train thoroughly.
The application randomly generates 15000 initial conditions, in which the initial launch angle and the horizontal position of the target are random, giving 15000 corresponding optimal control problems. The solution accuracy of GPOPS-II is set to $1\times10^{-8}$. The solved state and optimal control pairs $[s, a_c]$ are stored as the training set.
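The random draw of initial conditions described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sampling ranges for launch angle and target position are hypothetical, since the source does not state them, and the optimal control solve itself (GPOPS-II) is not shown.

```python
import numpy as np

def sample_initial_conditions(n_cases=15000, seed=0):
    """Draw random (launch angle, target horizontal position) pairs."""
    rng = np.random.default_rng(seed)
    # Hypothetical ranges, chosen only for illustration.
    launch_angle_deg = rng.uniform(30.0, 60.0, size=n_cases)
    target_x_m = rng.uniform(40e3, 60e3, size=n_cases)
    # One row per optimal control problem to be solved offline.
    return np.column_stack([launch_angle_deg, target_x_m])

conditions = sample_initial_conditions()
```

Each row would then define one optimal control problem whose solution contributes state and control pairs to the training set.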
Preferably, in the step 1, the DNN neural network is a deep feed-forward neural network, the DNN neural network has 3 hidden layers, 20 neurons in each layer, the neurons in each hidden layer are fully connected with the neurons in the previous layer, and the structure of the DNN neural network is shown in fig. 1.
The calculation method of each layer in the neural network is as follows:
$L_{i+1} = \sigma(W_i L_i + b_i)$
where $W_i$ is the weight matrix, $b_i$ is the bias, $L_{i+1}$ is the output of layer i+1, and σ is the nonlinear activation function. The nonlinear activation function is an indispensable part of a neural network, and different activation functions affect the training result differently. This patent uses the
Figure BDA0003967911730000091
function as the activation function because it is better suited to this fitting problem.
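The layer rule above, applied to a network with 3 hidden layers of 20 neurons, can be sketched as follows. Several details are assumptions rather than statements from the source: the activation is taken to be tanh (the actual function appears only in a formula image), the output layer is taken to be linear, the state vector is taken to have 7 components, and the function names are hypothetical.

```python
import numpy as np

def init_dnn(n_inputs, n_hidden=20, n_hidden_layers=3, n_outputs=1, seed=0):
    """Create (W, b) pairs for a fully connected feedforward network."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs] + [n_hidden] * n_hidden_layers + [n_outputs]
    # W has shape (fan_out, fan_in) so that W @ layer is well defined.
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(m))
            for n, m in zip(sizes[:-1], sizes[1:])]

def dnn_forward(params, s):
    """Apply L_{i+1} = sigma(W_i L_i + b_i) layer by layer."""
    layer = np.asarray(s, dtype=float)
    for W, b in params[:-1]:
        layer = np.tanh(W @ layer + b)   # assumed activation function
    W, b = params[-1]
    return W @ layer + b                 # assumed linear output layer

params = init_dnn(n_inputs=7)            # 7 state components is an assumption
a_c = dnn_forward(params, np.zeros(7))
```

A network of this size has only a few hundred parameters per hidden layer, which is consistent with the sub-millisecond evaluation time reported in the description.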
Preferably, in the step 1, the training process of the DNN neural network includes:
step a, normalizing and grouping the training data; normalization improves the training speed and stability of the neural network. The application normalizes using
Figure BDA0003967911730000092
When grouping, 70% of the data is used as the training set, 15% as the validation set, and 15% as the test set.
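A minimal sketch of step a is given below. The 70/15/15 split follows the text; the normalization scheme is an assumption (column-wise min-max scaling to [-1, 1]), because the source shows the actual formula only as an image. Function names and the synthetic data are hypothetical.

```python
import numpy as np

def minmax_normalize(x):
    """Scale each column to [-1, 1] (assumed scheme, not from the source)."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def split_70_15_15(x, y, seed=0):
    """Shuffle, then split into training, validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = len(x) * 70 // 100
    n_val = len(x) * 15 // 100
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (x[tr], y[tr]), (x[va], y[va]), (x[te], y[te])

# Synthetic stand-in data: 1000 state vectors and commands.
rng = np.random.default_rng(1)
states = rng.uniform(0.0, 100.0, size=(1000, 7))
commands = rng.uniform(-5.0, 5.0, size=(1000, 1))
normalized = minmax_normalize(states)
train_set, val_set, test_set = split_70_15_15(normalized, commands)
```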
Step b, inputting training set data into the DNN neural network, and comparing the predicted value with a standard value in the training set to obtain a loss; the present application uses mean square error as a loss function:
Figure BDA0003967911730000093
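For reference, the mean square error in its standard form is the average of the squared differences between predicted and reference commands. The sketch below uses hypothetical names, and its notation may differ from the formula image in the source.

```python
import numpy as np

def mse_loss(predicted, target):
    """Mean of squared differences between prediction and reference."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))

# Two exact predictions and one that is off by 2: loss = (0 + 0 + 4) / 3.
example_loss = mse_loss([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```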
step c, error back propagation and parameter updating; the loss computed by the loss function is propagated through the neural network using the back propagation algorithm, and each parameter $W_i$ and $b_i$ is then updated according to the loss. The present application uses the Levenberg-Marquardt algorithm for the update. The update formula is as follows:
$x_{k+1} = x_k - (J^T J + \mu I)^{-1} J^T e$
where $x_k$ is the vector of network parameters at iteration k, J is the Jacobian matrix of the errors with respect to the parameters, e is the error vector, I is the identity matrix, and μ is the trust-region (damping) parameter with an initial value of 0.001. When the training loss decreases, μ' = 0.1μ is set, which speeds up convergence; conversely, if the loss increases, μ' = 10μ. This adaptive adjustment of μ lets the Levenberg-Marquardt algorithm converge very quickly when training smaller neural networks.
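The update and the adaptive adjustment of μ can be illustrated on a small curve-fitting problem instead of a full neural network. The sketch below is a generic Levenberg-Marquardt loop under stated assumptions: the toy model y = p0·exp(p1·t), the function names, and the choice to reject a step when the loss increases are illustrative and not taken from the source.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, mu=1e-3, max_iter=100, tol=1e-12):
    """Minimise sum(e**2) with x_{k+1} = x_k - (J^T J + mu I)^{-1} J^T e."""
    x = np.asarray(x0, dtype=float)
    e = residual(x)
    loss = float(e @ e)
    for _ in range(max_iter):
        J = jacobian(x)
        step = np.linalg.solve(J.T @ J + mu * np.eye(x.size), J.T @ e)
        x_trial = x - step
        e_trial = residual(x_trial)
        loss_trial = float(e_trial @ e_trial)
        if loss_trial < loss:
            # Loss decreased: accept the step and set mu' = 0.1 mu.
            x, e, loss = x_trial, e_trial, loss_trial
            mu *= 0.1
        else:
            # Loss increased: keep the old parameters and set mu' = 10 mu.
            mu *= 10.0
        if loss < tol:
            break
    return x, loss

# Toy problem (hypothetical): recover p = (2.0, -1.5) in y = p0 * exp(p1 * t).
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)
residual = lambda p: p[0] * np.exp(p[1] * t) - y
jacobian = lambda p: np.column_stack([np.exp(p[1] * t),
                                      p[0] * t * np.exp(p[1] * t)])
p_fit, final_loss = levenberg_marquardt(residual, jacobian, x0=[1.0, 0.0])
```

For a neural network, `residual` would return the prediction errors over the training set and `jacobian` their derivatives with respect to all weights and biases, which is why the method suits small networks.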
Step d, after the neural network completes one training pass, that is, after 1 epoch, the validation-set and test-set data are fed into the network and its loss is computed as a measure of generalization ability; training stops when the loss falls below $1\times10^{-6}$ or the maximum number of epochs is reached. Because the L-M algorithm converges extremely fast, the maximum is set to 200 epochs.
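The stopping rule in step d amounts to a simple check after every epoch. The sketch below uses a hypothetical helper name and a made-up loss curve (halving every epoch) purely to show how the two conditions interact.

```python
def should_stop(loss, epoch, loss_target=1e-6, max_epochs=200):
    """Stop when the loss is below the target or the epoch limit is hit."""
    return loss < loss_target or epoch >= max_epochs

epoch = 0
loss = 1.0
while not should_stop(loss, epoch):
    epoch += 1
    # Stand-in for one epoch of training followed by validation.
    loss = 0.5 ** epoch
stopped_epoch = epoch
```

With this artificial loss curve the threshold is reached well before the 200-epoch limit; with a slowly converging network the epoch limit would end training instead.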
In a preferred embodiment, the meta-learner is a single hidden layer feedforward neural network with at least 5 inputs, namely the outputs of the at least 5 DNNs; the output of the meta-learner is the optimal control command $a_c^{new}$.
Preferably, Fig. 2 shows the optimal guidance command profiles (expressed as lateral acceleration) that the aircraft needs to complete terminal-speed-optimal midcourse guidance under the same initial conditions but with different aerodynamic parameters. In Fig. 2, Aero i corresponds to Table 2 and denotes the result for the i-th set of aerodynamic coefficients. As Fig. 2 shows, the optimal solution for one aerodynamic model differs from that for another; however, it is also clear from Fig. 2 that within the same time interval the different solutions are strongly similar. This means that a new acceleration can be computed as a weighted combination of the existing optimal accelerations. On this basis, the algorithm in the single hidden layer feedforward neural network is:
Figure BDA0003967911730000101
where i is the index of an input of the single hidden layer feedforward neural network;
n is the number of inputs of the single hidden layer feedforward neural network;
$a_{ci}$ is the i-th input of the single hidden layer feedforward neural network;
$C_j$ is the weighting function;
$b_j$ is the bias function.
$C_j$ and $b_j$ are the parameters to be determined. The optimal midcourse guidance control problem for a new aircraft is thus reduced to finding an optimal weighting function $C_j$ and an optimal bias function $b_j$. If these two functions can be determined quickly from a small amount of data, the time cost and data shortage of retraining a new network are avoided.
The meta-learner, which is essentially a single hidden layer feedforward neural network, is arranged as shown in Fig. 3. Its inputs are the optimal guidance commands $a_{ci}$ generated by the trained base learners, and its output is the optimal control command $a_c^{new}$ required by the new missile.
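A forward pass through such a meta-learner can be sketched as below. This is an illustration under assumptions: 5 inputs and 40 hidden neurons (the size chosen in Example 1), a tanh hidden activation and a linear output, none of which the source states explicitly for this network. The parameter names are hypothetical.

```python
import numpy as np

def init_meta_learner(n_inputs=5, n_hidden=40, seed=0):
    """Parameters of a single hidden layer feedforward network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.standard_normal((n_hidden, n_inputs)) * 0.1,
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((1, n_hidden)) * 0.1,
        "b2": np.zeros(1),
    }

def meta_forward(p, base_commands):
    """Map the base learners' commands a_c1..a_cn to one new command."""
    x = np.asarray(base_commands, dtype=float)
    hidden = np.tanh(p["W1"] @ x + p["b1"])      # assumed activation
    return float((p["W2"] @ hidden + p["b2"])[0])

meta = init_meta_learner()
a_c_new = meta_forward(meta, [0.8, 1.1, 0.9, 1.0, 1.2])
```

Because only this small network is trained for a new aircraft, the number of free parameters stays far below that of a full DNN, which is what allows training with a few hundred samples or fewer.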
In a preferred embodiment, in step 3, the meta-learner is trained in a way similar to the base learners. First, a small amount of data $[s, a_c^{new}]$ is prepared. The state vector S of the aircraft is passed through the 5 existing base learners to obtain 5 outputs $a_{c1}$ to $a_{c5}$. These five outputs are fed into the meta-learner to obtain the network prediction; the prediction is compared with the true value in the training set, the loss is computed, and back propagation is repeated until the loss is small enough. The specific termination condition is a loss below $1\times10^{-6}$. The result is the trained meta-learner and hence the network E.
Preferably, in step 3, "a small amount" means fewer than 500 sets of data, and the aircraft training data are aircraft trajectory data; that is, each set of training data contains all the data of one aircraft trajectory.
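The training procedure for network E can be sketched end to end with frozen base learners and a trainable meta-learner. Everything below is a simplified stand-in: the base learners are random fixed functions rather than trained DNNs, the data are synthetic, and plain gradient descent replaces the Levenberg-Marquardt update used in the source, purely to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the 5 frozen base learners B1..B5.
base_weights = rng.standard_normal((5, 7))
def base_learners(states):
    return np.tanh(states @ base_weights.T)          # shape (N, 5)

# Small synthetic data set [s, a_c_new] with fewer than 500 samples.
states = rng.uniform(-1.0, 1.0, size=(200, 7))
a_ci = base_learners(states)                          # meta-learner inputs
targets = a_ci @ np.array([0.4, 0.1, 0.2, 0.2, 0.1]) + 0.05

# Meta-learner: 5 inputs -> 40 hidden neurons -> 1 output.
W1 = rng.standard_normal((5, 40)) * 0.1
b1 = np.zeros(40)
W2 = rng.standard_normal((40, 1)) * 0.1
b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, (h @ W2 + b2).ravel()

def loss_of(x, y):
    return float(np.mean((forward(x)[1] - y) ** 2))

initial_loss = loss_of(a_ci, targets)
lr = 0.05
for epoch in range(2000):
    h, pred = forward(a_ci)
    grad_out = 2.0 * (pred - targets)[:, None] / len(targets)
    gW2 = h.T @ grad_out
    gb2 = grad_out.sum(axis=0)
    grad_h = (grad_out @ W2.T) * (1.0 - h ** 2)       # back through tanh
    gW1 = a_ci.T @ grad_h
    gb1 = grad_h.sum(axis=0)
    # Only the meta-learner is updated; the base learners stay frozen.
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2
final_loss = loss_of(a_ci, targets)
```

The key design point carried over from the text is that the base learners are evaluated once to produce the inputs and are never retrained.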
Example 1
Single hidden layer feedforward neural networks with different numbers of neurons, ranging from 10 to 90, are selected as meta-learners, giving 9 meta-learners in total. The training input data of each meta-learner are the outputs of the 5 trained DNNs, and its training output data are the optimal control commands;
the algorithm in the single hidden layer feedforward neural network is as follows:
Figure BDA0003967911730000121
where i is the index of an input of the single hidden layer feedforward neural network;
n is the number of inputs of the single hidden layer feedforward neural network;
$a_{ci}$ is the i-th input of the single hidden layer feedforward neural network;
$C_j$ is the weighting function;
$b_j$ is the bias function.
The 9 meta-learners were trained and validated, and the training and validation losses were recorded; the results are shown in Fig. 4. As Fig. 4 shows, the loss is smallest with 40 neurons, so the numerical simulations in the present application use the meta-learner with 40 neurons.
The training times of the 9 meta-learners are shown in Table 4 below. Retraining a DNN is known to take 2 hours; as Table 4 shows, the meta-learner learns the control commands under new aerodynamic parameters extremely quickly by comparison.
TABLE 4 training time of the Meta learner with different neuron numbers
Figure BDA0003967911730000122
The trained network E, composed of the base learners and the meta-learner, can be loaded directly onto the onboard computer. The method requires little computation, so one evaluation of the optimal control command is very fast: about 0.3-0.6 ms on a notebook computer, and potentially faster on a customized onboard computer. After launch, the onboard computer computes the state vector S of the aircraft every 0.05 seconds and feeds it into network E, which quickly returns the optimal control command $a_c^{new}$ required at that moment. The aircraft only needs to follow the commands of network E to achieve the maximum terminal speed under the terminal angle constraint.
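The onboard cycle described above (read the state, evaluate network E, apply the command, every 0.05 s) can be sketched as a simple loop. All names here are hypothetical, and the network, sensor and actuator are replaced by trivial stubs; a real implementation would also need timing control, which is omitted.

```python
def run_guidance_loop(network_e, read_state, apply_command,
                      flight_time_s, dt=0.05):
    """Evaluate network E once per guidance cycle (dt = 0.05 s, i.e. 20 Hz)."""
    n_steps = int(round(flight_time_s / dt))
    commands = []
    for _ in range(n_steps):
        s = read_state()            # state vector S from onboard sensors
        a_c_new = network_e(s)      # optimal control command from network E
        apply_command(a_c_new)      # pass the command to the rudder servos
        commands.append(a_c_new)
    return commands

# Stubs standing in for the trained network, the sensors and the actuators.
applied = []
commands = run_guidance_loop(
    network_e=lambda s: 0.1 * sum(s),
    read_state=lambda: [1.0, 2.0, 3.0],
    apply_command=applied.append,
    flight_time_s=1.0,
)
```

Since one evaluation takes well under 1 ms according to the description, a 50 ms cycle leaves ample margin for sensing and actuation.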
Example 2
The trained DNNs corresponding to the five aircraft, $B_1$, $B_2$, $B_3$, $B_4$ and $B_5$, are called and used as the base learners, which are connected to the meta-learner to obtain network E. The network E trained with 100 new trajectories is denoted E(5,100), and the network E trained with 500 new trajectories is denoted E(5,500).
The errors of E(5,100), E(5,500) and the DNNs corresponding to the five aircraft are compiled: the position error is shown in Fig. 5, the speed error in Fig. 6, the terminal angle error in Fig. 7 and the time error in Fig. 8.
As can be seen from Figs. 4 to 8, the error of network E obtained with fewer than 500 pieces of new data is very close to the error of a retrained DNN, which demonstrates the effectiveness of the guidance method based on integrated transfer learning in the present application.
Example 3
The trained DNNs corresponding to the five aircraft are called, and two, three, four or five of them are used as the base learners, which are connected to the meta-learner to obtain network E. Each network E is trained with 100 new trajectories, giving E(2,100), E(3,100), E(4,100) and E(5,100).
The error performance is compiled: the position error is shown in Fig. 9, the speed error in Fig. 10, the terminal angle error in Fig. 11 and the time error in Fig. 12.
As can be seen from Figs. 9 to 12, guidance performance drops sharply when the number of DNNs in the base learner is reduced to 2, while increasing the number further brings only limited improvement. The number of DNNs is therefore preferably more than 2, but need not exceed 5.
Example 4
The trained DNN neural networks corresponding to the five aircraft are called to form a base learner. The base learner is connected with the meta learner to obtain a network E, which is trained with different numbers of new trajectories, giving E(5,6), E(5,12), E(5,25), E(5,50), E(5,100), E(5,200) and E(5,500).
The error statistics are compiled: the position error is shown in fig. 13, the speed error in fig. 14, the terminal angle error in fig. 15, and the time error in fig. 16.
As can be seen from figs. 13 to 16, when the training data are reduced to 6 trajectories, the performance of the guidance method based on integrated transfer learning (the ETLS method) deteriorates rapidly, while increasing the data to 500 trajectories brings little improvement. The experimental results show that the ETLS method can learn the optimal control under the new aerodynamic parameters with only a small amount of data.
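As a rough illustration of why so little data can suffice, the following Python sketch trains only a single hidden layer meta learner on 12 samples. It assumes that the base learners are kept fixed during this stage (the usual stacking arrangement, but an assumption here), and the data are synthetic: each sample is a made-up vector of five base learner outputs with a made-up target command. The names `init_meta`, `meta_forward` and `train_meta`, the tanh activation and the learning rate are all hypothetical.

```python
import math
import random

def init_meta(n_in, n_hidden, rng):
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def meta_forward(p, a):
    """Single hidden layer (tanh, assumed) followed by a linear output."""
    h = [math.tanh(sum(w * ai for w, ai in zip(row, a)) + b)
         for row, b in zip(p["W1"], p["b1"])]
    y = sum(w * hj for w, hj in zip(p["W2"], h)) + p["b2"]
    return h, y

def mse(p, data):
    return sum((meta_forward(p, a)[1] - t) ** 2 for a, t in data) / len(data)

def train_meta(p, data, lr=0.05, epochs=200, rng=None):
    """Plain stochastic gradient descent on the squared error."""
    rng = rng or random.Random(0)
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for a, t in data:
            h, y = meta_forward(p, a)
            err = y - t                                      # d(0.5*err^2)/dy
            for j, hj in enumerate(h):
                grad_h = err * p["W2"][j] * (1.0 - hj * hj)  # back through tanh
                p["W2"][j] -= lr * err * hj
                for i, ai in enumerate(a):
                    p["W1"][j][i] -= lr * grad_h * ai
                p["b1"][j] -= lr * grad_h
            p["b2"] -= lr * err
    return p

rng = random.Random(1)

def synthetic_sample():
    a = [rng.uniform(-1, 1) for _ in range(5)]      # stand-in base learner outputs
    return a, 0.5 * a[0] + 0.3 * a[1] + 0.2 * a[4]  # made-up target command

train = [synthetic_sample() for _ in range(12)]
params = init_meta(5, 6, rng)
before = mse(params, train)
train_meta(params, train)
after = mse(params, train)
print(before, after)
```

The meta learner has only a few dozen parameters, so a handful of samples already constrains it; this toy example says nothing about the trajectory counts reported in figs. 13 to 16.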
Example 5
The network E(5,12) from Example 4, the DNN neural network B_5 from Example 2, and the conventional trajectory shaping guidance law (TSG; see Zarchan, P., Tactical and Strategic Missile Guidance, vol. 239, American Institute of Aeronautics and Astronautics, Inc., 2012) are called. The three guidance schemes are each used to simulate the same aircraft; the resulting flight trajectories are shown in fig. 17, the speed versus time in fig. 18, the trajectory inclination angle versus time in fig. 19, and the control command versus time in fig. 20.
As can be seen from figs. 17 to 20, the performance of the middle section guidance method based on integrated transfer learning, namely the network E(5,12), is very close to that of the optimal solution, which demonstrates the effectiveness of the method. Although the conventional TSG method can satisfy the terminal angle constraint, it cannot optimize the terminal speed. Furthermore, the old neural network B_5 cannot work properly under the new aerodynamic parameters, which leads to mission failure.
The invention has been described above in connection with preferred embodiments, which are exemplary only and serve illustrative purposes. On this basis, the invention can be subjected to various substitutions and improvements, all of which fall within the protection scope of the invention.

Claims (9)

1. A middle section guidance method based on integrated transfer learning, characterized in that, in the method, an optimal control command a_c^new is obtained in real time during the middle guidance section; the optimal control command a_c^new is used to control the steering engine of the aircraft to actuate the rudder, so that the aircraft flies along a preset trajectory and completes a middle section guidance task that maximizes the terminal speed.
2. The method for mid-stage guidance based on integrated transfer learning of claim 1, wherein,
the optimal control command a_c^new is obtained in real time by inputting the state vector S of the aircraft into a pre-trained network E in real time.
3. The method for mid-stage guidance based on integrated transfer learning of claim 1, wherein,
the training process of the network E comprises the following steps:
step 1, training to obtain at least 5 DNN neural networks to form a base learner;
step 2, connecting the base learner with the meta learner to obtain a network E, that is, taking the output of the base learner as the input of the meta learner;
and step 3, training the network E through a small amount of aircraft training data to obtain the trained network E.
4. The method for mid-stage guidance based on integrated transfer learning of claim 3,
in the step 1, each DNN neural network corresponds to one application scenario, that is, the DNN neural networks target different application scenarios.
5. The method for mid-stage guidance based on integrated transfer learning of claim 3,
in the step 1, the DNN neural network is a deep feedforward neural network with 3 hidden layers of 20 neurons each, and the neurons in each hidden layer are fully connected with the neurons in the preceding layer.
6. The method for mid-stage guidance based on integrated transfer learning of claim 3,
in the step 1, the training process of the DNN neural network includes:
step a, normalizing and grouping training data;
step b, inputting training set data into the DNN neural network, and comparing the predicted value with a standard value in the training set to obtain a loss;
step c, error back propagation and parameter updating;
step d, after the neural network completes one round of training, inputting the data of the validation set and the test set into the neural network and calculating the loss value of the network as a measure of the generalization capability of the neural network; training is stopped when the loss value decreases to a set value or the maximum number of epochs is reached.
7. The method for mid-stage guidance based on integrated transfer learning of claim 3,
the meta learner is a single hidden layer feedforward neural network with at least 5 inputs, namely the outputs of the at least 5 DNN neural networks; the output of the meta learner is the optimal control command a_c^new.
8. The method for mid-stage guidance based on integrated transfer learning of claim 7,
the algorithm in the single hidden layer feedforward neural network is as follows:
Figure FDA0003967911720000021
wherein i denotes the index of an input of the single hidden layer feedforward neural network;
n denotes the number of inputs of the single hidden layer feedforward neural network;
a_ci denotes the i-th input of the single hidden layer feedforward neural network;
C_j denotes the weighting function;
b_j denotes the bias function.
9. The method for mid-stage guidance based on integrated transfer learning of claim 3,
in step 3, the small amount refers to less than 500 sets of data.
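The training procedure of claim 6 (steps a to d) can be illustrated with the following Python sketch. It is a minimal sketch under stated assumptions, not the applicant's implementation: a one-parameter linear model stands in for the DNN so that the gradient step stays short, the data are made up, min-max scaling is assumed for the normalization, and the split fractions, learning rate, loss target and function names are hypothetical.

```python
import random

def min_max_normalize(rows):
    """Step a (part 1): scale every column to [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi)]
            for r in rows]

def split(rows, rng, frac_train=0.75, frac_val=0.125):
    """Step a (part 2): shuffle and group into training / validation / test sets."""
    rows = list(rows)
    rng.shuffle(rows)
    n_train = int(len(rows) * frac_train)
    n_val = int(len(rows) * frac_val)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def mse(w, b, rows):
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

def train(train_set, val_set, loss_target=1e-4, max_epochs=500, lr=0.1):
    w, b = 0.0, 0.0                    # stand-in model y = w*x + b (not a DNN)
    for epoch in range(1, max_epochs + 1):
        for x, y in train_set:
            err = (w * x + b) - y      # step b: prediction vs. reference value
            w -= lr * err * x          # step c: gradient of 0.5*err^2, then update
            b -= lr * err
        val_loss = mse(w, b, val_set)  # step d: loss on held-out data after each epoch
        if val_loss <= loss_target:    # stop at the set value ...
            break                      # ... otherwise run until max_epochs
    return w, b, epoch, val_loss

rng = random.Random(0)
raw = [[x, 3.0 * x + 2.0] for x in range(40)]  # made-up, noise-free samples
train_set, val_set, test_set = split(min_max_normalize(raw), rng)
w, b, epochs_used, val_loss = train(train_set, val_set)
print(epochs_used, val_loss, mse(w, b, test_set))
```

For the actual DNN, steps b and c would be replaced by a forward pass through the 3 hidden layers and error back propagation; the normalization, grouping and stopping logic stay the same.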
CN202211516761.3A 2022-11-28 2022-11-28 Middle section guidance method based on integrated transfer learning Pending CN116185061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211516761.3A CN116185061A (en) 2022-11-28 2022-11-28 Middle section guidance method based on integrated transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211516761.3A CN116185061A (en) 2022-11-28 2022-11-28 Middle section guidance method based on integrated transfer learning

Publications (1)

Publication Number Publication Date
CN116185061A true CN116185061A (en) 2023-05-30

Family

ID=86437202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211516761.3A Pending CN116185061A (en) 2022-11-28 2022-11-28 Middle section guidance method based on integrated transfer learning

Country Status (1)

Country Link
CN (1) CN116185061A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118627572A (en) * 2024-08-12 2024-09-10 国网山东省电力公司莱芜供电公司 Neural network optimization method, system, equipment and medium for microwave filter

Similar Documents

Publication Publication Date Title
CN112198870B (en) Unmanned aerial vehicle autonomous guiding maneuver decision method based on DDQN
CN111339690A (en) Deep reinforcement learning training acceleration method based on expected value function
CN114253296B (en) Hypersonic aircraft airborne track planning method and device, aircraft and medium
CN111176263B (en) Online identification method for thrust fault of aircraft based on BP neural network
CN113625740B (en) Unmanned aerial vehicle air combat game method based on transfer learning pigeon swarm optimization
CN107065897B (en) Three-degree-of-freedom helicopter explicit model prediction control method
CN115688268A (en) Aircraft near-distance air combat situation assessment adaptive weight design method
CN116185061A (en) Middle section guidance method based on integrated transfer learning
CN106980262A (en) Self-adaptive flight device robust control method based on Kernel recursive least square algorithm
CN114118365B (en) Cross-medium aircraft rapid water inlet approximate optimization method based on radial basis network
Roudbari et al. Generalization of ANN-based aircraft dynamics identification techniques into the entire flight envelope
CN111830848A (en) Unmanned aerial vehicle super-maneuvering flight performance simulation training system and method
CN116661493A (en) Deep reinforcement learning-based aerial tanker control strategy method
Dong et al. Trial input method and own-aircraft state prediction in autonomous air combat
JP2002532677A (en) Combat pilot support system
CN110803290B (en) Novel ejection seat program control method
CN114815878B (en) Hypersonic aircraft collaborative guidance method based on real-time optimization and deep learning
Anderson et al. Design of a guided missile interceptor using a genetic algorithm
CN116339373A (en) Monte Carlo self-adaptive dynamic programming unmanned aerial vehicle control method and system
CN110045761A (en) A kind of intelligent rotating platform control system design method based on adaptive Dynamic Programming
Jebakumar et al. Design and verification of carefree maneuvering protection for a high performance fighter aircraft
CN118034065B (en) Training method and device for unmanned aerial vehicle decision network
CN116484227B (en) Neural network modeling method for generating tail end maneuver avoidance index of aircraft bullet countermeasure
CN116360476A (en) Middle section guidance method based on deep neural network
Özdil Trajectory optimization of a tactical missile by using genetic algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination