CN115289917B - Rocket sublevel landing real-time optimal guidance method and system based on deep learning - Google Patents

Rocket sublevel landing real-time optimal guidance method and system based on deep learning

Info

Publication number
CN115289917B
Authority
CN
China
Prior art keywords
landing
rocket
bang
real
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210972819.9A
Other languages
Chinese (zh)
Other versions
CN115289917A (en)
Inventor
王劲博
李辉旭
陈洪波
施健林
苏霖锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202210972819.9A priority Critical patent/CN115289917B/en
Publication of CN115289917A publication Critical patent/CN115289917A/en
Application granted granted Critical
Publication of CN115289917B publication Critical patent/CN115289917B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • F MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F42 AMMUNITION; BLASTING
    • F42B EXPLOSIVE CHARGES, e.g. FOR BLASTING, FIREWORKS, AMMUNITION
    • F42B15/00 Self-propelled projectiles or missiles, e.g. rockets; Guided missiles
    • F42B15/01 Arrangements thereon for guidance or control
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention provides a rocket sublevel landing real-time optimal guidance method and system based on deep learning.

Description

Rocket sublevel landing real-time optimal guidance method and system based on deep learning
Technical Field
The invention relates to the technical field of earth landing guidance of vertical take-off and landing rockets, in particular to a rocket sublevel landing real-time optimal guidance method and system based on deep learning.
Background
The vertical take-off and landing reusable carrier rocket is a new type of launch vehicle and an effective tool for reducing the cost of space launch missions and improving the efficiency of access to space. Rocket sublevel earth landing guidance is a key technology for controlling the position and velocity of the launch vehicle's center of mass during the three-degree-of-freedom return and landing flight: it generates commands that steer the motion of the rocket's center of mass according to a given principle or strategy, so that the flight process satisfies the constraint conditions and the terminal state meets the preset target, thereby ensuring the recovery accuracy of the launch vehicle, reducing fuel consumption, and enabling reliable reuse.
Existing rocket sublevel earth landing guidance methods are mainly online trajectory optimization guidance methods: a dynamics model of the target rocket and a corresponding trajectory optimization problem model are established, and an indirect or direct method is used to solve for the optimal flight trajectory online. Although such methods can obtain trajectories that satisfy the constraint conditions and meet accuracy and performance-index requirements, they require online numerical optimization with iterative computation, and therefore suffer from poor real-time performance, potential convergence difficulties, and possible infeasibility of the problem. When applied to complex, highly dynamic tasks such as rocket earth landing, these shortcomings pose potential safety hazards.
Disclosure of Invention
The invention aims to provide a rocket sublevel landing real-time optimal guidance method and system based on deep learning.
In order to achieve the above object, it is necessary to provide a rocket substage landing real-time optimal guidance method and system based on deep learning to solve the above technical problems.
In a first aspect, an embodiment of the present invention provides a rocket substage landing real-time optimal guidance method based on deep learning, including the following steps:
acquiring a rocket fuel optimal landing data set; the rocket fuel optimal landing data set comprises a Bang-Bang type data set and a Non-Bang-Bang type data set;
constructing an intelligent controller according to the rocket fuel optimal landing data set; the intelligent controller comprises a track classification network, a first control regression network and a second control regression network;
acquiring a real-time state of the rocket, and generating a corresponding intelligent control instruction according to the intelligent controller;
and verifying the landing control effect of the intelligent control instruction through ultra-real-time trajectory integration simulation to obtain the terminal landing precision.
Further, the intelligent controller also comprises a pre-constructed backup protection solver;
after the step of verifying the landing control effect of the intelligent control instruction through the super real-time trajectory integration simulation to obtain the landing precision of the terminal, the method further comprises the following steps of:
judging whether a preset landing precision standard is met or not according to the terminal landing precision, if not, hot-starting the backup protection solver to generate an optimized control instruction, and replacing the intelligent control instruction with the optimized control instruction to carry out landing control.
Further, the step of obtaining a rocket fuel optimal landing data set comprises:
solving the trajectory optimization problem of the rocket power descending section with aerodynamic drag by adopting a preset trajectory optimization numerical algorithm to obtain a plurality of fuel optimal landing trajectories;
classifying the optimal landing tracks of the fuels according to the characteristics of the thrust section, and carrying out corresponding normalization processing to obtain a Bang-Bang type track set and a Non-Bang-Bang type track set;
respectively obtaining the initial state value of each track in the Bang-Bang type track set and the Non-Bang-Bang type track set, and carrying out corresponding type labeling on each initial state value to obtain a Bang-Bang type data set and a Non-Bang-Bang type data set.
Further, the step of constructing an intelligent controller based on the rocket fuel optimal landing data set comprises:
training a preset classification network to obtain the trajectory classification network according to the rocket fuel optimal landing data set;
training a preset control regression network according to the Bang-Bang type data set to obtain a first control regression network;
and training a preset control regression network according to the Non-Bang-Bang type data set to obtain the second control regression network.
Further, the step of acquiring the real-time state of the rocket and generating a corresponding intelligent control instruction according to the intelligent controller comprises:
after normalization processing is carried out on the rocket real-time state, the rocket real-time state is input into the intelligent controller;
carrying out track classification according to the track classification network of the intelligent controller, and selecting one of the first control regression network and the second control regression network as a corresponding control regression network according to a corresponding track classification result;
generating a control instruction to be processed according to the control regression network;
and sequentially carrying out reverse normalization processing and amplitude limiting processing on the control instruction to be processed to obtain the intelligent control instruction.
Further, the step of hot starting the backup protection solver to generate an optimized control instruction comprises:
acquiring a rocket state to be analyzed in a current guidance period, and solving a corresponding optimal landing track through the backup protection solver according to the rocket state to be analyzed;
and performing track interpolation processing on the optimal landing track to obtain a control instruction corresponding to the time interval as an optimal control instruction.
In a second aspect, an embodiment of the present invention provides a deep learning-based rocket substage landing real-time optimal guidance system, where the system includes:
the data set construction module is used for acquiring an optimal landing data set of rocket fuel; the rocket fuel optimal landing data set comprises a Bang-Bang type data set and a Non-Bang-Bang type data set;
the controller construction module is used for constructing an intelligent controller according to the rocket fuel optimal landing data set; the intelligent controller comprises a track classification network, a first control regression network and a second control regression network;
the instruction generating module is used for acquiring the real-time state of the rocket and generating a corresponding intelligent control instruction according to the intelligent controller;
and the effect verification module is used for verifying the landing control effect of the intelligent control instruction through ultra-real-time trajectory integration simulation to obtain the terminal landing precision.
Further, the intelligent controller also comprises a pre-constructed backup protection solver; the system further comprises:
an instruction optimization module, used for judging whether a preset landing precision standard is met according to the terminal landing precision and, if not, hot-starting the backup protection solver to generate an optimized control instruction and using the optimized control instruction instead of the intelligent control instruction to carry out landing control.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above method.
The technical scheme of the method comprises: obtaining a rocket fuel optimal landing data set; constructing, from this data set, an intelligent controller comprising a trajectory classification network, a first control regression network and a second control regression network; obtaining the real-time state of the rocket and generating a corresponding intelligent control instruction with the intelligent controller; and verifying the landing control effect of the intelligent control instruction through super real-time trajectory integration simulation to obtain the terminal landing precision. Compared with the prior art, the deep-learning-based rocket sublevel landing real-time optimal guidance method avoids the potential risks that the black-box nature of a neural network model brings to a landing guidance task, effectively guides the rocket to complete the earth landing with high precision even under the large initial-state deviations caused by the preceding flight, offers extremely high real-time performance, does not suffer from convergence difficulties or problem infeasibility during online application, has high reliability, and can meet practical application requirements.
Drawings
FIG. 1 is a schematic diagram of a landing site coordinate system used for building a rocket three-degree-of-freedom motion model in an embodiment of the present invention;
FIG. 2 is a schematic view of an application scenario of a rocket substage landing real-time optimal guidance method based on deep learning in the embodiment of the present invention;
FIG. 3 is a schematic flow chart of a rocket substage landing real-time optimal guidance method based on deep learning in the embodiment of the invention;
FIG. 4 is a schematic diagram of a trajectory classification network of an intelligent controller according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the first control regression network and the second control regression network of the intelligent controller according to the embodiment of the present invention;
FIG. 6 is another schematic flow chart of the rocket substage landing real-time optimal guidance method based on deep learning in the embodiment of the invention;
FIG. 7 is a schematic structural diagram of a rocket substage landing real-time optimal guidance system based on deep learning in the embodiment of the invention;
FIG. 8 is another schematic structural diagram of the rocket substage landing real-time optimal guidance system based on deep learning in the embodiment of the invention;
FIG. 9 is an internal structural diagram of a computer device in the embodiment of the present invention.
Detailed Description
In order to make the purpose, technical solution and advantages of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments, and it is obvious that the embodiments described below are part of the embodiments of the present invention, and are used for illustrating the present invention only, but not for limiting the scope of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The deep-learning-based rocket sublevel landing real-time optimal guidance studied here can be understood as solving a nonlinear, continuous rocket fuel optimal landing trajectory optimization problem established from the actual rocket flight conditions and mission targets. This trajectory optimization problem consists of a performance index function, the rocket sublevel dynamics equations, and process and terminal constraints, specifically as follows:
1) The rocket sublevel dynamics equations are based on the landing-site coordinate system shown in FIG. 1. They consider the influence of engine thrust, earth gravity, and the aerodynamic force generated in the dense-atmosphere environment on the rocket sublevel during the final landing process, but neglect the influence of control mechanisms such as grid rudders and the reaction control system (RCS) on rocket attitude adjustment. The thrust generated by the rocket engine is taken as the only control quantity, the rocket landing motion is treated as a center-of-mass motion, and the three-degree-of-freedom dynamics equations are established with a flat landing site and a constant gravitational-field model, specifically expressed as:
$$\dot{\mathbf{r}}(t)=\mathbf{V}(t),\qquad \dot{\mathbf{V}}(t)=\frac{\mathbf{T}(t)+\mathbf{D}(t)}{m(t)}+\mathbf{g},\qquad \dot{m}(t)=-\frac{\|\mathbf{T}(t)\|}{I_{sp}\,g_{0}}$$

wherein $m$ is the mass of the rocket body; $\mathbf{r}\in\mathbb{R}^{3}$ and $\mathbf{V}\in\mathbb{R}^{3}$ are respectively the three-dimensional position vector and velocity vector of the rocket sublevel; $\mathbf{g}$ is the gravitational acceleration acting on the rocket, which is in general a function of the rocket position $\mathbf{r}$ but is set to a constant value in the solution of the problem herein; $\mathbf{T}\in\mathbb{R}^{3}$ is the rocket thrust vector, which can be understood as the thrust of an equivalent engine obtained by combining the several engines equipped on the rocket sublevel, with rocket attitude transformation neglected; it is also the control variable of the trajectory optimization problem, specifically expressed as:

$$\mathbf{T}=[T_{x},\;T_{y},\;T_{z}]^{\mathrm{T}},\qquad \dot{m}=-\frac{\|\mathbf{T}\|}{I_{sp}\,g_{0}}$$

wherein $I_{sp}$ is the specific impulse of the fuel, $g_{0}$ is the average gravitational acceleration at the earth's sea level, $\dot{m}$ is the propellant mass flow rate after engine ignition, $\|\mathbf{T}\|$ is the thrust amplitude, and $T_{x}$, $T_{y}$ and $T_{z}$ are the thrust components along the three axes of the landing-site coordinate system; $\mathbf{D}\in\mathbb{R}^{3}$ is the aerodynamic drag vector acting on the rocket sublevel, which can be expressed as:

$$\mathbf{D}=-\tfrac{1}{2}\,C_{D}\,S_{ref}\,\rho\,\|\mathbf{V}\|\,\mathbf{V}$$

where

$$\rho=\rho_{0}\,e^{-h/h_{ref}}$$

wherein $C_{D}$ is the drag coefficient; $S_{ref}$ is the reference area of the rocket sublevel; $\rho$ is the atmospheric density of the earth landing environment, represented by an exponential atmospheric density model; $\mathbf{V}$ is the rocket velocity vector; $\rho_{0}$ is the reference atmospheric density at the earth's sea level; $h$ is the flight height of the rocket sublevel, i.e. the Z-axis position component in the landing-site coordinate system; and $h_{ref}$ is the reference height.
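As an illustration only, the three-degree-of-freedom dynamics and the exponential atmosphere model above might be coded as in the following Python sketch; the numerical constants are placeholders and not values from this disclosure, and the Z axis of the landing-site frame is taken as the vertical axis consistent with the height definition above.

```python
import numpy as np

# Placeholder constants (illustrative only, not values from the patent)
ISP = 300.0        # specific impulse of the fuel, s
G0 = 9.80665       # average gravitational acceleration at sea level, m/s^2
CD = 0.8           # drag coefficient
S_REF = 10.0       # rocket sublevel reference area, m^2
RHO0 = 1.225       # reference atmospheric density at sea level, kg/m^3
H_REF = 7200.0     # reference height of the exponential atmosphere, m
G_VEC = np.array([0.0, 0.0, -G0])   # constant gravity in the landing-site frame

def dynamics(x, T):
    """Three-degree-of-freedom landing dynamics.
    x = [rx, ry, rz, Vx, Vy, Vz, m]; T = [Tx, Ty, Tz] is the thrust command."""
    x = np.asarray(x, dtype=float)
    T = np.asarray(T, dtype=float)
    r, V, m = x[0:3], x[3:6], x[6]
    h = r[2]                                   # flight height = Z position component
    rho = RHO0 * np.exp(-h / H_REF)            # exponential atmospheric density model
    D = -0.5 * CD * S_REF * rho * np.linalg.norm(V) * V   # aerodynamic drag vector
    r_dot = V
    V_dot = (T + D) / m + G_VEC
    m_dot = -np.linalg.norm(T) / (ISP * G0)    # propellant consumption rate
    return np.concatenate([r_dot, V_dot, [m_dot]])
```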
2) Terminal and process constraints
In order to guarantee a successful landing of the rocket, strict limits need to be imposed on the states at the two endpoints of the flight, i.e. at its beginning and end. The endpoint constraints are as follows:

$$\mathbf{r}(t_{0})=\mathbf{r}_{0},\quad \mathbf{V}(t_{0})=\mathbf{V}_{0},\quad m(t_{0})=m_{0},\qquad \mathbf{r}(t_{f})=\mathbf{r}_{f},\quad \mathbf{V}(t_{f})=\mathbf{V}_{f},\quad m(t_{f})\geq m_{dry}$$

wherein $t_{0}$ and $t_{f}$ are respectively the start and end times of the problem; $\mathbf{r}_{0}$, $\mathbf{V}_{0}$ and $m_{0}$ respectively represent the initial position, initial velocity and initial mass of the rocket; $\mathbf{r}_{f}$ and $\mathbf{V}_{f}$ respectively represent the target position and target velocity at the terminal moment of the rocket sublevel landing, with the landing target point taken as the origin of the landing-site coordinate system, i.e. $\mathbf{r}_{f}=\mathbf{0}$ and $\mathbf{V}_{f}=\mathbf{0}$; $m(t_{f})$ and $m_{dry}$ respectively represent the remaining mass at landing and the dry mass of the rocket, and the remaining mass at landing must be no less than the dry mass;
in addition, to ensure that the rocket does not pass below the ground during the solution of the trajectory optimization problem, the following constraint must be imposed on the flight height of the rocket sublevel:

$$r_{z}\geq 0;$$
meanwhile, owing to the limitations of current reusable engine technology and to ensure safety during the landing process, the engine is not shut down once it has been ignited in the final landing flight segment; that is, throughout the powered descent the rocket sublevel is subject to a non-zero minimum thrust. The flight process constraints therefore also include the rocket sublevel engine thrust amplitude constraint:

$$T_{min}\leq\|\mathbf{T}\|\leq T_{max}$$

wherein $T_{max}$ and $T_{min}$ are respectively the upper and lower bounds of the rocket engine thrust amplitude.
3) Performance index
The fuel-optimal landing problem requires minimum fuel consumption over the whole flight, which is determined by the thrust delivered by the engine; the performance index is as follows:

$$J=m_{0}-m(t_{f})=\int_{t_{0}}^{t_{f}}\frac{\|\mathbf{T}(t)\|}{I_{sp}\,g_{0}}\,\mathrm{d}t$$

wherein $I_{sp}$, $g_{0}$ and $m_{0}$ are all known constants of the trajectory optimization problem, so the fuel-optimal performance index is equivalent to maximizing the rocket sublevel terminal mass, and the terminal mass at landing can be used directly as the performance index.
4) Modeling of rocket fuel optimal landing trajectory optimization problem
Based on the above analysis of the model dynamics and constraint conditions, the rocket sublevel landing fuel-optimal trajectory optimization problem P0 can be stated as follows:

$$\text{P0:}\quad \min_{\mathbf{T}(t)}\; J=m_{0}-m(t_{f})\quad \text{s.t.}\;\; \dot{\mathbf{r}}=\mathbf{V},\;\; \dot{\mathbf{V}}=\frac{\mathbf{T}+\mathbf{D}}{m}+\mathbf{g},\;\; \dot{m}=-\frac{\|\mathbf{T}\|}{I_{sp}\,g_{0}},\;\; r_{z}\geq 0,\;\; T_{min}\leq\|\mathbf{T}\|\leq T_{max},$$ together with the endpoint constraints given above.
the invention provides a rocket sublevel landing real-time optimal guidance method based on the solution of the rocket sublevel landing fuel optimal trajectory optimization problem P0 model, which can be applied to vertical take-off and landing reusable carrier rocket earth return landing guidance, can effectively deal with working conditions such as large-range deviation of shift-handing conditions generated by preorder flight in rocket sublevel return landing flight and the like based on the overall architecture shown in figure 2, does not face the potential convergence difficulty problem of the traditional guidance method based on trajectory optimization, provides a high-real-time, high-precision and high-reliability guidance strategy for rocket sublevel earth landing, can meet the actual application requirements, and provides reliable guarantee for guiding a rocket to perform high-precision fixed-point soft landing; it should be noted that the method of the present invention can be executed by a server that undertakes related functions, and the following embodiments all take the server as an execution subject, and the deep learning-based rocket substage landing real-time optimal guidance method of the present invention is described in detail.
In one embodiment, as shown in fig. 3, a deep learning-based rocket substage landing real-time optimal guidance method is provided, which comprises the following steps:
s11, obtaining an optimal rocket fuel landing data set; the rocket fuel optimal landing data set comprises a Bang-Bang type data set and a Non-Bang-Bang type data set; wherein, the rocket fuel optimal landing data set can be understood as a data set of a plurality of fuel optimal trajectories collected by solving the rocket sublevel landing fuel optimal trajectory optimization problem through efficient digital optimization calculation, and comprises Bang-Bang type fuel optimal trajectories and Non-Bang-Bang type fuel optimal trajectories, and each trajectory is in a series (state x) k Control u k ) For example, where a state can be represented as
Figure BDA0003795049250000091
r x 、r y And r z Respectively representing the three-dimensional position component, V, of the rocket x 、V y And V z Respectively representing three-dimensional velocity components, m representing mass, and control represented as
Figure BDA0003795049250000092
T x 、T y And T z Respectively representing three-dimensional thrust components of the rocket;
specifically, the step of acquiring the rocket fuel optimal landing data set comprises the following steps:
solving the trajectory optimization problem of the rocket power descending section with aerodynamic drag by adopting a preset trajectory optimization numerical algorithm to obtain a plurality of fuel optimal landing trajectories; the preset track optimization numerical algorithm can be realized by adopting the existing high-efficiency numerical optimization algorithm, and the specific used algorithm and the solving process are not specifically limited;
classifying the fuel-optimal landing trajectories according to the characteristics of the thrust profile and carrying out the corresponding normalization to obtain a Bang-Bang type trajectory set and a Non-Bang-Bang type trajectory set; the normalization maps each variable in the trajectory data to dimensionless data in the interval [-1, 1], with the corresponding normalization function expressed as:

$$x_{norm}=\frac{2\,(x-x_{min})}{x_{max}-x_{min}}-1$$

wherein $x$ and $x_{norm}$ respectively represent the data before and after normalization, and $x_{min}$ and $x_{max}$ respectively represent the minimum and maximum values of the data before normalization;
respectively obtaining the initial state value of each trajectory in the Bang-Bang type trajectory set and the Non-Bang-Bang type trajectory set, and labeling each initial state value with its corresponding class to obtain the Bang-Bang type data set and the Non-Bang-Bang type data set; the initial state value of each trajectory can be understood as the rocket initial state $x_{0}=[r_{x0},\;r_{y0},\;r_{z0},\;V_{x0},\;V_{y0},\;V_{z0},\;m_{0}]$ shown in Table 1:
TABLE 1 Initial state of a certain rocket flight trajectory
The class labeling can be understood as marking the initial state of each trajectory with one of the two classes, Bang-Bang type or Non-Bang-Bang type, to obtain (initial state value $x_{0}$, class label) data pairs, which form the classification data set used for the subsequent training of the intelligent controller;
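As an illustration of the data-set construction described in step S11, the following Python sketch normalizes each trajectory variable to [-1, 1] and labels each initial state with its thrust-profile class; the trajectory solver and the thrust-profile test are assumed to exist as placeholder functions (`solve_trajectory`, `is_bang_bang`), since the disclosure does not fix a particular numerical algorithm.

```python
import numpy as np

def normalize(x, x_min, x_max):
    """Map data to the dimensionless interval [-1, 1]."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def build_landing_dataset(initial_states, solve_trajectory, is_bang_bang):
    """initial_states: list of x0 = [rx0, ry0, rz0, Vx0, Vy0, Vz0, m0].
    solve_trajectory(x0) -> (states, controls) of one fuel-optimal trajectory.
    is_bang_bang(controls) -> True if the thrust profile is Bang-Bang type."""
    trajectories = [(x0, *solve_trajectory(x0)) for x0 in initial_states]
    # global normalization bounds over every variable of the collected trajectories
    all_states = np.vstack([s for _, s, _ in trajectories])
    all_ctrls = np.vstack([u for _, _, u in trajectories])
    s_min, s_max = all_states.min(axis=0), all_states.max(axis=0)
    u_min, u_max = all_ctrls.min(axis=0), all_ctrls.max(axis=0)

    classification_set, bang_bang_set, non_bang_bang_set = [], [], []
    for x0, states, controls in trajectories:
        label = 0 if is_bang_bang(controls) else 1      # 0: Bang-Bang, 1: Non-Bang-Bang
        pair = (normalize(states, s_min, s_max), normalize(controls, u_min, u_max))
        classification_set.append((normalize(np.asarray(x0), s_min, s_max), label))
        (bang_bang_set if label == 0 else non_bang_bang_set).append(pair)
    return classification_set, bang_bang_set, non_bang_bang_set, (s_min, s_max, u_min, u_max)
```

The returned normalization bounds would be kept, since the same mapping is reused online for the real-time rocket state and for restoring the predicted thrust commands.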
s12, constructing an intelligent controller according to the rocket fuel optimal landing data set; the intelligent controller comprises a track classification network, a first control regression network and a second control regression network, and is composed of full connection layers, and the mathematical expression of the intelligent controller is as follows:
y=g(w T x+b)
where x denotes the input of the previous layer, y denotes the output of the present layer, w T Representing weights of neurons of the layer, b representing bias values of the layer, and g representing an activation function of the neurons of the layer; wherein the trajectory classification network isThe method comprises the following steps that a first control regression network and a second control regression network are respectively connected in series, and the first control regression network and the second control regression network are connected in parallel, namely, the initial state of starting a rocket task is used as an intelligent controller to be input, the corresponding regression network (the first control regression network or the second control regression network) is selected as a controller to be closed to the rocket flight task according to the classification result of a trajectory classification network, and a real-time control instruction is generated according to the rocket state; specifically, the step of constructing an intelligent controller according to the rocket fuel optimal landing data set comprises the following steps:
training a preset classification network according to the rocket fuel optimal landing data set to obtain the trajectory classification network; in order to improve classification efficiency for the characteristics of the rocket fuel optimal landing data set, this embodiment preferably adopts a network structure comprising a first input layer, a first hidden layer and a first output layer connected in sequence, as shown in FIG. 4: the first input layer comprises 7 neural units corresponding to the dimensions of the rocket initial state $x_{0}$; the first hidden layer comprises 3 fully connected layers, each with 32 neural units and a ReLU activation function; the first output layer comprises 2 neurons corresponding to the probabilities of the two trajectory classes, Bang-Bang type and Non-Bang-Bang type, and its input is converted into the corresponding probabilities by a softmax function, whose mathematical expression is:

$$S_{i}=\frac{e^{z_{i}}}{\sum_{j}e^{z_{j}}}$$

wherein $z_{i}$ is the input of the $i$-th output neuron and $S_{i}$ is its output; the softmax layer maps the outputs of the several neurons into the interval [0, 1], i.e. into a probability form, so that the input can be classified according to these probabilities, and the node with the maximum probability value gives the final predicted class;
training a preset control regression network according to the Bang-Bang type data set to obtain a first control regression network;
and training a preset control regression network according to the Non-Bang-Bang type data set to obtain the second control regression network.
The first control regression network and the second control regression network can be understood as single regression networks corresponding respectively to the Bang-Bang type and Non-Bang-Bang type trajectories; the parameters of the two networks may differ, but their model structures are the same, each comprising a second input layer, a second hidden layer and a second output layer connected in sequence as shown in FIG. 5. Preferably, the second input layer comprises 7 neural units corresponding to the dimensions of the rocket state $x=[r_{x},\;r_{y},\;r_{z},\;V_{x},\;V_{y},\;V_{z},\;m]^{\mathrm{T}}$; the second hidden layer comprises 12 fully connected layers, each with 300 neural units and a ReLU activation function; and the second output layer comprises 3 neurons corresponding to the rocket control variables $u=[T_{x},\;T_{y},\;T_{z}]^{\mathrm{T}}$, with tanh as the output layer activation function.
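A minimal sketch of the two network types described above, assuming PyTorch, is given below; the layer widths (3 x 32 for classification, 12 x 300 for regression) follow the preferred sizes stated in this description, while the class definitions and names are illustrative.

```python
import torch
import torch.nn as nn

class TrajectoryClassifier(nn.Module):
    """7-D initial state -> probabilities of Bang-Bang / Non-Bang-Bang trajectory."""
    def __init__(self):
        super().__init__()
        layers, width = [], 7
        for _ in range(3):                       # first hidden layer: 3 x 32, ReLU
            layers += [nn.Linear(width, 32), nn.ReLU()]
            width = 32
        layers += [nn.Linear(width, 2)]          # two trajectory classes
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return torch.softmax(self.net(x), dim=-1)

class ControlRegressor(nn.Module):
    """7-D rocket state -> 3-D normalized thrust command in [-1, 1]."""
    def __init__(self, hidden_layers=12, width=300):
        super().__init__()
        layers, in_dim = [], 7
        for _ in range(hidden_layers):           # second hidden layer: 12 x 300, ReLU
            layers += [nn.Linear(in_dim, width), nn.ReLU()]
            in_dim = width
        layers += [nn.Linear(in_dim, 3), nn.Tanh()]   # tanh output layer
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```

In use, two separate ControlRegressor instances would be trained, one on the Bang-Bang type data set and one on the Non-Bang-Bang type data set, to serve as the first and second control regression networks.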
the trajectory classification network, the first control regression network and the second control regression network training process all comprise the steps of dividing corresponding data sets into training sets and verification sets, inputting data in the training sets into a network to be trained according to batches to train network parameters, using the corresponding verification sets to input in batches to verify after each round of training is finished, and storing parameters of a current neural network to update network model parameters when the current round training effect is superior to the prior optimal effect to obtain a stable and available network model so as to obtain an intelligent controller capable of being used for real-time online analysis; specifically, the network training process is as follows:
first, the constructed data set is divided into a training set and a validation set according to the set training/validation ratio; the parameters W of the trajectory classification network, the first control regression network and the second control regression network are initialized together with the training batch size batch_size, the number of training rounds epoch, the learning rate lr and the loss function L, and the Adam optimization algorithm is used to optimize the network parameters;
during each round of network training, batches of training-set data $X_{i}$ are input into the network to obtain the network outputs $Y_{i}$, and the mean square error (MSE) loss function is calculated:

$$L=\frac{1}{N}\sum_{i=1}^{N}\left(Y_{i}-\hat{Y}_{i}\right)^{2}$$

wherein $\hat{Y}_{i}$ denotes the label corresponding to $X_{i}$ in the data set and $N$ is the batch size; the descent direction of the network loss function is then obtained as:

$$\Delta W=-lr\cdot\frac{\partial L}{\partial W}$$

and the network parameters are updated with the Adam optimizer according to this gradient, $W=W+\Delta W$. The network with updated parameters is then tested on the validation set data to verify its fitting effect, and if the mean square error on the validation set is lower than the lowest value so far, the network parameters of the current round are saved;
and repeating the process until the set training round number epoch is reached to obtain the corresponding track classification network, the first control regression network and the second control regression network.
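The training procedure described above can be sketched roughly as follows (PyTorch assumed); the split ratio, batch size, learning rate, epoch count and file name are placeholders, and the MSE loss from the description is assumed for all three networks, with one-hot targets for the classifier.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

def train_network(model, inputs, targets, loss_fn, epochs=200, batch_size=64,
                  lr=1e-3, val_ratio=0.2, save_path="best_model.pt"):
    dataset = TensorDataset(torch.as_tensor(inputs, dtype=torch.float32),
                            torch.as_tensor(targets, dtype=torch.float32))
    n_val = int(len(dataset) * val_ratio)
    train_set, val_set = random_split(dataset, [len(dataset) - n_val, n_val])
    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=batch_size)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)   # Adam optimization
    best_val = float("inf")
    for _ in range(epochs):
        model.train()
        for x_batch, y_batch in train_loader:                 # batch-wise training
            optimizer.zero_grad()
            loss = loss_fn(model(x_batch), y_batch)
            loss.backward()                                   # gradient of the loss
            optimizer.step()                                  # parameter update W = W + dW
        model.eval()
        with torch.no_grad():                                 # validation after each round
            val_loss = sum(loss_fn(model(x), y).item()
                           for x, y in val_loader) / max(len(val_loader), 1)
        if val_loss < best_val:                               # keep the best parameters so far
            best_val = val_loss
            torch.save(model.state_dict(), save_path)
    return best_val
```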
S13, acquiring a real-time state of the rocket, and generating a corresponding intelligent control instruction according to the intelligent controller; the intelligent control instruction can be understood as a real-time instruction for guiding the rocket to land and fly, which is predicted by the intelligent controller according to the real-time state of the rocket in each guidance period; specifically, the step of acquiring the real-time state of the rocket and generating a corresponding intelligent control instruction according to the intelligent controller includes:
after normalization processing is carried out on the rocket real-time state, the rocket real-time state is input into the intelligent controller; the normalization processing of the rocket real-time state is the same as the normalization processing used in the rocket fuel optimal landing data set establishing process, and is not repeated here;
carrying out track classification according to the track classification network of the intelligent controller, and selecting one of the first control regression network and the second control regression network as a corresponding control regression network according to a corresponding track classification result; if the trajectory classification network carries out trajectory classification according to the real-time state of the rocket to obtain a trajectory classification result of Bang-Bang type, selecting a first control regression network as a corresponding control regression network for predicting and generating an intelligent control instruction of the guidance period, and if the trajectory classification result is Non-Bang-Bang type, using a second control regression network as a corresponding control regression network for predicting and generating the intelligent control instruction of the guidance period;
generating a control instruction to be processed according to the control regression network; the process of generating the control instruction to be processed is referred to the functional description of each network model of the intelligent controller, and is not described again here;
sequentially carrying out inverse normalization and amplitude limiting on the control instruction to be processed to obtain the intelligent control instruction; the inverse normalization restores the dimensionless thrust control instruction predicted by the control regression network to its original physical dimensions, with the corresponding inverse normalization formula:

$$x=\frac{(x_{norm}+1)(x_{max}-x_{min})}{2}+x_{min}$$

the amplitude limiting prevents the thrust instruction obtained by network fitting from exceeding the actual rocket thrust amplitude constraint, clipping the thrust amplitude as follows:

$$\mathbf{T}\leftarrow\frac{\mathbf{T}}{\|\mathbf{T}\|}\,\min\!\left(\max\!\left(\|\mathbf{T}\|,\,T_{min}\right),\,T_{max}\right)$$
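Putting step S13 together, one guidance-period inference pass might look like the following sketch; `state_bounds` and `ctrl_bounds` stand for the normalization bounds saved during data-set construction, and the function name and interface are illustrative rather than part of the disclosure.

```python
import numpy as np
import torch

def intelligent_control_command(state, classifier, regressor_bang, regressor_non_bang,
                                state_bounds, ctrl_bounds, T_min, T_max):
    """state: raw rocket state [rx, ry, rz, Vx, Vy, Vz, m] for the current guidance period."""
    s_min, s_max = state_bounds
    u_min, u_max = ctrl_bounds
    # 1) normalization of the real-time rocket state
    x = 2.0 * (np.asarray(state, dtype=float) - s_min) / (s_max - s_min) - 1.0
    x_t = torch.as_tensor(x, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        # 2) trajectory classification selects the control regression network
        probs = classifier(x_t)
        regressor = regressor_bang if probs.argmax().item() == 0 else regressor_non_bang
        # 3) control instruction to be processed (dimensionless, in [-1, 1])
        u_norm = regressor(x_t).squeeze(0).numpy()
    # 4) inverse normalization back to physical thrust components
    T = (u_norm + 1.0) * (u_max - u_min) / 2.0 + u_min
    # 5) amplitude limiting: clamp the thrust magnitude to [T_min, T_max]
    mag = np.linalg.norm(T)
    T = T * np.clip(mag, T_min, T_max) / max(mag, 1e-12)
    return T
```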
S14, verifying the landing control effect of the intelligent control instruction through super real-time trajectory integration simulation to obtain the terminal landing precision; the terminal landing precision can be understood as the error between the expected target state and the complete trajectory obtained, in each guidance period, by super real-time trajectory integration simulation of the rocket landing flight guided by the intelligent control instructions that the intelligent controller gives for the real-time rocket state, i.e. the landing control effect. Specifically, the super real-time trajectory integration simulation verification process is as follows:
in each guidance period, a simulated dynamics model of the rocket is established in a super real-time track integral simulation system while the rocket actually flies, the rocket is subjected to dynamics integration through an intelligent control instruction given by an input intelligent controller of the model, the change of the dynamics state of the model is driven, meanwhile, waiting time is not set, the integrated system state is fed back to the input of the intelligent controller for next round of system integration, the complete track of the subsequent landing of the rocket is predicted in a short time, quality evaluation is carried out on the predicted track based on track evaluation logic, and the error between the terminal state of the simulated track and the expected target state is verified. If the error of the terminal is larger, the simulation track quality is poorer, and the landing precision of the terminal is lower, otherwise, the simulation track quality is better, and the landing precision of the terminal is higher. It should be noted that the super real-time trajectory integration simulation system has strong calculation performance, the time required for the rocket to fly for 1 second in a real environment is 1 second, and in simulation, the simulation orbit 1 second only needs to integrate dynamics for several times according to a set integration interval, and can complete trajectory simulation for 1 second within tens of milliseconds, so that in the super real-time trajectory integration simulation system, subsequent complete trajectory simulation for tens of seconds can be realized within several seconds, the verification efficiency is high, and high real-time performance of an intelligent controller for outputting an effective control instruction is further ensured.
In principle, the above steps already provide real-time optimal guidance for rocket sublevel earth landing by the deep-learning-based intelligent controller. However, considering the potential unreliability brought by the black-box nature of a neural-network-based intelligent controller, and in order to further improve the accuracy of the optimal guidance, a pre-constructed backup protection solver is preferably embedded in the intelligent controller as a backup control instruction generator to compensate for the potential risks of the intelligent controller. That is, the intelligent control instruction output by the intelligent controller in step S13 is not used directly to guide the rocket landing flight; instead, after the landing control effect of the intelligent control instruction has been verified by the super real-time trajectory integration simulation of step S14, the method further comprises the following steps, as shown in FIG. 6:
S15, judging whether the preset landing precision standard is met according to the terminal landing precision; if not, hot-starting the backup protection solver to generate an optimized control instruction, and using the optimized control instruction instead of the intelligent control instruction to carry out landing control. The backup protection solver is loosely coupled to the neural-network-based intelligent controller and its form is not strictly constrained, but because its application scenario requires highly real-time online computation, it can be implemented either with an algorithm customized to the specific problem or with an existing efficient trajectory optimization numerical algorithm. To obtain a good solution effect, trajectory optimization algorithms based on sequential convexification or similar convex methods are usually adopted for the rocket earth landing trajectory problem; and because the predicted trajectory obtained by the super real-time trajectory integration simulation system is sufficiently close to the optimal landing trajectory even when the landing precision does not meet the mission requirement, the backup protection solver of this embodiment preferably adopts a direct method based on convex optimization and hot-starts its trajectory optimization numerical algorithm with the predicted trajectory as the initial reference trajectory. This greatly reduces the number of iterations of the numerical optimization algorithm and accelerates its solution, so that the backup protection solver and the neural-network-based intelligent controller complement and reinforce each other, further ensuring the highly real-time generation of reliable control instructions;
specifically, the step of hot-starting the backup protection solver to generate the optimized control instruction includes:
acquiring a rocket state to be analyzed in a current guidance period, and solving a corresponding optimal landing track through the backup protection solver according to the rocket state to be analyzed;
performing track interpolation processing on the optimal landing track to obtain a control instruction corresponding to a time interval as an optimal control instruction; the track interpolation is implemented by using the prior art, such as linear interpolation, and is not limited herein.
With this method, the rocket is guided to complete the earth landing with high precision even under the large initial-state deviations caused by the preceding flight; the method has high real-time performance, does not suffer from convergence difficulties or problem infeasibility during online application, has high reliability, and can meet practical application requirements.
It should be noted that, although the steps in the above flowcharts are shown in sequence as indicated by the arrows, they are not necessarily executed in that sequence; unless explicitly stated otherwise herein, the steps are not strictly ordered and may be performed in other orders.
In one embodiment, as shown in fig. 7, there is provided a deep learning-based rocket substage landing real-time optimal guidance system, the system comprising:
the data set construction module 1 is used for acquiring an optimal landing data set of rocket fuel; the rocket fuel optimal landing data set comprises a Bang-Bang type data set and a Non-Bang-Bang type data set;
the controller building module 2 is used for building an intelligent controller according to the rocket fuel optimal landing data set; the intelligent controller comprises a track classification network, a first control regression network and a second control regression network;
the instruction generating module 3 is used for acquiring the real-time state of the rocket and generating a corresponding intelligent control instruction according to the intelligent controller;
and the effect verification module 4 is used for verifying the landing control effect of the intelligent control instruction through ultra-real-time trajectory integration simulation to obtain the terminal landing precision.
In one embodiment, as shown in FIG. 8, the intelligent controller further comprises a pre-constructed backup protection solver; the system further comprises:
an instruction optimization module 5, used for judging whether the preset landing precision standard is met according to the terminal landing precision and, if not, hot-starting the backup protection solver to generate an optimized control instruction and using the optimized control instruction instead of the intelligent control instruction to carry out landing control.
For specific limitations of a rocket substage landing real-time optimal guidance system based on deep learning, reference may be made to the above limitations of a rocket substage landing real-time optimal guidance method based on deep learning, and details are not repeated here. All modules in the deep learning-based rocket substage landing real-time optimal guidance system can be completely or partially realized through software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
Fig. 9 shows an internal structure diagram of a computer device in one embodiment, and the computer device may be specifically a terminal or a server. As shown in fig. 9, the computer apparatus includes a processor, a memory, a network interface, a display, and an input device, which are connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to realize a rocket substage landing real-time optimal guidance method based on deep learning. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those of ordinary skill in the art that the architecture shown in FIG. 9 is a block diagram of only a portion of the architecture associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects may be applied; a particular computing device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the steps of the above method being performed when the computer program is executed by the processor.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method.
In summary, the deep-learning-based rocket sublevel landing real-time optimal guidance method and system provided by the embodiments of the present invention obtain a rocket fuel optimal landing data set, construct from it an intelligent controller comprising a trajectory classification network, a first control regression network and a second control regression network, obtain the real-time state of the rocket, generate the corresponding intelligent control instruction with the intelligent controller, and verify the landing control effect of the intelligent control instruction through super real-time trajectory integration simulation to obtain the terminal landing precision. This technical scheme avoids the potential risks that the black-box nature of a neural network model brings to the landing guidance task, effectively guides the rocket to complete the earth landing with high precision even under the large initial-state deviations caused by the preceding flight, offers extremely high real-time performance, does not suffer from convergence difficulties or problem infeasibility during online application, has high reliability, and can meet practical application requirements.
The embodiments in this specification are described in a progressive manner, and all the same or similar parts of the embodiments are directly referred to each other, and each embodiment is described with emphasis on differences from other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment. It should be noted that, the technical features of the embodiments may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several preferred embodiments of the present application, and the description thereof is specific and detailed, but not to be understood as limiting the scope of the invention. It should be noted that, for those skilled in the art, various modifications and substitutions can be made without departing from the technical principle of the present invention, and these should be construed as the protection scope of the present application. Therefore, the protection scope of the present patent shall be subject to the protection scope of the claims.

Claims (10)

1. A rocket substage landing real-time optimal guidance method based on deep learning is characterized by comprising the following steps:
acquiring a rocket fuel optimal landing data set; the rocket fuel optimal landing data set comprises a Bang-Bang type data set and a Non-Bang-Bang type data set;
constructing an intelligent controller according to the rocket fuel optimal landing data set; the intelligent controller comprises a track classification network, a first control regression network and a second control regression network;
acquiring a real-time state of the rocket, and generating a corresponding intelligent control instruction according to the intelligent controller;
and verifying the landing control effect of the intelligent control instruction through ultra-real-time trajectory integration simulation to obtain the terminal landing precision.
2. A rocket substage landing real-time optimal guidance method based on deep learning as claimed in claim 1, wherein said intelligent controller further comprises a pre-constructed backup protection solver;
after the step of verifying the landing control effect of the intelligent control instruction through the super real-time trajectory integration simulation to obtain the landing precision of the terminal, the method further comprises the following steps of:
judging whether a preset landing precision standard is met or not according to the terminal landing precision, if not, hot-starting the backup protection solver to generate an optimized control instruction, and replacing the intelligent control instruction with the optimized control instruction to carry out landing control.
3. A rocket substage landing real-time optimal guidance method based on deep learning as claimed in claim 1, wherein said step of obtaining a rocket fuel optimal landing data set comprises:
solving a rocket power descent segment track optimization problem with aerodynamic drag by adopting a preset track optimization numerical algorithm to obtain a plurality of fuel optimal landing tracks;
classifying the optimal landing tracks of the fuels according to the characteristics of the thrust profile, and carrying out corresponding normalization processing to obtain a Bang-Bang type track set and a Non-Bang-Bang type track set;
respectively obtaining the state initial value of each track in the Bang-Bang type track set and the Non-Bang-Bang type track set, and carrying out corresponding type labeling on each state initial value to obtain a Bang-Bang type data set and a Non-Bang-Bang type data set.
4. A deep learning based rocket substage landing real-time optimal guidance method according to claim 1, wherein said step of constructing an intelligent controller based on said rocket fuel optimal landing data set comprises:
training a preset classification network to obtain the trajectory classification network according to the rocket fuel optimal landing data set;
training a preset control regression network according to the Bang-Bang type data set to obtain a first control regression network;
and training a preset control regression network according to the Non-Bang-Bang type data set to obtain the second control regression network.
5. A rocket substage landing real-time optimal guidance method based on deep learning as claimed in claim 1, wherein said step of obtaining a rocket real-time status and generating corresponding intelligent control instructions according to said intelligent controller comprises:
normalizing the real-time state of the rocket and inputting the normalized real-time state of the rocket into the intelligent controller;
carrying out trajectory classification according to the trajectory classification network of the intelligent controller, and selecting one of the first control regression network and the second control regression network as a corresponding control regression network according to a corresponding trajectory classification result;
generating a control instruction to be processed according to the control regression network;
and sequentially carrying out reverse normalization processing and amplitude limiting processing on the control instruction to be processed to obtain the intelligent control instruction.
6. A deep learning based rocket substage landing real-time optimal guidance method as claimed in claim 2, wherein said step of hot starting said backup protection solver to generate optimized control instructions comprises:
acquiring a rocket state to be analyzed in a current guidance period, and solving a corresponding optimal landing track through the backup protection solver according to the rocket state to be analyzed;
and performing track interpolation processing on the optimal landing track to obtain a control instruction corresponding to the time interval as an optimal control instruction.
7. A rocket substage landing real-time optimal guidance system based on deep learning is characterized by comprising:
the data set construction module is used for acquiring an optimal landing data set of rocket fuel; the rocket fuel optimal landing data set comprises a Bang-Bang type data set and a Non-Bang-Bang type data set;
the controller construction module is used for constructing an intelligent controller according to the rocket fuel optimal landing data set; the intelligent controller comprises a track classification network, a first control regression network and a second control regression network;
the instruction generating module is used for acquiring the real-time state of the rocket and generating a corresponding intelligent control instruction according to the intelligent controller;
and the effect verification module is used for verifying the landing control effect of the intelligent control instruction through ultra-real-time track integration simulation to obtain the terminal landing precision.
8. A deep learning based rocket substage landing real-time optimal guidance system as recited in claim 7, wherein said intelligent controller further comprises a pre-built backup solver; the system further comprises:
and the instruction optimization module is used for judging whether a preset landing precision standard is met or not according to the terminal landing precision, if not, hot-starting the backup solver to generate an optimization control instruction, and replacing the intelligent control instruction with the optimization control instruction to carry out landing control.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method as claimed in any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202210972819.9A 2022-08-12 2022-08-12 Rocket sublevel landing real-time optimal guidance method and system based on deep learning Active CN115289917B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210972819.9A CN115289917B (en) 2022-08-12 2022-08-12 Rocket sublevel landing real-time optimal guidance method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210972819.9A CN115289917B (en) 2022-08-12 2022-08-12 Rocket sublevel landing real-time optimal guidance method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN115289917A CN115289917A (en) 2022-11-04
CN115289917B true CN115289917B (en) 2023-02-28

Family

ID=83829818

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210972819.9A Active CN115289917B (en) 2022-08-12 2022-08-12 Rocket sublevel landing real-time optimal guidance method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN115289917B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597702A (en) * 2020-05-11 2020-08-28 北京航天自动控制研究所 Rocket landing trajectory planning method and device
CN113479347A (en) * 2021-07-13 2021-10-08 北京理工大学 Rocket vertical recovery landing segment trajectory control method
CN114528692A (en) * 2022-01-14 2022-05-24 北京航天自动控制研究所 Reusable rocket landing stage feasible region calculation method based on numerical optimization

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6720071B2 (en) * 2016-12-26 2020-07-08 三菱重工業株式会社 Spacecraft, program and control device
US11613384B2 (en) * 2021-01-25 2023-03-28 Brian Haney Precision landing for rockets using deep reinforcement learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597702A (en) * 2020-05-11 2020-08-28 北京航天自动控制研究所 Rocket landing trajectory planning method and device
CN113479347A (en) * 2021-07-13 2021-10-08 北京理工大学 Rocket vertical recovery landing segment trajectory control method
CN114528692A (en) * 2022-01-14 2022-05-24 北京航天自动控制研究所 Reusable rocket landing stage feasible region calculation method based on numerical optimization

Also Published As

Publication number Publication date
CN115289917A (en) 2022-11-04

Similar Documents

Publication Publication Date Title
CN110806759B (en) Aircraft route tracking method based on deep reinforcement learning
CN112162564B (en) Unmanned aerial vehicle flight control method based on simulation learning and reinforcement learning algorithm
Rubies-Royo et al. A classification-based approach for approximate reachability
CN111240345A (en) Underwater robot trajectory tracking method based on double BP network reinforcement learning framework
CN106272443A (en) The incomplete paths planning method of multiple degrees of freedom space manipulator
CN109144099B (en) Fast evaluation method for unmanned aerial vehicle group action scheme based on convolutional neural network
CN114995132B (en) Multi-arm spacecraft model prediction control method, equipment and medium based on Gaussian mixture process
CN112141369B (en) Decision and control method for autonomous rendezvous and docking of translational closing sections of spacecraft
Bairstow Reentry guidance with extended range capability for low L/D spacecraft
CN111813146A (en) Reentry prediction-correction guidance method based on BP neural network prediction voyage
CN114253296A (en) Airborne trajectory planning method and device for hypersonic aircraft, aircraft and medium
CN115755598A (en) Intelligent spacecraft cluster distributed model prediction path planning method
Goecks Human-in-the-loop methods for data-driven and reinforcement learning systems
Ma et al. Target tracking control of UAV through deep reinforcement learning
Wang et al. Intelligent control of air-breathing hypersonic vehicles subject to path and angle-of-attack constraints
Amir et al. Priority neuron: A resource-aware neural network for cyber-physical systems
CN115289917B (en) Rocket sublevel landing real-time optimal guidance method and system based on deep learning
CN116697829A (en) Rocket landing guidance method and system based on deep reinforcement learning
Ansari et al. Hybrid genetic algorithm fuzzy rule based guidance and control for launch vehicle
CN116339373A (en) Monte Carlo self-adaptive dynamic programming unmanned aerial vehicle control method and system
CN115524964B (en) Rocket landing real-time robust guidance method and system based on reinforcement learning
CN114384931A (en) Unmanned aerial vehicle multi-target optimal control method and device based on strategy gradient
CN113778117A (en) Multi-stage pseudo-spectrum method for intelligently selecting initial values for planning longitudinal optimal paths of airplanes
Wang et al. Design of Adaptive Time-Varying Sliding Mode Controller for Underactuated Overhead Crane Optimized via Improved Honey Badger Algorithm
Chen et al. Rocket powered landing guidance using proximal policy optimization

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant