CN114020018B

CN114020018B - Determination method and device of missile control strategy, storage medium and electronic equipment

Info

Publication number: CN114020018B
Application number: CN202111292421.2A
Authority: CN
Inventors: 刘昊; 赵万兵; 蔡国飙
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2021-11-03
Filing date: 2021-11-03
Publication date: 2024-02-27
Anticipated expiration: 2041-11-03
Also published as: CN114020018A

Abstract

This application provides a method, device, storage medium and electronic device for determining a missile control strategy. The determination method includes: determining control quantity data of the missile system to be analyzed based on an initial control strategy set for the missile system to be analyzed; iteratively solving based on the control quantity data to obtain a first control strategy, and obtaining a reference value function and a reference control strategy based on the first control strategy; updating the first control strategy using the reference control strategy until the updated control strategy meets a preset control strategy update condition, thereby obtaining an optimal control strategy and an optimal value function; and determining an unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy and, in combination with the parameter equation of the missile system to be analyzed, determining the control strategy for the missile system to be analyzed. In this way, missile state data is used together with an optimal control algorithm to determine the unknown parameters and achieve optimal control of missile trajectory tracking.

Description

Method, device, storage medium and electronic device for determining missile control strategy

Technical Field

The present application relates to the technical field of control algorithms, and in particular to a method, device, storage medium and electronic device for determining a missile control strategy.

Background Art

In the field of automatic control there are many control methods, for example robust control, sliding mode control, backstepping control and predictive control. These schemes exploit their respective strengths and obtain good control performance by tuning controller parameters. However, they mainly guarantee system stability and cannot tie a performance index to the controller design. Whether the final control result meets the performance index is largely a matter of human judgment; the controller design and the required performance index cannot be linked through theoretical analysis. Optimal control theory has therefore received widespread attention, because it can design the optimal controller for a given performance index. There are several methods for solving for the optimal controller, such as the maximum/minimum principle, linear quadratic optimal control, optimal robust control and dynamic programming.

At present, nonlinear control algorithms that account for parameter uncertainty, such as robust compensation, H-infinity control and sliding mode control, have been designed to address the influence of parameter uncertainty on missile performance. However, such methods rely on model information of the controlled object to suppress the influence of missile parameter uncertainty, whereas the actual missile dynamic model is itself uncertain (for example, the tracked target and the aerodynamic parameters are uncertain). They are therefore not applicable when the parameters are completely unknown and the model information of the controlled object is unknown. A method for determining a missile control strategy that achieves accurate control of the missile system is thus urgently needed.

Summary of the Invention

In view of this, the purpose of the present application is to provide a method, device, storage medium and electronic device for determining a missile control strategy. When determining the control strategy for the missile system to be analyzed, the nonlinearity and the uncertain system parameters of the missile system are taken into account. The optimal control strategy is used to determine an unknown parameter matrix that contains the nonlinearity and the uncertain system parameters, and from it the control strategy for the missile system to be analyzed is determined, thereby improving the accuracy of control of the missile system to be analyzed.

An embodiment of the present application provides a method for determining a missile control strategy, the determination method comprising:

determining control quantity data of the missile system to be analyzed based on an initial control strategy set for the missile system to be analyzed;

iteratively solving based on the control quantity data to obtain a first control strategy, and obtaining a reference value function and a reference control strategy based on the first control strategy;

updating the first control strategy using the reference control strategy until the updated control strategy meets a preset control strategy update condition, thereby obtaining an optimal control strategy and an optimal value function;

determining an unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy and, in combination with the parameter equation of the missile system to be analyzed, determining the control strategy for the missile system to be analyzed.

Furthermore, the parameter equation of the missile system to be analyzed is established as follows:

obtaining position information of the missile in three-dimensional inertial coordinates, operating state parameter information of the missile and parameter information of the air, and determining the linear motion equation of the missile;

obtaining the position vector and the velocity vector of the missile, and determining the initial model equation of the missile;

determining a first reference target equation, a second reference target equation, a first target equation and a second target equation based on the linear motion equation of the missile;

determining a control force vector equation of the missile based on the first reference target equation, the second reference target equation, the first target equation, the second target equation and a state feedback variable;

determining the parameter equation of the missile system to be analyzed based on the control force vector equation of the missile and the initial model equation of the missile.

Furthermore, before the control strategy is updated based on the control strategy and the reference control strategy until the updated control strategy meets the preset control strategy update condition and the optimal control strategy and the optimal value function are obtained, the determination method includes:

acquiring a reference signal in the missile system to be analyzed, and determining an augmented function based on the parameter equation of the missile system to be analyzed and the reference signal;

determining a value function based on the position tracking error parameter of the augmented function, the control quantity data, a discount factor parameter, system parameters and matrix parameters;

differentiating the value function with respect to the exploration steady-state control quantity in the missile system to be analyzed to determine a second value function parameter equation, so that the parallel update of the value function and the first control strategy is determined based on the second value function parameter equation.

Furthermore, the determination method includes:

determining the optimal control strategy based on the position tracking error parameter, the discount factor parameter, the system parameters, the control quantity data and the augmented function.

Furthermore, the determination method further includes:

determining the unknown parameter matrix in the missile system to be analyzed based on the optimal value function, the optimal controller in the optimal control strategy and the system parameters.

An embodiment of the present application also provides a device for determining a missile control strategy, the determination device comprising:

an output determination module, configured to determine control quantity data of the missile system to be analyzed based on an initial control strategy set for the missile system to be analyzed;

an iterative solution module, configured to iteratively solve based on the control quantity data to obtain a first control strategy, and to obtain a reference value function and a reference control strategy based on the first control strategy;

an update module, configured to update the first control strategy using the reference control strategy until the updated control strategy meets a preset control strategy update condition, thereby obtaining an optimal control strategy and an optimal value function;

a control module, configured to determine the unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy, so that tracking control of the missile is performed based on the optimal control strategy, the unknown parameter matrix and the parameter equation of the missile system to be analyzed.

Furthermore, the determination device further includes a system establishing module, and the system establishing module is used to:

Furthermore, the determination device includes a function determination module, and the function determination module is used to:

An embodiment of the present application also provides an electronic device, comprising a processor, a memory and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device is running, the processor and the memory communicate via the bus, and when the machine-readable instructions are executed by the processor, the steps of the method for determining a missile control strategy described above are performed.

An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the method for determining a missile control strategy described above are performed.

The present application provides a method for determining a missile control strategy. The determination method includes: determining control quantity data of the missile system to be analyzed based on an initial control strategy set for the missile system to be analyzed; iteratively solving based on the control quantity data to obtain a first control strategy, and obtaining a reference value function and a reference control strategy based on the first control strategy; updating the first control strategy using the reference control strategy until the updated control strategy meets a preset control strategy update condition, thereby obtaining an optimal control strategy and an optimal value function; and determining an unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy and, in combination with the parameter equation of the missile system to be analyzed, determining the control strategy for the missile system to be analyzed.

In this way, when determining the control strategy for the missile system to be analyzed, the nonlinearity and the uncertain system parameters of the missile system are taken into account. The optimal control strategy is used to determine an unknown parameter matrix that contains the nonlinearity and the uncertain system parameters, and from it the control strategy for the missile system to be analyzed is determined, thereby improving the accuracy of control of the missile system to be analyzed.

To make the above objects, features and advantages of the present application more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present application and therefore should not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.

FIG. 1 is a flow chart of a method for determining a missile control strategy provided by an embodiment of the present application;

FIG. 2 is a missile trajectory tracking diagram provided by an embodiment of the present application;

FIG. 3 is a first structural schematic diagram of a device for determining a missile control strategy provided by an embodiment of the present application;

FIG. 4 is a second structural schematic diagram of a device for determining a missile control strategy provided by an embodiment of the present application;

FIG. 5 is a structural schematic diagram of an electronic device provided by an embodiment of the present application.

Detailed Description

To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. It should be understood that the drawings serve only for illustration and description and are not intended to limit the scope of protection of the present application. It should also be understood that the schematic drawings are not drawn to scale. The flow charts used in this application show operations implemented according to some embodiments of the present application. The operations of a flow chart need not be performed in order, and steps with no logical dependency on one another may be performed in reverse order or simultaneously. In addition, guided by the content of the present application, those skilled in the art may add one or more other operations to a flow chart or remove one or more operations from it.

In addition, the described embodiments are only some of the embodiments of the present application, not all of them. The components of the embodiments described and shown in the drawings can generally be arranged and designed in a variety of configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the claimed scope of the present application but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.

To enable those skilled in the art to use the content of this application, the following implementations are given in connection with the specific application scenario "technical field of control algorithms". Those skilled in the art can apply the general principles defined here to other embodiments and application scenarios without departing from the spirit and scope of this application.

The methods, devices, electronic devices and computer-readable storage media described below can be applied to any scenario that requires control algorithm technology. The embodiments of the present application do not limit the specific application scenario, and any scheme that uses the method and device for determining a missile control strategy provided by the embodiments falls within the scope of protection of the present application.

Research shows that, at present, one approach simplifies the missile model to a linear model and obtains an optimal control strategy from optimization methods (such as the Riccati differential method or the θ-D method) combined with traditional optimal control methods. Such methods simplify the system to a linear model, and traditional optimal control requires complete model parameter information. However, the dynamics of a missile in actual flight are strongly nonlinear, so these methods work with an imprecise model and do not reach the truly optimal control strategy. Another approach targets the nonlinear model and designs nonlinear control algorithms that account for parameter uncertainty, such as robust compensation, H-infinity control and sliding mode control, to address the influence of parameter uncertainty on missile performance. However, such methods rely on model information of the controlled object to suppress the influence of missile parameter uncertainty, whereas the actual missile dynamic model is itself uncertain (for example, the tracked target and the aerodynamic parameters are uncertain). They are therefore not applicable when the parameters are completely unknown and the model information of the controlled object is unknown.

Based on this, the purpose of this application is to provide a method for determining a missile control strategy. When determining the control strategy for the missile system to be analyzed, the nonlinearity and the uncertain system parameters of the missile system are taken into account. The optimal control strategy is used to determine an unknown parameter matrix that contains the nonlinearity and the uncertain system parameters, and from it the control strategy for the missile system to be analyzed is determined, thereby improving the accuracy of control of the missile system to be analyzed.

Please refer to FIG. 1, which is a flow chart of a method for determining a missile control strategy provided by an embodiment of the present application. As shown in FIG. 1, the method for determining a control strategy provided by the embodiment includes:

S101: Based on the initial control strategy set for the missile system to be analyzed, determine the control quantity data of the missile system to be analyzed.

In this step, an initial control strategy is set for the missile system to be analyzed, and the initial controller is operated under this strategy, so that the flight error data and the control quantity data of the missile system under the initial control strategy are determined.

Here, the initial controller is set to comprise a steady-state control quantity and an exploration noise u_ve.

Here, the initial control strategy may be any strategy for tracking control of the missile; this is not limited here.

The initial controller is the one applied to the missile trajectory tracking task.
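The structure of the initial controller described above can be sketched as follows. This is a minimal illustration, assuming the steady-state part and the exploration noise are combined additively and that the noise is a sum of sinusoids; the function names, the frequencies and the amplitude are hypothetical and do not come from the application.

```python
import math

def exploration_noise(t, freqs=(1.0, 3.0, 7.0), amplitude=0.1):
    """Exploration signal u_ve at time t: a sum of sinusoids (assumed form)."""
    return amplitude * sum(math.sin(w * t) for w in freqs)

def initial_control(u_steady, t):
    """Per-channel control: steady-state quantity plus exploration noise (assumed additive)."""
    noise = exploration_noise(t)
    return [u_s + noise for u_s in u_steady]
```

The noise is there only to excite the system while data is collected; at t = 0 it vanishes and the control equals the steady-state part.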

S102: Iteratively solve based on the control quantity data to obtain a first control strategy, and obtain a reference value function and a reference control strategy based on the first control strategy.

In this step, the acquired control quantity data is iteratively solved to obtain the first control strategy, and the Bellman equation is applied to the first control strategy to obtain the reference value function and the reference control strategy.

Here, the first control strategy is obtained by iteratively solving on the control quantity data; for example, it may be the control strategy updated at the n-th iteration.

Here, for any first control strategy, V^n and the reference control strategy can be obtained from a Bellman equation in which e^{βt} is a defined parameter, Δt is the time interval, V^n is the reference value function, e_p is the position tracking error parameter, Q and R are system parameters, u_ve is the exploration noise and x is the reference signal dynamics. The equation also involves the reference control strategy, a steady-state control quantity and the first control strategy.

Because the reference value function is unknown, the optimal controller cannot be obtained directly. The iterative algorithm above is therefore introduced to approach the optimal performance index and the optimal controller.
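To make the role of the discount factor, the tracking error and the weights Q and R concrete, the sketch below evaluates a discounted quadratic cost from sampled data. It assumes the running cost has the common form e_p^T Q e_p + u^T R u discounted by e^{-βt}, which is a standard choice and not necessarily the exact expression in the application; the function names and the left Riemann sum are illustrative.

```python
import math

def quad(v, M):
    """Return v^T M v for a vector v and a square matrix M (nested lists)."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

def discounted_cost(errors, controls, Q, R, beta, dt):
    """Left Riemann sum of exp(-beta*t) * (e^T Q e + u^T R u) over samples spaced dt apart."""
    total = 0.0
    for k, (e, u) in enumerate(zip(errors, controls)):
        total += math.exp(-beta * k * dt) * (quad(e, Q) + quad(u, R)) * dt
    return total
```

With beta = 0 the sum reduces to the undiscounted quadratic cost; a larger beta weights early samples more heavily.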

S103: Update the first control strategy using the reference control strategy until the updated control strategy meets the preset control strategy update condition, thereby obtaining the optimal control strategy and the optimal value function.

In this step, after the reference control strategy is obtained, the first control strategy is updated according to it until the updated control strategy meets the preset control strategy update condition, which yields the optimal control strategy and the optimal value function.

Here, the update assigns the value of the reference control strategy to the first control strategy.

Here, the preset control strategy update condition is that the update stops once the difference between the first control strategy and the reference control strategy is less than ε.

Here, ε > 0 is a preset acceptable calculation accuracy.
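The evaluate, update and stop pattern described above can be illustrated on a deliberately simple case. The sketch below runs policy iteration for a scalar linear system dx/dt = a·x + b·u with discounted cost q·x² + r·u² and policy u = −k·x. It is model-based, whereas the application learns from data with an unknown model, so it only illustrates the loop structure and the ε stopping rule; the system, the parameter names and the closed-form evaluation step are assumptions of this example.

```python
def policy_iteration(a, b, q, r, beta, k0, eps=1e-9, max_iter=100):
    """Iterate the gain k of the policy u = -k*x until successive gains differ by less than eps."""
    k = k0
    for _ in range(max_iter):
        # Policy evaluation: V(x) = p*x^2 solves (beta - 2*(a - b*k)) * p = q + r*k^2.
        denom = beta - 2.0 * (a - b * k)
        if denom <= 0.0:
            raise ValueError("policy gain is not admissible for this discount factor")
        p = (q + r * k * k) / denom
        # Policy improvement: candidate ("reference") gain derived from the evaluated value.
        k_next = b * p / r
        # Stopping rule: difference between current and reference policy below eps.
        if abs(k_next - k) < eps:
            return k_next, p
        k = k_next  # assign the reference policy to the current policy
    raise RuntimeError("policy iteration did not converge")
```

For example, `policy_iteration(1.0, 1.0, 1.0, 1.0, 0.1, k0=3.0)` starts from an admissible gain and converges to the positive root of p² − 1.9p − 1 = 0, which is the discounted Riccati solution for these values.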

Once the preset control strategy update condition is met, a reinforcement learning algorithm is used to keep learning and updating the optimal control parameters, finally yielding the optimal control strategy. Here, V^* is the optimal value function, R is a system parameter and B is an unknown matrix. The system parameter, the unknown matrix, the optimal value function and the optimal control tracker together determine the optimal control strategy, which is then used for tracking control of the missile.
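For orientation, in discounted optimal tracking problems with a control penalty u^T R u and an input matrix B, the optimal policy is commonly written in the form below. This is the standard textbook expression, offered as a plausible reading of how V^*, R and B combine; the augmented state X is notation introduced for this sketch, and the application's own formula may differ.

```latex
u^{*}(X) = -\tfrac{1}{2}\, R^{-1} B^{\top}\, \frac{\partial V^{*}(X)}{\partial X}
```

Under this assumed form, rearranging gives B^T ∂V^*/∂X = −2 R u^*, which suggests how an unknown input matrix could be related to the optimal value function, the optimal controller and R.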

S104: Determine the unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy and, in combination with the parameter equation of the missile system to be analyzed, determine the control strategy for the missile system to be analyzed.

In this step, the unknown parameter matrix in the missile system to be analyzed is determined from the optimal control strategy obtained, and tracking control of the missile is then performed according to the optimal control strategy, the unknown parameter matrix and the parameter equation of the missile system to be analyzed. Missile state data is thus combined with the optimal control algorithm to learn a tracker that implements the optimal control strategy, which realizes optimal control of missile trajectory tracking.

Here, the unknown parameter matrix collects parameters of the missile that are unknown during flight, such as unknown dynamic parameters of the missile.

Here, the parameter equation of the missile system to be analyzed is established as follows:

A: Obtain the position information of the missile in three-dimensional inertial coordinates, the operating state parameter information of the missile and the parameter information of the air, and determine the linear motion equation of the missile.

Treating the missile as a point mass, its motion in space is described by a linear motion equation in the following quantities.

Here, x denotes the lateral position of the missile in the three-dimensional inertial coordinate system, y denotes its longitudinal position, and z denotes its position along the direction perpendicular to both the lateral and longitudinal coordinates. V is the speed of the missile, θ is the trajectory inclination angle, ψ is the trajectory deflection angle and m is the mass of the missile. X denotes the drag acting on the missile during flight, F_x is the control force along the lateral coordinate, F_y is the control force along the longitudinal coordinate, and F_z is the control force along the direction perpendicular to both the lateral and longitudinal coordinates.
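As a small illustration of how V, θ and ψ relate to the position coordinates, the sketch below computes the position rates of a point mass. It assumes the conventional kinematic relations with y as the vertical axis (dx/dt = V cosθ cosψ, dy/dt = V sinθ, dz/dt = −V cosθ sinψ); the sign convention and the function name are assumptions of this example, and the force equations involving m, X and the control forces are left out because their exact form cannot be recovered from the text.

```python
import math

def position_rates(V, theta, psi):
    """Return (x_dot, y_dot, z_dot) for speed V, inclination theta and deflection psi in radians."""
    x_dot = V * math.cos(theta) * math.cos(psi)
    y_dot = V * math.sin(theta)
    z_dot = -V * math.cos(theta) * math.sin(psi)
    return x_dot, y_dot, z_dot
```

Whatever the angles, the three rates always have Euclidean norm V, which is a quick sanity check on the convention.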

B：获取所述导弹的位置向量以及速度向量，确定出所述导弹的初始模型方程。B: Obtain the position vector and velocity vector of the missile, and determine the initial model equation of the missile.

其中，F＝[F_x F_y F_z]^T表示导弹控制力向量，F_X为导弹在横向坐标上的控制力，F_y为导弹在横向坐标上的控制力，F_z为导弹在与横向坐标以及纵向坐标相垂直的方向上的控制力。在导弹实际飞行过程中，导弹从地面发射到升入高空，其阻力系数c_x和空气密度ρ都会不断在变化，因此具有渐变和不确定性。并且导弹的运动模型具有仿射非线性的特点，而且各个通道之间相互耦合，对模型进行直接求解最优控制问题比较困难，所以需要基于微分几何理论的仿射非线性系统反馈线性化方法解决该问题。Among them, F = [F _x F _y F _z ] ^T represents the missile control force vector, F _x is the missile control force on the lateral coordinate, F _y is the missile control force on the lateral coordinate, and F _z is the missile control force in the direction perpendicular to the lateral coordinate and the longitudinal coordinate. In the actual flight of the missile, from the launch of the missile from the ground to the rise into the high altitude, its drag coefficient c _x and air density ρ will continue to change, so it has gradual and uncertain characteristics. In addition, the missile's motion model has the characteristics of affine nonlinearity, and the various channels are coupled with each other. It is difficult to directly solve the optimal control problem of the model, so it is necessary to solve the problem with the affine nonlinear system feedback linearization method based on differential geometry theory.

为了将反馈线性化方法应用于导弹系统中，令p＝[x y z]^T表示导弹的位置向量，表示导弹速度向量，其中，x表示导弹在三维惯性坐标系下的横向位置信息，y表示导弹在三维惯性坐标系下的纵向位置信息，z表示导弹在三维惯性坐标系下的与横向坐标以及纵向坐标相垂直的方向上的位置信息。In order to apply the feedback linearization method to the missile system, let p = [xyz] ^T represent the position vector of the missile, Represents the missile velocity vector, where x represents the lateral position information of the missile in the three-dimensional inertial coordinate system, y represents the longitudinal position information of the missile in the three-dimensional inertial coordinate system, and z represents the position information of the missile in the direction perpendicular to the lateral and longitudinal coordinates in the three-dimensional inertial coordinate system.

Based on the position vector and velocity vector above, the initial model equation of the missile is determined. In this equation, $c_{3,2} = [0\ 1\ 0]^T$, $\alpha(p, v)$ is the first reference target equation, $\beta(p, v)$ is the second reference target equation, $p$ is the position vector of the missile, and $v$ is the velocity vector of the missile. The initial model of the missile is thus determined from its position vector and velocity vector.

C: Determine the first reference target equation, the second reference target equation, the first target equation and the second target equation based on the linear motion equation of the missile.

Here, the first reference target equation is:

where $V$ is the speed of the missile, $\theta$ is the trajectory inclination angle, and $\psi$ is the trajectory deflection angle. In addition, $m$ is the mass of the missile, $\rho$ is the air density, and $c_x$ is the drag coefficient. The first reference target equation is thus determined from the speed, the trajectory inclination angle, the trajectory deflection angle, the mass, the air density and the drag coefficient.

Here, the second reference target equation is:

where $V$ is the speed of the missile, $\theta$ is the trajectory inclination angle, $\psi$ is the trajectory deflection angle, and $m$ is the mass of the missile. The second reference target equation is thus determined from the speed, the trajectory inclination angle, the trajectory deflection angle and the mass.

Because the trajectory inclination angle $\theta$ and the trajectory deflection angle $\psi$ can both be measured in actual flight, whereas the drag coefficient $c_x$, the air density $\rho$ and the mass $m$ are uncertain, the uncertain parameter is set as:

Here, the first target equation is:

where $V$ is the speed of the missile, $\theta$ is the trajectory inclination angle, and $\psi$ is the trajectory deflection angle. The first target equation is thus determined from the speed, the trajectory inclination angle and the trajectory deflection angle.

Here, the second target equation is:

where $V$ is the speed of the missile, $\theta$ is the trajectory inclination angle, and $\psi$ is the trajectory deflection angle. The second target equation is thus determined from the speed, the trajectory inclination angle and the trajectory deflection angle.

D: Determine the control force vector equation of the missile based on the first reference target equation, the second reference target equation, the first target equation, the second target equation and the state feedback variable.

Here, the control force vector equation of the missile is determined jointly from the first reference target equation, the second reference target equation, the first target equation, the second target equation and the state feedback variable.

The calculation proceeds step by step. First, from the first reference target equation, the second reference target equation, the first target equation and the second target equation, we obtain:

$\alpha(p, v) = \sigma \cdot \alpha'(p, v)$ and

Then, with the state feedback variable set to $u_v$, the control force vector equation of the missile is determined as:

$F = \sigma \cdot \beta(p, v)^{-1} (u_v - \alpha'(p, v))$;

where $\beta(p, v)$ is the second reference target equation, $\alpha'(p, v)$ is the first target equation, and $\sigma$ is the uncertain parameter.
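The effect of this control law can be checked with a short worked substitution. The sketch below assumes that the velocity part of the initial model has the affine form $\dot v = \alpha(p,v) + \beta(p,v) F + d$, where $d$ is a placeholder introduced here (it is not a symbol of the original) for any remaining terms that do not depend on $F$, such as gravity. It uses only the relation $\alpha = \sigma \alpha'$ and the control law stated above.

```latex
\begin{aligned}
\dot v &= \alpha(p,v) + \beta(p,v)\,F + d \\
       &= \sigma\,\alpha'(p,v)
          + \beta(p,v)\,\sigma\,\beta(p,v)^{-1}\bigl(u_v - \alpha'(p,v)\bigr) + d \\
       &= \sigma\,u_v + d .
\end{aligned}
```

Under this assumption the state-dependent terms cancel, and the new input $u_v$ enters the velocity dynamics linearly, scaled only by the uncertain parameter $\sigma$.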

E: Determine the parameter equation of the missile system to be analyzed based on the control force vector equation of the missile and the initial model equation of the missile.

Substituting the control force vector equation of the missile into the initial model equation of the missile yields the parameter equation of the missile system to be analyzed, in which $p$ is the position vector of the missile, $v$ is the velocity of the missile, $\sigma$ is the uncertain parameter, $u_v$ is the control quantity data, and $g$ is the gravitational constant.

Further, the determination method includes:

a: Obtain a reference signal in the missile system to be analyzed, and determine an augmented function based on the parameter equation of the missile system to be analyzed and the reference signal.

Setting $x = [p\ v]^T$, the following is obtained from the parameter equation of the missile system to be analyzed:

$y = Cx$;

where $A = [0_{6\times 3}\ c_{6,1}\ c_{6,2}\ c_{6,3}]$, $B = [0_{3\times 3}\ \sigma I_3]^T$, $y$ is the system output, and $C = [I_{3\times 3}\ 0_{3\times 3}]$ is the output matrix of the missile system to be analyzed.
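The structure of these matrices can be made concrete with a small numerical sketch. It assumes that $c_{6,i}$ denotes the $i$-th unit vector of dimension 6 (by analogy with $c_{3,2} = [0\ 1\ 0]^T$) and that the state equation has the standard linear form $\dot x = Ax + Bu_v$; neither is spelled out in the surviving text. The velocity and input values are hypothetical and chosen only to make the block runnable.

```python
import numpy as np

sigma = 6.5858e-5  # value of the uncertain parameter quoted in the embodiment

# A = [0_{6x3}  c_{6,1}  c_{6,2}  c_{6,3}], assuming c_{6,i} is the i-th unit vector of R^6.
A = np.hstack([np.zeros((6, 3)), np.eye(6)[:, 0:3]])
# B = [0_{3x3}  sigma*I_3]^T is a 6x3 matrix with sigma*I_3 in its lower block.
B = np.vstack([np.zeros((3, 3)), sigma * np.eye(3)])
# C = [I_{3x3}  0_{3x3}] selects the position part of the state.
C = np.hstack([np.eye(3), np.zeros((3, 3))])

p = np.array([250.0, -240.0, -250.0])  # initial position from the embodiment
v = np.array([1.0, 2.0, 3.0])          # hypothetical velocity, for illustration only
u_v = np.array([10.0, 0.0, -5.0])      # hypothetical control quantity, for illustration only

x = np.concatenate([p, v])             # x = [p v]^T
x_dot = A @ x + B @ u_v                # assumed state equation
y = C @ x                              # output equation y = Cx
```

With these definitions the first three components of `x_dot` equal `v`, the last three equal `sigma * u_v`, and `y` equals `p`, so the model is a double integrator whose input gain is the single unknown $\sigma$.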

The reference signal dynamics are set as follows:

$y_0 = C_0 x_0$;

where $x_0 \in \mathbb{R}^{6\times 1}$ denotes the reference signal state and $A_0 \in \mathbb{R}^{6\times 1}$ denotes the reference signal dynamic matrix. From $y = Cx$, determined for the missile system to be analyzed, and the reference signal, an augmented function is obtained:

where $e_p \in \mathbb{R}^{6\times 1}$ denotes the position tracking error.

b: Determine a value function based on the position tracking error parameter of the augmented function, the control quantity data, the discount factor parameter, the system parameters and the matrix parameter.

The value function is designed as:

where $\beta$ is the discount factor, $P$ is a matrix, $Q > 0$ and $R > 0$ are the configured system parameters, $e_p$ is the position tracking error parameter, and $u_v$ is the control quantity data.

Further, differentiating the above value function gives Formula 1:

Here, $\beta$ is the discount factor, $P$ is a matrix, $Q > 0$ and $R > 0$ are the configured system parameters, $e_p$ is the position tracking error parameter, and $u_v$ is the control quantity data.

Let $V^*$ denote the optimal value function. Based on classical optimal control theory, the optimal control strategy is obtained as:

Here, $V^*$ is the optimal value function, $u_v^*$ is the optimal tracking controller, $R$ is a system parameter, and $B$ is the unknown matrix.

Further, substituting the optimal control strategy into Formula 1 gives the following Hamiltonian equation:

Here, $V^*$ is the optimal value function, $u_v^*$ is the optimal tracking controller, $R$ is a system parameter, $B$ is the unknown matrix, $e_p$ is the position tracking error parameter, the augmented system matrix is $\mathrm{diag}(A, A_0)$, and $\beta$ is the discount factor.

In practical applications, some parameters of the missile are unknown, so the matrix $B$ is unknown and an exact Hamiltonian equation is difficult to obtain. As a result, the optimal solution produced by a traditional optimal control algorithm cannot remain optimal in practice. A reinforcement learning algorithm is introduced below that uses the missile state information to obtain an optimal solution matching the actual situation and to identify the actual system parameter matrix $B$.

To determine the optimal value of the Hamiltonian accurately, an exploratory steady-state control quantity is added to the missile system.

c: Differentiate the value function based on the exploratory steady-state control quantity in the missile system to be analyzed to determine a second value function parameter equation, so that the parallel update of the value function and the first control strategy is determined based on the second value function parameter equation.

After the exploratory steady-state control quantity has been added to the missile system to be analyzed, the augmented system becomes:

Its input consists of a steady-state control quantity, formed from the exploration noise $u_{ve}$ and the control strategy updated at the $n$-th iteration.

Differentiating the value function with the exploratory steady-state control quantity of the missile system to be analyzed gives the second value function parameter equation:

where $V^n$ is the reference value function, $e_p$ is the position tracking error parameter, $Q$ and $R$ are system parameters, and the remaining terms are the steady-state control quantity, the exploration noise $u_{ve}$, and the control strategy updated at the $n$-th iteration.

Further, multiplying both sides of the above second value function parameter equation by $e^{\beta t}$ and integrating gives the second value function parameter equation in integrated form:

Further, it follows from this equation that the value function $V^n$ and the first control strategy are updated simultaneously.

Further, the optimal control strategy is determined based on the position tracking error parameter, the discount factor parameter, the system parameters, the control quantity data and the augmented function.

Here, the optimal tracking control strategy is obtained using iterative learning and neural network techniques, and repeated iterations of the above algorithm approach the optimal value function and the optimal controller.
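The evaluate-then-improve loop described here can be illustrated on a toy problem. The sketch below is not the data-driven algorithm of this application: it is a model-based policy iteration for a scalar system $\dot x = ax + bu$ with discounted cost $\int_0^\infty e^{-\beta t}(q x^2 + r u^2)\,dt$, where the model is assumed known. The function name, the scalar system and the stopping tolerance are all illustrative assumptions; only the values $q = 50$, $r = 1$ and $\beta = 0.01$ echo the embodiment.

```python
def policy_iteration(a, b, q, r, beta, k0, tol=1e-10, max_iter=100):
    """Model-based policy iteration for dx/dt = a*x + b*u with policy u = -k*x
    and cost integral of exp(-beta*t) * (q*x**2 + r*u**2)."""
    k = k0
    p = None
    for n in range(max_iter):
        a_cl = a - b * k - beta / 2.0          # discounted closed-loop rate
        if a_cl >= 0:
            raise ValueError("policy does not give a finite discounted cost")
        p = (q + r * k * k) / (-2.0 * a_cl)    # policy evaluation: V(x) = p * x**2
        k_new = b * p / r                      # policy improvement
        if abs(k_new - k) < tol:               # update condition: policy stopped changing
            return p, k_new, n + 1
        k = k_new
    return p, k, max_iter


a, b, q, r, beta = 0.0, 1.0, 50.0, 1.0, 0.01
p_opt, k_opt, iterations = policy_iteration(a, b, q, r, beta, k0=1.0)

# Residual of the discounted scalar Riccati equation; it should be close to zero.
residual = 2.0 * (a - beta / 2.0) * p_opt - (b * p_opt) ** 2 / r + q
```

Starting from any gain with a finite discounted cost, the value estimate and the gain are refined together until the gain stops changing, at which point `p_opt` satisfies the Riccati equation of the toy problem. The method of this application differs in that it relies on measured state data rather than a known model.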

Further, the unknown parameter matrix in the missile system to be analyzed is determined based on the optimal value function, the optimal controller in the optimal control strategy, and the system parameters.

Here, once the optimal control strategy has been determined, it is used to determine the unknown parameter matrix.

In a specific embodiment, a missile flight simulation system is built with the following parameters: missile mass $m = 158$ kg, aerodynamic parameter $c_x = 0.74$, air density $\rho = 0.868$, and missile reference area $S = 0.0324$. These give the unknown parameter the value $\sigma = 6.5858 \times 10^{-5}$. The initial position of the missile is set to $p = [250\ {-240}\ {-250}]^T$ m. In the reinforcement learning algorithm, the time interval is $\Delta t = 0.05$ s, the matrices are set to $Q = 50$ and $R = I_3$, and the discount factor is set to $\beta = 0.01$. The reinforcement learning algorithm learns the optimal controller, which is applied to the trajectory tracking task, and the learned optimal strategy identifies the unknown parameter as $\sigma = 6.5835 \times 10^{-5}$. FIG. 2 is a missile trajectory tracking diagram provided by an embodiment of the present application. As FIG. 2 shows, the missile position in the longitudinal direction indicates that the optimal controller has good trajectory tracking performance in this embodiment.
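The quoted numbers can be cross-checked with a few lines of arithmetic. The definition of $\sigma$ does not survive in the text, so the sketch below assumes $\sigma = \rho c_x S / (2m)$; this is an assumption, chosen because it reproduces the stated value from the listed parameters. The variable names are illustrative.

```python
rho = 0.868   # air density, as given in the embodiment
c_x = 0.74    # aerodynamic (drag) parameter
S = 0.0324    # missile reference area
m = 158.0     # missile mass in kg

# Assumed definition, not stated in the surviving text: sigma = rho * c_x * S / (2 * m).
sigma_true = rho * c_x * S / (2.0 * m)

# Value identified by the learned strategy, as reported in the embodiment.
sigma_identified = 6.5835e-5

# Relative gap between the identified and the nominal value.
relative_error = abs(sigma_identified - sigma_true) / sigma_true

print(f"sigma_true = {sigma_true:.4e}, relative error = {relative_error:.2e}")
```

Under this assumed definition the nominal value rounds to the quoted $6.5858 \times 10^{-5}$, and the identified value differs from it by a few hundredths of a percent.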

The present application provides a method for determining a missile control strategy. The determination method includes: determining control quantity data of the missile system to be analyzed based on an initial control strategy set for the missile system to be analyzed; iteratively solving based on the control quantity data to obtain a first control strategy, and obtaining a reference value function and a reference control strategy based on the first control strategy; updating the first control strategy using the reference control strategy until the updated control strategy meets a preset control strategy update condition, thereby obtaining an optimal control strategy and an optimal value function; and determining an unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy, so that the missile is tracked and controlled based on the optimal control strategy, the unknown parameter matrix and the parameter equation of the missile system to be analyzed.

Refer to FIG. 3 and FIG. 4. FIG. 3 is a first structural schematic diagram of a device for determining a missile control strategy provided by an embodiment of the present application, and FIG. 4 is a second structural schematic diagram of the same device. As shown in FIG. 3, the determining device 300 includes:

an output determination module 310, configured to determine control quantity data of the missile system to be analyzed based on an initial control strategy set for the missile system to be analyzed;

an iterative solution module 320, configured to iteratively solve based on the control quantity data to obtain a first control strategy, and to obtain a reference value function and a reference control strategy based on the first control strategy;

an update solution module 330, configured to update the first control strategy using the reference control strategy until the updated control strategy meets a preset control strategy update condition, thereby obtaining an optimal control strategy and an optimal value function;

a control module 340, configured to determine the unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy, and to determine the control strategy for the missile system to be analyzed in combination with the parameter equation of the missile system to be analyzed.

Further, as shown in FIG. 4, the determining device 300 also includes a system establishment module 350, and the system establishment module 350 is configured to:

Further, as shown in FIG. 4, the determining device 300 includes a function determination module 360, and the function determination module 360 is configured to:

obtain a reference signal in the missile system to be analyzed, and determine an augmented function based on the parameter equation of the missile system to be analyzed and the reference signal;

Further, as shown in FIG. 4, the control module 340 is configured to:

Further, as shown in FIG. 4, the update solution module 330 is configured to:

An embodiment of the present application provides a device for determining a missile control strategy. The determining device includes: an output determination module, configured to determine control quantity data of the missile system to be analyzed based on an initial control strategy set for the missile system to be analyzed; an iterative solution module, configured to iteratively solve based on the control quantity data to obtain a first control strategy, and to obtain a reference value function and a reference control strategy based on the first control strategy; an update solution module, configured to update the first control strategy using the reference control strategy until the updated control strategy meets a preset control strategy update condition, thereby obtaining an optimal control strategy and an optimal value function; and a control module, configured to determine the unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy, and to determine the control strategy for the missile system to be analyzed in combination with the parameter equation of the missile system to be analyzed.

Refer to FIG. 5, which is a structural schematic diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 5, the electronic device 500 includes a processor 510, a memory 520 and a bus 530.

The memory 520 stores machine-readable instructions executable by the processor 510. When the electronic device 500 is running, the processor 510 and the memory 520 communicate through the bus 530. When the machine-readable instructions are executed by the processor 510, the steps of the method for determining a missile control strategy in the method embodiment shown in FIG. 1 can be performed. For the specific implementation, refer to the method embodiment; details are not repeated here.

An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the method for determining a missile control strategy in the method embodiment shown in FIG. 1 can be performed. For the specific implementation, refer to the method embodiment; details are not repeated here.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.

In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation. As a further example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical or of another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or may be integrated in groups of two or more into one unit.

If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable, non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

Finally, it should be noted that the embodiments described above are only specific implementations of the present application, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited to them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed in the present application, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features. Such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application. The protection scope of the present application shall therefore be subject to the protection scope of the claims.

Claims

1. A method for determining a missile control strategy, characterized in that the determining method includes:

Determine the control quantity data of the missile system to be analyzed based on the initial control strategy set for the missile system to be analyzed;

Perform iterative solution based on the control quantity data to obtain a first control strategy, and obtain a reference value function and a reference control strategy based on the first control strategy;

Update the first control strategy using the reference control strategy until the updated control strategy meets the preset control strategy update condition, thereby obtaining the optimal control strategy and the optimal value function;

Determine the unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy, and determine the control strategy for the missile system to be analyzed in combination with the parameter equations of the missile system to be analyzed;

Before the first control strategy is updated using the reference control strategy until the updated control strategy meets the preset control strategy update condition and the optimal control strategy and the optimal value function are obtained, the determination method further includes:

Obtain the reference signal in the missile system to be analyzed, and determine the augmented function based on the parametric equation of the missile system to be analyzed and the reference signal;

Determine the value function based on the position tracking error parameter of the augmented function, the control quantity data, the discount factor parameter, the system parameters and the matrix parameter;

Differentiate the value function based on the exploratory steady-state control quantity in the missile system to be analyzed to determine a second value function parameter equation, so as to determine, based on the second value function parameter equation, the parallel update of the value function and the first control strategy;

The determination method includes:

Determine the optimal control strategy based on the position tracking error parameter, the discount factor parameter, the system parameter, the control quantity data and the augmented function;

The determination method also includes:

Based on the optimal value function, the optimal controller in the optimal control strategy, and system parameters, an unknown parameter matrix in the missile system to be analyzed is determined.

2. The determination method according to claim 1, characterized in that the parameter equation of the missile system to be analyzed is established by the following method:

Obtain the position information of the missile in three-dimensional inertial coordinates, the operating state parameter information of the missile, and the parameter information of the air, to determine the linear motion equation of the missile;

Obtain the position vector and velocity vector of the missile, and determine the initial model equation of the missile;

Determine the first reference target equation, the second reference target equation, the first target equation and the second target equation based on the linear motion equation of the missile;

Based on the first reference target equation, the second reference target equation, the first target equation, the second target equation and the state feedback variables, determine the control force vector equation of the missile;

Based on the missile's control force vector equation and the missile's initial model equation, the parameter equations of the missile system to be analyzed are determined.

3. A missile control strategy determining device, characterized in that the determining device includes:

The output determination module is used to determine the control quantity data of the missile system to be analyzed based on the initial control strategy set by the missile system to be analyzed;

An iterative solution module, configured to perform iterative solution based on the control quantity data to obtain a first control strategy, and obtain a reference value function and a reference control strategy based on the first control strategy;

An update solution module, configured to use the reference control strategy to update the first control strategy until the updated control strategy meets the preset control strategy update conditions, and then obtain the optimal control strategy and the optimal value function;

A control module configured to determine the unknown parameter matrix in the missile system to be analyzed based on the optimal control strategy, and to determine a control strategy for the missile system to be analyzed in combination with the parameter equations of the missile system to be analyzed;

The determination device includes a function determination module, and the function determination module is used for:

The control module is used for:

The update solution module is used to:

4. The determination device according to claim 3, characterized in that the determination device further includes a system establishment module, and the system establishment module is used for:

5. An electronic device, characterized in that it includes a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate through the bus; and when the machine-readable instructions are executed by the processor, the steps of the method for determining a missile control strategy according to claim 1 or 2 are performed.

6. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the method for determining a missile control strategy according to claim 1 or 2 are performed.