CN109375514A - Design method of an optimal tracking controller in the presence of false data injection attacks - Google Patents
- Publication number
- CN109375514A (application CN201811453386.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
Abstract
The present invention relates to an intelligent tracking controller which, in the presence of false data injection attacks, computes the optimal tracking control law in real time so that the output of the system tracks the reference input. The controller may include different control-algorithm processors and, by using adaptive dynamic programming based on game theory and Q-learning, is applicable when the system dynamics are unknown, even when only input-output data can be obtained. The invention is suitable for systems connected to their controllers through wireless networks, or for data transmitted over wireless communication networks, and has great application value in UAV formation flight and intelligent vehicles.
Description
Technical Field
The invention relates to a method for determining an optimal tracking controller using game theory, adaptive dynamic programming, and reinforcement learning when a linear discrete-time system is subject to false data injection attacks.
Background
Optimal tracking control is an important research topic in the control field with a wide range of applications, for example trajectory tracking of intelligent vehicles and unmanned aerial vehicles, and tracking control of robots. The purpose of optimal tracking control is to make the output of the system track the reference input (or reference trajectory) in an optimal sense, which can be achieved by minimizing a pre-specified quadratic performance index. It should be noted that, as network technologies develop, wireless transmission is increasingly used for the remote transmission of data. Because of the wireless network, however, the transmitted data are vulnerable to adversarial attacks, mainly including denial-of-service attacks, replay attacks, and false data injection attacks. Research on optimal tracking control in the presence of network attacks therefore has important practical significance. The present invention mainly addresses false data injection attacks.
In traditional optimal tracking control, the tracking controller is designed by dynamic programming. However, dynamic programming is a backward-in-time recursion, so it cannot be computed online and suffers from the curse of dimensionality. Adaptive dynamic programming belongs to the field of artificial intelligence; it is essentially based on reinforcement learning theory, imitating how humans learn from the feedback of a complex environment, and solves for the control strategy by recursing forward in time, so the method can be executed online.
The optimal control law is computed by a Q-learning method, which requires no system matrices of the original system or of the reference trajectory generator and is therefore suitable when some dynamic matrices are unknown. In addition, the method can iteratively solve for the optimal tracking control strategy using only input and output data, without current state information.
Disclosure of Invention
The invention aims to provide a design method for an optimal tracking controller of a discrete-time system in the presence of false data injection attacks, solving the prior-art problem that tracking cannot be achieved under such attacks. The system of the present invention is illustrated in block-diagram form in fig. 1. The technical scheme of the invention is implemented as follows:
1) establishing a false data attack model and an augmentation system model;
2) establishing a game model of an attack and defense party by adopting a game theory method; the defender is a controller, and the attacker is an injector of false data;
3) establishing a Bellman equation and solving the optimal control strategy and attack strategy through optimal control theory; solving the game algebraic Riccati equation by policy iteration and value iteration methods;
4) solving the optimal strategies of both game parties by a Q-function-based reinforcement learning method, including policy iteration and value iteration;
5) iteratively solving the optimal strategy by a Q-learning method based only on input-output data.
Drawings
Fig. 1 is a system configuration diagram in the presence of a false data injection attack.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, the present invention provides a method using game theory, adaptive dynamic programming, and Q-learning, which solves the problem of optimal tracking control of a discrete-time system. The specific implementation is as follows:
1) establishing a false data attack model and an augmentation model
Consider the following system model
x_{k+1} = A x_k + B u_k    (1)
where A and B are system matrices. Assume the control input u_k is attacked during transmission, so that the system model under false data injection attack becomes equation (2),
where q is the number of attackers, an indicator variable marks whether the i-th transmission is attacked by the j-th attacker, and the remaining term is the false data injected into the j-th channel at time k.
The assumed tracking model has the following form
where the matrix T is the tracking-model matrix; it should be noted that T is not required to be Hurwitz. Combining equations (2) and (3) yields the following augmented system equation
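As an illustrative sketch (not part of the original disclosure), the construction of the augmented model can be expressed in code. Since the patent's equation images are not reproduced in this text, the specific stacked form below, in which the reference generator r_{k+1} = T r_k is appended to the plant state, is an assumption consistent with the surrounding description; all numerical matrices are hypothetical.

```python
import numpy as np

def build_augmented(A, B, T):
    """Stack the plant state x and the reference state r into X = [x; r].

    Assumed augmented dynamics: X_{k+1} = A1 X_k + B1 u_k, with the attack
    entering through the same input channel as the control.
    """
    n = A.shape[0]
    p = T.shape[0]
    A1 = np.block([[A, np.zeros((n, p))],
                   [np.zeros((p, n)), T]])
    B1 = np.vstack([B, np.zeros((p, B.shape[1]))])
    return A1, B1

# Hypothetical numerical example: stable plant, constant reference generator
# (T need not be Hurwitz, matching the remark above).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
T = np.array([[1.0]])          # reference model r_{k+1} = T r_k
A1, B1 = build_augmented(A, B, T)
```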
2) Establishing game models of an attack and defense party by adopting a game theory method
Generally, controllers come in a variety of forms, such as state feedback, output feedback, and dynamic output feedback. Likewise, the injected false data may take various forms. The present invention assumes that both the tracking controller and the false data are linear functions of the state, i.e.
where K = [K_1, K_2] and L are the feedback gains of the attacking and defending parties, respectively. The two players in the game select the following payoff functions:
where Q_e ≥ 0, R > 0, and γ ∈ (0, 1) is a discount factor. The optimal strategies of the defender and the attacker are therefore
Solving (9) and (10) is equivalent to solving the following game problem
3) Establishing a Bellman equation, and solving an optimal control strategy and an attack strategy through an optimal control theory
First, a utility function is defined as follows,
Then, by direct calculation, the following optimal-control Bellman equation is obtained:
According to optimal control theory, the value function is quadratic with kernel matrix P > 0. Therefore, solving the optimal control equation yields the following optimal strategies of the attacking and defending parties:
where
Θ = [(Θ^1)^T (Θ^2)^T … (Θ^q)^T]^T
L(P) = [(L^1(P))^T (L^2(P))^T … (L^q(P))^T]^T
the matrix P > 0 and satisfies
The above results follow from dynamic programming and can only be computed offline. A reinforcement learning method is now adopted to compute both players' optimal strategies online. The policy iteration and value iteration procedures are given in Algorithm 1 and Algorithm 2 below, respectively.
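Before the online algorithms, the offline solution can be illustrated. The following sketch (not part of the original disclosure) solves a discounted zero-sum game Riccati equation of the kind described above by fixed-point iteration. The assumed structure — utility X'QX + u'Ru − a'R_a a, control u = −KX (minimizer), attack a = −LX (maximizer), attack channel D — and all numerical values are illustrative assumptions, since the patent's equations are not reproduced in this text.

```python
import numpy as np

def game_riccati_step(P, A, B, D, Q, R, Ra, g):
    # One value-iteration step of the assumed discounted game Riccati map.
    M = np.block([[R + g * B.T @ P @ B, g * B.T @ P @ D],
                  [g * D.T @ P @ B,     g * D.T @ P @ D - Ra]])
    N = g * np.vstack([B.T @ P @ A, D.T @ P @ A])
    KL = np.linalg.solve(M, N)            # stacked saddle-point gains [K; L]
    P_next = Q + g * A.T @ P @ A - N.T @ KL
    return P_next, KL

# Hypothetical scalar data for illustration only.
A  = np.array([[0.5]]); B = np.array([[1.0]]); D = np.array([[0.3]])
Q  = np.array([[1.0]]); R = np.array([[1.0]]); Ra = np.array([[5.0]])
g  = 0.9                                  # discount factor

P = np.zeros((1, 1))
for _ in range(200):
    P, KL = game_riccati_step(P, A, B, D, Q, R, Ra, g)
P_check, _ = game_riccati_step(P, A, B, D, Q, R, Ra, g)
residual = np.abs(P_check - P).max()      # game-Riccati residual at the fixed point
```

The attack penalty R_a must be large enough for the saddle point to exist; the choice above is one such value for this toy system.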
Algorithm 1: online policy iteration
1. Initialization: set j = 0 and select a stabilizing initial strategy pair (K^0, L^0).
2. Policy evaluation: solve the following equation to obtain P^{j+1}.
3. Policy improvement:
4. Stopping condition: ||K^{j+1} − K^j|| < ε, ||L^{j+1} − L^j|| < ε.
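A minimal sketch of Algorithm 1's evaluate-improve loop follows (an illustration, not the original disclosure), under the same assumed game structure (utility X'QX + u'Ru − a'R_a a, discount γ). The closed-loop Lyapunov equation of the evaluation step is solved exactly by vectorization; all matrices are hypothetical.

```python
import numpy as np

def policy_iteration(A, B, D, Q, R, Ra, g, K, L, iters=30):
    n = A.shape[0]
    I = np.eye(n * n)
    for _ in range(iters):
        # Policy evaluation: P = Qcl + g * Acl' P Acl (discounted Lyapunov eq.)
        Acl = A - B @ K - D @ L
        Qcl = Q + K.T @ R @ K - L.T @ Ra @ L
        vecP = np.linalg.solve(I - g * np.kron(Acl.T, Acl.T),
                               Qcl.flatten(order="F"))
        P = vecP.reshape(n, n, order="F")
        # Policy improvement: new saddle-point gains from the value matrix P.
        M = np.block([[R + g * B.T @ P @ B, g * B.T @ P @ D],
                      [g * D.T @ P @ B,     g * D.T @ P @ D - Ra]])
        N = g * np.vstack([B.T @ P @ A, D.T @ P @ A])
        KL = np.linalg.solve(M, N)
        K, L = KL[:B.shape[1], :], KL[B.shape[1]:, :]
    return P, K, L

# Hypothetical scalar data; A is already stable, so K0 = L0 = 0 is admissible.
A  = np.array([[0.5]]); B = np.array([[1.0]]); D = np.array([[0.3]])
Q  = np.array([[1.0]]); R = np.array([[1.0]]); Ra = np.array([[5.0]])
g  = 0.9
K0 = np.zeros((1, 1)); L0 = np.zeros((1, 1))
P, K, L = policy_iteration(A, B, D, Q, R, Ra, g, K0, L0)
```

At convergence the evaluated P is a fixed point of the improvement step, i.e. it satisfies the game Riccati equation.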
Algorithm 2: value iteration
1. Initialization: set j = 0 and select an initial strategy pair (K^0, L^0), not required to be stabilizing.
2. Policy evaluation: solve the following equation to obtain P^{j+1}.
3. Policy improvement:
4. Stopping condition: ||K^{j+1} − K^j|| < ε, ||L^{j+1} − L^j|| < ε.
As can be seen from Algorithm 1, solving equation (17) requires known system data, and the initial policy must be stabilizing; otherwise the equation has no solution. Algorithm 2 improves on this accordingly and no longer requires a stabilizing initial policy.
4) Solving the optimal strategy of both game parties by adopting a Q-function-based reinforcement learning method
The Q-function is defined as follows,
For ease of description, it is rewritten in the following compact form:
wherein,
Therefore, by solving the corresponding stationarity equations, the following optimal strategies of the attacking and defending parties can be obtained:
Substituting equation (20) into equation (19) yields a Bellman equation based on the Q-function, which is the key equation in the iterative process. The Q-function-based policy iteration and value iteration methods are given in Algorithm 3 and Algorithm 4, respectively.
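The role of the Q-function kernel can be sketched as follows (an illustration under assumed structure, not the original disclosure): for the assumed game, the kernel H of Q(X, u, a) = z'Hz with z = [X; u; a] is built from a value matrix P, and both players' gains are then recovered purely from the blocks of H — which is what lets Algorithms 3 and 4 dispense with the augmented system matrices.

```python
import numpy as np

def q_kernel(P, A, B, D, Q, R, Ra, g):
    # Assumed kernel: z = [X; u; a], Q(z) = z' H z.
    return np.block([
        [Q + g * A.T @ P @ A, g * A.T @ P @ B,     g * A.T @ P @ D],
        [g * B.T @ P @ A,     R + g * B.T @ P @ B, g * B.T @ P @ D],
        [g * D.T @ P @ A,     g * D.T @ P @ B,     g * D.T @ P @ D - Ra],
    ])

def gains_from_H(H, n, m):
    # Solve [[Huu, Hua],[Hau, Haa]] [K; L] = [Hux; Hax]; no model matrices used.
    M = H[n:, n:]
    N = H[n:, :n]
    KL = np.linalg.solve(M, N)
    return KL[:m, :], KL[m:, :]

# Hypothetical scalar data and an arbitrary positive value matrix P.
n, m = 1, 1
A  = np.array([[0.5]]); B = np.array([[1.0]]); D = np.array([[0.3]])
Q  = np.array([[1.0]]); R = np.array([[1.0]]); Ra = np.array([[5.0]])
g  = 0.9
P  = np.array([[1.2]])
H  = q_kernel(P, A, B, D, Q, R, Ra, g)
K, L = gains_from_H(H, n, m)
```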
Algorithm 3: Q-function-based policy iteration
1. Initialization: set j = 0 and select H^0 = (H^0)^T.
2. Policy evaluation: solve the following equation to obtain H^{j+1}.
3. Policy improvement:
4. Stopping condition: ||H^{j+1} − H^j|| < ε.
Algorithm 4: Q-function-based value iteration
1. Initialization: set j = 0 and select H^0 = (H^0)^T.
2. Policy evaluation: solve the following equation to obtain H^{j+1}.
3. Policy improvement:
4. Stopping condition: ||H^{j+1} − H^j|| < ε.
Notably, the Q-function-based iterative Algorithms 3 and 4 do not require prior knowledge of the augmented system matrices.
5) Iteratively solving the optimal strategy using Q-learning based only on input-output data
Assuming the system is observable, the system state can be represented by an input-output sequence,
wherein,
as can be seen from the above equation, there is a constant k > 0, such that when N < k, rank (V)N) N + p, when N is not less than k, rank (V)N) N + p. Wherein n is the original system state dimension, and p is the system output dimension. Therefore, N ≧ κ is selected such that the matrix VNColumn full rank. Definition of
Then, the Q-function can be written in the following form
Therefore, the optimal strategy of the attack and defense can be obtained as
Wherein,
The Bellman equation based on the Q-function and input-output data can be written as
Linearly parameterizing the Q-function, one obtains
In the above formula, the unknown matrix is symmetric and therefore contains a known number of independent elements. Based on the above analysis, Algorithm 5 and Algorithm 6 give policy iteration and value iteration methods using Q-learning, which use only input-output data.
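The linear parameterization step can be illustrated as follows (hypothetical numbers, not the original disclosure): a symmetric kernel of size ℓ has ℓ(ℓ+1)/2 independent entries, so it is recoverable by least squares from sufficiently many samples of the quadratic form — the mechanism Algorithms 5 and 6 rely on.

```python
import numpy as np

def quad_features(z):
    # phi(z): squares and doubled cross terms, matching the upper triangle
    # of a symmetric kernel, so that z' H z = phi(z) . theta(H).
    l = z.shape[0]
    return np.array([z[i] * z[j] * (1.0 if i == j else 2.0)
                     for i in range(l) for j in range(i, l)])

def unpack_symmetric(theta, l):
    H = np.zeros((l, l)); idx = 0
    for i in range(l):
        for j in range(i, l):
            H[i, j] = H[j, i] = theta[idx]; idx += 1
    return H

rng = np.random.default_rng(1)
l = 4
Htrue = rng.standard_normal((l, l)); Htrue = (Htrue + Htrue.T) / 2
Z = rng.standard_normal((40, l))      # 40 samples >> l(l+1)/2 = 10 unknowns
Phi = np.vstack([quad_features(z) for z in Z])
q = np.array([z @ Htrue @ z for z in Z])
theta, *_ = np.linalg.lstsq(Phi, q, rcond=None)
Hest = unpack_symmetric(theta, l)
```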
Algorithm 5: policy iteration using Q-learning
1. Initialization: set j = 0 and select a stabilizing initial strategy pair (K^0, L^0).
2. Policy evaluation: solve the following equation for h^{j+1}.
3. Policy improvement:
4. Stopping condition: ||H^{j+1} − H^j|| < ε.
Algorithm 6: value iteration using Q-learning
1. Initialization: set j = 0 and select an arbitrary initial strategy pair (K^0, L^0).
2. Policy evaluation: solve the following equation for h^{j+1}.
3. Policy improvement:
4. Stopping condition: ||H^{j+1} − H^j|| < ε.
As can be seen from Algorithm 6, the initial strategies of the attacking and defending parties are not required to be stabilizing. In addition, the recursive computation requires a rank (persistent excitation) condition on the collected data to be satisfied.
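The closing rank condition can be illustrated as follows (a sketch with assumed system values, not the original disclosure): with exploration noise, the least-squares regressor of quadratic features has full column rank, whereas under a purely linear policy u_k = −Kx_k the data are confined to a subspace and the rank condition fails.

```python
import numpy as np

def quad_features(z):
    # Distinct quadratic monomials of z; independence is what rank measures.
    l = z.shape[0]
    return np.array([z[i] * z[j] for i in range(l) for j in range(i, l)])

def collect(A, B, K, steps, noise, rng):
    # Closed-loop run; each row is the feature vector of z_k = [x_k; u_k].
    x = np.array([1.0, -1.0])
    rows = []
    for _ in range(steps):
        u = -K @ x + noise * rng.standard_normal(1)
        rows.append(quad_features(np.concatenate([x, u])))
        x = A @ x + B @ u
    return np.vstack(rows)

# Hypothetical system and gain.
rng = np.random.default_rng(2)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.2, 0.3]])
full = np.linalg.matrix_rank(collect(A, B, K, 60, 0.5, rng))  # with exploration
flat = np.linalg.matrix_rank(collect(A, B, K, 60, 0.0, rng))  # no exploration
```

Without noise, z_k = [x_k; −Kx_k] lies in a 2-dimensional subspace, so its six quadratic features span at most three directions and the regressor is rank deficient.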
Claims (5)
1. A design method of an optimal tracking controller in the presence of false data injection attack is characterized by comprising the following steps:
the method comprises the following steps: establishing a false data attack model and an augmentation system model;
step two: establishing a game model of an attack and defense party by adopting a game theory method;
step three: solving the optimal strategies of both game parties by a Q-function-based reinforcement learning method, including policy iteration and value iteration;
step four: iteratively solving the optimal strategy by a Q-learning method based on input-output data.
2. The method according to claim 1, wherein the first step is specifically:
consider the following system model:
x_{k+1} = A x_k + B u_k
wherein A and B are system matrices; if the control input u_k is attacked during transmission, the system model under false data injection attack is:
wherein q is the number of attackers, an indicator variable marks whether the i-th transmission is attacked by the j-th attacker, and the remaining term is the false data injected into the j-th channel at time k;
the tracking model is assumed to have the form:
wherein the matrix T is the tracking-model matrix; the augmented system can be expressed as:
3. The method for designing an optimal tracking controller in the presence of a false data injection attack as claimed in claim 1, wherein the second step specifically comprises:
assuming that both the tracking controller and the false data are linear functions of the state, i.e.,
wherein K = [K_1, K_2] and L are the feedback gains of the attacking and defending parties, respectively.
The two players in the game select the following payoff functions:
wherein γ is the discount factor, and Q_e and R are given positive semi-definite and positive definite matrices, respectively; the optimal strategies of the defender and the attacker are designed as:
4. the method according to claim 1, wherein the third step is specifically:
the following Q-function is defined:
by solving the corresponding stationarity equations, the following optimal action strategies of the attacking and defending parties can be obtained:
the Q-function-based policy iteration and value iteration methods are given in algorithm 1 and algorithm 2, respectively;
algorithm 1: the Q-function-based policy iteration algorithm comprises the following steps,
1) initialization: set j = 0 and select H^0 = (H^0)^T;
2) policy evaluation: solve the following equation to obtain H^{j+1};
3) policy improvement:
4) stopping condition: ||H^{j+1} − H^j|| < ε;
algorithm 2: the Q-function-based value iteration algorithm comprises the following steps,
1) initialization: set j = 0 and select H^0 = (H^0)^T;
2) policy evaluation: solve the following equation to obtain H^{j+1};
3) policy improvement:
4) stopping condition: ||H^{j+1} − H^j|| < ε.
5. The method for designing an optimal tracking controller in the presence of a false data injection attack as claimed in claim 1, wherein said step four specifically comprises:
the system state can be represented by the following input-output sequence:
then, the Q-function can be written in the form:
therefore, the optimal strategy for both the attack and defense is as follows:
wherein,
the strategy iteration and value iteration methods using Q-learning are given in algorithm 3 and algorithm 4, respectively:
algorithm 3: the policy iteration algorithm using Q-learning comprises the following steps,
1) initialization: set j = 0 and select a stabilizing initial strategy pair;
2) policy evaluation: solve the following equation for h^{j+1};
3) policy improvement:
4) stopping condition: ||H^{j+1} − H^j|| < ε;
algorithm 4: the value iteration algorithm using Q-learning comprises the following steps,
1) initialization: set j = 0 and select an arbitrary initial strategy pair;
2) policy evaluation: solve the following equation for h^{j+1};
3) policy improvement:
4) stopping condition: ||H^{j+1} − H^j|| < ε.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811453386.6A CN109375514B (en) | 2018-11-30 | 2018-11-30 | Design method of optimal tracking controller in presence of false data injection attack |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109375514A true CN109375514A (en) | 2019-02-22 |
CN109375514B CN109375514B (en) | 2021-11-05 |
Family
ID=65376219
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109375514B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2140650B1 (en) * | 2007-03-30 | 2011-05-25 | International Business Machines Corporation | Method and system for resilient packet traceback in wireless mesh and sensor networks |
CN104994569A (en) * | 2015-06-25 | 2015-10-21 | 厦门大学 | Multi-user reinforcement learning-based cognitive wireless network anti-hostile interference method |
CN106937295A (en) * | 2017-02-22 | 2017-07-07 | 沈阳航空航天大学 | Heterogeneous network high energy efficiency power distribution method based on game theory |
CN107038477A (en) * | 2016-08-10 | 2017-08-11 | 哈尔滨工业大学深圳研究生院 | A kind of neutral net under non-complete information learns the estimation method of combination with Q |
CN107819785A (en) * | 2017-11-28 | 2018-03-20 | 东南大学 | A kind of double-deck defence method towards power system false data injection attacks |
CN108181816A (en) * | 2018-01-05 | 2018-06-19 | 南京航空航天大学 | A kind of synchronization policy update method for optimally controlling based on online data |
CN108196448A (en) * | 2017-12-25 | 2018-06-22 | 北京理工大学 | False data injection attacks method based on inaccurate mathematical model |
CN108512837A (en) * | 2018-03-16 | 2018-09-07 | 西安电子科技大学 | A kind of method and system of the networks security situation assessment based on attacking and defending evolutionary Game |
Non-Patent Citations (5)
- HAO LIU et al., "Optimal Tracking Control of Linear Discrete-Time Systems Under Cyber Attacks", IFAC 2020
- YING CHEN et al., "Evaluation of Reinforcement Learning Based False Data Injection Attack to Automatic Voltage Control", IEEE
- YUZHE LI et al., "SINR-based DoS Attack on Remote State Estimation: A Game-theoretic Approach", IEEE
- LIU Hao, "Attack and Defense of Cyber-Physical Systems", Journal of Shenyang Aerospace University
- TIAN Jiwei et al., "Optimal Defense Strategy Against Load Redistribution Attacks Based on Game Theory", Computer Simulation
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109932905A (en) * | 2019-03-08 | 2019-06-25 | 辽宁石油化工大学 | A kind of optimal control method of the Observer State Feedback based on non-strategy |
CN109932905B (en) * | 2019-03-08 | 2021-11-09 | 辽宁石油化工大学 | Optimization control method based on non-strategy observer state feedback |
CN110083064B (en) * | 2019-04-29 | 2022-02-15 | 辽宁石油化工大学 | Network optimal tracking control method based on non-strategy Q-learning |
CN110083064A (en) * | 2019-04-29 | 2019-08-02 | 辽宁石油化工大学 | A kind of network optimal track control method based on non-strategy Q- study |
CN111273543A (en) * | 2020-02-15 | 2020-06-12 | 西北工业大学 | PID optimization control method based on strategy iteration |
CN111273543B (en) * | 2020-02-15 | 2022-10-04 | 西北工业大学 | PID optimization control method based on strategy iteration |
CN111673750A (en) * | 2020-06-12 | 2020-09-18 | 南京邮电大学 | Speed synchronization control scheme of master-slave type multi-mechanical arm system under deception attack |
CN111673750B (en) * | 2020-06-12 | 2022-03-04 | 南京邮电大学 | Speed synchronization control scheme of master-slave type multi-mechanical arm system under deception attack |
CN112149361A (en) * | 2020-10-10 | 2020-12-29 | 中国科学技术大学 | Adaptive optimal control method and device for linear system |
CN112149361B (en) * | 2020-10-10 | 2024-05-17 | 中国科学技术大学 | Self-adaptive optimal control method and device for linear system |
CN112650057B (en) * | 2020-11-13 | 2022-05-20 | 西北工业大学深圳研究院 | Unmanned aerial vehicle model prediction control method based on anti-spoofing attack security domain |
CN112650057A (en) * | 2020-11-13 | 2021-04-13 | 西北工业大学深圳研究院 | Unmanned aerial vehicle model prediction control method based on anti-spoofing attack security domain |
CN113885330A (en) * | 2021-10-26 | 2022-01-04 | 哈尔滨工业大学 | Information physical system safety control method based on deep reinforcement learning |
CN113885330B (en) * | 2021-10-26 | 2022-06-17 | 哈尔滨工业大学 | Information physical system safety control method based on deep reinforcement learning |
CN114415633A (en) * | 2022-01-10 | 2022-04-29 | 云境商务智能研究院南京有限公司 | Security tracking control method based on dynamic event trigger mechanism under multi-network attack |
CN114415633B (en) * | 2022-01-10 | 2024-02-02 | 云境商务智能研究院南京有限公司 | Security tracking control method based on dynamic event triggering mechanism under multi-network attack |
CN115877871A (en) * | 2023-03-03 | 2023-03-31 | 北京航空航天大学 | Non-zero and game unmanned aerial vehicle formation control method based on reinforcement learning |
Also Published As
Publication number | Publication date |
---|---|
CN109375514B (en) | 2021-11-05 |
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
TR01 | Transfer of patent right

Effective date of registration: 20220718. Patentee after: Shensu intelligent agricultural machinery equipment (Henan) Co., Ltd., 452370 Building 2, Xingfu industrial new town, Micun Town, Xinmi City, Zhengzhou City, Henan Province. Patentee before: SHENYANG AEROSPACE UNIVERSITY, No. 37 moral South Avenue, Shenbei New Area, Shenyang, Liaoning, 110136.