CN117111594B - Self-adaptive track control method for unmanned surface vessel - Google Patents

Self-adaptive track control method for unmanned surface vessel

Info

Publication number
CN117111594B
CN117111594B CN202310530731.6A CN202310530731A CN117111594B
Authority
CN
China
Prior art keywords
network
unmanned
surface vessel
gate
unmanned surface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310530731.6A
Other languages
Chinese (zh)
Other versions
CN117111594A (en)
Inventor
张卫东
林源
陈树康
仓乃梦
曹刚
贾泽华
吴迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan University
Original Assignee
Hainan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan University filed Critical Hainan University
Priority to CN202310530731.6A priority Critical patent/CN117111594B/en
Publication of CN117111594A publication Critical patent/CN117111594A/en
Application granted granted Critical
Publication of CN117111594B publication Critical patent/CN117111594B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention relates to a self-adaptive track control method for an unmanned surface vessel, comprising the following steps. Aiming at the time-varying and highly nonlinear character of unmanned surface vessel operating data in complex environments, the nonlinear features of the navigation data are learned based on the Peephole LSTM method, which introduces a constant error carousel, and the temporal regularities among the data are mined to form the state space of the unmanned surface vessel. Real-time self-adaptive track control of the unmanned surface vessel is then performed based on the deep reinforcement learning DDPG algorithm: a double-layer network architecture is constructed and the action strategy of the network is adjusted and optimized by maximizing the global reward. The experience replay technique stores the samples at each moment in a replay buffer, and non-uniform mini-batch sampling reduces the correlation between samples. The parameters of the target network are updated periodically by iteratively computing the loss function. Compared with the prior art, the invention improves the sailing efficiency and safety of the unmanned surface vessel.

Description

Self-adaptive track control method for unmanned surface vessel
Technical Field
The invention relates to the technical field of intelligent control, in particular to self-adaptive track control of an unmanned surface vessel.
Background
With the rapid development of autonomous driving technology, the unmanned surface vessel, as a new type of waterborne platform, plays an important role in environmental monitoring, reconnaissance and water patrol. When facing complicated and changeable sea conditions, however, how to ensure that the unmanned ship carries out its sailing tasks effectively and to improve its motion control performance has drawn wide attention from researchers at home and abroad. Current track control methods are mainly based on feedback control and model predictive control; for the time-varying data generated while an unmanned surface vessel sails, their model structures are often too complex, require a large amount of online computation and have difficulty outputting a track control strategy in real time, which reduces the task execution efficiency of the unmanned surface vessel and increases the risk factors during its navigation. Therefore, for complex navigation environments and application scenarios, a self-adaptive track control method for the unmanned surface vessel is needed to ensure its efficiency and safety.
Disclosure of Invention
In view of the above, the present invention aims to provide a self-adaptive track control method for an unmanned surface vessel that dynamically controls the track of the unmanned surface vessel under complex sea conditions, so as to improve its sailing efficiency and safety.
Based on the above purpose, the invention provides a self-adaptive track control method for an unmanned surface vessel, comprising the following steps:
s1, based on the Peephole LSTM method, taking the average endurance mileage, average endurance time and average sailing speed of the unmanned ship as the input of the Peephole LSTM at the current moment, and obtaining the complex temporal data sequence generated while the unmanned ship runs by learning the nonlinear characteristics of its navigation data;
s2, taking the obtained complex temporal data sequence generated while the unmanned ship runs as the state space of the deep reinforcement learning DDPG algorithm, and setting the action space of the unmanned ship as (V, β), wherein V is the speed of the unmanned ship and β is its rudder angle value; training is performed based on the DDPG algorithm and the self-adaptive track control strategy of the unmanned surface vessel is output in real time, the DDPG network comprising an Actor network and a Critic network;
the step S2 specifically comprises the following steps:
s21, initializing parameters of an Actor and a Critic network in a training starting stage, outputting an unmanned ship control strategy by a prediction network Actor based on a state space at the current moment, and taking the output action value as input of the prediction network Critic;
s22, the Critic network evaluates the action a_t output by the Actor network in the current state S_t and obtains the reward function r_t, the unmanned ship state transitions to S_{t+1}, and the value function Q at the current moment is output;
s23, the Actor network adjusts and optimizes the action strategy of the Actor network according to the value function output by the Critic network, and updates the network parameters.
S24, updating network parameters of the target network based on a soft update mode to finish DDPG network training;
s3, controlling navigation of the unmanned surface vessel in real time based on an optimal track control strategy of the unmanned surface vessel output by the DDPG network.
Preferably, in step S1, the Peephole LSTM method further comprises the following steps:
adding peephole connections to all gates in the network;
inputting the data to the forget gate, and learning the nonlinear characteristics of the unmanned ship navigation data in the low-level neurons;
inputting the result of the forget gate at the current moment into the input gate;
outputting the result of the input gate to the output gate, and generating the network output in the last layer of neurons.
Preferably, in the Peephole LSTM method, the forget gate update formula is:

f_t = σ(W_f · [C_{t-1}, h_{t-1}, x_t^k] + b_f)

wherein f_t is the forget gate at time t, σ denotes the sigmoid activation function, W_f represents the input-layer weight of the forget gate, C_{t-1} and h_{t-1} respectively denote the cell state and the output at time t-1, x_t^k represents the k unmanned ship indexes input at time t, and b_f represents the bias coefficient of the forget gate;

the result of the forget gate at time t is input to the input gate i; the input gate has a structure similar to that of the forget gate, and its update formula is:

i_t = σ(W_i · [C_{t-1}, h_{t-1}, x_t^k] + b_i)

wherein i_t is the input gate at time t, W_i is the input-layer weight of the input gate, and b_i represents the bias coefficient of the input gate; the Peephole LSTM adopts a multi-layer neuron structure, processes part of the computation at each layer, transmits the hidden state of the neurons at the current moment to the Peephole LSTM layer at the next moment, and generates the network output in the last layer of neurons;

the output gate update formula is:

o_t = σ(W_o · [C_t, h_{t-1}, x_t^k] + b_o)

wherein o_t is the output gate at time t, W_o represents the input-layer weight of the output gate, and b_o represents the bias coefficient of the output gate.
Preferably, the DDPG network uses the average task execution duration as the reward function of the DDPG, the average task execution duration T being calculated as:

T = (1/N) · Σ_{n=1}^{N} T_n

wherein n indexes the tasks executed by the unmanned ship, N is the total number of tasks, and T_n is the length of time the unmanned ship takes to perform the n-th task.
Preferably, the DDPG network adopts a deterministic strategy, and random noise is added to the prediction network during the training stage, so that the Actor network retains a certain exploration capability when outputting deterministic actions.
Preferably, the DDPG network uses the experience replay technique to store the state S_t at each moment, the executed action value, the obtained reward function and the state at the next moment in a replay buffer; non-uniform mini-batch sampling is adopted when sampling, reducing the correlation between samples.
Preferably, the DDPG network performs iterative calculation through the prediction network and the target network, and calculates the loss function L based on the mean square error.
Preferably, the DDPG network updates the network parameters of the Critic in the prediction network based on back-propagation of the neural network, and calculates the gradient of the prediction network based on stochastic gradient descent.
Preferably, the DDPG network uses the parameters θ^μ and θ^Q of the prediction network Actor and Critic to update the corresponding parameters in the target network, respectively.
The invention has the beneficial effects that:
A. The invention processes the nonlinear complex data generated while the unmanned surface vessel runs based on the Peephole LSTM algorithm; by arranging a multi-layer neural network structure, the coupling relations among different indexes are analysed and the temporal regularities among the data are mined, so that the state changes of the unmanned surface vessel at different moments are described and a state space is formed.
B. Based on the DDPG algorithm, the invention updates the network parameters step by step by setting up a prediction network and a target network, preventing the neural network from over-fitting; through extensive offline training, the unmanned surface vessel can adapt to navigation in various complex environments and output the current track control strategy in real time, improving its sailing efficiency and safety.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below are only those of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method for adaptive track control according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a real-time adaptive track control flow of the deep reinforcement learning DDPG algorithm on an unmanned surface vessel according to an embodiment of the present invention.
Detailed Description
The present invention will be further described in detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent.
It is to be noted that unless otherwise defined, technical or scientific terms used herein should be taken in a general sense as understood by one of ordinary skill in the art to which the present invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
The embodiment provides a self-adaptive track control method of an unmanned surface vessel, and reference is made to fig. 1.
The self-adaptive track control of the unmanned surface vessel comprises the following steps:
s1, considering that the operating data of an unmanned surface vessel in a complex environment are time-varying and highly nonlinear, the state changes of the unmanned surface vessel at different moments cannot be directly described by a state transition function. The average endurance mileage, average endurance time and average sailing speed of the unmanned ship are taken as the input x_t^k of the Peephole LSTM at the current moment, where k is the number of unmanned ship operating-data indexes.
The Peephole LSTM long short-term memory neural network is widely used to process complex sequence data; by introducing a constant error carousel, it overcomes the vanishing-gradient and exploding-gradient problems of RNNs trained on long sequences. A deep neural network has better generalization ability than a shallow one. By adding peephole connections to all gates in the Peephole LSTM network, the gates can still observe the current cell state even when the output gate is closed. The nonlinear characteristics of the unmanned ship navigation data are learned in the low-level neurons and then combined in the deep-level neurons, mining the temporal regularities of the data. The forget gate update formula is as follows:
f_t = σ(W_f · [C_{t-1}, h_{t-1}, x_t^k] + b_f)

wherein f_t is the forget gate at time t, σ denotes the sigmoid activation function, W_f represents the input-layer weight of the forget gate, C_{t-1} and h_{t-1} respectively denote the cell state and the output at time t-1, x_t^k represents the k unmanned ship indexes input at time t, and b_f represents the bias coefficient of the forget gate. The result of the forget gate at time t is input to the input gate i; the input gate has a structure similar to that of the forget gate, and its update formula is as follows:

i_t = σ(W_i · [C_{t-1}, h_{t-1}, x_t^k] + b_i)

wherein i_t is the input gate at time t, W_i is the input-layer weight of the input gate, and b_i represents the bias coefficient of the input gate. The Peephole LSTM adopts a multi-layer neuron structure, processes part of the computation at each layer, transmits the hidden state of the neurons at the current moment to the Peephole LSTM layer at the next moment, and generates the network output in the last layer of neurons. The output gate update formula is as follows:

o_t = σ(W_o · [C_t, h_{t-1}, x_t^k] + b_o)

wherein o_t is the output gate at time t, W_o represents the input-layer weight of the output gate, and b_o represents the bias coefficient of the output gate. Based on the Peephole LSTM model, the complex temporal data sequence generated while the unmanned ship runs is effectively analysed and used as the state space of the deep reinforcement learning DDPG algorithm.
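For concreteness, one forward step of such a peephole LSTM cell can be sketched as follows (Python/NumPy). This is a minimal illustration only: the hidden size, the weight initialization and the stacking of the three navigation indexes into the input x_t are assumptions of the example, not details fixed by the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class PeepholeLSTMCell:
    """One layer of a peephole LSTM: every gate also sees the cell state C
    (the peephole connections), so the gates can observe the cell state
    even while the output gate is closed."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        d = n_in + 2 * n_hidden  # gate input: [C_{t-1}, h_{t-1}, x_t]
        self.W_f = rng.normal(0, 0.1, (n_hidden, d)); self.b_f = np.zeros(n_hidden)
        self.W_i = rng.normal(0, 0.1, (n_hidden, d)); self.b_i = np.zeros(n_hidden)
        self.W_c = rng.normal(0, 0.1, (n_hidden, n_in + n_hidden)); self.b_c = np.zeros(n_hidden)
        self.W_o = rng.normal(0, 0.1, (n_hidden, d)); self.b_o = np.zeros(n_hidden)

    def step(self, x_t, h_prev, C_prev):
        z = np.concatenate([C_prev, h_prev, x_t])
        f_t = sigmoid(self.W_f @ z + self.b_f)                  # forget gate
        i_t = sigmoid(self.W_i @ z + self.b_i)                  # input gate
        c_hat = np.tanh(self.W_c @ np.concatenate([h_prev, x_t]) + self.b_c)
        C_t = f_t * C_prev + i_t * c_hat                        # new cell state
        z_o = np.concatenate([C_t, h_prev, x_t])                # peephole sees C_t
        o_t = sigmoid(self.W_o @ z_o + self.b_o)                # output gate
        h_t = o_t * np.tanh(C_t)
        return h_t, C_t

# usage: the k = 3 navigation indexes (endurance mileage, endurance time,
# sailing speed) as one input vector per moment (values hypothetical)
cell = PeepholeLSTMCell(n_in=3, n_hidden=8)
h, C = np.zeros(8), np.zeros(8)
h, C = cell.step(np.array([12.5, 0.8, 3.2]), h, C)
```

Stacking such cells, with the h_t of one layer fed as the x_t of the next, reproduces the multi-layer structure described above, the last layer producing the network output.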
S2, real-time self-adaptive track control of the unmanned surface vessel is performed based on the deep reinforcement learning DDPG algorithm. DDPG extends deep reinforcement learning beyond discrete action spaces, so that the unmanned ship can output the optimal control strategy of the current stage over a continuous action space. The action space of the unmanned ship is set as (V, β), wherein V is the navigational speed of the unmanned ship and β is its rudder angle value. Through a large amount of offline training, the unmanned ship can output the optimal track control strategy of the current stage in a complex navigation environment.
The DDPG adopts a prediction network and a target network, which improves the stability of the algorithm and the convergence performance of the network. The average task execution duration is used as the reward function of the DDPG to improve the task execution efficiency of the unmanned ship and ensure that tasks are carried out effectively. The average task execution duration T is calculated as follows:

T = (1/N) · Σ_{n=1}^{N} T_n

wherein n indexes the tasks executed by the unmanned ship, N is the total number of tasks, and T_n is the length of time the unmanned ship takes to perform the n-th task.
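As a sketch, the reward computation is a simple average; note that the patent does not fix a sign convention, so the negation below (shorter average duration scores higher) is an assumption of the example.

```python
def average_task_duration(durations):
    """T = (1/N) * sum(T_n): mean time per executed task."""
    return sum(durations) / len(durations)

def reward(durations):
    # assumption: faster task completion should be rewarded more
    return -average_task_duration(durations)
```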
S21, DDPG adopts a deterministic strategy, and random noise is added to the prediction network during the training stage, so that the Actor network retains a certain exploration capability when outputting deterministic actions. At the beginning of training, the parameters of the Actor and Critic networks are initialized. The prediction network Actor outputs the unmanned ship control strategy based on the state space at the current moment, and the output action value is taken as the input of the prediction network Critic. The Critic network evaluates the action a_t output by the Actor network in the current state S_t and obtains the reward function r_t, the unmanned ship state transitions to S_{t+1}, and the value function Q at the current moment is output:

Q(s_t, a_t) = r(s_t, a_t) + γ · Q(s_{t+1}, μ(s_{t+1}))

wherein Q(s_t, a_t) is the value function obtained when the prediction network uses action a_t in state s_t; this is the Bellman equation. r(s_t, a_t) is the reward value obtained when the unmanned ship executes action a_t in state s_t, γ is the discount coefficient, and Q(s_{t+1}, μ(s_{t+1})) is the value function obtained when the unmanned ship executes the action strategy μ(s_{t+1}) in state s_{t+1}.
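A minimal PyTorch sketch of the prediction networks and the noisy deterministic action selection of S21 is given below. The layer sizes, the tanh output scaling and the Gaussian form of the exploration noise are illustrative assumptions; the patent only states that random noise is added.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps the Peephole-LSTM state vector to a deterministic action (V, beta)."""
    def __init__(self, state_dim, action_dim=2, max_action=1.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),  # squash to [-1, 1]
        )
        self.max_action = max_action

    def forward(self, s):
        return self.max_action * self.net(s)

class Critic(nn.Module):
    """Estimates the value function Q(s, a) for a state-action pair."""
    def __init__(self, state_dim, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def select_action(actor, state, noise_std=0.1):
    """Deterministic action plus exploration noise, as in S21."""
    with torch.no_grad():
        a = actor(state)
    a = a + noise_std * torch.randn_like(a)  # random exploration noise
    return a.clamp(-actor.max_action, actor.max_action)
```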
S22, the Actor network adjusts and optimizes its action strategy according to the value function output by the Critic network, and updates the network parameters. Using the experience replay technique, the state S_t at each moment, the executed action value a_t, the obtained reward function r_t and the state S_{t+1} at the next moment are stored in a replay buffer; non-uniform mini-batch sampling is adopted when sampling, reducing the correlation between samples. The loss function L of the network is calculated based on the mean square error:

L = (1/N) · Σ_t [ r_t + γ · Q'(s_{t+1}, μ'(s_{t+1})) − Q(s_t, a_t) ]²

wherein N represents the number of samples drawn from the replay buffer, μ' represents the action policy used in the target network, and Q'(s_{t+1}, μ'(s_{t+1})) is the value function obtained when the target network uses the policy μ'(s_{t+1}) in state s_{t+1}.
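The replay buffer and the mean-squared loss of S22 can be sketched as follows. The recency-weighted sampling is one plausible reading of the patent's "non-uniform mini-batch sampling"; other weightings (e.g. TD-error-based priorities) would fit the description equally well.

```python
import random
from collections import deque

import torch
import torch.nn.functional as F

class ReplayBuffer:
    """Stores (S_t, a_t, r_t, S_{t+1}) tuples and draws mini-batches
    non-uniformly to reduce the correlation between samples."""
    def __init__(self, capacity=100_000):
        self.buf = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buf.append((s, a, r, s_next))

    def sample(self, batch_size=64):
        weights = [i + 1 for i in range(len(self.buf))]  # newer samples weighted higher
        batch = random.choices(self.buf, weights=weights, k=batch_size)
        s, a, r, s_next = zip(*batch)
        return (torch.stack(s), torch.stack(a),
                torch.tensor(r).unsqueeze(1), torch.stack(s_next))

def critic_loss(critic, target_actor, target_critic, batch, gamma=0.99):
    """L = mean squared TD error against the target networks."""
    s, a, r, s_next = batch
    with torch.no_grad():
        y = r + gamma * target_critic(s_next, target_actor(s_next))
    return F.mse_loss(critic(s, a), y)
```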
S23, the loss function is calculated iteratively, the network parameters of the Critic in the prediction network are updated based on back-propagation of the neural network, and the gradient of the prediction network is calculated with stochastic gradient descent:

∇_{θ^μ} J ≈ (1/N) · Σ_s ∇_a Q(s, a | θ^Q) |_{a=μ(s)} · ∇_{θ^μ} μ(s | θ^μ)

wherein ∇_{θ^μ} J is the gradient of the network, θ^μ denotes the parameters of the Actor in the prediction network, ∇_a Q(s, a | θ^Q) |_{a=μ(s)} is the gradient of the value function with respect to the action a = μ(s) taken in state s, and ∇_{θ^μ} μ(s | θ^μ) is the gradient of the strategy μ(s) with respect to the prediction network parameters θ^μ.
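In an automatic-differentiation framework, this deterministic policy gradient is obtained by back-propagating −Q(s, μ(s)) through the Critic into the Actor, as sketched below; the names follow the Actor/Critic sketch above, and the learning rate is an assumption.

```python
import torch.optim as optim

def actor_update(actor, critic, actor_opt, states):
    """One S23 step: ascend Q(s, mu(s)) w.r.t. the Actor parameters;
    loss.backward() realizes grad_a Q * grad_theta mu via the chain rule."""
    loss = -critic(states, actor(states)).mean()
    actor_opt.zero_grad()
    loss.backward()
    actor_opt.step()
    return loss.item()

# usage (illustrative): actor_opt = optim.Adam(actor.parameters(), lr=1e-4)
```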
S24, the parameters θ^μ and θ^Q of the prediction network Actor and Critic are used to update the corresponding parameters in the target network, respectively. In order to avoid frequent updating of the network parameters, the parameters are updated in a soft-update manner so as to prevent the DDPG network from over-fitting:

θ^{Q'} ← η · θ^Q + (1 − η) · θ^{Q'}
θ^{μ'} ← η · θ^μ + (1 − η) · θ^{μ'}

wherein η is the update coefficient between the prediction network and the target network, θ^Q and θ^μ represent the parameters of the Critic and Actor in the prediction network, and θ^{μ'} and θ^{Q'} represent the parameters of the Actor and Critic in the target network, respectively.
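The soft update is a direct element-wise implementation of the two formulas above; only the value of η is an illustrative choice.

```python
def soft_update(target_net, pred_net, eta=0.005):
    """theta' <- eta * theta + (1 - eta) * theta', per parameter (S24)."""
    with torch.no_grad():
        for p_t, p in zip(target_net.parameters(), pred_net.parameters()):
            p_t.mul_(1.0 - eta).add_(eta * p)

# applied to both networks after each training step:
# soft_update(target_critic, critic); soft_update(target_actor, actor)
```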
S3, controlling and optimizing navigation of the unmanned surface vessel in real time based on an optimal track control strategy of the unmanned surface vessel output by the DDPG network, so that the unmanned surface vessel can adapt to a complex navigation environment, and the task execution efficiency and the navigation safety are improved.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the invention (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the invention, the steps may be implemented in any order and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
The present invention is intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the present invention should be included in the scope of the present invention.

Claims (7)

1. A self-adaptive track control method for an unmanned surface vessel, characterized by comprising the following steps:
s1, based on the Peephole LSTM method, taking the average endurance mileage, average endurance time and average sailing speed of the unmanned ship as the input of the Peephole LSTM at the current moment, and obtaining the complex temporal data sequence generated while the unmanned ship runs by learning the nonlinear characteristics of its navigation data;
s2, taking the obtained complex temporal data sequence generated while the unmanned ship runs as the state space of the deep reinforcement learning DDPG algorithm, and setting the action space of the unmanned ship as (V, β), wherein V is the speed of the unmanned ship and β is its rudder angle value; training is performed based on the DDPG algorithm and the self-adaptive track control strategy of the unmanned surface vessel is output in real time, the DDPG network comprising an Actor network and a Critic network;
the step S2 specifically comprises the following steps:
s21, initializing parameters of an Actor and a Critic network in a training starting stage, outputting an unmanned ship control strategy by a prediction network Actor based on a state space at the current moment, and taking the output action value as input of the prediction network Critic;
s22, the Critic network evaluates the action a_t output by the Actor network in the current state S_t and obtains the reward function r_t, the unmanned ship state transitions to S_{t+1}, and the value function Q at the current moment is output;
s23, the Actor network adjusts and optimizes the action strategy of the Actor network according to a value function output by the Critic network, and updates network parameters;
s24, updating network parameters of the target network based on a soft update mode to finish DDPG network training;
s3, controlling navigation of the unmanned surface vessel in real time based on an optimal track control strategy of the unmanned surface vessel output by the DDPG network;
in step S1, the Peephole LSTM method further comprises the following steps:
adding peephole connections to all gates in the network;
inputting the data to the forget gate, and learning the nonlinear characteristics of the unmanned ship navigation data in the low-level neurons;
inputting the result of the forget gate at the current moment into the input gate;
outputting the result of the input gate to the output gate, and generating the network output in the last layer of neurons;
in the Peephole LSTM method, the forget gate update formula is:

f_t = σ(W_f · [C_{t-1}, h_{t-1}, x_t^k] + b_f)

wherein f_t is the forget gate at time t, σ denotes the sigmoid activation function, W_f represents the input-layer weight of the forget gate, C_{t-1} and h_{t-1} respectively denote the cell state and the output at time t-1, x_t^k represents the k unmanned ship indexes input at time t, and b_f represents the bias coefficient of the forget gate;

the result of the forget gate at time t is input to the input gate i; the input gate has a structure similar to that of the forget gate, and its update formula is:

i_t = σ(W_i · [C_{t-1}, h_{t-1}, x_t^k] + b_i)

wherein i_t is the input gate at time t, W_i is the input-layer weight of the input gate, and b_i represents the bias coefficient of the input gate; the Peephole LSTM adopts a multi-layer neuron structure, processes part of the computation at each layer, transmits the hidden state of the neurons at the current moment to the Peephole LSTM layer at the next moment, and generates the network output in the last layer of neurons;

the output gate update formula is:

o_t = σ(W_o · [C_t, h_{t-1}, x_t^k] + b_o)

wherein o_t is the output gate at time t, W_o represents the input-layer weight of the output gate, and b_o represents the bias coefficient of the output gate.
2. The adaptive track control method of the unmanned surface vessel according to claim 1, wherein the DDPG network uses the average task execution duration as the reward function of the DDPG, the average task execution duration T being calculated as:

T = (1/N) · Σ_{n=1}^{N} T_n

wherein n indexes the tasks executed by the unmanned ship, N is the total number of tasks, and T_n is the length of time the unmanned ship takes to perform the n-th task.
3. The self-adaptive track control method of the unmanned surface vessel according to claim 1, wherein the DDPG network adopts a deterministic strategy, and random noise is added into the predictive network in the training stage of the network, so that the Actor network has a certain exploration capacity when outputting deterministic actions.
4. The self-adaptive track control method of the unmanned surface vessel according to claim 1, wherein the DDPG network uses the experience replay technique to store the state S_t at each moment, the executed action value, the obtained reward function and the state at the next moment in a replay buffer, and non-uniform mini-batch sampling is adopted when sampling, reducing the correlation between samples.
5. The adaptive track control method of an unmanned surface vessel according to claim 1, wherein the DDPG network performs iterative calculation through the prediction network and the target network, and calculates the loss function L based on the mean square error.
6. The adaptive track control method of an unmanned surface vessel according to claim 5, wherein the DDPG network updates the network parameters of the Critic in the prediction network based on back-propagation of the neural network, and calculates the gradient of the prediction network based on stochastic gradient descent.
7. The method of claim 1, wherein the DDPG network uses the parameters θ^μ and θ^Q of the prediction network Actor and Critic to update the corresponding parameters in the target network, respectively.
CN202310530731.6A 2023-05-12 2023-05-12 Self-adaptive track control method for unmanned surface vessel Active CN117111594B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310530731.6A CN117111594B (en) 2023-05-12 2023-05-12 Self-adaptive track control method for unmanned surface vessel

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310530731.6A CN117111594B (en) 2023-05-12 2023-05-12 Self-adaptive track control method for unmanned surface vessel

Publications (2)

Publication Number Publication Date
CN117111594A CN117111594A (en) 2023-11-24
CN117111594B (en) 2024-04-12

Family

ID=88795429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310530731.6A Active CN117111594B (en) 2023-05-12 2023-05-12 Self-adaptive track control method for unmanned surface vessel

Country Status (1)

Country Link
CN (1) CN117111594B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117806311A (en) * 2023-11-30 2024-04-02 中船(北京)智能装备科技有限公司 Unmanned ship path planning method and device with multiple task points

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110658829A (en) * 2019-10-30 2020-01-07 武汉理工大学 Intelligent collision avoidance method for unmanned surface vehicle based on deep reinforcement learning
CN110782664A (en) * 2019-10-16 2020-02-11 北京航空航天大学 Running state monitoring method of intelligent vehicle road system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110782664A (en) * 2019-10-16 2020-02-11 北京航空航天大学 Running state monitoring method of intelligent vehicle road system
CN110658829A (en) * 2019-10-30 2020-01-07 武汉理工大学 Intelligent collision avoidance method for unmanned surface vehicle based on deep reinforcement learning

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A Cooperative Control Method Based on Reinforcement Learning; Zhiwei Zhuang et al.; 2018 Chinese Automation Congress; 2019-01-24; full text *
Applications of Deep Reinforcement Learning in Communications and Networking: A Survey; Nguyen Cong Luong et al.; IEEE Communications Surveys & Tutorials; 2019-05-14; full text *
AUV path tracking with real-time obstacle avoidance via reinforcement learning under adaptive constraints; Chenming Zhang et al.; Ocean Engineering; 2022-08-15; full text *
DDoS attack detection method based on LSTM traffic prediction; Cheng Jieren et al.; Journal of Huazhong University of Science and Technology; 2019-04-12; full text *
Design of formation control algorithm for multiple underwater vehicles based on deep reinforcement learning; Yan Jing et al.; Control and Decision; 2023-04-19; full text *

Also Published As

Publication number Publication date
CN117111594A (en) 2023-11-24

Similar Documents

Publication Publication Date Title
CN108803321A (en) Autonomous Underwater Vehicle Trajectory Tracking Control method based on deeply study
CN106325071B (en) One kind being based on the adaptive tender course heading control method of event driven Generalized Prediction
CN113052372B (en) Dynamic AUV tracking path planning method based on deep reinforcement learning
CN109901403A (en) A kind of face autonomous underwater robot neural network S control method
CN117111594B (en) Self-adaptive track control method for unmanned surface vessel
Ma et al. Neural network model-based reinforcement learning control for auv 3-d path following
CN114839884B (en) Underwater vehicle bottom layer control method and system based on deep reinforcement learning
Jiang et al. Neural network based adaptive sliding mode tracking control of autonomous surface vehicles with input quantization and saturation
CN111176122A (en) Underwater robot parameter self-adaptive backstepping control method based on double BP neural network Q learning technology
Knudsen et al. Deep learning for station keeping of AUVs
CN117666355A (en) Flexible shaft-based vector propeller control system and method
Li et al. Parallel path following control of cyber-physical maritime autonomous surface ships based on deep neural predictor
CN114715331B (en) Floating ocean platform power positioning control method and system
CN116880191A (en) Intelligent control method of process industrial production system based on time sequence prediction
Liu et al. Forward-looking imaginative planning framework combined with prioritized-replay double DQN
Zhao et al. Consciousness neural network for path tracking control of floating objects at sea
CN115453880A (en) Training method of generative model for state prediction based on antagonistic neural network
US20240046111A1 (en) Methods and apparatuses of determining for controlling a multi-agent reinforcement learning environment
Li et al. Robust model predictive ship heading control with event-triggered strategy
CN117111620B (en) Autonomous decision-making method for task allocation of heterogeneous unmanned system
Bande et al. Online model adaptation of autonomous underwater vehicles with LSTM networks
Khorasgani et al. Deep reinforcement learning with adjustments
Zhang et al. An on-line adaptive hybrid PID autopilot of ship heading control using auto-tuning BP & RBF neurons
CN118466227B (en) Electric propeller track tracking control method and system based on artificial intelligence
Yu et al. Vessel trajectory prediction based on modified LSTM with attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant