CN117311160A - Automatic control system and control method based on artificial intelligence - Google Patents

Automatic control system and control method based on artificial intelligence

Info

Publication number
CN117311160A
CN117311160A (application CN202311431617.4A)
Authority
CN
China
Prior art keywords: data, ship, module, network, action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311431617.4A
Other languages
Chinese (zh)
Inventor
吴根平
王浩
华锴玮
刘佳玥
李永龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
719th Research Institute Of China State Shipbuilding Corp
Original Assignee
719th Research Institute Of China State Shipbuilding Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 719th Research Institute Of China State Shipbuilding Corp filed Critical 719th Research Institute Of China State Shipbuilding Corp
Priority to CN202311431617.4A
Publication of CN117311160A
Legal status: Pending

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Abstract

The invention discloses an automatic control system and control method based on artificial intelligence, belonging to the technical field of artificial intelligence. The system comprises a data acquisition module, a route planning module, an intelligent control module, an emergency obstacle avoidance module and a remote monitoring module. The data acquisition module collects data on ship operation and the surrounding environment, including GPS position, weather conditions, radar scans and camera images, to provide real-time situational awareness and status monitoring. The route planning module analyzes these data with an artificial intelligence algorithm to plan an optimal route, taking into account factors such as target location, safety and fuel efficiency. The intelligent control module implements an automatic control strategy that adjusts the ship's heading, speed and engine output according to the planned route and performs feedback control on real-time data. The remote monitoring module allows the control center to monitor the state and operation of the ship in real time and to exercise remote control.

Description

Automatic control system and control method based on artificial intelligence
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an automatic control system and a control method based on artificial intelligence.
Background
Autonomous control of a vessel is a key technology for ensuring safe navigation under different marine conditions. Conventional marine control methods typically rely on manual steering, but as technology advances, autonomous control systems have become increasingly important for improving the safety and efficiency of sailing. The deep Q network adopted by the invention is a reinforcement learning algorithm for decision making that combines a deep neural network with Q-learning. In autonomous vessel control, the deep Q network learns an optimal control strategy so that the vessel can sail safely under different environmental conditions.
Traditional ship control methods rely mainly on manual maneuvering, which can lead to human error, especially during long voyages; manual operation also places a heavy burden on crews. Many vessels have poor autonomy and cannot operate effectively in complex marine environments, creating potential safety risks. Conventional route planning methods are generally based on static sea charts and manual calculation and cannot flexibly cope with changing environments and targets. Conventional ship control systems also often lack remote monitoring functionality, so the ship command center cannot monitor or intervene in the operation of the ship in real time.
Disclosure of Invention
The invention aims to provide an automatic control system and a control method based on artificial intelligence so as to solve the problems in the background technology.
In order to solve the technical problems, the invention provides the following technical scheme:
an automatic control system based on artificial intelligence comprises a data acquisition module, a route planning module, an intelligent control module, an emergency obstacle avoidance module and a remote monitoring module;
the data acquisition module uses sensors to acquire different types of data: a GPS receiver, meteorological sensors, a radar system, underwater sonar and inertial navigation sensors each acquire the corresponding data; the data include the following types:
GPS data: the GPS receiver provides accurate position coordinates of the vessel, including latitude and longitude, which are critical to route planning and vessel position tracking.
Weather data: meteorological sensors collect wind speed, wind direction, temperature, humidity and other meteorological conditions. This information is critical to route planning and to safe navigation in changing weather.
Radar data: the radar system detects surrounding vessels, shore obstructions and other objects. These data support emergency obstacle avoidance and collision risk assessment.
Underwater sonar data: the underwater sonar measures water depth and the position of underwater obstacles, helping the vessel avoid collisions with terrain and underwater obstacles.
Inertial navigation data: inertial navigation sensors provide the vessel's acceleration, speed and attitude. These data help ensure the stability and navigational control of the vessel.
The route planning module receives real-time data from the data acquisition module, performs route planning on these data using the A* algorithm, and applies the optimal route to the navigation control of the ship;
the intelligent control module receives real-time data from the data acquisition module and uses a deep Q network to realize intelligent control of the ship;
the emergency obstacle avoidance module receives real-time data from the data acquisition module, performs obstacle detection and target tracking, performs collision risk assessment by using a support vector machine, formulates an obstacle avoidance strategy in real time if collision risk exists, automatically executes the obstacle avoidance strategy, communicates with a ship command center, and sends out an emergency alarm;
the remote monitoring module is connected to the ship command center and is used for transmitting ship data in real time through satellite communication, the Internet or other communication links, wherein the ship data comprise positions, speeds, heading, meteorological conditions, sensor data, camera images and video streams; allowing the operator of the marine vessel command center to remotely control the vessel.
The route planning module first requires a map containing the route start point, end point and possible obstacles; the map is provided via the GPS receiver of the data acquisition module. The planning module receives a target position set by the user, which is either a coordinate point or a designated target port;
the A* algorithm divides the map into nodes, where each node represents a location on the map; the route planning module calculates the total estimated cost for each node and maintains two lists: an open list storing nodes to be explored and a closed list storing explored nodes. The starting point is added to the open list with its estimated cost set to 0;
the route planning module then starts the A* search, repeatedly selecting the node with the lowest estimated cost from the open list for expansion; when a node is expanded, the module examines its neighboring nodes, calculates their estimated costs, and adds them to the open list, while the expanded node is added to the closed list to prevent repeated exploration. At each step the module checks whether the target node has been reached; if so, route planning is complete;
if the target node is found, the planning module backtracks from the target node to construct the optimal path, which is the shortest route from the start point to the target and contains any necessary turning or obstacle avoidance actions;
the route planning module ensures the generated route is feasible, with no path segments intersecting obstacles; if an infeasible portion is found, the module adjusts the route to bypass the obstacle. Finally, the route planning module outputs an optimal route from the origin to the destination, expressed as a sequence of consecutive waypoints or path segments that the vessel follows in turn.
The mathematical formula of the A* algorithm is: f(n) = g(n) + h(n), where f(n) is the evaluation function, g(n) is the actual path cost of the optimal path from the start point to node n, and h(n) is the estimated cost of the optimal path from node n to the target. The formula for g(n) is:
g(n) = l·L + b·G + c·O + d·F + e·T + f·S
where l is the weight of the route path length, b is the weight of the cost of turning or changing heading on the route, c is the weight of the cost of avoiding obstacles or taking active avoidance actions, d is the weight of the fuel consumption associated with the route, e is the weight of the time constraint associated with the route, and f is the weight of the safety factor associated with the route; L is the route path length from the origin to node n, G is the cost of turning or changing heading on the route, O is the cost of avoiding obstacles or taking active avoidance actions, F is the fuel consumption associated with the route, T is the time constraint associated with the route, and S is the safety factor associated with the route.
Using the Euclidean distance, the formula for h(n) is:
h(n) = √((x_goal − x_n)² + (y_goal − y_n)²)
where h(n) is the estimated distance from the current node n to the target node, (x_goal, y_goal) are the geographic coordinates of the target node, and (x_n, y_n) are the geographic coordinates of the current node n.
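The search described above maps directly onto a standard open/closed-list implementation. The following is a minimal Python sketch of that loop, assuming a small grid, a unit step cost in place of the full weighted g(n) above, and an illustrative obstacle set; it is not the patent's implementation.

```python
import heapq
import math

def a_star(start, goal, neighbors, step_cost):
    """Minimal A* search; h(n) is the Euclidean distance to the goal."""
    def h(n):
        return math.hypot(goal[0] - n[0], goal[1] - n[1])

    open_list = [(h(start), start)]   # entries are (f = g + h, node)
    g = {start: 0.0}                  # actual path cost from the start point
    parent = {start: None}
    closed = set()

    while open_list:
        _, node = heapq.heappop(open_list)
        if node == goal:              # target reached: backtrack to build the path
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if node in closed:
            continue
        closed.add(node)              # prevent repeated exploration
        for nb in neighbors(node):
            new_g = g[node] + step_cost(node, nb)
            if nb not in g or new_g < g[nb]:
                g[nb] = new_g
                parent[nb] = node
                heapq.heappush(open_list, (new_g + h(nb), nb))
    return None                       # open list exhausted: no feasible route

# Illustrative 5x5 grid with one obstacle cell and unit step cost.
obstacles = {(2, 2)}

def grid_neighbors(node):
    x, y = node
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [c for c in candidates
            if 0 <= c[0] <= 4 and 0 <= c[1] <= 4 and c not in obstacles]

print(a_star((0, 0), (4, 4), grid_neighbors, lambda a, b: 1.0))
```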
The intelligent control module uses a deep Q network to realize intelligent control of the ship through the following steps (a minimal sketch follows this list):
S4-1, define the state space and action space of the ship control problem; the state space comprises the vessel's position, current speed, heading angle, surrounding environment information, meteorological conditions, ship state and target position; the action space represents the actions that control the ship, including rudder angle, engine output, propeller direction and power, and other controllable parameters;
S4-2, create a deep neural network that takes the state as input and outputs a Q value for each possible action, where the Q value represents the expected return, or long-term reward, after taking that action;
S4-3, train the deep Q network with the deep Q-learning algorithm: at each time step the agent selects an action to execute and observes the reward and the new state;
S4-4, repeatedly execute actions, observe rewards and update the deep Q network until the required performance level is reached; during this process, training parameters and hyperparameters must be tuned to optimize performance;
S4-5, after training is finished, the deep Q network takes the current state as input during real-time control and outputs a specific control action;
S4-6, monitor system performance in real time and adjust the parameters of the deep Q network as required to adapt to different environments and tasks.
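As a concrete illustration of steps S4-1 through S4-5, the following is a minimal PyTorch sketch of such a Q network: a state vector in, one Q value per action out. The state dimension and the discrete action set are assumptions made for illustration; the patent's action space (rudder angle, engine output, propeller direction and power) would first have to be discretized for a DQN.

```python
import torch
import torch.nn as nn

STATE_DIM = 9   # assumed: lat, lon, speed, heading, wind, wave, visibility, target lat/lon
N_ACTIONS = 9   # assumed discretization, e.g. {port, hold, starboard} x {slow, keep, fast}

class QNetwork(nn.Module):
    """Maps a ship-state vector to one Q value per discrete control action (S4-2)."""
    def __init__(self, state_dim=STATE_DIM, n_actions=N_ACTIONS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),   # Q-value estimation layer
        )

    def forward(self, state):
        return self.net(state)

q_net = QNetwork()
state = torch.randn(1, STATE_DIM)            # placeholder for a real state vector
action = q_net(state).argmax(dim=1).item()   # greedy action for real-time control (S4-5)
print(action)
```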
In step S4-2, the input layer of the deep neural network receives the current state of the ship, expressed as a numerical vector whose components correspond to the state space; the output layer is a Q-value estimation layer in which each output node corresponds to one possible action and outputs the Q value of that action, i.e. the expected return or long-term reward after taking it, corresponding to the action space;
in step S4-3, the training goal is to make the Q values output by the network as close as possible to the actual optimal Q values, which are determined by the Q-learning algorithm together with the reward function; during training, the parameters of the network are gradually adjusted to minimize the error between the actual Q values and the network's output Q values;
this requires defining a loss function for the training network that measures the difference between the actual Q value and the network output Q value, and minimizing that loss function by gradient descent;
training data are drawn from experience replay, each sample containing a state, the action taken, the reward, and the next state.
The Q-learning update formula is:
Q(s, a) ← Q(s, a) + α·[R + γ·max_a′ Q(s′, a′) − Q(s, a)]
where Q(s, a) is the Q value of taking action a in state s, used to guide the control of the vessel; α is the learning rate, controlling how fast Q values are updated; R is the reward function; γ is the discount factor balancing immediate and future rewards; s′ is the new state reached after the ship executes control action a; and a′ is the optimal control action in the new state s′, obtained from the Q-value network.
The reward function is:
R = ω_dist·R_dist(s, s′) + ω_obs·R_obs(s, s′) + ω_fuel·R_fuel(s, s′) + ω_safe·R_safe(s, s′)
where R is the total reward when action a is taken in state s and the system transfers to state s′; the weights ω_dist, ω_obs, ω_fuel and ω_safe are positive numbers set by the user and sum to 1;
R_dist(s, s′): if the vessel approaches the target location, a positive reward is given, computed from the Euclidean distance; the shorter the distance, the higher the reward;
R_obs(s, s′): if the vessel approaches an obstacle, a negative reward is given, judged from the obstacle detection data provided by the ship's sensors; the shorter the distance, the lower the reward;
R_fuel(s, s′): if the ship maintains appropriate speed and engine output while underway, a positive reward is given; fuel efficiency is judged via the SEEMP (Ship Energy Efficiency Management Plan); the higher the fuel efficiency, the higher the reward;
SEEMP is a ship energy efficiency management plan aimed at helping shipowners and operators improve the fuel efficiency of ships and reduce energy consumption and emissions, thereby lowering operating costs and environmental impact. Its EEOI index reflects the energy consumed, and the associated greenhouse gas emissions, per unit of cargo transported: the higher the fuel efficiency, the lower the EEOI value, since less fuel is consumed per unit of cargo transported.
R_safe(s, s′): if the vessel avoids a possible dangerous situation, a positive reward is given; otherwise the reward is 0. Dangerous situations are judged by the emergency obstacle avoidance module.
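A minimal sketch of this weighted reward in Python, with illustrative weights summing to 1; the field names, scaling constants and the EEOI-style efficiency proxy are assumptions, not taken from the patent.

```python
import math

def reward(s, s_next, w_dist=0.4, w_obs=0.3, w_fuel=0.2, w_safe=0.1):
    """R = w_dist*R_dist + w_obs*R_obs + w_fuel*R_fuel + w_safe*R_safe (weights sum to 1)."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # R_dist: positive as the vessel closes on the target (Euclidean distance).
    r_dist = dist(s["pos"], s["target"]) - dist(s_next["pos"], s_next["target"])
    # R_obs: increasingly negative as the nearest detected obstacle gets close.
    r_obs = -1.0 / max(s_next["obstacle_dist"], 0.1)
    # R_fuel: positive when efficiency improves (e.g. the EEOI value drops).
    r_fuel = s["eeoi"] - s_next["eeoi"]
    # R_safe: positive only if the obstacle avoidance module reports no danger.
    r_safe = 1.0 if s_next["danger_free"] else 0.0

    return w_dist * r_dist + w_obs * r_obs + w_fuel * r_fuel + w_safe * r_safe
```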
The loss function uses the mean square error to measure the difference between the target value and the predicted value. The loss function J(θ) is:
J(θ) = E[(R + γ·max_a′ Q(s′, a′, θ) − Q(s, a, θ))²]
where θ denotes the parameters of the deep Q network, adjusted during training to minimize the loss; Q(s, a, θ) is the output of the deep Q network, i.e. the predicted Q value of taking action a in state s; R is the actual reward obtained after performing action a; s′ is the new state after performing action a; a′ is the action considered in s′; and γ is the discount factor.
Gradient descent is used to minimize J(θ). Treating the target term as a constant, the gradient is:
∇_θ J(θ) = −2·E[(R + γ·max_a′ Q(s′, a′, θ) − Q(s, a, θ))·∇_θ Q(s, a, θ)]
The parameter θ is then updated by the gradient descent rule:
θ ← θ − α·∇_θ J(θ)
where α is the learning rate, controlling the step size of the parameter update. The gradient is repeatedly computed and θ updated; when the loss function falls below 0.001, gradient descent ends.
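The update loop above corresponds to one standard DQN training step. The following PyTorch sketch minimizes the mean-squared loss J(θ) over a mini-batch drawn from experience replay, with the network serving as its own target (the text does not mention a separate target network); batch size, optimizer and buffer size are assumptions.

```python
import random
from collections import deque

import torch
import torch.nn as nn

def dqn_train_step(q_net, optimizer, replay, batch_size=32, gamma=0.99):
    """One gradient step on J(theta) = MSE(R + gamma * max_a' Q(s', a'), Q(s, a))."""
    if len(replay) < batch_size:
        return None
    batch = random.sample(replay, batch_size)   # sampled from experience replay
    s, a, r, s2 = (torch.stack([torch.as_tensor(t[i], dtype=torch.float32)
                                for t in batch]) for i in range(4))

    q_sa = q_net(s).gather(1, a.long().view(-1, 1)).squeeze(1)   # Q(s, a, theta)
    with torch.no_grad():                                        # target treated as constant
        target = r + gamma * q_net(s2).max(dim=1).values         # R + gamma * max_a' Q(s', a')
    loss = nn.functional.mse_loss(q_sa, target)                  # J(theta)

    optimizer.zero_grad()
    loss.backward()    # computes grad_theta J(theta)
    optimizer.step()   # theta <- theta - alpha * grad_theta J(theta)
    return loss.item()

# Usage sketch (with the QNetwork defined earlier):
# q_net = QNetwork()
# optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
# replay = deque(maxlen=10_000)
# replay.append((state_vec, action_idx, reward_value, next_state_vec))
# loss = dqn_train_step(q_net, optimizer, replay)
```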
The emergency obstacle avoidance module uses a support vector machine (SVM) to evaluate collision risk through the following steps (an illustrative sketch follows this list):
S9-1, the data are labeled into two categories, obstacle and safe area: if data point v_i indicates an obstacle, its label is u_i = 1; if v_i indicates a safe area, its label is u_i = −1;
S9-2, find the optimal separating hyperplane between the two categories, expressed as w·v + ε = 0, where w is the normal vector, v is the feature vector, and ε is the bias;
S9-3, minimize the norm of the weight vector w while requiring that every data point lie at a margin of at least 1 from the separating hyperplane; the convex quadratic programming problem is min_(w,ε) (1/2)·‖w‖², subject to the constraints u_i·(w·v_i + ε) ≥ 1 for all i;
S9-4, train the support vector machine model on the training data;
S9-5, evaluate the model's performance on the test data;
S9-6, use the trained support vector machine model for real-time collision risk assessment.
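A brief scikit-learn sketch of these steps on synthetic data; the three features (range, bearing, closing speed) and the soft-margin parameter C are assumptions, and a production system would train on labeled sensor data instead.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic training set: feature vectors v_i = [range_nm, bearing_rad, closing_speed_kn]
# with labels u_i = 1 ("obstacle / collision risk") and u_i = -1 ("safe area").
rng = np.random.default_rng(0)
v_risk = rng.normal(loc=[0.5, 0.0, 2.0], scale=0.3, size=(100, 3))   # close and closing
v_safe = rng.normal(loc=[5.0, 0.0, -1.0], scale=0.5, size=(100, 3))  # far and opening
V = np.vstack([v_risk, v_safe])
u = np.hstack([np.ones(100), -np.ones(100)])

# Linear SVM: finds w and eps for the hyperplane w.v + eps = 0 separating the classes.
model = SVC(kernel="linear", C=1.0).fit(V, u)

# Real-time assessment of a new radar/sonar contact (hypothetical feature values).
contact = np.array([[1.0, 0.2, 1.5]])
label = model.predict(contact)[0]             # 1 -> collision risk, -1 -> safe
margin = model.decision_function(contact)[0]  # signed distance to the hyperplane
print(label, margin)
```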
The emergency obstacle avoidance module needs to avoid various potential obstacles in autonomous navigation of the vessel, including but not limited to:
other vessels: the vessel needs to avoid collisions with other vessels, especially in busy waters.
Islands and land: the module needs to ensure that the vessel does not collide with land, islands or shorelines.
Marine obstacles: buoys, lighthouses, fishing boats, offshore platforms, rafts, etc.
Floating objects: objects floating on the water surface, such as timber, refuse, or cargo lost from ships.
Marine organisms: some marine organisms may become obstacles, particularly large marine mammals, fish and water birds.
Severe weather conditions: heavy fog, strong storms, thunderstorms and the like may affect navigation and require the vessel to steer clear of potential hazards.
Sensitive marine environments: certain geographical or ecological areas require special care, such as coral reefs and shallow sub-sea terrain.
Depth and shoals: sensors monitor water depth so that the vessel avoids shoals and shallow terrain and does not run aground.
An automatic control method based on artificial intelligence comprises the following steps:
S10-1, sensors collect different types of data, which are received by the data acquisition module;
S10-2, the route planning module receives the real-time data provided by the data acquisition module and performs route planning using the A* algorithm;
s10-3, the intelligent control module implements an automatic control strategy by using a Q learning algorithm according to the real-time data provided by the data acquisition module;
s10-4, the emergency obstacle avoidance module receives real-time data from the data acquisition module and performs obstacle detection and collision risk assessment; if collision risk exists, making a collision prevention strategy through a support vector machine, and automatically executing collision prevention operation; simultaneously communicating with a ship command center to send out an emergency alarm;
s10-5, the remote monitoring module is connected to a ship command center, and real-time data of the ship are transmitted through different communication links, wherein the real-time data comprise positions, speeds, heading, meteorological conditions, sensor data, images and video streams; allowing the operator of the ship command center to monitor the status of the ship in real time and to perform remote control when needed.
Compared with the prior art, the invention has the following beneficial effects:
the invention uses the deep Q network and the reinforcement learning algorithm, so that the ship can better cope with various navigation challenges including obstacle avoidance, emergency collision avoidance and the like, thereby obviously improving the navigation safety.
The autonomous control system can make immediate decisions based on real-time conditions to ensure an optimal navigation path for the vessel. The navigation efficiency of the ship is improved, the fuel consumption is reduced, and the navigation time is shortened.
While traditional manual maneuvers are susceptible to human error, the autonomous control system of the present invention is data and algorithm based, which reduces the risk of human error, particularly in long voyage missions.
The remote monitoring module allows the ship command center to monitor the state and performance of the ship in real time, can remotely intervene when necessary, and provides stronger monitoring and coping capacity.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a system architecture diagram of an artificial intelligence based automatic control system of the present invention;
FIG. 2 is a schematic diagram of steps of an artificial intelligence based automatic control method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1 and 2, the present invention provides the following technical solutions:
according to one embodiment of the invention, as shown in a system architecture diagram of an automatic control system based on artificial intelligence in fig. 1, the automatic control system based on artificial intelligence comprises a data acquisition module, a route planning module, an intelligent control module, an emergency obstacle avoidance module and a remote monitoring module;
the data acquisition module uses sensors to acquire different types of data: a GPS receiver, meteorological sensors, a radar system, underwater sonar and inertial navigation sensors each acquire the corresponding data;
the route planning module receives real-time data from the data acquisition module, performs route planning on these data using the A* algorithm, and applies the optimal route to the navigation control of the ship;
the intelligent control module receives real-time data from the data acquisition module and uses a deep Q network to realize intelligent control of the ship;
the emergency obstacle avoidance module receives real-time data from the data acquisition module, performs obstacle detection and target tracking, performs collision risk assessment by using a support vector machine, formulates an obstacle avoidance strategy in real time if collision risk exists, automatically executes the obstacle avoidance strategy, communicates with a ship command center, and sends out an emergency alarm;
The remote monitoring module is connected to the ship command center and is used for transmitting ship data in real time through satellite communication, the Internet or other communication links, wherein the ship data comprise positions, speeds, heading, meteorological conditions, sensor data, camera images and video streams; allowing the operator of the marine vessel command center to remotely control the vessel.
The route planning module firstly needs a map, wherein the map comprises a route starting point, a route ending point and possible obstacles, and the map is provided by a GPS receiver of the data acquisition module; the planning module receives a target position set by a user, wherein the target position is a coordinate point or a designated target port;
the algorithm a divides the map into nodes, where each node represents a location on the map; the route planning module calculates the total estimated cost for each node; the route planning module maintains two lists, one is an open list, and is used for storing nodes to be explored; the other is a closed list for storing the explored nodes; it is necessary to add the starting point to the open list and set the estimated cost of the starting point to 0;
the route planning module starts an A-search process, and continuously selects a node with the lowest estimated cost from the open list for expansion; when the nodes are expanded, the route planning module checks the adjacent nodes, calculates estimated cost of the adjacent nodes, and adds the adjacent nodes into the open list; adding the extension node to the closed list to prevent repeated exploration; at each step, the planning module checks whether the target node has been reached, if so, the route planning is completed;
If the target node is found, the planning module starts backtracking from the target node to construct an optimal path; the optimal path will be the shortest route from the origin to the target, which contains any necessary turning or obstacle avoidance actions;
the route planning module will ensure that the generated route is viable, without path segments intersecting obstacles; if an unfeasible portion is found, the module will adjust the route to bypass the obstacle; ultimately, the route planning module will output an optimal route from the origin to the destination, which is a sequence of consecutive waypoints or path segments that the vessel follows in turn.
The intelligent control module uses a deep Q network to realize intelligent control of the ship through the following steps:
S4-1, define the state space and action space of the ship control problem; the state space comprises the vessel's position, current speed, heading angle, surrounding environment information, meteorological conditions, ship state and target position; the action space represents the actions that control the ship, including rudder angle, engine output, propeller direction and power, and other controllable parameters;
S4-2, create a deep neural network that takes the state as input and outputs a Q value for each possible action, where the Q value represents the expected return, or long-term reward, after taking that action;
S4-3, train the deep Q network with the deep Q-learning algorithm: at each time step the agent selects an action to execute and observes the reward and the new state;
S4-4, repeatedly execute actions, observe rewards and update the deep Q network until the required performance level is reached; during this process, training parameters and hyperparameters must be tuned to optimize performance;
S4-5, after training is finished, the deep Q network takes the current state as input during real-time control and outputs a specific control action;
S4-6, monitor system performance in real time and adjust the parameters of the deep Q network as required to adapt to different environments and tasks.
In step S4-2, the input layer of the deep neural network receives the current state of the ship, expressed as a numerical vector whose components correspond to the state space; the output layer is a Q-value estimation layer in which each output node corresponds to one possible action and outputs the Q value of that action, i.e. the expected return or long-term reward after taking it, corresponding to the action space;
in step S4-3, the training goal is to make the Q values output by the network as close as possible to the actual optimal Q values, which are determined by the Q-learning algorithm together with the reward function; during training, the parameters of the network are gradually adjusted to minimize the error between the actual Q values and the network's output Q values;
this requires defining a loss function for the training network that measures the difference between the actual Q value and the network output Q value, and minimizing that loss function by gradient descent;
training data are drawn from experience replay, each sample containing a state, the action taken, the reward, and the next state.
According to another embodiment of the present invention, as shown in the schematic step diagram of fig. 2, an automatic control method based on artificial intelligence, the steps of the automatic control method based on artificial intelligence are as follows:
S10-1, sensors collect different types of data, which are received by the data acquisition module;
S10-2, the route planning module receives the real-time data provided by the data acquisition module and performs route planning using the A* algorithm;
s10-3, the intelligent control module implements an automatic control strategy by using a Q learning algorithm according to the real-time data provided by the data acquisition module;
s10-4, the emergency obstacle avoidance module receives real-time data from the data acquisition module and performs obstacle detection and collision risk assessment; if collision risk exists, making a collision prevention strategy through a support vector machine, and automatically executing collision prevention operation; simultaneously communicating with a ship command center to send out an emergency alarm;
S10-5, the remote monitoring module is connected to a ship command center, and real-time data of the ship are transmitted through different communication links, wherein the real-time data comprise positions, speeds, heading, meteorological conditions, sensor data, images and video streams; allowing the operator of the ship command center to monitor the status of the ship in real time and to perform remote control when needed.
According to step S10-1, examples of the collected data are as follows (a parsing sketch follows these samples):
Timestamp: 2023-10-01:09, current position latitude and longitude: (-70.1, 30.9), speed (knots): 12.5, heading angle (degrees): 45.0, wind speed (knots): 10.2, wave height (meters): 2.3, visibility (nautical miles): 5.0, ship state: normal, target position latitude and longitude: (-68.0, 33.5).
Timestamp: 2023-10-01:15, current position latitude and longitude: (-69.5, 31.5), speed (knots): …, heading angle (degrees): 48.5, wind speed (knots): 9.8, wave height (meters): 2.1, visibility (nautical miles): 4.8, ship state: normal, target position latitude and longitude: (-67.8, 34.0).
Timestamp: 2023-10-01:09, current position latitude and longitude: (-69.0, 32.0), speed (knots): 13.8, heading angle (degrees): 50.0, wind speed (knots): 10.5, wave height (meters): 2.5, visibility (nautical miles): 4.5, ship state: normal, target position latitude and longitude: (-67.5, 34.5).
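A small sketch of how one such record might be flattened into the numeric state vector the intelligent control module consumes; the field names and record layout are assumptions mirroring the samples above.

```python
# Hypothetical record layout mirroring the samples above; all field names are assumptions.
sample = {
    "timestamp": "2023-10-01 09:00",
    "lat_lon": (-70.1, 30.9),
    "speed_kn": 12.5,
    "heading_deg": 45.0,
    "wind_kn": 10.2,
    "wave_m": 2.3,
    "visibility_nm": 5.0,
    "target_lat_lon": (-68.0, 33.5),
}

def to_state_vector(rec):
    """Flatten one acquisition record into the numeric state vector fed to the network."""
    return [
        *rec["lat_lon"], rec["speed_kn"], rec["heading_deg"],
        rec["wind_kn"], rec["wave_m"], rec["visibility_nm"],
        *rec["target_lat_lon"],
    ]

print(to_state_vector(sample))  # 9 components, matching the input layer of the Q network
```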
According to step S10-2, the mathematical formula of the A* algorithm is: f(n) = g(n) + h(n), where f(n) is the evaluation function, g(n) is the actual path cost of the optimal path from the start point to node n, and h(n) is the estimated cost of the optimal path from node n to the target. The formula for g(n) is:
g(n) = l·L + b·G + c·O + d·F + e·T + f·S
where l is the weight of the route path length, b is the weight of the cost of turning or changing heading on the route, c is the weight of the cost of avoiding obstacles or taking active avoidance actions, d is the weight of the fuel consumption associated with the route, e is the weight of the time constraint associated with the route, and f is the weight of the safety factor associated with the route; L is the route path length from the origin to node n, G is the cost of turning or changing heading on the route, O is the cost of avoiding obstacles or taking active avoidance actions, F is the fuel consumption associated with the route, T is the time constraint associated with the route, and S is the safety factor associated with the route.
Using the Euclidean distance, the formula for h(n) is:
h(n) = √((x_goal − x_n)² + (y_goal − y_n)²)
where h(n) is the estimated distance from the current node n to the target node, (x_goal, y_goal) are the geographic coordinates of the target node, and (x_n, y_n) are the geographic coordinates of the current node n.
Node data are initialized from the experimental data: start node (0, 0), target node (4, 4). The estimate from a current node (x, y) to the target node (4, 4) is h(x, y) = √((4 − x)² + (4 − y)²).
The estimate for the start node is h(0, 0) = √32 ≈ 5.657. Then:
initialize the open list and the closed list: the open list holds (0, 0) (the start point) with estimate 5.657;
the closed list is empty.
The node (0, 0) with the lowest estimate in the open list is selected and moved to the closed list. Its neighboring nodes (1, 0) and (0, 1) are examined. Estimate of (1, 0): h(1, 0) = √((4 − 1)² + (4 − 0)²) = 5.0;
g is the actual cost from the start point to the current node: g(1, 0) = 1 (the cost between adjacent nodes is 1), so the total cost is f(1, 0) = g(1, 0) + h(1, 0) = 1 + 5.0 = 6.0;
estimate of (0, 1): h(0, 1) = √((4 − 0)² + (4 − 1)²) = 5.0, so f(0, 1) = 6.0 as well;
the node with the lowest total cost is selected (here (1, 0), breaking the tie with (0, 1) arbitrarily) and moved to the closed list. These steps repeat until the target node (4, 4) is found or the open list is empty. (The snippet below checks these values.)
According to step S10-3, the Q-learning update formula is:
Q(s, a) ← Q(s, a) + α·[R + γ·max_a′ Q(s′, a′) − Q(s, a)]
where Q(s, a) is the Q value of taking action a in state s, used to guide the control of the vessel; α is the learning rate, controlling how fast Q values are updated; R is the reward function; γ is the discount factor balancing immediate and future rewards; s′ is the new state reached after the ship executes control action a; and a′ is the optimal control action in the new state s′, obtained from the Q-value network.
The reward function is:
R = ω_dist·R_dist(s, s′) + ω_obs·R_obs(s, s′) + ω_fuel·R_fuel(s, s′) + ω_safe·R_safe(s, s′)
where R is the total reward when action a is taken in state s and the system transfers to state s′; the weights ω_dist, ω_obs, ω_fuel and ω_safe are positive numbers set by the user and sum to 1;
R_dist(s, s′): if the vessel approaches the target location, a positive reward is given, computed from the Euclidean distance; the shorter the distance, the higher the reward;
R_obs(s, s′): if the vessel approaches an obstacle, a negative reward is given, judged from the obstacle detection data provided by the ship's sensors; the shorter the distance, the lower the reward;
R_fuel(s, s′): if the ship maintains appropriate speed and engine output while underway, a positive reward is given; fuel efficiency is judged via the SEEMP (Ship Energy Efficiency Management Plan); the higher the fuel efficiency, the higher the reward;
SEEMP is a ship energy efficiency management plan aimed at helping shipowners and operators improve the fuel efficiency of ships and reduce energy consumption and emissions, thereby lowering operating costs and environmental impact. Its EEOI index reflects the energy consumed, and the associated greenhouse gas emissions, per unit of cargo transported: the higher the fuel efficiency, the lower the EEOI value, since less fuel is consumed per unit of cargo transported.
R_safe(s, s′): if the vessel avoids a possible dangerous situation, a positive reward is given; otherwise the reward is 0. Dangerous situations are judged by the emergency obstacle avoidance module.
The loss function uses the mean square error to measure the difference between the target value and the predicted value. The loss function J(θ) is:
J(θ) = E[(R + γ·max_a′ Q(s′, a′, θ) − Q(s, a, θ))²]
where θ denotes the parameters of the deep Q network, adjusted during training to minimize the loss; Q(s, a, θ) is the output of the deep Q network, i.e. the predicted Q value of taking action a in state s; R is the actual reward obtained after performing action a; s′ is the new state after performing action a; a′ is the action considered in s′; and γ is the discount factor.
Gradient descent is used to minimize J(θ). Treating the target term as a constant, the gradient is:
∇_θ J(θ) = −2·E[(R + γ·max_a′ Q(s′, a′, θ) − Q(s, a, θ))·∇_θ Q(s, a, θ)]
The parameter θ is then updated by the gradient descent rule:
θ ← θ − α·∇_θ J(θ)
where α is the learning rate, controlling the step size of the parameter update. The gradient is repeatedly computed and θ updated; when the loss function falls below 0.001, gradient descent ends.
The states and action space in the experiment and the reward function are as follows:
the state space S includes: vessel position (x, y), current speed v, heading angle θ, ambient information, weather conditions, vessel status, target position (x_target, y_target);
The action space a includes: rudder angle, engine output, propeller direction and power, controllable parameters of the ship;
the reward function R (s, a, s') is used to evaluate how well different actions are taken. The goal of the bonus function is to increase fuel efficiency. If fuel efficiency increases, the prize value is positive. If fuel efficiency decreases, the prize value is negative. The prize is associated with the distance from the target location, with the prize being higher closer to the target.
Current state s1, position (x 1, y 1) = (0, 0), velocity v1=10, heading angle θ1=30°, wind speed=5 knots, wave height=1.5 meters, meteorological conditions=sunny days, ship state=normal, target position (x_target, y_target) = (100 );
action a1, rudder angle=10°, engine output engine_output=0.8, propeller direction and power= (0 °, 50%), other control parameters;
the next state s2, position (x 2, y 2) = (10, 5), speed v2=11, heading angle θ2=35°, wind speed=5.2 knots, wave height=1.6 meters, meteorological conditions=sunny day, ship state=normal.
Prize r1=prize value calculated from fuel efficiency.
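A toy computation of R1 for this transition; the fuel-consumption curve and the efficiency proxy are assumptions made for illustration, since the patent gives no concrete fuel model.

```python
def fuel_efficiency(speed_kn, engine_output):
    """Toy proxy: distance covered per unit of fuel burned (higher is better)."""
    fuel_rate = 0.5 + 2.0 * engine_output ** 2   # assumed consumption curve
    return speed_kn / fuel_rate

eff_s1 = fuel_efficiency(10.0, 0.8)   # state s1: v1 = 10 kn, engine output 0.8
eff_s2 = fuel_efficiency(11.0, 0.8)   # state s2: v2 = 11 kn, engine output 0.8
R1 = eff_s2 - eff_s1                  # positive: efficiency improved on this transition
print(round(R1, 3))
```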
According to step S10-4, the emergency obstacle avoidance module uses a support vector machine to evaluate collision risk through the following steps:
S9-1, the data are labeled into two categories, obstacle and safe area: if data point v_i indicates an obstacle, its label is u_i = 1; if v_i indicates a safe area, its label is u_i = −1;
S9-2, find the optimal separating hyperplane between the two categories, expressed as w·v + ε = 0, where w is the normal vector, v is the feature vector, and ε is the bias;
S9-3, minimize the norm of the weight vector w while requiring that every data point lie at a margin of at least 1 from the separating hyperplane; the convex quadratic programming problem is min_(w,ε) (1/2)·‖w‖², subject to the constraints u_i·(w·v_i + ε) ≥ 1 for all i;
S9-4, train the support vector machine model on the training data;
S9-5, evaluate the model's performance on the test data;
S9-6, use the trained support vector machine model for real-time collision risk assessment.
If the collision risk score is above a set threshold (for example 0.7), a collision risk exists.
Based on this assessment, a collision avoidance strategy is determined, including course adjustment, deceleration, turning to port or starboard, or stopping the ship.
When an obstacle is encountered, the vessel executes the collision avoidance strategy in simulation to avoid collision (a sketch of the strategy selection follows).
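A minimal sketch of this threshold-based strategy selection; the 0.7 threshold comes from the text above, while the bearing convention and the specific strategy table are assumptions.

```python
RISK_THRESHOLD = 0.7   # threshold from the text; the strategy table below is an assumption

def choose_avoidance(risk_score, obstacle_bearing_deg):
    """Map an SVM risk score and relative obstacle bearing to an avoidance action."""
    if risk_score <= RISK_THRESHOLD:
        return "maintain course"
    if risk_score > 0.9:
        return "stop ship and alert the command center"
    # Steer away from the obstacle side (negative bearing = obstacle to port) and slow down.
    turn = "turn starboard" if obstacle_bearing_deg < 0 else "turn port"
    return f"{turn}, reduce speed"

print(choose_avoidance(0.85, -15.0))  # -> "turn starboard, reduce speed"
```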
According to step S10-5, the transmitted data comprise: the vessel's position, speed and heading, current ocean weather conditions, sensor data, and real-time images and video streams. The ship's remote monitoring module transmits these real-time data to the ship command center via satellite communication.
An operator at the ship command center receives the real-time data through the remote monitoring system and monitors the vessel's position, speed, heading and environmental conditions in real time; views sensor data, including radar, underwater sonar and camera images; and views the real-time video stream to understand the current state of the ship.
When needed, the operator can perform control operations through the remote monitoring system, such as adjusting heading, modifying speed, or starting and stopping particular equipment (an illustrative telemetry frame follows).
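As an illustration, one possible telemetry frame for this link, sketched in Python; the field names, ship identifier and JSON transport are assumptions, not specified by the patent.

```python
import json

# Hypothetical telemetry frame for the satellite link; field names are assumptions.
frame = {
    "ship_id": "SHIP-001",
    "timestamp": "2023-10-01T09:00:00Z",
    "position": {"lat": -70.1, "lon": 30.9},
    "speed_kn": 12.5,
    "heading_deg": 45.0,
    "weather": {"wind_kn": 10.2, "wave_m": 2.3, "visibility_nm": 5.0},
    "alerts": [],
}
payload = json.dumps(frame).encode("utf-8")   # serialized for the communication link
print(len(payload), "bytes")

# A remote command in the opposite direction might look like:
command = {"type": "set_heading", "value_deg": 50.0}
print(json.dumps(command))
```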
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that: the foregoing description is only a preferred embodiment of the present invention, and the present invention is not limited thereto, but it is to be understood that modifications and equivalents of some of the technical features described in the foregoing embodiments may be made by those skilled in the art, although the present invention has been described in detail with reference to the foregoing embodiments. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An automatic control system based on artificial intelligence, characterized in that: the system comprises a data acquisition module, a route planning module, an intelligent control module, an emergency obstacle avoidance module and a remote monitoring module;
the data acquisition module acquires different types of data by using a sensor, and acquires corresponding data by a GPS receiver, a meteorological sensor, a radar system, an underwater sonar and an inertial navigation sensor respectively;
the route planning module receives real-time data from the data acquisition module, performs route planning on these data using the A* algorithm, and applies the optimal route to the navigation control of the ship;
the intelligent control module receives real-time data from the data acquisition module and uses a deep Q network to realize intelligent control of the ship;
The emergency obstacle avoidance module receives real-time data from the data acquisition module, performs obstacle detection and target tracking, performs collision risk assessment by using a support vector machine, formulates an obstacle avoidance strategy in real time if collision risk exists, automatically executes the obstacle avoidance strategy, communicates with a ship command center, and sends out an emergency alarm;
the remote monitoring module is connected to the ship command center and is used for transmitting ship data in real time through satellite communication, the Internet or other communication links, wherein the ship data comprise positions, speeds, heading, meteorological conditions, sensor data, camera images and video streams; allowing the operator of the marine vessel command center to remotely control the vessel.
2. An artificial intelligence based automatic control system according to claim 1, wherein: the route planning module first requires a map containing the route start point, end point and obstacles; the map is provided via the GPS receiver of the data acquisition module; the planning module receives a target position set by the user, which is either a coordinate point or a designated target port;
the A* algorithm divides the map into a plurality of nodes, each representing a position on the map; the route planning module calculates the total estimated cost for each node and maintains two lists: an open list storing nodes to be explored and a closed list storing explored nodes; the starting point is added to the open list with its estimated cost set to 0;
the route planning module starts the A* search, repeatedly selecting the node with the lowest estimated cost from the open list for expansion; when a node is expanded, the module examines its neighboring nodes, calculates their estimated costs, and adds them to the open list, while the expanded node is added to the closed list to prevent repeated exploration; at each step the module checks whether the target node has been reached, and if so, route planning is complete;
if the target node is found, the planning module backtracks from the target node to construct the optimal path, which is the shortest route from the start point to the target and contains any necessary turning or obstacle avoidance actions;
the route planning module ensures the generated route is feasible, with no path segments intersecting obstacles; if an infeasible portion is found, the module adjusts the route to bypass the obstacle; finally, the route planning module outputs an optimal route from the origin to the destination, expressed as a sequence of consecutive waypoints or path segments that the vessel follows in turn.
3. An artificial intelligence based automatic control system according to claim 2, wherein: the mathematical formula of the A* algorithm is: f(n) = g(n) + h(n), where f(n) is the evaluation function, g(n) is the actual path cost of the optimal path from the start point to node n, and h(n) is the estimated cost of the optimal path from node n to the target; the formula for g(n) is:
g(n) = l·L + b·G + c·O + d·F + e·T + f·S
where l is the weight of the route path length, b is the weight of the cost of turning or changing heading on the route, c is the weight of the cost of avoiding obstacles or taking active avoidance actions, d is the weight of the fuel consumption associated with the route, e is the weight of the time constraint associated with the route, and f is the weight of the safety factor associated with the route; L is the route path length from the origin to node n, G is the cost of turning or changing heading on the route, O is the cost of avoiding obstacles or taking active avoidance actions, F is the fuel consumption associated with the route, T is the time constraint associated with the route, and S is the safety factor associated with the route;
using the Euclidean distance, the formula for h(n) is:
h(n) = √((x_goal − x_n)² + (y_goal − y_n)²)
where h(n) is the estimated distance from the current node n to the target node, (x_goal, y_goal) are the geographic coordinates of the target node, and (x_n, y_n) are the geographic coordinates of the current node n.
4. An artificial intelligence based automatic control system according to claim 1, wherein: the intelligent control module uses a deep Q network to realize intelligent control of the ship, and comprises the following steps:
S4-1, define the state space and action space of the ship control problem; the state space comprises the vessel's position, current speed, heading angle, surrounding environment information, meteorological conditions, ship state and target position; the action space represents the actions that control the ship, including rudder angle, engine output, propeller direction and power, and other controllable parameters;
S4-2, create a deep neural network that takes the state as input and outputs a Q value for each possible action, where the Q value represents the expected return after taking that action;
S4-3, train the deep Q network with the deep Q-learning algorithm: at each time step the agent selects an action to execute and observes the reward and the new state;
S4-4, repeatedly execute actions, observe rewards and update the deep Q network until the required performance level is reached; during this process, training parameters and hyperparameters must be tuned to optimize performance;
S4-5, after training is finished, the deep Q network takes the current state as input during real-time control and outputs a specific control action;
S4-6, monitor system performance in real time and adjust the parameters of the deep Q network as required to adapt to different environments and tasks.
5. An artificial intelligence based automatic control system according to claim 4, wherein: in step S4-2, the input layer of the deep neural network receives the current state of the ship, expressed as a numerical vector whose components correspond to the state space; the output layer is a Q-value estimation layer in which each output node corresponds to one possible action and outputs the Q value of that action, i.e. the expected return or long-term reward after taking it, corresponding to the action space;
in step S4-3, the training goal is to make the Q values output by the network as close as possible to the actual optimal Q values, which are determined by the Q-learning algorithm together with the reward function; during training, the parameters of the network are gradually adjusted to minimize the error between the actual Q values and the network's output Q values;
this requires defining a loss function for the training network that measures the difference between the actual Q value and the network output Q value, and minimizing that loss function by gradient descent;
training data are drawn from experience replay, each sample containing a state, the action taken, the reward, and the next state.
6. An artificial intelligence based automatic control system according to claim 5, wherein: the Q-learning update formula is:
Q(s, a) ← Q(s, a) + α·[R + γ·max_a′ Q(s′, a′) − Q(s, a)]
where Q(s, a) is the Q value of taking action a in state s, used to guide the control of the vessel; α is the learning rate, controlling how fast Q values are updated; R is the reward function; γ is the discount factor balancing immediate and future rewards; s′ is the new state reached after the ship executes control action a; and a′ is the optimal control action in the new state s′, obtained from the Q-value network.
7. An artificial intelligence based automatic control system according to claim 5 or 6, wherein: the reward function is:
R = ω_dist·R_dist(s, s′) + ω_obs·R_obs(s, s′) + ω_fuel·R_fuel(s, s′) + ω_safe·R_safe(s, s′)
where R is the total reward when action a is taken in state s and the system transfers to state s′; the weights ω_dist, ω_obs, ω_fuel and ω_safe are positive numbers set by the user and sum to 1;
R_dist(s, s′): if the vessel approaches the target location, a positive reward is given, computed from the Euclidean distance; the shorter the distance, the higher the reward;
R_obs(s, s′): if the vessel approaches an obstacle, a negative reward is given, judged from the obstacle detection data provided by the ship's sensors; the shorter the distance, the lower the reward;
R_fuel(s, s′): if the ship maintains appropriate speed and engine output while underway, a positive reward is given; fuel efficiency is judged via the SEEMP ship energy efficiency management plan; the higher the fuel efficiency, the higher the reward;
R_safe(s, s′): if the vessel avoids a possible dangerous situation, a positive reward is given, otherwise the reward is 0; dangerous situations are judged by the emergency obstacle avoidance module.
8. An artificial intelligence based automatic control system according to claim 5, wherein: the loss function uses the mean square error to measure the difference between the target value and the predicted value; the loss function J(θ) is:
J(θ) = E[(R + γ·max_a′ Q(s′, a′, θ) − Q(s, a, θ))²]
where θ denotes the parameters of the deep Q network, adjusted during training to minimize the loss; Q(s, a, θ) is the output of the deep Q network, i.e. the predicted Q value of taking action a in state s; R is the actual reward obtained after performing action a; s′ is the new state after performing action a; a′ is the action considered in s′; and γ is the discount factor;
gradient descent is used to minimize J(θ); treating the target term as a constant, the gradient is:
∇_θ J(θ) = −2·E[(R + γ·max_a′ Q(s′, a′, θ) − Q(s, a, θ))·∇_θ Q(s, a, θ)]
the parameter θ is then updated by the gradient descent rule:
θ ← θ − α·∇_θ J(θ)
where α is the learning rate, controlling the step size of the parameter update; the gradient is repeatedly computed and θ updated, and when the loss function falls below a threshold Y, where Y is a small positive number, gradient descent ends.
9. An artificial intelligence based automatic control system according to claim 5, wherein: the emergency obstacle avoidance module uses a support vector machine to evaluate collision risk through the following steps:
S9-1, the data are labeled into two categories, obstacle and safe area: if data point v_i indicates an obstacle, its label is u_i = 1; if v_i indicates a safe area, its label is u_i = −1;
S9-2, find the optimal separating hyperplane between the two categories, expressed as w·v + ε = 0, where w is the normal vector, v is the feature vector, and ε is the bias;
S9-3, minimize the norm of the weight vector w while requiring that every data point lie at a margin of at least 1 from the separating hyperplane; the convex quadratic programming problem is min_(w,ε) (1/2)·‖w‖², subject to the constraints u_i·(w·v_i + ε) ≥ 1 for all i;
S9-4, training the support vector machine model by using training data;
s9-5, evaluating the performance of the support vector machine model by using the test data;
s9-6, using the trained support vector machine model for real-time collision risk assessment.
10. An automatic control method based on artificial intelligence is characterized in that: the method comprises the following steps:
S10-1, sensors collect different types of data, which are received by the data acquisition module;
S10-2, the route planning module receives the real-time data provided by the data acquisition module and performs route planning using the A* algorithm;
s10-3, the intelligent control module implements an automatic control strategy by using a Q learning algorithm according to the real-time data provided by the data acquisition module;
s10-4, the emergency obstacle avoidance module receives real-time data from the data acquisition module and performs obstacle detection and collision risk assessment; if collision risk exists, making a collision prevention strategy through a support vector machine, and automatically executing collision prevention operation; simultaneously communicating with a ship command center to send out an emergency alarm;
S10-5, the remote monitoring module is connected to a ship command center, and real-time data of the ship are transmitted through different communication links, wherein the real-time data comprise positions, speeds, heading, meteorological conditions, sensor data, images and video streams; allowing the operator of the ship command center to monitor the status of the ship in real time and to perform remote control when needed.
CN202311431617.4A 2023-10-31 2023-10-31 Automatic control system and control method based on artificial intelligence Pending CN117311160A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311431617.4A CN117311160A (en) 2023-10-31 2023-10-31 Automatic control system and control method based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311431617.4A CN117311160A (en) 2023-10-31 2023-10-31 Automatic control system and control method based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN117311160A true CN117311160A (en) 2023-12-29

Family

ID=89281196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311431617.4A Pending CN117311160A (en) 2023-10-31 2023-10-31 Automatic control system and control method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN117311160A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination