CN111930141A - Three-dimensional path visual tracking method for underwater robot - Google Patents


Info

Publication number
CN111930141A
Authority
CN
China
Prior art keywords
underwater robot
coordinate system
tracking
angle
path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010703073.2A
Other languages
Chinese (zh)
Inventor
张国成
孙玉山
柴璞鑫
吴新雨
张宸鸣
马陈飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202010703073.2A
Publication of CN111930141A
Legal status: Pending

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/10 Simultaneous control of position or course in three dimensions

Abstract

The invention discloses a three-dimensional path visual tracking method for an underwater robot and belongs to the technical field of three-dimensional path planning of underwater robots. A geodetic coordinate system, a carrier coordinate system and a curve coordinate system are established, and a six-degree-of-freedom model of the underwater robot is established according to these coordinate systems; a path tracking error model is established according to the six-degree-of-freedom model, and the course angle deviation and submergence angle deviation are determined; three-dimensional path tracking is performed on the six-degree-of-freedom model by a backstepping sliding mode control method; and the three-dimensional path tracking is trained by a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot. The invention keeps the tracking error continuous and improves the stability of path tracking; an integral term is added to the line-of-sight guidance law to introduce the influence of time; and a boundary reward function is added to accelerate the convergence of path tracking, reduce overshoot and improve precision.

Description

Three-dimensional path visual tracking method for underwater robot
Technical Field
The invention relates to the technical field of three-dimensional path planning of underwater robots, in particular to a visual tracking method for a three-dimensional path of an underwater robot.
Background
The ocean is the cradle of life on Earth. It covers 71 percent of the Earth's surface and stores rich water, biological and mineral resources. As mineral resources on land decline and water shortages become apparent, countries around the world have recognized the importance of developing ocean resources; their development and utilization has become a necessary path to sustainable development and a new field of international cooperation and competition. Meanwhile, the world economy cannot do without shipping, and ocean transportation is an important channel for the circulation of bulk commodities, so protecting the safety of navigation channels and keeping ocean transportation stable and smooth is of great significance for the continuous and healthy development of the national economy. To meet the requirements of the economic and military fields, an underwater mobile platform that is small, long-range, rich in functions and intelligent to a certain extent needs to be developed. Driven by these demands, autonomous underwater vehicles (AUVs) have developed rapidly, are widely used in ocean development, and have become a research hotspot at many research institutes. The intelligent underwater robot offers long range, long endurance, small size and high flexibility, and has wide application and good development prospects in ocean resource detection, hydrological observation, underwater operation and underwater target search.
Following a preset program, an AUV can monitor hydrological information underwater for long periods, cruise along a set route, scan and model submarine topography, search for targets autonomously, and inspect and maintain submarine pipelines and cables, which saves considerable manpower and material resources while improving operational efficiency and safety. In the military field, unmanned underwater vehicles can carry out anti-submarine, maritime blockade, early warning and communication tasks; multiple interconnected AUVs can form a powerful underwater cluster and, through centralized command and information sharing, execute various complex combat tasks over a wide sea area.
As for the current state of research, domestic and foreign scholars have proposed various methods to address AUV path tracking guidance, time-varying nonlinearity, model parameter uncertainty and external environmental disturbance, and have obtained significant results. Examples include the line-of-sight method, the virtual target method, time delay estimation, virtual control quantities and energy dissipation theory, but most of these methods are complex and adapt poorly. Controllers designed with deep reinforcement learning adapt better, but problems remain when deep reinforcement learning is applied to three-dimensional path tracking of an under-actuated AUV. First, the deep neural network often outputs actions with only a single objective and neglects the maneuvering characteristics of the under-actuated AUV. Second, a reinforcement learning controller may not be sensitive enough to small path tracking errors, which limits further improvement of tracking accuracy. Based on this analysis, the design of an AUV path tracking controller needs to consider the multi-input, multi-output, nonlinear and strongly coupled nature of the system while reducing the influence of external ocean currents as much as possible. The designed controller should have strong robustness and adaptability while ensuring tracking precision.
Disclosure of Invention
To keep the tracking error continuous and improve the stability of path tracking, the invention provides a three-dimensional path visual tracking method for an underwater robot, with the following technical scheme:
a three-dimensional path visual tracking method for an underwater robot comprises the following steps:
step 1: establishing a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, and establishing a six-degree-of-freedom model of the underwater robot according to the coordinate system;
step 2: according to the established six-degree-of-freedom model of the underwater robot, a path tracking error model is established, and course angle deviation and submergence angle deviation are determined;
step 3: performing three-dimensional path tracking on the established six-degree-of-freedom model of the underwater robot by adopting a backstepping sliding mode control method;
step 4: training the three-dimensional path tracking by adopting a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot.
Preferably, the step 1 specifically comprises:
step 1.1: establishing a geodetic coordinate system, wherein the origin of the geodetic coordinate system is a certain point on the sea level, the positive direction of the ξ axis is the same as the main course of the underwater robot AUV, the ζ axis points to the geocenter, and the ξ axis, the η axis and the ζ axis form a right-hand coordinate system;
establishing a carrier coordinate system, wherein the origin of the carrier coordinate system is the center of mass of the underwater robot AUV, the x_B axis is fixed along the heading of the AUV, the y_B axis is fixed toward the AUV starboard, and the x_B axis, y_B axis and z_B axis form a right-hand coordinate system;
establishing a curve coordinate system, wherein the origin of the curve coordinate system is a point P on the expected path, the x_SF axis is along the tangential direction of the expected path, the y_SF axis is along the normal direction, and the x_SF axis, y_SF axis and z_SF axis form a right-hand coordinate system;
step 1.2: according to a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, a six-degree-of-freedom model of the underwater robot is established, the six-degree-of-freedom model comprises a kinetic equation and a kinematic equation, and the kinetic equation is expressed by the following formula:
Figure BDA0002593606130000031
the kinematic equation is represented by:
Figure BDA0002593606130000032
Figure BDA0002593606130000033
wherein m is the mass of the underwater robot, I_y is the moment of inertia about the y-axis, I_z is the moment of inertia about the z-axis, u, v, w are the longitudinal, transverse and vertical velocities respectively, q, r are the pitch and yaw angular velocities, θ, ψ are the pitch and heading angles, X_(·), Y_(·), Z_(·), M_(·), N_(·) are all hydrodynamic coefficients, z_g, z_b are the positions of the center of gravity and the center of buoyancy, X is the longitudinal thrust, M and N are the torques about the y-axis and z-axis generated by the combined action of the propeller and rudder, ψ_B is the heading angle of the underwater robot, θ_B is the submergence angle of the underwater robot, α is the attack angle, β is the drift angle, and v_t is the resultant velocity of the underwater robot.
Preferably, the step 2 specifically comprises:
step 2.1: defining a virtual underwater robot AUV on a tracking path according to the established six-degree-of-freedom model of the underwater robot, and expressing a virtual underwater robot AUV kinematic equation by the following formula:
Figure BDA0002593606130000041
wherein ψ_p and θ_p are the attitude angles of the virtual target, and V_p is the resultant velocity of the virtual robot;
step 2.2: converting the position errors of the real underwater robot AUV and the virtual underwater robot AUV in the inertial coordinate system into a curve coordinate system, and expressing the conversion process by the following formula:
Figure BDA0002593606130000042
differentiating the converted position error to obtain the error kinetic equation, which is expressed by the following formula:
Figure BDA0002593606130000043
step 2.3: neglecting errors caused by non-linearity in a three-dimensional space, determining course angle deviation and submergence angle deviation, and expressing the course angle deviation and the submergence angle deviation by the following formula:
Figure BDA0002593606130000044
wherein
Figure BDA0002593606130000045
is the course angle deviation, and
Figure BDA0002593606130000046
is the submergence angle deviation.
Preferably, the step 3 specifically comprises:
step 3.1: adopting a backstepping sliding mode control method and, based on a Lyapunov function, introducing a horizontal plane approach angle and a vertical plane approach angle to adjust the path tracking process of the underwater robot, the horizontal plane approach angle, a function of y_e, being expressed by the following formula:
Figure BDA0002593606130000051
the vertical plane approach angle χ(z_e) is expressed by the following formula:
Figure BDA0002593606130000052
wherein Δ_j is the horizontal look-ahead distance and Δ_k is the vertical look-ahead distance;
step 3.2: determining the tracking errors according to the horizontal plane approach angle and the vertical plane approach angle: the horizontal plane tracking error is ψ_e minus the horizontal plane approach angle evaluated at y_e, and the vertical plane tracking error is θ_e - χ(z_e);
a three-dimensional spiral path is used to test the path tracking effect of the backstepping sliding mode control method; the three-dimensional spiral is described by a parametric equation, expressed by the following formula:
Figure BDA0002593606130000053
wherein S is the path parameter and the initial value of the target position is S(0) = 0; the initial position of the AUV is ξ(0) = 65, η(0) = 500, ζ(0) = 50, the initial heading angle is ψ(0) = 0 and the initial pitch angle is θ(0) = 0, with an initial speed of 0.1 m/s and the initial angular speeds of 0 and 1 m/s; a steady water flow is added to test the resistance to current disturbance, with a flow speed of 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction, thereby completing the three-dimensional path tracking.
Preferably, the step 4 specifically includes:
step 4.1: when the LOS method is used to calculate the deviation, an integral term is added to eliminate the periodic error; the integral term introduces time into the control loop, and the deviation after the integral term is added is expressed by the following formulas:
Figure BDA0002593606130000061
Figure BDA0002593606130000062
wherein k_ψ and k_θ are the control gains;
the basic idea of the disturbance observer is to correct the estimated value using the difference between the estimated output and the actual output, and the disturbance observer is designed by the following formulas:
Figure BDA0002593606130000063
Figure BDA0002593606130000064
step 4.2: adding penalty terms to the reward function, wherein the penalty terms cover the rudder angles of the vertical rudder and the horizontal rudder and the rudder angle change rate, the penalty terms are set using a modified second-order Gaussian function, and the improved reward function is expressed by the following formula:
Figure BDA0002593606130000065
adding a boundary reward to the reward function, i.e. when the AUV is within a set specific boundary range, adding 1 to the reward value of the step as an extra reward, and expressing the boundary reward function by the following formula:
Figure BDA0002593606130000066
step 4.3: performing parameter optimization, the parameters including the Actor neural network learning rate LR_A, the Critic neural network learning rate LR_C, the reward discount coefficient γ and the parameter update discount coefficient τ;
selecting different parameters and debugging multiple times, the final parameters being LR_A = 0.001, LR_C = 0.003, γ = 0.95 and τ = 0.05;
after the reward function is improved and the current disturbance observer is added, the new path tracking controller is trained, and the three-dimensional path visual tracking of the underwater robot is completed.
The invention has the following beneficial effects:
Because the integral over time is continuous, the current error is influenced by the gradual accumulation of past errors, so tracking is adjusted dynamically and static error is suppressed. To adapt to a complex ocean current environment, further reduce the influence of water flow disturbance, and enhance the robustness and anti-disturbance capability of the controller, a disturbance observer is added to the control system, and the controller output is actively adjusted in real time according to the observed form and characteristics of the disturbance.
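The effect of the integral term can be illustrated with a minimal sketch. It assumes a common integral line-of-sight form in which the approach angle is computed from the cross-track error plus a gain times its running integral; the patent's own formulas are given only as figures, so the class name `IntegralLOS`, the gain `k_int`, the look-ahead distance `delta` and the sign convention are all hypothetical.

```python
import math


class IntegralLOS:
    """Approach angle with an integral term (assumed form, not the patent's exact formula)."""

    def __init__(self, delta, k_int):
        self.delta = delta      # look-ahead distance in metres (hypothetical value)
        self.k_int = k_int      # integral gain (hypothetical value)
        self.integral = 0.0     # running integral of the cross-track error

    def update(self, cross_track_error, dt):
        # Accumulate past errors so that a persistent offset keeps
        # increasing the magnitude of the commanded correction.
        self.integral += cross_track_error * dt
        return -math.atan((cross_track_error + self.k_int * self.integral) / self.delta)


los = IntegralLOS(delta=10.0, k_int=0.1)
first = los.update(2.0, dt=1.0)
for _ in range(20):
    last = los.update(2.0, dt=1.0)
```

With a constant error, `last` has a larger magnitude than `first`, which is how a steady offset caused by a constant current would be worked off over time.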
The invention addresses the slow convergence and large early deviation of path tracking, as well as the situation in training where the heading stays correct while the position deviation remains large. When the AUV is outside the boundary range, no reward is given, which increases the sensitivity of the neural network to position deviation and improves the tracking effect.
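The boundary reward can be sketched as follows. The patent states that 1 is added to the step reward inside the boundary and nothing is added outside; the function names and the boundary radius of 5.0 m are hypothetical choices for illustration.

```python
def boundary_reward(distance_to_path, boundary=5.0):
    """Extra reward: 1.0 inside the boundary range, 0.0 outside (radius is hypothetical)."""
    return 1.0 if distance_to_path <= boundary else 0.0


def step_reward(base_reward, distance_to_path, boundary=5.0):
    """Total reward of one step: the base reward plus the boundary bonus."""
    return base_reward + boundary_reward(distance_to_path, boundary)


inside = step_reward(-0.3, distance_to_path=2.0)
outside = step_reward(-0.3, distance_to_path=8.0)
```

Because the bonus switches off as soon as the position deviation exceeds the boundary, a policy that keeps a correct heading but drifts away from the path earns visibly less.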
The Actor neural network learning rate LR_A determines how much is learned from experience in one update of the Actor network parameters: the larger LR_A is, the more is learned in each round of learning, and vice versa. The Critic neural network learning rate LR_C plays the same role for one update of the Critic network parameters: the larger LR_C is, the more is learned in each round, and vice versa. The reward discount coefficient reduces the influence that returns from later states in the Markov decision process have on the evaluation of the current state: the smaller the coefficient, the less the later returns count when the current state is evaluated, and vice versa. The parameter update discount coefficient determines the weight of the new parameters when new network parameters update the old ones: the larger the coefficient, the greater the weight of the new parameters and the larger the change in the parameters, and vice versa. The efficiency and effect of deep reinforcement learning are closely related to these four parameters, and the best settings need to be obtained through theoretical analysis and practice.
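The roles of the two discount coefficients can be sketched in a few lines. The sketch assumes the usual DDPG conventions, in which the reward discount weights future returns and the parameter update coefficient blends new network parameters into the old ones; the function names are hypothetical, and the values 0.95 and 0.05 are the final settings reported in the patent.

```python
GAMMA = 0.95  # reward discount coefficient reported in the patent
TAU = 0.05    # parameter update coefficient reported in the patent


def soft_update(target_params, online_params, tau=TAU):
    """Blend new (online) parameters into the old (target) ones with weight tau."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


def discounted_return(rewards, gamma=GAMMA):
    """Sum of rewards in which a reward k steps ahead is weighted by gamma**k."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


new_target = soft_update([0.0, 1.0], [1.0, 1.0])
ret = discounted_return([1.0, 1.0, 1.0])
```

A larger `tau` moves the old parameters further toward the new ones in a single update, and a smaller `gamma` shrinks the contribution of later rewards to `ret`.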
The invention designs the AUV approach angle using the line-of-sight method, and designs a virtual AUV and the control law for its position using the virtual guidance method. An AUV three-dimensional motion error model is established, and the virtual AUV continuously guides the real AUV by adjusting its own speed, so that the tracking error is continuous and the stability of path tracking is improved.
A design method for the deep reinforcement learning path tracking controller is explored, and the most appropriate deep reinforcement learning algorithm is selected. The learning and simulation environments are built in Python, and the training loop is designed by adding exploration parameters to the DDPG algorithm to strengthen its exploration. The current state of the AUV is the input, the action of the AUV motion actuators is the output, and a deep neural network forms the core of the controller. AUV three-dimensional path tracking based on the DDPG algorithm is realized in simulation, and the tracking is largely stable.
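One common way to add exploration to DDPG is to perturb the actor output with Gaussian noise whose standard deviation decays during training. The patent does not specify its exploration parameters, so the following is only a sketch under that assumption; the function names, the decay rate, the noise floor and the rudder limit are all hypothetical.

```python
import random


def explore(action, noise_std, limit):
    """Add zero-mean Gaussian noise to an action and clip it to the actuator limit."""
    noisy = action + random.gauss(0.0, noise_std)
    return max(-limit, min(limit, noisy))


def decay(noise_std, rate=0.9995, floor=0.01):
    """Shrink the noise level each step, but never below a small floor."""
    return max(floor, noise_std * rate)


noise_std = 0.5
for _ in range(3):
    rudder = explore(0.2, noise_std, limit=0.6)  # 0.2 rad stands in for the actor output
    noise_std = decay(noise_std)
```

Early in training the large noise lets the controller try rudder commands the actor would not choose on its own; as the noise decays, behaviour converges toward the learned policy.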
An integral term is added to the line-of-sight guidance law to introduce the influence of time; a boundary reward function is added to accelerate the convergence of path tracking, reduce overshoot and improve precision. A second-order Gaussian function of the rudder angle and its rate of change is added to the reward function of the reinforcement learning algorithm to suppress frequent back-and-forth changes of the rudder angle; an ocean current disturbance observer is added to the control loop so that the ocean current is observed in real time, its disturbance of the controller is suppressed, and the periodic error is reduced.
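A Gaussian-shaped penalty on rudder angle and rudder rate might look like the sketch below. The patent's modified second-order Gaussian is given only as a figure, so this form, the function name and the two width parameters are assumptions chosen to show the intended behaviour: no penalty for a centred, steady rudder and a penalty approaching -1 for large or rapidly changing deflections.

```python
import math


def rudder_penalty(angle, rate, sigma_angle=0.3, sigma_rate=0.2):
    """Two-variable Gaussian shifted down by 1, so the value lies in (-1, 0].

    angle is in rad and rate in rad/s; both widths are hypothetical values.
    """
    exponent = angle ** 2 / (2.0 * sigma_angle ** 2) + rate ** 2 / (2.0 * sigma_rate ** 2)
    return math.exp(-exponent) - 1.0


steady = rudder_penalty(0.0, 0.0)
gentle = rudder_penalty(0.1, 0.0)
abrupt = rudder_penalty(0.3, 0.2)
```

Adding this term to the step reward makes frequent back-and-forth rudder motion costly without forbidding the large deflections that a sharp turn may need.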
Drawings
FIG. 1 is a flow chart of a three-dimensional path visualization tracking method of an underwater robot;
FIG. 2 is a schematic view of a coordinate system;
FIG. 3 is a low pass filtering block diagram;
FIG. 4 is a three-dimensional path tracking trajectory diagram;
FIG. 5 is a graph of position error curves;
FIG. 6 is a schematic diagram of path tracking training;
FIG. 7 is a deep learning three-dimensional path tracking trajectory diagram;
FIG. 8 is a deep learning three-dimensional path tracking position error map;
FIG. 9 is a graph of deep reinforcement learning round rewards.
Detailed Description
The present invention will be described in detail with reference to specific examples.
The first embodiment is as follows:
As shown in fig. 1, the application provides a three-dimensional path visual tracking method for an underwater robot, which comprises the following steps:
step 1: establishing a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, and establishing a six-degree-of-freedom model of the underwater robot according to the coordinate system;
the step 1 specifically comprises the following steps:
as shown in fig. 2, step 1.1: establishing a geodetic coordinate system {I}, wherein the origin of the geodetic coordinate system is a certain point on the sea level, the positive direction of the ξ axis is the same as the main course of the AUV, the ζ axis points to the geocenter, and the ξ axis, the η axis and the ζ axis form a right-hand coordinate system;
establishing a carrier coordinate system {B}, wherein the origin of the carrier coordinate system is the center of mass of the underwater robot AUV, the x_B axis is fixed along the heading of the AUV, the y_B axis is fixed toward the AUV starboard, and the x_B axis, y_B axis and z_B axis form a right-hand coordinate system;
establishing a curve coordinate system S-F, wherein the origin of the curve coordinate system is a point P on the expected path, the x_SF axis is along the tangential direction of the expected path, the y_SF axis is along the normal direction, and the x_SF axis, y_SF axis and z_SF axis form a right-hand coordinate system;
step 1.2: according to a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, a six-degree-of-freedom model of the underwater robot is established, the six-degree-of-freedom model comprises a kinetic equation and a kinematic equation, and the kinetic equation is expressed by the following formula:
Figure BDA0002593606130000081
the kinematic equation is represented by:
Figure BDA0002593606130000091
Figure BDA0002593606130000092
wherein m is the mass of the underwater robot, I_y is the moment of inertia about the y-axis, I_z is the moment of inertia about the z-axis, u, v, w are the longitudinal, transverse and vertical velocities respectively, q, r are the pitch and yaw angular velocities, θ, ψ are the pitch and yaw angles, X_(·), Y_(·), Z_(·), M_(·), N_(·) are all hydrodynamic coefficients, z_g, z_b are the positions of the center of gravity and the center of buoyancy, X is the longitudinal thrust, M and N are the torques about the y-axis and z-axis generated by the combined action of the propeller and rudder, ψ_B is the heading angle of the underwater robot, θ_B is the submergence angle of the underwater robot, α is the attack angle, β is the drift angle, and v_t is the resultant velocity of the underwater robot.
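To make the kinematic part of the model concrete, the following sketch integrates a standard form of the AUV kinematic equations with roll neglected, as commonly used for under-actuated vehicles. The patent's own equations appear only as figures, so this form, the function name `kinematics_step` and the Euler integration are assumptions; the symbols follow the definitions above.

```python
import math


def kinematics_step(state, velocities, dt):
    """Advance (xi, eta, zeta, theta, psi) by one Euler step of length dt."""
    xi, eta, zeta, theta, psi = state
    u, v, w, q, r = velocities
    # Body-frame velocities rotated into the geodetic frame (roll neglected).
    d_xi = (u * math.cos(psi) * math.cos(theta) - v * math.sin(psi)
            + w * math.cos(psi) * math.sin(theta))
    d_eta = (u * math.sin(psi) * math.cos(theta) + v * math.cos(psi)
             + w * math.sin(psi) * math.sin(theta))
    d_zeta = -u * math.sin(theta) + w * math.cos(theta)
    d_theta = q
    d_psi = r / math.cos(theta)  # singular at theta = +/-90 degrees
    return (xi + d_xi * dt, eta + d_eta * dt, zeta + d_zeta * dt,
            theta + d_theta * dt, psi + d_psi * dt)


# Level flight at 1 m/s with a slow yaw rate, starting at the origin.
state = kinematics_step((0.0, 0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0, 0.1), dt=0.1)
```

A simulation environment would call such a step function once per control interval, after the dynamic equations have produced the new body-frame velocities.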
Step 2: according to the established six-degree-of-freedom model of the underwater robot, a path tracking error model is established, and course angle deviation and submergence angle deviation are determined;
the step 2 specifically comprises the following steps:
step 2.1: defining a virtual underwater robot AUV on a tracking path according to the established six-degree-of-freedom model of the underwater robot, and expressing a virtual underwater robot AUV kinematic equation by the following formula:
Figure BDA0002593606130000093
wherein ψ_p and θ_p are the attitude angles of the virtual target, and V_p is the resultant velocity of the virtual robot;
step 2.2: converting the position errors of the real underwater robot AUV and the virtual underwater robot AUV in the inertial coordinate system into a curve coordinate system, and expressing the conversion process by the following formula:
Figure BDA0002593606130000101
differentiating the converted position error to obtain the error kinetic equation, which is expressed by the following formula:
Figure BDA0002593606130000102
step 2.3: neglecting errors caused by non-linearity in a three-dimensional space, determining course angle deviation and submergence angle deviation, and expressing the course angle deviation and the submergence angle deviation by the following formula:
Figure BDA0002593606130000103
wherein
Figure BDA0002593606130000104
is the course angle deviation, and
Figure BDA0002593606130000105
is the submergence angle deviation.
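The conversion of the position error in step 2.2 can be sketched as a rotation of the inertial-frame error into a frame aligned with the path. The sketch assumes the path frame is obtained by rotating the geodetic frame through the yaw angle ψ_p and then the pitch angle θ_p of the virtual target; the patent's own transformation is given only as a figure, and the function name `to_path_frame` is hypothetical.

```python
import math


def to_path_frame(auv_pos, virtual_pos, psi_p, theta_p):
    """Express the inertial position error (AUV minus virtual AUV) in the path frame."""
    dx, dy, dz = (a - b for a, b in zip(auv_pos, virtual_pos))
    cp, sp = math.cos(psi_p), math.sin(psi_p)
    ct, st = math.cos(theta_p), math.sin(theta_p)
    # Rows of the transpose of R = Rz(psi_p) * Ry(theta_p).
    x_e = cp * ct * dx + sp * ct * dy - st * dz
    y_e = -sp * dx + cp * dy
    z_e = cp * st * dx + sp * st * dy + ct * dz
    return x_e, y_e, z_e


# Path heading along the eta axis; the AUV is 1 m ahead of the virtual target.
errors = to_path_frame((10.0, 5.0, 2.0), (10.0, 4.0, 2.0), psi_p=math.pi / 2, theta_p=0.0)
```

In this example the whole error appears as the along-track component x_e, while the cross-track components y_e and z_e are zero up to rounding.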
Step 3: performing three-dimensional path tracking on the established six-degree-of-freedom model of the underwater robot by adopting a backstepping sliding mode control method;
as shown in fig. 3, the step 3 specifically includes:
step 3.1: adopting a backstepping sliding mode control method and, based on a Lyapunov function, introducing a horizontal plane approach angle and a vertical plane approach angle to adjust the path tracking process of the underwater robot, the horizontal plane approach angle, a function of y_e, being expressed by the following formula:
Figure BDA0002593606130000106
the vertical plane approach angle χ(z_e) is expressed by the following formula:
Figure BDA0002593606130000107
wherein Δ_j is the horizontal look-ahead distance and Δ_k is the vertical look-ahead distance;
step 3.2: determining the tracking errors according to the horizontal plane approach angle and the vertical plane approach angle: the horizontal plane tracking error is ψ_e minus the horizontal plane approach angle evaluated at y_e, and the vertical plane tracking error is θ_e - χ(z_e);
a three-dimensional spiral path is used to test the path tracking effect of the backstepping sliding mode control method; the three-dimensional spiral is described by a parametric equation, expressed by the following formula:
Figure BDA0002593606130000111
wherein S is the path parameter and the initial value of the target position is S(0) = 0; the initial position of the AUV is ξ(0) = 65, η(0) = 500, ζ(0) = 50, the initial heading angle is ψ(0) = 0 and the initial pitch angle is θ(0) = 0, with an initial speed of 0.1 m/s and the initial angular speeds of 0 and 1 m/s; a steady water flow is added to test the resistance to current disturbance, with a flow speed of 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction, thereby completing the three-dimensional path tracking.
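The approach angles and tracking errors of step 3 can be illustrated with a bounded arctangent look-ahead law, which is one common line-of-sight form. The patent's formulas for the approach angles are given only as figures, so the arctangent shape, the sign convention and the function names here are assumptions; the look-ahead distances play the role of Δ_j and Δ_k.

```python
import math


def approach_angle(error, lookahead):
    """Bounded approach angle in (-pi/2, pi/2) that steers back toward the path (assumed form)."""
    return -math.atan(error / lookahead)


def tracking_errors(psi_e, theta_e, y_e, z_e, delta_j, delta_k):
    """Angle deviations minus the approach angles in the horizontal and vertical planes."""
    eps_psi = psi_e - approach_angle(y_e, delta_j)
    eps_theta = theta_e - approach_angle(z_e, delta_k)
    return eps_psi, eps_theta


on_path = tracking_errors(0.1, 0.0, 0.0, 0.0, delta_j=10.0, delta_k=10.0)
off_path = tracking_errors(0.0, 0.0, 5.0, 0.0, delta_j=10.0, delta_k=10.0)
```

On the path the tracking error reduces to the angle deviation itself, while off the path a nonzero error appears even when the heading matches the path, which is what drives the vehicle back toward it.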
Step 4: training the three-dimensional path tracking by adopting a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot.
The step 4 specifically comprises the following steps:
step 4.1: when the LOS method is used to calculate the deviation, an integral term is added to eliminate the periodic error; the integral term introduces time into the control loop, and the deviation after the integral term is added is expressed by the following formulas:
Figure BDA0002593606130000112
Figure BDA0002593606130000113
wherein k_ψ and k_θ are the control gains;
Because the integral over time is continuous, the current error is influenced by the gradual accumulation of past errors, so tracking is adjusted dynamically and static error is suppressed. To adapt to a complex ocean current environment, further reduce the influence of water flow disturbance, and enhance the robustness and anti-disturbance capability of the controller, a disturbance observer is added to the control system, and the controller output is actively adjusted in real time according to the observed form and characteristics of the disturbance.
The basic idea of the disturbance observer is to correct the estimated value using the difference between the estimated output and the actual output, and the disturbance observer is designed by the following formulas:
Figure BDA0002593606130000121
Figure BDA0002593606130000122
step 4.2: adding penalty terms to the reward function, wherein the penalty terms cover the rudder angles of the vertical rudder and the horizontal rudder and the rudder angle change rate, the penalty terms are set using a modified second-order Gaussian function, and the improved reward function is expressed by the following formula:
Figure BDA0002593606130000123
The slow convergence and large early deviation of path tracking are considered, as is the situation in training where the heading stays correct while the position deviation remains large.
Therefore, a boundary reward is added to the reward function: when the AUV is within a set boundary range, 1 is added to the reward value of that step as an extra reward, and when the AUV is outside the boundary range, no extra reward is given for that step. This increases the sensitivity of the neural network to position deviation and improves the tracking effect. The boundary reward function is expressed by the following formula:
Figure BDA0002593606130000124
step 4.3: performing parameter optimization, the parameters including the Actor neural network learning rate LR_A, the Critic neural network learning rate LR_C, the reward discount coefficient γ and the parameter update discount coefficient τ. The Actor neural network learning rate LR_A determines how much is learned from experience in one update of the Actor network parameters: the larger LR_A is, the more is learned in each round of learning, and vice versa. The Critic neural network learning rate LR_C plays the same role for one update of the Critic network parameters: the larger LR_C is, the more is learned in each round, and vice versa. The reward discount coefficient reduces the influence that returns from later states in the Markov decision process have on the evaluation of the current state: the smaller the coefficient, the less the later returns count when the current state is evaluated, and vice versa. The parameter update discount coefficient determines the weight of the new parameters when new network parameters update the old ones: the larger the coefficient, the greater the weight of the new parameters and the larger the change in the parameters, and vice versa. The efficiency and effect of deep reinforcement learning are closely related to these four parameters, and the best settings need to be obtained through theoretical analysis and practice. Therefore, different parameters are selected and debugged multiple times, and the final parameters are LR_A = 0.001, LR_C = 0.003, γ = 0.95 and τ = 0.05;
after the reward function is improved and the ocean current disturbance observer is added, the new path tracking controller is trained, completing the three-dimensional path visual tracking of the underwater robot.
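The roles of the discount coefficient and the parameter update coefficient in step 4.3 can be illustrated with a short sketch. The four constants are the values stated in the text; the helper names and the use of plain Python lists as stand-ins for network weights are assumptions, and the weighted-average update shown is the soft target update commonly used with DDPG, which matches the description of weighting new against old parameters but is not confirmed as the exact rule used here.

```python
LR_A = 0.001   # Actor learning rate (value from the text)
LR_C = 0.003   # Critic learning rate (value from the text)
GAMMA = 0.95   # reward discount coefficient (value from the text)
TAU = 0.05     # parameter update coefficient (value from the text)


def discounted_return(rewards, gamma=GAMMA):
    """Sum of rewards in which a reward k steps ahead is weighted by gamma**k."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def soft_update(old_params, new_params, tau=TAU):
    """Blend parameters: weight tau on the new values, (1 - tau) on the old ones."""
    return [(1.0 - tau) * old + tau * new for old, new in zip(old_params, new_params)]
```

A smaller gamma shrinks the contribution of later rewards to the return, and a larger tau moves the blended parameters further toward the new values in a single update.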
In the simulation, a result with fast convergence and good tracking performance was finally obtained after extensive parameter tuning and repeated tests. The simulation result is shown in Fig. 4, in which the dotted line represents the target path and the solid line represents the tracking path under backstepping sliding mode control.
Fig. 5 shows the position error of the AUV in the three directions during tracking. Owing to the ocean current, the tracking error curves vary periodically and oscillate around 0.
The initial value of the target position is S(0) = 0, and the initial position of the AUV is ξ(0) = 650, η(0) = 500, ζ(0) = 50, with initial heading angle ψ(0) = 0 and initial pitch angle θ(0) = 0. Initial speed 0.1 m/s, desired forward speed ud6/s. A disturbing water current is added to the environment: in the inertial coordinate system its speed is 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction. The path tracking training scenario is shown in Fig. 6.
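To give a sense of scale for the disturbance, the three current components can be combined into a single speed: sqrt(0.3² + 0.3² + 0.15²) = sqrt(0.2025) = 0.45 m/s. The sketch below reproduces that arithmetic; the constant and function names are illustrative assumptions, and the drift helper simply assumes the current stays constant and nothing counteracts it.

```python
import math

# Current components in the inertial frame, in m/s along xi, eta, zeta (from the text).
CURRENT_VELOCITY = (0.3, 0.3, 0.15)


def current_speed(velocity):
    """Magnitude of the current velocity vector in m/s."""
    return math.sqrt(sum(c * c for c in velocity))


def uncompensated_drift(velocity, seconds):
    """Displacement in metres caused by a constant current acting for `seconds`."""
    return tuple(c * seconds for c in velocity)
```

Under that assumption, 100 s of uncorrected drift moves the vehicle about 30 m in ξ, 30 m in η and 15 m in ζ, which is why the controller has to compensate continuously.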
As Figs. 7 and 8 show, the AUV achieves path tracking overall. Over 10,000 iterations, the average distance between the AUV position and the target position is 5.5 m. In the initial phase, however, there is a large overshoot because the neural network controller is not sensitive enough. The maximum deviation in direction reaches 36 m and the maximum deviation in direction reaches 30 m; because of the water current, the AUV oscillates on both sides of the target path and a static error remains.
Because the DDPG reinforcement learning algorithm incorporates the policy gradient idea, the neural network can learn from both successful and failed experiences. The reward curve in Fig. 9 shows that the reward value stays good for most of the time after learning starts, but low reward values persist for a long time between steps 600 and 700. This indicates that the neural network controller has learned a great deal from successful experiences but not enough from failed ones, which tends to trap the controller in a local optimum and leaves it short of exploration. A good neural network controller should learn from both success and failure, so that it can succeed and avoid failure.
The above is only a preferred embodiment of the three-dimensional path visual tracking method for an underwater robot. The scope of protection of the method is not limited to this embodiment; all technical solutions based on the same idea fall within the scope of protection of the invention. Modifications and variations that those skilled in the art may make without departing from the gist of the invention are also intended to fall within its scope.

Claims (5)

1. A three-dimensional path visual tracking method for an underwater robot, characterized by comprising the following steps:
step 1: establishing a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, and establishing a six-degree-of-freedom model of the underwater robot according to the coordinate system;
step 2: according to the established six-degree-of-freedom model of the underwater robot, a path tracking error model is established, and course angle deviation and submergence angle deviation are determined;
step 3: performing three-dimensional path tracking on the established six-degree-of-freedom model of the underwater robot by adopting a backstepping sliding mode control method;
step 4: training the three-dimensional path tracking by adopting a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot.
2. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 1 specifically comprises the following steps:
step 1.1: establishing a geodetic coordinate system, wherein the origin of the geodetic coordinate system is a point on the sea level, the positive direction of the ξ axis is the same as the main course of the underwater robot AUV, the ζ axis points to the center of the earth, and the ξ axis, the η axis and the ζ axis form a right-hand coordinate system;
establishing a carrier coordinate system, wherein the origin of the carrier coordinate system is the center of mass of the underwater robot AUV, the x_B axis is fixed along the heading of the AUV, the y_B axis is fixed toward the starboard side of the AUV, and the x_B axis, the y_B axis and the z_B axis form a right-hand coordinate system;
establishing a curve coordinate system, wherein the origin of the curve coordinate system is a point P on the desired path, the x_SF axis is along the tangential direction of the desired path, the y_SF axis is along the normal direction, and the x_SF axis, the y_SF axis and the z_SF axis form a right-hand coordinate system;
step 1.2: according to a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, a six-degree-of-freedom model of the underwater robot is established, the six-degree-of-freedom model comprises a kinetic equation and a kinematic equation, and the kinetic equation is expressed by the following formula:
Figure FDA0002593606120000011
the kinematic equation is represented by:
Figure FDA0002593606120000021
Figure FDA0002593606120000022
wherein m is the mass of the underwater robot, I_y is the moment of inertia about the y-axis, I_z is the moment of inertia about the z-axis, u, v, w are the longitudinal, transverse and vertical velocities, respectively, q, r are the pitch and yaw angular velocities, θ, ψ are the pitch angle and the heading angle, X_(·), Y_(·), Z_(·), M_(·), N_(·) are all hydrodynamic coefficients, z_g, z_b are the positions of the center of gravity and the center of buoyancy, X is the longitudinal thrust, M and N are the torques about the y-axis and z-axis generated by the combined action of the propeller and rudder, ψ_B is the heading angle of the underwater robot, θ_B is the submergence angle of the underwater robot, α is the attack angle, β is the drift angle, and v_t is the resultant velocity of the underwater robot.
3. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 2 specifically comprises the following steps:
step 2.1: defining a virtual underwater robot AUV on a tracking path according to the established six-degree-of-freedom model of the underwater robot, and expressing a virtual underwater robot AUV kinematic equation by the following formula:
Figure FDA0002593606120000023
wherein ψ_p and θ_p are the attitude angles of the virtual target, and V_p is the resultant velocity of the virtual underwater robot;
step 2.2: converting the position errors of the real underwater robot AUV and the virtual underwater robot AUV in the inertial coordinate system into a curve coordinate system, and expressing the conversion process by the following formula:
Figure FDA0002593606120000031
differentiating the converted position errors to obtain the error dynamics equation, and expressing the error dynamics equation by the following formula:
Figure FDA0002593606120000032
step 2.3: neglecting errors caused by non-linearity in a three-dimensional space, determining course angle deviation and submergence angle deviation, and expressing the course angle deviation and the submergence angle deviation by the following formula:
Figure FDA0002593606120000033
wherein
Figure FDA0002593606120000034
is the course angle deviation, and
Figure FDA0002593606120000035
is the submergence angle deviation.
4. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 3 specifically comprises the following steps:
step 3.1: using the backstepping sliding mode control method, a horizontal plane approach angle and a vertical plane approach angle are introduced based on a Lyapunov function to adjust the path tracking process of the underwater robot, and the horizontal plane approach angle, a function of y_e, is expressed by the following formula:
Figure FDA0002593606120000036
The vertical plane approach angle χ(z_e) is expressed by the following formula:
Figure FDA0002593606120000037
wherein Δ_j is the horizontal look-ahead distance and Δ_k is the vertical look-ahead distance;
step 3.2: determining the tracking errors from the horizontal plane approach angle and the vertical plane approach angle, the tracking errors being expressed by the following formulas:
ψ=ψe-(ye)
θ=θe-χ(ze)
wherein the first expression gives the horizontal plane tracking error and the second gives the vertical plane tracking error;
a three-dimensional spiral path is adopted to test the path tracking performance of the backstepping sliding mode control method; the three-dimensional spiral can be described by a parametric equation, expressed by the following formula:
Figure FDA0002593606120000041
wherein S is the path parameter, the initial value of the target position is S(0) = 0, the initial position of the AUV is ξ(0) = 65, η(0) = 500, ζ(0) = 50, the initial heading angle ψ(0) = 0, the initial pitch angle θ(0) = 0, the initial speed of 0.1 m/s, the initial angular speeds of 0 and 1 m/s, and a steady water current is added to test the ability to reject current disturbance, the speed of the water current being 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction, thereby completing the three-dimensional path tracking.
5. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 4 specifically comprises the following steps:
step 4.1: when the LOS method is adopted to calculate the deviation, an integral term is added to eliminate the periodic error; the integral term brings time into the control loop, and the deviation after the integral term is added is represented by the following formulas:
Figure FDA0002593606120000042
Figure FDA0002593606120000043
wherein k_ψ and k_θ are control gains;
because the integration over time is continuous, past errors gradually accumulate and influence the current error, so that tracking is adjusted dynamically and the static error is suppressed; to adapt to a complex ocean current environment, further reduce the influence of water current disturbance, and strengthen the robustness and disturbance rejection of the controller, a disturbance observer is added to the control system, and the controller output is adjusted actively in real time according to the observed form and characteristics of the disturbance;
the basic idea of the disturbance observer is to correct the estimated value using the difference between the estimated output and the actual output, the disturbance observer being given by the following formulas:
Figure FDA0002593606120000051
Figure FDA0002593606120000052
step 4.2: adding penalty terms to the reward function, wherein the penalty terms cover the rudder angles of the vertical rudder and the horizontal rudder and the rudder angle change rates, the penalty terms are set using a modified second-order Gaussian function, and the improved reward function is represented by the following formula:
Figure FDA0002593606120000053
determining a boundary reward function, the boundary reward function being represented by:
Figure FDA0002593606120000054
step 4.3: performing parameter optimization, covering the Actor neural network learning rate LR_A, the Critic neural network learning rate LR_C, the reward discount coefficient and the parameter update discount coefficient; different parameter values are tried over multiple rounds of debugging, and the final values are chosen as LR_A = 0.001, LR_C = 0.003, γ = 0.95 and τ = 0.05;
after the reward function is improved and the ocean current disturbance observer is added, the new path tracking controller is trained, completing the three-dimensional path visual tracking of the underwater robot.
CN202010703073.2A 2020-07-21 2020-07-21 Three-dimensional path visual tracking method for underwater robot Pending CN111930141A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010703073.2A CN111930141A (en) 2020-07-21 2020-07-21 Three-dimensional path visual tracking method for underwater robot

Publications (1)

Publication Number Publication Date
CN111930141A true CN111930141A (en) 2020-11-13

Family

ID=73313711

Country Status (1)

Country Link
CN (1) CN111930141A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201113
