CN111930141A - Three-dimensional path visual tracking method for underwater robot - Google Patents
- Publication number
- CN111930141A (application CN202010703073.2A)
- Authority
- CN
- China
- Prior art keywords
- underwater robot
- coordinate system
- tracking
- angle
- path
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/10—Simultaneous control of position or course in three dimensions
Abstract
The invention discloses a three-dimensional path visual tracking method for an underwater robot, belonging to the technical field of three-dimensional path planning of underwater robots. A geodetic coordinate system, a carrier coordinate system and a curve coordinate system are established, and a six-degree-of-freedom model of the underwater robot is built on these coordinate systems. A path tracking error model is established from the six-degree-of-freedom model, and the course angle deviation and submergence angle deviation are determined. Three-dimensional path tracking is performed on the model with a backstepping sliding mode control method, and the path tracking is then trained with a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot. The invention keeps the tracking error continuous and improves the stability of path tracking; an integral term is added to the line-of-sight guidance law to introduce the influence of time; and a boundary reward function is added to accelerate the convergence of path tracking, reduce overshoot and improve precision.
Description
Technical Field
The invention relates to the technical field of three-dimensional path planning of underwater robots, in particular to a visual tracking method for a three-dimensional path of an underwater robot.
Background
The ocean is the cradle of life on earth. It covers 71 percent of the earth's surface and holds rich water, biological and mineral resources. As mineral resources on land decline and water shortages become more apparent, countries around the world have recognized the importance of developing ocean resources; their development and utilization has become a necessary route to sustainable development and a new field of international cooperation and competition. At the same time, the world economy depends on shipping, and ocean transportation is an important channel for the circulation of bulk commodities, so protecting the safety of shipping lanes and keeping ocean transportation stable and unobstructed is of great significance for the sustained and healthy development of the national economy. To meet the needs of the economic and military fields, an underwater mobile platform is required that is small, has a long range, offers rich functions and has a certain degree of intelligence. Driven by these demands, Autonomous Underwater Vehicles (AUV) have developed rapidly, are widely used in ocean development and have become a research focus of many research institutes. The intelligent underwater robot offers long range, long endurance, small size and high flexibility, and has wide application and good prospects in ocean resource detection, hydrological observation, underwater operation and underwater target search.
Following a preset program, the AUV can monitor hydrological information underwater for long periods, cruise along a set route, scan and model the seabed topography, search for targets autonomously, and inspect and maintain submarine pipelines and cables, which saves considerable manpower and material resources while improving operating efficiency and safety. In the military field, unmanned underwater vehicles can be used for anti-submarine, maritime blockade, early warning and communication tasks; several AUVs can be interconnected to form a powerful underwater cluster that carries out complex combat tasks over a wide sea area through centralized command and information sharing.
Regarding the current state of research, domestic and foreign scholars have proposed various methods that address AUV path tracking guidance, time-varying nonlinearity, uncertainty of the model parameters and external environmental interference, and significant results have been obtained. Examples include the line-of-sight method, the virtual target method, time delay estimation, virtual control quantities and energy dissipation theory, but most of these methods are complex and adapt poorly. Controllers designed with deep reinforcement learning have better adaptive capacity, but problems remain when deep reinforcement learning is applied to the three-dimensional path tracking of an under-actuated AUV. First, the deep neural network often outputs actions for a single objective only and neglects the maneuvering characteristics of the under-actuated AUV. Second, a controller based on reinforcement learning may not be sensitive enough to small path tracking errors, which limits further improvement of the tracking accuracy. Based on this analysis, the design of an AUV path tracking controller needs to take into account the multi-input, multi-output, nonlinear and strongly coupled nature of the system while reducing the influence of external ocean currents as far as possible. The designed controller should be robust and adaptive while guaranteeing the tracking precision.
Disclosure of Invention
The invention provides a three-dimensional path visual tracking method for an underwater robot, aiming at ensuring continuous tracking error and improving the stability of path tracking, and the invention provides the following technical scheme:
a three-dimensional path visual tracking method for an underwater robot comprises the following steps:
step 1: establishing a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, and establishing a six-degree-of-freedom model of the underwater robot according to the coordinate system;
step 2: according to the established six-degree-of-freedom model of the underwater robot, a path tracking error model is established, and course angle deviation and submergence angle deviation are determined;
step 3: performing three-dimensional path tracking on the established six-degree-of-freedom model of the underwater robot by adopting a backstepping sliding mode control method;
step 4: training the three-dimensional path tracking by adopting a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot.
Preferably, the step 1 specifically comprises:
step 1.1: establishing a geodetic coordinate system, wherein the origin of the geodetic coordinate system is a point on the sea surface, the positive direction of the ξ axis is the same as the main course of the underwater robot AUV, the ζ axis points toward the center of the earth, and the ξ axis, the η axis and the ζ axis form a right-handed coordinate system;
establishing a carrier coordinate system, wherein the origin of the carrier coordinate system is the center of mass of the AUV (autonomous underwater vehicle), the x_B axis is aligned with the heading of the AUV, the y_B axis points to the starboard side of the AUV, and the x_B, y_B and z_B axes form a right-handed coordinate system;
establishing a curve coordinate system, wherein the origin of the curve coordinate system is a point P on the desired path, the x_SF axis lies along the tangent of the desired path, the y_SF axis lies along the normal, and the x_SF, y_SF and z_SF axes form a right-handed coordinate system;
step 1.2: establishing a six-degree-of-freedom model of the underwater robot according to the geodetic coordinate system, the carrier coordinate system and the curve coordinate system, the model comprising a dynamic equation and a kinematic equation, wherein the dynamic equation is expressed by the following formula:
the kinematic equation is expressed by the following formula:
wherein m is the mass of the underwater robot; I_y and I_z are the moments of inertia about the y-axis and the z-axis; u, v and w are the longitudinal, transverse and vertical velocities; q and r are the pitch and yaw angular velocities; θ and ψ are the pitch and yaw angles; X_(·), Y_(·), Z_(·), M_(·) and N_(·) are hydrodynamic coefficients; z_g and z_b are the positions of the center of gravity and the center of buoyancy; X is the longitudinal thrust; M and N are the torques about the y-axis and the z-axis generated by the combined action of the propeller and the rudders; ψ_B is the course angle of the underwater robot; θ_B is the submergence angle of the underwater robot; α is the attack angle; β is the drift angle; and v_t is the resultant velocity of the underwater robot.
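The kinematic formula is not reproduced in this text. As a rough illustration only, the sketch below uses the widely used roll-free kinematic relations for an under-actuated AUV, which are an assumption and not necessarily the exact form in the patent. The function names `auv_kinematics` and `resultant_speed` are hypothetical.

```python
import math

def auv_kinematics(u, v, w, q, r, theta, psi):
    """Earth-frame rates from body-frame velocities, roll neglected (assumed form)."""
    xi_dot = (u * math.cos(psi) * math.cos(theta)
              - v * math.sin(psi)
              + w * math.cos(psi) * math.sin(theta))
    eta_dot = (u * math.sin(psi) * math.cos(theta)
               + v * math.cos(psi)
               + w * math.sin(psi) * math.sin(theta))
    zeta_dot = -u * math.sin(theta) + w * math.cos(theta)
    theta_dot = q                    # pitch rate
    psi_dot = r / math.cos(theta)    # yaw rate, singular at theta = +/- 90 deg
    return xi_dot, eta_dot, zeta_dot, theta_dot, psi_dot

def resultant_speed(u, v, w):
    """Resultant velocity v_t of the vehicle."""
    return math.sqrt(u * u + v * v + w * w)
```

Because the mapping from (u, v, w) to the earth-frame rates is a rotation, the earth-frame speed equals v_t for any pitch and yaw angle, which gives a quick sanity check on an implementation.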
Preferably, the step 2 specifically comprises:
step 2.1: defining a virtual underwater robot AUV on a tracking path according to the established six-degree-of-freedom model of the underwater robot, and expressing a virtual underwater robot AUV kinematic equation by the following formula:
wherein ψ_p and θ_p are the attitude angles of the virtual target, and V_p is the resultant velocity of the virtual robot;
step 2.2: converting the position errors of the real underwater robot AUV and the virtual underwater robot AUV in the inertial coordinate system into a curve coordinate system, and expressing the conversion process by the following formula:
differentiating the transformed position errors to obtain the error dynamics equation, and expressing the error dynamics equation by the following formula:
step 2.3: neglecting errors caused by non-linearity in a three-dimensional space, determining course angle deviation and submergence angle deviation, and expressing the course angle deviation and the submergence angle deviation by the following formula:
wherein the two quantities defined are the course angle deviation and the submergence angle deviation, respectively.
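The conversion of step 2.2 can be illustrated with a short sketch. It assumes that the curve frame is obtained from the inertial frame by a yaw rotation ψ_p followed by a pitch rotation θ_p, and that the error is defined as the real AUV position minus the virtual AUV position; both choices are assumptions, and `path_frame_error` is a hypothetical name.

```python
import math

def path_frame_error(pos, pos_p, psi_p, theta_p):
    """Rotate the inertial position error into the curve frame.

    Assumes R = Rz(psi_p) * Ry(theta_p) and error = R^T * (pos - pos_p).
    """
    dx = pos[0] - pos_p[0]
    dy = pos[1] - pos_p[1]
    dz = pos[2] - pos_p[2]
    cp, sp = math.cos(psi_p), math.sin(psi_p)
    ct, st = math.cos(theta_p), math.sin(theta_p)
    x_e = cp * ct * dx + sp * ct * dy - st * dz   # along-track error
    y_e = -sp * dx + cp * dy                      # cross-track error
    z_e = cp * st * dx + sp * st * dy + ct * dz   # vertical-track error
    return x_e, y_e, z_e
```

Under these assumptions, a vehicle lying exactly on the path tangent ahead of the virtual AUV has a pure along-track error and zero cross-track and vertical errors.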
Preferably, the step 3 specifically comprises:
step 3.1: adopting a backstepping sliding mode control method, introducing a horizontal plane approach angle and a vertical plane approach angle based on a Lyapunov function to adjust the path tracking process of the underwater robot, and expressing the horizontal plane approach angle, a function of the error y_e, by the following formula:
the vertical plane approach angle χ(z_e) is expressed by the following formula:
wherein Δ_j is the horizontal look-ahead distance and Δ_k is the vertical look-ahead distance;
step 3.2: determining a tracking error according to the horizontal plane approach angle and the vertical plane approach angle, and tracking the error according to the following formula:
ψ=ψe-(ye)
θ=θe-χ(ze)
wherein the left-hand sides are the horizontal plane tracking error and the vertical plane tracking error, respectively;
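The approach angle formulas are not reproduced in this text. A minimal sketch of the idea, assuming the common arctangent line-of-sight form with a look-ahead distance, is given below; the sign convention, the reuse of one form for both planes and the names `approach_angle` and `tracking_errors` are all assumptions.

```python
import math

def approach_angle(track_error, look_ahead):
    """Assumed LOS form: steer back toward the path, saturating at +/- 90 deg."""
    return -math.atan(track_error / look_ahead)

def tracking_errors(psi_e, theta_e, y_e, z_e, delta_j, delta_k):
    """Angle deviation minus the approach angle, per plane (as in step 3.2)."""
    err_h = psi_e - approach_angle(y_e, delta_j)    # horizontal plane
    err_v = theta_e - approach_angle(z_e, delta_k)  # vertical plane
    return err_h, err_v
```

A larger look-ahead distance gives a gentler approach to the path, while a smaller one turns the vehicle toward the path more aggressively.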
testing the path tracking effect of the backstepping sliding mode control method on a three-dimensional spiral path, wherein the three-dimensional spiral is described by a parametric equation expressed by the following formula:
wherein s is the path parameter, the initial value of the target position is s(0) = 0, the initial position of the AUV is ξ(0) = 65, η(0) = 500, ζ(0) = 50, the initial heading angle is ψ(0) = 0, the initial pitch angle is θ(0) = 0, the initial speed is 0.1 m/s, and the initial angular speeds are 0 and 1 m/s; a steady water flow is added to test the resistance to current interference, with a flow speed of 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction, thereby completing the three-dimensional path tracking.
Preferably, the step 4 specifically includes:
step 4.1: when the LOS method is adopted to calculate the deviation, an integral term is added to eliminate the periodic error; the integral term brings time into the control loop, and the deviation after the integral term is added is expressed by the following formula:
wherein k_ψ and k_θ are the control gains;
the basic idea of the disturbance observer is to correct the estimated value by the difference between the estimated output and the actual output, and the disturbance observer is expressed by the following formula:
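The observer formula is not reproduced in this text. The sketch below illustrates only the stated idea, correcting the estimate with the difference between actual and estimated output, on a hypothetical one-dimensional plant in which a constant current adds to a known commanded velocity. The plant, the gains `l1` and `l2` and the function name are assumptions.

```python
def estimate_current(true_current=0.3, dt=0.01, steps=2000, l1=4.0, l2=4.0):
    """Luenberger-style estimate of a constant current acting on a 1-D plant."""
    x, x_hat, d_hat = 0.0, 0.0, 0.0
    u = 1.0  # known commanded velocity in m/s (hypothetical)
    for _ in range(steps):
        e = x - x_hat                       # actual output minus estimated output
        x += dt * (u + true_current)        # plant: position drifts with the current
        x_hat += dt * (u + d_hat + l1 * e)  # output estimate corrected by e
        d_hat += dt * l2 * e                # disturbance estimate driven by e
    return d_hat
```

With these gains the continuous-time error dynamics have a double pole at -2, so the estimate settles on the true current within a few seconds of simulated time.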
step 4.2: adding penalty terms to the reward function, the penalty terms covering the rudder angles of the vertical rudder and the horizontal rudder and the rudder angle change rates, wherein the penalty terms are set by a modified second-order Gaussian function, and the improved reward function is expressed by the following formula:
adding a boundary reward to the reward function, i.e. when the AUV is within a set specific boundary range, adding 1 to the reward value of the step as an extra reward, and expressing the boundary reward function by the following formula:
step 4.3: performing parameter optimization over the Actor neural network learning rate LR_A, the Critic neural network learning rate LR_C, the reward discount coefficient γ and the parameter update discount coefficient τ;
tuning with different parameter values over multiple runs, the final parameters being LR_A = 0.001, LR_C = 0.003, γ = 0.95 and τ = 0.05;
after the reward function is improved and the current disturbance observer is added, the new path tracking controller is trained, and the three-dimensional path visual tracking of the underwater robot is completed.
The invention has the following beneficial effects:
Because of the continuous time integration, the current error reflects the gradual accumulation of past errors, so the tracking is adjusted dynamically and static error is suppressed. To adapt to a complex ocean current environment, further reduce the influence of water flow interference and strengthen the robustness and anti-interference capability of the controller, a disturbance observer is added to the control system, and the controller output is actively adjusted in real time according to the observed form and characteristics of the disturbance.
The invention addresses the slow convergence and large early deviation of path tracking, as well as the case observed in training where the heading stays correct while the position deviation remains large. When the AUV is outside the boundary range no extra reward is given, which increases the sensitivity of the neural network to position deviation and improves the tracking effect.
The Actor neural network learning rate LR_A determines how much is learned from experience in one update of the Actor network parameters: the larger LR_A, the more is learned in each round of learning, and vice versa. The Critic neural network learning rate LR_C plays the same role for the Critic network parameters: the larger LR_C, the more is learned in each round of learning, and vice versa. The reward discount coefficient reduces the influence that the returns of later states in the Markov decision process have on the evaluation of the current state: the smaller the coefficient, the less the returns of later states count when the current state is evaluated, and vice versa. The parameter update discount coefficient determines the weight of the new parameters when the new network parameters update the old ones: the larger the coefficient, the larger the weight of the new parameters and the greater the change in the parameters, and vice versa. The efficiency and effect of deep reinforcement learning depend closely on these four parameters, and the best setting has to be obtained through theoretical analysis and practice.
The invention designs the AUV approach angle by the line-of-sight method, and designs a virtual AUV and the control law of its position by the virtual guidance method. An AUV three-dimensional motion error model is established, and the virtual AUV continuously guides the AUV by adjusting its own speed, so that the tracking error is continuous and the stability of path tracking is improved.
A design method for the deep reinforcement learning path tracking controller is explored, and the most suitable deep reinforcement learning algorithm is identified. The learning and simulation environments are built in Python, and a training loop is designed in which exploration parameters are added to the DDPG algorithm to strengthen its exploration. The current state of the AUV is the input, the actions of the AUV motion actuators are the output, and a deep neural network is built as the core of the controller. AUV three-dimensional path tracking based on the DDPG algorithm is realized in simulation, and stable three-dimensional path tracking is largely achieved.
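The text does not say which exploration parameters are added to DDPG. One common choice, shown here purely as an assumed illustration, is Gaussian action noise whose standard deviation decays over training; `explore`, `decay` and all numeric defaults are hypothetical.

```python
import random

def explore(action, noise_std, low=-1.0, high=1.0, rng=random):
    """Add zero-mean Gaussian noise to each actuator command, then clip."""
    noisy = [a + rng.gauss(0.0, noise_std) for a in action]
    return [min(high, max(low, a)) for a in noisy]

def decay(noise_std, rate=0.9995, floor=0.01):
    """Shrink the exploration noise after each step, never below the floor."""
    return max(floor, noise_std * rate)
```

In a training loop under these assumptions, `explore` would wrap the Actor output before it is sent to the simulated rudders and propeller, and `decay` would be called once per step so that early episodes explore widely and later ones exploit the learned policy.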
An integral term is added to the line-of-sight guidance law to introduce the influence of time, and a boundary reward function is added to accelerate the convergence of path tracking, reduce overshoot and improve precision. A second-order Gaussian function of the rudder angle and the rudder angle change rate is added to the reward function of the reinforcement learning algorithm, which suppresses frequent back-and-forth changes of the rudder angle. An ocean current disturbance observer is added to the control loop so that the current is observed in real time, its disturbance of the controller is suppressed and the periodic error is reduced.
Drawings
FIG. 1 is a flow chart of a three-dimensional path visualization tracking method of an underwater robot;
FIG. 2 is a schematic view of a coordinate system;
FIG. 3 is a low pass filtering block diagram;
FIG. 4 is a three-dimensional path tracking trajectory diagram;
FIG. 5 is a graph of position error curves;
FIG. 6 is a schematic diagram of path tracking training;
FIG. 7 is a deep learning three-dimensional path tracking trajectory diagram;
FIG. 8 is a deep learning three-dimensional path tracking position error map;
FIG. 9 is a graph of deep reinforcement learning round rewards.
Detailed Description
The present invention will be described in detail with reference to specific examples.
The first embodiment is as follows:
according to fig. 1, the application provides a method for visually tracking a three-dimensional path of an underwater robot, which comprises the following steps:
step 1: establishing a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, and establishing a six-degree-of-freedom model of the underwater robot according to the coordinate system;
the step 1 specifically comprises the following steps:
as shown in fig. 2, step 1.1: establishing a geodetic coordinate system {I}, wherein the origin of the geodetic coordinate system is a point on the sea surface, the positive direction of the ξ axis is the same as the main course of the AUV, the ζ axis points toward the center of the earth, and the ξ axis, the η axis and the ζ axis form a right-handed coordinate system;
establishing a carrier coordinate system {B}, wherein the origin of the carrier coordinate system is the center of mass of the AUV (autonomous underwater vehicle), the x_B axis is aligned with the heading of the AUV, the y_B axis points to the starboard side of the AUV, and the x_B, y_B and z_B axes form a right-handed coordinate system;
establishing a curve coordinate system {S-F}, wherein the origin of the curve coordinate system is a point P on the desired path, the x_SF axis lies along the tangent of the desired path, the y_SF axis lies along the normal, and the x_SF, y_SF and z_SF axes form a right-handed coordinate system;
step 1.2: establishing a six-degree-of-freedom model of the underwater robot according to the geodetic coordinate system, the carrier coordinate system and the curve coordinate system, the model comprising a dynamic equation and a kinematic equation, wherein the dynamic equation is expressed by the following formula:
the kinematic equation is expressed by the following formula:
wherein m is the mass of the underwater robot; I_y and I_z are the moments of inertia about the y-axis and the z-axis; u, v and w are the longitudinal, transverse and vertical velocities; q and r are the pitch and yaw angular velocities; θ and ψ are the pitch and yaw angles; X_(·), Y_(·), Z_(·), M_(·) and N_(·) are hydrodynamic coefficients; z_g and z_b are the positions of the center of gravity and the center of buoyancy; X is the longitudinal thrust; M and N are the torques about the y-axis and the z-axis generated by the combined action of the propeller and the rudders; ψ_B is the course angle of the underwater robot; θ_B is the submergence angle of the underwater robot; α is the attack angle; β is the drift angle; and v_t is the resultant velocity of the underwater robot.
Step 2: according to the established six-degree-of-freedom model of the underwater robot, a path tracking error model is established, and course angle deviation and submergence angle deviation are determined;
the step 2 specifically comprises the following steps:
step 2.1: defining a virtual underwater robot AUV on a tracking path according to the established six-degree-of-freedom model of the underwater robot, and expressing a virtual underwater robot AUV kinematic equation by the following formula:
wherein ψ_p and θ_p are the attitude angles of the virtual target, and V_p is the resultant velocity of the virtual robot;
step 2.2: converting the position errors of the real underwater robot AUV and the virtual underwater robot AUV in the inertial coordinate system into a curve coordinate system, and expressing the conversion process by the following formula:
differentiating the transformed position errors to obtain the error dynamics equation, and expressing the error dynamics equation by the following formula:
step 2.3: neglecting errors caused by non-linearity in a three-dimensional space, determining course angle deviation and submergence angle deviation, and expressing the course angle deviation and the submergence angle deviation by the following formula:
wherein the two quantities defined are the course angle deviation and the submergence angle deviation, respectively.
Step 3: performing three-dimensional path tracking on the established six-degree-of-freedom model of the underwater robot by adopting a backstepping sliding mode control method;
as shown in fig. 3, the step 3 specifically includes:
step 3.1: adopting a backstepping sliding mode control method, introducing a horizontal plane approach angle and a vertical plane approach angle based on a Lyapunov function to adjust the path tracking process of the underwater robot, and expressing the horizontal plane approach angle, a function of the error y_e, by the following formula:
the vertical plane approach angle χ(z_e) is expressed by the following formula:
wherein Δ_j is the horizontal look-ahead distance and Δ_k is the vertical look-ahead distance;
step 3.2: determining a tracking error according to the horizontal plane approach angle and the vertical plane approach angle, and tracking the error according to the following formula:
ψ=ψe-(ye)
θ=θe-χ(ze)
wherein the left-hand sides are the horizontal plane tracking error and the vertical plane tracking error, respectively;
testing the path tracking effect of the backstepping sliding mode control method on a three-dimensional spiral path, wherein the three-dimensional spiral is described by a parametric equation expressed by the following formula:
wherein s is the path parameter, the initial value of the target position is s(0) = 0, the initial position of the AUV is ξ(0) = 65, η(0) = 500, ζ(0) = 50, the initial heading angle is ψ(0) = 0, the initial pitch angle is θ(0) = 0, the initial speed is 0.1 m/s, and the initial angular speeds are 0 and 1 m/s; a steady water flow is added to test the resistance to current interference, with a flow speed of 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction, thereby completing the three-dimensional path tracking.
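The parametric equation of the spiral is not reproduced in this text. For orientation, the sketch below parameterizes a generic helix and derives the tangent angles a virtual target would need. The radius, angular rate and descent rate are hypothetical values, and the pitch sign assumes that the ζ axis points downward so that descending corresponds to a negative θ_p.

```python
import math

def helix_point(s, radius=100.0, omega=0.01, descent=0.1):
    """Position on a hypothetical helix for path parameter s."""
    return (radius * math.cos(omega * s),
            radius * math.sin(omega * s),
            descent * s)

def helix_derivative(s, radius=100.0, omega=0.01, descent=0.1):
    """Derivative of the helix position with respect to s."""
    return (-radius * omega * math.sin(omega * s),
            radius * omega * math.cos(omega * s),
            descent)

def helix_tangent_angles(s, **kw):
    """Heading psi_p and pitch theta_p of the path tangent (zeta down)."""
    d_xi, d_eta, d_zeta = helix_derivative(s, **kw)
    psi_p = math.atan2(d_eta, d_xi)
    theta_p = -math.atan2(d_zeta, math.hypot(d_xi, d_eta))
    return psi_p, theta_p
```

With these angles, a virtual target moving at speed V_p along the tangent has inertial velocity components V_p cos θ_p cos ψ_p, V_p cos θ_p sin ψ_p and -V_p sin θ_p, which is the usual kinematic form for a point moving along a three-dimensional path.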
Step 4: training the three-dimensional path tracking by adopting a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot.
The step 4 specifically comprises the following steps:
step 4.1: when the LOS method is adopted to calculate the deviation, an integral term is added to eliminate the periodic error; the integral term brings time into the control loop, and the deviation after the integral term is added is expressed by the following formula:
wherein k_ψ and k_θ are the control gains;
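The formula with the integral term is not reproduced in this text. A minimal sketch of the mechanism, assuming the corrected deviation is the raw deviation plus a gain times its running time integral, could look as follows; the class name, the rectangular integration and the example gain are assumptions.

```python
class IntegralDeviation:
    """Deviation augmented with gain * integral of the deviation (assumed form)."""

    def __init__(self, gain):
        self.gain = gain        # plays the role of k_psi or k_theta
        self.integral = 0.0

    def update(self, deviation, dt):
        self.integral += deviation * dt   # accumulate past error over time
        return deviation + self.gain * self.integral

# One instance per plane, e.g. heading = IntegralDeviation(gain=0.5)
```

A constant residual deviation makes the integral grow steadily, so the corrected deviation keeps increasing until the controller removes the offset, which is how the integral term suppresses static error.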
Because of the continuous time integration, the current error reflects the gradual accumulation of past errors, so the tracking is adjusted dynamically and static error is suppressed. To adapt to a complex ocean current environment, further reduce the influence of water flow interference and strengthen the robustness and anti-interference capability of the controller, a disturbance observer is added to the control system, and the controller output is actively adjusted in real time according to the observed form and characteristics of the disturbance.
The basic idea of the disturbance observer is to correct the estimated value by the difference between the estimated output and the actual output, and the disturbance observer is expressed by the following formula:
step 4.2: adding penalty terms to the reward function, the penalty terms covering the rudder angles of the vertical rudder and the horizontal rudder and the rudder angle change rates, wherein the penalty terms are set by a modified second-order Gaussian function, and the improved reward function is expressed by the following formula:
Path tracking converges slowly and shows a large early deviation, and in training the heading can stay correct while the position deviation remains large.
Therefore, a boundary reward is added to the reward function: when the AUV is within a set boundary range, 1 is added to the reward value of that step as an extra reward, and when the AUV is outside the boundary range no extra reward is given for that step. This increases the sensitivity of the neural network to position deviation and improves the tracking effect. The boundary reward function is expressed by the following formula:
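The reward formulas are not reproduced in this text. The sketch below shows one way the two additions could fit together: a Gaussian-shaped penalty on rudder angles and their change rates, and a boundary bonus of 1. The penalty shape, the widths `sigma_angle` and `sigma_rate`, the boundary radius and the function names are all assumptions.

```python
import math

def gaussian_penalty(value, sigma):
    """0 at value == 0, approaching -1 as |value| grows (assumed shape)."""
    return math.exp(-(value ** 2) / (2.0 * sigma ** 2)) - 1.0

def boundary_reward(distance, boundary):
    """Extra reward of 1 inside the boundary range, nothing outside."""
    return 1.0 if distance <= boundary else 0.0

def step_reward(base, rudder_v, rudder_h, rate_v, rate_h, distance,
                sigma_angle=0.3, sigma_rate=0.1, boundary=5.0):
    """Base tracking reward plus rudder penalties plus boundary bonus."""
    penalty = (gaussian_penalty(rudder_v, sigma_angle)
               + gaussian_penalty(rudder_h, sigma_angle)
               + gaussian_penalty(rate_v, sigma_rate)
               + gaussian_penalty(rate_h, sigma_rate))
    return base + penalty + boundary_reward(distance, boundary)
```

Because the penalty is bounded, a burst of rudder activity cannot swamp the tracking reward, while the step-shaped boundary bonus gives the network a clear signal about staying close to the path.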
step 4.3: performing parameter optimization over the Actor neural network learning rate LR_A, the Critic neural network learning rate LR_C, the reward discount coefficient γ and the parameter update discount coefficient τ. The Actor neural network learning rate LR_A determines how much is learned from experience in one update of the Actor network parameters: the larger LR_A, the more is learned in each round of learning, and vice versa. The Critic neural network learning rate LR_C plays the same role for the Critic network parameters: the larger LR_C, the more is learned in each round of learning, and vice versa. The reward discount coefficient reduces the influence that the returns of later states in the Markov decision process have on the evaluation of the current state: the smaller the coefficient, the less the returns of later states count when the current state is evaluated, and vice versa. The parameter update discount coefficient determines the weight of the new parameters when the new network parameters update the old ones: the larger the coefficient, the larger the weight of the new parameters and the greater the change in the parameters, and vice versa. The efficiency and effect of deep reinforcement learning depend closely on these four parameters, and the best setting has to be obtained through theoretical analysis and practice. Therefore, different parameter values are tried over multiple tuning runs, and the final parameters are LR_A = 0.001, LR_C = 0.003, γ = 0.95 and τ = 0.05;
after the reward function is improved and the ocean current disturbance observer is added, the new path tracking controller is trained, completing the three-dimensional path visual tracking of the underwater robot.
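To make the roles of γ and τ concrete, the following minimal sketch uses the selected values in the one-step critic target and the target-network soft update as they are commonly written for DDPG. The text does not give these update equations, so their exact form, the function names and the flat-list representation of network parameters are assumptions.

```python
# Values selected in step 4.3.
LR_A = 0.001   # Actor learning rate
LR_C = 0.003   # Critic learning rate
GAMMA = 0.95   # reward discount factor
TAU = 0.05     # parameter (soft) update coefficient


def td_target(reward, next_q, done=False, gamma=GAMMA):
    """One-step critic target: r + gamma * Q'(s', a') (assumed standard form)."""
    # A terminal step has no later state, so only the immediate reward counts.
    return reward if done else reward + gamma * next_q


def soft_update(target_params, online_params, tau=TAU):
    """Blend online parameters into the target parameters with weight tau."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]
```

With τ = 0.05 each update moves the target parameters only 5% of the way toward the online parameters, which is why a larger τ means a larger parameter change per update.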
In the simulation, a result with fast convergence and good tracking performance was finally obtained after extensive parameter tuning and a number of tests. The simulation result is shown in fig. 4, in which the dotted line represents the target path and the solid line represents the tracking path under the control of the backstepping sliding mode technique.
Fig. 5 shows the position error of the AUV in the three directions during tracking. The tracking error curves clearly vary periodically and oscillate around 0 because of the action of the ocean current.
The initial value of the target position is S(0) = 0. The initial position of the AUV is ξ(0) = 650, η(0) = 500, ζ(0) = 50, the initial heading angle is ψ(0) = 0, and the initial pitch angle is θ(0) = 0. The initial speed is 0.1 m/s and the desired forward speed is ud6/s. An interfering water flow is added to the environment; in the inertial coordinate system its speed is 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction. The path tracking training scenario is shown in fig. 6.
As is clear from fig. 7 and 8, the AUV achieves path tracking overall. Over 10,000 iterations, the average distance between the AUV position and the target position is 5.5 m. In the initial phase, however, there is a large overshoot because the neural network controller is not sensitive enough. The maximum deviations along two of the coordinate directions reach 36 m and 30 m respectively, and because of the water flow the AUV oscillates on both sides of the target path and a static error remains.
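Figures of this kind (an average distance to the target and a maximum deviation per direction) can be computed from logged trajectories as in the sketch below. The function name and the data layout, two equally long lists of (ξ, η, ζ) tuples, are assumptions made for illustration and do not reproduce the evaluation code behind the reported values.

```python
import math


def tracking_metrics(auv_path, target_path):
    """Return (mean distance, per-axis maximum absolute deviation).

    Both arguments are equally long sequences of (xi, eta, zeta) positions
    sampled at the same time steps (assumed data layout).
    """
    pairs = list(zip(auv_path, target_path))
    # Average straight-line distance between the AUV and its target point.
    mean_distance = sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    # Largest absolute deviation along each of the three axes.
    max_deviation = tuple(
        max(abs(p[axis] - q[axis]) for p, q in pairs) for axis in range(3)
    )
    return mean_distance, max_deviation
```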
Because the DDPG reinforcement learning algorithm incorporates the policy gradient idea, the neural network can learn from both successful and failed experiences. As can be seen from the reward value curve in fig. 9, the reward value remains good most of the time after learning starts, but a lower reward value is observed for a long period between steps 600 and 700. This shows that the neural network controller has learned much from successful experiences but not enough from failed ones, which is likely to cause the controller to fall into a local optimum and to lack exploration. A good neural network controller should be able to learn from both success and failure, so that it can succeed and avoid failure.
The above is only a preferred embodiment of the three-dimensional path visual tracking method for an underwater robot. The scope of protection of the method is not limited to the above embodiment, and all technical solutions within this idea fall within the scope of protection of the present invention. It should be noted that modifications and variations that do not depart from the gist of the invention will occur to those skilled in the art and are intended to be within the scope of the invention.
Claims (5)
1. A three-dimensional path visual tracking method for an underwater robot, characterized by comprising the following steps:
step 1: establishing a geodetic coordinate system, a carrier coordinate system and a curve coordinate system, and establishing a six-degree-of-freedom model of the underwater robot according to the coordinate system;
step 2: according to the established six-degree-of-freedom model of the underwater robot, a path tracking error model is established, and course angle deviation and submergence angle deviation are determined;
step 3: performing three-dimensional path tracking on the established six-degree-of-freedom model of the underwater robot by adopting a backstepping sliding mode control method;
step 4: training the three-dimensional path tracking by adopting a deep reinforcement learning method to complete the visual tracking of the three-dimensional path of the underwater robot.
2. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 1 specifically comprises the following steps:
step 1.1: establishing a geodetic coordinate system, wherein the origin of the geodetic coordinate system is a point on the sea level, the positive direction of the ξ axis is the same as the main course of the underwater robot AUV, the ζ axis points to the center of the earth, and the ξ axis, the η axis and the ζ axis form a right-hand coordinate system;
establishing a carrier coordinate system, wherein the origin of the carrier coordinate system is the center of mass of the underwater robot AUV, the $x_B$ axis is fixed along the heading of the AUV, the $y_B$ axis is fixed toward the AUV starboard, and the $x_B$ axis, the $y_B$ axis and the $z_B$ axis form a right-hand coordinate system;
establishing a curve coordinate system, wherein the origin of the curve coordinate system is a point P on the desired path, the $x_{SF}$ axis is along the tangential direction of the desired path, the $y_{SF}$ axis is along the normal direction, and the $x_{SF}$ axis, the $y_{SF}$ axis and the $z_{SF}$ axis form a right-hand coordinate system;
step 1.2: according to the geodetic coordinate system, the carrier coordinate system and the curve coordinate system, establishing a six-degree-of-freedom model of the underwater robot, wherein the six-degree-of-freedom model comprises a dynamic equation and a kinematic equation, and the dynamic equation is expressed by the following formula:
the kinematic equation is represented by:
wherein m is the mass of the underwater robot, $I_y$ is the moment of inertia about the y-axis, $I_z$ is the moment of inertia about the z-axis, u, v and w are the longitudinal, transverse and vertical velocities respectively, q and r are the pitch and yaw angular velocities, θ and ψ are the pitch angle and the heading angle, $X_{(\cdot)}$, $Y_{(\cdot)}$, $Z_{(\cdot)}$, $M_{(\cdot)}$ and $N_{(\cdot)}$ are hydrodynamic coefficients, $z_g$ and $z_b$ are the positions of the center of gravity and the center of buoyancy, X is the longitudinal thrust, M and N are the torques about the y-axis and the z-axis generated by the combined action of the propeller and the rudder, $\psi_B$ is the heading angle of the underwater robot, $\theta_B$ is the submergence angle of the underwater robot, α is the attack angle, β is the drift angle, and $v_t$ is the resultant velocity of the underwater robot.
3. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 2 specifically comprises the following steps:
step 2.1: defining a virtual underwater robot AUV on a tracking path according to the established six-degree-of-freedom model of the underwater robot, and expressing a virtual underwater robot AUV kinematic equation by the following formula:
wherein $\psi_p$ and $\theta_p$ are the attitude angles of the virtual target, and $V_p$ is the resultant velocity of the virtual robot;
step 2.2: converting the position errors of the real underwater robot AUV and the virtual underwater robot AUV in the inertial coordinate system into a curve coordinate system, and expressing the conversion process by the following formula:
differentiating the converted position errors to obtain an error dynamic equation, and expressing the error dynamic equation by the following formula:
step 2.3: neglecting errors caused by non-linearity in a three-dimensional space, determining course angle deviation and submergence angle deviation, and expressing the course angle deviation and the submergence angle deviation by the following formula:
4. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 3 specifically comprises the following steps:
step 3.1: adopting a backstepping sliding mode control method, and introducing a horizontal plane approach angle and a vertical plane approach angle based on a Lyapunov function to adjust the path tracking process of the underwater robot, the horizontal plane approach angle, a function of $y_e$, being expressed by the following formula:
the vertical plane approach angle $\chi(z_e)$ is expressed by the following formula:
wherein $\Delta_j$ is the horizontal look-ahead distance and $\Delta_k$ is the vertical look-ahead distance;
step 3.2: determining the tracking errors from the horizontal plane approach angle and the vertical plane approach angle according to the following formulas, where the term $(y_e)$ stands for the horizontal plane approach angle evaluated at $y_e$:
$\psi = \psi_e - (y_e)$
$\theta = \theta_e - \chi(z_e)$
wherein $\psi$ is the horizontal plane tracking error and $\theta$ is the vertical plane tracking error;
adopting a three-dimensional spiral path to test the path tracking effect of the backstepping sliding mode control method, the three-dimensional spiral being described by a parametric equation represented by the following formula:
wherein S is the path parameter; the initial value of the target position is S(0) = 0, the initial position of the AUV is ξ(0) = 65, η(0) = 500, ζ(0) = 50, the initial heading angle is ψ(0) = 0, the initial pitch angle is θ(0) = 0, the initial speed is 0.1 m/s, with initial angular speeds of 0 and 1 m/s; a steady water flow is added to test the resistance to current disturbance, the speed of the water flow being 0.3 m/s in the ξ direction, 0.3 m/s in the η direction and 0.15 m/s in the ζ direction, thereby completing the three-dimensional path tracking.
5. The underwater robot three-dimensional path visual tracking method as claimed in claim 1, wherein: the step 4 specifically comprises the following steps:
step 4.1: when the line-of-sight (LOS) method is used to calculate the deviation, an integral term is added to eliminate the periodic error; the integral term introduces time, so that time is effectively taken into account in the control loop, and the deviation after the integral term is added is represented by the following formula:
wherein $k_\psi$ and $k_\theta$ are control gains;
because of the continuous integration over time, past errors gradually accumulate and influence the current error, so that tracking is adjusted dynamically and static error is suppressed; to adapt to a complex ocean current environment, further reduce the influence of water flow disturbance and enhance the robustness and anti-disturbance capability of the controller, a disturbance observer is added to the control system, and the output of the controller is actively adjusted in real time according to the observed form and characteristics of the disturbance;
the basic idea of the disturbance observer is to correct the estimated value using the difference between the estimated output and the actual output, the disturbance observer being expressed by the following formula:
step 4.2: adding penalty terms to the reward function, the penalty terms covering the rudder angles of the vertical rudder and the horizontal rudder and the rudder angle change rates; the penalty terms are set using a modified second-order Gaussian function, and the improved reward function is represented by the following formula:
determining a boundary reward function, the boundary reward function being represented by:
step 4.3: performing parameter optimization of the Actor neural network learning rate LR_A, the Critic neural network learning rate LR_C, the reward discount factor γ and the parameter update coefficient τ; different parameter values are tried over multiple tuning runs, and the final parameters are selected as LR_A = 0.001, LR_C = 0.003, γ = 0.95 and τ = 0.05;
after the reward function is improved and the ocean current disturbance observer is added, the new path tracking controller is trained, completing the three-dimensional path visual tracking of the underwater robot.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010703073.2A CN111930141A (en) | 2020-07-21 | 2020-07-21 | Three-dimensional path visual tracking method for underwater robot |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010703073.2A CN111930141A (en) | 2020-07-21 | 2020-07-21 | Three-dimensional path visual tracking method for underwater robot |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111930141A true CN111930141A (en) | 2020-11-13 |
Family
ID=73313711
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010703073.2A Pending CN111930141A (en) | 2020-07-21 | 2020-07-21 | Three-dimensional path visual tracking method for underwater robot |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111930141A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112880663A (en) * | 2021-01-19 | 2021-06-01 | 西北工业大学 | AUV reinforcement learning path planning method considering accumulated errors |
CN113247220A (en) * | 2021-06-28 | 2021-08-13 | 深之蓝海洋科技股份有限公司 | Method for automatically scanning and detecting tunnel by underwater robot and electronic equipment |
CN113268068A (en) * | 2021-05-31 | 2021-08-17 | 自然资源部第二海洋研究所 | Hybrid intelligent autonomous detection method for deep sea area based on bionic submersible vehicle |
CN114051273A (en) * | 2021-11-08 | 2022-02-15 | 南京大学 | Large-scale network dynamic self-adaptive path planning method based on deep learning |
CN116300982A (en) * | 2023-03-03 | 2023-06-23 | 新兴际华(北京)智能装备技术研究院有限公司 | Underwater vehicle and path tracking control method and device thereof |
CN116414152A (en) * | 2023-06-12 | 2023-07-11 | 中国空气动力研究与发展中心空天技术研究所 | Reentry vehicle transverse and lateral rapid maneuver control method, system, terminal and medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102768539A (en) * | 2012-06-26 | 2012-11-07 | 哈尔滨工程大学 | AUV (autonomous underwater vehicle) three-dimension curve path tracking control method based on iteration |
CN106292287A (en) * | 2016-09-20 | 2017-01-04 | 哈尔滨工程大学 | A kind of UUV path following method based on adaptive sliding-mode observer |
CN106444838A (en) * | 2016-10-25 | 2017-02-22 | 西安兰海动力科技有限公司 | Precise path tracking control method for autonomous underwater vehicle |
CN108490961A (en) * | 2018-03-23 | 2018-09-04 | 哈尔滨工程大学 | A kind of more AUV dynamics circular arc formation control methods |
US20190012924A1 (en) * | 2017-05-02 | 2019-01-10 | Here Global B.V. | Method and apparatus for privacy-sensitive routing of an aerial drone |
CN109189071A (en) * | 2018-09-25 | 2019-01-11 | 大连海事大学 | Robust adaptive unmanned boat path tracking control method based on Fuzzy Observer |
CN110109363A (en) * | 2019-05-28 | 2019-08-09 | 重庆理工大学 | A kind of Neural Network Adaptive Control method that wheeled mobile robot is formed into columns |
US20190378423A1 (en) * | 2018-06-12 | 2019-12-12 | Skydio, Inc. | User interaction with an autonomous unmanned aerial vehicle |
-
2020
- 2020-07-21 CN CN202010703073.2A patent/CN111930141A/en active Pending
Non-Patent Citations (13)
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112880663A (en) * | 2021-01-19 | 2021-06-01 | 西北工业大学 | AUV reinforcement learning path planning method considering accumulated errors |
CN112880663B (en) * | 2021-01-19 | 2022-07-26 | 西北工业大学 | AUV reinforcement learning path planning method considering accumulated error |
CN113268068A (en) * | 2021-05-31 | 2021-08-17 | 自然资源部第二海洋研究所 | Hybrid intelligent autonomous detection method for deep sea area based on bionic submersible vehicle |
CN113268068B (en) * | 2021-05-31 | 2022-06-28 | 自然资源部第二海洋研究所 | Bionic submersible vehicle-based mixed intelligent autonomous detection method for deep sea area |
CN113247220A (en) * | 2021-06-28 | 2021-08-13 | 深之蓝海洋科技股份有限公司 | Method for automatically scanning and detecting tunnel by underwater robot and electronic equipment |
CN114051273A (en) * | 2021-11-08 | 2022-02-15 | 南京大学 | Large-scale network dynamic self-adaptive path planning method based on deep learning |
CN114051273B (en) * | 2021-11-08 | 2023-10-13 | 南京大学 | Large-scale network dynamic self-adaptive path planning method based on deep learning |
CN116300982A (en) * | 2023-03-03 | 2023-06-23 | 新兴际华(北京)智能装备技术研究院有限公司 | Underwater vehicle and path tracking control method and device thereof |
CN116414152A (en) * | 2023-06-12 | 2023-07-11 | 中国空气动力研究与发展中心空天技术研究所 | Reentry vehicle transverse and lateral rapid maneuver control method, system, terminal and medium |
CN116414152B (en) * | 2023-06-12 | 2023-08-15 | 中国空气动力研究与发展中心空天技术研究所 | Reentry vehicle transverse and lateral rapid maneuver control method, system, terminal and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111930141A (en) | Three-dimensional path visual tracking method for underwater robot | |
CN109540151B (en) | AUV three-dimensional path planning method based on reinforcement learning | |
JP6854549B2 (en) | AUV action planning and motion control methods based on reinforcement learning | |
CN112241176B (en) | Path planning and obstacle avoidance control method of underwater autonomous vehicle in large-scale continuous obstacle environment | |
Sun et al. | Mapless motion planning system for an autonomous underwater vehicle using policy gradient-based deep reinforcement learning | |
CN105807789B (en) | UUV control methods based on the compensation of T-S Fuzzy Observers | |
CN113534668B (en) | Maximum entropy based AUV (autonomous Underwater vehicle) motion planning method for actor-critic framework | |
CN113885534B (en) | Intelligent predictive control-based water surface unmanned ship path tracking method | |
CN114115262B (en) | Multi-AUV actuator saturation cooperative formation control system and method based on azimuth information | |
CN111240345A (en) | Underwater robot trajectory tracking method based on double BP network reinforcement learning framework | |
CN109784201A (en) | AUV dynamic obstacle avoidance method based on four-dimensional risk assessment | |
CN113848974B (en) | Aircraft trajectory planning method and system based on deep reinforcement learning | |
CN111123923A (en) | Unmanned ship local path dynamic optimization method | |
CN106840143A (en) | A kind of method for differentiating underwater robot attitude stabilization | |
CN115657683B (en) | Unmanned cable-free submersible real-time obstacle avoidance method capable of being used for inspection operation task | |
CN113741433B (en) | Distributed formation method of unmanned ship on water surface | |
CN115480580A (en) | NMPC-based underwater robot path tracking and obstacle avoidance control method | |
CN115657713A (en) | Launching decision control method considering launching platform sinking and floating and shaking conditions | |
Li et al. | Energy Efficient Space-Air-Ground-Ocean Integrated Network based on Intelligent Autonomous Underwater Glider | |
Tanaka et al. | Underwater vehicle localization considering the effects of its oscillation | |
Zhai et al. | Path planning algorithms for USVs via deep reinforcement learning | |
Emrani et al. | An adaptive leader-follower formation controller for multiple AUVs in spatial motions | |
CN114943168B (en) | Method and system for combining floating bridges on water | |
CN117168468B (en) | Multi-unmanned-ship deep reinforcement learning collaborative navigation method based on near-end strategy optimization | |
CN116909150A (en) | AUV intelligent control system based on PPO algorithm, control method and application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20201113 |