CN113671994A - Multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning - Google Patents

Multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning

Info

Publication number
CN113671994A
CN113671994A (application CN202111020276.2A)
Authority
CN
China
Prior art keywords: unmanned aerial vehicle, unmanned, abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111020276.2A
Other languages
Chinese (zh)
Other versions
CN113671994B (en)
Inventor
陈刚
乔永龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202111020276.2A priority Critical patent/CN113671994B/en
Publication of CN113671994A publication Critical patent/CN113671994A/en
Application granted granted Critical
Publication of CN113671994B publication Critical patent/CN113671994B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/10Simultaneous control of position or course in three dimensions
    • G05D1/101Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • G05D1/104Simultaneous control of position or course in three dimensions specially adapted for aircraft involving a plurality of aircrafts, e.g. formation flying

Abstract

The invention relates to a multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning, and belongs to the field of robot control. The system comprises a plurality of unmanned aerial vehicles and a plurality of unmanned ships. During inspection, the depth camera carried by an unmanned aerial vehicle perceives an abnormal point; the unmanned aerial vehicle closest to the abnormal point serves as pilot and fuses the acquired data, then sends the fused position information to the unmanned aerial vehicles that have not perceived the abnormal point, which perform event-triggered finite-time formation control. As the laser radars carried by some of the unmanned ships perceive the abnormal point, the pilot fuses the laser radar and depth camera information again, driving all the unmanned ships to cooperatively approach the abnormal point. The abnormal point is then handled by the actuating mechanisms configured on the unmanned aerial vehicles or unmanned ships. The invention adopts a combination of heterogeneous multi-agents and applies deep learning, reinforcement learning and information fusion technologies to the inspection operation, directly realizing unmanned monitoring of the water area and handling of abnormal conditions.

Description

Multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning
Technical Field
The invention belongs to the field of robot control, and relates to a multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning.
Background
For water area inspection, the existing methods mostly rely on walking or riding a boat, which is labor-intensive and inefficient, is easily limited by conditions such as river and lake terrain, and suffers from inspection blind spots and incomplete problem discovery; manually operating a boat in open water is also dangerous. A more advanced mode of water area inspection uses unmanned aerial vehicles: an operator judges abnormalities from the images acquired by the drone via remote control and wireless image transmission, and then takes corresponding measures. Although an unmanned aerial vehicle is highly flexible and mobile and can quickly discover actual problems, the actuators it can carry are limited by its structure and endurance, so its own capability to actually resolve a problem is a bottleneck.
In view of the problems faced by both manual work and unmanned aerial vehicle work, the respective advantages of unmanned ships and unmanned aerial vehicles are combined. Building a heterogeneous multi-agent system overcomes the insufficient execution capability of unmanned aerial vehicles and the low efficiency of manual working modes. Moreover, an unmanned ship can serve as a relay station for the unmanned aerial vehicles, replacing batteries or providing a temporary parking place. When an actual problem is encountered, the multiple agents work cooperatively, realizing truly unmanned operation.
Disclosure of Invention
In view of this, the present invention provides an inspection control system for multiple unmanned aerial vehicles and multiple unmanned ships based on reinforcement learning.
In order to achieve the purpose, the invention provides the following technical scheme:
a multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning comprises a plurality of unmanned aerial vehicles and a plurality of unmanned ships;
the plurality of unmanned aerial vehicles are provided with a D435i vision mechanism and a Jetson Xavier NX computing platform;
the plurality of unmanned ships are provided with an RPLIDAR-A3 laser radar and a TX2 computing platform;
the unmanned aerial vehicles and the unmanned ships are provided with inertial measurement units (IMU) and GPS;
the unmanned aerial vehicles and the unmanned ships are also provided with power batteries;
the D435i vision mechanism is in signal connection with the Jetson Xavier NX computing platform;
the RPLIDAR-A3 laser radar is in signal connection with the TX2 computing platform;
the inertial measurement unit IMU and the GPS are in signal connection with the Jetson Xavier NX computing platform and the TX2 computing platform, respectively;
the unmanned aerial vehicles and the unmanned ships are also provided with actuating mechanisms;
when the unmanned aerial vehicles find an abnormality, the unmanned aerial vehicle closest to the abnormal point serves as pilot, the position information of the object is perceived through the D435i vision mechanism, position information fusion is carried out at the pilot, and the plurality of unmanned aerial vehicles are guided to approach the abnormal point;
as the unmanned ships move, the abnormal point is perceived with the carried RPLIDAR-A3 laser radar, the data acquired by the unmanned ships are fused at the pilot, and the pilot sends the fused position information to the unmanned ships, so that the unmanned ships are driven to cooperatively approach the abnormal point and the abnormal point is handled with an actuating mechanism configured on an unmanned aerial vehicle or an unmanned ship.
Optionally, the actuating mechanisms include a mechanical arm, a water quality sampling instrument, a backup battery and a megaphone.
Optionally, the plurality of unmanned aerial vehicles and the plurality of unmanned ships carry out water area inspection; when an unmanned aerial vehicle finds an abnormality, the unmanned aerial vehicle closest to the abnormal point serves as pilot, the position information of the object is perceived through the D435i vision mechanism, position information fusion is carried out at the pilot, and guiding the plurality of unmanned aerial vehicles to approach the abnormal point specifically comprises:
the unmanned aerial vehicle closest to the abnormal point serves as pilot and acts as the information processing platform; if multiple unmanned aerial vehicles find the same target, the position information is obtained with a weighted average algorithm;
the pilot guides the remaining unmanned aerial vehicles to approach the abnormal point;
as the unmanned ships approach and perceive the abnormal point, the laser radar data of the several unmanned ships that can recognize the object are sent to the node serving as pilot, and the unmanned aerial vehicle fuses the received laser point cloud data with the vision mechanism data to calculate the final position of the abnormal point;
the unmanned ships are guided to approach the target abnormal point according to the position-fused information of the pilot, after which the subsequent abnormal point processing is carried out.
Optionally, the Jetson Xavier NX computing platform constructs a standard-size water area inspection picture data set containing a labeled training set and test set in a 3:1 ratio; the training data set is fed into the deep convolutional neural network to learn and optimize the internal structure weights;
targeted scoring is performed on the target detection results, which are screened with the non-maximum suppression method: the detection frame with the highest confidence is selected as the first output bounding box, the overlap rate of each other detection frame with it is calculated, and a frame is discarded if its overlap rate exceeds a preset threshold and retained otherwise; the prediction frame with the highest confidence among the remaining frames is then selected and the procedure repeated until no candidate frames remain, the retained frames being the target detection results in the image;
in the output result, each grid corresponds to 3 prior frames, and the prediction information of each prior frame comprises 4 frame position parameters, 1 target evaluation and 5 category predictions; the frame position parameters comprise center coordinates, width and height;
Calculating a loss function, and continuously adjusting model parameters by using a gradient descent method through back propagation to finally obtain an optimal network model;
images in the test set are input, target features are extracted with the trained model, multi-scale prediction results are output, targeted scoring is performed through the classifier, and the detection results are screened with the non-maximum suppression method, finally yielding the object recognition result of the deep convolutional neural network.
Optionally, if multiple unmanned aerial vehicles find the same target, then, exploiting the uniqueness of the target in world coordinates, GPS and D435i are combined and the object positions $(a_i, b_i)$ detected by the individual drones are fused by a weighted average into the final positioning position $(a_t, b_t)$, n being the number of unmanned aerial vehicles that recognized the abnormal point:

$$(a_t, b_t)=\frac{1}{n}\sum_{i=1}^{n}(a_i, b_i)$$

The method further comprises event-triggered finite-time formation approach control of the plurality of unmanned aerial vehicles, for which an unmanned aerial vehicle dynamic model is established:
the unmanned aerial vehicle is a four-rotor aircraft, and the established dynamic model takes the specific form

$$\begin{cases} \ddot{x}=\frac{u_1}{m}(\cos\phi\sin\theta\cos\psi+\sin\phi\sin\psi)\\ \ddot{y}=\frac{u_1}{m}(\cos\phi\sin\theta\sin\psi-\sin\phi\cos\psi)\\ \ddot{z}=\frac{u_1}{m}\cos\phi\cos\theta-g\\ \ddot{\theta}=\frac{l}{I_{yy}}u_2+\frac{I_{zz}-I_{xx}}{I_{yy}}\dot{\phi}\dot{\psi}\\ \ddot{\phi}=\frac{l}{I_{xx}}u_3+\frac{I_{yy}-I_{zz}}{I_{xx}}\dot{\theta}\dot{\psi}\\ \ddot{\psi}=\frac{1}{I_{zz}}u_4+\frac{I_{xx}-I_{yy}}{I_{zz}}\dot{\phi}\dot{\theta} \end{cases}$$

in the formula: x, y, z represent the position of the drone in space; φ, θ, ψ represent the roll, pitch and yaw angles; m represents the mass of the drone; $I_{xx}$, $I_{yy}$, $I_{zz}$ represent the moments of inertia about the x, y, z axes, respectively; l represents the distance between a motor shaft and the center of the airframe; g represents the gravitational acceleration; $u_1$, $u_2$, $u_3$, $u_4$ represent the drone control inputs, defined as

$$u_1=b(\omega_1^2+\omega_2^2+\omega_3^2+\omega_4^2),\quad u_2=b(\omega_1^2-\omega_3^2),\quad u_3=b(\omega_2^2-\omega_4^2),\quad u_4=d(\omega_2^2+\omega_4^2-\omega_1^2-\omega_3^2)$$

wherein: b represents the lift coefficient; d represents the torque coefficient; $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ represent the rotational speeds of rotors 1, 2, 3, 4, respectively; $u_1$ represents the total lift perpendicular to the fuselage; $u_2$ represents the lift difference affecting the pitch motion of the aircraft; $u_3$ represents the lift difference affecting the rolling motion of the aircraft; $u_4$ represents the torque affecting the yaw motion of the aircraft;
since cooperative processing capability matters more than attitude dynamics in water area inspection, the position rather than the attitude of the unmanned aerial vehicle is controlled;
linearizing the model yields the second-order integrator model

$$\dot{p}_i=v_i,\qquad \dot{v}_i=u_i$$

where $p_i=[x_i,y_i,z_i]^T$, $v_i=[v_{xi},v_{yi},v_{zi}]^T$ and $u_i=[u_{xi},u_{yi},u_{zi}]^T$ represent position, velocity and control input, respectively; with $\xi_i=[p_i^T,v_i^T]^T$ the matrix form is

$$\dot{\xi}_i=A\xi_i+Bu_i,\qquad A=\begin{bmatrix}0_3&I_3\\0_3&0_3\end{bmatrix},\quad B=\begin{bmatrix}0_3\\I_3\end{bmatrix}$$

the position information fused at the unmanned aerial vehicle serving as pilot is issued to the unmanned aerial vehicles to be formed, which then approach the abnormal point under an event-triggered finite-time formation control protocol so that the subsequent operation can be executed;
a formation form in space is set as $h=[h_1,h_2,\ldots,h_n]$ with $h_i=[h_{pi}^T,h_{vi}^T]^T$; letting $\tilde{\xi}_i=\xi_i-h_i$, the formation problem is translated into the consistency problem

$$\lim_{t\to\infty}\bigl\|\tilde{\xi}_i(t)-\tilde{\xi}_j(t)\bigr\|=0$$

wherein $i,j\in[1,n]$ and $i\neq j$ denote drone numbers; defining the control input vector accordingly, a new system model is obtained:

$$\dot{\tilde{\xi}}_i=A\tilde{\xi}_i+Bu_i$$

when the states of this system reach consistency, the original system achieves the corresponding formation control;
the consensus error vector is defined over the communication topology with adjacency entries $a_{ij}$ as

$$e_i(t)=\sum_{j=1}^{n}a_{ij}\bigl(\tilde{\xi}_i(t)-\tilde{\xi}_j(t)\bigr)$$

the distributed event-driven finite-time sliding mode controller $u_i(t)$ (its full expression is given as an equation image in the original) is built on an integral sliding mode surface $S_i(t)=[S_{i1}(t),S_{i2}(t),S_{i3}(t)]^T$, a three-dimensional column vector, with $\alpha\in(0,1)$, $\langle *\rangle^{\alpha}=|*|^{\alpha}\cdot\mathrm{sgn}(*)$, and controller parameters $\beta_1,\beta_2,\beta_3,\beta_4>0$;
each drone is assigned an event trigger function $\Delta_i(t)$ (given as an equation image in the original), where $\eta>0$ is a parameter for adjusting the event function: while $\Delta_i(t)<0$ the system operates normally; when $\Delta_i(t)\geq 0$ an event is triggered and the sampling error vector is reset, thereby regulating when the control system is updated; the controller and event-function parameters must satisfy a condition (given as an equation image in the original) involving $\lambda_2$, the second smallest eigenvalue of the Laplacian matrix of the undirected communication topology formed by the plurality of unmanned aerial vehicles.
Optionally, when the D435i vision mechanism and the RPLIDAR-A3 laser radar cooperate to perform 3D target detection and positioning, the deep convolutional neural network detects the two-dimensional region of the abnormal point in the RGB image and classifies the object, and the abnormal points to be processed are determined from the category library. Using the known depth camera projection relation together with the laser radar point cloud data, the 2D bounding box is lifted to a view frustum (with near and far planes set by the depth sensor range) bounding the 3D search space of the abnormal-point object; all points within the frustum form the frustum point cloud.
Nearest-neighbor clustering rests on the surface continuity of a single object, i.e., the reflection points of one object form a continuous point set. Within the formed frustum, the point cloud back-projected from the abnormal-point depth map serves as reference points for segmenting the 3D point cloud data; the 3D point cloud of the abnormal point is thus obtained and the centroid of the cluster is estimated as

$$c_{cluster}=\frac{1}{N}\sum_{k=1}^{N}p_k$$

where $p_k$ are the N segmented points.
A T-Net network estimates the real center of the whole object, and the coordinates are then translated so that the predicted center becomes the origin; the bounding box center is corrected by a residual method. The final object center combines the center residual predicted by the bounding box estimation network, the previous center residual from the T-Net, and the centroid from the nearest-neighbor clustering algorithm:

$$c_{pred}=c_{cluster}+\Delta c_{T\text{-}Net}+\Delta c_{box}$$

For the selected object's points in the 3D point cloud data, a bounding box estimation network predicts the bounding box and outputs the parameters of the three-dimensional bounding box, i.e., the box center $(c_x,c_y,c_z)$, size $(h,w,l)$ and yaw angle θ. The total optimization loss of the two networks is

$$L_{total}=\lambda\bigl(L_{c1\text{-}reg}+L_{c2\text{-}reg}+L_{h\text{-}cls}+L_{h\text{-}reg}+L_{s\text{-}cls}+L_{s\text{-}reg}+\gamma L_{corner}\bigr)$$

wherein: $L_{c1\text{-}reg}$ and $L_{c2\text{-}reg}$ are the center-translation loss of the T-Net and the center loss of the bounding box estimation network, respectively; λ, γ are model parameters; $L_{h\text{-}cls}$ and $L_{h\text{-}reg}$ are the classification and regression losses of the estimated 3D bounding box heading; $L_{s\text{-}cls}$ and $L_{s\text{-}reg}$ are the classification and regression losses of the bounding box size; the Softmax method is used for classification and the smooth-$l_1$ loss for the regression problems. Since the bounding box information is jointly determined by size and angle, the corner loss $L_{corner}$ quantifies this coupling:

$$L_{corner}=\sum_{i=1}^{NS}\sum_{j=1}^{NH}\delta_{ij}\min\Bigl\{\sum_{k=1}^{8}\bigl\|P_k^{ij}-P_k^{*}\bigr\|,\ \sum_{k=1}^{8}\bigl\|P_k^{ij}-P_k^{**}\bigr\|\Bigr\}$$

where NS is the number of bounding box size classes, NH the number of heading classes, $P_k^{ij}$ the corners of the anchor box with size class i and heading class j, $P_k^{*}$ the ground-truth corners, and $P_k^{**}$ the ground-truth corners with the heading flipped by π. From the finally obtained position of the 3D detection frame relative to the camera, the coordinate transformation yields the position of the object in the world coordinate system.
A dynamic model of the unmanned ship is established as well.
The invention has the beneficial effects that: a heterogeneous multi-agent combination is adopted, and deep learning, reinforcement learning and information fusion technologies are applied to the inspection operation, directly realizing unmanned monitoring of the water area and handling of abnormal conditions. This not only reduces the risk to personnel operating in the water area but also greatly improves working efficiency.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic diagram of routing inspection by cooperation of multiple unmanned planes and multiple unmanned ships;
FIG. 2 is a deep convolutional neural network structure;
FIG. 3 is an example of a conventional convolution process;
FIG. 4 is an example of a depth separable convolution process;
fig. 5 illustrates cooperative positioning of multiple drones;
fig. 6 is a schematic diagram of multi-drone collaboration;
FIG. 7 is laser radar and depth camera information fusion 3D target detection;
FIG. 8 is a 3D point cloud segmentation nearest neighbor clustering algorithm;
FIG. 9 is a center residual estimation T-Net network;
FIG. 10 is a bounding box evaluation network PointNet;
FIG. 11 is a reference motion coordinate system and motion variables for the motion of the unmanned ship;
FIG. 12 is a diagram of a heterogeneous multi-agent system distributed observer;
FIG. 13 is an actor neural network;
FIG. 14 is the critic neural network;
FIG. 15 is a graph of reinforcement learning algorithm solving for output synchronization control;
FIG. 16 is a depth separable convolution Conv2D structure;
FIG. 17 shows a standard Conv2D structure;
FIG. 18 is a residual base unit structure;
FIG. 19 is a combined residual block structure;
FIG. 20 is a binocular vision measurement principle;
FIG. 21 illustrates the laser radar triangulation ranging principle;
Fig. 22 is a flow chart of the operation of the multi-unmanned aerial vehicle and multi-unmanned ship system.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are provided for the purpose of illustrating the invention only and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and their descriptions may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
The constructed multi-unmanned aerial vehicle and multi-unmanned ship system is configured as follows:
1. The multiple unmanned aerial vehicles are each configured with a D435i vision mechanism and a Jetson Xavier NX computing platform.
2. The multiple unmanned ships are each configured with an RPLIDAR-A3 laser radar and a TX2 computing platform.
3. The sensors commonly used by unmanned aerial vehicles and unmanned ships, such as GPS (global positioning system) and IMU (inertial measurement unit), are all carried and configured, and can be added or removed according to the specific scene.
4. The unmanned aerial vehicles and unmanned ships carry corresponding actuating mechanisms according to scene needs, such as mechanical arms, water quality sampling instruments, megaphones, power batteries and the like.
II. Working flow of the multiple unmanned aerial vehicles and multiple unmanned ships:
The working flow of the system is as follows: first, a plurality of (or a single) unmanned aerial vehicles and a plurality of unmanned ships carry out water area patrol. Because the visual patrol range of the unmanned aerial vehicles is wider and their speed is higher, an unmanned aerial vehicle discovers the abnormality first, and the unmanned aerial vehicle closest to the abnormal point serves as pilot. From the object position information perceived by the D435i depth camera carried by each unmanned aerial vehicle, position information fusion is carried out at the pilot, and the multiple unmanned aerial vehicles are guided to approach the abnormal point. As the unmanned ships move, multiple (or single) unmanned ships perceive the abnormal point with their carried RPLIDAR-A3 laser radar, the data collected by the unmanned ships are fused at the pilot, and the pilot sends the fused position information to the unmanned ships, thereby driving the unmanned ships to cooperatively approach the abnormal point. Finally, the abnormality is handled with the actuating mechanisms configured on the unmanned aerial vehicles or unmanned ships.
Fig. 1 is a schematic diagram of the inspection carried out cooperatively by the multiple unmanned aerial vehicles and multiple unmanned ships; the process is as follows:
The first step: the unmanned aerial vehicle closest to the abnormal point serves as pilot and acts as the information processing station. If several unmanned aerial vehicles find the same target, a weighted average algorithm is used to obtain more accurate position information.
The second step: the unmanned aerial vehicle acting as pilot guides the other unmanned aerial vehicles to approach the abnormal point.
The third step: as the multiple unmanned ships approach and perceive the abnormal point, the laser radar data of the unmanned ships that can recognize the object are sent to the pilot unmanned aerial vehicle node, and the unmanned aerial vehicle fuses the received laser point cloud data with the depth camera data to calculate the final position of the abnormal point.
The fourth step: the unmanned ships are guided to approach the target abnormal point according to the position-fused information of the pilot, and the subsequent abnormal point handling operation is then performed.
The multi-unmanned aerial vehicle and multi-unmanned ship system is explained in detail:
the first step is as follows: depth camera D435i perceives anomalous objects
Combining the actual application scenario with the real-time processing capability of the unmanned aerial vehicle computing platform (Jetson Xavier NX), the following deep-convolution-based neural network structure is designed for the object detection algorithm, achieving multi-class classification. The object classes in the database are ranked by degree of danger, and each actual tracking is handled according to the urgency of the abnormal class.
FIG. 2 is the deep convolutional neural network structure, in which:
1. A depthwise separable convolution technique is adopted in the feature extraction part to reduce the parameter and computation load, so that the Jetson Xavier NX computing power can meet the requirement.
2. In addition, to enhance the detection accuracy for small-scale objects during water area inspection, such as the color signatures of sewage when inspecting pollution sources, an additional 104 × 104 output scale is specially designed for target detection.
As shown in figs. 3 and 4, the overall object recognition process is summarized as follows (a minimal sketch of the suppression step follows this list):
1. First, a standard-size water area inspection picture data set is constructed; data enhancement techniques can make up for deficiencies of the data set in some respects. It contains a labeled training set and test set in a 3:1 ratio. The training data set is fed into the deep convolutional neural network to learn and optimize the internal structure weights.
2. Targeted scoring is performed on the target detection results, which are screened with the non-maximum suppression method: the detection frame with the highest confidence is selected as the first output bounding box, the overlap rate of each other detection frame with it is calculated, and a frame is discarded if its overlap rate exceeds a preset threshold and retained otherwise; the prediction frame with the highest confidence among the remaining frames is then selected and the procedure repeated until no candidate frames remain, the retained frames being the target detection results in the image.
3. In the output result, each grid corresponds to 3 prior frames, and the prediction information of each prior frame comprises 4 frame position parameters (center coordinates, width and height), 1 target score and 5 category predictions (categories can be added according to the actual situation).
4. The loss function is calculated, and the model parameters are continuously adjusted by gradient descent through back propagation, finally yielding the optimal network model.
5. Images in the test set are input, target features are extracted with the trained model, multi-scale prediction results are output, targeted scoring is performed through the classifier, and the detection results are screened with the non-maximum suppression method, finally yielding the object recognition result of the deep convolutional neural network. If the test performance is good, the method can be put into practical use.
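As an illustration of the suppression step in item 2, the following minimal sketch implements standard non-maximum suppression; the [x1, y1, x2, y2] box format and the 0.5 overlap threshold are assumptions chosen for the example, not values fixed by this patent.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def non_max_suppression(boxes, scores, overlap_threshold=0.5):
    """Keep the highest-confidence box, drop boxes overlapping it beyond
    the threshold, and repeat until no candidates remain."""
    order = np.argsort(scores)[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        remaining = [i for i in order[1:]
                     if iou(boxes[best], boxes[i]) <= overlap_threshold]
        order = np.array(remaining, dtype=int)
    return keep
```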
The second step is that: multiple D435i, GPS cooperative object location
Multi-drone vision (D435i) cooperative positioning
When a plurality of unmanned aerial vehicles perceive the abnormal object, the invention exploits the uniqueness of the target in world coordinates and combines GPS and D435i: the object positions $(a_i, b_i)$ perceived by the individual drones are fused by a weighted average into the final positioning position $(a_t, b_t)$, where n is the number of unmanned aerial vehicles that recognized the abnormal point, as shown in fig. 5; the principle of depth camera position perception is shown in the appendix.

$$(a_t, b_t)=\frac{1}{n}\sum_{i=1}^{n}(a_i, b_i)\qquad (1)$$
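A small sketch of this fusion step; the patent does not fix the weighting, so uniform weights (the plain mean) are the default here, and the optional weights argument (e.g. distance-based) is purely an illustrative assumption.

```python
import numpy as np

def fuse_positions(positions, weights=None):
    """Fuse the (a_i, b_i) world-coordinate detections reported by the n
    drones that recognized the same abnormal point into one (a_t, b_t)."""
    positions = np.asarray(positions, dtype=float)   # shape (n, 2)
    n = len(positions)
    weights = np.full(n, 1.0 / n) if weights is None \
        else np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                # normalize to sum 1
    return tuple(weights @ positions)

# e.g. three drones reporting slightly different GPS + D435i fixes
print(fuse_positions([(10.2, 4.1), (10.0, 4.3), (10.1, 4.2)]))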
The third step: the multiple unmanned aerial vehicles perform event-triggered finite-time formation approach control, as shown in fig. 6. An unmanned aerial vehicle dynamic model is established:
the unmanned aerial vehicle is a four-rotor aircraft, and the established dynamic model takes the specific form

$$\begin{cases} \ddot{x}=\frac{u_1}{m}(\cos\phi\sin\theta\cos\psi+\sin\phi\sin\psi)\\ \ddot{y}=\frac{u_1}{m}(\cos\phi\sin\theta\sin\psi-\sin\phi\cos\psi)\\ \ddot{z}=\frac{u_1}{m}\cos\phi\cos\theta-g\\ \ddot{\theta}=\frac{l}{I_{yy}}u_2+\frac{I_{zz}-I_{xx}}{I_{yy}}\dot{\phi}\dot{\psi}\\ \ddot{\phi}=\frac{l}{I_{xx}}u_3+\frac{I_{yy}-I_{zz}}{I_{xx}}\dot{\theta}\dot{\psi}\\ \ddot{\psi}=\frac{1}{I_{zz}}u_4+\frac{I_{xx}-I_{yy}}{I_{zz}}\dot{\phi}\dot{\theta} \end{cases}\qquad (2)$$

in the formula: x, y, z represent the position of the drone in space; φ, θ, ψ represent the roll, pitch and yaw angles; m represents the mass of the drone; $I_{xx}$, $I_{yy}$, $I_{zz}$ represent the moments of inertia about the x, y, z axes, respectively; l represents the distance between a motor shaft and the center of the airframe; g represents the gravitational acceleration; $u_1$, $u_2$, $u_3$, $u_4$ represent the drone control inputs, defined as

$$u_1=b(\omega_1^2+\omega_2^2+\omega_3^2+\omega_4^2),\quad u_2=b(\omega_1^2-\omega_3^2),\quad u_3=b(\omega_2^2-\omega_4^2),\quad u_4=d(\omega_2^2+\omega_4^2-\omega_1^2-\omega_3^2)\qquad (3)$$

wherein: b represents the lift coefficient; d represents the torque coefficient; $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ represent the rotational speeds of rotors 1, 2, 3, 4, respectively; $u_1$ represents the total lift perpendicular to the fuselage; $u_2$ represents the lift difference affecting the pitch motion of the aircraft; $u_3$ represents the lift difference affecting the rolling motion of the aircraft; $u_4$ represents the torque affecting the yaw motion of the aircraft.
Cooperative processing capability matters more than attitude dynamics in water area inspection, so the drones' attitudes are not controlled here; all unmanned aerial vehicles participating in the inspection ignore the dynamic process of attitude control, and only their positions are considered.
Linearizing and simplifying the model yields the following second-order integrator model:

$$\dot{p}_i=v_i,\qquad \dot{v}_i=u_i\qquad (4)$$

where $p_i=[x_i,y_i,z_i]^T$, $v_i=[v_{xi},v_{yi},v_{zi}]^T$, $u_i=[u_{xi},u_{yi},u_{zi}]^T$ represent position, velocity and control input, respectively. With $\xi_i=[p_i^T,v_i^T]^T$, equation (4) can also be expressed in matrix form:

$$\dot{\xi}_i=A\xi_i+Bu_i,\qquad A=\begin{bmatrix}0_3&I_3\\0_3&0_3\end{bmatrix},\quad B=\begin{bmatrix}0_3\\I_3\end{bmatrix}\qquad (5)$$
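For concreteness, a minimal sketch of the linearized model (4) and its matrix form (5); the forward-Euler discretization and the 0.01 s step are illustration-only assumptions.

```python
import numpy as np

# Second-order integrator matrices for one drone; state xi = [p; v] in R^6.
A = np.block([[np.zeros((3, 3)), np.eye(3)],
              [np.zeros((3, 3)), np.zeros((3, 3))]])
B = np.vstack([np.zeros((3, 3)), np.eye(3)])

def integrator_step(xi, u, dt=0.01):
    """One forward-Euler step of d(xi)/dt = A xi + B u (equation (5))."""
    return xi + dt * (A @ xi + B @ u)
```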
The position information fused at the unmanned aerial vehicle serving as pilot is issued to the unmanned aerial vehicles to be formed, which then approach the abnormal point under an event-triggered finite-time formation control protocol so that the subsequent operation can be executed.
A formation form in space is set as $h=[h_1,h_2,\ldots,h_n]$, with $h_i=[h_{pi}^T,h_{vi}^T]^T$ the desired offset of drone i. Letting $\tilde{\xi}_i=\xi_i-h_i$, the formation problem is translated into the consistency problem

$$\lim_{t\to\infty}\bigl\|\tilde{\xi}_i(t)-\tilde{\xi}_j(t)\bigr\|=0\qquad (6)$$

wherein $i,j\in[1,n]$ and $i\neq j$ denote drone numbers. Defining the control input vector accordingly, the following new system model is obtained:

$$\dot{\tilde{\xi}}_i=A\tilde{\xi}_i+Bu_i\qquad (7)$$

When the states of system (7) reach consistency, system (4) achieves the corresponding formation control.
Stacking the position and velocity components of the $\tilde{\xi}_i$ over all drones, the consensus error vector is defined over the communication topology with adjacency entries $a_{ij}$ as

$$e_i(t)=\sum_{j=1}^{n}a_{ij}\bigl(\tilde{\xi}_i(t)-\tilde{\xi}_j(t)\bigr)\qquad (8)$$

The distributed event-driven finite-time sliding mode control (9) is designed on an integral sliding mode surface (10) with $S_i(t)=[S_{i1}(t),S_{i2}(t),S_{i3}(t)]^T$ a three-dimensional column vector; here $\alpha\in(0,1)$, $\langle *\rangle^{\alpha}=|*|^{\alpha}\cdot\mathrm{sgn}(*)$, and $\beta_1,\beta_2,\beta_3,\beta_4>0$ are controller parameters (the full expressions of (9) and (10) are given as equation images in the original). Each drone is assigned an event trigger function $\Delta_i(t)$ (equation (11), likewise given as an image), taking the ith drone as an example, where $\eta>0$ is a parameter for adjusting the event function: while $\Delta_i(t)<0$ the system operates normally; when $\Delta_i(t)\geq 0$ an event is triggered and the sampling error vector is reset, thereby regulating when the control system is updated.
Therefore, with system (7) under the controller protocol (9) and the event trigger function (11), when the controller and event-function parameters satisfy condition (12) (given as an equation image in the original), the states of system (7) reach agreement within finite time, i.e., the formation control of the plurality of unmanned aerial vehicles around the abnormal point is completed. In (12), $\lambda_2$ is the second smallest eigenvalue of the Laplacian matrix of the undirected communication topology formed by the multiple unmanned aerial vehicles.
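To make the trigger logic concrete, the sketch below runs a simplified one-dimensional consensus with a sampling-error trigger. The plain neighbourhood error stands in for the patent's integral sliding-mode surface (9)-(10), and the gain 1.5, the threshold parameter eta and the ring topology are assumed values for illustration.

```python
import numpy as np

# Event-triggered formation consensus for n = 4 drones (1-D for brevity).
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)   # undirected ring topology
h = np.array([0.0, 1.0, 2.0, 3.0])            # desired formation offsets
x = np.array([3.0, -1.0, 4.0, 0.5])           # initial positions
x_hat = x.copy()                              # states held since last event
eta, dt, events = 0.05, 0.01, 0

def consensus_error(pos):
    d = pos - h                               # formation-relative positions
    return adj @ d - adj.sum(axis=1) * d      # = -sum_j a_ij (d_i - d_j)

for _ in range(2000):
    err = consensus_error(x)
    for i in range(4):
        # trigger when the sampling error outgrows a fraction of the error
        if abs(x[i] - x_hat[i]) >= eta * abs(err[i]) + 1e-6:
            x_hat[i] = x[i]                   # event: re-sample own state
            events += 1
    x = x + dt * 1.5 * consensus_error(x_hat) # control uses sampled states

print("residual spread:", np.round((x - h) - (x - h).mean(), 3),
      "events:", events)
```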
The fourth step: D435i and RPLIDAR-A3 cooperate to perform 3D target detection and localization
At this point, the object depth and anchor-frame data perceived by the depth camera are fused with the laser point cloud data perceived once the unmanned ships have approached, to perform 3D target detection of the abnormal point. The invention adopts 3D abnormal-point detection, which greatly improves the precision of the subsequent actuating mechanism operations, as shown in fig. 7.
Table 1 (the fusion algorithm flow) is given as an image in the original.
as shown in table 1, the details of the fusion algorithm are:
Step1:
the deep convolutional neural network designed in this patent detects the two-dimensional region of the abnormal point in the RGB image and classifies the object, and the abnormal points to be processed are determined from the category library. Using the known depth camera projection matrix together with the laser radar point cloud data, the 2D bounding box is lifted to a view frustum (with near and far planes set by the depth sensor range) bounding the 3D search space of the abnormal-point object. All points within the view frustum form the frustum point cloud.
Step2:
The principle of nearest-neighbor clustering rests on the surface continuity of a single object, i.e., the reflection points of one object form a continuous point set. Within the view frustum formed in Step 1, the point cloud back-projected from the abnormal-point depth map serves as reference points for segmenting the 3D point cloud data. The 3D point cloud of the abnormal point is thus obtained and the centroid of the cluster is estimated as

$$c_{cluster}=\frac{1}{N}\sum_{k=1}^{N}p_k\qquad (13)$$

where $p_k$ are the N segmented points.
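A minimal sketch of this seed-based region growing under the surface-continuity assumption; the 0.2 m neighbourhood radius is an assumed value, not one fixed by the patent.

```python
import numpy as np

def segment_by_nearest_neighbor(frustum_points, seed_point, radius=0.2):
    """Grow a cluster from a seed (a point back-projected from the anomaly's
    depth map) by repeatedly absorbing points within `radius` of the current
    cluster, exploiting the surface-continuity assumption."""
    pts = np.asarray(frustum_points, dtype=float)
    in_cluster = np.zeros(len(pts), dtype=bool)
    start = int(np.argmin(np.linalg.norm(pts - seed_point, axis=1)))
    in_cluster[start] = True
    frontier = [start]
    while frontier:
        p = pts[frontier.pop()]
        near = np.linalg.norm(pts - p, axis=1) < radius
        new = near & ~in_cluster
        in_cluster |= new
        frontier.extend(np.flatnonzero(new).tolist())
    cluster = pts[in_cluster]
    return cluster, cluster.mean(axis=0)   # points and centroid (13)
```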
Fig. 8 is a 3D point cloud segmentation nearest neighbor clustering algorithm.
Step3:
The real center of the whole object is estimated with a T-Net network, and the coordinates are then translated so that the predicted center becomes the origin. The bounding box center is corrected by a residual method: the final object center combines the center residual predicted by the bounding box estimation network, the previous center residual from the T-Net, and the centroid from the nearest-neighbor clustering algorithm, as in equation (14). For the selected object's points in the 3D point cloud data, a bounding box estimation network predicts the bounding box and outputs the parameters of the three-dimensional bounding box, i.e., the box center $(c_x,c_y,c_z)$, size $(h,w,l)$ and deflection angle θ. The total optimization loss of the two networks is given in (15).

$$c_{pred}=c_{cluster}+\Delta c_{T\text{-}Net}+\Delta c_{box}\qquad (14)$$

$$L_{total}=\lambda\bigl(L_{c1\text{-}reg}+L_{c2\text{-}reg}+L_{h\text{-}cls}+L_{h\text{-}reg}+L_{s\text{-}cls}+L_{s\text{-}reg}+\gamma L_{corner}\bigr)\qquad (15)$$

wherein: $L_{c1\text{-}reg}$ and $L_{c2\text{-}reg}$ are the center-translation loss of the T-Net and the center loss of the bounding box estimation network, respectively. $L_{h\text{-}cls}$ and $L_{h\text{-}reg}$ are the classification and regression losses of the estimated 3D bounding box heading. $L_{s\text{-}cls}$ and $L_{s\text{-}reg}$ are the classification and regression losses of the bounding box size. The Softmax method is used for classification, and the smooth-$l_1$ loss for the regression problems. Since the bounding box information is jointly determined by size and angle, the corner loss $L_{corner}$ quantifies this coupling, calculated as in (16); λ, γ are model parameters, and the T-Net and bounding box estimation network structures are shown in figs. 9 and 10.

$$L_{corner}=\sum_{i=1}^{NS}\sum_{j=1}^{NH}\delta_{ij}\min\Bigl\{\sum_{k=1}^{8}\bigl\|P_k^{ij}-P_k^{*}\bigr\|,\ \sum_{k=1}^{8}\bigl\|P_k^{ij}-P_k^{**}\bigr\|\Bigr\}\qquad (16)$$

where NS is the number of bounding box size classes, NH the number of heading classes, $P_k^{ij}$ the corners of the anchor box with size class i and heading class j, $P_k^{*}$ the ground-truth corners, and $P_k^{**}$ the ground-truth corners with the heading flipped by π.
Step4:
According to the finally obtained position of the 3D detection frame relative to the camera, the coordinate transformation yields the position of the object in the world coordinate system.
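A small sketch composing the final centre as in equation (14) and mapping it into world coordinates; taking the camera pose (R_wc, t_wc) from the drone's GPS/IMU is an assumption of this example.

```python
import numpy as np

def object_world_position(c_cluster, delta_tnet, delta_box, R_wc, t_wc):
    """Compose the final box centre per equation (14), then map it from the
    camera frame into the world frame; R_wc (3x3 rotation) and t_wc (3-vector
    translation) are assumed to come from the drone's GPS/IMU pose."""
    c_cam = (np.asarray(c_cluster) + np.asarray(delta_tnet)
             + np.asarray(delta_box))
    return R_wc @ c_cam + np.asarray(t_wc)
```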
The fifth step: event trigger-based reinforcement learning unmanned aerial vehicle and multi-unmanned ship cooperative control
The heterogeneous multi-agent cooperative control is designed with the multiple unmanned ships in the water area as followers and the unmanned aerial vehicle in the air as pilot (leader).
Establishing a dynamic model of the unmanned ship:
the unmanned ship can be regarded as a rigid body described in a rectangular coordinate system fixed to the hull, with 6 degrees of freedom: linear motion along the 3 axes (surge, sway and heave) and rotary motion about the 3 axes (roll, pitch and yaw). The degrees of freedom are coupled, but in course control of a surface unmanned ship the coupling is very small and can be ignored, so only planar motion is considered. Since the motion of the unmanned ship is a complex six-degree-of-freedom motion, two coordinate systems are defined for convenience of study: an inertial coordinate system $O_0X_0Y_0Z_0$ with the inspection station as coordinate origin, and a body-fixed coordinate system Oxyz with the hull's center of gravity as coordinate origin.
Retaining only the most important factors, the heave, roll and pitch rates are set to zero (w = 0, p = 0, q = 0), and the six-degree-of-freedom motion of the unmanned ship simplifies to a three-degree-of-freedom motion: forward (surge) speed u along the X axis, transverse (sway) speed v along the Y axis, and yaw rate r about the Z axis. The transformation of the motion between the two coordinate systems is given in formula (17); the variables are defined in Table 2 (given as an image in the original).
Fig. 11 shows the reference motion coordinate system and motion variables of the unmanned ship.

$$\dot{x}=u\cos\psi-v\sin\psi,\qquad \dot{y}=u\sin\psi+v\cos\psi,\qquad \dot{\psi}=r\qquad (17)$$

Setting the controlled state of the unmanned ship to $[x\ \ y\ \ \psi]^T$, equation (17) is rewritten in the following state-space form:

$$\dot{x}_i(t)=f_i(x_i(t))+g_i(x_i(t))\,u_i(t),\qquad y_i(t)=C_i x_i(t)\qquad (18)$$

where $x_i(t)$, $u_i(t)$, $y_i(t)$ are the state, input and output of the ith unmanned ship, and $f_i$, $g_i$, $C_i$ describe its internal dynamics, input dynamics and output dynamics, respectively. In this patent the model is linearized; hull control itself uses a model-free reinforcement learning scheme, the model parameters appear only in the event-triggering condition, and since the sampling time is very small the impact on practical application is minor. The linearized form is:

$$\dot{x}_i(t)=A_i x_i(t)+B_i u_i(t),\qquad y_i(t)=C_i x_i(t)\qquad (19)$$
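A minimal sketch integrating the kinematics (17); the forward-Euler step and the 0.05 s step size are illustration-only assumptions.

```python
import numpy as np

def usv_step(state, u, v, r, dt=0.05):
    """Forward-Euler step of the 3-DOF kinematics of equation (17):
    body-frame surge u, sway v and yaw rate r drive the inertial-frame
    state [x, y, psi]."""
    x, y, psi = state
    return np.array([x + dt * (u * np.cos(psi) - v * np.sin(psi)),
                     y + dt * (u * np.sin(psi) + v * np.cos(psi)),
                     psi + dt * r])
```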
1. Designing a distributed observer of the pilot:
In the communication topology formed by the pilot (unmanned aerial vehicle) and the followers (unmanned ships), only a portion of the unmanned ships can access the pilot's information directly. This patent therefore designs a distributed observer for each unmanned ship to estimate the pilot's state and output, so that every unmanned ship knows the state and output of the unmanned aerial vehicle for the subsequent consistency control of the heterogeneous intelligent system. With $(A_0,B_0,C_0)$ the pilot system matrices according to formula (5) of the third step, the following distributed observer form is designed:

$$\dot{\xi}_i=A_0\xi_i+B_0\eta_i,\qquad \bar{y}_i=C_0\xi_i\qquad (20)$$

in which $\xi_i$, $\eta_i$, $\bar{y}_i$ are respectively the state, control input and output of the ith observer. The observer structure is shown in fig. 12, where $x_0$ denotes the actual pilot position and $\xi_i$ the position estimated by the observer.
The control signal $\eta_i$ is designed as follows:

$$\eta_i=c_1 F z_i+c_2\,\sigma(F z_i)\qquad (21)$$

with the nonlinear function σ(·) given by equation (22) (an image in the original), and the neighborhood estimation error $z_i$ defined as

$$z_i=\sum_{j=1}^{n}a_{ij}(\xi_i-\xi_j)+a_{i0}(\xi_i-x_0)\qquad (23)$$

where $a_{i0}>0$ only for the unmanned ships that can access the pilot directly. The distributed observer parameters are given by equation (24) (an image in the original), wherein P is a positive definite matrix satisfying a linear matrix inequality (also given as an image) and $\bar{u}$ is the upper bound of the control input.
The distributed observers (20) and (21) thus designed guarantee, when $c_1$, $c_2$ and the gain F satisfy the above conditions, $(A_0,B_0)$ is stabilizable and all eigenvalues of $A_0$ lie on the imaginary axis, that $z_i(t)\to 0$ as $t\to\infty$, ensuring the output of every unmanned ship agrees with the output of the pilot.
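A sketch of one observer update assuming the structure of (20)-(21); the function sigma, the gain F and the weights a_ij, a_i0 stand in for the quantities the original defines in equation images.

```python
import numpy as np

def observer_step(xi, x0, xi_all, neighbors, a_leader,
                  A0, B0, F, c1, c2, sigma, dt=0.01):
    """One forward-Euler update of ship i's distributed observer (20).
    a_leader > 0 only if ship i can access the pilot state x0 directly."""
    z = sum(xi - xi_all[j] for j in neighbors) + a_leader * (xi - x0)
    eta = c1 * (F @ z) + c2 * sigma(F @ z)   # control signal (21)
    return xi + dt * (A0 @ xi + B0 @ eta)
```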
2. Designing the event-triggered controller
Continuous sampling control is unnecessary during the tracking control that approaches the abnormal point, and wireless communication resources are very tight in a multi-drone, multi-ship scene, so the control algorithm of this patent adopts the event-triggering idea. An observer-based augmented state of the unmanned ship is defined as

$$\tilde{x}_i=\bigl[x_i^T\ \ \xi_i^T\bigr]^T$$

Combining (19) and (20) gives the augmented dynamics system of the unmanned ship:

$$\dot{\tilde{x}}_i=\tilde{A}_i\tilde{x}_i+\tilde{B}_i u_i+\tilde{G}_i\eta_i,\qquad \tilde{A}_i=\begin{bmatrix}A_i&0\\0&A_0\end{bmatrix},\quad \tilde{B}_i=\begin{bmatrix}B_i\\0\end{bmatrix},\quad \tilde{G}_i=\begin{bmatrix}0\\B_0\end{bmatrix}\qquad (25)$$

The controller $u_i$ is designed to drive $e_i$ toward 0, whereupon $z_i$ tends to 0; system (25) is then rewritten in the form (26) (given as an image in the original), and the performance function is defined as

$$V_i\bigl(\tilde{x}_i(t)\bigr)=\int_t^{\infty}\bigl(\tilde{x}_i^T\bar{Q}_i\tilde{x}_i+u_i^T R_i u_i\bigr)\,d\tau\qquad (27)$$

Between two consecutive events the augmented state is held at its sampled value, $\hat{x}_i(t)=\tilde{x}_i(t_k^i)$ for $t\in[t_k^i,t_{k+1}^i)$, where $\{t_k^i\}$ is the monotonically increasing sequence of sampling instants of the ith unmanned ship. The sampling error of the ith unmanned ship is

$$e_i(t)=\hat{x}_i(t)-\tilde{x}_i(t)$$

The following event-triggered controller is designed, holding the control between events:

$$u_i(t)=K_i\,\tilde{x}_i(t_k^i),\qquad t\in[t_k^i,t_{k+1}^i)$$

together with an event trigger function (equation (30), given as an image in the original), wherein $P_i$ is a positive definite symmetric matrix satisfying the algebraic Riccati equation

$$\tilde{A}_i^T P_i+P_i\tilde{A}_i+\bar{Q}_i-P_i\tilde{B}_i R_i^{-1}\tilde{B}_i^T P_i=0$$

and $\alpha_i$, $\beta_i$ are adjustable parameters guaranteeing that the right side of equation (30) is greater than 0.
3. A reinforcement learning algorithm realizes the cooperative consistency of the multiple unmanned ships and the unmanned aerial vehicle.
Combining the augmented state (26) with the performance function (27), the performance function becomes

$$V_i\bigl(\tilde{x}_i(t)\bigr)=\int_t^{\infty}\bigl(\tilde{x}_i^T\bar{Q}_i\tilde{x}_i+u_i^T R_i u_i\bigr)\,d\tau\qquad (31)$$

where $\bar{Q}_i=[C_i\ \ -C_0]^T Q_i\,[C_i\ \ -C_0]$.
Differentiating (31) yields the Hamiltonian, defined as

$$H_i\bigl(\tilde{x}_i,u_i,\nabla V_i\bigr)=\nabla V_i^T\,\dot{\tilde{x}}_i+\tilde{x}_i^T\bar{Q}_i\tilde{x}_i+u_i^T R_i u_i\qquad (32)$$

Equation (26) can be represented in the alternative form

$$\dot{\tilde{x}}_i=\tilde{A}_i\tilde{x}_i+\tilde{B}_i u_i^{v}+\tilde{B}_i\bigl(u_i-u_i^{v}\bigr)\qquad (33)$$

wherein $u_i$ is the exploration (behavior) policy and $u_i^{v}$ is the policy at iteration v. According to optimal control theory, the updated control at iteration v+1 is

$$u_i^{v+1}=-\tfrac{1}{2}R_i^{-1}\tilde{B}_i^T\nabla V_i^{v}\qquad (34)$$

From (32) to (34) one obtains the off-policy Bellman relation

$$\dot{V}_i^{v}=-\tilde{x}_i^T\bar{Q}_i\tilde{x}_i-\bigl(u_i^{v}\bigr)^T R_i u_i^{v}-2\bigl(u_i^{v+1}\bigr)^T R_i\bigl(u_i-u_i^{v}\bigr)\qquad (35)$$

An Actor network is designed to approximate the updated control strategy and a Critic network to approximate the value function $V_i^{v}$:

$$\hat{V}_i^{v}(\tilde{x}_i)=\bigl(W_{ci}^{v}\bigr)^T\phi_{ci}(\tilde{x}_i),\qquad \hat{u}_i^{v+1}(\tilde{x}_i)=\bigl(W_{ai}^{v+1}\bigr)^T\phi_{ai}(\tilde{x}_i)\qquad (36)$$

As shown in figs. 13 and 14, the Actor NN takes the actual state deviation as input and outputs the action driving the follower; the Critic NN takes the actual state deviation as input and outputs the evaluation value corresponding to that state. Here $\phi_{ci}$, $\phi_{ai}$ are basis functions, $W_{ci}$, $W_{ai}$ are weight vectors, $l_1$ and $l_2$ are the numbers of neurons, and $R_i=\mathrm{diag}(r_1,\ldots,r_m)$. Multiplying both sides of (35) by a suitable regressor, integrating over each sampling interval and combining with (36) yields the off-policy reinforcement learning algorithm (37) (given as an equation image in the original), in which $\omega_i(t)$ is the Bellman approximation error and the unknowns are the columns of the weight matrices. The least squares method is used to minimize the Bellman error and solve for $\hat{u}_i^{v+1}$ and $V_i^{v}$. The specific algorithm flow is shown in fig. 15.
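As a concrete analogue of the Actor-Critic iteration above, the sketch below runs least-squares policy iteration on a discrete-time LQR problem: it mirrors the off-policy idea (explore with u, evaluate the current policy from data by least squares, then improve it), but it is a discrete-time simplification, not the continuous-time integral algorithm (37).

```python
import numpy as np

def lspi_lqr(A, B, Q, R, K0, n_iters=20, n_samples=400, seed=0):
    """Least-squares policy iteration for discrete-time LQR. The 'critic'
    fits the quadratic Q-function Q_K(x,u) = z^T H z, z = [x; u], from
    off-policy transitions; the 'actor' update is K <- H_uu^{-1} H_ux.
    K0 should stabilize A - B K0 for the evaluation to be meaningful."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K, nz = K0.copy(), n + m
    iu = np.triu_indices(nz)                 # features: upper triangle of zz^T

    def feats(x, u):
        z = np.concatenate([x, u])
        M = np.outer(z, z)
        M = M + M.T - np.diag(np.diag(M))    # double off-diagonal products
        return M[iu]                         # so that h . feats = z^T H z

    for _ in range(n_iters):
        Phi, cost = [], []
        for _ in range(n_samples):
            x = rng.normal(size=n)
            u = -K @ x + 0.1 * rng.normal(size=m)   # exploratory behaviour
            x2 = A @ x + B @ u                      # one simulated step
            u2 = -K @ x2                            # on-policy successor
            Phi.append(feats(x, u) - feats(x2, u2)) # Bellman: Q(x,u)-Q(x',u')
            cost.append(x @ Q @ x + u @ R @ u)      #   equals the stage cost
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
        H = np.zeros((nz, nz)); H[iu] = h
        H = H + H.T - np.diag(np.diag(H))    # rebuild the symmetric H
        K = np.linalg.solve(H[n:, n:], H[n:, :n])   # actor improvement
    return K

# e.g. a double-integrator-like plant (example values only)
A = np.array([[1.0, 0.1], [0.0, 1.0]]); B = np.array([[0.0], [0.1]])
print(lspi_lqr(A, B, np.eye(2), np.eye(1), K0=np.array([[1.0, 1.0]])))
```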
Finally, the multiple unmanned ships approach the abnormal point along the unmanned aerial vehicle's track. After arrival, the megaphone, mechanical arm, detection devices and other mechanisms mounted on the unmanned ships are driven to further handle the abnormal point.
The unmanned ship can carry a high-capacity power supply and large precision instruments and has a strong capability for handling abnormal points, but it moves slowly, has a small field of view, and its mobility is greatly limited in areas dense with obstacles. The unmanned aerial vehicle, by contrast, moves faster and with greater spatial flexibility, need not consider the complex obstacle environment on the surface during movement, and enjoys a commanding view at high altitude. Combining these complementary advantages, the visual data acquired by the unmanned aerial vehicle acting as pilot, together with its position information, is received by the unmanned ships over the communication topology. If an emergency occurs, the multiple unmanned aerial vehicles and multiple unmanned ships cooperatively carry out the related work to handle the abnormal situation.
The heterogeneous system can handle the following water area inspection tasks (abnormal situations not yet considered can be added to the collected data set, widening the scope of application):
1. Search and rescue of drowning persons
Such as: capsized boats, drowning and similar events, for which cooperative rescue is carried out.
2. Monitoring illegal exploitation of water resources by ships
Such as: illegal fishing and exploitation of natural resources in the water area.
3. Patrol for preventing and controlling water pollution
Such as: inspecting the discharge outlets entering the water area and newly added pollution sources, e.g.: pollution from industrial and mining enterprises, urban domestic sewage, ship oil leakage, livestock and poultry breeding pollution, and the like.
4. Disposal of garbage in water area
Such as: garbage, floating objects, sundries and the like on the surface of the river.
5. Fixed point sampling inspection of water quality
Such as: water quality spot checks in key protected parts of the water area.
FIG. 16 is a depth separable convolution Conv2D structure; FIG. 17 shows a standard Conv2D structure; FIG. 18 is a residual base unit structure; FIG. 19 is a combined residual block structure; FIG. 20 is a binocular vision measurement principle;
assuming that the distance L between the left and right cameras is known (a parameter of the stereo rig), binocular vision measurement yields the two angles ∠SAB and ∠SBA, which determine the triangle SAB. The distance of the object relative to the camera baseline, i.e., the distance information of the water area target, then follows from the triangle geometry: with α = ∠SAB and β = ∠SBA,

$$D=\frac{L}{\cot\alpha+\cot\beta}=\frac{L\tan\alpha\,\tan\beta}{\tan\alpha+\tan\beta}$$
And finally, the position of the object relative to the world coordinate system can be obtained through coordinate conversion.
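A small sketch of the distance computation from the two measured base angles; the numeric inputs are example values only.

```python
import math

def stereo_distance(L, angle_A, angle_B):
    """Perpendicular distance of target S from the camera baseline AB,
    from the base angles of triangle SAB: D = L / (cot A + cot B)."""
    return L / (1.0 / math.tan(angle_A) + 1.0 / math.tan(angle_B))

print(stereo_distance(0.1, math.radians(80), math.radians(78)))  # ~0.26 m
```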
FIG. 21 illustrates the laser radar triangulation ranging principle. Laser radar:
for objects at different distances, the light spot that the emitted laser forms on the imaging sensor falls at different positions; on the other hand, the internal structure of the ranging module is fixed, so the focal length f of the receiving lens and the offset L between the optical axis of the transmitting path and the main optical axis of the receiving lens (the baseline distance) are known. By the similarity of the triangles, the distance D of the object can be calculated as

$$D=\frac{f\,L}{x}$$

where x is the measured offset of the light spot on the sensor.
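And a one-line sketch of the similar-triangles relation above; f, L and the spot offset x would come from the ranging module's calibration.

```python
def lidar_distance(f, L, x):
    """Triangulation range from the laser spot offset x on the sensor:
    by similar triangles, D = f * L / x."""
    return f * L / x
```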
Fig. 22 is a flow chart of the operation of the multi-unmanned aerial vehicle and multi-unmanned ship system.
The implementation of the present invention is expected to yield the following advantageous effects:
First, existing water area inspection is basically carried out manually; adopting this system greatly saves manpower, realizes unmanned operation, and reduces the risk personnel would face in the water area.
Second, multiple unmanned aerial vehicles and unmanned ships operating cooperatively greatly improve the capacity to handle abnormal work and avoid the insufficient working capability of a single agent.
Third, a new deep convolutional neural network is designed around the unmanned aerial vehicle's depth camera; depthwise separable convolution and multi-scale output prediction reduce the computational load so that processing runs on the Jetson Xavier NX embedded platform, and inspection abnormalities in the water area can be accurately perceived.
Fourth, the formation control design positioning multiple unmanned aerial vehicles around an abnormal point improves the cooperative working capability of the unmanned aerial vehicle and unmanned ship system.
Fifth, combining the unmanned aerial vehicle's vision with the unmanned ship's laser radar, GPS and other sensors, a new multi-information fusion algorithm realizes 3D detection and positioning of abnormal objects, which greatly improves the convenience of subsequent actuator operations.
Sixth, in the communication architecture formed by the multiple unmanned aerial vehicles and multiple unmanned ships, the unmanned ships can achieve output cooperation with the unmanned aerial vehicle even though the unmanned aerial vehicle serving as pilot is dynamic.
Seventh, combining the event-triggering and reinforcement learning ideas, an Actor-Critic neural network framework is designed to realize cooperative optimal output regulation of the unmanned aerial vehicle and unmanned ship multi-agent system.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (6)

1. A multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning, characterized in that: the system comprises a plurality of unmanned aerial vehicles and a plurality of unmanned ships;
the unmanned aerial vehicles are provided with a D435i vision mechanism and a Jetson Xavier NX computing platform;
the plurality of unmanned ships are provided with RPLIDAR-A3 laser radar and TX2 computing platforms;
the unmanned planes and the unmanned ships are provided with inertial measurement units IMU and GPS;
the unmanned planes and the unmanned ships are also provided with power batteries;
the D435i vision mechanism is in signal connection with a Jetson Xavier NX computing platform;
the RPLIDAR-A3 lidar is in signal connection with a TX2 computing platform;
the inertial measurement unit IMU and the GPS are respectively in signal connection with a Jetson Xavier NX computing platform and a TX2 computing platform;
the unmanned aerial vehicles and the unmanned ships are also provided with actuating mechanisms;
when the unmanned aerial vehicles find an abnormality, the unmanned aerial vehicle closest to the abnormal point serves as pilot, the position information of the object is perceived through the D435i vision mechanism, position information fusion is carried out at the pilot, and the plurality of unmanned aerial vehicles are guided to approach the abnormal point;
as the unmanned ships move, the abnormal point is perceived with the carried RPLIDAR-A3 laser radar, the data acquired by the unmanned ships are fused at the pilot, and the pilot sends the fused position information to the unmanned ships, so that the unmanned ships are driven to cooperatively approach the abnormal point and the abnormal point is handled with an actuating mechanism configured on an unmanned aerial vehicle or an unmanned ship.
2. The reinforcement learning-based multi-unmanned aerial vehicle and multi-unmanned ship inspection control system according to claim 1, wherein: the actuating mechanism comprises a mechanical arm, a water quality sampling instrument, a standby battery and a megaphone.
3. The reinforcement learning-based multi-unmanned aerial vehicle and multi-unmanned ship inspection control system according to claim 1, wherein: the plurality of unmanned aerial vehicles and the plurality of unmanned ships carry out water area inspection; when an unmanned aerial vehicle finds an abnormality, the unmanned aerial vehicle closest to the abnormal point serves as the pilot, the position information of the object is sensed through the D435i vision mechanism, position information fusion is carried out at the pilot, and the plurality of unmanned aerial vehicles are guided to approach the abnormal point, specifically:
the unmanned aerial vehicle closest to the abnormal point serves as the pilot and acts as the information processing platform; if a plurality of unmanned aerial vehicles find the same target, the position information is obtained by a weighted average algorithm;
the pilot guides the remaining unmanned aerial vehicles to approach the abnormal point;
as the unmanned ships approach and sense the abnormal point, the laser radar data of the unmanned ships that can identify the object are sent to the node serving as pilot; the pilot unmanned aerial vehicle fuses the received laser point cloud data with the vision mechanism data and calculates the final position of the abnormal point;
the unmanned ships are guided to approach the target abnormal point according to the position information fused at the pilot, and subsequent abnormal point processing is then carried out.
4. The reinforcement learning-based multi-unmanned aerial vehicle and multi-unmanned ship inspection control system according to claim 3, wherein: the Jetson Xavier NX computing platform constructs a water area inspection picture data set of standard size, containing an annotated training set and an annotated test set in a ratio of 3:1; the training data set is fed into a deep convolutional neural network to learn and optimize the internal structure weights;
the target detection results are scored by class, and the detection results are screened by a non-maximum suppression method: the detection frame with the highest confidence is selected as the first output bounding box; the overlap rate between each remaining detection frame and the first output bounding box is calculated, and a detection frame is discarded if its overlap rate exceeds the preset threshold and retained otherwise; the prediction frame with the highest confidence among those remaining is then selected and the procedure repeated until no detection frame remains to be screened, the retained detection frames being the target detection results in the image;
in the output result, each grid cell corresponds to 3 prior frames, and the prediction information of each prior frame comprises 4 frame position parameters, 1 object confidence score and 5 class predictions; the frame position parameters comprise the center coordinates, width and height;
calculating a loss function, and continuously adjusting model parameters by using a gradient descent method through back propagation to finally obtain an optimal network model;
inputting the images in the test set, extracting target features with the trained model, outputting multi-scale prediction results, scoring by class through a classifier, screening the detection results with the non-maximum suppression method, and finally obtaining the object recognition result of the deep convolutional neural network.
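For reference, a minimal NumPy sketch of the non-maximum suppression screening described in this claim; the box format (x1, y1, x2, y2) and the threshold default are assumptions for illustration:

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-confidence box, drop boxes that overlap it beyond
    the threshold, and repeat until no candidate boxes remain."""
    order = scores.argsort()[::-1]   # indices sorted best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        # Intersection of the best box with all remaining boxes.
        x1 = np.maximum(boxes[best, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[best, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[best, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[best, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_best = (boxes[best, 2] - boxes[best, 0]) * \
                    (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                    (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_best + area_rest - inter)
        # Retain only boxes whose overlap with the kept box is below threshold.
        order = order[1:][iou <= iou_threshold]
    return keep
```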
5. The reinforcement learning-based multi-unmanned aerial vehicle and multi-unmanned ship inspection control system according to claim 4, wherein: if a plurality of unmanned aerial vehicles find the same target, then, exploiting the uniqueness of the target in world coordinates, the object positions $(a_i, b_i)$ sensed by combining GPS and D435i are weighted-averaged to obtain the final positioning position $(a_t, b_t)$, where n is the number of unmanned aerial vehicles that identified the abnormal point;
$$(a_t, b_t) = \Big( \sum_{i=1}^{n} w_i\, a_i,\ \sum_{i=1}^{n} w_i\, b_i \Big), \qquad \sum_{i=1}^{n} w_i = 1$$

wherein $w_i$ is the fusion weight assigned to the i-th unmanned aerial vehicle;
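A small sketch of this fusion step at the pilot; the claim only specifies a weighted average over the n drones, so the equal default weights below are an assumption:

```python
import numpy as np

def fuse_positions(positions, weights=None):
    """Fuse the (a_i, b_i) world-frame fixes from the n drones that saw the
    same target into one (a_t, b_t) estimate by weighted averaging."""
    positions = np.asarray(positions, dtype=float)   # shape (n, 2)
    if weights is None:
        weights = np.ones(len(positions))            # assumption: equal trust
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                # normalize to sum to 1
    return weights @ positions                       # (a_t, b_t)

# e.g. three drones reporting GPS+D435i fixes of the same anomaly point:
# fuse_positions([(12.1, 4.0), (12.3, 3.9), (12.2, 4.2)])
```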
the plurality of unmanned aerial vehicles are subjected to event-triggered finite-time formation approach control, and an unmanned aerial vehicle dynamic model is established:
the unmanned aerial vehicle is a quad-rotor aircraft, and the established dynamic model takes the specific form:
$$
\begin{cases}
\ddot{x} = \dfrac{u_1}{m}\,(\cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi) \\
\ddot{y} = \dfrac{u_1}{m}\,(\cos\phi\sin\theta\sin\psi - \sin\phi\cos\psi) \\
\ddot{z} = \dfrac{u_1}{m}\,\cos\phi\cos\theta - g \\
\ddot{\phi} = \dfrac{l\,u_3}{I_{xx}} + \dfrac{I_{yy} - I_{zz}}{I_{xx}}\,\dot{\theta}\dot{\psi} \\
\ddot{\theta} = \dfrac{l\,u_2}{I_{yy}} + \dfrac{I_{zz} - I_{xx}}{I_{yy}}\,\dot{\phi}\dot{\psi} \\
\ddot{\psi} = \dfrac{u_4}{I_{zz}} + \dfrac{I_{xx} - I_{yy}}{I_{zz}}\,\dot{\phi}\dot{\theta}
\end{cases}
$$
in the formula: x, y, z represent the position of the drone in space; $\phi$, $\theta$, $\psi$ represent the roll angle, pitch angle and yaw angle; m represents the mass of the drone; $I_{xx}$, $I_{yy}$, $I_{zz}$ represent the moments of inertia about the x, y, z axes, respectively; l represents the distance between the motor shaft and the center of the airframe; g represents the gravitational acceleration; $u_1$, $u_2$, $u_3$, $u_4$ represent the drone control inputs, defined as:
$$
\begin{cases}
u_1 = b\,(\omega_1^2 + \omega_2^2 + \omega_3^2 + \omega_4^2) \\
u_2 = b\,(\omega_4^2 - \omega_2^2) \\
u_3 = b\,(\omega_3^2 - \omega_1^2) \\
u_4 = d\,(\omega_2^2 + \omega_4^2 - \omega_1^2 - \omega_3^2)
\end{cases}
$$
wherein: b represents the lift coefficient; d represents the torque coefficient; $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ respectively represent the rotational speeds of rotors 1, 2, 3, 4; $u_1$ represents the total lift perpendicular to the fuselage; $u_2$ represents the lift difference affecting the pitch motion of the aircraft; $u_3$ represents the lift difference affecting the roll motion of the aircraft; $u_4$ represents the torque affecting the yaw motion of the aircraft;
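A minimal simulation sketch of the translational part of the model above, under the reconstruction given; the mass, gravity and time-step values are placeholders:

```python
import numpy as np

def quadrotor_step(state, u, dt=0.01, m=1.5, g=9.81):
    """One Euler step of the translational quadrotor dynamics: position and
    velocity driven by total thrust u1 and the attitude (phi, theta, psi).
    state = [x, y, z, vx, vy, vz]; u = (u1, phi, theta, psi)."""
    x, y, z, vx, vy, vz = state
    u1, phi, theta, psi = u
    ax = (u1 / m) * (np.cos(phi) * np.sin(theta) * np.cos(psi)
                     + np.sin(phi) * np.sin(psi))
    ay = (u1 / m) * (np.cos(phi) * np.sin(theta) * np.sin(psi)
                     - np.sin(phi) * np.cos(psi))
    az = (u1 / m) * np.cos(phi) * np.cos(theta) - g
    return np.array([x + vx * dt, y + vy * dt, z + vz * dt,
                     vx + ax * dt, vy + ay * dt, vz + az * dt])
```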
since cooperative processing capability matters more in water area inspection, only the position of the unmanned aerial vehicle is controlled and its attitude is not separately controlled;
the model is linearized to obtain the following second-order integrator model:

$$\dot{p}_i = v_i, \qquad \dot{v}_i = u_i$$

wherein $p_i = [x_i, y_i, z_i]^T$, $v_i = [v_{xi}, v_{yi}, v_{zi}]^T$ and $u_i = [u_{xi}, u_{yi}, u_{zi}]^T$ respectively represent the position, velocity and control input of drone i; the matrix form is:

$$\dot{\xi}_i = A\,\xi_i + B\,u_i$$

wherein $\xi_i = [p_i^T, v_i^T]^T$, $A = \begin{bmatrix} 0_{3\times3} & I_3 \\ 0_{3\times3} & 0_{3\times3} \end{bmatrix}$, $B = \begin{bmatrix} 0_{3\times3} \\ I_3 \end{bmatrix}$;
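Written out as code, the linearized model is a per-axis double integrator; a quick sketch for checking the matrices A and B:

```python
import numpy as np

I3 = np.eye(3)
Z3 = np.zeros((3, 3))

# xi_i = [p_i; v_i] in R^6, u_i in R^3:  d(xi_i)/dt = A xi_i + B u_i
A = np.block([[Z3, I3],
              [Z3, Z3]])
B = np.vstack([Z3, I3])

def integrator_step(xi, u, dt=0.01):
    """Euler step of the linearized UAV model used for formation control."""
    return xi + (A @ xi + B @ u) * dt
```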
the position information fused at the unmanned aerial vehicle serving as pilot is issued to the unmanned aerial vehicles to be formed, and they then approach the abnormal point under the event-triggered finite-time formation control protocol, so that subsequent operations can be conveniently executed;
a formation configuration h in space is set as $h = [h_1, h_2, \ldots, h_n]$, wherein

$$h_i = [\,h_{pi}^T,\ 0^T\,]^T \in \mathbb{R}^6$$

is the desired formation offset of drone i; letting

$$\tilde{\xi}_i = \xi_i - h_i,$$

the formation problem is translated into the following consistency problem:

$$\lim_{t \to \infty} \big\| \tilde{\xi}_i(t) - \tilde{\xi}_j(t) \big\| = 0$$

wherein $i, j \in [1, n]$ and $i \neq j$ represent the drone numbers; defining the control input vector

$$\tilde{u}_i = u_i,$$

a new system model is obtained:

$$\dot{\tilde{\xi}}_i = A\,\tilde{\xi}_i + B\,\tilde{u}_i;$$
when the transformed states of all drones reach consensus, the system achieves the corresponding formation control;
letting $t_k^i$ denote the k-th trigger instant of drone i and

$$\hat{\xi}_i(t) = \tilde{\xi}_i(t_k^i), \qquad t \in [t_k^i, t_{k+1}^i),$$

the vector composed of the held states is

$$\hat{\xi} = [\hat{\xi}_1^T, \hat{\xi}_2^T, \ldots, \hat{\xi}_n^T]^T,$$

the vector composed of the current states is

$$\tilde{\xi} = [\tilde{\xi}_1^T, \tilde{\xi}_2^T, \ldots, \tilde{\xi}_n^T]^T,$$

and the error vector is defined as:

$$e_i(t) = \hat{\xi}_i(t) - \tilde{\xi}_i(t);$$
the distributed event-driven finite-time sliding mode controller is designed as follows:

[control law $u_i(t)$, given in the original as an equation image]

wherein $\alpha \in (0, 1)$, $\langle \ast \rangle^{\alpha} = |\ast|^{\alpha} \cdot \mathrm{sgn}(\ast)$, and $\beta_1, \beta_2, \beta_3, \beta_4 > 0$ are parameters of the controller; $S_i(t)$ is an integral sliding mode surface, defined as follows:

[integral sliding mode surface, given in the original as an equation image]

$S_i(t) = [S_{i1}(t), S_{i2}(t), S_{i3}(t)]^T$ is a three-dimensional column vector; the following event trigger function is designed for each drone, taking the i-th drone as an example:
[trigger function $\Delta_i(t)$, given in the original as an equation image]

wherein $\eta > 0$ is a parameter for adjusting the event function; when $\Delta_i(t) < 0$ the system operates normally; when $\Delta_i(t) \geq 0$ an event is triggered and the error vector is reset, thereby regulating the updating of the control system;

[parameter condition, given in the original as an equation image]

wherein $\lambda_2$ is the second smallest eigenvalue of the Laplacian matrix of the undirected communication topology graph formed by the plurality of unmanned aerial vehicles.
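A schematic sketch of the event-triggered update pattern this claim describes — hold the last computed control, monitor a trigger quantity, and recompute only when the trigger fires. Since the claim gives the control law and trigger function only as equation images, both the `control_law` callable and the trigger test below are stand-ins, not the patent's formulas:

```python
import numpy as np

def sgn_pow(x, alpha):
    """<x>^alpha = |x|^alpha * sgn(x), applied elementwise."""
    return np.sign(x) * np.abs(x) ** alpha

class EventTriggeredController:
    """Holds the control input between trigger instants; recomputes it only
    when the monitored error grows past the trigger threshold."""
    def __init__(self, control_law, eta=0.5):
        self.control_law = control_law   # maps sliding variable S -> u
        self.eta = eta                   # trigger sensitivity parameter
        self.S_held = None               # sliding variable at last trigger
        self.u_held = None               # control held since last trigger

    def update(self, S):
        # Stand-in trigger test: fire when the drift of S since the last
        # event exceeds eta times the held magnitude.
        if self.S_held is None or \
           np.linalg.norm(S - self.S_held) >= self.eta * np.linalg.norm(self.S_held):
            self.S_held = S.copy()
            self.u_held = self.control_law(S)   # event: recompute control
        return self.u_held                      # otherwise hold previous u
```

Between events the actuator keeps the last value, so each drone's controller updates (and communicates) only when its own trigger fires, which is what saves computation and bandwidth in the multi-agent formation.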
6. The reinforcement learning-based multi-unmanned aerial vehicle and multi-unmanned ship inspection control system according to claim 5, wherein: when the D435i vision mechanism and the RPLIDAR-A3 laser radar cooperate to detect and locate a 3D target, a deep convolutional neural network detects the two-dimensional object region of the abnormal point in the RGB image and classifies the object; the abnormal points to be processed are determined from the category library; using the known depth camera projection relation combined with the laser radar point cloud data, the view frustum of the 3D search space corresponding to the 2D bounding box of the abnormal point object is obtained (the near and far planes are specified by the depth sensor range); all points within the view frustum form the frustum point cloud.
the principle of nearest neighbor clustering rests on the continuity of the surface of a single object, i.e. the reflection points of the object form a continuous point set; within the formed view frustum, the point cloud formed from the abnormal point's depth map serves as reference points for segmenting the 3D point cloud data in the frustum; the 3D point cloud data of the abnormal point are thus obtained and the center position of the point cloud is estimated:
$$C_{cluster} = \frac{1}{N} \sum_{i=1}^{N} p_i$$

wherein $p_i$, $i = 1, \ldots, N$, are the segmented 3D points of the abnormal object;
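A sketch of the frustum extraction and centroid estimate just described; the intrinsic-matrix projection and the depth limits are simplified assumptions:

```python
import numpy as np

def frustum_centroid(points, box2d, K, depth_range=(0.3, 20.0)):
    """Keep lidar points that lie within the depth sensor's range and
    project inside the 2D detection box, then return their centroid as a
    first estimate of the object center.

    points: (N, 3) points already expressed in the camera frame
    box2d:  (xmin, ymin, xmax, ymax) in pixels
    K:      3x3 camera intrinsic matrix
    """
    xmin, ymin, xmax, ymax = box2d
    z = points[:, 2]
    in_depth = (z >= depth_range[0]) & (z <= depth_range[1])
    pts = points[in_depth]
    uv = (K @ pts.T).T                        # project to the image plane
    u, v = uv[:, 0] / uv[:, 2], uv[:, 1] / uv[:, 2]
    in_box = (u >= xmin) & (u <= xmax) & (v >= ymin) & (v <= ymax)
    frustum_pts = pts[in_box]
    return frustum_pts, frustum_pts.mean(axis=0)   # points, C_cluster
```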
the real center of the whole object is estimated with a T-Net network, and the coordinates are then transformed so that the predicted center becomes the origin; the center of the bounding box is corrected by a residual method, and the final object center is calculated by combining the center residual of the bounding box estimation network, the preceding center residual from T-Net, and the centroid obtained by the nearest neighbor clustering algorithm:
$$C_{pred} = C_{cluster} + \Delta C_{T\text{-}Net} + \Delta C_{box\text{-}net}$$
for the point cloud of the selected object in the 3D point cloud data, the bounding box is predicted with the bounding box estimation network, the output being the parameters of the three-dimensional bounding box, i.e. the bounding box center $(c_x, c_y, c_z)$, size $(h, w, l)$ and yaw angle $\theta$; the total optimization loss of the two networks is:
Lgeneral assembly=λ(Lc1-reg+Lc2-reg+Lh-cls+Lh-reg+Ls-cls+Ls-reg+γLcorner)
Wherein: l isc1-regAnd Lc2-regRespectively estimating the coordinate translation loss of the T-Net and the loss generated by a judgment center of the network for the bounding box; λ, γ are model parameters; l ish-clsAnd Lh-regRespectively estimating the category loss and the regression loss of the corresponding orientation of the 3D bounding box; l is s-clsAnd Ls-regClass losses and regression losses representing box sizes, respectively; the Softmax method is used in the category determination process, and the smooth-l method is used in the regression problem1Loss; since a bounding box information is determined by both size and angle, LcornerThe angular loss quantifies this, and the formula is:
$$L_{corner} = \sum_{i=1}^{NS} \sum_{j=1}^{NH} \delta_{ij}\, \min\Big\{ \sum_{k=1}^{8} \big\| P_k^{ij} - P_k^{\ast} \big\|,\ \sum_{k=1}^{8} \big\| P_k^{ij} - P_k^{\ast\ast} \big\| \Big\}$$
wherein NS represents the number of bounding boxes of different sizes and NH the number of bounding boxes of different orientations; $\delta_{ij}$ selects the ground-truth size/heading combination, $P_k^{ij}$ are the corner points of the box with size i and heading j, and $P_k^{\ast}$, $P_k^{\ast\ast}$ are the corner points of the ground-truth box and of the ground-truth box rotated by $\pi$ about the vertical axis; according to the finally obtained position of the 3D detection frame relative to the camera, the position of the object in the world coordinate system is obtained by combining the coordinate transformation.
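A sketch of the corner loss under the assumption, consistent with the symbols above, that it compares the eight corners of the predicted box against the ground-truth box and its π-rotated twin; the corner ordering and the z-up yaw convention are illustrative choices:

```python
import numpy as np

def box_corners(center, size, heading):
    """Eight corners of a 3D box with yaw rotation `heading` about z."""
    cx, cy, cz = center
    h, w, l = size
    # corner offsets in the box frame
    x = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2.0
    y = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2.0
    z = np.array([ h,  h,  h,  h, -h, -h, -h, -h]) / 2.0
    c, s = np.cos(heading), np.sin(heading)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (R @ np.vstack([x, y, z])).T + np.array([cx, cy, cz])

def corner_loss(pred, gt):
    """Sum of corner distances to the ground-truth box or its pi-rotated
    twin, whichever is smaller (resolves the heading ambiguity)."""
    p = box_corners(*pred)                               # pred = (c, s, theta)
    g = box_corners(gt[0], gt[1], gt[2])
    g_flip = box_corners(gt[0], gt[1], gt[2] + np.pi)    # flipped heading
    d = np.linalg.norm(p - g, axis=1).sum()
    d_flip = np.linalg.norm(p - g_flip, axis=1).sum()
    return min(d, d_flip)
```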
and a dynamic model of the unmanned ship is likewise established.
CN202111020276.2A 2021-09-01 2021-09-01 Multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning Active CN113671994B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111020276.2A CN113671994B (en) 2021-09-01 2021-09-01 Multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN113671994A (en) 2021-11-19
CN113671994B (en) 2024-03-05

Family

ID=78548038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111020276.2A Active CN113671994B (en) 2021-09-01 2021-09-01 Multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN113671994B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017035516A1 (en) * 2015-08-26 2017-03-02 Peloton Technology, Inc. Devices systems and methods for vehicle monitoring and platooning
CN106405040A (en) * 2016-11-17 2017-02-15 苏州航天系统工程有限公司 Unmanned-device-based water quality patrolling, contaminant originating system and method thereof
CN108681321A (en) * 2018-04-10 2018-10-19 华南理工大学 A kind of undersea detection method that unmanned boat collaboration is formed into columns
CN110751360A (en) * 2019-08-30 2020-02-04 广州睿启智能科技有限公司 Unmanned ship region scheduling method
CN110827535A (en) * 2019-10-30 2020-02-21 中南大学 Nonlinear vehicle queue cooperative self-adaptive anti-interference longitudinal control method
CN111968128A (en) * 2020-07-10 2020-11-20 北京航空航天大学 Unmanned aerial vehicle visual attitude and position resolving method based on image markers
CN112422783A (en) * 2020-10-10 2021-02-26 广东华南水电高新技术开发有限公司 Unmanned aerial vehicle intelligent patrol system based on parking apron cluster
CN112904388A (en) * 2020-12-05 2021-06-04 哈尔滨工程大学 Fusion positioning tracking control method based on navigator strategy
CN112774073A (en) * 2021-02-05 2021-05-11 燕山大学 Unmanned aerial vehicle guided multi-machine cooperation fire extinguishing method and fire extinguishing system thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ATTA DEBABRATA, et al.: "Decentralized formation control of multiple autonomous underwater vehicles", International Journal of Robotics & Automation, vol. 28, no. 4, 20 November 2013 (2013-11-20), pages 303-310 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114281101A (en) * 2021-12-03 2022-04-05 南京航空航天大学 Unmanned aerial vehicle and holder interference source joint search method based on reinforcement learning
CN114281101B (en) * 2021-12-03 2023-11-03 南京航空航天大学 Unmanned aerial vehicle and cradle head interference source joint search method based on reinforcement learning
CN114545979A (en) * 2022-03-16 2022-05-27 哈尔滨逐宇航天科技有限责任公司 Aircraft intelligent sliding mode formation control method based on reinforcement learning
CN115877718A (en) * 2023-02-23 2023-03-31 北京航空航天大学 Data-driven heterogeneous missile formation switching communication topology cooperative control method
CN116382328A (en) * 2023-03-09 2023-07-04 南通大学 Dam intelligent detection method based on cooperation of multiple robots in water and air
CN116382328B (en) * 2023-03-09 2024-04-12 南通大学 Dam intelligent detection method based on cooperation of multiple robots in water and air
CN116295507A (en) * 2023-05-26 2023-06-23 南京师范大学 Laser inertial odometer optimization method and system based on deep learning
CN116295507B (en) * 2023-05-26 2023-08-15 南京师范大学 Laser inertial odometer optimization method and system based on deep learning
CN116839570A (en) * 2023-07-13 2023-10-03 安徽农业大学 Crop interline operation navigation method based on sensor fusion target detection
CN116839570B (en) * 2023-07-13 2023-12-01 安徽农业大学 Crop interline operation navigation method based on sensor fusion target detection

Also Published As

Publication number Publication date
CN113671994B (en) 2024-03-05

Similar Documents

Publication Publication Date Title
CN113671994B (en) Multi-unmanned aerial vehicle and multi-unmanned ship inspection control system based on reinforcement learning
CN110782481B (en) Unmanned ship intelligent decision-making method and system
Bircher et al. Three-dimensional coverage path planning via viewpoint resampling and tour optimization for aerial robots
Tisdale et al. Autonomous UAV path planning and estimation
Lin et al. A robust real-time embedded vision system on an unmanned rotorcraft for ground target following
McGee et al. Obstacle detection for small autonomous aircraft using sky segmentation
CN112558608B (en) Vehicle-mounted machine cooperative control and path optimization method based on unmanned aerial vehicle assistance
CN110632941A (en) Trajectory generation method for target tracking of unmanned aerial vehicle in complex environment
Jin et al. On-board vision autonomous landing techniques for quadrotor: A survey
Pinto et al. An autonomous surface-aerial marsupial robotic team for riverine environmental monitoring: Benefiting from coordinated aerial, underwater, and surface level perception
Wang et al. Autonomous flights in dynamic environments with onboard vision
CN113228043A (en) System and method for obstacle detection and association of mobile platform based on neural network
Chen et al. Real-time identification and avoidance of simultaneous static and dynamic obstacles on point cloud for UAVs navigation
Mittal et al. Vision-based autonomous landing in catastrophe-struck environments
Chen et al. A novel unmanned surface vehicle with 2d-3d fused perception and obstacle avoidance module
Lu et al. Perception and avoidance of multiple small fast moving objects for quadrotors with only low-cost RGBD camera
Lee et al. Landing Site Inspection and Autonomous Pose Correction for Unmanned Aerial Vehicles
Harun et al. Collision avoidance control for Unmanned Autonomous Vehicles (UAV): Recent advancements and future prospects
Bui et al. A uav exploration method by detecting multiple directions with deep learning
Abbas et al. Autonomous canal following by a micro-aerial vehicle using deep CNN
Venna et al. Application of image-based visual servoing on autonomous drones
Sanchez-Lopez et al. Deep learning based semantic situation awareness system for multirotor aerial robots using LIDAR
Duan et al. Integrated localization system for autonomous unmanned aerial vehicle formation flight
Capi et al. Application of deep learning for drone obstacle avoidance and goal directed navigation
Bertoncini et al. Fixed-Wing UAV Path Planning and Collision Avoidance using Nonlinear Model Predictive Control and Sensor-based Cloud Detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant