CN115792912A - Method and system for sensing environment of unmanned surface vehicle based on fusion of vision and millimeter wave radar under weak observation condition


Info

Publication number
CN115792912A
Authority
CN
China
Prior art keywords
target
fusion
projection
anchor frame
image
Prior art date
Legal status
Pending
Application number
CN202211587579.7A
Other languages
Chinese (zh)
Inventor
李政霖
袁田鑫
周洋
柳春
彭艳
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN202211587579.7A priority Critical patent/CN115792912A/en
Publication of CN115792912A publication Critical patent/CN115792912A/en
Pending legal-status Critical Current


Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30 Assessment of water resources

Landscapes

  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention belongs to the field of unmanned boat target detection and discloses a method and a system for sensing the environment of an unmanned surface vehicle based on the fusion of vision and millimeter wave radar under weak observation conditions. The method collects data on the water surface environment around the unmanned surface vehicle with a vision camera and a millimeter wave radar, and collects the attitude of the vehicle itself with an inertial measurement unit; the millimeter wave radar point cloud data are clustered, clutter is filtered out, targets are identified, and the identified targets are projected into the image coordinate system using the pose of the unmanned boat measured by the inertial measurement unit; multi-scale targets in the water surface environment are detected by a deep neural network, and the millimeter wave radar projection targets, the detection results of the deep target detection network and the anchor frame information before non-maximum suppression processing are fused to output the final perception result. The invention effectively reduces the adverse effects of incomplete and uncertain sensing information in weakly observable environments and provides important information for subsequent autonomous decision-making.

Description

Method and system for sensing environment of unmanned surface vehicle based on fusion of vision and millimeter wave radar under weak observation condition
Technical Field
The invention belongs to the field of unmanned surface vehicle target detection, and in particular relates to a method and a system for sensing the environment of a water surface unmanned vehicle based on the fusion of vision and millimeter wave radar under weak observation conditions.
Background
In recent years, unmanned boats have been widely used to perform various military and non-military tasks, replacing human operators in dangerous or time-consuming and labor-intensive work. They play an important role in more and more areas, improving working efficiency and reducing operator casualties.
To achieve efficient fully autonomous and semi-autonomous behavior, accurate perception of the surrounding environment is the primary task (identification of static or dynamic obstacles and of obstacle category, speed and heading). Perception with a single sensor has limitations and is strongly affected by the environment. Multiple homogeneous or heterogeneous sensors provide different local and categorical information, so the target bearing, environment and position can be known more accurately and completely, and multi-sensor fusion is robust to changes in the operating environment of the unmanned boat.
A vision camera provides rich imaging detail and target features, but it lacks depth information, has a limited field of view and is strongly affected by the environment and illumination. A millimeter wave radar can measure speed and range, has a long detection distance, is little affected by the environment and is relatively inexpensive. The fusion of a vision camera and a millimeter wave radar therefore has good complementary characteristics: it effectively reduces the missed-detection rate under poor lighting and greatly improves the detection accuracy for distant small targets.
At present, fusion perception for unmanned boats suffers from high sensor deployment cost; the prior art mostly uses multiple sets of sensors to obtain a larger perception range, or uses expensive lidars to enhance close-range perception. Patent application CN105741234B discloses a vision-assisted automatic anchoring system for unmanned boats based on a three-dimensional panoramic view, which uses four vision cameras and four millimeter wave radars to perceive the environment and provides a wide field of view for automatic anchoring. Patent application CN109444911A discloses a method for detecting, identifying and positioning water surface targets by fusing a monocular camera and lidar point clouds, which improves the ability to detect and recognize water surface targets and provides a good perception basis for target tracking, path planning and autonomous navigation of unmanned boats. Patent application CN115202366A discloses an autonomous berthing method and system based on environment perception, which uses a lidar and a vision camera to recognize the surrounding environment and help the unmanned boat berth. In addition, most fusion schemes in the prior art are based on late fusion, so the perception performance of the whole fusion system is limited by the perception performance of each single sensor.
Disclosure of Invention
The invention uses a paired vision camera and millimeter wave radar and adopts a multi-stage fusion scheme combining early fusion and intermediate fusion. Its purpose is to provide a cost-effective and accurate method and system for sensing the environment of an unmanned surface vehicle based on the fusion of vision and millimeter wave radar under weak observation conditions.
Based on the purpose, the invention adopts the following technical scheme:
a water surface unmanned ship environment perception method based on fusion of vision and millimeter wave radar under weak observation conditions comprises the following steps:
step 1, installing a millimeter wave radar and a vision camera at the bow of an unmanned ship, and recording the position offset and the angle offset between the millimeter wave radar and the vision camera;
step 2, a vision camera is used for obtaining camera images at the same time, a millimeter wave radar is used for obtaining point cloud data, and an inertia measurement unit is used for obtaining the pitch angle pitch of the unmanned ship;
step 3, clustering the point cloud data by using an improved K-Means clustering algorithm to obtain a point cloud detection target set, and then calculating the length l, the width w, the center point coordinates (x_co, y_co), rcs_o, v_xo, v_yo and the center distance of every point cloud detection target; averaging the radar cross sections of all points in the point cloud detection target and obtaining the velocity v_xo, v_yo and the radar cross-sectional area rcs_o of the point cloud detection target from the radial velocities;
Step 4, projecting the point cloud detection target to a coordinate system of a camera image to obtain a projection target set;
step 5, placing the camera image into a deep target detection network for target identification to obtain an image detection target set and a complete anchor frame set, wherein the complete anchor frame set is the anchor frame identification information before non-maximum suppression processing; the image detection target set comprises the category of each image detection target, the coordinates of the upper left and lower right corners of its bounding box and its confidence; the complete anchor frame set includes the confidences of all classes of each anchor frame and the coordinates of the upper left and lower right corners of its bounding box.
Step 6, fusing the projection target set and the image detection target set to obtain a primary residual projection target set, a residual image detection target set and a primary fusion target set, then fusing the primary residual projection target set and the complete anchor frame set to obtain a secondary residual projection target set and a secondary fusion target set, and finally forming the primary fusion target set, the secondary residual projection target set and the residual image detection target set into a final target set.
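As a concrete illustration of step 3, the following sketch clusters a radar point cloud and derives the per-target features named above (length, width, center point, mean radar cross section, mean radial velocities, center distance). It is a minimal sketch only: scikit-learn's standard KMeans stands in for the improved K-Means algorithm, whose modifications are not detailed here, and the cluster count, clutter-rejection rule and field layout are assumptions.

```python
# Sketch of step 3: cluster radar points and extract per-target features.
# Standard KMeans stands in for the patent's improved K-Means; the cluster count,
# the clutter-rejection rule and the column layout are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def cluster_point_cloud(points, n_clusters=5, min_points=3):
    """points: numpy array of shape (n, 5) with columns x_r, y_r, rcs, v_x, v_y."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(points[:, :2])
    targets = []
    for k in range(n_clusters):
        cluster = points[labels == k]
        if len(cluster) < min_points:          # assumed clutter-rejection rule
            continue
        xs, ys = cluster[:, 0], cluster[:, 1]
        x_co, y_co = xs.mean(), ys.mean()      # cluster center point
        targets.append({
            "l": xs.max() - xs.min(),          # length of the tightest rectangle
            "w": ys.max() - ys.min(),          # width of the tightest rectangle
            "center": (x_co, y_co),
            "rcs_o": cluster[:, 2].mean(),     # mean radar cross-sectional area
            "v_xo": cluster[:, 3].mean(),      # mean radial velocity components
            "v_yo": cluster[:, 4].mean(),
            "distance": float(np.hypot(x_co, y_co)),  # Euclidean center distance
        })
    return targets
```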
Further, in step 6, the method for fusing the projection target set and the image detection target set includes: traversing the projection target set, finding out the overlapping degree of each image detection target and the current projection target for each projection target in the projection target set, if the image detection target with the overlapping degree larger than 0.5 exists, outputting a primary fusion target, wherein the boundary frame and the category of the primary fusion target are the boundary frame and the category of the image detection target with the largest overlapping degree, and the Euclidean distance of the primary fusion target is the Euclidean distance calculated according to the central point coordinate of the current projection target; finally, deleting the image detection target with the maximum overlapping degree and the current projection target from the image detection target set and the projection target set respectively; and all the primary fusion targets form a primary fusion target set, and the traversed projection target set and the traversed image detection target set are respectively a primary residual projection target set and a residual image detection target set.
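A minimal sketch of the primary fusion described above, assuming that boxes are axis-aligned and given as (x1, y1, x2, y2) and that the "overlapping degree" is the intersection-over-union; the dictionary fields of the target records are assumptions.

```python
# Sketch of the primary fusion: greedy IoU matching between projected radar targets
# and image detections. The box format and the 0.5 threshold follow the description;
# the record fields are assumed.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def primary_fusion(projections, detections):
    fused, remaining_proj = [], []
    for w in projections:                        # traverse the projection target set
        overlaps = [iou(w["box"], d["box"]) for d in detections]
        if overlaps and max(overlaps) > 0.5:
            best = overlaps.index(max(overlaps))
            d = detections.pop(best)             # delete the matched image detection
            fused.append({"box": d["box"], "category": d["category"],
                          "distance": w["distance"]})  # distance from the radar center point
        else:
            remaining_proj.append(w)             # goes to the primary residual set
    return fused, remaining_proj, detections     # fused set, residual projections, residual detections
```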
Further, in step 6, the method for fusing the initial residual projection target set and the complete anchor frame set includes: traversing the primary residual projection target set, finding out the overlapping degree of each anchor frame in the complete anchor frame set and the current projection target for each projection target in the primary residual projection target set, selecting the anchor frames with the overlapping degree larger than 0.5 to form a fusion anchor frame set, and multiplying the confidence coefficient of each anchor frame in the fusion anchor frame set by a coefficient epsilon to obtain an amplification confidence coefficient, wherein epsilon =2.8 × rcs o ,rcs o Is the radar cross-sectional area of the projected target; if the amplification confidence coefficient is greater than or equal to 0.5, deleting the anchor frame with the amplification confidence coefficient less than 0.5 in the fusion anchor frame set, and outputting a secondary fusion target, wherein the category of the secondary fusion target is the category of the maximum output value of the category of the anchor frame with the maximum amplification confidence coefficient in the fusion anchor frame set, the boundary frame of the secondary fusion target is a final boundary frame obtained after the boundary frame of the current projection target is corrected by using the anchor frame in the fusion anchor frame set, and the Euclidean distance of the secondary fusion target is the Euclidean distance calculated according to the central point coordinate of the current projection target; finally, deleting the current projection target from the primary projection target set; and forming a secondary fusion target set by the set of all secondary fusion targets, wherein the traversed primary residual projection target set is a secondary residual projection target set.
Further, in step 6, the method for obtaining the final bounding box includes:
Figure BDA0003989565660000031
wherein B is the coordinates of the upper left and lower right corners of the final bounding box, B_Radar is the coordinates of the upper left and lower right corners of the current projection target,
Figure BDA0003989565660000032
is the mean value of the coordinates of the upper left and lower right corners of all anchor frames in the fused anchor frame set, N is the number of anchor frames in the fused anchor frame set, B_n is the coordinates of the upper left and lower right corners of each anchor frame in the fused anchor frame set, and confidence_n is the confidence of each anchor frame in the fused anchor frame set.
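The correction formula itself appears only as an image in this text. From the variable definitions above, one plausible reading, stated here as an assumption rather than as the patent's exact expression, is a confidence-weighted mean of the fused anchor frames averaged with the radar projection box:

```latex
% Assumed reconstruction; the source shows the formula only as an image.
\bar{B} = \frac{\sum_{n=1}^{N} \mathrm{confidence}_n \, B_n}{\sum_{n=1}^{N} \mathrm{confidence}_n},
\qquad
B = \frac{1}{2}\left( B_{\mathrm{Radar}} + \bar{B} \right)
```

Under this reading, anchor frames with higher confidence pull the corrected box more strongly toward the image evidence, while the radar projection keeps the box anchored to the measured target position.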
Further, in step 4, the method for projecting the point cloud detection target into the coordinate system of the camera image comprises the following steps:
the method comprises the steps of firstly, transferring a center point coordinate of a point cloud detection target into a world coordinate system with a camera center as an origin, calculating the center point coordinate of the point cloud detection target in the world coordinate system, and then calculating the height of a center point projection;
secondly, transferring the corrected central point coordinate into an image coordinate system, and calculating the central point coordinate of the point cloud detection target in the image coordinate system;
thirdly, calculating the coordinates of the upper left corner and the lower right corner of a target frame of the point cloud detection target in an image coordinate system;
and fourthly, combining the coordinates of the center point, the coordinates of the upper left and lower right corners of the point cloud detection target in the image coordinate system, the speed of the point cloud detection target and the radar cross-sectional area information to form a projection target, wherein all the projection targets form the projection target set.
Further, in step 4, the method for calculating the center point coordinates (x_w, y_w) of the point cloud detection target in the world coordinate system is:
Figure BDA0003989565660000041
where theta is the offset angle between the millimeter wave radar and the vision camera,
Figure BDA0003989565660000042
where x is the horizontal lateral offset between the millimeter wave radar and the vision camera, and y is the horizontal longitudinal offset between the millimeter wave radar and the vision camera.
Further, in step 4, the projected height of the center point
Figure BDA0003989565660000043
h is the camera mounting height.
Further, in step 4, the method for calculating the center point coordinates (u, v) of the point cloud detection target in the image coordinate system is as follows:
Figure BDA0003989565660000044
f_x and f_y are the focal lengths of the vision camera in the x and y directions, and c_x and c_y are the distortion parameters of the vision camera in the x and y directions.
Further, in step 4, the method for calculating the coordinates (u_lefttop, v_lefttop) of the upper left corner and (u_rightbottom, v_rightbottom) of the lower right corner of the target frame of the point cloud detection target in the image coordinate system is:
Figure BDA0003989565660000045
Figure BDA0003989565660000046
a and b are the highest point and the lowest point of the point cloud detection target.
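Because the projection formulas above are reproduced only as images, the following sketch shows a conventional pipeline that is consistent with the stated definitions: a pure translation into the camera-centered world frame (θ = 0), a pitch-based height correction using the camera mounting height h, a pinhole projection with intrinsics f_x, f_y, c_x, c_y, and box corners taken at an assumed highest point a and lowest point b. The concrete expressions are assumptions, not the patent's formulas.

```python
# Sketch of step 4: project one radar cluster center into the image and build its box.
# The concrete formulas are assumptions consistent with the stated definitions
# (offsets x, y; angle theta = 0; camera height h; intrinsics f_x, f_y, c_x, c_y).
import math

def project_target(x_co, y_co, offset_x, offset_y, pitch, h,
                   f_x, f_y, c_x, c_y, a=2.0, b=0.0):
    """x_co, y_co: cluster center in the radar frame (lateral, forward), metres."""
    # Step 1: radar frame -> camera-centred world frame (translation only, theta = 0).
    x_w = x_co + offset_x                # lateral offset between radar and camera
    y_w = y_co + offset_y                # longitudinal offset between radar and camera
    # Step 2: pitch correction of the projected height; h is the camera mounting height.
    z_top = (a - h) - y_w * math.tan(pitch)     # assumed highest point of the target
    z_bottom = (b - h) - y_w * math.tan(pitch)  # assumed lowest point (water line)
    z_center = 0.5 * (z_top + z_bottom)
    # Step 3: pinhole projection into image coordinates (v grows downward).
    u = f_x * x_w / y_w + c_x
    v = c_y - f_y * z_center / y_w
    v_top = c_y - f_y * z_top / y_w
    v_bottom = c_y - f_y * z_bottom / y_w
    # Step 4: assemble the target frame; the box width is an assumption (symmetric
    # around u with the same pixel extent as the height).
    half_w = 0.5 * abs(v_bottom - v_top)
    return (u, v), (u - half_w, v_top), (u + half_w, v_bottom)
```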
A water surface unmanned ship environment sensing system based on the fusion of vision and a millimeter wave radar under weak observation conditions comprises an unmanned ship, wherein a vision camera and a millimeter wave radar are mounted at the bow of the unmanned ship, inertia measurement unit equipment is mounted in a cabin of the unmanned ship, the vision camera is used for acquiring images, the millimeter wave radar is used for acquiring point cloud data, and the inertia measurement unit is used for acquiring the heading angle, roll angle and pitch angle of the unmanned ship; an industrial computer is installed in the cabin of the unmanned ship and is connected with the vision camera, the millimeter wave radar and the inertia measurement unit; the industrial computer comprises a point cloud processing module, an image processing module, a primary target fusion module, a secondary target fusion module and a final target output module; the point cloud processing module comprises a clustering algorithm unit and a coordinate projection unit; the clustering algorithm unit is used for clustering the point cloud data and acquiring a point cloud detection target set consisting of point cloud detection targets; the coordinate projection unit is used for projecting the point cloud detection targets into the image coordinate system and acquiring a projection target set consisting of projection targets; the image processing module comprises a target identification unit and an anchor frame storage unit; the target identification unit is used for carrying out target identification on the image and obtaining an image detection target set consisting of image detection targets; the anchor frame storage unit is used for storing the anchor frames before non-maximum suppression processing and obtaining a complete anchor frame set formed by these anchor frames; the primary target fusion module comprises an image overlapping degree calculation unit, a primary fusion output unit and a primary target deletion unit; the image overlapping degree calculation unit is used for calculating the overlapping degree of each projection target with all image detection targets; the primary fusion output unit is used for fusing the image detection target with the largest overlapping degree, provided that this overlapping degree is larger than 0.5, with the projection target to obtain a primary fusion target; the primary target deletion unit is used for deleting the fused projection target and image detection target from the projection target set and the image detection target set to obtain a primary residual projection target set and a residual image detection target set; the final target output module is used for outputting a final target set and adding the primary fusion targets and the residual image detection targets to the final target set; the secondary target fusion module comprises an anchor frame overlapping degree calculation unit, an anchor frame screening unit, a secondary fusion output unit and a secondary target deletion unit; the anchor frame overlapping degree calculation unit is used for calculating the overlapping degree of each projection target in the primary residual projection target set with all anchor frames in the complete anchor frame set; the anchor frame screening unit is used for screening the anchor frames whose overlapping degree with the projection target is greater than 0.5 and forming a fused anchor frame set; the secondary fusion output unit is used for fusing the fused anchor frame set with the projection target and obtaining a secondary fusion target; the secondary target deletion unit is used for deleting the projection target fused with the fused anchor frame set and obtaining a secondary residual projection target set; and the final target output module is used for adding the secondary fusion targets and the secondary residual projection target set to the final target set.
Compared with the prior art, the method has the following beneficial effects:
1. The millimeter wave radar data, the vision camera image data and the inertial measurement unit data are acquired simultaneously, which ensures data synchronization of the three sensors and facilitates subsequent processing; accessing the inertial measurement unit data makes the position of the point cloud detection target projected into the image coordinate system more accurate and reduces the influence on the projection result of the boat shaking caused by currents, wind and waves.
2. The point cloud detection targets are acquired with an improved K-Means clustering algorithm, and clutter caused by the environment is filtered out. The point cloud of each detection target is then processed further to obtain its size, speed, rcs_o and distance from the unmanned boat. The data generated by the millimeter wave radar are thus fully used and complement the image data generated by the vision camera, achieving a better target detection effect.
3. The point cloud detection target information is added to the camera image, so that the richness of the camera image information is increased, and the unmanned ship target detection is more robust.
4. The recognition result of the deep neural network is obtained, and the anchor frame identification information before non-maximum suppression is stored, which avoids the information loss caused by non-maximum suppression. Fusing this information with the targets identified by the millimeter wave radar reduces the uncertainty of target recognition and improves robustness. The multi-stage combination of the millimeter wave radar information and the camera image information makes full use of both sensors, so water surface targets can be recognized better and the navigation safety of the unmanned boat is ensured.
Drawings
Fig. 1 is a schematic view of installation positions of a millimeter wave radar and a visual camera in embodiment 1 of the present invention;
FIG. 2 is a flowchart of example 1 of the present invention;
FIG. 3 is a camera image according to embodiment 1 of the present invention;
FIG. 4 is a schematic diagram showing information of a projection target and an anchor frame in embodiment 1 of the present invention;
FIG. 5 is a schematic diagram of a fusion target in embodiment 1 of the present invention.
Detailed Description
Example 1
A method for sensing the environment of an unmanned surface vehicle based on fusion of vision and a millimeter wave radar under weak observation conditions is shown in figure 2 and comprises the following steps:
Step 1, the millimeter wave radar and the vision camera are installed on a bracket at the bow of the unmanned boat, as shown in FIG. 1: the millimeter wave radar is installed at the front of the bracket, the vision camera is installed at the top of the bracket, and both face straight ahead, so that the centers of their fields of view coincide and the two sensors can better complement each other's information. The inertial measurement unit equipment is installed in the cabin and kept parallel to the deck to obtain the three shaking angles of the hull. An industrial computer is installed in the cabin of the unmanned boat; the vision camera is connected to the industrial computer through the serial port of an acquisition card, the millimeter wave radar through a CAN card, and the inertial measurement unit through a serial port. The positional offset between the millimeter wave radar and the vision camera is recorded as
Figure BDA0003989565660000061
wherein x is the horizontal transverse offset and y is the horizontal longitudinal offset; the angular offset θ is also recorded. Since the vision camera and the millimeter wave radar both face straight ahead, the angular offset θ is 0. The millimeter wave radar and the vision camera are installed on the same bracket at the bow of the unmanned boat with the centers of their fields of view coinciding, which better unifies their information, and introducing the inertial measurement unit data makes the projection of the targets detected by the millimeter wave radar into the camera image coordinate system more accurate.
Step 2, the vision camera is used to obtain a camera image Image_t at time t, as shown in FIG. 3, the millimeter wave radar is used to obtain point cloud data PC_t, and the inertial measurement unit is used to obtain data IMU_t. n is the number of points in the point cloud, and each point carries ten items of data, including the coordinates (x_r, y_r), the radar cross-sectional area rcs, the radial velocities v_x and v_y, the motion state dynamic_property, the object class class_type, the probability of existence prob_of_exist, the ambiguity state ambig_state and the invalid state invalid_state. Image_t is three-channel RGB image data with a resolution of 1920 x 1280, and IMU_t is the information of 3 angles
Figure BDA0003989565660000062
From top to bottom, these are the heading angle, the roll angle and the pitch angle of the unmanned boat at time t.
Step 3, the improved K-Means clustering algorithm is used to cluster the point cloud data PC_t to obtain a point cloud detection target set O_wwv, which consists of point cloud detection targets o_wwv. The tightest rectangular frame is then used to enclose each o_wwv in O_wwv. For each o_wwv, the length l, the width w, the center point coordinates (x_co, y_co), rcs_o, v_xo, v_yo and the Euclidean distance calculated from (x_co, y_co) are computed; the radar cross-sectional areas rcs and the radial velocities v_x, v_y of all points in o_wwv are averaged to obtain the radar cross-sectional area rcs_o and the velocity v_xo, v_yo of o_wwv.
Step 4, every o_wwv is projected into the coordinate system of the camera image to obtain the projection target set; this comprises the following four steps:
In the first step, for each o_wwv in the O_wwv obtained in step 3, the coordinates of the center point of o_wwv are transferred into a world coordinate system with the camera center as the origin according to the mapping relation
Figure BDA0003989565660000071
to calculate the center point coordinates (x_w, y_w) of o_wwv in the world coordinate system; since the offset angle θ between the millimeter wave radar and the vision camera is 0, the mapping is a translation transformation. Then the pitch angle information pitch of IMU_t is used to correct the projection, and the height of the center point projection is calculated as
Figure BDA0003989565660000072
h is the camera mounting height (relative to the horizontal).
Secondly, transferring the corrected central point coordinate to an image coordinate system according to an imaging matrix of the camera
Figure BDA0003989565660000073
Figure BDA0003989565660000074
to calculate the center point of o_wwv in the image coordinate system
Figure BDA0003989565660000075
where f_x and f_y are the focal lengths of the vision camera in the x and y directions, and c_x and c_y are the distortion parameters of the vision camera in the x and y directions.
In the third step, the coordinates (u_lefttop, v_lefttop) of the upper left corner and (u_rightbottom, v_rightbottom) of the lower right corner of the target frame of o_wwv in the image coordinate system are calculated; let the highest point of o_wwv be 2 m and the lowest point be 0 m, then
Figure BDA0003989565660000076
Figure BDA0003989565660000077
In the fourth step, (u, v), (u_lefttop, v_lefttop), (u_rightbottom, v_rightbottom), the velocity v_xo, v_yo of o_wwv, the radar cross-sectional area rcs_o and the center point coordinates (x_co, y_co) corresponding to (u, v) are combined to form a projection target w; after a projection target w has been formed for every o_wwv in O_wwv, all projection targets w constitute the projection target set W.
Step 5, the camera image Image_t obtained in step 2 is put into the deep target detection network for target recognition, and the recognition result is saved to obtain an image detection target set D, which consists of image detection targets d; each d includes the category of the image detection target, the image-coordinate-system coordinates of the upper left and lower right corners of its bounding box, and the confidence output by the network. At the same time, the anchor frame identification information of the deep target detection network before non-maximum suppression processing is saved to obtain a complete anchor frame set A, which consists of anchor frames a; each a includes the confidences of all categories of the anchor frame and the image-coordinate-system coordinates of the upper left and lower right corners of its bounding box. As shown in FIG. 4, green marks the projection targets and red marks the anchor frame information filtered out by the anchor frame detection of the deep target detection network.
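How the pre-NMS candidates are exposed depends on the detection framework that is used; the sketch below only illustrates the idea on generic detector-head outputs, keeping every candidate box with its full class-score vector aside before standard non-maximum suppression (torchvision.ops.nms) produces the usual detection set. The interface and all names other than torchvision.ops.nms are hypothetical.

```python
# Sketch: keep the candidate (pre-NMS) boxes and per-class scores so they can be
# fused with radar projections later. Only torchvision.ops.nms is a real library call.
import torch
from torchvision.ops import nms

def detect_with_pre_nms(raw_boxes, raw_scores, score_thresh=0.05, iou_thresh=0.5):
    """raw_boxes: (M, 4) candidate boxes; raw_scores: (M, C) per-class confidences,
    both taken from the detection head BEFORE non-maximum suppression."""
    # Complete anchor frame set A: every candidate with its full class-score vector.
    complete_anchor_set = [{"box": raw_boxes[i], "scores": raw_scores[i]}
                           for i in range(raw_boxes.shape[0])]
    # Conventional post-processing for the image detection target set D.
    best_scores, best_classes = raw_scores.max(dim=1)
    keep = best_scores > score_thresh
    boxes, scores, classes = raw_boxes[keep], best_scores[keep], best_classes[keep]
    kept = nms(boxes, scores, iou_thresh)                 # standard NMS
    detections = [{"box": boxes[i], "category": int(classes[i]),
                   "confidence": float(scores[i])} for i in kept.tolist()]
    return detections, complete_anchor_set
```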
Step 6, the projection target set W is traversed: for each projection target w in W, the overlapping degree of each image detection target with the current projection target is computed; if an image detection target with overlapping degree greater than 0.5 exists, the image detection target d_ioumax with the largest overlapping degree with the current projection target is found and a primary fusion target is output; the bounding box and category of the primary fusion target are those of d_ioumax, and the Euclidean distance of the primary fusion target is the Euclidean distance calculated from the center point coordinates (x_co, y_co) of the o_wwv corresponding to the current projection target; finally, d_ioumax and the current projection target are deleted from the image detection target set and the projection target set respectively; if no image detection target with overlapping degree greater than 0.5 exists, the same operation is performed on the next projection target. Finally, all primary fusion targets form the primary fusion target set, and the traversed projection target set and image detection target set are the primary residual projection target set and the residual image detection target set, respectively.
Then the primary residual projection target set is traversed: for each projection target in the primary residual projection target set, the overlapping degree of each anchor frame in the complete anchor frame set with the current projection target is computed, and the anchor frames whose overlapping degree IOU is greater than 0.5 are selected to form a fused anchor frame set A_cross; the confidence of each anchor frame in the fused anchor frame set is then multiplied by a coefficient ε to obtain the amplified confidence, where by default ε = 2.8 × rcs_o and rcs_o is the radar cross-sectional area of the current projection target. If an amplified confidence greater than or equal to 0.5 exists, the anchor frame a_cross-max with the largest confidence in A_cross is selected, the anchor frames whose amplified confidence is less than 0.5 are deleted from the fused anchor frame set, and a secondary fusion target is output, as shown in FIG. 5; the category of the secondary fusion target is the category with the maximum output value of a_cross-max, and the bounding box of the secondary fusion target is the final bounding box B obtained after correcting the bounding box of the current projection target with the anchor frames in A_cross,
Figure BDA0003989565660000081
wherein B is the coordinates of the upper left and lower right corners of the final bounding box, B_Radar is the coordinates of the upper left and lower right corners of the current projection target,
Figure BDA0003989565660000082
is the mean value of the coordinates of the upper left and lower right corners of all anchor frames in A_cross after the anchor frames with amplified confidence less than 0.5 have been deleted, N is the number of anchor frames in A_cross after that deletion, B_n is the coordinates of the upper left and lower right corners of each anchor frame in A_cross after that deletion, and confidence_n is the confidence of each anchor frame in A_cross after that deletion. The Euclidean distance of the secondary fusion target is the Euclidean distance calculated from the center point coordinates (x_co, y_co) of the o_wwv corresponding to the current projection target; finally, the current projection target is deleted from the primary residual projection target set. If no anchor frame with amplified confidence greater than or equal to 0.5 exists, the same operation is performed on the next projection target. Finally, the set of all obtained secondary fusion targets forms the secondary fusion target set, and the traversed primary residual projection target set is the secondary residual projection target set.
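Putting the pieces together, the secondary fusion of this embodiment can be sketched as follows, assuming the weighted-average reading of the bounding-box correction discussed earlier; the iou helper is the one from the primary-fusion sketch, and the record fields are assumptions.

```python
# Sketch of the secondary fusion: match residual radar projections against the
# complete pre-NMS anchor set, amplify confidences by 2.8 * rcs_o, and correct the
# radar box with the surviving anchors (assumed weighted-average correction).
def secondary_fusion(residual_projections, anchor_set, iou):
    fused, residual = [], []
    for w in residual_projections:
        cross = [a for a in anchor_set if iou(w["box"], a["box"]) > 0.5]
        eps = 2.8 * w["rcs_o"]                                   # amplification factor
        cross = [(a, eps * float(a["scores"].max())) for a in cross]
        cross = [(a, s) for a, s in cross if s >= 0.5]           # drop weak anchors
        if not cross:
            residual.append(w)                                   # secondary residual set
            continue
        best_anchor, _ = max(cross, key=lambda p: p[1])
        category = int(best_anchor["scores"].argmax())           # class of strongest anchor
        # Assumed correction: confidence-weighted mean of anchors, averaged with radar box.
        total = sum(s for _, s in cross)
        mean_box = [sum(s * float(a["box"][k]) for a, s in cross) / total for k in range(4)]
        box = [0.5 * (w["box"][k] + mean_box[k]) for k in range(4)]
        fused.append({"box": box, "category": category, "distance": w["distance"]})
    return fused, residual
```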
Finally, the primary fusion target set and the secondary fusion target set are combined into the final target set O; each remaining w in the secondary residual projection target set keeps its coordinates in the pixel coordinate system and the Euclidean distance calculated from the center point coordinates (x_co, y_co) of the o_wwv corresponding to w, and is added to the final target set O; each remaining d in the residual image detection target set keeps its output category and its bounding box in the pixel coordinate system and is added to the final output target set O, which is then output. By processing the point cloud data and the camera image and saving the anchor frame identification information before non-maximum suppression, more features are obtained and targets are found more easily.
Example 2
A water surface unmanned ship environment sensing system based on the fusion of vision and a millimeter wave radar under weak observation conditions comprises an unmanned ship, wherein a vision camera and a millimeter wave radar are mounted at the bow of the unmanned ship, inertia measurement unit equipment is mounted in a cabin of the unmanned ship, the vision camera is used for acquiring images, the millimeter wave radar is used for acquiring point cloud data, and the inertia measurement unit is used for acquiring the heading angle, roll angle and pitch angle of the unmanned ship; an industrial computer is installed in the cabin of the unmanned ship and connected with the vision camera, the millimeter wave radar and the inertia measurement unit; the industrial computer comprises a point cloud processing module, an image processing module, a primary target fusion module, a secondary target fusion module and a final target output module; the point cloud processing module comprises a clustering algorithm unit and a coordinate projection unit; the clustering algorithm unit is used for clustering the point cloud data and acquiring a point cloud detection target set consisting of point cloud detection targets; the coordinate projection unit is used for projecting the point cloud detection targets into the image coordinate system and acquiring a projection target set consisting of projection targets; the image processing module comprises a target identification unit and an anchor frame storage unit; the target identification unit is used for carrying out target identification on the image and obtaining an image detection target set consisting of image detection targets; the anchor frame storage unit is used for storing the anchor frames before non-maximum suppression processing and obtaining a complete anchor frame set formed by these anchor frames; the primary target fusion module comprises an image overlapping degree calculation unit, a primary fusion output unit and a primary target deletion unit; the image overlapping degree calculation unit is used for calculating the overlapping degree of each projection target with all image detection targets; the primary fusion output unit is used for fusing the image detection target with the largest overlapping degree, provided that this overlapping degree is larger than 0.5, with the projection target to obtain a primary fusion target; the primary target deletion unit is used for deleting the fused projection target and image detection target from the projection target set and the image detection target set to obtain a primary residual projection target set and a residual image detection target set; the final target output module is used for outputting a final target set and adding the primary fusion targets and the residual image detection targets to the final target set; the secondary target fusion module comprises an anchor frame overlapping degree calculation unit, an anchor frame screening unit, a secondary fusion output unit and a secondary target deletion unit; the anchor frame overlapping degree calculation unit is used for calculating the overlapping degree of each projection target in the primary residual projection target set with all anchor frames in the complete anchor frame set; the anchor frame screening unit is used for screening the anchor frames whose overlapping degree with the projection target is greater than 0.5 and forming a fused anchor frame set; the secondary fusion output unit is used for fusing the fused anchor frame set with the projection target and obtaining a secondary fusion target; the secondary target deletion unit is used for deleting the projection target fused with the fused anchor frame set and obtaining a secondary residual projection target set; and the final target output module is used for adding the secondary fusion targets and the secondary residual projection target set to the final target set.

Claims (10)

1. A method for sensing the environment of an unmanned surface vehicle based on fusion of vision and a millimeter wave radar under a weak observation condition is characterized by comprising the following steps:
step 1, installing a millimeter wave radar and a vision camera at the bow of an unmanned ship, and recording the position offset and the angle offset between the millimeter wave radar and the vision camera;
step 2, a vision camera is used for obtaining camera images at the same time, a millimeter wave radar is used for obtaining point cloud data, and an inertia measurement unit is used for obtaining the pitch angle pitch of the unmanned ship;
step 3, clustering the point cloud data by using an improved K-Means clustering algorithm to obtain a point cloud detection target set, and calculating the length l, the width w, the center point coordinates (x_co, y_co), rcs_o, v_xo and v_yo of all point cloud detection targets;
Step 4, projecting the point cloud detection target to a coordinate system of a camera image to obtain a projection target set;
step 5, placing the camera image into a deep target detection network for target identification to obtain an image detection target set and a complete anchor frame set, wherein the complete anchor frame set is the anchor frame identification information before non-maximum suppression processing is carried out by the deep target detection network;
step 6, fusing the projection target set and the image detection target set to obtain a primary residual projection target set, a residual image detection target set and a primary fusion target set, fusing the primary residual projection target set and the complete anchor frame set to obtain a secondary residual projection target set and a secondary fusion target set, and finally forming the primary fusion target set, the secondary residual projection target set and the residual image detection target set into a final target set.
2. The method of claim 1, wherein in step 6, the method of fusing the set of projection targets and the set of image detection targets is: traversing the projection target set, finding out the overlapping degree of each image detection target and the current projection target for each projection target in the projection target set, if the image detection target with the overlapping degree larger than 0.5 exists, outputting a primary fusion target, wherein the boundary frame and the category of the primary fusion target are the boundary frame and the category of the image detection target with the largest overlapping degree, and the Euclidean distance of the primary fusion target is the Euclidean distance calculated according to the central point coordinate of the current projection target; finally, deleting the image detection target with the maximum overlapping degree and the current projection target from the image detection target set and the projection target set respectively; and all the primary fusion targets form a primary fusion target set, and the traversed projection target set and the traversed image detection target set are respectively a primary residual projection target set and a residual image detection target set.
3. The method of claim 2, wherein in step 6, the method for fusing the primary residual projection target set with the complete anchor frame set is: traversing the primary residual projection target set; for each projection target in the primary residual projection target set, finding the overlapping degree of each anchor frame in the complete anchor frame set with the current projection target, selecting the anchor frames with overlapping degree larger than 0.5 to form a fused anchor frame set, and multiplying the confidence of each anchor frame in the fused anchor frame set by a coefficient ε to obtain an amplified confidence, where ε = 2.8 × rcs_o and rcs_o is the radar cross-sectional area of the projection target; if an amplified confidence greater than or equal to 0.5 exists, deleting the anchor frames with amplified confidence less than 0.5 from the fused anchor frame set and outputting a secondary fusion target, wherein the category of the secondary fusion target is the category with the maximum output value of the anchor frame with the largest amplified confidence in the fused anchor frame set, the bounding box of the secondary fusion target is the final bounding box obtained after correcting the bounding box of the current projection target with the anchor frames in the fused anchor frame set, and the Euclidean distance of the secondary fusion target is the Euclidean distance calculated from the center point coordinates of the current projection target; finally, deleting the current projection target from the primary residual projection target set; the set of all secondary fusion targets forms the secondary fusion target set, and the traversed primary residual projection target set is the secondary residual projection target set.
4. The method of claim 3, wherein in step 6, the final bounding box is obtained by:
Figure FDA0003989565650000021
wherein B is the coordinates of the upper left and lower right corners of the final bounding box, B_Radar is the coordinates of the upper left and lower right corners of the current projection target,
Figure FDA0003989565650000022
is the mean value of the coordinates of the upper left and lower right corners of all anchor frames in the fused anchor frame set, N is the number of anchor frames in the fused anchor frame set, B_n is the coordinates of the upper left and lower right corners of each anchor frame in the fused anchor frame set, and confidence_n is the confidence of each anchor frame in the fused anchor frame set.
5. The method of claim 4, wherein in step 4, the method of projecting the point cloud detection target into the coordinate system of the camera image comprises the steps of:
the method comprises the steps of firstly, transferring a center point coordinate of a point cloud detection target into a world coordinate system with a camera center as an origin, calculating the center point coordinate of the point cloud detection target in the world coordinate system, and then calculating the height of a center point projection;
secondly, transferring the corrected central point coordinate into an image coordinate system, and calculating the central point coordinate of the point cloud detection target in the image coordinate system;
thirdly, calculating the coordinates of the upper left corner and the lower right corner of a target frame of the point cloud detection target in an image coordinate system;
and fourthly, combining the coordinates of the center point, the coordinates of the upper left and lower right corners of the point cloud detection target in the image coordinate system, the speed of the point cloud detection target and the radar cross-sectional area information to form a projection target, wherein all the projection targets form the projection target set.
6. The method of claim 5, wherein in step 4, the center point coordinates (x_w, y_w) of the point cloud detection target in the world coordinate system are calculated as:
Figure FDA0003989565650000023
where theta is the offset angle between the millimeter wave radar and the vision camera,
Figure FDA0003989565650000024
where x is the horizontal lateral offset between the millimeter wave radar and the vision camera, and y is the horizontal longitudinal offset between the millimeter wave radar and the vision camera.
7. The method of claim 6, wherein in step 4, the projected height of the center point is
Figure FDA0003989565650000025
Figure FDA0003989565650000026
h is the camera mounting height.
8. The method of claim 7, wherein in step 4, the center point coordinates (u, v) of the point cloud detection target in the image coordinate system are calculated by:
Figure FDA0003989565650000031
f_x and f_y are the focal lengths of the vision camera in the x and y directions, and c_x and c_y are the distortion parameters of the vision camera in the x and y directions.
9. The method of claim 8, wherein in step 4, the coordinates (u_lefttop, v_lefttop) of the upper left corner and (u_rightbottom, v_rightbottom) of the lower right corner of the target frame of the point cloud detection target in the image coordinate system are calculated as:
Figure FDA0003989565650000032
a and b are the highest point and the lowest point of the point cloud detection target.
10. A water surface unmanned ship environment sensing system based on the fusion of vision and a millimeter wave radar under weak observation conditions, characterized by comprising an unmanned ship, wherein a vision camera and a millimeter wave radar are mounted at the bow of the unmanned ship, inertia measurement unit equipment is mounted in a cabin of the unmanned ship, the vision camera is used for acquiring images, the millimeter wave radar is used for acquiring point cloud data, and the inertia measurement unit is used for acquiring the heading angle, roll angle and pitch angle of the unmanned ship; an industrial computer is installed in the cabin of the unmanned ship and connected with the vision camera, the millimeter wave radar and the inertia measurement unit; the industrial computer comprises a point cloud processing module, an image processing module, a primary target fusion module, a secondary target fusion module and a final target output module; the point cloud processing module comprises a clustering algorithm unit and a coordinate projection unit; the clustering algorithm unit is used for clustering the point cloud data and acquiring a point cloud detection target set consisting of point cloud detection targets; the coordinate projection unit is used for projecting the point cloud detection targets into the image coordinate system and acquiring a projection target set consisting of projection targets; the image processing module comprises a target identification unit and an anchor frame storage unit; the target identification unit is used for carrying out target identification on the image and obtaining an image detection target set consisting of image detection targets; the anchor frame storage unit is used for storing the anchor frames before non-maximum suppression processing and obtaining a complete anchor frame set formed by these anchor frames; the primary target fusion module comprises an image overlapping degree calculation unit, a primary fusion output unit and a primary target deletion unit; the image overlapping degree calculation unit is used for calculating the overlapping degree of each projection target with all image detection targets; the primary fusion output unit is used for fusing the image detection target with the largest overlapping degree, provided that this overlapping degree is larger than 0.5, with the projection target to obtain a primary fusion target; the primary target deletion unit is used for deleting the fused projection target and image detection target from the projection target set and the image detection target set to obtain a primary residual projection target set and a residual image detection target set; the final target output module is used for outputting a final target set and adding the primary fusion targets and the residual image detection targets to the final target set; the secondary target fusion module comprises an anchor frame overlapping degree calculation unit, an anchor frame screening unit, a secondary fusion output unit and a secondary target deletion unit; the anchor frame overlapping degree calculation unit is used for calculating the overlapping degree of each projection target in the primary residual projection target set with all anchor frames in the complete anchor frame set; the anchor frame screening unit is used for screening the anchor frames whose overlapping degree with the projection target is greater than 0.5 and forming a fused anchor frame set; the secondary fusion output unit is used for fusing the fused anchor frame set with the projection target and obtaining a secondary fusion target; the secondary target deletion unit is used for deleting the projection target fused with the fused anchor frame set and obtaining a secondary residual projection target set; and the final target output module is used for adding the secondary fusion targets and the secondary residual projection target set to the final target set.
CN202211587579.7A 2022-12-09 2022-12-09 Method and system for sensing environment of unmanned surface vehicle based on fusion of vision and millimeter wave radar under weak observation condition Pending CN115792912A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211587579.7A CN115792912A (en) 2022-12-09 2022-12-09 Method and system for sensing environment of unmanned surface vehicle based on fusion of vision and millimeter wave radar under weak observation condition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211587579.7A CN115792912A (en) 2022-12-09 2022-12-09 Method and system for sensing environment of unmanned surface vehicle based on fusion of vision and millimeter wave radar under weak observation condition

Publications (1)

Publication Number Publication Date
CN115792912A true CN115792912A (en) 2023-03-14

Family

ID=85418573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211587579.7A Pending CN115792912A (en) 2022-12-09 2022-12-09 Method and system for sensing environment of unmanned surface vehicle based on fusion of vision and millimeter wave radar under weak observation condition

Country Status (1)

Country Link
CN (1) CN115792912A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116448115A (en) * 2023-04-07 2023-07-18 连云港杰瑞科创园管理有限公司 Unmanned ship probability distance map construction method based on navigation radar and photoelectricity
CN116448115B (en) * 2023-04-07 2024-03-19 连云港杰瑞科创园管理有限公司 Unmanned ship probability distance map construction method based on navigation radar and photoelectricity

Similar Documents

Publication Publication Date Title
CN106681353B (en) The unmanned plane barrier-avoiding method and system merged based on binocular vision with light stream
CN112396650B (en) Target ranging system and method based on fusion of image and laser radar
CN113359810B (en) Unmanned aerial vehicle landing area identification method based on multiple sensors
CN103065323B (en) Subsection space aligning method based on homography transformational matrix
CN110989687B (en) Unmanned aerial vehicle landing method based on nested square visual information
CN111968128B (en) Unmanned aerial vehicle visual attitude and position resolving method based on image markers
CN107844750A (en) A kind of water surface panoramic picture target detection recognition methods
CN113627473B (en) Multi-mode sensor-based water surface unmanned ship environment information fusion sensing method
US20220024549A1 (en) System and method for measuring the distance to an object in water
CN113298035A (en) Unmanned aerial vehicle electric power tower detection and autonomous cruise method based on image recognition
CN114018236A (en) Laser vision strong coupling SLAM method based on adaptive factor graph
CN114325634A (en) Method for extracting passable area in high-robustness field environment based on laser radar
CN115273034A (en) Traffic target detection and tracking method based on vehicle-mounted multi-sensor fusion
CN108871409A (en) A kind of fault detection method and system
CN114004977A (en) Aerial photography data target positioning method and system based on deep learning
CN114120283A (en) Method for distinguishing unknown obstacles in road scene three-dimensional semantic segmentation
Bovcon et al. Improving vision-based obstacle detection on USV using inertial sensor
CN113744315A (en) Semi-direct vision odometer based on binocular vision
CN115792912A (en) Method and system for sensing environment of unmanned surface vehicle based on fusion of vision and millimeter wave radar under weak observation condition
CN115908539A (en) Target volume automatic measurement method and device and storage medium
CN117115784A (en) Vehicle detection method and device for target data fusion
CN113971697A (en) Air-ground cooperative vehicle positioning and orienting method
CN113111707A (en) Preceding vehicle detection and distance measurement method based on convolutional neural network
CN109764864B (en) Color identification-based indoor unmanned aerial vehicle pose acquisition method and system
CN113589848B (en) Multi-unmanned aerial vehicle detection, positioning and tracking system and method based on machine vision

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination