CN114359493B - Method and system for generating three-dimensional semantic map for unmanned ship - Google Patents

Method and system for generating three-dimensional semantic map for unmanned ship

Info

Publication number: CN114359493B
Application number: CN202111559071.1A
Authority: CN (China)
Prior art keywords: key frame, key, map, frame, frames
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN114359493A
Inventors: 李波 (Li Bo), 陈小乔 (Chen Xiaoqiao)
Current and original assignee: 709th Research Institute of CSIC
Application filed by 709th Research Institute of CSIC; priority to CN202111559071.1A
Publication of application CN114359493A, followed by grant and publication of CN114359493B

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method and a system for generating a three-dimensional semantic map for an unmanned ship, comprising the following steps: processing the image frames acquired by the unmanned ship during navigation with the ORB-SLAM algorithm to obtain key frames and an initialized three-dimensional point cloud map; performing instance segmentation on the obtained key frames with the Mask R-CNN algorithm to obtain all object instances in each key frame, and semantically annotating those instances; adding the obtained key frames to the initialized three-dimensional point cloud map to update it; matching the feature points and object instances extracted from each key frame against those of the nearest key frame to obtain key frames carrying object instance semantic annotation information; and associating the semantic annotation information carried by each key frame with that key frame's coordinates on the three-dimensional point cloud map to obtain the three-dimensional semantic map. The invention improves the accuracy with which the unmanned ship recognizes the boundaries of photographed objects.

Description

Method and system for generating three-dimensional semantic map for unmanned ship
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method and a system for generating a three-dimensional semantic map by an unmanned ship.
Background
In recent years, the unmanned ship market has been growing continuously, and with the development of the underlying technologies, autonomous navigation serves as the guarantee of safe sailing. For an unmanned ship to realize practical applications, one key problem is autonomous positioning and environmental perception. In many unmanned ship application scenarios, such as path planning, environmental awareness, and obstacle avoidance, localization and mapping are regarded as prerequisites. A visual SLAM (Simultaneous Localization and Mapping) algorithm, with a camera as the visual sensor, can build a map of the environment during motion without prior environmental information while simultaneously performing autonomous positioning.
The ORB-SLAM algorithm is a monocular visual SLAM algorithm based entirely on sparse feature points. It comprises visual odometry, tracking, and loop detection, and its core is the use of ORB (Oriented FAST and Rotated BRIEF) features throughout the whole visual SLAM pipeline. However, the map constructed by the ORB-SLAM algorithm cannot provide semantic information about objects in the surrounding environment, so the image information acquired by the visual sensor is not fully utilized. The ORB-SLAM algorithm therefore only solves the autonomous navigation problem of the unmanned ship; it cannot intelligently recognize the objects photographed by the visual sensor during sailing.
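For illustration, the sparse ORB feature extraction at the heart of ORB-SLAM can be reproduced with OpenCV. This is a minimal sketch assuming a single grayscale camera frame (the file name is a placeholder), not the patent's implementation:

```python
import cv2

# One frame from the ship's camera (path is a placeholder).
frame = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)

# ORB = Oriented FAST keypoints + Rotated BRIEF binary descriptors.
orb = cv2.ORB_create(nfeatures=1000)
keypoints, descriptors = orb.detectAndCompute(frame, None)

print(f"extracted {len(keypoints)} ORB features")
```

These keypoint/descriptor pairs are what the tracking thread matches from frame to frame.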
The Mask R-CNN algorithm is a highly representative deep-learning-based instance segmentation algorithm of recent years and can be applied to target detection, instance segmentation, target keypoint detection, and other fields. However, a large number of objects have complex and irregular boundary shapes, and accurate prediction of those shapes strongly affects the segmentation of the whole instance, so the Mask R-CNN algorithm falls slightly short in the accuracy of object instance boundary recognition.
In summary, although the existing ORB-SLAM method realizes autonomous navigation of the unmanned ship, it cannot identify the types of objects photographed during sailing, and the existing Mask R-CNN algorithm is slightly deficient in accurately recognizing object boundaries. Providing a method and a system by which an unmanned ship can generate a three-dimensional semantic map, and thereby improving the efficiency of tasks such as navigation and exploration, is thus an urgent task in the current unmanned ship technology field.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a method and a system for an unmanned ship to generate a three-dimensional semantic map, solving the technical problem that, in the prior art, object boundaries cannot be accurately identified during the navigation of an unmanned ship.
In order to achieve the purpose, the invention provides a method for generating a three-dimensional semantic map by an unmanned ship, which comprises the following steps:
(1) Acquiring a key frame: continuously acquiring image frames by using a visual sensor in the process of sailing of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the image frame sequence by adopting an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map, simultaneously processing the extracted feature points of the continuous image frames to obtain continuous key frames, and respectively performing the step (2) and the step (3) below;
(2) Key frame instance segmentation step: performing instance segmentation on each key frame in the continuous key frames by using the Mask R-CNN algorithm to obtain the object instances contained in each key frame, performing semantic annotation on the object instances, and turning to step (4);
(3) Updating the three-dimensional point cloud map: adding the continuous key frames into the initialized three-dimensional point cloud map to update the three-dimensional point cloud map, and turning to the step (4);
(4) Feature fusion step: matching the feature points and object instances extracted from each key frame in the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization on the pose estimation result of each key frame by combining the feature point and object instance matching results, so as to obtain the object instance semantic annotation information carried by each key frame in the continuous key frames;
(5) Constructing a three-dimensional semantic map: and associating the object instance semantic annotation information carried by each key frame in the continuous key frames with the coordinates of each key frame on the three-dimensional point cloud map, so as to realize semantic annotation on the three-dimensional point cloud map and obtain the three-dimensional semantic map.
In an optional embodiment, in the step of obtaining the key frame, the feature points of the continuous image frames are extracted and processed to obtain the continuous key frame, and the specific process includes:
extracting feature points of each image frame by adopting the ORB-SLAM algorithm, and matching the feature points extracted from each image frame with the feature points of the previous image frame to obtain a pose estimation result of the visual sensor at the time each image frame was acquired; if matching with the previous image frame fails, the image frame is matched with the previous key frame to obtain its pose estimation result; if matching with the previous key frame also fails, the image frame is globally relocated to obtain its pose estimation result; after the pose estimation result of each image frame is obtained, each image frame is screened to obtain the key frames.
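An illustrative sketch of this frame-to-frame matching and pose recovery (not the patent's code; it assumes a known camera intrinsic matrix K and two grayscale frames):

```python
import cv2
import numpy as np

def estimate_pose(prev_frame, cur_frame, K):
    """Match ORB features between two frames and recover relative pose."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(prev_frame, None)
    kp2, des2 = orb.detectAndCompute(cur_frame, None)

    # Hamming distance is the natural metric for binary BRIEF descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if len(matches) < 8:
        return None  # matching failed: fall back to key frame / relocation

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Essential matrix + decomposition yield the relative pose (up to scale).
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```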
In an optional embodiment, the global relocation specifically includes:
generating a key frame database from the acquired continuous key frames, matching the image frame against all key frames in the key frame database, and determining the pose estimation result of the image frame.
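A hedged sketch of such database matching (a real system would use a bag-of-words index such as DBoW2; brute-force descriptor matching stands in for it here, and the keyframe_db layout is hypothetical):

```python
import cv2

def relocalize(frame_descriptors, keyframe_db):
    """Find the key frame whose ORB descriptors best match a lost frame."""
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    best_kf, best_score = None, 0
    for kf in keyframe_db:
        matches = matcher.match(frame_descriptors, kf.descriptors)
        # Score by the number of good (low-distance) matches.
        score = sum(1 for m in matches if m.distance < 50)
        if score > best_score:
            best_kf, best_score = kf, score
    return best_kf  # the pose is then re-estimated against this key frame
```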
In an optional embodiment, the screening of each image frame to obtain a key frame specifically includes:
selecting the image frames meeting the following conditions as key frames (a sketch of this decision rule follows the list):
the image frames are spaced at least 20 image frames from the last globally repositioned image frame;
the mapping thread is idle, or more than 20 image frames have passed since the last key frame was inserted;
the image frame tracks at least 50 map points;
the number of map points matched by the image frame cannot exceed 90 percent of the number of map points matched by the reference key frame.
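Read as a single predicate, the four conditions above can be sketched as follows (field names are hypothetical; the thresholds are those stated in the list):

```python
def is_key_frame(frame, last_reloc_id, last_kf_id, mapping_idle, ref_kf):
    """Illustrative key-frame test mirroring the four conditions above."""
    if frame.id - last_reloc_id < 20:                               # condition 1
        return False
    if not (mapping_idle or frame.id - last_kf_id > 20):            # condition 2
        return False
    if frame.tracked_map_points < 50:                               # condition 3
        return False
    if frame.matched_map_points > 0.9 * ref_kf.matched_map_points:  # condition 4
        return False
    return True
```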
In an optional embodiment, with respect to the mapping thread being idle:
if the mapping thread is busy when a key frame is to be added, a signal is sent to the mapping thread to stop its current task so that the key frame can be processed in time; the mapping thread is the thread in the ORB-SLAM algorithm used for constructing the navigation map of the unmanned ship.
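A minimal sketch of this interrupt mechanism (illustrative only; ORB-SLAM itself is written in C++ and uses its own stop flags, and the task.steps()/step.run() interface is a hypothetical stand-in):

```python
import threading
import queue

stop_requested = threading.Event()
task_queue = queue.Queue()

def mapping_thread_loop():
    """Local-mapping loop that yields when a new key frame arrives."""
    while True:
        task = task_queue.get()
        for step in task.steps():       # hypothetical task interface
            if stop_requested.is_set():
                break                   # abandon the current mapping task
            step.run()
        stop_requested.clear()

def insert_key_frame(kf):
    stop_requested.set()   # signal the busy mapping thread to yield
    task_queue.put(kf)     # the key frame is processed next
```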
In an optional embodiment, the map points are specifically:
the visual sensor observes the road sign during the navigation of the unmanned ship.
In an optional embodiment, the reference key frame specifically includes:
the reference key frame is the key frame in the key frame database whose feature points have the highest degree of matching with those of the image frame.
In an optional embodiment, the key frame instance segmenting step further includes:
in the process of performing instance segmentation on the key frames with Mask R-CNN, a boundary refinement module is also used to fit the boundaries of the segmented object instances, so as to obtain more accurate object instance boundary contours.
In an optional embodiment, in the step of segmenting the key frame instance, semantic labeling is performed on the key frame instance, specifically:
the semantic annotation consists of the categories of the object instances in the key frame and the relations between the object instances.
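Concretely, such an annotation might be represented as follows (a hypothetical layout; the patent does not fix a data structure):

```python
from dataclasses import dataclass, field

@dataclass
class InstanceAnnotation:
    category: str                  # e.g. "buoy", "ship", "bridge pier"
    instance_id: int               # distinguishes individuals of one class
    # Relations to other instances in the same key frame,
    # e.g. {"left_of": 3} (keys and values are illustrative).
    relations: dict = field(default_factory=dict)
```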
Another object of the present invention is to provide a system for generating a three-dimensional semantic map for an unmanned ship, comprising the following units:
Key frame acquisition unit: continuously acquiring image frames with a visual sensor during the sailing of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the image frame sequence with the ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map, simultaneously processing the extracted feature points of the continuous image frames to obtain continuous key frames, and then passing to the key frame instance segmentation unit and the three-dimensional point cloud map updating unit respectively;
Key frame instance segmentation unit: performing instance segmentation on each key frame in the continuous key frames by using the Mask R-CNN algorithm to obtain the object instances contained in each key frame, performing semantic annotation on the object instances, and then passing to the feature fusion unit;
a three-dimensional point cloud map updating unit: adding the continuous key frames into the initialized three-dimensional point cloud map, updating the three-dimensional point cloud map, and transferring to a feature fusion unit;
Feature fusion unit: matching the feature points and object instances extracted from each key frame in the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization on the pose estimation result of each key frame by combining the feature point and object instance matching results, so as to obtain the object instance semantic annotation information carried by each key frame in the continuous key frames;
constructing a three-dimensional semantic map unit: and associating the object instance semantic annotation information carried by each key frame in the continuous key frames with the coordinates of each key frame on the three-dimensional point cloud map, so as to realize semantic annotation on the three-dimensional point cloud map and obtain the three-dimensional semantic map.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
the invention provides a method and a system for generating a three-dimensional semantic map by an unmanned ship, wherein an image frame acquired by an unmanned ship vision sensor is processed by using a vision SLAM algorithm to obtain a key frame and an initial three-dimensional point cloud map, and the obtained key frame is subjected to instance segmentation by using a Mask R-CNN algorithm to obtain all object instances in the key frame.
On the other hand, matching the feature points extracted from the current key frame and the object example obtained after the example is segmented with the feature points and the object example of the nearest key frame, and performing local nonlinear optimization on the pose estimation result of the key frame by integrating the feature points and the example segmentation matching result to obtain the key frame carrying object example semantic annotation information; and mapping the key frame carrying object semantic annotation information to the coordinates of the key frame on the three-dimensional point cloud map to obtain the three-dimensional semantic map.
The invention combines two algorithms of visual SLAM and Mask R-CNN, improves the Mask R-CNN algorithm, adds a boundary thinning module, realizes autonomous navigation of the unmanned ship in unknown environment or navigation environment with poor communication signals, and can identify the type of the object ahead, thereby greatly improving the task execution effect.
Drawings
FIG. 1 is a flow chart of a method for an unmanned ship to generate a three-dimensional semantic map of the present invention;
FIG. 2 is a Mask R-CNN instance segmentation algorithm model with an added boundary refinement module according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
As shown in fig. 1, the method for generating a three-dimensional semantic map for an unmanned ship of the present invention includes the following steps:
(1) Acquiring a key frame: continuously acquiring image frames by using a visual sensor in the process of sailing of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the image frame sequence by adopting an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map, simultaneously processing the extracted feature points of the continuous image frames to obtain continuous key frames, and respectively performing the step (2) and the step (3) below;
(2) Key frame instance segmentation step: performing instance segmentation on each key frame in the continuous key frames by using the Mask R-CNN algorithm to obtain the object instances contained in each key frame, performing semantic annotation on the object instances, and turning to step (4);
(3) Updating the three-dimensional point cloud map: adding the continuous key frames into the initialized three-dimensional point cloud map to update the three-dimensional point cloud map, and turning to the step (4);
(4) Feature fusion step: matching the feature points and object instances extracted from each key frame in the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization on the pose estimation result of each key frame by combining the feature point and object instance matching results, so as to obtain the object instance semantic annotation information carried by each key frame in the continuous key frames;
(5) Constructing a three-dimensional semantic map: and associating the object instance semantic annotation information carried by each key frame in the continuous key frames with the coordinates of each key frame on the three-dimensional point cloud map, so as to realize semantic annotation on the three-dimensional point cloud map and obtain the three-dimensional semantic map.
Specifically, the method for generating the three-dimensional semantic map by the unmanned ship provided by the invention comprises the following steps of:
(1) Acquiring a key frame:
extracting feature points of each image frame by using the ORB-SLAM algorithm, and matching the feature points extracted from each image frame with the feature points of the nearest key frame to obtain a pose estimation result of the visual sensor at the time each image frame was acquired; if matching with the nearest key frame fails, global relocation is needed to obtain the pose estimation result of the current image frame; after the pose estimation result of each image frame is obtained, each image frame is screened to obtain the key frames.
(2) And a key frame example segmentation step.
Instance segmentation is performed on the key frames mainly to acquire all object instances in each key frame. Instance segmentation is a technique combining semantic segmentation and target detection: compared with the bounding box of target detection, instance segmentation obtains more accurate object edges; compared with semantic segmentation, instance segmentation distinguishes different individuals of the same object class.
The instance segmentation algorithm adopted by the embodiment of the invention is Mask R-CNN. The Mask R-CNN algorithm performs instance segmentation while performing target detection and achieves excellent results. The Mask R-CNN pipeline comprises two stages: the first scans the key frame and generates proposals, i.e., regions that may contain a target; the second classifies the proposals and generates bounding boxes and masks.
The specific flow is shown in fig. 2: the continuous key frames are input into a backbone network (Backbone) for feature extraction to generate a feature map; the obtained feature map is input into a Feature Pyramid Network (FPN), which makes top-down lateral connections between the low-resolution, semantically strong high-level features and the high-resolution, semantically weak low-level features, so that the features at all scales carry rich semantic information; the FPN-processed feature map is input into an RPN (Region Proposal Network), a lightweight neural network that scans the image with a sliding window and searches for regions containing targets; the candidate regions generated by the RPN undergo a RoIAlign operation, each RoI producing a fixed-size feature map through the RoIAlign layer; these processed feature maps then feed two branches: a classifier branch for generating the regression box and predicted category, and a Mask branch for generating the mask.
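For reference, the standard (unmodified) Mask R-CNN pipeline just described is available off the shelf. A minimal inference sketch with torchvision (version 0.13 or later) follows; it is not the patent's implementation, which additionally adds a boundary refinement module, and the random tensor stands in for a real key frame:

```python
import torch
import torchvision

# Pretrained Mask R-CNN with a ResNet-50 + FPN backbone, as described above.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# A key frame as a CHW float tensor in [0, 1] (placeholder data here).
key_frame = torch.rand(3, 480, 640)
with torch.no_grad():
    predictions = model([key_frame])[0]

# Each detection carries a box, a class label, a score, and a soft mask.
boxes, labels = predictions["boxes"], predictions["labels"]
masks = predictions["masks"]  # shape (N, 1, H, W), values in [0, 1]
```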
In general, objects have complex and irregular boundary shapes, and accurate prediction of the boundary shape strongly affects the accuracy with which the unmanned ship intelligently recognizes the objects ahead. Therefore, in order to improve the segmentation accuracy at the edges, in the embodiment of the invention a boundary refinement module (BR for short) is added after the Mask branch; the module optimizes the generated mask through a residual structure to further fit the instance boundary, so that the localization capability at the boundary is greatly improved.
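The patent does not spell out the module's exact layers; the following PyTorch sketch shows a generic residual boundary-refinement block of the kind described (a plausible reading, not the patented design):

```python
import torch.nn as nn

class BoundaryRefinement(nn.Module):
    """Residual block that refines a predicted mask near object boundaries."""

    def __init__(self, channels: int):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, mask_features):
        # The identity path keeps the coarse mask; the residual path learns
        # a correction that is largest along object boundaries.
        return mask_features + self.residual(mask_features)
```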
In the embodiment of the present invention, a joint training loss function is used to optimize the mask. The joint loss $L$ is

$L = L_{cls} + L_{box} + L_{mask}$

where $L_{cls}$ is the cross entropy of the target-category prediction for the region of interest in the target detection task, representing the classification error; $L_{box}$ is the bounding-box correction (regression) loss for the region of interest in target detection, representing the detection error; and $L_{mask}$ is the sum of the pixel-by-pixel cross entropies between the predicted mask and the ground-truth mask in the semantic segmentation task, representing the segmentation error.
Here $L_{mask}$ is the average binary cross-entropy loss, which can be written as

$L_{mask} = -\frac{1}{N}\sum_{i=1}^{N}\left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right]$

where $y$ denotes the binarized, correctly labeled data and $\hat{y}$ the binarized predicted segmentation result. The weakness of the cross-entropy loss in the segmentation task is that it depends only on region information, so the prediction of the boundary is neglected and the boundary-segmentation accuracy of the final result is low.
In the embodiment of the invention, a boundary weighted loss (BWL) term is added to $L_{mask}$ to optimize the generated mask. The optimized loss $L_{mask\text{-}bwl}$ takes the form

$L_{mask\text{-}bwl} = L_{mask} + \alpha L_{dis}$

where $L_{dis}$ is the boundary weighted loss, $\alpha$ is the weighting coefficient, $B$ denotes the boundary of the segmentation result, $R$ the entire segmentation region, and $M_{dist}$ the distance map obtained from the distance transform of the correctly labeled segmentation boundary. This is equivalent to increasing the weight of the boundary loss within $L_{mask}$, so the segmentation result becomes more accurate.
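As an illustration of the idea only (the patent defines $L_{dis}$ over $B$, $R$, and $M_{dist}$; its exact formula appears in the original figures), a distance-transform-weighted mask loss might be sketched as:

```python
import torch
import torch.nn.functional as F
from scipy.ndimage import distance_transform_edt

def boundary_weighted_mask_loss(pred, target, alpha=1.0):
    """BCE mask loss plus a boundary term weighted by a distance map.

    pred:   (H, W) float tensor of predicted mask probabilities in [0, 1]
    target: (H, W) float tensor, binary ground-truth mask
    """
    bce = F.binary_cross_entropy(pred, target)

    # Distance transform of the ground-truth mask: small values near the
    # object boundary, large values deep inside the object.
    dist = torch.from_numpy(distance_transform_edt(target.numpy())).float()
    boundary_weight = torch.exp(-dist)  # emphasize pixels near the boundary

    per_pixel = F.binary_cross_entropy(pred, target, reduction="none")
    l_dis = (boundary_weight * per_pixel).mean()
    return bce + alpha * l_dis
```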
The object instances in the key frame produced after the boundary refinement module are combined with the category information from the classifier branch to give the output of the Mask R-CNN algorithm, namely the object instances in the key frame together with their semantic annotation information. After the Mask R-CNN processing of the continuous key frames is finished, go to step (4).
(3) And updating the three-dimensional point cloud map.
Adding the continuous key frames into the initialized three-dimensional point cloud map to update the three-dimensional point cloud map, and turning to the step (4);
(4) Feature fusion step.
Matching the feature points and object instances extracted from each key frame in the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization on the pose estimation result of each key frame by combining the feature point and object instance matching results, so as to obtain the object instance semantic annotation information carried by each key frame in the continuous key frames;
(5) Three-dimensional semantic map construction step.
And associating object instance semantic annotation information carried by each key frame in the continuous key frames with the coordinates of each key frame on the three-dimensional point cloud map to realize semantic annotation of the three-dimensional point cloud map and obtain the three-dimensional semantic map.
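One way to picture this association (the data layout is hypothetical; the patent does not specify one): each key frame's instance labels are attached to the 3D map points that the key frame observes.

```python
def annotate_point_cloud(keyframes, point_cloud):
    """Attach each key frame's instance labels to its 3D map points."""
    for kf in keyframes:
        for instance in kf.instances:        # Mask R-CNN + BR output
            for u, v in instance.pixels:     # pixels inside the instance mask
                point_id = kf.pixel_to_map_point.get((u, v))
                if point_id is not None:
                    # The map point inherits the instance's semantic label.
                    point_cloud[point_id].label = instance.category
    return point_cloud
```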
Specifically, the system for generating the three-dimensional semantic map by the unmanned ship comprises the following units:
Key frame acquisition unit: continuously acquiring image frames with a visual sensor during the sailing of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the image frame sequence with the ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map, simultaneously processing the extracted feature points of the continuous image frames to obtain continuous key frames, and then passing to the key frame instance segmentation unit and the three-dimensional point cloud map updating unit respectively;
Key frame instance segmentation unit: performing instance segmentation on each key frame in the continuous key frames by using the Mask R-CNN algorithm to obtain the object instances contained in each key frame, performing semantic annotation on the object instances, and then passing to the feature fusion unit;
a three-dimensional point cloud map updating unit: adding the continuous key frames into the initialized three-dimensional point cloud map, updating the three-dimensional point cloud map, and transferring to a feature fusion unit;
a feature fusion unit: matching the feature points extracted from each key frame in the continuous key frames and the obtained object examples with the feature points and the object examples of the previous key frame, and performing local nonlinear optimization on the pose estimation result of each key frame by integrating the feature points and the object example matching results to obtain object example semantic labeling information carried by each key frame in the continuous key frames;
constructing a three-dimensional semantic map unit: and associating the object instance semantic annotation information carried by each key frame in the continuous key frames with the coordinates of each key frame on the three-dimensional point cloud map, so as to realize semantic annotation on the three-dimensional point cloud map and obtain the three-dimensional semantic map.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for generating a three-dimensional semantic map by an unmanned ship is characterized by comprising the following steps:
(1) Acquiring a key frame: continuously acquiring image frames by using a visual sensor in the process of sailing of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the image frame sequence by adopting an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map, simultaneously processing the extracted feature points of the continuous image frames to obtain continuous key frames, and respectively performing the step (2) and the step (3) below;
(2) Key frame instance segmentation step: performing instance segmentation on each key frame in the continuous key frames by using a Mask R-CNN algorithm to obtain object instances contained in each key frame in the continuous key frames, performing semantic annotation on the object instances, and turning to step (4);
(3) Updating the three-dimensional point cloud map: adding the continuous key frames into the initialized three-dimensional point cloud map to update the three-dimensional point cloud map, and turning to the step (4);
(4) Feature fusion step: matching the feature points and object instances extracted from each key frame in the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization on the pose estimation result of each key frame by combining the feature point and object instance matching results, so as to obtain the object instance semantic annotation information carried by each key frame in the continuous key frames;
(5) Constructing a three-dimensional semantic map: and associating the object instance semantic annotation information carried by each key frame in the continuous key frames with the coordinates of each key frame on the three-dimensional point cloud map, so as to realize semantic annotation on the three-dimensional point cloud map and obtain the three-dimensional semantic map.
2. The method for generating the three-dimensional semantic map by the unmanned ship as claimed in claim 1, wherein the step of acquiring the keyframes comprises processing feature points of the extracted continuous image frames to obtain continuous keyframes, and the specific process comprises:
extracting feature points of each image frame by adopting an ORB-SLAM algorithm, and matching the feature points extracted from each image frame with the feature points of the previous image frame to obtain a pose estimation result of the visual sensor when each image frame is acquired; if the matching with the previous image frame fails, the image frame needs to be matched with the previous key frame to obtain a pose estimation result of the image frame, if the matching with the previous key frame fails, the image frame needs to be subjected to global repositioning to obtain a pose estimation result of the image frame, and after the pose estimation result of each image frame is obtained, each image frame is screened to obtain a key frame.
3. The method for unmanned ship to generate three-dimensional semantic map according to claim 2, characterized in that the global relocation is specifically:
and generating a key frame database by the acquired continuous key frames, matching the image frames with all the key frames in the key frame database, and determining the pose estimation result of the image frames.
4. The method for generating the three-dimensional semantic map by the unmanned ship as claimed in claim 2, wherein the filtering of each image frame to obtain the key frame specifically comprises:
selecting the image frames meeting the following conditions as key frames:
the image frames are spaced at least 20 image frames from the last globally repositioned image frame;
the mapping thread is idle, or more than 20 image frames have passed since the last key frame was inserted;
the image frame tracks at least 50 map points;
the number of map points matched by the image frame cannot exceed 90 percent of the number of map points matched by the reference key frame.
5. The method for an unmanned ship to generate a three-dimensional semantic map of claim 4, wherein the mapping thread being idle further comprises:
if the mapping thread is busy when a key frame is to be added, sending a signal to the mapping thread to stop its current task so that the key frame can be processed in time, wherein the mapping thread is the thread in the ORB-SLAM algorithm used for constructing the navigation map of the unmanned ship.
6. The method for unmanned ship to generate three-dimensional semantic map according to claim 4, wherein the map points are specifically:
the visual sensor observes the road sign during the navigation of the unmanned ship.
7. The method for unmanned ship to generate three-dimensional semantic map according to claim 4, wherein the reference key frame is specifically:
the reference key frame is a key frame which is in a key frame database and has the highest matching degree with the feature points of the image frame.
8. The method for unmanned ship to generate a three-dimensional semantic map of claim 1, wherein the keyframe instance segmentation step further comprises:
in the process of performing instance segmentation on the key frames with Mask R-CNN, a boundary refinement module is also used to fit the boundaries of the segmented object instances, so as to obtain more accurate object instance boundary contours.
9. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 1, wherein the key frame instance is segmented and semantically labeled in the key frame instance segmentation step, specifically:
the semantic labels are the categories of the object instances in the key frame and the relationship between them.
10. A system for generating a three-dimensional semantic map for an unmanned ship is characterized by comprising the following units:
acquiring a key frame unit: continuously acquiring image frames by using a visual sensor in the process of sailing of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the image frame sequence by adopting an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map, simultaneously processing the feature points of the extracted continuous image frames to obtain continuous key frames, and respectively switching to a key frame example segmentation unit and a three-dimensional point cloud map updating unit;
Key frame instance segmentation unit: performing instance segmentation on each key frame in the continuous key frames by using a Mask R-CNN algorithm to obtain object instances contained in each key frame in the continuous key frames, performing semantic annotation on the object instances, and then passing to the feature fusion unit;
a three-dimensional point cloud map updating unit: adding the continuous key frames into the initialized three-dimensional point cloud map to update the three-dimensional point cloud map, and transferring to a feature fusion unit;
A feature fusion unit: matching the feature points and object instances extracted from each key frame in the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization on the pose estimation result of each key frame by combining the feature point and object instance matching results, so as to obtain the object instance semantic annotation information carried by each key frame in the continuous key frames;
constructing a three-dimensional semantic map unit: and associating the object instance semantic annotation information carried by each key frame in the continuous key frames with the coordinates of each key frame on the three-dimensional point cloud map, so as to realize semantic annotation on the three-dimensional point cloud map and obtain the three-dimensional semantic map.
CN202111559071.1A 2021-12-20 2021-12-20 Method and system for generating three-dimensional semantic map for unmanned ship Active CN114359493B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111559071.1A CN114359493B (en) 2021-12-20 2021-12-20 Method and system for generating three-dimensional semantic map for unmanned ship


Publications (2)

Publication Number Publication Date
CN114359493A CN114359493A (en) 2022-04-15
CN114359493B true CN114359493B (en) 2023-01-03

Family

ID=81100951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111559071.1A Active CN114359493B (en) 2021-12-20 2021-12-20 Method and system for generating three-dimensional semantic map for unmanned ship

Country Status (1)

Country Link
CN (1) CN114359493B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115630185B (en) * 2022-09-23 2024-02-02 深圳市云洲创新科技有限公司 Repositioning method, water surface aircraft and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match
CN110243370A (en) * 2019-05-16 2019-09-17 西安理工大学 A kind of three-dimensional semantic map constructing method of the indoor environment based on deep learning
CN111462135B (en) * 2020-03-31 2023-04-21 华东理工大学 Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation
CN111563442B (en) * 2020-04-29 2023-05-02 上海交通大学 Slam method and system for fusing point cloud and camera image data based on laser radar
CN112288857A (en) * 2020-10-30 2021-01-29 西安工程大学 Robot semantic map object recognition method based on deep learning

Also Published As

Publication number Publication date
CN114359493A (en) 2022-04-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant