CN114359493A - Method and system for generating a three-dimensional semantic map for an unmanned ship

Method and system for generating a three-dimensional semantic map for an unmanned ship

Info

Publication number
CN114359493A
CN114359493A (application CN202111559071.1A)
Authority
CN
China
Prior art keywords
key frame
key
frames
map
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111559071.1A
Other languages
Chinese (zh)
Other versions
CN114359493B (en)
Inventor
李波
陈小乔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
709th Research Institute of CSIC
Original Assignee
709th Research Institute of CSIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 709th Research Institute of CSIC filed Critical 709th Research Institute of CSIC
Priority to CN202111559071.1A priority Critical patent/CN114359493B/en
Publication of CN114359493A publication Critical patent/CN114359493A/en
Application granted granted Critical
Publication of CN114359493B publication Critical patent/CN114359493B/en
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method and a system for generating a three-dimensional semantic map for an unmanned ship, comprising the following steps: processing the image frames acquired by the unmanned ship during navigation with the ORB-SLAM algorithm to obtain key frames and an initialized three-dimensional point cloud map; performing instance segmentation on the obtained key frames with the Mask R-CNN algorithm to obtain all object instances in each key frame, and semantically annotating those object instances; adding the obtained key frames to the initialized three-dimensional point cloud map to update it; matching the feature points and object instances extracted from each key frame against those of the nearest key frame to obtain key frames carrying object instance semantic annotation information; and mapping the semantic annotation information carried by each key frame to that key frame's coordinates on the three-dimensional point cloud map to obtain the three-dimensional semantic map. The invention improves the accuracy with which the unmanned ship recognizes the boundaries of photographed objects.

Description

Method and system for generating three-dimensional semantic map for unmanned ship
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method and a system for generating a three-dimensional semantic map for an unmanned ship.
Background
In recent years, the market for unmanned ships has been growing, and autonomous navigation, as the guarantee of safe operation for unmanned ships, is evolving along with the technology. A key issue for practical unmanned-ship applications is achieving autonomous positioning and environmental awareness. In many unmanned-ship scenarios, such as path planning, environmental awareness, and obstacle avoidance, localization and mapping are regarded as prerequisites. With a camera as the visual sensor, a visual SLAM (Simultaneous Localization and Mapping) algorithm can build a map of the environment during motion without prior environmental information while simultaneously performing autonomous positioning.
The ORB-SLAM algorithm is a monocular visual SLAM algorithm based entirely on sparse feature points; it comprises visual odometry, tracking, and loop-closure detection, and its core is the use of ORB (Oriented FAST and Rotated BRIEF) features throughout the whole visual SLAM pipeline. However, the map constructed by the ORB-SLAM algorithm cannot provide semantic information about objects in the surrounding environment, so the image information acquired by the visual sensor is not fully utilized. The ORB-SLAM algorithm can therefore only address the autonomous navigation of the unmanned ship; it cannot intelligently recognize the objects photographed by the visual sensor during navigation.
The Mask R-CNN algorithm is a highly representative deep-learning-based instance segmentation algorithm of recent years, applicable to target detection, instance segmentation, target keypoint detection, and related fields. Since many objects have complex and irregular boundary shapes, and accurate prediction of those shapes strongly affects the segmentation of the whole instance, the Mask R-CNN algorithm falls somewhat short in the accuracy of object instance boundary recognition.
In summary, the existing ORB-SLAM method can provide autonomous navigation for the unmanned ship but cannot identify the types of objects photographed during navigation, while the existing Mask R-CNN algorithm is slightly deficient in accurately recognizing object boundaries. Providing a method and a system by which an unmanned ship can generate a three-dimensional semantic map, thereby improving its efficiency in tasks such as navigation and exploration, is therefore an urgent need in the unmanned-ship field.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a method and a system for generating a three-dimensional semantic map for an unmanned ship, solving the technical problem that, in the prior art, object boundaries cannot be accurately identified during the navigation of the unmanned ship.
In order to achieve the above object, the present invention provides a method for generating a three-dimensional semantic map for an unmanned ship, comprising the following steps:
(1) a key frame acquisition step: continuously acquiring image frames with a visual sensor during the navigation of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the sequence with an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map while extracting and processing the feature points of the continuous image frames to obtain continuous key frames, and then proceeding to step (2) and step (3) respectively;
(2) a key frame instance segmentation step: performing instance segmentation on each of the continuous key frames with a Mask R-CNN algorithm to obtain the object instances contained in each key frame, semantically annotating the object instances, and proceeding to step (4);
(3) a three-dimensional point cloud map updating step: adding the continuous key frames to the initialized three-dimensional point cloud map to update it, and proceeding to step (4);
(4) a feature fusion step: matching the feature points and object instances extracted from each of the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization of each key frame's pose estimate by combining the feature-point and object-instance matching results, so that each of the continuous key frames carries object instance semantic annotation information;
(5) a three-dimensional semantic map construction step: associating the object instance semantic annotation information carried by each of the continuous key frames with that key frame's coordinates on the three-dimensional point cloud map, thereby semantically annotating the three-dimensional point cloud map and obtaining the three-dimensional semantic map.
In an optional embodiment, in the key frame acquisition step, the feature points of the continuous image frames are extracted and processed to obtain the continuous key frames as follows:
the ORB-SLAM algorithm extracts the feature points of each image frame and matches them against the feature points of the previous image frame to estimate the pose of the visual sensor at the moment each image frame was acquired; if matching against the previous image frame fails, the image frame is matched against the previous key frame to obtain its pose estimate; if matching against the previous key frame also fails, the image frame undergoes global relocalization to obtain its pose estimate; after the pose estimate of every image frame is obtained, the image frames are screened to select key frames.
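For illustration only, this fallback chain can be sketched in Python with OpenCV's ORB descriptor matcher. This is a minimal sketch, not the patented implementation: the precomputed descriptor inputs, the key frame database layout, and the min_matches threshold are all assumptions, and a real tracker would additionally solve for the pose from the surviving matches.

    import cv2

    # Descriptors are assumed to come from cv2.ORB_create(...).detectAndCompute
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    def try_match(des_frame, des_ref, min_matches=30):
        # Stand-in for "matching succeeded": enough ORB descriptor matches
        # to support a pose solve (the threshold is an assumption).
        if des_frame is None or des_ref is None:
            return None
        matches = matcher.match(des_frame, des_ref)
        return matches if len(matches) >= min_matches else None

    def track(des_frame, des_prev_frame, des_prev_kf, keyframe_db):
        # 1) match against the previous image frame
        result = try_match(des_frame, des_prev_frame)
        # 2) on failure, match against the previous key frame
        if result is None:
            result = try_match(des_frame, des_prev_kf)
        # 3) on failure again, global relocalization: search the whole
        #    key frame database
        if result is None:
            for des_kf in keyframe_db:
                result = try_match(des_frame, des_kf)
                if result is not None:
                    break
        return result  # None means tracking is lost for this frame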
In an optional embodiment, the global relocalization specifically comprises:
building a key frame database from the key frames acquired so far, matching the image frame against all key frames in the key frame database, and determining the pose estimate of the image frame.
In an optional embodiment, the screening of each image frame to obtain key frames specifically comprises selecting as key frames the image frames that satisfy all of the following conditions (combined in the code sketch after this list):
the image frame is at least 20 frames after the last globally relocalized image frame;
the mapping thread is idle, or at least 20 frames have passed since the last key frame was inserted;
the image frame tracks at least 50 map points;
the number of map points matched by the image frame does not exceed 90% of the number matched by the reference key frame.
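Read together, the four conditions form a single predicate. A minimal sketch, with all frame counters assumed to be maintained by the tracking thread and the thresholds taken from the list above:

    def is_keyframe(frame_id, last_reloc_id, last_kf_id,
                    tracked_points, ref_kf_points, mapping_idle):
        # 1) at least 20 frames since the last global relocalization
        c1 = frame_id - last_reloc_id >= 20
        # 2) mapping thread idle, or at least 20 frames since the last key frame
        c2 = mapping_idle or (frame_id - last_kf_id >= 20)
        # 3) the frame tracks at least 50 map points
        c3 = tracked_points >= 50
        # 4) it matches fewer than 90% of the reference key frame's map points
        c4 = tracked_points < 0.9 * ref_kf_points
        return c1 and c2 and c3 and c4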
In an optional embodiment, regarding the mapping thread being idle:
if the mapping thread is busy when a key frame is to be added, a signal is sent to the mapping thread to stop its current task so that the new key frame can be processed in time; the mapping thread is the thread in the ORB-SLAM algorithm used to build the navigation map of the unmanned ship.
In an optional embodiment, the map points are specifically:
landmarks observed by the visual sensor during the navigation of the unmanned ship.
In an optional embodiment, the reference key frame is specifically:
the key frame in the key frame database whose feature points match those of the image frame to the highest degree.
In an optional embodiment, the key frame instance segmentation step further comprises:
while performing instance segmentation on the key frames with Mask R-CNN, using a boundary refinement module to fit the boundaries of the segmented object instances, thereby obtaining more accurate object instance boundary contours.
In an optional embodiment, in the key frame instance segmentation step, the semantic annotation of the object instances specifically comprises:
the categories of the object instances in the key frame and the relationships between them.
Another object of the present invention is to provide a system for generating a three-dimensional semantic map for an unmanned ship, comprising the following units:
a key frame acquisition unit: continuously acquiring image frames with a visual sensor during the navigation of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the sequence with an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map while extracting and processing the feature points of the continuous image frames to obtain continuous key frames, and then handing off to the key frame instance segmentation unit and the three-dimensional point cloud map updating unit respectively;
a key frame instance segmentation unit: performing instance segmentation on each of the continuous key frames with a Mask R-CNN algorithm to obtain the object instances contained in each key frame, semantically annotating the object instances, and handing off to the feature fusion unit;
a three-dimensional point cloud map updating unit: adding the continuous key frames to the initialized three-dimensional point cloud map to update it, and handing off to the feature fusion unit;
a feature fusion unit: matching the feature points and object instances extracted from each of the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization of each key frame's pose estimate by combining the feature-point and object-instance matching results, so that each of the continuous key frames carries object instance semantic annotation information;
a three-dimensional semantic map construction unit: associating the object instance semantic annotation information carried by each of the continuous key frames with that key frame's coordinates on the three-dimensional point cloud map, thereby semantically annotating the three-dimensional point cloud map and obtaining the three-dimensional semantic map.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
The invention provides a method and a system for generating a three-dimensional semantic map for an unmanned ship: a visual SLAM algorithm processes the image frames acquired by the unmanned ship's visual sensor to obtain key frames and an initial three-dimensional point cloud map, and the Mask R-CNN algorithm performs instance segmentation on the obtained key frames to obtain all object instances in each key frame.
On the other hand, the feature points and object instances extracted from the current key frame are matched against the feature points and object instances of the nearest key frame, and the key frame's pose estimate is locally and nonlinearly optimized by combining the feature-point and instance-segmentation matching results, yielding key frames that carry object instance semantic annotation information; mapping this semantic annotation information to each key frame's coordinates on the three-dimensional point cloud map then yields the three-dimensional semantic map.
The invention combines the visual SLAM and Mask R-CNN algorithms and improves the Mask R-CNN algorithm by adding a boundary refinement module; it enables autonomous navigation of the unmanned ship in unknown environments or in navigation environments with poor communication signals while identifying the types of objects ahead, thereby greatly improving the effectiveness of task execution.
Drawings
FIG. 1 is a flow chart of the method by which the unmanned ship generates a three-dimensional semantic map according to the present invention;
FIG. 2 is the Mask R-CNN instance segmentation algorithm model with the added boundary refinement module according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, the method for generating a three-dimensional semantic map for an unmanned ship of the present invention includes the following steps:
(1) a key frame acquisition step: continuously acquiring image frames with a visual sensor during the navigation of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the sequence with an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map while extracting and processing the feature points of the continuous image frames to obtain continuous key frames, and then proceeding to step (2) and step (3) respectively;
(2) a key frame instance segmentation step: performing instance segmentation on each of the continuous key frames with a Mask R-CNN algorithm to obtain the object instances contained in each key frame, semantically annotating the object instances, and proceeding to step (4);
(3) a three-dimensional point cloud map updating step: adding the continuous key frames to the initialized three-dimensional point cloud map to update it, and proceeding to step (4);
(4) a feature fusion step: matching the feature points and object instances extracted from each of the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization of each key frame's pose estimate by combining the feature-point and object-instance matching results, so that each of the continuous key frames carries object instance semantic annotation information;
(5) a three-dimensional semantic map construction step: associating the object instance semantic annotation information carried by each of the continuous key frames with that key frame's coordinates on the three-dimensional point cloud map, thereby semantically annotating the three-dimensional point cloud map and obtaining the three-dimensional semantic map.
Specifically, the method for generating a three-dimensional semantic map for an unmanned ship provided by the invention comprises the following detailed technical scheme:
(1) Key frame acquisition step:
The ORB-SLAM algorithm extracts the feature points of each image frame and matches them against the feature points of the nearest key frame to estimate the pose of the visual sensor at the moment each image frame was acquired; if matching against the nearest key frame fails, global relocalization is performed to obtain the current image frame's pose estimate; after the pose estimate of every image frame is obtained, the image frames are screened to select key frames.
(2) Key frame instance segmentation step:
Instance segmentation is performed on the key frames mainly to obtain all object instances in each key frame. Instance segmentation is a technique combining semantic segmentation and target detection: compared with the bounding boxes produced by target detection, instance segmentation yields more accurate object edges; compared with semantic segmentation, it can distinguish different individuals of the same object class.
The instance segmentation algorithm adopted in the embodiment of the invention is Mask R-CNN, which performs instance segmentation alongside target detection and achieves excellent results. The Mask R-CNN pipeline comprises two stages: the first stage scans the key frame and generates proposals, i.e., regions that may contain a target; the second stage classifies the proposals and generates bounding boxes and masks.
The specific flow is shown in FIG. 2: the continuous key frames are input into a backbone network for feature extraction, producing feature maps; the feature maps are fed into a Feature Pyramid Network (FPN), which makes top-down lateral connections between the low-resolution, semantically strong high-level features and the high-resolution, semantically weak low-level features, so that the features at every scale carry rich semantic information; the FPN-processed feature maps are fed into a Region Proposal Network (RPN), a lightweight neural network that scans the image with a sliding window and searches for regions containing targets; each candidate region generated by the RPN undergoes a RoIAlign operation, with each RoI producing a fixed-size feature map through the RoIAlign layer; these processed feature maps then feed two branches: a classifier branch that generates the regression box and predicted category, and a mask branch that generates the mask.
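The same backbone, FPN, RPN, RoIAlign, and two-branch stage layout is available off the shelf. As a hedged illustration only, the sketch below uses a generic pretrained torchvision model (assuming torchvision >= 0.13), not the network trained in this embodiment:

    import torch
    import torchvision

    # Mask R-CNN with a ResNet-50 backbone + FPN: the stage layout described
    # above (backbone, FPN, RPN, RoIAlign, class/box head, mask head).
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    keyframe = torch.rand(3, 480, 640)        # stand-in for a key frame image
    with torch.no_grad():
        out = model([keyframe])[0]            # one result dict per input image
    boxes = out["boxes"]                      # regression boxes
    labels = out["labels"]                    # predicted categories
    masks = out["masks"] > 0.5                # per-instance binary masks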
In general, objects have complex and irregular boundary shapes, and accurate prediction of those shapes strongly affects the accuracy with which the unmanned ship intelligently recognizes objects ahead. Therefore, to improve edge segmentation accuracy, the embodiment of the present invention adds a boundary refinement module (BR for short) after the mask branch; the module optimizes the generated mask through a residual structure to better fit the instance boundary, greatly improving localization ability at the boundary.
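One common realization of such a residual refinement block is sketched below; the patent does not specify the module's internals, so the 3x3 convolutions and the channel width are assumptions:

    import torch.nn as nn

    class BoundaryRefinement(nn.Module):
        # Residual boundary refinement: the refined output is the raw mask
        # features plus a small convolutional correction, so the block only
        # needs to learn boundary residuals.
        def __init__(self, channels=256):        # channel width is assumed
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
            self.relu = nn.ReLU(inplace=True)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

        def forward(self, mask_feat):
            residual = self.conv2(self.relu(self.conv1(mask_feat)))
            return mask_feat + residual           # the residual structure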
In the embodiment of the present invention, a joint training loss function is used to optimize the mask. The joint loss L is:
L = L_cls + L_box + L_mask
where L_cls is the cross entropy of the target-category prediction for the region of interest in the target detection task, representing the classification error; L_box is the bounding-box regression loss for the region of interest in target detection, representing the detection error; and L_mask is the sum of the pixel-wise cross entropies between the predicted mask and the ground-truth mask in the segmentation task, representing the segmentation error.
L_mask is the average binary cross-entropy loss; for an m x m mask it can be expressed as:
L_mask = -(1/m^2) * sum over 1 <= i,j <= m of [ y_ij * log(y'_ij) + (1 - y_ij) * log(1 - y'_ij) ]
where y_ij is the binarized ground-truth label and y'_ij is the binarized predicted segmentation result. However, the cross-entropy loss in the segmentation task depends on region information, so it neglects the prediction of the boundary, and the boundary accuracy of the final segmentation result is therefore not high.
In the embodiment of the invention, a BWL (boundary weighted loss) term is added to L_mask to optimize the generated mask. The optimized loss L_mask-bwl is:
L_mask-bwl = L_mask + alpha * L_dis
where L_dis is the boundary weighted loss term, alpha is the weight coefficient, B is the boundary of the segmentation result, R is the entire segmentation region, and M_dist is the distance map obtained from the distance transform of the ground-truth segmentation bounding box. This is equivalent to increasing the weight of the boundary loss within L_mask, making the segmentation result more accurate.
The object instances in the key frame processed by the boundary refinement module are combined with the output of the classifier branch to obtain object category information, yielding the result of the Mask R-CNN algorithm: the object instances in the key frame together with their semantic annotation information. Once the Mask R-CNN processing of the continuous key frames is complete, the method proceeds to step (4).
(3) Three-dimensional point cloud map updating step:
The continuous key frames are added to the initialized three-dimensional point cloud map to update it, and the method proceeds to step (4).
(4) Feature fusion step:
The feature points and object instances extracted from each of the continuous key frames are matched against the feature points and object instances of the previous key frame, and each key frame's pose estimate is locally and nonlinearly optimized by combining the feature-point and object-instance matching results, so that each of the continuous key frames carries object instance semantic annotation information.
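The matching half of this step can be sketched as follows. The instance masks are assumed to be integer label images aligned with each key frame, which is an assumption about the data layout rather than the patented representation, and the keypoints and descriptors are assumed to be precomputed:

    import cv2

    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    def fuse_matches(kp_cur, des_cur, inst_cur, kp_prev, des_prev, inst_prev):
        matches = bf.match(des_cur, des_prev)
        consistent = []
        for m in matches:
            x1, y1 = map(int, kp_cur[m.queryIdx].pt)   # point in current key frame
            x2, y2 = map(int, kp_prev[m.trainIdx].pt)  # matched point in previous
            # keep only matches whose two feature points fall inside object
            # instances carrying the same semantic label
            if inst_cur[y1, x1] == inst_prev[y2, x2]:
                consistent.append(m)
        return consistent  # these matches feed the local nonlinear optimization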
(5) Three-dimensional semantic map construction step:
The object instance semantic annotation information carried by each of the continuous key frames is associated with that key frame's coordinates on the three-dimensional point cloud map, thereby semantically annotating the three-dimensional point cloud map and obtaining the three-dimensional semantic map.
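A minimal sketch of this association, in which every map point observed in a key frame takes the semantic label of the instance-mask pixel it projects to; the pinhole projection, the instance-id image, and the dictionary from instance id to semantic class are illustrative assumptions:

    import numpy as np

    def label_map_points(points_w, pose_cw, K, inst_mask, inst_to_class):
        # points_w:      (N,3) map points in world coordinates
        # pose_cw:       (4,4) world-to-camera transform of the key frame
        # K:             (3,3) camera intrinsics
        # inst_mask:     HxW integer image of instance ids (0 = background)
        # inst_to_class: dict mapping instance id -> semantic class label
        h, w = inst_mask.shape
        pts_h = np.hstack([points_w, np.ones((len(points_w), 1))])
        pts_c = (pose_cw @ pts_h.T).T[:, :3]           # camera-frame coordinates
        labels = []
        for X, Y, Z in pts_c:
            if Z <= 0:                                 # point behind the camera
                labels.append(None)
                continue
            u = int(round(K[0, 0] * X / Z + K[0, 2]))  # pinhole projection
            v = int(round(K[1, 1] * Y / Z + K[1, 2]))
            if 0 <= u < w and 0 <= v < h:
                labels.append(inst_to_class.get(int(inst_mask[v, u])))
            else:
                labels.append(None)                    # outside the image
        return labels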
Specifically, the system for generating a three-dimensional semantic map for an unmanned ship provided by the invention comprises the following units:
a key frame acquisition unit: continuously acquiring image frames with a visual sensor during the navigation of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the sequence with an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map while extracting and processing the feature points of the continuous image frames to obtain continuous key frames, and then handing off to the key frame instance segmentation unit and the three-dimensional point cloud map updating unit respectively;
a key frame instance segmentation unit: performing instance segmentation on each of the continuous key frames with a Mask R-CNN algorithm to obtain the object instances contained in each key frame, semantically annotating the object instances, and handing off to the feature fusion unit;
a three-dimensional point cloud map updating unit: adding the continuous key frames to the initialized three-dimensional point cloud map to update it, and handing off to the feature fusion unit;
a feature fusion unit: matching the feature points and object instances extracted from each of the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization of each key frame's pose estimate by combining the feature-point and object-instance matching results, so that each of the continuous key frames carries object instance semantic annotation information;
a three-dimensional semantic map construction unit: associating the object instance semantic annotation information carried by each of the continuous key frames with that key frame's coordinates on the three-dimensional point cloud map, thereby semantically annotating the three-dimensional point cloud map and obtaining the three-dimensional semantic map.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for generating a three-dimensional semantic map for an unmanned ship, characterized by comprising the following steps:
(1) a key frame acquisition step: continuously acquiring image frames with a visual sensor during the navigation of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the sequence with an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map while extracting and processing the feature points of the continuous image frames to obtain continuous key frames, and then proceeding to step (2) and step (3) respectively;
(2) a key frame instance segmentation step: performing instance segmentation on each of the continuous key frames with a Mask R-CNN algorithm to obtain the object instances contained in each key frame, semantically annotating the object instances, and proceeding to step (4);
(3) a three-dimensional point cloud map updating step: adding the continuous key frames to the initialized three-dimensional point cloud map to update it, and proceeding to step (4);
(4) a feature fusion step: matching the feature points and object instances extracted from each of the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization of each key frame's pose estimate by combining the feature-point and object-instance matching results, so that each of the continuous key frames carries object instance semantic annotation information;
(5) a three-dimensional semantic map construction step: associating the object instance semantic annotation information carried by each of the continuous key frames with that key frame's coordinates on the three-dimensional point cloud map, thereby semantically annotating the three-dimensional point cloud map and obtaining the three-dimensional semantic map.
2. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 1, wherein in the key frame acquisition step the feature points of the continuous image frames are extracted and processed to obtain the continuous key frames as follows:
extracting the feature points of each image frame with the ORB-SLAM algorithm and matching them against the feature points of the previous image frame to estimate the pose of the visual sensor at the moment each image frame was acquired; if matching against the previous image frame fails, matching the image frame against the previous key frame to obtain its pose estimate; if matching against the previous key frame also fails, performing global relocalization on the image frame to obtain its pose estimate; and, after the pose estimate of every image frame is obtained, screening the image frames to select key frames.
3. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 2, wherein the global relocalization specifically comprises:
building a key frame database from the key frames acquired so far, matching the image frame against all key frames in the key frame database, and determining the pose estimate of the image frame.
4. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 2, wherein screening each image frame to obtain key frames specifically comprises:
selecting as key frames the image frames that satisfy all of the following conditions:
the image frame is at least 20 frames after the last globally relocalized image frame;
the mapping thread is idle, or at least 20 frames have passed since the last key frame was inserted;
the image frame tracks at least 50 map points; and
the number of map points matched by the image frame does not exceed 90% of the number matched by the reference key frame.
5. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 4, wherein, regarding the mapping thread being idle:
if the mapping thread is busy when a key frame is to be added, a signal is sent to the mapping thread to stop its current task so that the key frame can be processed in time, the mapping thread being the thread in the ORB-SLAM algorithm used to build the navigation map of the unmanned ship.
6. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 4, wherein the map points are specifically:
landmarks observed by the visual sensor during the navigation of the unmanned ship.
7. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 4, wherein the reference key frame is specifically:
the key frame in the key frame database whose feature points match those of the image frame to the highest degree.
8. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 1, wherein the key frame instance segmentation step further comprises:
while performing instance segmentation on the key frames with Mask R-CNN, using a boundary refinement module to fit the boundaries of the segmented object instances, thereby obtaining more accurate object instance boundary contours.
9. The method for generating a three-dimensional semantic map for an unmanned ship according to claim 1, wherein in the key frame instance segmentation step the semantic annotation of the object instances specifically comprises:
the categories of the object instances in the key frame and the relationships between them.
10. A system for generating a three-dimensional semantic map for an unmanned ship, characterized by comprising the following units:
a key frame acquisition unit: continuously acquiring image frames with a visual sensor during the navigation of the unmanned ship to obtain an image frame sequence, processing the continuous image frames in the sequence with an ORB-SLAM algorithm to obtain an initialized three-dimensional point cloud map while extracting and processing the feature points of the continuous image frames to obtain continuous key frames, and then handing off to the key frame instance segmentation unit and the three-dimensional point cloud map updating unit respectively;
a key frame instance segmentation unit: performing instance segmentation on each of the continuous key frames with a Mask R-CNN algorithm to obtain the object instances contained in each key frame, semantically annotating the object instances, and handing off to the feature fusion unit;
a three-dimensional point cloud map updating unit: adding the continuous key frames to the initialized three-dimensional point cloud map to update it, and handing off to the feature fusion unit;
a feature fusion unit: matching the feature points and object instances extracted from each of the continuous key frames against the feature points and object instances of the previous key frame, and performing local nonlinear optimization of each key frame's pose estimate by combining the feature-point and object-instance matching results, so that each of the continuous key frames carries object instance semantic annotation information;
a three-dimensional semantic map construction unit: associating the object instance semantic annotation information carried by each of the continuous key frames with that key frame's coordinates on the three-dimensional point cloud map, thereby semantically annotating the three-dimensional point cloud map and obtaining the three-dimensional semantic map.
Application CN202111559071.1A, priority and filing date 2021-12-20: Method and system for generating three-dimensional semantic map for unmanned ship. Status: Active; granted as CN114359493B.

Priority Applications (1)

Application Number: CN202111559071.1A (granted as CN114359493B) · Priority Date: 2021-12-20 · Filing Date: 2021-12-20 · Title: Method and system for generating three-dimensional semantic map for unmanned ship


Publications (2)

Publication Number · Publication Date
CN114359493A · 2022-04-15
CN114359493B · 2023-01-03

Family

ID=81100951

Family Applications (1)

Application Number · Title · Priority Date · Filing Date · Status
CN202111559071.1A · Method and system for generating three-dimensional semantic map for unmanned ship · 2021-12-20 · 2021-12-20 · Active (granted as CN114359493B)

Country Status (1)

Country Link
CN (1) CN114359493B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816686A * 2019-01-15 2019-05-28 山东大学 Robot semantic SLAM method, processor and robot based on object instance matching
CN110243370A * 2019-05-16 2019-09-17 西安理工大学 Three-dimensional semantic map construction method for indoor environments based on deep learning
CN111462135A * 2020-03-31 2020-07-28 华东理工大学 Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation
CN111563442A * 2020-04-29 2020-08-21 上海交通大学 SLAM method and system fusing point cloud and camera image data based on laser radar
CN112288857A * 2020-10-30 2021-01-29 西安工程大学 Robot semantic map object recognition method based on deep learning


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115630185A (en) * 2022-09-23 2023-01-20 深圳市云洲创新科技有限公司 Repositioning method, water surface aircraft and storage medium
CN115630185B (en) * 2022-09-23 2024-02-02 深圳市云洲创新科技有限公司 Repositioning method, water surface aircraft and storage medium

Also Published As

Publication number Publication date
CN114359493B (en) 2023-01-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant