CN114463334A - Inner cavity vision SLAM method based on semantic segmentation

Info

Publication number
CN114463334A
CN114463334A
Authority
CN
China
Prior art keywords
semantic segmentation
feature points
surgical tool
inner cavity
lumen
Prior art date
Legal status
Pending
Application number
CN202111548927.5A
Other languages
Chinese (zh)
Inventor
赵剑波 (Zhao Jianbo)
Current Assignee
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Harbin University of Science and Technology
Priority to CN202111548927.5A
Publication of CN114463334A

Classifications

    • G06T 7/10 Image analysis: Segmentation; Edge detection
    • G06N 3/045 Neural network architectures: Combinations of networks
    • G06N 3/08 Neural networks: Learning methods
    • G06T 2207/10068 Image acquisition modality: Endoscopic image
    • G06T 2207/20016 Special algorithmic details: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T 2207/20081 Special algorithmic details: Training; Learning
    • G06T 2207/20084 Special algorithmic details: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Endoscopes (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a lumen vision SLAM method based on semantic segmentation, comprising the following steps: (1) acquiring an image sequence of the lumen environment through an endoscope and extracting feature points from the image data frame by frame; (2) performing binary semantic segmentation on the lumen images obtained in step (1) with a convolutional neural network to obtain mask information for the surgical tool; (3) combining the preliminarily extracted feature points with the segmentation result to remove the dynamic feature points on the surgical tool; (4) estimating the endoscope pose from the remaining reliable static feature points and completing the three-dimensional mapping of the lumen environment. The method addresses the problems that arise when a SLAM system runs in a lumen scene: moving surgical tools reduce system robustness, introduce pose estimation errors, and corrupt the map.

Description

Inner cavity vision SLAM method based on semantic segmentation
Technical Field:
The invention relates to the technical field of computer vision, and in particular to a lumen vision SLAM method that eliminates dynamic feature points based on semantic segmentation.
Background Art:
Simultaneous localization and mapping (SLAM) addresses the problem of localizing a robot and building a map in an unknown environment, and is a basic module and prerequisite for applications such as autonomous robots and augmented reality. Visual SLAM uses a camera as its sensor and completes the tasks of self-localization and mapping of the surrounding environment. Depending on how data association is performed, there are two main approaches: the feature method, which estimates three-dimensional geometry from a set of matched feature points across co-visible images, and the direct method, which estimates three-dimensional structure directly from pixel intensities without extracting image features.
With growing attention on minimally invasive surgery and medical robots, minimally invasive surgical navigation systems are increasingly combined with computer vision techniques. A visual SLAM algorithm can perform three-dimensional localization and reconstruction of a lesion area from the endoscope image sequence alone, overcoming the relatively incomplete or poor visual feedback of traditional minimally invasive surgery. Traditional feature-based visual SLAM algorithms generally assume that the observed scene is static. In a lumen scene, however, a moving surgical tool may appear in the frame, and feature points on the moving object introduce incorrect matches, which bias the camera pose estimate, degrade the map, and may even cause SLAM tracking to be lost, reducing the robustness of the whole system. In recent years, deep-learning-based image semantic segmentation and object detection algorithms have advanced greatly in efficiency and accuracy. Combining a convolutional neural network with feature-based visual SLAM to identify and segment moving objects, and masking the corresponding image regions to prevent feature matching there, can therefore improve the robustness of the SLAM system and yield more accurate reconstruction results.
Summary of the Invention:
The invention aims to provide a semantic segmentation based lumen vision SLAM method that segments the surgical tool appearing in the endoscope image of a lumen environment, eliminates the dynamic feature points on the surgical tool, and constructs a more accurate map from the static feature points of the background region.
To solve this technical problem, the invention provides a lumen vision SLAM method based on semantic segmentation, comprising the following steps:
Step 1: capture the lumen environment with a monocular endoscope to acquire a lumen image sequence, then feed the sequence into the SLAM system for frame-by-frame feature extraction and descriptor matching;
Step 2: perform semantic segmentation on the image data with a convolutional neural network, detect any moving surgical tool in the segmented image, and compute the corresponding binary mask;
Step 3: check the preliminarily extracted feature point sequence against the binary mask produced by the segmentation network, and reject any preselected feature point that falls within the mask, thereby eliminating erroneous feature points detected on the moving surgical tool;
Step 4: continue tracking with the static feature points of the remaining background regions to perform subsequent pose estimation and build the environment map, realizing a dynamic visual SLAM method for lumen scenes.
Preferably, the feature-based ORB-SLAM2 is adopted in step 1 as the overall SLAM framework. When the endoscope camera acquires an RGB image of the lumen, the image is passed to the SLAM system, and ORB feature points are extracted and matched for each frame by the ORB feature extraction algorithm in the SLAM tracking thread. In the ORB algorithm, the image pyramid has 8 levels, each level is divided into 30 × 30 grids, and corner points are extracted within each grid so that the extracted feature points are uniformly distributed, finally yielding the preselected feature point sequence.
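As a minimal illustration of this step (not part of the original patent text), the preselected feature points could be extracted with OpenCV's ORB implementation; the 8-level pyramid matches the description above, while the per-grid corner extraction of ORB-SLAM2 is approximated here by OpenCV's internal keypoint selection:

```python
import cv2

def extract_orb_features(frame_bgr, n_features=1000):
    """Extract ORB keypoints and descriptors from one endoscope frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # 8 pyramid levels with scale factor 1.2, matching ORB-SLAM2's defaults.
    orb = cv2.ORB_create(nfeatures=n_features, scaleFactor=1.2, nlevels=8)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    return keypoints, descriptors
```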
Preferably, in step 2, the surgical tool moving in the lumen scene is segmented using semantic information and a convolutional neural network; the method introduces the U-net neural network into the ORB-SLAM2 framework to realize semantic segmentation of the surgical tool. U-net is a fully convolutional network with a symmetric encoding-decoding structure, consisting of two main parts: a contracting path and an expanding path. In the contracting path, each pair of 3 × 3 convolutional layers, each followed by a ReLU activation, is followed by a 2 × 2 max pooling layer for downsampling, which doubles the number of feature channels. The expanding path performs upsampling; each step comprises a 2 × 2 up-convolution followed by 3 × 3 convolutional layers, again activated by ReLU, and features from the contracting path are fused in through skip connections. The last layer of the network is a 1 × 1 convolutional layer that converts the feature maps into the binary classification result. The network has 23 convolutional layers in total. The model was trained on a MICCAI dataset collected with the da Vinci surgical system. The model outputs a pixel probability map, from which a prediction mask for the surgical instrument is finally generated for the desired binary segmentation problem.
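For illustration, a compact PyTorch sketch of such a binary-segmentation U-net is given below; it follows the contracting/expanding structure with skip connections described above but uses fewer layers and channels than the full 23-layer network, and all names are illustrative rather than taken from the patent:

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions, each followed by ReLU (padding keeps sizes aligned).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    """Illustrative U-net: contracting path, expanding path, skip connections."""

    def __init__(self, in_ch=3, base=32):
        super().__init__()
        self.enc1 = double_conv(in_ch, base)
        self.enc2 = double_conv(base, base * 2)
        self.enc3 = double_conv(base * 2, base * 4)
        self.pool = nn.MaxPool2d(2)                    # 2x2 max pooling
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
        self.dec2 = double_conv(base * 4, base * 2)    # input doubled by skip concat
        self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = double_conv(base * 2, base)
        self.head = nn.Conv2d(base, 1, 1)              # 1x1 conv -> binary logit

    def forward(self, x):                              # H and W divisible by 4
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))  # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return torch.sigmoid(self.head(d1))            # pixel probability map

# Thresholding the probability map yields the binary surgical-tool mask:
# tool_mask = (MiniUNet()(rgb_batch) > 0.5)
```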
Preferably, in step 3, the RANSAC algorithm is used to eliminate mismatched points among the extracted ORB feature points. RANSAC estimates an optimal 3 × 3 homography matrix H, which has 8 degrees of freedom and is therefore determined by a minimal sample of 4 matching point pairs, through the following relation:

$$
s \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
= H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
= \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$

where (x, y) are the coordinates of a corner in the target image, (x', y') are the coordinates of the corresponding feature point in the image to be matched, and s is a scale parameter.
The RANSAC algorithm randomly draws samples from the matched data set to compute a homography, then tests all the data against this model; if the size of the consensus set under the model meets the requirement, the model is accepted as optimal. The required number of samples is:

$$
\delta = \frac{\log(1 - p)}{\log\left(1 - (1 - \varepsilon)^{m}\right)}
$$

where δ is the number of random samples, p is the probability that at least one of the δ random samples contains no outliers, ε is the ratio of outliers to all data, and m is the number of matches drawn per sample (m = 4 for a homography). The required size of the consensus set is determined by:

$$
T = (1 - \varepsilon)\, n
$$

where n is the total number of matches.
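A small numeric sketch of these two quantities (illustrative, not from the patent):

```python
import math

def ransac_iterations(p=0.99, eps=0.5, m=4):
    """Number of samples delta so that, with probability p, at least one
    sample of m matches is outlier-free when the outlier ratio is eps."""
    return math.ceil(math.log(1 - p) / math.log(1 - (1 - eps) ** m))

def consensus_threshold(eps, n_matches):
    """Required consensus-set size T = (1 - eps) * n."""
    return (1 - eps) * n_matches

print(ransac_iterations(0.99, 0.5, 4))  # 72 samples at 50% outliers
```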
since the RANSAC algorithm described above cannot exclusively select the correct point pairs, it is incorrect for feature points in a moving area and may lead to errors. Therefore, firstly, for the dynamic feature points on the surgical tool, further screening of the feature points is performed in the SLAM tracking thread according to the preselected feature points and the mask information of the surgical tool provided by the segmentation network, and the feature point detection area is limited by using the mask so as to prevent the feature points from being concentrated on the surgical tool; the pixel-by-pixel mask obtained by semantic segmentation can distinguish an operation tool area and a background area in an image, so that the operation tool area and the background area are used for shielding a moving operation tool, when points in the mask area exist in a feature point sequence, the points are identified as dynamic feature points and deleted, and then the RANSAC algorithm is used for processing the feature points in a static area, so that the endoscope pose is estimated more stably, and the built map is not interfered by the movement of the operation tool in a picture.
Preferably, in step 4, after the dynamic feature points on the surgical tool are removed, the static feature points of the remaining regions are used to compute the endoscope pose and a map of the lumen scene. Specifically, the SLAM system mainly comprises a tracking thread, a local mapping thread, and a loop closing thread. The tracking thread extracts the static feature points in the image, performs pose estimation, tracks the reconstructed local map, and decides on keyframes. The local mapping thread builds the local map and reconstructs three-dimensional points of the static environment. Finally, the loop closing thread performs loop fusion and global optimization. Concretely, for each new image frame, static ORB feature points are extracted and matched, the pose is predicted from a constant-velocity motion model and refined by minimizing the reprojection error, and the system decides whether to create a keyframe. Map points are triangulated from keyframes with high co-visibility, and duplicated map points between the current keyframe and its neighbors are fused. Finally, global bundle adjustment jointly optimizes the endoscope poses and map points.
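As an illustrative stand-in for the pose step (ORB-SLAM2 itself refines the pose with motion-only bundle adjustment rather than with the call below), the pose of a new frame could be estimated from already-reconstructed 3D map points and their 2D static-feature observations via PnP with RANSAC:

```python
import cv2
import numpy as np

def estimate_pose(map_points_3d, observations_2d, K):
    """Endoscope pose from Nx3 map points, Nx2 image observations, intrinsics K."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        map_points_3d.astype(np.float64),
        observations_2d.astype(np.float64),
        K, None,                       # assume undistorted endoscope images
        reprojectionError=3.0)
    if not ok:
        raise RuntimeError("pose estimation failed")
    R, _ = cv2.Rodrigues(rvec)         # rotation vector -> rotation matrix
    return R, tvec
```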
Drawings
FIG. 1 is a flow chart of an intracavity vision SLAM method based on semantic segmentation according to the present invention;
FIG. 2 is a schematic diagram of a semantic segmentation network model in the present invention;
FIG. 3 is a schematic flow chart of the method for eliminating dynamic feature points on a surgical tool according to the present invention.
Detailed Description:
The principles and methods of the present invention are described in further detail below with reference to the accompanying drawings; the described embodiments are not intended to limit the invention.
As shown in FIG. 1, the invention provides a lumen vision SLAM method based on semantic segmentation, which operates on the lumen image sequence captured by an endoscope camera and eliminates erroneous feature points on the surgical tool, thereby obtaining more stable endoscope pose estimation and more accurate mapping. The method specifically comprises the following steps:
Step 1: acquire video data of the lumen environment with the endoscope camera, convert the video into a sequence of RGB image frames, feed them into the SLAM system in order, and extract ORB feature points;
Specifically, ORB-SLAM2 is selected as the overall SLAM algorithm framework. FAST corner points are extracted frame by frame from the lumen images by its feature extraction algorithm; scale invariance and rotation invariance are obtained by constructing an image pyramid and computing the gray centroid; a quadtree algorithm divides the image into grids so that the feature points are uniformly distributed; BRIEF descriptors are then computed to obtain the final ORB feature descriptors, which serve as the preselected feature points.
Step 2: meanwhile, input the original RGB image into the trained semantic segmentation network, segment the surgical tool in the image, and use the semantic information to distinguish the surgical-tool region from the background region;
Further, the U-net neural network is selected as the segmentation network of the system. The U-net architecture comprises a contracting path that captures context information and a symmetric expanding path that enables precise localization. The contracting path applies alternating convolution and pooling operations, progressively downsampling the feature maps while increasing their number layer by layer; each step of the expanding path consists of upsampling and convolution of the feature maps, which raises the output resolution. Skip connections then combine low-level feature maps with high-resolution features, which better recovers target detail, achieves pixel-level localization, and performs well on segmentation tasks with limited data. The network model uses the Jaccard index as its evaluation metric; it can be interpreted as a similarity measure between finite sets and is defined by the following equation:
$$
J = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i \hat{y}_i}{y_i + \hat{y}_i - y_i \hat{y}_i}
$$

where $y_i$ is the binary class label of pixel $i$ and $\hat{y}_i$ is the pixel probability predicted by the model. Combined with the classification loss function H, the final expression of the generalized loss function is:

$$
L = H - \log J
$$
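A PyTorch sketch of this generalized loss (illustrative; the soft, differentiable Jaccard term is an assumption, since the patent does not spell out the implementation, and H is taken here as binary cross-entropy):

```python
import torch
import torch.nn.functional as F

def generalized_loss(pred_prob, target, eps=1e-7):
    """L = H - log J with a soft Jaccard J over pixel probabilities."""
    h = F.binary_cross_entropy(pred_prob, target)   # classification loss H
    inter = pred_prob * target                      # y_i * y_hat_i per pixel
    j = (inter / (pred_prob + target - inter + eps)).mean()
    return h - torch.log(j + eps)
```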
As shown in FIG. 2, image frames containing a surgical tool are semantically segmented by the U-net network, so that the surgical tool appearing in the frame is segmented and the corresponding pixel-wise binary mask is computed.
Step 3: check the preliminarily extracted feature point sequence against the binary mask produced by the segmentation network, and reject any preselected feature point that falls within the mask, thereby eliminating erroneous feature points detected on the moving surgical tool;
As shown in FIG. 3, this embodiment introduces the U-net semantic segmentation network into the SLAM system, preprocesses the lumen image, and removes the dynamic feature points on the surgical tool according to the extracted ORB feature points and the semantic segmentation result, specifically as follows:
When a moving surgical tool appears in the lumen image sequence, the dynamic feature points extracted on it are screened out in the SLAM tracking thread using the preselected feature points and the surgical-tool mask information provided by the segmentation network. The binary semantic information separating background and surgical tool in the lumen scene distinguishes static from dynamic features in the image: dynamic feature points inside the mask region are deleted from the feature point sequence as outliers and take no part in pose estimation and mapping. The RANSAC algorithm is then applied to the static feature points to detect and delete the mismatches among them, improving the robustness of the SLAM system and making the endoscope pose estimation more stable and the mapping more accurate.
Step 4: execute the subsequent ORB-SLAM2 modules on the processed ORB features: estimate the camera pose from the matching correspondences of adjacent frames by minimizing the reprojection error with bundle adjustment, and decide on keyframes; in the local mapping thread, compute map points by triangulation, reconstruct three-dimensional points of the lumen environment, and locally optimize the camera poses and map points; finally, perform loop detection and global optimization.
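Two-view triangulation of new map points could be sketched as follows (again illustrative and not from the patent text, given the intrinsics and two keyframe poses):

```python
import cv2
import numpy as np

def triangulate_map_points(K, pose1, pose2, pts1, pts2):
    """Triangulate matched static features from two keyframes.

    pose1/pose2: 3x4 [R|t] camera poses; pts1/pts2: Nx2 pixel coordinates.
    Returns Nx3 map points in the world frame.
    """
    P1, P2 = K @ pose1, K @ pose2                  # projection matrices
    pts4d = cv2.triangulatePoints(P1, P2,
                                  pts1.T.astype(np.float64),
                                  pts2.T.astype(np.float64))
    return (pts4d[:3] / pts4d[3]).T                # dehomogenize to Nx3
```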
The method can eliminate the feature points on the moving surgical tool that would otherwise adversely affect the lumen SLAM reconstruction process, improving the robustness and accuracy of the system. The above embodiments merely illustrate the core idea of the present invention and do not limit it; modifications or substitutions of some techniques in the foregoing solutions may be made without departing from the spirit and scope of the invention.

Claims (6)

1. A lumen vision SLAM method based on semantic segmentation, characterized by comprising the following steps:
Step 1: capturing the lumen environment with a monocular endoscope to acquire a lumen image sequence, then feeding the sequence into a SLAM system for frame-by-frame feature extraction and descriptor matching;
Step 2: performing semantic segmentation on the image data with a convolutional neural network, detecting any moving surgical tool in the segmented image, and computing the corresponding binary mask;
Step 3: checking the preliminarily extracted feature point sequence against the binary mask produced by the segmentation network, and rejecting any preselected feature point that falls within the mask, thereby eliminating erroneous feature points detected on the moving surgical tool;
Step 4: continuing tracking with the static feature points of the remaining background regions to perform subsequent pose estimation and build the environment map, realizing a dynamic visual SLAM method for lumen scenes.
2. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 1: the lumen monocular vision SLAM method takes the open-source feature-based visual SLAM system ORB-SLAM2 as its basic framework; when the endoscope camera acquires RGB images of the lumen, they are passed to the SLAM system, and ORB feature points of each frame are extracted and matched by the feature extraction algorithm in the tracking thread.
3. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 2: the original RGB image is input into the trained semantic segmentation network, and the surgical tool moving in the lumen scene is segmented using semantic information and a convolutional neural network; the U-net neural network is introduced into the ORB-SLAM2 framework to realize semantic segmentation of the surgical tool, yielding pixel-wise binary mask information for the surgical tool.
4. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 3: a decision module that distinguishes dynamic feature points using semantic information is added to the SLAM system; the surgical tool is segmented, and the dynamic features are identified and deleted as outliers.
5. The semantic segmentation based lumen vision SLAM method of claim 4, wherein: the preliminarily extracted feature point sequence is checked against the preliminarily extracted ORB feature points and the binary mask produced by the semantic segmentation network, and any preselected feature point within the mask is rejected to eliminate erroneous feature points detected on the moving surgical tool; outliers are then further rejected with the RANSAC algorithm, selecting the most reliable feature point pairs.
6. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 4: after the dynamic feature points on the surgical tool are removed, the correct correspondences of the reliable static feature points in the remaining regions are used to compute the endoscope pose and a map of the lumen scene.
CN202111548927.5A (priority 2021-12-17, filed 2021-12-17): Inner cavity vision SLAM method based on semantic segmentation. Status: Pending. Publication: CN114463334A (en).

Priority Applications (1)

Application Number: CN202111548927.5A; Publication: CN114463334A (en); Title: Inner cavity vision SLAM method based on semantic segmentation

Applications Claiming Priority (1)

Application Number: CN202111548927.5A; Publication: CN114463334A (en); Title: Inner cavity vision SLAM method based on semantic segmentation

Publications (1)

Publication Number: CN114463334A; Publication Date: 2022-05-10

Family

ID=81405388

Family Applications (1)

Application Number: CN202111548927.5A; Status: Pending; Publication: CN114463334A (en); Priority Date: 2021-12-17; Filing Date: 2021-12-17; Title: Inner cavity vision SLAM method based on semantic segmentation

Country Status (1)

Country Link
CN (1) CN114463334A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132897A * 2020-09-17 2020-12-25 Army Engineering University of PLA (中国人民解放军陆军工程大学) Visual SLAM method based on deep learning semantic segmentation
CN113516664A * 2021-09-02 2021-10-19 Changchun University of Technology (长春工业大学) Visual SLAM method based on semantic segmentation dynamic points

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
QIANGGUO JIN et al.: "DUNet: A deformable network for retinal vessel segmentation", arXiv:1811.01206v1 [cs.CV], 3 November 2018, pages 1-12 *
Zhang Wei et al.: "高分辨率光学卫星成像质量系统提升技术" (Systematic Improvement of Imaging Quality for High-Resolution Optical Satellites), National Defense Industry Press, 31 July 2021, pages 107-108 *
Cao Qixin: "泛在机器人技术与实践" (Ubiquitous Robotics: Technology and Practice), National Defense Industry Press, 30 September 2021, pages 176-183 *
Liu Yang: "数字图像物体识别理论详解与实战" (Digital Image Object Recognition: Theory and Practice), Beijing University of Posts and Telecommunications Press, 31 January 2018, page 45 *
Wang Zhaodong; Guo Chen: "一种动态场景下语义分割优化的ORB_SLAM2" (An ORB_SLAM2 optimized by semantic segmentation in dynamic scenes), Journal of Dalian Maritime University, no. 04, 21 December 2018 *
Sheng Chao; Pan Shuguo; Zhao Tao; Zeng Pan; Huang Lixiao: "基于图像语义分割的动态场景下的单目SLAM算法" (A monocular SLAM algorithm in dynamic scenes based on image semantic segmentation), Bulletin of Surveying and Mapping, no. 01, 25 January 2020 *
Chen Yuhao et al.: "基于EM-ORB算法的移动机器人SLAM系统研究" (Research on a mobile robot SLAM system based on the EM-ORB algorithm), Electric Drive, vol. 50, no. 5, 31 December 2020, pages 67-74 *
Wei Baozhi: "专利检索之星" (Patent Search Star), Intellectual Property Publishing House, 31 January 2019, pages 165-166 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024022062A1 (en) * 2022-07-28 2024-02-01 杭州堃博生物科技有限公司 Endoscope pose estimation method and apparatus, and storage medium

Similar Documents

Publication Publication Date Title
Labbé et al. Cosypose: Consistent multi-view multi-object 6d pose estimation
CN109643368B (en) Detecting objects in video data
CN107980150B (en) Modeling three-dimensional space
JP6011102B2 (en) Object posture estimation method
WO2020125499A9 (en) Operation prompting method and glasses
US20030012410A1 (en) Tracking and pose estimation for augmented reality using real features
CN111862201A (en) Deep learning-based spatial non-cooperative target relative pose estimation method
CN110298884A (en) A kind of position and orientation estimation method suitable for monocular vision camera in dynamic environment
Gómez-Rodríguez et al. SD-DefSLAM: Semi-direct monocular SLAM for deformable and intracorporeal scenes
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
CN113989928B (en) Motion capturing and redirecting method
CN112183506A (en) Human body posture generation method and system
CN116222543B (en) Multi-sensor fusion map construction method and system for robot environment perception
CN117934721A (en) Space robot reconstruction method and system for target spacecraft based on vision-touch fusion
Furukawa et al. Fully auto-calibrated active-stereo-based 3d endoscopic system using correspondence estimation with graph convolutional network
CN113312973A (en) Method and system for extracting features of gesture recognition key points
CN118247435A (en) Intestinal tract dense three-dimensional modeling method based on visual odometer and convolutional neural network
CN117011381A (en) Real-time surgical instrument pose estimation method and system based on deep learning and stereoscopic vision
CN114463334A (en) Inner cavity vision SLAM method based on semantic segmentation
JP7498404B2 (en) Apparatus, method and program for estimating three-dimensional posture of subject
Le et al. Sparse3D: A new global model for matching sparse RGB-D dataset with small inter-frame overlap
CN115330874A (en) Monocular depth estimation method based on super-pixel processing shielding
US20240153120A1 (en) Method to determine the depth from images by self-adaptive learning of a neural network and system thereof
Chen et al. Local homography estimation on user-specified textureless regions
Wu et al. 3d semantic vslam of dynamic environment based on yolact

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination