CN114463334A - Inner cavity vision SLAM method based on semantic segmentation
- Publication number: CN114463334A
- Application number: CN202111548927.5A
- Authority: CN (China)
- Prior art keywords: semantic segmentation; feature points; surgical tool; inner cavity; lumen
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/10: Segmentation; Edge detection
- G06N3/045: Combinations of networks
- G06N3/08: Learning methods
- G06T2207/10068: Endoscopic image
- G06T2207/20016: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
- G06T2207/20081: Training; Learning
- G06T2207/20084: Artificial neural networks [ANN]
Abstract
The invention relates to an inner cavity vision SLAM method based on semantic segmentation, comprising the following steps: (1) acquiring an image sequence of the inner cavity environment through an endoscope and extracting feature points from the image data frame by frame; (2) performing binary semantic segmentation on the lumen images obtained in step (1) with a convolutional neural network to obtain mask information for the surgical tool; (3) removing the dynamic feature points on the surgical tool by combining the preliminarily extracted feature points with the segmentation result; (4) estimating the endoscope pose from the remaining reliable static feature points and completing three-dimensional mapping of the lumen environment. The method addresses the problems that arise when a SLAM system runs in an inner cavity scene, where a moving surgical tool reduces system robustness, introduces pose estimation errors, and corrupts the map.
Description
Technical field:
The invention relates to the technical field of computer vision, and in particular to an inner cavity vision SLAM method that eliminates dynamic feature points based on semantic segmentation.
Background art:
Simultaneous localization and mapping (SLAM) solves the problem of localizing a robot and building a map in an unknown environment, and is a basic module and prerequisite for applications such as autonomous robots and augmented reality. Visual SLAM takes its input from a camera serving as the sensor and completes the tasks of self-localization and mapping of the surrounding environment. Depending on how data association is performed, there are two main approaches: the feature method and the direct method. The feature method estimates three-dimensional geometry from a set of matched feature points across co-visible images, while the direct method estimates the three-dimensional shape directly from pixel intensities without extracting image features.
With growing attention to minimally invasive surgery and medical robots, minimally invasive surgical navigation systems are increasingly combined with computer vision techniques. A visual SLAM algorithm can perform three-dimensional localization and three-dimensional reconstruction of a lesion area from an endoscope image sequence alone, overcoming the relatively incomplete or poor visual feedback of traditional minimally invasive surgery. Traditional feature-based visual SLAM algorithms generally assume that the observed scene is static. In an inner cavity scene, however, a moving surgical tool may appear in the image, and feature points from the moving object produce wrong matches: the camera pose estimate drifts, the map becomes inaccurate, SLAM tracking may even be lost, and the robustness of the whole system is reduced. In recent years, image semantic segmentation and object detection algorithms based on deep learning have advanced greatly in both efficiency and accuracy. Combining a convolutional neural network with feature-based visual SLAM to recognize and segment moving objects, and masking the corresponding image regions so that no feature matching takes place there, can therefore improve the robustness of the SLAM system and yield a more accurate reconstruction.
Summary of the invention:
The invention aims to provide a semantic segmentation-based lumen vision SLAM method that segments the surgical tool appearing in the endoscope image of a lumen environment, eliminates the dynamic feature points on the surgical tool, and constructs a more accurate map using the static feature points of the background area.
To solve this technical problem, the invention provides an inner cavity vision SLAM method based on semantic segmentation, comprising the following steps:
Step 1: shooting the inner cavity environment with a monocular endoscope to acquire an inner cavity image sequence, then feeding the sequence into the SLAM system for frame-by-frame feature extraction and descriptor matching;
Step 2: performing semantic segmentation on the image data with a convolutional neural network, detecting the dynamic surgical tools appearing in the segmented image, and computing the corresponding binary mask;
Step 3: checking the preliminarily extracted feature point sequence against the binary mask produced by the segmentation network, and rejecting any preselected feature point that falls within the mask, thereby eliminating erroneous feature points detected on the dynamic surgical tool;
Step 4: continuing tracking with the static feature points of the remaining background areas, using them for subsequent pose estimation and environment map construction, thereby realizing a dynamic visual SLAM method for inner cavity scenes.
Preferably, the feature-based ORB-SLAM2 is adopted in step 1 as the overall SLAM framework. When the endoscope camera acquires an RGB image of the lumen, the image is passed to the SLAM system, and ORB feature points are extracted and matched for each frame by the ORB feature extraction algorithm in the SLAM tracking thread. In the ORB algorithm, the image pyramid has 8 levels; each pyramid level is divided into 30 × 30 grid cells, and corner points are extracted from each cell so that the extracted feature points are uniformly distributed, finally yielding the preselected feature point sequence.
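For illustration, this extraction step can be approximated with OpenCV's built-in ORB (a minimal sketch, not the patent's implementation: ORB-SLAM2 uses its own C++ extractor with per-cell corner selection, and the parameter values here are assumptions):

```python
# Approximate sketch of per-frame ORB extraction and matching (OpenCV stand-in
# for the ORB-SLAM2 extractor; feature count and thresholds are illustrative).
import cv2

def extract_orb_features(frame_bgr, n_features=1000):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # 8 pyramid levels with scale factor 1.2, matching the configuration above.
    orb = cv2.ORB_create(nfeatures=n_features, scaleFactor=1.2, nlevels=8)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    return keypoints, descriptors

def match_orb_descriptors(desc_prev, desc_curr):
    # Binary BRIEF descriptors are compared with the Hamming distance.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return matcher.match(desc_prev, desc_curr)
```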
Preferably, in step 2, the surgical tool moving in the intracavity scene is segmented using semantic information and a convolutional neural network; the method introduces the U-net neural network into the ORB-SLAM2 framework to realize semantic segmentation of the surgical tool. U-net is a fully convolutional network with a symmetric encoding-decoding structure, consisting mainly of two parts: a contraction path and an expansion path. In the contraction path, every pair of 3 × 3 convolutional layers is followed by a 2 × 2 max pooling layer; each convolutional layer is followed by a ReLU activation, and each downsampling step doubles the number of channels. The expansion path performs upsampling, each step comprising a 2 × 2 up-convolution layer and 3 × 3 convolutional layers, again with ReLU activations, while features from the contraction path are fused in through skip connections. The last layer of the network is a 1 × 1 convolutional layer that converts the feature maps into a binary classification result; the network has 23 convolutional layers in total. The model was trained on a dataset from MICCAI collected with the da Vinci surgical system. As its output the model produces a pixel probability map, from which a prediction mask of the surgical instrument is finally generated for the desired binary segmentation problem.
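A compact PyTorch sketch of such an encoder-decoder network follows (a generic U-net under stated assumptions: three-channel input with height and width divisible by 16, and the standard U-net channel widths, which are not necessarily the patent's exact model):

```python
# U-net style segmentation network: a contraction path of paired 3x3 convs
# plus 2x2 max pooling, an expansion path of 2x2 up-convs with skip-connection
# fusion, and a final 1x1 conv producing a per-pixel tool probability map.
import torch
import torch.nn as nn

def double_conv(c_in, c_out):
    # Two 3x3 convolutions, each followed by ReLU.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self, in_channels=3, widths=(64, 128, 256, 512, 1024)):
        super().__init__()
        self.pool = nn.MaxPool2d(2)                   # 2x2 max pooling
        self.enc = nn.ModuleList()
        c_prev = in_channels
        for c in widths:                              # channels double per level
            self.enc.append(double_conv(c_prev, c))
            c_prev = c
        self.up, self.dec = nn.ModuleList(), nn.ModuleList()
        for c in reversed(widths[:-1]):
            self.up.append(nn.ConvTranspose2d(c_prev, c, 2, stride=2))
            self.dec.append(double_conv(2 * c, c))    # after skip concatenation
            c_prev = c
        self.head = nn.Conv2d(c_prev, 1, 1)           # final 1x1 convolution

    def forward(self, x):
        skips = []
        for i, block in enumerate(self.enc):
            x = block(x)
            if i < len(self.enc) - 1:                 # bottleneck feeds no skip
                skips.append(x)
                x = self.pool(x)
        for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
            x = dec(torch.cat([skip, up(x)], dim=1))  # skip-connection fusion
        return torch.sigmoid(self.head(x))            # pixel probability map

# Thresholding the probability map gives the binary tool mask:
# mask = (UNet()(torch.rand(1, 3, 256, 320)) > 0.5)
```

Counting the four transposed convolutions, this sketch contains 23 convolutional layers, consistent with the figure cited above.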
Preferably, in step 3, a RANSAC algorithm is used to eliminate mismatched points among the extracted ORB feature points. At each iteration, RANSAC estimates an optimal homography matrix H from a minimal sample of matched point pairs (the 3 × 3 matrix H has 8 degrees of freedom), according to:

$$s\begin{bmatrix}x'\\ y'\\ 1\end{bmatrix}=H\begin{bmatrix}x\\ y\\ 1\end{bmatrix}$$

where (x, y) are the coordinates of a feature point in the target image, (x', y') are the coordinates of the corresponding feature point in the image to be matched, and s is a scale parameter.
The RANSAC algorithm randomly draws minimal samples from the set of matched points and computes a homography from each sample; the model is then tested against all the data, and if the consensus set under a model is large enough, that model is accepted as optimal. The required number of sampling iterations K is

$$K=\frac{\log(1-p)}{\log\!\left(1-(1-\varepsilon)^{\delta}\right)}$$

where δ is the number of point pairs in each minimal sample, p is the desired probability that at least one of the K random samples contains no outliers, and ε is the fraction of outliers among all matches. The consensus-set size is thresholded by

$$T=(1-\varepsilon)\,n$$

where n is the total number of matched points.
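As a quick numeric check with illustrative values (not from the patent): taking p = 0.99, ε = 0.3 and a minimal sample of δ = 4 point pairs,

$$K=\frac{\log(1-0.99)}{\log\!\left(1-0.7^{4}\right)}=\frac{\log 0.01}{\log 0.7599}\approx 16.8\ \Rightarrow\ 17\ \text{iterations},$$

and with n = 200 matches the consensus threshold is T = 0.7 × 200 = 140 inliers.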
Since the RANSAC algorithm alone cannot guarantee that only correct point pairs are selected, feature points in a moving area may be accepted as inliers and introduce errors. Therefore, the dynamic feature points on the surgical tool are handled first: in the SLAM tracking thread, the preselected feature points are screened against the surgical tool mask information provided by the segmentation network, and the mask is used to restrict the feature detection area so that feature points do not concentrate on the surgical tool. The pixel-wise mask obtained by semantic segmentation distinguishes the surgical tool area from the background area of the image and is thus used to shield the moving surgical tool: any point of the feature point sequence that lies inside the mask area is identified as a dynamic feature point and deleted. The RANSAC algorithm is then applied to the feature points of the static area, so that the endoscope pose is estimated more stably and the constructed map is not disturbed by the motion of the surgical tool in the image.
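A minimal sketch of this two-stage outlier handling is given below (helper names are illustrative, not from the patent; it assumes a binary mask that is non-zero on tool pixels and enough static matches for homography estimation):

```python
# Stage 1: drop keypoints inside the surgical-tool mask; stage 2: run RANSAC
# on the static remainder to estimate the homography and its consensus set.
import cv2
import numpy as np

def filter_dynamic_keypoints(keypoints, descriptors, tool_mask):
    """Keep only keypoints whose pixel lies outside the binary tool mask."""
    keep = [i for i, kp in enumerate(keypoints)
            if tool_mask[int(round(kp.pt[1])), int(round(kp.pt[0]))] == 0]
    return [keypoints[i] for i in keep], descriptors[keep]

def ransac_homography(pts_prev, pts_curr, reproj_thresh=3.0):
    """pts_*: Nx2 arrays of matched static points; returns H and inlier flags."""
    H, inliers = cv2.findHomography(pts_prev, pts_curr,
                                    cv2.RANSAC, reproj_thresh)
    return H, inliers.ravel().astype(bool)
```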
Preferably, in step 4, after the dynamic feature points on the surgical tool have been removed, the static feature points of the remaining areas are used to compute the endoscope pose and build the map of the lumen scene. Specifically, the SLAM system consists mainly of a tracking thread, a local mapping thread and a loop-closure detection thread. The tracking thread extracts the static feature points of each image, performs pose estimation, tracks the reconstructed local map and decides on keyframes. The local mapping thread builds the local map and performs three-dimensional point reconstruction of the static environment. Finally, the loop-closure thread mainly carries out loop fusion and global optimization. In detail, for each new image frame the static ORB feature points are extracted and matched, the pose is predicted with a constant-velocity motion model and refined by minimizing the reprojection error, and it is decided whether to generate a keyframe. Map points are computed by triangulation between keyframes with a high degree of co-visibility, and duplicate map points between the current keyframe and its neighbours are fused. Finally, global bundle adjustment jointly optimizes the endoscope poses and the map points.
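For illustration, the triangulation step that turns matched static keypoints in two co-visible keyframes into map points could look as follows (a sketch assuming known camera intrinsics K and estimated 3 × 4 [R|t] poses; in ORB-SLAM2 this happens inside the local mapping thread in C++):

```python
# Sketch of map-point triangulation between two co-visible keyframes.
import cv2
import numpy as np

def triangulate_map_points(K, pose_a, pose_b, pts_a, pts_b):
    """pose_*: 3x4 [R|t] matrices; pts_*: Nx2 pixel coordinates of matches."""
    P_a, P_b = K @ pose_a, K @ pose_b                 # 3x4 projection matrices
    pa = np.asarray(pts_a, dtype=np.float64).T        # 2xN, as OpenCV expects
    pb = np.asarray(pts_b, dtype=np.float64).T
    pts4d = cv2.triangulatePoints(P_a, P_b, pa, pb)   # 4xN homogeneous points
    return (pts4d[:3] / pts4d[3]).T                   # Nx3 Euclidean map points
```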
Drawings
FIG. 1 is a flow chart of an intracavity vision SLAM method based on semantic segmentation according to the present invention;
FIG. 2 is a schematic diagram of a semantic segmentation network model in the present invention;
FIG. 3 is a schematic flow chart of the method for eliminating dynamic feature points on a surgical tool according to the present invention.
Detailed description of embodiments:
The principles and methods of the present invention are described in further detail below with reference to the accompanying drawings; the described embodiments are not intended to limit the invention.
As shown in FIG. 1, the invention provides a lumen vision SLAM method based on semantic segmentation that operates on the lumen image sequence captured by an endoscope camera and eliminates erroneous feature points on the surgical tool, yielding a more stable endoscope pose estimate and more accurate mapping. The method specifically comprises the following steps:
Step 1: acquiring video data of the inner cavity environment with an endoscope camera, converting the video into a sequence of RGB image frames, feeding them into the SLAM system for processing, and extracting ORB feature points.
Specifically, ORB-SLAM2 is selected as the overall SLAM framework. FAST corner points are extracted frame by frame from the cavity images by its feature extraction algorithm; scale invariance and rotation invariance are obtained by constructing an image pyramid and computing the grey centroid of each corner patch; a quadtree algorithm partitions the image into grid cells so that the feature points are uniformly distributed; and a BRIEF descriptor is then computed, giving the final ORB feature descriptors used as the preselected feature points.
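The grey-centroid orientation mentioned here can be sketched as follows (an illustrative helper, assuming the corner lies far enough from the image border for the patch to fit; the patch radius is an assumed parameter):

```python
# Intensity (grey) centroid of a patch around a FAST corner: the angle from
# the corner to the centroid gives the keypoint's orientation, which makes
# the subsequent BRIEF descriptor rotation invariant.
import numpy as np

def intensity_centroid_angle(gray, cx, cy, radius=15):
    patch = gray[cy - radius:cy + radius + 1,
                 cx - radius:cx + radius + 1].astype(np.float64)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    m01 = float((ys * patch).sum())   # first-order image moment in y
    m10 = float((xs * patch).sum())   # first-order image moment in x
    return np.arctan2(m01, m10)       # orientation theta = atan2(m01, m10)
```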
Step 2: in parallel, feeding the original RGB image into the trained semantic segmentation network to segment the surgical tool in the image, using semantic information to distinguish the surgical tool region from the background region.
Further, a U-net neural network is selected as the segmentation network of the system. The U-net architecture comprises a contraction path that captures context and a symmetric expansion path that enables precise localization. The contraction path alternates convolution and pooling operations, progressively downsampling the feature maps while increasing their number level by level. Each step in the expansion path consists of upsampling of the feature map followed by convolution, raising the output resolution; skip connections then combine low-level feature maps with high-resolution features, which helps recover object detail, achieves pixel-level localization, and performs well on segmentation tasks with limited data. The network model uses the Jaccard index as an evaluation metric; it can be interpreted as a similarity measure between finite sample sets and is defined by

$$J=\frac{1}{n}\sum_{i=1}^{n}\frac{y_i\,\hat{y}_i}{y_i+\hat{y}_i-y_i\,\hat{y}_i}$$

where $y_i$ is the binary class label of pixel i and $\hat{y}_i$ is the pixel probability predicted by the model. Combined with the classification (binary cross-entropy) loss H, the final expression of the generalized loss function is

$$L=H-\log J$$
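A sketch of this generalized loss in PyTorch (with a small smoothing constant added for numerical stability; the constant is an assumption, not from the patent):

```python
# Generalized loss L = H - log J: binary cross-entropy H plus the negative
# log of a soft Jaccard index J computed from predicted pixel probabilities.
import torch
import torch.nn.functional as F

def jaccard_bce_loss(pred_prob, target, eps=1e-7):
    """pred_prob: sigmoid outputs in [0, 1]; target: float binary mask."""
    bce = F.binary_cross_entropy(pred_prob, target)
    inter = (pred_prob * target).sum()
    union = pred_prob.sum() + target.sum() - inter
    jaccard = (inter + eps) / (union + eps)
    return bce - torch.log(jaccard)
```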
As shown in FIG. 2, image frames containing a surgical tool are semantically segmented by the U-net network, so that the surgical tool appearing in the image is segmented and the corresponding pixel-wise binary mask is computed.
Step 3: checking the preliminarily extracted feature point sequence against the binary mask produced by the segmentation network, and rejecting any preselected feature point that falls within the mask, thereby eliminating erroneous feature points detected on the dynamic surgical tool.
As shown in FIG. 3, this embodiment introduces the U-net semantic segmentation network into the SLAM system to preprocess the lumen images, and removes the dynamic feature points on the surgical tool based on the extracted ORB feature points and the semantic segmentation result. Specifically:
When a moving surgical tool appears in the inner cavity image sequence, the feature point sequence is screened in the SLAM tracking thread using the preselected feature points and the surgical tool mask information provided by the segmentation network. The binary semantic information separating background and surgical tool in the inner cavity scene distinguishes static from dynamic features in the image; dynamic feature points inside the mask area are deleted from the feature point sequence as outliers and take no part in pose estimation or mapping. The RANSAC algorithm is then applied to the static feature points to detect and delete the remaining mismatches, which improves the robustness of the SLAM system and makes the endoscope pose estimation more stable and the mapping more accurate (see the combined sketch below).
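Chaining the earlier sketches gives an end-to-end view of this per-frame preprocessing (illustrative only: unet is a trained instance of the UNet sketch above, extract_orb_features and filter_dynamic_keypoints are the assumed helpers defined earlier, and the frame size is assumed divisible by 16):

```python
# Per-frame preprocessing: segment the tool, extract ORB features, and drop
# the dynamic feature points that fall inside the predicted tool mask.
import numpy as np
import torch

def preprocess_frame(frame_bgr, unet):
    with torch.no_grad():
        rgb = frame_bgr[:, :, ::-1].copy()            # BGR -> RGB for the net
        x = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0
        tool_mask = (unet(x)[0, 0] > 0.5).numpy().astype(np.uint8)
    keypoints, descriptors = extract_orb_features(frame_bgr)
    # Only static background keypoints survive for pose estimation and mapping.
    return filter_dynamic_keypoints(keypoints, descriptors, tool_mask)
```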
Step 4: executing the subsequent ORB-SLAM2 modules in sequence on the processed ORB features: estimating the camera pose from the matching correspondences of adjacent frames via bundle adjustment that minimizes the reprojection error, and determining keyframes; computing map points by triangulation in the local mapping thread, reconstructing three-dimensional points of the lumen environment, and locally optimizing the camera poses and map points; and finally performing loop detection and global optimization.
The method eliminates feature points on the moving surgical tool that would otherwise adversely affect the lumen SLAM reconstruction, improving the robustness and accuracy of the system. The above embodiments merely illustrate the core idea of the present invention and do not limit it; modifications or substitutions of individual techniques in the foregoing technical solutions may be made without departing from the spirit and scope of the present invention.
Claims (6)
1. An intracavity vision SLAM method based on semantic segmentation, characterized by comprising the following steps:
Step 1: shooting the inner cavity environment with a monocular endoscope to acquire an inner cavity image sequence, then feeding the sequence into the SLAM system for frame-by-frame feature extraction and descriptor matching;
Step 2: performing semantic segmentation on the image data with a convolutional neural network, detecting the dynamic surgical tools appearing in the segmented image, and computing the corresponding binary mask;
Step 3: checking the preliminarily extracted feature point sequence against the binary mask produced by the segmentation network, and rejecting any preselected feature point that falls within the mask, thereby eliminating erroneous feature points detected on the dynamic surgical tool;
Step 4: continuing tracking with the static feature points of the remaining background areas, using them for subsequent pose estimation and environment map construction, thereby realizing a dynamic visual SLAM method for inner cavity scenes.
2. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 1: the lumen monocular vision SLAM method takes the open-source feature-based visual SLAM system ORB-SLAM2 as its basic framework; when the endoscope camera acquires RGB images of the lumen, they are passed to the SLAM system, and the ORB feature points of each frame are extracted and matched by the feature extraction algorithm in the tracking thread.
3. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 2: the original RGB image is input into the trained semantic segmentation network, and the surgical tool moving in the cavity scene is segmented using semantic information and a convolutional neural network, wherein the U-net neural network is introduced into the ORB-SLAM2 framework to realize semantic segmentation of the surgical tool, thereby obtaining pixel-wise binary mask information of the surgical tool.
4. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 3: a decision module that distinguishes dynamic feature points using semantic information is added to the SLAM system; the surgical tool is segmented, and the dynamic features are identified and deleted as outliers.
5. The semantic segmentation based lumen vision SLAM method of claim 4, wherein: the preliminarily extracted feature point sequence is checked against the preliminarily extracted ORB feature points and the binary mask produced by the semantic segmentation network, and any preselected feature point within the mask is rejected to eliminate erroneous feature points detected on the dynamic surgical tool; outliers are then further rejected with the RANSAC algorithm, and the most reliable feature point pairs are selected.
6. The semantic segmentation based lumen vision SLAM method of claim 1, wherein in step 4: after the dynamic feature points on the surgical tool have been removed, the correct correspondences of the reliable static feature points in the remaining areas are used to compute the endoscope pose and build the map of the lumen scene.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202111548927.5A | 2021-12-17 | 2021-12-17 | Inner cavity vision SLAM method based on semantic segmentation |

Publications (1)

| Publication Number | Publication Date |
| --- | --- |
| CN114463334A | 2022-05-10 |

Family: ID=81405388

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN202111548927.5A (pending) | Inner cavity vision SLAM method based on semantic segmentation | 2021-12-17 | 2021-12-17 |

Country Status (1)

| Country | Link |
| --- | --- |
| CN | CN114463334A (en) |
Legal Events

| Date | Code | Title |
| --- | --- | --- |
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |