CN112802185A - Endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception - Google Patents
Endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception
- Publication number: CN112802185A
- Application number: CN202110106321.XA
- Authority: CN (China)
- Prior art keywords: point cloud; image; depth; endoscope; neural network
- Prior art date: 2021-01-26
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
- G06N3/02: Neural networks
- G06N3/04: Neural networks; architecture, e.g. interconnection topology
- G06N3/08: Neural networks; learning methods
- G06T7/0012: Image analysis; inspection of images; biomedical image inspection
- G06T7/33: Determination of transform parameters for the alignment of images, i.e. image registration, using feature-based methods
- G06T7/50: Depth or shape recovery
- G06T2207/10028: Range image; depth image; 3D point clouds
- G06T2207/10068: Endoscopic image
- G06T2207/20081: Training; learning
- G06T2207/20084: Artificial neural networks [ANN]
- G06T2207/30004: Biomedical image processing
Abstract
The invention provides an endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception, in the technical field of three-dimensional reconstruction. The method acquires an endoscope image; performs depth estimation on the current frame of the endoscope image based on a preset multitask neural network model to obtain the point cloud depth of the current frame; acquires a local point cloud based on the point cloud depth and the camera model; registers and fuses the plurality of local point clouds; and splices the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, which is displayed visually. The invention overcomes the technical problem that existing deep-learning-based endoscope image three-dimensional reconstruction methods can only estimate the depth-of-field information of the current endoscope image and cannot reconstruct and dynamically update the whole three-dimensional model, and realizes endoscope image three-dimensional reconstruction for minimally invasive surgery spatial perception.
Description
Technical Field
The invention relates to the technical field of three-dimensional reconstruction, and in particular to an endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception.
Background
Minimally invasive surgery refers to surgery performed with modern medical instruments such as endoscopes and related equipment. Over the past decade, owing to its advantages of small wounds, mild pain, little bleeding and quick recovery, minimally invasive surgery has become an important diagnostic and treatment means in many departments, including general surgery, urology, neurosurgery and cardiac surgery.
In minimally invasive surgery, it is difficult for the surgeon to obtain comprehensive in-vivo environmental information because the endoscope's field of view is restricted. In addition, organ displacement before and during the operation, as well as intraoperative manipulation, may remove anatomical features, which makes intraoperative tasks such as locating, suturing and cutting the lesion more challenging and reduces surgical precision. Three-dimensional reconstruction of an in-vivo model can solve these problems and assist minimally invasive surgery.
Existing deep-learning-based endoscope image three-dimensional reconstruction methods can only estimate the depth-of-field information of the current endoscope image and cannot reconstruct and dynamically update the whole three-dimensional model.
Disclosure of Invention
Technical problem to be solved
In view of the defects of the prior art, the invention provides an endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception, solving the technical problem that existing methods cannot reconstruct and dynamically update the whole three-dimensional model.
(II) Technical solution
To achieve the above purpose, the invention is realized by the following technical solution:
The invention provides an endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception, comprising the following steps:
s1, acquiring an endoscope image;
s2, depth estimation is carried out on the current frame of the endoscope image based on a preset multitask neural network model, and the point cloud depth of the current frame is obtained;
s3, acquiring local point cloud based on the point cloud depth and a camera model;
s4, carrying out registration fusion on the local point clouds;
and S5, splicing the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud.
Preferably, the preset multitask neural network model comprises three types of convolution blocks and a global pooling layer, the three types being convolution block I, convolution block II and convolution block III, and the processing of the endoscope image by the multitask neural network model comprises:
extracting feature maps from a pair of endoscope image frames through two convolution blocks I to obtain a first feature map and a second feature map, wherein the network parameter weights of the two convolution blocks I are shared;
splicing the first feature map and the second feature map, and extracting features from the spliced feature map through convolution block II to obtain inter-frame motion vector estimation features;
pooling the inter-frame motion vector estimation features through the global pooling layer to obtain the camera motion vector between the two endoscope image frames;
performing further feature extraction on the spliced feature map through convolution block III to obtain depth information features; and skip-connecting the second feature map with the depth information features to output a multi-scale disparity map for the second endoscope image.
Preferably, the training process of the preset multitask neural network model comprises:
acquiring and processing endoscope images;
inputting the processed endoscope images into an initial neural network model, and training the initial neural network model in a self-supervised manner to obtain the multitask neural network model;
wherein the loss functions used in the training process comprise:
a camera inter-frame motion estimation loss:
L_motion = Huber_δ1(t - t̂) + Huber_δ2(r - r̂)
wherein t̂ denotes the camera translation vector predicted by the neural network model; r̂ denotes the camera rotation vector predicted by the neural network model; t and r denote the reference translation and rotation vectors; and δ1 and δ2 are the parameters of the Huber functions applied to the translation vector and the rotation vector, respectively;
an image restoration loss comprising a pixel error loss and a similarity error loss, specifically:
the pixel error loss:
L_pixel = (1/(M·N)) · Σ_{i=1..M} Σ_{j=1..N} Huber_θ(I(i, j) - Î(i, j))
wherein M and N denote the pixel width and height of the image, respectively; I(i, j) denotes the true pixel value of the second image at coordinate (i, j); Î(i, j) denotes the pixel value at coordinate (i, j) of the second image reconstructed by the algorithm; and θ is the Huber function parameter for the pixel error;
the similarity error loss:
L_sim = 1 - Sim(I, Î)
wherein Sim denotes an image similarity evaluation function whose value lies between 0 and 1; I denotes the true second image; and Î denotes the second image reconstructed by the algorithm;
a depth smoothing error loss:
L_smooth = (1/(M·N)) · Σ_{i,j} (|∂_x D(i, j)| + |∂_y D(i, j)|)
wherein D(i, j) denotes the reciprocal of the estimated depth of the second image at coordinate (i, j);
the total loss function is a weighted sum of the above loss functions, and the weight of each part is obtained through neural network hyper-parameter learning.
Preferably, said S3 comprises:
performing distortion correction on the endoscope image based on the camera distortion parameters, wherein the pixel value restoration step for an undistorted image pixel coordinate (u, v) comprises:
for the normalized plane of the undistorted image:
x′ = (u - c_x) / f_x
y′ = (v - c_y) / f_y
wherein (x′, y′) denotes the coordinate on the normalized plane corresponding to the undistorted image pixel coordinate (u, v);
after distortion, the coordinate on the normalized plane is (x″, y″), with:
x″ = x′·(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2·p1·x′·y′ + p2·(r² + 2·x′²)
y″ = y′·(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1·(r² + 2·y′²) + 2·p2·x′·y′
wherein r² = x′² + y′²;
projecting the distorted normalized plane coordinate onto the pixel plane to obtain the pixel coordinate:
u_d = f_x·x″ + c_x
v_d = f_y·y″ + c_y
therefore, the pixel value at the undistorted image coordinate (u, v) is the pixel value at the distorted image coordinate (u_d, v_d); and since u_d and v_d are usually non-integer, the pixel value at (u_d, v_d) can be obtained by bilinear interpolation;
the bilinear interpolation is as follows: if u_d and v_d are both non-integer, take the integers u_1 and v_1 such that u_1 < u_d < u_1 + 1 and v_1 < v_d < v_1 + 1; then:
I(u_d, v_d) = (v_1 + 1 - v_d)·I(u_d, v_1) + (v_d - v_1)·I(u_d, v_1 + 1)
wherein
I(u_d, v_1) = (u_1 + 1 - u_d)·I(u_1, v_1) + (u_d - u_1)·I(u_1 + 1, v_1)
I(u_d, v_1 + 1) = (u_1 + 1 - u_d)·I(u_1, v_1 + 1) + (u_d - u_1)·I(u_1 + 1, v_1 + 1);
solving the x and y of the point cloud from the restored coordinates and the camera model, specifically:
x = z·(u - c_x) / f_x
y = z·(v - c_y) / f_y
and taking the point cloud depth of step S2 as z, which together with x and y gives the local point cloud in the endoscope camera coordinate system corresponding to the current frame.
Preferably, said S4 comprises:
if the endoscope is supported by a robot, obtaining the camera pose corresponding to each frame of image, and obtaining the inter-frame motion information of the endoscope camera through pose transformation;
taking the inter-frame motion information as the initial value of point cloud registration, and registering and fusing the plurality of local point clouds with a coherent point drift algorithm.
Preferably, said S4 further comprises:
if the endoscope is not supported by a robot, obtaining the inter-frame motion information through the multitask neural network model;
taking the inter-frame motion information as the initial value of point cloud registration, and registering and fusing the plurality of local point clouds with a coherent point drift algorithm.
Preferably, said S5 comprises:
splicing the registered and fused local point clouds with a dynamic updating mechanism to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud with a three-dimensional data processing library.
The invention also provides an endoscope image three-dimensional reconstruction system for minimally invasive surgery spatial perception, comprising:
the acquisition module is used for acquiring an endoscope image;
the depth estimation module is used for carrying out depth estimation on a current frame of the endoscope image based on a preset multitask neural network model to obtain the point cloud depth of the current frame;
the local point cloud obtaining module is used for obtaining a local point cloud based on the point cloud depth and the camera model;
the registration fusion module is used for performing registration fusion on the local point clouds;
and the global point cloud generating module is used for splicing the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud.
The invention also provides a computer readable storage medium for storing program code for performing the method of any of claims 1 to 7.
The present invention also provides an electronic device, comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method of any one of claims 1 to 7 according to instructions in the program code.
(III) Advantageous effects
The invention provides an endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception. Compared with the prior art, the invention has the following beneficial effects:
The method acquires an endoscope image; performs depth estimation on the current frame of the endoscope image based on a preset multitask neural network model to obtain the point cloud depth of the current frame; acquires a local point cloud based on the point cloud depth and the camera model; registers and fuses the plurality of local point clouds; and splices the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, which is displayed visually. The invention overcomes the technical problem that existing deep-learning-based endoscope image three-dimensional reconstruction methods can only estimate the depth-of-field information of the current endoscope image and cannot reconstruct and dynamically update the whole three-dimensional model, and realizes endoscope image three-dimensional reconstruction for minimally invasive surgery spatial perception.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a block diagram of an endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to an embodiment of the invention;
FIG. 2 is a structural diagram of the multitask neural network model in an embodiment of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the given embodiments without creative effort fall within the protection scope of the present invention.
The embodiments of the present application provide an endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception, solving the technical problem that existing methods cannot reconstruct and dynamically update the whole three-dimensional model, and realizing endoscope image three-dimensional reconstruction for minimally invasive surgery spatial perception.
In order to solve the technical problems, the general idea of the embodiment of the application is as follows:
the development of minimally invasive surgery can be assisted by endoscope image three-dimensional reconstruction, better perception experience is brought to a doctor, and the surgery precision is improved. The existing endoscope image three-dimensional reconstruction method based on depth learning is limited to three-dimensional reconstruction (depth estimation) of a single image, and is less related to global three-dimensional reconstruction. Global three-dimensional reconstruction models most of the research has focused on non-rigid registration with preoperative CT/MRI three-dimensional models, and the method fails when there is no preoperative three-dimensional model. In order to solve the above problems, the method of the embodiment of the present invention is provided, so as to overcome the current situation that the existing endoscope image three-dimensional reconstruction system based on deep learning can only estimate the depth of field information of the current endoscope image, and cannot reconstruct and dynamically update the whole three-dimensional model, and meanwhile, the embodiment of the present invention does not need the support of the preoperative CT/MRI image, and realizes the unsupervised real-time dynamic global three-dimensional reconstruction of the endoscope image.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
The embodiment of the invention provides a minimally invasive surgery space perception-oriented endoscope image three-dimensional reconstruction method, which comprises the following steps of S1-S5:
s1, acquiring an endoscope image;
s2, depth estimation is carried out on the current frame of the endoscope image based on a preset multitask neural network model, and the point cloud depth of the current frame is obtained;
s3, acquiring local point cloud based on the point cloud depth and the camera model;
s4, carrying out registration fusion on the local point clouds;
and S5, splicing the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud.
The embodiment of the invention overcomes the technical problem that existing deep-learning-based endoscope image three-dimensional reconstruction methods can only estimate the depth-of-field information of the current endoscope image and cannot reconstruct and dynamically update the whole three-dimensional model, and realizes endoscope image three-dimensional reconstruction for minimally invasive surgery spatial perception.
The following describes each step in detail:
in step S1, an endoscopic image is acquired. The specific implementation process is as follows:
The endoscope is calibrated with OpenCV and a checkerboard to obtain the endoscope camera intrinsic parameters f_x, f_y, c_x, c_y and the distortion parameters k1, k2, k3, p1, p2, where k1, k2, k3 are radial distortion parameters and p1, p2 are tangential distortion parameters.
Soft tissue images are shot with the calibrated endoscope, the endoscope images are captured with OpenCV, and their resolution is modified to meet the input requirement of the multitask neural network model.
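As an illustration of this calibration step, the following sketch uses OpenCV's checkerboard calibration; the board size, square size and file names are assumptions, and OpenCV returns the distortion coefficients in the order (k1, k2, p1, p2, k3):

```python
import cv2
import numpy as np

pattern = (9, 6)          # inner-corner grid of the checkerboard (assumed)
square = 25.0             # square edge length in mm (assumed)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points = [], []
for path in ["calib_00.png", "calib_01.png"]:   # hypothetical image files
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K contains f_x, f_y, c_x, c_y; dist contains (k1, k2, p1, p2, k3).
_, K, dist, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```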
In step S2, depth estimation is performed on the current frame of the endoscopic image based on a preset multitask neural network model, and a point cloud depth of the current frame is obtained. The specific implementation process is as follows:
in the embodiment of the present invention, the process of constructing the preset multitask neural network model includes:
A1, acquiring and processing endoscope images, comprising:
The endoscope is calibrated with OpenCV and a checkerboard to obtain the endoscope camera intrinsic parameters f_x, f_y, c_x, c_y and the distortion parameters k1, k2, k3, p1, p2, where k1, k2, k3 are radial distortion parameters and p1, p2 are tangential distortion parameters.
Soft tissue images are shot with the calibrated, robot-supported endoscope, the endoscope images are captured with OpenCV, and their resolution is modified to meet the model input requirement. While the endoscope images are acquired, the camera pose corresponding to each frame of image is solved from the forward kinematics model of the robot, and the pose transformation between every two frames of endoscope images is computed.
Forward kinematics modeling uses the motion equations of the robot to solve the pose of the end effector from the state parameters of each robot joint, and then converts the end-effector pose into the endoscope camera pose. A common forward kinematics solution is the D-H parameter method. This process is common knowledge for those skilled in the art and is not described further here.
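By way of illustration, once each frame's camera pose has been obtained from forward kinematics, the pose transformation between two frames reduces to a single matrix product; in this sketch T_i and T_j are assumed to be 4x4 homogeneous camera poses expressed in the robot base frame:

```python
import numpy as np

def relative_motion(T_i: np.ndarray, T_j: np.ndarray) -> np.ndarray:
    """Pose of frame j's camera expressed in frame i's camera coordinates."""
    return np.linalg.inv(T_i) @ T_j
```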
A2, inputting the processed endoscope images into the initial neural network model, and training the initial neural network model to obtain the multitask neural network model. This specifically comprises the following steps:
The initial neural network model takes as input any two endoscope image frames having many matching points, and outputs the camera pose transformation vector between the two frames, comprising the rotation (r_x, r_y, r_z) and the translation (t_x, t_y, t_z), together with the reciprocal of the depth of each pixel of the latter frame.
The structure of the multitask neural network model is shown in FIG. 2. It comprises three types of convolution block layers and one global pooling layer, where a convolution block denotes a series of blocks composed of convolutional layers.
The two endoscope image frames first pass separately through convolution block I, with shared weights, to extract feature maps. The resulting first and second feature maps are spliced, and the spliced feature map passes through convolution block II to extract features suited to inter-frame motion vector estimation; the global pooling layer then yields the camera motion vector between the two frames. Meanwhile, the spliced feature map also passes through convolution block III to extract features suited to solving the depth information of the second endoscope image, giving the depth information features; the multi-scale feature maps output while the second endoscope image passes through convolution block I are skip-connected with the depth information features generated by convolution block III, and finally the multi-scale disparity map (i.e., the matrix formed by the reciprocals of the depths) for the second endoscope image is output.
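A minimal PyTorch sketch of this topology is given below; the channel widths, block depths and single-scale output are simplifying assumptions, not the configuration claimed by the patent:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):  # one "convolution block": conv + BN + ReLU
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolution block I: shared feature extractor for both frames.
        self.block1 = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        # Convolution block II: motion branch on the spliced features.
        self.block2 = nn.Sequential(conv_block(128, 128), conv_block(128, 256))
        self.pool = nn.AdaptiveAvgPool2d(1)   # global pooling layer
        self.motion = nn.Linear(256, 6)       # (r_x, r_y, r_z, t_x, t_y, t_z)
        # Convolution block III: depth branch, skip-connected with the
        # second frame's features.
        self.block3 = nn.Sequential(nn.Conv2d(128, 64, 3, padding=1),
                                    nn.ReLU(inplace=True))
        self.disp = nn.Conv2d(64 + 64, 1, 3, padding=1)

    def forward(self, img1, img2):
        f1, f2 = self.block1(img1), self.block1(img2)   # shared weights
        fused = torch.cat([f1, f2], dim=1)              # feature splicing
        motion = self.motion(self.pool(self.block2(fused)).flatten(1))
        depth_feat = self.block3(fused)
        disp = torch.sigmoid(self.disp(torch.cat([depth_feat, f2], dim=1)))
        return motion, disp                             # disp = 1 / depth
```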
During training, the network branch for camera inter-frame motion estimation is trained first; after it converges, the weights of the shared part are fixed and the network weights of the depth estimation part are trained further. This ensures that the model achieves good results even though the two tasks have inconsistent scales.
The model is trained in a self-supervised manner. The loss functions are as follows:
camera inter-frame motion estimation loss:
L_motion = Huber_δ1(t - t̂) + Huber_δ2(r - r̂)
where t̂ denotes the camera translation vector predicted by the neural network model; r̂ denotes the camera rotation vector predicted by the neural network model; t and r denote the reference translation and rotation vectors obtained from the robot poses; and δ1 and δ2 are the parameters of the Huber functions applied to the translation vector and the rotation vector, respectively.
The image restoration loss comprises a pixel error loss and a similarity error loss, specifically:
pixel error loss:
L_pixel = (1/(M·N)) · Σ_{i=1..M} Σ_{j=1..N} Huber_θ(I(i, j) - Î(i, j))
where M and N denote the pixel width and height of the image, respectively; I(i, j) denotes the true pixel value of the second image at coordinate (i, j); Î(i, j) denotes the pixel value at coordinate (i, j) of the second image reconstructed by the algorithm; and θ is the Huber function parameter for the pixel error.
similarity error loss:
L_sim = 1 - Sim(I, Î)
where Sim denotes an image similarity evaluation function, such as SSIM or PSNR, whose value lies between 0 and 1; I denotes the true second image; and Î denotes the second image reconstructed by the algorithm.
depth smoothing error loss:
L_smooth = (1/(M·N)) · Σ_{i,j} (|∂_x D(i, j)| + |∂_y D(i, j)|)
where D(i, j) denotes the reciprocal of the estimated depth of the second image at coordinate (i, j).
The final total loss function is a weighted sum of the above loss functions; the weight of each part is obtained through neural network hyper-parameter learning.
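A hedged PyTorch sketch of these four terms follows; the delta values, the use of PyTorch's Huber loss and the externally supplied similarity function are assumptions, not the patented settings:

```python
import torch
import torch.nn.functional as F

def motion_loss(t_pred, t_true, r_pred, r_true, d1=1.0, d2=0.1):
    # Huber penalties on translation and rotation, with parameters d1, d2.
    return (F.huber_loss(t_pred, t_true, delta=d1)
            + F.huber_loss(r_pred, r_true, delta=d2))

def pixel_loss(I_rec, I_true, theta=0.5):
    # Mean Huber error over the M*N pixels of the reconstructed image.
    return F.huber_loss(I_rec, I_true, delta=theta)

def similarity_loss(I_rec, I_true, sim_fn):
    # sim_fn returns a similarity in [0, 1], e.g. an SSIM implementation.
    return 1.0 - sim_fn(I_rec, I_true)

def smoothness_loss(disp):
    # First-order smoothness of the disparity (inverse depth) map D.
    dx = (disp[..., :, 1:] - disp[..., :, :-1]).abs().mean()
    dy = (disp[..., 1:, :] - disp[..., :-1, :]).abs().mean()
    return dx + dy
```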
It should be noted that once the model has been trained it can be reused without retraining.
In actual use, endoscope image data collected during use can be used to update the model periodically, which preserves the precision of the model.
The trained multitask neural network model performs depth estimation on the current frame over a time window of M frames, where M is usually 3: if the current frame is i, frames i-3, i-2 and i-1 are each paired with frame i and fed into the neural network model to compute the depth of frame i, and the average of the three estimates is taken as the depth of frame i.
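This time-window averaging can be sketched as follows, assuming M = 3, a model like the sketch above and a list of image tensors:

```python
import torch

def windowed_depth(model, frames, i, M=3):
    disps = []
    for k in range(i - M, i):
        _, disp = model(frames[k], frames[i])  # pair (k, i): disparity of frame i
        disps.append(disp)
    mean_disp = torch.stack(disps).mean(dim=0)
    return 1.0 / mean_disp.clamp(min=1e-6)     # depth is the reciprocal
```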
In step S3, a local point cloud is acquired based on the point cloud depth and the camera model. The specific implementation process is as follows:
First, distortion correction is performed on the endoscope image based on the camera distortion parameters. For an undistorted image pixel coordinate (u, v), the pixel value restoration steps are as follows.
For the normalized plane of the undistorted image:
x′ = (u - c_x) / f_x
y′ = (v - c_y) / f_y
where (x′, y′) denotes the coordinate on the normalized plane corresponding to the undistorted image pixel coordinate (u, v).
After distortion, the coordinate on the normalized plane becomes (x″, y″), with:
x″ = x′·(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2·p1·x′·y′ + p2·(r² + 2·x′²)
y″ = y′·(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1·(r² + 2·y′²) + 2·p2·x′·y′
where r² = x′² + y′².
The distorted normalized plane coordinate is projected onto the pixel plane to obtain the pixel coordinate:
u_d = f_x·x″ + c_x
v_d = f_y·y″ + c_y
Therefore, the pixel value at the undistorted image coordinate (u, v) is the pixel value at the distorted image coordinate (u_d, v_d). Since u_d and v_d are usually non-integer, the pixel value at (u_d, v_d) is obtained by bilinear interpolation.
The bilinear interpolation is as follows. If u_d and v_d are both non-integer, take the integers u_1 and v_1 such that u_1 < u_d < u_1 + 1 and v_1 < v_d < v_1 + 1. Then:
I(u_d, v_d) = (v_1 + 1 - v_d)·I(u_d, v_1) + (v_d - v_1)·I(u_d, v_1 + 1)
where
I(u_d, v_1) = (u_1 + 1 - u_d)·I(u_1, v_1) + (u_d - u_1)·I(u_1 + 1, v_1)
I(u_d, v_1 + 1) = (u_1 + 1 - u_d)·I(u_1, v_1 + 1) + (u_d - u_1)·I(u_1 + 1, v_1 + 1)
Then the x and y of the point cloud are solved from the camera model; the solution formulas are:
x = z·(u - c_x) / f_x
y = z·(v - c_y) / f_y
Taking the point cloud depth of step S2 as z, together with x and y this yields the local point cloud in the endoscope camera coordinate system corresponding to the current frame.
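As an illustration of this back-projection, the sketch below leaves the undistortion to cv2.undistort, which applies the same distortion model and bilinear resampling described above; K and dist are the calibration outputs, and z is the depth map from step S2:

```python
import cv2
import numpy as np

def local_point_cloud(image, z, K, dist):
    undist = cv2.undistort(image, K, dist)      # pixel value restoration
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    h, w = z.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = z * (u - cx) / fx                       # x = z(u - c_x)/f_x
    y = z * (v - cy) / fy                       # y = z(v - c_y)/f_y
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = undist.reshape(-1, 3) / 255.0      # per-point BGR colors in [0, 1]
    return points, colors
```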
In step S4, registration fusion is performed on the plurality of local point clouds. The specific implementation process is as follows:
In the specific implementation, the local point cloud must be filtered before registration fusion; in the embodiment of the invention, a filtering algorithm removes outliers and noisy data from the local point cloud.
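One possible filtering step, assuming Open3D's statistical outlier removal with illustrative parameter values:

```python
import open3d as o3d

def filter_cloud(pcd: o3d.geometry.PointCloud) -> o3d.geometry.PointCloud:
    # Drop points whose mean neighbor distance deviates strongly from
    # the cloud-wide average (outliers and noisy data).
    clean, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    return clean
```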
Registration fusion is divided into two cases:
In the first case, the endoscope is supported by a robot: the camera pose corresponding to each frame of image is obtained, and the inter-frame motion information of the endoscope camera is obtained through pose transformation.
In the second case, the endoscope is not supported by a robot: the inter-frame motion information is obtained from the multitask neural network model.
The inter-frame motion information is taken as the initial value of the point cloud registration, and the local point clouds are then registered and fused with a coherent point drift algorithm, which is suited to flexible (non-rigid) point cloud registration.
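A sketch of this registration step using the third-party pycpd implementation of coherent point drift; pre-transforming the source cloud with the inter-frame motion estimate is one way to supply the initial value, and is an assumption here:

```python
import numpy as np
from pycpd import DeformableRegistration

def register_cpd(source, target, T_init):
    # Apply the inter-frame motion estimate as the registration initial value.
    src_h = np.hstack([source, np.ones((len(source), 1))])
    src_init = (T_init @ src_h.T).T[:, :3]
    # Non-rigid (coherent point drift) registration of source onto target.
    reg = DeformableRegistration(X=target, Y=src_init)
    aligned, _ = reg.register()
    return aligned
```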
In step S5, the registered and fused local point clouds are spliced to form a global point cloud that is flexibly transformed over time, and the global point cloud is visually displayed. The specific implementation process is as follows:
and splicing the point clouds by adopting a dynamic updating mechanism, and performing visual display on the global point clouds by adopting PCL (polycaprolactone), Open3D, Chai3D and other libraries to form the global point clouds flexibly transformed along with the time.
Based on the same inventive concept, the embodiment of the invention also provides an endoscope image three-dimensional reconstruction system for minimally invasive surgery spatial perception, comprising:
the acquisition module is used for acquiring an endoscope image;
the depth estimation module is used for carrying out depth estimation on a current frame of the endoscope image based on a preset multitask neural network model to obtain the point cloud depth of the current frame;
the local point cloud obtaining module is used for obtaining a local point cloud based on the point cloud depth and the camera model;
the registration fusion module is used for performing registration fusion on the local point clouds;
and the global point cloud generating module is used for splicing the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud.
It can be understood that the endoscope image three-dimensional reconstruction system for minimally invasive surgery spatial perception provided by the embodiment of the invention corresponds to the endoscope image three-dimensional reconstruction method described above; for explanations, examples and beneficial effects of its content, refer to the corresponding parts of the method, which are not repeated here.
Based on the same inventive concept, the embodiment of the invention also provides a computer-readable storage medium storing program code, where the program code is used for executing the endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception.
Based on the same inventive concept, an embodiment of the present invention further provides an electronic device, where the electronic device includes a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to the instructions in the program code.
In summary, compared with the prior art, the method has the following beneficial effects:
1. The embodiment of the invention overcomes the limitation that existing deep-learning-based endoscope image three-dimensional reconstruction methods can only estimate the depth-of-field information of the current endoscope image and cannot reconstruct and dynamically update the whole three-dimensional model, and realizes endoscope image three-dimensional reconstruction for minimally invasive surgery spatial perception.
2. The training data of the multitask neural network model of the embodiment of the invention only require a robot-supported endoscope to collect endoscope image data and camera pose data; no depth information is needed. The data are easy to obtain and the applicability is strong.
3. The embodiment of the invention designs a multitask neural network model in which a single network recovers both the depth-of-field information and the camera inter-frame motion information.
4. The embodiment of the invention realizes unsupervised real-time dynamic global three-dimensional reconstruction without support from preoperative CT/MRI images.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between the entities or actions. The terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus comprising that element.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. An endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception, characterized by comprising the following steps:
s1, acquiring an endoscope image;
s2, depth estimation is carried out on the current frame of the endoscope image based on a preset multitask neural network model, and the point cloud depth of the current frame is obtained;
s3, acquiring local point cloud based on the point cloud depth and a camera model;
s4, carrying out registration fusion on the local point clouds;
and S5, splicing the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud.
2. The endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to claim 1, wherein the preset multitask neural network model comprises three types of convolution blocks and a global pooling layer, the three types being convolution block I, convolution block II and convolution block III, and the processing of the endoscope image by the multitask neural network model comprises:
extracting feature maps from a pair of endoscope image frames through two convolution blocks I to obtain a first feature map and a second feature map, wherein the network parameter weights of the two convolution blocks I are shared;
splicing the first feature map and the second feature map, and extracting features from the spliced feature map through convolution block II to obtain inter-frame motion vector estimation features;
pooling the inter-frame motion vector estimation features through the global pooling layer to obtain the camera motion vector between the two endoscope image frames;
performing further feature extraction on the spliced feature map through convolution block III to obtain depth information features; and skip-connecting the second feature map with the depth information features to output a multi-scale disparity map for the second endoscope image.
3. The endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to claim 1, wherein the training process of the preset multitask neural network model comprises:
acquiring and processing endoscope images;
inputting the processed endoscope images into an initial neural network model, and training the initial neural network model in a self-supervised manner to obtain the multitask neural network model;
wherein the loss functions used in the training process comprise:
a camera inter-frame motion estimation loss:
L_motion = Huber_δ1(t - t̂) + Huber_δ2(r - r̂)
wherein t̂ denotes the camera translation vector predicted by the neural network model; r̂ denotes the camera rotation vector predicted by the neural network model; t and r denote the reference translation and rotation vectors; and δ1 and δ2 are the parameters of the Huber functions applied to the translation vector and the rotation vector, respectively;
an image restoration loss comprising a pixel error loss and a similarity error loss, specifically:
the pixel error loss:
L_pixel = (1/(M·N)) · Σ_{i=1..M} Σ_{j=1..N} Huber_θ(I(i, j) - Î(i, j))
wherein M and N denote the pixel width and height of the image, respectively; I(i, j) denotes the true pixel value of the second image at coordinate (i, j); Î(i, j) denotes the pixel value at coordinate (i, j) of the second image reconstructed by the algorithm; and θ is the Huber function parameter for the pixel error;
the similarity error loss:
L_sim = 1 - Sim(I, Î)
wherein Sim denotes an image similarity evaluation function whose value lies between 0 and 1; I denotes the true second image; and Î denotes the second image reconstructed by the algorithm;
a depth smoothing error loss:
L_smooth = (1/(M·N)) · Σ_{i,j} (|∂_x D(i, j)| + |∂_y D(i, j)|)
wherein D(i, j) denotes the reciprocal of the estimated depth of the second image at coordinate (i, j);
and the total loss function is a weighted sum of the above loss functions, the weight of each part being obtained through neural network hyper-parameter learning.
4. The endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to claim 1, wherein said S3 comprises:
performing distortion correction on the endoscope image based on the camera distortion parameters, wherein the pixel value restoration step for an undistorted image pixel coordinate (u, v) comprises:
for the normalized plane of the undistorted image:
x′ = (u - c_x) / f_x
y′ = (v - c_y) / f_y
wherein (x′, y′) denotes the coordinate on the normalized plane corresponding to the undistorted image pixel coordinate (u, v);
after distortion, the coordinate on the normalized plane is (x″, y″), with:
x″ = x′·(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2·p1·x′·y′ + p2·(r² + 2·x′²)
y″ = y′·(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1·(r² + 2·y′²) + 2·p2·x′·y′
wherein r² = x′² + y′²;
projecting the distorted normalized plane coordinate onto the pixel plane to obtain the pixel coordinate:
u_d = f_x·x″ + c_x
v_d = f_y·y″ + c_y
whereby the pixel value at the undistorted image coordinate (u, v) is the pixel value at the distorted image coordinate (u_d, v_d); and since u_d and v_d are usually non-integer, the pixel value at (u_d, v_d) can be obtained by bilinear interpolation;
the bilinear interpolation being as follows: if u_d and v_d are both non-integer, taking the integers u_1 and v_1 such that u_1 < u_d < u_1 + 1 and v_1 < v_d < v_1 + 1, then:
I(u_d, v_d) = (v_1 + 1 - v_d)·I(u_d, v_1) + (v_d - v_1)·I(u_d, v_1 + 1)
wherein
I(u_d, v_1) = (u_1 + 1 - u_d)·I(u_1, v_1) + (u_d - u_1)·I(u_1 + 1, v_1)
I(u_d, v_1 + 1) = (u_1 + 1 - u_d)·I(u_1, v_1 + 1) + (u_d - u_1)·I(u_1 + 1, v_1 + 1);
solving the x and y of the point cloud from the restored coordinates and the camera model, specifically:
x = z·(u - c_x) / f_x
y = z·(v - c_y) / f_y
and taking the point cloud depth of step S2 as z, which together with x and y gives the local point cloud in the endoscope camera coordinate system corresponding to the current frame.
5. The endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to claim 1, wherein said S4 comprises:
if the endoscope is supported by a robot, obtaining the camera pose corresponding to each frame of image, and obtaining the inter-frame motion information of the endoscope camera through pose transformation;
taking the inter-frame motion information as the initial value of point cloud registration, and registering and fusing the plurality of local point clouds with a coherent point drift algorithm.
6. The endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to claim 1, wherein said S4 further comprises:
if the endoscope is not supported by a robot, obtaining the inter-frame motion information through the multitask neural network model;
taking the inter-frame motion information as the initial value of point cloud registration, and registering and fusing the plurality of local point clouds with a coherent point drift algorithm.
7. The endoscope image three-dimensional reconstruction method for minimally invasive surgery spatial perception according to claim 1, wherein said S5 comprises:
splicing the registered and fused local point clouds with a dynamic updating mechanism to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud with a three-dimensional data processing library.
8. An endoscope image three-dimensional reconstruction system for minimally invasive surgery spatial perception, characterized by comprising:
the acquisition module is used for acquiring an endoscope image;
the depth estimation module is used for carrying out depth estimation on a current frame of the endoscope image based on a preset multitask neural network model to obtain the point cloud depth of the current frame;
the local point cloud obtaining module is used for obtaining a local point cloud based on the point cloud depth and the camera model;
the registration fusion module is used for performing registration fusion on the local point clouds;
and the global point cloud generating module is used for splicing the registered and fused local point clouds to form a global point cloud that deforms flexibly over time, and visually displaying the global point cloud.
9. A computer-readable storage medium for storing program code for performing the method of any one of claims 1-7.
10. An electronic device, comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method of any one of claims 1 to 7 according to instructions in the program code.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202110106321.XA (granted as CN112802185B) | 2021-01-26 | 2021-01-26 | Endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception
Publications (2)

Publication Number | Publication Date
---|---
CN112802185A | 2021-05-14
CN112802185B | 2022-08-02
Family ID: 75811926
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202110106321.XA (CN112802185B, active) | Endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception | 2021-01-26 | 2021-01-26
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112802185B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200402250A1 (en) * | 2017-11-15 | 2020-12-24 | Google Llc | Unsupervised learning of image depth and ego-motion prediction neural networks |
CN109448041A (en) * | 2018-10-29 | 2019-03-08 | 重庆金山医疗器械有限公司 | A kind of capsule endoscope 3-dimensional reconstruction method and system |
US20200219272A1 (en) * | 2019-01-07 | 2020-07-09 | The University Of North Carolina At Chapel Hill | Methods, systems, and computer readable media for deriving a three-dimensional (3d) textured surface from endoscopic video |
WO2020259248A1 (en) * | 2019-06-28 | 2020-12-30 | Oppo广东移动通信有限公司 | Depth information-based pose determination method and device, medium, and electronic apparatus |
CN111772792A (en) * | 2020-08-05 | 2020-10-16 | 山东省肿瘤防治研究院(山东省肿瘤医院) | Endoscopic surgery navigation method, system and readable storage medium based on augmented reality and deep learning |
Non-Patent Citations (4)

- WU, Airong et al.: "Diagnostic value of endoscopic ultrasonography for submucosal tumors of upper gastrointestinal tract", Chinese Journal of Gastrointestinal Surgery
- GENG, Guohua et al.: "Key techniques in an interactive real-time virtual endoscopy system", Journal of Computer Applications (《计算机应用》)
- HENG, Yiling et al.: "Three-dimensional bladder scene reconstruction based on sequential endoscopic video images", Science Technology and Engineering (《科学技术与工程》)
- ZHAO, Kuangjun: "Indoor three-dimensional color point cloud map construction based on an RGB-D camera", Journal of Harbin University of Commerce, Natural Sciences Edition (《哈尔滨商业大学学报(自然科学版)》)
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113435573A (en) * | 2021-06-07 | 2021-09-24 | 华中科技大学 | Method for establishing parallax prediction model of endoscope image and depth estimation method |
CN113435573B (en) * | 2021-06-07 | 2022-04-29 | 华中科技大学 | Method for establishing parallax prediction model of endoscope image and depth estimation method |
CN114387153A (en) * | 2021-12-13 | 2022-04-22 | 复旦大学 | Visual field expanding method for intubation robot |
CN113925441A (en) * | 2021-12-17 | 2022-01-14 | 极限人工智能有限公司 | Imaging method and imaging system based on endoscope |
CN117671012A (en) * | 2024-01-31 | 2024-03-08 | 临沂大学 | Method, device and equipment for calculating absolute and relative pose of endoscope in operation |
CN117671012B (en) * | 2024-01-31 | 2024-04-30 | 临沂大学 | Method, device and equipment for calculating absolute and relative pose of endoscope in operation |
Also Published As
Publication number | Publication date |
---|---|
CN112802185B (en) | 2022-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112802185B (en) | Endoscope image three-dimensional reconstruction method and system for minimally invasive surgery spatial perception | |
CN111161290B (en) | Image segmentation model construction method, image segmentation method and image segmentation system | |
JP5153620B2 (en) | System for superimposing images related to a continuously guided endoscope | |
JP5335280B2 (en) | Alignment processing apparatus, alignment method, program, and storage medium | |
JP4885138B2 (en) | Method and system for motion correction in a sequence of images | |
Wu et al. | Three-dimensional modeling from endoscopic video using geometric constraints via feature positioning | |
CN111080778B (en) | Online three-dimensional reconstruction method of binocular endoscope soft tissue image | |
EP1685538A1 (en) | Device and method for generating a three-dimensional vascular model | |
US20220198693A1 (en) | Image processing method, device and computer-readable storage medium | |
CN114842154B (en) | Method and system for reconstructing three-dimensional image based on two-dimensional X-ray image | |
JP4613172B2 (en) | Method and apparatus for three-dimensional reconstruction of object from projection image | |
CN111161330B (en) | Non-rigid image registration method, device, system, electronic equipment and storage medium | |
CN112562070A (en) | Craniosynostosis operation cutting coordinate generation system based on template matching | |
CN113538335A (en) | In-vivo relative positioning method and device of wireless capsule endoscope | |
Deligianni et al. | Non-rigid 2d-3d registration with catheter tip em tracking for patient specific bronchoscope simulation | |
JP5051025B2 (en) | Image generating apparatus, program, and image generating method | |
CN114399527A (en) | Method and device for unsupervised depth and motion estimation of monocular endoscope | |
CN112150404B (en) | Global-to-local non-rigid image registration method and device based on joint saliency map | |
JP2022052210A (en) | Information processing device, information processing method, and program | |
Bouattour et al. | 4D reconstruction of coronary arteries from monoplane angiograms | |
CN115281584B (en) | Flexible endoscope robot control system and flexible endoscope robot simulation method | |
Tsuda et al. | Recovering size and shape of polyp from endoscope image by RBF-NN modification | |
JP5706933B2 (en) | Processing apparatus, processing method, and program | |
CN115100092B (en) | Subtraction method and device for coronary CT image, electronic equipment and storage medium | |
WO2024050918A1 (en) | Endoscope positioning method, electronic device, and non-transitory computer-readable storage medium |
Legal Events

- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant