CN116602764A - End-to-end positioning and navigation method and device for ophthalmic surgery - Google Patents

End-to-end positioning and navigation method and device for ophthalmic surgery

Info

Publication number
CN116602764A
Authority
CN
China
Prior art keywords
image
eyeball
registration
polar coordinate
network
Prior art date
Legal status
Pending
Application number
CN202310577883.1A
Other languages
Chinese (zh)
Inventor
Wang Zhao (王钊)
Zhai Yuxuan (翟雨轩)
Ji Chunsheng (纪淳升)
Wang Yaqi (王雅琦)
Zhang Wei (张炜)
Current Assignee
Shanxi Zhiyuan Huitu Technology Co ltd
University of Electronic Science and Technology of China
Original Assignee
Shanxi Zhiyuan Huitu Technology Co ltd
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by Shanxi Zhiyuan Huitu Technology Co ltd, University of Electronic Science and Technology of China filed Critical Shanxi Zhiyuan Huitu Technology Co ltd
Priority to CN202310577883.1A
Publication of CN116602764A
Legal status: Pending


Classifications

    • A61B 34/20: Surgical navigation systems; devices for tracking or guiding surgical instruments, e.g. for frameless stereotaxis
    • A61B 34/10: Computer-aided planning, simulation or modelling of surgical operations
    • A61B 90/20: Surgical microscopes characterised by non-optical aspects
    • A61B 90/361: Image-producing devices, e.g. surgical cameras
    • G02B 21/0012: Surgical microscopes
    • G02B 21/18: Arrangements with more than one light path, e.g. for comparing two specimens
    • G02B 21/361: Optical details, e.g. image relay to the camera or image sensor
    • G02B 21/365: Control or image processing arrangements for digital or video microscopes
    • G02B 21/367: Control or image processing arrangements providing an output produced by processing a plurality of individual source images, e.g. image tiling, montage, composite images, depth sectioning, image comparison
    • G06N 3/045: Combinations of networks
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/08: Learning methods
    • G06T 7/33: Determination of transform parameters for the alignment of images (image registration) using feature-based methods
    • G06V 10/26: Segmentation of patterns in the image field; detection of occlusion
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • A61B 2034/101: Computer-aided simulation of surgical operations
    • A61B 2034/2065: Tracking using image or pattern recognition
    • A61B 2090/3612: Image-producing devices, e.g. surgical cameras, with images taken automatically
    • G06T 2207/10056: Microscopic image
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30004: Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Surgery (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Chemical & Material Sciences (AREA)
  • Optics & Photonics (AREA)
  • Artificial Intelligence (AREA)
  • Veterinary Medicine (AREA)
  • Animal Behavior & Ethology (AREA)
  • Analytical Chemistry (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Public Health (AREA)
  • Computing Systems (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Pathology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Robotics (AREA)
  • Databases & Information Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the field of medical image processing, and in particular to an end-to-end positioning and navigation method and device for ophthalmic surgery. An encoding and decoding network first extracts eyeball features, and a multi-class segmentation layer built on these features performs semantic segmentation to determine the eyeball center position. A polar coordinate sampler is then constructed so that intraoperative images acquired in real time can be projected directly into polar coordinates, while the eyeball feature map, the semantic segmentation result, and the preoperative image are likewise converted to the polar coordinate system. The current image is registered with the target image in polar coordinates to track eyeball rotation. During registration, a Siamese network is combined with a correlation filter, and Gaussian templates define the positions, number, and receptive field size of the tracked targets, improving computational efficiency. The weighted sum of the inter-frame registration result and the long-term registration result is taken as the final rotation, eliminating accumulated error and improving the stability and accuracy of continuous eyeball rotation registration.

Description

End-to-end positioning and navigation method and device for ophthalmic surgery
Technical Field
The invention relates to the field of medical image processing, and in particular to an end-to-end positioning and navigation method and device for ophthalmic surgery.
Background
In ophthalmic surgery the eyeball must be located and its rotation registered. In cataract surgery, for example, the position and size of the capsulorhexis and the implantation angle of the intraocular lens critically affect the patient's postoperative visual recovery. Traditionally, the surgeon marks the eyeball invasively with dye using a slit-lamp marking method to determine the capsulorhexis position and size and the lens implantation angle. This approach offers limited safety and poor precision, and easily causes patient discomfort. In recent years, the development of computer-navigated ophthalmic surgery has given surgeons the ability to locate the eyeball accurately, effectively improving the safety and efficiency of surgery and the patient's postoperative recovery.
The ophthalmic surgery navigation system integrated with a surgical microscope disclosed in CN111616800A comprises a surgical microscope, a video camera system, an eyeball positioning and navigation module, and an eyepiece projection module. The system performs iris boundary segmentation and rotation tracking of scleral capillaries at multiple positions, and eliminates accumulated error by registration against a reference image. Its rotation tracking selects several boxes at the iris boundary and tracks them with multiple single-target trackers, which cope poorly with large rotations.
In addition, the ophthalmic surgery navigation system and electronic device disclosed in CN112043383B enrich intraoperative information feedback by registering a three-dimensional OCT image with the two-dimensional surgical microscope image, and display the surgical scene on a 3D display to improve the safety and accuracy of surgery. However, the registration method between the three-dimensional OCT image and the two-dimensional surgical image, and the lesion segmentation method, remain vague. Moreover, three-dimensional registration of high-resolution images is computationally heavy, making real-time intraoperative navigation difficult.
The ophthalmic surgery navigation systems disclosed in the prior art therefore have certain shortcomings, and research into navigation methods with higher registration accuracy is of positive significance.
Disclosure of Invention
The invention aims to provide an end-to-end positioning and navigation method and device for ophthalmic surgery, so as to improve the efficiency and registration accuracy of ophthalmic surgical navigation.
In order to achieve the above purpose, the invention adopts the following technical scheme:
an end-to-end ophthalmic surgery navigation positioning method comprises the following steps:
step 1, acquiring preoperative images of eyes of a patient;
step 2, constructing a coding and decoding network, wherein the coding and decoding network is a neural network based on a U-Net structure, and the coding network inputs eyeball images of a patient and outputs deep and shallow feature images of the extracted images; the deep features of the image represent global information, and the shallow features represent local information; the decoding network inputs the deep and shallow characteristic images of the image, and outputs the eyeball characteristic images fused with the deep and shallow characteristics; wherein the encoder adopts a TransU-Net or an Efficient Net encoder;
step 3, constructing a plurality of segmentation layers for semantically segmenting the iris, sclera and cornea regions of the eye; the multi-class segmentation layer consists of a plurality of convolution layers with the convolution kernel size of 1, wherein the input of the multi-class segmentation layer is an eyeball characteristic image, and the output of the multi-class segmentation layer is a semantic segmentation result, namely a segmentation prediction image of iris, sclera and cornea;
step 4, calculating the center of the iris region according to the semantic segmentation result, and taking the center as the center position of the eyeball; constructing a polar coordinate sampler by taking the center position of an eyeball as a pole;
step 5, acquiring an intraoperative image in real time by utilizing the polar coordinate sampler constructed in the step 4, converting the intraoperative image into a polar coordinate system, and inputting the preoperative image, the semantic segmentation result and the eyeball characteristic map into the polar coordinate sampler to be converted into a polar coordinate image;
step 6, registering the current image and the target image under a polar coordinate system to obtain rotation information of the eyeball, thereby completing rotation navigation of the eyeball;
step 6.1, constructing and training a twin network to extract scleral capillary characteristics of the search image and the target image under a polar coordinate system; the search image is a current frame image in the real-time image, the target image comprises a preoperative image and a previous frame image, wherein the previous frame image is a target image registered between frames, and the preoperative image is used as a target image registered for a long time;
step 6.2, defining a multi-point Gaussian template, wherein the Gaussian template is an image with image gray values distributed by a Gaussian function, the peak position of the gray values is a target position, and the position of the Gaussian distribution in the Gaussian template is a position where the target image is expected to be registered; training a correlation filter on line by using a Gaussian template and a target image, and calculating the cross correlation of the correlation filter and a search image to obtain a correlation response diagram;
step 6.3, constructing a confidence coefficient network by taking the correlation response graph as input to obtain a target position area prediction graph; multiplying the target position area prediction graph by the correlation response graph, and outputting the product as a registration result; the peak position in the registration result is the target position, the position correlation response value is the confidence coefficient of the registration result, and the displacement of the peak position on the angle coordinate axis of the Gaussian template under the polar coordinate system is the rotation of the point position;
step 6.4, calculating displacement average values of a plurality of target positions in the search image in the Gaussian template, and using the displacement average values as rotation amounts of the search image compared with the target image, namely an interframe registration result and a long-time registration result; calculating a weighted sum of the interframe registration result and the long-term registration result as a final rotation amount of the eyeball of the current frame image; and calculating an iris region of the current frame image according to the final rotation amount, registering the current image and the target image, and completing eyeball center positioning and eyeball rotation tracking.
Further, the end-to-end ophthalmic surgery positioning and navigation method also comprises:
step 7, performing data enhancement on the registered images as follows:
capillary feature information from random positions of other ophthalmic surgery videos is superimposed at the registration positions; the capillary feature superimposed at the registration positions of each pair of registered images is the same, while different pairs of registered images receive different capillary features.
Further, several Gaussian templates as defined in step 6.2 are provided; a dense set of registration points is established from them, and non-rigid registration between the target image and the search image is performed.
Further, the Siamese network comprises a first neural network and a second neural network. The first neural network consists of a convolution layer and a deformation network connected in sequence; the second neural network has the same structure as the first and shares its weights. The convolution layers of both networks are connected to a polar coordinate sampler and receive preoperative or real-time intraoperative image data in the polar coordinate system, from which they extract and output capillary feature maps. The deformation network receives the semantic segmentation result and the capillary feature map in the polar coordinate system and, according to the limbus edge in the segmentation result converted to polar coordinates, applies deformation compensation to the polar-coordinate capillary feature map, obtaining a deformation-free iris-boundary capillary feature map.
Further, step 6.4 also includes calculating the translation during eyeball center positioning and rotation tracking, as follows:
step 6.4.1, constructing a segmentation branch of the Siamese network, consisting of a convolution layer and a classification layer; the convolution layer receives the iris-boundary capillary feature map extracted by the Siamese network and the eyeball feature map transformed from the rectangular coordinate system, and after fusion the classification layer segments the iris region, whose centroid in the rectangular coordinate system is the final eyeball center;
step 6.4.2, calculating the displacement of the eyeball center obtained in step 6.4.1 relative to the eyeball center at the start of surgery; this displacement is the eyeball translation.
Further, the weighted sum of the inter-frame and long-term registration results in step 6.4 is computed with a Kalman filtering algorithm, as follows:
sum the covariance of the inter-frame registration results and the covariance of the long-term registration, and compute the weight m from the result:
m=P/(P+Q)
where P represents the covariance of the inter-frame registration and Q represents the covariance of the long-term registration.
Further, step 3 also includes: providing a fully connected layer after the encoding and decoding network, whose input is connected to the eyeball feature map output by the decoder and which judges the current image state so as to exclude images in abnormal states.
Further, the encoder in step 2 is provided with two branches, CNN and Token Mixing. The CNN branch acquires local information by processing the surgical image through n convolution layers, n being a natural number greater than 1. The Token Mixing branch divides the image into S mutually non-overlapping patches as input, linearly maps all patches to a hidden layer, and adds position embeddings to the mapped features to preserve the relative position of each patch; an MLP network then operates on each channel dimension of the features and outputs a result containing the feature interactions. The local information acquired by the CNN branch and the global information acquired by the Token Mixing branch are fused with an attention gate.
The decoder adopts the same structure as Attention U-Net and uses attention gates to strengthen the features of the encoding part.
An ophthalmic surgical microscope device comprises an objective lens system, an eyepiece system, a beam splitter system, a video camera assembly, a surgical navigation module, and a projection and display module;
the objective lens system focuses the light beam on the patient's eye for clear imaging;
the eyepiece system projects the image plane into the surgeon's eyes, so that the microscope image and the projection pattern generated by the surgical navigation module can be observed directly;
the beam splitter system divides the light beam into two or more paths, one leading to the eyepiece system and another to the video camera assembly;
the video camera assembly transmits images or video to the surgical navigation module;
the surgical navigation module is connected to the projection and display module and executes the end-to-end ophthalmic surgery positioning and navigation method of any one of claims 1 to 7 to obtain eyeball segmentation navigation information; the projection and display module projects the received navigation information into the microscope field of view through the beam splitter system and the eyepiece system, or displays it on an external display fused with the real-time video acquired by the video camera assembly, assisting the surgeon in performing the operation.
Furthermore, the ophthalmic surgical microscope device may be a binocular stereoscopic ophthalmic surgical microscope, provided with left and right video camera assemblies and with a three-dimensional projection and display module;
the optical path of the left eyepiece is split by a beam splitter into a beam led to the left-eye video camera assembly, and the optical path of the right eyepiece is split by a beam splitter into a beam led to the right-eye video camera assembly;
the surgical navigation module receives the images acquired by the left-eye and right-eye video camera assemblies, calibrates the intrinsic and extrinsic parameters of the left and right eyepieces, establishes a binocular stereo vision system from the calibration result, and obtains the depth information of the eye in the left and right eyepiece images; it then executes the end-to-end ophthalmic surgery positioning and navigation method of any one of claims 1 to 7 on each view to obtain left and right eyeball segmentation navigation information, converts the navigation information into three-dimensional space, and takes the average position;
the three-dimensional projection and display module projects the averaged navigation information into the microscope field of view through the beam splitter system and the eyepiece system, or displays it on an external display fused with the real-time video acquired by the video camera assemblies, to assist the surgeon in performing the operation.
According to the end-to-end ophthalmic surgery positioning and navigation method provided by the invention, an encoding and decoding network first extracts eyeball features, and the multi-class segmentation layer built on these features performs semantic segmentation to determine the eyeball center, which serves as the pole of a polar coordinate sampler. The sampler can project intraoperative images acquired in real time directly into polar coordinates, and likewise converts the eyeball feature map, the semantic segmentation result, and the preoperative image into the polar coordinate system; the current image is then registered with the target image in polar coordinates, completing eyeball center positioning and iris boundary segmentation. During registration, a Siamese network is combined with a correlation filter, and Gaussian templates define the positions, number, and receptive field size of the tracked targets. This avoids the dependence of anchor settings on prior parameters such as size, aspect ratio, and number, improves computational efficiency, and is more reasonable and accurate for vessel tracking, where targets have no clear boundary. Taking the weighted sum of the inter-frame and long-term registration results as the final rotation eliminates accumulated error, improves the stability of continuous eyeball rotation registration, and weakens the interference of erroneous registration results. Defining several Gaussian templates establishes a dense set of registration points for non-rigid registration of the target and search images, realizing inter-frame and long-term registration simultaneously; the number of registration targets is increased simply by adding Gaussian templates, without repeating feature extraction, which is more efficient than using multiple single-target trackers. In addition, the whole registration process is completed in polar coordinates, which turn target rotation into target translation; once rotation becomes translation, the computation is simpler and more efficient.
Applied as the navigation module of an ophthalmic surgical microscopy imaging system, the end-to-end positioning and navigation method helps surgeons complete surgical tasks efficiently, accurately, and safely, laying a good foundation for patients' postoperative recovery.
Drawings
FIG. 1 is a workflow diagram of the end-to-end ophthalmic surgery positioning and navigation method of an embodiment as applied in ophthalmic surgery and its devices;
FIG. 2 is a flow chart of the end-to-end ophthalmic surgery positioning and navigation method of the present invention;
FIG. 3 is a block diagram of the Siamese network portion of the present invention;
FIG. 4 illustrates the registration-label data enhancement method for the network training set of the present invention;
FIG. 5 illustrates the Gaussian template setup and the relation between the registration response and the target-region prediction in the present invention;
FIG. 6 is a schematic diagram of the processing of binocular stereo microscope images in accordance with the present invention;
FIG. 7 shows an ophthalmic surgical microscope device of the present invention;
FIG. 8 shows a binocular stereoscopic surgical microscope of the present invention.
Detailed Description
The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings.
The end-to-end ophthalmic surgery positioning and navigation method provided in this embodiment, as shown in FIG. 2, comprises the following steps:
Step 1, acquiring a preoperative image of the patient's eye; the image must be sharp and free of smear, contain the iris and part of the sclera, and have the pupil centered.
Step 2, constructing an encoding and decoding network, a neural network based on the U-Net structure comprising an encoder 201 and a decoder 202 connected to each other. The encoding network takes the patient's eyeball image as input and outputs the extracted deep and shallow feature maps of the image, where deep features represent global information and shallow features represent local information. The decoding network takes these feature maps as input and outputs an eyeball feature map fusing the deep and shallow features. In the encoding and decoding network of this embodiment, the encoder 201 may be a TransU-Net encoder or an EfficientNet encoder.
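For illustration, the following is a minimal PyTorch sketch of such an encoder-decoder together with the kernel-size-1 segmentation head of step 3; the channel widths, the depth, and the plain convolutional encoder are illustrative placeholders rather than the TransU-Net or EfficientNet encoders named above.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class CodecNet(nn.Module):
    """U-Net-style encoder-decoder (steps 2-3). Input: eyeball image;
    outputs: fused eyeball feature map and 4-class segmentation logits
    (background / iris / sclera / cornea)."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2, self.enc3 = conv_block(3, 32), conv_block(32, 64), conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2, self.dec2 = nn.ConvTranspose2d(128, 64, 2, 2), conv_block(128, 64)
        self.up1, self.dec1 = nn.ConvTranspose2d(64, 32, 2, 2), conv_block(64, 32)
        self.seg_head = nn.Conv2d(32, 4, kernel_size=1)  # multi-class layer of step 3

    def forward(self, x):
        s1 = self.enc1(x)                    # shallow features: local information
        s2 = self.enc2(self.pool(s1))
        deep = self.enc3(self.pool(s2))      # deep features: global information
        d2 = self.dec2(torch.cat([self.up2(deep), s2], 1))
        feat = self.dec1(torch.cat([self.up1(d2), s1], 1))  # fused feature map
        return feat, self.seg_head(feat)
```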
In addition, because a Transformer network has many parameters and a high computational cost while the receptive field of CNN convolutions is limited, in practical applications CNN and Token Mixing can be combined to obtain both local and global image information. That is, the encoder is given two branches, CNN and Token Mixing. The CNN branch acquires local information by processing the surgical image through n convolution layers, n being a natural number greater than 1 (e.g., 2, 3, 4, 5). The Token Mixing branch acquires global information: let the original image be H×W; it is divided into S mutually non-overlapping patches of size P×P as input, so that S = H·W/P². All patches are linearly mapped to the hidden layer and, as in a Transformer, position embeddings are added to the mapped features to preserve the relative position of each patch. To obtain global information, an MLP operates on each channel dimension of the features, and the output contains the results of feature interaction. Finally, an Attention Gate fuses the local and global information. The decoder adopts the same structure as Attention U-Net and uses attention gates to strengthen the features of the encoding part, making the algorithm more robust.
This combination of CNN and Token Mixing makes full use of local and global information, acquiring low-level spatial information and high-level semantic information simultaneously; moreover, Token Mixing is based on MLPs, whose parameter count is smaller than a Transformer's, suiting deployment in a real-time system.
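A minimal sketch of such a Token Mixing branch is given below; the patch size, hidden width, and the Mixer-style token/channel MLP ordering are assumptions for illustration, not the exact configuration of the invention.

```python
import torch
import torch.nn as nn

class TokenMixingBranch(nn.Module):
    """Global branch: split the image into S = H*W/P^2 non-overlapping P*P
    patches, map them linearly to a hidden layer, add position embeddings,
    then mix with MLPs. Patch size and widths are illustrative."""
    def __init__(self, img=256, patch=16, dim=128):
        super().__init__()
        s = (img // patch) ** 2
        self.embed = nn.Conv2d(3, dim, patch, stride=patch)  # linear map per patch
        self.pos = nn.Parameter(torch.zeros(1, s, dim))      # position embedding
        self.token_mlp = nn.Sequential(nn.Linear(s, s), nn.GELU(), nn.Linear(s, s))
        self.chan_mlp = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        t = self.embed(x).flatten(2).transpose(1, 2) + self.pos    # (B, S, dim)
        t = t + self.token_mlp(t.transpose(1, 2)).transpose(1, 2)  # interaction across patches
        return t + self.chan_mlp(t)                                # MLP over channel dims
```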
Step 3, constructing a multi-class segmentation layer 204 for semantic segmentation of the iris, sclera, and cornea regions of the eye. The multi-class segmentation layer 204 consists of convolution layers with kernel size 1; its input is the eyeball feature map and its output is the semantic segmentation result, i.e., the segmentation prediction maps of the iris, sclera, and cornea.
Step 4, calculating the center of the iris region from the semantic segmentation result and taking it as the eyeball center position (X_center, Y_center); a polar coordinate sampler is then constructed with the eyeball center as the pole.
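A minimal sketch of this center computation, assuming the segmentation result is a label map in which the iris carries a fixed label value:

```python
import numpy as np

def eyeball_center(seg, iris_label=1):
    """Centroid of the predicted iris region, used as the pole
    (X_center, Y_center) of the polar sampler. `seg` is the H*W label map;
    the iris label value is an assumption."""
    ys, xs = np.nonzero(seg == iris_label)
    return float(xs.mean()), float(ys.mean())
```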
Step 5, acquiring intraoperative images in real time and converting them into the polar coordinate system with the polar coordinate sampler constructed in step 4; the preoperative image, the feature map, and the semantic segmentation result are likewise fed into the polar coordinate sampler and converted into polar coordinate images. In the polar coordinate sampler, the conversion between rectangular and polar coordinates is as follows.
Pixel grid matrices r and θ of the polar image are established, and with the eyeball center (X_center, Y_center) as origin a coordinate mapping between the polar and rectangular coordinate systems is built:
r = R_h×w, R_ij = j
θ = A_h×w, A_ij = 2i/h − 1
x = X_center + f·r·cos(θπ + t)
y = Y_center + f·r·sin(θπ + t)
where w and h are the width and height of the polar image and the matrices r and θ have size h×w; S_eye is the area of the iris and pupil region, b is a constant, f is the scaling of the polar coordinate image, determined from the iris radius √(S_eye/π) and the constant b so that it varies with the iris radius, and t is the starting angle of the polar transform. The products involving r and θ are element-wise, while other matrix products are ordinary matrix multiplication. x and y are the mapping positions of the polar image in the rectangular coordinate system; sampling 209 the rectangular-coordinate image with this coordinate mapping projects 210 it into the polar coordinate system.
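The following sketch implements this sampling with the mapping above; the radial normalization (the w columns spanning b iris radii) and the use of cv2.remap are assumptions for illustration.

```python
import numpy as np
import cv2

def polar_project(img, center, s_eye, h=360, w=256, b=1.4, t=0.0):
    """Sample a rectangular-coordinate image onto an h*w polar grid whose pole
    is `center`. b (radial extent in iris radii) and t (start angle) are
    illustrative; s_eye is the iris+pupil area from the segmentation."""
    xc, yc = center
    r = np.tile(np.arange(w, dtype=np.float32), (h, 1))                # R_ij = j
    theta = (np.arange(h, dtype=np.float32) * 2.0 / h - 1.0)[:, None]  # A_ij = 2i/h - 1
    f = b * np.sqrt(s_eye / np.pi) / w        # assumed normalization of the scale f
    x = xc + f * r * np.cos(theta * np.pi + t)
    y = yc + f * r * np.sin(theta * np.pi + t)
    return cv2.remap(img, x.astype(np.float32), y.astype(np.float32), cv2.INTER_LINEAR)
```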
Step 6, registering the current image with the target image in the polar coordinate system to obtain eyeball rotation navigation information, completing eyeball center positioning and rotation tracking. Eyeball registration in the polar coordinate system is achieved by registering multi-point capillary features between the target image and the search image. Continuous inter-frame image registration gives eyeball rotation registration good continuity, but error accumulates over long-term registration; registering the current frame against a reference image as the target (long-term registration) avoids error accumulation, but it is difficult to maintain stability and continuity under the complex eye movements, occlusions, and deformations that occur during surgery. The end-to-end method of the invention registers both types of target simultaneously, giving the network good accuracy, stability, and persistence. The detailed process is as follows:
and 6.1, constructing and training a twin network, wherein the input of the twin network is preoperative image data and real-time intraoperative image data under a polar coordinate system, and the twin network is used for extracting scleral capillary blood vessel characteristics of a search image and a target image under the polar coordinate system. The search image is a current frame image in the real-time image, the target image comprises a preoperative image and a previous frame image, wherein the previous frame image is a target image registered between frames, and the preoperative image is used as a target image registered for a long time.
Deformation of the vascular features during registration is one of the main difficulties. In this embodiment a deformation network is therefore placed inside the Siamese network to mitigate the vessel-deformation problem and make the extracted features more robust. The Siamese network of this embodiment comprises a first neural network and a second neural network; the first consists of a convolution layer 213 and a deformation network 214 connected in sequence, and the second has the same structure as the first and shares its weights. The convolution layers of both networks are connected to the polar coordinate sampler 209, and the deformation network 214 is connected to the convolution layer 213.
Because the eyeball does not keep a regular shape during surgery, and squeezing and dragging by interventional instruments often deform it, the intraoperative image can differ greatly from the reference image, which lowers the success rate of long-term registration. The deformation network restores the corneal edge of the polar-coordinate image according to the corneal edge in the classification prediction map obtained in step 3: by stretching or shrinking according to the edge deformation, the eyeball edge is always restored to a circle, eliminating the registration difficulty caused by eyeball deformation and improving the stability of long-term registration. In this embodiment, the deformation-elimination process is described using the projection v of the limbus segmentation result in the polar coordinate system as an example.
As shown in FIG. 3, let the gray value of the target positions in v be 1 and that of the background be 0. In the polar coordinate sampler, a coordinate mapping between the polar and rectangular coordinate systems is established:
d = D_h×w, D_ij = Σ_k v_ik − w/2
x′ = x + d, y′ = y
where x and y index points in v; d, a matrix of size h×w, is the difference between the sum of the limbus segmentation result v along each row and the position of the central axis of the image, that is, the deformation of each row of the polar image; and x′ and y′ are the mapping positions of the deformation-free feature map within the original polar feature map. Sampling and projecting 301 the polar-coordinate feature map with this coordinate mapping removes the image deformation, so the extracted features are more robust.
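A minimal sketch of this row-wise deformation elimination, assuming v is the polar-projected corneal region mask so that each row sum gives the limbus radius of that row:

```python
import numpy as np

def remove_deformation(feat_polar, v):
    """Shift each angular row so the limbus lies on the central axis of the
    polar image. feat_polar: (h, w) polar feature map; v: (h, w) binary mask,
    assumed to be the polar-projected corneal region."""
    h, w = v.shape
    d = v.sum(axis=1) - w / 2.0              # per-row deformation d_i
    out = np.empty_like(feat_polar)
    cols = np.arange(w)
    for i in range(h):
        src = np.clip(cols + d[i], 0, w - 1).astype(int)  # x' = x + d_i
        out[i] = feat_polar[i, src]
    return out
```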
Step 6.2, defining multi-point Gaussian templates 402, training a correlation filter online with the Gaussian template and the target image, and computing the cross-correlation between the correlation filter and the search image to obtain a correlation response map. Specifically:
A Gaussian template in this embodiment is an image whose gray values follow a Gaussian distribution; the peak position of the gray values is the target position, and the position of the Gaussian within the template is the position at which the target image is expected to register. Several Gaussian templates are provided; from them a dense set of registration points is established, and non-rigid registration between the target image and the search image is performed. The update strategy for the inter-frame Gaussian templates may be frame-by-frame updating from the previous frame's position prediction, continuously tracking the same targets; alternatively, the templates may be updated only with the scleral-region transformation, distributed uniformly within the scleral region at the scleral boundary, registering only inter-frame deformation. The correlation filter is updated whenever the target image and Gaussian template are updated.
In this embodiment correlation filtering is used to compute the cross-correlation between the correlation filter and the search image. Its working mechanism is that the correlation filter performs discriminative regression 215 on the surgical image, producing a Gaussian-distributed response that peaks at the target position; that is, the correlation filter is learned from the Gaussian label G 216 at the specified position and the target image T 211, realizing multi-vessel-feature target tracking over frame-by-frame images. The correlation filter w is learned online by linear ridge regression between the correlation response w ⋆ T_i and the Gaussian label G:
w = argmin_w Σ_i ||w ⋆ T_i − G||² + λ||w||²
where λ is the regularization parameter, preventing overfitting.
The correlation filter w performs position-by-position correlation over the search image S 210 via a circulant matrix. To improve computational efficiency, the target image T 211 and the search image S 210 are transformed into the frequency domain by the discrete Fourier transform F 218, since multiplication in the frequency domain corresponds to convolution in the time domain:
response = F⁻¹(F*(w) · F(S))
where F*(·) denotes the complex conjugate of F(·) and · is the Hadamard product. Multiplying the correlation filter with the search image 210 in the frequency domain yields a frequency-domain response map, which the inverse discrete Fourier transform 219 converts back to the time domain as the response map 220; its values indicate the strength of the correlation between each position and the target.
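For illustration, a minimal single-channel sketch of this frequency-domain filter (a MOSSE-style closed form of the ridge regression above); multi-channel features and online updating are omitted.

```python
import numpy as np

def gaussian_label(h, w, cy, cx, sigma=2.0):
    """Gaussian template G: gray values peak at the desired target position."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def train_filter(T, G, lam=1e-3):
    """Closed-form ridge-regression correlation filter; returns F*(w) so that
    response = F^-1(F*(w) . F(S)), as in the text."""
    Tf = np.fft.fft2(T)
    return np.conj(Tf) * np.fft.fft2(G) / (Tf * np.conj(Tf) + lam)

def respond(Wstar, S):
    """Time-domain correlation response over the search image S."""
    resp = np.real(np.fft.ifft2(Wstar * np.fft.fft2(S)))
    dy, dx = np.unravel_index(int(np.argmax(resp)), resp.shape)
    return resp, (dy, dx)  # dy along the angular axis = rotation in polar coords
```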
Step 6.3, constructing a confidence network 221 that takes the correlation response map as input and predicts the region where the registration target lies in the current frame, giving the target-region prediction map 405. The confidence network 221 consists of k convolution-normalization-activation blocks followed by a convolution layer with kernel size 1, k being an integer greater than or equal to 2, typically 2, 3, or 4. The target-region prediction map 405 is multiplied by the response map 404 and output as the registration result 222; the peak position in the registration result 222 is the target's position in the current frame, the correlation response value at that position is the confidence of the registration, and the offset of the response-map peak 404 from the Gaussian template 402 along the polar angular axis is the rotation at that point. If tracking is abnormal, the classification result 405 is empty, i.e., no tracking position is output; the classification branch thus classifies the response map 404, removing interference from non-target regions and improving registration robustness.
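A minimal sketch of such a confidence network; the channel width and the final sigmoid are illustrative assumptions.

```python
import torch.nn as nn

def confidence_net(k=3, ch=16):
    """k convolution-normalization-activation blocks followed by a
    kernel-size-1 convolution, as in step 6.3. Input: 1-channel correlation
    response map; output: target-region probability map."""
    layers, cin = [], 1
    for _ in range(k):
        layers += [nn.Conv2d(cin, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True)]
        cin = ch
    layers += [nn.Conv2d(ch, 1, 1), nn.Sigmoid()]
    return nn.Sequential(*layers)
```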
Step 6.4, for the multi-point Gaussian templates 402, averaging the displacements of the several target positions within the search image and taking the average as the rotation of the search image relative to the target image, which yields the inter-frame registration result C and the long-term registration result L. As shown in FIG. 4, for long-term registration against the preoperative image the Gaussian centers may be distributed uniformly, or templates may be placed at positions of particular interest to the surgeon. For inter-frame registration, each frame may use Gaussian templates with centers uniformly distributed over the scleral region (fixed-position registration), or templates centered on the previous frame's registered target positions (continuously tracking the same features).
To eliminate accumulated error, improve the stability of continuous eyeball rotation registration, and weaken the interference of erroneous registration results, this embodiment takes a weighted sum of the inter-frame registration result and the long-term registration result as the final rotation 226:
Z = C + m × (L − C)
where m is the weight of the weighted sum of the inter-frame and long-term registration results and Z is the final rotation.
In this embodiment the weighted sum of the inter-frame and long-term registration results is computed with a Kalman filtering algorithm, an optimal state-estimation method that combines multiple sources of observation and suppresses noise interference in the system results. Let E be the true rotation of the eyeball; the inter-frame and long-term registration results C and L can then be regarded as systems containing noise γ and ε respectively:
C = E + γ
L = E + ε
In the Kalman filtering method, the weight m is the Kalman gain:
m = P/(P + Q)
where P and Q are the covariances of the inter-frame and long-term registration respectively. The weight of the weighted sum changes dynamically during surgery according to the registration results.
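In code, the fusion reduces to a few lines (a sketch under the convention that C is the prediction and L the measurement):

```python
def fuse_rotation(C, L, P, Q):
    """Kalman-weighted fusion of the inter-frame (C, covariance P) and
    long-term (L, covariance Q) rotation estimates."""
    m = P / (P + Q)          # Kalman gain
    return C + m * (L - C)   # final rotation Z
```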
The iris region of the current frame is computed from the final rotation, the current image is registered with the target image, and eyeball center positioning and iris boundary segmentation are completed.
During eyeball center positioning and iris boundary segmentation, this embodiment also calculates the translation, as follows:
a segmentation branch of the Siamese network is constructed, consisting of a convolution layer and a classification layer. The convolution layer receives the iris-boundary capillary feature map extracted by the Siamese network and the eyeball feature map transformed from the rectangular coordinate system; after fusion, the classification layer segments the iris region, whose centroid in the rectangular coordinate system is the final eyeball center. The displacement of this center relative to the eyeball center at the start of surgery is the eyeball translation.
Step 3 of this embodiment further includes placing a fully connected layer after the encoding and decoding network; its input is connected to the eyeball feature map output by the decoder, and it judges the current image state from the feature map so as to exclude images in abnormal states. In actual use, data enhancement is applied to the registered images as follows: capillary feature information from random positions of other ophthalmic surgery videos is superimposed at the registration positions; the capillary feature superimposed at the registration positions of each pair of registered images is the same, while different pairs of registered images receive different capillary features.
In this embodiment, the labels for the data used in network construction and training are annotated with computer assistance. For semantic segmentation, a small amount of data is labeled manually: pixel-level labels for the iris boundary and scleral regions, and binary labels for abnormal image states (0 for abnormal, 1 for normal). A network trained on the existing labels then predicts a large amount of data; correctly predicted labels are screened manually, the coverage of correct predictions is improved by tuning network parameters, and for samples that are hard to predict correctly the computer annotation is modified manually or the sample is labeled by hand.
For the tracking problem, a large number of surgical videos are selected for tracking-label annotation. As shown in FIG. 5, scleral vessel feature points 502 located at the iris boundary 506 are manually selected on the initial frame image 501 as tracking points; a single-target tracker, such as correlation-filter-based KCF or Siamese-network-based SiamFC++, tracks the vessel feature points, the tracking results are screened manually, correct results are kept as vessel-feature registration labels, and erroneous results are corrected. Each video selects u scattered vessel feature points 502, 503, 504, 505 for annotation, u being an integer greater than 2. When training the end-to-end network, the data are enhanced: affine transformation, Gaussian blur, and uneven-brightness processing are applied to the images, strengthening the network's adaptability to intraoperative eyeball deformation, image blur, and uneven illumination. The network needs two images 507, 508 with the same vessel features as one label group for training the registration branch, and we strengthen the vessel features of the labels: for each label group an additional surgical image 511 with an existing sclera segmentation is taken at random, the scleral region containing vessel information is extracted, a scaling-and-rotation affine transformation 512 is applied, and the same vessel features are superimposed 513, 514 onto the training images according to the eyeball center position after the affine transformation 512; the constraint of the affine transformation is that the same vessel feature is superimposed at the labeled position of the same group of training images. This yields new labels 515, 516, enriching the diversity of the labeled vessel features and improving the network's generalization.
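A minimal sketch of this pairwise enhancement; the function names, patch size, blending rule, and affine ranges are illustrative assumptions.

```python
import numpy as np
import cv2

def enhance_pair(img_a, img_b, pt_a, pt_b, donor, donor_mask, size=32):
    """Superimpose the *same* donor vessel patch at the corresponding labeled
    positions (pt_a, pt_b) of one registered image pair. `donor` is a surgical
    image with an existing sclera segmentation `donor_mask`. Assumes the patch
    fits inside both images."""
    h, w = donor.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2),
                                angle=float(np.random.uniform(-30, 30)),
                                scale=float(np.random.uniform(0.8, 1.2)))
    warped = cv2.warpAffine(donor, M, (w, h))         # affine transform 512
    mask = cv2.warpAffine(donor_mask, M, (w, h)) > 0  # scleral vessel region
    patch, m = warped[:size, :size], mask[:size, :size]
    for img, (x, y) in ((img_a, pt_a), (img_b, pt_b)):
        roi = img[y:y + size, x:x + size]
        roi[m] = patch[m]                             # same feature on both images
    return img_a, img_b
```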
This embodiment also provides the application of the end-to-end positioning and navigation method in an ophthalmic surgery navigation system. FIG. 1 shows the intraoperative workflow of the surgical navigation method and device of the embodiment. As shown in FIG. 1, a surgeon 115 observes the patient's eye 113 through an ophthalmic surgical microscope 112 and operates with various interventional instruments 114. The eye surgery image 103 is acquired by the image acquisition system 101 into a computer and fed to the processing module 108, which applies the end-to-end method described above in sequence: segmentation of key eye information, i.e., semantic segmentation 104, polar coordinate conversion 105, and rotational registration of the eye, where rotation tracking comprises inter-frame registration 106 of inter-frame targets and long-term registration 107. The output navigation information 109 is presented in the projection system 116. The reference image 111 is a preoperative image acquired by the physician for long-term registration 107. If the images come from a binocular stereo microscopy imaging system, a depth-computing point cloud process 110 is also applied to the output navigation information 109.
FIG. 6 is a schematic diagram of binocular stereo microscope image processing in an embodiment. As shown in FIG. 6, if the image information 603 captured by the image acquisition system 601 comes from a binocular stereo ophthalmic surgery system 602, the network 604 processes the left and right eyepiece images in two channels, and depth information is computed from the output left and right eyepiece navigation information 605 through a point cloud process 606. In one specific example of depth calculation, the iris boundary point sets P_iris 607 of the eyeballs in the left and right eyepiece images, together with the intersection points 608 of the rotation axis and the iris boundary, are converted into three-dimensional space 609 according to the calibration parameters of the binocular microscopy stereo vision system; the centroid 610 of the iris boundary point set in three-dimensional space serves as the eyeball center, and the average position 611 of the intersections of the left and right images' rotation axes with the iris boundary serves as the marker of eyeball rotation in three-dimensional space. The three-dimensional navigation information can be shown in a 3D projection system 612 or displayed in the eyepiece view through a beam splitter 613.
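For illustration, a sketch of the triangulation step using OpenCV, assuming projection matrices obtained from the stereo calibration:

```python
import numpy as np
import cv2

def iris_points_3d(P_left, P_right, pts_left, pts_right):
    """Triangulate corresponding iris-boundary points from the calibrated left
    and right views; the 3-D centroid serves as the eyeball center. P_left and
    P_right are 3x4 projection matrices from the stereo calibration;
    pts_* are Nx2 arrays of corresponding image points."""
    X = cv2.triangulatePoints(P_left, P_right,
                              pts_left.T.astype(np.float64),
                              pts_right.T.astype(np.float64))  # 4xN homogeneous
    X = (X[:3] / X[3]).T                                       # Nx3 Euclidean
    return X, X.mean(axis=0)                                   # points, centroid 610
```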
As shown in FIG. 7, the ophthalmic surgical microscope consists of an objective lens 701, a beam splitter 703, an eyepiece 704, and a CMOS assembly 702. The surgical image is projected through the objective 701 to the eyepiece 704, through which the surgeon observes it; the beam splitter passes a light beam into the CMOS assembly, which sends the image to a computer, and the surgical navigation module processes the surgical image to compute navigation information. FIG. 8 shows a binocular stereo-vision surgical microscope: the objective system splits the light into two beams projected to the left and right eyepieces, forming two paths and providing the operator with stereo perception. The left and right eyepiece paths pass through beam splitters 804 and 803, each splitting off a beam into the left and right camera assemblies 802 and 801; the computer then acquires the left and right eyepiece images. Calibrating the intrinsic and extrinsic parameters of the two eyepieces establishes a binocular stereo vision system from which the depth of the eye in the images can be obtained, and the surgical navigation method can process the left and right eyepiece images to produce navigation information with depth.
The embodiments described above are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention. The aim of the invention is to realize eyeball positioning and rotation navigation in ophthalmic surgery by an end-to-end method.

Claims (10)

1. An end-to-end ophthalmic surgery positioning navigation method, characterized by comprising the following steps:
step 1, acquiring a preoperative image of the patient's eye;
step 2, constructing an encoding and decoding network, which is a neural network based on the U-Net structure; the encoding network takes the patient's eyeball image as input and outputs deep and shallow feature maps extracted from the image, where the deep features represent global information and the shallow features represent local information; the decoding network takes the deep and shallow feature maps as input and outputs an eyeball feature map fusing the deep and shallow features; the encoder adopts a TransU-Net or EfficientNet encoder;
step 3, constructing a multi-class segmentation layer for semantically segmenting the iris, sclera and cornea regions of the eye; the multi-class segmentation layer consists of several convolution layers with a kernel size of 1; its input is the eyeball feature map and its output is the semantic segmentation result, i.e., segmentation prediction maps of the iris, sclera and cornea;
step 4, calculating the center of the iris region from the semantic segmentation result and taking it as the eyeball center position; constructing a polar coordinate sampler with the eyeball center position as the pole;
step 5, acquiring intraoperative images in real time and converting them into the polar coordinate system with the polar coordinate sampler constructed in step 4; the preoperative image, the semantic segmentation result and the eyeball feature map are likewise input to the polar coordinate sampler and converted into polar coordinate images;
step 6, registering the current image and the target image in the polar coordinate system to obtain the rotation information of the eyeball, thereby completing rotational navigation of the eyeball:
step 6.1, constructing and training a twin network to extract scleral capillary features of the search image and the target image in the polar coordinate system; the search image is the current frame of the real-time images, and the target images comprise the preoperative image and the previous frame, where the previous frame is the target image for inter-frame registration and the preoperative image is the target image for long-term registration;
step 6.2, defining multi-point Gaussian templates, where a Gaussian template is an image whose gray values follow a Gaussian distribution, the peak position of the gray values is the target position, and the position of the Gaussian distribution within the template is the expected registration position of the target image; training a correlation filter online with the Gaussian template and the target image, and computing the cross-correlation between the correlation filter and the search image to obtain a correlation response map;
step 6.3, constructing a confidence network that takes the correlation response map as input and outputs a target position area prediction map; multiplying the target position area prediction map by the correlation response map, the product being output as the registration result; the peak position in the registration result is the target position, the correlation response value at that position is the confidence of the registration result, and the displacement of the peak position along the angular coordinate axis of the Gaussian template in the polar coordinate system is the rotation of that point;
step 6.4, calculating the average displacement of the multiple target positions of the search image within the Gaussian templates and taking it as the rotation amount of the search image relative to the target image, i.e., the inter-frame registration result and the long-term registration result; calculating a weighted sum of the inter-frame and long-term registration results as the final rotation amount of the eyeball in the current frame, the weights of the weighted sum being dynamically adjusted during surgery according to the registration results; and calculating the iris region of the current frame from the final rotation amount, registering the current image with the target image, and completing eyeball center positioning and eyeball rotation tracking.
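To make the geometry of steps 5-6 concrete: once the image is resampled around the eyeball center into polar coordinates, a rotation of the eye becomes a pure translation along the angle axis. The sketch below illustrates this with plain phase correlation in place of the trained twin network and online correlation filter of steps 6.1-6.3; it is a simplified classical stand-in, not the claimed network, and all names and sizes are illustrative.

```python
import numpy as np
import cv2

def rotation_deg(target_gray, search_gray, center, max_radius,
                 polar_size=(256, 360)):         # (radius bins, angle bins)
    """Estimate eyeball rotation as an angular shift between polar images."""
    def to_polar(img):
        # rows of the polar image correspond to angle, columns to radius
        p = cv2.warpPolar(img, polar_size, center, max_radius,
                          cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR)
        return np.float32(p)

    pt, ps = to_polar(target_gray), to_polar(search_gray)
    (dx, dy), response = cv2.phaseCorrelate(pt, ps)
    # A vertical shift of dy rows corresponds to dy * (360 / angle_bins)
    # degrees of rotation (sign convention follows cv2.phaseCorrelate);
    # 'response' plays the role of the peak confidence in step 6.3.
    return dy * 360.0 / pt.shape[0], response
```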
2. The end-to-end ophthalmic surgery positioning navigation method of claim 1, further comprising:
step 7, performing data enhancement on the registered images as follows: capillary feature information taken from random locations of other ocular surgery videos is superimposed at the registration positions; the capillary feature superimposed at the registration positions is the same for both images of a registered pair and differs between different pairs.
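A minimal sketch of this pairwise augmentation, assuming donor_patch is a capillary texture cropped from a random location of another surgical video, both images of a pair share one coordinate frame, and the patch fits inside the image; the alpha blending is an assumed implementation detail, since the claim only requires that both images of a pair receive the identical feature.

```python
import numpy as np

def augment_pair(img_a, img_b, donor_patch, top_left, alpha=0.5):
    """Superimpose the same capillary patch on both images of a registered pair."""
    y, x = top_left                              # registration position
    h, w = donor_patch.shape[:2]
    for img in (img_a, img_b):                   # identical patch for the pair
        roi = img[y:y + h, x:x + w].astype(np.float32)
        img[y:y + h, x:x + w] = ((1 - alpha) * roi
                                 + alpha * donor_patch).astype(img.dtype)
    return img_a, img_b
```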
3. The end-to-end ophthalmic surgery positioning navigation method of claim 1, wherein in step 6.2 a dense registration point set is established from multiple Gaussian templates, and non-rigid registration is performed between the target image and the search image.
4. The end-to-end ophthalmic surgery positioning navigation method of claim 1, wherein: the twin network comprises a first neural network and a second neural network; the first neural network consists of a convolution layer and a deformation network connected in sequence, and the second neural network has the same structure as the first and shares its weights; the convolution layers of the first and second neural networks are connected to the polar coordinate sampler, receive the preoperative image or the real-time intraoperative image data in the polar coordinate system, and extract and output capillary feature maps; the deformation network receives the semantic segmentation result and the capillary feature map in the polar coordinate system, and performs deformation compensation on the capillary feature map according to the limbus in the semantic segmentation result, obtaining an iris boundary capillary feature map with the deformation eliminated.
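The essence of the weight sharing in claim 4 is that a single feature extractor processes both the target and the search image. A minimal PyTorch sketch, with placeholder layer sizes and the deformation network omitted:

```python
import torch
import torch.nn as nn

class TwinBackbone(nn.Module):
    """Shared-weight feature extractor applied to both twin-network inputs."""
    def __init__(self, channels=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, target, search):
        # "Weight sharing" means the same module (same parameters) maps both
        # polar images into a common capillary feature space.
        return self.features(target), self.features(search)
```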
5. The end-to-end ophthalmic surgery positioning navigation method of claim 1, wherein the eyeball center positioning and rotation tracking in step 6.4 further comprise the following steps:
step 6.4.1, constructing a segmentation branch of the twin network, the branch consisting of a convolution layer and a classification layer; the convolution layer receives the iris boundary capillary feature map extracted by the twin network and the eyeball feature map transformed back to the rectangular coordinate system, and after fusion the classification layer performs iris region segmentation; the centroid of the iris region in the rectangular coordinate system is the final eyeball center;
step 6.4.2, calculating the displacement of the eyeball center obtained in step 6.4.1 relative to the eyeball center at the start of the operation; this displacement is the translation of the eyeball.
6. The end-to-end ophthalmic surgery positioning navigation method of claim 1, wherein the weighted sum of the inter-frame and long-term registration results in step 6.4 is computed by a Kalman filtering algorithm, the detailed calculation process comprising:
summing the covariance of the inter-frame registration result and the covariance of the long-term registration result, and calculating the weight m from the summed covariance:
m=P/(P+Q)
where P represents the covariance of the inter-frame registration and Q represents the covariance of the long-term registration.
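A minimal sketch of this fusion rule. The claim fixes m = P/(P+Q) but not which estimate the weight multiplies; following the usual Kalman-gain pattern (trust the other source more when the prediction covariance P is large), the sketch below assumes m weights the long-term result. That assignment is an assumption, not stated in the claim.

```python
def fuse_rotation(theta_inter, theta_long, P, Q):
    """Blend inter-frame and long-term rotation estimates, Kalman style.

    P: covariance of the inter-frame registration result.
    Q: covariance of the long-term registration result.
    """
    m = P / (P + Q)                              # weight from claim 6
    # Large P (noisy inter-frame tracking) shifts weight to the long-term
    # registration against the preoperative image, which cannot drift.
    return (1.0 - m) * theta_inter + m * theta_long
```

This also illustrates the dynamic adjustment in step 6.4: as the registration results change during surgery, the covariances and hence m change with them.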
7. The end-to-end ophthalmic surgery positioning navigation method of claim 1, wherein step 3 further comprises: arranging a fully connected layer after the encoding and decoding network; the fully connected layer receives the eyeball feature map output by the decoder, judges the current image state, and rejects images in an abnormal state.
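A minimal sketch of such an abnormal-state head, assuming the decoder's feature map is pooled before classification; the channel count and number of states are illustrative.

```python
import torch
import torch.nn as nn

class StateHead(nn.Module):
    """Classify a frame as normal/abnormal from the decoder's feature map."""
    def __init__(self, feat_channels=64, n_states=2):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # (B, C, H, W) -> (B, C, 1, 1)
        self.fc = nn.Linear(feat_channels, n_states)

    def forward(self, feat):
        return self.fc(self.pool(feat).flatten(1))  # logits per state
```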
8. The end-to-end ophthalmic surgery positioning navigation method of claim 1, wherein two branches, a CNN branch and a Token Mixing branch, are arranged in the encoder; the CNN branch is used to acquire local information and processes the surgical image through n convolution layers, n being a natural number greater than 1; the Token Mixing branch divides the image into S mutually non-overlapping image blocks as its input, linearly maps all image blocks to a hidden layer, and adds position embeddings to the mapped features to preserve the relative position of each image block; an MLP network then operates on each channel dimension of the features and outputs a result containing the feature interactions; the local information acquired by the CNN branch and the global information acquired by the Token Mixing branch are fused using an attention gate;
the decoder part adopts the same structure as Attention U-Net and uses attention gates to enhance the features from the encoder part.
9. An ophthalmic surgical microscope device, comprising an objective lens system, an eyepiece system, a beam splitter system, a camera assembly, a surgical navigation module, and a projection and display module, wherein:
the objective lens system focuses the light beam on the patient's eye to achieve clear imaging;
the eyepiece system projects the image plane into the doctor's eyes, so that the microscope image and the projection pattern generated by the surgical navigation module can be directly observed by the doctor;
the beam splitter system divides the light beam into two or more paths, one of which is led to the eyepiece system and another to the camera assembly;
the camera assembly transmits images or video to the surgical navigation module;
the surgical navigation module is connected with the projection and display module and is configured to execute the end-to-end ophthalmic surgery positioning navigation method of any one of claims 1 to 7 to obtain eyeball segmentation navigation information; the projection and display module projects the received navigation information into the microscope field of view through the beam splitter system and the eyepiece system, or displays it on an external display fused with the real-time video acquired by the camera assembly, so as to assist the doctor in performing the operation.
10. The ophthalmic surgical microscope device according to claim 9, wherein: the ophthalmic surgical microscope device is a binocular stereoscopic ophthalmic surgical microscope provided with left and right camera assemblies, and its projection and display module is a three-dimensional projection and display module;
a beam is split from the optical path of the left eyepiece by the beam splitter and led into the left-eye camera assembly, and a beam is split from the optical path of the right eyepiece and led into the right-eye camera assembly;
the surgical navigation module receives the images acquired by the left-eye and right-eye camera assemblies, calibrates the intrinsic and extrinsic parameters of the left and right eyepieces, establishes a binocular stereo vision system from the calibration results, and obtains depth information of the eye in the left and right eyepiece images; it then executes the end-to-end ophthalmic surgery positioning navigation method of any one of claims 1 to 7 on each image to obtain left and right eyeball segmentation navigation information, converts the navigation information into three-dimensional space, and takes the average position;
the three-dimensional projection and display module projects the averaged navigation information into the microscope field of view through the beam splitter system and the eyepiece system, or displays it on an external display fused with the real-time video acquired by the camera assemblies, so as to assist the doctor in performing the operation.
CN202310577883.1A 2023-05-22 2023-05-22 Positioning navigation method and device for end-to-end ophthalmic surgery Pending CN116602764A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310577883.1A CN116602764A (en) 2023-05-22 2023-05-22 Positioning navigation method and device for end-to-end ophthalmic surgery

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310577883.1A CN116602764A (en) 2023-05-22 2023-05-22 Positioning navigation method and device for end-to-end ophthalmic surgery

Publications (1)

Publication Number Publication Date
CN116602764A true CN116602764A (en) 2023-08-18

Family

ID=87683100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310577883.1A Pending CN116602764A (en) 2023-05-22 2023-05-22 Positioning navigation method and device for end-to-end ophthalmic surgery

Country Status (1)

Country Link
CN (1) CN116602764A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117653463A (en) * 2023-12-27 2024-03-08 上海交通大学医学院附属新华医院 Microscope augmented reality guidance system and method for ophthalmic cataract surgery



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination