CN117710324A - Method and system for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery

Info

Publication number: CN117710324A
Application number: CN202311729413.9A
Authority: CN
Inventors: 林沛亮; 黄晓明; 陈俊周; 文萱; 范剑明
Original assignee: Sun Yat Sen Memorial Hospital Sun Yat Sen University; Sun Yat Sen University
Current assignee: Sun Yat Sen Memorial Hospital Sun Yat Sen University; Sun Yat Sen University
Priority date: 2023-12-15
Filing date: 2023-12-15
Publication date: 2024-03-15

Abstract

The invention relates to a method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery, which comprises the following steps: obtaining a recurrent laryngeal nerve position image in a white light mode and a recurrent laryngeal nerve position image in an image enhancement mode; performing data preprocessing on the recurrent laryngeal nerve position image in the white light mode to obtain first training data after labeling, and performing data preprocessing on the recurrent laryngeal nerve position image in the image enhancement mode to obtain second training data after labeling; training the pre-constructed neural network model by using the first training data and the second training data to obtain a trained recognition model; embedding a video stream processing module into the trained recognition model to obtain a final recognition model; acquiring a real-time endoscope video in the operation process; and performing recurrent laryngeal nerve recognition and tracking on the real-time endoscopic video through a final recognition model. The invention can provide accurate operation guidance for operators, thereby reducing the risk of recurrent laryngeal nerve injury in operation.

Description

Method and system for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery

Technical Field

The invention relates to the technical field related to artificial intelligence and medical treatment, in particular to a method and a system for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery.

Background

Thyroid cancer is the most common malignancy of the head and neck, and surgery is currently its most important treatment. The main complications of thyroid cancer surgery include parathyroid injury and recurrent laryngeal nerve injury; according to reports in the domestic and international literature, the rate of recurrent laryngeal nerve injury in thyroid surgery is about 0.0-2.0%. Unilateral recurrent laryngeal nerve injury can cause postoperative hoarseness and impair daily speech communication. With bilateral recurrent laryngeal nerve injury, the patient may develop acute dyspnea after surgery and require emergency tracheal intubation or tracheotomy; if rescue is delayed, the condition can be life-threatening.

Intraoperative nerve monitoring can effectively help the surgeon locate the recurrent laryngeal nerve, reduce the probability of injuring it, and shorten the operation time. However, it requires additional disposable consumables, such as an endotracheal tube with an electromyography electrode and a nerve probe, which hinders wide adoption. In addition, intraoperative nerve monitoring requires switching repeatedly between the surgical instrument and the nerve probe, so continuous localization cannot be achieved and extra operation time is consumed; in some patients, repeated electrical stimulation may lead to postoperative nerve dysfunction and temporary vocal cord paralysis.

Compared with traditional open surgery, endoscopic thyroid surgery offers a more cosmetically acceptable incision and improves patients' postoperative quality of life. However, the changed viewing angle in endoscopic surgery may increase the incidence of accidental nerve injury, which poses a challenge to beginners. At the same time, the wide use of endoscopic surgery provides a clinical basis for developing and applying programs that use artificial intelligence to automatically identify and continuously track important anatomical structures.

At present, no satisfactory artificial intelligence scheme is available that can accurately identify and track the recurrent laryngeal nerve in endoscopic thyroid surgery.

Disclosure of Invention

The invention aims to solve at least one of the defects of the prior art and provides a method and a system for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

Specifically, a method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery is provided, which comprises the following steps:

obtaining a recurrent laryngeal nerve position image in a white light mode and a recurrent laryngeal nerve position image in an image enhancement mode;

performing data preprocessing on the recurrent laryngeal nerve position image in the white light mode to obtain first training data after labeling, and performing data preprocessing on the recurrent laryngeal nerve position image in the image enhancement mode to obtain second training data after labeling;

training the pre-constructed neural network model by using the first training data and the second training data to obtain a trained recognition model;

embedding a video stream processing module into the trained recognition model to obtain a final recognition model, wherein the video stream processing module is used for capturing a video sequence, inputting each frame in the video sequence into the trained recognition model to obtain a target detection result, tracking the recurrent laryngeal nerve in the video sequence according to the target detection result, updating the state of the recurrent laryngeal nerve according to the position information of the recurrent laryngeal nerve in the current frame, processing the image of each frame in the video sequence in real time, and continuously updating the position information of the recurrent laryngeal nerve to realize dynamic tracking of the recurrent laryngeal nerve;

acquiring a real-time endoscope video in the operation process;

and performing recurrent laryngeal nerve recognition and tracking on the real-time endoscopic video through a final recognition model.

Further, specifically,

the recurrent laryngeal nerve position image in the white light mode refers to a recurrent laryngeal nerve position image captured under normal conditions, and the recurrent laryngeal nerve position image in the image enhancement mode refers to a recurrent laryngeal nerve position image in which the recurrent laryngeal nerve region is highlighted as the ROI region.

Further, in particular, the data preprocessing operation includes,

carrying out image quality screening on the image to be processed through an image quality evaluation algorithm to remove images lower than a quality evaluation preset value so as to obtain screened images;

classifying the screened images into positive images and negative images according to whether the images contain recurrent laryngeal nerves or not, and marking the recurrent laryngeal nerves in the positive images to obtain marked images;

and carrying out data enhancement processing on the marked image to obtain marked training data, wherein the data enhancement processing comprises operations of image rotation, translation and scaling.
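The quality screening step above can be illustrated with a minimal sketch. The text does not name the image quality evaluation algorithm, so the variance of the Laplacian (a common blur indicator) is used here purely as an assumption; the function names `laplacian_variance` and `screen_images` and the threshold value are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def laplacian_variance(img: Image) -> float:
    """Variance of the 4-neighbour Laplacian response; low values suggest blur."""
    height, width = len(img), len(img[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            responses.append(lap)
    if not responses:  # image too small to have interior pixels
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def screen_images(images: List[Image], threshold: float) -> List[Image]:
    """Keep only images whose sharpness score reaches the preset value."""
    return [img for img in images if laplacian_variance(img) >= threshold]
```

A perfectly flat image scores zero and is rejected, while an image with strong local contrast passes; in practice the preset value would have to be tuned on real endoscopic frames.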

Further, the method further comprises using stratified sampling and cross-validation when dividing the data to obtain the first training data and the second training data, so that pictures of the same case never appear in both the training set and the test set.
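A minimal sketch of the case-level constraint follows, assuming each picture carries a case identifier. It only shows how folds can be built so that a case is never shared between training and test data; stratification by positive/negative label is omitted for brevity. The name `case_level_folds` and the `(case_id, image_path)` sample layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (case_id, image_path)
Fold = Tuple[List[Sample], List[Sample]]  # (training set, test set)


def case_level_folds(samples: List[Sample], k: int) -> List[Fold]:
    """Build k cross-validation folds in which whole cases are held out together."""
    by_case: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        by_case[sample[0]].append(sample)
    case_ids = sorted(by_case)
    folds: List[Fold] = []
    for i in range(k):
        # Every k-th case (starting at offset i) forms the test set of fold i.
        test_cases = set(case_ids[i::k])
        train = [s for c in case_ids if c not in test_cases for s in by_case[c]]
        test = [s for c in case_ids if c in test_cases for s in by_case[c]]
        folds.append((train, test))
    return folds
```

Because the split is made over case identifiers rather than over individual pictures, near-duplicate frames from one operation cannot leak from the training set into the test set.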

Further, the method further comprises, before the first training data and the second training data are input into the model for training, storing the first training data and the second training data in a file format suited to the model, converting the corresponding annotation files into the format required by the model, and standardizing and normalizing the images in the first training data and the second training data.
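As an illustration of the format conversion and normalization, the sketch below assumes the YOLO-style text label layout (class index followed by the box center, width, and height, all normalized to the image size), since the description later names YOLO V7 as one possible framework. The helper names and the mean and standard deviation values are hypothetical.

```python
from typing import List, Tuple

PixelBox = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def to_yolo_label(class_id: int, box: PixelBox, img_w: int, img_h: int) -> str:
    """Convert a pixel-space corner box into a normalized YOLO-style label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def normalize_pixels(pixels: List[float], mean: float, std: float) -> List[float]:
    """Scale 8-bit intensities to [0, 1], then standardize with the given statistics."""
    return [((p / 255.0) - mean) / std for p in pixels]
```

For example, a 200 x 100 pixel box centered in a 400 x 200 image becomes the line `0 0.500000 0.500000 0.500000 0.500000`.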

Further, specifically, the pre-constructed neural network model is obtained by: selecting a neural network model suitable for recurrent laryngeal nerve recognition as an initial neural network model; constructing, in the initial neural network model, a first target detection model for the recurrent laryngeal nerve position images in the white light mode and a second target detection model for the images in the image enhancement mode; fusing the feature layers of the first target detection model and the second target detection model; and removing the final fully connected layer of the initial neural network model and replacing it with a fully connected layer that has 2 output classes and randomly initialized weights.
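The fusion and head replacement can be sketched in a framework-free way. The text does not state how the feature layers are fused, so simple concatenation is assumed here, and the two branches are represented only by their output feature vectors. `fuse_features` and `TwoClassHead` are hypothetical names, not part of any library.

```python
import math
import random
from typing import List


def fuse_features(white_light: List[float], enhanced: List[float]) -> List[float]:
    """Fuse the two branches by concatenating their feature vectors (an assumption)."""
    return white_light + enhanced


class TwoClassHead:
    """Fully connected layer with 2 outputs and randomly initialized weights."""

    def __init__(self, in_features: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(in_features)
        self.weights = [[rng.uniform(-scale, scale) for _ in range(in_features)]
                        for _ in range(2)]
        self.bias = [0.0, 0.0]

    def forward(self, features: List[float]) -> List[float]:
        """Return softmax probabilities for (negative, positive)."""
        logits = [sum(w * f for w, f in zip(row, features)) + b
                  for row, b in zip(self.weights, self.bias)]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]
```

In a real detector the fused features would be tensors from the two detection backbones, and the new head would be trained together with the rest of the network.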

Further, specifically, the training process of the pre-constructed neural network model comprises,

during training, the model is optimized by stochastic gradient descent; in each of multiple training rounds, the model performs forward propagation on the input images to obtain a prediction result, the error between the prediction result and the true label is calculated, and the model parameters are updated by backpropagation to optimize the performance of the model.
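The training loop described above (forward propagation, error calculation, parameter update) can be shown on a deliberately tiny stand-in model. The sketch below trains a logistic regression classifier with per-sample stochastic gradient descent; it is not the detection network itself, and the learning rate, epoch count, and function names are illustrative assumptions.

```python
import math
import random
from typing import List, Tuple

Example = Tuple[List[float], int]  # (feature vector, label 0 or 1)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_sgd(data: List[Example], epochs: int = 200, lr: float = 0.5,
              seed: int = 0) -> Tuple[List[float], float]:
    """Fit weights and bias with one gradient step per training example."""
    rng = random.Random(seed)
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the examples in a new random order each round
        for features, label in samples:
            # Forward propagation: compute the predicted probability.
            pred = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
            # Gradient of the cross-entropy loss with respect to the logit.
            error = pred - label
            # Backward step: move each parameter against its gradient.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias
```

After training on a small separable data set, inputs near 0 are scored below 0.5 and inputs near 1 above 0.5, which is the same predict, compare, and update cycle the detection model goes through at a much larger scale.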

Further, specifically, the performance of the neural network models obtained through training in multi-fold cross-validation is evaluated by accuracy, and the neural network model with the highest accuracy is taken as the optimal neural network model;

the optimal neural network model is then tested: its ROC curve is drawn according to how it identifies the recurrent laryngeal nerve, and the true positive rate and false positive rate of each doctor participating in the verification are plotted at the corresponding positions in the ROC plot; if the ROC curve of the optimal neural network model does not completely enclose the doctors' result points, the model is adjusted until it does, so as to obtain the trained recognition model.
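One way to make the "ROC curve encloses the doctors' result points" criterion concrete is sketched below: a doctor's operating point counts as enclosed when the curve's true positive rate at the doctor's false positive rate is at least as high as the doctor's. This reading of "enclose", and the names `roc_points`, `tpr_on_curve`, and `curve_encloses`, are assumptions. The sketch also assumes the test set contains at least one positive and one negative sample.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """ROC points from (0, 0) to (1, 1), one per distinct score threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def tpr_on_curve(points: Sequence[Point], fpr: float) -> float:
    """Highest TPR the piecewise-linear curve reaches at the given FPR."""
    best = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        if f0 <= fpr <= f1:
            if f1 == f0:  # vertical segment
                best = max(best, t0, t1)
            else:
                best = max(best, t0 + (t1 - t0) * (fpr - f0) / (f1 - f0))
    return best


def curve_encloses(points: Sequence[Point], clinician_points: Sequence[Point]) -> bool:
    """True when every clinician operating point lies on or below the curve."""
    return all(tpr_on_curve(points, f) >= t for f, t in clinician_points)
```

If `curve_encloses` returns False for any participating doctor, the model would be adjusted and retested, as the paragraph above describes.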

The invention also provides a system for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery, which comprises:

the training data acquisition module is used for acquiring a recurrent laryngeal nerve position image in a white light mode and a recurrent laryngeal nerve position image in an image enhancement mode;

the preprocessing module is used for carrying out data preprocessing on the recurrent laryngeal nerve position image in the white light mode to obtain first training data after marking, and carrying out data preprocessing on the recurrent laryngeal nerve position image in the image enhancement mode to obtain second training data after marking;

the model training module is used for training the pre-constructed neural network model by utilizing the first training data and the second training data to obtain a trained recognition model;

the recognition model determining module is used for embedding a video stream processing module into the trained recognition model to obtain a final recognition model, the video stream processing module is used for capturing a video sequence, inputting each frame in the video sequence into the trained recognition model to obtain a target detection result, tracking the recurrent laryngeal nerve in the video sequence according to the target detection result, updating the state of the recurrent laryngeal nerve according to the position information of the recurrent laryngeal nerve in the current frame, processing the image of each frame in the video sequence in real time, and continuously updating the position information of the recurrent laryngeal nerve to realize the dynamic tracking of the recurrent laryngeal nerve;

the real-time data acquisition module is used for acquiring real-time endoscope video in the operation process;

and the recurrent laryngeal nerve recognition and tracking module is used for recognizing and tracking the recurrent laryngeal nerve of the real-time endoscope video through the final recognition model.

The invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the steps of the method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery when being executed by a processor.

The beneficial effects of the invention are as follows:

the method for identifying and tracking the recurrent laryngeal nerve in the endoscopic thyroid surgery provided by the invention is based on an advanced deep learning target detection framework, and combines the endoscopic thyroid surgery images in a white light mode and an image enhancement mode to perform model training. Through the technical scheme, the system can automatically identify and track the position of the recurrent laryngeal nerve in real time, and has the following advantages:

1. High accuracy and good stability: the system adopts a deep learning target detection framework and is trained on pictures in the white light mode and the image enhancement mode from endoscopic thyroid surgery, so it can automatically identify and track the recurrent laryngeal nerve in real time, which improves the accuracy and stability of recurrent laryngeal nerve identification.

2. Wide range of application: the system is suitable for different types of endoscope systems and operations, is highly versatile, and can be widely applied in multiple clinical departments in hospitals, such as head and neck surgery and general surgery.

3. Simplified surgical workflow and improved efficiency: the system automatically identifies and tracks the recurrent laryngeal nerve in real time, which avoids repeatedly switching between surgical instruments, such as an ultrasonic scalpel or bipolar electrocoagulation, and the nerve probe during the operation; this simplifies the surgical workflow, shortens the operation time, and improves surgical efficiency.

4. Improving the postoperative quality of life of the patient: by reducing the incidence rate of recurrent laryngeal nerve injury, the system can improve the safety and accuracy of the operation and improve the postoperative life quality of patients.

5. Improving the accuracy of the operation: the system can track the movement of the recurrent laryngeal nerve in real time, provides accurate operation guidance for operators, and improves the accuracy and controllability of operation.

Drawings

The above and other features of the present disclosure will become more apparent from the detailed description of the embodiments illustrated in the accompanying drawings, in which like reference numerals designate like or similar elements. The drawings described below are only some examples of the present disclosure, and those skilled in the art may obtain other drawings from them without creative effort:

FIG. 1 is a flow chart of the method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery according to the present invention;

FIG. 2 is a schematic structural diagram of the neural network model used in the method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery, before the video stream processing module is added.

Detailed Description

The conception, specific structure, and technical effects produced by the present invention will be clearly and completely described below with reference to the embodiments and the drawings to fully understand the objects, aspects, and effects of the present invention. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The same reference numbers will be used throughout the drawings to refer to the same or like parts.

Referring to fig. 1, embodiment 1 of the present invention provides a method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery, comprising the following steps:

step 110, obtaining a recurrent laryngeal nerve position image in a white light mode and a recurrent laryngeal nerve position image in an image enhancement mode;

step 120, performing data preprocessing on the recurrent laryngeal nerve position image in the white light mode to obtain first training data after labeling, and performing the data preprocessing on the recurrent laryngeal nerve position image in the image enhancement mode to obtain second training data after labeling;

step 130, training the pre-constructed neural network model by using the first training data and the second training data to obtain a trained recognition model;

step 140, embedding a video stream processing module in the trained recognition model to obtain a final recognition model, wherein the video stream processing module is used for capturing a video sequence, inputting each frame in the video sequence into the trained recognition model to obtain a target detection result, tracking the recurrent laryngeal nerve in the video sequence according to the target detection result, updating the state of the recurrent laryngeal nerve according to the position information of the recurrent laryngeal nerve in the current frame, processing the image of each frame in the video sequence in real time, and continuously updating the position information of the recurrent laryngeal nerve to realize the dynamic tracking of the recurrent laryngeal nerve;

step 150, acquiring real-time endoscope video in the operation process;

step 160, performing recurrent laryngeal nerve identification and tracking on the real-time endoscopic video through the final recognition model.

Specifically, in practice, the present invention includes three processes:

1. Data acquisition and preprocessing

To establish a surgical assistance system that automatically identifies and tracks the recurrent laryngeal nerve in real time during endoscopic thyroid surgery, images in the white light mode and the image enhancement mode, together with the corresponding recurrent laryngeal nerve position information, must be collected and preprocessed for training and testing the model.

(1) Collecting thyroid endoscopic surgery pictures: thyroid surgery pictures in the white light mode and the image enhancement mode are collected in a targeted manner according to the task objective. The white light mode provides natural, realistic image information, which helps the model learn real-world visual features; the image enhancement mode highlights specific features of the target through image enhancement processing, which helps the model capture the key features of the target more effectively;

(2) Data labeling and augmentation: the collected pictures are divided into positive pictures and negative pictures according to whether they contain the recurrent laryngeal nerve, and the nerve is labeled in the positive pictures. In addition, data augmentation operations such as image rotation, translation, and scaling are applied to increase the diversity of the data set and improve the generalization ability of the model;

(3) Image quality screening: all pictures are screened for quality, and poor-quality pictures, such as blurred or severely noisy ones, are removed to ensure the accuracy and reliability of the data set. An image quality evaluation algorithm is introduced to screen the images automatically, which improves screening efficiency;

(4) Data partitioning and cross-validation: the labeled white light mode images and the labeled image enhancement mode images are first divided into two sets of training data. During the division, stratified sampling and cross-validation are used to ensure that pictures of the same case never appear in both the training set and the test set; that is, all pictures of a given case are assigned either to the training set or to the test set, which avoids data leakage and biased evaluation results;

(5) Data format conversion and standardization: the labeled data set is stored in a suitable file format, and the corresponding annotation files are converted into the format required by the model. The images are preprocessed, including standardization and normalization, to reduce the difficulty of model training and speed up convergence.
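When pictures are rotated, translated, or scaled for augmentation, the nerve annotations have to be transformed in the same way. The sketch below assumes the annotations are axis-aligned bounding boxes and that rotation and scaling are applied about the image center; both points are assumptions, and `augment_box` is a hypothetical helper.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def augment_box(box: Box, img_w: float, img_h: float, angle_deg: float = 0.0,
                scale: float = 1.0, shift: Tuple[float, float] = (0.0, 0.0)) -> Box:
    """Scale and rotate a box about the image center, then translate it.

    The result is the axis-aligned envelope of the four transformed corners.
    """
    cx, cy = img_w / 2, img_h / 2
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    x_min, y_min, x_max, y_max = box
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    moved: List[Tuple[float, float]] = []
    for x, y in corners:
        dx, dy = (x - cx) * scale, (y - cy) * scale
        moved.append((cx + dx * cos_t - dy * sin_t + shift[0],
                      cy + dx * sin_t + dy * cos_t + shift[1]))
    xs = [p[0] for p in moved]
    ys = [p[1] for p in moved]
    return (min(xs), min(ys), max(xs), max(ys))
```

The same angle, scale, and shift would be applied to the picture itself, so that each augmented picture and its label stay aligned.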

2. Model training and optimization

The deep learning target detection framework is adopted, and the images in the white light mode and the image enhancement mode are used for training, so that the model has high-efficiency recurrent laryngeal nerve recognition capability. The specific process comprises the following steps:

(1) Loading a pre-training model: loading a pre-trained neural network model suitable for the task, such as YOLO V7 (only YOLO V7 is taken as an example, and other suitable target detection frames can be selected in practical application);

(2) Model construction and fusion: with reference to FIG. 2, target detection models are constructed separately for the white light mode pictures and the image enhancement mode pictures, and their feature layers are then fused. The last fully connected layer of the neural network model is removed and replaced with a fully connected layer that has 2 output classes and randomly initialized weights, which yields a deep learning neural network model for target detection in thyroid endoscopic images;

(3) Model training and optimization: the model is trained on each of the divided training sets and optimized by stochastic gradient descent during training. In each of multiple training rounds, the model performs forward propagation on the input images to obtain a prediction result, the error between the prediction result and the true label is calculated, and the model parameters are updated by backpropagation to optimize the performance of the model. The image enhancement mode pictures in the training data increase the diversity of the training samples and improve the generalization ability of the model;

(4) Model verification and selection: the performance of the neural network models obtained through training in multi-fold cross-validation is evaluated by accuracy, and the neural network model with the highest accuracy is taken as the optimal neural network model;

(5) Model evaluation and comparison: the ROC curve of the optimal neural network model is drawn according to how the model identifies the recurrent laryngeal nerve, and the true positive rate and false positive rate of each doctor participating in the verification are plotted at the corresponding positions in the ROC plot. If the ROC curve of the optimal neural network model encloses a doctor's result point, the model matches or exceeds the performance of that human expert and is capable of identifying the recurrent laryngeal nerve in actual endoscopic surgery;

(6) Dynamic tracking algorithm: a video stream processing module is embedded in the model. Each frame of the video sequence is input into the deep learning network, which extracts a high-level abstract feature representation and performs forward propagation on the image to obtain a target detection result, including a positive or negative judgment for the recurrent laryngeal nerve. The target object is tracked in the current frame according to the target detection result, and its state is updated according to its position in the current frame, which improves detection accuracy in the next frame. Every frame of the video sequence is processed in real time and the position of the target object is continuously updated, so that the target object is tracked dynamically;

(7) Model integration and application: the optimized model and the dynamic tracking algorithm are embedded together into an endoscopic surgery assistance system, which provides the doctor with real-time recurrent laryngeal nerve recognition and tracking and improves the safety and accuracy of surgery.
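The per-frame update in the dynamic tracking step can be sketched as follows. The text does not specify how the state is updated, so exponential smoothing of the box coordinates and a short tolerance for missed detections are assumed here. `NerveTrack`, `track_stream`, and the `detect` callback (standing in for the trained recognition model) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class NerveTrack:
    """Tracked state of the nerve: last smoothed box and count of missed frames."""
    box: Optional[Box] = None
    missed: int = 0

    def update(self, detection: Optional[Box], alpha: float = 0.5,
               max_missed: int = 5) -> Optional[Box]:
        if detection is None:
            # Keep the last known position for a few frames, then drop the track.
            self.missed += 1
            if self.missed > max_missed:
                self.box = None
            return self.box
        self.missed = 0
        if self.box is None:
            self.box = detection
        else:
            # Blend the new detection with the previous state to damp jitter.
            self.box = tuple(alpha * d + (1 - alpha) * p
                             for d, p in zip(detection, self.box))
        return self.box


def track_stream(frames: Iterable[object],
                 detect: Callable[[object], Optional[Box]]) -> Iterator[Optional[Box]]:
    """Run the detector on every frame and yield the tracked nerve position."""
    track = NerveTrack()
    for frame in frames:
        yield track.update(detect(frame))
```

Holding the last position through a brief detection gap is one simple way to keep the on-screen marker stable when an instrument momentarily hides the nerve; a production system might use a motion model instead.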

3. System implementation and application

(1) Model deployment: embedding the trained deep learning model into an endoscopic surgery auxiliary system to realize the functions of automatic recognition and real-time tracking of recurrent laryngeal nerves;

(2) Data acquisition and processing: when the recurrent laryngeal nerve enters the endoscope visual field, a video stream processing module of the system processes real-time video, and each frame in a video sequence is input into a deep learning network for detection;

(3) Target detection and optimization: inputting an image or video stream to be detected into a model, and outputting detected target position and category information by the model; meanwhile, post-processing is carried out on the target detection result output by the model according to actual demands, such as removing repeated detection, screening targets with low confidence and the like;

(4) Target tracking and display: the system automatically recognizes the position of the recurrent laryngeal nerve, marks the position of the recurrent laryngeal nerve on a screen, and tracks the movement of the recurrent laryngeal nerve in real time, so that an operator can adjust surgical instruments in time, protect the recurrent laryngeal nerve and reduce the incidence rate of recurrent laryngeal nerve injury.
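The post-processing mentioned in step (3), removing repeated detections and low-confidence targets, is commonly done with a confidence threshold followed by non-maximum suppression. The text does not name the technique, so this is an assumption; the threshold values and the names `iou` and `postprocess` are illustrative.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[Box, float]            # (box, confidence)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(detections: List[Detection], min_conf: float = 0.5,
                iou_thresh: float = 0.5) -> List[Detection]:
    """Drop low-confidence boxes, then suppress boxes that overlap a stronger one."""
    confident = [d for d in detections if d[1] >= min_conf]
    kept: List[Detection] = []
    for det in sorted(confident, key=lambda d: d[1], reverse=True):
        if all(iou(det[0], k[0]) < iou_thresh for k in kept):
            kept.append(det)
    return kept
```

Detections are visited from highest to lowest confidence, so when two boxes cover the same nerve segment only the stronger one is kept.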

The system adopts a deep learning target detection framework and is trained on pictures in the white light mode and the image enhancement mode from endoscopic thyroid surgery, which enables automatic recognition and real-time tracking of the recurrent laryngeal nerve. Compared with conventional intraoperative nerve monitoring, which requires additional disposable consumables such as an endotracheal tube with an electromyography electrode and a nerve probe, the system needs no extra consumables, which reduces surgical cost and the waste of medical resources.

In addition, the system is versatile and adaptable. It can be used with different types of endoscope systems and operations and can be widely applied in multiple clinical departments in hospitals, such as thyroid surgery and head and neck surgery. The system also supports multiple image enhancement modes, which improves the accuracy and stability of recurrent laryngeal nerve identification and suits different clinical operations.

The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.

The integrated modules, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, all or part of the flow of the method of the above embodiment may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.

While the present invention has been described in considerable detail and with particularity with respect to several embodiments, it is not intended to be limited to any such details or embodiments or to any particular embodiment; rather, it is to be construed with reference to the appended claims so as to provide the broadest possible interpretation of those claims in view of the prior art and thereby effectively encompass the intended scope of the invention. Furthermore, the foregoing describes the invention in terms of embodiments foreseen by the inventors for the purpose of providing a useful description; insubstantial modifications of the invention not presently foreseen may nonetheless represent equivalents thereof.

The above are merely preferred embodiments of the present invention, and the present invention is not limited to these embodiments; any solution that achieves the technical effects of the present invention by the same means shall fall within the scope of protection of the present invention. Various modifications and variations of the technical solution and/or the embodiments are possible within the scope of protection of the invention.

Claims

1. The method for identifying and tracking the recurrent laryngeal nerve in endoscopic thyroid surgery is characterized by comprising the following steps of:

acquiring a real-time endoscope video in the operation process;

2. The method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery according to claim 1, wherein, in particular,

3. The method of claim 1, wherein the data preprocessing operation comprises,

4. The method of claim 1, further comprising, when the first training data and the second training data are obtained by dividing, using a hierarchical sampling and cross-validation method, ensuring that multiple pictures of the same case do not appear in both the training set and the test set.

5. The method of claim 1, further comprising, before training the first training data and the second training data in the model, storing the first training data and the second training data in a file format adapted to the model, converting a corresponding labeling information file into a format required by the model, and performing a standardized normalization process on images in the first training data and the second training data.

6. The method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery according to claim 1, wherein the pre-constructed neural network model is obtained by: selecting a neural network model suitable for recurrent laryngeal nerve recognition as an initial neural network model; constructing, in the initial neural network model, a first target detection model for the recurrent laryngeal nerve position images in the white light mode and a second target detection model for the images in the image enhancement mode; fusing the feature layers of the first target detection model and the second target detection model; and removing the final fully connected layer of the initial neural network model and replacing it with a fully connected layer that has 2 output classes and randomly initialized weights.

7. The method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery according to claim 1, wherein the training of the pre-constructed neural network model comprises,

8. The method for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery according to claim 7, wherein, specifically, the performance of the neural network models obtained through training is evaluated by accuracy, and the neural network model with the highest accuracy is the optimal neural network model;

9. A system for identifying and tracking recurrent laryngeal nerves in endoscopic thyroid surgery, characterized by comprising:

the model training module is used for training the pre-constructed neural network model by using the first training data and the second training data so as to obtain a trained recognition model;

10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any of claims 1-8.