CN111652175A - Real-time surgical tool detection method applied to robot-assisted surgical video analysis - Google Patents

Real-time surgical tool detection method applied to robot-assisted surgical video analysis

Info

Publication number
CN111652175A
CN111652175A (application CN202010529745.2A)
Authority
CN
China
Prior art keywords
surgical tool
surgical
robot
image
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010529745.2A
Other languages
Chinese (zh)
Inventor
赵子健
刘玉莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202010529745.2A priority Critical patent/CN111652175A/en
Publication of CN111652175A publication Critical patent/CN111652175A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/20: Scenes; Scene-specific elements in augmented reality scenes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03: Recognition of patterns in medical or anatomical images
    • G06V2201/034: Recognition of patterns in medical or anatomical images of medical instruments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a real-time surgical tool detection method applied to robot-assisted surgical video analysis, comprising: acquiring a robot-assisted surgery video and processing it to obtain surgical images; estimating surgical tool key points on each surgical image to obtain a heat map of surgical tool center points, and predicting the center point and the size of each surgical tool from the peaks of the heat map; and calculating the bounding box of the surgical tool from the predicted center point and size. A lightweight convolutional neural network is adopted in which a fire module replaces the residual module of the traditional hourglass network, greatly reducing the parameters required to train the network; detection speed is improved while high detection accuracy is maintained, meeting the requirement of real-time detection.

Description

Real-time surgical tool detection method applied to robot-assisted surgical video analysis
Technical Field
The disclosure belongs to the technical field of robot-assisted surgical video analysis, and particularly relates to a real-time surgical tool detection method applied to robot-assisted surgical video analysis.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Robot-assisted surgery falls mainly into two categories: robot-assisted surgical operation and robot-assisted surgical navigation. In robot-assisted surgical operation, a surgeon operates by means of a robot arm, reducing surgical accidents caused by human factors such as subjective judgment and fatigue; the surgeon can operate the robot away from the operating table to perform precise surgery. This departs completely from the traditional concept of surgery and can also reduce the probability of infection among medical staff. Robot-assisted surgical navigation accurately registers pre-operative or intra-operative image data to the anatomical structure of the patient on the operating table, tracks the surgical tool during the operation, and updates and displays its position on the patient image in real time in the form of a virtual probe, so that the doctor clearly knows where the surgical tool lies relative to the patient's anatomy, making the operation faster, more accurate and safer.
Video analysis techniques use computer vision to distinguish the background from the objects in a camera scene and then further analyze and track those objects. Modeled on how the human eye works, video analysis follows a basic pipeline of acquisition, preprocessing, processing and action. First, video data of experienced doctors performing operations is collected; then, surgical videos with clear pictures and complete procedures are split into frames; next, the appearance time, position and other information of the surgical tools in the resulting surgical images are identified; finally, the obtained surgical tool information supports applications such as novice training, surgical early warning and operating-room resource allocation.
Real-time surgical tool detection, a classic problem in computer vision, has the task of marking the exact position of a surgical tool in an image with a box and giving the tool's category. Because surgical tool detection runs on surgical video images, the detection process must be both fast and accurate; that is, detection must be real-time as well as precise.
Unlike ordinary object detection tasks, the images used for surgical tool detection often contain factors unfavorable to detection, such as blood, fog, blur and fast tool motion, which lower detection precision, disturb the surgical navigation process and can harm the patient. On the other hand, for surgical tool detection to assist navigation, real-time performance is essential: if detection is not real-time, the surgeon's view during the operation is delayed, causing unnecessary damage to the patient. However, the methods currently adopted for surgical tool detection are very time-consuming; they must generate a large number of anchor boxes as priors and even map those priors back onto the image feature map, which increases the amount of computation and cannot achieve real-time performance.
Disclosure of Invention
To overcome the deficiencies of the prior art, the present disclosure provides a real-time surgical tool detection method applied to robot-assisted surgical video analysis, which improves both the speed and the accuracy of surgical tool detection.
In one aspect, to achieve the above object, one or more embodiments of the present disclosure provide the following technical solutions:
a real-time surgical tool detection method applied to robot-assisted surgical video analysis comprises the following steps:
acquiring a robot-assisted surgery video and processing the video to obtain a surgery image;
estimating key points of the surgical tool on the surgical image to obtain a heat map of the central point of the surgical tool, and predicting the central point of the surgical tool and the size of the surgical tool according to the peak value of the heat map;
the bounding box of the surgical tool is derived from the predicted center point of the surgical tool and its size.
In a further technical scheme, the surgical tool key points of the surgical image are estimated with a surgical tool key point estimator obtained by training the lightweight neural network framework.
In another aspect, a real-time surgical tool detection system for use in robotic-assisted surgical video analysis is disclosed, comprising:
the operation image acquisition module is configured to acquire a robot-assisted operation video and process the robot-assisted operation video to obtain an operation image;
the central point prediction module of the surgical tool is configured to preprocess the surgical image, estimate key points of the surgical tool to obtain a heat map of the central point of the surgical tool, and predict the central point of the surgical tool and the size of the surgical tool according to the peak value of the heat map;
and a bounding box acquisition module of the surgical tool, which is configured to obtain the bounding box of the surgical tool from the predicted central point of the surgical tool and the size thereof.
The above one or more technical solutions have the following beneficial effects:
1. The technical scheme adopts a lightweight convolutional neural network in which a fire module replaces the residual module of the traditional hourglass network, greatly reducing the parameters required to train the network; detection speed is improved while high detection accuracy is maintained, meeting the requirement of real-time detection.
2. The technical scheme extracts candidate surgical tool bounding boxes with an anchor-free method: the bounding box of the surgical tool to be detected is obtained directly from the coordinates of the tool's center point through a formula, with no post-processing step, making this one-stage detection method more efficient.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
FIG. 1 is a block diagram of the overall convolutional neural network of the present invention;
FIG. 2 is a detailed block diagram of the fire module that replaces the residual module.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
The disclosed embodiments employ an anchor-box-less lightweight convolutional neural network architecture.
Interpretation of terms
Anchor-free means without anchor boxes. An anchor (also called an anchor box) is one of a group of rectangular boxes obtained before convolutional neural network training by clustering the training set with methods such as k-means; it represents the typical length and width of the objects in the data set. Object detection algorithms can generally be divided into anchor-based and anchor-free classes, the difference lying in whether anchors are used to extract candidate target boxes. An anchor-free method extracts candidate target boxes without using anchors.
The surgical tool detection methods widely adopted at present fall mainly into two types: two-stage detection methods and one-stage detection methods that require anchor boxes. Two-stage methods must map a large number of anchor boxes back onto the image feature map before classification and regression, which is very time-consuming. One-stage methods requiring anchor boxes can classify and regress directly after the anchor boxes are generated, but the large number of anchor boxes demands careful tuning of anchor parameters, the detection precision of surgical tools drops, and the speed still falls far short of the real-time requirement. The present method is a one-stage, lightweight surgical tool detection method that classifies and regresses directly without a large number of anchor boxes, meeting the real-time requirement while maintaining surgical tool detection accuracy.
Example one
This embodiment discloses a real-time surgical tool detection method applied to robot-assisted surgical video analysis; the overall concept is as follows:
acquiring a robot-assisted surgery video and processing the video to obtain a surgery image;
estimating key points of the surgical tool on the surgical image to obtain a heat map of the central point of the surgical tool, and predicting the central point of the surgical tool and the size of the surgical tool according to the peak value of the heat map;
the bounding box of the surgical tool is derived from the predicted center point of the surgical tool and its size.
The method comprises the following specific steps:
S1: acquiring a robot-assisted surgery video and splitting the surgical video into frames to obtain surgical images;
S2: initializing the neural network framework for training;
S3: inputting the surgical images obtained in step S1 into the neural network framework and preprocessing them;
S4: training the neural network framework to obtain the surgical tool key point estimator $\hat{Y}$ for the surgical images of step S3;
S5: obtaining the heat map of surgical tool center points from the key point estimator of step S4, regressing the size of the surgical tool bounding box, and outputting the offset values introduced by down-sampling;
S6: obtaining the bounding box of the surgical tool from the center point obtained in step S5.
In a specific implementation example, step S1 specifically includes:
S11: during the robot-assisted surgery, a camera captures video of the whole procedure at 25 fps;
S12: the video collected in step S11 is down-sampled to 5 fps with framing software and stored as surgical images (a frame-extraction sketch follows step S13). Note that in this implementation the original video is down-sampled to the frame rate at which the surgical videos were manually annotated; the down-sampling also enriches the temporal information between video segments, which benefits surgical tool detection accuracy;
S13: step S12 is repeated until all surgical videos have been converted into surgical images.
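For illustration, a minimal frame-extraction sketch is given below; it assumes OpenCV (cv2) is available, and the file names and target rate are hypothetical.

```python
import cv2
import os

def video_to_frames(video_path, out_dir, target_fps=5):
    """Down-sample a surgical video to target_fps and save the frames."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    src_fps = cap.get(cv2.CAP_PROP_FPS)          # e.g. 25 fps from the camera
    step = max(1, round(src_fps / target_fps))   # keep every `step`-th frame
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            cv2.imwrite(os.path.join(out_dir, f"frame_{saved:06d}.png"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

# hypothetical usage:
# video_to_frames("surgery_01.mp4", "frames/surgery_01", target_fps=5)
```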
The step S2 specifically includes:
S21: let the number of surgical tool categories be C, with C = 1 chosen here; C denotes the categories of surgical tools appearing in the surgical video, and C = 1 means only one kind of surgical tool appears in the processed videos. For a surgical image of size W × H with down-sampling factor R, a size of 720 × 576 and R = 4 are chosen, so that the key point estimator output has shape $\hat{Y} \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$. The image first passes through a 7 × 7 convolution module and a residual module to reduce its resolution, which is uniformly set to 512 × 512 to speed up network training;
S22: the surgical image obtained in step S21 passes through two lightweight hourglass network stacks with relay supervision layers (a 1 × 1 convolution module plus batch normalization), as shown in FIG. 1. Although two hourglass modules are used, only two symmetric down-sampling and up-sampling modules with skip-connection layers are added; no max-pooling layer is used during down-sampling, and a stride of 2 is chosen instead to reduce the resolution of the surgical image.
The lightweight hourglass network stack comprises:
A down-sampling part, which processes the image down to a suitable working size and generates a low-resolution feature map of the surgical tool image.
An up-sampling part, which magnifies the feature map back to the higher resolution.
A relay supervision part: when gradient descent is applied to the whole network and the output-layer error is back-propagated layer by layer, gradients can vanish; inserting relay supervision in the middle of the network ensures that the low-level parameters are still updated.
Training the neural network amounts to feeding a picture into this black box: the network converts the picture into different features, propagates them layer by layer, and finally outputs the required result.
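As a rough structural sketch (not the patented network itself), the two-stack hourglass with relay supervision might be arranged as below in PyTorch; the channel width, recursion depth and head layout are assumptions, and the plain conv_block stands in for the fire module introduced in the next step.

```python
import torch
import torch.nn as nn

def conv_block(ch):
    # placeholder block; in the patented design this is the fire module
    return nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                         nn.BatchNorm2d(ch), nn.ReLU(inplace=True))

class Hourglass(nn.Module):
    """Symmetric down/up-sampling with a skip connection; down-sampling
    uses a stride-2 convolution instead of max pooling."""
    def __init__(self, ch, depth=2):
        super().__init__()
        self.skip = conv_block(ch)
        self.down = nn.Sequential(nn.Conv2d(ch, ch, 3, stride=2, padding=1),
                                  nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.inner = Hourglass(ch, depth - 1) if depth > 1 else conv_block(ch)
        self.up = nn.Upsample(scale_factor=2)

    def forward(self, x):  # input H, W must be divisible by 2**depth
        return self.skip(x) + self.up(self.inner(self.down(x)))

class RelaySupervisedStacks(nn.Module):
    """Two stacked hourglasses; each stack emits an intermediate heat map
    through a 1x1 convolution plus batch normalization, so gradients
    reach the lower layers directly (relay supervision)."""
    def __init__(self, ch=128, num_classes=1):
        super().__init__()
        self.stacks = nn.ModuleList(Hourglass(ch) for _ in range(2))
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Conv2d(ch, num_classes, 1),
                          nn.BatchNorm2d(num_classes))
            for _ in range(2))

    def forward(self, x):
        outs = []
        for hg, head in zip(self.stacks, self.heads):
            x = hg(x)
            outs.append(torch.sigmoid(head(x)))
        return outs  # supervise every element during training
```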
Referring to FIG. 2, step S2 rests on replacing the residual modules of the traditional hourglass network with fire modules; the surgical tool key point estimator is learned by training the neural network composed of fire modules. The fire module lets the network cut its parameter count while preserving surgical tool detection accuracy, so that real-time detection becomes possible. Specifically:
S23: the residual module of the traditional hourglass framework used in step S22 is replaced by a fire module, which first compresses the channels of the input image with a 1 × 1 convolution kernel, then applies a module in which 1 × 1 convolution kernels and 3 × 3 depthwise separable convolution kernels run mixed in parallel, and finally outputs the result through a rectified linear unit (also called a linear rectification function), a nonlinear activation function common in convolutional neural networks. A stack of purely linear layers can only realize a linear mapping; introducing activation functions such as the rectified linear unit adds nonlinearity so the network can represent complex functions. This design reduces the training parameters and greatly accelerates training and detection; the fire module and the depthwise separable convolution are common devices of lightweight neural networks;
S24: the depthwise separable convolution in step S23 consists of two steps: first, each of the three channels is convolved with its own filter, yielding three values per position; second, a 1 × 1 × 3 pointwise convolution combines those three values into a single number.
In the neural network framework, the position of the surgical tool in each surgical image is learned through training, and every surgical image passes through this convolution operation.
The two steps of the depthwise separable convolution play different roles:
1. the depthwise step separates the per-channel spatial information;
2. the 1 × 1 pointwise step fuses the channels and reduces the size. The computation of a depthwise separable convolution is roughly 1/9 that of a traditional 3 × 3 convolution, which greatly increases speed.
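A minimal sketch of such a fire module, assuming PyTorch; the squeeze/expand channel counts are illustrative, but the structure follows the description above: a 1 × 1 squeeze, a parallel 1 × 1 and depthwise-separable 3 × 3 expand, and rectified-linear outputs.

```python
import torch
import torch.nn as nn

class DepthwiseSeparable3x3(nn.Module):
    """Step 1: one 3x3 filter per channel (groups=c_in);
    step 2: a 1x1 pointwise convolution that fuses the channels."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class FireModule(nn.Module):
    """Squeeze with 1x1; expand with parallel 1x1 and depthwise-separable 3x3."""
    def __init__(self, c_in, squeeze, expand):
        super().__init__()
        self.squeeze = nn.Conv2d(c_in, squeeze, 1)
        self.expand1x1 = nn.Conv2d(squeeze, expand, 1)
        self.expand3x3 = DepthwiseSeparable3x3(squeeze, expand)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        s = self.relu(self.squeeze(x))
        return self.relu(torch.cat([self.expand1x1(s),
                                    self.expand3x3(s)], dim=1))

# e.g. FireModule(128, squeeze=16, expand=64) maps 128 -> 128 channels
# with far fewer parameters than a plain 3x3 residual block
```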
The step S3 specifically includes:
S31: processing the input surgical images in batches;
S32: preprocessing each input batch, i.e., applying data enhancement (rotation, translation, scaling and similar operations on the surgical images) to enlarge the training data set (an augmentation sketch follows step S33);
S33: repeating step S32 until all batches have been processed.
The step S4 specifically includes:
S41: training the neural network framework yields the surgical tool key point estimator $\hat{Y}$ for the surgical images of step S3. When the prediction is $\hat{Y}_{xyc} = 1$, the detected key point is the center point of a surgical tool; when the prediction is $\hat{Y}_{xyc} = 0$, the detected key point is background.
The surgical tool key point estimator is obtained by training the designed one-stage, anchor-free, lightweight convolutional neural network and is expressed as

$\hat{Y} \in [0, 1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

where W and H are the width and height of the input image, respectively, and R is the down-sampling factor.
The step S5 specifically includes:
S51: suppose the center point of a surgical tool of class c in the input surgical image I has ground-truth coordinates (95, 102); this ground truth can be splatted onto the heat map and used for the subsequent computations. The surgical tool key point estimator obtained from step S4 is $\hat{Y}$. Within a given radius, detected negative samples are penalized softly rather than with a hard zero/one label: the penalty for a negative sample is reduced by an unnormalized 2D Gaussian kernel $Y_{xyc}$ centered at the positive location. Because surgical image data suffers from a severe imbalance between positive and negative samples, the terms in front of the log operator act as balancing weights. If $Y_{xyc}$ approaches 1, the point is easy to detect; if $Y_{xyc}$ approaches 0, the key point has not yet been learned and its training weight should be increased, so the bracketed term before the log adjusts the training proportion according to the magnitude of $Y_{xyc}$. An improved focal loss function is used (a code sketch of this loss follows below):

$L_k = \frac{-1}{N} \sum_{xyc} \begin{cases} (1 - \hat{Y}_{xyc})^{\alpha} \log(\hat{Y}_{xyc}) & \text{if } Y_{xyc} = 1 \\ (1 - Y_{xyc})^{\beta} (\hat{Y}_{xyc})^{\alpha} \log(1 - \hat{Y}_{xyc}) & \text{otherwise} \end{cases}$

where the hyper-parameters are α = 2 and β = 4, and N is the number of key points in the surgical image;
S52: a loss function composed of three parts is designed: $L_{det} = L_k + \lambda_s L_s + \lambda_o L_o$, where $L_k$ is the key point estimation loss above, $L_s$ is the L1 loss that estimates the surgical tool size with constant coefficient $\lambda_s = 0.1$, and $L_o$ is the local offset L1 loss with constant coefficient $\lambda_o = 1$;
S53: step S52 is repeated, continuously learning and training the network, so that the value of the loss function gradually decreases to a certain value and then stabilizes, until the loss curve of the convolutional neural network converges.
Convergence of the loss curve indicates that training has succeeded; an annotated surgical tool picture can then be input for testing, directly yielding C + 4 values: the class C of the surgical tool key point, the bounding box sizes W and H, and the offsets x and y.
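A sketch of this modified focal loss in PyTorch, following the formulation above with α = 2 and β = 4; the (B, C, H/R, W/R) tensor layout is an assumption.

```python
import torch

def modified_focal_loss(pred, gt, alpha=2, beta=4, eps=1e-7):
    """pred, gt: heat maps of shape (B, C, H/R, W/R); gt holds the
    Gaussian-splatted ground truth, with exactly 1 at each center."""
    pos = gt.eq(1).float()                    # positive (center) locations
    neg = 1.0 - pos                           # everything else
    pred = pred.clamp(eps, 1 - eps)           # numerical safety for log
    pos_loss = pos * (1 - pred) ** alpha * torch.log(pred)
    neg_loss = neg * (1 - gt) ** beta * pred ** alpha * torch.log(1 - pred)
    n = pos.sum().clamp(min=1)                # N = number of key points
    return -(pos_loss.sum() + neg_loss.sum()) / n

# L_det = modified_focal_loss(...) + 0.1 * size_l1 + 1.0 * offset_l1 (per S52)
```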
The heat map is an output of the neural network, i.e., the per-class scores of the surgical tool key points.
NMS (non-maximum suppression) post-processing, which computes the IoU between bounding boxes, is commonly used to remove duplicated boxes of the same surgical tool; but it is hard to differentiate and train, which is why most current detectors are not end-to-end trainable. Here, instead, every response point on the heat map is compared with its 8 neighbors: a key point is retained if its value is greater than or equal to all 8 neighboring values, and the top 100 peak points are finally kept. The NMS post-processing step is thus eliminated and end-to-end training becomes possible.
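The 8-neighborhood comparison can be implemented with a 3 × 3 max pooling, as in this sketch (same tensor-layout assumption; top-100 as in the text).

```python
import torch
import torch.nn.functional as F

def extract_peaks(heatmap, k=100):
    """Keep points >= all 8 neighbours, then take the top-k.
    heatmap: (B, C, H, W) tensor of key point scores in [0, 1]."""
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    keep = (pooled == heatmap).float()            # local maxima only
    scores = (heatmap * keep).flatten(1)          # (B, C*H*W)
    top_scores, top_idx = scores.topk(k, dim=1)
    c, h, w = heatmap.shape[1:]
    cls = torch.div(top_idx, h * w, rounding_mode="floor")
    ys = torch.div(top_idx % (h * w), w, rounding_mode="floor")
    xs = top_idx % w
    return top_scores, cls, ys, xs                # no NMS needed
```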
For each of the C classes in the ground truth of the surgical image, the ground-truth key points $p \in \mathbb{R}^2$ are computed for training. The key point corresponding to the down-sampled original image is

$\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$

so that $\tilde{p}$ is the low-resolution key point corresponding to the original image after down-sampling is complete.
Each down-sampled ground-truth key point $\tilde{p}$ is splatted onto the heat map through the Gaussian kernel

$Y_{xyc} = \exp\!\left( -\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2 \sigma_p^2} \right)$

where $\sigma_p$ is a standard deviation related to the surgical tool dimensions W and H, so that the key points are distributed over the surgical tool feature map; where two Gaussians overlap, the larger value is kept. Each value Y therefore ranges from 0 to 1, with 1 marking a point whose prediction must be learned.
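A sketch of this Gaussian splatting for the ground-truth heat map; the 3σ cut-off radius is an assumption (the text states only that σ_p depends on the tool size W, H).

```python
import numpy as np

def splat_gaussian(heatmap, center, sigma):
    """Write exp(-((x-px)^2 + (y-py)^2) / (2 sigma^2)) around `center`
    into `heatmap` (H x W), keeping the element-wise maximum on overlap."""
    px, py = int(center[0]), int(center[1])
    radius = int(3 * sigma)                  # kernel is ~0 beyond 3 sigma
    h, w = heatmap.shape
    for y in range(max(0, py - radius), min(h, py + radius + 1)):
        for x in range(max(0, px - radius), min(w, px + radius + 1)):
            g = np.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma ** 2))
            heatmap[y, x] = max(heatmap[y, x], g)   # overlap: keep the max
    return heatmap
```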
Finally the key point loss function, the improved pixel-level logistic-regression focal loss $L_k$ above, is applied, and the remaining quantities are regressed with L1 loss functions.
If the bounding box of a certain surgical tool k is represented as $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$, the center point of the surgical tool is

$p_k = \left( \frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2} \right)$

The bounding box size of each surgical tool k is computed before training:

$s_k = \left( x_2^{(k)} - x_1^{(k)},\; y_2^{(k)} - y_1^{(k)} \right)$

this size being the W, H of the box after down-sampling of the original image. The key point estimator $\hat{Y}$ is used to estimate the center points of all surgical tools. To reduce the computational effort for real-time operation, a single size prediction $\hat{S} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$ is shared by all surgical tool categories, and regression is performed at the center point positions with an L1 loss:

$L_{size} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{S}_{p_k} - s_k \right|$
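Under the same layout assumptions, the size (and, analogously, offset) regression reduces to an L1 penalty evaluated only at the ground-truth center locations; the helper below is illustrative.

```python
import torch
import torch.nn.functional as F

def l1_at_centers(pred_map, gt_values, centers):
    """pred_map: (B, 2, H, W) size or offset head; centers: list of
    (batch, y, x) ground-truth centers; gt_values: (N, 2) targets."""
    preds = torch.stack([pred_map[b, :, y, x] for (b, y, x) in centers])
    return F.l1_loss(preds, gt_values, reduction="sum") / max(len(centers), 1)

# total loss as in step S52: L_det = L_k + 0.1 * L_size + 1.0 * L_offset
```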
the step S6 specifically includes:
s61: according to the peak value of the heat map obtained in the step S5, the responses of all the values greater than or equal to 8 connected neighborhoods are detected, the first 100 peak values are kept, and the
Figure BDA0002534941490000101
2 detected center points for class 1
Figure BDA0002534941490000102
The set of (a) and (b),
Figure BDA0002534941490000103
is a prediction of the deviation of the surgical tool,
Figure BDA0002534941490000104
size prediction of surgical tools.
S62: based on the center point of the surgical tool detected in step S5 and the predicted size of the surgical tool, the formula is finally used:
Figure BDA0002534941490000105
the bounding box of the surgical tool is obtained, the whole process belongs to the bounding box of the candidate surgical tool extracted without an anchor box, and any post-processing process is not needed.
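Putting the pieces together, a sketch of the anchor-free decoding step; it assumes the peak coordinates, offsets and sizes produced by the earlier sketches.

```python
import torch

def decode_boxes(xs, ys, offsets, sizes):
    """xs, ys: (N,) peak coordinates on the down-sampled map;
    offsets: (N, 2) predicted (dx, dy); sizes: (N, 2) predicted (w, h).
    Returns (N, 4) boxes (x1, y1, x2, y2): no anchors, no NMS."""
    cx = xs + offsets[:, 0]
    cy = ys + offsets[:, 1]
    w, h = sizes[:, 0], sizes[:, 1]
    return torch.stack([cx - w / 2, cy - h / 2,
                        cx + w / 2, cy + h / 2], dim=1)
```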
Based on the same inventive concept, the present embodiment is directed to a computing device, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the real-time surgical tool detection method applied to the robot-assisted surgical video analysis in the first embodiment.
Based on the same inventive concept, an object of the present embodiment is to provide a computer-readable storage medium on which a computer program is stored which, when executed by a processor, performs the steps of the real-time surgical tool detection method applied to robot-assisted surgical video analysis of the first embodiment.
Based on the same inventive concept, the present embodiment aims to provide a real-time surgical tool detection system applied to video analysis of robot-assisted surgery, comprising:
the operation image acquisition module is configured to acquire a robot-assisted operation video and process the robot-assisted operation video to obtain an operation image;
the central point prediction module of the surgical tool is configured to preprocess the surgical image, estimate key points of the surgical tool to obtain a heat map of the central point of the surgical tool, and predict the central point of the surgical tool and the size of the surgical tool according to the peak value of the heat map;
and a bounding box acquisition module of the surgical tool, which is configured to obtain the bounding box of the surgical tool from the predicted central point of the surgical tool and the size thereof.
The steps involved in the apparatus of the above embodiment correspond to the first embodiment of the method, and the detailed description thereof can be found in the relevant description of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media containing one or more sets of instructions; it should also be understood to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor and that cause the processor to perform any of the methods of the present disclosure.
Those skilled in the art will appreciate that the modules or steps of the present disclosure described above can be implemented using general purpose computer means, or alternatively, they can be implemented using program code executable by computing means, whereby the modules or steps may be stored in memory means for execution by the computing means, or separately fabricated into individual integrated circuit modules, or multiple modules or steps thereof may be fabricated into a single integrated circuit module. The present disclosure is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims (10)

1. The real-time surgical tool detection method applied to robot-assisted surgical video analysis is characterized by comprising the following steps of:
acquiring a robot-assisted surgery video and processing the video to obtain a surgery image;
estimating key points of the surgical tool on the surgical image to obtain a heat map of the central point of the surgical tool, and predicting the central point of the surgical tool and the size of the surgical tool according to the peak value of the heat map;
and calculating the bounding box of the surgical tool from the predicted center point of the surgical tool and the size thereof.
2. The method of claim 1, wherein the surgical tool keypoint estimator derived from training the lightweight neural network framework is used to estimate the surgical tool keypoints for the surgical image.
3. The method of claim 2, wherein the lightweight neural network framework includes a fire module configured to compress the channels of an input image using a convolution kernel, then to apply a module in which convolution kernels and depthwise separable convolution kernels are mixed in parallel, and finally to output the result through a rectified linear unit.
4. The method as claimed in claim 2, wherein the surgical tool keypoint estimator obtained after training the lightweight neural network framework outputs different values to represent whether the detected keypoint is the center point of the surgical tool or the background of the surgical tool.
5. The method as claimed in claim 1, wherein processing the robot-assisted surgery video comprises splitting the surgical video into frames, down-sampling it, and storing the result as surgical images.
6. The method of claim 1, wherein the pre-processing, i.e., data enhancement, is performed on the surgical image prior to performing the surgical tool keypoint estimation on the surgical image to increase the training dataset samples for the lightweight neural network framework.
7. The method of claim 1, wherein the step of extracting the center point of the surgical tool comprises:
constructing a loss function based on the surgical tool key point estimator, and obtaining the center point coordinates of a surgical tool in the surgical image when the loss curve of the lightweight neural network framework converges.
8. Real-time surgical tool detection system for robot-assisted surgical video analysis, characterized by comprising:
the operation image acquisition module is configured to acquire a robot-assisted operation video and process the robot-assisted operation video to obtain an operation image;
the central point prediction module of the surgical tool is configured to preprocess the surgical image, estimate key points of the surgical tool to obtain a heat map of the central point of the surgical tool, and predict the central point of the surgical tool and the size of the surgical tool according to the peak value of the heat map;
and a bounding box acquisition module of the surgical tool, which is configured to obtain the bounding box of the surgical tool from the predicted central point of the surgical tool and the size thereof.
9. A computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the method of any of claims 1-7 for real-time surgical tool detection for use in robot-assisted surgical video analysis.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method for real-time surgical tool detection for robot-assisted surgical video analysis according to any of the preceding claims 1-7.
CN202010529745.2A 2020-06-11 2020-06-11 Real-time surgical tool detection method applied to robot-assisted surgical video analysis Pending CN111652175A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010529745.2A CN111652175A (en) 2020-06-11 2020-06-11 Real-time surgical tool detection method applied to robot-assisted surgical video analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010529745.2A CN111652175A (en) 2020-06-11 2020-06-11 Real-time surgical tool detection method applied to robot-assisted surgical video analysis

Publications (1)

Publication Number Publication Date
CN111652175A true CN111652175A (en) 2020-09-11

Family

ID=72350504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010529745.2A Pending CN111652175A (en) 2020-06-11 2020-06-11 Real-time surgical tool detection method applied to robot-assisted surgical video analysis

Country Status (1)

Country Link
CN (1) CN111652175A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037263A (en) * 2020-09-14 2020-12-04 山东大学 Operation tool tracking system based on convolutional neural network and long-short term memory network
CN112699879A (en) * 2020-12-30 2021-04-23 山东大学 Attention-guided real-time minimally invasive surgical tool detection method and system
CN112926542A (en) * 2021-04-09 2021-06-08 博众精工科技股份有限公司 Performance detection method and device, electronic equipment and storage medium
CN113569727A (en) * 2021-07-27 2021-10-29 广东电网有限责任公司 Method, system, terminal and medium for identifying construction site in remote sensing image
CN114005022A (en) * 2021-12-30 2022-02-01 四川大学华西医院 Dynamic prediction method and system for surgical instrument
CN115122342A (en) * 2022-09-02 2022-09-30 北京壹点灵动科技有限公司 Software architecture for controlling a robot and control method of a robot

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105263446A (en) * 2013-03-21 2016-01-20 康复米斯公司 Systems, methods, and devices related to patient-adapted hip joint implants
CN110532961A (en) * 2019-08-30 2019-12-03 西安交通大学 A kind of semantic traffic lights detection method based on multiple dimensioned attention mechanism network model

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105263446A (en) * 2013-03-21 2016-01-20 康复米斯公司 Systems, methods, and devices related to patient-adapted hip joint implants
CN110532961A (en) * 2019-08-30 2019-12-03 西安交通大学 A kind of semantic traffic lights detection method based on multiple dimensioned attention mechanism network model

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CHINEDU INNOCENT NWOYE et al.: "Weakly Supervised Convolutional LSTM Approach for Tool Tracking in Laparoscopic Videos" *
IRO LAINA et al.: "Concurrent Segmentation and Localization for Tracking of Surgical Instruments" *
YUYING LIU et al.: "An Anchor-Free Convolutional Neural Network for Real-Time Surgical Tool Detection in Robot-Assisted Surgery" *
LIU Yuying et al.: "A Review of Deep Learning-Based Minimally Invasive Surgical Tool Detection and Tracking" (in Chinese) *
CHEN Zhaorui et al.: "A Review of Computer-Assisted Minimally Invasive Surgical Tool Tracking Algorithms" (in Chinese) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037263A (en) * 2020-09-14 2020-12-04 山东大学 Operation tool tracking system based on convolutional neural network and long-short term memory network
CN112037263B (en) * 2020-09-14 2024-03-19 山东大学 Surgical tool tracking system based on convolutional neural network and long-term and short-term memory network
CN112699879A (en) * 2020-12-30 2021-04-23 山东大学 Attention-guided real-time minimally invasive surgical tool detection method and system
CN112926542A (en) * 2021-04-09 2021-06-08 博众精工科技股份有限公司 Performance detection method and device, electronic equipment and storage medium
CN112926542B (en) * 2021-04-09 2024-04-30 博众精工科技股份有限公司 Sex detection method and device, electronic equipment and storage medium
CN113569727A (en) * 2021-07-27 2021-10-29 广东电网有限责任公司 Method, system, terminal and medium for identifying construction site in remote sensing image
CN113569727B (en) * 2021-07-27 2022-10-21 广东电网有限责任公司 Method, system, terminal and medium for identifying construction site in remote sensing image
CN114005022A (en) * 2021-12-30 2022-02-01 四川大学华西医院 Dynamic prediction method and system for surgical instrument
CN114005022B (en) * 2021-12-30 2022-03-25 四川大学华西医院 Dynamic prediction method and system for surgical instrument
CN115122342A (en) * 2022-09-02 2022-09-30 北京壹点灵动科技有限公司 Software architecture for controlling a robot and control method of a robot
CN115122342B (en) * 2022-09-02 2022-12-09 北京壹点灵动科技有限公司 Software system for controlling robot and control method of robot

Similar Documents

Publication Publication Date Title
CN111652175A (en) Real-time surgical tool detection method applied to robot-assisted surgical video analysis
CN109166130B (en) Image processing method and image processing device
CN110399929B (en) Fundus image classification method, fundus image classification apparatus, and computer-readable storage medium
CN107818554B (en) Information processing apparatus and information processing method
US20180060652A1 (en) Unsupervised Deep Representation Learning for Fine-grained Body Part Recognition
CN113034495B (en) Spine image segmentation method, medium and electronic device
CN112200162B (en) Non-contact heart rate measuring method, system and device based on end-to-end network
CN113920571A (en) Micro-expression identification method and device based on multi-motion feature fusion
CN115578770A (en) Small sample facial expression recognition method and system based on self-supervision
Han et al. Learning generative models of tissue organization with supervised GANs
CN114549557A (en) Portrait segmentation network training method, device, equipment and medium
Yang et al. An efficient one-stage detector for real-time surgical tools detection in robot-assisted surgery
CN113237881B (en) Detection method and device for specific cells and pathological section detection system
CN114372962A (en) Laparoscopic surgery stage identification method and system based on double-particle time convolution
CN117137435B (en) Rehabilitation action recognition method and system based on multi-mode information fusion
Sokolova et al. Pixel-based iris and pupil segmentation in cataract surgery videos using mask R-CNN
CN114419401B (en) Method and device for detecting and identifying leucocytes, computer storage medium and electronic equipment
CN115116117A (en) Learning input data acquisition method based on multi-mode fusion network
CN115546491A (en) Fall alarm method, system, electronic equipment and storage medium
CN115601823A (en) Method for tracking and evaluating concentration degree of primary and secondary school students
CN112614092A (en) Spine detection method and device
CN112699879A (en) Attention-guided real-time minimally invasive surgical tool detection method and system
Ramesh et al. Hybrid U-Net and ADAM Algorithm for 3DCT Liver Segmentation
Liu et al. AFCANet: An adaptive feature concatenate attention network for multi-focus image fusion
CN116129298B (en) Thyroid video stream nodule recognition system based on space-time memory network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination