CN112396601B - Real-time neurosurgical instrument segmentation method based on endoscope images - Google Patents

Real-time neurosurgical instrument segmentation method based on endoscope images

Info

Publication number
CN112396601B
CN112396601B (application CN202011418220.8A)
Authority
CN
China
Prior art keywords
image
instrument
graph
label
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011418220.8A
Other languages
Chinese (zh)
Other versions
CN112396601A (en)
Inventor
黄凯
龚瑾
郭英
何海勇
郭思璐
宋日辉
梁宏立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Affiliated Hospital Sun Yat Sen University
Sun Yat Sen University
Original Assignee
Third Affiliated Hospital Sun Yat Sen University
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Affiliated Hospital Sun Yat Sen University and Sun Yat Sen University
Priority to CN202011418220.8A
Publication of CN112396601A
Application granted
Publication of CN112396601B
Legal status: Active
Anticipated expiration


Classifications

    • G06T 7/0012 Biomedical image inspection (G06T 7/00 Image analysis)
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 Combinations of networks
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G06T 7/136 Segmentation; Edge detection involving thresholding
    • G06V 10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06T 2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30004 Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the fields of medical image processing and image segmentation, and specifically relates to a real-time neurosurgical instrument segmentation method based on endoscope images. It provides a real-time instrument instance segmentation method for the endoscopic neurosurgery scene that can be applied clinically to assist neurosurgery in real time during an operation. The invention also provides a data augmentation method targeting noise such as light spots, specular reflections and blur, which enriches the training samples while improving the learning capability and adaptability of the model.

Description

Real-time neurosurgical instrument segmentation method based on endoscope images
Technical Field
The invention belongs to the fields of medical image processing and image segmentation, and specifically relates to a real-time neurosurgical instrument segmentation method based on endoscope images.
Background
Existing instance segmentation methods fall into two main categories: two-stage and one-stage. At present there is no real-time instance segmentation work for the neurosurgical endoscopic image scene.
Data augmentation is a common technique in deep learning. It is mainly used to enlarge and diversify the training data set so that the trained model generalizes better. Existing data augmentation mainly includes horizontal/vertical flipping, rotation, scaling, cropping, translation, contrast adjustment, color jittering, noise, and so on. However, conventional data augmentation methods are not designed for endoscopic surgery images, nor for scenes containing light spots, specular reflection and blur.
Chinese patent CN111724365A, published on 2020.09.29, discloses a method for detecting interventional devices in endovascular aneurysm repair surgery. It uses a trained fast attention network to generate a binary segmentation mask of the interventional device, then overlays the mask on the image under detection to obtain an image of the device. That invention is based on X-ray transmission images and improves the accuracy and speed of classifying instruments against the tissue background. However, it was not developed for the neurosurgical endoscopic scene and cannot handle the light spots, specular reflections and blur that commonly appear there. Moreover, most existing instrument segmentation techniques assist doctors in the preoperative examination stage and cannot provide real-time prompts during an operation.
At present, the instance segmentation algorithms with the best results are derived from object detection methods, but instance segmentation is considerably harder than object detection. The accuracy of a two-stage detector depends on feature localization, a sequential process that cannot be accelerated. A one-stage detector makes the localization parallel, but it still performs a large amount of computation after localization, which is likewise difficult to accelerate. Real-time instance segmentation has therefore long been difficult to achieve.
Disclosure of Invention
The present invention overcomes at least one of the above drawbacks of the prior art and provides a real-time neurosurgical instrument segmentation method for endoscopic images that adapts to the segmentation task of a surgical scene and achieves a high segmentation speed.
To solve the above technical problems, the invention adopts the following technical scheme: a real-time neurosurgical instrument segmentation method based on endoscopic images, comprising the steps of:
S1, collecting endoscopic surgery image data and labeling the images manually, the labels spatially segmenting and semantically classifying the foreground, namely the instruments, and the background; constructing a data set, setting cross-validation samples, and establishing an instrument instance segmentation database divided into a training set and a validation set;
S2, performing data augmentation on the data set, including flipping, rotation, image intensity adjustment, light spot/Gaussian noise addition and image mixing, so that the number of samples of the data set is increased and the samples are enriched;
S3, constructing a network model comprising a feature backbone network, a feature pyramid network, a prototype prediction branch and a mask coefficient prediction branch; the input is a two-dimensional image and the output is the prediction result for the image, consisting of a set of object detection bounding boxes, masks and corresponding categories;
S4, using the training data set as training samples, training the network model constructed in step S3 with a back-propagation strategy and minimizing a loss function to obtain optimized network weights;
S5, testing the model: testing the trained network model with the validation data samples, feeding the validation images into the network model to obtain prediction results, comparing the predictions with the labels, and judging whether the network generalizes well.
Further, in step S2, when selecting a specific data augmentation mode, the augmentation modes of picture flipping, picture rotation, image intensity adjustment and light spot/Gaussian noise addition are each selected for a given picture by a randomly generated probability.
Further, the random probability scheme is as follows: first, a picture rotation probability, a picture flipping probability, an image intensity adjustment probability and a light spot/Gaussian noise addition probability are set separately; then a floating-point random number between 0 and 1 is generated, and the corresponding augmentation mode is applied to the current picture when the random number is greater than the preset threshold probability.
Further, the light spot/Gaussian noise addition is as follows: to counteract the influence of light spots, elliptical light spots are added to the original image by image processing; the spots have random sizes and random positions in the image, so that the network learns to treat the spots as noise rather than as background or foreground.
Further, the elliptical light spots are added as follows: an integer less than 8 is randomly generated as the number of light spots, elliptical spots are drawn on an image of the same size as the original image, and that image and the original image are added.
Further, image mixing mode one comprises the following steps:
A1. selecting an image a and an image b, wherein image b contains an instrument on which tissue texture is reflected; extracting the number of label colors of images a and b, a single picture's label using multiple colors to distinguish different instruments;
A2. cutting out the reflective instrument in image b, obtained by setting the background of the reflective instrument image to black (0, 0, 0);
A3. overlaying image a with the black-background instrument image obtained in step A2, i.e. adding the two images pixel-wise to obtain a new image c;
A4. overlaying the instrument label of image b onto the corresponding position of the label of image a, and renumbering the colors of the labels of the new instruments according to the color count to obtain the label of image c.
Further, image mixing mode two comprises the following steps:
B1. selecting an image a and an image b, and extracting the number of label colors of images a and b;
B2. covering the instrument of image a by rotating the image and replacing the instrument region with the rotated image; where rotation cannot cover all instruments, covering the remaining instrument regions with equally sized nearby areas, i.e. by translation;
B3. cutting out the instrument in image b and setting the background of image b to black (0, 0, 0);
B4. adding the black-background instrument image from image b to image a with its instrument covered, i.e. adding corresponding pixels; during the addition the instrument pixels of image b are multiplied by a coefficient transmittance and the corresponding pixels of image a by 1 - transmittance; the sum is a new image c whose instrument region has a certain transparency, i.e. part of the background shows through on the instrument, simulating reflection;
B5. generating a label for image c, which, since the instrument of image a has been covered by background, is simply the instrument label of image b.
Further, the network model is divided into two parallel tasks: a. prototype generation: generating a series of prototype masks that have the same size as the original image and do not depend on any single instance; b. mask coefficients: predicting for each instance a series of mask coefficients that encode the instance's representation in the prototype mask space; the prototype masks are then linearly combined with the corresponding predicted coefficients and cropped with the predicted bounding boxes to obtain the instance segmentation result of the whole image.
The present invention also provides an electronic device comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method as described above when executing the computer program.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method described above.
Compared with the prior art, the beneficial effects are:
1. The method is fast: it achieves real-time instance segmentation of neurosurgical endoscopic images, distinguishing the instruments in the field of view from other tissues as well as distinguishing the individual instrument entities from one another;
2. A data augmentation scheme is designed for the neurosurgical scene; by changing illumination brightness, adding light spots, adding random noise and simulating the reflection of tissue texture on instruments, the model's performance on data containing light spots and reflections is improved.
Drawings
FIG. 1 is a schematic flow diagram of the overall method of the present invention.
FIG. 2 is a flow chart of the random probability selection scheme in an embodiment of the present invention.
FIG. 3 is a flow chart of image mixing mode one in an embodiment of the present invention.
FIG. 4 is a flow chart of image mixing mode two in an embodiment of the present invention.
FIG. 5 is a schematic diagram of the network model structure in an embodiment of the present invention.
FIG. 6 is a schematic diagram of the prototype generation network in an embodiment of the present invention.
FIG. 7 is a flow chart of mask coefficient processing in an embodiment of the present invention.
Detailed Description
The drawings are for illustration purposes only and are not to be construed as limiting the invention; for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the invention.
As shown in fig. 1, a real-time neurosurgical instrument segmentation method based on endoscopic images comprises the following steps:
Step 1, collect endoscopic surgery image data and label the images manually, the labels spatially segmenting and semantically classifying the foreground, namely the instruments, and the background; construct a data set, set cross-validation samples, and build an instrument instance segmentation database divided into a training set and a validation set.
Step 2, perform data augmentation on the data set, including flipping, rotation, image intensity adjustment, light spot/Gaussian noise addition and image mixing, so that the number of samples is increased and the samples are enriched.
in this embodiment, four data augmentation methods are mainly used:
1. traditional image transformation: including flipping, rotating, adjusting image intensity, adding gaussian noise.
2. Randomly adding spot noise: in order to eliminate the influence of the light spots, some elliptical light spots are added to the original image through image processing. The spots are of random size and are distributed at random positions in the image. Thereby causing the network to learn the spots as noise rather than background or foreground.
3. Image mixing: the two pictures are combined into a new example in two different ways so as to artificially simulate the reflection of the tissue texture on the instrument.
For the first two augmentation modes, a mode of randomly generating probability is adopted to select which augmentation mode is used for each picture. And respectively setting picture rotation probability, picture turning probability, picture brightness change probability, Gaussian noise addition and light spot probability. And generating a floating point random number of 0-1, and using a corresponding augmentation mode for the current picture when the random number is greater than a preset threshold probability. At least one data augmentation mode is in effect by default. The augmentation may only be performed at most once for each type of data. The flow chart is shown in fig. 2.
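As an illustration, the following Python sketch implements this selection flow. The threshold values, the intensity range and the Gaussian noise strength are assumptions for the sketch; the patent does not disclose them.

```python
import random

import cv2
import numpy as np

# Hypothetical per-augmentation thresholds; the patent sets one probability
# per augmentation but does not disclose the actual values.
THRESH = {"flip": 0.5, "rotate": 0.5, "intensity": 0.5, "gauss": 0.5}

def random_augment(img, label):
    """Apply each augmentation at most once when a 0-1 draw exceeds its
    threshold; redraw until at least one augmentation has fired."""
    while True:
        out, lab, applied = img.copy(), label.copy(), False
        if random.random() > THRESH["flip"]:
            code = random.choice([-1, 0, 1])   # both axes / x-axis / y-axis
            out, lab = cv2.flip(out, code), cv2.flip(lab, code)
            applied = True
        if random.random() > THRESH["rotate"]:
            k = random.choice([1, 2, 3])       # rotate by k * 90 degrees
            out, lab = np.rot90(out, k).copy(), np.rot90(lab, k).copy()
            applied = True
        if random.random() > THRESH["intensity"]:
            out = cv2.convertScaleAbs(out, alpha=random.uniform(0.7, 1.3))
            applied = True
        if random.random() > THRESH["gauss"]:
            noise = np.random.normal(0.0, 10.0, out.shape)
            out = np.clip(out.astype(np.float32) + noise, 0, 255).astype(np.uint8)
            applied = True
        if applied:                            # at least one mode in effect
            return out, lab
```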
The remaining three augmentation modes are described in detail below.
Adding light spot noise: the spots are obtained by adding the original image to an equally sized all-black image containing only random elliptical light spots, i.e. by adding the pixel values of corresponding points. The RGB value of an elliptical spot is (150, 150, 150). Each spot is generated with OpenCV's ellipse function, whose parameters include the picture matrix, center point, major and minor axes, rotation angle, ellipse start and end angles (0-360), and edge thickness. The number of spots in a picture is a random integer up to a maximum (8 by default). The center position and the lengths of the major and minor axes are random values within a certain proportion of the picture size. This randomization can be implemented with the random library, as sketched below.
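A minimal sketch of the spot generator, assuming the axis-length proportions below; only the RGB value (150, 150, 150) and the default maximum of 8 spots come from the text.

```python
import random

import cv2
import numpy as np

SPOT_COLOR = (150, 150, 150)   # RGB value given in the text
MAX_SPOTS = 8                  # default maximum spot count given in the text

def add_spot_noise(img):
    """Draw filled random ellipses on an all-black canvas of the same size
    as img, then add the two images pixel-wise (saturating at 255)."""
    h, w = img.shape[:2]
    canvas = np.zeros_like(img)
    for _ in range(random.randint(1, MAX_SPOTS)):
        center = (random.randint(0, w - 1), random.randint(0, h - 1))
        # Axis lengths as a proportion of the picture size (assumed range).
        axes = (random.randint(w // 40, w // 8), random.randint(h // 40, h // 8))
        angle = random.uniform(0.0, 360.0)
        # thickness -1 fills the ellipse; the text also allows an edge thickness
        cv2.ellipse(canvas, center, axes, angle, 0, 360, SPOT_COLOR, -1)
    return cv2.add(img, canvas)
```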
Image mixing mode one:
As shown in fig. 3, it comprises the following steps (a code sketch follows the list):
A1. select an image a and an image b, where image b contains an instrument on which tissue texture is reflected; extract the number of label colors of images a and b, a single picture's label using multiple colors to distinguish different instruments;
A2. cut out the reflective instrument in image b by setting the background of the reflective instrument image to black (0, 0, 0);
A3. overlay image a with the black-background instrument image obtained in step A2, i.e. add the two images pixel-wise to obtain a new image c;
A4. overlay the instrument label of image b onto the corresponding position of the label of image a, and renumber the colors of the labels of the new instruments according to the color count to obtain the label of image c.
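A sketch of mode one under simplifying assumptions: instrument_mask_b stands for the boolean instrument region extracted from image b's label, and the color renumbering of step A4 is reduced to copying b's label pixels over a's.

```python
import numpy as np

def blend_mode_one(img_a, label_a, img_b, label_b, instrument_mask_b):
    """Paste the reflective instrument of image b onto image a (steps A2-A3)
    and merge the labels (step A4, without color renumbering)."""
    m = instrument_mask_b[..., None]          # H x W x 1 boolean mask
    cut = np.where(m, img_b, 0)               # A2: black background (0, 0, 0)
    # A3: the cut-out is black outside the instrument, so covering image a
    # amounts to zeroing a under the mask and adding the two images.
    img_c = np.where(m, 0, img_a) + cut
    label_c = np.where(m, label_b, label_a)   # A4: overlay b's label
    return img_c, label_c
```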
Image mixing mode two:
As shown in fig. 4, it comprises the following steps (a code sketch follows the list):
B1. select an image a and an image b, and extract the number of label colors of images a and b;
B2. cover the instrument of image a by rotating the image and replacing the instrument region with the rotated image; where rotation cannot cover all instruments, cover the remaining instrument regions with equally sized nearby areas, i.e. by translation;
B3. cut out the instrument in image b and set the background of image b to black (0, 0, 0);
B4. add the black-background instrument image from image b to image a with its instrument covered, i.e. add corresponding pixels; during the addition the instrument pixels of image b are multiplied by a coefficient transmittance and the corresponding pixels of image a by 1 - transmittance; the sum is a new image c whose instrument region has a certain transparency, i.e. part of the background shows through on the instrument, simulating reflection;
B5. generate a label for image c, which, since the instrument of image a has been covered by background, is simply the instrument label of image b.
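A sketch of mode two under the same conventions; the 180-degree rotation stands in for step B2 (the translation fallback is omitted) and the default transmittance value is an assumption.

```python
import numpy as np

def blend_mode_two(img_a, img_b, label_b, inst_mask_a, inst_mask_b,
                   transmittance=0.4):
    """Hide image a's instrument, then alpha-blend image b's instrument over
    it so the background shows through, simulating reflection (steps B2-B5)."""
    # B2: cover a's instrument with the 180-degree-rotated frame
    # (the translation fallback for uncovered instrument pixels is omitted).
    covered = np.where(inst_mask_a[..., None], np.rot90(img_a, 2), img_a)
    # B3: cut b's instrument out onto a black background.
    cut = np.where(inst_mask_b[..., None], img_b, 0)
    # B4: inside b's instrument region mix transmittance * b with
    # (1 - transmittance) * a; elsewhere keep the covered image a.
    m = inst_mask_b[..., None].astype(np.float32)
    img_c = covered * (1.0 - m * transmittance) + cut * (m * transmittance)
    img_c = np.clip(img_c, 0, 255).astype(np.uint8)
    # B5: a's instrument is gone, so the new label is b's instrument label.
    return img_c, label_b.copy()
```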
Step 3, construct a network model comprising a feature backbone network, a feature pyramid network, a prototype prediction branch and a mask coefficient prediction branch; the input is a two-dimensional image and the output is the prediction result for the image, consisting of a set of object detection bounding boxes, masks and corresponding categories.
As shown in fig. 5, the network model is divided into two parallel tasks: a. prototype generation: generating a series of prototype masks that have the same size as the original image and do not depend on any single instance; b. mask coefficients: predicting for each instance a series of mask coefficients that encode the instance's representation in the prototype mask space. The prototype masks are then linearly combined with the corresponding predicted coefficients and cropped with the predicted bounding boxes to obtain the instance segmentation result of the whole image.
Prototype generation: the prototype branch predicts k prototype masks for each image and is implemented as a fully convolutional network (FCN). An FCN classifies the image at the pixel level: a deconvolution layer upsamples the feature map of the last convolutional layer back to the size of the input image, so that a prediction is produced for every pixel. In the prototype branch, the last layer of the FCN has k channels, one per prototype, and the branch is attached to a backbone feature layer. The prototype network is shown in fig. 6; feature sizes and channel counts are given for a 550 × 550 input image, arrows denote 3 × 3 convolutional layers, and the branch ends with an upsampling followed by a 1 × 1 convolutional layer. Taking the prototype branch from deeper backbone features produces more robust masks, and higher-resolution prototypes not only yield higher-quality masks but also work better on small targets. The FPN is therefore used, because its largest feature layer is also its deepest; that layer is then upsampled to one quarter of the input image size to improve detection performance on small objects. A sketch follows.
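A PyTorch sketch of such a prototype branch; the channel widths and the prototype count k are assumptions, while the overall shape (3 × 3 convolutions, upsampling, 1 × 1 convolution, k output channels) follows the description above.

```python
import torch.nn as nn
import torch.nn.functional as F

class ProtoNet(nn.Module):
    """Prototype branch: 3x3 convolutions on the deepest/largest FPN feature
    map, bilinear upsampling, then a 1x1 convolution emitting k prototypes."""
    def __init__(self, in_channels=256, k=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.proto = nn.Conv2d(256, k, kernel_size=1)

    def forward(self, feat):
        x = self.body(feat)
        # Upsample toward one quarter of the input image size to help with
        # small objects (x2 from an FPN feature at 1/8 scale).
        x = F.interpolate(x, scale_factor=2, mode="bilinear",
                          align_corners=False)
        return F.relu(self.proto(x))   # k nonnegative prototype masks
```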
Mask coefficients: a typical anchor-based object detector has two heads: one predicting confidence scores for c classes and one predicting the 4 coordinates of the bounding box. To predict mask coefficients, a third head is added that predicts k mask coefficients, one per prototype, so each anchor predicts 4 + c + k numbers. So that prototypes can be subtracted from, and not only added to, the final mask, a tanh nonlinearity is applied to the k mask coefficients, which yields more stable outputs than leaving them unprocessed. The process is illustrated in fig. 7 and sketched below.
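A matching sketch of the prediction head; the anchor count per location and the channel width are assumptions.

```python
import torch
import torch.nn as nn

class PredictionHead(nn.Module):
    """Per anchor, predict c class confidences, 4 box coordinates and k mask
    coefficients (4 + c + k numbers); tanh lets coefficients be negative so
    prototypes can be subtracted from the final mask as well as added."""
    def __init__(self, in_channels=256, anchors=3, c=2, k=32):
        super().__init__()
        self.c, self.k = c, k
        self.cls = nn.Conv2d(in_channels, anchors * c, 3, padding=1)
        self.box = nn.Conv2d(in_channels, anchors * 4, 3, padding=1)
        self.cof = nn.Conv2d(in_channels, anchors * k, 3, padding=1)

    def forward(self, x):
        n = x.size(0)
        conf = self.cls(x).permute(0, 2, 3, 1).reshape(n, -1, self.c)
        loc = self.box(x).permute(0, 2, 3, 1).reshape(n, -1, 4)
        coef = torch.tanh(self.cof(x).permute(0, 2, 3, 1).reshape(n, -1, self.k))
        return conf, loc, coef
```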
Mask assembly: to produce the masks of the instances, the prototype branch and the mask coefficient branch are combined by linear combination, and a sigmoid nonlinearity is applied to the result to obtain the final masks. This can be implemented efficiently as a single matrix multiplication:
M = σ(PC^T)
where P is the h × w × k matrix of prototype masks and C is the n × k matrix of mask coefficients for the n instances that survive NMS and score thresholding.
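The assembly step can be written as a single matrix multiplication, for example:

```python
import torch

def assemble_masks(protos, coeffs):
    """M = sigmoid(P C^T).  protos: (h, w, k) prototype masks; coeffs:
    (n, k) tanh mask coefficients of the n instances surviving NMS and
    score thresholding; returns (h, w, n) instance masks."""
    return torch.sigmoid(torch.einsum("hwk,nk->hwn", protos, coeffs))
```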
Mask cropping: to preserve small objects in the prototypes, the final mask is cropped with the predicted bounding box; during training the ground-truth bounding box is used instead, and the mask loss L_mask is divided by the area of the ground-truth bounding box.
In this embodiment, the loss function consists of: 1. a classification loss L_cls; 2. a bounding-box regression loss L_box; 3. a mask loss L_mask = BCE(M, M_gt), where M is the predicted mask, M_gt is the ground-truth mask, and BCE is the pixel-wise binary cross-entropy.
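A sketch of the cropped, area-normalized mask loss of the two preceding paragraphs, assuming integer pixel box coordinates and sigmoid-activated predicted masks:

```python
import torch
import torch.nn.functional as F

def mask_loss(pred_masks, gt_masks, gt_boxes):
    """pred_masks, gt_masks: (n, h, w) tensors, predictions already in
    [0, 1]; gt_boxes: list of integer (x1, y1, x2, y2). BCE is computed
    inside the ground-truth box and divided by that box's area."""
    losses = []
    for m, g, (x1, y1, x2, y2) in zip(pred_masks, gt_masks, gt_boxes):
        area = max((x2 - x1) * (y2 - y1), 1)
        bce = F.binary_cross_entropy(m[y1:y2, x1:x2],
                                     g[y1:y2, x1:x2].float(),
                                     reduction="sum")
        losses.append(bce / area)
    return torch.stack(losses).mean()
```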
Step 4, using the training data set as training samples, train the network model constructed in step 3 with a back-propagation strategy, minimizing the loss function to obtain optimized network weights.
Step 5, model testing: test the trained network model with the validation data samples, feed the validation images into the network to obtain prediction results, compare the predictions with the labels, and judge whether the network generalizes well.
With this method, instrument instance segmentation runs in real time: the average frame rate is 66 fps with ResNet50 as the backbone network and 49.79 fps with ResNet101, fully achieving real-time instance segmentation.
The method reaches an accuracy of 89.17% at a frame rate of 66 fps, surpassing the current state-of-the-art real-time semantic segmentation methods.
Although embodiments of the present invention have been shown and described above, they are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may make variations, modifications, substitutions and alterations to the above embodiments within the scope of the invention.
It should be understood that the above embodiments are merely examples given to clearly illustrate the invention and do not limit its embodiments. Other variations and modifications will be apparent to those skilled in the art in light of the above description; it is neither necessary nor possible to enumerate all embodiments exhaustively. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the protection scope of the claims.

Claims (8)

1. A real-time neurosurgical instrument segmentation method based on endoscopic images, comprising the steps of:
S1, collecting endoscopic surgery image data and labeling the images manually, the labels spatially segmenting and semantically classifying the foreground, namely the instruments, and the background; constructing a data set, setting cross-validation samples, and establishing an instrument instance segmentation database divided into a training set and a validation set;
S2, performing data augmentation on the data set, including flipping, rotation, image intensity adjustment, light spot/Gaussian noise addition and image mixing, so that the number of samples of the data set is increased and the samples are enriched;
S3, constructing a network model comprising a feature backbone network, a feature pyramid network, a prototype prediction branch and a mask coefficient prediction branch, wherein the input is a two-dimensional image and the output is the prediction result for the image, consisting of a set of object detection bounding boxes, masks and corresponding categories;
S4, using the training data set as training samples, training the network model constructed in step S3 with a back-propagation strategy and minimizing a loss function to obtain optimized network weights;
S5, testing the trained network model with the validation data samples: feeding the validation images into the network model to obtain prediction results, comparing the predictions with the labels, and judging whether the network generalizes well; wherein the image mixing specifically comprises the following steps:
A1. selecting an image a and an image b, wherein image b contains an instrument on which tissue texture is reflected; extracting the number of label colors of images a and b, a single picture's label using multiple colors to distinguish different instruments;
A2. cutting out the reflective instrument in image b, obtained by setting the background of the reflective instrument image to black (0, 0, 0);
A3. overlaying image a with the black-background instrument image obtained in step A2, i.e. adding the two images pixel-wise to obtain a new image c;
A4. overlaying the instrument label of image b onto the corresponding position of the label of image a, and renumbering the colors of the labels of the new instruments according to the color count to obtain the label of image c;
or, the image mixing specifically comprises the following steps:
B1. selecting an image a and an image b, and extracting the number of label colors of images a and b;
B2. covering the instrument of image a by rotating the image and replacing the instrument region with the rotated image; where rotation cannot cover all instruments, covering the remaining instrument regions with equally sized nearby areas, i.e. by translation;
B3. cutting out the instrument in image b and setting the background of image b to black (0, 0, 0);
B4. adding the black-background instrument image from image b to image a with its instrument covered, i.e. adding corresponding pixels, wherein during the addition the instrument pixels of image b are multiplied by a coefficient transmittance and the corresponding pixels of image a by 1 - transmittance; the sum is a new image c whose instrument region has a certain transparency, i.e. part of the background shows through on the instrument, simulating reflection;
B5. generating a label for image c, which, since the instrument of image a has been covered by background, is the instrument label of image b.
2. The real-time neurosurgical instrument segmentation method based on endoscopic images as claimed in claim 1, wherein in step S2, when selecting a specific data augmentation mode, the augmentation modes of picture flipping, picture rotation, image intensity adjustment and light spot/Gaussian noise addition are each selected for a given picture by a randomly generated probability.
3. The real-time neurosurgical instrument segmentation method based on endoscopic images as claimed in claim 2, wherein the randomly generated probability specifically comprises: first setting a picture rotation probability, a picture flipping probability, an image intensity adjustment probability and a light spot/Gaussian noise addition probability; then generating a floating-point random number between 0 and 1, and applying the corresponding augmentation mode to the current picture when the random number is greater than the preset threshold probability.
4. The real-time neurosurgical instrument segmentation method based on endoscopic images as claimed in claim 1, wherein the light spot/Gaussian noise addition specifically comprises: to counteract the influence of light spots, adding elliptical light spots to the original image by image processing, the spots having random sizes and random positions in the image, so that the network learns to treat the spots as noise rather than background or foreground.
5. The real-time neurosurgical instrument segmentation method based on endoscopic images as claimed in claim 4, wherein the elliptical light spots are added as follows: randomly generating an integer less than 8 as the number of light spots, drawing elliptical spots on an image of the same size as the original image, and adding that image and the original image.
6. The real-time neurosurgical instrument segmentation method based on endoscopic images as claimed in any one of claims 1 to 5, wherein the network model is divided into two parallel tasks comprising: a. prototype generation: generating a series of prototype masks that have the same size as the original image and do not depend on any single instance; b. mask coefficients: predicting for each instance a series of mask coefficients that encode the instance's representation in the prototype mask space; and then linearly combining the prototype masks with the corresponding predicted coefficients and cropping with the predicted bounding boxes to obtain the instance segmentation result of the whole image.
7. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method according to any one of claims 1 to 6 when executing the computer program.
8. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
Application CN202011418220.8A, priority and filing date 2020-12-07: Real-time neurosurgical instrument segmentation method based on endoscope images. Granted as CN112396601B (Active).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011418220.8A (filed 2020-12-07; granted as CN112396601B): Real-time neurosurgical instrument segmentation method based on endoscope images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011418220.8A (filed 2020-12-07; granted as CN112396601B): Real-time neurosurgical instrument segmentation method based on endoscope images

Publications (2)

Publication Number Publication Date
CN112396601A CN112396601A (en) 2021-02-23
CN112396601B (en) 2022-07-29

Family

ID=74605173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011418220.8A (Active; granted as CN112396601B): Real-time neurosurgical instrument segmentation method based on endoscope images

Country Status (1)

Country Link
CN (1) CN112396601B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107624193A (en) * 2015-04-29 2018-01-23 西门子公司 Method and system for semantic segmentation in laparoscopic and endoscopic 2D/2.5D image data
CN108510493A (en) * 2018-04-09 2018-09-07 深圳大学 Boundary localization method, storage medium and terminal for a target object in medical images

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215079B (en) * 2018-07-17 2021-01-15 艾瑞迈迪医疗科技(北京)有限公司 Image processing method, surgical navigation device, electronic device, and storage medium
WO2020147957A1 (en) * 2019-01-17 2020-07-23 Toyota Motor Europe System and method for generating a mask for object instances in an image
AU2020219858A1 (en) * 2019-02-08 2021-09-30 The Board Of Trustees Of The University Of Illinois Image-guided surgery system
CN109934831A (en) * 2019-03-18 2019-06-25 安徽紫薇帝星数字科技有限公司 Real-time navigation method for tumor surgery based on indocyanine green fluorescence imaging
CN110781924B (en) * 2019-09-29 2023-02-14 哈尔滨工程大学 Side-scan sonar image feature extraction method based on full convolution neural network
CN111597920B (en) * 2020-04-27 2022-11-15 东南大学 Fully convolutional single-stage human instance segmentation method for natural scenes


Also Published As

Publication number Publication date
CN112396601A (en) 2021-02-23

Similar Documents

Publication Publication Date Title
EP3553742B1 (en) Method and device for identifying pathological picture
CN111292264B (en) Image high dynamic range reconstruction method based on deep learning
Startsev et al. 360-aware saliency estimation with conventional image saliency predictors
EP3547179A1 (en) Method and system for adjusting color of image
US10600171B2 (en) Image-blending via alignment or photometric adjustments computed by a neural network
CN110490896B (en) Video frame image processing method and device
CN109712165B (en) Similar foreground image set segmentation method based on convolutional neural network
CN111767760A (en) Living body detection method and apparatus, electronic device, and storage medium
EP3675034A1 (en) Image realism predictor
CN110648331B (en) Detection method for medical image segmentation, medical image segmentation method and device
US20230377097A1 (en) Laparoscopic image smoke removal method based on generative adversarial network
CN114170227B (en) Product surface defect detection method, device, equipment and storage medium
CN113989407B (en) Training method and system for limb part recognition model in CT image
Han et al. Perceptual CT loss: implementing CT image specific perceptual loss for CNN-based low-dose CT denoiser
CN112396601B (en) Real-time neurosurgical instrument segmentation method based on endoscope images
CN112818774A (en) Living body detection method and device
CN110728630A (en) Internet image processing method based on augmented reality and augmented reality glasses
EP4283566A2 (en) Single image 3d photography with soft-layering and depth-aware inpainting
CN114972611A (en) Depth texture synthesis method based on guide matching loss and related equipment
Yin et al. Visual Attention and ODE-inspired Fusion Network for image dehazing
JP7349005B1 (en) Program, information processing method, information processing device, and learning model generation method
Park et al. Improving Instance Segmentation using Synthetic Data with Artificial Distractors
CN116797611B (en) Polyp focus segmentation method, device and storage medium
Swingler A Suite of Incremental Image Degradation Operators for Testing Image Classification Algorithms.
CN117011407A (en) Image generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant