WO2022057078A1 - Real-time colonoscopy image segmentation method and device based on ensemble and knowledge distillation


Info

Publication number: WO2022057078A1
Application number: PCT/CN2020/130114
Authority: WIPO (PCT)
Prior art keywords: teacher, model, training, image, training image
Other languages: French (fr), Chinese (zh)
Inventors: 李坚强 (Li Jianqiang), 陈杰 (Chen Jie), 黄志超 (Huang Zhichao)
Original assignee: 深圳大学 (Shenzhen University)
Application filed by Shenzhen University
Publication of WO2022057078A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 ICT specially adapted for the handling or processing of medical images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30028 Colon; Small intestine

Definitions

  • The invention relates to the field of image segmentation, and in particular to a real-time colonoscopy image segmentation method and device based on integrated knowledge distillation.
  • The technical problem to be solved by the present invention is that, in the prior art, the data sets of different hospitals cannot be pooled to train an automatic image segmentation model for colonoscopy. To this end, the invention provides a real-time colonoscopy image segmentation method and storage medium based on integrated knowledge distillation.
  • an embodiment of the present invention provides a real-time colonoscopy image segmentation method based on integrated knowledge distillation, wherein the method includes:
  • the training images are screenshots of colonoscopy images used to train the teacher model and the student model;
  • the training images are divided into multiple training image sets, and the training images of the same training image set are from the same data set;
  • the second real label is used to reflect the real classification situation corresponding to the pixels on the training image under the second preset classification condition;
  • the teacher label is used to reflect the classification situation of the training image in the trained teacher model;
  • Real-time colonoscopy images are input to the trained student model to generate real-time colonoscopy image segmentation maps.
  • the acquired training images are screenshots of colonoscopy images used to train the teacher model and the student model, including:
  • The screenshot of the colonoscopy image is compressed to obtain the training image; the height, width, and number of channels of the training image are all fixed.
  • the teacher model includes a first down-sampling encoder and a first up-sampling decoder; the inputting the training image into the teacher model to obtain the first segmentation map includes:
  • the first feature map includes feature information of the training image
  • The first segmentation map includes the first standard probability and the first abnormal probability corresponding to each pixel in the training image; the first standard probability is the probability that the pixel is standard under the first preset classification condition, and the first abnormal probability is the probability that the pixel is abnormal under the first preset classification condition; the first abnormal probability and the first standard probability sum to 1.
  • The parameters of the teacher model are modified according to the first segmentation map and the first real label, and the step of inputting the training image into the teacher model to obtain the first segmentation map is repeated until the preset training conditions of the teacher model are met, so as to obtain the trained teacher model, including:
  • the step of inputting the training image into the teacher model to obtain the first segmentation map is continued until the preset training conditions of the teacher model are satisfied, so as to obtain the trained teacher model.
  • the student model includes a second down-sampling encoder and a second up-sampling decoder; the inputting the training image into the student model to obtain the second segmentation map includes:
  • the second feature map is output after feature extraction is performed on the training image according to the second down-sampling encoder; the second feature map includes feature information of the training image;
  • the second segmentation map includes the second standard probability and the second abnormal probability corresponding to the pixels in the training image;
  • the second standard probability is the probability that the pixel belongs to the standard under the second preset classification condition,
  • the second abnormality probability is the probability that the pixel belongs to abnormality under the second preset classification condition;
  • the number of categories of the second preset classification condition is more than the number of categories of the first preset classification condition.
  • The parameters of the student model are modified according to the second segmentation map, the teacher label and the second real label, and the step of generating a second segmentation map from the training image is repeated until the preset training conditions of the student model are met, so as to obtain a trained student model, including:
  • Calculating the second loss value according to the second segmentation map, the teacher label and the second real label includes:
  • the teacher labels are obtained by adjusting the first segmentation maps output by all trained teacher models according to the probability total value.
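The adjustment "according to the probability total value" is not spelled out above. As one illustrative reading (an assumption, not the patent's exact rule), the k binary teacher outputs can be merged into a (k+1)-channel soft label per pixel by averaging the normal channels, keeping each teacher's abnormal channel, and renormalizing so the channel total at every pixel is 1:

```python
import numpy as np

def build_teacher_label(teacher_maps):
    """Combine k binary teacher segmentation maps into one (k+1)-channel
    soft teacher label per pixel. Each input map has shape (2, W, H):
    channel 0 = normal probability, channel 1 = abnormal probability.
    The combination rule here is an illustrative assumption: normal
    probabilities are averaged, each teacher's abnormal channel is kept,
    and the result is renormalized so the per-pixel total is 1."""
    maps = np.stack(teacher_maps)          # (k, 2, W, H)
    normal = maps[:, 0].mean(axis=0)       # (W, H), averaged normal probability
    abnormal = maps[:, 1]                  # (k, W, H), one channel per lesion type
    label = np.concatenate([normal[None], abnormal], axis=0)  # (k+1, W, H)
    return label / label.sum(axis=0, keepdims=True)           # renormalize
```

With two teachers predicting (0.1, 0.9) and (0.2, 0.8) at a pixel, this yields a 3-channel label whose channels sum to 1.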
  • an embodiment of the present invention further provides an apparatus for real-time colonoscopy image segmentation based on integrated knowledge distillation, wherein the apparatus includes:
  • an image acquisition module, which is used to acquire training images
  • the teacher model unit is used to obtain a first segmentation map according to the training image
  • a first parameter correction module which is used to correct the parameters of the teacher model according to the first segmentation map and the first real label
  • the student model unit is used to obtain a second segmentation map according to the training image
  • the second parameter correction module is used to modify the parameters of the student model according to the second segmentation map, the teacher label and the second real label;
  • the teacher model unit also includes:
  • a first down-sampling encoder module, which is used to perform feature extraction on the training image to obtain a first feature map
  • the first upsampling decoder module is configured to parse the first feature map to obtain the first segmentation map
  • the student model unit also includes:
  • the second down-sampling encoder module is configured to perform feature extraction on the training image to obtain a second feature map
  • a second up-sampling decoder module, which is configured to parse the second feature map to obtain the second segmentation map.
  • An embodiment of the present invention also provides a terminal, which includes a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the execution including performing any of the methods described above.
  • An embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the steps of any one of the above-mentioned real-time colonoscopy image segmentation methods based on integrated knowledge distillation are performed.
  • The present invention obtains a plurality of training images; the training images are divided into a plurality of training image sets, and the training images of the same training image set come from the same data set. Different teacher models obtain the first segmentation map according to different training image sets; the trained teacher models are then used to jointly distill a student model.
  • the training image is a screenshot of a colonoscopy image, and the trained student model can generate a real-time colonoscopy image segmentation map according to the real-time colonoscopy image. This solves the problem that the data sets between different hospitals are discontinuous and cannot be pooled together to train an automatic image segmentation model for colonoscopy.
  • FIG. 1 is a first schematic flowchart of a real-time colonoscopy image segmentation method based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 2 is a second schematic flowchart of a real-time colonoscopy image segmentation method based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 3 is a third schematic flowchart of a real-time colonoscopy image segmentation method based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a connection relationship between a down-sampling encoder and an up-sampling decoder provided by an embodiment of the present invention.
  • FIG. 5 is a fourth schematic flowchart of a real-time colonoscopy image segmentation method based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 6 is a fifth schematic flowchart of a real-time colonoscopy image segmentation method based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 7 is a sixth schematic flowchart of a real-time colonoscopy image segmentation method based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 8 is a seventh schematic flowchart of a real-time colonoscopy image segmentation method based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a hardware operating environment involved in the solution of an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of the internal structure of a real-time colonoscopy image segmentation device based on integrated knowledge distillation provided by an embodiment of the present invention.
  • FIG. 11 is a prediction effect diagram of the student model provided by the solution according to the embodiment of the present invention.
  • Assisted surgery is a system in which robots assist doctors in performing surgical operations; its main purpose is to overcome the limited field of view of existing minimally invasive surgery.
  • Assisted surgery is especially important during colonoscopy.
  • Colonoscopy is one of the important techniques in intestinal surgery. However, during colonoscopy many colon lesions share characteristics with normal mucosa, such as a similar color or a very flat shape, and such deceptive lesions are often difficult to detect without special methods. Therefore, automatic image segmentation of real-time colonoscopy plays an important role in colonoscopy technology.
  • the present invention provides a real-time colonoscopy image segmentation method based on integrated knowledge distillation.
  • the real-time colonoscopy image segmentation method is a technology for assisting doctors in judging a patient's colon image.
  • the automatic detection model can perform image analysis on the input colon examination images and give prediction results, thereby giving doctors corresponding reference opinions.
  • the automatic detection model requires a lot of computing resources in the training process, so that information can be extracted from very large and highly redundant data sets.
  • Knowledge distillation is a model compression method. Its main idea is to train a small network model to imitate a pre-trained large network or an ensemble of networks.
  • In knowledge distillation, the teacher imparts knowledge to the student as follows: during the training of the student, a loss function targeting the probability distribution of the teacher's predicted results is added.
  • The present invention first trains multiple binary classification models on the data sets of different hospitals; these binary classification models can respectively detect polyps, ulcers, bleeding and Meckel's diverticulum, which are common findings in colonoscopy examinations. The multiple trained binary classification models are then used to jointly distill a multi-class classification model, so that the multi-class model can automatically detect polyps, ulcers, bleeding and Meckel's diverticulum.
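The joint distillation objective can be sketched as a per-pixel loss that combines the usual ground-truth term with a term targeting the teacher's predicted distribution. The weighting `alpha` and the use of KL divergence are assumptions for illustration; the text only states that a loss on the teacher's probability distribution is added:

```python
import numpy as np

def distillation_loss(student_probs, teacher_probs, onehot, alpha=0.5, eps=1e-12):
    """Illustrative knowledge-distillation loss (a sketch, not the patent's
    exact formula). All arrays have shape (C, W, H), with the C channel
    probabilities summing to 1 at each pixel. Cross-entropy to the one-hot
    ground-truth label is combined with KL divergence to the teacher's soft
    prediction; `alpha` is a hypothetical weighting."""
    ce = -(onehot * np.log(student_probs + eps)).sum(axis=0).mean()
    kl = (teacher_probs * (np.log(teacher_probs + eps)
                           - np.log(student_probs + eps))).sum(axis=0).mean()
    return (1 - alpha) * ce + alpha * kl
```

When the student exactly matches both the teacher and the ground truth, the loss vanishes; it grows as the student's distribution drifts from either target.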
  • the teacher model is a binary classification model
  • the student model is a multi-class classification model
  • The purpose of training the teacher model and the student model is to determine the optimal parameters of the teacher model and the student model, so as to achieve the best classification effect.
  • Different teacher models are trained through different training image sets to obtain teacher models with different classification conditions.
  • the training images of the same teacher model are all from the same hospital dataset, which solves the problem that the datasets between different hospitals are discontinuous and cannot be pooled together to train an automatic image segmentation model for colonoscopy.
  • The knowledge distillation method is used (that is, the knowledge contained in a trained model is distilled and extracted into another model): the knowledge contained in the trained teacher models is distilled and extracted into the student model, thereby effectively compressing the student model and reducing its size.
  • the real-time colonoscopy image segmentation method based on integrated knowledge distillation includes the following steps:
  • the training image is a screenshot of a colonoscopy image used for training the teacher model and the student model; the training image is divided into multiple training image sets, and the training images of the same training image set are from the same data set.
  • the first step in training a model is to obtain available training images, and then use the training images to train the teacher model and the student model.
  • the data in the same training image set comes from the same hospital, which solves the problem that the data sets between different hospitals are discontinuous and cannot be pooled together to train an automatic image segmentation model for colonoscopy.
  • step S100 as shown in FIG. 2 further includes the following steps:
  • The size of the input images of the teacher model and the student model must match the dimensions of the network parameters; to avoid dynamic changes of the network in the model, the size of the input images of the teacher model and the student model needs to be fixed.
  • the size of the input image is mainly related to the height, width and number of channels of the image, so the size of the input image is fixed, that is, the height, width and number of channels of the input image are all constant.
  • After the screenshot of the colonoscopy image is obtained, it is compressed so that its height, width, and number of channels conform to the input image requirements of the teacher model and the student model.
  • the compressed colonoscopy image screenshot becomes the training image, which can be directly input into the teacher model and the student model for training the teacher model and the student model.
  • the training images can be divided into multiple training image sets, and the data of the same training image set are all from the same hospital data set.
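The fixed-size requirement above can be sketched as a simple nearest-neighbour resize. The 224x224 target below is a hypothetical choice, since the text only requires that height, width and channel count be constant:

```python
import numpy as np

def compress_screenshot(img, out_h=224, out_w=224):
    """Resize a colonoscopy screenshot of shape (H, W, 3) to a fixed input
    size by nearest-neighbour sampling, so that height, width and channel
    count are identical for every training image. The 224x224 target is an
    illustrative assumption, not a value from the patent."""
    h, w, c = img.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows][:, cols]              # shape (out_h, out_w, c)
```

Any 480x640x3 screenshot, for example, comes out as 224x224x3, ready for a fixed-size network input.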
  • The method further includes step S200: inputting the training image into the teacher model to obtain a first segmentation map, wherein the number of teacher models is greater than or equal to two, and different teacher models obtain the first segmentation map according to different training image sets.
  • the teacher model needs to be trained first.
  • Different teacher models obtain the first segmentation map according to different training image sets.
  • teacher model A is used to automatically detect polyps
  • teacher model B is used to automatically detect bleeding.
  • Teacher models A and B are trained on different training image sets, respectively.
  • The data in the training image set of teacher model A can come from the local hospital with the strongest polyp treatment technology, and the data in the training image set of teacher model B can come from the local hospital with the strongest treatment technology for colonic hemorrhage.
  • The teacher model classifies the pixels of the training image according to preset classification conditions by collecting specific feature information in the training image, and outputs a classification result; the classification result is the first segmentation map.
  • the teacher model includes the first down-sampling encoder and the first up-sampling decoder, as shown in Figure 3, the step S200 also includes the following steps:
  • Step S210 performing feature extraction on the training image according to the first down-sampling encoder to obtain a first feature map; the first feature map includes feature information of the training image;
  • Step S220 analyzing the first feature map according to the first upsampling decoder to obtain the first segmentation map
  • The first segmentation map includes the first standard probability and the first abnormal probability corresponding to each pixel in the training image; the first standard probability is the probability that the pixel is standard under the first preset classification condition, and the first abnormal probability is the probability that the pixel is abnormal under the first preset classification condition; the first abnormal probability and the first standard probability sum to 1.
  • the teacher model mainly adopts the stochastic gradient descent algorithm during training.
  • the teacher model is mainly composed of a first down-sampling encoder and a first up-sampling decoder.
  • The first down-sampling encoder and the first up-sampling decoder are composed of four down-sampling layers and four up-sampling layers, respectively.
  • The four down-sampling layers and the four up-sampling layers are connected one-to-one, and the output of each down-sampling layer is added to its corresponding up-sampling layer to participate in the up-sampling process.
  • The training image passes through the four down-sampling layers of the first down-sampling encoder in sequence to obtain the first feature map. The first feature map then passes through the four up-sampling layers of the first up-sampling decoder in sequence, and the final output is the first segmentation map.
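The four-down/four-up wiring with skip additions can be sketched with toy stand-in layers: average pooling stands in for the MobileNetV2 stages and nearest-neighbour upsampling for the transposed convolutions. Only the connection pattern is the point here; the real layers are learned:

```python
import numpy as np

def avgpool2(x):
    """Toy down-sampling layer: 2x2 average pooling over a (C, H, W) map
    (stands in for a MobileNetV2 stage; H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    """Toy up-sampling layer: nearest-neighbour 2x enlargement
    (stands in for a transposed convolution)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def encoder_decoder(img):
    """Illustrative wiring of four down-sampling and four up-sampling
    layers: each down-sampling output is added to the matching
    up-sampling stage (the skip connections described above).
    Input height/width must be divisible by 16."""
    skips, x = [], img
    for _ in range(4):                 # first down-sampling encoder
        x = avgpool2(x)
        skips.append(x)                # keep each stage output for its skip
    for skip in skips[-2::-1]:         # third, second, first stage outputs
        x = upsample2(x) + skip        # add skip to the up-sampled map
    return upsample2(x)                # fourth up-sampling layer
```

A (3, 16, 16) input is reduced to (3, 1, 1) by the encoder and restored to (3, 16, 16) by the decoder, so the segmentation map matches the input resolution.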
  • a lightweight network MobileNetv2 is used to construct the first down-sampling encoder.
  • The lightweight network MobileNetV2 is a model proposed for devices with limited computing resources. It uses depthwise separable convolutions to build a lightweight deep neural network, which simplifies the network structure, so it has high accuracy and good model compression ability.
  • In one implementation, as shown in FIG. 4, four layers of MobileNetV2 are used as the first down-sampling layer 10, the second down-sampling layer 20, the third down-sampling layer 30 and the fourth down-sampling layer 40 of the first down-sampling encoder, respectively.
  • The training image 1 is input into the first down-sampling encoder: it first enters the first down-sampling layer 10, the output of the first down-sampling layer 10 is then used as the input of the second down-sampling layer 20, and so on, with the output image of each down-sampling layer serving as the input image of the next, until the fourth down-sampling layer 40 completes feature extraction on its input image and the first feature map 2 is output.
  • Each down-sampling layer in the first down-sampling encoder is composed of an inverted residual module, which is built from depthwise separable convolutions.
  • The inverted residual module first performs a point-wise (1x1) convolution on the input image to expand the number of channels; it then extracts image features with a depthwise convolution; and finally performs another point-wise convolution to compress the number of channels. In this way, the size of the teacher model can be reduced without losing accuracy in the image feature extraction.
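The saving can be checked by counting weights. The expansion factor 6 below is MobileNetV2's published default, and the channel sizes are illustrative; the point is that the depthwise stage uses one filter per channel instead of a full channel-to-channel convolution:

```python
def conv_params(c_in, c_out, k=3):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def inverted_residual_params(c_in, c_out, expand=6, k=3):
    """Weight count of a MobileNetV2-style inverted residual block:
    1x1 point-wise expansion, k x k depthwise convolution (one filter per
    channel), then 1x1 point-wise projection that compresses the channels.
    `expand=6` is MobileNetV2's default; the channel sizes used below are
    illustrative assumptions."""
    c_mid = c_in * expand
    return (c_in * c_mid       # 1x1 point-wise expansion
            + c_mid * k * k    # depthwise: one k x k filter per channel
            + c_mid * c_out)   # 1x1 point-wise projection
```

For c_in = 32, c_out = 64, the depthwise stage at the expanded width of 192 channels costs 192 * 9 = 1,728 weights, where a standard 3x3 convolution at that width would cost 192 * 192 * 9 = 331,776; this is where the compression comes from.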
  • the first down-sampling encoder performs feature extraction on the input training image and outputs the first feature map, and uses the first feature map as the input image of the first up-sampling decoder, and then executes the Step S220.
  • each of the four upsampling layers in the upsampling decoder is composed of a transposed convolutional layer and a normalization layer respectively.
  • the transposed convolutional layer is used for expanding the image and extracting image features.
  • The normalization layers are used to avoid parameter interaction between different up-sampling layers. For example, as shown in FIG. 4, after the first feature map 2 is input to the up-sampling decoder, it passes through the first up-sampling layer 50; the output of the first up-sampling layer 50 is then combined with the output of the fourth down-sampling layer 40 of the encoder, and the combined result is used as the input image of the second up-sampling layer 60.
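A minimal single-channel transposed convolution shows how an up-sampling layer expands the image (the normalization layer is omitted, and the kernel and stride below are illustrative assumptions):

```python
import numpy as np

def transposed_conv2d(x, kernel, stride=2):
    """Minimal single-channel transposed convolution: each input pixel
    scatters a scaled copy of the kernel into the output, so the image is
    expanded while features are extracted. Output size follows
    (H - 1) * stride + k for a k x k kernel."""
    h, w = x.shape
    k = kernel.shape[0]
    out = np.zeros(((h - 1) * stride + k, (w - 1) * stride + k))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * kernel
    return out
```

With stride 2 and a 2x2 kernel, a 4x4 feature map becomes an 8x8 map, matching the doubling performed by each up-sampling layer.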
  • The first segmentation map includes a first standard probability and a first abnormal probability corresponding to each pixel in the training image; the first standard probability is the probability that the pixel is standard (normal) under the first preset classification condition, and the first abnormal probability is the probability that the pixel is abnormal under the first preset classification condition; the first abnormal probability and the first standard probability sum to 1.
  • the teacher model is a model for automatically detecting polyps
  • the corresponding first preset classification condition is whether polyps are present
  • The first standard probability is the probability that a pixel in the training image corresponds to normal (that is, no polyps), and the first abnormal probability is the probability that the pixel corresponds to having a polyp.
  • The first down-sampling encoder performs down-sampling on the training image and obtains a first feature map of size W' x H' x N, where W', H' and N are the height, width and number of channels of the first feature map.
  • The first up-sampling decoder performs up-sampling on the first feature map and obtains a first segmentation map p^{T_i}, where T_i, i ∈ {1, 2, 3, ..., k}, denotes the i-th teacher model, k is the number of teacher models, and each teacher model corresponds to a specific classification category.
  • For channel j ∈ {1, 2}: p^{T_i}_{1,w,h} is the probability that each pixel is predicted to be normal under the first preset classification condition, that is, the first standard probability; p^{T_i}_{2,w,h} is the probability that each pixel is predicted to be abnormal under the first preset classification condition, that is, the first abnormal probability; and their sum is 1.
  • For teacher models A, B, C and D, the corresponding classification conditions are whether there is a polyp, whether there is a Meckel's diverticulum, whether there is an ulcer, and whether there is hemorrhage; they are trained to automatically detect polyps, Meckel's diverticulum, ulcers, and hemorrhages, respectively.
  • After the training image is input into teacher model A, the first segmentation map (0.1, 0.9) is obtained: 0.1 is the predicted probability, output on channel 1, that the pixel is normal, and 0.9 is the predicted probability, output on channel 2, that the pixel has a polyp.
  • After the training image is input into teacher model B, the first segmentation map (0.2, 0.8) is obtained: 0.2 is the probability that the pixel is normal, and 0.8 is the probability that it has a Meckel's diverticulum. After the training image is input into teacher model C, the first segmentation map (0.3, 0.7) is obtained: 0.3 is the probability that the pixel is normal, and 0.7 is the probability that it has an ulcer.
  • The method further includes step S300: modifying the parameters of the teacher model according to the first segmentation map and the first real label, and continuing to perform the step of inputting the training image into the teacher model to obtain the first segmentation map, until the preset training conditions of the teacher model are met, so as to obtain the trained teacher model; the first real label is used to reflect the real classification of the training image under the first preset classification condition.
  • each training image has its corresponding real label to evaluate the classification effect (prediction effect) of the model.
  • the real label used for training the teacher model is the first real label, to indicate the corresponding real result of the training image under the first preset classification condition.
  • The purpose of training is to make the output of the teacher model close to the real label, so the teacher model continuously performs parameter correction during training, so as to control the training process and guide it to converge in the optimal direction.
  • the step S300 specifically includes the following steps:
  • Step S310 calculating a first loss value according to the first segmentation map and the first real label
  • Step S320 adjusting the parameters of the first upsampling decoder according to the first loss value to update the teacher model
  • Step S330 Continue to perform the step of inputting the training image into the teacher model to obtain the first segmentation map, until the preset training conditions of the teacher model are met, so as to obtain the trained teacher model.
  • By calculating the first loss value, the gap between the prediction result of the teacher model and the real result can be obtained, so that the teacher model can determine, according to this gap, how to perform parameter correction and achieve a better prediction effect.
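The parameter correction itself is ordinary gradient descent (the text names stochastic gradient descent earlier). A toy example, with a made-up quadratic loss standing in for the first loss value, shows the correction loop driving the loss down:

```python
import numpy as np

def sgd_step(params, grad, lr=0.1):
    """One gradient-descent correction: move the parameters against the
    loss gradient, scaled by the learning rate."""
    return params - lr * grad

# Toy demonstration: fitting parameters to a target under mean squared
# error; the target and learning rate are illustrative, not patent values.
target = np.array([1.0, -2.0])
w = np.zeros(2)
for _ in range(100):
    grad = 2.0 * (w - target) / w.size   # gradient of the MSE loss
    w = sgd_step(w, grad)
loss = ((w - target) ** 2).mean()        # near zero after 100 corrections
```

Each step shrinks the gap between prediction and target, which is exactly the role the first loss value plays in guiding the teacher model's parameter correction.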
  • The teacher model classifies the training image according to the first preset classification condition to obtain a first segmentation map; the first segmentation map and the first real label are substituted into the calculation formula of the first loss value to obtain the first loss value, which represents the gap between the first segmentation map and the first real label.
  • The calculation formula of the first loss value is as follows:

    L(T_i) = - Σ_w Σ_h Σ_j  y_{j,w,h} · log( p^{T_i}_{j,w,h} ),  j ∈ {1, 2}
  • The first loss value measures the gap between the first segmentation map and the first real label: the larger the first loss value, the larger the gap between the first segmentation map and the first real label, and the worse the classification effect of the teacher model; the smaller the first loss value, the smaller the gap, and the better the classification effect of the teacher model.
  • p^{T_i}_{j,w,h} is the value of the first segmentation map, output by teacher model T_i, for the pixel in the w-th row and h-th column of the j-th channel, and represents the prediction result of teacher model T_i for this pixel.
  • The real label y_{w,h} is a one-hot vector, that is, only one channel of the label is non-zero, and the results of all other channels are 0.
  • The real labels of the teacher model therefore take only two forms, (1, 0) and (0, 1): (1, 0) indicates that the real situation of the pixel is normal, and (0, 1) indicates that the real situation of the pixel is abnormal.
  • The real label y is used to evaluate the prediction effect of the first segmentation map p^{T_i} output by the teacher model: p^{T_i} and y are substituted into the calculation formula of the first loss value, and the prediction effect of the teacher model is evaluated by its size. The larger the first loss value, the worse the prediction effect of the teacher model; the smaller the first loss value, the better the prediction effect of the teacher model.
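A per-pixel cross-entropy consistent with this description can be sketched as follows (cross-entropy is assumed as the form of the first loss value, matching the one-hot labels and the smaller-is-better behaviour described above):

```python
import numpy as np

def first_loss(seg_map, onehot, eps=1e-12):
    """Cross-entropy between the first segmentation map and the one-hot
    first real label, summed over channels and averaged over pixels.
    Both arrays have shape (2, W, H); `eps` guards the logarithm.
    The exact formula is assumed here, not quoted from the patent."""
    return -(onehot * np.log(seg_map + eps)).sum(axis=0).mean()
```

A prediction of (0.9, 0.1) against the label (1, 0) gives a small loss of about 0.105, while the reversed prediction (0.1, 0.9) gives about 2.30, so a worse fit does yield a larger first loss value.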
  • after the training of the teacher model is completed, the trained teacher model is obtained; the trained teacher model can then be used to distill the student model, and the process of distilling the student model is the training process of the student model.
  • the method further includes step S400 , inputting the training image into the student model to obtain a second segmentation map.
  • the student model classifies the pixels of the training image according to preset classification conditions by collecting specific feature information in the training image, and outputs a classification result, namely the second segmentation map.
  • step S400 specifically includes the following steps:
  • Step S410 outputting a second feature map after performing feature extraction on the training image according to the second down-sampling encoder; the second feature map includes feature information of the training image;
  • Step S420 Analyze the second feature map according to the second upsampling decoder to obtain the second segmentation map.
  • the second segmentation map contains the second standard probability and the second abnormal probability corresponding to each pixel in the training image;
  • the second standard probability is the probability that the pixel is standard under the second preset classification condition,
  • the second abnormal probability is the probability that the pixel is abnormal under the second preset classification condition;
  • the number of categories of the second preset classification condition is greater than the number of categories of the first preset classification condition.
  • the structure of the student model is similar to that of the teacher model, and both include a down-sampling encoder and an up-sampling decoder.
  • the downsampling encoder in the student model is a second downsampling encoder
  • the upsampling decoder in the student model is a second upsampling decoder
  • the second down-sampling encoder and the second up-sampling decoder are likewise composed of four down-sampling layers and four up-sampling layers respectively
  • the second down-sampling encoder and the second up-sampling decoder have a connection relationship
  • the four down-sampling layers are connected to the four up-sampling layers in one-to-one correspondence, and the outputs of the four down-sampling layers are respectively added to their corresponding up-sampling layers to participate in the up-sampling process, preserving the gradient of the student model.
  • the second feature map is obtained by passing the training image sequentially through the four down-sampling layers of the second down-sampling encoder; the second feature map then passes sequentially through the four up-sampling layers of the second up-sampling decoder, and the final output is the second segmentation map.
  • the main differences between the student model and the teacher model are that the student model has more classification categories than the teacher model, so that the prediction result output by the student model has more dimensions than that of the teacher model, and that the number of channels of the intermediate layers of the student model is smaller than that of the teacher model, which reduces the overall size of the student model.
  • each down-sampling layer in the second down-sampling encoder consists of an inverted residual module built from depthwise separable convolutions. Specifically, the inverted residual module first performs a pointwise (1×1) convolution on the input image to expand the number of channels, then extracts image features with a depthwise convolution, and finally compresses the number of channels with another pointwise convolution.
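The parameter saving from replacing a standard convolution with a depthwise separable one can be checked with a quick count (a sketch; the actual channel widths of the student model are not given in this passage, so 64 and 128 are illustrative):

```python
def standard_conv_params(c_in, c_out, k=3):
    # Every output channel mixes all input channels with its own k x k kernel.
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k=3):
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise

print(standard_conv_params(64, 128))        # 73728
print(depthwise_separable_params(64, 128))  # 8768
```

For 64 input and 128 output channels the separable form needs roughly 8.4× fewer weights, which is how the encoder can shrink the student model without giving up feature extraction capacity.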
  • the size of the student model can be reduced without losing the accuracy of the image feature extraction by the student model.
  • the second down-sampling encoder performs feature extraction on the input training image and outputs the second feature map, and uses the second feature map as the input image of the second up-sampling decoder, and then executes the Step S420.
  • the outputs of the four down-sampling layers in the second down-sampling encoder are respectively added to the corresponding four up-sampling layers in the second up-sampling decoder to participate in the up-sampling process, so as to preserve the gradient of the student model.
  • the second feature map sequentially passes through the four upsampling layers of the second upsampling decoder, and the final output result is the second segmentation map (for a detailed process, please refer to step S220).
  • the second segmentation map contains a second standard probability and a second abnormal probability corresponding to each pixel in the training image;
  • the second standard probability is the probability that the pixel is standard under the second preset classification condition,
  • and the second abnormal probability is the probability that the pixel is abnormal under the second preset classification condition;
  • the number of categories of the second preset classification condition is greater than the number of categories of the first preset classification condition.
  • the second down-sampling encoder performs down-sampling on the training image and obtains a second feature map of size W' × H' × N, where W', H' and N are the width, height and number of channels of the second feature map.
  • the first entry of the output vector is the minimum standard probability corresponding to the pixel under the second preset classification condition; the remaining entries represent, in order, the probability that the pixel belongs to the first type of abnormality, the probability that it belongs to the second type of abnormality, and so on, under the second preset classification condition.
  • the classification conditions set by the student model are related to the classification conditions of the multiple trained teacher models.
  • the corresponding classification conditions are whether there is a polyp, whether there is a Meckel's diverticulum, whether there is an ulcer, and whether there is bleeding, used to train automatic detection of polyps, Meckel's diverticula, ulcers and bleeding, respectively.
  • the second preset classification conditions of the student model distilled from the above four teacher models fall into four categories: the first classification condition is whether there is a polyp, the second is whether there is a Meckel's diverticulum, the third is whether there is an ulcer, and the fourth is whether there is bleeding.
  • a second segmentation map (0.2, 0.8, 0.7, 0.5, 0.3) is obtained, meaning that for the predicted pixel the probability of having a polyp is 0.8, the probability of having a Meckel's diverticulum is 0.7, the probability of having an ulcer is 0.5, and the probability of bleeding is 0.3.
  • the normal probability is the minimum of the normal probabilities across all diseases;
  • in this example the minimum normal probability 0.2 is retained as the normal probability of the pixel in the second segmentation map, which avoids an excessively large normal probability making the student model's prediction inaccurate.
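The construction of the five-dimensional student output from the four binary per-disease predictions can be sketched as follows (function and variable names are illustrative; the patent only specifies keeping the minimum normal probability):

```python
def merge_predictions(per_disease):
    # per_disease: one (p_normal, p_abnormal) pair per disease for one pixel.
    # Keep the minimum normal probability, then each disease's abnormal probability.
    p_normal = min(p[0] for p in per_disease)
    return (p_normal,) + tuple(p[1] for p in per_disease)

# Polyp, Meckel's diverticulum, ulcer, bleeding at one pixel:
pairs = [(0.2, 0.8), (0.3, 0.7), (0.5, 0.5), (0.7, 0.3)]
print(merge_predictions(pairs))  # (0.2, 0.8, 0.7, 0.5, 0.3)
```

This reproduces the (0.2, 0.8, 0.7, 0.5, 0.3) vector of the worked example: 0.2 is the smallest of the four normal probabilities.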
  • the prediction effect of the student model is shown in Figure 11, where column A is the training image input into the student model, column B is the corresponding second real label (real category map), and column C is the corresponding output second segmentation map (prediction effect map).
  • the method further includes:
  • Step S500 modify the parameters of the student model according to the second segmentation map, the teacher label and the second real label, and continue to perform the step of inputting the training image into the student model to obtain the second segmentation map, until the preset training conditions of the student model are met, so as to obtain the trained student model; the second real label reflects the real classification of the pixels of the training image under the second preset classification condition, and the teacher label reflects the classification of the training image by the trained teacher models.
  • the real label used for training the student model is the second real label, to indicate the corresponding real classification situation of the pixels in the training image under the second preset classification condition.
  • the purpose of training is to keep the output of the student model close to the real label, so the student model continuously corrects its parameters during training, so as to control the training process and guide it to converge in the optimal direction.
  • step S500 specifically includes the following steps:
  • Step S510 calculating a second loss value according to the second segmentation map, the teacher label and the second real label
  • Step S520 adjusting the parameters of the second upsampling decoder according to the second loss value to update the student model
  • Step S530 Continue to perform the step of generating a second segmentation map according to the training image until the preset training conditions of the student model are met, so as to obtain a trained student model.
  • the gap between the predicted result of the student model and the real classification result can be obtained, so that the student model can determine, according to this gap, how to correct its parameters and achieve a better prediction effect.
  • the student model classifies the training image according to the second preset classification condition to obtain a second segmentation map; the second segmentation map and the second real label are then substituted into the calculation formula of the second loss value to obtain the second loss value, which represents the gap between the second segmentation map and the second real label.
  • the calculation formula of the second loss value is as follows:
  • the second loss value refers to the gap between the second segmentation map and the second true label
  • the larger the second loss value, the larger the gap between the second segmentation map and the second real label and the worse the classification effect of the student model;
  • the smaller the second loss value, the smaller the gap and the better the classification effect of the student model.
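The description states that the second loss combines the second segmentation map, the teacher label and the second real label, but the formula itself is not reproduced in this passage. One common knowledge-distillation form, offered purely as an assumption, is a weighted sum of cross-entropy with the hard true label and cross-entropy with the soft teacher label:

```python
import math

def second_loss(student, teacher, truth, alpha=0.5, eps=1e-12):
    # alpha balances imitation of the teacher label against the true label;
    # this weighting is an illustrative choice, not specified in the patent.
    ce_true    = -sum(y * math.log(p + eps) for p, y in zip(student, truth))
    ce_teacher = -sum(t * math.log(p + eps) for p, t in zip(student, teacher))
    return (1 - alpha) * ce_true + alpha * ce_teacher

student = (0.1, 0.6, 0.1, 0.1, 0.1)       # student prediction for one pixel
teacher = (0.08, 0.32, 0.28, 0.20, 0.12)  # normalized soft teacher label
truth   = (0, 1, 0, 0, 0)                 # one-hot second real label
print(round(second_loss(student, teacher, truth), 4))
```

Minimizing the first term pulls the student toward the ground truth, while the second term transfers the teachers' soft knowledge about the relative likelihood of each abnormality.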
  • the second real label takes the form (y_normal, y_1^S, y_2^S, ..., y_n^S), where y_normal is the label for the pixel being standard under the second preset classification condition, y_1^S is the label for the pixel belonging to the first type of abnormality, y_2^S for the second type of abnormality, and so on, with y_n^S the label for the n-th type of abnormality under the second preset classification condition.
  • the second real label is also a one-hot vector, that is, only one channel of a given label has a non-zero value and the values of all other channels are 0; in other words, a label can mark the pixel as either normal or as exactly one of the n types of abnormality.
  • the teacher label is obtained according to the first segmentation map output by all trained teacher models.
  • the step S510 includes the following steps:
  • Step S511 obtain the probability total value according to the first segmentation map output by all the trained teacher models
  • Step S512 Adjust the first segmentation maps output by all trained teacher models according to the probability total value to obtain a teacher label.
  • D is the probability total value, and p_T is the teacher label.
  • the p_normal in the first vector is added to all of the abnormal probabilities to obtain the probability total value D.
  • dividing the first vector by the probability total value D yields a second vector, which is the teacher label p_T.
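The normalization step can be sketched directly (the function name is illustrative): dividing the merged first-segmentation vector by the probability total value D makes the teacher label a proper distribution that sums to 1.

```python
def teacher_label(first_vector):
    # D: the probability total value, i.e. the minimum normal probability
    # plus all abnormal probabilities; p_T: the normalized teacher label.
    D = sum(first_vector)
    return tuple(v / D for v in first_vector)

# Example vector from the description (D = 2.5):
p_T = teacher_label((0.2, 0.8, 0.7, 0.5, 0.3))
print([round(v, 2) for v in p_T])  # [0.08, 0.32, 0.28, 0.2, 0.12]
```

Because p_T sums to 1, it can be used directly as a soft target distribution when computing the second loss of the student model.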
  • the trained student model is obtained, and the trained student model can be used for real-time colonoscopy image segmentation, that is, as shown in FIG. 1 , the method further includes step S600:
  • the real-time colonoscopy images are input into the trained student model to generate real-time colonoscopy image segmentation maps.
  • the present invention further provides an apparatus for real-time colonoscopy image segmentation based on ensemble knowledge distillation, wherein the apparatus includes: an image acquisition module 120 configured to acquire a training image; a teacher model unit 130 configured to obtain a first segmentation map according to the training image; a first parameter correction module 110 configured to modify the parameters of the teacher model according to the first segmentation map and the first real label; a student model unit 170 configured to obtain a second segmentation map according to the training image; and a second parameter correction module 160 configured to modify the parameters of the student model according to the second segmentation map, the teacher label and the second real label;
  • the teacher model unit 130 further includes: a first down-sampling encoder module 90 configured to perform feature extraction on the training image to obtain a first feature map, and a first up-sampling decoder module 100 configured to parse the first feature map to obtain the first segmentation map;
  • the student model unit 170 further includes: a second down-sampling encoder module 140 configured to perform feature extraction on the training image to obtain a second feature map, and a second up-sampling decoder module 150 configured to parse the second feature map to obtain the second segmentation map.
  • the present invention also provides a non-transitory computer-readable storage medium on which a data storage program is stored; when the data storage program is executed by a processor, the steps of the real-time colonoscopy image segmentation method based on ensemble knowledge distillation described above are implemented.
  • Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
  • the present invention also provides a terminal including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors; execution of the one or more programs includes performing the method of real-time colonoscopy image segmentation based on ensemble knowledge distillation as described in any of the above.
  • a functional block diagram of the terminal may be shown in FIG. 9 .
  • the terminal includes a processor, a memory, and a network interface connected through a system bus.
  • the processor of the terminal is used to provide computing and control capabilities.
  • the memory of the terminal includes a non-volatile storage medium and an internal memory.
  • the nonvolatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for the execution of the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the terminal is used to communicate with external terminals through a network connection.
  • the computer program when executed by the processor, implements a real-time colonoscopy image segmentation method based on integrated knowledge distillation.
  • FIG. 9 is only a block diagram of a partial structure related to the solution of the present invention, and does not constitute a limitation on the intelligent terminal to which the solution of the present invention is applied. More or fewer components than shown in the figures may be included, or some components may be combined, or have a different arrangement of components.
  • the real-time colonoscopy image segmentation method based on ensemble knowledge distillation described in any of the above can be implemented by instructing the relevant hardware through a computer program; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed it may include the processes of the embodiments of the above-mentioned methods.
  • the present invention obtains multiple training images, which are divided into multiple training image sets, with the training images of the same training image set coming from the same data set; first the teacher models are trained, different teacher models obtaining first segmentation maps from different training image sets; then a student model is jointly distilled with the trained teacher models.
  • the training image is a screenshot of a colonoscopy image, and the trained student model can generate a real-time colonoscopy image segmentation map according to the real-time colonoscopy image.


Abstract

Disclosed in the present invention are a real-time colonoscopy image segmentation method and device based on ensemble and knowledge distillation. The method comprises: acquiring a plurality of training images, the training images being divided into a plurality of training image sets, the training images in the same training image set coming from the same data set; first training teacher models, different teacher models obtaining first segmentation maps from different training image sets; and then jointly distilling a student model with the trained teacher models. The training images are colonoscopy image screenshots, and a trained student model can generate a real-time colonoscopy image segmentation map from a real-time colonoscopy image. This solves the problem that the data sets of different hospitals are discontinuous and cannot be pooled together to train an automatic colonoscopy image segmentation model.

Description

Real-time colonoscopy image segmentation method and device based on ensemble knowledge distillation

Technical Field

The present invention relates to the field of image segmentation, and in particular to a real-time colonoscopy image segmentation method and device based on ensemble knowledge distillation.

Background Art

Minimally invasive surgery has a limited field of view; in colonoscopy in particular, blind spots in the field of view are common. Automatic image segmentation of real-time colonoscopy therefore plays an important role in intestinal surgery. Training existing automatic colonoscopy image segmentation models usually requires data sets from different hospitals; however, the data sets of different hospitals are discontinuous and cannot be pooled together to train such a model.

Therefore, the existing technology still needs to be improved.

Summary of the Invention
The technical problem to be solved by the present invention is, in view of the above defects of the prior art, to provide a real-time colonoscopy image segmentation method and storage medium based on ensemble knowledge distillation, aiming to solve the prior-art problem that data sets of different hospitals cannot be pooled to train an automatic colonoscopy image segmentation model.

The technical solution adopted by the present invention to solve the problem is as follows:

In a first aspect, an embodiment of the present invention provides a real-time colonoscopy image segmentation method based on ensemble knowledge distillation, wherein the method includes:

acquiring training images; the training images are screenshots of colonoscopy images used to train a teacher model and a student model; the training images are divided into multiple training image sets, and the training images of the same training image set come from the same data set;

inputting the training image into the teacher model to obtain a first segmentation map; wherein the number of teacher models is greater than or equal to two, and different teacher models obtain first segmentation maps from different training image sets;

modifying the parameters of the teacher model according to the first segmentation map and a first real label, and continuing to perform the step of inputting the training image into the teacher model to obtain the first segmentation map until the preset training conditions of the teacher model are met, so as to obtain a trained teacher model; the first real label reflects the real classification of the pixels of the training image under a first preset classification condition;

inputting the training image into the student model to obtain a second segmentation map;

modifying the parameters of the student model according to the second segmentation map, a teacher label and a second real label, and continuing to perform the step of inputting the training image into the student model to obtain the second segmentation map until the preset training conditions of the student model are met, so as to obtain a trained student model; the second real label reflects the real classification of the pixels of the training image under a second preset classification condition; the teacher label reflects the classification of the training image by the trained teacher models;

inputting real-time colonoscopy images into the trained student model to generate real-time colonoscopy image segmentation maps.
In one embodiment, the acquiring of the training images, the training images being screenshots of colonoscopy images used to train the teacher model and the student model, includes:

acquiring colonoscopy image screenshots;

compressing the colonoscopy image screenshots to obtain the training images; the height, width and number of channels of the training images are all constant.

In one embodiment, the teacher model includes a first down-sampling encoder and a first up-sampling decoder, and the inputting of the training image into the teacher model to obtain the first segmentation map includes:

performing feature extraction on the training image with the first down-sampling encoder to obtain a first feature map; the first feature map contains the feature information of the training image;

parsing the first feature map with the first up-sampling decoder to obtain the first segmentation map;

wherein the first segmentation map contains a first standard probability and a first abnormal probability corresponding to each pixel in the training image; the first standard probability is the probability that the pixel is standard under the first preset classification condition, and the first abnormal probability is the probability that the pixel is abnormal under the first preset classification condition; the sum of the first abnormal probability and the first standard probability is 1.

In one embodiment, the modifying of the parameters of the teacher model according to the first segmentation map and the first real label, and continuing to perform the step of inputting the training image into the teacher model to obtain the first segmentation map until the preset training conditions of the teacher model are met, so as to obtain the trained teacher model, includes:

calculating a first loss value according to the first segmentation map and the first real label;

adjusting the parameters of the first up-sampling decoder according to the first loss value to update the teacher model;

continuing to perform the step of inputting the training image into the teacher model to obtain the first segmentation map until the preset training conditions of the teacher model are met, so as to obtain the trained teacher model.
在一种实施方式中,所述学生模型包括第二下采样编码器和第二上采样解码器;所述将所述训练图像输入至所述学生模型,得到第二分割图包括:In one embodiment, the student model includes a second down-sampling encoder and a second up-sampling decoder; the inputting the training image into the student model to obtain the second segmentation map includes:
根据所述第二下采样编码器对所述训练图像进行特征提取后输出第二特征图;所述第二特征图包含所述训练图像的特征信息;The second feature map is output after feature extraction is performed on the training image according to the second down-sampling encoder; the second feature map includes feature information of the training image;
根据所述第二上采样解码器对所述第二特征图进行解析,得到所述第二分割图;Analyze the second feature map according to the second upsampling decoder to obtain the second segmentation map;
其中;所述第二分割图包含所述训练图像中的像素对应的第二标准概率和第二异常概率;所述第二标准概率为所述像素在第二预设分类条件下属于标准的概率,所述第二异常概率为所述像素在第二预设分类条件下属于异常的概率;所述第二预设分类条件的类别数量多于所述第一预设分类条件的类别数量。Wherein; the second segmentation map includes the second standard probability and the second abnormal probability corresponding to the pixels in the training image; the second standard probability is the probability that the pixel belongs to the standard under the second preset classification condition , the second abnormality probability is the probability that the pixel belongs to abnormality under the second preset classification condition; the number of categories of the second preset classification condition is more than the number of categories of the first preset classification condition.
在一种实施方式中,所述根据所述第二分割图、教师标签和第二真实标签,对所述学生模型的参数进行修正,并继续执行根据所述训练图像生成第二分割图的步骤,直至满足所述学生模型的预设训练条件,以得到已训练的学生模型,包括:In an embodiment, the parameters of the student model are modified according to the second segmentation map, the teacher label and the second real label, and the step of generating a second segmentation map according to the training image is continued. , until the preset training conditions of the student model are met to obtain a trained student model, including:
根据所述第二分割图、所述教师标签和第二真实标签计算第二损失值;calculating a second loss value according to the second segmentation map, the teacher label and the second ground truth;
根据所述第二损失值调整所述第二上采样解码器的参数,以更新所述学生模型;Adjust parameters of the second upsampling decoder according to the second loss value to update the student model;
继续执行根据所述训练图像生成第二分割图的步骤,直至满足所述学生模型的预设训练条件,以得到已训练的学生模型。Continue to perform the step of generating a second segmentation map according to the training image until the preset training conditions of the student model are met, so as to obtain a trained student model.
In one embodiment, calculating the second loss value according to the second segmentation map, the teacher label and the second ground-truth label includes:
obtaining a total probability value from the first segmentation maps output by all the trained teacher models;
adjusting the first segmentation maps output by all the trained teacher models according to the total probability value, to obtain the teacher label.
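The two steps above, accumulating a total probability value over all teacher outputs and adjusting each first segmentation map by it, can be sketched per pixel as follows. The patent does not spell out the exact combination rule in this passage, so the normalization below is an assumption for illustration only.

```python
# Hypothetical per-pixel sketch of forming the teacher label: the (normal,
# abnormal) probabilities of all trained teacher models are summed into a
# total probability value, and every output is rescaled by that total so
# the merged teacher label forms a valid distribution. The combination
# rule itself is an illustrative assumption, not the patent's formula.

def teacher_label(first_segmentation_maps):
    """first_segmentation_maps: one (normal, abnormal) pair per teacher."""
    # Total probability value accumulated over all teacher outputs.
    total = sum(p for pair in first_segmentation_maps for p in pair)
    # Adjust every teacher output by the total value.
    return [tuple(p / total for p in pair) for pair in first_segmentation_maps]

maps = [(0.1, 0.9), (0.2, 0.8)]      # outputs of two teachers for one pixel
label = teacher_label(maps)
assert abs(sum(p for pair in label for p in pair) - 1.0) < 1e-9
```

With two teachers the total value is 2.0, so each probability is simply halved; with more teachers the same rule keeps the merged label summing to one.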
In a second aspect, an embodiment of the present invention further provides an apparatus for real-time colonoscopy image segmentation based on ensemble knowledge distillation, the apparatus including:
an image acquisition module, configured to acquire training images;
a teacher model unit, configured to obtain a first segmentation map from the training images;
a first parameter correction module, configured to correct the parameters of the teacher model according to the first segmentation map and a first ground-truth label;
a student model unit, configured to obtain a second segmentation map from the training images;
a second parameter correction module, configured to correct the parameters of the student model according to the second segmentation map, the teacher label and a second ground-truth label;
The teacher model unit further includes:
a first down-sampling encoder module, configured to perform feature extraction on the training image to obtain a first feature map;
a first up-sampling decoder module, configured to parse the first feature map to obtain the first segmentation map.
The student model unit further includes:
a second down-sampling encoder module, configured to perform feature extraction on the training image to obtain a second feature map;
a second up-sampling decoder module, configured to parse the second feature map to obtain the second segmentation map.
In a third aspect, an embodiment of the present invention further provides a terminal, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing any of the methods described above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein when the computer program is executed by a processor, the steps of any of the above methods for real-time colonoscopy image segmentation based on ensemble knowledge distillation are implemented.
Beneficial effects of the present invention: the present invention acquires a plurality of training images, the training images being divided into a plurality of training image sets, with the training images of one training image set coming from one and the same data set. The teacher models are trained first, different teacher models obtaining first segmentation maps from different training image sets; the trained teacher models are then used to jointly distill a student model. The training images are colonoscopy image screenshots, and the trained student model can generate a real-time colonoscopy image segmentation map from a real-time colonoscopy image. This solves the problem that data sets from different hospitals are disjoint and cannot be pooled to train an automatic image segmentation model for colonoscopy.
Description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description are only some embodiments described in the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a first schematic flowchart of the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 2 is a second schematic flowchart of the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 3 is a third schematic flowchart of the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 4 is a schematic diagram of the connection relationship between the down-sampling encoder and the up-sampling decoder provided by an embodiment of the present invention.
FIG. 5 is a fourth schematic flowchart of the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 6 is a fifth schematic flowchart of the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 7 is a sixth schematic flowchart of the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 8 is a seventh schematic flowchart of the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 9 is a schematic structural diagram of the hardware operating environment involved in the solution of an embodiment of the present invention.
FIG. 10 is a schematic diagram of the internal structure of the apparatus for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by an embodiment of the present invention.
FIG. 11 is a diagram of the prediction results of the student model provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only serve to explain the present invention and are not intended to limit it.
It should be noted that if an embodiment of the present invention involves a directional indication (such as up, down, left, right, front, back, ...), the directional indication is only used to explain the relative positional relationship, movement and so on between the components in a certain posture (as shown in the accompanying drawings); if that posture changes, the directional indication changes accordingly.
With the development of minimally invasive surgery, artificial-intelligence-assisted surgery, represented by robotic systems, is becoming more and more widely used. Assisted surgery is a system in which robots help doctors perform surgical operations, mainly to overcome the limited field of view of existing minimally invasive surgery. Assisted surgery is especially important in colonoscopy. Colonoscopy is one of the important techniques in intestinal surgery; however, during colonoscopy many colon lesions have characteristics similar to normal mucosa, such as a similar colour, or are too flat, and such deceptive colon lesions are usually difficult to find without special methods. Automatic image segmentation of real-time colonoscopy therefore plays an important role in colonoscopy technology. Specifically, in past research many attempts have been made to use deep neural network models for natural images, or convolutional networks for biomedical image segmentation, for medical image detection, for example automatic endoscopic detection and classification of colorectal polyps, and considerable detection results have been achieved: research shows that deep learning can locate and identify polyps in real time with high accuracy in screening colonoscopy (for example, using YOLO to detect polyps in colonoscopy videos can locate and identify polyps in real time with 96% accuracy).
At present, automatic image segmentation models for colonoscopy usually require data sets from different hospitals for training; however, the data sets of different hospitals are disjoint and cannot be directly pooled to train a model. In addition, prior-art research on automatic image segmentation for colonoscopy mainly focuses on polyp detection, and research on automatic detection of ulcers, bleeding and Meckel's diverticulum is lacking.
Based on the above defects of the prior art, the present invention provides a method for real-time colonoscopy image segmentation based on ensemble knowledge distillation. Real-time colonoscopy image segmentation is a technique that assists doctors in assessing colon images of patients. Simply put, in a colon image, diseased tissue (a lesion) differs morphologically from normal tissue, for example in colour, contour and texture. After learning from a large number of colonoscopy images with known results, an automatic detection model can therefore parse an input colonoscopy image and give a prediction, providing the doctor with a corresponding reference. However, training an automatic detection model requires a large amount of computing resources in order to extract information from very large, highly redundant data sets, which often leads to very large trained models; large models are inconvenient to deploy in practical applications, so model compression becomes an important issue. Knowledge distillation is a model compression method whose main idea is to train a small network model to imitate a pre-trained large network or an ensemble of networks; in knowledge distillation, the teacher imparts knowledge to the student by adding, during the training of the student, a loss function that targets the probability distribution of the teacher's predictions.
Simply put, the present invention first trains a plurality of binary classification models on data sets from different hospitals; these binary classification models can respectively detect polyps, ulcers, bleeding, Meckel's diverticulum and other problems commonly found in colonoscopy. The trained binary classification models are then used to jointly distill one multi-class model, so that the multi-class model can automatically detect polyps, ulcers, bleeding and Meckel's diverticulum. In the present invention, the teacher models are binary classification models and the student model is a multi-class model; the purpose of training the teacher models and the student model is to determine their optimal parameters and achieve their best classification performance. Different teacher models are trained on different training image sets, yielding teacher models with different classification conditions. The training images of one teacher model all come from the data set of one and the same hospital, which solves the problem that data sets from different hospitals are disjoint and cannot be pooled to train an automatic image segmentation model for colonoscopy. Finally, the knowledge distillation method (distilling the knowledge contained in an already trained model into another model) is used to distill the knowledge contained in the trained teacher models into the student model, thereby effectively compressing the student model and reducing its size.
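The distillation idea described above, training the student against a loss that targets the teacher's predicted probability distribution as well as the ground truth, can be sketched per pixel as follows. The equal weighting of the two terms is an illustrative assumption, not the patent's exact loss.

```python
import math

# Minimal per-pixel sketch of knowledge distillation: the student's loss
# combines a hard term (cross-entropy against the ground-truth label) with
# a soft term (cross-entropy against the teacher's probability distribution).
# The 50/50 weighting (alpha) is an assumption for illustration only.

def cross_entropy(target, predicted, eps=1e-12):
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def distillation_loss(student_probs, ground_truth, teacher_probs, alpha=0.5):
    hard = cross_entropy(ground_truth, student_probs)   # supervised term
    soft = cross_entropy(teacher_probs, student_probs)  # teacher-imitation term
    return alpha * hard + (1.0 - alpha) * soft

loss = distillation_loss([0.7, 0.3], [1.0, 0.0], [0.8, 0.2])
assert loss > 0.0
```

Minimizing the soft term pushes the student's output distribution towards the teacher's, which is how the trained teachers' knowledge is transferred into the compressed student model.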
As shown in FIG. 1, the method for real-time colonoscopy image segmentation based on ensemble knowledge distillation provided by this embodiment includes the following steps:
S100, acquiring training images; the training images are colonoscopy image screenshots used for training the teacher models and the student model; the training images are divided into a plurality of training image sets, and the training images of one training image set come from one and the same data set.
Simply put, the first step in training the models is to obtain usable training images, and then use those training images to train the teacher models and the student model. The data in one training image set all come from the same hospital, which solves the problem that data sets from different hospitals are disjoint and cannot be pooled to train an automatic image segmentation model for colonoscopy.
In one implementation, as shown in FIG. 2, step S100 further includes the following steps:
S110, acquiring colonoscopy image screenshots;
S120, performing compression processing on the colonoscopy image screenshots to obtain the training images; the height, width and number of channels of the training images are all constant.
Specifically, since the dimensions of the network parameters in the teacher models and the student model are fixed, the size of their input images must match the dimensions of the network parameters so as to avoid dynamic changes of the networks in the models; that is, the size of the input images of the teacher models and the student model needs to be fixed. The size of an input image is mainly determined by its height, width and number of channels, so a fixed input size means that the height, width and number of channels of the input image are all constant. In a specific implementation, after the colonoscopy image screenshots are obtained, they are compressed so that their height, width and number of channels conform to the input image standard of the teacher models and the student model. The compressed colonoscopy image screenshots become the training images, which can be directly input into the teacher models and the student model to train them. In addition, the training images can be divided into a plurality of training image sets, with the data of one training image set all coming from the data set of one and the same hospital.
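The compression step above, rescaling each screenshot to a constant height and width, can be sketched as follows. The 2x2 target size and nearest-neighbour rule are illustrative assumptions; a real pipeline would use an image library and the dimensions required by the encoder.

```python
# Sketch of the compression step: rescaling a colonoscopy screenshot
# (here a nested list standing in for one image channel) to a fixed
# height and width by nearest-neighbour sampling, pure Python.

def resize_nearest(image, out_h, out_w):
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

screenshot = [[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]]
fixed = resize_nearest(screenshot, 2, 2)   # constant output size
assert len(fixed) == 2 and len(fixed[0]) == 2
```

Every screenshot, whatever its original resolution, comes out with the same height and width, matching the fixed parameter dimensions of the models.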
After step S100 is completed, as shown in FIG. 1, the method further includes step S200, inputting the training images into the teacher models to obtain first segmentation maps; the number of teacher models is greater than or equal to two, and different teacher models obtain first segmentation maps from different training image sets.
Since the present invention first trains a plurality of binary classification models and then uses the trained binary classification models to jointly distill one multi-class model, that is, trains the teacher models first and then uses the trained teacher models to jointly distill a student model, the teacher models need to be trained first. Different teacher models obtain first segmentation maps from different training image sets. For example, suppose there are two teacher models A and B, where teacher model A is used to automatically detect polyps and teacher model B to automatically detect bleeding; then teacher models A and B are trained on different training image sets: the data in the training image set of teacher model A may come from the local hospital with the strongest polyp treatment expertise, and the data in the training image set of teacher model B may come from the local hospital with the strongest expertise in treating colonic bleeding. The teacher model collects specific feature information from a training image, classifies the pixels of the training image according to a preset classification condition, and outputs the classification result, which is the first segmentation map.
The specific classification process is as follows. In one implementation, the teacher model includes a first down-sampling encoder and a first up-sampling decoder, and, as shown in FIG. 3, step S200 further includes the following steps:
Step S210, performing feature extraction on the training image by the first down-sampling encoder to obtain a first feature map; the first feature map contains the feature information of the training image;
Step S220, parsing the first feature map by the first up-sampling decoder to obtain the first segmentation map;
wherein the first segmentation map contains a first standard probability and a first abnormality probability corresponding to the pixels in the training image; the first standard probability is the probability that a pixel is normal under a first preset classification condition, the first abnormality probability is the probability that the pixel is abnormal under the first preset classification condition, and the sum of the first abnormality probability and the first standard probability is 1.
Simply put, the teacher model is mainly trained with a stochastic gradient descent algorithm. The teacher model mainly consists of the first down-sampling encoder and the first up-sampling decoder, which are composed of four down-sampling layers and four up-sampling layers respectively. The first down-sampling encoder and the first up-sampling decoder are connected: the four down-sampling layers are connected to the four up-sampling layers in one-to-one correspondence, and the outputs of the four down-sampling layers are respectively added into their corresponding up-sampling layers to participate in the up-sampling process, so as to preserve the gradients of the teacher model. After the training image is input into the teacher model, it passes through the four down-sampling layers of the first down-sampling encoder in sequence, yielding the first feature map. The first feature map then passes through the four up-sampling layers of the first up-sampling decoder in sequence, and the final output is the first segmentation map.
In addition, since deep learning models usually rely on powerful computing capacity, they are difficult to deploy on devices with limited computing resources and limited storage space. To solve this problem, the lightweight network MobileNetv2 is used to build the first down-sampling encoder of the teacher model. MobileNetv2 is a lightweight model proposed for devices with limited computing resources; it builds a lightweight deep neural network with depthwise separable convolutions, which simplifies the network structure, so it offers high accuracy and good model compression. In one implementation, as shown in FIG. 4, this embodiment uses four layers of MobileNetv2 as the first down-sampling layer 10, the second down-sampling layer 20, the third down-sampling layer 30 and the fourth down-sampling layer 40 of the first down-sampling encoder. Training image 1 is input into the first down-sampling encoder: it first enters the first down-sampling layer 10, the output of the first down-sampling layer 10 is then used as the input of the second down-sampling layer 20, and so on, with the output image of each down-sampling layer serving as the input image of the next, and the feature extraction step continuing until the fourth down-sampling layer 40 completes feature extraction on its input image and outputs first feature map 2. Each down-sampling layer of the first down-sampling encoder consists of an inverted residual module, which is constructed from depthwise separable convolutions. Specifically, the inverted residual module first performs a pointwise convolution on the input image to expand its number of channels; it then performs a depthwise convolution to extract image features; and it then performs another pointwise convolution to compress the number of channels. This reduces the size of the teacher model without losing accuracy in extracting image features.
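The saving from the depthwise separable convolutions mentioned above can be made concrete with a small multiply-accumulate count: a standard k x k convolution costs k·k·C_in·C_out·H·W operations, while its depthwise separable factorisation (per-channel k x k filtering plus 1 x 1 cross-channel mixing) costs k·k·C_in·H·W + C_in·C_out·H·W. The layer sizes below are illustrative, not taken from the patent.

```python
# Arithmetic sketch of why depthwise separable convolutions (as in
# MobileNetv2) shrink the computation of a convolution layer.
# Layer sizes (3x3 kernel, 32->64 channels, 56x56 feature map) are
# made-up illustrative values.

def standard_conv_macs(k, c_in, c_out, h, w):
    return k * k * c_in * c_out * h * w

def separable_conv_macs(k, c_in, c_out, h, w):
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1 x 1 cross-channel mixing
    return depthwise + pointwise

std = standard_conv_macs(3, 32, 64, 56, 56)
sep = separable_conv_macs(3, 32, 64, 56, 56)
assert sep < std
ratio = std / sep   # approaches 1 / (1/c_out + 1/k**2), roughly 8x here
```

The reduction factor approaches 1/(1/C_out + 1/k²), which is why the encoder stays accurate while the model size and computation drop sharply.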
The first down-sampling encoder performs feature extraction on the input training image and outputs the first feature map, which is used as the input image of the first up-sampling decoder; step S220 is then performed.
Specifically, each of the four up-sampling layers of the up-sampling decoder consists of a transposed convolution layer and a normalization layer; the transposed convolution layer is used to enlarge the image and extract image features, and the normalization layer is used to prevent the parameters of different up-sampling layers from influencing one another. As an example, as shown in FIG. 4, after first feature map 2 is input into the up-sampling decoder, it passes through the first up-sampling layer 50; the output of the first up-sampling layer 50 is combined with the output of the fourth down-sampling layer 40 of the down-sampling encoder and input into the second up-sampling layer 60. Next, the output of the second up-sampling layer 60 is combined with the output of the third down-sampling layer 30 of the down-sampling encoder and input into the third up-sampling layer 70. Then the output of the third up-sampling layer 70 is combined with the output of the second down-sampling layer 20 of the down-sampling encoder and input into the fourth up-sampling layer 80. Finally, the output of the fourth up-sampling layer 80 is combined with the output of the first down-sampling layer 10 of the down-sampling encoder to obtain first segmentation map 3.
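The skip-connection wiring described above can be sketched structurally as follows. The layers are stand-in numeric functions and "combine" is modelled as plain addition, purely to show the data flow between down-sampling layers 10-40 and up-sampling layers 50-80, not the real tensor operations.

```python
# Structural sketch of the encoder-decoder wiring: each down-sampling
# layer's output is kept and combined with the output of the matching
# up-sampling layer (layer 40 with layer 50, layer 30 with layer 60, ...).

def forward(x, down, up, combine=lambda a, b: a + b):
    d1 = down[0](x)
    d2 = down[1](d1)
    d3 = down[2](d2)
    d4 = down[3](d3)                # d4 is the first feature map
    u1 = up[0](d4)
    u2 = up[1](combine(u1, d4))     # up layer 50 output + down layer 40 output
    u3 = up[2](combine(u2, d3))     # up layer 60 output + down layer 30 output
    u4 = up[3](combine(u3, d2))     # up layer 70 output + down layer 20 output
    return combine(u4, d1)          # + down layer 10 output: segmentation map

down_layers = [lambda v: v * 2] * 4   # stand-ins for the down-sampling layers
up_layers = [lambda v: v + 1] * 4     # stand-ins for the up-sampling layers
result = forward(1, down_layers, up_layers)
assert result == 50
```

Feeding each encoder stage's output back into the matching decoder stage is what the text means by "participating in the up-sampling process" to preserve gradients.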
In one implementation, the first segmentation map contains a first standard probability and a first abnormality probability corresponding to the pixels in the training image; the first standard probability is the probability that a pixel is normal under a first preset classification condition, the first abnormality probability is the probability that the pixel is abnormal under the first preset classification condition, and the sum of the first abnormality probability and the first standard probability is 1. For example, when the teacher model is a model for automatically detecting polyps, its corresponding first preset classification condition is whether a polyp is present; the first standard probability is then the probability that a pixel in the training image is normal (i.e. has no polyp), and the first abnormality probability is the probability that the pixel corresponds to a polyp.
Specifically, after the training image is input into the teacher model, the first down-sampling encoder down-samples the training image and obtains a first feature map of size W'×H'×N, where W', H' and N are the height, width and number of channels of the first feature map. The first up-sampling decoder up-samples the first feature map and obtains the first segmentation map p^{T_i} ∈ R^{W×H×2}, where T_i, i ∈ {1, 2, 3, ..., k} denotes the i-th teacher model, k is the number of teacher models, and each teacher model corresponds to one specific classification category. For the channel-wise probabilities p_j^{T_i}, j ∈ {1, 2}: p_1^{T_i} is the probability that each pixel is predicted to be normal under the first preset classification condition, i.e. the first standard probability, and p_2^{T_i} is the probability that each pixel is predicted to be abnormal under the first preset classification condition, i.e. the first abnormality probability; the sum of p_1^{T_i} and p_2^{T_i} is 1. Here j is the image channel: j = 1 denotes channel 1, which outputs the first standard probability, and j = 2 denotes channel 2, which outputs the first abnormality probability; R is the set of real numbers.
As an example, suppose there are four teacher models: teacher model A, teacher model B, teacher model C and teacher model D, whose classification conditions are, respectively, whether a polyp is present, whether a Meckel's diverticulum is present, whether an ulcer is present and whether bleeding is present, so that they are trained to automatically detect polyps, Meckel's diverticulum, ulcers and bleeding respectively. After the training image is input into teacher model A, a first segmentation map (0.1, 0.9) is obtained: 0.1 is the probability output by channel 1 that the predicted pixel is normal, and 0.9 is the probability output by channel 2 that the predicted pixel corresponds to a polyp. After the training image is input into teacher model B, a first segmentation map (0.2, 0.8) is obtained: 0.2 is the probability output by channel 1 that the predicted pixel is normal, and 0.8 is the probability output by channel 2 that the predicted pixel corresponds to a Meckel's diverticulum. After the training image is input into teacher model C, a first segmentation map (0.3, 0.7) is obtained: 0.3 is the probability output by channel 1 that the predicted pixel is normal, and 0.7 is the probability output by channel 2 that the predicted pixel corresponds to an ulcer. After the training image is input into teacher model D, a first segmentation map (0.4, 0.6) is obtained: 0.4 is the probability output by channel 1 that the predicted pixel is normal, and 0.6 is the probability output by channel 2 that the predicted pixel corresponds to bleeding.
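The two-channel pairs in the example above always sum to 1 per pixel, which is the behaviour a softmax over two raw channel scores produces. The sketch below shows this mapping for a single pixel; the raw scores are made-up numbers, and a softmax output head is an assumption for illustration, as the patent passage above only states the sum-to-one property.

```python
import math

# Sketch of a teacher model's two-channel output for one pixel: raw
# channel scores are mapped through a softmax so that the first standard
# probability (channel 1) and the first abnormality probability (channel 2)
# always sum to 1.

def pixel_probabilities(score_normal, score_abnormal):
    e1, e2 = math.exp(score_normal), math.exp(score_abnormal)
    total = e1 + e2
    return e1 / total, e2 / total

# e.g. a pixel the polyp teacher model scores as likely abnormal
p_normal, p_polyp = pixel_probabilities(0.5, 2.7)
assert abs(p_normal + p_polyp - 1.0) < 1e-9
assert p_polyp > p_normal
```

Whatever the raw scores, the pair returned behaves like the (0.1, 0.9)-style first segmentation map entries in the example.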
In order to assess the correctness of the teacher model's predictions during training, the method further includes step S300: modifying the parameters of the teacher model according to the first segmentation map and the first real label, and continuing to perform the step of inputting the training image into the teacher model to obtain a first segmentation map until the preset training conditions of the teacher model are met, so as to obtain a trained teacher model. The first real label reflects the real result corresponding to the pixels of the training image under the first preset classification condition.
In the actual training process, each training image has a corresponding real label, which is used to evaluate the classification (prediction) performance of the model. The real label used to train the teacher model is the first real label, which indicates the real result corresponding to the training image under the first preset classification condition. The purpose of training is to bring the output of the teacher model ever closer to the real label; therefore, the teacher model continuously corrects its parameters during training, thereby controlling the training process and guiding it to converge in the optimal direction.
As shown in FIG. 5, step S300 specifically includes the following steps:
Step S310: calculating a first loss value according to the first segmentation map and the first real label;
Step S320: adjusting the parameters of the first up-sampling decoder according to the first loss value, so as to update the teacher model;
Step S330: continuing to perform the step of inputting the training image into the teacher model to obtain a first segmentation map until the preset training conditions of the teacher model are met, so as to obtain a trained teacher model.
By continuously comparing the first segmentation map with the first real label, the gap between the teacher model's prediction and the real result can be obtained, so that the teacher model can determine how to correct its parameters according to this gap and achieve better predictions. Specifically, the teacher model classifies the training image according to the first preset classification condition to obtain a first segmentation map, and the first segmentation map and the first real label are substituted into the formula for the first loss value to obtain the first loss value, which represents the gap between the first segmentation map and the first real label. The first loss value is calculated as follows:
L_{T_i} = -\sum_{w=1}^{W}\sum_{h=1}^{H}\sum_{j=1}^{2} y_{T_i,j}^{w,h} \log p_{T_i,j}^{w,h}
Since the first loss value represents the gap between the first segmentation map and the first real label, a larger first loss value indicates a larger gap between the first segmentation map and the first real label and hence a worse classification performance of the teacher model, while a smaller first loss value indicates a smaller gap and hence a better classification performance.
where p_{T_i,j}^{w,h} is the value of the first segmentation map output by teacher model T_i for the pixel in row w and column h of channel j, and represents the prediction of teacher model T_i for that pixel, and y_{T_i,j}^{w,h} is the corresponding real label of the pixel in row w and column h of channel j of the training image, indicating the true classification of that pixel under the classification condition corresponding to teacher model T_i. The label is written y_{T_i}^{w,h} = (y_{T_i,1}^{w,h}, y_{T_i,2}^{w,h}), j ∈ {1, 2}, where y_{T_i,1}^{w,h} is the normal label of the pixel in teacher model T_i and y_{T_i,2}^{w,h} is its abnormal (diseased) label. The real label y_{T_i}^{w,h} is a one-hot vector: in any one label, only one channel has a non-zero value and all other channels are 0. In other words, the real label of a teacher model takes only two forms, (1, 0) and (0, 1): (1, 0) indicates that the pixel is actually normal, and (0, 1) indicates that it is actually abnormal. The prediction quality of the first segmentation map p_{T_i,j}^{w,h} is evaluated against the real label y_{T_i,j}^{w,h} by substituting both into the formula above to obtain the first loss value of the teacher model: the larger the first loss value, the worse the teacher model's prediction; the smaller the first loss value, the better the prediction.
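The per-pixel cross-entropy described above can be sketched in a few lines of plain Python; the function names and the map layout (nested lists of (channel 1, channel 2) pairs) are illustrative, not part of the patent:

```python
import math

def teacher_pixel_loss(p, y):
    """Cross-entropy between a teacher's two-channel prediction p = (p1, p2)
    and the one-hot real label y = (y1, y2) for a single pixel."""
    return -sum(yj * math.log(pj) for pj, yj in zip(p, y))

def teacher_loss(pred_map, label_map):
    """Sum the per-pixel losses over a W x H prediction map."""
    return sum(teacher_pixel_loss(p, y)
               for row_p, row_y in zip(pred_map, label_map)
               for p, y in zip(row_p, row_y))

# Teacher model A from the example above: the pixel is predicted (0.1, 0.9)
# and is truly abnormal, so its label is (0, 1).
loss = teacher_pixel_loss((0.1, 0.9), (0, 1))  # -log(0.9), about 0.105
```

A confident correct prediction like (0.1, 0.9) against (0, 1) yields a small loss, while an uncertain prediction like (0.5, 0.5) yields a larger one, matching the discussion above.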
After the training of the teacher model is completed, a trained teacher model is obtained. The trained teacher model can then be used to distill the student model; the process of distilling the student model is the training process of the student model.
Therefore, as shown in FIG. 1, the method further includes step S400: inputting the training image into the student model to obtain a second segmentation map. Specifically, the student model collects specific feature information from the training image, classifies the pixels of the training image according to preset classification conditions, and outputs a classification result, which is the second segmentation map.
As shown in FIG. 6, step S400 specifically includes the following steps:
Step S410: performing feature extraction on the training image with the second down-sampling encoder and outputting a second feature map, the second feature map containing the feature information of the training image;
Step S420: parsing the second feature map with the second up-sampling decoder to obtain the second segmentation map.
The second segmentation map contains a second standard probability and a second abnormal probability corresponding to each pixel in the training image. The second standard probability is the probability that the pixel is standard under the second preset classification condition, and the second abnormal probability is the probability that the pixel is abnormal under the second preset classification condition. The number of categories of the second preset classification condition is greater than that of the first preset classification condition.
Specifically, the structure of the student model is similar to that of the teacher model; both contain a down-sampling encoder and an up-sampling decoder. The down-sampling encoder in the student model is the second down-sampling encoder, and the up-sampling decoder in the student model is the second up-sampling decoder. The second down-sampling encoder and the second up-sampling decoder likewise consist of four down-sampling layers and four up-sampling layers, respectively, and are connected to each other: the four down-sampling layers are connected one-to-one to the four up-sampling layers, and the outputs of the four down-sampling layers are added to their corresponding up-sampling layers to participate in the up-sampling process, so as to maintain the gradient of the student model. After the training image is input into the student model, it passes in sequence through the four down-sampling layers of the second down-sampling encoder to obtain the second feature map. The second feature map then passes in sequence through the four up-sampling layers of the second up-sampling decoder, and the final output is the second segmentation map. The main differences between the student model and the teacher model are that the classification condition of the student model has more categories than that of the teacher model, so the prediction output by the student model has more dimensions than that output by the teacher model, and that the intermediate layers of the student model have fewer channels than those of the teacher model, which reduces the overall size of the student model.
Specifically, this embodiment likewise uses four layers of MobileNetV2 as the four down-sampling layers of the second down-sampling encoder, takes the output image of each down-sampling layer as the input image of the next, and continues the feature-extraction step until the fourth down-sampling layer completes feature extraction on its input image and outputs the second feature map (for details, see step S210). Likewise, each down-sampling layer in the second down-sampling encoder consists of an inverted residual module built from depthwise separable convolutions. Specifically, the inverted residual module first performs a pointwise convolution on the input image to expand its number of channels, then performs a depthwise convolution to extract image features, and finally performs another pointwise convolution to compress the number of channels. This reduces the size of the student model without losing accuracy in extracting image features.
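The size saving from replacing a standard convolution with a depthwise separable one can be checked with a quick parameter count; the 64-in / 128-out layer below is an illustrative size, not one taken from the patent:

```python
def standard_conv_params(c_in, c_out, k=3):
    # A standard k x k convolution couples every input channel
    # to every output channel.
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k=3):
    # Depthwise k x k convolution (one filter per input channel),
    # followed by a pointwise 1 x 1 convolution that mixes channels.
    return c_in * k * k + c_in * c_out

# Illustrative layer: 64 input channels, 128 output channels, 3 x 3 kernel.
std = standard_conv_params(64, 128)        # 73,728 parameters
sep = depthwise_separable_params(64, 128)  # 8,768 parameters
ratio = std / sep                          # roughly 8.4x fewer parameters
```

This roughly order-of-magnitude reduction is why the inverted residual design keeps the student model small.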
The second down-sampling encoder performs feature extraction on the input training image and outputs the second feature map, which serves as the input image of the second up-sampling decoder; step S420 is then performed.
In a specific implementation, the outputs of the four down-sampling layers of the second down-sampling encoder are added one-to-one to the corresponding four up-sampling layers of the second up-sampling decoder to participate in the up-sampling process, so as to maintain the gradient of the student model. The second feature map passes in sequence through the four up-sampling layers of the second up-sampling decoder, and the final output is the second segmentation map (for details, see step S220).
In one implementation, the second segmentation map contains a second standard probability and a second abnormal probability corresponding to each pixel in the training image. The second standard probability is the probability that the pixel is standard under the second preset classification condition, and the second abnormal probability is the probability that the pixel is abnormal under the second preset classification condition. The number of categories of the second preset classification condition is greater than that of the first preset classification condition.
Specifically, after the training image is input into the student model, the second down-sampling encoder down-samples the training image to obtain a second feature map of size W′ × H′ × N, where W′, H′, and N are the height, width, and number of channels of the second feature map. The second up-sampling decoder up-samples the second feature map to obtain the second segmentation map p_S^{w,h} = (p_{S,normal}^{w,h}, p_{S,1}^{w,h}, p_{S,2}^{w,h}, ..., p_{S,N}^{w,h}), where S denotes the student model, n ∈ {normal, 1, 2, ..., N} indexes the image channel (for example, n = 1 denotes channel 1), N is a positive integer with N ≥ 2 whose value is related to the number of teacher models, and R is the set of real numbers. p_{S,normal}^{w,h} is the minimum standard probability of the pixel under the second preset classification condition, p_{S,1}^{w,h} is the probability that the pixel belongs to the first abnormality class under the second preset classification condition, p_{S,2}^{w,h} is the probability that it belongs to the second abnormality class, and so on.
For example, since the present invention distills the knowledge contained in multiple trained teacher models into the student model, the classification conditions of the student model are related to the classification conditions of all the trained teacher models. Suppose there are currently four teacher models: teacher model A, teacher model B, teacher model C, and teacher model D, whose classification conditions are, respectively, whether a polyp is present, whether a Meckel's diverticulum is present, whether an ulcer is present, and whether bleeding is present, and which are trained to automatically detect polyps, Meckel's diverticula, ulcers, and bleeding, respectively. The second preset classification condition of the student model distilled from these four teacher models then has four categories: the first is whether a polyp is present, the second is whether a Meckel's diverticulum is present, the third is whether an ulcer is present, and the fourth is whether bleeding is present. If a second segmentation map (0.2, 0.8, 0.7, 0.5, 0.3) is obtained after a training image is input into the student model, the predicted probability that the pixel belongs to a polyp is 0.8, to a Meckel's diverticulum 0.7, to an ulcer 0.5, and to bleeding 0.3. From these disease probabilities, the probability of not having a polyp is 1 − 0.8 = 0.2, of not having a Meckel's diverticulum 1 − 0.7 = 0.3, of not having an ulcer 1 − 0.5 = 0.5, and of not having bleeding 1 − 0.3 = 0.7. Since the probability of not having a polyp is the minimum of the normal probabilities across all diseases, the value 0.2 is kept in the second segmentation map as the normal probability of the pixel, which avoids an excessively high normal probability making the student model's prediction inaccurate.
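The minimum-normal-probability rule in this example can be sketched directly (the function and variable names are illustrative):

```python
def normal_probability(abnormal_probs):
    # Keep the smallest "not diseased" probability across all abnormality
    # classes as the normal probability of the pixel, as described above.
    return min(1.0 - p for p in abnormal_probs)

# Polyp, Meckel's diverticulum, ulcer, bleeding probabilities from the example:
probs = [0.8, 0.7, 0.5, 0.3]
normal = normal_probability(probs)   # 1 - 0.8 = 0.2
segmentation = [normal] + probs      # (0.2, 0.8, 0.7, 0.5, 0.3)
```

Taking the minimum rather than, say, the mean is what prevents the overly high normal probabilities mentioned above from dominating the pixel's normal channel.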
In a specific implementation, the prediction results of the student model are shown in FIG. 11, in which column A contains the training images input into the student model, column B contains the corresponding second real labels (real category maps), and column C contains the corresponding output second segmentation maps (prediction maps).
In order to assess the correctness of the student model's classification during training, the method further includes:
Step S500: modifying the parameters of the student model according to the second segmentation map, the teacher label, and the second real label, and continuing to perform the step of inputting the training image into the student model to obtain a second segmentation map until the preset training conditions of the student model are met, so as to obtain a trained student model. The second real label reflects the true classification of the pixels of the training image under the second preset classification condition, and the teacher label reflects the classification of the training image by the trained teacher models.
The real label used to train the student model is the second real label, which indicates the true classification of the pixels in the training image under the second preset classification condition. The purpose of training is to bring the output of the student model ever closer to the real label; therefore, the student model continuously corrects its parameters during training, thereby controlling the training process and guiding it to converge in the optimal direction.
As shown in FIG. 7, step S500 specifically includes the following steps:
Step S510: calculating a second loss value according to the second segmentation map, the teacher label, and the second real label;
Step S520: adjusting the parameters of the second up-sampling decoder according to the second loss value, so as to update the student model;
Step S530: continuing to perform the step of generating a second segmentation map according to the training image until the preset training conditions of the student model are met, so as to obtain a trained student model.
By continuously comparing the second segmentation map with the second real label, the gap between the student model's prediction and the true classification can be obtained, so that the student model can determine how to correct its parameters according to this gap and achieve better predictions. Specifically, the student model classifies the training image according to the second preset classification condition to obtain a second segmentation map, and the second segmentation map, the teacher label, and the second real label are substituted into the formula for the second loss value to obtain the second loss value, which represents the gap between the second segmentation map and the second real label. The second loss value is calculated as follows:
L_S = -\sum_{w=1}^{W}\sum_{h=1}^{H}\sum_{j} \left[ \lambda\, y_{S,j}^{w,h} \log p_{S,j}^{w,h} + (1-\lambda)\, p_{T,j}^{w,h} \log p_{S,j}^{w,h} \right]

where λ ∈ (0, 1) is a preset weight balancing the real-label term against the teacher-label term.
Since the second loss value represents the gap between the second segmentation map and the second real label, a larger second loss value indicates a larger gap and hence a worse classification performance of the student model, while a smaller second loss value indicates a smaller gap and hence a better classification performance.
where p_{S,j}^{w,h} is the value of the second segmentation map output by the student model for the pixel in row w and column h of channel j; p_{T,j}^{w,h} is the teacher label of that pixel, which represents the prediction of the trained teacher models for the pixel; and y_{S,j}^{w,h} is the real label of the pixel in row w and column h of channel j of the training image, i.e. the second real label, which indicates the true classification of that pixel under the second preset classification condition.
Specifically, the second real label y_S^{w,h} has the form y_S^{w,h} = (y_normal, y_1^S, y_2^S, ..., y_n^S), where y_normal is the label indicating that the pixel is standard (normal) under the second preset classification condition, y_1^S is the label for the first abnormality class, y_2^S the label for the second abnormality class, and y_n^S the label for the n-th abnormality class. The second real label is likewise a one-hot vector: in any one label, only one channel has a non-zero value and all other channels are 0. In other words, a label can mark the pixel only as normal or as exactly one of the n abnormality classes.
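With these definitions, a per-pixel loss combining a cross-entropy term against the one-hot second real label and a cross-entropy term against the soft teacher label can be sketched as below. The balancing weight `lam` and the function names are assumptions introduced for illustration; this is a standard distillation-style combination, not necessarily the exact formula of the embodiment:

```python
import math

def student_pixel_loss(p_s, p_t, y_s, lam=0.5):
    """Per-pixel loss: lam weights the cross-entropy against the one-hot
    second real label y_s, and (1 - lam) weights the cross-entropy against
    the soft teacher label p_t. lam is an illustrative assumption."""
    ce_truth = -sum(y * math.log(p) for p, y in zip(p_s, y_s))
    ce_teacher = -sum(t * math.log(p) for p, t in zip(p_s, p_t))
    return lam * ce_truth + (1.0 - lam) * ce_teacher

# Student prediction, teacher label, and one-hot ground truth for one pixel:
p_s = [0.1, 0.6, 0.1, 0.1, 0.1]            # (normal, polyp, ...) channels
p_t = [0.032, 0.290, 0.258, 0.226, 0.194]  # normalized teacher label
y_s = [0, 1, 0, 0, 0]                      # the pixel truly shows a polyp
loss = student_pixel_loss(p_s, p_t, y_s)
```

Setting `lam=1.0` recovers plain supervised cross-entropy, while `lam=0.0` trains purely against the teacher label.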
In one implementation, the teacher label is obtained from the first segmentation maps output by all trained teacher models. As shown in FIG. 8, step S510 includes the following steps:
Step S511: obtaining a total probability value according to the first segmentation maps output by all trained teacher models;
Step S512: adjusting the first segmentation maps output by all trained teacher models according to the total probability value, so as to obtain the teacher label.
Since the first segmentation map output by a teacher model and the second segmentation map output by the student model have different dimensions (equivalently, different numbers of channels), the first segmentation maps output by all teacher models must first be adjusted so that the teacher label matches the dimension of the student model's output. In this embodiment, the first segmentation maps output by all teacher models are adjusted by the following formulas to obtain the teacher label:
D = p_{normal} + \sum_{i=1}^{N} p_{T_i,2}^{w,h}, \qquad p_{normal} = \min_{i} p_{T_i,1}^{w,h}

p_T = \frac{1}{D}\left(p_{normal},\; p_{T_1,2}^{w,h},\; p_{T_2,2}^{w,h},\; \ldots,\; p_{T_N,2}^{w,h}\right)
where D is the total probability value and p_T is the teacher label. Specifically, a first vector (p_{normal}, p_{T_1,2}^{w,h}, p_{T_2,2}^{w,h}, ..., p_{T_N,2}^{w,h}) is first formed from the first segmentation maps output by the N trained teacher models: the first vector keeps the abnormal probability p_{T_i,2}^{w,h} from each teacher model's first segmentation map, and keeps the smallest normal probability among all first segmentation maps, min_i p_{T_i,1}^{w,h}, as p_{normal}. According to the formula for the total probability value D, p_{normal} is added to all the abnormal probabilities in the first vector to obtain the total probability value. The first vector is then divided by the total probability value D to obtain a second vector, and the second vector is the teacher label p_T.
For example, suppose there are four teacher models A, B, C, and D whose first segmentation maps are (0.1, 0.9), (0.2, 0.8), (0.3, 0.7), and (0.4, 0.6), respectively. The first vector (0.1, 0.9, 0.8, 0.7, 0.6) is first obtained from these first segmentation maps; all probabilities in the first vector are then added to obtain the total probability value 3.1, i.e. 0.1 + 0.9 + 0.8 + 0.7 + 0.6 = 3.1; finally the first vector is divided by the total probability value, i.e. every probability in the first vector is divided by 3.1, to obtain the second vector, which is the teacher label.
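The aggregation in this example can be sketched as follows (the `(normal, abnormal)` pair layout for each teacher's output is illustrative):

```python
def teacher_label(teacher_outputs):
    """Build the teacher label from the first segmentation maps of the
    trained teacher models, each given as a (normal, abnormal) pair,
    following the rule described above."""
    normal = min(p_normal for p_normal, _ in teacher_outputs)
    first_vector = [normal] + [p_abnormal for _, p_abnormal in teacher_outputs]
    d = sum(first_vector)                 # total probability value D
    return [p / d for p in first_vector]  # second vector = teacher label

# Teacher models A, B, C, D from the example:
outputs = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.4, 0.6)]
label = teacher_label(outputs)  # first vector (0.1, 0.9, 0.8, 0.7, 0.6) / 3.1
```

Dividing by D normalizes the second vector so its entries sum to 1, making it comparable to the student model's per-pixel output.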
The second vector is p_T = (0.1/3.1, 0.9/3.1, 0.8/3.1, 0.7/3.1, 0.6/3.1) ≈ (0.032, 0.290, 0.258, 0.226, 0.194).
After the training of the student model is completed, a trained student model is obtained, and the trained student model can be used for real-time colonoscopy image segmentation. That is, as shown in FIG. 1, the method further includes step S600: inputting real-time colonoscopy images into the trained student model to generate real-time colonoscopy image segmentation maps.
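At inference time, turning the student model's per-pixel probability vector into a class map is typically done by taking the highest-probability channel; the decoding rule and class names below are a common post-processing assumption, not something the patent specifies:

```python
def segment_pixel(probs, classes=("normal", "polyp", "meckel", "ulcer", "bleeding")):
    # Assign the pixel the class whose channel has the highest probability.
    # The class names are illustrative.
    best = max(range(len(probs)), key=lambda i: probs[i])
    return classes[best]

def segment_image(prob_map):
    # Apply the per-pixel rule over the whole W x H output map.
    return [[segment_pixel(p) for p in row] for row in prob_map]

label = segment_pixel([0.2, 0.8, 0.7, 0.5, 0.3])  # "polyp"
```

Because the student model is a single lightweight network, this per-pixel decoding is cheap enough to run frame by frame on live colonoscopy video.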
基于上述实施例,如图10所示,本发明还提供一种基于集成知识蒸馏的实时肠镜影像分割的装置,其中,所述装置包括:图像获取模块120,所述图像获取模块120用于获取训练图像;教师模型单元130,所述教师模型单元130用于根据所述训练图像获得第一分割图;第一参数修正模块110,所述第一参数修正模块110用于根据所述第一分割图和第一真实标签,对所述教师模型的参数进行修正;学生模型单元170,所述学生模型单元170用于根据所述训练图像获得第二分割图;第二参数修正模块160,所述第二参数修正模块160用于根据所述第二分割图、教师标签和第二真实标签,对所述学生模型的参数进行修正;Based on the above embodiment, as shown in FIG. 10 , the present invention further provides an apparatus for real-time colonoscopy image segmentation based on integrated knowledge distillation, wherein the apparatus includes: an image acquisition module 120, and the image acquisition module 120 is used for Acquire a training image; a teacher model unit 130, the teacher model unit 130 is used to obtain a first segmentation map according to the training image; a first parameter correction module 110, the first parameter correction module 110 is used to The segmentation map and the first real label, modify the parameters of the teacher model; the student model unit 170, the student model unit 170 is used to obtain the second segmentation map according to the training image; the second parameter correction module 160, the The second parameter modification module 160 is configured to modify the parameters of the student model according to the second segmentation map, the teacher label and the second real label;
所述教师模型单元130还包括:第一下采样编码器模块90,所述第一下采样编码器模块90用于对所述训练图像进行特征提取,得到第一特征图;第一上采样解码器模块100,所述第一上采样解码器模块100用于对所述第一特征图进行解析,得到所述第一分割图;The teacher model unit 130 further includes: a first downsampling encoder module 90, the first downsampling encoder module 90 is configured to perform feature extraction on the training image to obtain a first feature map; a first upsampling decoding a decoder module 100, the first upsampling decoder module 100 is configured to parse the first feature map to obtain the first segmentation map;
The student model unit 170 further includes: a second downsampling encoder module 140, configured to perform feature extraction on the training images to obtain a second feature map; and a second upsampling decoder module 150, configured to parse the second feature map to obtain the second segmentation map.
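The encoder/decoder pairing described for both the teacher and student units can be illustrated with a deliberately simplified numpy sketch: a 2x2 average pooling stands in for the downsampling encoder, and nearest-neighbour upsampling followed by a per-pixel softmax stands in for the upsampling decoder. Real teacher and student networks would use learned convolutional layers; everything here is a hypothetical stand-in chosen only to show the shapes flowing through the two modules.

```python
import numpy as np

def downsample_encode(img):
    """2x2 average pooling: a stand-in for a downsampling encoder
    that halves spatial resolution while keeping channel count."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample_decode(feat):
    """Nearest-neighbour upsampling plus a per-pixel two-class
    softmax: a stand-in for an upsampling decoder producing a
    segmentation map of the original spatial size."""
    up = feat.repeat(2, axis=0).repeat(2, axis=1)        # restore H, W
    score = up.mean(axis=-1)
    logits = np.stack([score, -score], axis=-1)          # (H, W, 2)
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)             # rows sum to 1

img = np.random.rand(8, 8, 3)       # a toy "training image"
feat = downsample_encode(img)       # (4, 4, 3) feature map
seg = upsample_decode(feat)         # (8, 8, 2) segmentation map
```

The same shape contract applies to both units: the encoder compresses the training image into a feature map, and the decoder expands that feature map back into per-pixel class probabilities.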
Based on the above embodiments, the present invention further provides a non-transitory computer-readable storage medium storing a data storage program which, when executed by a processor, implements the steps of the real-time colonoscopy image segmentation method based on ensemble knowledge distillation described above.
Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Based on the above embodiments, the present invention further provides a terminal including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors to perform the real-time colonoscopy image segmentation method based on ensemble knowledge distillation described in any of the above. A functional block diagram of the terminal may be as shown in FIG. 9. The terminal includes a processor, a memory, and a network interface connected through a system bus. The processor of the terminal provides computing and control capabilities. The memory of the terminal includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the terminal is used to communicate with external terminals over a network connection. The computer program, when executed by the processor, implements a real-time colonoscopy image segmentation method based on ensemble knowledge distillation.
Those skilled in the art will understand that the block diagram shown in FIG. 9 is only a block diagram of the partial structure related to the solution of the present invention and does not limit the intelligent terminal to which the solution is applied; a specific intelligent terminal may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components. In addition, the real-time colonoscopy image segmentation method based on ensemble knowledge distillation described in any of the above may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the flows of the embodiments of the above methods. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory, as described above.
To sum up, the present invention acquires multiple training images that are divided into multiple training image sets, with the training images in one set coming from the same data set. The teacher models are trained first, with different teacher models obtaining first segmentation maps from different training image sets; the trained teacher models are then used jointly to distill a single student model. The training images are screenshots of colonoscopy footage, and the trained student model can generate real-time segmentation maps from real-time colonoscopy images. This solves the problem that the data sets of different hospitals are disjoint and cannot be pooled together to train an automatic image segmentation model for colonoscopy.
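The ensemble step summarized above (and elaborated in claim 7) can be sketched numerically: the first segmentation maps output by all trained teachers are summed into a per-pixel probability total, that total is renormalized to form the soft teacher label, and the student's loss combines cross-entropy against this teacher label with cross-entropy against the ground-truth label. The `alpha` weighting and the exact loss form are assumptions for illustration; the patent does not fix them here.

```python
import numpy as np

def ensemble_teacher_label(teacher_maps):
    """Sum the first segmentation maps of all trained teachers into a
    per-pixel probability total, then renormalise that total into a
    single soft teacher label (a sketch of the two steps of claim 7)."""
    total = np.sum(teacher_maps, axis=0)               # probability total
    return total / total.sum(axis=-1, keepdims=True)   # adjusted teacher label

def student_loss(student_map, teacher_label, ground_truth, alpha=0.5):
    """Second loss value as a weighted sum of cross-entropy against the
    soft teacher label and against the one-hot ground-truth label;
    `alpha` is an assumed weighting, not specified by the patent."""
    eps = 1e-12
    kd = -(teacher_label * np.log(student_map + eps)).sum(axis=-1).mean()
    ce = -(ground_truth * np.log(student_map + eps)).sum(axis=-1).mean()
    return alpha * kd + (1.0 - alpha) * ce

# Two toy teachers over a 2x2 image with two classes (standard/abnormal).
t1 = np.full((2, 2, 2), 0.5)
t2 = np.stack([np.full((2, 2), 0.8), np.full((2, 2), 0.2)], axis=-1)
label = ensemble_teacher_label([t1, t2])       # soft teacher label

gt = np.stack([np.ones((2, 2)), np.zeros((2, 2))], axis=-1)  # one-hot truth
loss = student_loss(label, label, gt)
```

Because each teacher is trained only on its own hospital's data set, this averaging is what lets the student absorb knowledge from all hospitals without the raw data sets ever being pooled.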
It should be understood that the application of the present invention is not limited to the above examples. Those of ordinary skill in the art may make improvements or modifications in light of the above description, and all such improvements and modifications shall fall within the protection scope of the appended claims of the present invention.

Claims (10)

  1. A real-time colonoscopy image segmentation method based on ensemble knowledge distillation, characterized in that the method comprises:
    acquiring training images, the training images being screenshots of colonoscopy footage used to train a teacher model and a student model, wherein the training images are divided into multiple training image sets and the training images of one set come from the same data set;
    inputting the training images into the teacher model to obtain a first segmentation map, wherein the number of teacher models is greater than or equal to two, and different teacher models obtain first segmentation maps from different training image sets;
    correcting the parameters of the teacher model according to the first segmentation map and a first ground-truth label, and continuing to perform the step of inputting the training images into the teacher model to obtain a first segmentation map until a preset training condition of the teacher model is satisfied, so as to obtain a trained teacher model, the first ground-truth label reflecting the true classification of the pixels of the training image under a first preset classification condition;
    inputting the training images into the student model to obtain a second segmentation map;
    correcting the parameters of the student model according to the second segmentation map, a teacher label, and a second ground-truth label, and continuing to perform the step of inputting the training images into the student model to obtain a second segmentation map until a preset training condition of the student model is satisfied, so as to obtain a trained student model, the second ground-truth label reflecting the true classification of the pixels of the training image under a second preset classification condition and the teacher label reflecting the classification of the training image by the trained teacher models; and
    inputting real-time colonoscopy images into the trained student model to generate real-time colonoscopy image segmentation maps.
  2. The method according to claim 1, characterized in that acquiring the training images, the training images being screenshots of colonoscopy footage used to train the teacher model and the student model, comprises:
    acquiring screenshots of colonoscopy footage; and
    compressing the screenshots to obtain the training images, wherein the height, width, and number of channels of the training images are all constant.
  3. The method according to claim 1, characterized in that the teacher model comprises a first downsampling encoder and a first upsampling decoder, and inputting the training images into the teacher model to obtain a first segmentation map comprises:
    performing feature extraction on the training image with the first downsampling encoder to obtain a first feature map, the first feature map containing the feature information of the training image; and
    parsing the first feature map with the first upsampling decoder to obtain the first segmentation map,
    wherein the first segmentation map contains a first standard probability and a first abnormal probability for each pixel of the training image, the first standard probability being the probability that the pixel is standard under the first preset classification condition and the first abnormal probability being the probability that the pixel is abnormal under the first preset classification condition, and the sum of the first abnormal probability and the first standard probability is 1.
  4. The method according to claim 3, characterized in that correcting the parameters of the teacher model according to the first segmentation map and the first ground-truth label, and continuing to perform the step of inputting the training images into the teacher model to obtain a first segmentation map until the preset training condition of the teacher model is satisfied, so as to obtain a trained teacher model, comprises:
    calculating a first loss value according to the first segmentation map and the first ground-truth label;
    adjusting the parameters of the first upsampling decoder according to the first loss value to update the teacher model; and
    continuing to perform the step of inputting the training images into the teacher model to obtain a first segmentation map until the preset training condition of the teacher model is satisfied, so as to obtain the trained teacher model.
  5. The method according to claim 1, characterized in that the student model comprises a second downsampling encoder and a second upsampling decoder, and inputting the training images into the student model to obtain a second segmentation map comprises:
    performing feature extraction on the training image with the second downsampling encoder to output a second feature map, the second feature map containing the feature information of the training image; and
    parsing the second feature map with the second upsampling decoder to obtain the second segmentation map,
    wherein the second segmentation map contains a second standard probability and a second abnormal probability for each pixel of the training image, the second standard probability being the probability that the pixel is standard under the second preset classification condition and the second abnormal probability being the probability that the pixel is abnormal under the second preset classification condition, and the second preset classification condition has more categories than the first preset classification condition.
  6. The method according to claim 5, characterized in that correcting the parameters of the student model according to the second segmentation map, the teacher label, and the second ground-truth label, and continuing to perform the step of generating a second segmentation map according to the training images until the preset training condition of the student model is satisfied, so as to obtain a trained student model, comprises:
    calculating a second loss value according to the second segmentation map, the teacher label, and the second ground-truth label;
    adjusting the parameters of the second upsampling decoder according to the second loss value to update the student model; and
    continuing to perform the step of generating a second segmentation map according to the training images until the preset training condition of the student model is satisfied, so as to obtain the trained student model.
  7. The method according to claim 6, characterized in that calculating the second loss value according to the second segmentation map, the teacher label, and the second ground-truth label comprises:
    obtaining a probability total value according to the first segmentation maps output by all trained teacher models; and
    adjusting the first segmentation maps output by all trained teacher models according to the probability total value to obtain the teacher label.
  8. An apparatus for real-time colonoscopy image segmentation based on ensemble knowledge distillation, characterized in that the apparatus comprises:
    an image acquisition module, configured to acquire training images;
    a teacher model unit, configured to obtain a first segmentation map according to the training images;
    a first parameter correction module, configured to correct the parameters of the teacher model according to the first segmentation map and a first ground-truth label;
    a student model unit, configured to obtain a second segmentation map according to the training images; and
    a second parameter correction module, configured to correct the parameters of the student model according to the second segmentation map, a teacher label, and a second ground-truth label,
    wherein the teacher model unit further comprises:
    a first downsampling encoder module, configured to perform feature extraction on the training images to obtain a first feature map; and
    a first upsampling decoder module, configured to parse the first feature map to obtain the first segmentation map;
    and the student model unit further comprises:
    a second downsampling encoder module, configured to perform feature extraction on the training images to obtain a second feature map; and
    a second upsampling decoder module, configured to parse the second feature map to obtain the second segmentation map.
  9. A terminal, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors to perform the method according to any one of claims 1-7.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the steps of the real-time colonoscopy image segmentation method based on ensemble knowledge distillation according to any one of claims 1 to 7 are implemented.
PCT/CN2020/130114 2020-09-21 2020-11-19 Real-time colonoscopy image segmentation method and device based on ensemble and knowledge distillation WO2022057078A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010997859.XA CN111932561A (en) 2020-09-21 2020-09-21 Real-time enteroscopy image segmentation method and device based on integrated knowledge distillation
CN202010997859.X 2020-09-21

Publications (1)

Publication Number Publication Date
WO2022057078A1 true WO2022057078A1 (en) 2022-03-24

Family

ID=73335334

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/130114 WO2022057078A1 (en) 2020-09-21 2020-11-19 Real-time colonoscopy image segmentation method and device based on ensemble and knowledge distillation

Country Status (2)

Country Link
CN (1) CN111932561A (en)
WO (1) WO2022057078A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114677565A (en) * 2022-04-08 2022-06-28 北京百度网讯科技有限公司 Training method of feature extraction network and image processing method and device
CN115760868A (en) * 2022-10-14 2023-03-07 广东省人民医院 Colorectal and colorectal cancer segmentation method, system, device and medium based on topology perception
CN115829983A (en) * 2022-12-13 2023-03-21 广东工业大学 Knowledge distillation-based high-speed industrial scene visual quality detection method
CN115908441A (en) * 2023-01-06 2023-04-04 北京阿丘科技有限公司 Image segmentation method, device, equipment and storage medium
CN115965609A (en) * 2023-01-03 2023-04-14 江南大学 Intelligent detection method for ceramic substrate defects by knowledge distillation
CN116385274A (en) * 2023-06-06 2023-07-04 中国科学院自动化研究所 Multi-mode image guided cerebral angiography quality enhancement method and device
CN116825130A (en) * 2023-08-24 2023-09-29 硕橙(厦门)科技有限公司 Deep learning model distillation method, device, equipment and medium
CN116993694A (en) * 2023-08-02 2023-11-03 江苏济远医疗科技有限公司 Non-supervision hysteroscope image anomaly detection method based on depth feature filling
CN117765532A (en) * 2024-02-22 2024-03-26 中国科学院宁波材料技术与工程研究所 cornea Langerhans cell segmentation method and device based on confocal microscopic image
CN117765532B (en) * 2024-02-22 2024-05-31 中国科学院宁波材料技术与工程研究所 Cornea Langerhans cell segmentation method and device based on confocal microscopic image

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111932561A (en) * 2020-09-21 2020-11-13 深圳大学 Real-time enteroscopy image segmentation method and device based on integrated knowledge distillation
CN112686856A (en) * 2020-12-29 2021-04-20 杭州优视泰信息技术有限公司 Real-time enteroscopy polyp detection device based on deep learning
CN112819831B (en) * 2021-01-29 2024-04-19 北京小白世纪网络科技有限公司 Segmentation model generation method and device based on convolution Lstm and multi-model fusion
CN112802023A (en) * 2021-04-14 2021-05-14 北京小白世纪网络科技有限公司 Knowledge distillation method and device for pleural lesion segmentation based on lifelong learning
CN113343803B (en) * 2021-05-26 2023-08-22 北京百度网讯科技有限公司 Model training method, device, equipment and storage medium
CN113470025A (en) * 2021-09-02 2021-10-01 北京字节跳动网络技术有限公司 Polyp detection method, training method and related device
WO2023212997A1 (en) * 2022-05-05 2023-11-09 五邑大学 Knowledge distillation based neural network training method, device, and storage medium
CN114926471B (en) * 2022-05-24 2023-03-28 北京医准智能科技有限公司 Image segmentation method and device, electronic equipment and storage medium
CN116091773B (en) * 2023-02-02 2024-04-05 北京百度网讯科技有限公司 Training method of image segmentation model, image segmentation method and device
CN117496509B (en) * 2023-12-25 2024-03-19 江西农业大学 Yolov7 grapefruit counting method integrating multi-teacher knowledge distillation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506740A (en) * 2017-09-04 2017-12-22 北京航空航天大学 A kind of Human bodys' response method based on Three dimensional convolution neutral net and transfer learning model
CN111524124A (en) * 2020-04-27 2020-08-11 中国人民解放军陆军特色医学中心 Digestive endoscopy image artificial intelligence auxiliary system for inflammatory bowel disease
CN111932561A (en) * 2020-09-21 2020-11-13 深圳大学 Real-time enteroscopy image segmentation method and device based on integrated knowledge distillation

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325443B (en) * 2018-09-19 2021-09-17 南京航空航天大学 Face attribute identification method based on multi-instance multi-label deep migration learning
CN110033026B (en) * 2019-03-15 2021-04-02 深圳先进技术研究院 Target detection method, device and equipment for continuous small sample images
CN110472681A (en) * 2019-08-09 2019-11-19 北京市商汤科技开发有限公司 The neural metwork training scheme and image procossing scheme of knowledge based distillation
CN111199242B (en) * 2019-12-18 2024-03-22 浙江工业大学 Image increment learning method based on dynamic correction vector
CN111428191B (en) * 2020-03-12 2023-06-16 五邑大学 Antenna downtilt angle calculation method and device based on knowledge distillation and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506740A (en) * 2017-09-04 2017-12-22 北京航空航天大学 A kind of Human bodys' response method based on Three dimensional convolution neutral net and transfer learning model
CN111524124A (en) * 2020-04-27 2020-08-11 中国人民解放军陆军特色医学中心 Digestive endoscopy image artificial intelligence auxiliary system for inflammatory bowel disease
CN111932561A (en) * 2020-09-21 2020-11-13 深圳大学 Real-time enteroscopy image segmentation method and device based on integrated knowledge distillation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUANG ZHICHAO; WANG ZHAOXIA; CHEN JIE; ZHU ZHONGSHENG; LI JIANQIANG: "Real-time Colonoscopy Image Segmentation Based on Ensemble Knowledge Distillation", 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM), 18 December 2020 (2020-12-18), pages 454 - 459, XP033825531, DOI: 10.1109/ICARM49381.2020.9195281 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114677565A (en) * 2022-04-08 2022-06-28 北京百度网讯科技有限公司 Training method of feature extraction network and image processing method and device
CN115760868A (en) * 2022-10-14 2023-03-07 广东省人民医院 Colorectal and colorectal cancer segmentation method, system, device and medium based on topology perception
CN115829983B (en) * 2022-12-13 2024-05-03 广东工业大学 High-speed industrial scene visual quality detection method based on knowledge distillation
CN115829983A (en) * 2022-12-13 2023-03-21 广东工业大学 Knowledge distillation-based high-speed industrial scene visual quality detection method
CN115965609A (en) * 2023-01-03 2023-04-14 江南大学 Intelligent detection method for ceramic substrate defects by knowledge distillation
CN115965609B (en) * 2023-01-03 2023-08-04 江南大学 Intelligent detection method for flaws of ceramic substrate by utilizing knowledge distillation
CN115908441A (en) * 2023-01-06 2023-04-04 北京阿丘科技有限公司 Image segmentation method, device, equipment and storage medium
CN115908441B (en) * 2023-01-06 2023-10-10 北京阿丘科技有限公司 Image segmentation method, device, equipment and storage medium
CN116385274A (en) * 2023-06-06 2023-07-04 中国科学院自动化研究所 Multi-mode image guided cerebral angiography quality enhancement method and device
CN116385274B (en) * 2023-06-06 2023-09-12 中国科学院自动化研究所 Multi-mode image guided cerebral angiography quality enhancement method and device
CN116993694A (en) * 2023-08-02 2023-11-03 江苏济远医疗科技有限公司 Non-supervision hysteroscope image anomaly detection method based on depth feature filling
CN116993694B (en) * 2023-08-02 2024-05-14 江苏济远医疗科技有限公司 Non-supervision hysteroscope image anomaly detection method based on depth feature filling
CN116825130A (en) * 2023-08-24 2023-09-29 硕橙(厦门)科技有限公司 Deep learning model distillation method, device, equipment and medium
CN116825130B (en) * 2023-08-24 2023-11-21 硕橙(厦门)科技有限公司 Deep learning model distillation method, device, equipment and medium
CN117765532A (en) * 2024-02-22 2024-03-26 中国科学院宁波材料技术与工程研究所 cornea Langerhans cell segmentation method and device based on confocal microscopic image
CN117765532B (en) * 2024-02-22 2024-05-31 中国科学院宁波材料技术与工程研究所 Cornea Langerhans cell segmentation method and device based on confocal microscopic image

Also Published As

Publication number Publication date
CN111932561A (en) 2020-11-13

Similar Documents

Publication Publication Date Title
WO2022057078A1 (en) Real-time colonoscopy image segmentation method and device based on ensemble and knowledge distillation
Jin et al. DUNet: A deformable network for retinal vessel segmentation
CN113706526B (en) Training method and device for endoscope image feature learning model and classification model
CN110600122B (en) Digestive tract image processing method and device and medical system
JP7152513B2 (en) Image recognition method, device, terminal equipment and medical system, and computer program thereof
CN111383214B (en) Real-time endoscope enteroscope polyp detection system
Cho et al. Comparison of convolutional neural network models for determination of vocal fold normality in laryngoscopic images
CN110648331B (en) Detection method for medical image segmentation, medical image segmentation method and device
Zhang et al. Dual encoder fusion u-net (defu-net) for cross-manufacturer chest x-ray segmentation
CN112466466B (en) Digestive tract auxiliary detection method and device based on deep learning and computing equipment
Souaidi et al. A new automated polyp detection network MP-FSSD in WCE and colonoscopy images based fusion single shot multibox detector and transfer learning
WO2022242392A1 (en) Blood vessel image classification processing method and apparatus, and device and storage medium
Ji et al. A multi-scale recurrent fully convolution neural network for laryngeal leukoplakia segmentation
CN115223193B (en) Capsule endoscope image focus identification method based on focus feature importance
Uçar et al. Classification of different tympanic membrane conditions using fused deep hypercolumn features and bidirectional LSTM
Raut et al. Gastrointestinal tract disease segmentation and classification in wireless capsule endoscopy using intelligent deep learning model
Fu et al. Deep supervision feature refinement attention network for medical image segmentation
Wang et al. Automatic consecutive context perceived transformer GAN for serial sectioning image blind inpainting
Wang et al. RFPNet: Reorganizing feature pyramid networks for medical image segmentation
CN111209946A (en) Three-dimensional image processing method, image processing model training method, and medium
CN116091446A (en) Method, system, medium and equipment for detecting abnormality of esophageal endoscope image
CN113554641B (en) Pediatric pharyngeal image acquisition method and device
Hussain et al. RecU-Net++: Improved utilization of receptive fields in U-Net++ for skin lesion segmentation
CN112862786A (en) CTA image data processing method, device and storage medium
CN117726822B (en) Three-dimensional medical image classification segmentation system and method based on double-branch feature fusion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20953935

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 30.06.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20953935

Country of ref document: EP

Kind code of ref document: A1