CN116310525A - Pathological image classification method based on contrast representation distillation and output distillation - Google Patents
- Publication number: CN116310525A
- Application number: CN202310194883.3A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V10/764: Image or video recognition or understanding using pattern recognition or machine learning; classification, e.g. of video objects
- G06N3/08: Computing arrangements based on biological models; neural networks; learning methods
- G06V10/7715: Processing image or video features in feature spaces; feature extraction, e.g. by transforming the feature space
- G06V10/82: Image or video recognition or understanding using neural networks
Abstract
The invention discloses a pathological image classification method based on contrast representation distillation and output distillation. A weakly supervised model serves as the classification network, greatly reducing the method's dependence on annotated data. Contrast representation distillation drives the weakly supervised model to extract more salient and more discriminative deep features, while output distillation uses a high-precision, high-accuracy fully supervised model to guide the optimization direction of the weakly supervised model. Together, output distillation and contrast representation distillation greatly improve the prediction of different lung cancer subtypes while the weakly supervised model retains a high recognition capability for normal tissue samples. In addition, the design of a depth-gated attention module enables accurate prediction of both normal tissue samples and lung cancer subtypes.
Description
Technical Field
The invention relates to the technical fields of medical image processing and artificial intelligence, and in particular to a lung cancer pathological image classification method based on contrast representation distillation and output distillation.
Background
According to 2020 data on cancer incidence and mortality in China, new lung cancer cases and lung cancer deaths account for 17.9% and 23.8% of all cancer cases, respectively. Lung cancer is the malignant tumor with the highest incidence and mortality in China and seriously threatens people's lives and health. If a patient is still in the early stage of lung cancer at diagnosis, targeted therapy, drug therapy, surgery and other methods can be adopted, and the medical cost for an early-stage patient is usually below 100,000 yuan. If the disease has progressed to the middle or late stage by the time lung cancer is diagnosed, only radiotherapy or chemotherapy, which are far more harmful to the body, can be used; the medical cost is usually 100,000 to 300,000 yuan, on average 2.6 times that of early-stage patients. Early diagnosis of lung cancer is therefore an important way to increase patient survival, improve prognosis, and reduce medical costs.
Pathological diagnosis is the gold standard for lung cancer diagnosis: a doctor obtains patient tissue through lung fine-needle aspiration, surgery or similar procedures, prepares a full-field digital pathological image, and then searches for lesion areas in the pathological image by continuously adjusting the viewing scale to complete the diagnosis. However, manual diagnosis is easily affected by subjective factors, fatigue, diagnostic experience and other external factors, and misdiagnosis or missed diagnosis can delay the patient's treatment. Fortunately, with the development of artificial intelligence and the growth of computing power, deep learning is now widely applied in the medical field and offers a new approach to lung cancer diagnosis and subtype classification. Convolutional neural networks can learn from and analyze pathological images to provide objective, scientific diagnostic suggestions to pathologists, greatly improving their diagnostic efficiency and accuracy.
Currently, mainstream deep-learning methods for full-field digital pathological image classification fall into two groups: methods based on fully supervised learning and methods based on weakly supervised learning. Fully supervised methods achieve high classification accuracy on cancer areas but recognize normal tissue samples poorly, and they require a large number of pathological images with accurate lesion annotations to train the network, making the data extremely difficult and costly to collect. Weakly supervised methods need only image-level labels, which can easily be obtained from the open-source database TCGA, and the large sample size gives the weakly supervised network strong recognition of normal tissue samples; however, because such a network cannot effectively locate lesion areas or extract the feature information relevant to subtype diagnosis, its accuracy in classifying cancer subtypes is low and it cannot serve as a diagnostic aid.
Therefore, a full-field digital pathology image classification method is needed that achieves high classification accuracy on both normal tissue samples and different cancer subtypes, and whose data set is simple to collect.
Disclosure of Invention
The invention aims to solve two problems: existing lung cancer pathological image classification methods based on fully supervised learning recognize normal tissue samples poorly, and those based on weakly supervised learning classify lung cancer subtypes with low accuracy. It provides a pathological image classification method based on contrast representation distillation and output distillation, in which the advantages of a fully supervised network are transferred to a weakly supervised network through the two distillations, realizing accurate classification of pathological images.
In order to solve the technical problems, the invention adopts the following technical scheme:
a pathological image classification method based on contrast representation distillation and output distillation specifically comprises the following steps:
step S1: the method for constructing the data set of the full-supervision teacher classification network specifically comprises the following steps:
step S1.1: collecting full-field digital pathology images of normal tissue and different lung cancer subtypes;
step S1.2: performing color standardization on the collected full-field digital pathology image to eliminate color difference of the pathology image;
step S1.3: labeling focus areas of the lung cancer full-view digital pathological images, and labeling all tissue pathological areas of a normal tissue sample;
step S1.4: generating a mask of the same size and position from the labeling information, and cutting the image into small n × n image blocks with a sliding window, where n is the side length of the blocks;
step S1.5: comparing each cropped image block with the generated mask: if more than 50% of the block lies in the lesion area, keep the block, otherwise discard it. Image blocks cut from pathological images carrying lesion labels are denoted m_t;
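The sliding-window cropping and 50% mask-overlap filter of steps S1.4 and S1.5 can be sketched as follows (a minimal NumPy sketch; the function name and parameters are illustrative, not from the patent):

```python
import numpy as np

def crop_lesion_patches(image, mask, n=256, keep_ratio=0.5):
    """Slide an n x n window over the image; keep a patch only if more
    than `keep_ratio` of the corresponding mask window is lesion (non-zero)."""
    patches = []
    h, w = mask.shape
    for y in range(0, h - n + 1, n):
        for x in range(0, w - n + 1, n):
            mask_win = mask[y:y + n, x:x + n]
            if mask_win.mean() > keep_ratio:      # strictly more than 50% lesion
                patches.append(image[y:y + n, x:x + n])
    return patches

# Toy example: a 4x4 "image" tiled into 2x2 patches, lesion in the left half.
img = np.arange(16, dtype=float).reshape(4, 4)
msk = np.zeros((4, 4))
msk[:, :2] = 1                                    # left half is lesion
kept = crop_lesion_patches(img, msk, n=2)         # keeps the two left patches
```

The strict `>` comparison matches the specification: a block exactly half covered by lesion is discarded.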
Step S2: the method for constructing the weakly supervised student classification network data set specifically comprises the following steps:
step S2.1: collecting full-field digital pathology images of normal tissue and different lung cancer subtypes;
step S2.2: performing color standardization on the collected full-field digital pathology image to eliminate color difference of the pathology image;
step S2.3: automatically segmenting the standardized pathological image, filtering out blank background and natural holes, and cutting it into n × n image blocks, where n is the side length of the blocks; image blocks cut from pathological images without lesion labels are denoted m_s;
Step S3: constructing a double-distillation pathological image classification network, which comprises the following modules:
step S3.1: a fully supervised feature extraction module F_T, used to extract the deep feature information h_t of the lesion-labeled pathological image blocks m_t and to distill this knowledge to the weakly supervised student classification network whose input is the unlabeled pathological image blocks m_s, with the formula:

h_t = F_T(m_t)
step S3.2: a fully supervised classification module C_T, used to judge the disease type of an input image block;
step S3.3: a weakly supervised feature extraction module F_S, used to extract the deep feature information h_s of pathological image blocks m_s without lesion labels;
step S3.4: a depth-gated attention module A_S, used to assign each image block an attention score representing its importance to the network's classification;
step S3.5: an auxiliary clustering module C_L, used to constrain the feature space of the input image blocks so that the positive and negative features within each category are linearly separable;
step S3.6: a weakly supervised classification module C_S, used to make the final category prediction for a pathological image;
step S4: taking the lesion-labeled pathological image blocks m_t as input, back-propagating with a stochastic gradient descent algorithm to train and optimize, with the goal of minimizing a cross-entropy loss function, the parameters of the fully supervised teacher classification network, which comprises the fully supervised feature extraction module F_T and the fully supervised classification module C_T; the trained fully supervised teacher classification network serves as an auxiliary reference to guide the training of the weakly supervised student classification network;
step S5: taking the pathological image blocks m_s without lesion labels as input, training and optimizing the parameters of the weakly supervised student classification network, specifically comprising the following steps:
step S5.1: randomly initializing the parameters of the depth-gated attention module A_S, the auxiliary clustering module C_L and the weakly supervised classification module C_S;
step S5.2: sending the pathological image blocks m_s without lesion labels into the weakly supervised feature extraction module F_S to obtain the deep features h_s of each image block, with the formula:

h_s = F_S(m_s)

step S5.3: combining the features h_s output by the weakly supervised feature extraction module F_S and the features h_t output by the fully supervised feature extraction module F_T to calculate the contrast representation distillation loss L_F;
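The excerpt does not spell out the exact form of L_F. One common realization, shown here purely as an illustrative assumption (the function name and the InfoNCE-style formulation are not from the patent), contrasts each student patch feature against the teacher features, treating the teacher feature of the same patch as the positive pair:

```python
import numpy as np

def contrastive_distill_loss(h_s, h_t, tau=0.07):
    """InfoNCE-style loss between student features h_s and teacher
    features h_t, both of shape (K, d); positive pair = same row index."""
    # L2-normalize so inner products are cosine similarities
    h_s = h_s / np.linalg.norm(h_s, axis=1, keepdims=True)
    h_t = h_t / np.linalg.norm(h_t, axis=1, keepdims=True)
    logits = h_s @ h_t.T / tau                    # (K, K) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # cross-entropy on the diagonal

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
loss_aligned = contrastive_distill_loss(feats, feats)            # student == teacher
loss_random = contrastive_distill_loss(feats, rng.normal(size=(8, 16)))
```

When student and teacher features coincide, the diagonal dominates the similarity matrix and the loss is close to zero; mismatched features give a much larger loss, which is the gradient signal pulling the student toward the teacher's representation.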
Step S5.4: deep layer feature h generated in step S5.2 s Into depth-gated attention module A S In (2), a concentration score a representing the importance of each image block to network classification is obtained k,n The formula is as follows:
wherein n represents the number of categories of the classification task; n is the total number of image blocks cut out of one pathological image; p (P) a,n Is a full connection layer belonging to the nth class; g a 、H a 、J a A linear layer sharing weights for all categories; tan h and sigm represent tan h and sigmoid activation functions, respectively;
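The attention formula itself appears only as an image in the source. As an illustrative sketch under stated assumptions, the following NumPy code implements a standard gated-attention scorer consistent with the components named above (a tanh branch G_a, a sigmoid gate H_a combined by element-wise product, and a per-class scoring layer P_{a,n}); the third shared layer J_a is omitted because its exact placement is not recoverable from the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention(H, G_a, H_a, P_a):
    """Gated attention over K patch features H of shape (K, d).
    G_a, H_a: shared (d, m) projections; P_a: per-class (n_cls, m) scorers.
    Returns attention scores of shape (n_cls, K), softmax-normalized per class."""
    gate = np.tanh(H @ G_a) * sigmoid(H @ H_a)    # (K, m) gated embedding
    logits = (gate @ P_a.T).T                     # (n_cls, K) raw scores
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)       # normalize over the K patches

rng = np.random.default_rng(1)
K, d, m, n_cls = 5, 8, 4, 3
A = gated_attention(rng.normal(size=(K, d)),
                    rng.normal(size=(d, m)),
                    rng.normal(size=(d, m)),
                    rng.normal(size=(n_cls, m)))
```

The tanh branch supplies signed responses while the sigmoid branch acts as a soft gate, so patches carrying class-relevant information end up with larger normalized scores.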
step S5.5: normalizing the attention scores generated in step S5.4, and combining the normalized attention score of each image block with the classification results output by the fully supervised classification module C_T to calculate the output distillation loss L_O;
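Since the description later names KL divergence as one option for the output distillation loss L_O, a minimal sketch of a KL term between a teacher-derived distribution over the patches and the student's normalized attention scores might look as follows (the exact pairing of teacher outputs and attention scores is an assumption):

```python
import numpy as np

def output_distill_kl(student_attn, teacher_probs, eps=1e-12):
    """KL(teacher || student) between two distributions over the K patches.
    student_attn: normalized attention scores, shape (K,), summing to 1.
    teacher_probs: teacher-derived distribution over the patches, shape (K,)."""
    p = teacher_probs + eps                       # eps guards against log(0)
    q = student_attn + eps
    return float(np.sum(p * np.log(p / q)))

t = np.array([0.7, 0.2, 0.1])                     # hypothetical teacher soft labels
kl_same = output_distill_kl(t, t)                 # identical distributions -> 0
kl_diff = output_distill_kl(np.array([0.1, 0.2, 0.7]), t)
```

Minimizing this term pushes the student's attention toward the patches the teacher considers lesion-relevant, which is the guiding role the soft labels play here.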
Step S5.6: deep layer feature h generated in step S5.2 s Sending into an auxiliary clustering module C L In the method, attention scores output by a depth gating attention module are used as pseudo-label supervision clusters, and auxiliary cluster loss L is calculated C ;
Step S5.7: taking the attention score as a corresponding weight, calculating the weighted sum of all feature vectors, and aggregating to generate the feature h of the pathological image level W The formula is as follows:
step S5.8: sending the pathological-image-level feature h_W into the weakly supervised classification module C_S to obtain the category prediction of the full-field digital pathological image, and calculating the classification loss L_W of the pathological-image-level features against the true label;
Step S5.9: calculate the total loss function L total The total loss function includes the classification loss L of the pathological image level features W Auxiliary clustering loss L C Loss of output distillation L O Comparison shows distillation loss L F The specific formula is as follows:
L total =λ W L W +λ C L C +λ O L O +λ F L F
wherein each λ is the weight of the corresponding loss function;
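Combining step S5.9 with the preference stated later in the description (the two distillation losses join the training only after the 40th iteration), the total loss can be sketched as follows; the λ values shown are placeholders, since the excerpt does not give them:

```python
def total_loss(l_w, l_c, l_o, l_f, iteration,
               lam_w=1.0, lam_c=1.0, lam_o=1.0, lam_f=1.0,
               distill_start=40):
    """L_total = lam_w*L_W + lam_c*L_C + lam_o*L_O + lam_f*L_F, with the
    two distillation terms added only after `distill_start` iterations."""
    total = lam_w * l_w + lam_c * l_c
    if iteration > distill_start:                 # warm-up without distillation
        total += lam_o * l_o + lam_f * l_f
    return total

early = total_loss(1.0, 0.5, 0.2, 0.3, iteration=10)   # 1.5, distillation off
late = total_loss(1.0, 0.5, 0.2, 0.3, iteration=50)    # 2.0, distillation on
```

Delaying the distillation terms lets the attention and clustering modules stabilize before the teacher's signal starts shaping them.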
step S5.10: training the weakly supervised student classification network with the goal of minimizing the total loss function L_total, stopping training when the loss no longer decreases for k consecutive iterations, and determining the optimal model with Monte Carlo ten-fold cross-validation to obtain the trained weakly supervised student classification network;
step S6: extracting the normalized attention scores generated in step S5.5, mapping them to different RGB colors to generate small image blocks, overlaying these on the original pathological image with a set transparency, and obtaining a lesion detection heat map after blurring and smoothing operations.
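Step S6 can be sketched with NumPy alone; the color mapping is reduced to a single red channel and the blurring/smoothing step is omitted for brevity (the function name and parameters are illustrative, not from the patent):

```python
import numpy as np

def attention_heatmap(image, scores, patch=2, alpha=0.4):
    """Overlay per-patch attention scores on an RGB image (H, W, 3) in [0, 1].
    Each score colors its patch x patch region red in proportion to its value,
    then the colored layer is alpha-blended onto the original image."""
    h, w, _ = image.shape
    heat = np.zeros((h, w, 3))
    k = 0
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            heat[y:y + patch, x:x + patch, 0] = scores[k]   # red channel
            k += 1
    return (1 - alpha) * image + alpha * heat               # set transparency

img = np.ones((4, 4, 3)) * 0.5                    # gray 4x4 "slide"
s = np.array([0.0, 1.0, 1.0, 0.0])                # four 2x2 patches
overlay = attention_heatmap(img, s)
```

Because the blend keeps the original pixel values underneath, the heat map reveals the high-attention regions while the underlying tissue texture remains visible, as the description intends.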
Further, the double-distillation pathological image classification network comprises the fully supervised feature extraction module F_T, the fully supervised classification module C_T, the weakly supervised feature extraction module F_S, the depth-gated attention module A_S, the auxiliary clustering module C_L and the weakly supervised classification module C_S.
Further, the weakly supervised feature extraction module F_S is a pre-trained network; preferably, ResNet-50, ResNet-18, ResNet-101 or ResNet-152 pre-trained on the ImageNet natural image dataset may be employed;
further, not all image blocks need to be sent to the auxiliary clustering module C_L; only the m image blocks with the highest attention scores and the m with the lowest attention scores in each category are sent to the auxiliary clustering module, with m preferably set to 8;
further, the output distillation loss L_O and the contrast representation distillation loss L_F do not participate in training at all times; preferably, both distillation losses are added only after the 40th iteration;
further, the auxiliary clustering loss L_C preferably uses a smooth support vector machine loss; the classification loss L_W of the pathological-image-level features preferably uses cross-entropy loss; the output distillation loss L_O and the contrast representation distillation loss L_F preferably use KL divergence or cross-entropy loss;
further, the final full-field digital pathological image classification network is composed of the weakly supervised feature extraction module F_S, the depth-gated attention module A_S and the weakly supervised classification module C_S;
further, the full-field digital pathology image is a medical full-field digital pathology image.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention designs a double-distillation pathological image classification network to address the facts that existing fully supervised classification methods recognize normal tissue samples poorly and require expensive, hard-to-collect annotated data, while existing weakly supervised classification methods classify lung cancer subtypes with too little accuracy to aid diagnosis. Combining a contrast representation learning method, the fully supervised classification network serves as the teacher model, and contrast representation distillation from the fully supervised model drives the weakly supervised model to extract more salient and more discriminative deep features. In addition, the output of the fully supervised network serves as soft labels to guide the optimization direction of the attention module in the weakly supervised model, improving the model's detection accuracy on lesion areas. Through output distillation and contrast representation distillation, the model achieves accurate prediction of both normal tissue samples and lung cancer subtypes.
(2) The invention introduces an auxiliary clustering module during training, making the positive and negative features the model uses to predict a given cancer subtype as linearly separable as possible; this increases the discriminability of the features and enables accurate prediction of lung cancer subtypes.
(3) The depth-gated attention module designed by the invention first uses the tanh activation function to introduce preliminary positive and negative gradients, then, following the gating idea, introduces the sigmoid activation function as a weighting via an element-wise product. Integrating the different characteristics of the two activation functions ensures that feature vectors containing key information receive higher attention scores, further strengthening their influence on the model. In addition, the attention module converts lesion detection into a classification problem and generates a high-resolution lesion detection heat map that reveals the cancerization probability while preserving the texture of the underlying cell tissue.
Drawings
FIG. 1 is a training flow chart of the weakly supervised student network in the pathological image classification method based on contrast representation distillation and output distillation of the present invention;
FIG. 2 is a network structure diagram of the pathological image classification method based on contrast representation distillation and output distillation of the present invention;
FIG. 3 is a flow chart of the full field digital pathology image classification of the present invention.
Detailed Description
For a clearer understanding of the objects, technical solutions and advantages of the present application, reference is made to the following detailed description and examples, taken in conjunction with the accompanying drawings. The examples described are only illustrative of the invention and do not limit its embodiments; other variants or modifications can be made on their basis, and the embodiments need not and cannot be exhaustive. All other embodiments that come within the spirit and principles of the invention are to be construed as falling within its scope of protection.
The pathological image classification method based on contrast representation distillation and output distillation provided by the invention is further described in detail below, taking full-field digital pathological image classification for lung cancer diagnosis (lung adenocarcinoma, lung squamous carcinoma and normal tissue samples) as an example, in combination with a specific implementation.
The embodiment of the invention comprises the following steps:
training phase:
step S1: the method for constructing the data set of the full-supervision teacher classification network specifically comprises the following steps:
step S1.1: collecting 50 cases each of normal tissue, lung adenocarcinoma and lung squamous carcinoma pathological images from the open-source databases TCGA and TCIA;
step S1.2: performing color standardization on the 150 collected full-field digital pathology images to eliminate color differences between them;
step S1.3: labeling the lesion areas of the 100 lung cancer full-field digital pathological images, and labeling all histopathological areas of the 50 normal tissue samples;
step S1.4: generating a mask of the same size and position from the labeling information, and cutting each image into small 256 × 256 image blocks with a sliding window;
step S1.5: comparing each cropped image block with the generated mask: if more than 50% of the block lies in the lesion area, keep the block, otherwise discard it. Image blocks cut from pathological images carrying lesion labels are denoted m_t;
Step S2: the method for constructing the weakly supervised student classification network data set specifically comprises the following steps:
step S2.1: additionally collecting 1000 cases each of normal tissue, lung adenocarcinoma and lung squamous carcinoma pathological images from the open-source databases TCGA and TCIA, none of which carry any pixel-level, block-level or ROI-level labels;
step S2.2: performing color standardization on the 3000 collected full-field digital pathology images to eliminate color differences between them;
step S2.3: automatically segmenting the standardized pathological images, filtering out blank background and natural holes, and cutting them into small 256 × 256 image blocks; image blocks cut from pathological images without lesion labels are denoted m_s;
Step S3: constructing a double-distillation pathological image classification network, which comprises the following modules:
step S3.1: a fully supervised feature extraction module F_T (illustratively, ResNet-18), used to extract the deep feature information h_t of the lesion-labeled pathological image blocks m_t and to distill this knowledge to the weakly supervised student classification network whose input is the unlabeled pathological image blocks m_s, with the formula:

h_t = F_T(m_t)

step S3.2: a fully supervised classification module C_T (illustratively, a fully connected layer followed by a Softmax layer), used to judge the disease type of an input image block;
step S3.3: a weakly supervised feature extraction module F_S (illustratively, ResNet-50 pre-trained on ImageNet), used to extract the deep feature information h_s of pathological image blocks m_s without lesion labels;
step S3.4: a depth-gated attention module A_S, used to assign each image block an attention score representing its importance to the network's classification;
step S3.5: an auxiliary clustering module C_L, used to constrain the feature space of the input image blocks so that the positive and negative features within each category are linearly separable;
step S3.6: a weakly supervised classification module C_S (illustratively, composed of a fully connected layer and a Softmax layer), used to make the final category prediction for a pathological image;
step S4: taking the lesion-labeled pathological image blocks m_t as input, back-propagating with a stochastic gradient descent algorithm to train and optimize, with the goal of minimizing a cross-entropy loss function, the parameters of the fully supervised teacher classification network, which comprises the fully supervised feature extraction module F_T and the fully supervised classification module C_T; in this embodiment, Adam is used as the optimizer, the learning rate is set to 0.0001 and the maximum number of iterations is 200; the trained fully supervised teacher classification network serves as an auxiliary reference to guide the training of the weakly supervised student classification network;
step S5: taking the pathological image blocks m_s without lesion labels as input, training and optimizing the parameters of the weakly supervised student classification network (see FIG. 1), specifically comprising the following steps:
step S5.1: randomly initializing the parameters of the depth-gated attention module A_S, the auxiliary clustering module C_L and the weakly supervised classification module C_S;
step S5.2: sending the pathological image blocks m_s without lesion labels into the weakly supervised feature extraction module F_S to obtain the deep features h_s of each image block, with the formula:

h_s = F_S(m_s)

step S5.3: combining the features h_s output by the weakly supervised feature extraction module F_S and the features h_t output by the fully supervised feature extraction module F_T to calculate the contrast representation distillation loss L_F; illustratively, the contrast representation distillation loss uses cross-entropy loss;
step S5.4: the deep features h_s generated in step S5.2 are sent into the depth-gated attention module A_S to obtain an attention score a_{k,n} representing the importance of each image block to the network classification; the formula is as follows:

a_{k,n} = exp{P_{a,n}[tanh(G_a(J_a h_{s,k})) ⊙ sigm(H_a(J_a h_{s,k}))]} / Σ_{j=1}^{N} exp{P_{a,n}[tanh(G_a(J_a h_{s,j})) ⊙ sigm(H_a(J_a h_{s,j}))]}

wherein n indexes the classes of the classification task; N is the total number of image blocks cut from one pathology image; k indexes the image blocks; P_{a,n} is a fully connected layer belonging to the n-th class; G_a, H_a, J_a are linear layers whose weights are shared across all classes; tanh and sigm denote the tanh and sigmoid activation functions, respectively.
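The gated-attention computation described above can be sketched in NumPy as follows; the weight shapes and the order in which the shared layers J_a, G_a, H_a are applied are assumptions made for illustration.

```python
import numpy as np

def sigm(x):
    """Sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention(h, J, G, H, P):
    """h: [N, D] deep features of the N image blocks.
    J: [D_h, D] shared linear projection; G, H: [D_h, D_h] gating layers
    shared across classes; P: [n_classes, D_h] per-class scoring layers
    (rows play the role of P_{a,n}). Returns a [n_classes, N] matrix of
    attention scores, softmax-normalised over the N image blocks."""
    shared = h @ J.T                                    # shared projection J_a
    gate = np.tanh(shared @ G.T) * sigm(shared @ H.T)   # gated branch, elementwise product
    logits = (gate @ P.T).T                             # [n_classes, N] class-wise scores
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)             # scores sum to 1 per class
```

The tanh branch captures feature content while the sigmoid branch acts as a learned gate, so blocks the network deems uninformative receive attention near zero.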
step S5.5: the attention scores generated in step S5.4 are normalized, and the normalized attention score of each image block is combined with the classification result output by the fully supervised classification module C_T to calculate the output distillation loss L_O; illustratively, the output distillation loss uses cross entropy loss;
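One way to realise the output-distillation term is to treat the teacher's per-block predictions as soft targets for the student's normalised attention distribution. The cross-entropy sketch below is an illustrative interpretation under that assumption, not the patent's exact definition.

```python
import numpy as np

def output_distillation_loss(student_attn, teacher_probs, eps=1e-12):
    """student_attn: [N] normalised attention scores over the N blocks;
    teacher_probs: [N] teacher confidence that each block is lesional.
    The teacher scores are renormalised into a distribution and used as
    soft targets in a cross-entropy term."""
    target = teacher_probs / (teacher_probs.sum() + eps)
    return float(-np.sum(target * np.log(student_attn + eps)))
```

The loss decreases as the student attends to the same blocks the teacher marks as lesional, transferring the teacher's block-level knowledge without block-level labels.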
step S5.6: the deep features h_s generated in step S5.2 are sent into the auxiliary clustering module C_L, the attention scores output by the depth-gated attention module are used as pseudo-labels to supervise the clustering, and the auxiliary clustering loss L_C is calculated; illustratively, the auxiliary clustering loss uses a smooth support vector machine loss;
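A common smooth support vector machine loss replaces the max in the multiclass hinge loss with a temperature-controlled log-sum-exp. The sketch below shows that surrogate for a single sample; the margin and temperature values are assumptions.

```python
import numpy as np

def smooth_svm_loss(scores, target, tau=1.0, margin=1.0):
    """Smooth top-1 SVM loss for a single sample.
    scores: [n_classes] raw class scores; target: true class index.
    L = tau * log(sum_j exp((s_j + margin*[j != target]) / tau)) - s_target,
    which tends to the multiclass hinge loss as tau -> 0."""
    aug = scores + margin * (np.arange(len(scores)) != target)
    m = aug.max()
    lse = tau * np.log(np.sum(np.exp((aug - m) / tau))) + m  # stable log-sum-exp
    return float(lse - scores[target])
```

Because the log-sum-exp upper-bounds the max, the loss is always non-negative and differentiable everywhere, which makes it better behaved than the plain hinge when supervising clusters with noisy attention-derived pseudo-labels.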
step S5.7: taking the attention scores as the corresponding weights, the weighted sum of all feature vectors is calculated to aggregate the pathology image-level feature h_W; the formula is as follows:

h_W = Σ_{k=1}^{N} a_{k,n} h_{s,k}
step S5.8: the pathology image-level feature h_W is sent into the weakly supervised classification module C_S to obtain the class prediction for the full-field digital pathology image, and the classification loss L_W of the pathology image-level feature is calculated against the true label; illustratively, the classification loss of the pathology image-level feature uses cross entropy loss;
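Steps S5.7 and S5.8 together reduce the bag of block features to one slide-level prediction. A minimal NumPy sketch, in which a linear-plus-softmax head stands in for the weakly supervised classifier C_S:

```python
import numpy as np

def aggregate_and_classify(h, attn, W, b):
    """h: [N, D] block features; attn: [N] attention weights summing to 1.
    The attention-weighted sum gives the slide-level feature h_W, and a
    linear layer (W, b) followed by softmax turns it into class
    probabilities for the whole pathology image."""
    h_W = attn @ h                      # [D] weighted aggregation over blocks
    logits = h_W @ W.T + b              # [n_classes]
    e = np.exp(logits - logits.max())   # stable softmax
    return h_W, e / e.sum()
```

Training then compares the returned probabilities against the slide-level true label with cross entropy, which is the only supervision the weakly supervised branch needs.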
step S5.9: the total loss function L_total is calculated; the total loss function comprises the classification loss L_W of the pathology image-level feature, the auxiliary clustering loss L_C, the output distillation loss L_O and the contrast representation distillation loss L_F; the specific formula is as follows:

L_total = λ_W L_W + λ_C L_C + λ_O L_O + λ_F L_F

wherein each λ is the weight of the corresponding loss function; here λ_W = 0.7, λ_C = 0.1, λ_O = 0.1 and λ_F = 0.1.
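The weighted combination above is a direct sum; a trivial sketch with the embodiment's weights as defaults:

```python
def total_loss(L_W, L_C, L_O, L_F,
               lam_W=0.7, lam_C=0.1, lam_O=0.1, lam_F=0.1):
    """Weighted sum of the four training losses from step S5.9."""
    return lam_W * L_W + lam_C * L_C + lam_O * L_O + lam_F * L_F
```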
step S5.10: the weakly supervised student classification network is trained with the goal of minimizing the total loss function L_total, using Adam as the optimizer with the learning rate set to 0.0001; training stops when the loss has not decreased for 30 consecutive iterations, and the optimal model is determined by Monte Carlo ten-fold cross-validation, yielding the trained weakly supervised student classification network; see fig. 2;
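The stopping rule in step S5.10 (halt once the loss has not improved for 30 consecutive rounds) amounts to a simple patience counter; this sketch assumes a callable that yields the monitored loss per epoch.

```python
def train_with_early_stopping(loss_per_epoch, patience=30, max_epochs=200):
    """Returns the epoch at which training stops: either when the best
    loss has not improved for `patience` consecutive epochs, or when
    max_epochs is reached. loss_per_epoch: callable epoch -> loss."""
    best = float("inf")
    stale = 0
    for epoch in range(max_epochs):
        loss = loss_per_epoch(epoch)
        if loss < best:
            best, stale = loss, 0       # improvement: reset the counter
        else:
            stale += 1
            if stale >= patience:       # 30 stale rounds -> stop
                return epoch
    return max_epochs - 1
```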
step S6: the normalized attention scores generated in step S5.7 are extracted and mapped to small image blocks of different RGB colors, which are overlaid on the original pathology image with a transparency of 0.4; after blurring and smoothing operations, the lesion detection heat map is obtained.
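The heat-map generation in step S6 amounts to colour-mapping the attention scores and alpha-blending the result over the original image. This sketch uses a simple blue-to-red ramp instead of a full colormap and assumes the attention map has already been upsampled to pixel resolution and normalised to [0, 1]; blurring/smoothing is omitted.

```python
import numpy as np

def attention_heatmap(image, attn_map, alpha=0.4):
    """image: [H, W, 3] float array in [0, 1]; attn_map: [H, W] attention
    scores in [0, 1] at pixel resolution. Returns the overlay blended at
    transparency alpha (0.4, as in step S6)."""
    overlay = np.zeros_like(image)
    overlay[..., 0] = attn_map          # red grows with attention
    overlay[..., 2] = 1.0 - attn_map    # blue fades with attention
    return (1.0 - alpha) * image + alpha * overlay
```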
The application phase, as shown in fig. 3:
step S1: acquiring a full-field digital pathological image of the lung of a patient;
step S2: the patient's full-field digital pathology image is preprocessed with the same method used to preprocess each full-field digital pathology image during training, yielding a number of small image blocks;
step S3: all the small image blocks are sent into the trained weakly supervised student classification network to obtain the prediction result for the full-field digital pathology image.
step S4: a lesion detection heat map is generated using the attention scores in the depth-gated attention module.
Claims (8)
1. A pathological image classification method based on contrast representation distillation and output distillation, characterized by comprising the following steps:
step S1: constructing a fully supervised teacher classification network data set, and marking the small image blocks cut from pathology images carrying lesion annotations as m_t;
step S2: constructing a weakly supervised student classification network data set, and marking the small image blocks cut from pathology images without lesion annotations as m_s;
step S3: constructing a double-distillation pathology image classification network, the double-distillation pathology image classification network comprising:
step S3.1: a fully supervised feature extraction module F_T, for extracting the deep feature information h_t of the pathology images m_t carrying lesion annotations and distilling it to the weakly supervised student classification network whose input is the pathology image blocks m_s without lesion annotations; the formula is as follows:

h_t = F_T(m_t)
step S3.2: a fully supervised classification module C_T, for judging the disease type of the input image block;
step S3.3: a weakly supervised feature extraction module F_S, for extracting the deep feature information h_s of the pathology image blocks m_s without lesion annotations;
step S3.4: a depth-gated attention module A_S, for assigning to the pathology image blocks m_s without lesion annotations an attention score representing their importance to the network classification;
step S3.5: an auxiliary clustering module C_L, for constraining the features of the pathology image blocks m_s without lesion annotations so that the positive and negative features within each class are linearly separable;
step S3.6: a weakly supervised classification module C_S, for making the final class prediction for the pathology image blocks m_s without lesion annotations;
step S4: taking the pathology images m_t carrying lesion annotations as input, back propagation with a stochastic gradient descent algorithm is used to train and optimize, with the goal of minimizing a cross entropy loss function, the parameters of the fully supervised teacher classification network, including the fully supervised feature extraction module F_T and the fully supervised classification module C_T; the trained fully supervised teacher classification network serves as an auxiliary reference to guide the training of the weakly supervised student classification network;
step S5: taking the pathology image blocks m_s without lesion annotations as input, the parameters of the weakly supervised student classification network are trained and optimized, specifically comprising the following steps:
step S5.1: randomly initializing the parameters of the depth-gated attention module A_S, the auxiliary clustering module C_L and the weakly supervised classification module C_S;
step S5.2: the pathology image blocks m_s without lesion annotations are sent into the weakly supervised feature extraction module F_S to obtain the deep feature h_s of each image block; the formula is as follows:

h_s = F_S(m_s)
step S5.3: combining the feature h_s output by the weakly supervised feature extraction module F_S with the feature h_t output by the fully supervised feature extraction module F_T, the contrast representation distillation loss L_F is calculated;
step S5.4: the deep features h_s generated in step S5.2 are sent into the depth-gated attention module A_S to obtain an attention score a_{k,n} representing the importance of each image block to the network classification; the formula is as follows:

a_{k,n} = exp{P_{a,n}[tanh(G_a(J_a h_{s,k})) ⊙ sigm(H_a(J_a h_{s,k}))]} / Σ_{j=1}^{N} exp{P_{a,n}[tanh(G_a(J_a h_{s,j})) ⊙ sigm(H_a(J_a h_{s,j}))]}

wherein n indexes the classes of the classification task; N is the total number of image blocks cut from one pathology image; k indexes the image blocks; P_{a,n} is a fully connected layer belonging to the n-th class; G_a, H_a, J_a are linear layers whose weights are shared across all classes; tanh and sigm denote the tanh and sigmoid activation functions, respectively;
step S5.5: the attention scores generated in step S5.4 are normalized, and the normalized attention score of each image block is combined with the classification result output by the fully supervised classification module C_T to calculate the output distillation loss L_O;
step S5.6: the deep features h_s generated in step S5.2 are sent into the auxiliary clustering module C_L, the attention scores output by the depth-gated attention module are used as pseudo-labels to supervise the clustering, and the auxiliary clustering loss L_C is calculated;
step S5.7: taking the attention scores as the corresponding weights, the weighted sum of all feature vectors is calculated to aggregate the pathology image-level feature h_W; the formula is as follows:

h_W = Σ_{k=1}^{N} a_{k,n} h_{s,k}

step S5.8: the pathology image-level feature h_W is sent into the weakly supervised classification module C_S to obtain the class prediction for the full-field digital pathology image, and the classification loss L_W of the pathology image-level feature is calculated against the true label;
step S5.9: the total loss function L_total is calculated; the total loss function comprises the classification loss L_W of the pathology image-level feature, the auxiliary clustering loss L_C, the output distillation loss L_O and the contrast representation distillation loss L_F; the specific formula is as follows:

L_total = λ_W L_W + λ_C L_C + λ_O L_O + λ_F L_F

wherein each λ is the weight of the corresponding loss function;
step S5.10: the weakly supervised student classification network is trained with the goal of minimizing the total loss function L_total; training stops when the loss has not decreased for k consecutive iterations, and the optimal model is determined by Monte Carlo ten-fold cross-validation, yielding the trained weakly supervised student classification network;
step S6: the normalized attention scores generated in step S5.7 are extracted and mapped to small image blocks of different RGB colors, which are overlaid on the original pathology image with a set transparency; after blurring and smoothing operations, the lesion detection heat map is obtained.
2. The pathological image classification method based on contrast representation distillation and output distillation according to claim 1, wherein constructing the fully supervised teacher classification network data set and marking the small image blocks cut from pathology images carrying lesion annotations as m_t specifically comprises the following steps:
step S1.1: collecting full-field digital pathology images of normal tissue and different lung cancer subtypes;
step S1.2: performing color standardization on the collected full-field digital pathology image to eliminate color difference of the pathology image;
step S1.3: labeling the lesion areas of the lung cancer full-field digital pathology images, and labeling all tissue areas of the normal tissue samples;
step S1.4: generating a mask with the same size and position according to the marking information, and cutting the image into a plurality of small image blocks with the size of n multiplied by n by using a sliding window, wherein n represents the length and the width of the small image blocks;
step S1.5: comparing each cut small image block with the generated mask; if the lesion area in the image block exceeds 50%, the image block is saved, otherwise it is discarded; the small image blocks cut from pathology images carrying lesion annotations are marked as m_t.
3. The pathological image classification method based on contrast representation distillation and output distillation according to claim 1, wherein constructing the weakly supervised student classification network data set and marking the small image blocks cut from pathology images without lesion annotations as m_s specifically comprises the following steps:
step S2.1: collecting full-field digital pathology images of normal tissue and different lung cancer subtypes;
step S2.2: performing color standardization on the collected full-field digital pathology image to eliminate color difference of the pathology image;
step S2.3: automatically segmenting the standardized pathology images, filtering out blank background and natural holes, and cutting them into small image blocks of size n×n, wherein n denotes the length and width of the small image blocks; the small image blocks cut from pathology images without lesion annotations are marked as m_s.
4. The pathological image classification method based on contrast representation distillation and output distillation according to claim 1, wherein the weakly supervised feature extraction module F_S is a pre-trained network, employing ResNet-50, ResNet-18, ResNet-101 or ResNet-152 pre-trained on the ImageNet natural image dataset.
5. The pathological image classification method based on contrast representation distillation and output distillation according to claim 1, wherein not all image blocks need to be sent into the auxiliary clustering module C_L; the m image blocks with the highest attention scores and the m image blocks with the lowest attention scores in each class are sent into the auxiliary clustering module, wherein m is 8.
6. The pathological image classification method based on contrast representation distillation and output distillation according to claim 1, wherein the output distillation loss L_O and the contrast representation distillation loss L_F do not participate in the whole of training; the two distillation losses are added only after the 40th iteration.
7. The pathological image classification method based on contrast representation distillation and output distillation according to claim 1, wherein the auxiliary clustering loss L_C uses a smooth support vector machine loss; the classification loss L_W of the pathology image-level feature uses cross entropy loss; the output distillation loss L_O and the contrast representation distillation loss L_F use KL divergence or cross entropy loss.
8. The pathological image classification method based on contrast representation distillation and output distillation according to claim 1, wherein the trained full-field digital pathology image classification network consists of the weakly supervised feature extraction module F_S, the depth-gated attention module A_S and the weakly supervised classification module C_S.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310194883.3A CN116310525A (en) | 2023-02-28 | 2023-02-28 | Pathological image classification method based on contrast representation distillation and output distillation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116310525A true CN116310525A (en) | 2023-06-23 |
Family
ID=86812524
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116994032A (en) * | 2023-06-28 | 2023-11-03 | 河北大学 | Rectal polyp multi-classification method based on deep learning |
CN116994032B (en) * | 2023-06-28 | 2024-02-27 | 河北大学 | Rectal polyp multi-classification method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||