CN115880245A - Self-supervision-based breast cancer disease classification method - Google Patents

Self-supervision-based breast cancer disease classification method

Publication number: CN115880245A
Authority: CN (China)
Prior art keywords: image, energy spectrum, images, entropy, classification
Legal status: Pending
Application number: CN202211549190.3A
Other languages: Chinese (zh)
Inventors: 郑元杰, 陈思羽, 王军霞, 王静, 宋景琦
Current Assignee: Shandong Normal University
Original Assignee: Shandong Normal University
Application filed by Shandong Normal University
Priority to CN202211549190.3A
Publication of CN115880245A

Landscapes

  • Image Processing (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The invention relates to the technical field of medicine and discloses a self-supervision-based breast cancer disease classification method comprising the following steps. S1: image acquisition and dataset production: collect multi-modality contrast-enhanced spectral mammography images. S2: data preprocessing: preprocess the multi-modality contrast-enhanced spectral mammography images, crop the images, and equalize their pixels. S3: image feature extraction: extract image features with a Vision Transformer model and output a class prediction for each image. S4: loss calculation: compute the distortion entropy of the predicted classification results, i.e. the information-entropy loss between the class confidences produced by the classification branches of the two spectral images. The invention improves speed while preserving accuracy, balances the trade-off between accuracy and efficiency, and provides a breast cancer image classification model with high accuracy, high speed, strong robustness and good adaptability.

Description

Self-supervision-based breast cancer disease classification method
Technical Field
The invention relates to the technical field of medicine, in particular to a breast cancer disease classification method based on self-supervision.
Background
Breast cancer is the second leading cause of death in women worldwide and, among cancers, is considered the leading cause of death in women in most countries. The main breast cancer examination measures are mammography, breast ultrasound, breast MRI, contrast-enhanced spectral mammography, and the like. Contrast-enhanced spectral mammography (also known as contrast-enhanced mammography) is an emerging breast cancer detection technique in which images are acquired by dual-energy contrast-enhanced mammography equipment. The prototype of the device is a mammography system, modified to achieve dual-energy exposure. The technique obtains breast images of several modalities simultaneously, so that the characteristics of different forms of a tumor can be captured. At the same time it overcomes the tissue-overlap phenomenon of plain mammography and provides tumor information at high image resolution, making it a promising imaging technology.
Contrast-enhanced spectral mammography is very effective as a method of detecting breast cancer, yet in clinical practice the reading of breast images by clinical experts is usually a manual operation that is time-consuming, labor-intensive and costly. In addition, radiologists may miss primary lesion areas because of overlapping dense tissue or changes in mass shape, size and boundary. Image analysis methods based on machine learning can improve the accuracy of breast cancer detection and serve as a second opinion to raise the working level of radiologists. A certain research foundation has accumulated in this field, and most technical methods follow one of two directions: machine learning and deep learning. Recognition methods based on classical machine learning usually involve preprocessing, feature selection and similar operations, so they cannot realize an end-to-end detection process, and their recognition results are easily affected by various interferences in the imaging environment. Recognition methods based on deep learning markedly improve accuracy and can realize an end-to-end detection process, but convolutional neural network models depend heavily on regions of interest and therefore tend to neglect the global characteristics of the image. Most existing artificial-intelligence methods directly apply natural-image processing techniques to medical images without considering the specific characteristics of breast cancer images, and recognition difficulty rises rapidly under interference such as mass occlusion, overlapping breast structures and changes in imaging illumination, which degrades the recognition performance of automatic classification techniques.
Therefore, an effective classifier is urgently needed to help doctors make accurate predictions. Classifying diseases in breast spectral images by means of artificial intelligence is an inevitable trend of future medicine, and spectral-image classification, an important link in that process, has important significance for truly realizing applications such as cancer identification and classification.
Disclosure of Invention
Technical problem to be solved
Aiming at the defects of the prior art, the invention provides a breast cancer disease classification method based on self-supervision, which solves the problems in the background technology.
(II) technical scheme
In order to achieve the above purpose, the invention provides the following technical scheme: a self-supervision-based breast cancer disease classification method, comprising the following steps:
s1: image acquisition and dataset production
Acquiring contrast-enhanced energy spectrum images of different modalities and a contrast-enhanced energy spectrum mammography image of a multi-modality;
s2: data pre-processing
Carrying out data preprocessing on the multi-modal contrast enhancement energy spectrum mammography image, cutting the image, and carrying out pixel equalization on the image;
s3: image feature extraction
Extracting image characteristics by adopting a Vision Transformer model and giving out image category prediction;
s4: loss calculation
Calculating the distortion entropy of the prediction classification result and the information entropy loss between the category confidence degrees generated by the classification branches of the two energy spectrum images;
s5: model training
Fitting training data through gradient back propagation and iterative training of the model according to the calculation result of the distortion entropy to obtain an optimal model;
s6: generating a predicted result
And inputting the multi-modal contrast enhancement energy spectrum mammography image into a classifier, and outputting an image classification result.
Preferably, the multi-modal contrast-enhanced spectral mammography image in the S1 step includes a spectral low energy image and a spectral contrast-enhanced image.
Preferably, the specific process of the step S1 includes: S1.1: given a training set S1 of unlabeled low-energy spectral mammography images and a training set S2 of unlabeled contrast-enhanced mammography images, randomly extract a batch of B spectral images

$$X^1 = \{x_i^1\}_{i=1}^{B} \subset S_1$$

and randomly extract a batch of images, also of size B,

$$X^2 = \{x_i^2\}_{i=1}^{B} \subset S_2$$
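The batch-extraction step S1.1 can be sketched as follows. Function and variable names are illustrative assumptions; the patent only states that a batch of size B is drawn from each training set, and that the two modalities of one patient correspond:

```python
import random

def sample_paired_batch(s1, s2, batch_size):
    """Draw a batch of B low-energy images from S1 and the matching
    contrast-enhanced images from S2 (an assumed pairing: the two
    modalities of one patient should stay together)."""
    idx = random.sample(range(len(s1)), batch_size)
    # Reuse the same indices for both training sets so that the
    # Siamese branches see the same patients.
    return [s1[i] for i in idx], [s2[i] for i in idx]
```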
Preferably, the specific process of step S2 includes: S2.1: crop the acquired breast spectral image: first convert the spectral image to a grayscale image, then segment the region of interest of the target image into sub-images using a coordinate-matching technique, and scan the sub-images column by column, judging by traversal whether a given column is empty, i.e. whether all pixel values in the column belong to the all-black background. In a breast spectral image, a region containing any non-black pixel is regarded as a valid region that must be identified and learned. In a grayscale image a pure black pixel has gray value 0 and a white pixel has value 255, so in practice a threshold close to 0 is set: if the gray value of the current pixel is below the threshold the pixel is judged black, and if every pixel in a column is black the column is deleted;
s2.2: perform image enhancement and equalization on the breast spectral images: increase the number of images through translation, rotation, flipping and noise addition, which prevents the network from overfitting and strengthens the robustness of the classification network, then apply global contrast normalization to the enhanced breast spectral images. Specifically, for an image of M×N pixels, let μ denote the mean of all pixels of the image and σ the standard deviation of all sample pixels; the normalized image is computed as

$$X'_{ij} = \frac{X_{ij} - \mu}{\sigma}, \qquad \mu = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} X_{ij}$$
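The two preprocessing operations of step S2 can be sketched on plain nested lists of grayscale values. The threshold value 10 and the exact normalization formula are assumptions reconstructed from the text ("a threshold close to 0", subtract μ and divide by σ):

```python
import math

def crop_black_columns(image, threshold=10):
    """S2.1: delete every column whose pixels all fall below `threshold`,
    i.e. columns belonging to the all-black background
    (0 = black, 255 = white; the exact threshold is an assumption)."""
    keep = [c for c in range(len(image[0]))
            if any(row[c] >= threshold for row in image)]
    return [[row[c] for c in keep] for row in image]

def contrast_normalize(image):
    """S2.2: global contrast normalization -- subtract the image mean mu
    and divide by the standard deviation sigma over all pixels."""
    pixels = [p for row in image for p in row]
    mu = sum(pixels) / len(pixels)
    sigma = math.sqrt(sum((p - mu) ** 2 for p in pixels) / len(pixels))
    sigma = sigma if sigma > 0 else 1.0  # guard against flat images
    return [[(p - mu) / sigma for p in row] for row in image]
```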
Preferably, the specific process of step S3 includes:
s3.1: image feature extraction: input the processed spectral images, comprising spectral low-energy images and spectral contrast-enhanced images, to the Vision Transformer in batches, processing one batch of images at a time. For feature extraction, each spectral image is cut into non-overlapping image blocks, the image blocks are input into the Vision Transformer classification model, and the features contained in the deep feature maps are gradually enriched by the multi-head attention layers;
s3.2: extract image features with Vision Transformer as the backbone architecture. Feature maps output by the ViT are fused by lateral connection, enriching the semantic representation capacity of the low-level feature maps, and the fused features are finally used for classification judgment. Each spectral image is cut into non-overlapping image blocks, and an additional classification block of the same size as an image block is set to learn the information of the whole image; the initial parameters are randomly initialized. Each image block receives a position encoding and an image-content encoding, which are fused to obtain a vector matrix; the matrix is layer-normalized and the resulting features are input to the multi-head self-attention module. In the Vision Transformer self-attention module, let the input of each image block be X_i; the input of the whole image is the matrix X composed of the inputs X_i. X is linearly transformed to obtain the Q, K and V matrices, and self-attention is then computed as follows to obtain the information representation of each image block:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
The feature output of each image block is a feature map; feature fusion is performed by lateral connection and all feature maps are fused into the classification block, enriching the semantic representation capacity of the low-level feature maps. Through the classification block the network learns the feature representation of the image blocks over the image, so the classification block carries the same semantics as the whole image. Finally, the classification block is input to a multilayer perceptron and the output is passed through a sigmoid activation function to obtain the class distribution.
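The self-attention operation above can be sketched for a single head on plain nested lists. This is a generic scaled dot-product attention, softmax(QK^T/√d_k)V, not the patent's full multi-head module:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def self_attention(q, k, v):
    """Scaled dot-product attention for a single head:
    softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(q[0])
    scores = matmul(q, [list(c) for c in zip(*k)])   # Q K^T
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]       # row-wise softmax
    return matmul(weights, v)
```

With zero queries the attention weights are uniform, so each output row is the mean of the value rows.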
Preferably, the specific process of step S4 includes:
S4.1:
For the self-supervised learning network with a Siamese structure, randomly initialize its model parameters and input the contrast-enhanced spectral mammography images of the different modalities into the Vision Transformer classification model of step S3;
s4.2: input the two breast spectral images into the Vision Transformer classification encoders of the Siamese structure; the output of each encoder is a probability distribution over the C classes of benign or malignant breast disease:

$$P^k = f_\theta\left(X^k\right), \quad k \in \{1, 2\}$$
P_i^k denotes the i-th row of P^k, i.e. the probability distribution of the i-th sample. Using the spectral-image distributions P^1 and P^2 of the two modalities, the objective function is defined as follows:

$$L(P^1, P^2) = \frac{1}{B}\sum_{i=1}^{B} D_{KL}\left(P_i^1 \,\middle\|\, P_i^2\right) + \frac{1}{B}\sum_{i=1}^{B} H\left(P_i^1\right) - H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right) \tag{2}$$
The class distributions are input to the distortion loss function to compute the distortion entropy L(P^1, P^2). Distortion entropy is a way of computing the similarity of two samples based on information entropy; the goal is to make L(P^1, P^2) as small as possible. In particular, D_KL(P_i^1 ‖ P_i^2) is given by the following formula:

$$D_{KL}\left(P_i^1 \,\middle\|\, P_i^2\right) = \sum_{c=1}^{C} P_i^1(c) \log \frac{P_i^1(c)}{P_i^2(c)}$$
The first term D_KL(·‖·) in formula (2) is the Kullback-Leibler divergence between two probability distributions, also called relative entropy, a measure of the difference between two probability distributions. It is asymmetric: the entropy of P_i^1 relative to P_i^2 is not equal to the entropy of P_i^2 relative to P_i^1; the invention selects the entropy of P_i^1 relative to P_i^2. Minimizing

$$\frac{1}{B}\sum_{i=1}^{B} D_{KL}\left(P_i^1 \,\middle\|\, P_i^2\right)$$
keeps the predictions for the different breast spectra consistent: the spectral images of the different modalities of the same breast must be recognized as the same class. The second term H(·) in formula (2) is the entropy of a probability distribution. The negative logarithm of a probability represents the amount of information carried when a possible event occurs; the higher the probability, the less effective information it carries. Multiplying the information of each possible outcome by its probability of occurrence and summing gives the expected value of the total information of the whole system:
$$H(P) = -\sum_{c=1}^{C} P(c)\log P(c)$$
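The information-theoretic quantities used in this derivation, entropy H, relative entropy D_KL and cross entropy CE, can be computed directly; the sketch below also encodes the identity CE(P, Q) = D_KL(P ‖ Q) + H(P) that links formula (2) to formula (4):

```python
import math

def entropy(p):
    """H(P) = -sum_c P(c) log P(c); zero-probability terms contribute 0."""
    return -sum(pc * math.log(pc) for pc in p if pc > 0)

def kl_div(p, q):
    """D_KL(P || Q) = sum_c P(c) log(P(c) / Q(c))."""
    return sum(pc * math.log(pc / qc) for pc, qc in zip(p, q) if pc > 0)

def cross_entropy(p, q):
    """CE(P, Q) = -sum_c P(c) log Q(c) = D_KL(P || Q) + H(P)."""
    return -sum(pc * math.log(qc) for pc, qc in zip(p, q) if pc > 0)
```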
For the second term

$$\frac{1}{B}\sum_{i=1}^{B} H\left(P_i^1\right),$$

minimizing the entropy of the class distribution of each sample sharpens the output distribution, which allows a deterministic class assignment for each picture; furthermore, the features of samples assigned to the same class become more compact. The third term in formula (2) maximizes the entropy of the average distribution over the different samples,

$$H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right),$$

which ensures that the predictions of the different samples are spread over the C classes and prevents the network from assigning all images to the same class. Because cross entropy and relative entropy are related by a simple identity, formula (2) can be rewritten as the following similarity function of the two probability distributions, formula (4):
$$L(P^1, P^2) = \frac{1}{B}\sum_{i=1}^{B} CE\left(P_i^1, P_i^2\right) - H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right) \tag{4}$$
The above loss function is similar to the cross-entropy loss in supervised learning, where CE(·,·) denotes cross entropy: without any true labels on the breast spectral images, the cross-entropy loss between the class distributions of the breast spectra of the two modalities of the same patient is minimized, while maximizing the diversity term

$$H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right)$$

over the predicted benign and malignant classes maximizes the diversity of the predictions.
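The three-term objective of formula (2) can be sketched as follows; equal weighting of the terms is an assumption, since the patent does not state explicit weights:

```python
import math

def entropy(p):
    return -sum(pc * math.log(pc) for pc in p if pc > 0)

def kl_div(p, q):
    return sum(pc * math.log(pc / qc) for pc, qc in zip(p, q) if pc > 0)

def distortion_entropy_loss(p1, p2):
    """Sketch of formula (2): per-sample KL between the two modality
    predictions (consistency), mean per-sample entropy of branch 1
    (sharpening), minus the entropy of the batch-average distribution
    of branch 2 (diversity, maximized and hence subtracted)."""
    b, c = len(p1), len(p1[0])
    consistency = sum(kl_div(a, b_) for a, b_ in zip(p1, p2)) / b
    sharpness = sum(entropy(p) for p in p1) / b
    mean_p2 = [sum(p[j] for p in p2) / b for j in range(c)]
    diversity = entropy(mean_p2)
    return consistency + sharpness - diversity
```

Consistent, sharp and class-diverse batch predictions score lower than fuzzy ones, as the derivation above requires.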
(III) advantageous effects
Compared with the prior art, the invention provides a breast cancer disease classification method based on self-supervision, which has the following beneficial effects:
1. The collected contrast-enhanced spectral mammography images are preprocessed, the preprocessed contrast-enhanced mammography images of the two modalities are input into a self-supervised classification network, image features are extracted and the images are classified. The whole self-supervised structure adopts a Siamese network in which the feature extraction and classification model is implemented with a Vision Transformer. Loss functions based on information entropy make the images of the two different modalities of the same patient more similar while enlarging the feature differences between the mammography images of different patients across modalities, providing more useful information for accurate classification and improving the classification accuracy of mammography images.
2. The method can comprehensively utilize the characteristics of the mammary gland contrast enhancement photographic image, improve the classification accuracy of the contrast enhancement energy spectrum mammary gland photographic image and reduce the cost of manual labeling.
3. The invention improves the speed on the premise of ensuring the precision, balances the relation between the accuracy and the efficiency and provides the breast cancer image classification model with high precision, high speed, strong robustness and good adaptability.
Drawings
FIG. 1 is a flow chart of self-supervised contrast-enhanced mammography image classification in the self-supervision-based breast cancer disease classification method of the present invention;
FIG. 2 is a schematic diagram of breast cancer images collected under different interference scenes in the self-supervision-based breast cancer disease classification method of the present invention;
FIG. 3 is a schematic diagram of the model dataset partition in the self-supervision-based breast cancer disease classification method of the present invention;
FIG. 4 is a schematic diagram of the model structure in the self-supervision-based breast cancer disease classification method of the present invention;
FIG. 5 is a schematic diagram of the Vision Transformer module of the present invention;
FIG. 6 is a diagram of a model embodiment in the self-supervision-based breast cancer disease classification method of the present invention.
Detailed Description
As shown in fig. 1 to 6, the present invention provides a technical solution: a self-supervision-based breast cancer disease classification method, comprising the following steps:
s1: image acquisition and dataset production
Acquire multi-modality contrast-enhanced spectral mammography images; the multi-modality contrast-enhanced spectral mammography images comprise spectral low-energy images and spectral contrast-enhanced images. The specific process of this step includes: given a training set S1 of unlabeled low-energy mammography images and a training set S2 of unlabeled contrast-enhanced mammography images, randomly extract a batch of B spectral images

$$X^1 = \{x_i^1\}_{i=1}^{B} \subset S_1$$

and randomly extract a batch of images, also of size B,

$$X^2 = \{x_i^2\}_{i=1}^{B} \subset S_2$$
S2: data pre-processing
Perform data preprocessing on the multi-modality contrast-enhanced spectral mammography images, crop the images, and equalize their pixels. The specific process of this step includes:
S2.1:
Crop the acquired breast spectral image: first convert the spectral image to a grayscale image, then segment the region of interest of the target image into sub-images using a coordinate-matching technique, and scan the sub-images column by column, judging by traversal whether a given column is empty, i.e. whether all pixel values in the column belong to the all-black background. In a breast spectral image, a region containing any non-black pixel is regarded as a valid region that must be identified and learned. In a grayscale image a pure black pixel has gray value 0 and a white pixel has value 255, so in practice a threshold close to 0 is set: if the gray value of the current pixel is below the threshold the pixel is judged black, and if every pixel in a column is black the column is deleted;
S2.2:
Perform image enhancement and equalization on the breast spectral images: increase the number of images through translation, rotation, flipping and noise addition, which prevents the network from overfitting and strengthens the robustness of the classification network, then apply global contrast normalization to the enhanced breast spectral images. Specifically, for an image of M×N pixels, let μ denote the mean of all pixels of the image and σ the standard deviation of all sample pixels; the normalized image is computed as

$$X'_{ij} = \frac{X_{ij} - \mu}{\sigma}, \qquad \mu = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} X_{ij}$$
S3: image feature extraction
Extracting image characteristics by adopting a Vision Transformer model and giving out image category prediction, wherein the specific process of the step comprises the following steps:
S3.1:
Image feature extraction: input the processed spectral images, comprising spectral low-energy images and spectral contrast-enhanced images, to the Vision Transformer in batches, processing one batch of images at a time. For feature extraction, each spectral image is cut into non-overlapping image blocks, the image blocks are input into the Vision Transformer classification model, and the features contained in the deep feature maps are gradually enriched by the multi-head attention layers;
S3.2:
Extract image features with Vision Transformer as the backbone architecture. Feature maps output by the ViT are fused by lateral connection, enriching the semantic representation capacity of the low-level feature maps, and the fused features are finally used for classification judgment. Each spectral image is cut into non-overlapping image blocks, and an additional classification block of the same size as an image block is set to learn the information of the whole image; the initial parameters are randomly initialized. Each image block receives a position encoding and an image-content encoding, which are fused to obtain a vector matrix; the matrix is layer-normalized and the resulting features are input to the multi-head self-attention module. In the Vision Transformer self-attention module, let the input of each image block be X_i; the input of the whole image is the matrix X composed of the inputs X_i. X is linearly transformed to obtain the Q, K and V matrices, and self-attention is then computed as follows to obtain the information representation of each image block:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
The feature output of each image block is a feature map; feature fusion is performed by lateral connection and all feature maps are fused into the classification block, enriching the semantic representation capacity of the low-level feature maps. Through the classification block the network learns the feature representation of the image blocks over the image, so the classification block carries the same semantics as the whole image. Finally, the classification block is input to a multilayer perceptron and the output is passed through a sigmoid activation function to obtain the class distribution;
s4: loss calculation
Calculating the distortion entropy of the prediction classification result and the information entropy loss between the category confidences of the two energy spectrum images, wherein the specific process of the step comprises the following steps:
S4.1:
for the self-supervision learning network of the Siamese structure, model parameters are initialized randomly, and contrast enhancement energy spectrum mammographic images of different modes are input into a Vision Transformer classification model in the step S3;
S4.2:
Input the multi-modality breast spectral images into the Vision Transformer classification encoders of the Siamese structure; the output of each encoder is a probability distribution over the C classes of benign or malignant breast disease:

$$P^k = f_\theta\left(X^k\right), \quad k \in \{1, 2\}$$
P_i^k denotes the i-th row of P^k, i.e. the probability distribution of the i-th sample. Using the spectral-image distributions P^1 and P^2 of the two modalities, the objective function is defined as follows:

$$L(P^1, P^2) = \frac{1}{B}\sum_{i=1}^{B} D_{KL}\left(P_i^1 \,\middle\|\, P_i^2\right) + \frac{1}{B}\sum_{i=1}^{B} H\left(P_i^1\right) - H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right) \tag{2}$$
The class distributions are input to the distortion loss function to compute the distortion entropy L(P^1, P^2). Distortion entropy is a way of computing the similarity of two samples based on information entropy; the goal is to make L(P^1, P^2) as small as possible. In particular, D_KL(P_i^1 ‖ P_i^2) is given by the following formula:

$$D_{KL}\left(P_i^1 \,\middle\|\, P_i^2\right) = \sum_{c=1}^{C} P_i^1(c) \log \frac{P_i^1(c)}{P_i^2(c)}$$
The first term D_KL(·‖·) in formula (2) is the Kullback-Leibler divergence between two probability distributions, also called relative entropy, a measure of the difference between two probability distributions. It is asymmetric: the entropy of P_i^1 relative to P_i^2 is not equal to the entropy of P_i^2 relative to P_i^1; the invention selects the entropy of P_i^1 relative to P_i^2. Minimizing

$$\frac{1}{B}\sum_{i=1}^{B} D_{KL}\left(P_i^1 \,\middle\|\, P_i^2\right)$$
keeps the predictions for the different modality breast spectra consistent: the multi-modality spectral images of the breast of the same patient must be recognized as the same class. The second term H(·) in formula (2) is the entropy of a probability distribution. The negative logarithm of a probability represents the amount of information carried when a possible event occurs; the higher the probability, the less effective information it carries. Multiplying the information of each possible outcome by its probability of occurrence and summing gives the expected value of the total information of the whole system:
$$H(P) = -\sum_{c=1}^{C} P(c)\log P(c)$$
For the second term

$$\frac{1}{B}\sum_{i=1}^{B} H\left(P_i^1\right),$$

minimizing the entropy of the class distribution of each sample sharpens the output distribution, which allows a deterministic class assignment for each picture; furthermore, the features of samples assigned to the same class become more compact. The third term in formula (2) maximizes the entropy of the average distribution over the different samples,

$$H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right),$$

which ensures that the predictions of the different samples are spread over the C classes and prevents the network from assigning all images to the same class. Because cross entropy and relative entropy are related by a simple identity, formula (2) can be rewritten as the following similarity function of the two probability distributions, formula (4):
$$L(P^1, P^2) = \frac{1}{B}\sum_{i=1}^{B} CE\left(P_i^1, P_i^2\right) - H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right) \tag{4}$$
The above loss function is similar to the cross-entropy loss in supervised learning, where CE(·,·) denotes cross entropy: without any true labels on the breast spectral images, the cross-entropy loss between the class distributions of the breast spectra of the two modalities of the same patient is minimized, while maximizing the diversity term

$$H\left(\frac{1}{B}\sum_{i=1}^{B} P_i^2\right)$$

over the predicted benign and malignant classes maximizes the diversity of the predictions;
s5: model training
Fitting training data through gradient back propagation and iterative training of the model according to the calculation result of the distortion entropy to obtain an optimal model;
s6: generating a predicted result
Inputting the multi-modal contrast enhancement energy spectrum mammography image into a classifier, and outputting an image classification result;
As shown in FIG. 6, the system of the invention comprises:
the image input module inputs the acquired contrast enhanced mammographic picture into the system and waits for the next processing;
the preprocessing module is used for preprocessing the contrast-enhanced breast energy spectrum photographic image, and cutting off a black edge background of the breast energy spectrum image;
the image enhancement module is used for carrying out image enhancement processing and global contrast homogenization processing on the cut breast energy spectrum photographic image;
the feature extraction module, which extracts features of the rescaled breast spectral image based on a preset deep neural network classification model and fuses the two views of the same-side breast with the features of the two modalities;
and the classification result output module, which outputs the benign or malignant classification of the spectral image.
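The module chain of FIG. 6 amounts to function composition; the sketch below uses placeholder callables for the patent's processing stages, not its actual models:

```python
def classify_spectral_image(image, preprocess, enhance, extract, classify):
    """Chain the modules of FIG. 6: image input -> preprocessing ->
    image enhancement -> feature extraction -> classification output.
    All stage callables are hypothetical stand-ins."""
    return classify(extract(enhance(preprocess(image))))
```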
In summary: the model can comprehensively utilize the characteristics of contrast-enhanced mammography images, improve the classification accuracy of contrast-enhanced spectral mammography images, reduce the cost of manual labeling, and improve speed while preserving accuracy, balancing the trade-off between accuracy and efficiency to provide a breast cancer image classification model with high accuracy, high speed, strong robustness and good adaptability.

Claims (6)

1. A breast cancer disease classification method based on self-supervision, comprising the steps of:
s1: image acquisition and dataset construction
acquiring contrast-enhanced energy spectrum images of different modalities and multi-modal contrast-enhanced spectral mammography images;
s2: data preprocessing
preprocessing the multi-modal contrast-enhanced spectral mammography images, cropping the images, and equalizing their pixels;
s3: image feature extraction
extracting image features with a Vision Transformer model and producing a class prediction for each image;
s4: loss calculation
calculating the distortion entropy of the predicted classification results and the information-entropy loss between the class confidences produced by the classification branches of the two energy spectrum images;
s5: model training
fitting the training data by gradient back-propagation and iterative training of the model according to the computed distortion entropy, to obtain the optimal model;
s6: prediction generation
inputting a multi-modal contrast-enhanced spectral mammography image into the classifier and outputting the image classification result.
2. The breast cancer disease classification method based on self-supervision of claim 1, wherein the multi-modal contrast-enhanced spectral mammography images in step S1 comprise an energy spectrum low-energy image and an energy spectrum contrast-enhanced image.
3. The method for classifying breast cancer diseases based on self-supervision as claimed in claim 1, wherein step S1 specifically comprises:
S1.1:
given a training set S_1 of unlabeled low-energy spectral mammography images and a training set S_2 of unlabeled contrast-enhanced mammography images, randomly drawing a batch of B energy spectrum images {x_i^1}_{i=1}^B from S_1, and randomly drawing a batch of the same size B, {x_i^2}_{i=1}^B, from S_2.
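The paired sampling described in S1.1 can be sketched as follows. This is a minimal illustration under our own assumptions: the function name is ours, and we assume the two training sets are index-aligned (the i-th low-energy image and the i-th contrast-enhanced image come from the same patient and view), which the claim implies but does not state.

```python
import random

def sample_paired_batch(S1, S2, B, seed=None):
    """Draw a batch of B index-aligned image pairs from the two modality sets.

    S1: unlabeled low-energy spectral mammography images
    S2: unlabeled contrast-enhanced mammography images (same ordering as S1)
    """
    rng = random.Random(seed)
    idx = rng.sample(range(min(len(S1), len(S2))), B)  # same indices for both sets
    return [S1[i] for i in idx], [S2[i] for i in idx]
```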
4. The method for classifying breast cancer diseases based on self-supervision as claimed in claim 1, wherein step S2 specifically comprises:
S2.1:
cropping the acquired breast energy spectrum images: first, the energy spectrum image is converted to a grayscale image, and the region of interest of the target image is segmented into sub-images using a coordinate-matching technique; the sub-images are then scanned and, by traversal, each column is judged to be empty or not, i.e., whether all pixel values in that column belong to the all-black background. In a breast energy spectrum image, any region containing a non-black pixel is regarded as a valid region to be recognized and learned. In a grayscale image a pure-black pixel has gray value 0 and a white pixel has value 255, so in practice a threshold close to 0 is set: if the gray value of a pixel is below the threshold, the pixel is judged to be black, and if all pixels in a column are black, that column is deleted;
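The column-pruning step above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation: the function name and the exact threshold value are our choices.

```python
import numpy as np

def crop_black_columns(gray: np.ndarray, threshold: int = 10) -> np.ndarray:
    """Delete columns whose pixels are all (near-)black background.

    gray: 2-D uint8 grayscale image (0 = black, 255 = white).
    threshold: value close to 0; a pixel below it is judged to be black.
    """
    # A column is kept if at least one of its pixels is non-black.
    keep = (gray >= threshold).any(axis=0)
    return gray[:, keep]
```

The same test applied along `axis=1` would prune all-black rows if needed.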
S2.2:
performing image enhancement and equalization on the breast energy spectrum images: the number of images is increased by translating, rotating, flipping, and adding noise to the images, which prevents overfitting of the network and at the same time strengthens the robustness of the classification network; global contrast normalization is then applied to the enhanced breast energy spectrum images, as follows: for an image X of size M × N pixels, with μ the mean of all pixels of the image and σ the standard deviation of all sample pixels, the average intensity of each breast energy spectrum image is computed as

μ = (1 / (M N)) Σ_{i=1}^{M} Σ_{j=1}^{N} X_{i,j},

and each pixel is normalized as X̂_{i,j} = (X_{i,j} − μ) / σ.
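The global contrast normalization of S2.2 can be sketched as below. A minimal sketch: the small epsilon guard against a zero standard deviation is our addition, not part of the patent.

```python
import numpy as np

def global_contrast_normalize(img: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Normalize an M x N image to zero mean and unit variance:
    X_hat = (X - mu) / sigma, with mu and sigma taken over all M*N pixels."""
    mu = img.mean()       # average intensity of the whole image
    sigma = img.std()     # standard deviation of all pixels
    return (img - mu) / (sigma + eps)
```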
5. The method for classifying breast cancer diseases based on self-supervision as claimed in claim 1, wherein step S3 specifically comprises:
S3.1:
image feature extraction: the preprocessed energy spectrum images, comprising energy spectrum low-energy images and energy spectrum contrast-enhanced images, are fed to the Vision Transformer in batches, one batch at a time; for feature extraction, each energy spectrum image is cut into non-overlapping image patches, the patches are input into the Vision Transformer classification model, and the features contained in the deep feature maps are progressively enriched by the multi-head attention layers;
S3.2:
extracting image features with the Vision Transformer (ViT) as the backbone architecture; the feature maps output by the ViT are fused through lateral connections, enriching the semantic representation of the low-level feature maps, and classification is finally performed on the fused features. Each energy spectrum image is cut into non-overlapping image patches, and an additional classification token of the same size as an image patch is appended to learn the information of the whole image; its initial parameters are randomly initialized. Each image patch then receives a position encoding and an image-content encoding, which are fused to obtain a vector matrix; the matrix is layer-normalized and the output features are input into the multi-head self-attention module. In the Vision Transformer self-attention module, let the input of each image patch be X_i, so that the input of the whole image is the matrix X composed of the inputs X_i; X is linearly transformed to obtain the matrices Q, K, and V, and the self-attention operation is then performed as follows to obtain the information representation of each image patch:

Attention(Q, K, V) = softmax(Q K^T / √d_k) V,    (1)

where d_k is the dimension of the key vectors. The feature output of each image patch is a feature map; feature fusion is performed through lateral connections, and all feature maps are fused into the classification token, enriching the semantic representation of the low-level feature maps. The classification token learns the feature representation of the image patches over the image and carries the same semantics as the whole image. Finally, the classification token is input into a multilayer perceptron, and the output is passed through a sigmoid activation function to obtain the class distribution.
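The patch-cutting and self-attention operations of step S3 can be sketched in NumPy. This is a simplified single-head sketch under our own assumptions (no position encoding, no layer normalization, illustrative function names); the full model uses multi-head attention and the lateral-connection fusion described above.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def patchify(img, p):
    """Cut an H x W image into non-overlapping p x p patches, each flattened."""
    H, W = img.shape
    patches = img[:H - H % p, :W - W % p].reshape(H // p, p, W // p, p)
    return patches.transpose(0, 2, 1, 3).reshape(-1, p * p)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over token embeddings X (n_tokens x d):
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)  # attention weights, rows sum to 1
    return A @ V
```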
6. The method for classifying breast cancer diseases based on self-supervision as claimed in claim 1, wherein step S4 specifically comprises:
S4.1:
for the self-supervised learning network with a Siamese structure, the model parameters are randomly initialized, and the contrast-enhanced spectral mammography images of the different modalities are input into the Vision Transformer classification model of step S3;
S4.2:
the two breast energy spectrum images are input into the Vision Transformer classification encoders of the Siamese structure; for an input X^k, the output of the classification encoder is the probability distribution P^k = f_θ(X^k) over the C benign/malignant breast disease classes,

P^k = f_θ(X^k) ∈ R^{B×C}, k = 1, 2.
P_i^k denotes the i-th row of P^k, i.e., the probability distribution of the i-th sample. Using the distributions P^1 and P^2 of the two modality energy spectrum images, we define the objective function as follows:

L(P^1, P^2) = (1/B) Σ_{i=1}^{B} D_KL(P_i^1 ‖ P_i^2) + (1/B) Σ_{i=1}^{B} H(P_i^1) − H((1/B) Σ_{i=1}^{B} P_i^1).    (2)
The class distributions are input into the distortion loss function to compute the distortion entropy L(P^1, P^2). Distortion entropy is a way of computing the similarity of two samples based on entropy; the goal is to make L(P^1, P^2) as small as possible. Specifically, D_KL(P_i^1 ‖ P_i^2) is given by the following formula:

D_KL(P_i^1 ‖ P_i^2) = Σ_{c=1}^{C} P_{i,c}^1 log(P_{i,c}^1 / P_{i,c}^2).
The first term D_KL(·‖·) in formula (2) is the Kullback–Leibler divergence between two probability distributions, also called relative entropy, a method for describing the difference between two probability distributions. It is asymmetric: the divergence of P_i^1 relative to P_i^2 is not equal to that of P_i^2 relative to P_i^1; the invention selects the entropy of P_i^1 relative to P_i^2 and minimizes D_KL(P_i^1 ‖ P_i^2), which keeps the predictions for the different breast energy spectra consistent and requires the energy spectrum images of different modalities of the same breast to be assigned to the same class. The second term H(·) in formula (2) is the entropy of a probability distribution: the negative logarithm of a probability represents the amount of information carried when a possible event occurs (the higher the probability, the less effective information it carries), and multiplying the information of each possible outcome by its probability of occurrence and summing yields the expected value of the total information of the whole system:

H(P) = − Σ_{c=1}^{C} P_c log P_c.    (3)
Minimizing the entropy H(P_i^1) of the class distribution of each sample sharpens the output distribution, which allows a deterministic class assignment for each picture; moreover, the features of samples assigned to the same class become more compact. The third term in formula (2) maximizes the entropy of the distribution averaged over the different samples, H((1/B) Σ_{i=1}^{B} P_i^1), so that the predictions of the different samples are spread over the C classes and the network is prevented from assigning all images to the same class. Because cross entropy and relative entropy are related by CE(p, q) = D_KL(p ‖ q) + H(p), formula (2) can be rewritten to obtain the similarity function of the two probability distributions as formula (4):

L(P^1, P^2) = (1/B) Σ_{i=1}^{B} CE(P_i^1, P_i^2) − H((1/B) Σ_{i=1}^{B} P_i^1).    (4)
The above loss function is similar to the cross-entropy loss in supervised learning, where CE(·, ·) denotes cross entropy. Without true labels for the breast energy spectrum images, the cross-entropy loss between the class distributions of the two breast energy spectra of different modalities of the same patient is minimized, and maximizing the diversity term H((1/B) Σ_{i=1}^{B} P_i^1) over the predicted benign and malignant categories maximizes the diversity of the predictions.
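The three-term distortion-entropy objective described in step S4 (a KL consistency term between the two modality branches, a per-sample entropy term to sharpen predictions, minus the entropy of the batch-average distribution to spread predictions over the classes) can be sketched in NumPy. A minimal sketch: the function names and the epsilon guards are ours.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """H(p) = -sum_c p_c log p_c."""
    return -(p * np.log(p + eps)).sum(axis=axis)

def kl_div(p, q, eps=1e-12):
    """D_KL(p || q) = sum_c p_c log(p_c / q_c), row-wise."""
    return (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1)

def distortion_entropy_loss(P1, P2):
    """Distortion entropy over two B x C class-probability matrices:
    mean_i KL(P1_i || P2_i)   -- cross-modality consistency
    + mean_i H(P1_i)          -- sharpen each sample's prediction
    - H(mean_i P1_i)          -- spread predictions across the C classes."""
    consistency = kl_div(P1, P2).mean()
    sharpness = entropy(P1).mean()
    diversity = entropy(P1.mean(axis=0))
    return consistency + sharpness - diversity
```

With identical, sharp, and class-balanced predictions from the two branches the consistency term vanishes and the diversity term dominates, driving the loss negative, which is the regime the objective rewards.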
CN202211549190.3A 2022-12-05 2022-12-05 Self-supervision-based breast cancer disease classification method Pending CN115880245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211549190.3A CN115880245A (en) 2022-12-05 2022-12-05 Self-supervision-based breast cancer disease classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211549190.3A CN115880245A (en) 2022-12-05 2022-12-05 Self-supervision-based breast cancer disease classification method

Publications (1)

Publication Number Publication Date
CN115880245A true CN115880245A (en) 2023-03-31

Family

ID=85765898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211549190.3A Pending CN115880245A (en) 2022-12-05 2022-12-05 Self-supervision-based breast cancer disease classification method

Country Status (1)

Country Link
CN (1) CN115880245A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188879A (en) * 2023-04-27 2023-05-30 广州医思信息科技有限公司 Image classification and image classification model training method, device, equipment and medium
CN116188879B (en) * 2023-04-27 2023-11-28 广州医思信息科技有限公司 Image classification and image classification model training method, device, equipment and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination