CN114119585B - Method for identifying key feature enhanced gastric cancer image based on Transformer
- Publication number: CN114119585B
- Application number: CN202111457189.3A
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06T7/0012: Biomedical image inspection
- G06F18/214: Generating training patterns; bootstrap methods
- G06F18/217: Validation; performance evaluation
- G06F18/24: Classification techniques
- G06F18/253: Fusion techniques of extracted features
- G06N3/045: Combinations of networks
- G06N3/08: Learning methods
- G06T2207/10056: Microscopic image
- G06T2207/20081: Training; learning
- G06T2207/20084: Artificial neural networks [ANN]
- G06T2207/30092: Stomach; gastric
- G06T2207/30096: Tumor; lesion
Abstract
The invention relates to a Transformer-based key-feature-enhanced gastric cancer image identification method. A pre-trained YoloV5 network screens out local lesion areas, and a cross-information Transformer network further enhances the key features of the regions to be recognized in the image to be classified. In the cross-information Transformer network, the features of the lesion region in the image to be classified are enhanced by multi-head self-attention. The entire network is trained with a classification loss and a triplet loss. After training, the test-set images are input into the trained network model and the network's performance indexes are evaluated. Compared with existing gastric cancer image identification methods, the lesion-area detection mechanism effectively screens key feature information and weakens the interference of invalid background information, while the cross-information Transformer network fully enhances the feature representation of lesion-area information and improves gastric cancer image identification precision.
Description
Technical Field
The invention relates to a Transformer-based key-feature-enhanced gastric cancer image identification method, and belongs to the field of image identification in computer vision.
Background
Gastric cancer is one of the most common cancers; worldwide, its yearly death toll is second only to that of lung cancer. To improve the accuracy and efficiency of gastric cancer detection, computer-assisted pathological image analysis has received increasing attention over the past decades. Identifying gastric cancer images is difficult because of slight color differences between cells and the overlapping and uneven distribution of cells across different gastric cancer pathological images. At present, deep learning techniques are widely used in many computer vision fields and achieve the best performance in applications such as image recognition, and some related work applies deep learning to pathological image analysis: CNN networks have been applied to segmentation and classification of epithelial and stromal regions in histopathology images, cancer region detection, cancer image identification, and other tasks. The invention mainly focuses on the problem of clinical gastric cancer image identification. Manual pathological examination of stomach section pictures is time consuming, and inconsistent judgment criteria caused by observer variability often affect the accuracy of the determination. Most current methods are based on convolutional neural networks and achieve certain results. Recently, after the great success of Transformers on language tasks, researchers have been exploring ways to apply them to computer vision tasks. The invention mainly studies the application of the Transformer to identifying clinical gastric cancer lesions.
Because of the slight color differences, overlapping, and uneven distribution of cells between different gastric cancer pathological images, effectively enhancing the information of lesion-region features and improving the network's attention to salient discriminative information is currently a key problem for improving network identification performance. To solve these problems, the invention provides a Transformer-based key-feature-enhanced gastric cancer image identification method. Although convolution-based networks have translation invariance, a Transformer-based network design has a greater capability to integrate global information and is more robust to disturbance.
Disclosure of Invention
The invention provides a Transformer-based key-feature-enhanced gastric cancer image recognition method to solve the problem of low network recognition robustness caused by large appearance and distribution differences among different gastric cancer pathological images.
The technical scheme of the invention is as follows: a Transformer-based key-feature-enhanced gastric cancer image identification method comprises the following specific steps:
step1, collecting currently public gastric cancer pictures and normal stomach pictures to form a data set;
step2, further identifying gastric cancer pictures with existing category labels, wherein the identified information comprises whether the pictures contain focuses of gastric cancer tumor cells and the positions of the focuses;
step3, performing data enhancement on the existing stomach cancer pathological picture to expand a data sample;
step4, loading a pre-trained weight of the YoloV5 network, and then finely adjusting the YoloV5 network by using a gastric cancer image recognition data set;
step5, respectively extracting global features of the complete image and local features of the cut image, inputting both into the Transformer network, and enhancing the features of the lesion area in the image to be classified through multi-head self-attention; finally, adding a fully connected layer as a classifier for classification;
step6, training the whole network through cross entropy loss and triple loss on a training set;
step7, verifying whether the trained model meets the requirements using the test set; to evaluate the model effect, the average classification accuracy ACA and the average precision AP of all test images are used as evaluation indexes.
As a further scheme of the invention, the data set adopted in Step1 comprises a BOT stomach slice data set and a seed cancer risk intelligent diagnosis data set, 80% of pictures are divided into a training set, and 20% of pictures are divided into a testing set.
As a further aspect of the present invention, the data enhancement methods used in Step3 are mirroring and rotation: 30% of the training-set pictures are randomly extracted and mirrored, 30% of the remaining pictures are randomly rotated clockwise by 90, 180, or 270 degrees, and the rest are left unchanged.
In Step4, the YoloV5 network weights trained on ImageNet are fine-tuned to improve the network's detection of gastric cancer tumor lesions, and local pictures containing the lesion areas are cut out of the original data set using the coordinates of the detection results.
As a further scheme of the invention, the specific steps of Step5 are as follows:
Step5.1, respectively extracting global features of the complete image and local features of the cut image, and inputting both into the Transformer network;
Step5.2, in the Transformer network, establishing a cross information flow relationship between the global feature and the local feature of the cut image, which helps identify the cross-scale relationship between the local lesion feature and the global feature tokens; through this relationship the features of the two scales are highly aligned and mutually coupled;
Step5.3, in the Transformer network, processing the local lesion feature f_l and the global feature f_g separately, so as to extract local and global features to the maximum extent;
Step5.4, upsampling the local lesion feature f_l, connecting it with the global feature f_g, and performing a 1 × 1 convolution for dual-scale channel information fusion to obtain the network output feature f_O;
Step5.5, inputting the output feature f_O into a classifier for classification, wherein the classifier is composed of two fully connected layers.
As a further aspect of the present invention, step5 comprises:
in the Transformer network, an image of size H × W × C is reshaped into N 2-dimensional image blocks of size P² × C, where P² is the spatial size of each block and N = H × W / P² is the number of image blocks, which determines the length of the input sequence; position embeddings are added to the patch embeddings to retain position information; the Transformer encoder consists of several interaction layers of multi-head self-attention and a multi-layer perceptron, where the multi-layer perceptron contains two GELU nonlinear layers; LayerNorm is applied before each block, and residual connections are applied after each block;
for a global feature f_g of size W × H × C, f_g is flattened into a sequence L_g of length L; for the local lesion feature f_l of size W × H × C, f_l is flattened into a sequence L_l of length L; through this operation, each vector in the sequence is treated as a visual token without spatial information, completely unlike a convolution result, and the dependency between different token pairs is independent of their spatial positions in the feature map; to mine the correlation of local feature information within the global feature, a fully connected layer maps L_l to a sequence L_{g_l} of length L;
Global information is integrated, and the lesion-region feature coupling relationship is modeled through the self-attention mechanism:
f_Q = W_Q × L_g, f_K = W_K × L_{g_l}, f_V = W_V × L_g
where f_Q, f_K, f_V are the inputs of the multi-head self-attention in the Transformer, and W_Q, W_K, W_V represent the matrices generating the queries, keys, and values respectively; by computing the similarity between f_Q and f_K, the attention weights of f_K over the different position information of f_Q are obtained; finally, the attention weights are combined with f_V to obtain the composite feature:
Attention(f_Q, f_K, f_V) = softmax(f_Q × f_K^T / √d) × f_V
where √d is used to standardize the features (d is the feature dimension). The Transformer structure effectively enhances the feature representation of the key lesion region in the global features, multi-head self-attention enhances the features of the lesion region in the image to be classified, and the network's ability to discriminate the lesion region is improved.
As a further aspect of the invention, the cross entropy loss in Step6 is expressed as follows:
L_cls = −(1 / n_b) Σ_{i=1}^{n_b} y_i · log softmax(W_cls × f_O)
where W_cls represents the class classifier, n_b denotes the batch size, and y_i is a one-hot vector in which only the i-th element is 1;
in addition to optimizing the network with the cross entropy loss, the triplet loss constrains the features of different gastric cancer images to have high similarity, while different categories have low similarity; the triplet loss optimization formula is:
L_tri = (1 / n_2b) Σ_{i=1}^{n_2b} max(‖f_i − f_i^p‖_2 − ‖f_i − f_i^n‖_2 + m, 0)
Because L_tri constrains intra-class and inter-class samples simultaneously, n_2b = 2n_b, i.e. n_b gastric cancer image samples and n_b non-gastric-cancer image samples all participate in the loss calculation, where f_i represents one of the n_2b samples, f_i^p denotes the hard positive sample corresponding to f_i, f_i^n denotes the hard negative sample corresponding to f_i, and m is set to 0.3.
The invention has the following beneficial effects:
(1) The cross-scale cross-information Transformer network can effectively enhance the information of the lesion area in a gastric cancer image, improves the identification precision of gastric cancer images, and helps accurately identify gastric cancer tumor sites;
(2) The Transformer-based network design has the capability of integrating global information and is robust to disturbance.
Drawings
FIG. 1 is a general flow chart of the present invention;
Detailed Description
Example 1: as shown in fig. 1, a Transformer-based key-feature-enhanced gastric cancer image identification method comprises the following specific steps:
step1, collecting currently public gastric cancer pictures and normal stomach pictures to form a data set, which comprises the BOT gastric section data set and the seed cancer risk intelligent diagnosis data set; 80% of the pictures are divided into a training set, and 20% into a testing set.
Step2, manually annotating, using LabelImg software, the gastric cancer pictures that already have category labels, to improve detection precision; the annotated information includes whether a picture contains a gastric cancer tumor cell lesion and the position of that lesion;
step3, the data set contains 4560 pictures; data enhancement is performed on the existing gastric cancer pathological pictures to expand the data samples. The data enhancement methods used are mirroring and rotation: 30% of the training-set pictures are randomly extracted and mirrored, 30% of the remaining pictures are randomly rotated clockwise by 90, 180, or 270 degrees, and the rest are left unchanged.
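The 30%/30%/40% augmentation split described above can be sketched in Python (a minimal illustration; `augment_split` and `rotate90_cw` are hypothetical helper names, and images are represented as nested lists for simplicity):

```python
import random

def rotate90_cw(img):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment_split(pictures, seed=0):
    """Partition training pictures per Step3: 30% mirrored, then 30% of the
    remainder rotated (90/180/270 degrees), and the rest left unchanged."""
    rng = random.Random(seed)
    pics = pictures[:]
    rng.shuffle(pics)
    n_mirror = int(len(pics) * 0.3)
    mirror, rest = pics[:n_mirror], pics[n_mirror:]
    n_rotate = int(len(rest) * 0.3)
    rotate, untouched = rest[:n_rotate], rest[n_rotate:]
    return mirror, rotate, untouched
```

Applying `augment_split` to the 4560-picture training pool would mark roughly 30% for mirroring and 21% (30% of the remaining 70%) for rotation.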
Step4, loading pre-trained weights of the YoloV5 network and then fine-tuning the YoloV5 network with the gastric cancer image recognition data set; since the YoloV5 weights were trained on ImageNet, the detection accuracy for lesion areas in gastric cancer images needs to be improved: the YoloV5 network is trained with part of the gastric cancer pictures whose lesion positions have been annotated, to improve the network's ability to detect gastric cancer lesion positions.
In Step4, the YoloV5 network weights trained on ImageNet are fine-tuned to improve the network's detection of gastric cancer tumor lesions, and local pictures containing the lesion areas are cut out of the original data set using the coordinates of the detection results.
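Cutting out the local lesion picture from the detection coordinates can be sketched as follows (an illustrative helper assuming YoloV5-style corner coordinates (x1, y1, x2, y2); the name `crop_lesion` is not from the patent, and images are nested lists of pixel values):

```python
def crop_lesion(image, box):
    """Crop a detected lesion region from an image given an (x1, y1, x2, y2)
    box, clamped to the image bounds. `image` is a list of H rows of W pixels."""
    h, w = len(image), len(image[0])
    x1, y1, x2, y2 = box
    x1, y1 = max(0, int(x1)), max(0, int(y1))   # clamp top-left corner
    x2, y2 = min(w, int(x2)), min(h, int(y2))   # clamp bottom-right corner
    return [row[x1:x2] for row in image[y1:y2]]
```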
Step5, cutting out the lesion area image according to the detected coordinates; respectively extracting global features of the complete image and local features of the cut image, inputting both into the Transformer network, and enhancing the features of the lesion areas in the image to be classified through multi-head self-attention; finally, adding a fully connected layer as a classifier for classification;
in the Transformer network, an image of size H × W × C is reshaped into N 2-dimensional image blocks of size P² × C, where P² is the spatial size of each block and N = H × W / P² is the number of image blocks, which determines the length of the input sequence; position embeddings are added to the patch embeddings to retain position information; the Transformer encoder consists of several interaction layers of multi-head self-attention and a multi-layer perceptron, where the multi-layer perceptron contains two GELU nonlinear layers; LayerNorm is applied before each block, and residual connections are applied after each block;
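The patch construction above can be sketched with NumPy (a minimal sketch; `to_patches` is an illustrative name, and the linear projection and position-embedding steps are omitted):

```python
import numpy as np

def to_patches(img, P):
    """Reshape an (H, W, C) image into N flattened patches of length P*P*C,
    with N = H*W / P**2; H and W must be divisible by P."""
    H, W, C = img.shape
    assert H % P == 0 and W % P == 0
    # split into a grid of P x P blocks, then flatten each block
    x = img.reshape(H // P, P, W // P, P, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, P * P * C)
```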
for a global feature f_g of size W × H × C, f_g is flattened into a sequence L_g of length L; for the local lesion feature f_l of size W × H × C, f_l is flattened into a sequence L_l of length L; through this operation, each vector in the sequence is treated as a visual token without spatial information, completely unlike a convolution result, and the dependency between different token pairs is independent of their spatial positions in the feature map; to mine the correlation of local feature information within the global feature, a fully connected layer maps L_l to a sequence L_{g_l} of length L;
Global information is integrated, and the lesion-region feature coupling relationship is modeled through the self-attention mechanism:
f_Q = W_Q × L_g, f_K = W_K × L_{g_l}, f_V = W_V × L_g
where f_Q, f_K, f_V are the inputs of the multi-head self-attention in the Transformer, and W_Q, W_K, W_V represent the matrices generating the queries, keys, and values respectively; by computing the similarity between f_Q and f_K, the attention weights of f_K over the different position information of f_Q are obtained; finally, the attention weights are combined with f_V to obtain the composite feature:
Attention(f_Q, f_K, f_V) = softmax(f_Q × f_K^T / √d) × f_V
where √d is used to standardize the features (d is the feature dimension). The Transformer structure effectively enhances the feature representation of the key lesion region in the global features, multi-head self-attention enhances the features of the lesion region in the image to be classified, and the network's ability to discriminate the lesion region is improved; this effectively mitigates the reduction of discriminative ability caused by color differences, overlapping, and uneven distribution among different gastric cancer pathological images. In the Transformer network, by establishing the cross information flow relationship between the global feature and the local feature of the cut image, the cross information flow identifies the cross-scale relationship between the local lesion feature and the global feature tokens, and through these relationships the features of the two scales are highly aligned and mutually coupled. In addition, the Transformer processes the feature mappings of the local lesion feature and the global feature separately, so as to extract local and global features to the maximum extent. After this, the local lesion feature f_l is upsampled and connected with the global feature f_g, and a 1 × 1 convolution performs dual-scale channel information fusion to obtain the network output feature f_O. Finally, a fully connected layer is added as the classifier.
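The query/key/value construction and scaled attention described above can be sketched with NumPy as a single head (an illustrative sketch with random projection matrices; per-token projections are written as right-multiplications, which is equivalent to the W × L form used in the text):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(L_g, L_gl, d, seed=0):
    """Single-head sketch: queries and values come from the global sequence
    L_g, keys from the locally derived sequence L_gl (both shape (L, C))."""
    rng = np.random.default_rng(seed)
    C = L_g.shape[1]
    W_Q, W_K, W_V = (rng.standard_normal((C, d)) for _ in range(3))
    f_Q, f_K, f_V = L_g @ W_Q, L_gl @ W_K, L_g @ W_V
    attn = softmax(f_Q @ f_K.T / np.sqrt(d))  # (L, L) weights, rows sum to 1
    return attn @ f_V, attn                   # composite feature and weights
```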
As a further scheme of the invention, the specific steps of Step5 are as follows:
Step5.1, respectively extracting global features of the complete image and local features of the cut image, and inputting both into the Transformer network;
Step5.2, in the Transformer network, establishing a cross information flow relationship between the global feature and the local feature of the cut image, which helps identify the cross-scale relationship between the local lesion feature and the global feature tokens; through this relationship the features of the two scales are highly aligned and mutually coupled;
Step5.3, in the Transformer network, processing the local lesion feature f_l and the global feature f_g separately, so as to extract local and global features to the maximum extent;
Step5.4, upsampling the local lesion feature f_l, connecting it with the global feature f_g, and performing a 1 × 1 convolution for dual-scale channel information fusion to obtain the network output feature f_O;
Step5.5, inputting the output feature f_O into a classifier for classification, wherein the classifier is composed of two fully connected layers.
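The fusion in Steps 5.4 and 5.5 (upsample, channel concatenation, 1 × 1 convolution) can be sketched with NumPy; a 1 × 1 convolution is just a per-pixel linear map over channels (the helper names here are illustrative, not from the patent):

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbour 2x spatial upsampling of an (H, W, C) feature map."""
    return f.repeat(2, axis=0).repeat(2, axis=1)

def fuse(f_l, f_g, W):
    """Upsample the local lesion feature f_l, concatenate it with the global
    feature f_g along the channel axis, then apply a 1x1 convolution, i.e.
    a per-pixel matmul with W of shape (C_l + C_g, C_out), giving f_O."""
    x = np.concatenate([upsample2x(f_l), f_g], axis=-1)
    return x @ W
```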
Step6, training the whole network through the cross entropy loss and the triplet loss on the training set. Specifically, the BOT gastric section data set is used; it contains 560 gastric cancer sections and 140 normal sections. Sections were stained with hematoxylin-eosin and imaged at 20-fold magnification; the resolution of the stomach slices is 2048 × 2048, and the tumor areas are annotated by the data provider. To expand the data set samples, the seed cancer risk intelligent diagnosis data set is added; it contains 4000 samples, comprising positive and negative samples, where part of the regions in the positive samples contain gastric cancer lesions and the negative samples contain none. Integrating the samples of the two data sets, the data set used in the method comprises 4560 pictures in total. In the experiment, 80% of the stomach sections (normal and cancer) were randomly selected for network training, while the remaining 20% were used for testing.
In order to extract robust, class-discriminative features, the network constrains its output feature f_O with the cross entropy loss and the triplet loss.
As a further aspect of the present invention, the cross entropy loss in Step6 is expressed as follows:
L_cls = −(1 / n_b) Σ_{i=1}^{n_b} y_i · log softmax(W_cls × f_O)
where W_cls represents the class classifier, n_b denotes the batch size, and y_i is a one-hot vector in which only the i-th element is 1;
in addition to optimizing the network with the cross entropy loss, the triplet loss constrains the features of different gastric cancer images to have high similarity, while different categories have low similarity; the triplet loss optimization formula is:
L_tri = (1 / n_2b) Σ_{i=1}^{n_2b} max(‖f_i − f_i^p‖_2 − ‖f_i − f_i^n‖_2 + m, 0)
Because L_tri constrains intra-class and inter-class samples simultaneously, n_2b = 2n_b, i.e. n_b gastric cancer image samples and n_b non-gastric-cancer image samples all participate in the loss calculation, where f_i represents one of the n_2b samples, f_i^p denotes the hard positive sample corresponding to f_i, f_i^n denotes the hard negative sample corresponding to f_i, and m is set to 0.3.
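The batch-hard triplet selection described above can be sketched with NumPy (an illustrative implementation: the hardest positive is the farthest same-class sample and the hardest negative the closest other-class sample):

```python
import numpy as np

def triplet_loss(feats, labels, m=0.3):
    """Batch-hard triplet loss over an (n, d) feature matrix with margin m
    (0.3 in the patent); labels distinguish gastric cancer vs. non-cancer."""
    labels = np.asarray(labels)
    # pairwise Euclidean distances, shape (n, n)
    dist = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    losses = []
    for i, y in enumerate(labels):
        hard_pos = dist[i][labels == y].max()   # farthest same-class sample
        hard_neg = dist[i][labels != y].min()   # closest other-class sample
        losses.append(max(hard_pos - hard_neg + m, 0.0))
    return float(np.mean(losses))
```

Well-separated class clusters give zero loss; fully collapsed features give exactly the margin m.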
Step7, verifying whether the trained model meets the requirements using the test set. To evaluate the model effect, the average classification accuracy (ACA) and average precision (AP) of all test images are used as evaluation indexes. The average classification accuracy represents the overall correct classification rate over all test images; the average precision is the number of correctly predicted positive samples divided by all samples predicted positive.
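The two evaluation indexes can be sketched as follows (an illustrative helper, with `1` denoting the gastric-cancer class):

```python
def evaluate(preds, labels):
    """Average classification accuracy (ACA) over all test images, and the
    average precision (AP) in the sense used here: correctly predicted
    positive samples over all samples predicted positive."""
    aca = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    pred_pos = sum(1 for p in preds if p == 1)
    ap = tp / pred_pos if pred_pos else 0.0
    return aca, ap
```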
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
Claims (6)
1. A Transformer-based key-feature-enhanced gastric cancer image identification method, characterized by comprising the following specific steps:
step1, collecting data sets of gastric cancer pictures and normal stomach pictures which are disclosed at present to form a data set;
step2, further identifying gastric cancer pictures with the existing category labels, wherein the identified information comprises whether the pictures contain focuses of gastric cancer tumor cells and the positions of the focuses;
step3, performing data enhancement on the existing gastric cancer pathological picture to expand a data sample;
step4, loading a pre-training weight of the YoloV5 network, and then finely adjusting the YoloV5 network by using a stomach cancer image recognition data set;
step5, respectively extracting global features of the complete image and local features of the cut image, inputting both into the Transformer network, and enhancing the features of the lesion area in the image to be classified through multi-head self-attention; finally, adding a fully connected layer as a classifier for classification;
step6, training the whole network through cross entropy loss and triple loss on a training set;
step7, verifying whether the trained model meets the requirements using the test set; to evaluate the model effect, the average classification accuracy ACA and the average precision AP of all test images are used as evaluation indexes;
the specific steps of Step5 are as follows:
Step5.1, extracting the global features of the complete image and the local features of the cropped image respectively, and inputting them into the Transformer network;
Step5.2, in the Transformer network, establishing a cross information-flow relationship between the global features and the local features of the cropped image, which facilitates identifying the cross-scale relationship between local lesion features and global feature tokens; through this cross-scale relationship, the features of the two scales are highly aligned and mutually coupled;
Step5.3, in the Transformer network, processing the local lesion feature f_l and the global feature f_g separately and effectively, thereby extracting the local and global features to the maximum extent;
Step5.4, upsampling the local lesion feature f_l, concatenating it with the global feature f_g, and applying a 1×1 convolution for dual-scale channel information fusion to obtain the network output feature f_O;
Step5.5, inputting the output feature f_O into a classifier for classification, the classifier consisting of two fully connected layers.
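The Step5.4–5.5 pipeline can be sketched with NumPy as a stand-in for the trained layers; the feature shapes, the nearest-neighbour upsampling, and the random weights are illustrative assumptions, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes (C, H, W); the local lesion map is half resolution.
f_g = rng.standard_normal((64, 16, 16))   # global feature f_g
f_l = rng.standard_normal((64, 8, 8))     # local lesion feature f_l

# Upsample f_l to f_g's spatial size (nearest neighbour as a stand-in).
f_l_up = f_l.repeat(2, axis=1).repeat(2, axis=2)

# Channel concatenation followed by a 1x1 convolution, which for a
# 1x1 kernel is just a per-pixel linear map over channels.
f_cat = np.concatenate([f_g, f_l_up], axis=0)      # (128, 16, 16)
w_1x1 = rng.standard_normal((64, 128)) * 0.01      # hypothetical weights
f_o = np.einsum('oc,chw->ohw', w_1x1, f_cat)       # output feature f_O

# Classifier: global average pool, then two fully connected layers.
pooled = f_o.mean(axis=(1, 2))                     # (64,)
w1, w2 = rng.standard_normal((32, 64)), rng.standard_normal((2, 32))
logits = w2 @ np.maximum(w1 @ pooled, 0)           # (2,) cancer / normal
```

The 1×1 convolution is chosen here because it mixes the two channel groups without touching spatial structure, which matches the "channel dual-scale information fusion" described in Step5.4.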
2. The method for identifying Transformer-based key-feature-enhanced gastric cancer images according to claim 1, wherein: the data set adopted in Step1 comprises the BOT gastric slice data set and the Seed cancer-risk intelligent diagnosis data set; 80% of the images are divided into the training set and 20% into the test set.
3. The method for identifying Transformer-based key-feature-enhanced gastric cancer images according to claim 1, wherein: the data augmentation methods used in Step3 comprise mirroring and rotation; 30% of the training-set images are randomly selected for mirroring, 30% of the remaining images are randomly rotated clockwise by 90, 180 or 270 degrees, and the rest are left unchanged.
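Claim 3's augmentation scheme might be sketched as follows; the selection bookkeeping and the image shape are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def augment_training_set(images):
    """Claim-3 style augmentation: 30% of images mirrored, 30% of the
    remainder rotated clockwise by 90/180/270 degrees, rest untouched."""
    n = len(images)
    idx = rng.permutation(n)            # random assignment of images
    n_mirror = int(0.3 * n)
    n_rotate = int(0.3 * (n - n_mirror))
    out = []
    for j, i in enumerate(idx):
        img = images[i]
        if j < n_mirror:
            out.append(np.fliplr(img))              # horizontal mirror
        elif j < n_mirror + n_rotate:
            k = rng.choice([1, 2, 3])               # 90 / 180 / 270 degrees
            out.append(np.rot90(img, k=-k))         # negative k = clockwise
        else:
            out.append(img)                         # unchanged
    return out

augmented = augment_training_set([np.zeros((224, 224, 3))] * 10)
```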
4. The method for identifying Transformer-based key-feature-enhanced gastric cancer images according to claim 1, wherein: in Step4, the YoloV5 network weights pre-trained on ImageNet are fine-tuned to adapt the network's detection of gastric cancer tumor lesions, and local images containing the lesion areas are cropped from the original data set using the coordinates of the detection results.
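Cropping local lesion images from the detection coordinates, as claim 4 describes, might look like the following; the (x1, y1, x2, y2) pixel box format is an assumption (YoloV5 can also emit normalized xywh boxes), and the boxes here are toy values:

```python
import numpy as np

def crop_lesions(image, boxes):
    """Cut local lesion patches out of the original image, given
    detector boxes as (x1, y1, x2, y2) pixel corners. Boxes are
    clipped to the image bounds; degenerate boxes are dropped."""
    h, w = image.shape[:2]
    patches = []
    for x1, y1, x2, y2 in boxes:
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(w, int(x2)), min(h, int(y2))
        if x2 > x1 and y2 > y1:
            patches.append(image[y1:y2, x1:x2])
    return patches

img = np.zeros((200, 300, 3))
patches = crop_lesions(img, [(10, 20, 110, 120), (-5, 0, 50, 400)])
```

The second box runs off the image on two sides and is clipped rather than rejected, since a partially visible lesion is still a usable local patch.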
5. The method for identifying Transformer-based key-feature-enhanced gastric cancer images according to claim 1, wherein Step5 comprises the following steps:
In the Transformer network, an image of size H×W×C is reshaped into N two-dimensional image blocks of size P²×C, where P² is the spatial size of each image block and N = H×W/P² is the number of image blocks, which determines the length of the input sequence; position embeddings are added to the patch embeddings to preserve location information; the Transformer encoder consists of alternating multi-head self-attention layers and multilayer perceptrons, the multilayer perceptron comprising two GELU nonlinear layers; LayerNorm is applied before each block, and residual connections are applied after each block;
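The reshape of an H×W×C image into N patches of P²·C values each, plus position embedding, can be sketched as follows; the concrete sizes and the random stand-in for the learned position embedding are illustrative:

```python
import numpy as np

H, W, C, P = 32, 32, 3, 8                 # image size and patch size
N = (H * W) // (P * P)                    # number of patches, N = HW / P^2

img = np.arange(H * W * C, dtype=float).reshape(H, W, C)

# Reshape H x W x C into N patches of P^2 * C values each. The
# transpose groups each P x P spatial block into one row.
patches = img.reshape(H // P, P, W // P, P, C).transpose(0, 2, 1, 3, 4)
patches = patches.reshape(N, P * P * C)   # (N, P^2 * C)

# Position embedding (random stand-in for a learned one) is added so
# the location information lost by flattening is preserved.
pos_embed = np.random.default_rng(0).standard_normal((N, P * P * C))
tokens = patches + pos_embed
```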
For the global feature f_g of size W×H×C, f_g is flattened into a sequence L_g of length L; for the local lesion feature f_l of size W×H×C, f_l is flattened into a sequence L_l of length L; through this operation, each vector in the sequence is treated as a visual token carrying no spatial information; unlike a convolution result, the dependency between different token pairs is unrelated to the spatial positions of the tokens in the feature map; to mine the interrelation of the local feature information within the global features, a fully connected layer is adopted to map L_l to a sequence L_g_l of length L;
The global information is integrated, and the feature coupling relationship of the lesion area is modeled through the attention mechanism:
f_Q = W_Q × L_g, f_K = W_K × L_g_l, f_V = W_V × L_g
where f_Q, f_K, f_V are the inputs to the multi-head self-attention in the Transformer, and W_Q, W_K, W_V are the matrices that generate the queries, keys and values respectively; by computing the similarity between f_Q and f_K, the attention weights of f_K over the different positions of f_Q are obtained; finally, the attention weights are multiplied with f_V to obtain the composite feature:

f_att = softmax(f_Q × f_K^T / √d) × f_V

where √d is used to standardize the features; the Transformer structure effectively enhances the feature representation of the key lesion regions within the global features, multi-head self-attention enhances the characteristics of the lesion area in the image to be classified, and the discrimination capability of the network for lesion areas is improved.
6. The method for identifying Transformer-based key-feature-enhanced gastric cancer images according to claim 1, wherein the cross entropy loss in Step6 is expressed as follows:

L_cls = -(1/n_b) Σ_{i=1}^{n_b} log softmax(W_cls f_i)_{y_i}

where W_cls denotes the class classifier, n_b denotes the batch size, and y_i is a one-hot vector in which only the i-th element is 1.
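A numerically stable sketch of the cross entropy loss just defined; the feature and classifier shapes are illustrative assumptions:

```python
import numpy as np

def cross_entropy_loss(W_cls, feats, labels):
    """L_cls = -(1/n_b) * sum_i log softmax(W_cls f_i)[y_i], with
    labels given as class indices rather than one-hot vectors."""
    logits = feats @ W_cls.T                           # (n_b, num_classes)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(2)
loss = cross_entropy_loss(rng.standard_normal((2, 64)),   # W_cls
                          rng.standard_normal((8, 64)),   # batch features
                          rng.integers(0, 2, size=8))     # labels
```

As a sanity check, an all-zero classifier yields uniform probabilities over the two classes, so the loss equals log 2.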
In addition to optimizing the network with the cross entropy loss, the triplet loss constrains the features of different gastric cancer images to have high similarity, while features of different categories have low similarity; the specific triplet loss optimization formula is:

L_tri = (1/n_2b) Σ_{i=1}^{n_2b} max(||f_i − f_i^p||_2 − ||f_i − f_i^n||_2 + m, 0)

Since L_tri constrains both intra-class and inter-class samples, n_2b = 2n_b, i.e. n_b gastric cancer image samples and n_b non-gastric-cancer image samples all participate in the loss calculation, where f_i denotes one of the n_2b samples, f_i^p denotes the hard positive sample corresponding to f_i, f_i^n denotes the hard negative sample corresponding to f_i, and m is set to 0.3.
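The triplet loss formula can be sketched as follows; the hard-example mining that selects f_i^p and f_i^n per anchor is omitted, and the batch here is synthetic:

```python
import numpy as np

def triplet_loss(f, f_pos, f_neg, m=0.3):
    """L_tri = (1/n_2b) * sum_i max(||f_i - f_i^p||_2
                                    - ||f_i - f_i^n||_2 + m, 0).
    f_pos / f_neg hold the pre-mined hard positive and hard negative
    for each anchor row of f."""
    d_pos = np.linalg.norm(f - f_pos, axis=1)
    d_neg = np.linalg.norm(f - f_neg, axis=1)
    return np.maximum(d_pos - d_neg + m, 0).mean()

rng = np.random.default_rng(3)
f = rng.standard_normal((6, 64))
# Positives very close, negatives far: every hinge term is inactive.
loss = triplet_loss(f, f + 0.01, f + 1.0)
```

With the roles swapped (positives far, negatives close), every hinge is active and the loss is large, which is the gradient signal that pulls same-class features together.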
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111457189.3A CN114119585B (en) | 2021-12-01 | 2021-12-01 | Method for identifying key feature enhanced gastric cancer image based on Transformer |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114119585A CN114119585A (en) | 2022-03-01 |
CN114119585B true CN114119585B (en) | 2022-11-29 |
Family
ID=80369461
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111457189.3A Active CN114119585B (en) | 2021-12-01 | 2021-12-01 | Method for identifying key feature enhanced gastric cancer image based on Transformer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114119585B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114332544B (en) * | 2022-03-14 | 2022-06-07 | 之江实验室 | Image block scoring-based fine-grained image classification method and device |
WO2024103284A1 (en) * | 2022-11-16 | 2024-05-23 | 中国科学院深圳先进技术研究院 | Survival analysis method and system for brain tumor patient |
CN116152232A (en) * | 2023-04-17 | 2023-05-23 | 智慧眼科技股份有限公司 | Pathological image detection method, pathological image detection device, computer equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021120752A1 (en) * | 2020-07-28 | 2021-06-24 | 平安科技(深圳)有限公司 | Region-based self-adaptive model training method and device, image detection method and device, and apparatus and medium |
CN113034500A (en) * | 2021-05-25 | 2021-06-25 | 紫东信息科技(苏州)有限公司 | Digestive tract endoscope picture focus identification system based on multi-channel structure |
CN113269724A (en) * | 2021-04-28 | 2021-08-17 | 西安交通大学 | Fine-grained cancer subtype classification method |
CN113378792A (en) * | 2021-07-09 | 2021-09-10 | 合肥工业大学 | Weak supervision cervical cell image analysis method fusing global and local information |
CN113408492A (en) * | 2021-07-23 | 2021-09-17 | 四川大学 | Pedestrian re-identification method based on global-local feature dynamic alignment |
CN113674253A (en) * | 2021-08-25 | 2021-11-19 | 浙江财经大学 | Rectal cancer CT image automatic segmentation method based on U-Transformer
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111179167B (en) * | 2019-12-12 | 2023-05-16 | 天津大学 | Image super-resolution method based on multi-stage attention enhancement network |
Non-Patent Citations (1)
Title |
---|
"Biomedical Named Entity Recognition Based on Local Feature Enhancement"; Lu Qianhui; China Master's Theses Full-text Database, Medicine and Health Sciences; 2021-02-15; full text *
Also Published As
Publication number | Publication date |
---|---|
CN114119585A (en) | 2022-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114119585B (en) | Method for identifying key feature enhanced gastric cancer image based on Transformer | |
CN107918780B (en) | Garment type and attribute classification method based on key point detection | |
CN103390164B (en) | Method for checking object based on depth image and its realize device | |
CN111080629A (en) | Method for detecting image splicing tampering | |
CN108830188A (en) | Vehicle checking method based on deep learning | |
CN110135459B (en) | Zero sample classification method based on double-triple depth measurement learning network | |
CN110088804A (en) | It is scored based on the computer of primary colors and immunohistochemistry image | |
CN106295124A (en) | Utilize the method that multiple image detecting technique comprehensively analyzes gene polyadenylation signal figure likelihood probability amount | |
CN109635726B (en) | Landslide identification method based on combination of symmetric deep network and multi-scale pooling | |
CN110261329A (en) | A kind of Minerals identification method based on full spectral coverage high-spectrum remote sensing data | |
CN108776777A (en) | The recognition methods of spatial relationship between a kind of remote sensing image object based on Faster RCNN | |
CN111401426A (en) | Small sample hyperspectral image classification method based on pseudo label learning | |
CN115761757A (en) | Multi-mode text page classification method based on decoupling feature guidance | |
CN108985145A (en) | The Opposite direction connection deep neural network model method of small size road traffic sign detection identification | |
CN111639697B (en) | Hyperspectral image classification method based on non-repeated sampling and prototype network | |
CN115311502A (en) | Remote sensing image small sample scene classification method based on multi-scale double-flow architecture | |
CN115546553A (en) | Zero sample classification method based on dynamic feature extraction and attribute correction | |
CN114782753A (en) | Lung cancer histopathology full-section classification method based on weak supervision learning and converter | |
CN115830379A (en) | Zero-sample building image classification method based on double-attention machine system | |
CN116468935A (en) | Multi-core convolutional network-based stepwise classification and identification method for traffic signs | |
CN109034213A (en) | Hyperspectral image classification method and system based on joint entropy principle | |
CN114511759A (en) | Method and system for identifying categories and determining characteristics of skin state images | |
CN112215285B (en) | Cross-media-characteristic-based automatic fundus image labeling method | |
CN112465821A (en) | Multi-scale pest image detection method based on boundary key point perception | |
CN104851090B (en) | Image change detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||