CN117876767A - Metric-based medical endoscopic image classification algorithm combining subspace attention - Google Patents


Info

Publication number
CN117876767A
Authority
CN
China
Prior art keywords: feature, subspace, attention, new, features
Prior art date
Legal status: Pending
Application number
CN202410010192.8A
Other languages
Chinese (zh)
Inventor
金军
罗以宁
蒲蔚
胡大裟
Current Assignee: Sichuan University
Original Assignee
Sichuan University
Priority date
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202410010192.8A
Publication of CN117876767A
Current legal status: Pending


Abstract

The invention discloses a metric-based medical endoscopic image classification algorithm combining subspace attention. It addresses the problem that the many complex factors in the endoscopic imaging process introduce a large amount of noise into endoscopic images; the noise signals contained in the image features extracted by a convolutional neural network disturb the spatial distribution of the effective semantic features and weaken the model's ability to represent key semantic features, which reduces classification accuracy. The method comprises the following steps: 1) train the model with a metric-based few-shot learning method; 2) divide the features extracted at a high layer of the network structure into several feature subspaces; 3) compute an attention distribution and a global correlation separately for each feature subspace, so as to suppress the noise in the subspace and highlight the effective semantic regions it expresses. The proposed image classification method lets the model attend to different key semantic parts of the endoscopic image, suppresses interference features, and enhances the expression of effective semantic features, thereby improving the classification accuracy of the model.

Description

Metric-based medical endoscopic image classification algorithm combining subspace attention
Technical Field
The invention relates to the field of computer-aided diagnosis based on medical images, in particular to a metric-based medical endoscopic image classification algorithm incorporating subspace attention.
Background
Early detection and early diagnosis are important ways to control digestive tract diseases and improve survival rates. Medical endoscopic screening is an effective means of detecting early digestive tract lesions and can accurately locate lesion sites. In digestive endoscopy, a machine-learning-based auxiliary diagnostic system must accurately extract endoscopic image information, analyze and identify it, and provide strong diagnostic support for doctors.
During endoscopic imaging, switching the light source causes uneven brightness and abnormal color shifts; the imaging resolution is low, and movement of the endoscope blurs the image to varying degrees; digestive tract mucus and residue, together with strong specular reflections, introduce considerable noise into the captured endoscopic image. Machine learning methods based on deep neural networks generally use a convolutional neural network to extract image features for classification, so the extracted endoscopic image features also contain a great deal of noise. These interference signals inevitably disturb the spatial distribution of semantic features, making it difficult for the model to accurately capture the key semantic parts of the image, which weakens the model's semantic representation ability and harms classification performance.
Disclosure of Invention
Aim of the invention: in view of the above problems, the present invention proposes a metric-based medical endoscopic image classification algorithm that incorporates a subspace attention mechanism.
The technical scheme is as follows: a subspace attention mechanism is introduced into the last stage of the ResNet network structure to highlight the key semantic feature regions expressed by each subspace, make the model focus on effective feature information, suppress interference features, and obtain semantic information with stronger expressive power. The implementation steps are as follows:
step 1: the task of creating a large number of C-way K-shots of small sample classification from up-sampling of a set of medical endoscopic images, specifically, randomly selecting C classes from the set of endoscopic images each time, randomly sampling K images from each class selected, forming a support set I s Randomly sampling any images from the images left in the C classes to form a query set I q Support set and query set images are combined into a small sample classification task T i ={(I s ,I q ),(y s ,y q )}。
Step 2: taking ResNet network structure with the last full connection layer removed as a feature extraction network f θ C-way K-shot task T established in step 1 is implemented i The feature extraction network is input, noise contained in the features of the lower layer in the front of the neural network is more, and the features of the higher layer in the rear of the neural network have stronger semantic information and position relation information, so that subspace attention mechanisms are applied in the last stage of the upper layer of the feature extraction network structure, and the semantic feature learning capacity of the model is enhanced.
Specifically, the feature F output by the last BatchNorm layer in each basic block or bottleneck of the last stage of the feature extraction network structure is divided along the channel dimension into g subspaces [F_1, …, F_n, …, F_g], and two attention matrices are calculated for each feature subspace F_n:
① attn(F_n) = σ(bn(gap(F_n) ⊙ F_n)),
where σ denotes the sigmoid function, bn a batch normalization layer, gap the global average pooling function, and ⊙ the dot-product operation. attn(F_n) is the attention matrix generated from the similarity between the global feature descriptor and the local feature descriptors of the subspace. In the new feature subspace F_new_1,n obtained by applying this attention mechanism, the spatial distribution of the internal features is improved, interference features are suppressed, and the key semantic feature regions of the subspace are highlighted, which strengthens the semantic feature learning ability of the feature subspace.
② attn(F_n) = σ(pconv(mp(dconv(F_n)))),
where σ denotes the sigmoid function, dconv a depthwise convolution with a 1×1 kernel, mp a max-pooling function with a 3×3 window and stride 1, pconv a pointwise convolution with a 1×1 kernel, ⊙ the dot-product operation and ⊕ the pixel-wise addition operation. This attention matrix attn(F_n) captures the global correlation across channels, fusing semantic and spatial information. In the new feature subspace F_new_2,n obtained by applying this attention mechanism, the distribution of the internal features is improved and local semantic regions are highlighted.
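The two attention branches of step 2 can be sketched in PyTorch as below, one module per subspace. The formulas above only define attn(F_n); how attn(F_n) is combined with F_n to give the new subspaces F_new_1,n and F_new_2,n is not fully recoverable from the text, so the re-weighting in branch ① and the re-weighting plus pixel-wise residual addition in branch ② (matching the ⊙ and ⊕ symbols) are assumptions, as are the class names and the max-pool padding.

```python
import torch
import torch.nn as nn

class SubspaceAttentionA(nn.Module):
    """Branch 1: attn(F_n) = sigmoid(bn(gap(F_n) ⊙ F_n)), applied to one feature subspace F_n."""

    def __init__(self, channels: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, f_n: torch.Tensor) -> torch.Tensor:       # f_n: (B, C/g, H, W)
        g = f_n.mean(dim=(2, 3), keepdim=True)                   # gap(F_n): the global feature descriptor
        attn = torch.sigmoid(self.bn(g * f_n))                   # similarity of global and local descriptors
        return attn * f_n                                         # assumed: F_new_1,n = attn(F_n) ⊙ F_n


class SubspaceAttentionB(nn.Module):
    """Branch 2: attn(F_n) = sigmoid(pconv(mp(dconv(F_n)))), cross-channel global correlation."""

    def __init__(self, channels: int):
        super().__init__()
        self.dconv = nn.Conv2d(channels, channels, kernel_size=1, groups=channels)  # 1x1 depthwise conv
        self.mp = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)                  # 3x3 max pool, stride 1
        self.pconv = nn.Conv2d(channels, channels, kernel_size=1)                   # 1x1 pointwise conv

    def forward(self, f_n: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(self.pconv(self.mp(self.dconv(f_n))))
        return attn * f_n + f_n                                   # assumed: F_new_2,n = attn(F_n) ⊙ F_n ⊕ F_n
```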
Step 3: the 2g new feature subspaces obtained in step 2 are fused:
F_new = conv1×1(concat(F_new_1, F_new_2)),
where concat denotes concatenation along the channel dimension and conv1×1 a 1×1 convolution used to fuse data from different channels and reduce the dimensionality. Each basic block or bottleneck therefore outputs a feature F_new enhanced by the subspace attention mechanism.
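Steps 2 and 3 together amount to a split-attend-fuse pipeline, sketched below as a wrapper module that splits the incoming feature map into g subspaces, applies the two branch modules defined above to each subspace, and fuses the 2g results with a 1×1 convolution following F_new = conv1×1(concat(F_new_1, F_new_2)). Giving each subspace its own branch parameters and the default g = 4 are assumptions for illustration.

```python
class SubspaceAttention(nn.Module):
    """Split F into g channel subspaces, apply both attention branches, fuse 2g results with a 1x1 conv."""

    def __init__(self, channels: int, g: int = 4):
        super().__init__()
        assert channels % g == 0, "channel count must be divisible by the number of subspaces"
        c_g = channels // g
        self.g = g
        self.branch_a = nn.ModuleList(SubspaceAttentionA(c_g) for _ in range(g))
        self.branch_b = nn.ModuleList(SubspaceAttentionB(c_g) for _ in range(g))
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)   # conv1x1 over 2g concatenated subspaces

    def forward(self, f: torch.Tensor) -> torch.Tensor:                # f: (B, C, H, W)
        subspaces = torch.chunk(f, self.g, dim=1)                       # [F_1, ..., F_g]
        new_1 = [a(s) for a, s in zip(self.branch_a, subspaces)]        # F_new_1 subspaces
        new_2 = [b(s) for b, s in zip(self.branch_b, subspaces)]        # F_new_2 subspaces
        return self.fuse(torch.cat(new_1 + new_2, dim=1))               # F_new
```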
Step 4: extracting network f from features incorporating subspace attention mechanisms θ Separating a support set feature Z from the finally output image feature Z s And query set feature Z q Computing a support set feature class prototype:
step 5: measuring the distance between the query set features and each class prototype, and predicting the probability of the lesion class to which the query set image belongs according to the distance measurement:
where d () represents the euclidean distance.
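The prototype and probability equations are not reproduced in this text. The sketch below assumes the standard prototypical-network formulation implied by steps 4 and 5: each class prototype is the mean of its support-set features, and class probabilities are a softmax over negative Euclidean distances.

```python
import torch

def classify_queries(z_s, y_s, z_q, c_way):
    """z_s: (C*K, D) support features, y_s: (C*K,) labels, z_q: (Q, D) query features."""
    protos = torch.stack([z_s[y_s == c].mean(dim=0) for c in range(c_way)])  # class prototypes, (C, D)
    dists = torch.cdist(z_q, protos)                                          # Euclidean distances d(., .)
    return torch.softmax(-dists, dim=1)                                       # probability of each lesion class
```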
Step 6: calculating cross entropy loss of predicted query set categories and actual categories thereof:
model network parameters are optimized by back propagation.
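Step 6 can be sketched as one episodic training step that reuses classify_queries from the sketch above; the optimizer object and the log epsilon are illustrative.

```python
import torch
import torch.nn.functional as F

def episode_step(model, optimizer, support_imgs, y_s, query_imgs, y_q, c_way):
    """One episode: embed images, predict query classes, back-propagate the cross-entropy loss."""
    z_s, z_q = model(support_imgs), model(query_imgs)
    probs = classify_queries(z_s, y_s, z_q, c_way)
    loss = F.nll_loss(torch.log(probs + 1e-8), y_q)   # cross-entropy between predictions and ground truth
    optimizer.zero_grad()
    loss.backward()                                   # back propagation
    optimizer.step()                                  # update the network parameters theta
    return loss.item()
```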
The beneficial effects of the invention are as follows: (1) the high-level convolutional features of the model's network structure are divided into several feature subspaces and an attention matrix is generated independently for each subspace, so that the model can attend to several different key semantic parts of the image; (2) by learning cross-channel information and the similarity between global and local features, two attention matrices are generated for each feature subspace and used to improve the spatial distribution of the features, suppressing interference features and highlighting key semantic feature regions; (3) the proposed attention mechanism enhances the semantic expression ability of the model without adding extra parameters and at low computational cost, and can be embedded into other feature extraction networks to improve feature learning and classification accuracy.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention and subspace attention mechanism;
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides a metric-based medical endoscopic image classification algorithm combined with a subspace attention mechanism, in which a subspace attention module is embedded in the feature extraction network. A detailed description of a specific embodiment follows:
The invention adopts the ResNet network structure with the last fully connected layer removed as the feature extraction network, which projects the image into an embedding space to obtain image features. The ResNet structure is a deep residual network divided into 5 stages from low level to high level; apart from stage 0, the last 4 stages are composed of basic blocks or bottlenecks with relatively similar structure.
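One simple way to obtain such a backbone with torchvision is sketched below; replacing the fully connected layer with nn.Identity so that the network returns the pooled embedding is an editorial choice rather than a detail given in the text.

```python
import torch.nn as nn
from torchvision.models import resnet50

def build_feature_extractor() -> nn.Module:
    """ResNet-50 with the final fully connected layer removed, used as f_theta."""
    backbone = resnet50()          # no pretrained weights by default
    backbone.fc = nn.Identity()    # drop the classifier head; forward() now returns the pooled feature
    return backbone
```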
FIG. 1 shows the flow chart and the subspace attention module, illustrated with a ResNet-50 network architecture as the embodiment of the present invention. Stages 1 to 4 of ResNet-50 contain 3, 4, 6 and 3 bottlenecks, respectively.
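Assuming the torchvision Bottleneck layout (conv1/bn1/conv2/bn2/conv3/bn3), the SubspaceAttention module defined earlier can be placed after the last BatchNorm layer (bn3) of each stage-4 bottleneck by wrapping bn3, as in the sketch below; the exact insertion mechanics are not prescribed by the text.

```python
def add_subspace_attention(backbone: nn.Module, g: int = 4) -> nn.Module:
    """Apply SubspaceAttention after the last BatchNorm layer in every bottleneck of stage 4 (layer4)."""
    for block in backbone.layer4:                # the 3 bottlenecks of ResNet-50 stage 4
        channels = block.bn3.num_features        # 2048 channels in ResNet-50 layer4
        block.bn3 = nn.Sequential(block.bn3, SubspaceAttention(channels, g))
    return backbone
```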
Step 1: a C-way K-shot task T is established i Inputting a feature extraction network;
step 2: dividing the feature F output by the last BatchNorm layer in each bottleneck of the highest layer stage 4 into g subspaces [ F ] along the channel dimension 1 ,…,F n ,…,F g ]For each featureSubspace F n Two attention matrices are calculated:
①attn(F n )=σ(bn(gap(F n )⊙F n )),
wherein σ represents a sigmoid function, bn represents a batch normalization layer, gap represents a global average pooling function, and radix represents a dot product operation, attn (F n ) Representing the first attention matrix generated, resulting in a new feature subspace with applied attention mechanisms
② attn(F_n) = σ(pconv(mp(dconv(F_n)))),
where σ denotes the sigmoid function, dconv a depthwise convolution with a 1×1 kernel, mp a max-pooling function with a 3×3 window and stride 1, pconv a pointwise convolution with a 1×1 kernel, ⊙ the dot-product operation and ⊕ the pixel-wise addition operation. attn(F_n) is the second generated attention matrix, which yields the new feature subspace F_new_2,n with the attention mechanism applied.
Step 3: the 2g new feature subspaces obtained in step 2 are fused:
F_new = conv1×1(concat(F_new_1, F_new_2)),
where concat denotes concatenation along the channel dimension and conv1×1 a 1×1 convolution used to fuse data from different channels and reduce the dimensionality. Each basic block or bottleneck therefore outputs a feature F_new enhanced by the subspace attention mechanism.
Step 4: extracting network f from features incorporating subspace attention mechanisms θ Separating a support set feature Z from the finally output image feature Z s And query set feature Z q Computing a support set feature class prototype:
step 5: measuring the distance between the query set features and each class prototype, and predicting the probability of the lesion class to which the query set image belongs according to the distance measurement:
where d () represents the euclidean distance.
The above is only a preferred embodiment of the present invention and is not intended to limit it; various modifications and variations can be made by those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (4)

1. A metric-based medical endoscopic image classification algorithm incorporating subspace attention, comprising the steps of:
step 1: sampling from a medical endoscope image set to establish a large number of C-way K-shot few sample classification tasks;
step 2: inputting one C-way K-shot task in the step 1 into a feature extraction network f θ A subspace attention mechanism is applied to the last stage of the high-level of the feature extraction network structure, so that the semantic feature learning capacity of the model is enhanced;
step 3: fusing the 2g new feature subspaces obtained in step 2 to obtain semantically enhanced features;
step 4: extracting network f from features incorporating subspace attention mechanisms θ Separating a support set feature Z from the finally output image feature Z s And query set feature Z q Computing a support set feature class prototype:
step 5: measuring the distance between the query set features and each class prototype, and predicting the probability of the lesion class to which the query set image belongs according to the distance measurement:
where d () represents the euclidean distance;
step 6: calculating cross entropy loss of predicted query set categories and actual categories thereof:
model network parameters are optimized by back propagation.
2. The method of claim 1, wherein the few-shot classification task created in step 1 is of the form T_i = {(I_s, I_q), (y_s, y_q)}.
3. The method of claim 1, wherein generating the subspace attention in step 2 comprises: dividing the feature F output by the last BatchNorm layer in each basic block or bottleneck of the last high-level stage of the feature extraction network structure along the channel dimension into g subspaces [F_1, …, F_n, …, F_g], and calculating two attention matrices for each feature subspace F_n:
① attn(F_n) = σ(bn(gap(F_n) ⊙ F_n)),
where σ denotes the sigmoid function, bn a batch normalization layer, gap the global average pooling function and ⊙ the dot-product operation; attn(F_n) denotes the first generated attention matrix and F_new_1,n the new feature subspace with the attention mechanism applied;
② attn(F_n) = σ(pconv(mp(dconv(F_n)))),
where σ denotes the sigmoid function, dconv a depthwise convolution with a 1×1 kernel, mp a max-pooling function with a 3×3 window and stride 1, pconv a pointwise convolution with a 1×1 kernel, ⊙ the dot-product operation and ⊕ the pixel-wise addition operation; attn(F_n) denotes the second generated attention matrix and F_new_2,n the new feature subspace with the attention mechanism applied.
4. The method of claim 1, wherein fusing the new feature subspaces in step 3 comprises concatenating, along the channel dimension, all new feature subspaces produced by the two attention mechanisms, and fusing the cross-channel features and reducing the dimensionality by a 1×1 convolution operation:
F_new = conv1×1(concat(F_new_1, F_new_2)),
where conv1×1 denotes the 1×1 convolution operation, concat denotes concatenation along the channel dimension, and F_new denotes the resulting semantically enhanced feature.
CN202410010192.8A, filed 2024-01-04, priority date 2024-01-04: Metric-based medical endoscopic image classification algorithm combining subspace attention. Status: Pending. Published as CN117876767A (en).

Priority Applications (1)

CN202410010192.8A (CN117876767A), priority date 2024-01-04, filing date 2024-01-04: Metric-based medical endoscopic image classification algorithm combining subspace attention


Publications (1)

CN117876767A, published 2024-04-12

Family

ID=90594190



Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination