CN114782403A - Pneumonia image detection method and device based on mixed space and inter-channel attention - Google Patents
- Publication number
- CN114782403A (application number CN202210536524.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/60—Rotation of whole images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10116—X-ray image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30061—Lung
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Apparatus For Radiation Diagnosis (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a pneumonia image detection method and device based on mixed space and inter-channel attention. The method comprises the following steps: step 1: carrying out data preprocessing on the lung X-ray image; step 2: constructing a first feature network, and performing feature extraction on the preprocessed lung X-ray image with the first feature network to obtain a feature map C; step 3: constructing a second feature network, and performing feature extraction on the feature map C with the second feature network to obtain a feature map F; step 4: constructing an attention module that mixes spatial attention and inter-channel attention, and processing the feature map F with the attention module to obtain a feature tensor X; step 5: constructing a network classifier, and detecting the feature tensor X with the network classifier to obtain a detection result.
Description
Technical Field
The invention relates to the technical field of medical image recognition, in particular to a pneumonia image detection method and device based on mixed space and inter-channel attention.
Background
Pneumonia is an inflammation of the terminal airways, alveoli and pulmonary interstitium, and can be classified into bacterial pneumonia, viral pneumonia and the like. It has numerous causes and high morbidity, and is one of the most common infectious diseases. Early diagnosis of pneumonia is critical to its successful cure. Pneumonia may be detected by X-ray imaging, pulmonary CT, magnetic resonance imaging (MRI), and the like. Lung X-ray examination has the advantages of a convenient procedure, a low radiation dose and low cost, and is currently the first choice for clinical detection. However, checking lesion information in lung medical images through manual radiograph interpretation is a complicated task for the doctor: traditional interpretation usually consumes a lot of time and energy, the accuracy of diagnosis depends mainly on the doctor's skill and work experience, and misdiagnosis and missed diagnosis may occur due to visual fatigue, environmental disturbance and the like.
Since the beginning of the 21st century, object detection has emerged with the development of computer science technologies, mainly image recognition and pattern recognition. The main task of object detection is to identify the classes of the objects in an input image and their location coordinates. The types of objects that can be detected are defined by manually specifying the desired objects in the image. Because the shape, size and position of the objects differ from picture to picture, improving the accuracy of object detection remains a pressing need. At present, artificial-intelligence-assisted diagnosis of medical images can reach expert-level precision, and artificial intelligence methods applied to pneumonia diagnosis can effectively improve diagnostic efficiency and quality and help relieve the imbalance of medical resources. Detection of lesion regions in lung X-ray images helps a doctor make a diagnosis by automatically analyzing the image and outputting information such as the position and size of the lesion region. However, lung X-ray image detection differs from other image detection tasks: lung X-ray images exhibit high inter-class similarity and low intra-class variability, that is, image features of different classes are highly similar and images of the same class differ little. Training on data with these characteristics easily leads to model bias and overfitting, which reduces the generalization capability of the network and makes image recognition more difficult. As a result, pneumonia detection on lung X-rays using only a traditional network is unsatisfactory, and classification precision still needs to be raised by improving the network structure.
Disclosure of Invention
In order to improve pneumonia detection based on lung X-ray images, the invention provides a pneumonia image detection method and device based on mixed space and inter-channel attention. The specific scheme is as follows:
The invention provides a pneumonia image detection method based on mixed space and inter-channel attention, comprising the following steps:
step 1: carrying out data preprocessing on the lung X-ray image;
step 2: constructing a first feature network, and performing feature extraction on the preprocessed lung X-ray image with the first feature network to obtain a feature map C;
step 3: constructing a second feature network, and performing feature extraction on the feature map C with the second feature network to obtain a feature map F;
step 4: constructing an attention module that mixes spatial attention and inter-channel attention, and processing the feature map F with the attention module to obtain a feature tensor X;
step 5: constructing a network classifier, and detecting the feature tensor X with the network classifier to obtain a detection result.
Further, step 1 specifically includes:
step 1.1: screening out unsatisfactory lung X-ray images;
step 1.2: dividing a data set consisting of all lung X-ray images meeting the requirements into a training set, a verification set and a test set;
step 1.3: converting each lung X-ray image in the data set into an RGB three-channel image;
step 1.4: carrying out image enhancement on each RGB three-channel image;
step 1.5: converting each RGB three-channel image subjected to image enhancement into a tensor image;
step 1.6: regularizing each channel of each RGB three-channel image, and then normalizing the tensor image according to the mean vector and the standard deviation vector of the three channels;
step 1.7: and converting each normalized tensor image into a gray level image.
Further, step 1.4 specifically includes:
step A1: horizontally flipping the RGB three-channel image with a flip probability of 0.5;
step A2: adjusting the image attributes of the flipped RGB three-channel image, specifically: setting the brightness offset amplitude to 0.5, the contrast offset amplitude to 0.5, the saturation offset amplitude to 0.5, and the hue offset amplitude to 0;
step A3: setting the random crop area ratio to (0.7, 1.0), randomly cropping the attribute-adjusted RGB three-channel image to different sizes and aspect ratios, and then scaling the cropped RGB three-channel image to 224 × 224 pixels;
step A4: applying automatic data augmentation to each scaled RGB three-channel image using RandAugmentation.
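Steps A1 and A3 can be illustrated with a minimal pure-Python sketch, assuming an image is stored as a list of pixel rows. The function names `hflip`, `sample_crop_box` and `crop`, and the aspect-ratio range (3/4, 4/3), are illustrative assumptions that do not come from the text; the colour adjustment of step A2, the rescaling to 224 × 224 and RandAugmentation are omitted.

```python
import math
import random

def hflip(image, p=0.5, rng=random):
    """Flip a row-major image (list of rows) horizontally with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image

def sample_crop_box(height, width, scale=(0.7, 1.0), ratio=(3 / 4, 4 / 3), rng=random):
    """Sample (top, left, h, w) whose area is a fraction of the image drawn from `scale`.

    The `ratio` range is an assumption; the text only fixes the area range (0.7, 1.0).
    """
    for _ in range(10):
        area = height * width * rng.uniform(*scale)
        # Sample the aspect ratio log-uniformly so wide and tall crops are equally likely.
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        w = int(round(math.sqrt(area * aspect)))
        h = int(round(math.sqrt(area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            return top, left, h, w
    return 0, 0, height, width  # fallback: keep the whole image

def crop(image, box):
    """Cut the sampled box out of a row-major image."""
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]
```

In a training pipeline the cropped image would then be resized to 224 × 224 before the remaining augmentation is applied.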
Further, the first feature network adopts ResNet101 using Inception convolution as a backbone network, and includes 5 feature extraction layers in total, which are: the first feature layer, the second feature layer, the third feature layer, the fourth feature layer and the fifth feature layer;
the feature extraction process of the first feature layer comprises: first convolving the input preprocessed lung X-ray image with 64 convolution kernels having 3 channels, then applying Batch Normalization with a BN layer, then applying a ReLU activation function, and finally feeding the result to a max pooling layer with 64 channels to obtain a feature map C1;
the feature extraction process of the second feature layer comprises a first branch and a second branch; each branch applies its own feature extraction process to the input feature map C1 three times in sequence, and a preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C2; the preset processing operation is specifically: adding the outputs of the two branches and then applying a ReLU activation function;
the feature extraction process of the third feature layer comprises a third branch and a fourth branch; within the third feature layer, each branch applies its own feature extraction process to the input feature map C2 four times in sequence under different parameter states, and the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C3;
the feature extraction process of the fourth feature layer comprises a fifth branch and a sixth branch; within the fourth feature layer, each branch applies its own feature extraction process to the input feature map C3 twenty-three times in sequence under different parameter states, and the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C4;
the feature extraction process of the fifth feature layer comprises a seventh branch and an eighth branch; within the fifth feature layer, each branch applies its own feature extraction process to the input feature map C4 three times in sequence under different parameter states, the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C5, and the feature map C5 is used as the final feature map C.
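The layer layout described above can be summarized in a short sketch. The repetition counts (3, 4, 23, 3) and the output channel counts come from the text; the strides, and therefore the spatial sizes, are assumptions based on the usual ResNet101 layout, and the names `STAGES`, `merge_branches` and `feature_map_shapes` are hypothetical.

```python
# (feature map name, number of repeated blocks, output channels, assumed stride)
STAGES = [
    ("C2", 3, 256, 1),
    ("C3", 4, 512, 2),
    ("C4", 23, 1024, 2),
    ("C5", 3, 2048, 2),
]

def relu(values):
    """Element-wise ReLU on a flat list of activations."""
    return [max(0.0, v) for v in values]

def merge_branches(main, shortcut):
    """The preset processing operation: add the two branch outputs, then ReLU."""
    return relu([a + b for a, b in zip(main, shortcut)])

def feature_map_shapes(input_size=224):
    """Return {name: (channels, height, width)} for the feature maps C1 to C5."""
    size = input_size // 4  # assumed: stride-2 convolution followed by stride-2 max pooling
    shapes = {"C1": (64, size, size)}
    for name, _blocks, channels, stride in STAGES:
        size //= stride
        shapes[name] = (channels, size, size)
    return shapes

shapes = feature_map_shapes()
```

Under these assumptions a 224 × 224 input gives a C5 of 2048 channels at 7 × 7, and a C4 of 1024 channels at 14 × 14.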
Further, the second feature network adopts an FPN network;
the feature extraction process of the FPN network comprises: reducing the number of channels of the feature map C5 from 2048 to 256 using a 1 × 1 convolution; then upsampling to obtain a feature map of the same size as the feature map C4, denoted feature map C5_up; computing a weighted sum of the feature map C5_up and the feature map C4 to obtain a feature map P; performing feature fusion on the feature map P using a 3 × 3 convolution to obtain a feature map F; and applying a lateral connection to the feature map F, increasing the number of channels to 2048.
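The upsampling and weighted-sum part of this process can be sketched on a single channel as follows. Nearest-neighbour interpolation, the equal weights of 0.5, and the helper names are assumptions; the text does not state the interpolation method or the weights. The 1 × 1 and 3 × 3 convolutions are omitted, and the C4 channel is assumed to have already been brought to a matching channel count.

```python
def upsample_nearest(fmap, factor=2):
    """Repeat each value `factor` times along both axes of a 2-D map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def weighted_sum(a, b, wa=0.5, wb=0.5):
    """Element-wise wa * a + wb * b for two maps of equal size."""
    return [[wa * x + wb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

c5 = [[1.0, 2.0], [3.0, 4.0]]        # one channel of C5 after channel reduction
c4 = [[0.0] * 4 for _ in range(4)]   # matching channel of C4 at twice the resolution
c5_up = upsample_nearest(c5)         # same spatial size as c4
p = weighted_sum(c5_up, c4)          # one channel of the fused feature map P
```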
Further, the processing procedure of the feature map F by the attention module includes:
step B1: for each image set containing m lung X-ray images, each of size $H_0 \times W_0$, a corresponding feature map F is obtained after each lung X-ray image passes through the first feature network and the second feature network, and all the feature maps F of each image set form a feature tensor X;
step B2: applying a 1 × 1 convolution to the feature tensor X, and dividing the result by the regularized transpose of the feature tensor X;
step B3: unfolding the feature tensor X obtained in step B2 from the second dimension using the scatter() function, thereby separating the feature tensor of each lung X-ray image into $X_1, X_2, X_3, \dots, X_{H \times W}$;
step B4: performing a global average pooling operation over the features at all positions in the feature tensor X to obtain the global class-independent feature g: $g = \frac{1}{H \times W} \sum_{k=1}^{H \times W} X_k$;
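step B5: computing the score $s_j^i$, and performing a score-weighted combination of the feature tensors over all spatial positions of each class to obtain a class-specific feature tensor a: $s_j^i = \frac{\exp(T X_j^{\top} m_i)}{\sum_{k=1}^{H \times W} \exp(T X_k^{\top} m_i)}$, $a_i = \sum_{k=1}^{H \times W} s_k^i X_k$; wherein T is a hyper-parameter with T > 0, $X_j^{\top}$ and $X_k^{\top}$ respectively represent the transposes of $X_j$ and $X_k$, and $m_i$ represents the classifier parameter of the i-th class;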
step B6: obtaining the final $f_i$ from the class-specific feature tensor a and the global class-independent feature g: $f_i = g + \lambda a_i$; and reshaping $f_i$ to [m, 2048, 1]; wherein $f_i$ represents the feature vector of the i-th class;
step B7: expanding $f_i$ to the same dimensions as the feature tensor X in step B1, and multiplying it by the feature tensor X in step B1 to obtain a new feature tensor X.
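Steps B4 to B6 can be sketched in pure Python for a single image. The sketch assumes the score is a softmax over spatial positions of $T X_j^{\top} m_i$; that functional form, the default values of `T` and `lam`, and the function name are assumptions rather than details confirmed by the text. The convolution of step B2 and the reshaping and multiplication of steps B6 and B7 are omitted.

```python
import math

def class_specific_attention(X, m, T=1.0, lam=0.1):
    """X: list of H*W position vectors; m: list of per-class classifier vectors.

    Returns one feature vector f_i = g + lam * a_i per class.
    """
    n, dim = len(X), len(X[0])
    # Global class-independent feature g: average over all spatial positions.
    g = [sum(x[d] for x in X) / n for d in range(dim)]
    features = []
    for m_i in m:
        logits = [T * sum(xd * md for xd, md in zip(x, m_i)) for x in X]
        peak = max(logits)  # subtract the maximum for numerical stability
        weights = [math.exp(v - peak) for v in logits]
        total = sum(weights)
        s = [w / total for w in weights]  # assumed softmax over spatial positions
        # Class-specific feature a_i: score-weighted sum of the position vectors.
        a_i = [sum(s[k] * X[k][d] for k in range(n)) for d in range(dim)]
        features.append([g[d] + lam * a_i[d] for d in range(dim)])
    return features
```

With a very small `T` the scores are nearly uniform and `a_i` approaches `g`; with a large `T` the position that best matches `m_i` dominates.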
Further, the detection process of the network classifier specifically includes:
step 5.1: applying adaptive average pooling to the feature tensor X;
step 5.2: unfolding the feature tensor X from step 5.1 from the first dimension using the flatten() function, and then inputting the unfolded feature tensor X into a fully connected layer for linear conversion to obtain X′, specifically: $X' = XA^{\top} + b$; wherein $A^{\top}$ represents the transpose of A, A is the parameter matrix of the fully connected layer, and b is a bias row vector;
step 5.3: inputting the linearly converted feature vector X′ into a ReLU activation function, specifically: $\mathrm{ReLU}(X') = (X')^{+} = \max(0, X')$;
step 5.4: inputting the feature tensor X′ output in step 5.3 into a fully connected layer again for linear conversion, and outputting a feature tensor X′ of size [2048, 2];
step 5.5: inputting the linearly converted feature tensor X′ into a loss function to calculate a loss value; then training the model by means of an optimizer, the loss function, data evaluation and hyper-parameter modification; wherein the loss function adopts a binary cross-entropy function, specifically: $L = -\frac{1}{c} \sum_{i=1}^{c} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]$, wherein c is the number of categories, y represents the true value of the image, and $\hat{y}$ represents the predicted value of the image.
Further, in step 5.5, the optimizer adopts an SGD optimizer.
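The classifier head of steps 5.2 to 5.5 can be sketched on a single flattened feature vector as follows. The sigmoid used to turn outputs into predicted values, the averaging of the loss over the c outputs, the learning rate, and all function names are assumptions; the text names only the linear conversion, ReLU, binary cross-entropy and SGD.

```python
import math

def linear(x, A, b):
    """Compute x A^T + b for a row vector x; A has one row per output unit."""
    return [sum(xi * aij for xi, aij in zip(x, row)) + bj for row, bj in zip(A, b)]

def relu(x):
    """max(0, v) applied element-wise."""
    return [max(0.0, v) for v in x]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over the c outputs (sigmoid applied to each logit)."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clamp to avoid log(0)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / len(logits)

def sgd_step(params, grads, lr=0.01):
    """One plain SGD update: p <- p - lr * grad."""
    return [p - lr * g for p, g in zip(params, grads)]

hidden = relu(linear([1.0, -2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.5]))
loss = binary_cross_entropy([0.0, 0.0], [1.0, 0.0])
```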
The invention also provides a pneumonia image detection device based on mixed space and inter-channel attention, which comprises:
the preprocessing module is used for preprocessing the data of the lung X-ray image;
the first feature network construction module is used for constructing a first feature network, and performing feature extraction on the preprocessed lung X-ray image with the first feature network to obtain a feature map C;
the second feature network construction module is used for constructing a second feature network, and performing feature extraction on the feature map C with the second feature network to obtain a feature map F;
the attention mechanism construction module is used for constructing an attention module that mixes spatial attention and inter-channel attention, and processing the feature map F with the attention module to obtain a feature tensor X;
and the classifier construction module is used for constructing a network classifier, and detecting the feature tensor X with the network classifier to obtain a detection result.
The invention has the beneficial effects that:
1. The invention adds an attention mechanism that mixes spatial and inter-channel attention to the classification of lung X-ray images, generates class-specific features for the two classes (pneumonia and no pneumonia), and improves performance without any additional training burden.
2. The invention applies the attention mechanism to the characteristic fusion process, can effectively utilize information which is useful for pneumonia detection in different characteristic extraction layers, and inhibits irrelevant noise, thereby improving the detection efficiency.
Drawings
Fig. 1 is a schematic flowchart of a pneumonia image detection method based on mixed space and inter-channel attention according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a first feature network according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a second feature network according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an attention module that mixes spatial attention and inter-channel attention according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be described clearly below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
As shown in fig. 1, an embodiment of the present invention provides a pneumonia image detection method based on mixed space and inter-channel attention, including the following steps:
S101: carrying out data preprocessing on the lung X-ray image;
specifically, this step comprises the following sub-steps:
S1011: screening out unsatisfactory lung X-ray images;
S1012: dividing the lung X-ray image data set into a training set, a verification set and a test set;
S1013: converting the lung X-ray image into a three-channel image in RGB order;
S1014: carrying out image enhancement on the lung X-ray image;
as an implementation, this sub-step specifically includes:
step A1: horizontally flipping the lung X-ray image with probability p = 0.5;
step A2: adjusting the image attributes of the lung X-ray image, specifically: setting the brightness offset amplitude to 0.5, the contrast offset amplitude to 0.5, the saturation offset amplitude to 0.5, and the hue offset amplitude to 0;
step A3: setting the random crop area ratio to (0.7, 1.0), randomly cropping the lung X-ray image to different sizes and aspect ratios, and scaling the cropped lung X-ray image to 224 × 224 pixels;
step A4: applying automatic data augmentation to each lung X-ray image using RandAugmentation.
S1015: converting the picture format of the lung X-ray image into Tensor format (namely the vector format adopted in training), and normalizing, namely dividing each channel by 255;
S1016: regularizing each channel of the lung X-ray image, and normalizing the tensor image according to the mean vector and the standard deviation vector of the three channels;
specifically, the mean vector of the three channels is set to [0, 0, 0] and the standard deviation vector is set to [1, 1, 1]. The tensor image is normalized according to the mean vector and the standard deviation vector of the three channels, namely: $\mathrm{output}[c] = (\mathrm{input}[c] - \mathrm{mean}[c]) / \mathrm{std}[c]$ for each channel c;
S1017: converting the lung X-ray images into grayscale images.
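The scaling and per-channel normalization of S1015 and S1016 can be sketched as follows, assuming three-element mean and standard deviation vectors and an image stored as nested lists of channels, rows and pixels. The function names are hypothetical, and the grayscale conversion of S1017 is omitted.

```python
def to_tensor(image_u8):
    """Scale 8-bit pixel values in [0, 255] to floats in [0, 1] (S1015)."""
    return [[[px / 255.0 for px in row] for row in channel] for channel in image_u8]

def normalize(tensor, mean, std):
    """Apply (x - mean[c]) / std[c] to every pixel of channel c (S1016)."""
    return [
        [[(px - m) / s for px in row] for row in channel]
        for channel, m, s in zip(tensor, mean, std)
    ]

rgb = [[[0, 255], [51, 102]]] * 3  # toy 3-channel, 2 x 2 image
out = normalize(to_tensor(rgb), mean=[0.0, 0.0, 0.0], std=[1.0, 1.0, 1.0])
```

With a mean of 0 and a standard deviation of 1 in every channel, the normalization leaves the scaled values unchanged, so the output stays in [0, 1].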
S102: constructing a first feature network, and performing feature extraction on the preprocessed lung X-ray image by adopting the first feature network to obtain a feature map C;
specifically, as shown in fig. 2, the first feature network adopts ResNet101 using Inception convolution as a backbone network, and includes 5 feature extraction layers in total, which are: the first feature layer, the second feature layer, the third feature layer, the fourth feature layer and the fifth feature layer;
Inception convolution gives the convolution kernels of the original ResNet101 independent dilation across different axes (dimensions), channels and layers, i.e. each can have a different dilation value. For each layer, the dilation values of the two axes of each channel are obtained from the trained supernet; the set d of two-axis dilation values for all channels in the layer is expressed as follows: $d = \{\, d_x^i, d_y^i \mid d_x^i, d_y^i \in \{1, 2, \dots, d_{max}\},\ i \in \{1, 2, \dots, C_{out}\} \,\}$
wherein $d_x^i$ and $d_y^i$ represent the x-axis and y-axis dilation values in the i-th channel, $d_{max}$ is the maximum dilation value, and $C_{out}$ is the number of output channels.
There are 4 supernets, whose trained parameters correspond respectively to the second to fifth feature layers of ResNet101. Each layer of a supernet is composed of multiple convolutions covering all possible dilation values; after the supernet is trained, a dilation value is selected for each layer on the principle of minimizing a loss function, which determines the optimal dilation pattern of that layer. The specific method is as follows: for each layer in the supernet, W is the tensor of all original parameters, $W_i$ is the tensor of all original parameters of the i-th channel, $\tilde{W}_i^{d^i}$ denotes the parameters of the dilated convolution kernel of the i-th channel, and $\tilde{W}_d$ denotes the stack of $\tilde{W}_i^{d^i}$ along the output channels, $i \in \{1, 2, \dots, C_{out}\}$. Because W and $\tilde{W}_d$ are independent of X, d is optimized by minimizing $L_1$, the expected L1 norm of the difference between the convolutions of W and of $\tilde{W}_d$ with X. X is the input lung X-ray image; since the groups of X do not differ greatly, the expectation of X can be replaced by a constant α, namely $E[X] = \alpha \mathbf{1}$, where $\mathbf{1}$ is the all-ones matrix and $*$ is the convolution operation. The optimal d of each layer is the one at which $L_1$ is minimal. Applying the optimal d of each layer as a parameter to ResNet101 then yields a ResNet101 backbone network using Inception convolution.
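The idea of choosing a per-channel dilation pair by minimizing an L1 difference can be sketched for one channel as follows. This is an illustrative simplification under stated assumptions, not the selection procedure of the text: it assumes a 3 × 3 dilated kernel, a dense supernet kernel of size $(2 d_{max} + 1) \times (2 d_{max} + 1)$, and a loss equal to the total absolute weight that the dilated kernel does not cover. All function names are hypothetical.

```python
from itertools import product

def candidate_dilations(d_max):
    """All (d_x, d_y) pairs with each value in {1, ..., d_max}."""
    return list(product(range(1, d_max + 1), repeat=2))

def effective_extent(kernel_size, dilation):
    """Span covered along one axis by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1

def best_dilation(dense, d_max):
    """Pick the (d_x, d_y) whose 3 x 3 dilated taps leave the least |weight| uncovered.

    `dense` is a (2*d_max+1) x (2*d_max+1) kernel for one channel (assumed layout).
    """
    centre = d_max
    total = sum(abs(v) for row in dense for v in row)
    best, best_loss = None, float("inf")
    for dx, dy in candidate_dilations(d_max):
        kept = sum(
            abs(dense[centre + j * dy][centre + i * dx])
            for j in (-1, 0, 1) for i in (-1, 0, 1)
        )
        loss = total - kept  # L1 norm of the weights the dilated kernel drops
        if loss < best_loss:
            best, best_loss = (dx, dy), loss
    return best
```

For example, a 3 × 3 kernel with dilation 2 along an axis spans `effective_extent(3, 2) == 5` positions on that axis.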
The feature extraction process of the first feature layer comprises the following steps: firstly, carrying out convolution operation on the input preprocessed lung X-ray image by adopting 64 convolution kernels with the number of channels being 3, then carrying out Batchnormalization operation by adopting a BN layer, then processing by adopting a ReLu activation function, and finally inputting to a maximum pooling layer with the number of channels being 64 to obtain a characteristic map C1;
the feature extraction process of the second feature layer comprises a first branch and a second branch, the two branches sequentially repeat three times of feature extraction operations on the input feature map C1 according to respective feature extraction processes, and then the last output of the two branches is subjected to preset processing operation to obtain a feature map C2; the preset processing operation specifically comprises: adding the outputs of the two branches and then processing by adopting a ReLu activation function;
wherein, the feature extraction process of the first branch comprises in sequence: performing convolution operation on an input characteristic diagram by adopting 64 convolution kernels, performing Batch Normalization operation by adopting a BN layer, performing convolution operation by adopting 64 convolution kernels, performing Batch Normalization operation by adopting the BN layer, performing convolution operation by adopting 256 convolution kernels, and performing Batch Normalization operation by adopting the BN layer; the feature extraction process of the second branch comprises the following steps: carrying out convolution operation on the input feature map by adopting 256 convolution kernels;
the feature extraction process of the third feature layer comprises a third branch and a fourth branch, the two branches sequentially perform feature extraction operations on the input feature map C2 in the third feature layer under different parameter states for four times according to respective feature extraction processes, and then execute the preset processing operation on the last output of the two branches to obtain a feature map C3;
wherein, the feature extraction process of the third branch comprises in sequence: adopting 128 convolution kernels to perform a first convolution operation on an input characteristic diagram, adopting a BN layer to perform a Batchnormalization operation, adopting 128 convolution kernels to perform a downsampling operation, adopting the BN layer to perform the Batchnormalization operation, adopting 512 convolution kernels to perform a second convolution operation, and adopting the BN layer to perform the Batchnormalization operation; the feature extraction process of the fourth branch includes: adopting 512 convolution kernels to check the input feature map for down-sampling operation;
the feature extraction process of the fourth feature layer comprises a fifth branch and a sixth branch; following their respective feature extraction processes, the two branches perform the feature extraction operation twenty-three times in sequence, under different parameter states, on the feature map C3 input to the fourth feature layer, and the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C4;
wherein the feature extraction process of the fifth branch comprises, in sequence: performing a first convolution operation on the input feature map using 256 convolution kernels, performing a Batch Normalization operation using a BN layer, performing a downsampling operation using 256 convolution kernels, performing a Batch Normalization operation using the BN layer, performing a second convolution operation using 1024 convolution kernels, and performing a Batch Normalization operation using the BN layer; the feature extraction process of the sixth branch comprises: performing a downsampling operation on the input feature map using 1024 convolution kernels;
the feature extraction process of the fifth feature layer comprises a seventh branch and an eighth branch; following their respective feature extraction processes, the two branches perform the feature extraction operation three times in sequence, under different parameter states, on the feature map C4 input to the fifth feature layer, the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C5, and the feature map C5 is used as the final feature map C;
wherein the feature extraction process of the seventh branch comprises, in sequence: performing a first convolution operation on the input feature map using 512 convolution kernels, performing a Batch Normalization operation using a BN layer, performing a downsampling operation using 512 convolution kernels, performing a Batch Normalization operation using the BN layer, performing a second convolution operation using 2048 convolution kernels, and performing a Batch Normalization operation using the BN layer; the feature extraction process of the eighth branch comprises: performing a downsampling operation on the input feature map using 2048 convolution kernels;
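The stage layout described above (3, 4, 23 and 3 repetitions with output widths 256, 512, 1024 and 2048) can be checked with a short shape calculation. The following is only a sketch: the kernel sizes, strides and paddings are not stated in the text and are assumed here to follow the standard ResNet design, and the names `STAGES`, `conv_out`, `feature_map_shapes` and `conv_layer_count` are hypothetical.

```python
# Stage layout taken from the text: (name, branch width, output channels, repeats).
STAGES = [
    ("C2", 64, 256, 3),
    ("C3", 128, 512, 4),
    ("C4", 256, 1024, 23),
    ("C5", 512, 2048, 3),
]


def conv_out(size, kernel, stride, padding):
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_shapes(height=224, width=224):
    """Return {name: (channels, height, width)} for C1..C5.

    Assumed, not stated in the text: a 7x7 stride-2 stem convolution,
    a 3x3 stride-2 max pool, and one stride-2 downsampling step at the
    start of C3, C4 and C5.
    """
    h = conv_out(height, 7, 2, 3)
    w = conv_out(width, 7, 2, 3)
    h = conv_out(h, 3, 2, 1)
    w = conv_out(w, 3, 2, 1)
    shapes = {"C1": (64, h, w)}
    for name, _width, out_channels, _repeats in STAGES:
        if name != "C2":  # C2 keeps the resolution of C1
            h = conv_out(h, 3, 2, 1)
            w = conv_out(w, 3, 2, 1)
        shapes[name] = (out_channels, h, w)
    return shapes


def conv_layer_count():
    """Stem convolution + 3 convolutions per repeated block + 1 fully connected layer."""
    return 1 + sum(3 * stage[3] for stage in STAGES) + 1


shapes = feature_map_shapes()
```

Under these assumptions a 224 × 224 input yields a 7 × 7 map with 2048 channels at C5, and the layer count reproduces the "101" in the network name.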
S103: constructing a second feature network, and performing feature extraction on the feature map C using the second feature network to obtain a feature map F;
specifically, the second feature network is an FPN network; the feature extraction process of the FPN network comprises: reducing the number of channels of the feature map C5 from 2048 to 256 using a 1 × 1 convolution, then performing an up-sampling operation to obtain a feature map of the same size as the feature map C4, denoted feature map C5_up, and performing a weighted summation of the feature map C5_up and the feature map C4 to obtain a feature map P; performing feature fusion on the feature map P using a 3 × 3 convolution to obtain the feature map F; and laterally connecting the feature map F, increasing the number of channels to 2048.
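The up-sampling and weighted summation in the FPN step can be illustrated on a single channel. This is a minimal sketch, not the patented implementation: nearest-neighbour interpolation and equal weights of 0.5 are assumptions (the text gives neither the interpolation mode nor the weights), the channel counts of the two maps are taken to be already matched, and the helper names are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Double the height and width of a 2-D map by repeating each value."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def weighted_sum(a, b, w_a=0.5, w_b=0.5):
    """Element-wise w_a * a + w_b * b for two maps of equal size."""
    return [[w_a * x + w_b * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


c5 = [[1.0, 2.0], [3.0, 4.0]]        # stand-in for one channel of the reduced C5
c4 = [[0.0] * 4 for _ in range(4)]   # stand-in for the matching channel of C4
c5_up = upsample_nearest_2x(c5)      # now the same size as c4
p = weighted_sum(c5_up, c4)          # one channel of the feature map P
```

A 3 × 3 convolution over P would then produce F; it is omitted here to keep the sketch short.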
S104: constructing an attention module (HSC module) for mixing spatial attention and interchannel attention, and processing the feature map F by using the attention module to obtain a feature tensor X;
specifically, the present step includes the following substeps:
step B1: each batch of image sets contains m lung X-ray images, each of size $H_0 \times W_0$; after each lung X-ray image passes through the first feature extraction network and the second feature extraction network, a corresponding feature map F is obtained, and all the feature maps F of each batch of image sets form a feature tensor X;
step B2: performing 1 × 1 convolution operation on the feature tensor X, and dividing by the regularized transpose;
step B3: flattening the feature tensor X obtained in step B2 from the second dimension using the flatten() function, thereby separating the feature tensor of each pulmonary X-ray image into $X_1, X_2, X_3, \ldots, X_{H \times W}$;
step B4: performing a global average pooling operation on the features at all positions in the feature tensor X to obtain a global class-independent feature g: $g = \frac{1}{H \times W} \sum_{k=1}^{H \times W} X_k$;
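step B5: computing the score $s_j^i = \frac{\exp(T X_j^{\top} m_i)}{\sum_{k=1}^{H \times W} \exp(T X_k^{\top} m_i)}$, and performing a weighted combination of the feature tensors according to the score, so as to capture the maximum over all spatial positions for each category, to obtain a class-specific feature tensor a: $a_i = \sum_{k=1}^{H \times W} s_k^i X_k$; wherein T is a hyper-parameter greater than 0, $X_j^{\top}$ and $X_k^{\top}$ respectively denote the transposes of $X_j$ and $X_k$, and $m_i$ denotes the classifier parameters of the i-th class;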
step B6: obtaining the final $f_i$ from the class-specific feature tensor a and the global class-independent feature g: $f_i = g + \lambda a_i$; and expanding $f_i$ to [m, 2048, 1]; wherein $f_i$ denotes the feature vector of the i-th class;
step B7: expanding $f_i$ to the same size as the feature tensor X in step B1, and multiplying it with the feature tensor X in step B1 to obtain a new feature tensor X.
This step inputs the fused feature F into the HSC module, which exploits the spatial attention of each object class to achieve higher accuracy. The weights of the different channels are obtained and adjusted, so that the proportion of useful information is increased and the proportion of useless information is reduced. Likewise, the weights of all pixels on a feature map are obtained and adjusted, which increases the proportion of effective features and reduces the influence of background information.
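Steps B4 to B6 can be sketched for a single image in plain Python. The sketch assumes that the score is a softmax over spatial positions of $T \cdot X_k^{\top} m_i$, which is the reading suggested by the symbols defined in step B5; the function name `hsc_class_features` and the default values of `T` and `lam` are hypothetical.

```python
import math


def hsc_class_features(positions, classifiers, T=1.0, lam=0.1):
    """positions: list of H*W feature vectors X_k, each of length d.
    classifiers: list of class parameter vectors m_i, each of length d.
    Returns one vector f_i = g + lam * a_i per class.
    """
    n, d = len(positions), len(positions[0])
    # Step B4: global average pooling over all spatial positions.
    g = [sum(x[c] for x in positions) / n for c in range(d)]
    features = []
    for m in classifiers:
        # Step B5 (assumed form): softmax over positions of T * <X_k, m_i>.
        logits = [T * sum(xc * mc for xc, mc in zip(x, m)) for x in positions]
        peak = max(logits)  # subtract the maximum for numerical stability
        weights = [math.exp(v - peak) for v in logits]
        total = sum(weights)
        scores = [w / total for w in weights]
        # Class-specific feature: score-weighted sum of the position features.
        a = [sum(s * x[c] for s, x in zip(scores, positions)) for c in range(d)]
        # Step B6: combine the global and class-specific features.
        features.append([gc + lam * ac for gc, ac in zip(g, a)])
    return features
```

With a large T the softmax concentrates on the position that responds most strongly to the class, so a_i approaches a per-class maximum over positions; with a small T it approaches the average g.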
S105: constructing a network classifier, and detecting the feature tensor X using the network classifier to obtain a detection result.
Specifically, this step includes the following substeps:
S1051: applying adaptive average pooling to the feature tensor X;
S1052: flattening the feature tensor X from step S1051 from the first dimension using the flatten() function, and then inputting it into a fully connected layer for linear conversion to obtain X′, specifically: $X' = XA^{\top} + b$; wherein $A^{\top}$ denotes the transpose of A, A is the parameter matrix of the fully connected layer, and b is a bias row vector;
S1053: inputting the linearly converted feature vector X′ into a ReLU activation function, specifically: $\mathrm{ReLU}(X') = (X')^{+} = \max(0, X')$;
S1054: inputting the feature tensor X′ output in step S1053 into a fully connected layer again for linear conversion, and outputting a feature tensor X″ of size [2048, 2];
S1055: inputting the linearly converted feature tensor X″ into a loss function to calculate a loss value; the training model is then adjusted through the optimizer, the loss function, data evaluation and the hyper-parameters; wherein the loss function is a binary cross-entropy function, in which c is the number of categories, y denotes the true value of the image, and ŷ denotes the predicted value of the image.
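The classifier steps above (linear conversion, ReLU, second linear conversion, binary cross-entropy) can be sketched as follows. This is an illustration under assumptions: the standard mean-over-classes form of binary cross-entropy is used, a sigmoid maps the final outputs to probabilities (the text does not say how predictions are produced), and all function names are hypothetical.

```python
import math


def linear(x, A, b):
    """Row-vector form X' = X A^T + b; A has one row per output unit."""
    return [sum(xi * aij for xi, aij in zip(x, row)) + bj for row, bj in zip(A, b)]


def relu(x):
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in x]


def sigmoid(v):
    """Numerically stable logistic function (an assumed output mapping)."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean over the c classes of -[y log(p) + (1 - y) log(1 - p)]."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(y_true)


def classify(x, A1, b1, A2, b2):
    """Two linear layers with a ReLU in between, then per-class probabilities."""
    hidden = relu(linear(x, A1, b1))
    logits = linear(hidden, A2, b2)
    return [sigmoid(v) for v in logits]
```

For a two-class output, `binary_cross_entropy([1.0, 0.0], classify(...))` gives the loss value that the optimizer would then reduce.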
As an implementation manner, in step S1055, the optimizer is an SGD optimizer. In its update rule, α denotes the learning rate, which has an initial value of 0.01 and is multiplied by 0.1 at each iteration; $y^{(i)} - h_{\theta}(x^{(i)})$ denotes the loss term, i denotes the cycle index, j denotes the parameter index, and $\theta_j$ denotes the j-th parameter.
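The symbols listed for the SGD optimizer match the classic per-sample least-mean-squares update. The sketch below assumes that form, $\theta_j \leftarrow \theta_j + \alpha\,(y^{(i)} - h_{\theta}(x^{(i)}))\,x_j^{(i)}$ with a linear hypothesis $h_{\theta}$; neither the exact update nor the hypothesis is given in the text, and the function names are hypothetical.

```python
def sgd_step(theta, x, y, alpha):
    """One stochastic update theta_j += alpha * (y - h_theta(x)) * x_j.

    h_theta(x) = sum_j theta_j * x_j is a linear hypothesis, assumed here.
    """
    residual = y - sum(t * xj for t, xj in zip(theta, x))
    return [t + alpha * residual * xj for t, xj in zip(theta, x)]


def decayed_learning_rate(step, initial=0.01, factor=0.1):
    """Learning rate after `step` decays: initial * factor ** step."""
    return initial * factor ** step


theta = sgd_step([0.0, 0.0], [1.0, 2.0], 1.0, decayed_learning_rate(0))
```

The decay schedule uses the values from the text (initial value 0.01, factor 0.1); how often the factor is applied in practice is a training choice.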
The pneumonia X-ray image detection method based on mixed spatial and inter-channel attention provided by the invention preprocesses the selected lung X-ray images and builds them into a data set so as to enhance the image features. The processed lung X-ray images are used to tune parameters and train a convolutional neural network for feature extraction. The feature extraction network is a ResNet101 network, which fuses features across different feature maps by constructing residual blocks and addresses the vanishing-gradient and exploding-gradient problems of deep networks. The lung X-ray images in the training set undergo convolution operations in sequence, followed by feature fusion; an attention mechanism mixing spatial and inter-channel attention is then added, and finally a network classifier is constructed to perform image classification and diagnosis and obtain the prediction result.
Example 2
On the basis of Embodiment 1 above, in this embodiment of the invention the size of each lung X-ray image is set to 3 × 224 × 224 and the batch size is set to 16. The pneumonia image detection method comprises the following steps:
step S201: after passing through the first feature network and the second feature network, the feature tensor X formed by all the feature maps F of each batch of image sets has size [16, 256, 7, 7];
step S202: performing a 1 × 1 convolution operation on the feature tensor X, and dividing by the regularized transpose;
step S203: flattening the feature tensor X from the second dimension, thereby separating the feature tensor of each pulmonary X-ray image into $X_1, X_2, X_3, \ldots, X_{49}$;
step S204: obtaining the global class-independent feature g by performing a global average pooling operation on the features at all positions in the feature tensor X: $g = \frac{1}{49} \sum_{k=1}^{49} X_k$;
step S205: computing the score $s_j^i = \frac{\exp(T X_j^{\top} m_i)}{\sum_{k=1}^{49} \exp(T X_k^{\top} m_i)}$ and taking the value of the first dimension; then performing a weighted combination of the feature tensors according to the score, so as to capture the maximum over all spatial positions for each category, to obtain a class-specific feature tensor a: $a_i = \sum_{k=1}^{49} s_k^i X_k$;
step S206: obtaining the final $f_i$ by adding the class-specific feature and the global class-independent feature: $f_i = g + \lambda a_i$, where λ = 0.1; applying $f_i$ to each score tensor and fusing the results yields the final prediction result;
step S207: expanding $f_i$ to [16, 2048, 1];
step S208: expanding $f_i$ to the same size as the feature tensor X in step S201, and multiplying it with the feature tensor X in step S201 to obtain a new feature tensor X;
step S209: detecting the new feature tensor X using the network classifier of Embodiment 1 to obtain a detection result, namely whether the input lung X-ray image is a pneumonia image.
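The shape bookkeeping of steps S201 to S203 can be verified with a small helper. The sketch reads "flattening from the second dimension" as merging every dimension from 0-based index 2 onward, which is an assumption about the intended indexing; `flatten_shape` is a hypothetical name.

```python
def flatten_shape(shape, start_dim):
    """Shape after merging every dimension from start_dim onward (0-based)."""
    merged = 1
    for size in shape[start_dim:]:
        merged *= size
    return list(shape[:start_dim]) + [merged]


x_shape = [16, 256, 7, 7]          # [batch, channels, height, width] from step S201
flat = flatten_shape(x_shape, 2)   # height and width merged into one axis
positions_per_image = flat[-1]     # number of spatial positions X_1 ... X_49
```

The 7 × 7 spatial grid thus gives the 49 position vectors per image referred to in step S203.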
Example 3
The embodiment of the invention also provides a pneumonia image detection device based on attention between the mixed space and the channels, which comprises:
the preprocessing module is used for preprocessing the data of the lung X-ray image;
the first feature network construction module is used for constructing a first feature network, and performing feature extraction on the preprocessed lung X-ray image by adopting the first feature network to obtain a feature map C;
the second feature network construction module is used for constructing a second feature network, and performing feature extraction on the feature map C using the second feature network to obtain a feature map F;
the attention mechanism construction module is used for constructing an attention module that mixes spatial attention and inter-channel attention, and processing the feature map F using the attention module to obtain a new feature map or a new feature tensor X;
and the classifier construction module is used for constructing a network classifier, and detecting the feature tensor X using the network classifier to obtain a detection result.
The pneumonia image detection device provided by the embodiment of the invention is used for realizing the method embodiment, and specific functions of the pneumonia image detection device can refer to the method embodiment, and are not described again here.
Claims (9)
1. The pneumonia image detection method based on mixed space and inter-channel attention is characterized by comprising the following steps:
step 1: carrying out data preprocessing on the lung X-ray image;
step 2: constructing a first feature network, and performing feature extraction on the preprocessed lung X-ray image by adopting the first feature network to obtain a feature map C;
step 3: constructing a second feature network, and performing feature extraction on the feature map C using the second feature network to obtain a feature map F;
step 4: constructing an attention module for mixing spatial attention and inter-channel attention, and processing the feature map F using the attention module to obtain a feature tensor X;
step 5: constructing a network classifier, and detecting the feature tensor X using the network classifier to obtain a detection result.
2. The pneumonia image detection method based on mixed space and inter-channel attention according to claim 1 is characterized in that step 1 specifically comprises:
step 1.1: screening out unsatisfactory lung X-ray images;
step 1.2: dividing a data set consisting of all lung X-ray images meeting the requirements into a training set, a verification set and a test set;
step 1.3: converting each lung X-ray image in the data set into an RGB three-channel image;
step 1.4: carrying out image enhancement on each RGB three-channel image;
step 1.5: converting each RGB three-channel image subjected to image enhancement into a tensor image;
step 1.6: regularizing each channel of each RGB three-channel image, and then normalizing the tensor image according to the mean vector and the standard deviation vector of the three channels;
step 1.7: converting each normalized tensor image into a grayscale image.
3. The pneumonia image detection method based on mixed space and inter-channel attention according to claim 2 is characterized in that step 1.4 specifically comprises:
step A1: horizontally flipping the RGB three-channel image with a flip probability of 0.5;
step A2: adjusting the image attributes of the flipped RGB three-channel image, specifically: setting the brightness offset amplitude to 0.5, the contrast offset amplitude to 0.5, the saturation offset amplitude to 0.5, and the hue offset amplitude to 0;
step A3: setting the random crop area ratio to (0.7, 1.0), randomly cropping the RGB three-channel image with adjusted image attributes to different sizes and aspect ratios, and then scaling the cropped RGB three-channel image to 224 × 224 pixels;
step A4: automatically augmenting each scaled RGB three-channel image using RandAugment.
4. The pneumonia image detection method based on mixed space and inter-channel attention according to claim 1 is characterized in that the first feature network adopts ResNet101 using Inception convolution as a backbone network, and comprises 5 feature extraction layers in total, which are respectively: the first feature layer, the second feature layer, the third feature layer, the fourth feature layer and the fifth feature layer;
the feature extraction process of the first feature layer comprises: first performing a convolution operation on the input preprocessed lung X-ray image using 64 convolution kernels with 3 channels, then performing a Batch Normalization operation using a BN layer, then applying a ReLU activation function, and finally inputting the result to a max pooling layer with 64 channels to obtain a feature map C1;
the feature extraction process of the second feature layer comprises a first branch and a second branch; following their respective feature extraction processes, the two branches repeat the feature extraction operation three times in sequence on the input feature map C1, and a preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C2; the preset processing operation is specifically: adding the outputs of the two branches and then applying a ReLU activation function;
the feature extraction process of the third feature layer comprises a third branch and a fourth branch; following their respective feature extraction processes, the two branches perform the feature extraction operation four times in sequence, under different parameter states, on the feature map C2 input to the third feature layer, and the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C3;
the feature extraction process of the fourth feature layer comprises a fifth branch and a sixth branch; following their respective feature extraction processes, the two branches perform the feature extraction operation twenty-three times in sequence, under different parameter states, on the feature map C3 input to the fourth feature layer, and the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C4;
the feature extraction process of the fifth feature layer comprises a seventh branch and an eighth branch; following their respective feature extraction processes, the two branches perform the feature extraction operation three times in sequence, under different parameter states, on the feature map C4 input to the fifth feature layer, the preset processing operation is then applied to the final outputs of the two branches to obtain a feature map C5, and the feature map C5 is used as the final feature map C.
5. The pneumonia image detection method based on mixed space and inter-channel attention according to claim 4, wherein the second feature network is an FPN network;
the feature extraction process of the FPN network comprises: reducing the number of channels of the feature map C5 from 2048 to 256 using a 1 × 1 convolution, then performing an up-sampling operation to obtain a feature map of the same size as the feature map C4, denoted feature map C5_up, and performing a weighted summation of the feature map C5_up and the feature map C4 to obtain a feature map P; performing feature fusion on the feature map P using a 3 × 3 convolution to obtain the feature map F; and laterally connecting the feature map F, increasing the number of channels to 2048.
6. The pneumonia image detection method based on mixed space and inter-channel attention of claim 1 is characterized in that the attention module processes the feature map F including:
step B1: each batch of image sets contains m lung X-ray images, each of size $H_0 \times W_0$; after each lung X-ray image passes through the first feature extraction network and the second feature extraction network, a corresponding feature map F is obtained, and all the feature maps F of each batch of image sets form a feature tensor X;
step B2: performing a 1 × 1 convolution operation on the feature tensor X, and dividing by the regularized transpose;
step B3: flattening the feature tensor X obtained in step B2 from the second dimension using the flatten() function, thereby separating the feature tensor of each pulmonary X-ray image into $X_1, X_2, X_3, \ldots, X_{H \times W}$;
step B4: performing a global average pooling operation on the features at all positions in the feature tensor X to obtain a global class-independent feature g: $g = \frac{1}{H \times W} \sum_{k=1}^{H \times W} X_k$;
step B5: computing the score $s_j^i = \frac{\exp(T X_j^{\top} m_i)}{\sum_{k=1}^{H \times W} \exp(T X_k^{\top} m_i)}$, and performing a weighted combination of the feature tensors according to the score, so as to capture the maximum over all spatial positions for each category, to obtain a class-specific feature tensor a: $a_i = \sum_{k=1}^{H \times W} s_k^i X_k$; wherein T is a hyper-parameter greater than 0, $X_j^{\top}$ and $X_k^{\top}$ respectively denote the transposes of $X_j$ and $X_k$, and $m_i$ denotes the classifier parameters of the i-th class;
step B6: obtaining the final $f_i$ from the class-specific feature tensor a and the global class-independent feature g: $f_i = g + \lambda a_i$; and expanding $f_i$ to [m, 2048, 1]; wherein $f_i$ denotes the feature vector of the i-th class;
step B7: expanding $f_i$ to the same size as the feature tensor X in step B1, and multiplying it with the feature tensor X in step B1 to obtain a new feature tensor X.
7. The pneumonia image detection method based on mixed space and inter-channel attention of claim 1 is characterized in that the detection process of the network classifier specifically comprises the following steps:
step 5.1: applying adaptive average pooling to the feature tensor X;
step 5.2: flattening the feature tensor X from step 5.1 from the first dimension using the flatten() function, and then inputting it into a fully connected layer for linear conversion to obtain X′, specifically: $X' = XA^{\top} + b$; wherein $A^{\top}$ denotes the transpose of A, A is the parameter matrix of the fully connected layer, and b is a bias row vector;
step 5.3: inputting the linearly converted feature vector X′ into a ReLU activation function, specifically: $\mathrm{ReLU}(X') = (X')^{+} = \max(0, X')$;
step 5.4: inputting the feature tensor X′ output in step 5.3 into a fully connected layer again for linear conversion, and outputting a feature tensor X″ of size [2048, 2];
step 5.5: inputting the linearly converted feature tensor X″ into a loss function to calculate a loss value; the training model is then adjusted through the optimizer, the loss function, data evaluation and the hyper-parameters; wherein the loss function is a binary cross-entropy function, in which c is the number of categories, y denotes the true value of the image, and ŷ denotes the predicted value of the image.
8. The pneumonia image detection method based on mixed spatial and interchannel attention of claim 7 wherein in step 5.5 said optimizer employs an SGD optimizer.
9. Pneumonia image detection device based on mixed space and interchannel attention includes:
the preprocessing module is used for preprocessing the data of the lung X-ray image;
the first feature network construction module is used for constructing a first feature network, and performing feature extraction on the preprocessed lung X-ray image by adopting the first feature network to obtain a feature map C;
the second feature network construction module is used for constructing a second feature network, and performing feature extraction on the feature map C using the second feature network to obtain a feature map F;
the attention mechanism construction module is used for constructing an attention module for mixing spatial attention and inter-channel attention, and processing the feature map F using the attention module to obtain a feature tensor X;
and the classifier construction module is used for constructing a network classifier, and detecting the feature tensor X using the network classifier to obtain a detection result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210536524.7A CN114782403A (en) | 2022-05-17 | 2022-05-17 | Pneumonia image detection method and device based on mixed space and inter-channel attention |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114782403A (en) | 2022-07-22 |
Family
ID=82437616
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210536524.7A Pending CN114782403A (en) | 2022-05-17 | 2022-05-17 | Pneumonia image detection method and device based on mixed space and inter-channel attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114782403A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024045320A1 (en) * | 2022-08-31 | 2024-03-07 | 北京龙智数科科技服务有限公司 | Facial recognition method and apparatus |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113065558B (en) | Lightweight small target detection method combined with attention mechanism | |
CN110443143B (en) | Multi-branch convolutional neural network fused remote sensing image scene classification method | |
CN108960143B (en) | Ship detection deep learning method in high-resolution visible light remote sensing image | |
CN106529447B (en) | Method for identifying face of thumbnail | |
CN110348319B (en) | Face anti-counterfeiting method based on face depth information and edge image fusion | |
CN109685819B (en) | Three-dimensional medical image segmentation method based on feature enhancement | |
Lin et al. | Hyperspectral image denoising via matrix factorization and deep prior regularization | |
CN112580782B (en) | Channel-enhanced dual-attention generation countermeasure network and image generation method | |
CN112288011B (en) | Image matching method based on self-attention deep neural network | |
CN112348036A (en) | Self-adaptive target detection method based on lightweight residual learning and deconvolution cascade | |
WO2022083335A1 (en) | Self-attention mechanism-based behavior recognition method | |
CN112734764A (en) | Unsupervised medical image segmentation method based on countermeasure network | |
CN110648311A (en) | Acne image focus segmentation and counting network model based on multitask learning | |
CN115222998B (en) | Image classification method | |
CN111680755A (en) | Medical image recognition model construction method, medical image recognition device, medical image recognition medium and medical image recognition terminal | |
CN111899203A (en) | Real image generation method based on label graph under unsupervised training and storage medium | |
CN115049952A (en) | Juvenile fish limb identification method based on multi-scale cascade perception deep learning network | |
CN115880523A (en) | Image classification model, model training method and application thereof | |
CN116229230A (en) | Vein recognition neural network model, method and system based on multi-scale transducer | |
CN114782403A (en) | Pneumonia image detection method and device based on mixed space and inter-channel attention | |
CN115100165A (en) | Colorectal cancer T staging method and system based on tumor region CT image | |
CN117115675A (en) | Cross-time-phase light-weight spatial spectrum feature fusion hyperspectral change detection method, system, equipment and medium | |
CN117036948A (en) | Sensitized plant identification method based on attention mechanism | |
CN116189160A (en) | Infrared dim target detection method based on local contrast mechanism | |
CN109829377A (en) | A kind of pedestrian's recognition methods again based on depth cosine metric learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||