CN114387467A - Medical image classification method based on multi-module convolution feature fusion

Medical image classification method based on multi-module convolution feature fusion

Info

Publication number
CN114387467A
Authority
CN
China
Prior art keywords
layer
feature map
output
feature
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111501536.8A
Other languages
Chinese (zh)
Other versions
CN114387467B (en)
Inventor
Sun Mingjian (孙明健)
Shen Yi (沈毅)
Ma Lingyu (马凌玉)
Hu Xinge (胡歆格)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute Of Technology At Zhangjiakou
Harbin Institute of Technology
Original Assignee
Harbin Institute Of Technology At Zhangjiakou
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute Of Technology At Zhangjiakou, Harbin Institute of Technology filed Critical Harbin Institute Of Technology At Zhangjiakou
Priority to CN202111501536.8A priority Critical patent/CN114387467B/en
Publication of CN114387467A publication Critical patent/CN114387467A/en
Application granted granted Critical
Publication of CN114387467B publication Critical patent/CN114387467B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a medical image classification method based on multi-module convolution feature fusion, which comprises the following steps: step one, image preprocessing; step two, modular convolution design of the network model; step three, multi-module feature fusion; step four, feature extraction based on the multi-module fused features; and step five, outputting the classification result through an average pooling layer. Because the proposed method extracts features purely by convolution, it has a small number of parameters and a high operation speed, which suits the small-sample nature of medical image data: no excessive weight parameters need to be updated, and experimental verification shows excellent performance. The method is suitable not only for binary classification tasks but also for multi-class tasks, and the number of output channels of each module can be adjusted flexibly according to the task requirements to improve the classification effect.

Description

Medical image classification method based on multi-module convolution feature fusion
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to a medical image classification method, and particularly relates to a medical image classification method based on multi-module convolution feature fusion.
Background
Deep learning techniques enable machines to analyze large numbers of training images and automatically learn feature representations through the back-propagation algorithm. They can solve many image understanding and analysis problems, such as semantic segmentation, image recognition and classification. At present, convolutional neural networks (CNNs) are gradually becoming the standard technology in medical image screening and classification, with very wide applications such as skin cancer classification, lung nodule identification and thyroid disease diagnosis. Medical images are characterized by small sample sizes and unbalanced data distributions, so extracting features from limited samples is the key to effective recognition. Medical image classification and recognition currently relies mostly on classic CNN models, and the accuracy still needs to be improved.
Disclosure of Invention
The invention aims to provide a medical image classification method based on multi-module convolution feature fusion so as to fully extract key features of images and improve the accuracy of medical image identification.
The purpose of the invention is realized by the following technical scheme:
a medical image classification method based on multi-module convolution feature fusion comprises the following steps:
step one, image preprocessing
A two-dimensional medical image of any size is converted to 224 × 224 by a preprocessing module; the preprocessing module also identifies the current usage state of the network model: if the model is in the training stage, image data enhancement operations are performed, such as one or more of rotation, flipping, brightness adjustment and Gaussian noise addition; if the model is in the testing stage, no such operation is performed;
step two, modular convolution design of the network model
The multi-module convolutional network model architecture comprises the following 5 modules:
module 1: comprises a first layer Conv3-64 and a second layer Dwconv3-64, wherein: after the preprocessing module, the input data dimension of the first layer Conv3-64 is N × 3 × 224 × 224, i.e. the height and width of the input image are both 224, N is the size of the data volume loaded in each batch, and the color channels are RGB; the first layer outputs a feature map of dimension N × 64 × 224 × 224 to the second layer Dwconv3-64; the second layer Dwconv3-64 outputs a feature map A of dimension N × 64 × 224 × 224, which passes through a maximum pooling layer to give a feature map of dimension N × 64 × 112 × 112 that is input into module 2; the outputs of the first layer Conv3-64 and the second layer Dwconv3-64 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 224 × 224; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 224 × 224; feature extraction is then performed by a standard convolutional layer Conv1-X, where X is a variable adjusted adaptively according to the number of classes to be learned, with a suggested range of 16 ≤ X ≤ 64, giving a feature map of dimension N × X × 224 × 224; finally, an average pooling layer with kernel size 32 × 32 outputs a feature map B of dimension N × X × 7 × 7;
module 2: comprises a third layer Conv3-128 and a fourth layer Dwconv3-128, wherein: the third layer Conv3-128 takes the feature map of dimension N × 64 × 112 × 112 as input and outputs a feature map of dimension N × 128 × 112 × 112 to the fourth layer Dwconv3-128; the fourth layer Dwconv3-128 outputs a feature map C of dimension N × 128 × 112 × 112, which passes through a maximum pooling layer to give a feature map of dimension N × 128 × 56 × 56 that is input into module 3; the outputs of the third layer Conv3-128 and the fourth layer Dwconv3-128 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 112 × 112; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 112 × 112; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 112 × 112; finally, an average pooling layer with kernel size 16 × 16 outputs a feature map D of dimension N × X × 7 × 7;
module 3: comprises a fifth layer Conv3-256, a sixth layer Dwconv3-256 and a seventh layer Dwconv3-256, wherein: the fifth layer Conv3-256 takes the feature map of dimension N × 128 × 56 × 56 as input and outputs a feature map of dimension N × 256 × 56 × 56 to the sixth layer Dwconv3-256; the sixth layer Dwconv3-256 outputs a feature map of dimension N × 256 × 56 × 56 to the seventh layer Dwconv3-256; the seventh layer Dwconv3-256 outputs a feature map E of dimension N × 256 × 56 × 56, which passes through a maximum pooling layer to give a feature map of dimension N × 256 × 28 × 28 that is input into module 4; the outputs of the fifth layer Conv3-256, the sixth layer Dwconv3-256 and the seventh layer Dwconv3-256 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 56 × 56; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 56 × 56; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 56 × 56; finally, an average pooling layer with kernel size 8 × 8 outputs a feature map F of dimension N × X × 7 × 7;
module 4: comprises an eighth layer Conv3-512, a ninth layer Dwconv3-512 and a tenth layer Dwconv3-512, wherein: the eighth layer Conv3-512 takes the feature map of dimension N × 256 × 28 × 28 as input and outputs a feature map of dimension N × 512 × 28 × 28 to the ninth layer Dwconv3-512; the ninth layer Dwconv3-512 outputs a feature map of dimension N × 512 × 28 × 28 to the tenth layer Dwconv3-512; the tenth layer Dwconv3-512 outputs a feature map G of dimension N × 512 × 28 × 28, which passes through a maximum pooling layer to give a feature map of dimension N × 512 × 14 × 14 that is input into module 5; the outputs of the eighth layer Conv3-512, the ninth layer Dwconv3-512 and the tenth layer Dwconv3-512 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 28 × 28; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 28 × 28; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 28 × 28; finally, an average pooling layer with kernel size 4 × 4 outputs a feature map H of dimension N × X × 7 × 7;
module 5: comprises an eleventh layer Conv3-512, a twelfth layer Dwconv3-512 and a thirteenth layer Dwconv3-512, wherein: the eleventh layer Conv3-512 takes the feature map of dimension N × 512 × 14 × 14 as input and outputs a feature map of dimension N × 512 × 14 × 14 to the twelfth layer Dwconv3-512; the twelfth layer Dwconv3-512 outputs a feature map of dimension N × 512 × 14 × 14 to the thirteenth layer Dwconv3-512; the thirteenth layer Dwconv3-512 outputs a feature map I of dimension N × 512 × 14 × 14; the outputs of the eleventh layer Conv3-512, the twelfth layer Dwconv3-512 and the thirteenth layer Dwconv3-512 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 14 × 14; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 14 × 14; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 14 × 14; finally, an average pooling layer with kernel size 2 × 2 outputs a feature map J of dimension N × X × 7 × 7;
step three, multi-module feature fusion
Multi-module feature fusion is performed in concat mode: the N × X × 7 × 7 feature map B output by module 1, the N × X × 7 × 7 feature map D output by module 2, the N × X × 7 × 7 feature map F output by module 3, the N × X × 7 × 7 feature map H output by module 4 and the N × X × 7 × 7 feature map J output by module 5 are concatenated channel-wise into an N × 5X × 7 × 7 feature map;
step four, feature extraction based on multi-module fusion features
Feature extraction is performed on the N × 5X × 7 × 7 feature map fused in step three using a standard convolutional layer Conv1-nclass, giving an N × nclass × 7 × 7 feature map;
step five, outputting classification results through an average pooling layer
An average pooling layer with kernel size 7 × 7 reduces the dimensionality to output an N × nclass classification result, and softmax outputs the predicted probability of each class.
Compared with the prior art, the invention has the following advantages:
1. The proposed classification method extracts features purely by convolution, so the number of parameters is small (5.2 MB when Conv1-X is Conv1-16) and the operation speed is high; this suits the small-sample nature of medical image data, since no excessive weight parameters need to be updated, and experimental verification shows excellent performance.
2. The method is suitable not only for binary classification tasks but also for multi-class tasks, and the number of output channels of each module can be adjusted flexibly according to the task requirements to improve the classification effect.
3. The last layer of the method is an average pooling layer, which makes it convenient to extract the features learned by the network and to visualize the class activation heat map.
Drawings
FIG. 1 is a network model architecture proposed by the present invention;
FIG. 2 shows example images of the four anatomical positions;
FIG. 3 is an example class activation heatmap.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings, but is not limited thereto; any modification or equivalent replacement that does not depart from the spirit and scope of the technical solution of the present invention shall fall within the protection scope of the present invention.
The invention provides a medical image classification method based on multi-module convolution feature fusion, which comprises the following steps:
step one, image preprocessing
A two-dimensional medical image of any size is converted to 224 × 224 by a preprocessing module; the preprocessing module also identifies the current usage state of the network model: if the model is in the training stage, image data enhancement operations such as rotation, flipping, brightness adjustment and Gaussian noise addition are performed; if the model is in the testing stage, no such operation is performed.
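The preprocessing step can be sketched as follows, assuming a PyTorch/torchvision implementation; the specific augmentation parameters (rotation angle, brightness range, noise level) are illustrative and are not fixed by this description.

```python
import torch
import torchvision.transforms as T

def build_preprocess(training: bool):
    """Resize any 2-D image to 224 x 224; apply data enhancement only during training."""
    ops = [T.Resize((224, 224))]
    if training:
        ops += [
            T.RandomRotation(degrees=15),        # rotation (illustrative angle)
            T.RandomHorizontalFlip(p=0.5),       # flipping
            T.ColorJitter(brightness=0.2),       # brightness adjustment
        ]
    ops.append(T.ToTensor())                     # -> 3 x 224 x 224 tensor, RGB, values in [0, 1]
    if training:
        # Gaussian noise addition (illustrative standard deviation)
        ops.append(T.Lambda(lambda t: t + 0.01 * torch.randn_like(t)))
    return T.Compose(ops)

train_transform = build_preprocess(training=True)   # training stage: enhancement applied
test_transform = build_preprocess(training=False)   # testing stage: no enhancement
```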
Step two, modular convolution design of the network model
As shown in FIG. 1, the network model architecture comprises 5 modules. Each module is composed of standard convolutional layers, depthwise separable convolutional layers and an average pooling layer, and the modules are connected by maximum pooling layers, each using a 2 × 2 kernel with stride 2; the activation function used throughout the model is the ReLU function. If the number of classes nclass ≤ 16, Conv1-X can be Conv1-16; if nclass > 16, the channel depth of Conv1-X can be increased, e.g. X > 20. The implementation is illustrated below with Conv1-X taken as Conv1-16; in that case the total number of model parameters is 5.2 MB and the computation is about 83.8 G FLOPs.
The default input data dimension of the network is N × 3 × 224 × 224, where N is the size of the data volume loaded in each batch, the height and width of the input image are both 224, and the color channels are RGB.
(1) Module 1: Conv3-64 is the first convolutional layer, i.e. the convolution kernel sliding window is 3 × 3 × 3, the stride is 1 and the padding is 1. By the general formulas (1) and (2), the height and width of the output feature map are both 224, and this layer is required to output a depth of 64, so the feature map output by Conv3-64 has dimension N × 64 × 224 × 224. The second layer Dwconv3-64 is a depthwise separable convolution, which splits the convolution into two steps: a channel-by-channel convolution and a point-by-point 1 × 1 convolution. The channel-by-channel convolution uses 64 corresponding 3 × 3 kernels, giving an output of size N × 64 × 224 × 224; the point-by-point 1 × 1 convolution then fuses the features of different channels and outputs feature map A of dimension N × 64 × 224 × 224. The outputs of the first layer Conv3-64 and the second layer Dwconv3-64 are each connected to a standard convolutional layer Conv1-32 (kernel size 1 × 1, channel depth 32, padding = 0, stride = 1), and each output feature map has size N × 32 × 224 × 224. The summation operation Σ is then performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 224 × 224. Feature extraction is then performed by a standard convolutional layer Conv1-16 (kernel size 1 × 1, channel depth 16, padding = 0, stride = 1), giving an output feature map of dimension N × 16 × 224 × 224. Finally, an average pooling layer with kernel size 32 × 32 outputs feature map B of dimension N × 16 × 7 × 7.
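One such module can be sketched as follows, assuming PyTorch; the class names DepthwiseSeparable and FusionModule are illustrative, not taken from the description, and details such as the exact placement of the ReLU activations within a module are an assumption of this sketch.

```python
import torch
import torch.nn as nn

class DepthwiseSeparable(nn.Module):
    """Dwconv3-C: channel-by-channel 3x3 convolution followed by a point-by-point 1x1 convolution."""
    def __init__(self, channels):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, stride=1, padding=1, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.pointwise(self.relu(self.depthwise(x))))

class FusionModule(nn.Module):
    """One module: a standard 3x3 convolution, one or two depthwise separable convolutions,
    and a side branch that taps every layer with Conv1-32, sums the taps, applies Conv1-X and
    average pooling so the branch always ends at an N x X x 7 x 7 feature map."""
    def __init__(self, in_ch, out_ch, num_dw, pool_kernel, x_ch=16):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.dws = nn.ModuleList([DepthwiseSeparable(out_ch) for _ in range(num_dw)])
        # one Conv1-32 tap per layer (the standard conv plus each depthwise separable conv)
        self.taps = nn.ModuleList([nn.Conv2d(out_ch, 32, 1) for _ in range(1 + num_dw)])
        self.reduce = nn.Conv2d(32, x_ch, 1)            # Conv1-X
        self.branch_pool = nn.AvgPool2d(pool_kernel)    # brings the branch down to 7 x 7

    def forward(self, x):
        outs = [self.conv(x)]
        for dw in self.dws:
            outs.append(dw(outs[-1]))
        fused = sum(tap(o) for tap, o in zip(self.taps, outs))   # summation operation Σ
        branch = self.branch_pool(self.reduce(fused))            # N x X x 7 x 7
        return outs[-1], branch                                  # main-path output, branch output
```

For module 1, FusionModule(3, 64, num_dw=1, pool_kernel=32) reproduces the dimensions above: the main path ends at N × 64 × 224 × 224 (feature map A) and the branch at N × 16 × 7 × 7 (feature map B).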
W_output = (W_input - W_filter + 2P) / S + 1    (1)
H_output = (H_input - H_filter + 2P) / S + 1    (2)
where W and H denote the width and height of an image, respectively; the subscript input refers to parameters of the input image, the subscript output to parameters of the output image, and the subscript filter to parameters of the convolution kernel; S denotes the stride of the convolution kernel; and P (short for padding) denotes the number of pixel layers added at the image boundary.
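A quick check of general formulas (1) and (2), assuming the usual convolution output-size rule; the helper name conv_output_size is illustrative.

```python
def conv_output_size(size_in, size_filter, stride, padding):
    """W_output = (W_input - W_filter + 2P) / S + 1 (the same formula holds for the height)."""
    return (size_in - size_filter + 2 * padding) // stride + 1

# Conv3-64 in module 1: a 3x3 kernel with stride 1 and padding 1 keeps a 224x224 input at 224x224.
assert conv_output_size(224, 3, stride=1, padding=1) == 224
# The 2x2, stride-2 maximum pooling between modules halves the spatial size: 224 -> 112.
assert conv_output_size(224, 2, stride=2, padding=0) == 112
```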
The feature map A output by module 1 passes through the maximum pooling layer to dimension N × 64 × 112 × 112 and is input into module 2.
(2) Module 2: Conv3-128 is the third convolutional layer, using a 3 × 3 kernel, stride 1 and padding 1, and the output feature map has dimension N × 128 × 112 × 112. The fourth layer uses the depthwise separable convolution Dwconv3-128: the channel-by-channel convolution with 128 corresponding 3 × 3 kernels gives an output of size N × 128 × 112 × 112, and the point-by-point 1 × 1 convolution then fuses the features of different channels and outputs feature map C of dimension N × 128 × 112 × 112. The outputs of the third layer Conv3-128 and the fourth layer Dwconv3-128 are each connected to a standard convolutional layer Conv1-32 (kernel size 1 × 1, channel depth 32, padding = 0, stride = 1), and each output feature map has size N × 32 × 112 × 112. The summation operation Σ is then performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 112 × 112. Feature extraction is then performed by a standard convolutional layer Conv1-16 (kernel size 1 × 1, channel depth 16, padding = 0, stride = 1), giving an output feature map of dimension N × 16 × 112 × 112. Finally, an average pooling layer with kernel size 16 × 16 outputs feature map D of dimension N × 16 × 7 × 7.
The feature map C output by module 2 passes through the maximum pooling layer to dimension N × 128 × 56 × 56 and is input into module 3.
(3) Module 3: Conv3-256 is the fifth convolutional layer, using a 3 × 3 kernel, stride 1 and padding 1, and the output feature map has dimension N × 256 × 56 × 56. The sixth layer uses the depthwise separable convolution Dwconv3-256: the channel-by-channel convolution with 256 corresponding 3 × 3 kernels gives an output of size N × 256 × 56 × 56, and the point-by-point 1 × 1 convolution then fuses the features of different channels and outputs a feature map of dimension N × 256 × 56 × 56 to the seventh layer. The seventh layer again uses the depthwise separable convolution Dwconv3-256 and outputs feature map E of dimension N × 256 × 56 × 56. The outputs of the fifth layer Conv3-256, the sixth layer Dwconv3-256 and the seventh layer Dwconv3-256 are each connected to a standard convolutional layer Conv1-32 (kernel size 1 × 1, channel depth 32, padding = 0, stride = 1), and each output feature map has size N × 32 × 56 × 56. The summation operation Σ is then performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 56 × 56. Feature extraction is then performed by a standard convolutional layer Conv1-16 (kernel size 1 × 1, channel depth 16, padding = 0, stride = 1), giving an output feature map of dimension N × 16 × 56 × 56. Finally, an average pooling layer with kernel size 8 × 8 outputs feature map F of dimension N × 16 × 7 × 7.
The feature map E output by module 3 passes through the maximum pooling layer to dimension N × 256 × 28 × 28 and is input into module 4.
(4) Module 4: Conv3-512 is the eighth convolutional layer, using a 3 × 3 kernel, stride 1 and padding 1, and the output feature map has dimension N × 512 × 28 × 28. The ninth layer uses the depthwise separable convolution Dwconv3-512: the channel-by-channel convolution with 512 corresponding 3 × 3 kernels gives an output of size N × 512 × 28 × 28, and the point-by-point 1 × 1 convolution then fuses the features of different channels and outputs a feature map of dimension N × 512 × 28 × 28 to the tenth layer. The tenth layer again uses the depthwise separable convolution Dwconv3-512 and outputs feature map G of dimension N × 512 × 28 × 28. The outputs of the eighth layer Conv3-512, the ninth layer Dwconv3-512 and the tenth layer Dwconv3-512 are each connected to a standard convolutional layer Conv1-32 (kernel size 1 × 1, channel depth 32, padding = 0, stride = 1), and each output feature map has size N × 32 × 28 × 28. The summation operation Σ is then performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 28 × 28. Feature extraction is then performed by a standard convolutional layer Conv1-16 (kernel size 1 × 1, channel depth 16, padding = 0, stride = 1), giving an output feature map of dimension N × 16 × 28 × 28. Finally, an average pooling layer with kernel size 4 × 4 outputs feature map H of dimension N × 16 × 7 × 7.
The feature map G output by module 4 passes through the maximum pooling layer to dimension N × 512 × 14 × 14 and is input into module 5.
(5) Module 5: the standard convolutional layer and the depthwise separable convolutional layers of module 5 are essentially the same as those of module 4. Conv3-512 is the eleventh convolutional layer, using a 3 × 3 kernel, stride 1 and padding 1, and the output feature map has dimension N × 512 × 14 × 14. The twelfth layer uses the depthwise separable convolution Dwconv3-512: the channel-by-channel convolution with 512 corresponding 3 × 3 kernels gives an output of size N × 512 × 14 × 14, and the point-by-point 1 × 1 convolution then fuses the features of different channels and outputs a feature map of dimension N × 512 × 14 × 14 to the thirteenth layer. The thirteenth layer again uses the depthwise separable convolution Dwconv3-512 and outputs feature map I of dimension N × 512 × 14 × 14. The outputs of the eleventh layer Conv3-512, the twelfth layer Dwconv3-512 and the thirteenth layer Dwconv3-512 are each connected to a standard convolutional layer Conv1-32 (kernel size 1 × 1, channel depth 32, padding = 0, stride = 1), and each output feature map has size N × 32 × 14 × 14. The summation operation Σ is then performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 14 × 14. Feature extraction is then performed by a standard convolutional layer Conv1-16 (kernel size 1 × 1, channel depth 16, padding = 0, stride = 1), giving an output feature map of dimension N × 16 × 14 × 14. Finally, an average pooling layer with kernel size 2 × 2 outputs feature map J of dimension N × 16 × 7 × 7.
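Chaining the five modules with the 2 × 2, stride-2 maximum pooling layers can be sketched as follows, reusing the FusionModule class from the sketch above; the Backbone name and the configuration tuples are illustrative.

```python
class Backbone(nn.Module):
    """Modules 1-5 connected by 2x2, stride-2 max pooling; module 5 has no pooling after it."""
    def __init__(self, x_ch=16):
        super().__init__()
        # (in_ch, out_ch, number of depthwise separable layers, branch avg-pool kernel)
        cfg = [(3, 64, 1, 32), (64, 128, 1, 16), (128, 256, 2, 8),
               (256, 512, 2, 4), (512, 512, 2, 2)]
        self.blocks = nn.ModuleList([FusionModule(i, o, n, k, x_ch) for i, o, n, k in cfg])
        self.pool = nn.MaxPool2d(2, stride=2)

    def forward(self, x):
        branches = []
        for i, block in enumerate(self.blocks):
            x, b = block(x)
            branches.append(b)                    # feature maps B, D, F, H, J, each N x X x 7 x 7
            if i < len(self.blocks) - 1:
                x = self.pool(x)                  # max pooling connects module k to module k+1
        return branches
```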
Step three, multi-module feature fusion
Multi-module feature fusion is performed in concat mode: the N × 16 × 7 × 7 feature map B output by module 1, the N × 16 × 7 × 7 feature map D output by module 2, the N × 16 × 7 × 7 feature map F output by module 3, the N × 16 × 7 × 7 feature map H output by module 4 and the N × 16 × 7 × 7 feature map J output by module 5 are concatenated channel-wise into an N × 80 × 7 × 7 feature map.
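Step three as code, continuing the sketch above (the batch size N = 4 is only for illustration); with X = 16 the concatenated map is N × 80 × 7 × 7.

```python
branches = Backbone(x_ch=16)(torch.randn(4, 3, 224, 224))  # five branch maps B, D, F, H, J
fused = torch.cat(branches, dim=1)                          # channel-wise concat: N x 80 x 7 x 7
print(fused.shape)                                          # torch.Size([4, 80, 7, 7])
```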
Step four, feature extraction based on multi-module fusion features
A standard convolutional layer Conv1-nclass (i.e. a kernel size of 1 × 1 and a number of channels equal to the corresponding number of classes nclass) takes the N × 80 × 7 × 7 feature map as input and extracts features to obtain an N × nclass × 7 × 7 feature map.
Step five, outputting classification results through an average pooling layer
An average pooling layer with kernel size 7 × 7 reduces the dimensionality to output an N × nclass classification result; the predicted probability of each class can then be output via softmax.
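Steps four and five as code, continuing the sketch above with nclass = 4 (matching the example below): a Conv1-nclass layer, 7 × 7 average pooling, and softmax.

```python
nclass = 4
head = nn.Sequential(
    nn.Conv2d(5 * 16, nclass, kernel_size=1),  # Conv1-nclass applied to the N x 80 x 7 x 7 map
    nn.AvgPool2d(7),                           # 7 x 7 average pooling -> N x nclass x 1 x 1
)
logits = head(fused).flatten(1)                # N x nclass classification result
probs = torch.softmax(logits, dim=1)           # predicted probability of each class
```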
Example:
The proposed method is used to recognize the anatomical positions of the cardia, the anterior wall of the gastric angle, the posterior wall of the gastric angle and the pylorus, i.e. nclass = 4. Example images of the positions are shown in FIG. 2.
The data set is divided into a training set and a test set, and the size N of the data volume loaded in each batch is 16. The training set contains 1912 images in total: 320 of the cardia, 634 of the anterior wall of the gastric angle, 634 of the posterior wall of the gastric angle and 324 of the pylorus. The test set contains 741 images in total: 101 of the cardia, 245 of the anterior wall of the gastric angle, 262 of the posterior wall of the gastric angle and 133 of the pylorus. As shown in Table 1, the model recognizes the 4 anatomical positions with an accuracy above 99%.
TABLE 1 Evaluation indices for anatomical position recognition
The last layer of the proposed network model is an average pooling layer, so gradient back-propagation can be carried out directly without modifying the model, and the class activation heat map can be obtained, as shown in FIG. 3.
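One way to read the class activation heat map off this architecture, continuing the sketch above: because the classifier is a 1 × 1 convolution followed by global average pooling, the c-th channel of the N × nclass × 7 × 7 map is itself the activation map for class c and can be upsampled to the input size for display. Variable names are illustrative.

```python
import torch.nn.functional as F

cam_maps = head[0](fused)                                  # N x nclass x 7 x 7, before pooling
pred = probs.argmax(dim=1)                                 # predicted class per image
cam = cam_maps[torch.arange(cam_maps.size(0)), pred]       # N x 7 x 7 map of the predicted class
cam = F.interpolate(cam.unsqueeze(1), size=(224, 224),
                    mode='bilinear', align_corners=False)  # N x 1 x 224 x 224 heat map
# normalize each map to [0, 1] for visualization
cam_min = cam.amin(dim=(2, 3), keepdim=True)
cam_max = cam.amax(dim=(2, 3), keepdim=True)
cam = (cam - cam_min) / (cam_max - cam_min + 1e-8)
```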

Claims (6)

1. A medical image classification method based on multi-module convolution feature fusion is characterized by comprising the following steps:
step one, image preprocessing
A two-dimensional medical image of any size is converted to 224 × 224 by a preprocessing module; the preprocessing module also identifies the current usage state of the network model: if the model is in the training stage, an image data enhancement operation is performed; if the model is in the testing stage, no image data enhancement operation is performed;
step two, modular convolution design of the network model
The multi-module convolutional network model architecture comprises the following 5 modules:
module 1: comprises a first layer Conv3-64 and a second layer Dwconv3-64, wherein: after the preprocessing module, the input data dimension of the first layer Conv3-64 is N × 3 × 224 × 224, i.e. the height and width of the input image are both 224, N is the size of the data volume loaded in each batch, and the color channels are RGB; the first layer outputs a feature map of dimension N × 64 × 224 × 224 to the second layer Dwconv3-64; the second layer Dwconv3-64 outputs a feature map A of dimension N × 64 × 224 × 224, which passes through a maximum pooling layer to give a feature map of dimension N × 64 × 112 × 112 that is input into module 2; the outputs of the first layer Conv3-64 and the second layer Dwconv3-64 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 224 × 224; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 224 × 224; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 224 × 224; finally, an average pooling layer with kernel size 32 × 32 outputs a feature map B of dimension N × X × 7 × 7;
module 2: comprises a third layer Conv3-128 and a fourth layer Dwconv3-128, wherein: the third layer Conv3-128 takes the feature map of dimension N × 64 × 112 × 112 as input and outputs a feature map of dimension N × 128 × 112 × 112 to the fourth layer Dwconv3-128; the fourth layer Dwconv3-128 outputs a feature map C of dimension N × 128 × 112 × 112, which passes through a maximum pooling layer to give a feature map of dimension N × 128 × 56 × 56 that is input into module 3; the outputs of the third layer Conv3-128 and the fourth layer Dwconv3-128 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 112 × 112; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 112 × 112; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 112 × 112; finally, an average pooling layer with kernel size 16 × 16 outputs a feature map D of dimension N × X × 7 × 7;
module 3: comprises a fifth layer Conv3-256, a sixth layer Dwconv3-256 and a seventh layer Dwconv3-256, wherein: the fifth layer Conv3-256 takes the feature map of dimension N × 128 × 56 × 56 as input and outputs a feature map of dimension N × 256 × 56 × 56 to the sixth layer Dwconv3-256; the sixth layer Dwconv3-256 outputs a feature map of dimension N × 256 × 56 × 56 to the seventh layer Dwconv3-256; the seventh layer Dwconv3-256 outputs a feature map E of dimension N × 256 × 56 × 56, which passes through a maximum pooling layer to give a feature map of dimension N × 256 × 28 × 28 that is input into module 4; the outputs of the fifth layer Conv3-256, the sixth layer Dwconv3-256 and the seventh layer Dwconv3-256 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 56 × 56; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 56 × 56; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 56 × 56; finally, an average pooling layer with kernel size 8 × 8 outputs a feature map F of dimension N × X × 7 × 7;
module 4: comprises an eighth layer Conv3-512, a ninth layer Dwconv3-512 and a tenth layer Dwconv3-512, wherein: the eighth layer Conv3-512 takes the feature map of dimension N × 256 × 28 × 28 as input and outputs a feature map of dimension N × 512 × 28 × 28 to the ninth layer Dwconv3-512; the ninth layer Dwconv3-512 outputs a feature map of dimension N × 512 × 28 × 28 to the tenth layer Dwconv3-512; the tenth layer Dwconv3-512 outputs a feature map G of dimension N × 512 × 28 × 28, which passes through a maximum pooling layer to give a feature map of dimension N × 512 × 14 × 14 that is input into module 5; the outputs of the eighth layer Conv3-512, the ninth layer Dwconv3-512 and the tenth layer Dwconv3-512 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 28 × 28; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 28 × 28; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 28 × 28; finally, an average pooling layer with kernel size 4 × 4 outputs a feature map H of dimension N × X × 7 × 7;
module 5: comprises an eleventh layer Conv3-512, a twelfth layer Dwconv3-512 and a thirteenth layer Dwconv3-512, wherein: the eleventh layer Conv3-512 takes the feature map of dimension N × 512 × 14 × 14 as input and outputs a feature map of dimension N × 512 × 14 × 14 to the twelfth layer Dwconv3-512; the twelfth layer Dwconv3-512 outputs a feature map of dimension N × 512 × 14 × 14 to the thirteenth layer Dwconv3-512; the thirteenth layer Dwconv3-512 outputs a feature map I of dimension N × 512 × 14 × 14; the outputs of the eleventh layer Conv3-512, the twelfth layer Dwconv3-512 and the thirteenth layer Dwconv3-512 are each connected to a standard convolutional layer Conv1-32, producing feature maps of dimension N × 32 × 14 × 14; a summation operation Σ is performed on these outputs to obtain an information-fused feature map of dimension N × 32 × 14 × 14; feature extraction is then performed by a standard convolutional layer Conv1-X, giving a feature map of dimension N × X × 14 × 14; finally, an average pooling layer with kernel size 2 × 2 outputs a feature map J of dimension N × X × 7 × 7;
step three, multi-module feature fusion
Multi-module feature fusion is performed in concat mode: the N × X × 7 × 7 feature map B output by module 1, the N × X × 7 × 7 feature map D output by module 2, the N × X × 7 × 7 feature map F output by module 3, the N × X × 7 × 7 feature map H output by module 4 and the N × X × 7 × 7 feature map J output by module 5 are concatenated channel-wise into an N × 5X × 7 × 7 feature map;
step four, feature extraction based on multi-module fusion features
Feature extraction is performed on the N × 5X × 7 × 7 feature map fused in step three using a standard convolutional layer Conv1-nclass, giving an N × nclass × 7 × 7 feature map;
step five, outputting classification results through an average pooling layer
An average pooling layer with kernel size 7 × 7 reduces the dimensionality to output an N × nclass classification result, and softmax outputs the predicted probability of each class.
2. The method for classifying medical images based on multi-module convolution feature fusion according to claim 1, wherein the image data enhancement operation is one or more of rotation, flipping, brightness adjustment and Gaussian noise addition.
3. The method for classifying medical images based on multi-module convolution feature fusion according to claim 1, wherein each of the modules 1 to 5 is composed of a standard convolution layer, a depth separable convolution layer and an average pooling layer, and a maximum pooling layer is used for connection between modules.
4. The method for classifying medical images based on multi-module convolution feature fusion according to claim 1 or 3, characterized in that the maximum pooling layer uses a convolution kernel of 2 x 2 and the step size stride is 2.
5. The method of classifying medical images based on multi-module convolution feature fusion according to claim 1, wherein the activation function used by the network model is a ReLU function.
6. The method of classifying medical images based on multi-module convolution feature fusion according to claim 1, wherein 16 ≤ X ≤ 64.
CN202111501536.8A 2021-12-09 2021-12-09 Medical image classification method based on multi-module convolution feature fusion Active CN114387467B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111501536.8A CN114387467B (en) 2021-12-09 2021-12-09 Medical image classification method based on multi-module convolution feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111501536.8A CN114387467B (en) 2021-12-09 2021-12-09 Medical image classification method based on multi-module convolution feature fusion

Publications (2)

Publication Number Publication Date
CN114387467A true CN114387467A (en) 2022-04-22
CN114387467B CN114387467B (en) 2022-07-29

Family

ID=81196240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111501536.8A Active CN114387467B (en) 2021-12-09 2021-12-09 Medical image classification method based on multi-module convolution feature fusion

Country Status (1)

Country Link
CN (1) CN114387467B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108345890A (en) * 2018-03-01 2018-07-31 腾讯科技(深圳)有限公司 Image processing method, device and relevant device
CN110321967A (en) * 2019-07-11 2019-10-11 南京邮电大学 Image classification innovatory algorithm based on convolutional neural networks
CN110619638A (en) * 2019-08-22 2019-12-27 浙江科技学院 Multi-mode fusion significance detection method based on convolution block attention module
US20200065619A1 (en) * 2017-11-09 2020-02-27 Boe Technology Group Co., Ltd. Image processing method, processing apparatus and processing device
CN111145170A (en) * 2019-12-31 2020-05-12 电子科技大学 Medical image segmentation method based on deep learning
WO2020222985A1 (en) * 2019-04-30 2020-11-05 The Trustees Of Dartmouth College System and method for attention-based classification of high-resolution microscopy images
CN112036475A (en) * 2020-08-28 2020-12-04 江南大学 Fusion module, multi-scale feature fusion convolutional neural network and image identification method
CN112215291A (en) * 2020-10-19 2021-01-12 中国计量大学 Method for extracting and classifying medical image features under cascade neural network
CN112561863A (en) * 2020-12-03 2021-03-26 吉林大学 Medical image multi-classification recognition system based on improved ResNet
CN112990391A (en) * 2021-05-20 2021-06-18 四川大学 Feature fusion based defect classification and identification system of convolutional neural network
CN113177465A (en) * 2021-04-27 2021-07-27 江苏科技大学 SAR image automatic target recognition method based on depth separable convolutional neural network

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200065619A1 (en) * 2017-11-09 2020-02-27 Boe Technology Group Co., Ltd. Image processing method, processing apparatus and processing device
CN108345890A (en) * 2018-03-01 2018-07-31 腾讯科技(深圳)有限公司 Image processing method, device and relevant device
WO2020222985A1 (en) * 2019-04-30 2020-11-05 The Trustees Of Dartmouth College System and method for attention-based classification of high-resolution microscopy images
CN110321967A (en) * 2019-07-11 2019-10-11 南京邮电大学 Image classification innovatory algorithm based on convolutional neural networks
CN110619638A (en) * 2019-08-22 2019-12-27 浙江科技学院 Multi-mode fusion significance detection method based on convolution block attention module
CN111145170A (en) * 2019-12-31 2020-05-12 电子科技大学 Medical image segmentation method based on deep learning
CN112036475A (en) * 2020-08-28 2020-12-04 江南大学 Fusion module, multi-scale feature fusion convolutional neural network and image identification method
CN112215291A (en) * 2020-10-19 2021-01-12 中国计量大学 Method for extracting and classifying medical image features under cascade neural network
CN112561863A (en) * 2020-12-03 2021-03-26 吉林大学 Medical image multi-classification recognition system based on improved ResNet
CN113177465A (en) * 2021-04-27 2021-07-27 江苏科技大学 SAR image automatic target recognition method based on depth separable convolutional neural network
CN112990391A (en) * 2021-05-20 2021-06-18 四川大学 Feature fusion based defect classification and identification system of convolutional neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhu Dejiang: "Research on Automatic Detection and Intelligent Classification of Pulmonary Nodules Based on Deep Neural Networks", China Master's Theses Full-text Database (Basic Sciences) *
Li Chao, Sun Mingjian, Ma Liyong, Shen Yi, Lin Riqiang, Gong Xiaojing: "Three-Dimensional Vessel Enhancement Algorithm for In Vivo Photoacoustic Endoscopic Imaging", Chinese Journal of Lasers *

Also Published As

Publication number Publication date
CN114387467B (en) 2022-07-29

Similar Documents

Publication Publication Date Title
CN110378381B (en) Object detection method, device and computer storage medium
CN110120040B (en) Slice image processing method, slice image processing device, computer equipment and storage medium
CN108182441B (en) Parallel multichannel convolutional neural network, construction method and image feature extraction method
CN111028146B (en) Image super-resolution method for generating countermeasure network based on double discriminators
CN107016681B (en) Brain MRI tumor segmentation method based on full convolution network
CN106951825B (en) Face image quality evaluation system and implementation method
CN110705555B (en) Abdomen multi-organ nuclear magnetic resonance image segmentation method, system and medium based on FCN
CN112446476A (en) Neural network model compression method, device, storage medium and chip
CN108288035A (en) The human motion recognition method of multichannel image Fusion Features based on deep learning
WO2018052587A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
CN110110596B (en) Hyperspectral image feature extraction, classification model construction and classification method
CN110084159A (en) Hyperspectral image classification method based on the multistage empty spectrum information CNN of joint
CN112365514A (en) Semantic segmentation method based on improved PSPNet
CN110879982A (en) Crowd counting system and method
CN110852369B (en) Hyperspectral image classification method combining 3D/2D convolutional network and adaptive spectrum unmixing
CN111489364A (en) Medical image segmentation method based on lightweight full convolution neural network
CN107180241A (en) A kind of animal classification method of the profound neutral net based on Gabor characteristic with fractal structure
CN110222718A (en) The method and device of image procossing
CN115601751B (en) Fundus image semantic segmentation method based on domain generalization
CN113989405B (en) Image generation method based on small sample continuous learning
CN109508639B (en) Road scene semantic segmentation method based on multi-scale porous convolutional neural network
CN114882278A (en) Tire pattern classification method and device based on attention mechanism and transfer learning
CN113361466A (en) Multi-modal cross-directed learning-based multi-spectral target detection method
CN109344852A (en) Image-recognizing method and device, analysis instrument and storage medium
CN114387467B (en) Medical image classification method based on multi-module convolution feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant