WO2024108505A1 - Method for classifying myocardial fibrosis based on a residual capsule network - Google Patents
- Publication number
- WO2024108505A1 (application PCT/CN2022/134155)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
Definitions
- the present invention relates to the technical field of medical image processing, and more specifically, to a method for classifying myocardial fibrosis based on a residual capsule network.
- Myocardial fibrosis is mainly caused by the significant proliferation of myocardial interstitial fibroblasts and abnormally distributed collagen deposition.
- the degree of myocardial fibrosis is closely related to the prognosis of heart disease.
- delayed enhancement magnetic resonance imaging (DE-MRI) of the myocardium is an effective method to evaluate the distribution characteristics and degree of myocardial fibrosis. This type of scan has the characteristics of high resolution and low motion blur. Since manual detection is usually time-consuming, in recent years, cardiac MRI machine learning has gradually been applied to the diagnosis of dilated cardiomyopathy, myocardial fibrosis, and assessment of left ventricular volume.
- CNN: convolutional neural network
- CNN uses convolution kernels to extract image information, reduces training parameters through pooling layers, and finally outputs classification results through classifiers.
- CNN achieves invariance to translation and rotation through weight sharing and pooling operations, but CNN cannot identify image spatial relationships that have a significant impact on the results. For example, in medical images, even if the spatial positions of organs are disrupted, CNN will still identify them as normal samples.
- CNN requires a large data set to form accurate patterns. Deep networks are prone to gradient vanishing and explosion, and pooling layers may cause the loss of extracted feature information, thus affecting the final classification accuracy.
- CapsNet: capsule network
- CapsNet considers the spatial relationship between features and uses a dynamic routing algorithm to encode the relationship between objects and results, thereby reflecting the overall characteristics of the image.
- Capsules can store vectors describing specific objects, where the length of the vector represents the probability of the existence of the object and the direction represents the description parameter. The stored description parameter vector can be flattened and reversely rendered to form the original image.
- CapsNet is friendly to small data sets. CapsNet does not set a pooling layer that risks losing information. Although CapsNet performs well on the small data set MNIST, it has low accuracy in classifying complex data sets such as MRI scans. This is because CapsNet is a shallow network, mainly including convolutional layers for feature extraction, capsule layers for encoding mapping, and an optional decoder for reconstructing images. CapsNet usually has only two convolutional layers, which makes it unable to capture the deeper features contained in MRI scans.
- because the decoder regenerates the high-resolution image from only a few vectors, the reconstruction error of the image will increase, further reducing the classification accuracy. The network is also prone to gradient disappearance and explosion, which has an adverse effect on training speed and performance.
- the purpose of the present invention is to overcome the defects of the above-mentioned prior art and provide a method for classifying myocardial fibrosis based on a residual capsule network.
- the method comprises the following steps:
- the residual capsule network includes an encoder and a decoder, the encoder includes a residual network and a capsule network, the decoder includes a deconvolution layer, the residual network is used to extract feature maps of different depths from the input image and pass them to the capsule network, the capsule network performs vectorized weighted merging on the feature maps of different depths, and the decoder uses the longest vector in the connected capsules of the capsule network to reconstruct the image.
- the advantage of the present invention is that a new type of capsule network is proposed, which replaces the convolution layer with a residual block.
- the residual block can extract deeper semantic information from the image, and then combine the shallow, middle and deep features extracted by different residual blocks to help build a more efficient capsule network.
- scaling reconstruction is used in the decoder.
- the residual capsule network model proposed in the present invention improves the clarity of the reconstructed image, especially for large DE-MRI scans, and has better performance.
- FIG. 1 is a flow chart of a method for classifying myocardial fibrosis based on a residual capsule network according to an embodiment of the present invention.
- FIG. 2 is a schematic diagram of the overall structure of a residual capsule network according to an embodiment of the present invention.
- FIG. 3 is a schematic diagram of an original residual block and pre-activation according to an embodiment of the present invention.
- FIG. 4 is a schematic diagram of the structure of an improved capsule network according to an embodiment of the present invention.
- FIG. 5 is a schematic diagram of a compression and excitation module according to an embodiment of the present invention.
- FIG. 6 is a structural diagram of a deconvolution network layer according to an embodiment of the present invention.
- FIG. 7 is a schematic diagram of a positive sample and a negative sample of a DE-MRI image according to an embodiment of the present invention.
- FIG. 8 is a comparison diagram of test loss and test accuracy of three models according to an embodiment of the present invention.
- FIG. 9 is a comparison diagram of test loss and accuracy of capsule networks under different data amounts according to an embodiment of the present invention.
- the present invention proposes a novel residual capsule network (or ResCapsNet), which replaces the convolutional layer with an improved residual network to compress features, reduces the number of primary capsules, and prevents gradient vanishing and explosion.
- the residual network introduces residual blocks, which enable the network to pass more information at each layer through shortcut connections.
- the number of weights in the network can be reduced.
- the original image is scaled to reduce the amount of calculation.
- a new loss function using a scaling reconstruction method is proposed to further reduce the computational burden.
- the provided method for classifying myocardial fibrosis based on a residual capsule network includes the following steps.
- Step S110 construct a residual capsule network, where the residual capsule network includes an encoder and a decoder, wherein the encoder includes a residual network and a capsule network.
- the residual capsule network mainly includes an encoder and a decoder.
- the encoder first encodes the input image, and then the decoder reconstructs the image.
- the encoder includes a residual network and a capsule network, where the capsule network contains primary capsules, digital capsules, and connection capsules, and the decoder includes multiple deconvolution layers, for example, 3 or 5 deconvolution layers.
- the residual network is responsible for extracting feature information.
- the capsule network is responsible for vectorizing the features.
- the decoder is used to reconstruct the image and obtain the reconstruction loss by comparing it with the scaled input.
- an improved residual network is used as the convolution layer, and the pooling layer in the residual network is removed because it loses image information.
- the residual network contains multiple residual blocks to preserve the spatial location information of the feature map and prevent gradient disappearance and explosion.
- the residual network is mainly used to extract deep features and semantic information from images.
- the setting includes 4 residual blocks.
- the image is input into the first residual block to obtain a feature map of 96 × 96 × 128.
- (3 residual layers; convolution kernel is 3 × 3, stride is 2)
- a shallow feature map of size 48 × 48 × 256 is obtained, and the extracted shallow feature map is transferred to PrimaryA (the first primary capsule).
- a middle-level feature map of size 24 × 24 × 512 is obtained, and the extracted middle-level feature map is transferred to PrimaryB (the second primary capsule).
- the middle-level features are input into the last residual block to obtain a deep feature map of size 8 × 8 × 256, and the extracted deep feature map is transferred to PrimaryC (the third primary capsule).
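The halving of the spatial size from block to block follows from the stride-2 convolutions. The sketch below checks that arithmetic under two assumptions that are not stated in this passage: a padding of 1 for the 3 × 3 kernels, and a 192 × 192 input (the crop size given in the experiments). The function names are illustrative only.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(input_size, num_blocks):
    """Apply the same stride-2 convolution num_blocks times and record each output size."""
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        size = conv_out_size(size)
        sizes.append(size)
    return sizes


# Assumed 192 x 192 input; padding=1 is an assumption, not taken from the text.
sizes = feature_map_sizes(192, 3)
print(sizes)  # [96, 48, 24]
```

Under these assumptions the first three sizes match the 96, 48 and 24 quoted above; the step from 24 to 8 in the last block would need a different stride or padding, which the text does not specify.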
- a new residual block structure is introduced. As shown in Figure 3, the left figure is the structure of the original residual block, and the right figure is the pre-activated structure, which uses a 1 × 1 convolution layer for size reduction.
- the activation function is moved to the residual part. Replacing the activation that follows the addition with an identity mapping helps improve the accuracy of the model.
- the new residual block in Figure 3 includes a batch normalization layer (BN), an activation layer (for example, a ReLU activation function), and a convolution layer.
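The ordering BN, then ReLU, then convolution, with an untouched shortcut, can be illustrated on a toy 1-D signal. This is only a sketch of the pre-activation idea: the normalization here runs over the signal itself rather than over a training batch, the convolution is one-dimensional, and all function names are hypothetical.

```python
import math


def batch_norm(x, eps=1e-5):
    """Normalize a 1-D signal to zero mean and unit variance (stand-in for BN)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def relu(x):
    return [max(0.0, v) for v in x]


def conv1d_same(x, kernel):
    """Zero-padded 1-D convolution that keeps the input length (odd kernel sizes)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def pre_activation_block(x, kernel):
    """Residual branch is BN -> ReLU -> conv; the shortcut is a pure identity."""
    residual = conv1d_same(relu(batch_norm(x)), kernel)
    return [a + b for a, b in zip(x, residual)]


signal = [1.0, 2.0, 3.0, 4.0]
output = pre_activation_block(signal, [0.25, 0.5, 0.25])
```

Because nothing is applied after the addition, a residual branch that outputs zeros leaves the input unchanged, which is the identity-mapping property the pre-activated structure relies on.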
- the improved residual network is used as the convolutional layer of the capsule network, which can not only solve the gradient vanishing and exploding problems in the capsule network of large-size medical images, but also preserve the spatial position relationship between the features of the capsule layer, while reducing the number of primary capsules for easy deployment and calculation.
- the deep network has a larger receptive field and a stronger ability to represent semantic information, but a weaker ability to represent spatial information. Deep features can represent richer image information and can separate complex target areas.
- the shallow network has a smaller receptive field and a weaker ability to represent semantic information, but a stronger ability to represent spatial information. Shallow features represent less information, but can separate some simple target areas. The fusion of shallow and deep features can compensate each other, thereby improving the performance of the model.
- in order to better integrate shallow features and deep features, the capsule network is improved.
- the capsule network includes three primary capsule layers (labeled as PrimaryA, PrimaryB and PrimaryC), their respective digital capsule layers, and a connected digital layer.
- the convolution layer outputs the features to three separate capsule layers. Then, the results of the three capsule layers are merged to obtain the final result.
- the compression and excitation (SE) module obtains the weights of the different digital layers during the merging process.
- the shallow features ConvA, the middle features ConvB and the deep features ConvC extracted by the previous residual network are transmitted to the capsule layers PrimaryA, PrimaryB and PrimaryC respectively.
- the capsule layers all contain compression and excitation (SE) modules, and the weight coefficients $a_1$, $a_2$ and $a_3$ of the shallow, middle and deep features come from the corresponding SE modules.
- SE compression and excitation
- the three weight coefficients are multiplied by the three digital capsule layers to obtain three digital capsule layers with weights, labeled as DigitA, DigitB and DigitC respectively.
- the three digital layers with weight coefficients are connected into one digital layer.
- the outputs of DigitA, DigitB, and DigitC are two vectors representing two categories, and each output is an 8-dimensional capsule.
- the final connected capsule fusion layer is represented by D, which is a 24-dimensional capsule, expressed as:
$$D = D_1 \oplus D_2 \oplus D_3$$
- $D_1$, $D_2$ and $D_3$ represent the outputs of DigitA, DigitB and DigitC respectively, and $\oplus$ represents concatenation.
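The weighting and concatenation steps can be sketched in a few lines. The capsule values and weight coefficients below are made up for illustration, and `fuse_digit_capsules` is a hypothetical helper, not a name from the source.

```python
def fuse_digit_capsules(digit_outputs, weights):
    """Scale each digit capsule by its weight coefficient, then concatenate them."""
    fused = []
    for capsule, weight in zip(digit_outputs, weights):
        fused.extend(weight * value for value in capsule)
    return fused


# Three 8-dimensional capsules for one class (illustrative values only).
d1 = [0.1] * 8
d2 = [0.2] * 8
d3 = [0.3] * 8
fused = fuse_digit_capsules([d1, d2, d3], [0.5, 0.3, 0.2])
print(len(fused))  # 24
```

Concatenating three 8-dimensional capsules gives the 24-dimensional fused capsule described above; the same operation would be repeated for each of the two classes.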
- the loss function uses the margin loss function, which is defined as follows:
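The formula itself is not reproduced in this text. As a point of reference, the margin loss of the original CapsNet paper (Sabour et al., 2017) penalizes a short capsule for the true class and long capsules for the other classes. The sketch below implements that standard form; the constants 0.9, 0.1 and 0.5 come from that paper and are assumptions here, since this document may use different values.

```python
def margin_loss(lengths, target, m_plus=0.9, m_minus=0.1, down_weight=0.5):
    """Standard CapsNet margin loss for one sample.

    lengths: capsule norms, one per class; target: index of the true class.
    """
    total = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # The true class should have a capsule length of at least m_plus.
            total += max(0.0, m_plus - length) ** 2
        else:
            # Other classes should have a capsule length of at most m_minus.
            total += down_weight * max(0.0, length - m_minus) ** 2
    return total


confident = margin_loss([0.95, 0.05], target=0)
uncertain = margin_loss([0.5, 0.5], target=0)
```

A confident, correct prediction incurs no loss, while two capsules of equal length are penalized on both terms.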
- the Squeeze and Excitation Network introduces an attention mechanism to adaptively weight the feature maps of each channel. This attention mechanism is implemented by the Squeeze and Excitation module (SE).
- the compression and excitation module, also known as squeeze and excitation, is a combination of two key operations that enrich the channel attention mechanism.
- one dimension is the vector array and the other dimension is the number of capsules.
- the attention mechanism provided by the compression and excitation module can add weights to the importance of each capsule and enhance or suppress the corresponding capsules for different tasks to extract useful features and suppress useless features.
- the compression and excitation module includes a global pooling layer and two fully connected layers.
- the operation of the compression and excitation module can be divided into three steps:
- Compression operation aims to compress capsule information, keep the number of capsules unchanged, and perform global information pooling to convert each capsule block into a number, whose value is the sum of the elements in the primary capsule layer divided by the number of elements, expressed as:
- n is the number of primary capsules
- p is the length of the capsule
- $u_{ij}$ represents the element in the capsule.
- s is the output of the excitation operation
- $\sigma$ is the activation function Sigmoid
- W2 and W1 are the corresponding weight parameters of the two fully connected layers
- $\delta$ is the activation function ReLU.
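The compression and excitation steps described above can be sketched as follows. This is one reading of the text, under stated assumptions: each capsule block (n capsules of length p) is squeezed to the mean of its n × p elements, and the excitation is two fully connected layers without biases, ReLU followed by Sigmoid, as in the standard squeeze-and-excitation design. The weight matrices and block values are arbitrary illustrative numbers.

```python
import math


def squeeze(block):
    """Mean of all n * p elements of one capsule block (n capsules of length p)."""
    n = len(block)
    p = len(block[0])
    return sum(sum(capsule) for capsule in block) / (n * p)


def excite(z, w1, w2):
    """Two fully connected layers: ReLU after the first, Sigmoid after the second."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    return [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]


# Three toy capsule blocks, each with n = 2 capsules of length p = 2.
blocks = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
z = [squeeze(block) for block in blocks]
w1 = [[0.5, 0.1, -0.2], [0.3, -0.4, 0.2]]   # hypothetical weights, 3 -> 2
w2 = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # hypothetical weights, 2 -> 3
weights = excite(z, w1, w2)
```

Each resulting weight lies strictly between 0 and 1 and can be used to scale the corresponding capsule output. Whether the three blocks share one excitation, as here, or each has its own is not settled by the text.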
- the algorithm for calculating digital capsules based on primary capsules is called a dynamic routing algorithm.
- the iteration r of dynamic routing is set to 3 in the present invention.
- the primary capsule $u_i$ is predicted to belong to the digital capsule $v_j$, which is expressed as:
- $W_{ij}$ is the transformation matrix and $c_{ij}$ is the prediction result. $c_{ij}$ changes in each iteration. Therefore, the calculated $v_j$ (i.e., the sum of $c_{ij} W_{ij} u_i$) is different in different iterations.
- the Softmax function ensures that the sum of $c_{ij}$ is always equal to 1.
- the squeezing operation compresses the norm of the vector $v_j$ to the range of (0,1) without changing the direction of the vector. Since the direction of the vector represents the feature, the final result $v_j$ retains the features corrected according to the primary capsule. During the training process, multiple iterations are required to update the parameters multiple times, and the result of the last iteration is selected as the final output result of the digital capsule.
- the output of the dynamic routing algorithm is a digital capsule, not a prediction coefficient $c_{ij}$. Therefore, the algorithm can be regarded as a hidden layer with a complex process.
- the current effective method for training deep networks is back propagation, which requires more parameters for the capsule network to improve its accuracy during training. Therefore, the transformation matrix $W_{ij}$ is introduced, which can not only transform the primary capsule into a different shape, but also ensure the ability of the capsule network to capture information from the vector.
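The routing procedure can be sketched in plain Python. The version below follows the standard routing-by-agreement scheme from the original CapsNet paper, which the description above appears to match; the update of the routing logits by the dot product between prediction and output is taken from that paper rather than from this text. The prediction vectors (the products of the transformation matrices and the primary capsules) are assumed to be precomputed, and the example values are arbitrary.

```python
import math


def squash(s):
    """Shrink the norm of s into [0, 1) while keeping its direction."""
    sq_norm = sum(v * v for v in s)
    if sq_norm == 0.0:
        return [0.0] * len(s)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * v for v in s]


def softmax(xs):
    shifted = [math.exp(x - max(xs)) for x in xs]
    total = sum(shifted)
    return [v / total for v in shifted]


def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the prediction vector from primary capsule i for digit capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits
    for step in range(iterations):
        # Coupling coefficients: each primary capsule's weights sum to 1.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s_j = [
                sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                for d in range(dim)
            ]
            v.append(squash(s_j))
        if step < iterations - 1:
            # Raise the logit where a prediction agrees with the current output.
            for i in range(n_in):
                for j in range(n_out):
                    b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c


# Three primary capsules, two digit capsules, 2-D vectors (illustrative values).
u_hat = [
    [[1.0, 0.0], [0.0, 0.1]],
    [[0.9, 0.1], [0.1, 0.0]],
    [[0.8, -0.1], [0.0, 0.2]],
]
v, c = dynamic_routing(u_hat, iterations=3)
```

The function returns the digit capsules from the last iteration, together with the final coupling coefficients for inspection.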
- the decoder includes a deconvolution module.
- the original capsule network uses reconstruction loss as a regularization method to encourage the capsules in the digital capsule layer to encode as much useful information as possible. Reconstruction is done by simply feeding the output of the digital capsule layer to the decoder, which consists of three fully connected layers. While this method works well on simple images, its reconstruction performance is not suitable for complex datasets such as DE-MRI scans, where the reconstructed images are blurry and difficult to distinguish.
- deconvolution is used to reconstruct the image to improve the reconstruction performance and classification accuracy of complex data sets.
- the two 24-dimensional capsule vectors obtained from the encoder are reformatted into vector images of size 4 × 4 × 3.
- the vectors are deconvolved by successive deconvolution layers with kernels of size 3 × 3 and strides of 1, 1, 2, 3, and 1, respectively.
- the number of channels is changed from 3 to 1.
- a reconstructed image with a channel number of 1 and a size of 24 × 24 is obtained.
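The shapes in the decoder can be checked with a little arithmetic: two 24-dimensional capsules hold 48 values, which is exactly 4 × 4 × 3, and the listed strides multiply to 6, which takes a 4 × 4 map to 24 × 24. The sketch below assumes each deconvolution layer is padded so that its output size equals the input size times the stride (for a 3 × 3 kernel, padding 1 and output padding of stride minus 1); that padding choice is an assumption, and the helper names are illustrative.

```python
def reshape_capsules(capsules, height=4, width=4, channels=3):
    """Flatten the capsule vectors and reshape them to height x width x channels."""
    flat = [value for capsule in capsules for value in capsule]
    if len(flat) != height * width * channels:
        raise ValueError("capsule values do not fill the target shape")
    return [
        [
            [flat[(r * width + c) * channels + ch] for ch in range(channels)]
            for c in range(width)
        ]
        for r in range(height)
    ]


def deconv_sizes(size, strides):
    """Spatial size after each layer, assuming output = input * stride."""
    sizes = [size]
    for stride in strides:
        size *= stride
        sizes.append(size)
    return sizes


capsules = [[0.1] * 24, [0.2] * 24]  # two 24-dimensional capsules (toy values)
grid = reshape_capsules(capsules)
print(deconv_sizes(4, [1, 1, 2, 3, 1]))  # [4, 4, 4, 8, 24, 24]
```

Under this assumption the final spatial size is 24 × 24, matching the reconstructed image size given above.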
- Step S120 training the residual capsule network using the set loss function.
- step S110 the large-size input image is scaled, and mean pooling with a convolution kernel size of 8 × 8 and a stride of 8 is used to reduce the image size.
- the input image of size 192 × 192 can be reduced to 24 × 24.
- the sum of the squares of the difference between the scaled original image and the reconstructed image is calculated, and the sum of the squares is used as the reconstruction loss based on the scaled reconstruction, which is expressed as follows:
$$L_{reconstruction} = \sum_{m=1}^{M} (y_m - \hat{y}_m)^2$$
- $y_m$ is the true value, $\hat{y}_m$ is the estimated value, and $M$ is the total number of samples. The larger $L_{reconstruction}$ is, the worse the reconstruction effect is.
- the overall loss function for training the residual capsule network is set to:
- Lk represents the margin loss of the encoder. Since the values of the reconstruction loss and the margin loss differ greatly in magnitude, the weight of the reconstruction loss needs to be reduced when calculating the overall loss. For example, the weight coefficient is set to 0.0005.
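The scaled reconstruction loss and its combination with the classification loss can be sketched as below. Two points are assumptions: the overall loss is taken to be the classification loss plus 0.0005 times the reconstruction loss (an additive form, as in the original CapsNet), and the example input is 192 × 192, the crop size given in the experiments, since 192 / 8 = 24 matches the 24 × 24 reconstruction. The images here are constant toy arrays, and the function names are hypothetical.

```python
def mean_pool(image, size=8):
    """Non-overlapping mean pooling with a size x size window and stride equal to size."""
    rows = len(image) // size
    cols = len(image[0]) // size
    return [
        [
            sum(
                image[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ) / (size * size)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def reconstruction_loss(scaled, reconstructed):
    """Sum of squared differences between the scaled input and the reconstruction."""
    return sum(
        (y - y_hat) ** 2
        for row_y, row_hat in zip(scaled, reconstructed)
        for y, y_hat in zip(row_y, row_hat)
    )


def total_loss(classification_loss, recon_loss, weight=0.0005):
    """Assumed additive combination with a small weight on the reconstruction term."""
    return classification_loss + weight * recon_loss


image = [[1.0] * 192 for _ in range(192)]        # toy input
scaled = mean_pool(image)                         # 24 x 24
reconstructed = [[0.0] * 24 for _ in range(24)]   # toy decoder output
recon = reconstruction_loss(scaled, reconstructed)
overall = total_loss(0.24, recon)
```

Comparing against the pooled image rather than the full-resolution input keeps the decoder output small, which is where the reduction in computation comes from.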
- Step S130 using the trained residual capsule network to enhance the target image, and then analyzing the degree of myocardial fibrosis based on the enhanced image.
- the optimization parameters of the residual capsule network can be obtained, and then applied to the actual target image reconstruction, including: obtaining the target image to be enhanced; inputting the target image into the residual capsule network to obtain a clearer reconstructed image; and then using the reconstructed image to analyze the degree of myocardial fibrosis for clinical indication.
- the target image can be magnetic resonance imaging, delayed enhancement magnetic resonance imaging or other types.
- the Adam optimization algorithm was used for optimization, and the learning rate was set to 0.001.
- the number of iterations of the model training was set to 2000 to prevent overfitting.
- the number of iterations of the capsule layer dynamic routing algorithm was set to 3. Since the capsule network is prone to overfitting, the capsule layer was deactivated during each iteration of training, and the weights of the deactivated neurons were not updated in that iteration to reduce the complexity of the model.
- Figure 7 shows examples of two types of samples, where Figure 7(a) is the short-axis view and long-axis view of a normal sample, and Figure 7(b) is the short-axis view and long-axis view of a myocardial fibrosis sample.
- MRI data is a grayscale image.
- the file is saved as a three-channel PNG color image during the dump process.
- the image is converted to a single-channel format.
- the image is cropped into a square image of size 192 × 192 to make the proportion of the heart image in the image as large as possible.
- an image enhancement method based on sparrow search is proposed to make the image grayscale distribution more uniform, thereby retaining more detail information and improving image quality. It has been verified that the algorithm can be used in different situations.
- the prediction results can be divided into 4 categories according to the actual situation and the predicted label: true positive (TP), false positive (FP), true negative (TN) and false negative (FN).
- the evaluation criteria include precision (P), recall (R) and F1 score.
- Accuracy represents the overall performance of the model, that is, the ratio of correctly classified samples, expressed as:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
- Sensitivity is the ratio of correctly classified positive samples to all positive samples, representing the ability of the model to diagnose myocardial fibrosis based on background information, expressed as:
$$Sensitivity = \frac{TP}{TP + FN}$$
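These metrics follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the sensitivity example uses counts reported in the experiments (54 positive test samples, of which ResCapsNet detects 43, CapsNet 41 and ResNet 33).

```python
def accuracy(tp, fp, tn, fn):
    """Share of all samples that are classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Share of positive samples that are detected (also called recall)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = sensitivity(tp, fn)
    return 2 * p * r / (p + r)


# 54 positive test samples: detected versus missed for each model.
print(round(100 * sensitivity(43, 11), 2))  # 79.63 (ResCapsNet)
print(round(100 * sensitivity(41, 13), 2))  # 75.93 (CapsNet)
print(round(100 * sensitivity(33, 21), 2))  # 61.11 (ResNet)
```

Accuracy, precision and F1 also need the false-positive and true-negative counts, which are not listed in this text, so no values are computed for them here.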
- the three models all reach convergence after the 40th iteration.
- the number of parameters of the models is listed in Table 2.
- ResCapsNet has an accuracy of 72.25% and a loss of 0.2913.
- CapsNet has an accuracy of 68.73% and a loss of 0.3025, both of which are better than ResNet's accuracy and loss (64.21% and 0.3168, respectively).
- the ResCapsNet network has more than 9.3 million parameters, which is roughly equivalent to CapsNet (9.1 million parameters) and far less than ResNet (14 million parameters).
- ResCapsNet converges faster, with a parameter count close to that of CapsNet and well below that of ResNet.
- ResCapsNet has the best accuracy of the three models.
- the test set contained 54 positive samples and 54 negative samples.
- in order to diagnose myocardial fibrosis, we hope to identify as many truly positive patients as possible, so high sensitivity shows that the model is practical. Experiments show that among patients with myocardial fibrosis, ResCapsNet successfully detected 43 patients with a sensitivity of 79.63%. In contrast, CapsNet detected 41 patients with positive features with a sensitivity of 75.93%. ResNet detected 33 patients with a sensitivity of 61.11%. From this perspective, ResCapsNet has the best performance among the three models.
- the AUC value can be calculated from the ROC graph and used to evaluate the three models because it is less affected by the imbalance of the positive and negative sample ratios.
- the AUC value of ResCapsNet is 0.8945
- the AUC value of CapsNet is 0.8824
- the AUC value of ResNet is 0.8681. Therefore, in terms of AUC value, ResCapsNet has the best performance among the three models.
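For readers unfamiliar with the metric, AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way on made-up scores; it is not the evaluation code behind the figures above.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve as the fraction of correctly ordered pairs."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical model scores for three positive and three negative samples.
value = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Here eight of the nine positive-negative pairs are ordered correctly, so the value is 8/9. Because the measure is built from pairs, it does not depend on the ratio of positive to negative samples.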
- the performance of ResCapsNet and the original CapsNet under different data amounts was tested. While the test set remained unchanged, 10% of the training set data was randomly extracted to obtain the test loss and test accuracy of ResCapsNet and CapsNet. Then, 10% of the data was randomly added to the training set each time, and the accuracy and loss of each experiment were recorded until the data size increased to the original size. When the training set is small, the model often stops training before reaching convergence, resulting in inaccurate classification. In order to obtain the true performance of the model under small data amounts, the number of training iterations is set to 400 when the training data amount is less than 50%, and it remains 100 in other cases.
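The growing-training-set protocol can be sketched as follows. The generator shuffles the sample indices once and yields nested subsets of 10%, 20%, and so on up to 100%, together with the iteration count stated above (400 below half of the data, 100 otherwise). The function name, the fixed seed and the use of index lists are illustrative assumptions.

```python
import random


def training_subsets(num_samples, steps=10, seed=0):
    """Yield (fraction, sample indices, training iterations) for growing subsets."""
    rng = random.Random(seed)
    order = list(range(num_samples))
    rng.shuffle(order)  # one random order, so each subset extends the previous one
    for k in range(1, steps + 1):
        fraction = k / steps
        count = round(num_samples * k / steps)
        iterations = 400 if fraction < 0.5 else 100
        yield fraction, order[:count], iterations


runs = list(training_subsets(100))
for fraction, indices, iterations in runs:
    print(f"{fraction:.0%} of data: {len(indices)} samples, {iterations} iterations")
```

Taking prefixes of a single shuffled order means each step adds a random 10% to the data already in use, while the test set is left untouched.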
- the present invention proposes a residual capsule network for extracting multiple features for classifying high-resolution cardiac DE-MRI scans.
- multiple residual blocks e.g., 4
- the shallow, middle, and deep features extracted by the residual network are fused into the capsule network to achieve full recognition of the image information.
- the present invention effectively alleviates the problem by incorporating an improved residual network.
- the use of scaled reconstruction reduces the computation time.
- the present invention is particularly suitable for processing small, unbalanced, and large DE-MRI datasets.
- the present invention may be a system, a method and/or a computer program product.
- the computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present invention.
- Computer readable storage medium can be a tangible device that can keep and store the instructions used by the instruction execution device.
- Computer readable storage medium can be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination thereof.
- a non-exhaustive list of computer-readable storage media includes: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disk read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanical encoding device, for example, a punch card or a convex structure in a groove on which instructions are stored, and any suitable combination thereof.
- the computer readable storage medium used here is not interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic waves, an electromagnetic wave propagated by a waveguide or other transmission medium (for example, a light pulse by an optical fiber cable), or an electrical signal transmitted by a wire.
- the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to each computing/processing device, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
- the network can include copper transmission cables, optical fiber transmissions, wireless transmissions, routers, firewalls, switches, gateway computers, and/or edge servers.
- the network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device.
- the computer program instructions for performing the operation of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages, such as Smalltalk, C++, Python, etc., and conventional procedural programming languages, such as "C" language or similar programming languages.
- Computer-readable program instructions may be executed entirely on a user's computer, partly on a user's computer, as a stand-alone software package, partly on a user's computer and partly on a remote computer, or entirely on a remote computer or server.
- the remote computer may be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., using an Internet service provider to connect via the Internet).
- an electronic circuit such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), may be personalized by utilizing the state information of the computer-readable program instructions, and the electronic circuit may execute the computer-readable program instructions, thereby realizing various aspects of the present invention.
- These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, thereby producing a machine, so that when these instructions are executed by the processor of the computer or other programmable data processing device, a device that implements the functions/actions specified in one or more boxes in the flowchart and/or block diagram is generated.
- These computer-readable program instructions can also be stored in a computer-readable storage medium, and these instructions cause the computer, programmable data processing device, and/or other equipment to work in a specific manner, so that the computer-readable medium storing the instructions includes a manufactured product, which includes instructions for implementing various aspects of the functions/actions specified in one or more boxes in the flowchart and/or block diagram.
- Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device so that a series of operating steps are performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions executed on the computer, other programmable data processing apparatus, or other device to implement the functions/actions specified in one or more boxes in the flowchart and/or block diagram.
- each box in the flowchart or block diagram can represent a part of a module, program segment or instruction, and the part of the module, program segment or instruction contains one or more executable instructions for realizing the specified logical function.
- the functions marked in the box can also occur in a different order from the order marked in the accompanying drawings. For example, two consecutive boxes can actually be executed substantially in parallel, and they can sometimes be executed in the opposite order, depending on the functions involved.
- each box in the block diagram and/or flowchart, and the combination of the boxes in the block diagram and/or flowchart can be implemented by a dedicated hardware-based system that performs the specified function or action, or can be implemented by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that it is equivalent to implement it by hardware, implement it by software, and implement it by combining software and hardware.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
Abstract
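The present invention discloses a method for classifying myocardial fibrosis based on a residual capsule network. The method comprises: acquiring a magnetic resonance image of a target under test; inputting the magnetic resonance image into a trained residual capsule network to obtain a reconstructed image; and obtaining a myocardial fibrosis classification result from the reconstructed image. The residual capsule network comprises an encoder and a decoder; the encoder comprises a residual network and a capsule network, and the decoder comprises deconvolution layers. The residual network extracts feature maps of different depths from the input image and passes them to the capsule network, the capsule network vectorizes the feature maps of different depths and merges them by weighting, and the decoder reconstructs the image from the longest vector in the concatenated capsule of the capsule network. The present invention improves the clarity of the reconstructed image and reduces image reconstruction time, and is particularly suitable for processing small, unbalanced, and large DE-MRI datasets.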
Description
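The present invention relates to the technical field of medical image processing, and more particularly to a method for classifying myocardial fibrosis based on a residual capsule network.
Myocardial fibrosis (MF) is mainly caused by marked proliferation of myocardial interstitial fibroblasts and abnormally distributed collagen deposition. The degree of myocardial fibrosis is closely related to the prognosis of heart disease. Delayed-enhancement magnetic resonance imaging (DE-MRI) of the myocardium, for example, is an effective way to evaluate the distribution and extent of myocardial fibrosis. Such scans feature high resolution and low motion blur. Because manual detection is usually time-consuming, machine learning on cardiac MRI has in recent years been increasingly applied to diagnosing dilated cardiomyopathy and myocardial fibrosis and to assessing left ventricular volume.
Research on machine learning for cardiac MRI has focused mainly on segmenting cardiac structures and quantitatively detecting myocardial disease. However, because the anatomy of cardiac structures varies across MRI scans, existing methods still require manual assistance. In addition, blurred boundaries and the similar appearance of the surrounding myocardial tissue make salient features difficult to extract, so applying deep learning to cardiac MRI is highly challenging.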
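Convolutional neural networks (CNNs) are currently in widespread use in image processing. A CNN extracts image information with convolution kernels, reduces the number of training parameters through pooling layers, and finally outputs the classification result through a classifier. CNNs achieve translation and rotation invariance through weight sharing and pooling, but they cannot recognize spatial relationships in an image that can strongly affect the result. For example, in a medical image a CNN may still identify a sample as normal even if the spatial positions of the organs are scrambled. During training, a CNN needs a large dataset to form accurate patterns. Deep networks are also prone to vanishing and exploding gradients, and pooling layers can discard extracted feature information, which degrades the final classification accuracy.
To address these shortcomings of CNNs, Sabour et al. proposed a new deep network called the capsule network (CapsNet) (Sabour, S., N. Frosst, and G. E. Hinton, Dynamic Routing Between Capsules. arXiv e-prints, 2017: arXiv:1710.09829). CapsNet takes the spatial relationships between features into account and uses a dynamic routing algorithm to encode the relationship between objects and results, thereby capturing the overall characteristics of the image. A capsule stores a vector describing a specific object: the length of the vector represents the probability that the object is present, and its direction represents the descriptive parameters. The stored parameter vectors can be flattened and rendered back to form the original image. Compared with CNNs, CapsNet works well with small datasets, and it has no pooling layers, which carry a risk of losing information. Although CapsNet performs well on the small MNIST dataset, its accuracy on complex datasets such as MRI scans is low. This is because CapsNet is a shallow network, consisting mainly of convolutional layers for feature extraction, capsule layers for encoding and mapping, and an optional decoder for image reconstruction. CapsNet usually has only two convolutional layers, so it cannot capture the deeper features contained in MRI scans. Moreover, if the image is too large and the decoder regenerates a high-resolution image from only a few vectors, the reconstruction error increases, which further reduces classification accuracy; vanishing and exploding gradients also arise easily, which harms training speed and performance.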
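Summary of the Invention
The object of the present invention is to overcome the above defects of the prior art and to provide a method for classifying myocardial fibrosis based on a residual capsule network. The method comprises the following steps:
acquiring a magnetic resonance image of a target under test;
inputting the magnetic resonance image into a trained residual capsule network to obtain a reconstructed image;
obtaining a myocardial fibrosis classification result from the reconstructed image;
wherein the residual capsule network comprises an encoder and a decoder, the encoder comprises a residual network and a capsule network, the decoder comprises deconvolution layers, the residual network extracts feature maps of different depths from the input image and passes them to the capsule network, the capsule network vectorizes the feature maps of different depths and merges them by weighting, and the decoder reconstructs the image from the longest vector in the concatenated capsule of the capsule network.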
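Compared with the prior art, the advantage of the present invention is that it proposes a new type of capsule network in which residual blocks replace the convolutional layers. Residual blocks can extract deeper semantic information from the image, and combining the shallow, middle, and deep features extracted by different residual blocks helps build a more efficient capsule network. In addition, to avoid the vanishing and exploding gradients caused by high-resolution input, scaled reconstruction is used in the decoder. Compared with existing residual networks and capsule networks, the proposed residual capsule network model improves the clarity of the reconstructed image and performs better, especially on large DE-MRI scans.
Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention.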
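Fig. 1 is a flowchart of a method for classifying myocardial fibrosis based on a residual capsule network according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the overall structure of the residual capsule network according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the original residual block and the pre-activation block according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the improved capsule network according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the squeeze-and-excitation module according to an embodiment of the present invention;
Fig. 6 is a structural diagram of the deconvolution layers according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of positive and negative DE-MRI samples according to an embodiment of the present invention;
Fig. 8 compares the test loss and test accuracy of the three models according to an embodiment of the present invention;
Fig. 9 compares the test loss and accuracy of the capsule networks under different data volumes according to an embodiment of the present invention.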
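Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. Note that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application, or its uses.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.
In all examples shown and discussed herein, any specific value should be interpreted as merely exemplary rather than limiting. Other examples of the exemplary embodiments may therefore have different values.
Note that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be discussed further in subsequent drawings.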
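To improve CapsNet's ability to handle large medical image inputs, the present invention proposes a new residual capsule network (also called ResCapsNet). The network replaces the convolutional layers with an improved residual network to compress features, which reduces the number of primary capsules and prevents vanishing and exploding gradients. A residual network (ResNet) introduces residual blocks, which let the network pass more information through each layer via shortcut connections. Applying the residual network and improving the structure of the capsule network reduces the number of weights in the network. During image reconstruction, the original image is scaled to reduce computation. In addition, a new loss function based on scaled reconstruction is proposed to further reduce the computational burden.
Referring to Fig. 1, the provided method for classifying myocardial fibrosis based on a residual capsule network comprises the following steps.
Step S110: construct a residual capsule network comprising an encoder and a decoder, where the encoder contains a residual network and a capsule network.
Referring to Fig. 2, the residual capsule network mainly comprises an encoder and a decoder: the encoder first encodes the input image, and the decoder then reconstructs the image. In Fig. 2, the encoder comprises a residual network and a capsule network, where the capsule network contains primary capsules, digit capsules, and a concatenated capsule; the decoder comprises multiple deconvolution layers, for example 3 or 5. The residual network extracts feature information, and the capsule network vectorizes the features. The decoder reconstructs the image, and the reconstruction loss is obtained by comparing the reconstruction with the scaled input.
The improvements to the residual network, the capsule network, and the loss function are described in detail below.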
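1) Improved residual network
In the provided residual capsule network, an improved residual network serves as the convolutional layers. The pooling layers of the residual network are removed because they lose image information. The residual network contains multiple residual blocks, which preserve the spatial position information of the feature maps and prevent vanishing and exploding gradients.
The residual network is mainly used to extract deep features and semantic information from the image and comprises, for example, 4 residual blocks. The image is first fed into the first residual block, giving a 96×96×128 feature map. It then passes through 3 residual layers (3×3 kernel, stride 2), giving a 48×48×256 shallow feature map, which is passed to PrimaryA (the first primary capsule). An identical residual block with a 3×3 kernel and stride 2 then gives a 24×24×512 middle feature map, which is passed to PrimaryB (the second primary capsule). Finally, the middle features are fed into the last residual block, giving an 8×8×256 deep feature map, which is passed to PrimaryC (the third primary capsule).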
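The spatial sizes quoted for the residual blocks can be checked with the usual convolution output-size formula. The sketch below is only an illustration under stated assumptions: a 192×192 input (the cropped size given in the preprocessing section), padding of 1 for every 3×3 convolution, and a stride of 3 for the last block. The text does not state the padding or the last block's stride; these values were chosen because they reproduce the 96, 48, 24, and 8 sizes.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial size after a strided convolution (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 192                          # assumed input side length (cropped image)
first = conv_out(size)              # first residual block
shallow = conv_out(first)           # shallow feature map, passed to PrimaryA
middle = conv_out(shallow)          # middle feature map, passed to PrimaryB
deep = conv_out(middle, stride=3)   # assumed stride 3 for the last block, passed to PrimaryC

print(first, shallow, middle, deep)  # 96 48 24 8
```

With a stride of 2 in the last block the same formula would give 12 rather than 8, which is why a different stride is assumed there.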
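Preferably, a new residual block structure is introduced.
As shown in Fig. 3, the left diagram is the structure of the original residual block and the right diagram is the pre-activation structure, which uses a 1×1 convolutional layer for dimension reduction. In the new residual block structure, the activation function is moved into the residual branch. Changing the activation function to an identity mapping helps improve the accuracy of the model. The new residual block in Fig. 3 comprises a batch normalization (BN) layer, an activation layer (for example, using the ReLU activation function), and a convolutional layer.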
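The ordering inside a pre-activation block can be illustrated with a toy one-dimensional version. This is a minimal sketch, not the network of Fig. 3: it works on a plain list instead of a 2-D feature map, uses a single BN, ReLU, convolution sequence, and all function names are hypothetical.

```python
import math


def batch_norm(x, eps=1e-5):
    """Normalize a 1-D signal to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def relu(x):
    return [max(0.0, v) for v in x]


def conv1d_same(x, kernel):
    """1-D convolution with zero padding so the output length equals the input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(x))]


def preact_residual_block(x, kernel):
    # Residual branch: BN -> ReLU -> convolution (the activation sits inside the branch).
    residual = conv1d_same(relu(batch_norm(x)), kernel)
    # Shortcut branch is the identity; nothing is applied after the addition.
    return [a + b for a, b in zip(x, residual)]


y = preact_residual_block([1.0, 2.0, 3.0], [0.2, 0.5, 0.2])
```

Because the shortcut is a pure identity, a residual branch whose weights are all zero leaves the input unchanged, which is the property that lets information and gradients pass through the block directly.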
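2) Improved capsule network
In the provided residual capsule network, using the improved residual network as the convolutional layers of the capsule network not only resolves the vanishing and exploding gradients of capsule networks on large medical images, but also preserves the spatial relationships between capsule-layer features while reducing the number of primary capsules, which eases deployment and computation. Deep layers have a large receptive field and represent semantic information well but spatial information poorly; deep features capture richer image information and can separate complex target regions. Shallow layers have a small receptive field and represent semantic information poorly but spatial information well; shallow features carry less information but can separate some simple target regions. Fusing shallow and deep features lets them compensate for each other and improves model performance.
To fuse shallow and deep features better, the capsule network is improved. Referring to Fig. 4, the capsule network comprises three capsule layers, also called three primary capsules (labeled PrimaryA, PrimaryB, and PrimaryC), a digit capsule layer for each of the three capsule layers, and one concatenated digit layer. The convolutional layers output features to the three separate capsule layers, and the results of the three capsule layers are then merged to obtain the final result. Specifically, the SE (squeeze-and-excitation) modules provide the weights of the different digit layers during merging. The shallow features ConvA, middle features ConvB, and deep features ConvC extracted by the preceding residual network are passed to the capsule layers PrimaryA, PrimaryB, and PrimaryC respectively, each of which contains an SE module; the weight coefficients $a_1$, $a_2$, and $a_3$ of the shallow, middle, and deep features come from the corresponding SE modules. The three weight coefficients are multiplied with the three digit capsule layers to give three weighted digit capsule layers, labeled DigitA, DigitB, and DigitC. Finally, the three weighted digit layers are concatenated into one digit layer. The outputs of DigitA, DigitB, and DigitC are two vectors representing the two classes, and each output is an 8-dimensional capsule. The final concatenated capsule fusion layer, denoted D, is a 24-dimensional capsule, expressed as:

$$D = \mathrm{concat}(\mathrm{DigitA},\ \mathrm{DigitB},\ \mathrm{DigitC})$$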
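The weighting and concatenation step can be sketched as follows. This is an illustrative sketch only: the function name `weighted_concat`, the unweighted digit outputs, and the weight values are hypothetical, and the weights would in practice come from the SE modules.

```python
def weighted_concat(digit_layers, weights):
    """Scale each digit-capsule layer by its weight, then concatenate per class."""
    n_classes = len(digit_layers[0])
    fused = []
    for c in range(n_classes):
        vec = []
        for layer, a in zip(digit_layers, weights):
            vec.extend(a * v for v in layer[c])
        fused.append(vec)
    return fused


# Three unweighted digit layers, each with 2 classes x 8 dimensions (made-up values).
raw_a = [[1.0] * 8, [0.5] * 8]
raw_b = [[2.0] * 8, [1.0] * 8]
raw_c = [[4.0] * 8, [2.0] * 8]
weights = [0.2, 0.3, 0.5]  # hypothetical a1, a2, a3

D = weighted_concat([raw_a, raw_b, raw_c], weights)
print(len(D), len(D[0]))  # 2 24
```

Each class ends up with one 24-dimensional capsule whose three 8-dimensional segments carry the shallow, middle, and deep contributions.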
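In the encoder, the loss function is the margin loss, defined as:

$$L_k = T_k \max(0,\ m^+ - \lVert \mathbf{v}_k \rVert)^2 + \lambda\,(1 - T_k)\max(0,\ \lVert \mathbf{v}_k \rVert - m^-)^2$$

where $T_k$ is set to 1 for the target class, $\mathbf{v}_k$ denotes the predicted vector, $\lambda$ is a preset constant, $m^+$ is the upper bound, for example 0.9, and $m^-$ is the lower bound, for example 0.1. If the length of the vector is greater than $m^+$, the input is taken to belong to this class, with $T_k = 1$. Conversely, if the length of the vector is less than $m^-$, the input is taken not to belong to this class, with $T_k = 0$. In a binary classification problem there is exactly one capsule with $T_k = 1$, and the other capsule has $T_k = 0$. There is therefore no imbalance, so $\lambda$, which was introduced to balance the margin loss, is set to 1.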
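A short numerical sketch may help make the margin loss concrete. It assumes the standard margin loss of Sabour et al. (cited in the background), with the example bounds 0.9 and 0.1 and λ = 1 from the text; the function name and the sample lengths are hypothetical.

```python
def margin_loss(length, target, m_plus=0.9, m_minus=0.1, lam=1.0):
    """Margin loss for one class capsule.

    length: norm of the capsule's output vector
    target: 1 if this class is the true class, otherwise 0
    """
    present = target * max(0.0, m_plus - length) ** 2
    absent = lam * (1 - target) * max(0.0, length - m_minus) ** 2
    return present + absent


# True class with a long vector: no penalty.
print(margin_loss(0.95, 1))  # 0.0
# Wrong class with a short vector: no penalty.
print(margin_loss(0.05, 0))  # 0.0
```

A true-class capsule shorter than 0.9, or a wrong-class capsule longer than 0.1, is penalized by the squared distance to the respective bound.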
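3) Squeeze-and-excitation module
The squeeze-and-excitation network (SENet) introduces an attention mechanism that adaptively weights the feature maps of each channel. This attention mechanism is implemented by the squeeze-and-excitation (SE) module.
Adding SE modules to the provided residual capsule network helps extract features better and thus improves the accuracy of the network. The SE module combines two key operations, squeeze and excitation, and enriches the channel attention mechanism. For the primary capsules, one dimension is the vector array and the other is the number of capsules. The attention mechanism provided by the SE module weights each capsule by its importance and enhances or suppresses capsules according to the task, so that useful features are extracted and useless ones suppressed.
Referring to Fig. 5, the SE module comprises one global pooling layer and two fully connected layers, and its operation can be divided into three steps:
(1) The squeeze operation compresses the capsule information while keeping the number of capsules unchanged, and performs global information pooling to convert each capsule block into a single number whose value is the sum of the elements in the primary capsule layer divided by the number of elements, expressed as:

$$U_k = \frac{1}{n \times p}\sum_{i=1}^{n}\sum_{j=1}^{p}\mu_{ij} \tag{4}$$

where $n$ is the number of primary capsules, $p$ is the length of a capsule, and $\mu_{ij}$ denotes an element within a capsule.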
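(2) The excitation operation is implemented with 2 fully connected layers and a Sigmoid function, expressed as:

$$s = \sigma(W_2\,\delta(W_1 U_k)) \tag{5}$$

where $s$ is the output of the excitation operation, $\sigma$ is the Sigmoid activation function, $W_2$ and $W_1$ are the weight parameters of the two fully connected layers, and $\delta$ is the ReLU activation function.
(3) The reweighting operation. When the digit capsules are concatenated, weights are assigned to the digit capsules block by block. This redistributes the original features at block level, so that the model can decide which level of features is important.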
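The three SE steps can be sketched in a few lines. This is a sketch under assumptions rather than the module of Fig. 5: it treats each of three capsule blocks as an n×p list, squeezes each block to its mean, and uses small made-up weight matrices `w1` and `w2` for the two fully connected layers (no biases).

```python
import math


def squeeze(block):
    """Mean of all elements of one capsule block (n capsules x p dimensions)."""
    n, p = len(block), len(block[0])
    return sum(sum(row) for row in block) / (n * p)


def linear(weights, x):
    """Fully connected layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def excite(u, w1, w2):
    hidden = [max(0.0, h) for h in linear(w1, u)]                     # ReLU
    return [1.0 / (1.0 + math.exp(-z)) for z in linear(w2, hidden)]  # Sigmoid


def reweight(blocks, s):
    """Scale every element of block k by its excitation weight s[k]."""
    return [[[s_k * v for v in row] for row in block] for block, s_k in zip(blocks, s)]


blocks = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
u = [squeeze(b) for b in blocks]                   # one number per block
w1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]          # hypothetical 3 -> 2 layer
w2 = [[1.0, -1.0], [0.5, 0.5], [-0.3, 0.7]]        # hypothetical 2 -> 3 layer
s = excite(u, w1, w2)                              # one weight in (0, 1) per block
scaled = reweight(blocks, s)
```

The Sigmoid keeps every block weight strictly between 0 and 1, so reweighting can only attenuate a block relative to the others, never flip its sign.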
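4) Dynamic routing algorithm
The algorithm that computes the digit capsules from the primary capsules is called the dynamic routing algorithm. The present invention sets the number of routing iterations $r$ to 3. In each iteration, whether primary capsule $u_i$ belongs to digit capsule $v_j$ is predicted, expressed as:

$$c_{ij} = \mathrm{softmax}(W_{ij} u_i \cdot v_j) \tag{6}$$

where $W_{ij}$ is the transformation matrix and $c_{ij}$ is the prediction result. $c_{ij}$ changes in each iteration, so the computed $v_j$ (the sum of $c_{ij} W_{ij} u_i$) also differs between iterations. The softmax function ensures that the $c_{ij}$ always sum to 1.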
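At the output, the finally computed $v_j$ must be squashed:

$$v_j = \mathrm{squash}(v_j) \tag{7}$$

where:

$$\mathrm{squash}(v) = \frac{\lVert v \rVert^2}{1 + \lVert v \rVert^2}\,\frac{v}{\lVert v \rVert} \tag{8}$$

Equation (8) shows that the squash operation compresses the norm of the vector $v_j$ into the range (0, 1) without changing its direction. Because the direction of the vector represents the features, the final result $v_j$ retains the features corrected according to the primary capsules. During training, multiple iterations are needed to update the parameters several times, and the result of the last iteration is taken as the final output of the digit capsules.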
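The effect of the squash operation is easy to verify numerically. The sketch below assumes the squash function of Sabour et al., which shrinks the norm to ‖v‖²/(1+‖v‖²) and keeps the direction; the zero-vector guard is an added convenience.

```python
import math


def squash(v):
    """Shrink the norm of v to norm^2 / (1 + norm^2) while keeping its direction."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0 for _ in v]  # avoid dividing by zero for the zero vector
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in v]


out = squash([3.0, 4.0])      # input norm is 5
out_norm = math.hypot(*out)   # 25 / 26, just below 1
ratio = out[0] / out[1]       # still 3 / 4, so the direction is unchanged
```

Long vectors end up with a norm close to 1 and short vectors with a norm close to 0, which is what allows the norm to be read as a probability.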
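Notably, the output of the dynamic routing algorithm is the digit capsules rather than the prediction coefficients $c_{ij}$. The algorithm can therefore be regarded as a hidden layer with a complex internal process. The effective way to train deep networks at present is backpropagation, which means the capsule network needs more parameters to improve its accuracy during training. The transformation matrix $W_{ij}$ is therefore introduced: it not only converts the primary capsules into different shapes but also ensures the capsule network's ability to capture information from the vectors.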
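A compact sketch of routing with r = 3 iterations is given below. It follows the routing-by-agreement procedure of Sabour et al. (cited in the background), which accumulates agreement logits `b` across iterations, a detail the text above does not spell out. The inputs `u_hat[i][j]` stand for the already transformed predictions $W_{ij}u_i$, and the sample values are made up.

```python
import math


def squash(v):
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0 for _ in v]
    norm = math.sqrt(norm_sq)
    return [norm_sq / (1.0 + norm_sq) / norm * x for x in v]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the prediction of primary capsule i for digit capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # agreement logits, start at zero
    for _ in range(iterations):
        c = [softmax(row) for row in b]       # coupling coefficients, each row sums to 1
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in)) for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):
            for j in range(n_out):
                # Raise the logit when the prediction agrees with the output capsule.
                b[i][j] += sum(a * w for a, w in zip(u_hat[i][j], v[j]))
    return v, c


u_hat = [
    [[1.0, 0.0], [0.0, 0.1]],
    [[0.9, 0.1], [0.1, 0.0]],
    [[1.0, 0.2], [0.0, 0.2]],
]
v, c = dynamic_routing(u_hat)
```

In this example the three primary capsules agree strongly on the first digit capsule, so their coupling coefficients shift toward it over the iterations.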
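5) Decoder
The decoder mainly comprises a deconvolution module. The original capsule network uses the reconstruction loss as a regularizer that encourages the capsules in the digit capsule layer to encode as much useful information as possible. Reconstruction is done simply by feeding the output of the digit capsule layer to a decoder consisting of three fully connected layers. Although this approach reconstructs simple images well, its reconstruction performance is inadequate for complex datasets such as DE-MRI scans, where the reconstructed images are blurry and hard to distinguish.
To solve this problem, in one embodiment the image is reconstructed by deconvolution, which improves reconstruction performance and classification accuracy on complex datasets. As shown in Fig. 6, the two 24-dimensional capsule vectors obtained from the encoder are reshaped into a vector image of size 4×4×3. The vectors are then deconvolved with three convolution kernels of size 3×3, with strides of 1, 1, 2, 3, and 1 respectively. The number of channels goes from 3 to 1. The result is a reconstructed image with 1 channel and a size of 24×24.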
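The growth from 4×4 to 24×24 can be traced with the usual transposed-convolution output-size formula. The sketch assumes five 3×3 layers with the listed strides; the padding and output-padding values are not given in the text and were chosen here so that the sizes work out.

```python
def deconv_out(size, kernel=3, stride=1, padding=0, output_padding=0):
    """Spatial size after a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Two 24-dimensional capsules hold 48 values, which fill a 4 x 4 x 3 tensor.
side, channels = 4, 3
assert 2 * 24 == side * side * channels

# (stride, padding, output_padding) per layer; padding values are assumptions.
layers = [(1, 1, 0), (1, 1, 0), (2, 1, 1), (3, 0, 0), (1, 1, 0)]

sizes = [side]
for stride, padding, output_padding in layers:
    sizes.append(deconv_out(sizes[-1], 3, stride, padding, output_padding))

print(sizes)  # [4, 4, 4, 8, 24, 24]
```

Only the stride-2 and stride-3 layers enlarge the image (by factors of 2 and 3), which accounts for the overall factor of 6 between 4 and 24.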
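Step S120: train the residual capsule network with the specified loss function.
In step S110, the large input image is scaled: average pooling with an 8×8 kernel and a stride of 8 is used to reduce the image size. After scaling, an input image of size 224×224 can be reduced to 24×24. Unlike the reconstruction loss of a conventional capsule network, in one embodiment the sum of squared differences between the scaled original image and the reconstructed image is computed and used as the scaled-reconstruction loss, expressed as:

$$L_{reconstruction} = \sum_{i}\left(x^{scaled}_i - x^{recon}_i\right)^2 \tag{9}$$

where $x^{scaled}_i$ and $x^{recon}_i$ are the $i$-th pixels of the scaled original image and of the reconstructed image, respectively.
In one embodiment, the overall loss function for training the residual capsule network is set to:

$$L = \sum\left(L_k + \alpha \cdot L_{reconstruction}\right) \tag{10}$$

where $L_k$ is the margin loss of the encoder. Because the reconstruction loss and the margin loss differ in magnitude, the weight of the reconstruction loss must be reduced when computing the overall loss. For example, the weight coefficient $\alpha$ is set to 0.0005.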
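The scaled reconstruction loss and the overall loss can be sketched as below. This is an illustration under assumptions: images are plain nested lists, the helper names are hypothetical, and the summation in Eq. (10) is read as running over training samples, so each sample contributes its margin losses plus α times its own reconstruction loss.

```python
def avg_pool(image, k=8):
    """Non-overlapping k x k mean pooling (kernel k, stride k)."""
    h, w = len(image) // k, len(image[0]) // k
    return [
        [sum(image[r * k + i][c * k + j] for i in range(k) for j in range(k)) / (k * k)
         for c in range(w)]
        for r in range(h)
    ]


def reconstruction_loss(image, reconstruction, k=8):
    """Sum of squared differences between the pooled input and the reconstruction."""
    scaled = avg_pool(image, k)
    return sum(
        (s - r) ** 2
        for s_row, r_row in zip(scaled, reconstruction)
        for s, r in zip(s_row, r_row)
    )


def sample_loss(margin_losses, recon_loss, alpha=0.0005):
    """Loss of one sample: margin losses of all class capsules plus the weighted reconstruction loss."""
    return sum(margin_losses) + alpha * recon_loss


image = [[1.0] * 16 for _ in range(16)]      # toy 16 x 16 input, pools to 2 x 2
reconstruction = [[1.0, 0.0], [1.0, 1.0]]    # one pixel is wrong by 1
recon = reconstruction_loss(image, reconstruction)
total = sample_loss([0.16, 0.0], recon)
```

With an 8×8 kernel and a stride of 8, a 192×192 image (the cropped size given in the preprocessing section) pools to 24×24, which matches the size of the decoder output.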
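Step S130: use the trained residual capsule network to enhance the target image, and then analyze the degree of myocardial fibrosis from the enhanced image.
Once training is complete, the optimized parameters of the residual capsule network are available and can be applied to actual target image reconstruction, which includes: acquiring a target image to be enhanced; inputting the target image into the residual capsule network to obtain a clearer reconstructed image; and using the reconstructed image to analyze the degree of myocardial fibrosis for clinical guidance. The target image may be magnetic resonance imaging, delayed-enhancement magnetic resonance imaging, or another type.
To verify the effect of the present invention, experiments compared the performance of the proposed residual capsule network, the conventional CapsNet, and ResNet. The structures of the three models are shown in Table 1. First, experiments were run on a large medical dataset, and the classification effectiveness of the three models was evaluated in terms of accuracy, confusion matrix, sensitivity, specificity, and AUC. The models were then compared under different data volumes.
Table 1: Structures of the three models
During model training, the Adam optimization algorithm is used with a learning rate of 0.001. The number of training iterations is set to 2000 to prevent overfitting. The number of iterations of the capsule-layer dynamic routing algorithm is set to 3. Because capsule networks overfit easily, dropout is applied to the capsule layers in each training iteration: neurons are randomly deactivated, and the weights of the deactivated neurons are not updated in that iteration, which reduces the complexity of the model.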
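1) Dataset
The dataset in the experiments comes from Xiangya Hospital and contains 540 subjects. Images were acquired with a Siemens Prisma, 1.5 Tesla, model-syngo MRE11 scanner. The echo time was set to 1.33 and the repetition time to 321.79. The original images are 256×192. Diagnoses were made by professional cardiologists. Written consent was obtained from all subjects who took part in the experiments. Fig. 7 shows examples of the two classes of samples: Fig. 7(a) shows short-axis and long-axis views of a normal sample, and Fig. 7(b) shows short-axis and long-axis views of a myocardial fibrosis sample.
2) Preprocessing
MRI data are grayscale images, but the files are saved as three-channel PNG color images during export. To reduce computation, the images are converted to single-channel format. Each image is surrounded by large black regions that contain no organ, so the images are cropped to 192×192 squares to make the heart occupy as much of the image as possible. In addition, to address the low contrast of the original images, a sparrow-search-based image enhancement method is proposed that makes the gray-level distribution more uniform, which preserves more detail and improves image quality. The algorithm has been verified to work under different conditions.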
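3) Evaluation metrics
Several commonly used evaluation metrics are used in the experiments. Predictions fall into 4 categories according to the ground truth and the predicted label: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). The evaluation criteria include precision (P), recall (R), and the $F_1$ score. Accuracy represents the overall performance of the model, that is, the proportion of correctly classified samples, expressed as:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$

Sensitivity is the ratio of correctly classified positive samples to all positive samples and represents the model's ability to diagnose myocardial fibrosis from the background information, expressed as:

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$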
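Both metrics follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the function names and the label lists are made up for illustration, with 1 denoting a fibrosis sample and 0 a normal sample.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    return tp / (tp + fn)


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, tn, fn)   # 4 of 6 samples are classified correctly
sens = sensitivity(tp, fn)       # 2 of 3 positive samples are detected
```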
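4) Experimental results
To compare the three models on the large dataset, all three were trained on a server for 100 iterations. To improve accuracy, a stepped learning rate was used: the initial learning rate was set to 0.0001, and every 40 iterations the learning rate was reduced to 0.4 times the original learning rate. The curves of test loss and accuracy against the total number of iterations are shown in Fig. 8, where Fig. 8(a) is the test loss and Fig. 8(b) is the test accuracy.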
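The stepped schedule can be written as a one-line function. This sketch reads the description as a conventional step decay, in which the rate is multiplied by 0.4 after every 40 iterations; that reading, and the function name, are assumptions.

```python
def step_lr(iteration, base_lr=1e-4, step=40, gamma=0.4):
    """Learning rate after multiplying by gamma once per completed block of `step` iterations."""
    return base_lr * gamma ** (iteration // step)


schedule = [step_lr(i) for i in (0, 39, 40, 80)]
```

Over the 100 training iterations this gives three plateaus: the initial rate for iterations 0 to 39, 0.4 times that for 40 to 79, and 0.16 times that from 80 onward.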
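As Fig. 8 shows, all three models converge after the 40th iteration. To further compare the three models in terms of computation time, Table 2 lists the number of parameters of each model.
Table 2: Training results of the three models
Table 2 shows that ResCapsNet has an accuracy of 72.25% and a loss of 0.2913, and CapsNet has an accuracy of 68.73% and a loss of 0.3025; both are better than ResNet's accuracy and loss (64.21% and 0.3168 respectively). In terms of computation time, ResCapsNet has just over 9.3 million parameters, roughly the same as CapsNet (9.1 million) and far fewer than ResNet (14 million). Although the two capsule networks have similar parameter counts, ResCapsNet converges faster and, compared with ResNet, has fewer parameters. Among the three models, ResCapsNet offers the best combination of accuracy and parameter count.
To further compare the classification performance of the three models, the model after the last iteration was selected and its performance validated on a test set containing 54 positive samples and 54 negative samples.
For diagnosing myocardial fibrosis, the goal is to identify as many patients with truly positive features as possible, so high sensitivity indicates that a model is practical. The experiments show that among the myocardial fibrosis patients, ResCapsNet detected 43, a sensitivity of 79.63%. By comparison, CapsNet detected 41 patients with positive features, a sensitivity of 75.93%, and ResNet detected 33, a sensitivity of 61.11%. From this perspective, ResCapsNet performs best of the three models.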
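The reported sensitivities follow directly from the detection counts and the 54 positive test samples, as this short check shows (the dictionary layout is just for illustration):

```python
positives = 54
detected = {"ResCapsNet": 43, "CapsNet": 41, "ResNet": 33}

# Sensitivity = detected positives / all positives, as a percentage with two decimals.
sensitivity_pct = {name: round(100 * tp / positives, 2) for name, tp in detected.items()}

print(sensitivity_pct)  # {'ResCapsNet': 79.63, 'CapsNet': 75.93, 'ResNet': 61.11}
```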
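The AUC value, computed from the ROC curve, is used to evaluate the three models because it is less affected by imbalance between positive and negative samples. In the experiments, the AUC is 0.8945 for ResCapsNet, 0.8824 for CapsNet, and 0.8681 for ResNet. In terms of AUC, ResCapsNet therefore performs best of the three models.
In addition, ResCapsNet and the original CapsNet were tested under different data volumes. With the test set unchanged, 10% of the training set was randomly sampled to obtain the test loss and test accuracy of ResCapsNet and CapsNet. Then another randomly chosen 10% of the data was added to the training set each time, and the accuracy and loss of each experiment were recorded, until the data volume reached its original size. When the training set is small, a model tends to stop training before converging, which leads to inaccurate classification. To obtain the true performance of the models under small data volumes, the number of training iterations was set to 400 when the training data volume was below 50% and kept at 100 otherwise.
The accuracy and loss of the two models on the test set are shown in Fig. 9, where Fig. 9(a) is the test loss and Fig. 9(b) is the test accuracy. The overall trend is that accuracy increases and loss decreases as the data volume grows for both models. Detailed results are given in Table 3. When trained on 10% of the data, both models have an accuracy below 50%. When the data volume increases to 20%, however, ResCapsNet's accuracy exceeds 50%, reaching 52.37%, whereas CapsNet's accuracy exceeds 50% only when the data volume increases to 30%. This shows that applying the residual network improves feature extraction and helps the proposed residual capsule network model outperform the original CapsNet under small data volumes.
Table 3: Accuracy and loss of ResCapsNet and CapsNet under different data volumes
In summary, the present invention proposes a multi-feature residual capsule network for classifying high-resolution cardiac DE-MRI scans. To handle complex DE-MRI images, multiple residual blocks (for example, 4) replace the convolutional layers of the original capsule network to extract deeper feature information. In the provided residual capsule network, the shallow, middle, and deep features extracted by the residual network are fused into the capsule network to achieve full recognition of the image information. The present invention effectively alleviates the vanishing and exploding gradients that large-size inputs cause in capsule networks by incorporating an improved residual network. Scaled reconstruction also reduces computation time. The present invention is particularly suitable for processing small, unbalanced, and large DE-MRI datasets.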
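The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present invention.
A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or raised structures in a groove on which instructions are stored, and any suitable combination of the above. A computer-readable storage medium as used here is not to be construed as a transient signal per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (for example, light pulses through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to the respective computing/processing devices, or downloaded to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement various aspects of the present invention.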
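Aspects of the present invention are described here with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings show the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. Note also that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
The embodiments of the present invention have been described above. The description is exemplary rather than exhaustive and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used here was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed here. The scope of the present invention is defined by the appended claims.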
Claims (10)
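- A method for classifying myocardial fibrosis based on a residual capsule network, comprising the following steps: acquiring a magnetic resonance image of a target under test; inputting the magnetic resonance image into a trained residual capsule network to obtain a reconstructed image; and obtaining a myocardial fibrosis classification result from the reconstructed image; wherein the residual capsule network comprises an encoder and a decoder, the encoder comprises a residual network and a capsule network, the decoder comprises deconvolution layers, the residual network extracts feature maps of different depths from the input image and passes them to the capsule network, the capsule network vectorizes the feature maps of different depths and merges them by weighting, and the decoder reconstructs the image from the longest vector in the concatenated capsule of the capsule network.
- The method according to claim 1, characterized in that the residual network is configured to contain four residual blocks for extracting a shallow feature map, a middle feature map, and a deep feature map, and the capsule network is configured to include a first primary capsule, a second primary capsule, and a third primary capsule, wherein the shallow feature map is passed to the first primary capsule, the middle feature map to the second primary capsule, and the deep feature map to the third primary capsule.
- The method according to claim 2, characterized in that the residual block comprises a normalization layer, a ReLU activation layer, and a convolutional layer, and the residual block outputs the sum of the direct path and the residual path.
- The method according to claim 1, characterized in that the capsule network comprises a plurality of primary capsules, a plurality of corresponding digit capsules, and one concatenated capsule, wherein each primary capsule comprises a convolution module and a squeeze-and-excitation module, and the squeeze-and-excitation module obtains the weights of the digit capsules during feature merging.
- The method according to claim 4, characterized in that the squeeze-and-excitation module comprises a global pooling layer, a first fully connected layer, a ReLU activation layer, a second fully connected layer, and a Sigmoid activation layer, wherein the global pooling layer performs the squeeze operation, and the first fully connected layer, the ReLU activation layer, the second fully connected layer, and the Sigmoid activation layer perform the excitation operation.
- The method according to claim 1, characterized in that the magnetic resonance image is a delayed-enhancement magnetic resonance image.
- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
- A computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 8.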
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/134155 WO2024108505A1 (zh) | 2022-11-24 | 2022-11-24 | Method for classifying myocardial fibrosis based on a residual capsule network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/134155 WO2024108505A1 (zh) | 2022-11-24 | 2022-11-24 | Method for classifying myocardial fibrosis based on a residual capsule network |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024108505A1 (zh) | 2024-05-30 |
Family
ID=91194989
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/134155 WO2024108505A1 (zh) | 2022-11-24 | 2022-11-24 | Method for classifying myocardial fibrosis based on a residual capsule network |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024108505A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
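CN111241958A (zh) * | 2020-01-06 | 2020-06-05 | University of Electronic Science and Technology of China | Video image identification method based on a residual-capsule network |
CN113393469A (zh) * | 2021-07-09 | 2021-09-14 | Zhejiang University of Technology | Medical image segmentation method and device based on a recurrent residual convolutional neural network |
US20220044065A1 (en) * | 2020-07-17 | 2022-02-10 | Tata Consultancy Services Limited | System and method for parameter compression of capsule networks using deep features |
CN114241245A (zh) * | 2021-12-23 | 2022-03-25 | Southwest University | Image classification system based on a residual capsule neural network |
CN114359638A (zh) * | 2022-01-10 | 2022-04-15 | Anhui University of Science and Technology | Residual capsule network classification model for images, classification method, device, and storage medium |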
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22966196; Country of ref document: EP; Kind code of ref document: A1 |