CN112750097B - Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network - Google Patents


Info

Publication number
CN112750097B
CN112750097B (application number CN202110050993.3A)
Authority
CN
China
Prior art keywords
gabor
image
fusion
layer
fuzzy
Prior art date
Legal status
Active
Application number
CN202110050993.3A
Other languages
Chinese (zh)
Other versions
CN112750097A (en)
Inventor
王丽芳
张晋
王蕊芳
张炯
米嘉
刘阳
Current Assignee
North University of China
Original Assignee
North University of China
Priority date
Filing date
Publication date
Application filed by North University of China
Priority to CN202110050993.3A
Publication of CN112750097A
Application granted
Publication of CN112750097B
Legal status: Active
Anticipated expiration


Classifications

    • G06T 5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06N 3/043 — Neural-network architectures based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
    • G06N 3/045 — Combinations of networks
    • G06N 3/08 — Learning methods
    • G06T 5/70 — Denoising; Smoothing
    • G06T 7/0012 — Biomedical image inspection
    • G06T 2207/10081 — Computed x-ray tomography [CT]
    • G06T 2207/10088 — Magnetic resonance imaging [MRI]
    • G06T 2207/20192 — Edge enhancement; Edge preservation
    • G06T 2207/20221 — Image fusion; Image merging
    • G06T 2207/30096 — Tumor; Lesion
    • G06T 2207/30101 — Blood vessel; Artery; Vein; Vascular


Abstract

The invention belongs to the field of medical image fusion and in particular relates to multi-modal medical image fusion based on a multi-CNN combination and a fuzzy neural network. In order to represent the texture details of the lesion region in multi-modal medical images more fully and to make its edges clearer, the proposed method consists of two main parts: 1) construction of a G-CNN group (G-CNNs); 2) fusion of the G-CNNs based on a fuzzy neural network. The first part obtains different Gabor representation pairs of CT and MR through a set of Gabor filters with different scales and orientations and then trains a corresponding CNN on each pair, generating the G-CNNs; the second part fuses the multiple outputs of the G-CNNs with the fuzzy neural network to obtain the final fused image.

Description

Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network
Technical Field
The invention belongs to the field of medical image fusion, and particularly relates to multi-modal medical image fusion based on multi-CNN combination and a fuzzy neural network.
Background
Image fusion has a wide range of applications, including medical imaging, remote sensing, machine vision, biometric identification, and military applications. The goal of fusion is to obtain better contrast and better perceptual quality. In recent years, with the growing demands of clinical applications, research on multi-modal medical image fusion has received increasing attention. The objective of multi-modal medical image fusion is to provide a better medical image to assist the surgeon during surgical intervention.
Medical images now come in multiple modalities, such as magnetic resonance (MR), computed tomography (CT), positron emission tomography (PET) and X-ray images, and images of different modalities have their own advantages and limitations. For example, CT displays bone information well but cannot clearly show structural information such as soft tissue; MR images display soft tissue fully but are much weaker at depicting bone; PET images provide rich metabolic information for clinical use but have low resolution. Combining the medical image information of multiple modalities through multi-modal image fusion therefore allows the modalities to complement one another. The multi-modal fused image retains the characteristics of the original images, makes up for the deficiencies of single-modality medical images, shows richer detail, and provides comprehensive information for clinical diagnosis and treatment and for image-guided surgery.
Image fusion methods operate at three levels: pixel level, feature level and decision level. Pixel-level fusion combines the pixel values at each corresponding point of two or more source images with a fusion rule to compute new pixel values, so that the fused image is formed point by point. Common pixel-level methods are divided into spatial-domain methods and transform-domain methods. Spatial-domain fusion is mainly divided into block-based and region-based fusion and includes logical filtering, weighted averaging, mathematical morphology, image algebra and simulated annealing; these methods require little computation and are easy to implement, but their accuracy is poor and they are not well suited to medical imaging. Transform-domain methods decompose the source images, combine the decompositions with different fusion rules, and finally apply the inverse transform to reconstruct the fused image; they include pyramid fusion, wavelet-transform fusion and multi-scale decomposition methods, which can maintain contrast, reduce blocking effects, and have particular advantages in describing the local characteristics of signals.
Feature-level image fusion extracts the feature information of interest to an observer, such as edges, contours, shapes and local features, from the source images, and then analyzes, processes and integrates this information to obtain the fused image features. Commonly used methods include weighted averaging, Bayesian estimation and cluster analysis.
Decision-level image fusion analyzes, reasons over, recognizes and judges the feature information of each image to form intermediate results and then fuses them further; the final fusion result is a globally optimal decision. This level offers good real-time performance, high flexibility and some fault tolerance, but the preprocessing cost is higher and more of the original image information is lost.
In recent years, with the rise of deep learning, the convolutional neural network (CNN), as an important branch of deep learning, has shown a stronger feature extraction capability than conventional methods and is well suited to image fusion.
Spatial-domain fusion methods can distort the spectral and spatial content of the fused image, which makes it harder for doctors to observe pathological changes in the lesion region. Among transform-domain methods, pyramid transforms cannot introduce spatial directional selectivity during decomposition, producing images with blocking artifacts and introducing many artifacts at the edges of the fused image; wavelet transforms capture only limited directional information, so the information acquired in edge and texture regions is limited and image edges cannot be represented clearly.
Liu et al. proposed a multi-focus image fusion method based on a convolutional neural network (CNN), which uses the CNN to classify focused regions and obtain a decision map; the decision map is combined with the source images to generate the fused image. However, the feature dimensionality of the final output of the convolution and downsampling in Liu's CNN model is low, which causes information loss in the fused image. Zhang et al. proposed FCNN, a general image fusion framework based on a fully convolutional neural network, which solves the information-loss problem, but the convolution kernels of its convolutional layers are too uniformly configured, so the extracted features cannot represent the texture information of the lesion region well, and the fused image is difficult to use directly for clinical diagnosis.
In order to obtain a fused image with rich texture, clear edges and high information content, the invention provides a method for fusing multi-modal medical images based on a Gabor-represented multi-CNN combination (Gabor-CNNs, G-CNNs) and a fuzzy neural network. The proposed method consists of two main parts: 1) construction of the G-CNN group (G-CNNs); 2) fusion of the G-CNNs based on the fuzzy neural network. The first part obtains different Gabor representation pairs of CT and MR through a set of Gabor filters with different scales and orientations and then trains a corresponding CNN on each pair, generating the G-CNNs; the second part fuses the multiple outputs of the G-CNNs with the fuzzy neural network to obtain the final fused image. The method uses the Gabor representation pairs Gabor_CT and Gabor_MR as CNN inputs to enhance and retain the texture features and edge details of the source images, exploits the CNN's ability to extract effective information from a complex background to extract the depth features of the source images, and finally uses the fuzzy neural network to resolve the fuzziness and uncertainty in the fusion process. The resulting fused image retains the rich texture features and clear edge information of the CT and MR source images, and its visual quality and information content are also clearly improved.
Disclosure of Invention
In order to represent the texture details of the lesion region in multi-modal medical images more fully and to make the edges clearer, the invention fuses multi-modal medical images with a method that combines a group of Gabor-represented CNNs (Gabor-CNNs, G-CNNs) with a fuzzy neural network.
In order to achieve the purpose, the invention adopts the following technical scheme:
the multi-modal medical image fusion based on the multi-CNN combination and the fuzzy neural network comprises the following steps:
Step 1, filtering the CT and MR images of the training data set with a filter bank consisting of 16 Gabor filters with different scales and orientations to obtain 16 different Gabor representation pairs of CT and MR, Gabor_CT_i and Gabor_MR_i, i = 1, ..., 16. These representations carry detailed texture information in several different directions, which enhances the texture features of the source images. Each Gabor representation pair Gabor_CT_i and Gabor_MR_i (i = 1, ..., 16) is then used to train a corresponding G-CNN, and the 16 G-CNNs are combined into the G-CNN group (G-CNNs); because each G-CNN is trained on data with a different texture direction, each G-CNN characterizes texture information differently.
Step 2, putting each Gabor representation pair into the corresponding G-CNN for preliminary fusion, and then fusing the multiple outputs of the G-CNNs with a fuzzy neural network to obtain the final fused image.
Further, the Gabor representation of a medical image can enhance the texture characteristics of the source image and capture more edge details, so that the fused image contains more texture and edge information about the lesion site. To obtain the Gabor representations of CT and MR for a given medical image, a bank of Gabor filters with different scales and orientations must be designed. Since both CT and MR images are two-dimensional gray-scale images, a group of two-dimensional Gabor filters is defined; the filter bank in step 1 is such a group, given by the following formula:
ψ_{u,v}(z) = (‖k_{u,v}‖² / σ²) · exp(−‖k_{u,v}‖² ‖z‖² / (2σ²)) · [ exp(i·k_{u,v}·z) − exp(−σ²/2) ]   (1)

In formula (1), the orientation of the filter bank is selected as 0°, 90°, 180° and 270°, and the scale is set to 4, 8, 16 and 32; z = (x, y) denotes the pixel position; k_{u,v} = k_v · exp(i·φ_u), with k_v = k_max / f^v and φ_u = πu / 8, where k_max is the maximum frequency and f is the spacing factor between the filters in the frequency domain. By formula (1), filters with different orientations u and scales v can be configured, giving different Gabor representations of the medical image.
Still further, step 1 yields 16 different Gabor representation pairs of CT and MR, Gabor_CT_i and Gabor_MR_i, i = 1, ..., 16. The Gabor representations of CT and MR for a specific orientation u and scale v are obtained by the following formulas:

Gabor_CT^(u,v)(z) = I(z) * ψ_{u,v}(z),  for 0 ≤ u ≤ U−1, 0 ≤ v ≤ V−1   (2)
Gabor_MR^(u,v)(z) = I(z) * ψ_{u,v}(z),  for 0 ≤ u ≤ U−1, 0 ≤ v ≤ V−1   (3)

In formulas (2) and (3), I(z) denotes the given image, ψ_{u,v}(z) denotes the filter, * denotes the convolution operation, and Gabor_CT^(u,v)(z) and Gabor_MR^(u,v)(z) are the Gabor representations of the given CT and MR images. A 4 × 4 filter bank is thus obtained, and after the CT and MR images are convolved with this Gabor filter bank, 16 Gabor_CT and Gabor_MR representations with different orientations and scales are obtained. These representations characterize the lesion site from different directions, so that key information is enhanced, and they serve as the training set for the G-CNNs.
To fuse the Gabor_CT and Gabor_MR images, a G-CNN model is designed whose structure is shown in FIG. 2; it consists of three parts: feature extraction, feature fusion and image reconstruction.
Because downsampling the feature maps would inevitably lose source information of the input images and degrade image quality, no downsampling is performed in the G-CNN, so the feature maps output by each layer keep the same size as the input Gabor_CT and Gabor_MR images. As shown in FIG. 2, the network consists of four convolutional layers, CONV1, CONV2, CONV3 and CONV4. First, two convolutional layers, CONV1 and CONV2, extract the information features of the images; second, the feature fusion step FUSE fuses the convolution features of the input images; finally, the fused features are reconstructed by two convolutional layers, CONV3 and CONV4, to obtain the fused image.
Further, the G-CNN in step 2 consists of three parts: feature extraction, feature fusion and image reconstruction.
Feature extraction: the 3 × 224 × 224 Gabor_CT and Gabor_MR images are input into the G-CNN, and two convolutional layers extract depth features from the input Gabor-represented source images. The first convolutional layer of a ResNet101 pre-trained on ImageNet is adopted as the first convolutional layer CONV1; CONV1 contains 64 convolution kernels of size 7 × 7, with stride and padding set to 1 and 3, respectively. To extract deeper features and adjust the convolution features of CONV1, a second convolutional layer CONV2 is added; the kernel number and kernel size of CONV2 are set to 64 and 3 × 3, and its stride and padding are set to 1. To alleviate over-fitting and accelerate network convergence, CONV2 is followed by a ReLU activation function and batch normalization. After the convolution operations of the two layers, 64 feature maps of size 224 × 224 are obtained for the Gabor_CT and Gabor_MR images respectively.
Feature fusion: during fusion, the convolution features of the multiple inputs are fused with an element-level fusion rule, as shown in formula (4):

F^fuse(x, y) = fuse( F_1^C2(x, y), F_2^C2(x, y), ..., F_N^C2(x, y) )   (4)

In formula (4), N denotes the number of input images, N ≥ 2, F_i^C2 denotes the feature maps of the i-th input image extracted by CONV2, F^fuse denotes the fused feature maps generated by the feature fusion module, and fuse denotes the element-level fusion rule.
Image reconstruction: after feature fusion, two convolutional layers CONV3 and CONV4 reconstruct the fused image from the fused convolution features. CONV3 uses the same parameter settings as CONV2 so that it can adjust the fused convolution features: its kernel number and kernel size are set to 64 and 3 × 3, with the same ReLU activation function and batch normalization as CONV2. CONV4 reconstructs and outputs the feature maps through an element-wise weighted average; its kernel number and kernel size are set to 3 and 1 × 1, with stride 1 and padding 0. After the convolution operations of these two reconstruction layers, a 3 × 224 × 224 fused image of Gabor_CT and Gabor_MR is obtained. Throughout the model, the feature-map size stays consistent with the input source images, so the texture and edge information of the source Gabor_CT and Gabor_MR is well preserved; the resulting fused image has rich texture features and clear edges at the lesion site, but the boundary of a diffuse lesion region remains uncertain and blurred.
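As an illustrative sketch only (not the patented implementation), a G-CNN with the CONV1–CONV4 layout and the element-maximum fusion rule described above could be written in PyTorch roughly as follows; the use of PyTorch, the omission of the ResNet101 weight copy, and all training details (loss, optimizer, data handling) are assumptions made for brevity.

    import torch
    import torch.nn as nn

    class GCNN(nn.Module):
        """Sketch of one G-CNN: CONV1/CONV2 feature extraction, element-max fusion,
        CONV3/CONV4 reconstruction; all feature maps keep the input resolution."""

        def __init__(self):
            super().__init__()
            # CONV1: 64 kernels of 7x7, stride 1, padding 3 (weights could be copied
            # from the first layer of an ImageNet-pretrained ResNet101).
            self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=1, padding=3)
            # CONV2: 64 kernels of 3x3, stride 1, padding 1, with BN + ReLU.
            self.conv2 = nn.Sequential(
                nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm2d(64), nn.ReLU(inplace=True))
            # CONV3: same settings as CONV2, used to adjust the fused features.
            self.conv3 = nn.Sequential(
                nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm2d(64), nn.ReLU(inplace=True))
            # CONV4: 3 kernels of 1x1, padding 0 -> 3-channel fused image.
            self.conv4 = nn.Conv2d(64, 3, kernel_size=1, stride=1, padding=0)

        def extract(self, x):
            return self.conv2(self.conv1(x))

        def forward(self, gabor_ct, gabor_mr):
            f_ct = self.extract(gabor_ct)          # N x 64 x 224 x 224
            f_mr = self.extract(gabor_mr)
            fused = torch.maximum(f_ct, f_mr)      # element-maximum fusion rule (4)
            return self.conv4(self.conv3(fused))   # N x 3 x 224 x 224

    # usage sketch: out = GCNN()(torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224))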
Still further, the element-level fusion rule in step 2.2 is an element maximum fusion rule.
Further, the fuzzy neural network in the step 2 is composed of an input layer, a fuzzy partition layer, a forward combination layer, an inference layer and an output layer;
In the first layer (input layer), the sixteen neurons x_1, x_2, ..., x_16 take the pixel values at the same position in the 16 Gabor fused images obtained from the G-CNNs;
In the second layer (fuzzy partition layer), because different tissues in a medical image have different gray levels, the pixels are partitioned into five fuzzy sets according to their values: very dark, dark, normal, bright and very bright. A membership function represents each fuzzy set and takes the form of a Gaussian function, as shown in formula (8):

μ_ij(x_i) = exp( −(x_i − c_ij)² / (2σ_ij²) )   (8)

In formula (8), c and σ are the center and width of the Gaussian function, i = 1, 2, ..., 16 and j = 1, 2, ..., 5. Different fuzzy sets correspond to different membership functions; the five membership functions correspond in order to 'very dark', 'dark', 'normal', 'bright' and 'very bright'. For a specific pixel value, the five membership functions give different output values μ, and the fuzzy set with the largest μ is selected as the fuzzy set of that pixel value;
In the third layer (forward combination layer), which forms the premises of the rules, nodes of the same fuzzy set are arranged together;
In the fourth layer (inference layer), rule-based inference is formed; each node is created by connecting the sixteen nodes of the third layer that represent the same fuzzy set, and five fuzzy rules are set according to the five fuzzy sets (as shown in FIG. 6): when the pixel values x_1, x_2, ..., x_16 all belong to 'very dark', the rule output is R_1; when they all belong to 'dark', the rule output is R_2; when they all belong to 'normal', the rule output is R_3; when they all belong to 'bright', the rule output is R_4; and when they all belong to 'very bright', the rule output is R_5 (the explicit expressions for R_1 to R_5 are given as equation images in the original document);
In the fifth layer (output layer), the neuron represents the pixel value of the fused image at the same position as in the input images. To obtain the fused image, the connection weights V between the inference layer and the output layer are set as

V_l = p_0 + p_1·x_1 + p_2·x_2 + ⋯ + p_16·x_16,  l = 1, 2, ..., 5   (9)

In formula (9), p_0, p_1, ..., p_16 are weighting factors and l = 1, 2, ..., 5. The pixel value of the final fused image is then computed from the 16 Gabor fused images by the defuzzification function

F = ( Σ_{l=1}^{L} V_l · R_l ) / ( Σ_{l=1}^{L} R_l )   (10)

In formula (10), L = 5 and R_l is the rule output value.
The 16 Gabor fused images obtained from the G-CNNs are finally fused into a single image by the five-layer fuzzy neural network, which resolves the fuzziness and uncertainty in the fusion process; the resulting CT/MR fused image has clear edges and textures and rich content.
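As a rough illustration of the per-pixel fuzzy fusion described above, a NumPy sketch might look as follows. The Gaussian centers and widths, the product form of the rule firing strength, and the weighted-average defuzzification are assumptions made for this sketch, since the corresponding expressions appear only as equation images in the original document.

    import numpy as np

    # Assumed centers and widths of the five Gaussian membership functions
    # ("very dark", "dark", "normal", "bright", "very bright") on a 0-255 scale.
    CENTERS = np.array([0.0, 64.0, 128.0, 192.0, 255.0])
    WIDTHS = np.full(5, 40.0)

    def memberships(x):
        """Formula (8): Gaussian membership of a pixel value in each of the 5 fuzzy sets."""
        return np.exp(-((x - CENTERS) ** 2) / (2.0 * WIDTHS ** 2))

    def fuzzy_fuse_pixel(x, p):
        """Fuse the 16 co-located pixel values x[0..15] of the G-CNN outputs.

        p is a 5 x 17 array of consequent weights (p_0..p_16 per rule); the product
        firing strength and the weighted-average defuzzification are assumptions."""
        mu = np.stack([memberships(v) for v in x])    # 16 x 5
        firing = mu.prod(axis=0)                      # rule outputs R_l, l = 1..5
        consequent = p[:, 0] + p[:, 1:] @ x           # V_l = p_0 + sum_i p_i * x_i   (9)
        return float((firing * consequent).sum() / (firing.sum() + 1e-12))  # (10)

    def fuzzy_fuse(images, p):
        """Apply the per-pixel rule to a stack of 16 Gabor fused images (16 x H x W)."""
        h, w = images.shape[1:]
        out = np.zeros((h, w))
        for r in range(h):
            for c in range(w):
                out[r, c] = fuzzy_fuse_pixel(images[:, r, c], p)
        return out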
Compared with the prior art, the invention has the following advantages:
the invention provides multi-modal medical image fusion of multi-CNN combination and fuzzy neural network based on Gabor representation. According to the method, the Gabor representation is used as the input of the CNN to enhance and retain the texture features and edge details of the focus part in the source image, then the CNN is used for extracting the depth features of the source image from the complex background, and the integrated thought in machine learning is used for training a plurality of G-CNNs, wherein each G-CNN has different texture information representing capability, the fuzzy and uncertain problems in the fusion process are solved by using the fuzzy neural network, and the obtained fusion image retains the rich texture features and clear edge information of the focus part in the medical source image.
Drawings
FIG. 1 is a frame diagram of multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network of the present invention;
FIG. 2 is a diagram of a model structure of G-CNN of the present invention;
FIG. 3 is a diagram of a model neural network architecture of the present invention;
FIG. 4 is a fusion diagram of G-CNNs based on the fuzzy neural network of the present invention;
FIG. 5 is a fuzzy set corresponding to the membership function of the present invention;
FIG. 6 shows five fusion rules for fuzzy neural network fusion according to the present invention;
FIG. 7 is a graph of the fusion result of CT and MR images of cerebrovascular disease;
FIG. 8 is a graph of CT/MR fusion results for cerebral infarction disease;
FIG. 9 is a graph of CT/MR fusion results for brain tumor disease.
Detailed Description
As shown in fig. 1, the multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network of the present invention comprises the following steps:
Step 1, filtering the CT and MR images of the training data set with a filter bank consisting of 16 two-dimensional Gabor filters with different scales and orientations, defined by the following formula:

ψ_{u,v}(z) = (‖k_{u,v}‖² / σ²) · exp(−‖k_{u,v}‖² ‖z‖² / (2σ²)) · [ exp(i·k_{u,v}·z) − exp(−σ²/2) ]   (1)

In formula (1), the orientation of the filter bank is selected as 0°, 90°, 180° and 270°, and the scale is set to 4, 8, 16 and 32; z = (x, y) denotes the pixel position; k_{u,v} = k_v · exp(i·φ_u), with k_v = k_max / f^v and φ_u = πu / 8, where k_max is the maximum frequency and f is the spacing factor between the filters in the frequency domain. By formula (1), filters with different orientations u and scales v can be configured, giving different Gabor representations of the medical image.
The Gabor representations of CT and MR for a specific orientation u and scale v are obtained by the following formulas:

Gabor_CT^(u,v)(z) = I(z) * ψ_{u,v}(z),  for 0 ≤ u ≤ U−1, 0 ≤ v ≤ V−1   (2)
Gabor_MR^(u,v)(z) = I(z) * ψ_{u,v}(z),  for 0 ≤ u ≤ U−1, 0 ≤ v ≤ V−1   (3)

In formulas (2) and (3), I(z) denotes the given image, ψ_{u,v}(z) denotes the filter, * denotes the convolution operation, and Gabor_CT^(u,v)(z) and Gabor_MR^(u,v)(z) are the Gabor representations of the given CT and MR images; this yields 16 different Gabor representation pairs of CT and MR, Gabor_CT_i and Gabor_MR_i, i = 1, ..., 16.
Each Gabor representation pair Gabor_CT_i and Gabor_MR_i (i = 1, ..., 16) is then used to train a corresponding G-CNN, and the 16 G-CNNs make up the G-CNN group (G-CNNs).
Step 2, putting each Gabor representation pair into the corresponding G-CNN for preliminary fusion. The G-CNN structure, designed as shown in FIG. 2, consists of three parts, namely feature extraction, feature fusion and image reconstruction, as follows:
Feature extraction: the 3 × 224 × 224 Gabor_CT and Gabor_MR images are input into the G-CNN network, and two convolutional layers extract depth features from the input Gabor-represented source images. The first convolutional layer of a ResNet101 pre-trained on ImageNet is adopted as the first convolutional layer CONV1; CONV1 contains 64 convolution kernels of size 7 × 7, with stride and padding set to 1 and 3, respectively. A second convolutional layer CONV2 is added, with kernel number and kernel size set to 64 and 3 × 3, stride and padding set to 1, followed by a ReLU activation function and batch normalization. After the convolution operations of the two layers, 64 feature maps of size 224 × 224 are obtained for the Gabor_CT and Gabor_MR images respectively.
Feature fusion: during fusion, the convolution features of the multiple inputs are fused with the element-maximum fusion rule, as shown in formula (4):

F^fuse(x, y) = fuse( F_1^C2(x, y), F_2^C2(x, y), ..., F_N^C2(x, y) )   (4)

In formula (4), N denotes the number of input images, N ≥ 2, F_i^C2 denotes the feature maps of the i-th input image extracted by CONV2, F^fuse denotes the fused feature maps generated by the feature fusion module, and fuse denotes the element-level fusion rule, chosen here as the element-maximum rule.
Image reconstruction: after feature fusion, two convolutional layers CONV3 and CONV4 reconstruct the fused image from the fused convolution features. The kernel number and kernel size of CONV3 are set to 64 and 3 × 3, with the same ReLU activation function and batch normalization as CONV2. CONV4 reconstructs and outputs the feature maps through an element-wise weighted average; its kernel number and kernel size are set to 3 and 1 × 1, with stride 1 and padding 0. After these two reconstruction convolutions, a 3 × 224 × 224 fused image of Gabor_CT and Gabor_MR is obtained.
When CT and MR images are fused, different Gabor representations are obtained from the CT and MR images through the filter bank, and the corresponding Gabor representation pairs are put into the corresponding G-CNNs for fusion, giving the Gabor fused images. As shown in FIG. 3, these fused images still suffer from uncertain and blurred boundaries of diffuse lesion regions. The fuzzy neural network can effectively overcome fuzziness, uncertainty and similar defects in the fusion process through its fuzzy sets and membership functions, so the G-CNN fused images are fused together by the fuzzy neural network to obtain the final fused image.
The multiple outputs of the G-CNNs are then fused with a fuzzy neural network to obtain the final fused image; as shown in the dashed box in FIG. 4, the fuzzy neural network consists of an input layer, a fuzzy partition layer, a forward combination layer, an inference layer and an output layer.
In the first layer (input layer), the sixteen neurons x_1, x_2, ..., x_16 take the pixel values at the same position in the 16 Gabor fused images obtained from the G-CNNs;
In the second layer (fuzzy partition layer), the pixels are partitioned into five fuzzy sets according to their values: very dark, dark, normal, bright and very bright. A membership function represents each fuzzy set and takes the form of a Gaussian function, as shown in formula (8):

μ_ij(x_i) = exp( −(x_i − c_ij)² / (2σ_ij²) )   (8)

In formula (8), c and σ are the center and width of the Gaussian function, i = 1, 2, ..., 16 and j = 1, 2, ..., 5. Different fuzzy sets correspond to different membership functions, whose arrangement is shown in FIG. 5: the five membership functions correspond in order to 'very dark', 'dark', 'normal', 'bright' and 'very bright'; the abscissa represents the image pixel value and the ordinate represents the output value of the membership function μ. The specific procedure is as follows: for a specific pixel value, the five membership functions give different output values μ, and the fuzzy set with the largest μ is selected as the fuzzy set of that pixel value;
In the third layer (forward combination layer), nodes of the same fuzzy set are arranged together;
In the fourth layer (inference layer), each node is formed by connecting the sixteen nodes of the third layer that represent the same fuzzy set, and five fuzzy rules are set according to the five fuzzy sets (as shown in FIG. 6): when the pixel values x_1, x_2, ..., x_16 all belong to 'very dark', the rule output is R_1; when they all belong to 'dark', the rule output is R_2; when they all belong to 'normal', the rule output is R_3; when they all belong to 'bright', the rule output is R_4; and when they all belong to 'very bright', the rule output is R_5 (the explicit expressions for R_1 to R_5 are given as equation images in the original document);
In the fifth layer (output layer), the neuron represents the pixel value of the fused image at the same position as in the input images. To obtain the fused image, the connection weights v between the inference layer and the output layer are set as

V_l = p_0 + p_1·x_1 + p_2·x_2 + ⋯ + p_16·x_16,  l = 1, 2, ..., 5   (9)

In formula (9), p_0, p_1, ..., p_16 are weighting factors and l = 1, 2, ..., 5. The pixel value of the final fused image is then computed from the 16 Gabor fused images by the defuzzification function

F = ( Σ_{l=1}^{L} V_l · R_l ) / ( Σ_{l=1}^{L} R_l )   (10)

In formula (10), L = 5 and R_l is the rule output value.
The 16 Gabor fused images obtained from the G-CNNs are finally fused into a single image by the five-layer fuzzy neural network, which resolves the fuzziness and uncertainty in the fusion process; the resulting CT/MR fused image has clear edges and textures and rich content.
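Putting the pieces together, the overall flow of the detailed description can be sketched as below; the names gabor_representations, GCNN and fuzzy_fuse refer to the illustrative sketches given earlier (assumed to be defined in the same module) and are hypothetical, as are the channel replication, weight loading and channel-averaging steps.

    import numpy as np
    import torch

    def fuse_ct_mr(ct, mr, gcnn_weights, fuzzy_params):
        """Hypothetical end-to-end sketch: 16 Gabor pairs -> 16 G-CNNs -> fuzzy fusion."""
        gabor_ct = gabor_representations(ct)   # 16 Gabor representations of the CT slice
        gabor_mr = gabor_representations(mr)   # 16 Gabor representations of the MR slice

        fused_reps = []
        for i in range(16):
            net = GCNN()
            net.load_state_dict(gcnn_weights[i])   # one trained G-CNN per Gabor pair
            net.eval()
            # replicate the single-channel Gabor maps to the 3-channel input expected above
            a = torch.from_numpy(np.repeat(gabor_ct[i][None, None], 3, axis=1)).float()
            b = torch.from_numpy(np.repeat(gabor_mr[i][None, None], 3, axis=1)).float()
            with torch.no_grad():
                out = net(a, b)                     # preliminary fused Gabor image
            fused_reps.append(out[0].mean(dim=0).numpy())  # collapse to one channel

        stack = np.stack(fused_reps)                # 16 x H x W
        return fuzzy_fuse(stack, fuzzy_params)      # final CT/MR fused image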
To verify the effectiveness of the method of the invention, the following three sets of data were used:
1. CT/MR image fusion experiment for cerebrovascular disease
TABLE 1 comparison of CT/MR image fusion results for cerebrovascular disorders
[Table 1 is reproduced as an image in the original document.]
Fig. 7 shows the fusion results of CT and MR images of cerebrovascular disease; the fused image should preserve the edge details of the vascular lesion as much as possible so that the doctor can diagnose and observe the cerebral cortical structure. In Fig. 7, a and b are the CT and MR source images of the cerebrovascular disease, and c-h are the fusion results of DWT, LAP, NSST-PCNN, CNN, GFF and the proposed method, respectively. To show the differences between the compared methods more clearly, the regions of interest are marked with yellow rectangles. In the marked regions, the DWT result lacks edge information and has poor visual quality; the LAP result loses many edges of the MR image, blurring the image; the NSST-PCNN result is overly bright, so the edge information of the lesion is unclear; the CNN result suffers from low edge contrast; the GFF method and the proposed method both give good fusion results. The quantitative data in Table 1 show that DWT, LAP, NSST-PCNN and CNN score lower on the MI and Q^{AB/F} indexes than the proposed method, i.e., they preserve less edge information; the proposed method keeps the lesion-boundary information intact, gives good fusion quality, and helps doctors accurately diagnose and observe the boundary of the lesion region.
2. CT/MR image fusion experiment for cerebral infarction disease
TABLE 2 CT/MR image fusion result Performance comparison for cerebral infarction diseases
[Table 2 is reproduced as an image in the original document.]
Fig. 8 shows the CT/MR image fusion results for cerebral infarction; the fused image should characterize the texture features of the lesion as fully as possible so that the doctor can accurately determine the infarct site and the degree of infarction. In Fig. 8, a and b are the CT and MR source images of the cerebral infarction, and c-h are the fusion results of DWT, LAP, NSST-PCNN, CNN, GFF and the proposed method, respectively. The DWT result has low similarity to the source images and does not preserve the texture features of the lesion well; the LAP result has low contrast and much noise, partially blurring the texture of the lesion region; the NSST-PCNN result is too bright in the lesion region, weakening the texture information; the CNN result has low texture contrast in the lesion region; and the GFF result does not characterize the lesion texture sufficiently. Table 2 shows that the proposed method scores higher than the other fusion methods on all four indexes. In terms of both the fusion results and the quantitative indexes, the proposed method preserves the texture features of the cerebral infarction lesion fully and clearly, gives good fusion quality, and assists the doctor's diagnosis.
3. CT/MR image fusion experiment for brain tumor diseases
TABLE 3 brain CT/MR image fusion result Performance comparison
[Table 3 is reproduced as an image in the original document.]
Fig. 9 shows the CT/MR image fusion results for a brain tumor; the fused image should retain the rich texture features and clear edge information of the tumor region as much as possible to help the doctor accurately determine the tumor grade and boundary for diagnosis and resection. In Fig. 9, a and b are the CT and MR source images of the brain tumor, and c-h are the fusion results of DWT, LAP, NSST-PCNN, CNN, GFF and the proposed method, respectively. The DWT fused image has low sharpness at the edge of the yellow-marked region and blurred texture; the LAP fused image (d in Fig. 9) retains more CT information but loses some MR details, so part of the texture and edge detail of the lesion is missing; the NSST-PCNN fused image is too bright, weakening the texture and edge details of the source images; the CNN result has very low overall contrast, so the texture and edge details of the tumor are not obvious; visually, GFF performs similarly to the proposed method, but its lesion region is somewhat smoothed, its texture is insufficient, and its edge details are not clear enough. The quantitative indexes in Table 3 show that the comparison methods score lower on both the MI and Q^{AB/F} indexes, indicating deficiencies in texture features and edge details, consistent with the subjective impression. The proposed method is superior to the comparison methods in preserving texture features and edge details.
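For reference, the mutual information (MI) index used in the comparisons above measures how much intensity information of each source image is preserved in the fused image; a small histogram-based sketch is given below, where the bin count and the convention of summing the MI of the fused image with both sources are assumptions of the sketch.

    import numpy as np

    def mutual_information(a, b, bins=64):
        """Histogram estimate of the mutual information between two images."""
        joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        pxy = joint / joint.sum()
        px = pxy.sum(axis=1, keepdims=True)
        py = pxy.sum(axis=0, keepdims=True)
        nz = pxy > 0
        return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

    def fusion_mi(ct, mr, fused):
        """MI fusion index: information the fused image shares with both sources."""
        return mutual_information(ct, fused) + mutual_information(mr, fused)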

Claims (4)

1. The multi-modal medical image fusion based on the multi-CNN combination and the fuzzy neural network is characterized by comprising the following steps:
step 1, filtering the CT and MR images of the training data set with a filter bank consisting of 16 Gabor filters with different scales and orientations to obtain 16 different Gabor representation pairs of CT and MR, Gabor_CT_i and Gabor_MR_i, i = 1, ..., 16; then using each Gabor representation pair Gabor_CT_i and Gabor_MR_i (i = 1, ..., 16) to train a corresponding G-CNN, the 16 G-CNNs being combined into the G-CNN group (G-CNNs);
the G-CNN consists of three parts of feature extraction, feature fusion and image reconstruction;
the feature extraction: the 3 × 224 × 224 Gabor_CT and Gabor_MR images are input into the G-CNN, and two convolutional layers extract depth features from the input Gabor-represented source images; the first convolutional layer of a ResNet101 pre-trained on ImageNet is adopted as the first convolutional layer CONV1, which contains 64 convolution kernels of size 7 × 7 with stride and padding set to 1 and 3, respectively; a second convolutional layer CONV2 is added, with kernel number and kernel size set to 64 and 3 × 3, stride and padding set to 1, followed by a ReLU activation function and batch normalization; after the convolution operations of the two layers, 64 feature maps of size 224 × 224 are obtained for the Gabor_CT and Gabor_MR images respectively;
the feature fusion: in the fusion process, the convolution features of the multiple inputs are fused with an element-level fusion rule, as shown in formula (4):

F^fuse(x, y) = fuse( F_1^C2(x, y), F_2^C2(x, y), ..., F_N^C2(x, y) )   (4)

in formula (4), N denotes the number of input images, N ≥ 2, F_i^C2 denotes the feature maps of the i-th input image extracted by CONV2, F^fuse denotes the fused feature maps generated by the feature fusion module, and fuse denotes the element-level fusion rule;
the image reconstruction: after feature fusion, two convolutional layers CONV3 and CONV4 reconstruct the fused image from the fused convolution features; the kernel number and kernel size of CONV3 are set to 64 and 3 × 3, with the same ReLU activation function and batch normalization as CONV2; CONV4 reconstructs and outputs the feature maps through an element-wise weighted average, its kernel number and kernel size are set to 3 and 1 × 1, and its stride and padding are set to 1 and 0, respectively; after the convolution operations of these two reconstruction layers, a 3 × 224 × 224 fused image of Gabor_CT and Gabor_MR is obtained;
step 2, putting each Gabor representation pair into the corresponding G-CNN for preliminary fusion, and then fusing the multiple outputs of the G-CNNs with a fuzzy neural network to obtain the final fused image;
the fuzzy neural network consists of an input layer, a fuzzy partition layer, a forward combination layer, a reasoning layer and an output layer;
in the first layer (input layer), the sixteen neurons x_1, x_2, ..., x_16 take the pixel values at the same position in the 16 Gabor fused images obtained from the G-CNNs;
in the second layer (fuzzy partition layer), the pixels are partitioned into five fuzzy sets according to their values: very dark, dark, normal, bright and very bright; a membership function represents each fuzzy set and takes the form of a Gaussian function, as shown in formula (8):

μ_ij(x_i) = exp( −(x_i − c_ij)² / (2σ_ij²) )   (8)

in formula (8), c and σ are the center and width of the Gaussian function, i = 1, 2, ..., 16 and j = 1, 2, ..., 5; different fuzzy sets correspond to different membership functions, and the five membership functions correspond in order to 'very dark', 'dark', 'normal', 'bright' and 'very bright'; for a specific pixel value, the five membership functions give different output values μ, and the fuzzy set with the largest μ is selected as the fuzzy set of that pixel value;
in the third layer (forward combination layer), nodes of the same fuzzy set are arranged together;
in the fourth layer (inference layer), each node is formed by connecting the sixteen nodes of the third layer that represent the same fuzzy set, and five fuzzy rules are set according to the five fuzzy sets: when the pixel values x_1, x_2, ..., x_16 all belong to 'very dark', the rule output is R_1; when they all belong to 'dark', the rule output is R_2; when they all belong to 'normal', the rule output is R_3; when they all belong to 'bright', the rule output is R_4; and when they all belong to 'very bright', the rule output is R_5 (the explicit expressions for R_1 to R_5 are given as equation images in the original document);
in the fifth layer (output layer), the neuron represents the pixel value of the fused image at the same position as in the input images; to obtain the fused image, the connection weights v between the inference layer and the output layer are set as

V_l = p_0 + p_1·x_1 + p_2·x_2 + ⋯ + p_16·x_16,  l = 1, 2, ..., 5   (9)

in formula (9), p_0, p_1, ..., p_16 are weighting factors and l = 1, 2, ..., 5; the pixel value of the final fused image is then computed from the 16 Gabor fused images by the defuzzification function

F = ( Σ_{l=1}^{L} V_l · R_l ) / ( Σ_{l=1}^{L} R_l )   (10)

in formula (10), L = 5 and R_l is the rule output value.
2. The multimodal medical image fusion based on multi-CNN combination and fuzzy neural network as claimed in claim 1, wherein the filter bank in step 1 is a set of two-dimensional Gabor filters, and its formula is:
ψ_{u,v}(z) = (‖k_{u,v}‖² / σ²) · exp(−‖k_{u,v}‖² ‖z‖² / (2σ²)) · [ exp(i·k_{u,v}·z) − exp(−σ²/2) ]   (1)

in formula (1), the orientation of the filter bank is selected as 0°, 90°, 180° and 270°, and the scale is set to 4, 8, 16 and 32; z = (x, y) denotes the pixel position; k_{u,v} = k_v · exp(i·φ_u), with k_v = k_max / f^v and φ_u = πu / 8, where k_max is the maximum frequency and f is the spacing factor between the filters in the frequency domain.
3. The multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network as claimed in claim 2, wherein the step 1 obtains 16 different Gabor representation pairs of CT and MR:
Gabor_CT_i and Gabor_MR_i, i = 1, ..., 16, wherein the Gabor representations of CT and MR at a specific orientation u and scale v are obtained by the following formulas:

Gabor_CT^(u,v)(z) = I(z) * ψ_{u,v}(z),  for 0 ≤ u ≤ U−1, 0 ≤ v ≤ V−1   (2)
Gabor_MR^(u,v)(z) = I(z) * ψ_{u,v}(z),  for 0 ≤ u ≤ U−1, 0 ≤ v ≤ V−1   (3)

in formulas (2) and (3), I(z) denotes the given image, ψ_{u,v}(z) denotes the filter, * denotes the convolution operation, and Gabor_CT^(u,v)(z) and Gabor_MR^(u,v)(z) are the Gabor representations of the given CT and MR images.
4. The multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network of claim 1, wherein the element-level fusion rule is an element maximum fusion rule.
CN202110050993.3A 2021-01-14 2021-01-14 Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network Active CN112750097B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110050993.3A CN112750097B (en) 2021-01-14 2021-01-14 Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110050993.3A CN112750097B (en) 2021-01-14 2021-01-14 Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network

Publications (2)

Publication Number Publication Date
CN112750097A CN112750097A (en) 2021-05-04
CN112750097B true CN112750097B (en) 2022-04-05

Family

ID=75652065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110050993.3A Active CN112750097B (en) 2021-01-14 2021-01-14 Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network

Country Status (1)

Country Link
CN (1) CN112750097B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114974518A (en) * 2022-04-15 2022-08-30 浙江大学 Multi-mode data fusion lung nodule image recognition method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529667A (en) * 2016-09-23 2017-03-22 中国石油大学(华东) Logging facies identification and analysis method based on fuzzy depth learning in big data environment
CN108714026A (en) * 2018-03-27 2018-10-30 杭州电子科技大学 The fine granularity electrocardiosignal sorting technique merged based on depth convolutional neural networks and on-line decision

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106971174B (en) * 2017-04-24 2020-05-22 华南理工大学 CNN model, CNN training method and CNN-based vein identification method
CN109544492A (en) * 2018-10-25 2019-03-29 东南大学 A kind of multi-focus image fusion data set production method based on convolutional neural networks
US11538158B2 (en) * 2018-10-26 2022-12-27 National Cheng Kung University Convolutional neural network and associated method for identifying basal cell carcinoma
CN109801250A (en) * 2019-01-10 2019-05-24 云南大学 Infrared and visible light image fusion method based on ADC-SCM and low-rank matrix expression
CN109919960B (en) * 2019-02-22 2023-04-07 西安工程大学 Image continuous edge detection method based on multi-scale Gabor filter
CN110555458B (en) * 2019-07-24 2022-04-19 中北大学 Multi-band image feature level fusion method for generating countermeasure network based on attention mechanism
CN111882514B (en) * 2020-07-27 2023-05-19 中北大学 Multi-mode medical image fusion method based on double-residual ultra-dense network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529667A (en) * 2016-09-23 2017-03-22 中国石油大学(华东) Logging facies identification and analysis method based on fuzzy depth learning in big data environment
CN108714026A (en) * 2018-03-27 2018-10-30 杭州电子科技大学 The fine granularity electrocardiosignal sorting technique merged based on depth convolutional neural networks and on-line decision

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114974518A (en) * 2022-04-15 2022-08-30 浙江大学 Multi-mode data fusion lung nodule image recognition method and device

Also Published As

Publication number Publication date
CN112750097A (en) 2021-05-04

Similar Documents

Publication Publication Date Title
Yi et al. Generative adversarial network in medical imaging: A review
YİĞİT et al. Applying deep learning models to structural MRI for stage prediction of Alzheimer's disease
CN111882514B (en) Multi-mode medical image fusion method based on double-residual ultra-dense network
CN110444277B (en) Multi-mode brain MRI image bidirectional conversion method based on multi-generation and multi-confrontation
CN109934887B (en) Medical image fusion method based on improved pulse coupling neural network
WO2022121100A1 (en) Darts network-based multi-modal medical image fusion method
CN111178369B (en) Medical image recognition method and system, electronic equipment and storage medium
Wang et al. Multimodal medical image fusion based on Gabor representation combination of multi-CNN and fuzzy neural network
Liu et al. MSDF-Net: Multi-scale deep fusion network for stroke lesion segmentation
Liang et al. Alzheimer’s disease classification using 2d convolutional neural networks
CN107292858A (en) A kind of multimode medical image fusion method based on low-rank decomposition and rarefaction representation
CN116645283A (en) Low-dose CT image denoising method based on self-supervision perceptual loss multi-scale convolutional neural network
Jiang et al. CT image super resolution based on improved SRGAN
CN113436128B (en) Dual-discriminator multi-mode MR image fusion method, system and terminal
CN112750097B (en) Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network
Panda et al. A 3D wide residual network with perceptual loss for brain MRI image denoising
Yang et al. Retinal image enhancement with artifact reduction and structure retention
Hao et al. Magnetic resonance image segmentation based on multi-scale convolutional neural network
Mecheter et al. Deep learning with multiresolution handcrafted features for brain MRI segmentation
CN112562058B (en) Method for quickly establishing intracranial vascular simulation three-dimensional model based on transfer learning
CN116630272A (en) Cerebral hemorrhage pseudo-health image generation method based on pathological decoupling
Zhang et al. Medical image fusion based a densely connected convolutional networks
Yang et al. Hierarchical progressive network for multimodal medical image fusion in healthcare systems
CN114926383A (en) Medical image fusion method based on detail enhancement decomposition model
Nguyen et al. Comparative study on super resolution techniques for upper gastrointestinal endoscopic images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant