CN116342444A

CN116342444A - Dual-channel multi-mode image fusion method and fusion imaging terminal

Info

Publication number: CN116342444A
Application number: CN202310123425.0A
Authority: CN
Inventors: 刘慧�; 朱积成; 王欣雨; 郭强; 张永霞
Original assignee: Shandong University of Finance and Economics
Current assignee: Shandong University of Finance and Economics
Priority date: 2023-02-14
Filing date: 2023-02-14
Publication date: 2023-06-27

Abstract

The invention provides a double-channel multi-mode image fusion method and a fusion imaging terminal, which relate to the technical field of medical imaging and decompose a source image into a structural channel and an energy channel through JBF (joint-beam-forming) transformation; the local gradient energy operator is adopted to fuse the small-edge and small-scale detail information of the structural channel and tissue fibers, and the local entropy detail enhancement operator, PCNN and NSCT with phase consistency are adopted to fuse the energy channel and the edge intensity, texture characteristics and gray level change condition of the organ; and obtaining a fusion image through inverse JBF conversion. The invention can strengthen detail information and improve medical image fusion with the similarity of the multi-mode medical image on the basis of keeping the edge, reducing noise and smoothing of the fusion image. The structure channel adopts an improved local gradient energy operator, so that the expression of the detail information of the fusion image is further improved.

Description

Dual-channel multi-mode image fusion method and fusion imaging terminal

Technical Field

The invention relates to the technical field of medical imaging, in particular to a dual-channel multi-mode image fusion method and a fusion imaging terminal.

Background

With the application and development of sensor technology and computer technology, medical imaging technology plays an increasingly important role in modern medical diagnosis and treatment. Because of imaging mechanism and technical limitation, different images acquired by a single sensor can only reflect local characteristics of a lesion part, so that the useful information of a target mode medical image is extracted and the complementary information of a plurality of original medical images is fused to ensure that the fused image can provide more comprehensive and reliable lesion description and help doctors to make more accurate and comprehensive diagnosis on the lesion part in order to observe all the characteristics of the lesion part in one image.

In the prior art, image fusion techniques are widely studied in the medical field, and many scholars propose a large number of image fusion algorithms, and these methods are roughly divided into spatial domain techniques and frequency domain techniques. Spatial domain techniques refer to direct fusion operations at the source image pixel level or color space, and currently include image pixel maximum method, image pixel weighted average method, principal component analysis (principal component analysis, PCA), and brovin transform. The spatial domain technology can effectively reserve the spatial information of medical images, but the phenomena of image detail loss, contrast reduction, partial spectrum information loss, spectrum degradation and the like can also occur in fusion.

The introduction of frequency domain techniques significantly improves the above problems, and currently common frequency domain techniques include pyramid transformation, wavelet transformation, multi-scale transformation (MST), and the like. Among them, MST related work has made breakthrough progress in recent years, including 3 steps, multi-scale decomposition (muti-scale decomposition, MSD), high-low frequency coefficient selection under specific methods, and inverse MSD reconstruction. As a representative in multi-scale geometric analysis, the non-downsampled contourlet transform (non-subsampled contourlet transform, NSCT) is based on the conventional contourlet transform (contourlet transform, CT) that introduces the concept of non-downsampling, overcoming the directional aliasing and pseudo-gibbs phenomena in the conventional contourlet transform. However, NSCT, as a frequency domain technique, lacks expression of spatial neighborhood information such as similarity between pixels, depth distance, etc., thereby limiting its ability to preserve edges, noise reduction smoothness.

Meanwhile, with the development of a bilateral filtering theory, a combined bilateral filter (joint bilateral filter, JBF) is being widely applied to the field of medical image fusion as a novel signal processing means. Different from the fusion rule of the traditional linear filter, the JBF is used as a nonlinear filter, euclidean distance between pixels is used as a weight, calculation is carried out according to the comprehensive characteristics of the space weight and the similar weight, the structural characteristics between the pixels are effectively extracted, and the problems of global blurring and non-ideal edge structural characteristics when the traditional average filtering and low-pass filtering are used for basic detail separation are solved. However, the limited number and direction of decomposition still causes the fused image to have a defect in terms of the degree of decomposition of structural information and details, which restricts further application. This work still faces significant challenges in improving multi-feature performance and texture quality of each modality image.

Disclosure of Invention

The dual-channel multi-mode image fusion method provided by the invention not only can solve the problems of global blurring and non-ideal edge structure characteristics, but also can ensure the requirement of the fusion image on the decomposition degree of structural information and details, improves the multi-feature expression and texture quality of each mode image, and meets the use requirement.

The method comprises the following steps: step 1, decomposing a source image into a structure channel and an energy channel through JBF conversion;

step 2, fusing small-edge and small-scale detail information of a structural channel and tissue fibers by adopting a local gradient energy operator, and fusing the energy channel and the edge strength, texture characteristics and gray level change condition of an organ by adopting a local entropy detail enhancement operator, PCNN and NSCT with phase consistency;

and step 3, obtaining a fusion image through inverse JBF conversion.

It should be further noted that, step 1 further includes: global blurring of the input image I, i.e.

R _m ＝G _m *I (11)

Wherein R is _m Representing the smoothed result at standard deviation σ; g _m Representing variance as sigma ² Gaussian filter G at (x, y) _m The definition is as follows:

generating a global blurred image G using a weighted average Gaussian filter, i.e

Wherein I represents an input image; n (j) represents the adjacent pixel set of pixel point i;

Representing the variance of the pixel values; z is Z _j Representing normalization operations, i.e.

Large-scale structure employing JBF to recover energy channels, i.e

Wherein g _s Representing an intensity range function based on the intensity differences between pixels; g _d Representing a spatial distance function based on the pixel distance; z is Z _j Representing normalization operations, i.e.

σ _s ,σ _r Respectively representing the spatial weight and the range weight of the control bilateral filter;

obtaining energy channels E of source images A, B _I (x, y), and obtaining structural channels S by formula (19) _I (x,y)；

S _I (x,y)＝I(x,y)-E _I (x,y) (19)。

It should be further noted that, step 1 further includes: constructing local gradient energy operators, i.e.

LGE(x,y)＝NE ₁ (x,y)·ST(x,y) (20)

Wherein ST (x, y) represents the structure tensor salient image generated by the STs;

NE ₁ (x, y) represents the local energy of the image at (x, y), i.e

The size of the neighborhood at the position (x, y) is (2N+1) × (2N+1), and the value of N is 4;

by comparing the local gradient energy between the source images, a decision matrix S is obtained _map (x, y) is defined as

Updating the decision matrix of the structure channel fusion to S _mapi (x, y), i.e

Wherein Ω ₁ A local area with (x, y) as a center and a size of T×T, and the value of T is 21;

obtaining the fused structural channel S according to the following rules _F (x, y), i.e

Wherein S is _A (x,y),S _B (x, y) are the structural channels of the source images A, B, respectively.

It should be further noted that, step 1 further includes: configuring an energy channel high-frequency subband fusion rule;

The method comprises the following steps: details of the high frequency sub-band of the energy channel are described, and the local entropy of the image centered on (x, y) is defined as:

wherein S represents a window with (2 N+1) x (2 N+1) centered on (x, y);

calculating the gray scale rate at (x, y) based on the spatial frequency, reflecting the detail features thereof, i.e

Wherein h, w respectively represent the length and width of the source image; CF, RF respectively represent the first order difference between the x and y directions at (i, j), the formula is

CF(x,y)＝f(x,y)-f(x-1,y) (27)

RF(x,y)＝f(x,y)-f(x,y-1) (28)

The magnitude of the edge pixel point gradient at (x, y) is calculated based on the edge density, specifically defined as:

wherein s is _x ,s _y Representing the result of the Sobel operator convolution in the x and y directions, respectively, i.e

s _x ＝T*h _x (30)

s _y ＝T*h _y (31)

T represents each pixel point (x, y); h is a _x ，h _y Representing Sobel operators in x, y directions, respectively, i.e

Fusing high-frequency sub-bands of the energy channel through a high-frequency comprehensive measurement operator HM;

wherein the parameter alpha ₁ ,β ₁ ,γ ₁ The weights are respectively used for adjusting the local entropy, the spatial frequency and the edge density of the image in the HM;

by comparing the magnitudes of the high-frequency sub-bands HM of the energy channel, a decision matrix E for fusion of the high-frequency sub-bands of the energy channel is obtained _Hmap (x, y), defined as

At the same time, fused images of the 1 st to 4 th layers of high-frequency sub-bands after fusion are obtained according to the following rules

Wherein,,

respectively representing the source image A, B, the 1 st to 4 th layers of energy channel high frequency sub-bands.

In the method, PCNN is adopted to fuse the 5 th layer high-frequency sub-band, and the fused energy channel high-frequency sub-band is obtained by calculating the PCNN excitation times

Wherein,,

respectively representing a source image A and a source image B, wherein the layer 5 energy channel is a high-frequency subband;

respectively representing the excitation times of the high-frequency sub-band PCNN of the energy channel of the 5 th layer and T _ij (n) formula is

T _ij (n)＝T _ij (n-1)+P _ij (n) (38)

P _ij (n) represents an output model of PCNN.

In the method, to obtain the output model of PCNN, the feed input and the link input of the neuron at (x, y) are defined as

D _ij (n)＝I _ij (39)

Wherein the parameter VL represents the amplitude of the link input;

W _ijop representing the previous state of excitation of eight neighborhood neurons, i.e

Second, the exponential decay coefficient eta is utilized _f Calculating internal activity item U _ij (n) the attenuation of the previous value and the D by the link strength beta _ij (n) and C _ij (n) non-linear modulation to obtain the current internal activity term, defined as

At the same time, the current dynamic threshold is iteratively updated, i.e

Wherein eta _e And V _E Respectively represent the exponential decay coefficient and E _ij Amplitude of (n);

using the current internal activity item D _ij (n) dynamic threshold E at the n-1 th iteration _ij (n-1) comparing the sizes of the PCNN output models P _ij The state of (n), defined as

Obtaining a fusion result of the 5 th layer high frequency sub-band according to formulas (37) and (44);

Obtaining the fused energy channel high-frequency sub-band according to the following rule

I.e.

It should be further noted that the method further includes configuring an energy channel low-frequency subband fusion rule;

the method comprises the following specific steps: the PC value at (x, y) is defined as

Wherein θ _k Representing the direction angle at k;

representing the nth Fourier component and the angle θ _k Amplitude magnitude of (a) is determined; omega represents a method for removing an image signalA parameter of a phase component of (a);

the formula is

Representing the convolution result of the image pixel at (x, y), i.e

I _L (x, y) represents the pixel value of the energy channel low frequency subband located at (x, y);

and->

Representing a parity-symmetric filter bank of two-dimensional Log-Gabor of scale size n.

It should be further noted that, the method reflects the local contrast variation condition of the image by calculating the (x, y) neighborhood sharpness variation, which is specifically defined as:

wherein, M and N are 3; SCM formula is

Ω ₂ Representing a local area of size 3 x 3;

configuring local energy NE ₂ ；

Wherein, M and N are 3;

and fusing the low-frequency sub-bands of the energy channel by a low-frequency comprehensive measurement operator LM:

wherein the parameter alpha ₂ ,β ₂ ,γ ₂ Weights for adjusting the phase coincidence value, local sharpness variation and local energy in LM, respectively;

obtaining the fused energy channel low-frequency sub-band according to the following rule

I.e.

Wherein,,

respectively representing source image energy channel low frequency sub-bands; e (E) _Lmap (x, y) represents the decision matrix of the energy channel low frequency subband fusion, defined as

R _i (x, y) is defined as

N represents the number of source images; omega shape ₃ The representation is centered on (x, y) and has a size of

Is provided with a sliding window which is arranged on the upper surface of the glass substrate,

the value is 7;

high frequency subband using dual coordinate system operator

And low frequency subband->

Performing linear reconstruction to realize NSCT inverse transformation to obtain an energy channel fusion image E _F 。

It should be further noted that the method further includes: generating a structural channel fusion image S _F ((x, y) and energy channel fusion image E _F (x, y) obtaining a final fused image by superposition:

F(x,y)＝S _F (x,y)+E _F (x,y) (60)

setting input as a source image A, B;

setting output as a fusion image F;

the method comprises the following specific steps:

step1, reading in source images A and B, and generating a structural channel { S ] by adopting JBF decomposition _A ,S _B Sum energy channel { E } _A ,E _B }；

Step2, structural channel { S } _A ,S _B Local gradient energy operator fusion using (20) to generate structural channel fusion image S _F ；

Step3, for energy channel { E _A ,E _B Fusion of energy channels to generate an energy channel fused image E _F ；

Step3.1, energy channel { E } _A ,E _B Energy channel high frequency subband generation using NSCT decomposition

And energy channel low frequency subband +.>

Step3.2, fusing the high-frequency sub-bands of the 1 st layer to the 4 th layer by adopting a high-frequency comprehensive measurement operator HM rule based on LE, SF and ED;

Step3.3, fusing the high-frequency sub-bands of the 5 th layer by adopting a PCNN rule of a formula (37);

step3.4, based on PC, LSCM, NE, using (56) for the low frequency sub-band ₂ The low-frequency comprehensive measurement operator LM rule of (2) is fused;

step3.5, for high and low frequency sub-bands after fusion

Generation of energy channel E using NSCT inverse transformation _F ；

Step4, pair of fused structural channels S _F And an energy channel E _F The final fusion image F is generated using the inverse JBF of equation (60).

The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of the dual-channel multi-mode image fusion method when executing the program.

From the above technical scheme, the invention has the following advantages:

the dual-channel multi-mode image fusion method can enable the fused image to strengthen detail information and improve the similarity with the multi-mode medical image on the basis of keeping edges and reducing noise smoothly. The invention also adopts an improved local gradient energy operator for the structural channel, and adopts a low-frequency comprehensive measuring operator consisting of phase, local sharpness variation and local energy for the low-frequency sub-band of the energy channel to calculate, thereby further improving the expression of the detail information of the fused image. . The energy channel generated by the JBF conversion is decomposed again through NSCT and is subjected to fusion treatment, so that the multidirectional and multiscale characteristics of frame decomposition are improved; the enhancement detail operator based on the local entropy is provided, the 1 st to 4 th layer high frequency sub-bands decomposed by the NSCT of the energy channel are processed by calculating the local entropy, the spatial frequency and the edge density of the image, the 5 th layer high frequency sub-bands are processed by adopting pulse coupling neural networks (pulse coupled neural network, PCNN), and the extraction and the utilization of the edge contour structure and the texture features in the energy channel are improved by combining the deep learning with the traditional method.

Drawings

In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the description will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a framework diagram of a two-channel multi-modality image fusion method;

FIG. 2 is a flow chart of a dual channel multi-modality image fusion method;

FIG. 3 is a graph of MR-T1/MR-T2 image at sigma _s Fusion result graphs under different values;

FIG. 4 is σ _s An MR-T1/MR-T2 fusion quality line diagram under different values;

FIG. 5 is a plot of the fusion quality of MR-T1/MR-T2 at different values of S.

Detailed Description

The invention provides a two-channel multi-mode image fusion method, which relates to a two-channel medical image fusion mode combining technologies such as JBF, NSCT, structure tensor theory and the like with local entropy and gradient energy. The invention adopts the JBF to effectively utilize the spatial structure of the image, so that various fused medical images show good hemming property while being smooth, and comprises 3 steps of firstly performing the JBF conversion on the source image A and B to obtain a structural channel { S } _A ,S _B Sum energy channel { E } _A ,E _B -a }; secondly, extracting and fusing structural channel and energy channel information by utilizing a specific fusion rule to obtain { S } _F ,E _F -a }; finally, the fusion image F is obtained through inverse JBF transformation. The process not only reflects the spatial proximity between pixels, but also considers the gray level similarity between pixels, achieves the purposes of edge protection and denoising, and has the characteristics of simplicity, non-iteration and local property. However, JBF is a two-channel fusion technique, and the decomposition of JBF has limitations, and incomplete image decomposition results in that an energy channel contains part of detailed texture information from a structural channel, so that a subsequent fusion rule cannot effectively identify and extract corresponding information, and the image fusion quality is affected. The NSCT has multiple scales and anisotropism, can describe the singular information of the image, and describes the characteristics of different frequency bands and different directions. In consideration of the point, NSCT is implanted into the energy channel, and the structure and detail texture of the energy channel are decomposed and fused again, so that the multi-direction and multi-scale purpose of the model is improved.

The invention relates to NSCT, which is based on CT, adopts an up-sampling filter to sample and replace the down-sampling process in the decomposition process, and consists of a non-down-sampling pyramid (non-subsampled pyramid, NSP) and a non-down-sampling direction filter bank (non-subsampled directional filter bank, NSDFB), which respectively conduct scale decomposition and direction decomposition on images, avoid the phenomena of direction aliasing and pseudo gibbs caused by sampling, ensure the translation invariance in the decomposition process, and promote the extraction capability of image edge information, and comprises 3 steps of firstly utilizing NSCT to carry out scale decomposition and direction decomposition on the source image energy channel { E }, and then adopting the NSCT to carry out the following steps of _A ,E _B Decomposing to obtain high-frequency sub-band of energy channel

And low frequency subband->

Secondly, extracting and fusing the high and low frequency sub-band information of the energy channel through a specific rule to obtain +.>

Finally, energy is obtained by reversing NSCTQuantity channel image E _F 。

For the structure tensor theory of the present invention, it is through the local window Ω ₀ Epsilon to 0 is arbitrarily chosen in the alpha direction ⁺ The amount of change of the image f (x, y) at (x, y) is defined as

In general, the local geometric features of an image f (x, y) at (x, y) are characterized by a local rate of change C (α), defined as

Wherein S represents the structure tensor, i.e

Semi-positive definite matrix representing a second moment

The local gradient vector representing the image f (x, y) is given by

λ ₁ ,λ ₂ Representing structure tensors respectively

Is given by

In summary, the structure tensor significance detection operator (structural tensor significance detection operator, STS) is defined as

Based on the above technology, decomposing the source image into a structural channel and an energy channel by JBF transformation; the local gradient energy operator is adopted to fuse the small-edge and small-scale detail information of the structural channel and tissue fibers, and the local entropy detail enhancement operator, PCNN and NSCT with phase consistency are adopted to fuse the energy channel and the edge intensity, texture characteristics and gray level change condition of the organ; and obtaining a fusion image through inverse JBF conversion. Therefore, the medical image fusion method can strengthen detail information and improve the similarity with the multi-mode medical image on the basis of keeping the edge, reducing noise and smoothing of the fusion image.

The dual-channel multi-mode image fusion method can also acquire and process the associated data based on the artificial intelligence technology. The dual-channel multi-mode image fusion method utilizes a digital computer or a machine controlled by the digital computer to simulate, extend and expand the intelligence of a person, sense the environment, acquire knowledge and acquire the theory, the method, the technology and the application device of the best result by using the knowledge. Of course, the dual-channel multi-mode image fusion method has a hardware-level technology and a software-level technology. Hardware technologies typically include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Software technologies mainly include computer perspective technology, machine learning/deep learning, and programming languages. Programming languages include, but are not limited to, object-oriented programming languages such as Java, smalltalk, C ++, and conventional procedural programming languages such as the "C" language or similar programming languages.

Fig. 1 and 2 show a flow chart of a preferred embodiment of the dual channel multi-modality image fusion method of the present invention. The dual-channel multi-mode image fusion method is applied to one or more fusion imaging terminals, wherein the fusion imaging terminals are devices capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and the hardware comprises, but is not limited to, microprocessors, application specific integrated circuits (Application Specific Integrated Circuit, ASICs), programmable gate arrays (Field-Programmable Gate Array, FPGAs), digital processors (Digital Signal Processor, DSPs), embedded devices and the like.

The convergence imaging terminal can be any electronic product that can perform man-machine interaction with a user, such as a personal computer, a tablet computer, a smart phone, a personal digital assistant (Personal Digital Assistant, PDA), an interactive web television (Internet Protocol Television, IPTV), a smart wearable device, and the like.

The converged imaging terminal may also include a network device and/or a user device. Wherein the network device includes, but is not limited to, a single network server, a server group composed of a plurality of network servers, or a Cloud based Cloud Computing (Cloud Computing) composed of a large number of hosts or network servers.

The network in which the converged imaging terminal is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (Virtual Private Network, VPN), and the like.

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In order to obtain a fused image with rich details and clear textures, the invention comprises 3 steps of decomposition of a combined bilateral filter, fusion of a structural channel and an energy channel and image reconstruction, as shown in figures 1 and 2; decomposing the source image into a structural channel and an energy channel through JBF conversion; secondly, fusing small-edge and small-scale detail information such as tissue fibers of the structural channel by adopting a local gradient energy operator, and fusing organ edge strength, texture characteristics and gray level change conditions of the energy channel by adopting a local entropy detail enhancement operator, PCNN and NSCT with phase consistency; finally, obtaining a fusion image through inverse JBF conversion.

In one exemplary embodiment, to maximize the detail texture of the source image, the input image I is first subjected to a global blurring process, i.e

R _m ＝G _m *I (11)

Wherein R is _m Representing the smoothed result at standard deviation σ; g _m Representing variance as sigma ² Gaussian filter G at (x, y) _m Is defined as

Subsequently, a global blurred image G is generated using a weighted average Gaussian filter, i.e

However, through global blurring processing, the image intensity information is relatively dispersed, if the image intensity information is directly used as an energy channel, the subsequent fusion rule cannot extract the intensity information, and therefore problems such as boundary blurring and artifacts of the fused image are caused. To generate relatively concentrated edge intensity information, JBF is employed to recover the large scale structure of the energy channel, i.e

σ _s ,σ _r Representing the spatial weight and the range weight, respectively, of the control bilateral filter.

To sum up, the energy channel E of the source images A, B is obtained _I (x, y), and obtaining structural channels S by formula (19) _I (x,y)。

S _I (x,y)＝I(x,y)-E _I (x,y) (19)

In the embodiment of the invention, a structural channel fusion rule needs to be configured, in particular, in medical imaging, the quality of detail information expression plays a decisive role in the quality of organ lesion diagnosis, and in order to accurately reflect detail information such as small edge structures, fibers and the like in organs and tissues, the structural channel information is extracted and fused by adopting a local gradient energy operator based on structural tensor and neighborhood energy. To solve the problem of the STS being unable to detect the lack of tiny detail features in the intensity function image, local gradient energy operators (local gradient energy, LGE) are constructed, i.e

LGE(x,y)＝NE ₁ (x,y)·ST(x,y) (20)

Wherein ST (x, y) represents the structure tensor salient image generated by the STs; NE (NE) ₁ (x, y) represents the local energy of the image at (x, y), i.e

The neighborhood size at (x, y) is (2N+1) × (2N+1), and N takes on a value of 4.

To ensure the region integrity in the target image, the decision matrix for the structural channel fusion is updated to S _mapi (x, y), i.e

Wherein Ω ₁ A local area having a size of t×t centered on (x, y) is represented, and T has a value of 21.

To sum up, the fused structural channel S is obtained according to the following rule _F (x, y), i.e

The invention also configures the energy channel fusion rule. Here, the energy channel contains organ contour structure and edge intensity information through JBF decomposition. At the same time, the limitations of its resolution result in energy channels containing small amounts of textural features such as fibers. Therefore, the complexity of the energy channel information is required to be decomposed again through NSCT, and the local entropy detail enhancement operator, PCNN and phase consistency are adopted to extract and fuse outline structures such as texture features, organs and bones respectively, so that the utilization rate of the energy channel information is further improved, and the fusion effect is improved.

The invention also sets the high-frequency subband fusion rule of the configuration energy channel. Specifically, through NSCT decomposition, each decomposition layer of the high-frequency sub-band of the energy channel comprises organ outline structures and fiber texture features with different scales, and the image fusion effect is directly influenced by the quality of the information extraction of the decomposition layers. Meanwhile, the conditions that the number of layers of each decomposition layer increases, the image scale information decreases and the like occur, so that the image information of the highest decomposition layer is difficult to effectively extract by a general image fusion rule.

PCNN is used as a neural network model, has pulse synchronization and global coupling characteristics, can extract effective information from complex backgrounds, is superior to most traditional methods, and has remarkable advantages in the aspects of edge detection, refinement, recognition and the like of image fusion. In consideration of the point, PCNN is embedded into the processing of the high-frequency sub-band, and the extraction capability of the information of the 5 th layer high-frequency sub-band is improved, so that the aim of improving the structure and texture characteristics of the fused image is fulfilled. And meanwhile, the local entropy enhancement detail operator is adopted to fuse the 1 st to 4 th layers of high-frequency sub-bands, so that the organ contour and the fiber texture reduction degree of the fused image are further improved. A large number of experiments prove that the 1 st to 4 th layers of the high-frequency sub-band of the energy channel adopt local entropy detail enhancement operators, and the 5 th layer adopts a PCNN method to have obvious advantages for extracting and fusing the structure and texture information.

In the present invention, first, the fusion rule of the high-frequency sub-bands of layers 1 to 4 is described. The image entropy is used as a statistical method for estimating the information quantity of the image and reflects the detail information contained in the image. In general, the greater the entropy, the more detail information the image contains. However, the entropy of the entire image often does not reflect the local detail information of the image. To solve this problem, local entropy of the image is introduced, and detailed information of the high-frequency sub-band of the energy channel is further described. The Local Entropy (LE) of an image centered on (x, y) is defined as

Wherein S represents a window having a size of (2n+1) × (2n+1) centered on (x, y).

The invention introduces spatial frequency (spatial frequency, SF) to further highlight texture information, reflects detail features thereof by calculating gray scale change rate at (x, y), namely

CF(x,y)＝f(x,y)-f(x-1,y) (27)

RF(x,y)＝f(x,y)-f(x,y-1) (28)

However, LE, SF is an estimate of detailed information describing images, and extraction and expression of large-scale structural information such as contours are lacking. Thus, edge Density (ED) is introduced, and the layering of the structure and contour edges is highlighted by computing the magnitude of the gradient of the edge pixel points at (x, y), defined as

s _x ＝T*h _x (30)

s _y ＝T*h _y (31)

T represents each pixel point (x, y); h is a _x ,h _y Representing Sobel operators in the x, y directions, respectively, i.e

Thereby, the energy channel high frequency sub-bands are fused by the high frequency synthesis measurement operator HM.

Wherein the parameter alpha ₁ ,β ₁ ,γ ₁ Respectively for adjusting the weights of the local entropy, spatial frequency and edge density of the image in the HM.

/>

Wherein,,

Secondly, the invention adopts PCNN to fuse the 5 th layer high frequency sub-band, and obtains the fused energy channel high frequency sub-band by calculating the PCNN excitation times

Wherein,,

T _ij (n)＝T _ij (n-1)+P _ij (n) (38)

P _ij (n) represents an output model of PCNN.

In PCNN, D _ij (n) and C _ij (n) represents the feed input and link input, respectively, of the neuron located at (x, y) after n iterations. D (D) _ij (n) input image I during the whole iteration _ij Is related to the intensity of (2); c (C) _ij The synaptic weight of (n) is related to the previous state of excitation of the eight neighbor neurons. To obtain the output model of PCNN, first, the feed input and link input of neurons at (x, y) are defined as

D _ij (n)＝I _ij (39)

Wherein the parameter V _L Representing the amplitude of the link input; w (W) _ijop Representing the previous state of excitation of eight neighborhood neurons, i.e

At the same time, the current dynamic threshold is iteratively updated, i.e

Wherein eta _e And V _E Respectively represent the exponential decay coefficient and E _ij Amplitude of (n).

Finally, the current internal activity item D is utilized _ij (n) dynamic threshold E at the n-1 th iteration _ij (n-1) comparing the sizes of the PCNN output models P _ij The state of (n), defined as

In summary, the fusion result of the layer 5 high frequency sub-bands is obtained according to the formulas (37) (44). Meanwhile, the fused energy channel high-frequency sub-band is obtained according to the following rule

I.e.

As an embodiment of the present invention, an energy channel low frequency subband fusion rule is also configured.

The low frequency sub-bands include energy channel pixel brightness and gray scale variations. In order to further improve the information quantity of the low-frequency sub-band, the information of the low-frequency sub-band image is enhanced by adopting phase consistency. Phase Consistency (PC) is a dimensionless measure that is commonly used to reflect the sharpness of an image and the importance of image features. The PC value at (x, y) is defined as

Wherein θ _k Representing the direction angle at k;

representing the nth Fourier component and the angle θ _k Amplitude magnitude of (a) is determined; ω represents a parameter for removing a phase component in the image signal; / >

The formula is

Representing the convolution result of the image pixel at (x, y), i.e

and->

However, PC, as a kind of contrast invariance, cannot reflect the local contrast variation situation. Thus, local sharpness change (local sharpness change measure, LSCM) is introduced, and local contrast change of image is reflected by calculating (x, y) neighborhood sharpness change (sharpness change measure, SCM), defined as

Wherein, M and N are 3, and SCM formula is

Ω ₂ Representing a local area of size 3 x 3.

Since PC, LSCM does not fully reflect local signal strength, local energy NE is introduced ₂ 。

Wherein, M and N take the value of 3.

Thereby, the energy channel low frequency sub-bands are fused by the low frequency synthesis measurement operator LM.

Wherein the parameter alpha ₂ ,β ₂ ,γ ₂ Respectively for adjusting the phase coincidence value, the local sharpness change amount and the local energy magnitude in the LM.

To sum up, the fused energy channel low frequency sub-band is obtained according to the following rule

I.e.

Wherein,,

R _i (x, y) is defined as

Is (are) sliding window->

The value is 7.

Finally, using a dual coordinate system operator for high frequency subbands

And low frequency subband->

In the embodiment of the invention, the fusion image is also reconstructed, specifically, the structural channel fusion image S is generated through the steps _F (x, y) and energy channel fusion image E _F (x, y), and then obtaining a final fusion image through superposition:

F(x,y)＝S _F (x,y)+E _F (x,y) (60)

setting input as a source image A, B;

setting output as a fusion image F;

the method comprises the following specific steps:

And energy channel low frequency subband +.>

Step3.4, PC based, LSCM, NE using (56) for low frequency subbands ₂ The low-frequency comprehensive measurement operator LM rule of (2) is fused;

step3.5, for high and low frequency sub-bands after fusion

Generation of energy channel E using NSCT inverse transformation _F ；

Therefore, the dual-channel multi-mode image fusion method can enable the fused image to strengthen detail information and improve the similarity with the multi-mode medical image on the basis of keeping the edge and reducing noise smooth. The invention also adopts an improved local gradient energy operator for the structural channel, and adopts a low-frequency comprehensive measuring operator consisting of phase, local sharpness variation and local energy for the low-frequency sub-band of the energy channel to calculate, thereby further improving the expression of the detail information of the fused image. The energy channel generated by the JBF conversion is decomposed again through NSCT and is subjected to fusion treatment, so that the multidirectional and multiscale characteristics of frame decomposition are improved; the enhancement detail operator based on the local entropy is provided, the 1 st to 4 th layer high frequency sub-bands decomposed by the NSCT of the energy channel are processed by calculating the local entropy, the spatial frequency and the edge density of the image, the 5 th layer high frequency sub-bands are processed by adopting pulse coupling neural networks (pulse coupled neural network, PCNN), and the extraction and the utilization of the edge contour structure and the texture features in the energy channel are improved by combining the deep learning with the traditional method.

Further, as experiments and results analysis of the embodiments of the above examples, in order to verify technical effects of the method of the present invention, the following description will be given with specific implementation effects: setting experimental data and setting a test image. To fully verify the superiority of this method, a comprehensive and extensive experimental analysis was performed. Experiments were performed on human brain image datasets captured from four different imaging mechanisms from the harvard medical college website (1), with the resolution of each test image set to 256 x 256, with 118 pairs of multi-modality medical images being used to fully verify the validity of the method. The experimental results of 4 pairs of magnetic resonance imaging groups (MR-T1/MR-T2), 4 pairs of electron and magnetic resonance imaging groups (CT/MR), 4 pairs of magnetic resonance imaging and single photon emission computed tomography groups (MR/SPECT), 4 pairs of magnetic resonance imaging and positron emission computed tomography groups (MR/PET) were randomly chosen and analyzed from visual and objective indices, respectively.

All experiments were written by Matlab 2018 with an operating environment AMD Ryzen 7 5800with Radeon Graphics3.20GHz,RAM of 16.0GB.

The invention uses six commonly used measurement indexes to comprehensively and quantitatively evaluate the performances of different fusion methods. Firstly, the invention adopts three indexes of peak signal-to-noise ratio (peak signal to noise ratio, PSNR), structural similarity (structural similarity, SSIM) and Mutual Information (MI) to measure the similarity between the fusion image and the source image. The higher the index, the less distortion the fusion process produces, and the more similar the source and fusion images are.

Wherein, PSNR measures its similarity by calculating the mean square error between the source image and the fusion image, SSIM measures the structural similarity between the source image and the fusion image, MI measures its correlation by calculating the information entropy of the fusion image and the joint information entropy of both the fusion image and the source image. Secondly, the invention adopts three indexes of spatial frequency (spatial frequency, SF), standard deviation (standard deviation, SD) and edge information retention (Qabf) to measure the edge information of the source image, the retention of detail textures and the contrast of the fusion image. The higher the index, the more detail and texture information the fused image contains, and the better the quality of visual information obtained from the source image. In addition, in order to further evaluate the performance of image fusion, the invention also introduces information Entropy (EN) and visual information fusion fidelity (visual information fusion fidelity, VIFF) indexes, and further measures the information content of the fusion image and the reduction degree of the fusion image on the source image. The higher the index is, the better the fusion performance is, and the distortion condition of the fusion image is smaller.

The verification mode of the invention is a method for adjusting one parameter by fixing other parameters, a series of fusion results are generated on the multi-mode medical image by using 118, and the fusion results are evaluated from aspects such as similarity index measurement, visual effect and the like so as to determine the optimal value of the parameter. The optimal parameters are analyzed below using MR-T1/MR-T2 image fusion as an example.

1) Standard deviation sigma of Gaussian _s ：

Standard deviation sigma of Gaussian _s As the spatial weight of the bilateral filter, the quality of the spatial information identification of the source image is determined, and the texture structure of the fusion image and the similarity degree with the source image are influenced. Thus, a proper Gaussian standard deviation sigma is set _s Is particularly important.

To determine sigma _s Fixing other parameters to be 1Values between 6 and 3 are shown in the experimental results. As can be seen in the close-up region of fig. 3c, the detail information from the source image is severely attenuated, there are significant artifacts in the brain's return region, and even a severe loss of detail occurs at the saddle pool. At the same time, fig. 3c also shows a case that the gray information of the fusion image is distorted and does not match the source image information, which is not preferable in medical diagnosis. As can be seen from the close-up region in FIG. 3d, the case is improved, the contour structure of the fused image is more vivid than that in FIG. 3c, but the situation that the texture features are obviously missing still at the positions of the cerebral sulcus and the like and the similarity degree with the source image is unbalanced, the gray change cannot accurately reflect the lesion information, and the accuracy of medical diagnosis is seriously affected. As can be seen from the close-up areas of fig. 3f,3g,3h, with σ _s The value is increased, the fusion energy loss is increased, the contrast is reduced, the fiber texture characteristics are obviously weakened, meanwhile, the condition that the similarity degree of the fusion image and the MR-T1 and MR-T2 images is unbalanced is more obvious, and the fusion image and the MR-T2 images are along with sigma _s The MR-T2 image information contained in the ventriculo-forefoot at the fusion image side gradually decreases even when sigma increases in value _s When the value is 6, the fusion image cannot reflect the MR-T2 information. However, when sigma _s When the value is 3, the fusion image is characterized by texture detail, the source image information is restored, the MR-T1 and MR-T2 image information is weighted, and compared with other values, the fusion image has obvious advantages, as can be seen from the figure 3e, the texture of the fusion image brain sulcus return part is clearer, the fiber detail change is obvious, the gray level of the anterior ventricle foot is balanced, the edge is clear, and the artifact and distortion phenomenon are avoided. Furthermore, as can be seen from the objective evaluation index shown in Table 1, when σ _s When the value is 3, the pixel gray level and the structure similarity of the fusion image and the source image information are the highest, and the fusion performance is the best.

The sigma can be more intuitively analyzed by representing the data of Table 1 in the form of a line graph _s The performance under variation is good and bad as shown in fig. 4. As can be seen from fig. 4, with σ _s Increasing, the objective index increases with it, as sigma _s Peak is reached at a value of 3, however, when σ _s When the value exceeds 3, the similarity degree of the fusion image and the source image is gradually reduced, and the adverse effect is along with sigma _s Is enhanced with an increase in the number of (c), and the image fusion performance is reduced with the increase. Thus, sigma _s When the value is 3, the fusion image has the highest similarity with the source image, the most obvious texture detail information and the best fusion performance in subjective analysis or objective index. Thus will sigma _s The optimum value was adjusted to 3. Furthermore, it was found through a large number of experiments that the experimental result is not subjected to the parameter σ in equation (16) _r Influence of the value, thus sigma _r The value is 0.05.

TABLE 1 sigma _s Objective evaluation of MR-T1/MR-T2 fusion results at different values

And (5) pouring. The optimal values are indicated in bold.

2) Window size S:

in the energy channel high-frequency subband detail enhancement operator, the window size S is used as the window size of local image entropy, and the blocking condition of the source image is determined, so that the information of the source image is characterized by calculating the entropy value of each blocking. Therefore, setting an appropriate window size S plays a critical role in the quality of the extraction degree of the source image information.

The window size S was set to a value between 1 and 6 by fixing other parameters, and the experimental results are shown in table 2. As can be seen from table 2, as the window size S increases, the objective index increases until the value of the window size S is 3, the PSNR and MI values reach the optimal state, and the fusion performance is the best at this time, and the reduction degree of the source image information is the highest, however, as the window size S increases, the value of the window size S exceeds 3, and the similarity index between the fusion image and the source image gradually decreases. Meanwhile, as the window size S increases, the SSIM value increases, when the value is 2, the SSIM reaches the optimal state, the optimal state is maintained all the time, when the window size S exceeds 5, the SSIM value starts to decrease, and the fusion performance and the image quality are poorer and worse. In the comprehensive analysis, when the value of S is 3, the pixel gray level and the structural similarity of the fusion image and the source image information are the highest, and the fusion performance is the best.

Similarly, the data of table 2 are presented in the form of a line graph, as shown in fig. 5. As can be seen from fig. 5, when the window size S is 3, compared with other values, each objective index is in a peak state, and at this time, the fused image reaches an optimal state in terms of the reduction degree, the similarity degree, and the detail texture expression of the source image, which is favorable for capturing and analyzing lesion information of a patient by a medical worker, and improves the reliability and the authenticity of medical diagnosis, so that the S optimal value is adjusted to 3.

TABLE 2 objective evaluation of MR-T1/MR-T2 fusion results at different values

And (5) pouring. The optimal values are indicated in bold.

The units and algorithm steps of each example described in the embodiments disclosed in the dual-channel multi-mode image fusion method provided by the invention can be implemented by electronic hardware, computer software or a combination of the two, and in order to clearly illustrate the interchangeability of hardware and software, the components and steps of each example have been generally described in terms of functions in the above description. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The two-channel multi-modality image fusion method provided by the invention is the unit and algorithm steps of each example described in connection with the embodiments disclosed herein, and can be implemented in electronic hardware, computer software, or a combination of both, and to clearly illustrate the interchangeability of hardware and software, the components and steps of each example have been generally described in terms of functionality in the foregoing description. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. The double-channel multi-mode image fusion method is characterized by comprising the following steps of:

step 1, decomposing a source image into a structure channel and an energy channel through JBF conversion;

and step 3, obtaining a fusion image through inverse JBF conversion.

2. The method of claim 1, wherein the two-channel multi-modality image fusion process,

step 1 further comprises: global blurring of the input image I, i.e.

R _m ＝G _m *I (11)

Large-scale structure employing JBF to recover energy channels, i.e

S _I (x,y)＝I(x,y)-E _I (x,y) (19)。

3. The method of claim 2, wherein the two-channel multi-modality image fusion method,

step 1 further comprises: constructing local gradient energy operators, i.e.

LGE(x,y)＝NE ₁ (x,y)·ST(x,y) (20)

NE ₁ (x, y) represents the local energy of the image at (x, y), i.e

Updating the decision matrix of the structure channel fusion to

I.e.

4. The method for dual-channel multi-mode image fusion according to claim 1 or 2, wherein,

step 1 further comprises: configuring an energy channel high-frequency subband fusion rule;

wherein S represents a window with (2 N+1) x (2 N+1) centered on (x, y);

CF(x,y)＝f(x,y)-f(x-1,y) (27)

RF(x,y)＝f(x,y)-f(x,y-1) (28)

s _x ＝T*h _x (30)

s _y ＝T*h _y (31)

Wherein,,

5. The method of claim 4, wherein the two-channel multi-modality image fusion process,

in the method, PCNN is adopted for fusing the 5 th layer high-frequency sub-band, and the fused energy channel high-frequency sub-band is obtained by calculating the PCNN excitation times

Wherein,,

T _ij (n)＝T _ij (n-1)+P _ij (n) (38)

P _ij (n) represents an output model of PCNN.

6. The method of claim 4, wherein the two-channel multi-modality image fusion process,

in the method, to obtain an output model of PCNN, the feed input and the link input of the neuron at (x, y) are defined as

D _ij (n)＝I _ij (39)

Wherein the parameter V _L Representing the amplitude of the link input;

At the same time, the current dynamic threshold is iteratively updated, i.e

I.e.

7. The method for dual-channel multi-mode image fusion according to claim 1 or 2, wherein,

the method further comprises the steps of configuring an energy channel low-frequency subband fusion rule;

Wherein θ _k Representing the direction angle at k;

representing the nth Fourier component and the angle θ _k Amplitude magnitude of (a) is determined; ω represents a parameter for removing a phase component in the image signal;

the formula is

Representing the convolution result of the image pixel at (x, y), i.e

and->

8. The method of claim 7, wherein the two-channel multi-modality image fusion process,

the method reflects the local contrast change condition of the image by calculating the (x, y) neighborhood sharpness change quantity, and is specifically defined as:

wherein, M and N are 3; SCM formula is

Ω ₂ Representing a local area of size 3 x 3;

Configuring local energy NE ₂ ；

Wherein, M and N are 3;

I.e.

Wherein,,

R _i (x, y) is defined as

Is (are) sliding window->

The value is 7;

high frequency subband using dual coordinate system operator

And low frequency subband->

9. The method for dual-channel multi-mode image fusion according to claim 1 or 2, wherein,

the method further comprises the steps of: generating a structural channel fusion image S _F (x, y) and energy channel fusion image E _F (x, y) obtaining a final fused image by superposition:

F(x,y)＝S _F (x,y)+E _F (x,y) (60)

setting input as a source image A, B;

setting output as a fusion image F;

the method comprises the following specific steps:

And energy channel low frequency subband +.>

step3.5, for high and low frequency sub-bands after fusion

Generation of energy channel E using NSCT inverse transformation _F ；

10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the two-channel multi-modality image fusion method according to any of claims 1 to 9 when the program is executed by the processor.