CN114119689B - Multi-modal medical image unsupervised registration method and system based on deep learning - Google Patents
- Publication number
- CN114119689B (application CN202111461476.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration; G06T7/33—using feature-based methods (G—Physics; G06—Computing; G06T—Image data processing or generation, in general; G06T7/00—Image analysis)
- G06N3/088—Non-supervised learning, e.g. competitive learning (G06N—Computing arrangements based on specific computational models; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks; G06N3/08—Learning methods)
- G06T2207/10088—Magnetic resonance imaging [MRI] (G06T2207/00—Indexing scheme for image analysis or image enhancement; G06T2207/10—Image acquisition modality; G06T2207/10072—Tomographic images)
- G06T2207/20081—Training; Learning; G06T2207/20084—Artificial neural networks [ANN] (G06T2207/20—Special algorithmic details)
Abstract
A multi-modal medical image unsupervised registration method and system based on deep learning, relating to medical image registration. Simulated training samples are generated; a floating image, a reference image, and the sequence information of the floating image are input into registration sub-network N1, and the floating image is resampled with the resulting deformation field to obtain a preliminary registration image. The preliminary registration image, the reference image, and the original sequence information are then input into registration sub-network N2, and the preliminary registration image is resampled with the resulting final deformation field to obtain the final registration image; the network is optimized by computing a loss function between the registration image and the reference image, achieving accurate registration of non-rigid medical images. The system comprises a data simulation module, a network training sample preprocessing module, a neural network model training module, a to-be-registered sample and sequence information preprocessing module, and an unsupervised registration module. The method removes the dependence of deep learning on real data samples: registration can be completed accurately as long as the sequence information maps of the floating images to be registered are input alongside them.
Description
Technical Field
The invention relates to medical image registration, and in particular to a multi-modal medical image unsupervised registration method and system based on deep learning.
Background
Medical image registration refers to finding one or a series of spatial transformations of a floating medical image so that it comes into spatial agreement with corresponding points on a reference medical image; through these transformations, the spatial positions of diagnostically meaningful anatomical points on the floating image and the reference image can be matched. Medical image registration currently has several branches: by image modality it is divided into single-modal and multi-modal registration, and by type of spatial transformation into rigid and non-rigid registration. Among these, non-rigid medical image registration has attracted very extensive attention, playing a vital role in fields such as surgical guidance, radiation therapy, and intelligent diagnosis.
Many conventional algorithms have emerged in the field of medical image registration to address non-rigid registration. Thirion et al. (J.-P. Thirion, "Image matching as a diffusion process: an analogy with Maxwell's demons," Medical Image Analysis, vol. 2, no. 3, pp. 243–260, 1998.) proposed the demons method, which registers by estimating the velocity vector field between two adjacent images: the optical flow is computed, the flow field is smoothed with Gaussian filters, and the prediction for each image pair is optimized through multiple iterations. Beg et al. (M.F. Beg, M.I. Miller, A. Trouvé, and L. Younes, "Computing large deformation metric mappings via geodesic flows of diffeomorphisms," International Journal of Computer Vision, vol. 61, no. 2, pp. 139–157, 2005.) proposed the LDDMM (large deformation diffeomorphic metric mapping) method, which computes the particle flow by deriving and implementing an Euler-Lagrange optimization algorithm, solving a global variational problem and estimating the metric of the image. SyN (B.B. Avants, C.L. Epstein, M. Grossman, and J.C. Gee, "Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain," Medical Image Analysis, vol. 12, no. 1, pp. 26–41, 2008.), the most widely applied conventional algorithm in medical image registration, proposes a symmetric image normalization method based on Euler-Lagrange optimization. In summary, these traditional non-rigid medical image registration methods all require extensive iterative optimization; they are computationally intensive and, without exception, slow.
To make non-rigid medical image registration more efficient, deep learning models have also been introduced into registration methods. Li et al. (Li H, Fan Y. Non-rigid image registration using self-supervised fully convolutional networks without training data [C]. 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE, 2018: 1075-1078.) proposed a deep-learning-based method that predicts deformation parameters through a fully convolutional network, but it is a 2D registration network and does not address full 3D registration. Fan et al. (Fan J, Cao X, Yap P T, et al. BIRNet: Brain image registration using dual-supervised fully convolutional networks [J]. Medical Image Analysis, 2019, 54: 193-206.) proposed a supervised deep learning method for image registration; however, labeled registration samples are difficult to obtain, and the registration accuracy of the deep network is limited by the training samples. The VoxelMorph method proposed by Balakrishnan et al. (G. Balakrishnan, A. Zhao, M.R. Sabuncu, J. Guttag, and A.V. Dalca, "VoxelMorph: a learning framework for deformable medical image registration," IEEE Transactions on Medical Imaging, vol. 38, no. 8, pp. 1788–1800, 2019.) uses an unsupervised training approach to register non-rigid medical images, but is primarily directed at single-modal medical image registration. Moreover, all the above registration methods correct only the texture structure and cannot correct signal distortion. Magnetic resonance signals acquired in an inhomogeneous field exhibit not only texture distortion but also signal-intensity distortion, which significantly affects subsequent image analysis and diagnosis.
Disclosure of Invention
The invention aims to solve the problem in the prior art that highly distorted MR images are difficult to acquire for deep-learning registration methods, and provides a multi-modal medical image unsupervised registration method and system based on deep learning that achieve simultaneous correction of texture structure and signal intensity. By introducing sequence information maps into the training samples, it also solves the problem that conventional deep-learning registration methods can only register images whose magnetic resonance sequence information matches that of the training samples.
An unsupervised registration method of multi-modal medical images based on deep learning comprises the following steps:
S1: acquiring multi-modal MR image samples of different magnetic resonance sequences by a simulation method: floating images and reference images;
S2: preprocessing the floating image, the reference image, and the sequence information corresponding to the floating image;
S3: inputting the floating image, its corresponding sequence information, and the reference image into the cascade network; the cascade sub-registration network N1 outputs a preliminary deformation field, and the preliminary deformation field and the floating image are input into a spatial transformation network to obtain a preliminary registration image;
S4: inputting the preliminary registration image, the sequence information corresponding to the original floating image, and the reference image into cascade sub-registration network N2 to obtain a final deformation field; the final deformation field and the floating image are input into the spatial transformation network to obtain a final registration image;
S5: calculating a loss function L_N1 from the preliminary registration image and the reference image, calculating a loss function L_N2 from the final registration image and the reference image, and summing the two to obtain a total loss function; repeating steps S3 to S5 and minimizing the total loss function to train the cascade network until the best registration effect is reached;
S6: acquiring the real multi-modal data to be registered: floating images and reference images;
S7: preprocessing the floating image to be registered, the reference image, and the sequence information corresponding to the floating image;
S8: inputting the floating image to be registered, the reference image, and the corresponding sequence information into the cascade network trained in step S5, and executing steps S3 to S5 again to complete registration and obtain the final registration image.
In step S1, acquiring multi-modal MR image samples of different magnetic resonance sequences by a simulation method means simulating the training samples required by the proposed deep neural cascade network: sample templates are first obtained from a public data set, and distorted floating images and reference images of different sequences are simulated with the Bloch equations; the sequence information corresponding to each floating image is saved: echo-train length, resolution, and echo time (TE).
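The simulation in step S1 relies on the Bloch equations. As an illustrative sketch (the patent does not specify its simulation code, and the tissue values below are assumed), the steady-state spin-echo signal derived from the Bloch equations shows how different sequence parameters (TR, TE) produce different contrasts, i.e. different modalities, from the same tissue:

```python
import numpy as np

# Steady-state spin-echo signal derived from the Bloch equations:
#   S = PD * (1 - exp(-TR/T1)) * exp(-TE/T2)
def spin_echo_signal(pd, t1, t2, tr, te):
    return pd * (1.0 - np.exp(-tr / t1)) * np.exp(-te / t2)

# The same tissue imaged with two sequences yields two modalities
# (PD/T1/T2 are assumed gray-matter-like values, times in seconds):
gray_matter = dict(pd=0.8, t1=1.0, t2=0.1)
t1_weighted = spin_echo_signal(**gray_matter, tr=0.5, te=0.015)
t2_weighted = spin_echo_signal(**gray_matter, tr=4.0, te=0.1)
```

Varying TR and TE per sample is what lets a single template generate multi-modal training pairs.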
In step S2, the preprocessing may specifically be: performing size standardization on the simulated floating image and reference image; normalizing the simulated floating image and reference image; and setting zero arrays of the standardized size and filling them with the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image.
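A minimal NumPy sketch of the three preprocessing operations (the target size, zero-padding strategy, and min-max normalization are assumptions; the patent only names the operations):

```python
import numpy as np

TARGET = (256, 256)  # standardized size (assumed)

def standardize_size(img, size=TARGET):
    # Pad/crop to the standard size (zero-padding bottom/right, assumed)
    out = np.zeros(size, dtype=np.float32)
    h, w = min(img.shape[0], size[0]), min(img.shape[1], size[1])
    out[:h, :w] = img[:h, :w]
    return out

def normalize(img):
    # Min-max normalization to [0, 1]
    rng = img.max() - img.min()
    return (img - img.min()) / rng if rng > 0 else np.zeros_like(img)

def sequence_info_maps(echo_train_len, resolution, te, size=TARGET):
    # Fill zero arrays with the three scalars: three maps matched to the image
    return [np.full(size, v, dtype=np.float32)
            for v in (echo_train_len, resolution, te)]
```

The constant-valued maps let the convolutional network receive the sequence parameters at every spatial location alongside the image channels.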
In step S3, the cascade sub-registration network N1 adopts a U-shaped network as its backbone; the U-shaped structure consists of an encoder and a decoder, with skip connections between them.
In step S4, the cascade sub-registration network N2 likewise adopts a U-shaped network as its backbone, consisting of an encoder and a decoder with skip connections between them.
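The repeated halving of resolution in the encoder, doubling in the decoder, and channel growth from skip-connection concatenation can be sketched with simple shape bookkeeping (the number of levels and channel widths are assumptions; the patent only states that N1 and N2 are U-shaped encoder-decoder networks with skip connections):

```python
def unet_shapes(h, w, enc_channels=(16, 32, 32, 32)):
    """Track (channels, height, width) through a U-shaped network."""
    enc = []
    for c in enc_channels:            # encoder: stride-2 convs halve resolution
        h, w = h // 2, w // 2
        enc.append((c, h, w))
    dec = []
    c_prev = enc[-1][0]
    for c_skip, sh, sw in reversed(enc[:-1]):
        # decoder: upsample to the matching encoder resolution, then
        # concatenate the skip connection, growing the channel count
        dec.append((c_prev + c_skip, sh, sw))
        c_prev = c_skip
    return enc, dec
```

The skip connections let the decoder recover fine spatial detail that the encoder's downsampling would otherwise discard, which matters for predicting a dense deformation field.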
In steps S3 and S4, the spatial transformation network consists of three parts: a localization network, a sampling grid generator, and a sampler; with it, the floating image can be resampled.
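The grid-generation and sampling steps can be sketched in NumPy as dense bilinear resampling of the floating image at the locations displaced by the deformation field (a 2D sketch; in the cascade the "localization" role is played by the sub-network that predicts the field):

```python
import numpy as np

def warp(moving, flow):
    """Resample `moving` at grid + flow with bilinear interpolation.
    moving: (H, W) image; flow: (2, H, W) displacement field (dy, dx)."""
    H, W = moving.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")  # sampling grid
    py = np.clip(ys + flow[0], 0, H - 1)   # warped sample locations
    px = np.clip(xs + flow[1], 0, W - 1)
    y0 = np.floor(py).astype(int); x0 = np.floor(px).astype(int)
    y1 = np.minimum(y0 + 1, H - 1); x1 = np.minimum(x0 + 1, W - 1)
    wy, wx = py - y0, px - x0              # interpolation weights
    return ((1 - wy) * (1 - wx) * moving[y0, x0]
            + (1 - wy) * wx * moving[y0, x1]
            + wy * (1 - wx) * moving[y1, x0]
            + wy * wx * moving[y1, x1])
```

Because the interpolation is differentiable in the field values, gradients of the loss can flow back through the sampler into the registration sub-networks, which is what makes unsupervised training possible.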
In step S5, the total loss function is:
L_total = L_N1 + L_N2 = L_sim(f, m∘φ1) + λ1·L_smooth(φ1) + L_sim(f, m′∘φ2) + λ2·L_smooth(φ2)
where f is the reference image, m the floating image, m′ the preliminary registration image, φ1 the preliminary deformation field, and φ2 the final deformation field; L_sim(f, m∘φ1) is the similarity loss between the reference image and the preliminary registration image, L_smooth(φ1) the regularization loss constraining the smoothness of the preliminary deformation field φ1, and λ1 its regularization coefficient; L_sim(f, m′∘φ2) is the similarity loss between the reference image and the final registration image, L_smooth(φ2) the regularization loss constraining the smoothness of the final deformation field φ2, and λ2 its regularization coefficient.
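A sketch of this total loss under common choices (mean squared error for L_sim and a first-order gradient penalty for L_smooth; the patent does not fix either metric, so both are assumptions):

```python
import numpy as np

def sim_loss(f, warped):
    # Similarity loss; MSE chosen for illustration (NCC/MI are common alternatives)
    return np.mean((f - warped) ** 2)

def smooth_loss(phi):
    # Regularization: penalize spatial gradients of the (2, H, W) deformation field
    dy = np.diff(phi, axis=1) ** 2
    dx = np.diff(phi, axis=2) ** 2
    return dy.mean() + dx.mean()

def total_loss(f, warped1, warped2, phi1, phi2, lam1=0.01, lam2=0.01):
    # L_total = L_N1 + L_N2
    return (sim_loss(f, warped1) + lam1 * smooth_loss(phi1)
            + sim_loss(f, warped2) + lam2 * smooth_loss(phi2))
```

The smoothness terms keep φ1 and φ2 physically plausible; without them the similarity term alone can be minimized by highly irregular, folding deformation fields.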
In step S7, the preprocessing enables the deep neural network trained on simulated data samples to perform unsupervised non-rigid registration on real acquisitions of different sequences: the floating image to be registered, the reference image, and the sequence information map corresponding to the floating image are preprocessed as in step S2, and the processed data samples are input into the trained deep neural network to obtain the final registration image.
The deep-learning-based multi-modal MR image unsupervised registration system comprises, in order, the following modules: a data simulation module, a network training sample preprocessing module, a neural network model training module, a to-be-registered sample and sequence information preprocessing module, and an unsupervised registration module;
the data simulation module is used for acquiring multi-mode MR image samples of different magnetic resonance sequences;
The network training sample preprocessing module is used for preprocessing the floating image, the reference image, and the sequence information corresponding to the floating image;
The neural network model training module is used for training the registration neural network;
The to-be-registered sample and sequence information preprocessing module is used for preprocessing the floating image to be registered, the reference image, and the sequence information corresponding to the floating image to be registered;
The unsupervised registration module is used for unsupervised registration of the images to be registered.
Further, the network training sample preprocessing module comprises a size standardization unit, a normalization unit, and a sequence information quantization unit. The size standardization unit standardizes the sizes of the floating image and the reference image; the normalization unit normalizes the floating image and the reference image; the sequence information quantization unit sets zero arrays of the standardized size and fills them with the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image: an echo-train-length map, a resolution map, and an echo-time (TE) map.
The to-be-registered sample and sequence information preprocessing module is consistent with the network training sample preprocessing module and is used to preprocess the samples to be registered and their sequence information; it comprises a size standardization unit, a normalization unit, and a sequence information quantization unit.
The size standardization unit standardizes the sizes of the floating and reference images to be registered; the normalization unit normalizes the floating and reference images to be registered; the sequence information quantization unit sets zero arrays of the standardized size and fills them with the sequence information corresponding to the floating image to be registered, yielding three sequence information maps matched to the floating image to be registered.
Meanwhile, the invention introduces the sequence information of the floating sample (echo-train length, resolution, and echo time TE) into the training samples, so that during registration the corresponding sequence information of the floating sample can be input and registration can be completed quickly and in a targeted manner. For 1000 training samples of size 256×256, training the network takes about 2 h and registration about 0.03 s per image, a marked improvement in registration efficiency.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention uses a simulation method to synthesize a large number of distorted data samples from templates derived from public data sets, removing the dependence of the deep learning method on real data samples.
2. The invention creatively introduces sequence information into the training samples: when registering real acquisition data, registration can be completed accurately as long as the sequence information maps of the floating images to be registered are input alongside them; the use of a cascade network is also one of the keys to successful unsupervised registration of MR images.
3. The method trains on simulated data to register real acquisitions, solving the scarcity of highly distorted real acquisitions; at the same time it can register samples of different magnetic resonance sequences, improving the usability of the registration network.
Drawings
FIG. 1 is a flow chart of unsupervised registration of a multi-modality MR image based on deep learning in an embodiment of the present invention;
FIG. 2 is a deep neural network structure diagram of unsupervised registration of a multi-modality MR image based on deep learning in an embodiment of the present invention;
FIG. 3 is a diagram of a sample to be registered and a registration result according to an embodiment of the present invention;
Fig. 4 is a system architecture diagram of unsupervised registration of multi-modality MR images based on deep learning in an embodiment of the present invention.
Detailed Description
The following embodiments will clearly and fully describe the technical solutions of the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The present invention targets fast and accurate unsupervised registration of multi-modal MR medical images with non-rigid deformations. To overcome the lack of highly distorted real acquisitions, the invention uses Bloch simulation to generate a large number of floating and reference images, trains the deep neural network with these data samples and the sequence information corresponding to the floating images, and then inputs the real samples to be registered together with their sequence information into the trained network to complete registration. First, simulated samples are produced; a floating image, a reference image, and the sequence information of the floating image are input into registration sub-network N1, and the floating image is resampled with the resulting deformation field to obtain a preliminary registration image. The preliminary registration image, the reference image, and the original sequence information are then input into registration sub-network N2, and the preliminary registration image is resampled with the resulting final deformation field to obtain the final registration image. The deep neural network is optimized by computing a loss function between the registration image and the reference image, achieving accurate registration of non-rigid medical images.
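The two-stage cascade described above can be sketched as a forward pass; `register_n1`, `register_n2`, and `warp` are placeholders standing in for the trained sub-networks and the spatial transformation network (following the overview, the final field resamples the preliminary registration image):

```python
def cascade_register(m, f, info, register_n1, register_n2, warp):
    """m: floating image, f: reference image, info: sequence information maps."""
    phi1 = register_n1(m, f, info)          # preliminary deformation field from N1
    m_prime = warp(m, phi1)                 # preliminary registration image
    phi2 = register_n2(m_prime, f, info)    # final deformation field from N2
    return warp(m_prime, phi2)              # final registration image
```

Splitting the deformation across two sub-networks lets N1 absorb the coarse misalignment so that N2 only has to refine a residual, smaller field.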
As shown in fig. 1, the present invention provides an unsupervised registration method for multi-modal MR images based on deep learning, comprising:
S1: acquiring multi-modal MR image samples of different magnetic resonance sequences by a simulation method: floating images and reference images. Specifically:
First, sample templates are obtained from a public data set, and distorted floating images and reference images of different sequences are simulated with the Bloch equations; the sequence information corresponding to the floating image is saved: echo-train length, resolution, and echo time TE.
S2: preprocessing the floating image, the reference image and sequence information corresponding to the corresponding floating image; the method specifically comprises the following steps:
Size standardization is performed on the simulated floating image and reference image; the simulated floating image and reference image are normalized; and zero arrays of the standardized size are filled with the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image.
S3: as shown in fig. 2, the floating image m, the sequence information i corresponding to the floating image, and the reference image f are input into the network; the cascade sub-registration network N1 outputs a preliminary deformation field φ1, and the preliminary deformation field and the floating image m are input into the spatial transformation network to obtain the preliminary registration image m′. The spatial transformation network obtains the pixel values of the preliminary registration image by linear interpolation:
m∘φ1(j) = Σ_{q∈Z(φ1(j))} M(q) · Π_{d∈{x,y}} (1 − |φ1(j)_d − q_d|)
where M(q) is the pixel value at grid point q and Z(φ1(j)) is the set of grid neighbors of the warped location φ1(j).
S4: as shown in fig. 2, the preliminary registration image m′, the sequence information i corresponding to the original floating image, and the reference image f are input into cascade sub-registration network N2 to obtain the final deformation field φ2; the final deformation field φ2 and the floating image m are input into the spatial transformation network to obtain the final registration image. The spatial transformation network obtains the pixel values of the final registration image by linear interpolation:
m∘φ2(j) = Σ_{q∈Z(φ2(j))} M(q) · Π_{d∈{x,y}} (1 − |φ2(j)_d − q_d|)
where M(q) is the pixel value at grid point q and Z(φ2(j)) is the set of grid neighbors of φ2(j).
S5: calculating loss function L_N1 from the preliminary registration image and the reference image, calculating loss function L_N2 from the final registration image and the reference image, and summing the two to obtain the total loss function; steps S3 to S5 are repeated, minimizing the total loss function to train the cascade network until the best registration effect is reached. The total loss function is:
L_total = L_N1 + L_N2 = L_sim(f, m∘φ1) + λ1·L_smooth(φ1) + L_sim(f, m′∘φ2) + λ2·L_smooth(φ2)
where f is the reference image, m the floating image, m′ the preliminary registration image, φ1 the preliminary deformation field, and φ2 the final deformation field; L_sim(f, m∘φ1) is the similarity loss between the reference image and the preliminary registration image, L_smooth(φ1) the regularization loss constraining the smoothness of φ1, and λ1 its regularization coefficient; L_sim(f, m′∘φ2) is the similarity loss between the reference image and the final registration image, L_smooth(φ2) the regularization loss constraining the smoothness of φ2, and λ2 its regularization coefficient;
S6: acquiring the real multi-modal data to be registered: floating images and reference images;
S7: preprocessing the floating image to be registered, the reference image, and the sequence information corresponding to the floating image;
S8: inputting the floating image to be registered, the reference image, and the corresponding sequence information into the cascade network trained in step S5, and executing steps S3 to S5 again to complete registration and obtain the final registration image. As shown in fig. 3, from left to right:
the floating image to be registered (SE-EPI-DWI), the reference image (SE-EPI-T2), and the final registration image.
Fig. 4 is a system architecture diagram of deep-learning-based multi-modal MR image unsupervised registration in an embodiment of the present invention. As shown in fig. 4, the invention provides a deep-learning-based multi-modal MR image unsupervised registration system, which comprises in order:
the data simulation module 101: a multi-modality MR image sample for acquiring different magnetic resonance sequences;
network training sample preprocessing module 102: the method comprises the steps of preprocessing a floating image, a reference image and sequence information corresponding to the corresponding floating image;
Neural network model training module 103: for training the registration neural network;
Sample to be registered and sequence information preprocessing module 104: the method comprises the steps of preprocessing a floating image to be registered, a reference image and sequence information corresponding to the corresponding floating image to be registered;
Unsupervised registration module 105: the method is used for performing unsupervised registration on the images to be registered;
The data simulation module 101 specifically includes:
obtaining sample templates from a public data set and simulating distorted floating images and reference images of different sequences with the Bloch equations; saving the sequence information corresponding to the floating image: echo-train length, resolution, and echo time TE;
the network training sample preprocessing module 102 specifically includes:
a size normalization unit for normalizing the sizes of the floating image and the reference image;
normalization unit: the method comprises the steps of performing normalization operation on a floating image and a reference image;
A sequence information quantization unit: sets zero arrays of the standardized size and fills them with the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image;
The neural network model training module 103 specifically includes:
training the neural network with the floating image, the reference image, and the sequence information corresponding to the floating image, calculating the total loss function, and optimizing the network by minimizing the total loss function;
the sample to be registered and the sequence information preprocessing module 104 specifically include:
A size standardization unit for standardizing the sizes of the floating image and the reference image to be registered;
normalization unit: the method comprises the steps of performing normalization operation on a floating image and a reference image to be registered;
a sequence information quantization unit: sets zero arrays of the standardized size and fills them with the sequence information corresponding to the floating image to be registered, yielding three sequence information maps matched to the floating image to be registered;
the unsupervised registration module 105 specifically includes:
the preprocessed floating image to be registered, the reference image and the sequence information corresponding to the floating image to be registered are input into the trained deep neural network to obtain the final registered image.
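The registered image itself is produced by resampling the floating image with the predicted deformation field, the role played by the sampler of the spatial transformation network. A minimal NumPy sketch of bilinear resampling, with illustrative names (not the patent's implementation):

```python
import numpy as np

def bilinear_warp(image, phi):
    """Resample `image` (H x W) at positions displaced by the deformation
    field `phi` (2 x H x W, displacements in pixels): the sampler part of
    a spatial transformation network."""
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    y = np.clip(ys + phi[0], 0, h - 1)
    x = np.clip(xs + phi[1], 0, w - 1)
    y0, x0 = np.floor(y).astype(int), np.floor(x).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * image[y0, x0]
            + (1 - wy) * wx * image[y0, x1]
            + wy * (1 - wx) * image[y1, x0]
            + wy * wx * image[y1, x1])

img = np.arange(16.0).reshape(4, 4)  # toy floating image
identity = np.zeros((2, 4, 4))       # zero field: no deformation
warped = bilinear_warp(img, identity)
```

Because the interpolation is differentiable in `phi`, the same operation can sit inside the network during training, which is what makes the registration loss trainable end to end.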
According to the deep-learning-based multi-modal MR image unsupervised registration method and system provided by the invention, a large number of simulated data samples are used to train the neural network, which yields good test results on actually acquired data; this solves the problem that highly distorted MR images are difficult to obtain for deep-learning registration methods. Meanwhile, the invention introduces the sequence information of the floating sample into the training samples: the echo train length, the resolution and the echo time TE, so that the sequence information corresponding to the floating sample can be input during registration and the registration can be completed quickly and in a targeted manner. For 1000 training samples of size 256×256, the network training time of the method is about 2 h and the registration speed is about 0.03 s per image, so the registration efficiency is significantly improved.
The principles and embodiments of the present invention are described herein with reference to specific examples; the description of these examples is provided only to assist in understanding the method of the present invention and its core ideas. Meanwhile, modifications made by those of ordinary skill in the art in light of the present teachings fall within the scope of the present invention. In summary, the contents of this description should not be construed as limiting the invention.
Claims (7)
1. A multi-modal medical image unsupervised registration method based on deep learning, characterized by comprising the following steps:
S1: acquiring multi-modal MR image samples of different magnetic resonance sequences by a simulation method: floating images and reference images;
wherein the multi-modal MR image samples of different magnetic resonance sequences are obtained by simulating the training samples required by the proposed deep neural cascade network: a sample template is first obtained from a public data set, and distorted floating images and reference images of different sequences are simulated using the Bloch equations; the sequence information corresponding to the floating image is saved: the echo train length, the resolution and the echo time TE;
S2: preprocessing the floating image, the reference image and the sequence information corresponding to the floating image;
S3: inputting the floating image, the sequence information corresponding to the floating image and the reference image into a cascade network; the cascade-network sub-registration network N1 obtains a preliminary deformation field, and the preliminary deformation field and the floating image are input into a spatial transformation network to obtain a preliminary registration image;
S4: inputting the preliminary registration image, the sequence information corresponding to the original floating image and the reference image into the cascade-network sub-registration network N2 to obtain a final deformation field, and inputting the final deformation field and the floating image into the spatial transformation network to obtain a final registration image;
S5: calculating a loss function L_N1 by using the preliminary registration image and the reference image, calculating a loss function L_N2 by using the final registration image and the reference image, and summing the two loss functions to obtain a total loss function; steps S3-S5 are executed repeatedly, minimizing the total loss function to train the cascade network until the best registration effect is achieved;
The total loss function is:
L_total = L_N1 + L_N2 = L_sim(f, m∘φ_1) + λ_1·L_smooth(φ_1) + L_sim(f, m∘φ_2) + λ_2·L_smooth(φ_2)
wherein f is the reference image, m is the floating image, m' is the preliminary registration image, φ_1 is the preliminary deformation field, and φ_2 is the final deformation field; L_sim(f, m∘φ_1) denotes the similarity loss between the reference image and the preliminary registration image, L_smooth(φ_1) denotes the regularization loss constraining the smoothness of the preliminary deformation field φ_1, and λ_1 is its regularization coefficient; L_sim(f, m∘φ_2) denotes the similarity loss between the reference image and the final registration image, L_smooth(φ_2) denotes the regularization loss constraining the smoothness of the final deformation field φ_2, and λ_2 is its regularization coefficient;
S6: acquiring actually acquired multi-modal data to be registered: floating images and reference images;
S7: preprocessing the floating image to be registered, the reference image and the sequence information corresponding to the floating image;
S8: inputting the floating image to be registered, the reference image and the sequence information corresponding to the floating image into the cascade network trained in step S5, and executing steps S3-S5 again to complete the registration, thereby obtaining the final registered image;
A multi-modal MR image unsupervised registration system based on deep learning is provided with the following modules in sequence: a data simulation module, a network training sample preprocessing module, a neural network model training module, a to-be-registered sample and sequence information preprocessing module, and an unsupervised registration module;
the data simulation module is used for acquiring multi-mode MR image samples of different magnetic resonance sequences;
The network training sample preprocessing module is used for preprocessing the floating image, the reference image and the sequence information corresponding to the floating image;
The neural network model training module is used for training the registration neural network;
The to-be-registered sample and sequence information preprocessing module is used for preprocessing the floating image to be registered, the reference image and the sequence information corresponding to the floating image to be registered;
The unsupervised registration module is used for unsupervised registration of the images to be registered.
2. The deep-learning-based multi-modal medical image unsupervised registration method according to claim 1, wherein in step S2 the preprocessing specifically comprises: performing size standardization on the simulated floating image and reference image; normalizing the simulated floating image and reference image; and setting zero arrays of the standardized size and filling each with one item of the sequence information corresponding to the floating image, obtaining three sequence information maps matched with the floating image.
3. The deep-learning-based multi-modal medical image unsupervised registration method according to claim 1, wherein in step S3 the cascade-network sub-registration network N1 adopts a U-shaped network as its backbone; the U-shaped network comprises an encoder and a decoder, with skip connections between the encoder and the decoder.
4. The deep-learning-based multi-modal medical image unsupervised registration method according to claim 1, wherein in step S4 the cascade-network sub-registration network N2 adopts a U-shaped network as its backbone; the U-shaped network comprises an encoder and a decoder, with skip connections between the encoder and the decoder.
5. The deep-learning-based multi-modal medical image unsupervised registration method according to claim 1, wherein in step S3 or S4 the spatial transformation network consists of three parts, namely a localization network, a sampling grid generator and a sampler, with which the floating image can be resampled.
6. The deep-learning-based multi-modal medical image unsupervised registration method according to claim 1, wherein in step S7 the preprocessing serves unsupervised non-rigid registration of actually acquired data of different sequences using the deep neural network trained with simulated data samples: the floating image to be registered, the reference image to be registered and the sequence information maps corresponding to the floating image are preprocessed according to step S2, and the preprocessed data samples are input into the trained deep neural network to obtain the final registered image.
7. The deep-learning-based multi-modal medical image unsupervised registration method according to claim 1, wherein the network training sample preprocessing module comprises a size standardization unit, a normalization unit and a sequence information quantization unit; the size standardization unit is used for standardizing the sizes of the floating image and the reference image; the normalization unit is used for performing normalization on the floating image and the reference image; the sequence information quantization unit is used for setting zero arrays of the standardized size and filling each with one item of the sequence information corresponding to the floating image, obtaining three sequence information maps matched with the floating image: an echo train length map, a resolution map and an echo time TE map;
The to-be-registered sample and sequence information preprocessing module is consistent with the network training sample preprocessing module and is used for preprocessing the sample to be registered and its sequence information; it comprises a size standardization unit, a normalization unit and a sequence information quantization unit; the size standardization unit is used for standardizing the sizes of the floating image and the reference image to be registered; the normalization unit is used for performing normalization on the floating image and the reference image to be registered; the sequence information quantization unit is used for setting zero arrays of the standardized size and filling each with one item of the sequence information corresponding to the floating image to be registered, obtaining three sequence information maps matched with the floating image to be registered.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111461476.1A CN114119689B (en) | 2021-12-02 | 2021-12-02 | Multi-modal medical image unsupervised registration method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114119689A CN114119689A (en) | 2022-03-01 |
CN114119689B true CN114119689B (en) | 2024-06-07 |
Family
ID=80366448
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111461476.1A Active CN114119689B (en) | 2021-12-02 | 2021-12-02 | Multi-modal medical image unsupervised registration method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114119689B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116777959A (en) * | 2022-03-10 | 2023-09-19 | 中国科学院深圳先进技术研究院 | Nuclear magnetic resonance image batch registration method and system |
CN115375971B (en) * | 2022-08-24 | 2023-04-25 | 北京医智影科技有限公司 | Multi-mode medical image registration model training method, registration method, system and equipment |
CN115953440B (en) * | 2023-03-10 | 2023-05-16 | 福建自贸试验区厦门片区Manteia数据科技有限公司 | Registration method and device of medical image, storage medium and electronic equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111862174A (en) * | 2020-07-08 | 2020-10-30 | 清华大学深圳国际研究生院 | Cross-modal medical image registration method and device |
CN113112534A (en) * | 2021-04-20 | 2021-07-13 | 安徽大学 | Three-dimensional biomedical image registration method based on iterative self-supervision |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11449759B2 (en) * | 2018-01-03 | 2022-09-20 | Siemens Heathcare Gmbh | Medical imaging diffeomorphic registration based on machine learning |
2021-12-02: CN application CN202111461476.1A filed; granted as patent CN114119689B (Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111862174A (en) * | 2020-07-08 | 2020-10-30 | 清华大学深圳国际研究生院 | Cross-modal medical image registration method and device |
CN113112534A (en) * | 2021-04-20 | 2021-07-13 | 安徽大学 | Three-dimensional biomedical image registration method based on iterative self-supervision |
Non-Patent Citations (1)
Title |
---|
Brain image registration based on residual dense relative average CGAN; Wang Lifang; Zhang Chengcheng; Qin Pinle; Lin Suzhen; Gao Yuan; Dou Jieliang; Journal of Image and Graphics; 2020-04-15 (Issue 04); pp. 121-134 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114119689B (en) | Multi-modal medical image unsupervised registration method and system based on deep learning | |
Qiu et al. | Multiple improved residual networks for medical image super-resolution | |
CN113539435B (en) | Brain function registration method based on graph model | |
CN111870245B (en) | Cross-contrast-guided ultra-fast nuclear magnetic resonance imaging deep learning method | |
CN109948741A (en) | A kind of transfer learning method and device | |
CN107798697A (en) | A kind of medical image registration method based on convolutional neural networks, system and electronic equipment | |
CN112330724B (en) | Integrated attention enhancement-based unsupervised multi-modal image registration method | |
CN113516693B (en) | Rapid and universal image registration method | |
CN111325750B (en) | Medical image segmentation method based on multi-scale fusion U-shaped chain neural network | |
CN111968135B (en) | Three-dimensional abdomen CT image multi-organ registration method based on full convolution network | |
CN111126494B (en) | Image classification method and system based on anisotropic convolution | |
CN115496720A (en) | Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment | |
CN103824294A (en) | Method for aligning electronic cross-sectional image sequence | |
CN105761280A (en) | Visual sense tracking method based L1-L2 on norm cooperative constraints | |
CN114511710A (en) | Image target detection method based on convolutional neural network | |
CN112819853A (en) | Semantic prior-based visual odometer method | |
CN114581453A (en) | Medical image segmentation method based on multi-axial-plane feature fusion two-dimensional convolution neural network | |
CN114663880A (en) | Three-dimensional target detection method based on multi-level cross-modal self-attention mechanism | |
CN113034371A (en) | Infrared and visible light image fusion method based on feature embedding | |
CN116758220A (en) | Single-view three-dimensional point cloud reconstruction method based on conditional diffusion probability model | |
CN116310618A (en) | Registration network training device and method for multimode images and registration method | |
CN114820636A (en) | Three-dimensional medical image segmentation model and training method and application thereof | |
CN114419015A (en) | Brain function fusion analysis method based on multi-modal registration | |
CN107330912A (en) | A kind of target tracking method of rarefaction representation based on multi-feature fusion | |
Billot et al. | SE (3)-Equivariant and Noise-Invariant 3D Motion Tracking in Medical Images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |