CN114119689A

CN114119689A - Multi-modal medical image unsupervised registration method and system based on deep learning

Info

Publication number: CN114119689A
Application number: CN202111461476.1A
Authority: CN
Inventors: 蔡聪波; 柯凌志; 蔡淑惠
Original assignee: Xiamen University
Current assignee: Xiamen University
Priority date: 2021-12-02
Filing date: 2021-12-02
Publication date: 2022-03-01

Abstract

A deep-learning-based unsupervised registration method and system for multi-modal medical images, relating to medical image registration. Simulated samples are generated; the floating image, the reference image, and the sequence information of the floating image are input into registration sub-network N1, and the floating image is resampled with the resulting deformation field to obtain a preliminary registered image. The preliminary registered image, the reference image, and the original sequence information are then input into registration sub-network N2, and the preliminary registered image is resampled with the resulting final deformation field to obtain the final registered image. The network is optimized by computing a loss function between the registered images and the reference image, so as to achieve accurate registration of non-rigid medical images. The system comprises a data simulation module, a network training sample preprocessing module, a neural network model training module, a to-be-registered sample and sequence information preprocessing module, and an unsupervised registration module. The invention addresses the dependence of deep learning methods on data samples. Registration can be completed accurately simply by also inputting the sequence information maps of the floating image to be registered.

Description

Multi-modal medical image unsupervised registration method and system based on deep learning

Technical Field

The invention relates to medical image registration, in particular to a method and a system for unsupervised registration of multi-modal medical images based on deep learning.

Background

Medical image registration refers to finding one spatial transformation, or a series of them, that brings a floating medical image into spatial correspondence with the corresponding points of a reference medical image, so that diagnostically significant anatomical points in the floating and reference images are matched in spatial position. Medical image registration currently has several branches: by the modality of the images being registered it is divided into single-modality and multi-modality registration, and by the type of spatial transformation it is divided into rigid and non-rigid registration. Of these, non-rigid medical image registration attracts particular attention and plays a crucial role in surgical guidance, radiotherapy, intelligent diagnosis, and related fields.

A large number of traditional algorithms have emerged in the field of medical image registration to address non-rigid registration. The demons method proposed by Thirion (J.-P. Thirion, "Image matching as a diffusion process: an analogy with Maxwell's demons," Medical Image Analysis, vol. 2, no. 3, pp. 243-260, 1998.) registers by estimating the velocity vector field between two adjacent images: it calculates the optical flow, smooths the flow field with Gaussian filters, and optimizes the estimate for each image pair over multiple iterations. Beg et al. (M. F. Beg, M. I. Miller, A. Trouvé, and L. Younes, "Computing large deformation metric mappings via geodesic flows of diffeomorphisms," International Journal of Computer Vision, vol. 61, no. 2, pp. 139-157, 2005.) propose the LDDMM (large deformation diffeomorphic metric mapping) method, which solves a global variational problem and estimates the metric between images by deriving and implementing an Euler-Lagrange optimization algorithm to compute the particle flow. SyN (B. B. Avants, C. L. Epstein, M. Grossman, and J. C. Gee, "Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain," Medical Image Analysis, vol. 12, no. 1, pp. 26-41, 2008.) is the most widely used conventional algorithm in medical image registration; it proposes a symmetric image normalization method based on Euler-Lagrange optimization. In summary, these conventional non-rigid medical image registration methods all require a large amount of iterative optimization and are computationally expensive and time-consuming.

To make non-rigid medical image registration more efficient, deep learning models have also been introduced into registration methods. Li et al. (Li H, Fan Y. Non-rigid image registration using self-supervised fully convolutional networks without training data [C]. 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE, 2018: 1075-1078.) propose a deep-learning-based approach that predicts deformation parameters with a fully convolutional network, but it is a 2D registration network and handles only non-3D registration tasks. Fan et al (Fan J, Cao X, Yap P T, et al. BIRNet: Brain image registration using dual-supervised fully convolutional networks [J]. Medical Image Analysis, 2019, 54: 193-. The VoxelMorph method proposed by Balakrishnan et al. (G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, and A. V. Dalca, "VoxelMorph: a learning framework for deformable medical image registration," IEEE Transactions on Medical Imaging, vol. 38, no. 8, pp. 1788-1800, 2019.) registers non-rigid medical images using an unsupervised training mode, but it is mainly directed at single-modality medical image registration. Furthermore, all of the above registration methods correct only the texture structure and cannot correct signal distortion. Magnetic resonance imaging in an inhomogeneous field produces not only geometric distortion of the texture structure but also distortion of the signal intensity, which markedly affects subsequent image analysis and diagnosis.

Disclosure of Invention

The invention aims to solve problems of prior-art deep learning registration methods, such as the difficulty of obtaining highly distorted MR images, and provides a deep-learning-based unsupervised registration method and system for multi-modal medical images that corrects texture structure and signal intensity simultaneously. A sequence information map is introduced into the training samples, which overcomes the limitation that conventional deep learning registration methods can only register images whose magnetic resonance sequence information is consistent with that of the training samples.

A multi-modal medical image unsupervised registration method based on deep learning comprises the following steps:

S1: obtaining multi-modal MR image samples of different magnetic resonance sequences by simulation: a floating image and a reference image;

S2: preprocessing the floating image, the reference image, and the sequence information corresponding to the floating image;

S3: inputting the floating image, the sequence information corresponding to the floating image, and the reference image into a cascade network, obtaining a preliminary deformation field from sub-registration network N1 of the cascade network, and inputting the preliminary deformation field and the floating image into a spatial transformation network to obtain a preliminary registered image;

S4: inputting the preliminary registered image, the sequence information corresponding to the original floating image, and the reference image into sub-registration network N2 of the cascade network to obtain a final deformation field, and inputting the final deformation field and the floating image into the spatial transformation network to obtain a final registered image;

S5: computing a loss function L_N1 from the preliminary registered image and the reference image, computing a loss function L_N2 from the final registered image and the reference image, adding the two loss functions to obtain a total loss function, repeating steps S3 to S5, and training the cascade network by minimizing the total loss function until the best registration result is reached;

S6: acquiring real-time multi-modal data to be registered: a floating image and a reference image;

S7: preprocessing the floating image to be registered, the reference image, and the sequence information corresponding to the floating image;

S8: inputting the floating image to be registered, the reference image, and the sequence information corresponding to the floating image into the cascade network trained in step S5, and performing steps S3 to S5 again to complete the registration and obtain the final registered image.

In step S1, obtaining multi-modal MR image samples of different magnetic resonance sequences by simulation means simulating the training samples required by the proposed deep cascade neural network: a sample template is first obtained from a public data set, and floating images and reference images with different sequences and distortions are simulated using the Bloch equations; the sequence information corresponding to the floating image is saved: echo train length, resolution, and echo time TE.

In step S2, the preprocessing may specifically comprise: standardizing the sizes of the simulated floating image and reference image; normalizing the simulated floating image and reference image; and setting zero arrays of matching size according to the standardized image size and filling each zero array with one item of the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image.

In step S3, sub-registration network N1 of the cascade network uses a U-shaped network as its backbone; the U-shaped network structure comprises an encoder and a decoder with skip connections between them.

In step S4, sub-registration network N2 of the cascade network uses a U-shaped network as its backbone; the U-shaped network structure comprises an encoder and a decoder with skip connections between them.

In steps S3 and S4, the spatial transform network is composed of a localization network, a sampling grid generator, and a sampler, by which the floating image can be resampled.

In step S5, the total loss function is:

$$L_{total} = L_{N1} + L_{N2} = L_{sim}(f, m \circ \phi_1) + \lambda_1 L_{smooth}(\phi_1) + L_{sim}(f, m' \circ \phi_2) + \lambda_2 L_{smooth}(\phi_2)$$

wherein f is the reference image, m is the floating image, m′ is the preliminary registered image, φ₁ is the preliminary deformation field, and φ₂ is the final deformation field; L_sim(f, m∘φ₁) is the similarity loss between the reference image and the preliminary registered image, L_smooth(φ₁) is the smoothness-constraint regularization loss on the preliminary deformation field φ₁, and λ₁ is its regularization coefficient; L_sim(f, m′∘φ₂) is the similarity loss between the reference image and the final registered image, L_smooth(φ₂) is the smoothness-constraint regularization loss on the final deformation field φ₂, and λ₂ is its regularization coefficient.

In step S7, the preprocessing enables unsupervised non-rigid registration of real data from different sequences using the deep neural network trained on simulated data samples: the floating image to be registered, the reference image to be registered, and the sequence information maps corresponding to the floating image are preprocessed as in step S2, and the processed data sample is input into the trained deep neural network to obtain the final registered image.

A deep-learning-based unsupervised registration system for multi-modal MR images is provided, in order, with the following modules: a data simulation module, a network training sample preprocessing module, a neural network model training module, a to-be-registered sample and sequence information preprocessing module, and an unsupervised registration module;

the data simulation module is used for acquiring multi-modal MR image samples of different magnetic resonance sequences;

the network training sample preprocessing module is used for preprocessing the floating image, the reference image and the sequence information corresponding to the corresponding floating image;

the neural network model training module is used for training the registration neural network;

the pre-processing module of the sample and the sequence information to be registered is used for pre-processing the floating image to be registered, the reference image and the sequence information corresponding to the corresponding floating image to be registered;

the unsupervised registration module is used for carrying out unsupervised registration on the image to be registered.

Further, the network training sample preprocessing module comprises a size standardization unit, a normalization unit, and a sequence information quantization unit; the size standardization unit is used for standardizing the sizes of the floating image and the reference image; the normalization unit is used for normalizing the floating image and the reference image; in the sequence information quantization unit, the user sets zero arrays of the standardized size, and each zero array is filled with one item of the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image: an echo train length information map, a resolution information map, and an echo time TE information map.

The to-be-registered sample and sequence information preprocessing module is consistent with the network training sample preprocessing module; it is used for preprocessing the samples to be registered and their sequence information, and comprises a size standardization unit, a normalization unit, and a sequence information quantization unit.

The size standardization unit is used for standardizing the sizes of the floating image to be registered and the reference image; the normalization unit is used for normalizing the floating image to be registered and the reference image; in the sequence information quantization unit, the user sets zero arrays of the standardized size, and each zero array is filled with one item of the sequence information corresponding to the floating image to be registered, yielding three sequence information maps matched to the floating image to be registered.

The invention introduces the sequence information of the floating sample into the training samples (echo train length, resolution, and echo time TE), so that registration can be completed quickly and in a targeted manner by inputting the sequence information of the corresponding floating sample at registration time. For 1000 training samples of size 256 × 256, the network training time of the method is about 2 h and the registration speed is about 0.03 s per sample, a marked improvement in registration efficiency.

Compared with the prior art, the invention has the beneficial effects that:

1. Using a simulation method based on templates derived from a public data set, the invention simulates a large number of distorted data samples, which addresses the dependence of deep learning methods on data samples.

2. The invention introduces sequence information into the training samples: echo train length, resolution, and echo time TE. At registration time, accurate registration can be completed simply by also inputting the sequence information maps of the floating image to be registered. The use of a cascade network is also one of the keys to the successful unsupervised registration of MR images.

3. The invention trains on simulated data and registers real acquired data, which overcomes the shortage of highly distorted real samples; the method can also register samples from different magnetic resonance sequences, which increases the usability of the registration network.

Drawings

FIG. 1 is a flowchart of an unsupervised registration process of a multi-modal MR image based on deep learning according to an embodiment of the present invention;

FIG. 2 is a diagram of a deep neural network structure for unsupervised registration of a multi-modal MR image based on deep learning in an embodiment of the present invention;

FIG. 3 shows a sample to be registered and the registration result according to an embodiment of the present invention;

FIG. 4 is a system structure diagram of the unsupervised registration of the multi-modal MR images based on deep learning in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described in the following embodiments with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention aims at fast and accurate unsupervised registration of multi-modal MR medical images with non-rigid deformation. To address the lack of highly distorted real samples, the invention uses Bloch simulation to generate a large number of floating images and reference images to be registered, trains a deep neural network using these data samples together with the sequence information corresponding to the floating images, and then inputs real samples to be registered and their sequence information into the trained deep neural network to complete registration. First, simulated samples are made; the floating image, the reference image, and the sequence information of the floating image are input into registration sub-network N1, and the floating image is resampled with the resulting deformation field to obtain a preliminary registered image. The preliminary registered image, the reference image, and the original sequence information are then input into registration sub-network N2, and the preliminary registered image is resampled with the resulting final deformation field to obtain the final registered image. The deep neural network is optimized by computing a loss function between the registered images and the reference image, so as to achieve accurate registration of non-rigid medical images.

As shown in fig. 1, the present invention provides a method for unsupervised registration of multi-modal MR images based on deep learning, comprising:

S1: obtaining multi-modal MR image samples of different magnetic resonance sequences by simulation: a floating image and a reference image; specifically:

first, a sample template is obtained from a public data set, and floating images and reference images with different sequences and distortions are simulated using the Bloch equations; the sequence information corresponding to the floating image is saved: echo train length, resolution, and echo time TE.
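As a rough illustration of how a template can yield image pairs with different contrast, the sketch below uses the closed-form spin-echo signal expression that follows from the Bloch equations under idealized conditions. It is not the Bloch simulation of the embodiment, and it does not model distortion from field inhomogeneity. The tissue maps `pd`, `t1`, and `t2`, the function name `spin_echo_signal`, and all numerical values are hypothetical.

```python
import numpy as np

def spin_echo_signal(pd, t1, t2, tr, te):
    """Idealized spin-echo magnitude: PD * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return pd * (1.0 - np.exp(-tr / t1)) * np.exp(-te / t2)

# Hypothetical 2x2 template: proton density and relaxation times in ms.
pd = np.array([[1.0, 0.8], [0.9, 0.7]])
t1 = np.array([[4000.0, 900.0], [1200.0, 800.0]])
t2 = np.array([[2000.0, 80.0], [100.0, 70.0]])

# Two sequences that differ only in echo time give different contrasts.
floating = spin_echo_signal(pd, t1, t2, tr=5000.0, te=90.0)
reference = spin_echo_signal(pd, t1, t2, tr=5000.0, te=30.0)

# Sequence information saved alongside the floating image (values assumed).
sequence_info = {"echo_train_length": 64, "resolution": 2.0, "te_ms": 90.0}
```

Because the floating image here uses the longer echo time, every pixel is attenuated more strongly than in the reference image, which is the kind of intensity mismatch a multi-modal registration network has to tolerate.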

S2: preprocessing the floating image, the reference image, and the sequence information corresponding to the floating image; specifically:

standardizing the sizes of the simulated floating image and reference image; normalizing the simulated floating image and reference image; and setting zero arrays of matching size according to the standardized image size and filling each zero array with one item of the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image.
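The three preprocessing operations can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the source does not say how sizes are standardized or which normalization is used, so center crop or zero-pad and min-max scaling are chosen here, and the sequence values are written into the maps unscaled. All function names are hypothetical.

```python
import numpy as np

def standardize_size(img, size):
    """Center-crop or zero-pad a 2D image to (size, size)."""
    out = np.zeros((size, size), dtype=np.float64)
    h, w = img.shape
    ch, cw = min(h, size), min(w, size)
    # Source window (centered in img) and destination window (centered in out).
    sy, sx = (h - ch) // 2, (w - cw) // 2
    dy, dx = (size - ch) // 2, (size - cw) // 2
    out[dy:dy + ch, dx:dx + cw] = img[sy:sy + ch, sx:sx + cw]
    return out

def normalize(img):
    """Min-max normalization to [0, 1]; a constant image maps to zeros."""
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img, dtype=np.float64)
    return (img - lo) / (hi - lo)

def sequence_info_maps(size, echo_train_length, resolution, te):
    """Fill three zero arrays with the scalar sequence parameters."""
    maps = np.zeros((3, size, size), dtype=np.float64)
    for k, value in enumerate((echo_train_length, resolution, te)):
        maps[k] += value
    return maps

floating = normalize(standardize_size(np.arange(12.0).reshape(3, 4), 4))
info = sequence_info_maps(4, echo_train_length=64, resolution=2.0, te=90.0)
```

Each map has the same height and width as the standardized image, so the three maps can be stacked with the floating and reference images as extra input channels.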

S3: as shown in FIG. 2, the floating image m, the sequence information i corresponding to the floating image, and the reference image f are input into the network; sub-registration network N1 of the cascade network produces the preliminary deformation field φ₁; the preliminary deformation field and the floating image m are input into the spatial transformation network to obtain the preliminary registered image m′; the spatial transformation network obtains the pixel values of the preliminary registered image by linear interpolation:

where M(φ(j)) is the pixel value at j and Z(φ(j)) is the neighborhood of φ(j).
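The resampling step can be illustrated with a plain bilinear interpolation routine. This is a sketch of the general technique, not the interpolation formula of the embodiment. It assumes the deformation field stores per-pixel (row, column) displacements and that sample positions outside the image are clamped to the border; the function name `warp_bilinear` is hypothetical.

```python
import numpy as np

def warp_bilinear(image, flow):
    """Resample a 2D image at p + flow(p) using bilinear interpolation.

    image: (H, W) array; flow: (2, H, W) array of (row, column) displacements.
    Sample positions are clamped to the image border (an assumption).
    """
    h, w = image.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    r = np.clip(rows + flow[0], 0, h - 1)
    c = np.clip(cols + flow[1], 0, w - 1)
    # Integer corners of the cell containing each sample position.
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    # Fractional offsets act as the interpolation weights.
    wr, wc = r - r0, c - c0
    top = (1 - wc) * image[r0, c0] + wc * image[r0, c1]
    bottom = (1 - wc) * image[r1, c0] + wc * image[r1, c1]
    return (1 - wr) * top + wr * bottom

image = np.arange(16.0).reshape(4, 4)
identity = warp_bilinear(image, np.zeros((2, 4, 4)))
half_shift = warp_bilinear(image, np.full((2, 4, 4), 0.5))
```

A zero field returns the image unchanged, and a half-pixel shift averages the four neighbors of each sample position. Because the weights vary continuously with the field, the operation is differentiable with respect to the displacements, which is what allows a registration network to be trained through it.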

S4: as shown in FIG. 2, the preliminary registered image m′, the sequence information i corresponding to the original floating image, and the reference image f are input into sub-registration network N2 of the cascade network to obtain the final deformation field φ₂; the final deformation field φ₂ and the floating image m are input into the spatial transformation network to obtain the final registered image; the spatial transformation network obtains the pixel values of the final registered image by linear interpolation.
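The data flow of steps S3 and S4 can be sketched with stand-in networks. The sketch assumes that the images and sequence information maps are stacked as channels, which the source does not state. Step S4 names the floating image m as the input to the spatial transformation network, while the loss term is written as m′∘φ₂; the sketch follows the loss term and warps the preliminary image, which is an assumption. `constant_shift_net` and `integer_warp` are hypothetical placeholders for the trained U-shaped networks and the spatial transformation network.

```python
import numpy as np

def cascade_forward(floating, reference, info_maps, n1, n2, warp):
    """Two-stage registration: N1 predicts phi1, N2 refines with phi2."""
    # Stage 1: stack floating image, sequence maps, and reference as channels.
    x1 = np.concatenate([floating[None], info_maps, reference[None]], axis=0)
    phi1 = n1(x1)
    preliminary = warp(floating, phi1)
    # Stage 2: same layout, with the preliminary result as the moving image.
    x2 = np.concatenate([preliminary[None], info_maps, reference[None]], axis=0)
    phi2 = n2(x2)
    final = warp(preliminary, phi2)
    return preliminary, final, phi1, phi2

def constant_shift_net(shift):
    """Stand-in for a trained network: always predicts one column shift."""
    def net(x):
        _, h, w = x.shape
        flow = np.zeros((2, h, w))
        flow[1] = shift  # whole-pixel displacement along columns
        return flow
    return net

def integer_warp(image, flow):
    """Nearest-pixel resampling, sufficient for whole-pixel test fields."""
    h, w = image.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    r = np.clip(np.rint(rows + flow[0]).astype(int), 0, h - 1)
    c = np.clip(np.rint(cols + flow[1]).astype(int), 0, w - 1)
    return image[r, c]

reference = np.zeros((8, 8))
reference[3:5, 4:6] = 1.0
floating = np.zeros((8, 8))
floating[3:5, 1:3] = 1.0  # same block, three columns to the left
info_maps = np.zeros((3, 8, 8))

preliminary, final, phi1, phi2 = cascade_forward(
    floating, reference, info_maps,
    constant_shift_net(-2), constant_shift_net(-1), integer_warp,
)
```

In this toy case the first stage moves the block two columns and the second stage supplies the remaining one-column correction, so the preliminary image is only partly aligned and the final image matches the reference.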

S5: computing a loss function L_N1 from the preliminary registered image and the reference image, computing a loss function L_N2 from the final registered image and the reference image, adding the two loss functions to obtain a total loss function, repeating steps S3 to S5, and training the cascade network by minimizing the total loss function until the best registration result is reached; the total loss function is:

$$L_{total} = L_{N1} + L_{N2} = L_{sim}(f, m \circ \phi_1) + \lambda_1 L_{smooth}(\phi_1) + L_{sim}(f, m' \circ \phi_2) + \lambda_2 L_{smooth}(\phi_2)$$

wherein f is the reference image, m is the floating image, m′ is the preliminary registered image, φ₁ is the preliminary deformation field, and φ₂ is the final deformation field; L_sim(f, m∘φ₁) is the similarity loss between the reference image and the preliminary registered image, L_smooth(φ₁) is the smoothness-constraint regularization loss on the preliminary deformation field φ₁, and λ₁ is its regularization coefficient; L_sim(f, m′∘φ₂) is the similarity loss between the reference image and the final registered image, L_smooth(φ₂) is the smoothness-constraint regularization loss on the final deformation field φ₂, and λ₂ is its regularization coefficient;
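The structure of the total loss can be sketched numerically. The source does not specify the form of L_sim or L_smooth, so this sketch uses mean squared error as a placeholder similarity term and the mean squared forward difference of the field as a placeholder smoothness term. The default values of `lam1` and `lam2` and all function names are hypothetical.

```python
import numpy as np

def similarity_loss(f, warped):
    """Placeholder similarity term: mean squared intensity difference."""
    return float(np.mean((f - warped) ** 2))

def smoothness_loss(phi):
    """Mean squared forward difference of a displacement field (2, H, W)."""
    d_row = np.diff(phi, axis=1)
    d_col = np.diff(phi, axis=2)
    return float(np.mean(d_row ** 2) + np.mean(d_col ** 2))

def total_loss(f, prelim, final, phi1, phi2, lam1=0.01, lam2=0.01):
    """L_total = L_N1 + L_N2, each a similarity term plus weighted smoothness."""
    l_n1 = similarity_loss(f, prelim) + lam1 * smoothness_loss(phi1)
    l_n2 = similarity_loss(f, final) + lam2 * smoothness_loss(phi2)
    return l_n1 + l_n2

f = np.ones((4, 4))
prelim = np.zeros((4, 4))        # stage 1 output far from the reference
final = f.copy()                 # stage 2 output equal to the reference
phi1 = np.zeros((2, 4, 4))       # perfectly smooth field
phi2 = np.zeros((2, 4, 4))
phi2[0] = np.arange(4.0)[:, None]  # row displacement grows by 1 per row

loss = total_loss(f, prelim, final, phi1, phi2)
```

Summing the two stage losses means both sub-networks receive a training signal from the reference image, and each regularization coefficient controls how strongly its stage is penalized for an irregular deformation field.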

S8: inputting the floating image to be registered, the reference image, and the sequence information corresponding to the floating image into the cascade network trained in step S5, and performing steps S3 to S5 again to complete the registration and obtain the final registered image. FIG. 3 shows, from left to right:

the floating image SE-EPI-DWI to be registered, the reference image SE-EPI-T2 and the final registered image.

FIG. 4 is a structural diagram of the system for unsupervised registration of multi-modal MR images based on deep learning in an embodiment of the present invention. As shown in FIG. 4, the invention provides a system for unsupervised registration of multi-modal MR images based on deep learning, which is provided in order with:

the data simulation module 101: used for acquiring multi-modal MR image samples of different magnetic resonance sequences;

the network training sample preprocessing module 102: used for preprocessing the floating image, the reference image, and the sequence information corresponding to the floating image;

the neural network model training module 103: used for training the registration neural network;

the to-be-registered sample and sequence information preprocessing module 104: used for preprocessing the floating image to be registered, the reference image, and the sequence information corresponding to the floating image to be registered;

the unsupervised registration module 105: used for performing unsupervised registration on the images to be registered;

the data simulation module 101 specifically includes:

acquiring a sample template from a public data set, and simulating floating images and reference images with different sequences and distortions using the Bloch equations; saving the sequence information corresponding to the floating image: echo train length, resolution, and echo time TE;

the network training sample preprocessing module 102 specifically includes:

a size standardization unit: used for standardizing the sizes of the floating image and the reference image;

a normalization unit: used for normalizing the floating image and the reference image;

a sequence information quantization unit: the user sets zero arrays of the standardized size, and each zero array is filled with one item of the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image;

the neural network model training module 103 specifically includes:

training the neural network using the floating image, the reference image, and the sequence information corresponding to the floating image, calculating the total loss function, and optimizing the cascade network by minimizing the total loss function;

the pre-processing module 104 for sample and sequence information to be registered specifically includes:

a size standardization unit: used for standardizing the sizes of the floating image to be registered and the reference image;

a normalization unit: used for normalizing the floating image to be registered and the reference image;

a sequence information quantization unit: the user sets zero arrays of the standardized size, and each zero array is filled with one item of the sequence information corresponding to the floating image to be registered, yielding three sequence information maps matched to the floating image to be registered;

the unsupervised registration module 105 specifically includes:

inputting the preprocessed floating image to be registered, the reference image, and the sequence information corresponding to the floating image to be registered into the trained deep neural network to obtain the final registered image.

In the method and system for unsupervised registration of multi-modal MR images based on deep learning provided by the invention, a large number of data samples are simulated to train the neural network, which yields better test results on real acquired data and solves the problem that highly distorted MR images are difficult to obtain for deep learning registration methods. The invention also introduces the sequence information of the floating sample into the training samples (echo train length, resolution, and echo time TE), so that registration can be completed quickly and in a targeted manner by inputting the sequence information of the corresponding floating sample at registration time. For 1000 training samples of size 256 × 256, the network training time of the method is about 2 h and the registration speed is about 0.03 s per sample, a marked improvement in registration efficiency.

The principles and embodiments of the present invention are explained herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. The method for unsupervised registration of the multi-modal medical images based on deep learning is characterized by comprising the following steps of:

2. The unsupervised registration method for multi-modal medical images based on deep learning as claimed in claim 1, wherein in step S1, obtaining multi-modal MR image samples of different magnetic resonance sequences by simulation means simulating the training samples required by the proposed deep cascade neural network: a sample template is first obtained from a public data set, and floating images and reference images with different sequences and distortions are simulated using the Bloch equations; the sequence information corresponding to the floating image is saved: echo train length, resolution, and echo time TE.

3. The method for unsupervised registration of multi-modal medical images based on deep learning as claimed in claim 1, wherein in step S2, the specific steps of the preprocessing are: standardizing the sizes of the simulated floating image and reference image; normalizing the simulated floating image and reference image; and setting zero arrays of matching size according to the standardized image size and filling each zero array with one item of the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image.

4. The deep learning-based multi-modal medical image unsupervised registration method of claim 1, wherein in step S3, sub-registration network N1 of the cascade network uses a U-shaped network as its backbone, the U-shaped network structure comprising an encoder and a decoder with skip connections between them.

5. The deep learning-based multi-modal medical image unsupervised registration method of claim 1, wherein in step S4, sub-registration network N2 of the cascade network uses a U-shaped network as its backbone, the U-shaped network structure comprising an encoder and a decoder with skip connections between them.

6. The method for unsupervised registration of multi-modal medical images based on deep learning as claimed in claim 1, wherein in step S3 or S4, the spatial transformation network is composed of three parts of localization network, sampling grid generator and sampler, and the floating images can be resampled by the sampler.

7. The method for unsupervised registration of multi-modal medical images based on deep learning of claim 1, wherein in step S5, the total loss function is:

$$L_{total} = L_{N1} + L_{N2} = L_{sim}(f, m \circ \phi_1) + \lambda_1 L_{smooth}(\phi_1) + L_{sim}(f, m' \circ \phi_2) + \lambda_2 L_{smooth}(\phi_2)$$

wherein f is the reference image, m is the floating image, m′ is the preliminary registered image, φ₁ is the preliminary deformation field, and φ₂ is the final deformation field; L_sim(f, m∘φ₁) is the similarity loss between the reference image and the preliminary registered image, L_smooth(φ₁) is the smoothness-constraint regularization loss on the preliminary deformation field φ₁, and λ₁ is its regularization coefficient; L_sim(f, m′∘φ₂) is the similarity loss between the reference image and the final registered image, L_smooth(φ₂) is the smoothness-constraint regularization loss on the final deformation field φ₂, and λ₂ is its regularization coefficient.

8. The method for unsupervised registration of multi-modal medical images based on deep learning as claimed in claim 1, wherein in step S7, the preprocessing enables unsupervised non-rigid registration of real data from different sequences using the deep neural network trained on simulated data samples: the floating image to be registered, the reference image to be registered, and the sequence information maps corresponding to the floating image are preprocessed as in step S2, and the processed data sample is input into the trained deep neural network to obtain the final registered image.

9. A deep-learning-based unsupervised registration system for multi-modal MR images, characterized by being provided, in order, with the following modules: a data simulation module, a network training sample preprocessing module, a neural network model training module, a to-be-registered sample and sequence information preprocessing module, and an unsupervised registration module;

10. The system of claim 9, wherein the network training sample preprocessing module comprises a size standardization unit, a normalization unit, and a sequence information quantization unit; the size standardization unit is used for standardizing the sizes of the floating image and the reference image; the normalization unit is used for normalizing the floating image and the reference image; in the sequence information quantization unit, the user sets zero arrays of the standardized size, and each zero array is filled with one item of the sequence information corresponding to the floating image, yielding three sequence information maps matched to the floating image: an echo train length information map, a resolution information map, and an echo time TE information map;

the to-be-registered sample and sequence information preprocessing module is consistent with the network training sample preprocessing module; it is used for preprocessing the samples to be registered and their sequence information, and comprises a size standardization unit, a normalization unit, and a sequence information quantization unit; the size standardization unit is used for standardizing the sizes of the floating image to be registered and the reference image; the normalization unit is used for normalizing the floating image to be registered and the reference image; in the sequence information quantization unit, the user sets zero arrays of the standardized size, and each zero array is filled with one item of the sequence information corresponding to the floating image to be registered, yielding three sequence information maps matched to the floating image to be registered.