CN111862174A - Cross-modal medical image registration method and device - Google Patents

Cross-modal medical image registration method and device

Info

Publication number
CN111862174A
CN111862174A (application CN202010652606.9A; granted as CN111862174B)
Authority
CN
China
Prior art keywords
image
modality
network
cross
deformation field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010652606.9A
Other languages
Chinese (zh)
Other versions
CN111862174B (en)
Inventor
李秀 (Li Xiu)
徐哲 (Xu Zhe)
马露凡 (Ma Lufan)
罗凤 (Luo Feng)
严江鹏 (Yan Jiangpeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen International Graduate School of Tsinghua University
Original Assignee
Shenzhen International Graduate School of Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen International Graduate School of Tsinghua University filed Critical Shenzhen International Graduate School of Tsinghua University
Priority to CN202010652606.9A
Publication of CN111862174A
Application granted
Publication of CN111862174B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/30: Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33: Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10072: Tomographic images
    • G06T2207/10081: Computed x-ray tomography [CT]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10072: Tomographic images
    • G06T2207/10088: Magnetic resonance imaging [MRI]

Abstract

A cross-modal medical image registration method, comprising: providing a training set comprising a floating image of a first modality and a reference image of a second modality; inputting the floating image into an image conversion network, which converts it into a converted image of the second modality; inputting the floating image and the reference image into a cross-modal flow sub-network, which outputs a first deformation field; inputting the converted image and the reference image into a single-modal flow sub-network, which outputs a second deformation field; inputting the first deformation field and the second deformation field into a deformation field fusion network, which outputs a final deformation field; inputting the floating image and the final deformation field into a spatial transformation network to obtain the floating image warped by the final deformation field; obtaining a first total loss function from the transformed floating image and the reference image, and performing supervised training of the networks with the goal of minimizing the first total loss function; and inputting the images to be registered into the trained networks to obtain the registered image. The invention can greatly improve the effect of cross-modal medical image registration.

Description

Cross-modal medical image registration method and device
Technical Field
The invention relates to the technical field of medical image registration, in particular to a cross-modal medical image registration method and device.
Background
Medical image registration is an optimization process that aligns a floating image with a reference image based on the appearance of the medical images; the goal is to find the best spatial transformation that aligns the regions of interest in the input images. As a key technology of image-guided therapy, medical image registration attempts to establish anatomical correspondence between different medical images and is applied in many clinical scenarios such as endoscopy, disease diagnosis, surgical guidance and radiotherapy. Medical image registration is a broad research topic: it can be divided into single-modal registration and cross-modal registration according to the types of the images to be registered, and into rigid, affine and deformable registration according to the type of registration transformation.
Traditional registration methods solve for the optimal transformation by iteratively optimizing an image similarity index, which is computationally inefficient. Subsequent work therefore introduced deep learning into the medical image registration task, using a deep neural network to directly estimate the deformable transformation of an input image pair and thereby effectively balancing registration accuracy against computational efficiency. However, obtaining a real deformation field (ground truth) and three-dimensional segmentation labels is very challenging and costly, so registration research has gradually focused on learning registration networks under unsupervised conditions.
Existing cross-modal medical image registration techniques can be divided into two major categories: 1) modifying the loss function of existing single-modal registration, designing a similarity measure for cross-modal images to guide unsupervised deformable registration network learning; 2) M2U (Multimodal-to-Unimodal) registration based on cross-modal image conversion, which converts cross-modal image registration into a single-modal registration task by means of existing image conversion techniques. The two are introduced separately below:
(1) Registration techniques based on cross-modal image similarity measurement
This class of methods performs registration of different-modality images directly based on a cross-modal image similarity loss. Owing to the huge appearance differences between cross-modal images, most traditional single-modal image similarity measures are not suitable for the cross-modal registration task. It is therefore highly desirable to design an effective cross-modal image similarity loss to guide the training of an unsupervised cross-modal registration network. To overcome this challenge, Heinrich et al. proposed the modality independent neighbourhood descriptor (MIND) based on the concept of image self-similarity. MIND is highly robust to the obvious differences between modalities and can effectively characterize the similarity of cross-modal images.
A representative approach in this class is the VoxelMorph framework combined with the MIND similarity metric. The cross-modal similarity metric MIND is used as the loss function and applied directly to the typical unsupervised registration framework VoxelMorph, guiding the network to learn a deformable mapping from the multi-modal input images. The network structure of VoxelMorph is shown in fig. 1: VoxelMorph is an unsupervised deformable registration framework based on a convolutional neural network (CNN). The deep convolutional registration network cascades a UNet with a spatial transformation network structure; it takes the floating image (M) to be registered and the reference image (F) as inputs, learns the deformable mapping between the input images through the registration network g_θ(F, M), and outputs a high-dimensional deformation field φ. The transformed floating image warped(φ) is obtained by spatially warping the floating image M according to the estimated deformation field φ. The loss function of the whole network comprises two parts: 1) the similarity loss between the transformed floating image warped(φ) and the reference image F; 2) a regularization loss that smooths the estimated deformation field φ. The VoxelMorph + MIND cross-modal image registration technique inputs the cross-modal images to be registered into the VoxelMorph network, uses MIND to compute the cross-modal image similarity loss that supervises parameter training, and realizes deformable registration of three-dimensional cross-modal images.
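For concreteness, the optimization objective of such a VoxelMorph + MIND pipeline can be sketched in PyTorch as follows; this is a minimal illustrative sketch, not the reference implementation, and `mind_descriptor` and `spatial_transform` are assumed helpers (the MIND feature extractor and the warping module):

```python
import torch

def mind_loss(warped, fixed, mind_descriptor):
    # Mean absolute difference between the MIND descriptors of the two volumes;
    # mind_descriptor maps [B,1,D,H,W] to a feature volume [B,|R|,D,H,W].
    return torch.mean(torch.abs(mind_descriptor(warped) - mind_descriptor(fixed)))

def smoothness_loss(flow):
    # L2 penalty on the spatial gradients of the deformation field [B,3,D,H,W].
    dz = flow[:, :, 1:] - flow[:, :, :-1]
    dy = flow[:, :, :, 1:] - flow[:, :, :, :-1]
    dx = flow[:, :, :, :, 1:] - flow[:, :, :, :, :-1]
    return (dz ** 2).mean() + (dy ** 2).mean() + (dx ** 2).mean()

def registration_loss(moving, fixed, flow, spatial_transform, mind_descriptor, lam=1.0):
    # Similarity term on the warped image plus smoothness regularization on phi.
    warped = spatial_transform(moving, flow)
    return mind_loss(warped, fixed, mind_descriptor) + lam * smoothness_loss(flow)
```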
(2) M2U registration techniques based on cross-modal image conversion
These techniques complete cross-modal medical image registration by means of an image conversion method; the core idea is to convert the complex cross-modal medical image registration into a simpler single-modal registration task. The overall flow of the image-conversion-based cross-modal registration method is as follows:
1) a cross-modal image conversion network is constructed for the cross-modal medical image data, the aim being to learn the mapping relationships between images of different modalities without paired data. The generative adversarial network (GAN), as represented by Cycle-GAN, is a typical image conversion network. The Cycle-GAN-based cross-modal image conversion process is shown in FIGS. 2a to 2c. To realize the mutual mapping of images between the two image domains X and Y, the Cycle-GAN network contains two domain mapping networks (i.e. generators) and two associated discriminators, as shown in fig. 2a. The generator G maps images from image domain X to image domain Y, i.e. G: X → Y; the generator F maps images from image domain Y to image domain X, i.e. F: Y → X. The discriminator Dx distinguishes real images from image domain X from images converted by the generator F; likewise, Dy distinguishes real images from image domain Y from images converted by the generator G. Fig. 2b shows the process of mapping an image from the original domain X to the target domain Y using generator G and then mapping it back to the original domain X using generator F, in which the discriminator Dy distinguishes real from generated images on image domain Y and computes an adversarial loss; fig. 2c shows the process of mapping an image from the original domain Y to the target domain X using generator F and back to the original domain Y using generator G, with the discriminator Dx distinguishing real from generated images on image domain X and computing the adversarial loss. To ensure that the domain mapping transformations G and F are mutually inverse, Cycle-GAN adds a cycle-consistency loss on top of the discriminator adversarial losses. In short, Cycle-GAN uses a pair of generator sub-networks to estimate the mappings between image domains, uses discriminator sub-networks to judge whether the generated images are real or fake, and supervises network training jointly with the adversarial loss and the cycle-consistency loss (a minimal sketch of these two losses is given after this list);
2) based on the mapping relationship learned by the Cycle-GAN network, images are converted from one modality to the other; the input cross-modal image pair thereby becomes a single-modal pair, and the problem is simplified to single-modal image registration;
3) an unsupervised learning-based deformable registration framework is constructed for the obtained single-modal images: a deep convolutional registration network learns the deformable mapping between the single-modal input images, and an STN module spatially warps the floating image according to the estimated deformation fields (DFs) so that the similarity between the transformed floating image and the reference image is maximized.
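The generator-side objective of such a Cycle-GAN-style converter can be sketched as follows (an illustrative PyTorch sketch using the least-squares GAN form of the adversarial term; the architectures of the generators and discriminators are assumed given):

```python
import torch
import torch.nn.functional as F

def cycle_gan_generator_loss(G, F_gen, Dx, Dy, real_x, real_y, lam_cyc=10.0):
    # G: X -> Y, F_gen: Y -> X; Dx and Dy judge realness on domains X and Y.
    fake_y = G(real_x)
    fake_x = F_gen(real_y)

    # Adversarial terms: the generators try to make the discriminators say "real".
    adv = F.mse_loss(Dy(fake_y), torch.ones_like(Dy(fake_y))) + \
          F.mse_loss(Dx(fake_x), torch.ones_like(Dx(fake_x)))

    # Cycle consistency: X -> Y -> X and Y -> X -> Y should reproduce the inputs.
    cyc = F.l1_loss(F_gen(fake_y), real_x) + F.l1_loss(G(fake_x), real_y)

    return adv + lam_cyc * cyc
```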
However, neither the aforementioned VoxelMorph + MIND-based cross-modal image registration technique nor the M2U registration technique generally achieves a satisfactory registration effect.
The above background disclosure is only for the purpose of assisting understanding of the concept and technical solution of the present invention and does not necessarily belong to the prior art of the present patent application, and should not be used for evaluating the novelty and inventive step of the present application in the case that there is no clear evidence that the above content is disclosed at the filing date of the present patent application.
Disclosure of Invention
In order to solve the technical problems, the invention provides a cross-modal medical image registration method and device, which can greatly improve the accuracy and robustness of cross-modal medical image registration.
In order to achieve the purpose, the invention adopts the following technical scheme:
one embodiment of the invention discloses a cross-modal medical image registration method, which comprises the following steps:
S1: providing a training set comprising a floating image of a first modality and a reference image of a second modality;
S2: inputting the floating image into an image conversion network to convert the floating image from the first modality to the second modality and output a converted image of the second modality;
S3: inputting the floating image and the reference image into a cross-modal flow sub-network, which outputs a first deformation field;
S4: inputting the converted image and the reference image into a single-modal flow sub-network, which outputs a second deformation field;
S5: inputting the first deformation field and the second deformation field into a deformation field fusion network, which superposes the two deformation fields and outputs a final deformation field;
S6: inputting the floating image and the final deformation field into a spatial transformation network to obtain the floating image warped by the final deformation field;
S7: comparing the transformed floating image with the reference image to obtain a first total loss function, and repeatedly executing steps S2-S7 to train the cross-modal flow sub-network, the single-modal flow sub-network, the deformation field fusion network and the spatial transformation network with the goal of minimizing the first total loss function, until training is finished and step S8 is executed;
S8: performing steps S2-S6 on the floating image of the first modality and the reference image of the second modality to be registered to obtain the transformed floating image, namely the registered image (how steps S2-S6 compose is sketched after this list).
Preferably, the image conversion network employs an improved Cycle-GAN network, and the second total loss function of the improved Cycle-GAN network is:

L = L_adv(D1) + L_adv(D2) + λ_cyc·L_cyc + λ_identity·L_identity + λ_MIND·L_MIND

where the adversarial losses L_adv(D1) and L_adv(D2) are the adversarial losses of the two discriminators D1 and D2 of the improved Cycle-GAN network respectively; the cycle-consistency loss L_cyc is a constraint on the reversibility of the transformations of the two generators G1 and G2 of the improved Cycle-GAN network; the identity mapping loss L_identity is a normative constraint imposed by generating converted images within the same modality; the structural consistency loss L_MIND is a constraint on the structural similarity between the original image and the generated image; and λ_cyc, λ_identity, λ_MIND represent the relative importance of L_cyc, L_identity, L_MIND respectively.
Step S7 also includes training the improved Cycle-GAN network with the goal of minimizing the second total loss function.
Preferably, the structural consistency loss L_MIND is:

L_MIND = (1 / (N_1·|R|)) Σ_x ||M(I_r1)(x) − M(G_2(I_r1))(x)||_1 + (1 / (N_2·|R|)) Σ_x ||M(I_r2)(x) − M(G_1(I_r2))(x)||_1

where M denotes the modality independent neighbourhood descriptor MIND, I_r1 denotes the floating image of the first modality, I_r2 denotes the reference image of the second modality, N_1 and N_2 denote the numbers of voxels in images I_r1 and I_r2 respectively, and R denotes a non-local region around voxel x; image G_1(I_r2) refers to the generated image obtained by converting image I_r2 with generator G_1, and image G_2(I_r1) refers to the generated image obtained by converting image I_r1 with generator G_2.
Preferably, the identity mapping loss L_identity is:

L_identity = ||G_1(I_1) − I_1||_1 + ||G_2(I_2) − I_2||_1

where I_1 denotes an image of the first modality and I_2 denotes an image of the second modality; image G_1(I_1) refers to the generated image obtained by converting the first-modality image I_1 with generator G_1, and image G_2(I_2) refers to the generated image obtained by converting the second-modality image I_2 with generator G_2.
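These two added losses can be transcribed almost directly into code; the sketch below follows the G1/G2 notation above and assumes a hypothetical `mind` helper that returns the MIND feature volume of an image:

```python
import torch
import torch.nn.functional as F

def identity_mapping_loss(G1, G2, I1, I2):
    # G1 outputs first-modality images and G2 outputs second-modality images;
    # a generator fed an image already in its output modality should act as
    # the identity.
    return F.l1_loss(G1(I1), I1) + F.l1_loss(G2(I2), I2)

def structural_consistency_loss(mind, Ir1, Ir2, G2, G1):
    # The MIND descriptors of each original image and of its converted
    # counterpart should agree (`mind` is an assumed feature extractor).
    term1 = torch.mean(torch.abs(mind(Ir1) - mind(G2(Ir1))))
    term2 = torch.mean(torch.abs(mind(Ir2) - mind(G1(Ir2))))
    return term1 + term2
```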
Preferably, the cross-modal flow sub-network adopts a UNet network structure, wherein the UNet network structure comprises an encoder and a decoder, and skip connections are adopted between the convolutional layers of the encoder and the decoder.
Preferably, the single-modal flow sub-network adopts a UNet network structure, wherein the UNet network structure comprises an encoder and a decoder, and skip connections are adopted between the convolutional layers of the encoder and the decoder.
Preferably, the deformation field fusion network is a 3D convolutional neural network.
Preferably, the spatial transformation network comprises a spatial grid generator and a sampler: the spatial grid generator generates a sampling grid according to the final deformation field, and the sampler spatially warps the floating image according to the sampling grid.
Preferably, the first total loss function is:

L_total = L_sim(I_r2, I_r1 ∘ φ_fused) + λ·L_smooth(φ_fused)

where φ_fused denotes the final deformation field, I_r1 denotes the floating image of the first modality, I_r2 denotes the reference image of the second modality, I_r1 ∘ φ_fused denotes the floating image after the final warping transformation, the image similarity loss L_sim(I_r2, I_r1 ∘ φ_fused) denotes the image similarity loss between the transformed floating image I_r1 ∘ φ_fused and the image I_r2, L_smooth(φ_fused) denotes the regularization loss constraining the smoothness of the final deformation field φ_fused, and λ is the regularization coefficient.
Another embodiment of the invention discloses a cross-modal medical image registration apparatus, which includes a processor and a readable storage medium storing executable instructions; when executing the instructions, the processor is caused to implement the cross-modal medical image registration method described above.
Compared with the prior art, the invention has the following beneficial effects. In the cross-modal medical image registration method and device, the floating image is first converted to the other modality through an image conversion network, and a deformation field is then estimated under unsupervised conditions by a cross-modal flow sub-network and a single-modal flow sub-network respectively. This dual-stream mechanism effectively integrates the information of the original image and the generated image: the cross-modal stream introduces the texture features of the original image and weakens the interference of unreal artificial features in the generated image on registration, while the single-modal stream effectively suppresses the voxel drift effect caused by the cross-modal stream. The original cross-modal stream and the synthetic single-modal stream are optimized cooperatively, learning a more realistic deformation field from the original floating image and the generated converted image respectively. The deformation fields estimated by the cross-modal and single-modal flow sub-networks are then fused, which greatly improves the accuracy and robustness of cross-modal medical image registration and yields better registration performance. The method and device thereby avoid the poor results of direct cross-modal registration caused by the large appearance differences between medical images of different modalities.
In a further scheme, the improved Cycle-GAN network is adopted as the image conversion network. Compared with the existing Cycle-GAN network, two loss function constraints are added, which strengthens the structural similarity between the generated image and the original image, improves structural fidelity, and prevents the introduction of additional artificial features during image conversion. This addresses the problem that converting the cross-modal image pair to be registered into a single-modal pair with an existing generative adversarial network (GAN) inevitably introduces unreal artificial anatomical features that interfere with the registration process and loses the detailed texture of the original image, thereby reducing registration accuracy.
Drawings
Fig. 1 is a schematic diagram of a conventional VoxelMorph network structure;
FIGS. 2a to 2c are schematic diagrams of image transformation based on a Cycle-GAN network;
FIG. 3a is an original CT modality image;
FIG. 3b is an MR image generated using a Cycle-GAN network;
FIG. 4 is a flowchart illustrating a cross-modality medical image registration method according to a preferred embodiment of the present invention;
FIG. 5 is a dual-stream cross-modal registration network structure based on adversarial learning in an embodiment of the present invention;
FIGS. 6a and 6b are schematic diagrams of a modified Cycle-GAN network in accordance with an embodiment of the present invention;
FIG. 7a is an original CT modality image;
FIG. 7b is an MR image generated using the improved Cycle-GAN network of the present invention;
fig. 8 is a schematic structural diagram of the cross-modal stream sub-network UNet_o / single-modal stream sub-network UNet_s;
fig. 9 is a hardware configuration diagram of a cross-modality medical image registration apparatus according to a preferred embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the embodiments of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present invention, "a plurality" means two or more unless specifically limited otherwise.
The inventors found that the VoxelMorph + MIND-based cross-modal image registration technique in the prior art fails to achieve a satisfactory registration effect because the appearance differences between medical images of different modalities are obvious: registering directly while ignoring the modality difference is difficult, and accuracy cannot be guaranteed; even though this technique modifies the loss function of the earlier single-modal registration task, a robust cross-modal similarity metric remains hard to find. In addition, the prior-art M2U registration technique based on cross-modal image conversion has the defect of inevitably introducing artificial features into the generated images. Most such techniques use a generative adversarial network (GAN) such as Cycle-GAN to realize cross-modal image conversion. Specifically, the Cycle-GAN network uses generators to learn the mapping relationships between the different domains of the input images and uses discriminators to evaluate the authenticity of the generated images in the target domain. The network uses the adversarial loss fed back by the discriminator to guide the generator in learning the image domain mapping, so that the generated images become similar in distribution to the target-domain images while the difference from the original-domain images continuously grows. However, such a method cannot guarantee that the network outputs the desired generated image, because through the adversarial loss the network only learns the distribution relationship in probability. Although the cycle-consistency loss is introduced to constrain the reversibility of the conversions between modalities, it is not enough to ensure that the generated image retains the original image characteristics, because the structural similarity of the images before and after modality conversion is not constrained during network training. Therefore, image conversion methods based on generative adversarial networks inevitably introduce unreal artificial features when generating target-domain images, which increases the mismatching rate in the subsequent registration process. Moreover, the original Cycle-GAN network imposes no normative constraint on the identity mapping of input images within the same modality and is likely to erroneously convert an input image already in the target domain to another domain. Consequently, simply converting cross-modal registration into single-modal registration by means of existing cross-modal image conversion techniques is not robust.
Meanwhile, the target-modality images generated by existing image conversion techniques tend to lose the local texture features of the original image and present large structural differences from it. Taking abdominal CT-to-MR registration as an example, as shown in fig. 3a and 3b, fig. 3a is the original CT modality image and fig. 3b is the MR image generated with a Cycle-GAN network; compared with fig. 3a, fig. 3b has lost local texture features of the image. In the unsupervised registration stage, existing cross-modal registration techniques generally feed the generated images directly into the deformable registration network as floating images for single-modal registration, without including the original images. Since the entire unsupervised registration network estimates the deformable transformation from the appearance features of the input images, the fidelity of the deformation fields (DFs) estimated by the registration network naturally depends on the consistency between the input images and the features of the original images. Because the prior art ignores the auxiliary information provided by the original image during registration, the parameter learning of the registration network is strongly affected by the artificial features in the generated image, so that the network ultimately estimates a distorted deformation field, the original floating image cannot be well aligned to the reference image, and registration accuracy drops.
In addition, unsupervised medical image registration supervises network training with a similarity or error loss between the transformed floating image and the reference image. Because the appearance differences between images of different modalities are huge, the similarity metrics commonly used in unsupervised single-modal registration are no longer suitable for the cross-modal scenario. Many learning-based unsupervised registration techniques measure image similarity in cross-modal registration tasks with indicators such as mutual information (MI) and cross-correlation (CC) and use them to guide the learning of network parameters. However, directly migrating these similarity metrics to the cross-modal registration task cannot effectively describe image similarity, so the network learns in the wrong direction, guided by a biased image similarity signal.
The aim of the invention is to incorporate the original image information into the registration process through a dual-stream registration-field fusion design, helping the network robustly learn a more realistic deformation field and thereby obtain better registration performance. To let the original image information guide the cross-modal registration process, the invention effectively utilizes the deformation fields estimated by the original cross-modal stream and the synthetic single-modal stream, and automatically learns, through a convolutional network, how to better fuse the two deformation fields. Meanwhile, to avoid introducing unreal artificial features during image conversion, two loss function constraints are added to the Cycle-GAN to improve the fidelity of the anatomical structures in the generated images.
As shown in fig. 4, a preferred embodiment of the present invention provides a cross-modal medical image registration method for registering a floating image of a first modality to a reference image of a second modality, comprising the following steps:
S1: providing a training set comprising a floating image of a first modality and a reference image of a second modality;
S2: inputting the floating image into an image conversion network to convert it from the first modality to the second modality and output a converted image of the second modality;
S3: inputting the floating image and the reference image into a cross-modal flow sub-network, which outputs a first deformation field;
S4: inputting the converted image and the reference image into a single-modal flow sub-network, which outputs a second deformation field;
S5: inputting the first deformation field and the second deformation field into a deformation field fusion network, which superposes the two deformation fields and outputs a final deformation field;
S6: inputting the floating image and the final deformation field into a spatial transformation network to obtain the floating image warped by the final deformation field;
S7: comparing the transformed floating image with the reference image to obtain a first total loss function, and repeatedly executing steps S2-S7 to perform supervised training of the cross-modal flow sub-network, the single-modal flow sub-network, the deformation field fusion network and the spatial transformation network with the goal of minimizing the first total loss function, until training is completed and step S8 is executed;
S8: performing steps S2-S6 on the floating image of the first modality and the reference image of the second modality to be registered to obtain the transformed floating image, namely the registered image.
The complete flow of the cross-modal medical image registration method provided by the invention is shown in fig. 5 and can be divided into two parts: a cross-modal image conversion network based on the improved Cycle-GAN, and a dual-stream cross-modal image registration network. The following description takes registration from a floating CT image to a reference MR image as an example, but the method of the present invention is not limited to CT-MR cross-modal medical image registration and applies equally to other cross-modal registrations, such as magnetic resonance-ultrasound (MR-US) registration and computed tomography-ultrasound (CT-US) registration.
With reference to the adversarial-learning-based dual-stream registration-fusion cross-modal medical image registration network structure in fig. 5, the overall registration steps include:
A1: providing a training set comprising floating images rCT of the CT modality and reference images rMR of the MR modality;
A2: inputting the original CT-modality floating image rCT into the improved Cycle-GAN image conversion network, which converts the floating image from the CT modality to the MR modality and outputs the generated image tMR;
As a state-of-the-art image conversion model, the Cycle-GAN network can be trained without requiring paired CT and MR data from the same patient. The image conversion Cycle-GAN network model used in the present invention is shown in fig. 6a and 6b. Fig. 6a depicts the forward conversion (CT-to-MR) and backward conversion (MR-to-CT) of CT-modality and MR-modality images, where the solid lines represent the forward conversion (from rCT-1 to tMR-1) and backward conversion (from tMR-1 to tCT-1) processes of the original CT-modality image, and the dashed lines represent the forward conversion (from rMR-2 to tCT-2) and reverse conversion (from tCT-2 to tMR-2) processes of the real MR-modality image. The Cycle-GAN image conversion network consists of two generators G_MR and G_CT and two discriminators D_CT and D_MR. The generator G_MR converts images from the CT modality to the MR modality (CT-to-MR); for example, G_MR takes rCT-1 as input and outputs the generated image tMR-1. The generator G_CT converts images from the MR modality to the CT modality (MR-to-CT); for example, G_CT takes rMR-2 as input and outputs the generated image tCT-2. The discriminator D_CT distinguishes real CT-modality images from the generated images converted by generator G_CT, e.g. distinguishing the real CT image rCT-3 from the generated image tCT-2. Likewise, the discriminator D_MR distinguishes real MR-modality images from the generated images converted by generator G_MR, e.g. distinguishing the real image rMR-3 from the generated image tMR-1. FIG. 6b shows the identity mapping loss constraint of the improved Cycle-GAN network of the present invention on image conversion within the same modality.
The loss function of the improved Cycle-GAN network in the invention comprises four parts: (1) the adversarial losses L_adv(D_CT) and L_adv(D_MR) from the discriminators. The adversarial loss penalizes the difference between the data distributions of the generated images and the real images of the target modality, so that the images converted by a generator have a data distribution highly similar to that of the target modality and are difficult for the discriminator to distinguish; (2) the cycle-consistency loss L_cyc, a constraint on the invertibility of the transformations of the two generators G_CT and G_MR: an image converted by generator G_CT and then converted back by generator G_MR should return to the original modality and remain highly similar to the original image in data distribution. For example, the original CT-modality image rCT-1 is converted to tMR-1 by generator G_MR, and generator G_CT converts the image tMR-1 back to the original CT modality to obtain tCT-1, which has the same data distribution as the original image rCT-1; (3) the structural consistency loss L_MIND, which constrains the structural similarity between the original image and the generated image, the purpose being to ensure that the image converted by a generator retains structural features highly consistent with the original image. For example, the original image rCT-1 in the figure is converted to tMR-1 by generator G_MR; under the structural consistency loss L_MIND, rCT-1 and tMR-1 have a high degree of structural similarity; (4) the identity mapping loss L_identity, which, as shown in FIG. 6b, imposes a normative constraint on the converted images generated within the same modality: under the training constraint of the identity mapping loss L_identity, image conversion within the same modality should leave the image unchanged. For example, the tMR obtained by converting the MR-modality image rMR in FIG. 6b with generator G_MR should be the same as the original image.
The structural consistency loss L_MIND uses MIND to measure the structural similarity between an original-modality image and the target-modality image obtained after conversion by a generator, such as the structural similarity between rCT-1 and tMR-1; MIND describes the local structural features around each pixel. L_MIND is highly robust to the obvious differences between modalities and is used to constrain the structural consistency between the generated image and the original image. L_MIND guides network training to continuously reduce the MIND loss between the generated image G_CT(I_rMR) or G_MR(I_rCT) and the image I_rMR or I_rCT, thereby strengthening the structural similarity between the images before and after conversion.
The structural consistency loss L_MIND used in the present invention is defined as equation (1):

L_MIND = (1 / (N_MR·|R|)) Σ_x ||M(I_rMR)(x) − M(G_CT(I_rMR))(x)||_1 + (1 / (N_CT·|R|)) Σ_x ||M(I_rCT)(x) − M(G_MR(I_rCT))(x)||_1    (1)

where M denotes MIND, I_rMR denotes the reference image of the MR modality, I_rCT denotes the floating image of the CT modality, N_MR and N_CT denote the numbers of voxels in images I_rMR and I_rCT respectively, and R denotes the non-local region around voxel x; image G_CT(I_rMR) refers to the generated image obtained by converting image I_rMR with generator G_CT (also denoted tCT), and image G_MR(I_rCT) refers to the generated image obtained by converting image I_rCT with generator G_MR (also denoted tMR).
In addition, the invention imposes a normative constraint on the converted images generated within the same modality through the identity mapping loss L_identity of the Cycle-GAN network, where the identity mapping loss L_identity is shown in equation (2):

L_identity = ||G_MR(I_MR) − I_MR||_1 + ||G_CT(I_CT) − I_CT||_1    (2)

where G_MR(I_MR) denotes the MR-modality generated image obtained after converting the MR-modality image I_MR with generator G_MR, and G_CT(I_CT) denotes the CT-modality generated image obtained after converting the CT-modality image I_CT with generator G_CT. L_identity computes the L1 distance between the generated image and the real image within the same modality, i.e. the sum of the L1 distances between G_MR(I_MR) and I_MR and between G_CT(I_CT) and I_CT. Specifically, under the constraint of the identity mapping loss L_identity, image conversion within the same modality should remain unchanged, i.e. G_MR(I_MR) ≈ I_MR and G_CT(I_CT) ≈ I_CT. The identity mapping loss L_identity prevents a generator from erroneously converting an image already in the target modality to another modality.
In conclusion, the total loss L of the improved Cycle-GAN network in the invention is the weighted sum of the adversarial losses L_adv(D_CT) and L_adv(D_MR), the cycle-consistency loss L_cyc, the identity mapping loss L_identity and the structural consistency loss L_MIND, defined as shown in equation (3):

L = L_adv(D_CT) + L_adv(D_MR) + λ_cyc·L_cyc + λ_identity·L_identity + λ_MIND·L_MIND    (3)

where λ_cyc, λ_identity and λ_MIND represent the relative importance of the cycle-consistency loss L_cyc, the identity mapping loss L_identity and the structural consistency loss L_MIND respectively.
In this step, the improved Cycle-GAN network is adopted to convert the original CT-modality image rCT, as shown in fig. 7a and 7b, where fig. 7a is the original CT modality image and fig. 7b is the MR image generated with the improved Cycle-GAN network. Compared with fig. 7a, fig. 7b preserves the local texture features of the image; therefore, from the visualization results of fig. 7a and 7b, the addition of the structural consistency loss L_MIND effectively strengthens the structural similarity between the generated image tMR and the original image rCT and improves the fidelity of organ boundaries.
The training loss of the existing Cycle-GAN network includes only two terms: the adversarial losses L_adv(D_CT) and L_adv(D_MR) given by the discriminators, and the cycle-consistency loss L_cyc. The adversarial loss guides the network to learn the mapping relationships between images of different modalities, and the cycle-consistency loss constrains the reversibility of the mapping transformations. However, the inventors found that it is difficult to train a robust cross-modal medical image conversion network relying on these two loss function constraints alone, because the cycle-consistency loss is not enough to guarantee the structural similarity between the generated image and the original image (as shown by the comparison between fig. 3a and fig. 3b); moreover, the existing Cycle-GAN network does not impose a normative constraint on the identity mapping of input images within the same modality, and is likely to erroneously convert an input image already in the target domain to another domain. Therefore, this step improves the Cycle-GAN network by introducing two additional loss functions: the structural consistency loss L_MIND and the identity mapping loss L_identity. A total of four losses are used to constrain the training of the Cycle-GAN network, thereby ensuring the structural similarity between the generated image and the original image (as shown by the comparison of fig. 7a and 7b) and avoiding the erroneous conversion of input images already in the target domain to another domain.
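Assembling the four terms, the generator-side objective of equation (3) can be sketched by composing the helper functions from the earlier sketches (cycle_gan_generator_loss from the background section, identity_mapping_loss and structural_consistency_loss from the summary section); the weights are illustrative assumptions:

```python
def improved_cyclegan_loss(G_mr, G_ct, D_mr, D_ct, real_ct, real_mr, mind,
                           lam_cyc=10.0, lam_id=5.0, lam_mind=1.0):
    # Equation (3): original adversarial + cycle terms plus the two new
    # constraints; `mind` is the assumed MIND feature extractor.
    adv_and_cyc = cycle_gan_generator_loss(G_mr, G_ct, D_ct, D_mr,
                                           real_ct, real_mr, lam_cyc=lam_cyc)
    ident = identity_mapping_loss(G_ct, G_mr, real_ct, real_mr)
    mind_term = structural_consistency_loss(mind, real_ct, real_mr, G_mr, G_ct)
    return adv_and_cyc + lam_id * ident + lam_mind * mind_term
```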
A3: cross-modality streaming network with original CT modality floating image rCT and MR modality reference image rMR as inputs (i.e., the inputs are cross-modality image pairs (rCT, rMR)), through UNet structural networkingOutput deformation field by learning deformable mapping between input image pair
Figure BDA0002575541000000131
Wherein the deformation field
Figure BDA0002575541000000132
I.e. a deformable mapping relationship representing the input cross-modal image pair (rCT, rMR);
in this embodiment, a UNet network structure is adopted for the cross-modal flow subnet UNet _ o. As shown in FIG. 8, the original CT modality image (rCT) and the MR modality image (rMR) are referred to as floating images I, respectivelymAnd a reference picture IfGrayscale float image I with channel number 1mAnd the number of channels is 1fIs shown bym and IfAnd splicing according to the channel direction to obtain three-dimensional volume images of the two channels as input images. The UNet network adopts an encoder-decoder structure, 3D convolution with the step size of 2 is adopted in an encoder part to reduce the spatial resolution of an input image, and a 3D up-sampling layer is adopted in a decoder to restore the spatial resolution of the image. Using a jump connection between convolutional layers of an encoder and a decoder to fuse shallow features and deep features; the number of channels per convolutional layer output signature is shown as the number at the top of the rectangular convolutional layer in fig. 8. Learning deformable transformation parameters among cross-modal input images through a three-dimensional depth convolution network, and outputting to obtain a 3-channel deformation field
Figure BDA0002575541000000133
At this step, the raw cross-modality flow incorporates the raw image rCT into a cross-modality registration framework so that the model can estimate the deformation field based on the detail texture features provided in rCT
Figure BDA0002575541000000134
The introduction of the raw information assists the model in learning a more realistic deformable transformation, which may reduce the interfering effects of the artificial features in the generated image tMR on the registration.
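A minimal 3D UNet-style flow estimator in the spirit of UNet_o / UNet_s might look as follows (channel widths, depth and activation are illustrative assumptions, not the exact configuration of fig. 8):

```python
import torch
import torch.nn as nn

class FlowUNet3D(nn.Module):
    """Minimal 3D encoder-decoder flow estimator with skip connections."""
    def __init__(self, in_ch=2, base=16):
        super().__init__()
        self.enc1 = nn.Conv3d(in_ch, base, 3, stride=2, padding=1)     # downsample
        self.enc2 = nn.Conv3d(base, base * 2, 3, stride=2, padding=1)  # downsample
        self.dec2 = nn.Conv3d(base * 2, base * 2, 3, padding=1)
        self.dec1 = nn.Conv3d(base * 2 + base, base, 3, padding=1)     # skip from enc1
        self.flow = nn.Conv3d(base + in_ch, 3, 3, padding=1)           # 3-channel field
        self.up = nn.Upsample(scale_factor=2, mode='trilinear', align_corners=True)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, moving, fixed):
        x = torch.cat([moving, fixed], dim=1)   # concatenate along channel direction
        e1 = self.act(self.enc1(x))
        e2 = self.act(self.enc2(e1))
        d2 = self.act(self.dec2(e2))
        d1 = self.act(self.dec1(torch.cat([self.up(d2), e1], dim=1)))
        return self.flow(torch.cat([self.up(d1), x], dim=1))
```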
A4: single mode streaming network using the previously improved CycOutput of le-GAN network to generate image tMR and MR modality reference image rMR as input (i.e. the input is a single modality image pair (tMR, rMR)), learning a deformable mapping between the input image pair using the same UNet network as across the modal flow, outputting a deformation field
Figure BDA0002575541000000135
Wherein the deformation field
Figure BDA0002575541000000136
I.e. a deformable mapping relationship representing the input pair of monomodal images (tMR, rMR);
in the present embodiment, the single-mode streaming sub-network UNet _ s also employs the same UNet network architecture as the cross-mode streaming sub-network UNet _ o as shown in fig. 8. The only difference is that UNet _ o is a cross-modal input, while UNet _ s is a single-modal input. The network may convert the original CT modality image (rCT) to an MR modality image (tMR) by image conversion. The single-mode streaming network UNet _ s inputs the generated tMR image and the rMR image into the network as a floating image and a reference image respectively, learns the deformable mapping between the single-mode input images through a three-dimensional depth convolution network, and finally outputs a 3-channel deformation field
Figure BDA0002575541000000137
In the step, the synthesized single-mode flow can learn more texture information in the single-mode image, and effectively inhibits the voxel drift phenomenon caused by the cross-mode flow.
A5: deformation field fusion network deformation field estimated with the first two streams (cross-modal and single-modal)
Figure BDA0002575541000000141
And
Figure BDA0002575541000000142
for inputting, the convolution network is adopted to carry out mixed superposition on the two deformation fields, and the final deformation field is output
Figure BDA0002575541000000143
Wherein the cross-modal and single-modal streaming subnetworks estimate the deformation field based on the cross-modal input (rCT and rMR) and the single-modal input (tMR and rMR), respectively
Figure BDA0002575541000000144
And
Figure BDA0002575541000000145
the deformation field in this step
Figure BDA0002575541000000146
And
Figure BDA0002575541000000147
performing mixed superposition, adopting a convolution neural network with convolution kernel size of 3 multiplied by 3 to effectively fuse two deformation fields, and outputting a final deformation field
Figure BDA0002575541000000148
Wherein the 3D volume deformation field
Figure BDA0002575541000000149
Has the following advantages
Figure BDA00025755410000001410
And
Figure BDA00025755410000001411
the dimensions are all 3 channels.
A6: spatial transformation network based on the final deformation field
Figure BDA00025755410000001412
The original CT mode floating image rCT is spatially warped by
Figure BDA00025755410000001413
Representing, obtaining a transformed floating image (moved CT);
network convergence based on deformation fieldTo the final deformation field
Figure BDA00025755410000001414
In this step, the floating image rCT is spatially warped by means of a Spatial Transform Network (STN). In this embodiment, the STN comprises a spatial grid generator and a sampler, and the deformation field can be predicted according to the grid
Figure BDA00025755410000001415
A sampling grid is generated and then spatially warped rCT by the sampler.
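The sampler of such an STN can be realized with grid_sample; the following sketch (assuming the displacement field is expressed in voxels, PyTorch ≥ 1.10) builds the sampling grid from the predicted field and warps the volume:

```python
import torch
import torch.nn.functional as F

def warp3d(image, flow):
    """Warp a volume [B,1,D,H,W] with a displacement field [B,3,D,H,W]."""
    B, _, D, H, W = image.shape
    # Identity sampling grid in voxel coordinates (z, y, x order).
    zz, yy, xx = torch.meshgrid(torch.arange(D), torch.arange(H), torch.arange(W),
                                indexing='ij')
    grid = torch.stack([zz, yy, xx]).float().to(image.device)   # [3,D,H,W]
    coords = grid.unsqueeze(0) + flow                            # displaced coordinates
    # Normalize each axis to [-1, 1] as grid_sample expects.
    for i, size in enumerate([D, H, W]):
        coords[:, i] = 2.0 * coords[:, i] / (size - 1) - 1.0
    # Reorder to [B,D,H,W,3] with the last dimension in (x, y, z) order.
    coords = coords.permute(0, 2, 3, 4, 1)[..., [2, 1, 0]]
    return F.grid_sample(image, coords, align_corners=True)
```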
A7: calculating the training loss of the network, comprising two parts: loss of image similarity between the transformed floating image and the reference image; smoothing of the final deformation field
Figure BDA00025755410000001416
Loss of regularization. And repeating the steps A2-A7 to carry out supervised training on the dual-stream cross-mode image registration network by taking a minimum loss function as a target until the training is finished and the step A8 can be carried out to directly register the output registration image by using the network.
This embodiment provides a dual-stream cross-modal image registration network comprising a cross-modal stream sub-network, a single-modal stream sub-network, a deformation field fusion network and a spatial transformation network (STN). The training of the dual-stream cross-modal image registration network is similar to multi-adversarial training: the cross-modal stream and the single-modal stream are mutually independent yet mutually constrained, and the whole unsupervised registration network is optimized cooperatively. In the design of the optimization objective, the loss function of the network includes two terms: the image similarity loss L_sim and the regularization loss L_smooth. The similarity loss L_sim describes the similarity between the transformed floating image rCT ∘ φ_fused and the reference image rMR; in this embodiment, the structural similarity index, which is independent of image brightness and contrast, is used to measure image similarity. Here rCT ∘ φ_fused denotes applying the final deformation field φ_fused estimated by the registration network to the original floating image rCT, i.e. spatially warping rCT to obtain the transformed floating image. In addition, the regularization loss L_smooth imposes a smoothness constraint on the deformation field φ_fused estimated by the network; in this embodiment, the L2 norm is used to regularize the gradient of the final deformation field φ_fused.
In summary, the total loss function L_total of the dual-stream cross-modal image registration network proposed in this embodiment is the weighted sum of the image similarity loss L_sim and the regularization loss L_smooth. The total loss L_total is defined as shown in equation (4):

L_total = L_sim(I_rMR, I_rCT ∘ φ_fused) + λ·L_smooth(φ_fused)    (4)

where L_sim denotes the image similarity loss between the transformed floating image I_rCT ∘ φ_fused and the MR-modality reference image I_rMR, and L_smooth denotes the regularization loss constraining the smoothness of the final deformation field φ_fused. I_rMR denotes the reference image of the MR modality, I_rCT denotes the floating image of the original CT modality, φ_fused denotes the final deformation field output by the dual-stream cross-modal registration network, I_rCT ∘ φ_fused denotes the transformed floating image obtained by warping the original CT-modality floating image I_rCT according to the final deformation field φ_fused, and λ is the regularization coefficient.
By minimizing the total loss function L_total, this embodiment simultaneously maximizes the similarity between the transformed floating image and the reference image (i.e. minimizes the image similarity loss L_sim) and keeps the deformation field smooth (i.e. minimizes the regularization loss L_smooth); the dual-stream cross-modal image registration network is trained under the supervision of minimizing the total loss function L_total.
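Combining the pieces above, one training iteration of step A7 could be sketched as follows, reusing the register(...) forward pass and the smoothness_loss(...) penalty from the earlier sketches; the SSIM-based sim_loss and the optimizer are assumed given:

```python
def train_step(moving_ct, fixed_mr, nets, optimizer, sim_loss, lam=1.0):
    # Forward pass A2-A6: translate, estimate both fields, fuse, warp.
    # `nets` is the 5-tuple (translator, flow_cross, flow_mono, fusion, stn).
    warped, phi_fused = register(moving_ct, fixed_mr, *nets)
    # Equation (4): image dissimilarity plus smoothness regularization.
    loss = sim_loss(warped, fixed_mr) + lam * smoothness_loss(phi_fused)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss.detach())
```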
In summary, the dual-stream cross-modal registration technique proposed in this embodiment lets the network estimate the final deformation field using the information of the original image and of the generated image at the same time. With this learning framework, robust and efficient registration of cross-modal medical images can be achieved in a completely unsupervised manner. Moreover, the dual-stream cross-modal image registration network effectively combines the original cross-modal stream and the synthetic single-modal stream, making full use of the information of the original floating image rCT, the reference image rMR and the generated image tMR. The dual-stream cross-modal image registration network therefore overcomes the problems of a high mismatching rate and distorted deformation fields that arise when conversion-based registration techniques introduce unreal features into the generated image.
A8: inputting the floating image rCT of the CT mode and the reference image rMR of the MR mode to be registered into the trained double-current cross-mode image registration network to obtain a transformed floating image
Figure BDA0002575541000000158
I.e. the registered image.
In this embodiment, given a floating image rCT of any CT modality and a reference image rMR of an MR modality, rCT converted to an MR type image tMR using a modified Cycle-GAN network, then original cross-modality flow is used and a single mode is synthesized Estimation of two deformation fields separately for the current
Figure BDA0002575541000000159
And
Figure BDA00025755410000001510
fusion through 3D convolutional networks
Figure BDA00025755410000001511
And
Figure BDA00025755410000001512
the final deformation field is obtained
Figure BDA00025755410000001513
The warping transformation of floating image rCT is implemented by means of a Spatial Transformation Network (STN). The goal of the entire unsupervised cross-modal registration network is to maximize the transformed floating images
Figure BDA00025755410000001514
And the reference image rMR.
The embodiment of the invention provides a novel dual-stream cross-modal medical image registration technique based on adversarial learning, which realizes CT-MR cross-modal registration under unsupervised conditions, makes up for the shortcomings of existing image-conversion-based registration techniques, and improves the accuracy and robustness of unsupervised cross-modal medical image registration.
The dual-stream cross-modal registration network provided by the embodiment of the invention mainly comprises two parts: 1. a cross-modal image conversion network based on the improved Cycle-GAN, which converts the CT-modality floating image rCT into the MR-modality generated image tMR; 2. a cross-modal image registration network based on dual-stream registration-field fusion, which is divided into four parts: the cross-modal stream sub-network, the single-modal stream sub-network, the deformation field fusion network and the spatial transformation network (STN).
In the invention, the cross-modal image conversion model Cycle-GAN is improved by adding the structural consistency loss and the identity mapping loss to the network loss function, which significantly strengthens the structural similarity between the generated image and the original image. To quantitatively evaluate the performance of the improved Cycle-GAN model, the structural similarity index (SSIM) and the peak signal-to-noise ratio (PSNR) were used to measure the quality of MR images generated from CT images on two datasets (the Pig Ex-vivo Kidney CT-MR dataset and the Abdomen (ABD) CT-MR dataset). The SSIM index measures the structural similarity between the images before and after cross-modal conversion, while the PSNR index evaluates the quality of the generated image relative to the original image; higher values of both indices are better. As shown in Table 1, the improved Cycle-GAN network proposed by the invention performs better than the existing original Cycle-GAN network.
TABLE 1 Cross-modal image conversion experiment result comparison
[Table 1 appears as an image in the original document; its values are not reproduced here.]
Under unsupervised conditions, the invention provides a dual-stream cross-modal registration framework: the cross-modal stream introduces the texture features of the original image and weakens the interference of unreal artificial features in the generated image on registration, while the single-modal stream effectively suppresses the voxel drift effect caused by the cross-modal stream. The original cross-modal stream and the synthetic single-modal stream are optimized cooperatively, estimating a more realistic deformation field based on the original CT image and the generated MR image respectively. A 3D convolutional network acting as the fusion module automatically learns how to better fuse the two deformation fields, so as to obtain better registration performance.
In the following, the Dice coefficient and the target registration error (TRE) are used to evaluate the performance of different cross-modal registration models. The Dice coefficient measures the degree of overlap between the reference image and the floating image after transformation by the STN module; higher Dice values are better. TRE is an index dedicated to measuring the accuracy of a registration algorithm: it represents the distance (in mm) between corresponding target points on the registered image and the reference image, and lower TRE values indicate better registration performance.
The evaluation results on two clinical datasets (shown in Table 2) demonstrate the effectiveness of the invention: compared with a traditional cross-modal registration algorithm (VoxelMorph + MIND) and another deep-learning-based cross-modal medical image registration technique (M2U), the dual-stream cross-modal medical image registration technique provided by the invention is significantly superior in registration accuracy.
TABLE 2 Cross-modal image registration experiment result comparison
(Table 2 is reproduced only as an image in the original publication; its Dice and TRE values are not recoverable from the text.)
Fig. 9 is a schematic diagram of the hardware structure of a cross-modality medical image registration apparatus according to another preferred embodiment of the invention. The apparatus may include a processor 901 and a readable storage medium 902 storing executable instructions. The processor 901 and the readable storage medium 902 may communicate via a system bus 903. By reading and executing the executable instructions in the readable storage medium 902, the processor 901 may perform the cross-modality medical image registration method described above.
The readable storage medium 902 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the readable storage medium may be: non-volatile memory, flash memory, a storage drive (e.g., a hard drive), a solid state disk, any type of storage disc (e.g., a compact disc, a DVD, etc.), or a similar storage medium, or a combination thereof.
Embodiments of the subject matter and the functional operations described in this specification can be implemented in: digital electronic circuitry, tangibly embodied computer software or firmware, computer hardware including the structures disclosed in this specification and their structural equivalents, or a combination of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to suitable receiver apparatus for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Computers suitable for executing computer programs include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., an internal hard disk or a removable disk), magneto-optical disks, and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
The foregoing is a further detailed description of the invention in connection with specific preferred embodiments, and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several equivalent substitutions or obvious modifications can be made without departing from the spirit of the invention, and all such substitutions and modifications shall be deemed to fall within the protection scope of the invention.

Claims (10)

1. A cross-modality medical image registration method is characterized by comprising the following steps:
S1: providing a training set comprising a floating image of a first modality and a reference image of a second modality;
S2: inputting the floating image into an image conversion network to convert the floating image from the first modality to the second modality and output a converted image of the second modality;
S3: inputting the floating image and the reference image into a cross-modal flow sub-network and outputting a first deformation field;
S4: inputting the converted image and the reference image into a single-modal flow sub-network and outputting a second deformation field;
S5: inputting the first deformation field and the second deformation field into a deformation field fusion network to fuse the first deformation field and the second deformation field and output a final deformation field;
S6: inputting the floating image and the final deformation field into a spatial transformation network to obtain a floating image warped by the final deformation field;
S7: comparing the transformed floating image with the reference image to obtain a first total loss function, and repeatedly executing steps S2-S7 to train the cross-modal flow sub-network, the single-modal flow sub-network, the deformation field fusion network, and the spatial transformation network with the aim of minimizing the first total loss function; when training is finished, executing step S8;
S8: performing steps S2-S6 on a floating image of the first modality and a reference image of the second modality to be registered to obtain a transformed floating image, i.e., the registered image.
2. The cross-modality medical image registration method according to claim 1, wherein the image conversion network employs an improved Cycle-GAN network, and the second total loss function of the improved Cycle-GAN network is:

L_total = L_adv(D_1) + L_adv(D_2) + λ_cyc·L_cyc + λ_identity·L_identity + λ_MIND·L_MIND

wherein L_adv(D_1) and L_adv(D_2) are the adversarial losses of the two discriminators D_1, D_2 of the improved Cycle-GAN network; the cycle consistency loss L_cyc is a constraint on the reversibility of the transformations of the two generators G_1, G_2 of the improved Cycle-GAN network; the identity mapping loss L_identity is a regularizing constraint obtained by generating a converted image within the same modality; the structural consistency loss L_MIND is a constraint on the structural similarity between the original image and the generated image; and λ_cyc, λ_identity and λ_MIND denote the relative weights of L_cyc, L_identity and L_MIND respectively;
step S7 further comprises training the improved Cycle-GAN network with the aim of minimizing the second total loss function.
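A one-line sketch of how the second total loss could be assembled from its terms follows; the weighting values shown are illustrative, not the patent's.

```python
# Illustrative only: assembling the second total loss; the weights are made up.
def cyclegan_total_loss(adv_d1, adv_d2, l_cyc, l_identity, l_mind,
                        lam_cyc=10.0, lam_identity=5.0, lam_mind=1.0):
    """Weighted sum of the adversarial, cycle, identity and MIND terms."""
    return adv_d1 + adv_d2 + lam_cyc * l_cyc + lam_identity * l_identity + lam_mind * l_mind
```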
3. The cross-modality medical image registration method of claim 2, wherein the structural consistency loss L_MIND is:

L_MIND = 1/(N_1·|R|) · Σ_x |M(I_r1)(x) − M(G_2(I_r1))(x)| + 1/(N_2·|R|) · Σ_x |M(I_r2)(x) − M(G_1(I_r2))(x)|

wherein M denotes the modality-independent neighbourhood descriptor (MIND); I_r1 denotes the floating image of the first modality and I_r2 the reference image of the second modality; N_1 and N_2 denote the numbers of voxels in images I_r1 and I_r2 respectively; R denotes a non-local region around voxel x; image G_1(I_r2) is the generated image obtained by converting image I_r2 with generator G_1; and image G_2(I_r1) is the generated image obtained by converting image I_r1 with generator G_2.
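The full MIND formulation is not reproduced in the patent text; the following is a simplified, illustrative PyTorch reduction — six-neighbour patch self-similarity with a variance-based normalisation — intended only to make the structure of L_MIND concrete, not to reproduce the patent's exact descriptor.

```python
import torch
import torch.nn.functional as F

def mind_descriptor(img, eps=1e-6):
    """Simplified MIND-style self-similarity descriptor for a (B,1,D,H,W)
    volume: patch SSD to the six face neighbours, variance-normalised."""
    shifts = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    ssd = []
    for s in shifts:
        shifted = torch.roll(img, shifts=s, dims=(2, 3, 4))
        # patch-wise SSD via 3x3x3 box filtering of the squared difference
        ssd.append(F.avg_pool3d((img - shifted) ** 2, 3, stride=1, padding=1))
    d = torch.cat(ssd, dim=1)                      # (B,6,D,H,W)
    v = d.mean(dim=1, keepdim=True) + eps          # local variance estimate
    mind = torch.exp(-d / v)
    return mind / (mind.amax(dim=1, keepdim=True) + eps)

def mind_loss(original, generated):
    """L1 distance between the descriptors of the original image and the
    cross-modality generated image."""
    return (mind_descriptor(original) - mind_descriptor(generated)).abs().mean()
```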
4. The cross-modality medical image registration method of claim 2, wherein the identity mapping loss L_identity is:

L_identity = ||G_1(I_1) − I_1||_1 + ||G_2(I_2) − I_2||_1

wherein I_1 denotes an image of the first modality and I_2 an image of the second modality; image G_1(I_1) is the generated image obtained by converting the first-modality image I_1 with generator G_1, and image G_2(I_2) is the generated image obtained by converting the second-modality image I_2 with generator G_2.
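This term is direct to implement; a minimal PyTorch sketch, assuming g1 and g2 are the two generators, follows.

```python
import torch.nn.functional as F

def identity_loss(g1, g2, i1, i2):
    """Identity mapping loss of claim 4: each generator should leave an image
    that is already in its output modality unchanged (L1 norm)."""
    return F.l1_loss(g1(i1), i1) + F.l1_loss(g2(i2), i2)
```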
5. The cross-modality medical image registration method of claim 1, wherein the cross-modal flow sub-network employs a UNet network structure, the UNet network structure comprising an encoder and a decoder, with skip connections between the convolutional layers of the encoder and the decoder.
6. The cross-modality medical image registration method of claim 1, wherein the single-modal flow sub-network employs a UNet network structure, the UNet network structure comprising an encoder and a decoder, with skip connections between the convolutional layers of the encoder and the decoder.
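A minimal 3D UNet sketch with a single encoder-decoder level and one skip connection is shown below; the channel widths and depth are illustrative only, since the patent does not disclose the exact configuration of its UNet sub-networks.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv3d(c_in, c_out, 3, padding=1), nn.LeakyReLU(0.2))

class TinyUNet3D(nn.Module):
    """One-level 3D UNet sketch: two input volumes concatenated to 2 channels,
    a 3-channel deformation field out, with an encoder-decoder skip connection."""
    def __init__(self, c_in=2, c_out=3, w=16):
        super().__init__()
        self.enc1 = block(c_in, w)
        self.down = nn.MaxPool3d(2)
        self.enc2 = block(w, 2 * w)
        self.up = nn.Upsample(scale_factor=2, mode='trilinear', align_corners=False)
        self.dec1 = block(3 * w, w)          # 2w upsampled channels + w skipped channels
        self.head = nn.Conv3d(w, c_out, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.down(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d1)
```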
7. The cross-modality medical image registration method of claim 1, wherein the deformation field fusion network is a 3D convolutional neural network.
8. The cross-modality medical image registration method of claim 1, wherein the spatial transformation network comprises a spatial grid generator that generates a sampling grid from the final deformation field and a sampler that spatially warps the floating image according to the sampling grid.
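A minimal sketch of such a spatial transformer, assuming the final deformation field is a (B,3,D,H,W) tensor of voxel displacements, is given below; it builds the sampling grid and resamples with PyTorch's grid_sample.

```python
import torch
import torch.nn.functional as F

def spatial_transform(moving, flow):
    """Grid generator + sampler: displace an identity grid by `flow` (voxel
    displacements, channel order z,y,x) and resample `moving` bilinearly."""
    B, _, D, H, W = moving.shape
    zz, yy, xx = torch.meshgrid(torch.arange(D), torch.arange(H),
                                torch.arange(W), indexing='ij')
    grid = torch.stack([zz, yy, xx]).float().to(moving.device)  # identity grid (3,D,H,W)
    coords = grid.unsqueeze(0) + flow                           # displaced sampling grid
    # normalise each axis to [-1, 1] as required by grid_sample
    for i, size in enumerate([D, H, W]):
        coords[:, i] = 2.0 * coords[:, i] / (size - 1) - 1.0
    # grid_sample expects the grid as (B,D,H,W,3) in x,y,z order
    grid_xyz = coords.permute(0, 2, 3, 4, 1).flip(-1)
    return F.grid_sample(moving, grid_xyz, mode='bilinear', align_corners=True)
```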
9. The cross-modality medical image registration method of claim 1, wherein the first total loss function is:
L(I_r1, I_r2, φ̂) = L_sim(I_r2, I_r1 ∘ φ̂) + λ·L_smooth(φ̂)

wherein φ̂ denotes the final deformation field; I_r1 denotes the floating image of the first modality and I_r2 the reference image of the second modality; I_r1 ∘ φ̂ denotes the floating image after warping by the final deformation field; the image similarity loss L_sim(I_r2, I_r1 ∘ φ̂) measures the similarity between the transformed floating image I_r1 ∘ φ̂ and the image I_r2; L_smooth(φ̂) is a regularization loss constraining the smoothness of the final deformation field φ̂; and λ is the regularization coefficient.
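A minimal sketch of this first total loss, with mean-squared error standing in for the unspecified image-similarity term and a squared-gradient penalty for the smoothness term, could read:

```python
import torch
import torch.nn.functional as F

def smoothness_loss(flow):
    """Regularisation term: mean squared spatial gradient of the field."""
    dz = flow[:, :, 1:, :, :] - flow[:, :, :-1, :, :]
    dy = flow[:, :, :, 1:, :] - flow[:, :, :, :-1, :]
    dx = flow[:, :, :, :, 1:] - flow[:, :, :, :, :-1]
    return dz.pow(2).mean() + dy.pow(2).mean() + dx.pow(2).mean()

def total_loss(warped, reference, flow, lam=1.0):
    """First total loss of claim 9; MSE is an illustrative stand-in for the
    image similarity term, and `lam` is the regularisation coefficient."""
    return F.mse_loss(warped, reference) + lam * smoothness_loss(flow)
```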
10. A cross-modality medical image registration apparatus comprising a processor and a readable storage medium storing executable instructions executable by the processor, wherein the executable instructions, when executed by the processor, cause the cross-modality medical image registration method of any one of claims 1 to 9 to be implemented.
CN202010652606.9A 2020-07-08 2020-07-08 Cross-modal medical image registration method and device Active CN111862174B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010652606.9A CN111862174B (en) 2020-07-08 2020-07-08 Cross-modal medical image registration method and device


Publications (2)

Publication Number Publication Date
CN111862174A true CN111862174A (en) 2020-10-30
CN111862174B CN111862174B (en) 2023-10-03

Family

ID=73153705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010652606.9A Active CN111862174B (en) 2020-07-08 2020-07-08 Cross-modal medical image registration method and device

Country Status (1)

Country Link
CN (1) CN111862174B (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140029812A1 (en) * 2012-07-30 2014-01-30 General Electric Company Methods and systems for determining a transformation function to automatically register different modality medical images
CN108711168A (en) * 2018-06-04 2018-10-26 中北大学 Non-rigid multimodal medical image registration method based on ZMLD Yu GC discrete optimizations
US20200184660A1 (en) * 2018-12-11 2020-06-11 Siemens Healthcare Gmbh Unsupervised deformable registration for multi-modal images
CN110021037A (en) * 2019-04-17 2019-07-16 南昌航空大学 A kind of image non-rigid registration method and system based on generation confrontation network
CN110838139A (en) * 2019-11-04 2020-02-25 上海联影智能医疗科技有限公司 Training method of image registration model, image registration method and computer equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Pan Meisen; Tang Jingtian; Yang Xiaoli: "Medical Image Registration Using PCA and PSNR", Infrared and Laser Engineering, vol. 40, no. 2, pages 355-364 *

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232362A (en) * 2020-11-04 2021-01-15 清华大学深圳国际研究生院 Cross-modal medical image registration method and computer-readable storage medium
CN112669327B (en) * 2020-12-25 2023-02-14 上海交通大学 Magnetic resonance image segmentation system and segmentation method thereof
CN112669327A (en) * 2020-12-25 2021-04-16 上海交通大学 Magnetic resonance image segmentation system and segmentation method thereof
CN112650886A (en) * 2020-12-28 2021-04-13 电子科技大学 Cross-modal video time retrieval method based on cross-modal dynamic convolution network
CN112650886B (en) * 2020-12-28 2022-08-02 电子科技大学 Cross-modal video time retrieval method based on cross-modal dynamic convolution network
CN112802072A (en) * 2021-02-23 2021-05-14 临沂大学 Medical image registration method and system based on counterstudy
CN112927280A (en) * 2021-03-11 2021-06-08 北京的卢深视科技有限公司 Method and device for acquiring depth image and monocular speckle structured light system
CN113012086A (en) * 2021-03-22 2021-06-22 上海应用技术大学 Cross-modal image synthesis method
CN113012086B (en) * 2021-03-22 2024-04-16 上海应用技术大学 Cross-modal image synthesis method
WO2022198526A1 (en) * 2021-03-24 2022-09-29 Nec Corporation Methods, devices and computer readable media for image processing
WO2022205500A1 (en) * 2021-03-31 2022-10-06 华中科技大学 Method for constructing registration model for non-rigid multimodal medical image, and application thereof
CN113012204B (en) * 2021-04-09 2024-01-16 福建自贸试验区厦门片区Manteia数据科技有限公司 Registration method, registration device, storage medium and processor for multi-mode image
CN113012204A (en) * 2021-04-09 2021-06-22 福建自贸试验区厦门片区Manteia数据科技有限公司 Multi-modal image registration method and device, storage medium and processor
CN113112534A (en) * 2021-04-20 2021-07-13 安徽大学 Three-dimensional biomedical image registration method based on iterative self-supervision
CN113112534B (en) * 2021-04-20 2022-10-18 安徽大学 Three-dimensional biomedical image registration method based on iterative self-supervision
WO2022247218A1 (en) * 2021-05-27 2022-12-01 广州柏视医疗科技有限公司 Image registration method based on automatic delineation
CN113344876A (en) * 2021-06-08 2021-09-03 安徽大学 Deformable registration method between CT and CBCT
CN113538533B (en) * 2021-06-22 2023-04-18 南方医科大学 Spine registration method, device and equipment and computer storage medium
CN113538533A (en) * 2021-06-22 2021-10-22 南方医科大学 Spine registration method, spine registration device, spine registration equipment and computer storage medium
CN113450397A (en) * 2021-06-25 2021-09-28 广州柏视医疗科技有限公司 Image deformation registration method based on deep learning
CN113450397B (en) * 2021-06-25 2022-04-01 广州柏视医疗科技有限公司 Image deformation registration method based on deep learning
CN113487656A (en) * 2021-07-26 2021-10-08 推想医疗科技股份有限公司 Image registration method and device, training method and device, control method and device
CN114119689A (en) * 2021-12-02 2022-03-01 厦门大学 Multi-modal medical image unsupervised registration method and system based on deep learning
CN114359356A (en) * 2021-12-28 2022-04-15 上海联影智能医疗科技有限公司 Training method of image registration model, image registration method, device and medium
CN114511599A (en) * 2022-01-20 2022-05-17 推想医疗科技股份有限公司 Model training method and device, medical image registration method and device
CN114511599B (en) * 2022-01-20 2022-09-20 推想医疗科技股份有限公司 Model training method and device, medical image registration method and device
WO2023173827A1 (en) * 2022-03-15 2023-09-21 腾讯科技(深圳)有限公司 Image generation method and apparatus, and device, storage medium and computer program product
CN114359360B (en) * 2022-03-17 2022-06-10 成都信息工程大学 Two-way consistency constraint medical image registration algorithm based on confrontation
CN114359360A (en) * 2022-03-17 2022-04-15 成都信息工程大学 Two-way consistency constraint medical image registration algorithm based on countermeasure
CN114387317A (en) * 2022-03-24 2022-04-22 真健康(北京)医疗科技有限公司 CT image and MRI three-dimensional image registration method and device
CN115375971A (en) * 2022-08-24 2022-11-22 北京医智影科技有限公司 Multi-modal medical image registration model training method, registration method, system and equipment
CN115830016B (en) * 2023-02-09 2023-04-14 真健康(北京)医疗科技有限公司 Medical image registration model training method and equipment
CN115830016A (en) * 2023-02-09 2023-03-21 真健康(北京)医疗科技有限公司 Medical image registration model training method and equipment
CN116402865A (en) * 2023-06-06 2023-07-07 之江实验室 Multi-mode image registration method, device and medium using diffusion model
CN116402865B (en) * 2023-06-06 2023-09-15 之江实验室 Multi-mode image registration method, device and medium using diffusion model

Also Published As

Publication number Publication date
CN111862174B (en) 2023-10-03

Similar Documents

Publication Publication Date Title
CN111862174A (en) Cross-modal medical image registration method and device
Yao et al. On improving bounding box representations for oriented object detection
CN111862175B (en) Cross-modal medical image registration method and device based on cyclic canonical training
Zhang et al. Text-guided neural image inpainting
CN110544239B (en) Multi-modal MRI conversion method, system and medium for generating countermeasure network based on conditions
CN111260741B (en) Three-dimensional ultrasonic simulation method and device by utilizing generated countermeasure network
CN110910351B (en) Ultrasound image modality migration and classification method and terminal based on generation countermeasure network
CN110503626B (en) CT image modality alignment method based on space-semantic significance constraint
CN111325750B (en) Medical image segmentation method based on multi-scale fusion U-shaped chain neural network
Raju et al. Deep implicit statistical shape models for 3d medical image delineation
Lu et al. A novel image registration approach via combining local features and geometric invariants
Fang et al. Reliable mutual distillation for medical image segmentation under imperfect annotations
CN111242953A (en) MR image segmentation method and device based on condition generation countermeasure network
Jiang et al. Unpaired cross-modality educed distillation (CMEDL) for medical image segmentation
Chen et al. MASS: Modality-collaborative semi-supervised segmentation by exploiting cross-modal consistency from unpaired CT and MRI images
CN113205567A (en) Method for synthesizing CT image by MRI image based on deep learning
CN117437420A (en) Cross-modal medical image segmentation method and system
Chen et al. Deep semi-supervised ultrasound image segmentation by using a shadow aware network with boundary refinement
Huang et al. Push the boundary of sam: A pseudo-label correction framework for medical segmentation
Miao et al. SC-SSL: Self-correcting Collaborative and Contrastive Co-training Model for Semi-Supervised Medical Image Segmentation
Zhang et al. Semisam: Exploring sam for enhancing semi-supervised medical image segmentation with extremely limited annotations
Liao et al. FisheyeEX: Polar outpainting for extending the FoV of fisheye lens
Alshehri et al. Self-Attention-Based Edge Computing Model for Synthesis Image to Text through Next-Generation AI Mechanism
Chong 3D reconstruction of laparoscope images with contrastive learning methods
Shao et al. Smudlp: Self-teaching multi-frame unsupervised endoscopic depth estimation with learnable patchmatch

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant