CN110599528A - Unsupervised three-dimensional medical image registration method and system based on neural network - Google Patents


Info

Publication number
CN110599528A
CN110599528A (application CN201910828807.7A; granted publication CN110599528B)
Authority
CN
China
Prior art keywords
image
layer
feature map
slice
sampling
Prior art date
Legal status
Granted
Application number
CN201910828807.7A
Other languages
Chinese (zh)
Other versions
CN110599528B (en)
Inventor
赵秀阳 (Zhao Xiuyang)
马英君 (Ma Yingjun)
Current Assignee
University of Jinan
Original Assignee
University of Jinan
Priority date
Filing date
Publication date
Application filed by University of Jinan
Priority to CN201910828807.7A
Publication of CN110599528A
Application granted
Publication of CN110599528B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an unsupervised three-dimensional medical image registration method and system based on a neural network, comprising the following steps: collecting images; preprocessing the acquired three-dimensional medical images; training a neural network to obtain a trained neural network model; and inputting the medical image to be registered into the trained neural network model for registration, to obtain and output a registered image. Based on the output deformation fields φ₀, φ₁, φ₂ and φ₃ and the corresponding deformed images I_M∘φ₀, I_M∘φ₁, I_M∘φ₂ and I_M∘φ₃, the invention uses a loss function to compute the similarity between the fixed image I_F and the deformed images, and performs back-propagation optimization on the neural network until the calculated loss value no longer decreases or the network training reaches a preset number of training iterations, at which point training is complete and the trained neural network model is obtained. The invention is used for realizing medical image registration.

Description

Unsupervised three-dimensional medical image registration method and system based on neural network
Technical Field
The invention relates to the field of medical image registration, in particular to an unsupervised three-dimensional medical image registration method and system based on a neural network, which are mainly used for registering three-dimensional human brain images.
Background
Medical image registration refers to seeking a spatial transformation, or a series of spatial transformations, for one medical image so that it is brought into spatial correspondence with the corresponding points of another medical image. This correspondence means that the same anatomical point on the body occupies the same spatial position in the two matched images. Generally, the image to be registered is called the floating image (Moving Image), and the transformation target image is called the fixed image or reference image (Fixed Image).
There is currently much open-source software available for medical image segmentation, registration, and so on. For example: FreeSurfer can be used to analyze and visualize structural and functional neuroimaging data from slices or time series, and performs well in skull stripping, B1 bias field correction, grey/white matter segmentation and morphological difference measurement; FSL, similar to FreeSurfer, allows comprehensive analysis of fMRI, MRI and DTI brain imaging data; the ITK software package can be used for the segmentation and registration of multi-dimensional images; NiftyReg implements rigid-body, affine and non-linear registration methods for NIfTI images, and also supports GPU operation; Elastix is ITK-based open-source software that includes common algorithms for medical image registration; and ANTs, one of the better current registration tools, can realize diffeomorphic deformable registration. The above registration tools are all based on conventional registration methods to fit the deformation of the image. In addition, many deep-learning-based registration methods have been proposed in succession, such as DIRNet, BIRNet and VoxelMorph, which use a neural network to obtain image transformation parameters and then use a transformation network to transform the floating image to obtain the registration result.
Although the above registration tools and methods all achieve good results in image registration, the following problems still exist:
(1) Some methods require manual labeling and supervision information, demand high registration expertise, and register relatively slowly. Marking the feature points and feature areas of a medical image, and acquiring image supervision information, must be done by professional medical imaging doctors, which is very difficult for registration workers without relevant medical experience. Meanwhile, the feature marks obtained by different doctors, or by the same doctor at different times, may differ; the manual marking process is time-consuming and labor-intensive; and the subjective judgment of the doctor has a great influence on the registration result.
(2) The registration accuracy is relatively low. Some methods proposed in recent years, although making great progress in registration results, still leave room for improvement in accuracy.
Therefore, the invention provides an unsupervised three-dimensional medical image registration method and system based on a neural network, which are used for solving the problems.
Disclosure of Invention
In view of the above disadvantages of the prior art, the present invention provides an unsupervised three-dimensional medical image registration method and system based on a neural network, which are used for realizing rapid registration of medical images and also for improving registration accuracy.
In a first aspect, the present invention provides an unsupervised three-dimensional medical image registration method based on a neural network, comprising the steps of:
L1, image acquisition: acquiring three-dimensional medical images from the public data sets OASIS and ADNI, and/or acquiring three-dimensional medical images from the DICOM interface of a CT, MRI or ultrasound imager;
L2, preprocessing the acquired three-dimensional medical images: including image segmentation, cropping, normalization and affine alignment, and selecting any one image from the affine-aligned images as the fixed image I_F, with the remaining images taken as floating images I_M; wherein the sizes of the cropped images are consistent;
L3, based on the fixed image I_F and floating images I_M obtained after preprocessing, training a neural network to obtain a trained neural network model;
L4, inputting the medical image to be registered into the trained neural network model for registration, to obtain and output a registered image of the medical image to be registered;
wherein, in step L3, training the neural network based on the fixed image I_F and floating images I_M obtained after preprocessing to obtain the trained neural network model comprises:
S1, inputting the fixed image I_F and floating images I_M obtained after preprocessing into the neural network through its input layer, wherein each set of input data comprises the fixed image I_F and one floating image I_M;
S2, down-sampling the fixed image I_F and floating image I_M provided at the input layer, and outputting a feature map;
the down-sampling comprises 3 down-sampling processes, a convolution calculation process with convolution kernel size of 3 multiplied by 3 and a LeakyReLU activation function calculation process after the 3 down-sampling processes; the 3 downsampling processes correspond to 3 downsampling process layers; the 3 down-sampling process layers are sequentially marked as a first down-sampling process layer, a second down-sampling process layer and a third down-sampling process layer according to the execution sequence of the down-sampling process; each downsampling process layer comprises a convolution layer with convolution kernel size of 3 multiplied by 3, a LeakyReLU activation function layer and a maximum pooling layer;
s3, respectively carrying out feature re-weighting on feature graphs output by LeakyReLU activation function layers in a first downsampling process layer, a second downsampling process layer and a third downsampling process layer corresponding to downsampling to obtain three weighted feature graphs, which are sequentially: a first weighted feature map, a second weighted feature map, and a third weighted feature map;
s4, performing 1 × 1 × 1 convolution on the feature map output in the step S2, and outputting a floating image IMTo a fixed picture IFDeformation field of
S5, inputting the feature map output in step S2 into an up-sampling path for up-sampling. The up-sampling path comprises 3 up-sampling process layers; each up-sampling process layer comprises an UpSampling layer and a convolutional layer with a 3×3×3 convolution kernel, with a LeakyReLU activation function layer after the 3×3×3 convolutional layer in each up-sampling process layer. The 3 up-sampling process layers correspond to the 3 up-sampling processes and are denoted, in order, the first, second and third up-sampling process layers;
The feature map output by the UpSampling layer of the first up-sampling process layer is fused with the third weighted feature map and used as input to the 3×3×3 convolutional layer of the first up-sampling process layer; the feature map output by the UpSampling layer of the second up-sampling process layer is fused with the second weighted feature map and used as input to the 3×3×3 convolutional layer of the second up-sampling process layer; and the feature map output by the UpSampling layer of the third up-sampling process layer is fused with the first weighted feature map and used as input to the 3×3×3 convolutional layer of the third up-sampling process layer;
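The fusion of an up-sampled feature map with its weighted skip feature map can be sketched in numpy. The patent does not specify the fusion operator; channel-wise concatenation, as in U-Net-style networks, is assumed here (element-wise addition would be the main alternative), and the feature-map sizes are hypothetical.

```python
import numpy as np

def fuse(upsampled, weighted_skip):
    """Channel-wise concatenation of two feature maps of shape (C, H, W, D).
    Concatenation is an assumed reading of "fused"; the spatial sizes of the
    two maps must already match, which the skip connections guarantee."""
    assert upsampled.shape[1:] == weighted_skip.shape[1:], "spatial sizes must match"
    return np.concatenate([upsampled, weighted_skip], axis=0)

up = np.zeros((16, 40, 48, 56))    # output of an UpSampling layer (hypothetical sizes)
skip = np.zeros((16, 40, 48, 56))  # matching weighted feature map from down-sampling
fused = fuse(up, skip)             # shape (32, 40, 48, 56), fed to the 3x3x3 convolution
```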
s6, respectively carrying out 1 multiplied by 1 convolution on the feature maps output by the first up-sampling process layer, the second up-sampling process layer and the third up-sampling process layer, and outputting the floating images I corresponding to the first up-sampling process layer, the second up-sampling process layer and the third up-sampling process layerMTo a fixed picture IFIn turn is a deformation fieldDeformation fieldAnd deformation field
S7, inputting the floating image I_M together with each of the output deformation fields φ₀, φ₁, φ₂ and φ₃ into a spatial transformation network, and correspondingly obtaining, through the spatial transformation of that network, the deformed images of the floating image I_M: in order, the deformed images I_M∘φ₀, I_M∘φ₁, I_M∘φ₂ and I_M∘φ₃;
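The warping performed by the spatial transformation network can be illustrated with a small numpy sketch. For brevity this version is 2D with bilinear interpolation rather than the 3D trilinear case, and it illustrates dense-displacement warping in general, not the patent's exact STN implementation.

```python
import numpy as np

def warp_2d(image, field):
    """Warp a 2D image by a dense displacement field with bilinear sampling.

    image: (H, W) array; field: (2, H, W) array of (dy, dx) displacements.
    Each output pixel samples the moving image at its displaced position;
    out-of-bounds positions are clamped to the image border.
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    py = np.clip(ys + field[0], 0.0, h - 1.0)
    px = np.clip(xs + field[1], 0.0, w - 1.0)
    y0 = np.floor(py).astype(int)
    x0 = np.floor(px).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy, wx = py - y0, px - x0
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

img = np.arange(16.0).reshape(4, 4)
# A zero displacement field is the identity transformation:
assert np.allclose(warp_2d(img, np.zeros((2, 4, 4))), img)
```

Because bilinear (or, in 3D, trilinear) sampling is differentiable in the displacement field, the loss computed on the warped image can be back-propagated through this step to the layers that predict the field.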
S8, based on the output deformation fields φ₀, φ₁, φ₂ and φ₃ and the obtained deformed images I_M∘φ₀, I_M∘φ₁, I_M∘φ₂ and I_M∘φ₃, computing with a loss function the similarity between the fixed image I_F and the deformed images, and performing back-propagation optimization on the neural network until the calculated loss value no longer decreases or the network training reaches the preset number of training iterations, at which point training is complete and the trained neural network model is obtained;
the computational expression of the loss function is:
in the formula (i), the first and second groups,represents the calculated loss function value, α and β are both constants, α + β is 1,is a regularization term, λ is a regularization control constant parameter,representing a predetermined image I fixed by saidFDown-sampling the obtained three-dimensional medical image; three-dimensional medical imageSequentially with the deformed imageEquivalent, three-dimensional medical image Are successively lower and are each smaller than the fixed image IFThe resolution of (a) of (b),representing said fixed image IFAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of similarity between;the same similarity metric function is used.
Further, the step of implementing feature re-weighting in step S3 comprises:
Step S31, denoting each feature map output in the down-sampling that is to be feature re-weighted as a feature map X, X ∈ R^(H×W×D); slicing the feature map X along its D dimension; and performing global average pooling on each slice x ∈ R^(H×W) obtained by slicing, to obtain a slice descriptor z for each slice x in the D dimension of the feature map X. The specific formula for each slice descriptor z is:

z = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} x(i, j),

where (i, j) denotes a pixel point on the slice x, and x(i, j) denotes the gray value of the slice x at pixel point (i, j);
step S32, obtaining a weight S of each slice X in the dimension D of the feature map X, where the calculation formula of the weight S of each slice X is as follows:
s=σ(δ(z)),
wherein σ represents a sigmoid activation function, δ is a ReLU activation function, and z is a slice descriptor of the slice x obtained in step S31;
step S33, correspondingly loading each weight S obtained in step S32 to the corresponding slice, to obtain each slice X corresponding weighted slice in the D dimension of the feature map XWherein the feature re-weighting calculation formula corresponding to each slice x is as follows:
in the formula, Fscale(x, s) represents the multiplication between a slice x and its corresponding weight s;
step S34, based on the feature map X obtained in step S33, the reweighed slice corresponding to each slice X in the D dimension thereofCorrespondingly obtaining the re-weighted feature map corresponding to the feature map X
Furthermore, the similarity measure function adopts a cross-correlation function.
Furthermore, the space transformation network adopts an STN space transformation network.
Further, the preprocessing also comprises data enhancement;
the data enhancement comprises the following steps: respectively performing bending transformation on each obtained floating image to obtain a bent transformed image corresponding to each obtained floating image; the resulting warped images are newly added floating images.
In a second aspect, the present invention provides an unsupervised three-dimensional medical image registration system based on a neural network, comprising:
an image acquisition unit, for acquiring three-dimensional medical images from the public data sets OASIS and ADNI, and/or acquiring three-dimensional medical images from the DICOM interface of a CT, MRI or ultrasound imager;
the image preprocessing unit is used for preprocessing the acquired three-dimensional medical image, comprises image segmentation, cutting, normalization processing and affine alignment, and selects any one image from the images subjected to affine alignment as a fixed image IFThe rest of the image is taken as a floating image IM(ii) a Wherein the size of the cut images is consistent;
a neural network training unit, for training a neural network based on the fixed image I_F and floating images I_M obtained after preprocessing, to obtain a trained neural network model;
the image registration unit is used for inputting the medical image to be registered into the trained neural network model for registration to obtain and output a registration image of the medical image to be registered;
wherein, the neural network training unit comprises:
an input module for preprocessing the obtained fixed image IFAnd a floating image IMInputting the neural network as an input layer of the neural network, wherein each set of input data comprises the fixed image IFAnd one said floating image IM
a down-sampling module, for down-sampling the fixed image I_F and floating image I_M provided at the input layer and outputting a feature map; the down-sampling comprises 3 down-sampling processes, followed by a convolution calculation with a 3×3×3 convolution kernel and a LeakyReLU activation function calculation; the 3 down-sampling processes correspond to 3 down-sampling process layers, denoted, in order of execution, the first, second and third down-sampling process layers; each down-sampling process layer comprises a convolutional layer with a 3×3×3 convolution kernel, a LeakyReLU activation function layer and a max-pooling layer;
the reweighting module is used for respectively performing characteristic reweighting on the characteristic graphs output by the LeakyReLU activation function layer in the first downsampling process layer, the second downsampling process layer and the third downsampling process layer corresponding to downsampling to obtain three weighted characteristic graphs, and the three weighted characteristic graphs are sequentially: a first weighted feature map, a second weighted feature map, and a third weighted feature map;
a first deformation field output module for performing 1 × 1 × 1 convolution on the feature map output by the down-sampling module to output a floating image IMTo a fixed picture IFDeformation field of
an up-sampling module, for inputting the feature map output by the down-sampling module into an up-sampling path for up-sampling. The up-sampling path comprises 3 up-sampling process layers; each comprises an UpSampling layer and a convolutional layer with a 3×3×3 convolution kernel, with a LeakyReLU activation function layer after the 3×3×3 convolutional layer in each up-sampling process layer. The 3 up-sampling process layers correspond to the 3 up-sampling processes and are denoted, in order, the first, second and third up-sampling process layers. The feature map output by the UpSampling layer of the first up-sampling process layer is fused with the third weighted feature map and used as input to the 3×3×3 convolutional layer of the first up-sampling process layer; the feature map output by the UpSampling layer of the second up-sampling process layer is fused with the second weighted feature map and used as input to the 3×3×3 convolutional layer of the second up-sampling process layer; and the feature map output by the UpSampling layer of the third up-sampling process layer is fused with the first weighted feature map and used as input to the 3×3×3 convolutional layer of the third up-sampling process layer;
a second deformation field output module for performing 1 × 1 × 1 convolution on the feature maps output by the first, second and third upsampling process layers respectively to output the first, second and third upsampling process layersFloating image I corresponding to program layerMTo a fixed picture IFIn turn is a deformation fieldDeformation fieldAnd deformation field
a spatial transformation module, for inputting the floating image I_M together with each of the output deformation fields φ₀, φ₁, φ₂ and φ₃ into a spatial transformation network, and correspondingly obtaining, through the spatial transformation of that network, the deformed images of the floating image I_M: in order, the deformed images I_M∘φ₀, I_M∘φ₁, I_M∘φ₂ and I_M∘φ₃;
a neural network optimization module, for computing, based on the output deformation fields φ₀, φ₁, φ₂ and φ₃ and the obtained deformed images I_M∘φ₀, I_M∘φ₁, I_M∘φ₂ and I_M∘φ₃, the similarity between the fixed image I_F and the deformed images using a loss function, and performing back-propagation optimization on the neural network until the calculated loss value no longer decreases or the network training reaches the preset number of training iterations, at which point training is complete and the trained neural network model is obtained;
the computational expression of the loss function is:
in the formula (i), the first and second groups,represents the calculated loss function value, α and β are both constants, α + β is 1,is a regularization term, λ is a regularization control constant parameter,representing a predetermined image I fixed by saidFDown-sampling the obtained three-dimensional medical image; three-dimensional medical imageSequentially with the deformed imageEquivalent, three-dimensional medical imageAre successively lower and are each smaller than the fixed image IFThe resolution of (a) of (b),representing said fixed image IFAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of similarity between;the same similarity metric function is used.
Further, the re-weighting module includes:
a descriptor obtaining module for recording that each feature graph to be subjected to feature re-weighting output in the down-sampling is X, and X belongs to R(H×W×D)Slicing the feature map X in the D dimension of the feature map X, and processing each slice X ∈ R obtained by slicing by using a global average pooling strategy(H×W)And performing global average pooling to obtain a slice descriptor z of each slice X in the D dimension of the feature map X, wherein a specific formula of each slice descriptor z is as follows:
wherein (i, j) represents a pixel point on the slice x, and x (i, j) represents a gray value of the slice x at the pixel point (i, j);
the slice weight calculation module is used for acquiring the weight s of each slice X on the D dimension of the feature map X, wherein the calculation formula of the weight s of each slice X is as follows:
s=σ(δ(z)),
wherein, σ represents a sigmoid activation function, δ is a ReLU activation function, and z is a slice descriptor of a slice x obtained by the descriptor obtaining module;
a weighting module for correspondingly loading each weight s obtained by the slice weight calculation module on the corresponding slice to obtain each slice on the D dimension of the characteristic diagram X and the weighted sliceWherein the feature re-weighting calculation formula corresponding to each slice x is as follows:
in the formula, Fscale(x, s) represents the multiplication between a slice x and its corresponding weight s;
a weighted image acquisition module for obtaining a weighted slice corresponding to each slice X on D dimension of the feature image X based on the weighted imageCorrespondingly obtaining the re-weighted feature map corresponding to the feature map X
Furthermore, the similarity measure function adopts a cross-correlation function.
Furthermore, the space transformation network adopts an STN space transformation network.
Further, the preprocessing in the image preprocessing unit further comprises data enhancement, wherein the data enhancement comprises: respectively performing a warping transformation on each obtained floating image, to obtain a warped image corresponding to each floating image; the resulting warped images are newly added floating images.
The beneficial effects of the invention are as follows:
(1) The unsupervised three-dimensional medical image registration method and system both use an unsupervised registration mode: no marking information or registration supervision information is needed during registration, which reduces the need for labeled data and the errors of subjective human judgment, helps medical workers without relevant medical experience to perform image registration, improves the registration speed, saves registration time, and to a certain extent saves manpower and material resources.
(2) According to the unsupervised three-dimensional medical image registration method and system based on the neural network, through feature-weighted fusion and a loss function with a multi-level loss supervision function, the feature maps obtained along the down-sampling path are weighted according to their contribution and fused into the up-sampling path, and the loss of the model is supervised from different resolution levels, achieving more effective feature reuse and model supervision and improving registration accuracy to a certain extent.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below; it will be obvious that those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention.
FIG. 2 is a schematic diagram of the process of obtaining the deformation fields φ₀, φ₁, φ₂ and φ₃ in the method of FIG. 1.
FIG. 3 is a schematic block diagram of a system of one embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiment of the present invention will be clearly and completely described below with reference to the drawings in the embodiment of the present invention, and it is obvious that the described embodiment is only a part of the embodiment of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following explains key terms appearing in the present invention.
Fig. 1 and Fig. 2 are schematic diagrams of the method of one embodiment of the invention. This embodiment takes the registration of 3D human brain images as an example.
As shown in fig. 1, the method 100 includes:
first, image acquisition.
Public human brain image data are downloaded from the public data sets OASIS and ADNI.
Alternatively, those skilled in the art can also acquire the required three-dimensional medical images from the DICOM interface of a CT, MRI or ultrasound imager.
Step two, preprocessing:
and preprocessing the three-dimensional medical image acquired in the first step.
The human brain data collected in step one include redundant parts such as the neck, oral cavity, nasal cavity and skull, and differ in size and gray scale. For this reason, standard preprocessing is performed on these image data. First, the images are segmented: the human brain is separated from the original data, and the obtained human brain images are cropped to a consistent size. Normalization is then carried out, rescaling the voxel values to [0, 1], followed by affine registration. Finally, one image is selected from the affine-aligned data as the fixed image I_F, and the rest serve as floating images I_M.
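The normalization step, rescaling voxel values to [0, 1], is typically a min-max rescaling; the patent only states the target range, so the min-max form below is an assumption.

```python
import numpy as np

def minmax_normalize(volume):
    """Rescale the voxel intensities of a (cropped) volume to [0, 1]."""
    v = volume.astype(np.float64)
    lo, hi = v.min(), v.max()
    if hi == lo:                      # constant image: avoid division by zero
        return np.zeros_like(v)
    return (v - lo) / (hi - lo)

vol = np.array([[0.0, 50.0], [100.0, 25.0]])
out = minmax_normalize(vol)           # 0 -> 0.0, 50 -> 0.5, 100 -> 1.0
```

Normalizing every volume to a common intensity range keeps the similarity terms of the loss comparable across image pairs.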
In addition, in order to enhance the robustness and generalization capability of the neural network model, the preprocessing in this step further includes data enhancement: each floating image obtained during preprocessing is subjected to a warping transformation, and every image obtained after warping is added as a new floating image. That is, the new images produced by the warping transformation also belong to the floating images obtained by the preprocessing in this step.
In this embodiment, three warping transformations of different degrees may be adopted to realize the data enhancement. In a concrete implementation, the number of warping transformations of different degrees can be increased or decreased according to the actual situation.
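The patent does not spell out the warping transformation itself. Below is a minimal, hypothetical numpy sketch of one way such a data-enhancing warp could be generated: a coarse random displacement field applied with nearest-neighbour resampling. The function name, grid size, and parameters are illustrative, not the patent's.

```python
import numpy as np

def random_warp(volume, max_disp=3.0, seed=0):
    """Apply a random smooth warp to a 3D volume (nearest-neighbour
    resampling), producing one augmented floating image."""
    rng = np.random.default_rng(seed)
    shape = volume.shape
    # Coarse random displacement upsampled by block repetition -> smooth-ish field.
    coarse = rng.uniform(-max_disp, max_disp, size=(3, 4, 4, 4))
    reps = [int(np.ceil(s / 4)) for s in shape]
    disp = np.stack([np.kron(c, np.ones(reps))[:shape[0], :shape[1], :shape[2]]
                     for c in coarse])
    # Resample at x + u(x), clipped to the volume bounds.
    grid = np.indices(shape).astype(float)
    coords = np.rint(grid + disp).astype(int)
    for axis in range(3):
        coords[axis] = np.clip(coords[axis], 0, shape[axis] - 1)
    return volume[coords[0], coords[1], coords[2]]

vol = np.arange(16 ** 3, dtype=float).reshape(16, 16, 16)
aug = random_warp(vol)
assert aug.shape == vol.shape
```

Calling the function with three different `max_disp` values would yield warping transformations of three different degrees, matching the data enhancement described above.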
All the floating images obtained by preprocessing in the step constitute a training set of the neural network.
Step three, training the neural network to obtain a trained neural network model: the fixed image I_F obtained after preprocessing and the training set are used to train the neural network, yielding the trained neural network model.
Step four, the three-dimensional medical image to be registered is input into the neural network model for registration, and a registration image of the three-dimensional medical image to be registered is finally obtained and output.
When the invention is used, image acquisition is performed first; the acquired images are then preprocessed; the neural network is trained with the fixed image I_F and the training set obtained from preprocessing, producing a trained neural network model; and the three-dimensional medical image to be registered is input into the trained neural network model for registration, finally obtaining and outputting the corresponding registration image.
The method works particularly well when the number of images to be registered (in this embodiment, human brain images to be registered) is relatively large. Specifically, after the trained neural network model is obtained, the images to be registered are input into it one by one, and the registered image corresponding to each input is obtained: each time an image to be registered is input, the corresponding registration image is output; after the last registered image is output, the next image to be registered can be input into the trained neural network model, until the registration of all images to be registered is completed.
In the third step, the training of the neural network to obtain a trained neural network model includes:
S1, the fixed image I_F and the floating images I_M obtained after preprocessing are fed to the input layer of the neural network, where each set of input data comprises the fixed image I_F and one floating image I_M.
Each set of input data I_F and I_M, two 3D images, is losslessly spliced into 2 channels and then sent to the neural network input layer.
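As a concrete illustration of the "lossless splicing into 2 channels", the following numpy sketch (shapes and variable names are assumptions) stacks the two 3D volumes along a new channel axis:

```python
import numpy as np

# Hypothetical 3D volumes standing in for the fixed and floating images
# (same shape after cropping in the preprocessing step).
fixed = np.zeros((32, 32, 32), dtype=np.float32)
moving = np.ones((32, 32, 32), dtype=np.float32)

# Lossless splicing: stack along a new leading channel axis -> 2-channel input.
net_input = np.stack([fixed, moving], axis=0)
assert net_input.shape == (2, 32, 32, 32)
# Both volumes are recoverable unchanged, hence "lossless".
assert np.array_equal(net_input[0], fixed) and np.array_equal(net_input[1], moving)
```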
S2, the fixed image I_F and floating image I_M at the input layer are down-sampled, and the feature map of the input pair I_F and I_M is output.
The down-sampling comprises 3 down-sampling processes, followed by a convolution with kernel size 3 × 3 × 3 and a LeakyReLU activation function computation after the 3 down-sampling processes. The 3 down-sampling processes correspond to 3 down-sampling process layers, which are denoted, in execution order, the first, second, and third down-sampling process layers. Each down-sampling process layer comprises a convolution layer with kernel size 3 × 3 × 3, a LeakyReLU activation function layer, and a max pooling layer.
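A minimal numpy sketch of the non-convolutional pieces of one down-sampling process layer: the LeakyReLU activation and the max pooling that halves each spatial dimension. The 3 × 3 × 3 convolution is omitted for brevity, and the 0.2 negative slope is an assumption, not a value given by the patent.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    # LeakyReLU activation applied after each convolution.
    return np.where(x > 0, x, alpha * x)

def max_pool_3d(x, k=2):
    # 2x2x2 max pooling: halves each spatial dimension.
    d, h, w = (s // k for s in x.shape)
    return x[:d * k, :h * k, :w * k].reshape(d, k, h, k, w, k).max(axis=(1, 3, 5))

feat = np.random.default_rng(0).normal(size=(16, 16, 16))
out = max_pool_3d(leaky_relu(feat))
assert out.shape == (8, 8, 8)
```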
S3, respectively carrying out feature re-weighting on feature graphs output by LeakyReLU activation function layers in a first downsampling process layer, a second downsampling process layer and a third downsampling process layer corresponding to downsampling to obtain three weighted feature graphs, which are sequentially: a first weighted feature map, a second weighted feature map, and a third weighted feature map.
The step of implementing the feature re-weighting in step S3 includes:
Step S31, each feature map output in the down-sampling that is to undergo feature re-weighting is denoted feature map X, X ∈ R^(H×W×D). The feature map X is sliced along its D dimension, and each slice x ∈ R^(H×W) obtained by slicing is processed with a global average pooling strategy, giving a slice descriptor z for each slice x in the D dimension of the feature map X. The formula for each slice descriptor z is:

z = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} x(i, j),
wherein (i, j) represents a pixel point on the slice x, and x (i, j) represents a gray value of the slice x at the pixel point (i, j);
step S32, obtaining a weight S of each slice X in the dimension D of the feature map X, where the calculation formula of the weight S of each slice X is as follows:
s=σ(δ(z)),
wherein σ represents a sigmoid activation function, δ is a ReLU activation function, and z is a slice descriptor of the slice x obtained in step S31;
Step S33, each weight s obtained in step S32 is loaded onto the corresponding slice, giving the weighted slice x̃ corresponding to each slice x in the D dimension of the feature map X, where the feature re-weighting formula for each slice x is:

x̃ = F_scale(x, s) = s · x,

in the formula, F_scale(x, s) represents the multiplication between a slice x and its corresponding weight s;
Step S34, the weighted slices x̃ obtained in step S33 for the slices x in the D dimension of the feature map X are assembled to obtain the re-weighted feature map X̃ corresponding to the feature map X.
For example, for a feature map X ∈ R^(H×W×D) whose slices in the D dimension are x_1, x_2, …, x_D, the weighted slices obtained in step S33 are, in turn, x̃_1, x̃_2, …, x̃_D, and the re-weighted feature map corresponding to X is X̃ = [x̃_1, x̃_2, …, x̃_D].
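Steps S31 through S34 can be sketched in a few lines of numpy. This is an illustrative reconstruction, not the patent's code; in particular, the patent's δ and σ may wrap learned fully connected layers, whereas here they are taken as a plain ReLU and sigmoid applied to the scalar descriptor:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reweight_slices(X):
    """Feature re-weighting of steps S31-S34 for a feature map
    X in R^(H x W x D), sliced along its D dimension."""
    # S31: global average pooling -> one descriptor z per slice.
    z = X.mean(axis=(0, 1))                 # shape (D,)
    # S32: weight s = sigmoid(ReLU(z)).
    s = sigmoid(np.maximum(z, 0.0))         # shape (D,)
    # S33/S34: x_tilde = s * x for every slice, reassembled into X_tilde.
    return X * s[np.newaxis, np.newaxis, :]

X = np.random.default_rng(1).normal(size=(8, 8, 4))
Xt = reweight_slices(X)
assert Xt.shape == X.shape
```

Because each weight s lies in (0, 1), the re-weighted map rescales every slice without changing its spatial pattern.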
S4, a 1 × 1 × 1 convolution is performed on the feature map output in step S2, and the deformation field φ_0 from the floating image I_M to the fixed image I_F is output.
S5, the feature map output in step S2 is input to the up-sampling part for up-sampling. The up-sampling part comprises 3 up-sampling process layers; each up-sampling process layer comprises an UpSampling layer and a convolution layer with kernel size 3 × 3 × 3, and a LeakyReLU activation function layer follows the 3 × 3 × 3 convolution layer in each up-sampling process layer. The 3 up-sampling process layers correspond to the 3 up-sampling processes and are denoted, in order, the first, second, and third up-sampling process layers.
The feature map output by the UpSampling layer of the first up-sampling process layer is fused with the third weighted feature map and then used as the input of the 3 × 3 × 3 convolution layer in the first up-sampling process layer; the feature map output by the UpSampling layer of the second up-sampling process layer is fused with the second weighted feature map and used as the input of the 3 × 3 × 3 convolution layer in the second up-sampling process layer; and the feature map output by the UpSampling layer of the third up-sampling process layer is fused with the first weighted feature map and used as the input of the 3 × 3 × 3 convolution layer in the third up-sampling process layer.
S6, a 1 × 1 × 1 convolution is performed on each of the feature maps output by the first, second, and third up-sampling process layers, and the deformation fields from the floating image I_M to the fixed image I_F corresponding to the first, second, and third up-sampling process layers are output; these are, in turn, deformation fields φ_1, φ_2, and φ_3.
S7, the floating image I_M and the output deformation field φ_0 are input into a spatial transformation network; the floating image I_M and the deformation field φ_1 are input into the spatial transformation network; the floating image I_M and the deformation field φ_2 are input into the spatial transformation network; and the floating image I_M and the deformation field φ_3 are input into the spatial transformation network. Through the spatial transformation of the spatial transformation network, the deformed images corresponding to the floating image I_M are obtained; these are, in turn, the deformed images I_M∘φ_0, I_M∘φ_1, I_M∘φ_2, and I_M∘φ_3.
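A rough stand-in for the spatial transformation network's resampling step, in numpy: the floating image is sampled at positions displaced by the deformation field. A real STN uses differentiable trilinear interpolation; nearest-neighbour rounding is used here only to keep the sketch short, and all names are assumptions:

```python
import numpy as np

def warp_nearest(moving, disp):
    """Resample the floating image at x + phi(x), with nearest-neighbour
    rounding standing in for the STN's trilinear interpolation."""
    shape = moving.shape
    grid = np.indices(shape).astype(float)
    coords = np.rint(grid + disp).astype(int)
    for axis in range(3):
        coords[axis] = np.clip(coords[axis], 0, shape[axis] - 1)
    return moving[coords[0], coords[1], coords[2]]

vol = np.arange(4 ** 3, dtype=float).reshape(4, 4, 4)
identity = np.zeros((3, 4, 4, 4))
# A zero deformation field must return the floating image unchanged.
assert np.array_equal(warp_nearest(vol, identity), vol)
```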
S8, based on the output deformation fields φ_0, φ_1, φ_2, and φ_3, and based on the obtained deformed images I_M∘φ_0, I_M∘φ_1, I_M∘φ_2, and I_M∘φ_3, a loss function is used to compute the loss between the fixed image I_F and the deformed images, and back-propagation optimization is performed on the neural network until the computed loss function value no longer decreases or the training reaches a preset number of iterations; the neural network training is then finished, giving the trained neural network model.
Wherein the computational expression of the loss function is:

ℒ = −α · S(I_F, I_M∘φ_3) − β · [S(I_F^(1), I_M∘φ_2) + S(I_F^(2), I_M∘φ_1) + S(I_F^(3), I_M∘φ_0)] + λ · R(φ),

in the formula, ℒ represents the calculated loss function value; α and β are both constants, with α + β = 1; R(φ) is a regularization term on the predicted deformation fields, and λ is a regularization control constant parameter; I_F^(1), I_F^(2), and I_F^(3) represent three-dimensional medical images obtained in advance by down-sampling the fixed image I_F; the three-dimensional medical images I_F^(1), I_F^(2), and I_F^(3) are equal in size, in turn, to the deformed images I_M∘φ_2, I_M∘φ_1, and I_M∘φ_0, and their resolutions decrease successively, each being lower than the resolution of the fixed image I_F; S(I_F, I_M∘φ_3) represents the similarity measure between the fixed image I_F and the deformed image I_M∘φ_3, and S(I_F^(1), I_M∘φ_2), S(I_F^(2), I_M∘φ_1), and S(I_F^(3), I_M∘φ_0) represent the similarity measures between the corresponding down-sampled fixed images and deformed images; the four similarity terms use the same similarity metric function S.
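The ingredients of the loss, namely a similarity measure, a smoothness regularization term, and the α/β/λ weighting over resolutions, can be sketched in numpy as follows. This is a hypothetical reconstruction using global normalized cross-correlation and a finite-difference regularizer on one field component; the patent's exact similarity and regularizer may differ:

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    # Global normalized cross-correlation: 1.0 for identical images.
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps))

def grad_l2(phi):
    # Smoothness regularizer: mean squared forward difference
    # (one scalar component of the field, for brevity).
    return float(sum(np.mean(np.diff(phi, axis=ax) ** 2) for ax in range(phi.ndim)))

def loss(fixed_pyramid, warped_pyramid, phi, alpha=0.5, beta=0.5, lam=0.01):
    # fixed_pyramid[0] is the full-resolution fixed image; the rest are
    # its pre-computed down-sampled versions.
    sims = [ncc(f, w) for f, w in zip(fixed_pyramid, warped_pyramid)]
    return -alpha * sims[0] - beta * sum(sims[1:]) + lam * grad_l2(phi)

img = np.random.default_rng(2).normal(size=(8, 8, 8))
small = img[::2, ::2, ::2]
phi = np.zeros((8, 8, 8))
# Perfectly aligned pair at both scales with a zero field: loss = -(alpha + beta) = -1.
assert abs(loss([img, small], [img, small], phi) + 1.0) < 1e-6
```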
Optionally, the similarity metric function described in this embodiment uses a cross-correlation function, and the spatial transformation network adopts an STN (Spatial Transformer Network).
Optionally, the method 100 described in this embodiment may be implemented by using a U-Net neural network as a basic neural network structure.
Alternatively, the fusion mode of the feature maps involved in the present invention may be splicing fusion. In this embodiment, U-Net type channel dimension splicing and fusion can be adopted. In addition, when implementing the feature map fusion according to the present invention, a person skilled in the art may select another fusion method according to actual situations to perform fusion, for example, feature map fusion may be performed by a sum fusion (corresponding point addition) method.
In the present embodiment, the max pooling down-sampling factor is 2 × 2 × 2; accordingly, I_F^(1) can be obtained in advance by reducing the fixed image I_F to 1/2 resolution, I_F^(2) by reducing I_F to 1/4 resolution, and I_F^(3) by reducing I_F to 1/8 resolution, so that the resolution of I_F^(1) > the resolution of I_F^(2) > the resolution of I_F^(3) > 0.
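Under the stated 2 × 2 × 2 pooling factor applied three times, the pre-computed fixed-image pyramid has shapes at 1/2, 1/4, and 1/8 of the original. For example (the volume size is hypothetical):

```python
import numpy as np

# With a 2x2x2 pooling factor applied three times, the fixed image yields
# pre-computed down-sampled copies at 1/2, 1/4, and 1/8 resolution.
shape = np.array([160, 192, 224])  # hypothetical fixed-image size
pyramid = [tuple(shape // (2 ** k)) for k in range(1, 4)]
assert pyramid == [(80, 96, 112), (40, 48, 56), (20, 24, 28)]
```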
In conclusion, in the unsupervised neural-network-based three-dimensional medical image registration method provided by the invention, the use of the loss function realizes unsupervised registration of three-dimensional medical images: no labeling information or registration supervision information is needed in the registration process. This reduces the need for labeled data and the errors of subjective human judgment, helps medical workers without relevant medical experience to perform image registration to a certain extent, improves the registration rate and saves registration time to a certain extent, and at the same time saves manpower and material resources to a certain extent.
Referring to fig. 3, an unsupervised three-dimensional medical image registration system 200 based on neural network of the present invention comprises:
an image acquisition unit 201 that acquires three-dimensional medical images from public data sets OASIS and ADNI;
an image preprocessing unit 202, which preprocesses the acquired three-dimensional medical images, including image segmentation, cropping, normalization, and affine alignment, and selects any one image from the affinely aligned images as the fixed image I_F, taking the remaining images as floating images I_M; wherein the cropped images are of uniform size;
a neural network training unit 203, which trains the neural network based on the preprocessed fixed image I_F and floating images I_M to obtain a trained neural network model;
an image registration unit 204, which inputs the medical image to be registered into the trained neural network model for registration, and obtains and outputs a registration image of the medical image to be registered;
the neural network training unit 203 includes:
an input module for preprocessing the obtained fixed image IFAnd a floating image IMInputting the neural network as an input layer of the neural network, wherein each set of input data comprises the fixed image IFAnd one said floating image IM
a down-sampling module, which down-samples the fixed image I_F and floating image I_M input at the input layer and outputs a feature map; the down-sampling comprises 3 down-sampling processes, followed by a convolution with kernel size 3 × 3 × 3 and a LeakyReLU activation function computation after the 3 down-sampling processes; the 3 down-sampling processes correspond to 3 down-sampling process layers, denoted in execution order the first, second, and third down-sampling process layers; each down-sampling process layer comprises a convolution layer with kernel size 3 × 3 × 3, a LeakyReLU activation function layer, and a max pooling layer;
the reweighting module is used for respectively performing characteristic reweighting on the characteristic graphs output by the LeakyReLU activation function layer in the first downsampling process layer, the second downsampling process layer and the third downsampling process layer corresponding to downsampling to obtain three weighted characteristic graphs, and the three weighted characteristic graphs are sequentially: a first weighted feature map, a second weighted feature map, and a third weighted feature map;
a first deformation field output module, which performs a 1 × 1 × 1 convolution on the feature map output by the down-sampling module and outputs the deformation field φ_0 from the floating image I_M to the fixed image I_F;
an up-sampling module, which inputs the feature map output by the down-sampling module into the up-sampling part for up-sampling; the up-sampling part comprises 3 up-sampling process layers, each comprising an UpSampling layer and a convolution layer with kernel size 3 × 3 × 3, with a LeakyReLU activation function layer after the 3 × 3 × 3 convolution layer in each up-sampling process layer; the 3 up-sampling process layers correspond to the 3 up-sampling processes and are denoted, in order, the first, second, and third up-sampling process layers; the feature map output by the UpSampling layer of the first up-sampling process layer is fused with the third weighted feature map and used as the input of the 3 × 3 × 3 convolution layer in the first up-sampling process layer; the feature map output by the UpSampling layer of the second up-sampling process layer is fused with the second weighted feature map and used as the input of the 3 × 3 × 3 convolution layer in the second up-sampling process layer; the feature map output by the UpSampling layer of the third up-sampling process layer is fused with the first weighted feature map and used as the input of the 3 × 3 × 3 convolution layer in the third up-sampling process layer;
a second deformation field output module, which performs a 1 × 1 × 1 convolution on each of the feature maps output by the first, second, and third up-sampling process layers, and outputs the deformation fields from the floating image I_M to the fixed image I_F corresponding to the first, second, and third up-sampling process layers; these are, in turn, deformation fields φ_1, φ_2, and φ_3;
a spatial transformation module, which inputs the floating image I_M together with each of the output deformation fields φ_0, φ_1, φ_2, and φ_3 into the spatial transformation network; through the spatial transformation of the spatial transformation network, the deformed images corresponding to the floating image I_M are obtained, which are, in turn, the deformed images I_M∘φ_0, I_M∘φ_1, I_M∘φ_2, and I_M∘φ_3;
a neural network optimization module, which, based on the output deformation fields φ_0, φ_1, φ_2, and φ_3 and on the obtained deformed images I_M∘φ_0, I_M∘φ_1, I_M∘φ_2, and I_M∘φ_3, uses a loss function to compute the loss between the fixed image I_F and the deformed images, and performs back-propagation optimization on the neural network until the computed loss function value no longer decreases or the training reaches a preset number of iterations, finishing the neural network training to obtain the trained neural network model;
the computational expression of the loss function is:
in the formula (i), the first and second groups,represents the calculated loss function value, α and β are both constants, α + β is 1,is a regularization term, λ is a regularization control constant parameter,representing a predetermined image I fixed by saidFDown-sampling the obtained three-dimensional medical image; three-dimensional medical imageSequentially with the deformed imageEquivalent, three-dimensional medical imageAre successively lower and are each smaller than the fixed image IFThe resolution of (a) of (b),representing said fixed image IFAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of the degree of similarity between the two,representing the three-dimensional medical imageAnd the deformed imageA measure of similarity between;the same similarity metric function is used.
Wherein, the re-weighting module comprises:
a descriptor obtaining module, which denotes each feature map output in the down-sampling that is to undergo feature re-weighting as feature map X, X ∈ R^(H×W×D), slices the feature map X along its D dimension, and processes each slice x ∈ R^(H×W) obtained by slicing with a global average pooling strategy to obtain a slice descriptor z for each slice x in the D dimension of the feature map X, where the formula for each slice descriptor z is:

z = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} x(i, j),
wherein (i, j) represents a pixel point on the slice x, and x (i, j) represents a gray value of the slice x at the pixel point (i, j);
the slice weight calculation module is used for acquiring the weight s of each slice X on the D dimension of the feature map X, wherein the calculation formula of the weight s of each slice X is as follows:
s=σ(δ(z)),
wherein, σ represents a sigmoid activation function, δ is a ReLU activation function, and z is a slice descriptor of a slice x obtained by the descriptor obtaining module;
a weighting module, which loads each weight s obtained by the slice weight calculation module onto the corresponding slice to obtain the weighted slice x̃ corresponding to each slice x in the D dimension of the feature map X, where the feature re-weighting formula for each slice x is:

x̃ = F_scale(x, s) = s · x,

in the formula, F_scale(x, s) represents the multiplication between a slice x and its corresponding weight s;
a weighted image acquisition module, which, based on the obtained weighted slices x̃ corresponding to the slices x in the D dimension of the feature map X, correspondingly obtains the re-weighted feature map X̃ corresponding to the feature map X.
Optionally, the similarity measure function is a cross-correlation function.
Optionally, the spatial transformation network adopts an STN spatial transformation network.
Optionally, the preprocessing described in the image preprocessing unit 202 further includes data enhancement; wherein said data enhancement comprises: respectively performing bending transformation on each obtained floating image to obtain a bent transformed image corresponding to each obtained floating image; the resulting warped images are newly added floating images.
The same and similar parts in the various embodiments in this specification may be referred to each other. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and the relevant points can be referred to the description in the method embodiment.
Although the present invention has been described in detail with reference to the drawings in connection with the preferred embodiments, the present invention is not limited thereto. Various equivalent modifications or substitutions can be made to the embodiments of the present invention by those skilled in the art without departing from the spirit and scope of the present invention, and such modifications or substitutions, which a person skilled in the art can easily conceive of within the technical scope disclosed by the present invention, are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. An unsupervised three-dimensional medical image registration method based on a neural network is characterized by comprising the following steps of:
L1, image acquisition: acquiring three-dimensional medical images from the public data sets OASIS and ADNI, and/or: acquiring three-dimensional medical images from the DICOM interface of a CT, MRI, or ultrasound imager;
L2, preprocessing the acquired three-dimensional medical images: the preprocessing comprises image segmentation, cropping, normalization, and affine alignment, and any one image is selected from the affinely aligned images as the fixed image I_F, the remaining images being taken as floating images I_M; wherein the cropped images are of uniform size;
L3, training a neural network based on the fixed image I_F and floating images I_M obtained after preprocessing to obtain a trained neural network model;
l4, inputting the medical image to be registered into the trained neural network model for registration to obtain and output a registration image of the medical image to be registered;
wherein in step L3, the training of the neural network based on the fixed image I_F and the floating images I_M obtained after preprocessing to obtain the trained neural network model comprises:
S1, feeding the fixed image I_F and floating images I_M obtained after preprocessing to the input layer of the neural network, wherein each set of input data comprises the fixed image I_F and one floating image I_M;
S2, down-sampling the fixed image I_F and floating image I_M input at the input layer and outputting a feature map;
the down-sampling comprises 3 down-sampling processes, followed by a convolution with kernel size 3 × 3 × 3 and a LeakyReLU activation function computation after the 3 down-sampling processes; the 3 down-sampling processes correspond to 3 down-sampling process layers, which are denoted, in execution order, the first, second, and third down-sampling process layers; each down-sampling process layer comprises a convolution layer with kernel size 3 × 3 × 3, a LeakyReLU activation function layer, and a max pooling layer;
S3, respectively performing feature re-weighting on the feature maps output by the LeakyReLU activation function layers in the first, second, and third down-sampling process layers to obtain three weighted feature maps, which are, in turn, a first weighted feature map, a second weighted feature map, and a third weighted feature map;
S4, performing a 1 × 1 × 1 convolution on the feature map output in step S2, and outputting the deformation field φ_0 from the floating image I_M to the fixed image I_F;
S5, inputting the feature map output in step S2 into the up-sampling part for up-sampling, wherein the up-sampling part comprises 3 up-sampling process layers, each up-sampling process layer comprises an UpSampling layer and a convolution layer with kernel size 3 × 3 × 3, and a LeakyReLU activation function layer follows the 3 × 3 × 3 convolution layer in each up-sampling process layer; the 3 up-sampling process layers correspond to the 3 up-sampling processes and are denoted, in order, the first, second, and third up-sampling process layers;
the feature map output by the UpSampling layer of the first up-sampling process layer is fused with the third weighted feature map and then used as the input of the 3 × 3 × 3 convolution layer in the first up-sampling process layer; the feature map output by the UpSampling layer of the second up-sampling process layer is fused with the second weighted feature map and used as the input of the 3 × 3 × 3 convolution layer in the second up-sampling process layer; the feature map output by the UpSampling layer of the third up-sampling process layer is fused with the first weighted feature map and used as the input of the 3 × 3 × 3 convolution layer in the third up-sampling process layer;
S6, performing a 1 × 1 × 1 convolution on each of the feature maps output by the first, second, and third up-sampling process layers, and outputting the deformation fields from the floating image I_M to the fixed image I_F corresponding to the first, second, and third up-sampling process layers, which are, in turn, deformation fields φ_1, φ_2, and φ_3;
S7, inputting the floating image I_M and the output deformation field φ_0 into a spatial transformation network, inputting the floating image I_M and the deformation field φ_1 into the spatial transformation network, inputting the floating image I_M and the deformation field φ_2 into the spatial transformation network, and inputting the floating image I_M and the deformation field φ_3 into the spatial transformation network; through the spatial transformation of the spatial transformation network, correspondingly obtaining the deformed images corresponding to the floating image I_M, which are, in turn, the deformed images I_M∘φ_0, I_M∘φ_1, I_M∘φ_2, and I_M∘φ_3;
S8, based on the output deformation fields φ_0, φ_1, φ_2, and φ_3, and based on the obtained deformed images I_M∘φ_0, I_M∘φ_1, I_M∘φ_2, and I_M∘φ_3, using a loss function to compute the loss between the fixed image I_F and the deformed images, and performing back-propagation optimization on the neural network until the computed loss function value no longer decreases or the training reaches a preset number of iterations, thereby finishing the neural network training to obtain the trained neural network model;
the computational expression of the loss function is:

ℒ = −α · S(I_F, I_M∘φ_3) − β · [S(I_F^(1), I_M∘φ_2) + S(I_F^(2), I_M∘φ_1) + S(I_F^(3), I_M∘φ_0)] + λ · R(φ),

in the formula, ℒ represents the calculated loss function value; α and β are both constants, with α + β = 1; R(φ) is a regularization term on the predicted deformation fields, and λ is a regularization control constant parameter; I_F^(1), I_F^(2), and I_F^(3) represent three-dimensional medical images obtained in advance by down-sampling the fixed image I_F; the three-dimensional medical images I_F^(1), I_F^(2), and I_F^(3) are equal in size, in turn, to the deformed images I_M∘φ_2, I_M∘φ_1, and I_M∘φ_0, and their resolutions decrease successively, each being lower than the resolution of the fixed image I_F; S(I_F, I_M∘φ_3) represents the similarity measure between the fixed image I_F and the deformed image I_M∘φ_3, and S(I_F^(1), I_M∘φ_2), S(I_F^(2), I_M∘φ_1), and S(I_F^(3), I_M∘φ_0) represent the similarity measures between the corresponding down-sampled fixed images and deformed images; the four similarity terms use the same similarity metric function S.
2. The method for unsupervised three-dimensional medical image registration based on neural network as claimed in claim 1, wherein the step of implementing feature re-weighting in step S3 comprises:
step S31, denoting each feature map output in the down-sampling that is to undergo feature re-weighting as feature map X, X ∈ R^(H×W×D), slicing the feature map X along its D dimension, and processing each slice x ∈ R^(H×W) obtained by slicing with a global average pooling strategy to obtain a slice descriptor z for each slice x in the D dimension of the feature map X, where the formula for each slice descriptor z is:

z = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} x(i, j),
wherein (i, j) represents a pixel point on the slice x, and x (i, j) represents a gray value of the slice x at the pixel point (i, j);
step S32, obtaining a weight S of each slice X in the dimension D of the feature map X, where the calculation formula of the weight S of each slice X is as follows:
s=σ(δ(z)),
wherein σ represents a sigmoid activation function, δ is a ReLU activation function, and z is a slice descriptor of the slice x obtained in step S31;
step S33, correspondingly loading each weight s obtained in step S32 onto its corresponding slice, to obtain the weighted slice x̃ corresponding to each slice x in the D dimension of the feature map X, where the feature re-weighting calculation formula corresponding to each slice x is:

x̃ = F_scale(x, s) = s · x,

in the formula, F_scale(x, s) represents the multiplication between a slice x and its corresponding weight s;
step S34, based on the weighted slice x̃ obtained in step S33 for each slice x in the D dimension of the feature map X, correspondingly obtaining the re-weighted feature map X̃ corresponding to the feature map X.
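Steps S31 to S34 can be sketched as below. Note that the claim's s = σ(δ(z)) applies the sigmoid and ReLU activations directly to the scalar descriptor, without the learned fully-connected layers of a standard squeeze-and-excitation block; the sketch follows the claimed formulas literally.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def reweight_slices(X):
    """Slice-wise feature re-weighting of a feature map X in R^(H x W x D).

    S31: global average pooling of each D-slice -> descriptor z.
    S32: weight s = sigmoid(relu(z)).
    S33: weighted slice x~ = F_scale(x, s) = s * x.
    S34: reassemble the re-weighted feature map."""
    H, W, D = X.shape
    out = np.empty_like(X)
    for d in range(D):
        x = X[:, :, d]
        z = x.sum() / (H * W)          # S31: global average pooling
        s = sigmoid(np.maximum(z, 0))  # S32: s = sigma(delta(z)), delta = ReLU
        out[:, :, d] = s * x           # S33: multiply slice by its weight
    return out                         # S34: re-weighted feature map
```

For a constant all-ones map each slice descriptor is z = 1, so every slice is scaled by sigmoid(1) ≈ 0.731.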
3. The unsupervised three-dimensional medical image registration method based on neural network as claimed in claim 1, wherein said similarity measure function is a cross-correlation function.
4. The unsupervised three-dimensional medical image registration method based on neural network as claimed in claim 1, wherein the spatial transformation network employs STN spatial transformation network.
5. The method of claim 1, wherein the preprocessing further comprises data augmentation;
the data augmentation comprises: performing a warping (bending) transformation on each obtained floating image, respectively, to obtain a warped image corresponding to each floating image; each resulting warped image is added as a new floating image.
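The claim does not specify the bending model. As an illustration only, a smooth low-frequency displacement field (a hypothetical stand-in for the bending transformation) can resample a floating image into a new augmented one:

```python
import numpy as np

def random_warp(vol, max_disp=2.0, seed=0):
    """Data augmentation sketch: apply a smooth random displacement to a 3-D
    floating image, producing a new warped floating image.

    Uses a sinusoidal displacement field with random amplitude per axis and
    nearest-neighbour resampling; the patent's actual bending transformation
    is not disclosed."""
    rng = np.random.default_rng(seed)
    shape = vol.shape
    grids = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
    warped_coords = []
    for axis, g in enumerate(grids):
        amp = rng.uniform(-max_disp, max_disp)
        disp = amp * np.sin(2 * np.pi * g / shape[axis])
        warped_coords.append(
            np.clip(np.rint(g + disp), 0, shape[axis] - 1).astype(int))
    return vol[tuple(warped_coords)]
```

Because the result is a resampling of the input, the warped volume keeps the input's shape and draws all of its voxel values from the original image.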
6. An unsupervised three-dimensional medical image registration system based on a neural network, comprising:
an image acquisition unit acquiring three-dimensional medical images from the public data sets OASIS and ADNI, and/or: acquiring a three-dimensional medical image from a DICOM interface of a CT, MRI or ultrasonic imager;
the image preprocessing unit, for preprocessing the acquired three-dimensional medical images, the preprocessing comprising image segmentation, cropping, normalization and affine alignment; any one image is selected from the affinely aligned images as the fixed image I_F, and the remaining images are taken as floating images I_M, wherein the sizes of the cropped images are consistent;
a neural network training unit based on the fixed image I obtained after preprocessingFAnd a floating image IMTraining a neural network to obtain a trained neural network model;
the image registration unit is used for inputting the medical image to be registered into the trained neural network model for registration to obtain and output a registration image of the medical image to be registered;
wherein, the neural network training unit comprises:
an input module, for inputting the fixed image I_F and the floating images I_M obtained after the preprocessing into the neural network through the input layer of the neural network, wherein each set of input data comprises the fixed image I_F and one of the floating images I_M;
a down-sampling module, for performing down-sampling on the fixed image I_F and the floating image I_M provided at the input layer, and outputting a feature map; the down-sampling comprises 3 down-sampling processes, with a convolution calculation process using a convolution kernel of size 3 × 3 × 3 and a LeakyReLU activation function calculation process after the 3 down-sampling processes; the 3 down-sampling processes correspond to 3 down-sampling process layers, which are sequentially denoted, in the order in which the down-sampling processes are executed, as a first down-sampling process layer, a second down-sampling process layer and a third down-sampling process layer; each down-sampling process layer comprises a convolution layer with a convolution kernel of size 3 × 3 × 3, a LeakyReLU activation function layer and a max-pooling layer;
the re-weighting module, for performing feature re-weighting, respectively, on the feature maps output during down-sampling by the LeakyReLU activation function layers of the first, second and third down-sampling process layers, to obtain three weighted feature maps, sequentially: a first weighted feature map, a second weighted feature map and a third weighted feature map;
a first deformation field output module, for performing a 1 × 1 × 1 convolution on the feature map output by the down-sampling module, to output the deformation field φ from the floating image I_M to the fixed image I_F;
the up-sampling module, for inputting the feature map output by the down-sampling module into an up-sampling layer for up-sampling; the up-sampling layer comprises 3 up-sampling process layers, each comprising an UpSampling layer and a convolution layer with a convolution kernel of size 3 × 3 × 3, with a LeakyReLU activation function layer arranged after the convolution layer in each up-sampling process layer; the 3 up-sampling process layers correspond to the 3 up-sampling processes of the up-sampling, and are sequentially denoted, in the order of the 3 up-sampling processes, as a first up-sampling process layer, a second up-sampling process layer and a third up-sampling process layer; the feature map output by the UpSampling layer of the first up-sampling process layer is fused with the third weighted feature map and then used as the input of the convolution layer in the first up-sampling process layer; the feature map output by the UpSampling layer of the second up-sampling process layer is fused with the second weighted feature map and then used as the input of the convolution layer in the second up-sampling process layer; the feature map output by the UpSampling layer of the third up-sampling process layer is fused with the first weighted feature map and then used as the input of the convolution layer in the third up-sampling process layer;
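The claims do not state whether "fused" means concatenation or addition. The sketch below assumes nearest-neighbour 2× up-sampling and channel-wise concatenation of the up-sampled feature map with the weighted skip feature map, a common choice in U-Net-style decoders:

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x up-sampling of a (C, H, W, D) feature map,
    standing in for the UpSampling layer."""
    return feat.repeat(2, axis=1).repeat(2, axis=2).repeat(2, axis=3)

def fuse(up_feat, skip_feat):
    """Fuse the up-sampled feature map with the weighted skip feature map
    (channel-wise concatenation), before the 3x3x3 convolution layer."""
    return np.concatenate([up_feat, skip_feat], axis=0)
```

Concatenation doubles the spatial size and sums the channel counts, so the following convolution must accept the combined number of channels.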
a second deformation field output module, which performs a 1 × 1 × 1 convolution, respectively, on the feature maps output by the first, second and third up-sampling process layers, and outputs the deformation fields from the floating image I_M to the fixed image I_F corresponding to the first, second and third up-sampling process layers, which are sequentially the deformation field φ1, the deformation field φ2 and the deformation field φ3;
a spatial transformation module, for inputting the floating image I_M together with each of the output deformation fields φ, φ1, φ2 and φ3 into a spatial transformation network, and obtaining, through the spatial transformation of the spatial transformation network, the deformed images corresponding to the floating image I_M, which are sequentially the deformed image I_M∘φ, the deformed image I_M∘φ1, the deformed image I_M∘φ2 and the deformed image I_M∘φ3;
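The spatial transformation can be illustrated as below, assuming the deformation field is a per-voxel displacement in voxel units. A real STN would use differentiable trilinear interpolation; this sketch uses nearest-neighbour sampling for brevity:

```python
import numpy as np

def warp(vol, phi):
    """STN-style warping: sample the floating image at the identity grid
    plus the displacement field.

    vol: (H, W, D) volume; phi: (H, W, D, 3) displacement field in voxels.
    Returns the deformed image I_M∘phi (nearest-neighbour resampling)."""
    H, W, D = vol.shape
    grid = np.stack(np.meshgrid(np.arange(H), np.arange(W), np.arange(D),
                                indexing="ij"), axis=-1)
    coords = grid + phi
    idx = [np.clip(np.rint(coords[..., a]), 0, s - 1).astype(int)
           for a, s in enumerate((H, W, D))]
    return vol[tuple(idx)]
```

A zero field reproduces the input exactly, and a unit displacement along the first axis shifts the sampled voxels by one, as a quick sanity check.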
the neural network optimization module, which, based on the output deformation fields φ, φ1, φ2 and φ3 and the correspondingly obtained deformed images I_M∘φ, I_M∘φ1, I_M∘φ2 and I_M∘φ3, computes a loss function value between the fixed image I_F and the deformed images using a loss function, and performs back-propagation optimization on the neural network until the calculated loss function value no longer decreases or the network training reaches a preset number of training iterations, whereupon the neural network training is finished and the trained neural network model is obtained;
the computational expression of the loss function is:

L = α·S(I_F, I_M∘φ) + β·[S(F1, I_M∘φ1) + S(F2, I_M∘φ2) + S(F3, I_M∘φ3)] + λ·R(φ),

in the formula, L represents the calculated loss function value; α and β are both constants, with α + β = 1; R(φ) is a regularization term and λ is a constant parameter controlling the regularization; F1, F2 and F3 represent three-dimensional medical images obtained by down-sampling the fixed image I_F by predetermined factors; the resolutions of the three-dimensional medical images F1, F2 and F3 match those of the deformed images I_M∘φ1, I_M∘φ2 and I_M∘φ3 in turn, decrease successively, and are each lower than the resolution of the fixed image I_F; S(I_F, I_M∘φ) represents the similarity measure between the fixed image I_F and the deformed image I_M∘φ, and S(F1, I_M∘φ1), S(F2, I_M∘φ2) and S(F3, I_M∘φ3) represent the similarity measures between the corresponding down-sampled fixed images and deformed images; all of the similarity terms use the same similarity metric function.
7. The unsupervised neural network-based three-dimensional medical image registration system of claim 6, wherein the re-weighting module comprises:
a descriptor obtaining module, for denoting each feature map output in the down-sampling that is to be subjected to feature re-weighting as a feature map X, where X ∈ R^(H×W×D); slicing the feature map X along its D dimension, and performing global average pooling, using a global average pooling strategy, on each slice x ∈ R^(H×W) obtained by the slicing, to obtain a slice descriptor z for each slice x in the D dimension of the feature map X, where the specific formula for each slice descriptor z is:

z = (1 / (H × W)) · Σ_{i=1..H} Σ_{j=1..W} x(i, j),
wherein (i, j) represents a pixel point on the slice x, and x (i, j) represents a gray value of the slice x at the pixel point (i, j);
the slice weight calculation module is used for acquiring the weight s of each slice X on the D dimension of the feature map X, wherein the calculation formula of the weight s of each slice X is as follows:
s=σ(δ(z)),
wherein, σ represents a sigmoid activation function, δ is a ReLU activation function, and z is a slice descriptor of a slice x obtained by the descriptor obtaining module;
the weighting module, for correspondingly loading each weight s obtained by the slice weight calculation module onto its corresponding slice, to obtain the weighted slice x̃ corresponding to each slice x in the D dimension of the feature map X, where the feature re-weighting calculation formula corresponding to each slice x is:

x̃ = F_scale(x, s) = s · x,

in the formula, F_scale(x, s) represents the multiplication operation between a slice x and its corresponding weight s;
the weighted image acquisition module, which, based on the weighted slice x̃ obtained for each slice x in the D dimension of the feature map X, correspondingly obtains the re-weighted feature map X̃ corresponding to the feature map X.
8. The unsupervised three-dimensional medical image registration system based on neural network as claimed in claim 6, wherein said similarity measure function is a cross-correlation function.
9. The unsupervised three-dimensional medical image registration system based on neural network as claimed in claim 6, wherein the spatial transformation network employs STN spatial transformation network.
10. The unsupervised three-dimensional medical image registration system based on neural network as claimed in claim 6, wherein the preprocessing in the image preprocessing unit further comprises data augmentation; wherein the data augmentation comprises: performing a warping (bending) transformation on each obtained floating image, respectively, to obtain a warped image corresponding to each floating image; each resulting warped image is added as a new floating image.
CN201910828807.7A 2019-09-03 2019-09-03 Unsupervised three-dimensional medical image registration method and system based on neural network Active CN110599528B (en)

Publications (2)

Publication Number Publication Date
CN110599528A true CN110599528A (en) 2019-12-20
CN110599528B CN110599528B (en) 2022-05-27



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant