CN113012207A - Image registration method and device - Google Patents
Image registration method and device
- Publication number: CN113012207A
- Application number: CN202110308450.7A
- Authority
- CN
- China
- Prior art keywords
- image
- registration
- registered
- sample image
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
Abstract
The present disclosure relates to an image registration method and apparatus. The method comprises the following steps: acquiring a first image and a second image to be registered; inputting the first image and the second image into an image registration model, and outputting a registration image of the first image and the second image via the image registration model, wherein the image registration model is trained according to a position transformation relationship from the pixels of a sample image to be registered to the pixels of a reference sample image. In the embodiments of the present disclosure, this deep-learning-based registration method can significantly improve registration speed while ensuring a stable image registration effect.
Description
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image registration method and apparatus.
Background
Image registration maps one or more original images onto a target image through spatial transformations, so that corresponding physical points (pixel points) in the original image and the target image occupy the same spatial position in the registered images. When medical images are analyzed, several images of the same patient are often examined together, so that comprehensive information about the patient is obtained and a doctor can observe a lesion structure in the images from various angles, helping doctors make medical diagnoses, surgical plans and treatment plans.
Image registration is divided into rigid registration and flexible registration. Rigid registration applies a global transformation to the pixels (or voxels) in an image, including affine transformations such as rotation, scaling and translation, but it cannot model local geometric differences between images. Flexible registration performs a nonlinear transformation of pixel positions in the medical image, which can model such local differences. In the related art, rigid registration often adopts algorithms such as gradient descent, Newton's method or genetic algorithms, each requiring a complete optimization iteration, so rigid registration is slow and cannot meet practical application requirements.
Therefore, there is a need in the related art for an effective image registration method to shorten the time for image registration.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides an image registration method and apparatus.
According to a first aspect of embodiments of the present disclosure, there is provided an image registration method, including:
acquiring a first image and a second image to be registered;
inputting the first image and the second image into an image registration model, and outputting a registration image of the first image and the second image through the image registration model, wherein the image registration model is obtained by training according to a position transformation relation from a sample image pixel to be registered to a reference sample image pixel.
In one possible implementation, the positional transformation relationship includes a linear positional transformation relationship and/or a non-linear positional transformation relationship.
In a possible implementation manner, the image registration model is configured to be obtained by training according to a position transformation relationship from a sample image pixel to be registered to a reference sample image pixel, and includes:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
iteratively adjusting the training parameters based on a difference between the registered sample image pixel locations and the reference sample image pixel locations until the difference meets a preset requirement.
In a possible implementation manner, the image registration model includes a first image registration model and a second image registration model, where the first image registration model is configured to be obtained through training according to a linear position transformation relationship from a sample image pixel to be registered to a reference sample image pixel, and the second image registration model is configured to be obtained through training according to a nonlinear position transformation relationship from the sample image pixel to be registered to the reference sample image pixel.
In one possible implementation, the position transformation relationship includes a first position transformation relationship and a second position transformation relationship, and the image registration model is configured to be obtained according to training of the position transformation relationship from the sample image pixel to be registered to the reference sample image pixel, and includes:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein the image registration model comprises a first image registration sub-model and a second image registration sub-model, and the first image registration sub-model and the second image registration sub-model are provided with training parameters;
respectively extracting features of the sample image to be registered and the reference sample image, determining a first position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image, and mapping pixels of the sample image to be registered to target pixel positions according to the first position transformation relation to obtain a first registration sample image;
respectively extracting features of the first registration sample image and the reference sample image, determining a second position transformation relation from the first registration sample image to the reference sample image based on the relation between the features of the first registration sample image and the corresponding features of the reference sample image, and mapping pixels of the first registration sample image to target pixel positions according to the second position transformation relation to obtain a second registration sample image;
iteratively adjusting the training parameter based on a difference between the second registration sample image pixel location and the reference sample image pixel location until the difference meets a preset requirement.
In a possible implementation manner, the inputting the image to be registered and the reference image into an image registration model, and outputting a registered image of the image to be registered relative to the reference image via the image registration model includes:
acquiring a segmentation label of a first image;
inputting the first image, the segmentation label and the second image to an image registration model, respectively, and outputting a registered image of the first image and the second image and the segmentation label of the registered image via the image registration model.
In a possible implementation manner, the image registration model is configured to be obtained by training according to a position transformation relationship from a sample image pixel to be registered to a reference sample image pixel, and includes:
obtaining a sample set, wherein the sample set comprises a plurality of sample images to be registered and segmentation labels thereof, and a plurality of reference sample images and segmentation labels thereof;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
mapping the segmentation label of the sample image to be registered to a target pixel position according to the position transformation relation to obtain the segmentation label of the registered image;
iteratively adjusting the training parameters based on the difference between the pixel position of the registered sample image and the pixel position of the reference sample image and the difference between the segmentation label of the registered image and the segmentation label of the reference sample image until the difference meets a preset requirement.
According to a second aspect of embodiments of the present disclosure, there is provided an image registration apparatus including:
the acquisition module is used for acquiring a first image and a second image to be registered;
and the registration module is used for inputting the first image and the second image into an image registration model and outputting a registration image of the first image and the second image through the image registration model, wherein the image registration model is obtained by training according to the position transformation relation from the pixel of the sample image to be registered to the pixel of the reference sample image.
In one possible implementation, the positional transformation relationship includes a linear positional transformation relationship and/or a non-linear positional transformation relationship.
In one possible implementation, the image registration model in the registration module is configured to be obtained by training as follows:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
iteratively adjusting the training parameters based on a difference between the registered sample image pixel locations and the reference sample image pixel locations until the difference meets a preset requirement.
In one possible implementation, the registration module includes:
the first registration submodule comprises a first registration submodel, and the first registration submodel is set to be obtained by training according to the linear position transformation relation from the sample image pixel to be registered to the reference sample image pixel;
and the second registration submodule comprises a second registration submodel, and the second registration submodel is set to be obtained by training according to the nonlinear position transformation relation from the image pixel processed by the first registration submodule to the image pixel of the reference sample.
In one possible implementation, the position transformation relationship includes a first position transformation relationship and a second position transformation relationship, and the image registration model in the registration module is configured to be obtained by training as follows:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein the image registration model comprises a first image registration sub-model and a second image registration sub-model, and the first image registration sub-model and the second image registration sub-model are provided with training parameters;
respectively extracting features of the sample image to be registered and the reference sample image, determining a first position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image, and mapping pixels of the sample image to be registered to target pixel positions according to the first position transformation relation to obtain a first registration sample image;
respectively extracting features of the first registration sample image and the reference sample image, determining a second position transformation relation from the first registration sample image to the reference sample image based on the relation between the features of the first registration sample image and the corresponding features of the reference sample image, and mapping pixels of the first registration sample image to target pixel positions according to the second position transformation relation to obtain a second registration sample image;
iteratively adjusting the training parameter based on a difference between the second registration sample image pixel location and the reference sample image pixel location until the difference meets a preset requirement.
In one possible implementation, the registration module includes:
the acquisition submodule is used for acquiring a segmentation label of the first image;
and the registration sub-module is used for respectively inputting the first image, the segmentation label and the second image into an image registration model, and outputting the registration images of the first image and the second image and the segmentation label of the registration image through the image registration model.
In one possible implementation, the image registration model in the registration sub-module is configured to be obtained by training as follows:
obtaining a sample set, wherein the sample set comprises a plurality of sample images to be registered and segmentation labels thereof, and a plurality of reference sample images and segmentation labels thereof;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
mapping the segmentation label of the sample image to be registered to a target pixel position according to the position transformation relation to obtain the segmentation label of the registered image;
iteratively adjusting the training parameters based on the difference between the pixel position of the registered sample image and the pixel position of the reference sample image and the difference between the segmentation label of the registered image and the segmentation label of the reference sample image until the difference meets a preset requirement.
According to a third aspect of embodiments of the present disclosure, there is provided an image registration apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method according to any of the embodiments of the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, wherein instructions, when executed by a processor, enable the processor to perform the method according to any one of the embodiments of the present disclosure.
The technical solution provided by the embodiments of the present disclosure can have the following beneficial effects: the deep-learning-based registration method can significantly improve registration speed while ensuring a stable image registration effect. In the conventional rigid registration method, a complete optimization iteration process is required for each pair of input images, which is time-consuming, and for some input data manual parameter adjustment is also needed to achieve a good registration effect. With the present technical solution, the registration of two images of size 128 × 64 can be completed within 0.25 s, a significant speed improvement over traditional registration methods. The disclosed image registration method and apparatus therefore need not perform optimization iterations on the images, greatly saving registration time while improving registration accuracy.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow chart illustrating an image registration method according to an exemplary embodiment.
Fig. 2 is a block diagram illustrating an image registration apparatus according to an exemplary embodiment.
FIG. 3 is a flowchart illustrating an image registration model training method according to an exemplary embodiment.
Fig. 4 is a block diagram illustrating an image registration apparatus according to an exemplary embodiment.
Fig. 5 is a block diagram illustrating an image registration apparatus according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The image registration method according to the present disclosure is described in detail below with reference to fig. 1. Fig. 1 is a flow chart illustrating an image registration method according to an exemplary embodiment. Although the present disclosure provides method steps as illustrated in the following examples or figures, more or fewer steps may be included in the method based on conventional or non-inventive efforts. In steps where no necessary causal relationship exists logically, the order of execution of the steps is not limited to that provided by the disclosed embodiments.
Specifically, an embodiment of an image registration method provided by the present disclosure is shown in fig. 1, where the method may be applied to a terminal or a server and includes:
step S101, a first image and a second image to be registered are obtained;
step S102, inputting the first image and the second image into an image registration model, and outputting a registration image of the first image and the second image through the image registration model, wherein the image registration model is obtained by training according to a position transformation relation from a sample image pixel to be registered to a reference sample image pixel.
In embodiments of the present disclosure, the first and second images may include medical images, such as CT (Computed Tomography) images and MRI (Magnetic Resonance Imaging) images obtained by a magnetic resonance scanner; the CT images may in turn include SPECT (Single-Photon Emission Computed Tomography) images and PET (Positron Emission Tomography) images. In one example, the first image and the second image may be two-dimensional images or three-dimensional images. The first image and the second image contain the same physical object, e.g., both are images of the brain or both are images of the heart. Registering the images may include mapping pixels of the first image to the locations of the corresponding pixels of the second image, and may also include mapping pixels of the second image to the locations of the corresponding pixels of the first image. Before being input to the image registration model, the first image is set as the image to be registered and the second image as the reference image; the model then outputs a registration image that maps the pixels of the first image to the positions of the corresponding pixels of the second image. Correspondingly, if the second image is set as the image to be registered and the first image as the reference image, the model outputs a registration image that maps the pixels of the second image to the positions of the corresponding pixels of the first image.
In embodiments of the present disclosure, the positional transformation relationship may include a linear positional transformation relationship corresponding to rigid registration of the images and/or a non-linear positional transformation relationship corresponding to flexible registration of the images. In one example, the linear position transformation relationship may include at least one of: homography transformation, rigid body transformation, similarity transformation, affine transformation, perspective transformation. The homography transformation, also called projective transformation, is a non-singular linear transformation in homogeneous coordinates; the rigid body transformation comprises rotation and translation; the similarity transformation comprises rotation, translation and scaling; the affine transformation comprises rotation, translation and scaling, and, unlike the similarity transformation with its single rotation factor and single scaling factor, has two rotation factors and two scaling factors; the perspective transformation keeps the perspective center, image point and target point collinear, rotating the bearing surface (perspective plane) by a certain angle around the trace line (perspective axis) according to the law of perspective rotation, which destroys the original projection beam while keeping the projection geometry on the bearing surface unchanged, and can project the image from the original plane onto a new view plane. The image registration model is trained according to the position transformation relationship from the pixels of the sample image to be registered to the pixels of the reference sample image, where the training method may include supervised and unsupervised neural network learning algorithms in deep learning.
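As a concrete illustration of these transformation families, the following minimal sketch (not taken from the patent; the parameterizations are our assumptions) builds 2D versions of the rigid body, similarity and affine matrices in homogeneous coordinates; the affine case uses a two-rotation/two-scale decomposition matching the description above.

```python
import numpy as np

def rot(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]])

def rigid_2d(theta, tx, ty):
    """Rotation + translation: a single rotation factor, no scaling."""
    m = np.eye(3)
    m[:2, :2] = rot(theta)
    m[:2, 2] = [tx, ty]
    return m

def similarity_2d(theta, scale, tx, ty):
    """Rigid transformation plus a single isotropic scaling factor."""
    m = rigid_2d(theta, tx, ty)
    m[:2, :2] *= scale
    return m

def affine_2d(theta, phi, sx, sy, tx, ty):
    """Two rotation factors (theta, phi) and two scaling factors (sx, sy)."""
    m = np.eye(3)
    m[:2, :2] = rot(theta) @ rot(-phi) @ np.diag([sx, sy]) @ rot(phi)
    m[:2, 2] = [tx, ty]
    return m

# A pixel position maps through homogeneous coordinates:
p = np.array([10.0, 20.0, 1.0])
print(similarity_2d(np.pi / 6, 1.5, 5.0, -3.0) @ p)
```

A homography would additionally generalize the last row of the 3 × 3 matrix beyond (0, 0, 1), which is what makes it a non-singular linear transformation in homogeneous coordinates.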
In the embodiment of the disclosure, the registration method based on deep learning can significantly improve registration speed while ensuring a stable image registration effect. In the conventional rigid registration method, a complete optimization iteration process is required for each pair of input images, which is time-consuming, and for some input data manual parameter adjustment is also needed to achieve a good registration effect. With the present technical solution, the registration of two images of size 128 × 64 can be completed within 0.25 s, a significant speed improvement over traditional registration methods. The disclosed image registration method and apparatus therefore need not perform optimization iterations on the images, greatly saving registration time while improving registration accuracy.
In one possible implementation, the positional transformation relationship includes a linear positional transformation relationship and/or a non-linear positional transformation relationship. In one example, the positional relationship may include a single kind of linear positional transformation relationship, and the corresponding image registration model may perform a single rigid registration; the position relation can comprise a single type of nonlinear position transformation relation, and the corresponding image registration model can perform single flexible registration. In one example, the positional relationship may include both a linear positional transformation relationship and a non-linear positional transformation relationship, corresponding to an image registration model that may both rigidly and flexibly register images. In one example, the position relationship may include multiple linear position transformations, and the corresponding image registration model may perform rigid registration on the image multiple times, so as to improve the registration accuracy of the image. In one example, the positional relationship may include a plurality of linear positional transformation relationships and nonlinear positional transformation relationships, and the corresponding image registration model may perform rigid registration and flexible registration on the image for a plurality of times.
In a possible implementation manner, the image registration model is configured to be obtained by training according to a position transformation relationship from a sample image pixel to be registered to a reference sample image pixel, and includes:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
iteratively adjusting the training parameters based on a difference between the registered sample image pixel locations and the reference sample image pixel locations until the difference meets a preset requirement.
In the embodiment of the disclosure, the sample image to be registered is denoted as $I_M$ and the reference sample image is denoted as $I_F$. In one example, the position transformation relation includes an affine transformation, and an image registration model is constructed that may adopt an AIRNet (affine transformation network) structure with training parameters set in it. The respectively performing feature extraction on the sample image to be registered and the reference sample image, and determining a position transformation relationship between the features of the sample image to be registered and the corresponding features of the reference sample image, includes: performing convolution processing on the sample image to be registered and the reference sample image respectively, where the convolution processing comprises both two-dimensional and three-dimensional convolutions so as to handle the large scale differences that can exist in a three-dimensional image. In a three-dimensional image the last dimension may be very small; for an image of size 512 × 512 × 32, a three-dimensional convolution that reduces every dimension by the same factor (for example, 8×) yields an output of 64 × 64 × 4, losing too much information in the last dimension. If instead only the first two dimensions are reduced using two-dimensional convolution, the output has size 64 × 64 × 32 and the information of the last dimension is preserved. Using two-dimensional and three-dimensional convolutions together therefore effectively reduces the height and width dimensions of the image without changing its number of slices. The two-dimensional convolution processing comprises convolution with a kernel of size 3 × 3; after batch normalization, the result can be processed with a linear rectification function and a max pooling operation with stride 2. The results of multiple two-dimensional convolutions are continuously stacked (feature concatenation), thereby extracting more image features. Next, the result of the two-dimensional convolution processing is subjected to three-dimensional convolution with a kernel of size 3 × 3 × 3, followed by batch normalization, a linear rectification function and max pooling.
In one example, a channel-wise accumulation operation may be applied to the result of the above convolutions in order to fix the output length of the convolution result. The channel-by-channel accumulation may proceed as follows. Taking two-dimensional images as an example: for two color images whose convolution results are 50 × 50 × 3 and 60 × 60 × 3, where 50 × 50 is the number of features in the length and width directions and 3 corresponds to the three RGB primary colors (the number of channels is 3), the convolution results are accumulated per channel. For the first image this means adding the 50 × 50 feature values of the R channel, the 50 × 50 feature values of the G channel, and the 50 × 50 feature values of the B channel; for the second image, the 60 × 60 feature values of each of the R, G and B channels are added. Each image thus yields 3 feature values. When $I_F$ and $I_M$ are grayscale images, the number of channels is 1, and two fixed-length features are obtained by the convolution and channel-by-channel accumulation operations above. The two features are combined, and feature extraction is performed on the combined features using a fully connected network to obtain an affine matrix from the features of the sample image to be registered to the corresponding features of the reference sample image. Specifically, batch normalization may be applied to the combined features in the fully connected network, and the output of the fully connected network is processed with a linear rectification function. The output of the fully connected network is a 12-dimensional vector whose elements are rearranged into a 3 × 4 matrix; this is the affine transformation matrix, denoted $A$, with elements $A_{i,j}$. Let the coordinates of $I_F$ be $(x_F, y_F, z_F)$, and let $I_A$ be the registered sample image obtained by mapping the pixels of the sample image to be registered $I_M$ to the target pixel positions according to the affine transformation, with coordinates $(x_A, y_A, z_A)$. The affine transformation gives the coordinates of $I_A$ as:

$$\begin{pmatrix} x_A \\ y_A \\ z_A \end{pmatrix} = A \begin{pmatrix} x_F \\ y_F \\ z_F \\ 1 \end{pmatrix} \tag{1}$$
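The following PyTorch sketch illustrates the mixed two-/three-dimensional convolution and channel-wise accumulation described above. It is a hedged reading of the text, not the patent's exact architecture: the layer widths, the reshape strategy for running 2D convolutions per slice, and the class names are our assumptions.

```python
import torch
import torch.nn as nn

class MixedConvEncoder(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # 2D stage: 3x3 kernels shrink only height/width, preserving depth.
        self.conv2d = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),                       # stride-2 max pooling
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # 3D stage: 3x3x3 kernels once height/width are small enough.
        self.conv3d = nn.Sequential(
            nn.Conv3d(32, feat_dim, 3, padding=1),
            nn.BatchNorm3d(feat_dim), nn.ReLU(),
        )

    def forward(self, vol):                        # vol: (B, 1, D, H, W)
        b, _, d, h, w = vol.shape
        x = vol.permute(0, 2, 1, 3, 4).reshape(b * d, 1, h, w)
        x = self.conv2d(x)                         # (B*D, 32, H/4, W/4)
        _, c, h2, w2 = x.shape
        x = x.reshape(b, d, c, h2, w2).permute(0, 2, 1, 3, 4)
        x = self.conv3d(x)                         # (B, feat, D, H/4, W/4)
        # Channel-wise accumulation: sum all positions per channel, so the
        # feature length is fixed regardless of the input image size.
        return x.sum(dim=(2, 3, 4))                # (B, feat)

class AffineRegressor(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = MixedConvEncoder(feat_dim)  # shared weights for both images
        self.fc = nn.Linear(2 * feat_dim, 12)      # 12 values -> 3x4 matrix A

    def forward(self, moving, fixed):
        f = torch.cat([self.encoder(moving), self.encoder(fixed)], dim=1)
        return self.fc(f).view(-1, 3, 4)           # affine transformation matrix A

A = AffineRegressor()(torch.randn(1, 1, 32, 64, 64), torch.randn(1, 1, 32, 64, 64))
print(A.shape)  # torch.Size([1, 3, 4])
```

Because the channel-wise accumulation sums over all spatial positions, the concatenated feature fed to the fully connected layer has a fixed length regardless of input size, which is why the regressor can emit the 12 values of the 3 × 4 affine matrix directly.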
in the embodiment of the present disclosure, the pixels of the sample image to be registered are mapped to the target pixel position according to the position transformation relationship, so as to obtain a registered sample image. In one example, a resampler may be utilized to extract from image IMIn order to obtain IAThe pixel value of (2). The resampler may use nearest neighbor resampling.
In an embodiment of the present disclosure, the training parameters are iteratively adjusted based on the difference between the registration image pixel positions and the reference sample image pixel positions until the difference meets a preset requirement. In one example, the difference between the registration image $I_A$ and the reference sample image $I_F$ is measured using the mean square error:

$$L_A = \frac{1}{n}\sum_{i=1}^{n}\bigl(I_A(i) - I_F(i)\bigr)^2 \tag{2}$$

where $n$ denotes the number of pixels of the registration image. The loss function $L_A$ directly measures the degree of similarity between the two images, and optimizing this loss function effectively adjusts the parameters of the rigid registration network.
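A single training iteration for this rigid branch might then look as follows, reusing the hypothetical AffineRegressor sketched above. torch's grid_sample plays the role of the resampler; note that affine_grid works in normalized rather than pixel coordinates, a small departure from equation (1), and that bilinear sampling keeps the warp differentiable during training (the text's nearest-neighbor resampler can be swapped in at inference).

```python
import torch
import torch.nn.functional as F

def affine_warp(moving, A):
    """Warp 'moving' through the 3x4 matrix A predicted by the network."""
    grid = F.affine_grid(A, list(moving.shape), align_corners=False)
    return F.grid_sample(moving, grid, mode='bilinear', align_corners=False)

model = AffineRegressor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

moving = torch.randn(1, 1, 32, 64, 64)    # sample image to be registered, I_M
fixed = torch.randn(1, 1, 32, 64, 64)     # reference sample image, I_F

A = model(moving, fixed)
registered = affine_warp(moving, A)        # registered sample image I_A
loss_A = F.mse_loss(registered, fixed)     # equation (2)
opt.zero_grad(); loss_A.backward(); opt.step()
```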
In a possible implementation manner, the position transformation relation may include a homography transformation, rigid body transformation, similarity transformation or perspective transformation; different from the above embodiment, the corresponding transformation matrix is then a homography, rigid-body, similarity or perspective transformation matrix. The forms of these matrices are well known in the prior art and are not described in detail here.
In one possible implementation, the positional transformation relationship may include a nonlinear transformation. In one example, an image registration model is constructed that may adopt a 3D UNet network structure. The respectively performing feature extraction on the sample image to be registered and the reference sample image, and determining a position transformation relationship between the features of the sample image to be registered and the corresponding features of the reference sample image, includes: at the encoder end of the 3D UNet, the sample image to be registered $I_M$ and the reference sample image $I_F$ are convolved simultaneously; the convolution may use a kernel of size 3 × 3 × 3 with stride 2, and feature extraction may use a leaky linear rectification unit. At the decoder end of the 3D UNet, the features output by the encoder are upsampled using transposed convolutions, the upsampled result is fused (feature concatenation) with the output of the corresponding encoder stage, and the fused features undergo further rounds of transposed-convolution upsampling and feature fusion, yielding the position transformation relationship from the features of the sample image to be registered to the corresponding features of the reference sample image. The output of the 3D UNet is three displacement matrices $\Delta X$, $\Delta Y$ and $\Delta Z$. Let $I_O$ be the registration image obtained by mapping the pixels of the sample image to be registered $I_M$ to the target pixel positions according to the nonlinear transformation, with coordinates $(x_O, y_O, z_O)$:

$$(x_O, y_O, z_O) = \bigl(x_F + \Delta X(x_F, y_F, z_F),\; y_F + \Delta Y(x_F, y_F, z_F),\; z_F + \Delta Z(x_F, y_F, z_F)\bigr) \tag{3}$$
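A sketch of applying the three displacement matrices is shown below; it implements the reconstruction in equation (3) by adding the predicted offsets to an identity coordinate grid and resampling. The helper name and tensor layout are our assumptions.

```python
import torch
import torch.nn.functional as F

def flow_warp(moving, flow):
    """moving: (B,1,D,H,W); flow: (B,3,D,H,W) holding voxel offsets (dX, dY, dZ)."""
    b, _, d, h, w = moving.shape
    zz, yy, xx = torch.meshgrid(torch.arange(d), torch.arange(h),
                                torch.arange(w), indexing='ij')
    base = torch.stack([xx, yy, zz]).float().unsqueeze(0)    # identity positions
    coords = base + flow                                     # x+dX, y+dY, z+dZ
    scale = torch.tensor([w - 1, h - 1, d - 1]).float().view(1, 3, 1, 1, 1)
    grid = (2 * coords / scale - 1).permute(0, 2, 3, 4, 1)   # normalize to [-1, 1]
    return F.grid_sample(moving, grid, mode='bilinear', align_corners=True)

# Zero flow reproduces the input (identity warp):
moved = flow_warp(torch.rand(1, 1, 32, 64, 64), torch.zeros(1, 3, 32, 64, 64))
```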
In the embodiment of the present disclosure, the pixels of the sample image to be registered are mapped to the target pixel positions according to the position transformation relationship to obtain a registration image. In one example, a resampler may be used to extract pixel values from image $I_M$ in order to obtain $I_O$. The resampler may use nearest-neighbor resampling.
In an embodiment of the present disclosure, the training parameters are iteratively adjusted based on the difference between the registration image pixel positions and the reference sample image pixel positions until the difference meets a preset requirement. In one example, a cross-correlation loss function $L_{cc}$ is used to measure the correlation between the registration image $I_O$ and the reference image $I_F$:

$$L_{cc} = \sum_{p} \frac{\Bigl(\sum_{p_i \in \Omega}\bigl(I_O(p_i) - \bar{I}_O(p)\bigr)\bigl(I_F(p_i) - \bar{I}_F(p)\bigr)\Bigr)^2}{\sum_{p_i \in \Omega}\bigl(I_O(p_i) - \bar{I}_O(p)\bigr)^2 \sum_{p_i \in \Omega}\bigl(I_F(p_i) - \bar{I}_F(p)\bigr)^2} \tag{4}$$

where $\bar{I}_O$ and $\bar{I}_F$ represent the means of the registration image and the reference image in the region $\Omega$. A higher value of $L_{cc}$ indicates a higher degree of alignment of the two images.
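One way to compute this windowed cross-correlation efficiently is with box-filter convolutions, as in the hedged sketch below; the window size (here 9³) is our assumption, and the loss negates $L_{cc}$ since higher values mean better alignment.

```python
import torch
import torch.nn.functional as F

def local_cc(reg, ref, win=9):
    """Windowed cross-correlation of equation (4); higher means better aligned."""
    kernel = torch.ones(1, 1, win, win, win) / win ** 3
    mean = lambda x: F.conv3d(x, kernel, padding=win // 2)   # local box means
    r, f = reg - mean(reg), ref - mean(ref)                  # de-meaned images
    cross = mean(r * f)
    cc = cross.pow(2) / (mean(r * r) * mean(f * f) + 1e-5)
    return cc.mean()

# Negate when minimizing, since L_cc grows with alignment:
loss_cc = -local_cc(torch.rand(1, 1, 32, 64, 64), torch.rand(1, 1, 32, 64, 64))
```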
In a possible implementation manner, the image registration model includes a first image registration model and a second image registration model, where the first image registration model is configured to be obtained through training according to a linear position transformation relationship from a sample image pixel to be registered to a reference sample image pixel, and the second image registration model is configured to be obtained through training according to a nonlinear position transformation relationship from the sample image pixel to be registered to the reference sample image pixel.
In an embodiment of the present disclosure, the first image registration model and the second image registration model are separately and independently trained models. In application, the first image and the second image to be registered are input into the first image registration model, which outputs a preliminary registration image of the two images. Since the first image registration model is trained according to the linear position transformation relationship from the pixels of the sample image to be registered to the pixels of the reference sample image, the preliminary registration image is a rigid registration image. The preliminary registration image is then input into the second image registration model, which outputs the registration image. Since the second registration model is trained according to the nonlinear position transformation relationship from the pixels of the sample image to be registered to the pixels of the reference sample image, the registration image is the result of early rigid registration followed by later flexible registration. In this way, the images to be registered are registered both globally and locally with high accuracy, and because the neural-network-trained rigid registration requires no optimization iterations at application time, registration time is greatly saved.
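Applying the two independently trained models in sequence might look like the following minimal sketch, built on the hypothetical affine_warp and flow_warp helpers above; the model interfaces are assumptions.

```python
import torch

@torch.no_grad()
def register(moving, fixed, first_model, second_model):
    A = first_model(moving, fixed)        # linear stage: preliminary (rigid) result
    coarse = affine_warp(moving, A)
    flow = second_model(coarse, fixed)    # nonlinear stage: (B,3,D,H,W) offsets
    return flow_warp(coarse, flow)        # early rigid + later flexible registration
```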
FIG. 3 is a flowchart illustrating an image registration model training method according to an exemplary embodiment. Referring to fig. 3, the position transformation relationship includes a first position transformation relationship and a second position transformation relationship, and the image registration model is configured to be obtained by training according to the position transformation relationship from the sample image pixel to be registered to the reference sample image pixel, and includes:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein the image registration model comprises a first image registration sub-model and a second image registration sub-model, and the first image registration sub-model and the second image registration sub-model are provided with training parameters;
respectively extracting features of the sample image to be registered and the reference sample image, determining a first position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image, and mapping pixels of the sample image to be registered to target pixel positions according to the first position transformation relation to obtain a first registration sample image;
respectively extracting features of the first registration sample image and the reference sample image, determining a second position transformation relation from the first registration sample image to the reference sample image based on the relation between the features of the first registration sample image and the corresponding features of the reference sample image, and mapping pixels of the first registration sample image to target pixel positions according to the second position transformation relation to obtain a second registration sample image;
iteratively adjusting the training parameter based on a difference between the second registration sample image pixel location and the reference sample image pixel location until the difference meets a preset requirement.
In the embodiment of the disclosure, the image registration model includes a first image registration sub-model and a second image registration sub-model, and the image registration model is obtained by jointly training the two sub-models. Specifically, referring to fig. 2, the first image registration sub-model includes a first encoder 301, a weight-sharing part 302 and a regression layer 303 operating with respect to the reference image in the figure, and the second image registration sub-model includes a second encoder 304 and a second decoder 305.
In the embodiment of the present disclosure, for the first image registration sub-model, two-dimensional and three-dimensional convolution processing may be performed on the sample image to be registered and the reference sample image, respectively, so as to extract their image features. The two-dimensional convolution uses a kernel of size 3 × 3; after batch normalization, the result can be processed with a linear rectification function and a max pooling operation with stride 2. The results of multiple two-dimensional convolutions are continuously stacked (feature concatenation), thereby extracting more image features. Next, the result of the two-dimensional convolution processing is subjected to three-dimensional convolution with a kernel of size 3 × 3 × 3, followed by batch normalization, a linear rectification function and max pooling. A channel-wise accumulation operation is applied to the convolution result, yielding two fixed-length features. The two features are combined and passed through a fully connected network to obtain a first position transformation matrix from the features of the sample image to be registered to the corresponding features of the reference sample image. A resampler is then used to obtain the pixel values of the first registration sample image from the sample image to be registered.
In the embodiment of the present disclosure, for the second image registration sub-model, convolution processing is performed on the first registration sample image and the reference image simultaneously; the convolution may use a kernel of size 3 × 3 × 3 with stride 2, and feature extraction may use a leaky linear rectification unit. At the decoder end of the 3D UNet network structure, the features output by the encoder are upsampled using transposed convolutions, the upsampled result is fused (feature concatenation) with the output of the corresponding encoder stage, and the fused features undergo further rounds of transposed-convolution upsampling and feature fusion to obtain a second position transformation matrix from the features of the first registration sample image to the corresponding features of the reference sample image. Pixel values of the second registration sample image are obtained from the first registration sample image using a resampler.
In the embodiment of the present disclosure, the loss function of the first image registration sub-model can be expressed by formula (2) above, where $I_A$ in formula (2) represents the first registration sample image in this embodiment. The loss function of the second image registration sub-model may be expressed by formula (4) above. The loss function of the image registration model is then:

$$L = \alpha L_A + \beta L_{cc} \tag{5}$$
In the embodiment of the disclosure, during training, the output of the first image registration sub-model is input to the second image registration sub-model, and the constructed loss function simultaneously includes the loss functions of the two sub-models. This training mode yields both a stable first registration result and a good second registration result. The present disclosure thus integrates rigid registration and flexible registration; the flexible training strategy reduces the difficulty of training, and the embodiment of the disclosure supports unsupervised training, greatly reducing the difficulty of data preparation.
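A joint-training step for equation (5) could then be sketched as follows, again using the hypothetical helpers above (AffineRegressor, affine_warp, flow_warp, local_cc); the weights alpha and beta and the single shared optimizer over both sub-models are assumptions.

```python
import torch
import torch.nn.functional as F

def joint_step(moving, fixed, first_model, second_model, opt, alpha=1.0, beta=1.0):
    A = first_model(moving, fixed)
    first = affine_warp(moving, A)              # first registration sample image
    flow = second_model(first, fixed)
    second = flow_warp(first, flow)             # second registration sample image
    # L = alpha * L_A + beta * L_cc; local_cc grows with alignment, so negate it.
    loss = alpha * F.mse_loss(first, fixed) - beta * local_cc(second, fixed)
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)

# One optimizer over both sub-models realizes the joint training, e.g.:
# opt = torch.optim.Adam(list(first_model.parameters()) + list(second_model.parameters()))
```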
In a possible implementation manner, the inputting the image to be registered and the reference image into an image registration model, and outputting a registered image of the image to be registered relative to the reference image via the image registration model includes:
acquiring a segmentation label of a first image;
inputting the first image, the segmentation label and the second image to an image registration model, respectively, and outputting a registered image of the first image and the second image and the segmentation label of the registered image via the image registration model.
In the embodiment of the present disclosure, in some application scenarios, for example when a doctor analyzes a patient's medical images, the medical image is classified (given segmentation labels) to determine the tissue structure to which the diseased part belongs, which regions within a blood supply territory are affected, and the like. Manually classifying each image is very time-consuming for the physician, so images can instead be classified using the image registration model. It should be noted that the applications of image classification are not limited to the above examples (other operations such as medical research are also possible); those skilled in the art may make other modifications within the spirit of the present application, and anything that achieves the same or similar functions and effects as the present application falls within its scope of protection.
The image registration model in the embodiments of the present disclosure may be obtained by the training method of the image registration model in any of the embodiments described above. In the embodiment of the disclosure, the first image, the segmentation label of the first image and the second image are input to an image registration model, and the registration image of the first image and the second image and the segmentation label of the registration image are output via the image registration model. The process of inputting the first image and the second image into the image registration model to obtain their registration image is the same as in the above embodiments and is not repeated here. Different from those embodiments, the embodiment of the present disclosure applies to the segmentation label of the first image the same registration processing as to the first image itself, obtaining the segmentation label of the registration image. Since corresponding pixels of the registration image and the second image occupy the same locations, the segmentation label of the registration image is also the segmentation label of the second image. Therefore, through the image registration model in the embodiment of the disclosure, the second image can be automatically classified and its segmentation label determined.
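The label propagation described above can be sketched as follows for the rigid case; the function name is hypothetical, and nearest-neighbor resampling is used for the labels so that class ids stay discrete instead of being blended.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def register_with_labels(first_img, first_labels, second_img, model):
    A = model(first_img, second_img)                     # first -> second
    grid = F.affine_grid(A, list(first_img.shape), align_corners=False)
    registered = F.grid_sample(first_img, grid, align_corners=False)
    labels = F.grid_sample(first_labels.float(), grid, mode='nearest',
                           align_corners=False).long()   # label of registration image
    return registered, labels
```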
In a possible implementation manner, the image registration model is configured to be obtained by training according to a position transformation relationship from a sample image pixel to be registered to a reference sample image pixel, and includes:
obtaining a sample set, wherein the sample set comprises a plurality of sample images to be registered and segmentation labels thereof, and a plurality of reference sample images and segmentation labels thereof;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
mapping the segmentation label of the sample image to be registered to a target pixel position according to the position transformation relation to obtain the segmentation label of the registered image;
iteratively adjusting the training parameters based on the difference between the pixel position of the registered sample image and the pixel position of the reference sample image and the difference between the segmentation label of the registered image and the segmentation label of the reference sample image until the difference meets a preset requirement.
Different from the above training method of the image registration model, the embodiment of the disclosure adds a classification loss function $L_{seg}$ for measuring the difference between the segmentation label of the reference image and the segmentation label of the registered sample image, so as to optimize the registration effect of the segmentation labels. Thus, the loss function of the image registration model can be expressed as:

$$L = \alpha L_A + \lambda_1 L_{cc} + \lambda_2 L_{seg}$$

where $L_A$ represents the loss function of the linear transformation (rigid registration) in the image registration model, $L_{cc}$ represents the loss function of the nonlinear transformation (flexible registration), and $L_{seg}$ represents the classification loss function; $\alpha$, $\lambda_1$ and $\lambda_2$ are training parameters.
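The patent does not fix the form of $L_{seg}$; one plausible choice, shown purely as an assumption, is a soft Dice loss over one-hot label volumes, combined with the other two terms as above.

```python
import torch

def dice_loss(warped_labels, ref_labels, eps=1e-5):
    """warped_labels, ref_labels: one-hot label volumes of shape (B, C, D, H, W)."""
    inter = (warped_labels * ref_labels).sum(dim=(2, 3, 4))
    sums = warped_labels.sum(dim=(2, 3, 4)) + ref_labels.sum(dim=(2, 3, 4))
    return 1 - (2 * inter / (sums + eps)).mean()

def total_loss(l_a, l_cc, l_seg, alpha=1.0, lam1=1.0, lam2=1.0):
    # L = alpha*L_A + lambda1*L_cc + lambda2*L_seg, with l_cc already negated
    # so that lower is better for every term.
    return alpha * l_a + lam1 * l_cc + lam2 * l_seg
```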
Fig. 2 is a block diagram illustrating an image registration apparatus 200 according to an exemplary embodiment. Referring to fig. 2, the apparatus includes an acquisition module 201 and a registration module 202.
An obtaining module 201, configured to obtain a first image and a second image to be registered;
a registration module 202, configured to input the first image and the second image into an image registration model, and output a registered image of the first image and the second image through the image registration model, where the image registration model is obtained by training according to a position transformation relationship from a sample image pixel to be registered to a reference sample image pixel.
In one possible implementation, the positional transformation relationship includes a linear positional transformation relationship and/or a non-linear positional transformation relationship.
In one possible implementation, the image registration model in the registration module is configured to be obtained by training as follows:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
iteratively adjusting the training parameters based on a difference between the registered sample image pixel locations and the reference sample image pixel locations until the difference meets a preset requirement.
In one possible implementation, the registration module includes:
the first registration submodule comprises a first registration submodel, and the first registration submodel is set to be obtained by training according to the linear position transformation relation from the sample image pixel to be registered to the reference sample image pixel;
and the second registration submodule comprises a second registration submodel, and the second registration submodel is set to be obtained by training according to the nonlinear position transformation relation from the image pixels processed by the first registration submodule to the reference sample image pixels.
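One way to read this two-submodule structure is as a cascade: an affine stage followed by a deformable refinement. The sketch below, under the same assumed PyTorch setting, is illustrative only; affine_net and deform_net stand for the two trained submodels and their interfaces are hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadeRegistration(nn.Module):
    def __init__(self, affine_net: nn.Module, deform_net: nn.Module):
        super().__init__()
        self.affine_net = affine_net  # first submodel: predicts a (N, 2, 3) affine matrix
        self.deform_net = deform_net  # second submodel: predicts a (N, 2, H, W) displacement field

    def forward(self, moving: torch.Tensor, fixed: torch.Tensor) -> torch.Tensor:
        # First registration submodule: linear (rigid/affine) position transformation.
        theta = self.affine_net(moving, fixed)
        grid = F.affine_grid(theta, list(moving.shape), align_corners=True)
        coarse = F.grid_sample(moving, grid, align_corners=True)
        # Second registration submodule: nonlinear refinement of the already-processed image.
        flow = self.deform_net(coarse, fixed)
        _, _, h, w = coarse.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        identity = torch.stack((xs, ys), dim=-1).unsqueeze(0).to(coarse)
        return F.grid_sample(coarse, identity + flow.permute(0, 2, 3, 1), align_corners=True)
```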
In one possible implementation, the position transformation relationship includes a first position transformation relationship and a second position transformation relationship, and the image registration model in the registration module is configured to be obtained by training as follows:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein the image registration model comprises a first image registration sub-model and a second image registration sub-model, and the first image registration sub-model and the second image registration sub-model are provided with training parameters;
respectively extracting features of the sample image to be registered and the reference sample image, determining a first position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image, and mapping pixels of the sample image to be registered to target pixel positions according to the first position transformation relation to obtain a first registration sample image;
respectively extracting features of the first registration sample image and the reference sample image, determining a second position transformation relation from the first registration sample image to the reference sample image based on the relation between the features of the first registration sample image and the corresponding features of the reference sample image, and mapping pixels of the first registration sample image to target pixel positions according to the second position transformation relation to obtain a second registration sample image;
iteratively adjusting the training parameter based on a difference between the second registration sample image pixel location and the reference sample image pixel location until the difference meets a preset requirement.
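Under the same assumptions, and reusing the CascadeRegistration sketch above, the loss can be computed on the second (final) registration sample image so that a single backward pass adjusts the training parameters of both submodels at once; the joint-training reading and the MSE difference are again placeholders:

```python
import torch

def two_stage_train_step(cascade, optimizer, moving, fixed):
    """Gradients flow through both submodels from the final registered sample image."""
    optimizer.zero_grad()
    second_registered = cascade(moving, fixed)   # first and second transformations applied in sequence
    loss = torch.nn.functional.mse_loss(second_registered, fixed)
    loss.backward()
    optimizer.step()
    return loss.item()
```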
In one possible implementation, the registration module includes:
the acquisition submodule is used for acquiring a segmentation label of the first image;
and the registration sub-module is used for respectively inputting the first image, the segmentation label and the second image into an image registration model, and outputting the registration images of the first image and the second image and the segmentation label of the registration image through the image registration model.
In one possible implementation, the image registration model in the registration sub-module is configured to be obtained by training as follows:
obtaining a sample set, wherein the sample set comprises a plurality of sample images to be registered and segmentation labels thereof, and a plurality of reference sample images and segmentation labels thereof;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
mapping the segmentation label of the sample image to be registered to a target pixel position according to the position transformation relation to obtain the segmentation label of the registered image;
iteratively adjusting the training parameters based on the difference between the pixel position of the registered sample image and the pixel position of the reference sample image and the difference between the segmentation label of the registered image and the segmentation label of the reference sample image until the difference meets a preset requirement.
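The segmentation labels can be mapped with the same position transformation as the image pixels. A short sketch under the same PyTorch assumption; the `grid` argument stands for the sampling grid produced by whichever transformation was predicted:

```python
import torch
import torch.nn.functional as F

def warp_labels(labels: torch.Tensor, grid: torch.Tensor) -> torch.Tensor:
    """Map a segmentation label map to target pixel positions.

    labels: (N, 1, H, W) integer class map; grid: (N, H, W, 2) normalized sampling grid.
    Nearest-neighbor sampling keeps classes discrete; for a differentiable L_seg during
    training, one-hot label maps are commonly warped with bilinear sampling instead.
    """
    warped = F.grid_sample(labels.float(), grid, mode="nearest", align_corners=True)
    return warped.long()
```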
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 4 is a block diagram illustrating an apparatus 400 for image registration according to an exemplary embodiment. For example, the apparatus 400 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 4, the apparatus 400 may include one or more of the following components: processing components 402, memory 404, power components 406, multimedia components 408, audio components 410, input/output (I/O) interfaces 412, sensor components 414, and communication components 416.
The processing component 402 generally controls overall operation of the apparatus 400, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 402 may include one or more processors 420 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 402 can include one or more modules that facilitate interaction between the processing component 402 and other components. For example, the processing component 402 can include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.
The memory 404 is configured to store various types of data to support operations at the apparatus 400. Examples of such data include instructions for any application or method operating on the device 400, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 404 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The multimedia component 408 includes a screen that provides an output interface between the apparatus 400 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 408 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the apparatus 400 is in an operation mode, such as a photographing mode or a video mode. Each of the front camera and the rear camera may be a fixed optical lens system or have focus and optical zoom capability.
The audio component 410 is configured to output and/or input audio signals. For example, audio component 410 includes a Microphone (MIC) configured to receive external audio signals when apparatus 400 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 404 or transmitted via the communication component 416. In some embodiments, audio component 410 also includes a speaker for outputting audio signals.
The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 414 includes one or more sensors for providing various aspects of status assessment for the apparatus 400. For example, the sensor component 414 may detect an open/closed state of the apparatus 400 and the relative positioning of components, such as the display and keypad of the apparatus 400; the sensor component 414 may also detect a change in the position of the apparatus 400 or a component of the apparatus 400, the presence or absence of user contact with the apparatus 400, the orientation or acceleration/deceleration of the apparatus 400, and a change in the temperature of the apparatus 400. The sensor component 414 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 414 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 414 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 416 is configured to facilitate wired or wireless communication between the apparatus 400 and other devices. The apparatus 400 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 416 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 400 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 404 comprising instructions, executable by the processor 420 of the apparatus 400 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Fig. 5 is a block diagram illustrating an apparatus 500 for image registration according to an exemplary embodiment. For example, the apparatus 500 may be provided as a server. Referring to fig. 5, the apparatus 500 includes a processing component 522 that further includes one or more processors and memory resources, represented by memory 532, for storing instructions, such as applications, that are executable by the processing component 522. The application programs stored in memory 532 may include one or more modules that each correspond to a set of instructions. Further, the processing component 522 is configured to execute instructions to perform the above-described methods.
The apparatus 500 may also include a power component 526 configured to perform power management of the apparatus 500, a wired or wireless network interface 550 configured to connect the apparatus 500 to a network, and an input/output (I/O) interface 558. The apparatus 500 may operate based on an operating system stored in the memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 532 comprising instructions, executable by the processing component 522 of the apparatus 500 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
Claims (16)
1. An image registration method, comprising:
acquiring a first image and a second image to be registered;
inputting the first image and the second image into an image registration model, and outputting a registration image of the first image and the second image through the image registration model, wherein the image registration model is obtained by training according to a position transformation relation from a sample image pixel to be registered to a reference sample image pixel.
2. The method of claim 1, wherein the positional transformation relationship comprises a linear positional transformation relationship and/or a non-linear positional transformation relationship.
3. The method according to claim 1, wherein the image registration model is configured to be obtained by training according to a position transformation relationship from a sample image pixel to be registered to a reference sample image pixel, and comprises:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
iteratively adjusting the training parameters based on a difference between the registered sample image pixel locations and the reference sample image pixel locations until the difference meets a preset requirement.
4. The method according to claim 1, wherein the image registration model comprises a first image registration model and a second image registration model, wherein the first image registration model is configured to be obtained according to training of linear position transformation relationship of pixels of the sample image to be registered to pixels of the reference sample image, and the second image registration model is configured to be obtained according to training of nonlinear position transformation relationship of pixels of the sample image to be registered to pixels of the reference sample image.
5. The method according to claim 1, wherein the positional transformation relationship comprises a first positional transformation relationship and a second positional transformation relationship, and the image registration model is configured to be obtained by training from the positional transformation relationship of the sample image pixels to be registered to the reference sample image pixels, and comprises:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein the image registration model comprises a first image registration sub-model and a second image registration sub-model, and the first image registration sub-model and the second image registration sub-model are provided with training parameters;
respectively extracting features of the sample image to be registered and the reference sample image, determining a first position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image, and mapping pixels of the sample image to be registered to target pixel positions according to the first position transformation relation to obtain a first registration sample image;
respectively extracting features of the first registration sample image and the reference sample image, determining a second position transformation relation from the first registration sample image to the reference sample image based on the relation between the features of the first registration sample image and the corresponding features of the reference sample image, and mapping pixels of the first registration sample image to target pixel positions according to the second position transformation relation to obtain a second registration sample image;
iteratively adjusting the training parameter based on a difference between the second registration sample image pixel location and the reference sample image pixel location until the difference meets a preset requirement.
6. The method according to claim 1, wherein the inputting the first image and the second image into an image registration model, and outputting a registration image of the first image and the second image via the image registration model, comprises:
acquiring a segmentation label of a first image;
inputting the first image, the segmentation label and the second image to an image registration model, respectively, and outputting a registered image of the first image and the second image and the segmentation label of the registered image via the image registration model.
7. The method according to claim 6, wherein the image registration model is obtained by training according to a position transformation relation from a sample image pixel to be registered to a reference sample image pixel, and comprises:
obtaining a sample set, wherein the sample set comprises a plurality of sample images to be registered and segmentation labels thereof, and a plurality of reference sample images and segmentation labels thereof;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
mapping the segmentation label of the sample image to be registered to a target pixel position according to the position transformation relation to obtain the segmentation label of the registered image;
iteratively adjusting the training parameters based on the difference between the pixel position of the registered sample image and the pixel position of the reference sample image and the difference between the segmentation label of the registered image and the segmentation label of the reference sample image until the difference meets a preset requirement.
8. An image registration apparatus, comprising:
the acquisition module is used for acquiring a first image and a second image to be registered;
and the registration module is used for inputting the first image and the second image into an image registration model and outputting a registration image of the first image and the second image through the image registration model, wherein the image registration model is obtained by training according to the position transformation relation from the pixel of the sample image to be registered to the pixel of the reference sample image.
9. The apparatus of claim 8, wherein the positional transformation relationship comprises a linear positional transformation relationship and/or a non-linear positional transformation relationship.
10. The apparatus of claim 8, wherein the image registration model in the registration module is configured to be obtained by training as follows:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
iteratively adjusting the training parameters based on a difference between the registered sample image pixel locations and the reference sample image pixel locations until the difference meets a preset requirement.
11. The apparatus of claim 8, wherein the registration module comprises:
the first registration submodule comprises a first registration submodel, and the first registration submodel is set to be obtained by training according to the linear position transformation relation from the sample image pixel to be registered to the reference sample image pixel;
and the second registration submodule comprises a second registration submodel, and the second registration submodel is set to be obtained by training according to the nonlinear position transformation relation from the image pixels processed by the first registration submodule to the reference sample image pixels.
12. The apparatus of claim 9, wherein the position transformation relationship comprises a first position transformation relationship and a second position transformation relationship, and the image registration model in the registration module is configured to be trained to obtain:
acquiring a sample set, wherein the sample set comprises a plurality of sample images to be registered and reference sample images;
constructing an image registration model, wherein the image registration model comprises a first image registration sub-model and a second image registration sub-model, and the first image registration sub-model and the second image registration sub-model are provided with training parameters;
respectively extracting features of the sample image to be registered and the reference sample image, determining a first position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image, and mapping pixels of the sample image to be registered to target pixel positions according to the first position transformation relation to obtain a first registration sample image;
respectively extracting features of the first registration sample image and the reference sample image, determining a second position transformation relation from the first registration sample image to the reference sample image based on the relation between the features of the first registration sample image and the corresponding features of the reference sample image, and mapping pixels of the first registration sample image to target pixel positions according to the second position transformation relation to obtain a second registration sample image;
iteratively adjusting the training parameter based on a difference between the second registration sample image pixel location and the reference sample image pixel location until the difference meets a preset requirement.
13. The apparatus of claim 8, wherein the registration module comprises:
the acquisition submodule is used for acquiring a segmentation label of the first image;
and the registration sub-module is used for respectively inputting the first image, the segmentation label and the second image into an image registration model, and outputting the registration images of the first image and the second image and the segmentation label of the registration image through the image registration model.
14. The apparatus of claim 13, wherein the image registration model in the registration sub-module is configured to be obtained by training as follows:
obtaining a sample set, wherein the sample set comprises a plurality of sample images to be registered and segmentation labels thereof, and a plurality of reference sample images and segmentation labels thereof;
constructing an image registration model, wherein training parameters are set in the image registration model;
respectively extracting features of the sample image to be registered and the reference sample image, and determining a position transformation relation from the sample image to be registered to the reference sample image based on the relation between the features of the sample image to be registered and the corresponding features of the reference sample image;
mapping the pixels of the sample image to be registered to the target pixel positions according to the position transformation relation to obtain a registered sample image;
mapping the segmentation label of the sample image to be registered to a target pixel position according to the position transformation relation to obtain the segmentation label of the registered image;
iteratively adjusting the training parameters based on the difference between the pixel position of the registered sample image and the pixel position of the reference sample image and the difference between the segmentation label of the registered image and the segmentation label of the reference sample image until the difference meets a preset requirement.
15. An image registration apparatus, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of any one of claims 1 to 7.
16. A non-transitory computer readable storage medium having instructions therein which, when executed by a processor, enable the processor to perform the method of any one of claims 1 to 7.