WO2022151586A1

WO2022151586A1 - Adversarial registration method and apparatus, computer device and storage medium

Info

Publication number: WO2022151586A1
Application number: PCT/CN2021/082355
Authority: WO
Inventors: 曹文明; 罗毅; 邹文兰
Original assignee: 深圳大学
Priority date: 2021-01-12
Filing date: 2021-03-23
Publication date: 2022-07-21
Also published as: CN112767463B; CN112767463A

Abstract

An adversarial registration method and apparatus, a computer device and a storage medium. The method comprises: acquiring a medical imaging image and a corresponding anatomically segmented image, and preprocessing the medical imaging image and the anatomically segmented image to obtain a data set, the anatomically segmented image comprising at least one anatomically segmented image region (S101); performing learning on a registration network and a discriminant network by using the data set (S102); constructing a first loss function for the registration network according to output results of the registration network and the discriminant network, and constructing a second loss function for the discriminant network by means of the adversarial learning of the discriminant network and the registration network (S103); and performing feedback optimization on the registration network and the discriminant network by using the first loss function and the second loss function respectively, and performing registration processing on a designated medical imaging image by using the optimized registration network (S104). By means of the adversarial learning between the discriminant network and the registration network, parameters after the feedback optimization of the registration network are more accurate, thereby increasing the registration accuracy.

Description

An adversarial registration method, apparatus, computer equipment and storage medium

This application is based on the Chinese patent application with the application number of 202110035984.7 and the filing date of January 12, 2021, and claims its priority. The entire content of the application is hereby incorporated into this application as a whole.

technical field

The present application relates to the technical field of image processing, and in particular, to an adversarial registration method, apparatus, computer equipment and storage medium.

Background technique

In clinical applications, the information contained in a single medical image image is limited, and the reasonable registration of medical image images of different times and modalities is beneficial to the judgment of surgeons and computers.

Traditional image registration methods are often expressed as an optimization problem, in which the iterative process consumes a lot of time and computing resources, which cannot meet the application standards in time-critical clinical situations.

The registration method based on supervised learning requires the ground truth deformation field, and its quality plays a key role in network training as a direct factor for the adjustment of network parameters. However, the randomly generated spatial transformation not only cannot reflect the real physiological motion, although using the traditional method to obtain the deformation field training model can solve the above problems, it will lead to the limited performance of the learning model.

Application content

Embodiments of the present application provide an adversarial registration method, apparatus, computer equipment, and storage medium, which aim to improve the registration accuracy of medical imaging images.

In a first aspect, an embodiment of the present application provides an adversarial registration method, including:

Obtaining a medical imaging image and a corresponding anatomical segmentation image, and preprocessing the medical imaging image and the anatomical segmentation image to obtain a data set, wherein the anatomical segmentation image includes at least one anatomical segmentation image area;

Use the data set to learn the preset registration network and discriminant network respectively;

A first loss function is constructed for the registration network according to the learned output of the registration network and the output of the discriminant network, and a second loss function is constructed for the discriminant network through confrontational learning between the discriminant network and the registration network. loss function;

Feedback optimization is performed on the registration network and the discrimination network respectively by using the first loss function and the second loss function, and the designated medical image image is registered by using the optimized registration network.

In a second aspect, an embodiment of the present application provides an anti-registration device, including:

An image preprocessing unit, configured to obtain a medical image image and a corresponding anatomical segmentation image, and preprocess the medical image image and the anatomical segmentation image to obtain a data set, wherein the anatomical segmentation image includes at least one anatomical segmentation image area;

a learning unit, configured to use the data set to learn the preset registration network and discrimination network respectively;

The first construction unit is used for constructing a first loss function for the registration network according to the output result of the registration network after learning and the output result of the discriminant network, and confronting learning through the discriminant network and the registration network as The discriminant network constructs a second loss function;

a registration processing unit, configured to use the first loss function and the second loss function to respectively perform feedback optimization on the registration network and the discrimination network, and use the optimized registration network to register the designated medical image images deal with.

In a third aspect, an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, when the processor executes the computer program The adversarial registration method as described in the first aspect is implemented.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the countermeasure configuration described in the first aspect is implemented. standard method.

The embodiments of the present application provide an adversarial registration method, device, computer equipment, and storage medium. The method includes: acquiring a medical imaging image and a corresponding anatomical segmentation image, and preprocessing the medical imaging image and the anatomical segmentation image, Obtaining a data set, wherein the anatomical segmentation image includes at least one anatomical segmentation image area; using the data set to learn the preset registration network and the discrimination network respectively; according to the output result of the learned registration network and The output result of the discriminant network constructs a first loss function for the registration network, and constructs a second loss function for the discriminant network through confrontational learning between the discriminant network and the registration network; using the first loss function and The second loss function performs feedback optimization on the registration network and the discrimination network respectively, and uses the optimized registration network to perform registration processing on the designated medical image images. In the embodiment of the present application, through the confrontational learning between the discriminant network and the registration network, the parameters after the feedback optimization of the registration network are more accurate, so that the registration processing of the medical image images through the registration network can have higher accuracy .

Description of drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings used in the description of the embodiments. For those of ordinary skill, other drawings can also be obtained from these drawings without any creative effort.

1 is a schematic flowchart of an adversarial registration method provided by an embodiment of the present application;

2 is a schematic sub-flow diagram of an adversarial registration method provided by an embodiment of the present application;

3 is a schematic diagram of another sub-flow of an adversarial registration method provided by an embodiment of the present application;

4 is a schematic diagram of a network structure of an adversarial registration method provided by an embodiment of the present application;

FIG. 5 is a schematic block diagram of an anti-registration apparatus provided by an embodiment of the present application;

6 is a sub-schematic block diagram of an anti-registration apparatus provided by an embodiment of the present application;

FIG. 7 is another sub-schematic block diagram of an anti-registration apparatus provided by an embodiment of the present application.

Detailed ways

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.

It is to be understood that, when used in this specification and the appended claims, the terms "comprising" and "comprising" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude one or The presence or addition of a number of other features, integers, steps, operations, elements, components, and/or sets thereof.

It should also be understood that the terminology used in the specification of the application herein is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural unless the context clearly dictates otherwise.

It should also be further understood that, as used in this specification and the appended claims, the term "and/or" refers to and including any and all possible combinations of one or more of the associated listed items .

Please refer to FIG. 1 below. FIG. 1 is a schematic flowchart of an anti-registration method provided by an embodiment of the present application, which specifically includes steps S101 to S104.

S101. Acquire a medical image image and a corresponding anatomical segmentation image, and perform preprocessing on the medical image image and the anatomical segmentation image to obtain a data set, wherein the anatomical segmentation image includes at least one anatomical segmentation image area;

S102, using the data set to learn the preset registration network and discrimination network respectively;

S103. Construct a first loss function for the registration network according to the learned output result of the registration network and the output result of the discriminant network, and construct a first loss function for the discriminant network through confrontational learning between the discriminant network and the registration network The second loss function;

S104. Use the first loss function and the second loss function to respectively perform feedback optimization on the registration network and the discrimination network, and use the optimized registration network to perform registration processing on the designated medical image images.

In this embodiment, first obtain medical image images and corresponding anatomical segmentation images to construct a dataset, and then use the dataset to learn the preset registration network and discriminant network. The registration network and discriminant network can be based on the corresponding input data set, output the corresponding results, and then construct a first loss function for the registration network according to the output results of the registration network and the discriminant network, so as to perform feedback optimization on the registration network, and at the same time according to the discriminant network The second loss function is constructed by adversarial learning with the registration network for feedback optimization of the discriminant network. Then, the optimized registration network can be used to register the specified medical image images.

This embodiment reasonably utilizes the deformable registration technology or registration framework of anatomical segmentation information of medical images (for example, the contours of the heart and the lung are marked in the chest radiograph), which can avoid dependence on the real ground deformation field. Based on the generative adversarial network framework in deep learning technology, the registration framework consists of two deep neural networks, namely the registration network and the discriminant network. Among them, the registration network can be designed as a Nested U-Net structure (a network structure) with three output displacement fields (used to deform the image later), and a residual module is added, which can prevent excessive learning during the learning process. The occurrence of the fitting phenomenon. The discriminant network can use a convolutional neural network structure to judge whether the input images are similar. This embodiment includes a total of two stages, training stage and clinical use. The performance of the registration network is improved by adversarial training with the discriminative network. Compared with the current excellent traditional methods and deep learning methods, the adversarial registration method provided in this embodiment achieves higher registration accuracy while ensuring the registration effectiveness. In the training phase, the registration network has achieved excellent performance through adversarial learning, and the network parameters are preserved, so in practical applications (ie, clinical use), it is not necessary to continue to use the discriminant network.

In one embodiment, the step S101 includes:

Obtain medical image images and corresponding anatomical segmentation images from a medical database;

performing pixel value marking on the anatomical segmented image region in the anatomical segmented image;

The medical imaging image and the anatomical segmentation image are uniformly scaled, so that the size of the medical imaging image and the anatomical segmentation image are adapted to the input size of the neural network formed by the registration network and the discriminant network, thereby obtaining a data set .

In this embodiment, before learning and training the registration network and the discrimination network, it is necessary to preprocess the medical image images and anatomical segmentation images used for training the network model. In specific application scenarios, medical imaging images and anatomical segmentation images can be obtained from public datasets, or provided by hospitals themselves. The preprocessing process is as follows:

First, obtain the medical image image with anatomical segmentation from the medical database. Of course, if there is no anatomical segmentation image, the outline of the organ can also be segmented by an experienced surgeon, or obtained by some existing image segmentation technology or software;

Then, pixel values are marked on the parts of the organs in the obtained anatomical segmented images (that is, the anatomically segmented image regions), for example, different pixel values from 1 to N represent different organs, where N represents the number of different segmented organs, Taking the chest X-ray as an example, you can set the pixel value of the left lung segmentation as 1, the right lung as 2, and the heart as 3;

Next, the obtained medical image images and anatomical segmentation images are uniformly scaled, and the scaling ratio needs to be determined according to the input size of the actual application network (ie, the registration network and the discriminant network), so as to adapt to the input size of the neural network. It should be noted that the neural network described in this embodiment refers to the registration network and the discrimination network, and the input sizes of the two are the same, and can be set by themselves according to the actual situation.

In an embodiment, as shown in FIG. 2 , the step S102 includes steps S201 to S205.

S201. Randomly select a medical imaging image and an anatomical segmentation image in the data set, and use them as a fixed image and a fixed segmentation image respectively, and then randomly select another medical imaging image and another anatomical image in the data set. Segment the image and use it as a moving image and a moving segmented image, respectively;

In this step, in the preprocessed data set, randomly select a medical image image and an anatomical segmentation image as a fixed image { _IF ∈ R ⁿ } and a fixed segmentation segmentation {S _F ∈ R ⁿ }, where R ⁿ represents an n-dimensional space, for example R ³ represents a 3-dimensional space. Similarly, another medical imaging image and another anatomical segmentation image are randomly selected as the moving image {I _M ∈ R ⁿ } and the moving segmentation image {S _M ∈ R ⁿ }.

S202. Combine the fixed image and the moving image as an image pair, and combine the fixed segmented image and the moving segmented image as a segmented image pair, and based on the input requirements of the registration network, respectively set the number and the matching Image pairs and segmented image pairs with the same number of quasi-network batches;

In this step, the fixed image and the moving image selected in step S201 are combined into an image pair, and the selected fixed segmented image and the moving segmented image are combined into a segmented image pair. It should be noted that because the input of the registration network is batch_size pairs of images, this step needs to be performed batch_size times, that is, batch_size pairs of images and split image pairs can be obtained.

S203, inputting the image pair to the registration network, and obtaining the displacement field between the moving image in the image pair and the pixels of the fixed image through the forward propagation of the registration network;

In this step, the registration network is used to predict the deformation field between the pixels of the moving image and the fixed image in the image pair with the deformation field φ:R( _IF , _IM ; θ), so as to output the corresponding displacement field. Among them, φ represents the deformation field predicted by the registration network (here the deformation field is obtained by indirect calculation, and the actual predicted output of the registration network is the displacement of each pixel point, that is, the displacement field, by adding the original value of each pixel point. The coordinates can obtain the position of each pixel after deformation, also known as the deformation field), and θ represents the internal parameters of the registration network, such as a function internal parameter, which can be optimized by learning.

It can be understood that, in the process of training and learning the registration network and the discriminant network, the convolution kernel parameters in the registration network and the discriminant network can first be performed according to the normal distribution with a mean value of 0 and a standard deviation equal to 0.01. Initialize the operation and then enter the iterative training process.

S204, using the grid resampling module to perform spatial transformation on the moving image and the moving segmented image in the segmented image pair according to the displacement field, and obtain the corresponding folded image and the folded segmented image by a linear interpolation method;

In this step, the grid resampling module performs spatial transformation on the moving image and the moving segmented image according to the generated displacement field, and uses the linear interpolation method to obtain the folded image

and fold the split image

Here, the grid resampling module calculates the deformation field according to the input displacement field, and then uses the calculated deformation field to spatially deform the moving image, that is, the folded image is constructed by using the deformed position of each pixel. The pixel position is often not an integer, so it is necessary to use interpolation methods to estimate the size of the pixel value at the integer position. In a specific application scenario, bilinear interpolation is used to obtain folded images and folded and segmented images. For example, a two-dimensional image is estimated by using four surrounding points, while a three-dimensional image is estimated by using eight points.

S205. Add noise to the fixed segmented image in the segmented image pair to obtain a fixed segmented image with noise, input the folded segmented image and the fixed segmented image with noise into the discrimination network, and pass the The discriminant network outputs the segmentation similarity of the segmented image pairs.

In this step, different from the registration network, the function of the discriminant network is to predict the similarity of the generated segmented image pairs, that is, to output the corresponding segmentation similarity.

In this embodiment, the image pairs are input into the registration network for learning, and the segmented image pairs are input into the discrimination network for learning, so that the registration network and the discrimination network output corresponding results, that is, the Folded images, folded segmented images, displacement fields and segmentation similarity, etc. In the subsequent steps, a loss function can be constructed for the registration network and the discriminant network according to the output structures of the registration network and the discriminant network, thereby improving the performance of the registration network and the discriminant network.

In an embodiment, as shown in FIG. 3 , the step S203 includes steps S301 to S307.

S301. Input the image pair to the registration network;

S302. Encode the image pair by sequentially using the first encoder module and the second encoder module in the registration network, and output the first encoding of the image pair;

S303, decoding the first code by the first decoder module and the second decoder module in turn, and outputting to obtain the first displacement field;

S304, encoding the first encoding by a third encoder module, and outputting the second encoding of the image pair;

S305, decoding the second encoding by the third decoder module, the fourth decoder module and the fifth decoder module in turn, and outputting to obtain the second displacement field;

S306, encoding the second encoding by the fourth encoder module, and outputting the third encoding of the image pair;

S307. Decode the third code by using the sixth decoder module, the seventh decoder module, the eighth decoder module and the ninth decoder module in sequence, and output the third displacement field.

In this embodiment, the input image pair is encoded and decoded by the registration network, the registration network performs forward propagation, and in the form of a deformation field, converts the moving image in the image pair to the fixed image The complex deformation field between the pixels is predicted, so as to obtain the displacement field (ie the first displacement field, the second displacement field and the third displacement field). The displacement field represents the displacement of the pixels in the moving image, and uses different channels to represent different spatial axes. For example, a 2D image needs to represent the displacement on the X-axis and Y-axis, which is represented by a 2-channel displacement field. The 3D image needs to represent the displacement on the X-axis, Y-axis and Z-axis, which is represented by a 3-channel displacement field. The dimension of the displacement field in this embodiment may be 4 dimensions (ie, a 2D image) or may be 5 dimensions (ie, a 3D image). In this embodiment, the registration network is designed as a Nested U-Net structure (a network structure) with three output displacement fields (for deforming the image later), and a residual module is added, which can prevent the learning process. Occurrence of overfitting.

It should be noted that the network structures of the second encoder module, the third encoder module, and the fourth encoder module in this embodiment are the same. In a specific embodiment, the second encoder module includes multiple layers The convolution kernel is a 3×3 convolution layer, and an activation function, etc. The network structures of the first decoder module, the third decoder module, the fourth decoder module, the sixth decoder module, the seventh decoder module and the eighth decoder module are the same. The network structures of the fifth decoder module and the ninth decoder module are the same. In a specific embodiment, the first decoder module includes a multi-layer convolution kernel with a 3×3 deconvolution layer (ie, a transposed convolution layer). layer) and activation functions, etc.

In one embodiment, the step S205 includes:

inputting the folded segmented image and the fixed segmented image with noise to the discriminant network;

Pass through the first convolutional layer, the first maximum pooling layer, the second convolutional layer, the second maximum pooling layer, the third convolutional layer, the third maximum pooling layer, and the fourth convolutional layer of the discriminant network in sequence layer and the fourth max pooling layer process the folded segmentation image and the fixed segmentation image with noise, and then input the processed folded segmentation image and the fixed segmentation image with noise into the fully connected layer, and pass the The activation function outputs the final segmentation similarity.

In this embodiment, the discriminant network includes network structures such as multi-layer convolution layers, multi-layer max pooling layers, fully connected layers, and activation functions. The discriminant network processes the segmented image pairs, and outputs corresponding The segmentation similarity of the folded image is increased, thereby increasing the anatomical rationality for the folded image. In a specific application scenario, the output of the discriminant network can be regarded as a part of the loss function of the registration network, so as to constrain the registration network. Specifically, the output of the discriminant network includes two pairs of similarities, namely the segmentation similarity between the folded segmentation image and the fixed segmentation image with noise and the similarity between the fixed segmentation image and the fixed segmentation image with noise self-similarity. There is an adversarial relationship between the discriminant network and the registration network. In the process of confrontation, the registration network hopes that the displacement field of the predicted output can pair the obtained folded segmented image with the fixed segmented image with noise, thereby allowing the The output of the discriminant network has a higher segmentation similarity. The discriminant network expects to be able to separate the folded and segmented images, that is, the segmentation similarity between the folded and segmented images and the fixed segmented images with noise is expected to be low, and the difference between the fixed and fixed segmented images with noise is expected. The self-similarity between them is high, thus forming a confrontational relationship.

In one embodiment, the step S103 includes:

The normalized cross-correlation is used to calculate the cross-correlation value of the folded image and the fixed image according to the following formula:

where NCC( _IF , _IM ) is the cross-correlation value, _IW (p) is the p-th folded image, and _IF (p) is the p-th fixed image;

According to the following formula, the image similarity between the folded image and the fixed image is calculated by using the image difference hash value between the folded image and the fixed image:

DH( _IF , _IM )=|dHash(I _W )-dHash( _IF )|

where DH( _IF , _IM ) is the image similarity, _dHash (IW) is the hash value of the folded image, and dHash( _IF ) is the hash value of the fixed image;

The image loss of the image pair is constructed from the cross-correlation value and the image similarity according to the following formula:

L _sim ( _IF , _IM )=λ _i1 *NCC( _IF , _IM )+λ _i2 *DH( _IF , _IM )

In the formula, L _sim ( _IF , _IM ) is the image loss, λ is the weight factor, and i1 and i2 are the hyperparameter factors preset by the two metrics respectively;

Generate adversarial functions via binary cross-entropy:

L _{G_adv} = -ln(p ⁺ )

In the formula, p ⁺ is the segmentation similarity between the folded segmented image and the fixed segmented image with noise;

The segmentation image loss is generated according to the adversarial function as follows:

In the formula, _L _sim (SF , _SM ) is the segmentation image loss, _SF is the folded segmented image, _SM is the fixed segmented image with noise, CE is the folded segmented image and the fixed segmented image with noise The cross entropy loss function between, n is the number of marked organs, k is the kth organ, s1, s2 are the hyperparameter factors preset by the two metrics respectively;

The regularization loss is generated as follows:

where L _reg (φ) is the regularization loss, p is the coordinates on different channels of the displacement field, and φ(p) is the displacement field output by the registration network;

Based on the image loss, segmentation image loss and regularization loss, deep supervised learning is used to construct the first loss function:

In the formula, L _G is the first loss function.

In this embodiment, the loss function (ie, the first loss function) of the registration network is calculated by using the folded image, folded segmentation, displacement field, and segmentation similarity output by the registration network and the discrimination network. , performing feedback optimization on the registration network through the first loss function, thereby improving the performance of the registration network, and finally improving the registration accuracy of medical image images.

It should be noted that the registration network in this embodiment generates three different displacement fields with the Nested U-Net structure of multi-output displacement fields. Therefore, the deep supervision method is adopted, and the feedback information of the three displacement fields is used to simultaneously adjust the displacement field. The parameters of the registration network are adjusted to further improve the performance of the registration network.

In one embodiment, the step S103 further includes:

The second loss function is constructed as follows:

L _{D_adv} = -ln(p ^- )+ln(1-p ⁺ )

where L _{D_adv} is the second loss function, p ⁺ is the folded segmented image and the fixed segmented image with noise

The segmentation similarity between, p ^- is the self-similarity between the fixed segmented image and the fixed segmented image with noise.

In this embodiment, the loss function of the discriminant network (ie, the second loss function) comes from adversarial learning. For the purpose of confrontation, the discriminant network hopes that the similarity between the predicted folded image and the fixed image is as low as possible Some. And the fixed segmentation after adding noise is also input into the discriminant network to adjust the training of the discriminant network.

In a specific embodiment, as shown in FIG. 4 , an image pair is input to a registration network, the corresponding displacement field is output by the registration network, and a grid resampler is used to align the moving image according to the displacement field. Perform spatial transformation with the moving segmented image in the segmented image pair, and obtain the corresponding folded image and folded segment (ie, folded segmented image) through a linear interpolation method, and obtain the image loss of the registration network according to the folded image and the fixed image. At the same time, noise is added to the fixed segmentation (that is, the fixed segmentation image), and the fixed segmentation with noise (that is, the fixed segmentation image) is obtained, and then the folded segmentation image and the fixed segmentation image with noise are input into the discriminant network. The corresponding segmentation image similarity is output to obtain the segmentation loss. In addition, the regularization loss of the registration network (ie, the regularization loss) can also be obtained according to the displacement field output by the registration network. The first loss function of the registration network can be constructed according to the obtained image loss, segmentation loss and regular term loss, so as to use the first loss function to perform feedback optimization on the registration network.

FIG. 5 is a schematic block diagram of an anti-registration apparatus 500 provided by an embodiment of the present application. The apparatus 500 includes:

The image preprocessing unit 501 is configured to obtain a medical image image and a corresponding anatomical segmentation image, and preprocess the medical image image and the anatomical segmentation image to obtain a data set, wherein the anatomical segmentation image includes at least one anatomical segmentation image image area;

A learning unit 502, configured to use the data set to learn the preset registration network and discrimination network respectively;

The first construction unit 503 is used for constructing a first loss function for the registration network according to the output result of the registration network after learning and the output result of the discriminant network, and for adversarial learning through the discriminant network and the registration network constructing a second loss function for the discriminant network;

The registration processing unit 504 is configured to use the first loss function and the second loss function to respectively perform feedback optimization on the registration network and the discrimination network, and use the optimized registration network to register the designated medical image images quasi-processing.

In one embodiment, the image preprocessing unit 501 includes:

an image acquisition unit for acquiring medical image images and corresponding anatomical segmentation images from a medical database;

a pixel value labeling unit, configured to perform pixel value labeling on the anatomical segmented image region in the anatomical segmented image;

The image scaling unit is used for uniformly scaling the medical imaging image and the anatomical segmentation image, so that the size of the medical imaging image and the anatomical segmentation image is the same as the input size of the neural network formed by the registration network and the discriminant network. adapted to obtain a dataset.

In one embodiment, as shown in FIG. 6 , the learning unit 502 includes:

The image selection unit 601 is used to randomly select a medical image image and an anatomical segmentation image in the data set, and use them as a fixed image and a fixed segmentation image respectively, and then randomly select another medical image image in the data set and another anatomical segmented image, and used as a moving image and a moving segmented image, respectively;

The image combining unit 602 is used to combine the fixed image and the moving image as an image pair, and combine the fixed segmented image and the moving segmented image as a segmented image pair, and based on the input requirements of the registration network, respectively set The number of image pairs and segmented image pairs is the same as the number of batches of the registration network;

A displacement field obtaining unit 603, configured to input the image pair to the registration network, and obtain the displacement field between the pixels of the moving image and the fixed image in the image pair through the forward propagation of the registration network ;

The spatial transformation unit 604 is configured to use the grid resampling module to perform spatial transformation on the moving image and the moving segmented image in the segmented image pair according to the displacement field, and obtain the corresponding folded image and folded image through a linear interpolation method split image;

A discriminant network unit 605, configured to add noise to the fixed segmented images in the pair of segmented images to obtain a fixed segmented image with noise, and input the folded segmented image and the fixed segmented image with noise into the discriminant network , and output the segmentation similarity of the segmented image pair through the discriminant network.

In one embodiment, as shown in FIG. 7 , the displacement field acquisition unit 603 includes:

a first input unit 701, configured to input the image pair to the registration network;

A first encoding unit 702, configured to sequentially encode the image pair through the first encoder module and the second encoder module in the registration network, and output the first encoding of the image pair;

The first decoding unit 703 is used for decoding the first code through the first decoder module and the second decoder module in sequence, and outputting to obtain the first displacement field;

A second encoding unit 704, configured to encode the first encoding through a third encoder module, and output the second encoding to obtain the image pair;

The second decoding unit 705 is configured to sequentially decode the second encoding by the third decoder module, the fourth decoder module and the fifth decoder module, and output the second displacement field;

A third encoding unit 706, configured to encode the second encoding by the fourth encoder module, and output the third encoding of the image pair;

The third decoding unit 707 is configured to sequentially decode the third code through the sixth decoder module, the seventh decoder module, the eighth decoder module and the ninth decoder module, and output the third displacement field.

In one embodiment, the discriminating network unit 605 includes:

a second input unit, configured to input the folded segmented image and the fixed segmented image with noise to the discriminant network;

A segmented image processing unit for sequentially passing through the first convolutional layer, the first maximum pooling layer, the second convolutional layer, the second maximum pooling layer, the third convolutional layer, and the third maximum pooling layer of the discriminant network The folded segmentation image and the fixed segmentation image with noise are processed by the folded segmentation layer, the fourth convolutional layer and the fourth max pooling layer, and then the processed folded segmentation image and the fixed segmentation image with noise are input to In the fully connected layer, the final segmentation similarity is output through the activation function.

In one embodiment, the first construction unit 503 includes:

The cross-correlation value calculation unit is used to calculate the cross-correlation value of the folded image and the fixed image by using the normalized cross-correlation according to the following formula:

The image similarity calculation unit is used to calculate the image similarity between the folded image and the fixed image by using the image difference hash value between the folded image and the fixed image according to the following formula:

DH( _IF , _IM )=|dHash(I _W )-dHash( _IF )|

An image loss construction unit, configured to construct an image loss of the image pair according to the cross-correlation value and the image similarity according to the following formula:

L _sim ( _IF , _IM )=λ _i1 *NCC( _IF , _IM )+λ _i2 *DH( _IF , _IM )

Adversarial function generation unit for generating adversarial functions via binary cross-entropy:

L _{G_adv} = -ln(p ⁺ )

A segmented image loss generation unit, configured to generate a segmented image loss according to the adversarial function according to the following formula:

Regularization loss generation unit, which is used to generate the regularization loss according to the following formula:

The second construction unit is configured to construct the first loss function using deep supervised learning based on the image loss, segmentation image loss and regularization loss:

In the formula, L _G is the first loss function.

In one embodiment, the first construction unit 503 includes:

The third building unit is used to build the second loss function according to the following formula:

L _{D_adv} = -ln(p ^- )+ln(1-p ⁺ )

In the formula, L _{D_adv} is the second loss function, p ⁺ is the segmentation similarity between the folded segmentation image and the fixed segmentation image with noise, p ⁻ is the fixed segmentation image and the fixed segmentation image with noise self-similarity between them.

Since the embodiment of the apparatus part corresponds to the embodiment of the method part, for the embodiment of the apparatus part, please refer to the description of the embodiment of the method part, which will not be repeated here.

The embodiments of the present application further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed, the steps provided by the above-mentioned embodiments can be implemented. The storage medium may include: U disk, removable hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program codes.

Embodiments of the present application further provide a computer device, which may include a memory and a processor, where a computer program is stored in the memory, and when the processor calls the computer program in the memory, the steps provided in the above embodiments can be implemented. Of course, the computer equipment may also include various network interfaces, power supplies and other components.

The various embodiments in the specification are described in a progressive manner, and each embodiment focuses on the differences from other embodiments, and the same and similar parts between the various embodiments can be referred to each other. For the system disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant part can be referred to the description of the method. It should be pointed out that for those of ordinary skill in the art, without departing from the principles of the present application, several improvements and modifications can also be made to the present application, and these improvements and modifications also fall within the protection scope of the claims of the present application.

It should also be noted that, in this specification, relational terms such as first and second, etc. are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply these entities or operations. There is no such actual relationship or sequence between operations. Moreover, the terms "comprising", "comprising" or any other variation thereof are intended to encompass a non-exclusive inclusion such that a process, method, article or device that includes a list of elements includes not only those elements, but also includes not explicitly listed or other elements inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in a process, method, article, or device that includes the element.

Claims

An adversarial registration method, comprising:

Obtaining a medical imaging image and a corresponding anatomical segmentation image, and preprocessing the medical imaging image and the anatomical segmentation image to obtain a data set, wherein the anatomical segmentation image includes at least one anatomical segmentation image area;

Use the data set to learn the preset registration network and discriminant network respectively;

A first loss function is constructed for the registration network according to the learned output of the registration network and the output of the discriminant network, and a second loss function is constructed for the discriminant network through confrontational learning between the discriminant network and the registration network. loss function;

Feedback optimization is performed on the registration network and the discrimination network respectively by using the first loss function and the second loss function, and the designated medical image image is registered by using the optimized registration network.
The adversarial registration method according to claim 1, wherein the obtaining a medical image image and a corresponding anatomical segmentation image, and performing preprocessing on the medical image image and the anatomical segmentation image to obtain a data set, comprising:

Obtain medical image images and corresponding anatomical segmentation images from a medical database;

performing pixel value marking on the anatomical segmented image area in the anatomical segmented image;

The medical imaging image and the anatomical segmentation image are uniformly scaled, so that the size of the medical imaging image and the anatomical segmentation image are adapted to the input size of the neural network formed by the registration network and the discriminant network, thereby obtaining a data set .
The adversarial registration method according to claim 1, characterized in that, using the data set to learn a preset registration network and a discriminant network respectively, comprising:

A medical imaging image and an anatomical segmentation image are randomly selected in the data set as fixed image and fixed segmentation image respectively, and then another medical imaging image and another anatomical segmentation image are randomly selected in the data set , and as a moving image and a moving segmented image, respectively;

Combining the fixed image and the moving image as an image pair, and combining the fixed segmented image and the moving segmented image as a segmented image pair, and based on the input requirements of the registration network, respectively setting the number and the registration network Image pairs and segmented image pairs with the same number of batches;

inputting the image pair to the registration network, and obtaining the displacement field between the pixels of the moving image and the fixed image in the image pair through forward propagation of the registration network;

Utilize the grid resampling module to spatially transform the moving image and the moving segmented image in the pair of segmented images according to the displacement field, and obtain the corresponding folded image and the folded segmented image through a linear interpolation method;

Adding noise to the fixed segmented image in the segmented image pair to obtain a fixed segmented image with noise, inputting the folded segmented image and the fixed segmented image with noise to the discriminant network, and through the discriminant network The segmentation similarity of the segmented image pair is output.
The adversarial registration method according to claim 3, wherein the image pair is input to the registration network, and the moving image in the image pair is acquired through forward propagation of the registration network to the displacement field between the pixels of the fixed image, including:

inputting the image pair to the registration network;

Encoding the image pair through the first encoder module and the second encoder module in the registration network in turn, and outputting the first encoding of the image pair;

The first code is decoded by the first decoder module and the second decoder module in turn, and the output is obtained to obtain the first displacement field;

Encoding the first encoding by the third encoder module, and outputting the second encoding of the image pair;

The second code is decoded by the third decoder module, the fourth decoder module and the fifth decoder module in sequence, and the second displacement field is obtained by outputting;

Encoding the second encoding by the fourth encoder module, and outputting the third encoding of the image pair;

The third code is decoded by the sixth decoder module, the seventh decoder module, the eighth decoder module and the ninth decoder module in sequence, and the third displacement field is outputted.
The adversarial registration method according to claim 3, wherein the folded segmented image and the fixed segmented image with noise are input to the discriminant network, and the segmented image is output through the discriminant network Pair segmentation similarity, including:

inputting the folded segmented image and the fixed segmented image with noise to the discriminant network;

Pass through the first convolutional layer, the first maximum pooling layer, the second convolutional layer, the second maximum pooling layer, the third convolutional layer, the third maximum pooling layer, and the fourth convolutional layer of the discriminant network in sequence layer and the fourth max pooling layer process the folded segmentation image and the fixed segmentation image with noise, and then input the processed folded segmentation image and the fixed segmentation image with noise into the fully connected layer, and pass the The activation function outputs the final segmentation similarity.
The adversarial registration method according to claim 4 or 5, characterized in that, constructing a first loss function for the registration network according to the learned output result of the registration network and the output result of the discriminant network, comprising:

The normalized cross-correlation is used to calculate the cross-correlation value of the folded image and the fixed image according to the following formula:

where NCC( IF , IM ) is the cross-correlation value, IW (p) is the p-th folded image, and IF (p) is the p-th fixed image;

According to the following formula, the image similarity between the folded image and the fixed image is calculated by using the image difference hash value between the folded image and the fixed image:

DH( IF , IM )=|dHash(I W )-dHash( IF )|

where DH( IF , IM ) is the image similarity, dHash (IW) is the hash value of the folded image, and dHash( IF ) is the hash value of the fixed image;

The image loss of the image pair is constructed from the cross-correlation value and the image similarity according to the following formula:

L sim ( IF , IM )=λ i1 *NCC( IF , IM )+λ i2 *DH( IF , IM )

In the formula, L sim ( IF , IM ) is the image loss, λ is the weight factor, and i1 and i2 are the hyperparameter factors preset by the two metrics respectively;

Generate adversarial functions via binary cross-entropy:

L G_adv = -ln(p + )

In the formula, p + is the segmentation similarity between the folded segmented image and the fixed segmented image with noise;

The segmentation image loss is generated according to the adversarial function as follows:

In the formula, L sim (SF , SM ) is the segmentation image loss, SF is the folded segmented image, SM is the fixed segmented image with noise, CE is the folded segmented image and the fixed segmented image with noise The cross entropy loss function between, n is the number of marked organs, k is the kth organ, s1, s2 are the hyperparameter factors preset by the two metrics respectively;

The regularization loss is generated as follows:

where L reg (φ) is the regularization loss, p is the coordinates on different channels of the displacement field, and φ(p) is the displacement field output by the registration network;

Based on the image loss, segmentation image loss and regularization loss, deep supervised learning is used to construct the first loss function:

In the formula, L G is the first loss function.
The adversarial registration method according to claim 5, wherein the adversarial learning by the discriminant network and the registration network to construct a second loss function for the discriminant network comprises:

The second loss function is constructed as follows:

L D_adv = -ln(p - )+ln(1-p + )

In the formula, L D_adv is the second loss function, p + is the segmentation similarity between the folded segmentation image and the fixed segmentation image with noise, p − is the fixed segmentation image and the fixed segmentation image with noise self-similarity between them.
An anti-registration device, comprising:

An image preprocessing unit, configured to obtain a medical image image and a corresponding anatomical segmentation image, and preprocess the medical image image and the anatomical segmentation image to obtain a data set, wherein the anatomical segmentation image includes at least one anatomical segmentation image area;

a learning unit, configured to use the data set to learn the preset registration network and discriminant network respectively;

The first construction unit is used for constructing a first loss function for the registration network according to the output result of the registration network after learning and the output result of the discriminant network, and confronting learning through the discriminant network and the registration network as The discriminant network constructs a second loss function;

a registration processing unit, configured to use the first loss function and the second loss function to respectively perform feedback optimization on the registration network and the discrimination network, and use the optimized registration network to register the designated medical image images deal with.
A computer device, characterized by comprising a memory, a processor and a computer program stored on the memory and running on the processor, the processor implementing the computer program according to claims 1 to 1 when the processor executes the computer program The adversarial registration method of any one of 7.
A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the adversarial registration according to any one of claims 1 to 7 is implemented method.