CN115690426A - Image segmentation method and device based on multi-label fusion and storage medium


Info

Publication number
CN115690426A
Authority
CN
China
Prior art keywords
image
images
label
target
module
Prior art date
Legal status
Pending
Application number
CN202211430807.XA
Other languages
Chinese (zh)
Inventor
高阳
宋宠宠
王德峰
刘禹辰
宁晓琳
Current Assignee
Hangzhou Innovation Research Institute of Beihang University
Original Assignee
Hangzhou Innovation Research Institute of Beihang University
Priority date: 2022-11-15
Filing date: 2022-11-15
Publication date: 2023-02-03
Application filed by Hangzhou Innovation Research Institute of Beihang University
Priority to CN202211430807.XA
Publication of CN115690426A


Abstract

The invention relates to an image segmentation method based on multi-label fusion, which comprises the following steps: S1, acquiring a target image to be segmented, and registering N preset atlases with the target image respectively to obtain N registered images and N deformation labels in one-to-one correspondence with the atlases, where each atlas comprises an original floating image containing a target organ or tissue and an original label annotating the target organ or tissue in the floating image; S2, inputting the target image and the N registered images into an adaptive voting weight network model to obtain N voting weight maps in one-to-one correspondence with the registered images; S3, point-multiplying the deformation label of each registered image with its corresponding voting weight map and superposing the results to perform label fusion and obtain a predicted label; and S4, extracting the region corresponding to the predicted label from the target image to obtain a segmentation result. Based on only a small amount of labeled training data, the method effectively improves the segmentation accuracy of image segmentation based on multi-atlas registration.

Description

Image segmentation method and device based on multi-label fusion and storage medium
Technical Field
The invention relates to the technical field of medical image segmentation, and in particular to an image segmentation method, device and storage medium based on multi-label fusion.
Background
Medical image segmentation is an important task in computer-aided diagnosis. Its main purpose is to accurately identify organ regions, tissue regions, lesion regions, tumor regions and the like at the pixel level, and it is an important basis for the diagnosis and treatment of many diseases. For example, accurate liver segmentation provides the physician with volume information of the liver, an important basis for treatment steps such as liver surgery planning. However, the complexity of texture, the variability between imaged individuals, the differences between imaging devices and imaging principles, and the blurred boundaries between tissues and organs in real medical images make medical image segmentation a very challenging task.
With the progress of artificial intelligence algorithms in recent years, segmentation methods based on deep learning can learn the image features of the target organ or tissue from training data and therefore achieve good segmentation performance. However, such methods rely on pre-training with a large amount of high-quality training data in which the target organ or tissue has been labeled. Labels for medical images usually require detailed annotation by experienced medical experts, so usable training data are difficult to obtain. It is therefore necessary to develop medical image segmentation methods that work with a small amount of training data.
Image segmentation based on multi-atlas registration is one important such method. Its core steps are medical image registration and label fusion. It exploits the fact that the appearance of human organs and tissues shows a certain similarity across individuals: by registering several atlases to the target image to be segmented, the mapping between the labels of the target tissue or organ in the atlases and the target image is obtained, and a predicted label for the target image is then produced by label fusion. The method can segment an organ in the target image using only a small amount of labeled training data and, compared with other medical image segmentation methods, is more robust across medical images of various modalities.
In practice, existing multi-atlas-registration segmentation methods rely on traditional medical image registration. Although traditional registration is satisfactory in accuracy, it is an iterative process that requires considerable computation time and memory, and it can hardly meet clinical real-time requirements.
In addition, most existing label fusion methods use voting, taking the gray-level similarity between an atlas and the target image as the main voting basis, so atlases with high gray-level similarity receive high voting weights. Gray-level similarity as the voting basis, however, easily ignores the spatial relationship between the images and the deformation of the target organ or tissue, so label fusion risks local-optimum problems and segmentation accuracy drops. Moreover, because all voxels in one atlas share the same voting weight, label fusion depends too heavily on the atlases with higher weights: when an atlas obtains a high overall voting weight because of high local gray-level similarity, its poorly matching parts are carried into the fused predicted label by that high weight, which reduces image segmentation accuracy.
Disclosure of Invention
Technical problem to be solved
In view of the above defects of the prior art, the present invention provides an image segmentation method based on multi-label fusion, together with an electronic device and a storage medium, which solve the technical problem of low segmentation accuracy in existing image segmentation methods based on multi-atlas registration.
(II) technical scheme
To achieve the above purpose, the invention adopts the following main technical scheme:
In a first aspect, an embodiment of the present invention provides an image segmentation method based on multi-label fusion, used to segment a target image to be segmented and obtain a segmentation result for a target organ or tissue in the target image. The method comprises:
S1, acquiring a target image to be segmented, and registering N preset atlases with the target image respectively to obtain N registered images and N deformation labels in one-to-one correspondence with the atlases;
wherein each atlas comprises an original floating image containing the target organ or tissue and an original label annotating the target organ or tissue in that floating image;
S2, inputting the target image and the N registered images into an adaptive voting weight network model to obtain N voting weight maps in one-to-one correspondence with the registered images;
wherein each voting weight map indicates the voting weight of the deformation label of the corresponding registered image on whether each voxel in the target image belongs to the target organ or tissue;
S3, point-multiplying the deformation label of each registered image with its corresponding voting weight map and superposing the results to perform label fusion and obtain a predicted label (a compact formula follows this list);
and S4, extracting the region corresponding to the predicted label from the target image to obtain the segmentation result.
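Stated compactly, in notation of our own rather than the patent's: writing $L_i$ for the deformation label of the $i$-th registered image, $W_i$ for its voting weight map and $v$ for a voxel, the fusion of S3 and the extraction of S4 amount to

$$\hat{L}(v)=\sum_{i=1}^{N} W_i(v)\,L_i(v), \qquad M(v)=\mathbf{1}\!\left[\hat{L}(v)\ge\tau\right],$$

where $\hat{L}$ is the fused predicted label (a per-voxel probability), $\tau$ is the threshold of the threshold method described later, and $M$ is the binary map from which the region of the target organ or tissue is cut out of the target image.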
The image segmentation method provided by the embodiment of the invention obtains, from an adaptive voting weight network model, N voting weight maps in one-to-one correspondence with the N atlases, point-multiplies the deformation label of each atlas with its voting weight map, and superposes the results to fuse the labels into a predicted label. In other words, the method assigns a voting weight to every voxel of the deformation label of each atlas. Compared with prior-art methods that give all voxels in one atlas the same voting weight, this per-voxel assignment is more flexible: the weight of each voxel can follow the similarity between that voxel and the target image, which avoids the situation in which an atlas obtains a high overall voting weight because of high local similarity and its poorly matching parts are then introduced into the predicted label. The segmentation accuracy of the image is thereby effectively improved.
In addition, compared with prior-art methods that take only the gray-level similarity between atlas and target image as the voting basis, the method compares the atlas and the target image as wholes inside the adaptive voting weight network model to obtain the voting weight map. The global features of atlas and target image thus enter the voting basis, the local-optimum problem is avoided to a certain extent, the accuracy of voting weight assignment improves, and the segmentation accuracy of the image improves further.
Optionally, in S2 the adaptive voting weight network model is a convolutional neural network whose model parameters were adapted in advance by a first training process;
the adaptive voting weight network model comprises an input module and an output module, with a down-sampling branch and an up-sampling branch connected in sequence between them; the input module performs convolution on the N+1 input images to obtain a primary feature map; the down-sampling branch performs down-sampling on the primary feature map to obtain multi-level feature maps carrying voting weight information about the deformation labels; the up-sampling branch performs up-sampling on the feature map output by the down-sampling branch to obtain the voting weight information about the deformation labels; and the output module obtains the N voting weight maps from the voting weight information output by the up-sampling branch.
Optionally, the down-sampling branch comprises a first, a second and a third down-sampling module connected in sequence, with the input end of the first down-sampling module connected to the output end of the input module;
the up-sampling branch comprises a first, a second and a third up-sampling module, with the output end of the third up-sampling module connected to the input end of the output module;
the input end of the first up-sampling module is connected to the output end of the third down-sampling module;
the input end of the second up-sampling module is connected through a skip connection module to the output ends of the first up-sampling module and the second down-sampling module;
and the input end of the third up-sampling module is connected through a skip connection module to the output ends of the second up-sampling module and the first down-sampling module.
Optionally, S1 comprises:
S101, acquiring a target image to be segmented;
S102, rigidly registering the original floating images in the atlases with the target image to obtain N rigidly registered floating images and N rigidly registered labels;
S103, inputting the rigidly registered floating images one by one, together with the target image, into a VoxelMorph network model for elastic registration, to obtain N deformation fields corresponding to the rigidly registered floating images;
S104, spatially transforming each rigidly registered floating image and its corresponding rigidly registered label with the deformation field of that floating image to obtain N registered images and N deformation labels (a sketch of these substeps follows below);
wherein the VoxelMorph network model is a network model whose parameters were adapted in advance by a second training process.
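As a minimal sketch of substeps S101 to S104, the driver below strings the stages together; the helper names `rigid_register`, `predict_field` and `warp` are placeholders of ours standing in for the rigid-registration model, the elastic-registration model and a deformation-field resampler, not names from the patent:

```python
def register_atlases(target, atlases, rigid_register, predict_field, warp):
    """S101-S104: register N atlases to a target image.

    `atlases` is a list of (original_floating_image, original_label) pairs;
    the three callables are placeholders for the rigid-registration model,
    the elastic-registration model and a spatial transformer.
    """
    registered_images, deformation_labels = [], []
    for floating, label in atlases:
        # S102: rigid registration aligns the organ's position first
        rigid_img, rigid_lab = rigid_register(floating, label, target)
        # S103: elastic registration yields a dense deformation field
        field = predict_field(rigid_img, target)
        # S104: the same field transforms both the image and its label
        registered_images.append(warp(rigid_img, field))
        # Labels are assumed to be warped with nearest-neighbour interpolation
        deformation_labels.append(warp(rigid_lab, field, nearest=True))
    return registered_images, deformation_labels
```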
Optionally, S102 comprises:
inputting the original floating images one by one, together with the target image, into an AIR-Net (adversarial image registration network) model to obtain predicted transformation parameters, and resampling each original floating image and its original label with the predicted transformation parameters to obtain N rigidly registered floating images and N rigidly registered labels;
wherein the AIR-Net network model is a network model whose parameters were adapted in advance by a third training process.
Optionally, in S1 the atlases are atlas images: the atlas images comprise N sequence images containing the target organ or tissue, each sequence image carrying a sequence label annotating the target organ or tissue; the sequence images serve as the original floating images and the sequence labels as the original labels.
Optionally, S3 comprises: point-multiplying the deformation labels of the N atlases with their corresponding voting weight maps to obtain N intermediate labels, and superposing and fusing the N intermediate labels to obtain the predicted label;
wherein the predicted label is a probability map indicating, for each voxel in the target image, the probability that it belongs to the target organ or tissue.
Optionally, S4 comprises: converting the probability map by thresholding into a binary map marking the region of the target organ or tissue, and extracting the image of the target organ or tissue from the target image with the binary map to obtain the segmentation result.
In a second aspect, the present invention further provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor; when executed by the processor, the computer program implements the steps of the image segmentation method based on multi-label fusion according to the first aspect.
In a third aspect, the present invention further provides a computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the image segmentation method based on multi-label fusion according to the first aspect.
(III) advantageous effects
Based on the adaptive voting weight network model, the image segmentation method provided by the embodiments of the invention assigns a voting weight to every voxel of the deformation label of each atlas. Compared with prior-art methods that give all voxels in one atlas the same voting weight, this assignment is more flexible: the weight of each voxel can follow its similarity to the target image, which avoids the situation in which an atlas obtains a high overall voting weight because of high local similarity and its poorly matching parts are then introduced into the predicted label; the segmentation accuracy of the image is thereby effectively improved. In addition, compared with methods that take only the gray-level similarity between atlas and target image as the voting basis, the model compares atlas and target image as wholes to obtain the voting weight map, so the global features of both enter the voting basis, the local-optimum problem is avoided to a certain extent, the accuracy of voting weight assignment improves, and the segmentation accuracy of the image improves further.
The image segmentation method provided by the embodiments of the invention also provides a registration scheme based on the VoxelMorph network model. VoxelMorph rapidly generates the deformation field between the input target image and a rigidly registered floating image, running in seconds or milliseconds, which fully meets clinical real-time requirements; it furthermore supports unsupervised learning, reducing the dependence on labeled training data. However, if VoxelMorph were used directly to register the original floating image with the target image, registration accuracy could drop because the position of the target organ or tissue differs strongly between the original floating image and the fixed image. The invention therefore first rigidly registers the original floating images with the target image so that the positions of the target organ or tissue are aligned, obtaining N rigidly registered floating images and labels, and only then inputs them one by one, together with the target image, into the VoxelMorph network model for elastic registration, which safeguards VoxelMorph's registration accuracy. Compared with existing medical image registration methods, this registration scheme balances the speed and the accuracy of image registration and improves the clinical applicability of the proposed image segmentation method.
Drawings
Fig. 1 is a schematic flowchart of an image segmentation method based on multi-label fusion provided in an embodiment;
FIG. 2 is a block diagram of an adaptive voting weight network model according to an embodiment.
Detailed Description
In order to better understand the above technical solutions, exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Example one
As shown in Fig. 1, the present embodiment provides an image segmentation method based on multi-label fusion, used to segment a target image to be segmented and obtain a segmentation result for a target organ or tissue in the target image. The target image may be an MRI (Magnetic Resonance Imaging), CT (Computed Tomography), PET (Positron Emission Tomography) or SPECT (Single-Photon Emission Computed Tomography) image; in some application scenarios the target image is also called the reference image or the fixed image. The method can be implemented on any computer device and comprises:
s1, obtaining a target image to be segmented, and registering N preset atlases with the target image respectively to obtain N registered images and N deformation labels which correspond to the atlases one by one.
Wherein N is a positive integer, and the atlas comprises original floating images containing the target organ or tissue and original labels used for labeling the target organ or tissue in each original floating image. In practical applications, a floating image (movie) is sometimes called a moving image or a template image.
In particular, the atlas may be an atlas image comprising N sequential images containing the target organ or tissue, and each sequential image comprising a sequential label for labeling the target organ or tissue; and taking the sequence image as an original floating image, and taking the sequence label as an original label.
S2, inputting the target image and the N registration images into a self-adaptive voting weight network model to obtain N voting weight graphs which correspond to the registration images one by one;
wherein the voting weight graph is used to indicate: the deformation label corresponding to the registration image votes for whether each voxel in the target image is a target organ or tissue.
And S3, multiplying and superposing the deformed labels of the registered images and the corresponding voting weight images to perform label fusion to obtain predicted labels.
And S4, extracting the area corresponding to the prediction label from the target image to obtain a segmentation result.
The image segmentation method provided by this embodiment obtains, from the adaptive voting weight network model, N voting weight maps in one-to-one correspondence with the N atlases, and performs label fusion by point-multiplying the deformation label of each atlas with its corresponding voting weight map and superposing the results to obtain the predicted label. In other words, the method assigns a voting weight to every voxel of the deformation label of each atlas, which is more flexible than giving all voxels in one atlas the same voting weight as in the prior art: the weight of each voxel can follow the similarity between that voxel and the target image, so an atlas can no longer obtain a high overall voting weight from high local similarity and carry its poorly matching parts into the predicted label, and the segmentation accuracy of the image is effectively improved. In addition, compared with prior-art methods that take the gray-level similarity between atlas and target image as the voting basis, the method extracts features from the atlas and the target image as wholes inside the adaptive voting weight network model and compares their similarity to obtain the voting weight map, so the global features of both enter the voting basis, the local-optimum problem is avoided to a certain extent, the accuracy of voting weight assignment improves, and the segmentation accuracy of the image improves further.
Specifically, in step S2 the adaptive voting weight network model is a convolutional neural network whose model parameters were adapted in advance by a first training process.
The adaptive voting weight network model comprises an input module and an output module, with a down-sampling branch and an up-sampling branch connected in sequence between them. The input module performs convolution on the N+1 input images to obtain a primary feature map; the down-sampling branch obtains, from the primary feature map, multi-level feature maps carrying voting weight information about the deformation labels; the up-sampling branch performs up-sampling on the feature map output by the down-sampling branch to obtain the voting weight information about the deformation labels; and the output module obtains the N voting weight maps from the voting weight information output by the up-sampling branch.
More specifically, the adaptive voting weight network model comprises an input module, a down-sampling branch, an up-sampling branch, skip connection modules, and an output module.
The down-sampling branch comprises a first, a second and a third down-sampling module connected in sequence, with the input end of the first down-sampling module connected to the output end of the input module.
The up-sampling branch comprises a first, a second and a third up-sampling module, with the output end of the third up-sampling module connected to the input end of the output module.
The input end of the first up-sampling module is connected to the output end of the third down-sampling module; the input end of the second up-sampling module is connected through a skip connection module to the output ends of the first up-sampling module and the second down-sampling module; and the input end of the third up-sampling module is connected through a skip connection module to the output ends of the second up-sampling module and the first down-sampling module.
These modules form a convolutional neural network similar in structure to the U-Net model. When the up-sampling branch up-samples a feature map, it fuses in the feature maps of the same level from the down-sampling branch, so the feature information of every level extracted by the down-sampling branch is used to the fullest; a good training effect can therefore be reached with a small amount of training data, and processing is fast. Once trained, the adaptive voting weight network model quickly produces the N voting weight maps from the N+1 input images.
It should be noted that the adaptive voting weight network model learns how to assign voting weights to each deformation label in a first training process carried out in advance, which yields the adapted model parameters. The first training process comprises the following steps (a training-loop sketch follows the list):
A0, acquiring a training set comprising a plurality of training images containing the target organ or tissue and training labels identifying the target organ or tissue in those images.
A1, registering the N atlases with a training image, which plays the role of the target image, according to the method in S1, to obtain N registered images and N deformation labels in one-to-one correspondence with the atlases.
A2, inputting the training image and the N registered images into the adaptive voting weight network model to obtain N voting weight maps.
A3, point-multiplying the deformation label of each registered image with its voting weight map and superposing the results for label fusion to obtain the predicted label of the training image.
A4, computing the voting loss from the predicted label and the training label of the training image according to a loss function, and adjusting the model parameters of the adaptive voting weight network model according to the voting loss. The loss function computes difference information between the predicted and training labels and may specifically be a cross-entropy loss or a Dice loss.
A5, repeating steps A1 to A4 until the loss function converges, which yields the convolutional neural network model with adapted model parameters.
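A minimal PyTorch sketch of steps A1 to A5 follows, assuming the S1 registration is wrapped in a `register_atlases(target, atlases)` helper like the one sketched earlier (with the model callables bound), and that the weight network takes the N+1 volumes concatenated along the channel axis; the Dice loss is one of the two losses the text names, and all tensor layouts here are assumptions:

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss: difference information between predicted and training labels
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def train(weight_net, optimizer, loader, atlases, register_atlases, epochs=100):
    for _ in range(epochs):                                        # A5: loop to convergence
        for image, label in loader:                                # A0: (B,1,D,H,W) pairs
            reg_imgs, def_labels = register_atlases(image, atlases)  # A1
            x = torch.cat([image] + reg_imgs, dim=1)               # (B, N+1, D, H, W)
            weights = weight_net(x)                                # A2: (B, N, D, H, W)
            # A3: per-voxel point multiplication, then superposition over atlases
            pred = (weights * torch.cat(def_labels, dim=1)).sum(dim=1)
            loss = dice_loss(pred, label.squeeze(1))               # A4: voting loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                                       # A4: adjust parameters
```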
Example two
In order to better understand the first embodiment, the present embodiment describes the method in detail with specific steps, taking the liver as the target organ and atlas images of the liver as the atlases.
This embodiment provides an image segmentation method based on multi-label fusion, used to segment a target image to be segmented and obtain a segmentation result for the liver in the target image. The method comprises the following steps:
s1, acquiring a target image to be segmented, and respectively registering N preset atlas images with the target image to obtain N registered images and N deformation labels which are in one-to-one correspondence with the atlas image.
The atlas image comprises N sequential images containing a liver, and each sequential image comprises a sequential label for labeling the liver; and taking the sequence image as an original floating image and taking the sequence label as an original label.
S1 comprises the following substeps:
S101, acquiring a target image to be segmented.
S102, inputting the original floating images one by one, together with the target image, into an AIR-Net network model to obtain predicted transformation parameters, and resampling each original floating image and its original label with the predicted transformation parameters to obtain N rigidly registered floating images and N rigidly registered labels. The AIR-Net network model is a network model whose parameters were adapted in advance by a third training process. AIR-Net is based on a generative adversarial network (GAN) and consists of a generator and a discriminator: the generator directly estimates the transformation parameters from the floating image to the target image; an image sampler then registers the floating image with the estimated parameters and with the ground-truth parameters respectively; and the discriminator judges whether an image pair was aligned by the estimated or the ground-truth parameters. The model can be trained by unsupervised learning, further reducing the reliance on labeled data sets.
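A much-simplified stand-in for the generator-plus-sampler half of such an adversarial setup is sketched below; it is not the published AIR-Net code, and the 6-parameter rigid transform, the network widths and the use of `affine_grid`/`grid_sample` are our assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RigidGenerator(nn.Module):
    """Estimates a 3D rigid transform (3 rotations, 3 translations)
    from a concatenated (floating, target) volume pair."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 6)  # rx, ry, rz, tx, ty, tz

    def forward(self, floating, target):
        p = self.head(self.features(torch.cat([floating, target], dim=1)))
        R = _rot(p[:, 2], 2) @ _rot(p[:, 1], 1) @ _rot(p[:, 0], 0)
        return torch.cat([R, p[:, 3:].unsqueeze(-1)], dim=-1)  # (B, 3, 4)

def _rot(a, axis):
    # Batched rotation matrix about one axis
    c, s = torch.cos(a), torch.sin(a)
    o, z = torch.ones_like(a), torch.zeros_like(a)
    rows = {0: [o, z, z, z, c, -s, z, s, c],
            1: [c, z, s, z, o, z, -s, z, c],
            2: [c, -s, z, s, c, z, z, z, o]}[axis]
    return torch.stack(rows, dim=-1).view(-1, 3, 3)

def resample(floating, theta):
    # The "image sampler": warp the floating volume with predicted parameters
    grid = F.affine_grid(theta, floating.shape, align_corners=False)
    return F.grid_sample(floating, grid, align_corners=False)
```

A discriminator (omitted here) would then judge whether a (resampled, target) pair was aligned by predicted or ground-truth parameters, as described above.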
S103, inputting the rigidly registered floating images one by one, together with the target image, into a VoxelMorph network model for elastic registration, to obtain N deformation fields corresponding to the rigidly registered floating images. The VoxelMorph network model is a network model whose parameters were adapted in advance by a second training process. VoxelMorph performs pairwise registration of medical images: it represents registration as a function that maps an input image pair to a deformation field aligning the pair, parameterizes this function with a convolutional neural network (CNN), and optimizes it over a set of images of interest. For a given pair of images to be registered, VoxelMorph obtains the deformation field by direct function evaluation, which markedly increases the speed of image registration. Moreover, VoxelMorph can optimize its model parameters through an unsupervised training process, further reducing the dependence on labeled data sets.
S104, spatially transforming each rigidly registered floating image and its corresponding rigidly registered label with the deformation field of that floating image to obtain N registered images and N deformation labels.
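The spatial transformation in S104 can be written as a standard dense-flow warp; the sketch below uses plain PyTorch rather than VoxelMorph's own spatial-transformer layer, and it assumes the deformation field stores per-voxel displacements in voxel units:

```python
import torch
import torch.nn.functional as F

def warp(volume, flow, mode="bilinear"):
    """Warp a (B, C, D, H, W) volume with a (B, 3, D, H, W) displacement field."""
    B, _, D, H, W = volume.shape
    zz, yy, xx = torch.meshgrid(torch.arange(D), torch.arange(H),
                                torch.arange(W), indexing="ij")
    grid = torch.stack((zz, yy, xx)).float().to(volume.device)   # identity grid (3,D,H,W)
    coords = grid.unsqueeze(0) + flow                            # displaced voxel coords
    sizes = torch.tensor([D, H, W], dtype=coords.dtype, device=coords.device)
    coords = 2.0 * coords / (sizes.view(1, 3, 1, 1, 1) - 1) - 1.0  # normalize to [-1, 1]
    coords = coords.permute(0, 2, 3, 4, 1).flip(-1)              # (B,D,H,W,3), (x,y,z) order
    return F.grid_sample(volume, coords, mode=mode, align_corners=True)
```

The registered image uses bilinear interpolation, while the deformation label would be warped with `mode="nearest"` so it stays a valid label map (an assumption; the text does not specify the interpolation).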
In steps S102 and S103, the AIR-Net and VoxelMorph network models are public network models, and the third and second training processes they undergo in advance use conventional techniques in the field, so they are not described here again.
In the registration scheme of step S1, the original floating images are first rigidly registered with the target image by the AIR-Net network model, aligning the orientation of the liver in the floating images with that in the fixed image and producing N rigidly registered floating images; these are then input one by one, together with the target image, into the VoxelMorph network model for elastic registration, which safeguards VoxelMorph's registration accuracy. From an input target image and a rigidly registered floating image, VoxelMorph rapidly generates the deformation field, running in seconds or milliseconds, which fully meets clinical real-time requirements; it furthermore supports unsupervised learning, reducing the dependence on labeled training data. Because the rigid registration step aligns the liver in the floating image with the liver in the fixed image beforehand, the drop in registration accuracy that direct use of VoxelMorph on the original floating image and target image would suffer, caused by a large difference in liver orientation between the two, is avoided. Compared with existing medical image registration methods, the registration process of this embodiment balances the speed and the accuracy of image registration and improves the clinical applicability of the image segmentation method of this embodiment.
S2, inputting the target image and the N registered images into the adaptive voting weight network model to obtain N voting weight maps in one-to-one correspondence with the registered images, wherein each voting weight map indicates the voting weight of the deformation label of the corresponding atlas image on whether each voxel in the target image belongs to the liver.
Specifically, the adaptive voting weight network model is a convolutional neural network whose model parameters were adapted in advance by the first training process. As shown in Fig. 2, the model processes the input target image and the original floating images of the N atlas images as follows. The atlas images and the target image are first input into the input module and convolved twice with 3×3×3 kernels, giving 8 primary feature maps for each image. The primary feature maps are input into the first down-sampling module, whose 3×3×3 convolutional layer produces a 16-channel second-level feature map; to enlarge the receptive field, the second-level feature map is max-pooled by the down-sampling layer and input into the second down-sampling module, whose 3×3×3 convolutional layer produces a 32-channel third-level feature map. The third-level feature map is max-pooled by the down-sampling layer and input into the third down-sampling module, whose 3×3×3 convolutional layer produces a 32-channel fourth-level feature map. The fourth-level feature map is input into the first up-sampling module, where two 3×3×3 deconvolution steps followed by an up-sampling layer yield a 32-channel fifth-level feature map. The third-level and fifth-level feature maps are spliced together by a skip connection module and input into the second up-sampling module, where two passes through a 3×3×3 deconvolution layer yield a 16-channel sixth-level feature map; the sixth-level feature map is up-sampled by the up-sampling layer, spliced with the second-level feature map by a skip connection module, and input into the third up-sampling module, where two 3×3×3 deconvolution steps yield the seventh-level feature map. Finally, the seventh-level feature map is input into the output module and convolved twice with 3×3×3 kernels to obtain the weight map of N channels.
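A minimal PyTorch sketch of this topology follows, under our reading of the embodiment; the activation functions, the padding, the replacement of the deconvolution layers by convolutions plus trilinear up-sampling, and the final softmax normalization of the weight maps are all assumptions not stated in the text:

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    # Two 3x3x3 convolutions; ReLU activations are assumed
    return nn.Sequential(nn.Conv3d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv3d(cout, cout, 3, padding=1), nn.ReLU())

class AdaptiveVotingWeightNet(nn.Module):
    """U-Net-like model: N+1 input volumes -> N voting weight maps."""
    def __init__(self, n_atlases):
        super().__init__()
        self.inp   = conv_block(n_atlases + 1, 8)   # input module: 8 channels
        self.down1 = conv_block(8, 16)              # 16-channel level 2
        self.down2 = conv_block(16, 32)             # 32-channel level 3
        self.down3 = conv_block(32, 32)             # 32-channel level 4
        self.pool  = nn.MaxPool3d(2)
        self.up    = nn.Upsample(scale_factor=2, mode="trilinear",
                                 align_corners=False)
        self.up1   = conv_block(32, 32)             # 32-channel level 5
        self.up2   = conv_block(32 + 32, 16)        # skip: levels 3 + 5
        self.up3   = conv_block(16 + 16, 16)        # skip: levels 2 + 6
        self.out   = nn.Sequential(                 # output module
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, n_atlases, 3, padding=1))

    def forward(self, x):                           # x: (B, N+1, D, H, W)
        f1 = self.inp(x)
        f2 = self.down1(f1)                         # level 2, full resolution
        f3 = self.down2(self.pool(f2))              # level 3, 1/2 resolution
        f4 = self.down3(self.pool(f3))              # level 4, 1/4 resolution
        f5 = self.up(self.up1(f4))                  # level 5, back to 1/2
        f6 = self.up(self.up2(torch.cat([f3, f5], dim=1)))  # level 6, full
        f7 = self.up3(torch.cat([f2, f6], dim=1))   # level 7
        w  = self.out(f7)                           # (B, N, D, H, W)
        return torch.softmax(w, dim=1)              # weights sum to 1 per voxel
```

Usage would be `weights = AdaptiveVotingWeightNet(N)(torch.cat([target] + registered_images, dim=1))`, matching S2.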
S3, point-multiplying the deformation labels of the N atlas images with their corresponding voting weight maps to obtain N intermediate labels, and superposing and fusing the N intermediate labels to obtain the predicted label, wherein the predicted label is a probability map indicating, for each voxel in the target image, the probability that it belongs to the liver.
S4, converting the probability map by thresholding into a binary map marking the region of the target organ or tissue, and extracting the image of the target organ or tissue from the target image with the binary map to obtain the segmentation result.
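S3 and S4 then reduce to a few tensor operations; in this sketch the stacking of labels and weights along a leading atlas axis and the 0.5 threshold are assumptions (the text only names a threshold method):

```python
import torch

def fuse_and_extract(def_labels, weights, target, thresh=0.5):
    """S3 + S4: per-voxel weighted voting, then threshold-based extraction.

    def_labels, weights: (N, D, H, W) deformation labels and voting weight maps
    target:              (D, H, W) target image volume
    """
    prob = (weights * def_labels).sum(dim=0)  # S3: point-multiply and superpose
    mask = prob >= thresh                     # S4: probability map -> binary map
    return target * mask                      # image of the organ region only
```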
EXAMPLE III
The present embodiment provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor; when executed by the processor, the computer program implements the steps of the image segmentation method based on multi-label fusion according to embodiments one and two.
In addition, the present embodiment provides a computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the image segmentation method based on multi-label fusion according to embodiments one and two.
Since the system/apparatus described in the above embodiments is used to implement the methods of the above embodiments of the present invention, a person skilled in the art can understand its specific structure and variations from the methods described, so a detailed description is omitted here. All systems/apparatus adopted by the methods of the above embodiments are within the intended scope of protection of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the terms first, second, third and the like is for convenience only and does not denote any order; these words are to be understood as part of the name of the component.
Furthermore, it should be noted that in the description of the present specification, the description of the term "one embodiment", "some embodiments", "examples", "specific examples" or "some examples", etc., means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, various embodiments or examples and features of various embodiments or examples described in this specification can be combined and combined by one skilled in the art without being mutually inconsistent.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, the claims should be construed to include preferred embodiments and all changes and modifications that fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention should also include such modifications and variations.

Claims (10)

1. An image segmentation method based on multi-label fusion, characterized in that the method is used to segment a target image to be segmented and obtain a segmentation result for a target organ or tissue in the target image, and comprises the following steps:
s1, acquiring a target image to be segmented, and registering N preset atlases with the target image respectively to obtain N registered images and N deformation labels which correspond to the atlases one by one;
the atlas comprises original floating images containing target organs or tissues and original labels used for labeling the target organs or tissues in each original floating image;
s2, inputting the target image and the N registration images into a self-adaptive voting weight network model to obtain N voting weight graphs corresponding to the registration images one by one;
wherein, the voting weight graph is used for marking: registering voting weight of a deformation label corresponding to the image on whether each voxel in the target image is a target organ or tissue;
s3, multiplying and superposing the deformed labels of the registered images and the corresponding voting weight images to perform label fusion to obtain predicted labels;
and S4, extracting the area corresponding to the prediction label from the target image to obtain a segmentation result.
2. The image segmentation method according to claim 1, wherein in S2 the adaptive voting weight network model is a convolutional neural network whose model parameters were adapted in advance by a first training process;
the adaptive voting weight network model comprises an input module and an output module, with a down-sampling branch and an up-sampling branch connected in sequence between them; the input module performs convolution on the N+1 input images to obtain a primary feature map; the down-sampling branch performs down-sampling on the primary feature map to obtain multi-level feature maps carrying voting weight information about the deformation labels; the up-sampling branch performs up-sampling on the feature map output by the down-sampling branch to obtain the voting weight information about the deformation labels; and the output module obtains the N voting weight maps from the voting weight information output by the up-sampling branch.
3. The image segmentation method according to claim 2, wherein
the down-sampling branch comprises a first, a second and a third down-sampling module connected in sequence, with the input end of the first down-sampling module connected to the output end of the input module;
the up-sampling branch comprises a first, a second and a third up-sampling module, with the output end of the third up-sampling module connected to the input end of the output module;
the input end of the first up-sampling module is connected to the output end of the third down-sampling module;
the input end of the second up-sampling module is connected through a skip connection module to the output ends of the first up-sampling module and the second down-sampling module;
and the input end of the third up-sampling module is connected through a skip connection module to the output ends of the second up-sampling module and the first down-sampling module.
4. The image segmentation method according to claim 1, wherein S1 comprises:
S101, acquiring a target image to be segmented;
S102, rigidly registering the original floating images in the atlases with the target image to obtain N rigidly registered floating images and N rigidly registered labels;
S103, inputting the rigidly registered floating images one by one, together with the target image, into a VoxelMorph network model for elastic registration, to obtain N deformation fields corresponding to the rigidly registered floating images;
S104, spatially transforming each rigidly registered floating image and its corresponding rigidly registered label with the deformation field of that floating image to obtain N registered images and N deformation labels;
wherein the VoxelMorph network model is a network model whose parameters were adapted in advance by a second training process.
5. The image segmentation method according to claim 4, wherein S102 comprises:
inputting the original floating images one by one, together with the target image, into an AIR-Net network model to obtain predicted transformation parameters, and resampling each original floating image and its original label with the predicted transformation parameters to obtain N rigidly registered floating images and N rigidly registered labels;
wherein the AIR-Net network model is a network model whose parameters were adapted in advance by a third training process.
6. The image segmentation method according to claim 1, wherein in S1 the atlases are atlas images: the atlas images comprise N sequence images containing the target organ or tissue, each sequence image carrying a sequence label annotating the target organ or tissue; the sequence images serve as the original floating images and the sequence labels as the original labels.
7. The image segmentation method according to claim 1, wherein S3 comprises: point-multiplying the deformation labels of the N atlases with their corresponding voting weight maps to obtain N intermediate labels, and superposing and fusing the N intermediate labels to obtain the predicted label;
wherein the predicted label is a probability map indicating, for each voxel in the target image, the probability that it belongs to the target organ or tissue.
8. The image segmentation method according to claim 7, wherein S4 comprises: converting the probability map by thresholding into a binary map marking the region of the target organ or tissue, and extracting the image of the target organ or tissue from the target image with the binary map to obtain the segmentation result.
9. An electronic device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the image segmentation method based on multi-label fusion according to any one of claims 1 to 8.
10. A computer storage medium, wherein a computer program is stored on the computer storage medium, and the computer program, when executed by a processor, implements the steps of the image segmentation method based on multi-label fusion according to any one of claims 1 to 8.
Application CN202211430807.XA, filed 2022-11-15 (priority date 2022-11-15), published as CN115690426A: Image segmentation method and device based on multi-label fusion and storage medium. Status: pending.

Priority Applications (1)

CN202211430807.XA (priority and filing date 2022-11-15): Image segmentation method and device based on multi-label fusion and storage medium

Publications (1)

CN115690426A, published 2023-02-03

Family

ID: 85052602


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination