WO2023207416A1 - Image completion method, apparatus, device and storage medium - Google Patents

Image completion method, apparatus, device and storage medium

Info

Publication number
WO2023207416A1
WO2023207416A1 PCT/CN2023/082321 CN2023082321W
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
modality
sample
missing
Prior art date
Application number
PCT/CN2023/082321
Other languages
English (en)
French (fr)
Inventor
黄雅雯
郑冶枫
袁一啸
周毅
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2023207416A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 - Image enhancement or restoration
    • G06T 5/77 - Retouching; Inpainting; Scratch removal
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10072 - Tomographic images
    • G06T 2207/10088 - Magnetic resonance imaging [MRI]
    • G06T 2207/20081 - Training; Learning
    • G06T 2207/20084 - Artificial neural networks [ANN]
    • G06T 2207/30004 - Biomedical image processing
    • G06T 2207/30008 - Bone
    • G06T 2207/30016 - Brain
    • G06T 2207/30101 - Blood vessel; Artery; Vein; Vascular

Definitions

  • Embodiments of the present application relate to the field of artificial intelligence, and in particular to an image completion method, device, equipment and storage medium.
  • Image completion is a process that uses the image itself or image library information to complete the missing areas of the image to be repaired, making the repaired image look very natural and difficult to distinguish from the undamaged image.
  • Modality can be understood as the multiple different manifestations of one thing. For example, in Magnetic Resonance Imaging (MRI), changing the factors that influence the signal yields four types of modal images: T1, T2, FLAIR and T1ce. However, due to differences in imaging methods, some images may lack necessary feature information; such an image is called a missing image, and its corresponding modality is called a missing modality.
  • Embodiments of the present application provide an image completion method, device, equipment and storage medium. The technical solutions are as follows:
  • In one aspect, embodiments of the present application provide an image completion method, the method including:
  • obtaining an object image set, where the object image set contains images of the same object in different modalities, and the images include n missing images in a missing modality and m complete images in a complete modality, n and m being positive integers;
  • In another aspect, embodiments of the present application provide an image completion device, the device including:
  • an acquisition module, configured to acquire an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in the missing modality and m complete images in the complete modality, n and m being positive integers;
  • a feature extraction module configured to extract object modality shared features from the complete image, where the object modality shared features are features shared by the missing image and the complete image;
  • a feature restoration module, configured to perform feature restoration on the object modality shared features to obtain a completed image of the missing image.
  • In another aspect, embodiments of the present application provide a computer device. The computer device includes a processor and a memory; at least one program is stored in the memory, and the at least one program is loaded and executed by the processor to implement the image completion method described above.
  • In another aspect, embodiments of the present application provide a computer-readable storage medium in which at least one program is stored, and the at least one program is loaded and executed by a processor to implement the image completion method described above.
  • In another aspect, embodiments of the present application provide a computer program product. The computer program product includes computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the image completion method described above.
  • In the embodiments of the present application, after the computer device obtains a set of object images of the same object, it extracts from the complete image the paired modality shared features between the missing image and the complete image, that is, the object modality shared features, and then performs feature restoration on the object modality shared features to obtain a completed image of the missing image. With the solution provided by the embodiments of the present application, modality completion of missing-modality images is realized while the accuracy of the completion results is ensured, thereby ensuring the quality of image completion.
  • Figure 1 shows a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application
  • Figure 2 shows a flow chart of an image completion method provided by an exemplary embodiment of the present application
  • Figure 3 shows a flow chart of an image completion method provided by another exemplary embodiment of the present application.
  • Figure 4 is a schematic diagram of the implementation of an image completion method according to an exemplary embodiment of the present application.
  • Figure 5 shows a flow chart of a training method for an image completion model provided by an exemplary embodiment of the present application
  • Figure 6 shows a schematic diagram of a training method for an image completion model provided by an exemplary embodiment of the present application
  • Figure 7 shows a flow chart of a training method for an image completion model provided by another exemplary embodiment of the present application.
  • Figure 8 shows a schematic diagram of a training method for an image completion model provided by another exemplary embodiment of the present application.
  • Figure 9 shows a comparison of the completion effects of the embodiments of the present application and related technologies, provided by an exemplary embodiment of the present application.
  • Figure 10 shows a structural block diagram of an image completion device provided by an exemplary embodiment of the present application.
  • Figure 11 shows a schematic structural diagram of a computer device provided by an exemplary embodiment of the present application.
  • Generative Adversarial Network (GAN): a method of unsupervised learning that learns by letting two neural networks compete with each other.
  • a generative adversarial network consists of a generator and a discriminator.
  • the core purpose of generative adversarial networks is to train the generator.
  • the purpose of the generator is to generate images that are as similar as possible to real sample images, and the purpose of the discriminator is to distinguish as much as possible whether a given sample is a real sample or a generated image.
  • The purposes of the two are contrary to each other, and they improve each other in the process of a continuous game. When the discriminator's discriminative ability is reliable enough yet it is still unable to distinguish whether a given sample is a real sample or a generated image, there is no difference between the images generated by the generative model and the sample images that the discriminative model can detect.
  • Magnetic Resonance Imaging (MRI): a medical imaging technology based on the principle of Nuclear Magnetic Resonance (NMR), which uses magnetic fields and radio-frequency waves to form images of human anatomy or physiological processes.
  • An MRI sequence is a set of radiofrequency pulses and a specific set of gradients that produce a specific image.
  • MRI image modalities include T1, T2, FLAIR and T1ce.
  • T1 and T2 are physical quantities used to measure electromagnetic waves, and they can be used as imaging data. Imaging based on T1 is called "T1-weighted imaging", which is referred to as "T1" in clinical work; the same applies to T2.
  • The overall appearance of T1 images is very close to the customary color style of clinical images.
  • the T2 signal is related to the water content.
  • the T2 signal of many lesions is stronger than the surrounding normal tissue and is often highlighted. Therefore, the location and size of the lesion can be clearly seen from T2.
  • The full name of FLAIR is fluid-attenuated inversion recovery, also known as water-suppression imaging technology. It suppresses the high signal of cerebrospinal fluid in T2 (darkening the cerebrospinal fluid), so that lesions adjacent to the cerebrospinal fluid are displayed clearly (brightened). For T1ce, a contrast agent is injected into the blood before the MRI scan; the bright areas are rich in blood supply. Enhanced display indicates rich blood flow, and tumor sites are sites where blood flows very fast. T1ce can also further display the situation inside the tumor and distinguish tumors from non-neoplastic lesions (i.e., necrotic sites).
  • Artificial Intelligence (AI): a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can respond in a similar way to human intelligence.
  • Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • Artificial intelligence technology is a comprehensive subject covering a wide range of fields, including both hardware-level and software-level technologies.
  • Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, mechatronics and other technologies.
  • Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • Computer Vision (CV) technology: a science that studies how to make machines "see"; more specifically, it refers to using cameras and computers instead of human eyes to identify and measure targets, and further performing graphics processing so that computers produce images more suitable for human eyes to observe or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies, trying to build artificial intelligence systems that can obtain information from images or multi-dimensional data. Computer vision technology usually includes technologies such as image processing, image recognition, image segmentation, image semantic understanding, image retrieval, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also includes common biometric identification technologies such as face recognition and fingerprint recognition.
  • the image completion method involved in the embodiments of the present application can improve the training effect of the image completion model and thereby improve the accuracy of the completion results of the trained image completion model.
  • FIG. 1 shows a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application.
  • the implementation environment includes computer equipment 110 and server 120.
  • the computer device 110 and the server 120 perform data communication through a communication network.
  • The communication network may be a wired network or a wireless network, and may be at least one of a local area network, a metropolitan area network, and a wide area network.
  • the computer device 110 is an electronic device with image completion requirements.
  • the electronic device may be a smart phone, a tablet computer, a personal computer, etc. This embodiment is not limited thereto.
  • an application having an image completion function is installed or running in the computer device 110 .
  • The user inputs the images in the missing modality and the images in the complete modality into the application in the form of an object image set 121; the object image set 121 is uploaded to the server 120, and the server 120 performs image completion on the images in the missing modality of the object and feeds the image completion result back to the user.
  • Server 120 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.
  • The computer device 110 uploads the image collection 121 to the server 120, and the server 120 performs image completion through the image completion model 122 to obtain the completed image 123, where the image completion model 122 is an encoder-decoder network; the server 120 then transmits the completed image 123 back to the computer device 110 so that the computer device 110 displays the image completion result.
  • the image completion model can also be deployed in the computer device 110, so that the computer device 110 can implement image completion locally and reduce the processing pressure of the server 120. This embodiment is not limited to this.
  • The above image completion model can be trained by the server 120, or can be deployed on the server 120 side after being trained by other devices.
  • For convenience of description, each of the following embodiments is described by taking as an example the image completion method being applied to a computer device and the image completion model training being executed by the computer device.
  • The image completion method shown in the embodiments of the present application can be applied to various image completion tasks; below, image completion of medical images is taken as an example for explanation.
  • FIG. 2 shows a flow chart of an image completion method provided by an exemplary embodiment of the present application.
  • This embodiment uses the method applied to computer equipment as an example to illustrate. The method includes the following steps:
  • Step 201 Obtain an object image collection.
  • The object image collection contains images of the same object in different modalities, and the images include n missing images in the missing modality and m complete images in the complete modality, n and m being positive integers.
  • the object may be the central nervous system, brain, bones, spinal cord, blood vessels, etc.
  • the embodiments of the present application do not limit the specific objects.
  • After the computer device obtains the object image collection of the same object, it needs to perform an image preprocessing operation on the images in the object image collection, so that the input format of the images is consistent with the input format used in the model training process.
  • The image preprocessing operation may include at least one of preprocessing operations such as scale transformation, image normalization, image grayscaling, image enhancement, and image filtering; the embodiments of this application do not limit the specific preprocessing operation method.
  • The embodiment of this application takes brain tumor images as an example, where the modalities include the T1 modality, the T1ce modality, the T2 modality and the FLAIR modality.
  • In a possible implementation, the computer device obtains an object image set of the same object and obtains images from the object image set; the images include n missing images in the missing modality and m complete images in the complete modality of the same object, n and m being positive integers, where a missing image in the missing modality is an image that needs to be completed, and a complete image in the complete modality serves as a reference image in the image completion process.
  • the modalities of the image include T1 modality, T1ce modality, T2 modality and FLAIR modality.
  • The missing images in the missing modality included in the object image collection can be: a missing image corresponding to a missing T1 modality (the missing modality corresponding to the missing image is the T1 modality), a missing image corresponding to a missing T1ce modality (the missing modality is the T1ce modality), a missing image corresponding to a missing T2 modality (the missing modality is the T2 modality), a missing image corresponding to a missing FLAIR modality (the missing modality is the FLAIR modality), etc.
  • the images in the complete modality in the object image collection are the images without modality missing.
  • Step 202 Extract object modality shared features from the complete image.
  • the object modality shared features are features common to the missing image and the complete image.
  • A feature is an (essential) characteristic that distinguishes one type of object from other types of objects, or a collection of such characteristics. Since missing images differ across missing modalities, in order to restore complete images for missing images in various missing modalities, it is necessary to extract features common to images in different modalities, and the complete images, which have no missing modality, can be used to extract the required image features. Therefore, in a possible implementation, the computer device can use a machine learning model to perform feature extraction on the images and extract from the complete image the paired object modality shared features between the complete image and the missing image, that is, the features shared by the complete image and the missing image.
  • Further, the computer device extracts the paired modality shared features from the m complete images corresponding to the m complete modalities; that is, for each missing modality, the computer device can extract m object modality shared features from the m complete images corresponding to the complete modalities.
  • Step 203 Perform feature restoration on the object modality shared features to obtain a completed image of the missing image.
  • Image completion refers to the repair and reconstruction of damaged images, while a missing image is an image that is missing a certain modality.
  • After the computer device obtains the object modality shared features between the missing image and the complete image, it can perform feature restoration based on the object modality shared features to complete the damaged missing image, thereby generating a completed image of the missing image; the completed image has no missing modality.
  • In summary, after the image completion model obtains a collection of object images of the same object, it extracts the object modality shared features from the complete image and then performs feature restoration on the object modality shared features to obtain a completed image of the missing image. In this way, the image with the missing modality can be modally completed while the accuracy of the completion results is ensured, thereby ensuring the quality of image completion.
  • the computer device pre-trains the image completion model through machine learning, so that in the model application stage, the image modality can be completed for the missing image based on the image completion model.
  • The image completion model consists of a feature encoder and a feature decoder. The feature encoder is used to extract from the complete image the feature information shared between the complete image and the missing image (the object modality shared features), and the feature decoder performs feature restoration on the object modality shared features extracted by the feature encoder to obtain the completed image.
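  • As an illustrative sketch (not the patent's literal implementation), the per-modality encoder-decoder structure described above can be organized as follows; the module boundaries, names and the fusion callable are assumptions of this sketch, with the encoder, decoder and fusion internals elaborated in the embodiments below.

    import torch.nn as nn

    MODALITIES = ("T1", "T1ce", "T2", "FLAIR")

    class ImageCompletionModel(nn.Module):
        """One feature encoder and one feature decoder per modality."""
        def __init__(self, encoders: nn.ModuleDict, decoders: nn.ModuleDict, fuse):
            super().__init__()
            self.encoders, self.decoders, self.fuse = encoders, decoders, fuse

        def complete(self, missing: str, complete_images):
            # The feature encoder of the *missing* modality extracts one
            # paired object modality shared feature from each complete image.
            shared = [self.encoders[missing](x) for x in complete_images]
            # The m shared features are fused into a fixed-size decoder
            # input, and the decoder restores the completed image.
            return self.decoders[missing](self.fuse(shared))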
  • FIG. 3 shows a flow chart of an image completion method provided by another exemplary embodiment of the present application.
  • This embodiment uses the method applied to computer equipment as an example to illustrate. The method includes the following steps:
  • Step 301 Obtain an object image collection.
  • The object image collection contains images of the same object in different modalities, and the images include n missing images in the missing modality and m complete images in the complete modality, n and m being positive integers.
  • For the implementation of this step, reference can be made to step 201; details are not repeated in this embodiment.
  • Step 302 Input the missing image and the complete image into the feature encoder of the missing modality, where different modalities correspond to different feature encoders.
  • each modality of the image has a corresponding feature encoder.
  • Schematically, the image completion model includes the feature encoder of the T1 modality, the feature encoder of the T1ce modality, the feature encoder of the T2 modality and the feature encoder of the FLAIR modality.
  • Therefore, the computer device needs to determine the missing modality corresponding to the missing image. For example, if the missing modality of missing image A is the T1 modality and feature encoder 1 corresponds to the T1 modality, then missing image A and the complete images are input into feature encoder 1 together; if the missing modality of missing image B is the T1ce modality and feature encoder 2 corresponds to the T1ce modality, then missing image B and the complete images are input into feature encoder 2 together; if the missing modality of missing image C is the T2 modality and feature encoder 3 corresponds to the T2 modality, then missing image C and the complete images are input into feature encoder 3 together; and if the missing modality of missing image D is the FLAIR modality and feature encoder 4 corresponds to the FLAIR modality, then missing image D and the complete images are input into feature encoder 4 together. That is, the missing image and the complete images are input into the feature encoder corresponding to the missing modality.
  • The feature encoder is a mixture-of-experts network composed of conditional convolutions, and the parameters of the conditional convolution are determined based on the modality corresponding to the feature encoder.
  • A Mixture of Experts (MOE) system integrates multiple expert models for a single task.
  • Specifically, the image completion model uses a feature encoder composed of conditional convolution (CondConv) blocks, where the parameters of the conditional convolution are determined by the input modality corresponding to the feature encoder.
  • With $S$ expert models, the conditional convolution is defined as $y=\sigma\big((w_1^{(i)}W_1+\cdots+w_S^{(i)}W_S)\ast x\big)$, where $x$ represents the input image, $i$ represents the input modality, $\sigma(\cdot)$ is the sigmoid activation function, $\ast$ represents conventional convolution, $\{W_1,\ldots,W_S\}$ are the network parameters related to the $S$ experts, and $w_s^{(i)}$ is the blending weight for a specific modality.
  • In a possible implementation, the feature encoder consists of a downsampling module and residual blocks; the downsampling module includes a 7×7 conditional convolution block with a stride of 1 and two 4×4 conditional convolution blocks with a stride of 2.
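  • A minimal sketch of such a modality-conditioned convolution in PyTorch follows; the expert count, the learned per-modality routing table and the sigmoid gating are assumptions of this sketch consistent with the formula above, not details fixed by the text.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CondConv2d(nn.Module):
        """Conditional convolution: S expert kernels {W_1, ..., W_S} blended
        with modality-specific weights w_s^(i) before a single convolution."""
        def __init__(self, in_ch, out_ch, kernel_size, stride=1,
                     num_experts=4, num_modalities=4):
            super().__init__()
            self.stride = stride
            self.padding = kernel_size // 2
            self.experts = nn.Parameter(
                torch.randn(num_experts, out_ch, in_ch, kernel_size, kernel_size))
            self.routing = nn.Parameter(torch.zeros(num_modalities, num_experts))

        def forward(self, x, modality: int):
            w = torch.sigmoid(self.routing[modality])                # w_s^(i)
            kernel = (w.view(-1, 1, 1, 1, 1) * self.experts).sum(0)  # blend experts
            return F.conv2d(x, kernel, stride=self.stride, padding=self.padding)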
  • The computer device obtains n missing images corresponding to the missing modality and m complete images corresponding to the complete modality, where a missing image is denoted as x1 and the complete images are denoted as {x_j | j ≤ m}.
  • Step 303 Use the feature encoder to extract features from the missing image and the complete image to obtain object modality shared features.
  • The computer device performs feature extraction on the missing image and the complete images through the feature encoder; the extracted feature information is the feature information, common to the missing image and the complete images, that the computer device extracts from the complete images, that is, the object modality shared features.
  • In a possible implementation, the computer device uses the feature encoder to perform feature extraction on the missing image and the i-th complete image in turn to obtain the i-th object modality shared feature, where the i-th complete image belongs to the m complete images and i is less than or equal to m.
  • Schematically, if the missing modality of the missing image is the T1 modality, the image completion model uses feature encoder 1 to perform feature extraction on the missing image and the complete images {x_j | j ≤ m}, and the object modality shared features are extracted by feature encoder 1 from the complete images x2, x3 and x4 respectively. It should be noted that since the missing modality of the missing image is the T1 modality, only feature encoder 1 corresponding to the T1 modality works at this time, and the feature encoders corresponding to the other three complete modalities do not need to work. Similarly, if the missing modality of the missing image is the T2 modality, only the feature encoder corresponding to the T2 modality works, and the feature encoders corresponding to the other three complete modalities do not need to work.
  • Step 304 Input the shared features of the object modality into the feature decoder of the missing modality, where different modalities correspond to different feature decoders.
  • each modality of the image has a corresponding feature decoder.
  • Schematically, the image completion model includes the feature decoder of the T1 modality, the feature decoder of the T1ce modality, the feature decoder of the T2 modality and the feature decoder of the FLAIR modality.
  • In a possible implementation, the computer device inputs the object modality shared features into the feature decoder of the missing modality. For example, if the object modality shared features are output by the feature encoder of the T1 modality, they are input into the feature decoder of the T1 modality; if they are output by the feature encoder of the T1ce modality, they are input into the feature decoder of the T1ce modality; if they are output by the feature encoder of the T2 modality, they are input into the feature decoder of the T2 modality; and if they are output by the feature encoder of the FLAIR modality, they are input into the feature decoder of the FLAIR modality. That is, the computer device inputs the object modality shared features into the feature decoder of the missing modality to which the shared features correspond.
  • step 304 may include step 304A and step 304B.
  • Step 304A Feature fusion is performed on the shared features of m object modalities to obtain fused shared features.
  • Step 304B Input the fused shared features into the feature decoder of the missing modality.
  • For a single missing image, a single feature encoder obtains m paired object modality shared features. However, m is not fixed, so the number of object modality shared features is not fixed, while the input of the feature decoder has a fixed size. Therefore, the computer device needs to process the m paired object modality shared features and input the processed object modality shared features into the feature decoder.
  • the computer device first performs feature fusion on shared features of m object modalities to obtain fused shared features, and inputs the fused shared features into the feature decoder of the missing modality.
  • In a possible implementation, the computer device first performs a pooling operation on each of the m object modality shared features to obtain the pooling result of each shared feature, and then performs feature splicing on the pooling results to achieve feature fusion and obtain the fused shared features.
  • The pooling operation is a very common operation in convolutional neural networks; it imitates the human visual system in reducing the dimensionality of data and is also commonly called subsampling or downsampling. The significance of pooling lies in feature dimensionality reduction: it greatly reduces the consumption of computing resources and also helps reduce model overfitting.
  • In a possible implementation, the computer device performs pooling processing on the i-th object modality shared feature through at least two pooling methods to obtain at least two pooled features corresponding to the i-th object modality shared feature, and then splices the pooled features corresponding to the m object modality shared features to obtain the fused shared features.
  • The pooling method can be general pooling, overlapping pooling, spatial pyramid pooling, center pooling, max pooling (Max-Pooling), average pooling (Mean-Pooling), min pooling (Min-Pooling), stochastic pooling (Stochastic-Pooling), global average pooling (Global Average Pooling), etc.; the embodiments of this application do not limit the specific pooling method.
  • Schematically, the computer device performs three kinds of pooling processing on the object modality shared features, namely max pooling, average pooling and min pooling, and performs feature splicing on the three pooled features obtained after the pooling processing to obtain the fused shared features while retaining as much feature information as possible.
  • The computer device inputs the fused shared features into the feature decoder corresponding to the missing modality. Since it cannot be guaranteed at this time that the number of channels of the fused shared features is consistent with the number of channels of the feature decoder, in order to ensure that the two are consistent, in a possible implementation the computer device performs channel dimensionality reduction or channel dimensionality raising on the fused shared features to obtain the processed fused shared features, where the number of channels of the processed fused shared features is consistent with the number of output channels of the feature encoder, and the processed fused shared features are input into the feature decoder of the missing modality.
  • the computer device can perform channel dimensionality reduction or channel dimensionality processing through methods such as interpolation, convolution, or principal component analysis, which is not limited in this embodiment.
  • Schematically, the computer device performs channel dimensionality reduction or raising on the fused shared features by using a 1×1 convolution, thereby ensuring that the number of channels of the fused shared features is consistent with the number of channels of the feature decoder, and then inputs the processed fused shared features into the feature decoder corresponding to the missing modality.
  • j ⁇ m ⁇ output by feature encoder 1 are multi-pooled feature fused to obtain the fusion sharing Feature 1, and then input the fused shared feature 1 into the feature decoder 1.
  • the modes corresponding to the feature decoder 1 and the feature encoder 1 are the same and are the missing modes of the missing image x1.
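  • The fusion step above might look as follows in a sketch; taking the max/average/min across the m shared features (so that the output size is independent of m) is this sketch's interpretation of the multi-pooling fusion, and the 1×1 convolution performs the channel dimensionality matching described above.

    import torch
    import torch.nn as nn

    class MultiPoolFusion(nn.Module):
        """Fuse a variable number m of shared features into a fixed-size
        decoder input: max/average/min pooling, splicing, then 1x1 conv."""
        def __init__(self, channels: int):
            super().__init__()
            self.proj = nn.Conv2d(3 * channels, channels, kernel_size=1)

        def forward(self, shared_feats):          # list of [B, C, H, W] tensors
            stack = torch.stack(shared_feats, 0)  # [m, B, C, H, W]
            pooled = torch.cat([stack.max(0).values,   # max pooling over m
                                stack.mean(0),         # average pooling
                                stack.min(0).values],  # min pooling
                               dim=1)                  # feature splicing
            return self.proj(pooled)              # channels match the decoder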
  • Step 305 Use a feature decoder to restore the features shared by object modalities to obtain a completed image.
  • the computer device uses a feature decoder to restore features shared by object modalities to obtain a completed image.
  • In a possible implementation, the feature decoder includes 4 residual blocks, each containing two 3×3 conditional convolution blocks with 256 filters and a stride of 1; two nearest-neighbor upsamplers, each followed by a 5×5 conditional convolution block with a stride of 1, are used to upsample the fused shared features to the original image size. The numbers of filters are 64-128-256-128-64, and finally a 7×7 conditional convolution block with a stride of 1 and one filter outputs the completed image.
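  • A structural sketch of this feature decoder follows, reusing the CondConv2d module from the earlier sketch; the placement of activations and the exact filter progression between stages are assumptions of this sketch.

    import torch.nn as nn
    import torch.nn.functional as F

    class FeatureDecoder(nn.Module):
        """4 residual blocks (two 3x3 conditional convolutions each), two
        nearest-neighbor upsampling stages each followed by a 5x5 conditional
        convolution, and a final 7x7 conditional convolution with one filter."""
        def __init__(self, modality: int):
            super().__init__()
            self.modality = modality
            self.res_blocks = nn.ModuleList(
                nn.ModuleList([CondConv2d(256, 256, 3), CondConv2d(256, 256, 3)])
                for _ in range(4))  # CondConv2d as defined in the earlier sketch
            self.up = nn.Upsample(scale_factor=2, mode="nearest")
            self.up_convs = nn.ModuleList(
                [CondConv2d(256, 128, 5), CondConv2d(128, 64, 5)])
            self.out_conv = CondConv2d(64, 1, 7)

        def forward(self, c):
            i = self.modality
            for conv1, conv2 in self.res_blocks:
                c = c + conv2(F.relu(conv1(c, i)), i)   # residual block
            for conv in self.up_convs:
                c = conv(self.up(c), i)                 # upsample to image size
            return self.out_conv(c, i)                  # completed image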
  • In summary, the computer device performs feature fusion on the m object modality shared features to obtain the fused shared features and inputs them into the feature decoder, so that the feature decoder can perform feature restoration on the fused shared features to obtain the completed image.
  • Schematically, as shown in Figure 4, feature decoder 1 performs feature restoration on fused shared feature 1 to obtain the completed image x1'.
  • In the above embodiment, the computer device inputs the missing image and the complete images into the feature encoder corresponding to the missing modality and extracts features of the missing image and the complete images through the feature encoder. The object modality shared features are fused to obtain the fused shared features, the fused shared features are subjected to channel dimensionality reduction or raising, and the processed fused shared features are input into the feature decoder corresponding to the missing modality, through which the computer device performs feature restoration on the fused shared features to obtain the completed image. In this way, the computer device keeps the fused shared features consistent with the number of channels of the feature decoder, and through multi-pooling feature fusion it improves the robustness of the extracted features, reduces information redundancy and prevents over-fitting, thereby ensuring the accuracy of the image completion results.
  • the above embodiments describe the application process of the image completion model.
  • the following uses exemplary embodiments to describe the training process of the image completion model.
  • Figure 5 shows the process of the training method of the image completion model provided by an exemplary embodiment of the present application.
  • the method includes:
  • Step 501 Obtain a sample image set.
  • The sample image set contains sample images of the same sample object in different modalities, and the sample images include at least one sample missing image in the missing modality and at least one sample complete image in the complete modality.
  • the computer device obtains a sample image set containing the same sample object, and obtains a sample missing image corresponding to the missing modality and a complete sample image corresponding to the complete modality from the sample image set.
  • the sample object may be the central nervous system, brain, bone, spinal cord, blood vessel, etc.
  • the embodiments of the present application do not limit the specific sample object.
  • After the computer device obtains the sample image set of the sample object, it needs to perform an image preprocessing operation on the sample images in the sample image set. The preprocessing operation can be at least one of preprocessing operations such as scale transformation, image normalization, image grayscaling, image enhancement, and image filtering; the embodiments of the present application do not limit the specific preprocessing operation method.
  • the computer device trains feature encoders and feature decoders corresponding to various modalities based on the sample image set.
  • Step 502 Extract features from the sample image through the feature encoder of the sample modality to obtain the first sample modality shared features.
  • In a possible implementation, the computer device performs feature extraction on the sample image through the feature encoder corresponding to the sample modality to obtain the first sample modality shared features, where, when the sample modality is a missing modality, the first sample modality shared features are features shared by the sample missing image and the sample complete image, and when the sample modality is a complete modality, the first sample modality shared features are features shared by different sample complete images.
  • That is, in the training stage, the computer device also performs feature extraction on the sample complete images of the complete modalities.
  • In a possible implementation, the computer device first extracts features from the sample image through the feature encoder corresponding to the sample modality to obtain the paired sample modality shared features. Similar to the application stage, in order to meet the input requirements of the feature decoder, the computer device performs multi-pooling fusion processing on the paired sample modality shared features to obtain the sample fused shared features, applies a 1×1 convolution to the sample fused shared features to ensure that the input of the feature decoder corresponding to the same sample modality is consistent with the number of output channels of the feature encoder, and finally takes the processed sample fused shared features as the first sample modality shared features.
  • Schematically, as shown in Figure 6, feature encoder 1 is the feature encoder corresponding to the missing modality of sample missing image x1; feature encoder 1 obtains the paired sample modality shared features {s_12, s_13, s_14} shared by the sample missing image x1 and the sample complete images x2, x3 and x4.
  • The computer device performs multi-pooling fusion processing on the paired sample modality shared features to obtain first sample modality shared feature 1.
  • Similarly, feature encoder 2 obtains the paired sample modality shared features {s_22, s_23, s_24} shared by the sample complete image x2 and the sample complete images x2, x3 and x4, and the computer device performs multi-pooling fusion processing on these paired sample modality shared features to obtain first sample modality shared feature 2.
  • Similarly, feature encoder 3 obtains first sample modality shared feature 3, shared by the sample complete image x3 and the sample complete images x2, x3 and x4, and feature encoder 4 obtains first sample modality shared feature 4, shared by the sample complete image x4 and the sample complete images x2, x3 and x4.
  • Step 503 Use the feature decoder of the sample modality to perform feature restoration on the shared features of the first sample modality to obtain the sample generated image.
  • the computer device inputs the shared features of the first sample modality into the feature decoder corresponding to the sample modality, and performs feature restoration on the shared features of the first sample modality through the feature decoder corresponding to the sample modality, thereby obtaining the sample generated image.
  • The feature encoder that outputs the first sample modality shared features and the feature decoder that receives the first sample modality shared features correspond to the same sample modality.
  • Step 504 Based on the sample generated images and the sample images, train the feature encoders and feature decoders corresponding to the various modalities.
  • Since the sample generated image produced by the feature decoder relies on the first sample modality shared features obtained by the feature encoder, if the sample generated image produced by the feature decoder is not similar enough to the sample image, the feature decoder and the feature encoder will continue to be trained together.
  • step 504 may include the following sub-steps:
  • the feature decoder should generate an image similar to the input image.
  • In a possible implementation, the image completion model adopts the image consistency loss $L_{img}$ to characterize the degree of similarity between the generated image and the input image: $L_{img}=\sum_{i=1}^{m}\mathbb{E}_{x_i\sim X_i}\big[\lVert G_i(c_i)-x_i\rVert\big]$, where $x_i$ is the input image, $X_i$ refers to the image modality, $c_i$ is the first sample modality shared feature, $E$ is the feature encoder, $G$ is the feature decoder, $m$ is the total number of sample complete images, and $G_i(c_i)$ refers to the sample generated image obtained by the feature decoder performing feature restoration on the first sample modality shared features. That is, the computer device determines the image consistency loss based on the sample generated images and the sample images, in order to train the feature encoder and the feature decoder based on the image consistency loss.
  • When the image consistency loss is within a certain numerical range, the sample generated image produced by the feature decoder is similar to the sample image, and the training of the image completion model is completed at this time. When the image consistency loss exceeds the numerical range, the sample generated image produced by the feature decoder is not similar enough to the sample image, and the image completion model continues to train the feature encoders and feature decoders corresponding to each modality.
  • In summary, after the computer device obtains the sample image set of the sample object, it performs feature extraction on the sample image through the feature encoder corresponding to the sample modality to obtain the first sample modality shared features, then performs feature restoration on the first sample modality shared features through the feature decoder corresponding to the sample modality to obtain the sample generated image, determines the image consistency loss based on the sample generated image and the sample image, and trains the feature encoders and feature decoders corresponding to the various modalities based on the image consistency loss, thereby further ensuring the accuracy of image completion through training while realizing image completion.
  • FIG. 7 shows a flow chart of a training method for an image completion model provided by another exemplary embodiment of the present application.
  • Step 701 Obtain a sample image set.
  • The sample image set includes sample images of the same sample object in different modalities, and the sample images include at least one sample missing image in the missing modality and at least one sample complete image in the complete modality.
  • Step 702 Extract features from the sample image through the feature encoder of the sample modality to obtain the first sample modality shared features.
  • Step 703 Use the feature decoder of the sample modality to perform feature restoration on the shared features of the first sample modality to obtain the sample generated image.
  • For the implementation of steps 701 to 703, reference may be made to the above embodiment; details are not repeated in this embodiment.
  • Step 704 Extract features from the sample generated image through the feature encoder of the sample modality to obtain shared features of the second sample modality.
  • In a possible implementation, the computer device performs feature extraction on the sample generated image through the feature encoder corresponding to the sample modality to obtain the second sample modality shared features, so that a feature consistency loss can be introduced into the model loss based on the difference between the second sample modality shared features and the first sample modality shared features.
  • Step 705 Based on the sample generated image, the sample image, the shared features of the first sample modality and the shared features of the second sample modality, train the respective feature encoders and feature decoders of each modality.
  • In a possible implementation, the computer device trains the feature encoders and feature decoders corresponding to each modality based on the sample generated image, the sample image, the first sample modality shared features and the second sample modality shared features.
  • step 705 may also include the following sub-steps:
  • the feature decoder should generate an image similar to the input image.
  • The image completion model adopts the image consistency loss $L_{img}$ to characterize the degree of similarity between the generated image and the input image: $L_{img}=\sum_{i=1}^{m}\mathbb{E}_{x_i\sim X_i}\big[\lVert G_i(c_i)-x_i\rVert\big]$, where $x_i$ is the input image, $X_i$ refers to the image modality, $c_i$ is the first sample modality shared feature, $E$ is the feature encoder, $G$ is the feature decoder, $m$ is the total number of sample complete images, and $G_i(c_i)$ refers to the sample generated image obtained by the feature decoder performing feature restoration on the first sample modality shared features.
  • The feature consistency loss, which can also be called the latent consistency loss $L_{latent}$, is used to characterize the similarity between the second sample modality shared features, obtained by the feature encoder from the image generated by the feature decoder, and the first sample modality shared features: $L_{latent}=\sum_{i=1}^{m}\mathbb{E}_{x_i\sim X_i}\big[\lVert E_i(G_i(c_i);i)-c_i\rVert\big]$, where $x_i$ is the input image, $X_i$ refers to the image modality, $c_i$ is the first sample modality shared feature, $E$ is the feature encoder, $G$ is the feature decoder, $m$ is the total number of sample complete images, $G_i(c_i)$ refers to the sample generated image obtained by the feature decoder performing feature restoration on the first sample modality shared features, and $E_i(G_i(c_i);i)$ is the second sample modality shared feature obtained by the feature encoder corresponding to the sample modality performing feature extraction on the sample generated image.
  • In a possible implementation, the embodiment of the present application uses the idea of generative adversarial training: a discriminator is used to distinguish the sample image from the sample generated image, and the training is completed when, under the premise that its discriminative ability is reliable enough, the discriminator still cannot distinguish whether an image is a sample image or a sample generated image, that is, when the sample generated image produced by the feature decoder is so close to the sample image that the discriminative model cannot tell them apart.
  • In a possible implementation, the discriminator includes four 4×4 conditional convolution blocks with a stride of 2; the numbers of filters are 64-128-256-512, and the discriminator uses a leaky ReLU activation function with a slope of 0.2.
  • The adversarial loss $L_{adv}$ is used to characterize the distribution difference between the generated image and the real image, defined as $L_{adv}=\sum_{i=1}^{m}\big(\mathbb{E}_{x_i\sim X_i}[\log D_i(x_i)]+\mathbb{E}[\log(1-D_i(G_i(c_i)))]\big)$, where $x_i$ is the input image, $G_i(c_i)$ is the sample generated image obtained by performing feature restoration on the first sample modality shared features, and $D_i$ is the discriminator of modality $i$, used to distinguish the sample image and the sample generated image of modality $i$.
  • In a possible implementation, ideal paired modality shared features are symmetrical; for example, the shared features of the T1 modality extracted from the T2 modality should be similar to the shared features of the T2 modality extracted from the T1 modality. A symmetry loss is therefore introduced so that the paired modality shared features can be well obtained.
  • In a possible implementation, the computer device optimizes the total loss function $L$ through $\min_{E,G}\max_{D}L$. After $L$ reaches a certain target range, the discriminator cannot distinguish the sample generated image from the sample image, and the computer device completes the training. Before $L$ reaches the target range, that is, while the discriminator can still distinguish the sample generated image from the sample image, the computer device trains the corresponding feature encoders and feature decoders, as well as the discriminators, based on the total loss.
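  • The alternating optimization implied by $\min_{E,G}\max_{D}L$ might be carried out as in the following sketch; the optimizer choice and the update schedule are assumptions of this sketch, and total_loss_fn is a hypothetical callable returning $L$ for the current batch (as in the loss sketch below).

    import torch  # optimizers, e.g. torch.optim.Adam over E/G and D parameters

    def train_step(total_loss_fn, opt_eg, opt_d):
        """One alternating min-max update for min over E,G / max over D of L."""
        # Discriminator step: ascend on L (descend on -L); loss terms that
        # do not involve D contribute zero gradient here.
        opt_d.zero_grad()
        (-total_loss_fn()).backward()
        opt_d.step()
        # Encoder/decoder step: descend on L.
        opt_eg.zero_grad()
        loss = total_loss_fn()
        loss.backward()
        opt_eg.step()
        return float(loss)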
  • Schematically, as shown in Figure 8, feature encoder 1 is the feature encoder corresponding to the missing modality of sample missing image x1. Feature encoder 1 obtains the paired modality shared features shared by the sample missing image x1 and the sample complete images x2, x3 and x4, which are fused into first sample modality shared feature 1. Feature decoder 1 performs feature restoration on first sample modality shared feature 1 to obtain the sample generated image x1', and feature encoder 1 corresponding to the sample modality then performs feature extraction on the sample generated image x1' to obtain second sample modality shared feature 1.
  • In a possible implementation, the computer device determines the image consistency loss based on the sample generated image and the sample image, determines the feature consistency loss based on the first sample modality shared features and the second sample modality shared features, inputs the sample generated image and the sample image into the discriminator to obtain the sample discrimination result and determines the adversarial loss based on the sample discrimination result, and determines the symmetry loss based on the first sample modality shared features. Finally, the computer device determines the total loss based on the image consistency loss, the feature consistency loss, the adversarial loss and the symmetry loss, and trains the corresponding feature encoders, feature decoders and discriminators based on the total loss.
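  • The four losses summarized above might be combined as in the following sketch; the norms, the loss weights lam_*, and the concrete symmetry term (a penalty between the two directions s_ij and s_ji of a paired shared feature) are assumptions of this sketch, since the text does not fix them.

    import torch
    import torch.nn.functional as F

    def total_loss(x, x_gen, c_first, c_second, d_real, d_fake, s_ij, s_ji,
                   lam_latent=1.0, lam_adv=1.0, lam_sym=1.0):
        """Total training loss L; the model is optimized via min over E,G
        and max over D, so D maximizes the adversarial term while E and G
        minimize it."""
        l_img = F.l1_loss(x_gen, x)                  # image consistency L_img
        l_latent = F.l1_loss(c_second, c_first)      # latent consistency L_latent
        l_adv = (torch.log(d_real + 1e-8)            # adversarial L_adv
                 + torch.log(1.0 - d_fake + 1e-8)).mean()
        l_sym = F.l1_loss(s_ij, s_ji)                # symmetry loss
        return l_img + lam_latent * l_latent + lam_adv * l_adv + lam_sym * l_sym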
  • The image completion method in related technology 1 extracts invariant feature information among all modalities and completes the image based on this feature information, while the image completion method in related technology 2 extracts only the invariant feature information between two modalities and completes the image based on this feature information; the completed images generated by both lose some image details, making accurate image completion impossible. Therefore, in order to improve the accuracy of image completion, in the embodiment of the present application the computer device extracts the pairwise modality shared features shared between two or three modalities, that is, the object modality shared features, and performs modality completion on the missing image based on the object modality shared features to obtain the completed image corresponding to the missing image. As shown in Figure 9, the completed image of this scheme has more image details, ensuring the accuracy of image completion while achieving image completion.
  • In addition, the image completion method provided by the embodiments of the present application achieves a better peak signal-to-noise ratio and structural similarity in most cases than the two related technologies, which shows that the image completion method provided by the embodiments of the present application can generate more realistic completed images; that is, the completed images generated by the embodiments of the present application have higher accuracy, and the image completion model has better performance.
  • FIG. 10 shows a structural block diagram of an image completion device provided by an exemplary embodiment of the present application.
  • the device includes:
  • Acquisition module 1001 is used to acquire an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in the missing modality and m complete images in the complete modality, n and m being positive integers;
  • Feature extraction module 1002 is used to extract object modality shared features from the complete image, where the object modality shared features are features shared by the missing image and the complete image;
  • the feature restoration module 1003 is used to restore features of the object modality shared features to obtain a completed image of the missing image.
  • In a possible implementation, the feature extraction module 1002 is further configured to input the missing image and the complete image into the feature encoder of the missing modality and perform feature extraction on them through the feature encoder to obtain the object modality shared features; the feature restoration module 1003 is further configured to perform feature restoration on the object modality shared features through the feature decoder to obtain the completed image.
  • In a possible implementation, the feature extraction module 1002 is further configured to perform feature extraction on the missing image and the i-th complete image through the feature encoder to obtain the i-th object modality shared feature, where the i-th complete image belongs to the m complete images and i is less than or equal to m;
  • In a possible implementation, the feature restoration module 1003 is further configured to perform feature fusion on the m object modality shared features to obtain the fused shared features, input the fused shared features into the feature decoder of the missing modality, and perform feature restoration on the fused shared features through the feature decoder to obtain the completed image.
  • In a possible implementation, the feature restoration module 1003 is further configured to perform pooling processing on the i-th object modality shared feature through at least two pooling methods to obtain at least two pooled features, and perform feature splicing on the pooled features of each of the m object modality shared features to obtain the fused shared features.
  • In a possible implementation, the feature restoration module 1003 is further configured to perform channel dimensionality reduction or raising on the fused shared features to obtain the processed fused shared features, whose number of channels is consistent with the number of output channels of the feature encoder, and input the processed fused shared features into the feature decoder of the missing modality.
  • In a possible implementation, the feature encoder is a mixture-of-experts network composed of conditional convolutions, and the parameters of the conditional convolution are determined based on the modality corresponding to the feature encoder.
  • the device also includes:
  • a sample acquisition module, used to obtain a sample image set, where the sample image set includes sample images of the same sample object in different modalities, and the sample images include at least one sample missing image in the missing modality and at least one sample complete image in the complete modality;
  • a training module configured to train the feature encoder and the feature decoder of various modalities based on the sample image set.
  • optionally, the training module is further configured to: perform feature extraction on the sample images through a feature encoder of a sample modality to obtain first sample modality-shared features, where, when the sample modality is the missing modality, the first sample modality-shared features are features shared by the sample missing image and the sample complete image, and when the sample modality is the complete modality, the first sample modality-shared features are features shared by different sample complete images; perform feature restoration on the first sample modality-shared features through the feature decoder of the sample modality to obtain a sample generated image; and train the feature encoder and the feature decoder of each modality based on the sample generated image and the sample images.
  • optionally, the training module is further configured to: determine an image consistency loss based on the sample generated image and the sample image; and train the feature encoder and the feature decoder of each modality based on the image consistency loss.
  • optionally, the training module is further configured to: perform feature extraction on the sample generated image through the feature encoder of the sample modality to obtain second sample modality-shared features; and train the feature encoder and the feature decoder of each modality based on the sample generated image, the sample image, the first sample modality-shared features and the second sample modality-shared features.
  • optionally, the training module is further configured to: determine an image consistency loss based on the sample generated image and the sample image; determine a feature consistency loss based on the first sample modality-shared features and the second sample modality-shared features; input the sample generated image and the sample image into a discriminator to obtain a sample discrimination result, the discriminator being used to discriminate between generated images and real images, and determine an adversarial loss based on the sample discrimination result; determine a symmetry loss based on the first sample modality-shared features, the symmetry loss characterizing the similarity of modality-shared features between paired modalities; determine a total loss based on the image consistency loss, the feature consistency loss, the adversarial loss and the symmetry loss; and train the feature encoder and the feature decoder of each modality, as well as the discriminator, based on the total loss.
  • optionally, when the image is a brain tumor image, the modalities of the image include a T1 modality, a T1ce modality, a T2 modality and a FLAIR modality.
  • the computer device 1100 includes a central processing unit (CPU) 1101, a system memory 1104 including a random access memory 1102 and a read-only memory 1103, and a system bus 1105 connecting the system memory 1104 and the central processing unit 1101. The computer device 1100 may further include a basic input/output system (I/O system) 1106 that facilitates the transfer of information between components within the computer, and a mass storage device 1107 for storing an operating system 1113, application programs 1114 and other program modules 1115.
  • the basic input/output system 1106 may include a display 1108 for displaying information and an input device 1109, such as a mouse or keyboard, for the user to input information. The display 1108 and the input device 1109 are both connected to the central processing unit 1101 through an input/output controller 1110 connected to the system bus 1105. The basic input/output system 1106 may also include the input/output controller 1110 for receiving and processing input from a plurality of other devices such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 1110 also provides output to a display screen, printer, or other type of output device.
  • the mass storage device 1107 is connected to the central processing unit 1101 through a mass storage controller (not shown) connected to the system bus 1105. The mass storage device 1107 and its associated computer-readable media provide non-volatile storage for the computer device 1100. That is, the mass storage device 1107 may include a computer-readable medium (not shown) such as a hard disk or a drive.
  • without loss of generality, the computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include random access memory (RAM), read-only memory (ROM), flash memory or other solid-state storage technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, tape cassettes, magnetic tape, disk storage or other magnetic storage devices.
  • the above-mentioned system memory 1104 and mass storage device 1107 may be collectively referred to as memory.
  • the memory stores one or more programs configured to be executed by one or more central processing units 1101; the one or more programs contain instructions for implementing the above methods, and the central processing unit 1101 executes the one or more programs to implement the methods provided by each of the above method embodiments.
  • the computer device 1100 may also run connected through a network, such as the Internet, to a remote computer on the network. That is, the computer device 1100 may be connected to the network 1112 through a network interface unit 1111 connected to the system bus 1105, or the network interface unit 1111 may be used to connect to other types of networks or remote computer systems (not shown).
  • the memory further includes one or more programs stored in the memory, and the one or more programs include the steps to be executed by the computer device in the methods provided by the embodiments of the present application.
  • Embodiments of the present application also provide a computer-readable storage medium storing at least one program, which is loaded and executed by a processor to implement the image completion method described in the above embodiments.
  • Embodiments of the present application provide a computer program product, which includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the image completion method as described in the above embodiments.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An image completion method, apparatus, device and storage medium, relating to the field of artificial intelligence. The method includes: acquiring an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities (201); extracting object modality-shared features from the complete images, where the object modality-shared features are features shared by the missing image and the complete images (202); and performing feature restoration on the object modality-shared features to obtain a completion image of the missing image (203). The method can improve image completion quality.

Description

Image completion method, apparatus, device and storage medium
This application claims priority to Chinese patent application No. 202210457083.1, entitled "Image completion method, apparatus, device and storage medium", filed on April 27, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of artificial intelligence, and in particular to an image completion method, apparatus, device and storage medium.
Background
Image completion is the process of completing the missing regions of an image to be repaired based on the image itself or information from an image library, so that the repaired image looks natural and is difficult to distinguish from an undamaged image.
A modality can be understood as one of several different manifestations of the same thing. For example, in magnetic resonance imaging (MRI), changing the factors that influence the signal yields images in four modalities: T1, T2, FLAIR and T1ce. However, owing to the different imaging approaches, some of the resulting images may lack necessary feature information; such an image is called a missing image, and its corresponding modality is called a missing modality.
Modalities can be missing in many different ways, and the image completion methods in the related art cannot guarantee the quality of image completion; how to improve the quality of image completion has therefore become a problem in urgent need of a solution.
Summary
The embodiments of the present application provide an image completion method, apparatus, device and storage medium. The technical solutions are as follows:
In one aspect, an embodiment of the present application provides an image completion method, the method including:
acquiring an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities, n and m being positive integers;
extracting object modality-shared features from the complete images, where the object modality-shared features are features shared by the missing image and the complete images;
performing feature restoration on the object modality-shared features to obtain a completion image of the missing image.
In another aspect, an embodiment of the present application provides an image completion apparatus, the apparatus including:
an acquisition module, configured to acquire an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities, n and m being positive integers;
a feature extraction module, configured to extract object modality-shared features from the complete images, where the object modality-shared features are features shared by the missing image and the complete images;
a feature restoration module, configured to perform feature restoration on the object modality-shared features to obtain a completion image of the missing image.
In another aspect, an embodiment of the present application provides a computer device, which includes a processor and a memory storing at least one program, the at least one program being loaded and executed by the processor to implement the image completion method described in the above aspects.
In another aspect, an embodiment of the present application provides a computer-readable storage medium storing at least one program, the at least one program being loaded and executed by a processor to implement the image completion method described in the above aspects.
In another aspect, an embodiment of the present application provides a computer program product, which includes computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the image completion method described in the above aspects.
In the embodiments of the present application, after acquiring an object image set of the same object, the computer device extracts from the complete images the pairwise modality-shared features between the missing image and the complete images, that is, the object modality-shared features, and then performs feature restoration on the object modality-shared features to obtain a completion image of the missing image. With the solution provided by the embodiments of the present application, modality completion can be performed on images of missing modalities while the accuracy of the completion result, and hence the quality of image completion, is guaranteed.
Brief Description of the Drawings
Figure 1 shows a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application;
Figure 2 shows a flowchart of an image completion method provided by an exemplary embodiment of the present application;
Figure 3 shows a flowchart of an image completion method provided by another exemplary embodiment of the present application;
Figure 4 is a schematic diagram of the implementation of an image completion method shown in an exemplary embodiment of the present application;
Figure 5 shows a flowchart of a training method for an image completion model provided by an exemplary embodiment of the present application;
Figure 6 shows a schematic diagram of a training method for an image completion model provided by an exemplary embodiment of the present application;
Figure 7 shows a flowchart of a training method for an image completion model provided by another exemplary embodiment of the present application;
Figure 8 shows a schematic diagram of a training method for an image completion model provided by another exemplary embodiment of the present application;
Figure 9 is a comparison of the completion effects of an embodiment of the present application and the related art, shown in an exemplary embodiment of the present application;
Figure 10 shows a structural block diagram of an image completion apparatus provided by an exemplary embodiment of the present application;
Figure 11 shows a schematic structural diagram of a computer device provided by an exemplary embodiment of the present application.
Detailed Description
For ease of understanding, the terms involved in the embodiments of the present application are explained below.
Generative adversarial network (GAN): a method of unsupervised learning that learns by letting two neural networks play a game against each other. A generative adversarial network consists of a generator and a discriminator, and its core purpose is to train the generator. The generator aims to produce images as similar as possible to real sample images, while the discriminator aims to distinguish as well as possible whether a given sample is a real sample or a generated image. The two goals are opposed, and the networks improve each other through continuous competition; training ends when, even with a sufficiently reliable discriminator, it is still impossible to tell whether a given sample is a real sample or a generated image, that is, the images produced by the generative model cannot be distinguished from the sample images by the discriminative model.
Magnetic resonance imaging: a medical imaging technique based on the principle of nuclear magnetic resonance (NMR), which uses magnetic fields and radio-frequency waves to form images of human anatomy or physiological processes. An MRI sequence is a specific setting of radio-frequency pulses and gradients that produces a particular kind of image. MRI image modalities include T1, T2, FLAIR and T1ce. T1 and T2 are physical quantities used to measure electromagnetic waves, and they can serve as imaging data. Imaging based on T1 is called "T1-weighted imaging", abbreviated "T1" in clinical work, and likewise for T2. The overall appearance of a T1 image is very close to the customary color scheme of clinical images: white matter is white, gray matter is gray, and cerebrospinal fluid is black, so T1 images reveal the various sectional anatomies. The T2 signal is related to water content; many lesions have a stronger T2 signal than the surrounding normal tissue and appear highlighted, so the location and size of a lesion can be seen clearly in T2. FLAIR, the fluid-attenuated inversion recovery sequence, also known as water-suppression imaging, suppresses the high cerebrospinal-fluid signal of T2 (darkening the cerebrospinal fluid) so that lesions adjacent to the cerebrospinal fluid are displayed clearly (brightened). For T1ce, a contrast agent is injected into the blood before the MRI; bright regions indicate a rich blood supply, and enhancement indicates abundant blood flow, which is characteristic of tumor sites. T1ce can further show the interior of a tumor and distinguish tumors from non-tumorous lesions (that is, gangrenous regions).
Artificial intelligence (AI): the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline involving a wide range of fields, with both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big-data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer vision (CV): the science of how to make machines "see"; more specifically, machine vision that uses cameras and computers instead of human eyes to identify and measure targets, with further graphics processing so that the computer output becomes an image better suited for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies, attempting to build artificial intelligence systems that can obtain information from images or multidimensional data. Computer vision technologies usually include image processing, image recognition, image segmentation, image semantic understanding, image retrieval, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
The image completion method involved in the embodiments of the present application is an application of computer vision technology in the field of image processing; it can improve the training effect of the image completion model and thereby the accuracy of the completion results of the trained image completion model.
As shown in Figure 1, which shows a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application, the implementation environment includes a computer device 110 and a server 120. Data communication between the computer device 110 and the server 120 is performed through a communication network; optionally, the communication network may be a wired or a wireless network, and may be at least one of a local area network, a metropolitan area network, and a wide area network.
The computer device 110 is an electronic device with image completion requirements; the electronic device may be a smartphone, a tablet computer, a personal computer, and so on, which is not limited in this embodiment.
In some embodiments, an application with an image completion function is installed or running on the computer device 110. When an image of an object in a missing modality needs to be completed, the user inputs the object's images in the missing modality and in the complete modalities into the application in the form of an object image set 121, which is uploaded to the server 120; the server 120 performs image completion on the image in the missing modality and feeds the completion result back to the user.
The server 120 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big-data and artificial intelligence platforms.
In one possible implementation, the computer device 110 uploads the image set 121 to the server 120, and the server 120 performs image completion through an image completion model 122 to obtain a completion image 123, where the image completion model 122 is an encoder-decoder network; the server 120 returns the completion image 123 to the computer device 110 so that the computer device 110 can display the completion result.
Of course, in other possible implementations, the image completion model may also be deployed on the computer device 110, so that the computer device 110 performs image completion locally, reducing the processing load on the server 120; this embodiment does not limit this.
In addition, the above image completion model may be trained by the server 120, or trained by another device and then deployed on the server 120. For convenience of description, the following embodiments are described with the image completion method applied to a computer device and the training of the image completion model performed by the computer device.
It should be noted that the image completion method shown in the embodiments of the present application can be applied to various image completion tasks; the embodiments of the present application take the completion of medical images as an example.
Please refer to Figure 2, which shows a flowchart of an image completion method provided by an exemplary embodiment of the present application. This embodiment is described with the method applied to a computer device; the method includes the following steps:
Step 201: acquire an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities, n and m being positive integers.
In one possible implementation, taking the completion of medical images as an example, the object may be the central nervous system, the brain, bones, the spinal cord, blood vessels, and so on; the embodiments of the present application do not limit the specific object.
Optionally, after acquiring the object image set of the same object, the computer device needs to perform image preprocessing on the images in the object image set so that the input format of the images is consistent with the input format of the model training process.
Optionally, the image preprocessing may include at least one of scaling, image normalization, image graying, image enhancement and image filtering; the embodiments of the present application do not limit the specific preprocessing method.
Since MRI is the most common and most important means of examining intracranial lesions, the embodiments of the present application take a brain tumor as the object. In one possible implementation, when the image is a brain tumor image, the modalities of the image include a T1 modality, a T1ce modality, a T2 modality and a FLAIR modality.
In the embodiments of the present application, the computer device acquires the object image set of the same object and obtains from it images that include n missing images of the same object in missing modalities and m complete images in complete modalities, n and m being positive integers, where a missing image in a missing modality is an image that needs image completion, and a complete image in a complete modality is an image referenced during image completion.
Taking brain tumor images whose modalities include T1, T1ce, T2 and FLAIR as an example, the missing images in the object image set may be: a missing image lacking the T1 modality (its missing modality is the T1 modality), a missing image lacking the T1ce modality (its missing modality is the T1ce modality), a missing image lacking the T2 modality (its missing modality is the T2 modality), and a missing image lacking the FLAIR modality (its missing modality is the FLAIR modality). The images in the complete modalities in the object image set are images with no missing modality.
Step 202: extract object modality-shared features from the complete images, where the object modality-shared features are features shared by the missing image and the complete images.
A feature is an (essential) characteristic, or a set of characteristics, that distinguishes one class of objects from other classes. Since missing images differ across missing modalities, restoring completion images for all kinds of missing modalities requires extracting features shared across images of different modalities; the complete images, having no missing modality, can be used to extract the required image features. Therefore, in one possible implementation, the computer device may use a machine learning model to extract, from the complete images, the pairwise object modality-shared features between the complete images and the missing image, that is, the features shared by the complete images and the missing image.
For each of the n missing images in missing modalities, the computer device extracts pairwise inter-modality shared features against the m complete images of the complete modalities; that is, for each missing modality, the computer device can extract m object modality-shared features from the m complete images.
Step 203: perform feature restoration on the object modality-shared features to obtain a completion image of the missing image.
Image completion means repairing and reconstructing a damaged image, and a missing image is an image lacking a certain modality. After the computer device obtains the object modality-shared features between the missing image and the complete images, it can perform feature restoration based on those features to complete the damaged missing image and generate its completion image; the completion image has no missing modality.
In summary, in the embodiments of the present application, after obtaining the object image set of the same object, the image completion model extracts object modality-shared features from the complete images and performs feature restoration on them to obtain the completion image of the missing image. Since the object modality-shared features are features shared by the missing image and the complete images, performing feature restoration based on them makes it possible to complete the image of the missing modality while ensuring the accuracy of the completion result and hence the quality of image completion.
In the embodiments of the present application, the computer device pre-trains an image completion model through machine learning, so that in the model application stage the missing image can be modality-completed based on this image completion model. The image completion model consists of feature encoders and feature decoders: a feature encoder extracts from the complete images the feature information shared between the complete images and the missing image (the object modality-shared features), and a feature decoder performs feature restoration on the object modality-shared features extracted by the feature encoder to obtain the completion image.
Please refer to Figure 3, which shows a flowchart of an image completion method provided by another exemplary embodiment of the present application. This embodiment is described with the method applied to a computer device; the method includes the following steps:
Step 301: acquire an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities, n and m being positive integers.
For the implementation of this step, reference may be made to step 201; details are not repeated here.
Step 302: input the missing image and the complete images into the feature encoder of the missing modality, where different modalities correspond to different feature encoders.
Each modality of the image has a corresponding feature encoder. Illustratively, if the modalities of the image include the T1, T1ce, T2 and FLAIR modalities, the image completion model includes a feature encoder for the T1 modality, a feature encoder for the T1ce modality, a feature encoder for the T2 modality, and a feature encoder for the FLAIR modality.
Since the missing modality of a missing image may be any one of the image's modalities, to encode different missing modalities separately, in one possible implementation the computer device needs to determine the correspondence between the missing modality of the missing image and the modality of each feature encoder, so that the missing image and the complete images are fed into the feature encoder corresponding to the missing modality.
Illustratively, if the missing modality of missing image A is the T1 modality and feature encoder 1 corresponds to the T1 modality, missing image A and the complete images should be fed into feature encoder 1 together; if the missing modality of missing image B is the T1ce modality and feature encoder 2 corresponds to the T1ce modality, they should be fed into feature encoder 2; if the missing modality of missing image C is the T2 modality and feature encoder 3 corresponds to the T2 modality, they are fed into feature encoder 3; and if the missing modality of missing image D is the FLAIR modality and feature encoder 4 corresponds to the FLAIR modality, they should be fed into feature encoder 4. This realizes feeding the missing image and the complete images into the feature encoder corresponding to the missing modality.
In one possible implementation, the feature encoder is a mixture-of-experts network composed of conditional convolutions, and the parameters of the conditional convolutions are determined based on the modality corresponding to the feature encoder.
A mixture of experts (MoE) is a neural network in which separate linear models, called experts, are trained for local regions of the input data set, and a gating module selects which expert to use; the actual output of the model is the combination of the experts' outputs weighted by the gating model. The experts may use different functions (various linear or nonlinear functions); a mixture-of-experts system integrates multiple models into a single task.
In the embodiments of the present application, the image completion model uses feature encoders composed of conditional convolutions (CondConv), whose parameters are determined by the input modality corresponding to the feature encoder, using a mixture of s expert models (the formula itself is not preserved in this text), where x denotes the input image, i denotes the input modality, σ(·) is the sigmoid activation function, # denotes ordinary convolution, {W1, …, Ws} are the network parameters associated with the s experts, and modality-specific mixing weights combine the experts.
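As an illustration only, a minimal PyTorch sketch of such a modality-conditioned convolution might look as follows; the names (CondConv2d, num_experts, modality_id) and the use of a learned per-modality routing embedding are assumptions for the sketch, not details taken from the patent:

```python
# A minimal sketch of a conditionally parameterized convolution, assuming the
# expert-mixing weights come from a learned per-modality embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, padding=0,
                 num_experts=4, num_modalities=4):
        super().__init__()
        self.stride, self.padding = stride, padding
        # s expert kernels {W_1, ..., W_s}
        self.experts = nn.Parameter(
            torch.randn(num_experts, out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        # modality-specific routing: one mixing-weight vector per modality
        self.routing = nn.Embedding(num_modalities, num_experts)

    def forward(self, x, modality_id):
        # sigmoid gate over the s experts, conditioned on the input modality i
        gate = torch.sigmoid(self.routing.weight[modality_id])           # (s,)
        # mix the expert kernels into one kernel, then apply a normal conv
        weight = (gate[:, None, None, None, None] * self.experts).sum(0)
        return F.conv2d(x, weight, stride=self.stride, padding=self.padding)

# usage: one layer shared across modalities, conditioned at call time
layer = CondConv2d(1, 64, kernel_size=7, padding=3)
y = layer(torch.randn(2, 1, 128, 128), modality_id=0)  # (2, 64, 128, 128)
```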
In the embodiments of the present application, a feature encoder consists of a downsampling module and residual blocks; the downsampling module contains one 7×7 conditional convolution block with stride 1 and two 4×4 conditional convolution blocks with stride 2.
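A hedged sketch of an encoder with that layout follows; only the kernel sizes and strides come from the paragraph above, while the channel widths, the number of residual blocks, and the normalization-free residual design are assumptions of the sketch:

```python
class ResBlock(nn.Module):
    # residual block built from two 3x3 conditional convolutions (sketch)
    def __init__(self, ch, num_modalities=4):
        super().__init__()
        self.conv1 = CondConv2d(ch, ch, 3, padding=1, num_modalities=num_modalities)
        self.conv2 = CondConv2d(ch, ch, 3, padding=1, num_modalities=num_modalities)

    def forward(self, x, modality_id):
        h = F.relu(self.conv1(x, modality_id))
        return x + self.conv2(h, modality_id)

class Encoder(nn.Module):
    # downsampling module: a 7x7 stride-1 CondConv, then two 4x4 stride-2
    # CondConvs, followed by residual blocks (widths are assumptions)
    def __init__(self, in_ch=1, base=64, num_blocks=4):
        super().__init__()
        self.down = nn.ModuleList([
            CondConv2d(in_ch, base, 7, stride=1, padding=3),
            CondConv2d(base, base * 2, 4, stride=2, padding=1),
            CondConv2d(base * 2, base * 4, 4, stride=2, padding=1),
        ])
        self.blocks = nn.ModuleList([ResBlock(base * 4) for _ in range(num_blocks)])

    def forward(self, x, modality_id):
        for conv in self.down:
            x = F.relu(conv(x, modality_id))
        for block in self.blocks:
            x = block(x, modality_id)
        return x  # pairwise shared features s_ij
```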
Illustratively, as shown in Figure 4, the computer device acquires n missing images in missing modalities and m complete images in complete modalities, the complete images being {xj | j ∈ m}. Since the images are MRI multi-modal brain tumor images with four modalities (T1, T1ce, T2 and FLAIR), feature encoders 1 to 4 each correspond to one of these modalities. If feature encoder 1 corresponds to the T1 modality and the missing modality of the missing image is T1, the image completion model feeds the missing image and the complete images {xj | j ∈ m} into feature encoder 1. By analogy, if feature encoder 2 corresponds to the T2 modality and the missing modality of the missing image is T2, the image completion model feeds the missing image and the complete images {xj | j ∈ m} into feature encoder 2.
Step 303: perform feature extraction on the missing image and the complete images through the feature encoder to obtain the object modality-shared features.
In one possible implementation, the computer device performs feature extraction on the missing image and the complete images through the feature encoder; the extracted feature information, drawn by the computer device from the complete images, is the feature information shared by the missing image and the complete images, that is, the object modality-shared features.
Since the object image set contains multiple complete images, each of which can be paired with the same missing image for object modality-shared feature extraction, in some embodiments the computer device performs feature extraction on the missing image and the i-th complete image through the feature encoder in turn, obtaining m i-th object modality-shared features, where the i-th complete image belongs to the m complete images and i is less than or equal to m.
Illustratively, as shown in Figure 4, if feature encoder 1 corresponds to the T1 modality and the missing modality of the missing image is T1, the image completion model extracts, through feature encoder 1, the object modality-shared features {s1j | j ∈ m} from the missing image and the complete images {xj | j ∈ m}. Since Figure 4 shows one missing image and three complete images, calling the missing image x1 and the three complete images x2, x3 and x4, feature encoder 1 extracts three pairs of object modality-shared features {s12, s13, s14}, obtained respectively from the complete images x2, x3 and x4. Note that, since the missing modality of the missing image is T1, only feature encoder 1 (for T1) works at this point; the feature encoders of the other three complete modalities do not need to work. Similarly, if the missing modality of the missing image were T2, only feature encoder 2 (for T2) would work, and the feature encoders of the other three complete modalities would not need to work.
Step 304: input the object modality-shared features into the feature decoder of the missing modality, where different modalities correspond to different feature decoders.
Each modality of the image has a corresponding feature decoder. Illustratively, if the modalities of the image include the T1, T1ce, T2 and FLAIR modalities, the image completion model includes a feature decoder for the T1 modality, a feature decoder for the T1ce modality, a feature decoder for the T2 modality, and a feature decoder for the FLAIR modality.
In one possible implementation, the computer device feeds the object modality-shared features into the feature decoder of the missing modality. Illustratively, if the object modality-shared features were output by the feature encoder of the T1 modality, they are fed into the feature decoder of the T1 modality; if output by the feature encoder of the T1ce modality, into the feature decoder of the T1ce modality; if output by the feature encoder of the T2 modality, into the feature decoder of the T2 modality; and if output by the feature encoder of the FLAIR modality, into the feature decoder of the FLAIR modality. That is, based on the missing modality corresponding to the object modality-shared features, the computer device feeds those features into the feature decoder of that modality.
In one illustrative example, step 304 may include step 304A and step 304B.
Step 304A: perform feature fusion on the m object modality-shared features to obtain fused shared features.
Step 304B: input the fused shared features into the feature decoder of the missing modality.
Since there are m complete images in complete modalities, a single feature encoder yields m pairs of object modality-shared features, and in practice m is not fixed, so the number of object modality-shared features is not fixed, while the input of the feature decoder has a fixed size. To satisfy the decoder's input requirements, the computer device needs to process the m pairs of object modality-shared features and feed the processed object modality-shared features into the feature decoder.
Optionally, the computer device first performs feature fusion on the m object modality-shared features to obtain fused shared features, and feeds the fused shared features into the feature decoder of the missing modality.
In the embodiments of the present application, the computer device first performs a pooling operation on each of the m object modality-shared features to obtain the pooling results for each object-shared feature, and then splices the pooling results together, achieving feature fusion and obtaining the fused shared features.
Pooling is a very common operation in convolutional neural networks that imitates the human visual system in reducing the dimensionality of data; it is also commonly called subsampling or downsampling. The significance of pooling lies in feature dimensionality reduction; pooling greatly lowers the consumption of computational resources, and it also helps reduce model overfitting.
In one possible implementation, the computer device pools the i-th object modality-shared features in at least two pooling manners to obtain at least two pooled features corresponding to the i-th object modality-shared features, and then splices the pooled features corresponding to each of the m object modality-shared features to obtain the fused shared features.
Optionally, the pooling manner may be general pooling, overlapping pooling, spatial pyramid pooling, center pooling, max pooling, mean pooling, min pooling, stochastic pooling, global average pooling, and so on; the embodiments of the present application do not limit the specific pooling manner.
Optionally, the computer device applies three kinds of pooling, max pooling, mean pooling and min pooling, to the object modality-shared features and splices the three resulting pooled features, obtaining the fused shared features while retaining as much feature information as possible.
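A sketch of this multi-pooling fusion is given below; pooling each map to a fixed spatial size with adaptive pooling (so that a variable m still yields a tensor the decoder can consume) is an assumption of the sketch:

```python
def fuse_shared_features(shared_feats, out_size=(32, 32)):
    """Max-, mean- and min-pool each of the m shared feature maps, then
    splice the pooled results along the channel axis (sketch)."""
    pooled = []
    for f in shared_feats:                                    # each f: (B, C, H, W)
        pooled.append(F.adaptive_max_pool2d(f, out_size))     # max pooling
        pooled.append(F.adaptive_avg_pool2d(f, out_size))     # mean pooling
        pooled.append(-F.adaptive_max_pool2d(-f, out_size))   # min pooling
    return torch.cat(pooled, dim=1)                           # (B, 3*m*C, h, w)
```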
Further, the computer device feeds the fused shared features into the feature decoder corresponding to the missing modality. Since at this point it cannot be guaranteed that the channel number of the fused shared features matches that of the feature decoder, to ensure the two are consistent, in one possible implementation the computer device performs channel dimension reduction or channel dimension expansion on the fused shared features to obtain processed fused shared features whose channel number is consistent with the output channel number of the feature encoder, and feeds the processed fused shared features into the feature decoder of the missing modality.
Optionally, the computer device may perform channel dimension reduction or expansion through interpolation, convolution, principal component analysis, and so on; this embodiment does not limit this.
In the embodiments of the present application, the computer device uses a 1×1 convolution to reduce or expand the channel dimension of the fused shared features, thereby ensuring that the channel number of the fused shared features is consistent with that of the feature decoder. Finally, the computer device feeds the channel-adjusted fused shared features into the feature decoder corresponding to the missing modality.
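Illustratively, the channel adjustment can be a single 1×1 convolution; using a lazily initialized convolution here is an assumption of the sketch (because the fused channel count 3·m·C varies with m), not the patent's stated mechanism:

```python
# map the fused features back to the decoder's expected channel count
to_decoder_channels = nn.LazyConv2d(out_channels=256, kernel_size=1)

fused = fuse_shared_features([torch.randn(2, 256, 32, 32) for _ in range(3)])
decoder_input = to_decoder_channels(fused)   # (2, 256, 32, 32)
```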
Illustratively, as shown in Figure 4, multi-pooling feature fusion is applied to the m object modality-shared features {s1j | j ∈ m} output by feature encoder 1 to obtain fused shared features 1, which are then fed into feature decoder 1; feature decoder 1 corresponds to the same modality as feature encoder 1, namely the missing modality of missing image x1.
Step 305: perform feature restoration on the object modality-shared features through the feature decoder to obtain the completion image.
In one possible implementation, the computer device performs feature restoration on the object modality-shared features through the feature decoder to obtain the completion image.
In the embodiments of the present application, the feature decoder includes four residual blocks, each containing two 3×3 conditional convolution blocks with 256 filters and stride 1; it further includes two nearest-neighbor upsamplers and a 5×5 conditional convolution block with stride 1, used to upsample the fused shared features to the original image size, with filter counts of 64-128-256-128-64; finally, a 7×7 conditional convolution block with stride 1 and one filter outputs the completed image.
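The paragraph above reads somewhat ambiguously, so the decoder sketch below is one hedged interpretation: residual blocks at width 256, two nearest-neighbor upsampling stages each followed by a 5×5 conditional convolution, and a final 7×7 convolution down to one output channel. The exact wiring of the 64-128-256-128-64 filter schedule is an assumption:

```python
class Decoder(nn.Module):
    def __init__(self, in_ch=256, out_ch=1, num_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList([ResBlock(in_ch) for _ in range(num_blocks)])
        self.up = nn.ModuleList([
            CondConv2d(in_ch, 128, 5, stride=1, padding=2),
            CondConv2d(128, 64, 5, stride=1, padding=2),
        ])
        self.out = CondConv2d(64, out_ch, 7, stride=1, padding=3)

    def forward(self, x, modality_id):
        for block in self.blocks:
            x = block(x, modality_id)
        for conv in self.up:
            # nearest-neighbor upsampling followed by a 5x5 conditional conv
            x = F.interpolate(x, scale_factor=2, mode="nearest")
            x = F.relu(conv(x, modality_id))
        return torch.tanh(self.out(x, modality_id))  # completed image
```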
Optionally, considering the feature decoder's requirements on feature dimensions, the computer device performs feature fusion on the m object-shared features to obtain the fused shared features and feeds them into the feature decoder, so that the computer device can perform feature restoration on the fused shared features through the feature decoder to obtain the completion image.
Illustratively, as shown in Figure 4, feature decoder 1 performs feature restoration on fused shared features 1 to obtain the completion image x1'.
In this embodiment, the computer device feeds the missing image and the complete images into the feature encoder corresponding to the missing modality, performs feature extraction on them through the feature encoder to obtain the object modality-shared features, fuses those features into the fused shared features, performs channel dimension reduction or expansion on the fused shared features, feeds the processed fused shared features into the feature decoder corresponding to the missing modality, and finally restores the fused shared features through the feature decoder to obtain the completion image. By fusing the m object-shared features, the computer device makes the fused shared features consistent with the decoder's channel number; multi-pooling feature fusion improves the robustness of the extracted features, reduces information redundancy and prevents overfitting, thereby ensuring the accuracy of the image completion result.
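Putting the pieces together, a hedged sketch of the whole inference path (encoder of the missing modality → multi-pooling fusion → 1×1 channel adjustment → decoder of the missing modality) might read:

```python
def complete_missing_modality(complete_images, missing_id, encoder, decoder,
                              to_decoder_channels):
    """Sketch of the inference path for one missing modality; all components
    are the illustrative modules defined in the sketches above, not the
    patent's exact ones."""
    # one pairwise shared feature map per complete image
    shared = [encoder(x, missing_id) for x in complete_images]
    # multi-pooling fusion over a variable number m of complete modalities
    fused = fuse_shared_features(shared)
    # 1x1 conv so the channel count matches the decoder input
    fused = to_decoder_channels(fused)
    # restore the completion image in the missing modality
    return decoder(fused, missing_id)
```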
The above embodiments describe the application process of the image completion model; the following exemplary embodiments describe its training process.
Please refer to Figure 5, which shows a flowchart of a training method for an image completion model provided by an exemplary embodiment of the present application. The method includes:
Step 501: acquire a sample image set, where the sample image set includes sample images of the same sample object in different modalities, and the sample images include at least one sample missing image in a missing modality and at least one sample complete image in a complete modality.
In one possible implementation, the computer device acquires a sample image set containing the same sample object and obtains from it the sample missing image of the missing modality and the sample complete image of the complete modality.
Optionally, the sample object may be the central nervous system, the brain, bones, the spinal cord, blood vessels, and so on; the embodiments of the present application do not limit the specific sample object.
Optionally, after acquiring the sample image set of the sample object, the computer device needs to perform image preprocessing on the sample images in the sample image set; the preprocessing may be at least one of scaling, image normalization, image graying, image enhancement and image filtering, and the embodiments of the present application do not limit the specific preprocessing method.
Optionally, the computer device trains the feature encoders and feature decoders corresponding to the various modalities based on the sample image set.
Step 502: perform feature extraction on the sample images through the feature encoder of the sample modality to obtain first sample modality-shared features.
In one possible implementation, the computer device performs feature extraction on the sample images through the feature encoder corresponding to the sample modality to obtain the first sample modality-shared features, where, when the sample modality is the missing modality, the first sample modality-shared features are features shared by the sample missing image and the sample complete image; when the sample modality is the complete modality, the first sample modality-shared features are features shared by different sample complete images.
Unlike the application stage, in which the image completion model only performs feature extraction between the missing image and the complete images, in the training stage the computer device also performs feature extraction between the sample complete images of the complete modalities.
The computer device first extracts pairwise modality-shared features from the sample images through the feature encoder corresponding to the sample modality. As in the application stage, to satisfy the decoder's input requirements, the computer device applies multi-pooling fusion to the pairwise sample modality-shared features obtained by the feature encoder to obtain sample fused shared features, applies a 1×1 convolution to them so that the input of the feature decoder of the same sample modality is consistent with the decoder's output channel number, and finally takes the processed sample fused shared features as the first sample modality-shared features.
Illustratively, as shown in Figure 6, given a sample missing image x1 and sample complete images x2, x3 and x4, feature encoder 1 is the feature encoder corresponding to the missing modality of the sample missing image x1; feature encoder 1 obtains the pairwise sample modality-shared features {s12, s13, s14} shared by the sample missing image x1 and the sample complete images x2, x3 and x4, and the computer device applies multi-pooling fusion to them to obtain first sample modality-shared features 1. Similarly, feature encoder 2 obtains the pairwise sample modality-shared features {s22, s23, s24} shared by the sample complete image x2 and the sample complete images x2, x3 and x4, which the computer device fuses by multi-pooling into first sample modality-shared features 2; feature encoder 3 obtains first sample modality-shared features 3 shared by the sample complete image x3 and the sample complete images x2, x3 and x4; and feature encoder 4 obtains first sample modality-shared features 4 shared by the sample complete image x4 and the sample complete images x2, x3 and x4.
Step 503: perform feature restoration on the first sample modality-shared features through the feature decoder of the sample modality to obtain a sample generated image.
The computer device feeds the first sample modality-shared features into the feature decoder corresponding to the sample modality and performs feature restoration on them through that feature decoder, thereby obtaining the sample generated image. As in the model application process, the feature encoder that outputs the first sample modality-shared features and the feature decoder that receives them correspond to the same sample modality.
Step 504: train the feature encoder and the feature decoder of each modality based on the sample generated image and the sample images.
Since the sample image generated by the feature decoder depends on the first sample modality-shared features the feature decoder receives, if the sample generated image produced by the feature decoder is not similar enough to the sample image, the feature decoder and the feature encoder will continue to be trained together.
Optionally, step 504 may include the following sub-steps:
1. Determine an image consistency loss based on the sample generated image and the sample image.
In one possible implementation, the feature decoder should generate an image similar to the input image; to this end, the image completion model uses an image consistency loss Limg to characterize the similarity between the generated image and the input image (the formula itself is not preserved in this text), where xi is the input image, Xi denotes the image modality, ci denotes the first sample modality-shared features, E is the feature encoder, G is the feature decoder, m is the total number of sample complete images, and Gi(ci) denotes the sample generated image obtained by the feature decoder restoring the first sample modality-shared features. That is, the computer device determines the image consistency loss based on the sample generated image and the sample image, so as to train the feature encoder and feature decoder based on the image consistency loss.
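As a sketch, and assuming an L1 form for Limg (the exact norm is not preserved in this text), the term can be computed as:

```python
def image_consistency_loss(generated, target):
    # L1 distance between the decoder's reconstruction G_i(c_i) and the
    # input image x_i (the choice of L1 is an assumption of this sketch)
    return (generated - target).abs().mean()
```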
2. Train the feature encoder and the feature decoder of each modality based on the image consistency loss.
In one possible implementation, if the image consistency loss falls within a certain numerical range, the sample generated image produced by the feature decoder is similar to the sample image and the training of the image completion model is complete; correspondingly, if the image consistency loss is outside that range, the sample generated image is not similar enough to the sample image, and the image completion model continues to train the feature encoders and feature decoders corresponding to the various modalities.
In summary, in the embodiments of the present application, after acquiring the sample image set of the sample object, the computer device performs feature extraction on the sample images through the feature encoder corresponding to the sample modality to obtain the first sample modality-shared features, then performs feature restoration on them through the feature decoder corresponding to the sample modality to obtain the sample generated image, determines the image consistency loss based on the sample generated image and the sample image, and trains the feature encoders and feature decoders corresponding to the various modalities based on the image consistency loss; this achieves image completion while the training further ensures the accuracy of image completion.
To further improve the accuracy of the training results, other losses are introduced on top of the image consistency loss, such as a feature consistency loss, an adversarial loss and a symmetry loss. Please refer to Figure 7, which shows a flowchart of a training method for an image completion model provided by another exemplary embodiment of the present application.
Step 701: acquire a sample image set, where the sample image set includes sample images of the same sample object in different modalities, and the sample images include at least one sample missing image in a missing modality and at least one sample complete image in a complete modality.
Step 702: perform feature extraction on the sample images through the feature encoder of the sample modality to obtain first sample modality-shared features.
Step 703: perform feature restoration on the first sample modality-shared features through the feature decoder of the sample modality to obtain a sample generated image.
For the implementation of steps 701 to 703, reference may be made to the above embodiments; details are not repeated here.
Step 704: perform feature extraction on the sample generated image through the feature encoder of the sample modality to obtain second sample modality-shared features.
In the image-to-image translation process, it is desirable that the first sample modality-shared features can still be recovered from the sample generated image through the feature encoder. Correspondingly, in one possible implementation, the computer device performs feature extraction on the sample generated image through the feature encoder corresponding to the sample modality to obtain the second sample modality-shared features, so that a feature consistency loss can be introduced into the model loss by comparing the differences between the second and first sample modality-shared features.
Step 705: train the feature encoder and the feature decoder of each modality based on the sample generated image, the sample image, the first sample modality-shared features and the second sample modality-shared features.
In one possible implementation, the computer device trains the feature encoders and feature decoders corresponding to the various modalities based on the sample generated image, the sample image, and the first and second sample modality-shared features.
Optionally, besides introducing the feature consistency loss based on the first and second sample modality-shared features, an adversarial loss is introduced to minimize the distribution difference between generated images and real images. Optionally, step 705 may further include the following sub-steps:
1. Determine an image consistency loss based on the sample generated image and the sample image.
In one possible implementation, the feature decoder should generate an image similar to the input image; to this end, the image completion model uses the image consistency loss Limg described above to characterize the similarity between the generated image and the input image.
2. Determine a feature consistency loss based on the first sample modality-shared features and the second sample modality-shared features.
The feature consistency loss may also be called the latent consistency loss Llatent; it characterizes the similarity between the second sample modality-shared features, obtained by the feature encoder from the image generated by the feature decoder, and the first sample modality-shared features (the formula itself is not preserved in this text), where xi is the input image, Xi denotes the image modality, ci denotes the first sample modality-shared features, E is the feature encoder, G is the feature decoder, m is the total number of sample complete images, Gi(ci) denotes the sample generated image obtained by restoring the first sample modality-shared features, and Ei(Gi(ci); i) denotes the second sample modality-shared features extracted from the sample generated image by the feature encoder corresponding to the sample modality.
3. Input the sample generated image and the sample image into a discriminator to obtain a sample discrimination result, the discriminator being used to discriminate between generated images and real images, and determine an adversarial loss based on the sample discrimination result.
To make the generated images closer to real images, the embodiments of the present application use the idea of generative adversarial training: during training, a discriminator discriminates between the sample images and the sample generated images, and training is complete when, even with a sufficiently reliable discriminator, it is impossible to tell whether a given image is a sample image or a sample generated image, that is, when the sample generated image produced by the feature decoder is close to the sample image and the discriminative model cannot distinguish them.
In the embodiments of the present application, the discriminator includes four 4×4 conditional convolution blocks with stride 2 and filter counts of 64-128-256-512, and the discriminator uses a leaky ReLU activation function with slope 0.2. The adversarial loss Ladv characterizes the distribution difference between generated images and real images (its formula is not preserved in this text), where xi is the input image, Xi denotes the image modality to which the input image belongs, ci denotes the first sample modality-shared features, m is the total number of sample complete images, Gi(ci) denotes the sample generated image obtained by restoring the first sample modality-shared features, and Di is the discriminator of modality i, used to discriminate between the sample images and the sample generated images of modality i.
4. Determine a symmetry loss based on the first sample modality-shared features, the symmetry loss characterizing the similarity of modality-shared features between paired modalities.
Ideal pairwise modality-shared features are symmetric; for example, the T1 modality-shared features extracted from the T2 modality should be similar to the T2 modality-shared features extracted from the T1 modality. To disentangle the pairwise modality-shared features well, the image completion model introduces a symmetry loss Lsym (its formula is not preserved in this text), where d(·, ·) computes the distance between two feature quantities, sij = Ei(xj; j) denotes the shared features of modality i extracted from modality j, and α = 0.1 is preset in the image completion model.
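The loss formulas themselves did not survive extraction. The LaTeX block below offers standard forms consistent with the symbol definitions above, as a hedged reconstruction rather than the patent's verbatim equations; the L1 norm in the first two terms, the non-saturating GAN form of the adversarial term, and the margin role of α in the symmetry term are assumptions:

```latex
% Hedged reconstruction of the four loss terms from their symbol lists;
% not verbatim from the patent.
\begin{align}
\mathcal{L}_{img}    &= \sum_{i} \mathbb{E}_{x_i \sim X_i}\big[\lVert G_i(c_i) - x_i \rVert_1\big] \\
\mathcal{L}_{latent} &= \sum_{i} \mathbb{E}_{x_i \sim X_i}\big[\lVert E_i(G_i(c_i); i) - c_i \rVert_1\big] \\
\mathcal{L}_{adv}    &= \sum_{i} \mathbb{E}_{x_i \sim X_i}\big[\log D_i(x_i)\big]
                       + \mathbb{E}\big[\log\big(1 - D_i(G_i(c_i))\big)\big] \\
\mathcal{L}_{sym}    &= \sum_{i \neq j} \max\big(0,\; d(s_{ij}, s_{ji}) - \alpha\big),
\qquad s_{ij} = E_i(x_j;\, j)
\end{align}
```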
5. Determine a total loss based on the image consistency loss, the feature consistency loss, the adversarial loss and the symmetry loss.
Finally, the total loss function of the image completion model is L, defined as L = λimgLimg + λlatentLlatent + λadvLadv + λsymLsym, where λimg = 10, λlatent = 1, λadv = 1 and λsym = 1 are preset in the image completion model.
6. Train the feature encoder and the feature decoder of each modality, as well as the discriminator, based on the total loss.
During training, the number and distribution of available modalities are random. The computer device optimizes the total loss function L through min_{E,G} max_D L. When L reaches a certain target range, that is, when the discriminator can no longer distinguish the sample generated image from the sample image, the computer device completes training; before L reaches that range, that is, while the discriminator can still distinguish the sample generated image from the sample image, the computer device trains the corresponding feature encoders and feature decoders, as well as the discriminator, based on the total loss.
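A hedged sketch of one alternating optimization step for min_{E,G} max_D L follows; the optimizer handling, the non-saturating adversarial form, and the omission of the symmetry term are assumptions and simplifications of the sketch:

```python
def train_step(enc, dec, disc, opt_eg, opt_d, x, modality_id, weights):
    """One alternating update: first the discriminator (max over D), then
    the encoder/decoder (min over E, G). Sketch only; the pairwise symmetry
    term L_sym over shared features is omitted for brevity."""
    lam_img, lam_latent, lam_adv = weights   # e.g. (10.0, 1.0, 1.0)

    c = enc(x, modality_id)                  # first shared features c_i
    fake = dec(c, modality_id)               # sample generated image G_i(c_i)

    # --- discriminator step: push real images up, generated images down ---
    opt_d.zero_grad()
    d_loss = F.softplus(-disc(x)).mean() + F.softplus(disc(fake.detach())).mean()
    d_loss.backward()
    opt_d.step()

    # --- encoder/decoder step: image, latent and adversarial terms ---
    opt_eg.zero_grad()
    c2 = enc(fake, modality_id)              # second shared features
    g_loss = (lam_img * (fake - x).abs().mean()
              + lam_latent * (c2 - c).abs().mean()
              + lam_adv * F.softplus(-disc(fake)).mean())
    g_loss.backward()
    opt_eg.step()
    return d_loss.item(), g_loss.item()
```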
Illustratively, as shown in Figure 8, given a sample missing image x1 and sample complete images x2, x3 and x4, feature encoder 1 is the feature encoder corresponding to the missing modality of the sample missing image x1; feature encoder 1 obtains the pairwise modality-shared features shared by the sample missing image x1 and the sample complete images x2, x3 and x4, which are fused by multi-pooling into first sample modality-shared features 1. The computer device performs feature restoration on first sample modality-shared features 1 through feature decoder 1 corresponding to the sample modality to obtain the sample generated image x1', then performs feature extraction on the sample generated image x1' through feature encoder 1 corresponding to the sample modality to obtain second sample modality-shared features 1. The computer device determines the image consistency loss based on the sample generated image and the sample image, determines the feature consistency loss based on the first and second sample modality-shared features, feeds the sample generated image and the sample image into the discriminator to obtain the sample discrimination result and determines the adversarial loss based on it, and determines the symmetry loss based on the first sample modality-shared features. Finally, the computer device determines the total loss based on the image consistency loss, the feature consistency loss, the adversarial loss and the symmetry loss, and trains the corresponding feature encoders and feature decoders, as well as the discriminator, based on the total loss.
Two related technologies provide different image completion methods. However, the image completion method in related technology 1 extracts feature information that is invariant across all modalities and completes the image based on it, while the method in related technology 2 extracts only feature information invariant between two modalities and completes the image based on it; the completion images generated by both lose some image detail and cannot complete the image accurately. Therefore, to improve the accuracy of image completion, in the embodiments of the present application the computer device extracts the pairwise modality-shared features shared between two or three modalities, that is, the object modality-shared features, and performs modality completion on the missing image based on them to obtain the completion image corresponding to the missing image. As shown in Figure 9, compared with the completion images of the two related technologies, the completion image of this solution has more image detail, ensuring the accuracy of image completion while achieving it.
As shown in Table 1 and Table 2, compared with related technology 1, which extracts feature information invariant across all modalities and completes the image based on it, and related technology 2, which extracts only feature information invariant between two modalities and completes the image based on it, the image completion method provided by the embodiments of the present application achieves a better peak signal-to-noise ratio and structural similarity in most cases, which shows that it can generate more realistic completion images; that is, the completion images generated by the embodiments of the present application have higher accuracy, and the image completion model has better performance.
Table 1
Table 2
Please refer to Figure 10, which shows a structural block diagram of an image completion apparatus provided by an exemplary embodiment of the present application. The apparatus includes:
an acquisition module 1001, configured to acquire an object image set, where the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities, n and m being positive integers;
a feature extraction module 1002, configured to extract object modality-shared features from the complete images, where the object modality-shared features are features shared by the missing image and the complete images;
a feature restoration module 1003, configured to perform feature restoration on the object modality-shared features to obtain a completion image of the missing image.
Optionally, the feature extraction module 1002 is further configured to:
input the missing image and the complete images into the feature encoder of the missing modality, where different modalities correspond to different feature encoders;
perform feature extraction on the missing image and the complete images through the feature encoder to obtain the object modality-shared features.
The feature restoration module 1003 is further configured to:
input the object modality-shared features into the feature decoder of the missing modality, where different modalities correspond to different feature decoders;
perform feature restoration on the object modality-shared features through the feature decoder to obtain the completion image.
Optionally, the feature extraction module 1002 is further configured to perform feature extraction on the missing image and the i-th complete image through the feature encoder to obtain the i-th object modality-shared features, where the i-th complete image belongs to the m complete images and i is less than or equal to m.
The feature restoration module 1003 is further configured to:
perform feature fusion on the m object modality-shared features to obtain fused shared features;
input the fused shared features into the feature decoder of the missing modality.
The feature restoration module 1003 is further configured to:
perform feature restoration on the fused shared features through the feature decoder to obtain the completion image.
Optionally, the feature restoration module 1003 is further configured to:
pool the i-th object modality-shared features in at least two pooling manners to obtain at least two pooled features of the i-th object modality-shared features;
perform feature splicing on the pooled features of each of the m object modality-shared features to obtain the fused shared features.
Optionally, the feature restoration module 1003 is further configured to:
perform channel dimension reduction or channel dimension expansion on the fused shared features to obtain processed fused shared features, where the channel number of the processed fused shared features is consistent with the output channel number of the feature encoder;
input the processed fused shared features into the feature decoder of the missing modality.
Optionally, the feature encoder is a mixture-of-experts network composed of conditional convolutions, and the parameters of the conditional convolutions are determined based on the modality of the feature encoder.
Optionally, the apparatus further includes:
a sample acquisition module, configured to acquire a sample image set, where the sample image set includes sample images of the same sample object in different modalities, and the sample images include at least one sample missing image in a missing modality and at least one sample complete image in a complete modality;
a training module, configured to train the feature encoders and feature decoders of the various modalities based on the sample image set.
Optionally, the training module is further configured to:
perform feature extraction on the sample images through a feature encoder of a sample modality to obtain first sample modality-shared features, where, when the sample modality is the missing modality, the first sample modality-shared features are features shared by the sample missing image and the sample complete image; when the sample modality is the complete modality, the first sample modality-shared features are features shared by different sample complete images;
perform feature restoration on the first sample modality-shared features through the feature decoder of the sample modality to obtain a sample generated image;
train the feature encoder and the feature decoder of each modality based on the sample generated image and the sample images.
Optionally, the training module is further configured to:
determine an image consistency loss based on the sample generated image and the sample image;
train the feature encoder and the feature decoder of each modality based on the image consistency loss.
Optionally, the training module is further configured to:
perform feature extraction on the sample generated image through the feature encoder of the sample modality to obtain second sample modality-shared features.
The training module is further configured to:
train the feature encoder and the feature decoder of each modality based on the sample generated image, the sample image, the first sample modality-shared features and the second sample modality-shared features.
Optionally, the training module is further configured to:
determine an image consistency loss based on the sample generated image and the sample image;
determine a feature consistency loss based on the first sample modality-shared features and the second sample modality-shared features;
input the sample generated image and the sample image into a discriminator to obtain a sample discrimination result, the discriminator being used to discriminate between generated images and real images; determine an adversarial loss based on the sample discrimination result;
determine a symmetry loss based on the first sample modality-shared features, the symmetry loss characterizing the similarity of modality-shared features between paired modalities;
determine a total loss based on the image consistency loss, the feature consistency loss, the adversarial loss and the symmetry loss;
train the feature encoder and the feature decoder of each modality, as well as the discriminator, based on the total loss.
Optionally, when the image is a brain tumor image, the modalities of the image include a T1 modality, a T1ce modality, a T2 modality and a FLAIR modality.
Please refer to Figure 11, which shows a schematic structural diagram of a computer device provided by an exemplary embodiment of the present application. Specifically, the computer device 1100 includes a central processing unit (CPU) 1101, a system memory 1104 including a random access memory 1102 and a read-only memory 1103, and a system bus 1105 connecting the system memory 1104 and the central processing unit 1101. The computer device 1100 may further include a basic input/output system (I/O system) 1106 that facilitates the transfer of information between components within the computer, and a mass storage device 1107 for storing an operating system 1113, application programs 1114 and other program modules 1115.
In some embodiments, the basic input/output system 1106 may include a display 1108 for displaying information and an input device 1109, such as a mouse or keyboard, for the user to input information. The display 1108 and the input device 1109 are both connected to the central processing unit 1101 through an input/output controller 1110 connected to the system bus 1105. The basic input/output system 1106 may further include the input/output controller 1110 for receiving and processing input from a plurality of other devices such as a keyboard, mouse or electronic stylus. Similarly, the input/output controller 1110 also provides output to a display screen, printer or other type of output device.
The mass storage device 1107 is connected to the central processing unit 1101 through a mass storage controller (not shown) connected to the system bus 1105. The mass storage device 1107 and its associated computer-readable media provide non-volatile storage for the computer device 1100. That is, the mass storage device 1107 may include a computer-readable medium (not shown) such as a hard disk or a drive.
Without loss of generality, the computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include random access memory (RAM), read-only memory (ROM), flash memory or other solid-state storage technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, tape cassettes, magnetic tape, disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media are not limited to the above. The above system memory 1104 and mass storage device 1107 may be collectively referred to as memory.
The memory stores one or more programs configured to be executed by one or more central processing units 1101; the one or more programs contain instructions for implementing the above methods, and the central processing unit 1101 executes the one or more programs to implement the methods provided by each of the above method embodiments.
According to various embodiments of the present application, the computer device 1100 may also run connected through a network, such as the Internet, to a remote computer on the network. That is, the computer device 1100 may be connected to the network 1112 through a network interface unit 1111 connected to the system bus 1105, or the network interface unit 1111 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs stored in the memory, and the one or more programs include the steps to be executed by the computer device in the methods provided by the embodiments of the present application.
The embodiments of the present application also provide a computer-readable storage medium storing at least one program, which is loaded and executed by a processor to implement the image completion method described in the above embodiments.
The embodiments of the present application provide a computer program product, which includes computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the image completion method described in the above embodiments.

Claims (20)

  1. An image completion method, the method comprising:
    acquiring an object image set, wherein the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities, n and m being positive integers;
    extracting object modality-shared features from the complete images, wherein the object modality-shared features are features shared by the missing image and the complete images;
    performing feature restoration on the object modality-shared features to obtain a completion image of the missing image.
  2. The method according to claim 1, wherein the extracting object modality-shared features from the complete images comprises:
    inputting the missing image and the complete images into a feature encoder of the missing modality, wherein different modalities correspond to different feature encoders;
    performing feature extraction on the missing image and the complete images through the feature encoder to obtain the object modality-shared features;
    the performing feature restoration on the object modality-shared features to obtain a completion image of the missing image comprises:
    inputting the object modality-shared features into a feature decoder of the missing modality, wherein different modalities correspond to different feature decoders;
    performing feature restoration on the object modality-shared features through the feature decoder to obtain the completion image.
  3. The method according to claim 2, wherein the performing feature extraction on the missing image and the complete images through the feature encoder to obtain the object modality-shared features comprises:
    performing feature extraction on the missing image and an i-th complete image through the feature encoder to obtain i-th object modality-shared features, wherein the i-th complete image belongs to the m complete images and i is less than or equal to m;
    the inputting the object modality-shared features into the feature decoder of the missing modality comprises:
    performing feature fusion on the m object modality-shared features to obtain fused shared features;
    inputting the fused shared features into the feature decoder of the missing modality;
    the performing feature restoration on the object modality-shared features through the feature decoder to obtain the completion image comprises:
    performing feature restoration on the fused shared features through the feature decoder to obtain the completion image.
  4. The method according to claim 3, wherein the performing feature fusion on the m object modality-shared features to obtain fused shared features comprises:
    pooling the i-th object modality-shared features in at least two pooling manners to obtain at least two pooled features of the i-th object modality-shared features;
    performing feature splicing on the pooled features of each of the m object modality-shared features to obtain the fused shared features.
  5. The method according to claim 3, wherein the inputting the fused shared features into the feature decoder of the missing modality comprises:
    performing channel dimension reduction or channel dimension expansion on the fused shared features to obtain processed fused shared features, wherein the channel number of the processed fused shared features is consistent with the output channel number of the feature encoder;
    inputting the processed fused shared features into the feature decoder of the missing modality.
  6. The method according to claim 2, wherein the feature encoder is a mixture-of-experts network composed of conditional convolutions, and the parameters of the conditional convolutions are determined based on the modality of the feature encoder.
  7. The method according to claim 2, wherein the method further comprises:
    acquiring a sample image set, wherein the sample image set includes sample images of the same sample object in different modalities, and the sample images include at least one sample missing image in a missing modality and at least one sample complete image in a complete modality;
    training the feature encoders and feature decoders of the various modalities based on the sample image set.
  8. The method according to claim 7, wherein the training the feature encoders and feature decoders of the various modalities based on the sample image set comprises:
    performing feature extraction on the sample images through a feature encoder of a sample modality to obtain first sample modality-shared features, wherein, when the sample modality is the missing modality, the first sample modality-shared features are features shared by the sample missing image and the sample complete image; when the sample modality is the complete modality, the first sample modality-shared features are features shared by different sample complete images;
    performing feature restoration on the first sample modality-shared features through the feature decoder of the sample modality to obtain a sample generated image;
    training the feature encoder and the feature decoder of each modality based on the sample generated image and the sample images.
  9. The method according to claim 8, wherein the training the feature encoder and the feature decoder of each modality based on the sample generated image and the sample images comprises:
    determining an image consistency loss based on the sample generated image and the sample image;
    training the feature encoder and the feature decoder of each modality based on the image consistency loss.
  10. The method according to claim 8, wherein the training the feature encoders and feature decoders of the various modalities based on the sample image set further comprises:
    performing feature extraction on the sample generated image through the feature encoder of the sample modality to obtain second sample modality-shared features;
    the training the feature encoder and the feature decoder of each modality based on the sample generated image and the sample images comprises:
    training the feature encoder and the feature decoder of each modality based on the sample generated image, the sample image, the first sample modality-shared features and the second sample modality-shared features.
  11. The method according to claim 10, wherein the training the feature encoder and the feature decoder of each modality based on the sample generated image, the sample image, the first sample modality-shared features and the second sample modality-shared features comprises:
    determining an image consistency loss based on the sample generated image and the sample image;
    determining a feature consistency loss based on the first sample modality-shared features and the second sample modality-shared features;
    inputting the sample generated image and the sample image into a discriminator to obtain a sample discrimination result, the discriminator being used to discriminate between generated images and real images; determining an adversarial loss based on the sample discrimination result;
    determining a symmetry loss based on the first sample modality-shared features, the symmetry loss characterizing the similarity of modality-shared features between paired modalities;
    determining a total loss based on the image consistency loss, the feature consistency loss, the adversarial loss and the symmetry loss;
    training the feature encoder and the feature decoder of each modality, as well as the discriminator, based on the total loss.
  12. The method according to any one of claims 1 to 11, wherein, when the image is a brain tumor image, the modalities of the image include a T1 modality, a T1ce modality, a T2 modality and a FLAIR modality.
  13. An image completion apparatus, the apparatus comprising:
    an acquisition module, configured to acquire an object image set, wherein the object image set includes images of the same object in different modalities, and the images include n missing images in missing modalities and m complete images in complete modalities, n and m being positive integers;
    a feature extraction module, configured to extract object modality-shared features from the complete images, wherein the object modality-shared features are features shared by the missing image and the complete images;
    a feature restoration module, configured to perform feature restoration on the object modality-shared features to obtain a completion image of the missing image.
  14. The apparatus according to claim 13, wherein the feature extraction module is further configured to:
    input the missing image and the complete images into a feature encoder of the missing modality, wherein different modalities correspond to different feature encoders;
    perform feature extraction on the missing image and the complete images through the feature encoder to obtain the object modality-shared features;
    the feature restoration module is further configured to:
    input the object modality-shared features into a feature decoder of the missing modality, wherein different modalities correspond to different feature decoders;
    perform feature restoration on the object modality-shared features through the feature decoder to obtain the completion image.
  15. The apparatus according to claim 14, wherein the feature extraction module is further configured to:
    perform feature extraction on the missing image and an i-th complete image through the feature encoder to obtain i-th object modality-shared features, wherein the i-th complete image belongs to the m complete images and i is less than or equal to m;
    the feature restoration module is further configured to:
    perform feature fusion on the m object modality-shared features to obtain fused shared features;
    input the fused shared features into the feature decoder of the missing modality;
    the feature restoration module is further configured to:
    perform feature restoration on the fused shared features through the feature decoder to obtain the completion image.
  16. The apparatus according to claim 15, wherein the feature restoration module is further configured to:
    pool the i-th object modality-shared features in at least two pooling manners to obtain at least two pooled features of the i-th object modality-shared features;
    perform feature splicing on the pooled features of each of the m object modality-shared features to obtain the fused shared features.
  17. The apparatus according to claim 15, wherein the feature restoration module is further configured to:
    perform channel dimension reduction or channel dimension expansion on the fused shared features to obtain processed fused shared features, wherein the channel number of the processed fused shared features is consistent with the output channel number of the feature encoder;
    input the processed fused shared features into the feature decoder of the missing modality.
  18. A computer device, comprising a processor and a memory storing at least one program, the at least one program being loaded and executed by the processor to implement the image completion method according to any one of claims 1 to 12.
  19. A computer-readable storage medium storing at least one program, the at least one program being loaded and executed by a processor to implement the image completion method according to any one of claims 1 to 12.
  20. A computer program product, comprising computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the image completion method according to any one of claims 1 to 12.
PCT/CN2023/082321 2022-04-27 2023-03-17 Image completion method, apparatus, device and storage medium WO2023207416A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210457083.1 2022-04-27
CN202210457083.1A CN115170401A (zh) 2022-04-27 Image completion method, apparatus, device and storage medium

Publications (1)

Publication Number Publication Date
WO2023207416A1 true WO2023207416A1 (zh) 2023-11-02

Family

ID=83483401

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/082321 WO2023207416A1 (zh) 2022-04-27 2023-03-17 Image completion method, apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN115170401A (zh)
WO (1) WO2023207416A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115170401A (zh) 2022-04-27 2022-10-11 腾讯医疗健康(深圳)有限公司 Image completion method, apparatus, device and storage medium
CN117036181A (zh) 2022-10-24 2023-11-10 腾讯科技(深圳)有限公司 Training method and apparatus for image processing model, electronic device and storage medium


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210063518A1 (en) * 2018-06-15 2021-03-04 Subtle Medical, Inc. Systems and methods for magnetic resonance imaging standardization using deep learning
CN113706558A (zh) 2021-09-06 2021-11-26 联想(北京)有限公司 Image segmentation method, apparatus and computer device
CN113920212A (zh) 2021-09-27 2022-01-11 深圳技术大学 Magnetic resonance reconstruction model training method, computer apparatus and storage medium
CN115170401A (zh) 2022-04-27 2022-10-11 腾讯医疗健康(深圳)有限公司 Image completion method, apparatus, device and storage medium

Also Published As

Publication number Publication date
CN115170401A (zh) 2022-10-11

Similar Documents

Publication Publication Date Title
CN112992308B (zh) Training method for a medical image report generation model and image report generation method
EP3961484A1 (en) Medical image segmentation method and device, electronic device and storage medium
JP7143008B2 (ja) Deep learning-based medical image detection method and apparatus, electronic device and computer program
WO2023207416A1 (zh) Image completion method, apparatus, device and storage medium
Ueda et al. Technical and clinical overview of deep learning in radiology
CN111369576B (zh) Training method for an image segmentation model, image segmentation method, apparatus and device
US20220028031A1 (en) Image processing method and apparatus, device, and storage medium
CN111932529B (zh) Image classification and segmentation method, apparatus and system
CN111091010A (zh) Similarity determination, network training and search methods, apparatus, and storage medium
CN116630514A (zh) Image processing method and apparatus, computer-readable storage medium and electronic device
CN113592769B (zh) Abnormal image detection and model training methods, apparatus, device and medium
US20240046471A1 (en) Three-dimensional medical image recognition method and apparatus, device, storage medium, and product
CN114298997A (zh) Forged image detection method, apparatus and storage medium
WO2024087858A1 (zh) Training method and apparatus for an image processing model, electronic device, computer program product and computer storage medium
CN113822323A (zh) Recognition processing method, apparatus, device and storage medium for brain scan images
WO2023173827A1 (zh) Image generation method, apparatus, device, storage medium and computer program product
KR101948701B1 (ko) Method for determining a brain disease of a subject based on latent variables describing the subject's brain structure, and apparatus using the same
CN113689435B (zh) Image segmentation method and apparatus, electronic device and storage medium
CN111598904B (zh) Image segmentation method, apparatus, device and storage medium
CN115115772A (zh) Key structure reconstruction method and apparatus based on three-dimensional images, and computer device
CN111369564B (zh) Image processing method, model training method and apparatus
CN111626972B (zh) CT image reconstruction method, model training method and device
Zhang et al. ETUNet: Exploring efficient transformer enhanced UNet for 3D brain tumor segmentation
CN114283406A (zh) Cell image recognition method, apparatus, device, medium and computer program product
CN114639132A (zh) Feature extraction model processing method, apparatus and device for face recognition scenarios

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23794851

Country of ref document: EP

Kind code of ref document: A1