CN112348819A - Model training method, image processing and registering method, and related device and equipment - Google Patents


Info

Publication number
CN112348819A
CN112348819A (application CN202011193221.7A)
Authority
CN
China
Prior art keywords
image
training
registered
model
characteristic data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011193221.7A
Other languages
Chinese (zh)
Inventor
Song Tao (宋涛)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN202011193221.7A
Publication of CN112348819A
Priority to JP2021576590A
Priority to PCT/CN2021/079154
Priority to KR1020227001100A
Priority to TW110115205A
Legal status: Withdrawn


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]

Abstract

The application discloses a model training method, an image processing method, an image registration method, and related devices and equipment, applicable to medical image processing. The model training method includes: acquiring at least one sample image, where each sample image has annotation information; performing content extraction on the at least one sample image using a disentanglement model to obtain content feature data of the at least one sample image; and training a preset network model using the content feature data and annotation information of the at least one sample image to obtain a network model for processing a target image. The disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same first domain as the sample image, and the other class belongs to the same second domain as the target image. With this scheme, image processing tasks on images belonging to different domains can be completed at low cost and with high precision.

Description

Model training method, image processing and registering method, and related device and equipment
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a model training method, an image processing and registering method, and a related apparatus and device.
Background
With the development of artificial intelligence technologies such as neural networks and deep learning, training a network model and then using the trained network model to complete image processing tasks such as image segmentation, image classification, target detection and image registration has become increasingly popular.
However, in practical applications, the training images used to train a network model usually belong to one domain, while the target images to be processed by the network model belong to another domain, so processing precision easily degrades when the trained network model is applied to the target images. For this reason, target images belonging to another domain are currently re-annotated and the model re-trained, which wastes a great deal of time and labor. Taking medical images as an example, because CT (Computed Tomography) images and MR (Magnetic Resonance) images belong to different domains, accuracy often drops when MR images are processed by a network model trained on CT images; and re-annotating the MR images usually requires senior medical staff, which is costly. In view of the above, how to complete image processing tasks on images belonging to different domains at low cost and with high precision is an urgent problem to be solved.
Disclosure of Invention
The present application provides a model training method, an image processing method, an image registration method, and related devices and equipment.
A first aspect of the present application provides a model training method, including: acquiring at least one sample image, where each sample image has annotation information; performing content extraction on the at least one sample image using a disentanglement model to obtain content feature data of the at least one sample image; and training a preset network model using the content feature data and annotation information of the at least one sample image to obtain a network model for processing a target image. The disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same first domain as the sample image, and the other class belongs to the same second domain as the target image.
Therefore, through a disentanglement model trained on one class of training images belonging to the same first domain as the sample image and another class belonging to the same second domain as the target image, the sample image with annotation information and the target image can be mapped to content feature data belonging to the same domain. A network model trained on the content feature data and annotation information of the sample image is thus also applicable to the target image without annotating the target image, so the annotation cost of images in different domains can be reduced and the precision of subsequent processing improved.
Wherein, the sample image and the target image are medical images, and the annotation information is annotation information of a biological organ; and/or, the sample image and the target image are three-dimensional images, and the annotation information is annotation information of a three-dimensional target.
Therefore, by setting the sample image and the target image as medical images, with the annotation information being annotation information of a biological organ, the method can be applied to the field of medical images; by setting the sample image and the target image as three-dimensional images, with the annotation information being annotation information of a three-dimensional target, the method can be applied to the processing of three-dimensional images.
Wherein, the preset network model comprises any one of the following: an image segmentation network model, an image classification network model, or a target detection network model.
Therefore, by setting the preset network model to any one of an image segmentation network model, an image classification network model and a target detection network model, network models suited to different image processing tasks can be obtained by training on different types of preset network models, so that different processing tasks can be completed on images in different domains.
Wherein, training the disentanglement model includes: acquiring a first class of training images belonging to the first domain and a second class of training images belonging to the second domain; performing content and style extraction on the first class of training images and the second class of training images, respectively, using an original disentanglement model to obtain content feature data and style feature data of the first class of training images and content feature data and style feature data of the second class of training images; performing reconstruction using the content feature data and style feature data of the first class of training images and the content feature data and style feature data of the second class of training images to obtain reconstructed images; obtaining a loss value of the disentanglement model based on the reconstructed images; and adjusting parameters of the disentanglement model based on the loss value.
Therefore, the original disentanglement model can be used to perform feature extraction on the first class of training images belonging to the first domain and the second class of training images belonging to the second domain, and reconstruction can be performed based on the extracted content feature data and style feature data, so that the loss value of the disentanglement model is obtained from the reconstructed images and the parameters of the disentanglement model are then adjusted based on the loss value, thereby training the disentanglement model.
Wherein, performing reconstruction using the content feature data and style feature data of the first class of training images and the content feature data and style feature data of the second class of training images to obtain reconstructed images includes: performing reconstruction using the content feature data and style feature data of the first class of training images to obtain a first intra-domain reconstructed image, and performing reconstruction using the content feature data and style feature data of the second class of training images to obtain a second intra-domain reconstructed image; performing reconstruction using the style feature data of the first class of training images and the content feature data of the second class of training images to obtain a first cross-domain reconstructed image, and performing reconstruction using the style feature data of the second class of training images and the content feature data of the first class of training images to obtain a second cross-domain reconstructed image. Obtaining a loss value of the disentanglement model based on the reconstructed images includes: obtaining a first intra-domain loss value based on the difference between the first class of training images and the first intra-domain reconstructed image, and obtaining a second intra-domain loss value based on the difference between the second class of training images and the second intra-domain reconstructed image; obtaining a first content loss value based on the difference between the content feature data of the first class of training images and the content feature data of the second cross-domain reconstructed image, obtaining a first style loss value based on the difference between the style feature data of the first class of training images and the style feature data of the first cross-domain reconstructed image, obtaining a second content loss value based on the difference between the content feature data of the second class of training images and the content feature data of the first cross-domain reconstructed image, and obtaining a second style loss value based on the difference between the style feature data of the second class of training images and the style feature data of the second cross-domain reconstructed image; obtaining a first cross-domain loss value based on the difference between the first class of training images and the first cross-domain reconstructed image, and obtaining a second cross-domain loss value based on the difference between the second class of training images and the second cross-domain reconstructed image; and weighting the obtained loss values to obtain the loss value of the disentanglement model.
Therefore, the loss value of the disentanglement model can be obtained by weighting the intra-domain loss values, the cross-domain loss values, and the loss values of the content feature data and style feature data recovered from the reconstructed images, and the parameters can be adjusted based on this total loss value, so that the content feature data extracted by the disentanglement model from images in different domains come to lie in the same domain.
A second aspect of the present application provides an image processing method, including: performing content extraction on a target image using a disentanglement model to obtain content feature data of the target image; and processing the content feature data of the target image using a processing network model to obtain a processing result of the target image. The disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same domain as the target image; the processing network model is obtained using the model training method of the first aspect.
Therefore, the processing network model trained by the model training method of the first aspect is applicable both to the sample images used to train it and to the target image, so the target image does not need to be annotated; the annotation cost of images in different domains can thus be reduced, and the accuracy of image processing improved.
A third aspect of the present application provides an image registration method, including: acquiring a first image to be registered and a second image to be registered; performing content extraction on the first image to be registered and the second image to be registered, respectively, using a disentanglement model to obtain content feature data of the first image to be registered and content feature data of the second image to be registered; determining registration parameters based on the content feature data of the first image to be registered and the content feature data of the second image to be registered; and registering the first image to be registered and the second image to be registered using the registration parameters. The disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same first domain as the first image to be registered, and the other class belongs to the same second domain as the second image to be registered.
Therefore, through a disentanglement model trained on one class of training images belonging to the same first domain as the first image to be registered and another class belonging to the same second domain as the second image to be registered, the extracted content feature data of the first image to be registered and of the second image to be registered belong to the same domain, so that the registration parameters of the two images can be obtained directly from their content feature data and registration carried out. Unsupervised registration between images of different domains can thus be realized, the cost of cross-domain image registration can be reduced, and its accuracy can be improved.
Wherein, the first image to be registered and the second image to be registered are medical images; and/or, the first image to be registered and the second image to be registered are three-dimensional images; and/or, the registration parameters include at least one of: a rigid transformation parameter and a deformation transformation parameter.
Thus, by setting the first image to be registered and the second image to be registered as medical images, the image registration can be applied to the registration of medical images; by setting them as three-dimensional images, it can be applied to the registration of three-dimensional images; and since the registration parameters may include at least one of a rigid transformation parameter and a deformation transformation parameter, the first image to be registered and the second image to be registered can be registered after being transformed according to their actual conditions, which can improve the accuracy of image registration.
Wherein, determining the registration parameters based on the content feature data of the first image to be registered and the content feature data of the second image to be registered includes: determining registration parameters corresponding to the first image to be registered based on the content feature data of the first image to be registered and the content feature data of the second image to be registered. Registering the first image to be registered and the second image to be registered using the registration parameters includes: transforming the first image to be registered using the registration parameters to obtain a transformed image corresponding to the first image to be registered, and making the pixel points of the transformed image coincide with the corresponding pixel points of the second image to be registered.
Therefore, the registration parameters corresponding to the first image to be registered can be used to transform the first image to be registered to obtain a corresponding transformed image whose pixel points coincide with the corresponding pixel points of the second image to be registered, so that unsupervised registration between images in different domains can be achieved.
Wherein, training the disentanglement model includes: acquiring a first class of training images belonging to the first domain and a second class of training images belonging to the second domain; performing content and style extraction on the first class of training images and the second class of training images, respectively, using an original disentanglement model to obtain content feature data and style feature data of the first class of training images and content feature data and style feature data of the second class of training images; performing reconstruction using the content feature data and style feature data of the first class of training images and the content feature data and style feature data of the second class of training images to obtain reconstructed images; obtaining a loss value of the disentanglement model based on the reconstructed images; and adjusting parameters of the disentanglement model based on the loss value.
Therefore, the original disentanglement model can be used to perform feature extraction on the first class of training images belonging to the first domain and the second class of training images belonging to the second domain, and reconstruction can be performed based on the extracted content feature data and style feature data, so that the loss value of the disentanglement model is obtained from the reconstructed images and the parameters of the disentanglement model are then adjusted based on the loss value, thereby training the disentanglement model.
A fourth aspect of the present application provides a model training apparatus, including an image acquisition module, a content extraction module and a model training module. The image acquisition module is configured to acquire at least one sample image, where each sample image has annotation information; the content extraction module is configured to perform content extraction on the at least one sample image using a disentanglement model to obtain content feature data of the at least one sample image; the model training module is configured to train a preset network model using the content feature data and annotation information of the at least one sample image to obtain a network model for processing a target image. The disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same first domain as the sample image, and the other class belongs to the same second domain as the target image.
A fifth aspect of the present application provides an image processing apparatus, including a content extraction module and a network processing module. The content extraction module is configured to perform content extraction on a target image using a disentanglement model to obtain content feature data of the target image; the network processing module is configured to process the content feature data of the target image using a processing network model to obtain a processing result of the target image. The disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same domain as the target image; the processing network model is obtained using the model training apparatus of the fourth aspect.
A sixth aspect of the present application provides an image registration apparatus, including an image acquisition module, a content extraction module, a parameter determination module and an image registration module. The image acquisition module is configured to acquire a first image to be registered and a second image to be registered; the content extraction module is configured to perform content extraction on the first image to be registered and the second image to be registered, respectively, using a disentanglement model to obtain content feature data of the first image to be registered and content feature data of the second image to be registered; the parameter determination module is configured to determine registration parameters based on the content feature data of the first image to be registered and the content feature data of the second image to be registered; the image registration module is configured to register the first image to be registered and the second image to be registered using the registration parameters. The disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same first domain as the first image to be registered, and the other class belongs to the same second domain as the second image to be registered.
A seventh aspect of the present application provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory to implement the model training method in the first aspect, or implement the image processing method in the second aspect, or implement the image registration method in the third aspect.
An eighth aspect of the present application provides a computer-readable storage medium, on which program instructions are stored, which program instructions, when executed by a processor, implement the model training method in the first aspect described above, or implement the image processing method in the second aspect described above, or implement the image registration method in the third aspect described above.
According to the above scheme, through a disentanglement model trained on one class of training images belonging to the same first domain as the sample image and another class belonging to the same second domain as the target image, the sample image with annotation information and the target image can be mapped to content feature data belonging to the same domain, so that a network model trained on the content feature data and annotation information of the sample image is also applicable to the target image without annotating the target image; the annotation cost of images in different domains can thus be reduced, and the precision of image processing improved.
Drawings
FIG. 1 is a schematic flow chart diagram of an embodiment of a model training method of the present application;
FIG. 2 is a schematic diagram of an embodiment of a method of training a disentanglement model;
FIG. 3 is a schematic diagram of another embodiment of a method of training a disentanglement model;
FIG. 4 is a schematic flowchart of an embodiment of an image processing method of the present application;
FIG. 5 is a schematic flow chart diagram illustrating an embodiment of an image registration method of the present application;
FIG. 6 is a process diagram of an embodiment of image registration;
FIG. 7 is a block diagram of an embodiment of the model training apparatus of the present application;
FIG. 8 is a block diagram of an embodiment of an image processing apparatus according to the present application;
FIG. 9 is a block diagram of an embodiment of an image registration apparatus according to the present application;
FIG. 10 is a block diagram of an embodiment of an electronic device of the present application;
FIG. 11 is a block diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation rather than limitation, specific details such as particular system structures, interfaces and techniques are set forth in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects, meaning that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship. Further, the term "plurality" herein means two or more.
Referring to fig. 1, fig. 1 is a schematic flow chart of an embodiment of a model training method according to the present application. Specifically, the method may include the steps of:
step S11: at least one sample image is acquired.
In this embodiment, the sample image has annotation information. For example, for training an image segmentation network model, the annotation information may be contour information of a target object in the training image; for training an image classification network model, the annotation information may be category information corresponding to a target object in the training image; or, for training a target detection network model, the annotation information may be position information of a target object in the training image. The annotation information may be set according to the application scenario and is not limited herein.
In one implementation scenario, in order to improve the accuracy of the trained network model, the number of sample images may be multiple, for example, 500, 700, 900, and so on, without limitation.
In an implementation scenario, in order to enable the trained network model to be applied to the field of medical images, the sample image may be a medical image, such as a CT image, an MR image, and the like, and the labeling information may be labeling information of a biological organ, such as a pancreas, a kidney, and the like, labeling information of an organ tissue, such as a ligament and a cartilage tissue, or labeling information of an organ lesion tissue, such as a hematoma region and an ulcer region, which is not illustrated herein.
In an implementation scenario, in order to enable the trained network model to perform corresponding processing on a three-dimensional image, the sample image may also be a three-dimensional image, such as a three-dimensional nuclear magnetic resonance image, or the three-dimensional image may also be a three-dimensional image obtained by performing three-dimensional reconstruction on a CT image and a B-mode ultrasound image, which is not limited herein.
Step S12: content extraction is performed on the at least one sample image using the disentanglement model to obtain content feature data of the at least one sample image.
In this embodiment, the disentanglement model is obtained by training two types of training images belonging to different domains, wherein one type of training image and the sample image belong to a first domain, and the other type of training image and the target image belong to a second domain. The image data of images of different domains obey different data distributions. For example, CT images and MR images acquired by different kinds of medical image detection apparatuses belong to different domains; or, the image taken during the daytime and the image taken at night belong to different domains; or the image taken by the camera and the manually drawn image belong to different domains; alternatively, the landscape shot in summer and the landscape shot in winter belong to different domains, which are not exemplified here.
Referring to fig. 2, fig. 2 is a schematic diagram of an embodiment of a method for training a disentanglement model. As shown in FIG. 2, a first class of training images belonging to a first domain is acquired as image x1, and a second class of training images belonging to a second domain is acquired as image x2. The original disentanglement model is used to perform content and style extraction on the first class of training images x1 and the second class of training images x2, respectively, to obtain content feature data c1 and style feature data s1 of the first class of training images x1, and content feature data c2 and style feature data s2 of the second class of training images x2. Reconstruction is then performed using the content feature data c1 and style feature data s1 of the first class of training images x1 and the content feature data c2 and style feature data s2 of the second class of training images x2 to obtain reconstructed images. Referring to fig. 2 and 3 in conjunction, fig. 3 is a schematic diagram of another embodiment of the training method of the disentanglement model. As shown in FIG. 2, the content feature data c1 and style feature data s1 of the first class of training images x1 are used to reconstruct a first intra-domain reconstructed image x̂1, and the content feature data c2 and style feature data s2 of the second class of training images x2 are used to reconstruct a second intra-domain reconstructed image x̂2. As shown in FIG. 3, the style feature data s1 of the first class of training images x1 and the content feature data c2 of the second class of training images x2 are used to reconstruct a first cross-domain reconstructed image x2→1, and the style feature data s2 of the second class of training images x2 and the content feature data c1 of the first class of training images x1 are used to reconstruct a second cross-domain reconstructed image x1→2. Through the intra-domain reconstruction and the cross-domain reconstruction, the intra-domain reconstructed images x̂1 and x̂2 and the cross-domain reconstructed images x2→1 and x1→2 are obtained. On this basis, the loss value of the disentanglement model is obtained from the reconstructed images, so that the parameters of the disentanglement model can be adjusted based on the loss value.
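To make the reconstruction flow concrete, the following is a minimal sketch (the patent names no framework; the per-domain encoder and decoder modules enc_c1, enc_s1, dec1, etc. are hypothetical names, one encoder/decoder pair per domain being an assumption):

```python
# Sketch of the intra-domain and cross-domain reconstructions of FIGS. 2 and 3.
def reconstruct_all(enc_c1, enc_s1, enc_c2, enc_s2, dec1, dec2, x1, x2):
    c1, s1 = enc_c1(x1), enc_s1(x1)   # content/style codes of x1 (first domain)
    c2, s2 = enc_c2(x2), enc_s2(x2)   # content/style codes of x2 (second domain)
    x1_rec  = dec1(c1, s1)            # first intra-domain reconstruction  x̂1
    x2_rec  = dec2(c2, s2)            # second intra-domain reconstruction x̂2
    x2_to_1 = dec1(c2, s1)            # first cross-domain reconstruction  x(2→1)
    x1_to_2 = dec2(c1, s2)            # second cross-domain reconstruction x(1→2)
    return (c1, s1, c2, s2), (x1_rec, x2_rec, x2_to_1, x1_to_2)
```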
Specifically, with continued reference to FIGS. 2 and 3, a first intra-domain loss value may be obtained based on the difference between the first class of training images x1 and the first intra-domain reconstructed image x̂1, and a second intra-domain loss value may be obtained based on the difference between the second class of training images x2 and the second intra-domain reconstructed image x̂2. A first content loss value may be obtained based on the difference between the content feature data c1 of the first class of training images x1 and the content feature data ĉ1 of the second cross-domain reconstructed image x1→2, where ĉ1 is obtained by decoding the second cross-domain reconstructed image x1→2. A first style loss value may be obtained based on the difference between the style feature data s1 of the first class of training images x1 and the style feature data ŝ1 of the first cross-domain reconstructed image x2→1, where ŝ1 is obtained by decoding the first cross-domain reconstructed image x2→1. Similarly, a second content loss value may be obtained based on the difference between the content feature data c2 of the second class of training images x2 and the content feature data ĉ2 of the first cross-domain reconstructed image x2→1, where ĉ2 is obtained by decoding the first cross-domain reconstructed image x2→1; and a second style loss value may be obtained based on the difference between the style feature data s2 of the second class of training images x2 and the style feature data ŝ2 of the second cross-domain reconstructed image x1→2, where ŝ2 is obtained by decoding the second cross-domain reconstructed image x1→2. A first cross-domain loss value may be obtained based on the difference between the first class of training images x1 and the first cross-domain reconstructed image x2→1, and a second cross-domain loss value may be obtained based on the difference between the second class of training images x2 and the second cross-domain reconstructed image x1→2. The obtained loss values are then weighted to obtain the loss value of the disentanglement model. In particular, the first intra-domain loss value and the second intra-domain loss value may share the same weight (e.g., both λ1), the first style loss value and the second style loss value may share the same weight (e.g., both λ2), the first content loss value and the second content loss value may share the same weight (e.g., both λ3), and the first cross-domain loss value and the second cross-domain loss value may share the same weight (e.g., both 1). In a specific implementation scenario, λ1, λ2 and λ3 may be set according to the actual situation and are not limited herein.
In the actual training process, the parameters of the disentanglement model can be adjusted based on the obtained loss value, and the above steps of feature extraction, image reconstruction and loss calculation are repeated so that the parameters are adjusted continuously. After multiple such cycles of iteration and parameter adjustment, the content feature data c1 of the first class of training images x1 and the content feature data c2 of the second class of training images x2 extracted by the disentanglement model eventually obey the same data distribution, at which point the training of the disentanglement model can be considered complete.
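A sketch of how the weighted loss could be computed follows (a minimal PyTorch-style sketch: the L1 "difference" and the default weights are illustrative assumptions only; the patent fixes neither, and in practice the cross-domain term is often adversarial):

```python
import torch.nn.functional as F

def disentanglement_loss(x1, x2, feats, recons,
                         enc_c1, enc_s1, enc_c2, enc_s2,
                         lam1=10.0, lam2=1.0, lam3=1.0):
    # feats/recons are the tuples returned by reconstruct_all() above
    c1, s1, c2, s2 = feats
    x1_rec, x2_rec, x2_to_1, x1_to_2 = recons
    # intra-domain loss values (shared weight λ1)
    l_intra = F.l1_loss(x1_rec, x1) + F.l1_loss(x2_rec, x2)
    # re-encode the cross-domain images: x(1→2) lies in the second domain and
    # x(2→1) in the first, so each is re-encoded by that domain's encoders
    c1_hat, s2_hat = enc_c2(x1_to_2), enc_s2(x1_to_2)
    c2_hat, s1_hat = enc_c1(x2_to_1), enc_s1(x2_to_1)
    l_style   = F.l1_loss(s1_hat, s1) + F.l1_loss(s2_hat, s2)   # weight λ2
    l_content = F.l1_loss(c1_hat, c1) + F.l1_loss(c2_hat, c2)   # weight λ3
    # cross-domain loss values (weight 1)
    l_cross = F.l1_loss(x2_to_1, x1) + F.l1_loss(x1_to_2, x2)
    return lam1 * l_intra + lam2 * l_style + lam3 * l_content + l_cross
```

Each training iteration would backpropagate this loss and take an optimizer step on the disentanglement model's parameters, repeating until c1 and c2 obey the same distribution.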
In a specific implementation scenario, the disentanglement model may include a content encoder and a style encoder, which are respectively configured to encode an image to obtain content feature data and style feature data. Specifically, the content encoder may include several strided convolutional layers for down-sampling the input image and several residual blocks for further processing, with instance normalization applied to all convolutional layers. The style encoder may include several strided convolutional layers for down-sampling the input image, a global average pooling layer, and a fully connected layer; since instance normalization removes the mean and variance of the original features, which carry much of the style information, instance normalization is not used in the style encoder. In addition, the disentanglement model further includes a decoder for reconstructing an image based on the content feature data and the style feature data. Specifically, the decoder processes the content feature data through a set of residual blocks and finally reconstructs the image using several up-sampling layers, where the residual blocks may introduce adaptive instance normalization, whose parameters are dynamically generated from the style feature data by a multi-layer perceptron.
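A hedged PyTorch sketch of such encoders is given below (layer counts and channel widths are assumptions; 2D convolutions stand in for the 3D case, and the AdaIN decoder with its style MLP is omitted for brevity):

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim, 3, 1, 1), nn.InstanceNorm2d(dim), nn.ReLU(True),
            nn.Conv2d(dim, dim, 3, 1, 1), nn.InstanceNorm2d(dim))
    def forward(self, x):
        return x + self.body(x)

class ContentEncoder(nn.Module):
    """Strided convolutions for down-sampling plus residual blocks,
    with instance normalization throughout, as described above."""
    def __init__(self, in_ch=1, dim=64, n_down=2, n_res=4):
        super().__init__()
        layers = [nn.Conv2d(in_ch, dim, 7, 1, 3),
                  nn.InstanceNorm2d(dim), nn.ReLU(True)]
        for _ in range(n_down):
            layers += [nn.Conv2d(dim, dim * 2, 4, 2, 1),   # strided down-sampling
                       nn.InstanceNorm2d(dim * 2), nn.ReLU(True)]
            dim *= 2
        layers += [ResBlock(dim) for _ in range(n_res)]
        self.net = nn.Sequential(*layers)
    def forward(self, x):
        return self.net(x)

class StyleEncoder(nn.Module):
    """Strided convolutions, global average pooling and a fully connected
    layer; no instance normalization, so style statistics are preserved."""
    def __init__(self, in_ch=1, dim=64, n_down=4, style_dim=8):
        super().__init__()
        layers = [nn.Conv2d(in_ch, dim, 7, 1, 3), nn.ReLU(True)]
        for _ in range(n_down):
            out = min(dim * 2, 256)
            layers += [nn.Conv2d(dim, out, 4, 2, 1), nn.ReLU(True)]
            dim = out
        layers += [nn.AdaptiveAvgPool2d(1)]            # global average pooling
        self.net = nn.Sequential(*layers)
        self.fc = nn.Linear(dim, style_dim)            # fully connected layer
    def forward(self, x):
        return self.fc(self.net(x).flatten(1))
```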
In this embodiment, the content feature data may include data describing the content of the image, for example, data describing the contours of people, buildings, landscapes and the like in the image. The style feature data may include data describing the colors, artistic style and the like of the image; for example, it may include data describing the color scheme (e.g., warm tones, cool tones) of the people, buildings, mountains and the like in the image, or data describing the artistic style (e.g., realistic style, impressionist style) of the people, buildings, mountains and the like in the image, which is not exhaustively illustrated here.
Step S13: a preset network model is trained using the content feature data and annotation information of the at least one sample image to obtain a network model for processing a target image.
In one implementation scenario, in order to enable the trained network model to be applied in the field of medical images, the sample image and the target image may be medical images, such as CT images, MR images, and the like. Further, the labeling information may be labeling information on a biological organ.
In an implementation scenario, in order to enable the trained network model to perform corresponding processing on a three-dimensional image, the sample image and the target image are three-dimensional images, and the annotation information is annotation information for the three-dimensional target. The three-dimensional image may be a three-dimensional nuclear magnetic resonance image, or the three-dimensional image may be a three-dimensional image obtained by three-dimensional reconstruction of a CT image and a B-mode ultrasound image, which is not limited herein.
In one implementation scenario, in order to adapt to different application scenarios, the preset network model may be set to any one of the following according to the actual situation: an image segmentation network model, an image classification network model, or a target detection network model. For example, for image segmentation, the image segmentation network model may include, but is not limited to: FCN (Fully Convolutional Network), SegNet, ENet, etc.; for image classification, the image classification network model may include, but is not limited to: VGG networks, Inception networks, ResNet networks, etc.; for target detection, the target detection network model may include, but is not limited to: a YOLO (You Only Look Once) network, an SSD (Single Shot MultiBox Detector) network, and the like, which are not limited herein. By training the preset network model, a network model for segmenting images, a network model for classifying images or a network model for detecting targets can be obtained, so that the trained network model is simultaneously applicable to images belonging to the first domain and to the second domain.
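As an illustrative sketch of this training step (assuming the hypothetical content encoder enc_c1 from the sketches above, kept frozen, feeding a segmentation model; the cross-entropy objective is an assumption, not the patent's prescription):

```python
import torch
import torch.nn.functional as F

def train_step(enc_c1, seg_model, optimizer, sample_img, label_mask):
    with torch.no_grad():                       # disentanglement model stays fixed
        content = enc_c1(sample_img)            # content feature data of the sample
    logits = seg_model(content)                 # preset network model output
    loss = F.cross_entropy(logits, label_mask)  # supervised by annotation info
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```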
In a specific implementation scenario, the sample image is a brain CT image and the target image is a brain MR image. The disentanglement model is trained using a first class of training images belonging to the same first domain as the sample image and a second class of training images belonging to the same second domain as the target image, so the content feature data of the brain CT image is extracted using the disentanglement model, and the preset network model is then trained using the content feature data and annotation information of the brain CT image to obtain a network model for processing brain MR images. Other application scenarios can be handled similarly and are not listed one by one here.
According to the above scheme, through a disentanglement model trained on one class of training images belonging to the same first domain as the sample image and another class belonging to the same second domain as the target image, the sample image with annotation information and the target image can be mapped to content feature data belonging to the same domain, so that a network model trained on the content feature data and annotation information of the sample image is also applicable to the target image without annotating the target image; the annotation cost of images in different domains can thus be reduced, and the precision of image processing improved.
Referring to fig. 4, fig. 4 is a flowchart illustrating an embodiment of an image processing method according to the present application. Specifically, the method may include the steps of:
Step S41: content extraction is performed on the target image using the disentanglement model to obtain content feature data of the target image.
In this embodiment, the disentanglement model is obtained by training two types of training images belonging to different domains, and one type of training image and the target image belong to the same domain. The specific training process of the disentanglement model may refer to the relevant steps in the foregoing embodiments, and will not be described herein.
In a specific implementation scenario, the target image may be a brain MR image, and content extraction may be performed on the brain MR image using the disentanglement model to obtain the content feature data of the brain MR image. Other application scenarios can be handled similarly and are not listed one by one here.
Step S42: the content feature data of the target image is processed using the processing network model to obtain a processing result of the target image.
In this embodiment, the processing network model is obtained by using the steps in any of the above embodiments of the model training method, and specific reference may be made to the relevant steps in the foregoing embodiments, which are not described herein again.
In a specific implementation scenario, still taking the target image being a brain MR image as an example, the processing network model may be an image segmentation network model. The image segmentation network model is trained using the content feature data of brain CT images and their annotation information (e.g., lacunar infarction contour information), where the content feature data of the brain CT images is extracted using a disentanglement model, and the disentanglement model is trained using training images belonging to the same domain as the brain MR image and training images belonging to the same domain as the brain CT image.
In another specific implementation scenario, still taking the target image being a brain MR image as an example, the processing network model may be an image classification network model. The image classification network model is trained using the content feature data of brain CT images and the annotation information corresponding to their categories (such as lacunar infarction, hemorrhage, white matter rarefaction, brain atrophy, etc.), where the content feature data of the brain CT images is extracted using a disentanglement model trained on training images belonging to the same domain as the brain MR image and training images belonging to the same domain as the brain CT image; the specific training process may refer to the relevant steps in the foregoing embodiments. The trained image classification network model is then used to process the content feature data of the brain MR image to determine the category of the brain MR image (such as any of lacunar infarction, hemorrhage, white matter rarefaction and brain atrophy).
In another specific implementation scenario, still taking the target image being a brain MR image as an example, the processing network model may be a target detection network model. The target detection network model is trained using the content feature data of brain CT images and the annotation information (e.g., position information) of the target regions of target objects (e.g., a lacunar infarction, a hemorrhage, etc.) contained therein, where the content feature data of the brain CT images is extracted using a disentanglement model trained on training images belonging to the same domain as the brain MR image and training images belonging to the same domain as the brain CT image; the specific training process may refer to the relevant steps in the foregoing embodiments. The trained target detection network model is then used to process the content feature data of the brain MR image to obtain the target objects (a lacunar infarction, a hemorrhage, etc.) in the brain MR image.
Other application scenarios may be analogized, and are not exemplified here.
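Continuing the hypothetical names from the earlier sketches, inference on a target image could then look like the following (a sketch, assuming a second-domain content encoder enc_c2):

```python
import torch

def process_target(enc_c2, processing_model, target_img):
    # The target image belongs to the second domain, so it is encoded by the
    # second-domain content encoder; training aligned the two content spaces,
    # so the processing network trained on first-domain content still applies.
    with torch.no_grad():
        content = enc_c2(target_img)
        return processing_model(content)  # e.g. segmentation map or class logits
```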
According to the above scheme, the processing network model trained by the model training method described above is applicable both to the sample images used to train it and to the target image, so the target image does not need to be annotated; the annotation cost of images in different domains can thus be reduced, and the precision of image processing improved.
Referring to fig. 5, fig. 5 is a flowchart illustrating an embodiment of an image registration method according to the present application. Specifically, the method may include the steps of:
Step S51: a first image to be registered and a second image to be registered are acquired.
In this embodiment, the first image to be registered and the second image to be registered may belong to different domains. For example, the first image to be registered may be a CT image and the second image to be registered may be an MR image, or the first image to be registered may be a depth image obtained by scanning with a laser radar and the second image to be registered may be a visible light image obtained by shooting with a camera, which is not limited herein.
Step S52: content extraction is performed on the first image to be registered and the second image to be registered, respectively, using the disentanglement model to obtain content feature data of the first image to be registered and content feature data of the second image to be registered.
In this embodiment, the disentanglement model is trained on two classes of training images belonging to different domains, where one class of training images belongs to the same first domain as the first image to be registered, and the other class belongs to the same second domain as the second image to be registered. The specific training process of the disentanglement model may include: acquiring a first class of training images belonging to the first domain and a second class of training images belonging to the second domain; performing content and style extraction on the first class of training images and the second class of training images, respectively, using an original disentanglement model to obtain content feature data and style feature data of the first class of training images and content feature data and style feature data of the second class of training images; performing reconstruction using the content feature data and style feature data of the first class of training images and the content feature data and style feature data of the second class of training images to obtain reconstructed images; obtaining a loss value of the disentanglement model based on the reconstructed images; and adjusting parameters of the disentanglement model based on the loss value. Specifically, reference may be made to the relevant steps in the foregoing embodiments, which are not described herein again.
By performing content extraction on the first image to be registered and the second image to be registered with the trained disentanglement model, the extracted content feature data of the two images obey the same data distribution, which facilitates subsequent processing.
Please refer to fig. 6, which is a schematic process diagram of an embodiment of image registration. As shown in fig. 6, the first image to be registered is a brain CT image and the second image to be registered is a brain MR image. The disentanglement model can be trained using training images belonging to the same first domain as the brain CT image and training images belonging to the same second domain as the brain MR image, so that content extraction is performed on the brain CT image and the brain MR image with the trained disentanglement model to obtain the content feature data of the brain CT image and the content feature data of the brain MR image.
Step S53: registration parameters are determined based on the content feature data of the first image to be registered and the content feature data of the second image to be registered.
In one implementation scenario, in order to determine the registration parameters according to the actual conditions of the first image to be registered and the second image to be registered, the registration parameters may include at least one of a rigid transformation parameter and a deformation transformation parameter. When the registration parameter is a rigid transformation parameter, the image is changed only by rotation and/or translation and its shape is not changed; when the registration parameter is a deformation transformation parameter, the shape of the image is changed.
With continuing reference to fig. 6, the registration parameters of the brain CT image and the brain MR image are obtained based on the content feature data of the brain CT image and the content feature data of the brain MR image.
In a specific implementation scenario, in the image registration process, one of the first image to be registered and the second image to be registered may be set as a moving image, and the other one may be set as a fixed image. For example, if the first image to be registered is set as a moving image, the registration parameter corresponding to the first image to be registered may be determined based on the content feature data of the first image to be registered and the content feature data of the second image to be registered.
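One way such a registration-parameter predictor could look is sketched below (an assumption: a learned head that regresses a rigid 2D transform from the concatenated content feature maps; the patent does not specify this architecture):

```python
import torch
import torch.nn as nn

class RegistrationHead(nn.Module):
    """Hypothetical module mapping the pair of content feature maps
    (moving, fixed) to rigid registration parameters."""
    def __init__(self, feat_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * feat_ch, 64, 3, 2, 1), nn.ReLU(True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 3))                    # (theta, tx, ty)
    def forward(self, c_moving, c_fixed):
        return self.net(torch.cat([c_moving, c_fixed], dim=1))
```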
Step S54: the first image to be registered and the second image to be registered are registered using the registration parameters.
In a specific implementation scenario, with the first image to be registered set as the moving image as described above, the registration parameters corresponding to the first image to be registered are determined based on the first image to be registered and the second image to be registered; the first image to be registered may then be transformed (e.g., by rigid transformation or deformation) using the registration parameters to obtain a transformed image corresponding to the first image to be registered, and the pixel points of the transformed image are made to coincide with the corresponding pixel points of the second image to be registered, thereby achieving image registration.
With reference to fig. 6, the brain MR image is taken as the moving image and the brain CT image as the fixed image, so the registration parameters of the brain MR image are determined based on the content feature data of the brain MR image and the content feature data of the brain CT image; the brain MR image is then transformed using these registration parameters, for example by rigid transformation, to obtain a transformed image whose pixel points coincide with the corresponding pixel points of the brain CT image, thereby completing the image registration.
In an implementation scenario, in order to further improve the accuracy of image registration, the registration parameters of the first image to be registered and the second image to be registered may be determined, and the first image to be registered transformed with these parameters to obtain a corresponding transformed image; the difference between the transformed image and the second image to be registered is then measured, and the registration parameters are fine-tuned based on the measured difference. The fine-tuned registration parameters are used to transform the first image to be registered again, the difference between the resulting transformed image and the second image to be registered is measured again, and the registration parameters are fine-tuned once more. This step of transforming the first image to be registered with the fine-tuned registration parameters is repeated until the difference between the transformed image and the second image to be registered falls within an allowable range.
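A sketch of this fine-tuning loop follows, assuming differentiable warp(img, params) and diff(a, b) helper functions and gradient descent as the fine-tuning rule (the patent leaves the rule unspecified):

```python
import torch

def refine_registration(moving, fixed, params, warp, diff,
                        step=0.1, tol=1e-3, max_iter=100):
    params = params.clone().requires_grad_(True)
    for _ in range(max_iter):
        warped = warp(moving, params)       # transformed image
        d = diff(warped, fixed)             # measured difference
        if d.item() < tol:                  # within the allowable range
            break
        d.backward()                        # fine-tune the registration params
        with torch.no_grad():
            params -= step * params.grad
        params.grad = None
    return params.detach()
```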
According to the above scheme, because the disentanglement model is trained with training images that belong, respectively, to the same first domain as the first image to be registered and the same second domain as the second image to be registered, the extracted content feature data of the two images to be registered belong to the same domain. The registration parameters of the first image to be registered and the second image to be registered can therefore be obtained directly from their content feature data, so that unsupervised registration can be achieved between images of different domains, the cost of cross-domain image registration can be reduced, and its accuracy can be improved.
Referring to fig. 7, fig. 7 is a block diagram of an embodiment of a model training device 70 according to the present application. The model training device 70 includes an image acquisition module 71, a content extraction module 72 and a model training module 73. The image acquisition module 71 is used for acquiring at least one sample image, the sample image having annotation information; the content extraction module 72 is configured to perform content extraction on the at least one sample image, respectively, using the disentanglement model to obtain content feature data of the at least one sample image; the model training module 73 is configured to train a preset network model using the content feature data and annotation information of the at least one sample image to obtain a network model for processing a target image. The disentanglement model is obtained by training with two classes of training images belonging to different domains, wherein one class of training images and the sample image belong to a first domain, and the other class of training images and the target image belong to a second domain.
According to the above scheme, the disentanglement model is trained with training images belonging to the same first domain as the sample image and training images belonging to the same second domain as the target image, so the sample image carrying annotation information and the target image can both be mapped to content feature data belonging to the same domain. A network model trained on the content feature data and annotation information of the sample images is therefore also applicable to the target image without annotating the target image, which reduces the annotation cost of images in different domains and improves the accuracy of subsequent processing.
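The module wiring above admits a compact sketch: a frozen content encoder from the disentanglement model feeds a task head trained on the annotated sample images. Everything below (the module stand-ins, a toy segmentation head, cross-entropy loss) is an assumed illustration, not the design mandated by the embodiment:

import torch
import torch.nn as nn

# Hypothetical stand-ins: a trained (frozen) content encoder from the
# disentanglement model and a preset network model (a toy segmentation head).
content_encoder = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
seg_head = nn.Conv2d(16, 2, 1)                 # 2 classes, e.g. organ / background

content_encoder.requires_grad_(False)          # disentanglement model stays fixed
optimizer = torch.optim.Adam(seg_head.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(sample_image, annotation):
    """One training step of the preset network model on content features."""
    with torch.no_grad():
        content = content_encoder(sample_image)    # content extraction (module 72)
    logits = seg_head(content)                     # preset network model (module 73)
    loss = criterion(logits, annotation)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage with dummy data: 1-channel sample images and pixel-wise label maps.
loss = train_step(torch.rand(4, 1, 64, 64), torch.randint(0, 2, (4, 64, 64)))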
In some embodiments, the sample image and the target image are medical images, and the annotation information is annotation information of a biological organ; and/or the sample image and the target image are three-dimensional images, and the annotation information is annotation information of a three-dimensional target.
Different from the foregoing embodiment, by setting the sample image and the target image as medical images and the annotation information as annotation information of a biological organ, the method can be applied to the field of medical images; by setting the sample image and the target image as three-dimensional images and the annotation information as annotation information of a three-dimensional target, the method can be applied to the processing of three-dimensional images.
In some embodiments, the preset network model comprises any one of: an image segmentation network model, an image classification network model, and a target detection network model.
Different from the foregoing embodiment, by setting the preset network model as any one of the image segmentation network model, the image classification network model, and the target detection network model, the network models suitable for different image processing tasks can be obtained by training based on different kinds of preset network models, so that different processing tasks on images in different domains can be completed.
In some embodiments, the model training device 70 further includes: a training image acquisition module for acquiring a first class of training images belonging to the first domain and a second class of training images belonging to the second domain; an image extraction module for performing content and style extraction on the first class of training images and the second class of training images using the original disentanglement model to obtain content feature data and style feature data of the first class of training images and content feature data and style feature data of the second class of training images; an image reconstruction module for reconstructing images using the content feature data and style feature data of the first class of training images and the content feature data and style feature data of the second class of training images; a loss calculation module for obtaining a loss value of the disentanglement model based on the reconstructed images; and a parameter adjustment module for adjusting the parameters of the disentanglement model based on the loss value.
Different from the foregoing embodiment, feature extraction can be performed on the first class of training images belonging to the first domain and the second class of training images belonging to the second domain using the original disentanglement model, images are reconstructed based on the extracted content feature data and style feature data, a loss value of the disentanglement model is obtained based on the reconstructed images, and the parameters of the disentanglement model are then adjusted based on the loss value, so that the disentanglement model can be trained.
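As a hedged sketch of this train-by-reconstruction idea (the encoder/decoder shapes and the shared style bottleneck below are assumptions, not the specified architecture), content and style codes are extracted per domain, recombined, and decoded:

import torch
import torch.nn as nn

class Disentangler(nn.Module):
    """Toy disentanglement model; a real system would typically use one
    content encoder and one style encoder per domain plus a decoder."""
    def __init__(self):
        super().__init__()
        self.content_enc = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
        self.style_enc = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(1, 8, 1))
        self.decoder = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, x):
        return self.content_enc(x), self.style_enc(x)

    def reconstruct(self, content, style):
        # Broadcast the style code over spatial positions and decode.
        style = style.expand(-1, -1, content.shape[2], content.shape[3])
        return self.decoder(torch.cat([content, style], dim=1))

model = Disentangler()
x_a = torch.rand(2, 1, 64, 64)               # first class (first domain)
x_b = torch.rand(2, 1, 64, 64)               # second class (second domain)
c_a, s_a = model(x_a)
c_b, s_b = model(x_b)
recon_a = model.reconstruct(c_a, s_a)        # intra-domain reconstruction
cross_ab = model.reconstruct(c_b, s_a)       # cross-domain: style of A, content of B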
In some embodiments, the image reconstruction module includes an intra-domain reconstruction submodule, configured to reconstruct a first intra-domain reconstructed image from the content feature data and style feature data of the first class of training images, and to reconstruct a second intra-domain reconstructed image from the content feature data and style feature data of the second class of training images; and a cross-domain reconstruction submodule, configured to reconstruct a first cross-domain reconstructed image from the style feature data of the first class of training images and the content feature data of the second class of training images, and to reconstruct a second cross-domain reconstructed image from the style feature data of the second class of training images and the content feature data of the first class of training images. The loss calculation module includes: an intra-domain loss calculation submodule, for obtaining a first intra-domain loss value based on the difference between the first class of training images and the first intra-domain reconstructed image, and a second intra-domain loss value based on the difference between the second class of training images and the second intra-domain reconstructed image; a content loss calculation submodule, for obtaining a first content loss value based on the difference between the content feature data of the first class of training images and the content feature data of the second cross-domain reconstructed image, and a second content loss value based on the difference between the content feature data of the second class of training images and the content feature data of the first cross-domain reconstructed image; a style loss calculation submodule, for obtaining a first style loss value based on the difference between the style feature data of the first class of training images and the style feature data of the first cross-domain reconstructed image, and a second style loss value based on the difference between the style feature data of the second class of training images and the style feature data of the second cross-domain reconstructed image; a cross-domain loss calculation submodule, for obtaining a first cross-domain loss value based on the difference between the first class of training images and the first cross-domain reconstructed image, and a second cross-domain loss value based on the difference between the second class of training images and the second cross-domain reconstructed image; and a loss weighting submodule, for weighting the obtained loss values to obtain the loss value of the disentanglement model.
Different from the foregoing embodiment, weighting can be performed on the intra-domain loss values, the cross-domain loss values, and the loss values of the corresponding content feature data and style feature data after image reconstruction, so as to obtain the loss value of the disentanglement model; the parameters are then adjusted based on this total loss value, so that the content feature data extracted by the disentanglement model from images of different domains can lie in the same domain.
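Continuing the toy model above, a hedged sketch of the loss weighting; the L1 difference, the specific weights, and the choice to show one loss of each family are all illustrative assumptions:

import torch.nn.functional as F

# Reusing the tensors from the sketch above: re-encode the cross-domain
# reconstruction so its content and style can be compared with the sources.
c_ab, s_ab = model(cross_ab)                 # cross_ab used style s_a and content c_b
loss_intra = F.l1_loss(recon_a, x_a)         # an intra-domain loss value
loss_content = F.l1_loss(c_ab, c_b)          # content loss: re-encoded vs. source content
loss_style = F.l1_loss(s_ab, s_a)            # style loss: re-encoded vs. source style
loss_cross = F.l1_loss(cross_ab, x_a)        # a cross-domain loss value

# Weighted total loss of the disentanglement model (the weights are assumed).
total = 1.0 * loss_intra + 0.5 * loss_content + 0.5 * loss_style + 1.0 * loss_cross
total.backward()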
Referring to fig. 8, fig. 8 is a schematic diagram of an embodiment of an image processing device 80 according to the present application. The image processing device 80 includes a content extraction module 81 and a network processing module 82. The content extraction module 81 is used for performing content extraction on the target image using the disentanglement model to obtain content feature data of the target image; the network processing module 82 is configured to process the content feature data of the target image using the processing network model to obtain a processing result for the target image. The disentanglement model is obtained by training with two classes of training images belonging to different domains, wherein one class of training images and the target image belong to the same domain; the processing network model is obtained by the model training device of any of the above model training device embodiments.
According to the above scheme, the processing network model obtained by the model training device of any of the above embodiments is applicable not only to the sample images used to train it but also to the target image, so the target image no longer needs to be annotated; this further reduces the annotation cost of images in different domains and improves the accuracy of image processing.
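A short usage sketch of this two-stage inference, reusing the hypothetical content_encoder and seg_head from the training sketch earlier; the argmax decoding is likewise an assumption:

import torch

def process_target_image(target_image: torch.Tensor) -> torch.Tensor:
    """Run the two-stage pipeline: content extraction with the disentanglement
    model (module 81), then the processing network model (module 82)."""
    with torch.no_grad():
        content = content_encoder(target_image)   # content feature data
        logits = seg_head(content)                # processing result (here: logits)
    return logits.argmax(dim=1)                   # e.g. per-pixel class labels

result = process_target_image(torch.rand(1, 1, 64, 64))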
Referring to fig. 9, fig. 9 is a schematic diagram of an embodiment of an image registration device 90 according to the present application. The image registration device 90 includes an image acquisition module 91, a content extraction module 92, a parameter determination module 93 and an image registration module 94. The image acquisition module 91 is used for acquiring a first image to be registered and a second image to be registered; the content extraction module 92 is configured to perform content extraction on the first image to be registered and the second image to be registered, respectively, using the disentanglement model to obtain content feature data of the first image to be registered and content feature data of the second image to be registered; the parameter determination module 93 is configured to determine registration parameters based on the content feature data of the first image to be registered and the content feature data of the second image to be registered; the image registration module 94 is configured to register the first image to be registered and the second image to be registered using the registration parameters. The disentanglement model is obtained by training with two classes of training images belonging to different domains, wherein one class of training images and the first image to be registered belong to a first domain, and the other class of training images and the second image to be registered belong to a second domain.
According to the above scheme, because the disentanglement model is trained with training images that belong, respectively, to the same first domain as the first image to be registered and the same second domain as the second image to be registered, the extracted content feature data of the two images to be registered belong to the same domain. The registration parameters of the first image to be registered and the second image to be registered can therefore be obtained directly from their content feature data, so that unsupervised registration can be achieved between images of different domains, the cost of cross-domain image registration can be reduced, and its accuracy can be improved.
In some embodiments, the first image to be registered and the second image to be registered are medical images; and/or the first image to be registered and the second image to be registered are three-dimensional images; and/or the registration parameters include at least one of a rigid transformation parameter and a deformation transformation parameter.
In contrast to the foregoing embodiment, setting the first image to be registered and the second image to be registered as medical images allows the image registration to be applied to the registration of medical images; setting them as three-dimensional images allows it to be applied to the registration of three-dimensional images; and having the registration parameters include at least one of a rigid transformation parameter and a deformation transformation parameter allows the first image to be registered and the second image to be registered to undergo the transformation appropriate to their actual conditions before registration, which can improve the accuracy of image registration.
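As an illustrative complement to the rigid example earlier, the following sketch applies a deformation transformation parameter in the form of a dense displacement field; the normalized-coordinate convention and function names are assumptions, not part of the embodiment:

import torch
import torch.nn.functional as F

def apply_deformation(moving: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp with a dense deformation field, the deformable counterpart of the
    rigid warp sketched earlier; flow holds per-pixel displacements in the
    normalized [-1, 1] coordinate convention (an assumption)."""
    n, _, h, w = moving.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    grid = base + flow.permute(0, 2, 3, 1)     # identity grid plus displacement
    return F.grid_sample(moving, grid, align_corners=True)

# Usage: a zero deformation field leaves the image unchanged.
warped = apply_deformation(torch.rand(1, 1, 64, 64), torch.zeros(1, 2, 64, 64))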
In some embodiments, the parameter determination module 93 is specifically configured to determine the registration parameters corresponding to the first image to be registered based on the content feature data of the first image to be registered and the content feature data of the second image to be registered; the image registration module 94 includes a transformation processing submodule configured to perform transformation processing on the first image to be registered using the registration parameters to obtain a transformed image corresponding to the first image to be registered, and a pixel superposition submodule configured to superpose the pixel points of the transformed image on the corresponding pixel points of the second image to be registered.
Different from the foregoing embodiment, the registration parameters corresponding to the first image to be registered can be used to perform transformation processing on the first image to be registered to obtain a corresponding transformed image, and the pixel points of the transformed image are then superposed on the corresponding pixel points of the second image to be registered, so that unsupervised registration between images in different domains can be realized.
In some embodiments, the image registration device 90 further includes: a training image acquisition module for acquiring a first class of training images belonging to the first domain and a second class of training images belonging to the second domain; an image extraction module for performing content and style extraction on the first class of training images and the second class of training images using the original disentanglement model to obtain content feature data and style feature data of the first class of training images and content feature data and style feature data of the second class of training images; an image reconstruction module for reconstructing images using the content feature data and style feature data of the first class of training images and the content feature data and style feature data of the second class of training images; a loss calculation module for obtaining a loss value of the disentanglement model based on the reconstructed images; and a parameter adjustment module for adjusting the parameters of the disentanglement model based on the loss value.
Different from the foregoing embodiment, feature extraction can be performed on the first class of training images belonging to the first domain and the second class of training images belonging to the second domain using the original disentanglement model, images are reconstructed based on the extracted content feature data and style feature data, a loss value of the disentanglement model is obtained based on the reconstructed images, and the parameters of the disentanglement model are then adjusted based on the loss value, so that the disentanglement model can be trained.
Referring to fig. 10, fig. 10 is a schematic block diagram of an embodiment of an electronic device 100 according to the present application. The electronic device 100 includes a memory 101 and a processor 102 coupled to each other, and the processor 102 is configured to execute program instructions stored in the memory 101 to implement the steps of any of the above model training method embodiments, or the steps of any of the above image processing method embodiments, or the steps of any of the above image registration method embodiments. In this embodiment, the electronic device 100 may include mobile devices such as a notebook computer, a tablet computer and a smartphone, and may also include terminal devices such as a microcomputer and a server, which is not limited herein.
Specifically, the processor 102 is configured to control itself and the memory 101 to implement the steps of any of the above model training method embodiments, or the steps of any of the above image processing method embodiments, or the steps of any of the above image registration method embodiments. The processor 102 may also be referred to as a CPU (Central Processing Unit). The processor 102 may be an integrated circuit chip having signal processing capability. The processor 102 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. In addition, the processor 102 may be jointly implemented by a plurality of integrated circuit chips.
According to the scheme, the labeling cost of the images in different domains can be reduced, and the image processing precision is improved.
Referring to fig. 11, fig. 11 is a block diagram illustrating an embodiment of a computer-readable storage medium 110 according to the present application. The computer readable storage medium 110 stores program instructions 111 executable by a processor, the program instructions 111 for implementing the steps in any of the above-described embodiments of the model training method, or implementing the steps in any of the above-described embodiments of the image processing method, or implementing the steps in any of the above-described embodiments of the image registration method.
According to the scheme, the labeling cost of the images in different domains can be reduced, and the image processing precision is improved.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is merely a logical division, and an actual implementation may use another division; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be electrical, mechanical or in other forms.
Units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

Claims (15)

1. A method of model training, comprising:
acquiring at least one sample image, wherein the sample image has annotation information;
performing content extraction on the at least one sample image respectively by using a disentanglement model to obtain content feature data of the at least one sample image;
training a preset network model by using the content feature data of the at least one sample image and the annotation information to obtain a network model for processing a target image;
wherein the disentanglement model is obtained by training with two types of training images belonging to different domains, one type of training images and the sample image belonging to a first domain, and the other type of training images and the target image belonging to a second domain.
2. The model training method according to claim 1, wherein the sample image and the target image are medical images, and the annotation information is annotation information of a biological organ;
and/or the sample image and the target image are three-dimensional images, and the annotation information is annotation information of a three-dimensional target.
3. The model training method according to claim 1, wherein the preset network model comprises any one of: an image segmentation network model, an image classification network model, and a target detection network model.
4. The model training method according to claim 1, further comprising the following steps of training to obtain the disentanglement model:
acquiring a first class of training images belonging to the first domain and a second class of training images belonging to the second domain;
performing content and style extraction on the first class of training images and the second class of training images respectively by using an original disentanglement model to obtain content feature data and style feature data of the first class of training images and content feature data and style feature data of the second class of training images;
reconstructing by using the content feature data and the style feature data of the first class of training images and the content feature data and the style feature data of the second class of training images to obtain reconstructed images;
obtaining a loss value of the disentanglement model based on the reconstructed images;
adjusting parameters of the disentanglement model based on the loss value.
5. The model training method according to claim 4, wherein the reconstructing by using the content feature data and the style feature data of the first class of training images and the content feature data and the style feature data of the second class of training images to obtain the reconstructed images comprises:
reconstructing by using the content feature data and the style feature data of the first class of training images to obtain a first intra-domain reconstructed image, and reconstructing by using the content feature data and the style feature data of the second class of training images to obtain a second intra-domain reconstructed image; and
reconstructing by using the style feature data of the first class of training images and the content feature data of the second class of training images to obtain a first cross-domain reconstructed image, and reconstructing by using the style feature data of the second class of training images and the content feature data of the first class of training images to obtain a second cross-domain reconstructed image;
the obtaining a loss value of the disentanglement model based on the reconstructed images comprises:
obtaining a first intra-domain loss value based on the difference between the first class of training images and the first intra-domain reconstructed image, and obtaining a second intra-domain loss value based on the difference between the second class of training images and the second intra-domain reconstructed image;
obtaining a first content loss value based on the difference between the content feature data of the first class of training images and the content feature data of the second cross-domain reconstructed image, obtaining a first style loss value based on the difference between the style feature data of the first class of training images and the style feature data of the first cross-domain reconstructed image, obtaining a second content loss value based on the difference between the content feature data of the second class of training images and the content feature data of the first cross-domain reconstructed image, and obtaining a second style loss value based on the difference between the style feature data of the second class of training images and the style feature data of the second cross-domain reconstructed image;
obtaining a first cross-domain loss value based on the difference between the first class of training images and the first cross-domain reconstructed image, and obtaining a second cross-domain loss value based on the difference between the second class of training images and the second cross-domain reconstructed image; and
weighting the obtained loss values to obtain the loss value of the disentanglement model.
6. An image processing method, comprising:
performing content extraction on a target image by using a disentanglement model to obtain content feature data of the target image;
processing the content feature data of the target image by using a processing network model to obtain a processing result of the target image;
wherein the disentanglement model is obtained by training with two types of training images belonging to different domains, and the training images of one type and the target image belong to the same domain; and the processing network model is trained by the method of any one of claims 1 to 5.
7. An image registration method, comprising:
acquiring a first image to be registered and a second image to be registered;
performing content extraction on the first image to be registered and the second image to be registered respectively by using a disentanglement model to obtain content feature data of the first image to be registered and content feature data of the second image to be registered;
determining registration parameters based on the content feature data of the first image to be registered and the content feature data of the second image to be registered;
registering the first image to be registered and the second image to be registered by using the registration parameters;
wherein the disentanglement model is obtained by training with two types of training images belonging to different domains, one type of the training images and the first image to be registered belonging to a first domain, and the other type of the training images and the second image to be registered belonging to a second domain.
8. The image registration method according to claim 7, wherein the first image to be registered and the second image to be registered are medical images;
and/or the first image to be registered and the second image to be registered are three-dimensional images;
and/or the registration parameters include at least one of: a rigid transformation parameter and a deformation transformation parameter.
9. The image registration method according to claim 7, wherein the determining registration parameters based on the content feature data of the first image to be registered and the content feature data of the second image to be registered comprises:
determining the registration parameters corresponding to the first image to be registered based on the content feature data of the first image to be registered and the content feature data of the second image to be registered;
the registering the first image to be registered and the second image to be registered by using the registration parameters comprises:
performing transformation processing on the first image to be registered by using the registration parameters to obtain a transformed image corresponding to the first image to be registered;
superposing the pixel points of the transformed image on the corresponding pixel points of the second image to be registered.
10. The image registration method according to claim 7, further comprising the following steps of training to obtain the disentanglement model:
acquiring a first class of training images belonging to the first domain and a second class of training images belonging to the second domain;
performing content and style extraction on the first class of training images and the second class of training images respectively by using an original disentanglement model to obtain content feature data and style feature data of the first class of training images and content feature data and style feature data of the second class of training images;
reconstructing by using the content feature data and the style feature data of the first class of training images and the content feature data and the style feature data of the second class of training images to obtain reconstructed images;
obtaining a loss value of the disentanglement model based on the reconstructed images;
adjusting parameters of the disentanglement model based on the loss value.
11. A model training apparatus, comprising:
an image acquisition module, used for acquiring at least one sample image, wherein the sample image has annotation information;
a content extraction module, used for performing content extraction on the at least one sample image respectively by using a disentanglement model to obtain content feature data of the at least one sample image;
a model training module, used for training a preset network model by using the content feature data of the at least one sample image and the annotation information to obtain a network model for processing a target image;
wherein the disentanglement model is obtained by training with two types of training images belonging to different domains, one type of training images and the sample image belonging to a first domain, and the other type of training images and the target image belonging to a second domain.
12. An image processing apparatus characterized by comprising:
a content extraction module, used for performing content extraction on a target image by using a disentanglement model to obtain content feature data of the target image;
a network processing module, used for processing the content feature data of the target image by using a processing network model to obtain a processing result of the target image;
wherein the disentanglement model is obtained by training with two types of training images belonging to different domains, and the training images of one type and the target image belong to the same domain; and the processing network model is obtained by using the model training apparatus according to claim 11.
13. An image registration apparatus, comprising:
an image acquisition module, used for acquiring a first image to be registered and a second image to be registered;
a content extraction module, used for performing content extraction on the first image to be registered and the second image to be registered respectively by using a disentanglement model to obtain content feature data of the first image to be registered and content feature data of the second image to be registered;
a parameter determination module, used for determining registration parameters based on the content feature data of the first image to be registered and the content feature data of the second image to be registered;
an image registration module, used for registering the first image to be registered and the second image to be registered by using the registration parameters;
wherein the disentanglement model is obtained by training with two types of training images belonging to different domains, one type of the training images and the first image to be registered belonging to a first domain, and the other type of the training images and the second image to be registered belonging to a second domain.
14. An electronic device comprising a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the model training method of any one of claims 1 to 5, or to implement the image processing method of claim 6, or to implement the image registration method of any one of claims 7 to 10.
15. A computer readable storage medium storing program instructions which, when executed by a processor, implement the model training method of any one of claims 1 to 5, or implement the image processing method of claim 6, or implement the image registration method of any one of claims 7 to 10.
CN202011193221.7A 2020-10-30 2020-10-30 Model training method, image processing and registering method, and related device and equipment Withdrawn CN112348819A (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN202011193221.7A CN112348819A (en) 2020-10-30 2020-10-30 Model training method, image processing and registering method, and related device and equipment
JP2021576590A JP2023502813A (en) 2020-10-30 2021-03-04 Model training method, image processing and registration method, apparatus, device, medium
PCT/CN2021/079154 WO2022088572A1 (en) 2020-10-30 2021-03-04 Model training method, image processing and alignment method, apparatus, device, and medium
KR1020227001100A KR20220012406A (en) 2020-10-30 2021-03-04 Model training method, image processing and registration method, apparatus, apparatus, medium
TW110115205A TW202217744A (en) 2020-10-30 2021-04-27 Model training method, image processing and registration method, electronic equipment, computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011193221.7A CN112348819A (en) 2020-10-30 2020-10-30 Model training method, image processing and registering method, and related device and equipment

Publications (1)

Publication Number Publication Date
CN112348819A true CN112348819A (en) 2021-02-09

Family

ID=74356950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011193221.7A Withdrawn CN112348819A (en) 2020-10-30 2020-10-30 Model training method, image processing and registering method, and related device and equipment

Country Status (3)

Country Link
CN (1) CN112348819A (en)
TW (1) TW202217744A (en)
WO (1) WO2022088572A1 (en)

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10200618B2 (en) * 2015-03-17 2019-02-05 Disney Enterprises, Inc. Automatic device operation and object tracking based on learning of smooth predictors
CN109741379A (en) * 2018-12-19 2019-05-10 上海商汤智能科技有限公司 Image processing method, device, electronic equipment and computer readable storage medium
CN109767460A (en) * 2018-12-27 2019-05-17 上海商汤智能科技有限公司 Image processing method, device, electronic equipment and computer readable storage medium
CN111210467A (en) * 2018-12-27 2020-05-29 上海商汤智能科技有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110472737B (en) * 2019-08-15 2023-11-17 腾讯医疗健康(深圳)有限公司 Training method and device for neural network model and medical image processing system
CN110852937B (en) * 2019-10-16 2023-06-02 天津大学 Deformation object image generation method based on decoupling of content and style
CN111797891A (en) * 2020-05-21 2020-10-20 南京大学 Unpaired heterogeneous face image generation method and device based on generation countermeasure network
CN111738968A (en) * 2020-06-09 2020-10-02 北京三快在线科技有限公司 Training method and device of image generation model and image generation method and device
CN111783610B (en) * 2020-06-23 2022-03-15 西北工业大学 Cross-domain crowd counting method based on de-entangled image migration
CN112348819A (en) * 2020-10-30 2021-02-09 上海商汤智能科技有限公司 Model training method, image processing and registering method, and related device and equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022088572A1 (en) * 2020-10-30 2022-05-05 上海商汤智能科技有限公司 Model training method, image processing and alignment method, apparatus, device, and medium
CN113689419A (en) * 2021-09-03 2021-11-23 电子科技大学长三角研究院(衢州) Image segmentation processing method based on artificial intelligence
CN114373004A (en) * 2022-01-13 2022-04-19 强联智创(北京)科技有限公司 Unsupervised three-dimensional image rigid registration method based on dynamic cascade network
CN114373004B (en) * 2022-01-13 2024-04-02 强联智创(北京)科技有限公司 Dynamic image registration method

Also Published As

Publication number Publication date
TW202217744A (en) 2022-05-01
WO2022088572A1 (en) 2022-05-05

Similar Documents

Publication Publication Date Title
CN110399929B (en) Fundus image classification method, fundus image classification apparatus, and computer-readable storage medium
CN112348819A (en) Model training method, image processing and registering method, and related device and equipment
US20220198230A1 (en) Auxiliary detection method and image recognition method for rib fractures based on deep learning
CN110838125B (en) Target detection method, device, equipment and storage medium for medical image
CN116030259B (en) Abdominal CT image multi-organ segmentation method and device and terminal equipment
CN112241955B (en) Broken bone segmentation method and device for three-dimensional image, computer equipment and storage medium
CN110782424A (en) Image fusion method and device, electronic equipment and computer readable storage medium
CN110570394A (en) medical image segmentation method, device, equipment and storage medium
Guo et al. Salient object detection from low contrast images based on local contrast enhancing and non-local feature learning
Li et al. Multi-scale residual denoising GAN model for producing super-resolution CTA images
CN116664590B (en) Automatic segmentation method and device based on dynamic contrast enhancement magnetic resonance image
CN112396657A (en) Neural network-based depth pose estimation method and device and terminal equipment
CN116740081A (en) Method, device, terminal equipment and medium for segmenting pulmonary vessels in CT image
CN116309612A (en) Semiconductor silicon wafer detection method, device and medium based on frequency decoupling supervision
CN113409324B (en) Brain segmentation method fusing differential geometric information
KR20220012406A (en) Model training method, image processing and registration method, apparatus, apparatus, medium
CN113327221A (en) Image synthesis method and device fusing ROI (region of interest), electronic equipment and medium
CN110008881A (en) The recognition methods of the milk cow behavior of multiple mobile object and device
Liu et al. Dual UNet low-light image enhancement network based on attention mechanism
CN114581459A (en) Improved 3D U-Net model-based segmentation method for image region of interest of preschool child lung
CN114419375A (en) Image classification method, training method, device, electronic equipment and storage medium
CN113408595B (en) Pathological image processing method and device, electronic equipment and readable storage medium
CN116363152B (en) Image segmentation method, method and device for training image segmentation model
CN115375626B (en) Medical image segmentation method, system, medium and device based on physical resolution
CN117455935B (en) Abdominal CT (computed tomography) -based medical image fusion and organ segmentation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40037268)
WW01 Invention patent application withdrawn after publication (application publication date: 20210209)