WO2021159742A1 - Image segmentation method, device and storage medium - Google Patents

Image segmentation method, device and storage medium

Info

Publication number
WO2021159742A1
WO2021159742A1 (PCT/CN2020/124673; CN2020124673W)
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
domain
segmentation
loss
Prior art date
Application number
PCT/CN2020/124673
Other languages
English (en)
French (fr)
Inventor
柳露艳
马锴
郑冶枫
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority to EP20919226.9A priority Critical patent/EP4002271A4/en
Priority to JP2022523505A priority patent/JP7268248B2/ja
Publication of WO2021159742A1 publication Critical patent/WO2021159742A1/zh
Priority to US17/587,825 priority patent/US20220148191A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/143Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/162Segmentation; Edge detection involving graph-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/174Segmentation; Edge detection involving the use of two or more images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10056Microscopic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20072Graph-based image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • This application relates to the field of communication technology, and in particular to an image segmentation method, device and storage medium.
  • AI (Artificial Intelligence) technology has been applied to tasks such as image classification, lesion detection, and target segmentation, and in particular to medical image analysis, especially medical image segmentation.
  • AI technology can be used to segment the optic cup and optic disc from the retinal fundus image.
  • Current AI segmentation of the optic cup and optic disc is mainly based on deep learning networks.
  • A deep learning network that can separate the optic cup and optic disc is trained, and the fundus image to be segmented is then input into the trained deep learning network for feature extraction, yielding segmentation results such as a glaucoma segmentation image.
  • an image segmentation method, device, and storage medium are provided.
  • An embodiment of the present application provides an image segmentation method, including: acquiring a target domain image and a source domain image labeled with target information; using the generator network in a first generative adversarial network to segment the source domain image and the target domain image respectively, and determining a first source domain segmentation loss and a first target domain segmentation loss; using the generator network in a second generative adversarial network to segment the source domain image and the target domain image respectively, and determining a second source domain segmentation loss and a second target domain segmentation loss; determining a first source domain target image and a second source domain target image according to the first source domain segmentation loss and the second source domain segmentation loss, and determining a first target domain target image and a second target domain target image according to the first target domain segmentation loss and the second target domain segmentation loss; cross-training the first generative adversarial network and the second generative adversarial network using the first source domain target image, the first target domain target image, the second source domain target image, and the second target domain target image to obtain a trained first generative adversarial network; and segmenting an image to be segmented based on the generator network of the trained first generative adversarial network to obtain a segmentation result.
  • an image segmentation device including:
  • the first segmentation unit, configured to use the generator network in the first generative adversarial network to segment the source domain image and the target domain image respectively, and determine the first source domain segmentation loss and the first target domain segmentation loss;
  • the second segmentation unit, configured to use the generator network in the second generative adversarial network to segment the source domain image and the target domain image respectively, and determine the second source domain segmentation loss and the second target domain segmentation loss;
  • the determining unit, configured to determine the first source domain target image and the second source domain target image according to the first source domain segmentation loss and the second source domain segmentation loss, and to determine the first target domain target image and the second target domain target image according to the first target domain segmentation loss and the second target domain segmentation loss;
  • the training unit, configured to use the first source domain target image, the first target domain target image, the second source domain target image, and the second target domain target image to cross-train the first generative adversarial network and the second generative adversarial network, obtaining the trained first generative adversarial network;
  • the third segmentation unit, configured to segment the image to be segmented based on the generator network of the trained first generative adversarial network to obtain a segmentation result.
  • The embodiments of the present application also provide one or more non-volatile storage media storing computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps of any image segmentation method provided in the embodiments of the present application.
  • An embodiment of the present application also provides an electronic device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when executing the instructions, the processor implements the steps of any image segmentation method provided in the embodiments of the present application.
  • FIG. 1a is a schematic diagram of a scene of an image segmentation method provided by an embodiment of the present application.
  • FIG. 1b is a flowchart of an image segmentation method provided by an embodiment of the present application.
  • FIG. 2a is another flowchart of an image segmentation method provided by an embodiment of the present application.
  • FIG. 2b is a system framework diagram of an image segmentation method provided by an embodiment of the present application.
  • FIG. 2c is a framework diagram of a first generative adversarial network provided by an embodiment of the present application.
  • FIG. 2d is a diagram of an image segmentation result provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of an image segmentation device provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the embodiments of the present application provide an image segmentation method, device, and storage medium.
  • The image segmentation device may be integrated in an electronic device, and the electronic device may be a server, a terminal, or another device.
  • The image segmentation method provided by the embodiments of the present application relates to the computer vision direction in the field of artificial intelligence; fundus image segmentation can be realized through the computer vision technology of artificial intelligence to obtain a segmentation result.
  • Artificial Intelligence is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • artificial intelligence is a comprehensive technology of computer science, which attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a similar way to human intelligence.
  • Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields, including both hardware-level technology and software-level technology.
  • artificial intelligence software technology mainly includes computer vision technology, machine learning/deep learning and other directions.
  • Computer Vision technology is a science that studies how to make machines "see"; more specifically, it refers to machine vision that uses computers in place of human eyes to identify and measure targets.
  • Image processing turns the computer-processed image into one that is more suitable for human observation or for transmission to an instrument for inspection.
  • Computer vision research involves related theories and technologies, attempting to establish artificial intelligence systems that can obtain information from images or multi-dimensional data.
  • Computer vision technology usually includes image processing, image recognition and other technologies, as well as common facial recognition, human posture recognition and other biometric recognition technologies.
  • The so-called image segmentation refers to the computer vision technology and process of dividing an image into a number of specific regions with unique properties and extracting objects of interest.
  • it mainly refers to segmenting medical images such as fundus images to find the desired target object, for example, segmenting the optic cup, optic disc, etc. from the fundus image.
  • the segmented target object can be subsequently analyzed by medical staff or other medical experts for further operations.
  • The electronic device integrated with the image segmentation device first obtains the target domain image and the source domain image labeled with target information; it then uses the generator network in the first generative adversarial network to segment the source domain image and the target domain image respectively, determining the first source domain segmentation loss and the first target domain segmentation loss, and uses the generator network in the second generative adversarial network to segment the source domain image and the target domain image respectively, determining the second source domain segmentation loss and the second target domain segmentation loss. It then determines the first source domain target image and the second source domain target image according to the first source domain segmentation loss and the second source domain segmentation loss, and determines the first target domain target image and the second target domain target image according to the first target domain segmentation loss and the second target domain segmentation loss. Finally, it uses the first source domain target image, the first target domain target image, the second source domain target image, and the second target domain target image to cross-train the first generative adversarial network and the second generative adversarial network, obtaining the trained first generative adversarial network, whose generator network is used to segment the image to be segmented.
  • This scheme uses two generative adversarial networks with different structures and learning capabilities; they learn from each other, supervise each other, and pass the clean target images selected by each network to the peer network for continued training, which effectively improves the accuracy of image segmentation.
  • the image segmentation device may be specifically integrated in an electronic device.
  • the electronic device may be a server, a terminal, or a system including a server and a terminal.
  • the image segmentation method in the embodiment of the present application is implemented through interaction between the terminal and the server.
  • the terminal may specifically be a desktop terminal or a mobile terminal
  • the mobile terminal may specifically be at least one of a mobile phone, a tablet computer, a notebook computer, and the like.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • the specific process of the image segmentation method can be as follows:
  • The source domain image refers to a medical image that can provide rich annotation information, while the target domain image belongs to the domain of the test data set, i.e., medical images that lack annotation information.
  • The source domain images can be collected by various medical image acquisition devices, such as Computed Tomography (CT) or MRI scanners imaging living tissue, and can be annotated by professionals, for example by an imaging physician, before being provided to the image segmentation device; that is, the image segmentation device can receive the medical image samples sent by the medical image acquisition device.
  • A medical image refers to an image of the internal tissues of a living body, or of part of a living body, obtained in a non-invasive way for medical treatment or medical research, such as images of the human brain, stomach, liver, heart, throat, and vagina.
  • The image can be a CT image, an MRI image, or a positron emission tomography scan, etc.
  • A living body refers to an independent individual with a life form, such as a human or an animal.
  • The source domain image can be an image collected by medical image acquisition equipment and obtained through various means, such as from a database or a network; it can be an image sample that has been annotated with a specific meaning by professionals, or an image sample that has not undergone any processing.
  • The structure and parameters of the first generative adversarial network can be set and adjusted according to actual conditions.
  • The generator network in the first generative adversarial network can use DeepLabv2, with the residual network ResNet-101 as the main framework, as the basic model to obtain preliminary segmentation results.
  • An Atrous Spatial Pyramid Pooling (ASPP) structure is added, which enriches the multi-scale information of the feature maps.
  • An attention mechanism based on the Dual Attention Network (DANet) is introduced to learn to capture the context dependencies between pixels and between feature channels; the output of the attention module is concatenated with the output of the spatial pyramid structure to generate the final segmentation features.
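  • As an illustrative sketch (not the filing's exact implementation), a DANet-style position-attention block that captures pixel-wise context dependencies can be written in PyTorch as follows; the channel-reduction factor of 8 and the learned residual weight gamma are conventional DANet choices assumed here:

```python
import torch
import torch.nn as nn

class PositionAttention(nn.Module):
    """DANet-style position attention: every pixel aggregates features from
    all other pixels, weighted by pairwise feature similarity."""
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight, 0 at init

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, h*w, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, h*w)
        attn = torch.softmax(q @ k, dim=-1)            # (b, h*w, h*w) similarities
        v = self.value(x).flatten(2)                   # (b, c, h*w)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection

feat = torch.randn(1, 64, 16, 16)
out = PositionAttention(64)(feat)
```

A channel-attention branch would be computed analogously over the channel dimension, and the attention output concatenated with the ASPP output as described above.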
  • The discriminator network in the first generative adversarial network can use a multi-layer fully convolutional network to integrate the segmentation probabilities of the source domain image and the target domain image into the adversarial learning. A Leaky Rectified Linear Unit (Leaky ReLU) activation layer is added after every convolutional layer except the last, and a single-channel 2D result is finally output, with 0 and 1 representing the source domain and the target domain, respectively.
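  • A discriminator of this shape might be sketched as follows (layer widths, the 4x4 kernels with stride 2, and the 0.2 negative slope are assumptions; only the fully convolutional structure, the Leaky ReLU after every layer except the last, and the single-channel output follow the description):

```python
import torch
import torch.nn as nn

def make_discriminator(num_classes: int = 2, base: int = 64) -> nn.Sequential:
    """Fully convolutional discriminator: takes a segmentation probability map
    and outputs a single-channel 2D map (0 = source domain, 1 = target domain)."""
    channels = [num_classes, base, base * 2, base * 4, base * 8]
    layers = []
    for i in range(4):
        layers.append(nn.Conv2d(channels[i], channels[i + 1], 4, stride=2, padding=1))
        layers.append(nn.LeakyReLU(0.2))  # Leaky ReLU after every conv except the last
    layers.append(nn.Conv2d(channels[-1], 1, 4, stride=2, padding=1))  # single-channel output
    return nn.Sequential(*layers)

disc = make_discriminator()
score = disc(torch.randn(1, 2, 64, 64))  # per-region domain score map
```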
  • The generator network in the first generative adversarial network can be used to extract the features of the source domain image and the target domain image respectively to obtain the feature information of the first source domain image and the feature information of the first target domain image. Based on the feature information of the first source domain image, target segmentation is performed on the source domain image to determine the first source domain segmentation loss; based on the feature information of the first target domain image, target segmentation is performed on the target domain image to determine the first target domain segmentation loss.
  • A distance map for anti-noise labeling can be introduced for the source domain images. Because medical labels are marked at the boundary of the target area, noise labels are common there; to prevent the network from fitting noise labels, a new anti-noise segmentation loss is proposed, which learns useful pixel-level information from noise samples and filters out noisy areas at the edges.
  • The source domain image includes a noise image and a noise-free image.
  • Based on the feature information of the first source domain image, target segmentation of the noise image in the source domain image can be performed to obtain the first noise segmentation probability; the weight map of the noise image in the source domain image is obtained; the first noise segmentation loss is obtained according to the first noise segmentation probability and the weight map of the noise image; target segmentation is performed on the noise-free image in the source domain image to obtain the first noise-free segmentation probability; the first noise-free segmentation loss is obtained according to the first noise-free segmentation probability and the labeling result of the noise-free image; and the first source domain segmentation loss is determined based on the first noise segmentation loss and the first noise-free segmentation loss.
  • According to the first noise segmentation probability, a first noise segmentation result is generated.
  • The first noise segmentation loss combines a weighted cross-entropy term and a weighted Dice term:
    L_noise = λ1 · L_ce(W) + λ2 · L_dice(W)
  • h × w × c represent the height, width, and number of categories of the image data; λ1 and λ2 are the weight coefficients of the two types of losses; W(y_i) represents the weight map; the first term is based on cross-entropy loss, and the second term is based on Dice loss.
  • w_c is the weight coefficient that balances the categories. For each noise label y_i, the distance d(y_i) from each pixel on the label to the nearest boundary is calculated, and the maximum distance max_dis among the d(y_i) is obtained.
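  • Structurally, this loss can be sketched in NumPy as follows (illustrative only; the weight map W(y) is passed in precomputed — in practice it would come from a distance transform of the label, normalised by max_dis — and λ1, λ2 default to 1):

```python
import numpy as np

def noise_seg_loss(prob, label, weight, lam1=1.0, lam2=1.0, eps=1e-8):
    """Anti-noise segmentation loss sketch: a weight-map-modulated
    cross-entropy term (first term) plus a weighted Dice term (second term).
    `weight` is W(y): the per-pixel distance to the nearest label boundary,
    normalised by the maximum distance, so region centres weigh heavily and
    boundary pixels (where label noise concentrates) weigh little."""
    ce = -(weight * (label * np.log(prob + eps)
                     + (1.0 - label) * np.log(1.0 - prob + eps))).mean()
    inter = (weight * prob * label).sum()
    dice = 1.0 - 2.0 * inter / ((weight * prob).sum()
                                + (weight * label).sum() + eps)
    return lam1 * ce + lam2 * dice
```

With a uniform weight map this reduces to ordinary cross-entropy plus Dice loss; the distance-based map is what down-weights the noisy boundary pixels.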
  • The two networks exchange the clean data that each considers to have a small loss, and the Dice coefficient dice_co of the two networks' predictions on the clean data is calculated.
  • If dice_co is greater than the threshold τ, it means the two networks diverge on that sample.
  • Such a sample is regarded as a noise sample (noisy data), and the anti-noise segmentation loss is added to improve learning from it when calculating the generator loss L_noise; otherwise the loss is calculated in the original way, with cross-entropy and Dice loss.
  • In W(y_i), the center of the region in each class has a larger weight, and the closer to the boundary, the smaller the weight.
  • L_noise enables the network to capture the key central positions and to filter out boundary differences under various noise labels.
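  • The clean-data exchange can be sketched as follows (illustrative; the filing compares a Dice-based quantity dice_co against τ, and this sketch assumes the divergence score is one minus the Dice overlap of the two networks' binarised predictions):

```python
import numpy as np

def dice_overlap(p1, p2, eps=1e-8):
    """Dice coefficient between the binarised predictions of the two networks."""
    a, b = p1 > 0.5, p2 > 0.5
    return 2.0 * (a & b).sum() / (a.sum() + b.sum() + eps)

def route_clean_samples(preds1, preds2, tau=0.2):
    """Exchange step: for each small-loss ('clean') sample, compute a
    divergence score between the two networks' predictions; samples whose
    score exceeds tau are treated as noisy and get the anti-noise loss,
    the rest keep the ordinary cross-entropy + Dice loss."""
    clean, noisy = [], []
    for i, (p1, p2) in enumerate(zip(preds1, preds2)):
        divergence = 1.0 - dice_overlap(p1, p2)  # assumed score: 1 - Dice
        (noisy if divergence > tau else clean).append(i)
    return clean, noisy
```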
  • the whole task can be regarded as an unsupervised image segmentation problem.
  • This application adds "self-supervised" information, that is, using the segmentation results of the target domain image to generate pixel-level pseudo-labels, and apply them in the next training stage.
  • For the segmentation probability of the target domain image at any pixel, if the prediction confidence of a certain category is higher than the confidence threshold, a pseudo-label of the corresponding category is generated at that pixel.
  • The confidence threshold is set adaptively: the confidences of the pseudo-labels in each class and each sample of the target domain image are sorted, and the pixels with the highest class-level and image-level prediction confidence are selected adaptively.
  • For example, target segmentation is performed on the target domain image based on the feature information of the first target domain image to obtain the first target domain segmentation probability; the first target domain segmentation result is generated according to the first target domain segmentation probability; and the first target domain segmentation loss is obtained according to the first target domain segmentation result and the target domain image.
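  • The adaptive pseudo-labelling step can be sketched as follows (illustrative; the per-class quantile with a `portion` parameter stands in for the class-level and image-level confidence ranking described above, and 255 as the ignore label is an assumption):

```python
import numpy as np

def pseudo_labels(prob, portion=0.5):
    """Generate pixel-level pseudo-labels from target-domain segmentation
    probabilities `prob` of shape (C, H, W).  Instead of a fixed confidence
    threshold, the threshold is set adaptively per class: only the top
    `portion` most confident pixels of each predicted class receive a
    pseudo-label; the rest are ignored (label 255)."""
    pred = prob.argmax(axis=0)                        # (H, W) hard prediction
    conf = prob.max(axis=0)                           # per-pixel confidence
    label = np.full(pred.shape, 255, dtype=np.int64)  # 255 = ignore
    for c in range(prob.shape[0]):
        mask = pred == c
        if not mask.any():
            continue
        thr = np.quantile(conf[mask], 1.0 - portion)  # class-adaptive threshold
        label[mask & (conf >= thr)] = c
    return label
```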
  • The first source domain segmentation probability P_S and the first target domain segmentation probability P_T output by the generator network of the first generative adversarial network are simultaneously input into the discriminator network of the first generative adversarial network; the information entropy generated from P_T is used to calculate the adversarial loss L_D, and the parameters of the discriminator network are updated by maximizing the adversarial loss. The error produced by the adversarial loss function is then propagated back to the generator network, and the parameters of the segmentation network are updated by minimizing the adversarial loss.
  • The purpose is to make the segmentation results predicted by the generator network for the source domain image and the target domain image increasingly similar, thereby realizing domain adaptation.
  • The discriminator network of the first generative adversarial network can be used to discriminate between the first source domain segmentation result and the first target domain segmentation result to obtain a first discrimination result, and the first generative adversarial network is trained according to the first source domain segmentation result, the first target domain segmentation result, and the first discrimination result to obtain the trained first generative adversarial network.
  • Specifically, the discriminator network of the first generative adversarial network obtains the first discrimination result according to the first source domain segmentation result, the first target domain segmentation result, and the information entropy of the first target domain image, and the first generative adversarial network is trained accordingly to obtain the trained first generative adversarial network.
  • In detail, the first source domain segmentation loss can be obtained based on the first source domain segmentation result and the labeling result of the first source domain image; the first target domain segmentation loss is obtained according to the first target domain segmentation result and the first target domain image; the first discriminant loss of the discriminator network is obtained according to the first source domain segmentation result and the first target domain segmentation result; and the first generative adversarial network is trained according to the first source domain segmentation loss, the first target domain segmentation loss, and the first discriminant loss to obtain the trained first generative adversarial network.
  • For example, a minimized adversarial loss of the first generative adversarial network is constructed from the first source domain segmentation loss and the first target domain segmentation loss; a maximized adversarial loss of the first generative adversarial network is constructed from the first discriminant loss; and the first generative adversarial network is iteratively trained based on the minimized adversarial loss and the maximized adversarial loss to obtain the trained first generative adversarial network.
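  • The alternating min-max update described above can be sketched as one training step in PyTorch (illustrative only; the optimizers, the BCE-style discriminator loss, and the λ_adv value are assumptions, and the entropy weighting of the target-domain term is omitted for brevity):

```python
import torch
import torch.nn.functional as F

def adversarial_step(gen, disc, opt_g, opt_d, x_s, y_s, x_t, lam_adv=1e-3):
    """One alternating update: the generator minimises the supervised
    segmentation loss on the source image plus an adversarial term that makes
    target predictions look source-like; the discriminator maximises its
    ability to tell the two domains apart (labels: 0 = source, 1 = target)."""
    # --- generator (segmentation network) update: minimise adversarial loss ---
    opt_g.zero_grad()
    p_s, p_t = gen(x_s), gen(x_t)
    loss_seg = F.cross_entropy(p_s, y_s)          # supervised source-domain loss
    d_t = disc(torch.softmax(p_t, dim=1))
    loss_adv = F.binary_cross_entropy_with_logits(d_t, torch.zeros_like(d_t))  # fool D
    (loss_seg + lam_adv * loss_adv).backward()
    opt_g.step()
    # --- discriminator update: maximise adversarial loss ---
    opt_d.zero_grad()
    d_s = disc(torch.softmax(p_s.detach(), dim=1))
    d_t = disc(torch.softmax(p_t.detach(), dim=1))
    loss_d = F.binary_cross_entropy_with_logits(d_s, torch.zeros_like(d_s)) \
           + F.binary_cross_entropy_with_logits(d_t, torch.ones_like(d_t))
    loss_d.backward()
    opt_d.step()
    return loss_seg.item(), loss_d.item()
```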
  • Y_S is the label of the source domain image.
  • The segmentation loss of the segmentation network on the source domain is defined as
    L_seg = L_clean + α · L_noise
  • where L_noise is the first noise segmentation loss, L_clean is the segmentation loss for clean data with clean and reliable labels (that is, the first noise-free segmentation loss), and α is a coefficient that balances L_clean and L_noise.
  • In the adversarial loss of the discriminator network, λ_adv is a parameter used to balance the adversarial loss against the other losses during training.
  • In L_adv, λ_entr is the weight parameter corresponding to the information entropy result map, and δ is added to ensure the stability of the training process when f(X_T) is near zero.
  • f(X_T) is the information entropy of the target domain image prediction, which can be expressed pixel by pixel over the categories c as
    f(X_T) = -Σ_c P_T^(c) · log P_T^(c)
  • An entropy map is introduced into the pixel-by-pixel prediction of the target domain image; the entropy map is then multiplied by the adversarial loss calculated by the discriminator for each pixel, increasing the loss weight of uncertain (high-entropy) pixels and reducing the loss weight of deterministic (low-entropy) pixels. Driven by the entropy map, this helps the network learn to pay attention to the most representative features of each category.
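  • The entropy-map mechanism can be sketched in NumPy (illustrative; only the f(X_T) entropy computation and the high-entropy-pixels-get-larger-weight idea follow the description, while the sigmoid discriminator output, the label convention, and δ = 1e-3 are assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def entropy_map(prob, eps=1e-8):
    """f(X_T): per-pixel information entropy of the class probabilities,
    -sum_c P^(c) log P^(c); large where the prediction is uncertain."""
    return -(prob * np.log(prob + eps)).sum(axis=0)

def entropy_weighted_adv_loss(d_out, prob, lam_entr=1.0, delta=1e-3):
    """Per-pixel adversarial loss (generator pushes the discriminator output
    toward the 'source' label 0) scaled by the entropy map, so high-entropy
    pixels get larger loss weight; delta keeps the weighting stable when
    f(X_T) is near zero."""
    per_pixel = -np.log(1.0 - sigmoid(d_out) + 1e-8)  # BCE toward label 0
    weight = lam_entr * entropy_map(prob) + delta
    return float((weight * per_pixel).mean())
```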
  • The training of the second generative adversarial network is similar to that of the first generative adversarial network, except that a different structure and different parameters are used.
  • The DeepLabv3+ architecture can be used for the second generative adversarial network N2, with the lightweight network MobileNetV2 as the backbone; the first convolutional layer of MobileNetV2 and the subsequent 7 residual blocks are used to extract features.
  • The ASPP module is also added to learn the potential features in different receptive fields: ASPP branches with different dilation rates generate multi-scale features, and different levels of semantic information are integrated into the feature map.
  • The feature map is up-sampled and then convolved, connecting the above combined features with low-level features to perform fine-grained semantic segmentation.
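  • The ASPP idea (parallel dilated convolutions over the same feature map) can be sketched as follows; the dilation rates (1, 6, 12, 18) are common DeepLab defaults assumed for illustration, not necessarily those of the filing:

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: parallel 3x3 convolutions with
    different dilation rates capture multi-scale context; their outputs are
    concatenated and fused by a 1x1 convolution."""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            # padding == dilation keeps the spatial size for a 3x3 kernel
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

y = ASPP(32, 16)(torch.randn(1, 32, 33, 33))
```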
  • The generator network in the second generative adversarial network can be used to extract the features of the source domain image and the target domain image respectively to obtain the feature information of the second source domain image and the feature information of the second target domain image. Based on the feature information of the second source domain image, target segmentation is performed on the source domain image to determine the second source domain segmentation loss; based on the feature information of the second target domain image, target segmentation is performed on the target domain image to determine the second target domain segmentation loss.
  • As with the first network, a distance map for anti-noise labeling can be introduced for the source domain images. Because medical labels are marked at the boundary of the target area, noise labels are common there; to prevent the network from fitting noise labels, the anti-noise segmentation loss learns useful pixel-level information from noise samples and filters out noisy areas at the edges.
  • The source domain image includes a noise image and a noise-free image.
  • Based on the feature information of the second source domain image, target segmentation of the noise image in the source domain image can be performed to obtain the second noise segmentation probability; the weight map of the noise image in the source domain image is obtained; the second noise segmentation loss is obtained according to the second noise segmentation probability and the weight map of the noise image; target segmentation is performed on the noise-free image in the source domain image to obtain the second noise-free segmentation probability; the second noise-free segmentation loss is obtained according to the second noise-free segmentation probability and the labeling result of the noise-free image; and the second source domain segmentation loss is determined based on the second noise segmentation loss and the second noise-free segmentation loss.
  • the specific calculation method of the second noise segmentation loss can refer to the calculation method of the first noise segmentation loss described above.
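The application does not spell out the exact form of the noise-resistant loss here, but its core idea, down-weighting unreliable boundary pixels via a weight map, can be sketched as a weight-modulated pixel-wise cross-entropy. The probabilities, labels, and weights below are made-up toy values:

```python
import numpy as np

def weighted_bce(prob, label, weight):
    """Pixel-wise binary cross-entropy modulated by a weight map.

    Pixels near the annotated boundary (where the label may be noisy)
    receive small weights, so the network does not fit label noise there.
    """
    eps = 1e-7
    prob = np.clip(prob, eps, 1 - eps)
    ce = -(label * np.log(prob) + (1 - label) * np.log(1 - prob))
    return float((weight * ce).sum() / (weight.sum() + eps))

prob = np.array([[0.9, 0.6], [0.2, 0.1]])    # predicted foreground probability
label = np.array([[1.0, 1.0], [0.0, 0.0]])   # (possibly noisy) annotation
weight = np.array([[1.0, 0.2], [1.0, 0.2]])  # low weight near the boundary
loss = weighted_bce(prob, label, weight)
```

With a uniform weight map this reduces to ordinary cross-entropy; down-weighting the uncertain boundary pixels lowers their influence on the total loss.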
  • the specific training method is similar to that of the preset first generative network, and "self-supervised" information can likewise be added, that is, the segmentation result of the target domain image is used to generate pixel-level pseudo labels, which are applied in the next training phase. For example, based on the second target domain feature information, target segmentation is performed on the target domain image to obtain the second target domain segmentation probability; the second target domain segmentation result is generated according to this probability; and the second target domain segmentation loss is obtained according to the second target domain segmentation result and the target domain image.
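The pixel-level pseudo-label generation described above can be sketched as thresholding the predicted probabilities; the confidence threshold of 0.75 is a hypothetical hyperparameter, not a value from this application:

```python
import numpy as np

def make_pseudo_labels(prob, threshold=0.75):
    """Turn target-domain segmentation probabilities into pixel-level
    pseudo labels; low-confidence pixels are marked ignore (-1)."""
    hard = (prob >= 0.5).astype(np.int64)       # provisional hard label
    confidence = np.maximum(prob, 1.0 - prob)   # confidence of that label
    hard[confidence < threshold] = -1           # ignore uncertain pixels
    return hard

prob = np.array([[0.95, 0.60], [0.10, 0.40]])
pseudo = make_pseudo_labels(prob)
# pseudo -> [[1, -1], [0, -1]]
```

Only the confident pixels then contribute to the segmentation loss in the next training phase.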
  • the second source domain segmentation probability P_S and the second target domain segmentation probability P_T output by the generative network of the second generative adversarial network are input together into the discriminant network of the second generative adversarial network.
  • the information entropy derived from P_T is used to calculate the adversarial loss L_D, and the parameters of the discriminant network are updated by maximizing this adversarial loss.
  • the error produced by the adversarial loss function is also propagated back to the generative network, and the parameters of the segmentation network are updated by minimizing the adversarial loss.
  • the purpose is to make the segmentation results that the generative network predicts for the source domain image and the target domain image increasingly similar, thereby achieving domain adaptation.
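The information entropy derived from P_T can be sketched as the standard pixel-wise entropy map of a binary segmentation probability (how this map is combined with L_D inside the discriminant network is not detailed here, so only the entropy itself is shown):

```python
import numpy as np

def entropy_map(prob):
    """Pixel-wise information entropy of a binary segmentation probability:
    I = -p*log(p) - (1-p)*log(1-p). Maximal at p = 0.5."""
    eps = 1e-7
    p = np.clip(prob, eps, 1 - eps)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

p_t = np.array([[0.5, 0.9], [0.99, 0.5]])  # toy target-domain probabilities
ent = entropy_map(p_t)
# entropy peaks (log 2) at the uncertain p = 0.5 pixels and shrinks
# toward zero for confident predictions
```

High-entropy (uncertain) pixels are exactly those where the domain gap is most visible, which motivates weighting the adversarial signal with this map.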
  • the discriminant network of the second generative adversarial network can be used to discriminate the second source domain segmentation result and the second target domain segmentation result to obtain the second discrimination result; the second generative adversarial network is trained according to the second source domain segmentation result, the second target domain segmentation result, and the second discrimination result to obtain the trained second generative adversarial network.
  • the second source domain segmentation result and the second target domain segmentation result can be discriminated in many ways; for example, the information entropy of the second target domain image can be calculated, and the discriminant network of the second generative adversarial network then obtains the second discrimination result according to the second source domain segmentation result, the second target domain segmentation result, and the information entropy of the second target domain image.
  • the second generative adversarial network is trained to obtain the trained second generative adversarial network.
  • the minimized adversarial loss of the second generative adversarial network is constructed according to the second source domain segmentation loss and the second target domain segmentation loss; the maximized adversarial loss of the second generative adversarial network is constructed according to the second discriminant loss; based on the minimized adversarial loss and the maximized adversarial loss, the second generative adversarial network is iteratively trained to obtain the trained second generative adversarial network.
  • the calculation method of each loss in the second generation confrontation network is similar to that of the first generation confrontation network, and the details can be seen in the above description.
  • each generative adversarial network sorts the segmentation losses of all its predictions, and the two networks select the small-loss samples C_1 and C_2 as clean data.
  • each network delivers these useful samples to its peer network for the next round of training, which then updates the parameters of its convolutional layers.
  • each generative network re-selects the clean data it currently considers best and fine-tunes its peer network stage by stage. Since the two networks have different structures and learning capabilities, they can filter out different types of errors introduced by noisy labels. In this exchange process, the peer networks supervise each other, reducing the training errors caused by noisy labels.
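The small-loss selection and exchange described above can be sketched as follows; the per-sample losses and the keep ratio of 0.8 are illustrative values, not taken from this application:

```python
import numpy as np

def select_clean(losses, keep_ratio=0.8):
    """Indices of the small-loss samples a network treats as clean data."""
    keep = max(1, int(len(losses) * keep_ratio))
    return np.argsort(losses)[:keep]

loss_n1 = np.array([0.9, 0.1, 0.3, 2.0, 0.2])  # per-sample losses under N1
loss_n2 = np.array([0.8, 0.2, 1.5, 0.4, 0.1])  # per-sample losses under N2

c1 = select_clean(loss_n1)  # clean set C_1 chosen by N1 -> used to train N2
c2 = select_clean(loss_n2)  # clean set C_2 chosen by N2 -> used to train N1
```

Because each network hands its clean set to the other, an error that one network has memorized is unlikely to be reinforced by its peer.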
  • specifically, the first source domain segmentation losses can be sorted, and the source domain images that meet a preset loss condition are selected according to the sorted first source domain segmentation losses and determined as the first source domain target images; the second source domain segmentation losses are sorted, and the source domain images that meet the preset loss condition are selected according to the sorted second source domain segmentation losses and determined as the second source domain target images.
  • the preset loss condition may be, for example, a preset loss threshold.
  • a source domain image that meets the preset condition is then one whose source domain segmentation loss is less than the loss threshold.
  • alternatively, the preset loss condition may be minimality of the loss.
  • in that case, the source domain image that meets the preset loss condition is the source domain image with the smallest source domain segmentation loss among all source domain images.
  • the pseudo labels generated by the two generative networks on the target domain images are cross-learned at each stage to update the network parameters.
  • the specific training steps are as follows: Step 1, take the target domain results PL_1 and PL_2 produced by the two networks in the previous training stage as pseudo labels.
  • Step 2, apply each pseudo label to the other network's training process in the next stage, and update the network parameters iteratively.
  • the segmentation network and the discriminant network are trained together in an alternating update manner: the image data is first input into the segmentation network, the segmentation loss L_seg is calculated using the real labels of the source domain data and the pseudo labels of the target domain data, and the parameters of the segmentation network are updated by minimizing the segmentation loss.
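The segmentation loss L_seg that combines real source-domain labels with target-domain pseudo labels can be sketched as a masked cross-entropy, where pseudo-labeled pixels marked ignore contribute nothing (the ignore value -1 and the toy inputs are assumptions for illustration):

```python
import numpy as np

def seg_loss(prob, label, ignore=-1):
    """Cross-entropy over labeled pixels only; pixels whose (pseudo) label
    equals `ignore` are excluded from the loss."""
    eps = 1e-7
    p = np.clip(prob, eps, 1 - eps)
    mask = label != ignore
    if not mask.any():
        return 0.0
    y = label[mask].astype(float)
    pm = p[mask]
    ce = -(y * np.log(pm) + (1 - y) * np.log(1 - pm))
    return float(ce.mean())

src = seg_loss(np.array([0.9, 0.2]), np.array([1, 0]))   # real source labels
tgt = seg_loss(np.array([0.8, 0.5]), np.array([1, -1]))  # target pseudo labels
l_seg = src + tgt
```

Minimizing l_seg updates the segmentation network on both domains at once, while unreliable pseudo-labeled pixels stay inert.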
  • specifically, the first generative adversarial network can be trained according to the first target domain segmentation loss, and the training result is used to generate the first target domain target image; the second generative adversarial network is trained according to the second target domain segmentation loss, and the training result is used to generate the second target domain target image.
  • the first source domain target image and the first target domain target image may be used to train the second generative adversarial network.
  • the second source domain target image and the second target domain target image may be used to train the first generative adversarial network.
  • the generative network of the first generative adversarial network may be used to segment the second source domain target image and the second target domain target image respectively, obtaining the second source domain target segmentation result and the second target domain target segmentation result.
  • the discriminant network of the first generative adversarial network discriminates the second source domain target segmentation result and the second target domain target segmentation result to obtain the second target discrimination result; the first generative adversarial network is trained according to the second source domain target segmentation result, the second target domain target segmentation result, and the second target discrimination result to obtain the trained first generative adversarial network.
  • there can be many ways to discriminate between the second source domain target segmentation result and the second target domain target segmentation result. For example, the information entropy of the second target domain target image can be calculated; the discriminant network of the first generative adversarial network then obtains the second target discrimination result according to the second source domain target segmentation result, the second target domain target segmentation result, and the information entropy of the second target domain target image.
  • the first generative adversarial network is trained according to the second source domain target segmentation result, the second target domain target segmentation result, and the second target discrimination result. For example, the second source domain target segmentation loss can be obtained according to the second source domain target segmentation result and the labeling result of the second source domain target image; the second target domain target segmentation loss is obtained according to the second target domain target segmentation result and the second target domain target image; the second target discrimination loss of the discriminant network is obtained according to the second source domain target segmentation result and the second target domain target segmentation result; and the first generative adversarial network is trained according to the second source domain target segmentation loss, the second target domain target segmentation loss, and the second target discrimination loss to obtain the trained first generative adversarial network.
  • when training the first generative adversarial network according to the second source domain target segmentation loss, the second target domain target segmentation loss, and the second target discrimination loss, the minimized adversarial loss of the first generative adversarial network can be constructed according to the second source domain target segmentation loss and the second target domain target segmentation loss; the maximized adversarial loss of the first generative adversarial network is constructed according to the second target discrimination loss; and based on the minimized adversarial loss and the maximized adversarial loss, the first generative adversarial network is iteratively trained to obtain the trained first generative adversarial network.
  • the method of training the second generative adversarial network using the first source domain target image and the first target domain target image is similar to the method of training the first generative adversarial network described above.
  • the generative network of the second generative adversarial network can be used to segment the first source domain target image and the first target domain target image respectively, obtaining the first source domain target segmentation result and the first target domain target segmentation result.
  • the discriminant network of the second generative adversarial network discriminates the first source domain target segmentation result and the first target domain target segmentation result to obtain the first target discrimination result; the second generative adversarial network is trained according to the first source domain target segmentation result, the first target domain target segmentation result, and the first target discrimination result to obtain the trained second generative adversarial network.
  • when the discriminant network of the second generative adversarial network discriminates between the first source domain target segmentation result and the first target domain target segmentation result, the information entropy of the first target domain target image can be calculated, and the discriminant network obtains the first target discrimination result according to the first source domain target segmentation result, the first target domain target segmentation result, and the information entropy of the first target domain target image.
  • the second generative adversarial network is trained according to the first source domain target segmentation result, the first target domain target segmentation result, and the first target discrimination result. Specifically, the first source domain target segmentation loss can be obtained according to the first source domain target segmentation result and the labeling result of the first source domain target image; the first target domain target segmentation loss is obtained according to the first target domain target segmentation result and the first target domain target image; the first target discrimination loss of the discriminant network is obtained according to the first source domain target segmentation result and the first target domain target segmentation result; and the second generative adversarial network is trained according to the first source domain target segmentation loss, the first target domain target segmentation loss, and the first target discrimination loss to obtain the trained second generative adversarial network.
  • the minimized adversarial loss of the second generative adversarial network can be constructed according to the first source domain target segmentation loss and the first target domain target segmentation loss; the maximized adversarial loss of the second generative adversarial network is constructed according to the first target discrimination loss; and based on the minimized adversarial loss and the maximized adversarial loss, the second generative adversarial network is iteratively trained to obtain the trained second generative adversarial network.
  • the generative network of the trained first generative adversarial network performs segmentation on the image to be segmented to obtain a segmentation result.
  • the segmentation result of the image to be segmented is generated according to the segmentation prediction probability.
  • the image to be segmented refers to an image that needs to be segmented, such as a medical image (e.g., heart or lung) or an ordinary image (e.g., people or objects), and so on.
  • various medical image acquisition devices, such as a computed tomography (CT) scanner or an MRI machine, can be used to collect images of living tissue, such as the human brain, stomach, liver, heart, throat, or vagina, which are then provided to the medical image detection device; that is, the medical image detection device can receive the to-be-segmented image sent by the medical image acquisition device.
  • the embodiment of the present application first obtains the target domain image and the source domain image labeled with target information; then the generative network in the first generative adversarial network is used to segment the source domain image and the target domain image respectively and determine the first source domain segmentation loss and the first target domain segmentation loss, and the generative network in the second generative adversarial network is used to segment the source domain image and the target domain image respectively and determine the second source domain segmentation loss and the second target domain segmentation loss. Next, the first source domain target image and the second source domain target image are determined according to the first source domain segmentation loss and the second source domain segmentation loss, and the first target domain target image and the second target domain target image are determined according to the first target domain segmentation loss and the second target domain segmentation loss. The first source domain target image, the first target domain target image, the second source domain target image, and the second target domain target image are then used to cross-train the first generative adversarial network and the second generative adversarial network, obtaining the trained networks.
  • the generative network of a trained generative adversarial network is then used to segment the image to be segmented, obtaining the segmentation result.
  • the scheme mainly addresses the presence of noise in data labels and the distribution differences between the source domain and target domain data sets, and proposes an unsupervised robust segmentation method based on a domain-adaptive strategy. Through mutual learning and mutual supervision of the two models, the tasks of noisy-label segmentation and unsupervised image segmentation are effectively solved, and the accuracy of image segmentation is improved. Following the method described in the previous embodiment, accurate segmentation of the glaucoma optic cup and optic disc is taken below as an example for further detailed description.
  • the embodiment of the application provides a robust unsupervised domain-adaptive segmentation method based on noisy label data, which can learn the feature structure of an existing labeled data set and transfer that knowledge to a new data set.
  • for the new, unlabeled data set this provides more accurate image segmentation, which effectively improves the generalization performance of the deep network on other data sets.
  • the unsupervised domain-adaptive training method in the embodiments of this application can train a generative adversarial network that includes an image segmentation network (as the generative network) through a domain adversarial method, and then use the generative network of the trained generative adversarial network to segment the unlabeled image to be segmented, and so on.
  • the image segmentation device is specifically integrated in an electronic device as an example for description.
  • an embodiment of the present application provides an image segmentation method, and the specific process may be as follows:
  • the electronic device acquires a target domain image and a source domain image labeled with target information.
  • the REFUGE and Drishti-GS data sets are used. Because the training set and the validation set (or test set) were captured by different acquisition devices, the images differ in color, texture, and other aspects. The training set of the REFUGE data set serves as the source domain training set; the validation set of the REFUGE data set and the validation set of the Drishti-GS data set serve as the target domain training sets; and the test set of the REFUGE data set and the test set of the Drishti-GS data set serve as the target domain test sets.
  • for REFUGE, the training set contains 400 images of size 2124×2056; the validation set contains 300 images and the test set contains 100 images, of size 1634×1634.
  • for Drishti-GS, the validation set contains 50 images and the test set contains 51 images, of size 2047×1759.
  • this application proposes an unsupervised robust segmentation method based on a domain-adaptive strategy, aimed at the distribution difference between the source domain and target domain data sets.
  • the two models learn from each other and supervise each other to effectively solve the noisy-label segmentation and unsupervised image segmentation tasks; the method consists of two generative adversarial networks, namely N1 and N2.
  • N1 includes a generative network (also called a segmentation network) S1 and a discriminant network D1.
  • N2 includes a generative network (also called a segmentation network) S2 and a discriminant network D2.
  • because the two networks have different structures and parameters, they produce different decision boundaries, that is, they have different learning capabilities, which promotes peer supervision between the networks.
  • peer supervision is a strategy in which the two networks supervise each other, improving network performance through the exchange of low-loss data and pseudo labels between them.
  • the electronic device uses the generative network in the first generative adversarial network to separately segment the source domain image and the target domain image, and determines the first source domain segmentation loss and the first target domain segmentation loss.
  • the generative network in the first generative adversarial network can use DeepLabv2 with ResNet-101 as the main framework as its basic model to obtain preliminary segmentation results.
  • an ASPP structure is added to enrich the multi-scale information of the feature map.
  • an attention mechanism based on DANet is introduced to learn how to capture the context dependencies between pixels and between feature channels, and the output of the attention module is concatenated with the output of the spatial pyramid structure to generate the final segmentation features.
  • source domain images include noisy images and noise-free images.
  • a weight map resisting noisy labels can be introduced for the noisy images in the source domain images. Because medical annotations differ greatly at the boundary of the target region, a new noise-resistant segmentation loss is proposed to prevent the network from fitting noisy labels; it learns useful pixel-level information from noisy samples and filters out noisy regions at the edges.
  • the electronic device may use the generative network in the first generative adversarial network to perform feature extraction on the source domain image, obtaining the first source domain feature information; based on this information, target segmentation is performed on the noisy images in the source domain images to obtain the first noise segmentation probability; the weight map of the noisy images in the source domain images is obtained, and the first noise segmentation loss is obtained according to the first noise segmentation probability and the weight map of the noisy images.
  • based on the first source domain feature information, target segmentation is performed on the noise-free images in the source domain images to obtain the first noise-free segmentation probability; the first noise-free segmentation loss is obtained according to the first noise-free segmentation probability and the labeling result of the noise-free images.
  • based on the first noise segmentation loss and the first noise-free segmentation loss, the first source domain segmentation loss is determined; a first noise segmentation result is generated according to the first noise segmentation probability.
  • the calculation of the first noise segmentation loss is detailed in the foregoing embodiment.
  • the electronic device may use the generative network in the first generative adversarial network to perform feature extraction on the target domain image, obtaining the first target domain feature information.
  • based on the first target domain feature information, target segmentation is performed on the target domain image to obtain the first target domain segmentation probability; the first target domain segmentation result is generated according to the first target domain segmentation probability; and the first target domain segmentation loss is obtained according to the first target domain segmentation result and the target domain image.
  • the first source domain segmentation probability P_S and the first target domain segmentation probability P_T output by the generative network of the first generative adversarial network are input together into the discriminant network of the first generative adversarial network; the information entropy derived from P_T is used to calculate the adversarial loss L_D, and the parameters of the discriminant network are updated by maximizing this adversarial loss. Subsequently, the error produced by the adversarial loss function is also propagated back to the generative network, and the parameters of the segmentation network are updated by minimizing the adversarial loss.
  • the purpose is to make the segmentation results that the generative network predicts for the source domain image and the target domain image increasingly similar, thereby achieving domain adaptation.
  • the discriminant network in the first generative adversarial network can use a 5-layer fully convolutional network to integrate the segmentation probabilities of the source domain and the target domain into the adversarial learning.
  • each convolutional layer of the network model has a kernel size of 4, a stride of 2, and padding of 1; a Leaky ReLU activation function layer is added after every convolutional layer except the last; finally, a single-channel 2D result is output, with 0 and 1 representing the source domain and the target domain, respectively.
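With the stated kernel size 4, stride 2, and padding 1, each convolutional layer halves the spatial size; the 256-pixel input below is a hypothetical resolution chosen only to trace the arithmetic:

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output length of one convolution along one spatial dimension:
    floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

size = 256  # hypothetical input resolution
for _ in range(5):  # five convolutional layers, each halving the map
    size = conv_out(size)
# 256 -> 128 -> 64 -> 32 -> 16 -> 8
```

The discriminant network therefore outputs a coarse single-channel 2D map (here 8×8 for a 256×256 input), each cell classifying a patch as source (0) or target (1).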
  • the information entropy of the first target domain image can be calculated; the discriminant network of the first generative adversarial network then obtains the first discrimination result according to the first source domain segmentation result, the first target domain segmentation result, and the information entropy of the first target domain image.
  • the first source domain segmentation loss is obtained; the first target domain segmentation loss is obtained according to the first target domain segmentation result and the first target domain image; the first discriminant loss of the discriminant network is obtained according to the first source domain segmentation result and the first target domain segmentation result; the minimized adversarial loss of the first generative adversarial network is constructed according to the first source domain segmentation loss and the first target domain segmentation loss; the maximized adversarial loss of the first generative adversarial network is constructed according to the first discriminant loss; and based on the minimized adversarial loss and the maximized adversarial loss, the first generative adversarial network is iteratively trained to obtain the trained first generative adversarial network.
  • the electronic device uses the generative network in the second generative adversarial network to respectively segment the source domain image and the target domain image, and determines the second source domain segmentation loss and the second target domain segmentation loss.
  • the training of the second generative adversarial network is similar to that of the first generative adversarial network, except that different structures and parameters are used.
  • DeepLabv3+ can be used as the architecture of the second generative adversarial network N2.
  • the lightweight network MobileNetV2 can be used as the basic model.
  • the second generative adversarial network N2 uses the first convolutional layer of MobileNetV2 and the subsequent 7 residual blocks to extract features; the stride of the first convolutional layer and the following two residual blocks can be set to 2, the stride of the remaining blocks is set to 1, and the total downsampling rate of the second generative adversarial network is 8.
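The total downsampling rate of 8 follows directly from the stride settings just described (one initial convolution plus seven residual blocks, with the first three stages at stride 2):

```python
# Strides per the description: the first convolutional layer and the next
# two residual blocks use stride 2; the remaining five blocks use stride 1.
strides = [2, 2, 2, 1, 1, 1, 1, 1]

rate = 1
for s in strides:
    rate *= s
# rate -> 8, i.e. the feature map is 1/8 of the input resolution
```

Keeping the later blocks at stride 1 preserves spatial detail for the fine-grained segmentation stage.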
  • the ASPP module is also added to learn the latent features under different receptive fields; ASPP branches with different dilation rates generate multi-scale features, and different levels of semantic information are integrated into the features.
  • the feature map is up-sampled, a 1×1 convolution is then applied, and the above combined features are concatenated with low-level features to perform fine-grained semantic segmentation.
  • the electronic device may use the generative network in the second generative adversarial network to perform feature extraction on the source domain image, obtaining the second source domain feature information.
  • based on the second source domain feature information, target segmentation is performed on the noisy images in the source domain images to obtain the second noise segmentation probability; the weight map of the noisy images in the source domain images is obtained; the second noise segmentation loss is obtained according to the second noise segmentation probability and the weight map of the noisy images; based on the second source domain feature information, target segmentation is performed on the noise-free images to obtain the second noise-free segmentation probability and the second noise-free segmentation loss.
  • based on the second noise segmentation loss and the second noise-free segmentation loss, the second source domain segmentation loss is determined.
  • for the specific calculation of the second noise segmentation loss, refer to the calculation of the first noise segmentation loss described above.
  • the specific training method is similar to that of the preset first generative network; "self-supervised" information can also be added, that is, the segmentation result of the target domain image is used to generate pixel-level pseudo labels, which are applied in the next training phase.
  • the electronic device may use the generative network in the second generative adversarial network to perform feature extraction on the target domain image, obtaining the second target domain feature information.
  • based on the second target domain feature information, target segmentation is performed on the target domain image to obtain the second target domain segmentation probability; the second target domain segmentation result is generated according to the second target domain segmentation probability; and the second target domain segmentation loss is obtained according to the second target domain segmentation result and the target domain image.
  • the second source domain segmentation probability P_S and the second target domain segmentation probability P_T output by the generative network of the second generative adversarial network are input together into the discriminant network of the second generative adversarial network.
  • the information entropy derived from P_T is used to calculate the adversarial loss L_D, and the parameters of the discriminant network are updated by maximizing this adversarial loss.
  • the error produced by the adversarial loss function is also propagated back to the generative network, and the parameters of the segmentation network are updated by minimizing the adversarial loss.
  • the purpose is to make the segmentation results that the generative network predicts for the source domain image and the target domain image increasingly similar.
  • this embodiment uses the Stochastic Gradient Descent (SGD) algorithm to optimize and train the segmentation network, and uses the adaptive moment estimation (Adam) algorithm to optimize and train the discriminant network; the segmentation network and the discriminant network are trained alternately.
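The two optimizers can be sketched as single-parameter update rules; the learning rates below are illustrative defaults, not values specified in this application:

```python
import numpy as np

def sgd_step(w, grad, lr=2.5e-4):
    """Plain SGD update, as used for the segmentation network."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update, as used for the discriminant network.
    m, v are the running first/second moment estimates; t is the step count."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w_seg = sgd_step(1.0, 2.0)                      # one step on the segmenter
w_dis, m, v = adam_step(1.0, 2.0, 0.0, 0.0, 1)  # one step on the discriminator
```

Alternating these two updates realizes the min-max training: the discriminant network maximizes the adversarial loss while the segmentation network minimizes it.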
  • the electronic device may calculate the information entropy of the second target domain image; using the discriminant network of the second generative adversarial network, the second discrimination result is obtained according to the second source domain segmentation result, the second target domain segmentation result, and the information entropy of the second target domain image.
  • the second source domain segmentation loss is obtained according to the second source domain segmentation result and the labeling result of the source domain image; the second target domain segmentation loss is obtained according to the second target domain segmentation result and the second target domain image; the second discriminant loss of the discriminant network is obtained according to the second source domain segmentation result and the second target domain segmentation result.
  • the minimized adversarial loss of the second generative adversarial network is constructed according to the second source domain segmentation loss and the second target domain segmentation loss; the maximized adversarial loss of the second generative adversarial network is constructed according to the second discriminant loss; based on the minimized adversarial loss and the maximized adversarial loss, the second generative adversarial network is iteratively trained to obtain the trained second generative adversarial network.
  • the calculation method of each loss in the second generation confrontation network is similar to that of the first generation confrontation network, and the details can be seen in the above description.
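  • The per-pixel information entropy used by the discriminant network can be computed directly from the segmentation probability map; a minimal sketch, assuming a channel-first probability tensor:

```python
import numpy as np

def entropy_map(prob, eps=1e-8):
    # prob: (C, H, W) per-pixel class probabilities output by the
    # generation network; returns an (H, W) information-entropy map,
    # high where the prediction is uncertain.
    return -np.sum(prob * np.log(prob + eps), axis=0)
```

Uncertain pixels (near-uniform class probabilities) yield entropy close to log C, while confident pixels yield entropy near zero, which gives the discriminator a useful uncertainty signal on target domain predictions.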
  • The electronic device then determines a first source domain target image and a second source domain target image according to the first source domain segmentation loss and the second source domain segmentation loss. Specifically, the electronic device may sort the first source domain segmentation losses, select the source domain images that meet a preset loss condition according to the sorted first source domain segmentation losses, and determine them as the first source domain target images (i.e., the first source domain clean images); likewise, the second source domain segmentation losses are sorted, the source domain images that meet the preset loss condition are selected according to the sorted second source domain segmentation losses, and determined as the second source domain target images (i.e., the second source domain clean images). Each generation network delivers these clean images to its peer network for the next training round to update the parameters of the convolutional layers. In this way, each generation network re-selects the clean data it currently considers best and fine-tunes its peer network layer by layer, so that the peer networks supervise each other and the training error caused by noisy labels is reduced.
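  • The clean-image selection step above amounts to a small-loss trick: sort images by segmentation loss and keep the lowest-loss fraction for the peer network. A minimal sketch, where the keep ratio is an assumed hyperparameter not specified in the text:

```python
def select_clean(losses, keep_ratio=0.8):
    # losses: iterable of (image_id, segmentation_loss) pairs computed by
    # one generation network; the returned ids are handed to its peer
    # network as "clean" training data for the next round.
    ranked = sorted(losses, key=lambda pair: pair[1])
    k = max(1, int(len(ranked) * keep_ratio))
    return [image_id for image_id, _ in ranked[:k]]
```

The rationale is that noisy-label samples tend to incur larger losses early in training, so low-loss samples are more likely to be correctly labeled.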
  • The electronic device determines the first target domain target image and the second target domain target image according to the first target domain segmentation loss and the second target domain segmentation loss. Specifically, the electronic device may train the first generative adversarial network according to the first target domain segmentation loss and use the training result to generate the first target domain target image (that is, the pixel-level pseudo-label of the first target domain image); similarly, the second generative adversarial network is trained according to the second target domain segmentation loss, and the training result is used to generate the second target domain target image (that is, the pixel-level pseudo-label of the second target domain image). The segmentation network and the discriminant network are trained together in an alternate update manner.
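  • One common way to turn target-domain predictions into the pixel-level pseudo-labels described above is confidence-thresholded argmax; the threshold and ignore index below are illustrative assumptions, not values from the text:

```python
import numpy as np

def pseudo_labels(prob, conf_thresh=0.9, ignore_index=255):
    # prob: (C, H, W) class probabilities on a target domain image;
    # pixels whose top probability is below conf_thresh are marked with
    # ignore_index and excluded from the pseudo-supervised loss.
    labels = np.argmax(prob, axis=0)
    confidence = np.max(prob, axis=0)
    labels[confidence < conf_thresh] = ignore_index
    return labels
```

Thresholding keeps only confident predictions as supervision, which limits the noise that pseudo-labels feed back into training.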
  • The electronic device uses the second source domain target image and the second target domain target image to train the first generative adversarial network to obtain the trained first generative adversarial network. Specifically, the generation network of the first generative adversarial network may be used to segment the second source domain target image and the second target domain target image respectively, obtaining the second source domain target segmentation result and the second target domain target segmentation result. The information entropy of the second target domain target image is calculated, and the discriminant network of the first generative adversarial network obtains the second target discrimination result according to the second source domain target segmentation result, the second target domain target segmentation result, and the information entropy of the second target domain target image. The second source domain target segmentation loss is obtained according to the second source domain target segmentation result and the labeling result of the second source domain target image; the second target domain target segmentation loss is obtained according to the second target domain target segmentation result and the second target domain target image; and the second target discriminant loss of the discriminant network is obtained according to the second source domain target segmentation result and the second target domain target segmentation result. The minimized adversarial loss of the first generative adversarial network is constructed according to the second source domain target segmentation loss and the second target domain target segmentation loss; the maximized adversarial loss of the first generative adversarial network is constructed according to the second target discriminant loss; and based on the minimized adversarial loss and the maximized adversarial loss, the first generative adversarial network is iteratively trained to obtain the trained first generative adversarial network.
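  • The iterative min/max training described above alternates discriminator and generator updates. A schematic round is sketched below, with `disc_step` and `gen_step` standing in for the actual parameter updates (both are assumed callables, not part of the original text):

```python
def train_round(gen_step, disc_step, source_batches, target_batches):
    # One alternate-update round: for each (source, target) batch pair,
    # first update the discriminant network by maximizing the adversarial
    # loss, then update the generation network by minimizing the
    # segmentation losses plus the adversarial term.
    d_losses, g_losses = [], []
    for src, tgt in zip(source_batches, target_batches):
        d_losses.append(disc_step(src, tgt))   # maximization step
        g_losses.append(gen_step(src, tgt))    # minimization step
    return sum(d_losses) / len(d_losses), sum(g_losses) / len(g_losses)
```

Each round is run once per generative adversarial network, with the batches drawn from the clean images and pseudo-labels supplied by its peer.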
  • Similarly, the electronic device uses the first source domain target image and the first target domain target image to train the second generative adversarial network to obtain the trained second generative adversarial network. Specifically, the generation network of the second generative adversarial network may be used to segment the first source domain target image and the first target domain target image respectively, obtaining the first source domain target segmentation result and the first target domain target segmentation result. The information entropy of the first target domain target image is calculated, and the discriminant network of the second generative adversarial network obtains the first target discrimination result according to the first source domain target segmentation result, the first target domain target segmentation result, and the information entropy of the first target domain target image. The first source domain target segmentation loss is obtained according to the first source domain target segmentation result and the labeling result of the first source domain target image; the first target domain target segmentation loss is obtained according to the first target domain target segmentation result and the first target domain target image; and the first target discriminant loss of the discriminant network is obtained according to the first source domain target segmentation result and the first target domain target segmentation result. The minimized adversarial loss of the second generative adversarial network is constructed according to the first source domain target segmentation loss and the first target domain target segmentation loss, and the maximized adversarial loss of the second generative adversarial network is constructed according to the first target discriminant loss; based on the minimized adversarial loss and the maximized adversarial loss, the second generative adversarial network is iteratively trained. That is, the second generative adversarial network is trained according to the first source domain target segmentation loss, the first target domain target segmentation loss, and the first target discriminant loss to obtain the trained second generative adversarial network.
  • Finally, the electronic device segments the image to be segmented based on the generation network of the trained first generative adversarial network to obtain a segmentation result. Specifically, the electronic device may receive a fundus image collected by a medical imaging device, perform feature extraction on the fundus image based on the generation network of the trained first generative adversarial network to obtain the feature information of the fundus image, perform target segmentation on the fundus image based on that feature information to obtain the segmentation prediction probability of the fundus image, and generate the segmentation result of the fundus image according to the segmentation prediction probability.
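  • At inference time only the trained generation network is needed; a minimal sketch, with `generator` as an assumed callable returning per-class logits:

```python
import numpy as np

def segment(image, generator):
    # generator: trained generation network of the first generative
    # adversarial network, returning (C, H, W) logits for the image
    logits = generator(image)
    shifted = logits - logits.max(axis=0, keepdims=True)   # numerical stability
    prob = np.exp(shifted) / np.exp(shifted).sum(axis=0, keepdims=True)  # softmax
    return prob.argmax(axis=0)   # per-pixel segmentation map
```

The discriminant network is discarded after training; it exists only to align the source and target prediction distributions.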
  • The experimental results of the technique proposed in the present application are compared with several existing algorithms; the experimental results for tasks with different noise levels are shown in Table 1 and Table 2.
  • Table 1 shows the experimental results at the mild noise level from the REFUGE training set to the REFUGE validation set.
  • This scheme was evaluated on the REFUGE and Drishti-GS data sets; the above experimental results are also shown in Figure 2d.
  • BDL is a bidirectional learning method based on self-supervised learning, which is used to reduce the domain shift problem and learn a better segmentation model.
  • pOSAL is a method for the optic disc (OD) and optic cup (OC) segmentation task of the retinal fundus glaucoma challenge.
  • BEAL is an adversarial learning method based on boundary (edge) and entropy information.
  • DICE is a measure of the segmentation result, used to calculate the similarity between the label Y and the predicted value p; it is expressed as DICE = 2|Y ∩ p| / (|Y| + |p|).
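  • The DICE similarity can be implemented directly for binary masks:

```python
import numpy as np

def dice(y, p, eps=1e-8):
    # DICE = 2|Y ∩ P| / (|Y| + |P|): 1.0 for a perfect match, 0.0 for
    # disjoint masks; eps guards against division by zero on empty masks.
    y = y.astype(bool)
    p = p.astype(bool)
    inter = np.logical_and(y, p).sum()
    return (2.0 * inter + eps) / (y.sum() + p.sum() + eps)
```

For multi-class segmentation (e.g. OD and OC), the score is typically computed per class on the corresponding binary masks and then averaged.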
  • As can be seen from the above, the embodiment of the present application first obtains a target domain image and a source domain image labeled with target information; then uses the generation network in the first generative adversarial network to segment the source domain image and the target domain image respectively and determine the first source domain segmentation loss and the first target domain segmentation loss, and uses the generation network in the second generative adversarial network to segment the source domain image and the target domain image respectively and determine the second source domain segmentation loss and the second target domain segmentation loss. Next, the first source domain target image and the second source domain target image are determined according to the first source domain segmentation loss and the second source domain segmentation loss, and the first target domain target image and the second target domain target image are determined according to the first target domain segmentation loss and the second target domain segmentation loss. The first generative adversarial network and the second generative adversarial network are then cross-trained using the first source domain target image, the first target domain target image, the second source domain target image, and the second target domain target image to obtain the trained first generative adversarial network.
  • This scheme addresses the presence of noise in data labels and the distribution difference between the source domain and target domain data sets, and proposes an unsupervised robust segmentation method based on a domain-adaptive strategy. Through the mutual learning and mutual supervision of the two models, the tasks of noisy-label and unsupervised image segmentation are effectively handled, and the accuracy of image segmentation is improved.
  • It should be understood that the steps in the embodiments of the present application are not necessarily executed in the order indicated by the step numbers. Unless specifically stated herein, the execution order of these steps is not strictly limited, and they may be executed in other orders. Moreover, at least some of the steps in each embodiment may include multiple sub-steps or stages, which are not necessarily executed at the same time but may be executed at different times; their execution order is not necessarily sequential, and they may be executed in turn or alternately with at least part of the other steps, or of the sub-steps or stages of other steps.
  • In order to better implement the above method, an embodiment of the present application also provides an image segmentation device. The image segmentation device may be integrated in an electronic device, and the electronic device may be a server, a terminal, or a system including terminals and servers. The image segmentation device may include an acquisition unit 301, a first segmentation unit 302, a second segmentation unit 303, a determination unit 304, a training unit 305, and a third segmentation unit 306, as follows:
  • The acquisition unit 301 is configured to acquire a target domain image and a source domain image labeled with target information.
  • The first segmentation unit 302 is configured to use the generation network in the first generative adversarial network to segment the source domain image and the target domain image respectively, and determine the first source domain segmentation loss and the first target domain segmentation loss.
  • The second segmentation unit 303 is configured to use the generation network in the second generative adversarial network to segment the source domain image and the target domain image respectively, and determine the second source domain segmentation loss and the second target domain segmentation loss.
  • The determination unit 304 is configured to determine the first source domain target image and the second source domain target image according to the first source domain segmentation loss and the second source domain segmentation loss, and to determine the first target domain target image and the second target domain target image according to the first target domain segmentation loss and the second target domain segmentation loss.
  • The training unit 305 is configured to cross-train the first generative adversarial network and the second generative adversarial network using the first source domain target image, the first target domain target image, the second source domain target image, and the second target domain target image, to obtain the trained first generative adversarial network.
  • The third segmentation unit 306 is configured to segment the image to be segmented based on the generation network of the trained first generative adversarial network to obtain a segmentation result.
  • In some embodiments, the first segmentation unit 302 may include a first extraction subunit, a first segmentation subunit, and a second segmentation subunit, as follows:
  • The first extraction subunit is used to extract features of the source domain image and the target domain image by using the generation network in the first generative adversarial network, to obtain the feature information of the first source domain image and the feature information of the first target domain image.
  • The first segmentation subunit is configured to perform target segmentation on the source domain image based on the feature information of the first source domain image to determine the first source domain segmentation loss.
  • The second segmentation subunit is configured to perform target segmentation on the target domain image based on the feature information of the first target domain image to determine the first target domain segmentation loss.
  • In some embodiments, the source domain image includes a noise image and a noise-free image, and the first segmentation subunit is specifically configured to perform target segmentation on the noise image and the noise-free image in the source domain image based on the feature information of the first source domain image to determine a first noise segmentation loss and a first noise-free segmentation loss, and to determine the first source domain segmentation loss based on the first noise segmentation loss and the first noise-free segmentation loss.
  • In some embodiments, the second segmentation subunit is specifically configured to perform target segmentation on the target domain image based on the feature information of the first target domain image to obtain the first target domain segmentation probability; generate the first target domain segmentation result according to the first target domain segmentation probability; and obtain the first target domain segmentation loss according to the first target domain segmentation result and the target domain image.
  • In some embodiments, the second segmentation unit 303 may include a second extraction subunit, a third segmentation subunit, and a fourth segmentation subunit, as follows:
  • The second extraction subunit is used to extract features of the source domain image and the target domain image by using the generation network in the second generative adversarial network, to obtain the feature information of the second source domain image and the feature information of the second target domain image.
  • The third segmentation subunit is configured to perform target segmentation on the source domain image based on the feature information of the second source domain image to determine the second source domain segmentation loss.
  • The fourth segmentation subunit is configured to perform target segmentation on the target domain image based on the feature information of the second target domain image to determine the second target domain segmentation loss.
  • In some embodiments, the source domain image includes a noise image and a noise-free image, and the third segmentation subunit is specifically configured to perform target segmentation on the noise image and the noise-free image in the source domain image based on the feature information of the second source domain image to determine a second noise segmentation loss and a second noise-free segmentation loss, and to determine the second source domain segmentation loss based on the second noise segmentation loss and the second noise-free segmentation loss.
  • In some embodiments, the fourth segmentation subunit is specifically configured to perform target segmentation on the target domain image based on the feature information of the second target domain image to obtain the second target domain segmentation probability; generate the second target domain segmentation result according to the second target domain segmentation probability; and obtain the second target domain segmentation loss according to the second target domain segmentation result and the target domain image.
  • In some embodiments, the determination unit 304 may include a first determining subunit and a second determining subunit, as follows:
  • The first determining subunit may be specifically used to sort the first source domain segmentation losses, select the source domain images that meet a preset loss condition according to the sorted first source domain segmentation losses, and determine them as the first source domain target images; and to sort the second source domain segmentation losses, select the source domain images that meet the preset loss condition according to the sorted second source domain segmentation losses, and determine them as the second source domain target images.
  • The second determining subunit may be specifically used to train the first generative adversarial network according to the first target domain segmentation loss and use the training result to generate the first target domain target image, and to train the second generative adversarial network according to the second target domain segmentation loss and use the training result to generate the second target domain target image.
  • In some embodiments, the training unit 305 may include a first training subunit and a second training subunit, as follows:
  • The first training subunit may be specifically used to segment the second source domain target image and the second target domain target image respectively using the generation network of the first generative adversarial network, to obtain the second source domain target segmentation result and the second target domain target segmentation result; use the discriminant network of the first generative adversarial network to discriminate the second source domain target segmentation result and the second target domain target segmentation result to obtain the second target discrimination result; and train the first generative adversarial network according to the second source domain target segmentation result, the second target domain target segmentation result, and the second target discrimination result, to obtain the trained first generative adversarial network.
  • The second training subunit may be specifically used to segment the first source domain target image and the first target domain target image using the generation network of the second generative adversarial network, to obtain the first source domain target segmentation result and the first target domain target segmentation result; use the discriminant network of the second generative adversarial network to discriminate the first source domain target segmentation result and the first target domain target segmentation result to obtain the first target discrimination result; and train the second generative adversarial network according to the first source domain target segmentation result, the first target domain target segmentation result, and the first target discrimination result, to obtain the trained second generative adversarial network.
  • In some embodiments, the first training subunit may be specifically used to calculate the information entropy of the second target domain target image, and use the discriminant network of the first generative adversarial network to obtain the second target discrimination result according to the second source domain target segmentation result, the second target domain target segmentation result, and the information entropy of the second target domain target image.
  • In some embodiments, the first training subunit may be specifically used to obtain the second source domain target segmentation loss based on the second source domain target segmentation result and the labeling result of the second source domain target image; obtain the second target domain target segmentation loss according to the second target domain target segmentation result and the second target domain target image; obtain the second target discriminant loss of the discriminant network according to the second source domain target segmentation result and the second target domain target segmentation result; and train the first generative adversarial network according to the second source domain target segmentation loss, the second target domain target segmentation loss, and the second target discriminant loss, to obtain the trained first generative adversarial network.
  • In some embodiments, the first training subunit may be specifically used to construct the minimized adversarial loss of the first generative adversarial network according to the second source domain target segmentation loss and the second target domain target segmentation loss; construct the maximized adversarial loss of the first generative adversarial network according to the second target discriminant loss; and iteratively train the first generative adversarial network based on the minimized adversarial loss and the maximized adversarial loss, to obtain the trained first generative adversarial network.
  • In some embodiments, the second training subunit may be specifically used to calculate the information entropy of the first target domain target image, and use the discriminant network of the second generative adversarial network to obtain the first target discrimination result according to the first source domain target segmentation result, the first target domain target segmentation result, and the information entropy of the first target domain target image.
  • In some embodiments, the second training subunit may be specifically used to obtain the first source domain target segmentation loss based on the first source domain target segmentation result and the labeling result of the first source domain target image; obtain the first target domain target segmentation loss according to the first target domain target segmentation result and the first target domain target image; obtain the first target discriminant loss of the discriminant network according to the first source domain target segmentation result and the first target domain target segmentation result; and train the second generative adversarial network according to the first source domain target segmentation loss, the first target domain target segmentation loss, and the first target discriminant loss, to obtain the trained second generative adversarial network.
  • In some embodiments, the second training subunit may be specifically used to construct the minimized adversarial loss of the second generative adversarial network according to the first source domain target segmentation loss and the first target domain target segmentation loss; construct the maximized adversarial loss of the second generative adversarial network according to the first target discriminant loss; and iteratively train the second generative adversarial network based on the minimized adversarial loss and the maximized adversarial loss, to obtain the trained second generative adversarial network.
  • In specific implementation, each of the above units may be implemented as an independent entity, or combined arbitrarily and implemented as one or several entities. For the specific implementation of each unit, please refer to the previous method embodiments, which will not be repeated here.
  • As can be seen from the above, in this embodiment, the acquisition unit 301 first acquires the target domain image and the source domain image labeled with target information; the first segmentation unit 302 then uses the generation network in the first generative adversarial network to segment the source domain image and the target domain image respectively and determine the first source domain segmentation loss and the first target domain segmentation loss, and the second segmentation unit 303 uses the generation network in the second generative adversarial network to segment the source domain image and the target domain image respectively and determine the second source domain segmentation loss and the second target domain segmentation loss. The determination unit 304 determines the first source domain target image and the second source domain target image according to the first source domain segmentation loss and the second source domain segmentation loss, and determines the first target domain target image and the second target domain target image according to the first target domain segmentation loss and the second target domain segmentation loss. The training unit 305 then cross-trains the first generative adversarial network and the second generative adversarial network using the first source domain target image, the first target domain target image, the second source domain target image, and the second target domain target image to obtain the trained first generative adversarial network, after which the third segmentation unit 306 segments the image to be segmented based on the generation network of the trained first generative adversarial network to obtain the segmentation result. Because this solution addresses the presence of noise in data labels and the distribution difference between the source domain and target domain data sets, an unsupervised robust segmentation method based on a domain-adaptive strategy is adopted, which improves the accuracy of image segmentation.
  • Correspondingly, an embodiment of the present application also provides an electronic device. FIG. 4 shows a schematic structural diagram of the electronic device involved in the embodiment of the present application. Specifically, the electronic device may include a processor 401 with one or more processing cores, a memory 402 with one or more computer-readable storage media, a power supply 403, an input unit 404, and other components. Those skilled in the art can understand that the electronic device structure shown in FIG. 4 does not constitute a limitation on the electronic device, which may include more or fewer components than shown in the figure, combine certain components, or adopt a different component arrangement, wherein:
  • The processor 401 is the control center of the electronic device. It uses various interfaces and lines to connect the various parts of the entire electronic device, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, thereby monitoring the electronic device as a whole. In some embodiments, the processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may also not be integrated into the processor 401.
  • The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), and the like, and the data storage area may store data created according to the use of the electronic device, etc. In addition, the memory 402 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage device. Correspondingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
  • The electronic device also includes a power supply 403 for supplying power to the various components. Preferably, the power supply 403 may be logically connected to the processor 401 through a power management system, so that functions such as charging, discharging, and power consumption management are managed through the power management system. The power supply 403 may also include one or more DC or AC power supplies, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and any other such components.
  • The electronic device may further include an input unit 404, which can be used to receive input digital or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control. Although not shown, the electronic device may also include a display unit and the like, which will not be repeated here.
  • the memory 402 in the electronic device stores computer-readable instructions that can run on the processor 401, and the processor 401 implements the following steps when executing the computer-readable instructions:
  • the embodiment of the present application first obtains a target-domain image and a source-domain image annotated with target information; then uses the generation network in the first generative adversarial network to segment the source-domain image and the target-domain image respectively, determining a first source-domain segmentation loss and a first target-domain segmentation loss, and uses the generation network in the second generative adversarial network to segment the source-domain image and the target-domain image respectively, determining a second source-domain segmentation loss and a second target-domain segmentation loss; next, determines a first source-domain target image and a second source-domain target image according to the first source-domain segmentation loss and the second source-domain segmentation loss, and determines a first target-domain target image and a second target-domain target image according to the first target-domain segmentation loss and the second target-domain segmentation loss; and then cross-trains the first generative adversarial network and the second generative adversarial network using the first source-domain target image, the first target-domain target image, the second source-domain target image, and the second target-domain target image, obtaining a trained first generative adversarial network.
  • this scheme addresses the presence of noise in data labels and the distribution differences between the source-domain and target-domain datasets by proposing an unsupervised robust segmentation method based on a domain-adaptive strategy. Through mutual learning and mutual supervision between the two models, it effectively handles noisy-label and unsupervised image segmentation tasks and improves the accuracy of image segmentation.
  • the embodiments of the present application further provide one or more non-volatile storage media storing computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the processors perform the following steps:
  • the non-volatile storage medium may include: read only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk, etc.


Abstract

一种图像分割方法、装置和存储介质：先获取目标域图像、以及已标注目标信息的源域图像；再采用第一生成对抗网络中的生成网络以及第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割，确定第一源域分割损失、第一目标域分割损失、第二源域分割损失和第二目标域分割损失；接着，根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像，根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像；然后，利用上述目标图像对第一生成对抗网络和第二生成对抗网络进行交叉训练，得到训练后的第一生成对抗网络；再基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割，得到分割结果。

Description

图像分割方法、装置和存储介质
本申请要求于2020年02月10日提交中国专利局，申请号为202010084625.6，申请名称为“图像分割方法、装置和存储介质”的中国专利申请的优先权，其全部内容通过引用结合在本申请中。
技术领域
本申请涉及通信技术领域,具体涉及一种图像分割方法、装置和存储介质。
背景技术
随着人工智能（AI，Artificial Intelligence）的发展，AI在医疗领域的应用也越来越广泛，在各种医学图像分析任务中都取得了显著的成果，如图像分类、病灶检测、目标分割以及医学影像分析，特别是在医学影像的分割上，比如，可以应用AI技术从视网膜眼底图像中分割出视杯和视盘等。目前AI分割视杯和视盘的方案主要基于深度学习网络，具体可以训练一个可分割视杯和视盘的深度学习网络，然后，将待分割的眼底图像输入至训练后的深度学习网络进行特征提取，并基于特征进行视杯和视盘分割，得到分割结果，如青光眼分割图像等等。
在对相关技术的研究和实践过程中发现，所训练的深度卷积神经网络模型通常在对未出现过的数据进行测试时性能出现下降，特别是在训练（源域）数据和测试（目标域）数据之间存在显著的域迁移（domain shift）时。域迁移是生物医学领域的一个常见问题：由于生物医学图像是由不同成像方式或同一设备的不同设置采集的，不同的采集图像在纹理、颜色、形状等方面有差异性。因此，分割的准确性并不高。
发明内容
根据本申请提供的各种实施例,提供一种图像分割方法、装置和存储介质。
本申请实施例提供一种图像分割方法,包括:获取目标域图像、以及已标 注目标信息的源域图像;采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失;采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失;根据所述第一源域分割损失、所述第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据所述第一目标域分割损失、所述第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像;利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对所述第一生成对抗网络和所述第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络;及基于所述训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
相应的,本申请实施例还提供一种图像分割装置,包括:
获取单元,用于获取目标域图像、以及已标注目标信息的源域图像;
第一分割单元,用于采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失;
第二分割单元,用于采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失;
确定单元,用于根据所述第一源域分割损失、所述第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据所述第一目标域分割损失、所述第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像;
训练单元,用于利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对所述第一生成对抗网络和所述第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络;
第三分割单元,用于基于所述训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
此外,本申请实施例还提供一个或多个存储有计算机可读指令的非易失性存储介质,所述计算机可读指令被一个或多个处理器执行时,使得一个或多个处理器执行本申请实施例提供的任一种图像分割方法中的步骤。
此外,本申请实施例还提供一种电子设备,包括存储器,处理器及存储在 存储器上并可在处理器上运行的计算机可读指令,所述处理器执行所述程序时实现如本申请实施例提供的任一种图像分割方法中的步骤。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。本申请的其它特征、目的和优点将从说明书、附图以及权利要求书变得明显。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1a是本申请实施例提供的图像分割方法的场景示意图;
图1b是本申请实施例提供的图像分割方法的流程图;
图2a是本申请实施例提供的图像分割方法的另一流程图;
图2b是本申请实施例提供的图像分割方法的系统框架图;
图2c是本申请实施例提供的第一生成对抗网络的框架图;
图2d是本申请实施例提供的图像分割结果图;
图3是本申请实施例提供的图像分割装置的结构示意图;
图4是本申请实施例提供的电子设备的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请实施例提供一种图像分割方法、装置和存储介质。其中,该图像分割可以集成在电子设备中,该电子设备可以是服务器,也可以是终端等设备。
本申请实施例提供的图像分割方法涉及人工智能领域中的计算机视觉方向,可以通过人工智能的计算机视觉技术实现眼底图像分割,得到分割结果。
其中，人工智能（Artificial Intelligence，AI）是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能，感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。换句话说，人工智能是计算机科学的一个综合技术，它企图了解智能的实质，并生产出一种新的能以与人类智能相似的方式做出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法，使机器具有感知、推理与决策的功能。人工智能技术是一门综合学科，涉及领域广泛，既有硬件层面的技术也有软件层面的技术。其中，人工智能软件技术主要包括计算机视觉技术、机器学习/深度学习等方向。
其中,计算机视觉技术(Computer Vision,CV)是一门研究如何使机器“看”的科学,更进一步的说,就是指通过计算机代替人眼对目标进行识别、测量等的机器视觉,并进一步进行图像处理,使图像经过计算机处理成为更适合人眼观察或传送给仪器检测的图像。作为一个科学学科,计算机视觉研究相关的理论和技术,试图建立能够从图像或者多维数据中获取信息的人工智能系统。计算机视觉技术通常包括图像处理、图像识别等技术,还包括常见的人脸识别、人体姿态识别等生物特征识别技术。
本申请实施例中,所谓图像分割,是把图像分成若干个特定的、具有独特性质的区域,并提出感兴趣目标的计算机视觉技术和过程。在本申请实施例中,主要指的是对医学图像如眼底图像进行分割,找出所需的目标对象,比如,从眼底图像中分割出视杯、视盘等等。该分割出来的目标对象后续可以供医护人员或其他医学专家进行分析,以便做出进一步的操作。
例如,参见图1a,首先,该集成了图像分割装置的电子设备先获取目标域图像、以及已标注目标信息的源域图像,再采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,以及采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失,接着,根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像,然后利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗 网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络,再然后,基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
由于该方案利用的是两个生成对抗网络具有不同的结构和学习能力,可以互相学习、互相监督,并从自身的网络中选择干净的目标图像交给对等的网络继续训练,有效地提高了图像分割的准确性。
本实施例将从图像分割装置的角度进行描述,该图像分割装置具体可以集成在电子设备中,该电子设备可以是服务器,也可以是终端,还可以是包括服务器和终端的系统。当电子设备为包括服务器和终端的系统时,本申请实施例的图像分割方法通过终端和服务器的交互实现。
其中,该终端具体可以是台式终端或移动终端,移动终端具体可以手机、平板电脑、笔记本电脑等中的至少一种。服务器可以用独立的服务器或者是多个服务器组成的服务器集群来实现。
如图1b所示,该图像分割方法的具体流程可以如下:
101、获取目标域图像、以及已标注目标信息的源域图像。
其中,源域图像指的是可以提供丰富的标注信息的医学图像,目标域图像指的是测试数据集所在的领域,缺少标注信息的医学图像。例如,源域图像具体可以由各医学图像采集设备,比如电子计算机断层扫描仪(Computed Tomography,CT)、或核磁共振成像仪等来对生命体组织进行图像采集,由专业人员对图像进行标注,比如由图像科医师标注进而提供给该图像分割装置,即,图像分割装置具体可以接收医学图像采集设备发送的医学图像样本。
其中，医学图像指的是在医疗或医学研究中，以非侵入方式取得生命体或生命体某部分内部组织的图像，比如人体的脑部、肠胃、肝脏、心脏、喉咙和阴道等图像，这些图像可以是CT图像、核磁共振图像或者正子发射断层扫描影像等等。而生命体指的是有生命形态的独立个体，比如人或动物等。源域图像可以指的是已经由医学图像采集设备采集，通过各种途径获取到的图像，比如从数据库或网络等，可以是经由专业人员对图像进行特定意义标注的图像样本，也可以是未经任何处理的图像样本。
102、采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失。
其中,第一生成对抗网络的结构和参数可以根据实际情况进行设定以及调整等。比如,第一生成对抗网络中的生成网络可以以残差网络101(ResNet101)为主要框架的DeepLabv2作为基础模型,实现了初步的分割结果。同时,增加了空间金字塔(Atrous Spatial Pyramid Pooling,ASPP)结构,丰富了特征图的多尺度信息。为了增强网络的特征表达能力,提出了一个基于双重注意网络(Dual Attention Network,DANet)的注意机制,学习如何捕捉像素和特征层通道之间的上下文依赖关系,将注意力模块的输出与空间金字塔结构的输出相连接,生成最终的分割特征。而第一生成对抗网络中的判别网络可以采用多层全卷积网络,将源域图像和目标域图像的分割概率融合到对抗学习中,并且可以在除了最后一层的所有卷积层后都添加了一个泄露修正线性单元(Leaky Rectified Linear Unit,Leaky ReLU)激活函数层,最终输出一个单通道的2D结果,用0和1分别表示源域和目标域。
例如,具体可以采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行特征提取,得到第一源域图像的特征信息和第一目标域图像的特征信息,基于该第一源域图像的特征信息,对该源域图像进行目标分割,确定第一源域分割损失,基于该第一目标域图像的特征信息,对该目标域图像进行目标分割,确定第一目标域分割损失。
其中，确定第一源域分割损失的方式可以有很多种，比如，可以在源域图像中引入对抗噪声标签的权重图（distance map）。由于医学标签在目标区域的边界位置上的标记有很大差异，为了防止网络拟合噪声标签，提出了一种新的抗噪分割损失，从噪声样本中学习有用的像素级信息，过滤掉边缘有噪声的区域。
例如，该源域图像包括噪声图像和无噪声图像，具体可以基于该第一源域图像的特征信息，对该源域图像中噪声图像进行目标分割，得到第一噪声分割概率；获取源域图像中噪声图像的权重图；根据该第一噪声分割概率，和噪声图像的权重图，获取第一噪声分割损失；基于该第一源域图像的特征信息，对该源域图像中无噪声图像进行目标分割，得到第一无噪声分割概率；根据第一无噪声分割概率，和无噪声图像的标注结果，获取第一无噪声分割损失；基于该第一噪声分割损失和该第一无噪声分割损失，确定第一源域分割损失。根据该第一噪声分割概率，生成第一噪声分割结果。
其中,第一噪声分割损失的具体公式可以如下:
Figure PCTCN2020124673-appb-000001
Figure PCTCN2020124673-appb-000002
其中，在公式(1)中，h×w×c分别表示图像数据的长度、宽度和类别数，λ_1和λ_2是两种损失的权重系数，W(y_i)表示权重图；公式第一项基于交叉熵损失，第二项基于dice损失。在公式(2)中，w_c表示平衡类别的加权权重系数；对于每一个噪声标签y_i，计算标签上像素点到最近边界的距离d(y_i)，并在类级别区域中获得距离d(y_i)的最大值max_dis。两个网络交换各自认为损失小的干净数据（clean data）时，计算两个网络对于clean data预测值的dice_co：当dice_co大于阈值μ时，说明两个网络对该样本产生分歧，将该样本视为噪声样本（noisy data），加上抗噪分割损失改进对噪声样本的学习，计算生成网络的损失L_noise；否则保持交叉熵和dice损失的原始方式来计算损失。对于权重映射W(y_i)，每个类中区域的中心具有较大的权重，并且越靠近边界，权重越小。L_noise使得网络能够捕获中心的关键位置，并在各种噪声标签下滤除边界上的差异。
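基于上述描述，抗噪分割损失可以用如下纯Python示例示意（权重图W(y_i)的具体函数形式、λ_1/λ_2的取值均为本示例的假设，并非原公式；仅用于演示“中心权重大、边界权重小”的加权思想）：

```python
import math

def weight_map(d, max_dis, w_c=1.0):
    # 公式(2)的一种示意形式（假设）：像素离边界越远（越接近区域中心），权重越大；
    # w_c 为平衡类别的加权系数，d 为像素到最近边界的距离，max_dis 为类级区域内的最大距离
    return w_c * (d / max_dis)

def dice_loss(pred, label, eps=1e-6):
    # 基于 dice 系数的损失：预测与标签越相似，损失越小
    inter = sum(p * y for p, y in zip(pred, label))
    return 1.0 - (2.0 * inter + eps) / (sum(pred) + sum(label) + eps)

def noise_seg_loss(pred, label, dist, max_dis, lam1=1.0, lam2=1.0):
    # 公式(1)的示意实现：第一项为按 W(y_i) 加权的交叉熵，第二项为 dice 损失；
    # λ1、λ2 为两种损失的权重系数（此处取值仅为示例）
    n = len(pred)
    ce = -sum(weight_map(d, max_dis) *
              (y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
              for p, y, d in zip(pred, label, dist)) / n
    return lam1 * ce + lam2 * dice_loss(pred, label)
```

可以看到，同样的预测误差出现在区域中心（距离边界远）时损失更大，出现在边界附近时被降权滤除。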
其中，对于目标域数据集来说，没有像素级的语义标签，因此可以将整个任务看作一个无监督的图像分割问题。本申请通过添加“自监督”信息的方式，即利用目标域图像的分割结果来生成像素级的伪标签，并将其应用在下一训练阶段中。在目标域图像的分割概率结果中，对于任一像素点而言，若某一类别的预测置信度高于置信阈值，则在该像素位置生成一个对应类别的伪标签。这里的置信阈值采用自适应的设置方式，对目标域图像中每个类和每个样本中每个伪标签的置信度进行排序，自适应地选取类级和图像级预测置信度最高的像素点，生成像素级伪标签，作为下一训练阶段的交叉监督信息。为了保证生成的伪标签的正确性，采用了一种“从易到难”的策略，即以迭代的方式训练模型，不断生成更准确的伪标签。
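上述伪标签生成策略可以示意如下（数据结构与keep_ratio的取值为本示例的假设；实际实现中置信度排序是按类级和图像级分别进行的）：

```python
def generate_pseudo_labels(prob_map, keep_ratio=0.5, ignore=-1):
    # prob_map：每个像素为 {类别: 预测置信度} 的字典（示意数据结构，非专利原文）。
    # 自适应阈值：按各像素最高置信度排序，仅保留置信度最高的前 keep_ratio 部分像素，
    # 其余像素不生成伪标签（标记为 ignore），体现“从易到难”的策略
    best = [max(p.items(), key=lambda kv: kv[1]) for p in prob_map]
    order = sorted(range(len(best)), key=lambda i: best[i][1], reverse=True)
    keep = set(order[:int(len(best) * keep_ratio)])
    return [best[i][0] if i in keep else ignore for i in range(len(best))]
```

随着模型迭代，预测置信度提高，可被保留的像素比例逐步增大，伪标签也随之越来越准确。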
比如,具体可以基于该第一目标域图像的特征信息,对该目标域图像进行目标分割,得到第一目标域分割概率;根据该第一目标域分割概率,生成第一目标域分割结果;根据该第一目标域分割结果、和该目标域图像,获取第一目标域分割损失。
接着，将第一生成对抗网络中的生成网络输出的分割结果，即第一源域分割概率P_S和第一目标域分割概率P_T，同时输入到第一生成对抗网络中的判别网络中，并利用由P_T生成的信息熵结果来计算对抗损失L_D，同时通过最大化对抗损失来更新判别网络的参数。随后，对抗损失函数产生的误差也会被反传回给生成网络，通过最小化对抗损失来更新分割网络的参数，目的是使得生成网络对源域图像和目标域图像预测出的分割结果能够越来越相似，实现领域自适应。
例如,具体可以在得到第一源域分割结果和第一目标域分割结果后,采用第一生成对抗网络的判别网络,对第一源域分割结果和第一目标域分割结果进行判别,得到第一判别结果;根据第一源域分割结果、第一目标域分割结果和第一判别结果对第一生成对抗网络进行训练,得到训练后的第一生成对抗网络。
其中,对第一源域分割结果和第一目标域分割结果进行判别的方式可以有很多种,比如,具体可以用于计算第一目标域图像的信息熵;采用第一生成对抗网络的判别网络,根据该第一源域分割结果、第一目标域分割结果、以及第一目标域图像的信息熵,得到第一判别结果。
其中,根据第一源域分割结果、第一目标域分割结果和第一判别结果对第一生成对抗网络进行训练的方式也可以有很多种,比如,具体可以根据该第一源域分割结果,和第一源域图像的标注结果,获取第一源域分割损失;根据该第一目标域分割结果,和第一目标域图像,获取第一目标域分割损失;根据第一源域分割结果和第一目标域分割结果,获取判别网络的第一判别损失;根据 第一源域分割损失、第一目标域分割损失和第一判别损失对第一生成对抗网络进行训练,得到训练后的第一生成对抗网络。
其中,根据第一源域分割损失、第一目标域分割损失和第一判别损失对第一生成对抗网络进行训练的方式也可以有很多种,比如,具体可以根据该第一源域分割损失和该第一目标域分割损失构建第一生成对抗网络的极小化对抗损失;根据该第一判别损失构建第一生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第一生成对抗网络进行迭代训练,得到训练后的第一生成对抗网络。
其中，整体目标函数由极小化对抗损失和极大化对抗损失构成，即由最大最小优化得到（公式(3)）。对于源域图像X_S和目标域图像X_T，Y_S是源域图像的标签，Ŷ_T是目标域图像在训练过程中生成的伪标签；L_seg是整个分割网络（即生成网络）的损失函数，其中分割概率P=G(X)∈R^(H×W×C)，分割网络的分割损失由源域分割损失和目标域分割损失两部分构成（公式(4)）。其中，源域分割损失L_seg^S定义为第一噪声分割损失与第一无噪声分割损失的加权组合（公式(5)、(6)）：L_noise是第一噪声分割损失，L_clean是对于带有干净可靠标签数据（clean data）的分割损失，即第一无噪声分割损失，α是平衡L_clean和L_noise的系数。
其中,判别网络的对抗损失的计算可以如下:
L_D = λ_adv·L_adv(X_S, X_T)          (7)
其中，λ_adv是对抗损失在训练过程中用来平衡损失关系的参数，L_adv可以表示为：
L_adv(X_S, X_T) = -E[log(D(G(X_S)))] - E[(λ_entr·f(X_T)+ε)·log(1-D(G(X_T)))]          (8)
其中，λ_entr是信息熵结果图对应的权重参数，ε的加入是为了在f(X_T)取值很小的情况下保证训练过程的稳定。f(X_T)是目标域图像的信息熵计算结果，可以表示为：
f(X_T) = -∑_c P_T^(h,w,c)·log(P_T^(h,w,c))          (9)
在目标域图像的逐像素预测中引入信息熵图(entropy map),然后根据预测将“熵图”乘在判别器对各像素点计算的对抗损失上,以增加具有不确定性的像素点(高熵值)的损失权重,并减少确定性的损失权重(低熵值)。在熵图映射的驱动下,帮助网络学习如何关注类别上最具代表性的特征。
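上述熵图加权的做法可以用如下纯Python示例说明（lam_entr与eps的取值仅为示意；加权方式对应公式(8)中的(λ_entr·f(X_T)+ε)项）：

```python
import math

def entropy_map(prob):
    # f(X_T)：对每个像素的类别概率分布计算信息熵，熵越高表示预测越不确定
    return [-sum(p * math.log(p + 1e-12) for p in pixel) for pixel in prob]

def entropy_weighted_adv_loss(pixel_adv_loss, prob, lam_entr=1.0, eps=0.1):
    # 将熵图乘在各像素的对抗损失上：高熵（不确定）像素获得更大的损失权重，
    # 低熵（确定）像素的损失权重被降低
    ent = entropy_map(prob)
    return sum((lam_entr * e + eps) * l
               for e, l in zip(ent, pixel_adv_loss)) / len(ent)
```

其中均匀分布的像素（如两类概率各0.5）熵最大、权重最高，而置信度接近1的像素几乎不贡献对抗损失。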
103、采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失。
其中，第二生成对抗网络的训练与第一生成对抗网络类似，只是使用了不同的结构和参数。比如，对于第二生成对抗网络N2，可以采用DeepLabv3+架构；为了减少参数的数量和计算成本，我们用轻量级网络MobileNetV2作为基础模型。网络N2利用MobileNetV2的第一层卷积层和随后的7个残差模块（Residual block）来提取特征。类似于网络N1，同样加入了ASPP模块对不同感受野下的潜在特征进行学习，利用具有不同膨胀率（dilate rate）的ASPP生成多尺度特征，将不同层次的语义信息整合到特征映射中；对特征映射进行上采样，然后进行卷积组合上述特征，并将其与低级特征连接，以进行细粒度语义分割。
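上述“用不同膨胀率的空洞卷积提取多尺度特征”的思想可以用如下一维纯Python示例说明（ASPP实际作用于二维特征图，这里的函数形式与膨胀率取值均为示意假设）：

```python
def dilated_conv1d(x, kernel, rate):
    # 一维空洞（膨胀）卷积示意：膨胀率 rate 决定卷积核采样点之间的间隔，
    # 在不增加参数量的情况下扩大感受野
    k = len(kernel)
    span = (k - 1) * rate + 1  # 卷积核覆盖的实际跨度
    return [sum(kernel[j] * x[i + j * rate] for j in range(k))
            for i in range(len(x) - span + 1)]

def aspp_1d(x, kernel, rates=(1, 2, 4)):
    # ASPP 的一维示意：用不同膨胀率并行提取特征，得到多尺度特征列表，再行拼接
    return [dilated_conv1d(x, kernel, r) for r in rates]
```

膨胀率越大，单个输出值“看到”的输入范围越宽，从而捕获更大尺度的语义信息。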
例如,具体可以采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行特征提取,得到第二源域图像的特征信息和第二目标域图像的特征信息,基于该第二源域图像的特征信息,对该源域图像进行目标分割,确定第二源域分割损失,基于该第二目标域图像的特征信息,对该目标域图像进行目标分割,确定第二目标域分割损失。
其中，确定第二源域分割损失的方式可以有很多种，比如，可以在源域图像中引入对抗噪声标签的权重图（distance map）。由于医学标签在目标区域的边界位置上的标记有很大差异，为了防止网络拟合噪声标签，提出了一种新的抗噪分割损失，从噪声样本中学习有用的像素级信息，过滤掉边缘有噪声的区域。
例如,该源域图像包括噪声图像和无噪声图像,具体可以基于该第二源域图像的特征信息,对该源域图像中噪声图像进行目标分割,得到第二噪声分割概率;获取源域图像中噪声图像的权重图;根据该第二噪声分割概率,和噪声图像的权重图,获取第二噪声分割损失;基于该第二源域图像的特征信息,对该源域图像中无噪声图像进行目标分割,得到第二无噪声分割概率;根据第二无噪声分割概率,和无噪声图像的标注结果,获取第二无噪声分割损失;基于该第二噪声分割损失和该第二无噪声分割损失,确定第二源域分割损失。
其中,第二噪声分割损失的具体计算方式可以参照上述第一噪声分割损失的计算方式。
其中,对于目标域图像来说,具体的训练方式与预设第一生成网络类似,也可以通过添加“自监督”信息的方式,即利用目标域图像的分割结果来生成像素级的伪标签,并将其应用在下一训练阶段中。比如,具体可以基于该第二目标域图像的特征信息,对该目标域图像进行目标分割,得到第二目标域分割概率;根据该第二目标域分割概率,生成第二目标域分割结果;根据该第二目标域分割结果、和该目标域图像,获取第二目标域分割损失。
接着，将第二生成对抗网络中的生成网络输出的分割结果，即第二源域分割概率P_S和第二目标域分割概率P_T，同时输入到第二生成对抗网络中的判别网络中，并利用由P_T生成的信息熵结果来计算对抗损失L_D，同时通过最大化对抗损失来更新判别网络的参数。随后，对抗损失函数产生的误差也会被反传回给生成网络，通过最小化对抗损失来更新分割网络的参数，目的是使得生成网络对源域图像和目标域图像预测出的分割结果能够越来越相似，实现领域自适应。
例如,具体可以在得到第二源域分割结果和第二目标域分割结果后,采用第二生成对抗网络的判别网络,对第二源域分割结果和第二目标域分割结果进行判别,得到第二判别结果;根据第二源域分割结果、第二目标域分割结果和第二判别结果对第二生成对抗网络进行训练,得到训练后的第二生成对抗网络。
其中，对第二源域分割结果和第二目标域分割结果进行判别的方式可以有很多种，比如，具体可以用于计算第二目标域图像的信息熵；采用第二生成对抗网络的判别网络，根据该第二源域分割结果、第二目标域分割结果、以及第二目标域图像的信息熵，得到第二判别结果。
其中,根据第二源域分割结果、第二目标域分割结果和第二判别结果对第二生成对抗网络进行训练的方式也可以有很多种,比如,具体可以根据该第二源域分割结果,和第二源域图像的标注结果,获取第二源域分割损失;根据该第二目标域分割结果,和第二目标域图像,获取第二目标域分割损失;根据第二源域分割结果和第二目标域分割结果,获取判别网络的第二判别损失;根据第二源域分割损失、第二目标域分割损失和第二判别损失对第二生成对抗网络进行训练,得到训练后的第二生成对抗网络。
其中,根据第二源域分割损失、第二目标域分割损失和第二判别损失对第二生成对抗网络进行训练的方式也可以有很多种,比如,具体可以根据该第二源域分割损失和该第二目标域分割损失构建第二生成对抗网络的极小化对抗损失;根据该第二判别损失构建第二生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第二生成对抗网络进行迭代训练,得到训练后的第二生成对抗网络。
其中,第二生成对抗网络中各个损失的计算方法与第一生成对抗网络类似,详细可见上文描述。
104、根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像。
在训练过程中,采用交叉训练的训练方式,通过每一阶段从两个不同生成网络选择的源域干净图像数据,来逐步更新网络参数。具体训练步骤如下:步骤1,在N次迭代之后,每个生成对抗网络对所有预测值的分割损失进行排序,两个网络分别选择小损失样本C 1和C 2作为干净数据。步骤2,每个网络将这些有用的样本交给其对等网络以进行下一个训练过程,然后更新卷积层的参数。步骤3,每个生成网络重新选择当前认为是最佳的干净数据,并以分层的方式微调其对等网络。由于两个网络具有不同的结构和学习能力,它们可以过滤由噪声标签引入的不同类型的错误。在这个交换过程中,对等网络可以互相监督, 减少噪声标签带来的训练误差。
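上述“选择小损失样本并交给对等网络”的交叉训练步骤可以示意如下（keep_ratio等参数取值为本示例的假设）：

```python
def select_clean(losses, keep_ratio=0.8):
    # 步骤1：对每个样本的分割损失从小到大排序，选取小损失样本作为“干净数据”
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return order[:int(len(losses) * keep_ratio)]

def exchange_clean(losses_n1, losses_n2, keep_ratio=0.5):
    # 步骤2：两个网络各自选出小损失样本 C1、C2，并交给对等网络继续训练（同行监督）
    c1 = select_clean(losses_n1, keep_ratio)
    c2 = select_clean(losses_n2, keep_ratio)
    return {"train_n1_with": c2, "train_n2_with": c1}
```

由于两个网络的结构与决策边界不同，交换小损失样本可以过滤掉彼此由噪声标签引入的不同类型错误。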
例如,具体可以对该第一源域分割损失进行排序,根据排序后的第一源域分割损失选取满足预设损失条件的源域图像,确定为第一源域目标图像;对该第二源域分割损失进行排序,根据排序后的第二源域分割损失选取满足预设损失条件的源域图像,确定为第二源域目标图像。
其中，预设损失条件例如可以是预设的损失阈值，相应地，满足预设损失条件的源域图像例如可以是源域分割损失小于该损失阈值的源域图像。预设损失条件也可以是损失最小，相应地，满足预设损失条件的源域图像指的是所有源域图像中源域分割损失最小的源域图像。
对于目标域图像，通过每一阶段两个生成网络对目标域图像生成的伪标签交叉学习，来更新网络参数。具体训练步骤如下：步骤1，将两个网络前一阶段对目标域的训练结果PL_1和PL_2作为伪标签。步骤2，将伪标签应用在下一阶段另一个网络训练过程中，以迭代的方式更新网络参数。而在每一阶段中，分割网络和判别网络以交替更新的方式一起训练。我们首先将图像数据输入到分割网络中，利用源域数据的真实标签和目标域数据的伪标签来计算分割损失L_seg，并通过最小化分割损失来更新分割网络的参数。
例如,具体可以根据该第一目标域分割损失对该第一生成对抗网络进行训练,利用训练结果生成第一目标域目标图像;根据该第二目标域分割损失对该第二生成对抗网络进行训练,利用训练结果生成第二目标域目标图像。
105、利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络。
例如,可以利用第一源域目标图像和第一目标域目标图像对第二生成对抗网络进行训练,利用第二源域目标图像和第二目标域目标图像对第一生成对抗网络进行训练。
例如,具体可以采用第一生成对抗网络的生成网络,分别对第二源域目标图像和第二目标域目标图像进行分割,得到第二源域目标分割结果和第二目标域目标分割结果;采用第一生成对抗网络的判别网络,对第二源域目标分割结 果和第二目标域目标分割结果进行判别,得到第二目标判别结果;根据第二源域目标分割结果、第二目标域目标分割结果和第二目标判别结果对第一生成对抗网络进行训练,得到训练后的第一生成对抗网络。
其中,对第二源域目标分割结果和第二目标域目标分割结果进行判别的方式可以有很多种,比如,具体可以用于计算第二目标域目标图像的信息熵;采用第一生成对抗网络的判别网络,根据该第二源域目标分割结果、第二目标域目标分割结果、以及第二目标域目标图像的信息熵,得到第二目标判别结果。
其中,根据第二源域目标分割结果、第二目标域目标分割结果和第二目标判别结果对第一生成对抗网络进行训练的方式也可以有很多种,比如,具体可以根据该第二源域目标分割结果,和第二源域目标图像的标注结果,获取第二源域目标分割损失;根据该第二目标域目标分割结果,和第二目标域目标图像,获取第二目标域目标分割损失;根据第二源域目标分割结果和第二目标域目标分割结果,获取判别网络的第二目标判别损失;根据第二源域目标分割损失、第二目标域目标分割损失和第二目标判别损失对第一生成对抗网络进行训练,得到训练后的第一生成对抗网络。
其中,根据第二源域目标分割损失、第二目标域目标分割损失和第二目标判别损失对第一生成对抗网络进行训练的方式也可以有很多种,比如,具体可以根据该第二源域目标分割损失和该第二目标域目标分割损失构建第一生成对抗网络的极小化对抗损失;根据该第二目标判别损失构建第一生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第一生成对抗网络进行迭代训练,得到训练后的第一生成对抗网络。
其中,利用第一源域目标图像、第一目标域目标图像对第二生成对抗网络进行训练的方式与第二生成对抗网络的训练方式类似。例如,具体可以采用第二生成对抗网络的生成网络,分别对第一源域目标图像和第一目标域目标图像进行分割,得到第一源域目标分割结果和第一目标域目标分割结果;采用第二生成对抗网络的判别网络,对第一源域目标分割结果和第一目标域目标分割结果进行判别,得到第一目标判别结果;根据第一源域目标分割结果、第一目标域目标分割结果和第一目标判别结果对第二生成对抗网络进行训练,得到训练 后的第二生成对抗网络。
可选的,在一些实施例中,采用第二生成对抗网络的判别网络,对第一源域目标分割结果和第一目标域目标分割结果进行判别,具体可以计算第一目标域目标图像的信息熵,采用第二生成对抗网络的判别网络,根据该第一源域目标分割结果、第一目标域目标分割结果、以及第一目标域目标图像的信息熵,得到第一目标判别结果。
可选的,在一些实施例中,根据第一源域目标分割结果、第一目标域目标分割结果和第一目标判别结果对第二生成对抗网络进行训练,具体可以根据该第一源域目标分割结果,和第一源域目标图像的标注结果,获取第一源域目标分割损失;根据该第一目标域目标分割结果,和第一目标域目标图像,获取第一目标域目标分割损失;根据第一源域目标分割结果和第一目标域目标分割结果,获取判别网络的第一目标判别损失;根据第一源域目标分割损失、第一目标域目标分割损失和第一目标判别损失对第二生成对抗网络进行训练,得到训练后的第二生成对抗网络。
可选的,在一些实施例中,具体可以根据该第一源域目标分割损失和该第一目标域目标分割损失构建第二生成对抗网络的极小化对抗损失;根据该第一目标判别损失构建第二生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第二生成对抗网络进行迭代训练,得到训练后的第二生成对抗网络。
106、基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
例如,具体可以基于该训练后的第一生成对抗网络的生成网络对待分割图像进行特征提取,得到待分割图像的特征信息,基于该待分割图像的特征信息对该待分割图像进行目标分割,得到待分割图像的分割预测概率,根据该分割预测概率生成该待分割图像的分割结果。
其中，待分割图像指的是需要进行分割的图像，比如医学图像（如心脏，肺等等）或者一些普通图像（如人、物体）等等。比如，待分割图像为医学图像时，可以由各医学图像采集设备，比如电子计算机断层扫描仪或核磁共振成像仪等来对生命体组织进行图像采集，比如人体的脑部、肠胃、肝脏、心脏、喉咙和阴道等，进而提供给该医学图像检测装置，即，医学图像检测装置具体可以接收医学图像采集设备发送的待分割图像。
由上可知,本申请实施例先获取目标域图像、以及已标注目标信息的源域图像,再采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,以及采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失,接着,根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像,然后利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络,再然后,基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果;由于该方案主要针对数据的标签存在着噪声,以及源域和目标域数据集之间的分布差异现象,提出了一种基于领域自适应策略的无监督鲁棒性分割方法。通过两个模型互相学习、互相监督的方式,有效地解决了噪声标签和无监督的图像分割任务,提高了图像分割的准确性。根据上一个实施例所描述的方法,以下将以青光眼视杯和视盘的准确分割进行举例来作进一步详细说明。
为了确保算法能在临床中真正起到辅助诊断的作用,需要提升图像分割的准确性。本申请实施例提供了基于噪声标签数据的鲁棒的无监督领域自适应分割方法,能够学习已有标注的数据集上的特征结构并将知识迁移到新数据集上,为没有标注的新数据集提供较为准确的图像分割,有效提升深度网络的在其他数据集上的泛化性能。
本申请实施例中无监督领域自适应的训练方法,可以通过领域对抗方式对包含图像分割网络(作为生成网络)的生成对抗网络进行训练,然后,采用训练后生成对抗网络中的生成网络对无标注的待分割图像进行分割等。在本实施例中,将以该图像分割装置具体集成在电子设备为例进行说明。
如图2a所示,本申请实施例提供一种图像分割方法,具体流程可以如下:
201、电子设备获取目标域图像、以及已标注目标信息的源域图像。
具体地，在两个眼底图像数据集的自适应分割任务上，采用REFUGE和Drishti-GS数据集。由于其训练集和验证集（或测试集）由不同采集设备拍摄，因此图像在颜色和纹理等方面存在差别。将REFUGE数据集的训练集作为源域训练集，REFUGE数据集的验证集和Drishti-GS数据集的验证集作为目标域训练集，REFUGE数据集的测试集和Drishti-GS数据集的测试集作为目标域测试集。对于REFUGE数据集，训练集包含400张图像，图像大小为2124×2056；验证集包含300张图像；测试集包含100张图像，图像大小为1634×1634。对于Drishti-GS数据集，验证集包含50张图像，测试集包含51张图像，图像大小为2047×1759。
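上述数据集划分可以用如下配置示意（图像数量与划分方式来自上文描述，字典结构与变量名为本示例的假设）：

```python
# 各数据集的子集规模（张数）
datasets = {
    "REFUGE": {"train": 400, "val": 300, "test": 100},
    "Drishti-GS": {"val": 50, "test": 51},
}
source_train = [("REFUGE", "train")]                        # 源域训练集
target_train = [("REFUGE", "val"), ("Drishti-GS", "val")]   # 目标域训练集
target_test = [("REFUGE", "test"), ("Drishti-GS", "test")]  # 目标域测试集

def count(splits):
    # 统计某一域划分中包含的图像总数
    return sum(datasets[d][s] for d, s in splits)
```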
本申请针对源域和目标域数据集之间的分布差异现象,提出了一种基于领域自适应策略的无监督鲁棒性分割方法,通过两个模型互相学习、互相监督的方式,有效地解决了无监督的图像分割任务。其中,该鲁棒性分割方法框架,如图2b所示,由两个生成对抗网络组成,即N1和N2,N1包括生成网络(也称分割网络)S1和判别网络D1。N2包括生成网络(也称分割网络)S2,和判别网络D2。这两个网络具有不同的结构和参数,由于两个网络结构和参数的差异性,它们可以产生不同的决策边界,即具有不同的学习能力,从而促进网络间的同行监督。其中,同行监督(peer-review)是两个网络互相监督的一种策略,通过两个网络互相交换小损失数据和伪标签,来提高网络的性能。
202、电子设备采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失。
比如,如图2c所示,第一生成对抗网络中的生成网络可以以ResNet101为主要框架的DeepLabv2作为基础模型,实现了初步的分割结果。同时,增加了ASPP结构,丰富了特征图的多尺度信息。为了增强网络的特征表达能力,提出了一个基于DANet的注意机制,学习如何捕捉像素和特征层通道之间的上下文依赖关系,将注意力模块的输出与空间金字塔结构的输出相连接,生成最终的分割特征。
比如,源域图像包括噪声图像和无噪声图像,可以在源域图像中的噪声图像引入对抗噪声标签的权重图,由于医学标签在目标区域的边界位置上的标记有很大差异,为了防止网络拟合噪声标签,提出了一种新的抗噪分割损失,从噪声样本中学习有用的像素级信息,过滤掉边缘有噪声的区域。
例如，电子设备具体可以采用第一生成对抗网络中的生成网络对源域图像进行特征提取，得到第一源域图像的特征信息；基于该第一源域图像的特征信息，对该源域图像中的噪声图像进行目标分割，得到第一噪声分割概率，获取源域图像中噪声图像的权重图，根据该第一噪声分割概率，和噪声图像的权重图，获取第一噪声分割损失；基于该第一源域图像的特征信息，对该源域图像中无噪声图像进行目标分割，得到第一无噪声分割概率；根据第一无噪声分割概率，和无噪声图像的标注结果，获取第一无噪声分割损失；基于该第一噪声分割损失和该第一无噪声分割损失，确定第一源域分割损失。根据该第一噪声分割概率，生成第一噪声分割结果。其中，第一噪声分割损失的计算方式具体可详见上述实施例。
比如,电子设备具体可以采用第一生成对抗网络中的生成网络对目标域图像进行特征提取,得到第一目标域图像的特征信息,基于该第一目标域图像的特征信息,对该目标域图像进行目标分割,得到第一目标域分割概率;根据该第一目标域分割概率,生成第一目标域分割结果;根据该第一目标域分割结果、和该目标域图像,获取第一目标域分割损失。
接着，将第一生成对抗网络中的生成网络输出的分割结果，即第一源域分割概率P_S和第一目标域分割概率P_T，同时输入到第一生成对抗网络中的判别网络中，并利用由P_T生成的信息熵结果来计算对抗损失L_D，同时通过最大化对抗损失来更新判别网络的参数。随后，对抗损失函数产生的误差也会被反传回给生成网络，通过最小化对抗损失来更新分割网络的参数，目的是使得生成网络对源域图像和目标域图像预测出的分割结果能够越来越相似，实现领域自适应。
比如，第一生成对抗网络中的判别网络可以采用5层全卷积网络，将源域和目标域的分割概率融合到对抗学习中，网络模型的每个卷积层的kernel size为4，stride为2，padding为1，并且在除了最后一层的所有卷积层后都添加了一个Leaky ReLU激活函数层，最终输出一个单通道的2D结果，用0和1分别表示源域和目标域。
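按上述判别网络的卷积配置（5层全卷积，kernel size为4、stride为2、padding为1），可以用标准卷积输出尺寸公式推算各层特征图的边长（以输入边长256为例，示意代码如下）：

```python
def conv_out(n, kernel=4, stride=2, padding=1):
    # 标准卷积输出尺寸公式：floor((n + 2p - k) / s) + 1
    return (n + 2 * padding - kernel) // stride + 1

def discriminator_feature_sizes(n, layers=5):
    # 逐层计算 5 层全卷积判别网络的特征图边长：每层大致将尺寸减半
    sizes = [n]
    for _ in range(layers):
        sizes.append(conv_out(sizes[-1]))
    return sizes
```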
例如,具体可以在得到第一源域分割结果和第一目标域分割结果后,计算第一目标域图像的信息熵;采用第一生成对抗网络的判别网络,根据该第一源域分割结果、第一目标域分割结果、以及第一目标域图像的信息熵,得到第一判别结果,然后,根据该第一源域分割结果,和第一源域图像的标注结果,获取第一源域分割损失;根据该第一目标域分割结果,和第一目标域图像,获取第一目标域分割损失;根据第一源域分割结果和第一目标域分割结果,获取判别网络的第一判别损失;根据该第一源域分割损失和该第一目标域分割损失构建第一生成对抗网络的极小化对抗损失;根据该第一判别损失构建第一生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第一生成对抗网络进行迭代训练,得到训练后的第一生成对抗网络。
其中,极小化对抗损失和极大化对抗损失(即整体目标函数由最大最小优化)的具体计算方式可详见上述实施例。
203、电子设备采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失。
其中，第二生成对抗网络的训练与第一生成对抗网络类似，只是使用了不同的结构和参数。比如，对于第二生成对抗网络N2，可以采用DeepLabv3+架构；为了减少参数的数量和计算成本，可以用轻量级网络MobileNetV2作为基础模型。第二生成对抗网络N2利用MobileNetV2的第一层卷积层和随后的7个残差模块（Residual block）来提取特征，可以设置第一卷积层和其后两个残差块的步长（stride）为2，并将其余块的stride设置为1，第二生成对抗网络的总下采样率是8。类似于第一生成对抗网络N1，同样加入了ASPP模块对不同感受野下的潜在特征进行学习，利用具有不同膨胀率（dilate rate）的ASPP生成多尺度特征，将不同层次的语义信息整合到特征映射中；对特征映射进行上采样，然后进行1×1卷积组合上述特征，并将其与低级特征连接，以进行细粒度语义分割。
例如，电子设备具体可以采用第二生成对抗网络中的生成网络对源域图像进行特征提取，得到第二源域图像的特征信息；基于该第二源域图像的特征信息，对该源域图像中的噪声图像进行目标分割，得到第二噪声分割概率；获取源域图像中噪声图像的权重图；根据该第二噪声分割概率，和噪声图像的权重图，获取第二噪声分割损失；基于该第二源域图像的特征信息，对该源域图像中无噪声图像进行目标分割，得到第二无噪声分割概率；根据第二无噪声分割概率，和无噪声图像的标注结果，获取第二无噪声分割损失；基于该第二噪声分割损失和该第二无噪声分割损失，确定第二源域分割损失。
其中,第二噪声分割损失的具体计算方式可以参照上述第一噪声分割损失的计算方式。
其中,对于目标域图像来说,具体的训练方式与预设第一生成网络类似,也可以通过添加“自监督”信息的方式,即利用目标域图像的分割结果来生成像素级的伪标签,并将其应用在下一训练阶段中。比如,电子设备具体可以采用第二生成对抗网络中的生成网络对目标域图像进行特征提取,得到第二目标域图像的特征信息,基于该第二目标域图像的特征信息,对该目标域图像进行目标分割,得到第二目标域分割概率;根据该第二目标域分割概率,生成第二目标域分割结果;根据该第二目标域分割结果、和该目标域图像,获取第二目标域分割损失。
接着，将第二生成对抗网络中的生成网络输出的分割结果，即第二源域分割概率P_S和第二目标域分割概率P_T，同时输入到第二生成对抗网络中的判别网络中，并利用由P_T生成的信息熵结果来计算对抗损失L_D，同时通过最大化对抗损失来更新判别网络的参数。随后，对抗损失函数产生的误差也会被反传回给生成网络，通过最小化对抗损失来更新分割网络的参数，目的是使得生成网络对源域图像和目标域图像预测出的分割结果能够越来越相似，实现领域自适应。在优化网络参数的过程中，本实施例使用随机梯度下降（Stochastic Gradient Descent，SGD）算法优化和训练分割网络，使用自适应动量的随机优化（Adam）算法优化和训练判别网络，分割网络和判别网络的初始学习率分别为2.5×10^-4和1×10^-4。
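上述SGD与Adam的单步参数更新规则可以示意如下（标量形式的简化实现，仅用于说明更新公式；实际训练中这些更新作用于网络的全部参数）：

```python
def sgd_step(w, grad, lr=2.5e-4):
    # 随机梯度下降的单步更新（分割网络初始学习率取 2.5×10^-4）
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    # Adam 的单步更新（判别网络初始学习率取 1×10^-4）：
    # 维护梯度的一阶矩 m 与二阶矩 v，并在第 t 步做偏差修正
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (v_hat ** 0.5 + eps), m, v
```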
例如，电子设备具体可以在得到第二源域分割结果和第二目标域分割结果后，计算第二目标域图像的信息熵；采用第二生成对抗网络的判别网络，根据该第二源域分割结果、第二目标域分割结果、以及第二目标域图像的信息熵，得到第二判别结果。然后，根据该第二源域分割结果，和第二源域图像的标注结果，获取第二源域分割损失；根据该第二目标域分割结果，和第二目标域图像，获取第二目标域分割损失；根据第二源域分割结果和第二目标域分割结果，获取判别网络的第二判别损失。接着，根据该第二源域分割损失和该第二目标域分割损失构建第二生成对抗网络的极小化对抗损失；根据该第二判别损失构建第二生成对抗网络的极大化对抗损失；基于该极小化对抗损失和该极大化对抗损失，对第二生成对抗网络进行迭代训练，得到训练后的第二生成对抗网络。
其中,第二生成对抗网络中各个损失的计算方法与第一生成对抗网络类似,详细可见上文描述。
204、电子设备根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像。
例如,电子设备具体可以对该第一源域分割损失进行排序,根据排序后的第一源域分割损失选取满足预设损失条件的源域图像,确定为第一源域目标图像(即第一源域干净图像);对该第二源域分割损失进行排序,根据排序后的第二源域分割损失选取满足预设损失条件的源域图像,确定为第二源域目标图像(即第二源域干净图像)。每个生成网络将这些干净图像交给其对等网络以进行下一个训练过程,以更新卷积层的参数。然后每个生成网络重新选择当前认为是最佳的干净数据,并以分层的方式微调其对等网络。在这个交换过程中,对等网络可以互相监督,减少噪声标签带来的训练误差。
205、电子设备根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像。
例如,电子设备具体可以根据该第一目标域分割损失对该第一生成对抗网络进行训练,利用训练结果生成第一目标域目标图像(即第一目标域图像的像素级伪标签);根据该第二目标域分割损失对该第二生成对抗网络进行训练,利用训练结果生成第二目标域目标图像(即第二目标域图像的像素级伪标签)。 然后将这些伪标签应用在下一阶段另一个网络训练过程中,以迭代的方式更新网络参数。而在每一阶段中,分割网络和判别网络以交替更新的方式一起训练。
206、电子设备利用第二源域目标图像和第二目标域目标图像对该第一生成对抗网络进行训练,得到训练后的第一生成对抗网络。
例如,具体可以采用第一生成对抗网络的生成网络,分别对第二源域目标图像和第二目标域目标图像进行分割,得到第二源域目标分割结果和第二目标域目标分割结果。接着,计算第二目标域目标图像的信息熵;采用第一生成对抗网络的判别网络,根据该第二源域目标分割结果、第二目标域目标分割结果、以及第二目标域目标图像的信息熵,得到第二目标判别结果。然后,根据该第二源域目标分割结果,和第二源域目标图像的标注结果,获取第二源域目标分割损失;根据该第二目标域目标分割结果,和第二目标域目标图像,获取第二目标域目标分割损失;根据第二源域目标分割结果和第二目标域目标分割结果,获取判别网络的第二目标判别损失。再然后,根据该第二源域目标分割损失和该第二目标域目标分割损失构建第一生成对抗网络的极小化对抗损失;根据该第二目标判别损失构建第一生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第一生成对抗网络进行迭代训练,得到训练后的第一生成对抗网络。
207、电子设备利用第一源域目标图像和第一目标域目标图像对该第二生成对抗网络进行训练,得到训练后的第二生成对抗网络。
例如，电子设备具体可以采用第二生成对抗网络的生成网络，分别对第一源域目标图像和第一目标域目标图像进行分割，得到第一源域目标分割结果和第一目标域目标分割结果；计算第一目标域目标图像的信息熵，采用第二生成对抗网络的判别网络，根据该第一源域目标分割结果、第一目标域目标分割结果、以及第一目标域目标图像的信息熵，得到第一目标判别结果。接着，根据该第一源域目标分割结果，和第一源域目标图像的标注结果，获取第一源域目标分割损失；根据该第一目标域目标分割结果，和第一目标域目标图像，获取第一目标域目标分割损失；根据第一源域目标分割结果和第一目标域目标分割结果，获取判别网络的第一目标判别损失。然后，根据该第一源域目标分割损失和该第一目标域目标分割损失构建第二生成对抗网络的极小化对抗损失；根据该第一目标判别损失构建第二生成对抗网络的极大化对抗损失；基于该极小化对抗损失和该极大化对抗损失，对第二生成对抗网络进行迭代训练，得到训练后的第二生成对抗网络。
208、电子设备基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
比如,电子设备具体可以接收医疗影像设备采集到的眼底图像,然后,基于该训练后的第一生成对抗网络的生成网络对该眼底图像进行特征提取,得到眼底图像的特征信息,基于该眼底图像的特征信息对该眼底图像进行目标分割,得到眼底图像的分割预测概率,根据该分割预测概率生成该眼底图像的分割结果。
此外，为了验证本申请实施例提供的分割方案带来的效果，将本申请所提技术的实验结果和现有的一些算法作了比较，并将基于不同噪声程度任务的实验结果分别在表1和表2中进行了展示。其中，表1为从REFUGE训练集到REFUGE验证集的轻度噪声级别实验结果，表2为从REFUGE训练集到REFUGE验证集的重度噪声级别实验结果，本方案在REFUGE和Drishti-GS数据集上的实验结果如图2d所示。其中，BDL为一种基于自监督学习的双向学习方法，用来减弱domain shift问题，以学习更好的分割模型；pOSAL为视网膜眼底青光眼挑战赛的OD和OC分割任务方法；BEAL提出了基于边缘和熵信息的对抗学习方法。两个任务中均展示了本申请所提方法在不同噪声程度、噪声级别下的实验结果。其中，DICE是一种分割结果的衡量指标，用于计算标签Y和预测值p的相似程度，表示为：
Figure PCTCN2020124673-appb-000010
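上述DICE指标的计算可以用如下纯Python示意实现（其中eps平滑项为本示例添加的假设，用于避免除零）：

```python
def dice_coefficient(y, p, eps=1e-6):
    # DICE = (2·|Y∩P| + eps) / (|Y| + |P| + eps)，
    # 取值越接近 1 表示分割结果 P 与标签 Y 越相似
    inter = sum(a * b for a, b in zip(y, p))
    return (2.0 * inter + eps) / (sum(y) + sum(p) + eps)
```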
表1 从REFUGE训练集到REFUGE验证集的轻度噪声级别实验结果
Figure PCTCN2020124673-appb-000011
表2 从REFUGE训练集到REFUGE验证集的重度噪声级别实验结果
Figure PCTCN2020124673-appb-000012
由上可知,本申请实施例先获取目标域图像、以及已标注目标信息的源域图像,再采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,以及采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失,接着,根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像,然后利用第一源域目标图像、第一目标域目标图像、第二源域目标图像 以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络,再然后,基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果;由于该方案主要针对数据的标签存在着噪声,以及源域和目标域数据集之间的分布差异现象,提出了一种基于领域自适应策略的无监督鲁棒性分割方法。通过两个模型互相学习、互相监督的方式,有效地解决了噪声标签和无监督的图像分割任务,提高了图像分割的准确性。
应该理解的是,本申请各实施例中的各个步骤并不是必然按照步骤标号指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,各实施例中至少一部分步骤可以包括多个子步骤或者多个阶段,这些子步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些子步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。
为了更好地实施以上方法,相应的,本申请实施例还提供一种图像分割装置,该图像分割装置具体可以集成在电子设备中,该电子设备可以是服务器,也可以是终端,还可以是包括终端和服务器的系统。
例如,如图3所示,该图像分割装置可以包括获取单元301、第一分割单元302、第二分割单元303、确定单元304、训练单元305和第三分割单元306,如下:
获取单元301,用于获取目标域图像、以及已标注目标信息的源域图像;
第一分割单元302,用于采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失;
第二分割单元303,用于采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失;
确定单元304,用于根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该 第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像;
训练单元305,用于利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络;
第三分割单元306,用于基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
可选的,在一些实施例中,该第一分割单元302可以包括第一提取子单元、第一分割子单元和第二分割子单元,如下:
该第一提取子单元,用于采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行特征提取,得到第一源域图像的特征信息和第一目标域图像的特征信息;
该第一分割子单元,用于基于该第一源域图像的特征信息,对该源域图像进行目标分割,确定第一源域分割损失;
该第二分割子单元,用于基于该第一目标域图像的特征信息,对该目标域图像进行目标分割,确定第一目标域分割损失。
可选的,在一些实施例中,该源域图像包括噪声图像和无噪声图像,该第一分割子单元,具体用于基于该第一源域图像的特征信息,对该源域图像中噪声图像进行目标分割,得到第一噪声分割概率;获取源域图像中噪声图像的权重图;根据该第一噪声分割概率,和噪声图像的权重图,获取第一噪声分割损失;基于该第一源域图像的特征信息,对该源域图像中无噪声图像进行目标分割,得到第一无噪声分割概率;根据第一无噪声分割概率,和无噪声图像的标注结果,获取第一无噪声分割损失;基于该第一噪声分割损失和该第一无噪声分割损失,确定第一源域分割损失。
可选的,在一些实施例中,该第二分割子单元,具体用于基于该第一目标域图像的特征信息,对该目标域图像进行目标分割,得到第一目标域分割概率;根据该第一目标域分割概率,生成第一目标域分割结果;根据该第一目标域分割结果、和该目标域图像,获取第一目标域分割损失。
可选的,在一些实施例中,该第二分割单元303可以包括第二提取子单元、 第三分割子单元和第四分割子单元,如下:
该第二提取子单元,用于采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行特征提取,得到第二源域图像的特征信息和第二目标域图像的特征信息;
该第三分割子单元,用于基于该第二源域图像的特征信息,对该源域图像进行目标分割,确定第二源域分割损失;
该第四分割子单元,用于基于该第二目标域图像的特征信息,对该目标域图像进行目标分割,确定第二目标域分割损失。
可选的,在一些实施例中,该源域图像包括噪声图像和无噪声图像,该第三分割子单元,具体用于基于该第二源域图像的特征信息,对该源域图像中噪声图像进行目标分割,得到第二噪声分割概率;获取源域图像中噪声图像的权重图;根据该第二噪声分割概率,和噪声图像的权重图,获取第二噪声分割损失;基于该第二源域图像的特征信息,对该源域图像中无噪声图像进行目标分割,得到第二无噪声分割概率;根据第二无噪声分割概率,和无噪声图像的标注结果,获取第二无噪声分割损失;基于该第二噪声分割损失和该第二无噪声分割损失,确定第二源域分割损失。
可选的,在一些实施例中,该第四分割子单元,具体用于基于该第二目标域图像的特征信息,对该目标域图像进行目标分割,得到第二目标域分割概率;根据该第二目标域分割概率,生成第二目标域分割结果;根据该第二目标域分割结果、和该目标域图像,获取第二目标域分割损失。
可选的,在一些实施例中,该确定单元304可以包括第一确定子单元和第二确定子单元,如下:
该第一确定子单元,具体可以用于对该第一源域分割损失进行排序,根据排序后的第一源域分割损失选取满足预设损失条件的源域图像,确定为第一源域目标图像;对该第二源域分割损失进行排序,根据排序后的第二源域分割损失选取满足预设损失条件的源域图像,确定为第二源域目标图像。
该第二确定子单元,具体可以用于根据该第一目标域分割损失对该第一生成对抗网络进行训练,利用训练结果生成第一目标域目标图像;根据该第二目 标域分割损失对该第二生成对抗网络进行训练,利用训练结果生成第二目标域目标图像。
可选的,在一些实施例中,该训练单元305可以包括第一训练子单元和第二训练子单元,如下:
该第一训练子单元,具体可以用于采用第一生成对抗网络的生成网络,分别对第二源域目标图像和第二目标域目标图像进行分割,得到第二源域目标分割结果和第二目标域目标分割结果;采用第一生成对抗网络的判别网络,对第二源域目标分割结果和第二目标域目标分割结果进行判别,得到第二目标判别结果;根据第二源域目标分割结果、第二目标域目标分割结果和第二目标判别结果对第一生成对抗网络进行训练,得到训练后的第一生成对抗网络。
该第二训练子单元,具体可以用于采用第二生成对抗网络的生成网络,分别对第一源域目标图像和第一目标域目标图像进行分割,得到第一源域目标分割结果和第一目标域目标分割结果;采用第二生成对抗网络的判别网络,对第一源域目标分割结果和第一目标域目标分割结果进行判别,得到第一目标判别结果;根据第一源域目标分割结果、第一目标域目标分割结果和第一目标判别结果对第二生成对抗网络进行训练,得到训练后的第二生成对抗网络。
可选的,在一些实施例中,该第一训练子单元,具体可以用于计算第二目标域目标图像的信息熵;采用第一生成对抗网络的判别网络,根据该第二源域目标分割结果、第二目标域目标分割结果、以及第二目标域目标图像的信息熵,得到第二目标判别结果。
可选的,在一些实施例中,该第一训练子单元,具体可以用于根据该第二源域目标分割结果,和第二源域目标图像的标注结果,获取第二源域目标分割损失;根据该第二目标域目标分割结果,和第二目标域目标图像,获取第二目标域目标分割损失;根据第二源域目标分割结果和第二目标域目标分割结果,获取判别网络的第二目标判别损失;根据第二源域目标分割损失、第二目标域目标分割损失和第二目标判别损失对第一生成对抗网络进行训练,得到训练后的第一生成对抗网络。
可选的,在一些实施例中,该第一训练子单元,具体可以用于根据该第二 源域目标分割损失和该第二目标域目标分割损失构建第一生成对抗网络的极小化对抗损失;根据该第二目标判别损失构建第一生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第一生成对抗网络进行迭代训练,得到训练后的第一生成对抗网络。
可选的,在一些实施例中,该第二训练子单元,具体可以用于计算第一目标域目标图像的信息熵;采用第二生成对抗网络的判别网络,根据该第一源域目标分割结果、第一目标域目标分割结果、以及第一目标域目标图像的信息熵,得到第一目标判别结果。
可选的,在一些实施例中,该第二训练子单元,具体可以用于根据该第一源域目标分割结果,和第一源域目标图像的标注结果,获取第一源域目标分割损失;根据该第一目标域目标分割结果,和第一目标域目标图像,获取第一目标域目标分割损失;根据第一源域目标分割结果和第一目标域目标分割结果,获取判别网络的第一目标判别损失;根据第一源域目标分割损失、第一目标域目标分割损失和第一目标判别损失对第二生成对抗网络进行训练,得到训练后的第二生成对抗网络。
可选的,在一些实施例中,该第二训练子单元,具体可以用于根据该第一源域目标分割损失和该第一目标域目标分割损失构建第二生成对抗网络的极小化对抗损失;根据该第一目标判别损失构建第二生成对抗网络的极大化对抗损失;基于该极小化对抗损失和该极大化对抗损失,对第二生成对抗网络进行迭代训练,得到训练后的第二生成对抗网络。
具体实施时,以上各个单元可以作为独立的实体来实现,也可以进行任意组合,作为同一或若干个实体来实现,以上各个单元的具体实施可参见前面的方法实施例,在此不再赘述。
由上可知,本申请实施例先由获取单元301先获取目标域图像、以及已标注目标信息的源域图像,再由第一分割单元302采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,以及由第二分割单元303采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标 域分割损失,接着,由确定单元304根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像,然后由训练单元305利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络,再然后,由第三分割单元306基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果;由于该方案主要针对数据的标签存在着噪声,以及源域和目标域数据集之间的分布差异现象,提出了一种基于领域自适应策略的无监督鲁棒性分割方法。通过两个模型互相学习、互相监督的方式,有效地解决了噪声标签和无监督的图像分割任务,提高了图像分割的准确性。
此外,本申请实施例还提供一种电子设备,如图4所示,其示出了本申请实施例所涉及的电子设备的结构示意图,具体来讲:
该电子设备可以包括一个或者一个以上处理核心的处理器401、一个或一个以上计算机可读存储介质的存储器402、电源403和输入单元404等部件。本领域技术人员可以理解,图4中示出的电子设备结构并不构成对电子设备的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。其中:
处理器401是该电子设备的控制中心,利用各种接口和线路连接整个电子设备的各个部分,通过运行或执行存储在存储器402内的软件程序和/或模块,以及调用存储在存储器402内的数据,执行电子设备的各种功能和处理数据,从而对电子设备进行整体监控。
可选的,处理器401可包括一个或多个处理核心;优选的,处理器401可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器401中。
存储器402可用于存储软件程序以及模块,处理器401通过运行存储在存储器402的软件程序以及模块,从而执行各种功能应用以及数据处理。存储器 402可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据电子设备的使用所创建的数据等。此外,存储器402可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。相应地,存储器402还可以包括存储器控制器,以提供处理器401对存储器402的访问。
电子设备还包括给各个部件供电的电源403,优选的,电源403可以通过电源管理系统与处理器401逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。电源403还可以包括一个或一个以上的直流或交流电源、再充电系统、电源故障检测电路、电源转换器或者逆变器、电源状态指示器等任意组件。
该电子设备还可包括输入单元404,该输入单元404可用于接收输入的数字或字符信息,以及产生与用户设置以及功能控制有关的键盘、鼠标、操作杆、光学或者轨迹球信号输入。
尽管未示出,电子设备还可以包括显示单元等。
具体在本实施例中,电子设备中的存储器402存储有可在处理器401上运行的计算机可读指令,处理器401执行该计算机可读指令时实现如下步骤:
获取目标域图像、以及已标注目标信息的源域图像,再采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,以及采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失,接着,根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像,然后利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络,再然后,基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
以上各个操作的具体实施可参见前面的实施例。
由上可知,本申请实施例先获取目标域图像、以及已标注目标信息的源域图像,再采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,以及采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失,接着,根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像,然后利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络,再然后,基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果;由于该方案主要针对数据的标签存在着噪声,以及源域和目标域数据集之间的分布差异现象,提出了一种基于领域自适应策略的无监督鲁棒性分割方法。通过两个模型互相学习、互相监督的方式,有效地解决了噪声标签和无监督的图像分割任务,提高了图像分割的准确性。
本领域普通技术人员可以理解,上述实施例的各种方法中的全部或部分步骤可以通过计算机可读指令来完成,或通过计算机可读指令控制相关的硬件来完成,该计算机可读指令可以存储于一非易失性存储介质中,并由处理器进行加载和执行。
为此,本申请实施例还提供一个或多个存储有计算机可读指令的非易失性存储介质,计算机可读指令被一个或多个处理器执行时使得处理器执行如下步骤:
获取目标域图像、以及已标注目标信息的源域图像,再采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,以及采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失,接着,根据该第一源域分割损失、该第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据该第一目标域分割损失、该第二目标域分 割损失确定第一目标域目标图像和第二目标域目标图像,然后利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对该第一生成对抗网络和该第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络,再然后,基于该训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
以上各个操作的具体实施可参见前面的实施例。
其中,该非易失性存储介质可以包括:只读存储器(Read Only Memory,ROM)、随机存取记忆体(Random Access Memory,RAM)、磁盘或光盘等。
以上对本申请实施例所提供的一种图像分割方法、装置和存储介质进行了详细介绍,本文中应用了具体个例对本申请的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本申请的方法及其核心思想;同时,对于本领域的技术人员,依据本申请的思想,在具体实施方式及应用范围上均会有改变之处,综上该,本说明书内容不应理解为对本申请的限制。

Claims (14)

  1. 一种图像分割方法,包括:
    获取目标域图像、以及已标注目标信息的源域图像;
    采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失;
    采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失;
    根据所述第一源域分割损失、所述第二源域分割损失确定第一源域目标图像和第二源域目标图像,以及根据所述第一目标域分割损失、所述第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像;
    利用第一源域目标图像、第一目标域目标图像、第二源域目标图像以及第二目标域目标图像对所述第一生成对抗网络和所述第二生成对抗网络进行交叉训练,得到训练后的第一生成对抗网络;及
    基于所述训练后的第一生成对抗网络的生成网络对待分割图像进行分割,得到分割结果。
  2. 根据权利要求1所述的方法,其中所述采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第一源域分割损失和第一目标域分割损失,包括:
    采用第一生成对抗网络中的生成网络分别对源域图像和目标域图像进行特征提取,得到第一源域图像的特征信息和第一目标域图像的特征信息;
    基于所述第一源域图像的特征信息,对所述源域图像进行目标分割,确定第一源域分割损失;及
    基于所述第一目标域图像的特征信息,对所述目标域图像进行目标分割,确定第一目标域分割损失。
  3. 根据权利要求2所述的方法,其中所述源域图像包括噪声图像和无噪声图像,所述基于所述第一源域图像的特征信息,对所述源域图像进行目标分割,确定第一源域分割损失,包括:
    基于所述第一源域图像的特征信息，对所述源域图像中噪声图像进行目标分割，得到第一噪声分割概率；
    获取源域图像中噪声图像的权重图;
    根据所述第一噪声分割概率,和噪声图像的权重图,获取第一噪声分割损失;
    基于所述第一源域图像的特征信息,对所述源域图像中无噪声图像进行目标分割,得到第一无噪声分割概率;
    根据第一无噪声分割概率,和无噪声图像的标注结果,获取第一无噪声分割损失;及
    基于所述第一噪声分割损失和所述第一无噪声分割损失,确定第一源域分割损失。
  4. 根据权利要求2所述的方法,其中基于所述第一目标域图像的特征信息,对所述目标域图像进行目标分割,确定第一目标域分割损失,包括:
    基于所述第一目标域图像的特征信息,对所述目标域图像进行目标分割,得到第一目标域分割概率;
    根据所述第一目标域分割概率,生成第一目标域分割结果;及
    根据所述第一目标域分割结果、和所述目标域图像,获取第一目标域分割损失。
  5. 根据权利要求1所述的方法,其中所述采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行分割,确定第二源域分割损失和第二目标域分割损失,包括:
    采用第二生成对抗网络中的生成网络分别对源域图像和目标域图像进行特征提取,得到第二源域图像的特征信息和第二目标域图像的特征信息;
    基于所述第二源域图像的特征信息,对所述源域图像进行目标分割,确定第二源域分割损失;及
    基于所述第二目标域图像的特征信息,对所述目标域图像进行目标分割,确定第二目标域分割损失。
  6. 根据权利要求1所述的方法,其中所述根据所述第一源域分割损失、所述第二源域分割损失确定第一源域目标图像和第二源域目标图像,包括:
    对所述第一源域分割损失进行排序,根据排序后的第一源域分割损失选取满足预设损失条件的源域图像,确定为第一源域目标图像;及
    对所述第二源域分割损失进行排序,根据排序后的第二源域分割损失选取满足预设损失条件的源域图像,确定为第二源域目标图像。
  7. 根据权利要求1所述的方法,其中所述根据所述第一目标域分割损失、所述第二目标域分割损失确定第一目标域目标图像和第二目标域目标图像,包括:
    根据所述第一目标域分割损失对所述第一生成对抗网络进行训练,利用训练结果生成第一目标域目标图像;及
    根据所述第二目标域分割损失对所述第二生成对抗网络进行训练,利用训练结果生成第二目标域目标图像。
  8. The method according to claim 1, wherein the cross-training the first generative adversarial network and the second generative adversarial network by using the first source-domain target image, the first target-domain target image, the second source-domain target image, and the second target-domain target image to obtain the trained first generative adversarial network comprises:
    segmenting the second source-domain target image and the second target-domain target image separately by using the generative network of the first generative adversarial network, to obtain a second source-domain target segmentation result and a second target-domain target segmentation result;
    discriminating the second source-domain target segmentation result and the second target-domain target segmentation result by using the discriminative network of the first generative adversarial network, to obtain a second target discrimination result;
    training the first generative adversarial network according to the second source-domain target segmentation result, the second target-domain target segmentation result, and the second target discrimination result, to obtain the trained first generative adversarial network;
    segmenting the first source-domain target image and the first target-domain target image separately by using the generative network of the second generative adversarial network, to obtain a first source-domain target segmentation result and a first target-domain target segmentation result;
    discriminating the first source-domain target segmentation result and the first target-domain target segmentation result by using the discriminative network of the second generative adversarial network, to obtain a first target discrimination result; and
    training the second generative adversarial network according to the first source-domain target segmentation result, the first target-domain target segmentation result, and the first target discrimination result, to obtain a trained second generative adversarial network.
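In the cross-training of claim 8, each network trains on the images selected by its peer, which limits the propagation of each network's own selection bias. A schematic sketch of that exchange; the keep_ratio parameter and the returned dictionary layout are assumptions for illustration:

```python
import numpy as np

def cross_training_split(losses_net1, losses_net2, keep_ratio=0.67):
    """Each network ranks the shared batch by its own per-image loss,
    keeps its small-loss picks, and hands them to the peer: network 1
    is then trained on network 2's picks and vice versa."""
    k = max(1, int(len(losses_net1) * keep_ratio))
    picks_net1 = np.argsort(losses_net1)[:k]  # samples trusted by network 1
    picks_net2 = np.argsort(losses_net2)[:k]  # samples trusted by network 2
    return {"train_net1_on": picks_net2, "train_net2_on": picks_net1}
```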
  9. The method according to claim 8, wherein the discriminating the second source-domain target segmentation result and the second target-domain target segmentation result by using the discriminative network of the first generative adversarial network to obtain the second target discrimination result comprises:
    computing an information entropy of the second target-domain target image; and
    obtaining the second target discrimination result by using the discriminative network of the first generative adversarial network according to the second source-domain target segmentation result, the second target-domain target segmentation result, and the information entropy of the second target-domain target image.
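The information entropy in claim 9 can be computed per pixel from the predicted foreground probability; high-entropy regions mark uncertain target-domain predictions that the discriminator can take into account. A NumPy sketch for the binary case, an assumed simplification of the patent's formulation:

```python
import numpy as np

def entropy_map(prob, eps=1e-7):
    """Per-pixel Shannon entropy of a binary foreground probability map;
    peaks at prob = 0.5 (maximal uncertainty) and falls toward 0 as the
    prediction approaches 0 or 1."""
    prob = np.clip(prob, eps, 1 - eps)
    return -(prob * np.log(prob) + (1 - prob) * np.log(1 - prob))
```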
  10. The method according to claim 8, wherein the training the first generative adversarial network according to the second source-domain target segmentation result, the second target-domain target segmentation result, and the second target discrimination result to obtain the trained first generative adversarial network comprises:
    obtaining a second source-domain target segmentation loss according to the second source-domain target segmentation result and an annotation result of the second source-domain target image;
    obtaining a second target-domain target segmentation loss according to the second target-domain target segmentation result and the second target-domain target image;
    obtaining a second target discrimination loss of the discriminative network according to the second source-domain target segmentation result and the second target-domain target segmentation result; and
    training the first generative adversarial network according to the second source-domain target segmentation loss, the second target-domain target segmentation loss, and the second target discrimination loss, to obtain the trained first generative adversarial network.
  11. The method according to claim 10, wherein the training the first generative adversarial network according to the second source-domain target segmentation loss, the second target-domain target segmentation loss, and the second target discrimination loss to obtain the trained first generative adversarial network comprises:
    constructing a minimized adversarial loss of the first generative adversarial network according to the second source-domain target segmentation loss and the second target-domain target segmentation loss;
    constructing a maximized adversarial loss of the first generative adversarial network according to the second target discrimination loss; and
    iteratively training the first generative adversarial network based on the minimized adversarial loss and the maximized adversarial loss, to obtain the trained first generative adversarial network.
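The min-max training of claim 11 can be written as two complementary objectives: the generator minimises its segmentation losses plus an adversarial term that makes target-domain predictions look source-like, while the discriminator learns to separate the two domains. A binary cross-entropy sketch, with the weighting factor `lam` as an assumed hyperparameter rather than a value from the patent:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between discriminator outputs and
    domain labels (1 = source-like, 0 = target-like)."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

def generator_loss(seg_loss_src, seg_loss_tgt, disc_on_target, lam=0.01):
    """Minimised adversarial objective: segmentation losses plus a term
    pushing target-domain predictions toward the source label 1."""
    return seg_loss_src + seg_loss_tgt + lam * bce(disc_on_target, np.ones_like(disc_on_target))

def discriminator_loss(disc_on_source, disc_on_target):
    """Maximised adversarial objective, written as a loss to minimise:
    source outputs labelled 1, target outputs labelled 0."""
    return (bce(disc_on_source, np.ones_like(disc_on_source))
            + bce(disc_on_target, np.zeros_like(disc_on_target)))
```

Alternating updates of the two objectives give the iterative training of the final step of the claim.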
  12. An image segmentation apparatus, comprising:
    an obtaining unit, configured to obtain a target-domain image and a source-domain image annotated with target information;
    a first segmentation unit, configured to segment the source-domain image and the target-domain image separately by using a generative network in a first generative adversarial network, to determine a first source-domain segmentation loss and a first target-domain segmentation loss;
    a second segmentation unit, configured to segment the source-domain image and the target-domain image separately by using a generative network in a second generative adversarial network, to determine a second source-domain segmentation loss and a second target-domain segmentation loss;
    a determining unit, configured to determine a first source-domain target image and a second source-domain target image according to the first source-domain segmentation loss and the second source-domain segmentation loss, and determine a first target-domain target image and a second target-domain target image according to the first target-domain segmentation loss and the second target-domain segmentation loss;
    a training unit, configured to cross-train the first generative adversarial network and the second generative adversarial network by using the first source-domain target image, the first target-domain target image, the second source-domain target image, and the second target-domain target image, to obtain a trained first generative adversarial network; and
    a third segmentation unit, configured to segment a to-be-segmented image based on the generative network of the trained first generative adversarial network, to obtain a segmentation result.
  13. One or more non-volatile storage media storing computer-readable instructions, the computer-readable instructions, when executed by one or more processors, causing the one or more processors to perform the steps of the image segmentation method according to any one of claims 1 to 11.
  14. An electronic device, comprising a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the image segmentation method according to any one of claims 1 to 11.
PCT/CN2020/124673 2020-02-10 2020-10-29 Image segmentation method, apparatus and storage medium WO2021159742A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP20919226.9A EP4002271A4 (en) 2020-02-10 2020-10-29 METHOD AND APPARATUS FOR IMAGE SEGMENTATION AND STORAGE MEDIA
JP2022523505A JP7268248B2 (ja) 2020-02-10 2020-10-29 画像分割方法、装置及びコンピュータプログラム
US17/587,825 US20220148191A1 (en) 2020-02-10 2022-01-28 Image segmentation method and apparatus and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010084625.6A CN111340819B (zh) 2020-02-10 2020-02-10 Image segmentation method, apparatus and storage medium
CN202010084625.6 2020-02-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/587,825 Continuation US20220148191A1 (en) 2020-02-10 2022-01-28 Image segmentation method and apparatus and storage medium

Publications (1)

Publication Number Publication Date
WO2021159742A1 true WO2021159742A1 (zh) 2021-08-19

Family

ID=71186799

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124673 WO2021159742A1 (zh) 2020-02-10 2020-10-29 Image segmentation method, apparatus and storage medium

Country Status (5)

Country Link
US (1) US20220148191A1 (zh)
EP (1) EP4002271A4 (zh)
JP (1) JP7268248B2 (zh)
CN (1) CN111340819B (zh)
WO (1) WO2021159742A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113706547A (zh) * 2021-08-27 2021-11-26 北京航空航天大学 Unsupervised domain-adaptive semantic segmentation method guided by class similarity and dissimilarity
CN114529878A (zh) * 2022-01-21 2022-05-24 四川大学 Semantic-aware cross-domain road scene semantic segmentation method

Families Citing this family (18)

Publication number Priority date Publication date Assignee Title
CN111340819B (zh) 2020-02-10 2023-09-12 腾讯科技(深圳)有限公司 Image segmentation method, apparatus and storage medium
US11468294B2 (en) 2020-02-21 2022-10-11 Adobe Inc. Projecting images to a generative model based on gradient-free latent vector determination
JP6800453B1 (ja) * 2020-05-07 2020-12-16 株式会社 情報システムエンジニアリング Information processing apparatus and information processing method
CN111861909B (zh) * 2020-06-29 2023-06-16 南京理工大学 Network fine-grained image classification method
CN111986202B (zh) * 2020-10-26 2021-02-05 平安科技(深圳)有限公司 Glaucoma auxiliary diagnosis apparatus, method, and storage medium
CN112419326B (zh) * 2020-12-02 2023-05-23 腾讯科技(深圳)有限公司 Image segmentation data processing method, apparatus, device, and storage medium
CN115205694A (zh) * 2021-03-26 2022-10-18 北京沃东天骏信息技术有限公司 Image segmentation method, apparatus, and computer-readable storage medium
CN113221905B (zh) * 2021-05-18 2022-05-17 浙江大学 Unsupervised domain adaptation method, apparatus, system, and storage medium for semantic segmentation based on uniform clustering
CN115393599A (zh) * 2021-05-21 2022-11-25 北京沃东天骏信息技术有限公司 Method, apparatus, electronic device, and medium for constructing an image semantic segmentation model and processing images
CN115409764B (zh) * 2021-05-28 2024-01-09 南京博视医疗科技有限公司 Multimodal fundus vessel segmentation method and apparatus based on domain adaptation
CN113627433B (zh) * 2021-06-18 2024-04-09 中国科学院自动化研究所 Cross-domain adaptive semantic segmentation method and apparatus based on data perturbation
CN113657389A (zh) * 2021-07-29 2021-11-16 中国科学院软件研究所 Software-defined satellite semantic segmentation method, apparatus, and medium
CN113658165B (zh) * 2021-08-25 2023-06-20 平安科技(深圳)有限公司 Cup-to-disc ratio determination method, apparatus, device, and storage medium
CN113936143B (zh) * 2021-09-10 2022-07-01 北京建筑大学 Image recognition generalization method based on an attention mechanism and a generative adversarial network
CN115187783B (zh) 2022-09-09 2022-12-27 之江实验室 Federated-learning-based multi-task hybrid-supervised medical image segmentation method and system
CN116092164B (zh) * 2023-02-01 2023-12-26 中国科学院自动化研究所 Face image reenactment method, apparatus, electronic device, and storage medium
CN117911804A (zh) * 2023-05-09 2024-04-19 宁波大学 Semi-supervised segmentation model based on self-correcting pseudo dual models, training method, and application
CN116486408B (zh) * 2023-05-12 2024-04-05 国家基础地理信息中心 Cross-domain semantic segmentation method and apparatus for remote sensing images

Citations (9)

Publication number Priority date Publication date Assignee Title
US7773800B2 (en) * 2001-06-06 2010-08-10 Ying Liu Attrasoft image retrieval
CN108197670A (zh) * 2018-01-31 2018-06-22 国信优易数据有限公司 Pseudo-label generation model training method and apparatus, and pseudo-label generation method and apparatus
CN108256561A (zh) * 2017-12-29 2018-07-06 中山大学 Multi-source domain adaptation transfer method and system based on adversarial learning
CN109829894A (zh) * 2019-01-09 2019-05-31 平安科技(深圳)有限公司 Segmentation model training method, OCT image segmentation method, apparatus, device, and medium
CN109902809A (zh) * 2019-03-01 2019-06-18 成都康乔电子有限责任公司 Semantic segmentation model assisted by a generative adversarial network
CN110148142A (zh) * 2019-05-27 2019-08-20 腾讯科技(深圳)有限公司 Training method, apparatus, device, and storage medium for an image segmentation model
CN110322446A (zh) * 2019-07-01 2019-10-11 华中科技大学 Domain-adaptive semantic segmentation method based on similarity space alignment
CN110490884A (zh) * 2019-08-23 2019-11-22 北京工业大学 Adversarial lightweight network semantic segmentation method
CN111340819A (zh) * 2020-02-10 2020-06-26 腾讯科技(深圳)有限公司 Image segmentation method, apparatus and storage medium

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
US10474929B2 (en) * 2017-04-25 2019-11-12 Nec Corporation Cyclic generative adversarial network for unsupervised cross-domain image generation
US20190130220A1 (en) * 2017-10-27 2019-05-02 GM Global Technology Operations LLC Domain adaptation via class-balanced self-training with spatial priors
US10482600B2 (en) * 2018-01-16 2019-11-19 Siemens Healthcare Gmbh Cross-domain image analysis and cross-domain image synthesis using deep image-to-image networks and adversarial networks
US11132792B2 (en) * 2018-02-22 2021-09-28 Siemens Healthcare Gmbh Cross domain medical image segmentation
JP6893194B2 (ja) * 2018-05-28 2021-06-23 日本電信電話株式会社 モデル学習装置、モデル学習方法、及びプログラム
WO2020028382A1 (en) * 2018-07-30 2020-02-06 Memorial Sloan Kettering Cancer Center Multi-modal, multi-resolution deep learning neural networks for segmentation, outcomes prediction and longitudinal response monitoring to immunotherapy and radiotherapy
CN109345455B (zh) * 2018-09-30 2021-01-26 京东方科技集团股份有限公司 Image discrimination method, discriminator, and computer-readable storage medium
CN109255390B (zh) * 2018-09-30 2021-01-29 京东方科技集团股份有限公司 Training image preprocessing method and module, discriminator, and readable storage medium
CN110570433B (zh) * 2019-08-30 2022-04-22 北京影谱科技股份有限公司 Method and apparatus for constructing an image semantic segmentation model based on a generative adversarial network



Also Published As

Publication number Publication date
CN111340819B (zh) 2023-09-12
US20220148191A1 (en) 2022-05-12
CN111340819A (zh) 2020-06-26
JP7268248B2 (ja) 2023-05-02
EP4002271A4 (en) 2022-12-14
JP2022554120A (ja) 2022-12-28
EP4002271A1 (en) 2022-05-25

Similar Documents

Publication Publication Date Title
WO2021159742A1 (zh) Image segmentation method, apparatus and storage medium
CN111199550B (zh) Training method for an image segmentation network, segmentation method, apparatus, and storage medium
US10706333B2 (en) Medical image analysis method, medical image analysis system and storage medium
WO2020238734A1 (zh) Training method and apparatus for an image segmentation model, computer device, and storage medium
Luo et al. Deep mining external imperfect data for chest X-ray disease screening
Sander et al. Automatic segmentation with detection of local segmentation failures in cardiac MRI
WO2020087960A1 (zh) Image recognition method and apparatus, terminal device, and medical system
WO2021098534A1 (zh) Similarity determination, network training, and search methods and apparatuses, electronic apparatus, and storage medium
CN112419326B (zh) Image segmentation data processing method, apparatus, device, and storage medium
CN113673244B (zh) Medical text processing method, apparatus, computer device, and storage medium
Xie et al. Optic disc and cup image segmentation utilizing contour-based transformation and sequence labeling networks
WO2022252929A1 (zh) Hierarchical segmentation method, apparatus, device, and medium for tissue structures in medical images
CN113421228A (zh) Thyroid nodule recognition model training method and system based on parameter transfer
Shamrat et al. Analysing most efficient deep learning model to detect COVID-19 from computer tomography images
CN116129141A (zh) Medical data processing method, apparatus, device, medium, and computer program product
Kong et al. Based on improved deep convolutional neural network model pneumonia image classification
Zhang et al. TiM‐Net: Transformer in M‐Net for Retinal Vessel Segmentation
Rudnicka et al. Artificial Intelligence-Based Algorithms in Medical Image Scan Segmentation and Intelligent Visual Content Generation—A Concise Overview
Aguirre Nilsson et al. Classification of ulcer images using convolutional neural networks
Zhu et al. 3D pyramid pooling network for abdominal MRI series classification
CN113763332B (zh) Pulmonary nodule analysis method, apparatus, and storage medium based on a ternary capsule network algorithm
CN116994695A (zh) Training method, apparatus, device, and storage medium for a report generation model
Zhang et al. Semi‐supervised graph convolutional networks for the domain adaptive recognition of thyroid nodules in cross‐device ultrasound images
WO2023274512A1 (en) Method for training and using a deep learning algorithm to compare medical images based on dimensionality-reduced representations
Chamundeshwari et al. Adaptive Despeckling and Heart Disease Diagnosis by Echocardiogram using Optimized Deep Learning Model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20919226

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020919226

Country of ref document: EP

Effective date: 20220217

ENP Entry into the national phase

Ref document number: 2022523505

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE