CN112419326A

CN112419326A - Image segmentation data processing method, device, equipment and storage medium

Info

Publication number: CN112419326A
Application number: CN202011403688.XA
Authority: CN
Inventors: 柳露艳; 马锴; 郑冶枫
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-12-02
Filing date: 2020-12-02
Publication date: 2021-02-26
Anticipated expiration: 2040-12-02
Also published as: CN112419326B

Abstract

The embodiment of the application discloses an image segmentation data processing method and device, computer equipment and a storage medium, and belongs to the technical field of computers. The method comprises the following steps: determining a first source domain loss value based on the source domain label image, the first label image, and the second label image; determining a first target domain loss value based on the third label image and the fourth label image; and training a first image segmentation model based on the first source domain loss value and the first target domain loss value, and performing image segmentation task processing based on the trained image segmentation model. When the source domain image and the target domain image are adopted to train the first image segmentation model, not only is the result output by the first image segmentation model considered, but also the result output by the second image segmentation model is fused into the source domain loss value, the first image segmentation model can be supervised by the second image segmentation model, and the image segmentation model with higher accuracy and robustness is trained, so that the accuracy of image segmentation is improved.

Description

Image segmentation data processing method, device, equipment and storage medium

Technical Field

The embodiment of the application relates to the technical field of computers, in particular to an image segmentation data processing method, device, equipment and storage medium.

Background

With the continuous development of computer technology and artificial intelligence technology, the artificial intelligence technology obtains remarkable results in image analysis tasks such as image segmentation and image recognition. When the artificial intelligence technology is adopted for image segmentation, a large number of sample images are needed to train an image segmentation model, but noise images often exist in the sample images.

In the related technology, the quality scores of the sample images are determined based on the quality evaluation model, the sample images with higher quality scores are selected, and the sample images with higher quality scores are used for training the image segmentation model. Because the sample images are screened only by relying on the quality evaluation model, the reliability of the screened sample images is not high, and the accuracy and the robustness of the image segmentation model obtained by training are poor.

Disclosure of Invention

The embodiment of the application provides an image segmentation data processing method, an image segmentation data processing device, computer equipment and a storage medium, and the accuracy and robustness of an image segmentation model can be improved. The technical scheme is as follows:

in one aspect, an image segmentation data processing method is provided, and the method includes:

acquiring a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image, wherein the first label image is obtained by performing image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by performing image segmentation on the first source domain image based on a second image segmentation model;

determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image and the second label image;

acquiring a target domain image, a third label image and a fourth label image, wherein the third label image is obtained by performing image segmentation on the target domain image based on the first image segmentation model, and the fourth label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;

determining a first target domain loss value based on the third label image and the fourth label image;

and training the first image segmentation model based on the first source domain loss value and the first target domain loss value, and performing image segmentation task processing based on the trained image segmentation model.

Optionally, the determining a second source domain loss value based on the source domain label image corresponding to the second source domain image and the fifth label image includes: in response to the similarity being less than the similarity threshold, determining the second source domain loss value based on the source domain label image, the fifth label image, and the sixth label image corresponding to the second source domain image.

Optionally, the method further comprises: in response to the number of iterative trainings being greater than the second threshold, determining the first source domain loss value based on the third source domain loss value and the fourth source domain loss value.

Optionally, the determining, based on the source domain label image, a weight corresponding to each pixel point includes: determining the minimum distance between each pixel point and the boundary line of the source domain label image; determining a maximum value of the determined plurality of minimum distances as a target distance; and respectively determining the weight corresponding to each pixel point based on the target distance and the minimum distance corresponding to each pixel point.

Optionally, the determining the third source domain loss value based on the first label and the weight corresponding to each pixel point includes: determining a first difference value based on the first label and the weight corresponding to each pixel point, wherein the first difference value represents the difference between the first label image and the source domain label image; determining a third source domain sub-loss value corresponding to any pixel point based on the first difference value, a first label corresponding to any pixel point and the weight; determining a third source domain penalty value based on the determined plurality of third source domain sub-penalty values.

Optionally, after determining the first source domain image as a non-noise source domain image in response to the first source domain loss value being less than a target loss value, the method further includes: determining a fifth source domain loss value based on a source domain label image and a sixth label image corresponding to the first source domain image, wherein the sixth label image is obtained by performing image segmentation on the first source domain image based on the second image segmentation model; determining a second target domain loss value based on the third label image and a seventh label image, wherein the seventh label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model; training the second image segmentation model based on the fifth source domain loss value and the second target domain loss value.

Optionally, the third tag image includes third tags corresponding to a plurality of pixel points in the target domain image, and the determining a discrimination loss value based on the source domain discrimination result, the target domain discrimination result, and the third tag image includes:

based on a third label corresponding to the plurality of pixel points, determining an uncertainty corresponding to the third label image;

determining the discrimination loss value based on the uncertainty, the source domain discrimination result, and the target domain discrimination result.

Optionally, the method further comprises:

acquiring a sixth label image, wherein the sixth label image is obtained by performing image segmentation on the second source domain image based on the second image segmentation model;

determining a second source domain loss value based on the source domain label image corresponding to the second source domain image and the fifth label image, including:

determining a similarity between the fifth label image and the sixth label image;

and in response to the similarity not being less than a similarity threshold, determining the second source domain loss value based on the source domain label image corresponding to the second source domain image and the fifth label image.

Optionally, the first tag image includes a first tag corresponding to each pixel point in the first source domain image, and the determining a third source domain loss value based on the source domain tag image and the first tag image includes:

determining the weight corresponding to each pixel point based on the source domain label image;

determining the third source domain loss value based on the first label and the weight corresponding to each pixel point.

In another aspect, there is provided an image segmentation data processing apparatus, the apparatus comprising:

the device comprises a first image acquisition module, a second image acquisition module and a third image acquisition module, wherein the first image acquisition module is used for acquiring a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image, the first label image is obtained by carrying out image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by carrying out image segmentation on the first source domain image based on a second image segmentation model;

a first loss value determining module, configured to determine a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image, and the second label image;

the second image acquisition module is used for acquiring a target domain image, a third label image and a fourth label image, wherein the third label image is obtained by performing image segmentation on the target domain image based on the first image segmentation model, and the fourth label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;

a second loss value determination module for determining a first target domain loss value based on the third label image and the fourth label image;

a first training module to train the first image segmentation model based on the first source domain loss value and the first target domain loss value;

and the image segmentation module is used for carrying out image segmentation task processing based on the image segmentation model obtained after training.

the image segmentation device comprises a model acquisition module, a model segmentation module and a model generation module, wherein the model acquisition module is used for acquiring a first image segmentation model which comprises a feature extraction layer and a feature segmentation layer;

the feature extraction module is used for extracting features of the target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image;

the characteristic segmentation module is used for carrying out characteristic segmentation on the characteristic image based on the characteristic segmentation layer to obtain a target domain label image corresponding to the target domain image;

wherein the sample data adopted in training the first image segmentation model comprises:

the sample source domain image, the sample source domain label image corresponding to the sample source domain image, the sample target domain image, the first image segmentation model and the second image segmentation model respectively perform image segmentation on the sample source domain image and the sample target domain image to obtain label images.

In another aspect, a computer device is provided, which comprises a processor and a memory, wherein at least one computer program is stored in the memory, and is loaded and executed by the processor to implement the operations performed in the image segmentation data processing method according to the above aspect.

In another aspect, there is provided a computer-readable storage medium having at least one computer program stored therein, the at least one computer program being loaded and executed by a processor to perform the operations performed in the image segmentation data processing method according to the above aspect.

In another aspect, there is provided a computer program product or a computer program comprising computer program code stored in a computer-readable storage medium, the computer program code being read by a processor of a computer device from the computer-readable storage medium, the computer program code being executed by the processor to cause the computer device to implement the operations performed in the image segmentation data processing method according to the above aspect.

According to the method, the device, the computer equipment and the storage medium provided by the embodiment of the application, when the first image segmentation model is trained by adopting the source domain image and the target domain image, not only the segmentation result of the first image segmentation model is considered, but also the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, so that the first image segmentation model learns the segmentation result of the second image segmentation model, therefore, the training process of the first image segmentation model can be supervised by the segmentation result of the second image segmentation model, the first image segmentation model with higher accuracy and robustness can be trained, and the accuracy of image segmentation is improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

Fig. 1 is a flowchart of an image segmentation data processing method according to an embodiment of the present application.

Fig. 2 is a flowchart of another image segmentation data processing method according to an embodiment of the present application.

Fig. 3 is a schematic diagram of determining a loss value according to an embodiment of the present application.

Fig. 4 is a flowchart of an image segmentation data processing method according to an embodiment of the present application.

Fig. 5 is a schematic diagram of a cross-training image segmentation model provided in an embodiment of the present application.

Fig. 6 is a schematic diagram of an experimental result provided in an embodiment of the present application.

Fig. 7 is a schematic diagram of another experimental result provided in the examples of the present application.

Fig. 8 is a flowchart of another image segmentation data processing method according to an embodiment of the present application.

Fig. 9 is a schematic structural diagram of an image segmentation data processing apparatus according to an embodiment of the present application.

Fig. 10 is a schematic structural diagram of another image segmentation data processing apparatus according to an embodiment of the present application.

Fig. 11 is a schematic structural diagram of another image segmentation data processing apparatus according to an embodiment of the present application.

Fig. 12 is a schematic structural diagram of another image segmentation data processing apparatus according to an embodiment of the present application.

Fig. 13 is a schematic structural diagram of a terminal according to an embodiment of the present application.

Fig. 14 is a schematic structural diagram of a server according to an embodiment of the present application.

Detailed Description

To make the objects, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be further described in detail with reference to the accompanying drawings.

It will be understood that the terms "first," "second," and the like as used herein may be used herein to describe various concepts, which are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, the first source domain image may be referred to as a second source domain image, and similarly, the second source domain image may be referred to as a first source domain image, without departing from the scope of the present application.

For example, the at least one source domain image may be any integer number of source domain images greater than or equal to one, such as one source domain image, two source domain images, three source domain images, and the like. The plurality of source domain images may be two or more, for example, the plurality of source domain images may be any integer number of source domain images greater than or equal to two, such as two source domain images and three source domain images. Each refers to each of the at least one, for example, each source domain image refers to each of the plurality of source domain images, and if the plurality of source domain images is 3 source domain images, each source domain image refers to each of the 3 source domain images.

Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software techniques include natural language processing techniques and machine learning.

Machine Learning (ML) is a multi-domain cross discipline, and relates to a plurality of disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like. The special research on how a computer simulates or realizes the learning behavior of human beings so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve the performance of the computer. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. The machine learning and the deep learning comprise technologies such as artificial neural network, belief network, reinforcement learning, transfer learning, inductive learning, teaching learning and the like.

Computer Vision technology (CV) Computer Vision is a science for researching how to make a machine "see", and further refers to that a camera and a Computer are used to replace human eyes to perform machine Vision such as identification, tracking and measurement on a target, and further image processing is performed, so that the Computer processing becomes an image more suitable for human eyes to observe or transmitted to an instrument to detect. As a scientific discipline, computer vision research-related theories and techniques attempt to build artificial intelligence systems that can capture information from images or multidimensional data. The computer vision technology comprises image processing, image Recognition, image semantic understanding, image retrieval, Optical Character Recognition (OCR), video processing, video semantic understanding, video content/behavior Recognition, three-dimensional object reconstruction, 3D (3-Dimension) technology, virtual reality, augmented reality, synchronous positioning, map construction and other technologies, and also comprises common biological feature Recognition technologies such as face Recognition, fingerprint Recognition and the like.

The cloud technology is a hosting technology for unifying series resources such as hardware, software, network and the like in a wide area network or a local area network to realize the calculation, storage, processing and sharing of data. The cloud technology is a general term of a network technology, an information technology, an integration technology, a management platform technology, an application technology and the like based on cloud computing application, can form a resource pool, is used as required, and is flexible and convenient. Background services of the technical network system require a large amount of computing and storage resources, such as video websites, picture websites and more portal websites, and the cloud computing technology becomes an important support. With the high development and application of the internet industry, each article may have its own identification mark and needs to be transmitted to a background system for logic processing, data of different levels are processed separately, and various industrial data need strong system background support and can only be realized by a cloud computing technology. As a basic capability provider of cloud computing, a cloud computing resource pool, referred to as an IaaS (Infrastructure as a Service) platform for short, is established, and multiple types of virtual resources are deployed in the resource pool and are used by external clients selectively.

The Medical Cloud is a Medical health service Cloud platform established by using Cloud computing on the basis of new technologies such as Cloud computing, mobile technology, multimedia, 4G communication, big data, internet of things and the like and combining Medical technology, and realizes sharing of Medical resources and expansion of Medical scope. Due to the combination of the cloud computing technology, the medical cloud improves the efficiency of medical institutions and brings convenience to residents to see medical advice. For example, appointment registration, electronic medical records, medical insurance and the like are all products combining cloud computing and the medical field, and the medical cloud also has the advantages of data security, information sharing, dynamic expansion and global layout.

The image segmentation data processing method provided by the embodiment of the application will be described below based on an artificial intelligence technology and a cloud technology.

In order to facilitate understanding of the technical processes of the embodiments of the present application, some terms referred to in the embodiments of the present application are explained below:

(1) domain Adaptation (DA): a method for transfer learning improves the performance of a model on target domain data through rich information provided by source domain data. Wherein, the Source Domain (Source Domain) can provide rich tag data; the Target Domain (Target Domain) is the Domain where the test data set is located, and lacks tag data. The source domain data and the target domain data describe data in the same scene in different fields, and can solve the same type of task, but the data in different fields have difference in distribution.

(2) Noise data (noise Labels): the Noise data provides erroneous label data, which in this application has a Noise label at the pixel Level, determined by the degree of Noise (Noise Level) and the proportion of Noise (Noise Ratio).

(3) Peer-Review (Peer-Review): the method is a strategy for mutual supervision of two models in a training process, and the performance of the models is improved and the noise labels of noise data are purified by exchanging data with small loss and the generated pseudo labels corresponding to the noise data through the two network models.

(4) Cross training: in the embodiment of the present application, cross training refers to training two image segmentation models with the same function simultaneously by using the same training set. The first image segmentation model performs image segmentation on images in a training set, noise images and non-noise images are screened in the training set according to segmentation results, the screened noise images and non-noise images are applied to training of the second image segmentation model, then the second image segmentation model performs image segmentation on the noise images and the non-noise images, the noise images and the non-noise images are re-screened according to the segmentation results and are applied to the training of the first image segmentation model again, and cross training of the first image segmentation model and the second image segmentation model is completed in an iterative mode.

(5) And (3) domain migration: because the source domain image and the target domain image have difference in spatial distribution, an image segmentation model capable of performing image segmentation on the source domain image does not necessarily perform accurate image segmentation on the target domain image. In the embodiment of the application, the image segmentation model is used for image segmentation of an image, which is equivalent to mapping the image into the same feature space, and then domain migration based on domain adaptation aims to map source domain images and target domain images with different spatial distributions into the same feature space, so that in a training process, the training aims to make segmentation results (label images) of the source domain images and the target domain images as similar as possible in the feature space, that is, the image segmentation model can reduce the difference between the source domain images and the target domain images in the feature space, thereby improving the accuracy of the image segmentation of the target domain images by the image segmentation model.

The embodiment of the application provides an image segmentation data processing method, and an execution main body is computer equipment.

In one possible implementation, the computer device is a terminal, and the terminal is a portable, pocket, handheld, or other types of terminals, such as but not limited to a smart phone, a tablet computer, a notebook computer, or a desktop computer.

In another possible implementation manner, the computer device is a server, where the server is an independent physical server, or a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), and a big data and artificial intelligence platform, and the application is not limited herein.

Fig. 1 is a flowchart of an image segmentation data processing method according to an embodiment of the present application. An execution subject of the embodiment of the application is a computer device, and referring to fig. 1, the method includes:

101. and acquiring a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image.

In the embodiment of the application, the image segmentation refers to classifying pixel points in an image, marking whether the pixel points belong to a target category or not, and obtaining a label image corresponding to the image, so as to segment a target area in the image. The source domain image has a corresponding source domain label image, and optionally, the source domain label image is obtained by manually labeling the source domain image. The target domain image lacks a corresponding target domain label image. Due to the fact that the target domain image lacks a corresponding target domain label image, a model capable of carrying out image segmentation on the target domain image cannot be trained only on the basis of the target domain image.

The source domain image and the target domain image are different, and different standards can be adopted to divide the source domain image and the target domain image according to requirements in the embodiment of the application. For example, the source domain image is acquired by a first device in a different manner from the target domain image, and the target domain image is acquired by a second device. For another example, the source domain image is different from the target domain image in the domain, the source domain image is an image including a cat, and the target domain image is an image including a dog. For another example, the source domain image and the target domain image have different sources, the source domain image is a real face photograph, and the target domain image is a sketch face cartoon and the like.

Therefore, in the embodiment of the application, a model training method based on a domain adaptive strategy is provided for the distribution difference between the source domain image and the target domain image, and an image segmentation model for image segmentation is trained under an unsupervised condition in a mode that the two models learn and supervise each other.

The first image segmentation model and the second image segmentation model are two different models which are trained in a crossed mode, and both the first image segmentation model and the second image segmentation model are used for segmenting images to obtain corresponding label images. Optionally, the first image segmentation model and the second image segmentation model have the same structure, or the first image segmentation model and the second image segmentation model have different structures, which is not limited in this embodiment of the present application.

The first image segmentation model and the second image segmentation model are models used for segmenting the target domain image, and in the process of training the first image segmentation model, the computer equipment acquires the first source domain image, the source domain label image corresponding to the first source domain image, the first label image and the second label image. The source domain label image corresponding to the first source domain image is a real label image corresponding to the first source domain image, the first label image is obtained by performing image segmentation on the first source domain image based on a first image segmentation model in a current training stage, the second label image is obtained by performing image segmentation on the first source domain image based on a second image segmentation model in a previous training stage, and the second label image is also called a pseudo label image corresponding to the first source domain image.

102. And determining a first source domain loss value based on the source domain label image, the first label image and the second label image corresponding to the first source domain image.

After the computer device determines the source domain label image, the first label image, and the second label image, a first source domain loss value for the first label image is determined based on a difference between the source domain label image and the first label image, and a difference between the second label image and the first label image.

103. And acquiring a target domain image, a third label image and a fourth label image.

In the process of training the first image segmentation model, the computer device also acquires a target domain image, a third label image and a fourth label image. The target domain image and the source domain image belong to different fields, the third label image is obtained by performing image segmentation on the target domain image based on the first image segmentation model in the current training stage, the fourth label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model in the last training stage, and the fourth label image is also called a pseudo label image corresponding to the target domain image.

104. Based on the third label image and the fourth label image, a first target domain loss value is determined.

After the computer device determines the third label image and the fourth label image, a first target domain loss value for the third label image is determined based on a difference between the third label image and the fourth label image.

105. A first image segmentation model is trained based on the first source domain loss value and the first target domain loss value.

After the computer device determines a first source domain loss value and a first target domain loss value, parameters of the first image segmentation model are adjusted based on the first source domain loss value and the first target domain loss value, so that the first source domain loss value and the first target domain loss value are smaller and smaller until the first source domain loss value and the first target domain loss value tend to converge, and the trained first image segmentation model is obtained.

106. And performing image segmentation task processing based on the image segmentation model obtained after training.

The image segmentation model obtained through training has high accuracy and robustness, and the computer equipment can perform image segmentation task processing on any image based on the image segmentation model, so that the accuracy of image segmentation can be guaranteed.

According to the method provided by the embodiment of the application, when the first image segmentation model is trained by adopting the source domain image and the target domain image, not only the segmentation result of the first image segmentation model is considered, but also the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, so that the first image segmentation model learns the segmentation result of the second image segmentation model, the training process of the first image segmentation model can be supervised by the segmentation result of the second image segmentation model, the first image segmentation model with higher accuracy and robustness can be trained, and the accuracy of image segmentation is improved.

Fig. 2 is a flowchart of an image segmentation data processing method according to an embodiment of the present application. The execution subject of the embodiment of the application is computer equipment, and referring to fig. 2, the method includes:

201. the computer equipment acquires a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image.

In the process of training the first image segmentation model, the computer equipment acquires a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image. The source domain label image corresponding to the first source domain image is a real label image corresponding to the first source domain image, the first label image is obtained by performing image segmentation on the first source domain image based on a first image segmentation model in the current training stage, the first label image can be called a prediction label image corresponding to the first source domain image, the second label image is obtained by performing image segmentation on the first source domain image based on a second image segmentation model in the last training stage, and the second label image is also called a pseudo label image corresponding to the first source domain image.

Optionally, the first source domain image is any type of image. For example, in the medical field, when the first source region image is an eye region image, the eye region image is subjected to image segmentation processing, and a cup region, a optic disc region, or the like is segmented from the eye region image. Optionally, the first source domain image is acquired by a medical image acquisition device, and the medical image acquisition device includes a Computed Tomography (CT) or a magnetic resonance imager. Optionally, the source domain label image corresponding to the first source domain image is obtained by labeling the first source domain image by medical personnel. Optionally, the first source domain image and the source domain label image corresponding to the first source domain image are obtained by the computer device from a local database, or are obtained by the computer device from a network by downloading, which is not limited in this embodiment of the present application.

In the embodiment of the application, the first image segmentation model and the second image segmentation model are synchronously trained in a cross training mode. The first image segmentation model and the second image segmentation model have the same structure, or the first image segmentation model and the second image segmentation model have different structures. The structures and parameters of the first image segmentation model and the second image segmentation model are set and adjusted according to requirements.

In a possible implementation manner, the first image segmentation model and the second image segmentation model have different structures and parameters, and due to differences between the structures and parameters of the two models, the two models can be fitted and learned from different angles, so that different decision boundaries are generated, namely different learning capabilities are provided, and thus peer supervision between the models can be promoted.

Optionally, the first image segmentation model is depeplav 2 (an image segmentation model) with ResNet101 (residual error network) as a main framework, and a Spatial Pyramid (ASPP) network is added to the first image segmentation model, so that the multi-scale features of the output result are enriched. In addition, in order to enhance the feature expression capability of the model, an Attention mechanism based on a DANet (Dual Attention Network) is proposed, an Attention Network is added to learn and capture the dependency relationship between the pixel points and the feature layer channels, the output of the Attention Network is connected with the output of the spatial pyramid Network, and a final annotation image is generated.

Optionally, the second image segmentation model is based on a DeepLabv3+ (an image segmentation model) and a MobileNet V2 (a lightweight network), so that the number of parameters and the running cost of the model are reduced. The second image segmentation model uses the first convolution network of MobileNetV2 and 7 residual networks to extract features, optionally with the step size (stride) of the first convolution network and the first two residual networks set to 2 and the step size of the remaining residual networks set to 1. The down-sampling rate of the second image segmentation model is 8. In addition, an ASPP network is added into the second image segmentation model, potential features under different receptive fields are learned, multi-scale features are generated by the ASPP network with different expansion rates (Dialate Rate), information of different levels is integrated into feature mapping, the feature mapping is subjected to upsampling and convolution, and the obtained combined features are connected with low-level features to perform fine-grained image segmentation.

202. The computer device determines a first source domain loss value based on the source domain label image, the first label image, and the second label image, in a case where the first source domain image is determined as the noise source domain image based on the second image segmentation model.

In the last training stage, the computer device performs image segmentation on the first source domain image based on the second image segmentation model to obtain a label image, determines a source domain loss value based on a real source domain label image corresponding to the first source domain image, the label image obtained this time and a pseudo label image corresponding to the first source domain image when the second image segmentation model is trained, and determines the first source domain image as a noise source domain image based on the source domain loss value. The pseudo label image corresponding to the first source domain image when the second image segmentation model is trained refers to a label image obtained by image segmentation of the first source domain image by the first image segmentation model in a previous training stage. The process of determining the noise source domain image based on the source domain loss value is detailed in the following step 215, and will not be described here.

The computer device determines a first source domain loss value based on the source domain label image, the first label image, and the second label image if the first source domain image is determined to be a noise source domain image.

In one possible implementation, the computer device determines a third source domain loss value based on the source domain label image and the first label image, determines a fourth source domain loss value based on the second label image and the first label image, and determines the first source domain loss value based on the third source domain loss value and the fourth source domain loss value.

Wherein the computer device determines a third source domain loss value based on a difference between the source domain label image and the first label image and determines a fourth source domain loss value based on a difference between the second label image and the first label image. Optionally, the computer device performs weighted summation on the third source domain loss value and the fourth source domain loss value to obtain the first source domain loss value.

In another possible implementation manner, the first label image includes a first label corresponding to each pixel point in the first source domain image, and then determining a third source domain loss value includes: the computer equipment determines the weight corresponding to each pixel point based on the source domain label image, and determines a third source domain loss value based on the first label and the weight corresponding to each pixel point.

Optionally, the weight corresponding to a plurality of pixel points is represented by a weight Map (Distance Map). When the first source domain image is judged to be the noise source domain image, the source domain label image corresponding to the first source domain image is the noise label image, and a loss value is determined to have a large error directly based on the label in the source domain label image.

Optionally, determining the weight comprises: the computer equipment determines the minimum distance between each pixel point and the boundary line of the source domain label image, determines the maximum value in the determined minimum distances as a target distance, and respectively determines the weight corresponding to each pixel point based on the target distance and the minimum distance corresponding to each pixel point.

The computer device determines the maximum value of the minimum distances as a target distance, and determines the weight corresponding to each pixel point respectively based on the difference between the target distance and the minimum distance corresponding to each pixel point.

Optionally, the computer device determines the weight corresponding to any pixel point by using the following formula:

wherein i represents any pixel point, y_iRepresenting source domain labels in the source domain image corresponding to pixel points, W (y)_i) Represents the weight, w, corresponding to the arbitrary pixel point_cAnd w₀Representing a weight coefficient, optionally, dividing w_cSet to 10, w₀Is set to 1. max_disDenotes the target distance, d (y)_i) Represents the minimum distance, delta, corresponding to any one pixel point²Representing the variance of the determined plurality of minimum distances.

Optionally, the process of determining the third source domain loss value includes: the computer equipment determines a first difference value based on the first label and the weight corresponding to each pixel point, determines a third source domain sub-loss value corresponding to any pixel point based on the first difference value, the first label and the weight corresponding to any pixel point, and determines a third source domain loss value based on the plurality of determined third source domain sub-loss values.

Wherein the first difference value represents a difference between the first label image and the source domain label image. After the computer device determines the first difference value, the computer device may determine a third source domain sub-loss value corresponding to any pixel point based on the first difference value and the first label and weight corresponding to the any pixel point, thereby determining third source domain sub-loss values corresponding to the plurality of pixel points, and performing weighted summation on the determined plurality of third source domain sub-loss values to obtain the third source domain loss value.

Optionally, the computer device determines a third source domain sub-loss value corresponding to any one of the pixel points by using the following formula:

wherein i represents any pixel point, p represents a first label in the first label image, y represents a source domain label in the source domain image, and L_reweight(p, y) represents a third source domain sub-loss value, λ, corresponding to the pixel point₁And λ₂Denotes a weight coefficient, W (y)_i) Representing the weight, p, of the corresponding point of any pixel_iA first label, y, indicating the correspondence of any pixel point_iAnd indicating the source domain label corresponding to any pixel point. Wherein the content of the first and second substances,

denotes a first disparity value, h denotes a length of the label image, w denotes a width of the label image, and c denotes a category of the label image. Wherein the content of the first and second substances,

representing a loss value determined based on cross-entropy loss,

representing a loss value determined based on a dice loss (a loss value determination approach).

The process of determining the fourth source domain loss value is the same as the process of determining the third source domain loss value, and is not described in detail herein.

In another possible implementation manner, the computer device trains the first image segmentation model by adopting a multi-iteration training manner, and as the first image segmentation model is more and more stable as the iteration training frequency increases, the computer device adjusts the weight coefficients of the third source domain loss value and the fourth source domain loss value according to the iteration training frequency, so that the weight coefficient of the fourth source domain loss value is more and more increased as the iteration training frequency increases, thereby avoiding the training process from being affected by an erroneous pseudo-label image generated by an unstable image segmentation model, and simultaneously utilizing effective information in a noise label image to enhance the robustness of the image segmentation model. Then determining the first source domain loss value based on the third source domain loss value and the fourth source domain loss value comprises: the computer equipment obtains the iterative training times corresponding to the training, responds to the fact that the iterative training times are not smaller than a first threshold value and not larger than a second threshold value, and determines a first source domain loss value based on a third source domain loss value, a fourth source domain loss value, the iterative times, the first threshold value and the second threshold value. Alternatively, in response to the number of iterative trainings being greater than the second threshold, the first source domain loss value is determined based on the third source domain loss value and the fourth source domain loss value.

Optionally, in response to the iterative training number being less than the first threshold, the computer device directly determines the third source domain loss value as the first source domain loss value without considering the fourth source domain loss value.

Optionally, the computer device determines the first source domain loss value based on the third source domain loss value and the fourth source domain loss value using the following formula:

wherein p represents a first label in the first label image, y represents a source domain label in the source domain image_pseudoRepresenting a second label in a second label image, L_noise(p, y) denotes a first source domain loss value, L_reweight(p, y) represents a third source domain loss value, L_reweight(p,y_pseudo) Represents the fourth source domain loss value, λ_pseudoRepresenting a scaling factor, optionally, lambda_pseudoThe setting represents 0.5. T represents the number of iterative training times, T₁Denotes a first threshold value, T₂Representing a second threshold. In the process of image segmentation model, when t is<T₁At this time, the image segmentation model is not stable enough, so that the first source domain loss value refers to the third source domain loss value completely, that is, only the influence of the source domain label image on the loss value is considered; when T is₁≤t≤T₂When the second source domain loss value is smaller than the first source domain loss value, gradually increasing the weight coefficient of the fourth source domain loss value along with the increase of t, and simultaneously reducing the weight coefficient of the third source domain loss value; when T is₂<And at time t, the weight coefficients of the third source domain loss value and the fourth source domain loss value in the first source domain loss value are respectively 0.5.

203. And the computer equipment acquires the second source domain image, the source domain label image corresponding to the second source domain image and the fifth label image.

In the process of training the first image segmentation model, the computer device further obtains a second source domain image, a source domain label image corresponding to the second source domain image, and a fifth label image. The second source domain image is an image used for training the second image segmentation model in the last training stage, the source domain label image corresponding to the second source domain image is a real label image corresponding to the second source domain image, and the fifth label image is obtained by performing image segmentation on the second source domain image based on the first image segmentation model in the current training stage.

In the embodiment of the present application, the first source region image is an image determined as a noise source region image based on the second image segmentation model, and the second source region image is an image determined as a non-noise source region image based on the second image segmentation model.

204. And the computer equipment determines a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image when the second source domain image is judged to be the non-noise source domain image based on the second image segmentation model.

Similarly to the process of determining the first source domain image as the noise source domain image in step 202, in step 204, when the second source domain image is determined as the non-noise source domain image, the computer device determines a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image.

In one possible implementation, the source domain label image includes a source domain label corresponding to each pixel point in the second source domain image, and the fifth label image includes a fifth label corresponding to each pixel point in the second source domain image. Determining a second source domain loss value comprises: determining a third difference value based on the source domain label and the fifth label corresponding to each pixel point; determining a second source domain sub-loss value corresponding to any pixel point based on the third difference value, the source domain label corresponding to any pixel point and the fifth label; a second source domain penalty value is determined based on the determined plurality of second source domain sub-penalty values.

Wherein the third difference value represents a difference between the source domain label image and the fifth label image. After the computer device determines the third difference value, the second source domain sub-loss value corresponding to any pixel point can be determined based on the third difference value and the source domain label and the fifth label corresponding to any pixel point, so that the second source domain sub-loss values corresponding to the plurality of pixel points are determined, and the determined plurality of second source domain sub-loss values are subjected to weighted summation to obtain the second source domain loss value.

Optionally, the computer device determines a second source domain loss value using the following equation:

wherein i represents any pixel point, p represents a first label in the first label image, y represents a source domain label in the source domain label image, and L_clean(p, y) denotes a second source domain loss value, λ₁And λ₂Representing the weight coefficient, p_iA fifth label, y, indicating the correspondence of any pixel point_iAnd indicating the source domain label corresponding to any pixel point. Wherein the content of the first and second substances,

indicates the third disparity value, h indicates the length of the label image, w indicates the width of the label image, and c indicates the category of the label image.

In a possible implementation manner, before determining the second source domain loss value, the computer device obtains a sixth label image, determines a similarity between the fifth label image and the sixth label image, and determines the second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image in response to the similarity not being less than a similarity threshold. And the sixth label image is obtained by carrying out image segmentation on the second source domain image based on the second image segmentation model.

Since the second source domain image is a non-noise source domain image determined based on the second image segmentation model, in order to further determine whether the second source domain image is a noise source domain image or a non-noise source domain image, the computer device determines a similarity between a prediction result of the first image segmentation model for the second source domain image and a prediction result of the second image segmentation model for the second source domain image, that is, determines a similarity between the fifth label image and the sixth label image.

If the similarity is not smaller than the similarity threshold, it is indicated that the second source domain image is indeed a non-noise source domain image, and a second source domain loss value corresponding to the second source domain image is determined by adopting a processing mode of the non-noise source domain image. The computer device determines a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image in response to the similarity not being less than the similarity threshold.

And if the similarity is smaller than the similarity threshold, re-judging the second source domain image as the noise source domain image, and determining a second source domain loss value corresponding to the second source domain image by adopting a processing mode of the noise source domain image. The computer device determines a second source domain loss value based on the source domain label image, the fifth label image, and the sixth label image corresponding to the second source domain image in response to the similarity being less than the similarity threshold. In the case that the similarity is smaller than the similarity threshold, the method for determining the second source domain loss value by the computer device is the same as the method for determining the first source domain loss value in step 202, and is not described in detail here.

Optionally, the computer device determines a dice loss value between the fifth tag image and the sixth tag image, and determines the second source domain loss value based on the source domain tag image, the fifth tag image, and the sixth tag image corresponding to the second source domain image in response to the dice loss value not being less than the loss value threshold value since the greater the dice loss value between the fifth tag image and the sixth tag image, the smaller the similarity between the fifth tag image and the sixth tag image. Or the computer device determines a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image in response to the dice loss value being less than the loss value threshold.

Optionally, for the second source domain image, the computer device obtains a second source domain loss value using the following formula:

wherein i represents any pixel point, W (y)_i) Represents the weight, w, corresponding to the arbitrary pixel point_cAnd w₀Denotes the weight coefficient, max_disDenotes the target distance, d (y)_i) Represents the minimum distance, y, corresponding to any pixel point_iA source domain label, delta, corresponding to any one of the pixel points²Representing the variance, dice, of the determined minimum distances_coDenotes the dice loss value and μ denotes the loss value threshold.

Wherein L is_reweight(p, y) represents a second source domain loss value, p_iA fifth label, λ, representing the correspondence of any pixel point₁And λ₂Denotes a weight coefficient, h denotes a length of the label image, w denotes a width of the label image, and c denotes a category of the label image.

In the embodiment of the application, when the source domain image is determined as the non-noise source domain image, the method in

step

203 and 204 is adopted to determine the source domain loss value corresponding to the non-noise source domain image based on the real label image corresponding to the non-noise source domain image. In the case that the source domain image is determined as the noise source domain image, the method in step 201 and 202 is adopted to determine the source domain loss value corresponding to the noise source domain image based on the real label image and the pseudo label image corresponding to the noise source domain image. When the source domain image is judged to be the noise source domain image, the real label image corresponding to the noise source domain image is not accurate enough, so that a pseudo label image obtained by image segmentation of the noise source domain image by the second image segmentation model is introduced, the training process of the first image segmentation model is supervised by the segmentation result of the second image segmentation model, errors brought by the noise source domain image in the training process can be reduced, and the resistance of the image segmentation model to noise data is improved.

205. The computer device obtains a target domain image, a third label image and a fourth label image.

In the process of training the first image segmentation model, the computer device also acquires a target domain image, a third label image and a fourth label image. The third label image is obtained by performing image segmentation on the target domain image based on the first image segmentation model in the current training stage, and the fourth label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model in the last training stage, and the fourth label image is also called a pseudo label image corresponding to the target domain image.

The target domain image and the source domain image are different, and different standards can be adopted to divide the source domain image and the target domain image according to requirements in the embodiment of the application. For example, the source domain image is an image including a cat and the target domain image is an image including a dog. Alternatively, the source domain image is an eye image captured using device a, the target domain image is an eye image captured using device B, and so on.

206. The computer device determines a first target domain loss value based on the third label image and the fourth label image.

After the computer equipment acquires the third label image and the fourth label image, the first target domain loss value is determined based on the difference between the third label image and the fourth label image.

In one possible implementation, the third label image includes a third label corresponding to each pixel point in the target domain image, and the fourth label image includes a fourth label corresponding to each pixel point in the target domain image, and then determining the first target domain loss value includes: determining a second difference value based on the third label and the fourth label corresponding to each pixel point, and determining a target domain sub-loss value corresponding to any pixel point based on the second difference value and the third label and the fourth label corresponding to any pixel point; based on the determined plurality of target domain sub-penalty values, a first target domain penalty value is determined.

Wherein the second difference value represents a difference between the third label image and the fourth label image. After the computer device determines the second difference value, the target domain sub-loss value corresponding to any pixel point can be determined based on the second difference value and the third label and the fourth label corresponding to any pixel point, so that the target domain sub-loss values corresponding to the multiple pixel points are determined, and the determined multiple target domain sub-loss values are subjected to weighted summation to obtain the first target domain loss value.

Optionally, the computer device determines the first target domain loss value using the following formula:

wherein i represents any pixel point, L_seg(p, y) represents a first target domain loss value, λ, corresponding to the pixel point₁And λ₂Representing the weight coefficient, p_iA third label, y, indicating the correspondence of any pixel point_iAnd a fourth label representing the corresponding of any pixel point. Wherein the content of the first and second substances,

indicates the second disparity value, h indicates the length of the label image, w indicates the width of the label image, and c indicates the category of the label image.

The target domain image does not have a corresponding label image, so that an unsupervised image segmentation problem needs to be solved. The embodiment of the application generates the pixel-level pseudo label image by adding the self-supervision information, namely, by using the result of image segmentation of the target domain image, and applies the pseudo label image in the next training stage. In a possible implementation manner, for any pixel point in the target domain image, if the prediction confidence of a certain category is higher than the target threshold, a pseudo label of the category corresponding to the pixel point is generated. Optionally, the target threshold value is set in a self-adaptive manner, the prediction confidence degrees of the categories corresponding to each pixel point in the target domain image are ranked, and the category with the highest prediction confidence degree is selected in a self-adaptive manner to generate a pixel-level pseudo label as cross supervision information of the next training stage. In order to ensure the correctness of the generated pseudo label, in the embodiment of the application, the first image segmentation model and the second image segmentation model are trained in an iterative training mode, and a more accurate pseudo label is continuously generated.

207. The computer device trains a first image segmentation model based on the first source domain loss value, the second source domain loss value, and the first target domain loss value.

The computer equipment adjusts parameters of the first image segmentation model based on the first source domain loss value, the second source domain loss value and the first target domain loss value, so that the first source domain loss value, the second source domain loss value and the first target domain loss value are smaller and smaller until the first source domain loss value, the second source domain loss value and the first target domain loss value tend to converge, and the trained first image segmentation model is obtained.

In one possible implementation, the computer device determines a segmentation loss value based on a first source domain loss value, a second source domain loss value, and a first target domain loss value, and trains a first image segmentation model based on the segmentation loss value to converge the segmentation loss value.

Optionally, the computer device determines the segmentation loss value using the following formula:

L_seg(X_S,Y_S)＝(1-α)L_clean+αL_noise；

wherein L is_segRepresents a segmentation loss value, L_seg(X_S,Y_S) Represents the total source domain loss value and,

representing a first target domain partitioning loss value, L_noiseTo representFirst source domain loss value, L_cleanRepresenting a second source domain loss value, alpha representing the balance L_cleanAnd L_noiseCoefficient of (A), X_SRepresenting a source domain image, X_TRepresenting the target field image, Y_SA source domain label image representing a correspondence of the source domain image,

and representing a third label image corresponding to the target domain image generated in the training process, namely that the target domain image is a pseudo label image.

In the embodiment of the present application, a process of training a first image segmentation model will be described by taking only the case where a first source domain image is determined as a noise source domain image and a second source domain image is determined as a non-noise source domain image as an example. In another embodiment, the computer device may not perform

step

203 and 204, and in step 202, determine a first source domain loss value directly based on the source domain label image, the first label image and the second label image, and then train the first image segmentation model based on the first source domain loss value and the first target domain loss value.

It should be noted that, in the step 201-207, only one first source domain image, one second source domain image and one target domain image are processed to obtain a first source domain loss value, a second source domain loss value and a first target domain loss value. In another embodiment, the computer device respectively processes the plurality of first source domain images, the plurality of second source domain images and the plurality of target domain images, and respectively determines the plurality of first source domain loss values, the plurality of second source domain loss values and the plurality of first target domain loss values by using the method provided in step 201 and step 207, and then trains the first image segmentation model based on the plurality of first source domain loss values, the plurality of second source domain loss values and the plurality of first target domain loss values.

208. And the computer equipment respectively carries out image discrimination processing on the first label image and the third label image based on the discrimination model to obtain a source domain discrimination result and a target domain discrimination result.

The computer equipment carries out image segmentation processing on the first source domain image and the target domain image based on the first image segmentation model to obtain a first label image and a third label image, then carries out image discrimination processing on the first label image based on the discrimination model to obtain a source domain discrimination result, and carries out image discrimination processing on the third label image based on the discrimination model to obtain a target domain discrimination result.

The discrimination model is used for discriminating whether the input label image belongs to a label image of a source domain image or a label image of a target domain image, and can fuse the label images of the source domain image and the target domain image into counterstudy. When the discrimination model performs image discrimination processing on the third label image corresponding to the target domain image, and the obtained target domain discrimination result indicates that the third label image is the label image of the target domain image, it indicates that the discrimination model is accurate, the label image of the target domain image output by the first image segmentation model and the label image of the source domain image are not similar enough in the feature space, and the discrimination model cannot be "fooled", and the first image segmentation model does not complete domain migration from the source domain to the target domain, so the accuracy of the first image segmentation model in image segmentation on the target domain image is still low. When the discrimination model performs image discrimination processing on the third label image corresponding to the target domain image, and the obtained target domain discrimination result indicates that the third label image is the label image of the source domain image, it indicates that the discrimination model is wrong, that is, the label image of the target domain image output by the first image segmentation model is sufficiently similar to the label image of the source domain image in the feature space, and the first image segmentation model completes domain migration from the source domain to the target domain, so that the accuracy of the first image segmentation model in image segmentation on the target domain image is high.

Optionally, the discriminant model is a model formed by a plurality of layers of full convolutional networks. Optionally, the kernel size (size of convolution kernel) of each convolution network in the discriminant model is 4, stride (step size) is 2, padding (padding) is 1. Optionally, in the discriminant model, except for the last layer of convolutional network, a leakage corrected Linear network (leakage corrected Linear Unit) is connected to the following convolutional networks. Optionally, the output of the discriminant model is a single-channel two-dimensional result, e.g., source domain is denoted by 0 and target domain is denoted by l.

209. The computer device determines a discrimination loss value based on the source domain discrimination result, the target domain discrimination result, and the third label image.

Wherein the discriminant loss value is used to represent the confrontation loss of the discriminant model.

In one possible implementation, the third label image includes third labels corresponding to a plurality of pixel points in the target domain image, and determining the discriminant loss value includes: the computer equipment determines the uncertainty corresponding to the third label image based on the third labels corresponding to the multiple pixel points; and determining a discrimination loss value based on the uncertainty, the source domain discrimination result and the target domain discrimination result.

And the uncertainty corresponding to the third tag image is used for identifying the degree of uncertainty of the third tag image, and optionally, the uncertainty is information Entropy (Entropy) corresponding to the third tag image. The information entropy is introduced into each pixel point of the target domain image, and then the discrimination loss value is determined by the information entropy and the discrimination result output by the discrimination model, so that the loss weight of the pixel point with uncertainty (corresponding to the high entropy value) can be increased, and the loss weight of the pixel point with certainty (corresponding to the low entropy value) can be reduced. Under the drive of the information entropy, the method is beneficial to the learning of the discrimination model to pay attention to the representative features.

Optionally, the computer device determines the information entropy using the following formula:

wherein, f (X)_T) Representing information entropy, i representing any pixel point, h representing the length of the label image, w representing the width of the label image, c representing the category of the label image, and p_iAnd a third label corresponding to any pixel point is represented.

Optionally, the computer device determines the discrimination loss value using the following equation:

L_D＝λ_advL_adv(X_S,X_T)；

L_adv(X_S,X_T)＝-E[log(D(G(X_S)))]-E[(λ_entrf(X_T)+ε)+log(1-D(G(X_T)))]；

wherein L is_DIndicates the discriminant loss value, X_SRepresenting a first source domain image, X_TRepresenting the target domain image, λ_advRepresenting parameters for balancing loss relations, G (-) representing a first image segmentation model, D (-) representing a discriminant model, E (-) representing a mathematical expectation, D (G (X)_S) Indicates the source domain discrimination result, D (G (X)_T) Denotes the target domain discrimination result, λ_entrRepresenting the weight parameter, f (X), corresponding to the entropy result graph_T) Representing uncertainty, epsilon is a weight coefficient for introducing f (X)_T) The stability of the training process is guaranteed.

210. The computer device trains a first image segmentation model and a discriminant model based on the discriminant loss value.

After the computer equipment determines the discrimination loss value, the parameters of the first image segmentation model and the parameters of the discrimination model are adjusted based on the discrimination loss value, so that label images obtained by image segmentation of the source domain image and the target domain image by the first image segmentation model are more and more similar in a feature space, and the discrimination model can not distinguish whether the label images are the label images of the target domain image more and more, thereby realizing the field self-adaptation of image segmentation.

Optionally, the computer device adjusts parameters of the discriminant model by maximizing the discriminant loss value. Alternatively, since an error generated by the discriminant loss value is also reflected in the first image segmentation model, the computer apparatus adjusts parameters of the first image segmentation model by minimizing the discriminant loss value. Optionally, the computer device optimizes and trains the first image segmentation model by using a SGD (Stochastic Gradient Descent) algorithm and optimizes and trains the discriminant model by using an Adam algorithm (an optimization algorithm for expanding a Stochastic Gradient Descent method) in the process of optimizing the model parameters. Optionally, initial learning of the first image segmentation modelThe ratio was 2.5X 10^-4The initial learning rate of the discrimination model is 1 × 10^-4。

Optionally, the computer device trains the first image segmentation model and the discrimination model using the following formulas:

wherein the content of the first and second substances,

means that a minimization of the loss value is performed for the first image segmentation model,

means that the minimum of loss value is carried out on the discriminant model, L_segRepresents a segmentation loss value, L_DIndicating the discrimination loss value.

In the embodiments of the present application, the discrimination model is trained simultaneously when the first image segmentation model is trained, for example. In another embodiment, the discriminant model is a trained discriminant model, and the computer device does not need to train the discriminant model any more, then the computer device does not perform the above-mentioned step 208-210.

Fig. 3 is a schematic diagram of determining a loss value according to an embodiment of the present application, and as shown in fig. 3, a computer device inputs a source domain image 301 into an image segmentation model 303 to obtain a first label image 304, and inputs a target domain image into the image segmentation model 303 to obtain a third label image. Wherein the computer device determines the first source domain loss value based on the determined weight map 306 in case the source domain image 301 is determined as a noise source domain image, and wherein the computer device does not need to utilize the weight map when determining the second source domain loss value in case the source domain image 301 is determined as a non-noise source domain image. The computer device inputs the first label image 304 and the third label image 305 into the discrimination model 307, respectively, and determines a discrimination loss value based on the discrimination result and the information entropy 308.

211. And the computer equipment carries out image segmentation task processing based on the image segmentation model obtained after training.

The image segmentation model obtained after training has high accuracy and robustness, and the computer equipment carries out image segmentation task processing on any image based on the image segmentation model, so that the accuracy of image segmentation can be ensured.

In a possible implementation manner, the computer device performs image segmentation on any image based on the image segmentation model obtained after training to obtain a label image corresponding to any image, and completes an image segmentation task on any image.

Optionally, any image in the same domain as the target domain image is an image to be segmented, for example, the any image is a medical image (such as a human spinal cord image or an ophthalmic image) in the same domain as the target domain image, for example, both the any image and the target domain image are ophthalmic images, and the any image and the target domain image are images acquired by the same device.

212. The computer device determines the first source domain image as a non-noise source domain image in response to the first source domain loss value being less than the target loss value.

After the computer device determines the first source domain loss value, in response to that the first source domain loss value is smaller than the target loss value, the first source domain image corresponding to the first source domain loss value is determined as a non-noise source domain image, and a subsequent second image segmentation model can be trained based on the first source domain image determined as the non-noise source domain image. The target loss value is set by a computer device, which is not limited in the embodiment of the present application.

In a possible implementation manner, in the same training stage, the computer device performs image segmentation on a plurality of source domain images to obtain corresponding source domain loss values, and then selects a target number of source domain loss values from the determined plurality of source domain loss values according to a descending order, determines the source domain images corresponding to the selected plurality of source domain loss values as non-noise source domain images, and determines the rest of the source domain images as noise source domain images.

Aiming at the condition that a source domain label image corresponding to a source domain image in a training set has noise, computer equipment cleans the noise label of the source domain image based on a source domain loss value, selects a non-noise source domain image corresponding to the non-noise label from the source domain loss value, improves the accuracy of the label, improves the quality of the image and further improves the accuracy of an image segmentation model obtained by training.

It should be noted that, in the embodiment of the present application, the determination of the image in step 212 is performed only after the first image segmentation model is used, and in another embodiment, the computer device may perform the determination of the image in step 212 after the computer device performs the step 202 to determine the first source region loss value.

213. The computer device determines a fifth source domain loss value based on the source domain label image and the sixth label image corresponding to the first source domain image.

After the computer device determines the first source domain image as a non-noise source domain image based on the first image segmentation model, the non-noise source domain image can be applied to training of the second image segmentation model, and a fifth source domain loss value is determined based on the source domain label image and the sixth label image corresponding to the first source domain image. And the sixth label image is obtained by carrying out image segmentation on the first source domain image again based on the second image segmentation model. The step 213 is the same as the process of determining the second source domain loss value in the step 204, and is not repeated here.

214. The computer device determines a second target domain loss value based on the third label image and the seventh label image.

And the computer equipment applies the third label image obtained by image segmentation of the target domain image by the target domain image and the first image segmentation model to training of the second image segmentation model, and determines a second target domain loss value based on the third label image and the seventh label image. And the seventh label image is obtained by carrying out image segmentation on the target domain image again based on the second image segmentation model. The process of step 214 is the same as the process of determining the first target domain loss value in step 206, and is not repeated here.

215. The computer device trains a second image segmentation model based on the fifth source domain loss value and the second target domain loss value.

The step 215 is similar to the process of training the first image segmentation model in the step 207, and is not repeated herein.

It should be noted that, in the embodiment of the present application, only the second image segmentation model is trained based on the fifth source domain loss value and the second target domain loss value as an example for description. In another embodiment, the second image segmentation model also corresponds to the discriminant model, and after the step 215 is executed, the method further includes: and the computer equipment respectively carries out image discrimination processing on the sixth label image and the seventh label image based on the discrimination model corresponding to the second image segmentation model to obtain a source domain discrimination result and a target domain discrimination result, determines a discrimination loss value based on the source domain discrimination result, the target domain discrimination result and the seventh label image, and trains the second image segmentation model and the discrimination model corresponding to the second image segmentation model based on the discrimination loss value. The process is similar to the process of step 208-210, and is not described herein again.

The embodiments of the present application will be described by taking only an example in which the first source domain image is determined as a non-noise source domain image based on the first image segmentation model. In another embodiment, the computer device determines the first source domain image as the noise source domain image based on the first image segmentation model, and then step 212 and step 215 are replaced by the following step 216 and step 219 as shown in fig. 4:

216. the computer device determines the first source domain image as a noise source domain image in response to the first source domain loss value not being less than the target loss value.

After the computer device determines the first source domain loss value, in response to that the first source domain loss value is not less than the target loss value, the first source domain image corresponding to the first source domain loss value is determined as a noise source domain image, and a subsequent second image segmentation model can be trained based on the first source domain image determined as the noise source domain image.

In a possible implementation manner, in the same training stage, the computer device performs image segmentation on a plurality of source domain images to obtain corresponding source domain loss values, and then the computer device selects a target number of source domain loss values from the determined plurality of source domain loss values in a descending order, determines the source domain images corresponding to the selected plurality of source domain loss values as non-noise source domain images, and determines the rest of the source domain images as noise source domain images.

In one possible implementation, the computer device selects a plurality of first candidate source domain images among the plurality of source domain images based on the first image segmentation model when training the first image segmentation model, and selects a plurality of second candidate source domain images among the plurality of source domain images based on the second image segmentation model when training the second image segmentation model. The computer device determines the same source domain image of the plurality of first candidate source domain images and the plurality of second candidate source domain images as a noise source domain image.

217. The computer device determines a sixth source domain loss value based on the source domain label image corresponding to the first source domain image, the first label image, and the sixth label image.

After the computer device determines the first source domain image as the noise source domain image based on the first image segmentation model, the noise source domain image may be applied to training of the second image segmentation model, and the sixth source domain loss value is determined based on the source domain label image, the first label image, and the sixth label image corresponding to the first source domain image. And the sixth label image is obtained by carrying out image segmentation on the first source domain image again based on the second image segmentation model. The step 217 is the same as the process of determining the first source domain loss value in the step 202, and is not described in detail herein.

218. The computer device determines a second target domain loss value based on the third label image and the seventh label image.

219. The computer device trains a second image segmentation model based on the sixth source domain loss value and the second target domain loss value.

The process of step 218-219 is similar to the process of step 214-215, and will not be described in detail here.

In the embodiment of the present application, only the first image segmentation model and the second image segmentation model are trained by using the same source domain image and target domain image, and in the process from the beginning of training to the completion of training, the computer device trains the first image segmentation model and the second image segmentation model by using a plurality of source domain images and a plurality of target domain images in the same training set.

In the embodiment of the application, an unsupervised robustness segmentation method based on a domain adaptive strategy is provided for noise label images in a training set and distribution differences between source domain images and target domain images, on one hand, source domain images with small loss in a source domain are exchanged by using same-line supervision, and useful information is learned from label images corresponding to the noise source domain images by using a pixel-level weight distribution method, and on the other hand, noise source domain images with large loss in the source domain images are supervised by using the same-line supervision to generate pseudo label images, so that real label images and predicted pseudo label images in the source domain images are learned and supervised with each other by using a cross training mode of two image segmentation models, and a more reasonable training strategy is designed, thereby solving the problem of noise stored in the source domain images and the problem of domain migration from the source domain images to the target domain images, the accuracy and robustness of the image segmentation model can be improved.

Fig. 5 is a schematic diagram of a cross-training image segmentation model provided in an embodiment of the present application, and referring to fig. 5, a computer device trains a first image segmentation model 503 and a second image segmentation model 505 by using a source domain image 501 and a target domain image 502 in the same training set, where the first image segmentation model 503 corresponds to a first discrimination model 504, the first discrimination model 504 discriminates a label image output by the first image segmentation model 503, the second image segmentation model 505 corresponds to a second discrimination model 506, and the second discrimination model 506 discriminates a label image output by the second image segmentation model 505.

The computer device inputs a source domain image 501 into a first image segmentation model 503 for processing and determines a source domain loss value, inputs a target domain image 502 into the first image segmentation model 503 for processing and determines a target domain loss value, inputs a result output by the first image segmentation model 503 into a first discrimination model 504 for processing and determines a discrimination loss value, and trains the first image segmentation model 503 and the first discrimination model 504 based on the source domain loss value, the target domain loss value and the discrimination loss value. Also, the computer device determines a non-noise source domain image 506 and a noise source domain image 508 in the source domain image 501 based on the source domain loss value.

Similarly, the computer device inputs the source domain image 501 and the target domain image 502 into the second image segmentation model 505 for processing, determines a source domain loss value and a target domain loss value, inputs a result output by the second image segmentation model 505 into the second discrimination model 506 for processing, determines a discrimination loss value, and trains the second image segmentation model 505 and the second discrimination model 506 based on the source domain loss value, the target domain loss value and the discrimination loss value. Also, the computer device determines a non-noise source domain image 509 and a noise source domain image 510 in the source domain image 501 based on the source domain loss value. And the computer device determines the intersection of the noise source domain image 508 and the noise source domain image 510 to obtain a high-confidence noise source domain image, re-determines the pseudo label image 511 of the high-confidence noise source domain image, and applies the pseudo label image 511 to the next training stage of the image segmentation model.

In order to verify the effectiveness of the image segmentation data processing method provided by the embodiment of the application, the computer device performs verification on the ophthalmologic image and the human spinal cord image. The ophthalmic image adopts a REFUSE image set (a public data set) and a Drishli-GS image set (a public data set), and because the training set and the testing set of the two image sets are shot by different acquisition devices, the images have differences in color, texture and the like, the training set of the REFUSE image set is used as a source domain training set, the testing set of the REFUSE image set and the testing set of the Drishli-GS image set are used as a target domain training set, and the testing set of the REFUSE image set and the testing set of the Drishli-GS image set are used as a target domain testing set. For the REFUGE image set, the training set comprises 400 images, the image size is 2124 x 2056, and the test set comprises 300 images; for the Drishti-GS image set, the training set contained 50 images, the test set contained 51 images, and the image size was 2047 × 1759.

The human spinal white matter image and the gray matter segmentation image are derived from four different centers, the images from the center-1, the center-2 and the center-3 are used as a source domain training set, the image from the center-4 is used as a target domain training set, and the image from the center-4 is also used as a target domain test set. For the source domain training set, which contains 341 images in total, the size of the image is 100 × 100, the target domain training set and the test set contain the same 134 images, and the image size is 100 × 100.

The experimental results using the method provided in the examples of the present application and the method of the related art are compared on an ophthalmic image set, and the experimental results based on different noise level tasks are shown in tables 1 and 2, respectively. The BDL is a bidirectional learning method based on self-supervision learning, the pOSAL is a method proposed in a retinal fundus glaucoma challenge race, and the BEAL is an antagonistic learning method based on edge and incite information. The embodiment of the application adopts DICE (loss determination algorithm) to compare experimental results, and DICE is a measure index of segmentation results, is used for calculating the similarity between a real label image and a predicted label image, and is expressed as:

where DI denotes a degree of similarity between the real tag image and the predicted tag image, P denotes the real tag image, and Y denotes the predicted tag image.

Where DIdisc denotes a DI value of the optic disc, and DIcup denotes a DI value of the optic cup.

Wherein, table l is the experimental results of the mild noise level from the refoge training set to the refoge testing set, table 2 is the experimental results of the severe noise level from the refoge training set to the refoge testing set, and table 3 is the experimental results of the severe noise level from the SCGM training set to the SCGM testing set. FIG. 6 shows the results of an experiment performed on the REFUSE image set and the Drishti-GS image set according to the embodiment of the present application. FIG. 7 is another experimental result on the REFUSE image set and the Drishti-GS image set provided by the embodiment of the present application.

TABLE 1

TABLE 2

TABLE 3

As can be seen from tables 1 to 3, in the image segmentation model trained by the method provided in the embodiment of the present application, the degree of similarity between the predicted label image and the real label image is high when the image segmentation is performed, that is, the accuracy of the image segmentation model is high. As can be seen from fig. 6 and 7, in the image segmentation model trained by the method provided in the embodiment of the present application, the predicted label image is more similar to the real label image when the image segmentation is performed, that is, the accuracy of the image segmentation model is higher.

In the method provided by the embodiment of the application, when the first image segmentation model is trained by adopting the source domain image and the target domain image, not only the segmentation result of the first image segmentation model is considered, but also the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, so that the first image segmentation model learns the segmentation result of the second image segmentation model, and therefore, the training process of the first image segmentation model can be supervised by the segmentation result of the second image segmentation model, and the first image segmentation model with higher accuracy and robustness is trained.

Furthermore, when the first source domain image is determined to be a noise source domain image, the source domain label image corresponding to the noise source domain image is not accurate enough, so when determining the source domain loss value corresponding to the noise source domain image, the first image segmentation model is supervised by the second image segmentation model in consideration of the segmentation result of the second image segmentation model on the noise source domain image, thereby reducing errors caused by the noise source domain image in the training process and improving the resistance of the image segmentation model on noise data.

In addition, in the case that no corresponding label image exists in the target domain image, the embodiment of the present application generates a pseudo label image by adding "self-supervision" information according to the result of image segmentation on the target domain image, and applies the pseudo label image in the next training stage as cross supervision information of the next training stage, thereby solving the problem of unsupervised image segmentation.

In addition, in order to ensure the accuracy of the generated pseudo tag images, in the embodiment of the application, the first image segmentation model and the second image segmentation model are trained in an iterative training mode, more accurate pseudo tag images are continuously generated, and then the pseudo tag images are used for segmenting the models, so that the accuracy of the image segmentation models is further improved.

And because the two image segmentation models have different structures and learning capabilities, different types of errors introduced by the noise image can be filtered in a cross training mode, so that the predicted pseudo label image has stronger robustness. In the process of image exchange of the two image segmentation models, mutual supervision can be performed, noise images in a training set are purified, and training errors caused by the noise images are reduced.

In addition, in the embodiment of the application, the image segmentation model is trained by adopting the pixel-level label information, and compared with the image-level label information, the result obtained by performing image segmentation on the trained image segmentation model is more accurate.

In addition, in the embodiment of the application, the information entropy of the label image output by the image segmentation model is fused into the discrimination loss so as to strengthen the counterstudy process of the image segmentation model and the discrimination model, so that the trained image segmentation model can better learn the domain adaptive task capability, and the accuracy of the image segmentation model is improved.

After the first image segmentation model is trained, the image segmentation task processing can be performed based on the first image segmentation model. The following embodiment will explain the image segmentation process in detail.

Fig. 8 is a flowchart of an image segmentation data processing method according to an embodiment of the present application. An execution subject of the embodiment of the present application is a computer device, and referring to fig. 8, the method includes:

801. a computer device obtains a first image segmentation model.

The computer equipment obtains a first image segmentation model obtained after training, wherein the first image segmentation model comprises a feature extraction layer and a feature segmentation layer, the feature extraction layer is used for carrying out feature extraction on a source domain image or a target domain image, and the feature segmentation layer is used for carrying out feature segmentation on the feature image.

The sample data adopted in training the first image segmentation model comprises: the sample source domain image, the sample source domain label image corresponding to the sample source domain image, the sample target domain image, the first image segmentation model and the second image segmentation model respectively carry out image segmentation on the sample source domain image and the sample target domain image to obtain label images.

The training process of the first image segmentation model comprises the following steps: acquiring a sample source domain image, a sample source domain label image, a first label image and a second label image, wherein the first label image is obtained by performing image segmentation on the sample source domain image based on a first image segmentation model, and the second label image is obtained by performing image segmentation on the sample source domain image based on a second image segmentation model; determining a first source domain loss value based on the sample source domain label image, the first label image, and the second label image; acquiring a sample target domain image, a third label image and a fourth label image, wherein the third label image is obtained by carrying out image segmentation on the sample target domain image based on a first image segmentation model, and the fourth label image is obtained by carrying out image segmentation on the sample target domain image based on a second image segmentation model; determining a first target domain loss value based on the third label image and the fourth label image; a first image segmentation model is trained based on the first source domain loss value and the first target domain loss value.

It should be noted that the sample source domain image is similar to the first source domain image in the embodiment of fig. 2, the sample source domain label image is similar to the source domain label image corresponding to the first source domain image in the embodiment of fig. 2, the sample target domain image is similar to the target domain image in the embodiment of fig. 2, and the process of training the first image segmentation model is detailed in the embodiment of fig. 2, which is not repeated herein.

802. And the computer equipment performs feature extraction on the target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image.

When any target domain image needs to be segmented, the computer device performs feature extraction on the target domain image based on a feature extraction layer in the first image segmentation model to obtain a feature image corresponding to the target domain image, wherein the feature image is used for representing features of the target domain image.

803. And the computer equipment performs characteristic segmentation on the characteristic image based on the characteristic segmentation layer to obtain a target domain label image corresponding to the target domain image.

After the computer equipment obtains the characteristic image, the characteristic image is subjected to characteristic segmentation based on a characteristic segmentation layer in the first image segmentation model, a target domain label image corresponding to the target domain image is obtained, and the target domain label image is marked with the category to which the pixel points in the target domain image belong, so that the image segmentation task of the target domain image is completed.

In the method provided by the embodiment of the application, when the first image segmentation model is trained by using the source domain image and the target domain image, the segmentation result of the first image segmentation model is considered, and the segmentation result of the second image segmentation model is also considered, so that the first image segmentation model learns the segmentation result of the second image segmentation model, and therefore the training process of the first image segmentation model can be supervised by the segmentation result of the second image segmentation model, so that the first image segmentation model with higher accuracy and robustness is trained, the target domain image is segmented by using the first image segmentation model, and the obtained segmentation result is more accurate.

Fig. 9 is a schematic structural diagram of an image segmentation data processing apparatus according to an embodiment of the present application. Referring to fig. 9, the apparatus includes:

a first image obtaining module 901, configured to obtain a first source domain image, a source domain label image corresponding to the first source domain image, a first label image, and a second label image, where the first label image is obtained by performing image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by performing image segmentation on the first source domain image based on a second image segmentation model;

a first loss value determining module 902, configured to determine a first source domain loss value based on a source domain label image, a first label image, and a second label image corresponding to the first source domain image;

a second image obtaining module 903, configured to obtain a target domain image, a third label image, and a fourth label image, where the target domain image and the source domain image belong to different fields, the third label image is obtained by performing image segmentation on the target domain image based on the first image segmentation model, and the fourth label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;

a second loss value determining module 904, configured to determine a first target domain loss value based on the third label image and the fourth label image;

a first training module 905, configured to train a first image segmentation model based on the first source domain loss value and the first target domain loss value;

and the image segmentation module 906 is configured to perform image segmentation task processing based on the trained image segmentation model.

In the image segmentation data processing device provided by the embodiment of the application, when the first image segmentation model is trained by adopting the source domain image and the target domain image, not only the segmentation result of the first image segmentation model is considered, but also the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, so that the first image segmentation model learns the segmentation result of the second image segmentation model, the training process of the first image segmentation model can be supervised by the segmentation result of the second image segmentation model, the first image segmentation model with higher accuracy and robustness can be trained, and the accuracy of image segmentation is improved.

Optionally, referring to fig. 10, the first loss value determining module 902 includes:

a first loss value determination unit 9021, configured to determine a first source domain loss value based on the source domain label image, the first label image, and the second label image, when the first source domain image is determined to be the noise source domain image based on the second image segmentation model.

Optionally, referring to fig. 10, the apparatus further comprises:

a first image obtaining module 901, configured to obtain a second source domain image, a source domain label image corresponding to the second source domain image, and a fifth label image, where the fifth label image is obtained by performing image segmentation on the second source domain image based on the first image segmentation model;

a third loss value determining module 907, configured to determine a second source domain loss value based on a source domain label image and a fifth label image corresponding to the second source domain image when the second source domain image is determined to be a non-noise source domain image based on the second image segmentation model;

a first training module 905, configured to train a first image segmentation model based on the first source domain loss value, the second source domain loss value, and the first target domain loss value.

Optionally, referring to fig. 10, the first image obtaining module 901 is further configured to obtain a sixth label image, where the sixth label image is obtained by performing image segmentation on the second source domain image based on the second image segmentation model;

a third loss value determination module 907, comprising:

a similarity determination unit 9071 configured to determine a similarity between the fifth label image and the sixth label image;

and a second loss value determining unit 9072, configured to determine, in response to the similarity being not less than the similarity threshold, a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image.

Alternatively, referring to fig. 10, the third loss value determination module 907 includes:

and a third loss value determining unit 9073, configured to determine, in response to the similarity being smaller than the similarity threshold, a second source domain loss value based on the source domain label image, the fifth label image, and the sixth label image corresponding to the second source domain image.

a fourth loss value determining unit 9022, configured to determine a third source domain loss value based on the source domain label image and the first label image;

a fifth loss value determining unit 9023, configured to determine a fourth source domain loss value based on the second label image and the first label image;

a sixth loss value determining unit 9024, configured to determine the first source domain loss value based on the third source domain loss value and the fourth source domain loss value.

Optionally, referring to fig. 10, a sixth loss value determining unit 9024, configured to:

acquiring the number of iterative training times corresponding to the training;

and in response to the iterative training number not being less than the first threshold and not being greater than the second threshold, determining a first source domain loss value based on the third source domain loss value, the fourth source domain loss value, the iteration number, the first threshold, and the second threshold.

in response to the number of iterative trainings being greater than the second threshold, a first source domain loss value is determined based on the third source domain loss value and the fourth source domain loss value.

Optionally, referring to fig. 10, the first label image includes a first label corresponding to each pixel point in the first source domain image, and the fourth loss value determining unit 9022 is configured to:

a third source domain loss value is determined based on the first label and the weight corresponding to each pixel point.

Optionally, referring to fig. 10, a fourth loss value determination unit 9022 is configured to:

determining the minimum distance between each pixel point and the boundary line of the source domain label image;

determining a maximum value of the determined plurality of minimum distances as a target distance;

and respectively determining the weight corresponding to each pixel point based on the target distance and the minimum distance corresponding to each pixel point.

determining a first difference value based on the first label and the weight corresponding to each pixel point, wherein the first difference value represents the difference between the first label image and the source domain label image;

determining a third source domain sub-loss value corresponding to any pixel point based on the first difference value, the first label corresponding to any pixel point and the weight;

a third source domain loss value is determined based on the determined plurality of third source domain sub-loss values.

Optionally, referring to fig. 10, the third label image includes a third label corresponding to each pixel point in the target domain image, the fourth label image includes a fourth label corresponding to each pixel point in the target domain image, and the second loss value determining module 904 includes:

a difference value determining unit 9041, configured to determine a second difference value based on the third label and the fourth label corresponding to each pixel, where the second difference value represents a difference between the third label image and the fourth label image;

a seventh loss value determining unit 9042, configured to determine, based on the second difference value and the third label and the fourth label corresponding to any pixel point, a target domain sub-loss value corresponding to any pixel point;

an eighth loss value determining unit 9043, configured to determine the first target domain loss value based on the determined plurality of target domain sub-loss values.

Optionally, referring to fig. 10, the apparatus further comprises:

a discrimination processing module 908, configured to perform image discrimination processing on the first tag image based on the discrimination model to obtain a source domain discrimination result;

the discrimination processing module 908 is further configured to perform image discrimination processing on the third label image based on the discrimination model to obtain a target domain discrimination result;

a discrimination loss determination module 909 for determining a discrimination loss value based on the source domain discrimination result, the target domain discrimination result, and the third label image;

and a second training module 910, configured to train the first image segmentation model and the discriminant model based on the discriminant loss value.

Optionally, referring to fig. 10, the third label image includes third labels corresponding to a plurality of pixel points in the target domain image, and the discrimination loss determining module 909 includes:

an uncertainty determining unit 9091 configured to determine, based on a third tag corresponding to the plurality of pixel points, an uncertainty corresponding to the third tag image;

and a discrimination loss determination unit 9092 configured to determine a discrimination loss value based on the uncertainty, the source domain discrimination result, and the target domain discrimination result.

Optionally, referring to fig. 10, the image segmentation module 906 includes:

and an image segmentation unit 9061, configured to perform image segmentation on any image based on the image segmentation model obtained after training, to obtain a label image corresponding to any image.

Optionally, referring to fig. 10, the apparatus further comprises:

a determining module 911, configured to determine, in response to that the first source domain loss value is smaller than the target loss value, the first source domain image as a non-noise source domain image; alternatively, the first and second electrodes may be,

and a determining module 911, configured to determine the first source domain image as the noise source domain image in response to the first source domain loss value not being less than the target loss value.

Optionally, referring to fig. 10, the third loss value determining module 906 is further configured to determine a fifth source domain loss value based on a source domain label image and a sixth label image corresponding to the first source domain image, where the sixth label image is obtained by performing image segmentation on the first source domain image based on the second image segmentation model;

the second loss value determining module 904 is further configured to determine a second target domain loss value based on a third tag image and a seventh tag image, where the seventh tag image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;

the first training module 905 is further configured to train a second image segmentation model based on the fifth source domain loss value and the second target domain loss value.

Optionally, referring to fig. 10, the apparatus further comprises:

the first loss value determining module 902 is further configured to determine a sixth source domain loss value based on a source domain label image, a first label image and a sixth label image corresponding to the first source domain image, where the sixth label image is obtained by performing image segmentation on the first source domain image based on the second image segmentation model;

the first training module 905 is further configured to train the second image segmentation model based on the sixth source domain loss value and the second target domain loss value.

It should be noted that: in the image segmentation data processing apparatus provided in the above embodiment, when the image segmentation model is trained, only the division of each functional module is illustrated, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the computer device is divided into different functional modules to complete all or part of the above described functions. In addition, the image segmentation data processing apparatus and the image segmentation data processing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments, and are not described herein again.

Fig. 11 is a schematic structural diagram of an image segmentation data processing apparatus according to an embodiment of the present application. Referring to fig. 11, the apparatus includes:

a model obtaining module 1101, configured to obtain a first image segmentation model, where the first image segmentation model includes a feature extraction layer and a feature segmentation layer;

a feature extraction module 1102, configured to perform feature extraction on a target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image;

a feature segmentation module 1103, configured to perform feature segmentation on the feature image based on the feature segmentation layer to obtain a target domain label image corresponding to the target domain image;

In the image segmentation data processing apparatus provided in the embodiment of the present application, when the first image segmentation model is trained by using the source domain image and the target domain image, the segmentation result of the first image segmentation model and the segmentation result of the second image segmentation model are considered, so that the first image segmentation model learns the segmentation result of the second image segmentation model, and therefore, the training process of the first image segmentation model can be supervised by the segmentation result of the second image segmentation model, so as to train the first image segmentation model with higher accuracy and robustness, and then the target domain image is segmented by using the first image segmentation model, and the obtained segmentation result is more accurate.

Optionally, referring to fig. 12, the apparatus further comprises:

a first image obtaining module 1104, configured to obtain the sample source-domain image, the sample source-domain label image, a first label image, and a second label image, where the first label image is obtained by image-segmenting the sample source-domain image based on the first image-segmentation model, and the second label image is obtained by image-segmenting the sample source-domain image based on the second image-segmentation model;

a first loss value determining module 1105 for determining a first source-domain loss value based on the sample source-domain label image, the first label image, and the second label image;

a second image obtaining module 1106, configured to obtain a sample target domain image, a third label image, and a fourth label image, where the third label image is obtained by performing image segmentation on the sample target domain image based on the first image segmentation model, and the fourth label image is obtained by performing image segmentation on the sample target domain image based on the second image segmentation model;

a second loss value determining module 1107 for determining a first target domain loss value based on the third label image and the fourth label image;

a training module 1108, configured to train the first image segmentation model based on the first source domain loss value and the first target domain loss value.

The embodiment of the present application further provides a computer device, where the computer device includes a processor and a memory, and the memory stores at least one computer program, and the at least one computer program is loaded and executed by the processor to implement the operations executed in the image segmentation data processing method of the foregoing embodiment.

Optionally, the computer device is provided as a terminal. Fig. 13 shows a schematic structural diagram of a terminal 1300 according to an exemplary embodiment of the present application.

Terminal 1300 includes: a processor 1301 and a memory 1302.

Processor 1301 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like. The processor 1301 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1301 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also referred to as a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 1301 may be integrated with a GPU (Graphics Processing Unit, image Processing interactor) for rendering and drawing content required to be displayed by the display screen. In some embodiments, processor 1301 may further include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.

Memory 1302 may include one or more computer-readable storage media, which may be non-transitory. The memory 1302 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 1302 is used to store at least one computer program for being possessed by the processor 1301 for implementing the image segmentation data processing method provided by the method embodiments herein.

In some embodiments, terminal 1300 may further optionally include: a peripheral interface 1303 and at least one peripheral. Processor 1301, memory 1302, and peripheral interface 1303 may be connected by a bus or signal line. Each peripheral device may be connected to the peripheral device interface 1303 via a bus, signal line, or circuit board. Optionally, the peripheral device comprises: at least one of radio frequency circuitry 1304, display 1305, camera assembly 1306, and power supply 1307.

Peripheral interface 1303 may be used to connect at least one peripheral associated with I/O (Input/Output) to processor 1301 and memory 1302. In some embodiments, processor 1301, memory 1302, and peripheral interface 1303 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1301, the memory 1302, and the peripheral device interface 1303 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The Radio Frequency circuit 1304 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 1304 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 1304 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1304 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 1304 may communicate with other devices via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1304 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.

The display screen 1305 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1305 is a touch display screen, the display screen 1305 also has the ability to capture touch signals on or over the surface of the display screen 1305. The touch signal may be input to the processor 1301 as a control signal for processing. At this point, the display 1305 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, display 1305 may be one, disposed on the front panel of terminal 1300; in other embodiments, display 1305 may be at least two, either on different surfaces of terminal 1300 or in a folded design; in other embodiments, display 1305 may be a flexible display disposed on a curved surface or on a folded surface of terminal 1300. Even further, the display 1305 may be arranged in a non-rectangular irregular figure, i.e., a shaped screen. The Display 1305 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or the like.

The camera assembly 1306 is used to capture images or video. Optionally, camera assembly 1306 includes a front camera and a rear camera. The front camera is disposed on the front panel of the terminal 1300, and the rear camera is disposed on the rear surface of the terminal 1300. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 1306 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.

Power supply 1307 is used to provide power to various components in terminal 1300. The power source 1307 may be alternating current, direct current, disposable or rechargeable. When the power source 1307 comprises a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.

Those skilled in the art will appreciate that the configuration shown in fig. 13 is not intended to be limiting with respect to terminal 1300 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be employed.

Optionally, the computer device is provided as a server. Fig. 14 is a schematic structural diagram of a server according to an embodiment of the present application, where the server 1400 may generate relatively large differences due to different configurations or performances, and may include one or more processors (CPUs) 1401 and one or more memories 1402, where the memory 1402 stores at least one computer program, and the at least one computer program is loaded and executed by the processors 1401 to implement the image segmentation data Processing method provided by the foregoing method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input/output, and the server may also include other components for implementing the functions of the device, which are not described herein again.

The embodiment of the present application further provides a computer-readable storage medium, where at least one computer program is stored in the computer-readable storage medium, and the at least one computer program is loaded and executed by a processor to implement the operations executed in the image segmentation data processing method of the foregoing embodiment.

The embodiments of the present application also provide a computer program product or a computer program, where the computer program product or the computer program includes computer program code, the computer program code is stored in a computer-readable storage medium, a processor of a computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computer device implements the operations performed in the image segmentation data processing method according to the above embodiments.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only an alternative embodiment of the present application and should not be construed as limiting the present application, and any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. An image segmentation data processing method, characterized in that the method comprises:

2. The method of claim 1, wherein determining a first source domain loss value based on the source domain label image, the first label image, and the second label image to which the first source domain image corresponds comprises:

determining the first source domain loss value based on the source domain label image, the first label image and the second label image when the first source domain image is determined to be a noise source domain image based on the second image segmentation model.

3. The method of claim 1, further comprising:

acquiring a second source domain image, a source domain label image corresponding to the second source domain image and a fifth label image, wherein the fifth label image is obtained by performing image segmentation on the second source domain image based on the first image segmentation model;

determining a second source domain loss value based on a source domain label image and the fifth label image corresponding to the second source domain image when the second source domain image is determined to be a non-noise source domain image based on the second image segmentation model;

training the first image segmentation model based on the first source domain loss value, the second source domain loss value, and the first target domain loss value.

4. The method of claim 1, wherein determining a first source domain loss value based on the source domain label image, the first label image, and the second label image to which the first source domain image corresponds comprises:

determining a third source domain loss value based on the source domain label image and the first label image;

determining a fourth source domain loss value based on the second label image and the first label image;

determining the first source domain penalty value based on the third source domain penalty value and the fourth source domain penalty value.

5. The method of claim 4, wherein determining the first source domain penalty value based on the third source domain penalty value and the fourth source domain penalty value comprises:

acquiring the number of iterative training times corresponding to the training;

in response to the number of iterative trainings not being less than a first threshold and not greater than a second threshold, determining the first source domain loss value based on the third source domain loss value, the fourth source domain loss value, the number of iterations, the first threshold, and the second threshold.

6. The method of claim 1, wherein the third label image comprises a third label corresponding to each pixel point in the target domain image, wherein the fourth label image comprises a fourth label corresponding to each pixel point in the target domain image, and wherein determining the first target domain loss value based on the third label image and the fourth label image comprises:

determining a second difference value based on the third label and the fourth label corresponding to each pixel point, the second difference value representing a difference between the third label image and the fourth label image;

determining a target domain sub-loss value corresponding to any pixel point based on the second difference value, a third label and a fourth label corresponding to any pixel point;

determining the first target domain penalty value based on the determined plurality of target domain sub-penalty values.

7. The method of claim 1, further comprising:

carrying out image discrimination processing on the first label image based on a discrimination model to obtain a source domain discrimination result;

carrying out image discrimination processing on the third label image based on the discrimination model to obtain a target domain discrimination result;

determining a discrimination loss value based on the source domain discrimination result, the target domain discrimination result and the third label image;

training the first image segmentation model and the discriminant model based on the discriminant loss value.

8. The method of claim 1, wherein after determining a first source domain loss value based on the source domain label image, the first label image, and the second label image to which the first source domain image corresponds, the method further comprises:

in response to the first source domain loss value being less than a target loss value, determining the first source domain image as a non-noise source domain image; alternatively, the first and second electrodes may be,

and in response to the first source domain loss value not being less than the target loss value, determining the first source domain image as a noise source domain image.

9. The method of claim 8, wherein after determining the first source domain image as the noise source domain image in response to the first source domain loss value not being less than the target loss value, the method further comprises:

determining a sixth source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image and a sixth label image, wherein the sixth label image is obtained by performing image segmentation on the first source domain image based on the second image segmentation model;

determining a second target domain loss value based on the third label image and a seventh label image, wherein the seventh label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;

training the second image segmentation model based on the sixth source domain loss value and the second target domain loss value.

10. An image segmentation data processing method, characterized in that the method comprises:

acquiring a first image segmentation model, wherein the first image segmentation model comprises a feature extraction layer and a feature segmentation layer;

performing feature extraction on the target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image;

performing feature segmentation on the feature image based on the feature segmentation layer to obtain a target domain label image corresponding to the target domain image;

11. The method of claim 10, wherein the training process of the first image segmentation model comprises:

acquiring the sample source domain image, the sample source domain label image, a first label image and a second label image, wherein the first label image is obtained by image segmentation of the sample source domain image based on the first image segmentation model, and the second label image is obtained by image segmentation of the sample source domain image based on the second image segmentation model;

determining a first source domain loss value based on the sample source domain label image, the first label image, and the second label image;

acquiring a sample target domain image, a third label image and a fourth label image, wherein the third label image is obtained by performing image segmentation on the sample target domain image based on the first image segmentation model, and the fourth label image is obtained by performing image segmentation on the sample target domain image based on the second image segmentation model;

training the first image segmentation model based on the first source domain loss value and the first target domain loss value.

12. An image segmentation data processing apparatus, characterized in that the apparatus comprises:

13. An image segmentation data processing apparatus, characterized in that the apparatus comprises:

14. A computer device, characterized in that the computer device comprises a processor and a memory, in which at least one computer program is stored, which is loaded and executed by the processor, to implement the operations performed in the image segmentation data processing method according to any one of claims 1 to 9, or to implement the operations performed in the image segmentation data processing method according to any one of claims 10 to 11.

15. A computer-readable storage medium, in which at least one computer program is stored, which is loaded and executed by a processor, to implement the operations performed in the image segmentation data processing method according to any one of claims 1 to 9, or to implement the operations performed in the image segmentation data processing method according to any one of claims 10 to 11.