CN112419326B - Image segmentation data processing method, device, equipment and storage medium

Info

Publication number
CN112419326B
CN112419326B (application CN202011403688.XA)
Authority
CN
China
Prior art keywords
image
source domain
label
loss value
domain
Prior art date
Legal status
Active
Application number
CN202011403688.XA
Other languages
Chinese (zh)
Other versions
CN112419326A
Inventor
柳露艳
马锴
郑冶枫
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011403688.XA
Publication of CN112419326A
Application granted
Publication of CN112419326B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application discloses an image segmentation data processing method, an image segmentation data processing device, computer equipment and a storage medium, belonging to the field of computer technology. The method comprises the following steps: determining a first source domain loss value based on the source domain label image, the first label image, and the second label image; determining a first target domain loss value based on the third label image and the fourth label image; training a first image segmentation model based on the first source domain loss value and the first target domain loss value; and performing image segmentation tasks based on the trained image segmentation model. When the first image segmentation model is trained with source domain images and target domain images, the output of the second image segmentation model is fused into the source domain loss value in addition to the output of the first image segmentation model, so that the first image segmentation model can be supervised by the second image segmentation model. This trains an image segmentation model with higher accuracy and robustness, thereby improving the accuracy of image segmentation.

Description

Image segmentation data processing method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the field of computer technology, and in particular to an image segmentation data processing method, device, equipment, and storage medium.
Background
With the continuous development of computer technology and artificial intelligence, artificial intelligence has achieved remarkable results in image analysis tasks such as image segmentation and image recognition. When image segmentation is performed with artificial intelligence, an image segmentation model must be trained on a large number of sample images, but noise images are often present among those sample images.
In the related art, a quality score is determined for each sample image by a quality evaluation model, sample images with higher quality scores are selected, and the image segmentation model is trained with them. Because the sample images are screened solely by the quality evaluation model, the reliability of the screened sample images is not high, and the accuracy and robustness of the trained image segmentation model are poor.
Disclosure of Invention
The embodiment of the application provides an image segmentation data processing method, an image segmentation data processing device, computer equipment and a storage medium, which can train an image segmentation model with higher accuracy and robustness. The technical scheme is as follows:
In one aspect, there is provided an image segmentation data processing method, the method comprising:
acquiring a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image, wherein the first label image is obtained by carrying out image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by carrying out image segmentation on the first source domain image based on a second image segmentation model;
determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image and the second label image;
acquiring a target domain image, a third label image and a fourth label image, wherein the third label image is obtained by carrying out image segmentation on the target domain image based on the first image segmentation model, and the fourth label image is obtained by carrying out image segmentation on the target domain image based on the second image segmentation model;
determining a first target domain loss value based on the third label image and the fourth label image;
and training the first image segmentation model based on the first source domain loss value and the first target domain loss value, and performing image segmentation task processing based on the trained image segmentation model.
Optionally, the determining the second source domain loss value based on the source domain label image corresponding to the second source domain image and the fifth label image includes: and determining the second source domain loss value based on the source domain label image, the fifth label image and the sixth label image corresponding to the second source domain image in response to the similarity being less than the similarity threshold.
Optionally, the method further comprises: and in response to the number of iterative training times being greater than the second threshold, determining the first source domain loss value based on the third source domain loss value and the fourth source domain loss value.
Optionally, the determining, based on the source domain label image, the weight corresponding to each pixel point includes: determining the minimum distance between each pixel point and the boundary line of the source domain label image; determining a maximum value of the determined plurality of minimum distances as a target distance; and respectively determining the weight corresponding to each pixel point based on the target distance and the minimum distance corresponding to each pixel point.
Optionally, the determining the third source domain loss value based on the first label and the weight corresponding to each pixel point includes: determining a first difference value based on the first label and the weight corresponding to each pixel point, wherein the first difference value represents the difference between the first label image and the source domain label image; determining a third source domain sub-loss value corresponding to any pixel point based on the first difference value and the first label and weight corresponding to that pixel point; and determining the third source domain loss value based on the determined plurality of third source domain sub-loss values.
Optionally, in response to the first source domain loss value being less than the target loss value, after determining the first source domain image as a non-noise source domain image, the method further comprises: determining a fifth source domain loss value based on a source domain label image and a sixth label image corresponding to the first source domain image, wherein the sixth label image is obtained by image segmentation of the first source domain image based on the second image segmentation model; determining a second target domain loss value based on the third label image and a seventh label image, wherein the seventh label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model; the second image segmentation model is trained based on the fifth source domain loss value and the second target domain loss value.
Optionally, the third label image includes a third label corresponding to a plurality of pixels in the target domain image, and determining the discrimination loss value based on the source domain discrimination result, the target domain discrimination result, and the third label image includes:
determining uncertainty corresponding to the third label image based on the third labels corresponding to the plurality of pixel points;
And determining the discrimination loss value based on the uncertainty, the source domain discrimination result and the target domain discrimination result.
Optionally, the method further comprises:
acquiring a sixth label image, wherein the sixth label image is obtained by carrying out image segmentation on the second source domain image based on the second image segmentation model;
the determining a second source domain loss value based on the source domain label image corresponding to the second source domain image and the fifth label image includes:
determining a similarity between the fifth label image and the sixth label image;
and determining the second source domain loss value based on the source domain label image corresponding to the second source domain image and the fifth label image in response to the similarity not being smaller than a similarity threshold.
Optionally, the first label image includes a first label corresponding to each pixel point in the first source domain image, and determining, based on the source domain label image and the first label image, a third source domain loss value includes:
determining the weight corresponding to each pixel point based on the source domain label image;
and determining the third source domain loss value based on the first label and the weight corresponding to each pixel point.
In another aspect, there is provided an image segmentation data processing apparatus, the apparatus including:
the first image acquisition module is used for acquiring a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image, wherein the first label image is obtained by carrying out image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by carrying out image segmentation on the first source domain image based on a second image segmentation model;
the first loss value determining module is used for determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image and the second label image;
the second image acquisition module is used for acquiring a target domain image, a third label image and a fourth label image, wherein the third label image is obtained by image segmentation of the target domain image based on the first image segmentation model, and the fourth label image is obtained by image segmentation of the target domain image based on the second image segmentation model;
a second loss value determining module, configured to determine a first target domain loss value based on the third tag image and the fourth tag image;
The first training module is used for training the first image segmentation model based on the first source domain loss value and the first target domain loss value;
and the image segmentation module is used for carrying out image segmentation task processing based on the trained image segmentation model.
In another aspect, there is provided an image segmentation data processing apparatus, the apparatus including:
the model acquisition module is used for acquiring a first image segmentation model, and the first image segmentation model comprises a feature extraction layer and a feature segmentation layer;
the feature extraction module is used for carrying out feature extraction on the target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image;
the feature segmentation module is used for carrying out feature segmentation on the feature image based on the feature segmentation layer to obtain a target domain label image corresponding to the target domain image;
the sample data adopted in training the first image segmentation model comprises:
the sample source domain image, a sample source domain label image corresponding to the sample source domain image, a sample target domain image, and label images obtained by respectively carrying out image segmentation on the sample source domain image and the sample target domain image by the first image segmentation model and the second image segmentation model.
In another aspect, there is provided a computer apparatus comprising a processor and a memory having stored therein at least one computer program loaded and executed by the processor to implement the operations performed in the image segmentation data processing method as set out in the above aspects.
In another aspect, there is provided a computer readable storage medium having stored therein at least one computer program loaded and executed by a processor to implement the operations performed in the image segmentation data processing method as described in the above aspects.
In another aspect, a computer program product or a computer program is provided, the computer program product or computer program comprising computer program code stored in a computer readable storage medium, the computer program code being read from the computer readable storage medium by a processor of a computer device, the computer program code being executed by the processor such that the computer device implements the operations performed in the image segmentation data processing method as described in the above aspect.
According to the method, the device, the computer equipment and the storage medium, when the first image segmentation model is trained by adopting the source domain image and the target domain image, the segmentation result of the first image segmentation model is considered, the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, and the first image segmentation model learns the segmentation result of the second image segmentation model, so that the training process of the first image segmentation model can be supervised by the segmentation result of the second image segmentation model, the first image segmentation model with higher accuracy and robustness can be trained, and the accuracy of image segmentation is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an image segmentation data processing method according to an embodiment of the present application.
Fig. 2 is a flowchart of another image segmentation data processing method according to an embodiment of the present application.
Fig. 3 is a schematic diagram of determining a loss value according to an embodiment of the present application.
Fig. 4 is a flowchart of an image segmentation data processing method according to an embodiment of the present application.
Fig. 5 is a schematic diagram of a cross-training image segmentation model according to an embodiment of the present application.
Fig. 6 is a schematic diagram of an experimental result provided in the examples of the present application.
Fig. 7 is a schematic diagram of another experimental result provided in the examples of the present application.
Fig. 8 is a flowchart of another image segmentation data processing method according to an embodiment of the present application.
Fig. 9 is a schematic structural diagram of an image segmentation data processing apparatus according to an embodiment of the present application.
Fig. 10 is a schematic structural diagram of another image segmentation data processing apparatus according to an embodiment of the present application.
Fig. 11 is a schematic structural diagram of another image segmentation data processing apparatus according to an embodiment of the present application.
Fig. 12 is a schematic structural diagram of another image segmentation data processing apparatus according to an embodiment of the present application.
Fig. 13 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Fig. 14 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It will be understood that the terms "first," "second," and the like, as used herein, may be used to describe various concepts, but are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, a first source domain image may be referred to as a second source domain image, and similarly, a second source domain image may be referred to as a first source domain image, without departing from the scope of the present application.
Wherein at least one refers to one or more than one, for example, at least one source domain image may be any integer number of source domain images greater than or equal to one, such as one source domain image, two source domain images, three source domain images, and the like. The plurality means two or more, and for example, the plurality of source domain images may be an integer number of source domain images equal to or greater than two, such as two source domain images and three source domain images. Each refers to each of the at least one, for example, each source field image refers to each of the plurality of source field images, and if the plurality of source field images is 3 source field images, each source field image refers to each of the 3 source field images.
Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is thus the study of the design principles and implementation methods of various intelligent machines, enabling machines to sense, reason and make decisions.
Artificial intelligence is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Its infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Its software technologies include natural language processing and machine learning.
Machine Learning (ML) is a multi-field interdiscipline involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to endow computers with intelligence; it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and learning from demonstration.
Computer Vision (CV) is the science of studying how to make machines "see": replacing human eyes with cameras and computers to recognize, track and measure targets, and further performing graphics processing so that the computer produces an image more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies the related theory and technology in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques include image processing, image recognition, image semantic understanding, image retrieval, OCR (Optical Character Recognition), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D (three-dimensional) techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric techniques such as face recognition and fingerprint recognition.
Cloud technology refers to a hosting technology that unifies resources such as hardware, software and networks in a wide area network or a local area network to realize the computation, storage, processing and sharing of data. It is a general term for network technology, information technology, integration technology, management platform technology, application technology and the like based on cloud computing applications; it can form a resource pool that is flexible and convenient to use on demand. Background services of networked technical systems, such as video websites, picture websites and other portal sites, require large amounts of computing and storage resources, and cloud computing will become an important support for them. As the internet industry develops, every article may in the future carry its own identification mark that must be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong back-end system support, which can only be realized through cloud computing. As a basic capability provider of cloud computing, a cloud computing resource pool, called a cloud platform for short and also known as an IaaS (Infrastructure as a Service) platform, is established, and multiple types of virtual resources are deployed in the resource pool for external clients to select and use.
Medical Cloud is a medical and health service cloud platform created by combining medical technology with new technologies such as cloud computing, mobile technology, multimedia, 4G communication, big data and the Internet of Things, realizing the sharing of medical resources and the expansion of medical services. Through the application of cloud computing, the medical cloud improves the efficiency of medical institutions and makes it convenient for residents to seek medical care. For example, appointment registration, electronic medical records and medical insurance are all products of the combination of cloud computing and the medical field, and the medical cloud offers the advantages of data security, information sharing, dynamic expansion and overall layout.
The image segmentation data processing method provided by the embodiment of the application will be described below based on an artificial intelligence technology and a cloud technology.
In order to facilitate understanding of the technical process of the embodiments of the present application, some terms related to the embodiments of the present application are explained below:
(1) Domain Adaptation (DA): a transfer learning method that improves the performance of a model on target domain data through the rich information provided by source domain data. The Source Domain can provide rich label data; the Target Domain is the domain in which the test data set lies, and it lacks label data. Source domain data and target domain data describe the same scene in different fields and can solve the same kind of task, but the distributions of the data in the two fields differ.
(2) Noise data (Noise Labels): noise data carries erroneous label data; in this application these are noise labels at the pixel level, characterized by the Noise Level and the Noise Ratio.
(3) Peer supervision (Peer-Review): a strategy in which two models supervise each other during training. By exchanging, between the two network models, the samples with smaller losses and the pseudo labels generated for the noise data, the performance of the models is improved and the noise labels of the noise data are purified.
(4) Cross training: in this embodiment of the present application, cross training refers to training two image segmentation models with the same function simultaneously by using the same training set. The first image segmentation model performs image segmentation on images in a training set, noise images and non-noise images are screened in the training set according to segmentation results, the screened noise images and non-noise images are applied to training of the second image segmentation model, then the second image segmentation model performs image segmentation on the noise images and the non-noise images, the noise images and the non-noise images are screened again according to segmentation results, the method is applied to training of the first image segmentation model again, and cross training of the first image segmentation model and the second image segmentation model is completed in an iterative mode.
(5) Domain migration: because the source domain image and the target domain image differ in spatial distribution, an image segmentation model that can segment the source domain image cannot necessarily segment the target domain image accurately. In the embodiment of the application, using an image segmentation model to segment an image is equivalent to mapping the image into a common feature space. Domain migration based on domain adaptation maps source domain images and target domain images with different spatial distributions into the same feature space. During training, the goal is to make the segmentation results (label images) of the source domain image and the target domain image as similar as possible in that feature space; that is, the image segmentation model reduces the difference between the source domain image and the target domain image in the feature space, thereby improving the accuracy with which it segments target domain images.
The embodiment of the application provides an image segmentation data processing method, wherein an execution subject is computer equipment.
In one possible implementation, the computer device is a terminal, and the terminal is a portable, pocket, hand-held, or other type of terminal, such as a smart phone, tablet, notebook, or desktop computer, but is not limited thereto.
In another possible implementation manner, the computer device is a server, where the server is a stand-alone physical server, or is a server cluster or a distributed system formed by multiple physical servers, or is a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network ), and basic cloud computing services such as big data and artificial intelligence platforms, which are not limited herein.
Fig. 1 is a flowchart of an image segmentation data processing method according to an embodiment of the present application. The execution body of the embodiment of the present application is a computer device, referring to fig. 1, the method includes:
101. and acquiring the first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image.
In this embodiment of the present application, image segmentation refers to classifying pixel points in an image, and marking whether the pixel points belong to a target class, so as to obtain a label image corresponding to the image, thereby segmenting a target region in the image. The source domain image has a corresponding source domain label image, and optionally, the source domain label image is obtained by manually labeling the source domain image. The target domain image lacks a corresponding target domain label image. Because the target domain image lacks a corresponding target domain label image, a model capable of performing image segmentation on the target domain image cannot be trained only based on the target domain image.
In the embodiment of the application, the source domain image and the target domain image can be divided by adopting different standards according to requirements. For example, the source domain image is an eye image acquired using a first device and the target domain image is an eye image acquired using a second device in a different manner. For another example, the source domain image is an image including a cat and the target domain image is an image including a dog, which are different from the domain to which the target domain image belongs. For another example, the source domain image is a real face photo, the target domain image is a sketched face cartoon, and the like.
Therefore, in the embodiment of the application, aiming at the distribution difference between the source domain image and the target domain image, a model training method based on a domain adaptation strategy is provided, in which two models learn from and supervise each other to train an image segmentation model without supervision on the target domain.
The first image segmentation model and the second image segmentation model are two different models which are used for cross training, and the first image segmentation model and the second image segmentation model are used for carrying out segmentation processing on the image to obtain a corresponding label image. Alternatively, the first image segmentation model and the second image segmentation model have the same structure, or the first image segmentation model and the second image segmentation model have different structures, which is not limited in the embodiment of the present application.
The first image segmentation model and the second image segmentation model are models for segmenting the target domain image, and in the process of training the first image segmentation model, the first source domain image, the source domain label image corresponding to the first source domain image, the first label image and the second label image are acquired by the computer equipment. The first label image is obtained by image segmentation of the first source domain image based on a first image segmentation model in a current training stage, and the second label image is obtained by image segmentation of the first source domain image based on a second image segmentation model in a previous training stage, and the second label image is also called a pseudo label image corresponding to the first source domain image.
102. And determining a first source domain loss value based on the source domain label image, the first label image and the second label image corresponding to the first source domain image.
After the computer device determines the source domain label image, the first label image, and the second label image, a first source domain loss value for the first label image is determined based on a difference between the source domain label image and the first label image, and a difference between the second label image and the first label image.
103. And acquiring a target domain image, a third label image and a fourth label image.
In training the first image segmentation model, the computer device also acquires a target domain image, a third label image, and a fourth label image. The target domain image is different from the domain to which the source domain image belongs, the third label image is obtained by image segmentation of the target domain image based on the first image segmentation model in the current training stage, and the fourth label image is obtained by image segmentation of the target domain image based on the second image segmentation model in the previous training stage, and is also called as a pseudo label image corresponding to the target domain image.
104. A first target domain loss value is determined based on the third label image and the fourth label image.
After the computer device determines the third and fourth label images, a first target domain loss value for the third label image is determined based on a difference between the third and fourth label images.
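As a concrete illustration, the first target domain loss value can be computed as a cross-entropy between the two label images. The following is a minimal sketch in PyTorch; the use of cross entropy (rather than another difference measure) is an assumption for illustration only.

```python
# Hedged sketch: the first target domain loss, assuming the third label image
# (model prediction, `pred`) and the fourth label image (peer pseudo label,
# `pseudo`) are probability maps of the same shape. Cross entropy is an
# illustrative choice of difference measure, not specified by the patent.
import torch

def target_domain_loss(pred, pseudo, eps=1e-6):
    # Penalize disagreement between the prediction and the pseudo label.
    return -(pseudo * torch.log(pred + eps)).mean()
```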
105. The first image segmentation model is trained based on the first source domain loss value and the first target domain loss value.
After the first source domain loss value and the first target domain loss value are determined by the computer equipment, parameters of the first image segmentation model are adjusted based on the first source domain loss value and the first target domain loss value, so that the first source domain loss value and the first target domain loss value are smaller and smaller until the first source domain loss value and the first target domain loss value tend to converge, and the trained first image segmentation model is obtained.
106. And performing image segmentation task processing based on the trained image segmentation model.
The image segmentation model obtained through training has higher accuracy and robustness, and the computer equipment can perform image segmentation task processing on any image based on the image segmentation model, so that the accuracy of image segmentation can be ensured.
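Putting steps 101 to 106 together, a minimal training-step sketch in PyTorch might look as follows; the model and loss helpers are placeholders chosen for illustration, not names from the patent:

```python
# Hedged sketch of one training step for the first image segmentation model.
# `model_a` is the model being trained; the pseudo labels were produced by the
# peer model in the previous training stage. All names are illustrative.
import torch

def train_step(model_a, optimizer,
               src_img, src_label, src_pseudo,   # steps 101-102
               tgt_img, tgt_pseudo,              # steps 103-104
               source_loss_fn, target_loss_fn):
    pred_src = model_a(src_img)                  # first label image
    pred_tgt = model_a(tgt_img)                  # third label image

    loss_src = source_loss_fn(pred_src, src_label, src_pseudo)  # first source domain loss
    loss_tgt = target_loss_fn(pred_tgt, tgt_pseudo)             # first target domain loss

    loss = loss_src + loss_tgt                   # step 105
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```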
According to the method provided by the embodiment of the application, when the source domain image and the target domain image are adopted to train the first image segmentation model, the segmentation result of the first image segmentation model is considered, the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, so that the first image segmentation model learns the segmentation result of the second image segmentation model, the training process of the first image segmentation model can be supervised through the segmentation result of the second image segmentation model, the first image segmentation model with higher accuracy and robustness can be trained, and the accuracy of image segmentation is improved.
Fig. 2 is a flowchart of an image segmentation data processing method according to an embodiment of the present application. The execution body of the embodiment of the present application is a computer device, referring to fig. 2, the method includes:
201. The computer equipment acquires a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image.
In the process of training the first image segmentation model, the computer equipment acquires a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image. The first label image is obtained by image segmentation of the first source domain image based on a first image segmentation model in the current training stage, the first label image can be called a predictive label image corresponding to the first source domain image, and the second label image is obtained by image segmentation of the first source domain image based on a second image segmentation model in the previous training stage, and the second label image is also called a pseudo label image corresponding to the first source domain image.
Optionally, the first source domain image is any type of image. For example, in the medical field, when the first source region image is an eye image, an image segmentation process is performed on the eye image, and a cup region, a disc region, or the like is segmented from the eye image. Optionally, the first source domain image is acquired by a medical image acquisition device comprising a CT (Computed Tomography, electronic computed tomography) or a magnetic resonance imager or the like. Optionally, the source domain label image corresponding to the first source domain image is obtained by labeling the first source domain image by a medical staff. Optionally, the first source domain image and the source domain label image corresponding to the first source domain image are obtained by the computer device from a local database or are obtained by the computer device downloading from a network, which is not limited in the embodiment of the present application.
In the embodiment of the application, the first image segmentation model and the second image segmentation model are synchronously trained in a cross training mode. The first image segmentation model and the second image segmentation model have the same structure or have different structures. The structures and parameters of the first image segmentation model and the second image segmentation model are set and adjusted according to requirements.
In one possible implementation, the first image segmentation model and the second image segmentation model differ in structure and parameters, which, due to the differences between the structures and parameters of the two models, results in the two models being able to fit and learn from different angles, resulting in different decision boundaries, i.e. with different learning capabilities, and thus enabling the promotion of peer supervision between the models.
Optionally, the first image segmentation model is DeepLabv2 (an image segmentation model) with ResNet-101 (a residual network) as its backbone, and an Atrous Spatial Pyramid Pooling (ASPP) network is added to the first image segmentation model to enrich the multi-scale characteristics of the output. In addition, to enhance the feature expression capability of the model, an attention mechanism based on DANet (Dual Attention Network) is introduced: an attention network is added to learn and capture the dependencies between pixel points and between feature-layer channels, and the output of the attention network is concatenated with the output of the ASPP network to generate the final label image.
Optionally, the second image segmentation model adopts DeepLabv3+ (an image segmentation model) as its framework and MobileNetV2 (a lightweight network) as its backbone, reducing the number of parameters and the running cost of the model. The second image segmentation model extracts features with the first convolution network of MobileNetV2 and 7 residual networks; optionally, the stride of the first convolution network and the first two residual networks is set to 2 and the strides of the remaining residual networks are set to 1, giving the second image segmentation model a downsampling rate of 8. In addition, an ASPP network is added to the second image segmentation model to learn latent features under different receptive fields: ASPP branches with different dilation rates generate multi-scale features, information from different levels is integrated into a feature map, the feature map is upsampled and convolved, and the resulting combined features are concatenated with low-level features for fine-grained image segmentation.
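Both models add an ASPP network on top of their backbones. The following PyTorch sketch shows a minimal ASPP module of that kind; the dilation rates are common choices for illustration, not values given by the patent:

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    # Minimal Atrous Spatial Pyramid Pooling sketch (illustrative rates).
    def __init__(self, in_ch, out_ch, rates=(6, 12, 18)):
        super().__init__()
        # One 1x1 branch plus one dilated 3x3 branch per rate.
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1, bias=False)] +
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
             for r in rates])
        self.project = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        # Each branch sees a different receptive field; concatenating them
        # yields the multi-scale features described above.
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```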
202. The computer device determines a first source domain loss value based on the source domain label image, the first label image, and the second label image, in a case where the first source domain image is determined to be the noise source domain image based on the second image segmentation model.
In the last training stage, the computer equipment performs image segmentation on the first source domain image based on the second image segmentation model to obtain a label image, determines a source domain loss value based on a real source domain label image corresponding to the first source domain image, the label image obtained this time and a pseudo label image corresponding to the first source domain image when the second image segmentation model is trained, and determines the first source domain image as a noise source domain image based on the source domain loss value. The pseudo tag image corresponding to the first source domain image when the second image segmentation model is trained refers to a tag image obtained by image segmentation of the first source domain image by the first image segmentation model in a previous training stage. The process of determining the noise source domain image based on the source domain loss value is described in detail in step 215 below, and is not described herein.
The computer device determines a first source domain loss value based on the source domain label image, the first label image, and the second label image, if the first source domain image is determined to be a noise source domain image.
In one possible implementation, the computer device determines a third source domain loss value based on the source domain label image and the first label image, determines a fourth source domain loss value based on the second label image and the first label image, and determines a first source domain loss value based on the third source domain loss value and the fourth source domain loss value.
Wherein the computer device determines a third source domain loss value based on a difference between the source domain label image and the first label image and a fourth source domain loss value based on a difference between the second label image and the first label image. Optionally, the computer device performs weighted summation of the third source domain loss value and the fourth source domain loss value to obtain the first source domain loss value.
In another possible implementation manner, if the first label image includes a first label corresponding to each pixel point in the first source domain image, determining the third source domain loss value includes: the computer equipment determines the weight corresponding to each pixel point based on the source domain label image, and determines the third source domain loss value based on the first label and the weight corresponding to each pixel point.
Optionally, the weights corresponding to the plurality of pixel points are represented by a weight map (Distance Map). When the first source domain image is judged to be a noise domain image, the source domain label image corresponding to it is a noise label image, and a loss value determined directly from the labels in the source domain label image carries a large error. Therefore, to prevent the first image segmentation model from over-fitting the noise label image, a new noise-resistant way of determining the loss is provided: considering that the labels of the source domain label image differ greatly at the boundary of the target region, a weight map of the source domain label image is introduced, so that non-noise pixel-level information is learned from the noise label image and noise pixel-level information is filtered out.
Optionally, determining the weight includes: the computer equipment determines the minimum distance between each pixel point and the boundary line of the source domain label image, determines the maximum value of the determined minimum distances as a target distance, and respectively determines the weight corresponding to each pixel point based on the target distance and the minimum distance corresponding to each pixel point.
The computer equipment determines the maximum value in a plurality of minimum distances as a target distance, and determines the weight corresponding to each pixel point based on the difference between the target distance and the minimum distance corresponding to each pixel point respectively, so that the model can learn the information of the key position in the source domain label image and filter the difference on the boundary.
Optionally, the computer device determines the weight corresponding to any pixel point using the following formula:

$$W(y_i) = w_c + w_0 \cdot \exp\!\left(-\frac{\left(\max{}_{dis} - d(y_i)\right)^2}{2\delta^2}\right)$$

where i denotes any pixel point, y_i denotes its source domain label, and W(y_i) denotes the weight corresponding to that pixel point. w_c and w_0 are weight coefficients; optionally, w_c is set to 10 and w_0 is set to 1. max_dis denotes the target distance, d(y_i) denotes the minimum distance corresponding to the pixel point, and δ² denotes the variance of the determined plurality of minimum distances.
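The following sketch computes such a weight map for a binary source domain label image using SciPy's Euclidean distance transform; the exact weight formula follows the reconstruction above and should be read as an assumption:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def weight_map(label, w_c=10.0, w_0=1.0):
    # label: binary (h, w) array. Distance of each pixel to the segmentation
    # boundary: foreground pixels are measured to the nearest background pixel
    # and background pixels to the nearest foreground pixel.
    d = distance_transform_edt(label) + distance_transform_edt(1 - label)
    max_dis = d.max()            # target distance
    var = d.var() + 1e-8         # variance of the minimum distances
    # Pixels far from the boundary receive the largest weights, so boundary
    # disagreements in a noisy label image are down-weighted.
    return w_c + w_0 * np.exp(-((max_dis - d) ** 2) / (2.0 * var))
```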
Optionally, the determining the third source domain loss value includes: the computer equipment determines a first difference value based on a first label and weight corresponding to each pixel point, determines a third source domain sub-loss value corresponding to any pixel point based on the first difference value, the first label and weight corresponding to any pixel point, and determines a third source domain loss value based on the determined plurality of third source domain sub-loss values.
Wherein the first difference value represents the difference between the first label image and the source domain label image. After the computer device determines the first difference value, the third source domain sub-loss value corresponding to any pixel point can be determined based on the first difference value and the first label and weight corresponding to that pixel point, so as to determine the third source domain sub-loss values corresponding to the plurality of pixel points, and the determined third source domain sub-loss values are weighted and summed to obtain the third source domain loss value.
Optionally, the computer device determines the third source domain sub-loss value corresponding to any pixel point using the following formula:

$$L_{reweight}(p, y) = \lambda_1 \, W(y_i) \, L_{ce}(p_i, y_i) + \lambda_2 \, W(y_i) \, L_{dice}(p, y)$$

where i denotes any pixel point, p denotes the first label image, y denotes the source domain label image, and L_reweight(p, y) denotes the third source domain sub-loss value corresponding to the pixel point. λ_1 and λ_2 are weight coefficients, W(y_i) denotes the weight corresponding to the pixel point, p_i denotes the first label corresponding to the pixel point, and y_i denotes the source domain label corresponding to the pixel point. Here

$$D(p, y) = \frac{2\sum_{i=1}^{h \times w \times c} p_i y_i}{\sum_{i=1}^{h \times w \times c} p_i + \sum_{i=1}^{h \times w \times c} y_i}$$

represents the first difference value, where h, w and c denote the length, width and number of classes of the label image; $L_{ce}(p_i, y_i) = -y_i \log p_i$ represents a loss value determined based on the cross entropy loss, and $L_{dice}(p, y) = 1 - D(p, y)$ represents a loss value determined based on the dice loss (a way of determining the loss value).
The process of determining the fourth source domain loss value is the same as the process of determining the third source domain loss value, and will not be described in detail herein.
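A hedged sketch of this re-weighted loss in PyTorch, combining a weight-map-scaled cross-entropy term with a weighted dice term; the exact placement of the weight map inside the dice term is an assumption:

```python
import torch

def reweight_loss(pred, label, weight, lam1=1.0, lam2=1.0, eps=1e-6):
    # pred, label: (h, w, c) probabilities / one-hot labels; weight: (h, w).
    w = weight.unsqueeze(-1)                                  # broadcast W(y_i)
    ce = -(w * label * torch.log(pred + eps)).mean()          # weighted cross entropy
    inter = (w * pred * label).sum()
    dice = 1.0 - 2.0 * inter / ((w * (pred + label)).sum() + eps)
    return lam1 * ce + lam2 * dice
```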
In another possible implementation, the computer device trains the first image segmentation model through repeated iterative training, and the first image segmentation model becomes more and more stable as the number of iterations grows. The computer device therefore adjusts the weight coefficients of the third source domain loss value and the fourth source domain loss value according to the number of iterative training rounds, so that the weight coefficient of the fourth source domain loss value grows as the number of iterations increases. This avoids the influence of erroneous pseudo label images generated while the image segmentation model is still unstable during training, exploits the effective information in the noise label image, and enhances the robustness of the image segmentation model. Determining the first source domain loss value based on the third source domain loss value and the fourth source domain loss value then comprises: the computer device acquires the number of iterative training rounds corresponding to the current training, and, in response to that number being not smaller than a first threshold and not larger than a second threshold, determines the first source domain loss value based on the third source domain loss value, the fourth source domain loss value, the number of iterations, the first threshold, and the second threshold; or, in response to the number of iterative training rounds being greater than the second threshold, determines the first source domain loss value based on the third source domain loss value and the fourth source domain loss value.
Optionally, the computer device determines the third source domain loss value directly as the first source domain loss value in response to the number of iterative training being less than the first threshold, irrespective of the fourth source domain loss value.
Optionally, the computer device determines the first source domain loss value based on the third source domain loss value and the fourth source domain loss value using the following formula:

$$L_{noise}(p, y) = \begin{cases} L_{reweight}(p, y), & t < T_1 \\ \left(1 - \lambda(t)\right) L_{reweight}(p, y) + \lambda(t)\, L_{reweight}(p, y_{pseudo}), & T_1 \le t \le T_2 \\ 0.5\, L_{reweight}(p, y) + 0.5\, L_{reweight}(p, y_{pseudo}), & t > T_2 \end{cases}$$

where $\lambda(t) = \lambda_{pseudo} \cdot \frac{t - T_1}{T_2 - T_1}$. Here p denotes the first label image, y denotes the source domain label image, and y_pseudo denotes the second label image. L_noise(p, y) denotes the first source domain loss value, L_reweight(p, y) denotes the third source domain loss value, and L_reweight(p, y_pseudo) denotes the fourth source domain loss value. λ_pseudo denotes a proportionality coefficient, optionally set to 0.5. t denotes the number of iterative training rounds, T_1 denotes the first threshold, and T_2 denotes the second threshold. During training of the image segmentation model, when t < T_1 the model is not yet stable, so the first source domain loss value refers entirely to the third source domain loss value, that is, only the influence of the source domain label image on the loss value is considered; when T_1 ≤ t ≤ T_2, the weight coefficient of the fourth source domain loss value increases gradually as t increases while that of the third source domain loss value decreases; when t > T_2, the third and fourth source domain loss values each carry a weight coefficient of 0.5 in the first source domain loss value.
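A small sketch of this schedule; T_1, T_2 and λ_pseudo = 0.5 follow the text, while the linear ramp between T_1 and T_2 is an assumption consistent with the described behavior:

```python
def noise_loss(l_gt, l_pseudo, t, T1, T2, lam_pseudo=0.5):
    # l_gt: third source domain loss value; l_pseudo: fourth source domain loss value.
    if t < T1:                                   # model still unstable: ground truth only
        return l_gt
    if t <= T2:                                  # ramp the pseudo-label term in gradually
        lam = lam_pseudo * (t - T1) / max(T2 - T1, 1)
        return (1.0 - lam) * l_gt + lam * l_pseudo
    return 0.5 * l_gt + 0.5 * l_pseudo           # balanced after T2
```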
203. The computer equipment acquires a second source domain image, a source domain label image corresponding to the second source domain image and a fifth label image.
In the process of training the first image segmentation model, the computer equipment further acquires a second source domain image, a source domain label image corresponding to the second source domain image and a fifth label image. The second source domain image is an image used for training a second image segmentation model in the previous training stage, the source domain label image corresponding to the second source domain image is a real label image corresponding to the second source domain image, and the fifth label image is obtained by carrying out image segmentation on the second source domain image based on the first image segmentation model in the current training stage.
In the embodiment of the application, the first source domain image is an image determined to be a noise source domain image based on the second image segmentation model, and the second source domain image is an image determined to be a non-noise source domain image based on the second image segmentation model.
204. The computer device determines a second source domain loss value based on a source domain label image and a fifth label image corresponding to the second source domain image, in a case where the second source domain image is determined to be a non-noise source domain image based on the second image segmentation model.
In step 204, the computer device determines a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image, in the same manner as in the above-described process of determining the first source domain image as the noise source domain image in step 202.
In one possible implementation, the source domain label image includes a source domain label corresponding to each pixel in the second source domain image, and the fifth label image includes a fifth label corresponding to each pixel in the second source domain image. Determining the second source domain loss value comprises: determining a third difference value based on the source domain label and the fifth label corresponding to each pixel point; determining a second source domain sub-loss value corresponding to any pixel point based on the third difference value, the source domain label corresponding to any pixel point and the fifth label; a second source domain loss value is determined based on the determined plurality of second source domain sub-loss values.
Wherein the third difference value represents a difference between the source domain label image and the fifth label image. After the computer device determines the third difference value, the second source domain sub-loss value corresponding to any pixel point can be determined based on the third difference value, the source domain label corresponding to any pixel point and the fifth label, so as to determine the second source domain sub-loss values corresponding to a plurality of pixel points, and the determined second source domain sub-loss values are weighted and summed to obtain the second source domain loss value.
Optionally, the computer device determines the second source domain loss value using the following formula:

[Formula published as image BDA0002813219190000171 in the original]

wherein i represents any pixel point, p represents a fifth label in the fifth label image, y represents a source domain label in the source domain label image, L_clean(p, y) represents the second source domain loss value, λ_1 and λ_2 represent weight coefficients, p_i represents the fifth label corresponding to any pixel point, and y_i represents the source domain label corresponding to any pixel point. The third difference value is given by:

[Formula published as image BDA0002813219190000172 in the original]

wherein h represents the length of the label image, w represents the width of the label image, and c represents the number of classes of the label image.
In one possible implementation, the computer device obtains a sixth label image before determining the second source domain loss value, determines a similarity between the fifth label image and the sixth label image, and determines the second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image in response to the similarity not being less than a similarity threshold. The sixth label image is obtained by image segmentation of the second source domain image based on the second image segmentation model.
Since the second source domain image is a non-noise source domain image determined based on the second image segmentation model, in order to further determine whether the second source domain image is a noise source domain image or a non-noise source domain image, the computer device determines a similarity between a prediction result of the second source domain image by the first image segmentation model and a prediction result of the second source domain image by the second image segmentation model, i.e. a similarity between the fifth label image and the sixth label image.
If the similarity is not smaller than the similarity threshold, the second source domain image is confirmed to be a non-noise source domain image, and the processing mode for non-noise source domain images is adopted to determine the second source domain loss value corresponding to the second source domain image: the computer device determines the second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image in response to the similarity being not less than the similarity threshold.
If the similarity is smaller than the similarity threshold, the second source domain image is re-judged to be the noise source domain image, and a processing mode of the noise source domain image is adopted to determine a second source domain loss value corresponding to the second source domain image. The computer device determines a second source domain loss value based on the source domain label image, the fifth label image, and the sixth label image corresponding to the second source domain image in response to the similarity being less than the similarity threshold. In the case that the similarity is smaller than the similarity threshold, the manner of determining the second source domain loss value by the computer device is the same as the manner of determining the first source domain loss value in the above step 202, and will not be described in detail herein.
Optionally, the computer device determines a dice loss value between the fifth label image and the sixth label image. Since a greater dice loss value between the fifth label image and the sixth label image corresponds to a smaller similarity between them, the computer device determines the second source domain loss value based on the source domain label image, the fifth label image, and the sixth label image corresponding to the second source domain image in response to the dice loss value being not less than the loss value threshold; otherwise, in response to the dice loss value being less than the loss value threshold, the computer device determines the second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image.
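The re-judgment logic of the preceding paragraphs can be sketched compactly in Python. The function names are assumptions, and clean_loss_fn / noise_loss_fn stand in for the loss computations of step 204 and step 202 respectively:

def dice_loss_between(pred_a, pred_b, eps=1e-6):
    """Dice loss between two predicted label maps; a larger value means
    the two image segmentation models agree less about this image."""
    inter = (pred_a * pred_b).sum()
    return 1 - 2 * inter / (pred_a.sum() + pred_b.sum() + eps)

def second_source_domain_loss(label_img, fifth_pred, sixth_pred, mu,
                              clean_loss_fn, noise_loss_fn):
    """Re-judge the second source domain image by peer agreement; mu is
    the loss value threshold described above."""
    dice_co = dice_loss_between(fifth_pred, sixth_pred)
    if dice_co < mu:
        # The two models agree: keep the non-noise treatment.
        return clean_loss_fn(fifth_pred, label_img)
    # The two models disagree: fall back to the noise treatment of step 202.
    return noise_loss_fn(fifth_pred, label_img, sixth_pred)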
Optionally, for the second source domain image, the computer device obtains the second source domain loss value using the following formulas:

[Formula published as image BDA0002813219190000181 in the original]

[Formula published as image BDA0002813219190000182 in the original]

wherein i represents any pixel point, W(y_i) represents the weight corresponding to any pixel point, w_c and w_0 represent weight coefficients, max_dis represents the target distance, d(y_i) represents the minimum distance corresponding to any pixel point, y_i represents the source domain label corresponding to any pixel point, δ² represents the variance of the determined minimum distances, dice_co represents the dice loss value, and μ represents the loss value threshold. L_reweight(p, y) represents the second source domain loss value, p_i represents the fifth label corresponding to any pixel point, λ_1 and λ_2 represent weight coefficients, h represents the length of the label image, w represents the width of the label image, and c represents the number of classes of the label image.
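The published weight formulas are likewise available only as images, but the listed variables (a base coefficient w_c, a coefficient w_0, a minimum distance d(y_i), and a variance δ²) fit the shape of the well-known distance-based weight map popularized by U-Net. The sketch below illustrates that assumed shape; it is a guess at the structure, not the patented definition:

import numpy as np
from scipy.ndimage import distance_transform_edt

def pixel_weight_map(label_mask, w_c=1.0, w_0=10.0, sigma2=25.0):
    """Distance-based per-pixel weights W(y_i) (assumed form, after U-Net).

    label_mask: (H, W) binary foreground mask from the source domain label image
    w_c, w_0:   weight coefficients; sigma2 plays the role of the variance
                of the minimum distances (all values are placeholders)
    """
    # d(y_i): distance from each pixel to the nearest labelled (foreground) pixel.
    d = distance_transform_edt(label_mask == 0)
    # Emphasise pixels close to the labelled structure (assumed Gaussian falloff).
    return w_c + w_0 * np.exp(-(d ** 2) / (2.0 * sigma2))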
In the embodiment of the present application, in the case that a source domain image is determined to be a non-noise source domain image, the method in steps 203 to 204 is adopted to determine the source domain loss value corresponding to the non-noise source domain image based on the real label image corresponding to that image. In the case that a source domain image is determined to be a noise source domain image, the method in steps 201 to 202 is adopted to determine the source domain loss value corresponding to the noise source domain image based on the real label image and the pseudo label image corresponding to that image. When a source domain image is judged to be a noise source domain image, its real label image is not accurate enough, so the pseudo label image obtained by the second image segmentation model segmenting the noise source domain image is introduced, and the training process of the first image segmentation model is supervised by the segmentation result of the second image segmentation model. This reduces the errors caused by noise source domain images in the training process and improves the resistance of the image segmentation model to noise data.
205. The computer device obtains a target domain image, a third label image, and a fourth label image.
In training the first image segmentation model, the computer device also acquires a target domain image, a third label image, and a fourth label image. The third label image is obtained by image segmentation of the target domain image based on the first image segmentation model in the current training stage, and the fourth label image is obtained by image segmentation of the target domain image based on the second image segmentation model in the previous training stage, and is also called a pseudo label image corresponding to the target domain image.
The target domain image and the source domain image are different, and in the embodiment of the application, the source domain image and the target domain image can be divided by adopting different standards according to requirements. For example, the source domain image is an image including a cat and the target domain image is an image including a dog. Alternatively, the source domain image is an eye image acquired using device a, the target domain image is an eye image acquired using device B, and so on.
206. The computer device determines a first target domain loss value based on the third label image and the fourth label image.
After the computer device obtains the third label image and the fourth label image, a first target domain loss value is determined based on a difference between the third label image and the fourth label image.
In one possible implementation manner, the third label image includes a third label corresponding to each pixel point in the target domain image, the fourth label image includes a fourth label corresponding to each pixel point in the target domain image, and determining the first target domain loss value includes: determining a second difference value based on a third label and a fourth label corresponding to each pixel point, and determining a target domain sub-loss value corresponding to any pixel point based on the second difference value, the third label and the fourth label corresponding to any pixel point; a first target domain loss value is determined based on the determined plurality of target domain sub-loss values.
Wherein the second difference value represents a difference between the third label image and the fourth label image. After the computer device determines the second difference value, the target domain sub-loss value corresponding to any pixel point can be determined based on the second difference value and the third label and the fourth label corresponding to any pixel point, so as to determine the target domain sub-loss values corresponding to a plurality of pixel points, and the determined target domain sub-loss values are weighted and summed to obtain the first target domain loss value.
Optionally, the computer device determines the first target domain loss value using the following formula:

[Formula published as image BDA0002813219190000191 in the original]

wherein i represents any pixel point, L_seg(p, y) represents the first target domain loss value, λ_1 and λ_2 represent weight coefficients, p_i represents the third label corresponding to any pixel point, and y_i represents the fourth label corresponding to any pixel point. The second difference value is given by:

[Formula published as image BDA0002813219190000201 in the original]

wherein h represents the length of the label image, w represents the width of the label image, and c represents the number of classes of the label image.
The target domain image has no corresponding real label image, so the problem of unsupervised image segmentation needs to be solved. The embodiment of the application adds self-supervision information, namely the image segmentation result of the target domain image, to generate a pixel-level pseudo label image, and applies that pseudo label image in the next training stage. In one possible implementation, for any pixel point in the target domain image, if the prediction confidence of a certain class is higher than a target threshold, a pseudo label of that class is generated for the pixel point. Optionally, the target threshold is set adaptively: the prediction confidences of the classes corresponding to the pixel points in the target domain image are sorted, and the predictions with the highest confidence are adaptively selected to generate pixel-level pseudo labels as cross supervision information for the next training stage (see the sketch below). To ensure the correctness of the generated pseudo labels, in the embodiment of the present application the first image segmentation model and the second image segmentation model are trained iteratively, so that progressively more accurate pseudo labels are generated.
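A minimal Python sketch of this pseudo-label generation step follows. The per-class percentile strategy and the parameter values are assumptions; the text states only that the confidences are sorted and the most confident predictions are selected:

import torch

def generate_pseudo_labels(probs, base_threshold=0.9, percentile=50):
    """Generate pixel-level pseudo labels from target domain predictions.

    probs: (C, H, W) softmax probabilities for one target domain image.
    A pixel receives a pseudo label only when its top-class confidence
    clears a threshold; the threshold adapts per class by taking a
    percentile of that class's confidences (assumed strategy).
    """
    conf, labels = probs.max(dim=0)          # top confidence and class per pixel
    pseudo = torch.full_like(labels, -1)     # -1 marks "no pseudo label"
    for c in range(probs.shape[0]):
        mask = labels == c
        if mask.any():
            thr = min(base_threshold,
                      torch.quantile(conf[mask], percentile / 100.0).item())
            pseudo[mask & (conf >= thr)] = c
    return pseudo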
207. The computer device trains a first image segmentation model based on the first source domain loss value, the second source domain loss value, and the first target domain loss value.
The computer device adjusts the parameters of the first image segmentation model based on the first source domain loss value, the second source domain loss value, and the first target domain loss value, so that these three loss values become smaller and smaller until they tend to converge, thereby obtaining the trained first image segmentation model.
In one possible implementation, a computer device determines a segmentation loss value based on a first source domain loss value, a second source domain loss value, and a first target domain loss value, trains a first image segmentation model based on the segmentation loss values, such that the segmentation loss values tend to converge.
Optionally, the computer device determines the segmentation loss value using the following formulas:

L_seg = L_seg(X_S, Y_S) + L_seg(X_T, Ŷ_T);

L_seg(X_S, Y_S) = (1 - α)·L_clean + α·L_noise;

wherein L_seg represents the segmentation loss value, L_seg(X_S, Y_S) represents the total source domain loss value, L_seg(X_T, Ŷ_T) represents the first target domain segmentation loss value, L_noise represents the first source domain loss value, L_clean represents the second source domain loss value, α represents the coefficient balancing L_clean and L_noise, X_S represents a source domain image, X_T represents a target domain image, Y_S represents the source domain label image corresponding to the source domain image, and Ŷ_T represents the third label image generated for the target domain image in the training process, that is, the pseudo label image of the target domain image.
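A hedged sketch of assembling the segmentation loss value from these pieces is given below; the unweighted addition of the target domain term is an assumption, since the text does not spell out its relative weight:

def total_segmentation_loss(l_clean, l_noise, l_target, alpha):
    """Assemble the segmentation loss value used to train the first model.

    l_clean:  second source domain loss value (non-noise source images)
    l_noise:  first source domain loss value (noise source images)
    l_target: first target domain loss value, computed against pseudo labels
    alpha:    coefficient balancing l_clean and l_noise
    """
    source_term = (1.0 - alpha) * l_clean + alpha * l_noise
    return source_term + l_target   # unweighted target term (assumed)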
In the embodiment of the present application, only the process of training the first image segmentation model will be described taking as an example that the first source domain image is determined to be a noise source domain image and the second source domain image is determined to be a non-noise source domain image. In another embodiment, the computer device may not perform steps 203-204 described above, and in step 202, determine a first source domain loss value directly based on the source domain label image, the first label image, and the second label image, and then train the first image segmentation model based on the first source domain loss value and the first target domain loss value.
In the above steps 201 to 207, only the first source domain image, the second source domain image, and the target domain image are processed to obtain the first source domain loss value, the second source domain loss value, and the first target domain loss value. In another embodiment, the computer device processes the plurality of first source domain images, the plurality of second source domain images, and the plurality of target domain images, respectively, and determines a plurality of first source domain loss values, a plurality of second source domain loss values, and a plurality of first target domain loss values, respectively, using the methods provided in steps 201-207 described above, and then trains the first image segmentation model based on the plurality of first source domain loss values, the plurality of second source domain loss values, and the plurality of first target domain loss values.
208. The computer device performs image discrimination processing on the first label image and the third label image respectively based on the discrimination model, to obtain a source domain discrimination result and a target domain discrimination result.

The computer device performs image segmentation processing on the first source domain image and the target domain image based on the first image segmentation model to obtain the first label image and the third label image, performs image discrimination processing on the first label image based on the discrimination model to obtain the source domain discrimination result, and performs image discrimination processing on the third label image based on the discrimination model to obtain the target domain discrimination result.
The discriminating model is used for discriminating whether the input label image belongs to the label image of the source domain image or the label image of the target domain image, and can fuse the label images of the source domain image and the target domain image into the countermeasure learning. When the discrimination model performs image discrimination processing on a third label image corresponding to the target domain image, and the obtained target domain discrimination result indicates that the third label image is the label image of the target domain image, the discrimination model is accurate in discrimination, the label image of the target domain image output by the first image segmentation model is not similar to the label image of the source domain image in the feature space, the discrimination model cannot be "spoofed", and the first image segmentation model does not complete domain migration from the source domain to the target domain, so that the accuracy of image segmentation of the target domain image by the first image segmentation model is still low. When the discrimination model performs image discrimination processing on a third label image corresponding to the target domain image, and the obtained target domain discrimination result indicates that the third label image is the label image of the source domain image, the discrimination model is proved to have a discrimination error, namely, the label image of the target domain image output by the first image segmentation model is similar to the label image of the source domain image in the feature space, and the first image segmentation model completes domain migration from the source domain to the target domain, so that the accuracy of image segmentation of the target domain image by the first image segmentation model is higher.
Optionally, the discrimination model is a model formed by a multi-layer fully convolutional network. Optionally, the kernel size (size of the convolution kernel) of each convolution layer in the discrimination model is 4, the stride is 2, and the padding is 1. Optionally, in the discrimination model, every convolution layer except the last one is followed by a Leaky ReLU (Leaky Rectified Linear Unit). Optionally, the output of the discrimination model is a single-channel two-dimensional result, e.g., the source domain is represented by 0 and the target domain is represented by 1.
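With these stated hyperparameters, the discrimination model can be sketched in PyTorch as follows. The channel widths and depth are assumptions; only the kernel size, stride, padding, activation, and single-channel output come from the description above:

import torch.nn as nn

def build_discriminator(in_channels, base=64):
    """Fully convolutional discriminator: 4x4 kernels, stride 2, padding 1,
    Leaky ReLU after every layer except the last, single-channel output
    (0 for the source domain, 1 for the target domain)."""
    return nn.Sequential(
        nn.Conv2d(in_channels, base, kernel_size=4, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(base, base * 2, kernel_size=4, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(base * 2, base * 4, kernel_size=4, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(base * 4, 1, kernel_size=4, stride=2, padding=1),  # no activation
    )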
209. The computer device determines a discrimination loss value based on the source domain discrimination result, the target domain discrimination result, and the third tag image.
Wherein the discrimination loss value is used to represent the counterloss of the discrimination model.
In one possible implementation manner, if the third label image includes a third label corresponding to a plurality of pixel points in the target domain image, determining the discrimination loss value includes: the computer equipment determines uncertainty corresponding to a third label image based on third labels corresponding to the plurality of pixel points; and determining a discrimination loss value based on the uncertainty, the source domain discrimination result and the target domain discrimination result.
Wherein the uncertainty corresponding to the third label image is used to identify the degree of uncertainty of the third label image; optionally, the uncertainty is the information entropy corresponding to the third label image. Information entropy is introduced for each pixel point of the target domain image, and the discrimination loss value is then determined from the information entropy and the discrimination result output by the discrimination model, so that the loss weight of uncertain pixel points (corresponding to high entropy values) is increased while the loss weight of certain pixel points (corresponding to low entropy values) is reduced. Driven by the information entropy, this helps the discrimination model learn to focus on representative features.
Optionally, the computer device determines the information entropy using the following formula:

f(X_T) = -(1/(h·w)) Σ_i Σ_k p_i^(k)·log(p_i^(k)), with k = 1, ..., c

wherein f(X_T) represents the information entropy, i represents any pixel point, h represents the length of the label image, w represents the width of the label image, c represents the number of classes of the label image, and p_i represents the third label (the predicted class probability vector) corresponding to any pixel point.
Optionally, the computer device determines the discrimination loss value using the following formulas:

L_D = λ_adv·L_adv(X_S, X_T);

L_adv(X_S, X_T) = -E[log(D(G(X_S)))] - E[(λ_entr·f(X_T) + ε)·log(1 - D(G(X_T)))];

wherein L_D represents the discrimination loss value, X_S represents the first source domain image, X_T represents the target domain image, λ_adv represents a parameter used to balance the loss terms, G(·) represents the first image segmentation model, D(·) represents the discrimination model, E[·] represents the mathematical expectation, D(G(X_S)) represents the source domain discrimination result, D(G(X_T)) represents the target domain discrimination result, λ_entr represents the weight parameter corresponding to the information entropy map, f(X_T) represents the uncertainty, and ε is a weight coefficient introduced to ensure the stability of the training process when f(X_T) is incorporated.
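A hedged PyTorch sketch of this entropy-weighted discrimination loss follows. It assumes the expectations are realised as binary cross-entropy over sigmoid logits, and the coefficient values are placeholders rather than the patented settings:

import torch
import torch.nn.functional as F

def entropy_map(probs, eps=1e-8):
    """Per-pixel information entropy of (N, C, H, W) softmax probabilities."""
    return -(probs * torch.log(probs + eps)).sum(dim=1, keepdim=True)

def discrimination_loss(d_src, d_tgt, tgt_probs, lam_adv=1e-3, lam_entr=1.0, eps_w=1.0):
    """L_D = lam_adv * L_adv with an entropy-weighted target domain term.

    d_src, d_tgt: discriminator logits for the source / target label images
    tgt_probs:    (N, C, H, W) segmentation softmax output for the target image
    """
    # Per-pixel weight (lam_entr * f(X_T) + eps), resized to the discriminator
    # output resolution so it can rescale the per-pixel loss.
    w = lam_entr * entropy_map(tgt_probs) + eps_w
    w = F.interpolate(w, size=d_tgt.shape[2:], mode='bilinear', align_corners=False)

    # -E[log(D(G(X_S)))], following the sign convention of the formula above.
    loss_src = F.binary_cross_entropy_with_logits(d_src, torch.ones_like(d_src))
    # -E[w * log(1 - D(G(X_T)))]: the entropy-weighted target term.
    loss_tgt = F.binary_cross_entropy_with_logits(
        d_tgt, torch.zeros_like(d_tgt), weight=w.detach())
    return lam_adv * (loss_src + loss_tgt)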
210. The computer device trains the first image segmentation model and the discriminant model based on the discriminant loss values.
After determining the discrimination loss value, the computer device adjusts the parameters of the first image segmentation model and the parameters of the discrimination model based on the discrimination loss value, so that the label images obtained by the first image segmentation model from the source domain image and the target domain image become more and more similar in the feature space, and it becomes increasingly difficult for the discrimination model to distinguish whether a label image belongs to the target domain image, thereby realizing domain adaptation of the image segmentation.
Optionally, the computer device adjusts the parameters of the discrimination model by maximizing the discrimination loss value. Optionally, since the error generated by the discrimination loss value is also propagated back into the first image segmentation model, the computer device adjusts the parameters of the first image segmentation model by minimizing the discrimination loss value. Optionally, when optimizing the model parameters, the computer device trains the first image segmentation model using the SGD (Stochastic Gradient Descent) algorithm and trains the discrimination model using the Adam algorithm (an optimization algorithm that extends stochastic gradient descent). Optionally, the initial learning rate of the first image segmentation model is 2.5×10⁻⁴ and the initial learning rate of the discrimination model is 1×10⁻⁴.
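The optimizer setup just described can be written directly; the momentum, weight decay, and beta values below are assumptions, since the text fixes only the algorithms and the initial learning rates:

import torch
import torch.nn as nn

seg_model = nn.Conv2d(3, 2, kernel_size=3, padding=1)    # stand-in for the segmentation network
disc_model = nn.Conv2d(2, 1, kernel_size=4, stride=2)    # stand-in for the discriminator

# SGD for the first image segmentation model, initial learning rate 2.5e-4.
seg_opt = torch.optim.SGD(seg_model.parameters(), lr=2.5e-4,
                          momentum=0.9, weight_decay=5e-4)
# Adam for the discrimination model, initial learning rate 1e-4.
disc_opt = torch.optim.Adam(disc_model.parameters(), lr=1e-4, betas=(0.9, 0.99))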
Optionally, the computer device trains the first image segmentation model and the discrimination model using the following objective:

min_G min_D (L_seg + L_D)

wherein min_G refers to minimizing the loss value with respect to the first image segmentation model, min_D refers to minimizing the loss value with respect to the discrimination model, L_seg represents the segmentation loss value, and L_D represents the discrimination loss value.
In the embodiment of the present application, the first image segmentation model is trained, and the discrimination model is trained at the same time. In another embodiment, where the discriminant model is a trained discriminant model, the computer device does not need to train the discriminant model any more, and the computer device does not perform steps 208-210 described above.
Fig. 3 is a schematic diagram of determining a loss value according to an embodiment of the present application, as shown in fig. 3, a computer device inputs a source domain image 301 into an image segmentation model 303 to obtain a first label image 304, and inputs a target domain image into the image segmentation model 303 to obtain a third label image. Wherein the computer device determines the first source domain loss value based on the determined weight map 306 in case the source domain image 301 is determined to be a noise source domain image, and wherein the computer device does not need to utilize the weight map when determining the second source domain loss value in case the source domain image 301 is determined to be a non-noise source domain image. The computer device inputs the first label image 304 and the third label image 305 into a discrimination model 307, respectively, and determines a discrimination loss value based on the discrimination result and the information entropy 308.
211. The computer device performs image segmentation task processing based on the trained image segmentation model.

The image segmentation model obtained after training has higher accuracy and robustness, and the computer device performs image segmentation task processing on any image based on this image segmentation model, so the accuracy of image segmentation can be ensured.
In one possible implementation manner, the computer device performs image segmentation on any image based on the image segmentation model obtained after training to obtain a label image corresponding to any image, and completes the image segmentation task on any image.
Optionally, the image to be segmented is an image in the same field as the target domain image; for example, both are medical images (such as human spinal cord images or ophthalmic images), or both are ophthalmic images acquired by the same device.
212. The computer device determines the first source domain image as a non-noise source domain image in response to the first source domain loss value being less than the target loss value.
After the computer device determines the first source domain loss value, in response to the first source domain loss value being less than the target loss value, the first source domain image corresponding to the first source domain loss value is determined to be a non-noise source domain image, and the subsequent second image segmentation model can be trained based on the first source domain image determined to be the non-noise source domain image. Wherein the target loss value is set by the computer device, which is not limited in the embodiments of the present application.
In one possible implementation manner, in the same training stage, the computer device performs image segmentation on the plurality of source domain images to obtain corresponding source domain loss values, and then the computer device selects a target number of source domain loss values from the determined plurality of source domain loss values according to a sequence from small to large, determines source domain images corresponding to the selected plurality of source domain loss values as non-noise source domain images, and determines the rest of source domain images as noise source domain images.
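A minimal sketch of this small-loss selection rule follows; the dictionary interface and the way the target number is chosen are assumptions:

def split_noise(source_losses, target_number):
    """Sort per-image source domain loss values in ascending order and keep
    the target_number smallest as non-noise source domain images; the rest
    are treated as noise source domain images.

    source_losses: dict mapping image id -> source domain loss value
    """
    ranked = sorted(source_losses, key=source_losses.get)
    return set(ranked[:target_number]), set(ranked[target_number:])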
In view of the fact that noise exists in the source domain label images corresponding to the source domain images in the training set, the computer device cleans the noise labels of the source domain images based on the source domain loss values and selects the non-noise source domain images corresponding to non-noise labels. This improves the accuracy of the labels and the quality of the images, and thereby improves the accuracy of the trained image segmentation model.
It should be noted that the embodiment of the present application describes, only as an example, performing step 212 to judge the image after the first image segmentation model has been trained; in another embodiment, the computer device may perform step 212 to judge the image immediately after determining the first source domain loss value in step 202.
213. The computer device determines a fifth source domain loss value based on the source domain label image and the sixth label image corresponding to the first source domain image.
After the computer equipment judges the first source domain image as a non-noise source domain image based on the first image segmentation model, the non-noise source domain image can be applied to training of the second image segmentation model, and a fifth source domain loss value is determined based on a source domain label image and a sixth label image corresponding to the first source domain image. The sixth label image is obtained by re-performing image segmentation on the first source domain image based on the second image segmentation model. The step 213 is the same as the above-mentioned process of determining the second source domain loss value in the step 204, and will not be described in detail here.
214. The computer device determines a second target domain loss value based on the third label image and the seventh label image.
The computer equipment applies a third label image obtained by image segmentation of the target domain image by the target domain image and the first image segmentation model to training of the second image segmentation model, and determines a second target domain loss value based on the third label image and the seventh label image. The seventh label image is obtained by re-performing image segmentation on the target domain image based on the second image segmentation model. The step 214 is similar to the above-mentioned process of determining the first target domain loss value in the step 206, and will not be described in detail herein.
215. The computer device trains a second image segmentation model based on the fifth source domain loss value and the second target domain loss value.
The step 215 is similar to the process of training the first image segmentation model in the step 207, and will not be described in detail herein.
It should be noted that, in the embodiments of the present application, only the training of the second image segmentation model based on the fifth source domain loss value and the second target domain loss value is taken as an example for explanation. In another embodiment, the second image segmentation model also corresponds to a discriminant model, and after executing step 215, further includes: the computer equipment respectively performs image discrimination processing on the sixth label image and the seventh label image based on the discrimination model corresponding to the second image segmentation model to obtain a source domain discrimination result and a target domain discrimination result, determines a discrimination loss value based on the source domain discrimination result and the target domain discrimination result and the seventh label image, and trains the second image segmentation model and the discrimination model corresponding to the second image segmentation model based on the discrimination loss value. The above process is the same as steps 208-210, and will not be described in detail here.
The embodiment of the present application will be described by taking only an example in which the first source domain image is determined to be a non-noise source domain image based on the first image segmentation model. In another embodiment, where the computer device determines the first source domain image as a noise source domain image based on the first image segmentation model, steps 212-215 are replaced by the following steps 216-219, as shown in FIG. 4:
216. The computer device determines the first source domain image as a noise source domain image in response to the first source domain loss value not being less than the target loss value.
After the computer device determines the first source domain loss value, in response to the first source domain loss value not being less than the target loss value, determining the first source domain image corresponding to the first source domain loss value as a noise source domain image, and then training the subsequent second image segmentation model based on the first source domain image determined as the noise source domain image.
In one possible implementation manner, in the same training stage, the computer device performs image segmentation on the plurality of source domain images to obtain corresponding source domain loss values, and then the computer device selects a target number of source domain loss values from the determined plurality of source domain loss values according to a sequence from small to large, determines source domain images corresponding to the selected plurality of source domain loss values as non-noise source domain images, and determines the rest of source domain images as noise source domain images.
In one possible implementation, the computer device selects a first plurality of candidate source domain images from the plurality of source domain images based on the first image segmentation model when training the first image segmentation model, and selects a second plurality of candidate source domain images from the plurality of source domain images based on the second image segmentation model when training the second image segmentation model. The computer device determines the same source domain image of the plurality of first candidate source domain images and the plurality of second candidate source domain images as a noise source domain image.
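A one-line sketch of this intersection rule, with assumed set-valued inputs:

def high_confidence_noise(candidates_a, candidates_b):
    """Only images flagged by both image segmentation models are treated as
    noise source domain images, filtering out each model's own errors."""
    return set(candidates_a) & set(candidates_b)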
217. The computer device determines a sixth source domain loss value based on the source domain label image, the first label image, and the sixth label image corresponding to the first source domain image.
After the computer equipment judges the first source domain image as the noise source domain image based on the first image segmentation model, the noise source domain image can be applied to training of the second image segmentation model, and a sixth source domain loss value is determined based on the source domain label image, the first label image and the sixth label image corresponding to the first source domain image. The sixth label image is obtained by re-performing image segmentation on the first source domain image based on the second image segmentation model. The step 217 is similar to the process of determining the first source domain loss value in the step 202, and will not be described in detail herein.
218. The computer device determines a second target domain loss value based on the third label image and the seventh label image.
219. The computer device trains a second image segmentation model based on the sixth source domain loss value and the second target domain loss value.
The process of steps 218-219 is the same as the process of steps 214-215 described above, and will not be described in detail here.
In the embodiment of the application, training the first image segmentation model and the second image segmentation model with the same source domain image and the same target domain image is described only as an example; in the process from the start of training to its completion, the computer device trains the first image segmentation model and the second image segmentation model using a plurality of source domain images and a plurality of target domain images from the same training set.
In the embodiment of the application, aiming at the noise label images in the training set and the distribution difference between the source domain images and the target domain images, an unsupervised robust segmentation method based on a domain-adaptive strategy is provided. On the one hand, the source domain images with smaller losses are exchanged under peer supervision, and useful information is learned from the label images corresponding to the noise source domain images by assigning weights at the pixel level. On the other hand, for the source domain images with larger losses, pseudo label images are generated under peer supervision, so that the real label images and the predicted pseudo label images of the noise source domain images learn from and supervise each other through cross-training of the two image segmentation models. A more reasonable training strategy is thereby designed, which addresses both the noise present in the source domain images and the domain migration problem from the source domain images to the target domain images, improving the accuracy and robustness of the image segmentation model.
Fig. 5 is a schematic diagram of a cross training image segmentation model provided in the embodiment of the present application, referring to fig. 5, a computer device trains a first image segmentation model 503 and a second image segmentation model 505 by using a source domain image 501 and a target domain image 502 in the same training set, where the first image segmentation model 503 corresponds to a first discrimination model 504, the first discrimination model 504 discriminates a label image output by the first image segmentation model 503, the second image segmentation model 505 corresponds to a second discrimination model 506, and the second discrimination model 506 discriminates a label image output by the second image segmentation model 505.
The computer device inputs the source domain image 501 into the first image segmentation model 503 to process, determines a source domain loss value, inputs the target domain image 502 into the first image segmentation model 503 to process, determines a target domain loss value, inputs a result output by the first image segmentation model 503 into the first discrimination model 504 to process, determines a discrimination loss value, and trains the first image segmentation model 503 and the first discrimination model 504 based on the source domain loss value, the target domain loss value and the discrimination loss value. And, the computer device determines a non-noise source domain image 506 and a noise source domain image 508 in the source domain image 501 based on the source domain loss value.
Similarly, the computer device inputs the source domain image 501 and the target domain image 502 into the second image segmentation model 505 to process, determines a source domain loss value and a target domain loss value, inputs the result output by the second image segmentation model 505 into the second discrimination model 506 to process, determines a discrimination loss value, and trains the second image segmentation model 505 and the second discrimination model 506 based on the source domain loss value, the target domain loss value and the discrimination loss value. And, the computer device determines a non-noise source domain image 509 and a noise source domain image 510 in the source domain image 501 based on the source domain loss value. And, the computer device determines the intersection of the noise source domain image 508 and the noise source domain image 510, obtains a high confidence noise source domain image, redetermines the pseudo tag image 511 of the high confidence noise source domain image, and applies the pseudo tag image 511 to the next training stage on the image segmentation model.
In order to verify the effectiveness of the image segmentation data processing method provided by the embodiment of the application, the computer equipment performs verification on the ophthalmic image and the human spinal cord image. The ophthalmic image adopts a REFUGE image set (a public data set) and a Drishti-GS image set (a public data set), and because the training sets and the testing sets of the two image sets are shot by different acquisition equipment, the images have differences in color, texture and the like, the training set of the REFUGE image set is used as a source domain training set, the testing set of the REFUGE image set and the testing set of the Drishti-GS image set are used as target domain training sets, and the testing set of the REFUGE image set and the testing set of the Drishti-GS image set are used as target domain testing sets. For the REFUGE image set, the training set contains 400 images with image size 2124×2056, and the test set contains 300 images; for the Drishti-GS image set, the training set contains 50 images, the test set contains 51 images, and the image size is 2047×1759.
The human spinal cord white matter and gray matter segmentation images are derived from four different centers. The images from center-1, center-2, and center-3 are used as the source domain training set, and the images from center-4 are used as both the target domain training set and the target domain test set. The source domain training set contains 341 images in total, with an image size of 100×100; the target domain training set and test set contain the same 134 images, also of size 100×100.
The experimental results of the method provided by the embodiment of the application are compared with those of related methods on the ophthalmic image sets, and the experimental results for tasks with different noise levels are shown in Table 1 and Table 2 respectively. BDL is a bidirectional learning method based on self-supervised learning, pOSAL is the method proposed in the retinal fundus glaucoma segmentation challenge, and BEAL is an adversarial learning method based on boundary and entropy information. In the embodiment of the present application, the DICE coefficient is used to compare the experimental results; DICE is a measure of the segmentation result used to calculate the similarity between the real label image and the predicted label image, and is expressed as:

DI = 2·|P ∩ Y| / (|P| + |Y|)

where DI represents the degree of similarity between the real and predicted label images, P represents the real label image, and Y represents the predicted label image. DIdisc represents the DI value of the optic disc and DIcup represents the DI value of the optic cup.
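For reference, a short sketch of the DI computation for a binary mask (mask preparation is not specified in the text and is assumed here):

import numpy as np

def dice_index(real_mask, pred_mask):
    """DI = 2|P ∩ Y| / (|P| + |Y|) for binary masks P (real) and Y (predicted)."""
    real_mask = np.asarray(real_mask, dtype=bool)
    pred_mask = np.asarray(pred_mask, dtype=bool)
    denom = real_mask.sum() + pred_mask.sum()
    return 2.0 * np.logical_and(real_mask, pred_mask).sum() / denom if denom else 1.0

DIdisc and DIcup are then this index evaluated on the optic disc and optic cup masks respectively.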
Table 1 shows the experimental results at a mild noise level from the REFUGE training set to the REFUGE test set, Table 2 shows the experimental results at a severe noise level from the REFUGE training set to the REFUGE test set, and Table 3 shows the experimental results at a severe noise level from the SCGM training set to the SCGM test set. FIG. 6 shows experimental results provided by an embodiment of the present application on the REFUGE image set and the Drishti-GS image set. FIG. 7 shows further experimental results on the REFUGE image set and the Drishti-GS image set.
TABLE 1
[Table content published as image BDA0002813219190000291 in the original]
TABLE 2
[Table content published as image BDA0002813219190000292 in the original]
TABLE 3
[Table content published as image BDA0002813219190000293 in the original]
As can be seen from Tables 1 to 3, when performing image segmentation, the image segmentation model trained by the method provided by the embodiment of the present application achieves a higher similarity between the predicted label image and the real label image, that is, the accuracy of the image segmentation model is higher. As can be seen from FIG. 6 and FIG. 7, the label images predicted by this image segmentation model are also visually more similar to the real label images, which likewise indicates higher accuracy.
According to the method provided by the embodiment of the application, when the source domain image and the target domain image are adopted to train the first image segmentation model, the segmentation result of the first image segmentation model is considered, the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, so that the first image segmentation model learns the segmentation result of the second image segmentation model, and the training process of the first image segmentation model can be supervised through the segmentation result of the second image segmentation model, so that the first image segmentation model with higher accuracy and robustness is trained.
When the first source domain image is judged to be the noise domain image, the source domain label image corresponding to the noise domain image is not accurate enough, so that when the source domain loss value corresponding to the noise domain image is determined, the segmentation result of the second image segmentation model on the noise domain image is considered, the first image segmentation model is supervised through the second image segmentation model, errors caused by the noise domain image in the training process can be reduced, and the resistance of the image segmentation model to noise data can be improved.
In addition, under the condition that the corresponding label image does not exist in the target domain image, the method and the device for image segmentation of the target domain image generate a pseudo label image by adding self-supervision information and apply the pseudo label image to the next training stage as cross-supervision information of the next training stage, so that the problem of unsupervised image segmentation is solved.
In addition, in order to ensure the accuracy of the generated pseudo tag image, in the embodiment of the application, the first image segmentation model and the second image segmentation model are trained in an iterative training mode, so that more accurate pseudo tag images are continuously generated, the pseudo tag images are used for the image segmentation model, and the accuracy of the image segmentation model is further improved.
Moreover, as the two image segmentation models have different structures and learning capabilities, different types of errors introduced by noise images can be filtered in a cross training mode, so that the predicted pseudo-label image has stronger robustness. In the process of image exchange by the two image segmentation models, the two image segmentation models can mutually monitor and purify noise images in a training set, and training errors caused by the noise images are reduced.
In addition, in the embodiment of the application, the image segmentation model is trained by adopting the pixel-level label information, and compared with the image-level label information, the result obtained by carrying out image segmentation on the trained image segmentation model is more accurate.
In addition, in the embodiment of the application, the information entropy of the label image output by the image segmentation model is integrated into the discrimination loss so as to strengthen the countermeasure learning process of the image segmentation model and the discrimination model, and further the trained image segmentation model can learn the field self-adaptive task capacity better, so that the accuracy of the image segmentation model is improved.
After training the first image segmentation model, image segmentation task processing can be performed based on the first image segmentation model. The following embodiment will explain the image segmentation process in detail.
Fig. 8 is a flowchart of an image segmentation data processing method according to an embodiment of the present application. The execution body of the embodiment of the application is a computer device, referring to fig. 8, the method includes:
801. the computer device obtains a first image segmentation model.
The computer equipment acquires a first image segmentation model obtained after training, wherein the first image segmentation model comprises a feature extraction layer and a feature segmentation layer, the feature extraction layer is used for extracting features of a source domain image or a target domain image, and the feature segmentation layer is used for carrying out feature segmentation on the feature image.
The sample data used in training the first image segmentation model includes: a sample source domain image, a sample source domain label image corresponding to the sample source domain image, a sample target domain image, and the label images obtained by the first image segmentation model and the second image segmentation model respectively performing image segmentation on the sample source domain image and the sample target domain image.
The training process of the first image segmentation model comprises the following steps: obtaining a sample source domain image, a sample source domain label image, a first label image and a second label image, wherein the first label image is obtained by carrying out image segmentation on the sample source domain image based on a first image segmentation model, and the second label image is obtained by carrying out image segmentation on the sample source domain image based on a second image segmentation model; determining a first source domain loss value based on the sample source domain label image, the first label image, and the second label image; acquiring a sample target domain image, a third label image and a fourth label image, wherein the third label image is obtained by carrying out image segmentation on the sample target domain image based on a first image segmentation model, and the fourth label image is obtained by carrying out image segmentation on the sample target domain image based on a second image segmentation model; determining a first target domain loss value based on the third label image and the fourth label image; the first image segmentation model is trained based on the first source domain loss value and the first target domain loss value.
It should be noted that, the sample source domain image is similar to the first source domain image in the embodiment of fig. 2, the sample source domain label image is similar to the source domain label image corresponding to the first source domain image in the embodiment of fig. 2, the sample target domain image is similar to the target domain image in the embodiment of fig. 2, and the process of training the first image segmentation model is detailed in the embodiment of fig. 2 and will not be repeated here.
802. And the computer equipment performs feature extraction on the target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image.
When any target domain image needs to be segmented, the computer equipment performs feature extraction on the target domain image based on a feature extraction layer in a first image segmentation model to obtain a feature image corresponding to the target domain image, wherein the feature image is used for representing the features of the target domain image.
803. The computer equipment performs feature segmentation on the feature image based on the feature segmentation layer to obtain a target domain label image corresponding to the target domain image.
After the computer equipment obtains the characteristic image, the characteristic image is subjected to characteristic segmentation based on the characteristic segmentation layer in the first image segmentation model to obtain a target domain label image corresponding to the target domain image, and the class of the pixel point in the target domain image is marked in the target domain label image, so that the image segmentation task of the target domain image is completed.
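As a hedged illustration of steps 802 to 803, the sketch below runs a trained model on one target domain image; the model interface (a single forward pass covering the feature extraction layer and the feature segmentation layer) is an assumption:

import torch

def segment_target_image(model, image):
    """Apply the trained first image segmentation model to a target domain
    image and return the predicted label image (one class index per pixel)."""
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0))        # (1, C, H, W)
        labels = logits.argmax(dim=1).squeeze(0)  # (H, W) label image
    return labels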
According to the method provided by the embodiment of the application, when the source domain image and the target domain image are adopted to train the first image segmentation model, not only the segmentation result of the first image segmentation model is considered, but also the segmentation result of the second image segmentation model is considered, so that the first image segmentation model learns the segmentation result of the second image segmentation model, the training process of the first image segmentation model can be supervised through the segmentation result of the second image segmentation model, and therefore the first image segmentation model with higher accuracy and robustness is trained, and the first image segmentation model is adopted to segment the target domain image, so that the obtained segmentation result is more accurate.
Fig. 9 is a schematic structural diagram of an image segmentation data processing apparatus according to an embodiment of the present application. Referring to fig. 9, the apparatus includes:
the first image obtaining module 901 is configured to obtain a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image, where the first label image is obtained by performing image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by performing image segmentation on the first source domain image based on a second image segmentation model;
A first loss value determining module 902, configured to determine a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image, and the second label image;
the second image obtaining module 903 is configured to obtain a target domain image, a third label image, and a fourth label image, where the target domain image is different from the domain to which the source domain image belongs, the third label image is obtained by performing image segmentation on the target domain image based on the first image segmentation model, and the fourth label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;
a second loss value determining module 904, configured to determine a first target domain loss value based on the third label image and the fourth label image;
a first training module 905 for training a first image segmentation model based on the first source domain loss value and the first target domain loss value;
and the image segmentation module 906 is used for performing image segmentation task processing based on the trained image segmentation model.
According to the image segmentation data processing device, when the source domain image and the target domain image are adopted to train the first image segmentation model, the segmentation result of the first image segmentation model is considered, the segmentation result of the second image segmentation model is fused into the source domain loss value and the target domain loss value, so that the first image segmentation model learns the segmentation result of the second image segmentation model, the training process of the first image segmentation model can be supervised through the segmentation result of the second image segmentation model, the first image segmentation model with higher accuracy and robustness can be trained, and the accuracy of image segmentation is improved.
Optionally, referring to fig. 10, the first loss value determining module 902 includes:
the first loss value determining unit 9021 is configured to determine a first source domain loss value based on the source domain label image, the first label image, and the second label image, in a case where the first source domain image is determined to be the noise source domain image based on the second image division model.
Optionally, referring to fig. 10, the apparatus further includes:
the first image obtaining module 901 is configured to obtain a second source domain image, a source domain label image corresponding to the second source domain image, and a fifth label image, where the fifth label image is obtained by performing image segmentation on the second source domain image based on the first image segmentation model;
a third loss value determining module 907 configured to determine a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image, in a case where the second source domain image is determined to be a non-noise source domain image based on the second image segmentation model;
a first training module 905 is configured to train the first image segmentation model based on the first source domain loss value, the second source domain loss value, and the first target domain loss value.
Optionally, referring to fig. 10, the first image obtaining module 901 is further configured to obtain a sixth label image, where the sixth label image is obtained by performing image segmentation on the second source domain image based on the second image segmentation model;
The third loss value determination module 907 includes:
a similarity determination unit 9071 for determining a similarity between the fifth and sixth label images;
a second loss value determining unit 9072, configured to determine, in response to the similarity being not smaller than the similarity threshold, a second source domain loss value based on the source domain label image and the fifth label image corresponding to the second source domain image.
Optionally, referring to fig. 10, a third loss value determination module 907 includes:
a third loss value determining unit 9073, configured to determine, in response to the similarity being smaller than the similarity threshold, a second source domain loss value based on a source domain label image, a fifth label image, and a sixth label image, which correspond to the second source domain image.
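As a hedged illustration of the similarity test in units 9071 to 9073, the sketch below uses the Dice coefficient as the similarity measure and 0.9 as the similarity threshold; both choices, the binary-segmentation simplification, and the concrete loss terms are assumptions for the example.

```python
import torch.nn.functional as F

def dice_similarity(mask_a, mask_b, eps=1e-8):
    # mask_a, mask_b: binary (H, W) masks derived from the two label images
    inter = (mask_a * mask_b).sum()
    return (2.0 * inter / (mask_a.sum() + mask_b.sum() + eps)).item()

def second_source_domain_loss(src_label, fifth_logits, sixth_logits,
                              sim_threshold=0.9):
    # fifth_logits, sixth_logits: (C, H, W); src_label: (H, W) long tensor
    fifth_mask = fifth_logits.argmax(0).float()
    sixth_mask = sixth_logits.argmax(0).float()
    if dice_similarity(fifth_mask, sixth_mask) >= sim_threshold:
        # Unit 9072: the models agree, supervise with the ground truth only.
        return F.cross_entropy(fifth_logits.unsqueeze(0),
                               src_label.unsqueeze(0))
    # Unit 9073: the models disagree, also fold in the sixth label image.
    return (F.cross_entropy(fifth_logits.unsqueeze(0), src_label.unsqueeze(0))
            + F.kl_div(F.log_softmax(fifth_logits, dim=0).unsqueeze(0),
                       F.softmax(sixth_logits, dim=0).unsqueeze(0),
                       reduction='batchmean'))
```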
Optionally, referring to fig. 10, the first loss value determining module 902 includes:
a fourth loss value determining unit 9022 for determining a third source domain loss value based on the source domain label image and the first label image;
a fifth loss value determining unit 9023 for determining a fourth source domain loss value based on the second label image and the first label image;
a sixth loss value determining unit 9024, configured to determine the first source domain loss value based on the third source domain loss value and the fourth source domain loss value.
Alternatively, referring to fig. 10, a sixth loss value determining unit 9024 is configured to:
acquiring the number of training iterations corresponding to the current training;
in response to the number of training iterations being not less than a first threshold and not greater than a second threshold, determining the first source domain loss value based on the third source domain loss value, the fourth source domain loss value, the number of iterations, the first threshold, and the second threshold.
Alternatively, referring to fig. 10, a sixth loss value determining unit 9024 is configured to:
in response to the number of training iterations being greater than the second threshold, determining the first source domain loss value based on the third source domain loss value and the fourth source domain loss value.
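One plausible reading of unit 9024's two cases, under the assumption that the weight of the fourth (cross-model) source domain loss value ramps up linearly between the two thresholds; the concrete thresholds t1 and t2 are illustrative, and the behavior below the first threshold is not specified in the text, so it is hedged here as "ground-truth term only":

```python
def first_source_domain_loss(third_loss, fourth_loss, iteration,
                             t1=1000, t2=5000):
    if iteration < t1:
        return third_loss                        # assumption: warm-up phase
    if iteration <= t2:
        alpha = (iteration - t1) / float(t2 - t1)
        return third_loss + alpha * fourth_loss  # ramp in cross-model term
    return third_loss + fourth_loss              # past t2: full weight
```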
Optionally, referring to fig. 10, the first label image includes a first label corresponding to each pixel point in the first source domain image, and a fourth loss value determining unit 9022 is configured to:
determining the weight corresponding to each pixel point based on the source domain label image;
and determining a third source domain loss value based on the first label and the weight corresponding to each pixel point.
Alternatively, referring to fig. 10, a fourth loss value determining unit 9022 is configured to:
determining the minimum distance between each pixel point and the boundary line of the source domain label image;
determining a maximum value of the determined plurality of minimum distances as a target distance;
And respectively determining the weight corresponding to each pixel point based on the target distance and the minimum distance corresponding to each pixel point.
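A plausible realization of this boundary-distance weighting, using scipy's Euclidean distance transform; whether pixels far from the boundary should be weighted up (as below) or down is not fixed by the text, so the direction of the mapping is an assumption.

```python
import numpy as np
from scipy import ndimage

def boundary_weights(src_label_mask: np.ndarray) -> np.ndarray:
    # src_label_mask: binary (H, W) source domain label image
    mask = src_label_mask.astype(bool)
    # Boundary line: mask pixels that touch the background.
    boundary = mask & ~ndimage.binary_erosion(mask)
    # Minimum distance from every pixel to the boundary line.
    min_dist = ndimage.distance_transform_edt(~boundary)
    target_dist = min_dist.max()             # the "target distance"
    return min_dist / (target_dist + 1e-8)   # per-pixel weights in [0, 1]
```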
Alternatively, referring to fig. 10, a fourth loss value determining unit 9022 is configured to:
determining a first difference value based on the first label and the weight corresponding to each pixel point, wherein the first difference value represents the difference between the first label image and the source domain label image;
determining a third source domain sub-loss value corresponding to any pixel point based on the first difference value, the first label corresponding to any pixel point and the weight;
a third source domain loss value is determined based on the determined plurality of third source domain sub-loss values.
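One hedged reading of unit 9022's per-pixel construction: the first difference value is taken here as a weighted Dice-style gap between the first label image and the source domain label image, and each pixel's third source domain sub-loss scales a weighted binary cross-entropy term by that gap; the exact functional forms are assumptions.

```python
import torch

def third_source_domain_loss(first_prob, src_label, weights):
    # first_prob: (H, W) foreground probabilities from the first model
    # src_label:  (H, W) binary ground-truth mask
    # weights:    (H, W) per-pixel weights from the boundary distances
    inter = (weights * first_prob * src_label).sum()
    denom = (weights * (first_prob + src_label)).sum() + 1e-8
    first_difference = 1.0 - 2.0 * inter / denom     # global gap in [0, 1]
    # Per-pixel sub-loss: weighted binary cross entropy scaled by the gap.
    bce = -(src_label * torch.log(first_prob + 1e-8)
            + (1 - src_label) * torch.log(1 - first_prob + 1e-8))
    sub_losses = first_difference * weights * bce
    return sub_losses.sum()
```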
Optionally, referring to fig. 10, the third label image includes a third label corresponding to each pixel point in the target domain image, the fourth label image includes a fourth label corresponding to each pixel point in the target domain image, and the second loss value determining module 904 includes:
a difference value determining unit 9041 configured to determine a second difference value, which represents a difference between the third label image and the fourth label image, based on the third label and the fourth label corresponding to each pixel point;
a seventh loss value determining unit 9042, configured to determine a target domain sub-loss value corresponding to any pixel point based on the second difference value, the third label and the fourth label corresponding to any pixel point;
An eighth loss value determining unit 9043 for determining a first target domain loss value based on the determined plurality of target domain sub-loss values.
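The target-domain counterpart can be sketched the same way; here the second difference value is assumed to be the mean absolute gap between the two models' softmax outputs, and each pixel's target domain sub-loss scales the local squared gap by it.

```python
import torch.nn.functional as F

def first_target_domain_loss(third_logits, fourth_logits):
    # third_logits, fourth_logits: (C, H, W) outputs on the target domain image
    p3 = F.softmax(third_logits, dim=0)
    p4 = F.softmax(fourth_logits, dim=0)
    second_difference = (p3 - p4).abs().mean()        # global disagreement
    # Per-pixel sub-loss scaled by the global disagreement.
    sub_losses = second_difference * (p3 - p4).pow(2).sum(dim=0)
    return sub_losses.mean()
```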
Optionally, referring to fig. 10, the apparatus further includes:
a discrimination processing module 908, configured to perform image discrimination processing on the first label image based on a discrimination model to obtain a source domain discrimination result;
the discrimination processing module 908 is further configured to perform image discrimination processing on the third label image based on the discrimination model to obtain a target domain discrimination result;
a discrimination loss determining module 909, configured to determine a discrimination loss value based on the source domain discrimination result, the target domain discrimination result, and the third label image;
a second training module 910 is configured to train the first image segmentation model and the discriminant model based on the discriminant loss value.
Optionally, referring to fig. 10, the third label image includes a third label corresponding to a plurality of pixels in the target domain image, and the discrimination loss determining module 909 includes:
an uncertainty determining unit 9091, configured to determine, based on third labels corresponding to the plurality of pixels, an uncertainty corresponding to a third label image;
the discrimination loss determining unit 9092 is configured to determine a discrimination loss value based on the uncertainty, the source domain discrimination result, and the target domain discrimination result.
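A sketch of the uncertainty-weighted discrimination loss; per-pixel entropy of the third label image stands in for the "uncertainty", the discriminator is assumed to be fully convolutional with source = 1 / target = 0 targets, and down-weighting uncertain target pixels is one plausible use of the uncertainty, not the patent's prescribed formula.

```python
import torch
import torch.nn.functional as F

def discrimination_loss(disc, first_logits, third_logits):
    # disc: assumed fully convolutional, mapping (N, C, H, W) softmax maps
    # to (N, 1, H, W) real/fake logits.
    src_pred = disc(F.softmax(first_logits, dim=1))   # source discrimination
    tgt_pred = disc(F.softmax(third_logits, dim=1))   # target discrimination
    p = F.softmax(third_logits, dim=1)
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1, keepdim=True)
    uncertainty = entropy / (entropy.max() + 1e-8)    # normalized to [0, 1]
    bce = F.binary_cross_entropy_with_logits
    loss_src = bce(src_pred, torch.ones_like(src_pred))
    # Assumption: down-weight target pixels with highly uncertain labels.
    loss_tgt = bce(tgt_pred, torch.zeros_like(tgt_pred),
                   weight=1.0 - uncertainty)
    return loss_src + loss_tgt
```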
Optionally, referring to fig. 10, the image segmentation module 906 includes:
the image segmentation unit 9061 is configured to perform image segmentation on any image based on the image segmentation model obtained after training, to obtain a label image corresponding to that image.
Optionally, referring to fig. 10, the apparatus further includes:
a determining module 911, configured to determine the first source domain image as a non-noise source domain image in response to the first source domain loss value being less than a target loss value; or,
the determining module 911 is configured to determine the first source domain image as a noise source domain image in response to the first source domain loss value being not less than the target loss value.
Optionally, referring to fig. 10, the third loss value determining module 907 is further configured to determine a fifth source domain loss value based on a source domain label image and a sixth label image corresponding to the first source domain image, where the sixth label image is obtained by performing image segmentation on the first source domain image based on the second image segmentation model;
the second loss value determining module 904 is further configured to determine a second target domain loss value based on a third label image and a seventh label image, where the seventh label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;
First training module 905 is further configured to train a second image segmentation model based on the fifth source domain loss value and the second target domain loss value.
Optionally, referring to fig. 10, the apparatus further includes:
the first loss value determining module 902 is further configured to determine a sixth source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image, and a sixth label image, where the sixth label image is obtained by performing image segmentation on the first source domain image based on the second image segmentation model;
the second loss value determining module 904 is further configured to determine a second target domain loss value based on a third label image and a seventh label image, where the seventh label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;
first training module 905 is further configured to train a second image segmentation model based on the sixth source domain loss value and the second target domain loss value.
It should be noted that the division into the functional modules described above is merely illustrative of how the image segmentation data processing apparatus trains the image segmentation model. In practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the computer device may be divided into different functional modules to complete all or some of the functions described above. In addition, the image segmentation data processing apparatus provided in the foregoing embodiment belongs to the same concept as the image segmentation data processing method embodiments; for its specific implementation, refer to the method embodiments, and details are not repeated here.
Fig. 11 is a schematic structural diagram of an image segmentation data processing apparatus according to an embodiment of the present application. Referring to fig. 11, the apparatus includes:
a model acquisition module 1101, configured to acquire a first image segmentation model, where the first image segmentation model includes a feature extraction layer and a feature segmentation layer;
the feature extraction module 1102 is configured to perform feature extraction on a target domain image based on the feature extraction layer, so as to obtain a feature image corresponding to the target domain image;
the feature segmentation module 1103 is configured to perform feature segmentation on the feature image based on the feature segmentation layer to obtain a target domain label image corresponding to the target domain image;
the sample data adopted in training the first image segmentation model comprises:
the sample source domain image, a sample source domain label image corresponding to the sample source domain image, a sample target domain image, and label images obtained by respectively carrying out image segmentation on the sample source domain image and the sample target domain image by the first image segmentation model and the second image segmentation model.
According to the image segmentation data processing apparatus, when the source domain image and the target domain image are used to train the first image segmentation model, the segmentation result of the second image segmentation model is considered in addition to that of the first image segmentation model, so that the first image segmentation model learns from the segmentation result of the second image segmentation model. The training process can therefore be supervised by the second image segmentation model, a first image segmentation model with higher accuracy and robustness is trained, and segmenting the target domain image with this model produces a more accurate segmentation result.
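A minimal sketch of the two-stage apparatus of fig. 11, with purely illustrative layer shapes: the feature extraction layer produces the feature image, and the feature segmentation layer maps it to the target domain label image.

```python
import torch
import torch.nn as nn

class FirstSegmentationModel(nn.Module):
    def __init__(self, in_ch=3, num_classes=2):
        super().__init__()
        self.feature_extraction = nn.Sequential(   # feature extraction layer
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.feature_segmentation = nn.Conv2d(64, num_classes, 1)

    def forward(self, target_image):
        features = self.feature_extraction(target_image)  # feature image
        logits = self.feature_segmentation(features)
        # Target domain label image: per-pixel argmax over class scores.
        return logits.argmax(dim=1)
```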
Optionally, referring to fig. 12, the apparatus further includes:
a first image obtaining module 1104, configured to obtain the sample source domain image, the sample source domain label image, a first label image, and a second label image, where the first label image is obtained by image segmentation of the sample source domain image based on the first image segmentation model, and the second label image is obtained by image segmentation of the sample source domain image based on the second image segmentation model;
a first loss value determining module 1105, configured to determine a first source domain loss value based on the sample source domain label image, the first label image, and the second label image;
a second image obtaining module 1106, configured to obtain a sample target domain image, a third label image, and a fourth label image, where the third label image is obtained by performing image segmentation on the sample target domain image based on the first image segmentation model, and the fourth label image is obtained by performing image segmentation on the sample target domain image based on the second image segmentation model;
a second loss value determining module 1107, configured to determine a first target domain loss value based on the third label image and the fourth label image;
A training module 1108 is configured to train the first image segmentation model based on the first source domain loss value and the first target domain loss value.
The embodiment of the application also provides a computer device, which comprises a processor and a memory, wherein at least one computer program is stored in the memory, and the at least one computer program is loaded and executed by the processor to realize the operations executed in the image segmentation data processing method of the embodiment.
Optionally, the computer device is provided as a terminal. Fig. 13 illustrates a schematic structure of a terminal 1300 according to an exemplary embodiment of the present application.
The terminal 1300 includes: a processor 1301, and a memory 1302.
Processor 1301 may include one or more processing cores, for example a 4-core or an 8-core processor. Processor 1301 may be implemented in hardware in at least one of DSP (Digital Signal Processing), FPGA (Field Programmable Gate Array), and PLA (Programmable Logic Array) form. Processor 1301 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, processor 1301 may integrate a GPU (Graphics Processing Unit) responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, processor 1301 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 1302 may include one or more computer-readable storage media, which may be non-transitory. Memory 1302 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1302 is used to store at least one computer program for execution by processor 1301 to implement the image segmentation data processing method provided by the method embodiments herein.
In some embodiments, the terminal 1300 may further optionally include: a peripheral interface 1303 and at least one peripheral. The processor 1301, the memory 1302, and the peripheral interface 1303 may be connected by a bus or signal lines. The respective peripheral devices may be connected to the peripheral device interface 1303 through a bus, a signal line, or a circuit board. Optionally, the peripheral device comprises: at least one of radio frequency circuitry 1304, a display screen 1305, a camera assembly 1306, and a power supply 1307.
A peripheral interface 1303 may be used to connect at least one I/O (Input/Output) related peripheral to the processor 1301 and the memory 1302. In some embodiments, the processor 1301, the memory 1302, and the peripheral interface 1303 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1301, the memory 1302, and the peripheral interface 1303 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1304 is used to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 1304 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 1304 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1304 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1304 may communicate with other devices via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1304 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 1305 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1305 is a touch display screen, the display screen 1305 also has the ability to capture touch signals at or above its surface. The touch signal may be input to the processor 1301 as a control signal for processing. At this point, the display screen 1305 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1305, disposed on the front panel of the terminal 1300; in other embodiments, there may be at least two display screens 1305, disposed on different surfaces of the terminal 1300 or in a folded design; in other embodiments, the display screen 1305 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 1300. The display screen 1305 may even be arranged in a non-rectangular irregular pattern, that is, an irregularly shaped screen. The display screen 1305 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 1306 is used to capture images or video. Optionally, the camera assembly 1306 includes a front camera and a rear camera. The front camera is disposed on the front panel of the terminal 1300, and the rear camera is disposed on the rear surface of the terminal 1300. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 1306 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
A power supply 1307 is used to power the various components in terminal 1300. The power supply 1307 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When the power supply 1307 comprises a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
Those skilled in the art will appreciate that the structure shown in fig. 13 is not limiting of terminal 1300 and may include more or fewer components than shown, or may combine certain components, or may employ a different arrangement of components.
Optionally, the computer device is provided as a server. Fig. 14 is a schematic structural diagram of a server provided in an embodiment of the present application. The server 1400 may vary considerably in configuration or performance, and may include one or more processors (Central Processing Units, CPU) 1401 and one or more memories 1402, where at least one computer program is stored in the memories 1402, and the at least one computer program is loaded and executed by the processors 1401 to implement the image segmentation data processing method provided in each of the method embodiments described above. Of course, the server may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described herein.
The present application also provides a computer readable storage medium having at least one computer program stored therein, the at least one computer program being loaded and executed by a processor to implement the operations performed in the image segmentation data processing method of the above embodiments.
The present application also provides a computer program product or a computer program, which includes computer program code stored in a computer readable storage medium, from which a processor of a computer device reads the computer program code, and which is executed by the processor, so that the computer device implements the operations performed in the image segmentation data processing method of the above embodiment.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing descriptions are merely optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of this application shall fall within the protection scope of this application.

Claims (15)

1. A method of processing image segmentation data, the method comprising:
Acquiring a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image, wherein the first label image is obtained by carrying out image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by carrying out image segmentation on the first source domain image based on a second image segmentation model;
determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image and the second label image;
acquiring a target domain image, a third label image and a fourth label image, wherein the third label image is obtained by carrying out image segmentation on the target domain image based on the first image segmentation model, and the fourth label image is obtained by carrying out image segmentation on the target domain image based on the second image segmentation model;
determining a first target domain loss value based on the third label image and the fourth label image;
and training the first image segmentation model based on the first source domain loss value and the first target domain loss value, and performing image segmentation task processing based on the trained image segmentation model.
2. The method of claim 1, wherein the determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image, and the second label image comprises:
when the first source domain image is determined to be a noise source domain image based on the second image segmentation model, the first source domain loss value is determined based on the source domain label image, the first label image, and the second label image.
3. The method according to claim 1, wherein the method further comprises:
acquiring a second source domain image, a source domain label image corresponding to the second source domain image and a fifth label image, wherein the fifth label image is obtained by carrying out image segmentation on the second source domain image based on the first image segmentation model;
determining a second source domain loss value based on a source domain label image corresponding to the second source domain image and the fifth label image when the second source domain image is determined to be a non-noise source domain image based on the second image segmentation model;
the first image segmentation model is trained based on the first source domain loss value, the second source domain loss value, and the first target domain loss value.
4. The method of claim 1, wherein the determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image, and the second label image comprises:
determining a third source domain loss value based on the source domain label image and the first label image;
determining a fourth source domain loss value based on the second label image and the first label image;
the first source domain loss value is determined based on the third source domain loss value and the fourth source domain loss value.
5. The method of claim 4, wherein the determining the first source domain loss value based on the third source domain loss value and the fourth source domain loss value comprises:
acquiring iterative training times corresponding to the training;
and determining the first source domain loss value based on the third source domain loss value, the fourth source domain loss value, the iteration number, the first threshold and the second threshold in response to the iteration training number being not less than a first threshold and not greater than a second threshold.
6. The method of claim 1, wherein the third label image includes a third label corresponding to each pixel in the target domain image, the fourth label image includes a fourth label corresponding to each pixel in the target domain image, and wherein determining the first target domain loss value based on the third label image and the fourth label image includes:
Determining a second difference value based on a third label and a fourth label corresponding to each pixel point, wherein the second difference value represents the difference between the third label image and the fourth label image;
determining a target domain sub-loss value corresponding to any pixel point based on the second difference value, a third label and a fourth label corresponding to any pixel point;
the first target domain loss value is determined based on the determined plurality of target domain sub-loss values.
7. The method according to claim 1, wherein the method further comprises:
performing image discrimination processing on the first label image based on a discrimination model to obtain a source domain discrimination result;
performing image discrimination processing on the third label image based on the discrimination model to obtain a target domain discrimination result;
determining a discrimination loss value based on the source domain discrimination result, the target domain discrimination result and the third label image;
training the first image segmentation model and the discriminant model based on the discriminant loss value.
8. The method of claim 1, wherein after determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image, and the second label image, the method further comprises:
in response to the first source domain loss value being less than a target loss value, determining the first source domain image as a non-noise source domain image; or,
in response to the first source domain loss value not being less than the target loss value, determining the first source domain image as a noise source domain image.
9. The method of claim 8, wherein after determining the first source domain image as a noise source domain image in response to the first source domain loss value not being less than the target loss value, the method further comprises:
determining a sixth source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image and a sixth label image, wherein the sixth label image is obtained by performing image segmentation on the first source domain image based on the second image segmentation model;
determining a second target domain loss value based on the third label image and a seventh label image, wherein the seventh label image is obtained by performing image segmentation on the target domain image based on the second image segmentation model;
the second image segmentation model is trained based on the sixth source domain loss value and the second target domain loss value.
10. A method of processing image segmentation data, the method comprising:
acquiring a first image segmentation model, wherein the first image segmentation model comprises a feature extraction layer and a feature segmentation layer;
performing feature extraction on the target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image;
performing feature segmentation on the feature image based on the feature segmentation layer to obtain a target domain label image corresponding to the target domain image;
the sample data adopted in training the first image segmentation model comprises:
the sample source domain image, a sample source domain label image corresponding to the sample source domain image, a sample target domain image, and label images obtained by respectively carrying out image segmentation on the sample source domain image and the sample target domain image by the first image segmentation model and the second image segmentation model.
11. The method of claim 10, wherein the training process of the first image segmentation model comprises:
acquiring the sample source domain image, the sample source domain label image, a first label image and a second label image, wherein the first label image is obtained by carrying out image segmentation on the sample source domain image based on the first image segmentation model, and the second label image is obtained by carrying out image segmentation on the sample source domain image based on the second image segmentation model;
Determining a first source domain loss value based on the sample source domain label image, the first label image, and the second label image;
acquiring a sample target domain image, a third label image and a fourth label image, wherein the third label image is obtained by carrying out image segmentation on the sample target domain image based on the first image segmentation model, and the fourth label image is obtained by carrying out image segmentation on the sample target domain image based on the second image segmentation model;
determining a first target domain loss value based on the third label image and the fourth label image;
the first image segmentation model is trained based on the first source domain loss value and the first target domain loss value.
12. An image segmentation data processing apparatus, the apparatus comprising:
the first image acquisition module is used for acquiring a first source domain image, a source domain label image corresponding to the first source domain image, a first label image and a second label image, wherein the first label image is obtained by carrying out image segmentation on the first source domain image based on a first image segmentation model, and the second label image is obtained by carrying out image segmentation on the first source domain image based on a second image segmentation model;
The first loss value determining module is used for determining a first source domain loss value based on a source domain label image corresponding to the first source domain image, the first label image and the second label image;
the second image acquisition module is used for acquiring a target domain image, a third label image and a fourth label image, wherein the third label image is obtained by image segmentation of the target domain image based on the first image segmentation model, and the fourth label image is obtained by image segmentation of the target domain image based on the second image segmentation model;
a second loss value determining module, configured to determine a first target domain loss value based on the third label image and the fourth label image;
the first training module is used for training the first image segmentation model based on the first source domain loss value and the first target domain loss value;
and the image segmentation module is used for carrying out image segmentation task processing based on the trained image segmentation model.
13. An image segmentation data processing apparatus, the apparatus comprising:
the model acquisition module is used for acquiring a first image segmentation model, and the first image segmentation model comprises a feature extraction layer and a feature segmentation layer;
The feature extraction module is used for carrying out feature extraction on the target domain image based on the feature extraction layer to obtain a feature image corresponding to the target domain image;
the feature segmentation module is used for carrying out feature segmentation on the feature image based on the feature segmentation layer to obtain a target domain label image corresponding to the target domain image;
the sample data adopted in training the first image segmentation model comprises:
the sample source domain image, a sample source domain label image corresponding to the sample source domain image, a sample target domain image, and label images obtained by respectively carrying out image segmentation on the sample source domain image and the sample target domain image by the first image segmentation model and the second image segmentation model.
14. A computer device comprising a processor and a memory, wherein the memory has stored therein at least one computer program that is loaded and executed by the processor to implement the operations performed in the image segmentation data processing method as claimed in any one of claims 1 to 9 or to implement the operations performed in the image segmentation data processing method as claimed in any one of claims 10 to 11.
15. A computer-readable storage medium, in which at least one computer program is stored, the at least one computer program being loaded and executed by a processor to implement operations performed in the image segmentation data processing method according to any one of claims 1 to 9 or to implement operations performed in the image segmentation data processing method according to any one of claims 10 to 11.