CN112633285B - Domain adaptation method, domain adaptation device, electronic equipment and storage medium - Google Patents

Domain adaptation method, domain adaptation device, electronic equipment and storage medium

Info

Publication number
CN112633285B
Authority
CN
China
Prior art keywords
image
network
pixel point
identified
semantic segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011543313.3A
Other languages
Chinese (zh)
Other versions
CN112633285A (en)
Inventor
刘杰
王健宗
瞿晓阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011543313.3A priority Critical patent/CN112633285B/en
Priority to PCT/CN2021/082603 priority patent/WO2022134338A1/en
Publication of CN112633285A publication Critical patent/CN112633285A/en
Application granted granted Critical
Publication of CN112633285B publication Critical patent/CN112633285B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the application relate to the technical field of artificial intelligence, and in particular to a domain adaptation method, device, equipment and storage medium. The method comprises the following steps: acquiring an image to be identified from a target domain; inputting the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of a source domain; inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified; and performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram. The application helps to improve domain adaptation efficiency.

Description

Domain adaptation method, domain adaptation device, electronic equipment and storage medium
Technical Field
The application relates to the technical field of image recognition, and in particular to a domain adaptation method, a domain adaptation device, electronic equipment and a storage medium.
Background
Semantic segmentation has become a key step in many modern technological applications. Since the advent of the deep learning era, automatic semantic segmentation methods have achieved substantial progress on a variety of problems. However, the performance of a semantic segmentation network can drop significantly when it is applied to a different domain whose samples follow a different distribution. Such networks therefore need pixel-by-pixel annotated images from the new domain as training samples, and labeling these samples is time-consuming and expensive.
To overcome this problem, domain adaptation methods have emerged. Domain adaptation refers to the process of migrating a model trained in a labeled source domain to a target domain with no or very few labels. Adversarial learning is a popular technique in domain adaptation, but one major limitation is that it needs image data of both the source domain and the target domain during the adaptation phase. Sometimes the image data of the source domain cannot be acquired, for example because of privacy considerations or data loss. Because of this limitation on acquiring source-domain image data, domain adaptation efficiency is low, and an efficient domain adaptation method is needed.
Disclosure of Invention
The embodiments of the application provide a domain adaptation method that completes domain adaptation without using image data of the source domain, so that a network trained on the target domain has the characteristics of the source domain and domain adaptation efficiency is improved.
In a first aspect, an embodiment of the present application provides a domain adaptation method, including:
acquiring an image to be identified from a target domain;
inputting the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of a source domain;
inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified;
and performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
In some possible embodiments, the inputting the image to be identified into the first segmentation network to obtain a first class proportion includes:
inputting the image to be identified into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that the pixel point belongs to the k-th category, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation results of all pixel points to obtain a first semantic segmentation result of the image to be identified;
and obtaining the first class proportion according to the first semantic segmentation result of the image to be identified.
In some possible embodiments, the inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram includes:
inputting the image to be identified into the second segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point represents the probability that the pixel point belongs to the k-th category, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation results of all pixel points to obtain a second semantic segmentation result of the image to be identified;
and determining the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and the information entropy formula, and forming the entropy diagram from the information entropy of each pixel point.
In some possible embodiments, the performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram includes:
determining a first KL divergence between the first class proportion and the second class proportion;
determining the sum of the information entropy of each pixel point in the entropy diagram;
determining a target loss according to the first KL divergence, the sum of the information entropy of each pixel point and a preset parameter;
and adjusting network parameters of the second segmentation network according to the target loss, so as to perform domain adaptation on the second segmentation network.
In some possible implementations, the second segmentation network further includes a first convolution layer connected to the encoding network; before the first feature map is upsampled through the decoding network to obtain a second feature map, the method further includes:
performing bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as the dimension of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result of each pixel point and the third semantic segmentation result of that pixel point, and averaging the second KL divergences of all pixel points in the image to be identified to obtain a third KL divergence;
the determining the target loss according to the first KL divergence, the sum of the information entropy of each pixel point and the preset parameter includes:
determining the target loss according to the first KL divergence, the third KL divergence, the sum of the information entropy of each pixel point and the preset parameter.
In some possible embodiments, the method further includes: after domain adaptation of the second segmentation network is completed, deleting the decoding network and the second convolution layer to obtain a third segmentation network; and performing semantic segmentation on images using the third segmentation network.
In a second aspect, an embodiment of the present application provides a domain adaptation device, including:
the acquisition unit is used for acquiring the image to be identified from the target domain;
the processing unit is used for inputting the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of a source domain;
inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified;
and performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
In some possible embodiments, in terms of inputting the image to be identified into the first segmentation network to obtain a first class proportion, the processing unit is specifically configured to perform:
inputting the image to be identified into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that the pixel point belongs to the k-th category, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation results of all pixel points to obtain a first semantic segmentation result of the image to be identified;
and obtaining the first class proportion according to the first semantic segmentation result of the image to be identified.
In some possible embodiments, in terms of inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, the processing unit is specifically configured to perform:
inputting the image to be identified into the second segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point represents the probability that the pixel point belongs to the k-th category, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation results of all pixel points to obtain a second semantic segmentation result of the image to be identified;
and determining the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and the information entropy formula, and forming the entropy diagram from the information entropy of each pixel point.
In some possible embodiments, in terms of performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram, the processing unit is specifically configured to perform:
determining a first KL divergence between the first class proportion and the second class proportion;
determining the sum of the information entropy of each pixel point in the entropy diagram;
determining a target loss according to the first KL divergence, the sum of the information entropy of each pixel point and a preset parameter;
and adjusting network parameters of the second segmentation network according to the target loss, so as to perform domain adaptation on the second segmentation network.
In some possible implementations, the second segmentation network further includes a first convolution layer connected to the encoding network; before the first feature map is upsampled through the decoding network to obtain a second feature map, the processing unit is further configured to perform:
performing bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as the dimension of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result of each pixel point and the third semantic segmentation result of that pixel point, and averaging the second KL divergences of all pixel points in the image to be identified to obtain a third KL divergence;
in terms of determining the target loss according to the first KL divergence, the sum of the information entropy of each pixel point and the preset parameter, the processing unit is specifically configured to perform:
determining the target loss according to the first KL divergence, the third KL divergence, the sum of the information entropy of each pixel point and the preset parameter.
In some possible implementations, the processing unit is further configured to delete the decoding network and the second convolution layer after domain adaptation of the second segmentation network is completed, to obtain a third segmentation network, and to perform semantic segmentation on images using the third segmentation network.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor connected to a memory, wherein the memory is used for storing a computer program and the processor is configured to execute the computer program stored in the memory, to cause the electronic device to perform the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program that causes a computer to perform the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The embodiment of the application has the following beneficial effects:
It can be seen that, in the embodiments of the application, the image to be identified from the target domain can be used directly to perform domain adaptation on the second segmentation network of the target domain, and images of the source domain are not required. This avoids the difficulty of acquiring source-domain images and improves domain adaptation efficiency. In addition, the information entropy of each pixel point is taken into account during adaptation, so that the adapted second segmentation network can classify each pixel point accurately, which improves semantic segmentation accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a domain adaptation method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a second segmentation network according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a training flow of a first segmentation network according to an embodiment of the present application;
Fig. 4 is a block diagram of the functional units of a domain adaptation device according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a domain adaptation device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, result, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Referring to fig. 1, fig. 1 is a flow chart of a domain adaptation method according to an embodiment of the present application. The method is applied to a domain adaptation device. The method comprises the following steps:
101: The domain adaptation device obtains an image to be identified from the target domain.
The image to be identified can be any image in the target domain. Most images in the target domain are unlabeled, and the application is described taking as an example the case where the image to be identified has no label.
102: The domain adaptation device inputs the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of the source domain.
Illustratively, the first segmentation network is trained using images of a source domain; the training process of the first segmentation network is described later and is not detailed here.
Illustratively, the image to be identified is input into the first segmentation network, and feature extraction is performed on it to obtain a feature map of the image to be identified. Semantic segmentation is then performed on each pixel point in the image to be identified according to the feature map to obtain a first semantic segmentation result of each pixel point, wherein the semantic segmentation result of each pixel point represents the probability that the pixel point belongs to the k-th category, k is an integer from 1 to N, and N is an integer greater than 1. That is, semantic segmentation of each pixel point in the image to be identified yields the probabilities that the pixel point falls into category 1, category 2, …, and category N respectively.
Then, the first semantic segmentation results of all pixel points are averaged to obtain the first semantic segmentation result of the image to be identified, which gives the first class proportion. Illustratively, the first semantic segmentation result of the image to be identified may be represented by formula (1):
$$\tau(s,k)=\frac{1}{|\Omega_s|}\sum_{i\in\Omega_s} p_i^{k}(s) \qquad (1)$$
where s is the image to be identified, k denotes the k-th category, τ(s, k) is the first semantic segmentation result of the image to be identified, |Ω_s| is the number of pixel points in the image to be identified, i denotes the i-th pixel point in the image to be identified, and $p_i^{k}(s)$ is the first semantic segmentation result of the i-th pixel point, i.e. the probability that the i-th pixel point belongs to the k-th category.
Further, the first semantic segmentation result of the image to be identified indicates the probability that the image to be identified belongs to each of the categories. The probability that the image belongs to each category is taken as the proportion of that category, which gives the class proportion of the image to be identified.
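The averaging step described above can be sketched in a few lines of plain Python. This is only a minimal illustration, assuming the per-pixel probabilities are already available as a flat list of length-N vectors; the function name `class_proportions` and the toy values are hypothetical and do not come from the application.

```python
def class_proportions(pixel_probs):
    """Average per-pixel class probabilities into one class-proportion vector.

    pixel_probs: flat list with one probability vector (length N) per pixel.
    Returns a list of N values, one proportion per category.
    """
    num_pixels = len(pixel_probs)
    num_classes = len(pixel_probs[0])
    return [
        sum(p[k] for p in pixel_probs) / num_pixels
        for k in range(num_classes)
    ]


# Toy image with 4 pixels and N = 2 categories (made-up values).
probs = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.5, 0.5]]
tau = class_proportions(probs)
```

Because every per-pixel vector sums to 1, the resulting proportions also sum to 1, so they can be compared with another class-proportion vector using a KL divergence.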
103: The domain adaptation device inputs the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified.
Illustratively, the image to be identified is input into the second segmentation network, and semantic segmentation is performed on each pixel point in the image to be identified to obtain a second semantic segmentation result of each pixel point, which represents the probability that the pixel point belongs to each category. The second semantic segmentation results of all pixel points are then averaged to obtain the second semantic segmentation result of the image to be identified, which gives the second class proportion. Further, the information entropy of each pixel point can be determined according to the second semantic segmentation result of that pixel point, and the information entropies of all pixel points form the entropy diagram, i.e. a matrix of information entropies. Illustratively, the information entropy of each pixel point can be represented by formula (2):
$$H(i)=-\sum_{j=1}^{N} q_i^{j}\,\log q_i^{j} \qquad (2)$$
where H(i) is the information entropy of the i-th pixel point, $q_i^{j}$ is the probability that the i-th pixel point belongs to the j-th category, and j is an integer from 1 to N.
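As a rough illustration of how an entropy diagram could be assembled from per-pixel probabilities, the sketch below computes the information entropy of every pixel of a small nested-list image. The natural logarithm, the convention that zero-probability terms contribute nothing, and the names `pixel_entropy` and `entropy_map` are assumptions made for this example only.

```python
import math


def pixel_entropy(probs):
    """Information entropy of one pixel's class-probability vector.

    Assumption: natural logarithm; terms with probability 0 are skipped.
    """
    return sum(-p * math.log(p) for p in probs if p > 0)


def entropy_map(prob_image):
    """prob_image: H x W nested list of per-pixel probability vectors.

    Returns an H x W matrix holding the entropy of each pixel.
    """
    return [[pixel_entropy(px) for px in row] for row in prob_image]


# Toy 2 x 2 image with N = 2 categories (made-up values).
image = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.25, 0.75], [0.0, 1.0]],
]
ent = entropy_map(image)
```

A confident pixel such as `[1.0, 0.0]` has entropy 0, while the maximally uncertain pixel `[0.5, 0.5]` has the largest entropy for two categories, which is why summing the diagram penalizes uncertain predictions.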
104: The domain adaptation device performs domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
Illustratively, a first KL divergence between the first class proportion and the second class proportion is determined, and the sum of the information entropy of each pixel point in the entropy diagram is determined; a target loss is then determined according to the first KL divergence, the sum of the information entropy of each pixel point and a preset parameter; finally, the network parameters of the second segmentation network are adjusted according to the target loss, so as to perform domain adaptation on the second segmentation network. Illustratively, the target loss may be represented by equation (3):
where Loss is the target loss, λ is the preset parameter, KL is the KL divergence operation applied to the first class proportion and the second semantic segmentation result of the image s to be identified, and the information entropy operation is applied to the second semantic segmentation result of the i-th pixel point in the image to be identified.
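The way the target loss combines its ingredients can be sketched as follows. The text states only that the loss depends on the first KL divergence, the sum of the per-pixel information entropies and a preset parameter, so two things in this example are assumptions rather than the application's formula: that λ (here `lam`) scales the KL term, and that the KL divergence is taken from the first class proportion to the second. The helper names are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists.

    eps guards against division by zero; terms with p_k = 0 are skipped.
    """
    return sum(
        pk * math.log((pk + eps) / (qk + eps))
        for pk, qk in zip(p, q)
        if pk > 0
    )


def target_loss(tau_first, tau_second, entropy_diagram, lam):
    """Combine the class-proportion KL term with the summed pixel entropies.

    Assumption: lam weights the KL term and the entropy sum is unweighted.
    """
    kl_term = kl_divergence(tau_first, tau_second)
    entropy_sum = sum(sum(row) for row in entropy_diagram)
    return lam * kl_term + entropy_sum


# Made-up inputs: class proportions from both networks and a 1 x 2 entropy diagram.
loss = target_loss([1.0, 0.0], [0.5, 0.5], [[1.0, 2.0]], lam=0.5)
```

In a real training loop this scalar would be computed with an automatic-differentiation framework so that its gradient can update the parameters of the second segmentation network; the plain-Python version only shows the arithmetic.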
It can be seen that, in the embodiments of the application, the image to be identified from the target domain can be used directly to perform domain adaptation on the second segmentation network of the target domain, and images of the source domain are not required. This avoids the difficulty of acquiring source-domain images and improves domain adaptation efficiency. In addition, the information entropy of each pixel point is taken into account during adaptation, so that the adapted second segmentation network can classify each pixel point accurately, which improves semantic segmentation accuracy.
The process of semantically segmenting the pixel points in the image to be identified is described below in connection with the network structure of the second segmentation network. The first segmentation network and the second segmentation network have similar network structures, so the way the first segmentation network semantically segments the image to be identified is similar to that of the second segmentation network and is not described again.
As shown in fig. 2, the second segmentation network includes an encoding network, a first convolution layer, a decoding network, and a second convolution layer. The image to be identified is downsampled through the encoding network to obtain a first feature map, and the first feature map is upsampled through the decoding network to obtain a second feature map; the second feature map is then segmented through the second convolution layer to obtain the second semantic segmentation result of each pixel point. For example, if the convolution kernel of the second convolution layer has dimension 1×1, the pixel value of each pixel point in the second feature map is convolved with the convolution kernel, and softmax normalization is applied across the channels of the convolved feature map to obtain the second semantic segmentation result of each pixel point. It should be understood that more convolution layers may be designed for semantic segmentation; only one convolution layer is illustrated in the present application.
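The 1×1 convolution followed by softmax can be illustrated with a small, dependency-free sketch. A 1×1 convolution is a per-pixel linear map from C feature channels to N class scores, and the softmax turns those scores into probabilities. The function names, the bias term and the toy weights below are assumptions for illustration and are not taken from the application.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def conv1x1_softmax(feature_map, weights, bias):
    """Apply a 1x1 convolution and a channel-wise softmax to every pixel.

    feature_map: H x W x C nested lists.
    weights: N x C matrix (one row per category); bias: list of length N.
    Returns H x W x N per-pixel class probabilities.
    """
    out = []
    for row in feature_map:
        out_row = []
        for px in row:
            logits = [
                sum(w * x for w, x in zip(w_k, px)) + b_k
                for w_k, b_k in zip(weights, bias)
            ]
            out_row.append(softmax(logits))
        out.append(out_row)
    return out


# Toy 1 x 2 feature map with C = 2 channels and N = 2 categories.
features = [[[1.0, 0.0], [0.0, 1.0]]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, made up for the example
bias = [0.0, 0.0]
probs = conv1x1_softmax(features, weights, bias)
```

Each output pixel is a probability vector over the N categories, which is the form the per-pixel semantic segmentation results take in the averaging and entropy steps.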
In addition, before the first feature map is upsampled through the decoding network, bilinear interpolation is performed on the first feature map to restore its scale and obtain a third feature map, wherein the dimension of the third feature map is the same as the dimension of the second feature map. Semantic segmentation is then performed on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point; this is similar to the semantic segmentation of the second feature map through the second convolution layer and is not described again. Then, a second KL divergence between the second semantic segmentation result and the third semantic segmentation result of each pixel point is determined, and the second KL divergences of all pixel points in the image to be identified are averaged to obtain a third KL divergence. Illustratively, the third KL divergence may be represented by equation (4):
$$KL_3=\frac{1}{|\Omega_s|}\sum_{i\in\Omega_s} KL\big(q_i(s),\,r_i(s)\big) \qquad (4)$$
where $KL_3$ is the third KL divergence, |Ω_s| is the number of pixel points in the image s to be identified, and $q_i(s)$ and $r_i(s)$ are the second and third semantic segmentation results of the i-th pixel point, respectively.
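The two operations in this step, bilinear scale restoration and the pixel-averaged KL divergence, can be sketched as below. The corner-aligned sampling convention, the single-channel grid, the argument order of the KL divergence (second result first, third result second) and the function names are all assumptions made for this illustration.

```python
import math


def bilinear_resize(grid, out_h, out_w):
    """Resize a 2-D grid (H x W floats) with corner-aligned bilinear interpolation."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for r in range(out_h):
        y = r * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(math.floor(y))
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for c in range(out_w):
            x = c * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(math.floor(x))
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
            bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


def mean_pixel_kl(second, third, eps=1e-12):
    """Average over pixels of KL(second_i || third_i).

    second, third: flat lists with one probability vector per pixel.
    """
    total = 0.0
    for p, q in zip(second, third):
        total += sum(
            pk * math.log((pk + eps) / (qk + eps))
            for pk, qk in zip(p, q)
            if pk > 0
        )
    return total / len(second)


# Upsample one channel of a toy 1 x 2 feature map to 1 x 3.
upsampled = bilinear_resize([[0.0, 2.0]], 1, 3)

# Per-pixel results from the decoder branch (second) and encoder branch (third).
kl3 = mean_pixel_kl([[0.9, 0.1], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]])
```

For a multi-channel feature map the same interpolation would be applied to each channel independently before the first convolution layer produces the third semantic segmentation result.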
Further, after the third KL divergence is determined, the target loss may be determined according to the first KL divergence, the third KL divergence, a sum of information entropy of each pixel point, and a preset parameter. Then, domain adaptation is performed on the second split network based on the target loss. In addition, after the adaptation to the second segmentation network domain is completed, the decoding network and the second convolution layer are deleted, a third segmentation network is obtained, and the second segmentation network (i.e., the third segmentation network) from which the decoding network and the second convolution layer are deleted is used for performing semantic segmentation on the image.
It can be seen that when the second segmentation network is subjected to domain adaptation, loss (third KL divergence) between the coding network and the decoding network is determined, that is, the coding network and the decoding network are subjected to countermeasure training, so that the coding network has the function of decoding the network, then the decoding network is deleted under the condition that the semantic segmentation precision is not reduced, the model scale of the second segmentation network is reduced, and the migration of the second segmentation network is facilitated and the efficiency of the second segmentation network for semantic segmentation is improved.
It should be understood that the first segmentation network may serve as a supervisory network for the second segmentation network. Therefore, to ensure the accuracy of semantic segmentation for each pixel point, the decoding network in the first segmentation network and the convolution layer connected to that decoding network are not deleted after training of the first segmentation network is completed.
In some possible embodiments, the domain adaptation method of the present application may be applied to the medical field. That is, the first segmentation network and the second segmentation network are networks for lesion segmentation, and the k categories to which each pixel point may belong are k lesions. In the medical field, the labeling cost of medical images is relatively high. The first segmentation network can therefore be trained using labeled image data of an existing source domain (for example, labeled tumor-related medical images in an open-source database), and domain adaptation can then be performed on the second segmentation network based on the trained first segmentation network and unlabeled images of the target domain. In this way, the second segmentation network attains the segmentation effect of the first segmentation network, and the image knowledge of the source domain is migrated to the target domain, which improves the precision with which the second segmentation network segments lesions, provides data reference for doctors' diagnosis, and promotes the progress of medical science and technology.
In some possible embodiments, the domain adaptation method of the present application may also be applied to the blockchain field. For example, the images of the source domain and/or the target domain may be stored in a blockchain, so that security is ensured when the images of the source domain and/or the target domain are accessed.
Referring to fig. 3, fig. 3 is a schematic flow chart of training a first segmentation network according to an embodiment of the present application. The method comprises the following steps:
301: Acquiring a training image from a source domain.
302: Inputting the training image into the first segmentation network, and predicting a fourth semantic segmentation result of each pixel point in the training image, wherein the fourth semantic segmentation result of each pixel point represents the probability that the pixel point belongs to each of the k categories.
303: Determining a fourth KL divergence according to the fourth semantic segmentation result of each pixel point and the label of each pixel point, wherein the label of each pixel point represents the true probability that the pixel point belongs to each of the k categories.
304: Adjusting network parameters of the first segmentation network according to the fourth KL divergence.
Illustratively, the fourth KL divergence is taken as the loss of the first segmentation network, and the network parameters of the first segmentation network are adjusted according to this loss until the first segmentation network converges, which completes the training of the first segmentation network.
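The loss in steps 303 and 304 can be illustrated with a minimal NumPy sketch. It assumes, as choices not stated in the text, that the fourth KL divergence is taken from the label distribution to the predicted distribution and averaged over pixel points; the function name `fourth_kl_divergence` and the clipping constant `eps` are hypothetical. The parameter update itself (step 304) depends on the network and optimizer and is not shown.

```python
import numpy as np

def fourth_kl_divergence(labels, probs, eps=1e-12):
    """Mean over pixel points of KL(label || prediction).

    labels, probs: arrays of shape (H, W, k); each pixel's vector sums to 1.
    Averaging over pixels and the KL direction are assumptions.
    """
    # Clip to avoid log(0) for one-hot labels or saturated predictions.
    labels = np.clip(labels, eps, 1.0)
    probs = np.clip(probs, eps, 1.0)
    kl_per_pixel = np.sum(labels * (np.log(labels) - np.log(probs)), axis=-1)
    return float(kl_per_pixel.mean())
```

With one-hot labels this quantity reduces, up to the clipping constant, to the familiar per-pixel cross-entropy.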
Referring to fig. 4, fig. 4 is a block diagram of the functional units of a domain adaptation device according to an embodiment of the present application. The domain adaptation device 400 includes an acquisition unit 401 and a processing unit 402, wherein:
the acquisition unit 401 is configured to acquire an image to be identified from a target domain;
the processing unit 402 is configured to input the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of a source domain;
input the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified;
and perform domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
In some possible embodiments, in terms of inputting the image to be identified into the first segmentation network to obtain the first class proportion, the processing unit 402 is specifically configured to:
input the image to be identified into the first segmentation network, and perform semantic segmentation on each pixel point in the image to be identified to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that the pixel point belongs to each of k categories, k is an integer from 1 to N, and N is an integer greater than 1;
average the first semantic segmentation results of all pixel points to obtain a first semantic segmentation result of the image to be identified;
and obtain the first class proportion according to the first semantic segmentation result of the image to be identified.
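The averaging step above can be sketched in NumPy as follows. The sketch assumes that the image-level semantic segmentation result is used directly as the class proportion; the text only says the proportion is obtained "according to" that result, so this mapping, like the function name `class_proportion`, is an assumption.

```python
import numpy as np

def class_proportion(pixel_probs):
    """Average per-pixel class probabilities into one image-level vector.

    pixel_probs: array of shape (H, W, k) whose last axis sums to 1.
    Returns an array of shape (k,) that also sums to 1.
    """
    n_classes = pixel_probs.shape[-1]
    # Flatten the spatial axes, then average over all pixel points.
    return pixel_probs.reshape(-1, n_classes).mean(axis=0)
```

The same function would apply unchanged to the output of the second segmentation network to obtain the second class proportion.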
In some possible embodiments, in terms of inputting the image to be identified into the second segmentation network to obtain the second class proportion and the entropy diagram, the processing unit 402 is specifically configured to:
input the image to be identified into the second segmentation network, and perform semantic segmentation on each pixel point in the image to be identified to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point represents the probability that the pixel point belongs to each of k categories, k is an integer from 1 to N, and N is an integer greater than 1;
average the second semantic segmentation results of all pixel points to obtain a second semantic segmentation result of the image to be identified;
and determine the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and an information entropy calculation formula, the information entropy of each pixel point forming the entropy diagram.
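The entropy diagram can be sketched as below. The information entropy calculation formula is not reproduced in this passage, so the sketch assumes the standard Shannon entropy with the natural logarithm; the function name `entropy_diagram` and the clipping constant `eps` are hypothetical.

```python
import numpy as np

def entropy_diagram(pixel_probs, eps=1e-12):
    """Per-pixel Shannon entropy of the predicted class distribution.

    pixel_probs: array of shape (H, W, k) whose last axis sums to 1.
    Returns an (H, W) matrix; larger values mean less confident pixels.
    """
    p = np.clip(pixel_probs, eps, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=-1)
```

A pixel point predicted with full confidence has entropy close to zero, while a uniform prediction over k categories has entropy log k.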
In some possible embodiments, in terms of performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram, the processing unit 402 is specifically configured to:
determine a first KL divergence between the first class proportion and the second class proportion;
determine the sum of the information entropy of each pixel point in the entropy diagram;
determine a target loss according to the first KL divergence, the sum of the information entropy of each pixel point and a preset parameter;
and adjust network parameters of the second segmentation network according to the target loss so as to perform domain adaptation on the second segmentation network.
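The target loss described above can be sketched as follows. How the preset parameter enters the loss is not specified in this passage; the sketch assumes it is a scalar weight on the first KL divergence, and it also assumes the divergence is taken from the first class proportion to the second. The names `kl_divergence`, `target_loss` and `weight` are hypothetical.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions of equal length."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def target_loss(first_prop, second_prop, entropy, weight=0.5):
    """Entropy sum plus weighted first KL divergence (assumed combination).

    first_prop, second_prop: class proportions of shape (k,).
    entropy: entropy diagram of shape (H, W).
    weight: stands in for the preset parameter.
    """
    kl_1 = kl_divergence(first_prop, second_prop)
    entropy_sum = float(np.sum(entropy))
    return entropy_sum + weight * kl_1
```

Under these assumptions, minimizing the loss pushes the second network toward confident per-pixel predictions while keeping its image-level class proportion close to that of the first network.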
In some possible implementations, the second segmentation network further includes a first convolution layer connected to the encoding network; before upsampling processing is performed on the first feature map through the decoding network to obtain a second feature map, the processing unit 402 is further configured to:
perform bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as the dimension of the second feature map;
perform semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determine a second KL divergence between the second semantic segmentation result of each pixel point and the third semantic segmentation result of each pixel point, and average the second KL divergences of all pixel points in the image to be identified to obtain a third KL divergence.
In terms of determining the target loss according to the first KL divergence, the sum of the information entropy of each pixel point and the preset parameter, the processing unit 402 is specifically configured to:
determine the target loss according to the first KL divergence, the third KL divergence, the sum of the information entropy of each pixel point and the preset parameter.
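The bilinear interpolation and the third KL divergence can be sketched in NumPy as follows. The corner-aligned sampling convention, the direction of the per-pixel divergence, and the names `bilinear_resize` and `third_kl_divergence` are assumptions made for illustration; the first convolution layer that turns the third feature map into class probabilities is not modeled here.

```python
import numpy as np

def bilinear_resize(feat, out_h, out_w):
    """Resize an (H, W, C) feature map by bilinear interpolation.

    Output corners are mapped onto input corners (an assumed convention).
    """
    in_h, in_w, _ = feat.shape
    ys = np.linspace(0.0, in_h - 1, out_h)
    xs = np.linspace(0.0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None, None]  # vertical blend weights
    wx = (xs - x0)[None, :, None]  # horizontal blend weights
    top = feat[y0][:, x0] * (1 - wx) + feat[y0][:, x1] * wx
    bottom = feat[y1][:, x0] * (1 - wx) + feat[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def third_kl_divergence(second_probs, third_probs, eps=1e-12):
    """Average over pixel points of the second KL divergence.

    second_probs, third_probs: arrays of shape (H, W, k).
    """
    p = np.clip(second_probs, eps, 1.0)
    q = np.clip(third_probs, eps, 1.0)
    kl_2 = np.sum(p * (np.log(p) - np.log(q)), axis=-1)  # per-pixel second KL
    return float(kl_2.mean())
```

The resulting scalar would then be added to the target loss alongside the first KL divergence and the entropy sum, with whatever weighting the preset parameter prescribes.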
In some possible embodiments, the processing unit 402 is further configured to delete the decoding network and the second convolution layer after domain adaptation of the second segmentation network is completed, to obtain a third segmentation network, and to perform semantic segmentation on images by using the third segmentation network.
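The deletion step can be pictured with the structural sketch below. It assumes that, once the decoding network and the second convolution layer are removed, inference runs through the encoding network, the bilinear interpolation and the first convolution layer; the text does not spell out this path. The class `SegmentationNetwork`, its fields and the function `prune` are hypothetical, and the layers are plain callables rather than trained modules.

```python
from dataclasses import dataclass
from typing import Callable, Optional

import numpy as np

Layer = Callable[[np.ndarray], np.ndarray]

@dataclass
class SegmentationNetwork:
    encoder: Layer                      # encoding network (downsampling)
    first_conv: Layer                   # head applied to the interpolated map
    upsample: Layer                     # bilinear interpolation, no weights
    decoder: Optional[Layer] = None     # decoding network (upsampling)
    second_conv: Optional[Layer] = None # head applied to the decoder output

    def forward(self, image):
        feat = self.encoder(image)
        if self.decoder is not None and self.second_conv is not None:
            # Second segmentation network: full encoder-decoder path.
            return self.second_conv(self.decoder(feat))
        # Third segmentation network: lightweight path without the decoder.
        return self.first_conv(self.upsample(feat))

def prune(network):
    """Drop the decoder and second head, keeping the adapted encoder."""
    return SegmentationNetwork(network.encoder, network.first_conv, network.upsample)
```

The pruned network shares the adapted encoder and first convolution layer with the original, so no retraining is implied by this sketch.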
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device 500 includes a transceiver 501, a processor 502, and a memory 503, which are connected by a bus 504. The memory 503 is used to store computer programs and data, and the data stored in the memory 503 may be transferred to the processor 502.
The processor 502 is configured to read the computer program in the memory 503 to perform the following operations:
acquiring an image to be identified from a target domain;
inputting the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of a source domain;
inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified;
and performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
In some possible embodiments, in terms of inputting the image to be identified into the first segmentation network to obtain the first class proportion, the processor 502 is configured to perform the following steps:
inputting the image to be identified into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that the pixel point belongs to each of k categories, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation results of all pixel points to obtain a first semantic segmentation result of the image to be identified;
and obtaining the first class proportion according to the first semantic segmentation result of the image to be identified.
In some possible embodiments, in terms of inputting the image to be identified into the second segmentation network to obtain the second class proportion and the entropy diagram, the processor 502 is configured to perform the following steps:
inputting the image to be identified into the second segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point represents the probability that the pixel point belongs to each of k categories, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation results of all pixel points to obtain a second semantic segmentation result of the image to be identified;
and determining the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and an information entropy calculation formula, the information entropy of each pixel point forming the entropy diagram.
In some possible implementations, in terms of performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram, the processor 502 is configured to perform the following steps:
determining a first KL divergence between the first class proportion and the second class proportion;
determining the sum of the information entropy of each pixel point in the entropy diagram;
determining a target loss according to the first KL divergence, the sum of the information entropy of each pixel point and a preset parameter;
and adjusting network parameters of the second segmentation network according to the target loss so as to perform domain adaptation on the second segmentation network.
In some possible implementations, the second segmentation network further includes a first convolution layer connected to the encoding network; before upsampling processing is performed on the first feature map through the decoding network to obtain a second feature map, the processor 502 is further configured to perform the following steps:
performing bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as the dimension of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result of each pixel point and the third semantic segmentation result of each pixel point, and averaging the second KL divergences of all pixel points in the image to be identified to obtain a third KL divergence.
In terms of determining the target loss according to the first KL divergence, the sum of the information entropy of each pixel point and the preset parameter, the processor 502 is configured to perform the following step:
determining the target loss according to the first KL divergence, the third KL divergence, the sum of the information entropy of each pixel point and the preset parameter.
In some possible implementations, the processor 502 is further configured to perform the following steps:
after domain adaptation of the second segmentation network is completed, deleting the decoding network and the second convolution layer to obtain a third segmentation network; and performing semantic segmentation on images by using the third segmentation network.
Specifically, the transceiver 501 may be the acquisition unit 401 of the domain adaptation device 400 of the embodiment shown in fig. 4, and the processor 502 may be the processing unit 402 of the domain adaptation device 400 of the embodiment shown in fig. 4.
It should be understood that the domain adaptation device in the present application may include a smart phone (such as an Android phone, an iOS phone, or a Windows Phone phone), a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID), a wearable device, etc. The above domain adaptation devices are merely examples and are not exhaustive; the domain adaptation device includes but is not limited to them. In practical applications, the domain adaptation device may further include an intelligent vehicle-mounted terminal, a computer device, etc.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program that is executed by a processor to implement some or all of the steps of any of the domain adaptation methods described in the method embodiments above.
In one embodiment of the present application, the above-mentioned computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The blockchain (Blockchain), essentially a de-centralized database, is a string of data blocks that are generated in association using cryptographic methods, each of which contains information from a batch of network transactions for verifying the validity (anti-counterfeit) of its information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform part or all of the steps of any of the domain adaptation methods described in the method embodiments above.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are alternative embodiments, and that the acts and modules referred to are not necessarily required for the present application.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts that are not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of the units is merely a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units described above may be implemented either in hardware or in software program modules.
The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application, in essence or in whole or in part, may be embodied in the form of a software product that is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned memory includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program instructing associated hardware, and the program may be stored in a computer-readable memory, which may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiments of the present application have been described in detail above, and specific examples have been used herein to explain the principles and implementations of the present application; the above description of the embodiments is provided solely to facilitate understanding of the method and core concepts of the present application. Meanwhile, those of ordinary skill in the art may, in accordance with the ideas of the present application, make changes to the specific implementations and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (8)

1. A domain adaptation method, comprising:
Acquiring an image to be identified from a target domain;
inputting the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of a source domain;
inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified;
performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram; wherein the second segmentation network comprises an encoding network, a decoding network, a first convolution layer and a second convolution layer, and the method comprises:
performing downsampling processing on the image to be identified through the encoding network to obtain a first feature map; performing upsampling processing on the first feature map through the decoding network to obtain a second feature map; performing semantic segmentation on the second feature map through the second convolution layer to obtain a second semantic segmentation result of each pixel point; performing bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as the dimension of the second feature map; performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
Determining a first KL-divergence between the first class proportion and the second class proportion; determining a second KL divergence between a second semantic segmentation result of each pixel point and a third semantic segmentation result of each pixel point, and acquiring an average value of the second KL divergence of each pixel point in the image to be identified to obtain a third KL divergence;
Determining target loss according to the first KL divergence, the third KL divergence, the sum of information entropy of each pixel point and preset parameters;
And adjusting network parameters of the second segmentation network according to the target loss so as to carry out domain adaptation on the second segmentation network, and deleting the decoding network and the second convolution layer after finishing domain adaptation on the second segmentation network to obtain a third segmentation network.
2. The method of claim 1, wherein the inputting the image to be identified into a first segmentation network to obtain a first class proportion comprises:
inputting the image to be identified into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that the pixel point belongs to each of k categories, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation results of all pixel points to obtain a first semantic segmentation result of the image to be identified;
and obtaining the first class proportion according to the first semantic segmentation result of the image to be identified.
3. The method according to claim 1 or 2, wherein the inputting the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram comprises:
inputting the image to be identified into the second segmentation network, and performing semantic segmentation on each pixel point in the image to be identified to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point represents the probability that the pixel point belongs to each of k categories, k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation results of all pixel points to obtain a second semantic segmentation result of the image to be identified;
and determining the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and an information entropy calculation formula, the information entropy of each pixel point forming the entropy diagram.
4. The method of claim 3, wherein the performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram comprises:
determining a first KL divergence between the first class proportion and the second class proportion;
determining the sum of the information entropy of each pixel point in the entropy diagram;
determining a target loss according to the first KL divergence, the sum of the information entropy of each pixel point and a preset parameter;
and adjusting network parameters of the second segmentation network according to the target loss so as to perform domain adaptation on the second segmentation network.
5. The method according to claim 1, wherein the method further comprises:
and performing semantic segmentation on the image by using the third segmentation network.
6. A domain adaptation device, characterized in that the device is adapted to perform the method of any one of claims 1-5, comprising:
an acquisition unit, configured to acquire the image to be identified from the target domain;
a processing unit, configured to input the image to be identified into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training with images of a source domain;
input the image to be identified into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by the information entropy of each pixel point in the image to be identified;
and perform domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
7. An electronic device, comprising: a processor and a memory, the processor being connected to the memory, the memory being for storing a computer program, the processor being for executing the computer program stored in the memory to cause the electronic device to perform the method of any one of claims 1-5.
8. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any of claims 1-5.
CN202011543313.3A 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium Active CN112633285B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011543313.3A CN112633285B (en) 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium
PCT/CN2021/082603 WO2022134338A1 (en) 2020-12-23 2021-03-24 Domain adaptation method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011543313.3A CN112633285B (en) 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112633285A CN112633285A (en) 2021-04-09
CN112633285B CN112633285B (en) 2024-07-23

Family

ID=75322072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011543313.3A Active CN112633285B (en) 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112633285B (en)
WO (1) WO2022134338A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114024726B (en) * 2021-10-26 2022-09-02 清华大学 Method and system for detecting network flow online

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135510A (en) * 2019-05-22 2019-08-16 电子科技大学中山学院 Dynamic domain self-adaptive method, equipment and computer readable storage medium
CN111199550A (en) * 2020-04-09 2020-05-26 腾讯科技(深圳)有限公司 Training method, segmentation method, device and storage medium of image segmentation network

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018126213A1 (en) * 2016-12-30 2018-07-05 Google Llc Multi-task learning using knowledge distillation
US20190130220A1 (en) * 2017-10-27 2019-05-02 GM Global Technology Operations LLC Domain adaptation via class-balanced self-training with spatial priors
CN110750665A (en) * 2019-10-12 2020-02-04 南京邮电大学 Open set domain adaptation method and system based on entropy minimization
CN111062951B (en) * 2019-12-11 2022-03-25 华中科技大学 Knowledge distillation method based on semantic segmentation intra-class feature difference
CN111401406B (en) * 2020-02-21 2023-07-18 华为技术有限公司 Neural network training method, video frame processing method and related equipment
CN111489365B (en) * 2020-04-10 2023-12-22 上海商汤临港智能科技有限公司 Training method of neural network, image processing method and device

Also Published As

Publication number Publication date
CN112633285A (en) 2021-04-09
WO2022134338A1 (en) 2022-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant