CN112016554A - Semantic segmentation method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN112016554A
Authority
CN
China
Prior art keywords
image
semantic segmentation
feature
segmentation model
discriminator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010773786.6A
Other languages
Chinese (zh)
Other versions
CN112016554B (en
Inventor
李仕仁
王金桥
朱贵波
胡建国
林格
张海
赵朝阳
谭大伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nexwise Intelligence China Ltd
Original Assignee
Nexwise Intelligence China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nexwise Intelligence China Ltd
Priority to CN202010773786.6A
Publication of CN112016554A
Application granted
Publication of CN112016554B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention provide a semantic segmentation method, a semantic segmentation device, electronic equipment and a storage medium. The method comprises: performing a geometric transformation on a source domain image and on a target domain image to obtain a first intermediate image and a second intermediate image, respectively; inputting the source domain image, the target domain image, the first intermediate image and the second intermediate image into a generator network of a semantic segmentation model to obtain, in that order, a first image feature, a second image feature, a third image feature and a fourth image feature; performing an inverse geometric transformation on the third image feature and on the fourth image feature to obtain a fifth image feature and a sixth image feature, respectively; inputting the first image feature and the sixth image feature into a first discriminator, inputting the second image feature and the fifth image feature into a second discriminator, and determining from the discrimination results whether parameters of the semantic segmentation model need to be adjusted; and, when no adjustment is needed, performing semantic segmentation on the target domain image using the generator network.

Description

Semantic segmentation method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a semantic segmentation method and apparatus, an electronic device, and a storage medium.
Background
Semantic segmentation classifies each pixel in an image, grouping pixels that belong to the same class.
In semantic segmentation, pixel classification can be learned from images whose pixels have been labeled. However, labeling requires a great deal of manual effort. Real datasets for semantic segmentation therefore generally contain only a small number of samples (a sample being an image that has been labeled), which limits how well a model generalizes to the variety of real-world cases.
A common solution to this problem in the prior art is unsupervised semantic segmentation, in which a semantic segmentation model trained on a computer-synthesized dataset is applied to a dataset of real scenes of the same kind. To limit the loss of real feature information, a domain adaptation method is needed to reduce the difference between the feature-space distributions of images from the different domains.
Prior-art domain adaptation methods generally consider how to transfer knowledge from the computer-synthesized domain to real scenes; in short, they focus only on "how to adapt" and not on "what to adapt". As a result, when a semantic segmentation model trained on a computer-synthesized dataset is applied to a real scene, segmentation accuracy remains insufficient even after domain adaptation.
Disclosure of Invention
To solve the problems in the prior art, embodiments of the present invention provide a semantic segmentation method, apparatus, electronic device, and storage medium.
An embodiment of a first aspect of the present invention provides a semantic segmentation method, including:
a geometric transformation step: performing a geometric transformation on a source domain image to obtain a first intermediate image, and performing a geometric transformation on a target domain image to obtain a second intermediate image, wherein the source domain image is a computer-synthesized image with label data and the target domain image is an image to be semantically segmented;
an image feature extraction step: inputting the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network of a semantic segmentation model to obtain, in that order, a first image feature, a second image feature, a third image feature and a fourth image feature, wherein the generator network comprises a cross-domain class-aware module configured to adjust ambiguous pixel features in the source domain and the target domain so that the class centers of same-class features in the different domains are consistent;
an inverse geometric transformation step: performing an inverse geometric transformation on the third image feature to obtain a fifth image feature, and performing an inverse geometric transformation on the fourth image feature to obtain a sixth image feature;
a discrimination step: inputting the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, inputting the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and determining, according to the discrimination results of the first discriminator and the second discriminator, whether parameters in the semantic segmentation model need to be adjusted;
a semantic segmentation step: when the parameters in the semantic segmentation model do not need to be adjusted, performing semantic segmentation on the target domain image using the generator network of the semantic segmentation model.
In the above technical solution, between the discrimination step and the semantic segmentation step, the method further includes:
a parameter adjustment step: when the parameters in the semantic segmentation model need to be adjusted, adjusting them and then re-executing the image feature extraction step.
In the above technical solution, between the inverse geometric transformation step and the discrimination step, the method further includes:
a feature alignment step: performing feature alignment on the first image feature and the fifth image feature, and performing feature alignment on the second image feature and the sixth image feature.
In the above technical solution, the geometric transformation is any one of the following operations: flipping up-down, flipping left-right, or stretching;
accordingly, the inverse geometric transformation is the inverse operation of the geometric transformation.
An embodiment of a second aspect of the present invention provides a semantic segmentation apparatus, including:
a geometric transformation module, configured to perform a geometric transformation on a source domain image to obtain a first intermediate image, and to perform a geometric transformation on a target domain image to obtain a second intermediate image, wherein the source domain image is a computer-synthesized image with label data and the target domain image is an image to be semantically segmented;
an image feature extraction module, configured to input the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network of a semantic segmentation model to obtain, in that order, a first image feature, a second image feature, a third image feature and a fourth image feature, wherein the generator network comprises a cross-domain class-aware module configured to adjust ambiguous pixel features in the source domain and the target domain so that the class centers of same-class features in the different domains are consistent;
an inverse geometric transformation module, configured to perform an inverse geometric transformation on the third image feature to obtain a fifth image feature, and to perform an inverse geometric transformation on the fourth image feature to obtain a sixth image feature;
a discrimination module, configured to input the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, to input the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and to determine, according to the discrimination results of the first discriminator and the second discriminator, whether parameters in the semantic segmentation model need to be adjusted;
a semantic segmentation module, configured to perform semantic segmentation on the target domain image using the generator network of the semantic segmentation model when the parameters in the semantic segmentation model do not need to be adjusted.
In the above technical solution, the apparatus further includes:
a parameter adjustment module, configured to adjust the parameters in the semantic segmentation model when they need to be adjusted, and then to invoke the image feature extraction module again.
In the above technical solution, the apparatus further includes:
a feature alignment module, configured to perform feature alignment on the first image feature and the fifth image feature, and to perform feature alignment on the second image feature and the sixth image feature.
An embodiment of a third aspect of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the semantic segmentation method according to the embodiment of the first aspect of the present invention.
An embodiment of a fourth aspect of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the semantic segmentation method according to the embodiment of the first aspect of the present invention.
The semantic segmentation method, device, electronic equipment and storage medium provided by the embodiments of the invention are based on semantic consistency and the principle of adversarial learning. Discriminators are used to discriminate between the source domain features and the target domain features produced after cross-domain class-aware processing, which reduces the difference between the spatial distributions of the extracted source domain and target domain features, further enhances the domain adaptation effect of the semantic segmentation model, and improves the segmentation accuracy of the semantic segmentation model on unlabeled datasets.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a semantic segmentation method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a semantic segmentation apparatus according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The applicant has found that the feature spaces extracted by the same semantic segmentation model for the same class in datasets from different domains should be similar, and that the class centers should be consistent. In practice, however, the feature distributions of the same class often differ between datasets of real scenes and datasets of computer-synthesized scenes. Consequently, when a semantic segmentation model trained on a dataset of computer-synthesized scenes is applied to a real scene, segmentation accuracy is insufficient.
To address this problem in the prior art, the semantic segmentation method provided by the embodiment of the invention enhances the domain adaptation effect by exploiting semantic consistency together with adversarial learning, thereby improving the segmentation accuracy of the semantic segmentation model on unlabeled datasets.
For ease of understanding, before the semantic segmentation method provided by the embodiment of the present invention is explained in detail, the related concepts involved in the method are first described together.
Semantic consistency: when an image undergoes a geometric transformation, such as a left-right or up-down flip, the points of visual attention in the image are transformed accordingly. For example, if a person looking at an image focuses on a dog or a gun, then after the image is flipped left-right or up-down, the image features the person attends to are flipped in the same way; that is, the features of the flipped image are flipped correspondingly. Therefore, the features obtained by flipping the image, extracting its features, and then applying the inverse flip to those features should be consistent with the features of the original image. This property is semantic consistency.
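The semantic consistency property described above can be illustrated with a minimal sketch. The extractor `toy_features` is a hypothetical stand-in (a per-pixel map, chosen only because it commutes with flips exactly) and is not the generator network of the embodiment; a real network would satisfy the property only approximately.

```python
from typing import List

Image = List[List[float]]


def flip_lr(grid: Image) -> Image:
    """Left-right flip: reverse every row. A flip is its own inverse."""
    return [list(reversed(row)) for row in grid]


def toy_features(image: Image) -> Image:
    """Hypothetical feature extractor: an independent per-pixel mapping."""
    return [[2.0 * value + 1.0 for value in row] for row in image]


image = [[0.0, 1.0, 2.0],
         [3.0, 4.0, 5.0]]

# Features of the original image.
original_features = toy_features(image)

# Flip the image, extract features, then undo the flip on the features.
restored_features = flip_lr(toy_features(flip_lr(image)))

# Semantic consistency: the two feature maps should agree.
is_consistent = restored_features == original_features
```

For this toy extractor the check holds exactly; for a learned network the two feature maps would be compared with a tolerance or a loss instead of strict equality.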
Source domain image: a computer-synthesized image with label data.
Target domain image: the image to be semantically segmented. The target domain image has no label data.
Fig. 1 is a flowchart of a semantic segmentation method provided in an embodiment of the present invention, and as shown in fig. 1, the semantic segmentation method provided in the embodiment of the present invention includes:
Step 101: performing a geometric transformation on the source domain image to obtain a first intermediate image, and performing a geometric transformation on the target domain image to obtain a second intermediate image.
The geometric transformation of the source domain image can be realized by applying a geometric transformation function to the data of the source domain image. The specific form of the function is not limited: it may represent a left-right flip, an up-down flip, or another type of geometric transformation such as scaling.
The geometric transformation of the target domain image is similar to that of the source domain image and is therefore not repeated here.
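Step 101 can be sketched as follows. This is an illustration only: images are modeled as nested lists of pixel values, and the names `flip_left_right`, `flip_up_down` and `geometric_transform` are hypothetical helpers, not identifiers from the embodiment.

```python
from typing import Callable, List

Image = List[List[int]]


def flip_left_right(image: Image) -> Image:
    """Mirror the image horizontally by reversing each row."""
    return [row[::-1] for row in image]


def flip_up_down(image: Image) -> Image:
    """Mirror the image vertically by reversing the order of the rows."""
    return [row[:] for row in image[::-1]]


def geometric_transform(image: Image,
                        transform: Callable[[Image], Image]) -> Image:
    """Apply the chosen geometric transformation function to the image data."""
    return transform(image)


source_image = [[1, 2], [3, 4]]  # stands in for a labeled synthetic image
target_image = [[5, 6], [7, 8]]  # stands in for an unlabeled real image

first_intermediate = geometric_transform(source_image, flip_left_right)
second_intermediate = geometric_transform(target_image, flip_left_right)
```

The sketch applies the same function to both images for simplicity; whichever function is used for an image determines the inverse that is later applied to the features of that image.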
Step 102: inputting the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network of a semantic segmentation model to obtain, in that order, a first image feature, a second image feature, a third image feature and a fourth image feature.
In the embodiment of the invention, the semantic segmentation model uses adversarial learning; that is, the semantic segmentation model comprises a generator network and a discriminator network. The generator network extracts image features from images, and the discriminator network discriminates between real and fake for the two inputs being compared.
In an embodiment of the invention, the generator network comprises a CDCAM module.
The CDCAM (Cross-Domain Class-Aware Module), also called the cross-domain class-aware module, attends to the class centers of the data features of the other domain when extracting the features of one domain and, using an attention mechanism, adjusts the ambiguous pixel features in both domains so that the class centers of same-class features in the different domains are consistent, thereby reducing the difference in feature distributions and achieving domain adaptation.
The source domain image is input into the CDCAM module to obtain the first image feature; the target domain image is input into the CDCAM module to obtain the second image feature; the first intermediate image is input into the CDCAM module to obtain the third image feature; and the second intermediate image is input into the CDCAM module to obtain the fourth image feature.
The implementation of the CDCAM module, and how the CDCAM module generates the corresponding image features from an input image, are well known to those skilled in the art and are therefore not further described herein.
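Since the text does not give the internals of the CDCAM module, the following sketch only illustrates what a "class center" could mean, under the assumption that it is the mean feature vector of the pixels assigned to a class. The function `class_centers` and the sample data are hypothetical; for the unlabeled target domain some form of predicted labels would be needed, which is also an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def class_centers(features: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> Dict[int, List[float]]:
    """Return the mean feature vector of each class (assumed definition)."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = defaultdict(int)
    for vector, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        for index, value in enumerate(vector):
            sums[label][index] += value
        counts[label] += 1
    return {label: [total / counts[label] for total in vector_sum]
            for label, vector_sum in sums.items()}


# One feature vector per pixel, with a class label per pixel.
source_features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
source_labels = [0, 0, 1]

centers = class_centers(source_features, source_labels)
```

Under this reading, domain adaptation would aim to make the center of a class computed from source features coincide with the center of the same class computed from target features.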
Step 103: performing an inverse geometric transformation on the third image feature to obtain a fifth image feature, and performing an inverse geometric transformation on the fourth image feature to obtain a sixth image feature.
As can be understood from the preceding steps, the third image feature is generated from the first intermediate image, that is, the source domain image after geometric transformation, and the fourth image feature is generated from the second intermediate image, that is, the target domain image after geometric transformation.
The inverse geometric transformation of the third image feature can be realized by applying an inverse geometric transformation function to the third image feature, where that function is the inverse of the geometric transformation function applied to the source domain image in step 101. For example, if the source domain image was enlarged by a factor of 2 in step 101, the third image feature is reduced by half in this step.
The inverse geometric transformation of the fourth image feature is similar to that of the third image feature. The inverse geometric transformation function used is the inverse of the geometric transformation function applied to the target domain image in step 101.
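The pairing of each geometric transformation with its inverse can be sketched as a lookup table. The sketch assumes that feature maps keep the spatial layout of the image, models them as nested lists, and uses nearest-neighbor enlargement for the factor-of-2 example; `TRANSFORM_PAIRS` and the helper names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Grid = List[List[float]]


def flip_left_right(grid: Grid) -> Grid:
    return [row[::-1] for row in grid]


def flip_up_down(grid: Grid) -> Grid:
    return [row[:] for row in grid[::-1]]


def enlarge_2x(grid: Grid) -> Grid:
    """Nearest-neighbor enlargement: repeat every value twice in each direction."""
    enlarged: Grid = []
    for row in grid:
        wide_row = [value for value in row for _ in range(2)]
        enlarged.append(wide_row)
        enlarged.append(list(wide_row))
    return enlarged


def reduce_half(grid: Grid) -> Grid:
    """Inverse of enlarge_2x: keep every second row and every second column."""
    return [row[::2] for row in grid[::2]]


# Each entry maps a transformation name to (forward function, inverse function).
TRANSFORM_PAIRS: Dict[str, Tuple[Callable[[Grid], Grid], Callable[[Grid], Grid]]] = {
    "flip_left_right": (flip_left_right, flip_left_right),  # self-inverse
    "flip_up_down": (flip_up_down, flip_up_down),           # self-inverse
    "enlarge_2x": (enlarge_2x, reduce_half),
}

feature = [[1.0, 2.0], [3.0, 4.0]]
round_trips = {name: inverse(forward(feature)) == feature
               for name, (forward, inverse) in TRANSFORM_PAIRS.items()}
```

Keeping the forward and inverse functions together makes it harder to apply a mismatched inverse in step 103.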
Step 104: inputting the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, inputting the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and determining, according to the discrimination results of the first discriminator and the second discriminator, whether parameters in the semantic segmentation model need to be adjusted.
As mentioned above, the semantic segmentation model comprises a discriminator network for discriminating between real and fake for the two inputs being compared. In an embodiment of the invention, the discriminator network comprises a first discriminator and a second discriminator.
The cross-domain class-aware feature pairs obtained from an image before and after geometric transformation, namely the first and fifth image features and the second and sixth image features, are semantically consistent. For this reason, the embodiment of the invention uses two discriminators. The first discriminator discriminates between the first image feature, obtained from the source domain image without geometric transformation, and the sixth image feature, obtained from the target domain image after geometric transformation. The second discriminator discriminates between the second image feature, obtained from the target domain image without geometric transformation, and the fifth image feature, obtained from the source domain image after geometric transformation.
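The routing of the six features to the two discriminators can be sketched as plain data flow. The discriminators themselves are not modeled; `build_discriminator_inputs` and the toy per-pixel extractor are hypothetical and serve only to show which feature goes where.

```python
from typing import Callable, Dict, List, Tuple

Grid = List[List[int]]


def build_discriminator_inputs(
    source: Grid,
    target: Grid,
    transform: Callable[[Grid], Grid],
    inverse: Callable[[Grid], Grid],
    extract: Callable[[Grid], Grid],
) -> Dict[str, Tuple[Grid, Grid]]:
    """Run steps 101 to 103 and pair the features as described for step 104."""
    first = extract(source)              # source domain image
    second = extract(target)             # target domain image
    third = extract(transform(source))   # first intermediate image
    fourth = extract(transform(target))  # second intermediate image
    fifth = inverse(third)               # back in the source image layout
    sixth = inverse(fourth)              # back in the target image layout
    return {
        "first_discriminator": (first, sixth),    # source vs. transformed target
        "second_discriminator": (second, fifth),  # target vs. transformed source
    }


def flip_left_right(grid: Grid) -> Grid:
    return [row[::-1] for row in grid]


def toy_extract(grid: Grid) -> Grid:
    """Hypothetical per-pixel extractor standing in for the generator network."""
    return [[10 * value for value in row] for row in grid]


pairs = build_discriminator_inputs(
    source=[[1, 2], [3, 4]],
    target=[[5, 6], [7, 8]],
    transform=flip_left_right,
    inverse=flip_left_right,
    extract=toy_extract,
)
```

Each discriminator therefore always receives one source-derived feature and one target-derived feature, one of which has passed through the transform and inverse-transform path.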
Whether a preset requirement is met can be judged from the discrimination results of the two discriminators. If the requirement is not met, the parameters in the semantic segmentation model need to be adjusted; if the requirement is met, the parameters do not need to be adjusted. The preset requirement may be set according to the application scenario and is not limited in the embodiment of the present invention.
Step 105: when the parameters in the semantic segmentation model do not need to be adjusted, performing semantic segmentation on the target domain image using the generator network of the semantic segmentation model.
In this step, when the parameters in the semantic segmentation model do not need to be adjusted, the finally determined generator network of the semantic segmentation model can be used to perform semantic segmentation on the target domain image. How to semantically segment the target domain image once the parameters of the semantic segmentation model have been determined is common knowledge for those skilled in the art and is therefore not repeated here.
The semantic segmentation method provided by the embodiment of the invention is based on semantic consistency and the principle of adversarial learning. Discriminators are used to discriminate between the source domain features and the target domain features produced after cross-domain class-aware processing, which reduces the difference between the spatial distributions of the extracted source domain and target domain features, further enhances the domain adaptation effect of the semantic segmentation model, and improves the segmentation accuracy of the semantic segmentation model on unlabeled datasets.
Based on any one of the above embodiments, in an embodiment of the present invention, between step 104 and step 105, the method further includes:
when the parameters in the semantic segmentation model need to be adjusted, adjusting the parameters in the semantic segmentation model and then executing step 102 again.
The preceding embodiment describes the case where, according to the discrimination results of the discriminators, the parameters in the semantic segmentation model do not need to be adjusted. This embodiment further describes the case where the parameters in the semantic segmentation model are adjusted according to the discrimination results.
The parameters in the semantic segmentation model include the parameters of the generator network and the parameters of the discriminator network. Adjusting the parameters in the semantic segmentation model therefore includes adjusting the parameters of the generator network and adjusting the parameters of the discriminator network.
How to adjust the parameters of the generator network and of the discriminator network is common knowledge for those skilled in the art and is not further described in the embodiments of the present invention.
After the parameters of the semantic segmentation model have been adjusted, the generator network of the adjusted model can be used to extract the features of the source domain image, the target domain image, the first intermediate image and the second intermediate image again, and the discriminator network of the adjusted model can be used to discriminate the first, sixth, second and fifth image features again. The associated operations are similar to steps 102 to 104 and are therefore not repeated here.
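The repeat-until-accepted control flow described above can be sketched as a loop. Only the control flow is taken from the text; the callbacks `needs_adjustment` and `adjust`, the round limit, and the toy numeric "parameters" are hypothetical placeholders for steps 102 to 104 and for an actual parameter update.

```python
from typing import Callable, Tuple


def train_until_accepted(
    params: float,
    needs_adjustment: Callable[[float], bool],
    adjust: Callable[[float], float],
    max_rounds: int = 100,
) -> Tuple[float, int]:
    """Repeat extraction and discrimination until no adjustment is needed.

    Returns the accepted parameters and the number of adjustments made.
    """
    for adjustments_made in range(max_rounds):
        # Steps 102 to 104: extract features, discriminate, check the requirement.
        if not needs_adjustment(params):
            return params, adjustments_made
        # Parameter adjustment, after which step 102 is executed again.
        params = adjust(params)
    raise RuntimeError("preset requirement not met within max_rounds")


# Toy stand-ins: the "parameters" are a single gap value that each
# adjustment halves, and the requirement is that the gap is at most 0.1.
final_params, adjustments = train_until_accepted(
    params=1.0,
    needs_adjustment=lambda gap: gap > 0.1,
    adjust=lambda gap: gap * 0.5,
)
```

Once the loop returns, the accepted parameters correspond to the generator network that step 105 uses for segmenting the target domain image.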
The semantic segmentation method provided by the embodiment of the invention is based on semantic consistency and adversarial learning. Discriminators are used to discriminate between the source domain features and the target domain features produced after cross-domain class-aware processing, which reduces the difference between the spatial distributions of the extracted source domain and target domain features, further enhances the domain adaptation effect of the semantic segmentation model, and improves the segmentation accuracy of the semantic segmentation model on unlabeled datasets.
Based on any one of the above embodiments, in an embodiment of the present invention, between step 103 and step 104, the method further includes:
performing feature alignment on the first image feature and the fifth image feature, and performing feature alignment on the second image feature and the sixth image feature.
As can be understood from the foregoing description, the first image feature is obtained by processing the source domain image with the CDCAM module. The fifth image feature is obtained by performing an inverse geometric transformation on the third image feature, and the third image feature is obtained by processing the geometrically transformed source domain image with the CDCAM module. The first image feature and the fifth image feature are therefore semantically consistent.
Aligning the first image feature with the fifth image feature helps extract features from the source domain image more accurately.
Similarly, the second image feature is obtained by processing the target domain image with the CDCAM module. The sixth image feature is obtained by performing an inverse geometric transformation on the fourth image feature, and the fourth image feature is obtained by processing the geometrically transformed target domain image with the CDCAM module. The second image feature and the sixth image feature are therefore semantically consistent.
Aligning the second image feature with the sixth image feature helps extract features from the target domain image more accurately.
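The text does not specify how feature alignment is measured. One plausible reading, shown here purely as an assumption, is to penalize the mean absolute difference between the two semantically consistent feature maps; `mean_abs_difference` and the sample values are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def mean_abs_difference(first: Grid, second: Grid) -> float:
    """Average absolute element-wise difference between two feature maps."""
    if len(first) != len(second) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(first, second)
    ):
        raise ValueError("feature maps must have the same shape")
    total = 0.0
    count = 0
    for row_a, row_b in zip(first, second):
        for value_a, value_b in zip(row_a, row_b):
            total += abs(value_a - value_b)
            count += 1
    return total / count


first_feature = [[1.0, 2.0], [3.0, 4.0]]  # from the source domain image
fifth_feature = [[1.0, 4.0], [3.0, 2.0]]  # from the inverse-transformed path

alignment_gap = mean_abs_difference(first_feature, fifth_feature)
```

A gap of zero would mean the two paths produce identical features; the same measure could be applied to the second and sixth image features.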
The semantic segmentation method provided by the embodiment of the invention aligns the features output by the cross-domain class-aware module, which improves the robustness of the semantic segmentation model and strengthens the role of the cross-domain class-aware module within the semantic segmentation model.
Based on any of the above embodiments, Fig. 2 is a schematic diagram of a semantic segmentation apparatus according to another embodiment of the present invention. As shown in Fig. 2, the semantic segmentation apparatus includes:
a geometric transformation module 201, configured to perform a geometric transformation on the source domain image to obtain a first intermediate image, and to perform the geometric transformation on the target domain image to obtain a second intermediate image; the source domain image is a computer-synthesized image with label data, and the target domain image is an image to be semantically segmented;
an image feature extraction module 202, configured to input the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network in a semantic segmentation model, obtaining in sequence a first image feature, a second image feature, a third image feature and a fourth image feature; the generator network comprises a cross-domain category-aware module, which is used for adjusting the features of ambiguous pixels in the source domain and the target domain so that the category centers of features of the same category are consistent across domains;
an inverse geometric transformation module 203, configured to perform an inverse geometric transformation on the third image feature to obtain a fifth image feature, and to perform the inverse geometric transformation on the fourth image feature to obtain a sixth image feature;
a discrimination module 204, configured to input the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, to input the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and to determine, according to the discrimination results of the first discriminator and the second discriminator, whether the parameters in the semantic segmentation model need to be adjusted;
a semantic segmentation module 205, configured to perform semantic segmentation on the target domain image using the generator network in the semantic segmentation model when the parameters in the semantic segmentation model do not need to be adjusted.
The semantic segmentation apparatus provided by the embodiment of the invention, based on semantic consistency and the principle of adversarial learning, uses discriminators to discriminate between the source domain features and the target domain features obtained after cross-domain category awareness. This reduces the difference between the spatial distributions of the extracted source domain features and target domain features, further strengthens the domain adaptation effect of the semantic segmentation model, and improves the segmentation accuracy of the semantic segmentation model on unlabeled datasets.
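The data flow through modules 201 to 204 can be sketched as follows. This is only an illustration under stated assumptions: `toy_generator` is a hypothetical pointwise stand-in for the generator network (a real generator would be a trained network containing the CDCAM module), the three transforms follow the vertical flip, horizontal flip and stretch operations named in the claims, and stretching is modeled here as a 2x nearest-neighbour widening.

```python
import numpy as np

# Each geometric transform is paired with its inverse (dictionary keys are illustrative).
TRANSFORMS = {
    "flip_ud": (np.flipud, np.flipud),
    "flip_lr": (np.fliplr, np.fliplr),
    # Assumed model of "stretching": repeat every column twice; the inverse keeps every 2nd column.
    "stretch": (lambda x: np.repeat(x, 2, axis=1), lambda x: x[:, ::2]),
}

def toy_generator(image):
    """Hypothetical stand-in for the generator network: a pointwise map on a 2-D image."""
    return np.tanh(image)

def extract_features(source, target, name="flip_lr"):
    forward, inverse = TRANSFORMS[name]
    inter1, inter2 = forward(source), forward(target)        # module 201: intermediate images
    f1, f2 = toy_generator(source), toy_generator(target)    # module 202: first and second features
    f3, f4 = toy_generator(inter1), toy_generator(inter2)    # module 202: third and fourth features
    f5, f6 = inverse(f3), inverse(f4)                        # module 203: fifth and sixth features
    return {
        "first_discriminator": (f1, f6),    # pairing used by module 204
        "second_discriminator": (f2, f5),
        "features": (f1, f2, f3, f4, f5, f6),
    }

source = np.arange(12, dtype=float).reshape(3, 4) / 10.0
target = source[::-1] * 0.5
results = {name: extract_features(source, target, name) for name in TRANSFORMS}
```

Because the stand-in generator acts on each pixel independently, the fifth feature coincides with the first and the sixth with the second in this sketch. A real generator network is not exactly equivariant to these transforms, which is why the text treats such pairs as semantically consistent rather than identical.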
Based on any of the above embodiments, in an embodiment of the present invention, the semantic segmentation apparatus further includes:
a parameter adjustment module, configured to adjust the parameters in the semantic segmentation model when they need to be adjusted, and then to call the image feature extraction module again.
The parameters in the semantic segmentation model include the parameters of the generator network and the parameters of the discriminator network in the semantic segmentation model. Adjusting the parameters in the semantic segmentation model by the parameter adjustment module comprises: adjusting the parameters of the generator network, and adjusting the parameters of the discriminator network.
After the parameter adjustment module adjusts the parameters in the semantic segmentation model, the image feature extraction module is called again, so that the generator network in the adjusted semantic segmentation model extracts features from the source domain image, the target domain image, the first intermediate image and the second intermediate image again; the inverse geometric transformation module and the discrimination module are then called in sequence. The discrimination module uses the discriminator network in the adjusted semantic segmentation model to discriminate the first image feature, the sixth image feature, the second image feature and the fifth image feature again.
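The repeated calling of the modules described above amounts to a loop that ends once the discrimination result indicates that no further adjustment is needed. The sketch below shows only this control flow; the callables `extract`, `discriminate` and `adjust`, the `max_rounds` safeguard, and the toy stopping threshold are all hypothetical and are not taken from the text.

```python
def train_until_stable(extract, discriminate, adjust, params, max_rounds=100):
    """Repeat extraction and discrimination until no adjustment is requested.

    extract(params) -> features
    discriminate(params, features) -> True if the parameters still need adjusting
    adjust(params, features) -> new parameters
    All three callables are hypothetical placeholders for the modules in the text.
    """
    for round_index in range(max_rounds):
        features = extract(params)                  # feature extraction and inverse transformation
        if not discriminate(params, features):      # discrimination module decides
            return params, round_index
        params = adjust(params, features)           # parameter adjustment module
    raise RuntimeError("parameters still need adjustment after max_rounds")

# Toy usage: the single "parameter" is a domain gap that each adjustment halves.
final_gap, rounds = train_until_stable(
    extract=lambda gap: gap,
    discriminate=lambda gap, feats: feats > 0.05,   # assumed stopping rule
    adjust=lambda gap, feats: gap / 2.0,
    params=1.0,
)
```

In the toy run the gap shrinks from 1.0 to 0.03125 after five adjustments, at which point the assumed rule no longer requests an adjustment and the loop returns.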
The semantic segmentation apparatus provided by the embodiment of the invention, based on semantic consistency and the principle of adversarial learning, uses discriminators to discriminate between the source domain features and the target domain features obtained after cross-domain category awareness. This reduces the difference between the spatial distributions of the extracted source domain features and target domain features, further strengthens the domain adaptation effect of the semantic segmentation model, and improves the segmentation accuracy of the semantic segmentation model on unlabeled datasets.
Based on any of the above embodiments, in an embodiment of the present invention, the semantic segmentation apparatus further includes:
a feature alignment module, configured to perform feature alignment on the first image feature and the fifth image feature, and to perform feature alignment on the second image feature and the sixth image feature.
The first image feature and the fifth image feature have semantic consistency. Feature alignment of the first image features with the fifth image features facilitates more accurate feature extraction from the source domain image.
The second image feature and the sixth image feature have semantic consistency. Feature alignment of the second image features with the sixth image features facilitates more accurate feature extraction from the target domain image.
The semantic segmentation method provided by the embodiment of the invention aligns the features output by the cross-domain category-aware module, which improves the robustness of the semantic segmentation model and strengthens the role of the cross-domain category-aware module within it.
Fig. 3 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 3, the electronic device may include: a processor 310, a communication interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communication interface 320 and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the following method: a geometric transformation step: performing a geometric transformation on a source domain image to obtain a first intermediate image, and performing the geometric transformation on a target domain image to obtain a second intermediate image; the source domain image is a computer-synthesized image with label data, and the target domain image is an image to be semantically segmented; an image feature extraction step: inputting the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network in a semantic segmentation model to obtain, in sequence, a first image feature, a second image feature, a third image feature and a fourth image feature; the generator network comprises a cross-domain category-aware module, which is used for adjusting the features of ambiguous pixels in the source domain and the target domain so that the category centers of features of the same category are consistent across domains; an inverse geometric transformation step: performing an inverse geometric transformation on the third image feature to obtain a fifth image feature, and performing the inverse geometric transformation on the fourth image feature to obtain a sixth image feature; a discrimination step: inputting the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, inputting the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and determining, according to the discrimination results of the first discriminator and the second discriminator, whether the parameters in the semantic segmentation model need to be adjusted; and a semantic segmentation step: when the parameters in the semantic segmentation model do not need to be adjusted, performing semantic segmentation on the target domain image using the generator network in the semantic segmentation model.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and, when sold or used as independent products, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method provided by the foregoing embodiments, for example including: a geometric transformation step: performing a geometric transformation on a source domain image to obtain a first intermediate image, and performing the geometric transformation on a target domain image to obtain a second intermediate image; the source domain image is a computer-synthesized image with label data, and the target domain image is an image to be semantically segmented; an image feature extraction step: inputting the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network in a semantic segmentation model to obtain, in sequence, a first image feature, a second image feature, a third image feature and a fourth image feature; the generator network comprises a cross-domain category-aware module, which is used for adjusting the features of ambiguous pixels in the source domain and the target domain so that the category centers of features of the same category are consistent across domains; an inverse geometric transformation step: performing an inverse geometric transformation on the third image feature to obtain a fifth image feature, and performing the inverse geometric transformation on the fourth image feature to obtain a sixth image feature; a discrimination step: inputting the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, inputting the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and determining, according to the discrimination results of the first discriminator and the second discriminator, whether the parameters in the semantic segmentation model need to be adjusted; and a semantic segmentation step: when the parameters in the semantic segmentation model do not need to be adjusted, performing semantic segmentation on the target domain image using the generator network in the semantic segmentation model.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A method of semantic segmentation, comprising:
a geometric transformation step: performing a geometric transformation on a source domain image to obtain a first intermediate image, and performing the geometric transformation on a target domain image to obtain a second intermediate image; wherein the source domain image is a computer-synthesized image with label data, and the target domain image is an image to be semantically segmented;
an image feature extraction step: inputting the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network in a semantic segmentation model to obtain, in sequence, a first image feature, a second image feature, a third image feature and a fourth image feature; wherein the generator network comprises a cross-domain category-aware module, which is used for adjusting the features of ambiguous pixels in the source domain and the target domain so that the category centers of features of the same category are consistent across domains;
an inverse geometric transformation step: performing an inverse geometric transformation on the third image feature to obtain a fifth image feature, and performing the inverse geometric transformation on the fourth image feature to obtain a sixth image feature;
a discrimination step: inputting the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, inputting the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and determining, according to the discrimination results of the first discriminator and the second discriminator, whether the parameters in the semantic segmentation model need to be adjusted;
and a semantic segmentation step: when the parameters in the semantic segmentation model do not need to be adjusted, performing semantic segmentation on the target domain image using the generator network in the semantic segmentation model.
2. The semantic segmentation method according to claim 1, wherein between the discrimination step and the semantic segmentation step, the method further comprises:
a parameter adjustment step: adjusting the parameters in the semantic segmentation model when they need to be adjusted, and then re-executing the image feature extraction step.
3. The semantic segmentation method according to claim 1 or 2, wherein between the inverse geometric transformation step and the discrimination step, the method further comprises:
a feature alignment step: performing feature alignment on the first image feature and the fifth image feature, and performing feature alignment on the second image feature and the sixth image feature.
4. The semantic segmentation method according to any one of claims 1 to 3, wherein the geometric transformation is any one of the following operations: vertical flipping, horizontal flipping, or stretching;
correspondingly, the inverse geometric transformation is the inverse operation of the geometric transformation.
5. A semantic segmentation apparatus, comprising:
a geometric transformation module, configured to perform a geometric transformation on a source domain image to obtain a first intermediate image, and to perform the geometric transformation on a target domain image to obtain a second intermediate image; wherein the source domain image is a computer-synthesized image with label data, and the target domain image is an image to be semantically segmented;
an image feature extraction module, configured to input the source domain image, the target domain image, the first intermediate image and the second intermediate image respectively into a generator network in a semantic segmentation model to obtain, in sequence, a first image feature, a second image feature, a third image feature and a fourth image feature; wherein the generator network comprises a cross-domain category-aware module, which is used for adjusting the features of ambiguous pixels in the source domain and the target domain so that the category centers of features of the same category are consistent across domains;
an inverse geometric transformation module, configured to perform an inverse geometric transformation on the third image feature to obtain a fifth image feature, and to perform the inverse geometric transformation on the fourth image feature to obtain a sixth image feature;
a discrimination module, configured to input the first image feature and the sixth image feature into a first discriminator in a discriminator network of the semantic segmentation model, to input the second image feature and the fifth image feature into a second discriminator in the discriminator network of the semantic segmentation model, and to determine, according to the discrimination results of the first discriminator and the second discriminator, whether the parameters in the semantic segmentation model need to be adjusted;
and a semantic segmentation module, configured to perform semantic segmentation on the target domain image using the generator network in the semantic segmentation model when the parameters in the semantic segmentation model do not need to be adjusted.
6. The semantic segmentation apparatus according to claim 5, further comprising:
a parameter adjustment module, configured to adjust the parameters in the semantic segmentation model when they need to be adjusted, and then to call the image feature extraction module again.
7. The semantic segmentation apparatus according to claim 5, further comprising:
a feature alignment module, configured to perform feature alignment on the first image feature and the fifth image feature, and to perform feature alignment on the second image feature and the sixth image feature.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the semantic segmentation method according to any one of claims 1 to 4 are implemented when the program is executed by the processor.
9. A non-transitory computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the semantic segmentation method according to any one of claims 1 to 4.
CN202010773786.6A 2020-08-04 2020-08-04 Semantic segmentation method and device, electronic equipment and storage medium Active CN112016554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010773786.6A CN112016554B (en) 2020-08-04 2020-08-04 Semantic segmentation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010773786.6A CN112016554B (en) 2020-08-04 2020-08-04 Semantic segmentation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112016554A true CN112016554A (en) 2020-12-01
CN112016554B CN112016554B (en) 2022-09-02

Family

ID=73500158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010773786.6A Active CN112016554B (en) 2020-08-04 2020-08-04 Semantic segmentation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112016554B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960260A (en) * 2018-07-12 2018-12-07 东软集团股份有限公司 A kind of method of generating classification model, medical image image classification method and device
CN110399856A (en) * 2019-07-31 2019-11-01 上海商汤临港智能科技有限公司 Feature extraction network training method, image processing method, device and its equipment
CN110503636A (en) * 2019-08-06 2019-11-26 腾讯医疗健康(深圳)有限公司 Parameter regulation means, lesion prediction technique, parameter adjustment controls and electronic equipment
CN111199550A (en) * 2020-04-09 2020-05-26 腾讯科技(深圳)有限公司 Training method, segmentation method, device and storage medium of image segmentation network


Also Published As

Publication number Publication date
CN112016554B (en) 2022-09-02


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant