CN113326886B - Method and system for detecting salient object based on unsupervised learning

Info

Publication number: CN113326886B
Authority: CN (China)
Prior art keywords: target domain, domain image, pseudo, representing, uncertainty
Legal status: Active (as listed; an assumption, not a legal conclusion)
Application number: CN202110665987.9A
Other languages: Chinese (zh)
Other versions: CN113326886A (en)
Inventors: 李冠彬, 吴梓溢, 颜鹏翔, 刘梦梦, 林倞
Current assignee: Sun Yat Sen University (as listed; may be inaccurate)
Original assignee: Sun Yat Sen University
Application filed by: Sun Yat Sen University
Priority: CN202110665987.9A
Application publication: CN113326886A
Granted publication: CN113326886B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a system for salient object detection based on unsupervised learning. The method comprises the following steps: obtaining target domain samples, wherein the label of each target domain sample is a pseudo label obtained by predicting the target domain image with the model obtained in the previous iteration; evaluating the uncertainty of the pseudo labels and ranking the samples by uncertainty; according to the ranking, screening the pseudo labels at the image level to obtain target domain samples whose pseudo-label uncertainty is below a preset threshold; and re-weighting the pseudo labels of those target domain samples at the pixel level to obtain the target domain samples used for the next training iteration. Without relying on manual labels, the method achieves excellent performance on multiple salient object detection datasets, comparable to that of fully supervised saliency detection methods, and thereby greatly reduces the dependence of salient object detection on pixel-level manual labels.

Description

Method and system for detecting salient object based on unsupervised learning
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a system for detecting a salient object based on unsupervised learning.
Background
In recent years, salient object detection has been applied directly, as an important image processing technique, to numerous business scenarios such as image editing, short-video creation, and live streaming. These businesses use salient object detection for video compression, object detection, visual tracking, or video segmentation. Compared with traditional salient object detection methods, methods based on fully convolutional neural networks have quickly become popular in the field because they are easy to train and computationally efficient. However, such methods achieve good segmentation only after extensive training on a large number of images or videos labeled pixel by pixel, which consumes considerable manpower and material resources; moreover, labeling results vary with the annotator's experience, which affects the accuracy of subsequent detection results.
To alleviate these problems, researchers have proposed salient object detection methods based on deep unsupervised learning. Their main idea is to learn saliency from the noisy saliency labels generated by traditional salient object detection methods, mainly through either noise modeling or pseudo-label learning. However, the noisy labels generated by traditional saliency detection methods are ill-suited to complex scenes such as low contrast and fine object structures, so fine saliency features cannot be learned, which hinders further development of salient object detection.
Disclosure of Invention
The invention aims to provide a salient object detection method and system based on unsupervised learning, so as to solve the technical problem that existing salient object detection methods cannot be applied to complex scenes such as low contrast and fine object structures.
To overcome the above defects in the prior art, the invention provides a salient object detection method based on unsupervised learning, comprising:
obtaining a target domain sample, wherein the label of the target domain sample is a pseudo label obtained by predicting a target domain image by using a model obtained by the previous iteration;
evaluating, by means of the variance, the consistency of the saliency prediction probability maps of the target domain image under different data augmentations, so as to evaluate the uncertainty of the pseudo label, and ranking the target domain samples according to the uncertainty score of each sample;
the saliency prediction probability map generated under each data augmentation is given by:
$$\hat{P}_t^{j} = \alpha_j^{-1}\big(\mathcal{F}(\alpha_j(I_t))\big)$$
where $\hat{P}_t^{j}$ denotes the saliency prediction map of the target domain image; $I_t$ denotes the target domain image; $\alpha_j(\cdot)$ denotes the $j$-th data augmentation; $\alpha_j^{-1}(\cdot)$ denotes the inverse transform of $\alpha_j(\cdot)$; and $\mathcal{F}(\cdot)$ denotes the model operation used to generate the pseudo label of the target domain image;
the consistency of the saliency probability maps is evaluated with the variance:
$$\mathrm{Var}_t = E\big[(\hat{P}_t^{j} - \hat{y}_t)^2\big] = \frac{1}{N}\sum_{j=1}^{N}\big(\hat{P}_t^{j} - \hat{y}_t\big)^2$$
where $\mathrm{Var}_t$ denotes the variance map of the target domain image; $E$ denotes the averaging operation; $\hat{P}_t^{j}$ denotes the saliency prediction map of the target domain image; $\hat{y}_t = \mathcal{F}(I_t)$ is the pseudo label of the target domain image; and $N$ denotes the number of data augmentations;
the uncertainty score of each target domain sample is obtained by:
$$U_t = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W}\mathrm{Var}_t^{(h,w)}$$
where $U_t$ denotes the uncertainty score of the target domain image; $H$ denotes the height of the target domain image; $W$ denotes the width of the target domain image; $h$ denotes the coordinate of the target domain image in the vertical direction; $w$ denotes the coordinate of the target domain image in the horizontal direction; and $\mathrm{Var}_t^{(h,w)}$ denotes the value of the variance map of the target domain image at coordinates $(h, w)$;
according to the ranking, screening the pseudo labels at the image level to obtain target domain samples whose pseudo-label uncertainty is below a preset threshold;
re-weighting the pseudo labels of the target domain samples at the pixel level to obtain the target domain samples used for the next training iteration, wherein the pixel-level pseudo-label weight of a target domain sample is obtained by:
$$\omega_t = e^{-k\,\mathrm{Var}_t}$$
where $\omega_t$ denotes the pixel-level pseudo-label re-weighting weight of the target domain image; $k$ denotes the amplitude by which the pixel-level pseudo-label weight of the target domain image decreases; and $\mathrm{Var}_t$ denotes the variance map of the target domain image.
Further, the data augmentation is a reversible data augmentation.
Further, before the image-level screening of the pseudo labels according to the ranking, the method further includes:
using prior knowledge, deleting pseudo labels whose salient pixel area or non-salient pixel area is larger than a preset range.
The invention also provides a saliency object detection system based on unsupervised learning, comprising:
the pseudo tag obtaining unit is used for obtaining a target domain sample, wherein the tag of the target domain sample is a pseudo tag obtained by predicting a target domain image by using a model obtained by the previous iteration;
the uncertainty evaluation unit is used for evaluating, by means of the variance, the consistency of the saliency prediction probability maps of the target domain image under different data augmentations, so as to evaluate the uncertainty of the pseudo label, and for ranking the target domain samples according to the uncertainty score of each sample;
the saliency prediction probability map generated under each data augmentation is given by:
$$\hat{P}_t^{j} = \alpha_j^{-1}\big(\mathcal{F}(\alpha_j(I_t))\big)$$
where $\hat{P}_t^{j}$ denotes the saliency prediction map of the target domain image; $I_t$ denotes the target domain image; $\alpha_j(\cdot)$ denotes the $j$-th data augmentation; $\alpha_j^{-1}(\cdot)$ denotes the inverse transform of $\alpha_j(\cdot)$; and $\mathcal{F}(\cdot)$ denotes the model operation used to generate the pseudo label of the target domain image;
the consistency of the saliency probability maps is evaluated with the variance:
$$\mathrm{Var}_t = E\big[(\hat{P}_t^{j} - \hat{y}_t)^2\big] = \frac{1}{N}\sum_{j=1}^{N}\big(\hat{P}_t^{j} - \hat{y}_t\big)^2$$
where $\mathrm{Var}_t$ denotes the variance map of the target domain image; $E$ denotes the averaging operation; $\hat{P}_t^{j}$ denotes the saliency prediction map of the target domain image; $\hat{y}_t = \mathcal{F}(I_t)$ is the pseudo label of the target domain image; and $N$ denotes the number of data augmentations;
the uncertainty score of each target domain sample is obtained by:
$$U_t = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W}\mathrm{Var}_t^{(h,w)}$$
where $U_t$ denotes the uncertainty score of the target domain image; $H$ denotes the height of the target domain image; $W$ denotes the width of the target domain image; $h$ denotes the coordinate of the target domain image in the vertical direction; $w$ denotes the coordinate of the target domain image in the horizontal direction; and $\mathrm{Var}_t^{(h,w)}$ denotes the value of the variance map of the target domain image at coordinates $(h, w)$;
the screening unit is used for screening the pseudo labels at the image level according to the ranking, to obtain target domain samples whose pseudo-label uncertainty is below a preset threshold;
the weighting processing unit is used for re-weighting the pseudo labels of the target domain samples at the pixel level to obtain the target domain samples used for the next training iteration, wherein the pixel-level pseudo-label weight of a target domain sample is obtained by:
$$\omega_t = e^{-k\,\mathrm{Var}_t}$$
where $\omega_t$ denotes the pixel-level pseudo-label re-weighting weight of the target domain image; $k$ denotes the amplitude by which the pixel-level pseudo-label weight of the target domain image decreases; and $\mathrm{Var}_t$ denotes the variance map of the target domain image.
Further, the data augmentation is a reversible data augmentation.
Further, the screening unit is further configured to:
use prior knowledge to delete pseudo labels whose salient pixel area or non-salient pixel area is larger than a preset range.
The invention also provides a terminal device, comprising:
one or more processors;
a memory coupled to the processor for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the unsupervised learning-based salient object detection method as recited in any one of the preceding claims.
The present invention also provides a computer-readable storage medium having stored thereon a computer program for execution by a processor to implement the unsupervised learning-based salient object detection method as described in any one of the above.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a saliency object detection method based on unsupervised learning, which comprises the following steps: obtaining a target domain sample, wherein the label of the target domain sample is a pseudo label obtained by predicting a target domain image by using a model obtained by the previous iteration; performing uncertainty evaluation on the pseudo tag and performing uncertainty sorting; according to the sorting result, performing picture-level screening on the pseudo tags to obtain target domain samples with the uncertainty of the pseudo tags lower than a preset threshold; and carrying out pixel-level pseudo tag re-weighting processing on the target domain sample to obtain the target domain sample for the next iteration training.
The invention learns from synthesized but relatively clean labels and, by generating pseudo labels for real scenes, performs domain adaptation from the synthesized dataset to real data, so as to achieve unsupervised and reliable salient object detection in images. Without relying on manual labels, the invention achieves excellent performance on multiple salient object detection datasets, comparable to that of fully supervised saliency detection methods, thereby greatly reducing the dependence of salient object detection on pixel-level manual labels.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for detecting salient objects based on unsupervised learning according to an embodiment of the present invention;
FIG. 2 is an overall framework diagram of a salient object detection method based on unsupervised learning provided by an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a saliency object detection system based on unsupervised learning according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the step numbers used herein are for convenience of description only and are not limiting as to the order in which the steps are performed.
It is to be understood that the terminology used in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises" and "comprising" indicate the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The term "and/or" refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
First aspect:
Referring to FIG. 1, an embodiment of the present invention provides a salient object detection method based on unsupervised learning, including:
S10, acquiring target domain samples, wherein the label of each target domain sample is a pseudo label obtained by predicting the target domain image with the model obtained in the previous iteration;
S20, evaluating the uncertainty of the pseudo labels and ranking the samples by uncertainty;
S30, screening the pseudo labels at the image level according to the ranking, to obtain target domain samples whose pseudo-label uncertainty is below a preset threshold;
S40, re-weighting the pseudo labels of the target domain samples at the pixel level to obtain the target domain samples used for the next training iteration.
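To make the data flow between steps S10 to S40 concrete, the following Python sketch wires the four steps into one round. It is only an illustration under stated assumptions: `run_round`, its parameters, and the stand-in functions in the demo are hypothetical names chosen here, the selection uses a keep ratio rather than a fixed threshold, and the demo's variance and weight functions are placeholders rather than the formulas of the embodiment.

```python
def run_round(predict, images, variance_fn, keep_ratio, weight_fn):
    """One pseudo-labelling round (S10 to S40) over target-domain images.

    predict:     model from the previous iteration, image -> probability map
    variance_fn: (predict, image) -> per-pixel uncertainty map
    keep_ratio:  fraction of lowest-uncertainty samples to keep (assumed knob)
    weight_fn:   uncertainty map -> per-pixel weight map
    """
    candidates = []
    for image in images:
        pseudo = predict(image)                    # S10: pseudo label
        var_map = variance_fn(predict, image)      # S20: pixel-level uncertainty
        pixels = [v for row in var_map for v in row]
        score = sum(pixels) / len(pixels)          # S20: image-level score
        candidates.append({"image": image, "pseudo": pseudo,
                           "var": var_map, "score": score})
    candidates.sort(key=lambda c: c["score"])      # S20: rank, most certain first
    kept = candidates[:int(len(candidates) * keep_ratio)]  # S30: image-level screening
    for sample in kept:                            # S40: pixel-level re-weighting
        sample["weights"] = weight_fn(sample["var"])
    return kept


# Demo with placeholder functions (not the embodiment's model or formulas).
def identity_model(image):
    return image


def distance_from_half(predict, image):
    return [[abs(v - 0.5) for v in row] for row in predict(image)]


def linear_weights(var_map):
    return [[1.0 - v for v in row] for row in var_map]


images = [[[1.0, 0.0]], [[0.5, 0.5]]]
kept = run_round(identity_model, images, distance_from_half, 0.5, linear_weights)
```

In the demo, the second image has the lower placeholder uncertainty, so it is the only sample kept when half of the candidates are retained.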
It should be noted that current fully supervised salient object detection methods based on fully convolutional neural networks rely mainly on training with a large number of images or videos labeled pixel by pixel to achieve good segmentation performance. Even a skilled annotator may take several minutes to tens of minutes to annotate one pixel-level saliency map. To reduce the labeling variability caused by subjective factors, an image often needs to be labeled and verified by several annotators. Existing methods therefore consume a great deal of manpower and material resources, which hinders the development of salient object detection technology. This embodiment accordingly aims to provide a salient object detection method that is low-cost and less time-consuming.
As shown in FIG. 2, the unsupervised domain adaptive salient object detection framework provided in this embodiment aims to learn saliency detection from synthesized but clean labels: using an existing deep-learning-based saliency detection model, it learns saliency prediction from synthesized source domain data and adapts it, without supervision, to real target domain scenes.
Specifically, the main purpose of step S10 is to preliminarily acquire target domain samples for training, each composed of a target domain image and a target domain label. As shown in FIG. 2, in each round of training the training set consists of a portion of source domain samples and a portion of target domain samples; the labels of the source domain samples are accurate and clean, while the labels of the target domain samples are pseudo labels predicted on the target domain images by the model obtained in the previous iteration. However, because the pseudo labels of the target domain samples are initially generated by a saliency detector trained on the source domain, and there is a significant difference in data distribution between the source domain and the target domain, the pseudo labels inevitably contain many erroneous pixel-level predictions. To avoid accumulating errors from incorrect pseudo labels during iterative training, the target domain samples that participate in training must be carefully screened, and different weights must be adaptively assigned to the pixels of the selected samples. The optimization objective for each sample in each round of training is:
$$\mathcal{L} = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W}\omega^{(h,w)}\,\ell_{bce}\big(y^{(h,w)}, p^{(h,w)}\big)$$
where $y^{(h,w)}$ denotes the pixel-level label of the sample; $p^{(h,w)}$ denotes the saliency probability map predicted by the model; $\omega^{(h,w)}$ denotes the pixel-level weight map; $\ell_{bce}$ denotes the binary cross-entropy loss between pixels; and $H$ and $W$ denote the height and width of the image, respectively.
This optimization objective drives the model's predictions as close as possible to the highly reliable pixels of the label. Because the source domain labels are clean and accurate, every pixel in their weight maps has weight 1. For the target domain, the pseudo labels and weight maps of the target domain samples are dynamically updated after each iteration by an uncertainty-aware pseudo-label learning (UPL) strategy. Unlike a naive pseudo-label learning strategy, which trains with all pseudo labels, the UPL strategy further selects target domain pseudo labels and assigns them different pixel weights.
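The pixel-weighted objective can be sketched in plain Python as follows. This is a minimal illustration, not the embodiment's training code: the function name `weighted_bce_loss`, the averaging over all $H \times W$ pixels, and the `eps` clamp that guards against `log(0)` are assumptions made here.

```python
import math


def weighted_bce_loss(labels, probs, weights, eps=1e-7):
    """Mean over all pixels of weight * binary cross-entropy.

    labels, probs, weights: H x W nested lists of floats with the same shape.
    The 1/(H*W) normalisation and the eps clamp are assumptions of this sketch.
    """
    height, width = len(labels), len(labels[0])
    total = 0.0
    for h in range(height):
        for w in range(width):
            p = min(max(probs[h][w], eps), 1.0 - eps)  # keep log() finite
            y = labels[h][w]
            bce = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
            total += weights[h][w] * bce
    return total / (height * width)


labels = [[1.0, 0.0], [0.0, 1.0]]
probs = [[0.9, 0.2], [0.1, 0.6]]

# Source-domain sample: the label is clean, so every pixel weight is 1.
ones = [[1.0, 1.0], [1.0, 1.0]]
source_loss = weighted_bce_loss(labels, probs, ones)

# Target-domain sample: the bottom-right pixel is treated as unreliable.
target_weights = [[1.0, 1.0], [1.0, 0.25]]
target_loss = weighted_bce_loss(labels, probs, target_weights)
```

Down-weighting a pixel shrinks its contribution to the loss, so an unreliable pseudo-label pixel pulls the model less strongly than a reliable one.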
S20, performing uncertainty evaluation on the pseudo tag and performing uncertainty sorting;
In step S20, the target domain pseudo labels are screened primarily through consistency-based uncertainty estimation. First, consistency-based uncertainty estimation is performed on the pseudo labels of the target domain. Specifically, as shown in FIG. 2, given a saliency detection model $\mathcal{F}$ with fixed parameters, each target domain image $I_t$ is input into the model to generate its pseudo label $\hat{y}_t = \mathcal{F}(I_t)$.
Further, the uncertainty of the pseudo label is evaluated on the basis of two considerations. First, on high-confidence (i.e., low-uncertainty) target samples, the saliency detection model is robust to small noise and is not easily affected by it. Second, data augmentation can be regarded as a way of injecting noise into the image. This embodiment therefore provides a new method of estimating the consistency of the pseudo label of a target image $I_t$, namely by evaluating the consistency of the saliency prediction probability maps of $I_t$ under a variety of data augmentations. The saliency prediction map $\hat{P}_t^{j}$ generated under the $j$-th data augmentation $\alpha_j$ can be formulated as:
$$\hat{P}_t^{j} = \alpha_j^{-1}\big(\mathcal{F}(\alpha_j(I_t))\big)$$
Here, only data augmentations $\alpha(\cdot)$ that can be reversed by an inverse transform $\alpha^{-1}(\cdot)$ are used, and $\alpha^{-1}(\cdot)$ is applied to each saliency prediction map $\hat{P}_t^{j}$ to undo the effect of the augmentation and restore the map to the same conditions as the pseudo label $\hat{y}_t$, such as image orientation and size.
Further, the variance is used to evaluate the consistency between the pseudo label and the saliency probability maps predicted under the different data augmentations. The variance map of sample $I_t$ can be formulated as:
$$\mathrm{Var}_t = E\big[(\hat{P}_t^{j} - \hat{y}_t)^2\big] = \frac{1}{N}\sum_{j=1}^{N}\big(\hat{P}_t^{j} - \hat{y}_t\big)^2$$
where $E$ denotes the averaging operation and $N$ denotes the number of data augmentations.
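The consistency-based estimate can be sketched in Python as follows. The sketch assumes that the variance is taken around the pseudo label, uses flips as the reversible augmentations, and replaces the saliency detector with a hypothetical `toy_model`; all function names are chosen here for illustration and do not come from the patent.

```python
def identity(img):
    return [row[:] for row in img]


def hflip(img):
    return [row[::-1] for row in img]


def vflip(img):
    return [row[:] for row in img[::-1]]


# Each entry pairs an augmentation with its inverse; flips are their own inverse.
AUGMENTATIONS = [(identity, identity), (hflip, hflip), (vflip, vflip)]


def toy_model(img):
    """Hypothetical stand-in for the detector: blends each pixel with its
    left neighbour, so its output changes under horizontal flips."""
    return [[0.5 * (row[w] + row[max(w - 1, 0)]) for w in range(len(row))]
            for row in img]


def variance_map(model, image, augmentations=AUGMENTATIONS):
    """Mean squared deviation of the de-augmented predictions from the pseudo label."""
    pseudo = model(image)
    height, width = len(image), len(image[0])
    var = [[0.0] * width for _ in range(height)]
    for augment, invert in augmentations:
        pred = invert(model(augment(image)))  # predict, then undo the augmentation
        for h in range(height):
            for w in range(width):
                var[h][w] += (pred[h][w] - pseudo[h][w]) ** 2 / len(augmentations)
    return var


def uncertainty_score(var):
    """Image-level score: mean of the variance map."""
    pixels = [v for row in var for v in row]
    return sum(pixels) / len(pixels)


flat_score = uncertainty_score(variance_map(toy_model, [[0.5, 0.5], [0.5, 0.5]]))
edge_score = uncertainty_score(variance_map(toy_model, [[0.0, 1.0], [0.0, 1.0]]))
```

A uniform image gives identical predictions under every flip and therefore zero uncertainty, whereas the image with a vertical edge makes the toy model disagree with itself under horizontal flipping and receives a higher score.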
s30, performing picture-level screening on the pseudo tags according to the sorting result to obtain target domain samples with the uncertainty of the pseudo tags lower than a preset threshold;
This embodiment mainly performs image-level screening. Because the saliency detection model generalizes poorly in the early stage of training but improves gradually as the number of iterations increases, 1) only pseudo labels with low uncertainty are selected for training; and 2) fewer pseudo labels are selected in the early stage, and the number of selected pseudo labels is slowly increased as the number of iterations grows. As shown in FIG. 2, the variance map reflects the pixel-level uncertainty of the target pseudo label: lighter regions and darker regions (near black in FIG. 2) represent high uncertainty and low uncertainty, respectively. To rank and filter the target domain samples by uncertainty, an image-level uncertainty score $U$ is computed as the mean of the resulting variance map. The uncertainty score of target image $I_t$ can be expressed as:
$$U_t = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W}\mathrm{Var}_t^{(h,w)}$$
where $H$ and $W$ denote the height and width of the target domain image, and $\mathrm{Var}_t^{(h,w)}$ denotes the value of the variance map at coordinates $(h, w)$.
Further, the target domain samples are ranked by their uncertainty scores, and in each training iteration a proportion of the lowest-uncertainty target samples is selected. This proportion increases as the saliency detection model improves.
In one embodiment, prior knowledge is also introduced during the actual sample screening to remove pseudo labels that consist mostly of salient pixels or mostly of non-salient pixels.
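The image-level screening can be sketched in Python as below. The linear schedule in `keep_ratio_for`, the 0.5 binarization threshold, and the 5% and 95% area bounds are hypothetical values chosen for illustration; the patent only states that the selected proportion grows over iterations and that pseudo labels dominated by salient or non-salient pixels are removed.

```python
def salient_ratio(pseudo_label, threshold=0.5):
    """Fraction of pixels whose pseudo-label value counts as salient."""
    pixels = [v for row in pseudo_label for v in row]
    return sum(1 for v in pixels if v >= threshold) / len(pixels)


def keep_ratio_for(iteration, start=0.4, step=0.3, cap=1.0):
    """Assumed schedule: keep a larger share of samples in later iterations."""
    return min(cap, start + step * iteration)


def select_samples(samples, keep_ratio, min_area=0.05, max_area=0.95):
    """Drop implausible pseudo labels, then keep the lowest-uncertainty share.

    samples: list of dicts with keys 'name', 'score', 'pseudo_label'.
    """
    plausible = [s for s in samples
                 if min_area <= salient_ratio(s["pseudo_label"]) <= max_area]
    ranked = sorted(plausible, key=lambda s: s["score"])  # most certain first
    return ranked[:int(len(ranked) * keep_ratio)]


samples = [
    {"name": "a", "score": 0.30, "pseudo_label": [[1, 0], [0, 0]]},
    {"name": "b", "score": 0.05, "pseudo_label": [[1, 1], [0, 0]]},
    {"name": "c", "score": 0.12, "pseudo_label": [[0, 1], [0, 0]]},
    {"name": "d", "score": 0.01, "pseudo_label": [[1, 1], [1, 1]]},  # all salient
]

selected_per_round = [
    [s["name"] for s in select_samples(samples, keep_ratio_for(i))]
    for i in range(3)
]
```

Sample `d` has the lowest uncertainty score but is never selected, because a pseudo label that marks every pixel as salient is rejected by the prior-knowledge filter before ranking.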
S40, performing pixel-level pseudo tag re-weighting processing on the target domain sample to obtain a target domain sample for the next iteration training.
In this embodiment, it should be noted that although the target domain pseudo labels obtained in step S30 have low uncertainty overall, each pseudo label still contains some regions of high uncertainty. Each pixel of a pseudo label is therefore treated differently during training; specifically, a pixel-level pseudo-label re-weighting strategy $\Omega$ based on the variance map $\mathrm{Var}$ is proposed. The target domain pixel-level weight matrix $\omega_t \in (0,1]^{H\times W}$ can be calculated as:
$$\omega_t = e^{-k\,\mathrm{Var}_t}$$
where $k \in \mathbb{R}^{+}$ denotes the amplitude of the weight decrease. The strategy aims to give pixels with higher uncertainty lower weights.
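A minimal Python sketch of variance-based pixel re-weighting follows. It assumes an exponential decay of the weight with the variance, which satisfies the stated properties (weights in $(0,1]$, lower weight for higher uncertainty, steeper drop for larger $k$); the function name `pixel_weights` and the value `k=10.0` are illustrative choices, not values given in the patent.

```python
import math


def pixel_weights(variance_map, k=10.0):
    """Map each variance value to a weight in (0, 1]; zero variance gives weight 1.

    The exponential form and the default k are assumptions of this sketch.
    """
    return [[math.exp(-k * v) for v in row] for row in variance_map]


variance = [[0.0, 0.05], [0.2, 0.5]]
weights = pixel_weights(variance, k=10.0)
sharper = pixel_weights(variance, k=20.0)
```

A larger `k` lowers the weight of every pixel with non-zero variance, so `k` controls how aggressively uncertain pixels are suppressed in the loss.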
The method provided by this embodiment of the invention learns from synthesized but relatively clean labels and, by generating pseudo labels for real scenes, performs domain adaptation from the synthesized dataset to real data, so as to achieve unsupervised and reliable salient object detection in images. Without relying on manual labels, this embodiment achieves excellent performance on multiple salient object detection datasets, comparable to that of fully supervised salient object detection methods, thereby greatly reducing the dependence of salient object detection on pixel-level manual labels.
Second aspect:
referring to fig. 3, an embodiment of the present invention further provides a salient object detection system based on unsupervised learning, including:
the pseudo tag obtaining unit 01 is used for obtaining a target domain sample, wherein the tag of the target domain sample is a pseudo tag obtained by predicting a target domain image by using a model obtained by the previous iteration;
an uncertainty evaluation unit 02, configured to perform uncertainty evaluation on the pseudo tag, and perform uncertainty sorting;
a screening unit 03, configured to perform picture-level screening on the pseudo tag according to the sorting result, to obtain a target domain sample with uncertainty of the pseudo tag being lower than a preset threshold;
and the weighting processing unit 04 is used for carrying out pixel-level pseudo tag re-weighting processing on the target domain sample to obtain the target domain sample for the next iteration training.
In an embodiment, the uncertainty evaluation unit 02 is further configured to:
evaluate, by means of the variance, the consistency between the pseudo label and the saliency prediction probability maps under different data augmentations, and generate the corresponding variance map.
In one embodiment, the data augmentation is a reversible data augmentation.
In an embodiment, the screening unit 03 is further configured to: use prior knowledge to delete pseudo labels whose salient pixel area or non-salient pixel area is larger than a preset range.
The system provided by this embodiment of the invention is used to execute the method of the first aspect. The method learns from synthesized but relatively clean labels and, by generating pseudo labels for real scenes, performs domain adaptation from the synthesized dataset to real data, so as to achieve unsupervised and reliable salient object detection in images. Without relying on manual labels, this embodiment achieves excellent performance on multiple salient object detection datasets, comparable to that of fully supervised salient object detection methods, thereby greatly reducing the dependence of salient object detection on pixel-level manual labels.
Third aspect:
an embodiment of the present invention further provides a terminal device, including:
one or more processors;
a memory coupled to the processor for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the unsupervised learning-based salient object detection method as described above.
The processor is used for controlling the overall operation of the terminal device to complete all or part of the steps of the significance object detection method based on the unsupervised learning. The memory is used to store various types of data to support operation at the terminal device, which may include, for example, instructions for any application or method operating on the terminal device, as well as application-related data. The Memory may be implemented by any type of volatile or non-volatile Memory device or combination thereof, such as static random access Memory (Static Random Access Memory, SRAM for short), electrically erasable programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM for short), erasable programmable Read-Only Memory (Erasable Programmable Read-Only Memory, EPROM for short), programmable Read-Only Memory (Programmable Read-Only Memory, PROM for short), read-Only Memory (ROM for short), magnetic Memory, flash Memory, magnetic disk or optical disk.
The terminal device may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for performing the salient object detection method based on unsupervised learning according to any of the above embodiments and achieving technical effects consistent with the method described above.
An embodiment of the present invention also provides a computer-readable storage medium including program instructions which, when executed by a processor, implement the steps of the method for saliency object detection based on unsupervised learning as described in any one of the embodiments above. For example, the computer-readable storage medium may be the above memory including the program instructions executable by the processor of the terminal device to perform the method for detecting a salient object based on unsupervised learning according to any one of the above embodiments, and achieve technical effects consistent with the method as described above.
While the foregoing describes the preferred embodiments of the present invention, those skilled in the art will appreciate that changes and modifications may be made without departing from the principles of the invention, and such changes and modifications are also intended to fall within the scope of the invention.

Claims (8)

1. A method for detecting a salient object based on unsupervised learning, comprising:
obtaining target domain samples, wherein the label of each target domain sample is a pseudo label obtained by predicting a target domain image with the model obtained in the previous iteration;
evaluating, by means of variance, the consistency of the saliency prediction probability maps of the target domain image under different data augmentations, so as to evaluate the uncertainty of the pseudo label, and ranking the target domain samples according to the uncertainty score of each target domain sample;
the saliency prediction probability map generated under each data augmentation is given by the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the saliency prediction map of the target domain image; I_t represents the target domain image; α_j(·) represents the j-th data augmentation mode; α_j^{-1}(·) represents the inverse transform operation of α_j(·); and the remaining operator represents the model operation used to generate the pseudo label of the target domain image;
the consistency of the saliency prediction probability maps is evaluated using variance according to the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the variance map of the target domain image; E represents the averaging operation; the argument of the formula is the saliency prediction map of the target domain image; and N represents the number of data augmentations;
the uncertainty score of each target domain sample is obtained by the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the uncertainty score of the target domain image; H represents the height of the target domain image; W represents the width of the target domain image; h represents the coordinate value of the target domain image in the vertical direction; w represents the coordinate value of the target domain image in the horizontal direction; and the indexed term represents the value of the variance map of the target domain image at coordinates (h, w);
performing, according to the ranking result, image-level screening of the pseudo labels to obtain target domain samples whose pseudo-label uncertainty is lower than a preset threshold;
performing pixel-level pseudo-label re-weighting on the target domain samples to obtain the target domain samples for the next training iteration, wherein the pixel-level pseudo-label re-weighting weight of a target domain sample is obtained by the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the pixel-level pseudo-label re-weighting weight of the target domain image; k represents the amplitude by which the pixel-level pseudo-label weight of the target domain image decreases; and the remaining term represents the variance map of the target domain image.
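As an illustration of the uncertainty evaluation recited in claim 1, the following is a minimal NumPy sketch, not the patented implementation. It predicts a saliency map under each augmentation, maps the prediction back with the inverse transform, takes the per-pixel variance across the N predictions, and averages the variance map into one score per image. The function names, the `toy_model` stand-in, the choice of flips as augmentations, and the normalization of the score by H×W are all assumptions made for illustration; the claim's formula images are not available in this text.

```python
import numpy as np

# Hypothetical reversible augmentations, each given as a (forward, inverse) pair.
AUGMENTATIONS = [
    (lambda x: x, lambda x: x),  # identity
    (np.fliplr, np.fliplr),      # horizontal flip is its own inverse
    (np.flipud, np.flipud),      # vertical flip is its own inverse
]


def toy_model(image):
    """Stand-in for the model from the previous iteration (assumption).

    A column-dependent bias makes the output sensitive to horizontal flips,
    so the predictions disagree across augmentations.
    """
    _, width = image.shape
    column_bias = np.linspace(-1.0, 1.0, width)[None, :]
    return 1.0 / (1.0 + np.exp(-(4.0 * (image - 0.5) + column_bias)))


def predict_under_augmentations(model, image, augmentations):
    """Return an (N, H, W) stack of predictions aligned to the input image."""
    preds = [inverse(model(forward(image))) for forward, inverse in augmentations]
    return np.stack(preds, axis=0)


def variance_map(preds):
    """Per-pixel (population) variance across the N aligned predictions."""
    mean = preds.mean(axis=0)
    return ((preds - mean) ** 2).mean(axis=0)


def uncertainty_score(var_map):
    """Average the variance map over all H*W pixels (assumed normalization)."""
    return float(var_map.mean())


def rank_by_uncertainty(scores):
    """Sample indices ordered from lowest to highest uncertainty."""
    return np.argsort(scores).tolist()
```

A model whose output is unaffected by the augmentations yields a zero variance map and therefore a zero score, which is the sense in which low variance indicates a reliable pseudo label.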
2. The method for detecting a salient object based on unsupervised learning according to claim 1, wherein the data augmentation is a reversible data augmentation.
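To make the reversibility requirement of claim 2 concrete, the sketch below checks whether applying an augmentation and then its inverse returns the original image exactly. The specific transforms (horizontal flip, 90-degree rotation, center crop) and all function names are hypothetical examples chosen for illustration; the claim does not enumerate particular augmentations.

```python
import numpy as np


def hflip(x):
    """Horizontal flip; applying it twice restores the input."""
    return x[:, ::-1]


def rot90(x):
    """Rotate by 90 degrees counter-clockwise."""
    return np.rot90(x, k=1)


def rot90_inv(x):
    """Rotate by 90 degrees clockwise, undoing rot90."""
    return np.rot90(x, k=-1)


def center_crop(x):
    """Drop a one-pixel border; the lost pixels cannot be recovered."""
    return x[1:-1, 1:-1]


# Hypothetical registry of reversible augmentations as (forward, inverse) pairs.
REVERSIBLE = {
    "hflip": (hflip, hflip),
    "rot90": (rot90, rot90_inv),
}


def is_round_trip_exact(forward, inverse, image):
    """True if inverse(forward(image)) reproduces the image pixel for pixel."""
    return np.array_equal(inverse(forward(image)), image)
```

Reversibility matters because the predictions made under different augmentations must be brought back into the pixel grid of the original image before a per-pixel variance can be computed; a crop discards pixels and so has no exact inverse.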
3. The method for detecting a salient object based on unsupervised learning according to claim 1, further comprising, before performing the image-level screening of the pseudo labels according to the ranking result:
deleting, by using prior knowledge, the pseudo labels whose salient pixel area or non-salient pixel area is larger than a preset range.
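One possible reading of the prior-knowledge filter in claim 3 is sketched below: a pseudo label is discarded when the fraction of salient pixels, or the fraction of non-salient pixels, exceeds a preset limit. The binarization threshold of 0.5 and the limits `max_salient` and `max_non_salient` are hypothetical values chosen for illustration; the claim only states that a preset range is used.

```python
import numpy as np


def salient_ratio(pseudo_label, binarize_at=0.5):
    """Fraction of pixels treated as salient after binarization (assumed 0.5)."""
    return float((np.asarray(pseudo_label) >= binarize_at).mean())


def passes_area_prior(pseudo_label, max_salient=0.8, max_non_salient=0.98):
    """Keep a pseudo label only if neither region is implausibly large.

    Both limits are hypothetical; they encode the prior that a salient object
    rarely fills almost the whole image and is rarely almost absent.
    """
    ratio = salient_ratio(pseudo_label)
    return ratio <= max_salient and (1.0 - ratio) <= max_non_salient


def filter_by_area_prior(pseudo_labels, **limits):
    """Return the indices of the pseudo labels that satisfy the area prior."""
    return [i for i, p in enumerate(pseudo_labels) if passes_area_prior(p, **limits)]
```

With these example limits, an all-salient map and an all-background map are both rejected, while a map with a quarter of its pixels salient is kept.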
4. A salient object detection system based on unsupervised learning, comprising:
a pseudo label obtaining unit, configured to obtain target domain samples, wherein the label of each target domain sample is a pseudo label obtained by predicting a target domain image with the model obtained in the previous iteration;
an uncertainty evaluation unit, configured to evaluate, by means of variance, the consistency of the saliency prediction probability maps of the target domain image under different data augmentations, so as to evaluate the uncertainty of the pseudo label, and to rank the target domain samples according to the uncertainty score of each target domain sample;
the saliency prediction probability map generated under each data augmentation is given by the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the saliency prediction map of the target domain image; I_t represents the target domain image; α_j(·) represents the j-th data augmentation mode; α_j^{-1}(·) represents the inverse transform operation of α_j(·); and the remaining operator represents the model operation used to generate the pseudo label of the target domain image;
the consistency of the saliency prediction probability maps is evaluated using variance according to the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the variance map of the target domain image; E represents the averaging operation; the argument of the formula is the saliency prediction map of the target domain image; and N represents the number of data augmentations;
the uncertainty score of each target domain sample is obtained by the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the uncertainty score of the target domain image; H represents the height of the target domain image; W represents the width of the target domain image; h represents the coordinate value of the target domain image in the vertical direction; w represents the coordinate value of the target domain image in the horizontal direction; and the indexed term represents the value of the variance map of the target domain image at coordinates (h, w);
a screening unit, configured to perform, according to the ranking result, image-level screening of the pseudo labels to obtain target domain samples whose pseudo-label uncertainty is lower than a preset threshold;
a weighting processing unit, configured to perform pixel-level pseudo-label re-weighting on the target domain samples to obtain the target domain samples for the next training iteration, wherein the pixel-level pseudo-label re-weighting weight of a target domain sample is obtained by the following formula:
[formula image not reproduced in this text]
where the left-hand side represents the pixel-level pseudo-label re-weighting weight of the target domain image; k represents the amplitude by which the pixel-level pseudo-label weight of the target domain image decreases; and the remaining term represents the variance map of the target domain image.
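The screening unit and the weighting processing unit of claim 4 can be illustrated with the hedged sketch below. The screening step keeps samples whose uncertainty score is below a preset threshold, in ranked order. For the pixel weights, the claim's formula is not available in this text, so an exponential decay exp(-k·V) is assumed purely for illustration; the actual functional form may differ. The weighted binary cross-entropy is likewise a hypothetical example of how such weights might enter training, and every function name here is an assumption.

```python
import numpy as np


def select_low_uncertainty(scores, threshold):
    """Indices of samples with score below the threshold, most certain first."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)  # ascending: lowest uncertainty first
    return [int(i) for i in order if scores[i] < threshold]


def pixel_weights(var_map, k):
    """Per-pixel weights that shrink as the variance grows.

    ASSUMPTION: exponential decay exp(-k * V). The claim only says that k
    controls how strongly the weight decreases with the variance map.
    """
    return np.exp(-k * np.asarray(var_map, dtype=float))


def weighted_bce(pred, pseudo_label, weights, eps=1e-7):
    """Weighted binary cross-entropy against the pseudo label (hypothetical use)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    bce = -(pseudo_label * np.log(pred) + (1.0 - pseudo_label) * np.log(1.0 - pred))
    return float((weights * bce).sum() / weights.sum())
```

Under this assumed form, a pixel with zero variance keeps weight 1, and pixels on which the augmented predictions disagree contribute less to the loss, so unreliable regions of an otherwise accepted pseudo label are down-weighted rather than discarded.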
5. The salient object detection system based on unsupervised learning according to claim 4, wherein the data augmentation is a reversible data augmentation.
6. The salient object detection system based on unsupervised learning according to claim 4, wherein the screening unit is further configured to:
delete, by using prior knowledge, the pseudo labels whose salient pixel area or non-salient pixel area is larger than a preset range.
7. A terminal device, comprising:
one or more processors;
a memory coupled to the processor for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the unsupervised learning-based salient object detection method of any one of claims 1 to 3.
8. A computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the unsupervised learning-based salient object detection method according to any one of claims 1 to 3.
CN202110665987.9A 2021-06-16 2021-06-16 Method and system for detecting salient object based on unsupervised learning Active CN113326886B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110665987.9A CN113326886B (en) 2021-06-16 2021-06-16 Method and system for detecting salient object based on unsupervised learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110665987.9A CN113326886B (en) 2021-06-16 2021-06-16 Method and system for detecting salient object based on unsupervised learning

Publications (2)

Publication Number Publication Date
CN113326886A CN113326886A (en) 2021-08-31
CN113326886B true CN113326886B (en) 2023-09-15

Family

ID=77421009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110665987.9A Active CN113326886B (en) 2021-06-16 2021-06-16 Method and system for detecting salient object based on unsupervised learning

Country Status (1)

Country Link
CN (1) CN113326886B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114332489B (en) * 2022-03-15 2022-06-24 江西财经大学 Image salient target detection method and system based on uncertainty perception

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108399406A (en) * 2018-01-15 2018-08-14 中山大学 The method and system of Weakly supervised conspicuousness object detection based on deep learning
CN111680702A (en) * 2020-05-28 2020-09-18 杭州电子科技大学 Method for realizing weak supervision image significance detection by using detection frame
CN112541928A (en) * 2020-12-18 2021-03-23 上海商汤智能科技有限公司 Network training method and device, image segmentation method and device and electronic equipment
CN112598053A (en) * 2020-12-21 2021-04-02 西北工业大学 Active significance target detection method based on semi-supervised learning
CN112861842A (en) * 2021-03-22 2021-05-28 天津汇智星源信息技术有限公司 Case text recognition method based on OCR and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Semi-Supervised Video Salient Object Detection Using Pseudo-Labels;Pengxiang Yan 等;《2019 IEEE/CVF International Conference on Computer Vision》;第7283-7292页 *
Weakly Supervised Salient Object Detection Using Image Labels;Guanbin Li 等;《The Thirty-Second AAAI Conference on Artificial Intelligence》;第7024-7031页 *

Also Published As

Publication number Publication date
CN113326886A (en) 2021-08-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant