CN106127197B - Image saliency target detection method and device based on saliency label sorting - Google Patents

Image saliency target detection method and device based on saliency label sorting

Publication number
CN106127197B
CN106127197B (application CN201610219337.0A)
Authority
CN
China
Prior art keywords
image
region
label
saliency
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610219337.0A
Other languages
Chinese (zh)
Other versions
CN106127197A (en)
Inventor
郎丛妍
李尊
何伟明
于兆鹏
杜雪涛
杜刚
朱艳云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
China Mobile Group Design Institute Co Ltd
Original Assignee
Beijing Jiaotong University
China Mobile Group Design Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University, China Mobile Group Design Institute Co Ltd filed Critical Beijing Jiaotong University
Priority to CN201610219337.0A
Publication of CN106127197A
Application granted
Publication of CN106127197B
Legal status: Expired - Fee Related (current)
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V 10/225 - Image preprocessing by selection of a specific region, based on a marking or identifier characterising the area
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; extraction of features in feature space; blind source separation
    • G06F 18/214 - Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/25 - Fusion techniques
    • G06F 18/253 - Fusion techniques of extracted features
    • G06V 2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 - Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the invention provide a method and a device for detecting salient targets in images based on saliency label ranking. The method mainly comprises the following steps: dividing each image in an image sample set into a number of image regions with the SLIC segmentation method and extracting visual features and background contrast features for each image region; forming a training set and a test set from the visual features, background contrast features and saliency value labels of each image region, and learning the saliency value of each image region of each image with an algorithm based on saliency label ranking; and recovering the saliency value of each region in the image by low-rank matrix recovery theory to detect the salient targets in the image. The method makes full use of the nuclear norm of the matrix to control model complexity, combines visual feature similarity with semantic label similarity, and exploits their correlation through a graph Laplacian regularization constraint, thereby effectively addressing the problem that the saliency label space is large while the number of training images is limited.

Description

Image saliency target detection method and device based on saliency label sorting
Technical Field
The invention relates to the technical field of image processing, and in particular to a method and a device for detecting salient targets in images based on saliency label ranking.
Background
In recent years, with the rapid development of Internet and multimedia information technology, multimedia information carried by images has become an important means for people to transmit and acquire information. However, compared with the explosive growth of image data, the computational resources available to process multimedia information are very limited. Saliency detection technology, which mimics the information selection capability of the human cognitive system to extract content of interest from a complex image, therefore allows massive multimedia visual resources to be used reasonably and effectively, and plays an important role in the field of image analysis and understanding.
Recently, data-driven top-down methods have achieved good results in image saliency extraction. Existing supervised algorithms treat the saliency detection problem as a binary classification or regression problem and, in order to learn a reliable model, mostly depend on large-scale training data sets, which is a significant limitation. It is therefore necessary to develop a simple and effective salient target detection algorithm.
Disclosure of Invention
The embodiments of the invention provide a method and a device for detecting salient targets in images based on saliency label sorting, which are used to detect salient targets in an image effectively.
In order to achieve this purpose, the invention adopts the following technical solutions.
According to one aspect of the invention, an image salient target detection method based on saliency label sorting is provided, comprising the following steps:
establishing an image sample set, dividing each image in the image sample set into a plurality of image regions by using the superpixel segmentation method SLIC (Simple Linear Iterative Clustering), and extracting visual features and background contrast features for each image region;
extracting the salient target of each image in the image sample set by using an image saliency detection algorithm to obtain a saliency value label for each image region in each image;
forming a training set and a test set from the visual features, the background contrast features and the saliency value labels of each image region, and learning the saliency value of each image region in each image by using an algorithm based on saliency label ranking;
and recovering a saliency map of each image from the saliency values of the image regions by using low-rank matrix recovery theory, and detecting the salient target in the image.
Further, the extraction of visual features and background contrast features for each image region includes:
the visual features comprise color features and texture features, wherein the color features comprise the average RGB, LAB and HSV color values of the pixels contained in each image region and the corresponding color-space histograms; the texture features comprise the LBP and LM filter distribution features of the image region; for the background contrast feature, a certain number of regions along the image border are used as the background, and the color and texture features of the background region and the contrast features between each region and the background are extracted respectively;
the background contrast feature of an image region is defined as follows:
for each image region, the peripheral regions along the boundary are used as a pseudo-background region, and the background value b_t of image region R_t may be expressed in terms of the feature vector v_B^i of each small region in the pseudo-background and the overall feature vector v_B of the entire pseudo-background, where B represents the entire pseudo-background area;
with respect to the pseudo-background, the background contrast feature of image region R_t is defined as follows:

c_t^{(c)} = \sum_{j=1}^{N} \lambda_j \, w(p_t, p_j) \, \chi^2(v_t^{(c)}, v_j^{(c)}), \qquad w(p_t, p_j) = \exp(-\|p_t - p_j\|^2 / \sigma^2)

where λ_j is a constraint parameter on the area of region R_j, p_t and p_j are the average position coordinates of the corresponding image regions R_t and R_j, σ is a spatial weight coefficient, N is the total number of image regions, and χ²(v_t^{(c)}, v_j^{(c)}) denotes the histogram distance between the feature vectors v_t^{(c)} and v_j^{(c)} in each channel c;
and concatenating the various visual features and background contrast features of the image region to obtain the feature vector of the image region.
Further, learning the saliency value of each image region in each image by using the algorithm based on saliency label ranking comprises:
regarding the saliency value labels of the regions as 256 classes, taking the saliency value of an image region as its positive label and the complement of that value in the set {0, 1, …, 255} as the negative labels of the image region, composing a sample set from the positive labels, the negative labels and the feature vectors of the image regions, selecting one part of the sample set as a training set and using the remaining part as a test set;
and establishing a parametric salient target detection model framework from the training set and the test set, establishing an error loss model, and then optimizing the salient target detection model with the error loss model to obtain its parameters and thereby the saliency value of each image region in each image.
Further, establishing the parametric salient target detection model framework from the training set and the test set, establishing the error loss model, and then optimizing the salient target detection model with the error loss model to obtain its parameters and the saliency value of each image region in each image includes:
saliency detection is regarded as a multi-class problem and a classification model is found through a ranking-based multi-label learning algorithm. The training set of all image region features of each image is denoted I = {r_1, r_2, …, r_n}, where each image region feature r_i ∈ R^d is a d-dimensional vector and n is the size of the training set. The saliency labels corresponding to all image regions of each image are denoted τ = {l_1, l_2, …, l_m}, and y = (y_1, y_2, …, y_n) ∈ {0,1}^{m×n} denotes the corresponding saliency labels in the training set, where y_i ∈ {0,1}^m indicates the saliency labels assigned to the i-th region, y_{ji} = 1 denotes that saliency label l_j is assigned to region r_i, and y_{ji} = 0 otherwise; the m labels correspond to the saliency values in the set {0, 1, …, 255}.
For an image region r_i, if y_{ji} = 1 and y_{ki} = 0, the ranking function f_j(r) of the j-th label is predicted with a multi-label ranking method; for this image region r_i, the loss between the positive and negative labels is defined as follows:

\varepsilon_{j,k}(r, y) = I(y_j \neq y_k)\,\ell\big((y_j - y_k)(f_j(r) - f_k(r))\big)   (1)

where I(z) denotes the indicator function, outputting 1 when z is true and 0 otherwise. The prediction function is represented by a linear function, defined as f_i(g) = w_i^T g, where W = [w_1, w_2, …, w_m] ∈ R^{d×m}. According to equation (1), the error loss model over all image regions in the training set is defined as follows:

\varepsilon(W) = \sum_{i=1}^{n} \sum_{j:\,y_{ji}=1} \; \sum_{k:\,y_{ki}=0} \varepsilon_{j,k}(r_i, y_i)   (2)

The error loss model is constrained with regularization: W is regarded as a low-rank matrix and a nuclear norm is introduced, so the minimized loss function is

\min_W \; \varepsilon(W) + \lambda \|W\|_*   (3)

where λ is a constraint parameter.
For two region feature vectors r_i and r_j, a similarity matrix S = [s_{ij}]_{n×n} is defined, where s_{ij} = e^{-\|r_i - r_j\|^2/\sigma^2} if and only if x_i ∈ N_k(r_j) or x_j ∈ N_k(r_i); s_{ij} represents the visual similarity between the two region features, and N_k(r) is the set of the k nearest neighbors of region r. Following graph Laplacian regularization theory, if the visual features of two regions are similar, their corresponding label spaces are also similar. The visual constraint regularization term is defined as follows:

\Omega(W) = \tfrac{1}{2} \sum_{i,j} s_{ij}\, \|W^T r_i - W^T r_j\|^2 = \mathrm{Tr}(W^T X L X^T W)   (4)

where X = [r_1, …, r_n], E is the diagonal matrix with E_{ii} = \sum_j s_{ij}, and L is the Laplacian matrix. Combining equations (3) and (4), the optimization problem is abstracted as the following objective function:

\min_W \; \varepsilon(W) + \lambda \|W\|_* + \alpha\, \mathrm{Tr}(W^T X L X^T W)   (5)

where α and λ are balance parameters and L = E^{-1/2}(E - S)E^{-1/2} is the normalized graph Laplacian matrix.
Equation (5) is solved with the APG method: the feature similarity matrix L of the training set is computed first, and W is then solved iteratively as

W_{t+1} = \arg\min_W \; \tfrac{1}{2} \|W - (W_t - \eta_t \nabla f(W_t))\|_F^2 + \eta_t \lambda \|W\|_*

whose solution is W_{t+1} = U \Sigma_\lambda V^T, where \nabla f(W_t) is the gradient of the smooth part f at W_t, W'_t = W_t - \eta_t \nabla f(W_t) with singular value decomposition W'_t = U \Sigma V^T, \Sigma_\lambda is the diagonal matrix computed as (\Sigma_\lambda)_{ii} = \max(\Sigma_{ii} - \eta_t \lambda, 0), and η_t is the update step size.
According to another aspect of the invention, an image salient target detection apparatus based on saliency label sorting is provided, comprising:
an image region feature acquisition module, configured to establish an image sample set, divide each image in the image sample set into a plurality of image regions by using the superpixel segmentation method SLIC (Simple Linear Iterative Clustering), and extract visual features and background contrast features for each image region;
an image region saliency value label acquisition module, configured to extract the salient target of each image in the image sample set by using an image saliency detection algorithm to obtain the saliency value label of each image region in each image;
an image region saliency value acquisition module, configured to form a training set and a test set from the visual features, the background contrast features and the saliency value labels of each image region, and to learn the saliency value of each image region in each image by using an algorithm based on saliency label ranking;
and a salient target acquisition module, configured to recover the saliency map of each image from the saliency values of the image regions by using low-rank matrix recovery theory and to detect the salient target in the image.
Further, the image region feature acquisition module is specifically configured such that the visual features comprise color features and texture features, wherein the color features comprise the average RGB, LAB and HSV color values of the pixels contained in each image region and the corresponding color-space histograms; the texture features comprise the LBP and LM filter distribution features of the image region; for the background contrast feature, a certain number of regions along the image border are used as the background, and the color and texture features of the background region and the contrast features between each region and the background are extracted respectively;
the background contrast feature of an image region is defined as follows:
for each image region, the peripheral regions along the boundary are used as a pseudo-background region, and the background value b_t of image region R_t may be expressed in terms of the feature vector v_B^i of each small region in the pseudo-background and the overall feature vector v_B of the entire pseudo-background, where B represents the entire pseudo-background area;
with respect to the pseudo-background, the background contrast feature of image region R_t is defined as follows:

c_t^{(c)} = \sum_{j=1}^{N} \lambda_j \, w(p_t, p_j) \, \chi^2(v_t^{(c)}, v_j^{(c)}), \qquad w(p_t, p_j) = \exp(-\|p_t - p_j\|^2 / \sigma^2)

where λ_j is a constraint parameter on the area of region R_j, p_t and p_j are the average position coordinates of the corresponding image regions R_t and R_j, σ is a spatial weight coefficient, N is the total number of image regions, and χ²(v_t^{(c)}, v_j^{(c)}) denotes the histogram distance between the feature vectors v_t^{(c)} and v_j^{(c)} in each channel c;
and the various visual features and background contrast features of the image region are concatenated to obtain the feature vector of the image region.
Further, the image region saliency value acquisition module is specifically configured to regard the saliency value labels of the regions as 256 classes, take the saliency value of an image region as its positive label and the complement of that value in the set {0, 1, …, 255} as the negative labels of the image region, compose a sample set from the positive labels, the negative labels and the feature vectors of the image regions, select one part of the sample set as a training set, and use the remaining part as a test set;
and to establish a parametric salient target detection model framework from the training set and the test set, establish an error loss model, and then optimize the salient target detection model with the error loss model to obtain its parameters and thereby the saliency value of each image region in each image.
Further, the image region saliency value acquisition module is specifically configured to treat saliency detection as a multi-class problem and to find a classification model through a ranking-based multi-label learning algorithm. The training set of all image region features of each image is denoted I = {r_1, r_2, …, r_n}, where each image region feature r_i ∈ R^d is a d-dimensional vector and n is the size of the training set. The saliency labels corresponding to all image regions of each image are denoted τ = {l_1, l_2, …, l_m}, and y = (y_1, y_2, …, y_n) ∈ {0,1}^{m×n} denotes the corresponding saliency labels in the training set, where y_i ∈ {0,1}^m indicates the saliency labels assigned to the i-th region, y_{ji} = 1 denotes that saliency label l_j is assigned to region r_i, and y_{ji} = 0 otherwise; the m labels correspond to the saliency values in the set {0, 1, …, 255}.
For an image region r_i, if y_{ji} = 1 and y_{ki} = 0, the ranking function f_j(r) of the j-th label is predicted with a multi-label ranking method; for this image region r_i, the loss between the positive and negative labels is defined as follows:

\varepsilon_{j,k}(r, y) = I(y_j \neq y_k)\,\ell\big((y_j - y_k)(f_j(r) - f_k(r))\big)   (1)

where I(z) denotes the indicator function, outputting 1 when z is true and 0 otherwise. The prediction function is represented by a linear function, defined as f_i(g) = w_i^T g, where W = [w_1, w_2, …, w_m] ∈ R^{d×m}. According to equation (1), the error loss model over all image regions in the training set is defined as follows:

\varepsilon(W) = \sum_{i=1}^{n} \sum_{j:\,y_{ji}=1} \; \sum_{k:\,y_{ki}=0} \varepsilon_{j,k}(r_i, y_i)   (2)

The error loss model is constrained with regularization: W is regarded as a low-rank matrix and a nuclear norm is introduced, so the minimized loss function is

\min_W \; \varepsilon(W) + \lambda \|W\|_*   (3)

where λ is a constraint parameter.
For two region feature vectors r_i and r_j, a similarity matrix S = [s_{ij}]_{n×n} is defined, where s_{ij} = e^{-\|r_i - r_j\|^2/\sigma^2} if and only if x_i ∈ N_k(r_j) or x_j ∈ N_k(r_i); s_{ij} represents the visual similarity between the two region features, and N_k(r) is the set of the k nearest neighbors of region r. Following graph Laplacian regularization theory, if the visual features of two regions are similar, their corresponding label spaces are also similar. The visual constraint regularization term is defined as follows:

\Omega(W) = \tfrac{1}{2} \sum_{i,j} s_{ij}\, \|W^T r_i - W^T r_j\|^2 = \mathrm{Tr}(W^T X L X^T W)   (4)

where X = [r_1, …, r_n], E is the diagonal matrix with E_{ii} = \sum_j s_{ij}, and L is the Laplacian matrix. Combining equations (3) and (4), the optimization problem is abstracted as the following objective function:

\min_W \; \varepsilon(W) + \lambda \|W\|_* + \alpha\, \mathrm{Tr}(W^T X L X^T W)   (5)

where α and λ are balance parameters and L = E^{-1/2}(E - S)E^{-1/2} is the normalized graph Laplacian matrix.
Equation (5) is solved with the APG method: the feature similarity matrix L of the training set is computed first, and W is then solved iteratively as

W_{t+1} = \arg\min_W \; \tfrac{1}{2} \|W - (W_t - \eta_t \nabla f(W_t))\|_F^2 + \eta_t \lambda \|W\|_*

whose solution is W_{t+1} = U \Sigma_\lambda V^T, where \nabla f(W_t) is the gradient of the smooth part f at W_t, W'_t = W_t - \eta_t \nabla f(W_t) with singular value decomposition W'_t = U \Sigma V^T, \Sigma_\lambda is the diagonal matrix computed as (\Sigma_\lambda)_{ii} = \max(\Sigma_{ii} - \eta_t \lambda, 0), and η_t is the update step size.
According to the technical solutions provided by the embodiments of the invention, the method makes full use of the nuclear norm of the matrix to control model complexity, combines visual feature similarity with semantic label similarity, and exploits their correlation through the graph Laplacian regularization constraint, thereby effectively solving the problem that the saliency label space is large while the number of training images is limited.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive labor.
Fig. 1 is a flowchart of an image saliency target detection method based on saliency tag sorting according to an embodiment of the present invention;
FIG. 2 is a schematic model diagram of an image saliency target detection algorithm based on saliency label sorting according to an embodiment of the present invention;
Fig. 3 is a specific structural diagram of an image saliency target detection apparatus based on saliency label sorting according to an embodiment of the present invention, comprising an image region feature acquisition module 31, an image region saliency value label acquisition module 32, an image region saliency value acquisition module 33, and a salient target acquisition module 34.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples in conjunction with the drawings, and the embodiments are not to be construed as limiting the embodiments of the present invention.
Example one
An embodiment of the present invention provides a flow chart of a method for detecting an image saliency target based on saliency label sorting, as shown in fig. 1, the method includes the following steps:
step S110: establishing an image sample set from existing data sets containing a salient object;
the data set includes MSRA1000, ECSSD, and ICOSEG.
Step S120: each image in the image sample set is divided into t image regions using the SLIC (Simple Linear Iterative Clustering) segmentation method, where t is a natural number, preferably 150. Visual features and background contrast features are extracted for each image region, where the visual features comprise color features and texture features. Each image in the image sample set is represented using the features of its image regions.
The color features of an image comprise the average RGB, LAB and HSV color values of the pixels contained in each image region and the corresponding color-space histograms; the texture features comprise the LBP and LM filter distribution features of the image region; the background features use a certain number of regions along the image border as the background region, and the color and texture features of the background region and the contrast features between each region and the background are extracted respectively.
The color features are calculated as follows. Research shows that for salient target detection the RGB and LAB color spaces play complementary roles, and the HSV color space more accurately describes the visual perception of the human eye. For each segmented image region, its average RGB, LAB and HSV colors are therefore taken as color features, forming the color contrast vector of each image region.
The texture features are calculated as follows: LBP and LM filter responses are used as texture feature descriptors. An 8×8 LBP (Local Binary Pattern) histogram is extracted for each image region, and the χ² distance between the LBP histograms of two adjacent regions is computed as

\chi^2(h_i, h_j) = \sum_b \frac{(h_i(b) - h_j(b))^2}{h_i(b) + h_j(b)}

where h_i denotes the LBP histogram of image region i.
In the same way, an 8×8 LM filter histogram is extracted for each image region, and the χ² distance between the LM filter histograms of two adjacent regions is computed.
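By way of illustration, the segmentation and per-region feature extraction of this step can be sketched as follows (a minimal sketch using scikit-image; the bin counts and helper names are illustrative assumptions, and the LM filter responses and full color-space histograms are omitted for brevity; this is not the patent's reference implementation):

```python
import numpy as np
from skimage import color, segmentation
from skimage.feature import local_binary_pattern

def chi2_distance(h1, h2, eps=1e-10):
    # chi-square distance between two histograms, as in the formula above
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def region_features(rgb, n_segments=150):
    # SLIC superpixels: labels[i, j] is the region index of pixel (i, j)
    labels = segmentation.slic(rgb, n_segments=n_segments, start_label=0)
    lab, hsv = color.rgb2lab(rgb), color.rgb2hsv(rgb)
    # uniform LBP with P=8 neighbors yields pattern codes 0..9
    lbp = local_binary_pattern(color.rgb2gray(rgb), P=8, R=1.0, method='uniform')

    feats = []
    for t in range(labels.max() + 1):
        mask = labels == t
        # average RGB / LAB / HSV color values of the pixels in region t
        mean_colors = np.concatenate([img[mask].mean(axis=0) for img in (rgb, lab, hsv)])
        # LBP texture histogram of region t
        h_lbp, _ = np.histogram(lbp[mask], bins=10, range=(0, 10), density=True)
        feats.append(np.concatenate([mean_colors, h_lbp]))
    return labels, np.array(feats)
```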
The background contrast feature is calculated as follows. In salient target detection, the background contrast feature often acts as a suppression feature that assists the extraction of salient targets. Using the peripheral regions along the image boundary as a pseudo-background region, the background value b_t of image region R_t may be expressed in terms of the feature vector v_B^i of each small region in the pseudo-background and the overall feature vector v_B of the entire pseudo-background, where B represents the entire pseudo-background area; v_B is obtained by the calculation described above. Since the feature vector of the pseudo-background consists mainly of color and texture, its computation is analogous to the feature-vector extraction for a region. With respect to the pseudo-background, the background contrast feature of image region R_t is defined as follows:

c_t^{(c)} = \sum_{j=1}^{N} \lambda_j \, w(p_t, p_j) \, \chi^2(v_t^{(c)}, v_j^{(c)}), \qquad w(p_t, p_j) = \exp(-\|p_t - p_j\|^2 / \sigma^2)

where λ_j is a constraint parameter on the area of region R_j; p_t and p_j are the average position coordinates of the pixels in the corresponding image regions R_t and R_j; σ is a spatial weight coefficient; N is the total number of image regions; and χ²(v_t^{(c)}, v_j^{(c)}) denotes the histogram distance between the feature vectors v_t^{(c)} and v_j^{(c)} in each channel c. In digital image processing an image consists of the three color channels R, G and B, and the per-channel feature vectors are obtained by computing features on these three channels.
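A minimal sketch of this background contrast computation under the reconstructed formula above (treating the area-based weights λ_j and the χ²-to-pseudo-background form of the background value as assumptions, since the original background-value equation survives only as an image):

```python
import numpy as np

def background_contrast(region_feats, positions, areas, border_ids, sigma=0.25):
    """Spatially weighted contrast of every region against all other regions.

    region_feats: (N, d) per-region histogram features
    positions:    (N, 2) average (normalized) pixel coordinates of each region
    areas:        (N,)   region areas, used here as the lambda_j weights
    border_ids:   indices of the regions touching the image border (pseudo-background B)
    """
    n = len(region_feats)
    lam = areas / areas.sum()                       # lambda_j: area constraint parameter
    contrast = np.zeros(n)
    for t in range(n):
        d_pos = np.sum((positions[t] - positions) ** 2, axis=1)
        w = np.exp(-d_pos / sigma ** 2)             # spatial weight w(p_t, p_j)
        chi2 = np.sum((region_feats[t] - region_feats) ** 2
                      / (region_feats[t] + region_feats + 1e-10), axis=1)
        contrast[t] = np.sum(lam * w * chi2)        # c_t = sum_j lambda_j * w * chi^2
    # background value b_t, sketched as the chi-square distance to the
    # overall pseudo-background feature vector v_B (an assumption)
    v_b = region_feats[border_ids].mean(axis=0)
    bg_value = np.array([np.sum((f - v_b) ** 2 / (f + v_b + 1e-10))
                         for f in region_feats])
    return contrast, bg_value
```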
In this way regional contrast features are obtained for the three color spaces, and each image region finally represents its color features with a 28-dimensional feature vector.
By concatenating all of the features above, a 74-dimensional feature vector is obtained for each image region.
Step S130: the salient target of each image in the sample set is extracted using an existing image saliency detection algorithm to obtain the saliency value label of each image region in each image. The existing image saliency detection algorithm is based on structured matrix decomposition and adopts the idea of separating the background from the target in an image to obtain a saliency map of the image; the average of the saliency values of the pixels in each image region is then taken as the saliency value label of that region.
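For instance, once the pixel-level saliency map of any existing detector is available, the per-region saliency value labels can be derived as follows (a sketch; saliency_map is assumed to be scaled to [0, 255]):

```python
import numpy as np

def region_saliency_labels(saliency_map, labels):
    # saliency_map: per-pixel saliency in [0, 255]; labels: SLIC region index per pixel
    n = labels.max() + 1
    # mean pixel saliency of each region, rounded to one of the 256 label classes
    return np.array([int(round(saliency_map[labels == t].mean())) for t in range(n)])
```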
Step S140: the visual features of the image regions and the corresponding saliency value label set are divided into a training set and a test set, and the saliency value corresponding to each image region in each image is learned using an algorithm based on saliency label ranking.
The saliency value labels of the regions are regarded as 256 classes: the saliency value of an image region is taken as its positive label, and the complement of that value in the set {0, 1, …, 255} as the negative labels of the image region; the positive labels, negative labels and feature vectors of the image regions compose a sample set, from which one part is selected as the training set and the remaining part is used as the test set.
A parametric salient target detection model framework is established from the training set and the test set, an error loss model is then established, and the salient target detection model is optimized with the error loss model to obtain its parameters and thereby the saliency value of each image region in each image.
Through the proposed optimization algorithm, a model parameter matrix W can be trained on the training data set. Then, for each region of each image, the probabilities of the region's saliency value labels are obtained by multiplying the feature vector x by W; the probabilities are sorted in descending order, and the column coordinate of W corresponding to the maximum probability value is taken as the saliency value of all pixels in that region.
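In other words, at test time the learned matrix W scores all 256 labels for a region and the top-ranked label is taken as the region's saliency value; a minimal sketch under this reading:

```python
import numpy as np

def predict_region_saliency(W, x):
    # W: (d, 256) learned label-ranking matrix; x: (d,) region feature vector
    scores = W.T @ x               # one ranking score per saliency label 0..255
    return int(np.argmax(scores))  # the top-ranked label is the saliency value
```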
Fig. 2 shows the model diagram of the image saliency target detection algorithm based on saliency label ranking provided by the invention. As shown in fig. 2, saliency detection is regarded as a multi-class problem and a classification model is found through a ranking-based multi-label learning algorithm. This algorithm is particularly suitable for learning large-scale classes from a limited training sample set. First, the training set of all region features from step S120 is denoted I = {r_1, r_2, …, r_n}, where each region feature r_i ∈ R^d is a d-dimensional vector and n is the size of the training set. The saliency labels corresponding to all regions from step S130 are denoted τ = {l_1, l_2, …, l_m}, and y = (y_1, y_2, …, y_n) ∈ {0,1}^{m×n} denotes the saliency labels corresponding to the training feature set, where y_i ∈ {0,1}^m indicates the saliency labels assigned to the i-th region, y_{ji} = 1 denotes that saliency label l_j is assigned to region r_i, and y_{ji} = 0 otherwise; the m labels correspond to the saliency values in the set {0, 1, …, 255}.
The invention addresses this salient target detection problem as follows. For region r_i, if y_{ji} = 1 and y_{ki} = 0, a multi-label ranking method is used to predict the ranking function f_j(r) of the j-th label; this function assigns label l_j a high saliency score and label l_k a low saliency score. The loss between the positive and negative labels of this region is therefore defined as follows:

\varepsilon_{j,k}(r, y) = I(y_j \neq y_k)\,\ell\big((y_j - y_k)(f_j(r) - f_k(r))\big)   (1)

where I(z) denotes the indicator function, outputting 1 when z is true and 0 otherwise. For convenient and efficient calculation, the prediction function is expressed as a linear function, defined as f_i(g) = w_i^T g, where W = [w_1, w_2, …, w_m] ∈ R^{d×m}. Then, according to equation (1), the loss function over all regions in the training set is defined as follows:

\varepsilon(W) = \sum_{i=1}^{n} \sum_{j:\,y_{ji}=1} \; \sum_{k:\,y_{ki}=0} \varepsilon_{j,k}(r_i, y_i)   (2)

To control model complexity and prevent overfitting, the model is constrained with regularization: W is regarded as a low-rank matrix and a nuclear norm is introduced, so the minimized loss function is

\min_W \; \varepsilon(W) + \lambda \|W\|_*   (3)

where λ is a constraint parameter.
In addition, to better solve the salient target detection problem, the visual similarity of image regions is fully considered. For two region features r_i and r_j, a similarity matrix S = [s_{ij}]_{n×n} is defined, where s_{ij} = e^{-\|r_i - r_j\|^2/\sigma^2} if and only if x_i ∈ N_k(r_j) or x_j ∈ N_k(r_i); s_{ij} represents the visual similarity between the two region features, and N_k(r) is the set of the k nearest neighbors of region r, with k preferably 0.01·n. Then, following graph Laplacian regularization theory, if the visual features of two regions are similar, their corresponding label spaces are also similar. The visual constraint regularization term is defined as follows:

\Omega(W) = \tfrac{1}{2} \sum_{i,j} s_{ij}\, \|W^T r_i - W^T r_j\|^2 = \mathrm{Tr}(W^T X L X^T W)   (4)

where X = [r_1, …, r_n], E is the diagonal matrix with E_{ii} = \sum_j s_{ij}, and L is the Laplacian matrix. Combining equations (3) and (4), the above optimization problem is abstracted into the following objective function:

\min_W \; \varepsilon(W) + \lambda \|W\|_* + \alpha\, \mathrm{Tr}(W^T X L X^T W)   (5)

where α and λ are balance parameters and L = E^{-1/2}(E - S)E^{-1/2} is the normalized graph Laplacian matrix.
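A minimal construction of the k-nearest-neighbor similarity matrix S and the normalized graph Laplacian L used in objective (5) might read as follows (the symmetrization choice reflects the "if and only if … or …" condition; σ and the dense distance computation are illustrative):

```python
import numpy as np

def normalized_laplacian(feats, k, sigma=1.0):
    n = len(feats)
    d2 = np.sum((feats[:, None, :] - feats[None, :, :]) ** 2, axis=2)
    s = np.exp(-d2 / sigma ** 2)
    # keep s_ij only when i is among the k nearest neighbors of j, or vice versa
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    s = np.where(mask | mask.T, s, 0.0)
    deg = s.sum(axis=1)                                    # E_ii = sum_j s_ij
    e_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-10)))
    return e_inv_sqrt @ (np.diag(deg) - s) @ e_inv_sqrt    # L = E^{-1/2}(E - S)E^{-1/2}
```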
Because the designed objective introduces the non-convex interplay of the loss with the nuclear norm, the APG (accelerated proximal gradient) method is used to solve equation (5). First the feature similarity matrix L of the training set is computed, and then W is solved iteratively as

W_{t+1} = \arg\min_W \; \tfrac{1}{2} \|W - (W_t - \eta_t \nabla f(W_t))\|_F^2 + \eta_t \lambda \|W\|_*

whose solution is W_{t+1} = U \Sigma_\lambda V^T, where \nabla f(W_t) is the gradient of the smooth part f at W_t, W'_t = W_t - \eta_t \nabla f(W_t) with singular value decomposition W'_t = U \Sigma V^T, \Sigma_\lambda is the diagonal matrix computed as (\Sigma_\lambda)_{ii} = \max(\Sigma_{ii} - \eta_t \lambda, 0), and η_t is the update step size.
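The proximal step of this APG iteration is singular value thresholding; one update can be sketched as follows (grad_f, the gradient of the smooth part, i.e. the ranking loss plus the Laplacian term, is assumed to be supplied by the caller):

```python
import numpy as np

def apg_step(W_t, grad_f, eta, lam):
    # gradient step on the smooth part f, then the nuclear-norm proximal operator
    W_prime = W_t - eta * grad_f(W_t)
    U, s, Vt = np.linalg.svd(W_prime, full_matrices=False)
    s_thr = np.maximum(s - eta * lam, 0.0)  # soft-threshold the singular values
    return U @ np.diag(s_thr) @ Vt          # W_{t+1} = U Sigma_lambda V^T
```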
Step S150: the saliency values of the image regions are recovered into the images of the sample set by using low-rank matrix recovery theory, the salient targets in the images are detected, and the final saliency map is obtained.
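Projecting the learned region saliency values back to the pixels then yields the final saliency map; a minimal sketch (the low-rank refinement itself is abstracted away here):

```python
import numpy as np

def saliency_map_from_regions(labels, region_saliency):
    # assign every pixel the saliency value learned for its region
    return region_saliency[labels].astype(np.uint8)
```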
Example two
This embodiment provides an image saliency target detection apparatus based on saliency label sorting, whose specific structure is shown in fig. 3. The apparatus comprises:
an image region feature acquisition module 31, configured to establish an image sample set, divide each image in the image sample set into a plurality of image regions by using the SLIC segmentation method, and extract visual features and background contrast features for each image region;
an image region saliency value label acquisition module 32, configured to extract the salient target of each image in the image sample set by using an image saliency detection algorithm to obtain the saliency value label of each image region in each image;
an image region saliency value acquisition module 33, configured to form a training set and a test set from the visual features, background contrast features and saliency value labels of each image region, and to learn the saliency value of each image region in each image by using an algorithm based on saliency label ranking;
and a salient target acquisition module 34, configured to recover the saliency map of each image from the saliency values of the image regions by low-rank matrix recovery theory and to detect the salient target in the image.
Further, the image region feature acquisition module 31 is specifically configured such that the visual features comprise color features and texture features, wherein the color features comprise the average RGB, LAB and HSV color values of the pixels contained in each image region and the corresponding color-space histograms; the texture features comprise the LBP and LM filter distribution features of the image region; for the background contrast feature, a certain number of regions along the image border are used as the background, and the color and texture features of the background region and the contrast features between each region and the background are extracted respectively;
the background contrast feature of an image region is defined as follows:
for each image region, the peripheral regions along the boundary are used as a pseudo-background region, and the background value b_t of image region R_t may be expressed in terms of the feature vector v_B^i of each small region in the pseudo-background and the overall feature vector v_B of the entire pseudo-background, where B represents the entire pseudo-background area;
with respect to the pseudo-background, the background contrast feature of image region R_t is defined as follows:

c_t^{(c)} = \sum_{j=1}^{N} \lambda_j \, w(p_t, p_j) \, \chi^2(v_t^{(c)}, v_j^{(c)}), \qquad w(p_t, p_j) = \exp(-\|p_t - p_j\|^2 / \sigma^2)

where λ_j is a constraint parameter on the area of region R_j, p_t and p_j are the average position coordinates of the corresponding image regions R_t and R_j, σ is a spatial weight coefficient, N is the total number of image regions, and χ²(v_t^{(c)}, v_j^{(c)}) denotes the histogram distance between the feature vectors v_t^{(c)} and v_j^{(c)} in each channel c;
and the various visual features and background contrast features of the image region are concatenated to obtain the feature vector of the image region.
Further, the image region saliency value acquisition module 33 is specifically configured to regard the saliency value labels of the regions as 256 classes, take the saliency value of an image region as its positive label and the complement of that value in the set {0, 1, …, 255} as the negative labels of the image region, compose a sample set from the positive labels, the negative labels and the feature vectors of the image regions, select one part of the sample set as a training set, and use the rest as a test set;
and to establish a parametric salient target detection model framework from the training set and the test set, establish an error loss model, and optimize the salient target detection model with the error loss model to obtain its parameters and thereby the saliency value of each image region in each image.
Saliency detection is regarded as a multi-class problem and a classification model is found through a ranking-based multi-label learning algorithm. The training set of all image region features of each image is denoted I = {r_1, r_2, …, r_n}, where each image region feature r_i ∈ R^d is a d-dimensional vector and n is the size of the training set. The saliency labels corresponding to all image regions of each image are denoted τ = {l_1, l_2, …, l_m}, and y = (y_1, y_2, …, y_n) ∈ {0,1}^{m×n} denotes the corresponding saliency labels in the training set, where y_i ∈ {0,1}^m indicates the saliency labels assigned to the i-th region, y_{ji} = 1 denotes that saliency label l_j is assigned to region r_i, and y_{ji} = 0 otherwise; the m labels correspond to the saliency values in the set {0, 1, …, 255}.
For an image region r_i, if y_{ji} = 1 and y_{ki} = 0, the ranking function f_j(r) of the j-th label is predicted with a multi-label ranking method; for this image region r_i, the loss between the positive and negative labels is defined as follows:

\varepsilon_{j,k}(r, y) = I(y_j \neq y_k)\,\ell\big((y_j - y_k)(f_j(r) - f_k(r))\big)   (1)

where I(z) denotes the indicator function, outputting 1 when z is true and 0 otherwise. The prediction function is represented by a linear function, defined as f_i(g) = w_i^T g, where W = [w_1, w_2, …, w_m] ∈ R^{d×m}. According to equation (1), the error loss model over all image regions in the training set is defined as follows:

\varepsilon(W) = \sum_{i=1}^{n} \sum_{j:\,y_{ji}=1} \; \sum_{k:\,y_{ki}=0} \varepsilon_{j,k}(r_i, y_i)   (2)

The error loss model is constrained with regularization: W is regarded as a low-rank matrix and a nuclear norm is introduced, so the minimized loss function is

\min_W \; \varepsilon(W) + \lambda \|W\|_*   (3)

where λ is a constraint parameter.
For two region feature vectors r_i and r_j, a similarity matrix S = [s_{ij}]_{n×n} is defined, where s_{ij} = e^{-\|r_i - r_j\|^2/\sigma^2} if and only if x_i ∈ N_k(r_j) or x_j ∈ N_k(r_i); s_{ij} represents the visual similarity between the two region features, and N_k(r) is the set of the k nearest neighbors of region r. Following graph Laplacian regularization theory, if the visual features of two regions are similar, their corresponding label spaces are also similar. The visual constraint regularization term is defined as follows:

\Omega(W) = \tfrac{1}{2} \sum_{i,j} s_{ij}\, \|W^T r_i - W^T r_j\|^2 = \mathrm{Tr}(W^T X L X^T W)   (4)

where X = [r_1, …, r_n], E is the diagonal matrix with E_{ii} = \sum_j s_{ij}, and L is the Laplacian matrix. Combining equations (3) and (4), the optimization problem is abstracted as the following objective function:

\min_W \; \varepsilon(W) + \lambda \|W\|_* + \alpha\, \mathrm{Tr}(W^T X L X^T W)   (5)

where α and λ are balance parameters and L = E^{-1/2}(E - S)E^{-1/2} is the normalized graph Laplacian matrix.
Equation (5) is solved with the APG method: the feature similarity matrix L of the training set is computed first, and W is then solved iteratively as

W_{t+1} = \arg\min_W \; \tfrac{1}{2} \|W - (W_t - \eta_t \nabla f(W_t))\|_F^2 + \eta_t \lambda \|W\|_*

whose solution is W_{t+1} = U \Sigma_\lambda V^T, where \nabla f(W_t) is the gradient of the smooth part f at W_t, W'_t = W_t - \eta_t \nabla f(W_t) with singular value decomposition W'_t = U \Sigma V^T, \Sigma_\lambda is the diagonal matrix computed as (\Sigma_\lambda)_{ii} = \max(\Sigma_{ii} - \eta_t \lambda, 0), and η_t is the update step size.
The specific process of detecting the image salient object based on the salient label sorting by using the device of the embodiment of the invention is similar to that of the method embodiment, and is not repeated here.
In summary, the salient target detection algorithm provided by the embodiments of the invention treats the saliency detection problem as a multi-class problem, maps saliency label ranking to a matrix recovery problem, and makes full use of the human visual cognition process by combining visual and contrast features to detect the salient targets in an image.
The salient target detection algorithm provided by the invention makes full use of the nuclear norm of the matrix to control model complexity, combines visual feature similarity with semantic label similarity, and exploits their correlation through the graph Laplacian regularization constraint, thereby effectively solving the problem that the saliency label space is large while the number of training images is limited.
All modules of the system run automatically without manual intervention, can be simply and conveniently embedded into other image semantic analysis systems, and have a wide and universal application prospect.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner; for the same and similar parts among the embodiments, reference may be made between them, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and for relevant parts reference may be made to the partial descriptions of the method embodiments. The above-described embodiments of the apparatus and system are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement them without inventive effort.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. An image salient target detection method based on saliency label sorting, characterized by comprising:
establishing an image sample set, dividing each image in the image sample set into a plurality of image regions by using the superpixel segmentation method SLIC (Simple Linear Iterative Clustering), and extracting visual features and background contrast features for each image region;
extracting the salient target of each image in the image sample set by using an image saliency detection algorithm to obtain a saliency value label for each image region in each image;
forming a training set and a test set from the visual features, the background contrast features and the saliency value labels of each image region, and learning the saliency value of each image region in each image by using an algorithm based on saliency label ranking;
recovering a saliency map of each image from the saliency values of the image regions based on low-rank matrix recovery theory, and detecting the salient target in the image;
wherein extracting the visual features and the background contrast features for each image region comprises:
the visual features comprise color features and texture features, wherein the color features comprise the average RGB, LAB and HSV color values of the pixels contained in each image region and the corresponding color-space histograms; the texture features comprise the LBP and LM filter distribution features of the image region; for the background contrast feature, a certain number of regions along the image border are used as the background, and the color and texture features of the background region and the contrast features between each region and the background are extracted respectively;
the background contrast feature of an image region is defined as follows:
for each image region, the peripheral regions along the boundary are used as a pseudo-background region, and the background value b_t of image region R_t is expressed in terms of the feature vector v_B^i of each small region in the pseudo-background and the overall feature vector v_B of the entire pseudo-background, where B represents the entire pseudo-background area and b is the symbolic representation of a background value;
for image region R_t with respect to the pseudo-background, the background contrast feature of image region R_t is defined as follows:

c_t^{(c)} = \sum_{j=1}^{N_t} \lambda_j \, w(p_t, p_j) \, \chi^2(v_t^{(c)}, v_j^{(c)}), \qquad w(p_t, p_j) = \exp(-\|p_t - p_j\|^2 / \sigma^2)

wherein λ_j is a constraint parameter on the area of region R_j, p_t and p_j are the average position coordinates of the corresponding image regions R_t and R_j, σ is a spatial weight coefficient, N_t is the total number of image regions, χ²(v_t^{(c)}, v_j^{(c)}) denotes the histogram distance between the feature vectors v_t^{(c)} and v_j^{(c)} in each channel, k represents the number of regions adjacent to any region, c represents contrast and is the symbolic representation of a contrast value, and j represents the index of an image region, i.e., the j-th region;
and concatenating the various visual features and background contrast features of the image region to obtain the feature vector of the image region.
2. The image salient target detection method based on saliency label sorting according to claim 1, wherein forming the training set and the test set from the visual features, the background contrast features and the saliency value labels of each image region, and learning the saliency value of each image region in each image using the algorithm based on saliency label ranking comprises:
dividing the estimated saliency values of all regions in the image into 256 classes, where the estimated saliency value of each region is a value between 0 and 255; taking the estimated saliency value of each region as the positive label of that region, and the complement of the estimated value in the full class label set {0, 1, …, 255} as the negative labels of the image region; composing a sample set from the positive labels, the negative labels and the feature vectors of the image regions; and selecting one part of the sample set as a training set and using the other part as a test set;
and establishing a parametric salient target detection model framework from the training set and the test set, establishing an error loss model, and then optimizing the salient target detection model with the error loss model to obtain its parameters and thereby the accurate saliency value of each image region in each image.
3. The image salient target detection method based on saliency label sorting according to claim 2, wherein establishing the parametric salient target detection model framework from the training set and the test set, establishing the error loss model, and then optimizing the salient target detection model with the error loss model to obtain the accurate saliency value of each image region in each image comprises:
treating saliency detection as a multi-class problem and finding a classification model through a ranking-based multi-label learning algorithm; the training set of all image region features of each image is denoted I = {r_1, r_2, …, r_n}, each image region feature r_i ∈ R^d is a d-dimensional vector, n is the size of the training set, and the saliency labels corresponding to all image regions of each image are denoted τ = {l_1, l_2, …, l_m}; y = (y_1, y_2, …, y_n) ∈ {0,1}^{m×n} denotes the corresponding saliency labels in the training set, y_i ∈ {0,1}^m indicates the saliency labels assigned to the i-th region, y_{ji} = 1 denotes that saliency label l_j is assigned to region r_i, and y_{ji} = 0 otherwise; the m labels correspond to the saliency values in the set {0, 1, …, 255};
for the i-th image region feature r_i, if y_{ji} = 1 and y_{ki} = 0, predicting the ranking function f_j(r) of the j-th label with a multi-label ranking method; for this image region r_i, the loss between the positive and negative labels is defined as follows:

\varepsilon_{j,k}(r, y) = I(y_j \neq y_k)\,\ell\big((y_j - y_k)(f_j(r) - f_k(r))\big)   (1)

where I(z) denotes the indicator function, outputting 1 when z is true and 0 otherwise; the prediction function is represented by a linear function, defined as f_i(g) = w_i^T g, where W = [w_1, w_2, …, w_m] ∈ R^{d×m}; according to equation (1), the error loss model over all image regions in the training set is defined as follows:

\varepsilon(W) = \sum_{i=1}^{n} \sum_{j:\,y_{ji}=1} \; \sum_{k:\,y_{ki}=0} \varepsilon_{j,k}(r_i, y_i)   (2)

constraining the error loss model with regularization, regarding W as a low-rank matrix and introducing a nuclear norm, so that the minimized loss function is

\min_W \; \varepsilon(W) + \lambda \|W\|_*   (3)

wherein λ is a constraint parameter;
for two region feature vectors r_i and r_j, defining a similarity matrix S = [s_{ij}]_{n×n}, where s_{ij} = e^{-\|r_i - r_j\|^2/\sigma^2} if and only if x_i ∈ N_k(r_j) or x_j ∈ N_k(r_i), x_i representing the original image block corresponding to the i-th region and x_j the original image block corresponding to the j-th region;
s_{ij} represents the visual similarity between the two region features, and N_k(r) is the set of the k nearest neighbors of region r; following graph Laplacian regularization theory, if the visual features of two regions are similar, their corresponding label spaces are also similar, and the visual constraint regularization term is defined as follows:

\Omega(W) = \tfrac{1}{2} \sum_{i,j} s_{ij}\, \|W^T r_i - W^T r_j\|^2 = \mathrm{Tr}(W^T X L X^T W)   (4)

where X = [r_1, …, r_n], E is the diagonal matrix with E_{ii} = \sum_j s_{ij}, r_i denotes the i-th image region feature, r_j denotes the j-th image region feature, and L is the feature similarity (Laplacian) matrix; combining equations (3) and (4), the optimization problem is abstracted as the following objective function:

\min_W \; \varepsilon(W) + \lambda \|W\|_* + \alpha\, \mathrm{Tr}(W^T X L X^T W)   (5)

wherein α and λ are balance parameters, L = E^{-1/2}(E - S)E^{-1/2} is the normalized graph Laplacian matrix built from the similarity matrix S, each element of which represents the feature similarity between a region and its neighboring regions, and Tr denotes the trace function;
solving equation (5) with the APG method: computing the feature similarity matrix L of the training set first, and then iteratively solving W as

W_{t+1} = \arg\min_W \; \tfrac{1}{2} \|W - (W_t - \eta_t \nabla f(W_t))\|_F^2 + \eta_t \lambda \|W\|_*

whose solution is W_{t+1} = U \Sigma_\lambda V^T, where \nabla f(W_t) is the gradient of f at W_t, W'_t = W_t - \eta_t \nabla f(W_t) with singular value decomposition W'_t = U \Sigma V^T, \Sigma_\lambda is the diagonal matrix computed as (\Sigma_\lambda)_{ii} = \max(\Sigma_{ii} - \eta_t \lambda, 0), η_t is the update step size, U = [u_1, u_2, …, u_m] is the matrix whose columns are the eigenvectors of WW^T, also called the left singular vectors of the matrix W, and similarly V contains the right singular vectors of the matrix W.
4. An image saliency target detection apparatus based on saliency label ordering, comprising:
the image area characteristic acquisition module is used for establishing an image sample set, dividing each image in the image sample set into a plurality of image areas by using a super-pixel segmentation SLIC (linear segmentation in particular) segmentation method, and extracting visual characteristics and background contrast characteristics for each image area;
the image area significant value label acquisition module is used for extracting a significant target from each image in the image sample set by using an image significance detection algorithm to obtain a significant value label of each image area in each image;
the image area significant value acquisition module is used for forming a training set and a test set according to the visual features, the background contrast features and the significant value labels of each image area and learning the significant value of each image area in each image by using an algorithm based on significant label sequencing;
the salient target acquisition module of the image recovers a salient image of each image by using a low-rank matrix recovery theory and the salient value of each image area to detect a salient target in the image;
the image region feature obtaining module is specifically configured to set the visual features to include color features and texture features, where the color features include average RGB, LAB, HSV color values of pixel points included in each image region and corresponding color space histograms; the texture features comprise LBP and LM filter distribution features of the image area; the background contrast characteristic adopts a certain number of peripheral edge regions as a background, and respectively extracts the color texture of the background region and the contrast characteristic between the color texture and the contrast characteristic;
the background contrast characteristic of the image area is defined as follows:
using the peripheral region of the boundary as a pseudo-background region for each image region, image region RtThe background value of (a) may be expressed as:
Figure FDA0002420676690000051
where B represents the entire pseudo-background area,
Figure FDA0002420676690000052
feature vector, v, representing each small region in a pseudo-backgroundBAn overall feature vector representing the entire pseudo-background; image region R for pseudo backgroundtRegion R of an imagetThe background contrast characteristic of (a) is defined as follows:
Figure FDA0002420676690000053
Figure FDA0002420676690000054
wherein λjIs to the region RtConstraint parameter of area, ptAnd pjAre respectively corresponding image areas RtAnd RjIs a spatial weight coefficient, Nt is the total number of images in the image sample set,
Figure FDA0002420676690000055
representing feature vectors in each channel
Figure FDA0002420676690000056
And
Figure FDA0002420676690000057
the histogram distance between;
and splicing various visual features and background contrast features of the image area to obtain a feature vector of the image area.
5. The apparatus according to claim 4, wherein:
the image region significant value acquisition module is specifically configured to divide the estimated values of significant values of all regions in an image into 256 classes, where the estimated value of the significant value of each region is a value between 0 and 255, the estimated value of the significant value of each region is used as a positive label of the region, meanwhile, a complement of the estimated value in all class label sets {0,1 … 255} is used as a negative label corresponding to the image region, the positive label, the negative label and a feature vector of the image region form a sample set, a part of the sample set is selected as a training set, and the rest of the sample set is used as a test set;
and establishing a significance target detection parameter model framework by using the training set and the test set, establishing an error loss model, and then optimizing the significance target detection model by using the error loss model to obtain parameters so as to obtain an accurate value of the significance value of each image area in each image.
6. The apparatus according to claim 5, wherein:
the saliency value acquisition module of the image region is specifically configured to consider saliency detection as a multi-classification problem, find a classified model through a multi-label learning algorithm based on ranking, and represent a training set of all image region features of each image as i ═ r1,r2...rnR, each image area characteristic ri∈RdIs a d-dimensional vector, n is the total number of the training set, and the saliency labels corresponding to all the image regions of each image are represented as τ ═ l1,l2...lmUsing y ═ y }1,y2...yn)∈{0,1}m×nSignificance labels, y, representing correspondences in the training seti∈{0,1}mIndicating the saliency labels assigned to the ith region, using yji1 denotes a saliency label ljIs assigned to region riOn the contrary, yji0; m belongs to the set {0, 1.. 255} and represents a significant value corresponding to the label;
for the ith image region feature riIf y isji1 and ykiWhen the sequence is equal to 0, the sequence function f of the ith label is predicted by using a multi-label sequence methodi(r) for this image area riThe loss between the positive and negative tags is defined as follows:
εj,k(r,y)=I(yj≠yk)l((yj-yk)(fj(r)-fk(r))) (1)
where I (z) represents an indicator function, and outputs 1 when z is true; otherwise, 0 is output, and a linear function is used to represent the prediction function, defined as fi(g)=wi Tg, wherein W ═ W1,w2...wm]∈Rd×mAccording to equation (1), the error loss model for all image regions in the training set is defined as follows:
Figure FDA0002420676690000061
and (3) utilizing regularization to constrain the error loss model, regarding W as a low-rank matrix, and introducing a nuclear norm, wherein a minimized loss function is as follows:
Figure FDA0002420676690000071
wherein λ is a constraint parameter;
for two region feature vectors riAnd rjDefining a similarity matrix S ═ Sij]n×nWherein s isij=e(-||ri-rj||22) If and only if xi∈Nk(rj) Or xj∈Nk(ri),sijRepresenting the visual similarity between two regional features, Nk(r) is k adjacent sets of the region r, and in combination with the graph laplacian regularization theory, if the visual features of the two regions are similar, the corresponding label spaces also have similarity, and a visual constraint regularization term is defined as follows:
Figure FDA0002420676690000072
wherein
Figure FDA0002420676690000073
Is a diagonal matrix, r is the sum of all regions, riRepresenting the ith image area characteristic, rjRepresenting the jth image region feature, L is a feature similarity matrix, in combination with equations (3) (4), abstracting the optimization problem as the following objective function:
Figure FDA0002420676690000074
wherein α is a balance parameter, L ═ E-1/2(E-s)E-1/2S is a normalized graph laplacian matrix, L is a feature similarity matrix, each element in the matrix represents the feature similarity between a certain region and its surrounding neighboring regions,Tr represents a trace function;
solving the formula (5) by using an APG method, solving a feature similarity matrix L of a training set, and iteratively solving W as follows:
Figure FDA0002420676690000075
wherein the optimization problem is solved as
Figure FDA0002420676690000076
Figure FDA0002420676690000077
Is to f (W)t) Obtaining gradient of W't=U∑VTIs the singular value decomposition of W' and,is a diagonal matrix calculated as
Figure FDA0002420676690000079
ηtIs the step size of the update.
CN201610219337.0A 2016-04-09 2016-04-09 Image saliency target detection method and device based on saliency label sorting Expired - Fee Related CN106127197B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610219337.0A CN106127197B (en) 2016-04-09 2016-04-09 Image saliency target detection method and device based on saliency label sorting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610219337.0A CN106127197B (en) 2016-04-09 2016-04-09 Image saliency target detection method and device based on saliency label sorting

Publications (2)

Publication Number Publication Date
CN106127197A CN106127197A (en) 2016-11-16
CN106127197B true CN106127197B (en) 2020-07-07

Family

ID=57269809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610219337.0A Expired - Fee Related CN106127197B (en) 2016-04-09 2016-04-09 Image saliency target detection method and device based on saliency label sorting

Country Status (1)

Country Link
CN (1) CN106127197B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108696732B (en) * 2017-02-17 2023-04-18 北京三星通信技术研究有限公司 Resolution adjustment method and device for head-mounted display device
CN107358245B (en) * 2017-07-19 2020-05-26 安徽大学 Method for detecting image collaborative salient region
CN110288667B (en) * 2018-03-19 2021-03-02 北京大学 Image texture migration method based on structure guidance
CN109063746A (en) * 2018-07-14 2018-12-21 深圳市唯特视科技有限公司 A kind of visual similarity learning method based on depth unsupervised learning
CN109902672A (en) * 2019-01-17 2019-06-18 平安科技(深圳)有限公司 Image labeling method and device, storage medium, computer equipment
CN110084247A (en) * 2019-04-17 2019-08-02 上海师范大学 A kind of multiple dimensioned conspicuousness detection method and device based on fuzzy characteristics
CN111914850B (en) * 2019-05-07 2023-09-19 百度在线网络技术(北京)有限公司 Picture feature extraction method, device, server and medium
CN110738638B (en) * 2019-09-23 2022-08-02 中国海洋大学 Visual saliency detection algorithm applicability prediction and performance blind evaluation method
CN111080678B (en) * 2019-12-31 2022-02-01 重庆大学 Multi-temporal SAR image change detection method based on deep learning
CN113034454B (en) * 2021-03-16 2023-11-24 上海交通大学 Underwater image quality evaluation method based on human visual sense
CN113705579B (en) * 2021-08-27 2024-03-15 河海大学 Automatic image labeling method driven by visual saliency

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542267A (en) * 2011-12-26 2012-07-04 哈尔滨工业大学 Salient region detecting method combining spatial distribution and global contrast
CN103390279A (en) * 2013-07-25 2013-11-13 中国科学院自动化研究所 Target prospect collaborative segmentation method combining significant detection and discriminant study
CN104408708A (en) * 2014-10-29 2015-03-11 兰州理工大学 Global-local-low-rank-based image salient target detection method
CN104463870A (en) * 2014-12-05 2015-03-25 中国科学院大学 Image salient region detection method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7940985B2 (en) * 2007-06-06 2011-05-10 Microsoft Corporation Salient object detection
CN103136766B (en) * 2012-12-28 2015-10-14 上海交通大学 A kind of object conspicuousness detection method based on color contrast and color distribution
KR101537174B1 (en) * 2013-12-17 2015-07-15 가톨릭대학교 산학협력단 Method for extracting salient object from stereoscopic video
CN104966286B (en) * 2015-06-04 2018-01-09 电子科技大学 A kind of 3D saliencies detection method
CN105427314B (en) * 2015-11-23 2018-06-26 西安电子科技大学 SAR image object detection method based on Bayes's conspicuousness

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542267A (en) * 2011-12-26 2012-07-04 哈尔滨工业大学 Salient region detecting method combining spatial distribution and global contrast
CN103390279A (en) * 2013-07-25 2013-11-13 中国科学院自动化研究所 Target prospect collaborative segmentation method combining significant detection and discriminant study
CN104408708A (en) * 2014-10-29 2015-03-11 兰州理工大学 Global-local-low-rank-based image salient target detection method
CN104463870A (en) * 2014-12-05 2015-03-25 中国科学院大学 Image salient region detection method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于稀疏重构和多特征联合标签推导的显著性检测;赵守凤;《中国优秀硕士学位论文全文数据库 信息科技辑》;20150715;第I138-1046页 *

Also Published As

Publication number Publication date
CN106127197A (en) 2016-11-16

Similar Documents

Publication Publication Date Title
CN106127197B (en) Image saliency target detection method and device based on saliency label sorting
Mou et al. Vehicle instance segmentation from aerial image and video using a multitask learning residual fully convolutional network
Shi et al. Cloud detection of remote sensing images by deep learning
Arteta et al. Interactive object counting
CN108520226B (en) Pedestrian re-identification method based on body decomposition and significance detection
CN110033007B (en) Pedestrian clothing attribute identification method based on depth attitude estimation and multi-feature fusion
CN110866896B (en) Image saliency target detection method based on k-means and level set super-pixel segmentation
CN107977661B (en) Region-of-interest detection method based on FCN and low-rank sparse decomposition
CN110633632A (en) Weak supervision combined target detection and semantic segmentation method based on loop guidance
CN109241816B (en) Image re-identification system based on label optimization and loss function determination method
Zhang et al. Weakly supervised human fixations prediction
CN111931603B (en) Human body action recognition system and method of double-flow convolution network based on competitive network
CN109685830B (en) Target tracking method, device and equipment and computer storage medium
CN108647703B (en) Saliency-based classification image library type judgment method
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
Li et al. Coarse-to-fine salient object detection based on deep convolutional neural networks
Ravichandran et al. A unified approach to segmentation and categorization of dynamic textures
Zhou et al. Semantic image segmentation using low-level features and contextual cues
CN113822134A (en) Instance tracking method, device, equipment and storage medium based on video
Dornaika et al. A comparative study of image segmentation algorithms and descriptors for building detection
EP4145401A1 (en) Method for detecting anomalies in images using a plurality of machine learning programs
CN112884022B (en) Unsupervised depth characterization learning method and system based on image translation
CN111209948A (en) Image processing method and device and electronic equipment
Shi et al. Real-time saliency detection for greyscale and colour images
Li et al. Interactive image segmentation via cascaded metric learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200707

CF01 Termination of patent right due to non-payment of annual fee