CN110503113B

CN110503113B - Image saliency target detection method based on low-rank matrix recovery

Info

Publication number: CN110503113B
Application number: CN201910801714.5A
Authority: CN
Inventors: 刘明明; 刘兵; 郑丽丽; 李震霄; 仇文宁; 付红; 孙伟; 李姗姗
Original assignee: Jiangsu Institute of Architectural Technology
Current assignee: Jiangsu Institute of Architectural Technology
Priority date: 2019-08-28
Filing date: 2019-08-28
Publication date: 2023-07-28
Anticipated expiration: 2039-08-28
Also published as: CN110503113A

Abstract

The invention discloses an image saliency target detection method based on low-rank matrix recovery, comprising the following steps: extracting color features from an original image and, in combination with the image superpixels, determining a feature matrix of the original image; decomposing a low-rank matrix from the feature matrix; constructing a hierarchical index tree of the original image from the image superpixels according to an index tree generation algorithm, and determining a hierarchical sparse norm of the original image in combination with high-level prior information; performing ternary decomposition on the low-rank matrix and determining a structured low-rank matrix recovery model of the original image; and fusing the hierarchical sparse norm of the original image with the structured low-rank matrix recovery model, and obtaining the saliency map by an alternating direction optimization algorithm. By introducing ternary decomposition of the low-rank matrix to accelerate singular value decomposition, the method addresses the high computational complexity caused by minimizing the matrix trace norm (nuclear norm); by constructing an index tree and combining it with high-level priors, it addresses unsupervised image saliency target detection under complex backgrounds.

Description

Image saliency target detection method based on low-rank matrix recovery

Technical Field

The invention relates to the technical field of image detection, in particular to an image saliency target detection method based on low-rank matrix recovery.

Background

Salient target detection can effectively and accurately segment a foreground target from a single scene, or perform cooperative detection across a plurality of images. It has been widely used in image segmentation, content-based image retrieval, image compression, image cropping, and the like. Salient object detection methods are broadly classified into two categories, supervised and unsupervised, according to whether label information is used. Supervised methods typically rely on a deep learning model and large-scale training to achieve target detection. Unsupervised methods do not need large-scale image samples, and offer better flexibility and lower computational complexity.

Unsupervised salient object detection methods typically use high-level priors and low-level saliency features to achieve fast object detection. Achanta et al. use color contrast prior information and a center prior to locate salient objects. In addition to color and center prior information, Wei et al. localize the object by treating the image boundary as background and computing the geodesic distance of the salient object from it. Shen et al. combine different prior information with Robust Principal Component Analysis (RPCA) to build a unified detection model. Lang et al. achieve region-level saliency target generation by integrating different high-level priors. Tang et al. infer the probability that each image region belongs to the background using color, location and boundary connectivity. Huo et al. build a specialized linear feedback control system model that can receive a plurality of saliency priors and image features. Liu et al. propose a non-parametric saliency target detection model based on kernel density estimation that uses item likelihood metrics and saliency metrics to generate a pixel-by-pixel saliency map. Goferman et al. designed a context-aware saliency region extraction model based on local and global contrast; multi-scale contrast, center-surround histograms and color spatial distribution were subsequently introduced into conditional random fields to improve saliency map quality. In addition, global feature contrast extracted in the wavelet and Fourier transform domains has also been used effectively for unsupervised saliency target detection.

Among the above unsupervised saliency target detection methods, methods based on low-rank matrix decomposition have attracted a great deal of attention owing to their robustness and efficiency. Such methods decompose the original image into a low-rank matrix and a sparse matrix, wherein the low-rank matrix corresponds to the highly redundant image background region, and the sparse matrix corresponds to the salient foreground target region of the image. However, these existing methods typically use a simple matrix norm to induce the sparse matrix and ignore the structural information of the salient target, so the generated saliency map is scattered or incomplete. In addition, these methods constrain the low-rank matrix with the matrix nuclear norm, so the algorithm must perform a singular value decomposition in every iteration, which increases the computational cost.

Disclosure of Invention

The embodiment of the invention provides an image saliency target detection method based on low-rank matrix recovery, which is used for solving the problems in the prior art.

The embodiment of the invention provides an image saliency target detection method based on low-rank matrix recovery, which comprises the following steps:

extracting color features from an original image, combining super pixels of the image, and determining a feature matrix of the original image; decomposing a low-rank matrix from the feature matrix by a robust principal component analysis method;

constructing a hierarchical index tree of the original image by utilizing the image superpixels according to an index tree generation algorithm, and determining a hierarchical sparse norm of the original image by combining high-level prior information;

ternary decomposition is carried out on the low-rank matrix, and a structured low-rank matrix recovery model of the original image is determined;

and fusing the layered sparse norms of the original image and the structured low-rank matrix recovery model of the original image, and combining an alternating direction optimization algorithm to obtain the saliency map.

Further, extracting color features from the original image and determining the feature matrix of the original image in combination with the image superpixels specifically comprises:

extracting color features from the original image, and generating image superpixels $\{P_1, P_2, \ldots, P_n\}$ by using a simple linear iterative clustering (SLIC) algorithm, wherein each superpixel block $P_i$ corresponds to a feature vector $x_i$; the feature matrix of the original image is represented as $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{D \times n}$, wherein $\mathbb{R}^{D \times n}$ represents the Euclidean space, $D$ represents the feature vector dimension, and $n$ represents the number of feature vectors.

Further, the hierarchical sparse norm of the original image is specifically expressed as:

$$\Omega(S) = \sum_{i=1}^{d} \sum_{j=1}^{n_i} v_j^i \left\| S_{G_j^i} \right\|_\infty$$

wherein $G_j^i$ represents the j-th index tree node of the i-th layer and $v_j^i$ represents its weight; $|\cdot|$ represents the number of set elements, $d$ represents the number of layers of the index tree, $n_i$ represents the number of tree nodes in the i-th layer, and $\|\cdot\|_\infty$ represents the $l_\infty$ norm; the high-level prior weight $v_j^i$ of each tree node is expressed as:

[equation not recoverable from the source text]

wherein $m$ represents the index of a superpixel contained in the j-th tree node of the i-th layer; the three prior maps of position, color and boundary connectivity are fused into a final high-level prior map, and $h_m$ represents the value of the high-level prior map corresponding to the superpixel block $P_m \in P$.

Further, performing ternary decomposition on the low-rank matrix to determine the structured low-rank matrix recovery model of the original image specifically comprises:

the low-rank matrix is decomposed into the product of three matrices, $L = QMR$, $Q^T Q = I$, $RR^T = I$, where $r < \min(D, N)$; wherein the dimension of $Q$ is $D \times r$, the dimension of $M$ is $r \times r$, the dimension of $R$ is $r \times N$, and $r$ represents the rank of the matrix $L$;

assuming the matrices $Q$ and $R$ have orthonormal column and row vectors, respectively, i.e., $Q^T Q = I$, $RR^T = I$, the equation $\|QMR\|_* = \|M\|_*$ holds; from this relation, $\|L\|_* = \|QMR\|_* = \|M\|_*$, and the structured low-rank matrix recovery model is as follows:

$$\min_{Q,M,R,S} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\Theta(S)$$

$$\text{s.t. } X = L + S,\quad L = QMR,\quad Q^T Q = I,\quad RR^T = I$$

wherein $\|M\|_*$ represents the nuclear norm of $M$; $\Omega(S)$ represents the hierarchical sparse regularization term; $\alpha$, $\beta$ represent balance coefficients; $\Theta(S)$ represents the manifold regularization term, expressed as follows:

$$\Theta(S) = \frac{1}{2} \sum_{i,j} \left\| S_i - S_j \right\|_2^2 \, W_{i,j} = \mathrm{Tr}(S L_g S^T)$$

wherein $S_i$ is the i-th column of the sparse matrix $S$; $L_g$ represents the Laplacian matrix, i.e., $L_g = D - W$, where $D$ here denotes the degree matrix; $\mathrm{Tr}(\cdot)$ represents the trace of a matrix; $W$ represents the neighbor matrix generated from superpixel pairs, and $W_{i,j}$ is determined by the distance between the i-th and j-th superpixels.
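As a numerical illustration of the manifold regularization term, the following minimal sketch checks the standard graph-Laplacian identity $\frac{1}{2}\sum_{i,j} W_{i,j}\|S_i - S_j\|_2^2 = \mathrm{Tr}(S L_g S^T)$ for a symmetric neighbor matrix. The matrix sizes, the random toy data and the function names are assumptions made for illustration only; they are not taken from the patent.

```python
import numpy as np

def manifold_term_pairwise(S, W):
    """(1/2) * sum_ij W[i, j] * ||S_i - S_j||^2 over the columns S_i of S."""
    n = S.shape[1]
    total = 0.0
    for i in range(n):
        for j in range(n):
            diff = S[:, i] - S[:, j]
            total += W[i, j] * (diff @ diff)
    return 0.5 * total

def manifold_term_trace(S, W):
    """Tr(S L_g S^T) with L_g = Deg - W and Deg the diagonal degree matrix."""
    L_g = np.diag(W.sum(axis=1)) - W
    return np.trace(S @ L_g @ S.T)

rng = np.random.default_rng(0)
S = rng.standard_normal((5, 8))   # toy sizes: 5 features, 8 superpixels
A = rng.random((8, 8))
W = 0.5 * (A + A.T)               # symmetric toy affinities (assumption)
np.fill_diagonal(W, 0.0)

pairwise_value = manifold_term_pairwise(S, W)
trace_value = manifold_term_trace(S, W)
```

The trace form is the one that is convenient inside an optimizer, because its gradient with respect to $S$ is simply $2 S L_g$ when $L_g$ is symmetric.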

Further, fusing the hierarchical sparse norm of the original image with the structured low-rank matrix recovery model of the original image and obtaining the saliency map by an alternating direction optimization algorithm specifically comprises:

the structured low-rank matrix recovery model is expressed as the optimization problem:

$$\min_{Q,M,R,S} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(S L_g S^T)$$

$$\text{s.t. } X = L + S,\quad L = QMR,\quad Q^T Q = I,\quad RR^T = I.$$

the solution algorithm for the optimization model is derived as follows:

an auxiliary variable $E$ is introduced to make the objective function separable, and the optimization model is expressed as:

$$\min_{Q,M,R,S,E} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(E L_g E^T)$$

$$\text{s.t. } X = QMR + S,\quad Q^T Q = I,\quad RR^T = I,\quad S = E.$$

the augmented Lagrangian function of the separable optimization model is expressed as:

$$\mathcal{L}_\mu = \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(E L_g E^T) + \langle Y_1, X - QMR - S \rangle + \langle Y_2, S - E \rangle + \frac{\mu}{2}\left( \|X - QMR - S\|_F^2 + \|S - E\|_F^2 \right)$$

wherein $Y_1$ and $Y_2$ represent the Lagrange multipliers and $\mu > 0$ represents the penalty parameter; $\mathcal{L}_\mu$ is minimized by solving alternately for the variables $M$, $E$, $S$, $Q$, $R$, $Y_1$ and $Y_2$.

The embodiment of the invention provides an image saliency target detection method based on low-rank matrix recovery, which has the following beneficial effects compared with the prior art:

The invention provides a salient target detection method that integrates fast low-rank matrix decomposition with hierarchical sparse regularization. By introducing ternary decomposition of the low-rank matrix to accelerate singular value decomposition, the method reduces the size of the matrix on which singular value decomposition is performed, and thereby addresses the high computational complexity caused by minimizing the matrix trace norm. By constructing an index tree and combining it with high-level priors, hierarchical sparse regularization and low-rank matrix decomposition are effectively fused; the hierarchical sparse regularization improves salient object detection performance under complex backgrounds, addressing unsupervised image salient object detection under complex backgrounds.

Drawings

FIG. 1 shows the salient object detection framework provided by an embodiment of the present invention;

FIG. 2a compares the PR curves of the method provided by the embodiment of the present invention with those of conventional algorithms on the iCoSeg dataset;

FIG. 2b compares the PR curves of the method provided by the embodiment of the present invention with those of classical algorithms on the iCoSeg dataset;

FIG. 3a compares the F-measure curves of the method provided by the embodiment of the present invention with those of conventional algorithms on the iCoSeg dataset;

FIG. 3b compares the F-measure curves of the method provided by the embodiment of the present invention with those of mainstream algorithms on the iCoSeg dataset;

FIG. 4a compares the PR curves of the method provided by the embodiment of the present invention with those of conventional algorithms on the SOD dataset;

FIG. 4b compares the PR curves of the method provided by the embodiment of the present invention with those of mainstream algorithms on the SOD dataset;

FIG. 5a compares the F-measure curves of the method provided by the embodiment of the present invention with those of conventional algorithms on the ECSSD dataset;

FIG. 5b compares the F-measure curves of the method provided by the embodiment of the present invention with those of mainstream algorithms on the ECSSD dataset;

FIG. 6 shows the saliency maps generated by each algorithm, as provided by an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1, an embodiment of the present invention provides an image saliency target detection method based on low rank matrix recovery, the method including:

Step 1: extracting color features from an original image and, in combination with the image superpixels, determining a feature matrix of the original image; and decomposing a low-rank matrix from the feature matrix by a robust principal component analysis method.

Step 2: constructing a hierarchical index tree of the original image from the image superpixels according to an index tree generation algorithm, and determining a hierarchical sparse norm of the original image in combination with high-level prior information.

Step 3: performing ternary decomposition on the low-rank matrix, and determining a structured low-rank matrix recovery model of the original image.

Step 4: constraining the structured low-rank matrix recovery model of the image with the hierarchical sparse norm of the original image, and obtaining a saliency map by an alternating direction optimization algorithm.

The specific explanation of the above steps 1 to 4 is as follows:

Color features are first extracted from the original image, and image superpixels $\{P_1, P_2, \ldots, P_n\}$ are then generated by using a simple linear iterative clustering (SLIC) algorithm, wherein each superpixel block $P_i$ is represented by a corresponding feature vector $x_i$. Thus, the original image can be represented as a feature matrix $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{D \times n}$.
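The construction of the feature matrix can be sketched as follows. This is a simplified illustration, not the feature extraction of the embodiment: it assumes a superpixel label map is already available (for example from a SLIC implementation) and uses only the mean color of each superpixel as its feature vector, whereas the embodiment uses a higher-dimensional feature. The function name `feature_matrix` and the toy image are hypothetical.

```python
import numpy as np

def feature_matrix(image, labels):
    """Stack one mean-color feature vector per superpixel as a column of X.

    image:  H x W x C float array.
    labels: H x W integer array with superpixel ids 0..n-1 (each id must occur).
    Returns X with shape (C, n), one column per superpixel.
    """
    n = int(labels.max()) + 1
    X = np.zeros((image.shape[2], n))
    for i in range(n):
        # Boolean indexing yields a (pixel_count, C) array for superpixel i.
        X[:, i] = image[labels == i].mean(axis=0)
    return X

# Toy 2 x 2 RGB image split into two superpixels (top row, bottom row).
image = np.array([[[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                  [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]]])
labels = np.array([[0, 0],
                   [1, 1]])
X = feature_matrix(image, labels)
```

Each column of `X` plays the role of a feature vector $x_i$, so the column count equals the number of superpixels.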

To improve the quality of the generated saliency map, a structured sparse regularization method is incorporated. A hierarchical index tree of the original image is constructed from the generated superpixels according to an index tree generation algorithm, giving a hierarchical structure representation of the image. The hierarchical sparse norm of the image is then defined based on the index tree as follows:

$$\Omega(S) = \sum_{i=1}^{d} \sum_{j=1}^{n_i} v_j^i \left\| S_{G_j^i} \right\|_\infty$$

wherein $G_j^i$ represents the j-th index tree node of the i-th layer and $v_j^i$ represents its weight; $|\cdot|$ represents the number of set elements, $d$ represents the number of layers of the index tree, $n_i$ represents the number of tree nodes in the i-th layer, and $\|\cdot\|_\infty$ represents the $l_\infty$ norm, which is capable of effectively separating foreground and background regions; $v_j^i$ represents the high-level prior weight corresponding to each tree node and is defined as follows:

[equation not recoverable from the source text]

wherein $m$ represents the index of a superpixel contained in the j-th tree node of the i-th layer; the three prior maps of position, color and boundary connectivity are fused into a final high-level prior map, and $h_m$ represents the value of the high-level prior map corresponding to the superpixel block $P_m \in P$.

The classical robust principal component analysis method can decompose the feature matrix $X$ into a low-rank matrix $L$ corresponding to the image background and a sparse matrix $S$ corresponding to the foreground salient target. However, the algorithm requires a singular value decomposition in every iteration, which is computationally expensive. For this purpose, the low-rank matrix $L$ of rank $r$ is decomposed into the product of three matrices, i.e., $L = QMR$, $Q^T Q = I$, $RR^T = I$, where $r < \min(D, N)$.

Theorem: assuming the matrices $Q$ and $R$ have orthonormal column and row vectors, respectively, i.e., $Q^T Q = I$, $RR^T = I$, the equation $\|QMR\|_* = \|M\|_*$ holds.
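The equality of nuclear norms can be checked numerically. The sketch below is an illustration under assumed toy dimensions and random data: it builds a $Q$ with orthonormal columns and an $R$ with orthonormal rows via QR decomposition and compares the nuclear norm of the large $D \times N$ product with that of the small $r \times r$ core.

```python
import numpy as np

def nuclear_norm(A):
    """Nuclear norm: the sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

rng = np.random.default_rng(0)
D_feat, N, r = 75, 200, 10       # assumed toy sizes with r < min(D_feat, N)

# Q: D_feat x r with orthonormal columns (Q^T Q = I).
Q, _ = np.linalg.qr(rng.standard_normal((D_feat, r)))
# R: r x N with orthonormal rows (R R^T = I).
Rt, _ = np.linalg.qr(rng.standard_normal((N, r)))
R = Rt.T
M = rng.standard_normal((r, r))

norm_of_product = nuclear_norm(Q @ M @ R)   # SVD of a D_feat x N matrix
norm_of_core = nuclear_norm(M)              # SVD of an r x r matrix only
```

The two values agree up to rounding, which is why the nuclear norm of $L$ can be evaluated on the much smaller matrix $M$.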

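From the theorem it follows that $\|L\|_* = \|QMR\|_* = \|M\|_*$. Thus, the following structured low-rank matrix recovery model is proposed:

$$\min_{Q,M,R,S} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\Theta(S)$$

$$\text{s.t. } X = L + S,\quad L = QMR,\quad Q^T Q = I,\quad RR^T = I$$

wherein $\|M\|_*$ represents the nuclear norm of $M$; $\Omega(S)$ represents the hierarchical sparse regularization term; $\alpha$, $\beta$ represent balance coefficients; $\Theta(S)$ represents the manifold regularization term, defined as follows:

$$\Theta(S) = \frac{1}{2} \sum_{i,j} \left\| S_i - S_j \right\|_2^2 \, W_{i,j} = \mathrm{Tr}(S L_g S^T)$$

Based on the above discussion, the optimization problem can be expressed as:

$$\min_{Q,M,R,S} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(S L_g S^T)$$

$$\text{s.t. } X = L + S,\quad L = QMR,\quad Q^T Q = I,\quad RR^T = I.$$

The solution algorithm of the optimization model is derived as follows:

First, an auxiliary variable $E$ is introduced to make the objective function separable, and the optimization problem is converted into:

$$\min_{Q,M,R,S,E} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(E L_g E^T)$$

$$\text{s.t. } X = QMR + S,\quad Q^T Q = I,\quad RR^T = I,\quad S = E.$$

The augmented Lagrangian function of the above problem is then expressed as:

$$\mathcal{L}_\mu = \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(E L_g E^T) + \langle Y_1, X - QMR - S \rangle + \langle Y_2, S - E \rangle + \frac{\mu}{2}\left( \|X - QMR - S\|_F^2 + \|S - E\|_F^2 \right)$$

wherein $Y_1$ and $Y_2$ represent the Lagrange multipliers and $\mu > 0$ represents the penalty parameter.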

The structured low rank matrix ternary decomposition algorithm is as follows:

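Input: feature matrix $X$, parameters $\alpha$, $\beta$, index tree $\{G_j^i\}$, weights $\{v_j^i\}$, rank upper bound $r$.

Output: $L = QMR$ and $S$.

1: Initialize $Q^0 = 0$, $M^0 = 0$, $R^0 = \mathrm{eye}(r, N)$, $S^0 = 0$, $E^0 = 0$, $\mu^0 = 0.1$, $\mu_{\max} = 10^{10}$, $\rho = 1.1$, $\varepsilon = 10^{-5}$, $k = 0$.

2: Check convergence: $\|X - Q^k M^k R^k - S^k\|_\infty < \varepsilon$ and $\|S^k - E^k\|_\infty < \varepsilon$. If converged, terminate and go to (13); otherwise execute (3)-(12).

3: Update $Q^{k+1}$.

4: Update $R^{k+1}$.

5: Update $M^{k+1}$.

6: Update $E^{k+1}$.

7: Update $S^{k+1}$.

8: Update the multiplier $Y_1^{k+1}$.

9: Update the multiplier $Y_2^{k+1}$.

10: $\mu^{k+1} = \min(\rho \mu^k, \mu_{\max})$

11: $k = k + 1$

12: Go to (2) and continue the loop.

13: Return $L^k = Q^k M^k R^k$ and $S^k$.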

And respectively updating each variable in the augmented Lagrangian function by using an alternating direction method:

1) Updating $Q^{k+1}$ and $R^{k+1}$

Fixing the other variables, the optimization sub-problem for $Q$ is expressed as:

$$\min_{Q} \; \left\| Z - Q M^k R^k \right\|_F^2$$

$$\text{s.t. } Q^T Q = I,$$

wherein $Z = X - S^k + Y_1^k / \mu^k$.

The above problem is a least squares problem under an orthogonality constraint. Fixing $M^k$, $R^k$, $S^k$ and $Y_1^k$, the product of the two matrices $Q$ and $M^k$ is $Q M^k = Z (R^k)^{\dagger}$, wherein $(R^k)^{\dagger}$ represents the Moore-Penrose pseudo-inverse of the matrix $R^k$, which, because of the constraint $R^k (R^k)^T = I$, can be replaced by $(R^k)^T$. $Q^{k+1}$ can then be calculated according to the following formula:

$$Q^{k+1} = \mathrm{QR}(Q M^k) = \mathrm{QR}(Z (R^k)^T),$$

wherein $\mathrm{QR}(\cdot)$ represents the QR decomposition operator, which returns the orthonormal factor. Similarly, $R^{k+1}$ is calculated as follows:

$$(R^{k+1})^T = \mathrm{QR}(Z^T Q^{k+1})$$
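The QR-based updates of $Q$ and $R$ can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation of the embodiment: `Z` is treated as an arbitrary given $D \times N$ matrix, the function name `update_Q_R` is hypothetical, and the second factor is taken from the QR decomposition of $Z^T Q^{k+1}$ so that the result has the required shape $N \times r$ before transposition.

```python
import numpy as np

def update_Q_R(Z, R):
    """One QR-based update of the orthonormal factors Q (D x r) and R (r x N)."""
    # Orthonormal basis for the column space of Z R^T (shape D x r).
    Q_new, _ = np.linalg.qr(Z @ R.T)
    # Orthonormal basis for the column space of Z^T Q_new (shape N x r).
    Rt_new, _ = np.linalg.qr(Z.T @ Q_new)
    return Q_new, Rt_new.T

rng = np.random.default_rng(0)
D_feat, N, r = 20, 30, 4                  # assumed toy sizes
# Toy target that is exactly rank r, so a rank-r factorization can fit it.
Z = rng.standard_normal((D_feat, r)) @ rng.standard_normal((r, N))
R0 = np.eye(r, N)                         # same shape as the R^0 = eye(r, N) initialization

Q1, R1 = update_Q_R(Z, R0)
M1 = Q1.T @ Z @ R1.T                      # r x r core matrix
reconstruction = Q1 @ M1 @ R1
```

On this toy rank-$r$ input a single update already reproduces `Z` up to rounding; on general data the factors are only refined over the iterations. Only QR decompositions of thin matrices are involved, so no singular value decomposition of a $D \times N$ matrix is needed.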

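2) Updating $M^{k+1}$

Fixing the other variables, the optimization sub-problem for $M$ is as follows:

$$\min_{M} \; \|M\|_* + \frac{\mu^k}{2} \left\| (Q^{k+1})^T Z (R^{k+1})^T - M \right\|_F^2,$$

which has the following closed-form solution:

$$M^{k+1} = \mathrm{SVT}_{1/\mu^k}\!\left( (Q^{k+1})^T Z (R^{k+1})^T \right),$$

wherein $\mathrm{SVT}_\mu(X) = U \,\mathrm{diag}(\max(\sigma - \mu, 0))\, V^T$ represents the singular value thresholding operator, and $\sigma$ represents the singular values of $X$, i.e., $X = U \,\mathrm{diag}(\sigma)\, V^T$.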
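The singular value thresholding operator defined above can be written directly from its formula. The sketch below is illustrative; the function name `svt` and the small diagonal example are assumptions.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: U diag(max(sigma - tau, 0)) V^T."""
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    shrunk = np.maximum(sigma - tau, 0.0)   # soft-threshold the singular values
    return U @ np.diag(shrunk) @ Vt

# Example: singular values 3 and 1 shrunk by tau = 2 become 1 and 0.
X_demo = np.diag([3.0, 1.0])
M_demo = svt(X_demo, 2.0)
```

In the ternary decomposition the operator is applied to an $r \times r$ matrix, so its cost does not grow with the number of superpixels.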

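3) Updating $E^{k+1}$

Fixing $M^{k+1}$, $S^k$, $Q^{k+1}$, $R^{k+1}$, $Y_1^k$ and $Y_2^k$, the following optimization sub-problem is obtained:

$$\min_{E} \; \beta\,\mathrm{Tr}(E L_g E^T) + \frac{\mu^k}{2} \left\| S^k - E + Y_2^k / \mu^k \right\|_F^2$$

Taking the derivative with respect to the variable $E$ and setting it to zero gives:

$$E^{k+1} = (\mu^k S^k + Y_2^k)\,(2\beta L_g + \mu^k I)^{-1}$$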
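A minimal sketch of the $E$ update follows. It assumes the sub-problem has the form $\min_E \beta\,\mathrm{Tr}(E L_g E^T) + \frac{\mu}{2}\|S - E + Y_2/\mu\|_F^2$ with a symmetric Laplacian $L_g$; under that assumption the zero-gradient condition is $E(2\beta L_g + \mu I) = \mu S + Y_2$. The function name `update_E` and the toy data are hypothetical.

```python
import numpy as np

def update_E(S, Y2, L_g, beta, mu):
    """Solve E (2*beta*L_g + mu*I) = mu*S + Y2 for E (L_g assumed symmetric)."""
    n = L_g.shape[0]
    A = 2.0 * beta * L_g + mu * np.eye(n)   # symmetric positive definite for mu > 0
    B = mu * S + Y2
    # E A = B is equivalent to A E^T = B^T because A is symmetric.
    return np.linalg.solve(A, B.T).T

rng = np.random.default_rng(0)
d_feat, n = 5, 8                            # assumed toy sizes
S = rng.standard_normal((d_feat, n))
Y2 = rng.standard_normal((d_feat, n))
T = rng.random((n, n))
W = 0.5 * (T + T.T)                         # symmetric toy affinities
np.fill_diagonal(W, 0.0)
L_g = np.diag(W.sum(axis=1)) - W
beta, mu = 1.0, 0.1

E = update_E(S, Y2, L_g, beta, mu)
# Gradient of the assumed sub-problem evaluated at the solution.
gradient = 2.0 * beta * E @ L_g - mu * (S - E) - Y2
```

Using a linear solve instead of forming the explicit inverse is a numerical design choice; both give the same $E$ in exact arithmetic.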

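4) Updating $S^{k+1}$

The optimization sub-problem for $S$ is expressed as follows:

$$\min_{S} \; \lambda\,\Omega(S) + \frac{1}{2} \left\| S - Z_S \right\|_F^2,$$

wherein $\lambda = \alpha / (2\mu^k)$ and $Z_S = \frac{1}{2}\left( X - Q^{k+1} M^{k+1} R^{k+1} + E^{k+1} + (Y_1^k - Y_2^k)/\mu^k \right)$.

The above problem can be solved by the hierarchical proximal operator.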
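One common way to realize a hierarchical proximal operator for tree-structured groups with $l_\infty$ norms is to apply the proximal operator of each group in turn, from the finest layer of the tree to the coarsest. The sketch below follows that scheme under several assumptions that are not stated in the source: each tree node is a list of column indices of $S$, the $l_\infty$ norm of a node is the largest absolute entry of the corresponding sub-matrix, and the single-group step uses the identity "prox of $t\|\cdot\|_\infty$ equals the input minus its projection onto the $l_1$ ball of radius $t$". All function names are hypothetical.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of a 1-D array v onto the l1 ball of the given radius."""
    if radius <= 0:
        return np.zeros_like(v)
    a = np.abs(v)
    if a.sum() <= radius:
        return v.copy()
    u = np.sort(a)[::-1]                       # magnitudes in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)  # soft-threshold level
    return np.sign(v) * np.maximum(a - theta, 0.0)

def prox_linf(v, t):
    """Proximal operator of t * ||v||_inf for a 1-D array v."""
    return v - project_l1_ball(v, t)

def hierarchical_prox(S, levels, weights, lam):
    """Apply group-wise l_inf proximal steps from the finest layer to the coarsest.

    levels:  list of layers ordered coarsest first; each layer is a list of
             column-index lists (the tree nodes of that layer).
    weights: matching nested list of non-negative node weights.
    """
    out = S.copy()
    for groups, group_weights in zip(reversed(levels), reversed(weights)):
        for cols, w in zip(groups, group_weights):
            block = out[:, cols]
            out[:, cols] = prox_linf(block.ravel(), lam * w).reshape(block.shape)
    return out

# Toy example: 4 superpixels, a root node and two child nodes.
S_demo = np.array([[3.0, 1.0, -2.0, 0.5],
                   [0.2, -0.1, 4.0, 0.0]])
levels = [[[0, 1, 2, 3]], [[0, 1], [2, 3]]]
weights = [[1.0], [1.0, 1.0]]
S_prox = hierarchical_prox(S_demo, levels, weights, lam=0.5)
```

Processing the finest layer first matters: for nested groups the composition of the group-wise steps in this order yields the proximal operator of the whole sum, whereas an arbitrary order generally does not.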

Algorithm complexity analysis

Performing singular value decomposition on the $r \times r$ matrix $(Q^{k+1})^T Z (R^{k+1})^T$ has time complexity $O(r^3)$; the QR decompositions and matrix multiplications have time complexity $O(r^2(D + N) + rDN)$. Thus, the total time complexity of the structured low-rank matrix ternary decomposition is $O(t(r^3 + r^2(D + N) + rDN))$, where $t$ represents the number of iterations. In practical applications $r \ll D$ and $r \ll N$, so the computational complexity of the algorithm reduces to $O(trDN)$.

Examples

Saliency maps are generated with the proposed model and algorithm for experimental verification, and the results are compared with popular saliency target detection algorithms. These currently best-performing algorithms include SMD, WLRR, DRFI, RBD and DSR, as well as other unsupervised saliency target detection algorithms.

Selection of data sets

As shown in Table 1, the experiments use datasets under different conditions to test the robustness of the proposed algorithm. These datasets include the multi-target simple-background datasets SOD and iCoSeg, and the multi-target complex-background dataset ECSSD. All algorithms were tested and compared in the MATLAB R2016a environment on a machine with an Intel Core i5-6200U dual-core CPU and 8 GB of memory.

Table 1 dataset description

Model parameter setting and evaluation index

The dimension of the feature matrix is 200×75, where the number of superpixels is 200 and the feature vector corresponding to each superpixel block has dimension 75. The number of layers of the index tree is set to 5. The parameter $\sigma^2$ in the objective function is set to 0.05, and the balance parameters $\alpha$ and $\beta$ are set to 0.5 and 1.0, respectively. The upper bound of the rank of the input low-rank matrix is estimated using the QR decomposition algorithm.

The evaluation index includes PR curve, F-measure curve, mean absolute error MAE, area under ROC curve (AUC), overlap Ratio (OR) and weighted F-measure (WF) score. Wherein PR and F-measure curves are generated by setting different thresholds.
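The threshold-based indices can be illustrated with a short sketch. It computes precision, recall and F-measure for one binarization threshold, plus the mean absolute error. The weighting $\beta^2 = 0.3$ in the F-measure is an assumption (a value commonly used in the saliency literature, not stated in the source), and the function names and toy data are hypothetical.

```python
import numpy as np

def precision_recall_f(saliency, gt, threshold, beta2=0.3):
    """Precision, recall and F-measure after binarizing the saliency map."""
    pred = saliency >= threshold
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / pred.sum() if pred.sum() > 0 else 0.0
    recall = tp / gt.sum() if gt.sum() > 0 else 0.0
    denom = beta2 * precision + recall
    f_measure = (1.0 + beta2) * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_measure

def mean_absolute_error(saliency, gt):
    """MAE between a saliency map in [0, 1] and a binary ground-truth mask."""
    return float(np.mean(np.abs(saliency - gt)))

saliency = np.array([[0.9, 0.2],
                     [0.6, 0.1]])
gt = np.array([[1.0, 0.0],
               [1.0, 0.0]])

# Sweeping the threshold produces the points of the PR and F-measure curves.
curve = [precision_recall_f(saliency, gt, t) for t in (0.5, 0.7)]
mae = mean_absolute_error(saliency, gt)
```

A full PR or F-measure curve is obtained by sweeping the threshold over the whole range of saliency values and averaging over the images of a dataset.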

Experimental tests were performed on different data sets to verify the validity of the proposed algorithm:

1) Multi-objective simple background

First, the proposed algorithm is tested on the multi-target simple-background datasets SOD and iCoSeg. The PR and F-measure curves are given in FIG. 2a, FIG. 2b, FIG. 3a and FIG. 3b, respectively. According to the PR curves, the proposed algorithm and the SMD algorithm outperform the other methods. As can be seen from FIG. 3a and FIG. 3b, the proposed algorithm and the SMD algorithm are insensitive to the threshold, while the other methods achieve good detection results only within a certain threshold range. DRFI also performs well, but it is a shallow supervised model that must be trained in advance, whereas the proposed method is unsupervised and can perform target detection directly, so it is more flexible than DRFI. The proposed algorithm performs comparably to the SMD algorithm, but it is more efficient because it uses the efficient structured low-rank matrix ternary decomposition algorithm; the run-time comparison is shown in Table 2.

2) Multi-target complex background

Under a complex background, the performance of every algorithm is strongly affected, so testing multi-target detection performance under complex-background conditions allows the robustness of the algorithms to be compared. In the experiment, the multi-target complex-image datasets SOD and ECSSD were tested. FIG. 4a and FIG. 4b show the PR curves of each algorithm on the SOD dataset, and FIG. 5a and FIG. 5b show the F-measure curves of each algorithm on the ECSSD dataset. As can be seen from FIG. 4a and FIG. 4b, the proposed algorithm performs comparably to the SMD algorithm. DRFI achieves the best performance there, but DRFI is supervised and requires the model to be trained in advance. In addition, FIG. 5a and FIG. 5b show that the proposed algorithm is significantly better than the DRFI algorithm over a large threshold range; DRFI obtains a relatively high F-measure only for small thresholds, and its F-measure is lower than that of the proposed method in the other threshold ranges. Therefore, taking the two complex-background multi-target datasets together, the proposed algorithm does not need to be trained in advance and its overall indices are superior to those of the other algorithms. This shows that the proposed algorithm is more robust to interference from complex backgrounds.

3) Experiment comparison with SMD

From the above analysis, it can be seen that the proposed algorithm performs comparably to the SMD algorithm. The proposed algorithm and SMD are therefore further compared experimentally in terms of both the generated saliency maps and the run time. FIG. 6 shows the saliency maps generated by the best-performing algorithms under different image conditions. As can be seen from FIG. 6, the saliency maps produced by the proposed algorithm and the SMD algorithm are of significantly better quality than those of the other algorithms. However, the proposed algorithm extracts salient targets in more detail. For example, in the first row of FIG. 6, the saliency map generated by the SMD algorithm lacks some detail, whereas the proposed algorithm, by introducing the structured low-rank matrix ternary decomposition, achieves a better low-rank matrix recovery and separates the foreground from the background to the greatest extent. Furthermore, the run times of the proposed algorithm and SMD were compared, and the results are shown in Table 2. The larger the number of superpixels, the larger the feature matrix and the longer the running time of both algorithms; the proposed algorithm runs faster, and its efficiency advantage becomes more pronounced as the feature matrix grows. Taking the indices on the different datasets together, the overall performance of the proposed method is the best among all the compared methods.

Table 2 run time comparison (minutes)

In summary, in order to improve the efficiency of traditional unsupervised target detection algorithms and to enhance robustness to the image background, the invention provides a salient object detection method that integrates low-rank matrix ternary decomposition with hierarchical sparse regularization. The ternary decomposition addresses the high computational complexity of singular value decomposition, and the hierarchical sparse regularization addresses the weak robustness of traditional salient object detection under complex backgrounds.

The foregoing disclosure is only a few specific embodiments of the present invention and various changes and modifications may be made by those skilled in the art without departing from the spirit and scope of the invention, and it is intended that the invention also includes such changes and modifications as fall within the scope of the claims and their equivalents.

Claims

1. The image saliency target detection method based on low-rank matrix recovery is characterized by comprising the following steps of:

obtaining a saliency map by utilizing a structured low-rank matrix recovery model of a layered sparse norm constraint image of an original image and combining an alternating direction optimization algorithm;

performing ternary decomposition on the low-rank matrix, and determining a structured low-rank matrix recovery model of the original image, which specifically comprises:

$$\min_{Q,M,R,S} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\Theta(S)$$

$$\text{s.t. } X = L + S,\quad L = QMR,\quad Q^T Q = I,\quad RR^T = I$$

wherein $\|M\|_*$ represents the nuclear norm of $M$; $\Omega(S)$ represents the hierarchical sparse regularization term; $\alpha$, $\beta$ represent balance coefficients; $\Theta(S)$ represents the manifold regularization term, expressed as follows:

$$\Theta(S) = \frac{1}{2} \sum_{i,j} \left\| S_i - S_j \right\|_2^2 \, W_{i,j} = \mathrm{Tr}(S L_g S^T)$$

wherein $S_i$ is the i-th column of the sparse matrix $S$; $L_g$ represents the Laplacian matrix, i.e., $L_g = D - W$, where $D$ here denotes the degree matrix; $\mathrm{Tr}(\cdot)$ represents the trace of a matrix; $W$ represents the neighbor matrix generated from superpixel pairs, and $W_{i,j}$ is determined by the distance between the i-th and j-th superpixels;

constraining the structured low-rank matrix recovery model of the image with the hierarchical sparse norm of the original image, and obtaining the saliency map by an alternating direction optimization algorithm, which specifically comprises:

the structured low-rank matrix recovery model is expressed as the optimization problem:

$$\min_{Q,M,R,S} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(S L_g S^T)$$

$$\text{s.t. } X = L + S,\quad L = QMR,\quad Q^T Q = I,\quad RR^T = I.$$

the solution algorithm for the optimization model is derived as follows:

$$\min_{Q,M,R,S,E} \; \|M\|_* + \alpha\,\Omega(S) + \beta\,\mathrm{Tr}(E L_g E^T)$$

$$\text{s.t. } X = QMR + S,\quad Q^T Q = I,\quad RR^T = I,\quad S = E.$$

2. The method for detecting the image saliency target based on low-rank matrix recovery according to claim 1, wherein the method is characterized in that color features are extracted from an original image, and a feature matrix of the original image is determined by combining image super pixels; the method specifically comprises the following steps:

3. The method for detecting an image saliency target based on low-rank matrix recovery according to claim 2, wherein the hierarchical sparse norm of the original image is specifically expressed as:

$$\Omega(S) = \sum_{i=1}^{d} \sum_{j=1}^{n_i} v_j^i \left\| S_{G_j^i} \right\|_\infty$$

wherein $G_j^i$ represents the j-th index tree node of the i-th layer and $v_j^i$ represents its weight; $|\cdot|$ represents the number of set elements, $d$ represents the number of layers of the index tree, $n_i$ represents the number of tree nodes in the i-th layer, and $\|\cdot\|_\infty$ represents the $l_\infty$ norm; the high-level prior weight $v_j^i$ of each tree node is expressed as: