CN107992874B - Image salient target region extraction method and system based on iterative sparse representation - Google Patents
- Publication number
- CN107992874B (application CN201711387624.3A)
- Authority
- CN
- China
- Prior art keywords
- pixel
- sal
- image
- significance
- scale
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
Abstract
The invention provides a method and a system for extracting the salient target region of an image based on iterative sparse representation. First, superpixel segmentation is performed on the original image with the SLIC (simple linear iterative clustering) method under several different superpixel-number parameters, generating a group of segmented images with different superpixel region sizes. Then, for the segmentation result at each scale, a classical visual-attention detection result is taken as the initial saliency map to constrain the selection of foreground and background sample regions; the reconstruction residual of each superpixel region is computed as a saliency factor through a sparse representation process; the single-scale saliency detection result is optimized by recursive iterative operation; and the final salient target detection result is obtained through multi-scale saliency map fusion. The method can effectively remedy drawbacks of traditional methods such as inconsistent saliency evaluation within a single target, difficulty in detecting salient targets near the image edge, and incomplete extraction of multiple salient targets.
Description
Technical Field
The invention belongs to the field of computer vision and image processing, and relates to an image salient target region extraction technology based on iterative sparse representation.
Background
Image visual saliency analysis is a fundamental research topic in computer vision, psychology, neuroscience, and related fields; it is the technical embodiment of the biological ability of the human eye to quickly and accurately capture, from a scene, the target regions that attract visual attention. Through image saliency analysis, the target regions of interest can be effectively extracted, enabling data compression and efficient data management and utilization; it is also a basic link in many image processing problems.
Since 1998, when automatic image saliency analysis was first realized by computer, novel automatic salient-target detection algorithms have emerged continuously as their application prospects are explored. From the solution point of view, existing salient-object extraction algorithms can be roughly divided into two categories: data-driven bottom-up detection methods and task-driven top-down detection methods. The former automatically processes and identifies the input image according to empirical cognition to realize traditional cognitive saliency analysis, and is usually an unsupervised automatic extraction algorithm; the latter analyzes the image in a targeted manner in combination with an actual target task to extract objects meeting specific application requirements, and is usually a recognition algorithm under supervised learning. On the other hand, from the perspective of the extracted result, conventional methods can be further divided into visual-attention-based saliency analysis algorithms, which generate a pixel-level saliency prediction map, and salient-target extraction methods, which take the extracted complete salient target region as the final goal.
In bottom-up unsupervised approaches, due to the lack of high-level biological cognitive information, certain hypothetical constraints are usually introduced to complete the detection task. Empirical analysis shows that objects distributed near the middle of an image attract more visual attention, whereas the saliency of regions near the image edges is generally lower; meanwhile, local regions with high contrast also exhibit higher visual saliency. Saliency detection methods combining image center/boundary constraints with contrast analysis have therefore developed rapidly and shown outstanding detection performance. With further research and application, however, the dependence of these methods on hypothetical conditions has become increasingly problematic: 1) when a salient object is close to the image edge, it generally cannot be detected correctly; 2) with local-contrast-based methods, the extracted salient target region is incomplete and the saliency evaluation inside the target is not uniform; 3) global-contrast-based methods often fail when multiple salient objects are present simultaneously. Therefore, how to overcome these defects, weaken the dependence on hypothetical condition constraints in the absence of high-level cognitive information, improve the uniformity and integrity of salient-target extraction, and strengthen the adaptability of the algorithm remains a technical problem requiring further research.
Disclosure of Invention
The invention aims to provide a technical scheme for consistently extracting the salient target regions of an image under a natural background. The scheme makes full use of the comprehensive difference between foreground and background in the image, integrates the inherent relations among salient targets, weakens the dependence of the saliency analysis process on traditional hypothetical condition constraints, realizes consistent extraction of multiple salient targets, and ensures both the internal integrity of a single salient target and the completeness of multi-target extraction.
In order to achieve the above object, the technical solution provided by the present invention is a method for extracting a salient object region of an image based on iterative sparse representation, comprising the following steps:
step 1, data preprocessing: set several different SLIC superpixel numbers, perform multi-scale superpixel segmentation on the original image, apply saliency detection based on classical visual attention, and set the detection result as the initial saliency map SAL_0;
Step 2, extracting the salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
step 3, calculating the average value of all original pixel features in the superpixel region of each scale to obtain the superpixel region features under single-scale segmentation;
step 4, aiming at the segmentation result of a single scale, calculating a saliency map through recursive sparse representation, and comprising the following substeps:
step 4.1, superpixel initial saliency calculation: according to SAL_0, solve the initial saliency level of each superpixel through mean value calculation;
step 4.2, foreground sample extraction: sort the superpixel initial saliency levels in descending order and take the top p1% of superpixels as the foreground sample D_f;
step 4.3, background sample extraction: sort the superpixel initial saliency levels in ascending order and take the top p2% of superpixels as the candidate background sample D_b1; additionally extract the superpixels touching the image boundary as the candidate background sample D_b2; the background sample calculation formula is:
D_b = D_b1 + D_b2 - D_f (1)
step 4.4, dual sparse representation and sparse residual calculation: taking the foreground sample and the background sample respectively as dictionaries, sparsely represent all superpixels and compute the reconstruction residuals, with the formulas:
α_bi = argmin_α ||F_i - D_b·α||₂² + λ_b·||α||₁ (2)
ε_bi = ||F_i - D_b·α_bi||₂² (3)
α_fi = argmin_α ||F_i - D_f·α||₂² + λ_f·||α||₁ (4)
ε_fi = ||F_i - D_f·α_fi||₂² (5)
wherein i denotes the superpixel index; F_i is the feature vector of superpixel region i; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the background and foreground sparse representation results; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
step 4.5, saliency factor calculation: fuse ε_bi and ε_fi according to formula (6), assign each superpixel's fused result to all original-image pixels within that superpixel, and compute the saliency factor map SAL_i,
SAL_i = ε_bi/(ε_fi + σ²) (6)
wherein σ² is a non-negative tuning parameter;
step 4.6, recursive processing: calculate the correlation coefficient rela between the saliency factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela < K, set SAL_0 = SAL_i and repeat the whole of step 4; if rela ≥ K, end the recursion and output the current SAL_i as the saliency detection result at this scale; wherein K is a similarity determination threshold,
rela = corr2(A, B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, while a smaller value indicates a larger difference;
and 5, fusing multi-scale significance detection results, performing equal-weight linear combination on significance results under each single scale, and calculating a final significance detection result.
Further, in step 2, the salient features are 13-dimensional, comprising RGB, Lab, x, y, and first- and second-order gradients, expressed as {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}. The color feature is the six-dimensional {R, G, B, L, a, b}, where R, G, B and L, a, b are the RGB and Lab color information respectively; x and y are the position information, namely the row and column coordinates of the pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient features, namely the first- and second-order differences of the pixel in the X and Y directions, calculated as:
f_x = (f(i+1, j) - f(i-1, j))/2
f_y = (f(i, j+1) - f(i, j-1))/2
f_xx = (f_x(i+1, j) - f_x(i-1, j))/2 (8)
f_yy = (f_y(i, j+1) - f_y(i, j-1))/2
f_xy = (f_x(i, j+1) - f_x(i, j-1))/2
where f (i, j) is the image matrix and i, j is the image pixel row column number.
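The central differences of formula (8) can be sketched as follows (a minimal illustration with zero borders; function and variable names are ours, not the patent's):

```python
import numpy as np

def gradient_features(f):
    """Central-difference gradients per formula (8); f is a 2-D image matrix.
    Border rows/columns, where the stencil does not fit, are left at zero."""
    f = np.asarray(f, dtype=float)
    fx = np.zeros_like(f); fy = np.zeros_like(f)
    fx[1:-1, :] = (f[2:, :] - f[:-2, :]) / 2.0   # f_x = (f(i+1,j) - f(i-1,j))/2
    fy[:, 1:-1] = (f[:, 2:] - f[:, :-2]) / 2.0   # f_y = (f(i,j+1) - f(i,j-1))/2
    fxx = np.zeros_like(f); fyy = np.zeros_like(f); fxy = np.zeros_like(f)
    fxx[1:-1, :] = (fx[2:, :] - fx[:-2, :]) / 2.0  # row-difference of f_x
    fyy[:, 1:-1] = (fy[:, 2:] - fy[:, :-2]) / 2.0  # column-difference of f_y
    fxy[:, 1:-1] = (fx[:, 2:] - fx[:, :-2]) / 2.0  # f_xy: column-difference of f_x
    return fx, fy, fxx, fyy, fxy
```

For a linear ramp f(i, j) = i, the interior of f_x is 1 and the other maps vanish, matching the analytic derivatives.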
The invention also correspondingly provides an image salient target region extraction system based on iterative sparse representation, which comprises the following modules,
a preprocessing module for data preprocessing, which sets several different SLIC superpixel numbers, performs multi-scale superpixel segmentation on the original image, applies saliency detection based on classical visual attention, and sets the detection result as the initial saliency map SAL_0;
The salient feature extraction module is used for extracting salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
the super-pixel region feature acquisition module is used for calculating the mean value of all original pixel features in the super-pixel region of each scale to obtain the super-pixel region features under single-scale segmentation;
the sparse representation module is used for calculating the saliency map through recursive sparse representation aiming at the segmentation result of a single scale, and comprises the following sub-modules:
a first sub-module for superpixel initial saliency calculation, which solves the initial saliency level of each superpixel through mean value calculation according to SAL_0;
a second sub-module for foreground sample extraction, which sorts the superpixel initial saliency levels in descending order and takes the top p1% of superpixels as the foreground sample D_f;
a third sub-module for background sample extraction, which sorts the superpixel initial saliency levels in ascending order, takes the top p2% of superpixels as the candidate background sample D_b1, and extracts the superpixels touching the image boundary as the candidate background sample D_b2; the background sample calculation formula is:
D_b = D_b1 + D_b2 - D_f (1)
a fourth sub-module for dual sparse representation and sparse residual calculation, which takes the foreground sample and the background sample respectively as dictionaries, sparsely represents all superpixels, and computes the reconstruction residuals, with the formulas:
α_bi = argmin_α ||F_i - D_b·α||₂² + λ_b·||α||₁ (2)
ε_bi = ||F_i - D_b·α_bi||₂² (3)
α_fi = argmin_α ||F_i - D_f·α||₂² + λ_f·||α||₁ (4)
ε_fi = ||F_i - D_f·α_fi||₂² (5)
wherein i denotes the superpixel index; F_i is the feature vector of superpixel region i; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the background and foreground sparse representation results; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
a fifth sub-module for saliency factor calculation, which fuses ε_bi and ε_fi according to formula (6), assigns each superpixel's fused result to all original-image pixels within that superpixel, and computes the saliency factor map SAL_i,
SAL_i = ε_bi/(ε_fi + σ²) (6)
wherein σ² is a non-negative tuning parameter;
a sixth sub-module for recursive processing, which calculates the correlation coefficient rela between the saliency factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela < K, SAL_0 = SAL_i is set and the whole process of the sparse representation module is repeated; if rela ≥ K, the recursion ends and the current SAL_i is output as the saliency detection result at this scale; wherein K is a similarity determination threshold,
rela = corr2(A, B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, while a smaller value indicates a larger difference;
and the detection result fusion module is used for fusing multi-scale significance detection results, performing equal-weight linear combination on significance results under each single scale, and calculating a final significance detection result.
Further, the salient features in the salient feature extraction module are 13-dimensional, comprising RGB, Lab, x, y, and first- and second-order gradients, expressed as {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}. The color feature is the six-dimensional {R, G, B, L, a, b}, where R, G, B and L, a, b are the RGB and Lab color information respectively; x and y are the position information, namely the row and column coordinates of the pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient features, namely the first- and second-order differences of the pixel in the X and Y directions, calculated as:
f_x = (f(i+1, j) - f(i-1, j))/2
f_y = (f(i, j+1) - f(i, j-1))/2
f_xx = (f_x(i+1, j) - f_x(i-1, j))/2 (8)
f_yy = (f_y(i, j+1) - f_y(i, j-1))/2
f_xy = (f_x(i, j+1) - f_x(i, j-1))/2
where f (i, j) is the image matrix and i, j is the image pixel row column number.
The method of the invention first performs superpixel segmentation on the original image with the SLIC method under several different superpixel-number parameters, generating a group of segmented images with different superpixel region sizes and establishing multi-scale source data. Then, for the segmentation result at each scale, the classical visual-attention detection result is taken as the initial saliency map to constrain the selection of foreground and background sample regions; the reconstruction residual of each superpixel region is computed as a saliency factor through a sparse representation process; the single-scale saliency detection result is optimized by recursive iterative operation; and the final salient target detection result is obtained through multi-scale saliency map fusion. The technical scheme of the invention has the following advantages:
1) The image is divided into superpixel images at multiple scales through several runs of SLIC segmentation. On one hand, the SLIC method effectively preserves image contour information, so that the interior of the same target region stays consistent during saliency detection; on the other hand, multi-scale segmentation gives the algorithm better adaptability and robustness when detecting targets of different sizes.
2) Pixel (region) saliency is calculated through a dual sparse representation process based on foreground and background dictionaries. On one hand, the reconstruction residual is used as the saliency index, judging the visual-saliency similarity between pixels from a global perspective; unlike traditional methods based on contrast and image boundary constraints, this effectively addresses incomplete target detection. On the other hand, the dual sparse representation analyzes the attributes of all pixels more comprehensively when judging their saliency levels, further improving the robustness of the algorithm.
3) The dependence of the algorithm on the initial saliency map generated based on the classical visual attention model can be weakened to a certain extent through the recursive optimization process, and the reliability of the algorithm is improved.
Drawings
Fig. 1 is a graph comparing the detection result of the salient object of the image processed by the method of the present invention in the embodiment of the present invention with the traditional method, (a) is an input image, (b) is a true value of the salient object, (c) is a detection result of the traditional local contrast-based method, (d) is a detection result of the global contrast-based method, (e) is a detection result of the image boundary constraint method, and (f) is a detection result of the method of the present invention.
FIG. 2 is a flow chart of an embodiment of the present invention.
Detailed Description
The following describes a specific embodiment of the present invention with reference to the drawings and examples.
The invention provides an image salient target region extraction method based on iterative sparse representation. The method extracts the target regions that most attract human visual attention by performing saliency analysis on the image; it can serve for effective data optimization and data compression, and is a basic link in many image processing problems. Research shows that traditional salient-object extraction methods based on local contrast, global contrast, and image boundary constraints generally depend strongly on their corresponding constraint conditions, and are prone to non-uniform saliency inside a single object, incomplete multi-target detection, and ambiguous extraction of salient objects near the image boundary, as shown in fig. 1(c)-(e). The method of the invention calculates the pixel saliency level using dual sparse representation and reconstruction residuals, optimizes the detection result through a recursive iterative process, and improves applicability by fusing multi-scale detection results. As shown in fig. 1(f), a single target in the detection result has better internal consistency, multiple targets are detected more completely, and the missed detection of salient targets close to the image boundary in traditional methods is alleviated to a certain extent. The embodiment fully demonstrates that the method outperforms traditional general salient-target extraction methods. As shown in fig. 2, the specific implementation provided by this embodiment comprises the following steps:
and step 1, data acquisition. The saliency detection opens up the source data set raw image and the saliency target truth data is downloaded.
Step 2: data preprocessing. Perform multi-scale segmentation on the original image, and carry out saliency detection with a classical visual attention model to generate the initial saliency map.
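SLIC itself is available in common libraries (for example `skimage.segmentation.slic`); the toy stand-in below only illustrates how this step produces one superpixel label map per scale setting. The block-grid partition is our simplification, not SLIC:

```python
import numpy as np

def grid_superpixels(h, w, n_segments):
    """Toy stand-in for SLIC: tile an h-by-w image into roughly n_segments
    square blocks. Real SLIC adapts block boundaries to image content; only
    the label-map shape and the multi-scale usage are the same."""
    side = max(1, int(round((h * w / n_segments) ** 0.5)))  # approximate block side
    rows = np.arange(h) // side
    cols = np.arange(w) // side
    ncols = cols.max() + 1
    return rows[:, None] * ncols + cols[None, :]  # (h, w) integer label map

# Multi-scale segmentation: one label map per superpixel-count setting.
scales = [50, 100, 200]
label_maps = [grid_superpixels(240, 320, n) for n in scales]
```

Each entry of `label_maps` plays the role of one single-scale segmentation result in the steps that follow.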
Step 3: extract the 13-dimensional features of the original-image pixels, namely {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}.
Step 4: extract the superpixel region features under single-scale segmentation through mean value calculation, namely F = {mR, mG, mB, mL, ma, mb, mx, my, mf_x, mf_y, mf_xx, mf_yy, mf_xy}, where F is the region feature vector and mX (X = R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy) is the mean of pixel attribute X over all original-image pixels in the superpixel region.
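The region-mean pooling of step 4 can be sketched as follows (names are illustrative; `features` stacks the per-pixel features of step 3 along the last axis):

```python
import numpy as np

def superpixel_means(features, labels):
    """features: (H, W, D) per-pixel feature stack; labels: (H, W) superpixel map
    with labels 0..n-1. Returns an (n, D) matrix of region mean feature vectors."""
    D = features.shape[2]
    flat_lab = labels.ravel()
    flat_feat = features.reshape(-1, D)
    n = flat_lab.max() + 1
    sums = np.zeros((n, D))
    np.add.at(sums, flat_lab, flat_feat)              # accumulate per-region sums
    counts = np.bincount(flat_lab, minlength=n).astype(float)
    return sums / counts[:, None]                     # per-region means
```

Row i of the result is the vector F_i used later as the superpixel's feature in the sparse representation step.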
Step 5, aiming at the segmentation result of the single scale, calculating a saliency map through recursive sparse representation, and comprising the following sub-steps of:
and 5.1, obtaining a super-pixel initial saliency map through mean value calculation.
Step 5.2: extract the superpixel regions with higher initial saliency levels as foreground samples. Sort the superpixel initial saliency levels in descending order and take the top p1% of superpixels as the foreground sample D_f; in this embodiment p1 = 20, and those skilled in the art can choose an appropriate value as needed.
Step 5.3: extract the background sample by combining the initial saliency map with the image boundary constraint. Sort the superpixel initial saliency levels in ascending order and take the top p2% of superpixels as the candidate background sample D_b1; in this embodiment p2 = 20, and those skilled in the art can choose an appropriate value as needed. Extract the superpixels touching the image boundary as the candidate background sample D_b2. The background sample calculation formula is:
D_b = D_b1 + D_b2 - D_f (1)
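Formula (1) reads naturally as set arithmetic on superpixel index sets: the union of the two candidate background sets minus the foreground set. A minimal sketch (names ours):

```python
def background_sample(db1, db2, df):
    """Formula (1) as set arithmetic: D_b = (D_b1 ∪ D_b2) \\ D_f.
    Arguments are collections of superpixel indices."""
    return (set(db1) | set(db2)) - set(df)

# Example: low-saliency {0, 3, 7}, boundary-touching {0, 1, 2}, foreground {2, 7}
db = background_sample({0, 3, 7}, {0, 1, 2}, {2, 7})  # → {0, 1, 3}
```

Subtracting D_f guarantees that no superpixel serves simultaneously as a foreground and a background dictionary atom.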
and 5.4, respectively carrying out two groups of sparse representations on all the super pixels of the image by using the foreground sample and the background sample, and calculating corresponding reconstruction residual errors. The formula is as follows:
wherein i represents a super pixel number; fiIs a feature vector of the superpixel region; lambda [ alpha ]b,λfIs a regularizing parameter, in this embodiment λb,λfAll are taken as 0.01; alpha is alphabi,αfiRespectively representing a foreground sparse representation result and a background sparse representation result; epsilonbi,εfiRespectively a foreground dilution reconstruction residual error and a background sparse reconstruction residual error;
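The l1-regularized reconstructions of step 5.4 can be solved, for example, with iterative soft-thresholding (ISTA). The sketch below assumes the standard LASSO form of the foreground/background coding; the solver choice and all names are ours, since the patent does not prescribe a particular optimizer:

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(D, f, lam, n_iter=500):
    """Solve min_a ||f - D a||_2^2 + lam * ||a||_1 by ISTA.
    D: (d, m) dictionary whose columns are sample feature vectors;
    f: (d,) feature vector of the superpixel to represent.
    Returns the sparse code a and the reconstruction residual eps."""
    L = 2.0 * np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ a - f)        # gradient of the smooth term
        a = soft_threshold(a - grad / L, lam / L)
    eps = float(np.sum((f - D @ a) ** 2))     # sparse reconstruction residual
    return a, eps
```

Running this once with the foreground dictionary and once with the background dictionary yields the pair (ε_fi, ε_bi) for each superpixel.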
and 5.5, fusing the two groups of reconstructed residuals to generate a super-pixel significance factor, and acquiring an original image significance factor graph by taking the requirement of consistent original image pixel significance in a super-pixel region as a criterion. According to formula (6) to epsilonbiAnd εfiFusing, giving the super-pixel fusion result to all original image pixels in the super-pixel fusion result, and calculating to obtain a significant factor graph SALi,
SALi=εbi/(εfi+σ2) (6)
Wherein sigma2The parameters are non-negative adjustment parameters, in this embodiment, 0.1 is taken, and a person skilled in the art can select a proper value as required;
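Formula (6) and the assignment of each superpixel's value to its pixels vectorize directly; a minimal sketch (σ² is passed as `sigma2`, names ours):

```python
import numpy as np

def saliency_factor(eps_b, eps_f, sigma2=0.1):
    """Formula (6): SAL = eps_b / (eps_f + sigma^2), element-wise per superpixel.
    A high background residual with a low foreground residual yields high saliency."""
    return np.asarray(eps_b) / (np.asarray(eps_f) + sigma2)

def to_pixel_map(sal_per_superpixel, labels):
    """Assign each superpixel's saliency to all of its original-image pixels."""
    return np.asarray(sal_per_superpixel)[labels]
```

`to_pixel_map` exploits numpy fancy indexing: the (H, W) label map selects, per pixel, the saliency value of its superpixel.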
and 5.6, comparing the significant factor graph obtained in the step 5.5 with the initial significant graph in the step 5.1, executing a recursion processing process, and outputting a significant detection result under the current scale when the recursion is finished. Calculating the significance factor graph SAL according to equation (7)iAnd an initial saliency map SAL0The rela coefficient between them, if rela<K, then let SAL0=SALiAnd the whole process of the step 4 is repeatedly executed; if rela>K, then the recursion is ended and the current SAL is outputiThe significance detection result at the scale is obtained; wherein K is a similarity determination threshold, and in this embodiment, K is 0.99, and a person skilled in the art can select an appropriate value as needed;
rela=corr2(A,B) (7)
wherein corr2() is the correlation coefficient calculation function; a and B are matrixes or images to be compared; rela is a correlation coefficient between A and B, the larger the value is, the more similar A and B are, otherwise, the difference is larger;
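corr2() in formula (7) is the standard 2-D correlation coefficient (as in MATLAB's `corr2`), and the recursion of step 5.6 is a simple fixed-point loop. The sketch below adds an iteration cap as a safety measure; that cap is our assumption, not part of the patent:

```python
import numpy as np

def corr2(A, B):
    """2-D correlation coefficient of two equally sized matrices."""
    A = np.asarray(A, float) - np.mean(A)
    B = np.asarray(B, float) - np.mean(B)
    return np.sum(A * B) / np.sqrt(np.sum(A ** 2) * np.sum(B ** 2))

def recursive_saliency(sal0, refine_step, K=0.99, max_iter=50):
    """Repeat the single-scale refinement until the new map is similar enough
    to the previous one (rela >= K). refine_step maps a saliency map to the
    next SAL_i (i.e. steps 5.2-5.5 re-run with the updated initial map)."""
    for _ in range(max_iter):
        sal_i = refine_step(sal0)
        if corr2(sal_i, sal0) >= K:
            return sal_i          # converged: output at this scale
        sal0 = sal_i              # otherwise recurse with SAL_0 = SAL_i
    return sal0                   # safety cap reached (assumption)
```

Since rela approaches 1 as successive maps stabilize, the loop terminates once the refinement stops changing the result.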
and 6, performing mean value fusion on the significance detection results under the multi-scale condition to generate a final target extraction result.
In principle, the whole technical scheme realizes consistent extraction of the entire salient target region in a natural-background image with the support of the sparse representation principle. Unlike traditional salient-target detection methods based on contrast and image boundary constraints, the invention makes full use of the comprehensive difference between foreground and background in the image and integrates the inherent relations within a single salient target and among multiple salient targets, thereby avoiding the hypothetical-condition dependency faced by traditional contrast-constrained and boundary-constrained detection methods. Sparse representation serves as the approach to image pixel consistency analysis, and the sparse reconstruction residual based on the image foreground and background dictionaries serves as the pixel difference index and saliency factor, realizing consistent extraction of multiple salient targets and ensuring both the internal integrity of a single salient target and the completeness of multi-target extraction.
In specific implementation, the technical scheme of the invention can realize automatic operation flow based on a computer software technology, and can also realize a corresponding system in a modularized mode. The embodiment of the invention provides an image salient target region extraction system based on iterative sparse representation, which comprises the following modules:
a preprocessing module for data preprocessing, which sets several different SLIC superpixel numbers, performs multi-scale superpixel segmentation on the original image, applies saliency detection based on classical visual attention, and sets the detection result as the initial saliency map SAL_0;
The salient feature extraction module is used for extracting salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
the super-pixel region feature acquisition module is used for calculating the mean value of all original pixel features in the super-pixel region of each scale to obtain the super-pixel region features under single-scale segmentation;
the sparse representation module is used for calculating the saliency map through recursive sparse representation aiming at the segmentation result of a single scale, and comprises the following sub-modules:
a first sub-module for superpixel initial saliency calculation, which solves the initial saliency level of each superpixel through mean value calculation according to SAL_0;
a second sub-module for foreground sample extraction, which sorts the superpixel initial saliency levels in descending order and takes the top p1% of superpixels as the foreground sample D_f;
a third sub-module for background sample extraction, which sorts the superpixel initial saliency levels in ascending order, takes the top p2% of superpixels as the candidate background sample D_b1, and extracts the superpixels touching the image boundary as the candidate background sample D_b2; the background sample calculation formula is:
D_b = D_b1 + D_b2 - D_f (1)
a fourth sub-module for dual sparse representation and sparse residual calculation, which takes the foreground sample and the background sample respectively as dictionaries, sparsely represents all superpixels, and computes the reconstruction residuals, with the formulas:
α_bi = argmin_α ||F_i - D_b·α||₂² + λ_b·||α||₁ (2)
ε_bi = ||F_i - D_b·α_bi||₂² (3)
α_fi = argmin_α ||F_i - D_f·α||₂² + λ_f·||α||₁ (4)
ε_fi = ||F_i - D_f·α_fi||₂² (5)
wherein i denotes the superpixel index; F_i is the feature vector of superpixel region i; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the background and foreground sparse representation results; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
a fifth sub-module for saliency factor calculation, which fuses ε_bi and ε_fi according to formula (6), assigns each superpixel's fused result to all original-image pixels within that superpixel, and computes the saliency factor map SAL_i,
SAL_i = ε_bi/(ε_fi + σ²) (6)
wherein σ² is a non-negative tuning parameter;
A sixth sub-module for recursive processing, configured to calculate the correlation coefficient rela between the saliency-factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela < K, SAL_0 is set to SAL_i and the whole recursive procedure of the sparse representation module is executed again; otherwise the recursion ends and the current SAL_i is output as the saliency detection result at this scale, where K is a similarity decision threshold and

rela = corr2(A, B) (7)

where corr2() is the 2-D correlation-coefficient function, A and B are the matrices or images to be compared, and rela is the correlation coefficient between A and B: the larger the value, the more similar A and B are, and vice versa;
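corr2 in formula (7) is MATLAB's 2-D correlation coefficient; a NumPy equivalent plus the recursion loop can be sketched as follows. The threshold K and the iteration cap are placeholders (the patent does not fix them), and `one_pass` stands for one run of the preceding sub-modules:

```python
import numpy as np

def corr2(A, B):
    """2-D correlation coefficient between two arrays, as in Eq. (7)."""
    a = A - A.mean()
    b = B - B.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def recursive_saliency(sal0, one_pass, K=0.95, max_iter=20):
    """Iterate the sparse-representation pass until SAL_i stops changing."""
    for _ in range(max_iter):
        sal_i = one_pass(sal0)
        if corr2(sal_i, sal0) >= K:   # similar enough: recursion ends
            return sal_i
        sal0 = sal_i                  # rela < K: SAL_0 := SAL_i, run again
    return sal_i
```

The loop converges because each pass re-selects foreground and background dictionaries from an increasingly stable saliency map.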
And the detection result fusion module is used for fusing the multi-scale saliency detection results: the saliency results at each single scale are linearly combined with equal weights to calculate the final saliency detection result.
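Equal-weight fusion across scales is a pixel-wise mean; a one-line sketch, assuming the per-scale maps have already been brought back to the original image size:

```python
import numpy as np

def fuse_scales(sal_maps):
    """Fusion module: equal-weight linear combination of the single-scale
    saliency maps (each of shape (H, W))."""
    return np.mean(np.stack(sal_maps, axis=0), axis=0)
```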
The salient features in the salient feature extraction module are 13-dimensional, comprising RGB, Lab, x, y, and first- and second-order gradients, expressed as F = {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}. The color features are the six dimensions R, G, B, L, a, b, where R, G, B and L, a, b are the RGB and Lab color information respectively; x and y are the position features, giving the row-column coordinates of a pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient features, namely the first- and second-order differences of the pixel in the x and y directions, calculated as follows:

f_x = (f(i+1, j) - f(i-1, j)) / 2
f_y = (f(i, j+1) - f(i, j-1)) / 2
f_xx = (f_x(i+1, j) - f_x(i-1, j)) / 2 (8)
f_yy = (f_y(i, j+1) - f_y(i, j-1)) / 2
f_xy = (f_x(i, j+1) - f_x(i, j-1)) / 2

where f(i, j) is the image matrix and i, j are the row and column indices of an image pixel.
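The central differences of formula (8) translate directly to array slicing; a sketch follows, where zero-valued borders are an assumption since the patent does not state how image borders are handled:

```python
import numpy as np

def gradient_features(f):
    """First- and second-order central differences per Eq. (8).

    f : (H, W) single-channel image; border pixels are left at zero.
    """
    f = np.asarray(f, dtype=float)
    fx = np.zeros_like(f)
    fy = np.zeros_like(f)
    fx[1:-1, :] = (f[2:, :] - f[:-2, :]) / 2.0    # f_x = (f(i+1,j) - f(i-1,j)) / 2
    fy[:, 1:-1] = (f[:, 2:] - f[:, :-2]) / 2.0    # f_y = (f(i,j+1) - f(i,j-1)) / 2
    fxx = np.zeros_like(f)
    fyy = np.zeros_like(f)
    fxy = np.zeros_like(f)
    fxx[1:-1, :] = (fx[2:, :] - fx[:-2, :]) / 2.0
    fyy[:, 1:-1] = (fy[:, 2:] - fy[:, :-2]) / 2.0
    fxy[:, 1:-1] = (fx[:, 2:] - fx[:, :-2]) / 2.0
    return fx, fy, fxx, fyy, fxy
```

On a linear ramp the first derivative is constant and the second derivative vanishes away from the borders, which makes a quick sanity check easy.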
The specific implementation of each module corresponds to the respective steps of the method described above and is not repeated here.
The above embodiments merely illustrate the basic technical solution of the present invention, and the invention is not limited to them. Persons skilled in the art may make simple modifications, additions, equivalent changes, or variations to the described embodiments without departing from the essential spirit of the invention or exceeding the scope defined by the claims.
Claims (4)
1. An image salient target region extraction method based on iterative sparse representation, characterized by comprising the following steps:
Step 1, data preprocessing: setting different SLIC superpixel numbers, performing multi-scale superpixel segmentation on the original image, applying classical visual-attention-based saliency detection, and taking the detection result as the initial saliency map SAL_0;
Step 2, extracting pixel-level salient features from the original image, the salient features comprising color features, position features and gradient features;
step 3, calculating the average value of all original pixel features in the superpixel region of each scale to obtain the superpixel region features under single-scale segmentation;
step 4, aiming at the segmentation result of a single scale, calculating a saliency map through recursive sparse representation, and comprising the following substeps:
Step 4.1, superpixel initial saliency calculation: according to SAL_0, the initial saliency level of each superpixel is obtained by mean-value calculation;
Step 4.2, extracting foreground samples: the initial saliency levels of the superpixels are sorted in descending order, and the top p1% of superpixels are taken as the foreground sample set D_f;
Step 4.3, extracting background samples: the initial saliency levels of the superpixels are sorted in ascending order, and the first p2% of superpixels are taken as candidate background samples D_b1; superpixels touching the image boundary are extracted as candidate background samples D_b2; the background sample set is calculated as:

D_b = D_b1 + D_b2 - D_f (1)
Step 4.4, dual sparse representation and sparse-residual calculation: with the foreground samples and background samples respectively serving as dictionaries, each superpixel is sparsely represented and its reconstruction residuals are calculated as follows:

α_bi = argmin_α ||F_i - D_b α||₂² + λ_b ||α||₁ (2)

ε_bi = ||F_i - D_b α_bi||₂² (3)

α_fi = argmin_α ||F_i - D_f α||₂² + λ_f ||α||₁ (4)

ε_fi = ||F_i - D_f α_fi||₂² (5)

where i is the superpixel index; F_i is the feature vector of superpixel region i; λ_b, λ_f are regularization parameters; α_bi, α_fi are the sparse-representation coefficients over the background and foreground dictionaries respectively; and ε_bi, ε_fi are the background and foreground sparse reconstruction residuals respectively;
Step 4.5, saliency-factor calculation: ε_bi and ε_fi are fused according to formula (6), the fused value of each superpixel is assigned to all original-image pixels within that superpixel, and the saliency-factor map SAL_i is obtained:

SAL_i = ε_bi / (ε_fi + σ²) (6)

where σ² is a non-negative tuning parameter;
Step 4.6, recursive processing: the correlation coefficient rela between the saliency-factor map SAL_i and the initial saliency map SAL_0 is calculated according to formula (7); if rela < K, SAL_0 is set to SAL_i and the whole of step 4 is executed again; otherwise the recursion ends and the current SAL_i is output as the saliency detection result at this scale, where K is a similarity decision threshold and

rela = corr2(A, B) (7)

where corr2() is the 2-D correlation-coefficient function, A and B are the matrices or images to be compared, and rela is the correlation coefficient between A and B: the larger the value, the more similar A and B are, and vice versa;
And step 5, fusing the multi-scale saliency detection results: the saliency results at each single scale are linearly combined with equal weights to calculate the final saliency detection result.
2. The image salient target region extraction method based on iterative sparse representation according to claim 1, wherein the salient features in step 2 are 13-dimensional, comprising RGB, Lab, x, y, and first- and second-order gradients, expressed as F = {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}; the color features are the six dimensions R, G, B, L, a, b, where R, G, B and L, a, b are the RGB and Lab color information respectively; x and y are the position features, giving the row-column coordinates of a pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient features, namely the first- and second-order differences of the pixel in the x and y directions, calculated as follows:

f_x = (f(i+1, j) - f(i-1, j)) / 2
f_y = (f(i, j+1) - f(i, j-1)) / 2
f_xx = (f_x(i+1, j) - f_x(i-1, j)) / 2 (8)
f_yy = (f_y(i, j+1) - f_y(i, j-1)) / 2
f_xy = (f_x(i, j+1) - f_x(i, j-1)) / 2

where f(i, j) is the image matrix and i, j are the row and column indices of an image pixel.
3. An image salient target region extraction system based on iterative sparse representation, characterized by comprising the following modules:
A preprocessing module for data preprocessing: setting different SLIC superpixel numbers, performing multi-scale superpixel segmentation on the original image, applying classical visual-attention-based saliency detection, and taking the detection result as the initial saliency map SAL_0;
A salient feature extraction module for extracting pixel-level salient features from the original image, the salient features comprising color features, position features and gradient features;
the super-pixel region feature acquisition module is used for calculating the mean value of all original pixel features in the super-pixel region of each scale to obtain the super-pixel region features under single-scale segmentation;
the sparse representation module is used for calculating the saliency map through recursive sparse representation aiming at the segmentation result of a single scale, and comprises the following sub-modules:
A first sub-module for superpixel initial saliency calculation: according to SAL_0, the initial saliency level of each superpixel is obtained by mean-value calculation;
A second sub-module for extracting foreground samples: the initial saliency levels of the superpixels are sorted in descending order, and the top p1% of superpixels are taken as the foreground sample set D_f;
A third sub-module for extracting background samples: the initial saliency levels of the superpixels are sorted in ascending order, and the first p2% of superpixels are taken as candidate background samples D_b1; superpixels touching the image boundary are extracted as candidate background samples D_b2; the background sample set is calculated as:

D_b = D_b1 + D_b2 - D_f (1)
A fourth sub-module for dual sparse representation and sparse-residual calculation: with the foreground samples and background samples respectively serving as dictionaries, each superpixel is sparsely represented and its reconstruction residuals are calculated as follows:

α_bi = argmin_α ||F_i - D_b α||₂² + λ_b ||α||₁ (2)

ε_bi = ||F_i - D_b α_bi||₂² (3)

α_fi = argmin_α ||F_i - D_f α||₂² + λ_f ||α||₁ (4)

ε_fi = ||F_i - D_f α_fi||₂² (5)

where i is the superpixel index; F_i is the feature vector of superpixel region i; λ_b, λ_f are regularization parameters; α_bi, α_fi are the sparse-representation coefficients over the background and foreground dictionaries respectively; and ε_bi, ε_fi are the background and foreground sparse reconstruction residuals respectively;
A fifth sub-module for saliency-factor calculation: ε_bi and ε_fi are fused according to formula (6), the fused value of each superpixel is assigned to all original-image pixels within that superpixel, and the saliency-factor map SAL_i is obtained:

SAL_i = ε_bi / (ε_fi + σ²) (6)

where σ² is a non-negative tuning parameter;
A sixth sub-module for recursive processing, configured to calculate the correlation coefficient rela between the saliency-factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela < K, SAL_0 is set to SAL_i and the whole recursive procedure of the sparse representation module is executed again; otherwise the recursion ends and the current SAL_i is output as the saliency detection result at this scale, where K is a similarity decision threshold and

rela = corr2(A, B) (7)

where corr2() is the 2-D correlation-coefficient function, A and B are the matrices or images to be compared, and rela is the correlation coefficient between A and B: the larger the value, the more similar A and B are, and vice versa;
And a detection result fusion module for fusing the multi-scale saliency detection results: the saliency results at each single scale are linearly combined with equal weights to calculate the final saliency detection result.
4. The iterative sparse representation-based image salient target region extraction system according to claim 3, wherein the salient features in the salient feature extraction module are 13-dimensional, comprising RGB, Lab, x, y, and first- and second-order gradients, expressed as F = {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}; the color features are the six dimensions R, G, B, L, a, b, where R, G, B and L, a, b are the RGB and Lab color information respectively; x and y are the position features, giving the row-column coordinates of a pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient features, namely the first- and second-order differences of the pixel in the x and y directions, calculated as follows:

f_x = (f(i+1, j) - f(i-1, j)) / 2
f_y = (f(i, j+1) - f(i, j-1)) / 2
f_xx = (f_x(i+1, j) - f_x(i-1, j)) / 2 (8)
f_yy = (f_y(i, j+1) - f_y(i, j-1)) / 2
f_xy = (f_x(i, j+1) - f_x(i, j-1)) / 2

where f(i, j) is the image matrix and i, j are the row and column indices of an image pixel.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711387624.3A CN107992874B (en) | 2017-12-20 | 2017-12-20 | Image salient target region extraction method and system based on iterative sparse representation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107992874A CN107992874A (en) | 2018-05-04 |
CN107992874B true CN107992874B (en) | 2020-01-07 |
Family
ID=62039459
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711387624.3A Active CN107992874B (en) | 2017-12-20 | 2017-12-20 | Image salient target region extraction method and system based on iterative sparse representation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107992874B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117852B (en) * | 2018-07-10 | 2021-08-17 | 武汉大学 | Unmanned aerial vehicle image adaptation area automatic extraction method and system based on sparse representation |
CN109102465A (en) * | 2018-08-22 | 2018-12-28 | 周泽奇 | A kind of calculation method of the content erotic image auto zoom of conspicuousness depth of field feature |
CN109886267A (en) * | 2019-01-29 | 2019-06-14 | 杭州电子科技大学 | A kind of soft image conspicuousness detection method based on optimal feature selection |
CN110490204B (en) * | 2019-07-11 | 2022-07-15 | 深圳怡化电脑股份有限公司 | Image processing method, image processing device and terminal |
CN111191650B (en) * | 2019-12-30 | 2023-07-21 | 北京市新技术应用研究所 | Article positioning method and system based on RGB-D image visual saliency |
CN111242941B (en) * | 2020-01-20 | 2023-05-30 | 南方科技大学 | Salient region detection method and device based on visual attention |
CN111274964B (en) * | 2020-01-20 | 2023-04-07 | 中国地质大学(武汉) | Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle |
CN112766032A (en) * | 2020-11-26 | 2021-05-07 | 电子科技大学 | SAR image saliency map generation method based on multi-scale and super-pixel segmentation |
CN112700438B (en) * | 2021-01-14 | 2024-06-21 | 成都铁安科技有限责任公司 | Ultrasonic flaw judgment method and ultrasonic flaw judgment system for imbedded part of train axle |
CN114332572B (en) * | 2021-12-15 | 2024-03-26 | 南方医科大学 | Method for extracting breast lesion ultrasonic image multi-scale fusion characteristic parameters based on saliency map-guided hierarchical dense characteristic fusion network |
CN115424037A (en) * | 2022-10-12 | 2022-12-02 | 武汉大学 | Salient target region extraction method based on multi-scale sparse representation |
CN115690418B (en) * | 2022-10-31 | 2024-03-12 | 武汉大学 | Unsupervised automatic detection method for image waypoints |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7526123B2 (en) * | 2004-02-12 | 2009-04-28 | Nec Laboratories America, Inc. | Estimating facial pose from a sparse representation |
CN101556690A (en) * | 2009-05-14 | 2009-10-14 | 复旦大学 | Image super-resolution method based on overcomplete dictionary learning and sparse representation |
CN101980284A (en) * | 2010-10-26 | 2011-02-23 | 北京理工大学 | Two-scale sparse representation-based color image noise reduction method |
CN104240256A (en) * | 2014-09-25 | 2014-12-24 | 西安电子科技大学 | Image salient detecting method based on layering sparse modeling |
CN105930812A (en) * | 2016-04-27 | 2016-09-07 | 东南大学 | Vehicle brand type identification method based on fusion feature sparse coding model |
CN106203430A (en) * | 2016-07-07 | 2016-12-07 | 北京航空航天大学 | A kind of significance object detecting method based on foreground focused degree and background priori |
CN106530271A (en) * | 2016-09-30 | 2017-03-22 | 河海大学 | Infrared image significance detection method |
CN106815842A (en) * | 2017-01-23 | 2017-06-09 | 河海大学 | A kind of improved image significance detection method based on super-pixel |
CN107067037A (en) * | 2017-04-21 | 2017-08-18 | 河南科技大学 | A kind of method that use LLC criterions position display foreground |
CN107229917A (en) * | 2017-05-31 | 2017-10-03 | 北京师范大学 | A kind of several remote sensing image general character well-marked target detection methods clustered based on iteration |
CN107301643A (en) * | 2017-06-06 | 2017-10-27 | 西安电子科技大学 | Well-marked target detection method based on robust rarefaction representation Yu Laplce's regular terms |
Non-Patent Citations (5)
Title |
---|
Yongjun Zhang et al., "Approximate Correction of Length Distortion for Direct Georeferencing in Map Projection Frame", IEEE, vol. 10, no. 6, Nov. 2013. *
Ruimin Wang et al., "Construction of Manifolds via Compatible Sparse Representations", ACM, 2016. *
Shengxiang Qi et al., "Salient object detection via contrast information and object vision organization cues", Elsevier, Apr. 2015. *
Rim Walha et al., "Super-Resolution of Single Text Image by Sparse Representation", ACM, 2012. *
Nannan Qi et al., "Infrared vehicle detection based on visual saliency and target confidence" (in Chinese), CNKI, vol. 46, no. 6, Jun. 2017. *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107992874B (en) | Image salient target region extraction method and system based on iterative sparse representation | |
Januszewski et al. | High-precision automated reconstruction of neurons with flood-filling networks | |
CN108154118B (en) | A kind of target detection system and method based on adaptive combined filter and multistage detection | |
Gould et al. | Region-based segmentation and object detection | |
CN106778687B (en) | Fixation point detection method based on local evaluation and global optimization | |
CN107680106A (en) | A kind of conspicuousness object detection method based on Faster R CNN | |
CN111612008A (en) | Image segmentation method based on convolution network | |
CN110033007A (en) | Attribute recognition approach is worn clothes based on the pedestrian of depth attitude prediction and multiple features fusion | |
CN109033978B (en) | Error correction strategy-based CNN-SVM hybrid model gesture recognition method | |
US20220237789A1 (en) | Weakly supervised multi-task learning for cell detection and segmentation | |
CN108230330B (en) | Method for quickly segmenting highway pavement and positioning camera | |
Savian et al. | Optical flow estimation with deep learning, a survey on recent advances | |
CN107423771B (en) | Two-time-phase remote sensing image change detection method | |
Sima et al. | Bottom-up merging segmentation for color images with complex areas | |
Wang et al. | Pedestrian detection in infrared image based on depth transfer learning | |
Madessa et al. | A deep learning approach for specular highlight removal from transmissive materials | |
Li et al. | DeepSIR: Deep semantic iterative registration for LiDAR point clouds | |
Dmitriev12 et al. | Efficient correction for em connectomics with skeletal representation | |
CN108664968B (en) | Unsupervised text positioning method based on text selection model | |
Ma et al. | An attention-based progressive fusion network for pixelwise pavement crack detection | |
Wang et al. | Nuclei instance segmentation using a transformer-based graph convolutional network and contextual information augmentation | |
Imtiaz et al. | BAWGNet: Boundary aware wavelet guided network for the nuclei segmentation in histopathology images | |
CN116228795A (en) | Ultrahigh resolution medical image segmentation method based on weak supervised learning | |
CN115861630A (en) | Cross-waveband infrared target detection method and device, computer equipment and storage medium | |
Bhavani et al. | Robust 3D face recognition in unconstrained environment using distance based ternary search siamese network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||