CN107992874B - Image salient target region extraction method and system based on iterative sparse representation - Google Patents

Image salient target region extraction method and system based on iterative sparse representation

Info

Publication number
CN107992874B
CN107992874B (application CN201711387624.3A)
Authority
CN
China
Prior art keywords
pixel
sal
image
significance
scale
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711387624.3A
Other languages
Chinese (zh)
Other versions
CN107992874A (en)
Inventor
张永军
王祥
谢勋伟
李彦胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201711387624.3A priority Critical patent/CN107992874B/en
Publication of CN107992874A publication Critical patent/CN107992874A/en
Application granted granted Critical
Publication of CN107992874B publication Critical patent/CN107992874B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a system for extracting a salient target region of an image based on iterative sparse representation. The method first performs superpixel segmentation on the original image with a SLIC (simple linear iterative clustering) method under several different superpixel-number parameters, generating a group of segmented images with superpixel regions of different sizes. Then, for the segmentation result at each scale, a classical visual attention detection result is taken as the initial saliency map to constrain the selection of foreground and background sample regions; the reconstruction residual of each superpixel region is further calculated as a saliency factor through a sparse representation process, the single-scale saliency detection result map is optimized by recursive iterative operation, and the final salient target detection result is obtained through multi-scale saliency map fusion. The method can effectively remedy the defects of traditional methods, such as inconsistent saliency evaluation inside a single target, difficulty in detecting salient targets near the image edge, and incomplete extraction of multiple salient targets.

Description

Image salient target region extraction method and system based on iterative sparse representation
Technical Field
The invention belongs to the field of computer vision and image processing, and relates to an image salient target region extraction technology based on iterative sparse representation.
Background
Image visual saliency analysis is an important basic research topic in computer vision, psychology, neuroscience and related fields; it is a computational counterpart of the biological ability of the human eye to quickly and accurately capture, from a scene, the target regions that draw visual attention. Through image saliency analysis, the target regions people are interested in can be effectively extracted, data compression can be achieved, and efficient management and utilization of data can be completed; it is also a basic link in many image processing problems.
Automatic saliency analysis of images by computer was first realized in 1998, and as its application prospects have been continuously explored, new automatic salient object detection algorithms have emerged one after another. From the point of view of the solution strategy, existing salient object extraction algorithms can be roughly divided into two categories: data-driven bottom-up detection methods and task-driven top-down detection methods. The former automatically processes and identifies an input image according to empirical cognition to realize traditional cognitive saliency analysis, and is usually an unsupervised automatic extraction algorithm; the latter performs targeted analysis of the image in combination with an actual target task, extracting target objects that meet specific application requirements, and is usually a recognition algorithm under supervised learning. On the other hand, from the perspective of the form of the extracted result, existing methods can be further divided into visual-attention-based saliency analysis algorithms, which generate a pixel-level saliency prediction map, and salient object extraction methods, which take the extracted complete salient target region as the final goal.
In bottom-up unsupervised approaches, due to the lack of high-level biological cognitive information, certain hypothetical constraints are usually introduced to complete the detection task. Empirical analysis shows that objects distributed near the middle of the image are more attractive to visual attention, whereas the saliency of areas near the image edges is generally lower; meanwhile, local areas with high contrast also show higher visual saliency. Saliency detection methods combining image center/boundary constraints and contrast analysis have therefore developed rapidly and shown outstanding detection performance. However, as research and application deepen, the dependence of these methods on hypothetical conditions becomes more and more prominent: 1) when a salient object is close to the image edge, it usually cannot be detected correctly; 2) with methods based on local contrast analysis, the extracted salient target region is incomplete and the saliency evaluation inside the target is not uniform; 3) methods based on global contrast analysis often fail when multiple salient objects are present at the same time. Therefore, how to overcome these defects, weaken the dependence on hypothetical constraints in the absence of high-level cognitive information, improve the uniformity and integrity of salient target extraction, and strengthen the adaptability of the algorithm remains a technical problem that needs further research.
Disclosure of Invention
The invention aims to provide a technical scheme for consistently extracting image salient target regions under a natural background, which can make full use of the comprehensive differences between foreground and background in the image, integrate the inherent relations among salient targets, weaken the dependence of the saliency analysis process on traditional hypothetical constraints, realize consistent extraction of multiple salient targets, and ensure both the internal integrity of a single salient target and the completeness of multiple salient targets.
In order to achieve the above object, the technical solution provided by the present invention is a method for extracting a salient object region of an image based on iterative sparse representation, comprising the following steps:
step 1, preprocessing data: setting different SLIC superpixel numbers, carrying out multi-scale superpixel segmentation on the original image, using saliency detection based on classical visual attention, and setting the detection result as the initial saliency map SAL_0;
Step 2, extracting the salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
step 3, calculating the average value of all original pixel features in the superpixel region of each scale to obtain the superpixel region features under single-scale segmentation;
step 4, aiming at the segmentation result of a single scale, calculating a saliency map through recursive sparse representation, and comprising the following substeps:
step 4.1, superpixel initial saliency calculation: according to SAL_0, solving the initial significance level of each superpixel through mean value calculation;
step 4.2, extracting foreground samples: arranging the initial significance levels of the superpixels in descending order, and taking the top p1% of superpixels as the foreground sample D_f;
step 4.3, extracting background samples: arranging the initial significance levels of the superpixels in ascending order, and taking the top p2% of superpixels as the alternative background sample D_b1; extracting the superpixels touching the image boundary as the alternative background sample D_b2; the background sample calculation formula is as follows:
D_b = D_b1 + D_b2 - D_f (1)
step 4.4, performing double sparse representation and sparse residual calculation, wherein the foreground sample and the background sample are respectively used as dictionaries to perform sparse representation on all the superpixels, and a reconstructed residual is calculated, wherein the formula is as follows:
α_fi = argmin_α ||F_i - D_f·α||_2^2 + λ_f·||α||_1 (2)
ε_fi = ||F_i - D_f·α_fi||_2^2 (3)
α_bi = argmin_α ||F_i - D_b·α||_2^2 + λ_b·||α||_1 (4)
ε_bi = ||F_i - D_b·α_bi||_2^2 (5)
wherein i represents the superpixel number; F_i is the feature vector of the superpixel region; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the sparse representation results over the background and foreground dictionaries; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
step 4.5, calculating the significance factor: fusing ε_bi and ε_fi according to formula (6), assigning the fusion result of each superpixel to all original image pixels within that superpixel, and calculating the significance factor map SAL_i;
SAL_i = ε_bi / (ε_fi + σ^2) (6)
wherein σ^2 is a non-negative tuning parameter;
step 4.6, recursive processing: calculating the correlation coefficient rela between the significance factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela < K, let SAL_0 = SAL_i and the whole process of step 4 is repeatedly executed; if rela > K, the recursion is ended and the current SAL_i is output as the saliency detection result at this scale; wherein K is a similarity determination threshold,
rela=corr2(A,B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, otherwise the greater their difference;
and 5, fusing multi-scale significance detection results, performing equal-weight linear combination on significance results under each single scale, and calculating a final significance detection result.
Further, in step 2, the salient features comprise RGB, Lab, x, y, and first-order and second-order gradients, 13 dimensions in total, expressed as {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}, wherein R, G, B, L, a, b constitute the six-dimensional color feature, R, G, B and L, a, b being the RGB and Lab color information respectively; x and y are the position feature, i.e. the row-column coordinates of the pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient feature, representing the first-order and second-order differences of the pixel in the X and Y directions respectively, and their calculation formula is as follows:
f_x = (f(i+1,j) - f(i-1,j))/2
f_y = (f(i,j+1) - f(i,j-1))/2
f_xx = (f_x(i+1,j) - f_x(i-1,j))/2 (8)
f_yy = (f_y(i,j+1) - f_y(i,j-1))/2
f_xy = (f_x(i,j+1) - f_x(i,j-1))/2
where f (i, j) is the image matrix and i, j is the image pixel row column number.
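For illustration only (not part of the claimed method), the 13-dimensional pixel feature and the differences of formula (8) could be computed as in the following Python sketch; it assumes an 8-bit RGB input, computes the gradients on the grayscale image (the patent does not specify the channel), and uses scikit-image for the Lab conversion.

```python
import numpy as np
from skimage import color

def pixel_features(rgb):
    """Assemble the 13-D per-pixel feature {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}.

    np.gradient reproduces the central differences of formula (8) at interior pixels;
    computing them on the grayscale image is an assumption.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    lab = color.rgb2lab(rgb / 255.0)            # L, a, b channels
    gray = color.rgb2gray(rgb / 255.0)          # f(i, j) used for the gradients
    h, w = gray.shape
    ii, jj = np.mgrid[0:h, 0:w]                 # row-column coordinates (x, y)
    fx = np.gradient(gray, axis=0)              # first-order differences
    fy = np.gradient(gray, axis=1)
    fxx = np.gradient(fx, axis=0)               # second-order differences
    fyy = np.gradient(fy, axis=1)
    fxy = np.gradient(fx, axis=1)
    feats = np.dstack([rgb, lab, ii, jj, fx, fy, fxx, fyy, fxy])
    return feats.reshape(-1, 13)                # one 13-D row per original pixel
```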
The invention also correspondingly provides an image salient target region extraction system based on iterative sparse representation, which comprises the following modules,
a preprocessing module for preprocessing data: setting different SLIC superpixel numbers, performing multi-scale superpixel segmentation on the original image, using saliency detection based on classical visual attention, and setting the detection result as the initial saliency map SAL_0;
The salient feature extraction module is used for extracting salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
the super-pixel region feature acquisition module is used for calculating the mean value of all original pixel features in the super-pixel region of each scale to obtain the super-pixel region features under single-scale segmentation;
the sparse representation module is used for calculating the saliency map through recursive sparse representation aiming at the segmentation result of a single scale, and comprises the following sub-modules:
a first sub-module for superpixel initial saliency calculation: according to SAL_0, solving the initial significance level of each superpixel through mean value calculation;
a second sub-module for extracting foreground samples: arranging the initial significance levels of the superpixels in descending order, and taking the top p1% of superpixels as the foreground sample D_f;
a third sub-module for extracting background samples: arranging the initial significance levels of the superpixels in ascending order, taking the top p2% of superpixels as the alternative background sample D_b1, and extracting the superpixels touching the image boundary as the alternative background sample D_b2; the background sample calculation formula is as follows:
D_b = D_b1 + D_b2 - D_f (1)
the fourth submodule is used for dual sparse representation and sparse residual calculation, all the superpixels are sparsely represented and reconstructed residual is calculated by taking the foreground sample and the background sample as dictionaries, and the formula is as follows:
α_fi = argmin_α ||F_i - D_f·α||_2^2 + λ_f·||α||_1 (2)
ε_fi = ||F_i - D_f·α_fi||_2^2 (3)
α_bi = argmin_α ||F_i - D_b·α||_2^2 + λ_b·||α||_1 (4)
ε_bi = ||F_i - D_b·α_bi||_2^2 (5)
wherein i represents the superpixel number; F_i is the feature vector of the superpixel region; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the sparse representation results over the background and foreground dictionaries; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
a fifth sub-module for calculating the significance factor: fusing ε_bi and ε_fi according to formula (6), assigning the fusion result of each superpixel to all original image pixels within that superpixel, and calculating the significance factor map SAL_i;
SAL_i = ε_bi / (ε_fi + σ^2) (6)
wherein σ^2 is a non-negative tuning parameter;
a sixth sub-module for recursive processing: calculating the correlation coefficient rela between the significance factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela < K, SAL_0 = SAL_i is set and the whole process of the sparse representation module is repeatedly executed; if rela > K, the recursion is ended and the current SAL_i is output as the saliency detection result at this scale; wherein K is a similarity determination threshold,
rela=corr2(A,B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, otherwise the greater their difference;
and the detection result fusion module is used for fusing multi-scale significance detection results, performing equal-weight linear combination on significance results under each single scale, and calculating a final significance detection result.
Further, the salient features in the salient feature extraction module comprise RGB, Lab, x, y, and first-order and second-order gradients, 13 dimensions in total, expressed as {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}, wherein R, G, B, L, a, b constitute the six-dimensional color feature, R, G, B and L, a, b being the RGB and Lab color information respectively; x and y are the position feature, i.e. the row-column coordinates of the pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient feature, representing the first-order and second-order differences of the pixel in the X and Y directions respectively, and their calculation formula is as follows:
f_x = (f(i+1,j) - f(i-1,j))/2
f_y = (f(i,j+1) - f(i,j-1))/2
f_xx = (f_x(i+1,j) - f_x(i-1,j))/2 (8)
f_yy = (f_y(i,j+1) - f_y(i,j-1))/2
f_xy = (f_x(i,j+1) - f_x(i,j-1))/2
where f (i, j) is the image matrix and i, j is the image pixel row column number.
The method of the invention firstly utilizes a SLIC segmentation method of a plurality of groups of different pixel number parameters to carry out superpixel segmentation on an original image, generates a group of segmented images with different superpixel areas and establishes multi-scale source data. And then, aiming at the segmentation result of each scale, taking the classical visual attention detection result as the initial saliency map to restrict the selection of foreground and background sample regions, further calculating the reconstructed residual error of each super-pixel region as a saliency factor through a sparse representation process, optimizing a saliency detection result map under a single scale by combining recursive iterative operation, and finally obtaining a final saliency target and a detection result through multi-scale saliency map fusion. The technical scheme of the invention has the following advantages:
1) the image is divided into multi-scale superpixel images through several groups of SLIC segmentations, so that, on the one hand, the SLIC method effectively preserves image contour information and keeps the interior of the same target region consistent during saliency detection; on the other hand, multi-scale segmentation gives the algorithm better adaptability and robustness for detecting targets of different sizes.
2) The pixel (region) significance is calculated through a dual sparse representation process based on a foreground dictionary and a background dictionary. On the one hand, the reconstruction residual is used as the significance level index and the visual saliency similarity between pixels is judged from a global perspective, which differs from traditional methods based on contrast and image boundary constraints and can effectively solve the problem of incomplete target detection; on the other hand, the dual sparse representation analyses the attributes of all pixels more comprehensively when judging their significance levels, which further improves the robustness of the algorithm.
3) The dependence of the algorithm on the initial saliency map generated based on the classical visual attention model can be weakened to a certain extent through the recursive optimization process, and the reliability of the algorithm is improved.
Drawings
Fig. 1 compares the salient object detection results of the method of the present invention with those of traditional methods in the embodiment of the present invention: (a) the input image; (b) the salient object ground truth; (c) the detection result of a traditional local-contrast-based method; (d) the detection result of a global-contrast-based method; (e) the detection result of an image-boundary-constraint method; and (f) the detection result of the method of the present invention.
FIG. 2 is a flow chart of an embodiment of the present invention.
Detailed Description
The following describes a specific embodiment of the present invention with reference to the drawings and examples.
The invention provides an image salient target region extraction method based on iterative sparse representation. By carrying out saliency analysis on an image, the method extracts the target regions that attract the most human visual attention, which serves effective data optimization and data compression and is a basic link of many image processing problems. Research shows that traditional salient object extraction methods based on local contrast, global contrast and image boundary constraints generally depend strongly on the corresponding constraint conditions, and are prone to non-uniform saliency inside a single object, incomplete multi-target detection, and failure to extract salient objects near the image boundary, as shown in Fig. 1(c)-(e). The method of the invention calculates the pixel significance level using dual sparse representation and reconstruction residuals, optimizes the detection result through a recursive iterative process, and improves applicability by fusing multi-scale detection results. As shown in Fig. 1(f), the single-target saliency in the detection result of the method has better consistency, multi-target detection is more complete, and the missed detection of salient objects close to the image boundary in traditional methods is remedied to a certain extent; the embodiment fully demonstrates that the method has stronger detection performance than traditional general salient object extraction methods. As shown in Fig. 2, the specific implementation provided by this embodiment comprises the following steps:
and step 1, data acquisition. The saliency detection opens up the source data set raw image and the saliency target truth data is downloaded.
And 2, preprocessing data: carrying out multi-scale superpixel segmentation on the original image, and carrying out saliency detection with a classical visual attention model to generate the initial saliency map SAL_0.
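A minimal Python sketch of this preprocessing, assuming scikit-image's SLIC implementation, is given below; the superpixel numbers are illustrative, and the initial saliency map SAL_0 is assumed to come from any classical visual attention model computed separately.

```python
import numpy as np
from skimage.segmentation import slic

def multiscale_superpixels(image, n_segments_list=(100, 200, 300, 400)):
    """Run SLIC with several superpixel-number parameters (one label map per scale)."""
    return [slic(image, n_segments=n, compactness=10, start_label=0)
            for n in n_segments_list]

def superpixel_mean_saliency(sal0, labels):
    """Mean of the initial saliency map SAL_0 inside each superpixel (step 5.1)."""
    n_sp = labels.max() + 1
    return np.array([sal0[labels == k].mean() for k in range(n_sp)])
```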
And 3, extracting the 13-dimensional features of the original image pixels, namely {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}.
And 4, extracting the superpixel region features under single-scale segmentation through mean value calculation, namely F = {mR, mG, mB, mL, ma, mb, mx, my, mf_x, mf_y, mf_xx, mf_yy, mf_xy}, where F is the region feature vector and mX (X = R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy) is the mean of the attribute X over all original image pixels within the superpixel region.
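Assuming pixel_feats is the (H·W)×13 matrix of pixel features from step 3 and labels a single-scale SLIC label map, the region features could be computed as in the following illustrative sketch.

```python
import numpy as np

def superpixel_features(pixel_feats, labels):
    """Region feature F: per-superpixel mean of the 13-D pixel features."""
    lab = labels.ravel()
    n_sp = lab.max() + 1
    feats = np.zeros((n_sp, pixel_feats.shape[1]))
    for k in range(n_sp):
        feats[k] = pixel_feats[lab == k].mean(axis=0)   # mean over the pixels of superpixel k
    return feats
```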
Step 5, aiming at the segmentation result of the single scale, calculating a saliency map through recursive sparse representation, and comprising the following sub-steps of:
and 5.1, obtaining a super-pixel initial saliency map through mean value calculation.
And 5.2, extracting the superpixel regions with higher initial significance levels as foreground samples: arranging the initial significance levels of the superpixels in descending order, and taking the top p1% of superpixels as the foreground sample D_f; in this embodiment p1 = 20, and those skilled in the art can select an appropriate value as required;
and 5.3, extracting a background sample by combining the initial saliency map and the image boundary constraint. The initial significance levels of the super-pixels are arranged in an ascending order, and the first p2% of the super-pixels are taken as alternative background samples Db1In this embodiment, p2 is 20, and those skilled in the art can select an appropriate value as required; extracting superpixels contacting the image boundary as alternative background samples Db2The background sample calculation formula is as follows:
D_b = D_b1 + D_b2 - D_f (1)
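A possible implementation of the sample selection of steps 5.2-5.3 and formula (1) is sketched below; the handling of the percentage cut-offs (rounding, keeping at least one sample) is an implementation assumption.

```python
import numpy as np

def select_samples(init_sal, labels, p1=20, p2=20):
    """Foreground sample D_f and background sample D_b = D_b1 + D_b2 - D_f (formula (1))."""
    n_sp = len(init_sal)
    desc = np.argsort(-init_sal)
    fg = set(desc[: max(1, int(n_sp * p1 / 100))].tolist())        # D_f: top p1% most salient
    asc = np.argsort(init_sal)
    bg1 = set(asc[: max(1, int(n_sp * p2 / 100))].tolist())        # D_b1: top p2% least salient
    border = np.unique(np.concatenate([labels[0, :], labels[-1, :],
                                       labels[:, 0], labels[:, -1]]))
    bg2 = set(border.tolist())                                     # D_b2: superpixels on the image boundary
    bg = (bg1 | bg2) - fg                                          # formula (1)
    return sorted(fg), sorted(bg)
```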
and 5.4, respectively carrying out two groups of sparse representations on all the super pixels of the image by using the foreground sample and the background sample, and calculating corresponding reconstruction residual errors. The formula is as follows:
α_fi = argmin_α ||F_i - D_f·α||_2^2 + λ_f·||α||_1 (2)
ε_fi = ||F_i - D_f·α_fi||_2^2 (3)
α_bi = argmin_α ||F_i - D_b·α||_2^2 + λ_b·||α||_1 (4)
ε_bi = ||F_i - D_b·α_bi||_2^2 (5)
wherein i represents the superpixel number; F_i is the feature vector of the superpixel region; λ_b, λ_f are regularization parameters, both taken as 0.01 in this embodiment; α_bi, α_fi are respectively the sparse representation results over the background and foreground dictionaries; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
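The double sparse representation of step 5.4 could be sketched as follows. The patent does not name a particular solver; scikit-learn's Lasso is used here as one possible L1-regularized least-squares solver (its objective scales the data-fit term by 1/(2·n), so its alpha corresponds to λ only up to a constant factor).

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_residuals(F, dict_idx, lam=0.01):
    """Sparse-code each region feature F_i over the dictionary formed by the sample
    superpixels (columns of D) and return the reconstruction residuals."""
    D = F[dict_idx].T                              # 13 x n_atoms dictionary
    solver = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
    residuals = np.zeros(len(F))
    for i, fi in enumerate(F):
        solver.fit(D, fi)                          # approx. argmin ||F_i - D a||^2 + lam ||a||_1
        residuals[i] = np.sum((fi - D @ solver.coef_) ** 2)
    return residuals

# eps_f = sparse_residuals(F, fg_idx)              # residuals over the foreground dictionary
# eps_b = sparse_residuals(F, bg_idx)              # residuals over the background dictionary
```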
and 5.5, fusing the two groups of reconstructed residuals to generate a super-pixel significance factor, and acquiring an original image significance factor graph by taking the requirement of consistent original image pixel significance in a super-pixel region as a criterion. According to formula (6) to epsilonbiAnd εfiFusing, giving the super-pixel fusion result to all original image pixels in the super-pixel fusion result, and calculating to obtain a significant factor graph SALi
SALi=εbi/(εfi2) (6)
Wherein sigma2The parameters are non-negative adjustment parameters, in this embodiment, 0.1 is taken, and a person skilled in the art can select a proper value as required;
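Formula (6) and the assignment of the superpixel values back to the pixel grid could then look as follows; the normalization to [0, 1] is an added assumption not stated in the patent.

```python
import numpy as np

def saliency_factor_map(eps_b, eps_f, labels, sigma2=0.1):
    """SAL_i = eps_b / (eps_f + sigma^2) per superpixel, broadcast to every pixel of that superpixel."""
    sal_sp = eps_b / (eps_f + sigma2)
    sal_sp = (sal_sp - sal_sp.min()) / (np.ptp(sal_sp) + 1e-12)   # [0, 1] normalization (assumed)
    return sal_sp[labels]                                         # map each pixel to its superpixel value
```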
and 5.6, comparing the significant factor graph obtained in the step 5.5 with the initial significant graph in the step 5.1, executing a recursion processing process, and outputting a significant detection result under the current scale when the recursion is finished. Calculating the significance factor graph SAL according to equation (7)iAnd an initial saliency map SAL0The rela coefficient between them, if rela<K, then let SAL0=SALiAnd the whole process of the step 4 is repeatedly executed; if rela>K, then the recursion is ended and the current SAL is outputiThe significance detection result at the scale is obtained; wherein K is a similarity determination threshold, and in this embodiment, K is 0.99, and a person skilled in the art can select an appropriate value as needed;
rela=corr2(A,B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, otherwise the greater their difference;
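The recursion of step 5.6 can be tied together as in the sketch below, reusing the helper functions from the previous sketches; corr2 reproduces MATLAB's 2-D correlation coefficient, and the iteration cap is an added safeguard not present in the patent.

```python
import numpy as np

def corr2(a, b):
    """2-D correlation coefficient of two maps (formula (7))."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))

def single_scale_saliency(F, labels, sal0, K=0.99, max_iter=20):
    """Recursive single-scale saliency (step 5), using the helper sketches above."""
    sal_i = sal0
    for _ in range(max_iter):
        init_sal = superpixel_mean_saliency(sal0, labels)
        fg, bg = select_samples(init_sal, labels)
        eps_f = sparse_residuals(F, fg)
        eps_b = sparse_residuals(F, bg)
        sal_i = saliency_factor_map(eps_b, eps_f, labels)
        if corr2(sal_i, sal0) > K:        # similar enough: stop the recursion
            break
        sal0 = sal_i                      # otherwise SAL_0 := SAL_i and repeat
    return sal_i
```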
and 6, performing mean value fusion on the significance detection results under the multi-scale condition to generate a final target extraction result.
Theoretically, the whole technical scheme realizes, with the support of the sparse representation principle, consistent and complete extraction of the salient target regions in natural background images. Different from traditional salient object detection methods based on contrast and image boundary constraints, the method makes full use of the comprehensive differences between foreground and background in the image and integrates the inherent relations within a single salient target and among multiple salient targets, thereby avoiding the dependence on hypothetical conditions faced by traditional contrast-constraint and boundary-constraint detection methods. Sparse representation is taken as the means of image pixel consistency analysis, the sparse reconstruction residual as the pixel difference index, and the sparse reconstruction residuals over the image foreground and background dictionaries as the saliency factor, so that consistent extraction of multiple salient targets is realized and both the internal integrity of a single salient target and the completeness of multi-target extraction are ensured.
In specific implementation, the technical scheme of the invention can realize automatic operation flow based on a computer software technology, and can also realize a corresponding system in a modularized mode. The embodiment of the invention provides an image salient target region extraction system based on iterative sparse representation, which comprises the following modules:
a preprocessing module for preprocessing data: setting different SLIC superpixel numbers, performing multi-scale superpixel segmentation on the original image, using saliency detection based on classical visual attention, and setting the detection result as the initial saliency map SAL_0;
The salient feature extraction module is used for extracting salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
the super-pixel region feature acquisition module is used for calculating the mean value of all original pixel features in the super-pixel region of each scale to obtain the super-pixel region features under single-scale segmentation;
the sparse representation module is used for calculating the saliency map through recursive sparse representation aiming at the segmentation result of a single scale, and comprises the following sub-modules:
a first sub-module for superpixel initial saliency calculation: according to SAL_0, solving the initial significance level of each superpixel through mean value calculation;
a second sub-module for extracting foreground samples: arranging the initial significance levels of the superpixels in descending order, and taking the top p1% of superpixels as the foreground sample D_f;
a third sub-module for extracting background samples: arranging the initial significance levels of the superpixels in ascending order, taking the top p2% of superpixels as the alternative background sample D_b1, and extracting the superpixels touching the image boundary as the alternative background sample D_b2; the background sample calculation formula is as follows:
D_b = D_b1 + D_b2 - D_f (1)
the fourth submodule is used for dual sparse representation and sparse residual calculation, all the superpixels are sparsely represented and reconstructed residual is calculated by taking the foreground sample and the background sample as dictionaries, and the formula is as follows:
α_fi = argmin_α ||F_i - D_f·α||_2^2 + λ_f·||α||_1 (2)
ε_fi = ||F_i - D_f·α_fi||_2^2 (3)
α_bi = argmin_α ||F_i - D_b·α||_2^2 + λ_b·||α||_1 (4)
ε_bi = ||F_i - D_b·α_bi||_2^2 (5)
wherein i represents the superpixel number; F_i is the feature vector of the superpixel region; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the sparse representation results over the background and foreground dictionaries; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
a fifth sub-module for calculating the significance factor: fusing ε_bi and ε_fi according to formula (6), assigning the fusion result of each superpixel to all original image pixels within that superpixel, and calculating the significance factor map SAL_i;
SAL_i = ε_bi / (ε_fi + σ^2) (6)
wherein σ^2 is a non-negative tuning parameter;
a sixth sub-module for recursive processing: calculating the correlation coefficient rela between the significance factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela < K, SAL_0 = SAL_i is set and the whole process of the sparse representation module is repeatedly executed; if rela > K, the recursion is ended and the current SAL_i is output as the saliency detection result at this scale; wherein K is a similarity determination threshold,
rela=corr2(A,B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, otherwise the greater their difference;
and the detection result fusion module is used for fusing multi-scale significance detection results, performing equal-weight linear combination on significance results under each single scale, and calculating a final significance detection result.
The salient features in the salient feature extraction module comprise RGB, Lab, x, y, and first-order and second-order gradients, 13 dimensions in total, expressed as {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}, wherein R, G, B, L, a, b constitute the six-dimensional color feature, R, G, B and L, a, b being the RGB and Lab color information respectively; x and y are the position feature, i.e. the row-column coordinates of the pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient feature, representing the first-order and second-order differences of the pixel in the X and Y directions respectively, and their calculation formula is as follows:
f_x = (f(i+1,j) - f(i-1,j))/2
f_y = (f(i,j+1) - f(i,j-1))/2
f_xx = (f_x(i+1,j) - f_x(i-1,j))/2 (8)
f_yy = (f_y(i,j+1) - f_y(i,j-1))/2
f_xy = (f_x(i,j+1) - f_x(i,j-1))/2
where f (i, j) is the image matrix and i, j is the image pixel row column number.
The specific implementation of each module may refer to the corresponding steps of the method and is not repeated here.
The above description of the embodiments is merely illustrative of the basic technical solution of the present invention and is not limited to the above embodiments. Any simple modification, addition, equivalent change or variation of the described embodiments may be made by persons or teams in the field to which the invention pertains without departing from the essential spirit of the invention or exceeding the scope defined by the claims.

Claims (4)

1. The image salient object region extraction method based on the iterative sparse representation is characterized by comprising the following steps of:
step 1, preprocessing data: setting different SLIC superpixel numbers, carrying out multi-scale superpixel segmentation on the original image, using saliency detection based on classical visual attention, and setting the detection result as the initial saliency map SAL_0;
Step 2, extracting the salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
step 3, calculating the average value of all original pixel features in the superpixel region of each scale to obtain the superpixel region features under single-scale segmentation;
step 4, aiming at the segmentation result of a single scale, calculating a saliency map through recursive sparse representation, and comprising the following substeps:
step 4.1, superpixel initial saliency calculation: according to SAL_0, solving the initial significance level of each superpixel through mean value calculation;
step 4.2, extracting foreground samples: arranging the initial significance levels of the superpixels in descending order, and taking the top p1% of superpixels as the foreground sample D_f;
step 4.3, extracting background samples: arranging the initial significance levels of the superpixels in ascending order, and taking the top p2% of superpixels as the alternative background sample D_b1; extracting the superpixels touching the image boundary as the alternative background sample D_b2; the background sample calculation formula is as follows:
D_b = D_b1 + D_b2 - D_f (1)
step 4.4, performing double sparse representation and sparse residual calculation, wherein the foreground sample and the background sample are respectively used as dictionaries to perform sparse representation on all the superpixels, and a reconstructed residual is calculated, wherein the formula is as follows:
α_fi = argmin_α ||F_i - D_f·α||_2^2 + λ_f·||α||_1 (2)
ε_fi = ||F_i - D_f·α_fi||_2^2 (3)
α_bi = argmin_α ||F_i - D_b·α||_2^2 + λ_b·||α||_1 (4)
ε_bi = ||F_i - D_b·α_bi||_2^2 (5)
wherein i represents the superpixel number; F_i is the feature vector of the superpixel region; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the sparse representation results over the background and foreground dictionaries; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
step 4.5, calculating the significance factor: fusing ε_bi and ε_fi according to formula (6), assigning the fusion result of each superpixel to all original image pixels within that superpixel, and calculating the significance factor map SAL_i;
SAL_i = ε_bi / (ε_fi + σ^2) (6)
wherein σ^2 is a non-negative tuning parameter;
step 4.6, recursive processing: calculating the correlation coefficient rela between the significance factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela is less than K, let SAL_0 = SAL_i and the whole process of step 4 is repeatedly executed; if rela > K, the recursion is ended and the current SAL_i is output as the saliency detection result at this scale; wherein K is a similarity determination threshold,
rela=corr2(A,B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, otherwise the greater their difference;
and 5, fusing multi-scale significance detection results, performing equal-weight linear combination on significance results under each single scale, and calculating a final significance detection result.
2. The image salient object region extraction method based on iterative sparse representation as claimed in claim 1, wherein the salient features in step 2 comprise RGB, Lab, x, y, and first-order and second-order gradients, 13 dimensions in total, expressed as {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}, wherein R, G, B, L, a, b constitute the six-dimensional color feature, R, G, B and L, a, b being the RGB and Lab color information respectively; x and y are the position feature, i.e. the row-column coordinates of the pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient feature, representing the first-order and second-order differences of the pixel in the X and Y directions respectively, calculated as follows:
f_x = (f(i+1,j) - f(i-1,j))/2
f_y = (f(i,j+1) - f(i,j-1))/2
f_xx = (f_x(i+1,j) - f_x(i-1,j))/2 (8)
f_yy = (f_y(i,j+1) - f_y(i,j-1))/2
f_xy = (f_x(i,j+1) - f_x(i,j-1))/2
where f (i, j) is the image matrix and i, j is the image pixel row column number.
3. The image salient object region extraction system based on the iterative sparse representation is characterized by comprising the following modules:
a preprocessing module for preprocessing data: setting different SLIC superpixel numbers, performing multi-scale superpixel segmentation on the original image, using saliency detection based on classical visual attention, and setting the detection result as the initial saliency map SAL_0;
The salient feature extraction module is used for extracting salient features of the pixel-level original image, wherein the salient features comprise color features, position features and gradient features;
the super-pixel region feature acquisition module is used for calculating the mean value of all original pixel features in the super-pixel region of each scale to obtain the super-pixel region features under single-scale segmentation;
the sparse representation module is used for calculating the saliency map through recursive sparse representation aiming at the segmentation result of a single scale, and comprises the following sub-modules:
a first sub-module for superpixel initial saliency calculation: according to SAL_0, solving the initial significance level of each superpixel through mean value calculation;
a second sub-module for extracting foreground samples: arranging the initial significance levels of the superpixels in descending order, and taking the top p1% of superpixels as the foreground sample D_f;
a third sub-module for extracting background samples: arranging the initial significance levels of the superpixels in ascending order, taking the top p2% of superpixels as the alternative background sample D_b1, and extracting the superpixels touching the image boundary as the alternative background sample D_b2; the background sample calculation formula is as follows:
D_b = D_b1 + D_b2 - D_f (1)
the fourth submodule is used for dual sparse representation and sparse residual calculation, all the superpixels are sparsely represented and reconstructed residual is calculated by taking the foreground sample and the background sample as dictionaries, and the formula is as follows:
α_fi = argmin_α ||F_i - D_f·α||_2^2 + λ_f·||α||_1 (2)
ε_fi = ||F_i - D_f·α_fi||_2^2 (3)
α_bi = argmin_α ||F_i - D_b·α||_2^2 + λ_b·||α||_1 (4)
ε_bi = ||F_i - D_b·α_bi||_2^2 (5)
wherein i represents the superpixel number; F_i is the feature vector of the superpixel region; λ_b, λ_f are regularization parameters; α_bi, α_fi are respectively the sparse representation results over the background and foreground dictionaries; ε_bi, ε_fi are respectively the background and foreground sparse reconstruction residuals;
a fifth sub-module for calculating the significance factor: fusing ε_bi and ε_fi according to formula (6), assigning the fusion result of each superpixel to all original image pixels within that superpixel, and calculating the significance factor map SAL_i;
SAL_i = ε_bi / (ε_fi + σ^2) (6)
wherein σ^2 is a non-negative tuning parameter;
a sixth sub-module for recursive processing: calculating the correlation coefficient rela between the significance factor map SAL_i and the initial saliency map SAL_0 according to formula (7); if rela is less than K, SAL_0 = SAL_i is set and the whole process of the sparse representation module is repeatedly executed; if rela > K, the recursion is ended and the current SAL_i is output as the saliency detection result at this scale; wherein K is a similarity determination threshold,
rela=corr2(A,B) (7)
wherein corr2() is the correlation coefficient calculation function; A and B are the matrices or images to be compared; rela is the correlation coefficient between A and B, and the larger its value, the more similar A and B are, otherwise the greater their difference;
and the detection result fusion module is used for fusing multi-scale significance detection results, performing equal-weight linear combination on significance results under each single scale, and calculating a final significance detection result.
4. The iterative sparse representation-based image salient object region extraction system according to claim 3, wherein the salient features in the salient feature extraction module comprise RGB, Lab, x, y, and first-order and second-order gradients, 13 dimensions in total, expressed as {R, G, B, L, a, b, x, y, f_x, f_y, f_xx, f_yy, f_xy}, wherein R, G, B, L, a, b constitute the six-dimensional color feature, R, G, B and L, a, b being the RGB and Lab color information respectively; x and y are the position feature, i.e. the row-column coordinates of the pixel in the image; f_x, f_y, f_xx, f_yy, f_xy are the gradient feature, representing the first-order and second-order differences of the pixel in the X and Y directions respectively, calculated as follows:
f_x = (f(i+1,j) - f(i-1,j))/2
f_y = (f(i,j+1) - f(i,j-1))/2
f_xx = (f_x(i+1,j) - f_x(i-1,j))/2 (8)
f_yy = (f_y(i,j+1) - f_y(i,j-1))/2
f_xy = (f_x(i,j+1) - f_x(i,j-1))/2
where f (i, j) is the image matrix and i, j is the image pixel row column number.
CN201711387624.3A 2017-12-20 2017-12-20 Image salient target region extraction method and system based on iterative sparse representation Active CN107992874B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711387624.3A CN107992874B (en) 2017-12-20 2017-12-20 Image salient target region extraction method and system based on iterative sparse representation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711387624.3A CN107992874B (en) 2017-12-20 2017-12-20 Image salient target region extraction method and system based on iterative sparse representation

Publications (2)

Publication Number Publication Date
CN107992874A CN107992874A (en) 2018-05-04
CN107992874B true CN107992874B (en) 2020-01-07

Family

ID=62039459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711387624.3A Active CN107992874B (en) 2017-12-20 2017-12-20 Image salient target region extraction method and system based on iterative sparse representation

Country Status (1)

Country Link
CN (1) CN107992874B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117852B (en) * 2018-07-10 2021-08-17 武汉大学 Unmanned aerial vehicle image adaptation area automatic extraction method and system based on sparse representation
CN109102465A (en) * 2018-08-22 2018-12-28 周泽奇 A kind of calculation method of the content erotic image auto zoom of conspicuousness depth of field feature
CN109886267A (en) * 2019-01-29 2019-06-14 杭州电子科技大学 A kind of soft image conspicuousness detection method based on optimal feature selection
CN110490204B (en) * 2019-07-11 2022-07-15 深圳怡化电脑股份有限公司 Image processing method, image processing device and terminal
CN111191650B (en) * 2019-12-30 2023-07-21 北京市新技术应用研究所 Article positioning method and system based on RGB-D image visual saliency
CN111242941B (en) * 2020-01-20 2023-05-30 南方科技大学 Salient region detection method and device based on visual attention
CN111274964B (en) * 2020-01-20 2023-04-07 中国地质大学(武汉) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN112766032A (en) * 2020-11-26 2021-05-07 电子科技大学 SAR image saliency map generation method based on multi-scale and super-pixel segmentation
CN112700438B (en) * 2021-01-14 2024-06-21 成都铁安科技有限责任公司 Ultrasonic flaw judgment method and ultrasonic flaw judgment system for imbedded part of train axle
CN114332572B (en) * 2021-12-15 2024-03-26 南方医科大学 Method for extracting breast lesion ultrasonic image multi-scale fusion characteristic parameters based on saliency map-guided hierarchical dense characteristic fusion network
CN115424037A (en) * 2022-10-12 2022-12-02 武汉大学 Salient target region extraction method based on multi-scale sparse representation
CN115690418B (en) * 2022-10-31 2024-03-12 武汉大学 Unsupervised automatic detection method for image waypoints

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7526123B2 (en) * 2004-02-12 2009-04-28 Nec Laboratories America, Inc. Estimating facial pose from a sparse representation
CN101556690A (en) * 2009-05-14 2009-10-14 复旦大学 Image super-resolution method based on overcomplete dictionary learning and sparse representation
CN101980284A (en) * 2010-10-26 2011-02-23 北京理工大学 Two-scale sparse representation-based color image noise reduction method
CN104240256A (en) * 2014-09-25 2014-12-24 西安电子科技大学 Image salient detecting method based on layering sparse modeling
CN105930812A (en) * 2016-04-27 2016-09-07 东南大学 Vehicle brand type identification method based on fusion feature sparse coding model
CN106203430A (en) * 2016-07-07 2016-12-07 北京航空航天大学 A kind of significance object detecting method based on foreground focused degree and background priori
CN106530271A (en) * 2016-09-30 2017-03-22 河海大学 Infrared image significance detection method
CN106815842A (en) * 2017-01-23 2017-06-09 河海大学 A kind of improved image significance detection method based on super-pixel
CN107067037A (en) * 2017-04-21 2017-08-18 河南科技大学 A kind of method that use LLC criterions position display foreground
CN107229917A (en) * 2017-05-31 2017-10-03 北京师范大学 A kind of several remote sensing image general character well-marked target detection methods clustered based on iteration
CN107301643A (en) * 2017-06-06 2017-10-27 西安电子科技大学 Well-marked target detection method based on robust rarefaction representation Yu Laplce's regular terms

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Approximate Correction of Length Distortion for Direct Georeferencing in Map Projection Frame; Yongjun Zhang et al.; IEEE; 2013-11-30; Vol. 10, No. 6; full text *
Construction of Manifolds via Compatible Sparse Representations; Ruimin Wang et al.; ACM; 2016-12-31; full text *
Salient object detection via contrast information and object vision organization cues; Shengxiang Qi et al.; Elsevier; 2015-04-29; full text *
Super-Resolution of Single Text Image by Sparse Representation; Rim Walha et al.; ACM; 2012-12-31; full text *
Infrared vehicle detection technology based on visual saliency and target confidence; Nannan Qi et al.; CNKI; 2017-06-30; Vol. 46, No. 6; full text *

Also Published As

Publication number Publication date
CN107992874A (en) 2018-05-04

Similar Documents

Publication Publication Date Title
CN107992874B (en) Image salient target region extraction method and system based on iterative sparse representation
Januszewski et al. High-precision automated reconstruction of neurons with flood-filling networks
CN108154118B (en) A kind of target detection system and method based on adaptive combined filter and multistage detection
Gould et al. Region-based segmentation and object detection
CN106778687B (en) Fixation point detection method based on local evaluation and global optimization
CN107680106A (en) A kind of conspicuousness object detection method based on Faster R CNN
CN111612008A (en) Image segmentation method based on convolution network
CN110033007A (en) Attribute recognition approach is worn clothes based on the pedestrian of depth attitude prediction and multiple features fusion
CN109033978B (en) Error correction strategy-based CNN-SVM hybrid model gesture recognition method
US20220237789A1 (en) Weakly supervised multi-task learning for cell detection and segmentation
CN108230330B (en) Method for quickly segmenting highway pavement and positioning camera
Savian et al. Optical flow estimation with deep learning, a survey on recent advances
CN107423771B (en) Two-time-phase remote sensing image change detection method
Sima et al. Bottom-up merging segmentation for color images with complex areas
Wang et al. Pedestrian detection in infrared image based on depth transfer learning
Madessa et al. A deep learning approach for specular highlight removal from transmissive materials
Li et al. DeepSIR: Deep semantic iterative registration for LiDAR point clouds
Dmitriev12 et al. Efficient correction for em connectomics with skeletal representation
CN108664968B (en) Unsupervised text positioning method based on text selection model
Ma et al. An attention-based progressive fusion network for pixelwise pavement crack detection
Wang et al. Nuclei instance segmentation using a transformer-based graph convolutional network and contextual information augmentation
Imtiaz et al. BAWGNet: Boundary aware wavelet guided network for the nuclei segmentation in histopathology images
CN116228795A (en) Ultrahigh resolution medical image segmentation method based on weak supervised learning
CN115861630A (en) Cross-waveband infrared target detection method and device, computer equipment and storage medium
Bhavani et al. Robust 3D face recognition in unconstrained environment using distance based ternary search siamese network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant