CN103996040A - Bottom-up visual saliency generating method fusing local-global contrast - Google Patents
Bottom-up visual saliency generating method fusing local-global contrast
- Publication number: CN103996040A (application CN201410200489.7A)
- Authority: CN (China)
- Prior art keywords: segment, image, local, size, contrast
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Landscapes: Image Analysis (AREA)
Abstract
The invention provides a bottom-up visual saliency generating method that fuses local and global contrast. Based on sparse coding theory, the method first computes, for each image patch, the local contrast between that patch and the other patches in its neighborhood, and the global contrast between that patch and the remaining patches in the image. The two kinds of contrast information are then combined organically, a center bias is added, and the fusion of local and global contrast yields a visual saliency computation model with improved accuracy and robustness.
Description
Technical field
The invention belongs to the field of computer vision algorithm research and relates to a bottom-up visual saliency generation method that fuses local and global contrast, capable of computing the saliency map of a given image accurately and robustly on natural image databases.
Background technology
Visual saliency is a key function of visual attention: an observer viewing a complex visual scene focuses on a few important contents and ignores the less important ones. Contents that attract the observer's attention more than others are said to have higher visual saliency. The idea of visual saliency is widely applied in computational models of visual attention. The saliency measure that Itti adopts in his classical visual attention model is based on the local visual feature differences between a pixel and its surrounding neighborhood. Ma et al. proposed a saliency measure based on feature contrast in 2003; the method considers only color features, converts the input image from the RGB to the LUV color space, and performs color quantization. For simplicity of processing, the input image is resized to a fixed size, and the saliency value of a pixel is obtained from the color-feature contrast between the pixel and its surrounding neighborhood. Hou et al. proposed a saliency measure based on the spectral residual in 2008; the method analyzes the characteristics of salient regions in the frequency domain and builds the saliency map in the spatial domain. Feng Liu et al. later proposed a region-based saliency measure: they first obtain the distinct regions in the image by some segmentation method, and then measure the saliency of each region from its position and feature contrast.
Although the above saliency computation models produce satisfactory results on specific sample databases, they share an obvious defect: each considers only the global contrast or only the local contrast of the image, and none builds a unified saliency computation model from both kinds of contrast information simultaneously. Experiments show that salient regions based on local feature contrast tend to concentrate on strongly varying edges or on complex background areas, while saliency based on global feature contrast cannot well highlight regions that contrast sharply with their surroundings. Motivated by this, the present invention proposes a bottom-up visual saliency computation method fusing local and global contrast: it first extracts local-contrast and global-contrast features from the image according to sparse coding theory, then combines the two kinds of contrast information organically, and, following the center-bias theory from psychological studies of human visual attention, builds a visual saliency computation model with improved accuracy and robustness.
Summary of the invention
Technical problem to be solved
To overcome the deficiencies of the prior art, the present invention proposes a bottom-up visual saliency generation method that fuses local and global contrast.
Technical scheme
A bottom-up visual saliency computation method fusing local-global contrast, characterized by the following steps:
Step 1, extract the patches in the image and their features: first down-sample the image to N × N pixels, then slide a square window of side size, size ∈ [5, 50], with a fixed step length over the down-sampled image to extract patches p_i; the vector formed by the pixel values inside p_i serves as the feature x_i of the patch, where i ∈ [1, M] and M is the number of patches in the image;
Step 2, build the local dictionary of patch p_i: with a square window of side size ∈ [5, 50] and the same step length, extract within the neighborhood of p_i all patches whose overlapping area with p_i is below a threshold; the matrix formed by the features of these patches is the local dictionary of p_i, where the neighborhood size is Sru_size = β·size and β ∈ [3, 9] is the scale factor of the neighborhood range;
Step 3, compute the local contrast of p_i: according to sparse coding theory, encode the feature x_i with the local dictionary of p_i; the resulting coefficients are the local sparse coding of the current patch, the reconstruction error is the local residual, and the local contrast of p_i is obtained from this residual;
Step 4, build the global dictionary of p_i: with the same square window of side size ∈ [5, 50] and step length, extract over the entire down-sampled image all patches whose overlapping area with p_i is below the threshold; the matrix formed by the features of these patches is the global dictionary of p_i;
Step 5, compute the global contrast of p_i: according to sparse coding theory, encode x_i with the global dictionary of p_i; the resulting coefficients are the global sparse coding of the current patch, the reconstruction error is the global residual, and the global contrast of p_i is obtained from this residual;
Step 6, compute the center bias of p_i from D_max and D_i, where D_max is the farthest distance from the image center in the down-sampled image and D_i is the distance of the center point of p_i from the image center;
Step 7, compute the saliency value of p_i: fuse its local contrast and global contrast to obtain its saliency value S_i, where λ ∈ [0, 1] is the weight coefficient balancing local and global contrast;
Step 8, generate the saliency map: compute the saliency values of all patches in the down-sampled image according to steps 1-7, use these values as the gray values of the corresponding patches to form a gray-scale map matching the down-sampled image, then up-sample this gray-scale map to the size of the original image to obtain the saliency map of the image.
In steps 3 and 5, the sparse coefficients and residuals are computed with the method of Han B, Zhu H, Ding Y, "Bottom-up saliency based on weighted sparse coding residual", Proceedings of the 19th ACM International Conference on Multimedia, ACM, 2011: 1117-1120.
Beneficial effect
The present invention proposes a bottom-up visual saliency computation method fusing local and global contrast. Using sparse coding theory, it first computes the local contrast between an image patch and the other patches in its neighborhood and the global contrast between that patch and all remaining patches in the image; it then combines the two kinds of contrast information organically and adds a center bias, finally fusing local and global contrast into a visual saliency computation model with improved accuracy and robustness.
Brief description of the drawings
Fig. 1: flow chart of the method of the invention
Fig. 2: experimental comparison results
Fig. 3: ROC results
Embodiment
The invention is now further described with reference to the embodiments and the accompanying drawings:
The hardware environment for implementation was a computer with an Intel Pentium 2.93 GHz CPU and 2.0 GB of memory; the software environment was Matlab R2011b on Windows XP. All images of the BRUCE database were chosen as test data; this database contains 120 natural images and is an internationally published database for testing visual saliency computation models.
The present invention is specifically implemented as follows:
1. Extract the patches in the image and their features: first down-sample the image to N × N pixels, then slide a square window of side size, size ∈ [5, 50], with a fixed step length over the down-sampled image to extract patches p_i; the vector formed by the pixel values inside p_i serves as the feature x_i of the patch, where i ∈ [1, M] and M is the number of patches in the image.
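The patch-extraction step above can be sketched as follows; the concrete window side and step length are illustrative assumptions, since the patent's step-length expression appears only as an image in the source:

```python
import numpy as np

def extract_patches(image, size=8, step=4):
    """Slide a square window over a 2-D gray image and return one
    feature vector (the flattened pixel values) per patch, together
    with each patch's top-left coordinates."""
    h, w = image.shape
    features, coords = [], []
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            patch = image[y:y + size, x:x + size]
            features.append(patch.reshape(-1).astype(float))
            coords.append((y, x))
    return np.array(features), coords

# Usage: a 32 x 32 "down-sampled" image, window side 8, step 4.
img = np.arange(32 * 32, dtype=float).reshape(32, 32)
X, coords = extract_patches(img, size=8, step=4)
```

With these illustrative values the sketch yields a 7 × 7 grid of 49 patches, each described by a 64-dimensional feature vector.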
2. Build the local dictionary of patch p_i: with a square window of side size ∈ [5, 50] and the same step length, extract within the neighborhood of p_i all patches whose overlapping area with p_i is below a threshold; the matrix formed by the features of these patches is the local dictionary of p_i, where the neighborhood size is Sru_size = β·size and β ∈ [3, 9] is the scale factor of the neighborhood range.
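A possible sketch of the local-dictionary construction; the overlap threshold `max_overlap_frac` and the use of a centre-distance test for the β·size neighborhood are assumptions, since the patent's overlap threshold appears only as an image in the source:

```python
import numpy as np

def local_dictionary(features, coords, i, size=8, beta=5, max_overlap_frac=0.5):
    """Collect the features of patches lying in the beta*size neighborhood
    of patch i whose overlapping area with patch i is below a threshold.
    max_overlap_frac is an assumed fraction of the patch area."""
    yi, xi = coords[i]
    half = beta * size / 2.0
    cols = []
    for j, (yj, xj) in enumerate(coords):
        if j == i:
            continue
        # the patch must fall inside the beta*size neighborhood window
        if abs(yj - yi) > half or abs(xj - xi) > half:
            continue
        # overlapping area of the two axis-aligned size x size squares
        oy = max(0, size - abs(yj - yi))
        ox = max(0, size - abs(xj - xi))
        if oy * ox < max_overlap_frac * size * size:
            cols.append(features[j])
    return np.array(cols).T  # one dictionary atom per column
```

The global dictionary of step 4 can reuse the same overlap test with the neighborhood check removed.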
3. Compute the local contrast of p_i: following sparse coding theory and the method of "Bottom-up saliency based on weighted sparse coding residual", encode the feature x_i with the local dictionary of p_i; the resulting coefficients are the local sparse coding of the current patch, the reconstruction error is the local residual, and the local contrast of p_i is obtained from this residual.
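The encoding step can be illustrated with a plain orthogonal matching pursuit over the dictionary columns; this is only a stand-in for the weighted sparse coding of the cited Han et al. method, and taking the l2 norm of the residual as the contrast value is an assumption about the exact contrast formula:

```python
import numpy as np

def omp_residual(x, D, n_nonzero=3):
    """Greedy orthogonal matching pursuit: pick columns of D (assumed
    l2-normalized atoms) that best explain x, re-fit by least squares,
    and return the final reconstruction residual."""
    residual = x.astype(float)
    chosen = []
    for _ in range(min(n_nonzero, D.shape[1])):
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in chosen:
            chosen.append(idx)
        sub = D[:, chosen]
        coef, *_ = np.linalg.lstsq(sub, x, rcond=None)
        residual = x - sub @ coef
    return residual

def contrast(x, D):
    """Patch contrast as the norm of the sparse-coding residual."""
    return float(np.linalg.norm(omp_residual(x, D)))
```

A patch well explained by its neighbors (its feature lies in the span of a few atoms) gets near-zero contrast; a patch orthogonal to the dictionary keeps its full energy as residual.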
4. Build the global dictionary of p_i: with the same square window of side size ∈ [5, 50] and step length, extract over the entire down-sampled image all patches whose overlapping area with p_i is below the threshold; the matrix formed by the features of these patches is the global dictionary of p_i.
5. Compute the global contrast of p_i: following sparse coding theory and the method of "Bottom-up saliency based on weighted sparse coding residual", encode x_i with the global dictionary of p_i; the resulting coefficients are the global sparse coding of the current patch, the reconstruction error is the global residual, and the global contrast of p_i is obtained from this residual.
6. Compute the center bias of patch p_i from D_max and D_i, where D_max is the farthest distance from the image center in the down-sampled image and D_i is the distance of the center point of p_i from the image center.
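The center-bias expression itself appears only as an image in the source; a common form consistent with the quantities described, used here purely as an assumption, is w_i = 1 − D_i / D_max:

```python
import math

def center_bias(patch_center, image_shape):
    """Center bias of a patch on the down-sampled grid, assumed to be
    1 - d_i / d_max, where d_i is the distance of the patch center from
    the image center and d_max the farthest such distance (a corner)."""
    h, w = image_shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    d_i = math.hypot(patch_center[0] - cy, patch_center[1] - cx)
    d_max = math.hypot(cy, cx)  # distance from center to a corner
    return 1.0 - d_i / d_max
```

Under this assumed form the bias is 1 at the image center and falls to 0 at the corners, matching the center-bias theory invoked in the summary.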
7. Compute the saliency value of p_i: fuse its local contrast and global contrast to obtain its saliency value S_i, where λ ∈ [0, 1] is the weight coefficient balancing local and global contrast.
8. Generate the saliency map: compute the saliency values of all patches in the down-sampled image according to steps 1-7, use these values as the gray values of the corresponding patches to form a gray-scale map matching the down-sampled image, then up-sample this gray-scale map to the size of the original image to obtain the saliency map of the image.
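Steps 7 and 8 can be sketched as below; the fusion form S_i = w_i · (λ·CS_i + (1 − λ)·CG_i) is an assumption, since the patent gives the fusion formula only as an image:

```python
import numpy as np

def saliency_map(cs, cg, bias, coords, shape, size=8, lam=0.5):
    """Fuse per-patch local contrast cs, global contrast cg and center
    bias into a gray-scale map on the down-sampled grid. The fusion
    S_i = bias_i * (lam*cs_i + (1-lam)*cg_i) is an assumed form."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for s, g, b, (y, x) in zip(cs, cg, bias, coords):
        val = b * (lam * s + (1.0 - lam) * g)
        acc[y:y + size, x:x + size] += val
        cnt[y:y + size, x:x + size] += 1.0
    sal = acc / np.maximum(cnt, 1.0)  # average where patches overlap
    if sal.max() > 0:
        sal = sal / sal.max()  # normalize to [0, 1] gray values
    return sal
```

Up-sampling the returned map to the original image resolution (with any standard image-resize routine) completes step 8.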
The present invention uses the ROC curve to assess the detection results. The curve plots, as the segmentation threshold varies, the relation between the false positive rate (FPR) and the true positive rate (TPR), computed as FPR = FP / N and TPR = TP / P, where FP is the detected false-alarm region and N is the non-target region in the ground truth, while TP is the detected true-positive region and P is the target region in the ground truth.
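Following the definitions FPR = FP / N and TPR = TP / P above, one threshold point of the ROC curve can be computed as:

```python
import numpy as np

def fpr_tpr(sal, gt, threshold):
    """Binarize a saliency map at `threshold` and compare it against a
    binary ground-truth mask: FPR = FP / N and TPR = TP / P, matching
    the text's definitions. Sweeping the threshold traces the ROC curve."""
    pred = sal >= threshold
    tp = np.logical_and(pred, gt).sum()    # detected true-positive area
    fp = np.logical_and(pred, ~gt).sum()   # detected false-alarm area
    p = gt.sum()                           # target area in ground truth
    n = (~gt).sum()                        # non-target area in ground truth
    return fp / n, tp / p
```

The area under the resulting curve (AUC) is the score reported in Table 1.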
Fig. 2 shows some experimental comparison results, where CS denotes the saliency map computed using only the local contrast of the present invention, CG the saliency map computed using only the global contrast, and CS+CG the saliency map computed by the method of the present invention fusing local and global contrast. It can be seen that the proposed algorithm overcomes the defects of using local or global contrast alone and can compute the saliency map of a given image accurately and robustly on natural image databases. Fig. 3 shows the ROC curve of the method of the invention, and Table 1 gives a quantitative comparison with other existing algorithms: the second row lists the area under the ROC curve (AUC) of each algorithm on the BRUCE test database. The experimental results show that the proposed method computes saliency maps of natural images more accurately and robustly.
Table 1: comparison of saliency detection results
Methods | AIM | Itti's | Judd's | Liyin's | Hanbiao's | OURS
---|---|---|---|---|---|---
AUC | 0.7241 | 0.7455 | 0.7795 | 0.8006 | 0.8264 | 0.8360
Claims (2)
1. A bottom-up visual saliency generation method fusing local-global contrast, characterized by the following steps:
Step 1, extract the patches in the image and their features: first down-sample the image to N × N pixels, then slide a square window of side size, size ∈ [5, 50], with a fixed step length over the down-sampled image to extract patches p_i; the vector formed by the pixel values inside p_i serves as the feature x_i of the patch, where i ∈ [1, M] and M is the number of patches in the image;
Step 2, build the local dictionary of patch p_i: with a square window of side size ∈ [5, 50] and the same step length, extract within the neighborhood of p_i all patches whose overlapping area with p_i is below a threshold; the matrix formed by the features of these patches is the local dictionary of p_i, where the neighborhood size is Sru_size = β·size and β ∈ [3, 9] is the scale factor of the neighborhood range;
Step 3, compute the local contrast of p_i: according to sparse coding theory, encode the feature x_i with the local dictionary of p_i; the resulting coefficients are the local sparse coding of the current patch, the reconstruction error is the local residual, and the local contrast of p_i is obtained from this residual;
Step 4, build the global dictionary of p_i: with the same square window of side size ∈ [5, 50] and step length, extract over the entire down-sampled image all patches whose overlapping area with p_i is below the threshold; the matrix formed by the features of these patches is the global dictionary of p_i;
Step 5, compute the global contrast of p_i: according to sparse coding theory, encode x_i with the global dictionary of p_i; the resulting coefficients are the global sparse coding of the current patch, the reconstruction error is the global residual, and the global contrast of p_i is obtained from this residual;
Step 6, compute the center bias of p_i from D_max and D_i, where D_max is the farthest distance from the image center in the down-sampled image and D_i is the distance of the center point of p_i from the image center;
Step 7, compute the saliency value of p_i: fuse its local contrast and global contrast to obtain its saliency value S_i, where λ ∈ [0, 1] is the weight coefficient balancing local and global contrast;
Step 8, generate the saliency map: compute the saliency values of all patches in the down-sampled image according to steps 1-7, use these values as the gray values of the corresponding patches to form a gray-scale map matching the down-sampled image, then up-sample this gray-scale map to the size of the original image to obtain the saliency map of the image.
2. The bottom-up visual saliency generation method fusing local-global contrast according to claim 1, characterized in that: in steps 3 and 5, the sparse coefficients and residuals are computed with the method of Han B, Zhu H, Ding Y, "Bottom-up saliency based on weighted sparse coding residual", Proceedings of the 19th ACM International Conference on Multimedia, ACM, 2011: 1117-1120.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410200489.7A CN103996040A (en) | 2014-05-13 | 2014-05-13 | Bottom-up visual saliency generating method fusing local-global contrast ratio |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103996040A true CN103996040A (en) | 2014-08-20 |
Family
ID=51310201
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410200489.7A Pending CN103996040A (en) | 2014-05-13 | 2014-05-13 | Bottom-up visual saliency generating method fusing local-global contrast ratio |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103996040A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102831402A (en) * | 2012-08-09 | 2012-12-19 | 西北工业大学 | Sparse coding and visual saliency-based method for detecting airport through infrared remote sensing image |
US20140126824A1 (en) * | 2012-11-05 | 2014-05-08 | Raytheon Bbn Technologies Corp. | Efficient inner product computation for image and video analysis |
Non-Patent Citations (3)
Title |
---|
HAN B et al.: "Bottom-up saliency based on weighted sparse coding residual", Proceedings of the 19th ACM International Conference on Multimedia, ACM *
ZHANG, Jie: "Research on automatic extraction technology of bottom-up visually salient regions", China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology *
YANG, Weibin: "Research on bottom-up visual saliency detection methods and applications", Wanfang Database *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104408708A (en) * | 2014-10-29 | 2015-03-11 | 兰州理工大学 | Global-local-low-rank-based image salient target detection method |
CN104408708B (en) * | 2014-10-29 | 2017-06-20 | 兰州理工大学 | A kind of image well-marked target detection method based on global and local low-rank |
CN106295542A (en) * | 2016-08-03 | 2017-01-04 | 江苏大学 | A kind of road target extracting method of based on significance in night vision infrared image |
CN106709512A (en) * | 2016-12-09 | 2017-05-24 | 河海大学 | Infrared target detection method based on local sparse representation and contrast |
CN107423765A (en) * | 2017-07-28 | 2017-12-01 | 福州大学 | Based on sparse coding feedback network from the upper well-marked target detection method in bottom |
CN107886533A (en) * | 2017-10-26 | 2018-04-06 | 深圳大学 | Vision significance detection method, device, equipment and the storage medium of stereo-picture |
CN107886533B (en) * | 2017-10-26 | 2021-05-04 | 深圳大学 | Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium |
CN110245660A (en) * | 2019-06-03 | 2019-09-17 | 西北工业大学 | Webpage based on significant characteristics fusion sweeps path prediction technique |
CN110245660B (en) * | 2019-06-03 | 2022-04-22 | 西北工业大学 | Webpage glance path prediction method based on saliency feature fusion |
CN114494262A (en) * | 2022-04-19 | 2022-05-13 | 广东粤港澳大湾区硬科技创新研究院 | Method and device for evaluating image contrast |
Legal Events
Date | Code | Title | Description
---|---|---|---
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20140820 |