CN108470176B - Stereo image visual saliency extraction method based on frequency domain sparse representation

Info

Publication number
CN108470176B
Authority
CN
China
Prior art keywords
value
image
matrix
column
row
Prior art date
Legal status
Active
Application number
CN201810066806.9A
Other languages
Chinese (zh)
Other versions
CN108470176A (en)
Inventor
周武杰
蔡星宇
张爽爽
顾鹏笠
潘婷
郑飘飘
吕思嘉
袁建中
陈昱臻
胡慧敏
金国英
王建芬
王新华
孙丽慧
吴洁雯
Current Assignee
Shenzhen Muye Microelectronics Technology Co.,Ltd.
Original Assignee
Zhejiang Lover Health Science and Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Lover Health Science and Technology Development Co Ltd
Priority to CN201810066806.9A
Publication of CN108470176A
Application granted
Publication of CN108470176B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 - Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G06V10/513 - Sparse representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a stereoscopic image visual saliency extraction method based on frequency domain sparse representation. The method acquires a small-size three-channel image and a small-size normalized disparity map from the left viewpoint image of a test stereoscopic image; divides the three-channel image and the normalized disparity map into blocks to obtain the matrix corresponding to each block image; forms a quaternion matrix from the four matrices and applies a two-dimensional quaternion Fourier transform to obtain a transformed quaternion matrix, from whose real part and three imaginary parts a low-frequency component diagram is obtained; extracts the sparse weight matrix of the low-frequency component diagram with a sparse representation dictionary; acquires two center-surround salient images from the sparse weight matrix and the center preference image; blurs the two center-surround salient images and fuses them into a fused image; reinforces the fused image around its center with the center preference image; and applies size conversion to the center-surround enhanced image to obtain the visual saliency map. The method has the advantages of stronger extraction stability and higher extraction accuracy.

Description

Stereo image visual saliency extraction method based on frequency domain sparse representation
Technical Field
The invention relates to an image signal processing method, and in particular to a stereoscopic image visual saliency extraction method based on frequency domain sparse representation.
Background
After receiving a natural image, people must distinguish information resources at different levels, so when processing natural image information they grade the different information, and selectively attended features emerge. When viewing an image or video clip, people do not attend evenly to every area; they preferentially process the parts whose semantic information is of greater interest. Computing salient image regions is an important research direction in computer vision and content-based video detection. With the rapid development of stereoscopic display and acquisition equipment, visual saliency detection for stereoscopic images has also become a very important research topic.
A stereoscopic image is not a simple extension of a planar image, and the process by which human eyes perceive a stereoscopic image is not a simple superposition of the left viewpoint image and the right viewpoint image, so stereoscopic visual characteristics are not a simple extension of planar visual characteristics. However, current stereoscopic image visual saliency extraction methods still largely follow planar image methods. Therefore, how to effectively extract stereoscopic features from a stereoscopic image, and how to make the extracted stereoscopic features conform to the viewing habits of the human visual system, are problems that must be studied in the process of extracting a visual saliency map of a stereoscopic image.
Disclosure of Invention
The invention aims to provide a stereoscopic image visual saliency extraction method based on frequency domain sparse representation that conforms to salient semantic features and offers stronger extraction stability and higher extraction accuracy.
The technical scheme adopted by the invention for solving the technical problems is as follows: a stereoscopic image visual saliency extraction method based on frequency domain sparse representation is characterized by comprising the following steps:
① For any test stereoscopic image S_test, transform the left viewpoint image of S_test into the Lab color space and resize it to 200×200 pixels, denoting the transformed image {L_Lab200(x1,y1)}; then denote the L-channel, a-channel and b-channel images of {L_Lab200(x1,y1)} correspondingly as {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)} and {L_Lab200,b(x1,y1)}; wherein the width of S_test is W, the height of S_test is H, 1≤x1≤200, 1≤y1≤200, and L_Lab200(x1,y1), L_Lab200,L(x1,y1), L_Lab200,a(x1,y1) and L_Lab200,b(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
Scale the disparity map of S_test to a size of 200×200 pixels, denoting the scaled image {D_200(x1,y1)}; then normalize the pixel value of every pixel point in {D_200(x1,y1)} to the numerical range [0,1], denoting the normalized image {D_0,1(x1,y1)}; wherein D_200(x1,y1) and D_0,1(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
② Divide each of {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} into (200/8)×(200/8) = 25×25 non-overlapping image blocks of size 8×8; denote the matrix corresponding to the block image obtained by dividing {L_Lab200,L(x1,y1)} as {T^L_u,v}, composed of the sub-matrices corresponding to all the image blocks obtained by dividing {L_Lab200,L(x1,y1)}; likewise denote the matrices corresponding to the block images obtained by dividing {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} as {T^a_u,v}, {T^b_u,v} and {T^D_u,v}, each composed of the sub-matrices corresponding to all the image blocks of the respective block image; wherein 1≤u≤25, 1≤v≤25, 1≤m1≤8, 1≤n1≤8, T^L_u,v(m1,n1) denotes the pixel value of the pixel point whose coordinate position is (m1,n1) within the image block in row u and column v of the block image of {L_Lab200,L(x1,y1)}, that is, the value of the element with index (m1,n1) in the sub-matrix in row u and column v of {T^L_u,v}; and T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1) are defined analogously for {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)};
③ Combine {T^L_u,v}, {T^a_u,v}, {T^b_u,v} and {T^D_u,v} into a quaternion matrix, denoted {Q_u,v}; the element with index (m1,n1) in the sub-matrix in row u and column v of {Q_u,v} is the quaternion value formed from T^L_u,v(m1,n1), T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1), one of the four values serving as the real part and the other three as the coefficients of the imaginary units; wherein i, j and k are all imaginary units;
④ Perform a two-dimensional quaternion Fourier transform on each sub-matrix of {Q_u,v} to obtain the transformed quaternion matrix of {Q_u,v}, denoted {Q^F_u,v}, where Q^F_u,v = QFT(Q_u,v) and QFT() denotes the two-dimensional quaternion Fourier transform function; denote the real part, i-imaginary part, j-imaginary part and k-imaginary part of {Q^F_u,v} correspondingly as {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, where R_u,v(m1,n1), I_u,v(m1,n1), J_u,v(m1,n1) and K_u,v(m1,n1) denote the values of the elements with index (m1,n1) in the sub-matrices in row u and column v of the respective matrices; then extract the respective low-frequency components of {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, correspondingly denoted {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}; wherein 1≤m2≤4, 1≤n2≤4, and R^low_u,v(m2,n2), I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) denote the values of the elements with index (m2,n2) in the 4×4 low-frequency sub-matrices in row u and column v of the respective matrices;
⑤ From {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}, obtain the low-frequency component diagram corresponding to {Q^F_u,v}, denoted {F_low}; the pixel value of the pixel point whose coordinate position is (m1,n1) within the 8×8 area in row u and column v of {F_low} is taken from the four 4×4 low-frequency sub-matrices in row u and column v, which tile the 8×8 area quadrant by quadrant: in the four quadrants the pixel value equals the value of the element with index (m1,n1), (m1-4,n1), (m1,n1-4) or (m1-4,n1-4), respectively, of the low-frequency sub-matrix assigned to that quadrant;
⑥ Using the Lasso method, with the 8×8 regions of {F_low} as the sparse representation dictionary, extract the sparse weight of each 8×8 region of {F_low} to obtain the sparse weight matrix corresponding to {F_low}, denoted {W_u,v}; wherein, in the Lasso method, the Lagrange penalty term is selected with a weighting parameter λ; 1≤u1≤25, 1≤v1≤25, and W_u,v(u1,v1) denotes the value of the element with index (u1,v1) in the sub-matrix in row u and column v of {W_u,v}, namely the weight in row u1 and column v1 generated when the 8×8 region in row u and column v of {F_low} is sparsely represented with the sparse representation dictionary;
⑦ Obtain an absolute value sum image of 25×25 pixel size, denoted {W_abs(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v), denoted W_abs(u,v), is the sum of the absolute values of all the elements of the sparse weight matrix associated with the 8×8 region in row u and column v. Acquire a center preference image of 25×25 pixel size, denoted {C_saliency25(u1,v1)}; the pixel value of the pixel point whose coordinate position is (u1,v1), denoted C_saliency25(u1,v1), is a Gaussian-shaped function, with the natural base e, of the distance from (u1,v1) to the center pixel point (u0,v0), controlled by the first center preference parameter δ_F. Then acquire a first center-surround salient image of 25×25 pixel size, denoted {C_cs(u,v)}, whose pixel value C_cs(u,v) at coordinate position (u,v) is computed from the sparse weights and {W_abs(u,v)}; and acquire a second center-surround salient image of 25×25 pixel size, denoted {C_near(u,v)}, whose pixel value C_near(u,v) is computed analogously with the second center preference parameter δ_D. Then blur {C_cs(u,v)} with a Gaussian filter template, denoting the resulting image {C_cs-g(u,v)}, and blur {C_near(u,v)} with a Gaussian filter template, denoting the resulting image {C_near-g(u,v)}; wherein abs() is the absolute value function, e is the natural base, u0 denotes the abscissa and v0 the ordinate of the center pixel point of {C_saliency25(u1,v1)}, C_cs-g(u,v) and C_near-g(u,v) denote the pixel values of the pixel points whose coordinate position is (u,v) in {C_cs-g(u,v)} and {C_near-g(u,v)}, and when applying the Gaussian filter template the parameter controlling the degree of blur is taken as 5, the filter width control parameter X_G as 5, and the filter height control parameter Y_G as 5;
⑧ Fuse {C_cs-g(u,v)} and {C_near-g(u,v)} to obtain a fused image, denoted {C_F(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v) is denoted C_F(u,v), with C_F(u,v) = 0.3×C_near-g(u,v) + 0.7×C_cs-g(u,v); then use {C_saliency25(u1,v1)} to apply center-surround enhancement to {C_F(u,v)}, obtaining a center-surround enhanced image of 25×25 pixel size, denoted {C_FO(u,v)}, whose pixel value C_FO(u,v) = C_F(u,v)×C_saliency25(u,v); then apply image size conversion to {C_FO(u,v)} to obtain the visual saliency map of S_test, denoted {S_F(x,y)}; wherein S_F(x,y) denotes the pixel value of the pixel point whose coordinate position is (x,y), 1≤x≤W and 1≤y≤H.
In step ①, D_0,1(x1,y1) = (D_200(x1,y1) - min(D_200)) / (max(D_200) - min(D_200)), where min() is the minimum value function, max() is the maximum value function, and both are taken over all the pixel values of {D_200(x1,y1)}.
In step ④, each low-frequency element is obtained by combining the four samples of the corresponding transformed sub-matrix whose indices differ by 0 or 4 in each dimension: for the sub-matrix in row u and column v, R^low_u,v(m2,n2) is computed from the values of the elements of {R_u,v} with indices (m2,n2), (m2,n2+4), (m2+4,n2) and (m2+4,n2+4), and I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) are computed from the values of the elements of {I_u,v}, {J_u,v} and {K_u,v} with the same four indices, respectively.
Compared with the prior art, the invention has the advantages that:
1) The method uses the multi-channel image and the disparity map, obtains a quaternion matrix through block processing, performs a two-dimensional quaternion Fourier transform on the quaternion matrix, and carries out the subsequent processing on the transformed quaternion matrix; this favors parallel computation and conforms to the viewing habits of the human visual system, so the method conforms to salient semantic features.
2) The method fuses depth information into the multi-channel image corresponding to the left viewpoint image to obtain the quaternion matrix, which effectively improves the accuracy and stability of stereoscopic image visual saliency detection.
3) The method performs sparse representation on the low-frequency components of the two-dimensional quaternion Fourier transform, so fewer elements are used in the sparse representation, which increases the running speed of the method.
4) The method combines several viewing preferences through sparse representation, which improves the accuracy of stereoscopic image visual saliency detection and makes better use of those preferences.
Drawings
Fig. 1 is a block diagram of the overall implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying examples.
The invention provides a frequency domain sparse representation-based stereo image visual saliency extraction method, the overall implementation block diagram of which is shown in FIG. 1, and the method comprises the following steps:
① For any test stereoscopic image S_test, transform the left viewpoint image of S_test into the Lab color space and resize it to 200×200 pixels, denoting the transformed image {L_Lab200(x1,y1)}; then denote the L-channel, a-channel and b-channel images of {L_Lab200(x1,y1)} correspondingly as {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)} and {L_Lab200,b(x1,y1)}; wherein the width of S_test is W, the height of S_test is H, 1≤x1≤200, 1≤y1≤200, and L_Lab200(x1,y1), L_Lab200,L(x1,y1), L_Lab200,a(x1,y1) and L_Lab200,b(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images.
Scale the disparity map of S_test to a size of 200×200 pixels, denoting the scaled image {D_200(x1,y1)}; then normalize the pixel value of every pixel point in {D_200(x1,y1)} to the numerical range [0,1], denoting the normalized image {D_0,1(x1,y1)}; wherein D_200(x1,y1) and D_0,1(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images.
In this embodiment, in step ①, D_0,1(x1,y1) = (D_200(x1,y1) - min(D_200)) / (max(D_200) - min(D_200)), where min() is the minimum value function and max() is the maximum value function, both taken over all the pixel values of {D_200(x1,y1)}.
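As an illustration of step ①, a minimal Python sketch (assuming OpenCV and NumPy; all function and variable names here are illustrative and not part of the patent) could read:

    import cv2
    import numpy as np

    def preprocess(left_bgr, disparity):
        # Step 1: convert the left viewpoint image to Lab and resize to 200x200.
        lab = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2Lab)
        lab200 = cv2.resize(lab, (200, 200)).astype(np.float64)
        L, a, b = lab200[:, :, 0], lab200[:, :, 1], lab200[:, :, 2]
        # Resize the disparity map and min-max normalize it to [0, 1],
        # as spelled out for step 1 above.
        d200 = cv2.resize(disparity.astype(np.float64), (200, 200))
        d01 = (d200 - d200.min()) / (d200.max() - d200.min())
        return L, a, b, d01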
② Divide each of {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} into (200/8)×(200/8) = 25×25 non-overlapping image blocks of size 8×8; denote the matrix corresponding to the block image obtained by dividing {L_Lab200,L(x1,y1)} as {T^L_u,v}, composed of the sub-matrices corresponding to all the image blocks obtained by dividing {L_Lab200,L(x1,y1)}; likewise denote the matrices corresponding to the block images obtained by dividing {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} as {T^a_u,v}, {T^b_u,v} and {T^D_u,v}, each composed of the sub-matrices corresponding to all the image blocks of the respective block image; wherein 1≤u≤25, 1≤v≤25, 1≤m1≤8, 1≤n1≤8, T^L_u,v(m1,n1) denotes the pixel value of the pixel point whose coordinate position is (m1,n1) within the image block in row u and column v of the block image of {L_Lab200,L(x1,y1)}, that is, the value of the element with index (m1,n1) in the sub-matrix in row u and column v of {T^L_u,v}, the sub-matrix corresponding to the image block in row u and column v of the block image; and T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1) are defined analogously for {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)}.
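The 8×8 blocking of step ② can be expressed compactly with NumPy reshaping (an illustrative sketch; the [u, v, m1, n1] index convention matches the notation above):

    import numpy as np

    def to_blocks(img200):
        # Step 2: divide a 200x200 array into a 25x25 grid of non-overlapping
        # 8x8 sub-matrices, indexed as blocks[u, v, m1, n1].
        return img200.reshape(25, 8, 25, 8).swapaxes(1, 2)

Applying to_blocks to each of the four 200×200 images yields the four block matrices of step ②.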
③ Combine {T^L_u,v}, {T^a_u,v}, {T^b_u,v} and {T^D_u,v} into a quaternion matrix, denoted {Q_u,v}; the element with index (m1,n1) in the sub-matrix in row u and column v of {Q_u,v} is the quaternion value formed from T^L_u,v(m1,n1), T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1), one of the four values serving as the real part and the other three as the coefficients of the imaginary units; wherein i, j and k are all imaginary units.
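The patent text does not fix which of the four components occupies the real part of the quaternion; the sketch below stores the four block matrices in the assumed order (real, i, j, k) = (disparity, L, a, b) purely for illustration:

    import numpy as np

    def quaternion_blocks(Lb, ab, bb, Db):
        # Step 3: stack the four 25x25x8x8 block matrices into one
        # quaternion-valued block matrix q = w + x*i + y*j + z*k.
        # The component order (w, x, y, z) = (D, L, a, b) is an assumption;
        # the text only states that one component is the real part and the
        # other three are the coefficients of i, j and k.
        return np.stack([Db, Lb, ab, bb], axis=-1)  # shape (25, 25, 8, 8, 4)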
④ Perform the existing two-dimensional quaternion Fourier transform on each sub-matrix of {Q_u,v} to obtain the transformed quaternion matrix of {Q_u,v}, denoted {Q^F_u,v}, where Q^F_u,v = QFT(Q_u,v) and QFT() denotes the existing two-dimensional quaternion Fourier transform function; denote the real part, i-imaginary part, j-imaginary part and k-imaginary part of {Q^F_u,v} correspondingly as {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, where R_u,v(m1,n1), I_u,v(m1,n1), J_u,v(m1,n1) and K_u,v(m1,n1) denote the values of the elements with index (m1,n1) in the sub-matrices in row u and column v of the respective matrices; then extract the respective low-frequency components of {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, correspondingly denoted {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}; wherein 1≤m2≤4, 1≤n2≤4, and R^low_u,v(m2,n2), I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) denote the values of the elements with index (m2,n2) in the 4×4 low-frequency sub-matrices in row u and column v of the respective matrices.
In this embodiment, in step ④, each low-frequency element is obtained by combining the four samples of the corresponding transformed sub-matrix whose indices differ by 0 or 4 in each dimension: for the sub-matrix in row u and column v, R^low_u,v(m2,n2) is computed from the values of the elements of {R_u,v} with indices (m2,n2), (m2,n2+4), (m2+4,n2) and (m2+4,n2+4), and I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) are computed from the values of the elements of {I_u,v}, {J_u,v} and {K_u,v} with the same four indices, respectively.
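The patent does not state which variant of the two-dimensional quaternion Fourier transform is used. One standard choice is the left-sided QFT with transform axis i, computable with two complex FFTs through the symplectic decomposition q = (w + x·i) + (y + z·i)·j; the sketch below follows that choice and forms the 4×4 low-frequency components by averaging the four samples named above, which is one plausible reading of the combination formulas (both choices are assumptions):

    import numpy as np

    def qft_lowfreq(qblocks):
        # Steps 4-5 (first half): per-block quaternion FFT and 4x4
        # low-frequency extraction for each of the four parts.
        w, x, y, z = (qblocks[..., c] for c in range(4))
        # Symplectic decomposition: q = qa + qb*j with qa, qb complex in i.
        Fa = np.fft.fft2(w + 1j * x)   # carries the real part and the i-part
        Fb = np.fft.fft2(y + 1j * z)   # carries the j-part and the k-part
        parts = [Fa.real, Fa.imag, Fb.real, Fb.imag]   # R, I, J, K
        low = []
        for P in parts:
            # Average the samples (m2,n2), (m2,n2+4), (m2+4,n2), (m2+4,n2+4)
            # of every 8x8 block; the averaging is an assumed combination.
            low.append(0.25 * (P[..., :4, :4] + P[..., :4, 4:]
                               + P[..., 4:, :4] + P[..., 4:, 4:]))
        return low   # four arrays of shape (25, 25, 4, 4)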
⑤ From {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}, obtain the low-frequency component diagram corresponding to {Q^F_u,v}, denoted {F_low}; the pixel value of the pixel point whose coordinate position is (m1,n1) within the 8×8 area in row u and column v of {F_low} is taken from the four 4×4 low-frequency sub-matrices in row u and column v, which tile the 8×8 area quadrant by quadrant: in the four quadrants the pixel value equals the value of the element with index (m1,n1), (m1-4,n1), (m1,n1-4) or (m1-4,n1-4), respectively, of the low-frequency sub-matrix assigned to that quadrant.
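Assembling the 200×200 low-frequency component diagram of step ⑤ then amounts to tiling each 8×8 area with the four 4×4 low-frequency sub-matrices; the quadrant order below follows the order in which the four parts are listed above and is therefore an assumption:

    import numpy as np

    def lowfreq_diagram(low):
        # Step 5: tile each 8x8 area with the four 4x4 low-frequency
        # sub-matrices (quadrant order R, I, J, K is assumed).
        R, I, J, K = low
        blocks = np.empty((25, 25, 8, 8))
        blocks[..., :4, :4] = R   # element index (m1, n1)
        blocks[..., 4:, :4] = I   # element index (m1 - 4, n1)
        blocks[..., :4, 4:] = J   # element index (m1, n1 - 4)
        blocks[..., 4:, 4:] = K   # element index (m1 - 4, n1 - 4)
        # Undo the blocking to obtain a single 200x200 image.
        return blocks.swapaxes(1, 2).reshape(200, 200)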
⑥ Using the existing Lasso method, with the 8×8 regions of {F_low} as the sparse representation dictionary, extract the sparse weight of each 8×8 region of {F_low} to obtain the sparse weight matrix corresponding to {F_low}, denoted {W_u,v}; in the Lasso method the Lagrange penalty term is selected with a weighting parameter λ, and λ = 0.4 is taken in this embodiment; wherein 1≤u1≤25, 1≤v1≤25, and W_u,v(u1,v1) denotes the value of the element with index (u1,v1) in the sub-matrix in row u and column v of {W_u,v}, namely the weight in row u1 and column v1 generated when the 8×8 region in row u and column v of {F_low} is sparsely represented with the sparse representation dictionary.
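A sketch of the sparse coding of step ⑥ with scikit-learn's Lasso (illustrative only: the exact mapping between the weighting parameter λ = 0.4 and scikit-learn's alpha, and whether a region is excluded from its own dictionary, are assumptions):

    import numpy as np
    from sklearn.linear_model import Lasso

    def sparse_weights(f_low, lam=0.4):
        # Step 6: represent each vectorized 8x8 region of the low-frequency
        # component diagram over the dictionary formed by all 625 regions.
        regions = f_low.reshape(25, 8, 25, 8).swapaxes(1, 2).reshape(625, 64)
        D = regions.T                      # one 64-dimensional atom per region
        W = np.empty((625, 625))           # one 25x25 weight map per region
        model = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
        for r in range(625):
            model.fit(D, regions[r])       # Lasso with Lagrange penalty term
            W[r] = model.coef_
        return W.reshape(25, 25, 25, 25)   # W[u, v, u1, v1]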
⑦ Obtain an absolute value sum image of 25×25 pixel size, denoted {W_abs(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v), denoted W_abs(u,v), is the sum of the absolute values of all the elements of the sparse weight matrix associated with the 8×8 region in row u and column v. Acquire a center preference image of 25×25 pixel size, denoted {C_saliency25(u1,v1)}; the pixel value of the pixel point whose coordinate position is (u1,v1), denoted C_saliency25(u1,v1), is a Gaussian-shaped function, with the natural base e, of the distance from (u1,v1) to the center pixel point (u0,v0), controlled by the first center preference parameter δ_F. Then acquire a first center-surround salient image of 25×25 pixel size, denoted {C_cs(u,v)}, whose pixel value C_cs(u,v) at coordinate position (u,v) is computed from the sparse weights and {W_abs(u,v)}; and acquire a second center-surround salient image of 25×25 pixel size, denoted {C_near(u,v)}, whose pixel value C_near(u,v) is computed analogously with the second center preference parameter δ_D. Then blur {C_cs(u,v)} with the existing Gaussian filter template, denoting the resulting image {C_cs-g(u,v)}, and blur {C_near(u,v)} with the existing Gaussian filter template, denoting the resulting image {C_near-g(u,v)}; wherein abs() is the absolute value function, e is the natural base, e = 2.7182818284…, u0 denotes the abscissa and v0 the ordinate of the center pixel point of {C_saliency25(u1,v1)}, and in this embodiment u0 = 12.5, v0 = 12.5, δ_F = 114 and δ_D = 114 are taken; C_cs-g(u,v) and C_near-g(u,v) denote the pixel values of the pixel points whose coordinate position is (u,v) in {C_cs-g(u,v)} and {C_near-g(u,v)}, and when applying the Gaussian filter template the parameter controlling the degree of blur is taken as 5, the filter width control parameter X_G as 5, and the filter height control parameter Y_G as 5.
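The exact exponent of the center preference image and the center-surround formulas are not recoverable from this text; the sketch below uses a squared-distance Gaussian with δ_F = 114 and approximates the embodiment's 5×5 Gaussian filter template with σ = 5, both as assumptions:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def center_preference(delta=114.0, size=25):
        # Step 7 (in part): Gaussian-shaped center preference image centred
        # on (u0, v0) = (12.5, 12.5); the squared-distance exponent is assumed.
        u = np.arange(1, size + 1)
        uu, vv = np.meshgrid(u, u, indexing="ij")
        return np.exp(-((uu - 12.5) ** 2 + (vv - 12.5) ** 2) / delta)

    def blur(img, sigma=5.0):
        # Gaussian filter template; truncate=0.4 with sigma=5 gives a kernel
        # radius of int(0.4 * 5 + 0.5) = 2, i.e. a 5x5 template (X_G = Y_G = 5).
        return gaussian_filter(img, sigma=sigma, truncate=0.4)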
⑧ Fuse {C_cs-g(u,v)} and {C_near-g(u,v)} to obtain a fused image, denoted {C_F(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v) is denoted C_F(u,v), with C_F(u,v) = 0.3×C_near-g(u,v) + 0.7×C_cs-g(u,v); then use {C_saliency25(u1,v1)} to apply center-surround enhancement to {C_F(u,v)}, obtaining a center-surround enhanced image of 25×25 pixel size, denoted {C_FO(u,v)}, whose pixel value C_FO(u,v) = C_F(u,v)×C_saliency25(u,v); then apply image size conversion to {C_FO(u,v)} to obtain the visual saliency map of S_test, denoted {S_F(x,y)}; wherein S_F(x,y) denotes the pixel value of the pixel point whose coordinate position is (x,y), 1≤x≤W and 1≤y≤H.
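Step ⑧ is fully specified above apart from the resizing routine; a minimal sketch (cv2.resize for the size conversion is an assumption):

    import cv2

    def fuse_and_resize(c_cs_g, c_near_g, c_pref, out_w, out_h):
        # Step 8: weighted fusion, center-surround enhancement, and size
        # conversion back to the original W x H of the test stereo image.
        c_f = 0.3 * c_near_g + 0.7 * c_cs_g      # fusion weights from the text
        c_fo = c_f * c_pref                      # center-surround enhancement
        return cv2.resize(c_fo, (out_w, out_h))  # final saliency map S_F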
To verify the feasibility and effectiveness of the method of the invention, experiments were performed.
Here, the three-dimensional eye-tracking database (3D eye-tracking database) provided by the University of Nantes, France, is used to analyze the accuracy and stability of the method of the invention. Three objective parameters commonly used to evaluate visual saliency extraction methods serve as evaluation indexes: the Pearson linear correlation coefficient (PLCC), the Kullback-Leibler divergence (KLD), and the area under the receiver operating characteristic curve (AUC).
The method of the invention is used to obtain the visual saliency map of each stereoscopic image in the eye-tracking database, and each map is compared with the subjective visual saliency map of the same image (provided in the database). Higher PLCC and AUC values and a lower KLD value indicate better consistency between the extracted visual saliency map and the subjective one. Table 1 lists the PLCC, KLD and AUC indexes reflecting the extraction performance of the method. The data in Table 1 show that the consistency between the extracted and subjective visual saliency maps is very good, indicating that the objective extraction results agree with subjective human-eye perception, which is sufficient to illustrate the feasibility and effectiveness of the method.
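For reference, the three evaluation indexes can be computed as in the following sketch (SciPy and scikit-learn assumed; the thresholding convention used to binarize the subjective map for the AUC is an assumption):

    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.metrics import roc_auc_score

    def evaluate(pred, gt, eps=1e-12):
        # PLCC between the extracted and subjective saliency maps.
        plcc = pearsonr(pred.ravel(), gt.ravel())[0]
        # KLD, with both maps normalized to probability distributions.
        p = gt.ravel() / (gt.sum() + eps)
        q = pred.ravel() / (pred.sum() + eps)
        kld = float(np.sum(p * np.log((p + eps) / (q + eps))))
        # AUC, with above-median subjective pixels taken as positives (assumed).
        labels = (gt.ravel() > np.median(gt)).astype(int)
        auc = roc_auc_score(labels, pred.ravel())
        return plcc, kld, auc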
TABLE 1 Accuracy and stability of the visual saliency maps extracted by the method of the present invention relative to the subjective visual saliency maps (the numerical values of Table 1 appear only as an image in the original publication)

Claims (3)

1. A stereoscopic image visual saliency extraction method based on frequency domain sparse representation is characterized by comprising the following steps:
① For any test stereoscopic image S_test, transform the left viewpoint image of S_test into the Lab color space and resize it to 200×200 pixels, denoting the transformed image {L_Lab200(x1,y1)}; then denote the L-channel, a-channel and b-channel images of {L_Lab200(x1,y1)} correspondingly as {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)} and {L_Lab200,b(x1,y1)}; wherein the width of S_test is W, the height of S_test is H, 1≤x1≤200, 1≤y1≤200, and L_Lab200(x1,y1), L_Lab200,L(x1,y1), L_Lab200,a(x1,y1) and L_Lab200,b(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
scale the disparity map of S_test to a size of 200×200 pixels, denoting the scaled image {D_200(x1,y1)}; then normalize the pixel value of every pixel point in {D_200(x1,y1)} to the numerical range [0,1], denoting the normalized image {D_0,1(x1,y1)}; wherein D_200(x1,y1) and D_0,1(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
② Divide each of {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} into (200/8)×(200/8) = 25×25 non-overlapping image blocks of size 8×8; denote the matrix corresponding to the block image obtained by dividing {L_Lab200,L(x1,y1)} as {T^L_u,v}, composed of the sub-matrices corresponding to all the image blocks obtained by dividing {L_Lab200,L(x1,y1)}; likewise denote the matrices corresponding to the block images obtained by dividing {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} as {T^a_u,v}, {T^b_u,v} and {T^D_u,v}, each composed of the sub-matrices corresponding to all the image blocks of the respective block image; wherein 1≤u≤25, 1≤v≤25, 1≤m1≤8, 1≤n1≤8, T^L_u,v(m1,n1) denotes the pixel value of the pixel point whose coordinate position is (m1,n1) within the image block in row u and column v of the block image of {L_Lab200,L(x1,y1)}, that is, the value of the element with index (m1,n1) in the sub-matrix in row u and column v of {T^L_u,v}; and T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1) are defined analogously for {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)};
③ Combine {T^L_u,v}, {T^a_u,v}, {T^b_u,v} and {T^D_u,v} into a quaternion matrix, denoted {Q_u,v}; the element with index (m1,n1) in the sub-matrix in row u and column v of {Q_u,v} is the quaternion value formed from T^L_u,v(m1,n1), T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1), one of the four values serving as the real part and the other three as the coefficients of the imaginary units; wherein i, j and k are all imaginary units;
④ Perform a two-dimensional quaternion Fourier transform on each sub-matrix of {Q_u,v} to obtain the transformed quaternion matrix of {Q_u,v}, denoted {Q^F_u,v}, where Q^F_u,v = QFT(Q_u,v) and QFT() denotes the two-dimensional quaternion Fourier transform function; denote the real part, i-imaginary part, j-imaginary part and k-imaginary part of {Q^F_u,v} correspondingly as {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, where R_u,v(m1,n1), I_u,v(m1,n1), J_u,v(m1,n1) and K_u,v(m1,n1) denote the values of the elements with index (m1,n1) in the sub-matrices in row u and column v of the respective matrices; then extract the respective low-frequency components of {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, correspondingly denoted {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}; wherein 1≤m2≤4, 1≤n2≤4, and R^low_u,v(m2,n2), I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) denote the values of the elements with index (m2,n2) in the 4×4 low-frequency sub-matrices in row u and column v of the respective matrices;
⑤ From {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}, obtain the low-frequency component diagram corresponding to {Q^F_u,v}, denoted {F_low}; the pixel value of the pixel point whose coordinate position is (m1,n1) within the 8×8 area in row u and column v of {F_low} is taken from the four 4×4 low-frequency sub-matrices in row u and column v, which tile the 8×8 area quadrant by quadrant: in the four quadrants the pixel value equals the value of the element with index (m1,n1), (m1-4,n1), (m1,n1-4) or (m1-4,n1-4), respectively, of the low-frequency sub-matrix assigned to that quadrant;
⑥ Using the Lasso method, with the 8×8 regions of {F_low} as the sparse representation dictionary, extract the sparse weight of each 8×8 region of {F_low} to obtain the sparse weight matrix corresponding to {F_low}, denoted {W_u,v}; wherein, in the Lasso method, the Lagrange penalty term is selected with a weighting parameter λ; 1≤u1≤25, 1≤v1≤25, and W_u,v(u1,v1) denotes the value of the element with index (u1,v1) in the sub-matrix in row u and column v of {W_u,v}, namely the weight in row u1 and column v1 generated when the 8×8 region in row u and column v of {F_low} is sparsely represented with the sparse representation dictionary;
⑦ Obtain an absolute-value-sum image of 25×25 pixels, denoted {Wabs(u,v)}; the pixel value of the pixel at coordinate (u,v) in {Wabs(u,v)} is denoted Wabs(u,v), [formula omitted]. Also obtain a center preference image of 25×25 pixels, denoted {Csaliency25(u1,v1)}; the pixel value of the pixel at coordinate (u1,v1) in {Csaliency25(u1,v1)} is denoted Csaliency25(u1,v1), [formula omitted]. Then obtain a first center-surround salient image of 25×25 pixels, denoted {Ccs(u,v)}; the pixel value of the pixel at coordinate (u,v) in {Ccs(u,v)} is denoted Ccs(u,v), [formula omitted]. Also obtain a second center-surround salient image of 25×25 pixels, denoted {Cnear(u,v)}; the pixel value of the pixel at coordinate (u,v) in {Cnear(u,v)} is denoted Cnear(u,v), [formula omitted]. Then blur {Ccs(u,v)} with a Gaussian filter template, and denote the image obtained by blurring {Ccs(u,v)} as {Ccs-g(u,v)}; likewise blur {Cnear(u,v)} with a Gaussian filter template, and denote the image obtained by blurring {Cnear(u,v)} as {Cnear-g(u,v)}. Here abs() is the absolute-value function, e is the base of the natural logarithm, u0 denotes the abscissa of the center pixel of {Csaliency25(u1,v1)} and v0 its ordinate, δF denotes the first center preference parameter, δD denotes the second center preference parameter, Ccs-g(u,v) denotes the pixel value of the pixel at coordinate (u,v) in {Ccs-g(u,v)}, and Cnear-g(u,v) denotes the pixel value of the pixel at coordinate (u,v) in {Cnear-g(u,v)}; when the Gaussian filter template is applied, the blur-degree control parameter [symbol omitted] is set to 5, the filter width control parameter XG is set to 5, and the filter height control parameter YG is set to 5;
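Again purely illustrative: the center preference image and the 5×5 Gaussian blurring of step ⑦ might be sketched as below. The claimed formula for Csaliency25 sits in an omitted figure, so the isotropic Gaussian fall-off here is an assumption, with the hypothetical `delta` standing in for a center preference parameter; the blur uses the claimed width 5, height 5, and blur parameter 5.

```python
# Hedged sketch of step (7): an assumed Gaussian center-preference map over
# a 25x25 grid, then 5x5 Gaussian blurring (X_G = 5, Y_G = 5, sigma = 5).
import cv2
import numpy as np

def center_preference(size=25, delta=10.0):
    u0 = v0 = (size - 1) / 2.0                      # center pixel (u0, v0)
    u, v = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    return np.exp(-((u - u0) ** 2 + (v - v0) ** 2) / (2.0 * delta ** 2))

c_cs = np.random.rand(25, 25).astype(np.float32)    # stand-in for {Ccs(u,v)}
c_cs_g = cv2.GaussianBlur(c_cs, (5, 5), 5)          # blurred map {Ccs-g(u,v)}
```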
⑧ Fuse {Ccs-g(u,v)} and {Cnear-g(u,v)}; the resulting fused image is denoted {CF(u,v)}, and the pixel value of the pixel at coordinate (u,v) in {CF(u,v)} is denoted CF(u,v), CF(u,v) = 0.3 × Cnear-g(u,v) + 0.7 × Ccs-g(u,v). Then use {Csaliency25(u1,v1)} to apply center-surround enhancement to {CF(u,v)}, obtaining a center-surround-enhanced image of 25×25 pixels, denoted {CFO(u,v)}; the pixel value of the pixel at coordinate (u,v) in {CFO(u,v)} is denoted CFO(u,v), CFO(u,v) = CF(u,v) × Csaliency25(u1,v1). Then perform image size conversion on {CFO(u,v)} to obtain the visual saliency image of Stest, denoted {SF(x,y)}; here SF(x,y) denotes the pixel value of the pixel at coordinate (x,y) in {SF(x,y)}, 1 ≤ x ≤ W, 1 ≤ y ≤ H.
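A last illustrative sketch, covering the fixed-weight fusion, the element-wise center-surround enhancement, and the final resize of step ⑧. The 0.3/0.7 weights and the element-wise product come directly from the claim; the interpolation used by `cv2.resize` is an assumption the claim does not fix.

```python
# Hedged sketch of step (8): fuse the two blurred maps with weights 0.3/0.7,
# enhance by the center-preference map, and resize to the original W x H.
import cv2
import numpy as np

def fuse_and_upscale(c_cs_g, c_near_g, c_pref, W, H):
    c_f = 0.3 * c_near_g + 0.7 * c_cs_g       # CF = 0.3*Cnear-g + 0.7*Ccs-g
    c_fo = c_f * c_pref                       # CFO = CF x Csaliency25
    return cv2.resize(c_fo.astype(np.float32), (W, H))   # {SF(x, y)}
```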
2. The method for extracting visual saliency of stereoscopic images based on frequency domain sparse representation according to claim 1, wherein in said step ①, [formula omitted], where min() is the minimum-value function and max() is the maximum-value function.
3. The method for extracting visual saliency of stereoscopic images based on frequency domain sparse representation according to claim 1 or 2, wherein in said step ④, [four formulas omitted], where, for each of the four matrices involved [symbols omitted], the quantities appearing in the formulas denote, respectively: the value of the element with subscript (m2, n2) in the sub-matrix at the u-th row and v-th column; the value of the element with subscript (m2, n2+4) in that sub-matrix; the value of the element with subscript (m2+4, n2) in that sub-matrix; and the value of the element with subscript (m2+4, n2+4) in that sub-matrix.
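The subscript pattern in claim 3 — (m2, n2), (m2, n2+4), (m2+4, n2), (m2+4, n2+4) inside 8×8 sub-matrices — suggests each omitted formula combines one sample from each of the four quadrants of an 8×8 frequency block. The sketch below shows only that indexing; the combination itself is in the omitted formulas, and the FFT stand-in is an assumption.

```python
# Hedged sketch for claim 3: sampling the four quadrant-shifted entries of
# an 8x8 frequency block, matching the (m2, n2) / (+4) subscript pattern.
import numpy as np

block = np.fft.fft2(np.random.rand(8, 8))   # stand-in 8x8 frequency block
m2, n2 = 0, 0                               # 0-based; the claim's m2, n2 are 1-based
samples = [block[m2, n2], block[m2, n2 + 4],
           block[m2 + 4, n2], block[m2 + 4, n2 + 4]]
```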
CN201810066806.9A 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation Active CN108470176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810066806.9A CN108470176B (en) 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810066806.9A CN108470176B (en) 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation

Publications (2)

Publication Number Publication Date
CN108470176A (en) 2018-08-31
CN108470176B (en) 2020-06-26

Family

ID=63266107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810066806.9A Active CN108470176B (en) 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation

Country Status (1)

Country Link
CN (1) CN108470176B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182492B (en) * 2020-09-23 2022-12-16 广东工业大学 Signal sparse representation method and device based on discrete quaternion Fourier transform
CN112381838B (en) * 2020-11-14 2022-04-19 四川大学华西医院 Automatic image cutting method for digital pathological section image

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130068182A (en) * 2011-12-14 2013-06-26 한국전자통신연구원 Methods of extracting saliency region and apparatuses for using the same
CN104754320A (en) * 2015-03-27 2015-07-01 同济大学 Method for calculating 3D-JND threshold value
CN105740859A (en) * 2016-01-27 2016-07-06 电子科技大学 Three-dimensional interest point detection method based on geometric measure and sparse optimization
CN106682599A (en) * 2016-12-15 2017-05-17 浙江科技学院 Stereo image visual saliency extraction method based on sparse representation

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A novel multiresolution spatiotemporal saliency detection; Guo C et al.; IEEE Transactions on Image Processing; 20090825; pp. 185-198 *
A Weighted Sparse Coding Framework for Saliency Detection; Nianyi Li et al.; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 20151231; pp. 5216-5223 *
Saliency Detection Using Quaternion Sparse Reconstruction; Zeng, Y et al.; 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOP (ICCVW); 20151218; pp. 469-476 *
Image frequency-domain saliency detection; Zhao Hongmiao; China Masters' Theses Full-text Database, Information Science and Technology; 20160731; pp. I138-860 *
Adaptive saliency detection based on hypercomplex Fourier transform; Huang Kan et al.; Journal of Computer Applications; 20170615; pp. 149-154 *

Also Published As

Publication number Publication date
CN108470176A (en) 2018-08-31

Similar Documents

Publication Publication Date Title
CN107564025B (en) Electric power equipment infrared image semantic segmentation method based on deep neural network
Li et al. A weighted sparse coding framework for saliency detection
CN110353675B (en) Electroencephalogram signal emotion recognition method and device based on picture generation
CN104834922B (en) Gesture identification method based on hybrid neural networks
CN103020965B (en) A kind of foreground segmentation method based on significance detection
CN106462771A (en) 3D image significance detection method
CN108765414B (en) No-reference stereo image quality evaluation method based on wavelet decomposition and natural scene statistics
CN106228528B (en) A kind of multi-focus image fusing method based on decision diagram and rarefaction representation
CN103996195A (en) Image saliency detection method
CN111754396B (en) Face image processing method, device, computer equipment and storage medium
CN105320950A (en) A video human face living body detection method
CN110827312B (en) Learning method based on cooperative visual attention neural network
Liu et al. Blind stereoscopic image quality assessment based on hierarchical learning
Zhou et al. Blind 3D image quality assessment based on self-similarity of binocular features
CN105654142A (en) Natural scene statistics-based non-reference stereo image quality evaluation method
CN111080670A (en) Image extraction method, device, equipment and storage medium
CN108470176B (en) Stereo image visual saliency extraction method based on frequency domain sparse representation
Karimi et al. Blind stereo quality assessment based on learned features from binocular combined images
CN112669249A (en) Infrared and visible light image fusion method combining improved NSCT (non-subsampled Contourlet transform) transformation and deep learning
Jiang et al. Quality assessment for virtual reality technology based on real scene
CN107665488B (en) Stereo image visual saliency extraction method
CN106682599B (en) Sparse representation-based stereo image visual saliency extraction method
CN113705361A (en) Method and device for detecting model in living body and electronic equipment
CN107886533B (en) Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium
CN115965844B (en) Multi-focus image fusion method based on visual saliency priori knowledge

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220622

Address after: Unit a2203e, innovation Plaza, No. 2007, Pingshan Avenue, Liulian community, Pingshan street, Pingshan District, Shenzhen, Guangdong 518118

Patentee after: Shenzhen Muye Microelectronics Technology Co.,Ltd.

Address before: 310023 No. 318 stay Road, Xihu District, Zhejiang, Hangzhou

Patentee before: ZHEJIANG University OF SCIENCE AND TECHNOLOGY