CN108470176B - Stereo image visual saliency extraction method based on frequency domain sparse representation

Info

Publication number
CN108470176B
Authority
CN
China
Prior art keywords
value
image
matrix
column
row
Prior art date
Legal status
Active
Application number
CN201810066806.9A
Other languages
Chinese (zh)
Other versions
CN108470176A (en)
Inventor
周武杰
蔡星宇
张爽爽
顾鹏笠
潘婷
郑飘飘
吕思嘉
袁建中
陈昱臻
胡慧敏
金国英
王建芬
王新华
孙丽慧
吴洁雯
Current Assignee
Shenzhen Muye Microelectronics Technology Co.,Ltd.
Original Assignee
Zhejiang Lover Health Science and Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Lover Health Science and Technology Development Co Ltd
Priority to CN201810066806.9A
Publication of CN108470176A
Application granted
Publication of CN108470176B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 - Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G06V10/513 - Sparse representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a stereoscopic image visual saliency extraction method based on frequency domain sparse representation. The method acquires a small-size three-channel image and a small-size normalized disparity map from the left viewpoint image of a test stereoscopic image; divides the three-channel image and the normalized disparity map into blocks to obtain the matrix corresponding to each block image; forms a quaternion matrix from the four matrices and applies a two-dimensional quaternion Fourier transform to obtain a transformed quaternion matrix, from whose real part and three imaginary parts a low-frequency component diagram is obtained; extracts the sparse weight matrix of the low-frequency component diagram with a sparse representation dictionary; acquires two center-surround salient images from the sparse weight matrix and the center preference image; blurs the two center-surround salient images and fuses them into a fused image; reinforces the fused image around its center with the center preference image; and applies size conversion to the center-surround enhanced image to obtain the visual saliency map. The method has the advantages of stronger extraction stability and higher extraction accuracy.

Description

Stereo image visual saliency extraction method based on frequency domain sparse representation
Technical Field
The invention relates to an image signal processing method, and in particular to a stereoscopic image visual saliency extraction method based on frequency domain sparse representation.
Background
After receiving a natural image, people must distinguish information resources at different levels, so when processing natural image information they grade the different information, and selectively attended features emerge. When viewing an image or video clip, people do not attend evenly to every area; they preferentially process the parts whose semantic information is of greater interest. Computing salient image regions is an important research direction in computer vision and content-based video detection. With the rapid development of stereoscopic display and acquisition equipment, visual saliency detection for stereoscopic images has also become a very important research topic.
A stereoscopic image is not a simple extension of a planar image, and the process by which human eyes perceive a stereoscopic image is not a simple superposition of the left viewpoint image and the right viewpoint image, so stereoscopic visual characteristics are not a simple extension of planar visual characteristics. However, current stereoscopic image visual saliency extraction methods still largely follow planar image methods. Therefore, how to effectively extract stereoscopic features from a stereoscopic image, and how to make the extracted stereoscopic features conform to the viewing habits of the human visual system, are problems that must be studied in the process of extracting a visual saliency map of a stereoscopic image.
Disclosure of Invention
The invention aims to provide a stereoscopic image visual saliency extraction method based on frequency domain sparse representation that conforms to salient semantic features and offers stronger extraction stability and higher extraction accuracy.
The technical scheme adopted by the invention for solving the technical problems is as follows: a stereoscopic image visual saliency extraction method based on frequency domain sparse representation is characterized by comprising the following steps:
① For any test stereoscopic image S_test, transform the left viewpoint image of S_test into the Lab color space and resize it to 200×200 pixels, denoting the transformed image {L_Lab200(x1,y1)}; then denote the L-channel, a-channel and b-channel images of {L_Lab200(x1,y1)} correspondingly as {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)} and {L_Lab200,b(x1,y1)}; wherein the width of S_test is W, the height of S_test is H, 1≤x1≤200, 1≤y1≤200, and L_Lab200(x1,y1), L_Lab200,L(x1,y1), L_Lab200,a(x1,y1) and L_Lab200,b(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
Scale the disparity map of S_test to a size of 200×200 pixels, denoting the scaled image {D_200(x1,y1)}; then normalize the pixel value of every pixel point in {D_200(x1,y1)} to the numerical range [0,1], denoting the normalized image {D_0,1(x1,y1)}; wherein D_200(x1,y1) and D_0,1(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
② Divide each of {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} into (200/8)×(200/8) = 25×25 non-overlapping image blocks of size 8×8; denote the matrix corresponding to the block image obtained by dividing {L_Lab200,L(x1,y1)} as {T^L_u,v}, composed of the sub-matrices corresponding to all the image blocks obtained by dividing {L_Lab200,L(x1,y1)}; likewise denote the matrices corresponding to the block images obtained by dividing {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} as {T^a_u,v}, {T^b_u,v} and {T^D_u,v}, each composed of the sub-matrices corresponding to all the image blocks of the respective block image; wherein 1≤u≤25, 1≤v≤25, 1≤m1≤8, 1≤n1≤8, T^L_u,v(m1,n1) denotes the pixel value of the pixel point whose coordinate position is (m1,n1) within the image block in row u and column v of the block image of {L_Lab200,L(x1,y1)}, that is, the value of the element with index (m1,n1) in the sub-matrix in row u and column v of {T^L_u,v}; and T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1) are defined analogously for {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)};
③ Combine {T^L_u,v}, {T^a_u,v}, {T^b_u,v} and {T^D_u,v} into a quaternion matrix, denoted {Q_u,v}; the element with index (m1,n1) in the sub-matrix in row u and column v of {Q_u,v} is the quaternion value formed from T^L_u,v(m1,n1), T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1), one of the four values serving as the real part and the other three as the coefficients of the imaginary units; wherein i, j and k are all imaginary units;
④ Perform a two-dimensional quaternion Fourier transform on each sub-matrix of {Q_u,v} to obtain the transformed quaternion matrix of {Q_u,v}, denoted {Q^F_u,v}, where Q^F_u,v = QFT(Q_u,v) and QFT() denotes the two-dimensional quaternion Fourier transform function; denote the real part, i-imaginary part, j-imaginary part and k-imaginary part of {Q^F_u,v} correspondingly as {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, where R_u,v(m1,n1), I_u,v(m1,n1), J_u,v(m1,n1) and K_u,v(m1,n1) denote the values of the elements with index (m1,n1) in the sub-matrices in row u and column v of the respective matrices; then extract the respective low-frequency components of {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, correspondingly denoted {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}; wherein 1≤m2≤4, 1≤n2≤4, and R^low_u,v(m2,n2), I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) denote the values of the elements with index (m2,n2) in the 4×4 low-frequency sub-matrices in row u and column v of the respective matrices;
⑤ From {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}, obtain the low-frequency component diagram corresponding to {Q^F_u,v}, denoted {F_low}; the pixel value of the pixel point whose coordinate position is (m1,n1) within the 8×8 area in row u and column v of {F_low} is taken from the four 4×4 low-frequency sub-matrices in row u and column v, which tile the 8×8 area quadrant by quadrant: in the four quadrants the pixel value equals the value of the element with index (m1,n1), (m1-4,n1), (m1,n1-4) or (m1-4,n1-4), respectively, of the low-frequency sub-matrix assigned to that quadrant;
⑥ Using the Lasso method, with the 8×8 regions of {F_low} as the sparse representation dictionary, extract the sparse weight of each 8×8 region of {F_low} to obtain the sparse weight matrix corresponding to {F_low}, denoted {W_u,v}; wherein, in the Lasso method, the Lagrange penalty term is selected with a weighting parameter λ; 1≤u1≤25, 1≤v1≤25, and W_u,v(u1,v1) denotes the value of the element with index (u1,v1) in the sub-matrix in row u and column v of {W_u,v}, namely the weight in row u1 and column v1 generated when the 8×8 region in row u and column v of {F_low} is sparsely represented with the sparse representation dictionary;
⑦ Obtain an absolute value sum image of 25×25 pixel size, denoted {W_abs(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v), denoted W_abs(u,v), is the sum of the absolute values of all the elements of the sparse weight matrix associated with the 8×8 region in row u and column v. Acquire a center preference image of 25×25 pixel size, denoted {C_saliency25(u1,v1)}; the pixel value of the pixel point whose coordinate position is (u1,v1), denoted C_saliency25(u1,v1), is a Gaussian-shaped function, with the natural base e, of the distance from (u1,v1) to the center pixel point (u0,v0), controlled by the first center preference parameter δ_F. Then acquire a first center-surround salient image of 25×25 pixel size, denoted {C_cs(u,v)}, whose pixel value C_cs(u,v) at coordinate position (u,v) is computed from the sparse weights and {W_abs(u,v)}; and acquire a second center-surround salient image of 25×25 pixel size, denoted {C_near(u,v)}, whose pixel value C_near(u,v) is computed analogously with the second center preference parameter δ_D. Then blur {C_cs(u,v)} with a Gaussian filter template, denoting the resulting image {C_cs-g(u,v)}, and blur {C_near(u,v)} with a Gaussian filter template, denoting the resulting image {C_near-g(u,v)}; wherein abs() is the absolute value function, e is the natural base, u0 denotes the abscissa and v0 the ordinate of the center pixel point of {C_saliency25(u1,v1)}, C_cs-g(u,v) and C_near-g(u,v) denote the pixel values of the pixel points whose coordinate position is (u,v) in {C_cs-g(u,v)} and {C_near-g(u,v)}, and when applying the Gaussian filter template the parameter controlling the degree of blur is taken as 5, the filter width control parameter X_G as 5, and the filter height control parameter Y_G as 5;
⑧ Fuse {C_cs-g(u,v)} and {C_near-g(u,v)} to obtain a fused image, denoted {C_F(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v) is denoted C_F(u,v), with C_F(u,v) = 0.3×C_near-g(u,v) + 0.7×C_cs-g(u,v); then use {C_saliency25(u1,v1)} to apply center-surround enhancement to {C_F(u,v)}, obtaining a center-surround enhanced image of 25×25 pixel size, denoted {C_FO(u,v)}, whose pixel value C_FO(u,v) = C_F(u,v)×C_saliency25(u,v); then apply image size conversion to {C_FO(u,v)} to obtain the visual saliency map of S_test, denoted {S_F(x,y)}; wherein S_F(x,y) denotes the pixel value of the pixel point whose coordinate position is (x,y), 1≤x≤W and 1≤y≤H.
In step ①, D_0,1(x1,y1) = (D_200(x1,y1) - min(D_200)) / (max(D_200) - min(D_200)), where min() is the minimum value function, max() is the maximum value function, and both are taken over all the pixel values of {D_200(x1,y1)}.
In step ④, each low-frequency element is obtained by combining the four samples of the corresponding transformed sub-matrix whose indices differ by 0 or 4 in each dimension: for the sub-matrix in row u and column v, R^low_u,v(m2,n2) is computed from the values of the elements of {R_u,v} with indices (m2,n2), (m2,n2+4), (m2+4,n2) and (m2+4,n2+4), and I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) are computed from the values of the elements of {I_u,v}, {J_u,v} and {K_u,v} with the same four indices, respectively.
Compared with the prior art, the invention has the advantages that:
1) The method uses the multi-channel image and the disparity map, obtains a quaternion matrix through block processing, performs a two-dimensional quaternion Fourier transform on the quaternion matrix, and carries out the subsequent processing on the transformed quaternion matrix; this favors parallel computation and conforms to the viewing habits of the human visual system, so the method conforms to salient semantic features.
2) The method fuses depth information into the multi-channel image corresponding to the left viewpoint image to obtain the quaternion matrix, which effectively improves the accuracy and stability of stereoscopic image visual saliency detection.
3) The method performs sparse representation on the low-frequency components of the two-dimensional quaternion Fourier transform, so fewer elements are used in the sparse representation, which increases the running speed of the method.
4) The method combines several viewing preferences through sparse representation, which improves the accuracy of stereoscopic image visual saliency detection and makes better use of those preferences.
Drawings
Fig. 1 is a block diagram of the overall implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying examples.
The invention provides a frequency domain sparse representation-based stereo image visual saliency extraction method, the overall implementation block diagram of which is shown in FIG. 1, and the method comprises the following steps:
① For any test stereoscopic image S_test, transform the left viewpoint image of S_test into the Lab color space and resize it to 200×200 pixels, denoting the transformed image {L_Lab200(x1,y1)}; then denote the L-channel, a-channel and b-channel images of {L_Lab200(x1,y1)} correspondingly as {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)} and {L_Lab200,b(x1,y1)}; wherein the width of S_test is W, the height of S_test is H, 1≤x1≤200, 1≤y1≤200, and L_Lab200(x1,y1), L_Lab200,L(x1,y1), L_Lab200,a(x1,y1) and L_Lab200,b(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images.
Scale the disparity map of S_test to a size of 200×200 pixels, denoting the scaled image {D_200(x1,y1)}; then normalize the pixel value of every pixel point in {D_200(x1,y1)} to the numerical range [0,1], denoting the normalized image {D_0,1(x1,y1)}; wherein D_200(x1,y1) and D_0,1(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images.
In this embodiment, in step ①, D_0,1(x1,y1) = (D_200(x1,y1) - min(D_200)) / (max(D_200) - min(D_200)), where min() is the minimum value function and max() is the maximum value function, both taken over all the pixel values of {D_200(x1,y1)}.
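As an illustration of step ①, a minimal Python sketch (assuming OpenCV and NumPy; all function and variable names here are illustrative and not part of the patent) could read:

    import cv2
    import numpy as np

    def preprocess(left_bgr, disparity):
        # Step 1: convert the left viewpoint image to Lab and resize to 200x200.
        lab = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2Lab)
        lab200 = cv2.resize(lab, (200, 200)).astype(np.float64)
        L, a, b = lab200[:, :, 0], lab200[:, :, 1], lab200[:, :, 2]
        # Resize the disparity map and min-max normalize it to [0, 1],
        # as spelled out for step 1 above.
        d200 = cv2.resize(disparity.astype(np.float64), (200, 200))
        d01 = (d200 - d200.min()) / (d200.max() - d200.min())
        return L, a, b, d01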
② Divide each of {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} into (200/8)×(200/8) = 25×25 non-overlapping image blocks of size 8×8; denote the matrix corresponding to the block image obtained by dividing {L_Lab200,L(x1,y1)} as {T^L_u,v}, composed of the sub-matrices corresponding to all the image blocks obtained by dividing {L_Lab200,L(x1,y1)}; likewise denote the matrices corresponding to the block images obtained by dividing {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} as {T^a_u,v}, {T^b_u,v} and {T^D_u,v}, each composed of the sub-matrices corresponding to all the image blocks of the respective block image; wherein 1≤u≤25, 1≤v≤25, 1≤m1≤8, 1≤n1≤8, T^L_u,v(m1,n1) denotes the pixel value of the pixel point whose coordinate position is (m1,n1) within the image block in row u and column v of the block image of {L_Lab200,L(x1,y1)}, that is, the value of the element with index (m1,n1) in the sub-matrix in row u and column v of {T^L_u,v}, the sub-matrix corresponding to the image block in row u and column v of the block image; and T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1) are defined analogously for {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)}.
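The 8×8 blocking of step ② can be expressed compactly with NumPy reshaping (an illustrative sketch; the [u, v, m1, n1] index convention matches the notation above):

    import numpy as np

    def to_blocks(img200):
        # Step 2: divide a 200x200 array into a 25x25 grid of non-overlapping
        # 8x8 sub-matrices, indexed as blocks[u, v, m1, n1].
        return img200.reshape(25, 8, 25, 8).swapaxes(1, 2)

Applying to_blocks to each of the four 200×200 images yields the four block matrices of step ②.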
③ Combine {T^L_u,v}, {T^a_u,v}, {T^b_u,v} and {T^D_u,v} into a quaternion matrix, denoted {Q_u,v}; the element with index (m1,n1) in the sub-matrix in row u and column v of {Q_u,v} is the quaternion value formed from T^L_u,v(m1,n1), T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1), one of the four values serving as the real part and the other three as the coefficients of the imaginary units; wherein i, j and k are all imaginary units.
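The patent text does not fix which of the four components occupies the real part of the quaternion; the sketch below stores the four block matrices in the assumed order (real, i, j, k) = (disparity, L, a, b) purely for illustration:

    import numpy as np

    def quaternion_blocks(Lb, ab, bb, Db):
        # Step 3: stack the four 25x25x8x8 block matrices into one
        # quaternion-valued block matrix q = w + x*i + y*j + z*k.
        # The component order (w, x, y, z) = (D, L, a, b) is an assumption;
        # the text only states that one component is the real part and the
        # other three are the coefficients of i, j and k.
        return np.stack([Db, Lb, ab, bb], axis=-1)  # shape (25, 25, 8, 8, 4)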
④ Perform the existing two-dimensional quaternion Fourier transform on each sub-matrix of {Q_u,v} to obtain the transformed quaternion matrix of {Q_u,v}, denoted {Q^F_u,v}, where Q^F_u,v = QFT(Q_u,v) and QFT() denotes the existing two-dimensional quaternion Fourier transform function; denote the real part, i-imaginary part, j-imaginary part and k-imaginary part of {Q^F_u,v} correspondingly as {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, where R_u,v(m1,n1), I_u,v(m1,n1), J_u,v(m1,n1) and K_u,v(m1,n1) denote the values of the elements with index (m1,n1) in the sub-matrices in row u and column v of the respective matrices; then extract the respective low-frequency components of {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, correspondingly denoted {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}; wherein 1≤m2≤4, 1≤n2≤4, and R^low_u,v(m2,n2), I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) denote the values of the elements with index (m2,n2) in the 4×4 low-frequency sub-matrices in row u and column v of the respective matrices.
In this embodiment, in step ④, each low-frequency element is obtained by combining the four samples of the corresponding transformed sub-matrix whose indices differ by 0 or 4 in each dimension: for the sub-matrix in row u and column v, R^low_u,v(m2,n2) is computed from the values of the elements of {R_u,v} with indices (m2,n2), (m2,n2+4), (m2+4,n2) and (m2+4,n2+4), and I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) are computed from the values of the elements of {I_u,v}, {J_u,v} and {K_u,v} with the same four indices, respectively.
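The patent does not state which variant of the two-dimensional quaternion Fourier transform is used. One standard choice is the left-sided QFT with transform axis i, computable with two complex FFTs through the symplectic decomposition q = (w + x·i) + (y + z·i)·j; the sketch below follows that choice and forms the 4×4 low-frequency components by averaging the four samples named above, which is one plausible reading of the combination formulas (both choices are assumptions):

    import numpy as np

    def qft_lowfreq(qblocks):
        # Steps 4-5 (first half): per-block quaternion FFT and 4x4
        # low-frequency extraction for each of the four parts.
        w, x, y, z = (qblocks[..., c] for c in range(4))
        # Symplectic decomposition: q = qa + qb*j with qa, qb complex in i.
        Fa = np.fft.fft2(w + 1j * x)   # carries the real part and the i-part
        Fb = np.fft.fft2(y + 1j * z)   # carries the j-part and the k-part
        parts = [Fa.real, Fa.imag, Fb.real, Fb.imag]   # R, I, J, K
        low = []
        for P in parts:
            # Average the samples (m2,n2), (m2,n2+4), (m2+4,n2), (m2+4,n2+4)
            # of every 8x8 block; the averaging is an assumed combination.
            low.append(0.25 * (P[..., :4, :4] + P[..., :4, 4:]
                               + P[..., 4:, :4] + P[..., 4:, 4:]))
        return low   # four arrays of shape (25, 25, 4, 4)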
⑤ From {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}, obtain the low-frequency component diagram corresponding to {Q^F_u,v}, denoted {F_low}; the pixel value of the pixel point whose coordinate position is (m1,n1) within the 8×8 area in row u and column v of {F_low} is taken from the four 4×4 low-frequency sub-matrices in row u and column v, which tile the 8×8 area quadrant by quadrant: in the four quadrants the pixel value equals the value of the element with index (m1,n1), (m1-4,n1), (m1,n1-4) or (m1-4,n1-4), respectively, of the low-frequency sub-matrix assigned to that quadrant.
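Assembling the 200×200 low-frequency component diagram of step ⑤ then amounts to tiling each 8×8 area with the four 4×4 low-frequency sub-matrices; the quadrant order below follows the order in which the four parts are listed above and is therefore an assumption:

    import numpy as np

    def lowfreq_diagram(low):
        # Step 5: tile each 8x8 area with the four 4x4 low-frequency
        # sub-matrices (quadrant order R, I, J, K is assumed).
        R, I, J, K = low
        blocks = np.empty((25, 25, 8, 8))
        blocks[..., :4, :4] = R   # element index (m1, n1)
        blocks[..., 4:, :4] = I   # element index (m1 - 4, n1)
        blocks[..., :4, 4:] = J   # element index (m1, n1 - 4)
        blocks[..., 4:, 4:] = K   # element index (m1 - 4, n1 - 4)
        # Undo the blocking to obtain a single 200x200 image.
        return blocks.swapaxes(1, 2).reshape(200, 200)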
⑥ Using the existing Lasso method, with the 8×8 regions of {F_low} as the sparse representation dictionary, extract the sparse weight of each 8×8 region of {F_low} to obtain the sparse weight matrix corresponding to {F_low}, denoted {W_u,v}; in the Lasso method the Lagrange penalty term is selected with a weighting parameter λ, and λ = 0.4 is taken in this embodiment; wherein 1≤u1≤25, 1≤v1≤25, and W_u,v(u1,v1) denotes the value of the element with index (u1,v1) in the sub-matrix in row u and column v of {W_u,v}, namely the weight in row u1 and column v1 generated when the 8×8 region in row u and column v of {F_low} is sparsely represented with the sparse representation dictionary.
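A sketch of the sparse coding of step ⑥ with scikit-learn's Lasso (illustrative only: the exact mapping between the weighting parameter λ = 0.4 and scikit-learn's alpha, and whether a region is excluded from its own dictionary, are assumptions):

    import numpy as np
    from sklearn.linear_model import Lasso

    def sparse_weights(f_low, lam=0.4):
        # Step 6: represent each vectorized 8x8 region of the low-frequency
        # component diagram over the dictionary formed by all 625 regions.
        regions = f_low.reshape(25, 8, 25, 8).swapaxes(1, 2).reshape(625, 64)
        D = regions.T                      # one 64-dimensional atom per region
        W = np.empty((625, 625))           # one 25x25 weight map per region
        model = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
        for r in range(625):
            model.fit(D, regions[r])       # Lasso with Lagrange penalty term
            W[r] = model.coef_
        return W.reshape(25, 25, 25, 25)   # W[u, v, u1, v1]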
⑦ Obtain an absolute value sum image of 25×25 pixel size, denoted {W_abs(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v), denoted W_abs(u,v), is the sum of the absolute values of all the elements of the sparse weight matrix associated with the 8×8 region in row u and column v. Acquire a center preference image of 25×25 pixel size, denoted {C_saliency25(u1,v1)}; the pixel value of the pixel point whose coordinate position is (u1,v1), denoted C_saliency25(u1,v1), is a Gaussian-shaped function, with the natural base e, of the distance from (u1,v1) to the center pixel point (u0,v0), controlled by the first center preference parameter δ_F. Then acquire a first center-surround salient image of 25×25 pixel size, denoted {C_cs(u,v)}, whose pixel value C_cs(u,v) at coordinate position (u,v) is computed from the sparse weights and {W_abs(u,v)}; and acquire a second center-surround salient image of 25×25 pixel size, denoted {C_near(u,v)}, whose pixel value C_near(u,v) is computed analogously with the second center preference parameter δ_D. Then blur {C_cs(u,v)} with the existing Gaussian filter template, denoting the resulting image {C_cs-g(u,v)}, and blur {C_near(u,v)} with the existing Gaussian filter template, denoting the resulting image {C_near-g(u,v)}; wherein abs() is the absolute value function, e is the natural base, e = 2.7182818284…, u0 denotes the abscissa and v0 the ordinate of the center pixel point of {C_saliency25(u1,v1)}, and in this embodiment u0 = 12.5, v0 = 12.5, δ_F = 114 and δ_D = 114 are taken; C_cs-g(u,v) and C_near-g(u,v) denote the pixel values of the pixel points whose coordinate position is (u,v) in {C_cs-g(u,v)} and {C_near-g(u,v)}, and when applying the Gaussian filter template the parameter controlling the degree of blur is taken as 5, the filter width control parameter X_G as 5, and the filter height control parameter Y_G as 5.
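The exact exponent of the center preference image and the center-surround formulas are not recoverable from this text; the sketch below uses a squared-distance Gaussian with δ_F = 114 and approximates the embodiment's 5×5 Gaussian filter template with σ = 5, both as assumptions:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def center_preference(delta=114.0, size=25):
        # Step 7 (in part): Gaussian-shaped center preference image centred
        # on (u0, v0) = (12.5, 12.5); the squared-distance exponent is assumed.
        u = np.arange(1, size + 1)
        uu, vv = np.meshgrid(u, u, indexing="ij")
        return np.exp(-((uu - 12.5) ** 2 + (vv - 12.5) ** 2) / delta)

    def blur(img, sigma=5.0):
        # Gaussian filter template; truncate=0.4 with sigma=5 gives a kernel
        # radius of int(0.4 * 5 + 0.5) = 2, i.e. a 5x5 template (X_G = Y_G = 5).
        return gaussian_filter(img, sigma=sigma, truncate=0.4)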
⑧ Fuse {C_cs-g(u,v)} and {C_near-g(u,v)} to obtain a fused image, denoted {C_F(u,v)}; the pixel value of the pixel point whose coordinate position is (u,v) is denoted C_F(u,v), with C_F(u,v) = 0.3×C_near-g(u,v) + 0.7×C_cs-g(u,v); then use {C_saliency25(u1,v1)} to apply center-surround enhancement to {C_F(u,v)}, obtaining a center-surround enhanced image of 25×25 pixel size, denoted {C_FO(u,v)}, whose pixel value C_FO(u,v) = C_F(u,v)×C_saliency25(u,v); then apply image size conversion to {C_FO(u,v)} to obtain the visual saliency map of S_test, denoted {S_F(x,y)}; wherein S_F(x,y) denotes the pixel value of the pixel point whose coordinate position is (x,y), 1≤x≤W and 1≤y≤H.
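Step ⑧ is fully specified above apart from the resizing routine; a minimal sketch (cv2.resize for the size conversion is an assumption):

    import cv2

    def fuse_and_resize(c_cs_g, c_near_g, c_pref, out_w, out_h):
        # Step 8: weighted fusion, center-surround enhancement, and size
        # conversion back to the original W x H of the test stereo image.
        c_f = 0.3 * c_near_g + 0.7 * c_cs_g      # fusion weights from the text
        c_fo = c_f * c_pref                      # center-surround enhancement
        return cv2.resize(c_fo, (out_w, out_h))  # final saliency map S_F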
To verify the feasibility and effectiveness of the method of the invention, experiments were performed.
Here, the three-dimensional eye-tracking database (3D eye-tracking database) provided by the University of Nantes, France, is used to analyze the accuracy and stability of the method of the invention. Three objective parameters commonly used to evaluate visual saliency extraction methods serve as evaluation indexes: the Pearson linear correlation coefficient (PLCC), the Kullback-Leibler divergence (KLD), and the area under the receiver operating characteristic curve (AUC).
The method of the invention is used to obtain the visual saliency map of each stereoscopic image in the eye-tracking database, and each map is compared with the subjective visual saliency map of the same image (provided in the database). Higher PLCC and AUC values and a lower KLD value indicate better consistency between the extracted visual saliency map and the subjective one. Table 1 lists the PLCC, KLD and AUC indexes reflecting the extraction performance of the method. The data in Table 1 show that the consistency between the extracted and subjective visual saliency maps is very good, indicating that the objective extraction results agree with subjective human-eye perception, which is sufficient to illustrate the feasibility and effectiveness of the method.
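For reference, the three evaluation indexes can be computed as in the following sketch (SciPy and scikit-learn assumed; the thresholding convention used to binarize the subjective map for the AUC is an assumption):

    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.metrics import roc_auc_score

    def evaluate(pred, gt, eps=1e-12):
        # PLCC between the extracted and subjective saliency maps.
        plcc = pearsonr(pred.ravel(), gt.ravel())[0]
        # KLD, with both maps normalized to probability distributions.
        p = gt.ravel() / (gt.sum() + eps)
        q = pred.ravel() / (pred.sum() + eps)
        kld = float(np.sum(p * np.log((p + eps) / (q + eps))))
        # AUC, with above-median subjective pixels taken as positives (assumed).
        labels = (gt.ravel() > np.median(gt)).astype(int)
        auc = roc_auc_score(labels, pred.ravel())
        return plcc, kld, auc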
TABLE 1 Accuracy and stability of the visual saliency maps extracted by the method of the present invention relative to the subjective visual saliency maps (the numerical values of Table 1 appear only as an image in the original publication)

Claims (3)

1. A stereoscopic image visual saliency extraction method based on frequency domain sparse representation is characterized by comprising the following steps:
① For any test stereoscopic image S_test, transform the left viewpoint image of S_test into the Lab color space and resize it to 200×200 pixels, denoting the transformed image {L_Lab200(x1,y1)}; then denote the L-channel, a-channel and b-channel images of {L_Lab200(x1,y1)} correspondingly as {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)} and {L_Lab200,b(x1,y1)}; wherein the width of S_test is W, the height of S_test is H, 1≤x1≤200, 1≤y1≤200, and L_Lab200(x1,y1), L_Lab200,L(x1,y1), L_Lab200,a(x1,y1) and L_Lab200,b(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
scale the disparity map of S_test to a size of 200×200 pixels, denoting the scaled image {D_200(x1,y1)}; then normalize the pixel value of every pixel point in {D_200(x1,y1)} to the numerical range [0,1], denoting the normalized image {D_0,1(x1,y1)}; wherein D_200(x1,y1) and D_0,1(x1,y1) denote the pixel values of the pixel points whose coordinate position is (x1,y1) in the respective images;
② Divide each of {L_Lab200,L(x1,y1)}, {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} into (200/8)×(200/8) = 25×25 non-overlapping image blocks of size 8×8; denote the matrix corresponding to the block image obtained by dividing {L_Lab200,L(x1,y1)} as {T^L_u,v}, composed of the sub-matrices corresponding to all the image blocks obtained by dividing {L_Lab200,L(x1,y1)}; likewise denote the matrices corresponding to the block images obtained by dividing {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)} as {T^a_u,v}, {T^b_u,v} and {T^D_u,v}, each composed of the sub-matrices corresponding to all the image blocks of the respective block image; wherein 1≤u≤25, 1≤v≤25, 1≤m1≤8, 1≤n1≤8, T^L_u,v(m1,n1) denotes the pixel value of the pixel point whose coordinate position is (m1,n1) within the image block in row u and column v of the block image of {L_Lab200,L(x1,y1)}, that is, the value of the element with index (m1,n1) in the sub-matrix in row u and column v of {T^L_u,v}; and T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1) are defined analogously for {L_Lab200,a(x1,y1)}, {L_Lab200,b(x1,y1)} and {D_0,1(x1,y1)};
③ Combine {T^L_u,v}, {T^a_u,v}, {T^b_u,v} and {T^D_u,v} into a quaternion matrix, denoted {Q_u,v}; the element with index (m1,n1) in the sub-matrix in row u and column v of {Q_u,v} is the quaternion value formed from T^L_u,v(m1,n1), T^a_u,v(m1,n1), T^b_u,v(m1,n1) and T^D_u,v(m1,n1), one of the four values serving as the real part and the other three as the coefficients of the imaginary units; wherein i, j and k are all imaginary units;
④ Perform a two-dimensional quaternion Fourier transform on each sub-matrix of {Q_u,v} to obtain the transformed quaternion matrix of {Q_u,v}, denoted {Q^F_u,v}, where Q^F_u,v = QFT(Q_u,v) and QFT() denotes the two-dimensional quaternion Fourier transform function; denote the real part, i-imaginary part, j-imaginary part and k-imaginary part of {Q^F_u,v} correspondingly as {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, where R_u,v(m1,n1), I_u,v(m1,n1), J_u,v(m1,n1) and K_u,v(m1,n1) denote the values of the elements with index (m1,n1) in the sub-matrices in row u and column v of the respective matrices; then extract the respective low-frequency components of {R_u,v}, {I_u,v}, {J_u,v} and {K_u,v}, correspondingly denoted {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}; wherein 1≤m2≤4, 1≤n2≤4, and R^low_u,v(m2,n2), I^low_u,v(m2,n2), J^low_u,v(m2,n2) and K^low_u,v(m2,n2) denote the values of the elements with index (m2,n2) in the 4×4 low-frequency sub-matrices in row u and column v of the respective matrices;
⑤ From {R^low_u,v}, {I^low_u,v}, {J^low_u,v} and {K^low_u,v}, obtain the low-frequency component diagram corresponding to {Q^F_u,v}, denoted {F_low}; the pixel value of the pixel point whose coordinate position is (m1,n1) within the 8×8 area in row u and column v of {F_low} is taken from the four 4×4 low-frequency sub-matrices in row u and column v, which tile the 8×8 area quadrant by quadrant: in the four quadrants the pixel value equals the value of the element with index (m1,n1), (m1-4,n1), (m1,n1-4) or (m1-4,n1-4), respectively, of the low-frequency sub-matrix assigned to that quadrant;
⑥ Using the Lasso method, with the 8×8 regions of {F_low} as the sparse representation dictionary, extract the sparse weight of each 8×8 region of {F_low} to obtain the sparse weight matrix corresponding to {F_low}, denoted {W_u,v}; wherein, in the Lasso method, the Lagrange penalty term is selected with a weighting parameter λ; 1≤u1≤25, 1≤v1≤25, and W_u,v(u1,v1) denotes the value of the element with index (u1,v1) in the sub-matrix in row u and column v of {W_u,v}, namely the weight in row u1 and column v1 generated when the 8×8 region in row u and column v of {F_low} is sparsely represented with the sparse representation dictionary;
⑦ Obtain an absolute-value-sum image of 25×25 pixels, denoted {Wabs(u,v)}; the pixel value of the pixel at coordinate (u,v) in {Wabs(u,v)} is denoted Wabs(u,v), [formula omitted]. Also obtain a center preference image of 25×25 pixels, denoted {Csaliency25(u1,v1)}; the pixel value of the pixel at coordinate (u1,v1) in {Csaliency25(u1,v1)} is denoted Csaliency25(u1,v1), [formula omitted]. Then obtain a first center-surround salient image of 25×25 pixels, denoted {Ccs(u,v)}; the pixel value of the pixel at coordinate (u,v) in {Ccs(u,v)} is denoted Ccs(u,v), [formula omitted]. Also obtain a second center-surround salient image of 25×25 pixels, denoted {Cnear(u,v)}; the pixel value of the pixel at coordinate (u,v) in {Cnear(u,v)} is denoted Cnear(u,v), [formula omitted]. Then blur {Ccs(u,v)} with a Gaussian filter template, and denote the image obtained by blurring {Ccs(u,v)} as {Ccs-g(u,v)}; likewise blur {Cnear(u,v)} with a Gaussian filter template, and denote the image obtained by blurring {Cnear(u,v)} as {Cnear-g(u,v)}. Here abs() is the absolute-value function, e is the base of the natural logarithm, u0 denotes the abscissa of the center pixel of {Csaliency25(u1,v1)} and v0 its ordinate, δF denotes the first center preference parameter, δD denotes the second center preference parameter, Ccs-g(u,v) denotes the pixel value of the pixel at coordinate (u,v) in {Ccs-g(u,v)}, and Cnear-g(u,v) denotes the pixel value of the pixel at coordinate (u,v) in {Cnear-g(u,v)}; when the Gaussian filter template is applied, the blur-degree control parameter [symbol omitted] is set to 5, the filter width control parameter XG is set to 5, and the filter height control parameter YG is set to 5;
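Again purely illustrative: the center preference image and the 5×5 Gaussian blurring of step ⑦ might be sketched as below. The claimed formula for Csaliency25 sits in an omitted figure, so the isotropic Gaussian fall-off here is an assumption, with the hypothetical `delta` standing in for a center preference parameter; the blur uses the claimed width 5, height 5, and blur parameter 5.

```python
# Hedged sketch of step (7): an assumed Gaussian center-preference map over
# a 25x25 grid, then 5x5 Gaussian blurring (X_G = 5, Y_G = 5, sigma = 5).
import cv2
import numpy as np

def center_preference(size=25, delta=10.0):
    u0 = v0 = (size - 1) / 2.0                      # center pixel (u0, v0)
    u, v = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    return np.exp(-((u - u0) ** 2 + (v - v0) ** 2) / (2.0 * delta ** 2))

c_cs = np.random.rand(25, 25).astype(np.float32)    # stand-in for {Ccs(u,v)}
c_cs_g = cv2.GaussianBlur(c_cs, (5, 5), 5)          # blurred map {Ccs-g(u,v)}
```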
⑧ Fuse {Ccs-g(u,v)} and {Cnear-g(u,v)}; the resulting fused image is denoted {CF(u,v)}, and the pixel value of the pixel at coordinate (u,v) in {CF(u,v)} is denoted CF(u,v), CF(u,v) = 0.3 × Cnear-g(u,v) + 0.7 × Ccs-g(u,v). Then use {Csaliency25(u1,v1)} to apply center-surround enhancement to {CF(u,v)}, obtaining a center-surround-enhanced image of 25×25 pixels, denoted {CFO(u,v)}; the pixel value of the pixel at coordinate (u,v) in {CFO(u,v)} is denoted CFO(u,v), CFO(u,v) = CF(u,v) × Csaliency25(u1,v1). Then perform image size conversion on {CFO(u,v)} to obtain the visual saliency image of Stest, denoted {SF(x,y)}; here SF(x,y) denotes the pixel value of the pixel at coordinate (x,y) in {SF(x,y)}, 1 ≤ x ≤ W, 1 ≤ y ≤ H.
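A last illustrative sketch, covering the fixed-weight fusion, the element-wise center-surround enhancement, and the final resize of step ⑧. The 0.3/0.7 weights and the element-wise product come directly from the claim; the interpolation used by `cv2.resize` is an assumption the claim does not fix.

```python
# Hedged sketch of step (8): fuse the two blurred maps with weights 0.3/0.7,
# enhance by the center-preference map, and resize to the original W x H.
import cv2
import numpy as np

def fuse_and_upscale(c_cs_g, c_near_g, c_pref, W, H):
    c_f = 0.3 * c_near_g + 0.7 * c_cs_g       # CF = 0.3*Cnear-g + 0.7*Ccs-g
    c_fo = c_f * c_pref                       # CFO = CF x Csaliency25
    return cv2.resize(c_fo.astype(np.float32), (W, H))   # {SF(x, y)}
```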
2. The method for extracting visual saliency of stereoscopic images based on frequency domain sparse representation according to claim 1, wherein in said step ①, [formula omitted], where min() is the minimum-value function and max() is the maximum-value function.
3. The method for extracting visual saliency of stereoscopic images based on frequency domain sparse representation according to claim 1 or 2, wherein in said step ④, [four formulas omitted], where, for each of the four matrices involved [symbols omitted], the quantities appearing in the formulas denote, respectively: the value of the element with subscript (m2, n2) in the sub-matrix at the u-th row and v-th column; the value of the element with subscript (m2, n2+4) in that sub-matrix; the value of the element with subscript (m2+4, n2) in that sub-matrix; and the value of the element with subscript (m2+4, n2+4) in that sub-matrix.
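The subscript pattern in claim 3 — (m2, n2), (m2, n2+4), (m2+4, n2), (m2+4, n2+4) inside 8×8 sub-matrices — suggests each omitted formula combines one sample from each of the four quadrants of an 8×8 frequency block. The sketch below shows only that indexing; the combination itself is in the omitted formulas, and the FFT stand-in is an assumption.

```python
# Hedged sketch for claim 3: sampling the four quadrant-shifted entries of
# an 8x8 frequency block, matching the (m2, n2) / (+4) subscript pattern.
import numpy as np

block = np.fft.fft2(np.random.rand(8, 8))   # stand-in 8x8 frequency block
m2, n2 = 0, 0                               # 0-based; the claim's m2, n2 are 1-based
samples = [block[m2, n2], block[m2, n2 + 4],
           block[m2 + 4, n2], block[m2 + 4, n2 + 4]]
```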
CN201810066806.9A 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation Active CN108470176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810066806.9A CN108470176B (en) 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810066806.9A CN108470176B (en) 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation

Publications (2)

Publication Number Publication Date
CN108470176A (en) 2018-08-31
CN108470176B (en) 2020-06-26

Family

ID=63266107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810066806.9A Active CN108470176B (en) 2018-01-24 2018-01-24 Stereo image visual saliency extraction method based on frequency domain sparse representation

Country Status (1)

Country Link
CN (1) CN108470176B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182492B (en) * 2020-09-23 2022-12-16 广东工业大学 Signal sparse representation method and device based on discrete quaternion Fourier transform
CN112381838B (en) * 2020-11-14 2022-04-19 四川大学华西医院 Automatic image cutting method for digital pathological section image

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130068182A (en) * 2011-12-14 2013-06-26 한국전자통신연구원 Methods of extracting saliency region and apparatuses for using the same
CN104754320A (en) * 2015-03-27 2015-07-01 同济大学 Method for calculating 3D-JND threshold value
CN105740859A (en) * 2016-01-27 2016-07-06 电子科技大学 Three-dimensional interest point detection method based on geometric measure and sparse optimization
CN106682599A (en) * 2016-12-15 2017-05-17 浙江科技学院 Stereo image visual saliency extraction method based on sparse representation

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A novel multiresolution spatiotemporal saliency detection; Guo C et al.; IEEE Transactions on Image Processing; 20090825; pp. 185-198 *
A Weighted Sparse Coding Framework for Saliency Detection; Nianyi Li et al.; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 20151231; pp. 5216-5223 *
Saliency Detection Using Quaternion Sparse Reconstruction; Zeng, Y et al.; 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOP (ICCVW); 20151218; pp. 469-476 *
Image frequency-domain saliency detection; Zhao Hongmiao; China Masters' Theses Full-text Database, Information Science and Technology; 20160731; pp. I138-860 *
Adaptive saliency detection based on hypercomplex Fourier transform; Huang Kan et al.; Journal of Computer Applications; 20170615; pp. 149-154 *

Also Published As

Publication number Publication date
CN108470176A (en) 2018-08-31

Similar Documents

Publication Publication Date Title
CN107564025B (en) Electric power equipment infrared image semantic segmentation method based on deep neural network
Li et al. A weighted sparse coding framework for saliency detection
CN110353675B (en) Electroencephalogram signal emotion recognition method and device based on picture generation
CN104834922B (en) Gesture identification method based on hybrid neural networks
CN103020965B (en) A kind of foreground segmentation method based on significance detection
CN106462771A (en) 3D image significance detection method
CN108765414B (en) No-reference stereo image quality evaluation method based on wavelet decomposition and natural scene statistics
CN106228528B (en) A kind of multi-focus image fusing method based on decision diagram and rarefaction representation
CN103996195A (en) Image saliency detection method
CN111754396B (en) Face image processing method, device, computer equipment and storage medium
CN105320950A (en) A video human face living body detection method
CN110827312B (en) Learning method based on cooperative visual attention neural network
Liu et al. Blind stereoscopic image quality assessment based on hierarchical learning
Zhou et al. Blind 3D image quality assessment based on self-similarity of binocular features
CN105654142A (en) Natural scene statistics-based non-reference stereo image quality evaluation method
CN111080670A (en) Image extraction method, device, equipment and storage medium
CN108470176B (en) Stereo image visual saliency extraction method based on frequency domain sparse representation
Karimi et al. Blind stereo quality assessment based on learned features from binocular combined images
CN112669249A (en) Infrared and visible light image fusion method combining improved NSCT (non-subsampled Contourlet transform) transformation and deep learning
Jiang et al. Quality assessment for virtual reality technology based on real scene
CN107665488B (en) Stereo image visual saliency extraction method
CN106682599B (en) Sparse representation-based stereo image visual saliency extraction method
CN113705361A (en) Method and device for detecting model in living body and electronic equipment
CN107886533B (en) Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium
CN115965844B (en) Multi-focus image fusion method based on visual saliency priori knowledge

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220622

Address after: Unit a2203e, innovation Plaza, No. 2007, Pingshan Avenue, Liulian community, Pingshan street, Pingshan District, Shenzhen, Guangdong 518118

Patentee after: Shenzhen Muye Microelectronics Technology Co.,Ltd.

Address before: 310023 No. 318 stay Road, Xihu District, Zhejiang, Hangzhou

Patentee before: ZHEJIANG University OF SCIENCE AND TECHNOLOGY