No-reference quality evaluation method for blur-distorted stereoscopic images
Technical Field
The invention relates to an image quality evaluation method, in particular to a no-reference quality evaluation method for blur-distorted stereoscopic images.
Background
With the rapid development of image coding and stereoscopic display technologies, stereoscopic imaging has attracted increasingly wide attention and application and has become a current research hotspot. Stereoscopic imaging exploits the binocular parallax principle of the human visual system: the left-viewpoint and right-viewpoint images of the same scene are received independently by the two eyes, and the brain fuses them into binocular parallax, producing a stereoscopic percept with depth and realism. Compared with a single-channel image, a stereoscopic image must guarantee the image quality of both channels simultaneously, so quality evaluation of stereoscopic images is of great significance. However, no effective objective method currently exists for evaluating stereoscopic image quality. Establishing an effective objective quality evaluation model for stereoscopic images is therefore of great importance.
Many factors affect the quality of a stereoscopic image, such as the quality distortion of the left and right viewpoints, the stereoscopic perception, and the visual fatigue of the observer, so performing effective no-reference quality evaluation is a difficult problem that urgently needs to be solved. At present, no-reference quality evaluation models are generally built by machine learning, which has high computational complexity, and training such models requires subjective evaluation values of the training images, so these methods are not well suited to practical applications and have certain limitations. Sparse representation decomposes a signal over a known set of functions, striving to approximate the original signal in a transform domain with a small number of basis functions; current research mainly focuses on dictionary construction and sparse decomposition. A key issue in sparse representation is how to construct dictionaries that efficiently characterize the essential features of images. Dictionary construction algorithms proposed so far include: 1) methods with a learning process, in which dictionary information is obtained by machine-learning training, e.g. with support vector machines; and 2) methods without a learning process, in which the dictionary is constructed directly from image features, e.g. multi-scale Gabor dictionaries and multi-scale Gaussian dictionaries. Therefore, how to construct a dictionary without a learning process, and how to estimate quality from such a dictionary without a reference image, are technical problems that must be solved in no-reference quality evaluation research.
Disclosure of Invention
The invention aims to provide a no-reference quality evaluation method for blur-distorted stereoscopic images that can effectively improve the correlation between objective evaluation results and subjective perception.
The technical scheme adopted by the invention to solve the technical problem is as follows: a no-reference quality evaluation method for blur-distorted stereoscopic images, characterized by comprising a training stage and a testing stage, and specifically comprising the following steps:
① Select N original undistorted stereoscopic images, and let these N original images together with the blur-distorted stereoscopic image corresponding to each of them form a training image set, denoted {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, where S_{i,org} denotes the i-th original undistorted stereoscopic image in the training set and S_{i,dis} denotes the blur-distorted stereoscopic image corresponding to the i-th original undistorted stereoscopic image. Denote the left- and right-viewpoint images of S_{i,org} as L_{i,org} and R_{i,org}, and the left- and right-viewpoint images of S_{i,dis} as L_{i,dis} and R_{i,dis};
② For each blur-distorted stereoscopic image in the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, apply two-dimensional empirical mode decomposition to its left-viewpoint image and its right-viewpoint image separately, obtaining an intrinsic mode function (IMF) image for each viewpoint image. Denote the IMF image of L_{i,dis} as {IMF_{i,dis}^L(x,y)} and the IMF image of R_{i,dis} as {IMF_{i,dis}^R(x,y)}, where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W and H denote the width and height of the IMF images, and IMF_{i,dis}^L(x,y) and IMF_{i,dis}^R(x,y) denote the pixel values at coordinate position (x,y) in the respective IMF images;
Then, for each blur-distorted stereoscopic image in the training set, linearly weight the IMF image of its left-viewpoint image and the IMF image of its right-viewpoint image to obtain the IMF image of the blur-distorted stereoscopic image itself. Denote the IMF image of S_{i,dis} as {IMF_{i,dis}(x,y)}, whose pixel value at coordinate position (x,y) is IMF_{i,dis}(x,y) = w_L × IMF_{i,dis}^L(x,y) + w_R × IMF_{i,dis}^R(x,y), where w_L is the weight of {IMF_{i,dis}^L(x,y)}, w_R is the weight of {IMF_{i,dis}^R(x,y)}, and w_L + w_R = 1;
③ Partition the IMF image of each blur-distorted stereoscopic image in the training set into non-overlapping sub-blocks; then cluster the set formed by all sub-blocks of each IMF image with the K-means clustering method to obtain K clusters for each IMF image, where K denotes the total number of clusters contained in each IMF image; then obtain the visual dictionary table of each IMF image from its K clusters; finally, from the visual dictionary tables of all IMF images, obtain the visual dictionary table of the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, denoted G, G = {G_i | 1 ≤ i ≤ N}, where G_i denotes the visual dictionary table of {IMF_{i,dis}(x,y)}, G_i = {g_{i,k} | 1 ≤ k ≤ K}, and g_{i,k} denotes the visual dictionary of the k-th cluster of {IMF_{i,dis}(x,y)}, i.e. the centroid of the k-th cluster;
④ Compute the frequency responses, at a selected center frequency and different orientation factors, of every pixel in each original undistorted stereoscopic image of the training set and in its corresponding blur-distorted stereoscopic image, and from these obtain an objective evaluation metric value for every pixel in each blur-distorted stereoscopic image; then obtain the visual quality table of each blur-distorted stereoscopic image from the objective evaluation metric values of its pixels; finally, from the visual quality tables of all blur-distorted stereoscopic images, obtain the visual quality table of the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, denoted Q, Q = {Q_i | 1 ≤ i ≤ N}, where Q_i denotes the visual quality table of S_{i,dis}, Q_i = {q_{i,k} | 1 ≤ k ≤ K}, and q_{i,k} denotes the visual quality of the k-th cluster of {IMF_{i,dis}(x,y)};
⑤ For any test stereoscopic image S_test, compute its objective image-quality prediction value from the visual dictionary table G and the visual quality table Q of the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}.
In said step ②, w_L = 0.9 and w_R = 0.1.
In said step ③, the visual dictionary table G_i is obtained as follows:
③-1. Partition {IMF_{i,dis}(x,y)} into M = (W/16) × (H/16) non-overlapping sub-blocks of size 16 × 16, and denote the set formed by all sub-blocks as Y_i = {x_{i,t} | 1 ≤ t ≤ M}, where x_{i,t} denotes the column vector formed by all pixels of the t-th sub-block; the dimension of x_{i,t} is 256;
③-2. Apply the K-means clustering method to Y_i to obtain K clusters, where K denotes the total number of clusters; then take the centroid of each cluster as a visual dictionary, obtaining the visual dictionary table of {IMF_{i,dis}(x,y)}, denoted G_i, G_i = {g_{i,k} | 1 ≤ k ≤ K}, where g_{i,k} denotes the visual dictionary of the k-th cluster, i.e. the centroid of the k-th cluster; the dimension of g_{i,k} is 256.
In said step ④, the visual quality table Q_i of S_{i,dis} is obtained as follows:
④-1. Filter L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis} with Gabor filters to obtain the frequency response of every pixel in each of the four images at each center frequency and each orientation factor. Denote the frequency response of the pixel at coordinate position (x,y) in L_{i,org} at center frequency ω and orientation factor θ as F_{i,org}^L(x,y; ω, θ) = e_{i,org}^L(x,y; ω, θ) + j·o_{i,org}^L(x,y; ω, θ), and define F_{i,org}^R, F_{i,dis}^L and F_{i,dis}^R analogously for R_{i,org}, L_{i,dis} and R_{i,dis}, where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W and H denote the width and height of the four images, ω denotes the center frequency of the Gabor filter, ω ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}, θ denotes the orientation factor of the Gabor filter, 1 ≤ θ ≤ 4, e(·) and o(·) denote the real and imaginary parts of the frequency response, and j is the imaginary unit;
④-2. From the frequency responses of every pixel in L_{i,org} and R_{i,org} at the selected center frequency ω_m and the different orientation factors, compute the amplitude of every pixel in S_{i,org}, denoting the amplitude of the pixel at coordinate position (x,y) as A_{i,org}(x,y); the amplitude combines the real parts e_{i,org}^L(x,y; ω_m, θ), e_{i,org}^R(x,y; ω_m, θ) and the imaginary parts o_{i,org}^L(x,y; ω_m, θ), o_{i,org}^R(x,y; ω_m, θ) over the four orientation factors, where ω_m is the selected center frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}. Likewise, from the frequency responses of every pixel in L_{i,dis} and R_{i,dis} at ω_m and the different orientation factors, compute the amplitude A_{i,dis}(x,y) of every pixel in S_{i,dis};
④-3. From the amplitudes of the pixels of S_{i,org} and S_{i,dis}, compute the objective evaluation metric value of every pixel in S_{i,dis}, denoting the metric value of the pixel at coordinate position (x,y) as ρ_i(x,y); the metric is computed from the horizontal and vertical gradient values of A_{i,org}(x,y) and of A_{i,dis}(x,y), where cos() is the cosine function, arccos() is the inverse cosine function, and T_1 is a control parameter;
④-4. From the objective evaluation metric values of the pixels of S_{i,dis}, obtain the visual quality table of S_{i,dis}, denoted Q_i, Q_i = {q_{i,k} | 1 ≤ k ≤ K}, where q_{i,k} denotes the visual quality of the k-th cluster, computed as the mean of ρ_i(x,y) over Ω_k; Ω_k denotes the set of coordinate positions in S_{i,dis} that coincide with the coordinate positions of all pixels contained in the k-th cluster of {IMF_{i,dis}(x,y)}, and |Ω_k| denotes the total number of pixels contained in the k-th cluster.
The specific process of step ⑤ is as follows:
⑤-1. Denote the left-viewpoint image of S_test as L_test and the right-viewpoint image of S_test as R_test; apply two-dimensional empirical mode decomposition to L_test and R_test separately to obtain their IMF images, denoted {IMF_test^L(x,y)} and {IMF_test^R(x,y)}; then linearly weight {IMF_test^L(x,y)} and {IMF_test^R(x,y)} to obtain the IMF image of S_test, denoted {IMF_test(x,y)}, whose pixel value at coordinate position (x,y) is IMF_test(x,y) = w_L' × IMF_test^L(x,y) + w_R' × IMF_test^R(x,y), where 1 ≤ x ≤ W' and 1 ≤ y ≤ H', W' and H' denote the width and height of the IMF images, w_L' is the weight of {IMF_test^L(x,y)}, w_R' is the weight of {IMF_test^R(x,y)}, and w_L' + w_R' = 1;
⑤-2. Partition {IMF_test(x,y)} into M' = (W'/16) × (H'/16) non-overlapping sub-blocks of size 16 × 16, and denote the set formed by all sub-blocks as Y_test = {y_t | 1 ≤ t ≤ M'}, where y_t denotes the column vector formed by all pixels of the t-th sub-block; the dimension of y_t is 256;
⑤-3. Compute the minimum Euclidean distance between each sub-block of {IMF_test(x,y)} and the visual dictionary table G, denoting the minimum Euclidean distance of the t-th sub-block as d_t, d_t = min_{1≤i≤N, 1≤k≤K} ||y_t − g_{i,k}||, where the symbol "|| ||" denotes the Euclidean distance and min() is the minimum function;
⑤-4. Compute the objective evaluation metric value of each sub-block of {IMF_test(x,y)}, denoting the metric value of the t-th sub-block as z_t; z_t is obtained from q̂_t, the entry of Q corresponding to the visual dictionary nearest to y_t (1 ≤ i ≤ N, 1 ≤ k ≤ K), together with an exponential function of d_t, where exp() denotes the exponential function with base e, e = 2.71828183, and λ is a control parameter;
⑤-5. From the objective evaluation metric values z_t of all sub-blocks of {IMF_test(x,y)}, compute the objective image-quality prediction value of S_test, denoted Q_test.
Compared with the prior art, the invention has the following advantages:
1) The method constructs the visual dictionary table and the visual quality table by unsupervised learning, which avoids a complex machine-learning training process and removes the need to predict subjective evaluation values for the training images during the training stage, making the method better suited to practical applications.
2) In the testing stage, the method obtains the objective image-quality prediction value through a simple visual-dictionary lookup, which greatly reduces the computational complexity of testing while keeping the predicted values well consistent with the subjective evaluation values.
Drawings
Fig. 1 is a block diagram of the overall implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawing and embodiments.
Fig. 1 shows the overall implementation block diagram of the no-reference quality evaluation method for blur-distorted stereoscopic images provided by the invention. The method comprises a training stage and a testing stage. In the training stage, several original undistorted stereoscopic images and their corresponding blur-distorted stereoscopic images are selected to form a training image set; each blur-distorted stereoscopic image in the training set is decomposed by two-dimensional empirical mode decomposition to obtain its intrinsic mode function (IMF) image; each IMF image is then partitioned into non-overlapping blocks, and a visual dictionary table is constructed by the K-means clustering method. By computing the frequency responses, at a selected center frequency and different orientation factors, of every pixel in each original undistorted stereoscopic image and in its corresponding blur-distorted stereoscopic image, an objective evaluation metric value is obtained for every pixel in each blur-distorted stereoscopic image, and a visual quality table corresponding to the visual dictionary table is constructed. In the testing stage, any test stereoscopic image is decomposed by two-dimensional empirical mode decomposition to obtain its IMF image, which is then partitioned into non-overlapping blocks; the objective image-quality prediction value of the test image is then computed from the constructed visual dictionary table and visual quality table. The specific steps of the no-reference quality evaluation method are as follows:
① Select N original undistorted stereoscopic images, and let these N original images together with the blur-distorted stereoscopic image corresponding to each of them form a training image set, denoted {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, where S_{i,org} denotes the i-th original undistorted stereoscopic image in the training set, S_{i,dis} denotes the blur-distorted stereoscopic image corresponding to the i-th original undistorted stereoscopic image, and the symbol "{ }" denotes a set. Denote the left- and right-viewpoint images of S_{i,org} as L_{i,org} and R_{i,org}, and the left- and right-viewpoint images of S_{i,dis} as L_{i,dis} and R_{i,dis}. A larger N yields a more accurate visual dictionary table and visual quality table from training, but at a higher computational cost; as a compromise, half of the blur-distorted images in the adopted image library can be selected for processing.
Here, experiments were performed using the blur-distorted stereoscopic images in the Ningbo University stereoscopic image library and the LIVE stereoscopic image library. The former comprises 12 undistorted stereoscopic images and 60 stereoscopic images with different degrees of Gaussian blur distortion; the latter comprises 19 undistorted stereoscopic images and 45 stereoscopic images with different degrees of Gaussian blur distortion. In this embodiment, 50% of the blur-distorted stereoscopic images are used to construct the training image set, i.e., N = 30 is taken for the training set constructed from the Ningbo University library, and N = 22 for the training set constructed from the LIVE library.
② For each blur-distorted stereoscopic image in the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, apply two-dimensional empirical mode decomposition to its left-viewpoint image and its right-viewpoint image separately, obtaining an intrinsic mode function (IMF) image for each viewpoint image. Denote the IMF image of L_{i,dis} as {IMF_{i,dis}^L(x,y)} and the IMF image of R_{i,dis} as {IMF_{i,dis}^R(x,y)}, where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W and H denote the width and height of the IMF images, and IMF_{i,dis}^L(x,y) and IMF_{i,dis}^R(x,y) denote the pixel values at coordinate position (x,y) in the respective IMF images.
Then, for each blur-distorted stereoscopic image in the training set, linearly weight the IMF image of its left-viewpoint image and the IMF image of its right-viewpoint image to obtain the IMF image of the blur-distorted stereoscopic image itself. Denote the IMF image of S_{i,dis} as {IMF_{i,dis}(x,y)}, whose pixel value at coordinate position (x,y) is IMF_{i,dis}(x,y) = w_L × IMF_{i,dis}^L(x,y) + w_R × IMF_{i,dis}^R(x,y), where w_L is the weight of {IMF_{i,dis}^L(x,y)}, w_R is the weight of {IMF_{i,dis}^R(x,y)}, and w_L + w_R = 1; in this example, w_L = 0.9 and w_R = 0.1.
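As a concrete illustration of this linear weighting, the sketch below fuses two precomputed IMF images with w_L = 0.9 and w_R = 0.1; the two-dimensional empirical mode decomposition itself is assumed to have been done elsewhere, and the toy constant arrays merely stand in for real IMF images:

```python
import numpy as np

def fuse_imf(imf_left, imf_right, w_l=0.9, w_r=0.1):
    """Linearly weight the left- and right-viewpoint IMF images
    (w_L = 0.9 and w_R = 0.1 as in this embodiment)."""
    assert abs(w_l + w_r - 1.0) < 1e-9, "weights must sum to 1"
    assert imf_left.shape == imf_right.shape
    return w_l * imf_left + w_r * imf_right

# toy 4x4 arrays standing in for real BEMD output
left = np.full((4, 4), 10.0)
right = np.full((4, 4), 20.0)
fused = fuse_imf(left, right)
print(fused[0, 0])  # 0.9*10 + 0.1*20 = 11.0
```

The left-viewpoint weight dominates, reflecting the embodiment's choice of w_L = 0.9.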
③ Partition the IMF image of each blur-distorted stereoscopic image in the training set into non-overlapping sub-blocks; then cluster the set formed by all sub-blocks of each IMF image with the existing K-means clustering method to obtain K clusters for each IMF image, where K denotes the total number of clusters contained in each IMF image; an overly large K leads to over-clustering, while an overly small K leads to under-clustering, and K = 30 is taken in this embodiment. Then obtain the visual dictionary table of each IMF image from its K clusters; finally, from the visual dictionary tables of all IMF images, obtain the visual dictionary table of the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, denoted G, G = {G_i | 1 ≤ i ≤ N}, where the symbol "{ }" denotes a set, G_i denotes the visual dictionary table of {IMF_{i,dis}(x,y)}, G_i = {g_{i,k} | 1 ≤ k ≤ K}, and g_{i,k} denotes the visual dictionary of the k-th cluster of {IMF_{i,dis}(x,y)}, i.e. the centroid of the k-th cluster.
In this embodiment, the visual dictionary table G_i in step ③ is obtained as follows:
③-1. Partition {IMF_{i,dis}(x,y)} into M = (W/16) × (H/16) non-overlapping sub-blocks of size 16 × 16, and denote the set formed by all sub-blocks as Y_i = {x_{i,t} | 1 ≤ t ≤ M}, where x_{i,t} denotes the column vector formed by all pixels of the t-th sub-block; the dimension of x_{i,t} is 256.
③-2. Apply the existing K-means clustering method to Y_i to obtain K clusters, where K denotes the total number of clusters; an overly large K leads to over-clustering, while an overly small K leads to under-clustering, and K = 30 is taken in this embodiment. Take the centroid of each cluster as a visual dictionary to obtain the visual dictionary table of {IMF_{i,dis}(x,y)}, denoted G_i, G_i = {g_{i,k} | 1 ≤ k ≤ K}, where g_{i,k} denotes the visual dictionary of the k-th cluster, i.e. the centroid of the k-th cluster; the dimension of g_{i,k} is 256.
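The blocking and clustering of steps ③-1 and ③-2 can be sketched as follows. The minimal K-means here is a stand-in for any standard implementation, and the 64 × 64 toy image, K = 4 and the random initialization are illustrative assumptions (the embodiment uses K = 30 on full-size IMF images):

```python
import numpy as np

def extract_blocks(imf, block=16):
    """Partition an IMF image into non-overlapping block x block sub-blocks,
    each flattened to a (block*block)-dimensional vector (step 3-1)."""
    h, w = imf.shape
    vecs = [imf[r:r + block, c:c + block].reshape(-1)
            for r in range(0, h - block + 1, block)
            for c in range(0, w - block + 1, block)]
    return np.stack(vecs)  # shape: (num_blocks, block*block)

def kmeans_dictionary(vectors, k, iters=20, seed=0):
    """Plain K-means; the K centroids form the visual dictionary table G_i
    of one IMF image (step 3-2)."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # assign each sub-block vector to its nearest centroid
        labels = np.argmin(((vectors[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = vectors[labels == j].mean(axis=0)
    return centroids, labels

imf = np.random.default_rng(1).normal(size=(64, 64))  # toy 64x64 IMF image
blocks = extract_blocks(imf)        # 16 sub-blocks of 16x16 -> 256-dim vectors
G_i, labels = kmeans_dictionary(blocks, k=4)
print(G_i.shape)  # (4, 256)
```

Each row of `G_i` plays the role of one visual dictionary g_{i,k}, a 256-dimensional centroid.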
④ Compute the frequency responses, at a selected center frequency and different orientation factors, of every pixel in each original undistorted stereoscopic image of the training set and in its corresponding blur-distorted stereoscopic image, and from these obtain an objective evaluation metric value for every pixel in each blur-distorted stereoscopic image; then obtain the visual quality table of each blur-distorted stereoscopic image from the objective evaluation metric values of its pixels; finally, from the visual quality tables of all blur-distorted stereoscopic images, obtain the visual quality table of the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, denoted Q, Q = {Q_i | 1 ≤ i ≤ N}, where Q_i denotes the visual quality table of S_{i,dis}, Q_i = {q_{i,k} | 1 ≤ k ≤ K}, and q_{i,k} denotes the visual quality of the k-th cluster of {IMF_{i,dis}(x,y)}.
In this embodiment, the visual quality table Q_i of S_{i,dis} in step ④ is obtained as follows:
④-1. Filter L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis} with Gabor filters to obtain the frequency response of every pixel in each of the four images at each center frequency and each orientation factor. Denote the frequency response of the pixel at coordinate position (x,y) in L_{i,org} at center frequency ω and orientation factor θ as F_{i,org}^L(x,y; ω, θ) = e_{i,org}^L(x,y; ω, θ) + j·o_{i,org}^L(x,y; ω, θ), and define F_{i,org}^R, F_{i,dis}^L and F_{i,dis}^R analogously for R_{i,org}, L_{i,dis} and R_{i,dis}, where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W and H denote the width and height of the four images, ω denotes the center frequency of the Gabor filter, ω ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}, θ denotes the orientation factor of the Gabor filter, 1 ≤ θ ≤ 4, e(·) and o(·) denote the real and imaginary parts of the frequency response, and j is the imaginary unit.
④-2. From the frequency responses of every pixel in L_{i,org} and R_{i,org} at the selected center frequency ω_m and the different orientation factors, compute the amplitude of every pixel in S_{i,org}, denoting the amplitude of the pixel at coordinate position (x,y) as A_{i,org}(x,y); the amplitude combines the real parts e_{i,org}^L(x,y; ω_m, θ), e_{i,org}^R(x,y; ω_m, θ) and the imaginary parts o_{i,org}^L(x,y; ω_m, θ), o_{i,org}^R(x,y; ω_m, θ) over the four orientation factors, where ω_m is the selected center frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}; in this example, ω_m = 4.93.
Likewise, from the frequency responses of every pixel in L_{i,dis} and R_{i,dis} at the selected center frequency ω_m and the different orientation factors, compute the amplitude of every pixel in S_{i,dis}, denoting the amplitude of the pixel at coordinate position (x,y) as A_{i,dis}(x,y); the amplitude combines the real parts e_{i,dis}^L(x,y; ω_m, θ), e_{i,dis}^R(x,y; ω_m, θ) and the imaginary parts o_{i,dis}^L(x,y; ω_m, θ), o_{i,dis}^R(x,y; ω_m, θ) over the four orientation factors; in this example, ω_m = 4.93.
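A possible sketch of steps ④-1 and ④-2: a complex Gabor kernel supplies the real and imaginary parts of the frequency response, and the amplitude pools the left- and right-viewpoint responses over the four orientations at ω_m. The kernel size, σ, and the pooling rule (summing real and imaginary parts before taking the modulus) are assumptions, since the patent's exact filter parameters and amplitude formula are not reproduced in the source text:

```python
import numpy as np

def gabor_kernel(omega, theta_idx, size=15, sigma=2.5):
    """Complex Gabor kernel at center frequency omega and one of four
    orientations (theta_idx in 1..4 -> 0, 45, 90, 135 degrees).
    size and sigma are illustrative assumptions."""
    theta = (theta_idx - 1) * np.pi / 4
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(1j * omega * xr)

def gabor_response(img, omega, theta_idx):
    """Complex frequency response of every pixel; the real part plays the
    role of e(x,y; omega, theta) and the imaginary part of o(x,y; omega, theta)."""
    k = gabor_kernel(omega, theta_idx)
    pad = k.shape[0] // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=complex)
    for dy in range(k.shape[0]):          # naive sliding-window filtering
        for dx in range(k.shape[1]):
            out += k[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def stereo_amplitude(left, right, omega_m=4.93):
    """Per-pixel amplitude of a stereo pair at the selected center frequency
    omega_m: sum real and imaginary parts of the left and right responses
    over the four orientations, then take the modulus (assumed pooling)."""
    re = np.zeros(left.shape)
    im = np.zeros(left.shape)
    for theta_idx in (1, 2, 3, 4):
        for img in (left, right):
            resp = gabor_response(img, omega_m, theta_idx)
            re += resp.real
            im += resp.imag
    return np.sqrt(re ** 2 + im ** 2)

rng = np.random.default_rng(0)
L = rng.random((32, 32))   # toy left-viewpoint image
R = rng.random((32, 32))   # toy right-viewpoint image
A = stereo_amplitude(L, R)
print(A.shape)  # (32, 32)
```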
④-3. From the amplitudes of the pixels of S_{i,org} and S_{i,dis}, compute the objective evaluation metric value of every pixel in S_{i,dis}, denoting the metric value of the pixel at coordinate position (x,y) as ρ_i(x,y); the metric is computed from the horizontal and vertical gradient values of A_{i,org}(x,y) and of A_{i,dis}(x,y), where cos() is the cosine function, arccos() is the inverse cosine function, and T_1 is a control parameter; in this example, T_1 = 0.85.
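The source text does not reproduce the exact formula for ρ_i(x,y), only that it involves cos(), arccos(), the horizontal and vertical gradients of the amplitude maps, and the control parameter T_1. The sketch below is therefore a loudly hypothetical stand-in that scores agreement between the gradient directions of the reference and distorted amplitude maps:

```python
import numpy as np

def pixel_metric(a_org, a_dis, t1=0.85):
    """Hypothetical per-pixel objective metric (step 4-3 stand-in, NOT the
    patent's formula): rho = cos(angle between the gradient vectors of the
    two amplitude maps), with values above control parameter T1 saturating
    to 1."""
    gy_o, gx_o = np.gradient(a_org)   # vertical / horizontal gradients
    gy_d, gx_d = np.gradient(a_dis)
    dot = gx_o * gx_d + gy_o * gy_d
    norm = np.hypot(gx_o, gy_o) * np.hypot(gx_d, gy_d) + 1e-12
    cos_angle = np.clip(dot / norm, -1.0, 1.0)
    # arccos gives the gradient-direction difference; cos maps it back to [-1, 1]
    rho = np.cos(np.arccos(cos_angle))
    return np.where(rho > t1, 1.0, rho)

a = np.outer(np.arange(8.0), np.ones(8))  # smooth ramp standing in for an amplitude map
rho = pixel_metric(a, a)                  # identical maps -> metric saturates to 1
print(float(rho.mean()))  # 1.0
```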
④-4. From the objective evaluation metric values of the pixels of S_{i,dis}, obtain the visual quality table of S_{i,dis}, denoted Q_i, Q_i = {q_{i,k} | 1 ≤ k ≤ K}, where q_{i,k} denotes the visual quality of the k-th cluster, computed as the mean of ρ_i(x,y) over Ω_k; Ω_k denotes the set of coordinate positions in S_{i,dis} that coincide with the coordinate positions of all pixels contained in the k-th cluster of {IMF_{i,dis}(x,y)}, and |Ω_k| denotes the total number of pixels contained in the k-th cluster.
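Step ④-4 can be sketched as a per-cluster average of the per-pixel metric over the image area covered by each cluster's sub-blocks; the row-major block ordering and the toy labels are assumptions:

```python
import numpy as np

def visual_quality_table(rho, labels, k, block=16):
    """Visual quality q_{i,k} of each cluster (step 4-4): the mean of the
    per-pixel metric rho over the positions covered by the 16x16 sub-blocks
    assigned to that cluster. Assumes labels[t] is the K-means cluster of
    the t-th sub-block, in row-major block order."""
    h, w = rho.shape
    sums = np.zeros(k)
    counts = np.zeros(k)
    t = 0
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            patch = rho[r:r + block, c:c + block]
            sums[labels[t]] += patch.sum()
            counts[labels[t]] += patch.size
            t += 1
    return sums / np.maximum(counts, 1)

# toy 32x32 metric map: values grow from top to bottom
rho = np.arange(32 * 32, dtype=float).reshape(32, 32) / (32 * 32)
labels = np.array([0, 0, 1, 1])   # 4 sub-blocks, K = 2: top half vs bottom half
Q_i = visual_quality_table(rho, labels, k=2)
print(Q_i.shape)  # (2,)
```

Cluster 1 covers the bottom half of the toy map, so its visual quality entry exceeds cluster 0's.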
⑤ For any test stereoscopic image S_test, compute its objective image-quality prediction value from the visual dictionary table G and the visual quality table Q of the training set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}.
In this embodiment, the specific process of step ⑤ is as follows:
⑤-1. Denote the left-viewpoint image of S_test as L_test and the right-viewpoint image of S_test as R_test; apply two-dimensional empirical mode decomposition to L_test and R_test separately to obtain their IMF images, denoted {IMF_test^L(x,y)} and {IMF_test^R(x,y)}; then linearly weight {IMF_test^L(x,y)} and {IMF_test^R(x,y)} to obtain the IMF image of S_test, denoted {IMF_test(x,y)}, whose pixel value at coordinate position (x,y) is IMF_test(x,y) = w_L' × IMF_test^L(x,y) + w_R' × IMF_test^R(x,y), where 1 ≤ x ≤ W' and 1 ≤ y ≤ H', W' and H' denote the width and height of the IMF images (W' may differ from W, and H' may differ from H), w_L' is the weight of {IMF_test^L(x,y)}, w_R' is the weight of {IMF_test^R(x,y)}, and w_L' + w_R' = 1; in this example, w_L' = 0.9 and w_R' = 0.1.
⑤-2. Divide { IMF_test(x, y) } into non-overlapping sub-blocks of size 16 × 16, and record the set of column vectors formed by all the sub-blocks in { IMF_test(x, y) } as { y_t | 1 ≤ t ≤ M′ }, where y_t represents the column vector formed by all the pixel points in the t-th sub-block in { IMF_test(x, y) }, the dimension of y_t is 256, and M′ is the total number of sub-blocks.
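The partition of step ⑤-2 can be sketched as follows; the handling of edge rows and columns that do not fill a whole 16 × 16 block (dropped here) is an assumption, since the source does not specify it:

```python
import numpy as np

def block_vectors(imf, block=16):
    """Step 5-2: partition the fused IMF image into non-overlapping
    block x block sub-blocks and flatten each into a column vector of
    dimension block*block (256 for 16 x 16). Incomplete edge blocks are
    dropped (an assumption)."""
    H, W = imf.shape
    vecs = []
    for y0 in range(0, H - block + 1, block):
        for x0 in range(0, W - block + 1, block):
            vecs.append(imf[y0:y0 + block, x0:x0 + block].reshape(-1))
    return np.stack(vecs, axis=1)  # one 256-dim column y_t per sub-block
```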
⑤-3. Calculate the minimum Euclidean distance between each sub-block in { IMF_test(x, y) } and the visual dictionary G, and record the minimum Euclidean distance between the t-th sub-block in { IMF_test(x, y) } and G as d_t, d_t = min_{g ∈ G} ‖ y_t − g ‖, where the symbol "‖ ‖" is the Euclidean distance symbol and min() is the minimum-value function.
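Step ⑤-3 can be sketched with the dictionary G stored column-wise (one atom per column); `min_distances` is an illustrative name, and returning the index of the nearest atom alongside the distance is an addition that the next step is assumed to need:

```python
import numpy as np

def min_distances(Y, G):
    """Step 5-3: for each sub-block column vector y_t (columns of Y),
    compute d_t = min_g || y_t - g || over the atoms g of the visual
    dictionary G (columns of G)."""
    diff = Y[:, :, None] - G[:, None, :]   # dims: feature x T x K
    d = np.sqrt((diff ** 2).sum(axis=0))   # T x K matrix of distances
    return d.min(axis=1), d.argmin(axis=1)  # min distance, nearest atom index
```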
⑤-4. Calculate the objective evaluation metric value of each sub-block in { IMF_test(x, y) }, and record the objective evaluation metric value of the t-th sub-block in { IMF_test(x, y) } as z_t, which is obtained from the visual quality corresponding to the visual dictionary nearest to the t-th sub-block (taken from Q_i, with 1 ≤ i ≤ N and 1 ≤ k ≤ K) and the minimum Euclidean distance d_t through exponential weighting, where exp() represents the exponential function with base e, e = 2.71828183, and λ is a control parameter; in this embodiment, λ = 300 is taken.
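A hedged sketch of step ⑤-4, assuming z_t multiplies the visual quality of the nearest dictionary atom by the exponential weight exp(−d_t/λ); the exact combination is not given verbatim in the source, so this form is an illustrative assumption:

```python
import numpy as np

def block_metric(q_nearest, d_min, lam=300.0):
    """Assumed form of step 5-4: weight the visual quality q of the nearest
    dictionary atom by exp(-d_t / lambda), so sub-blocks far from every atom
    contribute less. lambda = 300 is the value given in this embodiment."""
    return np.asarray(q_nearest) * np.exp(-np.asarray(d_min) / lam)
```

With d_t = 0 the weight is 1 and z_t reduces to the atom's visual quality, which matches the intuition that an exactly represented sub-block inherits the dictionary's quality score.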
⑤-5. According to the objective evaluation metric value of each sub-block in { IMF_test(x, y) }, calculate the image quality objective evaluation prediction value of S_test, recorded as Q, obtained by pooling (averaging) the objective evaluation metric values z_t of all the sub-blocks.
Here, the Ningbo University stereo image library and the LIVE stereo image library are used to analyze the correlation between the image quality objective evaluation prediction values of the blurred and distorted stereo images obtained in this embodiment and the mean subjective score differences. Four objective parameters commonly used for assessing image quality evaluation methods are adopted as evaluation indices, namely the Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SRCC), the Kendall rank correlation coefficient (KRCC), and the root mean square error (RMSE). PLCC and RMSE reflect the accuracy of the objective evaluation results for the distorted stereo images, while SRCC and KRCC reflect their monotonicity.
The method described above is used to calculate the image quality objective evaluation prediction value of each blurred and distorted stereo image in the Ningbo University stereo image library and in the LIVE stereo image library, and the mean subjective score difference of each blurred and distorted stereo image in the two libraries is obtained by the existing subjective evaluation method. The image quality objective evaluation prediction values calculated by the method are then mapped by five-parameter Logistic function nonlinear fitting; higher PLCC, SRCC and KRCC values and a lower RMSE value indicate better correlation between the objective evaluation method and the mean subjective score difference. The PLCC, SRCC, KRCC and RMSE coefficients reflecting the quality evaluation performance of the method of the present invention are listed in Table 1. As can be seen from the data listed in Table 1, the correlation between the final image quality objective evaluation prediction value of the blurred and distorted stereo image obtained in this embodiment and the mean subjective score difference is very good, indicating that the objective evaluation result is consistent with the result of subjective perception by human eyes, which is sufficient to demonstrate the effectiveness of the method of the present invention.
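The evaluation indices PLCC, SRCC and RMSE named above can be computed as sketched below (NumPy-only illustrations; the five-parameter Logistic fitting step and KRCC are omitted here, and the tie-free rank computation in `srcc` is a simplifying assumption):

```python
import numpy as np

def plcc(x, y):
    """Pearson linear correlation coefficient (reflects accuracy)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def srcc(x, y):
    """Spearman rank-order correlation coefficient (reflects monotonicity):
    PLCC computed on the ranks (no tie correction in this sketch)."""
    rank = lambda a: np.argsort(np.argsort(np.asarray(a)))
    return plcc(rank(x), rank(y))

def rmse(x, y):
    """Root mean square error between objective predictions and the mean
    subjective score differences (reflects accuracy)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((x - y) ** 2)))
```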
Table 1 Correlation between the image quality objective evaluation prediction values of the blurred and distorted stereo images obtained in this embodiment and the mean subjective score differences