CN103914835B - No-reference quality evaluation method for blur-distorted stereo images - Google Patents


Info

Publication number
CN103914835B
Authority
CN
China
Prior art keywords
dis
imf
org
image
pixel point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410104299.5A
Other languages
Chinese (zh)
Other versions
CN103914835A (en)
Inventor
邵枫
王珊珊
李福翠
Current Assignee
Shandong Lixin Huachuang Big Data Technology Co ltd
Original Assignee
Ningbo University
Priority date
Filing date
Publication date
Application filed by Ningbo University filed Critical Ningbo University
Priority to CN201410104299.5A
Publication of CN103914835A
Application granted
Publication of CN103914835B


Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a no-reference quality evaluation method for blur-distorted stereo images. In the training stage, several undistorted stereo images and their corresponding blur-distorted stereo images are selected to form a training image set; bidimensional empirical mode decomposition is applied to each blur-distorted stereo image to obtain its intrinsic mode function image, and a visual dictionary table is constructed with the K-means clustering method. A visual quality table is then constructed from the objective evaluation metric values of the pixels in the blur-distorted stereo images. In the test stage, bidimensional empirical mode decomposition is applied to the test stereo image to obtain its intrinsic mode function image, and the objective image-quality evaluation prediction of the test image is obtained from the visual dictionary table and the visual quality table. The advantages are that no complex machine-learning training process is needed in the training stage, that the objective quality prediction is obtained at test time through a simple visual-dictionary search, and that the prediction agrees well with subjective evaluation values.

Description

No-reference quality evaluation method for blur-distorted stereo images
Technical Field
The invention relates to an image quality evaluation method, in particular to a no-reference quality evaluation method for blur-distorted stereo images.
Background
With the rapid development of image coding and stereoscopic display technology, stereoscopic imaging has attracted increasingly wide attention and application and has become a current research hotspot. Stereoscopic imaging exploits the binocular parallax principle of the human visual system: the left- and right-viewpoint images of the same scene are received independently by the two eyes and fused by the brain into binocular parallax, producing a stereoscopic percept with depth and realism. Compared with a single-channel image, a stereo image must guarantee the image quality of both channels simultaneously, so quality evaluation of stereo images is of great significance. At present, however, there is no effective objective method for evaluating stereoscopic image quality; establishing an effective objective quality-evaluation model for stereo images is therefore very important.
Many factors affect the quality of a stereoscopic image, such as the quality-distortion conditions of the left and right viewpoints, the stereoscopic perception condition, and observer visual fatigue, so effective no-reference quality evaluation is a difficult problem in urgent need of a solution. Existing no-reference evaluation models are generally trained by machine learning, which has high computational complexity; moreover, the training model must predict subjective evaluation values of the training images, so such methods are unsuitable for practical applications and have certain limitations. Sparse representation decomposes a signal over a known set of functions, striving to approximate the original signal in a transform domain with a small number of basis functions; current research focuses mainly on dictionary construction and sparse decomposition. A key issue in sparse representation is how to construct a dictionary that efficiently characterizes the essential features of images. Dictionary-construction algorithms proposed so far include: 1) methods with a learning process, which obtain the dictionary through machine-learning training, e.g. with support vector machines; and 2) methods without a learning process, which construct the dictionary directly from image features, e.g. multi-scale Gabor or multi-scale Gaussian dictionaries. Hence, how to construct a dictionary without a learning process, and how to estimate quality without reference from that dictionary, are technical problems that must be solved in no-reference quality-evaluation research.
Disclosure of Invention
The invention aims to provide a no-reference quality evaluation method for blur-distorted stereo images that can effectively improve the correlation between objective evaluation results and subjective perception.
The technical scheme adopted by the invention to solve the above technical problem is as follows: a no-reference quality evaluation method for blur-distorted stereo images, characterized by comprising a training stage and a testing stage, and specifically comprising the following steps:
① Select N original undistorted stereo images and form a training image set from these N images together with the blur-distorted stereo image corresponding to each, recorded as {S_i,org, S_i,dis | 1 ≤ i ≤ N}, where S_i,org denotes the i-th original undistorted stereo image in the training set and S_i,dis denotes the blur-distorted stereo image corresponding to the i-th original undistorted stereo image. Record the left-viewpoint image of S_i,org as L_i,org, the right-viewpoint image of S_i,org as R_i,org, the left-viewpoint image of S_i,dis as L_i,dis, and the right-viewpoint image of S_i,dis as R_i,dis.
② For each blur-distorted stereo image in the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N}, apply bidimensional empirical mode decomposition to its left- and right-viewpoint images to obtain an intrinsic mode function image for each viewpoint. Record the intrinsic mode function image of L_i,dis as {IMF_i^{L,dis}(x, y)} and that of R_i,dis as {IMF_i^{R,dis}(x, y)}, where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W and H denote the width and height of the intrinsic mode function images, IMF_i^{L,dis}(x, y) denotes the pixel value at coordinate (x, y) in {IMF_i^{L,dis}(x, y)}, and IMF_i^{R,dis}(x, y) denotes the pixel value at coordinate (x, y) in {IMF_i^{R,dis}(x, y)};
then linearly weight the intrinsic mode function images of the left- and right-viewpoint images of each blur-distorted stereo image in the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N} to obtain the intrinsic mode function image of each blur-distorted stereo image. Record the intrinsic mode function image of S_i,dis as {IMF_i^dis(x, y)}, where the pixel value at coordinate (x, y) is

IMF_i^dis(x, y) = w_L × IMF_i^{L,dis}(x, y) + w_R × IMF_i^{R,dis}(x, y),

where w_L is the weight of IMF_i^{L,dis}(x, y), w_R is the weight of IMF_i^{R,dis}(x, y), and w_L + w_R = 1;
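The linear weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation; it assumes the left and right intrinsic mode function images have already been produced by bidimensional empirical mode decomposition (the BEMD step itself is outside the scope of this sketch):

```python
import numpy as np

def fuse_imf(imf_left, imf_right, w_left=0.9, w_right=0.1):
    """Linearly weight the left- and right-viewpoint intrinsic mode
    function (IMF) images into a single IMF image (step 2).
    The weights must sum to 1; 0.9/0.1 are the embodiment values."""
    assert abs(w_left + w_right - 1.0) < 1e-9
    return (w_left * np.asarray(imf_left, dtype=float)
            + w_right * np.asarray(imf_right, dtype=float))
```

Applying it to two constant test images makes the weighting explicit: a left value of 10 and right value of 20 fuse to 0.9·10 + 0.1·20 = 11.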
③ Partition the intrinsic mode function image of each blur-distorted stereo image in the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N} into non-overlapping sub-blocks; then cluster the set formed by all sub-blocks of each intrinsic mode function image with the K-means clustering method to obtain K clusters per intrinsic mode function image, where K denotes the total number of clusters contained in each intrinsic mode function image. Next, obtain the visual dictionary table of each intrinsic mode function image from its K clusters; then obtain the visual dictionary table of the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N} from the visual dictionary tables of all intrinsic mode function images, recorded as G, G = {G_i | 1 ≤ i ≤ N}, where G_i denotes the visual dictionary table of {IMF_i^dis(x, y)}, G_i = {g_i,k | 1 ≤ k ≤ K}, and g_i,k denotes the visual dictionary of the k-th cluster of {IMF_i^dis(x, y)}, which is also the centroid of the k-th cluster;
④ Obtain the objective evaluation metric value of each pixel in each blur-distorted stereo image by calculating the frequency responses of the images in the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N}; then obtain the visual quality table of each blur-distorted stereo image from the objective evaluation metric values of its pixels; then obtain the visual quality table of the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N} from the visual quality tables of all blur-distorted stereo images, recorded as Q, Q = {Q_i | 1 ≤ i ≤ N}, where Q_i denotes the visual quality table of S_i,dis, Q_i = {q_i,k | 1 ≤ k ≤ K}, and q_i,k denotes the visual quality of the k-th cluster of {IMF_i^dis(x, y)};
⑤ For any test stereo image S_test, compute the objective image-quality evaluation prediction of S_test according to the visual dictionary table G and the visual quality table Q obtained from the training image set {S_i,org, S_i,dis | 1 ≤ i ≤ N}.
In step ②, w_L = 0.9 and w_R = 0.1.
In step ③, the visual dictionary table G_i of {IMF_i^dis(x, y)} is obtained as follows:
③-1. Partition {IMF_i^dis(x, y)} into W×H/256 non-overlapping sub-blocks of size 16 × 16 and record the set formed by all sub-blocks as {x_i,t | 1 ≤ t ≤ W×H/256}, where x_i,t denotes the column vector formed from all pixels of the t-th sub-block; the dimension of x_i,t is 256.
③-2. Apply the K-means clustering method to the set of sub-blocks of {IMF_i^dis(x, y)} to obtain K clusters; then take the centroid of each cluster as a visual dictionary to obtain the visual dictionary table of {IMF_i^dis(x, y)}, recorded as G_i, G_i = {g_i,k | 1 ≤ k ≤ K}, where K denotes the total number of clusters, and g_i,k denotes the visual dictionary of the k-th cluster, which is also the centroid of the k-th cluster; the dimension of g_i,k is 256.
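Steps ③-1 and ③-2 can be sketched as follows. This is an illustrative pure-NumPy Lloyd's-iteration K-means (the patent uses "the existing K-means clustering method" and does not fix an implementation); the block size 16 and the dictionary dimension 256 follow the text:

```python
import numpy as np

def build_visual_dictionary(imf, block=16, K=4, iters=20, seed=0):
    """Tile the IMF image into non-overlapping block x block sub-blocks,
    flatten each into a 256-dim vector, and cluster with K-means.
    Returns the K centroids (the visual dictionary G_i) and the
    cluster label of each sub-block."""
    H, W = imf.shape
    blocks = [imf[r:r + block, c:c + block].ravel()
              for r in range(0, H - block + 1, block)
              for c in range(0, W - block + 1, block)]
    X = np.asarray(blocks, dtype=float)                # one row per sub-block
    rng = np.random.default_rng(seed)
    cent = X[rng.choice(len(X), size=K, replace=False)]  # initial centroids
    for _ in range(iters):                             # Lloyd's iterations
        d = ((X[:, None, :] - cent[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)                      # nearest centroid per block
        for k in range(K):
            if (labels == k).any():
                cent[k] = X[labels == k].mean(axis=0)  # update centroid
    return cent, labels
```

The embodiment takes K = 30; the default here is smaller only so the sketch runs on a tiny image.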
In step ④, the visual quality table Q_i of S_i,dis is obtained as follows:
④-1. Filter L_i,org, R_i,org, L_i,dis and R_i,dis with Gabor filters to obtain the frequency response of each pixel at different centre frequencies and different orientation factors. Record the frequency response of the pixel at coordinate (x, y) in L_i,org at centre frequency ω and orientation factor θ as

G_i,org^L(x, y; ω, θ) = e_i,org^L(x, y; ω, θ) + j·o_i,org^L(x, y; ω, θ),

and likewise

G_i,org^R(x, y; ω, θ) = e_i,org^R(x, y; ω, θ) + j·o_i,org^R(x, y; ω, θ),
G_i,dis^L(x, y; ω, θ) = e_i,dis^L(x, y; ω, θ) + j·o_i,dis^L(x, y; ω, θ),
G_i,dis^R(x, y; ω, θ) = e_i,dis^R(x, y; ω, θ) + j·o_i,dis^R(x, y; ω, θ),

where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W and H denote the width and height of L_i,org, R_i,org, L_i,dis and R_i,dis, ω denotes the centre frequency of the Gabor filter, ω ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}, θ denotes the orientation factor of the Gabor filter, 1 ≤ θ ≤ 4, e(·) and o(·) denote the real and imaginary parts of the corresponding frequency response, and j is the imaginary unit;
④-2. From the frequency responses of each pixel in L_i,org and R_i,org at the selected centre frequency and the different orientation factors, compute the amplitude of each pixel in S_i,org; record the amplitude of the pixel at coordinate (x, y) in S_i,org as

LA_i^org(x, y) = sqrt( (F_i^org(x, y))² + (H_i^org(x, y))² ),

where

F_i^org(x, y) = Σ_{θ=1..4} [ e_i,org^L(x, y; ω_m, θ) + e_i,org^R(x, y; ω_m, θ) ],
H_i^org(x, y) = Σ_{θ=1..4} [ o_i,org^L(x, y; ω_m, θ) + o_i,org^R(x, y; ω_m, θ) ],

ω_m is the selected centre frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}, and e_i,org^L(x, y; ω_m, θ), o_i,org^L(x, y; ω_m, θ), e_i,org^R(x, y; ω_m, θ) and o_i,org^R(x, y; ω_m, θ) denote the real and imaginary parts of the frequency responses of the pixel at coordinate (x, y) in L_i,org and R_i,org at centre frequency ω_m and orientation factor θ;
likewise, from the frequency responses of each pixel in L_i,dis and R_i,dis at the selected centre frequency and the different orientation factors, compute the amplitude of each pixel in S_i,dis; record the amplitude of the pixel at coordinate (x, y) in S_i,dis as

LA_i^dis(x, y) = sqrt( (F_i^dis(x, y))² + (H_i^dis(x, y))² ),

where

F_i^dis(x, y) = Σ_{θ=1..4} [ e_i,dis^L(x, y; ω_m, θ) + e_i,dis^R(x, y; ω_m, θ) ],
H_i^dis(x, y) = Σ_{θ=1..4} [ o_i,dis^L(x, y; ω_m, θ) + o_i,dis^R(x, y; ω_m, θ) ],

with ω_m the selected centre frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}, and e(·), o(·) the real and imaginary parts of the corresponding frequency responses;
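The amplitude combination in ④-2 can be sketched as follows, assuming the Gabor real (even) and imaginary (odd) responses at the selected centre frequency ω_m have already been computed for the four orientation factors (computing the Gabor filter bank itself is not shown):

```python
import numpy as np

def amplitude_map(e_left, o_left, e_right, o_right):
    """Combine Gabor responses into a per-pixel amplitude map.
    e_*/o_* have shape (4, H, W): real (even) and imaginary (odd)
    responses for the 4 orientation factors at centre frequency w_m.
    Returns LA(x, y) = sqrt(F^2 + H^2), where F sums the real parts
    and H sums the imaginary parts over orientations and viewpoints."""
    F = (e_left + e_right).sum(axis=0)   # sum of real parts over theta
    Hc = (o_left + o_right).sum(axis=0)  # sum of imaginary parts over theta
    return np.sqrt(F ** 2 + Hc ** 2)
```

With all responses equal to 1, each pixel gets F = H = 8, so the amplitude is sqrt(128), which makes the summation structure easy to check by hand.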
④-3. From the amplitudes of the pixels in S_i,org and S_i,dis, compute the objective evaluation metric value of each pixel in S_i,dis; record the metric value of the pixel at coordinate (x, y) in S_i,dis as ρ_i(x, y):

ρ_i(x, y) = (1 + cos(2·ψ_i(x, y))) / 2,

ψ_i(x, y) = arccos( (GX_i^org(x, y)·GX_i^dis(x, y) + GY_i^org(x, y)·GY_i^dis(x, y) + T₁) / ( sqrt((GX_i^org(x, y))² + (GY_i^org(x, y))²) · sqrt((GX_i^dis(x, y))² + (GY_i^dis(x, y))²) + T₁ ) ),

where cos() is the cosine function, arccos() is the arc-cosine function, GX_i^org(x, y) and GY_i^org(x, y) are the horizontal and vertical gradient values of LA_i^org(x, y), GX_i^dis(x, y) and GY_i^dis(x, y) are the horizontal and vertical gradient values of LA_i^dis(x, y), and T₁ is a control parameter;
④-4. From the objective evaluation metric values of the pixels in S_i,dis, obtain the visual quality table of S_i,dis, recorded as Q_i, Q_i = {q_i,k | 1 ≤ k ≤ K}, where q_i,k denotes the visual quality of the k-th cluster of {IMF_i^dis(x, y)}, computed as the mean of ρ_i(x, y) over Ω_k; here Ω_k denotes the set of coordinate positions of the pixels in S_i,dis whose coordinates coincide with those of all pixels contained in the k-th cluster of {IMF_i^dis(x, y)}, and the number of elements of Ω_k is the total number of pixels contained in the k-th cluster.
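The per-pixel metric of ④-3 can be sketched directly from the formula. This is an illustrative NumPy version; the value T₁ = 0.85 is an assumption for demonstration (the patent only calls T₁ a control parameter without giving its value here):

```python
import numpy as np

def pixel_metric(gx_org, gy_org, gx_dis, gy_dis, T1=0.85):
    """Per-pixel objective evaluation metric rho(x, y) from the
    horizontal/vertical gradients of the pristine and distorted
    amplitude maps: rho = (1 + cos(2*psi)) / 2, where psi measures
    the angle between the two gradient vectors (T1 stabilises the
    ratio; its value is assumed)."""
    num = gx_org * gx_dis + gy_org * gy_dis + T1
    den = (np.sqrt(gx_org**2 + gy_org**2)
           * np.sqrt(gx_dis**2 + gy_dis**2) + T1)
    psi = np.arccos(np.clip(num / den, -1.0, 1.0))  # clip guards rounding
    return (1.0 + np.cos(2.0 * psi)) / 2.0
```

When the distorted gradients equal the pristine ones, the ratio is 1, ψ = 0, and ρ = 1 (no perceived degradation); ρ decreases toward 0 as the gradient directions diverge.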
The specific process of step ⑤ is as follows:
⑤-1. Record the left-viewpoint image of S_test as L_test and the right-viewpoint image of S_test as R_test. Apply bidimensional empirical mode decomposition to L_test and R_test to obtain their respective intrinsic mode function images, recorded as {IMF_test^L(x, y)} and {IMF_test^R(x, y)}; then linearly weight the two to obtain the intrinsic mode function image of S_test, recorded as {IMF_test(x, y)}, where the pixel value at coordinate (x, y) is

IMF_test(x, y) = w_L′ × IMF_test^L(x, y) + w_R′ × IMF_test^R(x, y),

where 1 ≤ x ≤ W′ and 1 ≤ y ≤ H′, W′ and H′ denote the width and height of the intrinsic mode function images, IMF_test^L(x, y) and IMF_test^R(x, y) denote the pixel values at coordinate (x, y) of {IMF_test^L(x, y)} and {IMF_test^R(x, y)}, w_L′ is the weight of IMF_test^L(x, y), w_R′ is the weight of IMF_test^R(x, y), and w_L′ + w_R′ = 1.
⑤-2. Partition {IMF_test(x, y)} into W′×H′/256 non-overlapping sub-blocks of size 16 × 16 and record the set formed by all sub-blocks as {y_t | 1 ≤ t ≤ W′×H′/256}, where y_t denotes the column vector formed from all pixels of the t-th sub-block; the dimension of y_t is 256.
⑤-3. Compute the minimum Euclidean distance from each sub-block of {IMF_test(x, y)} to the visual dictionary table G; record the minimum Euclidean distance from the t-th sub-block to G as d_t, d_t = min(‖y_t − g_i,k‖) over 1 ≤ i ≤ N and 1 ≤ k ≤ K, where the symbol "‖·‖" denotes the Euclidean distance and min() denotes the minimum function.
⑤-4. Compute the objective evaluation metric value of each sub-block of {IMF_test(x, y)}; record the metric value of the t-th sub-block as z_t, z_t = exp(−d_t/λ) × q̂_t, where q̂_t denotes the visual quality in Q corresponding to the visual dictionary in G nearest to y_t, 1 ≤ i ≤ N, 1 ≤ k ≤ K, exp() denotes the exponential function with base e, e = 2.71828183, and λ is a control parameter.
⑤-5. From the objective evaluation metric values of all sub-blocks of {IMF_test(x, y)}, compute the objective image-quality evaluation prediction of S_test, recorded as Q, as the average of z_t over all sub-blocks.
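Steps ⑤-3 to ⑤-5 can be sketched as one lookup loop. This is a plausible reading of the (garbled) formula for z_t — the nearest atom's visual quality weighted by exp(−d_t/λ) and averaged over sub-blocks; the weighting form, λ's value, and the flat stacking of all dictionaries/qualities are assumptions of this sketch:

```python
import numpy as np

def predict_quality(test_blocks, dictionary, quality, lam=100.0):
    """Test-phase lookup: `dictionary` is an (M, 256) stack of all
    visual-dictionary atoms g_{i,k}; `quality` is the matching (M,)
    vector of visual qualities q_{i,k}. For each test block y_t, take
    the nearest atom's quality weighted by exp(-d_min / lam), then
    average the block scores into the final prediction."""
    scores = []
    for y in np.asarray(test_blocks, dtype=float):
        d = np.linalg.norm(dictionary - y, axis=1)  # Euclidean distance to every atom
        j = int(d.argmin())                          # nearest visual word
        scores.append(np.exp(-d[j] / lam) * quality[j])
    return float(np.mean(scores))
```

When each test block exactly matches a dictionary atom, d_min = 0 and the weight is 1, so the prediction reduces to the mean quality of the matched atoms — a quick sanity check on the lookup.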
compared with the prior art, the invention has the advantages that:
1) The method constructs the visual dictionary table and the visual quality table by unsupervised learning, avoiding a complex machine-learning training process; moreover, it does not need to predict subjective evaluation values of the training images during the training stage, making it better suited to practical applications.
2) In the testing stage, the method obtains the objective image-quality evaluation prediction through a simple visual-dictionary search, which greatly reduces the computational complexity of testing while keeping good consistency between the predicted objective values and the subjective evaluation values.
Drawings
Fig. 1 is a block diagram of the overall implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the embodiments.
The overall implementation block diagram of the no-reference quality evaluation method for blur-distorted stereo images provided by the invention is shown in Fig. 1. The method comprises a training stage and a testing stage. In the training stage, several original undistorted stereo images and their corresponding blur-distorted stereo images are selected to form a training image set; each blur-distorted stereo image in the training set is decomposed by bidimensional empirical mode decomposition to obtain its intrinsic mode function image; each intrinsic mode function image is then partitioned into non-overlapping blocks, and a visual dictionary table is constructed with the K-means clustering method. By calculating the frequency responses of each original undistorted stereo image in the training set and each pixel of its corresponding blur-distorted stereo image at the selected centre frequency and different orientation factors, the objective evaluation metric value of each pixel in each blur-distorted stereo image is obtained, and a visual quality table corresponding to the visual dictionary table is constructed. In the testing stage, for any test stereo image, bidimensional empirical mode decomposition is applied to obtain its intrinsic mode function image, which is partitioned into non-overlapping blocks; the objective image-quality evaluation prediction of the test image is then computed from the constructed visual dictionary table and visual quality table. The specific steps of the no-reference quality evaluation method are as follows:
① Select N original undistorted stereo images and form a training image set from these N images together with the blur-distorted stereo image corresponding to each, recorded as {S_i,org, S_i,dis | 1 ≤ i ≤ N}, where S_i,org denotes the i-th original undistorted stereo image in the training set and S_i,dis denotes the blur-distorted stereo image corresponding to the i-th original undistorted stereo image. Record the left-viewpoint image of S_i,org as L_i,org, the right-viewpoint image of S_i,org as R_i,org, the left-viewpoint image of S_i,dis as L_i,dis, and the right-viewpoint image of S_i,dis as R_i,dis. The larger the value of N, the higher the precision of the visual dictionary table and visual quality table obtained by training, but the higher the computational complexity; as a compromise, half of the blur-distorted images in the adopted image library may be selected for processing. The symbol "{ }" denotes a set.
Here, experiments were performed with the blur-distorted stereo images of the Ningbo University stereo image library and the LIVE stereo image library. The blur-distorted part of the Ningbo University library consists of 60 distorted stereo images generated from 12 undistorted stereo images under different degrees of Gaussian blur; the blur-distorted part of the LIVE library consists of 45 distorted stereo images generated from 19 undistorted stereo images under different degrees of Gaussian blur. In this embodiment, 50% of the blur-distorted stereo images are used to construct the training image set: N = 30 for the training set constructed from the Ningbo University library, and N = 22 for the training set constructed from the LIVE library.
② For each blur-distorted stereo image in the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N}, apply bidimensional empirical mode decomposition to its left- and right-viewpoint images to obtain an intrinsic mode function image for each viewpoint. Record the intrinsic mode function image of L_i,dis as {IMF_i^{L,dis}(x, y)} and that of R_i,dis as {IMF_i^{R,dis}(x, y)}, where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W and H denote the width and height of the intrinsic mode function images, and IMF_i^{L,dis}(x, y) and IMF_i^{R,dis}(x, y) denote the pixel values at coordinate (x, y) of the two images.
Then, for the training image set { Si,org,Si,disI is more than or equal to 1 and less than or equal to N), and the intrinsic mode function image of the left viewpoint image and the intrinsic mode function image of the right viewpoint image of each blurred and distorted stereo image are subjected to linear weighting to obtain a training image set { Si,org,Si,disI is more than or equal to 1 and less than or equal to N, and S is compared with Si,disIs recorded as the intrinsic mode function imageWill be provided withThe pixel value of the pixel point with the middle coordinate position (x, y) is recorded as IMF i dis ( x , y ) = w L × IMF i L , dis ( x , y ) + w R × IMF i R , dis ( x , y ) , Wherein, wLIs composed ofWeight ratio of (1), wRIs composed ofWeight ratio of (1), wL+wR=1, in this example wL=0.9,wR=0.1。
③ Partition the intrinsic mode function image of each blur-distorted stereo image in the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N} into non-overlapping sub-blocks; then cluster the set formed by all sub-blocks of each intrinsic mode function image with the existing K-means clustering method to obtain K clusters per intrinsic mode function image, where K denotes the total number of clusters contained in each intrinsic mode function image. Over-clustering occurs when K is too large and under-clustering when K is too small; in this embodiment, K = 30 is taken. Next, obtain the visual dictionary table of each intrinsic mode function image from its K clusters; then obtain the visual dictionary table of the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N} from the visual dictionary tables of all intrinsic mode function images, recorded as G, G = {G_i | 1 ≤ i ≤ N}, where the symbol "{ }" denotes a set, G_i denotes the visual dictionary table of {IMF_i^dis(x, y)}, G_i = {g_i,k | 1 ≤ k ≤ K}, and g_i,k denotes the visual dictionary of the k-th cluster of {IMF_i^dis(x, y)}, which is also the centroid of the k-th cluster.
In this embodiment, the visual dictionary table G_i of {IMF_i^dis(x, y)} in step ③ is obtained as follows:
③-1. Partition {IMF_i^dis(x, y)} into W×H/256 non-overlapping sub-blocks of size 16 × 16 and record the set formed by all sub-blocks as {x_i,t | 1 ≤ t ≤ W×H/256}, where x_i,t denotes the column vector formed from all pixels of the t-th sub-block; the dimension of x_i,t is 256.
③-2. Apply the existing K-means clustering method to the set of sub-blocks of {IMF_i^dis(x, y)} to obtain K clusters; then take the centroid of each cluster as a visual dictionary to obtain the visual dictionary table of {IMF_i^dis(x, y)}, recorded as G_i, G_i = {g_i,k | 1 ≤ k ≤ K}, where K denotes the total number of clusters; over-clustering occurs when K is too large and under-clustering when K is too small, and in this embodiment K = 30. g_i,k denotes the visual dictionary of the k-th cluster, which is also the centroid of the k-th cluster; the dimension of g_i,k is 256.
④ Obtain the objective evaluation metric value of each pixel in each blur-distorted stereo image by calculating the frequency responses of the images in the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N}; then obtain the visual quality table of each blur-distorted stereo image from the objective evaluation metric values of its pixels; then obtain the visual quality table of the training set {S_i,org, S_i,dis | 1 ≤ i ≤ N} from the visual quality tables of all blur-distorted stereo images, recorded as Q, Q = {Q_i | 1 ≤ i ≤ N}, where Q_i denotes the visual quality table of S_i,dis, Q_i = {q_i,k | 1 ≤ k ≤ K}, and q_i,k denotes the visual quality of the k-th cluster of {IMF_i^dis(x, y)}.
In this embodiment, the visual quality table Q_i of S_i,dis in step ④ is obtained as follows:
④-1. Filter L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis} with Gabor filters to obtain the frequency response of each pixel point in L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis} under different center frequencies and different direction factors. Record the frequency response of the pixel point at coordinate position (x, y) in L_{i,org} at center frequency ω and direction factor θ as G_{i,org}^L(x,y;ω,θ) = e_{i,org}^L(x,y;ω,θ) + j·o_{i,org}^L(x,y;ω,θ); record the frequency response of the pixel point at (x, y) in R_{i,org} at center frequency ω and direction factor θ as G_{i,org}^R(x,y;ω,θ) = e_{i,org}^R(x,y;ω,θ) + j·o_{i,org}^R(x,y;ω,θ); record the frequency response of the pixel point at (x, y) in L_{i,dis} at center frequency ω and direction factor θ as G_{i,dis}^L(x,y;ω,θ) = e_{i,dis}^L(x,y;ω,θ) + j·o_{i,dis}^L(x,y;ω,θ); record the frequency response of the pixel point at (x, y) in R_{i,dis} at center frequency ω and direction factor θ as G_{i,dis}^R(x,y;ω,θ) = e_{i,dis}^R(x,y;ω,θ) + j·o_{i,dis}^R(x,y;ω,θ). Here 1 ≤ x ≤ W and 1 ≤ y ≤ H, where W denotes the width and H the height of L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis}; ω denotes the center frequency of the Gabor filter, ω ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}; θ denotes the direction factor of the Gabor filter, 1 ≤ θ ≤ 4; e_{i,org}^L(x,y;ω,θ) and o_{i,org}^L(x,y;ω,θ) are the real and imaginary parts of G_{i,org}^L(x,y;ω,θ); e_{i,org}^R(x,y;ω,θ) and o_{i,org}^R(x,y;ω,θ) are the real and imaginary parts of G_{i,org}^R(x,y;ω,θ); e_{i,dis}^L(x,y;ω,θ) and o_{i,dis}^L(x,y;ω,θ) are the real and imaginary parts of G_{i,dis}^L(x,y;ω,θ); e_{i,dis}^R(x,y;ω,θ) and o_{i,dis}^R(x,y;ω,θ) are the real and imaginary parts of G_{i,dis}^R(x,y;ω,θ); j is the imaginary unit.
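As an illustrative sketch (not the patent's exact filter implementation), the complex frequency response G = e + j·o of step ④-1 can be obtained by sliding a complex Gabor kernel over the image; the kernel size, the Gaussian width σ, and the mapping of the patent's center-frequency values onto cycles per pixel are all assumptions of this example:

```python
import numpy as np

def gabor_kernel(freq, theta, size=15, sigma=3.0):
    # Complex Gabor kernel: Gaussian envelope times a complex carrier.
    # freq is in cycles/pixel here; relating it to the patent's center
    # frequencies {1.74, ..., 9.87} is an assumption of this sketch.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # coordinate rotated by theta
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * freq * xr)

def gabor_response(image, freq, theta):
    # 'same'-size complex response G = e + j*o via zero-padded correlation.
    k = gabor_kernel(freq, theta)
    half = k.shape[0] // 2
    padded = np.pad(image, half)
    out = np.zeros(image.shape, dtype=complex)
    for dy in range(k.shape[0]):
        for dx in range(k.shape[1]):
            out += k[dy, dx] * padded[dy:dy + image.shape[0],
                                      dx:dx + image.shape[1]]
    return out

img = np.random.rand(32, 32)
G = gabor_response(img, freq=0.2, theta=np.pi / 4)
e, o = G.real, G.imag                            # real and imaginary parts
```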
④-2. According to the frequency responses of the pixel points in L_{i,org} and R_{i,org} at the selected center frequency and the different direction factors, calculate the amplitude of each pixel point in S_{i,org}. Record the amplitude of the pixel point at coordinate position (x, y) in S_{i,org} as LA_i^{org}(x,y) = sqrt((F_i^{org}(x,y))² + (H_i^{org}(x,y))²), where F_i^{org}(x,y) = Σ_{θ=1}^{4} [e_{i,org}^L(x,y;ω_m,θ) + e_{i,org}^R(x,y;ω_m,θ)], H_i^{org}(x,y) = Σ_{θ=1}^{4} [o_{i,org}^L(x,y;ω_m,θ) + o_{i,org}^R(x,y;ω_m,θ)], ω_m is the selected center frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}; in this example, ω_m = 4.93 is taken. e_{i,org}^L(x,y;ω_m,θ) and o_{i,org}^L(x,y;ω_m,θ) denote the real and imaginary parts of the frequency response G_{i,org}^L(x,y;ω_m,θ) of the pixel point at (x, y) in L_{i,org} at center frequency ω_m and direction factor θ, and e_{i,org}^R(x,y;ω_m,θ) and o_{i,org}^R(x,y;ω_m,θ) denote the real and imaginary parts of G_{i,org}^R(x,y;ω_m,θ).
Likewise, according to the frequency responses of the pixel points in L_{i,dis} and R_{i,dis} at the selected center frequency and the different direction factors, calculate the amplitude of each pixel point in S_{i,dis}. Record the amplitude of the pixel point at coordinate position (x, y) in S_{i,dis} as LA_i^{dis}(x,y) = sqrt((F_i^{dis}(x,y))² + (H_i^{dis}(x,y))²), where F_i^{dis}(x,y) = Σ_{θ=1}^{4} [e_{i,dis}^L(x,y;ω_m,θ) + e_{i,dis}^R(x,y;ω_m,θ)], H_i^{dis}(x,y) = Σ_{θ=1}^{4} [o_{i,dis}^L(x,y;ω_m,θ) + o_{i,dis}^R(x,y;ω_m,θ)], ω_m is the selected center frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}; in this example, ω_m = 4.93 is taken. e_{i,dis}^L(x,y;ω_m,θ) and o_{i,dis}^L(x,y;ω_m,θ) denote the real and imaginary parts of the frequency response G_{i,dis}^L(x,y;ω_m,θ) of the pixel point at (x, y) in L_{i,dis} at center frequency ω_m and direction factor θ, and e_{i,dis}^R(x,y;ω_m,θ) and o_{i,dis}^R(x,y;ω_m,θ) denote the real and imaginary parts of G_{i,dis}^R(x,y;ω_m,θ).
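The amplitude computation of step ④-2 can be sketched as follows, assuming the four per-orientation complex responses of each viewpoint image are stacked in one array (the names and the (4, H, W) layout are illustrative, not from the patent):

```python
import numpy as np

def amplitude_map(resp_L, resp_R):
    # resp_L, resp_R: complex Gabor responses of the left/right viewpoint
    # images at the selected center frequency, shape (4, H, W) --
    # one slice per direction factor theta = 1..4 (assumed layout).
    F = resp_L.real.sum(axis=0) + resp_R.real.sum(axis=0)   # sum of real parts
    Hc = resp_L.imag.sum(axis=0) + resp_R.imag.sum(axis=0)  # sum of imaginary parts
    return np.sqrt(F ** 2 + Hc ** 2)                        # LA = sqrt(F^2 + H^2)

rng = np.random.default_rng(0)
resp_L = rng.standard_normal((4, 8, 8)) + 1j * rng.standard_normal((4, 8, 8))
resp_R = rng.standard_normal((4, 8, 8)) + 1j * rng.standard_normal((4, 8, 8))
LA = amplitude_map(resp_L, resp_R)
```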
④-3. According to the amplitudes of the pixel points in S_{i,org} and S_{i,dis}, calculate the objective evaluation metric value of each pixel point in S_{i,dis}. Record the objective evaluation metric value of the pixel point at coordinate position (x, y) in S_{i,dis} as ρ_i(x,y), ρ_i(x,y) = [1 + cos(2·ψ_i(x,y))]/2, ψ_i(x,y) = arccos( (GX_i^{org}(x,y)·GX_i^{dis}(x,y) + GY_i^{org}(x,y)·GY_i^{dis}(x,y) + T_1) / (sqrt((GX_i^{org}(x,y))² + (GY_i^{org}(x,y))²) · sqrt((GX_i^{dis}(x,y))² + (GY_i^{dis}(x,y))²) + T_1) ), where cos() is the cosine function, arccos() is the inverse cosine function, GX_i^{org}(x,y) and GY_i^{org}(x,y) are the horizontal and vertical gradient values of LA_i^{org}(x,y), GX_i^{dis}(x,y) and GY_i^{dis}(x,y) are the horizontal and vertical gradient values of LA_i^{dis}(x,y), and T_1 is a control parameter; in this example, T_1 = 0.85 is taken.
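A minimal sketch of the per-pixel objective evaluation metric of step ④-3; np.gradient is used here as one possible horizontal/vertical gradient operator (the patent does not fix the operator in this text):

```python
import numpy as np

def objective_metric(LA_org, LA_dis, T1=0.85):
    # Per-pixel metric rho from the amplitude maps of the undistorted and
    # distorted images; gradient choice (np.gradient) is an assumption.
    GYo, GXo = np.gradient(LA_org)          # vertical, horizontal gradients
    GYd, GXd = np.gradient(LA_dis)
    num = GXo * GXd + GYo * GYd + T1
    den = np.sqrt(GXo ** 2 + GYo ** 2) * np.sqrt(GXd ** 2 + GYd ** 2) + T1
    psi = np.arccos(np.clip(num / den, -1.0, 1.0))  # clip guards rounding
    return (1.0 + np.cos(2.0 * psi)) / 2.0

rng = np.random.default_rng(1)
LA_org = rng.random((16, 16))
rho = objective_metric(LA_org, LA_org)      # identical amplitudes -> rho == 1
```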
④-4. According to the objective evaluation metric values of the pixel points in S_{i,dis}, obtain the visual quality table of S_{i,dis}, denoted as Q_i, Q_i = {q_{i,k} | 1 ≤ k ≤ K}, where q_{i,k} represents the visual quality of the kth cluster of {IMF_i^{dis}(x,y)} and is taken as the average of the objective evaluation metric values over that cluster, q_{i,k} = (1/N_{i,k}) Σ_{(x,y)∈Ω_k} ρ_i(x,y); Ω_k denotes the set of coordinate positions of the pixel points in S_{i,dis} having the same coordinate positions as all the pixel points included in the kth cluster of {IMF_i^{dis}(x,y)}, and N_{i,k} represents the total number of pixel points included in the kth cluster of {IMF_i^{dis}(x,y)}.
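The cluster-wise averaging that yields the visual quality table can be sketched as below, assuming a per-pixel cluster label map coming out of the K-means step (the label-map representation is an assumption of this example):

```python
import numpy as np

def visual_quality_table(rho, labels, K):
    # q_k: mean of rho over the pixel coordinates assigned to cluster k.
    # labels: per-pixel cluster index in 0..K-1, same shape as rho.
    flat_rho, flat_lab = rho.ravel(), labels.ravel()
    return np.array([flat_rho[flat_lab == k].mean() if np.any(flat_lab == k)
                     else 0.0 for k in range(K)])

rho = np.array([[1.0, 0.5],
                [0.0, 0.5]])
labels = np.array([[0, 1],
                   [0, 1]])
q = visual_quality_table(rho, labels, K=2)   # cluster means: [0.5, 0.5]
```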
⑤ For any test stereo image S_test, according to the visual dictionary table G and the visual quality table Q of the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, calculate the image quality objective evaluation predicted value of S_test.
In this embodiment, the specific process of step ⑤ is as follows:
⑤-1. Denote the left viewpoint image of S_test as L_test and the right viewpoint image of S_test as R_test. Perform two-dimensional empirical mode decomposition on L_test and R_test respectively to obtain their intrinsic mode function images, correspondingly denoted as {IMF_test^L(x,y)} and {IMF_test^R(x,y)}; then linearly weight {IMF_test^L(x,y)} and {IMF_test^R(x,y)} to obtain the intrinsic mode function image of S_test, denoted as {IMF_test(x,y)}. Record the pixel value of the pixel point at coordinate position (x, y) in {IMF_test(x,y)} as IMF_test(x,y), IMF_test(x,y) = w_L'×IMF_test^L(x,y) + w_R'×IMF_test^R(x,y), where 1 ≤ x ≤ W' and 1 ≤ y ≤ H', W' denotes the width and H' the height of {IMF_test^L(x,y)} and {IMF_test^R(x,y)} (W' may not be equal to W, and H' may not be equal to H); IMF_test^L(x,y) denotes the pixel value of the pixel point at (x, y) in {IMF_test^L(x,y)}, and IMF_test^R(x,y) denotes the pixel value of the pixel point at (x, y) in {IMF_test^R(x,y)}; w_L' is the weight of {IMF_test^L(x,y)}, w_R' is the weight of {IMF_test^R(x,y)}, and w_L' + w_R' = 1; in this example, w_L' = 0.9 and w_R' = 0.1 are taken.
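The linear weighting of ⑤-1 is a pixel-wise blend of the two intrinsic mode function images; a minimal sketch (the input arrays stand in for the BEMD outputs, which are not computed here):

```python
import numpy as np

def fuse_imf(imf_L, imf_R, wL=0.9, wR=0.1):
    # Linearly weight the left/right intrinsic mode function images;
    # wL + wR = 1, and the 0.9/0.1 split is the embodiment's choice.
    assert imf_L.shape == imf_R.shape and abs(wL + wR - 1.0) < 1e-9
    return wL * imf_L + wR * imf_R

imf_L = np.full((4, 4), 2.0)
imf_R = np.full((4, 4), 1.0)
fused = fuse_imf(imf_L, imf_R)      # every pixel: 0.9*2.0 + 0.1*1.0 = 1.9
```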
⑤-2. Divide {IMF_test(x,y)} into M' non-overlapping sub-blocks of size 16×16, and denote the set of all sub-blocks in {IMF_test(x,y)} as {y_t | 1 ≤ t ≤ M'}, where M' denotes the total number of sub-blocks, and y_t is the column vector formed by all pixel points in the tth sub-block of {IMF_test(x,y)}; the dimension of y_t is 256.
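The 16×16 partitioning of ⑤-2 can be sketched with numpy reshapes; how image sizes that are not multiples of 16 are handled (here: trailing rows/columns cropped) is an assumption of this example:

```python
import numpy as np

def partition_blocks(imf, block=16):
    # Split an IMF image into non-overlapping block x block sub-blocks and
    # return one flattened 256-d vector per sub-block, in row-major block
    # order; incomplete edge blocks are dropped (assumed policy).
    Hc = (imf.shape[0] // block) * block
    Wc = (imf.shape[1] // block) * block
    imf = imf[:Hc, :Wc]
    return (imf.reshape(Hc // block, block, Wc // block, block)
               .transpose(0, 2, 1, 3)          # group by (block row, block col)
               .reshape(-1, block * block))    # shape: (num_blocks, 256)

y = partition_blocks(np.arange(32 * 48, dtype=float).reshape(32, 48))
```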
⑤-3. Calculate the minimum Euclidean distance between each sub-block in {IMF_test(x,y)} and the visual dictionary table G. Record the minimum Euclidean distance between the tth sub-block in {IMF_test(x,y)} and G as d_t, d_t = min_{1≤i≤N, 1≤k≤K} ||y_t − g_{i,k}||, where the symbol "|| ||" denotes the Euclidean distance and min() is the minimum function.
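Step ⑤-3's minimum Euclidean distance search over the visual dictionary table can be sketched as follows; a 2-dimensional toy dictionary stands in for the 256-dimensional codewords:

```python
import numpy as np

def min_distance(y_t, G):
    # Minimum Euclidean distance between a sub-block vector y_t and all
    # codewords in the dictionary G (shape (num_words, dim)), plus the
    # index of the nearest codeword.
    d = np.linalg.norm(G - y_t, axis=1)
    j = int(np.argmin(d))
    return float(d[j]), j

G = np.array([[0.0, 0.0],
              [3.0, 4.0]])
d, j = min_distance(np.array([3.0, 0.0]), G)   # distances 3 and 4
```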
⑤-4. Calculate the objective evaluation metric value of each sub-block in {IMF_test(x,y)}. Record the objective evaluation metric value of the tth sub-block in {IMF_test(x,y)} as z_t, z_t = q_t × exp(−d_t/λ), where q_t denotes the visual quality in Q corresponding to the visual dictionary g_{i,k} (1 ≤ i ≤ N, 1 ≤ k ≤ K) in G that yields the minimum Euclidean distance d_t; exp() denotes the exponential function with base e, e = 2.71828183; λ is a control parameter, and λ = 300 is taken in this embodiment.
⑤-5. According to the objective evaluation metric values of all the sub-blocks in {IMF_test(x,y)}, calculate the image quality objective evaluation predicted value of S_test, denoted as Q.
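Steps ⑤-4 and ⑤-5 can be sketched together; the final predicted value is taken here as the mean of the sub-block metrics z_t, which is an assumption of this sketch where the closing formula is not reproduced in the text above:

```python
import numpy as np

def predict_quality(dists, nearest_q, lam=300.0):
    # z_t = q_t * exp(-d_t / lam): visual quality of the nearest codeword,
    # down-weighted by its distance; the final score is taken as the mean
    # of all z_t (assumed aggregation).
    z = np.asarray(nearest_q) * np.exp(-np.asarray(dists) / lam)
    return z, float(z.mean())

# Toy case: both sub-blocks sit exactly on a codeword (distance 0),
# so z equals the codewords' visual qualities.
z, Q = predict_quality(dists=[0.0, 0.0], nearest_q=[0.8, 0.4])
```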
Here, the Ningbo University stereo image library and the LIVE stereo image library are used to analyze the correlation between the image quality objective evaluation predicted values of the blurred and distorted stereo images obtained in this embodiment and the average subjective score differences. Four objective parameters commonly used to assess image quality evaluation methods serve as evaluation indexes: the Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SRCC), the Kendall rank-order correlation coefficient (KRCC) and the root mean square error (RMSE). PLCC and RMSE reflect the accuracy of the objective evaluation results for the distorted stereo images, while SRCC and KRCC reflect their monotonicity.
The method is used to calculate the image quality objective evaluation predicted value of each blurred and distorted stereo image in the Ningbo University stereo image library and in the LIVE stereo image library, and the average subjective score difference of each blurred and distorted stereo image in both libraries is obtained with an existing subjective evaluation method. The image quality objective evaluation predicted values calculated by the method are then fitted with a five-parameter Logistic nonlinear function; the higher the PLCC, SRCC and KRCC values and the lower the RMSE value, the better the correlation between the objective evaluation method and the average subjective score difference. The PLCC, SRCC, KRCC and RMSE coefficients reflecting the quality evaluation performance of the method of the present invention are listed in Table 1. As can be seen from the data listed in Table 1, the correlation between the final image quality objective evaluation predicted values of the blurred and distorted stereo images obtained by this embodiment and the average subjective score differences is very good, indicating that the objective evaluation results are consistent with subjective human perception, which is sufficient to demonstrate the effectiveness of the method of the present invention.
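The four evaluation indexes can be computed as below; these are standard textbook definitions (ties are ignored in the rank computations), not code from the patent:

```python
import numpy as np

def plcc(a, b):
    # Pearson linear correlation coefficient.
    return float(np.corrcoef(a, b)[0, 1])

def srcc(a, b):
    # Spearman rank-order correlation: Pearson correlation of the ranks
    # (no tie handling in this sketch).
    rank = lambda v: np.argsort(np.argsort(v))
    return plcc(rank(np.asarray(a)), rank(np.asarray(b)))

def krcc(a, b):
    # Kendall rank-order correlation: (concordant - discordant) pairs,
    # normalized by the total number of pairs.
    n, s = len(a), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign(a[i] - a[j]) * np.sign(b[i] - b[j])
    return 2.0 * s / (n * (n - 1))

def rmse(a, b):
    # Root mean square error after any nonlinear fitting.
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

pred = np.array([1.0, 2.0, 3.0, 4.0])   # illustrative predicted values
mos  = np.array([1.1, 2.1, 2.9, 4.2])   # illustrative subjective scores
```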
Table 1 correlation between the image quality objective evaluation prediction value of the blurred and distorted stereo image obtained in this embodiment and the average subjective score difference value

Claims (3)

1. A no-reference quality evaluation method for blurred and distorted stereo images, characterized in that it comprises a training stage and a testing stage, and specifically comprises the following steps:
① Select N original undistorted stereo images, and form a training image set from the selected N original undistorted stereo images and the blurred and distorted stereo image corresponding to each original undistorted stereo image, denoted as {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, where S_{i,org} represents the ith original undistorted stereo image in the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, and S_{i,dis} represents the blurred and distorted stereo image corresponding to the ith original undistorted stereo image; then denote the left viewpoint image of S_{i,org} as L_{i,org}, the right viewpoint image of S_{i,org} as R_{i,org}, the left viewpoint image of S_{i,dis} as L_{i,dis}, and the right viewpoint image of S_{i,dis} as R_{i,dis};
② For each blurred and distorted stereo image in the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, perform two-dimensional empirical mode decomposition on its left viewpoint image and its right viewpoint image respectively to obtain the intrinsic mode function image of each of the left and right viewpoint images of each blurred and distorted stereo image; denote the intrinsic mode function image of L_{i,dis} as {IMF_i^{L,dis}(x,y)} and the intrinsic mode function image of R_{i,dis} as {IMF_i^{R,dis}(x,y)}, where 1 ≤ x ≤ W and 1 ≤ y ≤ H, W denotes the width and H the height of {IMF_i^{L,dis}(x,y)} and {IMF_i^{R,dis}(x,y)}, IMF_i^{L,dis}(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in {IMF_i^{L,dis}(x,y)}, and IMF_i^{R,dis}(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in {IMF_i^{R,dis}(x,y)};
then, for each blurred and distorted stereo image in the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, linearly weight the intrinsic mode function images of its left and right viewpoint images to obtain the intrinsic mode function image of each blurred and distorted stereo image; denote the intrinsic mode function image of S_{i,dis} as {IMF_i^{dis}(x,y)}, and record the pixel value of the pixel point at coordinate position (x, y) in {IMF_i^{dis}(x,y)} as IMF_i^{dis}(x,y), IMF_i^{dis}(x,y) = w_L×IMF_i^{L,dis}(x,y) + w_R×IMF_i^{R,dis}(x,y), where w_L is the weight of {IMF_i^{L,dis}(x,y)}, w_R is the weight of {IMF_i^{R,dis}(x,y)}, and w_L + w_R = 1;
③ Perform non-overlapping blocking processing on the intrinsic mode function image of each blurred and distorted stereo image in the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}; then cluster the set formed by all sub-blocks of each intrinsic mode function image with the K-means clustering method to obtain K clusters of each intrinsic mode function image, where K denotes the total number of clusters contained in each intrinsic mode function image; then acquire a visual dictionary table of each intrinsic mode function image from its K clusters; then, according to the visual dictionary tables of all intrinsic mode function images, obtain the visual dictionary table of the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, denoted as G, G = {G_i | 1 ≤ i ≤ N}, where G_i denotes the visual dictionary table of {IMF_i^{dis}(x,y)}, G_i = {g_{i,k} | 1 ≤ k ≤ K}, g_{i,k} denotes the visual dictionary of the kth cluster of {IMF_i^{dis}(x,y)}, and g_{i,k} also denotes the centroid of the kth cluster of {IMF_i^{dis}(x,y)};
④ By calculating the amplitude of each pixel point in each original undistorted stereo image and each blurred and distorted stereo image in the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, obtain the objective evaluation metric value of each pixel point in each blurred and distorted stereo image; then obtain a visual quality table of each blurred and distorted stereo image according to the objective evaluation metric values of its pixel points; then, according to the visual quality tables of all the blurred and distorted stereo images, obtain the visual quality table of the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, denoted as Q, Q = {Q_i | 1 ≤ i ≤ N}, where Q_i denotes the visual quality table of S_{i,dis}, Q_i = {q_{i,k} | 1 ≤ k ≤ K}, and q_{i,k} represents the visual quality of the kth cluster of {IMF_i^{dis}(x,y)};
The visual quality table Q_i of S_{i,dis} in step ④ is acquired as follows:
④-1. Filter L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis} with Gabor filters to obtain the frequency response of each pixel point in L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis} under different center frequencies and different direction factors. Record the frequency response of the pixel point at coordinate position (x, y) in L_{i,org} at center frequency ω and direction factor θ as G_{i,org}^L(x,y;ω,θ) = e_{i,org}^L(x,y;ω,θ) + j·o_{i,org}^L(x,y;ω,θ); record the frequency response of the pixel point at (x, y) in R_{i,org} at center frequency ω and direction factor θ as G_{i,org}^R(x,y;ω,θ) = e_{i,org}^R(x,y;ω,θ) + j·o_{i,org}^R(x,y;ω,θ); record the frequency response of the pixel point at (x, y) in L_{i,dis} at center frequency ω and direction factor θ as G_{i,dis}^L(x,y;ω,θ) = e_{i,dis}^L(x,y;ω,θ) + j·o_{i,dis}^L(x,y;ω,θ); record the frequency response of the pixel point at (x, y) in R_{i,dis} at center frequency ω and direction factor θ as G_{i,dis}^R(x,y;ω,θ) = e_{i,dis}^R(x,y;ω,θ) + j·o_{i,dis}^R(x,y;ω,θ). Here 1 ≤ x ≤ W and 1 ≤ y ≤ H, where W denotes the width and H the height of L_{i,org}, R_{i,org}, L_{i,dis} and R_{i,dis}; ω denotes the center frequency of the Gabor filter, ω ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}; θ denotes the direction factor of the Gabor filter, 1 ≤ θ ≤ 4; e_{i,org}^L(x,y;ω,θ) and o_{i,org}^L(x,y;ω,θ) are the real and imaginary parts of G_{i,org}^L(x,y;ω,θ); e_{i,org}^R(x,y;ω,θ) and o_{i,org}^R(x,y;ω,θ) are the real and imaginary parts of G_{i,org}^R(x,y;ω,θ); e_{i,dis}^L(x,y;ω,θ) and o_{i,dis}^L(x,y;ω,θ) are the real and imaginary parts of G_{i,dis}^L(x,y;ω,θ); e_{i,dis}^R(x,y;ω,θ) and o_{i,dis}^R(x,y;ω,θ) are the real and imaginary parts of G_{i,dis}^R(x,y;ω,θ); j is the imaginary unit;
④-2. According to the frequency responses of the pixel points in L_{i,org} and R_{i,org} at the selected center frequency and the different direction factors, calculate the amplitude of each pixel point in S_{i,org}. Record the amplitude of the pixel point at coordinate position (x, y) in S_{i,org} as LA_i^{org}(x,y) = sqrt((F_i^{org}(x,y))² + (H_i^{org}(x,y))²), where F_i^{org}(x,y) = Σ_{θ=1}^{4} [e_{i,org}^L(x,y;ω_m,θ) + e_{i,org}^R(x,y;ω_m,θ)], H_i^{org}(x,y) = Σ_{θ=1}^{4} [o_{i,org}^L(x,y;ω_m,θ) + o_{i,org}^R(x,y;ω_m,θ)], ω_m is the selected center frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}; e_{i,org}^L(x,y;ω_m,θ) and o_{i,org}^L(x,y;ω_m,θ) denote the real and imaginary parts of the frequency response G_{i,org}^L(x,y;ω_m,θ) of the pixel point at (x, y) in L_{i,org} at center frequency ω_m and direction factor θ, and e_{i,org}^R(x,y;ω_m,θ) and o_{i,org}^R(x,y;ω_m,θ) denote the real and imaginary parts of G_{i,org}^R(x,y;ω_m,θ);
likewise, according to the frequency responses of the pixel points in L_{i,dis} and R_{i,dis} at the selected center frequency and the different direction factors, calculate the amplitude of each pixel point in S_{i,dis}. Record the amplitude of the pixel point at coordinate position (x, y) in S_{i,dis} as LA_i^{dis}(x,y) = sqrt((F_i^{dis}(x,y))² + (H_i^{dis}(x,y))²), where F_i^{dis}(x,y) = Σ_{θ=1}^{4} [e_{i,dis}^L(x,y;ω_m,θ) + e_{i,dis}^R(x,y;ω_m,θ)], H_i^{dis}(x,y) = Σ_{θ=1}^{4} [o_{i,dis}^L(x,y;ω_m,θ) + o_{i,dis}^R(x,y;ω_m,θ)], ω_m is the selected center frequency, ω_m ∈ {1.74, 2.47, 3.49, 4.93, 6.98, 9.87}; e_{i,dis}^L(x,y;ω_m,θ) and o_{i,dis}^L(x,y;ω_m,θ) denote the real and imaginary parts of the frequency response G_{i,dis}^L(x,y;ω_m,θ) of the pixel point at (x, y) in L_{i,dis} at center frequency ω_m and direction factor θ, and e_{i,dis}^R(x,y;ω_m,θ) and o_{i,dis}^R(x,y;ω_m,θ) denote the real and imaginary parts of G_{i,dis}^R(x,y;ω_m,θ);
④-3. According to the amplitudes of the pixel points in S_{i,org} and S_{i,dis}, calculate the objective evaluation metric value of each pixel point in S_{i,dis}. Record the objective evaluation metric value of the pixel point at coordinate position (x, y) in S_{i,dis} as ρ_i(x,y), ρ_i(x,y) = [1 + cos(2·ψ_i(x,y))]/2, ψ_i(x,y) = arccos( (GX_i^{org}(x,y)·GX_i^{dis}(x,y) + GY_i^{org}(x,y)·GY_i^{dis}(x,y) + T_1) / (sqrt((GX_i^{org}(x,y))² + (GY_i^{org}(x,y))²) · sqrt((GX_i^{dis}(x,y))² + (GY_i^{dis}(x,y))²) + T_1) ), where cos() is the cosine function, arccos() is the inverse cosine function, GX_i^{org}(x,y) and GY_i^{org}(x,y) are the horizontal and vertical gradient values of LA_i^{org}(x,y), GX_i^{dis}(x,y) and GY_i^{dis}(x,y) are the horizontal and vertical gradient values of LA_i^{dis}(x,y), and T_1 is a control parameter;
④-4. According to the objective evaluation metric values of the pixel points in S_{i,dis}, obtain the visual quality table of S_{i,dis}, denoted as Q_i, Q_i = {q_{i,k} | 1 ≤ k ≤ K}, where q_{i,k} represents the visual quality of the kth cluster of {IMF_i^{dis}(x,y)} and is taken as the average of the objective evaluation metric values over that cluster, q_{i,k} = (1/N_{i,k}) Σ_{(x,y)∈Ω_k} ρ_i(x,y); Ω_k denotes the set of coordinate positions of the pixel points in S_{i,dis} having the same coordinate positions as all the pixel points included in the kth cluster of {IMF_i^{dis}(x,y)}, and N_{i,k} represents the total number of pixel points included in the kth cluster of {IMF_i^{dis}(x,y)};
⑤ For any test stereo image S_test, according to the visual dictionary table G and the visual quality table Q of the training image set {S_{i,org}, S_{i,dis} | 1 ≤ i ≤ N}, calculate the image quality objective evaluation predicted value of S_test;
The specific process of step ⑤ is as follows:
⑤-1. Denote the left viewpoint image of S_test as L_test and the right viewpoint image of S_test as R_test. Perform two-dimensional empirical mode decomposition on L_test and R_test respectively to obtain their intrinsic mode function images, correspondingly denoted as {IMF_test^L(x,y)} and {IMF_test^R(x,y)}; then linearly weight {IMF_test^L(x,y)} and {IMF_test^R(x,y)} to obtain the intrinsic mode function image of S_test, denoted as {IMF_test(x,y)}. Record the pixel value of the pixel point at coordinate position (x, y) in {IMF_test(x,y)} as IMF_test(x,y), IMF_test(x,y) = w_L'×IMF_test^L(x,y) + w_R'×IMF_test^R(x,y), where 1 ≤ x ≤ W' and 1 ≤ y ≤ H', W' denotes the width and H' the height of {IMF_test^L(x,y)} and {IMF_test^R(x,y)}; IMF_test^L(x,y) denotes the pixel value of the pixel point at (x, y) in {IMF_test^L(x,y)}, and IMF_test^R(x,y) denotes the pixel value of the pixel point at (x, y) in {IMF_test^R(x,y)}; w_L' is the weight of {IMF_test^L(x,y)}, w_R' is the weight of {IMF_test^R(x,y)}, and w_L' + w_R' = 1;
⑤-2. Divide {IMF_test(x,y)} into M' non-overlapping sub-blocks of size 16×16, and denote the set of all sub-blocks in {IMF_test(x,y)} as {y_t | 1 ≤ t ≤ M'}, where M' denotes the total number of sub-blocks, and y_t is the column vector formed by all pixel points in the tth sub-block of {IMF_test(x,y)}; the dimension of y_t is 256;
⑤-3. Calculate the minimum Euclidean distance between each sub-block in {IMF_test(x,y)} and the visual dictionary table G. Record the minimum Euclidean distance between the tth sub-block in {IMF_test(x,y)} and G as d_t, d_t = min_{1≤i≤N, 1≤k≤K} ||y_t − g_{i,k}||, where the symbol "|| ||" denotes the Euclidean distance and min() is the minimum function;
⑤-4. Calculate the objective evaluation metric value of each sub-block in {IMF_test(x,y)}. Record the objective evaluation metric value of the tth sub-block in {IMF_test(x,y)} as z_t, z_t = q_t × exp(−d_t/λ), where q_t denotes the visual quality in Q corresponding to the visual dictionary g_{i,k} (1 ≤ i ≤ N, 1 ≤ k ≤ K) in G that yields the minimum Euclidean distance d_t; exp() denotes the exponential function with base e, e = 2.71828183; λ is a control parameter;
⑤-5. According to the objective evaluation metric values of all the sub-blocks in {IMF_test(x,y)}, calculate the image quality objective evaluation predicted value of S_test, denoted as Q.
2. The no-reference quality evaluation method for blurred and distorted stereo images as claimed in claim 1, characterized in that in step ②, w_L = 0.9 and w_R = 0.1 are taken.
3. The no-reference quality evaluation method for blurred and distorted stereo images as claimed in claim 1 or 2, characterized in that the visual dictionary table G_i of {IMF_i^{dis}(x,y)} in step ③ is acquired as follows:
③-1. Divide {IMF_i^{dis}(x,y)} into M non-overlapping sub-blocks of size 16×16, and denote the set of all sub-blocks in {IMF_i^{dis}(x,y)} as {x_{i,t} | 1 ≤ t ≤ M}, where M denotes the total number of sub-blocks, and x_{i,t} is the column vector formed by all pixel points in the tth sub-block of {IMF_i^{dis}(x,y)}; the dimension of x_{i,t} is 256;
③-2. Perform a clustering operation on {x_{i,t} | 1 ≤ t ≤ M} with the K-means clustering method to obtain K clusters of {IMF_i^{dis}(x,y)}; then take the centroid of each cluster of {IMF_i^{dis}(x,y)} as a visual dictionary, obtaining the visual dictionary table of {IMF_i^{dis}(x,y)}, denoted as G_i, G_i = {g_{i,k} | 1 ≤ k ≤ K}, where K denotes the total number of clusters contained in {IMF_i^{dis}(x,y)}, g_{i,k} denotes the visual dictionary of the kth cluster of {IMF_i^{dis}(x,y)}, g_{i,k} also denotes the centroid of the kth cluster of {IMF_i^{dis}(x,y)}, and the dimension of g_{i,k} is 256.
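The dictionary construction of claim 3 (K-means on the sub-block vectors, with the cluster centroids taken as codewords) can be sketched with a plain Lloyd's-algorithm loop; the initialization scheme, iteration count, and the 2-dimensional toy data are assumptions of this example:

```python
import numpy as np

def kmeans_dictionary(X, K, iters=20, seed=0):
    # Plain K-means (Lloyd's algorithm) on the sub-block vectors X,
    # shape (num_blocks, dim); the K centroids serve as the codewords
    # of the visual dictionary table.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        # assign every vector to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):               # move centroids to the cluster means
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.1, (20, 2)),   # two well-separated blobs
               rng.normal(5, 0.1, (20, 2))])
G_dict, labels = kmeans_dictionary(X, K=2)
```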
CN201410104299.5A 2014-03-20 2014-03-20 A kind of reference-free quality evaluation method for fuzzy distortion stereo-picture Active CN103914835B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410104299.5A CN103914835B (en) 2014-03-20 2014-03-20 A kind of reference-free quality evaluation method for fuzzy distortion stereo-picture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410104299.5A CN103914835B (en) 2014-03-20 2014-03-20 A kind of reference-free quality evaluation method for fuzzy distortion stereo-picture

Publications (2)

Publication Number Publication Date
CN103914835A CN103914835A (en) 2014-07-09
CN103914835B true CN103914835B (en) 2016-08-17

Family

ID=51040491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410104299.5A Active CN103914835B (en) 2014-03-20 2014-03-20 A kind of reference-free quality evaluation method for fuzzy distortion stereo-picture

Country Status (1)

Country Link
CN (1) CN103914835B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104240248B (en) * 2014-09-12 2017-05-03 宁波大学 Method for objectively evaluating quality of three-dimensional image without reference
CN104240255A (en) * 2014-09-23 2014-12-24 上海交通大学 Stereo image quality evaluation method based on nonlinear ocular dominance parallax compensation
CN104820988B (en) * 2015-05-06 2017-12-15 宁波大学 One kind is without with reference to objective evaluation method for quality of stereo images
CN105243385B (en) * 2015-09-23 2018-11-09 宁波大学 A kind of image quality evaluating method based on unsupervised learning
CN105611285B (en) * 2015-12-25 2017-06-16 浙江科技学院 General non-reference picture quality appraisement method based on phase selective mechanism

Citations (4)

Publication number Priority date Publication date Assignee Title
CN102209257A (en) * 2011-06-17 2011-10-05 宁波大学 Stereo image quality objective evaluation method
CN102609718A (en) * 2012-01-15 2012-07-25 江西理工大学 Method for generating vision dictionary set by combining different clustering algorithms
CN102708567A (en) * 2012-05-11 2012-10-03 宁波大学 Visual perception-based three-dimensional image quality objective evaluation method
CN103413283A (en) * 2013-07-12 2013-11-27 西北工业大学 Multi-focus image fusion method based on two-dimensional EMD and improved local energy

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US9030530B2 (en) * 2009-12-15 2015-05-12 Thomson Licensing Stereo-image quality and disparity/depth indications
US20120044323A1 (en) * 2010-08-20 2012-02-23 Texas Instruments Incorporated Method and Apparatus for 3D Image and Video Assessment

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
CN102209257A (en) * 2011-06-17 2011-10-05 宁波大学 Stereo image quality objective evaluation method
CN102609718A (en) * 2012-01-15 2012-07-25 江西理工大学 Method for generating vision dictionary set by combining different clustering algorithms
CN102708567A (en) * 2012-05-11 2012-10-03 宁波大学 Visual perception-based three-dimensional image quality objective evaluation method
CN103413283A (en) * 2013-07-12 2013-11-27 西北工业大学 Multi-focus image fusion method based on two-dimensional EMD and improved local energy

Non-Patent Citations (4)

Title
A NEW OBJECTIVE STEREOSCOPIC IMAGE ASSESSMENT MODEL BASED ON STEREOSCOPIC PERCEPTION; Zhu Jiangying et al.; JOURNAL OF ELECTRONICS (CHINA); 2013-10; Vol. 30, No. 5; pp. 469-475 *
BEMD-based no-reference quality assessment method for blurred and distorted stereoscopic images; Wang Shanshan et al.; Opto-Electronic Engineering; 2013-09; Vol. 40, No. 9; pp. 28-34 *
EMD-based no-reference image sharpness assessment method; He Jinping et al.; Spacecraft Recovery & Remote Sensing; 2013-10; Vol. 34, No. 5; pp. 78-84 *
Region-based bidimensional empirical mode decomposition image fusion algorithm; Han Bo et al.; Infrared Technology; 2013-09; Vol. 35, No. 9; pp. 546-550 *

Also Published As

Publication number Publication date
CN103914835A (en) 2014-07-09

Similar Documents

Publication Publication Date Title
CN104036501B (en) A kind of objective evaluation method for quality of stereo images based on rarefaction representation
CN103581661B (en) Method for evaluating visual comfort degree of three-dimensional image
CN104036502B (en) A kind of without with reference to fuzzy distortion stereo image quality evaluation methodology
CN104658001B (en) Non-reference asymmetric distorted stereo image objective quality assessment method
CN104023230B (en) A kind of non-reference picture quality appraisement method based on gradient relevance
CN104581143B (en) A kind of based on machine learning without with reference to objective evaluation method for quality of stereo images
CN103914835B (en) A kind of reference-free quality evaluation method for fuzzy distortion stereo-picture
CN104240248B (en) Method for objectively evaluating quality of three-dimensional image without reference
CN109255358B (en) 3D image quality evaluation method based on visual saliency and depth map
CN105744256A (en) Three-dimensional image quality objective evaluation method based on graph-based visual saliency
CN105574901B (en) A kind of general non-reference picture quality appraisement method based on local contrast pattern
CN105282543B (en) Total blindness three-dimensional image quality objective evaluation method based on three-dimensional visual perception
CN104408716A (en) Three-dimensional image quality objective evaluation method based on visual fidelity
Yan et al. Blind stereoscopic image quality assessment by deep neural network of multi-level feature fusion
CN103413298B (en) A kind of objective evaluation method for quality of stereo images of view-based access control model characteristic
CN104902268B (en) Based on local tertiary mode without with reference to three-dimensional image objective quality evaluation method
CN105243385B (en) A kind of image quality evaluating method based on unsupervised learning
CN111292336B (en) Omnidirectional image non-reference quality evaluation method based on segmented spherical projection format
Karimi et al. Blind stereo quality assessment based on learned features from binocular combined images
CN107360416A (en) Stereo image quality evaluation method based on local multivariate Gaussian description
CN116758019A (en) Multi-exposure fusion light field image quality evaluation method based on dynamic and static region division
CN106683079B (en) A kind of non-reference picture method for evaluating objective quality based on structure distortion
CN105898279B (en) A kind of objective evaluation method for quality of stereo images
CN108492275B (en) No-reference stereo image quality evaluation method based on deep neural network
CN106210710A (en) A kind of stereo image vision comfort level evaluation methodology based on multi-scale dictionary

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20191230

Address after: Room 1,020, Nanxun Science and Technology Pioneering Park, No. 666 Chaoyang Road, Nanxun District, Huzhou City, Zhejiang Province, 313000

Patentee after: Huzhou You Yan Intellectual Property Service Co.,Ltd.

Address before: 315211 Zhejiang Province, Ningbo Jiangbei District Fenghua Road No. 818

Patentee before: Ningbo University

TR01 Transfer of patent right

Effective date of registration: 20200603

Address after: Room 501, office building, market supervision and Administration Bureau, Langchuan Avenue, Jianping Town, Langxi County, Xuancheng City, Anhui Province, 230000

Patentee after: Langxi pinxu Technology Development Co.,Ltd.

Address before: Room 1,020, Nanxun Science and Technology Pioneering Park, No. 666 Chaoyang Road, Nanxun District, Huzhou City, Zhejiang Province, 313000

Patentee before: Huzhou You Yan Intellectual Property Service Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20221202

Address after: 276000 B303, B304, Longhu Software Park, Linyi Hi tech Industrial Development Zone, Shandong Province

Patentee after: Shandong Lixin Information Technology Consulting Co.,Ltd.

Address before: 230000 Room 501, office building, market supervision and Administration Bureau, Langchuan Avenue, Jianping Town, Langxi County, Xuancheng City, Anhui Province

Patentee before: Langxi pinxu Technology Development Co.,Ltd.

CP01 Change in the name or title of a patent holder

Address after: 276000 B303, B304, Longhu Software Park, Linyi Hi tech Industrial Development Zone, Shandong Province

Patentee after: Shandong Lixin Huachuang Big Data Technology Co.,Ltd.

Address before: 276000 B303, B304, Longhu Software Park, Linyi Hi tech Industrial Development Zone, Shandong Province

Patentee before: Shandong Lixin Information Technology Consulting Co.,Ltd.