CN111368875A

CN111368875A - Stacking-based no-reference super-resolution image quality evaluation method

Info

Publication number: CN111368875A
Application number: CN202010086355.2A
Authority: CN
Inventors: 张凯兵; 朱丹妮; 罗爽; 卢健; 李敏奇; 刘薇; 苏泽斌; 景军锋; 陈小改
Original assignee: Xian Polytechnic University
Current assignee: Shenzhen Wanzhida Technology Co ltd; Zhejiang Xinwei Electronic Technology Co ltd
Priority date: 2020-02-11
Filing date: 2020-02-11
Publication date: 2020-07-03
Anticipated expiration: 2040-02-11
Also published as: CN111368875B

Abstract

The invention discloses a stacking-based no-reference super-resolution image quality evaluation method, which specifically comprises the following steps: first, deep features of the super-resolution image are extracted by an existing pre-trained VGGnet model to quantify the degradation of the super-resolution image; then, a stacking regression algorithm comprising the SVR algorithm and the k-NN algorithm is used as the first-layer regression model to construct a mapping from the deep features extracted by the VGGnet model to predicted quality scores, and a linear regression algorithm is used to obtain the second-layer regression model, thereby forming the stacking regression model and realizing no-reference evaluation of super-resolution image quality. The invention exploits the complementary advantages of two different base regressors, SVR and k-NN, and uses linear regression as the meta-regressor to improve prediction accuracy, so that super-resolution image quality scores closer to subjective human evaluation can be obtained.

Description

Stacking-based no-reference super-resolution image quality evaluation method

Technical Field

The invention belongs to the technical field of image processing and analysis methods, and relates to a stacking-based no-reference super-resolution image quality evaluation method.

Background

Single-frame image super-resolution reconstruction is a technique that generates a high-resolution image with finer details from the information in one or more low-resolution input images. It is widely applied in image processing, computer vision and related fields. With the emergence of a large number of super-resolution reconstruction algorithms, how to evaluate these algorithms has become a key research problem. Human vision is the ultimate receiver of images, so subjective quality assessment is the most direct and effective way to reflect super-resolution image quality. However, subjective quality assessment is time-consuming and labor-intensive, and it cannot be integrated into a super-resolution reconstruction system for use in real-world scenarios. Objective quality evaluation methods have therefore been developed.

At present, the performance of super-resolution algorithms is mainly evaluated by objective image quality evaluation methods, which fall into three major categories: full-reference image quality assessment (FRIQA), reduced-reference image quality assessment (RRIQA), and no-reference image quality assessment (NRIQA). Full-reference methods, such as mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity (SSIM) and the information fidelity criterion (IFC), all require the original high-resolution image as a reference against which the super-resolution image under test is measured. However, the results of these full-reference methods sometimes disagree with human subjective evaluation, and the required original high-resolution images cannot be obtained in practical applications. These conventional full-reference methods are therefore not suitable for evaluating the quality of super-resolution images. Reduced-reference methods use only partial information about the reference image, which reduces the amount of data to be transmitted and eases practical application. Although they are more flexible than full-reference methods, they still require information about the original high-resolution image. No-reference methods overcome the shortcoming shared by these two categories, namely the need for the original high-resolution image, and have attracted wide attention from researchers.
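To make the dependence of full-reference metrics on the original image concrete, the following minimal sketch computes MSE and PSNR for two small grayscale patches. The patch values, the function names and the 8-bit peak value of 255 are illustrative assumptions, not part of the patent.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    if len(ref_flat) != len(test_flat):
        raise ValueError("images must have the same number of pixels")
    return sum((r - t) ** 2 for r, t in zip(ref_flat, test_flat)) / len(ref_flat)

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 2x2 patches: the original high-resolution image and a super-resolved result.
hr = [[10, 20], [30, 40]]
sr = [[12, 18], [30, 44]]
print(mse(hr, sr), psnr(hr, sr))
```

Both functions take the reference image as an argument, which is exactly the input that is missing when a super-resolution result has to be assessed in practice.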

Most existing no-reference quality evaluation methods describe the degree of image degradation with traditional hand-crafted statistical features. Feature representation, however, is one of the key links in image quality evaluation, and hand-crafted features limit further performance gains of the evaluation model, so it remains difficult to evaluate super-resolution image quality accurately and effectively.

Disclosure of Invention

The invention aims to provide a stacking-based no-reference super-resolution image quality evaluation method, which solves the problem that prior-art evaluation methods have difficulty evaluating super-resolution image quality accurately and effectively.

The technical scheme adopted by the invention is a stacking-based no-reference super-resolution image quality evaluation method, which specifically comprises the following steps: first, deep features of the super-resolution image are extracted by an existing pre-trained VGGnet model to quantify the degradation of the super-resolution image; then, a stacking regression algorithm comprising the SVR algorithm and the k-NN algorithm is used as the first-layer regression model to construct a mapping from the deep features extracted by the VGGnet model to predicted quality scores, and a linear regression algorithm is used to obtain the second-layer regression model, thereby forming the stacking regression model and realizing no-reference evaluation of super-resolution image quality.

The invention is also characterized in that:

the method specifically comprises the following steps:

Step 1, in the feature extraction stage, the initial training-set images are input into an existing pre-trained VGGnet model, and the deep features output by the seventh layer (a fully connected layer) are collected to form the feature training set D; meanwhile, the initial validation-set images are input into the same pre-trained VGGnet model to obtain their seventh-layer fully connected deep features, forming the validation set V;

Step 2, the feature training set D of step 1 is divided into k equally sized, non-overlapping training subsets $D_1, D_2, \dots, D_k$;

Step 3, for j = 1, 2, ..., k, let $D \setminus D_j$ (the feature training set D with the subset $D_j$ held out) be the training set and $D_j$ the test set. The SVR algorithm is applied to each training set $D \setminus D_j$ to train k SVR base regressors $L_j^{SVR}$, and each SVR base regressor outputs its predicted values on the test set $D_j$. The k sets of predicted values are stacked to obtain the SVR meta-training set. Meanwhile, the data in the validation set V are input into each SVR base regressor to obtain its validation-set predicted values $V_j^{SVR}$ on V, and the arithmetic mean of these k outputs is taken to obtain the SVR meta-validation set;

Step 4, the k-NN algorithm is applied to each training set $D \setminus D_j$ (the feature training set D with the subset $D_j$ held out) to train k k-NN base regressors $L_j^{KNN}$, and each k-NN base regressor outputs its predicted values on the test set $D_j$. The k sets of predicted values are stacked to obtain the k-NN meta-training set. Meanwhile, the data in the validation set V are input into each k-NN base regressor to obtain its validation-set predicted values $V_j^{KNN}$ on V, and the arithmetic mean of these k outputs is taken to obtain the k-NN meta-validation set;

Step 5, the SVR meta-training set obtained in step 3 and the k-NN meta-training set obtained in step 4 are combined column by column into a matrix whose height is the size of the training set and whose width is the number of algorithms; together with the subjective quality scores obtained from the existing database, this matrix forms the second-layer training set Train, i.e. the training set of the meta-regressor;

Step 6, a linear regressor is used as the meta-regressor, and the training set obtained in step 5 is input into the meta-regressor for training to obtain the meta-regression model;

Step 7, the SVR meta-validation set obtained in step 3 and the k-NN meta-validation set obtained in step 4 are combined column by column into a matrix whose height is the size of the validation set and whose width is the number of algorithms, forming the second-layer validation set Test, i.e. the validation set of the meta-regressor;

Step 8, the validation set of the meta-regressor obtained in step 7 is input into the meta-regression model trained in step 6 to obtain the final prediction result.
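Steps 2 to 8 can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the patented implementation: the random arrays stand in for VGGnet deep features and subjective scores, the function name `stacking_predict` is hypothetical, and the hyperparameters (k = 5 folds, 5 neighbors, default RBF-kernel SVR settings) are arbitrary choices that the patent does not specify.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

def stacking_predict(D, q, V, k=5, seed=0):
    """Predict scores for V with an SVR + k-NN first layer and a linear meta-regressor."""
    makers = [lambda: SVR(kernel="rbf"), lambda: KNeighborsRegressor(n_neighbors=5)]
    # Height: number of samples; width: number of algorithms (steps 5 and 7).
    meta_train = np.zeros((len(D), len(makers)))
    meta_valid = np.zeros((len(V), len(makers)))
    folds = KFold(n_splits=k, shuffle=True, random_state=seed)   # step 2
    for col, make in enumerate(makers):                          # steps 3 and 4
        for train_idx, test_idx in folds.split(D):
            model = make().fit(D[train_idx], q[train_idx])
            # Predictions on the held-out subset D_j, stacked into one column.
            meta_train[test_idx, col] = model.predict(D[test_idx])
            # Arithmetic mean of the k regressors' predictions on V.
            meta_valid[:, col] += model.predict(V) / k
    meta_model = LinearRegression().fit(meta_train, q)           # step 6
    return meta_model.predict(meta_valid)                        # step 8

# Synthetic stand-ins for deep features and subjective quality scores.
rng = np.random.default_rng(0)
D = rng.normal(size=(100, 8))
q = 2.0 * D[:, 0] + rng.normal(scale=0.1, size=100)
V = rng.normal(size=(20, 8))
pred = stacking_predict(D, q, V)
print(pred.shape)
```

Because each entry of the second-layer training matrix is predicted by a regressor that never saw that sample, the meta-regressor is trained on predictions that resemble what it will receive at test time.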

In the SVR base regressor of step 3, the relationship between the feature $x_i$ and the subjective quality score $q_i$ of the image is expressed as follows:

$$q_i = \langle w, \phi(x_i) \rangle + b$$

where w and b denote the weight vector and the bias of the features respectively, both learned from the training set; $\phi(\cdot)$ is the nonlinear mapping associated with the kernel function, which maps the original low-dimensional data into a high-dimensional space.

The kernel function is the radial basis function (RBF), expressed as follows:

$$k(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$$

where σ is the standard deviation of the RBF kernel, and $x_i$ and $x_j$ are the i-th and j-th deep features of the image samples.
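As a small numerical illustration, the sketch below evaluates an RBF kernel for two feature vectors. It assumes the common Gaussian form exp(-||x_i - x_j||^2 / (2σ^2)), with σ read as the standard deviation; the function name and the sample vectors are hypothetical.

```python
import math

def rbf_kernel(x_i, x_j, sigma=1.0):
    """Gaussian RBF kernel exp(-||x_i - x_j||^2 / (2 * sigma^2)) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Identical features give the maximum similarity of 1; distant features decay towards 0.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=5.0))
```

A larger σ makes the kernel decay more slowly with distance, so more training samples influence each prediction.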

The algorithm in the k-NN base regressor of step 4 is specifically as follows:

Let $x = \{x_1, x_2, \dots, x_m\}$, $x_i \in \mathbb{R}^n$, denote the features of the m training-set images, let $q_1, q_2, \dots, q_m$ denote the corresponding subjective scores, and let $y_j$ be the feature vector of the j-th test image. First, a distance metric is used to measure the similarity between the feature vectors of all training-set images and the feature vector of the j-th test image, and the subjective quality scores corresponding to the k nearest training-set image features are found. The average of these subjective quality scores is then taken as the prediction result $\hat{q}_j$ for the j-th test image:

$$\hat{q}_j = \frac{1}{k} \sum_{i \in N_k(y_j)} q_i$$

where $N_k(y_j)$ is the index set of the k training features closest to $y_j$.

The distance metric is the Euclidean distance:

$$d(x_i, y_j) = \sqrt{\sum_{l=1}^{n} (x_{i,l} - y_{j,l})^2}$$

where n is the dimension of the feature vector.
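The k-NN regression described above can be sketched in a few lines of plain Python. The function names, the toy two-dimensional features and the scores are illustrative assumptions; real deep features would have thousands of dimensions.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train_x, train_q, y, k=3):
    """Average the subjective scores of the k training features nearest to y."""
    if k > len(train_x):
        raise ValueError("k cannot exceed the number of training samples")
    order = sorted(range(len(train_x)), key=lambda i: euclidean(train_x[i], y))
    nearest = order[:k]
    return sum(train_q[i] for i in nearest) / k

# Toy training features with their subjective scores, and one test feature.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
train_q = [1.0, 2.0, 3.0, 10.0]
print(knn_predict(train_x, train_q, [0.1, 0.1], k=3))
```

The distant sample with score 10.0 is excluded from the neighborhood, so the prediction is the mean of the three nearby scores.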

The invention has the beneficial effects that:

(1) The invention provides an effective no-reference method for evaluating the quality of super-resolution images, in which the deep features of a super-resolution image are extracted by a pre-trained VGGnet model.

(2) The invention exploits the complementary advantages of two different base regressors, SVR and k-NN, to complete a preliminary mapping between deep features and quality scores. In addition, simple linear regression is used as the meta-regressor to further improve the prediction accuracy of the whole model. Compared with existing regression models, the proposed algorithm represents the mapping between features and quality scores effectively.

(3) Simulation results show that compared with the existing no-reference image quality evaluation method, the super-resolution image quality score closer to human eye subjective evaluation can be obtained.

Drawings

FIG. 1 is a flow chart of the stacking-based no-reference super-resolution image quality evaluation method of the present invention;

FIG. 2 is a diagram of the feature extraction model in the stacking-based no-reference super-resolution image quality evaluation method of the present invention;

FIG. 3 shows scatter diagrams comparing the test results of the stacking-based no-reference super-resolution image quality evaluation method of the present invention with those of conventional image quality evaluation methods;

FIG. 4 shows part of the experimental results obtained by testing super-resolution images in the database with the stacking-based no-reference super-resolution image quality evaluation method of the present invention;

FIG. 5 shows the experimental results obtained by applying the stacking-based no-reference super-resolution image quality evaluation method of the present invention to super-resolution images reconstructed with an up-sampling factor of 3.

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

The invention discloses a stacking-based no-reference super-resolution image quality evaluation method which, as shown in FIGS. 1 and 2, mainly comprises two stages, feature extraction and stacking regression. Specifically: first, deep features of the super-resolution image are extracted by an existing pre-trained VGGnet model to quantify the degradation of the super-resolution image; then, a stacking regression algorithm comprising the SVR algorithm and the k-NN algorithm is used as the first-layer regression model to construct a mapping from the deep features extracted by the VGGnet model to predicted quality scores, and a linear regression algorithm is used to obtain the second-layer regression model, thereby forming the stacking regression model and realizing no-reference evaluation of super-resolution image quality.

The invention relates to a stacking-based no-reference super-resolution image quality evaluation method, which specifically comprises the following steps of:

Step 1, in the feature extraction stage, the initial training-set images are input into an existing pre-trained VGGnet model, and the deep features output by the seventh layer (a fully connected layer) are collected to form the feature training set D; meanwhile, the initial validation-set images are input into the same pre-trained VGGnet model to obtain their seventh-layer fully connected deep features, forming the original validation set V;

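Step 2, the feature training set D of step 1 is divided into k equally sized, non-overlapping training subsets $D_1, D_2, \dots, D_k$;

Step 3, for j = 1, 2, ..., k, let $D \setminus D_j$ (the feature training set D with the subset $D_j$ held out) be the training set and $D_j$ the test set. The SVR algorithm is applied to each training set $D \setminus D_j$ to train k SVR base regressors $L_j^{SVR}$, and each SVR base regressor outputs its predicted values on the test set $D_j$. The k sets of predicted values are stacked to obtain the SVR meta-training set. Meanwhile, the data in the validation set V are input into each SVR base regressor to obtain its validation-set predicted values $V_j^{SVR}$ on V, and the arithmetic mean of these k outputs is taken to obtain the SVR meta-validation set;

Step 4, the k-NN algorithm is applied to each training set $D \setminus D_j$ to train k k-NN base regressors $L_j^{KNN}$, and each k-NN base regressor outputs its predicted values on the test set $D_j$. The k sets of predicted values are stacked to obtain the k-NN meta-training set. Meanwhile, the data in the validation set V are input into each k-NN base regressor to obtain its validation-set predicted values $V_j^{KNN}$ on V, and the arithmetic mean of these k outputs is taken to obtain the k-NN meta-validation set;

Step 5, the SVR meta-training set obtained in step 3 and the k-NN meta-training set obtained in step 4 are combined column by column into a matrix whose height is the size of the training set and whose width is the number of algorithms; together with the subjective quality scores obtained from the existing database, this matrix forms the second-layer training set Train, i.e. the training set of the meta-regressor;

Step 6, a linear regressor is used as the meta-regressor, and the training set obtained in step 5 is input into the meta-regressor for training to obtain the meta-regression model;

Step 7, the SVR meta-validation set obtained in step 3 and the k-NN meta-validation set obtained in step 4 are combined column by column into a matrix whose height is the size of the validation set and whose width is the number of algorithms, forming the second-layer validation set Test, i.e. the validation set of the meta-regressor;

Step 8, the validation set of the meta-regressor obtained in step 7 is input into the meta-regression model trained in step 6 to obtain the final prediction result.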
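The second layer (steps 5 to 8) amounts to assembling two-column matrices and fitting an ordinary least-squares model. The sketch below shows this with NumPy; the helper names and all numbers are made up for illustration, and the toy scores are constructed so that the fitted weights are easy to check by hand.

```python
import numpy as np

def build_meta_matrix(*columns):
    """Place per-algorithm prediction vectors side by side: rows are images, columns are algorithms."""
    return np.column_stack(columns)

def fit_linear_meta(train_matrix, scores):
    """Least-squares fit of scores ~ train_matrix @ w + b; returns (w, b)."""
    A = np.hstack([train_matrix, np.ones((train_matrix.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, scores, rcond=None)
    return coef[:-1], coef[-1]

# Hypothetical first-layer predictions on four training images.
svr_train = np.array([3.0, 5.0, 7.0, 9.0])
knn_train = np.array([2.0, 6.0, 6.0, 10.0])
scores = 0.5 * svr_train + 0.5 * knn_train + 1.0      # toy subjective scores
w, b = fit_linear_meta(build_meta_matrix(svr_train, knn_train), scores)

# Hypothetical predictions of k = 2 base regressors on two validation images,
# averaged over the regressors before forming the second-layer matrix Test.
svr_valid_folds = np.array([[4.0, 8.0], [6.0, 8.0]])
knn_valid_folds = np.array([[5.0, 7.0], [5.0, 9.0]])
test_matrix = build_meta_matrix(svr_valid_folds.mean(axis=0), knn_valid_folds.mean(axis=0))
final = test_matrix @ w + b
print(w, b, final)
```

The learned weights indicate how much the meta-regressor trusts each base regressor, which is how the complementary strengths of SVR and k-NN are combined.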

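In the first-layer stacking regression, SVR is used as one of the base regressors to roughly estimate the perceptual scores of SR images. In the SVR base regressor of step 3, the relationship between the feature $x_i$ and the subjective quality score $q_i$ of the image is expressed as follows:

$$q_i = \langle w, \phi(x_i) \rangle + b$$

where w and b denote the weight vector and the bias of the features respectively, both learned from the training set; $\phi(\cdot)$ is the nonlinear mapping associated with the kernel function, which maps the original low-dimensional data into a high-dimensional space. The kernel function is the radial basis function (RBF):

$$k(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$$

where σ is the standard deviation of the RBF kernel, and $x_i$ and $x_j$ are the i-th and j-th deep features of the image samples. By training with the SVR algorithm inside the stacking structure, the quality scores of all training-set images can be predicted through five-fold cross-validation.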
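The five-fold scheme mentioned above, in which every training image receives a prediction from a regressor that was not trained on it, can be sketched as follows. The helper names are hypothetical, the sketch assumes the sample count is divisible by k, and a trivial mean predictor stands in for SVR so that the example stays dependency-free.

```python
def kfold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k equal, non-overlapping contiguous blocks."""
    if n_samples % k != 0:
        raise ValueError("this sketch assumes n_samples is divisible by k")
    size = n_samples // k
    for j in range(k):
        test = list(range(j * size, (j + 1) * size))
        train = [i for i in range(n_samples) if i < j * size or i >= (j + 1) * size]
        yield train, test

def out_of_fold_predictions(features, scores, fit, k=5):
    """Predict each sample with the regressor trained on the folds that exclude it."""
    oof = [None] * len(features)
    for train, test in kfold_indices(len(features), k):
        predict = fit([features[i] for i in train], [scores[i] for i in train])
        for i in test:
            oof[i] = predict(features[i])
    return oof

def fit_mean(train_x, train_q):
    """Placeholder regressor (not SVR): always predicts the mean training score."""
    mean = sum(train_q) / len(train_q)
    return lambda x: mean

feats = [[float(i)] for i in range(10)]
scores = [float(i) for i in range(10)]
oof = out_of_fold_predictions(feats, scores, fit_mean, k=5)
print(oof)
```

Replacing `fit_mean` with a function that trains an SVR (or a k-NN regressor) and returns its prediction function gives one column of the second-layer training set.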

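k-NN regression obtains a rough prediction by averaging the values of the k nearest neighbors. The algorithm in the k-NN base regressor of step 4 is specifically as follows:

Let $x = \{x_1, x_2, \dots, x_m\}$, $x_i \in \mathbb{R}^n$, denote the features of the m training-set images, let $q_1, q_2, \dots, q_m$ denote the corresponding subjective scores, and let $y_j$ be the feature vector of the j-th test image. A distance metric is used to measure the similarity between the feature vectors of all training-set images and the feature vector of the j-th test image, the subjective quality scores corresponding to the k nearest training-set image features are found, and their average is taken as the prediction result $\hat{q}_j$ for the j-th test image:

$$\hat{q}_j = \frac{1}{k} \sum_{i \in N_k(y_j)} q_i$$

where $N_k(y_j)$ is the index set of the k training features closest to $y_j$.

The distance metric is the Euclidean distance:

$$d(x_i, y_j) = \sqrt{\sum_{l=1}^{n} (x_{i,l} - y_{j,l})^2}$$

where n is the dimension of the feature vector.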

The evaluation performance of the present invention was verified by a simulation experiment as follows. The specific simulation content is as follows:

Simulation experiment I: In image quality evaluation, the consistency between objective prediction scores and subjective quality scores is one of the important criteria for assessing an evaluation model. In a rectangular coordinate system whose abscissa is the subjective quality score and whose ordinate is the objective prediction score, each point corresponds to one test image, and all images of the final test set form a scatter diagram. A logistic function is used for nonlinear fitting, and the fitting function is as follows:

$$Q(x) = \alpha_1 \left( \frac{1}{2} - \frac{1}{1 + \exp\left(\alpha_2 (x - \alpha_3)\right)} \right) + \alpha_4 x + \alpha_5$$

In the formula, x is the raw quality score produced by the objective evaluation algorithm, Q(x) is the quality score after nonlinear regression mapping, and $\alpha_i$ (i = 1, 2, ..., 5) are regression fitting parameters that can be estimated with the nlinfit function in MATLAB.
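For illustration, the sketch below evaluates a five-parameter logistic mapping of the kind commonly used in image quality assessment studies. The exact fitting function of the patent is not recoverable from this text, so the form used here, the function name `logistic5` and the parameter values are assumptions.

```python
import math

def logistic5(x, a1, a2, a3, a4, a5):
    """Five-parameter logistic mapping: a1 * (1/2 - 1/(1 + exp(a2 * (x - a3)))) + a4 * x + a5."""
    return a1 * (0.5 - 1.0 / (1.0 + math.exp(a2 * (x - a3)))) + a4 * x + a5

# Hypothetical parameters; in practice they would come from a nonlinear least-squares fit.
params = (1.0, 1.0, 2.0, 0.5, 3.0)
for raw in (1.0, 2.0, 3.0):
    print(raw, logistic5(raw, *params))
```

At x equal to the third parameter the logistic term vanishes, leaving only the linear part, which gives a quick sanity check on fitted parameters.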

The smaller the scatter of the points around the fitted curve, the higher the consistency between the prediction scores and the subjective evaluation scores, and the better the performance of the algorithm. As shown in FIG. 3, 80% of the super-resolution image data set is used as the training set and the remaining 20% as the test set for the consistency comparison experiment, and five other representative image quality evaluation methods are compared with the simulation results of the invention to verify its subjective-objective consistency. As is apparent from FIG. 3, the scatter points in FIG. 3(f) lie closer to the fitted curve than those in FIG. 3(a), (b), (c), (d) and (e); the method of the invention, corresponding to FIG. 3(f), therefore shows better consistency than the other no-reference image quality evaluation methods. The five representative image quality evaluation methods are respectively as follows: the BLIINDS method proposed by Saad et al (IEEE Trans. on Image Processing, 2012: 3339-. The data set used by the invention is taken from the paper describing the method of Ma et al. and consists mainly of 1620 super-resolution images and the subjective quality scores corresponding to these images.

FIG. 4 shows evaluation results from the training and validation experiment on the super-resolution data set, giving for each image the subjective quality score, the prediction of the method of Ma et al., and the prediction obtained in the simulation of the present invention. For the super-resolution images with richer details in the first two columns, the method of Ma et al. predicts well, but the predictions of the present invention are closer to the subjective scores. For the super-resolution images with less detail in the last two columns, the predictions of the present invention are still better than those of Ma et al. In particular, for the super-resolution image in the last column, whose quality is poor, the proposed method produces a prediction score more consistent with the subjective score and has an obvious advantage over the method of Ma et al. This is because the deep-learning-based features used by the invention help to quantify the quality of super-resolution images accurately. In addition, a heterogeneous stacking regression model is used to establish the mapping between the deep features and the subjective scores, which improves prediction accuracy.

Simulation experiment II: To further verify the effectiveness of the no-reference super-resolution image quality evaluation method, 24 super-resolution images outside the super-resolution image database used to train the model were selected according to overall image sharpness, local texture richness and structural plausibility. These super-resolution images were obtained by applying six different super-resolution methods (A+, ANR, CNN, MoE, FD, SERF) to four low-resolution images with an up-sampling factor of 3. The six super-resolution methods are respectively as follows: the method proposed by Timofte et al., referred to as A+ (Asian Conference on Computer Vision, 2014:111-126); the method proposed by Timofte et al., referred to as ANR (ICCV, 2013:1920-1927); the method proposed by Dong et al., referred to as CNN (Proc. European Conf. Comput. Vis., 2014:184-199); the method proposed by Zhang et al., referred to as MoE (IEEE Signal Processing Letters, 2015:102-106); the method proposed by Hu et al., referred to as SERF (IEEE Trans. Image Processing, 2016:4091-4102); and the method proposed by Yang et al., referred to as FD (ICCV, 2014:561-568).

The image quality evaluation algorithm of the invention is used for the simulation evaluation, and the prediction results are shown in FIG. 5, where the score in brackets is the predicted evaluation score. The evaluation results show that A+ and CNN yield better image quality while ANR and FD yield poorer image quality, which is consistent with subjective perception and with empirical analysis of these image super-resolution reconstruction methods. This indicates that the evaluation algorithm provided by the invention is reasonable.

The stacking-based no-reference super-resolution image quality evaluation method of the invention has the following beneficial effects: the simulation results show that, compared with existing no-reference image quality evaluation methods, the evaluation results obtained by the method are more effective and agree better with subjective perception. Meanwhile, in the objective evaluation of reconstructed images, the invention also shows clear advantages over other no-reference image quality evaluation methods.

Claims

1. A stacking-based no-reference super-resolution image quality evaluation method, characterized by comprising the following steps: first, deep features of the super-resolution image are extracted by an existing pre-trained VGGnet model to quantify the degradation of the super-resolution image; then, a stacking regression algorithm comprising the SVR algorithm and the k-NN algorithm is used as the first-layer regression model to construct a mapping from the deep features extracted by the VGGnet model to predicted quality scores, and a linear regression algorithm is used to obtain the second-layer regression model, thereby forming the stacking regression model and realizing no-reference evaluation of super-resolution image quality.

2. The stacking-based no-reference super-resolution image quality evaluation method according to claim 1, specifically comprising the steps of:

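Step 1, in the feature extraction stage, the initial training-set images are input into an existing pre-trained VGGnet model, and the deep features output by the seventh layer (a fully connected layer) are collected to form the feature training set D; meanwhile, the initial validation-set images are input into the same pre-trained VGGnet model to obtain their seventh-layer fully connected deep features, forming the validation set V;

Step 2, the feature training set D of step 1 is divided into k equally sized, non-overlapping training subsets $D_1, D_2, \dots, D_k$;

Step 3, for j = 1, 2, ..., k, let $D \setminus D_j$ (the feature training set D with the subset $D_j$ held out) be the training set and $D_j$ the test set. The SVR algorithm is applied to each training set $D \setminus D_j$ to train k SVR base regressors $L_j^{SVR}$, and each SVR base regressor outputs its predicted values on the test set $D_j$. The k sets of predicted values are stacked to obtain the SVR meta-training set. Meanwhile, the data in the validation set V are input into each SVR base regressor to obtain its validation-set predicted values $V_j^{SVR}$ on V, and the arithmetic mean of these k outputs is taken to obtain the SVR meta-validation set;

Step 4, the k-NN algorithm is applied to each training set $D \setminus D_j$ to train k k-NN base regressors $L_j^{KNN}$, and each k-NN base regressor outputs its predicted values on the test set $D_j$. The k sets of predicted values are stacked to obtain the k-NN meta-training set. Meanwhile, the data in the validation set V are input into each k-NN base regressor to obtain its validation-set predicted values $V_j^{KNN}$ on V, and the arithmetic mean of these k outputs is taken to obtain the k-NN meta-validation set;

Step 5, the SVR meta-training set obtained in step 3 and the k-NN meta-training set obtained in step 4 are combined column by column into a matrix whose height is the size of the training set and whose width is the number of algorithms; together with the subjective quality scores obtained from the existing database, this matrix forms the second-layer training set Train, i.e. the training set of the meta-regressor;

Step 6, a linear regressor is used as the meta-regressor, and the training set obtained in step 5 is input into the meta-regressor for training to obtain the meta-regression model;

Step 7, the SVR meta-validation set obtained in step 3 and the k-NN meta-validation set obtained in step 4 are combined column by column into a matrix whose height is the size of the validation set and whose width is the number of algorithms, forming the second-layer validation set Test, i.e. the validation set of the meta-regressor;

Step 8, the validation set of the meta-regressor obtained in step 7 is input into the meta-regression model trained in step 6 to obtain the final prediction result.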

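3. The stacking-based no-reference super-resolution image quality evaluation method according to claim 2, wherein in the SVR base regressor of step 3 the relationship between the feature $x_i$ and the subjective quality score $q_i$ of the image is expressed as follows:

$$q_i = \langle w, \phi(x_i) \rangle + b$$

where w and b denote the weight vector and the bias of the features respectively, both learned from the training set; $\phi(\cdot)$ is the nonlinear mapping associated with the kernel function, which maps the original low-dimensional data into a high-dimensional space.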

4. The stacking-based no-reference super-resolution image quality evaluation method according to claim 3, wherein the kernel function is a radial basis function (RBF), expressed as follows:

$$k(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$$

where σ is the standard deviation of the RBF kernel, and $x_i$ and $x_j$ are the i-th and j-th deep features of the image samples.

5. The stacking-based no-reference super-resolution image quality evaluation method according to claim 2, wherein the algorithm in the k-NN base regressor of step 4 is specifically as follows:

Let $x = \{x_1, x_2, \dots, x_m\}$, $x_i \in \mathbb{R}^n$, denote the features of the m training-set images, let $q_1, q_2, \dots, q_m$ denote the corresponding subjective scores, and let $y_j$ be the feature vector of the j-th test image. A distance metric is used to measure the similarity between the feature vectors of all training-set images and the feature vector of the j-th test image, the subjective quality scores corresponding to the k nearest neighbors are found, and their average is taken as the prediction result $\hat{q}_j$ for the j-th test image:

$$\hat{q}_j = \frac{1}{k} \sum_{i \in N_k(y_j)} q_i$$

where $N_k(y_j)$ is the index set of the k training features closest to $y_j$.

6. The stacking-based no-reference super-resolution image quality evaluation method according to claim 5, wherein the distance metric is the Euclidean distance:

$$d(x_i, y_j) = \sqrt{\sum_{l=1}^{n} (x_{i,l} - y_{j,l})^2}$$

where n is the dimension of the feature vector.