CN116129203A - No-reference image quality evaluation model training method - Google Patents

No-reference image quality evaluation model training method

Info

Publication number
CN116129203A
Authority
CN
China
Prior art keywords
image quality
quality evaluation
model
image
reference image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310165646.4A
Other languages
Chinese (zh)
Inventor
谢凤英
潘林朋
刘畅
丁海东
资粤
邱林伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202310165646.4A priority Critical patent/CN116129203A/en
Publication of CN116129203A publication Critical patent/CN116129203A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/766Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

The invention discloses a no-reference image quality evaluation model training method, belonging to the field of image quality evaluation. Monotonicity between the network's predicted values and the subjective quality ground truth is optimized by learning the quality ordering between images through a twin network; a linear correlation constraint is further added to learn the numerical quantization of the quality differences between images and to optimize the correlation between the algorithm's predictions and the subjective quality ground truth. The no-reference image quality evaluation model trained by the invention has better accuracy and generalization performance.

Description

No-reference image quality evaluation model training method
Technical Field
The invention relates to the technical field of image quality evaluation, and in particular to a training method for a no-reference image quality evaluation model.
Background
Because it requires no reference image and therefore suits a wide range of application scenarios, no-reference image quality evaluation has become a research hotspot.
At present, most no-reference image quality evaluation algorithms accomplish the image quality evaluation task based on the idea of regression, and they fall mainly into traditional methods and deep learning methods.
A traditional no-reference image quality evaluation model generally consists of a feature extraction unit and a quality regression model. According to the mode of feature extraction, traditional no-reference models can be divided into transform-domain, spatial-domain, and dictionary-learning methods; according to the regression model used, they can be classified into support-vector-machine, probability-model, and random-forest methods. However, all of these methods rely on hand-crafted features, whose generalization ability is generally limited and whose design requires a great deal of time and rich domain expertise, which greatly limits the further development and application of such methods.
With the successful application of deep learning in computer vision, some researchers began using deep learning for image quality assessment. However, these methods treat image quality evaluation as a regression task and use MSE or MAE to regress the quality score of an image directly; yet the goal of an image quality algorithm is to produce objective quality evaluations that are consistent with subjective quality scores, and it is not required that the objective result and the subjective score agree exactly in value.
Therefore, the regression approach imposes an overly strong constraint, fails to exploit the quality relationships between images, and leaves the model's training objective inconsistent with its evaluation criterion.
Based on this, how to provide a reference-free image quality evaluation method capable of overcoming the above-described drawbacks is a problem that a person skilled in the art needs to solve.
Disclosure of Invention
In view of the above, the invention provides a training method for a no-reference image quality evaluation model that optimizes the model by considering the monotonicity and correlation of image quality, so as to improve the model's evaluation accuracy and generalization performance.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a method for training a no-reference image quality evaluation model having two identical image quality evaluation networks, the two image quality evaluation networks sharing parameters, the no-reference image quality evaluation model optimally trained using a monotonicity loss function between image pairs,
l rank (x 1 ,x 2 )=max(0,f(x 2 )-f(x 1 )+ε)
wherein x is 1 、x 2 Input image for two image quality evaluation networks, and image x 1 Is higher than image x 2 Quality truth value of f (x) 1 )、f(x 2 ) Epsilon represents the interval for the corresponding output.
Preferably, the input to the no-reference image quality evaluation model is paired images together with their labels.
Preferably, all images in a mini-batch are first sorted in descending order of quality ground truth during training; all image pairs within the mini-batch are considered, and the monotonicity loss over all pairs is computed in a single forward pass according to:

L_rank = Σ_{i=1}^{n} Σ_{j=i+1}^{n} l_rank(x_i, x_j)

where n is the number of input images, L_rank is the monotonicity loss over all image pairs, and l_rank is the monotonicity loss between any two images.
Preferably, a matrix M is constructed from the quality ground truths of the input images and the corresponding outputs of the no-reference image quality evaluation model, and L_rank is computed from the matrix M.
Preferably, the no-reference image quality evaluation model is optimally trained with a correlation loss function:

L_line = 1 − Σ_{i=1}^{n} (y_i − ȳ)(f(x_i) − f̄) / √( Σ_{i=1}^{n} (y_i − ȳ)² · Σ_{i=1}^{n} (f(x_i) − f̄)² )

where n is the number of input images, y_i is the ground truth of the i-th image, ȳ is the mean of the n ground truths, and f̄ is the mean of the n outputs of the image quality evaluation model.
Preferably, the model for non-reference image quality evaluation is optimally trained using both loss functions together.
Preferably, the reference-free image quality evaluation model outputs an evaluation result through a sigmoid function.
According to the above technical scheme, the invention discloses a training method for a no-reference image quality evaluation model. Compared with the prior art, the training method learns ordering information between images through a twin network, learns the numerical quantization of quality differences between images by adding an additional linear correlation constraint, and uses the two resulting loss functions to optimize the monotonicity and correlation between the network's predictions and the subjective quality ground truth. The no-reference image quality evaluation model obtained by this training shows clear improvements in accuracy and generalization performance.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow diagram of the no-reference image quality evaluation model training method provided by the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention discloses a novel no-reference image quality evaluation model training method. The no-reference quality evaluation model obtained with this method effectively overcomes the drawbacks of the regression approach, namely its overly strong constraint and its inability to exploit quality correlations between images, and it outperforms models obtained by conventional regression methods in both accuracy and generalization performance.
Specifically, as shown in fig. 1, the training method of the evaluation model disclosed by the invention is as follows:
Firstly, in order to optimize the quality evaluation objective directly during training, the method learns ordering information between images with a twin network: the no-reference image quality evaluation model has two identical image quality evaluation networks that share parameters, and its input is paired images together with their labels. The network outputs and the corresponding loss are then computed by forward propagation, and optimization is finally performed by the backpropagation algorithm.
Further, the monotonicity loss between an image pair is calculated according to the following formula, and this loss is used to optimally train the no-reference image quality evaluation model:

l_rank(x_1, x_2) = max(0, f(x_2) − f(x_1) + ε)

where x_1 and x_2 are the input images of the two image quality evaluation networks, image x_1 has a higher quality ground truth than image x_2, f(x_1) and f(x_2) are the corresponding outputs, and ε denotes the margin.
Monotonicity between image pairs is an important indicator of a no-reference image quality evaluation algorithm: it measures whether the algorithm correctly predicts the relative quality relationship between different images. In the invention, the monotonicity loss between an image pair is calculated from the output values of the no-reference image quality evaluation model and the image quality ground truths. As the formula shows, when f(x_1) exceeds f(x_2) by at least the margin, that is, when the ordering of the network outputs agrees with that of the inputs, the loss is 0; when the orderings disagree, the loss is nonzero, and the larger f(x_2) is relative to f(x_1), the greater the loss. This loss therefore measures the monotonicity between the algorithm's output values and the images' subjective quality ground truths.
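As a minimal illustration, the pairwise monotonicity loss can be sketched in Python as follows (a sketch only; the function name and the default margin value are illustrative, not fixed by the patent):

```python
def l_rank(f1: float, f2: float, eps: float = 0.1) -> float:
    """Pairwise monotonicity (margin ranking) loss.

    f1 and f2 are the model outputs for images x1 and x2, where x1's
    quality ground truth is higher than x2's; eps is the margin.
    The loss is zero when f1 exceeds f2 by at least the margin,
    and grows as f2 exceeds f1.
    """
    return max(0.0, f2 - f1 + eps)
```

For example, l_rank(0.8, 0.3) is 0 (the predicted ordering matches the true one), while l_rank(0.3, 0.8) is 0.6 (the inverted ordering is penalized).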
An important disadvantage of twin networks is the large amount of redundant computation during training. Take the image pairs that can be formed from three images as an example: in a standard twin-network implementation, since each image appears in two different pairs, all three images must be forward-propagated through the network twice, and because the two branches of the twin network are identical, this performs twice as much computation as is actually needed.
To solve this problem, all possible image pairs within a mini-batch are considered during training, and the loss over all pairs is calculated in a single forward pass.
In one embodiment, assuming there are n images in a mini-batch, all n(n−1)/2 possible image pairs inside the mini-batch are considered, and the corresponding loss is:

L_rank = Σ_{i=1}^{n} Σ_{j=i+1}^{n} l_rank(x_i, x_j)
where n is the number of input images, L_rank is the monotonicity loss over all image pairs, and l_rank is the monotonicity loss between any two images.
Further, in this embodiment, a matrix M is constructed from the quality ground truths of the input images and the corresponding outputs of the no-reference image quality evaluation model, and L_rank is calculated from M so as to accelerate the above algorithm.
Specifically, to make full use of the parallel computing capability of modern GPUs, and without loss of generality, the images in a mini-batch are first sorted in descending order of quality ground truth. Denote the sorted images by X = [x_1, x_2, ..., x_n] and the network outputs by f = [f(x_1), f(x_2), ..., f(x_n)]. An n×n matrix F is then constructed with elements F_ij = f(x_i), i.e. the i-th row of F repeats the output f(x_i).
further, a matrix M is calculated according to the following formula:
M=F T -F+εP
where P is a unitary matrix of all 1's, where each element of the matrix M is as follows:
Figure BDA0004095895280000052
calculating L from matrix M rank The process of (1) is as follows:
let matrix M rank =max (0, m), at which time the loss value L of the entire small lot rank For matrix M rank The calculation process can fully utilize the parallel acceleration capability of the GPU, the calculation speed is greatly improved, and the calculation can be easily realized by means of the current common deep learning framework.
On the other hand, the application provides a model optimization method that considers correlation. The correlation between the output values of an image quality evaluation model and the image quality ground truths is an important indicator for measuring the model. In order to optimize this objective during training and enable the network to learn the numerical quantization of the quality differences between different images, the invention adds an additional linear correlation constraint so that the output of the twin network and the quality ground truths become linearly correlated. Denote the image quality ground truths in a mini-batch by y = [y_1, ..., y_n] and the linear correlation loss of the whole mini-batch by L_line; the correlation loss function is calculated as:

L_line = 1 − Σ_{i=1}^{n} (y_i − ȳ)(f(x_i) − f̄) / √( Σ_{i=1}^{n} (y_i − ȳ)² · Σ_{i=1}^{n} (f(x_i) − f̄)² )

where n is the number of input images, y_i is the ground truth of the i-th image, ȳ = (1/n) Σ_{i=1}^{n} y_i is the mean of the n ground truths, and f̄ = (1/n) Σ_{i=1}^{n} f(x_i) is the mean of the n outputs of the image quality evaluation model.
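Reading the linear correlation loss as one minus the Pearson correlation between predictions and ground truths (the patent's formula is given only as an image, so this reading is an assumption), it can be sketched as:

```python
import numpy as np

def corr_loss(f: np.ndarray, y: np.ndarray) -> float:
    """Linear correlation loss: 1 - Pearson correlation between the
    model outputs f and the quality ground truths y. It is 0 for a
    perfect positive linear relationship and 2 for a perfect negative
    one; minimizing it drives f toward a linear relationship with y
    without forcing the values themselves to match."""
    fc = f - f.mean()  # centered outputs
    yc = y - y.mean()  # centered ground truths
    plcc = (fc * yc).sum() / np.sqrt((fc ** 2).sum() * (yc ** 2).sum())
    return float(1.0 - plcc)
```

For instance, outputs that are an affine transform of the ground truths (f = 2y + 1) incur zero loss, which is exactly the point: the constraint rewards linear agreement rather than exact numerical equality.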
The invention further provides that, when training the no-reference image quality evaluation model, the two loss functions are used jointly for optimization. The loss function during joint training is as follows:

L = αL_rank + βL_line

where L_rank is the monotonicity loss function, L_line is the correlation loss function, and α and β are the weights of the corresponding losses.
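The joint objective is a plain weighted sum; a one-line sketch follows, where the default weights are purely illustrative since the patent does not fix α and β:

```python
def total_loss(rank_loss: float, line_loss: float,
               alpha: float = 1.0, beta: float = 1.0) -> float:
    """Joint training objective L = alpha * L_rank + beta * L_line,
    balancing the ordering (monotonicity) term against the linear
    correlation term. alpha and beta are hyperparameters."""
    return alpha * rank_loss + beta * line_loss
```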
In addition, the final layer of the twin network uses a single neuron to predict the quality score of the input image. Since the algorithm does not directly regress the subjective quality score of an image, the range of the network's output values is unconstrained, and the network would otherwise have to learn that range from the data distribution. To reduce the learning difficulty and make the meaning of the output value more definite, the final-layer neuron is passed through a sigmoid function to produce the final output.
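The effect of the sigmoid on the final neuron can be sketched in a few lines (illustrative only; the function name is not from the patent):

```python
import math

def bounded_quality_score(logit: float) -> float:
    """Pass the final neuron's raw activation through a sigmoid so the
    predicted quality score always lies in (0, 1), sparing the network
    from having to learn the output range from the data distribution."""
    return 1.0 / (1.0 + math.exp(-logit))
```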
Aiming at the image quality evaluation objective, the invention provides a new training method for the no-reference image quality evaluation task, which optimizes the monotonicity between the network's predictions and the subjective quality ground truth through ordering information learned between images by a twin network. It also provides a method for training the twin network efficiently, and on this basis adds an additional linear correlation constraint to learn the numerical quantization of quality differences between images and to optimize the correlation between the algorithm's predictions and the subjective quality ground truth. The no-reference image quality evaluation model obtained by this training is superior to models trained by regression in both accuracy and generalization performance.
To fully demonstrate its beneficial effects, the application is illustrated by the following experiments.
the invention selects four representative image quality evaluation data sets of Koniq-10k and LIVE-C, BID, LIVE for experiments, and uses a Szelman rank correlation coefficient (SROCC) and a Piercan Linear Correlation Coefficient (PLCC) as evaluation indexes,
the present invention compares with MSE and MAE on the four data sets described above, and for fair comparison, except for the loss function used, the rest of the settings were the same, each comparison was repeated 10 times, and the median was taken as the final result, the experimental results are shown in table 1,
table 1 comparison of different data sets and regression methods
(Table 1 is reproduced as an image in the original publication.)
As Table 1 shows, on the KonIQ-10k, LIVE-C, and BID datasets the model trained by the invention clearly outperforms models trained with MSE or MAE as the loss function, and on the LIVE dataset the invention obtains results comparable to MSE and MAE. This indicates that the overall performance of the method is superior to the regression approach and that it offers an effective new direction for no-reference image quality evaluation.
Further, generalization tests were performed for the regression methods and the proposed method on the KonIQ-10k, LIVE-C, and BID datasets. Specifically, a model was trained on each of the three datasets with the proposed method and with the regression methods, respectively, and then tested on the other two datasets. The experimental results are shown in Table 2.
table 2 comparison of generalization performance on different data sets and regression methods
(Table 2 is reproduced as an image in the original publication.)
As the results show, in the six cross-dataset validation experiments the method of the application obtained the best result in four, indicating that its generalization ability is superior to that of the regression methods.
In the present specification, the embodiments are described in a progressive manner, each emphasizing its differences from the others; for identical or similar parts between embodiments, reference may be made from one embodiment to another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. A method for training a no-reference image quality evaluation model, characterized in that the no-reference image quality evaluation model has two identical image quality evaluation networks, the two image quality evaluation networks share parameters, and the no-reference image quality evaluation model is optimally trained with a monotonicity loss function between image pairs:

l_rank(x_1, x_2) = max(0, f(x_2) − f(x_1) + ε)

where x_1 and x_2 are the input images of the two image quality evaluation networks, image x_1 has a higher quality ground truth than image x_2, f(x_1) and f(x_2) are the corresponding outputs, and ε denotes the margin.
2. The method of claim 1, wherein the input to the no-reference image quality evaluation model is paired images together with their labels.
3. The no-reference image quality evaluation model training method according to claim 1, wherein, after all possible image pairs among the input images are ordered according to the image quality ground truths, the monotonicity loss of all image pairs is calculated through one forward pass according to:

L_rank = Σ_{i=1}^{n} Σ_{j=i+1}^{n} l_rank(x_i, x_j)

where n is the number of input images, L_rank is the monotonicity loss over all image pairs, and l_rank is the monotonicity loss between any two images.
4. The method according to claim 3, wherein a matrix M is constructed from the quality ground truths of the input images and the corresponding outputs of the model, and L_rank is calculated from the matrix M.
5. The no-reference image quality evaluation model training method according to claim 1 or 3, wherein the no-reference image quality evaluation model is optimally trained with a correlation loss function:

L_line = 1 − Σ_{i=1}^{n} (y_i − ȳ)(f(x_i) − f̄) / √( Σ_{i=1}^{n} (y_i − ȳ)² · Σ_{i=1}^{n} (f(x_i) − f̄)² )

where n is the number of input images, y_i is the ground truth of the i-th image, ȳ is the mean of the n ground truths, and f̄ is the mean of the n outputs of the image quality evaluation model.
6. The method for training a model for no-reference image quality assessment according to claim 5, wherein the model for no-reference image quality assessment is optimally trained using both loss functions together.
7. The method according to claim 1, wherein the no-reference image quality evaluation model outputs an evaluation result through a sigmoid function.
CN202310165646.4A 2023-02-15 2023-02-15 No-reference image quality evaluation model training method Pending CN116129203A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310165646.4A CN116129203A (en) 2023-02-15 2023-02-15 No-reference image quality evaluation model training method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310165646.4A CN116129203A (en) 2023-02-15 2023-02-15 No-reference image quality evaluation model training method

Publications (1)

Publication Number Publication Date
CN116129203A true CN116129203A (en) 2023-05-16

Family

ID=86297421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310165646.4A Pending CN116129203A (en) 2023-02-15 2023-02-15 No-reference image quality evaluation model training method

Country Status (1)

Country Link
CN (1) CN116129203A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237358A (en) * 2023-11-15 2023-12-15 天津大学 Stereoscopic image quality evaluation method based on metric learning
CN117237358B (en) * 2023-11-15 2024-02-06 天津大学 Stereoscopic image quality evaluation method based on metric learning

Similar Documents

Publication Publication Date Title
CN110223517B (en) Short-term traffic flow prediction method based on space-time correlation
CN109635204A (en) Online recommender system based on collaborative filtering and length memory network
CN110427799B (en) Human hand depth image data enhancement method based on generation of countermeasure network
CN111814626B (en) Dynamic gesture recognition method and system based on self-attention mechanism
CN116129203A (en) No-reference image quality evaluation model training method
CN113688949B (en) Network image data set denoising method based on dual-network joint label correction
CN110197307B (en) Regional sea surface temperature prediction method combined with attention mechanism
CN113128671B (en) Service demand dynamic prediction method and system based on multi-mode machine learning
CN114154700B (en) User electricity consumption prediction method based on transformer model
CN112419455B (en) Human skeleton sequence information-based character action video generation method and system and storage medium
CN116110022B (en) Lightweight traffic sign detection method and system based on response knowledge distillation
CN116992779B (en) Simulation method and system of photovoltaic energy storage system based on digital twin model
CN115424177A (en) Twin network target tracking method based on incremental learning
CN115909002A (en) Image translation method based on contrast learning
CN113554599A (en) Video quality evaluation method based on human visual effect
CN112116685A (en) Multi-attention fusion network image subtitle generating method based on multi-granularity reward mechanism
CN114926591A (en) Multi-branch deep learning 3D face reconstruction model training method, system and medium
CN114647758A (en) Video abstract generation network based on Transformer and deep reinforcement learning
CN110717281A (en) Simulation model credibility evaluation method based on hesitation cloud language term set and cluster decision
CN113742178A (en) Network node health state monitoring method based on LSTM
CN116543289B (en) Image description method based on encoder-decoder and Bi-LSTM attention model
CN116350190B (en) Driving capability determining method, electronic equipment and storage medium
CN111241372B (en) Method for predicting color harmony degree according to user preference learning
Hu et al. Learning multi-expert distribution calibration for long-tailed video classification
CN115035304A (en) Image description generation method and system based on course learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination