CN116129203A - No-reference image quality evaluation model training method - Google Patents

No-reference image quality evaluation model training method

Info

Publication number
CN116129203A
Authority
CN
China
Prior art keywords
image quality
quality evaluation
model
image
reference image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310165646.4A
Other languages
Chinese (zh)
Inventor
谢凤英
潘林朋
刘畅
丁海东
资粤
邱林伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202310165646.4A priority Critical patent/CN116129203A/en
Publication of CN116129203A publication Critical patent/CN116129203A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/766Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

The invention discloses a no-reference image quality evaluation model training method, belonging to the field of image quality evaluation. Monotonicity between the network's predicted values and the subjective quality ground truth is optimized by learning the quality ordering between images through a twin network; a linear correlation constraint is further added to learn the numerical quantization of the quality differences between images and to optimize the correlation between the algorithm's predictions and the subjective quality ground truth. The no-reference image quality evaluation model trained by the invention has better accuracy and generalization performance.

Description

No-reference image quality evaluation model training method
Technical Field
The invention relates to the technical field of image quality evaluation, and in particular to a training method for a no-reference image quality evaluation model.
Background
Because it requires no reference image and therefore suits a wide range of application scenarios, no-reference image quality evaluation has become a research hotspot.
At present, most no-reference image quality evaluation algorithms accomplish the image quality evaluation task based on the idea of regression, and they fall mainly into traditional methods and deep learning methods.
A traditional no-reference image quality evaluation model generally consists of a feature extraction unit and a quality regression model. According to the mode of feature extraction, traditional no-reference models can be divided into transform-domain, spatial-domain, and dictionary-learning methods; according to the regression model used, they can be classified into support-vector-machine, probability-model, and random-forest methods. However, all of these methods rely on hand-crafted features, whose generalization ability is generally limited and whose design requires a great deal of time and rich domain expertise, which greatly limits the further development and application of such methods.
With the successful application of deep learning in computer vision, some researchers began using deep learning for image quality assessment. However, these methods treat image quality evaluation as a regression task and use MSE or MAE to regress the quality score of an image directly; yet the goal of an image quality algorithm is to produce objective quality evaluations that are consistent with subjective quality scores, and it is not required that the objective result and the subjective score agree exactly in value.
Therefore, the regression approach imposes an overly strong constraint, fails to exploit the quality relationships between images, and leaves the model's training objective inconsistent with its evaluation criterion.
Based on this, how to provide a reference-free image quality evaluation method capable of overcoming the above-described drawbacks is a problem that a person skilled in the art needs to solve.
Disclosure of Invention
In view of the above, the invention provides a training method for a no-reference image quality evaluation model that optimizes the model by considering the monotonicity and correlation of image quality, so as to improve the model's evaluation accuracy and generalization performance.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a method for training a no-reference image quality evaluation model having two identical image quality evaluation networks, the two image quality evaluation networks sharing parameters, the no-reference image quality evaluation model optimally trained using a monotonicity loss function between image pairs,
l rank (x 1 ,x 2 )=max(0,f(x 2 )-f(x 1 )+ε)
wherein x is 1 、x 2 Input image for two image quality evaluation networks, and image x 1 Is higher than image x 2 Quality truth value of f (x) 1 )、f(x 2 ) Epsilon represents the interval for the corresponding output.
Preferably, the input to the no-reference image quality evaluation model is paired images together with their labels.
Preferably, all images in a mini-batch are first sorted in descending order of quality ground truth during training; all image pairs within the mini-batch are considered, and the monotonicity loss over all pairs is computed in a single forward pass according to:

L_rank = Σ_{i=1}^{n} Σ_{j=i+1}^{n} l_rank(x_i, x_j)

where n is the number of input images, L_rank is the monotonicity loss over all image pairs, and l_rank is the monotonicity loss between any two images.
Preferably, a matrix M is constructed from the quality ground truths of the input images and the corresponding outputs of the no-reference image quality evaluation model, and L_rank is computed from the matrix M.
Preferably, the no-reference image quality evaluation model is optimally trained with a correlation loss function:

L_line = 1 − Σ_{i=1}^{n} (y_i − ȳ)(f(x_i) − f̄) / √( Σ_{i=1}^{n} (y_i − ȳ)² · Σ_{i=1}^{n} (f(x_i) − f̄)² )

where n is the number of input images, y_i is the ground truth of the i-th image, ȳ is the mean of the n ground truths, and f̄ is the mean of the n outputs of the image quality evaluation model.
Preferably, the model for non-reference image quality evaluation is optimally trained using both loss functions together.
Preferably, the reference-free image quality evaluation model outputs an evaluation result through a sigmoid function.
According to the above technical scheme, the invention discloses a training method for a no-reference image quality evaluation model. Compared with the prior art, the training method learns ordering information between images through a twin network, learns the numerical quantization of quality differences between images by adding an additional linear correlation constraint, and uses the two resulting loss functions to optimize the monotonicity and correlation between the network's predictions and the subjective quality ground truth. The no-reference image quality evaluation model obtained by this training shows clear improvements in accuracy and generalization performance.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow diagram of the no-reference image quality evaluation model training method provided by the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention discloses a novel no-reference image quality evaluation model training method. The no-reference quality evaluation model obtained with this method effectively overcomes the drawbacks of the regression approach, namely its overly strong constraint and its inability to exploit quality correlations between images, and it outperforms models obtained by conventional regression methods in both accuracy and generalization performance.
Specifically, as shown in fig. 1, the training method of the evaluation model disclosed by the invention is as follows:
Firstly, in order to optimize the quality evaluation objective directly during training, the method learns ordering information between images with a twin network: the no-reference image quality evaluation model has two identical image quality evaluation networks that share parameters, and its input is paired images together with their labels. The network outputs and the corresponding loss are then computed by forward propagation, and optimization is finally performed by the backpropagation algorithm.
Further, the monotonicity loss between an image pair is calculated according to the following formula, and this loss is used to optimally train the no-reference image quality evaluation model:

l_rank(x_1, x_2) = max(0, f(x_2) − f(x_1) + ε)

where x_1 and x_2 are the input images of the two image quality evaluation networks, image x_1 has a higher quality ground truth than image x_2, f(x_1) and f(x_2) are the corresponding outputs, and ε denotes the margin.
Monotonicity between image pairs is an important indicator of a no-reference image quality evaluation algorithm: it measures whether the algorithm correctly predicts the relative quality relationship between different images. In the invention, the monotonicity loss between an image pair is calculated from the output values of the no-reference image quality evaluation model and the image quality ground truths. As the formula shows, when f(x_1) exceeds f(x_2) by at least the margin, that is, when the ordering of the network outputs agrees with that of the inputs, the loss is 0; when the orderings disagree, the loss is nonzero, and the larger f(x_2) is relative to f(x_1), the greater the loss. This loss therefore measures the monotonicity between the algorithm's output values and the images' subjective quality ground truths.
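As a minimal illustration, the pairwise monotonicity loss can be sketched in Python as follows (a sketch only; the function name and the default margin value are illustrative, not fixed by the patent):

```python
def l_rank(f1: float, f2: float, eps: float = 0.1) -> float:
    """Pairwise monotonicity (margin ranking) loss.

    f1 and f2 are the model outputs for images x1 and x2, where x1's
    quality ground truth is higher than x2's; eps is the margin.
    The loss is zero when f1 exceeds f2 by at least the margin,
    and grows as f2 exceeds f1.
    """
    return max(0.0, f2 - f1 + eps)
```

For example, l_rank(0.8, 0.3) is 0 (the predicted ordering matches the true one), while l_rank(0.3, 0.8) is 0.6 (the inverted ordering is penalized).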
An important disadvantage of twin networks is the large amount of redundant computation during training. Take the image pairs that can be formed from three images as an example: in a standard twin-network implementation, since each image appears in two different pairs, all three images must be forward-propagated through the network twice, and because the two branches of the twin network are identical, this performs twice as much computation as is actually needed.
To solve this problem, all possible image pairs within a mini-batch are considered during training, and the loss over all pairs is calculated in a single forward pass.
In one embodiment, assuming there are n images in a mini-batch, all n(n−1)/2 possible image pairs inside the mini-batch are considered, and the corresponding loss is:

L_rank = Σ_{i=1}^{n} Σ_{j=i+1}^{n} l_rank(x_i, x_j)
where n is the number of input images, L_rank is the monotonicity loss over all image pairs, and l_rank is the monotonicity loss between any two images.
Further, in this embodiment, a matrix M is constructed from the quality ground truths of the input images and the corresponding outputs of the no-reference image quality evaluation model, and L_rank is calculated from M so as to accelerate the above algorithm.
Specifically, to make full use of the parallel computing capability of modern GPUs, and without loss of generality, the images in a mini-batch are first sorted in descending order of quality ground truth. Denote the sorted images by X = [x_1, x_2, ..., x_n] and the network outputs by f = [f(x_1), f(x_2), ..., f(x_n)]. An n×n matrix F is then constructed with elements F_ij = f(x_i), i.e. the i-th row of F repeats the output f(x_i).
further, a matrix M is calculated according to the following formula:
M=F T -F+εP
where P is a unitary matrix of all 1's, where each element of the matrix M is as follows:
Figure BDA0004095895280000052
calculating L from matrix M rank The process of (1) is as follows:
let matrix M rank =max (0, m), at which time the loss value L of the entire small lot rank For matrix M rank The calculation process can fully utilize the parallel acceleration capability of the GPU, the calculation speed is greatly improved, and the calculation can be easily realized by means of the current common deep learning framework.
On the other hand, the application provides a model optimization method that considers correlation. The correlation between the output values of an image quality evaluation model and the image quality ground truths is an important indicator for measuring the model. In order to optimize this objective during training and enable the network to learn the numerical quantization of the quality differences between different images, the invention adds an additional linear correlation constraint so that the output of the twin network and the quality ground truths become linearly correlated. Denote the image quality ground truths in a mini-batch by y = [y_1, ..., y_n] and the linear correlation loss of the whole mini-batch by L_line; the correlation loss function is calculated as:

L_line = 1 − Σ_{i=1}^{n} (y_i − ȳ)(f(x_i) − f̄) / √( Σ_{i=1}^{n} (y_i − ȳ)² · Σ_{i=1}^{n} (f(x_i) − f̄)² )

where n is the number of input images, y_i is the ground truth of the i-th image, ȳ = (1/n) Σ_{i=1}^{n} y_i is the mean of the n ground truths, and f̄ = (1/n) Σ_{i=1}^{n} f(x_i) is the mean of the n outputs of the image quality evaluation model.
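Reading the linear correlation loss as one minus the Pearson correlation between predictions and ground truths (the patent's formula is given only as an image, so this reading is an assumption), it can be sketched as:

```python
import numpy as np

def corr_loss(f: np.ndarray, y: np.ndarray) -> float:
    """Linear correlation loss: 1 - Pearson correlation between the
    model outputs f and the quality ground truths y. It is 0 for a
    perfect positive linear relationship and 2 for a perfect negative
    one; minimizing it drives f toward a linear relationship with y
    without forcing the values themselves to match."""
    fc = f - f.mean()  # centered outputs
    yc = y - y.mean()  # centered ground truths
    plcc = (fc * yc).sum() / np.sqrt((fc ** 2).sum() * (yc ** 2).sum())
    return float(1.0 - plcc)
```

For instance, outputs that are an affine transform of the ground truths (f = 2y + 1) incur zero loss, which is exactly the point: the constraint rewards linear agreement rather than exact numerical equality.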
The invention further provides that, when training the no-reference image quality evaluation model, the two loss functions are used jointly for optimization. The loss function during joint training is as follows:

L = αL_rank + βL_line

where L_rank is the monotonicity loss function, L_line is the correlation loss function, and α and β are the weights of the corresponding losses.
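The joint objective is a plain weighted sum; a one-line sketch follows, where the default weights are purely illustrative since the patent does not fix α and β:

```python
def total_loss(rank_loss: float, line_loss: float,
               alpha: float = 1.0, beta: float = 1.0) -> float:
    """Joint training objective L = alpha * L_rank + beta * L_line,
    balancing the ordering (monotonicity) term against the linear
    correlation term. alpha and beta are hyperparameters."""
    return alpha * rank_loss + beta * line_loss
```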
In addition, the final layer of the twin network uses a single neuron to predict the quality score of the input image. Since the algorithm does not directly regress the subjective quality score of an image, the range of the network's output values is unconstrained, and the network would otherwise have to learn that range from the data distribution. To reduce the learning difficulty and make the meaning of the output value more definite, the final-layer neuron is passed through a sigmoid function to produce the final output.
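The effect of the sigmoid on the final neuron can be sketched in a few lines (illustrative only; the function name is not from the patent):

```python
import math

def bounded_quality_score(logit: float) -> float:
    """Pass the final neuron's raw activation through a sigmoid so the
    predicted quality score always lies in (0, 1), sparing the network
    from having to learn the output range from the data distribution."""
    return 1.0 / (1.0 + math.exp(-logit))
```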
Aiming at the image quality evaluation objective, the invention provides a new training method for the no-reference image quality evaluation task, which optimizes the monotonicity between the network's predictions and the subjective quality ground truth through ordering information learned between images by a twin network. It also provides a method for training the twin network efficiently, and on this basis adds an additional linear correlation constraint to learn the numerical quantization of quality differences between images and to optimize the correlation between the algorithm's predictions and the subjective quality ground truth. The no-reference image quality evaluation model obtained by this training is superior to models trained by regression in both accuracy and generalization performance.
To fully demonstrate its beneficial effects, the application is illustrated by the following experiments.
the invention selects four representative image quality evaluation data sets of Koniq-10k and LIVE-C, BID, LIVE for experiments, and uses a Szelman rank correlation coefficient (SROCC) and a Piercan Linear Correlation Coefficient (PLCC) as evaluation indexes,
the present invention compares with MSE and MAE on the four data sets described above, and for fair comparison, except for the loss function used, the rest of the settings were the same, each comparison was repeated 10 times, and the median was taken as the final result, the experimental results are shown in table 1,
table 1 comparison of different data sets and regression methods
(Table 1 is reproduced as an image in the original publication.)
As Table 1 shows, on the KonIQ-10k, LIVE-C, and BID datasets the model trained by the invention clearly outperforms models trained with MSE or MAE as the loss function, and on the LIVE dataset the invention obtains results comparable to MSE and MAE. This indicates that the overall performance of the method is superior to the regression approach and that it offers an effective new direction for no-reference image quality evaluation.
Further, generalization tests were performed for the regression methods and the proposed method on the KonIQ-10k, LIVE-C, and BID datasets. Specifically, a model was trained on each of the three datasets with the proposed method and with the regression methods, respectively, and then tested on the other two datasets. The experimental results are shown in Table 2.
table 2 comparison of generalization performance on different data sets and regression methods
(Table 2 is reproduced as an image in the original publication.)
As the results show, in the six cross-dataset validation experiments the method of the application obtained the best result in four, indicating that its generalization ability is superior to that of the regression methods.
In the present specification, the embodiments are described in a progressive manner, each emphasizing its differences from the others; for identical or similar parts between embodiments, reference may be made from one embodiment to another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. A method for training a no-reference image quality evaluation model, characterized in that the no-reference image quality evaluation model has two identical image quality evaluation networks, the two image quality evaluation networks share parameters, and the no-reference image quality evaluation model is optimally trained with a monotonicity loss function between image pairs:

l_rank(x_1, x_2) = max(0, f(x_2) − f(x_1) + ε)

where x_1 and x_2 are the input images of the two image quality evaluation networks, image x_1 has a higher quality ground truth than image x_2, f(x_1) and f(x_2) are the corresponding outputs, and ε denotes the margin.
2. The method of claim 1, wherein the input to the no-reference image quality evaluation model is paired images together with their labels.
3. The no-reference image quality evaluation model training method according to claim 1, wherein, after all possible image pairs among the input images are ordered according to the image quality ground truths, the monotonicity loss of all image pairs is calculated through one forward pass according to:

L_rank = Σ_{i=1}^{n} Σ_{j=i+1}^{n} l_rank(x_i, x_j)

where n is the number of input images, L_rank is the monotonicity loss over all image pairs, and l_rank is the monotonicity loss between any two images.
4. The method according to claim 3, wherein a matrix M is constructed from the quality ground truths of the input images and the corresponding outputs of the model, and L_rank is calculated from the matrix M.
5. The no-reference image quality evaluation model training method according to claim 1 or 3, wherein the no-reference image quality evaluation model is optimally trained with a correlation loss function:

L_line = 1 − Σ_{i=1}^{n} (y_i − ȳ)(f(x_i) − f̄) / √( Σ_{i=1}^{n} (y_i − ȳ)² · Σ_{i=1}^{n} (f(x_i) − f̄)² )

where n is the number of input images, y_i is the ground truth of the i-th image, ȳ is the mean of the n ground truths, and f̄ is the mean of the n outputs of the image quality evaluation model.
6. The method for training a model for no-reference image quality assessment according to claim 5, wherein the model for no-reference image quality assessment is optimally trained using both loss functions together.
7. The method according to claim 1, wherein the no-reference image quality evaluation model outputs an evaluation result through a sigmoid function.
CN202310165646.4A 2023-02-15 2023-02-15 No-reference image quality evaluation model training method Pending CN116129203A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310165646.4A CN116129203A (en) 2023-02-15 2023-02-15 No-reference image quality evaluation model training method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310165646.4A CN116129203A (en) 2023-02-15 2023-02-15 No-reference image quality evaluation model training method

Publications (1)

Publication Number Publication Date
CN116129203A true CN116129203A (en) 2023-05-16

Family

ID=86297421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310165646.4A Pending CN116129203A (en) 2023-02-15 2023-02-15 No-reference image quality evaluation model training method

Country Status (1)

Country Link
CN (1) CN116129203A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237358A (en) * 2023-11-15 2023-12-15 天津大学 Stereoscopic image quality evaluation method based on metric learning
CN117237358B (en) * 2023-11-15 2024-02-06 天津大学 Stereoscopic image quality evaluation method based on metric learning

Similar Documents

Publication Publication Date Title
CN110223517B (en) Short-term traffic flow prediction method based on space-time correlation
CN109635204A (en) Online recommender system based on collaborative filtering and length memory network
CN110427799B (en) Human hand depth image data enhancement method based on generation of countermeasure network
CN111814626B (en) Dynamic gesture recognition method and system based on self-attention mechanism
CN116129203A (en) No-reference image quality evaluation model training method
CN113688949B (en) Network image data set denoising method based on dual-network joint label correction
CN110197307B (en) Regional sea surface temperature prediction method combined with attention mechanism
CN113128671B (en) Service demand dynamic prediction method and system based on multi-mode machine learning
CN114154700B (en) User electricity consumption prediction method based on transformer model
CN112419455B (en) Human skeleton sequence information-based character action video generation method and system and storage medium
CN116110022B (en) Lightweight traffic sign detection method and system based on response knowledge distillation
CN116992779B (en) Simulation method and system of photovoltaic energy storage system based on digital twin model
CN115424177A (en) Twin network target tracking method based on incremental learning
CN115909002A (en) Image translation method based on contrast learning
CN113554599A (en) Video quality evaluation method based on human visual effect
CN112116685A (en) Multi-attention fusion network image subtitle generating method based on multi-granularity reward mechanism
CN114926591A (en) Multi-branch deep learning 3D face reconstruction model training method, system and medium
CN114647758A (en) Video abstract generation network based on Transformer and deep reinforcement learning
CN110717281A (en) Simulation model credibility evaluation method based on hesitation cloud language term set and cluster decision
CN113742178A (en) Network node health state monitoring method based on LSTM
CN116543289B (en) Image description method based on encoder-decoder and Bi-LSTM attention model
CN116350190B (en) Driving capability determining method, electronic equipment and storage medium
CN111241372B (en) Method for predicting color harmony degree according to user preference learning
Hu et al. Learning multi-expert distribution calibration for long-tailed video classification
CN115035304A (en) Image description generation method and system based on course learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination