CN113421237B - No-reference image quality evaluation method based on depth feature transfer learning - Google Patents


Info

Publication number
CN113421237B
CN113421237B · Application CN202110678186.6A
Authority
CN
China
Prior art keywords
network
quality
image
image quality
reference image
Prior art date
Legal status
Active
Application number
CN202110678186.6A
Other languages
Chinese (zh)
Other versions
CN113421237A (en)
Inventor
何立火
任伟
李嘉秀
邓夏迪
甘海林
唐杰浩
柯俊杰
张超仑
路文
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202110678186.6A priority Critical patent/CN113421237B/en
Publication of CN113421237A publication Critical patent/CN113421237A/en
Application granted
Publication of CN113421237B publication Critical patent/CN113421237B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing


Abstract

A no-reference image quality evaluation method based on depth feature transfer learning comprises the following steps: constructing a distortion feature extraction network; constructing a multi-branch feature attention module; constructing a quality regression network; generating a no-reference image quality regression network; generating a training set; training the no-reference image quality regression network; and evaluating the quality of the image to be evaluated. The multi-branch feature attention module contained in the distortion feature extraction network adaptively captures the distortion features of natural images, and the quality score of the input image is obtained automatically at the output of the quality regression network. Extensive experimental results on several public international databases show that the method improves the prediction accuracy of distorted image quality, agrees more closely with human visual perception, and generalizes better when evaluating no-reference image quality.

Description

No-reference image quality evaluation method based on depth feature transfer learning
Technical Field
The invention belongs to the technical field of image processing, and further relates to an image quality evaluation method based on depth feature transfer learning within the field of image quality evaluation. The invention can be used to automatically calculate the quality score of a naturally distorted image when no original reference image is available.
Background
With the advent of the Internet of Everything and the rapid development of digital multimedia technology, images have become a major source of the visual information humans perceive from the outside world. However, uncontrollable factors such as noise are inevitably introduced as a signal travels from the transmitting end to the receiving end, degrading image quality and causing loss of visual quality and semantic information. Evaluating image quality is therefore very important: an efficient and accurate image quality evaluation method can be used to optimize image acquisition and processing systems so that higher-quality images are obtained. Because the original reference image (the corresponding distortion-free version) is difficult to acquire in most practical application scenarios, no-reference image quality evaluation methods are the most widely applicable. A no-reference quality evaluation method is a technique that computes quality automatically without any information about the original image, obtaining a quality representation of the target image by establishing a mapping from subjective opinion scores to objective evaluation scores.
Wuhan University, in its patent "Color image quality evaluation method based on a multi-path deep convolutional neural network" (application number CN201910414080.8, grant publication number CN110163855B), discloses a color image quality evaluation method based on a multi-path deep convolutional neural network. The patent mainly addresses the low accuracy of traditional methods in predicting the quality of color images. Its implementation steps are: (1) perform multi-scale transformation and color-space transformation on the color image and output several different component images; (2) design and improve a single-path deep convolutional network structure; (3) train and optimize the single-path deep convolutional network; (4) perform feature extraction and multi-dimensional collaborative feature fusion on the component images with the single-path deep convolutional network model; (5) reduce the dimensionality of the multi-dimensional output feature vector; (6) establish a color image quality prediction model by mapping the reduced features to subjective opinion scores with a nonlinear regression method, and evaluate the quality of the color image. By extracting quality-aware features from the color components, the patent improves no-reference quality evaluation for color images.
However, the method still has shortcomings: the multi-scale and color-space transformations output several component images whose features, extracted by multiple convolutional neural networks, contain many redundant and irrelevant features, and the functional mapping between subjective opinion scores and the reduced features is completed by a separate nonlinear regression method, so the method is not an end-to-end learnable process.
Ren et al., in the paper "RAN4IQA: Restorative adversarial nets for no-reference image quality assessment" (Proceedings of the AAAI Conference on Artificial Intelligence, 32(1), 2018), disclose a no-reference image quality assessment method based on generative adversarial networks. The method first cuts the image under test into several image blocks; it then obtains a pre-trained restorative adversarial network by training on the large-scale Waterloo database, where the inputs in the pre-training stage are four types of distorted images (JPEG compression, JPEG2000 compression, Gaussian blur, and Gaussian white noise); pseudo-reference versions of the image blocks are generated by the pre-trained model, and the distorted blocks and restored blocks are fed together into a regression network to compute the quality score of the image. The method has two disadvantages. First, it relies on generated images of the corresponding distortion types obtained after pre-training on the large-scale Waterloo database, so the training period is long and the method is hard to apply in real production and everyday life. Second, the distorted-image restoration model considers only four distortion types at different levels, whereas images in real scenes are often combinations of multiple distortion types, and the final subjective-objective consistency result depends to a great extent on the accuracy of the restoration model.
Disclosure of Invention
The invention aims to provide a no-reference image quality evaluation method based on depth feature transfer learning, in order to solve the problems that, in the image quality evaluation task, a quality evaluation network built with traditional transfer learning is difficult to train and its prediction accuracy and generalization are unsatisfactory, because the distortion features of artificially synthesized distorted images differ greatly from those of real-scene images.
The idea for realizing the purpose of the invention is as follows: important features are modeled by constructing a multi-branch feature attention module, and attention modeling automatically highlights the features sensitive to local image distortion, yielding attention features that can express image quality. The multi-branch feature attention module splits the feature maps of the input image into two groups and uses an inter-channel attention mechanism to adaptively recalibrate the channel feature responses within each group. This ensures that the no-reference image quality regression network can learn the diverse features of no-reference images, effectively learns distortion patterns that differ from those of real-scene images, and quickly adapts between real-scene images and artificially synthesized distorted images, markedly improving the prediction accuracy of image quality and overcoming the difficulties of hard-to-train quality evaluation networks, low prediction accuracy, and unsatisfactory generalization.
The specific steps for realizing the purpose of the invention are as follows:
(1) Constructing a distortion feature extraction sub-network:
(1a) Build a five-layer image distortion feature extraction sub-network whose structure is, in order: a convolutional layer, a 1st convolution computing unit, a 2nd convolution computing unit, a 3rd convolution computing unit, and a 4th convolution computing unit; the 1st to 4th convolution computing units adopt bottleneck structures, each formed by cascading three convolutional layers;
(1b) Set the number of input channels of the convolutional layer to 64, the number of output channels to 128, the convolution kernel size to 7×7, and the stride to 2; the numbers of bottleneck structures in the 1st to 4th convolution computing units are 3, 4, 6, and 3 respectively, and the convolution kernel sizes of the three convolutional layers in each bottleneck structure are set to 1×1, 3×3, and 1×1 respectively;
(2) Constructing a multi-branch feature attention module:
build a multi-branch feature attention module formed by cascading three convolutional layers; set the feature-map grouping number (groups) of the 1st convolutional layer to 2, set the numbers of input channels of the 1st to 3rd convolutional layers to 64, 128, and 128 respectively, set the convolution kernel sizes to 3×3, 1×1, and 1×1 respectively, and set all strides to 1;
(3) Constructing a quality regression sub-network:
construct a quality regression sub-network formed by connecting in parallel two down-sampling layer groups with identical structure and parameter settings; each parallel down-sampling layer group consists of five cascaded linear layers, whose numbers of nodes are set to 2048, 1024, 512, 256, and 64 respectively and whose random node deactivation (dropout) rates are set to 0.5, 0.25, and 0 respectively;
(4) Generating a no-reference image quality regression network:
cascade the distortion feature extraction sub-network, the multi-branch feature attention module, the quality regression sub-network, and a prediction layer in sequence into a no-reference image quality regression network; the number of input nodes of the prediction layer is set to 128 and the number of output nodes to 1;
(5) Generating a training set:
(5a) Select at least 1020 and at most 6000 no-reference natural images from a natural image quality evaluation data set to form a sample set, and apply normalization and preprocessing to each image in the sample set in turn;
(5b) All preprocessed images and their corresponding labels form the training set;
(6) Training the no-reference image quality regression network:
set the training parameters, input the training set into the no-reference image quality regression network, and iteratively update the network parameters with stochastic gradient descent until the loss function converges, obtaining the trained no-reference image quality regression network;
(7) Evaluating the quality of the no-reference image to be evaluated:
normalize and preprocess the no-reference image to be evaluated with the same methods as steps (5a) and (5b), input the preprocessed image into the trained no-reference image quality regression network, and output the predicted quality score of the image.
Compared with the prior art, the invention has the following advantages:
First, because the multi-branch feature attention module is constructed, distortion-aware quality features that strongly affect human visual perception are extracted adaptively from the input no-reference image. This overcomes the prior-art need for extensive pre-training on large image databases, which arises because the artificially synthesized distortion target domain differs greatly from the real-scene source domain; the module can adaptively extract the important features of the input image, so the invention predicts no-reference image quality more accurately.
Second, because the invention constructs a quality regression sub-network composed of two parallel branches, the two-branch design enhances the ability of the distortion feature extraction sub-network to aggregate quality-aware features and provides a feature synergy effect. The constructed no-reference image quality regression network is learnable end to end, avoiding the cumbersome processing flow and weak feature expression of prior-art two-stage image quality prediction models. With the synergy of the quality regression sub-network, the network learns more efficiently, and when evaluating no-reference image quality it agrees more closely with human visual perception, predicts more accurately, and generalizes better.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The invention is further described below with reference to FIG. 1 and simulation experiments.
Step 1, constructing a distortion feature extraction sub-network.
Build a five-layer image distortion feature extraction sub-network whose structure is, in order: a convolutional layer, a 1st convolution computing unit, a 2nd convolution computing unit, a 3rd convolution computing unit, and a 4th convolution computing unit. The 1st to 4th convolution computing units adopt bottleneck structures, each formed by cascading three convolutional layers.
Set the number of input channels of the convolutional layer to 64, the number of output channels to 128, the convolution kernel size to 7×7, and the stride to 2. The numbers of bottleneck structures in the 1st to 4th convolution computing units are 3, 4, 6, and 3 respectively, and the convolution kernel sizes of the three convolutional layers in each bottleneck structure are set to 1×1, 3×3, and 1×1 respectively.
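As a concrete illustration, a backbone of this shape can be sketched in PyTorch (the framework named in the simulation section). The 3/4/6/3 bottleneck counts and the 7×7 stride-2 stem match the standard ResNet-50 layout, so this sketch follows the usual ResNet-50 channel plan (a 3-channel RGB stem and unit widths 256/512/1024/2048); those exact channel counts are an assumption, since the patent text lists them only partially.

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """Bottleneck structure: three cascaded conv layers (1x1, 3x3, 1x1)."""
    def __init__(self, in_ch, mid_ch, stride=1):
        super().__init__()
        out_ch = mid_ch * 4
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # projection shortcut when the shape changes, identity otherwise
        self.skip = (nn.Identity() if in_ch == out_ch and stride == 1 else
                     nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                                   nn.BatchNorm2d(out_ch)))

    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))

def make_unit(in_ch, mid_ch, n_blocks, stride):
    """One 'convolution computing unit': a stack of n_blocks bottlenecks."""
    blocks = [Bottleneck(in_ch, mid_ch, stride)]
    blocks += [Bottleneck(mid_ch * 4, mid_ch) for _ in range(n_blocks - 1)]
    return nn.Sequential(*blocks)

class DistortionFeatureExtractor(nn.Module):
    """7x7 stride-2 stem followed by units of 3, 4, 6, 3 bottlenecks."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1))
        self.unit1 = make_unit(64, 64, 3, 1)
        self.unit2 = make_unit(256, 128, 4, 2)
        self.unit3 = make_unit(512, 256, 6, 2)
        self.unit4 = make_unit(1024, 512, 3, 2)

    def forward(self, x):
        return self.unit4(self.unit3(self.unit2(self.unit1(self.stem(x)))))
```

Under these assumptions the network reduces spatial resolution by a factor of 32 and emits a 2048-channel feature map.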
Step 2, constructing a multi-branch feature attention module.
Build a multi-branch feature attention module formed by cascading three convolutional layers. Set the feature-map grouping number (groups) of the 1st convolutional layer to 2, set the numbers of input channels of the 1st to 3rd convolutional layers to 64, 128, and 128 respectively, set the convolution kernel sizes to 3×3, 1×1, and 1×1 respectively, and set all strides to 1.
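A minimal PyTorch sketch of such a module follows. The three conv shapes (a 3×3 layer with groups=2, then two 1×1 layers; channels 64→128→128→128; stride 1) are taken from the text; the sigmoid channel gate that recalibrates the grouped responses is an assumption, since the patent specifies the layer shapes but not the exact attention arithmetic.

```python
import torch
import torch.nn as nn

class MultiBranchFeatureAttention(nn.Module):
    """Three cascaded conv layers; the 1st splits its input feature maps
    into two groups (groups=2).  A sigmoid gate computed from globally
    pooled responses recalibrates each channel, highlighting
    distortion-sensitive features."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1, groups=2)
        self.conv2 = nn.Conv2d(128, 128, kernel_size=1, stride=1)
        self.conv3 = nn.Conv2d(128, 128, kernel_size=1, stride=1)
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        a = torch.relu(self.conv1(x))                    # grouped 3x3 conv, 64 -> 128
        a = torch.relu(self.conv2(a))                    # 1x1 conv mixes the two groups
        gate = torch.sigmoid(self.conv3(self.pool(a)))   # per-channel weights in (0, 1)
        return a * gate                                  # recalibrated channel responses
```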
Step 3, constructing a quality regression sub-network.
Construct a quality regression sub-network formed by connecting in parallel two down-sampling layer groups with identical structure and parameter settings. Each parallel down-sampling layer group consists of five cascaded linear layers, whose numbers of nodes are set to 2048, 1024, 512, 256, and 64 respectively and whose random node deactivation (dropout) rates are set to 0.5, 0.25, and 0 respectively.
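The two-branch regression head can be sketched as follows, assuming a 2048-d pooled feature vector as input. How the three listed dropout rates (0.5, 0.25, 0) map onto the layers is not fully specified, so the assignment below is an assumption; the two 64-d branch outputs are concatenated into the 128-d vector the prediction layer expects.

```python
import torch
import torch.nn as nn

def down_sampling_group(drop_rates=(0.5, 0.25, 0.0, 0.0)):
    """Five cascaded linear layers with 2048/1024/512/256/64 nodes; the
    dropout-rate assignment is an assumption about the listed values."""
    dims = [2048, 1024, 512, 256, 64]
    layers = []
    for i, p in enumerate(drop_rates):
        layers += [nn.Linear(dims[i], dims[i + 1]), nn.ReLU(inplace=True), nn.Dropout(p)]
    return nn.Sequential(*layers)

class QualityRegression(nn.Module):
    """Two parallel down-sampling layer groups with identical structure;
    their 64-d outputs are concatenated into a 128-d feature."""
    def __init__(self):
        super().__init__()
        self.branch1 = down_sampling_group()
        self.branch2 = down_sampling_group()

    def forward(self, x):
        return torch.cat([self.branch1(x), self.branch2(x)], dim=1)
```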
Step 4, generating a no-reference image quality regression network.
Cascade the distortion feature extraction sub-network, the multi-branch feature attention module, the quality regression sub-network, and a prediction layer in sequence into a no-reference image quality regression network. The number of input nodes of the prediction layer is set to 128 and the number of output nodes to 1.
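The overall wiring might look like the sketch below; the backbone, attention, and regressor arguments stand in for the modules of steps 1-3, and the global average pooling between the attention module and the regression head is an assumption made so that the tensor shapes line up.

```python
import torch
import torch.nn as nn

class NRIQANetwork(nn.Module):
    """Cascade: distortion feature extraction sub-network -> multi-branch
    feature attention module -> global average pooling (an assumption, to
    flatten the feature map) -> quality regression sub-network ->
    prediction layer with 128 input nodes and 1 output node."""
    def __init__(self, backbone, attention, regressor):
        super().__init__()
        self.backbone = backbone
        self.attention = attention
        self.regressor = regressor
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.predict = nn.Linear(128, 1)

    def forward(self, x):
        f = self.attention(self.backbone(x))    # distortion-aware feature map
        f = self.pool(f).flatten(1)             # B x C feature vector
        return self.predict(self.regressor(f))  # B x 1 quality score
```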
Step 5, generating a training set.
Select at least 1020 and at most 6000 no-reference natural images from a natural image quality evaluation data set to form a sample set, and apply normalization and preprocessing to each image in the sample set in turn.
The normalization is as follows: each image is scaled to the range [0, 1], and the three channels R, G, B are then normalized with means mean = [0.485, 0.456, 0.406] and standard deviations std = [0.229, 0.224, 0.225] respectively.
The preprocessing divides each normalized image into non-overlapping 32×32 image blocks with a sampling stride of 32.
All preprocessed images and their corresponding labels form the training set.
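The normalization and patching of step 5 can be written, for example, with NumPy as follows (RGB images as H×W×3 uint8 arrays; border rows and columns that do not fill a complete 32×32 block are assumed to be dropped):

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(img):
    """Scale a uint8 RGB image to [0, 1], then normalize each of the
    R, G, B channels with the stated per-channel mean and std."""
    x = img.astype(np.float32) / 255.0
    return (x - MEAN) / STD

def to_patches(img, size=32):
    """Divide an image into non-overlapping size x size blocks with a
    sampling stride equal to size; incomplete border blocks are dropped."""
    h, w = img.shape[:2]
    return [img[r:r + size, c:c + size]
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]
```

A 64×96 image, for instance, yields six 32×32 blocks.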
Step 6, training the no-reference image quality regression network.
Set the training parameters, input the training set into the no-reference image quality regression network, and iteratively update the network parameters with stochastic gradient descent until the loss function converges, obtaining the trained no-reference image quality regression network.
The training parameters are set as follows: the small constant eps = 1e-8; the exponential decay rates of the first- and second-order moment estimates β1 = 0.9 and β2 = 0.999; the first- and second-order moment estimates initialized to s = 0 and r = 0 respectively; the initial learning rate lr_ratio = 1e-3; the batch size batch_size = 128; and the weight decay weight_decay = 0.
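Note that the listed hyper-parameters (eps = 1e-8, exponential decay rates 0.9 and 0.999 for the first- and second-order moment estimates s and r) are those of the Adam optimizer, even though the text names stochastic gradient descent; a sketch of the corresponding optimizer setup in PyTorch, with a stand-in model, would be:

```python
import torch
import torch.nn as nn

# Stand-in for the no-reference image quality regression network; the real
# model would be the cascaded network of steps 1-4.
model = nn.Linear(2048, 1)

# eps = 1e-8, betas = (0.9, 0.999); Adam initializes both moment estimates
# to zero (s = r = 0); lr_ratio = 1e-3; weight_decay = 0
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), eps=1e-8, weight_decay=0.0)
batch_size = 128
```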
The loss function is as follows:

$$ L = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{Q}_i - Q_i\right| $$

where $L(\cdot)$ represents the loss function of the no-reference image quality regression network, $\hat{Q}_i$ represents the label of the i-th image in the training set, $Q_i$ represents the predicted value of the i-th image in the training set output by the no-reference image quality regression network, $N$ represents the total number of images in the training set, $\sum$ represents summation, $i$ represents the index of an image in the training set, and $|\cdot|$ represents the absolute-value operation.
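This is the mean absolute error (L1 loss) between labels and predictions, which in PyTorch could be computed as:

```python
import torch

def l1_quality_loss(pred, target):
    """Mean absolute error between predicted scores Q_i and labels Q_i-hat,
    averaged over the N images (equivalent to torch.nn.L1Loss)."""
    return torch.mean(torch.abs(pred - target))
```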
Step 7, evaluating the quality of the no-reference image to be evaluated.
Normalize and preprocess the no-reference image to be evaluated with the same methods as step 5, input the preprocessed image into the trained no-reference image quality regression network, and output the predicted quality score of the image.
The effect of the present invention is further explained by combining the simulation experiment as follows:
1. simulation experiment conditions are as follows:
the hardware platform of the simulation experiment of the invention is as follows: the processor is Intel (R) Core (TM) i9-7900X @3.30GHz, the main frequency is 3.30GHz, the memory is 32GB, and the display card is NVIDIA GeForce GTX 1080Ti.
The software platform of the simulation experiment of the invention is as follows: ubuntu 16.04.12 operating system, pyTorch-gpu 1.6 open source deep learning framework, python 3.7.
The input images used by the simulation experiment of the invention are natural images and come from image quality evaluation known databases TID2008, TID2013 and KADID-10k.
The TID2008 database includes 25 reference images and 1700 distorted images, in bmp format.
The TID2013 database includes 25 reference images and 3000 distorted images, in bmp format.
The KADID-10k database includes 81 reference images and 10125 distorted images, in png format.
2. Simulation content and analysis of results:
The simulation experiments use the invention and two prior-art methods (CNN, a deep no-reference image quality evaluation method based on a convolutional neural network, and HOSA, a blind image quality evaluation method based on high-order statistics aggregation) to predict no-reference quality for the distorted images in the three well-known image quality evaluation databases TID2008, TID2013, and KADID-10k.
Evaluation indices are obtained by calculating the consistency between the predicted no-reference quality values and the image labels, and these indices are used to measure the quality evaluation effect of the invention and of the two prior-art methods on the no-reference images in the three databases.
The two prior-art methods adopted in the simulation experiments are:
the no-reference image quality evaluation method based on the Convolutional neural network refers to a no-reference image quality evaluation method provided by L.Kang et al in' Convolutional neural networks for no-reference image quality assessment [ C ]// Proceedings ofhe IEEE con-ference on computer vision and pattern registration.2014: 1733-1740 ], which is called a depth N no-reference image quality evaluation method based on the Convolutional neural network for short.
The blind image quality evaluation method based on high-order statistics aggregation refers to the no-reference image quality evaluation method proposed by J. Xu et al. in "Blind image quality assessment based on high order statistics aggregation" (IEEE Transactions on Image Processing, 2016, 25(9): 4444-4457), abbreviated as the HOSA no-reference image quality evaluation method.
The three well-known image quality evaluation databases used in the simulation experiments are:
the TID2008 well-known database refers to the database of n.pontomarenko et al in "TID2008-a database for evaluation of future visual quality assessment metrics [ J ]. Advances of model radio electronics,2009, 10:30-45 ", referred to as TID2013 known database.
The well-known TID2013 database refers to the image quality evaluation database proposed by N. Ponomarenko et al. in "Color image database TID2013: Peculiarities and preliminary results" (European Workshop on Visual Information Processing (EUVIP), 106-111, 2013), abbreviated as the TID2013 database.
The well-known KADID-10k database refers to the image quality evaluation database proposed by Lin H. et al. in "KADID-10k: A large-scale artificially distorted IQA database" (Eleventh International Conference on Quality of Multimedia Experience (QoMEX), 2019), abbreviated as the KADID-10k database.
To judge the no-reference image quality evaluation effect of the invention and of the two prior-art methods, the simulation experiments use two indices, the Spearman rank-order correlation coefficient (SROCC) and the Pearson linear correlation coefficient (PLCC), for an objective comparison.
(1) Spearman Rank Order Correlation Coefficient (SROCC)
The Spearman correlation determines the strength and direction of a monotonic relation between two variables and measures the monotonicity of the algorithm's predictions; its expression is:

$$ SROCC = 1 - \frac{6\sum_{i=1}^{n}(r_{x_i} - r_{y_i})^2}{n(n^2 - 1)} $$

where $r_{x_i}$ represents the rank of the subjective quality evaluation result of the i-th test image, $r_{y_i}$ represents the rank of the objective quality evaluation result, $(r_{x_i} - r_{y_i})^2$ represents the squared difference between the two, computed from the sorted set of rank differences, and $n$ is the number of test images.
(2) Pearson Linear Correlation Coefficient (PLCC)
$x_i$ and $y_i$ represent the subjective quality evaluation score and the objective score of the i-th test image, respectively. The expression is:

$$ PLCC = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}} $$

where $n$ is the total number of images, and $\bar{x}$ and $\bar{y}$ are the means of the subjective human evaluation scores for the database and of the scores computed automatically by the objective evaluation algorithm, respectively. The linear correlation coefficient describes the correlation between the algorithm's evaluation values and the human subjective scores, and thus measures the accuracy of the algorithm's predictions.
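Both indices can be computed with a few lines of NumPy; the SROCC form below assumes no tied scores (with ties, the Pearson correlation of the ranks is used instead):

```python
import numpy as np

def srocc(x, y):
    """Spearman rank-order correlation, 1 - 6*sum(d_i^2) / (n*(n^2 - 1)),
    where d_i is the rank difference of the i-th pair (no-ties form)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    rx = np.argsort(np.argsort(x)).astype(float)   # ranks of x
    ry = np.argsort(np.argsort(y)).astype(float)   # ranks of y
    d = rx - ry
    n = len(x)
    return 1.0 - 6.0 * np.sum(d * d) / (n * (n * n - 1))

def plcc(x, y):
    """Pearson linear correlation between objective and subjective scores."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    xm, ym = x - x.mean(), y - y.mean()
    return np.sum(xm * ym) / np.sqrt(np.sum(xm * xm) * np.sum(ym * ym))
```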
The simulation experiments evaluate the images in the three different well-known databases with the invention and the two prior-art methods and calculate the two consistency indices for each method's results; the results are shown in Table 1.
Table 1. Comparison of the evaluation results of the three methods
[Table 1 appears as an image in the original document; it lists the SROCC and PLCC values of the CNN method, the HOSA method, and the invention on the TID2008, TID2013, and KADID-10k databases.]
As can be seen from Table 1, the Spearman rank-order correlation coefficient SROCC and the Pearson linear correlation coefficient PLCC of the invention's results on the three well-known image quality evaluation databases are both higher than those of the two prior-art methods, showing that the invention achieves a better no-reference image quality evaluation effect.

Claims (4)

1. A no-reference image quality evaluation method based on depth feature transfer learning, characterized in that a multi-branch feature attention module is embedded in a distortion feature extraction sub-network, two parallel branches are connected at the tail of the distortion feature extraction sub-network as a quality regression sub-network, and a prediction layer predicts the quality score of the distorted image; the method comprises the following specific steps:
(1) Constructing a distortion feature extraction sub-network:
(1a) Build a five-layer image distortion feature extraction sub-network whose structure is, in order: a convolution layer followed by the 1st, 2nd, 3rd, and 4th convolution computing units; the 1st to 4th convolution computing units adopt bottleneck structures, each formed by cascading three convolution layers;
(1b) Set the number of input channels of the convolution layer to 64, the number of output channels to 128, the convolution kernel size to 7×7, and the stride to 2; the numbers of bottleneck structures in the 1st to 4th convolution computing units are 3, 4, 6, and 3 respectively, and the convolution kernel sizes of the three convolution layers in each bottleneck structure are set to 1×1, 3×3, and 1×1 respectively;
(2) Constructing a multi-branch feature attention module:
Build a multi-branch feature attention module formed by cascading three convolution layers; set the feature-map grouping number (groups) of the 1st convolution layer to 2, set the numbers of input channels of the 1st to 3rd convolution layers to 64, 128, and 128 respectively, set the convolution kernel sizes to 3×3, 1×1, and 1×1 respectively, and set all strides to 1;
(3) Constructing a quality regression subnetwork:
Construct a quality regression sub-network formed by connecting in parallel two down-sampling layer groups with identical structure and parameter settings; each parallel down-sampling layer group consists of five cascaded linear layers, whose numbers of nodes are set to 2048, 1024, 512, 256, and 64 respectively, and whose random node deactivation (dropout) rates are set to 0.5, 0.25, and 0 respectively;
(4) Generating a reference-free image quality regression network:
Cascade the distortion feature extraction sub-network, the multi-branch feature attention module, the quality regression sub-network, and the prediction layer in sequence into a no-reference image quality regression network; the number of input nodes of the prediction layer is set to 128, and the number of output nodes is set to 1;
(5) Generating a training set:
(5a) Select at least 1020 and at most 6000 no-reference natural images from a natural image quality evaluation data set to form a sample set, and perform normalization and preprocessing in sequence on each image in the sample set;
(5b) Form a training set from all the preprocessed images and their corresponding labels;
(6) Training a non-reference image quality regression network:
Set the training parameters, input the training set into the no-reference image quality regression network, and iteratively update the network parameters by stochastic gradient descent until the loss function converges, obtaining a trained no-reference image quality regression network;
(7) And (3) performing quality evaluation on the non-reference image to be evaluated:
Normalize and preprocess the no-reference image to be evaluated by the same method as steps (5a) and (5b), input the preprocessed image into the trained no-reference image quality regression network, and output the predicted quality score of the image.
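As an illustration (not part of the claims), the quality regression sub-network of step (3) and the prediction layer of step (4) could be sketched in PyTorch as below. The input dimensionality of the first linear layer (2048), the ReLU activations, and the exact placement of the 0.5 and 0.25 dropout rates across the five layers are assumptions made for this sketch:

```python
import torch
import torch.nn as nn

def branch():
    # One down-sampling layer group: cascaded linear layers with node counts
    # 2048 -> 1024 -> 512 -> 256 -> 64, dropout rates 0.5, 0.25, then 0 (assumed placement).
    return nn.Sequential(
        nn.Linear(2048, 1024), nn.ReLU(), nn.Dropout(0.5),
        nn.Linear(1024, 512), nn.ReLU(), nn.Dropout(0.25),
        nn.Linear(512, 256), nn.ReLU(),
        nn.Linear(256, 64),
    )

class QualityHead(nn.Module):
    """Two parallel identical branches whose 64-d outputs are concatenated
    and fed to a 128 -> 1 prediction layer, matching steps (3) and (4)."""
    def __init__(self):
        super().__init__()
        self.branch_a = branch()
        self.branch_b = branch()
        self.predict = nn.Linear(128, 1)  # step (4): 128 input nodes, 1 output node

    def forward(self, feats):  # feats: (batch, 2048) distortion features
        fused = torch.cat([self.branch_a(feats), self.branch_b(feats)], dim=1)
        return self.predict(fused)  # (batch, 1) predicted quality score
```

The concatenation of the two 64-node branch outputs is what yields the 128 input nodes that the prediction layer expects.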
2. The no-reference image quality evaluation method based on depth feature transfer learning according to claim 1, wherein the normalization in step (5a) is: for the R, G, B channels of each image in the sample set, whose values lie in the range [0, 1], normalize using the means mean = [0.485, 0.456, 0.406] and the standard deviations std = [0.229, 0.224, 0.225] respectively; the preprocessing in step (5a) refers to dividing each normalized image into non-overlapping image blocks of size 32×32 with a sampling stride of 32.
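A minimal sketch (an illustration, not the patented implementation) of the normalization and non-overlapping 32×32 patching of claim 2, assuming the input is a (3, H, W) float tensor with values in [0, 1] and H, W divisible by 32:

```python
import torch

MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def preprocess(img):
    """Normalize a (3, H, W) image per channel, then split it into
    non-overlapping 32x32 blocks with sampling stride 32."""
    x = (img - MEAN) / STD
    # unfold height then width: (3, H//32, W//32, 32, 32)
    patches = x.unfold(1, 32, 32).unfold(2, 32, 32)
    # -> (num_patches, 3, 32, 32)
    return patches.permute(1, 2, 0, 3, 4).reshape(-1, 3, 32, 32)
```

A 96×64 image yields (96 // 32) × (64 // 32) = 6 patches; because the stride equals the block size, no pixels are shared between blocks.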
3. The no-reference image quality evaluation method based on depth feature transfer learning according to claim 1, wherein the loss function in step (6) is as follows:

L = (1/N) Σᵢ₌₁ᴺ |Q̂ᵢ − Qᵢ|

wherein L denotes the loss function of the no-reference image quality regression network, Q̂ᵢ denotes the label of the i-th image in the training set, Qᵢ denotes the predicted value for the i-th image in the training set output by the no-reference image quality regression network, N denotes the total number of images in the training set, Σ denotes the summation operation, i denotes the sequence number of an image in the training set, and |·| denotes the absolute value operation.
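A plain-Python sketch of the absolute-error loss described in claim 3; taking the mean (division by N) rather than the bare sum is an assumption consistent with N being defined as the total number of training images:

```python
def l1_loss(labels, preds):
    """Mean absolute error between labels Q_hat_i and network predictions Q_i."""
    assert len(labels) == len(preds)
    n = len(labels)
    return sum(abs(q_hat - q) for q_hat, q in zip(labels, preds)) / n
```

In a PyTorch training loop this corresponds to `torch.nn.L1Loss()`, whose default reduction is also the mean.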
4. The no-reference image quality evaluation method based on depth feature transfer learning according to claim 1, wherein the training parameters set in step (6) are as follows: the small constant is set to eps = 1e-8; the exponential decay rates of the first- and second-order moment estimates are set to β₁ = 0.9 and β₂ = 0.999; the first- and second-order moment estimates are initialized to s = 0 and r = 0 respectively; the initial learning rate is set to lr_ratio = 1e-3; the batch size is set to batch_size = 128; and the weight decay is set to weight_decay = 0.
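Although step (6) speaks of stochastic gradient descent, the parameters listed in claim 4 (eps, β₁, β₂, and the moment estimates s and r) are the hyperparameters of the Adam optimizer. Under that assumption, they could be configured in PyTorch as follows (the `Linear` model is a stand-in for the full quality regression network):

```python
import torch

model = torch.nn.Linear(128, 1)  # stand-in for the quality regression network
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,             # initial learning rate lr_ratio
    betas=(0.9, 0.999),  # beta_1, beta_2: moment-estimate exponential decay rates
    eps=1e-8,            # small constant eps
    weight_decay=0,      # weight decay
)
```

Adam initializes its first- and second-order moment buffers to zero internally, matching s = 0 and r = 0 in the claim; the batch size of 128 would be set on the data loader, not the optimizer.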
CN202110678186.6A 2021-06-18 2021-06-18 No-reference image quality evaluation method based on depth feature transfer learning Active CN113421237B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110678186.6A CN113421237B (en) 2021-06-18 2021-06-18 No-reference image quality evaluation method based on depth feature transfer learning


Publications (2)

Publication Number Publication Date
CN113421237A CN113421237A (en) 2021-09-21
CN113421237B true CN113421237B (en) 2023-04-18

Family

ID=77789167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110678186.6A Active CN113421237B (en) 2021-06-18 2021-06-18 No-reference image quality evaluation method based on depth feature transfer learning

Country Status (1)

Country Link
CN (1) CN113421237B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113888501B (en) * 2021-09-29 2024-02-06 西安理工大学 Attention positioning network-based reference-free image quality evaluation method
CN114066812B (en) * 2021-10-13 2024-02-06 西安理工大学 No-reference image quality evaluation method based on spatial attention mechanism
CN117456339B (en) * 2023-11-17 2024-05-17 武汉大学 Image quality evaluation method and system based on multi-level feature multiplexing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516716A (en) * 2019-08-05 2019-11-29 西安电子科技大学 Non-reference picture quality appraisement method based on multiple-limb similarity network
CN112085102A (en) * 2020-09-10 2020-12-15 西安电子科技大学 No-reference video quality evaluation method based on three-dimensional space-time characteristic decomposition

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107644415B (en) * 2017-09-08 2019-02-22 众安信息技术服务有限公司 A kind of text image method for evaluating quality and equipment
CN108428227B (en) * 2018-02-27 2020-06-26 浙江科技学院 No-reference image quality evaluation method based on full convolution neural network
CN109308696B (en) * 2018-09-14 2021-09-28 西安电子科技大学 No-reference image quality evaluation method based on hierarchical feature fusion network
CN109242864B (en) * 2018-09-18 2021-09-24 电子科技大学 Image segmentation result quality evaluation method based on multi-branch network
CN110415170B (en) * 2019-06-24 2022-12-16 武汉大学 Image super-resolution method based on multi-scale attention convolution neural network
CN111353533B (en) * 2020-02-26 2022-09-13 南京理工大学 No-reference image quality evaluation method and system based on multi-task learning
CN112419242B (en) * 2020-11-10 2023-09-15 西北大学 No-reference image quality evaluation method based on self-attention mechanism GAN network
CN112634238B (en) * 2020-12-25 2024-03-08 武汉大学 Attention module-based image quality evaluation method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516716A (en) * 2019-08-05 2019-11-29 西安电子科技大学 Non-reference picture quality appraisement method based on multiple-limb similarity network
CN112085102A (en) * 2020-09-10 2020-12-15 西安电子科技大学 No-reference video quality evaluation method based on three-dimensional space-time characteristic decomposition


Similar Documents

Publication Publication Date Title
CN113421237B (en) No-reference image quality evaluation method based on depth feature transfer learning
CN108830157B (en) Human behavior identification method based on attention mechanism and 3D convolutional neural network
CN108765319B (en) Image denoising method based on generation countermeasure network
CN110992275B (en) Refined single image rain removing method based on generation of countermeasure network
CN111784602B (en) Method for generating countermeasure network for image restoration
CN108665460B (en) Image quality evaluation method based on combined neural network and classified neural network
CN110516716B (en) No-reference image quality evaluation method based on multi-branch similarity network
CN111340814A (en) Multi-mode adaptive convolution-based RGB-D image semantic segmentation method
CN111080567A (en) Remote sensing image fusion method and system based on multi-scale dynamic convolution neural network
CN112348191B (en) Knowledge base completion method based on multi-mode representation learning
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
CN109859166B (en) Multi-column convolutional neural network-based parameter-free 3D image quality evaluation method
CN112101363A (en) Full convolution semantic segmentation system and method based on cavity residual error and attention mechanism
CN110009700B (en) Convolutional neural network visual depth estimation method based on RGB (red, green and blue) graph and gradient graph
CN115205196A (en) No-reference image quality evaluation method based on twin network and feature fusion
CN115205147A (en) Multi-scale optimization low-illumination image enhancement method based on Transformer
CN115526891B (en) Training method and related device for defect data set generation model
CN115331104A (en) Crop planting information extraction method based on convolutional neural network
Cai et al. Multiscale attentive image de-raining networks via neural architecture search
CN117351542A (en) Facial expression recognition method and system
CN114170657A (en) Facial emotion recognition method integrating attention mechanism and high-order feature representation
CN111639751A (en) Non-zero padding training method for binary convolutional neural network
CN110942106A (en) Pooling convolutional neural network image classification method based on square average
CN115761888A (en) Tower crane operator abnormal behavior detection method based on NL-C3D model
CN113486929B (en) Rock slice image identification method based on residual shrinkage module and attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant