CN113128517B - Tone mapping image mixed visual feature extraction model establishment and quality evaluation method - Google Patents

Tone mapping image mixed visual feature extraction model establishment and quality evaluation method

Info

Publication number
CN113128517B
Authority
CN
China
Prior art keywords
image
distorted
evaluated
distortion
distorted image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110300592.9A
Other languages
Chinese (zh)
Other versions
CN113128517A (en)
Inventor
张敏
许筱敏
张汝雪
石小妹
冯筠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NORTHWEST UNIVERSITY filed Critical NORTHWEST UNIVERSITY
Priority to CN202110300592.9A
Publication of CN113128517A
Application granted
Publication of CN113128517B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30168 Image quality inspection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image quality evaluation method, a modeling method and a system based on mixed visual features. The modeling method comprises the following steps: dividing a distorted image into several non-overlapping image blocks, sending the image blocks into a multi-scale feature fusion network to extract the multi-scale content features of the image, and computing the gradient map corresponding to the distorted image to obtain mixed visual perception features, which are mapped to human subjective scores using support vector regression. Drawing on the hierarchical perception mechanism of the human visual system, the method designs a new multi-scale feature fusion network that expresses the layered degradation of image quality and represents image distortion more comprehensively; in addition, combining the primary perception characteristics of the human eye, a two-stream feature extraction model comprising an image stream and a gradient stream is constructed. The improved tone-mapped image quality evaluation model extracts richer image quality perception features and achieves better accuracy and generality.

Description

Tone mapping image mixed visual feature extraction model establishment and quality evaluation method
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a tone mapping image mixed visual feature extraction model establishment and quality evaluation method.
Background
With the development of digital imaging technology, high dynamic range (HDR) images have emerged. Thanks to their wide dynamic range and rich real-scene detail, HDR images have great application value in video production, virtual reality, remote sensing, medicine, the military and other fields. However, HDR displays are still not widespread; existing image processing systems mainly use conventional 8-bit display devices, whose capabilities HDR content far exceeds. Visualizing HDR images on conventional displays therefore inevitably causes loss of image information and degradation of perceived quality. To visualize HDR images on standard 8-bit displays, various tone mapping operators (TMOs) have been proposed to convert HDR images into low dynamic range (LDR) images. Since the conversion of dynamic range inevitably introduces complex distortions and degrades perceived visual quality, an objective method is required to evaluate the quality of tone-mapped images (TMIs). No-reference image quality assessment (NRIQA) is a research task in the field of image processing aimed at designing computational models that do not depend on any prior knowledge and can automatically evaluate image quality; its results quantify image performance and provide an important basis for research in other areas of image processing.
Existing no-reference tone-mapped image quality evaluation methods fall into two categories. The first designs hand-crafted feature descriptors to extract effective image quality degradation features and then uses a nonlinear regression method, such as support vector regression (SVR), to regress the high-dimensional features to quality scores. Such methods are knowledge-driven and require feature descriptors designed manually from human visual system (HVS) or natural scene statistics (NSS) characteristics. However, it is difficult to design hand-crafted features that effectively represent no-reference image quality degradation.
Because of the rich and efficient feature representation capability of convolutional neural networks (CNNs), CNN-based NRIQA methods have been proposed; such methods are data-driven. In 2017, Abhinau et al. proposed extracting features of tone-mapped images with transfer learning and then mapping the extracted features to quality scores with SVR. In 2018, He et al., considering the complexity of distortion in tone-mapped images and the need to extract information at different scales and levels when predicting image quality, constructed a new no-reference tone-mapped image quality evaluation method that extracts multi-scale, multi-level features from a pre-trained deep convolutional neural network model, improving performance.
In summary, the existing no-reference tone-mapped image quality evaluation methods mainly have the following disadvantages:
(1) Current data-driven methods mainly use the output features of transfer-learned or pre-trained deep neural networks for quality prediction; they neither extract TMI-specific features nor fully consider image quality degradation, so model accuracy is limited.
(2) They neglect that the dynamic range conversion of a TMI may create a halation effect, which affects image quality to some extent and makes the image perceived by the human visual system differ from the real world; as a result, the models are less consistent with subjective human perception.
Disclosure of Invention
The invention aims to provide an image quality evaluation method, a modeling method and a system based on mixed visual features, to solve the problem in the prior art that evaluation models have low accuracy because image quality degradation is insufficiently considered.
To achieve this, the invention adopts the following technical scheme:
the method for establishing the tone mapping image mixed visual characteristic extraction model comprises the following steps:
step 1: obtaining a distorted image set and the quality fraction of each distorted image in the distorted image set, and calculating a gradient image corresponding to each distorted image through a Sobel operator to obtain a gradient image set; respectively blocking each distorted image in the distorted image set and each gradient image in the gradient image set to obtain a distorted image block set and a gradient image block set; the quality score of each distorted image block is made to be the quality score of the distorted image before the distorted image block is divided;
step 2: establishing a feature extraction network based on ResNet-50, taking a distorted image block set as a training set, taking the quality score of each distorted image block as a tag set, training the feature extraction network, and taking the trained feature extraction network as a feature extractor;
step 3: respectively inputting the distorted image block set and the gradient image block set obtained in the step 1 into the feature extractor obtained in the step 2 to perform feature extraction to respectively obtain multi-scale content features and primary visual features of each distorted image, and performing feature fusion on the scale content features and the primary visual features of each distorted image to obtain mixed visual features of each distorted image in the distorted image set;
step 4: and (3) establishing a support vector regressor, taking the mixed visual characteristics of all the distorted images obtained in the step (3) as a training set, taking the score of all the distorted images as a label set, taking the score of each distorted image as the average value of the quality scores of all the distorted image blocks contained in the distorted images, training the support vector regressor, and taking the trained support vector regressor as an image quality model.
Further, the feature extraction network comprises a residual block layer, a convolution layer and a global average pooling layer, wherein the residual block layer comprises a Conv1 layer, a Conv2 layer, a Conv3 layer, a Conv4 layer and a Conv5 layer, and the convolution layer comprises three 1×1 convolutions and one 3×3 convolutions.
Further, the primary visual features are extracted through Conv1 layer of the feature extraction network.
A tone-mapped image quality evaluation method comprises the following steps:
Step one: obtaining the distorted image to be evaluated, computing its gradient map with the Sobel operator to obtain the gradient map to be evaluated, and partitioning the distorted image to be evaluated and the gradient map to be evaluated into blocks, respectively, to obtain the distorted image block set to be evaluated and the gradient image block set to be evaluated;
Step two: inputting the distorted image block set to be evaluated and the gradient image block set to be evaluated, respectively, into a feature extractor obtained by the tone-mapped image mixed visual feature extraction model establishment method to obtain the multi-scale content features and primary visual features of the distorted image to be evaluated, and fusing them to obtain the mixed visual features of the distorted image to be evaluated;
Step three: inputting the mixed visual features of the distorted image to be evaluated into an image quality model obtained by the tone-mapped image mixed visual feature extraction model establishment method to obtain the quality score of the distorted image to be evaluated.
Compared with the prior art, the invention has the following technical characteristics:
(1) Combining the hierarchical perception mechanism of the human visual system, the invention exploits the layered degradation of image quality in the design of the no-reference framework to construct a multi-scale feature fusion network, thereby expressing image distortion more comprehensively.
(2) Combining the primary perception characteristics of human vision, the invention constructs a two-stream feature extraction model consisting of an image stream and a gradient stream. The distorted image is input into the image stream to extract multi-scale content features; considering that a TMI may exhibit a halation effect causing edge distortion, the gradient map corresponding to the distorted image is added to extract primary visual features that better express edge distortion information.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a frame diagram of the modeling of the present invention;
FIG. 3 is an exemplary graph of image quality degradation;
FIG. 4 is a diagram of a multi-scale feature fusion network;
FIG. 5 is a distortion map and a corresponding gradient map in an embodiment.
Detailed Description
Specific examples of the invention are given below. Note that: (1) the invention is not limited to the following specific examples; (2) when training the model, the data set must be divided into a training set and a test set, where the training set is not all of the data but only part of it, and the test set is used for testing after training to obtain the complete trained model; (3) the embodiments use the Python language to implement the construction of the whole model.
The embodiment discloses a method for establishing a tone-mapped image mixed visual feature extraction model, comprising the following steps:
Step 1: obtaining a distorted image set and the quality score of each distorted image in the set, and computing the gradient map corresponding to each distorted image with the Sobel operator to obtain a gradient image set; partitioning each distorted image in the distorted image set and each gradient image in the gradient image set into blocks, respectively, to obtain a distorted image block set and a gradient image block set; and setting the quality score of each distorted image block to the quality score of the distorted image from which it was divided;
Step 2: establishing a feature extraction network based on ResNet-50; taking the distorted image block set as the training set and the quality score of each distorted image block as the label set, training the feature extraction network, and taking the trained network as the feature extractor;
Step 3: inputting the distorted image block set and the gradient image block set obtained in step 1, respectively, into the feature extractor obtained in step 2 for feature extraction to obtain the multi-scale content features and primary visual features of each distorted image, and fusing the multi-scale content features and primary visual features of each distorted image to obtain the mixed visual features of each distorted image in the distorted image set;
Step 4: establishing a support vector regressor; taking the mixed visual features of all the distorted images obtained in step 3 as the training set and the scores of all the distorted images as the label set, where the score of each distorted image is the average of the quality scores of all the distorted image blocks it contains, training the support vector regressor, and taking the trained regressor as the image quality model.
Specifically, no two image blocks in the distorted image block set overlap each other, and no two image blocks in the gradient image block set overlap each other; the distorted image blocks are denoted x_i and the gradient image blocks y_i, i = 1, 2, .... Cropping several non-overlapping image blocks from the distortion map increases the amount of data on the one hand; on the other hand, a direct resizing operation would mask certain image artifacts, whereas cropping ensures that perceived image quality is unchanged.
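A minimal Python sketch of this blocking step follows (Python is the embodiment's stated implementation language); the function name and NumPy representation are illustrative assumptions, while the 224×224 block size follows the embodiment below:

```python
import numpy as np

def crop_non_overlapping_blocks(image: np.ndarray, size: int = 224):
    """Split an H x W (x C) image into non-overlapping size x size blocks.

    No resizing is performed, so the perceived quality of each block is
    unchanged; blocks that would extend past the border are discarded.
    """
    h, w = image.shape[0], image.shape[1]
    return [image[r:r + size, c:c + size]
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]

# Each block inherits the quality score of the distorted image it was cut from:
# samples = [(block, score) for block in crop_non_overlapping_blocks(img)]
```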
Specifically, the quality score of each distorted image block is denoted f(x_i), and the quality score of each distorted image is the average of the scores of all the distorted image blocks it contains:

$$\frac{1}{M}\sum_{i=1}^{M} f(x_i),$$

where M is the total number of distorted image blocks in the distorted image.
Specifically, the training phase for any distorted image X in the data set comprises the following substeps:
1) Crop the distorted image X into 4 non-overlapping 224×224 distorted image blocks x_1, x_2, x_3, x_4. Since the quality score of each distorted image block is the score of the distorted image it belongs to, given the quality score y of the distorted image X, the quality scores of x_1, x_2, x_3 and x_4 are all y;
2) Send x_1, x_2, x_3, x_4 and their scores {(x_1, y), (x_2, y), (x_3, y), (x_4, y)} into the feature extraction network to train the model, and extract the mixed visual features to train the support vector regression model.
In the test phase, a distorted image X' with true score y' = 3.5 is given. To verify the performance of the proposed model, the distorted image X' is first divided into image blocks x_1', x_2', x_3', x_4' of the same size as in the training phase; the mixed visual features are then extracted for each distorted image block, and the trained quality prediction model maps them to quality scores, giving the predicted score of each distorted image block: y_1' = 3.35, y_2' = 3.61, y_3' = 3.44, y_4' = 3.52. The quality scores of the 4 distorted image blocks are averaged as the predicted quality score of the final distortion map:

$$\hat{y}' = \frac{3.35 + 3.61 + 3.44 + 3.52}{4} = 3.48,$$

which is close to the true label 3.5 of the given distorted image.
Specifically, the feature extraction network comprises a residual block layer, a convolution layer and a global average pooling layer; the residual block layer comprises the Conv1, Conv2, Conv3, Conv4 and Conv5 layers, and the convolution layer comprises three 1×1 convolutions and one 3×3 convolution.
The feature extraction network is built following the hierarchical process of visual perception. First, the residual network ResNet-50 is used as the base network for semantic feature extraction, with the last two layers of ResNet-50 removed so that it outputs a feature stream. Second, multi-scale features are output from the four residual blocks Conv2, Conv3, Conv4 and Conv5 of ResNet-50. Next, the channel size is reduced by a 1×1 convolution and the spatial resolution is upsampled by a factor of 2; the upsampled feature map is fused with the corresponding higher-resolution feature map by element-wise addition, and this process is repeated until the feature map with the highest resolution is generated. Finally, a 3×3 convolution and global average pooling are applied to the resulting feature map to obtain the multi-scale content features.
Specifically, the network establishment and feature extraction in step 2 comprise the following substeps:
Step 2.1: use the residual network ResNet-50 as the base network for semantic feature extraction, initializing the network with a model pre-trained on ImageNet; visualizing the five residual blocks of the ResNet-50 model shows that distortion affects features at different levels and, from the IQA perspective, causes image quality degradation;
Step 2.2: output the multi-scale features F1, F2, F3 and F4 from the last layer of each of the four residual blocks Conv2, Conv3, Conv4 and Conv5 of ResNet-50;
Step 2.3: reduce the channel size of the output multi-scale features with a 1×1 convolution and upsample the spatial resolution by a factor of 2 to obtain a feature map; fuse the upsampled feature map with the corresponding higher-resolution feature map by element-wise addition, and repeat this process until the feature map with the highest resolution is generated;
Step 2.4: apply a 3×3 convolution and global average pooling to the resulting feature map to obtain the multi-scale content features.
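As a concrete illustration of steps 2.1 to 2.4, a minimal PyTorch sketch of the multi-scale feature fusion network follows. The class name, the 256-channel lateral width (matching Conv2's output) and the torchvision backbone API are illustrative assumptions; the text itself fixes only the structure: ResNet-50 with its last two layers removed, three 1×1 convolutions, ×2 upsampling fused by element-wise addition, and a final 3×3 convolution with global average pooling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class MultiScaleFusionNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet50(weights="IMAGENET1K_V1")  # pre-trained on ImageNet (step 2.1)
        # Conv1 stem; in the two-stream model its output also yields the
        # primary visual feature f_p when fed the gradient map.
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        # Residual blocks Conv2-Conv5; the last two layers of ResNet-50
        # (global pooling and the fc classifier) are discarded.
        self.conv2, self.conv3 = backbone.layer1, backbone.layer2
        self.conv4, self.conv5 = backbone.layer3, backbone.layer4
        # Three 1x1 convolutions reduce the channel sizes of F2-F4 to F1's 256.
        self.lat3 = nn.Conv2d(512, 256, kernel_size=1)
        self.lat4 = nn.Conv2d(1024, 256, kernel_size=1)
        self.lat5 = nn.Conv2d(2048, 256, kernel_size=1)
        # One 3x3 convolution smooths the fused highest-resolution map.
        self.smooth = nn.Conv2d(256, 256, kernel_size=3, padding=1)

    def forward(self, x):                      # x: (N, 3, 224, 224) image block
        f1 = self.conv2(self.stem(x))          # F1: 256 ch,  56x56
        f2 = self.conv3(f1)                    # F2: 512 ch,  28x28
        f3 = self.conv4(f2)                    # F3: 1024 ch, 14x14
        f4 = self.conv5(f3)                    # F4: 2048 ch,  7x7
        # Step 2.3: reduce channels, upsample x2, fuse by element-wise addition.
        p3 = self.lat4(f3) + F.interpolate(self.lat5(f4), scale_factor=2)
        p2 = self.lat3(f2) + F.interpolate(p3, scale_factor=2)
        p1 = f1 + F.interpolate(p2, scale_factor=2)
        # Step 2.4: 3x3 convolution and global average pooling give f_m.
        return torch.flatten(F.adaptive_avg_pool2d(self.smooth(p1), 1), 1)
```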
Specifically, in step 1 the distorted image I(i,j) is input and convolved with the Sobel operator to obtain the gradient map M(i,j):

$$M(i,j) = \sqrt{\big(H_x * I(i,j)\big)^2 + \big(H_y * I(i,j)\big)^2},$$

where H_x and H_y are the horizontal and vertical components of the Sobel operator:

$$H_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad
H_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}.$$
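For concreteness, a minimal Python sketch of this gradient-map computation; the choice of scipy.signal.convolve2d and the symmetric boundary handling are assumptions, while H_x and H_y follow the equations above:

```python
import numpy as np
from scipy.signal import convolve2d

# Horizontal and vertical Sobel components from the equations above.
H_X = np.array([[-1, 0, 1],
                [-2, 0, 2],
                [-1, 0, 1]], dtype=np.float64)
H_Y = H_X.T

def sobel_gradient_map(gray: np.ndarray) -> np.ndarray:
    """Return M(i,j) = sqrt((H_x * I)^2 + (H_y * I)^2) for a grayscale image I."""
    gx = convolve2d(gray, H_X, mode="same", boundary="symm")
    gy = convolve2d(gray, H_Y, mode="same", boundary="symm")
    return np.sqrt(gx ** 2 + gy ** 2)
```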
in particular, the primary visual feature f p Conv1 layer extraction through feature extraction network.
Specifically, in step 3, feature fusion of the multi-scale content features and primary visual features of each distorted image means cascading, i.e. horizontally concatenating, the multi-scale content features f_m and the primary visual features f_p to obtain the fused mixed visual feature F.
Specifically, step 4 uses SVR to map the resulting mixed visual feature F to the human subjective score (MOS).
Specifically, the human subjective score MOS is a quality score; depending on the data set and evaluation criteria, its range may be 1-100 or 1-8.
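A minimal sketch of the fusion and regression of steps 3 and 4, assuming f_m and f_p have already been extracted as NumPy vectors; scikit-learn's SVR with an RBF kernel is an illustrative choice, as the text specifies only that SVR maps the fused feature to the MOS:

```python
import numpy as np
from sklearn.svm import SVR

def fuse(f_m: np.ndarray, f_p: np.ndarray) -> np.ndarray:
    """Cascade (horizontal concatenation) of the image-stream feature f_m and
    the gradient-stream feature f_p, giving the mixed visual feature F."""
    return np.concatenate([f_m, f_p], axis=-1)

# train_features: (n_blocks, d) fused vectors; train_mos: (n_blocks,) MOS labels.
# regressor = SVR(kernel="rbf").fit(train_features, train_mos)
# block_scores = regressor.predict(test_features)
# image_score = block_scores.reshape(-1, 4).mean(axis=1)  # average the 4 blocks per image
```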
The embodiment also discloses a tone-mapped image quality evaluation method, comprising the following steps:
Step one: obtaining the distorted image to be evaluated, computing its gradient map with the Sobel operator to obtain the gradient map to be evaluated, and partitioning the distorted image to be evaluated and the gradient map to be evaluated into blocks, respectively, to obtain the distorted image block set to be evaluated and the gradient image block set to be evaluated;
Step two: inputting the distorted image block set to be evaluated and the gradient image block set to be evaluated, respectively, into a feature extractor obtained by the tone-mapped image mixed visual feature extraction model establishment method to obtain the multi-scale content features and primary visual features of the distorted image to be evaluated, and fusing them to obtain the mixed visual features of the distorted image to be evaluated;
Step three: inputting the mixed visual features of the distorted image to be evaluated into an image quality model obtained by the tone-mapped image mixed visual feature extraction model establishment method to obtain the quality score of the distorted image to be evaluated.
The embodiment also discloses an image quality evaluation system based on mixed visual features, comprising a data acquisition and segmentation unit, a feature extractor establishment unit, a feature fusion unit, an image quality model establishment unit and an image quality scoring unit.
The data acquisition and segmentation unit is used to acquire a distorted image set and the quality score of each distorted image in the set, and to compute the gradient map corresponding to each distorted image with the Sobel operator to obtain a gradient image set; it partitions each distorted image in the distorted image set and each gradient image in the gradient image set into blocks to obtain a distorted image block set and a gradient image block set, and sets the quality score of each distorted image block to the quality score of the distorted image from which it was divided.
The feature extractor establishment unit is used to establish a feature extraction network based on ResNet-50, take the distorted image block set as the training set and the quality score of each distorted image block as the label set, train the feature extraction network, and take the trained network as the feature extractor.
The feature fusion unit is used to perform feature extraction, via the feature extractor, on the distorted image block set and the gradient image block set obtained by the data acquisition and segmentation unit to obtain the multi-scale content features and primary visual features of each distorted image, and to fuse the multi-scale content features and primary visual features of each distorted image to obtain the mixed visual features of each distorted image in the distorted image set.
The image quality model establishment unit is used to establish a support vector regressor, take the mixed visual features of all the distorted images obtained by the feature fusion unit as the training set and the scores of all the distorted images as the label set, where the score of each distorted image is the average of the quality scores of all the distorted image blocks it contains, train the support vector regressor, and take the trained regressor as the image quality model.
The image quality scoring unit is used to acquire the distorted image to be evaluated and obtain, through the data acquisition and segmentation unit, the distorted image block set to be evaluated and the gradient image block set to be evaluated; it inputs these sets into the feature extractor to obtain the multi-scale content features and primary visual features of the distorted image to be evaluated and fuses them into its mixed visual features; it is also used to input the mixed visual features of the distorted image to be evaluated into the image quality model to obtain its quality score.
Example 1
In this embodiment, two data sets, the TMID data set and the ESPL-LIVE data set, are used to verify the performance of the method. The TMID data set contains 120 images divided into 15 groups, each group containing an HDR image and 8 corresponding TMIs generated by different TMOs; the quality score of each distorted image in the TMID data set ranges from 1 to 8. The ESPL-LIVE data set contains 1811 HDR-processed images generated by three processing algorithms (tone mapping, multi-exposure fusion and post-processing), of which 747 TMIs were used in the experiment; the quality score of each distorted image in the ESPL-LIVE data set ranges from 1 to 100. This embodiment provides an image quality evaluation method and, on the basis of the above embodiment, discloses the following technical features:
specifically, in step 1, each image is divided into a plurality of non-overlapping image blocks with the same size, and each image block is resized to 224×224 to be sent into a multi-scale feature fusion network;
this example compares experimentally the seven NRIQA methods proposed by QAC, GM-LOG, BRISQUE, HIGRADE, chen, abhinau and He et al. The results of the experiment are shown in Table 1, wherein the Spearman correlation coefficient (Spearman rank correlation coefficient, SROCC) and Pearson correlation coefficient (Pearson Correlation Coefficient, PLCC) are the evaluation indexes of the experiment, the values are [0,1], and the higher the values are, the better the performance of the method is.
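The two evaluation indexes can be computed directly with SciPy; a minimal sketch follows (the function name is illustrative):

```python
from scipy.stats import spearmanr, pearsonr

def evaluate(predicted, mos):
    """Return (SROCC, PLCC) between predicted quality scores and subjective MOS."""
    srocc, _ = spearmanr(predicted, mos)
    plcc, _ = pearsonr(predicted, mos)
    return srocc, plcc
```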
Table 1 comparison results between different methods
[Table 1 is provided as an image in the original publication.]
The results in Table 1 show that, compared with the other seven NRIQA methods, the NRIQA method proposed by the invention achieves the best and relatively stable performance.
To further demonstrate that the innovations presented in the invention benefit the final results, ablation experiments were performed in this example; the results are shown in Tables 2 and 3.
Table 2 analysis of performance under single-scale and multi-scale feature representation
[Table 2 is provided as an image in the original publication.]
Table 3 performance analysis under the single-stream and two-stream models
[Table 3 is provided as an image in the original publication.]
Table 2 lists the performance under single-scale and multi-scale feature representations on the TMID and ESPL-LIVE data sets. Conv2-5 denotes extracting only the features of the corresponding module during feature extraction, and f_m denotes the multi-scale content features. As Table 2 shows, the result of any single scale alone is lower than the result of all scale features acting together, while the features of each scale are effective for the model and can serve as image quality perception features. The proposed multi-scale model takes the layered degradation into account and obtains the best results.
Table 3 lists the performance under the single-stream and two-stream models on the TMID and ESPL-LIVE data sets. IS denotes the image stream and TS the two-stream model, where TS_M denotes a gradient stream that also uses multi-scale content features and TS_P denotes a gradient stream that uses primary visual features. As Table 3 shows, introducing a gradient stream into the NRIQA method further enhances its performance and further verifies that image local structure statistics derived from primary visual features are highly correlated with perceived image quality.
FIG. 5 shows different distortion maps and their corresponding gradient maps. FIGS. 5(a), (c) and (e) are tone-mapped images of the same scene with different distortions, and FIGS. 5(b), (d) and (f) are the corresponding gradient maps. The gradient maps clearly reflect the structural components of the image, such as its edges. The tone-mapped images in the first and second rows exhibit different degrees of halation and appear overexposed, losing image detail and color information, which hinders recognition and degrades perceived quality; the image in the third row appears more natural and recognizable. The gradient map thus clearly reveals the edges of the image and reflects its degree of distortion.
Thus, the innovations presented in the invention benefit the final result and further enhance the performance of the tone-mapped image quality evaluation model.

Claims (1)

1. A tone-mapped image quality evaluation method, comprising the following steps:
step one: obtaining the distorted image to be evaluated, computing its gradient map with the Sobel operator to obtain the gradient map to be evaluated, and partitioning the distorted image to be evaluated and the gradient map to be evaluated into blocks, respectively, to obtain the distorted image block set to be evaluated and the gradient image block set to be evaluated;
step two: inputting the distorted image block set to be evaluated and the gradient image block set to be evaluated, respectively, into a feature extractor obtained by the tone-mapped image mixed visual feature extraction model establishment method to obtain the multi-scale content features and primary visual features of the distorted image to be evaluated, and fusing the multi-scale content features and primary visual features of the distorted image to be evaluated to obtain its mixed visual features;
step three: inputting the mixed visual features of the distorted image to be evaluated into an image quality model obtained by the tone-mapped image mixed visual feature extraction model establishment method to obtain the quality score of the distorted image to be evaluated;
wherein the tone-mapped image mixed visual feature extraction model establishment method comprises the following steps:
step 1: obtaining a distorted image set and the quality score of each distorted image in the set, and computing the gradient map corresponding to each distorted image with the Sobel operator to obtain a gradient image set; partitioning each distorted image in the distorted image set and each gradient image in the gradient image set into blocks, respectively, to obtain a distorted image block set and a gradient image block set; and setting the quality score of each distorted image block to the quality score of the distorted image from which it was divided;
step 2: establishing a feature extraction network based on ResNet-50; taking the distorted image block set as the training set and the quality score of each distorted image block as the label set, training the feature extraction network, and taking the trained network as the feature extractor; the feature extraction network comprises a residual block layer, a convolution layer and a global average pooling layer, wherein the residual block layer comprises the Conv1, Conv2, Conv3, Conv4 and Conv5 layers, and the convolution layer comprises three 1×1 convolutions and one 3×3 convolution;
step 3: inputting the distorted image block set and the gradient image block set obtained in step 1, respectively, into the feature extractor obtained in step 2 for feature extraction to obtain the multi-scale content features and primary visual features of each distorted image, and fusing the multi-scale content features and primary visual features of each distorted image to obtain the mixed visual features of each distorted image in the distorted image set; the primary visual features are extracted through the Conv1 layer of the feature extraction network;
step 4: establishing a support vector regressor; taking the mixed visual features of all the distorted images obtained in step 3 as the training set and the scores of all the distorted images as the label set, where the score of each distorted image is the average of the quality scores of all the distorted image blocks it contains, training the support vector regressor, and taking the trained regressor as the image quality model.
CN202110300592.9A 2021-03-22 2021-03-22 Tone mapping image mixed visual feature extraction model establishment and quality evaluation method Active CN113128517B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110300592.9A CN113128517B (en) 2021-03-22 2021-03-22 Tone mapping image mixed visual feature extraction model establishment and quality evaluation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110300592.9A CN113128517B (en) 2021-03-22 2021-03-22 Tone mapping image mixed visual feature extraction model establishment and quality evaluation method

Publications (2)

Publication Number Publication Date
CN113128517A CN113128517A (en) 2021-07-16
CN113128517B true CN113128517B (en) 2023-06-13

Family

ID=76773578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110300592.9A Active CN113128517B (en) 2021-03-22 2021-03-22 Tone mapping image mixed visual feature extraction model establishment and quality evaluation method

Country Status (1)

Country Link
CN (1) CN113128517B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463318B (en) * 2022-02-14 2022-10-14 宁波大学科学技术学院 Visual quality evaluation method for multi-exposure fusion image
CN114863241A (en) * 2022-04-22 2022-08-05 厦门大学 Movie and television animation evaluation method based on spatial layout and deep learning


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI579800B (en) * 2013-04-10 2017-04-21 國立清華大學 Image processing method applicable to images captured by wide-angle zoomable lens
CN107172418B * 2017-06-08 2019-01-04 宁波大学 Tone-mapped image quality evaluation method based on exposure status analysis

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090902A * 2017-12-30 2018-05-29 中国传媒大学 No-reference image quality assessment method based on multi-scale generative adversarial network
CN108391121A * 2018-04-24 2018-08-10 中国科学技术大学 No-reference stereoscopic image quality evaluation method based on deep neural network
CN112132774A (en) * 2019-07-29 2020-12-25 方玉明 Quality evaluation method of tone mapping image
CN111429402A (en) * 2020-02-25 2020-07-17 西北大学 Image quality evaluation method for fusing advanced visual perception features and depth features

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Chandra Sekhar Ravuri et al., "Deep No-Reference Tone Mapped Image Quality Assessment", arXiv, 2020-02-08, pp. 1-5 *
Yang Fuzheng et al., "Full-reference video quality assessment method based on image content distortion", Journal of Xidian University, 2005-12-25 (No. 06), pp. 47-50 *
Cao Xin et al., "No-reference image quality assessment combined with sharpness", Computer & Digital Engineering, 2020-04-20 (No. 04), pp. 193-198 *

Also Published As

Publication number Publication date
CN113128517A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
Ying et al. From patches to pictures (PaQ-2-PiQ): Mapping the perceptual space of picture quality
CN111709902B (en) Infrared and visible light image fusion method based on self-attention mechanism
CN112734646B (en) Image super-resolution reconstruction method based on feature channel division
EP4105877A1 (en) Image enhancement method and image enhancement apparatus
CN108830796B (en) Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss
CN109146831A (en) Remote sensing image fusion method and system based on double branch deep learning networks
CN109671023A (en) A kind of secondary method for reconstructing of face image super-resolution
CN113128517B (en) Tone mapping image mixed visual feature extraction model establishment and quality evaluation method
CN110070517A (en) Blurred picture synthetic method based on degeneration imaging mechanism and generation confrontation mechanism
CN105550989B (en) The image super-resolution method returned based on non local Gaussian process
CN108875900A (en) Method of video image processing and device, neural network training method, storage medium
CN108447059B (en) Full-reference light field image quality evaluation method
CN112767385B (en) No-reference image quality evaluation method based on significance strategy and feature fusion
CN110163855B (en) Color image quality evaluation method based on multi-path deep convolutional neural network
CN113658091A (en) Image evaluation method, storage medium and terminal equipment
CN112508847A (en) Image quality evaluation method based on depth feature and structure weighted LBP feature
CN110415816B (en) Skin disease clinical image multi-classification method based on transfer learning
CN116524387A (en) Ultra-high definition video compression damage grade assessment method based on deep learning network
CN114898096A (en) Segmentation and annotation method and system for figure image
CN114998252A (en) Image quality evaluation method based on electroencephalogram signals and memory characteristics
CN113077385A (en) Video super-resolution method and system based on countermeasure generation network and edge enhancement
CN111127587A (en) Non-reference image quality map generation method based on countermeasure generation network
Li et al. Delving Deeper Into Image Dehazing: A Survey
CN117291855B (en) High resolution image fusion method
Luo et al. LCDA-Net: Efficient Image Dehazing with Contrast-Regularized and Dilated Attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant