CN115222635A - No-reference image quality evaluation method and device based on image feature fusion and computer readable storage medium

Info

Publication number: CN115222635A
Application number: CN202210839006.2A
Authority: CN
Inventors: 胡波; 朱广; 高新波; 李雷达; 聂茜茜
Original assignee: Chongqing University of Posts and Telecommunications
Current assignee: Chongqing University of Posts and Telecommunications
Priority date: 2022-07-18
Filing date: 2022-07-18
Publication date: 2022-10-21

Abstract

The invention belongs to the field of computer vision, and in particular relates to a no-reference image quality evaluation method and device based on image feature fusion, and a computer-readable storage medium. The method comprises: acquiring a naturally distorted image to be evaluated, generating a gradient image from the naturally distorted image, and inputting the naturally distorted image and the gradient image into a trained no-reference image quality evaluation model based on image feature fusion to obtain a quality evaluation score. The model comprises a backbone network, a cross-domain feature fusion model, a cross-scale feature fusion model, and two linear regression layers. In addition to evaluating the image, the invention fully considers the global semantic information and the local semantic information of the naturally distorted image domain, and can therefore evaluate the quality of naturally distorted images more accurately.

Description

No-reference image quality evaluation method and device based on image feature fusion, and computer-readable storage medium

Technical Field

The invention belongs to the field of computer vision, and in particular relates to a no-reference image quality evaluation method and device based on image feature fusion, and a computer-readable storage medium.

Background

No-reference image quality assessment (No-Reference Image Quality Assessment, NR-IQA) evaluates the quality of a distorted image using only the distorted image itself. Unlike full-reference image quality assessment, it does not require a reference image.

As research on backbone networks has deepened, methods for extracting image features have become abundant. However, how to apply the extracted features to image quality evaluation remains a challenging task, and this is one of the factors hindering the development of image quality evaluation.

Existing image quality evaluation techniques use only the feature information of the image itself to evaluate its quality. For example, patent CN201810759247.X mainly uses a generative adversarial network to construct an image training model, then feeds high-definition lossless images into the image training model as the training data set, obtaining a no-reference image quality evaluation model with a fully trained discriminator network. Finally, the image to be evaluated is fed into the fully trained discriminator network, and the final evaluation result is obtained through scoring and weighting. This method evaluates image quality without a reference image mainly on the basis of the image's own features. It ignores the role that other images can play in evaluating the quality of the original image and cannot take both the global semantic information and the local information of the image into account, so its evaluation of image quality is inaccurate.

Summary of the Invention

The prior art ignores the role of other images in evaluating the quality of the original image and cannot take both the global semantic information and the local information of the image into account, which leads to inaccurate image quality evaluation. To solve this problem, the invention proposes a no-reference image quality evaluation method and device based on image feature fusion, and a computer-readable storage medium, which fuse the features of the original image with the features of its gradient map in order to evaluate the quality of the original image, fully considering the global semantic information and the local information of the original image to improve the accuracy of image quality evaluation.

The invention adopts the following technical solutions:

A no-reference image quality evaluation method based on image feature fusion comprises the following steps:

Acquire a naturally distorted image to be evaluated, generate a gradient image from the naturally distorted image, and input the naturally distorted image and the gradient image into a trained no-reference image quality evaluation model based on image feature fusion to obtain a quality evaluation score. The no-reference image quality evaluation model based on image feature fusion comprises: a backbone network, a cross-domain feature fusion model, a cross-scale feature fusion model, and two linear regression layers;

The process of training the no-reference image quality evaluation model based on image feature fusion comprises:

S1: Obtain a naturally distorted image domain with ground-truth labels, where a ground-truth label represents the true score of a naturally distorted image in the naturally distorted image domain;

S2: Generate a gradient image domain from the naturally distorted image domain;

S3: Input the naturally distorted image domain and its corresponding gradient image domain into the backbone network, and extract the hierarchical features of the naturally distorted image domain and the hierarchical features of the gradient image domain;

S4: Input the hierarchical features of the naturally distorted image domain and the corresponding hierarchical features of the gradient image domain into the cross-domain feature fusion model for fusion, and compute the cross-domain fusion hierarchical features of the naturally distorted image domain;

S5: Input the cross-domain fusion hierarchical features into the cross-scale feature fusion model for fusion, and compute the cross-scale fusion feature of the naturally distorted image domain;

S6: Input the cross-scale fusion feature into the two linear regression layers for regression, and compute the quality evaluation score of the naturally distorted image domain;

S7: Compute the loss function of the no-reference image quality evaluation model based on image feature fusion from the ground-truth labels and the quality evaluation scores of the naturally distorted image domain;

S8: Continuously adjust the parameters of the model, and complete training when the loss function falls below a set threshold.

To achieve the above object, the invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above no-reference image quality evaluation method based on image feature fusion.

To achieve the above object, the invention also provides a no-reference image quality evaluation device based on image feature fusion, comprising a processor and a memory. The memory stores a computer program; the processor is connected to the memory and executes the computer program stored in the memory, so that the device performs the above no-reference image quality evaluation method based on image feature fusion.

The invention has at least the following beneficial effects:

The invention extracts features from the naturally distorted image domain and the gradient image domain through the backbone network, and uses the cross-domain feature fusion model to fuse the features of the different levels of the gradient image domain and the naturally distorted image domain, obtaining cross-domain fusion hierarchical features that carry local semantic information. The cross-scale feature fusion module then fuses the cross-domain fusion hierarchical features across scales, yielding naturally distorted image domain features that carry both global and local semantic information. These features are regressed to compute the quality evaluation score of the naturally distorted image domain, so that the score better reflects the quality information of the naturally distorted image domain, and accurate image quality information can be obtained without a reference image. Compared with traditional no-reference image quality evaluation methods, the invention uses the gradient image domain as an auxiliary image domain for the naturally distorted image domain; fusing the different image domains provides richer prior information and local information, which helps improve the accuracy of quality evaluation. Fusing the features of different levels in the cross-domain fusion hierarchical features across scales yields image features with global and local information, and linearly regressing these features gives the image quality evaluation score, which improves the accuracy of image quality evaluation. The no-reference image quality evaluation method based on image feature fusion provided by the invention can be widely applied in fields such as artificial intelligence, image recognition, photography, and object detection, improving the ability to evaluate the quality of naturally distorted images.

Description of the Drawings

Fig. 1 is a block diagram of a system implementing the invention;

Fig. 2 is a flowchart of the model training of the invention.

Detailed Description

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.

Referring to Fig. 1, the invention proposes a no-reference image quality evaluation method based on image feature fusion, which specifically comprises the following steps:

Acquire a naturally distorted image to be evaluated, generate a gradient image from the naturally distorted image, and input the naturally distorted image and the gradient image into a trained no-reference image quality evaluation model based on image feature fusion to obtain a quality evaluation score. The model comprises a backbone network, a cross-domain feature fusion model, a cross-scale feature fusion model, and two linear regression layers. The naturally distorted image to be evaluated may be an image captured by any camera, an image downloaded from a network, or an image obtained from another storage device, and may be a static image or an image of a moving scene. It may contain one or more objects. For example, when a person is photographed, the surroundings of the person, such as cars and trees, may also be captured, in which case the person, cars, electronics, trees, and so on are all objects contained in the image to be processed.

Referring to Fig. 2, the process of training the no-reference image quality evaluation model based on image feature fusion comprises:

S1: Obtain a naturally distorted image domain with ground-truth labels, where a ground-truth label represents the true score of a naturally distorted image in the naturally distorted image domain.

Most traditional NR-IQA methods use only RGB images as model input, which raises the question of whether multi-feature fusion is beneficial or whether a single image is sufficient. The image gradient plays a crucial role in many vision tasks. It sensitively reflects the structural components of an image, such as edges, and a gradient image robustly reflects the details of the image structure under changes in image intensity and color. The gradient map is therefore used as an additional data input that supplements the naturally distorted image and assists the extraction of its features. This design assists the extraction of texture quality features and reduces the difficulty of extracting features from a single naturally distorted image. In the invention, the naturally distorted image domain mainly uses the LIVE-Challenge data set, which contains 1169 naturally distorted images and no reference images; the image size is 500×500. The ground-truth label is the subjective evaluation score (mean opinion score, MOS) of a naturally distorted image in the naturally distorted image domain. MOS values lie in the range [0, 100] and were obtained from 8100 test subjects; a higher score indicates better image quality. Here, the naturally distorted image domain denotes an image set containing multiple naturally distorted images.

S2: Generate a gradient image domain from the naturally distorted image domain.

In a specific embodiment of the no-reference image quality evaluation method based on image feature fusion, the naturally distorted image is processed into a gradient image as follows:

$$\nabla I_Z(x,y) = \left(\frac{\partial I_Z(x,y)}{\partial x},\ \frac{\partial I_Z(x,y)}{\partial y}\right), \qquad G(I_Z) = \sqrt{\left(\frac{\partial I_Z(x,y)}{\partial x}\right)^{2} + \left(\frac{\partial I_Z(x,y)}{\partial y}\right)^{2}}$$

where $(x, y)$ are the pixel coordinates of the naturally distorted image $I_Z$; $\partial I_Z(x,y)/\partial x$ denotes the partial derivative of $I_Z$ in the X direction; $\partial I_Z(x,y)/\partial y$ denotes the partial derivative of $I_Z$ in the Y direction; $\nabla I_Z(x,y)$ denotes the total partial derivative (the gradient) of $I_Z$; and $G(I_Z)$ denotes the gradient magnitude of $I_Z$.
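The gradient-image step can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and approximates the partial derivatives with central differences (one-sided at the borders); the source does not name the derivative operator, so this choice and the function name `gradient_magnitude` are assumptions made for illustration.

```python
import math

def gradient_magnitude(image):
    """Return sqrt(gx^2 + gy^2) per pixel for a 2-D grayscale image.

    Assumption: central differences inside the image and one-sided
    differences at the borders; the patent does not specify the operator.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour indices so border pixels use one-sided differences.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (image[y][x1] - image[y][x0]) / max(x1 - x0, 1)
            gy = (image[y1][x] - image[y0][x]) / max(y1 - y0, 1)
            out[y][x] = math.hypot(gx, gy)
    return out

# Toy 3x4 image with a vertical edge between columns 1 and 2.
toy = [[0, 0, 10, 10],
       [0, 0, 10, 10],
       [0, 0, 10, 10]]
grad = gradient_magnitude(toy)
```

In the toy image the response is concentrated on the two columns next to the edge, which is the structural information the gradient image domain is meant to supply.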

S3: Input the naturally distorted image domain and its corresponding gradient image domain into the backbone network, and extract the hierarchical features of the naturally distorted image domain and the hierarchical features of the gradient image domain.

A specific embodiment of the no-reference image quality evaluation method based on image feature fusion is shown in Fig. 2:

The processed naturally distorted image domain and gradient image domain are input into a ResNet50 backbone network, which extracts L hierarchical features (Stage 1 to Stage L) from each domain. In Fig. 2, the blocks marked D represent the L hierarchical features of the naturally distorted image domain, and the blocks marked G represent the L hierarchical features of the gradient image domain (there are L levels in each case, of which only one is drawn). Note that in this embodiment only four hierarchical features extracted by the backbone network are used (L = 4) in order to reduce computation; this does not mean that the invention can process only four hierarchical features. Four levels are chosen because extensive experimental studies in the field have found that extracting four hierarchical features gives the best cost-effectiveness and the best processing results.
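To make the four hierarchical features concrete, the sketch below computes the feature-map shapes that a standard ResNet50 would produce at its four residual stages. It assumes the usual ResNet50 layout (7×7 stride-2 stem, 3×3 stride-2 max pooling, and stage widths of 256, 512, 1024, and 2048 channels); the patent does not state whether the backbone is modified or whether images are resized first, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

def resnet50_stage_shapes(height, width):
    """Return (channels, height, width) for the four residual stages.

    Assumption: the standard ResNet50 layout, in which the stem and max pool
    reduce the resolution by 4 and stages 2 to 4 each halve it again.
    """
    h = conv_out(conv_out(height, 7, 2, 3), 3, 2, 1)
    w = conv_out(conv_out(width, 7, 2, 3), 3, 2, 1)
    shapes = []
    for i, channels in enumerate((256, 512, 1024, 2048)):
        if i > 0:  # stages 2-4 downsample with a stride-2 3x3 convolution
            h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
        shapes.append((channels, h, w))
    return shapes

shapes_224 = resnet50_stage_shapes(224, 224)
shapes_500 = resnet50_stage_shapes(500, 500)
```

Each stage has a different spatial size and channel count, which is why the later cross-scale fusion step needs an operation that reduces the feature scale before levels are concatenated.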

S4: Input the hierarchical features of the naturally distorted image domain and the corresponding hierarchical features of the gradient image domain into the cross-domain feature fusion model for fusion, and compute the cross-domain fusion hierarchical features of the naturally distorted image domain.

In a specific embodiment of the no-reference image quality evaluation method based on image feature fusion, as shown in Fig. 2, the cross-domain feature fusion model (CDFM) comprises: a fully connected layer, a self-attention layer, and a multi-layer perceptron (MLP).

In a specific embodiment of the no-reference image quality evaluation method based on image feature fusion, as shown in Fig. 2, the cross-domain fusion hierarchical features are computed as follows:

S41: Pass the level-n feature of the naturally distorted image domain through the fully connected layer to generate the query Q_1n, the key K_1n, and the value V_1n, and process Q_1n, K_1n, and V_1n with the self-attention mechanism to compute the first global semantic hierarchical feature X_n of level n of the naturally distorted image domain;

S42: Pass the level-n feature of the gradient image domain through the fully connected layer to generate the query Q_2n, the key K_2n, and the value V_2n, and process Q_2n, K_2n, and V_2n with the self-attention mechanism to compute the second global semantic hierarchical feature Y_n of level n of the gradient image domain;

S43: Process Q_1n, K_2n, and V_2n with the self-attention mechanism to compute the level-n fusion feature Z_n of the naturally distorted image domain and the gradient image domain;

S44: Input X_n, Y_n, and Z_n into the multi-layer perceptron (MLP) to compute the level-n cross-domain fusion hierarchical feature F_n of the naturally distorted image domain. Fusing the features of the different levels of the naturally distorted image domain and the gradient image domain gives the cross-domain fusion hierarchical features rich prior information and local information, which helps improve the accuracy of quality evaluation.
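Steps S41 to S44 can be sketched for a single level n as follows. This is a minimal pure-Python illustration, not the patented implementation: it assumes single-head scaled dot-product attention, assumes X_n, Y_n, and Z_n are concatenated per token before the MLP (the source does not say how they are combined), and uses hypothetical names (`cross_domain_fuse`, `proj_d`, `proj_g`, `mlp`).

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)

def cross_domain_fuse(feat_d, feat_g, proj_d, proj_g, mlp):
    """Sketch of S41-S44 for one level n.

    feat_d / feat_g: token matrices (tokens x channels) of the distorted and
    gradient image. proj_d / proj_g: dicts with 'q', 'k', 'v' weight matrices
    standing in for the fully connected layers. mlp: any callable.
    Assumption: X_n, Y_n and Z_n are concatenated per token before the MLP.
    """
    q1, k1, v1 = (matmul(feat_d, proj_d[name]) for name in "qkv")
    q2, k2, v2 = (matmul(feat_g, proj_g[name]) for name in "qkv")
    x_n = attention(q1, k1, v1)   # S41: self-attention, distorted domain
    y_n = attention(q2, k2, v2)   # S42: self-attention, gradient domain
    z_n = attention(q1, k2, v2)   # S43: distorted queries, gradient keys/values
    fused = [x + y + z for x, y, z in zip(x_n, y_n, z_n)]  # per-token concat
    return mlp(fused)             # S44

# Toy inputs: identity projections and an identity "MLP" (both hypothetical).
identity = [[1.0, 0.0], [0.0, 1.0]]
proj = {"q": identity, "k": identity, "v": identity}
tokens_d = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tokens_g = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
f_n = cross_domain_fuse(tokens_d, tokens_g, proj, proj, mlp=lambda t: t)
```

The only difference between S41/S42 and S43 is which domain supplies the keys and values: in S43 the queries come from the naturally distorted image while the keys and values come from the gradient image, which is what makes the fusion cross-domain.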

S5: Input the cross-domain fusion hierarchical features into the cross-scale feature fusion model for fusion, and compute the cross-scale fusion feature of the naturally distorted image domain.

In a specific embodiment of the no-reference image quality evaluation method based on image feature fusion, as shown in Fig. 2, the cross-scale feature fusion model comprises: an STB model (Swin-Transformer Block) and a global average pooling (GAP) layer;

In a specific embodiment of the no-reference image quality evaluation method based on image feature fusion, as shown in Fig. 2, the cross-scale fusion feature is computed as follows:

S51: Input the level-n cross-domain fusion hierarchical feature F_n of the naturally distorted image domain into the STB model to compute the feature map A_n;

S52: Concatenate the level-(n+1) cross-domain fusion hierarchical feature F_n+1 of the naturally distorted image domain with the feature map A_n using the Cat function, and input the result into the STB model to compute the feature map A_n+1;

S53: Repeat step S52 to obtain the feature map A_L;

S54: Input the feature map A_L into the GAP layer to output the cross-scale fusion feature;

where A_L denotes the level-L feature map of the naturally distorted image domain, n = 1, 2, 3, ..., L-1, and L denotes the number of hierarchical features extracted by the backbone network. Fusing the features of different levels in the cross-domain fusion hierarchical features across scales yields rich global and local image information, which helps improve the accuracy of quality evaluation. As noted above, this embodiment uses only the four hierarchical features extracted by the backbone network, so L is 4.
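The recurrence in S51 to S54 can be written as a short loop. The sketch below takes the STB as an arbitrary callable and uses a toy stand-in (`toy_stb`, hypothetical) that merely merges adjacent tokens, so it illustrates the data flow A_1 = STB(F_1), A_{n+1} = STB(Cat(F_{n+1}, A_n)), GAP(A_L) rather than the behavior of a real Swin-Transformer Block. Concatenation along the channel axis is an assumption.

```python
def global_average_pool(tokens):
    """Average a list of token vectors into one vector (GAP)."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]

def cross_scale_fuse(features, stb):
    """Apply S51-S54 to the per-level features F_1 .. F_L.

    features: list of token matrices ordered from level 1 to level L.
    stb: callable standing in for the Swin-Transformer Block; its output must
    have as many tokens as the next level so that Cat is well defined.
    """
    a = stb(features[0])                                  # S51: A_1 = STB(F_1)
    for f_next in features[1:]:                           # S52/S53: n = 1 .. L-1
        if len(f_next) != len(a):
            raise ValueError("token counts must match before concatenation")
        cat = [f + prev for f, prev in zip(f_next, a)]    # Cat along channels (assumed)
        a = stb(cat)                                      # A_{n+1}
    return global_average_pool(a)                         # S54: GAP(A_L)

def toy_stb(tokens):
    """Hypothetical stand-in: merge adjacent token pairs into their mean value."""
    groups = [tokens[i:i + 2] for i in range(0, len(tokens), 2)]
    return [[sum(sum(t) for t in g) / sum(len(t) for t in g)] for g in groups]

# Four levels (L = 4) with 8, 4, 2 and 1 single-channel tokens.
levels = [[[1.0]] * 8, [[1.0]] * 4, [[1.0]] * 2, [[1.0]] * 1]
fused = cross_scale_fuse(levels, toy_stb)
```

Because each A_n is folded into the next level before the STB is applied again, the final map A_L depends on every level, which is how the fused feature comes to carry both fine-scale and coarse-scale information.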

In a specific embodiment of the no-reference image quality evaluation method based on image feature fusion, the cross-scale fusion feature is computed in detail as follows:

The cross-domain fusion hierarchical features output by the cross-domain feature fusion model (CDFM) are fed in turn into the Swin-Transformer Block (STB). Within the STB module, the input feature block first passes through a LayerNorm layer into the Window Multi-head Self-Attention (W-MSA) module, which performs window-based self-attention, and then passes through a LayerNorm layer and an MLP layer for fusion. It then passes through a LayerNorm layer into the Shifted Window Multi-head Self-Attention (SW-MSA) module, followed by a LayerNorm layer and an MLP layer. Finally, a Conv (convolution) operation reduces the feature scale.

S6: Input the cross-scale fusion feature into the two linear regression layers (FC1 and FC2) for regression, and compute the quality evaluation score of the naturally distorted image domain.

S7: Compute the loss function of the no-reference image quality evaluation model based on image feature fusion from the ground-truth labels and the quality evaluation scores of the naturally distorted image domain.

The loss function of the no-reference image quality evaluation model based on image feature fusion comprises:

where N denotes the number of naturally distorted images in the naturally distorted image domain, Fⁱ(I_D, I_G) denotes the quality evaluation score of the i-th naturally distorted image, I_D denotes the naturally distorted image domain, I_G denotes the gradient image domain, Q_i denotes the ground-truth label of the i-th naturally distorted image, and loss denotes the loss function.
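The symbols N, Fⁱ(I_D, I_G), and Q_i indicate a loss averaged over the N training images, but the exact distance between prediction and label is not stated in the text here. As an illustration only, the sketch below shows two common regression losses, mean absolute error and mean squared error; which of these (if either) the patent uses is an assumption, and the sample scores are hypothetical.

```python
def mean_absolute_error(predicted, labels):
    """(1/N) * sum_i |F^i - Q_i|  (assumed form, not confirmed by the source)."""
    n = len(labels)
    return sum(abs(p - q) for p, q in zip(predicted, labels)) / n

def mean_squared_error(predicted, labels):
    """(1/N) * sum_i (F^i - Q_i)^2  (assumed form, not confirmed by the source)."""
    n = len(labels)
    return sum((p - q) ** 2 for p, q in zip(predicted, labels)) / n

scores = [62.0, 40.0, 85.0]   # hypothetical model outputs F^i(I_D, I_G)
mos = [60.0, 45.0, 85.0]      # hypothetical MOS labels Q_i in [0, 100]
mae = mean_absolute_error(scores, mos)
mse = mean_squared_error(scores, mos)
```

Either function maps the N predicted scores and their MOS labels to a single non-negative number that step S8 can compare with the training threshold.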

S8: Continuously adjust the parameters of the model, and complete training when the loss function falls below a set threshold, where the threshold is set by a person skilled in the art according to the actual situation.
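The stopping rule in S8 can be sketched with a toy model. The example below fits a one-feature linear predictor by plain gradient descent on mean squared error and stops as soon as the loss falls below the threshold; the linear model, the learning rate, and the `max_steps` safety cap are assumptions for illustration and are not part of the source.

```python
def train_until_threshold(features, labels, threshold, lr=0.1, max_steps=10_000):
    """Fit score = w * feature + b and stop once the loss is below `threshold`.

    Returns (w, b, loss, steps). `max_steps` is a safety cap added for this
    sketch so that the loop always terminates.
    """
    w, b = 0.0, 0.0
    n = len(labels)
    loss = float("inf")
    for step in range(max_steps):
        errors = [w * x + b - q for x, q in zip(features, labels)]
        loss = sum(e * e for e in errors) / n
        if loss < threshold:          # S8: training is complete
            return w, b, loss, step
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, features)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss, max_steps

# Toy data generated from score = 2 * feature + 1.
w, b, final_loss, steps = train_until_threshold(
    [0.0, 0.5, 1.0], [1.0, 2.0, 3.0], threshold=1e-4
)
```

A smaller threshold demands a closer fit to the labels at the cost of more parameter updates, which is why the source leaves its value to the practitioner.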

This design selects ViT-B/8 as the pre-trained model; the model is pre-trained on ImageNet-21k and fine-tuned on ImageNet-1k, with the patch size P set to 8. Following the standard training strategy of existing IQA algorithms, the learning rate of the pre-trained model is set to 1×10⁻⁵ and the batch size B is set to 32. The model is optimized with the Adaptive Moment Estimation (Adam) optimizer with a weight decay of 1×10⁻⁵, and the learning rate is scheduled with a cosine annealing strategy (CosineAnnealingLR).
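The cosine annealing schedule mentioned above has a simple closed form, sketched below. The base learning rate of 1×10⁻⁵ comes from the text; the number of steps and the minimum learning rate are not given in the source, so `total_steps=100` and `min_lr=0.0` are assumptions chosen only to make the example concrete.

```python
import math

def cosine_annealing_lr(step, total_steps, base_lr=1e-5, min_lr=0.0):
    """eta_t = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T))."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * step / total_steps))

# Learning rate at every step of an assumed 100-step schedule.
schedule = [cosine_annealing_lr(t, total_steps=100) for t in range(101)]
```

The rate starts at the base value, decays slowly at first, fastest in the middle, and flattens out near the minimum, which avoids the abrupt drops of a step schedule.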

于本发明一实施例中，本发明还包括一种计算机可读存储介质，其上存储有计算机程序，该程序被处理器执行时实现上述任一种基于图像特征融合的无参考图像质量评价方法。In an embodiment of the present invention, the present invention further includes a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements any of the above-mentioned non-reference image quality evaluation methods based on image feature fusion. .

本领域普通技术人员可以理解：实现上述各方法实施例的全部或部分步骤可以通过计算机程序相关的硬件来完成。前述的计算机程序可以存储于一计算机可读存储介质中。该程序在执行时，执行包括上述各方法实施例的步骤；而前述的存储介质包括：ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。Those of ordinary skill in the art can understand that all or part of the steps of implementing the above method embodiments may be completed by hardware related to computer programs. The aforementioned computer program may be stored in a computer-readable storage medium. When the program is executed, the steps including the above method embodiments are executed; and the foregoing storage medium includes: ROM, RAM, magnetic disk or optical disk and other media that can store program codes.

一种基于图像特征融合的无参考图像质量评价装置，包括处理器和存储器；所述存储器用于存储计算机程序；所述处理器与所述存储器相连，用于执行所述存储器存储的计算机程序，以使所述一种基于图像特征融合的无参考图像质量评价装置执行任一上述一种基于图像特征融合的无参考图像质量评价。A reference-free image quality evaluation device based on image feature fusion, comprising a processor and a memory; the memory is used for storing a computer program; the processor is connected with the memory, and is used for executing the computer program stored in the memory, Any one of the above-mentioned non-reference image quality evaluation based on image feature fusion is performed by the apparatus for evaluating the quality of an image without reference based on image feature fusion.

具体地，所述存储器包括：ROM、RAM、磁碟、U盘、存储卡或者光盘等各种可以存储程序代码的介质。Specifically, the memory includes various media that can store program codes, such as ROM, RAM, magnetic disk, U disk, memory card, or optical disk.

优选地，所述处理器可以是通用处理器，包括中央处理器(Central ProcessingUnit，简称CPU)、网络处理器(Network Processor，简称NP)等；还可以是数字信号处理器(Digital Signal Processor，简称DSP)、专用集成电路(Application SpecificIntegrated Circuit，简称ASIC)、现场可编程门阵列(Field Programmable Gate Array，简称FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。Preferably, the processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; it may also be a digital signal processor (Digital Signal Processor, for short) DSP), Application Specific Integrated Circuit (ASIC for short), Field Programmable Gate Array (FPGA for short) or other programmable logic devices, discrete gate or transistor logic devices, and discrete hardware components.

通过本发明设计的方法能够提高自然图像质量评价的准确性和泛化能力，具体来讲：设计的模型能够进一步利用图片的局部信息和全局语义信息，从而给出更贴近人类视觉系统的图像质量评价结果，以实现在无参考图像的情况下对多种类型图像的质量评价任务，例如，根据自然失真图像的质量评价分数，辅助自然失真图像进行图像恢复；当应用于人工智能时，通过质量评价分数，机器人对目标进行判断；当用于拍照时，根据质量评价分数，自动调整镜头的分辨率等参数。The method designed by the present invention can improve the accuracy and generalization ability of natural image quality evaluation. Specifically, the designed model can further utilize the local information and global semantic information of the image, so as to provide an image quality that is closer to the human visual system. Evaluation results to achieve quality evaluation tasks for various types of images without reference images, such as assisting natural distorted images for image restoration based on the quality evaluation scores of naturally distorted images; when applied to artificial intelligence, by quality Evaluation score, the robot judges the target; when it is used for taking pictures, it automatically adjusts parameters such as the resolution of the lens according to the quality evaluation score.

此外，上述附图仅是根据本发明示例性实施例的方法所包括的处理的示意性说明，而不是限制目的。易于理解，上述附图所示的处理并不表明或限制这些处理的时间顺序。另外，也易于理解，这些处理可以是例如在多个模块中同步或异步执行的。Furthermore, the above-mentioned figures are merely schematic illustrations of the processes included in the methods according to the exemplary embodiments of the present invention, and are not intended to be limiting. It is easy to understand that the processes shown in the above figures do not indicate or limit the chronological order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, in multiple modules.

本领域技术人员在考虑说明书及实践这里公开的发明后，将容易想到本公开的其他实施例。本申请旨在涵盖本公开的任何变型、用途或者适应性变化，这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的，本公开的真正范围和精神由权利要求指出。Other embodiments of the present disclosure will readily suggest themselves to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common knowledge or techniques in the technical field not disclosed by the present disclosure . The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the disclosure being indicated by the claims.

It is to be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A no-reference image quality evaluation method based on image feature fusion, characterized by comprising:

obtaining a naturally distorted image to be evaluated, generating a gradient image from the naturally distorted image, and inputting the naturally distorted image and the gradient image into a trained no-reference image quality evaluation model based on image feature fusion to obtain a quality evaluation score; wherein the no-reference image quality evaluation model based on image feature fusion comprises a backbone network, a cross-domain feature fusion model, a cross-scale feature fusion model, and two linear regression layers;

wherein the process of training the no-reference image quality evaluation model based on image feature fusion comprises:

S1: Obtain a naturally distorted image domain with ground truth labels, where each ground truth label represents the true score of a naturally distorted image in the naturally distorted image domain;

S2: Generate a gradient image domain from the naturally distorted image domain;

S3: Input the naturally distorted image domain and its corresponding gradient image domain into the backbone network, and extract the hierarchical features of the naturally distorted image domain and the hierarchical features of the gradient image domain;

S4: Input the hierarchical features of the naturally distorted image domain and the corresponding hierarchical features of the gradient image domain into the cross-domain feature fusion model, and calculate the cross-domain fusion hierarchical features of the naturally distorted image domain;

S5: Input the cross-domain fusion hierarchical features into the cross-scale feature fusion model for fusion, and calculate the cross-scale fusion feature of the naturally distorted image domain;

S6: Input the cross-scale fusion feature into the two linear regression layers for regression, and calculate the quality evaluation scores of the naturally distorted image domain;

S7: Calculate the loss function of the no-reference image quality evaluation model based on image feature fusion from the ground truth labels and the quality evaluation scores of the naturally distorted image domain;

S8: Continuously adjust the parameters of the model, and complete the training of the model when the loss function is less than a set threshold.
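Step S2, like the inference step at the start of claim 1, derives a gradient image from each naturally distorted image, but the claim does not name a gradient operator. The following is a minimal sketch that assumes a 3x3 Sobel operator with border replication on a grayscale image stored as a list of rows. The function name `gradient_image` and the choice of Sobel are illustrative assumptions, not the claimed implementation.

```python
import math


def gradient_image(img):
    """Return the gradient-magnitude image of a 2-D grayscale image.

    Assumption: 3x3 Sobel kernels with border replication. The claims only
    say that a gradient image is generated, not which operator is used.
    """
    h, w = len(img), len(img[0])

    def px(r, c):
        # Replicate border pixels so the output keeps the input size.
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    out = []
    for r in range(h):
        row = []
        for c in range(w):
            # Horizontal derivative: right column minus left column.
            gx = (px(r - 1, c + 1) + 2 * px(r, c + 1) + px(r + 1, c + 1)
                  - px(r - 1, c - 1) - 2 * px(r, c - 1) - px(r + 1, c - 1))
            # Vertical derivative: bottom row minus top row.
            gy = (px(r + 1, c - 1) + 2 * px(r + 1, c) + px(r + 1, c + 1)
                  - px(r - 1, c - 1) - 2 * px(r - 1, c) - px(r - 1, c + 1))
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out
```

Under these assumptions a flat image yields an all-zero gradient image, while a vertical step edge produces a response only in the two columns adjacent to the edge.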

2. The no-reference image quality evaluation method based on image feature fusion according to claim 1, wherein the cross-domain feature fusion model comprises: a fully connected layer, a self-attention mechanism layer and a multi-layer perceptron (MLP).

3. The no-reference image quality evaluation method based on image feature fusion according to claim 2, wherein the calculation of the cross-domain fusion hierarchical features comprises:

S41: Generate the query Q_{1n}, the key K_{1n} and the value V_{1n} from the n-th level features of the naturally distorted image domain through the fully connected layer, process Q_{1n}, K_{1n} and V_{1n} with the self-attention mechanism, and calculate the first global semantic level feature X_n of the n-th level of the naturally distorted image domain;

S42: Generate the query Q_{2n}, the key K_{2n} and the value V_{2n} from the n-th level features of the gradient image domain through the fully connected layer, process Q_{2n}, K_{2n} and V_{2n} with the self-attention mechanism, and calculate the second global semantic level feature Y_n of the n-th level of the gradient image domain;

S43: Process Q_{1n}, K_{2n} and V_{2n} with the self-attention mechanism, and calculate the fusion feature Z_n of the n-th level of the naturally distorted image domain and the gradient image domain;

S44: Input X_n, Y_n and Z_n into the multi-layer perceptron (MLP), and calculate the cross-domain fusion hierarchical feature F_n of the n-th level of the naturally distorted image domain.
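Steps S41 to S44 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the claimed implementation: the self-attention mechanism is taken to be standard scaled dot-product attention, the fully connected layer is modeled as bias-free projection matrices held in a hypothetical dictionary `w`, X_n, Y_n and Z_n are concatenated per token before the MLP, and `mlp` is a caller-supplied stand-in for the multi-layer perceptron.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def attention(q, k, v):
    """Scaled dot-product attention, softmax(Q K^T / sqrt(d)) V (assumed form)."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)


def cross_domain_fusion(feat_img, feat_grad, w, mlp):
    """Fuse level-n features of the image domain and the gradient domain.

    feat_img, feat_grad: token matrices (tokens x d) for level n.
    w: hypothetical dict of d x d projection matrices q1, k1, v1, q2, k2, v2.
    mlp: hypothetical callable standing in for the MLP of step S44.
    """
    q1, k1, v1 = (matmul(feat_img, w[name]) for name in ("q1", "k1", "v1"))
    q2, k2, v2 = (matmul(feat_grad, w[name]) for name in ("q2", "k2", "v2"))
    x = attention(q1, k1, v1)  # S41: X_n, self-attention in the image domain
    y = attention(q2, k2, v2)  # S42: Y_n, self-attention in the gradient domain
    z = attention(q1, k2, v2)  # S43: Z_n, image queries over gradient keys and values
    # S44 (assumption): concatenate X_n, Y_n, Z_n per token, then apply the MLP.
    joined = [xr + yr + zr for xr, yr, zr in zip(x, y, z)]
    return mlp(joined)
```

The only cross-domain interaction is in `z`, where queries from the distorted image attend to keys and values from the gradient image; `x` and `y` stay within their own domains.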

4. The no-reference image quality evaluation method based on image feature fusion according to claim 1, wherein the cross-scale feature fusion model comprises: an STB model and a GAP layer.

5. The no-reference image quality evaluation method based on image feature fusion according to claim 4, wherein the calculation of the cross-scale fusion feature comprises:

S51: Input the cross-domain fusion hierarchical feature F_n of the n-th level of the naturally distorted image domain into the STB model to calculate the feature map A_n;

S52: Concatenate the cross-domain fusion hierarchical feature F_{n+1} of the (n+1)-th level of the naturally distorted image domain with the feature map A_n using the Cat function, and input the result into the STB model to calculate the feature map A_{n+1};

S53: Repeat step S52 to obtain the feature map A_L;

S54: Input the feature map A_L into the GAP layer to output the cross-scale fusion feature;

where A_L represents the feature map of the L-th level of the naturally distorted image domain, n = 1, 2, 3, ..., L-1, and L represents the number of hierarchical features extracted by the backbone network.
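The recursion of steps S51 to S54 can be sketched as below. The sketch assumes that every level has already been brought to the same number of tokens so that the Cat operation can join features along the channel axis, that the GAP layer averages over tokens, and that the STB model is an opaque callable supplied per level. The names `cat`, `gap`, `cross_scale_fusion` and `stb_blocks` are hypothetical.

```python
def cat(a, b):
    """Concatenate two token matrices along the channel axis (Cat in S52)."""
    return [ra + rb for ra, rb in zip(a, b)]


def gap(feature_map):
    """Global average pooling: mean over tokens for each channel (S54)."""
    n = len(feature_map)
    return [sum(col) / n for col in zip(*feature_map)]


def cross_scale_fusion(fused_levels, stb_blocks):
    """Fuse the cross-domain features F_1 ... F_L into a single vector.

    fused_levels: list [F_1, ..., F_L] of token matrices (tokens x channels).
    stb_blocks: one callable per level standing in for the STB model.
    """
    a = stb_blocks[0](fused_levels[0])        # S51: A_1 = STB(F_1)
    for f_next, stb in zip(fused_levels[1:], stb_blocks[1:]):
        a = stb(cat(f_next, a))               # S52, S53: A_{n+1} = STB(Cat(F_{n+1}, A_n))
    return gap(a)                             # S54: pooled cross-scale fusion feature
```

Because each A_n feeds the next level, information from every level reaches A_L, and only A_L is pooled and passed on to the regression layers.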

6. The no-reference image quality evaluation method based on image feature fusion according to claim 1, wherein the loss function of the no-reference image quality evaluation model based on image feature fusion comprises:

where N represents the number of naturally distorted images, F^i(I_D, I_G) represents the quality evaluation score of the i-th naturally distorted image, I_D represents the naturally distorted image domain, I_G represents the gradient image domain, Q_i represents the ground truth label of the i-th naturally distorted image, and loss represents the loss function.
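The symbols above describe a loss that compares each predicted score F^i(I_D, I_G) with its ground truth label Q_i and aggregates over the N images. As a purely illustrative sketch, the code below assumes a mean absolute error. That choice is an assumption made here for the example and should not be read as the formula of this claim; a squared-error loss would fit the same symbol definitions equally well.

```python
def mean_absolute_error(predicted_scores, true_scores):
    """Average |F^i - Q_i| over N images.

    Assumption: an L1 (mean absolute error) form, chosen only for
    illustration. The claim text defines the symbols but the exact
    form of the loss is not taken from it.
    """
    if not predicted_scores or len(predicted_scores) != len(true_scores):
        raise ValueError("need two non-empty score lists of equal length")
    n = len(predicted_scores)
    return sum(abs(f - q) for f, q in zip(predicted_scores, true_scores)) / n
```

For example, predictions `[1.0, 3.0]` against labels `[2.0, 1.0]` give a value of 1.5 under this assumed form, which step S8 would then compare with the set threshold.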

7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the no-reference image quality evaluation method based on image feature fusion according to any one of claims 1 to 6.

8. A no-reference image quality evaluation device based on image feature fusion, comprising a processor and a memory; the memory is used to store a computer program; the processor is connected to the memory and is used to execute the computer program stored in the memory, so that the device executes the no-reference image quality evaluation method based on image feature fusion according to any one of claims 1 to 7.