CN109727246B - Comparative learning image quality evaluation method based on twin network - Google Patents
- Publication number
- CN109727246B CN109727246B CN201910077607.2A CN201910077607A CN109727246B CN 109727246 B CN109727246 B CN 109727246B CN 201910077607 A CN201910077607 A CN 201910077607A CN 109727246 B CN109727246 B CN 109727246B
- Authority
- CN
- China
- Prior art keywords
- image
- quality
- network
- blocks
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Image Analysis (AREA)
Abstract
The invention relates to a comparative learning image quality evaluation method based on a twin network. First, local contrast normalization is applied to the images to be trained, which are then divided into image blocks, and image pairs are generated. Second, the structure of a twin convolutional neural network is designed, and an image quality evaluation model is trained with the designed network. Finally, the image to be evaluated is divided into image blocks and image pairs are generated; the trained model predicts the quality of all generated pairs to obtain a quality ranking of all images, and each image's quality score is derived from the ranking. The method converts the image quality evaluation problem into a quality comparison problem between image blocks: by comparing blocks pairwise and counting the comparison results of each image against all the others, it obtains a quality score for each image and can significantly improve no-reference image quality evaluation performance.
Description
Technical Field
The invention relates to the field of image and video processing and computer vision, in particular to a comparative learning image quality evaluation method based on a twin network.
Background
Digital images are particularly important in today's era of pervasive information technology, but images are often distorted in everyday applications, for example during acquisition, compression, and transmission. For digital images to be applied effectively, image quality evaluation has therefore become essential. With the development of convolutional neural networks (CNNs), many researchers have begun to apply them to no-reference image quality evaluation, and several CNN-based no-reference algorithms have been proposed. For example, Kang et al. applied a shallow CNN to no-reference image evaluation, achieving a measurable improvement over earlier no-reference models based on hand-crafted feature extraction. Hui et al. proposed extracting features with a pre-trained ResNet; rather than learning quality scores directly, they fine-tune the network to learn a probabilistic representation of distorted images. Bosse et al. proposed a no-reference quality estimation method based on a deep convolutional neural network, training with a deeper architecture, and additionally adapted the network to the full-reference quality evaluation task. Kim et al. pre-train a model using the local scores of a full-reference quality assessment algorithm as labels and then fine-tune it with subjective scores; its performance depends on that of the chosen full-reference algorithm.
Ma et al. proposed training a deep no-reference image quality evaluation model on a large number of image pairs, but the algorithm presupposes that the distortion type and distortion level of each distorted image are known, which is difficult to obtain in practical applications without reference images.
No-reference quality evaluation models trained with convolutional networks outperform methods based on hand-crafted features to some extent, but challenges remain. One of them is the lack of training samples. Existing CNN-based no-reference methods mainly address this in two ways. The first divides the image into blocks and uses the score of the whole image as the label of each block; however, quality varies across different parts of an image, so labeling every block with the whole-image score is inaccurate. The second labels images with a full-reference quality evaluation method, whose drawback is that the algorithm's performance then depends directly on the performance of the full-reference method.
Disclosure of Invention
The invention aims to provide a comparative learning image quality evaluation method based on a twin network, which is beneficial to improving the quality evaluation performance of a non-reference image.
In order to achieve the above purpose, the technical scheme of the invention is as follows: a comparative learning image quality evaluation method based on a twin network, comprising the following steps:
Step S1: performing local contrast normalization on the images to be trained, dividing them into image blocks, and generating image pairs;
Step S2: designing the structure of the twin convolutional neural network, and training an image quality evaluation model with the designed network;
Step S3: dividing the image to be evaluated into image blocks and generating image pairs; predicting the quality of all generated image pairs with the trained model to obtain a quality ranking of all the images, and deriving each image's quality score from the ranking.
In an embodiment of the present invention, the step S1 is specifically implemented as follows:
Step S11: first, local contrast normalization is applied to the images to be trained. Given an intensity image I(i, j), the normalized value Î(i, j) is computed as:

Î(i, j) = (I(i, j) - μ(i, j)) / (σ(i, j) + C)

μ(i, j) = Σ_{k=-K}^{K} Σ_{l=-L}^{L} ω_{k,l} I(i+k, j+l)

σ(i, j) = [ Σ_{k=-K}^{K} Σ_{l=-L}^{L} ω_{k,l} (I(i+k, j+l) - μ(i, j))^2 ]^{1/2}

wherein C is a constant that prevents the denominator from being zero; K and L are the sizes of the normalization window; ω_{k,l} is a 2D circularly symmetric Gaussian weighting function;
Step S12: dividing all locally contrast-normalized images into a number of h × w image blocks, sorting all the blocks by the standard deviation of each block, and taking the n blocks ranked in the middle as training data;
Step S13: combining the image blocks selected from all training images pairwise to generate image pairs; the principles of pair combination are: 1) blocks of the same image are not combined; 2) if block A and block B have generated an image pair, B is not combined with A again, avoiding data redundancy; 3) a pair is formed only when the quality-score difference between the two blocks exceeds a preset threshold.
In an embodiment of the present invention, the step S2 is specifically implemented as follows:
Step S21: designing the structure of the twin convolutional neural network, the network being composed of two sub-networks: sub-network I and sub-network II; sub-network I consists of two identical branch structures that share weights, each branch consisting of N stacked convolutional blocks, and sub-network I is used to extract features from the two input image blocks; sub-network II consists of M fully connected layers; the features extracted by sub-network I are fused and used as the input of sub-network II, which distinguishes the quality of the two input images from the fused features;
Step S22: using the N stacked convolutional blocks of the twin convolutional neural network to abstract and learn image information, extracting image features through two fully connected layers, and feeding them into a classification network for quality-score optimization learning; the task of the classification network is to distinguish the quality of the two input image blocks, i.e. its final output is the probability that each input block is the better one, and the block with the higher probability is better than the block with the lower probability;
Step S23: in the training phase, cross entropy is used as the loss function:

L = -(1/N) Σ_{i=1}^{N} [ y1(i) log p1(i) + y2(i) log p2(i) ]

wherein N is the number of image pairs; y(i) = (y1(i), y2(i)) is a two-dimensional one-hot vector indicating which of the two images has better quality; p(i) = (p1(i), p2(i)) is also a two-dimensional vector, p1(i) being the probability that the first image has better quality than the second, and conversely p2(i) = 1 - p1(i) the probability that the second image is better.
In an embodiment of the present invention, the step S3 is specifically implemented as follows:
Step S31: first performing local contrast normalization on the image to be evaluated, and then dividing it into image blocks of size h × w; sorting all blocks by the standard deviation of each block, and taking the n blocks ranked in the middle as test data;
Step S32: comparing the image blocks pairwise under the following rules: 1) blocks from the same image to be evaluated are not compared with each other; 2) each block is compared with all other blocks in the test set except those from its own image;
Step S33: obtaining the relative score of each image by counting the results of its comparisons with the other images, the final quality score of image A being computed as:

S_A = (1/N) Σ_{B≠A} P_{A,B}

wherein P_{A,B} is the result of comparing image A with image B: P_{A,B} = 1 indicates that the quality of A is better than that of B, otherwise the quality of B is better than A's; N is the number of comparisons of each image with the other images: if the test set consists of T test images and each image selects n image blocks for testing, then N = (T - 1) × n; S_A is the score of the image.
Compared with the prior art, the invention has the following beneficial effects: the method is applicable to image quality evaluation across various distortion types and distortion degrees, and the computed quality score is close to human subjective scores. The method performs local contrast normalization on the images to be trained, divides them into image blocks, and generates image pairs; designs the structure of a twin convolutional neural network and trains an image quality evaluation model with it; then divides the image to be evaluated into image blocks, generates image pairs, predicts the quality of all generated pairs with the trained model to obtain a quality ranking of all images, and derives each image's quality score from the ranking. The method comprehensively considers the relation between the quality score and the distortion type of the image, has strong expressive power for image distortion information, and can significantly improve no-reference image quality evaluation performance.
Drawings
FIG. 1 is a flow chart of an implementation of the method of the present invention.
Fig. 2 is a structural diagram of a convolutional neural network model in an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The invention provides a comparative learning image quality evaluation method based on a twin network, which, as shown in FIG. 1, comprises the following steps:
step S1: and performing local contrast normalization processing on the image to be trained, dividing the image into image blocks, and generating an image pair.
Step S11: first, all distorted images undergo local contrast normalization. Given an intensity image I(i, j), the normalized value Î(i, j) is computed as:

Î(i, j) = (I(i, j) - μ(i, j)) / (σ(i, j) + C)

μ(i, j) = Σ_{k=-K}^{K} Σ_{l=-L}^{L} ω_{k,l} I(i+k, j+l)

σ(i, j) = [ Σ_{k=-K}^{K} Σ_{l=-L}^{L} ω_{k,l} (I(i+k, j+l) - μ(i, j))^2 ]^{1/2}

wherein C is a constant that prevents the denominator from being zero; K and L are the sizes of the normalization window; ω_{k,l} is a 2D circularly symmetric Gaussian weighting function.
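As an illustration, the local contrast normalization above can be sketched in Python with NumPy. This is a minimal, unoptimized sketch under stated assumptions: the window half-sizes K = L = 3, the Gaussian spread, and C = 1 are illustrative defaults, not values fixed by the patent.

```python
import numpy as np

def gaussian_window(K=3, L=3, spread=7 / 6):
    """2D circularly symmetric Gaussian weighting function over [-K, K] x [-L, L],
    normalized to unit sum. The spread value is an assumed default."""
    kk, ll = np.meshgrid(np.arange(-K, K + 1), np.arange(-L, L + 1), indexing="ij")
    w = np.exp(-(kk ** 2 + ll ** 2) / (2 * spread ** 2))
    return w / w.sum()

def local_contrast_normalize(I, K=3, L=3, C=1.0):
    """Compute Î(i,j) = (I(i,j) - μ(i,j)) / (σ(i,j) + C) with a weighted
    local mean μ and weighted local standard deviation σ."""
    I = I.astype(np.float64)
    w = gaussian_window(K, L)
    Ip = np.pad(I, ((K, K), (L, L)), mode="edge")  # replicate borders
    H, W = I.shape
    mu = np.zeros_like(I)
    var = np.zeros_like(I)
    # Direct (slow) weighted filtering; a separable Gaussian filter is faster.
    for i in range(H):
        for j in range(W):
            patch = Ip[i:i + 2 * K + 1, j:j + 2 * L + 1]
            mu[i, j] = (w * patch).sum()
            var[i, j] = (w * (patch - mu[i, j]) ** 2).sum()
    return (I - mu) / (np.sqrt(var) + C)
```

A constant image normalizes to all zeros, since its local mean equals the image itself and its local deviation is zero.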
step S12: dividing the image subjected to local contrast normalization into a plurality of h multiplied by w image blocks, sequencing all the image blocks by using the standard deviation (sigma) value of each image block, and taking n image blocks in the middle as training data.
Step S13: combine the image blocks selected from all training images pairwise to generate image pairs. The principles of pair combination are: 1) blocks are not combined with blocks from the same image; 2) if block A and block B have generated an image pair, B is not combined with A again, avoiding data redundancy; 3) because pairs of blocks with similar quality differ little and make contrastive learning harder, a pair is formed only when the quality-score difference between the two blocks exceeds a certain threshold.
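The three pairing principles can be written as a filter over candidate block pairs. The (patch_id, image_id, quality_score) tuple layout and the threshold value are hypothetical, introduced only for illustration.

```python
from itertools import combinations

def generate_pairs(patches, threshold=0.5):
    """Apply the three pairing rules to candidate image-block pairs.

    patches: (patch_id, image_id, quality_score) tuples (hypothetical layout).
    Returns (id_a, id_b, label) triples, label 1 when the first block is better."""
    pairs = []
    for a, b in combinations(patches, 2):   # rule 2: each unordered pair used once
        if a[1] == b[1]:                    # rule 1: same source image -> skip
            continue
        if abs(a[2] - b[2]) <= threshold:   # rule 3: quality gap too small -> skip
            continue
        pairs.append((a[0], b[0], 1 if a[2] > b[2] else 0))
    return pairs
```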
Step S2: design the structure of the twin convolutional neural network, and train the image quality evaluation model with the designed network.
Step S21: design the twin network structure for training the image quality evaluation model. The network consists of two identical branch structures; each branch consists of 5 stacked convolutional blocks and three fully connected layers, and the twin network is used for image quality evaluation. The two branches adopt the same structure: the first two stacked blocks each consist of 2 convolutional layers with 3 × 3 kernels followed by a 2 × 2 pooling layer with stride 1, and the last three stacked blocks each consist of 3 convolutional layers with 3 × 3 kernels followed by a 2 × 2 pooling layer with stride 2. All convolutional layers use a stride of 1 with zero-padding, so that the input and output sizes of each convolutional layer remain the same. The 5-block stacked structure thus comprises 13 convolutional layers and 5 pooling layers, and every convolutional layer consists of convolution, Batch Normalization (BN), and a ReLU nonlinear mapping.
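Under the pooling strides stated above, and assuming zero-padded 3 × 3 convolutions that preserve spatial size, the feature-map size of one branch can be traced with a few lines of Python; the resulting 7 × 7 output for a 64 × 64 input patch is derived here under these assumptions and is not stated in the patent.

```python
def feature_map_size(n, pool_kernel=2, pool_stride=1):
    """Spatial size after one stacked block: the 3x3 stride-1 zero-padded
    convolutions preserve the size, then one 2x2 pooling layer shrinks it."""
    return (n - pool_kernel) // pool_stride + 1

def branch_output_size(n=64):
    """Trace an n x n patch through the five stacked blocks: the first two
    blocks pool with stride 1, the last three with stride 2."""
    for stride in (1, 1, 2, 2, 2):
        n = feature_map_size(n, pool_stride=stride)
    return n
```

For a 64 × 64 patch the sequence is 64 → 63 → 62 → 31 → 15 → 7.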
Step S22: the twin network abstracts and learns image distortion information with the 5 stacked convolutional blocks, extracts image features through two fully connected layers, and feeds the features into a classification network for quality-score optimization learning. The classification network comprises a fully connected layer with two nodes and a softmax classification layer; the two nodes correspond to the quality of the two images of the pair, i.e. the final output of the classification network is the probability that each input block is the better one, and the block with the higher probability is better than the block with the lower probability.
Step S23: in the training phase, cross entropy is used as the loss function:

L = -(1/N) Σ_{i=1}^{N} [ y1(i) log p1(i) + y2(i) log p2(i) ]

wherein N is the number of image pairs; y(i) = (y1(i), y2(i)) is a two-dimensional one-hot vector indicating which of the two images has better quality; p(i) = (p1(i), p2(i)) is also a two-dimensional vector, p1(i) being the probability that the first image has better quality than the second, and conversely p2(i) = 1 - p1(i) the probability that the second image is better.
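The cross-entropy loss of step S23 can be written out directly, following the two-dimensional-vector convention above; the epsilon term is an illustrative numerical guard, not part of the patent's formula.

```python
import math

def pairwise_cross_entropy(y, p, eps=1e-12):
    """Cross-entropy loss averaged over N image pairs.

    y: one-hot (y1, y2) vectors marking which block of the pair is better;
    p: softmax outputs (p1, p2) of the classification network, p2 = 1 - p1."""
    total = 0.0
    for (y1, y2), (p1, p2) in zip(y, p):
        total -= y1 * math.log(p1 + eps) + y2 * math.log(p2 + eps)
    return total / len(y)
```

When the network is maximally unsure (p = (0.5, 0.5)), the loss equals ln 2.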
Step S3: divide the image to be evaluated into image blocks and generate image pairs. Predict the quality of all generated pairs with the trained model to obtain a quality ranking of all images, and derive each image's quality score from the ranking.
Step S31: first, all distorted images undergo local contrast normalization and are then divided into image blocks of size 64 × 64. All blocks are sorted by the standard deviation (σ) of each block, and the n blocks ranked in the middle are taken as test data.
Step S32: compare the image blocks pairwise under the following rules: 1) blocks are not compared with blocks from the same distorted image; 2) each block is compared with all other blocks in the test set except those from its own image;
Step S33: the relative score of each image is obtained by counting its comparison results against the other images. The final quality score of image A is computed as:

S_A = (1/N) Σ_{B≠A} P_{A,B}

wherein P_{A,B} is the result of comparing image A with image B: P_{A,B} = 1 indicates that the quality of A is better than that of B, otherwise the quality of B is better than A's; N is the number of comparisons of each image with the other images: if the test set consists of T test images and each image selects n image blocks for testing, then N = (T - 1) × n; S_A is the score of the image. Considering that only a few test images may be available in practical applications, a test set can be provided in advance; during testing, the image to be evaluated then only needs to be compared with this provided test set, in which case N = T × n.
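The counting scheme of step S33 can be sketched as follows; representing comparison outcomes as (image_a, image_b, winner) triples is an assumption made for illustration.

```python
from collections import defaultdict

def rank_scores(comparisons, T, n):
    """Score each test image by the fraction of pairwise comparisons it wins.

    comparisons: (image_a, image_b, winner) triples, winner in {"a", "b"}.
    With T test images and n blocks per image, each image takes part in
    N = (T - 1) * n comparisons, so S_A = (1/N) * sum over B of P_{A,B}."""
    wins = defaultdict(int)
    seen = set()
    for a, b, winner in comparisons:
        seen.update((a, b))
        wins[a if winner == "a" else b] += 1
    N = (T - 1) * n
    return {img: wins[img] / N for img in seen}
```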
The above are preferred embodiments of the present invention; all changes made according to the technical scheme of the present invention that produce equivalent functional effects, without exceeding the scope of the technical scheme, belong to the protection scope of the present invention.
Claims (3)
1. A comparative learning image quality evaluation method based on a twin network is characterized by comprising the following steps:
step S1, performing local contrast normalization on the images to be trained, dividing them into image blocks, and generating image pairs;
step S2, designing the structure of the twin convolutional neural network, and training an image quality evaluation model with the designed network;
step S3, dividing the image to be evaluated into image blocks and generating image pairs; predicting the quality of all generated image pairs with the trained model to obtain a quality ranking of all the images, and deriving each image's quality score from the ranking;
the step S2 is specifically implemented as follows:
step S21, designing the structure of the twin convolutional neural network, the network being composed of two sub-networks: sub-network I and sub-network II; sub-network I consists of two identical branch structures that share weights, each branch consisting of N stacked convolutional blocks, and sub-network I is used to extract features from the two input image blocks; sub-network II consists of M fully connected layers; the features extracted by sub-network I are fused and used as the input of sub-network II, which distinguishes the quality of the two input images from the fused features;
step S22, using the N stacked convolutional blocks of the twin convolutional neural network to abstract and learn image information, extracting image features through two fully connected layers, and feeding them into a classification network for quality-score optimization learning; the task of the classification network is to distinguish the quality of the two input image blocks, i.e. its final output is the probability that each input block is the better one, and the block with the higher probability is better than the block with the lower probability;
step S23, in the training phase, using cross entropy as the loss function:

L = -(1/N) Σ_{i=1}^{N} [ y1(i) log p1(i) + y2(i) log p2(i) ]

wherein N is the number of image pairs; y(i) = (y1(i), y2(i)) is a two-dimensional one-hot vector indicating which of the two images has better quality; p(i) = (p1(i), p2(i)) is also a two-dimensional vector, p1(i) being the probability that the first image has better quality than the second, and conversely p2(i) = 1 - p1(i) the probability that the second image is better.
2. The comparative learning image quality assessment method based on twin network as claimed in claim 1, wherein said step S1 is implemented as follows:
step S11, first performing local contrast normalization on the images to be trained: given an intensity image I(i, j), the normalized value Î(i, j) is computed as

Î(i, j) = (I(i, j) - μ(i, j)) / (σ(i, j) + C)

μ(i, j) = Σ_{k=-K}^{K} Σ_{l=-L}^{L} ω_{k,l} I(i+k, j+l)

σ(i, j) = [ Σ_{k=-K}^{K} Σ_{l=-L}^{L} ω_{k,l} (I(i+k, j+l) - μ(i, j))^2 ]^{1/2}

wherein C is a constant that prevents the denominator from being zero; K and L are the sizes of the normalization window; ω_{k,l} is a 2D circularly symmetric Gaussian weighting function;
step S12, dividing all locally contrast-normalized images into a number of h × w image blocks, sorting all the blocks by the standard deviation of each block, and taking the n blocks ranked in the middle as training data;
step S13, combining the image blocks selected from all training images pairwise to generate image pairs, the principles of pair combination being: 1) blocks of the same image are not combined; 2) if block A and block B have generated an image pair, B is not combined with A again; 3) a pair is formed only when the quality-score difference between the two blocks exceeds a preset threshold.
3. The comparative learning image quality assessment method based on twin network as claimed in claim 1, wherein said step S3 is implemented as follows:
step S31, first performing local contrast normalization on the image to be evaluated, and then dividing it into image blocks of size h × w; sorting all blocks by the standard deviation of each block, and taking the n blocks ranked in the middle as test data;
step S32, comparing the image blocks pairwise under the following rules: 1) blocks from the same image to be evaluated are not compared with each other; 2) each block is compared with all other blocks in the test set except those from its own image;
step S33, obtaining the relative score of each image by counting the results of its comparisons with the other images, the final quality score of image A being computed as

S_A = (1/N) Σ_{B≠A} P_{A,B}

wherein P_{A,B} is the result of comparing image A with image B: P_{A,B} = 1 indicates that the quality of A is better than that of B, otherwise the quality of B is better than A's; N is the number of comparisons of each image with the other images: if the test set consists of T test images and each image selects n image blocks for testing, then N = (T - 1) × n; S_A is the score of the image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910077607.2A CN109727246B (en) | 2019-01-26 | 2019-01-26 | Comparative learning image quality evaluation method based on twin network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109727246A CN109727246A (en) | 2019-05-07 |
CN109727246B true CN109727246B (en) | 2022-05-13 |
Family
ID=66300930
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910077607.2A Active CN109727246B (en) | 2019-01-26 | 2019-01-26 | Comparative learning image quality evaluation method based on twin network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109727246B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110210522A (en) * | 2019-05-10 | 2019-09-06 | 无线生活(北京)信息技术有限公司 | The training method and device of picture quality Fraction Model |
CN110111326B (en) * | 2019-05-15 | 2021-01-15 | 西安科技大学 | Reconstructed image quality evaluation method based on ERT system |
CN110245625B (en) * | 2019-06-19 | 2021-04-13 | 浪潮集团有限公司 | Twin neural network-based wild panda identification method and system |
CN110807757B (en) * | 2019-08-14 | 2023-07-25 | 腾讯科技(深圳)有限公司 | Image quality evaluation method and device based on artificial intelligence and computer equipment |
CN110533097B (en) * | 2019-08-27 | 2023-01-06 | 腾讯科技(深圳)有限公司 | Image definition recognition method and device, electronic equipment and storage medium |
CN110781928B (en) * | 2019-10-11 | 2023-04-07 | 西安工程大学 | Image similarity learning method for extracting multi-resolution features of image |
CN111127435B (en) * | 2019-12-25 | 2022-11-15 | 福州大学 | No-reference image quality evaluation method based on double-current convolution neural network |
CN111640099A (en) * | 2020-05-29 | 2020-09-08 | 北京金山云网络技术有限公司 | Method and device for determining image quality, electronic equipment and storage medium |
CN111709920A (en) * | 2020-06-01 | 2020-09-25 | 深圳市深视创新科技有限公司 | Template defect detection method |
CN111583259B (en) * | 2020-06-04 | 2022-07-22 | 南昌航空大学 | Document image quality evaluation method |
CN112163609A (en) * | 2020-09-22 | 2021-01-01 | 武汉科技大学 | Image block similarity calculation method based on deep learning |
CN112613533B (en) * | 2020-12-01 | 2022-08-09 | 南京南瑞信息通信科技有限公司 | Image segmentation quality evaluation network system and method based on ordering constraint |
CN112819015A (en) * | 2021-02-04 | 2021-05-18 | 西南科技大学 | Image quality evaluation method based on feature fusion |
CN113554597B (en) * | 2021-06-23 | 2024-02-02 | 清华大学 | Image quality evaluation method and device based on electroencephalogram characteristics |
CN117115070A (en) * | 2022-05-13 | 2023-11-24 | 北京字跳网络技术有限公司 | Image evaluation method, apparatus, device, storage medium, and program product |
CN116128798B (en) * | 2022-11-17 | 2024-02-27 | 台州金泰精锻科技股份有限公司 | Finish forging method for bell-shaped shell forging face teeth |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107993250A (en) * | 2017-09-12 | 2018-05-04 | 北京飞搜科技有限公司 | A kind of fast multi-target pedestrian tracking and analysis method and its intelligent apparatus |
CN108510485A (en) * | 2018-03-27 | 2018-09-07 | 福州大学 | It is a kind of based on convolutional neural networks without reference image method for evaluating quality |
CN109215028A (en) * | 2018-11-06 | 2019-01-15 | 福州大学 | A kind of multiple-objection optimization image quality measure method based on convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||