CN110853027A - Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation - Google Patents

Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation

Info

Publication number
CN110853027A
Authority
CN
China
Prior art keywords
local
color
image
gaussian
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911124950.4A
Other languages
Chinese (zh)
Inventor
方玉明
姚怡茹
鄢杰斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201911124950.4A priority Critical patent/CN110853027A/en
Publication of CN110853027A publication Critical patent/CN110853027A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30168 Image quality inspection

Abstract

The invention relates to a no-reference quality evaluation method for three-dimensional synthetic images based on local variation and global variation, characterized by comprising the following steps: first, for local variation detection, structure and color features of the synthetic image are extracted using Gaussian derivatives; second, the structure and color features are encoded with a local binary pattern to obtain a structure feature map and a color feature map, from which quality features are computed to obtain local structure and color distortion information; then, for global variation detection, luminance features are extracted to evaluate the naturalness of the three-dimensional synthetic image; finally, based on the extracted visual features, a random forest regression model is trained to map the features to subjective quality scores. Experimental results on three public databases show that, compared with existing no-reference image quality evaluation methods and full-reference quality evaluation methods for three-dimensional synthetic images, the method is effective and superior.

Description

Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation
Technical Field
The invention belongs to the technical field of multimedia, in particular to the field of digital images and digital image processing, and specifically relates to a no-reference quality evaluation method for three-dimensional synthetic images based on local variation and global variation.
Background
Free-viewpoint video (FVV) and three-dimensional film and television can give people an immersive, on-the-scene experience, and these technologies have attracted a great deal of attention from academia and industry over the past decades. Generally, one way to obtain free-viewpoint video is to capture images of the same scene from different viewing angles with multiple cameras and then stitch the images taken from the different viewpoints together. As the demand for a better experience grows, the number of views in free-viewpoint video keeps increasing, and the load on storage and transmission increases with it. To solve this problem, multi-view video plus depth (MVD) technology has been developed.
Multi-view video plus depth technology requires only the original view (texture) maps and depth maps from a small number of cameras; the remaining virtual views can be generated by depth-image-based rendering (DIBR). However, the DIBR process introduces new visual distortions such as blurring, discontinuities, blocking, and image stretching, which clearly differ from conventional distortions such as Gaussian blur, Gaussian noise, and white noise, and which can significantly affect the end-user experience. Therefore, it is necessary to design a reliable and effective quality evaluation method for three-dimensional synthetic images to predict their visual quality. In the past years, some quality evaluation methods aimed at synthesis distortion have been proposed, but their prediction accuracy is not high enough; the invention therefore provides a three-dimensional synthetic image quality evaluation method based on local variation and global variation that can effectively predict the visual quality of synthesized images.
Disclosure of Invention
The invention relates to a no-reference quality evaluation method for three-dimensional synthetic images based on local variation and global variation, characterized by comprising the following steps: first, for local variation detection, structure and color features of the synthetic image are extracted using Gaussian derivatives; second, the structure and color features are encoded with a local binary pattern to obtain a structure feature map and a color feature map, from which quality features are computed to obtain local structure and color distortion information; then, for global variation detection, luminance features are extracted to evaluate the naturalness of the three-dimensional synthetic image; finally, based on the extracted visual features, a random forest regression model is trained to map the features to subjective quality scores. Experimental results on three public databases show that, compared with existing no-reference image quality evaluation methods and full-reference quality evaluation methods for three-dimensional synthetic images, the method is effective and superior.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A no-reference quality evaluation method for three-dimensional synthetic images based on local variation and global variation, characterized by comprising the following steps:
A. extracting structure and color features of the synthetic image using Gaussian derivatives;
B. encoding the obtained structure and color features with a local binary pattern to obtain a structure feature map and a color feature map, and computing quality features from these maps to obtain local structure and color distortion information;
C. for global variation, extracting luminance features to evaluate the naturalness of the three-dimensional synthetic image;
D. combining the extracted feature information and using a random forest regression model to learn the mapping between visual features and subjective quality scores, thereby predicting the quality score of the three-dimensional synthetic image.
Further, structural features and color features of the image are extracted using Gaussian derivatives.
Further, the structural features of the image are extracted using Gaussian derivatives; the specific steps are as follows:
A. a local Taylor series expansion can represent the local features of an image, and the coefficients of the local Taylor series can be obtained through local Gaussian derivatives; the Gaussian derivative of an image can be defined as follows:

$$I_\sigma^{m,n}(x,y) = I(x,y) * \frac{\partial^{m+n} G_\sigma(x,y)}{\partial x^m\, \partial y^n}$$

where m ≥ 0 and n ≥ 0 are the derivative orders along the horizontal direction x and the vertical direction y, and the symbol * denotes the convolution operation; $G_\sigma(x,y)$ is a Gaussian function whose standard deviation σ is defined as follows:

$$G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)$$

B. structural features are extracted using Gaussian derivatives up to second order: first, all derivatives with 1 ≤ m + n ≤ 2 are calculated, and the resulting matrix of Gaussian derivative maps can be expressed as:

$$\mathbf{I}_\sigma = \left[\, I_\sigma^{1,0},\; I_\sigma^{0,1},\; I_\sigma^{1,1},\; I_\sigma^{2,0},\; I_\sigma^{0,2} \,\right]$$
Further, the color features of the image are extracted using Gaussian derivatives; the specific steps are as follows:
A. two color features that are unaffected by luminance are computed from the first-order Gaussian derivatives of the color channels. The first color feature, $x_1$, is defined over the R, G, and B channels, where R, G, and B respectively denote the red, green, and blue channels of the color space (its defining equation appears only as an image in the original); the other color feature, $x_2$, is defined over ρ, δ, and τ (its defining equation likewise appears only as an image in the original), where R', G', and B' respectively denote the first-order Gaussian derivative values of the R, G, and B channels in the horizontal direction, and ρ = 2R' - G' - B', δ = 2G' - R' - B', τ = 2B' - R' - G'.
Further, local binary patterns are applied to the obtained structure and color features to obtain structure and color feature maps, from which quality features are calculated, thereby obtaining local structure and color distortion information.
Further, a local binary pattern method is used for obtaining a structure diagram to obtain distortion information of a local structure, and the specific steps are as follows:
A. binary pattern method pair using local rotation invariance
Figure BDA0002276501970000037
Each pixel of the image is operated on, the calculation is based on
Figure BDA0002276501970000038
Characteristic map of absolute value
Figure BDA0002276501970000039
The calculation formula is as follows:
Figure BDA00022765019700000310
wherein s ∈ { s ∈ [)1,s2,s3,s4,s5}; LBP stands for LBP operation; riu2 represents a rotation invariant unified mode; d and E represent the number and the calculation radius of surrounding pixels, the number D of the surrounding pixels is set to be 8, and the calculation radius E is 1; thereby obtaining 5 characteristic maps which are respectively
Figure BDA00022765019700000311
And
Figure BDA00022765019700000312
wherein
Figure BDA00022765019700000313
Describing the relationship between the central pixel point and the adjacent pixel points in the local area;
B. representing local structure distortion information by using weighted histogram, and using same local binary pattern operator pair
Figure BDA00022765019700000314
The pixels of (a) are accumulated to obtain a weighted histogram, which is defined by the following formula:
Figure BDA0002276501970000041
Figure BDA0002276501970000042
wherein N represents the number of picture pixels; k denotes the index of LBP, K ∈ [0, D +2 ]],Is a weight value which is a characteristic map
Figure BDA0002276501970000044
And summarizing the Gaussian derivatives to fuse and map pixel values in the Gaussian derivatives according to the LBP image intensity value, and obtaining a characteristic vector through normalization operation to enhance the change of high contrast in the image area so as to reflect local structure distortion information.
Further, obtaining a chromaticity diagram by using coding to obtain distortion information of local chromaticity; wherein the content of the first and second substances,
A. obtaining a feature vector by using a local binary model: at the extracted color feature x1Carry out LBPriu2Operation acquisition feature map
Figure BDA0002276501970000045
Then theConverting the feature map into a feature vector, which defines the formula:
Figure BDA0002276501970000046
Figure BDA0002276501970000047
wherein the content of the first and second substances,
Figure BDA0002276501970000048
is a weight value obtained by a local binary pattern operator, and the value is a characteristic diagram
Figure BDA0002276501970000049
B. The local chrominance information is represented by a weighted histogram: at the extracted color feature x2Carry out LBPriu2Operation acquisition feature map
Figure BDA00022765019700000410
Then x2The weighted histogram calculation of (a) is defined as follows:
Figure BDA00022765019700000411
Figure BDA00022765019700000412
wherein, the weight value
Figure BDA00022765019700000413
Is a characteristic map
Figure BDA00022765019700000414
Finally, a single feature vector representing the color information of the image is calculated by the following formula:
Figure BDA00022765019700000415
Further, for global variation, the naturalness of the three-dimensional synthetic image is evaluated by extracting luminance features; wherein,
A. the luminance coefficients are fitted using a Gaussian distribution; the luminance coefficient L'(i,j) is defined as follows:

$$L'(i,j) = \frac{L(i,j) - \mu(i,j)}{\sigma(i,j) + C}$$

where (i,j) denotes the spatial position of a pixel, i ∈ {1,2,…,a}, j ∈ {1,2,…,b}, a and b denote the height and width of the image, C is a small constant that prevents division by zero, and μ(i,j) and σ(i,j) are defined as follows:

$$\mu(i,j) = \sum_{k=-3}^{3}\sum_{l=-3}^{3} \omega_{k,l}\, L(i+k,\, j+l)$$

$$\sigma(i,j) = \sqrt{\sum_{k=-3}^{3}\sum_{l=-3}^{3} \omega_{k,l}\,\big[L(i+k,\, j+l) - \mu(i,j)\big]^2}$$

where ω = {ω_{k,l} | k ∈ [-3,3], l ∈ [-3,3]} is a two-dimensional, centrosymmetric Gaussian weighting function;
the luminance coefficients L'(i,j) are then fitted using a zero-mean generalized Gaussian distribution, defined by the formula:

$$f(x;\alpha,\sigma^2) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)}\exp\left(-\left(\frac{|x|}{\beta}\right)^{\alpha}\right), \qquad \beta = \sigma\sqrt{\frac{\Gamma(1/\alpha)}{\Gamma(3/\alpha)}}$$

where Γ(·) denotes the gamma function, the parameter α controls the shape of the distribution, and σ controls the variance;
B. subsequently, 4 parameters, namely the shape parameter and variance of the generalized Gaussian distribution together with the kurtosis and skewness of the luminance coefficients, are calculated at 5 scales of the synthetic image, giving 20-dimensional features in total; in addition, a Laplacian pyramid image is computed from the differences between the synthetic image and its low-pass filtered versions, the shape parameter and variance are obtained by fitting a generalized Gaussian distribution model to the pixel values in the Laplacian pyramid, the kurtosis and skewness of the Laplacian pyramid are computed, and these four parameters are extracted at the five scales, giving another 20-dimensional features.
Further, a quality prediction model is trained using a random forest regression method, wherein,
A. selecting feature information, namely obtaining a total 310-dimensional feature vector through local features and global features, wherein the total 310-dimensional feature vector comprises 270-dimensional local variation features and 40-dimensional global naturalness features;
B. and training a visual quality prediction model by using a random forest regression method, and mapping the quality characteristics to subjective evaluation. The three-dimensional synthetic view quality database is randomly divided into a training set and a testing set to be trained for 1000 times, and finally, a pierce linear correlation coefficient mean (PLCC), a pierce Mandarin level correlation coefficient (SRCC), a Kendall level correlation coefficient (KRCC) and a Root Mean Square Error (RMSE) are used as final results.
Drawings
FIG. 1 is a block diagram of the algorithm of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Technical features, abbreviations, and symbols referred to herein are explained and defined on the basis of the knowledge and common understanding of a person skilled in the art.
Inspired by the facts that the degradations of three-dimensional synthetic images are distributed both locally and globally and that the human visual system is highly sensitive to structure, color information, and global changes in naturalness, the invention designs a novel and effective no-reference quality evaluation method for three-dimensional synthetic views based on local variation and global variation (LVGC), which can effectively improve the quality evaluation of three-dimensional synthetic images.
The specific operation of each part of the invention is as follows:
(1) Structural feature extraction:
Research shows that a local Taylor series expansion can represent the local features of an image and that the coefficients of the local Taylor series can be obtained through local Gaussian derivatives; the Gaussian derivative of an image can be defined as:

$$I_\sigma^{m,n}(x,y) = I(x,y) * \frac{\partial^{m+n} G_\sigma(x,y)}{\partial x^m\, \partial y^n}$$

where m ≥ 0 and n ≥ 0 are the derivative orders along the horizontal (defined as x) and vertical (defined as y) directions; in particular, the symbol * denotes the convolution operation, and $G_\sigma(x,y)$ is a Gaussian function whose standard deviation σ is defined as follows:

$$G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)$$

Inspired by other research, the invention applies Gaussian derivatives up to second order to extract structural features; first, all derivatives with 1 ≤ m + n ≤ 2 are obtained, and the resulting matrix of Gaussian derivative maps can be expressed as:

$$\mathbf{I}_\sigma = \left[\, I_\sigma^{1,0},\; I_\sigma^{0,1},\; I_\sigma^{1,1},\; I_\sigma^{2,0},\; I_\sigma^{0,2} \,\right]$$
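As an illustration of this step, the following minimal sketch computes the five Gaussian derivative maps with SciPy; the function name and the value of σ are assumptions for demonstration, not values fixed by the patent.

    # Sketch: Gaussian derivative maps I^{m,n}_sigma for 1 <= m+n <= 2.
    import numpy as np
    from scipy import ndimage

    def gaussian_derivative_maps(image, sigma=1.0):
        """Return the five Gaussian derivative maps s1..s5 of a grayscale image."""
        image = np.asarray(image, dtype=np.float64)
        orders = [(1, 0), (0, 1), (1, 1), (2, 0), (0, 2)]  # (m, n) with 1 <= m+n <= 2
        maps = []
        for m, n in orders:
            # gaussian_filter differentiates along axis 0 (vertical, order n)
            # and axis 1 (horizontal, order m)
            maps.append(ndimage.gaussian_filter(image, sigma, order=(n, m)))
        return maps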
Subsequently, the rotation-invariant uniform local binary pattern operator (ULBP) is applied to each pixel of the Gaussian derivative maps to achieve rotation invariance; the computation is based on the absolute-value feature maps |s|, and the calculation formula is:

$$M_s = \mathrm{LBP}_{D,E}^{riu2}\big(|s|\big), \qquad s \in \{s_1, s_2, s_3, s_4, s_5\}$$

where LBP denotes the LBP operation; riu2 denotes the rotation-invariant uniform pattern; and D and E denote the number of surrounding pixels and the neighborhood radius. Specifically, the number of surrounding pixels D is set to 8 and the neighborhood radius E to 1, yielding 5 feature maps, $M_{s_1}$ through $M_{s_5}$, each describing the relationship between the central pixel and its neighboring pixels in a local region; such local detail can effectively capture the complex degradations caused by different distortion types.
Although the local binary pattern can detect differences between a central pixel and its neighboring pixels, it cannot accurately capture gradient information: it encodes only the signs of the differences between neighboring pixels, which weakens its ability to distinguish local variations. This is critical, because changes in local contrast strongly influence the evaluation of an image's visual quality; it is well known that contrast changes are highly correlated with perceived image quality. The invention therefore accumulates the outputs of the same local binary pattern operator into a weighted histogram, defined as:

$$H_s(K) = \sum_{i=1}^{N} \omega_i\, f\big(M_s(i), K\big), \qquad f\big(M_s(i), K\big) = \begin{cases} 1, & M_s(i) = K \\ 0, & \text{otherwise} \end{cases}$$

where N denotes the number of image pixels; K denotes the LBP index, K ∈ [0, D+2]; and the weight $\omega_i$ is the value of the feature map |s| at pixel i. The method thereby aggregates the Gaussian derivative magnitudes, fusing and mapping the pixel values according to the LBP values, and obtains the feature vector through a normalization operation; these operations enhance high-contrast variation within image regions.
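The weighted-histogram encoding can be sketched as follows, assuming scikit-image's local_binary_pattern with method='uniform' (which yields rotation-invariant uniform codes in the range 0..D+1) as a stand-in for the LBP^{riu2} operator; the bin count and the final normalization are illustrative assumptions.

    # Sketch: weighted riu2-LBP histogram, with weights given by |feature_map|.
    import numpy as np
    from skimage.feature import local_binary_pattern

    def weighted_lbp_histogram(feature_map, D=8, E=1):
        mag = np.abs(feature_map)
        codes = local_binary_pattern(mag, P=D, R=E, method='uniform')  # codes 0..D+1
        hist = np.zeros(D + 2)
        for k in range(D + 2):
            hist[k] = mag[codes == k].sum()    # accumulate magnitudes per LBP code
        return hist / (hist.sum() + 1e-12)     # normalized feature vector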
(2) Color feature extraction:
To extract color features, the invention adopts two color features, unaffected by luminance, computed on the chromatic channels from first-order Gaussian derivatives; experiments show that first-order Gaussian derivative information of color can be used to perceive the degradation of local structures. The first color feature, $x_1$, is defined over the R, G, and B channels, where R, G, and B respectively denote the red, green, and blue channels of the color space (its defining equation appears only as an image in the original). The $\mathrm{LBP}^{riu2}$ operation is then applied to $x_1$ to extract the feature map $M_{x_1}$, which is converted into a feature vector by the weighted histogram

$$H_{x_1}(K) = \sum_{i=1}^{N} \omega_i\, f\big(M_{x_1}(i), K\big)$$

where the weight $\omega_i$ is the value of the feature map $|x_1|$ at pixel i.
The second color feature, $x_2$, is defined over ρ, δ, and τ (its defining equation likewise appears only as an image in the original), where R', G', and B' respectively denote the first-order Gaussian derivative values of the R, G, and B channels in the horizontal direction, and ρ = 2R' - G' - B', δ = 2G' - R' - B', τ = 2B' - R' - G'. The same $\mathrm{LBP}^{riu2}$ operation is applied to $x_2$ to compute the feature map $M_{x_2}$, whose weighted histogram is calculated as:

$$H_{x_2}(K) = \sum_{i=1}^{N} \omega_i\, f\big(M_{x_2}(i), K\big)$$

where the weight $\omega_i$ is defined as the value of the feature map $|x_2|$ at pixel i.
Color features are invariant to luminance and to luminance-related scene effects such as shadows; because they are unaffected by illumination, they can represent strong structural information. Furthermore, image distortions caused by a single factor (such as blur) can damage the structure of the image without necessarily being related to luminance effects.
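Because the closed-form definitions of x1 and x2 survive only as equation images in the original, the sketch below stops at the quantities the text does define, namely the horizontal first-order Gaussian derivatives R', G', B' and the combinations ρ, δ, τ; the final combination into x1 and x2 is deliberately not reproduced.

    # Sketch: published intermediates of the color features.
    import numpy as np
    from scipy import ndimage

    def color_derivative_intermediates(rgb, sigma=1.0):
        """rgb: H x W x 3 array; returns (rho, delta, tau)."""
        rgb = np.asarray(rgb, dtype=np.float64)
        # order=(0, 1): first derivative along axis 1, i.e. the horizontal direction
        Rp = ndimage.gaussian_filter(rgb[..., 0], sigma, order=(0, 1))
        Gp = ndimage.gaussian_filter(rgb[..., 1], sigma, order=(0, 1))
        Bp = ndimage.gaussian_filter(rgb[..., 2], sigma, order=(0, 1))
        rho   = 2 * Rp - Gp - Bp
        delta = 2 * Gp - Rp - Bp
        tau   = 2 * Bp - Rp - Gp
        return rho, delta, tau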
(3) Image naturalness characterization:
The luminance distortion of a three-dimensional synthetic image affects its naturalness, and a high-quality three-dimensional synthetic image should possess the natural characteristics of a natural picture. The invention therefore uses luminance-based quality features to evaluate the naturalness of the three-dimensional synthetic image: considering that the luminance coefficients of natural images follow a Gaussian distribution, the naturalness of the synthetic image is computed from the luminance coefficients. The luminance coefficient L'(i,j) is defined as follows:

$$L'(i,j) = \frac{L(i,j) - \mu(i,j)}{\sigma(i,j) + C}$$

where (i,j) denotes the spatial index, i ∈ {1,2,…,a}, j ∈ {1,2,…,b}, a and b denote the height and width of the image, and C is a small constant that prevents division by zero.
In particular, μ(i,j) and σ(i,j) are defined as follows:

$$\mu(i,j) = \sum_{k=-3}^{3}\sum_{l=-3}^{3} \omega_{k,l}\, L(i+k,\, j+l)$$

$$\sigma(i,j) = \sqrt{\sum_{k=-3}^{3}\sum_{l=-3}^{3} \omega_{k,l}\,\big[L(i+k,\, j+l) - \mu(i,j)\big]^2}$$

where ω = {ω_{k,l} | k ∈ [-3,3], l ∈ [-3,3]} is a 2D centrosymmetric Gaussian weighting function sampled out to three standard deviations and rescaled to unit volume.
The luminance coefficients L'(i,j) are fitted with a zero-mean generalized Gaussian distribution, defined as follows:

$$f(x;\alpha,\sigma^2) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)}\exp\left(-\left(\frac{|x|}{\beta}\right)^{\alpha}\right), \qquad \beta = \sigma\sqrt{\frac{\Gamma(1/\alpha)}{\Gamma(3/\alpha)}}$$

where Γ(·) denotes the gamma function, the parameter α controls the general shape of the distribution, and σ controls the variance; the two parameters (α, σ²) are estimated with this model. The kurtosis and skewness of the luminance coefficients are also calculated, and these four parameters are extracted at 5 scales, yielding 20 features in total.
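A minimal sketch of this step follows: MSCN-style luminance coefficients and a moment-matching fit of the zero-mean generalized Gaussian distribution. The Gaussian window parameters, the constant C, and the search grid for α are assumptions rather than values fixed by the patent.

    # Sketch: luminance coefficients and GGD parameter estimation.
    import numpy as np
    from scipy import ndimage
    from scipy.special import gamma as gamma_fn

    def luminance_coefficients(L, C=1.0):
        L = np.asarray(L, dtype=np.float64)
        mu = ndimage.gaussian_filter(L, sigma=7/6, truncate=3.0)         # local mean
        var = ndimage.gaussian_filter(L * L, sigma=7/6, truncate=3.0) - mu**2
        sigma = np.sqrt(np.abs(var))                                     # local deviation
        return (L - mu) / (sigma + C)

    def fit_ggd(x):
        """Estimate (alpha, sigma^2) of a zero-mean GGD by moment matching."""
        alphas = np.arange(0.2, 10.0, 0.001)
        # theoretical ratio (E|x|)^2 / E[x^2] = Gamma(2/a)^2 / (Gamma(1/a) Gamma(3/a))
        r_theory = gamma_fn(2 / alphas)**2 / (gamma_fn(1 / alphas) * gamma_fn(3 / alphas))
        r_hat = np.mean(np.abs(x))**2 / np.mean(x**2)
        alpha = alphas[np.argmin((r_theory - r_hat)**2)]
        return alpha, np.mean(x**2)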
Then a Laplacian pyramid is calculated from the differences between the synthetic image and its low-pass filtered versions; a generalized Gaussian distribution model is used to fit the pixel-value distribution at each level of the Laplacian pyramid, and the kurtosis and skewness of the Laplacian pyramid are taken as features. The invention extracts these quality-aware features from five levels, generating another 20 features in total.
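The pyramid features can be sketched along the same lines, reusing fit_ggd from the sketch above; the level count and blur setting are assumptions.

    # Sketch: Laplacian-pyramid naturalness features (4 per level x 5 levels = 20).
    import numpy as np
    from scipy import ndimage

    def laplacian_pyramid_features(L, levels=5):
        feats, cur = [], np.asarray(L, dtype=np.float64)
        for _ in range(levels):
            low = ndimage.gaussian_filter(cur, sigma=1.0)
            lap = cur - low                          # band-pass residual at this level
            alpha, var = fit_ggd(lap.ravel())        # GGD fit from the sketch above
            z = lap - lap.mean()
            kurt = np.mean(z**4) / (np.mean(z**2)**2 + 1e-12)
            skew = np.mean(z**3) / (np.mean(z**2)**1.5 + 1e-12)
            feats += [alpha, var, kurt, skew]
            cur = low[::2, ::2]                      # downsample for the next scale
        return np.array(feats)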
(4) Regression model and quality prediction:
Studies show that the human visual system exhibits multi-scale properties when perceiving visual information, so visual features extracted from an image at multiple scales characterize it better. Representing the model through local and global features, the invention obtains a 310-dimensional feature vector in total, comprising 270-dimensional local variation features and 40-dimensional global naturalness features. A visual quality prediction model is then trained using the random forest method so that the quality features can be mapped to subjective evaluations, with the database partitioned at random: 80% of the image samples and their corresponding subjective scores are used for training, and the remaining 20% for testing. Finally, the Pearson linear correlation coefficient (PLCC), Spearman rank-order correlation coefficient (SRCC), Kendall rank correlation coefficient (KRCC), and root mean square error (RMSE) are summarized as the final results.
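The regression stage can be sketched with scikit-learn and SciPy as follows; the 80/20 split follows the text, while the forest size and other hyperparameters are illustrative defaults.

    # Sketch: random forest quality prediction and the four evaluation criteria.
    import numpy as np
    from scipy import stats
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    def train_and_evaluate(features, mos, seed=0):
        """features: (n_images, 310) array; mos: subjective quality scores."""
        X_tr, X_te, y_tr, y_te = train_test_split(features, mos,
                                                  test_size=0.2, random_state=seed)
        model = RandomForestRegressor(n_estimators=100, random_state=seed)
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        plcc, _ = stats.pearsonr(pred, y_te)
        srcc, _ = stats.spearmanr(pred, y_te)
        krcc, _ = stats.kendalltau(pred, y_te)
        rmse = float(np.sqrt(np.mean((pred - y_te) ** 2)))
        return plcc, srcc, krcc, rmse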
The process of the invention is shown in FIG. 1; the specific steps are as follows:
Step 1: extract structure and color features using Gaussian derivatives;
Step 2: encode the structure and color feature maps with local binary patterns and compute quality-aware features to derive local structure and color distortion;
Step 3: extract luminance features for global variation to evaluate the naturalness of the three-dimensional synthetic image;
Step 4: based on the extracted feature information, train a quality prediction model using random forest regression to map visual features to subjective scores.
The Pearson linear correlation coefficient (PLCC), Spearman rank-order correlation coefficient (SRCC), Kendall rank correlation coefficient (KRCC), and root mean square error (RMSE) were used as the final comparison results. Generally speaking, higher PLCC and SRCC values and lower RMSE values represent better performance, that is, better prediction accuracy of the algorithm. To verify the performance of the proposed algorithm, it is compared with existing full-reference and no-reference quality evaluation methods on three public databases, MCL-3D, IRCCyN/IVC, and IETR-DIBR; the compared methods include PSNR, SSIM, BRISQUE, NIQE, BIQI, NRSL, CM-LOG, MP-PSNRr, MW-PSNR, MW-PSNRr, LOGS, Ref, APT, and NIQSV+. The first seven are quality evaluation methods for natural pictures, and the last seven are quality evaluation methods designed specifically for synthesized views.
The MCL-3D database contains 693 stereoscopic image pairs derived from nine image-plus-depth sources. The IRCCyN/IVC DIBR database consists of 12 reference pictures chosen from three MVD sequences and 84 synthesized pictures generated by 7 different DIBR techniques. The IETR DIBR database consists of 150 synthesized pictures generated from 10 MVD sequences with 7 recent DIBR technologies; like the IRCCyN/IVC DIBR database, it is also mainly concerned with rendering distortion.
Table 1: comparison of the present invention with existing full reference methods
Figure BDA0002276501970000111
Figure BDA0002276501970000121
Table 1 compares the proposed method with existing full-reference methods; the results show that the proposed no-reference method performs better.
Table 2: comparison of the present invention with existing reference-free methods
Figure BDA0002276501970000122
Figure BDA0002276501970000131
Table 2 compares the proposed method with existing no-reference methods; the results show that the proposed no-reference method performs better on the tested databases.
The above-described embodiments are illustrative of the present invention and not restrictive. It should be understood that various changes, modifications, substitutions, and alterations can be made herein without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims (9)

1. A no-reference quality evaluation method for three-dimensional synthetic images based on local variation and global variation, characterized by comprising the following steps:
A. extracting structure and color features of the synthetic image using Gaussian derivatives;
B. encoding the obtained structure and color features with a local binary pattern to obtain a structure feature map and a color feature map, and computing quality features from these maps to obtain local structure and color distortion information;
C. for global variation, extracting luminance features to evaluate the naturalness of the three-dimensional synthetic image;
D. combining the extracted feature information and using a random forest regression model to learn the mapping between visual features and subjective quality scores, thereby predicting the quality score of the three-dimensional synthetic image.
2. The method of claim 1, wherein: structural features and color features of the image are extracted using Gaussian derivatives.
3. The method of claim 2, wherein: the structural features of the image are extracted using Gaussian derivatives, with the following specific steps:
A. a local Taylor series expansion can represent the local features of an image, and the coefficients of the local Taylor series can be obtained through local Gaussian derivatives; the Gaussian derivative of an image can be defined as follows:

$$I_\sigma^{m,n}(x,y) = I(x,y) * \frac{\partial^{m+n} G_\sigma(x,y)}{\partial x^m\, \partial y^n}$$

where m ≥ 0 and n ≥ 0 are the derivative orders along the horizontal direction x and the vertical direction y, and the symbol * denotes the convolution operation; $G_\sigma(x,y)$ is a Gaussian function whose standard deviation σ is defined as follows:

$$G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)$$

B. structural features are extracted using Gaussian derivatives up to second order: first, all derivatives with 1 ≤ m + n ≤ 2 are calculated, and the resulting matrix of Gaussian derivative maps can be expressed as:

$$\mathbf{I}_\sigma = \left[\, I_\sigma^{1,0},\; I_\sigma^{0,1},\; I_\sigma^{1,1},\; I_\sigma^{2,0},\; I_\sigma^{0,2} \,\right]$$
4. The method of claim 2, wherein: the color features of the image are extracted using Gaussian derivatives, with the following specific steps:
A. two color features that are unaffected by luminance are computed from the first-order Gaussian derivatives of the color channels. The first color feature, $x_1$, is defined over the R, G, and B channels, where R, G, and B respectively denote the red, green, and blue channels of the color space (its defining equation appears only as an image in the original); the other color feature, $x_2$, is defined over ρ, δ, and τ (its defining equation likewise appears only as an image in the original), where R', G', and B' respectively denote the first-order Gaussian derivative values of the R, G, and B channels in the horizontal direction, and ρ = 2R' - G' - B', δ = 2G' - R' - B', τ = 2B' - R' - G'.
5. The method of claim 2, wherein: local binary patterns are applied to the obtained structure and color features to obtain structure and color feature maps, from which quality features are calculated, thereby obtaining local structure and color distortion information.
6. The method of claim 5, wherein: a structure feature map is obtained with the local binary pattern method to derive local structure distortion information, with the following specific steps:
A. a rotation-invariant local binary pattern method is applied to each pixel of the Gaussian derivative maps; the computation is based on the absolute-value feature maps |s|, and the calculation formula is:

$$M_s = \mathrm{LBP}_{D,E}^{riu2}\big(|s|\big), \qquad s \in \{s_1, s_2, s_3, s_4, s_5\}$$

where s ranges over the five Gaussian derivative maps; LBP denotes the LBP operation; riu2 denotes the rotation-invariant uniform mode; D and E denote the number of surrounding pixels and the computation radius, with the number of surrounding pixels D set to 8 and the computation radius E set to 1. This yields 5 feature maps, $M_{s_1}$ through $M_{s_5}$, each describing the relationship between the central pixel and its neighboring pixels in a local region;
B. local structure distortion information is represented by a weighted histogram: the outputs of the same local binary pattern operator over |s| are accumulated into a weighted histogram, defined by the following formulas:

$$H_s(K) = \sum_{i=1}^{N} \omega_i\, f\big(M_s(i), K\big), \qquad f\big(M_s(i), K\big) = \begin{cases} 1, & M_s(i) = K \\ 0, & \text{otherwise} \end{cases}$$

where N denotes the number of image pixels; K denotes the LBP index, K ∈ [0, D+2]; and the weight $\omega_i$ is the value of the feature map |s| at pixel i. The Gaussian derivative magnitudes are thereby aggregated, fused, and mapped according to the LBP values, and the feature vector is obtained through a normalization operation, which enhances high-contrast variation within image regions so as to reflect local structure distortion information.
7. The method of claim 5, wherein: chrominance feature maps are obtained by encoding to derive local chrominance distortion information; wherein,
A. a feature vector is obtained using the local binary pattern: the $\mathrm{LBP}^{riu2}$ operation is applied to the extracted color feature $x_1$ to obtain the feature map $M_{x_1}$; the feature map is then converted into a feature vector by the weighted histogram

$$H_{x_1}(K) = \sum_{i=1}^{N} \omega_i\, f\big(M_{x_1}(i), K\big)$$

where the weight $\omega_i$, obtained through the local binary pattern operator, is the value of the feature map $|x_1|$ at pixel i;
B. the local chrominance information is represented by a weighted histogram: the $\mathrm{LBP}^{riu2}$ operation is applied to the extracted color feature $x_2$ to obtain the feature map $M_{x_2}$, and the weighted histogram of $x_2$ is then defined in the same way, with the weight $\omega_i$ given by the feature map $|x_2|$; finally, a single feature vector representing the color information of the image is calculated from the two weighted histograms (the combining equation appears only as an image in the original).
8. the method of claim 1, wherein: for global variation, the naturalness of the three-dimensional synthetic image is evaluated by extracting the brightness features: wherein the content of the first and second substances,
A. the luminance coefficient is fitted using a gaussian distribution, and the luminance coefficient (L') is defined as follows:
Figure FDA0002276501960000041
where (i, j) represents the spatial position of the pixel, and i e {1,2, …, a }, j e {1,2, …, b }, where a and b represent the height and width of the image, respectively, and μ (i, j) and σ (i, j) are defined as follows:
Figure FDA0002276501960000042
Figure FDA0002276501960000043
where ω is a two-dimensional, centrosymmetric gaussian weighting function, ω ═ ωa,b|a∈[-3,3],b∈[-3,3]};
The luminance parameter L' (i, j) is then fitted using a zero-mean generalized gaussian distribution, which defines the formula:
Figure FDA0002276501960000044
wherein the content of the first and second substances,
Figure FDA0002276501960000045
and is
Figure FDA0002276501960000046
Where parameter α controls the shape of the distribution, σ controls the variance;
B. subsequently, 4 parameters including shape parameters and variance of the generalized gaussian distribution and kurtosis and skewness of the luminance coefficient are calculated on 5 scales of the composite image, resulting in a total of 20-dimensional features; in addition, a laplacian pyramid image is computed from the difference between the composite image and its low-pass filtered image, the shape coefficients and variances are obtained using a generalized gaussian distribution model to fit the pixel values in the laplacian pyramid, the kurtosis and skewness of the laplacian pyramid are computed, and the four parameters are extracted from the five scales, yielding a total of 20-dimensional features.
9. The method of claim 1, wherein: a quality prediction model is trained using the random forest regression method, wherein,
A. feature information is selected: the local and global features together give a 310-dimensional feature vector, comprising 270-dimensional local variation features and 40-dimensional global naturalness features;
B. a visual quality prediction model is trained using the random forest regression method to map the quality features to subjective evaluations; the three-dimensional synthetic view quality database is randomly divided into a training set and a testing set, training is repeated 1000 times, and finally the mean Pearson linear correlation coefficient (PLCC), Spearman rank-order correlation coefficient (SRCC), Kendall rank correlation coefficient (KRCC), and root mean square error (RMSE) are reported as the final results.
CN201911124950.4A 2019-11-18 2019-11-18 Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation Pending CN110853027A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911124950.4A CN110853027A (en) 2019-11-18 2019-11-18 Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911124950.4A CN110853027A (en) 2019-11-18 2019-11-18 Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation

Publications (1)

Publication Number Publication Date
CN110853027A true CN110853027A (en) 2020-02-28

Family

ID=69600595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911124950.4A Pending CN110853027A (en) 2019-11-18 2019-11-18 Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation

Country Status (1)

Country Link
CN (1) CN110853027A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288699A (en) * 2020-10-23 2021-01-29 北京百度网讯科技有限公司 Method, device, equipment and medium for evaluating relative definition of image
CN112785494A (en) * 2021-01-26 2021-05-11 网易(杭州)网络有限公司 Three-dimensional model construction method and device, electronic equipment and storage medium
CN113643262A (en) * 2021-08-18 2021-11-12 上海大学 No-reference panoramic image quality evaluation method, system, equipment and medium
CN115511833A (en) * 2022-09-28 2022-12-23 广东百能家居有限公司 Glass surface scratch detection system
CN116309216A (en) * 2023-02-27 2023-06-23 南京博视医疗科技有限公司 Pseudo-color image fusion method and image fusion system based on multiple wave bands
CN116758060A (en) * 2023-08-10 2023-09-15 江苏森标科技有限公司 Vertical basket of flowers visual detection system of battery piece

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408561A (en) * 2016-09-10 2017-02-15 天津大学 Texture feature-based image quality evaluating method without reference
CN108010024A (en) * 2017-12-11 2018-05-08 宁波大学 It is a kind of blind with reference to tone mapping graph image quality evaluation method
CN109919959A (en) * 2019-01-24 2019-06-21 天津大学 Tone mapping image quality evaluating method based on color, naturality and structure
CN110046673A (en) * 2019-04-25 2019-07-23 上海大学 No reference tone mapping graph image quality evaluation method based on multi-feature fusion
CN110363704A (en) * 2019-05-29 2019-10-22 西北大学 Merge the image super-resolution rebuilding model construction and method for reconstructing of form and color

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408561A (en) * 2016-09-10 2017-02-15 天津大学 Texture feature-based image quality evaluating method without reference
CN108010024A (en) * 2017-12-11 2018-05-08 宁波大学 It is a kind of blind with reference to tone mapping graph image quality evaluation method
CN109919959A (en) * 2019-01-24 2019-06-21 天津大学 Tone mapping image quality evaluating method based on color, naturality and structure
CN110046673A (en) * 2019-04-25 2019-07-23 上海大学 No reference tone mapping graph image quality evaluation method based on multi-feature fusion
CN110363704A (en) * 2019-05-29 2019-10-22 西北大学 Merge the image super-resolution rebuilding model construction and method for reconstructing of form and color

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288699A (en) * 2020-10-23 2021-01-29 北京百度网讯科技有限公司 Method, device, equipment and medium for evaluating relative definition of image
CN112288699B (en) * 2020-10-23 2024-02-09 北京百度网讯科技有限公司 Method, device, equipment and medium for evaluating relative definition of image
CN112785494A (en) * 2021-01-26 2021-05-11 网易(杭州)网络有限公司 Three-dimensional model construction method and device, electronic equipment and storage medium
CN112785494B (en) * 2021-01-26 2023-06-16 网易(杭州)网络有限公司 Three-dimensional model construction method and device, electronic equipment and storage medium
CN113643262A (en) * 2021-08-18 2021-11-12 上海大学 No-reference panoramic image quality evaluation method, system, equipment and medium
CN115511833A (en) * 2022-09-28 2022-12-23 广东百能家居有限公司 Glass surface scratch detection system
CN116309216A (en) * 2023-02-27 2023-06-23 南京博视医疗科技有限公司 Pseudo-color image fusion method and image fusion system based on multiple wave bands
CN116309216B (en) * 2023-02-27 2024-01-09 南京博视医疗科技有限公司 Pseudo-color image fusion method and image fusion system based on multiple wave bands
CN116758060A (en) * 2023-08-10 2023-09-15 江苏森标科技有限公司 Vertical basket of flowers visual detection system of battery piece
CN116758060B (en) * 2023-08-10 2023-10-27 江苏森标科技有限公司 Vertical basket of flowers visual detection system of battery piece

Similar Documents

Publication Publication Date Title
CN110853027A (en) Three-dimensional synthetic image no-reference quality evaluation method based on local variation and global variation
CN107767413B (en) Image depth estimation method based on convolutional neural network
Yue et al. Combining local and global measures for DIBR-synthesized image quality evaluation
Guttmann et al. Semi-automatic stereo extraction from video footage
CN104899845B (en) A kind of more exposure image fusion methods based on the migration of l α β spatial scenes
Jiang et al. Image dehazing using adaptive bi-channel priors on superpixels
RU2426172C1 (en) Method and system for isolating foreground object image proceeding from colour and depth data
CN105894484B (en) A kind of HDR algorithm for reconstructing normalized based on histogram with super-pixel segmentation
Dudhane et al. C^ 2msnet: A novel approach for single image haze removal
CN108898575B (en) Novel adaptive weight stereo matching method
CN109685045B (en) Moving target video tracking method and system
CN107635136B (en) View-based access control model perception and binocular competition are without reference stereo image quality evaluation method
CN109345502B (en) Stereo image quality evaluation method based on disparity map stereo structure information extraction
AU2016302049C1 (en) 2D-to-3D video frame conversion
CN108596975A (en) A kind of Stereo Matching Algorithm for weak texture region
CN109242834A (en) It is a kind of based on convolutional neural networks without reference stereo image quality evaluation method
CN108537782A (en) A method of building images match based on contours extract with merge
CN107633495A (en) A kind of infrared polarization based on complementary relationship and the more embedded fusion methods of algorithm 2D VMD of intensity image
CN112950596A (en) Tone mapping omnidirectional image quality evaluation method based on multi-region and multi-layer
Kuo et al. Depth estimation from a monocular view of the outdoors
CN110910365A (en) Quality evaluation method for multi-exposure fusion image of dynamic scene and static scene simultaneously
CN115953321A (en) Low-illumination image enhancement method based on zero-time learning
CN109218706A (en) A method of 3 D visual image is generated by single image
CN105528772B (en) A kind of image interfusion method based on directiveness filtering
Calagari et al. Data driven 2-D-to-3-D video conversion for soccer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination