CN111988613B

CN111988613B - Screen content video quality analysis method based on tensor decomposition

Info

Publication number: CN111988613B
Application number: CN202010778526.8A
Authority: CN
Inventors: 曾焕强; 黄海靓; 陈婧; 侯军辉; 曹九稳; 张云
Original assignee: Huaqiao University
Current assignee: Huaqiao University
Priority date: 2020-08-05
Filing date: 2020-08-05
Publication date: 2022-11-01
Anticipated expiration: 2040-08-05
Also published as: CN111988613A

Abstract

The invention relates to a screen content video quality analysis method based on tensor decomposition, which comprises the following steps: carrying out tensor decomposition on the selected reference screen content video sequence and the distorted screen content sequence respectively to obtain principal component slices of a three-direction slice set; respectively extracting Gabor characteristic graphs of the three-direction reference principal component slice and the three-direction distortion principal component slice, and calculating to obtain a three-direction characteristic similarity graph; and obtaining a final distorted screen content video quality analysis value based on the three-direction feature similarity graph. The method fully utilizes the tensor decomposition theory to describe the basic texture structure of the screen content video, extracts edge information highly sensitive to human eyes through the Gabor filter, reflects the subjective perception of a human eye vision system on the screen content video, and has better distorted screen content video quality analysis performance.

Description

Screen content video quality analysis method based on tensor decomposition

Technical Field

The invention belongs to the field of video processing, relates to a video quality analysis method, and particularly relates to a screen content video quality analysis method based on tensor decomposition.

Background

With the rapid development of mobile internet and multimedia technology, screen content video attracts extensive attention in academic and industrial fields, and is widely applied to cloud computing, distance education, online live broadcast, video conferencing and other applications. Different from the traditional natural scene video, the screen content video not only contains continuous tone areas obtained by shooting through a camera, such as photos, natural scene video and the like, but also contains non-continuous tone areas obtained based on a computer, such as characters, diagrams, two-dimensional codes and the like, and also contains motion information with rich changes.

As with conventional natural scene video, screen content video inevitably introduces various distortions during generation, processing, compression, storage, transmission, and rendering, resulting in a reduction in visual effect. Since human eyes are the final recipients of screen content videos, it is necessary to provide a quality analysis model that can quickly and accurately reflect the subjective perceptibility of the screen content videos by the human visual system. However, most of the existing quality analysis methods are designed for the traditional natural scene video and are not suitable for the quality analysis of the screen content video. There is currently a lack of methods for efficient quality analysis of screen content video in the field of video processing. Therefore, the method for analyzing the screen content video quality according with the human eye visual characteristics has important theoretical research significance and practical application value.

Disclosure of Invention

The invention aims to break through the limitation of the prior art and provides a screen content video quality analysis method based on tensor decomposition.

The technical scheme adopted by the invention for solving the technical problem is as follows:

the screen content video quality analysis method based on tensor decomposition comprises the following steps:

input reference screen content video sequence V_rAnd distorted screen content video sequence V_d；

For reference screen content video sequence V_rAnd distorted screen content video sequence V_dCarrying out tensor decomposition to obtain three-direction reference principal component slice M_r,x、M_r,y、M_r,tAnd three-directional distortion principal component slice M_d,x、M_d,y、M_d,t；

Respectively extracting three-direction reference principal component slices M_r,x、M_r,y、M_r,tGabor profile F_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) and three-dimensional distortion principal component slice M_d,x、M_d,y、M_d,tGabor profile F_d,x(x,y)、F_d,y(x,y)、F_d,t(x,y)；

Calculating three-direction reference Gabor characteristic diagram F_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) and three-direction distortion Gabor characteristic diagram F_d,x(x,y)、F_d,y(x,y)、F_d,t(x, y) feature similarity graph S_x(x,y)、S_y(x,y)、S_t(x,y)；

Similarity graph S based on three-direction features_x(x,y)、S_y(x,y)、S_t(x, y) obtaining a final distorted screen content video quality analysis value.

Preferably, for reference screen content video sequence V_rAnd distorted screen content video sequence V_dCarrying out tensor decomposition to obtain three-direction reference principal component slice M_r,x、M_r,y、M_r,tAnd three-directional distortion principal component slice M_d,x、M_d,y、M_d,tThe method comprises the following steps:

step 2.1: video sequence V to be referenced to screen content_rIs regarded as a third-order tensor, and is converted into a core tensor through tensor decomposition

And three factor matrices A_r,B_r,C_rThe combination of (a) and (b) is specifically as follows:

wherein the extract is_nDenotes n-modulo multiplication, n =1,2,3, three factor matrices a_r,B_r,C_rRespectively representing the original video sequence V_rPrincipal components in x, y and t directions, which are orthogonal to each other, and core tensor

Is represented as follows:

video sequence V with distorted screen content_dIs regarded as a third-order tensor, and is converted into a core tensor through tensor decomposition

And three factor matrices A_d,B_d,C_dThe combination of (a) and (b) is specifically as follows:

wherein the extract is_nDenotes n-modulo multiplication, n =1,2,3, three factor matrices a_d,B_d,C_dRespectively representing the original video sequence V_dPrincipal components in x, y and t directions, which are orthogonal to each other, and core tensor

Is represented as follows:

step 2.2: respectively setting a reference factor matrix A_rAnd distortion factor matrix A_dFor the identity matrix, a reference screen content video sequence V is obtained_rAnd distorted screen content video sequence V_dSet of vertical spatiotemporal slices cut along the x-axis direction as follows:

respectively setting reference factor matrixes B_rAnd distortion factor momentArray B_dFor the identity matrix, a reference screen content video sequence V is obtained_rAnd distorted screen content video sequence V_dSet of horizontal spatiotemporal slices cut along the y-axis direction as follows:

respectively setting reference factor matrixes C_rAnd distortion factor matrix C_dFor the identity matrix, a reference screen content video sequence V is obtained_rAnd distorted screen content video sequence V_dThe set of spatial slices cut along the t-axis direction is as follows:

step 2.3: extracting a reference screen content video sequence V_rThe slice with the largest energy in the three-direction slice set is used as a reference principal component slice M_r,x、M_r,y、M_r,tThe method comprises the following steps:

extracting a distorted screen content video sequence V_dThe slice with the largest energy in the three-direction slice set is used as a reference principal component slice M_d,x、M_d,y、M_d,tThe method comprises the following steps:

wherein w =1_,2,...,W，h＝1_,2,...,H，l＝1_,2, L, W, H, L respectively represent the number of slices of the three-directional slice set.

Preferably, a three-direction reference Gabor characteristic diagram F is calculated_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) and three-direction distortion Gabor characteristic diagram F_d,x(x,y)、F_d,y(x,y)、F_d,t(x, y) feature similarity graph S_x(x,y)、S_y(x,y)、S_t(x, y), as follows:

respectively extracting three-direction reference principal component slices M_r,x、M_r,y、M_r,tGabor profile F_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) as follows:

wherein, G_i(x, y) is a Gabor filter as follows:

x^′＝xcosθ+ysinθ

y^′＝y cosθ-xsinθ

where (x, y) denotes coordinates of each pixel in the input principal component slice, i denotes a direction index of the Gabor filter, f and θ are frequency amplitude and direction information of the sinusoidal plane wave (x ', y'), σ_xAnd σ_yThe standard deviation of the gaussian kernel in the x-axis direction and the y-axis direction, respectively, is taken here as f =0.2, σ_x＝2.15,σ_y=0.15.n is the total number of directions, where a total of 12 directions are considered, corresponding to Gabor filters of θ = i × pi/12, i ∈ { 0.. 11}, respectively.

Respectively extracting three-direction distortion principal component slices M_d,x、M_d,y、M_d,tGabor profile F_d,x(x,y)、F_d,y(x,y)、F_d,t(x, y) as follows:

where c is a constant to ensure numerical stability, c =1000.

Preferably, the similarity graph S is based on three-direction characteristic_x(x,y)、S_y(x,y)、S_t(x, y) obtaining a final distorted screen content video quality analysis value, which is as follows:

by pooling the x-direction feature similarity map S_x(x, y) obtaining an x-direction distortion screen content video quality score:

ω_x(x,y)＝max{|F_r,x(x,y)|,|F_d,x(x,y)|}

by pooling the y-direction feature similarity map S_y(x, y) deriving a y-direction distortion screen content video quality score:

ω_y(x,y)＝max{|F_r,y(x,y)|,|F_d,y(x,y)}

by pooling t-direction feature similarity maps S_x(x, y) obtaining a t-direction distortion screen content video quality score:

ω_t(x,y)＝max{|F_r,t(x,y)|,|F_d,t(x,y)|}

combining the quality scores of the distorted screen contents in all directions to obtain a final distorted screen content video quality analysis value:

Score＝score_x·score_y·score_t

the invention has the following beneficial effects:

the invention provides a screen content video quality analysis method based on tensor decomposition. The method focuses on fully considering the characteristics of a human eye vision system and the characteristics of screen content videos, adopts tensor decomposition to obtain main texture structure information of the screen content videos, fully utilizes Gabor characteristics to capture edge information highly sensitive to human eyes, reflects the subjective perception of the human eye vision subjective vision system on the screen content videos, and has better screen content video quality analysis performance.

Drawings

FIG. 1 is a schematic flow diagram of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples.

Referring to fig. 1, a method for analyzing the video quality of the screen content based on tensor decomposition includes the following specific steps:

step 1, inputting a reference screen content video sequence V_rAnd distorted screen content video sequence V_d；

Step 2, for the reference screen content video sequence V_rAnd distorted screen content video sequence V_dCarrying out tensor decomposition to obtain three-direction reference principal component slice M_r,x、M_r,y、M_r,tAnd three-directional distortion principal component slice M_d,x、M_d,y、M_d,tThe method comprises the following steps:

step 2.1: video sequence V to be referenced to screen content_rIs regarded as a third-order tensor, and is converted by tensor decompositionConversion to a core tensor

wherein the extract is_nDenotes n-modulo multiplication, n =1,2,3, three factor matrices a_r,B_r,C_rRespectively representing an original video sequence V_rPrincipal components in x, y and t directions, which are orthogonal to each other, and core tensor

Is represented as follows:

Is represented as follows:

respectively setting reference factor matrixes B_rAnd distortion factor matrix B_dFor the identity matrix, a reference screen content video sequence V is obtained_rAnd distorted screen content video sequence V_dSet of horizontal spatiotemporal slices cut along the y-axis direction as follows:

extraction of distorted screen content video sequence V_dThe slice with the largest energy in the three-direction slice set is used as a reference principal component slice M_d,x、M_d,y、M_d,tThe method comprises the following steps:

Step 3, calculating a three-direction reference Gabor characteristic diagram F_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) and three-direction distortion Gabor feature map F_d,x(x,y)、F_d,y(x,y)、F_d,t(x, y) feature similarity graph S_x(x,y)、S_y(x,y)、S_t(x, y), as follows:

wherein G is_i(x, y) is a Gabor filter as follows:

x^′＝xcosθ+ysinθ

y^′＝y cosθ-xsinθ

step 4, calculating a three-direction reference Gabor characteristic diagram F_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) and three-direction distortion Gabor characteristic diagram F_d,x(x,y)、F_d,y(x,y)、F_d,t(x, y) feature similarity graph S_x(x,y)、S_y(x,y)、S_t(x, y), as follows:

where c is a constant to ensure numerical stability, c =1000.

Step 5, similarity graph S based on three-direction characteristics_x(x,y)、S_y(x,y)、S_t(x, y) obtaining a final distorted screen content video quality analysis value, which is as follows:

ω_x(x,y)＝max{|F_r,x(x,y)|,|F_d,x(x,y)|}

ω_y(x,y)＝max{|F_r,y(x,y)|,|F_d,y(x,y)}

by pooling t-direction feature similarity map S_x(x, y) obtaining a t-direction distortion screen content video quality score:

ω_t(x,y)＝max{|F_r,t(x,y)|,|F_d,t(x,y)|}

Score＝score_x·score_y·score_t

the above examples are provided only for illustrating the present invention and are not intended to limit the present invention. Changes, modifications, etc. to the above-described embodiments are intended to fall within the scope of the claims of the present invention as long as they are in accordance with the technical spirit of the present invention.

Claims

1. A screen content video quality analysis method based on tensor decomposition is characterized by comprising the following steps:

inputting a reference screen content video sequence V_rAnd distorted screen content video sequence V_d；

Similarity graph S based on three-direction features_x(x,y)、S_y(x,y)、S_t(x, y) obtaining a final distorted screen content video quality analysis value;

for reference screen content video sequence V_rAnd distorted screen content video sequence V_dCarrying out tensor decomposition to obtain three-direction reference principal component slice M_r,x、M_r,y、M_r,tAnd three-directional distortion principal component slice M_d,x、M_d,y、M_d,tThe method comprises the following steps:

Is represented as follows:

Is represented as follows:

step 2.2: respectively setting a reference factor matrix A_rAnd distortion factor matrix A_dFor the identity matrix, a reference screen is obtainedContent video sequence V_rAnd distorted screen content video sequence V_dSet of vertical spatiotemporal slices cut along the x-axis direction as follows:

step 2.3: extracting a reference screen content video sequence V_rThree-directional slice collection ofThe slice with the highest energy is used as the reference principal component slice M_r,x、M_r,y、M_r,tThe method comprises the following steps:

where W =1,2., W, H =1,2., H, L =1,2., L, W, H, L respectively represent the number of slices of a three-directional slice set.

2. The tensor decomposition-based screen content video quality analysis method as recited in claim 1, wherein three-directional reference principal component slices M are respectively extracted_r,x、M_r,y、M_r,tGabor profile F_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) and three-dimensional distortion principal component slice M_d,x、M_d,y、M_d,tGabor feature map of (1)_d,x(x,y)、F_d,y(x,y)、F_d,t(x, y), as follows:

wherein G is_i(x, y) is a Gabor filter as follows:

x′＝xcosθ+ysinθ

y′＝ycosθ-xsinθ

where (x, y) denotes coordinates of each pixel in the input principal component slice, i denotes a direction index of the Gabor filter, f and θ are frequency amplitude and direction information of the sinusoidal plane wave (x ', y'), σ_xAnd σ_yThe standard deviation of the gaussian kernel in the x-axis direction and the y-axis direction, respectively, is taken here as f =0.2, σ_x＝2.15,σ_y=0.15; n is the total number of directions, and a total of 12 directions are considered here, which respectively correspond to Gabor filters of theta = i × π/12, i ∈ { 0.. 11 };

3. the tensor decomposition-based screen content video quality analysis method as recited in claim 1, wherein: calculating three-direction reference Gabor characteristic diagram F_r,x(x,y)、F_r,y(x,y)、F_r,t(x, y) and three-direction distortion Gabor characteristic diagram F_d,x(x,y)、F_d,y(x,y)、F_d,t(x, y) feature similarity graph S_x(x,y)、S_y(x,y)、S_t(x, y), as follows:

where c is a constant to ensure numerical stability, c =1000.

4. The tensor decomposition-based screen content video quality analysis method as recited in claim 1, wherein: similarity graph S based on three-direction features_x(x,y)、S_y(x,y)、S_t(x, y) obtaining a final distorted screen content video quality analysis value, which is as follows:

ω_x(x,y)＝max{F_r,x(x,y),F_d,x(x,y)}

ω_y(x,y)＝max{F_r,y(x,y),F_d,y(x,y)}

ω_t(x,y)＝max{F_r,t(x,y),F_d,t(x,y)}

and combining the quality scores of the distorted screen contents in all directions to obtain a final distorted screen content video quality analysis value:

Score＝score_x·score_y·score_t。