CN108337504A - Method and device for evaluating video quality - Google Patents
Method and device for evaluating video quality
- Publication number
- CN108337504A CN108337504A CN201810088362.9A CN201810088362A CN108337504A CN 108337504 A CN108337504 A CN 108337504A CN 201810088362 A CN201810088362 A CN 201810088362A CN 108337504 A CN108337504 A CN 108337504A
- Authority
- CN
- China
- Prior art keywords
- video frame
- view
- stereoscopic video
- set number
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
- G06T2207/10021—Stereoscopic video; Stereoscopic image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
Abstract
The present invention provides a method and device for evaluating video quality. A method of evaluating video quality includes: extracting from a stereoscopic video a set number of first-view video frames, and a set number of second-view video frames corresponding to the set number of first-view video frames; and performing quality evaluation on the set number of first-view video frames and the set number of second-view video frames using a trained, preset two-stream deep convolutional neural network, to obtain a quality evaluation result for the stereoscopic video. This stereoscopic video quality evaluation process requires no manual extraction of video features, achieves fully automatic video quality evaluation, and improves both the accuracy and the degree of automation of video quality evaluation.
Description
Technical field
The present invention relates to the field of machine learning, and in particular to a method and device for evaluating video quality.
Background art
With the rapid development of stereoscopic display devices, we can watch stereoscopic video on a three-dimensional television (Three-Dimensional Television, 3DTV). The left and right views of a stereoscopic video fuse to produce depth perception, giving the viewer a degree of immersion. Throughout the stereoscopic video processing chain, however, such as acquisition, compression, transmission, reconstruction, and display, the original stereoscopic video is subject to various quality impairments. Designing an accurate algorithm to automatically evaluate the quality of experience of stereoscopic video is therefore vital to the entire stereoscopic video processing pipeline.
Most current research on stereoscopic video quality evaluation is based on full-reference algorithms, which extract hand-crafted features from both the original undistorted stereoscopic video and the distorted stereoscopic video to evaluate the quality of the distorted video. In most practical applications, however, the undistorted stereoscopic video required by full-reference algorithms is unavailable. Moreover, hand-crafted features are often insufficiently accurate and robust, require a large amount of prior knowledge, and leave the evaluation process insufficiently automated.
Summary of the invention
To remedy the above defects and deficiencies in the prior art, the present invention proposes the following technical solutions:
A method of evaluating video quality, including:
extracting from a stereoscopic video a set number of first-view video frames, and a set number of second-view video frames corresponding to the set number of first-view video frames;
performing quality evaluation on the set number of first-view video frames and the set number of second-view video frames using a trained, preset two-stream deep convolutional neural network, to obtain a quality evaluation result for the stereoscopic video.
Preferably, after extracting the set number of first-view video frames and the set number of second-view video frames from the stereoscopic video, the method further includes:
dividing the set number of first-view video frames and the set number of second-view video frames into image blocks according to a preset image division method.
Preferably, performing quality evaluation on the set number of first-view video frames and the set number of second-view video frames using the trained, preset two-stream deep convolutional neural network to obtain the quality evaluation result for the stereoscopic video includes:
using the trained, preset two-stream deep convolutional neural network, performing quality evaluation on each group of image-block pairs in each first-view video frame and its corresponding second-view video frame, to obtain a quality score for each group of image-block pairs in each first-view video frame and its corresponding second-view video frame;
spatially averaging the quality scores of the groups of image-block pairs in each first-view video frame and its corresponding second-view video frame, to obtain a quality score for each first-view video frame and its corresponding second-view video frame;
temporally averaging the quality scores of the first-view video frames and their corresponding second-view video frames, to obtain the quality evaluation result for the stereoscopic video.
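The spatial-then-temporal averaging described above can be sketched in a few lines of Python. This is a minimal illustration only; the function name and the list-based layout of the scores are assumptions, not part of the patent:

```python
def evaluate_stereo_video(patch_scores_per_frame):
    """patch_scores_per_frame: one list per extracted key-frame pair, each
    holding the network's quality score for every left/right image-block
    pair in that frame pair."""
    # Spatial averaging: mean over all image-block pairs of a frame pair
    frame_scores = [sum(scores) / len(scores) for scores in patch_scores_per_frame]
    # Temporal averaging: mean over the extracted key-frame pairs
    return sum(frame_scores) / len(frame_scores)
```

The result is a single score for the whole stereoscopic video, obtained from per-block scores without any further learned parameters.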
Preferably, the training process of the preset two-stream deep convolutional neural network includes:
performing the following operations in a loop until the computed quality evaluation loss is below a set threshold:
inputting a first-view video frame of a preset stereoscopic video and the corresponding second-view video frame into the preset two-stream deep convolutional neural network, to obtain a video-frame quality score for the stereoscopic video;
computing a quality evaluation loss from the video-frame quality score and the standard quality score of the video frame;
if the quality evaluation loss is not below the set threshold, using the quality evaluation loss to perform a backward parameter update on the preset two-stream deep convolutional neural network.
Preferably, extracting from the stereoscopic video the set number of first-view video frames and the set number of second-view video frames includes:
extracting a set number of temporal key frames from the first-view video sequence of the stereoscopic video to obtain the set number of first-view video frames, and extracting from the second-view video sequence of the stereoscopic video the second-view video frames corresponding to the set number of first-view video frames, to obtain the set number of second-view video frames.
A device for evaluating video quality, including:
a video frame extraction unit, configured to extract from a stereoscopic video a set number of first-view video frames, and a set number of second-view video frames corresponding to the set number of first-view video frames;
a video quality evaluation unit, configured to perform quality evaluation on the set number of first-view video frames and the set number of second-view video frames using a trained, preset two-stream deep convolutional neural network, to obtain a quality evaluation result for the stereoscopic video.
Preferably, the device further includes:
a video frame division unit, configured to divide the set number of first-view video frames and the set number of second-view video frames into image blocks according to a preset image division method.
Preferably, when the video quality evaluation unit performs quality evaluation on the set number of first-view video frames and the set number of second-view video frames using the trained, preset two-stream deep convolutional neural network to obtain the quality evaluation result for the stereoscopic video, it is specifically configured to:
use the trained, preset two-stream deep convolutional neural network to perform quality evaluation on each group of image-block pairs in each first-view video frame and its corresponding second-view video frame, to obtain a quality score for each group of image-block pairs in each first-view video frame and its corresponding second-view video frame;
spatially average the quality scores of the groups of image-block pairs in each first-view video frame and its corresponding second-view video frame, to obtain a quality score for each first-view video frame and its corresponding second-view video frame;
temporally average the quality scores of the first-view video frames and their corresponding second-view video frames, to obtain the quality evaluation result for the stereoscopic video.
Preferably, the video quality evaluation unit is further configured to train the preset two-stream deep convolutional neural network;
when training the preset two-stream deep convolutional neural network, the video quality evaluation unit is specifically configured to:
perform the following operations in a loop until the computed quality evaluation loss is below a set threshold:
input a first-view video frame of a preset stereoscopic video and the corresponding second-view video frame into the preset two-stream deep convolutional neural network, to obtain a video-frame quality score for the stereoscopic video;
compute a quality evaluation loss from the video-frame quality score and the standard quality score of the video frame;
if the quality evaluation loss is not below the set threshold, use the quality evaluation loss to perform a backward parameter update on the preset two-stream deep convolutional neural network.
Preferably, when extracting from the stereoscopic video the set number of first-view video frames and the set number of second-view video frames, the video frame extraction unit is specifically configured to:
extract a set number of temporal key frames from the first-view video sequence of the stereoscopic video to obtain the set number of first-view video frames, and extract from the second-view video sequence of the stereoscopic video the second-view video frames corresponding to the set number of first-view video frames, to obtain the set number of second-view video frames.
When evaluating the video quality of a stereoscopic video, the present invention first extracts a set number of first-view video frames and second-view video frames from the stereoscopic video, then inputs the extracted first-view and second-view video frames into a trained two-stream deep convolutional neural network for quality evaluation, obtaining a quality evaluation result for the stereoscopic video. The proposed method of evaluating video quality thus evaluates the video quality of a stereoscopic video with a trained deep convolutional neural network; this stereoscopic video quality evaluation process requires no manual extraction of video features, achieves fully automatic video quality evaluation, and improves both the accuracy and the degree of automation of video quality evaluation.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description show only embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from the provided drawings without creative effort.
Fig. 1 is a flow diagram of a method of evaluating video quality provided by an embodiment of the present invention;
Fig. 2 is an architecture diagram of the two-stream deep convolutional neural network provided by an embodiment of the present invention;
Fig. 3 is a flow diagram of training the two-stream deep convolutional neural network provided by an embodiment of the present invention;
Fig. 4 is a flow diagram of another method of evaluating video quality provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of extracting key video frames from a video sequence provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of a device for evaluating video quality provided by an embodiment of the present invention;
Fig. 7 is a structural diagram of another device for evaluating video quality provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions of the embodiments of the present invention are suited to application scenarios in which stereoscopic video quality is evaluated. With the technical solutions of the embodiments of the present invention, the quality of a stereoscopic video can be evaluated automatically.
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention discloses a method of evaluating video quality. As shown in Fig. 1, the method includes:
S101, extracting from a stereoscopic video a set number of first-view video frames, and a set number of second-view video frames corresponding to the set number of first-view video frames.
Specifically, a stereoscopic video is composed of left-view and right-view planar video sequences. Any stereoscopic video therefore contains video sequences for two views; in the embodiments of the present invention these are distinguished as the first-view video sequence and the second-view video sequence.
Evaluating the video quality of a stereoscopic video in fact means evaluating the video quality of each of its video frames. In an actual evaluation, the video quality of the entire stereoscopic video can be determined from the video quality of each frame of each view's video sequence.
In the embodiments of the present invention, a certain number of key video frames are extracted from each view's video sequence; the video quality of the extracted key frames is evaluated to represent the video quality of that view's entire sequence, and the video qualities of the extracted key frames are then combined to determine the video quality of the entire stereoscopic video.
S102, performing quality evaluation on the set number of first-view video frames and the set number of second-view video frames using a trained, preset two-stream deep convolutional neural network, to obtain a quality evaluation result for the stereoscopic video.
Specifically, the preset two-stream deep convolutional neural network is a network specifically constructed in the embodiments of the present invention to evaluate the video quality of the left-view and right-view video sequences of a stereoscopic video. After the two-stream deep convolutional neural network is built, it is trained with a large amount of sample data: it is trained to evaluate the quality of input video sequences, and its internal parameters are adjusted according to the evaluation results so that the evaluation becomes more accurate.
When the evaluation accuracy of the trained two-stream deep convolutional neural network meets the requirement, the network is used to perform quality evaluation on the first-view and second-view video frames of the stereoscopic video respectively, yielding a quality evaluation result for the entire stereoscopic video.
The technical solution of the embodiments of the present invention applies a trained two-stream deep convolutional neural network to the left-view and right-view videos of a stereoscopic video, thereby effectively evaluating stereoscopic video quality. This evaluation process requires no manual extraction of video features, achieves fully automatic video quality evaluation, and improves both the accuracy and the degree of automation of video quality evaluation.
It can be seen that the technical solution of the embodiments of the present invention evaluates stereoscopic video quality with a two-stream deep convolutional neural network. The evaluation requires no hand-crafted features extracted with extensive prior knowledge; feature learning is performed entirely by the designed deep neural network, enabling no-reference quality evaluation of stereoscopic video.
The structural framework of the two-stream deep convolutional neural network used in the embodiments of the present invention is shown in Fig. 2. In the network of Fig. 2, the two streams have identical structures and identical parameters; each is the same deep convolutional neural network, obtained by modifying AlexNet. From input to output, each stream consists of a convolutional layer, a pooling layer, a convolutional layer, a pooling layer, three convolutional layers, a pooling layer, and a fully connected layer. The fully connected layers of the two streams are then joined by a fusion layer. To suit the processing demands of video images, the embodiments of the present invention set the convolution kernels of the convolutional layers of the deep convolutional neural network to 3 × 3.
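The per-stream layer order can be checked by propagating feature-map sizes through it. The sketch below assumes 3 × 3 convolutions with unit padding and 2 × 2 stride-2 pooling; the patent fixes only the 3 × 3 kernel size, so the padding, stride, and pooling parameters here are assumptions:

```python
def conv_out(h, w, k=3, stride=1, pad=1):
    # Output size of a convolution layer (3x3 kernel, unit padding assumed)
    return ((h + 2 * pad - k) // stride + 1, (w + 2 * pad - k) // stride + 1)

def pool_out(h, w, k=2, stride=2):
    # Output size of a pooling layer (2x2 window, stride 2 assumed)
    return ((h - k) // stride + 1, (w - k) // stride + 1)

def stream_feature_size(h, w):
    # Layer order of one stream: conv, pool, conv, pool, conv, conv, conv, pool
    # (the fully connected layer and the fusion layer then consume the result)
    for layer in ("conv", "pool", "conv", "pool", "conv", "conv", "conv", "pool"):
        h, w = conv_out(h, w) if layer == "conv" else pool_out(h, w)
    return h, w
```

For a 32 × 32 input image block, `stream_feature_size(32, 32)` gives a 4 × 4 feature map under these assumptions, i.e. each stream downsamples only at its three pooling layers.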
After the two-stream deep convolutional neural network shown in Fig. 2 is built, it must be trained so that it can automatically evaluate the quality of an input stereoscopic video. As shown in Fig. 3, the training process proposed by the embodiments of the present invention for the two-stream deep convolutional neural network specifically includes:
S301, inputting a first-view video frame of a preset stereoscopic video and the corresponding second-view video frame into the two-stream deep convolutional neural network, to obtain a video-frame quality score for the stereoscopic video.
Specifically, any frame of a stereoscopic video contains a left-view and a right-view video frame; the embodiments of the present invention distinguish these as the first-view video frame and the second-view video frame.
In the embodiments of the present invention, evaluating the quality of any frame of a stereoscopic video requires evaluating the quality of the first-view and second-view video frames that the frame comprises, and combining the evaluation results to obtain the quality evaluation of that frame of the stereoscopic video. The quality evaluation of the stereoscopic video is carried out entirely through the quality evaluation of its video frames. After a first-view video frame of the stereoscopic video and the corresponding second-view video frame are input into the two-stream deep convolutional neural network, each stream applies a series of convolution and pooling operations to its input frame, finally producing a video-frame quality score for each of the two corresponding frames.
The mutually corresponding first-view and second-view video frames can be further divided into image blocks, so that the two corresponding frames become multiple groups of mutually corresponding image-block pairs. For example, the first-view video frame is divided into image blocks a_{i,j}, i = 1, 2, ..., m; j = 1, 2, ..., n. Using the same image-block division, the second-view video frame is divided into image blocks b_{i,j}, i = 1, 2, ..., m; j = 1, 2, ..., n. Each corresponding a_{i,j} and b_{i,j} then forms an image-block pair. The two-stream deep convolutional neural network performs quality evaluation on the mutually corresponding image blocks contained in the first-view and second-view video frames, and spatially averaging the quality evaluation results of all image-block pairs yields the quality evaluation result of the mutually corresponding first-view and second-view video frames.
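The identical division of both views into corresponding blocks a_{i,j} and b_{i,j} can be illustrated as follows. This is a stdlib-only sketch operating on frames stored as 2-D lists; the function names and the dict keyed by grid position are illustrative choices, not specified by the patent:

```python
def divide_into_blocks(frame, block_h, block_w):
    """Split a frame (2-D list, H x W) into non-overlapping
    block_h x block_w blocks, keyed by their (i, j) grid position."""
    blocks = {}
    for i in range(len(frame) // block_h):
        for j in range(len(frame[0]) // block_w):
            blocks[(i, j)] = [row[j * block_w:(j + 1) * block_w]
                              for row in frame[i * block_h:(i + 1) * block_h]]
    return blocks

def make_block_pairs(left_frame, right_frame, block_h, block_w):
    # The same division is applied to both views, so a_{i,j} pairs with b_{i,j}
    a = divide_into_blocks(left_frame, block_h, block_w)
    b = divide_into_blocks(right_frame, block_h, block_w)
    return {key: (a[key], b[key]) for key in a}
```

Each value of the returned dict is one image-block pair, ready to be fed to the two streams of the network.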
It should be noted that the number of training samples is a key factor in the training effect of a deep convolutional neural network. Training a deep convolutional neural network requires a large number of samples, and existing stereoscopic video quality evaluation procedures suffer precisely from small training sets, which lead to insufficiently trained networks and thus inaccurate quality evaluation. Dividing the stereoscopic video sequence into image blocks, as proposed by the embodiments of the present invention, yields a sufficient number of training samples, improving the training of the network and the accuracy of quality evaluation. The embodiments of the present invention therefore divide the key video frames extracted from the stereoscopic video into image blocks when implementing the technical solution, so as to improve the training of the two-stream deep convolutional neural network.
It should further be noted that the video-frame processing used when training the two-stream deep convolutional neural network is identical to the video-frame processing used when the network actually performs video quality evaluation; only then can the trained deep convolutional neural network exercise the processing capability it acquired through training.
S302, computing a quality evaluation loss from the video-frame quality score and the standard quality score of the video frame.
Specifically, when the two-stream deep convolutional neural network is trained with the video frames of a pre-prepared stereoscopic video, the standard quality scores of those video frames are known. After the network evaluates the quality of an input video frame and produces a video-frame quality score, that score is compared with the standard quality score of the frame to compute a quality evaluation loss. The quality evaluation loss is the difference between the quality score evaluated by the two-stream deep convolutional neural network and the standard quality score, and expresses the error of the network's quality evaluation of the input video frame.
Further, the least mean-square error between the quality scores of the image-block pairs or video frames of the stereoscopic video and the standard quality scores can be computed as the quality evaluation loss of the stereoscopic video:

L = (1/p) · Σ_{i=1}^{p} (q_i − y_i)²

where q_i is the output of the two-stream deep neural network, i.e. the quality score of an image-block pair or video frame; y_i is the corresponding standard quality score of the stereoscopic video; and i = 1, 2, ..., p, there being a total of p training image-block-pair samples or video-frame samples.
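Under that definition, the loss is a plain mean-squared error over the p samples. A minimal sketch (the function name is assumed):

```python
def quality_evaluation_loss(q, y):
    """Mean-squared error between network scores q_i and standard quality
    scores y_i over p image-block-pair or video-frame samples."""
    assert len(q) == len(y) and len(q) > 0
    return sum((qi - yi) ** 2 for qi, yi in zip(q, y)) / len(q)
```

Training stops once this value falls below the set threshold; otherwise it drives the backward parameter update of step S303.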
S303: if the quality evaluation loss is not below the set threshold, using the quality evaluation loss to perform a backward parameter update on the preset two-stream deep convolutional neural network.
Specifically, if the quality evaluation loss computed in step S302 is not below the set threshold, the error of the two-stream deep convolutional neural network is too large. The obtained quality evaluation loss is then used to update the network's parameters in the backward direction, adjusting the parameters of the two-stream deep convolutional neural network so that the computed error becomes smaller. Steps S301 to S303 are then repeated until the computed quality evaluation loss is below the set threshold. At that point, the quality evaluation results of the two-stream deep convolutional neural network can be considered very close to the standard quality scores, i.e. the network's quality evaluation is sufficiently accurate, and the network has the capability to evaluate video-frame quality accurately.
The two-stream deep convolutional neural network trained by the above process can be used to evaluate the video quality of any stereoscopic video, i.e. video quality evaluation can be performed on any stereoscopic video according to the technical solution of the embodiments of the present invention.
As shown in Fig. 4, the method of evaluating video quality proposed by the embodiments of the present invention specifically includes:
S401, extracting a set number of temporal key frames from the first-view video sequence of a stereoscopic video to obtain a set number of first-view video frames, and extracting from the second-view video sequence of the stereoscopic video the second-view video frames corresponding to the set number of first-view video frames, to obtain a set number of second-view video frames.
Specifically, a stereoscopic video is composed of left-view and right-view planar video sequences. Any stereoscopic video therefore contains video sequences for two views; in the embodiments of the present invention these are distinguished as the first-view video sequence and the second-view video sequence.
Evaluating the video quality of a stereoscopic video in fact means evaluating the video quality of each of its video frames. In an actual evaluation, the video quality of the entire stereoscopic video can be determined from the video quality of each frame of each view's video sequence.
In the embodiments of the present invention, a certain number of key video frames are extracted from each view's video sequence; the video quality of the extracted key frames is evaluated to represent the video quality of that view's entire sequence, and the video qualities of the extracted key frames are then combined to determine the video quality of the entire stereoscopic video.
When extracting key video frames from a video sequence, the extracted key frames should, as far as possible, represent the video sequence as a whole, so that their quality evaluation can stand for the quality evaluation of the entire sequence. In theory, video frames distributed uniformly in the time domain represent the overall condition of the entire video sequence. Furthermore, since a stereoscopic video contains video sequences for two views, key video frames are extracted from the video sequences of both views, and it must be ensured that the key frames extracted from the two views' sequences correspond to each other.
When extracting key video frames from the stereoscopic video, the embodiment of the present invention extracts a set number of temporal key frames from the first-view video sequence, obtaining the set number of first-view video frames. Specifically, as shown in Figure 5, the first frame and the last frame of the first-view video sequence are extracted first; then the middle frame between them; then the middle frames between the first frame and the middle frame and between the middle frame and the last frame; and so on, until the set number of first-view video frames, uniformly distributed in the temporal domain, has been obtained. While the first-view key frames are being extracted, the same extraction method is applied to the second-view video sequence to extract the same number of second-view video frames, so that the extracted second-view video frames correspond to the extracted first-view video frames.
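The midpoint-subdivision extraction described above can be sketched as follows. This is a minimal illustration, not part of the patent text; the function name and the tie-breaking when the requested count is reached mid-pass are choices of this sketch:

```python
def extract_keyframe_indices(num_frames, count):
    """Select temporally spread key-frame indices by repeated midpoint
    subdivision: first frame, last frame, then the midpoint of every
    adjacent pair, until at least `count` indices have been collected."""
    indices = [0, num_frames - 1]
    while len(indices) < count:
        refined = []
        for a, b in zip(indices, indices[1:]):
            refined.append(a)
            mid = (a + b) // 2
            if mid not in (a, b):       # interval still divisible
                refined.append(mid)
        refined.append(indices[-1])
        if refined == indices:          # cannot subdivide further
            break
        indices = refined
    return sorted(indices)[:count]
```

Applying the same index list to both view sequences guarantees that the extracted first-view and second-view frames correspond to each other.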
S402: According to a preset image division method, perform image-block division on the extracted set number of first-view video frames and set number of second-view video frames;
Specifically, after key video frames have been extracted from the video sequences of the two views of the stereoscopic video, the embodiment of the present invention further divides each extracted key video frame into image blocks. For example, the first key video frame extracted from the first-view video sequence is divided into a 64 × 64 arrangement of image blocks, and correspondingly the first key video frame extracted from the second-view video sequence is divided into a 64 × 64 arrangement of image blocks; each image block of the first-view key frame then forms an image-block pair with the co-located image block of the corresponding second-view key frame. Dividing every extracted first-view video frame and second-view video frame in this manner yields the image-block pairs of each first-view video frame and its corresponding second-view video frame.
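The block-division step can be sketched as follows, assuming (as in the example above) a grid of co-located blocks; `to_patch_pairs`, the grid size, and the toy frame dimensions are illustrative only:

```python
import numpy as np

def to_patch_pairs(left_frame, right_frame, grid=64):
    """Split corresponding left/right view frames into a grid x grid
    arrangement of co-located image blocks and pair them up."""
    h, w = left_frame.shape[:2]
    ph, pw = h // grid, w // grid          # block height / width
    pairs = []
    for i in range(grid):
        for j in range(grid):
            ys, xs = i * ph, j * pw
            a = left_frame[ys:ys + ph, xs:xs + pw]
            b = right_frame[ys:ys + ph, xs:xs + pw]
            pairs.append((a, b))           # one image-block pair
    return pairs
```

Because both frames are sliced with the same coordinates, each pair consists of blocks at the same spatial position in the two views, as the division step requires.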
S403: Using the trained preset two-path deep convolutional neural network, evaluate the quality of every image-block pair of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for every such pair;
Specifically, after step S402 has divided the first-view video frames and the corresponding second-view video frames into image blocks and formed the image-block pairs between each pair of corresponding frames, each image-block pair is input into the two-path deep convolutional neural network.
For example, suppose a first-view video frame is divided into a 64 × 64 arrangement of image blocks a_{i,j}, with i = 1, 2, …, 64 and j = 1, 2, …, 64. Dividing the second-view video frame in the same way yields image blocks b_{i,j}, i = 1, 2, …, 64; j = 1, 2, …, 64. Each corresponding a_{i,j} and b_{i,j} then form an image-block pair: a_{1,1} with b_{1,1} constitute one pair, a_{1,2} with b_{1,2} constitute one pair, …, a_{64,64} with b_{64,64} constitute one pair. When an image-block pair is input into the two-path deep convolutional neural network, the block from the first-view video frame (e.g., a_{1,1}) is fed into one path while the corresponding block from the second-view video frame (e.g., b_{1,1}) is fed into the other path; the two paths evaluate their input blocks simultaneously, and the network outputs the quality score of the input image-block pair (a_{1,1} with b_{1,1}).
Following this procedure, every image-block pair of each first-view video frame and its corresponding second-view video frame is input into the trained two-path deep convolutional neural network for quality evaluation, yielding a quality score for every image-block pair of every frame pair.
S404: Perform spatial averaging on the quality scores of the image-block pairs of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for each first-view video frame and its corresponding second-view video frame;
Specifically, the embodiment of the present invention spatially averages the quality scores of all the image-block pairs contained in each pair of corresponding first-view and second-view video frames, which yields the quality score of each such pair of corresponding frames.

The spatial averaging over the quality scores of all image-block pairs contained in a pair of corresponding first-view and second-view video frames can be carried out according to the following formula:

Q_j = (1/p) · Σ_{i=1}^{p} q_i

where Q_j is the quality score of the j-th video frame, j = 1, 2, …, m indexes the temporal position of the video frame, q_i is the output of the deep convolutional neural network for the i-th image-block pair, and p is the number of image blocks contained in the video frame.
S405: Perform temporal averaging on the quality scores of the first-view video frames and their corresponding second-view video frames, obtaining the quality evaluation result of the stereoscopic video.
Specifically, after step S404 has produced the quality score of each key video frame of each view of the stereoscopic video, temporal averaging over these key-frame quality scores gives the quality score of the video sequence of each view, and thereby the quality evaluation result of the stereoscopic video.

The temporal averaging over the quality scores of the key video frames of a view can be carried out according to the following formula:

Q = (1/m) · Σ_{j=1}^{m} Q_j

where Q_j is the quality score of the j-th key video frame of the video sequence, Q is the quality score of the video sequence, and m is the number of key video frames extracted from the video sequence.
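The two pooling steps — spatial averaging of per-pair scores into a frame score Q_j, then temporal averaging of frame scores into the sequence score Q — can be sketched as follows. The function name is illustrative, and the patch-pair scores q_i are assumed to come from the two-path network:

```python
import numpy as np

def pool_scores(patch_scores):
    """patch_scores: m x p array, one row of p patch-pair scores per key
    frame.  Spatial averaging gives the per-frame score Q_j; temporal
    averaging over the m key frames gives the sequence score Q."""
    patch_scores = np.asarray(patch_scores, dtype=float)
    frame_scores = patch_scores.mean(axis=1)   # Q_j = (1/p) * sum_i q_i
    video_score = float(frame_scores.mean())   # Q   = (1/m) * sum_j Q_j
    return frame_scores, video_score
```

Running this on the left-view and right-view score arrays separately, and combining the two sequence scores, matches the per-view evaluation described above.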
As can be seen from the above, the video quality evaluation method proposed by the embodiment of the present invention performs frame extraction and image-block division on stereoscopic video sequences to obtain a large amount of training sample data, trains the two-path deep convolutional neural network on these samples so that it acquires the ability to evaluate video quality, and then uses the trained network to evaluate the quality of the left- and right-view videos of a stereoscopic video, thereby achieving an effective evaluation of stereoscopic video quality. This evaluation procedure requires no manual extraction of video features and can be carried out fully automatically, improving both the accuracy and the degree of automation of video quality evaluation.
The embodiment of the present invention also discloses a device for evaluating video quality. Referring to Figure 6, the device includes:

a video frame extraction unit 100, configured to extract from a stereoscopic video a set number of first-view video frames and the set number of second-view video frames corresponding to the set number of first-view video frames;

a video quality evaluation unit 110, configured to evaluate the quality of the set number of first-view video frames and the set number of second-view video frames using a trained preset two-path deep convolutional neural network, obtaining the quality evaluation result of the stereoscopic video.
Optionally, in another embodiment of the present invention, referring to Figure 7, the device further includes:

a video frame division unit 120, configured to perform image-block division on the set number of first-view video frames and the set number of second-view video frames respectively, according to a preset image division method.
Optionally, in another embodiment of the present invention, when the video quality evaluation unit 110 uses the trained preset two-path deep convolutional neural network to evaluate the quality of the set number of first-view video frames and the set number of second-view video frames and obtain the quality evaluation result of the stereoscopic video, it is specifically configured to:

use the trained preset two-path deep convolutional neural network to evaluate the quality of every image-block pair of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for every such pair;

perform spatial averaging on the quality scores of the image-block pairs of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for each first-view video frame and its corresponding second-view video frame;

perform temporal averaging on the quality scores of the first-view video frames and their corresponding second-view video frames, obtaining the quality evaluation result of the stereoscopic video.
Optionally, in another embodiment of the present invention, the video quality evaluation unit 110 is further configured to train the preset two-path deep convolutional neural network;

when training the preset two-path deep convolutional neural network, the video quality evaluation unit 110 is specifically configured to:

repeat the following operations until the computed quality-evaluation loss value is less than a set threshold:

input a first-view video frame of a preset stereoscopic video and its corresponding second-view video frame into the preset two-path deep convolutional neural network, obtaining a video-frame quality score of the stereoscopic video;

compute the quality-evaluation loss value from the video-frame quality score and the standard quality score of the video frame;

if the quality-evaluation loss value is not less than the set threshold, perform a backward parameter update of the preset two-path deep convolutional neural network using the quality-evaluation loss value.
Optionally, in another embodiment of the present invention, when the video frame extraction unit 100 extracts from a stereoscopic video the set number of first-view video frames and the set number of second-view video frames, it is specifically configured to:

extract a set number of temporal key frames from the first-view video sequence of the stereoscopic video, obtaining the set number of first-view video frames, and extract from the second-view video sequence of the stereoscopic video the second-view video frames corresponding to the set number of first-view video frames, obtaining the set number of second-view video frames.
Specifically, for the detailed operation of each unit in the above embodiments, refer to the corresponding content of the method embodiments above, which is not repeated here.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method of evaluating video quality, characterized by comprising:

extracting from a stereoscopic video a set number of first-view video frames and the set number of second-view video frames corresponding to the set number of first-view video frames;

evaluating the quality of the set number of first-view video frames and the set number of second-view video frames using a trained preset two-path deep convolutional neural network, obtaining a quality evaluation result of the stereoscopic video.
2. The method according to claim 1, characterized in that, after extracting from the stereoscopic video the set number of first-view video frames and the set number of second-view video frames, the method further comprises:

performing image-block division on the set number of first-view video frames and the set number of second-view video frames respectively, according to a preset image division method.
3. The method according to claim 2, characterized in that evaluating the quality of the set number of first-view video frames and the set number of second-view video frames using the trained preset two-path deep convolutional neural network and obtaining the quality evaluation result of the stereoscopic video comprises:

using the trained preset two-path deep convolutional neural network to evaluate the quality of every image-block pair of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for every such pair;

performing spatial averaging on the quality scores of the image-block pairs of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for each first-view video frame and its corresponding second-view video frame;

performing temporal averaging on the quality scores of the first-view video frames and their corresponding second-view video frames, obtaining the quality evaluation result of the stereoscopic video.
4. The method according to any one of claims 1 to 3, characterized in that training the preset two-path deep convolutional neural network comprises:

repeating the following operations until the computed quality-evaluation loss value is less than a set threshold:

inputting a first-view video frame of a preset stereoscopic video and its corresponding second-view video frame into the preset two-path deep convolutional neural network, obtaining a video-frame quality score of the stereoscopic video;

computing the quality-evaluation loss value from the video-frame quality score and the standard quality score of the video frame;

if the quality-evaluation loss value is not less than the set threshold, performing a backward parameter update of the preset two-path deep convolutional neural network using the quality-evaluation loss value.
5. The method according to claim 1, characterized in that extracting from the stereoscopic video the set number of first-view video frames and the set number of second-view video frames comprises:

extracting a set number of temporal key frames from the first-view video sequence of the stereoscopic video, obtaining the set number of first-view video frames, and extracting from the second-view video sequence of the stereoscopic video the second-view video frames corresponding to the set number of first-view video frames, obtaining the set number of second-view video frames.
6. A device for evaluating video quality, characterized by comprising:

a video frame extraction unit, configured to extract from a stereoscopic video a set number of first-view video frames and the set number of second-view video frames corresponding to the set number of first-view video frames;

a video quality evaluation unit, configured to evaluate the quality of the set number of first-view video frames and the set number of second-view video frames using a trained preset two-path deep convolutional neural network, obtaining a quality evaluation result of the stereoscopic video.
7. The device according to claim 6, characterized in that the device further comprises:

a video frame division unit, configured to perform image-block division on the set number of first-view video frames and the set number of second-view video frames respectively, according to a preset image division method.
8. The device according to claim 7, characterized in that, when the video quality evaluation unit uses the trained preset two-path deep convolutional neural network to evaluate the quality of the set number of first-view video frames and the set number of second-view video frames and obtain the quality evaluation result of the stereoscopic video, it is specifically configured to:

use the trained preset two-path deep convolutional neural network to evaluate the quality of every image-block pair of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for every such pair;

perform spatial averaging on the quality scores of the image-block pairs of each first-view video frame and its corresponding second-view video frame, obtaining a quality score for each first-view video frame and its corresponding second-view video frame;

perform temporal averaging on the quality scores of the first-view video frames and their corresponding second-view video frames, obtaining the quality evaluation result of the stereoscopic video.
9. The device according to any one of claims 6 to 8, characterized in that the video quality evaluation unit is further configured to train the preset two-path deep convolutional neural network;

when training the preset two-path deep convolutional neural network, the video quality evaluation unit is specifically configured to:

repeat the following operations until the computed quality-evaluation loss value is less than a set threshold:

input a first-view video frame of a preset stereoscopic video and its corresponding second-view video frame into the preset two-path deep convolutional neural network, obtaining a video-frame quality score of the stereoscopic video;

compute the quality-evaluation loss value from the video-frame quality score and the standard quality score of the video frame;

if the quality-evaluation loss value is not less than the set threshold, perform a backward parameter update of the preset two-path deep convolutional neural network using the quality-evaluation loss value.
10. The device according to claim 6, characterized in that, when the video frame extraction unit extracts from a stereoscopic video the set number of first-view video frames and the set number of second-view video frames, it is specifically configured to:

extract a set number of temporal key frames from the first-view video sequence of the stereoscopic video, obtaining the set number of first-view video frames, and extract from the second-view video sequence of the stereoscopic video the second-view video frames corresponding to the set number of first-view video frames, obtaining the set number of second-view video frames.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810088362.9A CN108337504A (en) | 2018-01-30 | 2018-01-30 | A kind of method and device of evaluation video quality |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108337504A true CN108337504A (en) | 2018-07-27 |
Family
ID=62926123
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810088362.9A Pending CN108337504A (en) | 2018-01-30 | 2018-01-30 | A kind of method and device of evaluation video quality |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108337504A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109831664A (en) * | 2019-01-15 | 2019-05-31 | 天津大学 | Fast Compression three-dimensional video quality evaluation method based on deep learning |
CN110138594A (en) * | 2019-04-11 | 2019-08-16 | 福州瑞芯微电子股份有限公司 | Method for evaluating video quality and server based on deep learning |
CN110278415A (en) * | 2019-07-02 | 2019-09-24 | 浙江大学 | A kind of web camera video quality improvements method |
CN110365966A (en) * | 2019-06-11 | 2019-10-22 | 北京航空航天大学 | A kind of method for evaluating video quality and device based on form |
CN113256620A (en) * | 2021-06-25 | 2021-08-13 | 南京思飞捷软件科技有限公司 | Vehicle body welding quality information judging method based on difference convolution neural network |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105160678A (en) * | 2015-09-02 | 2015-12-16 | 山东大学 | Convolutional-neural-network-based reference-free three-dimensional image quality evaluation method |
CN107633513A (en) * | 2017-09-18 | 2018-01-26 | 天津大学 | The measure of 3D rendering quality based on deep learning |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105160678A (en) * | 2015-09-02 | 2015-12-16 | 山东大学 | Convolutional-neural-network-based reference-free three-dimensional image quality evaluation method |
CN107633513A (en) * | 2017-09-18 | 2018-01-26 | 天津大学 | The measure of 3D rendering quality based on deep learning |
Non-Patent Citations (1)
Title |
---|
瞿晨非 (Qu Chenfei) et al.: "A no-reference stereoscopic image quality assessment algorithm based on convolutional neural networks" (一种基于卷积神经网络的无参考立体图像质量评估算法), 《中国科技论文在线》 (China Sciencepaper Online) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108337504A (en) | A kind of method and device of evaluation video quality | |
CN107578403A (en) | The stereo image quality evaluation method of binocular view fusion is instructed based on gradient information | |
CN107635136B (en) | View-based access control model perception and binocular competition are without reference stereo image quality evaluation method | |
CN102595185A (en) | Stereo image quality objective evaluation method | |
EP2375766A2 (en) | Method and apparatus for measuring an audiovisual parameter | |
CN103780895B (en) | A kind of three-dimensional video quality evaluation method | |
CN103873854B (en) | The defining method of a kind of stereo-picture subjective assessment subject's quantity and experimental data | |
CN103136748B (en) | The objective evaluation method for quality of stereo images of a kind of feature based figure | |
CN105357519B (en) | Quality objective evaluation method for three-dimensional image without reference based on self-similarity characteristic | |
CN103986925A (en) | Method for evaluating vision comfort of three-dimensional video based on brightness compensation | |
CN102708567A (en) | Visual perception-based three-dimensional image quality objective evaluation method | |
CN105976337A (en) | Image defogging method based on filtering guiding via medians | |
CN110139095B (en) | Naked eye 3D display module detection method and system and readable storage medium | |
CN102903107A (en) | Three-dimensional picture quality objective evaluation method based on feature fusion | |
CN102710949B (en) | Visual sensation-based stereo video coding method | |
CN102999912B (en) | A kind of objective evaluation method for quality of stereo images based on distortion map | |
CN104581141A (en) | Three-dimensional picture visual comfort evaluation method | |
CN103780903B (en) | A kind of stereoscopic camera low coverage assembles shooting quality method for objectively evaluating | |
CN103745457A (en) | Stereo image objective quality evaluation method | |
CN102521839B (en) | Method for objectively evaluating image quality in no-reference manner for restoration of degraded images | |
Xing et al. | Estimating quality of experience on stereoscopic images | |
CN103391447B (en) | Safety depth guarantee and adjustment method in three-dimensional (3D) program shot switching | |
CN104243977B (en) | Based on the theoretical stereo image quality evaluation methodology with parallax compensation of ocular dominance | |
CN102271279B (en) | Objective analysis method for just noticeable change step length of stereo images | |
CN105430397B (en) | A kind of 3D rendering Quality of experience Forecasting Methodology and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180727 |