CN107018400B - A method for converting 2D video into 3D video - Google Patents
A method for converting 2D video into 3D video
- Publication number
- CN107018400B CN107018400B CN201710227433.4A CN201710227433A CN107018400B CN 107018400 B CN107018400 B CN 107018400B CN 201710227433 A CN201710227433 A CN 201710227433A CN 107018400 B CN107018400 B CN 107018400B
- Authority
- CN
- China
- Prior art keywords
- convolution
- feature matrix
- layer
- matrix
- obtains
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/139—Format conversion, e.g. of frame-rate or size
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Processing (AREA)
- Complex Calculations (AREA)
Abstract
A method for converting 2D video into 3D video, belonging to the fields of pattern recognition and computer vision, whose purpose is to eliminate the unpredictable errors introduced by prior-art scene depth estimation and view synthesis while greatly improving computation speed. The invention comprises a training stage and a use stage. The training stage consists, in order, of data input, feature extraction, feature fusion, view synthesis and parameter update steps; the use stage consists, in order, of data input, feature extraction, feature fusion and view synthesis steps. The training stage trains on the order of 10^6 left-right-format 3D stereoscopic movie clips, with scene depth estimation and view synthesis optimized and solved jointly to determine the parameters, guaranteeing pixel-level accuracy of the predicted output right view and reducing the error introduced when 2D-to-3D conversion is split into two separate task operations. Once training is complete, 2D video can be converted into 3D video directly, greatly improving conversion efficiency while guaranteeing the accuracy of the final output 3D stereoscopic video.
Description
Technical field
The invention belongs to the fields of pattern recognition and computer vision, and specifically relates to a method for converting 2D video into 3D video, which directly converts planar 2D video shot by an ordinary camera into left-right-format 3D stereoscopic video that can be watched in a cinema.
Background art
With the development of virtual-reality technology, immersing viewers in a 3D experience has become an increasingly important direction in the multimedia entertainment field. A 3D experience needs the support of panoramic video and 3D effects, and an ordinary planar video must be converted from 2D into 3D video if it is to deliver the 3D viewing experience of a cinema. 3D video comes in many formats. For left-right-format 3D stereoscopic video, 2D-to-3D conversion usually first estimates the front-to-back layering of objects in the 2D video frames from the various geometric relationships and scene semantics in the images, then uses this spatial layout to perform a geometric mapping that synthesizes the image of another viewpoint; the images of the left and right viewpoints are then merged to generate the final 3D stereoscopic video.
Traditional 2D-to-3D video conversion is addressed through two approaches: hardware and software. The hardware approach mainly uses stereoscopic projectors or glasses-free 3D televisions. They first compress the width of the input 2D video frames to half the original, then apply a fixed horizontal offset or mapping without distinguishing content, so that the same object ends up at different positions in the left and right viewpoints; based on the perception principle of the human eye, a 3D effect is hoped for. This is a very coarse method, and the 3D effect is not noticeable in most scenes. The 2D-to-3D function of such products on the market is limited by product quality, the true audience is small, and most manufacturers focus instead on improving the performance of the hardware itself, i.e., taking already well-converted 3D stereoscopic video as input and relying on the hardware to obtain a comparatively good stereoscopic display. The software approach obtains the 2D-to-3D function mainly through algorithms: in general, it estimates the depth layering of objects in the scene, performs semantic analysis, and synthesizes views to obtain an additional video channel. The two different viewpoint images present the spatial structure of the scene in different ways, so viewers can obtain a comparatively good stereoscopic experience.
Solutions such as that of Sichuan Changhong Electric Co. (201110239086.X) optimize the above two steps through hardware or software algorithms in the hope of obtaining better 3D visual effects. On the one hand, these schemes consume considerable computing resources and carry a high time cost; on the other hand, scene depth extraction and view synthesis techniques are still being explored, exhibit large computational errors in actual use, and the errors of the two parts easily accumulate and degrade the final viewing experience. With the arrival of the big-data era, more and more good stereoscopic movies, documentaries and animations are being produced, yet present-day 2D-to-3D post-conversion of planar video still consumes enormous manpower and material resources.
The VGG16 model involved in the present invention was proposed by K. Simonyan and A. Zisserman in "Very deep convolutional networks for large-scale image recognition", arXiv:1409.1556, 2014. The open-source project accompanying that paper includes the parameter file vgg16-0001.params, whose contents include the weights of each layer's convolution kernels in the VGG16 model.
Summary of the invention
The present invention provides a method for converting 2D video into 3D video. Its purpose is to eliminate the unpredictable errors introduced by prior-art scene depth estimation and view synthesis, while greatly improving the computation speed of stereoscopic video conversion, so as to obtain a better stereoscopic viewing experience.
The method for converting 2D video into 3D video provided by the present invention comprises a training stage and a use stage, and is characterized in that:
(1) The training stage comprises, in order, a data input step, a feature extraction step, a feature fusion step, a view synthesis step and a parameter update step;
(1.1) Data input step: obtain, from publicly available video data resources, on the order of 10^6 left-right-view-format 3D stereoscopic movie clips, and select those whose disparity range is -15~+16 as the training data set; the image size in the stereoscopic movie clips is H rows by W columns, H = 200~1920, W = 180~1080;
(1.2) Feature extraction step: split the left-right-view-format stereoscopic video of the training data set into stereo images of left-right view format, keep the right view unchanged, and pass a left-view frame, in order, through mathematical convolution operations and pooling down-sampling operations for feature extraction, obtaining the first-layer to twenty-first-layer convolution feature matrices of that left-view frame as its image features;
(1.3) Feature fusion step: apply matrix deconvolution operations to the third-layer, sixth-layer, tenth-layer, fourteenth-layer and twenty-first-layer convolution feature matrices respectively, and cascade the resulting first group of deconvolution feature matrices D3 through fifth group of deconvolution feature matrices D21 to form the fused feature matrix Dc; the size of Dc is H × W × 32, i.e. the matrix has H rows, W columns and 32 tensor dimensions;
(1.4) View synthesis step:
From the fused feature matrix Dc, using the regression parameter matrix θ, obtain the predicted probability of each pixel in the corresponding left view taking each disparity value, forming the disparity probability matrix Dep of the left view;
From the original left view and the disparity probability matrix Dep, obtain the synthesized right view through view synthesis;
(1.5) Parameter update step:
Compute the error matrix err_R between the synthesized right view and the right view of step (1.2);
For M consecutive frames of synthesized right views, add the error matrices err_R together to form the propagated error matrix errS_R, M ≥ 16; propagate the resulting propagated error matrix through the back-propagation algorithm to each sub-step of the view synthesis step, the feature fusion step and the feature extraction step, updating in reverse order the disparity probability matrix Dep, the regression parameter matrix θ and the weights of every convolution kernel of every layer, completing one update of all parameters;
Return to the feature extraction step (1.2) and continue to apply the above feature extraction, feature fusion, view synthesis and parameter update steps to the remaining left views of the training data set; when all left-right view pairs of the training data set have been used, the first round of updates of the disparity probability matrix Dep, the regression parameter matrix θ and every convolution kernel of every layer is complete;
Following steps (1.1) to (1.5), continue to complete the second round of updates of the disparity probability matrix Dep, the regression parameter matrix θ and every convolution kernel of every layer; repeating in this way, the training stage ends after the 50th to 200th round of updates is complete;
(2) The use stage comprises, in order, a data input step, a feature extraction step, a feature fusion step and a view synthesis step;
(2.1) Data input step: prepare the planar 2D video to be converted;
(2.2) Feature extraction step: split the planar 2D video into images, each treated as the left view of step (1.2), and pass each, in order, through mathematical convolution operations and pooling down-sampling operations for feature extraction, obtaining the first-layer to twenty-first-layer convolution feature matrices of that left-view frame as its image features; the weights of every convolution kernel of every layer are the corresponding weights of every convolution kernel of every layer after the training stage;
(2.3) Feature fusion step: identical to step (1.3);
(2.4) View synthesis step: identical to step (1.4); from the disparity probability matrix Dep obtained as in step (1.4) and the image from (2.2), obtain the synthesized right view through view synthesis;
Apply the above feature extraction step (2.2), feature fusion step (2.3) and view synthesis step (2.4) to every left-view frame in order, splice the original image as the left view side by side with the obtained synthesized right view, and join the frames one after another to obtain the left-right-format 3D stereoscopic video.
The feature extraction step (1.2) comprises the following sub-steps:
(1.2.1) Apply a convolution operation to a left-view frame, obtaining the first-layer convolution feature matrix:
Using a 3 × 3 convolution kernel with stride 1, start from the upper-left corner of the image and move right step by step; upon reaching the right boundary of the image, move to the next row of the image and continue moving from left to right, until the lower-right corner of the image. At each move, each weight of the convolution kernel is multiplied by the pixel value of the image at the corresponding position, and all the products are added together, giving the convolution value of the image region covered by the kernel. The region convolution values of the image, arranged by their original region positions, form the level-1 convolution feature matrix C1_1 of the frame; all negative element values in C1_1 are set to zero;
In total 64 3 × 3 convolution kernels are used, and the above convolution operation is applied to the image, obtaining 64 level-1 convolution feature matrices C1_1, C1_2, C1_3 … C1_64, which constitute the first-layer convolution feature matrix;
(1.2.2) Apply a convolution operation to the first-layer convolution feature matrix, obtaining the second-layer convolution feature matrix:
Using a 3 × 3 convolution kernel with stride 1, start from the upper-left corner of the level-1 convolution feature matrix C1_1 and move right step by step; upon reaching the right boundary of C1_1, move to the next row and continue moving from left to right, until the lower-right corner of C1_1. At each move, each weight of the convolution kernel is multiplied by the matrix element value of C1_1 at the corresponding position, and all the products are added together, giving the convolution value of the kernel region of C1_1. The region convolution values of C1_1, arranged by their original region positions, form a convolution feature matrix; all negative element values in the resulting convolution feature matrix are set to zero;
Then, for the remaining 63 level-1 convolution feature matrices C1_2, C1_3 … C1_64, the operation of the previous paragraph is repeated with this layer's corresponding convolution kernels, obtaining 64 convolution feature matrices in total, which are added directly to form the level-2 convolution feature matrix C2_1;
In total 64 3 × 3 convolution kernels are used; applying this two-stage convolution operation to the 64 level-1 convolution feature matrices C1_1, C1_2, C1_3 … C1_64 included in the first-layer convolution feature matrix yields 64 level-2 convolution feature matrices C2_1, C2_2, C2_3 … C2_64, which constitute the second-layer convolution feature matrix;
(1.2.3) Apply the first pooling down-sampling operation to the second-layer convolution feature matrix, obtaining the third-layer convolution feature matrix:
For the level-2 convolution feature matrix C2_1, use a 2 × 2 sliding window with stride 2: start from the upper-left corner of C2_1 and move right step by step; upon reaching the right boundary of C2_1, move to the next row and continue moving from left to right, until the lower-right corner of C2_1. At each move, take the maximum matrix element of C2_1 within the 2 × 2 sliding-window region as the pooled sample feature value of that region. The pooled sample feature values of all regions of C2_1, arranged by their original region positions, form the level-3 convolution feature matrix C3_1;
The above pooling down-sampling operation is applied in turn to the remaining 63 level-2 convolution feature matrices C2_2, C2_3 … C2_64, obtaining in total 64 level-3 convolution feature matrices C3_1, C3_2, C3_3 … C3_64, which constitute the third-layer convolution feature matrix;
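A minimal sketch of this pooling rule, again for illustration only (it assumes the matrix height and width are even):

```python
import numpy as np

def max_pool_2x2(mat):
    """2x2 sliding window, stride 2: keep the maximum element of each region,
    arranged by the region's original position, as in (1.2.3)."""
    H, W = mat.shape
    out = np.zeros((H // 2, W // 2), dtype=mat.dtype)
    for i in range(H // 2):
        for j in range(W // 2):
            out[i, j] = mat[2 * i:2 * i + 2, 2 * j:2 * j + 2].max()
    return out

# Applied in turn to each of the 64 level-2 matrices C2_1 ... C2_64:
# third_layer = [max_pool_2x2(c2) for c2 in second_layer]
```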
(1.2.4) Apply a convolution operation to the third-layer convolution feature matrix, obtaining the fourth-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1, apply the matrix convolution operation separately to the 64 level-3 convolution feature matrices C3_1, C3_2, C3_3 … C3_64, obtaining 64 convolution feature matrices in total, which are added directly to form the level-4 convolution feature matrix C4_1;
In total 128 3 × 3 convolution kernels are used; applying the convolution operation of the previous paragraph to the 64 level-3 convolution feature matrices C3_1, C3_2, C3_3 … C3_64 included in the third-layer convolution feature matrix yields 128 level-4 convolution feature matrices C4_1, C4_2, C4_3 … C4_128, which constitute the fourth-layer convolution feature matrix;
(1.2.5) Apply a convolution operation to the fourth-layer convolution feature matrix, obtaining the fifth-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1, apply the matrix convolution operation separately to the 128 level-4 convolution feature matrices C4_1, C4_2, C4_3 … C4_128, obtaining 128 convolution feature matrices in total, which are added directly to form the level-5 convolution feature matrix C5_1;
In total 128 3 × 3 convolution kernels are used; applying the convolution operation of the previous paragraph to the 128 level-4 convolution feature matrices C4_1, C4_2, C4_3 … C4_128 included in the fourth-layer convolution feature matrix yields 128 level-5 convolution feature matrices C5_1, C5_2, C5_3 … C5_128, which constitute the fifth-layer convolution feature matrix;
(1.2.6) Apply the second pooling down-sampling operation to the fifth-layer convolution feature matrix, obtaining the sixth-layer convolution feature matrix:
In the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 128 level-5 convolution feature matrices C5_1, C5_2, C5_3 … C5_128, obtaining in total 128 level-6 convolution feature matrices C6_1, C6_2, C6_3 … C6_128, which constitute the sixth-layer convolution feature matrix;
(1.2.7) Apply a convolution operation to the sixth-layer convolution feature matrix, obtaining the seventh-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1, apply the matrix convolution operation separately to the 128 level-6 convolution feature matrices C6_1, C6_2, C6_3 … C6_128, obtaining 128 convolution results, which are added directly to form the level-7 convolution feature matrix C7_1;
In total 256 3 × 3 convolution kernels are used; applying the convolution operation of the previous paragraph to the 128 level-6 convolution feature matrices C6_1, C6_2, C6_3 … C6_128 included in the sixth-layer convolution feature matrix yields 256 level-7 convolution feature matrices C7_1, C7_2, C7_3 … C7_256, which constitute the seventh-layer convolution feature matrix;
(1.2.8) Apply a convolution operation to the seventh-layer convolution feature matrix, obtaining the eighth-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1, apply the matrix convolution operation separately to the 256 level-7 convolution feature matrices C7_1, C7_2, C7_3 … C7_256, obtaining 256 convolution results, which are added directly to form the level-8 convolution feature matrix C8_1;
In total 256 3 × 3 convolution kernels are used; applying the convolution operation of the previous paragraph to the 256 level-7 convolution feature matrices C7_1, C7_2, C7_3 … C7_256 included in the seventh-layer convolution feature matrix yields 256 level-8 convolution feature matrices C8_1, C8_2, C8_3 … C8_256, which constitute the eighth-layer convolution feature matrix;
(1.2.9) Apply a convolution operation to the eighth-layer convolution feature matrix, obtaining the ninth-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1, apply the matrix convolution operation separately to the 256 level-8 convolution feature matrices C8_1, C8_2, C8_3 … C8_256, obtaining 256 convolution results, which are added directly to form the level-9 convolution feature matrix C9_1;
In total 256 3 × 3 convolution kernels are used; applying the convolution operation of the previous paragraph to the 256 level-8 convolution feature matrices C8_1, C8_2, C8_3 … C8_256 included in the eighth-layer convolution feature matrix yields 256 level-9 convolution feature matrices C9_1, C9_2, C9_3 … C9_256, which constitute the ninth-layer convolution feature matrix;
(1.2.10) Apply the third pooling down-sampling operation to the ninth-layer convolution feature matrix, obtaining the tenth-layer convolution feature matrix:
In the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 256 level-9 convolution feature matrices C9_1, C9_2, C9_3 … C9_256, obtaining in total 256 level-10 convolution feature matrices C10_1, C10_2, C10_3 … C10_256, which constitute the tenth-layer convolution feature matrix;
(1.2.11) Apply convolution operations in turn to the tenth-layer convolution feature matrix, obtaining in turn the eleventh-layer, twelfth-layer and thirteenth-layer convolution feature matrices:
Following an operation similar to sub-step (1.2.7), apply the convolution operation to the tenth-layer convolution feature matrix using 512 convolution kernels; from the 256 level-10 convolution feature matrices C10_1, C10_2, C10_3 … C10_256 included in the tenth-layer convolution feature matrix, obtain 512 level-11 convolution feature matrices C11_1, C11_2, C11_3 … C11_512, which constitute the eleventh-layer convolution feature matrix;
Following an operation similar to sub-step (1.2.8), apply the convolution operation to the eleventh-layer convolution feature matrix using 512 convolution kernels; from the 512 level-11 convolution feature matrices C11_1, C11_2, C11_3 … C11_512 included in the eleventh-layer convolution feature matrix, obtain 512 level-12 convolution feature matrices C12_1, C12_2, C12_3 … C12_512, which constitute the twelfth-layer convolution feature matrix;
Following an operation similar to sub-step (1.2.9), apply the convolution operation to the twelfth-layer convolution feature matrix using 512 convolution kernels; from the 512 level-12 convolution feature matrices C12_1, C12_2, C12_3 … C12_512 included in the twelfth-layer convolution feature matrix, obtain 512 level-13 convolution feature matrices C13_1, C13_2, C13_3 … C13_512, which constitute the thirteenth-layer convolution feature matrix;
(1.2.12) Apply the fourth pooling down-sampling operation to the thirteenth-layer convolution feature matrices C13_1, C13_2, C13_3 … C13_512, obtaining the fourteenth-layer convolution feature matrix:
In the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 512 level-13 convolution feature matrices C13_1, C13_2, C13_3 … C13_512, obtaining in total 512 level-14 convolution feature matrices C14_1, C14_2, C14_3 … C14_512, which constitute the fourteenth-layer convolution feature matrix;
(1.2.13) Apply convolution operations in turn to the fourteenth-layer convolution feature matrix, obtaining in turn the fifteenth-layer, sixteenth-layer and seventeenth-layer convolution feature matrices:
Following an operation similar to sub-step (1.2.11), apply the convolution operation to the fourteenth-layer convolution feature matrix using 512 convolution kernels; from the 512 level-14 convolution feature matrices C14_1, C14_2, C14_3 … C14_512 included in the fourteenth-layer convolution feature matrix, obtain 512 level-15 convolution feature matrices C15_1, C15_2, C15_3 … C15_512, which constitute the fifteenth-layer convolution feature matrix;
Likewise, apply the convolution operation to the fifteenth-layer convolution feature matrix using 512 convolution kernels; from the 512 level-15 convolution feature matrices C15_1, C15_2, C15_3 … C15_512 included in the fifteenth-layer convolution feature matrix, obtain 512 level-16 convolution feature matrices C16_1, C16_2, C16_3 … C16_512, which constitute the sixteenth-layer convolution feature matrix;
Likewise, apply the convolution operation to the sixteenth-layer convolution feature matrix using 512 convolution kernels; from the 512 level-16 convolution feature matrices C16_1, C16_2, C16_3 … C16_512 included in the sixteenth-layer convolution feature matrix, obtain 512 level-17 convolution feature matrices C17_1, C17_2, C17_3 … C17_512, which constitute the seventeenth-layer convolution feature matrix;
(1.2.14) Apply the fifth pooling down-sampling operation to the seventeenth-layer convolution feature matrix, obtaining the eighteenth-layer convolution feature matrix:
In the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 512 level-17 convolution feature matrices C17_1, C17_2, C17_3 … C17_512, obtaining in total 512 level-18 convolution feature matrices C18_1, C18_2, C18_3 … C18_512, which constitute the eighteenth-layer convolution feature matrix;
(1.2.15) Apply a convolution operation to the eighteenth-layer convolution feature matrix, obtaining the nineteenth-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1, apply the matrix convolution operation separately to the 512 level-18 convolution feature matrices C18_1, C18_2, C18_3 … C18_512, obtaining 512 convolution results, which are added directly to form the level-19 convolution feature matrix C19_1;
In total 4096 3 × 3 convolution kernels are used; applying the convolution operation of the previous paragraph to the 512 level-18 convolution feature matrices C18_1, C18_2, C18_3 … C18_512 included in the eighteenth-layer convolution feature matrix yields 4096 level-19 convolution feature matrices C19_1, C19_2, C19_3 … C19_4096, which constitute the nineteenth-layer convolution feature matrix;
(1.2.16) Apply a convolution operation to the nineteenth-layer convolution feature matrix, obtaining the twentieth-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 1 × 1 convolution kernel with stride 1, apply the matrix convolution operation separately to the 4096 level-19 convolution feature matrices C19_1, C19_2, C19_3 … C19_4096, obtaining 4096 convolution results, which are added directly to form the level-20 convolution feature matrix C20_1;
In total 4096 1 × 1 convolution kernels are used; applying the convolution operation of the previous paragraph to the 4096 level-19 convolution feature matrices C19_1, C19_2, C19_3 … C19_4096 included in the nineteenth-layer convolution feature matrix yields 4096 level-20 convolution feature matrices C20_1, C20_2, C20_3 … C20_4096, which constitute the twentieth-layer convolution feature matrix;
(1.2.17) Apply a convolution operation to the twentieth-layer convolution feature matrix, obtaining the twenty-first-layer convolution feature matrix:
In the same manner as sub-step (1.2.2), using a 1 × 1 convolution kernel with stride 1, apply the matrix convolution operation separately to the 4096 level-20 convolution feature matrices C20_1, C20_2, C20_3 … C20_4096, obtaining 4096 convolution results, which are added directly to form the level-21 convolution feature matrix C21_1;
In total 32 1 × 1 convolution kernels are used; applying the convolution operation of the previous paragraph to the 4096 level-20 convolution feature matrices C20_1, C20_2, C20_3 … C20_4096 included in the twentieth-layer convolution feature matrix yields 32 level-21 convolution feature matrices C21_1, C21_2, C21_3 … C21_32, which constitute the twenty-first-layer convolution feature matrix;
The weights of every convolution kernel involved in sub-steps (1.2.1) to (1.2.17) are initialized with the values in the VGG16 parameter file vgg16-0001.params, and thereafter are the corresponding weights of every convolution kernel of every layer after the parameter update step (1.5);
For every left-view frame, sub-steps (1.2.1) to (1.2.17) yield the twenty-one layers of convolution feature matrices extracted at different scales.
The feature fusion step (1.3) comprises the following sub-steps:
(1.3.1) Apply a deconvolution operation to the third-layer convolution feature matrix obtained by the first pooling down-sampling operation, obtaining the first group of deconvolution feature matrices D3:
Using a 1 × 1 convolution kernel, apply the matrix deconvolution operation to each of the 64 level-3 convolution feature matrices C3_1, C3_2, C3_3 … C3_64, obtaining 64 convolution outputs, which are added directly to form the level-1 deconvolution feature matrix D3_1;
In total 32 1 × 1 convolution kernels are used; applying the operation of the previous paragraph to the 64 level-3 convolution feature matrices C3_1, C3_2, C3_3 … C3_64 included in the third-layer convolution feature matrix yields 32 level-1 deconvolution feature matrices D3_1, D3_2, D3_3 … D3_32, which constitute the first group of deconvolution feature matrices D3;
(1.3.2) Apply a deconvolution operation to the sixth-layer convolution feature matrix obtained by the second pooling down-sampling operation, obtaining the second group of deconvolution feature matrices D6:
Using a 2 × 2 convolution kernel, apply the matrix deconvolution operation to each of the 128 level-6 convolution feature matrices C6_1, C6_2, C6_3 … C6_128, obtaining 128 convolution outputs, which are added directly to form the level-2 deconvolution feature matrix D6_1;
In total 32 2 × 2 convolution kernels are used; applying the operation of the previous paragraph to the 128 level-6 convolution feature matrices C6_1, C6_2, C6_3 … C6_128 included in the sixth-layer convolution feature matrix yields 32 level-2 deconvolution feature matrices D6_1, D6_2, D6_3 … D6_32, which constitute the second group of deconvolution feature matrices D6;
(1.3.3) Apply deconvolution operations to the tenth-layer convolution feature matrix obtained by the third pooling down-sampling operation and to the fourteenth-layer convolution feature matrix obtained by the fourth pooling down-sampling operation, obtaining respectively the third group of deconvolution feature matrices D10 and the fourth group of deconvolution feature matrices D14:
Following the matrix deconvolution operation of sub-step (1.3.2), using in total 32 4 × 4 convolution kernels, apply the operation to the 256 level-10 convolution feature matrices C10_1, C10_2, C10_3 … C10_256 included in the tenth-layer convolution feature matrix, obtaining 32 level-3 deconvolution feature matrices D10_1, D10_2, D10_3 … D10_32, which constitute the third group of deconvolution feature matrices D10;
In total 32 8 × 8 convolution kernels are used; applying the operation to the 512 level-14 convolution feature matrices C14_1, C14_2, C14_3 … C14_512 included in the fourteenth-layer convolution feature matrix yields 32 level-4 deconvolution feature matrices D14_1, D14_2, D14_3 … D14_32, which constitute the fourth group of deconvolution feature matrices D14;
(1.3.4) Apply the matrix deconvolution operation to the twenty-first-layer convolution feature matrix, obtaining the fifth group of deconvolution feature matrices:
Following the matrix deconvolution operation of sub-step (1.3.2), using in total 32 16 × 16 convolution kernels, apply the operation to the 32 level-21 convolution feature matrices C21_1, C21_2, C21_3 … C21_32 included in the twenty-first-layer convolution feature matrix, obtaining 32 level-5 deconvolution feature matrices D21_1, D21_2, D21_3 … D21_32, which constitute the fifth group of deconvolution feature matrices D21;
(1.3.5) Cascade the first group of deconvolution feature matrices D3 through the fifth group of deconvolution feature matrices D21, forming the fused feature matrix Dc, with the specific cascading given by formula (1);
The weights of every convolution kernel involved in sub-steps (1.3.1) to (1.3.4) are initialized with the values in the VGG16 parameter file vgg16-0001.params, and thereafter are the corresponding weights of every convolution kernel of every layer after the parameter update step (1.5).
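The body of formula (1) is a figure in the original document and is not reproduced in this text. Since the five groups D3 through D21 and the fused matrix Dc all share the size H × W × 32, the sketch below assumes the cascade combines the groups element-wise by summation; this is only our assumption about formula (1), not a statement of it:

```python
import numpy as np

H, W = 640, 480
# Five groups of deconvolution feature matrices, each of size H x W x 32:
D3, D6, D10, D14, D21 = (np.random.rand(H, W, 32).astype(np.float32)
                         for _ in range(5))

Dc = D3 + D6 + D10 + D14 + D21   # assumed element-wise cascade, see above
assert Dc.shape == (H, W, 32)    # matches the stated size of Dc
```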
The view synthesis step (1.4) comprises the following sub-steps:
(1.4.1) Depth map: from the fused feature matrix Dc, compute via formula (2) the predicted probability of each pixel in the corresponding left view taking each disparity value, forming the disparity probability matrix Dep of the left view:

Dep_k = p(y = k | x; θ) = e^{θ_k Dc_k} / Σ_{p=1}^{32} e^{θ_p Dc_p}    (2)

where each element Dep_k is itself a matrix, k = 1, 2, …, 32, representing the probability value p(y = k | x; θ), i.e. the probability of each pixel of the left view taking disparity k − 16 when the regression parameter matrix is θ; the regression parameter matrix θ consists of the regression parameters [θ_1, θ_2, …, θ_32], and e^{θ_p Dc_p} is the logistic-regression value of the p-th tensor dimension Dc_p of Dc under regression parameter θ_p, p = 1, 2 … 32. Each regression parameter of θ is initialized with the values in the VGG16 parameter file vgg16-0001.params, and thereafter is the corresponding regression parameter of θ after the parameter update step (1.5);
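Read per pixel, formula (2) is a 32-way softmax over the tensor dimensions of Dc. A sketch under that reading; treating each θ_p as a per-channel scalar is our simplification of the regression parameter matrix:

```python
import numpy as np

def disparity_probabilities(Dc, theta):
    """Per-pixel softmax of formula (2):
    Dep_k = exp(theta_k * Dc_k) / sum_p exp(theta_p * Dc_p)."""
    logits = Dc * theta.reshape(1, 1, -1)        # theta_p * Dc_p per channel
    logits -= logits.max(axis=2, keepdims=True)  # for numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=2, keepdims=True)      # Dep, shape H x W x 32

Dc = np.random.rand(640, 480, 32).astype(np.float32)   # stand-in fused features
theta = np.random.randn(32).astype(np.float32)         # stand-in parameters
Dep = disparity_probabilities(Dc, theta)               # sums to 1 per pixel
```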
(1.4.2) Form the synthesized right view R, in which the pixel value R_{i,j} of the i-th row and j-th column is given by formula (3):

R_{i,j} = Σ_{d=-15}^{+16} Dep_k(i,j) · L^d_{i,j},  k = d + 16    (3)

where L^d is the translated view obtained by shifting the original left view by disparity d, L^d_{i,j} is its pixel value at row i, column j, with L^d_{i,j} = L_{i,j-d}, the pixel value at position (i, j−d) of the original left view, and L^d_{i,j} = 0 if j − d < 0; d is the disparity, −15 ≤ d ≤ +16; Dep_k(i,j), the element of matrix Dep_k at position (i, j), is the probability that the pixel at position (i, j) of the left view takes disparity d, k = d + 16, i = 1~H, j = 1~W.
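A sketch of formula (3) as stated above: the synthesized right view is the probability-weighted sum of the left view shifted by each candidate disparity, with out-of-range pixels contributing zero. The array names are our own, and the shift indexing L[i, j−d] is taken literally from the text:

```python
import numpy as np

def synthesize_right(left, Dep):
    """Formula (3): R[i,j] = sum_d Dep_k[i,j] * L[i, j-d], k = d + 16.
    The patent's 1-based channels k = 1..32 map to array indices d + 15 here;
    pixels whose j - d falls outside the image contribute zero."""
    H, W = left.shape
    R = np.zeros((H, W), dtype=np.float32)
    for d in range(-15, 17):                  # -15 <= d <= +16
        shifted = np.zeros_like(left)
        if d >= 0:
            shifted[:, d:] = left[:, :W - d]  # shifted[i, j] = left[i, j - d]
        else:
            shifted[:, :d] = left[:, -d:]
        R += Dep[:, :, d + 15] * shifted
    return R
```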
The parameter update step (1.5) comprises the following sub-steps:
(1.5.1) Subtract the right view of step (1.2) from the synthesized right view R, obtaining the error matrix err_R; for M consecutive frames of synthesized right views, add the error matrices err_R together to form the propagated error matrix errS_R, M ≥ 16;
(1.5.2) Propagate the propagated error matrix errS_R backward to the view synthesis step (1.4); first update the disparity probability matrix Dep to the new disparity probability matrix Dep^{l+1}, whose elements Dep_k^{l+1} are given by formula (4), where Dep_k^{l+1} denotes the (l+1)-th updated value when the disparity is d (d = k − 16), Dep_k^l denotes its previous updated value, and the learning rate η is initialized to 0.00003~0.00008;
(1.5.3) Next, update the regression parameter matrix θ to θ^{l+1}; each regression parameter θ_p^{l+1} of θ^{l+1} is updated according to formula (5), where θ_p^{l+1} denotes the (l+1)-th updated value of regression parameter θ_p and θ_p^l its previous updated value; Dep_p is the Dep_k of the view synthesis step (1.4) after the parameter update, with p = k; L_p is the L^d of the view synthesis step, with d = p − 16; Dc_p is the Dc_t of step (1.3.5), with t = p; p ranges from 1 to 32; λ is the update rate of θ, 0.00001 ≤ λ ≤ 0.01;
(1.5.4) Convert the propagated error matrix errS_R into the feature error matrix errD_R, as given by formula (6); then feed the feature error matrix errD_R into the feature extraction step (1.2) and the feature fusion step (1.3), and update the weights of the convolution kernels involved using the iterative scheme of the caffe deep-learning framework, updating in sequence 12608 convolution kernels from the twenty-first layer down to the first layer, 113472 convolution-kernel weights in total;
Through sub-steps (1.5.2)~(1.5.4), one update of all parameters is complete.
The feature extraction and feature fusion steps follow the layer structure of the VGG16 model used for parameter initialization. In actual operation, the weights of each convolution kernel are updated with the iterative scheme of the caffe deep-learning framework (Jia Y, Shelhamer E, Donahue J, et al. Caffe: Convolutional architecture for fast feature embedding. In: Proc. ACM International Conference on Multimedia, 2014: 675-678), the gradient-descent error-propagation scheme in general use in academia and industry. Its core is described as follows: the current value of each convolution-kernel weight plus its corresponding residual value gives the updated weight. A layer's input convolution feature matrix passes through that layer's convolution kernels to produce, after the convolution operation, the output convolution feature matrix, which serves as the next layer's input convolution feature matrix; the weights of the layer's input convolution feature matrix and of the next layer's input convolution feature matrix are therefore considered connected, at corresponding positions, through the layer's convolution kernels.
The residual values of the weights of a given layer's convolution kernel form a matrix, solved as follows: multiply the residual value of each weight on every later-layer convolution kernel that has a weight connection with this kernel by the weight at its corresponding position, sum all the results to obtain a result matrix, and set to zero any element of that matrix that is less than zero; the resulting matrix is the residual-value matrix.
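A schematic sketch of this residual rule (names and shapes are illustrative only; in practice the update runs inside caffe as stated above):

```python
import numpy as np

def kernel_residual(next_residuals, next_weights):
    """Residual-value matrix of one kernel: multiply each connected later-layer
    kernel's weight residuals by the weights at corresponding positions, sum
    all results, and set elements below zero to zero."""
    acc = np.zeros_like(next_residuals[0])
    for res, w in zip(next_residuals, next_weights):
        acc += res * w                       # element-wise, matching positions
    return np.maximum(acc, 0.0)

# "current value of each weight plus its corresponding residual value":
# kernel_updated = kernel + kernel_residual(next_residuals, next_weights)
```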
The present invention comprises a training stage and a use stage. The training stage, through four core procedures (feature extraction, feature fusion, view synthesis and parameter update), trains on the order of 10^6 left-right-format 3D stereoscopic movie clips, with the tasks of scene depth estimation and view synthesis optimized and solved jointly to determine the parameters. This guarantees pixel-level prediction accuracy of the output right view and reduces the error introduced when 2D-to-3D conversion is usually split into two separate task operations. Once training is complete, 2D video can be converted into 3D video directly; no intermediate hand-off from scene depth estimation to view synthesis is needed, because the two consecutive steps take each other as input and are jointly optimized and trained together during the training process. In the use stage this greatly improves the efficiency of 2D-to-3D video conversion while guaranteeing the accuracy of the final output 3D stereoscopic video.
Compared with other existing methods of converting 2D video into 3D video, the outstanding effects of the invention are embodied in the following:
1. Compared with the previous technical approach of first performing scene depth estimation and then performing view synthesis, the present invention is the first to unify scene depth estimation and view synthesis within a single framework. On the one hand, the step design is concise and computation after training is fast; on the other hand, the intermediate quantities are reduced, along with the error brought by insufficient computational precision, especially that of current scene depth estimation, so higher output accuracy can be obtained;
2. The present invention achieves image output at the pixel level: for each pixel of the original image, it predicts the possible pixel mapping and accurately obtains its distribution in the image of the other viewpoint. Training on the order of 10^6 left-right-format 3D stereoscopic movie clips, productions designed precisely to give viewers a strong sense of visual enjoyment, guarantees the accuracy of the final output 3D video and brings a stronger visual impact when the video is watched.
Description of the drawings
Fig. 1 is the flow diagram of the present invention;
Fig. 2 shows input and output images of the present invention: the top row is the input images, the bottom row the corresponding synthesized right views.
Specific embodiment
The present invention is described in more detail below in conjunction with the drawings and an embodiment.
As shown in Fig. 1, the present invention comprises a training stage and a use stage. The training stage comprises, in order, a data input step, a feature extraction step, a feature fusion step, a view synthesis step and a parameter update step; the use stage comprises, in order, a data input step, a feature extraction step, a feature fusion step and a view synthesis step.
The embodiment of the present invention comprises a training stage and a use stage.
(1) The training stage comprises, in order, a data input step, a feature extraction step, a feature fusion step, a view synthesis step and a parameter update step;
(1.1) Data input step: obtain, from publicly available video data resources, on the order of 10^6 left-right-view-format 3D stereoscopic movie clips, and select those whose disparity range is -15 to +16 as the training data set; the image size in the stereoscopic movie clips is H rows by W columns, H = 200~1920, W = 180~1080;
The left-right-view-format 3D stereoscopic movie clips contain different types of stereoscopic film data, including action films, feature films, foreign animation, documentaries, released 3D promotional videos, and so on;
The 3D stereoscopic movie clips with disparity range -15~+16 are picked out of the 3D stereoscopic movie clips by the following process (a sketch of which follows the list):
(a) convert the video data of the 3D stereoscopic movie clips, frame by frame, into stereo images of left-right view format;
(b) compute the disparity map corresponding to the left and right views obtained in (a) using a stereo matching algorithm;
(c) apply mean-filter smoothing to the disparity map obtained in (b);
(d) compute histogram statistics of the smoothed disparity map from (c), obtaining the maximum and minimum disparity values in the histogram statistics;
(e) judge from the maximum and minimum disparity values obtained in (d) whether the stereo frame lies within the -15~+16 disparity range; if so, keep it; otherwise discard the current stereo frame.
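A sketch of steps (c)-(e) of the selection loop; the mean-filter and histogram details are our own stand-ins, since steps (a)-(b) delegate stereo matching to the open-source project cited below:

```python
import numpy as np

def keep_frame(disparity_map, lo=-15, hi=16, bins=64):
    """(c) mean-filter the disparity map, (d) take histogram statistics,
    (e) keep the frame only if the min/max disparities lie within [lo, hi]."""
    d = disparity_map.astype(np.float32)
    pad = np.pad(d, 1, mode='edge')                     # (c) 3x3 box smoothing
    smooth = sum(pad[i:i + d.shape[0], j:j + d.shape[1]]
                 for i in range(3) for j in range(3)) / 9.0
    hist, edges = np.histogram(smooth, bins=bins)       # (d) histogram
    occupied = np.nonzero(hist)[0]
    d_min, d_max = edges[occupied[0]], edges[occupied[-1] + 1]
    return lo <= d_min and d_max <= hi                  # (e) range test
```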
Every initially chosen video segment is processed as above, ensuring no interference from individually post-processed video segments. In this embodiment, about 1.2 million frames of image data are finally obtained through the above process and used as the training data set of the method.
This embodiment uses the binocular stereo matching algorithm of D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", International Journal of Computer Vision, 2002, 47(1-3), pp. 7-42, to stereo-match the left and right channels of the videos in the training data set and obtain the disparity distribution of each binocular stereoscopic video. The algorithm is a general-purpose solution; this embodiment uses its official open-source project.
After the training data set is selected, the formats of the different training video data are unified. Since the image sizes of different videos in the training data set can differ considerably, the image scales must be unified to guarantee that the model parameters are trainable; in this embodiment the images of all videos are scaled to a uniform 640 × 960.
(1.2) Feature extraction step: split the left-right-format stereoscopic video of the training data set into stereo images of left-right view format. Keep the right view unchanged; pass a left-view frame, in order, through mathematical convolution operations and pooling down-sampling operations for feature extraction, obtaining the first-layer to twenty-first-layer convolution feature matrices of that left-view frame as its image features.
The feature extraction step employs a large number of convolution operations and pooling down-sampling operations, repeatedly extracting features at different scales and over different regions. For clarity, the operation forming each layer's convolution feature matrix, together with the kernel size and count or the sliding-window size and stride, is given below (a configuration sketch follows the list):
First layer: convolution operation, kernel size 3 × 3, count 64;
Second layer: convolution operation, kernel size 3 × 3, count 64;
Third layer: first pooling down-sampling operation, sliding window 2 × 2, stride 2;
Fourth layer: convolution operation, kernel size 3 × 3, count 128;
Fifth layer: convolution operation, kernel size 3 × 3, count 128;
Sixth layer: second pooling down-sampling operation, sliding window 2 × 2, stride 2;
Seventh layer: convolution operation, kernel size 3 × 3, count 256;
Eighth layer: convolution operation, kernel size 3 × 3, count 256;
Ninth layer: convolution operation, kernel size 3 × 3, count 256;
Tenth layer: third pooling down-sampling operation, sliding window 2 × 2, stride 2;
Eleventh layer: convolution operation, kernel size 3 × 3, count 512;
Twelfth layer: convolution operation, kernel size 3 × 3, count 512;
Thirteenth layer: convolution operation, kernel size 3 × 3, count 512;
Fourteenth layer: fourth pooling down-sampling operation, sliding window 2 × 2, stride 2;
Fifteenth layer: convolution operation, kernel size 3 × 3, count 512;
Sixteenth layer: convolution operation, kernel size 3 × 3, count 512;
Seventeenth layer: convolution operation, kernel size 3 × 3, count 512;
Eighteenth layer: fifth pooling down-sampling operation, sliding window 2 × 2, stride 2;
Nineteenth layer: convolution operation, kernel size 3 × 3, count 4096;
Twentieth layer: convolution operation, kernel size 1 × 1, count 4096;
Twenty-first layer: convolution operation, kernel size 1 × 1, count 32;
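For reference, the layer table above can be restated as a compact configuration ('C' = convolution with kernel size and kernel count, 'P' = 2 × 2 pooling with stride 2); this adds nothing beyond the list itself:

```python
# Layers 1-21 of the feature extraction step, as listed above.
LAYERS = [
    ('C', 3, 64), ('C', 3, 64), ('P', 2, 2),                      # 1-3
    ('C', 3, 128), ('C', 3, 128), ('P', 2, 2),                    # 4-6
    ('C', 3, 256), ('C', 3, 256), ('C', 3, 256), ('P', 2, 2),     # 7-10
    ('C', 3, 512), ('C', 3, 512), ('C', 3, 512), ('P', 2, 2),     # 11-14
    ('C', 3, 512), ('C', 3, 512), ('C', 3, 512), ('P', 2, 2),     # 15-18
    ('C', 3, 4096), ('C', 1, 4096), ('C', 1, 32),                 # 19-21
]
```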
The features obtained from the input left view after several successive convolution feature extractions have strong local-region and global spatial expressive power. For an input picture of size 640 × 960, the left image and the right image are each of size 640 × 480.
After the convolution operations and pooling down-sampling operations, the third-layer, sixth-layer, tenth-layer, fourteenth-layer and twenty-first-layer convolution feature matrices of the left view have sizes 320 × 240, 160 × 120, 80 × 60, 40 × 30 and 20 × 15 respectively.
(1.3) Feature fusion step: apply matrix deconvolution operations to the third-layer, sixth-layer, tenth-layer, fourteenth-layer and twenty-first-layer convolution feature matrices respectively, and cascade the resulting first group of deconvolution feature matrices D3 through fifth group of deconvolution feature matrices D21 to form the fused feature matrix Dc; in this embodiment the size of Dc is 640 × 480 × 32, i.e. the matrix has 640 rows, 480 columns and 32 tensor dimensions;
The present invention ultimately outputs an image at the pixel level. Therefore, to guarantee that the pixel-level synthesized right view responds well to different scales, different regions, and local as well as global features, the present invention fuses the convolution feature matrices output by different convolution layers. After each pooling down-sampling operation the matrix feature dimension is reduced to half that of the previous level; for example, the dimension of each feature matrix among C3_1, C3_2, C3_3 … C3_64 is twice that of each matrix among C6_1, C6_2, C6_3 … C6_128, i.e. the length and width of matrix C3_1 are each twice those of C6_1. To keep the matrix dimensions consistent during feature fusion, the present invention applies a matrix deconvolution to the matrices output after the different pooling down-sampling operations; the deconvolution operations applied to the convolution feature matrices output at the different pooling levels differ only in the size of the convolution kernel.
Taking the matrix C6_1 to be deconvolved as an example, the concrete deconvolution operation is as follows. First enlarge the length and width of C6_1 each by a factor of N, where N is the size of the deconvolution kernel; as noted above, the deconvolution kernel sizes in the present invention are set to 2 × 2, 4 × 4, 8 × 8, 16 × 16 and 32 × 32 respectively, i.e. N takes 2, 4, 8, 16 or 32 according to the layer. For deconvolving C6_1, N takes 4. After the length and width of the matrix have been enlarged N times, the intermediate values are filled in by nearest-neighbour interpolation, giving the matrix C6_1′. Then, using an N × N convolution kernel with stride N/2, start from the upper-left corner of C6_1′ and move right step by step; upon reaching the right boundary of C6_1′, move to the next row of C6_1′ and continue moving from left to right, until the lower-right corner of C6_1′. At each move, each weight of the convolution kernel is multiplied by the value of C6_1′ at the corresponding position, and all the products are added together, giving the region convolution value of C6_1′. Arranging the region convolution values of C6_1′ by their original region positions completes the matrix deconvolution operation.
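A sketch of this deconvolution as described for C6_1: nearest-neighbour enlargement by N, then an N × N kernel slid at stride N/2. The boundary handling (no padding here) is our guess, as the text leaves it, and hence the exact output size, unspecified:

```python
import numpy as np

def deconv(mat, kernel, N):
    """Enlarge N times with nearest-neighbour interpolation, then convolve
    with an NxN kernel at stride N/2, as in the description above."""
    up = np.kron(mat, np.ones((N, N), dtype=mat.dtype))  # NN interpolation
    s = N // 2
    H, W = up.shape
    out = np.zeros(((H - N) // s + 1, (W - N) // s + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(up[i * s:i * s + N, j * s:j * s + N] * kernel)
    return out

c6 = np.random.rand(160, 120).astype(np.float32)        # a sixth-layer matrix
d6 = deconv(c6, np.random.randn(4, 4).astype(np.float32), N=4)
```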
For the left view of input size 640 × 480 from (1.2), after the deconvolution operations on the third-layer, sixth-layer, tenth-layer, fourteenth-layer and twenty-first-layer convolution feature matrices, the resulting first group of deconvolution feature matrices D3 through fifth group of deconvolution feature matrices D21 all have size 640 × 480 × 32. Cascading them finally forms the fused feature matrix of size 640 × 480 × 32.
(1.4) View synthesis step: from the fused feature matrix Dc, using the regression parameter matrix θ, obtain the predicted probability of each pixel in the corresponding left view taking each disparity value, forming the disparity probability matrix Dep of the left view; from the original left view and the disparity probability matrix Dep, obtain the synthesized right view through view synthesis. The 640 × 480 × 32 fused feature matrix undergoes disparity probability regression with the regression parameter matrix θ, which again yields a 640 × 480 × 32 matrix; each tensor dimension, i.e. each 640 × 480 matrix, can be intuitively understood as giving the probability of each pixel of the left view taking the corresponding disparity value d. From the 640 × 480 × 32 disparity probability matrix Dep and the 640 × 480 original left view, this embodiment obtains through view synthesis a synthesized right view of the same size as the original left view, 640 × 480;
(1.5) Parameter update step: compute the error matrix err_R between the synthesized right view and the right view of step (1.2);
For 640 consecutive frames of synthesized right views, add the error matrices err_R together to form the propagated error matrix errS_R; propagate the resulting propagated error matrix through the back-propagation algorithm to each sub-step of the view synthesis, feature fusion and feature extraction steps, updating in reverse order the disparity probability matrix Dep, the regression parameter matrix θ and the weights of every convolution kernel of every layer, completing one update of all parameters. The error propagation through the feature fusion and feature extraction steps during parameter update can be realized by calling the corresponding function interfaces of the caffe deep-learning framework. In this embodiment the learning rate η is initialized to 0.00005, and after 20000 rounds of operation a learning-rate decay of 30% is applied;
Return to the feature extraction step (1.2), and continue to apply the above feature extraction, feature fusion, view synthesis and parameter update steps to the remaining left views of the training data set. When all left-right view pairs of the training data set have been used, the first round of updates of the disparity probability matrix Dep, the regression parameter matrix θ and every convolution kernel of every layer is complete; in this embodiment the parameters need to be updated 2000 times in each round;
In the above manner, continue to complete the second round of updates of the disparity probability matrix Dep, the regression parameter matrix θ and every convolution kernel of every layer; repeating in this way, the training stage of this embodiment ends after the 100th round of updates is complete;
(2) The use stage comprises, in order, a data input step, a feature extraction step, a feature fusion step and a view synthesis step.
(2.1) Data input step: prepare the planar 2D video to be converted;
(2.2) Feature extraction step: split the planar 2D video into images, each treated as the left view of step (1.2), and pass each, in order, through mathematical convolution operations and pooling down-sampling operations for feature extraction, obtaining the first-layer to twenty-first-layer convolution feature matrices of that left-view frame as its image features; the weights of every convolution kernel of every layer are the corresponding weights of every convolution kernel of every layer after the training stage;
(2.3) Feature fusion step: identical to step (1.3);
(2.4) View synthesis step: identical to step (1.4); from the disparity probability matrix Dep obtained as in step (1.4) and the image from (2.2), obtain the synthesized right view through view synthesis.
Apply the above feature extraction, feature fusion and view synthesis steps to every left-view frame in order, splice the original image as the left view side by side with the obtained synthesized right view, and join the frames one after another to obtain the left-right-format 3D stereoscopic video.
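A sketch of the final assembly: each original frame becomes the left half and its synthesized right view the right half of a left-right-format frame (per-view arrays of equal height assumed):

```python
import numpy as np

def to_side_by_side(left_frames, right_frames):
    """Splice each original frame (left view) with its synthesized right view,
    yielding the frame sequence of the left-right-format 3D video."""
    return [np.concatenate([L, R], axis=1)   # left-right splice per frame
            for L, R in zip(left_frames, right_frames)]
```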
The experimental results of the output part are shown in Fig. 2: the top row, from left to right, shows three consecutive input frames, and the bottom row the corresponding synthesized right views.
Claims (5)
1. A method for converting 2D video into 3D video, comprising a training stage and a use stage, characterized in that:
(1) the training stage comprises, in order, a data input step, a feature extraction step, a feature fusion step, a view synthesis step and a parameter update step;
(1.1) data input step: obtain, from publicly available video data resources, on the order of 10^6 left-right-view-format 3D stereoscopic movie clips, and select those whose disparity range is -15~+16 as the training data set, the image size in the stereoscopic movie clips being H rows by W columns, H = 200~1920, W = 180~1080;
(1.2) feature extraction step: split the left-right-view-format stereoscopic video of the training data set into stereo images of left-right view format, keep the right view unchanged, and pass a left-view frame, in order, through mathematical convolution operations and pooling down-sampling operations for feature extraction, obtaining the first-layer to twenty-first-layer convolution feature matrices of that left-view frame as its image features;
(1.3) feature fusion step: apply matrix deconvolution operations to the third-layer, sixth-layer, tenth-layer, fourteenth-layer and twenty-first-layer convolution feature matrices respectively, and cascade the resulting first group of deconvolution feature matrices D3 through fifth group of deconvolution feature matrices D21 to form the fused feature matrix Dc, the size of Dc being H × W × 32, i.e. the matrix has H rows, W columns and 32 tensor dimensions;
(1.4) View synthesis step:
apply the regression parameter matrix θ to the fused feature matrix Dc to obtain, for every pixel of the corresponding left view, the predicted probability of each possible disparity, forming the disparity probability matrix Dep of the left view;
from the original left view and the disparity probability matrix Dep, obtain the synthesized right-eye view by view synthesis;
(1.5) Parameter update step:
compute the error matrix errR between the synthesized right-eye view and the right view of step (1.2);
for M consecutive synthesized right-eye frames, sum the error matrices errR into the propagated error matrix errS_R, M ≥ 16; propagate this error matrix back through every sub-step of the view synthesis, feature fusion and feature extraction steps by the back-propagation algorithm, updating in turn the disparity probability matrix Dep, the regression parameter matrix θ and the weights of every convolution kernel in every layer, which completes one update of all parameters;
return to the feature extraction step (1.2) and continue the above feature extraction, feature fusion, view synthesis and parameter update steps for the remaining left views of the training set; when all left-right view pairs of the training set have been used, the first round of updates of the disparity probability matrix Dep, the regression parameter matrix θ and every convolution kernel of every layer is complete;
following steps (1.1) to (1.5), complete the second round of updates of the disparity probability matrix Dep, the regression parameter matrix θ and every convolution kernel of every layer; repeat in this way, and after the 50th to 200th round of updates is completed the training stage ends;
(2) the use stage comprises, in order, a data input step, a feature extraction step, a feature fusion step and a view synthesis step;
(2.1) Data input step: prepare the flat 2D video to be converted;
(2.2) Feature extraction step: split the flat 2D video into frames, each treated as the left view of step (1.2), and apply mathematical convolution operations and pooling down-sampling operations in sequence to perform feature extraction, obtaining the first- through twenty-first-layer convolution feature matrices of the frame as its image features; the weights of every convolution kernel in every layer are those produced by the training stage;
(2.3) Feature fusion step: identical to step (1.3);
(2.4) View synthesis step: identical to step (1.4); from the disparity probability matrix Dep obtained as in step (1.4) and the frame of (2.2), the synthesized right-eye view is obtained by view synthesis;
the feature extraction step (2.2), feature fusion step (2.3) and view synthesis step (2.4) above are carried out in turn for every frame; each original frame, used as the left view, is spliced side by side with its synthesized right-eye view, and the spliced frames are then concatenated one after another into a left-right format 3D stereoscopic video.
2. The method of claim 1 for converting 2D video into 3D video, characterized in that the feature extraction step (1.2) comprises the following sub-steps:
(1.2.1) apply a convolution operation to a left-view frame to obtain the first-layer convolution feature matrix:
using a 3 × 3 convolution kernel with stride 1, start from the top-left corner of the image and move right step by step to the right boundary of the image, then move down one row and again sweep from left to right, until the bottom-right corner of the image is reached; at each position, every weight of the kernel is multiplied by the pixel value at the corresponding image position and all products are summed, giving the convolution value of that kernel region of the image; the region convolution values, arranged according to their original positions, form the first-order convolution feature matrix C1_1 of the frame, in which every element smaller than zero is set to zero;
using 64 kernels of size 3 × 3 in total, the convolution operation of the preceding paragraph is applied to the image, yielding 64 first-order convolution feature matrices C1_1, C1_2, C1_3 … C1_64, which form the first-layer convolution feature matrix;
(1.2.2) apply a convolution operation to the first-layer convolution feature matrix to obtain the second-layer convolution feature matrix:
using a 3 × 3 convolution kernel with stride 1, start from the top-left corner of the first-order convolution feature matrix C1_1 and move right step by step to its right boundary, then move down one row and again sweep from left to right, until the bottom-right corner of C1_1 is reached; at each position, every weight of the kernel is multiplied by the matrix element of C1_1 at the corresponding position and all products are summed, giving the convolution value of that kernel region of C1_1; the region convolution values of C1_1, arranged according to their original positions, form a convolution feature matrix in which every element smaller than zero is set to zero;
the operation of the preceding paragraph is then repeated on the remaining 63 first-order convolution feature matrices C1_2, C1_3 … C1_64 with the kernels corresponding to this layer, giving 64 convolution feature matrices in total, which are added directly to form the second-order convolution feature matrix C2_1;
using 64 kernels of size 3 × 3 in total, the convolution operation of the preceding two paragraphs is applied to the 64 first-order convolution feature matrices C1_1, C1_2, C1_3 … C1_64 of the first layer, yielding 64 second-order convolution feature matrices C2_1, C2_2, C2_3 … C2_64, which form the second-layer convolution feature matrix;
(1.2.3) apply the first pooling down-sampling operation to the second-layer convolution feature matrix to obtain the third-layer convolution feature matrix:
for the second-order convolution feature matrix C2_1, using a 2 × 2 sliding window with stride 2, start from the top-left corner of C2_1 and move right step by step to its right boundary, then move down and again sweep from left to right, until the bottom-right corner of C2_1 is reached; at each position, take the maximum of the elements of C2_1 inside the 2 × 2 window as the pooled sampling feature value of that kernel region; the pooled sampling feature values of C2_1, arranged according to their original positions, form the third-order convolution feature matrix C3_1;
the above pooling down-sampling operation is applied in turn to the remaining 63 second-order convolution feature matrices C2_2, C2_3 … C2_64, yielding 64 third-order convolution feature matrices C3_1, C3_2, C3_3 … C3_64 in total, which form the third-layer convolution feature matrix;
(1.2.4) apply a convolution operation to the third-layer convolution feature matrix to obtain the fourth-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1 and its weights, apply the matrix convolution operation to each of the 64 third-order convolution feature matrices C3_1, C3_2, C3_3 … C3_64, giving 64 convolution feature matrices in total, which are added directly to form the fourth-order convolution feature matrix C4_1;
using 128 kernels of size 3 × 3 in total, the convolution operation of the preceding paragraph is applied to the 64 third-order convolution feature matrices C3_1, C3_2, C3_3 … C3_64 of the third layer, yielding 128 fourth-order convolution feature matrices C4_1, C4_2, C4_3 … C4_128, which form the fourth-layer convolution feature matrix;
(1.2.5) apply a convolution operation to the fourth-layer convolution feature matrix to obtain the fifth-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1 and its weights, apply the matrix convolution operation to each of the 128 fourth-order convolution feature matrices C4_1, C4_2, C4_3 … C4_128, giving 128 convolution feature matrices in total, which are added directly to form the fifth-order convolution feature matrix C5_1;
using 128 kernels of size 3 × 3 in total, the convolution operation of the preceding paragraph is applied to the 128 fourth-order convolution feature matrices C4_1, C4_2, C4_3 … C4_128 of the fourth layer, yielding 128 fifth-order convolution feature matrices C5_1, C5_2, C5_3 … C5_128, which form the fifth-layer convolution feature matrix;
(1.2.6) apply the second pooling down-sampling operation to the fifth-layer convolution feature matrix to obtain the sixth-layer convolution feature matrix:
in the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 128 fifth-order convolution feature matrices C5_1, C5_2, C5_3 … C5_128, yielding 128 sixth-order convolution feature matrices C6_1, C6_2, C6_3 … C6_128 in total, which form the sixth-layer convolution feature matrix;
(1.2.7) apply a convolution operation to the sixth-layer convolution feature matrix to obtain the seventh-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1, apply the matrix convolution operation to each of the 128 sixth-order convolution feature matrices C6_1, C6_2, C6_3 … C6_128, giving 128 convolution results, which are added directly to form the seventh-order convolution feature matrix C7_1;
using 256 kernels of size 3 × 3 in total, the convolution operation of the preceding paragraph is applied to the 128 sixth-order convolution feature matrices C6_1, C6_2, C6_3 … C6_128 of the sixth layer, yielding 256 seventh-order convolution feature matrices C7_1, C7_2, C7_3 … C7_256, which form the seventh-layer convolution feature matrix;
(1.2.8) apply a convolution operation to the seventh-layer convolution feature matrix to obtain the eighth-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1 and its weights, apply the matrix convolution operation to each of the 256 seventh-order convolution feature matrices C7_1, C7_2, C7_3 … C7_256, giving 256 convolution results, which are added directly to form the eighth-order convolution feature matrix C8_1;
using 256 kernels of size 3 × 3 in total, the convolution operation of the preceding paragraph is applied to the 256 seventh-order convolution feature matrices C7_1, C7_2, C7_3 … C7_256 of the seventh layer, yielding 256 eighth-order convolution feature matrices C8_1, C8_2, C8_3 … C8_256, which form the eighth-layer convolution feature matrix;
(1.2.9) apply a convolution operation to the eighth-layer convolution feature matrix to obtain the ninth-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1 and its weights, apply the matrix convolution operation to each of the 256 eighth-order convolution feature matrices C8_1, C8_2, C8_3 … C8_256, giving 256 convolution results, which are added directly to form the ninth-order convolution feature matrix C9_1;
using 256 kernels of size 3 × 3 in total, the convolution operation of the preceding paragraph is applied to the 256 eighth-order convolution feature matrices C8_1, C8_2, C8_3 … C8_256 of the eighth layer, yielding 256 ninth-order convolution feature matrices C9_1, C9_2, C9_3 … C9_256, which form the ninth-layer convolution feature matrix;
(1.2.10) apply the third pooling down-sampling operation to the ninth-layer convolution feature matrix to obtain the tenth-layer convolution feature matrix:
in the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 256 ninth-order convolution feature matrices C9_1, C9_2, C9_3 … C9_256, yielding 256 tenth-order convolution feature matrices C10_1, C10_2, C10_3 … C10_256 in total, which form the tenth-layer convolution feature matrix;
(1.2.11) apply convolution operations in turn to the tenth-layer convolution feature matrix to obtain, in sequence, the eleventh-, twelfth- and thirteenth-layer convolution feature matrices:
by an operation similar to sub-step (1.2.7), apply a convolution operation with 512 kernels to the 256 tenth-order convolution feature matrices C10_1, C10_2, C10_3 … C10_256 of the tenth layer, obtaining 512 eleventh-order convolution feature matrices C11_1, C11_2, C11_3 … C11_512, which form the eleventh-layer convolution feature matrix;
by an operation similar to sub-step (1.2.8), apply a convolution operation with 512 kernels to the 512 eleventh-order convolution feature matrices C11_1, C11_2, C11_3 … C11_512 of the eleventh layer, obtaining 512 twelfth-order convolution feature matrices C12_1, C12_2, C12_3 … C12_512, which form the twelfth-layer convolution feature matrix;
by an operation similar to sub-step (1.2.9), apply a convolution operation with 512 kernels to the 512 twelfth-order convolution feature matrices C12_1, C12_2, C12_3 … C12_512 of the twelfth layer, obtaining 512 thirteenth-order convolution feature matrices C13_1, C13_2, C13_3 … C13_512, which form the thirteenth-layer convolution feature matrix;
(1.2.12) apply the fourth pooling down-sampling operation to the thirteenth-order convolution feature matrices C13_1, C13_2, C13_3 … C13_512 to obtain the fourteenth-layer convolution feature matrix:
in the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 512 thirteenth-order convolution feature matrices C13_1, C13_2, C13_3 … C13_512, yielding 512 fourteenth-order convolution feature matrices C14_1, C14_2, C14_3 … C14_512 in total, which form the fourteenth-layer convolution feature matrix;
(1.2.13) apply convolution operations in turn to the fourteenth-layer convolution feature matrix to obtain, in sequence, the fifteenth-, sixteenth- and seventeenth-layer convolution feature matrices:
by an operation similar to sub-step (1.2.11), apply a convolution operation with 512 kernels to the 512 fourteenth-order convolution feature matrices C14_1, C14_2, C14_3 … C14_512 of the fourteenth layer, obtaining 512 fifteenth-order convolution feature matrices C15_1, C15_2, C15_3 … C15_512, which form the fifteenth-layer convolution feature matrix;
by the same operation, apply a convolution operation with 512 kernels to the 512 fifteenth-order convolution feature matrices C15_1, C15_2, C15_3 … C15_512 of the fifteenth layer, obtaining 512 sixteenth-order convolution feature matrices C16_1, C16_2, C16_3 … C16_512, which form the sixteenth-layer convolution feature matrix;
by the same operation, apply a convolution operation with 512 kernels to the 512 sixteenth-order convolution feature matrices C16_1, C16_2, C16_3 … C16_512 of the sixteenth layer, obtaining 512 seventeenth-order convolution feature matrices C17_1, C17_2, C17_3 … C17_512, which form the seventeenth-layer convolution feature matrix;
(1.2.14) apply the fifth pooling down-sampling operation to the seventeenth-layer convolution feature matrix to obtain the eighteenth-layer convolution feature matrix:
in the same manner as sub-step (1.2.3), apply the pooling down-sampling operation in turn to the 512 seventeenth-order convolution feature matrices C17_1, C17_2, C17_3 … C17_512, yielding 512 eighteenth-order convolution feature matrices C18_1, C18_2, C18_3 … C18_512 in total, which form the eighteenth-layer convolution feature matrix;
(1.2.15) apply a convolution operation to the eighteenth-layer convolution feature matrix to obtain the nineteenth-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 3 × 3 convolution kernel with stride 1 and its weights, apply the matrix convolution operation to each of the 512 eighteenth-order convolution feature matrices C18_1, C18_2, C18_3 … C18_512, giving 512 convolution results, which are added directly to form the nineteenth-order convolution feature matrix C19_1;
using 4096 kernels of size 3 × 3 in total, the convolution operation of the preceding paragraph is applied to the 512 eighteenth-order convolution feature matrices C18_1, C18_2, C18_3 … C18_512 of the eighteenth layer, yielding 4096 nineteenth-order convolution feature matrices C19_1, C19_2, C19_3 … C19_4096, which form the nineteenth-layer convolution feature matrix;
(1.2.16) apply a convolution operation to the nineteenth-layer convolution feature matrix to obtain the twentieth-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 1 × 1 convolution kernel with stride 1 and its weights, apply the matrix convolution operation to each of the 4096 nineteenth-order convolution feature matrices C19_1, C19_2, C19_3 … C19_4096, giving 4096 convolution results, which are added directly to form the twentieth-order convolution feature matrix C20_1;
using 4096 kernels of size 1 × 1 in total, the convolution operation of the preceding paragraph is applied to the 4096 nineteenth-order convolution feature matrices C19_1, C19_2, C19_3 … C19_4096 of the nineteenth layer, yielding 4096 twentieth-order convolution feature matrices C20_1, C20_2, C20_3 … C20_4096, which form the twentieth-layer convolution feature matrix;
(1.2.17) apply a convolution operation to the twentieth-layer convolution feature matrix to obtain the twenty-first-layer convolution feature matrix:
in the same manner as sub-step (1.2.2), using a 1 × 1 convolution kernel with stride 1 and its weights, apply the matrix convolution operation to each of the 4096 twentieth-order convolution feature matrices C20_1, C20_2, C20_3 … C20_4096, giving 4096 convolution results, which are added directly to form the twenty-first-order convolution feature matrix C21_1;
using 32 kernels of size 1 × 1 in total, the convolution operation of the preceding paragraph is applied to the 4096 twentieth-order convolution feature matrices C20_1, C20_2, C20_3 … C20_4096 of the twentieth layer, yielding 32 twenty-first-order convolution feature matrices C21_1, C21_2, C21_3 … C21_32, which form the twenty-first-layer convolution feature matrix;
the weights of every convolution kernel involved in sub-steps (1.2.1) to (1.2.17) are initialized with the values in the VGG16 model parameter file vgg16-0001.params, and thereafter are the weights of every convolution kernel in every layer produced by the parameter update step (1.5);
for every left-view frame, sub-steps (1.2.1) to (1.2.17) yield 21 layers of convolution feature matrices extracted at different scales.
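For illustration only, the two primitive operations that claim 2 repeats layer after layer are a 3 × 3, stride-1 convolution whose negative outputs are zeroed, and a 2 × 2, stride-2 max pooling. Below is a minimal NumPy sketch of both, under the assumption of zero padding so that spatial size is preserved (the claim does not state its boundary handling); it is not the patented implementation.

```python
import numpy as np

def conv3x3_relu(channels, kernels):
    """One convolution layer of claim 2: each output matrix is the direct sum,
    over all input matrices, of 3x3 stride-1 correlations, with every element
    smaller than zero set to zero. channels: (C, H, W); kernels: (O, C, 3, 3)."""
    C, H, W = channels.shape
    padded = np.pad(channels, ((0, 0), (1, 1), (1, 1)))  # assumed zero padding
    out = np.zeros((len(kernels), H, W))
    for o, ker in enumerate(kernels):
        for i in range(H):
            for j in range(W):
                out[o, i, j] = np.sum(padded[:, i:i + 3, j:j + 3] * ker)
    return np.maximum(out, 0.0)  # "elements smaller than zero are set to zero"

def maxpool2x2(channels):
    """Pooling down-sampling of claim 2: per 2x2 window, stride 2, keep the max."""
    C, H, W = channels.shape
    trimmed = channels[:, :H // 2 * 2, :W // 2 * 2]
    return trimmed.reshape(C, H // 2, 2, W // 2, 2).max(axis=(2, 4))

# toy usage: a single-channel 8x8 "left view" through layer 1 (64 kernels),
# then pooling; the full 21-layer schedule chains these two primitives with
# 64/128/256/512/4096/32 kernels as enumerated in sub-steps (1.2.1)-(1.2.17).
x = np.random.rand(1, 8, 8)
layer1 = conv3x3_relu(x, np.random.rand(64, 1, 3, 3))  # (64, 8, 8)
pooled = maxpool2x2(layer1)                            # (64, 4, 4)
```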
3. The method of claim 1 or 2 for converting 2D video into 3D video, characterized in that the feature fusion step (1.3) comprises the following sub-steps:
(1.3.1) apply a deconvolution operation to the third-layer convolution feature matrix produced by the first pooling down-sampling operation, obtaining the first group of deconvolution feature matrices D3:
using a 1 × 1 convolution kernel, apply the matrix deconvolution operation to each of the 64 third-order convolution feature matrices C3_1, C3_2, C3_3 … C3_64, giving 64 convolution outputs, which are added directly to form the first-order deconvolution feature matrix D3_1;
using 32 kernels of size 1 × 1 in total, the operation of the preceding paragraph is applied to the 64 third-order convolution feature matrices C3_1, C3_2, C3_3 … C3_64 of the third layer, yielding 32 first-order deconvolution feature matrices D3_1, D3_2, D3_3 … D3_32, which form the first group of deconvolution feature matrices D3;
(1.3.2) apply a deconvolution operation to the sixth-layer convolution feature matrix produced by the second pooling down-sampling operation, obtaining the second group of deconvolution feature matrices D6:
using a 2 × 2 convolution kernel, apply the matrix deconvolution operation to each of the 128 sixth-order convolution feature matrices C6_1, C6_2, C6_3 … C6_128, giving 128 convolution outputs, which are added directly to form the second-order deconvolution feature matrix D6_1;
using 32 kernels of size 2 × 2 in total, the operation of the preceding paragraph is applied to the 128 sixth-order convolution feature matrices C6_1, C6_2, C6_3 … C6_128 of the sixth layer, yielding 32 second-order deconvolution feature matrices D6_1, D6_2, D6_3 … D6_32, which form the second group of deconvolution feature matrices D6;
(1.3.3) apply deconvolution operations, respectively, to the tenth-layer convolution feature matrix produced by the third pooling down-sampling operation and to the fourteenth-layer convolution feature matrix produced by the fourth pooling down-sampling operation, obtaining the third group of deconvolution feature matrices D10 and the fourth group of deconvolution feature matrices D14:
following the matrix deconvolution operation of sub-step (1.3.2), using 32 kernels of size 4 × 4 in total with their weights, operate on the 256 tenth-order convolution feature matrices C10_1, C10_2, C10_3 … C10_256 of the tenth layer, obtaining 32 third-order deconvolution feature matrices D10_1, D10_2, D10_3 … D10_32, which form the third group of deconvolution feature matrices D10;
using 32 kernels of size 8 × 8 in total, operate on the 512 fourteenth-order convolution feature matrices C14_1, C14_2, C14_3 … C14_512 of the fourteenth layer, obtaining 32 fourth-order deconvolution feature matrices D14_1, D14_2, D14_3 … D14_32, which form the fourth group of deconvolution feature matrices D14;
(1.3.4) apply the matrix deconvolution operation to the twenty-first-layer convolution feature matrix, obtaining the fifth group of deconvolution feature matrices D21:
following the matrix deconvolution operation of sub-step (1.3.2), using 32 kernels of size 16 × 16 in total with their weights, operate on the 32 twenty-first-order convolution feature matrices C21_1, C21_2, C21_3 … C21_32 of the twenty-first layer, obtaining 32 fifth-order deconvolution feature matrices D21_1, D21_2, D21_3 … D21_32, which form the fifth group of deconvolution feature matrices D21;
(1.3.5) cascade the first group of deconvolution feature matrices D3 through the fifth group of deconvolution feature matrices D21 to form the fused feature matrix Dc, the cascading being given by formula (1):

Dc_t = D3_t + D6_t + D10_t + D14_t + D21_t,  t = 1, 2, …, N   (1)

where N = 32 is the number of deconvolution feature matrices in each of the groups obtained by sub-steps (1.3.1)-(1.3.4); the resulting fused feature matrix Dc has size H × W × 32, i.e. H rows, W columns and 32 tensor dimensions, Dc_1 denoting the first of the 32 tensor dimensions and Dc_t the t-th;
the weights of every convolution kernel involved in sub-steps (1.3.1) to (1.3.4) are initialized with the values in the VGG16 model parameter file vgg16-0001.params, and thereafter are the weights of every convolution kernel in every layer produced by the parameter update step (1.5).
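For illustration only: since each deconvolution kernel size in claim 3 (1 × 1, 2 × 2, 4 × 4, 8 × 8, 16 × 16) matches an up-sampling factor, a non-overlapping transposed convolution (stride equal to kernel size) reproduces the operation; it expands each input pixel into a kernel-sized block, which is exactly what `np.kron` computes. The element-wise combination in `fuse` follows the reading of formula (1) adopted above, an assumption consistent with the stated H × W × 32 size of Dc.

```python
import numpy as np

def deconv(channels, kernels, s):
    """Matrix deconvolution of claim 3: transposed convolution with an s x s
    kernel and stride s, summing the per-channel results directly.
    channels: (C, H, W); kernels: (O, C, s, s); returns (O, H*s, W*s)."""
    C, H, W = channels.shape
    out = np.zeros((len(kernels), H * s, W * s))
    for o, ker in enumerate(kernels):
        for c in range(C):
            out[o] += np.kron(channels[c], ker[c])  # pixel -> s x s block
    return out

def fuse(groups):
    """Cascade of sub-step (1.3.5): combine the 32-matrix groups into the
    fused feature matrix Dc of size (H, W, 32) by element-wise summation."""
    return np.transpose(sum(groups), (1, 2, 0))

# toy usage: fuse two mock groups up-sampled to the same 8x8 resolution
# (the patent fuses five groups, D3 through D21, in the same way)
g1 = deconv(np.random.rand(4, 4, 4), np.random.rand(32, 4, 2, 2), 2)  # (32, 8, 8)
g2 = deconv(np.random.rand(4, 2, 2), np.random.rand(32, 4, 4, 4), 4)  # (32, 8, 8)
assert fuse([g1, g2]).shape == (8, 8, 32)
```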
4. The method of claim 3 for converting 2D video into 3D video, characterized in that the view synthesis step (1.4) comprises the following sub-steps:
(1.4.1) depth map: from the fused feature matrix Dc, compute with formula (2) the predicted probability of each possible disparity for every pixel of the corresponding left view, forming the disparity probability matrix Dep of the left view:

Dep_k = p(y = k | x; θ) = e^(θ_k · Dc_k) / Σ_{p=1}^{32} e^(θ_p · Dc_p),  k = 1, 2, …, 32   (2)

where each element Dep_k is itself a matrix, representing the probability value p(y = k | x; θ), i.e. the probability, for each pixel of the left view, of taking disparity k - 16 under the regression parameter matrix θ; the regression parameter matrix θ consists of the regression parameters [θ_1, θ_2, …, θ_32], and e^(θ_p · Dc_p) is the logistic-regression value of the p-th tensor dimension of Dc under θ_p, p = 1, 2, …, 32; each regression parameter of θ is initialized with the values in the VGG16 model parameter file vgg16-0001.params, and thereafter is the regression parameter of θ produced by the parameter update step (1.5);
(1.4.2) form the synthesized right-eye view R, whose pixel value R_{i,j} at row i, column j is given by formula (3):

R_{i,j} = Σ_{d=-15}^{+16} Dep^k_{i,j} · L^d_{i,j},  k = d + 16   (3)

where L^d is the original left view translated by disparity d; L^d_{i,j} is its pixel value at row i, column j, with L^d_{i,j} = L_{i,j-d}, the pixel value of the original left view at position (i, j-d), and L^d_{i,j} = 0 if j - d < 0; d is the disparity, -15 ≤ d ≤ +16; Dep^k_{i,j} is the element of Dep_k at position (i, j), i.e. the probability that the pixel of the left view at position (i, j) takes disparity d, with k = d + 16, i = 1~H, j = 1~W.
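For illustration only, formulas (2) and (3) together are a softmax over 32 disparity channels followed by a probability-weighted blend of horizontally shifted copies of the left view. A minimal NumPy sketch, assuming a single-channel left view and the softmax reading of formula (2) given above:

```python
import numpy as np

def synthesize_right(left, dc, theta):
    """View synthesis of claim 4. left: (H, W); dc: (H, W, 32); theta: (32,).
    Returns the disparity probability matrix Dep and the synthesized view R."""
    logits = dc * theta                          # theta_p * Dc_p per dimension
    logits -= logits.max(axis=2, keepdims=True)  # numerical stability
    dep = np.exp(logits)
    dep /= dep.sum(axis=2, keepdims=True)        # formula (2): Dep_k per pixel
    H, W = left.shape
    right = np.zeros((H, W))
    for d in range(-15, 17):                     # disparities -15 .. +16
        shifted = np.zeros((H, W))               # L^d_{i,j} = L_{i,j-d}
        if d > 0:
            shifted[:, d:] = left[:, :-d]
        elif d < 0:
            shifted[:, :d] = left[:, -d:]
        else:
            shifted[:] = left
        right += dep[:, :, d + 15] * shifted     # formula (3), channel k = d+16
    return dep, right

# toy usage: probabilities sum to 1 per pixel, R has the left view's shape
dep, right = synthesize_right(np.random.rand(6, 40),
                              np.random.rand(6, 40, 32), np.random.rand(32))
assert np.allclose(dep.sum(axis=2), 1.0) and right.shape == (6, 40)
```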
5. The method of claim 4 for converting 2D video into 3D video, characterized in that the parameter update step (1.5) comprises the following sub-steps:
(1.5.1) subtract the right view of step (1.2) from the synthesized right-eye view R to obtain the error matrix errR; for M consecutive synthesized right-eye frames, sum the error matrices errR into the propagated error matrix errS_R, M ≥ 16;
(1.5.2) propagate the propagated error matrix errS_R back to the view synthesis step (1.4); first the disparity probability matrix Dep is updated to the new disparity probability matrix Dep^(l+1), whose elements Dep_k^(l+1) are given by formula (4):

Dep_k^(l+1) = Dep_k^l - η · ∂errS_R / ∂Dep_k^l   (4)

where Dep_k^(l+1) denotes the (l+1)-th updated value for disparity d (d = k - 16), Dep_k^l denotes its previous updated value, and the learning rate η is initialized to 0.00003~0.00008;
(1.5.3) next the regression parameter matrix θ is updated to θ^(l+1); each regression parameter θ_p^(l+1) of θ^(l+1) is updated according to formula (5):

θ_p^(l+1) = θ_p^l - λ · ∂errS_R / ∂θ_p^l   (5)

where θ_p^(l+1) denotes the (l+1)-th updated value of the regression parameter θ_p and θ_p^l its previous updated value; the derivative is expressed through Dep_p, L_p and Dc_p: Dep_p is the Dep_k obtained after the parameter update of the view synthesis step (1.4), with p = k; L_p is the L_d of the view synthesis step, with d = p - 16; Dc_p is the Dc_t of step (1.3.5), with t = p; p ranges from 1 to 32, and λ is the update rate of θ, 0.00001 ≤ λ ≤ 0.01;
(1.5.4) convert the propagated error matrix errS_R into the feature error matrix errD_R according to formula (6), and feed the feature error matrix errD_R back into the feature extraction step (1.2) and the feature fusion step (1.3), where the weights of every convolution kernel involved are updated by the iterative mechanism of the caffe deep learning framework;
in total, 12608 convolution kernels, i.e. 113472 convolution kernel weights, are updated in sequence from the twenty-first layer down to the first layer; sub-steps (1.5.2)~(1.5.4) complete one update of all parameters.
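For illustration only, sub-steps (1.5.1)-(1.5.3) reduce to accumulating the error over M frames and taking gradient-descent steps on Dep and θ at the claimed rates. The following is a minimal sketch under the gradient-descent reading of formulas (4) and (5) adopted above; the gradient arguments are stand-ins for the back-propagated derivatives of errS_R, which the patent computes via caffe.

```python
import numpy as np

def propagated_error(synth_frames, right_frames):
    """Sub-step (1.5.1): errS_R is the sum of per-frame error matrices
    errR = R - right over M >= 16 consecutive frames."""
    return sum(r_syn - r for r_syn, r in zip(synth_frames, right_frames))

def update_dep_theta(dep, theta, grad_dep, grad_theta, eta=5e-5, lam=1e-3):
    """Formulas (4) and (5): gradient-descent updates with learning rate eta
    (initialized in 0.00003~0.00008) for Dep and update rate lambda
    (0.00001 <= lambda <= 0.01) for theta."""
    return dep - eta * grad_dep, theta - lam * grad_theta

# toy usage with M = 16 mock frames and random stand-in gradients
M, H, W = 16, 4, 8
errS = propagated_error([np.ones((H, W))] * M, [np.zeros((H, W))] * M)
dep, theta = update_dep_theta(np.random.rand(H, W, 32), np.random.rand(32),
                              np.random.rand(H, W, 32), np.random.rand(32))
assert errS.shape == (H, W)
```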
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710227433.4A CN107018400B (en) | 2017-04-07 | 2017-04-07 | It is a kind of by 2D Video Quality Metrics into the method for 3D videos |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107018400A CN107018400A (en) | 2017-08-04 |
CN107018400B (en) | 2018-06-19
Family
ID=59446292
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710227433.4A Active CN107018400B (en) | 2017-04-07 | 2017-04-07 | It is a kind of by 2D Video Quality Metrics into the method for 3D videos |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107018400B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977981B (en) * | 2017-12-27 | 2020-11-24 | 深圳市优必选科技有限公司 | Scene analysis method based on binocular vision, robot and storage device |
CN109788270B (en) * | 2018-12-28 | 2021-04-09 | 南京美乐威电子科技有限公司 | 3D-360-degree panoramic image generation method and device |
CN109948689B (en) * | 2019-03-13 | 2022-06-03 | 北京达佳互联信息技术有限公司 | Video generation method and device, electronic equipment and storage medium |
CN111932437B (en) * | 2020-10-10 | 2021-03-05 | 深圳云天励飞技术股份有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2130178A1 (en) * | 2007-03-23 | 2009-12-09 | Thomson Licensing | System and method for region classification of 2d images for 2d-to-3d conversion |
CN102223553B (en) * | 2011-05-27 | 2013-03-20 | 山东大学 | Method for converting two-dimensional video into three-dimensional video automatically |
CN104243956B (en) * | 2014-09-12 | 2016-02-24 | 宁波大学 | A kind of stereo-picture visual saliency map extracting method |
CN104954780B (en) * | 2015-07-01 | 2017-03-08 | 南阳师范学院 | A kind of DIBR virtual image restorative procedure suitable for the conversion of high definition 2D/3D |
CN105979244A (en) * | 2016-05-31 | 2016-09-28 | 十二维度(北京)科技有限公司 | Method and system used for converting 2D image to 3D image based on deep learning |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |