CN101888566B - Estimation method of distortion performance of stereo video encoding rate - Google Patents

Estimation method of distortion performance of stereo video encoding rate

Info

Publication number
CN101888566B
CN101888566B (application CN201010222351A)
Authority
CN
China
Prior art keywords
distortion
view
virtual
video
video coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010222351
Other languages
Chinese (zh)
Other versions
CN101888566A (en)
Inventor
季向阳
汪启扉
戴琼海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN 201010222351 priority Critical patent/CN101888566B/en
Publication of CN101888566A publication Critical patent/CN101888566A/en
Application granted granted Critical
Publication of CN101888566B publication Critical patent/CN101888566B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a method for estimating the rate-distortion performance of stereo video coding, comprising the following steps: acquiring a multi-view video, and acquiring a corresponding multi-view depth map from the multi-view video; obtaining a multi-view video coding rate-distortion model and a multi-view depth map coding rate-distortion model, respectively; obtaining the relationship between the multi-view video coding distortion and the virtual view rendering distortion as well as the relationship between the multi-view depth map coding distortion and the virtual view rendering distortion; and finally establishing, based on these relationships, a coding and rendering rate-distortion model of the virtual view in stereo video. With the analysis method of virtual view coding and rendering rate distortion in stereo video provided by the invention, the rate-distortion performance of virtual view coding and rendering in stereo video can be modeled accurately and rapidly, so as to provide model guidance and solutions for problems such as coding parameter selection and bit rate allocation in stereo video.

Description

Estimation method of distortion performance of stereo video encoding rate
Technical field
The present invention relates to the technical field of image processing, and in particular to a method for estimating the rate-distortion performance of stereo video coding oriented to virtual viewpoint rendering.
Background technology
With the continuous development of multimedia technology, traditional two-dimensional video media can no longer satisfy the needs of human vision: people want video programs that are more realistic and interactive, namely three-dimensional video with a strong stereoscopic sense of immersion. The 3D video technologies that have emerged in recent years, such as multi-view video, free-viewpoint video and stereo video, are gradually becoming an important part of multimedia technology. Free-viewpoint video refers to a video medium in which the viewer watches a three-dimensional scene from a freely selected viewpoint.
The development of free-viewpoint video has gone through two main stages. The first stage is known as multi-view video: the viewer selects the desired viewpoint from the multi-view video being played, and the parallax between the videos of different viewpoints produces a stereoscopic impression. The second stage is called stereo video: the viewer can freely select a viewpoint from which to watch video with depth. To obtain multi-view video, the same scene must be captured with multiple cameras, the captured multi-channel video must be transmitted to the client, and the corresponding view is played according to the viewer's choice. To obtain stereo video with free viewpoints, the geometric information of the scene must further be introduced on top of the multi-view video; the virtual view arbitrarily selected by the viewer is then rendered from the multi-view video data and the geometric information, giving the viewer a stronger stereoscopic viewing experience.
Unlike traditional single-channel two-dimensional video data, stereo video consists of multi-channel two-dimensional video data plus scene geometry, so the massive amount of multi-view video data places a far greater demand on transmission bandwidth than traditional two-dimensional video. In addition, how to compress the scene geometry effectively is a new challenge facing stereo video coding. Therefore, to realize effective transmission of stereo video, efficient coding techniques must be designed for the characteristics of stereo video. Early on, researchers applied traditional video compression techniques to multi-view video compression, using conventional single-channel video compression to compress the video of each viewpoint separately. This scheme became one of the earliest stereo video compression solutions.
However, there is strong correlation between the multi-view video sequences, and even with state-of-the-art H.264/AVC coding techniques the redundancy between different viewpoints still cannot be compressed effectively. For this reason, researchers designed more efficient multi-view video compression schemes, which extend traditional single-channel video coding to multi-channel video by adding inter-view predictive coding on top of temporal predictive coding, further compressing the redundancy between the viewpoints and improving the rate-distortion performance of multi-view video compression. In addition, for the depth-map-based scene geometry of stereo video, i.e. multi-view depth map sequences, researchers likewise adopt multi-view video coding schemes to realize efficient compression. With the continuous development of stereo video technology, new stereo video compression techniques are developing rapidly. At present, real-time and efficient stereo video coding has become one of the key technologies for bringing stereo video applications to market.
On top of an efficient stereo video coding scheme, the coding parameters must be adjusted according to network conditions to realize effective transmission of stereo video. In traditional video coding, when the network bandwidth is known, the decoding quality can be estimated from the rate-distortion model of the encoder so that the coding parameters can be adjusted more appropriately. Therefore, rate-distortion modeling of an efficient stereo video coding scheme is an important step toward effective transmission of stereo video.
The shortcoming of the prior art is that there is currently no scheme for estimating the rate-distortion performance of stereo video coding oriented to virtual viewpoint rendering.
For example, the patent application with application number 200810163801, entitled "A network-adaptive stereo video coding method", only gives adaptive coding schemes for stereo video under various network bandwidths and does not analyze the rate-distortion performance of stereo video coding. The patent application with application number 200810126528, entitled "A stereo video encoding and decoding method, apparatus and system", discloses a complete stereo video coding scheme, apparatus and system, but likewise does not give an analysis of the rate-distortion performance of stereo video coding. The patent application with application number 200710164747, entitled "A rate control method for multi-view video", discloses a rate-distortion model for multi-view video coding, but it is not directed to the rate-distortion performance of stereo video coding for virtual viewpoint rendering.
Summary of the invention
The object of the invention is to solve at least the above technical deficiencies by proposing a method for estimating the rate-distortion performance of stereo video coding oriented to virtual viewpoint rendering.
To achieve the above object, one aspect of the present invention proposes a method for estimating the rate-distortion performance of stereo video coding, comprising the following steps: obtaining a multi-view video, and obtaining a corresponding multi-view depth map from the multi-view video; obtaining a multi-view video coding rate-distortion model and a multi-view depth map coding rate-distortion model from the multi-view video and the multi-view depth map, respectively; obtaining, from the multi-view video and the multi-view depth map, the relation between the multi-view video coding distortion and the virtual viewpoint rendering distortion, and the relation between the multi-view depth map coding distortion and the virtual viewpoint rendering distortion; obtaining, from these two relations, the relation between the virtual viewpoint rendering distortion and the respective coding quantization parameters QP of the multi-view video and the multi-view depth map; obtaining, from the multi-view video coding rate-distortion model and the multi-view depth map coding rate-distortion model, the relation between the coding bit rate needed to render the virtual view and the QP; and collecting the rendering distortion of the virtual view and the coding bit rate needed to render the virtual view under different QPs to obtain the rate-distortion model of the virtual view in stereo video coding.
With the analysis method for the coding and rendering rate distortion of virtual viewpoint rendering in stereo video proposed by the present invention, the coding and rendering rate-distortion performance of virtual viewpoint rendering in stereo video can be estimated quickly and accurately, thereby providing model guidance and solutions for problems such as stereo video coding parameter selection and bit rate allocation and further improving the efficiency of stereo video coding.
Additional aspects and advantages of the invention are set forth in part in the following description, and in part will become apparent from the description or be learned by practice of the invention.
Description of drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram of the stereo video system provided by an embodiment of the invention;
Fig. 2 is a diagram of the multi-view video coding prediction structure provided by an embodiment of the invention;
Fig. 3 is a flow chart of the method for estimating stereo video coding rate-distortion performance according to an embodiment of the invention;
Fig. 4 is a schematic diagram of virtual viewpoint rendering provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of quadtree decomposition of a depth map provided by an embodiment of the invention.
Embodiment
Embodiments of the invention are described in detail below; examples of the embodiments are shown in the drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary, are only used to explain the invention, and are not to be construed as limiting the invention.
In stereo video, the virtual view is rendered from the multi-view video and the scene geometry, so the quality of virtual viewpoint rendering is related to the coding quality of the multi-view video and of the scene geometry. For this reason, the present invention establishes a joint rate-distortion model for multi-view video and scene geometry coding to obtain a coding rate-distortion model for the rendered virtual-view video, thereby better guiding problems such as stereo video coding parameter selection and bit rate allocation.
The present invention mainly proposes a practical method for estimating a joint rate-distortion model of multi-view video and depth map coding. The method can effectively estimate the relation between the coding quality of the multi-view video and depth maps and the quality of virtual viewpoint rendering, thereby providing theoretical guidance for problems such as stereo video coding parameter selection and bit rate allocation under a given network bandwidth.
The application environment of the embodiment of the invention is as follows. The block diagram of the stereo video system used by the embodiment is shown in Fig. 1. The video sequence used for stereo video coding is the standard high-definition test sequence named "Breakdancer", whose resolution is 1024 x 768 pixels. The encoder uses JMVC (Joint Multi-view Video Coding), the reference software of the H.264 multi-view video coding (MVC) extension. The number of frames in an encoder GOP (Group of Pictures) is 8. Temporal predictive coding uses the hierarchical B prediction structure (hierarchical bi-directional predictive frames, hierarchical B frames for short); the coding prediction structure is shown in Fig. 2.
The server obtains the multi-view video and derives the multi-view depth maps from it; in stereo video coding, the multi-view video and the multi-view depth maps are encoded and transmitted to the client over the network. The client decodes the received multi-view video and multi-view depth map bitstreams, generates the virtual-view video corresponding to the viewpoint selected by the viewer, and displays it to the viewer on a stereoscopic display.
In this example, the virtual view is rendered from the two videos and depth maps adjacent to the virtual view. Specifically, this example uses the two videos of viewpoint 4 and viewpoint 6 of the "Breakdancer" sequence as the multi-view video input, where viewpoint 4 is called the left reference view and viewpoint 6 the right reference view. The coding quantization parameter QP of the multi-view video and of the multi-view depth maps takes integer values between 0 and 51. The parameters of the generated virtual view are identical to those of viewpoint 5 of the "Breakdancer" sequence.
Fig. 3 is a flow chart of the method for estimating stereo video coding rate-distortion performance according to an embodiment of the invention. In this embodiment, multi-view video coding is implemented with the multi-view extension of the conventional H.264/AVC video coding standard. The intra-coded frames (I frames), forward predictive coded frames (P frames) and bi-directional predictive coded frames (B frames) of each group of pictures in each view are all encoded with the same quantization parameter (QP). The method comprises the following steps:
Step S301: establish the multi-view video coding rate-distortion model, i.e. estimate the multi-view video coding bit rate and distortion for a given multi-view video coding quantization parameter. Given the value of the coding quantization parameter QP, the quantization step is first obtained as Q_step = 2^((QP-4)/6). For example, when QP = 28, Q_step = 2^((28-4)/6) = 16.
This step can be further divided into the following two sub-steps:
Step S101: the bit rate r_c of multi-view video coding is computed as r_c = a/Q_step^2 + b/Q_step + c, where Q_step = 2^((QP-4)/6) and the parameters a, b and c must be set for each video sequence. Specifically, in an embodiment of the invention, for the "Breakdancer" sequence, fitting the relation between the coding bit rate r_c and the quantization step Q_step by linear regression gives a = 24.37, b = -0.303 and c = 0.02; that is, for the left and right views the relation between the coding bit rate r_c and the quantization step Q_step is r_c = 24.37/Q_step^2 - 0.303/Q_step + 0.02. For example, when Q_step = 16, r_c = 0.096 bpp.
Step S102: the distortion of multi-view video coding is computed as PSNR_c = q_c × QP + p_c, where the parameters q_c and p_c must be set for each video sequence. Then, for a given multi-view video coding quantization parameter QP, the correspondence between the multi-view video coding bit rate r_c and the distortion PSNR_c can be obtained. Specifically, in an embodiment of the invention, for the "Breakdancer" sequence, fitting the expression PSNR_c = q_c × QP + p_c by linear regression gives q_c = -0.37 and p_c = 49.30; that is, for the left and right views the relation between the decoded quality PSNR_c and the coding quantization parameter QP can be expressed as PSNR_c = -0.37 × QP + 49.30. When QP = 28, PSNR_c = -0.37 × 28 + 49.30 = 38.94 dB.
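As an illustration of steps S101 and S102, the following short Python sketch (not part of the original patent; the parameter values are simply the "Breakdancer" fits quoted above) evaluates the fitted multi-view video coding rate and distortion models for a given QP.

    def q_step(qp):
        # Quantization step derived from the quantization parameter QP.
        return 2 ** ((qp - 4) / 6)

    def video_rate_bpp(qp, a=24.37, b=-0.303, c=0.02):
        # Multi-view video coding bit rate r_c = a/Q_step^2 + b/Q_step + c (bpp).
        q = q_step(qp)
        return a / q ** 2 + b / q + c

    def video_psnr_db(qp, q_c=-0.37, p_c=49.30):
        # Multi-view video coding distortion PSNR_c = q_c * QP + p_c (dB).
        return q_c * qp + p_c

    # Example: QP = 28 gives Q_step = 16, r_c ~ 0.096 bpp and PSNR_c = 38.94 dB.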
Step S302: establish the depth map coding rate-distortion model, i.e. estimate the correspondence between the multi-view depth map coding bit rate and distortion for a given multi-view depth map coding quantization parameter. In this step, the multi-view depth maps are encoded with the same multi-view video coding described in step S301; likewise, the I frames, P frames and B frames of each group of pictures in each view are all encoded with the same QP. As in the step above, when the coding quantization parameter QP is 28, Q_step = 16.
This step can be further divided into the following two sub-steps:
Step S221: the bit rate r_d of multi-view depth map coding is computed as r_d = k/Q_step + t, where Q_step = 2^((QP-4)/6) and the parameters k and t must be set for each depth map sequence. Specifically, in an embodiment of the invention, for the "Breakdancer" sequence, fitting the relation r_d = k/Q_step + t by linear regression gives k = 0.9996 and t = 0.0040; that is, for the left and right views the relation between the coding bit rate r_d and the quantization step Q_step is r_d = 0.9996/Q_step + 0.0040. For example, when Q_step = 16, r_d = 0.0665 bpp.
Step S222: the distortion of multi-view depth map coding is computed as PSNR_d = q_d × QP + p_d, where the parameters q_d and p_d must be set for each depth map sequence. Then, for a given multi-view depth map coding quantization parameter QP, the correspondence between the multi-view depth map coding bit rate r_d and the distortion PSNR_d can be obtained. Specifically, in an embodiment of the invention, for the "Breakdancer" sequence, fitting the expression PSNR_d = q_d × QP + p_d by linear regression gives q_d = -0.65 and p_d = 63.75; that is, for the left and right views the relation between the decoded quality PSNR_d and the coding quantization parameter QP can be expressed as PSNR_d = -0.65 × QP + 63.75. When QP = 28, PSNR_d = -0.65 × 28 + 63.75 = 45.55 dB.
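The depth map coding model of step S302 admits the same kind of sketch (again hypothetical helper code using the "Breakdancer" fits k, t, q_d and p_d given above).

    def depth_rate_bpp(qp, k=0.9996, t=0.0040):
        # Multi-view depth map coding bit rate r_d = k/Q_step + t (bpp).
        q = 2 ** ((qp - 4) / 6)
        return k / q + t

    def depth_psnr_db(qp, q_d=-0.65, p_d=63.75):
        # Multi-view depth map coding distortion PSNR_d = q_d * QP + p_d (dB).
        return q_d * qp + p_d

    # Example: QP = 28 gives r_d ~ 0.0665 bpp and PSNR_d = 45.55 dB.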
Next, the invention establishes the relational model between the virtual-view distortion and the multi-view video coding and depth map coding distortions. In stereo video, virtual viewpoint rendering computes the positional correspondence between pixels of a reference view and pixels of the virtual view through the depth map of the reference view, and then computes the luminance and chrominance of each virtual-view pixel as a weighted combination of the luminance and chrominance of the corresponding pixels in the video sequences of several reference views. The distortion of the virtual view is therefore caused jointly by the multi-view video coding distortion and the multi-view depth map coding distortion. Accordingly, to establish the functional relation between the virtual-view distortion and the coding parameters of the multi-view video and the multi-view depth maps, the invention first analyzes the relation between the virtual-view distortion and the multi-view video coding distortion under the condition that the multi-view depth maps are distortion-free, then analyzes the relation between the virtual-view distortion and the multi-view depth map coding distortion under the condition that the multi-view video is distortion-free, and finally establishes the joint relational model between the virtual-view distortion and the multi-view video and multi-view depth map coding distortions.
Step S303: establish the relation between the multi-view video coding distortion and the virtual viewpoint rendering distortion. In this step, the multi-view depth maps are assumed to be distortion-free, so the distortion of the virtual view is caused only by the multi-view video coding distortion. Suppose that generating the virtual view C_v requires M reference views Ref_i (i = 1, ..., M). The rendered virtual-view image is the weighted average of the M virtual-view images rendered from the reference views through their depth maps. Therefore, when the video coding distortion of the i-th reference view Ref_i is E_C^{Ref_i}, the contribution of the video coding distortion of reference view Ref_i to the rendered virtual-view image distortion E_CW is E_CW^{Ref_i} = w_{Ref_i} × E_C^{Ref_i}, where w_{Ref_i} is the weight coefficient of the video coding distortion of the i-th reference view in the rendered virtual-view image distortion. Considering that occlusion occurs when the virtual-view video is rendered from the video and depth map of a reference view, the error corresponding to the occluded part must be deducted when computing the virtual-view distortion E_CW^{Ref_i}. Let η_{Ref_i} be the ratio of the number of pixels that are occluded while rendering the virtual view from reference view Ref_i to the number of pixels of the whole reference frame; the expression for the distortion E_CW^{Ref_i} produced by rendering the virtual view from Ref_i is then modified to E_CW^{Ref_i} = η_{Ref_i} × w_{Ref_i} × E_C^{Ref_i}. Therefore, the virtual-view image distortion obtained by weighted rendering from the M reference views can be expressed as E_CW = Σ_{i=1}^{M} E_CW^{Ref_i}.
Specifically, in an embodiment of the invention, Fig. 4 is the schematic diagram of rendering the virtual view. In this step the multi-view depth maps are assumed distortion-free, so the distortion of the virtual view is caused only by the multi-view video coding distortion. Suppose the coding distortions of the left and right reference-view videos are E_C^L and E_C^R, respectively. When the video coding parameter of the left reference view is QP = 28, the coding distortion of the left reference-view video is obtained from its decoded quality as E_C^L = 255^2 × 10^(-PSNR_c/10) = 255^2 × 10^(-38.94/10) ≈ 8.3, and the contribution of the video coding distortion of the left reference view to the rendered virtual-view image distortion E_CW is E_CW^L = w_L × E_C^L. Likewise, when the video coding parameter of the right reference view is QP = 28, the video coding distortion of the right reference view is E_C^R ≈ 8.3, and its contribution to the rendered virtual-view image distortion E_CW is E_CW^R = w_R × E_C^R.
Because the virtual view is at the same distance from the left and right reference views, the weight coefficients w_L and w_R satisfy w_L = w_R = 0.5. Considering that occlusion occurs when the virtual-view video is rendered from the video and depth map of a reference view, the error corresponding to the occluded part must be deducted when computing the virtual-view distortion E_CW. Experimental analysis shows that, for the "Breakdancer" sequence, the average ratio of the number of pixels occluded while rendering the virtual view from viewpoint 4 and viewpoint 6 to the number of pixels of the whole reference frame is 0.15; the virtual-view image distortion caused by the coding distortion of the left and right reference views can then be expressed as:
E_CW = E_CW^L + E_CW^R = 0.15 × 0.5 × (E_C^L + E_C^R) = 0.15 × 0.5 × (8.3 + 8.3) ≈ 1.25. In one embodiment of the invention, for example when the QP value is 28, E_CW = 1.25.
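The bookkeeping of step S303 can be summarized by the following minimal Python sketch (hypothetical code, not from the patent; it assumes the usual MSE-from-PSNR conversion for 8-bit video and uses the two-view weights and occlusion ratio of the example above).

    def mse_from_psnr(psnr_db):
        # Convert a PSNR value in dB back to a mean squared error for 8-bit pixels.
        return 255 ** 2 * 10 ** (-psnr_db / 10)

    def rendering_distortion_from_video(psnr_views_db, weights, ratios):
        # E_CW = sum_i eta_i * w_i * E_C^i, with E_C^i the coding MSE of view i.
        return sum(eta * w * mse_from_psnr(psnr)
                   for psnr, w, eta in zip(psnr_views_db, weights, ratios))

    # "Breakdancer" example at QP = 28: two reference views, w_L = w_R = 0.5, ratio 0.15.
    e_cw = rendering_distortion_from_video([38.94, 38.94], [0.5, 0.5], [0.15, 0.15])
    # e_cw ~ 1.245, i.e. E_CW ~ 1.25 as in the example above.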
Step S304: establish the relation between the depth map coding distortion and the virtual viewpoint rendering distortion. In this step the multi-view video is assumed to be distortion-free, so the distortion of the virtual view is caused only by the multi-view depth map coding distortion. According to the principle of virtual view rendering, rendering obtains the mapping between each pixel of the virtual view and the corresponding pixel of a reference view from the depth information and the camera parameters; this process is equivalent to obtaining the disparity between each pixel of the virtual view and the corresponding pixel of the reference view. When the depth map is distorted, the positions of the corresponding pixels in the rendering process are shifted, i.e. the disparity is distorted. Suppose the distortion of the depth map at depth value D_j is ΔD_j; the resulting disparity distortion is determined by ΔD_j through a constant α that depends on the camera parameters (the exact expression is given only as an image in the original document).
To better measure the influence of the disparity distortion on the virtual-view quality, the video frame image of reference view Ref_i must first be decomposed so that the variance of the depth values of the pixels within each resulting region does not exceed a predefined threshold. To realize this decomposition, in the present invention the depth map corresponding to the video frame image of Ref_i can be decomposed with a quadtree, so that the depth-value variance within each region B_j after decomposition does not exceed the predefined threshold D_Th. For each region B_j, the average quantization distortion ΔD_j of the depth values in the region and the average depth D_j can be obtained from the given depth map coding quantization step, and the average disparity distortion ||Δd_j|| is then obtained from ΔD_j and D_j. After ||Δd_j|| is obtained, the distortion, caused by the disparity distortion, of the region of the rendered virtual-view image corresponding to region B_j of the reference view is denoted E_DW^{B_j}, and its expression is E_DW^{B_j} = ψ_{C^j} × ||Δd_j||^2, where ψ_{C^j} = 1/(2(2π)^2) × (∬ S_{C^j}(ω) ω_1^2 dω + ∬ S_{C^j}(ω) ω_2^2 dω), S_{C^j}(ω) is the Fourier transform matrix of the matrix formed by the luminance or chrominance values of the region of the reference-view video frame image corresponding to B_j, and ω = [ω_1, ω_2] are the angular frequencies of the region in the horizontal and vertical directions. Therefore, for reference view Ref_i, the distortion introduced by quantization of its depth map is E_DW^{Ref_i} = Σ_j E_DW^{B_j}, summed over the regions of the decomposition.
As in step S303, the virtual view C_v is rendered by weighting the M reference views Ref_i. In rendering the virtual view from Ref_i, the ratio of the points that are not occluded to the total number of pixels of the whole frame is η_{Ref_i}, and the weight coefficient of Ref_i in generating the virtual-view image is w_{Ref_i}; the contribution of the depth map coding distortion of Ref_i to the virtual-view distortion E_DW caused by the depth map coding distortion is therefore η_{Ref_i} × w_{Ref_i} × E_DW^{Ref_i}, i.e. E_DW = Σ_{i=1}^{M} η_{Ref_i} × w_{Ref_i} × E_DW^{Ref_i}.
Specifically, the invention assumes the multi-view video is distortion-free, so the distortion of the virtual view is caused only by the multi-view depth map coding distortion. When the distortion of the depth map at depth value D_j is ΔD_j, the disparity distortion ||Δd_j|| follows from ΔD_j through the camera-dependent constant α; for viewpoint 4 and viewpoint 6 of the "Breakdancer" sequence, α is 8.46. To better measure the influence of the disparity distortion on the virtual-view quality, the video frame images of the left and right reference views must first be decomposed so that the depth-value variance of the pixels within each resulting region does not exceed a predefined threshold. To realize this decomposition, in the present invention the depth maps corresponding to the video frame images of the left and right reference views can be decomposed with a quadtree, so that the depth-value variance within each region B_j (j = 1, ..., N_R) after decomposition does not exceed the predefined threshold D_Th = 10, where N_R is the maximum number of regions after the decomposition of each depth map. Fig. 5 shows the quadtree decomposition result of a depth map according to an embodiment of the invention.
For a region B_j, the average quantization distortion of the depth values of this region, ΔD_j = 2, and the average depth, D_j = 92, can be obtained from the given depth map coding quantization step; the average disparity distortion ||Δd_j|| is then obtained from ΔD_j and D_j (its numerical value is given only as an image in the original document). Take a region C_L^j of size 8 × 8 and its luminance as an example, and let its luminance matrix be:
62 63 62 61 63 58 62 63
61 62 62 63 61 64 65 64
62 61 57 61 63 59 63 63
67 61 58 62 61 61 66 64
63 62 62 58 60 62 63 62
61 62 61 61 61 63 60 58
62 62 61 62 62 62 60 62
64 63 61 62 62 62 60 61
The matrix S_{C_L^j}(ω) is obtained as the Fourier transform of this matrix. To compute the distortion, caused by the disparity distortion, of the region of the rendered virtual-view image corresponding to C_L^j, the value of ψ_{C_L^j} must also be computed. Its computing formula is ψ_{C_L^j} = 1/(2(2π)^2) × (∬ S_{C_L^j}(ω) ω_1^2 dω + ∬ S_{C_L^j}(ω) ω_2^2 dω), where ω = [ω_1, ω_2] are the angular frequencies of the region in the horizontal and vertical directions; evaluating it gives the value of ψ_{C_L^j} and hence the rendering distortion of the corresponding region (the numerical values are given only as images in the original document). In the same manner, the rendered virtual-view image distortion introduced by the depth map quantization error can be obtained for every other region, giving the total distortion introduced by quantization of the left reference view's depth map, E_DW^L = 32.24. Equally, for each region of the right reference view the regional distortion can be obtained with the same computation; the virtual-view distortion introduced by quantization of the right reference view's depth map is then E_DW^R = 32.16.
As in step S303, in rendering the virtual view from viewpoint 4 and viewpoint 6 the average ratio of the points that are not occluded to the total number of pixels of the whole frame is 0.15, and the weight coefficients of the left and right reference views are w_L = w_R = 0.5, so the contribution of the depth map coding distortion of the left and right reference views to the virtual-view distortion E_DW caused by the depth map coding distortion can be expressed as:
E_DW = 0.15 × 0.5 × (E_DW^L + E_DW^R) = 0.15 × 0.5 × (32.24 + 32.16) = 4.83.
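The per-region term of step S304 can be illustrated with the sketch below (hypothetical code, not from the patent; it reads S_{C^j}(ω) as the normalized power spectrum of the region and approximates the integrals by a discrete sum over the 2-D FFT grid, which is one plausible discretization of the formula above).

    import numpy as np

    def psi(region):
        # psi ~ 1/(2*(2*pi)^2) * sum over frequencies of S(w) * (w1^2 + w2^2),
        # with S(w) taken here as the normalized power spectrum of the region.
        region = np.asarray(region, dtype=float)
        h, w = region.shape
        spectrum = np.abs(np.fft.fft2(region)) ** 2 / (h * w)
        w1 = 2 * np.pi * np.fft.fftfreq(h)[:, None]  # vertical angular frequencies
        w2 = 2 * np.pi * np.fft.fftfreq(w)[None, :]  # horizontal angular frequencies
        return float((spectrum * (w1 ** 2 + w2 ** 2)).sum()) / (2 * (2 * np.pi) ** 2)

    def region_rendering_distortion(region, disparity_error):
        # Rendering distortion of one region: psi(region) * ||delta d||^2.
        return psi(region) * disparity_error ** 2

    # For the nearly flat 8 x 8 luminance block listed above, psi is small, so a flat
    # region contributes little rendering distortion for a given disparity error;
    # textured regions with strong high-frequency content contribute much more.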
Step S305: combine the above relations to establish the rate-distortion model of virtual view coding in stereo video. To establish the rate-distortion model of the virtual view, the coding bit rate and distortion models of the multi-view video and of the multi-view depth maps obtained in the four steps above must be merged, finally yielding the rate-distortion model of virtual view coding. In this step, establishing the coding rate-distortion model of virtual viewpoint rendering is realized in the following two sub-steps:
Step S501: first establish the relation between the joint distortion of the virtual view and the coding quantization parameters. The multi-view video distortion mainly distorts the luminance and chrominance values of the pixels of the virtual-view image, while the multi-view depth map distortion mainly shifts the positions of the pixels of the virtual-view image; the distortions that the multi-view video distortion and the multi-view depth map distortion cause in the virtual-view image are therefore independent of each other. In other words, the virtual-view image distortion E_T can be expressed as the sum of the virtual viewpoint rendering distortion E_CW caused by the multi-view video distortion and the virtual viewpoint rendering distortion E_DW caused by the multi-view depth map distortion.
Furthermore, when the virtual view is generated from the videos and depth maps of the left and right viewpoints, a small number of hole pixels may still exist in the virtual-view image because of discontinuities of the depth maps; the error caused by filling these holes by image inpainting is denoted E_O. Because an image has a certain continuity, the luminance or chrominance distribution P^o_{i_o}(y) of an occluded pixel i_o can be obtained from the pixels adjacent to it, and the luminance or chrominance error introduced by inpainting this pixel can be expressed as e^o_{i_o} = Σ_{y ≠ y_in} P^o_{i_o}(y) × (y − y_in), where y ranges over the possible luminance or chrominance values of pixel i_o and y_in is the luminance or chrominance value of pixel i_o after inpainting. The virtual-view image error introduced by image inpainting can therefore be expressed as E_O = Σ_{i_o} e^o_{i_o} = Σ_{i_o} Σ_{y ≠ y_in} P^o_{i_o}(y) × (y − y_in), where i_o is an occluded pixel, P^o_{i_o}(y) is the luminance or chrominance distribution of the pixels adjacent to the occluded pixel i_o, y is the range of luminance or chrominance values of pixel i_o, and y_in is the luminance or chrominance value of the occluded pixel i_o after inpainting. The error E_T of the rendered virtual-view image is then E_T = E_CW + E_DW + E_O.
Specifically, for the virtual view in this example, E_O = 0.13. Therefore the error of the rendered virtual-view image is E_T = E_CW + E_DW + E_O = 1.25 + 4.83 + 0.13 = 6.21, i.e. the virtual-view image quality is 40.2 dB.
Step S502: establish the relation between the coding bit rate needed for virtual viewpoint rendering and the coding quantization parameters. In the stereo video system defined above, the multi-view video and the multi-view depth maps are encoded separately by the multi-view video encoder, and their coding quantization parameters are also chosen independently. Therefore the bit rate r_t required for stereo video coding is the sum of the multi-view video coding bit rate r_c and the multi-view depth map coding bit rate r_d, i.e. r_t = r_c + r_d.
Specifically, in an embodiment of the invention, when QP = 28, r_t = r_c + r_d = 0.096 bpp + 0.0665 bpp = 0.1625 bpp.
Finally, by collecting the rendered virtual-view image distortion E_T and the total bit rate r_t required for multi-view video and multi-view depth map coding under different QPs, the rate-distortion model of rendering the virtual-view image in stereo video coding is obtained.
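Putting the pieces together, the QP sweep of step S305 can be sketched as follows (hypothetical, self-contained Python; the rate fits are the "Breakdancer" values of steps S101 and S221, while E_CW, E_DW and E_O would come from the measurements described in steps S303 to S501).

    import math

    def virtual_view_rd_point(qp, e_cw, e_dw, e_o):
        # One (bit rate, quality) point of the virtual-view R-D model for a given QP.
        q = 2 ** ((qp - 4) / 6)
        r_c = 24.37 / q ** 2 - 0.303 / q + 0.02   # multi-view video bit rate, bpp
        r_d = 0.9996 / q + 0.0040                 # multi-view depth map bit rate, bpp
        e_t = e_cw + e_dw + e_o                   # total rendering error (MSE)
        psnr_t = 10 * math.log10(255 ** 2 / e_t)  # virtual-view quality, dB
        return r_c + r_d, psnr_t

    # QP = 28 with E_CW = 1.25, E_DW = 4.83 and E_O = 0.13 gives approximately
    # (0.1625 bpp, 40.2 dB), matching the example above; sweeping QP and collecting
    # these points yields the virtual-view rate-distortion model of step S305.
    rate, psnr = virtual_view_rd_point(28, 1.25, 4.83, 0.13)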
With the analysis method for the coding and rendering rate distortion of virtual viewpoint rendering in stereo video proposed by the present invention, the coding and rendering rate-distortion performance of virtual viewpoint rendering in stereo video can be estimated quickly and accurately, thereby providing model guidance and solutions for problems such as stereo video coding parameter selection and bit rate allocation and further improving the efficiency of stereo video coding.
Although embodiments of the invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the appended claims and their equivalents.

Claims (10)

1. A method for estimating the rate-distortion performance of stereo video coding, characterized by comprising the following steps:
obtaining a multi-view video, and obtaining a corresponding multi-view depth map from said multi-view video;
obtaining a multi-view video coding rate-distortion model and a multi-view depth map coding rate-distortion model from said multi-view video and said multi-view depth map, respectively;
obtaining, from said multi-view video and said multi-view depth map, the relation between said multi-view video coding distortion and the virtual viewpoint rendering distortion, and the relation between said multi-view depth map coding distortion and the virtual viewpoint rendering distortion;
obtaining, from the relation between said multi-view video coding distortion and the virtual viewpoint rendering distortion and the relation between said multi-view depth map coding distortion and the virtual viewpoint rendering distortion, the relation between the virtual viewpoint rendering distortion and the respective coding quantization parameters QP of the multi-view video and the multi-view depth map;
obtaining, from said multi-view video coding rate-distortion model and multi-view depth map coding rate-distortion model, the relation between the coding bit rate needed to render said virtual view and said QP; and collecting the rendering distortion of said virtual view and the coding bit rate needed to render said virtual view under different QPs to obtain the rate-distortion model of the virtual view in stereo video coding.
2. The method for estimating the rate-distortion performance of stereo video coding according to claim 1, characterized in that the multi-view video coding is implemented with the multi-view extension of the H.264/AVC video coding standard.
3. The method for estimating the rate-distortion performance of stereo video coding according to claim 2, characterized in that the intra-coded frames, forward predictive coded frames and bi-directional predictive coded frames of each group of pictures in each view are all encoded with the same quantization parameter QP.
4. The method for estimating the rate-distortion performance of stereo video coding according to claim 1, characterized in that, for a given multi-view video coding quantization parameter:
the bit rate of multi-view video coding is estimated as r_c = a/Q_step^2 + b/Q_step + c, wherein Q_step = 2^((QP-4)/6), a is 24.37, b is -0.303 and c is 0.02;
the distortion of multi-view video coding is estimated as PSNR_c = q_c × QP + p_c, wherein q_c is -0.37 and p_c is 49.30.
5. The method for estimating the rate-distortion performance of stereo video coding according to claim 1, characterized in that, for a given multi-view depth map coding quantization parameter:
the bit rate of multi-view depth map coding is estimated as r_d = k/Q_step + t, wherein Q_step = 2^((QP-4)/6), k is 0.9996 and t is 0.0040;
the distortion of multi-view depth map coding is computed as PSNR_d = q_d × QP + p_d, wherein q_d is -0.65 and p_d is 63.75.
6. The method for estimating the rate-distortion performance of stereo video coding according to claim 1, characterized in that obtaining the relation between the multi-view video coding distortion and the virtual viewpoint rendering distortion further comprises:
assuming that the multi-view depth map coding is distortion-free, so that the distortion of said virtual view is caused only by said multi-view video coding distortion, and selecting the M reference views needed to render the virtual view C_v; the virtual-view image distortion is then E_CW = Σ_{i=1}^{M} E_CW^{Ref_i}, wherein E_CW^{Ref_i} = η_{Ref_i} × w_{Ref_i} × E_C^{Ref_i}, η_{Ref_i} is the ratio of the number of occluded pixels to the number of pixels of the whole reference frame, w_{Ref_i} is the weight coefficient of the video coding distortion of the i-th reference view in the rendered virtual-view image distortion, and E_C^{Ref_i} is the video coding distortion of the i-th reference view Ref_i.
7. The method for estimating the rate-distortion performance of stereo video coding according to claim 1, characterized in that obtaining the relation between the multi-view depth map coding distortion and the virtual viewpoint rendering distortion further comprises:
assuming that the multi-view video coding is distortion-free, so that the distortion of said virtual view is caused only by said multi-view depth map coding distortion, and selecting the M reference views needed for the virtual view C_v; the virtual-view image distortion is then E_DW = Σ_{i=1}^{M} η_{Ref_i} × w_{Ref_i} × E_DW^{Ref_i}, wherein η_{Ref_i} is the ratio of the number of occluded pixels to the number of pixels of the whole reference frame, w_{Ref_i} is the weight coefficient of the video coding distortion of the i-th reference view in the rendered virtual-view image distortion, E_DW^{Ref_i} is the sum over the regions B_j of the reference view of the distortions E_DW^{B_j}, and E_DW^{B_j} is the distortion of the region of the rendered virtual-view image corresponding to region B_j of the reference view.
8. The method for estimating the rate-distortion performance of stereo video coding according to claim 7, characterized in that E_DW^{B_j} = ψ_{C^j} × ||Δd_j||^2, wherein the computing formula of ψ_{C^j} is ψ_{C^j} = 1/(2(2π)^2) × (∬ S_{C^j}(ω) ω_1^2 dω + ∬ S_{C^j}(ω) ω_2^2 dω), S_{C^j}(ω) is the matrix obtained by Fourier transforming the matrix formed by the luminance or chrominance values of region C^j of the reference-view video frame, ω = [ω_1, ω_2] are the angular frequencies of the region in the horizontal and vertical directions, and ||Δd_j|| is the average distortion of the disparity.
9. The method for estimating the rate-distortion performance of stereo video coding according to claim 1, characterized in that obtaining the relation between the distortion of the virtual view and the coding quantization parameter QP from the relation between the multi-view video coding distortion and the virtual viewpoint rendering distortion and the relation between the multi-view depth map coding distortion and the virtual viewpoint rendering distortion further comprises:
the error E_T of the rendered virtual-view image is E_T = E_CW + E_DW + E_O, wherein E_O is the error caused by filling holes by image inpainting, E_O = Σ_{i_o} e^o_{i_o} = Σ_{i_o} Σ_{y ≠ y_in} P^o_{i_o}(y) × (y − y_in), wherein i_o is an occluded pixel, P^o_{i_o}(y) is the luminance or chrominance distribution of the pixels adjacent to the occluded pixel i_o, y is the range of luminance or chrominance values of pixel i_o, and y_in is the luminance or chrominance value of the occluded pixel i_o after inpainting.
10. The method for estimating the rate-distortion performance of stereo video coding according to claim 1, characterized in that obtaining, from said multi-view video coding rate-distortion model and multi-view depth map coding rate-distortion model, the relation between the coding bit rate needed to render said virtual view and the multi-view video coding bit rate and the multi-view depth map coding bit rate further comprises:
the bit rate required for stereo video coding is r_t = r_c + r_d, wherein r_c is the multi-view video coding bit rate and r_d is the multi-view depth map coding bit rate.
CN 201010222351 2010-06-30 2010-06-30 Estimation method of distortion performance of stereo video encoding rate Active CN101888566B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010222351 CN101888566B (en) 2010-06-30 2010-06-30 Estimation method of distortion performance of stereo video encoding rate

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010222351 CN101888566B (en) 2010-06-30 2010-06-30 Estimation method of distortion performance of stereo video encoding rate

Publications (2)

Publication Number Publication Date
CN101888566A CN101888566A (en) 2010-11-17
CN101888566B true CN101888566B (en) 2012-02-15

Family

ID=43074245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010222351 Active CN101888566B (en) 2010-06-30 2010-06-30 Estimation method of distortion performance of stereo video encoding rate

Country Status (1)

Country Link
CN (1) CN101888566B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102158710B (en) * 2011-05-27 2012-12-26 山东大学 Depth view encoding rate distortion judgment method for virtual view quality
CN102387368B (en) * 2011-10-11 2013-06-19 浙江工业大学 Fast selection method of inter-view prediction for multi-view video coding (MVC)
CN102413353B (en) * 2011-12-28 2014-02-19 清华大学 Method for allocating code rates of multi-view video and depth graph in stereo video encoding process
CN102595166B (en) * 2012-03-05 2014-03-05 山东大学 Lagrange factor calculation method applied for depth image encoding
CN102572439B (en) * 2012-03-14 2014-02-12 清华大学深圳研究生院 Method for determining optimal multi-viewpoint video coding mode for coding
CN102572440B (en) * 2012-03-15 2013-12-18 天津大学 Multi-viewpoint video transmission method based on depth map and distributed video coding
CN102769749B (en) * 2012-06-29 2015-03-18 宁波大学 Post-processing method for depth image
WO2014050830A1 (en) * 2012-09-25 2014-04-03 日本電信電話株式会社 Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, image decoding program, and recording medium
CN104284196B (en) * 2014-10-28 2017-06-30 天津大学 The colored bit with deep video combined coding is distributed and rate control algorithm
CN104506856B (en) * 2015-01-14 2017-03-22 山东大学 Method of estimating quality of virtual view applicable to 3D (Three-dimensional) video system
EP3236657A1 (en) * 2016-04-21 2017-10-25 Ultra-D Coöperatief U.A. Dual mode depth estimator
CN106162198B (en) * 2016-08-31 2019-02-15 重庆邮电大学 3 D video depth map encoding and coding/decoding method based on irregular homogeneous piece of segmentation
CN106210741B (en) * 2016-09-10 2018-12-21 天津大学 A kind of deep video encryption algorithm based on correlation between viewpoint
WO2018079260A1 (en) * 2016-10-25 2018-05-03 ソニー株式会社 Image processing device and image processing method
CN108769815B (en) * 2018-06-21 2021-02-26 威盛电子股份有限公司 Video processing method and device
CN110536137B (en) * 2019-08-30 2021-12-10 无锡北邮感知技术产业研究院有限公司 Left view video flow prediction method and device in 3D video

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1471319A (en) * 2002-07-22 2004-01-28 中国科学院计算技术研究所 Association rate distortion optimized code rate control method and apparatus thereof
CN101466038A (en) * 2008-12-17 2009-06-24 宁波大学 Method for encoding stereo video
CN101729891A (en) * 2009-11-05 2010-06-09 宁波大学 Method for encoding multi-view depth video

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1471319A (en) * 2002-07-22 2004-01-28 中国科学院计算技术研究所 Association rate distortion optimized code rate control method and apparatus thereof
CN101466038A (en) * 2008-12-17 2009-06-24 宁波大学 Method for encoding stereo video
CN101729891A (en) * 2009-11-05 2010-06-09 宁波大学 Method for encoding multi-view depth video

Also Published As

Publication number Publication date
CN101888566A (en) 2010-11-17

Similar Documents

Publication Publication Date Title
CN101888566B (en) Estimation method of distortion performance of stereo video encoding rate
CN103179405B (en) A kind of multi-view point video encoding method based on multi-level region-of-interest
KR100679740B1 (en) Method for Coding/Decoding for Multiview Sequence where View Selection is Possible
EP2938069B1 (en) Depth-image decoding method and apparatus
CN101835056B (en) Allocation method for optimal code rates of texture video and depth map based on models
CN101243692B (en) Method and apparatus for encoding multiview video
JP5232866B2 (en) Video encoding method, video decoding method, video coder and video decoder
CN102413353B (en) Method for allocating code rates of multi-view video and depth graph in stereo video encoding process
CN101466038B (en) Method for encoding stereo video
DE112017002339T5 (en) Method and apparatus for mapping omnidirectional images into an array output format
Merkle et al. The effect of depth compression on multiview rendering quality
DE102013015821B4 (en) System and method for improving video coding using content information
CN101390396A (en) Method and apparatus for encoding and decoding multi-view video to provide uniform picture quality
CN102970540A (en) Multi-view video code rate control method based on key frame code rate-quantitative model
CN101980538B (en) Fractal-based binocular stereoscopic video compression coding/decoding method
Daribo et al. Motion vector sharing and bitrate allocation for 3D video-plus-depth coding
Kang et al. Adaptive geometry-based intra prediction for depth video coding
CN101867816A (en) Stereoscopic video asymmetric compression coding method based on human-eye visual characteristic
CN101584220B (en) Method and system for encoding a video signal, encoded video signal, method and system for decoding a video signal
KR102028123B1 (en) Method and apparatus for multi-view video encoding, method and apparatus for multi-view decoding
CN101841726B (en) Three-dimensional video asymmetrical coding method
Jung et al. Disparity-map-based rendering for mobile 3D TVs
CN103414889A (en) Stereoscopic video bitrate control method based on binocular just-noticeable distortion
CN103379349B (en) A kind of View Synthesis predictive coding method, coding/decoding method, corresponding device and code stream
Scandarolli et al. Attention-weighted rate allocation in free-viewpoint television

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant