CN111726619A - Multi-view video bit distribution method based on virtual view quality model - Google Patents

Multi-view video bit distribution method based on virtual view quality model

Info

Publication number
CN111726619A
CN111726619A (application CN202010641620.9A)
Authority
CN
China
Prior art keywords
video
view
viewpoint
virtual
quality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010641620.9A
Other languages
Chinese (zh)
Other versions
CN111726619B (en)
Inventor
彭宗举 (Peng Zongju)
程金婷 (Cheng Jinting)
蒋东荣 (Jiang Dongrong)
陈芬 (Chen Fen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Funeng Information Technology Co.,Ltd.
Shenzhen Lizhuan Technology Transfer Center Co ltd
Original Assignee
Chongqing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Technology filed Critical Chongqing University of Technology
Priority to CN202010641620.9A priority Critical patent/CN111726619B/en
Publication of CN111726619A publication Critical patent/CN111726619A/en
Application granted granted Critical
Publication of CN111726619B publication Critical patent/CN111726619B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154 Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Generation (AREA)

Abstract

The invention discloses a multi-view video bit allocation method based on a virtual view quality model. The method first determines the bit allocation strategy between the two video types (texture video and depth video), and then determines the texture and depth video bit allocation scheme among viewpoints according to the baseline-distance weights between the virtual viewpoint and each reference viewpoint; these weights fully reflect the influence of each reference viewpoint's quality on the rendering quality of the virtual viewpoint. Compared with the prior-art method of allocating bits equally among viewpoints, the method can effectively improve the quality of virtual viewpoints; the degree of improvement is related to how far the virtual viewpoint deviates from the center of the reference viewpoints, and the farther the deviation, the more obvious the improvement. Meanwhile, because the bit allocation weight of each viewpoint is related to the fusion weight used during rendering, more bits are allocated to reference viewpoints closer to the virtual viewpoint, so video texture details are better preserved, thereby improving the quality of the rendered virtual viewpoint and the user's visual experience.

Description

Multi-view video bit distribution method based on virtual view quality model
Technical Field
The invention relates to the field of multi-view video bit allocation, in particular to a multi-view video bit allocation method based on a virtual view quality model.
Background
6DoF (six degrees of freedom) video is the development target of interactive media. In 6DoF video, a user can experience a scene from any angle at any position, obtaining an immersive, on-the-scene experience. 6DoF is a generic term for the three degrees of freedom of rotation about the x, y and z axes plus the three degrees of freedom of translation along the x, y and z axes. Currently, international standardization organizations are actively advancing standards related to 6DoF video applications. Multi-view color plus depth has been determined as one representation of a 6DoF application scene. In a 6DoF video system based on multi-view color and depth, virtual views are obtained by Depth-Image-Based Rendering (DIBR). Unlike a Free Viewpoint Video (FVV) system, a 6DoF video system supports scene experiences with more degrees of freedom, which must be realized with color and depth video from more viewpoints. The huge amount of data in multi-view color and depth video puts enormous pressure on transmission. Therefore, allocating bits to the relevant viewpoints according to the user's viewing position enables high-quality 6DoF video applications under limited network bandwidth.
Bit allocation for single-view video adopts a hierarchical strategy to allocate bits to different coding objects: first, a target number of bits is allocated to each group of pictures (GOP) according to the channel rate and the buffer state; then frame-level bit allocation is performed according to the weight of each frame image within the GOP; finally, a target number of bits is determined for each coding tree unit (CTU) in the image according to the total target bits of the current image. For multi-view video, bit allocation among viewpoints must additionally be considered on top of single-view bit allocation. Starting from the user's quality of experience, researchers have proposed a viewpoint-level bit allocation method based on a quality-of-experience model, but its strong dependence on the quality model gives it poor applicability. In a multi-view video system, a user does not always watch the same view from a fixed position; the watched view switches as the video content changes. For this situation, some researchers have provided a bit allocation scheme that depends on the viewpoint switching probability, with the allocation ratio changing with the switching probability; however, the viewpoint switching model only suits switching among existing viewpoints and is not applicable to systems with virtual viewpoints, so this bit allocation method has few application occasions.
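For illustration only, the hierarchical strategy described above can be sketched as follows in Python; the weight inputs are assumptions standing in for whatever frame- and CTU-level weighting a real rate controller computes.

```python
def hierarchical_allocation(channel_rate_bps, fps, gop_size,
                            frame_weights, ctu_weights):
    """Sketch of single-view hierarchical bit allocation:
    GOP level -> frame level -> CTU level.
    `frame_weights` (one per frame in the GOP) and `ctu_weights`
    (one per CTU of a frame) are assumed to each sum to 1."""
    gop_bits = channel_rate_bps * gop_size / fps          # GOP-level target
    frame_bits = [w * gop_bits for w in frame_weights]    # frame-level split
    ctu_bits = [w * frame_bits[0] for w in ctu_weights]   # CTUs of frame 0
    return gop_bits, frame_bits, ctu_bits
```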
The 6DoF video system is very complex, and the related standardization work is still advancing. Currently, MPEG has released multi-view color videos and corresponding depth video sequences captured by a planar camera array for standardized testing, but no bit allocation scheme exists for these sequences. Compared with multi-view color and depth video distributed in one dimension, multi-view color and depth video distributed in a plane contains both horizontal and vertical parallax. Therefore, the virtual viewpoint distortion models, user experience quality models, viewpoint switching probabilities and the like of conventional multi-view color and depth video are no longer suitable for planar-distributed multi-view color and depth video.
The invention provides a multi-view video bit allocation method based on a virtual view quality model, which aims at multi-view videos collected by a camera array arranged in a plane.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a multi-view video bit allocation method based on a virtual view quality model for multi-view videos captured by a planar camera array, which can effectively improve the subjective and objective quality of virtual viewpoints and improve the user's visual experience.
In order to solve the technical problems, the invention adopts the following technical scheme:
a multi-view video bit distribution method based on a virtual view quality model comprises the following steps:
S1, allocating bits between the texture video and the depth video in a preset ratio based on the current target number of bits $R$, where $R_{T,t}$ denotes the number of bits allocated to the texture video and $R_{T,d}$ denotes the number of bits allocated to the depth video;
S2, based on the position $(X_v, Y_v, Z_v)$ of the camera at the virtual viewpoint in three-dimensional space and the positions $(X_1, Y_1, Z_1)$, $(X_2, Y_2, Z_2)$, $(X_3, Y_3, Z_3)$ and $(X_4, Y_4, Z_4)$ of the cameras at the reference viewpoints surrounding the virtual viewpoint, calculating the baseline distance between the virtual viewpoint and each reference viewpoint, where the baseline distance $d_i$ between the virtual viewpoint and the $i$-th reference viewpoint is calculated as:

$$d_i = \sqrt{(X_v - X_i)^2 + (Y_v - Y_i)^2 + (Z_v - Z_i)^2}$$
S3, calculating the weight of the baseline distance between the virtual viewpoint and each reference viewpoint based on the baseline distances, where $w_i$ denotes the weight of the baseline distance between the virtual viewpoint and the $i$-th reference viewpoint:

$$w_i = \frac{1/d_i}{\sum_{j=1}^{4} 1/d_j}$$
S4, calculating the viewpoint-level bit allocation weight of the texture video of each reference viewpoint based on the weights of the baseline distances between the virtual viewpoint and the reference viewpoints, where $W_{t,i}$ denotes the viewpoint-level bit allocation weight of the texture video of the $i$-th reference viewpoint;

S5, calculating the viewpoint-level bit allocation weight of the depth video of each reference viewpoint based on the weights of the baseline distances between the virtual viewpoint and the reference viewpoints, where $W_{d,i}$ denotes the viewpoint-level bit allocation weight of the depth video of the $i$-th reference viewpoint;
S6, based on the number of bits $R_{T,t}$ allocated to the texture video and the viewpoint-level bit allocation weights of the texture videos, calculating the number of bits allocated to the texture video of each reference viewpoint, where $R_{t,i}$ denotes the number of bits allocated to the texture video of the $i$-th reference viewpoint:

$$R_{t,i} = W_{t,i} \times R_{T,t}$$

S7, based on the number of bits $R_{T,d}$ allocated to the depth video and the viewpoint-level bit allocation weights of the depth videos, calculating the number of bits allocated to the depth video of each reference viewpoint, where $R_{d,i}$ denotes the number of bits allocated to the depth video of the $i$-th reference viewpoint:

$$R_{d,i} = W_{d,i} \times R_{T,d}$$
S8, independently encoding the texture and depth videos of each viewpoint using the HM platform according to the number of bits allocated to the texture video and the depth video of each viewpoint.
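A minimal Python sketch of steps S1 to S3 follows (an illustration, not code from the filing); the inverse-distance form of the weight normalization in `baseline_weights` matches the reconstruction of step S3 above and should be read as an assumption.

```python
import numpy as np

def baseline_distances(virtual_pos, reference_positions):
    """Step S2: Euclidean baseline distance between the virtual camera
    and each of the four surrounding reference cameras."""
    v = np.asarray(virtual_pos, dtype=float)
    refs = np.asarray(reference_positions, dtype=float)   # shape (4, 3)
    return np.linalg.norm(refs - v, axis=1)

def baseline_weights(distances):
    """Step S3: normalized weights; closer reference viewpoints
    receive larger weights (assumed inverse-distance form)."""
    inv = 1.0 / np.maximum(np.asarray(distances), 1e-9)
    return inv / inv.sum()

def split_texture_depth(R, ratio=5.0):
    """Step S1 with the preferred 5:1 texture/depth split."""
    return R * ratio / (ratio + 1.0), R / (ratio + 1.0)

# Example: virtual camera between four reference cameras on a plane.
d = baseline_distances([0.3, 0.4, 0.0],
                       [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]])
w = baseline_weights(d)
R_Tt, R_Td = split_texture_depth(R=6_000_000)
```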
Preferably, the preset ratio in step S1 is 5:1, i.e.:

$$R_{T,t} = \frac{5}{6}R, \qquad R_{T,d} = \frac{1}{6}R$$
preferably, in step S4:
the quality $Q_T$ of the virtual viewpoint without depth distortion and the coding quantization parameters of the texture videos satisfy the following formula:

$$Q_T = \sum_{i=1}^{4} \xi_i \, QP_{t,i} + C_T$$

where $\xi_i$ is the linear coefficient in the texture-video-related virtual view quality model corresponding to the $i$-th reference viewpoint, $QP_{t,i}$ is the coding quantization parameter of the texture video of the $i$-th reference viewpoint, and $C_T$ is a variable independent of the reference viewpoint compression distortion;

$\xi_i$ and $w_i$ satisfy the relation $\xi_i = m_t w_i + n_t$, where $m_t$ and $n_t$ are coefficients obtained by linear fitting, with values $-0.684$ and $0.020$ respectively;
In H.265/HEVC, the relationship between the Lagrange multiplier $\lambda$ and the video coding distortion $D$ satisfies:

$$D = \alpha \lambda^{\beta}$$

where $\alpha$ and $\beta$ are model parameters related to the characteristics of the video content;

the video coding quality $Q$ and the video coding distortion $D$ satisfy:

$$Q = 10 \times \log_{10} \frac{255^2}{D}$$

the relationship between the coding quantization parameter $QP$ and $\lambda$ satisfies:

$$QP = 4.2005 \ln\lambda + 13.7122$$

then $QP$ and $Q$ satisfy:

$$QP = a \times Q + b$$

where $a = -0.996/\beta$ and $b = -9.9612\ln\alpha/\beta + 47.9440/\beta + 14.1221$;
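As a quick numeric illustration (not part of the original text), the relation $QP = aQ + b$ can be evaluated directly with the constants given above; the values of $\alpha$ and $\beta$ in the example are hypothetical.

```python
import math

def qp_from_quality(Q, alpha, beta):
    """QP = a*Q + b with a = -0.996/beta and
    b = -9.9612*ln(alpha)/beta + 47.9440/beta + 14.1221."""
    a = -0.996 / beta
    b = -9.9612 * math.log(alpha) / beta + 47.9440 / beta + 14.1221
    return a * Q + b

# Hypothetical content parameters, for demonstration only:
print(round(qp_from_quality(Q=38.0, alpha=0.1, beta=1.2), 2))  # ~41.65
```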
Let the coding quality of the color video of the $i$-th reference viewpoint be $Q_{t,i}$, with corresponding content-dependent model parameters $\alpha_{t,i}$ and $\beta_{t,i}$, so that $QP_{t,i} = a_{t,i} Q_{t,i} + b_{t,i}$; the virtual viewpoint quality prediction model related to the texture video is then expressed as:

$$Q_T = \sum_{i=1}^{4} \xi_i \left( a_{t,i} Q_{t,i} + b_{t,i} \right) + C_T$$
According to the principle that quality is best when distortion is minimized, reasonably allocating the number of texture video bits among the reference viewpoints maximizes the virtual viewpoint quality related to texture video quality, and thus minimizes the texture-related virtual viewpoint distortion, i.e.:

$$\max_{\{R_{t,i}\}} Q_T \quad \text{s.t.} \quad \sum_{i=1}^{4} R_{t,i} = R_{T,t}$$
The problem is converted into a constrained optimization problem by introducing a Lagrange multiplier $\lambda_T$ to construct the cost function:

$$J_T = \sum_{i=1}^{4} \xi_i \left( a_{t,i} Q_{t,i} + b_{t,i} \right) + C_T + \lambda_T \left( \sum_{i=1}^{4} R_{t,i} - R_{T,t} \right)$$
The optimal solution must satisfy:

$$\frac{\partial J_T}{\partial R_{t,i}} = 0, \quad i = 1, 2, 3, 4$$

namely:

$$\xi_i \, a_{t,i} \, \frac{\partial Q_{t,i}}{\partial R_{t,i}} = -\lambda_T, \quad i = 1, 2, 3, 4$$
Because:

$$Q = 10 \log_{10} \frac{255^2}{D}$$

it follows that:

$$\frac{\partial Q_{t,i}}{\partial R_{t,i}} = -\frac{10}{\ln 10} \cdot \frac{1}{D_{t,i}} \cdot \frac{\partial D_{t,i}}{\partial R_{t,i}}$$
In H.265/HEVC, the following formula is satisfied:

$$D = C \times R^{K}$$

where $C$ and $K$ are parameters related to the video content and coding characteristics;
In a planar-distributed multi-view color and depth video system, the video content and coding characteristics of the four reference viewpoints around the virtual viewpoint are similar, so the $C$, $K$ and $a_{t,i}$ corresponding to the reference viewpoints may each be regarded as approximately equal. Substituting $D = C R^{K}$ (for which $\frac{1}{D}\frac{\partial D}{\partial R} = \frac{K}{R}$) into the optimality condition gives:

$$\frac{10 K}{\ln 10} \cdot \frac{\xi_i \, a_{t,i}}{R_{t,i}} = \lambda_T, \quad i = 1, 2, 3, 4$$

which is equivalent to:

$$\frac{R_{t,1}}{\xi_1} = \frac{R_{t,2}}{\xi_2} = \frac{R_{t,3}}{\xi_3} = \frac{R_{t,4}}{\xi_4}$$
Then the bit allocation weight corresponding to the texture video of the $i$-th reference viewpoint is:

$$W_{t,i} = \frac{\xi_i}{\sum_{j=1}^{4} \xi_j} = \frac{m_t w_i + n_t}{m_t + 4 n_t}$$
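A small sketch of this texture weight computation (illustrative only), using the fitted coefficients $m_t = -0.684$ and $n_t = 0.020$ reported above; it assumes the four baseline-distance weights sum to 1.

```python
def texture_view_weights(w, m_t=-0.684, n_t=0.020):
    """Viewpoint-level texture bit allocation weights
    W_{t,i} = xi_i / sum_j xi_j, with xi_i = m_t * w_i + n_t."""
    xi = [m_t * wi + n_t for wi in w]
    total = sum(xi)
    return [x / total for x in xi]

# A virtual viewpoint closest to reference viewpoint 1:
print(texture_view_weights([0.4, 0.3, 0.2, 0.1]))
# -> about [0.42, 0.31, 0.19, 0.08]; a larger baseline-distance weight
#    yields a larger share of the texture bit budget.
```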
preferably, in step S5:
the quality $Q_D$ of the virtual viewpoint without texture distortion and the coding quantization parameters of the depth videos satisfy:

$$Q_D = \sum_{i=1}^{4} \zeta_i \, QP_{d,i} + C_D$$

where $\zeta_i$ is the linear coefficient in the depth-video-related virtual view quality model corresponding to the $i$-th reference viewpoint, $QP_{d,i}$ is the coding quantization parameter of the depth video of the $i$-th reference viewpoint, and $C_D$ is a variable independent of the reference viewpoint compression distortion;

$\zeta_i$ and $w_i$ satisfy the relation $\zeta_i = m_d w_i + n_d$, where $m_d$ and $n_d$ are coefficients obtained by linear fitting, with values $-0.194$ and $-0.105$ respectively;
Similarly to step S4, the following can be theoretically derived:

$$\frac{R_{d,1}}{\zeta_1} = \frac{R_{d,2}}{\zeta_2} = \frac{R_{d,3}}{\zeta_3} = \frac{R_{d,4}}{\zeta_4}$$

Then the bit allocation weight corresponding to the depth video of the $i$-th reference viewpoint is:

$$W_{d,i} = \frac{\zeta_i}{\sum_{j=1}^{4} \zeta_j} = \frac{m_d w_i + n_d}{m_d + 4 n_d}$$
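The depth weights follow the same pattern, and together with `texture_view_weights` from the previous sketch they complete steps S4 to S7 (again an illustration, using the fitted coefficients from the description):

```python
def depth_view_weights(w, m_d=-0.194, n_d=-0.105):
    """W_{d,i} = zeta_i / sum_j zeta_j, with zeta_i = m_d * w_i + n_d."""
    zeta = [m_d * wi + n_d for wi in w]
    total = sum(zeta)
    return [z / total for z in zeta]

def per_view_bits(R, w, ratio=5.0):
    """Steps S6 and S7: distribute the texture and depth budgets
    (preset 5:1 split) over the four reference viewpoints."""
    R_Tt, R_Td = R * ratio / (ratio + 1.0), R / (ratio + 1.0)
    R_t = [wt * R_Tt for wt in texture_view_weights(w)]
    R_d = [wd * R_Td for wd in depth_view_weights(w)]
    return R_t, R_d
```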
Compared with the prior art, the invention has the following advantages: considering that, during virtual viewpoint rendering, the influence of each reference viewpoint on virtual viewpoint quality is related to the baseline-distance weights between cameras, the viewpoint-level bit allocation weights of texture video and depth video related to the camera baseline-distance weights are theoretically derived on the basis of the constructed virtual viewpoint quality model, and texture distortion and depth distortion are minimized according to these weights, thereby minimizing the virtual viewpoint distortion. Compared with the viewpoint-level bit equipartition method, the method can effectively improve the rendering quality of virtual viewpoints; the degree of improvement is related to how far the virtual viewpoint deviates from the center of the reference viewpoints, and the farther the deviation, the more obvious the improvement.
Drawings
FIG. 1 is a block diagram of an overall implementation of the method of the present invention;
FIG. 2 is a schematic view of virtual viewpoint rendering;
FIG. 3a is a graph of the effect of the sequence 'OrangeKitchen' texture video QP on the quality of a virtual view when there is no depth distortion for the reference view;
FIG. 3b is the effect of the sequence 'OrangeKitchen' depth video QP on the quality of the virtual view when there is no texture distortion for the reference view;
FIG. 4a is a relationship between texture video linear coefficients and baseline distance weights in a virtual viewpoint quality model;
FIG. 4b is a relationship between depth video linear coefficients and baseline distance weights in a virtual viewpoint quality model;
FIG. 5a is a schematic diagram of a diagonal path;
FIG. 5b is a schematic diagram of a free viewing path;
FIG. 6a is the effect diagram and partial enlarged view of the 50th frame of the sequence 'TechnicolorPainter' rendered from the distortion-free reference viewpoints;
FIG. 6b is a partial enlarged view of the four original reference viewpoints corresponding to the enlarged area of FIG. 6a;
FIG. 6c is the effect diagram and partial enlarged view of the 50th frame of the sequence 'TechnicolorPainter' rendered from the reference viewpoints coded by the equipartition bit method at QP (40, 45);
FIG. 6d is a partial enlarged view of the four coded reference viewpoints corresponding to the enlarged area of FIG. 6c;
FIG. 6e is the effect diagram and partial enlarged view of the 50th frame of the sequence 'TechnicolorPainter' rendered from the reference viewpoints coded by the method of the present invention at QP (40, 45);
FIG. 6f is a partial enlarged view of the four coded reference viewpoints corresponding to the enlarged area of FIG. 6e.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in fig. 1, the present invention discloses a multi-view video bit allocation method based on a virtual view quality model, which comprises the following steps:
S1, allocating bits between the texture video and the depth video in a preset ratio based on the current target number of bits $R$, where $R_{T,t}$ denotes the number of bits allocated to the texture video and $R_{T,d}$ denotes the number of bits allocated to the depth video;
In this embodiment, the reason why bits are first allocated between the texture video and the depth video is as follows. Considering only the virtual viewpoint distortion caused by coding distortion, let the virtual view rendered from the original reference viewpoints be denoted $S_v$, let the virtual view rendered from the coding-distorted reference viewpoints be denoted $\hat{S}_v$, and let the virtual view rendered from the original texture videos together with the coding-distorted depth videos be denoted $\tilde{S}_v$. The virtual viewpoint distortion $D_v$ is:

$$D_v = E\left[\left(S_v - \hat{S}_v\right)^2\right] = E\left[\left(S_v - \tilde{S}_v\right)^2\right] + E\left[\left(\tilde{S}_v - \hat{S}_v\right)^2\right] + 2E\left[\left(S_v - \tilde{S}_v\right)\left(\tilde{S}_v - \hat{S}_v\right)\right]$$

Since $\left(S_v - \tilde{S}_v\right)$ and $\left(\tilde{S}_v - \hat{S}_v\right)$ are random and uncorrelated, the cross term is negligible. In a planar-distributed multi-view color and depth video system, as shown in the virtual view rendering diagram of FIG. 2, the virtual viewpoint pixel value $S(x_v, y_v)$ is determined from the pixel values $S_1(x_1, y_1)$, $S_2(x_2, y_2)$, $S_3(x_3, y_3)$ and $S_4(x_4, y_4)$ of the four reference viewpoints:

$$S(x_v, y_v) = \sum_{i=1}^{4} w_i \, S_i(x_i, y_i)$$
where $w_i$ denotes the weight of the baseline distance between the virtual viewpoint and the $i$-th reference viewpoint. The virtual viewpoint distortion can then be further expressed as a double sum over the reference-viewpoint error terms; when $i \neq j$, the error terms $\left(S_i - \hat{S}_i\right)$ and $\left(S_j - \hat{S}_j\right)$ of different reference viewpoints (where $\hat{S}_i$ denotes the corresponding pixel value of the coding-distorted $i$-th reference viewpoint) are random and uncorrelated, so the cross terms vanish and:

$$D_v = \sum_{i=1}^{4} w_i^2 \, E\left[\left(S_i - \hat{S}_i\right)^2\right]$$

Thus, the following virtual viewpoint quality model can be constructed:

$$D_v = D_T + D_D$$

where $D_T$ and $D_D$ represent the virtual viewpoint distortion caused by color video coding distortion and by depth video coding distortion, respectively.
Therefore, the virtual viewpoint distortion can be decomposed into two parts related to the texture video and the depth video, and $D_T$ and $D_D$ can be minimized by adjusting the bit allocation of the texture and depth videos of each viewpoint, thereby minimizing the virtual viewpoint distortion. Accordingly, the invention implements bit allocation for multi-view video with a two-level scheme: first between the video types (texture and depth video), and then among the reference viewpoints.
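The additivity used above is easy to check numerically. The sketch below (illustrative only) fuses four reference views with baseline-distance weights and injects independent zero-mean noise standing in for texture- and depth-induced errors; because the cross terms vanish in expectation, the measured $D_v$ is close to $D_T + D_D$.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([0.4, 0.3, 0.2, 0.1])            # baseline-distance weights
S = rng.uniform(0, 255, size=(4, 100_000))    # reference-view pixel samples

# Independent zero-mean errors standing in for coding noise:
e_tex = rng.normal(0.0, 4.0, size=S.shape)    # texture-induced error
e_dep = rng.normal(0.0, 2.5, size=S.shape)    # depth-induced error (simplified
                                              # here as additive pixel noise)
S_v = w @ S                                   # distortion-free virtual view
S_hat = w @ (S + e_tex + e_dep)               # both distortions present

D_v = np.mean((S_v - S_hat) ** 2)
D_T = np.mean((w @ e_tex) ** 2)
D_D = np.mean((w @ e_dep) ** 2)
print(D_v, D_T + D_D)                         # nearly equal: cross terms vanish
```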
S2, based on the position (X) of the camera at the virtual viewpoint in the three-dimensional spacev,Yv,Zv) And the position (X) of the camera at the reference viewpoint around the virtual viewpoint1,Y1,Z1)、(X2,Y2,Z2)、(X3,Y3,Z3) And (X)4,Y4,Z4) Calculating the baseline distance between the virtual viewpoint and each reference viewpoint, and the baseline between the virtual viewpoint and the ith reference viewpointDistance diThe calculation formula of (2) is as follows:
Figure BDA0002571364270000091
s3, calculating the weight of the base line distance between the virtual viewpoint and each reference viewpoint based on the base line distance between the virtual viewpoint and each reference viewpoint, wiA weight representing a baseline distance between the virtual viewpoint and the ith reference viewpoint:
Figure BDA0002571364270000092
s4, calculating the view-level bit distribution weight of the texture video of each reference view based on the weight of the base line distance between the virtual view and each reference view, Wt,iA view level bit allocation weight of the texture video representing the ith reference view;
s5, calculating a view-level bit allocation weight, W, of the depth video of each reference view based on the weight of the baseline distance between the virtual view and each reference viewd,iA view level bit allocation weight of the depth video representing the ith reference view;
s6, number of bits R assigned based on texture videoT,tAnd calculating the bit number of texture video distribution of each reference viewpoint by the viewpoint level bit distribution weight of the texture video of each reference viewpoint, Rt,iThe number of bits representing texture video allocation of the ith reference view,
Rt,i=Wt,i×RT,t
s7, number of bits R assigned based on depth videoT,dAnd calculating the bit number of the depth video distribution of each reference viewpoint by the viewpoint level bit distribution weight of the depth video of each reference viewpoint, Rd,iThe number of bits representing the depth video allocation of the ith reference view,
Rd,i=Wd,i×RT,d
S8, independently encoding the texture and depth videos of each viewpoint using the HM platform according to the number of bits allocated to the texture video and the depth video of each viewpoint.
In specific implementation, the preset ratio in step S1 is 5:1,
Figure BDA0002571364270000101
in the specific implementation, in step S4:
In order to study the relationship between texture video distortion and virtual viewpoint distortion, under the condition of no depth distortion, the coding quantization parameter (QP) of the texture video of one viewpoint is varied while the QPs of the other three viewpoints are fixed, and the virtual viewpoint quality obtained by rendering reconstructed color videos of different reconstruction quality combinations is studied. Experiments show that the quality $Q_T$ of the virtual viewpoint without depth distortion and the coding quantization parameters of the texture videos satisfy the following formula:

$$Q_T = \sum_{i=1}^{4} \xi_i \, QP_{t,i} + C_T$$

where $\xi_i$ is the linear coefficient in the texture-video-related virtual view quality model corresponding to the $i$-th reference viewpoint, $QP_{t,i}$ is the coding quantization parameter of the texture video of the $i$-th reference viewpoint, and $C_T$ is a variable independent of the reference viewpoint compression distortion. FIG. 3a shows the influence of the color video QP of the sequence 'OrangeKitchen' on the quality of the virtual viewpoint when the reference viewpoints have no depth distortion.

From the virtual viewpoint distortion model, the linear coefficient $\xi_i$ related to the texture video is related to the baseline distance weight $w_i$. Statistics over the $\xi_i$ and $w_i$ corresponding to different virtual viewpoints in different sequences show that $\xi_i$ and $w_i$ satisfy the relation $\xi_i = m_t w_i + n_t$, where $m_t$ and $n_t$ are coefficients obtained by linear fitting, with values $-0.684$ and $0.020$ respectively; the relationship between the texture video linear coefficients and the baseline distance weights in the virtual viewpoint quality model is shown in FIG. 4a.
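A sketch of how such a linear fit could be reproduced is given below; the $(w_i, \xi_i)$ samples are hypothetical placeholders for the measured values, chosen only so that the fitted line is close to the reported coefficients.

```python
import numpy as np

# Hypothetical (w_i, xi_i) samples over sequences and virtual viewpoints:
w_samples = np.array([0.10, 0.20, 0.25, 0.30, 0.40, 0.50])
xi_samples = np.array([-0.05, -0.12, -0.15, -0.18, -0.25, -0.32])

m_t, n_t = np.polyfit(w_samples, xi_samples, deg=1)  # least-squares line
print(m_t, n_t)  # close to the reported m_t = -0.684, n_t = 0.020
```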
In H.265/HEVC, the relationship between the Lagrange multiplier $\lambda$ and the video coding distortion $D$ satisfies:

$$D = \alpha \lambda^{\beta}$$

where $\alpha$ and $\beta$ are model parameters related to the characteristics of the video content;

the video coding quality $Q$ and the video coding distortion $D$ satisfy:

$$Q = 10 \times \log_{10} \frac{255^2}{D}$$

the relationship between the coding quantization parameter $QP$ and $\lambda$ satisfies:

$$QP = 4.2005 \ln\lambda + 13.7122$$

then $QP$ and $Q$ satisfy:

$$QP = a \times Q + b$$

where $a = -0.996/\beta$ and $b = -9.9612\ln\alpha/\beta + 47.9440/\beta + 14.1221$;
Let the coding quality of the color video of the $i$-th reference viewpoint be $Q_{t,i}$, with corresponding content-dependent model parameters $\alpha_{t,i}$ and $\beta_{t,i}$, so that $QP_{t,i} = a_{t,i} Q_{t,i} + b_{t,i}$; the virtual viewpoint quality prediction model related to the texture video is then expressed as:

$$Q_T = \sum_{i=1}^{4} \xi_i \left( a_{t,i} Q_{t,i} + b_{t,i} \right) + C_T$$

According to the principle that quality is best when distortion is minimized, reasonably allocating the number of texture video bits among the reference viewpoints maximizes the virtual viewpoint quality related to texture video quality, and thus minimizes the texture-related virtual viewpoint distortion, i.e.:

$$\max_{\{R_{t,i}\}} Q_T \quad \text{s.t.} \quad \sum_{i=1}^{4} R_{t,i} = R_{T,t}$$

The problem is converted into a constrained optimization problem by introducing a Lagrange multiplier $\lambda_T$ to construct the cost function:

$$J_T = \sum_{i=1}^{4} \xi_i \left( a_{t,i} Q_{t,i} + b_{t,i} \right) + C_T + \lambda_T \left( \sum_{i=1}^{4} R_{t,i} - R_{T,t} \right)$$

The optimal solution must satisfy:

$$\frac{\partial J_T}{\partial R_{t,i}} = 0, \quad i = 1, 2, 3, 4$$

namely:

$$\xi_i \, a_{t,i} \, \frac{\partial Q_{t,i}}{\partial R_{t,i}} = -\lambda_T, \quad i = 1, 2, 3, 4$$

Because:

$$Q = 10 \log_{10} \frac{255^2}{D}$$

it follows that:

$$\frac{\partial Q_{t,i}}{\partial R_{t,i}} = -\frac{10}{\ln 10} \cdot \frac{1}{D_{t,i}} \cdot \frac{\partial D_{t,i}}{\partial R_{t,i}}$$

In H.265/HEVC, the following formula is satisfied:

$$D = C \times R^{K}$$

where $C$ and $K$ are parameters related to the video content and coding characteristics;

since the video content and coding characteristics of the four viewpoints around the virtual viewpoint are similar in a planar-distributed multi-view color and depth video system, the $C$, $K$ and $a_{t,i}$ corresponding to the viewpoints can each be regarded as approximately equal; then:

$$\frac{10 K}{\ln 10} \cdot \frac{\xi_i \, a_{t,i}}{R_{t,i}} = \lambda_T, \quad i = 1, 2, 3, 4$$

which is equivalent to:

$$\frac{R_{t,1}}{\xi_1} = \frac{R_{t,2}}{\xi_2} = \frac{R_{t,3}}{\xi_3} = \frac{R_{t,4}}{\xi_4}$$

Then the bit allocation weight corresponding to the texture video of the $i$-th reference viewpoint is:

$$W_{t,i} = \frac{\xi_i}{\sum_{j=1}^{4} \xi_j} = \frac{m_t w_i + n_t}{m_t + 4 n_t}$$
in the specific implementation, in step S5:
To study the relationship between depth video distortion and virtual viewpoint distortion, under the condition of no texture distortion, the coding quantization parameter (QP) of the depth video of one viewpoint is varied while the QPs of the other three viewpoints are fixed, and the quality of virtual viewpoints rendered from reconstructed depth videos of different reconstruction quality combinations is studied. Experiments show that the quality $Q_D$ of the virtual viewpoint without texture distortion and the coding quantization parameters of the depth videos satisfy:

$$Q_D = \sum_{i=1}^{4} \zeta_i \, QP_{d,i} + C_D$$

where $\zeta_i$ is the linear coefficient in the depth-video-related virtual view quality model corresponding to the $i$-th reference viewpoint, $QP_{d,i}$ is the coding quantization parameter of the depth video of the $i$-th reference viewpoint, and $C_D$ is a variable independent of the reference viewpoint compression distortion. FIG. 3b shows the influence of the depth video QP of the sequence 'OrangeKitchen' on the quality of the virtual viewpoint when the reference viewpoints have no texture distortion.

From the virtual viewpoint distortion model, the linear coefficients related to the depth video are related to the baseline distance weights. Statistics over the $\zeta_i$ and $w_i$ corresponding to different virtual viewpoints in different sequences show that they satisfy $\zeta_i = m_d w_i + n_d$, where $m_d$ and $n_d$ are coefficients obtained by linear fitting, with values $-0.194$ and $-0.105$ respectively; the relationship between the depth video linear coefficients and the baseline distance weights in the virtual viewpoint quality model is shown in FIG. 4b.
Similarly, in step S4, the following formula can be theoretically derived:
Figure BDA0002571364270000131
then, the bit allocation weight corresponding to the ith reference viewpoint depth video is:
Figure BDA0002571364270000132
In summary, compared with the prior art, the invention has the following advantages: considering that, during virtual viewpoint rendering, the influence of each reference viewpoint on virtual viewpoint quality is related to the baseline-distance weights between cameras, the viewpoint-level bit allocation weights of texture video and depth video related to the camera baseline-distance weights are theoretically derived on the basis of the constructed virtual viewpoint quality model, and texture distortion and depth distortion are minimized according to these weights, thereby minimizing the virtual viewpoint distortion. Compared with the viewpoint-level bit equipartition method, the method can effectively improve the rendering quality of virtual viewpoints; the degree of improvement is related to how far the virtual viewpoint deviates from the center of the reference viewpoints, and the farther the deviation, the more obvious the improvement.
To further illustrate the feasibility and effectiveness of the method of the present invention, the following experiments were conducted.
In the present embodiment, the HEVC reference software HM16.20 is used to independently encode the color video and the depth video of each viewpoint, and the rendering platform VSRS4.3 provided by MPEG-I is used to perform virtual viewpoint rendering. The test sequences are listed in Table 1, where $x_i y_j$ indicates the viewpoint in the $i$-th column and $j$-th row of the corresponding sequence. The virtual viewpoint distribution is shown in FIGS. 5a and 5b, where $V_i$ represents the $i$-th reference viewpoint and $v_i$ represents the $i$-th virtual viewpoint. To measure the influence of coding distortion on virtual viewpoint quality, the peak signal-to-noise ratio (PSNR) is computed using, as the reference, the virtual viewpoint rendered from color and depth videos without coding distortion.
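A minimal sketch of this quality measurement (illustrative; array handling is assumed):

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """PSNR of a rendered virtual view against the virtual view rendered
    from distortion-free color and depth videos."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    return float("inf") if mse == 0.0 else 10.0 * np.log10(peak**2 / mse)
```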
TABLE 1 relevant parameters of the experimental sequences
In this embodiment, 4 code rates are selected for testing the virtual viewpoint in the diagonal path in fig. 5a, the corresponding color video QP is {25,30,35,40}, the depth video QP is {34,39,42,45}, and the test results are shown in table 2 and compared with the performance of the viewpoint-level bit average allocation method (i.e., the equipartition bit method). In table Qpro、QaveRespectively representing the corresponding virtual of the inventive method and equipartition bit methodViewpoint quality, measured as PSNR, the amount of improvement in quality is represented by Δ Q:
ΔQ=Qpro-Qave
As can be seen from Table 2, compared with the equipartition bit method, the bit allocation method based on the virtual viewpoint quality model provided by the invention obtains virtual viewpoints of higher quality, and the degree of quality improvement is related to the deviation of the virtual viewpoint position from the center of the four reference viewpoints. The bit allocation weights in the method of the invention are related to the fusion weights used when rendering the virtual viewpoint: a reference viewpoint closer to the virtual viewpoint has a larger fusion weight and therefore a larger influence on the virtual viewpoint quality, so it is allocated more bits, while a farther reference viewpoint is allocated fewer. In FIG. 5a, if the user experiences the scene at $v_1$, then when the method of the invention is adopted to transmit the reference viewpoints, more bits are allocated to the reference viewpoint $V_1$, which has the largest fusion proportion during rendering, thereby raising the quality of the virtual viewpoint $v_1$. The experimental results show that the farther the virtual viewpoint deviates from the center of the reference viewpoints, the larger the difference between the bit allocation weights of the method of the invention and those of the equipartition bit method, and the more significant the quality improvement of the virtual viewpoint. As shown in Table 2, the quality improvements of virtual viewpoints $v_1$ and $v_5$ are significantly higher than those of $v_2$, $v_3$ and $v_4$; since $v_1$ and $v_5$ are symmetric about the center, their quality improvements are comparable. When the virtual viewpoint is located at the center of the reference viewpoints, the allocation weights of the method of the invention equal those of the equipartition bit method, so the quality of $v_3$ is unchanged and $\Delta Q$ equals 0.
In this embodiment, the virtual viewpoint in the free viewing path in fig. 5b is tested at the same code rate, and the result is shown in table 3, where the virtual viewpoint v is1-v7The quality of the virtual viewpoint is improved, and similar to the test result of the diagonal path in table 2, the quality improvement effect is more obvious as the virtual viewpoint deviates from the center position of the reference viewpoint.
To further illustrate the effectiveness of the method of the invention, subjective experimental results are given below. FIG. 6 shows the virtual viewpoint $v_1$ on the diagonal path of the sequence 'TechnicolorPainter'. FIGS. 6a, 6c and 6e are, respectively, the effect diagrams and partial enlarged views rendered from distortion-free reference viewpoints, from reference viewpoints coded by the equipartition bit method, and from reference viewpoints coded by the method of the invention; FIG. 6b is a partial enlarged view of the original reference viewpoints; FIGS. 6d and 6f are partial enlarged views of the four coded reference viewpoints for the equipartition bit method and for the method of the invention at QP (40, 45), respectively, where FIGS. 6b, 6d and 6f correspond from left to right, in turn, to $x_2y_2$, $x_3y_2$, $x_2y_3$ and $x_3y_3$. The results show that, comparing FIG. 6e with FIG. 6c, after bit allocation by the method of the invention the virtual viewpoint quality is higher and the texture details in the view are better preserved, because the method determines the viewpoint-level texture and depth video bit allocation according to the baseline distance weights. In this experiment, the viewpoint $x_2y_2$ closest to the virtual viewpoint is allocated the most bits; after each viewpoint is coded with its allocated bits, the PSNR of viewpoints $x_2y_2$, $x_3y_2$, $x_2y_3$ and $x_3y_3$ is 37.24 dB, 31.71 dB, 31.46 dB and 30.17 dB in turn. Subjectively, the image texture details of $x_2y_2$ are well preserved, as shown in FIG. 6f, so when rendering the virtual viewpoint the method of the invention provides richer texture information and the final virtual viewpoint quality is relatively high.
TABLE 2 Objective quality comparison results (dB) along diagonal paths
TABLE 3 Objective quality comparison results (dB) for free look paths
The above is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several changes and modifications can be made without departing from the technical solution, and the technical solution of the changes and modifications should be considered as falling within the scope of the claims of the present application.

Claims (4)

1. A multi-view video bit distribution method based on a virtual view quality model is characterized by comprising the following steps:
S1, allocating bits between the texture video and the depth video in a preset ratio based on the current target number of bits $R$, where $R_{T,t}$ denotes the number of bits allocated to the texture video and $R_{T,d}$ denotes the number of bits allocated to the depth video;

S2, based on the position $(X_v, Y_v, Z_v)$ of the camera at the virtual viewpoint in three-dimensional space and the positions $(X_1, Y_1, Z_1)$, $(X_2, Y_2, Z_2)$, $(X_3, Y_3, Z_3)$ and $(X_4, Y_4, Z_4)$ of the cameras at the reference viewpoints surrounding the virtual viewpoint, calculating the baseline distance between the virtual viewpoint and each reference viewpoint, where the baseline distance $d_i$ between the virtual viewpoint and the $i$-th reference viewpoint is calculated as:

$$d_i = \sqrt{(X_v - X_i)^2 + (Y_v - Y_i)^2 + (Z_v - Z_i)^2}$$

S3, calculating the weight of the baseline distance between the virtual viewpoint and each reference viewpoint based on the baseline distances, where $w_i$ denotes the weight of the baseline distance between the virtual viewpoint and the $i$-th reference viewpoint:

$$w_i = \frac{1/d_i}{\sum_{j=1}^{4} 1/d_j}$$

S4, calculating the viewpoint-level bit allocation weight of the texture video of each reference viewpoint based on the weights of the baseline distances, where $W_{t,i}$ denotes the viewpoint-level bit allocation weight of the texture video of the $i$-th reference viewpoint;

S5, calculating the viewpoint-level bit allocation weight of the depth video of each reference viewpoint based on the weights of the baseline distances, where $W_{d,i}$ denotes the viewpoint-level bit allocation weight of the depth video of the $i$-th reference viewpoint;

S6, based on the number of bits $R_{T,t}$ allocated to the texture video and the viewpoint-level bit allocation weights of the texture videos, calculating the number of bits allocated to the texture video of each reference viewpoint, where $R_{t,i}$ denotes the number of bits allocated to the texture video of the $i$-th reference viewpoint:

$$R_{t,i} = W_{t,i} \times R_{T,t}$$

S7, based on the number of bits $R_{T,d}$ allocated to the depth video and the viewpoint-level bit allocation weights of the depth videos, calculating the number of bits allocated to the depth video of each reference viewpoint, where $R_{d,i}$ denotes the number of bits allocated to the depth video of the $i$-th reference viewpoint:

$$R_{d,i} = W_{d,i} \times R_{T,d}$$
S8, independently encoding the texture and depth videos of each viewpoint using the HM platform according to the number of bits allocated to the texture video and the depth video of each viewpoint.
2. The virtual viewpoint quality model-based multi-view video bit allocation method as claimed in claim 1, wherein the preset ratio in step S1 is 5:1, i.e.:

$$R_{T,t} = \frac{5}{6}R, \qquad R_{T,d} = \frac{1}{6}R$$
3. The virtual view quality model-based multi-view video bit allocation method according to claim 1, wherein in step S4:
the quality $Q_T$ of the virtual viewpoint without depth distortion and the coding quantization parameters of the texture videos satisfy the following formula:

$$Q_T = \sum_{i=1}^{4} \xi_i \, QP_{t,i} + C_T$$

where $\xi_i$ is the linear coefficient in the texture-video-related virtual view quality model corresponding to the $i$-th reference viewpoint, $QP_{t,i}$ is the coding quantization parameter of the texture video of the $i$-th reference viewpoint, and $C_T$ is a variable independent of the reference viewpoint compression distortion;

$\xi_i$ and $w_i$ satisfy the relation $\xi_i = m_t w_i + n_t$, where $m_t$ and $n_t$ are coefficients obtained by linear fitting;
in H.265/HEVC, the relationship between the Lagrange multiplier $\lambda$ and the video coding distortion $D$ satisfies:

$$D = \alpha \lambda^{\beta}$$

where $\alpha$ and $\beta$ are model parameters related to the characteristics of the video content;

the video coding quality $Q$ and the video coding distortion $D$ satisfy:

$$Q = 10 \times \log_{10} \frac{255^2}{D}$$

the relationship between the coding quantization parameter $QP$ and $\lambda$ satisfies:

$$QP = 4.2005 \ln\lambda + 13.7122$$

then $QP$ and $Q$ satisfy:

$$QP = a \times Q + b$$

where $a = -0.996/\beta$ and $b = -9.9612\ln\alpha/\beta + 47.9440/\beta + 14.1221$;
let the coding quality of the color video of the $i$-th reference viewpoint be $Q_{t,i}$, with corresponding content-dependent model parameters $\alpha_{t,i}$ and $\beta_{t,i}$, so that $QP_{t,i} = a_{t,i} Q_{t,i} + b_{t,i}$; the virtual viewpoint quality prediction model related to the texture video is then expressed as:

$$Q_T = \sum_{i=1}^{4} \xi_i \left( a_{t,i} Q_{t,i} + b_{t,i} \right) + C_T$$

according to the principle that quality is best when distortion is minimized, reasonably allocating the number of texture video bits among the reference viewpoints maximizes the virtual viewpoint quality related to texture video quality, and thus minimizes the texture-related virtual viewpoint distortion, i.e.:

$$\max_{\{R_{t,i}\}} Q_T \quad \text{s.t.} \quad \sum_{i=1}^{4} R_{t,i} = R_{T,t}$$

the problem is converted into a constrained optimization problem by introducing a Lagrange multiplier $\lambda_T$ to construct the cost function:

$$J_T = \sum_{i=1}^{4} \xi_i \left( a_{t,i} Q_{t,i} + b_{t,i} \right) + C_T + \lambda_T \left( \sum_{i=1}^{4} R_{t,i} - R_{T,t} \right)$$

the optimal solution must satisfy:

$$\frac{\partial J_T}{\partial R_{t,i}} = 0, \quad i = 1, 2, 3, 4$$

namely:

$$\xi_i \, a_{t,i} \, \frac{\partial Q_{t,i}}{\partial R_{t,i}} = -\lambda_T, \quad i = 1, 2, 3, 4$$

because:

$$Q = 10 \log_{10} \frac{255^2}{D}$$

it follows that:

$$\frac{\partial Q_{t,i}}{\partial R_{t,i}} = -\frac{10}{\ln 10} \cdot \frac{1}{D_{t,i}} \cdot \frac{\partial D_{t,i}}{\partial R_{t,i}}$$

in H.265/HEVC, the following formula is satisfied:

$$D = C \times R^{K}$$

where $C$ and $K$ are parameters related to the video content and coding characteristics;

in a planar-distributed multi-view color and depth video system, the video content and coding characteristics of the four reference viewpoints around the virtual viewpoint are similar, so the $C$, $K$ and $a_{t,i}$ corresponding to the reference viewpoints can each be regarded as approximately equal; then:

$$\frac{10 K}{\ln 10} \cdot \frac{\xi_i \, a_{t,i}}{R_{t,i}} = \lambda_T, \quad i = 1, 2, 3, 4$$

which is equivalent to:

$$\frac{R_{t,1}}{\xi_1} = \frac{R_{t,2}}{\xi_2} = \frac{R_{t,3}}{\xi_3} = \frac{R_{t,4}}{\xi_4}$$

then the bit allocation weight corresponding to the texture video of the $i$-th reference viewpoint is:

$$W_{t,i} = \frac{\xi_i}{\sum_{j=1}^{4} \xi_j} = \frac{m_t w_i + n_t}{m_t + 4 n_t}$$
4. The virtual view quality model-based multi-view video bit allocation method according to claim 1, wherein in step S5:
the quality $Q_D$ of the virtual viewpoint without texture distortion and the coding quantization parameters of the depth videos satisfy:

$$Q_D = \sum_{i=1}^{4} \zeta_i \, QP_{d,i} + C_D$$

where $\zeta_i$ is the linear coefficient in the depth-video-related virtual view quality model corresponding to the $i$-th reference viewpoint, $QP_{d,i}$ is the coding quantization parameter of the depth video of the $i$-th reference viewpoint, and $C_D$ is a variable independent of the reference viewpoint compression distortion;

$\zeta_i$ and $w_i$ satisfy the relation $\zeta_i = m_d w_i + n_d$, where $m_d$ and $n_d$ are coefficients obtained by linear fitting;

similarly to step S4, the following can be theoretically derived:

$$\frac{R_{d,1}}{\zeta_1} = \frac{R_{d,2}}{\zeta_2} = \frac{R_{d,3}}{\zeta_3} = \frac{R_{d,4}}{\zeta_4}$$

then the bit allocation weight corresponding to the depth video of the $i$-th reference viewpoint is:

$$W_{d,i} = \frac{\zeta_i}{\sum_{j=1}^{4} \zeta_j} = \frac{m_d w_i + n_d}{m_d + 4 n_d}$$
CN202010641620.9A 2020-07-06 2020-07-06 Multi-view video bit distribution method based on virtual view quality model Active CN111726619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010641620.9A CN111726619B (en) 2020-07-06 2020-07-06 Multi-view video bit distribution method based on virtual view quality model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010641620.9A CN111726619B (en) 2020-07-06 2020-07-06 Multi-view video bit distribution method based on virtual view quality model

Publications (2)

Publication Number Publication Date
CN111726619A true CN111726619A (en) 2020-09-29
CN111726619B CN111726619B (en) 2022-06-03

Family

ID=72572103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010641620.9A Active CN111726619B (en) 2020-07-06 2020-07-06 Multi-view video bit distribution method based on virtual view quality model

Country Status (1)

Country Link
CN (1) CN111726619B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120141016A1 (en) * 2010-12-03 2012-06-07 National University Corporation Nagoya University Virtual viewpoint image synthesizing method and virtual viewpoint image synthesizing system
WO2013159330A1 (en) * 2012-04-27 2013-10-31 Nokia Corporation An apparatus, a method and a computer program for video coding and decoding
US20170324961A1 (en) * 2015-01-26 2017-11-09 Graduate School At Shenzhen, Tsinghua University Method for predicting depth map coding distortion of two-dimensional free viewpoint video
CN104717515A (en) * 2015-03-24 2015-06-17 上海大学 Texture video and depth map code rate distributing method based on 3D-HEVC
US20190238819A1 (en) * 2016-07-29 2019-08-01 Sony Corporation Image processing apparatus and image processing method
US20190289329A1 (en) * 2016-12-02 2019-09-19 Huawei Technologies Co., Ltd. Apparatus and a method for 3d video coding
US20200137371A1 (en) * 2018-10-26 2020-04-30 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium

Also Published As

Publication number Publication date
CN111726619B (en) 2022-06-03

Similar Documents

Publication Publication Date Title
Stankiewicz et al. A free-viewpoint television system for horizontal virtual navigation
CN100496121C (en) Image signal processing method of the interactive multi-view video system
Ho et al. Overview of multi-view video coding
Oh et al. H.264-based depth map sequence coding using motion information of corresponding texture video
CN101346998B (en) Video encoding method, decoding method, device thereof
CN101390396B (en) Method and apparatus for encoding and decoding multi-view video to provide uniform picture quality
Lafruit et al. New visual coding exploration in MPEG: Super-MultiView and Free Navigation in Free viewpoint TV
CN101690249A (en) Be used to encode the 3D vision signal method and system, encapsulation the 3D vision signal, be used for the method and system of 3D video signal decoder
CN101867816A (en) Stereoscopic video asymmetric compression coding method based on human-eye visual characteristic
Pourazad et al. Generating the depth map from the motion information of H. 264-encoded 2D video sequence
Li et al. A bit allocation method based on inter-view dependency and spatio-temporal correlation for multi-view texture video coding
CN102438147A (en) Intra-frame synchronous stereo video multi-reference frame mode inter-view predictive coding and decoding method
Bal et al. Multiview video plus depth coding with depth-based prediction mode
Dziembowski et al. The influence of a lossy compression on the quality of estimated depth maps
CN111726619B (en) Multi-view video bit distribution method based on virtual view quality model
Jung et al. Disparity-map-based rendering for mobile 3D TVs
De Silva et al. Intra mode selection for depth map coding to minimize rendering distortions in 3D video
Kimata et al. Low‐delay multiview video coding for free‐viewpoint video communication
Shao et al. A novel rate control technique for asymmetric-quality stereoscopic video
Micallef et al. Low complexity disparity estimation for immersive 3D video transmission
CN102006469B (en) Three-dimensional element image based multi-level mixed predictive coding structure parallel implementation method
Sánchez et al. Performance assessment of three-dimensional video codecs in mobile terminals
Dorea et al. Attention-weighted texture and depth bit-allocation in general-geometry free-viewpoint television
Yang et al. Fast depth map coding based on virtual view quality
Olsson et al. Evaluation of a combined pre-processing and H.264-compression scheme for 3D integral images

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230614

Address after: Room 101 and 102, Floor 1, Building 12, No. 1777, Hualong Road, Huaxin Town, Qingpu District, Shanghai, 200000

Patentee after: Shanghai Funeng Information Technology Co.,Ltd.

Address before: 509 Kangrui Times Square, Keyuan Business Building, 39 Huarong Road, Gaofeng Community, Dalang Street, Longhua District, Shenzhen, Guangdong Province, 518000

Patentee before: Shenzhen lizhuan Technology Transfer Center Co.,Ltd.

Effective date of registration: 20230614

Address after: 509 Kangrui Times Square, Keyuan Business Building, 39 Huarong Road, Gaofeng Community, Dalang Street, Longhua District, Shenzhen, Guangdong Province, 518000

Patentee after: Shenzhen lizhuan Technology Transfer Center Co.,Ltd.

Address before: No. 69 lijiatuo Chongqing District of Banan City Road 400054 red

Patentee before: Chongqing University of Technology

TR01 Transfer of patent right